

Home > Archive > PERL Beginners > October 2006 > problem extracting forms from HTML page.










 

Author problem extracting forms from HTML page.
sr

2006-10-23, 9:57 pm

Hi All,

I am a beginner in Perl. I am trying to extract the content between
the <form> and </form> tags. My code seems to be correct, but somehow I am
unable to trace where it's going wrong.

#!/usr/local/bin/perl
use LWP::Simple;
use LWP::UserAgent;
use HTTP::Request;
use HTTP::Response;
use HTML::TreeBuilder;

$mystring="http://www.google.com/";
$contents=get($mystring);

my $form1 = get_heading($contents);
print $form1;

sub get_heading {
    my $tree = HTML::TreeBuilder->new;
    $tree->parse_file($_[0]);
    my $formcontent;
    my $form = $tree->look_down('_tag', 'form');
    print $form; ## Doesn't print form

    if($form) {
        $formcontent = $form->as_text;
    } else {
        warn "No form found";
    }
    $tree->delete;
    return $formcontent;
}

The above code just prints the warning message, i.e., "No form found",
even though the page at the input URL has a form in it.

Please input ur suggestions...

thanks in advance

Paul Lalli

2006-10-24, 7:57 am

sr wrote:
> I am a beginner in Perl. I am trying to extract the content between
> the <form>,</form> tags. My code seems to be correct,


By what definition? If it's not producing the desired output, it's not
correct.

> somehow I am unable to trace where it's going wrong??


What attempts at tracing did you make? Did you try to print any of the
intermediate variables to see what their contents were?
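One way to do that kind of tracing, as a rough sketch only: print a short summary of each intermediate value to STDERR before passing it on. The helper name describe() and the stand-in HTML string are hypothetical and are not part of the original script; in the real script $contents would come from get($mystring).

```perl
use strict;
use warnings;

# Hypothetical debugging helper: summarize a scalar in one line.
sub describe {
    my ($name, $value) = @_;
    return "$name is undef" unless defined $value;
    my $preview = substr($value, 0, 40);   # first 40 characters at most
    $preview =~ s/\s+/ /g;                 # collapse newlines and runs of spaces
    return sprintf "%s: %d chars, starts with '%s'",
        $name, length($value), $preview;
}

# Stand-in for the value returned by get($mystring) in the original script.
my $contents = "<html><body><form>...</form></body></html>";

# Writing to STDERR keeps the trace separate from the program's real output.
print STDERR describe('contents', $contents), "\n";
```

A summary that starts with "<html" shows the variable holds markup rather than a filename, which is worth knowing before choosing how to parse it.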

> #!/usr/local/bin/perl


You forgot
use strict;
use warnings;

Always ask the computer to tell you when you're doing something wrong.
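As a small illustration of what strict catches (a sketch, not taken from the original script): assigning to a variable that was never declared with "my", the way $mystring and $contents are used above, is rejected at compile time. The variable name $undeclared is made up for the demonstration, and the string eval is only there so the failure can be shown without aborting the whole program.

```perl
use strict;
use warnings;

# Compile a tiny snippet that assigns to a variable without "my",
# the same pattern as $mystring and $contents in the original script.
my $ok = eval q{
    use strict;
    $undeclared = "http://www.google.com/";
    1;
};
my $error = $@;    # holds the compile error when the eval fails

print $ok ? "compiled without complaint\n" : "strict refused: $error";

# The declared form is accepted.
my $mystring = "http://www.google.com/";
print "declared: $mystring\n";
```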

> use LWP::Simple;
> use LWP::UserAgent;
> use HTTP::Request;
> use HTTP::Response;
> use HTML::TreeBuilder;
>
> $mystring="http://www.google.com/";
> $contents=get($mystring);
>
> my $form1 = get_heading($contents);
> print $form1;
>
> sub get_heading {
> my $tree = HTML::TreeBuilder->new;
> $tree->parse_file($_[0]);


parse_file takes a filename or filehandle. You are giving it neither.
You are giving it a chunk of HTML.

Try parse() rather than parse_file()
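A minimal sketch of that change, under a few assumptions: the sub is renamed get_form_text() here for clarity, the HTML is a made-up inline sample so the example runs without network access, and eof() is called after parse() to tell the parser that no more input is coming. In the real script the string would come from get($mystring).

```perl
use strict;
use warnings;
use HTML::TreeBuilder;

# Hypothetical sample markup standing in for the fetched page.
my $html = <<'HTML';
<html><body>
<form action="/search"><b>Search:</b> <input type="text" name="q"></form>
</body></html>
HTML

sub get_form_text {
    my ($content) = @_;
    my $tree = HTML::TreeBuilder->new;
    $tree->parse($content);    # parse() takes a string of HTML
    $tree->eof;                # signal end of input so the tree is completed

    my $form = $tree->look_down('_tag', 'form');
    my $text;
    if ($form) {
        $text = $form->as_text;    # text content only, tags stripped
    } else {
        warn "No form found";
    }
    $tree->delete;
    return $text;
}

my $form_text = get_form_text($html);
print defined $form_text ? "$form_text\n" : "(no form text)\n";
```

Note that as_text returns only the text inside the form. If the markup itself is wanted, as_HTML on the same element is the method to look at.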

> Please input ur suggestions...


Please do not talk like a 10 year old. Be an adult, use real words.
It doesn't take that long to type the extra two characters.

Paul Lalli
