

Home > Archive > PERL Beginners > November 2007 > 400 bad request while retrieving a frame page with WWW::Mechanize










 

400 bad request while retrieving a frame page with WWW::Mechanize
lcerneka@gmail.com

2007-11-16, 8:00 am

Hi everyone

I get an html page with a 400 error code (Bad Request) when running
the following code:

#!/usr/bin/perl -w
use strict;
use warnings;
use WWW::Mechanize;

my $mech = WWW::Mechanize->new( agent => 'Mozilla 2.0.0.9' );

$mech->get('http://www.aeroporto.fvg.it/tab/fmarrb.php');
my $arrivi = $mech->content;


print "Content-type: text/html\n\n";
print $arrivi;

exit;


When asking for this page directly from a browser (Firefox or IE) it
works fine... you can see the frame as a stand-alone page, exactly
what I want to get... but I cannot!
Do you have some good suggestion?
Thanks in advance
Livius

lcerneka@gmail.com

2007-11-17, 7:59 am

On Nov 16, 4:39 pm, t...@stonehenge.com (Tom Phoenix) wrote:
> On 11/16/07, lcern...@gmail.com <lcern...@gmail.com> wrote:
>
>
> This happens often enough that it is covered in the FAQ for WWW::Mechanize:
>
> http://search.cpan.org/~petdance/WW...WW/Mechanize...
>
> Hope this helps!
>
> --Tom Phoenix
> Stonehenge Perl Training


Thanks, guys, for the attempts and the good link... but I still cannot
figure it out... I tried to debug with use LWP::Debug qw(+), and even tried
to extract the frame as a link with
my @frame_links = $mech->find_link( tag => "frame" ) as suggested in the
Mechanize FAQ... but neither works... the result is always the same: 400 Bad
Request... I'm a bit frustrated 'cause I often write code for data retrieval
and this is the first time such an error has occurred... The 400 error code
indicates malformed URL syntax or a malformed request header... Any other
ideas?

Peter Scott

2007-11-17, 7:59 am

On Sat, 17 Nov 2007 03:28:14 -0800, lcerneka wrote:
> On Nov 16, 4:39 pm, t...@stonehenge.com (Tom Phoenix) wrote:
[snip]
> Thanks guys for tries and good link... but I still cannot figure it
> out... I tried to debug with use LWP::Debug qw(+) , and even tried to
> extract the frame as a link with my @frame_links =
> $mech->find_link( tag => "frame" ) as suggested in the Mechanize FAQ... but
> they don't work... the result is always the same: 400 Bad request...
> I'm a bit frustrated 'cause I often write code to data retrieval and
> this is the first time such an error occurs... The 400 error code is
> about malformed URL's syntax or request header... Any other idea?


Show us your code, including the URL; or go to the libwww mailing list
(http://lists.cpan.org/showlist.cgi?name=libwww) and show them; or hire an
expert.

--
Peter Scott
http://www.perlmedic.com/
http://www.perldebugged.com/

lcerneka@gmail.com

2007-11-17, 7:00 pm

On Nov 17, 2:30 pm, Pe...@PSDT.com (Peter Scott) wrote:
> On Sat, 17 Nov 2007 03:28:14 -0800, lcerneka wrote:
>
>
> [snip]
>
> Show us your code, including the URL; or go to the libwww mailing list
> (http://lists.cpan.org/showlist.cgi?name=libwww) and show them; or hire an
> expert.
>
> --
> Peter Scott
> http://www.perlmedic.com/
> http://www.perldebugged.com/


Hi Peter,
here is the code:


#!/usr/bin/perl -w
use strict;
use warnings;
use WWW::Mechanize;

my $mech = WWW::Mechanize->new( agent => 'Mozilla 2.0.0.9' );

$mech->get('http://www.aeroporto.fvg.it/tab/fmarrb.php');
my $arrivi = $mech->content;

print "Content-type: text/html\n\n";
print $arrivi;

exit;

As explained above, it cannot obtain the requested page... it receives
an html page with a 400 error Bad request...

Thanks for helping.

Alan_C

2007-11-18, 4:01 am

lcerneka@gmail.com wrote:
[ . . ]
> my $arrivi = $mech->content;


What is "page content"?

http://www.aeroporto.fvg.it/tab/fmarrb.php

I browsed there with a web browser, then chose "view source"

And I found nothing but html markup and pics

IOW, other than pics to load and the html formatting, there is no content.

--
Alan_C

Peter Scott

2007-11-18, 7:00 pm

On Sat, 17 Nov 2007 07:38:54 -0800, lcerneka wrote:
>
> #!/usr/bin/perl -w
> use strict;
> use warnings;
> use WWW::Mechanize;
>
> my $mech = WWW::Mechanize->new( agent => 'Mozilla 2.0.0.9' );
>
> $mech->get('http://www.aeroporto.fvg.it/tab/fmarrb.php');
> my $arrivi = $mech->content;
>
> print "Content-type: text/html\n\n";
> print $arrivi;
>
> exit;


Okay, I reproduced this and solved it. First, I verified that I got a
proper response from Safari. Then I ran tcpdump to compare the request
sent by Safari with the one sent by Mech. (Regrettably, setting
LWP::Debug +conns does not show traffic unless you take the obscure step
of setting the environment variable PERL_LWP_USE_HTTP_10 to revert to HTTP
1.0, so I use tcpdump instead.) Then I adjusted headers until I found one
that worked.
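
For anyone repeating this comparison, here is a minimal sketch in plain Perl of the "diff the two requests" step. The two request texts below are hypothetical stand-ins for what a capture might show, not the actual tcpdump output, and the helper names parse_request_headers and missing_headers are made up for this illustration.

```perl
use strict;
use warnings;

# Parse the header block of a raw HTTP request into a hash keyed by
# lower-cased header name. The request line ("GET ... HTTP/1.1") is skipped.
sub parse_request_headers {
    my ($raw) = @_;
    my %headers;
    my @lines = split /\r?\n/, $raw;
    shift @lines;                   # drop the request line
    for my $line (@lines) {
        last if $line eq '';        # a blank line ends the header block
        my ( $name, $value ) = $line =~ /^([^:\s]+):\s*(.*)$/ or next;
        $headers{ lc $name } = $value;
    }
    return \%headers;
}

# Return the names of headers present in $reference but absent from $candidate.
sub missing_headers {
    my ( $reference, $candidate ) = @_;
    my @missing = grep { !exists $candidate->{$_} } sort keys %$reference;
    return @missing;
}

# Hypothetical captures, for illustration only.
my $browser = parse_request_headers(<<'END');
GET /tab/fmarrb.php HTTP/1.1
Host: www.aeroporto.fvg.it
User-Agent: ExampleBrowser/1.0
Accept: */*
Connection: keep-alive

END

my $mech_req = parse_request_headers(<<'END');
GET /tab/fmarrb.php HTTP/1.1
Host: www.aeroporto.fvg.it
User-Agent: Mozilla 2.0.0.9
Connection: keep-alive

END

print "Missing from Mech request: $_\n"
    for missing_headers( $browser, $mech_req );
```

Each header reported as missing is a candidate to add to the Mech request and retry, one at a time, until the server stops answering with 400.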

Add

$mech->add_header( Accept => '*/*' );

before the fetch and the server will respond properly. Don't know why
it's behaving that way.
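
Put together, the script with the extra header might look like the sketch below. It keeps the agent string and the add_header call from this thread; the sub names build_mech and fetch_page, the explicit status check, and taking the URL from the command line are assumptions added for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use WWW::Mechanize;

# Build a Mech object that sends "Accept: */*" with every request.
sub build_mech {
    my $mech = WWW::Mechanize->new(
        agent     => 'Mozilla 2.0.0.9',
        autocheck => 0,    # check the status ourselves instead of dying in get()
    );
    $mech->add_header( Accept => '*/*' );
    return $mech;
}

# Fetch a URL and return the body, or die with the HTTP status line.
sub fetch_page {
    my ( $mech, $url ) = @_;
    my $response = $mech->get($url);
    die "Fetch of $url failed: ", $response->status_line, "\n"
        unless $response->is_success;
    return $mech->content;
}

# Run as: perl script.pl http://www.aeroporto.fvg.it/tab/fmarrb.php
if (@ARGV) {
    my $html = fetch_page( build_mech(), $ARGV[0] );
    print "Content-type: text/html\n\n";
    print $html;
}
```

Fetching before printing the Content-type line means a failed request dies with the status line instead of emitting a half-finished CGI response.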

If you look at the source of the page in a browser there's no frame so
looking for one would be pointless.

A couple of other comments:

-w is redundant when you have the superior use warnings.
exit is redundant as the last statement. That's what it's going to do
anyway.

--
Peter Scott
http://www.perlmedic.com/
http://www.perldebugged.com/

Yitzle

2007-11-18, 7:00 pm

On Nov 18, 2007 11:30 AM, Peter Scott <Peter@psdt.com> wrote:
>
> Okay, I reproduced this and solved it. First, I verified that I got a
> proper response from Safari. Then I ran tcpdump to compare the request
> sent by Safari with the one sent by Mech. (Regrettably, setting
> LWP::Debug +conns does not show traffic unless you take the obscure step
> of setting the environment variable PERL_LWP_USE_HTTP_10 to revert to HTTP
> 1.0, so I use tcpdump instead.) Then I adjusted headers until I found one
> that worked.
>
> Add
>
> $mech->add_header( Accept => '*/*' );
>
> before the fetch and the server will respond properly. Don't know why
> it's behaving that way.


Unusual.
Does Mech by default not set an Accept line?
As I wrote earlier, I tried downloading the page with the standard
wget tool and the lynx text-based browser and neither managed to get
the page.
Do most servers simply not bother to check the Accept header?
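
One way to check what Mech attaches by default, as a rough sketch: it relies on the request_send handler of LWP::UserAgent, which exists in more recent LWP releases than the ones current when this thread was written. The handler answers with a canned response, so nothing goes over the network. The sub name outgoing_header_names is made up here, and as far as I know headers added later by the protocol layer (Host, for instance) will not show up in the list.

```perl
use strict;
use warnings;
use WWW::Mechanize;
use HTTP::Response;

# Return the names of the headers a Mech object has put on a GET request
# by the time it is handed over for sending. The request_send handler
# replies with a canned response, so no connection is ever opened.
sub outgoing_header_names {
    my ( $mech, $url ) = @_;
    my @names;
    $mech->add_handler(
        request_send => sub {
            my ($request) = @_;
            my @found = $request->header_field_names;
            @names = sort @found;
            return HTTP::Response->new(
                200, 'OK', [ 'Content-Type' => 'text/plain' ], '' );
        }
    );
    $mech->get($url);
    $mech->remove_handler('request_send');
    return @names;
}

# Run as: perl script.pl http://www.aeroporto.fvg.it/tab/fmarrb.php
if (@ARGV) {
    my $mech = WWW::Mechanize->new(
        agent     => 'Mozilla 2.0.0.9',
        autocheck => 0,
    );
    print "$_\n" for outgoing_header_names( $mech, $ARGV[0] );
}
```

Running it once on a fresh object and once after add_header( Accept => '*/*' ) shows whether an Accept line is there without the explicit call.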

lcerneka@gmail.com

2007-11-18, 7:00 pm

On Nov 18, 5:30 pm, Pe...@PSDT.com (Peter Scott) wrote:
> On Sat, 17 Nov 2007 07:38:54 -0800, lcerneka wrote:
>
>
>
>
>
>
>
> Okay, I reproduced this and solved it. First, I verified that I got a
> proper response from Safari. Then I ran tcpdump to compare the request
> sent by Safari with the one sent by Mech. (Regrettably, setting
> LWP::Debug +conns does not show traffic unless you take the obscure step
> of setting the environment variable PERL_LWP_USE_HTTP_10 to revert to HTTP
> 1.0, so I use tcpdump instead.) Then I adjusted headers until I found one
> that worked.
>
> Add
>
> $mech->add_header( Accept => '*/*' );
>
> before the fetch and the server will respond properly. Don't know why
> it's behaving that way.
>
> If you look at the source of the page in a browser there's no frame so
> looking for one would be pointless.
>
> A couple of other comments:
>
> -w is redundant when you have the superior use warnings.
> exit is redundant as the last statement. That's what it's going to do
> anyway.
>
> --
> Peter Scott
> http://www.perlmedic.com/
> http://www.perldebugged.com/


Thanks a lot, Peter! Now it works perfectly... And thanks for the
good extra suggestions. I'm so glad there are such expert and kind
people around.
Thanks to the others who tried to help me, too.

Livius


Copyright 2008 codecomments.com