











 

Author LWP get only <center> img
Brian Volk

2004-12-22, 9:00 pm

Hi All,

I have a list of URL source files... I need to get one particular "<img src="
from each file. The one thing that separates it from the other <img src
tags is that it is preceded by <center>, for example: <center><img
src="/rcp/ObjectServer?table=Images&id=381", but the position of that img tag
is different in each of the files. Is there a way to get the img 'src'
attribute only when the img tag is preceded by <center>? Maybe I could write a
regex to do this? Any pointers?

I've broken my script down to try and get the <center> <img src= from just
one source file.

Below is one attempt where I thought I was getting close ... maybe not...
:~). Any suggestions would be greatly appreciated.



#!/usr/bin/perl

use strict;
use warnings;
use HTML::TokeParser::Simple;
use LWP::Simple;

my $url = "http://www.rcpworksmarter.com/rcp/p...jsp?rcpNum=1013";
my $page = get($url)
or die "Could not load URL\n";

my $parser = HTML::TokeParser::Simple->new(\$page)
or die "Could not parse page";

$parser->get_tag ("img") || die;
my $token = $parser->get_token;
if ($token->[0] eq "center");
print;

# ---end ---


Brian Volk
HP Products
317.298.9950 x1245
bvolk@hpproducts.com



mgoland@optonline.net

2004-12-23, 3:58 am




----- Original Message -----
From: Brian Volk <BVolk@HPProducts.com>
Date: Wednesday, December 22, 2004 12:59 pm
Subject: LWP get only <center> img

> Hi All,


Hello
>
> I have a list of url source files... I need to get a certain "<img src="
> from each file. The one thing that separates it from the other <img src
> tags is it is preceded by <center> for example: <center><img
> src="/rcp/ObjectServer?table=Images&id=381" but the sequence of img tags is
> different in each of the files. Is there a way to get the img 'src' tag if
> the img tag is eq to <center>? Maybe I could write a regex to do this?
> pointers?

The module you are trying to use already has everything you need for the task.
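
On the regex question: it can be done for simple, predictable markup, although a real parser copes much better with odd attribute order, stray whitespace and malformed tags. Here is a minimal sketch using only core Perl. The sub name center_img_srcs and the sample markup are made up for illustration, and the pattern assumes quoted src values and no '>' inside attribute values.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return the src of every <img> that directly follows a <center> tag.
# Assumption: src values are quoted and attributes never contain '>'.
sub center_img_srcs {
    my ($html) = @_;
    my @srcs;
    while ( $html =~ m{<center\b[^>]*>\s*<img\b[^>]*?\bsrc\s*=\s*(["'])(.*?)\1}gis ) {
        push @srcs, $2;    # $1 is the quote character, $2 the src value
    }
    return @srcs;
}

# Hypothetical sample markup, modelled on the snippet in the question.
my $html = <<'HTML';
<img src="/images/logo.gif">
<center><img src="/rcp/ObjectServer?table=Images&id=381" alt="product"></center>
<img src="/images/footer.gif">
HTML

print "$_\n" for center_img_srcs($html);
```

Only the img inside <center> is printed; the other two img tags are ignored because no <center> tag comes immediately before them.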
>
> I've broken my script down to try and get the <center> <img src= from just
> one source file.
>
> Below is one attempt where I thought I was getting close ...
> maybe not...
> :~). Any suggestions would be greatly appreciated.


You are really close; you just need to use a few other methods from the module.
>
>
>
> #!/usr/bin/perl
>
> use strict;
> use warnings;
> use HTML::TokeParser::Simple;
> use LWP::Simple;
>
> my $url = "http://www.rcpworksmarter.com/rcp/p...jsp?rcpNum=1013";
> my $page = get($url)
> or die "Could not load URL\n";

You can avoid the separate LWP::Simple fetch if you download the latest release of HTML::TokeParser::Simple from CPAN; its constructor can be given the URL directly.

>
> my $parser = HTML::TokeParser::Simple->new(\$page)
> or die "Could not parse page";
>
> $parser->get_tag ("img") || die;
> my $token = $parser->get_token;
> if ($token->[0] eq "center");
> print;
>
> # ---end ---


Here is one way to do it. It's not a complete solution, but it will work for the test page you have supplied and can serve as a learning base.

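#!/usr/bin/perl

use strict;
use warnings;
use HTML::TokeParser::Simple;

my $url = 'http://www.rcpworksmarter.com/rcp/products/detail.jsp?rcpNum=1013';

# Given url => ..., the parser fetches the page itself.
my $parser = HTML::TokeParser::Simple->new(url => $url)
    or die "Could not parse page";

while ( my $token = $parser->get_token ) {

    # Look for an opening <center> tag ...
    next unless $token->is_start_tag( 'center' );

    # ... and take the token that immediately follows it.
    my $next = $parser->get_token() or last;

    # Only print when that token really is an <img> tag.
    next unless $next->is_start_tag( 'img' );
    print $next->get_attr('src'), "\n";
}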
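
If the other pages put a newline or spaces between <center> and <img>, the token right after <center> will be a text token rather than the img tag. A possible variation, sketched under that assumption, skips whitespace-only text before checking for the img. It parses an in-memory string so it can be tried without a network connection; the sub name center_img_srcs and the sample markup are invented for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use HTML::TokeParser::Simple;

# Return the src of each <img> that is the first tag after a <center>,
# ignoring whitespace-only text between the two tags.
sub center_img_srcs {
    my ($html) = @_;
    my $parser = HTML::TokeParser::Simple->new( \$html )
        or die "Could not parse HTML";
    my @srcs;
    while ( my $token = $parser->get_token ) {
        next unless $token->is_start_tag('center');

        # Skip whitespace-only text tokens that follow <center>.
        my $next = $parser->get_token;
        $next = $parser->get_token
            while $next && $next->is_text && $next->as_is !~ /\S/;

        if ( $next && $next->is_start_tag('img') ) {
            my $src = $next->get_attr('src');
            push @srcs, $src if defined $src;
        }
    }
    return @srcs;
}

# Hypothetical sample markup with a line break after <center>.
my $html = qq{<img src="/images/logo.gif">\n<center>\n  <img src="/images/product.gif">\n</center>\n};

print "$_\n" for center_img_srcs($html);
```

Collecting the values in a list instead of printing them straight away also makes it easier to run the same sub over every page in your list of URLs.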

HTH,
Mark G.

P.S. How about a free garbage bin for getting you on your way :O)

>
>
> Brian Volk
> HP Products
> 317.298.9950 x1245
> <bvolk@hpproducts.com> bvolk@hpproducts.com
>
>
>


Brian Volk

2004-12-23, 3:58 pm

Mark,

Thank you so much for your help, that worked great! It turns out that I
already had the latest version of HTML::TokeParser::Simple installed.

Thanks again!

Brian

> -----Original Message-----
> From: mgoland@optonline.net [mailto:mgoland@optonline.net]
> Sent: Thursday, December 23, 2004 1:22 AM
> To: Brian Volk
> Cc: Beginners (E-mail)
> Subject: Re: LWP get only <center> img



--
To unsubscribe, e-mail: beginners-unsubscribe@perl.org
For additional commands, e-mail: beginners-help@perl.org
<http://learn.perl.org/> <http://learn.perl.org/first-response>


mgoland@optonline.net

2004-12-23, 3:58 pm



----- Original Message -----
From: Brian Volk <BVolk@HPProducts.com>
Date: Thursday, December 23, 2004 8:55 am
Subject: RE: LWP get only <center> img

> Mark,
>
> Thank you so much for your help, that worked great! It turns out
> that I
> already had the latest version of HTML::TokeParser::Simple installed.


No problem, Brian, that is what this list is all about. I am sure next time you'll be the one giving me a hand.

Happy Holidays, all... Cheers
Mark G.
>
> Thanks again!
>
> Brian



Copyright 2008 codecomments.com