Home > Archive > PERL Beginners > December 2004 > LWP get only <center> img
LWP get only <center> img
| Brian Volk 2004-12-22, 9:00 pm |
| Hi All,
I have a list of url source files... I need to get a certain "<img src="
from each file. The one thing that separates it from the other <img src
tags is it is preceded by <center> for example: <center><img
src="/rcp/ObjectServer?table=Images&id=381" but the sequence of img tags is
different in each of the files. Is there a way to get the img 'src' attribute if
the img tag is preceded by <center>? Maybe I could write a regex to do this?
Any pointers?
I've broken my script down to try and get the <center> <img src= from just
one source file.
Below is one attempt where I thought I was getting close ... maybe not...
:~). Any suggestions would be greatly appreciated.
#!/usr/bin/perl
use strict;
use warnings;
use HTML::TokeParser::Simple;
use LWP::Simple;
my $url = "http://www.rcpworksmarter.com/rcp/p...jsp?rcpNum=1013";
my $page = get($url)
or die "Could not load URL\n";
my $parser = HTML::TokeParser::Simple->new(\$page)
or die "Could not parse page";
$parser->get_tag ("img") || die;
my $token = $parser->get_token;
if ($token->[0] eq "center");
print;
# ---end ---
Brian Volk
HP Products
317.298.9950 x1245
bvolk@hpproducts.com
| |
| mgoland@optonline.net 2004-12-23, 3:58 am |
|
----- Original Message -----
From: Brian Volk <BVolk@HPProducts.com>
Date: Wednesday, December 22, 2004 12:59 pm
Subject: LWP get only <center> img
> Hi All,
Hello
>
> I have a list of url source files... I need to get a certain "<img src="
> from each file. The one thing that separates it from the other <img src
> tags is it is preceded by <center> for example: <center><img
> src="/rcp/ObjectServer?table=Images&id=381" but the sequence of img tags is
> different in each of the files. Is there a way to get the img 'src' attribute if
> the img tag is preceded by <center>? Maybe I could write a regex to do this?
> Any pointers?
The module you are trying to use already has everything you need for the task.
>
> I've broken my script down to try and get the <center> <img src= from just
> one source file.
>
> Below is one attempt where I thought I was getting close ... maybe not...
> :~). Any suggestions would be greatly appreciated.
You are really close; you just need to use a few other functions from the module.
>
>
>
> #!/usr/bin/perl
>
> use strict;
> use warnings;
> use HTML::TokeParser::Simple;
> use LWP::Simple;
>
> my $url = "http://www.rcpworksmarter.com/rcp/p...jsp?rcpNum=1013";
> my $page = get($url)
> or die "Could not load URL\n";
You can skip the separate LWP::Simple download if you install the latest release of HTML::TokeParser::Simple from CPAN; it can fetch the URL for you.
>
> my $parser = HTML::TokeParser::Simple->new(\$page)
> or die "Could not parse page";
>
> $parser->get_tag ("img") || die;
> my $token = $parser->get_token;
> if ($token->[0] eq "center");
> print;
>
> # ---end ---
Here is one way to do it. It's not a complete solution, but it will work for the test page you supplied and can serve as a base for learning.
#!/usr/bin/perl
use strict;
use warnings;
use HTML::TokeParser::Simple;

my $url = 'http://www.rcpworksmarter.com/rcp/products/detail.jsp?rcpNum=1013';

# Recent releases of the module can fetch the page themselves.
my $parser = HTML::TokeParser::Simple->new( url => $url )
    or die "Could not parse page";

while ( my $token = $parser->get_token ) {
    # Assumes the token right after <center> is the <img> tag.
    if ( $token->is_start_tag('center') ) {
        my $TAG = $parser->get_token();
        print $TAG->get_attr('src'), "\n";
    }
}
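The loop above takes whatever token follows <center> and assumes it is the img tag. If a page has whitespace or other text between the two tags, that assumption may not hold. Below is a minimal sketch of a slightly more defensive version. It parses an inline HTML string rather than a live URL so it can be run offline; the helper name center_img_srcs and the sample HTML are made up for illustration and are not from the original script.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use HTML::TokeParser::Simple;

# Hypothetical helper: return the src of every <img> that directly follows
# a <center> start tag, ignoring whitespace-only text between the two.
sub center_img_srcs {
    my ($html) = @_;
    my $parser = HTML::TokeParser::Simple->new( string => $html )
        or die "Could not parse page";

    my @srcs;
    my $after_center = 0;    # true while we are waiting for the <img>
    while ( my $token = $parser->get_token ) {
        if ( $token->is_start_tag('center') ) {
            $after_center = 1;
        }
        elsif ( $after_center && $token->is_text && $token->as_is !~ /\S/ ) {
            next;            # whitespace only: keep waiting
        }
        else {
            if ( $after_center && $token->is_start_tag('img') ) {
                my $src = $token->get_attr('src');
                push @srcs, $src if defined $src;
            }
            $after_center = 0;    # anything else ends the wait
        }
    }
    return @srcs;
}

# Sample input (assumed for illustration only).
my $html = <<'HTML';
<p><img src="/images/logo.gif"></p>
<center>
  <img src="/rcp/ObjectServer?table=Images&id=381">
</center>
<center>Just text</center>
HTML

print "$_\n" for center_img_srcs($html);
```

To run this against a real page, the string => $html argument could be swapped for url => $url as in the script above.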
HTH,
Mark G.
P.S. How about a free garbage bin for getting you on your way :O)
| |
| Brian Volk 2004-12-23, 3:58 pm |
| Mark,
Thank you so much for your help, that worked great! It turns out that I
already had the latest version of HTML::TokeParser::Simple installed.
Thanks again!
Brian
--
To unsubscribe, e-mail: beginners-unsubscribe@perl.org
For additional commands, e-mail: beginners-help@perl.org
<http://learn.perl.org/> <http://learn.perl.org/first-response>
| |
| mgoland@optonline.net 2004-12-23, 3:58 pm |
|
----- Original Message -----
From: Brian Volk <BVolk@HPProducts.com>
Date: Thursday, December 23, 2004 8:55 am
Subject: RE: LWP get only <center> img
> Mark,
>
> Thank you so much for your help, that worked great! It turns out that I
> already had the latest version of HTML::TokeParser::Simple installed.
No problem, Brian; that is what this list is all about. I am sure next time you'll try to give me a hand.
Happy Holidays, all... Cheers
Mark G.
>
> Thanks again!
>
> Brian