

Home > Archive > PERL CGI Beginners > January 2005 > Extracting links.










 

Extracting links.
Sara

2005-01-16, 3:55 pm

I am trying to extract links, including the HTML tag itself (<a href=blah>), from a list, but it's not working on my XP machine with ActiveState Perl 5.0.6.
Kindly help.

################# CODE START ####################

my @array = qq|
<body><a href="http://www.mydomain.com"><img alt="Free Hosting, Freebies" border=0 src="http://www.mydomain.com/images/logo2.gif"></a>
|;
#extract LINKS (no image links) only <a href="http://www.mydomain.com">

my @get = grep {/<a .*?>/} @array;
print "@get\n"

################### CODE END ###################

Thanks,

Sara.





Randy W. Sims

2005-01-16, 3:55 pm

Sara wrote:
> I am trying to extract links along with HTML tags <a href=blah> from a list, but it's not working on my XP machine with Active State Perl 5.0.6
> Kindly help.
>
> ################# CODE START ####################
>
> my @array = qq|
> <body><a href="http://www.mydomain.com"><img alt="Free Hosting, Freebies" border=0 src="http://www.mydomain.com/images/logo2.gif"></a>
> |;
> #extract LINKS (no image links) only <a href="http://www.mydomain.com">
>
> my @get = grep {/<a .*?>/} @array;
> print "@get\n"
>
> ################### CODE END ###################


I'm not sure why you're assigning a string to an array...

(completely untested)

my $html = <<HTML;
<body><a href="http://www.mydomain.com"><img alt="Free Hosting,
Freebies" border=0 src="http://www.mydomain.com/images/logo2.gif"></a>
HTML

use HTML::LinkExtractor;

my $lx = HTML::LinkExtractor->new();
$lx->parse(\$html);

# Each entry in links() is a hash ref; skip <img> tags and print the rest.
for my $link ( @{ $lx->links } ) {
    if ( $link->{tag} !~ /img/i ) {
        my $href = $link->{href};
        print $href->as_string(), "\n";
    }
}

__END__
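
For comparison, the regex approach from the original post can be made to return just the tags. grep only filters list elements, so it hands back the whole string whenever the pattern matches anywhere in it; a match with /g in list context returns the matched pieces instead. The following is a minimal sketch using core Perl only. The helper names extract_anchor_tags and extract_hrefs are made up for illustration, and a regex like this is fragile on real-world HTML, which is why a parser module is usually the safer choice.

```perl
use strict;
use warnings;

# Sample input from the original post, held in a scalar instead of an array.
my $html = <<'HTML';
<body><a href="http://www.mydomain.com"><img alt="Free Hosting, Freebies" border=0 src="http://www.mydomain.com/images/logo2.gif"></a>
HTML

# Hypothetical helper: return every opening <a ...> tag found in the text.
# In list context, m//g returns all captures rather than a true/false value.
sub extract_anchor_tags {
    my ($text) = @_;
    return $text =~ /(<a\s[^>]*>)/gi;
}

# Hypothetical helper: return only the href values of those <a ...> tags.
# Handles quoted and unquoted attribute values in simple cases.
sub extract_hrefs {
    my ($text) = @_;
    return $text =~ /<a\s[^>]*?href\s*=\s*["']?([^"'\s>]+)/gi;
}

print "$_\n" for extract_anchor_tags($html);
print "$_\n" for extract_hrefs($html);
```

Because the pattern requires whitespace after "<a", the <img> tag and the closing </a> are not picked up.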

Charles K. Clarkson

2005-01-16, 3:55 pm

Sara <sara_samsara@hotpop.com> wrote:

: I am trying to extract links along with HTML tags <a
: href=blah> from a list, but it's not working on my XP machine
: with Active State Perl 5.0.6 Kindly help.
:

While Randy already addressed using HTML::LinkExtractor to
retrieve links, you should also hop over to ActiveState. 5.0.6
is a pretty old version of perl.

http://www.activestate.com/Products/ActivePerl/
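
To see which perl is actually running, and whether HTML::LinkExtractor is available to it, something like the following rough sketch may help. It uses only core features; the helper names perl_version_line and have_link_extractor are made up for illustration.

```perl
use strict;

# Hypothetical helper: describe the running interpreter.
# $] holds the perl version in numeric form, such as 5.006001.
sub perl_version_line {
    return sprintf "Running perl %s", $];
}

# Hypothetical helper: HTML::LinkExtractor comes from CPAN, not core,
# so try to load it and report whether that succeeded.
sub have_link_extractor {
    return eval { require HTML::LinkExtractor; 1 } ? 1 : 0;
}

print perl_version_line(), "\n";
print "HTML::LinkExtractor is ",
    ( have_link_extractor() ? "installed" : "not installed" ), "\n";
```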


HTH,

Charles K. Clarkson

