

Home > Archive > PERL Beginners > May 2006 > html regex help










 

html regex help
Jerry Preston

2006-05-22, 7:00 pm

Hi!

I am trying to break down some html:

use strict;

$_ = "Owner Name:</td><td class="ssDetailData">last, first middle";

if( /Owner Name:<\/td><td class=\"ssDetailData\">(\w+),\s+(\w+)\s+(\w+)/ ) {
print "\nName 1 $1\n2 $2\n3 $3\n\n*$_\n";
print $_;
}

This works, but I am looking for a better way. And what if there are more than 3 names?

$_ = ">Owner Address:</td><td class="ssDetailData" valign="top">2403 93RD ST
<br />LUBBOCK,TX 79423</td>"

if( /Owner Address:<\/td><td class=\"ssDetailData\">*(\w+)/ ) {
print "\nOwner 1 $1\n2 $2\n3 $3\n\n*$_\n";
print FO $_;
}

This does not work. Any ideas?

Thanks,

Jerry


DJ Stunks

2006-05-22, 7:00 pm

Jerry Preston wrote:
> Hi!
>
> I am trying to break down some html:
>
> use strict;
>
> $_ = "Owner Name:</td><td class="ssDetailData">last, first middle";


please paste _actual_ code, not something written up on the fly. That
line will not compile.
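
For reference, the quoted line fails because the unescaped double quotes around ssDetailData end the string early. A minimal sketch of one way to write it so that it compiles, assuming the input really is the single line shown in the post (the variable names $html, $last, $first and $middle are illustrative, not from the original):

```perl
use strict;
use warnings;

# q{} is a single-quote-like operator, so the inner double quotes stay
# literal and need no escaping.
my $html = q{Owner Name:</td><td class="ssDetailData">last, first middle};

# m{} as the match delimiter means the slash in </td> needs no backslash.
# The capture pattern itself is the one from the original post.
my ($last, $first, $middle);
if ( $html =~ m{Owner Name:</td><td class="ssDetailData">(\w+),\s+(\w+)\s+(\w+)} ) {
    ($last, $first, $middle) = ($1, $2, $3);
    print "last=$last first=$first middle=$middle\n";
}
```

This only addresses the quoting problem; it is still a regex run against HTML, with the fragility that implies.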

> <regex parser snipped>
>
> This does not work. Any ideas?


Is it Monday already? I guess it's time for our bi-weekly question
about parsing HTML with regular expressions!

Could you guys start lurking for a while before you start posting? Or
at least search the list archives? Or, oh wait - this is good, read
the Perl FAQ???

If you had, Jerry, you would have found that the recommendation is to
use an HTML parsing module to parse HTML.

HTML::Parser or HTML::TokeParser are good suggestions.

I can't wait till the next person asks this on Thursday! Maybe we need
an EFAQ - EXTREMELY Frequently Asked Questions...

-jp









Copyright 2008 codecomments.com