Home > Archive > PERL Beginners > March 2005 > Regex Multi line match

Author Regex Multi line match
William Melanson

2005-03-08, 3:57 pm


Greetings,

This is the multi line pattern in which I wish to match:

<tr>
<td><b> String 1.2.3.4.5.6 </b></td>
<td><img src="pics/green.gif" alt="OK"></td>
<tr>

This is what I have:

==========================================================
#!/usr/bin/perl -w

my $file='./index.html';

open INFILE, $file or die "Can't open $file: $!";
while (<INFILE> )
{
if (/^<tr>\n
^<td><b> String 1\.2\.3\.4\.5\.6 </b></td>\n
^<td><img src\=\"pics/green\.gif\" alt\=\"OK\"></td>\n
^<tr>\n/smx) {

print("$_");
}
}
close INFILE;

==========================================================

What have I been overlooking?

Thanks much


--


Regards,

Bill

======================================================================
William Melanson: WebUnited Systems Administration
Phone: 954-418-8884 Ext: 287
======================================================================


John Doe

2005-03-08, 3:57 pm

On Tuesday, 8 March 2005 at 17:20, William Melanson wrote:
> Greetings,


Greetings too :-)
>
> This is the multi line pattern in which I wish to match:
>
> <tr>
> <td><b> String 1.2.3.4.5.6 </b></td>
> <td><img src="pics/green.gif" alt="OK"></td>
> <tr>
>
> This is what I have:
>
> ==========================================================
> #!/usr/bin/perl -w
>
> my $file='./index.html';
>
> open INFILE, $file or die "Can't open $file: $!";
> while (<INFILE> )
> {
> if (/^<tr>\n
> ^<td><b> String 1\.2\.3\.4\.5\.6 </b></td>\n
> ^<td><img src\=\"pics/green\.gif\" alt\=\"OK\"></td>\n
> ^<tr>\n/smx) {
>
> print("$_");
> }
> }
> close INFILE;
>
> ==========================================================
>
> What have I been overlooking?


that

while (<INFILE>)

reads just _one_ line at a time :-)
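To make that concrete: because each pass through the loop sees a single line, a four-line pattern can never match there. Below is a minimal sketch of one way around it, assuming the file is small enough to read into one scalar. The sub name find_block and the sample data are hypothetical, not from the original post. Note also that under /x literal whitespace in the pattern is ignored, so the sketch escapes the spaces it needs to match.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: return every occurrence of the four-line block
# found in the text read from the given filehandle.
sub find_block {
    my ($fh) = @_;

    # With $/ undefined, a single read returns the whole file.
    my $text = do { local $/; <$fh> };
    return () unless defined $text;

    # /m lets ^ match at the start of every line.
    # /x permits the multi-line layout but ignores unescaped whitespace,
    # so each literal space is written as "\ ".
    my @found = $text =~ m{
        ( ^<tr>\n
          ^<td><b>\ String\ 1\.2\.3\.4\.5\.6\ </b></td>\n
          ^<td><img\ src="pics/green\.gif"\ alt="OK"></td>\n
          ^<tr>\n )
    }gmx;
    return @found;
}

# Sample input held in memory so the sketch runs without index.html.
my $html = <<'END';
<table>
<tr>
<td><b> String 1.2.3.4.5.6 </b></td>
<td><img src="pics/green.gif" alt="OK"></td>
<tr>
</table>
END

open my $fh, '<', \$html or die "Can't open in-memory file: $!";
my @blocks = find_block($fh);
close $fh;

print scalar(@blocks), " block(s) found\n";
print for @blocks;
```

For a real file, the in-memory open would be replaced by something like open my $fh, '<', './index.html'.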


[...]


Dave Gray

2005-03-08, 3:57 pm

> This is the multi line pattern in which I wish to match:
>
> <tr>
> <td><b> String 1.2.3.4.5.6 </b></td>
> <td><img src="pics/green.gif" alt="OK"></td>
> <tr>


One way to solve this would be to read lines from the file and save
chunks of N lines (4 in this case) in a temp variable. Then your regex
would operate on enough of the file to have a chance of working.
Something like (untested):

my (@lines, $num) = ((), 4);
while (<INPUT>) {
    push @lines, $_;
    shift @lines if @lines == $num+1;
    print 'lines '.($.-$num+1).' to '.($.)." match\n"
        if join('',@lines) =~ /regex goes here/;
}


Jay Savage

2005-03-09, 3:56 am

On Tue, 8 Mar 2005 13:42:44 -0500, Dave Gray <yargevad@gmail.com> wrote:
>
> One way to solve this would be to read lines from the file and save
> chunks of N lines (4 in this case) in a temp variable. Then your regex
> would operate on enough of the file to have a chance of working.
> Something like (untested):
>
> my (@lines, $num) = ((), 4);
> while (<INPUT> ) {
> push @lines, $_;
> shift @lines if @lines == $num+1;
> print 'lines '.($.-$num+1).' to '.($.)." match\n"
> if join('',@lines) =~ /regex goes here/;
> }
>


That assumes that the pattern being searched for will begin 4n lines
from the beginning of the file, but just because we're looking for
four lines doesn't mean the file is written in four line chunks. In
fact, it probably isn't.

Why don't you tell us what you're actually trying to do here; I'm
guessing the goal isn't to search through a file for a literal string
and then print it. If you knew what you were looking for, you
wouldn't need to search the file; you could just print it. So is the
ultimate goal to perform a substitution? Count the number of
occurrences? What?

If it's a single, reasonably sized file, try something like:

my @lines = (<>);
my $text = join '', @lines;
$text =~ /regex/;
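As an illustration of where this approach can lead, here is a small sketch that joins the lines and counts how often the block occurs, one of the goals guessed at above. The helper name count_blocks and the sample lines are assumptions made for the example; the pattern is the one from the original post, with its spaces escaped because /x ignores unescaped whitespace.

```perl
use strict;
use warnings;

# The four-line block from the original post.
# /m lets ^ match at each line start; /x ignores layout whitespace,
# so the literal spaces are escaped.
my $block = qr{
    ^<tr>\n
    ^<td><b>\ String\ 1\.2\.3\.4\.5\.6\ </b></td>\n
    ^<td><img\ src="pics/green\.gif"\ alt="OK"></td>\n
    ^<tr>\n
}mx;

# Hypothetical helper: count occurrences of the block in a list of lines.
sub count_blocks {
    my @lines = @_;
    my $text  = join '', @lines;

    # Assigning the list of /g matches to an empty list in scalar
    # context yields the number of matches.
    my $count = () = $text =~ /$block/g;
    return $count;
}

# Sample data: the block followed by an unrelated line, repeated twice.
my @sample = (
    "<tr>\n",
    qq{<td><b> String 1.2.3.4.5.6 </b></td>\n},
    qq{<td><img src="pics/green.gif" alt="OK"></td>\n},
    "<tr>\n",
    "<p>unrelated</p>\n",
) x 2;

my $count = count_blocks(@sample);
print "Found $count occurrence(s)\n";
```

With real input the call could be count_blocks(<>), which reads every line of the files named on the command line (or standard input).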


If it's too big to hold in memory, things get a little more interesting.

HTH,

--jay


Dave Gray

2005-03-09, 3:56 am

> > Something like (untested):
>
> That assumes that the pattern being searched for will begin 4n lines
> from the beginning of the file, but just because we're looking for
> four lines doesn't mean the file is written in four line chunks. In
> fact, it probably isn't.


Er, no it doesn't. Read it again. It's a rolling n-line chunk of the file.

> Why don't you tell us what you're actually trying to do here; I'm
> guessing the goal isn't to search through a file for a literal string
> and then print it. If you knew what you were looking for, you
> wouldn't need to search the file; you could just print it. So is the
> ultimate goal to perform a substitution? Count the number of
> occurrences? What?


Now this I agree with.


John W. Krahn

2005-03-09, 3:56 am

Dave Gray wrote:
>
> One way to solve this would be to read lines from the file and save
> chunks of N lines (4 in this case) in a temp variable. Then your regex
> would operate on enough of the file to have a chance of working.
> Something like (untested):
>
> my (@lines, $num) = ((), 4);


You are assigning the list ((), 4) to @lines and nothing to $num. Perhaps you
meant:

my ( $num, @lines ) = ( 4, () );

Or simply:

my ( $num, @lines ) = 4;
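
The behaviour is easy to check with a short sketch. The variable names @greedy and $left_over are made up for the demonstration; the point is only that an array on the left-hand side of a list assignment absorbs everything that remains on the right.

```perl
use strict;
use warnings;

# The array comes first, so it takes the whole right-hand list.
# The empty list () contributes nothing, leaving just the 4.
my (@greedy, $left_over) = ((), 4);
print "greedy holds: @greedy\n";
print "left_over is ", (defined($left_over) ? "defined" : "undef"), "\n";

# Scalar first: it takes the 4, and the array gets what is left (nothing).
my ($num, @lines) = 4;
print "num is $num, lines has ", scalar(@lines), " element(s)\n";
```

In the loop from the earlier post this means the window would start with a stray element 4 and $num would be undefined, so the size check would not behave as intended.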


John
--
use Perl;
program
fulfillment


Jay Savage

2005-03-09, 3:56 am

On Tue, 8 Mar 2005 15:26:28 -0500, Dave Gray <yargevad@gmail.com> wrote:
>
> Er, no it doesn't. Read it again. It's a rolling n-line chunk of the file.
>


You are correct, I misread.

--j


Dave Gray

2005-03-09, 3:56 am

> > my (@lines, $num) = ((), 4);
>
> You are assigning the list ((), 4) to @lines and nothing to $num. Perhaps you
> meant:
>
> my ( $num, @lines ) = ( 4, () );
>
> Or simply:
>
> my ( $num, @lines ) = 4;


Indeed, good catch.
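
Putting the corrections from this thread together, the rolling window might look like the sketch below. It is untested against the original index.html; the sub name rolling_match, the extra guard that skips windows that are not yet full, and the in-memory sample are assumptions added for the example. The pattern escapes its spaces because /x ignores unescaped whitespace.

```perl
use strict;
use warnings;

# Hypothetical helper: slide a window of $num lines over the filehandle and
# return [first_line, last_line] for every window that matches $regex.
sub rolling_match {
    my ($fh, $num, $regex) = @_;
    my (@window, @hits);

    while (my $line = <$fh>) {
        push @window, $line;
        shift @window if @window > $num;   # keep at most $num lines
        next if @window < $num;            # window not full yet

        # $. is the number of the line just read, i.e. the window's last line.
        push @hits, [ $. - $num + 1, $. ]
            if join('', @window) =~ $regex;
    }
    return @hits;
}

# The four-line block from the original post (/m for ^, /x for layout).
my $block = qr{
    ^<tr>\n
    ^<td><b>\ String\ 1\.2\.3\.4\.5\.6\ </b></td>\n
    ^<td><img\ src="pics/green\.gif"\ alt="OK"></td>\n
    ^<tr>\n
}mx;

# Sample input held in memory so the sketch runs without a file.
my $html = <<'END';
<table>
<tr>
<td><b> String 1.2.3.4.5.6 </b></td>
<td><img src="pics/green.gif" alt="OK"></td>
<tr>
</table>
END

open my $fh, '<', \$html or die "Can't open in-memory file: $!";
my @hits = rolling_match($fh, 4, $block);
close $fh;

print "lines $_->[0] to $_->[1] match\n" for @hits;
```

Because only $num lines are held at any time, this variant also suits files too large to read into memory in one piece.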