Regex Multi line match
| William Melanson 2005-03-08, 3:57 pm |
|
Greetings,
This is the multi-line pattern I wish to match:
<tr>
<td><b> String 1.2.3.4.5.6 </b></td>
<td><img src="pics/green.gif" alt="OK"></td>
<tr>
This is what I have:
==========================================================
#!/usr/bin/perl -w
my $file='./index.html';
open INFILE, $file or die "Can't open $file: $!";
while (<INFILE> )
{
if (/^<tr>\n
^<td><b> String 1\.2\.3\.4\.5\.6 </b></td>\n
^<td><img src\=\"pics/green\.gif\" alt\=\"OK\"></td>\n
^<tr>\n/smx) {
print("$_");
}
}
close INFILE;
==========================================================
What have I been overlooking?
Thanks much
--
Regards,
Bill
======================================================================
William Melanson: WebUnited Systems Administration
Phone: 954-418-8884 Ext: 287
======================================================================
| |
| John Doe 2005-03-08, 3:57 pm |
| On Tuesday, 8 March 2005 at 17:20, William Melanson wrote:
> Greetings,
Greetings too :-)
>
> This is the multi-line pattern I wish to match:
>
> <tr>
> <td><b> String 1.2.3.4.5.6 </b></td>
> <td><img src="pics/green.gif" alt="OK"></td>
> <tr>
>
> This is what I have:
>
> ==========================================================
> #!/usr/bin/perl -w
>
> my $file='./index.html';
>
> open INFILE, $file or die "Can't open $file: $!";
> while (<INFILE> )
> {
> if (/^<tr>\n
> ^<td><b> String 1\.2\.3\.4\.5\.6 </b></td>\n
> ^<td><img src\=\"pics/green\.gif\" alt\=\"OK\"></td>\n
> ^<tr>\n/smx) {
>
> print("$_");
> }
> }
> close INFILE;
>
> ==========================================================
>
> What have I been overlooking?
That
while (<INFILE>)
reads just _one_ line at a time :-)
[...]
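A minimal sketch of the difference, assuming the file is small enough to read in one go. The sample HTML and the in-memory filehandles are stand-ins for index.html, not part of the original script. It also escapes the spaces in the pattern, because under /x unescaped whitespace in a regex is ignored, so the original pattern would not match its literal spaces even on multi-line input.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Sample data standing in for index.html (an assumption for this sketch).
my $html = <<'END_HTML';
<table>
<tr>
<td><b> String 1.2.3.4.5.6 </b></td>
<td><img src="pics/green.gif" alt="OK"></td>
<tr>
</table>
END_HTML

# Under /x, unescaped whitespace in the pattern is ignored, so literal
# spaces are written as "\ ". /m lets ^ match at the start of every line.
my $block = qr{
    ^<tr>\n
    ^<td><b>\ String\ 1\.2\.3\.4\.5\.6\ </b></td>\n
    ^<td><img\ src="pics/green\.gif"\ alt="OK"></td>\n
    ^<tr>\n
}mx;

# Line-at-a-time reading: each $_ holds one line, so the block never matches.
open my $by_line, '<', \$html or die "Can't open in-memory file: $!";
my $line_hits = 0;
while (<$by_line>) {
    $line_hits++ if /$block/;
}
close $by_line;

# Slurp mode: with $/ undefined, a single read returns the whole file.
open my $whole, '<', \$html or die "Can't open in-memory file: $!";
my $text = do { local $/; <$whole> };
close $whole;
my $slurp_hit = $text =~ /$block/ ? 1 : 0;

print "line-by-line matches: $line_hits\n";
print "slurped match: $slurp_hit\n";
```

With a real file, the two in-memory opens would become `open my $fh, '<', $file`.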
| |
| Dave Gray 2005-03-08, 3:57 pm |
| > This is the multi-line pattern I wish to match:
>
> <tr>
> <td><b> String 1.2.3.4.5.6 </b></td>
> <td><img src="pics/green.gif" alt="OK"></td>
> <tr>
One way to solve this would be to read lines from the file and save
chunks of N lines (4 in this case) in a temp variable. Then your regex
would operate on enough of the file to have a chance of working.
Something like (untested):
my (@lines, $num) = ((), 4);
while (<INPUT> ) {
push @lines, $_;
shift @lines if @lines == $num+1;
print 'lines '.($.-$num+1).' to '.($.)." match\n"
if join('',@lines) =~ /regex goes here/;
}
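A runnable sketch of the rolling window described above, under a few assumptions: the sample HTML and the in-memory filehandle stand in for the real file, the variable names are illustrative, and the pattern is anchored with \A and \z so that each occurrence is reported once, when the window lines up with it exactly.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sample input; a real script would read index.html instead.
my $html = <<'END_HTML';
<table>
<tr>
<td><b> String 1.2.3.4.5.6 </b></td>
<td><img src="pics/green.gif" alt="OK"></td>
<tr>
</table>
END_HTML

my $num = 4;      # window size: the pattern spans four lines
my @lines;        # the most recent $num lines
my @matches;      # [first_line, last_line] for every window that matched

# /x ignores unescaped whitespace in the pattern, so spaces are escaped.
my $block = qr{
    \A<tr>\n
    <td><b>\ String\ 1\.2\.3\.4\.5\.6\ </b></td>\n
    <td><img\ src="pics/green\.gif"\ alt="OK"></td>\n
    <tr>\n\z
}x;

open my $in, '<', \$html or die "Can't open in-memory file: $!";
while (my $line = <$in>) {
    push @lines, $line;
    shift @lines if @lines > $num;     # drop the oldest line
    next if @lines < $num;             # window not full yet
    if (join('', @lines) =~ $block) {
        push @matches, [ $. - $num + 1, $. ];   # $. is the current line number
    }
}
close $in;

print "lines $_->[0] to $_->[1] match\n" for @matches;
```

Only $num lines are held in memory at any time, which is the appeal of this approach for large files.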
| |
| Jay Savage 2005-03-09, 3:56 am |
| On Tue, 8 Mar 2005 13:42:44 -0500, Dave Gray <yargevad@gmail.com> wrote:
>
> One way to solve this would be to read lines from the file and save
> chunks of N lines (4 in this case) in a temp variable. Then your regex
> would operate on enough of the file to have a chance of working.
> Something like (untested):
>
> my (@lines, $num) = ((), 4);
> while (<INPUT> ) {
> push @lines, $_;
> shift @lines if @lines == $num+1;
> print 'lines '.($.-$num+1).' to '.($.)." match\n"
> if join('',@lines) =~ /regex goes here/;
> }
>
That assumes that the pattern being searched for will begin 4n lines
from the beginning of the file, but just because we're looking for
four lines doesn't mean the file is written in four line chunks. In
fact, it probably isn't.
Why don't you tell us what you're actually trying to do here; I'm
guessing the goal isn't to search through a file for a literal string
and then print it. If you knew what you were looking for, you
wouldn't need to search the file; you could just print it. So is the
ultimate goal to perform a substitution? Count the number of
occurrences? What?
If it's a single, reasonably sized file, try something like:
my @lines = (<> );
my $text = join '', @lines;
$text =~ /regex/ ;
If it's too big to hold in memory, things get a little more interesting.
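As one possible reading of the goal, here is a sketch that counts the occurrences in the joined text and records the line on which each one starts. The sample text, with two copies of the block, is made up for illustration; \Q...\E is used so the literal block can serve as the pattern without hand-escaping.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# The four-line block being searched for, as literal text.
my $row = <<'END_ROW';
<tr>
<td><b> String 1.2.3.4.5.6 </b></td>
<td><img src="pics/green.gif" alt="OK"></td>
<tr>
END_ROW

# Hypothetical file contents holding two copies of the block.
my $text = "<table>\n${row}<p>filler</p>\n${row}</table>\n";

# \Q...\E escapes regex metacharacters in $row; /m lets ^ match at the
# start of any line in the joined text.
my $block = qr/^\Q$row\E/m;

my @starts;   # 1-based line number where each occurrence begins
while ($text =~ /$block/g) {
    # $-[0] is the offset of the match start; count the newlines before it.
    my $newlines_before = substr($text, 0, $-[0]) =~ tr/\n//;
    push @starts, $newlines_before + 1;
}

print "found ", scalar(@starts), " occurrence(s)\n";
print "block starts at line $_\n" for @starts;
```

The same loop body could perform a substitution or collect the matched text instead, depending on what the real task turns out to be.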
HTH,
--jay
| |
| Dave Gray 2005-03-09, 3:56 am |
| > > Something like (untested):
>
> That assumes that the pattern being searched for will begin 4n lines
> from the beginning of the file, but just because we're looking for
> four lines doesn't mean the file is written in four line chunks. In
> fact, it probably isn't.
Er, no it doesn't. Read it again. It's a rolling n-line chunk of the file.
> Why don't you tell us what you're actually trying to do here; I'm
> guessing the goal isn't to search through a file for a literal string
> and then print it. If you knew what you were looking for, you
> wouldn't need to search the file; you could just print it. So is the
> ultimate goal to perform a substitution? Count the number of
> occurrences? What?
Now this I agree with.
| |
| John W. Krahn 2005-03-09, 3:56 am |
| Dave Gray wrote:
>
> One way to solve this would be to read lines from the file and save
> chunks of N lines (4 in this case) in a temp variable. Then your regex
> would operate on enough of the file to have a chance of working.
> Something like (untested):
>
> my (@lines, $num) = ((), 4);
You are assigning the list ((), 4) to @lines and nothing to $num. Perhaps you
meant:
my ( $num, @lines ) = ( 4, () );
Or simply:
my ( $num, @lines ) = 4;
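A small sketch of the three assignments, to show where the 4 ends up in each case. The variable names here are illustrative only.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Array first: the empty list flattens away and @first takes the 4,
# so nothing is left for $n1.
my (@first, $n1) = ((), 4);

# Scalar first: $n2 takes the 4 and @second stays empty.
my ($n2, @second) = (4, ());

# Same effect with a single value on the right-hand side.
my ($n3, @third) = 4;

printf "array first:  scalar=%s, array has %d element(s)\n",
    defined $n1 ? $n1 : 'undef', scalar @first;
printf "scalar first: scalar=%s, array has %d element(s)\n",
    defined $n2 ? $n2 : 'undef', scalar @second;
printf "single value: scalar=%s, array has %d element(s)\n",
    defined $n3 ? $n3 : 'undef', scalar @third;
```

An array on the left of a list assignment takes every remaining value, so it has to come last.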
John
--
use Perl;
program
fulfillment
| |
| Jay Savage 2005-03-09, 3:56 am |
| On Tue, 8 Mar 2005 15:26:28 -0500, Dave Gray <yargevad@gmail.com> wrote:
>
> Er, no it doesn't. Read it again. It's a rolling n-line chunk of the file.
>
You are correct, I misread.
--j
| |
| Dave Gray 2005-03-09, 3:56 am |
| > > my (@lines, $num) = ((), 4);
>
> You are assigning the list ((), 4) to @lines and nothing to $num. Perhaps you
> meant:
>
> my ( $num, @lines ) = ( 4, () );
>
> Or simply:
>
> my ( $num, @lines ) = 4;
Indeed, good catch.