Subject: FAQ 6.2 I'm having trouble matching over more than one line. What's wrong?
Author: PerlFAQ Server
Date: 2005-03-31, 8:56 am

This message is one of several periodic postings to comp.lang.perl.misc
intended to make it easier for perl programmers to find answers to
common questions. The core of this message represents an excerpt
from the documentation provided with Perl.

--------------------------------------------------------------------

6.2: I'm having trouble matching over more than one line. What's wrong?

Either you don't have more than one line in the string you're looking at
(probably), or else you aren't using the correct modifier(s) on your
pattern (possibly).

There are many ways to get multiline data into a string. If you want it
to happen automatically while reading input, you'll want to set $/
(probably to '' for paragraphs or "undef" for the whole file) to allow
you to read more than one line at a time.
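
As a rough illustration of how $/ changes what counts as one record, the sketch below reads the same text three ways. The sample text and the read_records helper are made up for this example, and it reads from an in-memory filehandle rather than a real file so that it stands alone.

```perl
use strict;
use warnings;

# Hypothetical sample input: two paragraphs separated by a blank line.
my $text = "line one\nline two\n\nline three\nline four\n";

# Read every record from an in-memory filehandle using the given
# input record separator ($/).
sub read_records {
    my ($data, $separator) = @_;
    local $/ = $separator;    # restored automatically when the sub returns
    open my $fh, '<', \$data or die "Cannot open in-memory file: $!";
    my @records = <$fh>;
    close $fh;
    return @records;
}

my @lines      = read_records($text, "\n");     # default: one line per record
my @paragraphs = read_records($text, '');       # paragraph mode
my @whole      = read_records($text, undef);    # whole input as one record

printf "%d line records, %d paragraph records, %d whole-file record(s)\n",
    scalar @lines, scalar @paragraphs, scalar @whole;
```

Using "local" keeps the change to $/ confined to the helper, so other reads in the same program still see the default separator.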

Read perlre to help you decide which of "/s" and "/m" (or both) you
might want to use: "/s" allows dot to match a newline, and "/m" allows
caret and dollar to match next to a newline, not just at the beginning
and end of the string. You do need to make sure that you've actually got
a multiline string in there.
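
A small sketch of the difference, using a made-up two-line string (the variable names are only for illustration):

```perl
use strict;
use warnings;

# Hypothetical two-line string.
my $text = "first line\nsecond line\n";

# Without /s, dot will not match a newline; with /s it can cross it.
my $dot_plain = $text =~ /first.*second/  ? 1 : 0;
my $dot_s     = $text =~ /first.*second/s ? 1 : 0;

# Without /m, ^ matches only at the start of the string;
# with /m it also matches just after an embedded newline.
my $caret_plain = $text =~ /^second/  ? 1 : 0;
my $caret_m     = $text =~ /^second/m ? 1 : 0;

print "dot:   plain=$dot_plain  /s=$dot_s\n";
print "caret: plain=$caret_plain  /m=$caret_m\n";
```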

For example, this program detects duplicate words, even when they span
line breaks (but not paragraph breaks). For this example, we don't need
"/s" because we aren't using dot in a regular expression that we want to
cross line boundaries. Neither do we need "/m" because we don't want
caret or dollar to match at any point inside the record next to
newlines. But it's imperative that $/ be set to something other than the
default, or else we won't actually ever have a multiline record read in.

    $/ = '';    # read in a whole paragraph, not just one line
    while ( <> ) {
        while ( /\b([\w'-]+)(\s+\1)+\b/gi ) {    # same word repeated, ignoring case
            print "Duplicate $1 at paragraph $.\n";
        }
    }
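
To try the same pattern without an input file, here is a sketch that applies it to a string. The find_duplicates helper and the sample record are hypothetical; the regular expression is the one from the program above.

```perl
use strict;
use warnings;

# Return each word that appears two or more times in a row, where the
# repeats are separated only by whitespace (newlines included).
sub find_duplicates {
    my ($record) = @_;
    my @found;
    while ( $record =~ /\b([\w'-]+)(\s+\1)+\b/gi ) {
        push @found, $1;
    }
    return @found;
}

# Hypothetical record: "the" is repeated across a line break.
my @dups = find_duplicates("Paris in the\nthe spring is is lovely\n");
print "Duplicate: $_\n" for @dups;
```

The line break between the two copies of "the" is matched by \s+, which is why neither "/s" nor "/m" is needed here.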

Here's code that finds lines that begin with "From " (which would be
mangled by many mailers):

    $/ = '';    # read in a whole paragraph, not just one line
    while ( <> ) {
        while ( /^From /gm ) {    # /m makes ^ match next to \n
            print "leading From in paragraph $.\n";
        }
    }

Here's code that finds everything between START and END in a file:

    undef $/;    # read in whole file, not just one line or paragraph
    while ( <> ) {
        while ( /START(.*?)END/sgm ) {    # /s makes . cross line boundaries
            print "$1\n";
        }
    }
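
The same idea can be sketched against a string, collecting every capture at once. The sample text is invented for the example; in list context a "/g" match returns all the captured groups.

```perl
use strict;
use warnings;

# Hypothetical input held in a string; the first START/END pair spans two lines.
my $text = "START alpha\nbeta END\nnoise\nSTART gamma END\n";

# In list context, /g returns every capture; /s lets .*? cross newlines.
my @chunks = $text =~ /START(.*?)END/sg;

print "[$_]\n" for @chunks;
```

The non-greedy ".*?" matters here: a greedy ".*" with "/s" would run from the first START to the last END and return a single chunk.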



--------------------------------------------------------------------

Documents such as this have been called "Answers to Frequently
Asked Questions" or FAQ for short. They represent an important
part of the Usenet tradition. They serve to reduce the volume of
redundant traffic on a newsgroup by providing quality answers to
questions that keep coming up.

If you are somehow irritated by seeing these postings, you are free
to ignore them or add the sender to your killfile. If you find
errors or other problems with these postings, please send corrections
or comments to the posting email address or to the maintainers as
directed in the perlfaq manual page.

Note that the FAQ text posted by this server may have been modified
from that distributed in the stable Perl release. It may have been
edited to reflect the additions, changes and corrections provided
by respondents, reviewers, and critics to previous postings of
these FAQs. The complete text of these FAQs is available on request.

The perlfaq manual page contains the following copyright notice.

AUTHOR AND COPYRIGHT

Copyright (c) 1997-2002 Tom Christiansen and Nathan
Torkington, and other contributors as noted. All rights
reserved.

This posting is provided in the hope that it will be useful but
does not represent a commitment or contract of any kind on the part
of the contributors, authors or their agents.