Code Comments


gawk problem matching multiple patterns ?!? HOW-TO?
running gawk;

I have an ASCII file with the following format:
start: record 1
head1: fjoijefowijfwoijf
head2: fiwjowiefojwf
head3: fwofjwfoiwfoj
headx: woifjowjwioef
end
name: blb abl bla blb fjie j
address: fwoijwe fwlkjwefj
phone: wfjowejf wfjw ofi
cell: ifejw foiw jfowi jeoi
value: fi woiw fowiej owefj
start: record 2 etc...


here's my gawk code:
BEGIN { RS="(start.*end)*" }
{
print "---\n"$0"\n===";
}


for those not familiar w/ gawk, you can use a full regex for the
record separator.

I have a long RS because 1) I don't care about that data, & 2) there
is a variable amount of data there.

the problem I'm having is this:
the regex, as I'm using it, matches the "start" at the beginning of the
FILE, and the "end" at the END of the FILE.

I therefore only get 2 records printed.

I want to see all the records - I need my regex to match EVERY
occurrence of the start...end "string".

any ideas?
tia - Bob
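
The symptom is easy to reproduce on a small file. The sketch below uses made-up sample data in the same layout, and the simplified separator "start.*end" rather than the exact RS from the post; the `sample` and `show_records` helpers and the fallback to a plain `awk` are assumptions for illustration only. Because "." also matches a newline in gawk, the single ".*" runs from the first "start" to the last "end" in the input, so only an empty record before the match and the text after the final "end" come out.

```sh
#!/bin/sh
# Prefer gawk; fall back to the installed awk (assumption: it accepts a
# regular expression as RS, as gawk and mawk do).
AWK=$(command -v gawk || command -v awk)

# Hypothetical sample data, same layout as the file described in the post.
sample() {
    printf '%s\n' \
        'start: record 1' 'head1: aaa' 'head2: bbb' 'end' \
        'name: Alice' 'phone: 111' \
        'start: record 2' 'head1: ccc' 'end' \
        'name: Bob' 'phone: 222'
}

# Print every record with its number so the split points are visible.
show_records() {
    sample | "$AWK" -v RS="$1" '{ print NR ": [" $0 "]" }'
}

# One greedy match swallows everything from the first "start" to the last
# "end", so Alice's data never appears in any record.
show_records 'start.*end'
```

On this sample the script should report two records: an empty one, and one holding only Bob's lines.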



Bob
03-20-04 01:24 AM


Re: gawk problem matching multiple patterns ?!? HOW-TO?
["Followup-To:" header set to comp.lang.awk.]
On Tue, 02 Mar 2004 16:34:45 -0600, Bob
<nospam_nsh@starnetwx.net> wrote:
>
> here's my gawk code:
> BEGIN { RS="(start.*end)*" }
> {
> print "---\n"$0"\n===";
> }
>
>
> for those not familiar w/ gawk, you can use a full regex for the
> record separator.
>
> I have a long RS because 1) I don't care about that data, & 2) there
> is a variable amount of data there.
>
> the problem I'm having is this:
> the regex, as I'm using it, matches the "start" at the beginning of the
> FILE, and the "end" at the END of the FILE.
>
> I therefore only get 2 records printed.
>
> I want to see all the records - I need my regex to match EVERY
> occurrence of the start...end "string".
>
RS="start[^e]*end"
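
A quick way to see what this separator does: "[^e]*" cannot cross a letter "e", so the match has to stop at the first "end" it reaches instead of running on to the last one. The sketch below counts records for two made-up inputs; the `count_records` helper, the sample lines, and the fallback to a plain `awk` are assumptions, not part of the original posts.

```sh
#!/bin/sh
# Prefer gawk; fall back to the installed awk (assumption: it accepts a
# regular expression as RS).
AWK=$(command -v gawk || command -v awk)

# Count the records produced by the separator suggested in the reply.
count_records() {
    "$AWK" -v RS='start[^e]*end' 'END { print NR }'
}

# Header lines without the letter "e": each start...end block is matched as
# one separator, giving an empty leading record plus one record per person.
printf '%s\n' 'start: 1' 'h1: aaa' 'end' 'name: Alice' \
              'start: 2' 'h1: ccc' 'end' 'name: Bob' | count_records

# Header lines containing "e" (as in "record" or "head1"): the pattern never
# reaches "end", nothing is treated as a separator, and the whole input
# stays a single record.
printf '%s\n' 'start: record 1' 'head1: aaa' 'end' 'name: Alice' \
              'start: record 2' 'head1: ccc' 'end' 'name: Bob' | count_records
```

The first call should print 3 and the second 1, so this separator only helps when no "e" occurs between "start" and "end".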

--
Incrsease your earoning poswer and gaerner profwessional resspect.
Get the Un1iversity Dewgree you have already earned.
[from the prestigious, non-accredited University of Spam!]

Bill Marcum
03-20-04 01:24 AM


Re: gawk problem matching multiple patterns ?!? HOW-TO?
On Wed, 3 Mar 2004 02:53:30 -0500, Bill Marcum
<bmarcum@iglou.com.urgent> wrote:

>["Followup-To:" header set to comp.lang.awk.]
>On Tue, 02 Mar 2004 16:34:45 -0600, Bob
>  <nospam_nsh@starnetwx.net> wrote: 
>RS="start[^e]*end"

Bill - Tera-thanks!

that did the trick. Another question though; as I was playing around
with other permutations of your RE, trying to gain understanding as to
why your RE worked, and mine didn't; I discovered another strange
thing.

I THOUGHT that:
"start.*end" == "start[.]*end"

I found, in fact, that each of these RS regexes produced vastly different
results. I suppose that to understand why my original RE didn't work,
and yours did, I should re-read the order of precedence for gawk; but
in my last example, I can't imagine why the 2 REs shouldn't be the
same.

can you shed any light?

tx again ia!!!
Bob



Bob
03-20-04 01:24 AM


Re: gawk problem matching multiple patterns ?!? HOW-TO?
On Wed, 03 Mar 2004 06:26:33 -0600, Bob <nospam_nsh@starnetwx.net>
wrote: 
>
>Bill - Tera-thanks!
>
>that did the trick. Another question though; as I was playing around
>with other permutations of your RE, trying to gain understanding as to
>why your RE worked, and mine didn't; I discovered another strange
>thing.
>
>I THOUGHT that:
>"start.*end" == "start[.]*end"

OH MY GOD - what the hell was I thinking!!!
sorry to bother - I just released my brain fart..... ;-)
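
For anyone who trips over the same thing: inside a bracket expression the dot loses its special meaning, so "[.]*" matches only a run of literal dots, while ".*" matches a run of any characters. A minimal check with match() is sketched below; the `try` helper, the test strings, and the fallback to a plain `awk` are assumptions for illustration.

```sh
#!/bin/sh
# Prefer gawk; fall back to the installed awk.
AWK=$(command -v gawk || command -v awk)

# Report where each pattern first matches in the given string.
# match() returns the start position, or 0 when there is no match.
try() {
    "$AWK" -v s="$1" 'BEGIN {
        printf "dot:     %d\n", match(s, /start.*end/)
        printf "bracket: %d\n", match(s, /start[.]*end/)
    }'
}

try 'start: x end'   # only the ".*" form matches here
try 'start...end'    # both forms match: the middle is all literal dots
```

For the first string the bracket form should report 0, and for the second both forms should report 1.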




Bob
03-20-04 01:24 AM


Re: gawk problem matching multiple patterns ?!? HOW-TO?
On Wed, 03 Mar 2004 06:26:33 -0600, Bob
<nospam_nsh@starnetwx.net> wrote: 
>
> Bill - Tera-thanks!
>
> that did the trick. Another question though; as I was playing around
> with other permutations of your RE, trying to gain understanding as to
> why your RE worked, and mine didn't; I discovered another strange
> thing.
>
> I THOUGHT that:
> "start.*end" == "start[.]*end"
>
> I found, in fact, that each of these RS regexes produced vastly different
> results. I suppose that to understand why my original RE didn't work,
> and yours did, I should re-read the order of precedence for gawk; but
> in my last example, I can't imagine why the 2 REs shouldn't be the
> same.
>
> can you shed any light?
>
Regular expressions like "a.*b" are greedy; as the expression is
evaluated from left to right, each "*" matches the longest possible
string.
Actually, my "start[^e]*end" might not work if the letter "e" appears
between "start" and "end".  A better solution might be
BEGIN{RS="end"}
{sub(/start.*/,"")}
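
The two lines above have no print statement, so a complete script needs one more rule. One possible completion is sketched below, run against made-up sample data; the `sample` and `extract` helpers, the `length($0)` filter, the "---"/"===" markers borrowed from the original question, and the fallback to a plain `awk` are all assumptions. Splitting on "end" leaves each person's data followed by the next header block in the same record, and sub() then cuts everything from "start" onward.

```sh
#!/bin/sh
# Prefer gawk; fall back to the installed awk.
AWK=$(command -v gawk || command -v awk)

# Hypothetical sample data; note that the header lines contain the letter "e".
sample() {
    printf '%s\n' \
        'start: record 1' 'head1: aaa' 'head2: bbb' 'end' \
        'name: Alice' 'phone: 111' \
        'start: record 2' 'head1: ccc' 'end' \
        'name: Bob' 'phone: 222'
}

extract() {
    "$AWK" '
    BEGIN { RS = "end" }         # a record ends at every "end"
    { sub(/start.*/, "") }       # cut the header block that follows the data
    # Skip the empty first record. Each remaining record already starts and
    # ends with a newline, so the markers land on their own lines.
    length($0) { print "---" $0 "===" }
    '
}

sample | extract
```

On the sample this should print one "---" ... "===" block each for Alice and Bob, with no header lines. Since RS is the bare string "end", a data line containing those letters (for example "friend") would also split a record; a tighter separator such as "\nend\n" is one possible refinement, untested against the real file.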


Bill Marcum
03-20-04 01:24 AM

Copyrights CodeComments.com 2004 - 2006