Help parsing a file
juanpo@bellsouth.net

2007-06-11, 9:57 pm

Hi,

I need some help parsing a file using awk. The file looks like this:

SRA xyz
SR2 abc
SR3 123
SR4 012
SRL ABC
SRA xyz
SR2 abc
SR3 123
SR4 012
SRL ABC
SRL XYZ

First I need to merge the records that start with SRA, SR2, SR3 and
SR4 into one line. Also the records that start with SRL need to be
parsed into a separate file. Output should look like this:

File1:
SRA xyzSR2 abcSR3 123SR4 012
SRA xyzSR2 abcSR3 123SR4 012
File2:
SRL ABC
SRL ABC
SRL XYZ

Any easy way to do this using awk?

Thanks in advance.

Vassilis

2007-06-11, 9:57 pm


jua...@bellsouth.net :
> Hi,
>
> I need some help parsing a file using awk. The file looks like this:
>
> SRA xyz
> SR2 abc
> SR3 123
> SR4 012
> SRL ABC
> SRA xyz
> SR2 abc
> SR3 123
> SR4 012
> SRL ABC
> SRL XYZ
>
> First I need to merge the records that start with SRA, SR2, SR3 and
> SR4 into one line. Also the records that start with SRL need to be
> parsed into a separate file. Output should look like this:
>
> File1:
> SRA xyzSR2 abcSR3 123SR4 012
> SRA xyzSR2 abcSR3 123SR4 012
> File2:
> SRL ABC
> SRL ABC
> SRL XYZ
>
> Any easy way to do this using awk?
>
> Thanks in advance.



Hi,
Try this script out:

awk '/SR(A|2|3)/ { printf "%s", $0 > "out1" } /SR4/ { print > "out1" } /SRL/ { print > "out2" }' file

This creates two files, one (out1) for SRA, SR2... records and the
other (out2) for SRL records.
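
A minimal sketch of the same pattern-action approach, laid out one rule per line. The sample data is copied from the question. The ^ anchors and the trailing space in each pattern are assumptions added here so that a tag appearing later in a record cannot trigger a rule; they are not part of the script as posted. It writes file, out1 and out2 into the current directory, so run it somewhere disposable.

```shell
# Sample input taken from the original question.
printf '%s\n' 'SRA xyz' 'SR2 abc' 'SR3 123' 'SR4 012' 'SRL ABC' \
              'SRA xyz' 'SR2 abc' 'SR3 123' 'SR4 012' 'SRL ABC' 'SRL XYZ' > file

awk '
    /^SR[A23] / { printf "%s", $0 > "out1" }   # append without a newline
    /^SR4 /     { print > "out1" }             # SR4 closes the merged line
    /^SRL /     { print > "out2" }             # SRL records go to the second file
' file

# Show both result files.
cat out1 out2
```

With the sample input, out1 should hold the two merged lines and out2 the three SRL records, as in File1 and File2 of the question.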

Vassilis

Ed Morton

2007-06-11, 9:57 pm

juanpo@bellsouth.net wrote:
> Hi,
>
> I need some help parsing a file using awk. The file looks like this:
>
> SRA xyz
> SR2 abc
> SR3 123
> SR4 012
> SRL ABC
> SRA xyz
> SR2 abc
> SR3 123
> SR4 012
> SRL ABC
> SRL XYZ
>
> First I need to merge the records that start with SRA, SR2, SR3 and
> SR4 into one line. Also the records that start with SRL need to be
> parsed into a separate file. Output should look like this:
>
> File1:
> SRA xyzSR2 abcSR3 123SR4 012
> SRA xyzSR2 abcSR3 123SR4 012
> File2:
> SRL ABC
> SRL ABC
> SRL XYZ
>
> Any easy way to do this using awk?
>
> Thanks in advance.
>


awk '{ORS=/SR[4L]/?"\n":""; print > /SRL/?"file2":"file1"}' file
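
A hedged, spelled-out variant of the one-liner above, for readers who want to see the two steps separately. The idea is unchanged: ORS is set per record so that only SR4 and SRL records end a line, and the output file is picked by a conditional. The ^ anchors, the explicit $0 ~ comparisons and the parentheses around the redirection target are assumptions added here for portability; they are not in the posted command. The sketch writes file, file1 and file2 into the current directory.

```shell
# Sample input taken from the original question.
printf '%s\n' 'SRA xyz' 'SR2 abc' 'SR3 123' 'SR4 012' 'SRL ABC' \
              'SRA xyz' 'SR2 abc' 'SR3 123' 'SR4 012' 'SRL ABC' 'SRL XYZ' > file

awk '{
    # End the line after SR4 or SRL; otherwise keep building the current line.
    ORS = ($0 ~ /^SR[4L] /) ? "\n" : ""
    # Parenthesize the target so it is a single expression after ">".
    print > (($0 ~ /^SRL /) ? "file2" : "file1")
}' file

# Show both result files.
cat file1 file2
```

The parentheses after ">" matter because an unparenthesized expression in that position is not parsed the same way by every awk.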

Ed.

juanpo@bellsouth.net

2007-06-11, 9:57 pm

Thanks Vassilis, this worked great. I appreciate it.

juanpo@bellsouth.net

2007-06-11, 9:57 pm

> awk '{ORS=/SR[4L]/?"\n":""; print > /SRL/?"file2":"file1"}' file
>
> Ed.


Thanks, Ed. I'll try this one just for fun.

Kenny McCormack

2007-06-11, 9:57 pm

In article <Y_GdnbGTA8YGFPDbnZ2dnUVZ_tKjnZ2d@comcast.com>,
Ed Morton <morton@lsupcaemnt.com> wrote:
....
>awk '{ORS=/SR[4L]/?"\n":""; print > /SRL/?"file2":"file1"}' file
>
> Ed.


Bravo!

You could compress even further by doing:

print > ("file"((/SRL/)+1))

Notes:
1) Works in POSIX AWKs, but not TAWK.
2) Outer level of parens possibly not necessary. Haven't tested.
3) Not sure why parens around /SRL/ are needed.

Vassilis

2007-06-11, 9:57 pm


Kenny McCormack wrote:
> In article <Y_GdnbGTA8YGFPDbnZ2dnUVZ_tKjnZ2d@comcast.com>,
> Ed Morton <morton@lsupcaemnt.com> wrote:
> ...
>
> Bravo!
>
> You could compress even further by doing:
>
> print > ("file"((/SRL/)+1))
>
> Notes:
> 1) Works in POSIX AWKs, but not TAWK.
> 2) Outer level of parens possibly not necessary. Haven't tested.
> 3) Not sure why parens around /SRL/ are needed.



Outstanding!
But why "+1"? Without it, you also save a pair of parens.

Vassilis

Anton Treuenfels

2007-06-11, 9:57 pm


"Vassilis" <F.H.Novalis@gmail.com> wrote in message
news:1181596537.526397.296590@h2g2000hsg.googlegroups.com...
>
>
> Outstanding!
> But why "+1"? Without it, you also save a pair of parens.


I'd guess (and it is a guess) that the inner parentheses enable the /SRL/ to
be parsed as a sub-expression, i.e., complete in itself, that evaluates to 0
or 1 depending on whether the record matches. The "+1" then yields 1 or 2, so
the final result is "file1" or "file2".

I suppose one could write:

print > "file" ($0 ~ /SRL/ + 1)

to get the same result (not tested).
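
A small sketch for checking that reasoning without creating the output files: it only reports which file name each record would map to. The comparison is parenthesized explicitly because, in awk's grammar, ~ binds more loosely than +, so the untested form above may not group the way the explanation describes. The ^ anchor, the variable n and the targets.txt file are illustrative assumptions, not part of the thread.

```shell
# Sample input taken from the original question.
printf '%s\n' 'SRA xyz' 'SR2 abc' 'SR3 123' 'SR4 012' 'SRL ABC' \
              'SRA xyz' 'SR2 abc' 'SR3 123' 'SR4 012' 'SRL ABC' 'SRL XYZ' > file

awk '{
    # The match yields 0 or 1; adding 1 gives the suffix 1 or 2.
    n = ($0 ~ /^SRL /) + 1
    print $0 " -> file" n
}' file > targets.txt

# Show the record-to-file mapping.
cat targets.txt
```

Once the mapping looks right, the same expression can be used as a redirection target, wrapped in parentheses: print > ("file" n).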

- Anton Treuenfels


Vassilis

2007-06-11, 9:57 pm


Anton Treuenfels wrote:
> "Vassilis" <F.H.Novalis@gmail.com> wrote in message
> news:1181596537.526397.296590@h2g2000hsg.googlegroups.com...
>
> I'd guess (and it is a guess) that the inner parentheses enable the /SRL/ to
> be parsed as a sub-expression, i.e., complete in itself, that evaluates to 0
> or 1 depending on whether the record matches. The "+1" then yields 1 or 2, so
> the final result is "file1" or "file2".
>
> I suppose one could write:
>
> print > "file" ($0 ~ /SRL/ + 1)
>
> to get the same result (not tested).
>
> - Anton Treuenfels



Sure, but I was thinking more along the lines of awk golf ;)
Without "+1", the generated files are simply named file0 and file1, which is
perfectly good.

Vassilis

Ed Morton

2007-06-12, 7:57 am

Kenny McCormack wrote:
> In article <Y_GdnbGTA8YGFPDbnZ2dnUVZ_tKjnZ2d@comcast.com>,
> Ed Morton <morton@lsupcaemnt.com> wrote:
> ...
>
>
>
> Bravo!
>
> You could compress even further by doing:
>
> print > ("file"((/SRL/)+1))


or this should work in any awk and is arguably a bit clearer:

print > "file"(/SRL/?2:1)

Regards,

Ed.

Kenny McCormack

2007-06-12, 9:57 pm

In article <WPKdnZXISY6FCPPbnZ2dnUVZ_vCknZ2d@comcast.com>,
Ed Morton <morton@lsupcaemnt.com> wrote:
>Kenny McCormack wrote:
>
>or this should work in any awk and is arguably a bit clearer:
>
> print > "file"(/SRL/?2:1)


$ tawk '{print "file"(/foo/?2:1)}'
awk: error in program line 1: illegal expression (check parenthesis)
awk: aborting due to compilation errors
$

I.e., this doesn't work in "any awk", since at least one AWK treats
regular expressions as first class citizens [1].

[1] This is a bit jargon-y, but you know what I mean.

Copyright 2008 codecomments.com