Help parsing a file
| juanpo@bellsouth.net 2007-06-11, 9:57 pm |
| Hi,
I need some help parsing a file using awk. The file looks like this:
SRA xyz
SR2 abc
SR3 123
SR4 012
SRL ABC
SRA xyz
SR2 abc
SR3 123
SR4 012
SRL ABC
SRL XYZ
First I need to merge the records that start with SRA, SR2, SR3 and
SR4 into one line. Also the records that start with SRL need to be
parsed into a separate file. Output should look like this:
File1:
SRA xyzSR2 abcSR3 123SR4 012
SRA xyzSR2 abcSR3 123SR4 012
File2:
SRL ABC
SRL ABC
SRL XYZ
Any easy way to do this using awk?
Thanks in advance.
| |
| Vassilis 2007-06-11, 9:57 pm |
|
jua...@bellsouth.net :
> Hi,
>
> I need some help parsing a file using awk. The file looks like this:
>
> SRA xyz
> SR2 abc
> SR3 123
> SR4 012
> SRL ABC
> SRA xyz
> SR2 abc
> SR3 123
> SR4 012
> SRL ABC
> SRL XYZ
>
> First I need to merge the records that start with SRA, SR2, SR3 and
> SR4 into one line. Also the records that start with SRL need to be
> parsed into a separate file. Output should look like this:
>
> File1:
> SRA xyzSR2 abcSR3 123SR4 012
> SRA xyzSR2 abcSR3 123SR4 012
> File2:
> SRL ABC
> SRL ABC
> SRL XYZ
>
> Any easy way to do this using awk?
>
> Thanks in advance.
Hi,
Try this script out:
awk '/SR(A|2|3)/ { printf "%s", $0 > "out1" } /SR4/ { print > "out1" } /SRL/ { print > "out2" }' file
This creates two files, one (out1) for SRA, SR2... records and the
other (out2) for SRL records.
Vassilis
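
The patterns in the script above are unanchored, so they would also match SRA, SR4 or SRL appearing later in a line. A minimal sketch of the same idea with the patterns anchored to the start of the record, assuming the sample data from the question is saved as input.txt (a file name chosen here for illustration):

```shell
# Recreate the sample input from the question (input.txt is an assumed name).
cat > input.txt <<'EOF'
SRA xyz
SR2 abc
SR3 123
SR4 012
SRL ABC
SRA xyz
SR2 abc
SR3 123
SR4 012
SRL ABC
SRL XYZ
EOF

awk '
/^SR[A23]/ { printf "%s", $0 > "out1" }   # no newline, so the next record is appended
/^SR4/     { print > "out1" }             # SR4 closes the merged line
/^SRL/     { print > "out2" }             # SRL records go to the second file
' input.txt

cat out1
cat out2
```

With the sample data, out1 should hold the two merged lines and out2 the three SRL lines.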
| |
| Ed Morton 2007-06-11, 9:57 pm |
| juanpo@bellsouth.net wrote:
> Hi,
>
> I need some help parsing a file using awk. The file looks like this:
>
> SRA xyz
> SR2 abc
> SR3 123
> SR4 012
> SRL ABC
> SRA xyz
> SR2 abc
> SR3 123
> SR4 012
> SRL ABC
> SRL XYZ
>
> First I need to merge the records that start with SRA, SR2, SR3 and
> SR4 into one line. Also the records that start with SRL need to be
> parsed into a separate file. Output should look like this:
>
> File1:
> SRA xyzSR2 abcSR3 123SR4 012
> SRA xyzSR2 abcSR3 123SR4 012
> File2:
> SRL ABC
> SRL ABC
> SRL XYZ
>
> Any easy way to do this using awk?
>
> Thanks in advance.
>
awk '{ORS=/SR[4L]/?"\n":""; print > /SRL/?"file2":"file1"}' file
Ed.
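
For anyone reading along, here is the same one-liner spread over several lines with comments. Two things in this sketch are additions rather than part of the original post: the regexes are anchored with ^, and the redirection target is wrapped in parentheses so the result does not depend on how a particular awk parses an unparenthesized ?: after >. The file name input.txt is an assumption.

```shell
# Recreate the sample input from the question (input.txt is an assumed name).
cat > input.txt <<'EOF'
SRA xyz
SR2 abc
SR3 123
SR4 012
SRL ABC
SRA xyz
SR2 abc
SR3 123
SR4 012
SRL ABC
SRL XYZ
EOF

awk '{
    # End the output record with a newline only after SR4 or SRL lines;
    # every other line is written with no terminator, so the lines run together.
    ORS = /^SR[4L]/ ? "\n" : ""
    # SRL lines go to file2, everything else to file1.
    print > (/^SRL/ ? "file2" : "file1")
}' input.txt

cat file1
cat file2
```

The trick is that ORS is reassigned for every record before print runs, so the same print statement either ends a line or leaves it open.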
| |
| juanpo@bellsouth.net 2007-06-11, 9:57 pm |
| Thanks Vassilis, this worked great. I appreciate it.
| |
| juanpo@bellsouth.net 2007-06-11, 9:57 pm |
| > awk '{ORS=/SR[4L]/?"\n":""; print > /SRL/?"file2":"file1"}' file
>
> Ed.
Thanks Ed, I'll try this one just for fun.
| |
| Kenny McCormack 2007-06-11, 9:57 pm |
| In article <Y_GdnbGTA8YGFPDbnZ2dnUVZ_tKjnZ2d@comcast.com>,
Ed Morton <morton@lsupcaemnt.com> wrote:
....
>awk '{ORS=/SR[4L]/?"\n":""; print > /SRL/?"file2":"file1"}' file
>
> Ed.
Bravo!
You could compress even further by doing:
print > ("file"((/SRL/)+1))
Notes:
1) Works in POSIX AWKs, but not TAWK.
2) Outer level of parens possibly not necessary. Haven't tested.
3) Not sure why parens around /SRL/ are needed.
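
A quick check of what the expression evaluates to may help with the notes above. In POSIX awk, a bare regex used as an expression stands for a match against $0 and yields 1 or 0, so (/SRL/)+1 is 1 or 2. This sketch prints the resulting file name next to the first field instead of writing to it; the three sample records and the name values.txt are made up for illustration.

```shell
# Three sample records: two that should map to file1, one that should map to file2.
printf '%s\n' 'SRA xyz' 'SR4 012' 'SRL ABC' |
awk '{
    # (/SRL/) is 1 when the record matches and 0 otherwise; adding 1 gives 1 or 2,
    # which is then concatenated onto the string "file".
    print $1, "file" ((/SRL/) + 1)
}' > values.txt

cat values.txt
```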
| |
| Vassilis 2007-06-11, 9:57 pm |
|
Kenny McCormack wrote:
> In article <Y_GdnbGTA8YGFPDbnZ2dnUVZ_tKjnZ2d@comcast.com>,
> Ed Morton <morton@lsupcaemnt.com> wrote:
> ...
>
> Bravo!
>
> You could compress even further by doing:
>
> print > ("file"((/SRL/)+1))
>
> Notes:
> 1) Works in POSIX AWKs, but not TAWK.
> 2) Outer level of parens possibly not necessary. Haven't tested.
> 3) Not sure why parens around /SRL/ are needed.
Outstanding!
But why "+1"? Without it, you also save a pair of parens.
Vassilis
| |
| Anton Treuenfels 2007-06-11, 9:57 pm |
|
"Vassilis" <F.H.Novalis@gmail.com> wrote in message
news:1181596537.526397.296590@h2g2000hsg.googlegroups.com...
>
>
> Outstanding!
> But why "+1"? Without it, you also save a pair of parens.
I'd guess (and it is a guess) that the inner parentheses enable the /SRL/ to
be parsed as a sub-expression, i.e., complete in itself, that evaluates to 0
or 1 depending on whether the record matches. The "+1" then yields 1 or 2, so
the final result is "file1" or "file2".
I suppose one could write:
print > "file" ($0 ~ /SRL/ + 1)
to get the same result (not tested).
- Anton Treuenfels
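
One caveat if you try the `$0 ~ /SRL/ + 1` form: in awk's precedence table + binds more tightly than ~, so without extra parentheses the "+ 1" attaches to the regex rather than to the result of the match. A minimal sketch with the match and the whole file name parenthesized explicitly, assuming a shortened sample saved as input.txt (an illustrative name) and reusing the ORS trick from earlier in the thread:

```shell
# Shortened sample: one SRA..SR4 group followed by one SRL record.
printf '%s\n' 'SRA xyz' 'SR2 abc' 'SR3 123' 'SR4 012' 'SRL ABC' > input.txt

awk '{
    ORS = /^SR[4L]/ ? "\n" : ""
    # ($0 ~ /^SRL/) is evaluated first and gives 0 or 1; adding 1 gives 1 or 2.
    # The outer parentheses keep the concatenation together as the file name.
    print > ("file" (($0 ~ /^SRL/) + 1))
}' input.txt

cat file1
cat file2
```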
| |
| Vassilis 2007-06-11, 9:57 pm |
|
Anton Treuenfels wrote:
> "Vassilis" <F.H.Novalis@gmail.com> wrote in message
> news:1181596537.526397.296590@h2g2000hsg.googlegroups.com...
>
> I'd guess (and it is a guess) that the inner parentheses enable the /SRL/ to
> be parsed as a sub-expression, i.e., complete in itself, that evaluates to 0
> or 1 depending on whether the record matches. The "+1" then yields 1 or 2, so
> the final result is "file1" or "file2".
>
> I suppose one could write:
>
> print > "file" ($0 ~ /SRL/ + 1)
>
> to get the same result (not tested).
>
> - Anton Treuenfels
Sure, but I was thinking more along the lines of awk golf ;)
Without "+1", the generated files default to file[01], which is perfectly
good.
Vassilis
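
A small sketch of the "+1"-less variant, for anyone curious what it produces. The naming shifts: the merged records land in file0 and the SRL records in file1, so file1 no longer means what it did in the earlier scripts. The shortened sample and the name input.txt are assumptions for illustration.

```shell
# Shortened sample: one SRA..SR4 group followed by one SRL record.
printf '%s\n' 'SRA xyz' 'SR2 abc' 'SR3 123' 'SR4 012' 'SRL ABC' > input.txt

awk '{
    ORS = /^SR[4L]/ ? "\n" : ""
    # (/^SRL/) is 0 or 1, so the target is "file0" or "file1".
    print > ("file" (/^SRL/))
}' input.txt

cat file0
cat file1
```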
| |
| Ed Morton 2007-06-12, 7:57 am |
| Kenny McCormack wrote:
> In article <Y_GdnbGTA8YGFPDbnZ2dnUVZ_tKjnZ2d@comcast.com>,
> Ed Morton <morton@lsupcaemnt.com> wrote:
> ...
>
>
>
> Bravo!
>
> You could compress even further by doing:
>
> print > ("file"((/SRL/)+1))
or this should work in any awk and is arguably a bit clearer:
print > "file"(/SRL/?2:1)
Regards,
Ed.
| |
| Kenny McCormack 2007-06-12, 9:57 pm |
| In article <WPKdnZXISY6FCPPbnZ2dnUVZ_vCknZ2d@comcast.com>,
Ed Morton <morton@lsupcaemnt.com> wrote:
>Kenny McCormack wrote:
>
>or this should work in any awk and is arguably a bit clearer:
>
> print > "file"(/SRL/?2:1)
$ tawk '{print "file"(/foo/?2:1)}'
awk: error in program line 1: illegal expression (check parenthesis)
awk: aborting due to compilation errors
$
I.e., this doesn't work in "any awk", since at least one AWK treats
regular expressions as first class citizens [1].
[1] This is a bit jargon-y, but you know what I mean.
| |