I wrote:
> I am piping the output from one awk command to another awk command, and
> I was wondering if it is possible to combine them.
>
> The first command is just to print the second field of a file:
>
> awk "BEGIN {FS = \" \"} {print $2}"
>
> and the second command is to remove duplicates from the (unsorted)
> result of the first command:
>
> awk "{if (data[$0]++ == 0) lines[++count] = $0} END {for (i = 1; i <= count; i++) print lines[i]}"
>
> Please could you tell me if it is possible to combine them.
Thanks hq00e and Ed for your replies.
I found Ed's answer:
awk "BEGIN{ FS=\" \" } { array[$2]++ } END{ for ( i in array ) print i }"
to run slightly faster than hq00e's:
awk "BEGIN{ FS=\" \" } { array[$2]=$2 } END{ for ( i in array ) print array[i] }"
and both of them run about 40% to 45% faster than my
two-command approach.
Ed wrote:
> but you'd probably be changing the order of the output compared to the
> input by using the "in" operator this way (see
> http://www.gnu.org/software/gawk/ma...anning-an-Array)
> which may not be desirable.
I don't mind, since the input file was not in any particular order
anyway.
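For anyone who does need the output in input order, here is a minimal sketch of a single-command version that keeps the first occurrence of each second field. It assumes a POSIX shell (hence the single quotes; the double-quoted commands above would need the same escaping adapted for other shells), and the function name unique_second_field and the array name seen are just illustrative choices of mine. The BEGIN block is left out because a single space is already awk's default FS.

```shell
# Hypothetical helper: print each distinct second field once,
# in the order it first appears in the input.
unique_second_field() {
    # seen[$2]++ is 0 (false) the first time a value appears, so the
    # negation is true only for the first occurrence of each value.
    awk '!seen[$2]++ { print $2 }' "$@"
}

# Small sample input; the second fields are x, y, x, z, y.
printf 'a x\nb y\nc x\nd z\ne y\n' | unique_second_field
```

I have not timed this against the two versions above, so I can't say how it compares for speed.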
Your help is much appreciated.
Regards,
Jonny