Code Comments
Hello, I'm very new to awk, and am unsure if I have approached this
correctly. I have a logfile that may have duplicate records. I need to
take the latest version. I decided to load the records into an array
using the key as the index. This way when a duplicate key is found it
just sets the new record in its place.
{
    key = $1                # the first field is the record key
    accounts[key] = $0      # a later duplicate overwrites the earlier record
}
END {
    for (x in accounts) {
        print accounts[x]
    }
}
Are there any limitations or problems with this? Thanks in advance,
Michael

On 30 May 2005 22:09:22 -0700, like.a.mango@gmail.com wrote:
Looks OK to me.
>END {
> for (x in accounts) {
> print accounts[x]
> }
Hint: a "for (x in accounts)" loop doesn't visit the array elements in
the sequence of input; the order is unspecified. If you want to keep the
input order you can use a second array that maps a sequential count to
each key, and use it to access the accounts array. If your key is
sortable (e.g. something like a timestamp YYYYMMDDhhmmss) you can pipe
the result of awk to "sort".
>}
>
HTH
Axel
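
The second-array idea from the hint above could look roughly like this. It is only a sketch: the function name latest_in_input_order and the array names order and n are made up for illustration, and it assumes the key is the first whitespace-separated field, as in the original script.

```shell
# Hypothetical helper: reads the log from the files given as arguments, or from stdin.
latest_in_input_order() {
    awk '
    {
        key = $1
        if (!(key in accounts))
            order[++n] = key        # remember each new key once, in input order
        accounts[key] = $0          # a later duplicate overwrites the earlier record
    }
    END {
        # walk the keys by their sequential count instead of using "for (x in ...)"
        for (i = 1; i <= n; i++)
            print accounts[order[i]]
    }
    ' "$@"
}
```

Called as "latest_in_input_order logfile", this prints one line per key, in the order each key first appeared, with the latest record for that key. If the position of the last occurrence matters instead, the index would have to be updated on every duplicate, which this sketch does not do.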

Axel Sander wrote:
> On 30 May 2005 22:09:22 -0700, like.a.mango@gmail.com wrote:
>
> Looks OK to me.

For a large file it'll use a lot of memory. In older awks there's a
limit of (IIRC) 4096 entries in an array.

> Hint: the "in" operator doesn't print the array elements in the
> sequence of input! If you want to keep this order you can use a second
> array with the keys and a sequential count. Use this to access the
> accounts array. If your key is sortable (i.e. something like a
> timestamp YYYYMMDDhhmmss) you can pipe the result of awk to "sort".

If he had access to "sort" he wouldn't need to do this in awk. If you're
using "gawk", you can sort the result with "asort()" or "asorti()" but
just keeping an index seems like the best approach.

Ed.
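
For the gawk route mentioned above, a minimal sketch using asorti() might look like the following. It assumes gawk is installed, the function name latest_sorted_by_key and the array name keys are invented for illustration, and it relies on asorti() comparing the keys as strings by default, so it suits fixed-width keys such as YYYYMMDDhhmmss timestamps.

```shell
# Hypothetical helper: requires gawk; reads files given as arguments, or stdin.
latest_sorted_by_key() {
    gawk '
    { accounts[$1] = $0 }           # a later duplicate overwrites the earlier record
    END {
        # asorti() stores the sorted indices of accounts in keys[1..n]
        n = asorti(accounts, keys)
        for (i = 1; i <= n; i++)
            print accounts[keys[i]]
    }
    ' "$@"
}
```

This still holds every distinct record in memory, so the concern about large files applies here as well.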