For Programmers: Free Programming Magazines  












 

Author Find difference between first and last values in a column
Jonny

2005-04-12, 3:56 am

Hi,

I have a file containing, for example:

15000 3000
20000 6700

I would like to be able to calculate:

(last value in col 1) - (first value in col 1)

and also:

(last value in col 2) - (first value in col 2)

Please can you help.

Many Thanks.

Regards,
Jonny
Jonny

2005-04-12, 3:56 am

Loki Harfagr wrote:

> On Sat, 09 Apr 2005 09:38:03 +0000, Jonny wrote:
>
>
> Could be :
> $ awk 'BEGIN{getline; A=$1; B=$2} {LA=$1-A; LB=$2-B} END{print LA" "LB}'
>
> or simply :
> $ awk 'BEGIN{getline; A=$1; B=$2} END{print $1-A" "$2-B}'



Thanks Loki. This is exactly what I wanted.

Regards,
Jonny
Ed Morton

2005-04-12, 3:56 am



Jonny wrote:
> Loki Harfagr wrote:
>
>
>
>
>
> Thanks Loki. This is exactly what I wanted.


No, it isn't. Although it might produce the output you want, using
getline has many undesirable side effects (google for it) and bypasses
awk's built-in record loop, so it's rarely the right (i.e. awk-idiomatic)
solution.

In this case, there's just no need for getline at all. The right
solution is the shorter and more idiomatic:

awk 'NR==1{A=$1;B=$2}END{print $1-A,$2-B}'

Ed.
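
As a quick check, the one-liner above can be run against the two sample rows from the original question. This is only a sketch: the data is piped in with printf rather than read from a file, and the shell variable name `result` is made up for the illustration. It relies on the END rule still seeing the last record's fields, as the posted command does.

```shell
# Feed the two sample rows from the question to awk on standard input.
# NR==1 saves the first record's fields; END subtracts them from the
# fields of the last record read.
result=$(printf '15000 3000\n20000 6700\n' |
  awk 'NR==1{A=$1;B=$2}END{print $1-A,$2-B}')
echo "$result"   # 5000 3700
```

The comma in the print statement separates the two differences with OFS, which is a single space by default.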
Kenny McCormack

2005-04-12, 3:56 am

In article <tMadnSqaeq_ySsrfRVn-jg@comcast.com>,
Ed Morton <morton@lsupcaemnt.com> wrote:
....
>No, it isn't. Although it might produce the output you want, using
>getline has many undesirable side effects (google for it) and bypasses
>awk's built-in record loop, so it's rarely the right (i.e. awk-idiomatic)
>solution.


Well put, sir.

>In this case, there's just no need for getline at all. The right
>solution is the shorter and more idiomatic:
>
>awk 'NR==1{A=$1;B=$2}END{print $1-A,$2-B}'
>
> Ed.


Yup. That's what I had in mind.

Jonny

2005-04-12, 3:56 am

Ed Morton wrote:

> getline has many undesirable side effects (google for it) and bypasses
> awk's built-in record loop, so it's rarely the right (i.e. awk-idiomatic)
> solution.
>
> In this case, there's just no need for getline at all. The right
> solution is the shorter and more idiomatic:
>
> awk 'NR==1{A=$1;B=$2}END{print $1-A,$2-B}'



Thanks Ed.

I didn't know about the potential problems with getline.

Regards,
Jonny
Ed Morton

2005-04-12, 3:56 am



Ulrich M. Schwarz wrote:
> Ed Morton <morton@lsupcaemnt.com> writes:
>
> [...]
>
>
>
> In the quest for fewer keystrokes and more programmer heart strokes:
> assuming A cannot be zero, shouldn't
>
> awk '!A{A=$1; B=$2} END{print $1-A,$2-B}'
>
> also work? ;-)


Yes, but I don't see any reason to assume A can't be zero.

> Ulrich
> (frankly, if you had asked me if an END rule could still access
> $i, I'd have said "Euuuuuuugh... better don't rely on it".)


It'd be safe with all modern awks, but if you're worried, do this:

awk 'NR==1{A=$1;B=$2}{X=$1;Y=$2}END{print X-A,Y-B}'

Ed.
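
To illustrate why the zero case matters, here is a small sketch using made-up data whose first column starts at 0 (the three rows and the shell variable names `safe` and `naive` are assumptions for the example, not from the thread). With the `!A` test, A is still "false" after the first record, so it gets overwritten by the second record and the difference comes out wrong. The NR==1 version that copies the fields into X and Y is not affected.

```shell
# Hypothetical input: the first value in column 1 is zero.
data='0 10
5 20
7 50'

# Portable version: save first-record fields, track the latest fields in X and Y.
safe=$(printf '%s\n' "$data" |
  awk 'NR==1{A=$1;B=$2}{X=$1;Y=$2}END{print X-A,Y-B}')

# The !A shortcut: A is 0 after record 1, so record 2 overwrites A and B.
naive=$(printf '%s\n' "$data" |
  awk '!A{A=$1; B=$2} END{print $1-A,$2-B}')

echo "NR==1 version: $safe"    # 7 40
echo "!A version:    $naive"   # 2 30
```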
Ed Morton

2005-04-12, 3:56 am



Loki Harfagr wrote:
> On Sat, 09 Apr 2005 08:23:57 -0500, Ed Morton wrote:
>
>
>
>
> In this case? Frankly, I doubt it. Could you elaborate, please?


One side effect that springs to mind is that by adding the call to
getline, the FILENAME variable gets set in the BEGIN section when it
wouldn't normally be. That may or may not be a big deal, but it
introduces a totally unnecessary consideration for future enhancements.
Now, what if you decide to enhance it to read the value into a variable?
Well, then you'll find that, for reasons best known to the providers, NF
doesn't get set, so you can't rely on it for additional parsing. To be
totally honest, I considered figuring out which particular
getline caveats would and wouldn't apply in this context, but there's
just no point since there's absolutely no need for getline here, so
I'm not going to waste my time. Just avoid it unless you NEED it, then
figure out how to address its caveats in that context, and I'll be happy
to help if you have questions.

Ed.
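
The NF caveat mentioned above can be seen with a short sketch. The input line 'a b c' and the names `line`, `nf_plain` and `nf_var` are made up for the example. A plain getline replaces $0 and therefore updates NF, while getline into a variable leaves $0 and NF alone.

```shell
# Plain getline: reads the record into $0, so NF reflects its three fields.
nf_plain=$(printf 'a b c\n' | awk 'BEGIN{getline; print NF}')

# getline into a variable: $0 is untouched, so NF keeps its initial value of 0.
nf_var=$(printf 'a b c\n' | awk 'BEGIN{getline line; print NF}')

echo "plain getline NF=$nf_plain, getline var NF=$nf_var"   # 3 and 0
```

If the fields of such a line are needed, one option is to split it explicitly, e.g. n = split(line, parts), rather than relying on NF.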
Loki Harfagr

2005-04-13, 8:55 pm

On Sat, 09 Apr 2005 08:23:57 -0500, Ed Morton wrote:

>
> No, it isn't. Although it might produce the output you want, using
> getline has many undesirable side-effects


In this case? Frankly, I doubt it. Could you elaborate, please?
(I mean for this typical case of reading the first and last records.)

> (google for it) and bypasses
> awk's built-in record loop, so it's rarely the right (i.e. awk-idiomatic)
> solution.


I do agree on that. It was just a way to avoid a condition that
isn't strictly needed; I admit it was overkill as an optimisation
in this special case, when all that's needed is reading the first and last
record :D)

>
> In this case, there's just no need for getline at all. The right
> solution is the shorter and more idiomatic:
>
> awk 'NR==1{A=$1;B=$2}END{print $1-A,$2-B}'


Definitely better, especially if the script is supposed to be extended,
which I foolishly presupposed wasn't the case. I should've known better!



Copyright 2008 codecomments.com