Code Comments
This kinda follows from the post by Stephen in another thread (*) - about
splitting things up by spaces, in which some variation of:
FS='';$1=$1
was given. The idea being that when $1 is assigned to - the whole line
gets re-built with OFS.
Anyway, this area of AWK wizardry has always seemed weird to me - in the
sense that I see no reason for it (i.e., for this rebuilding with OFS) and
I don't think the rules for it are very well-defined (which is why I generally
avoid it). Consider the following case:
I am trying to extract a field from a line where the field occurs in
columns 126-137 (12 chars), but may be (in fact, usually is) less than 12
chars, and it is, of course, blank-padded on the right in that case. The
goal is to get the field, without any trailing blanks.
Now, the following works (with either tawk or gawk):
{$1=substr($0,126,12);$0=$0;print "|"$1"|",length($1)}
But, and here is the thing, why do I need that extra $0=$0? W/o it,
I still get the trailing blanks (with both tawk & gawk). I thought
I understood dem rules!
(*) Either here or in c.l.shell - not sure right now.
In article <chqpj6$k7f$1@yin.interaccess.com>,
Kenny McCormack <gazelle@interaccess.com> wrote:
[...]
% Now, the following works (with either tawk or gawk):
%
% {$1=substr($0,126,12);$0=$0;print "|"$1"|",length($1)}
%
% But, and here is the thing, why do I need that extra $0=$0?
From susv3
The symbol $0 shall refer to the entire record; setting any other field
causes the re-evaluation of $0. Assigning to $0 shall reset the values
of all other fields and the NF built-in variable.
In other words, when you assign to $1, you change the value of $0, but not
of $2, $3, and so on. When you assign to $0, you change the value of NF
and $1, $2, .., $NF.
--
Patrick TJ McPhee
East York Canada
ptjm@interlog.com
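To make the quoted rule concrete, here is a minimal sketch. The sample input and the function names field_assign and record_assign are made up for illustration and do not come from the thread. It assumes a POSIX-style awk with the default FS and OFS.

```shell
# Assigning to a field rebuilds $0 with OFS, but NF and the other
# fields are left alone: $2 now holds "x y", blank included.
field_assign() {
    printf 'a  b   c\n' | awk '{ $2 = "x y"; print $0; print NF; print $3 }'
}

# Assigning to $0 (here, to itself) re-splits the record, so NF and
# every field are recomputed from the rebuilt text.
record_assign() {
    printf 'a  b   c\n' | awk '{ $2 = "x y"; $0 = $0; print $0; print NF; print $3 }'
}

field_assign    # prints: a x y c / 3 / c
record_assign   # prints: a x y c / 4 / y
```

In both cases $0 prints the same; only the second one changes what NF and $3 contain.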
In article <2qchjdFuduisU1@uni-berlin.de>,
Patrick TJ McPhee <ptjm@interlog.com> wrote:
>In article <chqpj6$k7f$1@yin.interaccess.com>,
>Kenny McCormack <gazelle@interaccess.com> wrote:
>
>[...]
>
>% Now, the following works (with either tawk or gawk):
>%
>% {$1=substr($0,126,12);$0=$0;print "|"$1"|",length($1)}
>%
>% But, and here is the thing, why do I need that extra $0=$0?
>
>From susv3
>
> The symbol $0 shall refer to the entire record; setting any other field
> causes the re-evaluation of $0. Assigning to $0 shall reset the values
> of all other fields and the NF built-in variable.
>
>In other words, when you assign to $1, you change the value of $0, but not
>of $2, $3, and so on. When you assign to $0, you change the value of NF
>and $1, $2, .., $NF.
I'm sorry. So, what's your point?
Hello Kenny,
In article <chqpj6$k7f$1@yin.interaccess.com>, Kenny McCormack wrote:
> {$1=substr($0,126,12);$0=$0;print "|"$1"|",length($1)}
>
> But, and here is the thing, why do I need that extra $0=$0?
By assigning a value to $1, you also change the value of $0.
But no field splitting is performed, so there is no way the spaces could
disappear.
By assigning a value to $0, you change $1, $2, etc., because the new
text is split to fields.
An example:
echo XX YY ZZ | awk '
{
    $1="a b c"
    print "0", $0
    print "1", $1
    print "2", $2
    $0=($1 " YY ZZ") # which is the current value of $0
    print "2", $2
}'
Result:
0 a b c YY ZZ
1 a b c
2 YY
2 b
Fields can contain spaces, why not?
So the assignment $1="a b c" doesn't change $2.
But when you assign new text to $0, it has to be split to fields.
And it does not matter what the expression on the right-hand side of the
assignment looks like. It may be a constant string literal, or any
expression. Having $0 on the right-hand side is just one special case.
Hope this explains it,
Stepan Kasal
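A small sketch of that last point, that any assignment to $0 triggers re-splitting whatever the right-hand side is. The function names resplit_literal and resplit_expr and the replacement strings are hypothetical, chosen only for this illustration.

```shell
# Right-hand side is a constant string literal: the record is replaced
# and split into four fields.
resplit_literal() {
    echo 'XX YY ZZ' | awk '{ $0 = "p q r s"; print NF, $2 }'
}

# Right-hand side is an expression built from the old fields: it is
# evaluated first, then the result becomes the new record and is split.
resplit_expr() {
    echo 'XX YY ZZ' | awk '{ $0 = $3 " " $1; print NF, $1 }'
}

resplit_literal   # prints: 4 q
resplit_expr      # prints: 2 ZZ
```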
In article <slrnck2i3i.pca.kasal@matsrv.math.cas.cz>,
Stepan Kasal <kasal@ucw.cz> wrote:
>Hello Kenny,
>
>In article <chqpj6$k7f$1@yin.interaccess.com>, Kenny McCormack wrote:
>
>By assigning a value to $1, you also change the value of $0.
>But no field splitting is performed, so there is no way the spaces could
>disappear.
>
>By assigning a value to $0, you change $1, $2, etc., because the new
>text is split to fields.
>
>An example:
>
>echo XX YY ZZ | awk '
>{
> $1="a b c"
> print "0", $0
> print "1", $1
> print "2", $2
> $0=($1 " YY ZZ") # which is current value of $0
> print "2", $2
>}'
>
>Result:
>0 a b c YY ZZ
>1 a b c
>2 YY
>2 b
>
>Fields can contain spaces, why not?
>So the assignment $1="a b c" doesn't change $2.
>
>But when you assign new text to $0, it has to be split to fields.
>And it does not matter what the expression on the right-hand side of the
>assignment looks like. It may be a constant string literal, or any
>expression. Having $0 on the right-hand side is just one special case.
I see. It turns out the magic bullet is to do it like this:
{$0=substr($0,126,12);print "|"$1"|",length($1)}
In fact, I had worked this all out once long ago - but I still think the
rules are weird. But then again, that's why we like AWK - because it has
subtle wizardry/weirdness - unlike Perl's "hit you over the head with it"
approach.
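For anyone who wants to try the three approaches side by side, here is a sketch on a shortened record. Columns 7-18 stand in for the real columns 126-137, and the sample line, the helper sample, and the names via_field, via_record and via_sub are all invented for this illustration. The third variant, trimming with sub(), is not from the thread; it is one alternative that does not rely on field splitting at all.

```shell
# Build a record whose columns 7-18 hold "abc" blank-padded to 12 chars.
sample() { printf 'HEADER%-12sTAIL\n' 'abc'; }

# Assigning the substring to $1: no splitting happens, the padding stays.
via_field() {
    sample | awk '{ $1 = substr($0, 7, 12); print "|" $1 "|", length($1) }'
}

# Assigning the substring to $0: the record is re-split with the default
# FS, which discards leading and trailing blanks.
via_record() {
    sample | awk '{ $0 = substr($0, 7, 12); print "|" $1 "|", length($1) }'
}

# Trimming trailing blanks explicitly, leaving $0 and the fields untouched.
via_sub() {
    sample | awk '{ s = substr($0, 7, 12); sub(/ +$/, "", s); print "|" s "|", length(s) }'
}

via_field    # prints: |abc         | 12
via_record   # prints: |abc| 3
via_sub      # prints: |abc| 3
```

One caveat with the $0 approach: if the 12-column field can itself contain blanks, $1 holds only its first word after the re-split, whereas the sub() variant keeps the whole trimmed value.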
In article <chrc28$nrf$1@yin.interaccess.com>,
Kenny McCormack <gazelle@interaccess.com> wrote:
% In article <2qchjdFuduisU1@uni-berlin.de>,
% Patrick TJ McPhee <ptjm@interlog.com> wrote:
% >In article <chqpj6$k7f$1@yin.interaccess.com>,
% >Kenny McCormack <gazelle@interaccess.com> wrote:
% >
% >[...]
% >
% >% Now, the following works (with either tawk or gawk):
% >%
% >% {$1=substr($0,126,12);$0=$0;print "|"$1"|",length($1)}
% >%
% >% But, and here is the thing, why do I need that extra $0=$0?
[...]
% >In other words, when you assign to $1, you change the value of $0, but not
% >of $2, $3, and so on. When you assign to $0, you change the value of NF
% >and $1, $2, .., $NF.
%
% I'm sorry. So, what's your point?
I thought you were wondering why you need to perform that `extra' $0 = $0.
The reason is contained in my reply: you want awk to reparse $0, and that
doesn't happen unless you assign to $0. Assigning to $1 does not cause
$0 to be reparsed.
--
Patrick TJ McPhee
East York Canada
ptjm@interlog.com