Home > Archive > AWK > April 2005 > String handling
| I'm trying to count the number of leading spaces in lines from a file using:
tfc=0
w=$0
for(i=1;i<=NF;i++) if (w[i] == ' ') tfc++ else break}
but this generates a syntax error
What am I doing wrong ??
Any help ...
Thank you
Colin
| |
| Bob Harris 2005-04-08, 3:55 am |
| In article <33c7bd32.0504071709.5711f3e3@posting.google.com>,
colinhay66@hotmail.com (Colin) wrote:
> I'm trying to count the number of leading spaces in lines from a file using:
>
> tfc=0
> w=$0
> for(i=1;i<=NF;i++) if (w[i] == ' ') tfc++ else break}
>
> but this generates a syntax error
>
> What am I doing wrong ??
>
> Any help ...
>
> Thank you
>
> Colin
w is not an array, so you cannot index it. Also, I'm not sure from the
code what you are trying to do. I'll try to guess from your text
description.
awk '
{ sub(/[^ ].*/,"",$0); tfc += length($0) }
END { print tfc }
' input.file
I took $0, replaced everything from the first non-space character to the
end of the line with nothing, then used length to count the leading
spaces that remained.
Bob Harris
| |
| Ed Morton 2005-04-08, 3:55 am |
|
Colin wrote:
> I'm trying to count the number of leading spaces in lines from a file using:
>
> tfc=0
> w=$0
> for(i=1;i<=NF;i++) if (w[i] == ' ') tfc++ else break}
>
> but this generates a syntax error
>
> What am I doing wrong ??
>
> Any help ...
>
> Thank you
>
> Colin
The answers you got to this question at comp.unix.shell were correct.
Did you have a follow-up question?
Ed
| |
| Patrick TJ McPhee 2005-04-08, 3:55 am |
| In article <33c7bd32.0504071709.5711f3e3@posting.google.com>,
Colin <colinhay66@hotmail.com> wrote:
% I'm trying to count the number of leading spaces in lines from a file using:
%
% tfc=0
% w=$0
% for(i=1;i<=NF;i++) if (w[i] == ' ') tfc++ else break}
You have unbalanced braces and missing semi-colons. w is a scalar, but
you treat it as an array. You use ' as a string delimiter, which it
isn't.
You could solve this using your approach like this:
BEGIN { FS = ""; tfc = 0 }
END { print tfc }
{ for (i = 1; i <= NF; i++) if ($i == " ") tfc++; else break }
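Splitting a record into characters with an empty FS works in gawk and mawk, but POSIX leaves that behavior unspecified. As a rough sketch for other awks, the same character-by-character loop can be written with substr(). The function name count_leading_substr is made up for this illustration and is not from the thread.

```shell
# Hypothetical helper: sum the leading spaces of all lines read on stdin,
# walking each line one character at a time.
count_leading_substr() {
    awk '
    {
        # substr($0, i, 1) is the i-th character of the current line
        for (i = 1; i <= length($0); i++) {
            if (substr($0, i, 1) == " ") { tfc++ } else { break }
        }
    }
    END { print tfc + 0 }   # + 0 prints 0 rather than an empty line
    '
}

printf '   three\n one\nnone\n' | count_leading_substr   # prints 4
```

This is still the slow approach, since it calls substr() once per character; it only removes the dependence on how FS = "" is handled.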
I would do it like this:
BEGIN { tfc = 0 }
END { print tfc }
match($0, /^ +/) { tfc += RLENGTH }
which is about 3 times faster using gawk or mawk on this machine, and about
10 times faster using nawk.
Others don't like match() and might do it like this:
BEGIN { tfc = 0 }
END { print tfc }
{ sub(/[^ ].*/, ""); tfc += length($0) }
which seems to be slightly slower, but still quite a bit faster than going
at it character-by-character.
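One edge case worth checking with any of these variants is a line that consists only of spaces. When sub() is used as the pattern of a rule it returns 0 on such a line, so the action is skipped and those spaces are not counted; calling sub() inside the action avoids that. The sketch below compares the two styles on a small sample; both function names are invented for the illustration.

```shell
# Hypothetical helpers for comparison; both read lines on stdin.
count_with_match() {
    awk 'match($0, /^ +/) { tfc += RLENGTH } END { print tfc + 0 }'
}

count_with_sub() {
    # sub() runs inside the action, so the length is added
    # even when nothing was replaced (line of spaces only)
    awk '{ sub(/[^ ].*/, ""); tfc += length($0) } END { print tfc + 0 }'
}

# The second sample line is three spaces and nothing else.
printf '  ab\n   \n' | count_with_match   # prints 5
printf '  ab\n   \n' | count_with_sub     # prints 5
```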
--
Patrick TJ McPhee
North York Canada
ptjm@interlog.com
| |
| Loki Harfagr 2005-04-08, 3:56 pm |
| On Thu, 07 Apr 2005 18:09:11 -0700, Colin wrote:
> I'm trying to count the number of leading spaces in lines from a file using:
>
> tfc=0
> w=$0
> for(i=1;i<=NF;i++) if (w[i] == ' ') tfc++ else break}
>
> but this generates a syntax error
>
> What am I doing wrong ??
Now you know that :-)
> Any help ...
Then here is another way, just to be exhaustive
and for the fun of playing with the toolbox :-)
$ sed 's/[^ ].*//' testfile|tr -d '\n' |wc -c
Well, it would choke on files whose lines are not LF-terminated, of course ...
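To make the role of each stage in that pipeline visible, here is a small sketch that wraps it in a shell function reading stdin instead of testfile. The name count_with_sed and the final tr are additions for the illustration: some wc implementations pad their output with blanks, and the last stage strips that padding.

```shell
# Hypothetical wrapper around the sed | tr | wc pipeline.
count_with_sed() {
    # sed: keep only the leading spaces of each line
    # tr -d '\n': join all lines so only the spaces remain
    # wc -c: count the remaining bytes
    # tr -d ' ': strip any padding around the number printed by wc
    sed 's/[^ ].*//' | tr -d '\n' | wc -c | tr -d ' '
}

printf '   three\n one\nnone\n' | count_with_sed   # prints 4
```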
| |
| Michael Tosch 2005-04-09, 3:55 pm |
| In article <425690a6$0$32081$626a14ce@news.free.fr>, Loki Harfagr <loki@DarkDesign.free.fr> writes:
> On Thu, 07 Apr 2005 18:09:11 -0700, Colin wrote:
>
Is this an awk script?
>
> Now you know that :-)
>
>
> Then here is another way, just to be exhaustive
> and for the fun of playing with the toolbox :-)
>
> $ sed 's/[^ ].*//' testfile|tr -d '\n' |wc -c
>
With awk this becomes:
awk '{sum+=match($0"x","[^ ]")-1}END{print sum}' testfile
and can be stripped down to print the number per line:
awk '{print match($0"x","[^ ]")-1}' testfile
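The appended "x" is what makes this work on every line: it guarantees a non-space character, so match() never returns 0, even for an empty line or a line of spaces only. A minimal sketch of the per-line form on a few sample lines follows; the function name leading_per_line is made up here.

```shell
# Hypothetical helper: print the leading-space count of every line on stdin.
leading_per_line() {
    # match() returns the position of the first non-space character;
    # subtracting 1 gives the number of spaces in front of it
    awk '{ print match($0 "x", /[^ ]/) - 1 }'
}

# Sample lines: two leading spaces, spaces only, empty, no leading spaces.
printf '  ab\n   \n\nab\n' | leading_per_line
# prints 2, 3, 0 and 0 on separate lines
```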
--
Michael Tosch @ hp : com
| |
| Loki Harfagr 2005-04-09, 3:55 pm |
| On Sat, 09 Apr 2005 15:17:36 +0000, Michael Tosch wrote:
> Is this an awk script?
Oops! So sorry!
I really don't know why I wrongly cross-posted here.
My only feeble beginning of an explanation is that my
conjunctivitis is getting worse and worse ...
Though no real harm was done, since it gave you the opportunity
to give the good awk translation of the stuff :D)
>
> With awk this becomes:
> awk '{sum+=match($0"x","[^ ]")-1}END{print sum}' testfile
> and can be stripped down to print the number per line:
> awk '{print match($0"x","[^ ]")-1}' testfile
Thanks for this, and sorry again.