Code Comments

Get range of fields without losing FS chars
Hey everyone...

Here's a question:

I want to take an input line in awk and output the line from the start of
field 9 until the end of line EXACTLY as input.  I can't do a simple loop
from field 9 until NF as that strips out the FS chars.  Field 9 does not start
at the same column each time, so I can't use substr and such...  I suppose I
could do an index, searching for $9 in $0, but that's error prone if the text
appears more than in just $9...

What I really want is something like a shift operator so I can shift off the
first 9 arguments, or something to give me the starting character position of
field 9.

Ideas?  I know I can use a shell script to do this, but I was hoping for a more
elegant solution in awk.  My awk program is complete, I just need this one
thing.



--

It's 3:30, why aren't you at work?
I.. I didn't feel like it.

Paul Coene
04-06-05 05:05 PM


Re: Get range of fields without losing FS chars
In article <slrnd52i3o.f9b.drool@noudess.droolsretreat.com>,
Paul Coene  <drool@noudess.droolsretreat.com> wrote:
>Hey everyone...
>
>Here's a question:
>
>I want to take an input line in awk and output the line from the start of
>field 9 until the end of line EXACTLY as input.  I can't do a simple loop
>from field 9 until NF as that strips out the FS chars.  Field 9 does not start
>at the same column each time, so I can't use substr and such...  I suppose I
>could do an index, searching for $9 in $0, but that's error prone if the text
>appears more than in just $9...

This is a fairly common problem and it is one that is not directly
addressed by the language.  I have often wished for this sort of
capability.

I'm pretty sure this will work, assuming your FS is the default
("whitespace"):

sub(/^[ \t]*/,"")
for (i=1; i<=8; i++)
    sub(/^[^ \t]*[ \t]*/,"")

>What I really want is something like a shift operator so I can shift off the
>first 9 arguments, or something to give me the starting character position of
>field 9.

Alas, no shift operator in AWK.  I've often wondered what the original
rationale was for the "when the input line is changed, you lose your
original spacing" "feature".  I've never seen any benefit to it, and
working around it has achieved FAQ status.
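
For anyone who has not run into the behaviour being complained about here, a minimal illustration, assuming the default FS and OFS. Assigning to any field (even to itself) makes awk rebuild $0 by joining the fields with OFS, so the original spacing is lost. The name rebuild_record is only a label for this example.

```shell
# Assigning to a field forces awk to rebuild $0, joining the
# fields with OFS (a single space by default).
rebuild_record() {
    awk '{ $1 = $1; print }'
}

printf 'a  b   c\n' | rebuild_record    # prints: a b c
```

A loop that prints $9 through $NF one field at a time loses the spacing in the same way, because the separators are never stored in the fields.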

>Ideas?  I know I can use a shell script to do this.

Really?  I don't see it.  But then again, I've never been a fan of these
complicated shell-only solutions, using obscure and poorly documented shell
features.


Kenny McCormack
04-06-05 05:05 PM


Re: Get range of fields without losing FS chars
> I'm pretty sure this will work, assuming your FS is the default
> ("whitespace"):
>
> sub(/^[ \t]*/,"")
> for (i=1; i<=8; i++)
>     sub(/^[^ \t]*[ \t]*/,"")

I'm not sure what you're doing above.. making fields 1-8 one field?  That still
doesn't output fields 9 until NF unaltered..  Maybe we misunderstand one
another.  This is what I have done, ugly but effective:

############################################################################
# Ok, now we need to toss off all chars up to the 9th field and keep the rest
# of the input line verbatim
#
if (NF > 8) # should always be, but don't want an infinite loop below
{
    curchar=1;
    # Skip over 8 fields and any preceding FS chars
    # (each test matches one character of $0 against FS as a regex)
    for (i=1;i<=8;++i)
    {
        while (substr($0,curchar,1) ~ FS)
            curchar++;
        while (substr($0,curchar,1) !~ FS)
            curchar++;
    }

    # Skip the FS chars preceding field 9
    while (substr($0,curchar,1) ~ FS)
        curchar++;
    # Print the rest of $0 from $9 on...
    print substr($0, curchar)
}
############################################################################
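
The same character-walking idea can be packaged as a function that returns the starting position of a field, which is the other thing asked for in the original question. This is only a sketch under the assumption that FS is the default (runs of blanks and tabs); the names from_field and field_start are made up for the example.

```shell
# from_field N: print each input line from the start of field N onward.
from_field() {
    awk -v n="$1" '
    # Return the 1-based position where field n starts in s,
    # or 0 if s has fewer than n fields (default FS assumed).
    function field_start(s, n,    pos, i) {
        pos = 1
        for (i = 1; i <= n; i++) {
            if (match(substr(s, pos), /^[ \t]+/))   # skip separators
                pos += RLENGTH
            if (pos > length(s))
                return 0
            if (i == n)
                return pos
            match(substr(s, pos), /^[^ \t]+/)       # skip field i
            pos += RLENGTH
        }
        return 0
    }
    {
        p = field_start($0, n)
        if (p)
            print substr($0, p)
    }'
}

printf 'a b c  d e f  g h  i  j k   l\n' | from_field 9    # prints: i  j k   l
```

Lines with fewer than N fields produce no output here; whether to skip them or print an empty line is a design choice.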

--

It's 3:30, why aren't you at work?
I.. I didn't feel like it.

Paul Coene
04-06-05 05:05 PM


Re: Get range of fields without losing FS chars
In article <slrnd52k9b.fg7.drool@noudess.droolsretreat.com>,
Paul Coene  <drool@noudess.droolsretreat.com> wrote: 
>
>I'm not sure what you're doing above.. making fields 1-8 one field?  That still
>doesn't output fields 9 until NF unaltered..  Maybe we misunderstand one
>another.

Well, if you misunderstand, then you need to go back and read more
carefully.

Notes:
1) The above doesn't actually print the line - left as an exercise
for the reader.
2) I usually post "minimalist - leave the i dotting and t crossing
to the reader" type solutions.
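
For readers who would rather not do the exercise, here is one way the i's might be dotted: a minimal sketch of the sub()-in-a-loop approach, working on a copy of $0 and printing the result. It assumes the default FS; strip_fields and line are names chosen for this example only.

```shell
# strip_fields N: remove the first N whitespace-separated fields from
# each line and print what is left with its original spacing.
strip_fields() {
    awk -v n="$1" '
    {
        line = $0
        sub(/^[ \t]*/, "", line)              # drop leading blanks
        for (i = 1; i <= n; i++)
            sub(/^[^ \t]+[ \t]*/, "", line)   # drop one field and the blanks after it
        print line
    }'
}

printf '  a b c  d e f  g h  i  j k   l\n' | strip_fields 8    # prints: i  j k   l
```

A line with N or fewer fields comes out as an empty line.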

By the way, what is your FS?


Kenny McCormack
04-06-05 05:05 PM


Re: Get range of fields without losing FS chars

Paul Coene wrote:
> Hey everyone...
>
> Here's a question:
>
> I want to take an input line in awk and output the line from the start of
> field 9 until the end of line EXACTLY as input.  I can't do a simple loop
> from field 9 until NF as that strips out the FS chars.  Field 9 does not start
> at the same column each time, so I can't use substr and such...  I suppose I
> could do an index, searching for $9 in $0, but that's error prone if the text
> appears more than in just $9...
>
> What I really want is something like a shift operator so I can shift off the
> first 9 arguments, or something to give me the starting character position of
> field 9.
>
> Ideas?  I know I can use a shell script to do this, but I was hoping for a more
> elegant solution in awk.  My awk program is complete, I just need this one
> thing.

This will do it in gawk:

gawk --re-interval 'sub(/^[[:space:]]*([^[:space:]]*[[:space:]]*){8}/,"")'

The "8" is obviously the number of fields you want to delete from the
start of each record.
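
For an awk that does not support interval expressions, one possible workaround is to build the same kind of regex as a string by repeating the group. The sketch below makes two assumptions that differ from the one-liner above: it uses [ \t] instead of [[:space:]], and it uses + rather than *, so a line is printed only if it really has that many fields followed by a separator. The name drop_fields is made up for the example.

```shell
# drop_fields N: delete the first N fields (and their separators)
# without using an interval expression.
drop_fields() {
    awk -v n="$1" '
    BEGIN {
        # Same idea as ^[ \t]*([^ \t]+[ \t]+){n}, spelled out n times
        re = "^[ \t]*"
        for (i = 1; i <= n; i++)
            re = re "[^ \t]+[ \t]+"
    }
    {
        line = $0
        if (sub(re, "", line))    # print only when all n fields were found
            print line
    }'
}

printf 'a b c  d e f  g h  i  j k   l\n' | drop_fields 8    # prints: i  j k   l
```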

Regards,

Ed.

Ed Morton
04-06-05 05:05 PM


Re: Get range of fields without losing FS chars
Paul Coene wrote:
> I want to take an input line in awk and output the line from
> the start of field 9 until the end of line EXACTLY as input.
> I can't do a simple loop from field 9 until NF as that strips
> out the FS chars.  Field 9 does not start at the same column
> each time, so I can't use substr and such...

I think this function will do what you want. Demonstrated with Cygwin,
bash shell, GNU awk:

epement@SW218-ET03 ~
$ echo 'a b c  d e f  g h  i  j k   l'
a b c  d e f  g h  i  j k   l

epement@SW218-ET03 ~
$ cat tail.awk
function tail(line, arg,    i) {
    # returns the tail of a line, keeping field separators
    # of varying lengths. arg is the last field to omit
    # (i is declared as an extra parameter to keep it local)
    for (i=1; i<=arg; i++)
        sub($i, "", line)   # note: $i is used as a regex here
    sub(/^[ \t]*/, "", line)
    return line
}
{ print tail($0,8) }

epement@SW218-ET03 ~
$ echo 'a b c  d e f  g h  i  j k   l' | gawk -f tail.awk
i  j k   l

Hope this solution is helpful.

--
Eric Pement


Eric Pement
04-06-05 05:05 PM


Re: Get range of fields without losing FS chars

Ed Morton wrote:
>
>
> Paul Coene wrote:
> 
<snip>
> This will do it in gawk:
>
> gawk --re-interval 'sub(/^[[:space:]]*([^[:space:]]*[[:space:]]*){8}/,"")'

Or in a POSIX awk (e.g. /usr/xpg4/bin/awk on Solaris) where interval
expressions are enabled by default:

awk 'sub(/^[[:space:]]*([^[:space:]]*[[:space:]]*){8}/,"")'

Note that the [:space:] character class includes newlines (see
http://www.gnu.org/software/gawk/ma...har_002dclasses).
For the default FS, you could use [:blank:] if you prefer.

> The "8" is obviously the number of fields you want to delete from the
> start of each record.
>
> Regards,
>
>     Ed.

Ed Morton
04-06-05 05:05 PM



