Author Newbie formatting question
eldorado

2004-06-15, 3:56 pm

I have been going through my Sed and awk book and cannot seem to put
together the answer to this question. Please excuse the newbieness.

I have a file that looks like
1=a
2=b
3=c
4=d

1=e
2=f
3=g

1=h
2=i
3=j
4=k

etc..

I want it to look like
1=a|2=b|3=c|4=d|
1=e|2=f|3=g|
1=h|2=i|3=j|4=k|

Ok, so I figured that this would work:
/^$/!{
H
D
}
/^$/!{
BEGIN { FS = "\n"; RS = "" }
{print $4,$5,$6,$7,$8,$9,$10,$11,$12,$13,$14,$15,$16,$17,$18}
#will not hold more than 18
}

(I can use sed to create the |, s/ /|/g) - My reading is that I would
create a holding space ending with the blank line and then do the script
below (which does work)...I must not be understanding the way to do the
hold space (error is awk: syntax error near line 1 awk: bailing out near
line 1)

Would someone please point me in the correct direction?

Thanks!

--
Randomly generated signature --
What the hell, go and put all your eggs in one basket.

Ed Morton

2004-06-15, 3:56 pm



eldorado wrote:
> I have been going through my Sed and awk book and cannot seem to put
> together the answer to this question. Please excuse the newbieness.
>
> I have a file that looks like
> 1=a
> 2=b
> 3=c
> 4=d
>
> 1=e
> 2=f
> 3=g
>
> 1=h
> 2=i
> 3=j
> 4=k
>
> etc..
>
> I want it to look like
> 1=a|2=b|3=c|4=d|
> 1=e|2=f|3=g|
> 1=h|2=i|3=j|4=k|
>
> Ok, so I figured that this would work:
> /^$/!{
> H
> D
> }
> /^$/!{


I have no idea what the above 5 lines are supposed to do, but whatever
it is, you don't need them.

> BEGIN { FS = "\n"; RS = "" }
> {print $4,$5,$6,$7,$8,$9,$10,$11,$12,$13,$14,$15,$16,$17,$18}
> #will not hold more than 18
> }
> (I can use sed to create the |, s/ /|/g) - My reading is that I would
> create a holding space ending with the blank line and then do the script
> below (which does work)...I must not be understanding the way to do the
> hold space (error is awk: syntax error near line 1 awk: bailing out near
> line 1)


If those leading 5 lines are part of your awk script then, yes, I'd
expect a syntax error.

> Would someone please point me in the correct direction?


Change all of the above to just:

awk 'BEGIN{RS="";FS="\n";OFS="|"}{$#=$#"|";print}'

and it'll be about what you want.

Ed.

> Thanks!
>


eldorado

2004-06-15, 3:56 pm

On Tue, 15 Jun 2004, Ed Morton wrote:

>
> Change all of the above to just:
>
> awk 'BEGIN{RS="";FS="\n";OFS="|"}{$#=$#"|";print}'
>
> and it'll be about what you want.
>

Thanks for your help, but I am still erroring on this script.

$ awk -f test2.awk file
awk: syntax error near line 2
awk: illegal statement near line 2

$ more test2.awk
BEGIN{RS="";FS="\n";OFS="|"}{$#=$#"|";print}


What is $#=$#"|" supposed to do? I was also trying to get only certain
fields (print $4,$5,$6,$7,$8,$9,$10,$11,$12,$13,$14,$15,$16,$17,$18)
between blank spaces (which is why I was trying to use a holding space).
I should be able to change your print statement, right?


--
Randomly generated signature --
I have never killed a man, but I have read many obituaries with great pleasure. (Clarence Darrow)

Ed Morton

2004-06-15, 3:56 pm



eldorado wrote:

> On Tue, 15 Jun 2004, Ed Morton wrote:
>
>
>
> Thanks for your help, but I am still erroring on this script.
>
> $ awk -f test2.awk file
> awk: syntax error near line 2
> awk: illegal statement near line 2
>
> $ more test2.awk
> BEGIN{RS="";FS="\n";OFS="|"}{$#=$#"|";print}
>
>
> What is $#=$#"|" supposed to do?


Sorry, I had shell script on my brain. I'm just adding a "|" to the
final field and reformatting $0 to use the new OFS. It should be:

awk 'BEGIN{RS="";FS="\n";OFS="|"}{$NF=$NF"|";print}'

> I was also trying to get only certain
> fields (print $4,$5,$6,$7,$8,$9,$10,$11,$12,$13,$14,$15,$16,$17,$18)
> between blank spaces (which is why I was trying to use a holding space).


I don't know what you mean by that. Each record is separated by a blank
line (RS=""). Each field is separated by a newline (FS="\n"). Your
expected output showed every field now separated by a "|" instead of a
newline (OFS="|"). You're not skipping any fields, and you don't have
any blank spaces.
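
As a minimal sketch of how those three settings work together (the sample
data is the input from the original post; the shell variable names "input"
and "result" are only there for illustration):

```shell
# Sample input: records separated by a blank line, one field per line.
input='1=a
2=b
3=c
4=d

1=e
2=f
3=g'

# RS = ""   : paragraph mode, a blank line ends a record
# FS = "\n" : each line of the record is one field
# OFS = "|" : separator used when awk rebuilds $0
result=$(printf '%s\n' "$input" |
  awk 'BEGIN { RS = ""; FS = "\n"; OFS = "|" }
       { $NF = $NF "|"; print }')

printf '%s\n' "$result"
```

The assignment to $NF is what makes awk rebuild $0 with OFS between the
fields. Without an assignment to some field, print would write the record
out with its original newlines.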

A "holding space" sounds like a sed term to me.

> I should be able to change your print statement, right?


You shouldn't need to.

Ed.
>
> --
> Randomly generated signature --
> I have never killed a man, but I have read many obituaries with great pleasure. (Clarence Darrow)
>


eldorado

2004-06-15, 3:56 pm

On Tue, 15 Jun 2004, Ed Morton wrote:

>
> awk 'BEGIN{RS="";FS="\n";OFS="|"}{$NF=$NF"|";print}'
>
> I was also trying to get only certain
>
> I don't know what you mean by that. Each record is separated by a blank
> line (RS=""). Each field is separated by a newline (FS="\n"). Your
> expected output showed every field now separated by a "|" instead of a
> newline (OFS="|"). You're not skipping any fields, and you don't have
> any blank spaces.
>
> A "holding space" sounds like a sed term to me.
>
>
> You shouldn't need to.
>
> Ed.


Ed, This works very nice, thank you.

The only problem (and I can solve
this with sed if need be) is I don't want the first 3 entries in each
section.

I tried to be concise and so was not clear.

1=a
2=b
3=c
4=d
5=e

1=f
2=g
3=h
4=i
5=j
6=k
7=l

should be
4=d|5=e|
4=i|5=j|6=k|7=l|


>
>


--
Randomly generated signature --
We are coming after you. God may have mercy on you, but we won't -- John McCain

Ed Morton

2004-06-15, 3:56 pm



eldorado wrote:
<snip>
> The only problem (and I can solve
> this with sed if need be) is I don't want the first 3 entries in each
> section.

<snip>

You can use something like:

awk 'BEGIN{RS="";FS="\n";OFS="|"}{$1=$2=$3="";$NF=$NF"|";
sub(OFS OFS OFS,"");print}'

i.e. just set the first 3 fields to blanks, then use sub to get rid of
the first 3 fields separators.

Ed.

eldorado

2004-06-15, 3:56 pm

On Tue, 15 Jun 2004, Ed Morton wrote:

>
>
> eldorado wrote:
> <snip>
> <snip>
>
> You can use something like:
>
> awk 'BEGIN{RS="";FS="\n";OFS="|"}{$1=$2=$3="";$NF=$NF"|";
> sub(OFS OFS OFS,"");print}'
>
> i.e. just set the first 3 fields to blanks, then use sub to get rid of
> the first 3 fields separators.
>
> Ed.

This looks exactly like what I need, but I am getting the following error:

$ awk -f test3.awk file
awk: syntax error near line 2
awk: illegal statement near line 2

$ more test3.awk
BEGIN{RS="";FS="\n";OFS="|"}{$1=$2=$3="";$NF=$NF"|";
sub(OFS OFS OFS,"");print}

This seems to work (sort of):
BEGIN{RS="";FS="\n";OFS="|"}{$1=$2=$3="";$NF=$NF"|";print}
except my lines start with ||| (which I assume the sub will get rid of).
I did try sub(OFS OFS OFS = "") which also errors.

If it takes more time to figure this out than it is worth I can do
sed 's/^|||//' to make the above work, just would be nice to have it all
in one place.

Again thank you for all the help (and patience) with this one!

--
Randomly generated signature --
Never hit a man with glasses; hit him with your fist.

Ed Morton

2004-06-15, 3:56 pm



eldorado wrote:
> On Tue, 15 Jun 2004, Ed Morton wrote:
>
>
>
> This looks exactly like what I need, I am getting the following error:
>
> $ awk -f test3.awk file
> awk: syntax error near line 2
> awk: illegal statement near line 2

<snip>

I suspect you're running "old awk" and I don't have a solution for that.
You're going to kick yourself if you don't bite the bullet now and get
gawk (GNU awk) or nawk or some other non-ancient version.
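
One rough way to check which kind of awk you have: sub() came in with the
newer awks, so a one-liner that calls it should tell you. This is only a
sketch, and the variable name "result" is just for illustration:

```shell
# If this awk has sub(), the command prints "a-b".
# An old awk is expected to stop with a syntax error instead.
result=$(awk 'BEGIN { s = "a|b"; sub(/[|]/, "-", s); print s }')
printf '%s\n' "$result"
```

If that fails, try the same line with nawk or gawk in place of awk, in case
one of them is already installed.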

Ed.

Patrick TJ McPhee

2004-06-15, 8:55 pm

In article <Pine.LNX.4.44.0406151215120.14643-100000@eris.io.com>,
eldorado <eldorado@eris.io.com> wrote:
% On Tue, 15 Jun 2004, Ed Morton wrote:

% > awk 'BEGIN{RS="";FS="\n";OFS="|"}{$1=$2=$3="";$NF=$NF"|";
% > sub(OFS OFS OFS,"");print}'

% This looks exactly like what I need, I am getting the following error:
%
% $ awk -f test3.awk file
% awk: syntax error near line 2
% awk: illegal statement near line 2
%
% $ more test3.awk
% BEGIN{RS="";FS="\n";OFS="|"}{$1=$2=$3="";$NF=$NF"|";
% sub(OFS OFS OFS,"");print}

I'm not sure why you would have a syntax error, but it's probably better
to put each action on its own line

BEGIN{RS="";FS="\n";OFS="|"}
{$1=$2=$3="";$NF=$NF"|";sub(OFS OFS OFS,"");print}

I don't think OFS OFS OFS is the correct argument for sub. In this case,
it translates to

"|||"

which, when treated as a regular expression matches the empty string,
since | indicates alternation in extended regular expressions. I would
use

BEGIN{RS="";FS="\n";OFS="|"}
{$1=$2=$3="";$NF=$NF"|";sub(/^\|+/,"");print}

which matches any number of | at the start of the line. Your choice
of RS guarantees there won't be any empty fields other than the ones
you make empty yourself.
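
A small sketch of the anchored version, run on the second sample from
earlier in the thread (the shell variable names "input" and "result" are
only for illustration, and this assumes an awk that has sub()):

```shell
# Sample input: two records, of which the first three fields are unwanted.
input='1=a
2=b
3=c
4=d
5=e

1=f
2=g
3=h
4=i
5=j
6=k
7=l'

# Blanking $1..$3 leaves "|||" at the start of the rebuilt record;
# the anchored regex /^\|+/ strips those leading literal pipes.
result=$(printf '%s\n' "$input" |
  awk 'BEGIN { RS = ""; FS = "\n"; OFS = "|" }
       { $1 = $2 = $3 = ""; $NF = $NF "|"; sub(/^\|+/, ""); print }')

printf '%s\n' "$result"
```

The backslash matters: without it, | is read as the alternation operator
rather than as a literal character.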

To be strictly honest, I'd probably use

BEGIN{RS="";FS="\n"}
{
  d0 = ""
  for (i = 4; i <= NF; i++)
    d0 = d0 "|" $i
  print d0 "|"
}

--

Patrick TJ McPhee
East York Canada
ptjm@interlog.com
Ed Morton

2004-06-16, 3:55 am



Patrick TJ McPhee wrote:
> In article <Pine.LNX.4.44.0406151215120.14643-100000@eris.io.com>,
> eldorado <eldorado@eris.io.com> wrote:
> % On Tue, 15 Jun 2004, Ed Morton wrote:
>
> % > awk 'BEGIN{RS="";FS="\n";OFS="|"}{$1=$2=$3="";$NF=$NF"|";
> % > sub(OFS OFS OFS,"");print}'
>
> % This looks exactly like what I need, I am getting the following error:
> %
> % $ awk -f test3.awk file
> % awk: syntax error near line 2
> % awk: illegal statement near line 2
> %
> % $ more test3.awk
> % BEGIN{RS="";FS="\n";OFS="|"}{$1=$2=$3="";$NF=$NF"|";
> % sub(OFS OFS OFS,"");print}
>
> I'm not sure why you would have a syntax error, but it's probably better
> to put each action on its own line
>
> BEGIN{RS="";FS="\n";OFS="|"}
> {$1=$2=$3="";$NF=$NF"|";sub(OFS OFS OFS,"");print}
>
> I don't think OFS OFS OFS is the correct argument for sub. In this case,
> it translates to
>
> "|||"
>
> which, when treated as a regular expression matches the empty string,
> since | indicates alternation in extended regular expressions.


Yes, you're right. A different OFS would work, or this:

awk 'BEGIN{RS="";FS="\n";OFS="|";eofs="\\"OFS}{$1=$2=$3="";$NF=$NF"|";
sub(eofs eofs eofs,"");print}'

where "eofs" means "escaped OFS".

> I would
> use
>
> BEGIN{RS="";FS="\n";OFS="|"}
> {$1=$2=$3="";$NF=$NF"|";sub(/^\|+/,"");print}


but then you're hard-coding the "|" in the sub RE, so if you changed the
OFS, you'd have to remember to change that RE too - that's what I'm
trying to avoid, but then I'm hard-coding the number of fields being
skipped.
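
One way around both kinds of hard-coding, as a sketch only: drop the field
assignments and the sub() altogether, build the output in a loop, and pass
in the number of leading fields to drop. The variable name "skip" is made
up for this example, as are the shell variables "input" and "result":

```shell
input='1=a
2=b
3=c
4=d
5=e

1=f
2=g
3=h
4=i
5=j
6=k
7=l'

# skip : number of leading fields to leave out of each record
# OFS  : the only place the output separator is spelled out
result=$(printf '%s\n' "$input" |
  awk -v skip=3 'BEGIN { RS = ""; FS = "\n"; OFS = "|" }
       {
         out = ""                          # reset for every record
         for (i = skip + 1; i <= NF; i++)
           out = out $i OFS                # field followed by the separator
         print out
       }')

printf '%s\n' "$result"
```

Since no regular expression is involved, it does not matter whether the
separator is a regex metacharacter.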

> which matches any number of | at the start of the line. Your choice
> of RS guarantees there won't be any empty fields other than the ones
> you make empty yourself.
>
> To be strictly honest, I'd probably use
>
> BEGIN{RS="";FS="\n"}
> {
> d0 = ""
> for (i = 4; i <= NF; i++)
> d0 = d0 "|" $i
> print d0 "|"
> }
>


ITYM:

BEGIN{RS="";FS="\n"}
{
  d0 = ""
  for (i = 4; i <= NF; i++)
    d0 = d0 $i "|"
  print d0
}

and so would I!

Regards,

Ed.
