
convert row-data to column data
Bart Vandewoestyne

2005-04-29, 3:55 pm

I'm quite new to awk scripting, and I haven't been able to solve this
problem:

I have this data file:

http://www.cs.kuleuven.ac.be/~bartv..._1_0p041_12.txt

I would like to use the data from that datafile in gnuplot, which
expects the data as columns. Right now the data in my file is stored in
the even rows.

How do I transform the data from the even rows into columns? The best
I could come up with so far is

http://www.cs.kuleuven.ac.be/~bartv...ads/convert.awk

but this does not give me what I want. The row data is indeed changed
to column data, but the columns should be next to each other, not below
each other and separated by an empty line...

Any help appreciated.

Regards,
Bart

--
"Share what you know. Learn what you don't."
William Park

2005-04-29, 3:55 pm

Bart Vandewoestyne <MyFirstName.MyLastName@telenet.be> wrote:
> I'm quite new to awk scripting, and i haven't been able to solve this
> problem:
>
> I have this data file:
>
> http://www.cs.kuleuven.ac.be/~bartv..._1_0p041_12.txt
>
> I would like to use the data from that datafile in gnuplot, which
> expects the data as columns. Now the data in my file is stored in the
> even rows.
>
> How do i transform the data from the even rows in to columns? The best
> i could come up with up until now is
>
> http://www.cs.kuleuven.ac.be/~bartv...ads/convert.awk
>
> but this does not give me what i want. The row-data is indeed changed
> to column data, but the columns should be next to each other, not below
> each other and separated by an empty line...
>
> Any help appreciated.


Search <comp.lang.awk> and <comp.unix.shell> for the keyword 'transpose'
in the subject line.

--
William Park <opengeometry@yahoo.ca>, Toronto, Canada
Slackware Linux -- because it works.
Ed Morton

2005-04-29, 3:55 pm



Bart Vandewoestyne wrote:
> I'm quite new to awk scripting, and i haven't been able to solve this
> problem:
>
> I have this data file:
>
> http://www.cs.kuleuven.ac.be/~bartv..._1_0p041_12.txt
>
> I would like to use the data from that datafile in gnuplot, which
> expects the data as columns. Now the data in my file is stored in the
> even rows.
>
> How do i transform the data from the even rows in to columns? The best
> i could come up with up until now is
>
> http://www.cs.kuleuven.ac.be/~bartv...ads/convert.awk
>
> but this does not give me what i want. The row-data is indeed changed
> to column data, but the columns should be next to each other, not below
> each other and separated by an empty line...
>


Take a look at this:

--------------
Transposing rows to selected columns and sorting by key.

Given the following input file:
Number of executions = 437
Number of compilations = 1
Worst preparation time (ms) = 1
Best preparation time (ms) = 1
Rows deleted = 0

Number of executions = 1
Number of compilations = 1
Worst preparation time (ms) = 4
Best preparation time (ms) = 4
Rows deleted = 0

Number of executions = 29
Number of compilations = 1
Worst preparation time (ms) = 1
Best preparation time (ms) = 1
Rows deleted = 0

To transpose certain rows into columns and sort by one of the
columns, like the following which is sorted by "Number of executions":

Number of executions   Number of compilations   Rows deleted
1                      1                        0
29                     1                        0
437                    1                        0

This will do it all in gawk:

gawk -v RS="" -F"\n" 'BEGIN{ fields = "1 2 5"; key = "1"
    numflds = split(fields,flds," ")
}
{
    for (i=1; i<=NF;i++) {
        split($i,f,"=")
        # Get rid of all spaces from the end of the title text
        sub(/[[:blank:]]*$/,"",f[1])
        title[i]=f[1]
        # Get rid of all spaces from the value field
        value[i]=f[2]+0
        # Determine the width for this column based on the width
        # of the title text plus 3 for spacing. Left-justify (%-).
        fmt[i]="%-"(length(title[i])+3)"s"
    }
    # We will want to sort on the key column so we need to create a
    # string at the start of each line to sort on later. Take the value
    # of the key column and pad it with zeros up to 20 chars followed by
    # a space to separate it from the first real column. Conversion of
    # "7" to "0007" and "17" to "0017" is necessary because asort()
    # is alphabetical not numerical so all numeric fields must be the
    # same width to compare alphabetically.
    lines[NR] = sprintf("%020d ",value[key])

    # Now add the real columns, formatted as determined earlier.
    for (i=1; i<=numflds; i++) {
        lines[NR] = lines[NR] sprintf(fmt[flds[i]], value[flds[i]])
    }
}
END {
    # Print the title line
    for (i=1; i<=numflds; i++) {
        printf fmt[flds[i]], title[flds[i]]
    }
    print ""
    # Sort the lines alphabetically, i.e. by the value of the key column
    # added above to the front of each line.
    asort(lines)
    # Print each line
    for (i=1; i<=NR; i++) {
        # strip out the first numeric value, the key value added above
        sub("[[:digit:]]* ","",lines[i])
        print lines[i]
    }
}'

Setting fields and key at the beginning obviously dictates which fields
are printed and which key to sort on. The only thing it assumes about field
sizes is that the key field's values won't be more than 20 characters.
--------------

and come back if you have questions.

Ed.
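
The zero-padding idea from the comments in the script above can be tried in isolation. What follows is only a minimal sketch, assuming integer keys in the first field and using plain awk, sort and cut rather than gawk's asort(); the helper name pad_key is made up for illustration.

```shell
# pad_key: hypothetical helper that prefixes each input line with its first
# field zero-padded to 20 digits, so a plain string sort orders the lines
# numerically.
pad_key() {
    awk '{ printf "%020d %s\n", $1, $0 }'
}

# Without padding, a string sort gives 29, 437, 7.
printf '437\n7\n29\n' | LC_ALL=C sort

# With padding, the same string sort gives 7, 29, 437; cut drops the key again.
printf '437\n7\n29\n' | pad_key | LC_ALL=C sort | cut -d' ' -f2-
```

The key is removed with cut here instead of sub(), which is just a convenience of doing the sort outside awk.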
Bart Vandewoestyne

2005-04-29, 8:55 pm

In article <e82fe$42725f4f$d1b71443$7763@PRIMUS.CA>, William Park wrote:
>
> Search <comp.lang.awk> and <comp.unix.shell> for 'transpose' keyword in
> subject.


Thanks. I was always searching for 'convert row data to column data'
and search terms like that.

The 'transpose' hint was very useful and I was able to write a working
awk script:

http://www.cs.kuleuven.ac.be/~bartv...ads/convert.awk

This does exactly what I want. Of course, I'm always interested in
reading other shorter/cleaner/more_intelligent solutions :-)

Regards,
Bart

--
"Share what you know. Learn what you don't."
Ed Morton

2005-04-30, 3:56 am



Bart Vandewoestyne wrote:
> In article <e82fe$42725f4f$d1b71443$7763@PRIMUS.CA>, William Park wrote:
>
>
>
> Thanks. I was always searching for 'convert row data to column data'
> and search terms like that.
>
> The 'transpose' hint was very useful and I was able to write a working
> awk script:
>
> http://www.cs.kuleuven.ac.be/~bartv...ads/convert.awk
>
> This does exactly what I want. Of course, I'm always interested in
> reading other shorter/cleaner/more_intelligent solutions :-)
>
> Regards,
> Bart
>


I assume since you posted the link that you'd like some feedback so:

> #!/usr/bin/awk -f


The above is "old awk", generally considered broken and to be avoided.
On Solaris use either: gawk (you may need to install it yourself from
http://www.gnu.org/software/gawk), /usr/xpg4/bin/awk, or /usr/bin/nawk
with gawk being the first choice.

> #
> # Convert George's data files which are row-oriented towards column-oriented
> # data files.
>
> BEGIN { numcols=NF; numrows=0; }


The above line is not useful. NF is not set in the BEGIN section and
numrows will take the numeric value zero anyway.

> # Match a line with data
> /^[0-9].*/ {


It doesn't really matter, but with newer awks you could use the character
class [[:digit:]] here instead of explicitly listing the digits.

More importantly, though, the ".*" means "any sequence of characters,
including none", so it adds nothing: the above matches any line that starts
with a digit, whatever follows, which may not be your intent.

If you wanted to test for a line that starts with a digit and don't care
whether or not there are subsequent characters, you'd just write:

/^[0-9]/

If you wanted to test for a line that's all digits, you'd write:

/^[0-9]+$/

(with "*" in place of "+" the pattern would also match an empty line).

etc....
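
To see how these patterns differ in practice, here is a small sketch that counts matches over four sample lines. The helper names samples and count_matches are invented for this illustration, and the pattern is passed as a dynamic regex string, which is an assumption of the sketch rather than something from the script under review.

```shell
# Four sample lines: "12 34", "7", "abc" and an empty line.
samples() { printf '12 34\n7\nabc\n\n'; }

# count_matches: hypothetical helper, $1 is an awk regular expression.
count_matches() {
    samples | awk -v re="$1" '$0 ~ re { n++ } END { print n+0 }'
}

count_matches '^[0-9].*'   # 2: lines starting with a digit
count_matches '^[0-9]'     # 2: the same lines, ".*" changes nothing
count_matches '^[0-9]*$'   # 2: "7" and the empty line
count_matches '^[0-9]+$'   # 1: only "7"
```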

>
> # Extract the amount of data points
> numcols=NF;


You don't need to set this for every line. You could just set it once in
the END section for the final line (for newer awks), but you can just
use NF instead.
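
One caveat with relying on NF in the END section is that it reflects the last line read, which need not be a data line. A possible middle ground, sketched here under the assumption that the input may end with a blank or non-data line, is to remember the widest data row while reading. The helper name count_cols is made up for the example.

```shell
# count_cols: hypothetical helper. Prints the number of columns seen on data
# lines (lines starting with a digit), remembered while reading instead of
# being taken from NF in the END section.
count_cols() {
    awk '/^[0-9]/ { if (NF > numcols) numcols = NF }
         END      { print numcols + 0 }'
}

# The trailing blank line has no fields, yet numcols is still 3.
printf '1 2 3\n4 5 6\n\n' | count_cols
```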

> # There is now one extra row/column of data
> numrows=numrows+1


That could just be written as numrows++.

> # Store the data in an array so we can extract it later on
> for (i=1; i<=numcols; i++) {


Just use NF for numcols. If you REALLY wanted a numcols variable, you
could just increment it in place of "i" here with suitable arithmetic
adjustment.

> data[i, numrows]=$i;


You don't need a terminating semicolon.

> }
> }
>
> # Now show all the data that we stored in the array in a column-oriented way.
> END {
> for (col=1; col<=numcols; col++) {
> for (row=1; row<=numrows; row++) {
> printf("%s ", data[col,row]);



You don't need the terminating semicolon. Also, this will put an extra
space at the end of your line. You can avoid that by doing:

printf("%s%s",sep,data[col,row])
sep=" "

> }
> printf("\n");


No need for the semicolon, and normally people just use:

print ""

to add that final newline.

> }
> }


You don't actually need all the "{" and "}"s, but they don't do any harm
and do future-proof so they're not necessarily a bad idea.

So, the above could be written as:

#!/usr/wherever/bin/gawk -f
#
# Convert George's data files which are row-oriented towards column-oriented
# data files.

# Match a line with data
/^[[:digit:]].*/ {
    # There is now one extra row/column of data
    numrows++

    # Store the data in an array so we can extract it later on
    for (i=1; i<=NF; i++)
        data[i, numrows]=$i
}

# Now show all the data that we stored in the array in a column-oriented way.
END {
    for (col=1; col<=NF; col++) {
        for (row=1; row<=numrows; row++) {
            printf("%s%s", sep, data[col,row])
            sep=" "
        }
        print ""
    }
}

Just showing some possibilities - pick anything you want to keep.....

Ed.
William Park

2005-04-30, 3:56 am

Bart Vandewoestyne <MyFirstName.MyLastName@telenet.be> wrote:
> In article <e82fe$42725f4f$d1b71443$7763@PRIMUS.CA>, William Park wrote:
>
> Thanks. I was always searching for 'convert row data to column data'
> and search terms like that.
>
> The 'transpose' hint was very useful and I was able to write a working
> awk script:
>
> http://www.cs.kuleuven.ac.be/~bartv...ads/convert.awk
>
> This does exactly what I want. Of course, I'm always interested in
> reading other shorter/cleaner/more_intelligent solutions :-)


That smells like Fortran. Try something like

i=0
while read; do
    printf '%s\n' $REPLY > file.$((++i))
done < file
paste file.*

If your file has funny characters, then use 'set -f' to disable
globbing. To do this all in memory, you need my patched Bash
shell
http://freshmeat.net/projects/bashdiff/
which does "transpose in-place".

--
William Park <opengeometry@yahoo.ca>, Toronto, Canada
Slackware Linux -- because it works.
Steve Calfee

2005-04-30, 3:56 am

On Fri, 29 Apr 2005 19:41:17 -0500, Ed Morton <morton@lsupcaemnt.com>
wrote:

>
>No need for the semicolon, and normally people just use:
>
> print ""
>
>to add that final newline.
>


Hi Ed, oh master of minimal awk. I have always just used "print" on a
line to print an EOL. is 'print ""' better, or why are you wasting 3
typed characters? I am not being snide, I want to know and I
appreciate your tips.

Regards, ~Steve



There is no "x" in my email address.
Ed Morton

2005-04-30, 3:56 am



Steve Calfee wrote:
> On Fri, 29 Apr 2005 19:41:17 -0500, Ed Morton <morton@lsupcaemnt.com>
> wrote:
>
>
>
>
> Hi Ed, oh master of minimal awk. I have always just used "print" on a
> line to print an EOL. is 'print ""' better, or why are you wasting 3
> typed characters? I am not being snide, I want to know and I
> appreciate your tips.



print on a line prints $0. To ONLY print the newline character, you need
print "".

Ed.
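
The difference between the two forms is easy to check. A minimal sketch follows; the helper name show_print is invented for the example.

```shell
# show_print: hypothetical helper that runs both forms on every input line.
show_print() {
    awk '{ print        # bare print: writes $0 and a newline
           print ""     # writes only the newline
           print "--" }'
}

# Prints "a b", then an empty line, then "--".
printf 'a b\n' | show_print
```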
Bart Vandewoestyne

2005-04-30, 3:56 am

In article <cdmdnfW3PueySe_fRVn-rg@comcast.com>, Ed Morton wrote:
>
> I assume since you posted the link that you'd like some feedback so:
>
> <snip feedback>
>
> Just showing some possibilities - pick anything you want to keep.....


Thanks. I really appreciate this kind of feedback to improve my skills
in 'yet another language' that I'm learning :-)

Regards,
Bart

--
"Share what you know. Learn what you don't."
Loki Harfagr

2005-04-30, 8:55 am

On Fri, 29 Apr 2005 19:41:17 -0500, Ed Morton wrote:

>
>
> Bart Vandewoestyne wrote:
....
> I assume since you posted the link that you'd like some feedback so:

....
> So, the above could be written as:
>
> #!/usr/wherever/bin/gawk -f
> #
> # Convert George's data files which are row-oriented towards column-oriented
> # data files.
>
> # Match a line with data
> /^[[:digit:]].*/ {
> # There is now one extra row/column of data
> numrows++
>
> # Store the data in an array so we can extract it later on
> for (i=1; i<=NF; i++)
> data[i, numrows]=$i
> }
>
> # Now show all the data that we stored in the array in a column-oriented way.
> END {
> for (col=1; col<=NF; col++) {
> for (row=1; row<=numrows; row++) {
> printf("%s%s", sep, data[col,row])
> sep=" "
> }
> print ""
> }
> }
>
> Just showing some possibilities - pick anything you want to keep.....


May I play too ?-)
Just for the sake of doing it *almost* differently
and add some more feedback ;-)
#!/usr/bin/gawk -f
#
/^[[:digit:]].*/ {
    max=NF
    while(NF){
        data[NR,NF]=$NF;
        NF--
    }
}
END{
    while(max - j++){
        i=1
        while(data[i,j]) printf data[i++,j]FS
        print ""
    }
}

Well, I know it's not foolproof in case the input file is not
symmetric in its column/row matrix, but it shouldn't be ...

Worse, it acts funny when values are zero: it needs another kind
of test than (data[i,j]) in case zero values might be present.
For instance, the usual for loop:
END{
    for(j=1;j<=max;j++){
        for(i=1; i<=NR; i++)
            printf data[i,j]FS
        print ""
    }
}

And ...
It doesn't cope with *not* printing the first blank, which
Ed.'s answer didn't either :D)
(If really needed we should *void* the sep between the `for's :-)
for (col=1; col<=NF; col++) {
    sep=""
    for (row=1; row<=numrows; row++) {
)
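
Putting the suggestions from this part of the thread together, one possible version looks like the sketch below. It is not anyone's posted script: the wrapper name transpose is hypothetical, and it assumes whitespace-separated input in which data lines start with a digit. It resets sep for every output line, remembers the column count while reading, and never tests a value for truth, so zeros are printed normally.

```shell
# transpose: hypothetical helper. Reads rows on stdin, keeps only lines that
# start with a digit, and prints them as columns.
transpose() {
    awk '
    /^[0-9]/ {
        numrows++
        if (NF > numcols) numcols = NF      # remember the widest data row
        for (i = 1; i <= NF; i++)
            data[i, numrows] = $i
    }
    END {
        for (col = 1; col <= numcols; col++) {
            sep = ""                        # reset so no line starts with a blank
            for (row = 1; row <= numrows; row++) {
                printf "%s%s", sep, data[col, row]
                sep = " "
            }
            print ""
        }
    }'
}

# The non-data lines "x" and "y" are skipped; the zero is kept.
printf 'x\n1 0 3\ny\n4 5 6\n' | transpose
```

Using printf with an explicit "%s%s" format also avoids treating the data itself as a format string.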

Patrick TJ McPhee

2005-04-30, 3:55 pm

In article <b258e$4272da3b$d8fea0ea$16952@PRIMUS.CA>,
William Park <opengeometry@yahoo.ca> wrote:

% printf '%s\n' $REPLY > file.$((++i))
[...]
% paste file.*

Leaving aside the fact that it's completely off-topic, this doesn't
work if there are more than 9 lines in the file (the glob file.* sorts
file.10 before file.2), or if there's any conflict with existing files.
On the whole, it seems like a really bad way of approaching the problem.
--

Patrick TJ McPhee
North York Canada
ptjm@interlog.com
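
For anyone who still wants the paste approach, both objections can be worked around. This is only a sketch under stated assumptions: mktemp -d is available (it is common but not required by POSIX), fields are separated by single or repeated spaces with no leading blanks, and the helper name transpose_paste is made up.

```shell
# transpose_paste: hypothetical helper. Writes each input line to its own
# temporary file, one field per line, then pastes the files side by side.
transpose_paste() {
    dir=$(mktemp -d) || return 1    # private directory, no clash with existing files
    i=0
    while IFS= read -r line; do
        i=$((i + 1))
        # Zero-padded names keep the glob in numeric order (00010 after 00002).
        printf '%s\n' "$line" | tr -s ' ' '\n' > "$dir/$(printf '%05d' "$i")"
    done
    paste -d' ' "$dir"/*
    rm -rf "$dir"
}

printf '1 2\n3 4\n' | transpose_paste
```

It still creates one file per input line, so the awk versions in this thread remain the simpler choice.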
Steve Calfee

2005-04-30, 3:55 pm

On Fri, 29 Apr 2005 21:31:49 -0500, Ed Morton <morton@lsupcaemnt.com>
wrote:

>
>
>Steve Calfee wrote:
>
>
>print on a line prints $0. To ONLY print the newline character, you need
>print "".
>
> Ed.


DOH! I guess I was mixing awk up with BASIC! Sorry for the noise.

There is no "x" in my email address.
William Park

2005-04-30, 8:56 pm

Patrick TJ McPhee <ptjm@interlog.com> wrote:
> In article <b258e$4272da3b$d8fea0ea$16952@PRIMUS.CA>,
> William Park <opengeometry@yahoo.ca> wrote:
>
> % printf '%s\n' $REPLY > file.$((++i))
> [...]
> % paste file.*
>
> Leaving aside the fact that it's completely off-topic, this doesn't
> work if there are more than 9 lines in the file, or if there's any
> conflict with existing files. On the whole, it seems like a really
> bad way of approaching the problem.


I did even worse. I've patched Bash shell, so that it will do
"transpose" in one shot. Furthermore, I added another builtin command
to join the elements using a user supplied string. I have other evil
things planned for Bash, but I can't seem to find time...

--
William Park <opengeometry@yahoo.ca>, Toronto, Canada
Slackware Linux -- because it works.
Jürgen Kahrs

2005-05-01, 8:55 pm

William Park wrote:

> I did even worse. I've patched Bash shell, so that it will do
> "transpose" in one shot. Furthermore, I added another builtin command
> to join the elements using a user supplied string. I have other evil
> things planned for Bash, but I can't seem to find time...


Do you intend to implement all the matrix operations
of the Matlab language ?
Copyright 2008 codecomments.com