use array index or values as index to fields to print
| Thomas Toth 2004-12-01, 3:55 pm |
| i've got a comma-separated file in which the first line is the header of
each column and the remaining lines are data. i would like to scan the
header for certain patterns and then print the selected columns.
i've got most of it working, except for the printing of the correct
fields. i'm not sure how to do this smartly.
here's what i've got for a code so far. this one stores all indices of
fields containing TT in the header in the array data_array. then (the
failing part) on all following lines it should print the fields
indicated by the array.
BEGIN { FS = ","; OFS = "," }
(NR==1) {
    for (i=1; i<=NF; i++) {
        if ($i ~ /TT/) {
            data_array[i] = i
        }
    }
}
{ print $data_array }
as i question my ability to explain, here is an example input and the
desired output:
IN:
nr,a,c,tt1,tt2,tt3,d,e,tt5
1,2,3,4,5,6,7,8,9
11,12,13,14,15,16,17,18,19
21,22,23,24,25,26,27,28,29
31,32,33,34,35,36,37,38,39
OUT:
4,5,6,9
14,15,16,19
24,25,26,29
34,35,36,39
the file is over 1G in size so efficiency would be nice.
thanks for any help,
Tom
| |
| Robert Stearns 2004-12-01, 3:55 pm |
| See the intercalated comments. Untested.
Thomas Toth wrote:
> i've got a comma-separated file in which the first line is the header of
> each column and the remaining lines are data. i would like to scan the
> header for certain patterns and then print the selected columns.
>
> i've got most of it working, except for the printing of the correct
> fields. i'm not sure how to do this smartly.
>
> here's what i've got for a code so far. this one stores all indices of
> fields containing TT in the header in the array data_array. then (the
> failing part) on all following lines it should print the fields
> indicated by the array.
>
> BEGIN { FS = ","; OFS = "," }
> (NR==1) {for (i=1;i<=NF;i++) {
> if ($i~/TT/){
> data_array[i]=i
Rather: fields[fieldct++] = i
> }
> }
> }
> {print $data_array}
Something like:
dlm = ""
for(i=0;i<fieldct;i++) {
printf("%s%s", dlm, $fields[i])
dlm = OFS
}
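For reference, the two fragments above can be assembled into one script. This is only a sketch, and it makes some assumptions that are not in the post: the shell wrapper name select_tt is made up, the pattern is widened to /TT|tt/ because the sample header uses lowercase names, a "next" skips the header line, and a newline is printed after each record.

```shell
# Hypothetical wrapper; reads the named files, or stdin if none are given.
select_tt() {
  awk '
    BEGIN { FS = ","; OFS = "," }
    NR == 1 {
      for (i = 1; i <= NF; i++)
        if ($i ~ /TT|tt/)          # assumption: accept either case
          fields[fieldct++] = i    # remember the column number, 0-based slot
      next                         # assumption: do not print the header
    }
    {
      dlm = ""                     # no delimiter before the first field
      for (i = 0; i < fieldct; i++) {
        printf("%s%s", dlm, $(fields[i]))
        dlm = OFS
      }
      printf("\n")                 # assumption: end each output record
    }
  ' "$@"
}
```

Called as "select_tt data.csv" (data.csv being a placeholder name), it walks the stored column numbers in the order they were found, so the output columns keep the header order.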
| |
| Doug McClure 2004-12-01, 3:55 pm |
| WARNING: Untested code
BEGIN { FS = ","; OFS = "," }
(NR==1) {
for (i=1;i<=NF;i++)
if ($i~/tt/) {data_array[i]=i}
next # Skip the line of labels
}
{
for (i in data_array) printf("%s", $i)
print ""
}
DKM
On Wed, 01 Dec 2004 15:24:26 +0100, Thomas Toth <user@example.net>
wrote:
>i've got a comma-separated file in which the first line is the header of
>each column and the remaining lines are data. i would like to scan the
>header for certain patterns and then print the selected columns.
To contact me directly, send EMAIL to (single letters all)
DEE_KAY_EMM AT EarthLink.net. [For example X_X_X@EarthLink.net.]
| |
| Kenny McCormack 2004-12-01, 3:55 pm |
| In article <8hmrq05eug4gtd8uk046o2bpintb77sgar@4ax.com>,
Doug McClure <Dee_Kay_Emm@EarthLink.net> wrote:
>WARNING: Untested code
>
>BEGIN { FS = ","; OFS = "," }
>
>(NR==1) {
> for (i=1;i<=NF;i++)
> if ($i~/tt/) {data_array[i]=i}
ITYM:
if ($i~/tt/) data_array[++flds]=i
> next # Skip the line of labels
> }
>
>{
>for (i in data_array) printf("%s", $i)
ITYM:
for (i=1; i<=flds; i++) printf("%s", $data_array[i])
>print ""
>}
Note that using "for (i in data_array) ..." is dangerous unless using TAWK
(or GAWK with WHINY_USERS set) because the data comes out in
(pseudo-)random order.
| |
| William James 2004-12-01, 3:55 pm |
| This version uses Kenny's suggestion:
BEGIN { FS = ","; ORS="" }
1==NR {
for (i=1; i<=NF; i++)
if ( $i ~ /TT|tt/ )
selected[++fields] = i
next
}
{ for (i=1; i<=fields; i++)
{ if ( i > 1 ) print FS
print $(selected[i])
}
print "\n"
}
| |
| Thomas Toth 2004-12-02, 3:56 pm |
| William James wrote:
> This version uses Kenny's suggestion:
>
> BEGIN { FS = ","; ORS="" }
>
> 1==NR {
> for (i=1; i<=NF; i++)
> if ( $i ~ /TT|tt/ )
> selected[++fields] = i
> next
> }
>
> { for (i=1; i<=fields; i++)
> { if ( i > 1 ) print FS
> print $(selected[i])
> }
> print "\n"
> }
thanks a lot, your solutions worked.
just to exploit the possibilities, would it be possible to build a
string first and then use it to print the line?
something like (pseudo):
NR==1 { for (i=1; i<=NF; i++)
            if ( $i ~ /TT|tt/ ) {
                selected[++fields] = i
                string = (string $i)
            }
        next
}
{ print string }
thanks for all the help,
tom
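One possible way to get the "build a string first" effect, offered as a sketch rather than as an answer from the thread: awk has no built-in way to evaluate a string as program text, so the string has to be turned into a program before awk runs. The version below reads the header once to build the text "$4,$5,$6,$9" and then lets the shell splice that text into a second awk program. The function name select_tt_generated and the two-pass design are assumptions, and it needs a real file name because the input is opened twice.

```shell
# Hypothetical helper; takes a file name because the input is read twice.
select_tt_generated() {
  file=$1
  # Pass 1: read only the header and build the field list as text.
  list=$(awk -F, '
    NR == 1 {
      for (i = 1; i <= NF; i++)
        if ($i ~ /TT|tt/) { s = s sep "$" i; sep = "," }
      print s
      exit
    }' "$file")
  # No matching column: stop, since an empty list would print whole lines.
  [ -n "$list" ] || return 1
  # Pass 2: the shell inserts the list into the program text, so awk sees
  # a fixed statement such as: print $4,$5,$6,$9
  awk -F, -v OFS=, "NR > 1 { print $list }" "$file"
}
```

The first pass exits after the header line, so the large file is only read in full once. The second program has no loop per record, which may help on a file of this size, though that has not been measured here.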