Code Comments

better solutions?
Dear all
I intended to merge two files according to the third column in file2

file1: ( 27600 lines)
tm0 O9:AN3:3.15+O9:A134:OG1+A105:ND2:O7
tm1 O9:AN3:3.14+O9:A134:OG1+A134:OG1:N3+O9:A132:O+A101:N:O7
tm2 O9:AN3:3.15+O9:A134:OG1
tm3 O9:A131:OG
tm4 O9:A131:OG+A131:N:O9+O9:A127:O

file2: (35 lines)
XBX_12291  32.10 21442
XBX_16460  56.51 22536
XBX_16460  56.0  22537
XBX_23526  53.25 23516
XBX_23526  54.49 23510

final:
XBX_23526  53.25 23516
XBX_12291  32.10 21442 O9:A131:OG
XBX_16460  56.51 22536 A131:N:O9
XBX_23526  54.49 23510 O9:A134:OG1+A105:ND2:O7

I have written a script for this, but it runs very slowly.

awk 'FILENAME=="file1" { name[++i]=substr($1,3); line[++x]=$2} {
num=$3; for ( r=1; r<=i; ++r ){ if ( num==name[r] ) print
$0,line[r]}}' file1 file2

Does anyone have a better solution?
Thank you

Jui-Hua

moggces
11-16-04 11:50 PM


Re: better solutions?
I think your example isn't correct; however, here's my take on what you're
trying to do:

FILENAME=="file2" {
a[$3]=$0
next
}
{ if (substr($1,3) in a) print a[substr($1,3)] $2 }



A Ferenstein
11-16-04 11:50 PM


Re: better solutions?
I missed a " " (space) between a[substr($1,3)] and $2, so the last line
should print a[substr($1,3)] " " $2.

"A Ferenstein" <epaalx@hotmail.com> wrote in message
news:cmvf2i$hc4$1@newstree.wise.edt.ericsson.se...
> I think your example isn't correct; however, here's my take on what you're
> trying to do:
>
> FILENAME=="file2" {
>  a[$3]=$0
>  next
> }
> { if (substr($1,3) in a) print a[substr($1,3)] $2 }
>
>



A Ferenstein
11-16-04 11:50 PM


Re: better solutions?

moggces wrote:
> Dear all
> I intended to merge two files according to the third column in file2
>
> file1: ( 27600 lines)
> tm0 O9:AN3:3.15+O9:A134:OG1+A105:ND2:O7
> tm1 O9:AN3:3.14+O9:A134:OG1+A134:OG1:N3+O9:A132:O+A101:N:O7
> tm2 O9:AN3:3.15+O9:A134:OG1
> tm3 O9:A131:OG
> tm4 O9:A131:OG+A131:N:O9+O9:A127:O
>
> file2: (35 lines)
> XBX_12291  32.10 21442
> XBX_16460  56.51 22536
> XBX_16460  56.0  22537
> XBX_23526  53.25 23516
> XBX_23526  54.49 23510
>
> final:
> XBX_23526  53.25 23516
> XBX_12291  32.10 21442 O9:A131:OG
> XBX_16460  56.51 22536 A131:N:O9
> XBX_23526  54.49 23510 O9:A134:OG1+A105:ND2:O7

The above example makes no sense, but I assume the "tm1", etc. at the
start of file1 are supposed to match the numbers at the end of file2.

> I have written a script for this, but it runs very slowly.
>
> awk 'FILENAME=="file1" { name[++i]=substr($1,3); line[++x]=$2}

Quick fix: Put a "next" at the end of the above line. Right now you're
running the rest of the script on each of the 27600 lines in file1 when
you really only want to do it on the 35 lines in file2.

{
> num=$3; for ( r=1; r<=i; ++r ){ if ( num==name[r] ) print
> $0,line[r]}}' file1 file2
>
> Does anyone have a better solution?
> Thank you
>
> Jui-Hua

Without a real example, it's hard to advise what else you could do to
improve your script, but it seems like, from a memory usage standpoint,
you'd be better off storing file2 in the array and then acting on file1 rather
than the other way around. You could also look at the UNIX "join"
command, but if you want to do it in awk, it should probably look more
like this:

awk 'NR==FNR{line[$3]=$0;next}
$1 in line {print line[$1], $2}' file2 file1

Again, the above is just a guess since the example doesn't make sense.

Regards,

Ed.

Ed Morton
11-16-04 11:50 PM
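
To illustrate the "next" quick fix described above, here is a minimal sketch.
The file contents are made-up sample data shaped like the poster's files (the
thread never shows a consistent example), and the names tmp and result exist
only for this illustration. Without "next", the lookup loop also runs for
every line of file1, which for 27600 lines adds up to hundreds of millions of
wasted comparisons; with "next" it runs only for the lines of file2.

```shell
# Hypothetical sample data, not the poster's real files.
tmp=$(mktemp -d)
printf '%s\n' 'tm1 AAA' 'tm2 BBB' 'tm3 CCC' > "$tmp/file1"
printf '%s\n' 'XBX_1  32.10 3' 'XBX_2  56.51 1' > "$tmp/file2"

# Same logic as the original script, plus "next" so that the lookup
# loop below is skipped for file1 and runs only for lines of file2.
result=$(cd "$tmp" && awk '
FILENAME == "file1" { name[++i] = substr($1, 3); line[i] = $2; next }
{
    for (r = 1; r <= i; ++r)
        if ($3 == name[r])
            print $0, line[r]
}' file1 file2)

printf '%s\n' "$result"
rm -rf "$tmp"
```

This still scans the whole name array once per line of file2, so it is only a
first step; an array indexed by the key avoids the inner loop entirely.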


Re: better solutions?
juihuahsieh@nhri.org.tw (moggces) wrote in message news:<f4230c15.0411110112.159dc84a@posting.google.com>...
> Dear all
> I intended to merge two files according to the third column in file2
>
> file1: ( 27600 lines)
> tm0 O9:AN3:3.15+O9:A134:OG1+A105:ND2:O7
> tm1 O9:AN3:3.14+O9:A134:OG1+A134:OG1:N3+O9:A132:O+A101:N:O7
> tm2 O9:AN3:3.15+O9:A134:OG1
> tm3 O9:A131:OG
> tm4 O9:A131:OG+A131:N:O9+O9:A127:O
>
> file2: (35 lines)
> XBX_12291  32.10 21442
> XBX_16460  56.51 22536
> XBX_16460  56.0  22537
> XBX_23526  53.25 23516
> XBX_23526  54.49 23510
>
> final:
> XBX_23526  53.25 23516
> XBX_12291  32.10 21442 O9:A131:OG
> XBX_16460  56.51 22536 A131:N:O9
> XBX_23526  54.49 23510 O9:A134:OG1+A105:ND2:O7
>
> I have written a script for this, but it runs very slowly.
>
> awk 'FILENAME=="file1" { name[++i]=substr($1,3); line[++x]=$2} {
> num=$3; for ( r=1; r<=i; ++r ){ if ( num==name[r] ) print
> $0,line[r]}}' file1 file2
>
> Does anyone have a better solution?
> Thank you
>
> Jui-Hua


Sorry for the wrong final file. I didn't check it carefully, and I didn't
make my point clear.

final:
XBX_12291  32.10 21442 O9:A134:OG1+A134:OG1:N3+O9:A132:O
XBX_16460  56.51 22536 O9:A132:O
XBX_16460  56.0  22537 O9:A131:OG+A131:N:O9+O9:A127:O
XBX_23526  53.25 23516 O9:A131:OG
                 ^^^^^
XBX_23526  54.49 23510 O9:A134:OG1
                 ^^^^^
I tried a solution like A Ferenstein's before I posted. However, the order
of the output changes to:
XBX_12291  32.10 21442 O9:A134:OG1+A134:OG1:N3+O9:A132:O
XBX_16460  56.51 22536 O9:A132:O
XBX_16460  56.0  22537 O9:A131:OG+A131:N:O9+O9:A127:O
XBX_23526  54.49 23510 O9:A134:OG1
                 ^^^^^
XBX_23526  53.25 23516 O9:A131:OG
                 ^^^^^

I also looked up the UNIX "join" command. It seems it can only merge two
files with identical lines.


Thanks all.

Jui-Hua

moggces
11-16-04 11:50 PM
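
The ordering problem described above comes from printing while reading file1:
the output then follows file1's order. One possible way to keep file2's order
is to index file1 by its number first and print while reading file2. This is
only a sketch under assumptions: the sample data is made up, the key is
assumed to be the part of file1's first column after the "tm" prefix (as in
the original script's substr($1,3)), and the names tmp, data and ordered are
hypothetical.

```shell
# Hypothetical sample data, not the poster's real files.
tmp=$(mktemp -d)
printf '%s\n' 'tm1 AAA' 'tm2 BBB' 'tm3 CCC' > "$tmp/file1"
printf '%s\n' 'XBX_1  32.10 3' 'XBX_2  56.51 1' 'XBX_3  54.49 2' > "$tmp/file2"

# NR == FNR is true only while the first file argument (file1) is read.
ordered=$(awk '
NR == FNR { data[substr($1, 3)] = $2; next }   # file1: index column 2 by the number after "tm"
$3 in data { print $0, data[$3] }               # file2: print matches in the order file2 lists them
' "$tmp/file1" "$tmp/file2")

printf '%s\n' "$ordered"
rm -rf "$tmp"
```

The trade-off is memory: this holds all of file1 in the array rather than the
35 lines of file2, in exchange for a single lookup per line and output that
follows file2.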


Re: better solutions?

moggces wrote:
> juihuahsieh@nhri.org.tw (moggces) wrote in message news:<f4230c15.0411110112.159dc84a@posting.google.com>...
> 
<snip>
> I also looked up the UNIX "join" command. It seems it can only merge two
> files with identical lines.

No, it can merge based on a specific field in each file. It must be me -
I just can't see what field is common between file1 and file2 in your
examples. You say it's the third column of file 2, but where does the
number "21442", for example, appear in file1?

Ed.

Ed Morton
11-16-04 11:50 PM
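
As a rough illustration of joining on a specific field, here is a sketch
using join's standard -1, -2 and -o options. The sample data is made up, and
it assumes the key in file1 is the number left after stripping a "tm" prefix;
the file names f1.sorted and f2.sorted and the variable joined are
hypothetical. join expects both inputs sorted on the join field, so both are
sorted first.

```shell
# Hypothetical sample data, not the poster's real files.
export LC_ALL=C            # make sort and join agree on collation order
tmp=$(mktemp -d)
printf '%s\n' 'tm1 AAA' 'tm2 BBB' 'tm3 CCC' > "$tmp/file1"
printf '%s\n' 'XBX_1 32.10 3' 'XBX_2 56.51 1' > "$tmp/file2"

# Strip the "tm" prefix so column 1 of file1 is the bare key, then sort
# each input on its join field.
sed 's/^tm//' "$tmp/file1" | sort -k1,1 > "$tmp/f1.sorted"
sort -k3,3 "$tmp/file2" > "$tmp/f2.sorted"

# -1 3: join on column 3 of the first input; -2 1: column 1 of the second.
# -o lists the output columns as input.column pairs.
joined=$(join -1 3 -2 1 -o 1.1,1.2,1.3,2.2 "$tmp/f2.sorted" "$tmp/f1.sorted")

printf '%s\n' "$joined"
rm -rf "$tmp"
```

Note that the result comes out in sorted key order rather than in file2's
original order, so this approach fits only when the output order does not
matter or can be restored afterwards.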



