Dear all,
I have a bar-separated file:
name1|345
name2|201
...
I store it in a hash:
while (<FILE_A>) {
    chomp;
    ($name, $score) = split(/\|/, $_);
    $hash{$name} = $score;
}
Then I have a second file:
ID - 001
NA - name1
NA - name2
ID - 002
NA - name2
NA - name4
...
I match all IDs and NAs:
while (<FILE_B>) {
    chomp;
    if (/^ID/) {
        $ID = substr($_, 5);
    }
    elsif (/^NA/) {
        $NA = substr($_, 5);
    }
}
Now I have to do something like:
001 | 345+201
...
So, I want to read the ID and NA fields from the second file and then, for
each ID, print the sum of the scores from the first file.
Any suggestions? Which structure should I use to do that? Thanks in
advance.
Cheers, Andrej
Andrej Kastrin wrote on Monday, 30 January 2006, 10:14:
> Dear all,
>
> I have a bar-separated file:
> name1|345
> name2|201
> ...
>
> I store it in a hash:
> while (<FILE_A>) {
>     chomp;
>     ($name, $score) = split(/\|/, $_);
>     $hash{$name} = $score;
> }
Let's assume the resulting hash is %scores.
> Then I have a second file:
> ID - 001
> NA - name1
> NA - name2
>
> ID - 002
> NA - name2
> NA - name4
> ...
>
> I match all IDs and NAs:
>
> while (<FILE_B>) {
>     chomp;
>     if (/^ID/) {
>         $ID = substr($_, 5);
>     }
>     elsif (/^NA/) {
>         $NA = substr($_, 5);
>     }
> }
> Now I have to do something like:
> 001 | 345+201
> So, I want to read the ID and NA fields from the second file and then, for
> each ID, print the sum of the scores from the first file.
> Any suggestions? Which structure should I use to do that? Thanks in
> advance.
Now you could parse FILE_B and use another two-dimensional hash to accumulate
the scores by ID for each name. The loop could look like this (untested):
my %sums;
my $id;
while (<FILE_B>) {
    chomp;
    # An ID line starts a new group; remember its number.
    next if (($id) = $_ =~ /^ID - (\d+)/);
    # Anything that is not an NA line (e.g. a blank line) is skipped.
    next unless my ($na) = $_ =~ /^NA - (\w+)/;
    $sums{$id}->{$na} += $scores{$na};
}
foreach my $id (sort keys %sums) {
    print "ID $id\n";
    foreach my $name (sort keys %{$sums{$id}}) {
        print "name: $name - scores: ",
            $sums{$id}->{$name}, "\n";
    }
}
(All handling of possible errors is missing here.)
hth,
joe
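The error handling left out above could be sketched roughly as follows. This is only an illustration under assumptions: the helper name scores_by_id, the in-memory sample data, and the choice to count an unknown name as 0 are all hypothetical and not part of the original reply.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: returns a hash ref { id => { name => score } }.
# $scores is the name => score lookup built from the first file,
# $lines is an array ref holding the lines of the second file.
sub scores_by_id {
    my ($scores, $lines) = @_;
    my (%sums, $id);
    for my $line (@$lines) {
        if ($line =~ /^ID - (\d+)/) {
            $id = $1;
            next;
        }
        # Skip blank lines and anything else that is not an NA line.
        next unless $line =~ /^NA - (\w+)/;
        my $name = $1;
        if (!defined $id) {
            warn "NA line before any ID line, skipped: $line\n";
            next;
        }
        # Assumption: a name without a score counts as 0.
        my $score = 0;
        if (exists $scores->{$name}) {
            $score = $scores->{$name};
        }
        else {
            warn "no score for '$name' (ID $id), counted as 0\n";
        }
        $sums{$id}{$name} += $score;
    }
    return \%sums;
}

# Sample data taken from the question, plus one stray NA line.
my %scores = (name1 => 345, name2 => 201);
my @file_b = ('NA - name9', 'ID - 001', 'NA - name1', 'NA - name2',
              '', 'ID - 002', 'NA - name2', 'NA - name4');

my $sums = scores_by_id(\%scores, \@file_b);
for my $id (sort keys %$sums) {
    print "ID $id\n";
    for my $name (sort keys %{ $sums->{$id} }) {
        print "name: $name - scores: $sums->{$id}{$name}\n";
    }
}
```

The warnings go to STDERR, so the regular output stays clean; whether an unknown name should be skipped, counted as 0, or treated as fatal depends on the data.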
John Doe wrote:
> Andrej Kastrin wrote on Monday, 30 January 2006, 10:14:
>
> Let's assume the resulting hash is %scores.
>
> Now you could parse FILE_B and use another two-dimensional hash to accumulate
> the scores by ID for each name. The loop could look like this (untested):
>
> my %sums;
> my $id;
> while (<FILE_B>) {
>     chomp;
>     next if (($id) = $_ =~ /^ID - (\d+)/);
>     next unless my ($na) = $_ =~ /^NA - (\w+)/;
>     $sums{$id}->{$na} += $scores{$na};
> }
>
> foreach my $id (sort keys %sums) {
>     print "ID $id\n";
>     foreach my $name (sort keys %{$sums{$id}}) {
>         print "name: $name - scores: ",
>             $sums{$id}->{$name}, "\n";
>     }
> }
>
> (All handling of possible errors is missing here.)
>
> hth,
> joe
I'm totally [...] now and I have no more ideas... Thanks for your
reply, Joe, but I didn't manage. Here is a more realistic example:
First file (I modified it):
270|Germany|Hospitals|Poland
272|Germany|History
273|Physiology|Poland|Portraits
Second file:
Germany|100
History|200
Hospitals|50
Poland|50
Physiology|10
Portrait|10
Output file:
270|100|50|50|200 # 270 is the key in the first file; 100, 50, 50 are the values for the nouns from the second file; 200 is their sum
272|100|200|300
273|10|50|10|70
I have been studying this problem for 5 hours now, but I'm afraid I can't do it myself.
Cheers, Andrej
Andrej Kastrin wrote on Monday, 30 January 2006, 16:50:
> John Doe wrote:
>
> I'm totally [...] now and I have no more ideas... Thanks for your
> reply, Joe, but I didn't manage. Here is a more realistic example:

My answer above is obviously not appropriate to the following, quite different example...

[1]
> First file (I modified it):
> 270|Germany|Hospitals|Poland
> 272|Germany|History
> 273|Physiology|Poland|Portraits

Quite different from:
name1|345

[2]
> Second file:
> Germany|100
> History|200
> Hospitals|50
> Poland|50
> Physiology|10
> Portrait|10

(I assume you mean 'Portraits' here.)

Quite different from:
ID - 001
NA - name1
NA - name2
ID - 002

[3]
> Output file:
> 270|100|50|50|200 # 270 is the key in the first file;
>                   # 100, 50, 50 are the values for the
>                   # nouns from the second file; 200 is their sum
> 272|100|200|300
> 273|10|50|10|70

Ok, you want [3] from [1] and [2]?
If yes, here is the script. I tested it, and it produces the output you want:

#!/usr/bin/perl
use strict;
use warnings;

# Make a lookup hash of file 2.
# Hash keys are the names, hash values the scores from the file.
# It is assumed that a name only occurs once in the file.
#
my %lookup;
open (my $fh2, '<', 'file2') or die;
while (<$fh2>) {
    chomp;
    my ($name, $score) = split /\|/;
    $lookup{$name} = $score;
}
close $fh2 or die;

# Now in every line of file1 all names are replaced
# by their score (?) value found in the lookup hash
# and a sum is added at the end. The line is
# directly printed out.
#
open (my $fh1, '<', 'file1') or die;
open (my $fh3, '>', 'file3') or die;
while (<$fh1>) {
    chomp;
    my ($num, @entries) = split /\|/;

    # Replace names by scores. If no score is found,
    # take score=0.
    #
    @entries = map {$lookup{$_} || 0} @entries;

    my $sum = 0;
    $sum += $_ for @entries;
    print $fh3 join '|', $num, @entries, $sum;
    print $fh3 "\n";
}
close $fh1 or die;
close $fh3 or die;
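To try the lookup-and-replace step without creating any files, the per-line logic can be pulled into a small function. This is a sketch only: score_line and the two in-memory lookup hashes are hypothetical names, and the second hash shows what the "Portrait" versus "Portraits" mismatch noted above would do to the result.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: turns "270|Germany|Hospitals|Poland" into
# "270|100|50|50|200" using the name => score lookup in $lookup.
sub score_line {
    my ($lookup, $line) = @_;
    my ($num, @names) = split /\|/, $line;
    # A name that is missing from the lookup gets the score 0.
    my @scores = map { $lookup->{$_} || 0 } @names;
    my $sum = 0;
    $sum += $_ for @scores;
    return join '|', $num, @scores, $sum;
}

# Lookup with the key spelled "Portraits", as in the first file.
my %lookup = (Germany => 100, History => 200, Hospitals => 50,
              Poland => 50, Physiology => 10, Portraits => 10);

for my $line ('270|Germany|Hospitals|Poland',
              '272|Germany|History',
              '273|Physiology|Poland|Portraits') {
    print score_line(\%lookup, $line), "\n";
}
# prints:
# 270|100|50|50|200
# 272|100|200|300
# 273|10|50|10|70

# Lookup with the key spelled "Portrait", as posted in the second file:
# "Portraits" is then not found and silently counts as 0.
my %as_posted = (Physiology => 10, Poland => 50, Portrait => 10);
print score_line(\%as_posted, '273|Physiology|Poland|Portraits'), "\n";
# prints:
# 273|10|50|0|60
```

Because a missing name silently becomes 0, a spelling difference between the two files changes the sum without any error message; a warn inside the map block would make that visible.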
Andrej Kastrin wrote:
> Dear all,
Hello,
> I have a bar-separated file:
> name1|345
> name2|201
> ...
>
> I store it in a hash:
> while (<FILE_A>) {
>     chomp;
>     ($name, $score) = split(/\|/, $_);
>     $hash{$name} = $score;
> }
>
> Then I have a second file:
> ID - 001
> NA - name1
> NA - name2
>
> ID - 002
> NA - name2
> NA - name4
> ...
>
> I match all IDs and NAs:
>
> while (<FILE_B>) {
>     chomp;
>     if (/^ID/) {
>         $ID = substr($_, 5);
>     }
>     elsif (/^NA/) {
>         $NA = substr($_, 5);
>     }
> }
>
> Now I have to do something like:
> 001 | 345+201
> ...
>
> So, I want to read the ID and NA fields from the second file and then, for
> each ID, print the sum of the scores from the first file.
>
> Any suggestions? Which structure should I use to do that? Thanks in
> advance.
It looks like you could do something like this (UNTESTED):
my %scores;
while ( <FILE_A> ) {
    chomp;
    my ( $name, $score ) = split /\|/;
    $scores{ $name } = $score;
}
my ( $ID, %ids );
while ( <FILE_B> ) {
    # An ID line sets the current ID; each NA line adds its score to it.
    if ( /^ID\s*-\s*(.+)/ ) {
        $ID = $1;
    }
    elsif ( /^NA\s*-\s*(.+)/ ) {
        $ids{ $ID } += $scores{ $1 };
    }
}
for my $id ( keys %ids ) {
    print "$id | $ids{$id}\n";
}
John
--
use Perl;
program
fulfillment
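The question asked for output of the form "001 | 345+201", that is, the individual scores as well as their sum. One possible variation, sketched here under assumptions, keeps an array of scores per ID instead of a running total. The helper name id_report, the " = sum" suffix, and counting an unknown name as 0 are illustrative choices, not something taken from the replies.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: returns one report line per ID, sorted by ID.
# $scores is the name => score lookup, $lines the lines of the second file.
sub id_report {
    my ($scores, $lines) = @_;
    my (%terms, $id);
    for my $line (@$lines) {
        if ($line =~ /^ID\s*-\s*(\S+)/) {
            $id = $1;
            $terms{$id} ||= [];    # an ID without NA lines still shows up
        }
        elsif (defined $id and $line =~ /^NA\s*-\s*(\S+)/) {
            # Assumption: a name without a score counts as 0.
            push @{ $terms{$id} }, exists $scores->{$1} ? $scores->{$1} : 0;
        }
    }
    my @report;
    for my $key (sort keys %terms) {
        my $sum = 0;
        $sum += $_ for @{ $terms{$key} };
        push @report, "$key | " . join('+', @{ $terms{$key} }) . " = $sum";
    }
    return @report;
}

# Sample data from the question.
my %scores = (name1 => 345, name2 => 201);
my @file_b = ('ID - 001', 'NA - name1', 'NA - name2', '',
              'ID - 002', 'NA - name2', 'NA - name4');

print "$_\n" for id_report(\%scores, \@file_b);
# prints:
# 001 | 345+201 = 546
# 002 | 201+0 = 201
```

Sorting the keys gives a stable output order, which a plain loop over keys %ids does not guarantee.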
John W. Krahn wrote:
> Andrej Kastrin wrote:
>
> Hello,
>
> It looks like you could do something like this (UNTESTED):
>
> my %scores;
> while ( <FILE_A> ) {
>     chomp;
>     my ( $name, $score ) = split /\|/;
>     $scores{ $name } = $score;
> }
>
> my ( $ID, %ids );
> while ( <FILE_B> ) {
>     if ( /^ID\s*-\s*(.+)/ ) {
>         $ID = $1;
>     }
>     elsif ( /^NA\s*-\s*(.+)/ ) {
>         $ids{ $ID } += $scores{ $1 };
>     }
> }
>
> for my $id ( keys %ids ) {
>     print "$id | $ids{$id}\n";
> }
>
> John
John & John, thanks for the help.
Now I understand (I hope so).
Best, Andrej