Home > Archive > PERL Beginners > January 2006 > Hash problem
| Andrej Kastrin 2006-01-30, 3:55 am |
| Dear all,
I have a bar-separated file:
name1|345
name2|201
....
I store it in a hash:
while (<FILE_A>) {
    chomp;
    ($name,$score) = split (/\|/,$_);
    $hash{$name} = $score;
}
Then I have a second file:
ID - 001
NA - name1
NA - name2
ID - 002
NA - name2
NA - name4
....
I match all IDs and NAs:
while (<FILE_B>) {
    chomp;
    if (/^ID/) {
        $ID = substr($_,5);
    }
    elsif (/^NA/) {
        $NA = substr($_,5);
    }
}
Now I have to do something like:
001 | 345+201
....
So, I want to read the ID and NA fields from the second file and then, for
each ID, print the sum of the scores from the first file.
Any suggestions? Which structure should I use to do that? Thanks in
advance.
Cheers, Andrej
| |
| John Doe 2006-01-30, 7:55 am |
| Let's assume the hash built from your first file is named %scores.
Now you could parse FILE_B and use another two-dimensional hash to accumulate
the scores by ID for each name. The loop could look like this (untested):
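my %sums;
my $id;
while (<FILE_B>) {
    chomp;
    # Remember the current ID. Assigning the match result directly to
    # ($id) would reset $id to undef on every non-ID line.
    if (/^ID - (\d+)/) { $id = $1; next }
    next unless my ($na) = /^NA - (\w+)/;
    # Names without a score count as 0.
    $sums{$id}->{$na} += $scores{$na} || 0;
}
foreach my $id (sort keys %sums) {
    print "ID $id\n";
    foreach my $name (sort keys %{$sums{$id}}) {
        print "name: $name - scores: ",
            $sums{$id}->{$name}, "\n";
    }
}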
(All handling of possible errors is missing here)
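A minimal sketch of what such error handling could look like, keeping a single running sum per ID. The sub name sum_scores_by_id and the in-memory sample data are made up for illustration, and skipping unknown names with a warning is only one possible policy:

```perl
use strict;
use warnings;

# Hypothetical helper: takes a hash ref of name => score and an array ref
# of lines in the "ID - nnn" / "NA - name" format; returns a hash ref id => sum.
sub sum_scores_by_id {
    my ($scores, $lines) = @_;
    my (%sums, $id);
    for my $line (@$lines) {
        if ($line =~ /^ID - (\d+)/) {
            $id = $1;
            $sums{$id} = 0;    # IDs without any known name still appear
        }
        elsif ($line =~ /^NA - (\w+)/) {
            my $name = $1;
            unless (defined $id) {
                warn "NA line before any ID line: $line\n";
                next;
            }
            unless (exists $scores->{$name}) {
                warn "no score for '$name' (ID $id), skipped\n";
                next;
            }
            $sums{$id} += $scores->{$name};
        }
    }
    return \%sums;
}

# Sample data taken from the question.
my %scores = (name1 => 345, name2 => 201);
my @lines  = ('ID - 001', 'NA - name1', 'NA - name2',
              'ID - 002', 'NA - name2', 'NA - name4');

my $sums = sum_scores_by_id(\%scores, \@lines);
print "$_ | $sums->{$_}\n" for sort keys %$sums;
```

With the sample data this prints "001 | 546" and "002 | 201", plus a warning on STDERR for name4.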
hth,
joe
| |
| Andrej Kastrin 2006-01-30, 6:56 pm |
| I'm totally lost now and I have no more ideas... Thanks for your
reply, Joe, but I didn't manage it. Here is a more realistic example:
First file (I modified it):
270|Germany|Hospitals|Poland
272|Germany|History
273|Physiology|Poland|Portraits
Second file:
Germany|100
History|200
Hospitals|50
Poland|50
Physiology|10
Portrait|10
Output file:
270|100|50|50|200   # 270 is the key from the first file; 100, 50, 50 are the
                    # values from the second file for its names; 200 is their sum
272|100|200|300
273|10|50|10|70
I have been studying this problem for 5 hours now, but I'm afraid I can't do it myself.
Cheers, Andrej
| |
| John Doe 2006-01-30, 6:56 pm |
| Andrej Kastrin wrote on Monday, 30 January 2006, 16:50:
> I'm totally lost now and I have no more ideas... Thanks for your
> reply, Joe, but I didn't manage it. Here is a more realistic example:
My answer above is obviously not appropriate to the following quite different
example...
[1]
> First file: (I modify it)
> 270|Germany|Hospitals|Poland
> 272|Germany|History
> 273|Physiology|Poland|Portraits
Quite different from:
name1|345
[2]
> Second file:
> Germany|100
> History|200
> Hospitals|50
> Poland|50
> Physiology|10
> Portrait|10
(I assume you mean 'Portraits' here, to match the first file)
Quite different from:
ID - 001
NA - name1
NA - name2
ID - 002
[3]
> Output file:
> 270|100|50|50|200 #270 is the key in table 1;
> 100, 50, 50 are values for
> nouns from second file, 200 is the sum of them
> 272|100|200|300
> 273|10|50|10|70
OK, so you want [3] from [1] and [2]?
If so, here is a script. I tested it, and it produces the output you want:
#!/usr/bin/perl
use strict;
use warnings;

# Make a lookup hash from file 2.
# Hash keys are the names, hash values are their scores.
# It is assumed that a name occurs only once in the file.
#
my %lookup;
open (my $fh2, '<', 'file2') or die "Cannot open file2: $!";
while (<$fh2>) {
    chomp;
    my ($name, $score) = split /\|/;
    $lookup{$name} = $score;
}
close $fh2 or die "Cannot close file2: $!";

# Now, in every line of file1, all names are replaced
# by their score value found in the lookup hash,
# and the sum is appended at the end. The line is
# printed out directly.
#
open (my $fh1, '<', 'file1') or die "Cannot open file1: $!";
open (my $fh3, '>', 'file3') or die "Cannot open file3: $!";
while (<$fh1>) {
    chomp;
    my ($num, @entries) = split /\|/;

    # Replace names by scores. If no score is found,
    # take score=0.
    #
    @entries = map { $lookup{$_} || 0 } @entries;
    my $sum = 0;
    $sum += $_ for @entries;
    print $fh3 join '|', $num, @entries, $sum;
    print $fh3 "\n";
}
close $fh1 or die "Cannot close file1: $!";
close $fh3 or die "Cannot close file3: $!";
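To see what the score=0 fallback does when a name in file 1 has no exact match in file 2 (such as 'Portraits' versus 'Portrait'), here is a small sketch of the per-line step on in-memory data. The sub name score_line is made up for illustration, and List::Util's sum is used here in place of the explicit loop:

```perl
use strict;
use warnings;
use List::Util qw(sum);

# Hypothetical helper: turns "num|name|name..." into "num|score|score...|sum".
# Names missing from the lookup hash count as 0.
sub score_line {
    my ($line, $lookup) = @_;
    my ($num, @names) = split /\|/, $line;
    my @values = map { exists $lookup->{$_} ? $lookup->{$_} : 0 } @names;
    # The leading 0 keeps sum() defined even when a line has no names.
    return join '|', $num, @values, sum(0, @values);
}

# Lookup data exactly as posted, including the singular 'Portrait'.
my %lookup = (Germany => 100, History => 200, Hospitals => 50,
              Poland  => 50,  Physiology => 10, Portrait => 10);

print score_line($_, \%lookup), "\n"
    for '270|Germany|Hospitals|Poland',
        '272|Germany|History',
        '273|Physiology|Poland|Portraits';
```

With the data exactly as posted, the last line comes out as 273|10|50|0|60, because 'Portraits' is not a key in the lookup hash. Changing the key to 'Portraits' gives the expected 273|10|50|10|70.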
| |
| John W. Krahn 2006-01-31, 7:55 am |
| Hello,
It looks like you could do something like this (UNTESTED):
my %scores;
while ( <FILE_A> ) {
    chomp;
    my ( $name, $score ) = split /\|/;
    $scores{ $name } = $score;
}

my ( $ID, %ids );
while ( <FILE_B> ) {
    if ( /^ID\s*-\s*(.+)/ ) {
        $ID = $1;
    }
    elsif ( /^NA\s*-\s*(.+)/ ) {    # Perl spells it "elsif", not "elseif"
        $ids{ $ID } += $scores{ $1 };
    }
}

for my $id ( keys %ids ) {
    print "$id | $ids{$id}\n";
}
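For trying this out without creating files, the same idea can be run against in-memory filehandles. This is only a sketch: the sample strings come from the question, the variable names are made up, a name without a score is assumed to contribute 0, and the IDs are sorted so the output order is stable:

```perl
use strict;
use warnings;

# Sample data from the question, held in strings instead of real files.
my $file_a = "name1|345\nname2|201\n";
my $file_b = "ID - 001\nNA - name1\nNA - name2\n\n"
           . "ID - 002\nNA - name2\nNA - name4\n";

# Opening a reference to a scalar gives an in-memory filehandle.
open my $fh_a, '<', \$file_a or die "Cannot open in-memory file A: $!";
open my $fh_b, '<', \$file_b or die "Cannot open in-memory file B: $!";

my %scores;
while ( my $line = <$fh_a> ) {
    chomp $line;
    my ( $name, $score ) = split /\|/, $line;
    $scores{$name} = $score;
}

my ( $id, %ids );
while ( my $line = <$fh_b> ) {
    if ( $line =~ /^ID\s*-\s*(\S+)/ ) {
        $id = $1;
    }
    elsif ( defined $id and $line =~ /^NA\s*-\s*(\S+)/ ) {
        # Assumption: a name without a score contributes 0.
        $ids{$id} += exists $scores{$1} ? $scores{$1} : 0;
    }
}

# Build the report lines in sorted ID order, then print them.
my @report = map { "$_ | $ids{$_}" } sort keys %ids;
print "$_\n" for @report;
```

With this sample data the report is "001 | 546" followed by "002 | 201".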
John
--
use Perl;
program
fulfillment
| |
| Andrej Kastrin 2006-01-31, 7:55 am |
| John & John, thanks for your help.
Now I understand (I hope).
Best, Andrej