PERL Beginners > January 2006 > Hash problem










Author Hash problem
Andrej Kastrin

2006-01-30, 3:55 am

Dear all,

I have a bar-separated file:
name1|345
name2|201
....

I store it in a hash:
while (<FILE_A> ) {
chomp;
($name,$score) = split (/\|/,$_);
$hash{$name} = $score;
}

Then I have a second file:
ID - 001
NA - name1
NA - name2

ID - 002
NA - name2
NA - name4
....

I match all IDs and NAs:

while (<FILE_B> ) {
chomp;
if (/^ID/) {
$ID = substr($_,5);
}
elsif (/^NA/) {
$NA = substr($_,5);
}
}

Now I have to do something like:
001 | 345+201
....

So, I want to read the ID and NA fields from the second file and then, for each
ID, print the sum of the scores from the first file.

Any suggestions? Which structure should I use to do that? Thanks in
advance.

Cheers, Andrej
John Doe

2006-01-30, 7:55 am

Andrej Kastrin wrote on Monday, 30 January 2006, 10:14:
> Dear all,
>
> I have a bar-separated file:
> name1|345
> name2|201
> ...
>
> I store it in a hash:
> while (<FILE_A> ) {
> chomp;
> ($name,$score) = split (/\|/,$_);
> $hash{$name} = $score;
> }


Let's assume the resulting hash is %scores.

> Then I have a second file:
> ID - 001
> NA - name1
> NA - name2
>
> ID - 002
> NA - name2
> NA - name4
> ...
>
> I match all IDs and NAs:
>
> while (<FILE_B> ) {
> chomp;
> if (/^ID/) {
> $ID = substr($_,5);
> }
> elsif (/^NA/) {
> $NA = substr($_,5);
> }
> }

> Now I have to do something like:
> 001 | 345+201
> So, I want to read the ID and NA fields from the second file and then, for each
> ID, print the sum of the scores from the first file.
> Any suggestions? Which structure should I use to do that? Thanks in
> advance.


Now you could parse FILE_B and use another, two-dimensional hash to accumulate
the scores by ID for each name. The loop could look like this (untested):

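my %sums;
my $id;
while (<FILE_B> ) {
chomp;
# An ID line starts a new record; remember its number.
if (/^ID - (\d+)/) { $id = $1; next; }
# Skip anything that is not an NA line.
next unless my ($na) = /^NA - (\w+)/;
# Names without a score count as 0.
$sums{$id}->{$na} += $scores{$na} || 0;
}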

foreach my $id (sort keys %sums) {
print "ID $id\n";
foreach my $name (sort keys %{$sums{$id}}) {
print "name: $name - scores: ",
$sums{$id}->{$name}, "\n";
}
}

(All handling of possible errors is missing here)
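
To make the two-dimensional hash idea concrete, here is a minimal self-contained sketch. It assumes the sample data from the question is held in memory rather than read from FILE_A and FILE_B, and the helper name accumulate_by_id is hypothetical, chosen only for this illustration. It also adds a per-ID total, which the loop above does not compute.

```perl
use strict;
use warnings;

# accumulate_by_id is a hypothetical helper name, not from the thread.
# It builds { id => { name => score } } from "ID - ..." / "NA - ..." lines.
sub accumulate_by_id {
    my ( $scores, @lines ) = @_;
    my ( %sums, $id );
    for my $line (@lines) {
        if ( $line =~ /^ID - (\d+)/ ) {
            $id = $1;    # start of a new record
        }
        elsif ( defined $id and $line =~ /^NA - (\w+)/ ) {
            # Names without a score (name4 below) count as 0.
            $sums{$id}{$1} += $scores->{$1} || 0;
        }
    }
    return \%sums;
}

# Assumed sample data, copied from the example lines in the question.
my %scores = ( name1 => 345, name2 => 201 );
my @lines  = (
    'ID - 001', 'NA - name1', 'NA - name2',
    '',
    'ID - 002', 'NA - name2', 'NA - name4',
);

my $sums = accumulate_by_id( \%scores, @lines );
for my $rec ( sort keys %$sums ) {
    # Scores in name order, then their total.
    my @parts = map { $sums->{$rec}{$_} } sort keys %{ $sums->{$rec} };
    my $total = 0;
    $total += $_ for @parts;
    print "$rec | ", join( '+', @parts ), " = $total\n";
}
# prints:
# 001 | 345+201 = 546
# 002 | 201+0 = 201
```

The inner hash keeps the individual scores per name, so both the "345+201" form and the total can be printed from the same structure.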


hth,
joe

Andrej Kastrin

2006-01-30, 6:56 pm


I'm totally lost now and I have no more ideas... Thanks for your
reply Joe, but I didn't manage. Here is a more realistic example:

First file: (I modified it)
270|Germany|Hospitals|Poland
272|Germany|History
273|Physiology|Poland|Portraits

Second file:
Germany|100
History|200
Hospitals|50
Poland|50
Physiology|10
Portrait|10

Output file:
270|100|50|50|200 # 270 is the key in table 1; 100, 50, 50 are the values for
the nouns from the second file; 200 is their sum

272|100|200|300
273|10|50|10|70

I have been studying this problem for 5 hours now, but I'm afraid I can't do it myself.

Cheers, Andrej
John Doe

2006-01-30, 6:56 pm

Andrej Kastrin wrote on Monday, 30 January 2006, 16:50:
> I'm totally lost now and I have no more ideas... Thanks for your
> reply Joe, but I didn't manage. Here is a more realistic example:


My answer above obviously does not fit the following, quite different
example...

[1]
> First file: (I modified it)
> 270|Germany|Hospitals|Poland
> 272|Germany|History
> 273|Physiology|Poland|Portraits


Quite different from:

name1|345

[2]
> Second file:
> Germany|100
> History|200
> Hospitals|50
> Poland|50
> Physiology|10
> Portrait|10

(I assume you mean 'Portraits' here, as in the first file)

Quite different from:

ID - 001
NA - name1
NA - name2

ID - 002

[3]
> Output file:
> 270|100|50|50|200 #270 is the key in table 1;
> 100, 50, 50 are values for
> nouns from second file, 200 is the sum of them
> 272|100|200|300
> 273|10|50|10|70


OK, you want [3] from [1] and [2]?
If so, here is a script. I tested it, and it produces the output you want:

#!/usr/bin/perl

use strict;
use warnings;

# Make a lookup hash of file 2.
# Hash keys are the names, hash values are the scores from the file.
# It is assumed that a name only occurs once in the file.
#
my %lookup;
open (my $fh2, '<', 'file2') or die "Cannot open file2: $!";
while (<$fh2> ) {
chomp;
my ($name, $score)=split /\|/;
$lookup{$name}=$score;
}
close $fh2 or die;


# Now in every line of file1 all names are replaced
# by their score (?) value found in the lookup hash
# and a sum is added at the end. The line is
# directly printed out.
#
open (my $fh1, '<', 'file1') or die "Cannot open file1: $!";
open (my $fh3, '>', 'file3') or die "Cannot open file3: $!";
while (<$fh1> ) {
chomp;
my ($num, @entries)=split /\|/;

# replace names by scores. If no score found,
# take score=0
#
@entries=map {$lookup{$_}||0} @entries;

my $sum=0;
$sum+=$_ for @entries;

print $fh3 join '|', $num, @entries, $sum;
print $fh3 "\n";
}
close $fh1 or die;
close $fh3 or die;
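
For trying the per-line logic without creating file1, file2 and file3, a minimal sketch along the same lines follows. It assumes the lookup data is given inline (with 'Portraits' spelled as in the first file), and score_line is a hypothetical helper name used only here. Unlike the script above, it warns when a name has no score, which would have pointed out the Portrait/Portraits mismatch.

```perl
use strict;
use warnings;

# score_line is a hypothetical helper name, not from the thread.
# It turns "272|Germany|History" into "272|100|200|300".
sub score_line {
    my ( $lookup, $line ) = @_;
    my ( $num, @names ) = split /\|/, $line;

    my @scores;
    for my $name (@names) {
        if ( exists $lookup->{$name} ) {
            push @scores, $lookup->{$name};
        }
        else {
            # Unknown names are reported and counted as 0.
            warn "no score for '$name', using 0\n";
            push @scores, 0;
        }
    }

    my $sum = 0;
    $sum += $_ for @scores;
    return join '|', $num, @scores, $sum;
}

# Assumed sample data, copied from the second file in the post.
my %lookup = (
    Germany    => 100,
    History    => 200,
    Hospitals  => 50,
    Poland     => 50,
    Physiology => 10,
    Portraits  => 10,
);

for my $line (
    '270|Germany|Hospitals|Poland',
    '272|Germany|History',
    '273|Physiology|Poland|Portraits',
    )
{
    my $out = score_line( \%lookup, $line );
    print "$out\n";
}
# prints:
# 270|100|50|50|200
# 272|100|200|300
# 273|10|50|10|70
```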



John W. Krahn

2006-01-31, 7:55 am

Andrej Kastrin wrote:
> Dear all,


Hello,



It looks like you could do something like this (UNTESTED):

my %scores;
while ( <FILE_A> ) {
chomp;
my ( $name, $score ) = split /\|/;
$scores{ $name } = $score;
}

my ( $ID, %ids );
while ( <FILE_B> ) {
if ( /^ID\s*-\s*(.+)/ ) {
$ID = $1;
}
elsif ( /^NA\s*-\s*(.+)/ ) {
$ids{ $ID } += $scores{ $1 };
}
}

for my $id ( keys %ids ) {
print "$id | $ids{$id}\n";
}
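
A minimal way to try this approach without real files is sketched below. It assumes Perl 5.8 or later, where open can read from a scalar reference, and it uses the sample lines from the question as in-memory "files". The helper name total_by_id is hypothetical. Two small additions that are not in the code above: names without a score count as 0, and the IDs are printed in sorted order.

```perl
use strict;
use warnings;

# total_by_id is a hypothetical helper name, not from the thread.
# Both arguments are the text contents of the two files.
sub total_by_id {
    my ( $text_a, $text_b ) = @_;

    # Read "name|score" lines from an in-memory filehandle.
    open my $fh_a, '<', \$text_a or die "Cannot read scores: $!";
    my %scores;
    while ( my $line = <$fh_a> ) {
        chomp $line;
        my ( $name, $score ) = split /\|/, $line;
        $scores{$name} = $score;
    }
    close $fh_a;

    # Sum the scores of all NA lines under the most recent ID line.
    open my $fh_b, '<', \$text_b or die "Cannot read records: $!";
    my ( $id, %ids );
    while ( my $line = <$fh_b> ) {
        if ( $line =~ /^ID\s*-\s*(\S+)/ ) {
            $id = $1;
            $ids{$id} = 0;
        }
        elsif ( defined $id and $line =~ /^NA\s*-\s*(\S+)/ ) {
            $ids{$id} += $scores{$1} || 0;    # unknown names count as 0
        }
    }
    close $fh_b;

    return \%ids;
}

# Assumed sample contents, copied from the question.
my $file_a = "name1|345\nname2|201\n";
my $file_b = "ID - 001\nNA - name1\nNA - name2\n\n"
           . "ID - 002\nNA - name2\nNA - name4\n";

my $totals = total_by_id( $file_a, $file_b );
print "$_ | $totals->{$_}\n" for sort keys %$totals;
# prints:
# 001 | 546
# 002 | 201
```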



John
--
use Perl;
program
fulfillment
Andrej Kastrin

2006-01-31, 7:55 am


John & John, thanks for the help.
Now I understand (I hope so).

Best, Andrej