PERL Beginners, December 2007: Calculating Avg.
| Mike Tran 2007-12-07, 7:01 pm |
| Hi All,
I'm not very familiar with Perl yet, so could someone help me with this
please? I have a file (Rating.txt) with two columns, one called Basenumber
and the other Rating. How do I loop through it and get an average of the
Rating column for each distinct Basenumber? I want to write the result
out to a new file called AvgRatings.txt. Thanks in advance for any help
on this.
Rating.txt:
Basenumber|Rating
10000|5
10000|4
10000|5
10007|4
10007|5
10007|4
10007|5
10008|4
10008|3
10008|2
10008|1
The new file (AvgRatings.txt) should look like this:
Basenumber|AvgRating
10000|4.67
10007|4.5
10008|2.5
Best regards,
Mike Tran
e: mtran@sierratradingpost.com
t: (307) 772-8956
| |
| Steven M. O'Neill 2007-12-07, 7:01 pm |
| Mike Tran <mtran@sierratradingpost.com> wrote:
>I'm not very familiar with Perl yet, so could someone help me with this
>please? I have a file which has two columns one called Basenumber the
>other Rating (Rating.txt), how do I loop through and get an average for
>the Rating column for each distinct baseNumber? I want to write the new
>result out into a new file called AvgRatings.txt. Thanks in advance for
>any help on this.
A good thing to know about is the "split" function. It will
separate a string into a list of strings, based on whatever
pattern you tell it to "split" on (e.g. /\|/; the pipe must be escaped,
because a bare | in a pattern means alternation). See "perldoc -f split".
Maybe this search will provide you with a good example for what
you're trying to do:
http://www.google.com/search?q=perl...arch&btnI=Lucky
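As a rough illustration of the split advice above, here is a minimal sketch. It assumes one "Basenumber|Rating" record per line, and parse_record is a made-up helper name, not something from the thread. It also shows what happens if the pipe is left unescaped.

```perl
use strict;
use warnings;

# Split one pipe-delimited record into its two fields.
# parse_record is a hypothetical helper name, not part of the thread.
sub parse_record {
    my ($line) = @_;
    chomp $line;
    # The pipe is escaped: a bare | inside a regex means alternation.
    my ($num, $rate) = split /\|/, $line;
    return ($num, $rate);
}

my ($num, $rate) = parse_record("10000|5\n");
print "base=$num rating=$rate\n";

# For comparison, an unescaped pipe matches the empty string,
# so the line is broken into single characters.
my @chars = split /|/, "10|5";
print scalar(@chars), " pieces\n";
```

The second split is there only to show the pitfall; for this data the escaped form is the one you want.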
--
Steven O'Neill steveo@panix.com
Brooklyn, NY http://www.panix.com/~steveo
| |
| Michael Wang 2007-12-07, 7:01 pm |
| Hi
On 12/7/07, Mike Tran <mtran@sierratradingpost.com> wrote:
> I'm not very familiar with Perl yet, so could someone help me with this
> please? I have a file which has two columns one called Basenumber the
> other Rating (Rating.txt), how do I loop through and get an average for
> the Rating column for each distinct baseNumber? I want to write the new
> result out into a new file called AvgRatings.txt. Thanks in advance for
> any help on this.
Learn about Perl hashes. Something like:

    use strict;
    use warnings;

    my %numsum;    # running total of ratings per base number
    my %numcnt;    # number of ratings seen per base number

    while (<DATA>) {
        chomp;
        my ($num, $rate) = split /\|/;
        $numsum{$num} += $rate;
        ++$numcnt{$num};
    }

    print "Basenumber|AvgRating\n";
    for (sort keys %numsum) {
        my $numavg = $numsum{$_} / $numcnt{$_};
        print "$_|$numavg\n";
    }

    __DATA__

put your data here
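The loop above prints the unrounded quotient, so 14/3 comes out with many decimal places. One possible way to get the figures shown in the question (4.67, 4.5, 2.5) is to round with sprintf and then add 0 to drop trailing zeros. This is only a sketch; format_avg is a made-up name, and the sums and counts are worked out by hand from the sample data.

```perl
use strict;
use warnings;

# Round to two decimals, then drop trailing zeros by numifying the string.
# format_avg is a hypothetical helper name used only for this sketch.
sub format_avg {
    my ($sum, $count) = @_;
    return 0 + sprintf("%.2f", $sum / $count);
}

# Sums and counts taken from the sample Rating.txt in the question.
print "10000|", format_avg(14, 3), "\n";   # 5 + 4 + 5 over 3 ratings
print "10007|", format_avg(18, 4), "\n";   # 4 + 5 + 4 + 5 over 4 ratings
print "10008|", format_avg(10, 4), "\n";   # 4 + 3 + 2 + 1 over 4 ratings
```

If a fixed number of decimals is acceptable, plain sprintf "%.2f" without the added 0 is simpler.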
| |
| Gunnar Hjalmarsson 2007-12-07, 7:01 pm |
| Mike Tran wrote:
> I have a file which has two columns one called Basenumber the
> other Rating (Rating.txt), how do I loop through and get an average for
> the Rating column for each distinct baseNumber?
Use a hash.
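One way to read that advice, as a sketch only: keep a sum and a count per base number in a single hash. The name average_by_key and the in-memory record list are assumptions made for illustration; a real script would read the lines from Rating.txt.

```perl
use strict;
use warnings;

# Accumulate a running total and a count per base number in one hash.
# The record list stands in for lines read from Rating.txt.
sub average_by_key {
    my (@records) = @_;
    my %stats;    # base number => { sum => ..., count => ... }
    for my $record (@records) {
        my ($num, $rate) = split /\|/, $record;
        $stats{$num}{sum}   += $rate;
        $stats{$num}{count} += 1;
    }
    # Map each base number to its mean rating.
    return map { ($_ => $stats{$_}{sum} / $stats{$_}{count}) } keys %stats;
}

my %avg = average_by_key('10007|4', '10007|5', '10008|2', '10008|1');
print "$_|$avg{$_}\n" for sort keys %avg;
```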
> I want to write the new
> result out into a new file called AvgRatings.txt.
What have you tried? Show us your best attempt, and somebody may be
willing to help you get it right.
--
Gunnar Hjalmarsson
Email: http://www.gunnar.cc/cgi-bin/contact.pl
| |
| Mike Tran 2007-12-08, 4:01 am |
| The code below is what I have so far, but I'm still getting a "|0.0"
line as the first row of output. I don't know which part of the code is
generating it; could someone explain this to me? Thanks for all
your help.
new_ratings.txt:
base_no|AvgRating
|0.0
10000|4.7
10007|4.5
10008|2.5
#!/usr/bin/perl
#######################################################################
# This script reads through reviewsRating.txt and calculates an average
# rating for each base_no
#######################################################################
use strict;
use warnings;
my %numsum;
my %numcnt;
my $output= 'new_ratings.txt';
my $file ='reviewsRating.txt';
#Open output file to print header row
open(OUT,">$output") or die "Could not open $output: $!";
print OUT "base_no|AvgRating\n";
close(OUT);
open my $fh, '<', $file or die $!;
open OUT, '>>', $output or die "Could not open '$output' $!";
while(<$fh> ){
next if $. == 1; # exclude header
chomp;
my ($num, $rate) = split/\|/;
$numsum{$num} += $rate;
++$numcnt{$num};
}
for (sort keys %numsum){
my $numavg = sprintf "%.1f",$numsum{$_} / $numcnt{$_};
print OUT "$_\|$numavg\n";
}
close($fh);
close(OUT);
| |
| Rob Dixon 2007-12-08, 4:01 am |
| Mike Tran wrote:
> The code below is what I got so far, but I'm still getting the "|0.0"
> for the first row returned. I don't know which part of the code is
> generating this, could someone help explain this to me? Thanks for all
> your help.
Hi Mike
Your code looks like it should work. Your extra output line is probably
because there is a blank line in the input file. I suggest you check
that the base number in each line is valid before you accumulate it,
like this:
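    while (<$fh>) {
        chomp;
        my ($num, $rate) = split /\|/;
        # Skips the header and blank lines; the defined check avoids an
        # "uninitialized value" warning when split returns nothing.
        next unless defined $num && $num =~ /^\d+$/;
        $numsum{$num} += $rate;
        ++$numcnt{$num};
    }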
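For anyone wondering where the "|0.0" row comes from, here is a small sketch, assuming the input really does contain an empty line. It runs the same accumulation over three in-memory lines; the averages helper is a made-up name, and the warnings are silenced only so the effect on the hash is easy to see.

```perl
use strict;
use warnings;

# Run the thread's accumulation loop over a list of lines and
# return the resulting averages keyed by base number.
sub averages {
    my (@lines) = @_;
    my (%numsum, %numcnt);
    for my $line (@lines) {
        chomp $line;
        my ($num, $rate) = split /\|/, $line;
        # A blank line leaves $num and $rate undefined. Silence the
        # resulting warnings here so the effect on the hash is visible.
        no warnings 'uninitialized';
        $numsum{$num} += $rate;    # undef key is stored as ''
        ++$numcnt{$num};
    }
    return map { ($_ => sprintf("%.1f", $numsum{$_} / $numcnt{$_})) } keys %numsum;
}

my %avg = averages("10000|5\n", "\n", "10000|4\n");
printf "key '%s' => %s\n", $_, $avg{$_} for sort keys %avg;
```

The blank line ends up under the empty-string key with a sum of 0 and a count of 1, which prints as "|0.0" in the original script.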
Also, I assume you realise there's no need to close and reopen the
output file? And your indenting could use some work.
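Putting those points together, a possible version that opens the output only once might look like the sketch below. The sub name write_averages is invented for illustration, the file names are the ones from the original script, and the extra check on the rating field is an added assumption rather than something the original code did.

```perl
use strict;
use warnings;

# Read pipe-delimited "base_no|rating" lines from $in and write one
# "base_no|average" line per base number to $out.
# write_averages is a hypothetical name; the file names come from the thread.
sub write_averages {
    my ($in, $out) = @_;
    my (%numsum, %numcnt);

    open my $in_fh, '<', $in or die "Could not open '$in': $!";
    while (my $line = <$in_fh>) {
        chomp $line;
        my ($num, $rate) = split /\|/, $line;
        # Skip the header, blank lines, and anything else malformed.
        next unless defined $num  && $num  =~ /^\d+$/;
        next unless defined $rate && $rate =~ /^\d+(?:\.\d+)?$/;
        $numsum{$num} += $rate;
        ++$numcnt{$num};
    }
    close $in_fh;

    # Open the output once: header and data go through the same handle.
    open my $out_fh, '>', $out or die "Could not open '$out': $!";
    print {$out_fh} "base_no|AvgRating\n";
    for my $num (sort { $a <=> $b } keys %numsum) {
        printf {$out_fh} "%s|%.1f\n", $num, $numsum{$num} / $numcnt{$num};
    }
    close $out_fh or die "Could not close '$out': $!";
    return;
}

my ($file, $output) = ('reviewsRating.txt', 'new_ratings.txt');
# Only run when the input is present, so the sketch is harmless elsewhere.
write_averages($file, $output) if -e $file;
```

Lexical filehandles and three-argument open are used for both files, so there is no bareword OUT handle to close and reopen.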
HTH,
Rob