For Programmers: Free Programming Magazines  


Home > Archive > PERL Beginners > August 2005 > A better way.









You are viewing an archived Text-only version of the thread. To view this thread in it's original format and/or if you want to reply to this thread please [click here]

 

Author A better way.
Dan Klose

2005-08-02, 5:00 pm

Hello Everyone.

I have the following code:


### CODE BLOCK ###

my %hash_of_arrays2;

for (keys %hash_of_arrays) {
my @array_of_data = exists($hash_of_arrays{$_})
?@{$hash_of_arrays{$_}}
:();

my $mean = mean(\@array_of_data); #GET MEAN
my $std_dev = dev(\@array_of_data); #GET STANDARD DEV

push @{$hash_of_arrays2{$_}}, $mean;
push @{$hash_of_arrays2{$_}}, $std_dev;
}

for (sort {$hash_of_arrays2{$a}[0] <=> $hash_of_arrays2{$b}[0]}
keys %hash_of_arrays2) {

print "$_, $hash_of_arrays2{$_}[0], $hash_of_arrays2{$_}[1]\n";
}

### END OF CODE ###

If anyone has the time could you please show me ways of shortening this
code e.g. eliminating the second for block (integration into the
previous block) and any other tips / tricks.

There are no other people here that code perl and so I get no
advice/feedback on how to improve my scripts.

Thanks in advance.

Dan.

Wiggins d'Anconia

2005-08-02, 5:00 pm

Dan Klose wrote:
> Hello Everyone.
>
> I have the following code:
>
>
> ### CODE BLOCK ###
>
> my %hash_of_arrays2;
>


Have you considered a hash of hashes? For me, given the sample below, I
would prefer it, but obviously I haven't seen your whole script.

> for (keys %hash_of_arrays) {
> my @array_of_data = exists($hash_of_arrays{$_})
> ?@{$hash_of_arrays{$_}}
> :();


I don't think the above is doing what you anticipate. 'exists' just
tests whether there is a key/value pair given the key. But since you are
iterating over the keys of the hash the test must always be true. If
there is no value, then your initialization will be the same as ()
anyways. So basically you can drop the above completely, or a better
check would be to test the number of elements (make sure you have an
array ref first if you are worried about it).

perldoc -f exists

>
> my $mean = mean(\@array_of_data); #GET MEAN
> my $std_dev = dev(\@array_of_data); #GET STANDARD DEV
>


Given that the above two functions take an array reference there is no
reason to dereference the array reference first, just to pass a
reference. You are using twice the memory to do so when the references
should allow you to save on memory that Perl would automatically duplicate.

> push @{$hash_of_arrays2{$_}}, $mean;
> push @{$hash_of_arrays2{$_}}, $std_dev;


This gets back to my question of the hash of hashes. Rather than
stuffing the mean and std_dev into indexes 0 and 1, why not just put
them in named indexes.

$hash_of_hashes->{$_}->{'mean'} = $mean;
$hash_of_hashes->{$_}->{'std_dev'} = $std_dev;

> }
>
> for (sort {$hash_of_arrays2{$a}[0] <=> $hash_of_arrays2{$b}[0]}
> keys %hash_of_arrays2) {
>
> print "$_, $hash_of_arrays2{$_}[0], $hash_of_arrays2{$_}[1]\n";
> }
>
> ### END OF CODE ###
>
> If anyone has the time could you please show me ways of shortening this
> code e.g. eliminating the second for block (integration into the
> previous block) and any other tips / tricks.
>


The problem is that to do a sort you have to know all of the values that
you are sorting on before you can start, which means you can't print the
output in a sorted manner in the first loop, so there really is no way
to combine the two. The best you could hope for would be to populate an
array in sorted order in the first loop and then just print the result
in the second, but your sorting routine would very likely be less
efficient than Perl's built-in one.

> There are no other people here that code perl and so I get no
> advice/feedback on how to improve my scripts.
>


You have come to the right place then. I know that feeling, it stinks.

> Thanks in advance.
>
> Dan.


http://danconia.org

-- UNTESTED --
my %hash_of_calculations;
for my $key (keys %hash_of_arrays) {
if (ref $hash_of_arrays->{$key} eq 'ARRAY' and
@{$hash_of_arrays->{$key}}) {
# data to process
@{$hash_of_calculations->{$key}}{'mean','std_dev'} =
(mean($hash_of_arrays->{$key}), dev($hash_of_arrays->{$key}));
}
else {
# no data to process
@{$hash_of_calculations->{$key}}{'mean','std_dev'} = ();
}
}
for my $key (sort { $hash_of_calculations->{$a}->{'mean'} <=>
$hash_of_calculations->{$b}->{'mean'} } keys %hash_of_calculations) {
print "Key: $key\n";
print "Mean: $hash_of_calculations->{$key}->{'mean'}\n";
print "Std Dev: $hash_of_calculations->{$key}->{'std_dev'}\n";
}
Jeff 'japhy' Pinyan

2005-08-02, 5:00 pm

On Aug 2, Dan Klose said:

> my %hash_of_arrays2;
>
> for (keys %hash_of_arrays) {
> my @array_of_data = exists($hash_of_arrays{$_})
> ?@{$hash_of_arrays{$_}}
> :();


Given that you're iterating over the list of keys in %hash_of_arrays,
there's NO reason to be using exists() here.

my @array_of_data = @{ $hash_of_arrays{$_} };

> my $mean = mean(\@array_of_data); #GET MEAN
> my $std_dev = dev(\@array_of_data); #GET STANDARD DEV


There's no reason to make a COPY of the array and then dereference. I'd
suggest doing:

my $data = $hash_of_arrays{$_};
my $mean = mean($data);
my $std_dev = dev($data);

> push @{$hash_of_arrays2{$_}}, $mean;
> push @{$hash_of_arrays2{$_}}, $std_dev;


You can do both at once:

$hash_of_arrays2{$_} = [ $mean, $std_dev ];

> }


You can make that for loop into one statement:

my %hash_of_arrays2 = map {
$_ => [ mean($hash_of_arrays{$_}), dev($hash_of_arrays{$_}) ]
} keys %hash_of_arrays;

> for (sort {$hash_of_arrays2{$a}[0] <=> $hash_of_arrays2{$b}[0]}
> keys %hash_of_arrays2) {
>
> print "$_, $hash_of_arrays2{$_}[0], $hash_of_arrays2{$_}[1]\n";
> }


There's no way to eliminate the second loop, because you need to have all
the data calculated before you can sort them.

--
Jeff "japhy" Pinyan % How can we ever be the sold short or
RPI Acacia Brother #734 % the cheated, we who for every service
http://japhy.perlmonk.org/ % have long ago been overpaid?
http://www.perlmonks.org/ % -- Meister Eckhart
Dan Klose

2005-08-02, 5:00 pm

Hi,

I have gone for the following code:

my %avs_dev = map{
$_ => [mean($hoa{$_}), dev($hoa{$_})]
} keys %hoa;

for (sort {$avs_dev{$a}[0] <=> $avs_dev{$b}[0]} keys %avs_dev) {
print "$_, $avs_dev{$_}[0], $avs_dev{$_}[1]\n";
}

I have never used MAP before... it looks handy!
Works a treat.

Thanks to Jeff 'japhy' Pinyan & Wiggins d'Anconia

Sponsored Links







Also available: Server administration forum archive | Web Design forum archive | Software forum archive | Hardware reviews archive

Copyright 2008 codecomments.com