Home > Archive > PERL Beginners > September 2007 > bootstrapping in Perl
bootstrapping in Perl
| Pedro Soto 2007-09-25, 4:00 am |
| Dear all,
I need to do some bootstrap analysis and found a module on CPAN
called Math::Random::OO::Bootstrap. Does anybody have experience with it?
I installed the modules via perl -MCPAN -e shell, and it did not
complain. But I cannot run the script provided in the POD documentation for
the module Math::Random::OO. First I got an error saying that
Params::Validate was not there, so I installed it too. But now I get the
following error: 'Base class package "Class::Accessor::Fast" is empty',
which I do not understand.
Does anybody have any idea about it, or know another way to do bootstrapping
within Perl?
Thanks
P Soto
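The "Base class package ... is empty" message comes from the base pragma and usually means the named module could not be loaded, most often because it is not installed. A minimal sketch for checking which of the modules mentioned above can be loaded from @INC follows; the helper name module_available is made up for this illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: true if the named module can be loaded from @INC.
sub module_available {
    my ($module) = @_;
    (my $file = "$module.pm") =~ s{::}{/}g;    # Foo::Bar -> Foo/Bar.pm
    return eval { require $file; 1 } ? 1 : 0;
}

# Modules named in the post above.
for my $module (qw(Params::Validate Class::Accessor::Fast Math::Random::OO::Bootstrap)) {
    printf "%-30s %s\n", $module, module_available($module) ? "ok" : "MISSING";
}
```

Any module reported as MISSING would need to be installed (for example from the CPAN shell) before the script can run.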
| |
| Tom Phoenix 2007-09-25, 10:00 pm |
| On 9/24/07, Pedro Soto <pedrosoto2007@gmail.com> wrote:
> I need to do some bootstrap analysis and found a module in CPAN
> called Math::Random::OO::Bootstrap. Does anybody have
> experience with it?
Did you choose that module because its documentation led you to
believe that it would be useful to you? It sounds as if you chose it
just because it had "Bootstrap" in the name.
> Does anybody have any idea about it or know another way to
> do bootstrapping within Perl?
If you can explain what "bootstrapping" means to you, I'm sure
somebody here can help you. What is it?
Cheers!
--Tom Phoenix
Stonehenge Perl Training
| |
| Tom Phoenix 2007-09-26, 10:00 pm |
| On 9/26/07, Pedro Soto <pedrosoto2007@gmail.com> wrote:
> I need to draw a subsample with replacement from a large set of
> data. Say my large sample is 10000: I need to draw 100 values out of the
> 10000 and repeat the procedure n times (that's what I called
> bootstrapping).
Perl can easily select 100 items at random from 10000, as many times
as you need.
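As a rough sketch of the resampling described in the quoted question (with replacement, repeated n times): each draw picks an independent random index, so the same item can turn up more than once. The sub name resample, the stand-in data, and the replicate count are all assumptions made for illustration.

```perl
use strict;
use warnings;

# Draw $k items from @$data WITH replacement: every draw picks an
# independent random index, so an item may appear more than once.
sub resample {
    my ($data, $k) = @_;
    return map { $data->[ int rand @$data ] } 1 .. $k;
}

# Hypothetical stand-in for the real data: 10000 skewed values.
my @source = map { $_ ** 2 } 1 .. 10_000;

my $replicates = 5;    # the "n times" from the question; value is arbitrary
for my $i (1 .. $replicates) {
    my @sample = resample(\@source, 100);
    my $sum = 0;
    $sum += $_ for @sample;
    printf "replicate %d: mean = %.1f\n", $i, $sum / @sample;
}
```

The mean is only a placeholder statistic; any quantity computed per replicate could go in its place.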
> I am using Perl's srand function to generate random
> numbers in order to do the resampling at 'random'.
It's rare to need to call srand() yourself: rand() seeds the generator
automatically the first time it is used. You probably want just plain rand().
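A small sketch of the difference, assuming a made-up @source: rand() alone is enough to pick a random element, and srand() is only useful when a run has to be repeatable. The seed value 42 is arbitrary.

```perl
use strict;
use warnings;

# rand() seeds itself on first use, so no srand() call is needed here.
my @source = (1 .. 10_000);           # made-up data for the example
my $pick = $source[ rand @source ];   # the index is truncated to an integer
print "random pick: $pick\n";

# Optional: fix the seed so that a run can be reproduced exactly.
srand(42);
my @first  = map { int rand 100 } 1 .. 5;
srand(42);
my @second = map { int rand 100 } 1 .. 5;
print "same sequence: ", ("@first" eq "@second" ? "yes" : "no"), "\n";
```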
> The problem is that the
> distribution of the original data (10000) does not follow a Gaussian
> distribution, and therefore I am not sure if using only this function
> (srand) in Perl would be enough, because the numbers of the large
> distribution won't have the same probability of being selected.
The probability of an item being selected by rand() shouldn't normally
depend upon the item itself. This code pulls 100 samples at random,
without replacement, from a list (@source) of at least that many items,
and the items themselves don't have any influence on the selection.
    # @source is assumed to already hold the data
    my $samples_needed = 100;
    die "Not enough data" if @source < $samples_needed;
    my $count = 0;
    my @samples; # starts off empty
    foreach my $item (@source) {
        # keep this item with probability $samples_needed / $count
        next if $samples_needed / (++$count) <= rand;
        if (@samples < $samples_needed) {
            push @samples, $item;
        } else {
            $samples[rand @samples] = $item; # random index
        }
    }
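One way to see that the values play no part in the selection is to run the loop above many times over deliberately skewed data and count how often each position is chosen. This is only a sketch: the sub name, the cubed values, and the trial count are made up for the check. Note that this loop picks distinct items, so it samples without replacement.

```perl
use strict;
use warnings;

# The loop from the post, wrapped in a sub (the name is hypothetical).
# It selects $k distinct items, i.e. sampling WITHOUT replacement.
sub sample_without_replacement {
    my ($source, $k) = @_;
    die "Not enough data" if @$source < $k;
    my $count = 0;
    my @samples;
    foreach my $item (@$source) {
        next if $k / (++$count) <= rand;
        if (@samples < $k) {
            push @samples, $item;
        } else {
            $samples[rand @samples] = $item;
        }
    }
    return @samples;
}

# Heavily skewed made-up data: position i holds the value (i+1)**3.
my @values    = map { $_ ** 3 } 1 .. 20;
my @positions = (0 .. $#values);
my %hits;
my $trials = 20_000;
for (1 .. $trials) {
    $hits{$_}++ for sample_without_replacement(\@positions, 5);
}

# Each position should be chosen in roughly 5/20 = 25% of the trials,
# whatever value is stored there.
printf "value %5d: %.3f\n", $values[$_], ($hits{$_} // 0) / $trials
    for @positions;
```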
Does this get you any closer to a solution? Good luck with it!
--Tom Phoenix
Stonehenge Perl Training