Home > Archive > APL > February 2008 > picking a new random seed
You are viewing an archived Text-only version of the thread.
To view this thread in it's original format and/or if you want to reply to
this thread please [click here]
| Author |
picking a new random seed
|
|
| Charles Brenner 2008-01-26, 10:03 pm |
| When doing Monte Carlo simulations, it may be uninteresting to redo
the same ones every time the workspace is loaded. As a crude method to
avoid reusing the same random number sequence, I wrote a loop to run
for one second generating a random number each time. To my surprise
the number of iterations in a second shows very little variation --
85% of 1-second trials ran the same number of iterations hence ended
with the same random seed. Also, that number was quite small. (APL
+DOS: It was 814 but I did also advance a counter in the loop.)
I guess it would be better just to read the system clock. At 100 tics
per second, there are as many tics in a year as the number of valid
values for []RL.
Charles
| |
| Sam Sirlin 2008-01-26, 10:03 pm |
| Charles Brenner wrote:
> When doing Monte Carlo simulations, it may be uninteresting to redo
> the same ones every time the workspace is loaded. As a crude method to
> avoid reusing the same random number sequence, I wrote a loop to run
> for one second generating a random number each time. To my surprise
> the number of iterations in a second shows very little variation --
> 85% of 1-second trials ran the same number of iterations hence ended
> with the same random seed. Also, that number was quite small. (APL
> +DOS: It was 814 but I did also advance a counter in the loop.)
>
> I guess it would be better just to read the system clock. At 100 tics
> per second, there are as many tics in a year as the number of valid
> values for []RL.
If you really care, you probably want to check out whatever generator
you use with various tests. There are numerous references on the web,
for instance
http://www.math.utah.edu/~pa/Random/Random.html
http://csrc.nist.gov/groups/ST/toolkit/rng/index.html
Sam
| |
| Curtis A. Jones 2008-01-27, 7:01 pm |
| Charles,
Simply constructing an integer from the system clock to use as the
seed {quadRL} for the random number generator may give a less than
ideal sequence.
James J. Weinkam discusses this in this comp.lang.apl group under
"Random Number Generation in IP Sharp APL" (29 Jun 2006 1917)
http://groups.google.com/group/comp...7a0ef84292e7b45
Among the consequences of a poor choice of seed:
"Moreover choosing a seed which is nor relatively prime to the modulus
causes the generator to produce a subsequence which is not of maximum
length."
One recommendation he makes:
"...saving the value of the seed at the end or each run and using that
value to seed the generator at the start of the next run."
I imagine you know this and have a function that selects only "valid
values for {quadRL}" from the clock. Would you share it?
My crude approach is to simply list {quadRL} after every permutation I
generate (often for assigning students to groups in labs). Next time
I go back to the listing in the session log to start from there.
Curtis
| |
| Phil Last 2008-01-27, 7:01 pm |
| On Jan 27, 9:33 pm, "Curtis A. Jones" <curtis_jo...@ieee.org> wrote:
> Charles,
> Simply constructing an integer from the system clock to use as the
> seed {quadRL} for the random number generator may give a less than
> ideal sequence.
>
> James J. Weinkam discusses this in this comp.lang.apl group under
> "Random Number Generation in IP Sharp APL" (29 Jun 2006 1917)http://groups=
..google.com/group/comp.lang.apl/browse_frm/thread/33c60d...
>
> Among the consequences of a poor choice of seed:
> "Moreover choosing a seed which is nor relatively prime to the modulus
> causes the generator to produce a subsequence which is not of maximum
> length."
>
> One recommendation he makes:
> "...saving the value of the seed at the end or each run and using that
> value to seed the generator at the start of the next run."
>
> I imagine you know this and have a function that selects only "valid
> values for {quadRL}" from the clock. Would you share it?
>
> My crude approach is to simply list {quadRL} after every permutation I
> generate (often for assigning students to groups in labs). Next time
> I go back to the listing in the session log to start from there.
> Curtis
Why not combine the random element of the clock with advancing =E2=8E=95RL b=
y
running
z=E2=86=90?(0=E2=8A=A5=E2=8E=95TS)=E2=8D=B41 =E2=8D=9D z gets query (zero de=
code quad TS)reshape one
That way, even if you never save the ws with a different =E2=8E=95RL you at
least get a random selection of the present and the next 999.
Phil
| |
| Curtis A. Jones 2008-01-28, 7:01 pm |
| Phil,
On this Windows XP system with Workstation APL2 the clock "ticks" seem
to be much more than 1 millisecond. i.e. Successive values of
{quadTS} are incremented by something closer to 10 ms than 1 ms. (I
ran {quadTS} 5000 times and got 15 unique millisecond values: 756 766
776 786 796 806 816 826 836 846 856 866 876 886 896.)
This may not be a problem for your approach to finding a number of
random numbers to generate if the millisecond element of {quadTS} can
take on at some time every one of the integer values from 0 to 999.
Curtis Jones
| |
| Phil Last 2008-01-28, 7:01 pm |
| On Jan 28, 5:09 pm, "Curtis A. Jones" <curtis_jo...@ieee.org> wrote:
> Phil,
> On this Windows XP system with Workstation APL2 the clock "ticks" seem
> to be much more than 1 millisecond. i.e. Successive values of
> {quadTS} are incremented by something closer to 10 ms than 1 ms. (I
> ran {quadTS} 5000 times and got 15 unique millisecond values: 756 766
> 776 786 796 806 816 826 836 846 856 866 876 886 896.)
>
> This may not be a problem for your approach to finding a number of
> random numbers to generate if the millisecond element of {quadTS} can
> take on at some time every one of the integer values from 0 to 999.
> Curtis Jones
Curtis,
I tried what you suggest and more than agree.
=E2=88=AA{0=E2=8A=A5=E2=8E=95TS}=C2=A8=E
2=8D=B35000 =E2=8D=9D unique{0 decod=
e quadTS}each iota 5000
being equivalent to what you said on Dyalog gives me, each time I run
it either one or two values, rather less than 15, which of course
depends merely on the speed of the machine, algorithm or interpreter
but each time I do it again I get a different one or two. The pairs
are always about 15 or 16 ms apart being the tick speed, apparently
1/60 or 1/64 sec. (Didn't it used to be 1/18 sec?)
But as you hint it isn't a problem if the intention is just to run it
once in a while.
Phil
| |
|
| On Jan 26, 8:47 pm, Sam Sirlin <swsir...@charter.net> wrote:
> Charles Brenner wrote:
>
>
> If you really care, you probably want to check out whatever generator
> you use with various tests. There are numerous references on the web,
> for instance
>
> http://www.math.utah.edu/~pa/Random...ist.gov/groups=
/ST/toolkit/rng/index.html
>
> Sam
The real-time cycle counter on modern X86 processors provide a
timestamp that
should change often (every machine cycle!) enough to keep Charles
happy. Here's a DDJ article that
gives a snippet of C code to read it:
http://www.ddj.com/showArticle.jhtm...j0405b&pgno=3D2
Here's the definition of RTDSC from: "Intel=AE 64 and IA-32
Architectures
Optimization Reference Manual":
"The time-stamp counter increments whenever the sleep pin is not
asserted or when
the clock signal on the system bus is active. It is read using the
RDTSC instruction.
The difference in values between two reads (modulo 2**64) gives the
number of
processor clocks between reads."
Now, having said that...
I don't know which RNG you're using, but the one in most early vintage
APL
interpreters (linear congruential) is not the greatest in the world.
You'd be better off adopting
one of the more modern RNG algorithms if cycle size, distribution
characteristics,
etc., are important to your application.
Bob
Bob
| |
| Charles Brenner 2008-01-30, 4:00 am |
| On Jan 27, 1:33 pm, "Curtis A. Jones" <curtis_jo...@ieee.org> wrote:
> Charles,
> Simply constructing an integer from the system clock to use as the
> seed {quadRL} for the random number generator may give a less than
> ideal sequence.
>
> James J. Weinkam discusses this in this comp.lang.apl group under
> "Random Number Generation in IP Sharp APL" (29 Jun 2006 1917)http://groups.google.com/group/comp...hread/33c60d...
>
> Among the consequences of a poor choice of seed:
> "Moreover choosing a seed which is nor relatively prime to the modulus
> causes the generator to produce a subsequence which is not of maximum
> length."
>
> One recommendation he makes:
> "...saving the value of the seed at the end or each run and using that
> value to seed the generator at the start of the next run."
Yes, I thought of that but would rather not decide on all the
housekeeping issues.
> I imagine you know this and have a function that selects only "valid
> values for {quadRL}" from the clock. Would you share it?
That's quite easy. The random number generation algorithm is
multiply by 5*7
reduce modulo (2*31)-1.
This algorithm generates every possible value before it repeats (I
checked the cycle length, which didn't take very long. In formal
language, the modulus is a prime and 5*7 is a primitive root, meaning
a number whose powers include every modulus class except 0. The fact
is that every prime does have primitive roots, so once you realize
that (2*31)-1 is a prime it follows that unless the implementor
blundered remarkably, the chosen multiplier must be a primitive root
and the generated numbers must be the full cycle. There was really no
need to check.)
That is, all numbers
0 < n < 2*31
are in the cycle, so start with any one of them. For example
1. Hash the system clock or whatever to a number. To ensure a valid
and efficient []RL
2. Reduce that number modulus (2*31)-2
3. Add 1.
Charles
| |
| James J. Weinkam 2008-01-30, 7:00 pm |
| Charles Brenner wrote:
> On Jan 27, 1:33 pm, "Curtis A. Jones" <curtis_jo...@ieee.org> wrote:
>
> Yes, I thought of that but would rather not decide on all the
> housekeeping issues.
>
>
> That's quite easy. The random number generation algorithm is
> multiply by 5*7
> reduce modulo (2*31)-1.
> This algorithm generates every possible value before it repeats (I
> checked the cycle length, which didn't take very long. In formal
> language, the modulus is a prime and 5*7 is a primitive root, meaning
> a number whose powers include every modulus class except 0. The fact
> is that every prime does have primitive roots, so once you realize
> that (2*31)-1 is a prime it follows that unless the implementor
> blundered remarkably, the chosen multiplier must be a primitive root
> and the generated numbers must be the full cycle. There was really no
> need to check.)
>
> That is, all numbers
> 0 < n < 2*31
> are in the cycle, so start with any one of them. For example
>
> 1. Hash the system clock or whatever to a number. To ensure a valid
> and efficient []RL
> 2. Reduce that number modulus (2*31)-2
> 3. Add 1.
>
> Charles
You are missing the point.
It is true in the present instance that any non zero value less than
2*31 is a valid seed; but if you simply choose a new seed at random
there is a risk that you will simply be generating the same stream of
random numbers you used in a previous experiment shifted by a bit. This
can lead to undesirable correlation between runs which can invalidate
subsequent statistical analysis.
Saving {quad}RL at the end of a run and restoring it at the beginning of
the next is hardly a major housekeeping issue.
| |
|
|
"James J. Weinkam" <jjw@cs.sfu.ca> wrote in message
news:Ro6oj.165$C61.91@edtnps89...
> Charles Brenner wrote:
>
> You are missing the point.
>
> It is true in the present instance that any non zero value less than 2*31 is a
> valid seed; but if you simply choose a new seed at random there is a risk that
> you will simply be generating the same stream of random numbers you used in a
> previous experiment shifted by a bit. This can lead to undesirable
> correlation between runs which can invalidate subsequent statistical analysis.
>
> Saving {quad}RL at the end of a run and restoring it at the beginning of the
> next is hardly a major housekeeping issue.
so, there's no real RANDOM in computing (as I've read somewhere in the old days)
| |
| Jane Sullivan 2008-01-31, 4:00 am |
| In message <47a0f4e1$0$25480$ba620dc5@text.nova.planet.nl>, jk
<*axy*@planet.nl> writes
>so, there's no real RANDOM in computing (as I've read somewhere in the old days)
I thought the favoured terminology was "pseudo-random", i.e. a sequence
of pseudo-random numbers passes all statistical tests for randomness,
but it can be reproduced by resetting the seed to the value used for its
generation.
--
Jane Sullivan
| |
| Charles Brenner 2008-02-03, 7:00 pm |
| On Jan 30, 1:57 pm, "James J. Weinkam" <j...@cs.sfu.ca> wrote:
> Charles Brenner wrote:
>
>
>
>
>
>
>
>
>
>
> You are missing the point.
>
> It is true in the present instance that any non zero value less than
> 2*31 is a valid seed;
less than (2*31)-1
> but if you simply choose a new seed at random
> there is a risk that you will simply be generating the same stream of
> random numbers you used in a previous experiment shifted by a bit.
That an inevitable consequence of using a deterministic random number
generator, i.e. one for which the sequence depends only on the
starting point.
> This can lead to undesirable correlation between runs which can invalidate
> subsequent statistical analysis.
That would be true if one used a method of choosing a restart point
for the random seed which is limited to a relative (relative to
2*31-1) handful of possible values. If the restart method truly chose
at random -- equal probability -- among all possible seed values, then
it would have no more tendency toward serial correlation than would
saving []RL. If you disbelieve this, that would explain the sense in
which you think I missed "the" point.
I think hashing []TS is good enough -- as opposed to the method of
counting loop iterations in a fixed short time which I remarked
earlier turned out to be even worse than I expected.
> Saving {quad}RL at the end of a run and restoring it at the beginning of
> the next is hardly a major housekeeping issue.
Given the nature of my application the mechanism would be tedious
enough that I prefer not to bother. I don't force my decision on
anyone else.
Charles
|
|
|
|
|