| ab_def@prontomail.com 2006-03-17, 7:58 am |
| Suppose that we have n independent identically distributed random
variables {u[1], ..., u[n]} and P[u[i] == u[j]] == 0 for i != j. We
form another sequence {xi[1] = Boole[u[1] > u[2]], ..., xi[n - 1] =
Boole[u[n - 1] > u[n]]} and we're looking for the variance of the sum
of xi[i]. Writing N[n] for that sum and D for its variance (notation
only, not the built-in D and N):
D[N[n]] == Variance[Sum[xi[i], {i, n - 1}]] ==
Variance[Sum[xi[i], {i, n - 2}] + xi[n - 1]] ==
Variance[Sum[xi[i], {i, n - 2}]] + Variance[xi[n - 1]] +
2*Covariance[Sum[xi[i], {i, n - 2}], xi[n - 1]] ==
D[N[n - 1]] + 1/4 + 2*Sum[Covariance[xi[i], xi[n - 1]], {i, n - 2}]
For any pair of adjacent elements we have
Covariance[xi[1], xi[2]] ==
P[xi[1] == 1 && xi[2] == 1] - P[xi[1] == 1]*P[xi[2] == 1] ==
P[u[1] > u[2] > u[3]] - P[u[1] > u[2]]*P[u[2] > u[3]] ==
1/6 - 1/4 == -1/12
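because all permutations of {u[1], ..., u[n]} are equally probable. For
any non-adjacent elements Covariance[xi[i], xi[j]] == 0, because xi[i]
and xi[j] then depend on disjoint sets of independent variables. Only
the i == n - 2 term of the covariance sum survives. Therefore, for n >= 3,
D[N[n]] == D[N[n - 1]] + 1/4 + 2*(-1/12) == D[N[n - 1]] + 1/12, D[N[2]] == 1/4
and D[N[n]] == (n + 1)/12 if n >= 2.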
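The three ingredients of the recurrence (the variance of one indicator, the adjacent covariance and the non-adjacent covariance) can be checked exactly on four variables, since only the relative order of the u[i] matters and all 24 orderings are equally likely. This is only a sketch: the names perms, xi and cov are made up for the illustration, and xi[k] here denotes the list of indicator values over all orderings rather than a random variable.

```mathematica
(* All 24 equally likely orderings of u[1], ..., u[4] *)
perms = Permutations[Range[4]];

(* xi[k]: 0/1 list over the orderings, 1 where u[k] > u[k + 1] *)
xi[k_] := Boole[#[[k]] > #[[k + 1]]] & /@ perms;

(* exact covariance of two 0/1 lists under the uniform distribution *)
cov[a_, b_] := Mean[a*b] - Mean[a]*Mean[b];

(* variance, adjacent covariance, non-adjacent covariance *)
{cov[xi[1], xi[1]], cov[xi[1], xi[2]], cov[xi[1], xi[3]]}

(* iterate the recurrence from D[N[2]] == 1/4 and compare with (n + 1)/12 *)
NestList[# + 1/4 + 2*(-1/12) &, 1/4, 6] == Table[(n + 1)/12, {n, 2, 8}]
```

The first expression should give {1/4, -1/12, 0}, matching the values used in the derivation, and the second should give True.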
Here is a check for n = 6:
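In[1]:= n = 6;
(* tally the number of descents u[i] > u[i + 1] over all n! permutations *)
Lvalfreq = {First@ #, Length@ #}& /@ Split@ Sort@
(Count[Sign[Most@ # - Rest@ #], 1]& /@
Permutations@ Range@ n)
(* possible values of the sum and their exact probabilities *)
{Lval, Lp} = {Lvalfreq[[All, 1]], Lvalfreq[[All, 2]]/n!};
mu = Lval.Lp (* exact mean *)
sigma = ((Lval - mu)^2).Lp (* exact variance, despite the name *)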
Out[2]= {{0, 1}, {1, 57}, {2, 302}, {3, 302}, {4, 57}, {5, 1}}
Out[4]= 5/2
Out[5]= 7/12
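The same exhaustive check can be repeated for several n at once. The following is a minimal sketch, assuming Mathematica 6 or later for Differences; exactVar is a hypothetical helper name and not part of the original session.

```mathematica
(* Exact variance of the number of descents over all n! permutations *)
exactVar[n_] := Module[
  {counts = Count[Differences[#], _?Negative] & /@ Permutations[Range[n]]},
  (* population variance: E[X^2] - E[X]^2, kept as an exact rational *)
  Mean[counts^2] - Mean[counts]^2];

(* columns: n, enumerated variance, closed form (n + 1)/12 *)
Table[{n, exactVar[n], (n + 1)/12}, {n, 2, 7}]
```

In each row the last two entries should agree. The enumeration grows like n!, so this is only practical for small n.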
And a numerical test:
In[6]:= Lcnt = Array[
Count[Sign[Most@ # - Rest@ #]&@ Array[Random[]&, n], 1]&,
10^5];
{Mean@ Lcnt, Variance@ Lcnt} - {mu, sigma} // N
Out[7]= {0.00262, 0.0033856695}
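To see how far the result is from the independent-signs guess, the simulation can be run for a few values of n and set beside both (n + 1)/12 and the binomial variance (n - 1)/4. This is a rough sketch, assuming Mathematica 6 or later (RandomReal and Differences instead of the Random[] used above); descents and trials are illustrative names.

```mathematica
(* number of positions i with u[i] > u[i + 1] in one random sample *)
descents[n_] := Count[Sign[Differences[RandomReal[1, n]]], -1];

trials = 10^4;

(* columns: n, sample variance, (n + 1)/12, (n - 1)/4 *)
Table[
  {n, N@Variance@Table[descents[n], {trials}],
   N[(n + 1)/12], N[(n - 1)/4]},
  {n, {3, 6, 10, 20}}] // TableForm
```

The sample variances should stay close to the third column and fall increasingly short of the fourth as n grows, which is what one would expect if neighbouring signs are negatively correlated.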
Maxim Rytin
m.r@inbox.ru
Darren Glosemeyer wrote:
>
> For the variance quoted on the TimeSeries page, I initially thought the
> same thing you did. Assuming the signs are independent and there are equal
> probabilities of getting positive and negative signs (and 0 probability of
> getting a 0 difference), the statistic would follow
> BinomialDistribution[n-1, 1/2], which would have a variance of
> (n-1)/4. Simulations give a variance that appears to be (n+1)/12 (which
> would still indicate a typo in the TimeSeries documentation). I haven't
> figured out why this should be the variance yet. My best guess is that the
> assumption of independence is not valid given the differencing and as a
> result the distribution is something other than BinomialDistribution[n-1, 1/2].
>
>
> Darren Glosemeyer
> Wolfram Research
>
>
> At 05:15 AM 3/10/2006 -0500, john.hawkin@gmail.com wrote:
|