applet-magic.com
Thayer Watkins
Silicon Valley
USA

 The Distribution of Sample Variance (when the mean value of the random variable is known) as a Function of Sample Size

Suppose the probability density distribution for x is

#### $p(x) = 1$ for $-0.5 \le x \le +0.5$
#### $p(x) = 0$ for all other values of $x$

The square of x can then only have values between 0 and 0.25. Thus the probability density function for $w = x^2$ is given by

#### $P(w) = w^{-1/2}$ for $0 \le w \le 0.25$
#### $P(w) = 0$ for all other values of $w$
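
The density of w follows from its cumulative distribution. A short derivation, using only the uniform density of x given above (the symbol F for the cumulative distribution function is introduced here for illustration and is not in the original), is:

```latex
\begin{align*}
F(w) &= \Pr(x^2 \le w) = \Pr(-\sqrt{w} \le x \le \sqrt{w}) = 2\sqrt{w},
      \qquad 0 \le w \le 0.25,\\
P(w) &= \frac{dF}{dw} = w^{-1/2}, \qquad 0 < w \le 0.25,\\
\int_0^{0.25} w^{-1/2}\,dw &= 2\sqrt{0.25} = 1 .
\end{align*}
```

The density is unbounded as w approaches 0 but still integrates to 1, which is why the n=1 histogram piles up near zero.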

Below are shown the histograms for 2000 repetitions of drawing a sample of n values of a random variable uniformly distributed between -0.5 and +0.5 and computing the sum of their squares. The sum is normalized by dividing by the square root of the sample size n. Because the variance of a sum of n independent terms is n times the variance of a single term, this keeps the dispersion of the distribution constant; otherwise the distribution would be more spread out for larger n. Although the random variable is distributed between -0.5 and +0.5, its square is distributed between 0 and 0.25.
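
The effect of the normalization can be worked out directly. In the sketch below, $w_i = x_i^2$ and $S_n$ denotes the sum of the n squares; both symbols are introduced here for illustration.

```latex
\begin{align*}
E[w] &= \int_{-0.5}^{0.5} x^2\,dx = \frac{1}{12}, \qquad
E[w^2] = \int_{-0.5}^{0.5} x^4\,dx = \frac{1}{80},\\
\operatorname{Var}(w) &= \frac{1}{80} - \frac{1}{144} = \frac{1}{180},\\
\operatorname{Var}\!\left(\frac{S_n}{\sqrt{n}}\right) &= \frac{n\,\operatorname{Var}(w)}{n} = \frac{1}{180},
\qquad
E\!\left[\frac{S_n}{\sqrt{n}}\right] = \frac{n}{12\sqrt{n}} = \frac{\sqrt{n}}{12}.
\end{align*}
```

So the standard deviation of the normalized sum stays at $\sqrt{1/180} \approx 0.0745$ for every n, while its center moves to the right in proportion to $\sqrt{n}$.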

Each time the display is refreshed a new batch of 2000 samples is created.

As can be seen, as the sample size n gets larger the distribution more closely approximates the shape of the normal distribution.

Although the distribution for n=1 is decidedly non-normal, for n=16 the distribution looks quite close to a normal distribution even though the sample value can take on only positive values.
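
The experiment described above can be approximated outside the applet. The following is a minimal sketch, assuming NumPy is available; the function names `normalized_sum_of_squares` and `skewness`, the seed, and the particular sample sizes are choices made here for illustration and are not taken from the original applet. It prints the mean, standard deviation, and skewness of the 2000 normalized sums for several n; a skewness near zero is one rough indication of a more symmetric, more nearly normal shape.

```python
import numpy as np

def normalized_sum_of_squares(n, repetitions=2000, rng=None):
    """Return `repetitions` values of (x_1^2 + ... + x_n^2) / sqrt(n),
    where each x_i is uniform on [-0.5, 0.5]."""
    rng = np.random.default_rng() if rng is None else rng
    x = rng.uniform(-0.5, 0.5, size=(repetitions, n))
    return (x ** 2).sum(axis=1) / np.sqrt(n)

def skewness(values):
    """Third standardized moment of the sample (0 for a symmetric shape)."""
    centered = values - values.mean()
    return (centered ** 3).mean() / values.std() ** 3

rng = np.random.default_rng(0)  # fixed seed, assumed here for repeatability
for n in (1, 2, 4, 16, 64):
    s = normalized_sum_of_squares(n, rng=rng)
    print(f"n={n:3d}  mean={s.mean():.4f}  std={s.std():.4f}  "
          f"skewness={skewness(s):+.3f}")
```

Dropping the seed gives a fresh batch of 2000 samples on each run, which mirrors refreshing the display.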

If the square root of the sum of the squares is taken, the distributions of the results are as shown below:

The positive square root of the square of the random variable is distributed from 0 to 0.5. Although the distributions for larger sample sizes look generally like normal distributions, they are square-root transforms of approximately normal distributions rather than normal distributions themselves.
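
The square-root version of the experiment can be sketched the same way. The text does not say whether the sum is normalized before the root is taken, so the sketch below assumes the root of the unnormalized sum of squares; the function name `root_sum_of_squares` and the choice of 10 histogram bins are likewise illustrative assumptions.

```python
import numpy as np

def root_sum_of_squares(n, repetitions=2000, rng=None):
    """Return `repetitions` values of sqrt(x_1^2 + ... + x_n^2),
    where each x_i is uniform on [-0.5, 0.5]."""
    rng = np.random.default_rng() if rng is None else rng
    x = rng.uniform(-0.5, 0.5, size=(repetitions, n))
    return np.sqrt((x ** 2).sum(axis=1))

rng = np.random.default_rng(0)  # fixed seed, assumed here for repeatability
for n in (1, 4, 16, 64):
    r = root_sum_of_squares(n, rng=rng)
    counts, _edges = np.histogram(r, bins=10)  # crude text histogram
    print(f"n={n:3d}  range=[{r.min():.3f}, {r.max():.3f}]  "
          f"counts={counts.tolist()}")
```

For n=1 the result is simply |x|, which is uniform on 0 to 0.5, so the bin counts should be roughly flat; for larger n they should rise to a single central peak.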

A caution: If the population distribution has an infinite standard deviation, then the sample standard deviation will not converge to any value as the sample size increases. The sample standard deviation will always be finite, but it is of no relevance. For more on this topic see Sample Statistics for a Stable Distribution.
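
The caution can be illustrated numerically. The sketch below is an illustration under stated assumptions, not the method of the article referred to above: as an example of a population with infinite variance it uses a Student t distribution with 1.5 degrees of freedom (mean 0, infinite variance), which is a stand-in chosen here and not a stable distribution. The helper name `rms_by_sample_size` is hypothetical. Since the mean is known to be 0 in both cases, the sample standard deviation is computed as the root mean square of the draws.

```python
import numpy as np

def rms_by_sample_size(draws, sizes):
    """Sample standard deviation about a known mean of 0, computed from
    the first k draws for each k in `sizes`."""
    squares = draws ** 2
    return np.array([np.sqrt(squares[:k].mean()) for k in sizes])

rng = np.random.default_rng(0)  # fixed seed, assumed here for repeatability
sizes = [100, 10_000, 1_000_000]

# Finite-variance case: uniform on [-0.5, 0.5], true std = sqrt(1/12).
uniform = rng.uniform(-0.5, 0.5, size=sizes[-1])
# Infinite-variance case (assumed example): Student t with 1.5 d.o.f.
heavy = rng.standard_t(1.5, size=sizes[-1])

uniform_sd = rms_by_sample_size(uniform, sizes)
heavy_sd = rms_by_sample_size(heavy, sizes)

print("       n   uniform   heavy-tailed")
for k, u, h in zip(sizes, uniform_sd, heavy_sd):
    print(f"{k:8d}   {u:.4f}   {h:.4f}")
```

The uniform column should settle near $\sqrt{1/12} \approx 0.2887$, while the heavy-tailed column is finite for every n but typically keeps drifting as occasional extreme draws dominate the sum of squares.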