San José State University 

appletmagic.com Thayer Watkins Silicon Valley & Tornado Alley USA 


The Central Limit Theorem (CLT) is a powerful and important result of mathematical analysis. In its standard form it says that if a stochastic variable x has a finite variance then the distribution of the sum of n independent samples of x, suitably centered and scaled, approaches a normal distribution as the sample size n increases without limit. This standard version requires that the elements of the sum be independent and identically distributed. The requirement of identical distributions serves mainly to make the proof feasible. The CLT also applies to the case of nonidentical distributions so long as the set of distributions is suitably bounded in terms of mean and variance. The CLT can also be extended to sample statistics beyond sums and means. For example, sample statistics of the form
    s = f^{-1}( (1/n) Σ_{i=1}^{n} f(x_i) )
will have a limiting distribution which is a transform of a normal distribution.
Consider the instance of this case in which f(x)=x^{2}. The statistic s is then the sample standard deviation when the distribution of x is known to have a zero mean. Since the CLT applies to any distribution of finite variance, it applies to the distribution of x^{2}, provided x^{2} itself has a finite variance. Thus the distribution of the sum of squares approaches a normal distribution, and the statistic s approaches a distribution which is the square root transform of a normal distribution.
As another instance of this case consider the geometric means of samples:
    g = (x_1 x_2 ... x_n)^{1/n}
This can be put into the form
    log(g) = (1/n) Σ_{i=1}^{n} log(x_i)
The sum of the logarithms of x will approach a normal distribution and likewise the mean of the sample logarithms. Therefore the distribution of log(g) will approach a normal distribution and hence g will have a limiting distribution which is the exponential transform of a normal distribution.
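The geometric-mean case can be checked with a small simulation. The following is a minimal sketch, not part of the original page: it assumes x is uniform on [0.5, 1.5] (an arbitrary positive distribution chosen for illustration) and uses the sample skewness of log(g) as a rough measure of departure from normality. The function names are hypothetical.

```python
import math
import random
import statistics


def geometric_mean(xs):
    """Geometric mean, computed as the exponential of the mean logarithm."""
    return math.exp(sum(math.log(x) for x in xs) / len(xs))


def skewness(values):
    """Sample skewness: third central moment divided by the cubed standard deviation."""
    m = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum((v - m) ** 3 for v in values) / (len(values) * sd ** 3)


def simulate_log_geometric_means(n, repetitions=2000, seed=0):
    """Return log(g) for `repetitions` samples of size n.

    Assumption for illustration: x is uniform on [0.5, 1.5].
    """
    rng = random.Random(seed)
    return [
        math.log(geometric_mean([rng.uniform(0.5, 1.5) for _ in range(n)]))
        for _ in range(repetitions)
    ]


if __name__ == "__main__":
    # Skewness of log(g) should shrink toward zero as n grows.
    for n in (1, 4, 16, 64):
        print(n, round(skewness(simulate_log_geometric_means(n)), 3))
```

A skewness near zero is only one symptom of normality, so this is a plausibility check rather than a proof.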
In general, then, the sum of the f(x)'s will have a limit distribution which is normal and the statistic s will have a limit distribution which is the f^{-1}() transform of a normal distribution.
If z has a probability density p(z) what is the distribution of f(z)? Consider first the case in which f(z) is a monotonically increasing function. The probability that z lies between a and b is given by
    Prob(a ≤ z ≤ b) = ∫_{a}^{b} p(z) dz
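The probability distribution for f(z) is obtained by a change of variable in the integral; i.e., the probability that w=f(z) is between f(a) and f(b) is given by:
    Prob(f(a) ≤ w ≤ f(b)) = ∫_{f(a)}^{f(b)} p(z(w)) (dz/dw) dw
where z(w)=f^{-1}(w). The integrand p(z(w))(dz/dw) is therefore the probability density of w, denoted P(w).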
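For instance suppose w=f(z)=z^{3} and hence z=w^{1/3} and dz/dw=(1/3)w^{-2/3}. If z has the standard normal density (1/√(2π))exp[-z^{2}/2] then the probability density of w is
    P(w) = (1/√(2π)) exp[-w^{2/3}/2] (1/3) w^{-2/3}
where w^{2/3} denotes (w^{1/3})^{2}, which is nonnegative for every real w.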
When f(z) is monotonically decreasing the result is essentially the same, except that the limits of integration must be reversed. The reversal introduces a negative sign which, multiplied by the negative sign of dz/dw, is equivalent to taking the absolute value of dz/dw.
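The change of variable for the cube example can be verified numerically. The sketch below is an illustration under stated assumptions: z is standard normal, the integral is approximated with a simple midpoint rule, and the helper names (cube_density, midpoint_integral) are hypothetical. It compares the probability that z lies in [a, b] with the integral of the transformed density over [a^{3}, b^{3}].

```python
import math


def normal_pdf(z):
    """Standard normal density."""
    return math.exp(-z * z / 2.0) / math.sqrt(2.0 * math.pi)


def normal_cdf(z):
    """Standard normal cumulative distribution, via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def cube_density(w):
    """Density of w = z**3 for standard normal z (valid for w != 0)."""
    z = math.copysign(abs(w) ** (1.0 / 3.0), w)   # real cube root of w
    dz_dw = (1.0 / 3.0) * abs(w) ** (-2.0 / 3.0)  # derivative of the cube root
    return normal_pdf(z) * dz_dw


def midpoint_integral(func, lo, hi, steps=20_000):
    """Midpoint-rule approximation of the integral of func over [lo, hi]."""
    h = (hi - lo) / steps
    return h * sum(func(lo + (i + 0.5) * h) for i in range(steps))


if __name__ == "__main__":
    a, b = 0.5, 1.5
    print(normal_cdf(b) - normal_cdf(a))                  # Prob(a <= z <= b)
    print(midpoint_integral(cube_density, a ** 3, b ** 3))  # same probability in terms of w
```

The interval is chosen away from w=0, where the density has an integrable singularity that a plain midpoint rule handles poorly.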
When f(z) is not monotonic, the possibility of multiple solutions for z of the equation f(z)=w must be taken into account. The probability density function for w, P(w), is then given by
    P(w) = Σ_{k} p(z_k) |dz_k/dw|
where the sum runs over all solutions z_k of f(z_k)=w.
Consider the case of f(z)=z^{2}. Then z=±w^{1/2} and dz/dw=±(1/2)w^{-1/2}. The variable w can have only nonnegative values. Its probability density function is given by:
    P(w) = [p(w^{1/2}) + p(-w^{1/2})] (1/2) w^{-1/2}
Suppose the probability density function for z is
    p(z) = 1 for -0.5 ≤ z ≤ 0.5, and p(z) = 0 otherwise
The square of z can then only have values between 0 and 0.25. Thus the probability density function for w=z^{2} is given by
    P(w) = w^{-1/2} for 0 < w ≤ 0.25, and P(w) = 0 otherwise
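The density of the square of a uniform variable can be checked against simulation. Integrating w^{-1/2} from 0 to t gives the cumulative probability 2√t for 0 ≤ t ≤ 0.25. The sketch below, which assumes z uniform on [-0.5, 0.5] and uses hypothetical function names, compares that value with the empirical fraction of squared draws not exceeding t.

```python
import math
import random


def squared_uniform_density(w):
    """Density of w = z**2 for z uniform on [-0.5, 0.5]."""
    if 0.0 < w <= 0.25:
        # Two roots, +sqrt(w) and -sqrt(w), each contribute p(z) * |dz/dw| = 1 * (1/2) * w**-0.5.
        return 2.0 * 0.5 * w ** -0.5
    return 0.0


def squared_uniform_cdf(t):
    """Integral of the density from 0 to t: 2*sqrt(t), capped at 1."""
    if t <= 0.0:
        return 0.0
    return min(1.0, 2.0 * math.sqrt(t))


def empirical_cdf(t, samples=100_000, seed=0):
    """Fraction of simulated z**2 values that do not exceed t."""
    rng = random.Random(seed)
    hits = sum(1 for _ in range(samples) if rng.uniform(-0.5, 0.5) ** 2 <= t)
    return hits / samples


if __name__ == "__main__":
    for t in (0.01, 0.04, 0.16):
        print(t, squared_uniform_cdf(t), round(empirical_cdf(t), 4))
```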
Below are shown the histograms for 2000 repetitions of taking samples of n random variables and computing the sum of the squares of a random variable which is uniformly distributed between -0.5 and +0.5. The sum is normalized by dividing by the square root of the sample size n. This keeps the dispersion of the distribution constant; otherwise the distribution would be more spread out for larger n. Although the random variable is distributed between -0.5 and +0.5, its square is distributed between 0 and 0.25. Each time the display is refreshed a new set of 2000 repetitions of the samples is created.
As can be seen, as the sample size n gets larger the distribution more closely approximates the shape of the normal distribution.
Although the distribution for n=1 is decidedly nonnormal, for n=16 the distribution looks quite close to a normal distribution even though the sample value can take on only positive values.
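A simulation along these lines can be sketched as follows. This is not the code behind the original display; it is a minimal reconstruction assuming 2000 repetitions, x uniform on [-0.5, 0.5], and normalization of the sum of squares by the square root of n. The text histogram and the function names are illustrative choices.

```python
import math
import random
import statistics


def normalized_sums_of_squares(n, repetitions=2000, seed=0):
    """Sum of n squared uniform(-0.5, 0.5) draws, divided by sqrt(n)."""
    rng = random.Random(seed)
    return [
        sum(rng.uniform(-0.5, 0.5) ** 2 for _ in range(n)) / math.sqrt(n)
        for _ in range(repetitions)
    ]


def text_histogram(values, bins=20, width=50):
    """Render a crude histogram as lines of '#' characters."""
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1.0
    counts = [0] * bins
    for v in values:
        index = min(int((v - lo) / span * bins), bins - 1)
        counts[index] += 1
    peak = max(counts)
    lines = []
    for i, count in enumerate(counts):
        left_edge = lo + i * span / bins
        lines.append(f"{left_edge:8.4f} | " + "#" * round(count / peak * width))
    return "\n".join(lines)


if __name__ == "__main__":
    for n in (1, 4, 16):
        values = normalized_sums_of_squares(n)
        print(f"n = {n}, standard deviation = {statistics.pstdev(values):.4f}")
        print(text_histogram(values))
```

Dividing by the square root of n holds the standard deviation near that of a single squared draw, about 0.0745, while the center of the distribution moves to √n/12.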
If the square root of the sum of the squares is taken, the distributions of the results are as shown below:
The positive square root of the square of the random variable is distributed from 0 to 0.5. Although the distributions for larger sample size look generally like normal distributions they are transforms of normal distributions.
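The square root transform of a normal distribution can be written out and compared with simulation. The sketch below rests on several assumptions that are not stated in the original: the root is taken of the sum normalized by √n, the normal approximation to that sum uses mean √n/12 and standard deviation √(1/80 - 1/144) (the exact moments of a squared uniform(-0.5, 0.5) variable), and the function names are hypothetical. If S is normal with density p_S, then r=√S has density p_S(r^{2})·2r.

```python
import math
import random

MEAN_SQ = 1.0 / 12.0                         # E[x^2] for x uniform on [-0.5, 0.5]
SD_SQ = math.sqrt(1.0 / 80.0 - 1.0 / 144.0)  # standard deviation of x^2


def root_sum_squares(n, repetitions=2000, seed=0):
    """Square root of the sum of squares normalized by sqrt(n)."""
    rng = random.Random(seed)
    return [
        math.sqrt(sum(rng.uniform(-0.5, 0.5) ** 2 for _ in range(n)) / math.sqrt(n))
        for _ in range(repetitions)
    ]


def root_of_normal_density(r, n):
    """Density of r = sqrt(S), with S normal with the CLT mean and deviation."""
    mu = math.sqrt(n) * MEAN_SQ
    u = (r * r - mu) / SD_SQ
    normal_density_at_r2 = math.exp(-u * u / 2.0) / (SD_SQ * math.sqrt(2.0 * math.pi))
    return normal_density_at_r2 * 2.0 * r    # p_S(r^2) * |dS/dr|


def interval_probability(lo, hi, n, steps=2000):
    """Midpoint-rule integral of the transformed density over [lo, hi]."""
    h = (hi - lo) / steps
    return h * sum(root_of_normal_density(lo + (i + 0.5) * h, n) for i in range(steps))


if __name__ == "__main__":
    n, lo, hi = 16, 0.5, 0.6
    values = root_sum_squares(n)
    empirical = sum(1 for v in values if lo <= v <= hi) / len(values)
    print(empirical, interval_probability(lo, hi, n))
```

The two printed numbers should be close but not identical, since for n=16 the sum of squares is only approximately normal.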
Consider the distribution of sample maximums for samples of a random variable uniformly distributed between -0.5 and +0.5. For n=1 the sample maximum is just the sample value.
Although the above distributions suggest that, for an extension of the central limit theorem to apply, the sample statistic must be representable as a sum, it should be noted that the maximum function can be represented as the limit of such functions; i.e., for nonnegative x_i,
    max(x_1, ..., x_n) = lim_{σ→∞} ( Σ_{i=1}^{n} x_i^{σ} )^{1/σ}
This means the distribution of the sample maximum will be the limit, as σ increases without bound, of the σ-th root transforms of normal distributions.
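The representation of the maximum as a limit of roots of power sums can be illustrated numerically. The sketch below assumes nonnegative sample values (with negative values and even σ the expression converges to the largest absolute value instead), and the function name is hypothetical.

```python
def power_sum_root(xs, sigma):
    """Compute (sum of x**sigma) ** (1/sigma) for nonnegative xs.

    For very large sigma the powers underflow to zero in floating point,
    so this direct form is only suitable for moderate sigma.
    """
    return sum(x ** sigma for x in xs) ** (1.0 / sigma)


if __name__ == "__main__":
    sample = [0.1, 0.3, 0.45, 0.2]
    for sigma in (1, 2, 8, 32, 128):
        print(sigma, power_sum_root(sample, sigma))
    print("max", max(sample))
```

The values decrease toward the sample maximum as σ grows, which is the behavior the limit formula describes.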