applet-magic.com
Thayer Watkins
Silicon Valley & Tornado Alley USA

The Lévy Stable Distribution Representation of the Monthly Probability Distributions for Rainfall in San Jose, California

As explained in the reading on characteristic functions, the characteristic function of a distribution is the expected value of exp(iωz). The characteristic function will generally be a complex function; i.e., Φ(ω) = Χ(ω) + iΥ(ω). Since exp(iωz) = cos(ωz) + isin(ωz), the components of the characteristic function are given by:

Χ(ω) = E{cos(ωz)} = ∫cos(ωz)p(z)dz

and

Υ(ω) = E{sin(ωz)} = ∫sin(ωz)p(z)dz

where p(z) is the probability density function of z.

As a practical matter the probability distribution is in the form of a histogram; i.e., the frequencies or probabilities for specified ranges of the random variable. The computation of the components of the characteristic function is based upon the midpoints of the ranges of the histogram. In effect, this replaces the probability distribution defined over a continuous variable with one defined for a discrete variable. Instead of the distribution curve being a smooth curve it is taken to be structured like a comb. This means that there is a limit for the frequency variable ω. If the frequency is so high that the wavelength of the trigonometric functions is comparable to the interval between the midpoints of the histogram ranges, the distances between the teeth of the comb, then the estimates of the characteristic function are no longer valid. For this reason the characteristic function components are computed only for values of ω no higher than 16.
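The midpoint computation described above can be sketched as follows. The bin edges and probabilities here are illustrative placeholders, not the San Jose rainfall data:

```python
import numpy as np

# Illustrative histogram: bin edges (inches of rain) and relative
# frequencies.  These are stand-in values, NOT the San Jose data.
edges = np.array([0.0, 0.25, 0.5, 1.0, 2.0, 4.0])
probs = np.array([0.40, 0.25, 0.18, 0.12, 0.05])   # sums to 1
midpoints = 0.5 * (edges[:-1] + edges[1:])

def char_fn_components(omega, midpoints, probs):
    """X(w) = E[cos(wz)] and Y(w) = E[sin(wz)], with all of each bin's
    probability treated as concentrated at the bin midpoint."""
    X = float(np.sum(probs * np.cos(omega * midpoints)))
    Y = float(np.sum(probs * np.sin(omega * midpoints)))
    return X, Y

# Evaluate at the frequencies used in the article (w no higher than 16).
components = {w: char_fn_components(w, midpoints, probs)
              for w in (1, 2, 4, 8, 16)}
```

At ω = 0 the estimate returns (1, 0), as it must for any probability distribution, and the modulus of the estimated characteristic function never exceeds 1.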

The results of the computations are shown below for the January distribution of rainfalls for the days which had some rain.

In the graph shown below the components of the characteristic function are used to compute the components of the logarithm of the characteristic function. This process is not simply a matter of taking the logarithms of the components.

The logarithm of the characteristic function will also be a complex function with real and imaginary components. The logarithm of a variable W is defined as w if:

exp(w) = W.

For a complex variable X+iY we must find x+iy such that

exp(x+iy) = X+iY.

Since

exp(x+iy) = exp(x)exp(iy)

= exp(x)(cos(y) + isin(y))

= exp(x)cos(y) + iexp(x)sin(y)

it follows that

X = exp(x)cos(y)

and

Y = exp(x)sin(y)

Thus the imaginary component y can be determined from:

tan(y) = Y/X and hence y = tan⁻¹(Y/X).

The real component x can then be found from:

x = log(X) - log(cos(y)).
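A minimal sketch of this recovery of (x, y) from (X, Y), checked against Python's built-in complex logarithm. Note that `atan2` is used in place of a bare tan⁻¹(Y/X) to resolve the quadrant ambiguity:

```python
import cmath
import math

def log_components(X, Y):
    """Components (x, y) of the logarithm of X + iY.
    atan2 resolves the quadrant that tan^-1(Y/X) leaves ambiguous;
    x = (1/2)log(X^2 + Y^2) equals log(X) - log(cos(y)) when X > 0,
    but also works when X <= 0."""
    y = math.atan2(Y, X)
    x = 0.5 * math.log(X * X + Y * Y)
    return x, y

# Check against the library's complex logarithm for an arbitrary point.
x, y = log_components(0.3, -0.4)
w = cmath.log(complex(0.3, -0.4))
```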

The reason for wanting the components of the logarithm of the characteristic function is that these components provide a way to estimate the parameters of a Lévy stable distribution. If the distribution is a Lévy stable distribution, the plot of log(-Χ(ω)) versus log(ω) should be a straight line whose slope is equal to the α parameter of the stable distribution.

The real component of the log-characteristic function for a stable distribution is

Χ(ω) = -|νω|^α

and therefore

-Χ(ω) = ν^α|ω|^α

This last relationship implies that:

log(-Χ(ω)) = αlog(|ω|) + αlog(ν)

Thus for a stable distribution the graph of the logarithm of the real component of the log-characteristic function as a function of the logarithm of ω is a straight line, the slope of which is the stability index of the distribution, α.

The value of log(-Χ(ω)) when ω=1 is the intercept of the straight line and is equal to αlog(ν). Thus a knowledge of the intercept and the value of α determines the value of ν, the dispersion parameter of the distribution; i.e.,

log(ν) = log(-Χ(1))/α
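The slope-and-intercept estimation can be sketched as a least-squares fit. The values α = 1.5 and ν = 0.8 below are assumed for demonstration, not the fitted rainfall parameters:

```python
import numpy as np

# Assumed stable-distribution parameters, for illustration only.
alpha_true, nu_true = 1.5, 0.8
omegas = np.array([1.0, 2.0, 4.0, 8.0])
X_logcf = -np.abs(nu_true * omegas) ** alpha_true   # X(w) = -|nu*w|^alpha

# Fit log(-X(w)) = alpha*log(w) + alpha*log(nu) by least squares.
slope, intercept = np.polyfit(np.log(omegas), np.log(-X_logcf), 1)
alpha_hat = slope                        # slope of the line is alpha
nu_hat = np.exp(intercept / alpha_hat)   # log(nu) = intercept / alpha
```

The intercept at ω = 1 equals αlog(ν), so dividing by the estimated α and exponentiating recovers ν, exactly as the relation log(ν) = log(-Χ(1))/α states.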

As can be seen from the above graph, the plot is at least very close to being a straight line except for the last data point. The failure at the highest value of ω is probably due to the limitation created by using the midpoints of the ranges of the histogram. The maximum value of ω for which the midpoint-based estimates of the characteristic function are valid is probably lower than 16, the value that was assumed.

The imaginary component of the log-characteristic function for stable distributions is:

Υ(ω) = δω + βν^α|ω|^α F(ω,α,ν)

where F(ω,α,ν) = tan(πα/2) for α≠1 and ω>0. With α and ν known, the values of δ and β can be determined from the imaginary component of the log-characteristic function. The values of δ and β could be found from any two points on the curve; i.e., by solving the linear equations in the two unknowns δ and β:

Υ(ω₁) = δω₁ + βν^α|ω₁|^α F(ω₁,α,ν)

Υ(ω₂) = δω₂ + βν^α|ω₂|^α F(ω₂,α,ν)

As can be seen above, if the two points for ω=1 and ω=4 are used to estimate β and δ, the other points can be computed rather closely except for the largest value of ω. The failure for ω=16 can be attributed to the problems associated with using the midpoints of the ranges in computing the characteristic function.
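The two-point solution for δ and β can be sketched as a 2×2 linear system. All parameter values here are assumptions for demonstration, not the fitted rainfall values:

```python
import numpy as np

# Assumed parameter values, for illustration only.
alpha, nu = 1.5, 0.8
delta_true, beta_true = 0.3, 0.5
F = np.tan(np.pi * alpha / 2.0)   # F(w, alpha, nu) for alpha != 1, w > 0

def Y_logcf(w):
    """Imaginary component of the log-characteristic function:
    Y(w) = delta*w + beta*nu^alpha*w^alpha*F."""
    return delta_true * w + beta_true * nu**alpha * w**alpha * F

# Two points, w = 1 and w = 4, give two linear equations
# in the two unknowns delta and beta.
w1, w2 = 1.0, 4.0
A = np.array([[w1, nu**alpha * w1**alpha * F],
              [w2, nu**alpha * w2**alpha * F]])
b = np.array([Y_logcf(w1), Y_logcf(w2)])
delta_hat, beta_hat = np.linalg.solve(A, b)
```

With exact values of Υ at the two points the system recovers δ and β exactly; with empirical estimates the recovered values inherit the estimation error at those two frequencies.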

The general conclusion is that the distribution functions for rainfall can be represented as Lévy stable distributions.

The estimates for other months are shown below.


Thus if a probability distribution is actually a stable distribution it is an easy matter to determine the values of its parameters from its log-characteristic function. The problem is how to properly make the estimates when the true probability distribution is not known but only a sample estimate is available.