Thayer Watkins
Silicon Valley
& Tornado Alley

Numerical Computation Fundamentals

Computation as Function Evaluation

One essential feature of computation is its definiteness and repeatibility: the same inputs give the same outputs. In the case of computers the inputs and outputs have to be sequences of bits, which can be thought of as rational numbers. These rational numbers can have only finite numbers of digits to the left and right of the decimal point. Let this set of allowable numbers be denoted as S. Then a function involved in computation is of the nature f:S→S. There is the special provision that if the highest positive or lowest negative number comes up in a computation the computation stops and a message of Overflow is outputed. If the computation gives a result that is not zero but rounds to zero then the computation outputs a message of Underflow.

Suppose there is some mathematic procedure to be carried out, say finding the square root of a non-negative real number. Let the set of non-negative real numbers be denoted as R. Then the square root function is F:R→R where F(x)=√x.

A computational approximation of F involves using three approximations;

The process can be expressed in terms of a diagram:

Errors, Forward and Backward

The obvious way to define the error between an estimate y and the actual value y is

absolute error E = |y −y |
or in relative form
ε = |y −y |/|y|

In the example of the square root, suppose x=1.21. Then y=1.1 and y=1.105 so E=0.005 and ε=0.0045.

The trouble with this obvious definitions of error is that the true value y may be unknown or unknowable. For example, suppose x=2 and we have the estimate of its square root as y=1.5. We have only approximations to the true value of y, such as 1.41 or 1.414 or 1.4142 or 1.41421 etc. We thus can only approximate the values of E amd ε. (And thus we might have an error in the values of the errors.)

James Wilkinson found an ingenious way around this problem. Instead of comparing the computed value y to the true value y Wilkinson asked what was the value of x that had y as a true value. He then determined the difference betwee this x and the true value of x and called it the backward error. Thus in the above example concerning square roots y=1.105 and hence x=1.221 so the absolute backward error is 0.011 and the relative backward error is 0.01, precisely.

To see the beauty of Wilkinson's backward error analysis consider the case of x=2. The approximation is y=1.5. The direct or forward error is (1.5 − 1.4142136....)=0.0857864..., sort of. On the other hand, for this case, x=2.25 and the backward error is 0.25, precisely.

Wilkinson's backward error does not handle all of the problems of specifying computational error. To illustrate more dramatically the additional problems consider the case in which numbers and computation are given only to two decimal places. This situation is brought to mind by the lament of Norbert Weiner who during World War II had a computer (human-type) with an accounting background. All calculations were carried out to exactly two decimal places, no more and no less. Now suppose x=√2. The nearest one can get to √2 in the set of numbers available is [√2]=1.41. The computed "square root" for this number is y=[1+(1.41-1)/2]=[1.205]=1.20, presuming rounding of 5 to the nearest even digit. The number that this is the true square root of is 1.44. The true square root of √2 is 1.1892... The forward error of the estimate is 0.010793...., sort of. The backward error could be defined as (1.452−1.41436...) or as (1.44−1.41) or as something else.

The error introduced by using 1.41 rather than √2 is called truncation error. The error involved in taking 1.20 as the result rather than 1.205 is called the round-off error. The situation illustrated above is sometimes called mixed forward-backward error.

The Condition Number

The relationship between the magnitude of the forward and the backward error is of interest. It is presumed that a computation method that has a small backward error also necessarily has a small forward error. This is not always the case, but if it is not true then the computation problem is not a well posed problem.

Consider the computation of an approximation of y = f(x) where f() has a Taylor's series

f(x+Δx) = f(x) + f'(x)Δx + O((Δx)²)

Let the estimate of y, y, be given by backward difference analysis as

y = f(x+Δx) = f(x)+ f'(x)Δx + O((Δx)²)
but f(x)= y so
y−y = f'(x)Δx + O((Δx)²)
and dividing by y=f(x) so
(y−y)/y = (f'(x)/f(x))Δx + O((Δx)²)

By multiplying and dividing the first term on the right by x the above relation takes the form

(y−y)/y = (xf'(x)/f(x))(Δx/x) + O((Δx)²)
which is the same as
(relative forward error) =
±c(x)(relative backward error) + O((Δx)²)

where c(x)=|(xf'(x)/f(x))| and is called the relative condition number. It could also be called the elasticity of the function value with respect to its argument. This elasticity or condition number can be represented as

c(x) = d(ln(f))/d(ln(x))

It follows that as Δx goes to zero the ratio of the relative forward error to the relative backward error goes to the condition number in magnitude.

For the computation of the square root the condition number is (1/2). For finding the n-th root the condition number is (1/n). For the n-th power c is n.

But things are not always so simple. Consider the case when the function f is the natural logarithm function. The condition number is |1/ln(x)|. For x near 1 this becomes very large. Computing ln(x) near x=1 is an ill conditioned probem subject to a large relative error.

The generalization of the concept of condition number is very important in other forms of computation.

Cancellation and Near Cancellation: Potential Disaster for Computational Accuracy

The issue involved here is that any operation that results in the true value of a computation be zero or near zero results in the amplification of the significance of errors from previous computations. For situations of this sort the solution is to reformulate the computation scheme to avoid cancellation.

Illustration of the Problem of Cancellation
and the Means of Avoiding It

Solving a Quadratic Equation

One formula etched upon the mind of every algebra student is that the solutions to the quadratic equation ax²+bx+c=0 are given by:

x = (-b±√(b²−4ac))/2a

While this formula is mathematical correct there is a computation problem when one root is very small. In such a case 4ac is small compared with b² and thus the numerator is the difference of two nearly equal quantities. In such a case any truncation or rounding errors in the computation result in a high relative error in the computed result.

(To be continued.)

HOME PAGE OF applet-magic
HOME PAGE OF Thayer Watkins