﻿ Second Order Conditions for Optimization, Constrained and Unconstrained: The Hessian and Bordered Hessian Matrix
San José State University
Department of Economics

applet-magic.com
Thayer Watkins
Silicon Valley
USA

 Second Order Conditions for Optimization: Constrained and Unconstrained

The purpose of this reading is the derivation of the second order conditions for optimization so that the reader will not only understand what those second order conditions are but also why those are the proper conditions. For the unconstrained case the conditions are stated in terms of the matrix of second derivatives called the Hessian matrix. the Hessian matrix is intuitively understandable. the conditions for the constrained case can be easily stated in terms of a matrix called the bordered Hessian. Generation after generation of applied mathematics students have accepted the bordered Hessian without a clue as to why it is the relevant entity.

In order to fulfill the goal of providing an intuitive derivation of the second order conditions the one, two and three variable cases will be given first before moving to the general n variable case.

## The One Variable Case

Although this case is so simple it seems almost trivial its analysis sets the stage for the more complex cases.

Let f(x) be a differentiable function. The quadratic approximation to f(x) near some point x0 is given by

#### f(x) = f(x0) + fx(x0)(x-x0) + ½fxx(x0)(x-x0)2or, equivalently df = fx(x0)dx + ½fxx(x0)(dx)2

where dx=(x-x0) and df=f(x)-f(x0).

A critical point is a value of x0 such that fx(x0)=0. If fx(x0)=0 then

#### df = ½fxx(x0)(dx)2

For a minimum df has to be positive and since (dx)2 is always positive it means that fxx(x0) must be positive. On the other hand for a maximum df has to be negative and that requires that fxx(x0) be negative. The Hessian matrix for this case is just the 1×1 matrix [fxx(x0)]. (Hereafter the point at which the second derivatives are evaluated will not be expressed explicitly so the Hessian matrix for this case would be said to be [fxx].

There is no corresponding constrained optimization problems for this one variable case.

## The Two Variable Case

The quadratic approximation about the point (x0, y0) for a differentiable function f(x,y) is

#### df = fxdx + fydy + ½[fxxdxdx + fxydxdy + fyxdydx + fyydydy] which can be expressed in the form df = fxdx + fydy + ½(dx, dy)H(dx, dy)T

where (dx, dy)T is the column vector which is the transpose of the row vector (dx, dy) and H is the matrix of second derivatives; i.e.,

#### | fxx    fxy | H = |              |   | fyx    fyy |

For the unconstrained case a critical point is one such that fx=0 and fy=0 so

#### df = ½(dx, dy)H(dx, dy)T

For a minimum the second order condition is that H be a positive definite matrix. The conditon for a matrix to be positive definite is that its principal minors all be positive. For a maximum, H must be a negative definite matrix which will be the case if the principal minors alternate in sign starting with a negative sign.

For the constrained case a critical point is defined in terms of the Lagrangian multiplier method. Suppose the constraint is

#### pxx + pyy = I in which case the first order conditions are fx = λpx fy = λpy

where λ, the Lagrangian multiplier, is chosen so as to have the critical values satisfy the constraint.

The quadratic approximation can now be expressed as

#### df = λpxdx + λpydy + ½(dx, dy)H(dx, dy)T

Since any deviations from the critical point must also satisfy the constraint

therefore

#### df = λ(pxdx + pydy) + ½(dx, dy)H(dx, dy)T = ½(dx, dy)H(dx, dy)T

but now H does not have to be either positive or negative definite for an extreme (maximum or minimum) because dx and dy are not unrestricted; i.e., only dx and dy such that pxdx + pydy = 0 are allowed. This makes specifying the conditions on H very difficult. The further analysis of the constrained case will be postponed until after the consideration of the unconstrained case for three variables.

## The Three Variable Case

For a trinary function f(x,y,z) the quadratic approximation of the deviations is

#### df = fxdx + fydy + fzdz + ½(dx, dy, dz)H(dx, dy, dz)T

where now H is given by

#### | fxx    fxy    fxz | H = | fyx    fyy    fyz |   | fzx    fzy    fzz |

As in the one and two variable unconstrained case the first order terms vanish and the conditions for a minimum is the positive definiteness of H and similarly negative definiteness for the maximum. Those conditions in turn can be stated in terms of the signs of the principal minors of H. There is nothing new here for this case.

The significance of this case is that the constrained two variable case can be restated in terms of a three variable case through the use of the Lagrangian multiplier method.

In the Lagrangian multiplier method the optimization problem of minimizing f(x,y) with respect to x and y subject to the constraint pxx + pyy = I is transformed into a three variable problem of unconstrained minimizing

#### L(x,y,λ) = f(x,y) - λ(pxx + pyy - I)

with respect to x, y and λ.

The first order condition for λ is

#### Lλ = pxx + pyy - I = 0 which is equivalent to satisfying the constraint.

The quadratic approximation for L(x,y,λ) is then

#### dL = Lxdx + Lydy + Lλdλ + ½(dx, dy, dλ)H*(dx, dy, dλ)Twhich on the basis of the definition of L reduces to dL = fxdx + fydy + (pxx + pyy - I)dλ + ½(dx, dy, dλ)H*(dx, dy, dλ)Tand for the first order conditions dL = λpxdx + λpydy + (pxx + pyy - I)dλ + ½(dx, dy, dλ)H*(dx, dy, dλ)T

Because of the satisfication of the constraints the first two terms reduce to zero and likewise the third term. Thus the value of dL is given by

#### dL = + ½(dx, dy, dλ)H*(dx, dy, dλ)T

The second order conditons for the constrained two variable case then can be stated in terms of the Hessian H* for the corrsponding three variable case. The character of H* is given by first noting that

#### Lx = fx - λpx Ly = fy - λpy Lλ = -(pxx + pyy - I)

And thus the second derivatives are given as:

#### | fxx    fxy    -px | H* = | fyx    fyy    -py |   | -px    -py    0   |

So this is the enigmatic bordered Hessian. The positive or negative definiteness of H* then constitutes the second order conditions for the constrained optimization problem.