The Implicit Function Theorem and Its Proof
San José State University

applet-magic.com
Thayer Watkins
Silicon Valley
USA

The Implicit Function Theorem (IFT) is a generalization of the result that

#### If G(x,y)=C, where G(x,y) is a function with continuous partial derivatives and C is a constant, and ∂G/∂y≠0 at some point P satisfying the equation, then y may be expressed as a function of x in some domain about P; i.e., there exists a function g over that domain such that y=g(x).

Note that without any loss of generality the constant C can be taken to be 0. If G*(x,y)=C then G(x,y)=G*(x,y)-C=0.
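A small numerical illustration may help fix ideas. The sketch below uses the unit circle, G(x,y)=x²+y²-1=0, which is an example chosen here and not one taken from the text. Near P=(0,1) the derivative ∂G/∂y is non-zero and y=g(x)=√(1-x²) works, whereas at (1,0) the derivative vanishes and two values of y satisfy the equation for x just below 1.

```python
import math

def G(x, y):
    """Unit circle written as G(x, y) = 0 (the constant C = 1 moved across)."""
    return x**2 + y**2 - 1.0

def dG_dy(x, y):
    """Partial derivative of G with respect to y."""
    return 2.0 * y

def g(x):
    """Explicit branch through P = (0, 1): y = g(x) = sqrt(1 - x^2)."""
    return math.sqrt(1.0 - x**2)

# Near P = (0, 1), dG/dy is non-zero and y = g(x) satisfies G = 0.
for x in (-0.3, 0.0, 0.3):
    assert abs(G(x, g(x))) < 1e-12

# At (1, 0), dG/dy vanishes. For x slightly below 1 both +y and -y
# satisfy G = 0, so y is not a single-valued function of x there.
x = 0.99
y = math.sqrt(1.0 - x**2)
print(dG_dy(0.0, 1.0), dG_dy(1.0, 0.0))
print(abs(G(x, y)) < 1e-12, abs(G(x, -y)) < 1e-12)
```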

The IFT is a very important tool in economic analysis, so the conditions under which it holds must be carefully specified. The simplest way to do this is to give a formal, explicit proof of the theorem. First a proof of an artificially limited version of the IFT will be given; this provides an understanding of, and a guide to, the proof of the full version.

• Theorem 1: Let F(x,y,z) be a continuous function with continuous partial derivatives defined on an open set S containing the point P=(x0,y0,z0), and let F(x0,y0,z0)=0. If ∂F/∂z ≠ 0 at P then there exists a region R about (x0,y0) such that for any (x,y) in R there is a unique z near z0 such that F(x,y,z)=0.

This means that within R z can be represented as a function of x and y; i.e., z=f(x,y).

Proof:
Without any loss of generality the set S can be taken to be a parallelepiped (a box) centered on P, because within the open set S there will be at least one such box. Because ∂F/∂z is continuous, within any such box there will be another box containing P throughout which ∂F/∂z has the same sign as at P. Let the box be given as a triple (a,b,c) such that

#### |x-x0| ≤ a, |y-y0| ≤ b, |z-z0| ≤ c

Without any loss of generality the sign of ∂F/∂z at P can be taken to be positive.

Since F(x0,y0,z0)=0 and F is strictly increasing in z throughout the box, this means that

#### F(x0,y0,z0+c) > 0 and F(x0,y0,z0-c) < 0

Now consider any point (x1,y1) such that

#### |x1-x0| ≤ a, |y1-y0| ≤ b

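Because F(x0,y0,z0+c) > 0 and F(x,y,z) is continuous, F(x1,y1,z0+c) > 0 for all (x1,y1) sufficiently close to (x0,y0); likewise F(x1,y1,z0-c) < 0. If necessary, a and b are reduced so that both inequalities hold throughout the box. With x and y held fixed at x1 and y1, G(z)=F(x1,y1,z) is a continuous function such that G(z0+c) > 0 and G(z0-c) < 0. Therefore, by the Intermediate Value Theorem, there is some z between z0-c and z0+c such that G(z)=0; i.e., F(x1,y1,z)=0. Moreover this value of z is unique, because ∂F/∂z > 0 makes G(z) strictly increasing, so it can cross zero only once. Since this holds for any (x,y) such that

#### |x-x0| ≤ a, |y-y0| ≤ b

over this domain there is a function z=f(x,y) such that F(x,y,z)=0.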
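The sign-change argument of the proof can be turned directly into a computation. The following is a minimal sketch, assuming the illustrative function F(x,y,z)=x²+y²+z²-1 with P=(0,0,1) and c=0.5; these choices and the helper name `solve_z` are assumptions made here, not part of the proof. Bisection plays the role of the Intermediate Value Theorem, and the error raised when the signs at z0-c and z0+c do not differ shows why the region R may have to be small.

```python
def F(x, y, z):
    """Illustrative example only: F = 0 on the unit sphere."""
    return x**2 + y**2 + z**2 - 1.0

def solve_z(x, y, z_lo, z_hi, tol=1e-12):
    """Find z in [z_lo, z_hi] with F(x, y, z) = 0 by bisection.

    Mirrors the proof: F must be negative at z_lo and positive at z_hi,
    and increasing in z in between.
    """
    if not (F(x, y, z_lo) < 0.0 < F(x, y, z_hi)):
        raise ValueError("no sign change: (x, y) lies outside the region R")
    while z_hi - z_lo > tol:
        z_mid = 0.5 * (z_lo + z_hi)
        if F(x, y, z_mid) < 0.0:
            z_lo = z_mid   # root lies above z_mid
        else:
            z_hi = z_mid   # root lies at or below z_mid
    return 0.5 * (z_lo + z_hi)

# P = (0, 0, 1) with c = 0.5, so z ranges over [0.5, 1.5],
# where dF/dz = 2z is positive.
z0, c = 1.0, 0.5

def f(x, y):
    """The implicit function z = f(x, y) near P."""
    return solve_z(x, y, z0 - c, z0 + c)

print(f(0.0, 0.0))
print(f(0.3, 0.4))
```

Calling `f(0.9, 0.0)` raises the error, because F(0.9, 0, 0.5) is already positive: the point (0.9, 0) lies outside any region R that works with this choice of c.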

Corollary 1: The function f(x,y) determined above is continuous.

Corollary 2: The partial derivatives of the function f(x,y) determined above are given by:

#### ∂f/∂x = -(∂F/∂x)/(∂F/∂z), ∂f/∂y = -(∂F/∂y)/(∂F/∂z)

Proof:
The Mean Value Theorem says that for a function z=h(x) with a continuous derivative

#### Δz = h(x+Δx)-h(x) = h'(x + θΔx)Δx for some θ between 0 and 1.

This can be extended to a function of two variables w=G(x,z) with continuous partial derivatives so that

#### Δw = G(x+Δx,z+Δz)-G(x,z)=[(∂G/∂x)Δx+(∂G/∂z)Δz] where the partial derivatives are evaluated at (x+θΔx,z+θΔz) for some θ between 0 and 1.

This is not the only generalization of the Mean Value Theorem but it is sufficient for the purpose here. Let G(x,z) be F(x,y,z) with y held fixed. Then

#### ΔF = (∂F/∂x)Δx + (∂F/∂z)Δz where the partial derivatives are evaluated at (x+θΔx,z+θΔz) for some θ between 0 and 1.

Now let Δz be the change in z for z on the surface F(x,y,z)=0. Thus ΔF=0 and hence

#### Δz/Δx = - (∂F/∂x)/(∂F/∂z) where the partial derivatives are evaluated at (x+θΔx,z+θΔz) for some θ between 0 and 1.

Because the partial derivatives are continuous, in the limit as Δx goes to zero this becomes

#### ∂z/∂x = ∂f/∂x = - (∂F/∂x)/(∂F/∂z).

Likewise, by an analogous procedure,

#### ∂z/∂y = ∂f/∂y = - (∂F/∂y)/(∂F/∂z).
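The formulas of Corollary 2 can be checked numerically. The sketch below again assumes the illustrative function F(x,y,z)=x²+y²+z²-1 near P=(0,0,1), for which the explicit solution z=f(x,y)=√(1-x²-y²) is known; the function names are hypothetical. It compares -(∂F/∂x)/(∂F/∂z) and -(∂F/∂y)/(∂F/∂z) with central finite differences of f.

```python
import math

def F(x, y, z):
    """Illustrative example only: F = 0 on the unit sphere."""
    return x**2 + y**2 + z**2 - 1.0

def f(x, y):
    """Explicit solution z = f(x, y) of F = 0 near P = (0, 0, 1)."""
    return math.sqrt(1.0 - x**2 - y**2)

def implicit_partials(x, y):
    """(df/dx, df/dy) from Corollary 2, using the exact partials of F."""
    z = f(x, y)
    Fx, Fy, Fz = 2.0 * x, 2.0 * y, 2.0 * z
    return -Fx / Fz, -Fy / Fz

def finite_difference_partials(x, y, h=1e-6):
    """Central differences of f itself, for comparison."""
    dfdx = (f(x + h, y) - f(x - h, y)) / (2.0 * h)
    dfdy = (f(x, y + h) - f(x, y - h)) / (2.0 * h)
    return dfdx, dfdy

print(implicit_partials(0.3, 0.4))
print(finite_difference_partials(0.3, 0.4))
```

The two printed pairs should agree to within the accuracy of the finite-difference approximation.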

• Theorem 2: Let F(x1,x2,...,xn,z) be a continuous function with continuous partial derivatives defined on an open set S containing the point P=(x1(P),x2(P),...,xn(P),z(P)), and let F=0 at P. If ∂F/∂z ≠ 0 at P then there exists a region R about (x1(P),x2(P),...,xn(P)) such that for any (x1,x2,...,xn) in R there is a unique z near z(P) such that F(x1,x2,...,xn,z)=0 and thus over R z=f(x1,x2,...,xn).

The proof is essentially the same as for Theorem 1, but some unnecessary restrictions in the proof can be removed. Within the set S there will be a parallelepiped containing P and specified by its lower and upper corner points (X1L,...,XnL,zL) and (X1U,...,XnU,zU) such that:

#### X1L < x1 < X1U, ..., XnL < xn < XnU, zL < z < zU

and ∂F/∂z has the same sign as at P.

Then F(x1(P),...,xn(P),zU) and F(x1(P),...,xn(P),zL) will have opposite signs. For any point Q within the parallelepiped (narrowed in the x-directions if necessary), ignoring the coordinate z,

#### F(x1(Q),...,xn(Q),zU) and F(x1(Q),...,xn(Q),zL)

will, by continuity, have the same signs as for P and are thus of opposite sign. For a fixed Q then G(z)=F(x1(Q),...,xn(Q),z) has opposite signs at zU and zL and therefore there is a z between these two levels such that G(z)=0. Because ∂F/∂z does not change sign, G(z) is strictly monotonic and this z is unique. This value of z is a function of (x1(Q),...,xn(Q)); i.e., z=f(x1(Q),...,xn(Q)).

• Theorem 3: Let F(x,y,u,v) and G(x,y,u,v) be continuous functions with continuous partial derivatives. If F(x,y,u,v)=0 and G(x,y,u,v)=0 at some point P in an open set S and the Jacobian of F and G with respect to u and v, J = (∂F/∂u)(∂G/∂v) - (∂F/∂v)(∂G/∂u), is not zero at P then there exists a domain about the (x,y) coordinates of P on which u=f(x,y) and v=g(x,y).

Proof:
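Consider F(x,y,u,v)=0 at the point P. Because the Jacobian is not zero at P, ∂F/∂u and ∂F/∂v cannot both be zero there; relabeling u and v if necessary, take ∂F/∂v ≠ 0. Apply Theorem 2 to get v as a function of x, y and u; i.e., find v=h(x,y,u). Substitute h(x,y,u) for v in G(x,y,u,v) to obtain H(x,y,u)=G(x,y,u,h(x,y,u)). H(x,y,u)=0 at P and, since ∂h/∂u = -(∂F/∂u)/(∂F/∂v), ∂H/∂u = ∂G/∂u - (∂G/∂v)(∂F/∂u)/(∂F/∂v) = -J/(∂F/∂v), where J = (∂F/∂u)(∂G/∂v) - (∂F/∂v)(∂G/∂u) is the Jacobian. Thus ∂H/∂u ≠ 0, so Theorem 1 applies and there exists f(x,y) such that u=f(x,y). A substitution of this function for u in v=h(x,y,u) gives v=g(x,y)=h(x,y,f(x,y)).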
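The successive elimination used in this proof can be sketched numerically. The example below assumes the illustrative pair F=v+v³+u-x and G=u+u³-v-y, chosen here because the Jacobian with respect to u and v never vanishes and every one-variable equation is increasing, so plain bisection suffices. The bracket half-width `BOX` and the helper `bisect` are hypothetical conveniences, not part of the theorem.

```python
def bisect(func, lo, hi, tol=1e-12):
    """Root of an increasing function with func(lo) < 0 < func(hi)."""
    if not (func(lo) < 0.0 < func(hi)):
        raise ValueError("no sign change on the bracket")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if func(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def F(x, y, u, v):
    """Illustrative example only; dF/dv = 1 + 3v^2 is always positive."""
    return v + v**3 + u - x

def G(x, y, u, v):
    """Illustrative example only."""
    return u + u**3 - v - y

BOX = 10.0  # half-width of the search bracket; an arbitrary illustrative value

def h(x, y, u):
    """Step 1: solve F(x, y, u, v) = 0 for v, giving v = h(x, y, u)."""
    return bisect(lambda v: F(x, y, u, v), -BOX, BOX)

def f(x, y):
    """Step 2: solve H(x, y, u) = G(x, y, u, h(x, y, u)) = 0 for u."""
    return bisect(lambda u: G(x, y, u, h(x, y, u)), -BOX, BOX)

def g(x, y):
    """Step 3: back-substitute, v = g(x, y) = h(x, y, f(x, y))."""
    return h(x, y, f(x, y))

u, v = f(1.0, 0.5), g(1.0, 0.5)
print(u, v)
print(F(1.0, 0.5, u, v), G(1.0, 0.5, u, v))  # both residuals should be near 0
```

The function names f, g and h follow the proof, so each step of the argument corresponds to one function in the sketch.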

• Theorem 4: Let Fi(x1,...,xn,u1,...,um) be a set of m continuous functions with continuous partial derivatives defined on an open set S containing the point P=(x1(P),x2(P),...,xn(P),u1(P),...,um(P)), at which every Fi equals 0. If the Jacobian of the functions {Fi: i=1,...,m} with respect to the variables {ui: i=1,...,m} is not equal to 0 at P then there exists a region R about (x1(P),x2(P),...,xn(P)) such that for any (x1,x2,...,xn) in R there is a unique point (u1,...,um) near (u1(P),...,um(P))

#### such that Fi(x1,x2,...,xn,u1,...,um)=0 for i=1 to m and thus over R ui=fi(x1,x2,...,xn) for i=1 to m.

Proof: The procedure is the same as in the proof of Theorem 3. Theorem 2 is applied to one of the F-functions to obtain one of the u-variables as a function of the x-variables and the other u-variables. This expression for the u-variable as a function of the other variables is substituted into a second F-function, and another u-variable is obtained as a function of the remaining variables. The substitution is repeated until one u-variable is found as a function of only the x-variables. This u-variable is then substituted into the preceding expression for a u-variable. This process is continued until all of the u-variables are obtained as functions only of the x-variables. As in Theorem 3, the non-zero Jacobian ensures that at each step the relevant partial derivative is not zero, so that Theorem 2 can be applied.