Suppose (x0,y0) is a local minimizer. There are two cases:
1. ∇f(x0,y0)≠0: we claim that ∇f(x0,y0) is perpendicular to the tangent space to the unit circle h−1({0}) at (x0,y0). If this were not the case, we would obtain a contradiction by looking at the level sets of f, to which ∇f is perpendicular: moving along the circle in a tangent direction making an obtuse angle with ∇f(x0,y0) would strictly decrease f near (x0,y0). Since the tangent space is one dimensional and ∇h(x0,y0) is also perpendicular to it, ∇f(x0,y0)=λ∇h(x0,y0) for some λ.
2. ∇f(x0,y0)=0: then ∇f(x0,y0)=λ∇h(x0,y0) holds trivially with λ=0.
In either case, at a local minimizer the gradient of the function to be minimized is parallel to the gradient of the constraint.
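To make this concrete (an illustration, not part of the original notes), the following sketch minimizes the assumed objective f(x,y)=x+y over the unit circle by brute force and checks that ∇f and ∇h are parallel at the minimizer:

```python
import numpy as np

# Illustrative check (f(x, y) = x + y is an assumed example objective):
# minimize f on the unit circle h(x, y) = x^2 + y^2 - 1 = 0 and verify
# that grad f is parallel to grad h at the minimizer.
grad_f = lambda p: np.array([1.0, 1.0])   # gradient of f(x, y) = x + y
grad_h = lambda p: 2.0 * p                # gradient of h(x, y) = x^2 + y^2 - 1

theta = np.linspace(0.0, 2.0 * np.pi, 200_000)
pts = np.column_stack([np.cos(theta), np.sin(theta)])
p0 = pts[np.argmin(pts.sum(axis=1))]      # minimizer, approx -(1, 1)/sqrt(2)

gf, gh = grad_f(p0), grad_h(p0)
print(gf[0] * gh[1] - gf[1] * gh[0])      # 2D "cross product" approx 0: parallel
```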
Defn. Surface
For us, a surface is the set of common zeroes of a finite set of C1 functions.
Defn. Differentiable Curve
For us, a differentiable curve on the surface M⊆Rn is the image of a C1 function x:(a,b)→M.
Defn. Tangent Vector
Let x(s) be a differentiable curve on M that passes through x0∈M at time s=0, so that x(0)=x0. The velocity vector v = x′(0) = (d/ds)|s=0 x(s) of x(s) at x0 is, for us, said to be a tangent vector to the surface M at x0. The set of all tangent vectors to M at x0 is called the tangent space to M at x0 and is denoted by Tx0M.
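For example, take M=S1⊆R2 with h(x,y)=x2+y2−1, and the curve x(s)=(cos s, sin s), which satisfies x(0)=(1,0). Its velocity there is v=x′(0)=(−sin 0, cos 0)=(0,1), a tangent vector to the circle at (1,0); note that v⋅∇h((1,0))=(0,1)⋅(2,0)=0, consistent with Claim 1 below.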
Defn. Regular Point
Let M={x∈Rn:h1(x)=⋯=hk(x)=0} be a surface. A point x0∈M is said to be a regular point of M if ∇h1(x0),…,∇hk(x0) are linearly independent.
Claim 1
At a regular point x0∈M, the tangent space Tx0M is given by
Tx0M = {y∈Rn : ∇h(x0)y = 0}, where ∇h(x0) denotes the k×n matrix whose rows are ∇h1(x0),…,∇hk(x0).
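Claim 1 can be checked numerically on a concrete surface. A minimal sketch, assuming scipy is available, with the illustrative constraints h1(x,y,z)=x2+y2+z2−1 and h2(x,y,z)=z, so that M is the equator of the unit sphere:

```python
import numpy as np
from scipy.linalg import null_space

# M = unit sphere intersected with the plane {z = 0} (the equator),
# cut out by h1 = x^2 + y^2 + z^2 - 1 and h2 = z.
x0 = np.array([1.0, 0.0, 0.0])
jac = np.array([
    2.0 * x0,            # grad h1 at x0 = (2, 0, 0)
    [0.0, 0.0, 1.0],     # grad h2 at x0 = (0, 0, 1)
])
# Basis of the null space of the Jacobian: (0, 1, 0) up to sign,
# which is exactly the tangent line to the equator at (1, 0, 0).
print(null_space(jac))
```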
Claim 2
Let f,h1,…,hk be C1 functions on the open set Ω⊆Rn. Let x0∈M={x∈Ω:h1(x)=⋯=hk(x)=0}. Suppose x0 is a local minimizer of f subject to the constraints hi(x)=0. Then ∇f(x0) is perpendicular to Tx0M.
proof. Without loss of generality, suppose Ω=Rn. Let v∈Tx0M. Then v=x′(0) for some differentiable curve x(s) in M with x(0)=x0. Since x0 is a local minimizer of f on M, s=0 is a local minimizer of s↦f(x(s)), so 0=(f∘x)′(0)=∇f(x0)⋅x′(0)=∇f(x0)⋅v.
Thrm. First Order Necessary Conditions (Lagrange Multipliers)
Claim. Let f,h1,…,hk be C1 functions on some open Ω⊆Rn. Suppose x0 is a local minimizer of f subject to the constraints h1(x)=⋯=hk(x)=0, and that x0 is also a regular point of these constraints. Then there are λ1,…,λk∈R ("Lagrange multipliers") such that
∇f(x0)+λ1∇h1(x0)+⋯+λk∇hk(x0)=0
proof. Since x0 is regular, Claim 1 gives Tx0M=span({∇h1(x0),…,∇hk(x0)})⊥. By Claim 2 (a lemma from last class), ∇f(x0)∈(Tx0M)⊥. Therefore ∇f(x0)∈span({∇h1(x0),…,∇hk(x0)}), since (W⊥)⊥=W for any subspace W of a finite dimensional inner product space. Writing ∇f(x0)=−λ1∇h1(x0)−⋯−λk∇hk(x0) gives the claim. We are done.
Example: Max volume of Box with constrained surface area
Given a fixed area A>0, how do we construct a box of maximum volume with surface area A? If the box has side lengths x,y,z, then its volume is V(x,y,z)=xyz and its surface area is A(x,y,z)=2(xy+xz+yz). Our problem is stated as a maximization problem, so we have to convert it to a minimization problem. Let f=−V. We are therefore dealing with the problem
$$\begin{aligned}
\text{minimize}\quad & f(x,y,z) = -xyz \\
\text{subject to}\quad & h(x,y,z) = A(x,y,z) - A = 0,\quad x,y,z \ge 0
\end{aligned}$$
But we don't know how to deal with inequality constraints right now, so we have to make some changes. Note that if any one of x,y,z is zero, then the volume is zero. Therefore the problem we want to consider is really the problem
$$\begin{aligned}
\text{minimize}\quad & f(x,y,z) \\
\text{subject to}\quad & h(x,y,z) = 0,\quad x,y,z > 0
\end{aligned}$$
Now, if Ω={(x,y,z)∈R3:x,y,z>0}, then the above minimization problem may be solved using the first order necessary condition we gave above, for the set Ω is open.
Suppose (x0,y0,z0) is a local minimizer of f subject to the constraint h(x,y,z)=0. This point is regular because we are only considering points whose coordinates are all positive (so ∇h(x0,y0,z0)≠0). Then there is a λ∈R such that ∇f(x0,y0,z0)+λ∇h(x0,y0,z0)=0. Therefore
$$(-y_0z_0,\ -x_0z_0,\ -x_0y_0) + \lambda\,(2y_0+2z_0,\ 2x_0+2z_0,\ 2x_0+2y_0) = (0,0,0).$$
Equivalently,
$$\begin{aligned}
2\lambda(y_0+z_0) &= y_0z_0 \\
2\lambda(x_0+z_0) &= x_0z_0 \\
2\lambda(x_0+y_0) &= x_0y_0
\end{aligned}$$
Add all of these equations together:
$$2\lambda(2x_0+2y_0+2z_0) = x_0z_0 + x_0y_0 + y_0z_0 = \frac{A}{2} > 0,$$
implying that λ>0. Multiplying the first equation by x0 and the second by y0, the first two equations tell us that
$$2\lambda(x_0y_0 + x_0z_0) = x_0y_0z_0, \qquad 2\lambda(x_0y_0 + y_0z_0) = x_0y_0z_0.$$
Subtracting these two equations gives 2λ(x0z0−y0z0)=0. Cancelling the z0's (legitimate since z0>0) gives 2λ(x0−y0)=0, and since λ>0, we have x0=y0. Since we could have done the same thing with the other pairs of equations, we get x0=y0=z0.
Physically, this tells us that in order to maximize the volume of a rectangular solid of fixed area, we must make a cube. Note that we haven't actually solved the maximization problem; we've only figured out what form its solutions must take.
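As a sanity check (not in the original notes), one can hand the problem to a numerical solver and confirm that it returns a cube. A minimal sketch, assuming scipy is available and fixing A=6 so that the optimal side length is √(A/6)=1:

```python
import numpy as np
from scipy.optimize import minimize

A = 6.0  # fixed surface area; with A = 6 the optimal cube has side sqrt(A/6) = 1

f = lambda v: -v[0] * v[1] * v[2]                            # negative volume
h = lambda v: 2 * (v[0]*v[1] + v[0]*v[2] + v[1]*v[2]) - A    # area constraint

res = minimize(f, x0=[0.5, 1.0, 1.5], method="SLSQP",
               constraints=[{"type": "eq", "fun": h}],
               bounds=[(1e-6, None)] * 3)
print(res.x)  # approx [1, 1, 1], i.e. a cube of side sqrt(A/6)
```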
Thrm. Second Order Necessary Conditions
Claim. Let f,h1,…,hk be C2 on some open set Ω⊆Rn. Suppose x0 is a regular point which is a local minimizer of f subject to the constraints h1(x)=⋯=hk(x)=0. Then:
1. There are λ1,…,λk∈R such that
∇f(x0) + λ1∇h1(x0) + ⋯ + λk∇hk(x0) = 0.
2. The matrix L(x0) = ∇2f(x0) + ∑λi∇2hi(x0) is positive semi-definite on Tx0M.
proof. Let x(s) be a differentiable curve in M with x(0)=x0, and expand s↦f(x(s)) to second order exactly as in the proof of the sufficient conditions below. Condition 1. kills the first order terms, leaving f(x(s))−f(x0) = (1/2)s2 x′(0)⋅L(x0)x′(0) + o(s2), and this is ≥ 0 for small s because x0 is a local minimizer. By taking s small it follows that 0 ≤ (1/2)x′(0)⋅L(x0)x′(0). Since any tangent vector v∈Tx0M can be described as the tangent vector to a curve in M through x0, it follows that L(x0) is positive semi-definite on Tx0M.
Thrm. Second Order Sufficient Conditions
Claim. Consider functions f,h1,…,hk which are C2 on the open Ω⊆Rn. Suppose x0 is a regular point of the constraints given by h1(x)=⋯=hk(x)=0. Let M=⋂hi−1({0}). Suppose there exist λ1,…,λk∈R such that
∇f(x0)+∑λi∇hi(x0)=0
and such that the matrix
L(x0)=∇2f(x0)+∑λi∇2hi(x0)
is positive definite on Tx0M.
Then x0 is a strict local minimizer of f on M.
proof. Recall that if L(x0) is positive definite on Tx0M, then there is an a>0 such that v⋅L(x0)v≥a∥v∥2 for all v∈Tx0M. (This is very easily proven by diagonalizing the matrix.) Let x(s) be a smooth curve in M such that x(0)=x0, and normalize the curve so that ∥x′(0)∥=1. We have
$$\begin{aligned}
f(x(s)) - f(x(0)) &= s\,\frac{d}{ds}\Big|_{s=0} f(x(s)) + \frac{1}{2}s^2\,\frac{d^2}{ds^2}\Big|_{s=0} f(x(s)) + o(s^2) \\
&= s\,\frac{d}{ds}\Big|_{s=0}\Big[f(x(s)) + \sum_i \lambda_i h_i(x(s))\Big] + \frac{1}{2}s^2\,\frac{d^2}{ds^2}\Big|_{s=0}\Big[f(x(s)) + \sum_i \lambda_i h_i(x(s))\Big] + o(s^2) \\
&= s\,\underbrace{\Big[\nabla f(x_0) + \sum_i \lambda_i \nabla h_i(x_0)\Big]}_{=0\text{ by 1.}}\cdot\,x'(0) + \frac{1}{2}s^2\,x'(0)^T L(x_0)\,x'(0) + \frac{1}{2}s^2\,\underbrace{\Big[\nabla f(x_0) + \sum_i \lambda_i \nabla h_i(x_0)\Big]}_{=0\text{ by 1.}}\cdot\,x''(0) + o(s^2) \\
&= \frac{1}{2}s^2\,x'(0)^T L(x_0)\,x'(0) + o(s^2) \\
&\ge \frac{1}{2}s^2 a\,\|x'(0)\|^2 + o(s^2) = \frac{1}{2}s^2 a + o(s^2) = s^2\Big(\frac{a}{2} + \frac{o(s^2)}{s^2}\Big),
\end{aligned}$$
where the second equality uses the fact that hi(x(s))=0 for every s.
For sufficiently small s≠0, the above is positive, so f(x(s))>f(x0) for all sufficiently small s≠0. Since the curve x(s) was arbitrary, x0 is a strict local minimizer of f on M.
Example 1
Recall the box example: maximizing the volume of a box of sides x,y,z≥0 subject to a fixed surface area A>0. We were really minimizing the negative of the volume. We got (x0,y0,z0)=(l,l,l), where l=√(A/6). Our Lagrange multiplier was
$$\lambda = \frac{A}{8(x_0+y_0+z_0)} = \frac{A}{24l} = \frac{l}{4} > 0.$$
We had (after some calculation)
$$L(x_0,y_0,z_0) = \nabla^2 f(x_0,y_0,z_0) + \lambda \nabla^2 h(x_0,y_0,z_0) = (2\lambda - l)\begin{pmatrix} 0 & 1 & 1 \\ 1 & 0 & 1 \\ 1 & 1 & 0 \end{pmatrix} = -\frac{l}{2}\begin{pmatrix} 0 & 1 & 1 \\ 1 & 0 & 1 \\ 1 & 1 & 0 \end{pmatrix},$$
which is positive definite on the tangent space T(l,l,l)M = {v∈R3 : v1+v2+v3=0},
so by the SOSC under equality constraints, our point (x0,y0,z0) is a strict local minimizer of f, i.e. a strict local maximizer of the volume. In fact, it is a strict global maximizer (which is yet to be seen).
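To corroborate the "after some calculation" step numerically, one can restrict L(x0,y0,z0) to the tangent space and inspect its eigenvalues. A minimal sketch, again taking A=6 (so l=1):

```python
import numpy as np
from scipy.linalg import null_space

# Numerical check of the SOSC for the box example, assuming A = 6 (so l = 1).
A, l = 6.0, 1.0
lam = A / (24.0 * l)                      # Lagrange multiplier A/(24 l) = l/4

M = np.ones((3, 3)) - np.eye(3)           # off-diagonal pattern matrix
hess_f = -l * M                           # Hessian of f = -xyz at (l, l, l)
hess_h = 2.0 * M                          # Hessian of h = 2(xy + xz + yz) - A
L = hess_f + lam * hess_h                 # L(x0) = -(l/2) M

grad_h = 4.0 * l * np.ones(3)             # grad h at (l, l, l), a multiple of (1,1,1)
T = null_space(grad_h[None, :])           # orthonormal basis of the tangent space
print(np.linalg.eigvalsh(T.T @ L @ T))    # both eigenvalues approx l/2 > 0
```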
Example 2
Consider the problem
$$\begin{aligned}
\text{minimize}\quad & f(x,y) = x^2 - y^2 \\
\text{subject to}\quad & h(x,y) = y = 0.
\end{aligned}$$
Then
$$\nabla f(x,y) + \lambda \nabla h(x,y) = \begin{pmatrix} 2x \\ -2y \end{pmatrix} + \lambda \begin{pmatrix} 0 \\ 1 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}$$
implying that λ=0 and that (x,y)=(0,0) is our candidate local minimizer. Since ∇h(x,y)=(0,1)≠(0,0), the candidate is a regular point. We have
$$L(0,0) = \begin{pmatrix} 2 & 0 \\ 0 & -2 \end{pmatrix} + 0\cdot\begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix} = \begin{pmatrix} 2 & 0 \\ 0 & -2 \end{pmatrix}$$
which is not positive semi-definite on all of R2. What about on the tangent space T(0,0)(x-axis)=(x-axis)? For v=(t,0) we have v⋅L(0,0)v=2t2>0 whenever t≠0, so L(0,0) is positive definite on the x-axis, and by the SOSC that we just proved, (0,0) is a strict local minimizer of f on the x-axis. Thinking of level sets, this is intuitively true.
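A quick numerical confirmation of this dichotomy (indefinite on R2, yet positive definite on the tangent space), assuming numpy is available:

```python
import numpy as np

# L(0,0) is indefinite on R^2 but positive definite on T_(0,0)M, the x-axis.
L = np.diag([2.0, -2.0])
print(np.linalg.eigvalsh(L))              # eigenvalues -2 and 2: indefinite on R^2
v = np.array([1.0, 0.0])                  # tangent direction along the x-axis
print(v @ L @ v)                          # = 2 > 0: positive definite on the x-axis
```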
Example 3
Consider the problem
$$\begin{aligned}
\text{minimize}\quad & f(x,y) = (x-a)^2 + (y-b)^2 \\
\text{subject to}\quad & h(x,y) = x^2 + y^2 - 1 = 0.
\end{aligned}$$
Let us assume that (a,b) satisfies a2+b2>1. We have ∇h(x,y)=(2x,2y), which is non-zero on S1, implying that every point of S1 is a regular point. Lagrange tells us that
$$\begin{pmatrix} 2(x-a) \\ 2(y-b) \end{pmatrix} + \lambda \begin{pmatrix} 2x \\ 2y \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}$$
as well as x2+y2=1. This may be written
$$\begin{aligned}
(1+\lambda)x &= a \\
(1+\lambda)y &= b \\
x^2 + y^2 &= 1
\end{aligned}$$
By our assumption that a2+b2>1, we have λ≠−1 (if λ=−1, the first two equations would force a=b=0). Therefore
$$\begin{pmatrix} x \\ y \end{pmatrix} = \frac{1}{1+\lambda}\begin{pmatrix} a \\ b \end{pmatrix}$$
which implies that
$$\frac{1}{1+\lambda} = \frac{1}{\sqrt{a^2+b^2}}$$
by the third equation (taking the positive square root; the negative root gives the antipodal critical point, which maximizes f). Therefore
$$\begin{pmatrix} x_0 \\ y_0 \end{pmatrix} = \frac{1}{\sqrt{a^2+b^2}}\begin{pmatrix} a \\ b \end{pmatrix}$$
Thinking of level sets, this is intuitively true: (x0,y0) is the radial projection of (a,b) onto the circle. We compute
$$L(x_0,y_0) = \begin{pmatrix} 2 & 0 \\ 0 & 2 \end{pmatrix} + \lambda \begin{pmatrix} 2 & 0 \\ 0 & 2 \end{pmatrix} = \underbrace{(1+\lambda)}_{>0}\begin{pmatrix} 2 & 0 \\ 0 & 2 \end{pmatrix},$$
which is positive definite on all of R2, in particular on T(x0,y0)S1.
So by the SOSC that we proved, (x0,y0) is a strict local minimizer of f on S1. In fact, this point is a global minimizer of f on S1: f attains a global minimum on the compact set S1, any minimizer must solve the Lagrange system above, and of its two solutions (the two square roots) the point (x0,y0) is the one with the smaller value of f.
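A numerical cross-check of this example, assuming scipy is available and taking the purely illustrative point (a,b)=(3,4):

```python
import numpy as np
from scipy.optimize import minimize

# Closest point on the unit circle to the hypothetical point (a, b) = (3, 4).
a, b = 3.0, 4.0
f = lambda p: (p[0] - a) ** 2 + (p[1] - b) ** 2
h = lambda p: p[0] ** 2 + p[1] ** 2 - 1.0

res = minimize(f, x0=[1.0, 0.0], method="SLSQP",
               constraints=[{"type": "eq", "fun": h}])
print(res.x)                                   # approx (0.6, 0.8)
print(np.array([a, b]) / np.hypot(a, b))       # predicted (a, b)/sqrt(a^2 + b^2)
```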
Example 4: Proof of the Lagrange Multiplier Equation in a Special Case
For a special case, we will derive the Lagrange multipliers equation. Suppose we are working with C1 functions f,h. Our problem is
$$\begin{aligned}
\text{minimize}\quad & f(x,y,z) \\
\text{subject to}\quad & g(x,y,z) = z - h(x,y) = 0.
\end{aligned}$$
That is, we are minimizing f(x,y,z) on the graph Γh of h. The Lagrange equation tells us that
$$\nabla f(x_0,y_0,z_0) + \lambda \nabla g(x_0,y_0,z_0) = 0$$
for some λ∈R.
We will derive the above formula by recasting the constrained problem as an unconstrained minimization problem
$$\min_{(x,y)\in\mathbb{R}^2} F(x,y)$$
for some function F. We will then apply the first order necessary condition for unconstrained minimization and rewrite it as the equation we would like to prove.
Define F(x,y)=f(x,y,h(x,y)). The constrained minimization problem is therefore equivalent to the unconstrained one. By our theory of unconstrained minimization, ∇F(x0,y0)=(0,0). That is, by the chain rule,
$$\frac{\partial f}{\partial x} + \frac{\partial f}{\partial z}\frac{\partial h}{\partial x} = 0, \qquad \frac{\partial f}{\partial y} + \frac{\partial f}{\partial z}\frac{\partial h}{\partial y} = 0$$
at (x0,y0,z0), where z0=h(x0,y0). Set λ = −(∂f/∂z)(x0,y0,z0) and note that ∇g = (−∂h/∂x, −∂h/∂y, 1). Then the two equations above, together with the identity ∂f/∂z + λ = 0, are exactly the Lagrange equation ∇f(x0,y0,z0) + λ∇g(x0,y0,z0) = 0.
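One can verify this chain-rule bookkeeping symbolically. The sketch below, assuming sympy is available and using the purely illustrative choices f(x,y,z)=x2+yz and h(x,y)=xy, checks that the components of ∇F coincide with the x- and y-components of the Lagrange expression with λ=−∂f/∂z, restricted to the graph of h:

```python
import sympy as sp

# Illustrative choices of f and h (assumptions, not from the notes).
x, y, z = sp.symbols("x y z")
f = x**2 + y*z
h = x*y
F = f.subs(z, h)                       # F(x, y) = f(x, y, h(x, y))

lam = -sp.diff(f, z).subs(z, h)        # lambda = -f_z on the graph of h
g = z - h                              # the constraint g(x, y, z) = z - h(x, y)
for var in (x, y):
    lagrange = (sp.diff(f, var) + lam * sp.diff(g, var)).subs(z, h)
    print(sp.simplify(sp.diff(F, var) - lagrange))   # prints 0: the equations agree
```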