Unconstrained Optimization: The Calculus of Variations
Introducing The Calculus of Variations
Consider the problem
\[\min_{u \in \mathcal{A}} F[u],\]
where \(\mathcal{A}\) is a set of functions. Here, \(F\) is a function of functions, often called a functional. This is the general unconstrained calculus of variations problem.
For example, consider
\[\mathcal{A} := \{ u \in C^1([0,1], \mathbb R) : u(0) = a,\ u(1) = b \}.\]
Define \(F : \mathcal{A} \to \mathbb R\) by
\[F[u] := \frac{1}{2} \int_0^1 \left( u'(x)^2 + u(x)^2 \right) dx.\]
To solve the minimization problem \(\min_{u \in \mathcal{A}} F[u]\)
is to find a \(u^* \in \mathcal{A}\) such that \(F[u^*] \leq F[u]\) for all \(u \in \mathcal{A}\).
Thrm. Fundamental Lemma of the Calculus of Variations
Claim. Suppose \(g \in C^0([a,b])\). If \(\int_a^b g(x)v(x) \, dx = 0\) for all test functions \(v\) on \([a,b]\), then \(g \equiv 0\) on \([a,b]\).
proof. Suppose \(g\not\equiv 0\). Then there is an \(x_0 \in (a,b)\) such that \(g(x_0) \neq 0\). (We can ensure that \(x_0\) is in the interior of the interval because of continuity.) Assume without loss of generality that \(g(x_0) > 0\). There exists an open neighbourhood \((c,d)\) of \(x_0\) inside \((a,b)\) on which \(g\) is positive. Let \(v\) be a \(C^1\) function on \([a,b]\) such that \(v > 0\) on \((c,d)\) and \(v = 0\) elsewhere. Then \(v\) is a test function on \([a,b]\), so by hypothesis, \(0 = \int_a^b g(x)v(x) \, dx = \int_c^d g(x)v(x) \, dx > 0\), a contradiction.
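The mechanism of the proof can be illustrated numerically. The sketch below uses a hypothetical continuous \(g\) that is positive on a subinterval, and a \(C^1\) bump supported there; the particular choices of \(g\), the interval, and the bump are arbitrary.

```python
import numpy as np

def integrate(y, x):
    """Trapezoidal rule for samples y on the grid x."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

def bump(x, c, d):
    """A C^1 test function: positive on (c, d), zero elsewhere."""
    return np.where((x > c) & (x < d), (x - c) ** 2 * (d - x) ** 2, 0.0)

# Hypothetical continuous g on [0, 1], positive on (0.3, 1].
g = lambda x: x - 0.3

x = np.linspace(0.0, 1.0, 2001)
c, d = 0.5, 0.8            # subinterval where g > 0
val = integrate(g(x) * bump(x, c, d), x)
print(val > 0)  # True: integrating g against the bump gives a positive number
```

Exactly as in the proof, the product \(g \cdot v\) is nonnegative everywhere and strictly positive on \((c,d)\), so the integral cannot vanish.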
Intuition. Fix \(v \in C^1([0,1],\mathbb R)\) with \(v(0) = v(1) = 0\). Suppose \(u^*\) is a minimizer of \(F\) on \(\mathcal{A}\). Clearly \(u^* + sv \in \mathcal{A}\) for all \(s \in \mathbb R\), since \(v\) vanishes at the endpoints. Define \(f : \mathbb R \to \mathbb R\) by \(f(s) := F[u^* + sv]\). Then \(f(s) \geq f(0)\) for all \(s\), since \(u^*\) is a minimizer of \(F\). Then \(0\) is a minimizer of \(f\), implying \(f'(0) = 0\). How do we actually compute \(f'(0)\)? Since
\[f(s) = \frac{1}{2} \int_0^1 \left( (u^{*'}(x) + s v'(x))^2 + (u^*(x) + s v(x))^2 \right) dx,\]
differentiating under the integral sign gives
\[f'(s) = \int_0^1 \left( (u^{*'}(x) + s v'(x)) v'(x) + (u^*(x) + s v(x)) v(x) \right) dx,\]
implying that
\[f'(0) = \int_0^1 \left( u^{*'}(x) v'(x) + u^*(x) v(x) \right) dx = 0. \tag{$*$}\]
The above equality holds for all \(v \in C^1([0,1], \mathbb R)\) such that \(v(0)=v(1)=0\). This is a primitive form of the first order necessary conditions.
Let us call the functions \(v\) described in \((*)\) the test functions on \([0,1]\). We would like to write \((*)\) in a more useful way. Let us make the simplifying assumption that \(u^*\) is \(C^2\). Integration by parts gives
\[\int_0^1 u^{*'}(x) v'(x) \, dx = \left[ u^{*'}(x) v(x) \right]_0^1 - \int_0^1 u^{*''}(x) v(x) \, dx = -\int_0^1 u^{*''}(x) v(x) \, dx,\]
where the boundary term vanishes because \(v(0) = v(1) = 0\). By substituting this into \((*)\) we obtain
\[\int_0^1 \left( u^*(x) v(x) - u^{*''}(x) v(x) \right) dx = 0.\]
Factor the common \(v\) out to get
\[\int_0^1 \left( u^*(x) - u^{*''}(x) \right) v(x) \, dx = 0.\]
So we have a continuous function \(u^*(x) - u^{*''}(x)\) that integrates to zero against every test function. We claim that any function satisfying this condition must be identically zero. This result and its variants are called the fundamental lemma of the calculus of variations. Applying the lemma to \(g = u^* - u^{*''}\) gives \(u^* = u^{*''}\) on \([0,1]\); this gives us the first order necessary conditions we wanted in the first place.
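The first-variation identity can be sanity-checked numerically. This is a sketch assuming the model functional \(F[u] = \frac{1}{2}\int_0^1 (u'(x)^2 + u(x)^2)\,dx\) used above; the particular \(u\) and \(v\) below are arbitrary smooth choices (no minimization is performed, since the identity \(f'(0) = \int_0^1 (u'v' + uv)\,dx\) holds for any \(u\)).

```python
import numpy as np

def integrate(y, x):
    """Trapezoidal rule for samples y on the grid x."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

x = np.linspace(0.0, 1.0, 20001)

# Arbitrary smooth u (not a minimizer) and test function v with v(0)=v(1)=0.
u, up = np.cos(x), -np.sin(x)          # u and its derivative u'
v, vp = x * (1 - x), 1 - 2 * x         # v and v'

def F(s):
    """F[u + s v] = (1/2) * integral of ((u' + s v')^2 + (u + s v)^2)."""
    return 0.5 * integrate((up + s * vp) ** 2 + (u + s * v) ** 2, x)

h = 1e-5
fd = (F(h) - F(-h)) / (2 * h)                  # central difference for f'(0)
first_variation = integrate(up * vp + u * v, x)
print(abs(fd - first_variation) < 1e-8)        # the two quantities agree
```

Because \(f(s)\) is quadratic in \(s\) here, the central difference recovers \(f'(0)\) essentially exactly.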
Defn. Variational Derivative
Given \(u \in \mathcal{A}\), suppose there is a function \(g : [a,b] \to \mathbb R\) such that
\[\frac{d}{ds}\Big|_{s=0} F[u + sv] = \int_a^b g(x) v(x) \, dx\]
for all test functions \(v\) on \([a,b]\). Then \(g\) is called the variational derivative of \(F\) at \(u\). We denote the function \(g\) by \(\frac{\delta F}{\delta u}(u)\).
We can think of \(\frac{\delta F}{\delta u}(u)\) as an analogue of the gradient. We have
\[\frac{d}{ds}\Big|_{s=0} F[u + sv] = \int_a^b \frac{\delta F}{\delta u}(u)(x) \, v(x) \, dx\]
for all test functions \(v\) on \([a,b]\). Compare this with the finite-dimensional formula
\[\frac{d}{ds}\Big|_{s=0} f(x + sv) = \nabla f(x) \cdot v = \sum_{i=1}^n \frac{\partial f}{\partial x_i}(x) \, v_i.\]
If one thinks of the integral as an "infinite sum of infinitesimally small pieces", then we can understand how the variational derivative might be an "infinite-dimensional" version of the gradient.
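The finite-dimensional side of this analogy is easy to verify directly; a minimal sketch with an arbitrary smooth \(f\) and direction \(v\):

```python
import numpy as np

def f(x):
    return float(np.sum(x ** 2) + x[0] * x[1])   # arbitrary smooth function

def grad_f(x):
    """Gradient of f(x) = |x|^2 + x_0 x_1, computed by hand."""
    g = 2 * x.astype(float)
    g[0] += x[1]
    g[1] += x[0]
    return g

x0 = np.array([1.0, -2.0, 0.5])
v = np.array([0.3, 0.7, -1.0])

h = 1e-6
directional = (f(x0 + h * v) - f(x0 - h * v)) / (2 * h)
print(abs(directional - grad_f(x0) @ v) < 1e-8)  # d/ds f(x+sv)|_0 = grad f . v
```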
Euler-Lagrange Equation in 1-Dim
Lemma (A corollary of the Fundamental Lemma). Suppose \(u_* \in \mathcal{A}\) satisfies \(u_* + v \in \mathcal{A}\) for all test functions \(v\) on \([a,b]\). If \(u_*\) minimizes \(F\) on \(\mathcal{A}\) and \(\frac{\delta F}{\delta u}(u_*)\) exists and is continuous, then \(\frac{\delta F}{\delta u}(u_*) \equiv 0\).
Thrm. Euler-Lagrange equation
Claim. Let \(F[u] := \int_a^b L(x, u(x), u'(x)) \, dx\), and suppose \(L, u\) are \(C^2\) functions. Then \(\frac{\delta F}{\delta u}(u)\) exists, is continuous, and
\[\frac{\delta F}{\delta u}(u)(x) = -\frac{d}{dx} L_p(x,u(x), u'(x)) + L_z(x,u(x),u'(x)).\]
proof. Let \(v\) be a test function on \([a,b]\). Then
\[\frac{d}{ds}\Big|_{s=0} F[u + sv] = \int_a^b \left( L_z(x,u(x),u'(x)) v(x) + L_p(x,u(x),u'(x)) v'(x) \right) dx = \int_a^b \left( -\frac{d}{dx} L_p(x,u(x),u'(x)) + L_z(x,u(x),u'(x)) \right) v(x) \, dx,\]
where the second equality follows from integration by parts, using \(v(a) = v(b) = 0\). Since \(u\) and \(L\) are \(C^2\), the function in the integrand is continuous. By the definition of the variational derivative we have the desired result.
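To make the formula concrete, here is a hedged numerical check with a sample Lagrangian \(L(x,z,p) = \frac{1}{2} p^2 + \cos z\) (an arbitrary choice, not from the text), for which the claim gives \(\frac{\delta F}{\delta u}(u) = -u'' - \sin u\):

```python
import numpy as np

def integrate(y, x):
    """Trapezoidal rule for samples y on the grid x."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

a, b = 0.0, 1.0
x = np.linspace(a, b, 20001)

# Sample C^2 function u (with analytic derivatives) and test function v.
u, up, upp = np.sin(2 * x), 2 * np.cos(2 * x), -4 * np.sin(2 * x)
v, vp = x * (1 - x), 1 - 2 * x          # v(a) = v(b) = 0

def F(s):
    """F[u + s v] for L(x, z, p) = p^2 / 2 + cos(z)."""
    return integrate(0.5 * (up + s * vp) ** 2 + np.cos(u + s * v), x)

h = 1e-5
fd = (F(h) - F(-h)) / (2 * h)           # d/ds F[u + s v] at s = 0
delta_F = -upp - np.sin(u)              # -d/dx L_p + L_z for this Lagrangian
print(abs(fd - integrate(delta_F * v, x)) < 1e-6)
```

The finite-difference derivative of \(s \mapsto F[u+sv]\) matches \(\int_a^b \frac{\delta F}{\delta u}(u)\, v \, dx\), as the claim predicts.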
Thrm. Du Bois-Reymond Lemma
The lemma allows us to relax the smoothness needed for the Euler-Lagrange equation to hold from twice continuously differentiable to once continuously differentiable.
Claim. Suppose \(\alpha, \beta\) are continuous functions on \([a,b]\) such that \(\int_a^b ( \alpha(x)v(x) + \beta(x)v'(x) ) \, dx = 0\) for all test functions \(v\) on \([a,b]\). Then \(\beta\) is \(C^1\), and \(\beta' = \alpha\) on \([a,b]\).
proof. Let \(A(x) = \int_a^x \alpha(t) \, dt\) be an antiderivative of \(\alpha\). Since \(\alpha\) is continuous, \(A\) is \(C^1\). Then, for any test function \(v\) on \([a,b]\), integration by parts gives
\[\int_a^b \alpha(x) v(x) \, dx = \left[ A(x) v(x) \right]_a^b - \int_a^b A(x) v'(x) \, dx = -\int_a^b A(x) v'(x) \, dx,\]
since \(v(a) = v(b) = 0\). By the original assumption,
\[0 = \int_a^b \left( \alpha(x) v(x) + \beta(x) v'(x) \right) dx = \int_a^b \left( -A(x) + \beta(x) \right) v'(x) \, dx.\]
We are done if we are able to show that \(-A(x) + \beta(x)\) is constant on \([a,b]\): then \(\beta = A + \text{const}\) is \(C^1\) with \(\beta' = A' = \alpha\). Let \(\gamma = -A + \beta\). Define \(C\) to be the constant
\[C := \frac{1}{b-a} \int_a^b \gamma(t) \, dt,\]
so that \(\int_a^b (\gamma(t) - C) \, dt = 0\). Define \(v(x) := \int_a^x (\gamma(t) - C) \, dt\). The function \(v\) is \(C^1\) since \(\gamma(t) - C\) is continuous, and \(v(a) = v(b) = 0\); so \(v\) is a test function on \([a,b]\). By some algebra,
\[0 = \int_a^b \gamma(x) v'(x) \, dx = \int_a^b \left( \gamma(x) - C \right) v'(x) \, dx = \int_a^b \left( \gamma(x) - C \right)^2 dx,\]
using \(\int_a^b C v'(x) \, dx = C(v(b) - v(a)) = 0\) and \(v'(x) = \gamma(x) - C\).
Since \((\gamma(x) - C)^2 \geq 0\) and is continuous on \([a,b]\), we must have \(\gamma(x) = C\) for all \(x \in [a,b]\). Therefore \(\gamma\) is constant, which proves the lemma.
Example
Consider two points \((a,A), (b,B)\) in \(\mathbb R^2\) with \(a < b\). We seek a function \(u\) on \([a,b]\) with \(u(a) = A\), \(u(b) = B\), and with the arc length
\[F[u] := \int_a^b \sqrt{1 + u'(x)^2} \, dx\]
minimized. Denote by \(\mathcal{A}\) the set of \(C^1\) functions \(u\) with \(u(a) = A\) and \(u(b) = B\). Suppose that \(u_*\) is a minimizer. Then, by the previous lemma applied to the last result of the previous lecture, \(u_*\) satisfies the Euler-Lagrange equation. Let \(L(x,z,p) := \sqrt{1 + p^2}\). Then \(L_z = 0\), and
\[L_p(x,z,p) = \frac{p}{\sqrt{1 + p^2}}.\]
The Euler-Lagrange equation is, in this case,
\[-\frac{d}{dx} \frac{u_*'(x)}{\sqrt{1 + u_*'(x)^2}} = 0.\]
This implies that
\[\frac{u_*'(x)}{\sqrt{1 + u_*'(x)^2}} = C\]
for some constant \(C\), implying
\[u_*'(x)^2 = \frac{C^2}{1 - C^2},\]
hence \(u_*'\) is constant, so \(u_*(x) = \alpha x + \beta\) for some constants \(\alpha, \beta\). As expected, the minimizer is a line: the shortest path joining two points is the straight segment between them.
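A quick numerical illustration of this conclusion (a sketch; the endpoints and the perturbations are arbitrary choices): the straight line has smaller arc length than nearby competitors with the same endpoints.

```python
import numpy as np

def integrate(y, x):
    """Trapezoidal rule for samples y on the grid x."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

def arc_length(up, x):
    """F[u] = integral of sqrt(1 + u'(x)^2), given samples of u'."""
    return integrate(np.sqrt(1.0 + up ** 2), x)

# Join (0, 0) to (1, 1): the line u(x) = x versus perturbed competitors.
x = np.linspace(0.0, 1.0, 20001)
line_up = np.ones_like(x)                             # u'(x) = 1
length_line = arc_length(line_up, x)                  # close to sqrt(2)

for eps in (0.1, 0.5, 1.0):
    pert_up = 1.0 + eps * np.pi * np.cos(np.pi * x)   # (x + eps*sin(pi x))'
    assert arc_length(pert_up, x) > length_line       # the line wins
print(round(length_line, 6))  # 1.414214
```

Each competitor \(x + \varepsilon \sin(\pi x)\) matches the boundary conditions but is strictly longer.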
Example: Area of Revolution
Suppose \(u\) is a positive \(C^1\) function on an interval \([a,b]\). Consider the surface of revolution obtained by rotating the graph of \(u\) on \([a,b]\) about the \(x\)-axis. Consider the functional assigning to \(u\) the area of this surface, which is, by some calculus,
\[F[u] = \int_a^b 2\pi u(x) \sqrt{1 + u'(x)^2} \, dx.\]
With the set \(\mathcal{A}\) of functions defined as in the previous example, we seek to find a function \(u_* \in \mathcal{A}\) minimizing \(F\) on \(\mathcal{A}\).
In this example, the Lagrangian is \(L(x,z,p) = 2\pi z \sqrt{1 + p^2}\), which gives
\[L_z(x,z,p) = 2\pi \sqrt{1 + p^2}, \qquad L_p(x,z,p) = \frac{2\pi z p}{\sqrt{1 + p^2}}.\]
The Euler-Lagrange equation is, in this case,
\[-\frac{d}{dx} \frac{2\pi u(x) u'(x)}{\sqrt{1 + u'(x)^2}} + 2\pi \sqrt{1 + u'(x)^2} = 0.\]
Cancel the \(2\pi\)'s to get the ODE
\[\frac{d}{dx} \frac{u(x) u'(x)}{\sqrt{1 + u'(x)^2}} = \sqrt{1 + u'(x)^2}. \tag{$**$}\]
By magic, the general solution to this differential equation has the form
\[u(x) = \beta \cosh\left( \frac{x - \alpha}{\beta} \right)\]
for some constants \(\alpha, \beta\) with \(\beta \neq 0\). We won't argue why we got this solution, but we can differentiate it and check that it solves the ODE; uniqueness theorems give us what we want.
It is now obvious that \(u\) solves \((**)\). Therefore a minimizer \(u_*\) must be of the form
\[u_*(x) = \beta \cosh\left( \frac{x - \alpha}{\beta} \right).\]
We may use the boundary conditions to find \(\alpha, \beta\).
Consider the special case \((a,A) = (0, 1)\) and \((b, B) = (1,0)\). The boundary conditions on \(u_*\) give us the system
\[\beta \cosh\left( \frac{\alpha}{\beta} \right) = 1, \qquad \beta \cosh\left( \frac{1 - \alpha}{\beta} \right) = 0.\]
Since \(\cosh\) is strictly positive, the second equation gives us \(\beta = 0\), a contradiction, since the form of the solution requires \(\beta \neq 0\). We conclude that there is no \(C^1\) minimizer in this special case.
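The "differentiate and check" step can be carried out numerically. This sketch assumes the catenary form \(u(x) = \beta \cosh((x-\alpha)/\beta)\) and the ODE \(\frac{d}{dx}\big( u u' / \sqrt{1 + u'^2} \big) = \sqrt{1 + u'^2}\); the constants \(\alpha, \beta\) are arbitrary choices.

```python
import numpy as np

alpha, beta = 0.3, 0.8          # arbitrary constants in the catenary

def u(x):
    return beta * np.cosh((x - alpha) / beta)

def up(x):
    return np.sinh((x - alpha) / beta)

def g(x):
    """The quantity being differentiated: u u' / sqrt(1 + u'^2)."""
    return u(x) * up(x) / np.sqrt(1.0 + up(x) ** 2)

x = np.linspace(0.0, 1.0, 101)
h = 1e-5
lhs = (g(x + h) - g(x - h)) / (2 * h)     # d/dx of u u' / sqrt(1 + u'^2)
rhs = np.sqrt(1.0 + up(x) ** 2)
print(np.max(np.abs(lhs - rhs)) < 1e-6)   # the ODE holds along the grid
```

Analytically, \(1 + u'^2 = \cosh^2\) collapses \(g\) to \(\beta \sinh((x-\alpha)/\beta)\), whose derivative is exactly the right-hand side.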
Euler-Lagrange Equation in N-Dim
We consider a functional
\[F[u] := \int_a^b L(x, u(x), u'(x)) \, dx,\]
where \(u : [a,b] \to \mathbb R^n\). In this case,
\[L : [a,b] \times \mathbb R^n \times \mathbb R^n \to \mathbb R.\]
Our general space of functions will be denoted by
\[\mathcal{A} := \{ u \in C^1([a,b], \mathbb R^n) : u(a) = A,\ u(b) = B \}\]
for fixed \(A, B \in \mathbb R^n\).
The Euler-Lagrange equation from the real-valued case generalizes to the system
\[-\frac{d}{dx} L_{p_i}(x, u(x), u'(x)) + L_{z_i}(x, u(x), u'(x)) = 0, \qquad i = 1, \dots, n.\]
The proof is a straightforward generalization of the proof given when \(n = 1\).
Example: Newton's Second Law
Let us consider an example from classical mechanics. We consider the physical situation of a point mass moving in a potential field. Denote by \(V(x)\) the potential energy at a point \(x\). The kinetic energy of a point of mass \(m\) with velocity \(v\) is \(\frac{1}{2} m |v|^2\). Define the Lagrangian
\[L(x, v) := \frac{1}{2} m |v|^2 - V(x)\]
as the difference between the kinetic and potential energies. Suppose our particle is moving along a path \(x = x(t)\) in \(\mathbb R^n\), parametrized by time. Our functional is
\[F[x] := \int_{t_0}^{t_1} \left( \frac{1}{2} m |\dot x(t)|^2 - V(x(t)) \right) dt.\]
This represents the difference between the kinetic energy and the potential energy \emph{along the entire path}; this quantity is called the action.
In this case, the Euler-Lagrange equations for a minimizing path \(x(t)\) are
\[-\frac{d}{dt} L_{p_i}(x(t), \dot x(t)) + L_{z_i}(x(t), \dot x(t)) = 0, \qquad i = 1, \dots, n.\]
One computes that
\[L_{p_i}(x, v) = m v_i, \qquad L_{z_i}(x, v) = -\frac{\partial V}{\partial x_i}(x).\]
Then the Euler-Lagrange equations are
\[m \ddot x(t) = -\nabla V(x(t)),\]
which is exactly Newton's second law for the force \(-\nabla V\).
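As a sanity check of this connection, consider the harmonic oscillator \(V(x) = \frac{1}{2} k x^2\) (a hypothetical one-dimensional choice not taken from the text): along the solution \(x(t) = \cos(\omega t)\) of \(m \ddot x = -V'(x)\), with \(\omega = \sqrt{k/m}\), the first variation of \(F\) vanishes for every test function.

```python
import numpy as np

def integrate(y, t):
    """Trapezoidal rule for samples y on the grid t."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(t)))

m, k = 2.0, 3.0
omega = np.sqrt(k / m)
T = 1.0
t = np.linspace(0.0, T, 20001)

# Solution of m x'' = -k x, and a test function vanishing at t = 0 and t = T.
x, xdot = np.cos(omega * t), -omega * np.sin(omega * t)
v, vdot = t * (T - t), T - 2 * t

# First variation of F[x] = integral of (m/2) x'^2 - (k/2) x^2:
# d/ds F[x + s v] at s = 0 equals the integral of (m x' v' - k x v).
first_variation = integrate(m * xdot * vdot - k * x * v, t)
print(abs(first_variation) < 1e-6)   # vanishes along the Newtonian solution
```

Integrating by parts turns the first variation into \(\int_0^T (-m \ddot x - k x) v \, dt\), which is zero precisely because the path obeys Newton's second law.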