# calculus of variations

Imagine a bead of mass $m$ on a wire whose endpoints are at $a=(0,0)$ and $b=(x_{f},y_{f})$, with $y_{f}$ lower than the starting position. If gravity acts on the bead with force $F=mg$, what path (arrangement of the wire) minimizes the bead’s travel time from $a$ to $b$, assuming no friction?

This is the famed *brachistochrone problem*, and its solution was one of the first accomplishments of the calculus of variations. Many minimum problems can be solved using the techniques introduced here.

In its general form, the calculus of variations concerns quantities

$$S[q,\dot{q},t]=\int_{a}^{b}L(q(t),\dot{q}(t),t)\,dt \tag{1}$$

for which we wish to find a minimum or a maximum.

To make this concrete, let’s consider a much simpler problem than the brachistochrone: what’s the shortest distance between two points $p=(x_{1},y_{1})$ and $q=(x_{2},y_{2})$? Let the variable $s$ represent distance along the path, so that $\int_{p}^{q}ds=S$. We wish to find the path such that $S$ is a minimum. Zooming in on a small portion of the path, we can see that

$$ds^{2}=dx^{2}+dy^{2} \tag{2}$$

$$ds=\sqrt{dx^{2}+dy^{2}} \tag{3}$$

If we parameterize the path by $t$, then we have

$$ds=\sqrt{\left(\frac{dx}{dt}\right)^{2}+\left(\frac{dy}{dt}\right)^{2}}\ dt \tag{4}$$

Let’s assume $y=f(x)$, so that we may simplify (4) to

$$ds=\sqrt{1+\left(\frac{dy}{dx}\right)^{2}}\ dx=\sqrt{1+f^{\prime}(x)^{2}}\ dx. \tag{5}$$

Now we have

$$S=\int_{p}^{q}L\ dx=\int_{x_{1}}^{x_{2}}\sqrt{1+f^{\prime}(x)^{2}}\ dx \tag{6}$$

In this case, $L$ is particularly simple. Converting to $q$’s and $t$’s to make the comparison easier, we have $L=L[f^{{\prime}}(x)]=L[\dot{q}(t)]$, not the more general $L[q(t),\dot{q}(t),t]$ covered by the calculus of variations. We’ll see later how to use our $L$’s simplicity to our advantage. For now, let’s talk more generally.

We wish to find the path $q(t)$, passing through a point $q(a)$ at $t=a$ and through $q(b)$ at $t=b$, for which the quantity $S$ is a minimum. A necessary condition is that small perturbations in the path produce no first-order change in $S$; we’ll call such a path a “stationary point.” This is directly analogous to the idea that for a function $f(t)$, the minimum can be found where small perturbations $\delta t$ produce no first-order change in $f(t)$. This is where $f(t+\delta t)\approx f(t)$; taking a Taylor series expansion of $f(t)$ at $t$, we find

$$f(t+\delta t)=f(t)+\delta t\,f^{\prime}(t)+O({\delta t}^{2})=f(t), \tag{7}$$

with $f^{{\prime}}(t):=\frac{d}{d{t}}{f(t)}$. Of course, since the whole point is to consider $\delta t\neq 0$, once we neglect terms $O({\delta t}^{2})$ this is just the point where $f^{{\prime}}(t)=0$. This point, call it $t=t_{0}$, could be a minimum or a maximum, so in the usual calculus of a single variable we’d proceed by taking the second derivative, $f^{{\prime\prime}}(t_{0})$, and seeing if it’s positive or negative to see whether the function has a minimum or a maximum at $t_{0}$, respectively.
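The single-variable picture can be checked numerically. The following is a minimal sketch, assuming an illustrative function $f(t)=t^{2}-2t$ (not taken from the text) whose stationary point is at $t_{0}=1$; the helper name `change` is likewise hypothetical. At the stationary point the change in $f$ divided by $\delta t$ shrinks along with $\delta t$, while at any other point it tends to $f^{\prime}(t)$.

```python
def f(t):
    # Illustrative function (an assumption, not from the text): stationary at t = 1.
    return t**2 - 2.0 * t

def change(t, dt):
    # Change in f produced by a small perturbation dt.
    return f(t + dt) - f(t)

for dt in (1e-1, 1e-2, 1e-3):
    at_stationary = change(1.0, dt) / dt  # no first-order term: shrinks like dt
    elsewhere = change(2.0, dt) / dt      # first-order term survives: tends to f'(2) = 2
    print(dt, at_stationary, elsewhere)
```

Since $f^{\prime\prime}(1)=2>0$ for this choice, the stationary point is a minimum, in line with the second-derivative test described above.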

In the calculus of variations, we’re not considering small perturbations in $t$—we’re considering small perturbations in the *integral* of the relatively complicated *function* $L(q,\dot{q},t)$, where $\dot{q}=\frac{d}{d{t}}{q(t)}$. Also, $S$ is a functional, and we can think of the minimization problem as the discovery of a minimum in $S$-space as we jiggle the parameters $q$ and $\dot{q}$.

For the shortest-distance problem, it’s clear that a maximum length doesn’t exist, since for any finite path length $S_{0}$ we (intuitively) can always find a curve whose length is greater than $S_{0}$. This is often true, and we’ll assume for this discussion that finding a stationary point means we’ve found a minimum.

Formally, we write the condition that small parameter perturbations produce no change in $S$ as $\delta S=0$. To make this precise, we simply write

$$\begin{aligned}
\delta S &:= S[q+\delta q,\ \dot{q}+\delta\dot{q},\ t]-S[q,\dot{q},t]\\
&= \int_{a}^{b}L(q+\delta q,\ \dot{q}+\delta\dot{q})\,dt-S[q,\dot{q},t]
\end{aligned}$$

How are we to simplify this mess? We are considering small perturbations to the path, which suggests a Taylor series expansion of $L(q+\delta q,\dot{q}+\delta\dot{q})$ about $(q,\dot{q})$:

$$L(q+\delta q,\dot{q}+\delta\dot{q})=L(q,\dot{q})+\delta q\frac{\partial}{\partial q}L(q,\dot{q})+\delta\dot{q}\frac{\partial}{\partial\dot{q}}L(q,\dot{q})+O(\delta q^{2})+O(\delta\dot{q}^{2})$$

and since we make little error by discarding higher-order terms in $\delta q$ and $\delta\dot{q}$, we have

$$\int_{a}^{b}L(q+\delta q,\dot{q}+\delta\dot{q})\,dt=S[q,\dot{q},t]+\int_{a}^{b}\left[\delta q\frac{\partial}{\partial q}L(q,\dot{q})+\delta\dot{q}\frac{\partial}{\partial\dot{q}}L(q,\dot{q})\right]dt$$

Keeping in mind that $\delta\dot{q}=\frac{d}{d{t}}{\delta q}$ and noting that

$$\frac{d}{dt}\left(\delta q\frac{\partial}{\partial\dot{q}}L(q,\dot{q})\right)=\delta q\frac{d}{dt}\frac{\partial}{\partial\dot{q}}L(q,\dot{q})+\delta\dot{q}\frac{\partial}{\partial\dot{q}}L(q,\dot{q}),$$

a simple application of the product rule $\frac{d}{d{t}}{(fg)}=\dot{f}g+f\dot{g}$ which allows us to substitute

$$\delta\dot{q}\frac{\partial}{\partial\dot{q}}L(q,\dot{q})=\frac{d}{dt}\left(\delta q\frac{\partial}{\partial\dot{q}}L(q,\dot{q})\right)-\delta q\frac{d}{dt}\frac{\partial}{\partial\dot{q}}L(q,\dot{q}),$$

we can rewrite the integral, shortening $L(q,\dot{q})$ to $L$ for convenience, as:

$$\begin{aligned}
\int_{a}^{b}\left[\delta q\frac{\partial}{\partial q}L+\delta\dot{q}\frac{\partial}{\partial\dot{q}}L\right]dt
&=\int_{a}^{b}\left[\delta q\frac{\partial}{\partial q}L-\delta q\frac{d}{dt}\frac{\partial}{\partial\dot{q}}L+\frac{d}{dt}\left(\delta q\frac{\partial}{\partial\dot{q}}L\right)\right]dt\\
&=\int_{a}^{b}\delta q\left[\frac{\partial}{\partial q}L-\frac{d}{dt}\frac{\partial}{\partial\dot{q}}L\right]dt+\delta q\frac{\partial}{\partial\dot{q}}L\,\Big|_{a}^{b}
\end{aligned}$$

Substituting all of this progressively back into our original expression for $\delta S$, we obtain

$$\begin{aligned}
\delta S&=\int_{a}^{b}L(q+\delta q,\dot{q}+\delta\dot{q})\,dt-S[q,\dot{q},t]\\
&=S+\int_{a}^{b}\left[\delta q\frac{\partial}{\partial q}L+\delta\dot{q}\frac{\partial}{\partial\dot{q}}L\right]dt-S\\
&=\int_{a}^{b}\delta q\left[\frac{\partial}{\partial q}L-\frac{d}{dt}\frac{\partial}{\partial\dot{q}}L\right]dt+\delta q\frac{\partial}{\partial\dot{q}}L\,\Big|_{a}^{b}=0.
\end{aligned}$$

Two conditions come to our aid. First, we’re only interested in the neighboring paths that still begin at $a$ and end at $b$, which corresponds to the condition $\delta q=0$ at $a$ and $b$, which lets us cancel the final term. Second, between those two points, we’re interested in the paths which *do* vary, for which $\delta q\neq 0$.
This leads us to the condition

$$\int_{a}^{b}\delta q\left[\frac{\partial}{\partial q}L-\frac{d}{dt}\frac{\partial}{\partial\dot{q}}L\right]dt=0. \tag{8}$$
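The integration-by-parts step can be sanity-checked numerically. The sketch below assumes the arc-length integrand $L=\sqrt{1+f^{\prime 2}}$ from the shortest-path example, a trial path $f(x)=x^{2}$ on $[0,1]$, and a perturbation $\delta q=\epsilon\,\eta(x)$ with $\eta(x)=\sin(\pi x)$; the trial path, $\eta$, and all function names are illustrative choices, not part of the original derivation. Because $\eta$ vanishes at both endpoints, the boundary term drops out, and the finite-difference change in $S$ per unit $\epsilon$ should match the integral of $\eta$ times the bracketed expression.

```python
import math

def integrate(g, a, b, n=2000):
    # Composite midpoint rule for the integral of g over [a, b].
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))

# Assumed trial path f(x) = x^2 (not a stationary path, so the variation is nonzero).
def fp(x):
    return 2.0 * x   # f'(x)

def fpp(x):
    return 2.0       # f''(x)

# Perturbation shape eta with eta(0) = eta(1) = 0, so the boundary term vanishes.
def eta(x):
    return math.sin(math.pi * x)

def etap(x):
    return math.pi * math.cos(math.pi * x)

def S(eps):
    # Length functional evaluated on the perturbed path f + eps * eta.
    return integrate(lambda x: math.sqrt(1.0 + (fp(x) + eps * etap(x)) ** 2), 0.0, 1.0)

def bracket(x):
    # dL/df - d/dx (dL/df') for L = sqrt(1 + f'^2):
    # dL/df = 0 and d/dx [f' / sqrt(1 + f'^2)] = f'' / (1 + f'^2)^(3/2).
    return -fpp(x) / (1.0 + fp(x) ** 2) ** 1.5

eps = 1e-5
finite_difference = (S(eps) - S(-eps)) / (2.0 * eps)
first_variation = integrate(lambda x: eta(x) * bracket(x), 0.0, 1.0)
print(finite_difference, first_variation)
```

The two printed numbers should agree closely. They are not zero here because a parabola is not a stationary path for arc length; for a stationary path the bracket vanishes everywhere.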

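The fundamental lemma of the calculus of variations states that if $f(t)$ is continuous on $[a,b]$ and the integral below vanishes for every smooth function $g(t)$ with $g(a)=g(b)=0$, then $f$ vanishes identically:

$$\int_{a}^{b}f(t)g(t)\,dt=0\ \text{ for all such } g\quad\Longrightarrow\quad f(t)=0\;\;\forall t\in(a,b). \tag{9}$$

Since (8) must hold for every admissible perturbation $\delta q$, this lemma gives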

$$\frac{\partial}{\partial q}L-\frac{d}{dt}\left(\frac{\partial}{\partial\dot{q}}L\right)=0. \tag{10}$$

This condition, one of the fundamental equations of the calculus of variations, is called the *Euler–Lagrange condition*. When presented with a problem in the calculus of variations, the first thing one usually asks is whether one can simply plug the problem’s $L$ into this equation and solve.

Recall our shortest-path problem, where we had arrived at

$$S=\int_{a}^{b}L\ dx=\int_{x_{1}}^{x_{2}}\sqrt{1+f^{\prime}(x)^{2}}\ dx. \tag{11}$$

Here, $x$ takes the place of $t$, $f$ takes the place of $q$, and (10) becomes

$$\frac{\partial}{\partial f}L-\frac{d}{dx}\frac{\partial}{\partial f^{\prime}}L=0 \tag{12}$$

Even with $\frac{\partial}{\partial f}L=0$, this is still ugly. However, because $L$ does not depend explicitly on $x$, that is, $\frac{\partial}{\partial x}L=0$, we can use the Beltrami identity,

$$L-q^{\prime}\frac{\partial}{\partial q^{\prime}}L=C. \tag{13}$$

(For the derivation of this useful little trick, see the corresponding entry.) Now we must simply solve

$$\sqrt{1+f^{\prime}(x)^{2}}-f^{\prime}(x)\frac{\partial}{\partial f^{\prime}}L=C \tag{14}$$

which looks just as daunting, but quickly reduces to

$$\sqrt{1+f^{\prime}(x)^{2}}-f^{\prime}(x)\frac{\frac{1}{2}\,2f^{\prime}(x)}{\sqrt{1+f^{\prime}(x)^{2}}}=C \tag{15}$$

$$\frac{1+f^{\prime}(x)^{2}-f^{\prime}(x)^{2}}{\sqrt{1+f^{\prime}(x)^{2}}}=C \tag{16}$$

$$\frac{1}{\sqrt{1+f^{\prime}(x)^{2}}}=C \tag{17}$$

$$f^{\prime}(x)=\pm\sqrt{\frac{1}{C^{2}}-1}=m. \tag{18}$$

That is, the slope of the curve representing the shortest path between two points is a constant, which means the sought curve, i.e. the extremal of this variational problem, must be a straight line. Through this lengthy process, we’ve proved that a straight line is the shortest distance between two points.
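As a cross-check, the same conclusion can be reached directly from (12), without the Beltrami identity. This is a sketch that relies on $\frac{\partial}{\partial f}L=0$, which holds for this particular $L$; the constant $k$ is a name introduced here for illustration.

$$\frac{d}{dx}\frac{\partial}{\partial f^{\prime}}L=0
\quad\Longrightarrow\quad
\frac{\partial}{\partial f^{\prime}}L=\frac{f^{\prime}(x)}{\sqrt{1+f^{\prime}(x)^{2}}}=k
\quad\Longrightarrow\quad
f^{\prime}(x)^{2}=\frac{k^{2}}{1-k^{2}}.$$

Since $k$ is constant, so is $f^{\prime}(x)$, in agreement with (18).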

To find the actual function $f(x)$ given endpoints $(x_{1},y_{1})$ and $(x_{2},y_{2})$, simply integrate with respect to $x$:

$$f(x)=\int f^{\prime}(x)\,dx=\int m\,dx=mx+d \tag{19}$$

and then apply the boundary conditions

$$f(x_{1})=y_{1}=mx_{1}+d \tag{20}$$

$$f(x_{2})=y_{2}=mx_{2}+d \tag{21}$$

Subtracting the first condition from the second, we get $m=\frac{y_{2}-y_{1}}{x_{2}-x_{1}}$, the standard equation for the slope of a line. Solving for $d=y_{1}-mx_{1}$, we get

$$f(x)=\frac{y_{2}-y_{1}}{x_{2}-x_{1}}(x-x_{1})+y_{1} \tag{22}$$

which is the basic equation for a line passing through $(x_{1},y_{1})$ and $(x_{2},y_{2})$.
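The result can be illustrated numerically. The following is a minimal sketch, assuming hypothetical endpoints $(0,1)$ and $(3,5)$ and a sine-shaped bump that vanishes at both endpoints; the function names `path_length`, `line`, and `perturbed` are illustrative. It approximates the arc length by a fine polygon and compares the straight line with a few perturbed curves through the same endpoints.

```python
import math

def path_length(f, x1, x2, n=5000):
    # Approximate the arc length by summing straight segments between
    # closely spaced points on the graph of f.
    h = (x2 - x1) / n
    total = 0.0
    for i in range(n):
        xa = x1 + i * h
        xb = x1 + (i + 1) * h
        total += math.hypot(xb - xa, f(xb) - f(xa))
    return total

# Hypothetical endpoints chosen for illustration.
x1, y1, x2, y2 = 0.0, 1.0, 3.0, 5.0
m = (y2 - y1) / (x2 - x1)

def line(x):
    # The straight line through both endpoints, as in the final formula above.
    return m * (x - x1) + y1

def perturbed(eps):
    # Straight line plus a bump of size eps that vanishes at both endpoints.
    return lambda x: line(x) + eps * math.sin(math.pi * (x - x1) / (x2 - x1))

for eps in (0.0, 0.1, 0.5, -0.5):
    print(eps, path_length(perturbed(eps), x1, x2))
```

For these endpoints the straight line has length $\sqrt{3^{2}+4^{2}}=5$, and every perturbed curve comes out longer. This only samples one family of perturbations, so it illustrates the result rather than proving it.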

The solution to the brachistochrone problem, while slightly more complicated, follows along exactly the same lines.
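For a sense of what the brachistochrone answer looks like, here is a small sketch that compares two wire shapes. It relies on the standard result, not derived in this entry, that the minimizing curve is a cycloid, and it assumes illustrative values: $g=9.81$, a generating-circle radius $R=1$, and an endpoint $b$ placed at the cycloid parameter $\theta=\pi$. Depth is measured downward from $a=(0,0)$. Both travel times use closed forms that follow from energy conservation, $v=\sqrt{2g\cdot\text{depth}}$.

```python
import math

g = 9.81            # gravitational acceleration in m/s^2 (assumed value)
R = 1.0             # radius of the circle generating the cycloid, in metres (assumed)
theta_f = math.pi   # cycloid parameter at the endpoint b (assumed)

# Endpoint b on the cycloid x = R*(theta - sin(theta)), depth = R*(1 - cos(theta)),
# with depth measured downward from the start a = (0, 0).
x_f = R * (theta_f - math.sin(theta_f))
depth_f = R * (1.0 - math.cos(theta_f))

# Straight wire: constant acceleration g*(depth_f/ell) along a ramp of length ell,
# so ell = (1/2) * g * (depth_f/ell) * t^2.
ell = math.hypot(x_f, depth_f)
t_line = math.sqrt(2.0 * ell**2 / (g * depth_f))

# Cycloid: ds / v simplifies to sqrt(R/g) d(theta), so the time is linear in theta.
t_cycloid = theta_f * math.sqrt(R / g)

print(t_line, t_cycloid)
```

With these assumed values the cycloid takes roughly 1.00 s against roughly 1.19 s for the straight wire, even though the straight wire is the shorter path.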

## Mathematics Subject Classification

49K05, 47A60


## Comments

## Beltrami Identity

I am not sure if your precondition for using the Beltrami Identity is correct.

You are saying that dL/df = 0 is the precondition for using the Beltrami Identity. Should it be dL/dx = 0 instead?

All the derivatives in this message should be interpreted as partial derivatives.
