The Unapologetic Mathematician

Mathematics for the interested outsider

Local Extrema in Multiple Variables

Just like in one variable, we’re interested in local maxima and minima of a function f:X\rightarrow\mathbb{R}, where X is an open region in \mathbb{R}^n. Again, we say that f has a local minimum at a point a\in X if there is some neighborhood N of a so that f(a)\leq f(x) for all x\in N. A maximum is similarly defined, except that we require f(a)\geq f(x) in the neighborhood. As I alluded to recently, we can bring Fermat’s theorem to bear to determine a necessary condition.

Specifically, if we have coordinates on \mathbb{R}^n given by a basis \{e_i\}_{i=1}^n, we can regard f as a function of the n variables x^i. We can fix n-1 of these variables x^i=a^i for i\neq k and let x^k vary in a neighborhood of a^k. If f has a local extremum at x=a, then in particular it has a local extremum along this coordinate line at x^k=a^k. And so we can use Fermat’s theorem to draw conclusions about the derivative of this restricted function at x^k=a^k, which of course is the partial derivative \frac{\partial f}{\partial x^k}\big\vert_{x=a}.

So what can we say? For each variable x^k, the partial derivative \frac{\partial f}{\partial x^k} either does not exist or is equal to zero at x=a. And because the differential subsumes the partial derivatives, if any of them fail to exist the differential must fail to exist as well. On the other hand, if the differential does exist at a, then all the partial derivatives exist, so they're all zero, and thus df(a)=0 as well. Incidentally, we can again make the connection to the usual coverage in a multivariable calculus course by remembering that the gradient \nabla f(a) is the vector that corresponds to the linear functional of the differential df(a). So at a local extremum where f is differentiable we must have \nabla f(a)=0.
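To make the coordinate-line argument concrete, here is a minimal numerical sketch. It estimates each partial derivative by varying one coordinate at a time with a central difference, which is only an approximation to the derivative. The function `bowl`, its minimum at (1,-2), and the step size `h` are hypothetical choices made for illustration; none of them come from the discussion above.

```python
def partial(f, a, k, h=1e-6):
    """Central-difference estimate of the k-th partial derivative of f at a.

    Only the k-th coordinate moves; the other n-1 stay fixed at a^i.
    """
    up = list(a)
    down = list(a)
    up[k] += h
    down[k] -= h
    return (f(up) - f(down)) / (2 * h)


def gradient(f, a, h=1e-6):
    """Collect the estimated partial derivatives into a gradient vector."""
    return [partial(f, a, k, h) for k in range(len(a))]


# Hypothetical example: a paraboloid whose minimum sits at (1, -2).
def bowl(p):
    x, y = p
    return (x - 1) ** 2 + (y + 2) ** 2


grad_at_min = gradient(bowl, [1.0, -2.0])     # should be close to [0, 0]
grad_elsewhere = gradient(bowl, [2.0, 0.0])   # should be close to [2, 4]
```

At the minimum every estimated partial derivative is zero up to rounding, while at a generic point the gradient is visibly nonzero.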

As was the case with Fermat’s theorem, this provides a necessary, but not a sufficient condition to have a local extremum. Anything that can go wrong in one dimension can be copied here. For instance, we could define f(x,y)=x^2+y^3. Then we find df=2x\,dx+3y^2\,dy, which is zero at (0,0). But any neighborhood of this point will contain points (0,t) and (0,-t) for small enough t>0, and we see that f(0,t)>f(0,0)>f(0,-t), so the origin cannot be a local extremum.
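As a quick sanity check on this example, the following sketch evaluates f(x,y)=x^2+y^3 at points just above and just below the origin on the y-axis. It uses exact rational arithmetic so that no rounding can blur the inequalities. The sample values of t are an arbitrary choice, and a finite sample illustrates the claim rather than proving it.

```python
from fractions import Fraction


def f(x, y):
    return x**2 + y**3


def df(x, y):
    # Coefficients of dx and dy in df = 2x dx + 3y^2 dy.
    return (2 * x, 3 * y**2)


# Sample points (0, t) and (0, -t) for t = 1/10, 1/100, ..., 1/100000.
ts = [Fraction(1, 10**n) for n in range(1, 6)]
straddles = all(f(0, t) > f(0, 0) > f(0, -t) for t in ts)
```

Here `df(0, 0)` is `(0, 0)`, so the origin is a critical point, and yet `straddles` comes out true: f takes values on both sides of f(0,0) arbitrarily close to the origin.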

But weirder things can happen. We might ask that f have a local minimum at a along any line, like we tried with directional derivatives. But even this can go wrong. If we define

\displaystyle f(x,y)=(y-x^2)(y-3x^2)=y^2-4x^2y+3x^4

we can calculate

\displaystyle df=\left(-8xy+12x^3\right)dx+\left(2y-4x^2\right)dy

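which again is zero at (0,0). Along any slanted line through the origin y=kx we find

\displaystyle\begin{aligned}f(t,kt)&=3t^4-4kt^3+k^2t^2\\\frac{d}{dt}f(t,kt)&=12t^3-12kt^2+2k^2t\\\frac{d^2}{dt^2}f(t,kt)&=36t^2-24kt+2k^2\end{aligned}

and so the second derivative is always positive at the origin, except along the x-axis, where k=0. For the vertical line, we find

\displaystyle\begin{aligned}f(0,t)&=t^2\\\frac{d}{dt}f(0,t)&=2t\\\frac{d^2}{dt^2}f(0,t)&=2\end{aligned}

so along all of these lines we have a local minimum at the origin by the second derivative test. And along the x-axis, we have f(x,0)=3x^4, which has the origin as a local minimum.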
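We can double-check the second derivatives along lines without doing the algebra by hand. The sketch below restricts f to a line through the origin and applies the standard five-point stencil for the second derivative, which is exact for polynomials of degree at most five and so gives exact answers here when fed rational numbers. The particular slopes and the step `h` are arbitrary choices for illustration.

```python
from fractions import Fraction


def f(x, y):
    return (y - x**2) * (y - 3 * x**2)


def second_derivative_along(direction, h=Fraction(1, 100)):
    """Second derivative at t=0 of g(t) = f(t*a, t*b), via a five-point stencil."""
    a, b = direction

    def g(t):
        return f(t * a, t * b)

    return (-g(2 * h) + 16 * g(h) - 30 * g(0) + 16 * g(-h) - g(-2 * h)) / (12 * h**2)


# Slanted lines y = kx correspond to the direction (1, k).
slopes = [Fraction(-3), Fraction(-1, 2), Fraction(1, 10), Fraction(2)]
along_slanted = {k: second_derivative_along((1, k)) for k in slopes}
along_vertical = second_derivative_along((0, 1))   # the line x = 0
along_x_axis = second_derivative_along((1, 0))     # the line y = 0
```

Each entry of `along_slanted` equals 2k^2 and the vertical line gives 2, all positive. The x-axis gives 0, so the second derivative test is silent there, which is why that line has to be handled separately.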

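Unfortunately, it’s still not a local minimum in the plane, since any neighborhood of the origin must contain points of the form (t,2t^2) for small enough nonzero t. These points lie on a parabola squeezed between the curves y=x^2 and y=3x^2, where the two factors of f have opposite signs. For these points we find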

\displaystyle f(t,2t^2)=-t^4<0=f(0,0)

and so f cannot have a local minimum at the origin.
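A short exact-arithmetic sketch shows how this failure coexists with a minimum along every line. The sample values of t and of the slope k are arbitrary choices for illustration. The second half rests on a small observation not spelled out above: the line y=kx meets the parabola y=2x^2 at x=k/2, so it dips below zero there.

```python
from fractions import Fraction


def f(x, y):
    return (y - x**2) * (y - 3 * x**2)


# Points (t, 2t^2) on the parabola, approaching the origin from both sides.
ts = [Fraction(s, 10**n) for n in range(1, 8) for s in (1, -1)]
on_parabola = [f(t, 2 * t**2) for t in ts]
all_negative = all(v < 0 for v in on_parabola)
matches_formula = all(v == -t**4 for t, v in zip(ts, on_parabola))

# The line y = kx hits the parabola y = 2x^2 at x = k/2, where f = -(k/2)^4.
slopes = [Fraction(1, 10**n) for n in range(0, 5)]
dips = [f(k / 2, k * (k / 2)) for k in slopes]
lines_dip_below = all(d == -(k / 2)**4 for k, d in zip(slopes, dips))
```

Every line does have a stretch around the origin on which f is nonnegative, but that stretch ends before x=k/2, so it shrinks to nothing as the slope k goes to zero. No single neighborhood of the origin in the plane works for all lines at once.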

What we’ll do is content ourselves with this analogue and extension of Fermat’s theorem as a necessary condition, and then develop tools that can distinguish the common behaviors near such critical points, analogous to the second derivative test.

November 23, 2009 | Analysis, Calculus