Classifying Critical Points
So let’s say we’ve got a critical point of a multivariable function $f:\mathbb{R}^n\to\mathbb{R}$. That is, a point $p$ where the differential $df(p)$ vanishes. We want something like the second derivative test that might tell us more about the behavior of the function near that point, and to identify (some) local maxima and minima. We’ll assume here that $f$ is twice continuously differentiable in some region $S$ around $p$.
The analogue of the second derivative for multivariable functions is the second differential $d^2f$. This function assigns to every point a bilinear function of two displacement vectors $u$ and $v$, and it measures the rate at which the directional derivative in the direction of $v$ is changing as we move in the direction of $u$. That is,

$$d^2f(p;u,v)=\left[D_u\left(D_vf\right)\right](p)$$
If we choose coordinates on $\mathbb{R}^n$ given by an orthonormal basis $\{e_i\}$, we can write the second differential in terms of coordinates:

$$d^2f(p;u,v)=\sum_{i=1}^n\sum_{j=1}^n\frac{\partial^2f}{\partial x^i\partial x^j}(p)\,u^iv^j$$

The matrix of second partial derivatives is often called the “Hessian” of $f$ at the point $p$.
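To see this concretely, here is a minimal numerical sketch (my own illustration, with a made-up function and points, not part of the argument above): central differences estimate the matrix of second partials, and pairing that matrix with two displacement vectors evaluates the bilinear form $d^2f(p;u,v)=u^\top H(p)\,v$.

```python
# Estimate the Hessian of a sample function by central differences and
# evaluate the second differential as the bilinear form u^T H(p) v.
# The function f and the vectors p, u, v are made up for illustration.
import numpy as np

def f(x):
    return x[0]**2 - 3*x[0]*x[1] + 2*x[1]**2

def hessian(f, p, h=1e-4):
    """Matrix of second partials of f at p, by central differences."""
    n = len(p)
    H = np.zeros((n, n))
    I = np.eye(n)
    for i in range(n):
        for j in range(n):
            H[i, j] = (f(p + h*I[i] + h*I[j]) - f(p + h*I[i] - h*I[j])
                       - f(p - h*I[i] + h*I[j]) + f(p - h*I[i] - h*I[j])) / (4*h**2)
    return H

p = np.array([1.0, 2.0])
u = np.array([1.0, 0.5])
v = np.array([-0.3, 1.0])
H = hessian(f, p)
print(H)          # approximately [[2, -3], [-3, 4]]
print(u @ H @ v)  # the bilinear form d^2 f(p; u, v)
```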
As I said above, this is a bilinear form. Further, Clairaut’s theorem tells us that it’s a symmetric form. Then the spectral theorem tells us that we can find an orthonormal basis with respect to which the Hessian is actually diagonal, and the diagonal entries are the eigenvalues of the matrix.
So let’s go back and assume we’re working with such a basis. This means that our second partial derivatives are particularly simple. We find that for $i\neq j$ we have

$$\frac{\partial^2f}{\partial x^i\partial x^j}=0$$

and for $i=j$, the second partial derivative is an eigenvalue

$$\frac{\partial^2f}{\left(\partial x^i\right)^2}=\lambda_i$$

which we can assume (without loss of generality) are nondecreasing. That is, $\lambda_1\leq\lambda_2\leq\dots\leq\lambda_n$.
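As a quick illustration of that diagonalization (my own sketch; the matrix is made up), `numpy.linalg.eigh` handles exactly this situation: given a symmetric matrix, it returns the eigenvalues in nondecreasing order together with an orthonormal basis of eigenvectors.

```python
# Diagonalize a symmetric Hessian in an orthonormal eigenbasis.
import numpy as np

H = np.array([[2.0, -3.0],
              [-3.0, 4.0]])    # a sample symmetric Hessian
lam, Q = np.linalg.eigh(H)     # lam[0] <= lam[1]; columns of Q are orthonormal
print(lam)                     # the eigenvalues lambda_i, nondecreasing
print(Q.T @ H @ Q)             # diagonal up to rounding: diag(lam)
```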
Now, if all of these eigenvalues are positive at a critical point $p$, then the Hessian is positive-definite. That is, given any direction $v$ we have $d^2f(p;v,v)>0$. On the other hand, if all of the eigenvalues are negative, the Hessian is negative-definite; given any direction $v$ we have $d^2f(p;v,v)<0$. In the former case, we’ll find that $f$ has a local minimum in a neighborhood of $p$, and in the latter case we’ll find that $f$ has a local maximum there. If some eigenvalues are negative and others are positive, then the function has a mixed behavior at $p$ we’ll call a “saddle” (sketch the graph of $f(x,y)=xy$ near $(0,0)$ to see why). And if any eigenvalues are zero, all sorts of weird things can happen, though at least if we can find one positive and one negative eigenvalue we know that the critical point can’t be a local extremum.
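The whole classification can be phrased as a short routine; the following sketch (the function name and tolerance are my own choices) just reads off the signs of the Hessian’s eigenvalues.

```python
# Classify a critical point from the eigenvalues of its Hessian.
import numpy as np

def classify_critical_point(H, tol=1e-10):
    lam = np.linalg.eigvalsh(H)        # eigenvalues, nondecreasing
    if np.abs(lam).min() < tol:        # a zero eigenvalue: degenerate
        if lam[0] < -tol and lam[-1] > tol:
            return "degenerate, but not an extremum"
        return "degenerate (test inconclusive)"
    if lam[0] > 0:
        return "local minimum"         # positive-definite
    if lam[-1] < 0:
        return "local maximum"         # negative-definite
    return "saddle"                    # mixed signs

# The saddle from the text: f(x, y) = x*y has Hessian [[0, 1], [1, 0]] at the origin.
print(classify_critical_point(np.array([[0.0, 1.0], [1.0, 0.0]])))  # saddle
```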
We remember that the determinant of a diagonal matrix is the product of its eigenvalues, so if the determinant of the Hessian is nonzero then either we have a local maximum, we have a local minimum, or we have some form of well-behaved saddle. These behaviors we call “generic” critical points, since if we “wiggle” the function a bit (while maintaining a critical point at $p$) the Hessian determinant will stay nonzero. If the Hessian determinant is zero, wiggling the function a little will make it nonzero, and so this sort of critical point is not generic. This is the sort of unstable situation analogous to a failure of the second derivative test. Unfortunately, the analogy doesn’t extend further, in that the sign of the Hessian determinant isn’t instantly meaningful. In two dimensions a positive determinant means both eigenvalues have the same sign, denoting a local maximum or a local minimum, while a negative determinant means the eigenvalues have different signs, denoting a saddle. This much is included in multivariable calculus courses, although usually without a clear explanation of why it works.
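Here is a small check of that two-dimensional accident (the matrices are my own examples): the determinant is the product of the two eigenvalues, so its sign alone separates same-sign pairs from mixed-sign pairs.

```python
# In two dimensions, sign(det H) distinguishes extrema from saddles.
import numpy as np

for H in (np.array([[2.0, 1.0], [1.0, 3.0]]),    # det = 5 > 0: same-sign eigenvalues
          np.array([[1.0, 2.0], [2.0, 1.0]])):   # det = -3 < 0: mixed signs, a saddle
    print(np.linalg.det(H), np.linalg.eigvalsh(H))
```

Note that in three or more dimensions the sign of the determinant no longer pins down the signs of the individual eigenvalues, which is why the determinant test doesn’t survive past two variables.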
So, given a direction vector $v$ so that $d^2f(p;v,v)>0$, then since $f$ is in $C^2(S)$, there will be some neighborhood $N$ of $p$ so that $d^2f(q;v,v)>0$ for all $q\in N$. In particular, there will be some range of $t$ so that $p+tv\in N$. For any such point we can use Taylor’s theorem with second-order remainder, where the first-order term drops out because $df(p)=0$, to tell us that

$$f(p+tv)-f(p)=\frac{1}{2}d^2f(\xi;tv,tv)=\frac{t^2}{2}d^2f(\xi;v,v)$$

for some $\xi$ on the segment between $p$ and $p+tv$. And from this we see that $f(p+tv)>f(p)$ for every $t$ so that $p+tv\in N$. A similar argument shows that if $d^2f(p;v,v)<0$ then $f(p+tv)<f(p)$ for any point $p+tv$ near $p$ in the direction of $v$.
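A numerical sanity check of this step (my own construction, with a made-up function): at a critical point, the difference $f(p+tv)-f(p)$ should shrink like $\frac{t^2}{2}d^2f(p;v,v)$ as $t\to 0$.

```python
# Verify f(p + t v) - f(p) ~ (t^2 / 2) d^2 f(p; v, v) near a critical point.
import numpy as np

def f(x):
    return x[0]**2 + 3*x[1]**2 + x[0]**3    # critical point at the origin

p = np.array([0.0, 0.0])
H = np.array([[2.0, 0.0],
              [0.0, 6.0]])                  # Hessian of f at p
v = np.array([1.0, 1.0]) / np.sqrt(2.0)     # a unit direction

for t in (1e-1, 1e-2, 1e-3):
    lhs = f(p + t*v) - f(p)
    rhs = 0.5 * t**2 * (v @ H @ v)
    print(t, lhs, rhs)                      # the two columns agree as t shrinks
```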
Now if the Hessian is positive-definite then every direction $v$ from $p$ gives us $d^2f(p;v,v)>0$, and so every point $q$ near $p$ satisfies $f(q)>f(p)$; since the unit sphere of directions is compact and the Hessian is continuous, a single neighborhood can be chosen that works for every direction at once. If the Hessian is negative-definite, then every point $q$ near $p$ satisfies $f(q)<f(p)$. And if the Hessian has both positive and negative eigenvalues then within any neighborhood of $p$ we can find some directions in which $f(q)>f(p)$ and some in which $f(q)<f(p)$.
From the comments:

“It’s funny: I asked about this some years ago, and it only recently made sense when I realized it was a special case of the Morse lemma (even though you don’t need that much to prove it).”

“About which part in particular?”

“About the generalization of the second derivative test to higher dimensions, in multivariable calculus class.”

“Ah, yes. The real answer is this one, about the signature of a bilinear form. The test as seen in those classes is a complete accident of the way things work out in two dimensions. But students in those classes don’t have the linear algebra background to do it properly.”