Classifying Critical Points
So let’s say we’ve got a critical point of a multivariable function $f: \mathbb{R}^n \to \mathbb{R}$. That is, a point $a$ where the differential $df(a)$ vanishes. We want something like the second derivative test that might tell us more about the behavior of the function near that point, and to identify (some) local maxima and minima. We’ll assume here that $f$ is twice continuously differentiable in some region $S$ around $a$.
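The analogue of the second derivative for multivariable functions is the second differential $d^2f(x)$. This function assigns to every point $x$ a bilinear function of two displacement vectors $u$ and $v$, and it measures the rate at which the directional derivative in the direction of $v$ is changing as we move in the direction of $u$. That is,
$$d^2f(x; u, v) = \left[D_u\left(D_v f\right)\right](x)$$
If we choose coordinates on $\mathbb{R}^n$ given by an orthonormal basis $\{e_i\}_{i=1}^n$, we can write the second differential in terms of coordinates as the matrix of second partial derivatives
$$d^2f(x) = \begin{pmatrix} \frac{\partial^2 f}{\partial x_1^2} & \cdots & \frac{\partial^2 f}{\partial x_1 \partial x_n} \\ \vdots & \ddots & \vdots \\ \frac{\partial^2 f}{\partial x_n \partial x_1} & \cdots & \frac{\partial^2 f}{\partial x_n^2} \end{pmatrix}$$
This matrix is often called the “Hessian” of $f$ at the point $x$.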
As I said above, this is a bilinear form. Further, Clairaut’s theorem tells us that it’s a symmetric form. Then the spectral theorem tells us that we can find an orthonormal basis with respect to which the Hessian is actually diagonal, and the diagonal entries are the eigenvalues of the matrix.
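So let’s go back and assume we’re working with such a basis. This means that our second partial derivatives are particularly simple. We find that for $i \neq j$ we have
$$\frac{\partial^2 f}{\partial x_i \partial x_j} = 0$$
and for $i = j$, the second partial derivative is an eigenvalue
$$\frac{\partial^2 f}{\partial x_i^2} = \lambda_i$$
which we can assume (without loss of generality) are nondecreasing. That is, $\lambda_1 \leq \lambda_2 \leq \dots \leq \lambda_n$.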
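To make the diagonalization concrete, here is a minimal numerical sketch. It assumes NumPy, and the helper `hessian_fd` and the sample function `f` are hypothetical names made up for this illustration. The Hessian is approximated by central finite differences and then diagonalized with `numpy.linalg.eigh`, which returns the eigenvalues of a symmetric matrix in nondecreasing order along with an orthonormal basis of eigenvectors.

```python
import numpy as np

def hessian_fd(f, a, h=1e-4):
    """Approximate the Hessian of f at a by central finite differences.

    Hypothetical helper for illustration; h is an assumed step size.
    """
    a = np.asarray(a, dtype=float)
    n = a.size
    H = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            ei = np.zeros(n)
            ej = np.zeros(n)
            ei[i] = h
            ej[j] = h
            # Second-order central difference for the (i, j) second partial.
            H[i, j] = (f(a + ei + ej) - f(a + ei - ej)
                       - f(a - ei + ej) + f(a - ei - ej)) / (4 * h * h)
    # Clairaut's theorem says the exact Hessian is symmetric, so we
    # symmetrize to remove rounding noise.
    return 0.5 * (H + H.T)

def f(p):
    # Sample function with a critical point at the origin.
    x, y = p
    return x * x + 3 * x * y + y * y

H = hessian_fd(f, [0.0, 0.0])

# eigh is meant for symmetric matrices: eigenvalues come back in
# nondecreasing order, and the columns of `basis` are orthonormal.
eigenvalues, basis = np.linalg.eigh(H)

# In the eigenvector basis the Hessian is diagonal.
diagonalized = basis.T @ H @ basis
```

For this sample function the exact Hessian is the matrix with rows $(2, 3)$ and $(3, 2)$, so the eigenvalues should come out close to $-1$ and $5$, and `diagonalized` should be close to the diagonal matrix with those entries.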
Now, if all of these eigenvalues are positive at a critical point $a$, then the Hessian is positive-definite. That is, given any nonzero direction vector $v$ we have $d^2f(a; v, v) > 0$. On the other hand, if all of the eigenvalues are negative, the Hessian is negative-definite; given any nonzero direction vector $v$ we have $d^2f(a; v, v) < 0$. In the former case, we’ll find that $f$ has a local minimum at $a$, and in the latter case we’ll find that $f$ has a local maximum there. If some eigenvalues are negative and others are positive, then the function has a mixed behavior at $a$ that we’ll call a “saddle” (sketch the graph of a function such as $f(x, y) = x^2 - y^2$ near the origin to see why). And if any eigenvalues are zero, all sorts of weird things can happen, though at least if we can find one positive and one negative eigenvalue we know that the critical point can’t be a local extremum.
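The case analysis above can be sketched as a small classifier. This is only an illustration, assuming NumPy: the function name `classify_critical_point`, the tolerance `tol` used to decide when an eigenvalue counts as zero, and the returned strings are all assumptions of this sketch.

```python
import numpy as np

def classify_critical_point(hessian, tol=1e-9):
    """Classify a critical point from the Hessian evaluated there.

    Hypothetical helper; `tol` is an assumed threshold below which an
    eigenvalue is treated as zero.
    """
    # eigvalsh computes the eigenvalues of a symmetric matrix.
    eigenvalues = np.linalg.eigvalsh(np.asarray(hessian, dtype=float))
    positive = np.any(eigenvalues > tol)
    negative = np.any(eigenvalues < -tol)
    zero = np.any(np.abs(eigenvalues) <= tol)

    if positive and negative:
        # Mixed signs rule out an extremum even when some eigenvalue is zero.
        return "degenerate, but not a local extremum" if zero else "saddle"
    if zero:
        # A zero eigenvalue with no mixed signs: the test says nothing.
        return "inconclusive"
    # All eigenvalues are nonzero and share one sign.
    return "local minimum" if positive else "local maximum"

# The Hessian of x^2 - y^2 at the origin has eigenvalues -2 and 2.
saddle_example = classify_critical_point([[2.0, 0.0], [0.0, -2.0]])
```

In floating point the eigenvalues are never exactly zero, so some threshold is unavoidable, and the right value depends on the scale of the problem.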
We remember that the determinant of a diagonal matrix is the product of its eigenvalues, so if the determinant of the Hessian is nonzero then either we have a local maximum, we have a local minimum, or we have some form of well-behaved saddle. These behaviors we call “generic” critical points, since if we “wiggle” the function a bit (while maintaining a critical point at $a$) the Hessian determinant will stay nonzero. If the Hessian determinant is zero, an arbitrarily small wiggle of the function can make it nonzero, and so this sort of critical point is not generic. This is the sort of unstable situation analogous to a failure of the second derivative test. Unfortunately, the analogy doesn’t extend, in that the sign of the Hessian determinant isn’t instantly meaningful. In two dimensions a positive determinant means both eigenvalues have the same sign, denoting a local maximum or a local minimum, while a negative determinant means the eigenvalues have different signs, denoting a saddle. This much is included in multivariable calculus courses, although usually without a clear explanation of why it works.
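For comparison, here is a sketch of the two-dimensional test as it usually appears in calculus courses, written in plain Python. The function name `second_derivative_test_2d` and the tolerance `tol` are assumptions of this illustration. When the determinant is positive, both eigenvalues share the sign of the upper-left entry, which is what lets us tell a minimum from a maximum.

```python
def second_derivative_test_2d(fxx, fxy, fyy, tol=1e-12):
    """Two-variable second derivative test from the Hessian entries.

    Hypothetical helper; `tol` is an assumed threshold for treating the
    determinant as zero.
    """
    # Determinant of the symmetric 2x2 Hessian [[fxx, fxy], [fxy, fyy]].
    det = fxx * fyy - fxy * fxy
    if det > tol:
        # det > 0 forces fxx * fyy > 0, so fxx is nonzero and carries
        # the common sign of both eigenvalues.
        return "local minimum" if fxx > 0 else "local maximum"
    if det < -tol:
        # Eigenvalues of opposite signs.
        return "saddle"
    return "inconclusive"

# f(x, y) = x * y has fxx = 0, fxy = 1, fyy = 0 at the origin.
result = second_derivative_test_2d(0.0, 1.0, 0.0)
```

In three or more dimensions the determinant alone can’t do this job: the diagonal matrices with entries $(1, 1, 1)$ and $(-1, -1, 1)$ both have positive determinant, but the first gives a local minimum and the second a saddle.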
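So, given a direction vector $v$ so that $d^2f(a; v, v) > 0$, then since $f$ is in $C^2(S)$, its second differential is continuous, and there will be some neighborhood $N$ of $a$, which we can take to be a ball, so that $d^2f(x; v, v) > 0$ for all $x \in N$. In particular, there will be some range of $t$ so that $b = a + tv \in N$. For any such point we can use Taylor’s theorem, expanded to second order, to tell us that
$$f(b) - f(a) = df(a; tv) + \frac{1}{2} d^2f(\xi; tv, tv) = \frac{t^2}{2} d^2f(\xi; v, v)$$
for some $\xi$ on the segment $[a, b]$, where the first-order term vanishes because $a$ is a critical point. Since $N$ is a ball, $\xi$ lies in $N$, and from this we see that $f(b) > f(a)$ for every such $b$ with $b - a = tv$ and $t \neq 0$. A similar argument shows that if $d^2f(a; v, v) < 0$ then $f(b) < f(a)$ for any $b \neq a$ near $a$ in the direction of $v$.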
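Now if the Hessian is positive-definite then every direction $v$ from $a$ gives us $d^2f(a; v, v) > 0$. Because the second partial derivatives are continuous, the Hessian stays positive-definite throughout some ball around $a$, so the neighborhood in the argument above can be chosen independently of $v$, and every point $b \neq a$ near $a$ satisfies $f(b) > f(a)$. If the Hessian is negative-definite, then by the same reasoning every point $b \neq a$ near $a$ satisfies $f(b) < f(a)$. And if the Hessian has both positive and negative eigenvalues then within any neighborhood of $a$ we can find some directions in which $f(b) > f(a)$ and some in which $f(b) < f(a)$.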
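As a sanity check on the Taylor argument, we can compare $f(a + tv) - f(a)$ against $\frac{t^2}{2} d^2f(a; v, v)$ numerically. This is only a sketch, assuming NumPy: the sample function `f`, its hand-computed Hessian `H`, and the helper `compare` are made up for this illustration. Note that the sketch evaluates the Hessian at $a$ itself rather than at the intermediate point from Taylor’s theorem, so the two numbers agree only approximately, and better as $t$ shrinks.

```python
import numpy as np

def f(p):
    # Sample function with a critical point at the origin.
    x, y = p
    return np.cos(x) + y ** 2

a = np.array([0.0, 0.0])

# Hessian of f at a, computed by hand: eigenvalues -1 and 2, so a saddle.
H = np.array([[-1.0, 0.0],
              [0.0, 2.0]])

def compare(v, t):
    """Return the actual change f(a + t v) - f(a) and the second-order
    prediction (t^2 / 2) * d^2 f(a; v, v)."""
    v = np.asarray(v, dtype=float)
    actual = f(a + t * v) - f(a)
    predicted = 0.5 * t ** 2 * (v @ H @ v)
    return actual, predicted

# Along the y-axis the second differential is positive, so f increases.
up_actual, up_predicted = compare([0.0, 1.0], 0.1)

# Along the x-axis the second differential is negative, so f decreases.
down_actual, down_predicted = compare([1.0, 0.0], 0.1)
```

Here the two directions give changes of opposite sign within the same small neighborhood, which is exactly the saddle behavior described above.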