The Unapologetic Mathematician

Mathematics for the interested outsider

The Jacobian

Now that we’ve used exterior algebras to come to terms with parallelepipeds and their transformations, let’s come back to apply these ideas to the calculus.

We’ll focus on a differentiable function f:X\rightarrow\mathbb{R}^n, where X is itself some open region in \mathbb{R}^n. That is, if we pick a basis \{e_i\}_{i=1}^n and coordinates of \mathbb{R}^n, then the function f is a vector-valued function of n real variables x^1,\dots,x^n with components f^1,\dots,f^n. The differential, then, is itself a vector-valued function whose components are the differentials of the component functions: df=df^ie_i. We can write these differentials out in terms of partial derivatives:

\displaystyle df^i(x^1,\dots,x^n;t^1,\dots,t^n)=\frac{\partial f^i}{\partial x^1}\bigg\vert_{(x^1,\dots,x^n)}t^1+\dots+\frac{\partial f^i}{\partial x^n}\bigg\vert_{(x^1,\dots,x^n)}t^n

Just like we said when discussing the chain rule, the differential at the point (x^1,\dots,x^n) defines a linear transformation from the n-dimensional space of displacement vectors at (x^1,\dots,x^n) to the n-dimensional space of displacement vectors at f(x^1,\dots,x^n), and the matrix entries with respect to the given basis are given by the partial derivatives.
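To make the matrix of partial derivatives concrete, here is a minimal numerical sketch. The example map `polar` (polar to rectangular coordinates), the helper name `jacobian_matrix`, and the step size `h` are all assumptions of mine for illustration, and central differences are just one convenient way to estimate the partials.

```python
import math

def polar(p):
    """Hypothetical example map f(r, theta) = (r cos(theta), r sin(theta))."""
    r, theta = p
    return [r * math.cos(theta), r * math.sin(theta)]

def jacobian_matrix(f, p, h=1e-6):
    """Approximate the matrix J[i][k] = d f^i / d x^k at the point p,
    using central differences with step h (an assumed, untuned value)."""
    n = len(p)
    columns = []
    for k in range(n):
        forward = list(p)
        backward = list(p)
        forward[k] += h
        backward[k] -= h
        f_plus, f_minus = f(forward), f(backward)
        columns.append([(a - b) / (2 * h) for a, b in zip(f_plus, f_minus)])
    # columns[k][i] holds d f^i / d x^k; transpose so that rows index components.
    return [[columns[k][i] for k in range(n)] for i in range(n)]

# Matrix of the differential at the point (r, theta) = (2, pi/6).
J = jacobian_matrix(polar, [2.0, math.pi / 6])
for row in J:
    print(row)
```

Up to the finite-difference error, the rows should come out close to (cos θ, -r sin θ) and (sin θ, r cos θ), which are the partial derivatives we would compute by hand for this map, and so the entries of the matrix of the linear transformation described above.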

It is this transformation that we will refer to as the Jacobian, or the Jacobian transformation. Alternatively, the representing matrix is sometimes referred to as the Jacobian, or the Jacobian matrix. Since this matrix is square, we can calculate its determinant, which is also referred to as the Jacobian, or the Jacobian determinant. I’ll try to be clear which I mean, but often the specific referent of “Jacobian” must be sussed out from context.
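The three meanings can be kept apart by giving each its own function. This is only a sketch for one hypothetical example, the polar-coordinate map f(r, θ) = (r cos θ, r sin θ), with the partial derivatives written out by hand; the function names are mine.

```python
import math

def jacobian_matrix_polar(r, theta):
    """The Jacobian matrix: entry [i][k] is d f^i / d x^k."""
    return [[math.cos(theta), -r * math.sin(theta)],
            [math.sin(theta),  r * math.cos(theta)]]

def jacobian_transformation_polar(r, theta, t):
    """The Jacobian transformation: sends a displacement t at (r, theta)
    to a displacement at f(r, theta)."""
    J = jacobian_matrix_polar(r, theta)
    return [J[0][0] * t[0] + J[0][1] * t[1],
            J[1][0] * t[0] + J[1][1] * t[1]]

def jacobian_determinant_polar(r, theta):
    """The Jacobian determinant: a single number at each point."""
    J = jacobian_matrix_polar(r, theta)
    return J[0][0] * J[1][1] - J[0][1] * J[1][0]

print(jacobian_matrix_polar(2.0, 0.7))
print(jacobian_transformation_polar(2.0, 0.7, [0.0, 1.0]))
print(jacobian_determinant_polar(2.0, 0.7))
```

For this particular map the determinant works out to r cos²θ + r sin²θ = r, so the last line should print a value equal to 2 up to rounding.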

So, in light of our recent discussion, what does the Jacobian determinant mean? Well, imagine starting with an n-dimensional parallelepiped at the point (x^1,\dots,x^n), with one side in each of the basis directions, and positively oriented. That is, it consists of the points (x^1+t^1,\dots,x^n+t^n) with t^i in the interval [0,\Delta x^i] for some fixed \Delta x^i. We’ll assume for the moment that this whole parallelepiped lies within the region X. It should be clear that this parallelepiped is represented by the wedge

\displaystyle(\Delta x^1e_1)\wedge\dots\wedge(\Delta x^ne_n)=(\Delta x^1\dots\Delta x^n)e_1\wedge\dots\wedge e_n

which clearly has volume given by the product of all the \Delta x^i.

Now the function f sends this box to a sort of curvy parallelepiped, consisting of the points f(x^1+t^1,\dots,x^n+t^n), with each t^i in the interval [0,\Delta x^i], and this image will have some volume. Unfortunately, we have no idea as yet how to measure such a volume. But we might be able to approximate it. Instead of using the actual curvy parallelepiped, we’ll build a new one. And if the \Delta x^i are small enough, it will be more or less the same set of points, with the same volume. Or at least close enough for our purposes. For each direction k, we’ll replace the curved path defined by

\displaystyle f(x^1,\dots,x^k+t,\dots,x^n)\qquad0\leq t\leq\Delta x^k

by the displacement vector between the two endpoints:

\displaystyle f(x^1,\dots,x^k+\Delta x^k,\dots,x^n)-f(x^1,\dots,x^k,\dots,x^n)

and use these new vectors to build a new parallelepiped

\displaystyle\left(f(x^1+\Delta x^1,\dots,x^n)-f(x^1,\dots,x^n)\right)\wedge\dots\wedge\left(f(x^1,\dots,x^n+\Delta x^n)-f(x^1,\dots,x^n)\right)

But this is still an awkward volume to work with. However, we can use the differential to approximate each of these differences

\displaystyle\begin{aligned}f(x^1,\dots,x^k+\Delta x^k,\dots,x^n)&-f(x^1,\dots,x^k,\dots,x^n)\\&\approx df(x^1,\dots,x^n;0,\dots,\Delta x^k,\dots,0)\\&=\Delta x^kdf(x^1,\dots,x^n;0,\dots,1,\dots,0)\\&=\Delta x^kdf^i(x^1,\dots,x^n;0,\dots,1,\dots,0)e_i\\&=\Delta x^k\frac{\partial f^i}{\partial x^k}\bigg\vert_{(x^1,\dots,x^n)}e_i\end{aligned}

with no summation here on the index k.
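To see how good this approximation is, here is a small sketch comparing the true displacement along one edge with \Delta x^k times the k-th partial derivative. The map f(x, y) = (x^2 - y^2, 2xy) is a hypothetical example of my own choosing, picked because the difference can be worked out exactly by hand.

```python
import math

def f(x, y):
    """Hypothetical example map, chosen only for illustration."""
    return (x * x - y * y, 2.0 * x * y)

def partials(x, y):
    """Analytic partials: entry k is (d f^1 / d x^k, d f^2 / d x^k)."""
    return [(2.0 * x, 2.0 * y),    # derivative in the x direction
            (-2.0 * y, 2.0 * x)]   # derivative in the y direction

def edge_error(x, y, k, delta):
    """Distance between the true displacement along the k-th edge and
    its approximation delta * (d f / d x^k)."""
    step = [0.0, 0.0]
    step[k] = delta
    start = f(x, y)
    end = f(x + step[0], y + step[1])
    true_disp = (end[0] - start[0], end[1] - start[1])
    column = partials(x, y)[k]
    approx = (delta * column[0], delta * column[1])
    return math.hypot(true_disp[0] - approx[0], true_disp[1] - approx[1])

for delta in (0.1, 0.05, 0.025):
    print(delta, edge_error(1.0, 2.0, 0, delta), edge_error(1.0, 2.0, 1, delta))
```

For this map the error along either edge is exactly the square of the edge length (up to rounding), so halving \Delta x^k cuts the error by a factor of four. The error shrinks faster than the edge itself, which is the sense in which the differential is a good stand-in for the difference when the \Delta x^k are small.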

Now we can easily calculate the volume of this parallelepiped, represented by the wedge

\displaystyle\left(\Delta x^1\frac{\partial f^i}{\partial x^1}\bigg\vert_{(x^1,\dots,x^n)}e_i\right)\wedge\dots\wedge\left(\Delta x^n\frac{\partial f^i}{\partial x^n}\bigg\vert_{(x^1,\dots,x^n)}e_i\right)

which can be rewritten as

\displaystyle\left(\Delta x^1\dots\Delta x^n\right)\left(\frac{\partial f^i}{\partial x^1}\bigg\vert_{(x^1,\dots,x^n)}e_i\right)\wedge\dots\wedge\left(\frac{\partial f^i}{\partial x^n}\bigg\vert_{(x^1,\dots,x^n)}e_i\right)

which clearly has a volume of \left(\Delta x^1\dots\Delta x^n\right), the volume of the original parallelepiped, times the Jacobian determinant, since the wedge of the n columns of the Jacobian matrix is its determinant times e_1\wedge\dots\wedge e_n. That is, the Jacobian determinant at (x^1,\dots,x^n) estimates the factor by which the function f expands small volumes near that point. And if the Jacobian determinant is negative, it tells us that f locally reverses the orientation of small regions near the point.
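We can check this claim in a case where the volume of the curvy image is known exactly. The sketch below again assumes the polar-coordinate map f(r, θ) = (r cos θ, r sin θ) as a hypothetical example, with r > 0 and a small range of angles so that the image of a box is a sector of an annulus. The helper `wedge2` computes the coefficient of e_1\wedge e_2 in a wedge of two vectors in the plane; all names here are mine.

```python
import math

def wedge2(u, v):
    """Coefficient of e_1 ^ e_2 in u ^ v: the signed area of the
    parallelogram spanned by u and v."""
    return u[0] * v[1] - u[1] * v[0]

def polar_columns(r, theta):
    """Partial derivatives of the example map (r cos(theta), r sin(theta))."""
    d_dr = (math.cos(theta), math.sin(theta))
    d_dtheta = (-r * math.sin(theta), r * math.cos(theta))
    return d_dr, d_dtheta

def exact_image_area(r, dr, dtheta):
    """Exact area of the image of [r, r+dr] x [theta, theta+dtheta]:
    a sector of an annulus, which does not depend on theta."""
    return 0.5 * ((r + dr) ** 2 - r ** 2) * dtheta

r, theta = 2.0, 0.3
jacobian_det = wedge2(*polar_columns(r, theta))  # equals r, up to rounding

# Ratio of the image area to the area of the original box, for shrinking boxes.
for delta in (0.1, 0.01, 0.001):
    ratio = exact_image_area(r, delta, delta) / (delta * delta)
    print(delta, ratio, jacobian_det)

# Swapping the coordinates, (x, y) -> (y, x), has partials (0, 1) and (1, 0).
swap_det = wedge2((0.0, 1.0), (1.0, 0.0))  # negative: orientation is reversed
print(swap_det)
```

Here the ratio works out to r + \Delta r/2, which approaches the Jacobian determinant r as the box shrinks. The coordinate swap, on the other hand, preserves areas but has Jacobian determinant -1, reflecting the fact that it flips the orientation of every small region.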

November 11, 2009 | Analysis, Calculus