Let’s start today by introducing some notation for the Jacobian determinant which we introduced yesterday. We’ll write the Jacobian determinant of a differentiable function $f$ at a point $x$ as $J_f(x)$. Or, in more of a Leibnizean style:

$\displaystyle\frac{\partial(f^1,\dots,f^n)}{\partial(x^1,\dots,x^n)}$
We’re interested in determining the Jacobian of the composite of two differentiable functions. To which end, suppose $g:X\rightarrow\mathbb{R}^n$ and $f:Y\rightarrow\mathbb{R}^n$ are differentiable functions on two open regions $X$ and $Y$ in $\mathbb{R}^n$, with $g(X)\subseteq Y$, and let $f\circ g$ be their composite. Then the chain rule tells us that

$d(f\circ g)(x)=df(g(x))dg(x)$
where each differential is an $n\times n$ matrix, and the right-hand side is a matrix multiplication.
But these matrices are exactly the Jacobian matrices of the functions! And the determinant of the product of two matrices is the product of their determinants. That is, we find the equation

$J_{f\circ g}(x)=J_f(g(x))J_g(x)$
Or, we could define $y=g(x)$ and use the Leibniz notation to write

$\displaystyle\frac{\partial(f^1,\dots,f^n)}{\partial(x^1,\dots,x^n)}=\frac{\partial(f^1,\dots,f^n)}{\partial(y^1,\dots,y^n)}\frac{\partial(y^1,\dots,y^n)}{\partial(x^1,\dots,x^n)}$
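To see the multiplicativity of Jacobian determinants in action, here is a small numeric sketch (the particular maps $f$ and $g$ and the sample point are my own choices, not from the text): we check that the Jacobian determinant of a composite is the product of the Jacobian determinants.

```python
import numpy as np

def g(x):
    # g(x, y) = (x + y², x·y)
    return np.array([x[0] + x[1]**2, x[0] * x[1]])

def dg(x):
    # Jacobian matrix of g, worked out by hand
    return np.array([[1.0, 2.0 * x[1]],
                     [x[1], x[0]]])

def f(y):
    # f(u, v) = (u·v, u − v)
    return np.array([y[0] * y[1], y[0] - y[1]])

def df(y):
    # Jacobian matrix of f, worked out by hand
    return np.array([[y[1], y[0]],
                     [1.0, -1.0]])

x = np.array([1.5, -0.5])
# Chain rule: the matrix of d(f∘g) at x is the product df(g(x)) · dg(x),
# so the determinants multiply.
J_composite = np.linalg.det(df(g(x)) @ dg(x))
J_product = np.linalg.det(df(g(x))) * np.linalg.det(dg(x))
print(np.isclose(J_composite, J_product))  # True: det(AB) = det(A)det(B)
```

This is nothing more than the multiplicativity of the determinant applied to the chain rule's matrix product.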
As a special case, let’s assume that the differentiable function $f$ is injective in some open neighborhood $U$ of a point $x$. That is, every $u\in U$ is sent to a distinct point $f(u)$ by $f$, making up the whole image $f(U)$. Further, let’s suppose that the function $f^{-1}:f(U)\rightarrow U$ which sends each point back to the point in $U$ from which it came — $f^{-1}(v)=u$ if and only if $f(u)=v$ — is also differentiable. Then we have the composition $f^{-1}\circ f=1_U$, and thus we find

$J_{f^{-1}}(f(x))J_f(x)=1$
Thus, if a differentiable function $f$ has a differentiable inverse function defined in some neighborhood of a point $x$, then the Jacobian determinant of the function must be nonzero at that point. A fair bit of work will now be put to turning this statement around. That is, we seek to show that if the Jacobian determinant $J_f(x)\neq0$, then $f$ has a differentiable inverse in some neighborhood of $x$.
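A quick numeric check of the reciprocal relationship (the example map and point are assumptions of mine): for an injective map with a differentiable inverse, the two Jacobian determinants multiply to $1$.

```python
import numpy as np

def f(x):
    # f(x, y) = (eˣ, x + y) is injective with a differentiable inverse
    return np.array([np.exp(x[0]), x[0] + x[1]])

def df(x):
    return np.array([[np.exp(x[0]), 0.0],
                     [1.0, 1.0]])

def f_inv(y):
    # inverse worked out by hand: x = ln(u), y = v − ln(u)
    return np.array([np.log(y[0]), y[1] - np.log(y[0])])

def df_inv(y):
    return np.array([[1.0 / y[0], 0.0],
                     [-1.0 / y[0], 1.0]])

x = np.array([0.7, -2.0])
Jf = np.linalg.det(df(x))          # Jacobian determinant of f at x
Jf_inv = np.linalg.det(df_inv(f(x)))  # Jacobian determinant of f⁻¹ at f(x)
print(np.isclose(Jf * Jf_inv, 1.0))  # True: each is the other's reciprocal
```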
We’ll focus on a differentiable function $f:X\rightarrow\mathbb{R}^n$, where $X$ is itself some open region in $\mathbb{R}^n$. That is, if we pick a basis $\{e_i\}$ and coordinates $x^1,\dots,x^n$ of $\mathbb{R}^n$, then the function $f$ is a vector-valued function of $n$ real variables with components $f^1,\dots,f^n$. The differential, then, is itself a vector-valued function whose components are the differentials of the component functions: $df=(df^1,\dots,df^n)$. We can write these differentials out in terms of partial derivatives:

$\displaystyle df^i=\frac{\partial f^i}{\partial x^j}dx^j$
Just like we said when discussing the chain rule, the differential at the point $x$ defines a linear transformation from the $n$-dimensional space of displacement vectors at $x$ to the $n$-dimensional space of displacement vectors at $f(x)$, and the matrix entries with respect to the given basis are given by the partial derivatives.
It is this transformation that we will refer to as the Jacobian, or the Jacobian transformation. Alternately, sometimes the representing matrix is referred to as the Jacobian, or the Jacobian matrix. Since this matrix is square, we can calculate its determinant, which is also referred to as the Jacobian, or the Jacobian determinant. I’ll try to be clear which I mean, but often the specific referent of “Jacobian” must be sussed out from context.
So, in light of our recent discussion, what does the Jacobian determinant mean? Well, imagine starting with an $n$-dimensional parallelepiped at the point $x$, with one side in each of the basis directions, and positively oriented. That is, it consists of the points $x+t^ie_i$ with each $t^i$ in the interval $[0,\Delta x^i]$ for some fixed $\Delta x^i$. We’ll assume for the moment that this whole region lands within the region $X$. It should be clear that this parallelepiped is represented by the wedge

$\left(\Delta x^1e_1\right)\wedge\dots\wedge\left(\Delta x^ne_n\right)$
which clearly has volume given by the product $\Delta x^1\cdots\Delta x^n$ of all the $\Delta x^i$.
Now the function $f$ sends this cube to a sort of curvy parallelepiped, consisting of the points $f(x+t^ie_i)$, with each $t^i$ in the interval $[0,\Delta x^i]$, and this image will have some volume. Unfortunately, we have no idea as yet how to measure such a volume. But we might be able to approximate it. Instead of using the actual curvy parallelepiped, we’ll build a new one. And if the $\Delta x^i$ are small enough, it will be more or less the same set of points, with the same volume. Or at least close enough for our purposes. We’ll replace the curved path defined by

$f(x+t^ie_i)\qquad t^i\in[0,\Delta x^i]$

by the displacement vector between the two endpoints:

$f(x+\Delta x^ie_i)-f(x)$

and use these new vectors to build a new parallelepiped

$\left(f(x+\Delta x^1e_1)-f(x)\right)\wedge\dots\wedge\left(f(x+\Delta x^ne_n)-f(x)\right)$
But this is still an awkward volume to work with. However, we can use the differential to approximate each of these differences

$f(x+\Delta x^ie_i)-f(x)\approx df(x)\left(\Delta x^ie_i\right)=\Delta x^idf(x)(e_i)$

with no summation here on the index $i$.
Now we can easily calculate the volume of this parallelepiped, represented by the wedge

$\left(\Delta x^1df(x)(e_1)\right)\wedge\dots\wedge\left(\Delta x^ndf(x)(e_n)\right)$

which can be rewritten as

$\Delta x^1\cdots\Delta x^n\,df(x)(e_1)\wedge\dots\wedge df(x)(e_n)=\Delta x^1\cdots\Delta x^n\,J_f(x)\,e_1\wedge\dots\wedge e_n$
which clearly has a volume of $\Delta x^1\cdots\Delta x^n$ — the volume of the original parallelepiped — times the Jacobian determinant. That is, the Jacobian determinant at $x$ estimates the factor by which the function $f$ expands small volumes near that point. And if the Jacobian determinant is negative, it tells us that $f$ locally reverses the orientation of small regions near the point.
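The approximation argument above can be tested numerically. Here is a sketch (the polar-style map, sample point, and cube size are my own choices): we push a small square through the map edge by edge, take the determinant-volume of the resulting parallelogram of displacement vectors, and compare against the Jacobian determinant times the square's area.

```python
import numpy as np

def f(x):
    # polar-style map: f(r, t) = (r·cos t, r·sin t)
    return np.array([x[0] * np.cos(x[1]), x[0] * np.sin(x[1])])

def df(x):
    # Jacobian matrix, worked out by hand; its determinant is r
    return np.array([[np.cos(x[1]), -x[0] * np.sin(x[1])],
                     [np.sin(x[1]),  x[0] * np.cos(x[1])]])

x = np.array([2.0, 0.3])
h = 1e-4  # edge length of the small cube

# Edges of the image "curvy" parallelepiped, replaced by displacement vectors
edges = np.column_stack([f(x + h * e) - f(x) for e in np.eye(2)])
approx_volume = abs(np.linalg.det(edges))

J = np.linalg.det(df(x))  # the Jacobian determinant, here r = 2.0
print(np.isclose(approx_volume, abs(J) * h**2, rtol=1e-3))  # True
```

Shrinking `h` further makes the relative error shrink proportionally, as the linearization argument predicts.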
Finally we can get to something that is presented to students in multivariable calculus and physics classes as if it were a basic operation: the cross product of three-dimensional vectors. This only works out because the Hodge star defines an isomorphism from $A^2(V)$ to $A^1(V)=V$ when $\dim(V)=3$. We define

$u\times v=*(u\wedge v)$
All the usual properties of the cross product are really properties of the wedge product combined with the Hodge star. Geometrically, $u\times v$ is defined as a vector perpendicular to the plane spanned by $u$ and $v$, which is exactly what the Hodge star produces. We choose which perpendicular direction by the “right-hand rule”, but this is only because we choose the basis vectors $e_1$, $e_2$, and $e_3$ (or as these classes often call them: $i$, $j$, and $k$) by the same convention, and this defines an orientation we have to stick with when we define the Hodge star. The length of the cross product is the area of the parallelogram spanned by $u$ and $v$, again as expected from the Hodge star. Algebraically, the cross product is anticommutative and linear in each variable. These are properties of the wedge product, and the Hodge star — being linear — preserves them.
The biggest fib we tell students is that the value of the cross product is a vector. It certainly looks like a vector on the surface, but the problem is that it doesn’t transform like a vector. Before the advent of thinking of all these things geometrically, people thought of a vector quantity as a triple of real numbers that transform in a certain way when we change to a different orthonormal basis. This is inspired by the physical world, where there’s no magic orthonormal basis floating out somewhere to pick out coordinates. We should be able to turn our heads and translate the laws of physics to compensate exactly. These rotations form the special orthogonal group of orientation- and inner product-preserving transformations, but we can also throw in reflections to get the whole orthogonal group, of all transformations from one orthonormal basis to another.
So let’s imagine what happens to a cross product when we reflect the world. In fact, stand by a mirror and hold out your right hand in the familiar way, with your index finger along one imagined vector , your middle finger along another vector , and your thumb pointing in the direction of the cross product . Now look in the mirror.
The orientation has been reversed, and mirror-you is holding out its left hand! If mirror-you tried to use its version of the cross product, it would find that the cross product should go in the other direction. The cross product doesn’t behave like all the other vectors in the world, because it doesn’t reflect the same way.
Physicists to this day use the old language describing a triple of real numbers that transform like a vector under rotations, but point the wrong way under reflections. They call such a quantity a “pseudovector”. And they also have a word for a single real number that somehow mysteriously flips its sign when we apply a reflection: a “pseudoscalar”. Whenever we read about scalar, vector, pseudovector, and pseudoscalar quantities, they just mean real numbers (or triples of them) and specify how they change under certain orthogonal transformations.
But geometrically we can see exactly what’s going on. These are just the spaces $\mathbb{R}$, $V$, $A^2(V)$, and $A^3(V)$, along with their representations of the orthogonal group $O(3)$. And the “pseudo” means we’ve used the Hodge star — which depends essentially on a choice of orientation — to pretend that bivectors in $A^2(V)$ and trivectors in $A^3(V)$ are just like vectors in $V$ and scalars in $\mathbb{R}$, respectively. And we can get away with it for a long time, until a mirror shows up.
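The mirror thought experiment can be run in code. In this sketch (the reflection and sample vectors are my own), reflecting both inputs to the cross product does not simply reflect the output: an extra sign of $\det(R)=-1$ appears, which is exactly the pseudovector behavior.

```python
import numpy as np

R = np.diag([1.0, 1.0, -1.0])  # a mirror: reflection negating the z-axis
u = np.array([1.0, 2.0, 3.0])
v = np.array([-1.0, 0.5, 2.0])

lhs = np.cross(R @ u, R @ v)   # cross product as mirror-you computes it
rhs = R @ np.cross(u, v)       # the honest reflection of our cross product
print(np.allclose(lhs, -rhs))  # True: the result picks up det(R) = -1
```

For a rotation (determinant $+1$) the two sides would agree exactly, which is why the discrepancy only shows up when a reflection is involved.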
The only essential tool from multivariable calculus or introductory physics built from the cross product that we might have need of is the “triple scalar product”, which takes three vectors $u$, $v$, and $w$. It calculates the cross product of two of them, and then the inner product with the third to get the scalar $\langle u\times v,w\rangle$. But this is the coefficient of our unit cube in the definition of the Hodge star:

$u\wedge v\wedge w=\langle u\times v,w\rangle\,e_1\wedge e_2\wedge e_3$

since $u\times v=*(u\wedge v)$. That is, the triple scalar product gives the (oriented) volume of the parallelepiped spanned by $u$, $v$, and $w$, just as we remember from those classes. We really don’t need the cross product as a primitive operation at all, and in the long run it only leads to confusion as it identifies vectors and pseudovectors without the explicit use of the orientation-dependent Hodge star to keep us straight.
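As a concrete check (sample vectors are my own), the triple scalar product coincides with the determinant of the matrix whose columns are the three vectors, i.e. the oriented volume of the spanned parallelepiped.

```python
import numpy as np

u = np.array([1.0, 0.0, 2.0])
v = np.array([0.0, 3.0, 1.0])
w = np.array([1.0, 1.0, 1.0])

triple = np.dot(np.cross(u, v), w)              # ⟨u × v, w⟩
det = np.linalg.det(np.column_stack([u, v, w]))  # oriented volume
print(np.isclose(triple, det))  # True
```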
Sorry for the delay from last Friday to today, but I was chasing down a good lead.
Anyway, last week I said that I’d talk about a linear map that extends the notion of the correspondence between parallelograms in space and perpendicular vectors.
First of all, we should see why there may be such a correspondence. We’ve identified $k$-dimensional parallelepipeds in an $n$-dimensional vector space $V$ with antisymmetric tensors of degree $k$: elements of $A^k(V)$. Of course, not every such tensor will correspond to a parallelepiped (some will be linear combinations that can’t be written as a single wedge of vectors), but we’ll just keep going and let our methods apply to such more general tensors. Anyhow, we also know how to count the dimension of the space of such tensors:

$\displaystyle\dim A^k(V)=\binom{n}{k}=\frac{n!}{k!(n-k)!}$
This formula tells us that $A^k(V)$ and $A^{n-k}(V)$ will have the exact same dimension, and so it makes sense that there might be an isomorphism between them. And we’re going to look for one which defines the “perpendicular” $(n-k)$-dimensional parallelepiped with the same size.
So what do we mean by “perpendicular”? It’s not just in terms of the “angle” defined by the inner product. Indeed, in that sense the parallelograms $e_1\wedge e_2$ and $e_1\wedge e_3$ are perpendicular. No, we want any vector in the subspace defined by our parallelepiped to be perpendicular to any vector in the subspace defined by the new one. That is, we want the new parallelepiped to span the orthogonal complement to the subspace we start with.
Our definition will also need to take into account the orientation on $V$. Indeed, considering the parallelogram $e_1\wedge e_2$ in three-dimensional space, the perpendicular must be $ce_3$ for some nonzero constant $c$, or otherwise it won’t be perpendicular to the whole $e_1$–$e_2$ plane. And $\lvert c\rvert$ has to be $1$ in order to get the right size. But will it be $e_3$ or $-e_3$? The difference is entirely in the orientation.
Okay, so let’s pick an orientation on $V$, which gives us a particular top-degree tensor $\omega$ so that $\langle\omega,\omega\rangle=1$. Now, given some $\eta\in A^k(V)$, we define the Hodge dual $*\eta$ to be the unique antisymmetric tensor of degree $n-k$ satisfying

$\zeta\wedge*\eta=\langle\zeta,\eta\rangle\omega$

for all $\zeta\in A^k(V)$. Notice here that if $\zeta$ and $\eta$ describe parallelepipeds, and any side of $\zeta$ is perpendicular to all the sides of $\eta$, then the projection of $\zeta$ onto the subspace spanned by $\eta$ will have zero volume, and thus $\langle\zeta,\eta\rangle=0$. This is what we expect, for then this side of $\zeta$ must lie within the perpendicular subspace spanned by $*\eta$, and so the wedge $\zeta\wedge*\eta$ should also be zero.
As a particular example, say we have an orthonormal basis $\{e_i\}$ of $V$ so that $\omega=e_1\wedge\dots\wedge e_n$. Then given a multi-index $I=(i_1,\dots,i_k)$ the basic wedge $e_{i_1}\wedge\dots\wedge e_{i_k}$ gives us the subspace spanned by the vectors $e_{i_1},\dots,e_{i_k}$. The orthogonal complement is clearly spanned by the remaining basis vectors $e_{j_1},\dots,e_{j_{n-k}}$, and so $*(e_{i_1}\wedge\dots\wedge e_{i_k})=\pm e_{j_1}\wedge\dots\wedge e_{j_{n-k}}$, with the sign depending on whether the list $(i_1,\dots,i_k,j_1,\dots,j_{n-k})$ is an even or an odd permutation of $(1,\dots,n)$.
To be even more explicit, let’s work these out for the cases of dimensions three and four. First off, we have a basis $\{e_1,e_2,e_3\}$. We work out all the duals of basic wedges as follows:

$*1=e_1\wedge e_2\wedge e_3$
$*e_1=e_2\wedge e_3$
$*e_2=-e_1\wedge e_3$
$*e_3=e_1\wedge e_2$
$*(e_1\wedge e_2)=e_3$
$*(e_1\wedge e_3)=-e_2$
$*(e_2\wedge e_3)=e_1$
$*(e_1\wedge e_2\wedge e_3)=1$
This reconstructs the correspondence we had last week between basic parallelograms and perpendicular basis vectors. In the four-dimensional case, the basis $\{e_1,e_2,e_3,e_4\}$ leads to the duals

$*1=e_1\wedge e_2\wedge e_3\wedge e_4$
$*e_1=e_2\wedge e_3\wedge e_4$
$*e_2=-e_1\wedge e_3\wedge e_4$
$*e_3=e_1\wedge e_2\wedge e_4$
$*e_4=-e_1\wedge e_2\wedge e_3$
$*(e_1\wedge e_2)=e_3\wedge e_4$
$*(e_1\wedge e_3)=-e_2\wedge e_4$
$*(e_1\wedge e_4)=e_2\wedge e_3$
$*(e_2\wedge e_3)=e_1\wedge e_4$
$*(e_2\wedge e_4)=-e_1\wedge e_3$
$*(e_3\wedge e_4)=e_1\wedge e_2$
$*(e_1\wedge e_2\wedge e_3)=e_4$
$*(e_1\wedge e_2\wedge e_4)=-e_3$
$*(e_1\wedge e_3\wedge e_4)=e_2$
$*(e_2\wedge e_3\wedge e_4)=-e_1$
$*(e_1\wedge e_2\wedge e_3\wedge e_4)=1$
It’s not a difficult exercise to work out the relation $**\eta=(-1)^{k(n-k)}\eta$ for a degree-$k$ tensor $\eta$ in an $n$-dimensional space.
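The sign rule for basic wedges is easy to mechanize. Here is a sketch (function names are my own): the dual of $e_I$ is $\pm e_J$ for the complementary multi-index $J$, with the sign given by the parity of the permutation $(I,J)$ of $(1,\dots,n)$, and applying the star twice recovers the $(-1)^{k(n-k)}$ relation.

```python
from itertools import combinations

def permutation_sign(perm):
    # sign of a permutation, computed by counting inversions
    inversions = sum(1 for i in range(len(perm))
                       for j in range(i + 1, len(perm)) if perm[i] > perm[j])
    return -1 if inversions % 2 else 1

def hodge_basic(I, n):
    """Return (sign, J) so that *e_I = sign · e_J in an oriented orthonormal basis."""
    J = tuple(i for i in range(1, n + 1) if i not in I)
    return permutation_sign(I + J), J

# The familiar three-dimensional correspondences:
print(hodge_basic((1, 2), 3))  # (1, (3,)):   *(e1∧e2) = e3
print(hodge_basic((1, 3), 3))  # (-1, (2,)):  *(e1∧e3) = -e2
print(hodge_basic((2, 3), 3))  # (1, (1,)):   *(e2∧e3) = e1

# Applying the star twice gives (-1)^{k(n-k)} on degree-k tensors:
n, k = 4, 2
for I in combinations(range(1, n + 1), k):
    s1, J = hodge_basic(I, n)
    s2, I2 = hodge_basic(J, n)
    assert I2 == I and s1 * s2 == (-1) ** (k * (n - k))
```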
Today I want to run through an example of how we use our new tools to read geometric information out of a parallelogram.
I’ll work within $\mathbb{R}^3$ with an orthonormal basis $\{e_1,e_2,e_3\}$ and an identified origin $O$ to give us a system of coordinates. That is, given a point $P$, we set up a vector $v$ pointing from $O$ to $P$ (which we can do in a Euclidean space). Then this vector has components in terms of the basis:

$v=v^1e_1+v^2e_2+v^3e_3$

and we’ll write the point as $P=(v^1,v^2,v^3)$.
So let’s pick four points: , , , and . These four points do, indeed, give the vertices of a parallelogram, since both displacements from to and from to are , and similarly the displacements from to and from to are both . Alternatively, all four points lie within the plane described by , and the region in this plane contained between the vertices consists of points so that
for some and both in the interval . So this is a parallelogram contained between and . Incidentally, note that the fact that all these points lie within a plane means that any displacement vector between two of them is in the kernel of some linear transformation. In this case, it’s the linear functional , and the vector is perpendicular to any displacement in this plane, which will come in handy later.
Now in a more familiar approach, we might say that the area of this parallelogram is its base times its height. Let’s work that out to check our answer against later. For the base, we take the length of one vector, say . We use the inner product to calculate its length as . For the height we can’t just take the length of the other vector. Some basic trigonometry shows that we need the length of the other vector (which is again ) times the sine of the angle between the two vectors. To calculate this angle we again use the inner product to find that its cosine is , and so its sine is . Multiplying these all together we find a height of , and thus an area of .
On the other hand, let’s use our new tools. We represent the parallelogram as the wedge — incidentally choosing an orientation of the parallelogram and the entire plane containing it — and calculate its length using the inner product on the exterior algebra:
Alternately, we could calculate it by expanding in terms of basic wedges. That is, we can write
This tells us that if we take our parallelogram and project it onto the – plane (which has an orthonormal basis ) we get an area of . Similarly, projecting our parallelogram onto the – plane (with orthonormal basis ) we get an area of . That is, the area is and the orientation of the projected parallelogram disagrees with that of the plane. Anyhow, now the squared area of the parallelogram is the sum of the squares of these projected areas: .
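The two area computations can be compared in general. Here is a sketch (the sample sides are my own, since the post’s specific points did not survive transcription): base-times-height uses lengths and the angle, while the wedge computation sums the squares of the $2\times2$ minors, which are exactly the signed areas of the projections onto the coordinate planes.

```python
import numpy as np

a = np.array([1.0, 2.0, 0.0])  # one side of the parallelogram
b = np.array([0.0, 1.0, 2.0])  # the other side

# Base times height: |a| · |b| · sin θ
cos_theta = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
area_trig = np.linalg.norm(a) * np.linalg.norm(b) * np.sqrt(1 - cos_theta**2)

# Wedge coefficients: the three 2×2 minors of the 3×2 matrix [a b]
M = np.column_stack([a, b])
minors = [np.linalg.det(M[[i, j], :]) for i, j in [(0, 1), (0, 2), (1, 2)]]
area_wedge = np.sqrt(sum(m**2 for m in minors))

print(np.isclose(area_trig, area_wedge))  # True: both give √21 here
```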
Notice, now, the similarity between this expression and the perpendicular vector we found before: . Each one is the sum of three terms with the same choices of signs. The terms themselves seem to have something to do with each other as well; the wedge describes an area in the – plane, while describes a length in the perpendicular -axis. Similarly, describes an area in the – plane, while describes a length in the perpendicular -axis. And, magically, the sum of these three perpendicular vectors to these three parallelograms gives the perpendicular vector to their sum!
There is, indeed, a linear correspondence between parallelograms and vectors that extends this idea, which we will explore tomorrow. The seemingly-odd choice of to correspond to , though, should be a tip-off that this correspondence is closely bound up with the notion of orientation.
So, why bother with this orientation stuff, anyway? We’ve got an inner product on spaces of antisymmetric tensors, and that should give us a concept of length. Why can’t we just calculate the size of a parallelepiped by sticking it into this bilinear form twice?
Well, let’s see what happens. Given a $k$-dimensional parallelepiped with sides $v_1$ through $v_k$, we represent the parallelepiped by the wedge $v_1\wedge\dots\wedge v_k$. Then we might try defining the volume by using the renormalized inner product

$\mathrm{vol}^2\left(v_1\wedge\dots\wedge v_k\right)=\langle v_1\wedge\dots\wedge v_k,v_1\wedge\dots\wedge v_k\rangle$

Let’s expand one copy of the wedge out in terms of our basis of wedges of basis vectors

$\displaystyle v_1\wedge\dots\wedge v_k=\sum\limits_Ic^Ie_{i_1}\wedge\dots\wedge e_{i_k}$

where the multi-index $I=(i_1,\dots,i_k)$ runs over all increasing $k$-tuples of indices $1\leq i_1<\dots<i_k\leq n$. But we already know that these basic wedges form an orthonormal basis, and so this squared volume is the sum of the squares of these components, just like we’re familiar with. Then we can define the $k$-volume of the parallelepiped as the square root of this sum.
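The recipe just described can be sketched numerically (helper names and sample sides are my own): the squared $k$-volume is the sum of the squares of the $k\times k$ minors of the matrix of sides, and for comparison the Gram-determinant form $\sqrt{\det(V^\mathsf{T}V)}$ gives the same number, by the Cauchy–Binet formula.

```python
import numpy as np
from itertools import combinations

def volume(sides):
    # Square root of the sum of squared k×k minors: the norm of the wedge
    V = np.column_stack(sides)  # n×k matrix, one column per side
    n, k = V.shape
    squared = sum(np.linalg.det(V[list(rows), :])**2
                  for rows in combinations(range(n), k))
    return np.sqrt(squared)

sides = [np.array([1.0, 0.0, 1.0, 0.0]),
         np.array([0.0, 2.0, 1.0, 1.0])]  # a 2-parallelepiped in R⁴
V = np.column_stack(sides)
gram = np.sqrt(np.linalg.det(V.T @ V))  # Gram-determinant form
print(np.isclose(volume(sides), gram))  # True: both give √11 here
```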
Let’s look specifically at what happens for top-dimensional parallelepipeds, where $k=n$. Then we only have one possible multi-index $I=(1,\dots,n)$, with coefficient

$\det\left(v_i^j\right)$

and so our formula reads

$\mathrm{vol}\left(v_1\wedge\dots\wedge v_n\right)=\sqrt{\det\left(v_i^j\right)^2}=\left\lvert\det\left(v_i^j\right)\right\rvert$
So we get the magnitude of the volume without having to worry about choosing an orientation. Why even bother?
Because we already do care about orientation. Let’s go all the way back to one-dimensional parallelepipeds, which are just described by vectors. A vector doesn’t just describe a certain length, it describes a length along a certain line in space. And it doesn’t just describe a length along that line, it describes a length in a certain direction along that line. A vector picks out three things:
- A one-dimensional subspace of the ambient space $V$.
- An orientation of that subspace.
- A volume (length) of this oriented subspace.
And just like vectors, nondegenerate $k$-dimensional parallelepipeds pick out three things:
- A $k$-dimensional subspace of the ambient space $V$.
- An orientation of that subspace.
- A $k$-dimensional volume of this oriented subspace.
The difference is that when we get up to the top dimension the space itself can have its own orientation, which may or may not agree with the orientation induced by the parallelepiped. We don’t always care about this disagreement, and we can just take the absolute value to get rid of a sign if we don’t care, but it might come in handy.
The universal property of spaces of antisymmetric tensors says that any such alternating multilinear functional corresponds to a unique linear functional on $A^k(V)$. That is, we take the parallelepiped with sides $v_1$ through $v_k$ and represent it by the antisymmetric tensor $v_1\wedge\dots\wedge v_k$. Notice, in particular, that if the parallelepiped is degenerate then this tensor is $0$, as we hoped. Then volume is some linear functional that takes in such an antisymmetric tensor and spits out a real number. But which linear functional?
I’ll start by answering this question for $n$-dimensional parallelepipeds in $n$-dimensional space. Such a parallelepiped is represented by an antisymmetric tensor with the sides as its tensorands. But we’ve calculated the dimension of the space of such tensors: $\dim A^n(V)=\binom{n}{n}=1$. That is, once we represent these parallelepipeds by antisymmetric tensors there’s only one parameter left to distinguish them: their volume. So if we specify the volume of one parallelepiped linearity will take care of all the others.
There’s one parallelepiped whose volume we know already. The unit $n$-cube must have unit volume. So, to this end, pick an orthonormal basis $\{e_i\}$. A parallelepiped with these sides corresponds to the antisymmetric tensor $e_1\wedge\dots\wedge e_n$, and the volume functional must send this to $1$. But be careful! The volume doesn’t depend just on the choice of basis, but on the order of the basis elements. Swap two of the basis elements and we should swap the sign of the volume. So we’ve got two different choices of volume functional here, which differ exactly by a sign. We call these two choices “orientations” on our vector space.
This is actually not as esoteric as it may seem. Almost all introductions to vectors — from multivariable calculus to vector-based physics — talk about “left-handed” and “right-handed” coordinate systems. These differ by a reflection, which would change the signs of all parallelepipeds. So we must choose one or the other, and choose which unit cube will have volume $1$ and which will have volume $-1$. The isomorphism from $A^n(V)$ to $\mathbb{R}$ then gives us a “volume form” $\mathrm{vol}$, which will give us the volume of a parallelepiped represented by a given top-degree wedge.
Once we’ve made that choice, what about general parallelepipeds? If we have sides $v_1,\dots,v_n$ — written in components as $v_i=v_i^je_j$ — we represent the parallelepiped by the wedge $v_1\wedge\dots\wedge v_n$. This is the image of our unit cube under the transformation sending $e_i$ to $v_i$, and so we find

$\mathrm{vol}\left(v_1\wedge\dots\wedge v_n\right)=\det\left(v_i^j\right)$
The volume of the parallelepiped is the determinant of this transformation.
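This conclusion fits in one line of code (the example sides are my own): the oriented volume of an $n$-parallelepiped in $n$-space is the determinant of the matrix whose columns are its sides, once an orientation fixes the sign of the unit cube.

```python
import numpy as np

# Columns are the sides of a parallelepiped in R³
sides = np.column_stack([[2.0, 0.0, 0.0],
                         [1.0, 3.0, 0.0],
                         [0.0, 1.0, 1.0]])
# Oriented volume: positive, since these sides are positively oriented
print(np.isclose(np.linalg.det(sides), 6.0))  # True
```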
Incidentally, this gives a geometric meaning to the special orthogonal group $SO(n)$. Orthogonal transformations send orthonormal bases to other orthonormal bases, which will send unit cubes to other unit cubes. But the determinant of an orthogonal transformation may be either $1$ or $-1$. Transformations of the first kind make up the special orthogonal group, while transformations of the second kind send “positive” unit cubes to “negative” ones, and vice-versa. That is, they involve some sort of reflection, swapping the choice of orientation we made above. Special orthogonal transformations are those which preserve not only lengths and angles, but the orientation of the space. More generally, there is a homomorphism sending a transformation to the sign of its determinant. Transformations with positive determinant are said to be “orientation-preserving”, while those with negative determinant are said to be “orientation-reversing”.
And we’re back with more of what Mr. Martinez of Harvard’s Medical School assures me is onanism of the highest caliber. I’m sure he, too, blames me for not curing cancer.
Coming up in our study of calculus in higher dimensions we’ll need to understand parallelepipeds, and in particular their volumes. First of all, what is a parallelepiped? Or, more specifically, what is a $k$-dimensional parallelepiped in $n$-dimensional space? It’s a collection of points in space that we can describe as follows. Take a point $p$ and $k$ vectors $v_1,\dots,v_k$ in $\mathbb{R}^n$. The parallelepiped is the collection of points reachable by moving from $p$ by some fraction of each of the vectors $v_i$. That is, we pick values $t^1,\dots,t^k$, each in the interval $[0,1]$, and use them to specify the point $p+t^iv_i$. The collection of all such points is the parallelepiped with corner $p$ and sides $v_1,\dots,v_k$.
One possible objection is that these sides may not be linearly independent. If the sides are linearly independent, then they span a $k$-dimensional subspace of the ambient space, justifying our calling it $k$-dimensional. But if they’re not, then the subspace they span has a lower dimension. We’ll deal with this by calling such a parallelepiped “degenerate”, and the nice ones with linearly independent sides “nondegenerate”. Trust me, things will be more elegant in the long run if we just deal with them both on the same footing.
Now we want to consider the volume of a parallelepiped. The first observation is that the volume doesn’t depend on the corner point $p$. Indeed, we should be able to slide the corner around to any point in space as long as we bring the same displacement vectors along with us. So the volume should be a function only of the sides.
The second observation is that as a function of the sides, the volume function should commute with scalar multiplication in each variable separately. That is, if we multiply a side $v_i$ by a non-negative factor of $c$, then we multiply the whole volume of the parallelepiped by $c$ as well. But what about negative scaling factors? What if we reflect the side (and thus the whole parallelepiped) to point the other way? One answer might be that we get the same volume, but it’s going to be easier (and again more elegant) if we say that the new parallelepiped has the negative of the original one’s volume.
Negative volume? What could that mean? Well, we’re going to move away from the usual notion of volume just a little. Instead, we’re going to think of “signed” volume, which includes the possibility of being positive or negative. By itself, this sign will be less than clear at first, but we’ll get a better understanding as we go. As a first step we’ll say that two parallelepipeds related by a reflection have opposite signs. This won’t only cover the above behavior under scaling sides, but also what happens when we exchange the order of two sides. For example, the parallelogram with sides $v$ and $w$ and the parallelogram with sides $w$ and $v$ have the same areas with opposite signs. Similarly, swapping the order of two sides in a given parallelepiped will flip its sign.
The third observation is that the volume function should be additive in each variable. One way to see this is that the $k$-dimensional volume of the parallelepiped with sides $v_1$ through $v_k$ should be the product of the $(k-1)$-dimensional volume of the parallelepiped with sides $v_1$ through $v_{k-1}$ and the length of the component of $v_k$ perpendicular to all the other sides, and this length is a linear function of $v_k$. Since there’s nothing special here about the last side, we could repeat the argument with the other sides.
The other way to see this fact is to consider the following diagram, helpfully supplied by Kate from over at f(t):
The side of one parallelogram is the (vector) sum of the sides of the other two, and we can see that the area of the one parallelogram is the sum of the areas of the other two. This justifies the assertion that for parallelograms in the plane, the area is additive as a function of one side (and, similarly, of the other). Similar diagrams should be apparent to justify the assertion for higher-dimensional parallelepipeds in higher-dimensional spaces.
Putting all these together, we find that the $k$-dimensional volume of a parallelepiped with sides $v_1,\dots,v_k$ is an alternating multilinear functional, with the sides as variables, and so it lives somewhere in the exterior algebra, as a linear functional on the space $A^k(V)$. We’ll have to work out which particular functional gives us a good notion of volume as we continue.
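The three observations above can be sanity-checked numerically (the example sides are my own choices): taking the determinant of the matrix of sides as our candidate signed volume, it is additive and homogeneous in each side, and swapping two sides flips its sign.

```python
import numpy as np

u = np.array([1.0, 2.0, 0.0])
v = np.array([0.0, 1.0, 1.0])
w = np.array([3.0, 0.0, 1.0])
x = np.array([1.0, 1.0, 1.0])

def vol(a, b, c):
    # signed volume: determinant of the matrix with the sides as columns
    return np.linalg.det(np.column_stack([a, b, c]))

assert np.isclose(vol(u, v, w + x), vol(u, v, w) + vol(u, v, x))  # additive
assert np.isclose(vol(u, v, 2.5 * w), 2.5 * vol(u, v, w))         # homogeneous
assert np.isclose(vol(u, w, v), -vol(u, v, w))                    # alternating
print("all three observations check out")
```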