The Unapologetic Mathematician

Mathematics for the interested outsider

The Jacobian of a Composition

Let’s start today by introducing some notation for the Jacobian determinant which we introduced yesterday. We’ll write the Jacobian determinant of a differentiable function f at a point x as J_f(x)=\det(df(x)). Or, in more of a Leibnizean style:

\displaystyle\frac{\partial(f^1,\dots,f^n)}{\partial(x^1,\dots,x^n)}=\det\left(\frac{\partial f^i}{\partial x^j}\right)

We’re interested in determining the Jacobian of the composite of two differentiable functions. To that end, suppose g:X\rightarrow\mathbb{R}^n and f:Y\rightarrow\mathbb{R}^n are differentiable functions on two open regions X and Y in \mathbb{R}^n, with g(X)\subseteq Y, and let h=f\circ g:X\rightarrow\mathbb{R}^n be their composite. Then the chain rule tells us that

\displaystyle dh(x)=df(g(x))dg(x)

where each differential is an n\times n matrix, and the right-hand side is a matrix multiplication.

But these matrices are exactly the Jacobian matrices of the functions! And the determinant of the product of two matrices is the product of their determinants. That is, we find the equation

\displaystyle J_h(x)=J_f(g(x))J_g(x)

Or, we could define y^i=g^i(x) and use the Leibniz notation to write

\displaystyle\frac{\partial(h^1,\dots,h^n)}{\partial(x^1,\dots,x^n)}=\frac{\partial(h^1,\dots,h^n)}{\partial(y^1,\dots,y^n)}\frac{\partial(y^1,\dots,y^n)}{\partial(x^1,\dots,x^n)}
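
If you’d like to see this in action, here’s a quick numerical sanity check — a sketch in Python with NumPy, where the particular maps f and g are just illustrative choices of mine, nothing canonical. We estimate each Jacobian matrix by central differences and check that the determinants multiply as claimed:

```python
import numpy as np

def g(x):
    """An illustrative differentiable map g: R^2 -> R^2 (my choice)."""
    return np.array([x[0]**2 - x[1], x[0] * x[1]])

def f(y):
    """An illustrative differentiable map f: R^2 -> R^2 (my choice)."""
    return np.array([np.sin(y[0]) + y[1], y[0] * y[1]**2])

def jacobian(func, x, h=1e-6):
    """Finite-difference approximation of the matrix (df^i/dx^j)."""
    n = len(x)
    J = np.zeros((n, n))
    for j in range(n):
        dx = np.zeros(n)
        dx[j] = h
        J[:, j] = (func(x + dx) - func(x - dx)) / (2 * h)
    return J

x = np.array([1.3, 0.7])
J_h = np.linalg.det(jacobian(lambda t: f(g(t)), x))
J_fg = np.linalg.det(jacobian(f, g(x))) * np.linalg.det(jacobian(g, x))
print(J_h, J_fg)  # the two values agree up to finite-difference error
```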

As a special case, let’s assume that the differentiable function f:X\rightarrow\mathbb{R}^n is injective in some open neighborhood A of a point a. That is, every x\in A is sent to a distinct point by f, making up the whole image f(A). Further, let’s suppose that the function f^{-1} which sends each point y\in f(A) back to the point in A from which it came — f^{-1}(y)=x if and only if y=f(x) — is also differentiable. Then we have the composition f^{-1}(f(x))=x, and thus we find

\displaystyle J_{f^{-1}}(f(a))J_f(a)=1

or

\displaystyle\frac{\partial(y^1,\dots,y^n)}{\partial(x^1,\dots,x^n)}\frac{\partial(x^1,\dots,x^n)}{\partial(y^1,\dots,y^n)}=1

Thus, if a differentiable function f has a differentiable inverse function defined in some neighborhood of a point a, then the Jacobian determinant of the function must be nonzero at that point. A fair bit of work will now be put to turning this statement around. That is, we seek to show that if the Jacobian determinant J_f(a)\neq0, then f has a differentiable inverse in some neighborhood of a.
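
For a concrete instance of this reciprocal relationship, consider the polar-coordinate map f(r,\theta)=(r\cos\theta,r\sin\theta), which has J_f=r and a differentiable local inverse wherever r\neq0. A small numerical sketch:

```python
import numpy as np

def f(p):
    """Polar coordinates to Cartesian: f(r, theta) = (r cos t, r sin t)."""
    r, theta = p
    return np.array([r * np.cos(theta), r * np.sin(theta)])

def f_inv(q):
    """Local inverse of f, valid away from the origin."""
    x, y = q
    return np.array([np.hypot(x, y), np.arctan2(y, x)])

def jacobian_det(func, x, h=1e-6):
    """Jacobian determinant by central differences."""
    n = len(x)
    J = np.zeros((n, n))
    for j in range(n):
        dx = np.zeros(n)
        dx[j] = h
        J[:, j] = (func(x + dx) - func(x - dx)) / (2 * h)
    return np.linalg.det(J)

a = np.array([2.0, 0.5])  # a point with r = 2, safely away from the origin
print(jacobian_det(f_inv, f(a)) * jacobian_det(f, a))  # approximately 1
```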

November 12, 2009 Posted by | Analysis, Calculus | 3 Comments

The Jacobian

Now that we’ve used exterior algebras to come to terms with parallelepipeds and their transformations, let’s come back to apply these ideas to the calculus.

We’ll focus on a differentiable function f:X\rightarrow\mathbb{R}^n, where X is itself some open region in \mathbb{R}^n. That is, if we pick a basis \{e_i\}_{i=1}^n and coordinates of \mathbb{R}^n, then the function f is a vector-valued function of n real variables x^1,\dots,x^n with components f^1,\dots,f^n. The differential, then, is itself a vector-valued function whose components are the differentials of the component functions: df=df^ie_i. We can write these differentials out in terms of partial derivatives:

\displaystyle df^i(x^1,\dots,x^n;t^1,\dots,t^n)=\frac{\partial f^i}{\partial x^1}\bigg\vert_{(x^1,\dots,x^n)}t^1+\dots+\frac{\partial f^i}{\partial x^n}\bigg\vert_{(x^1,\dots,x^n)}t^n

Just like we said when discussing the chain rule, the differential at the point (x^1,\dots,x^n) defines a linear transformation from the n-dimensional space of displacement vectors at (x^1,\dots,x^n) to the n-dimensional space of displacement vectors at f(x^1,\dots,x^n), and the matrix entries with respect to the given basis are given by the partial derivatives.

It is this transformation that we will refer to as the Jacobian, or the Jacobian transformation. Alternately, sometimes the representing matrix is referred to as the Jacobian, or the Jacobian matrix. Since this matrix is square, we can calculate its determinant, which is also referred to as the Jacobian, or the Jacobian determinant. I’ll try to be clear which I mean, but often the specific referent of “Jacobian” must be sussed out from context.

So, in light of our recent discussion, what does the Jacobian determinant mean? Well, imagine starting with an n-dimensional parallelepiped at the point (x^1,\dots,x^n), with one side in each of the basis directions, and positively oriented. That is, it consists of the points (x^1+t^1,\dots,x^n+t^n) with t^i in the interval [0,\Delta x^i] for some fixed \Delta x^i. We’ll assume for the moment that this whole region lands within the region X. It should be clear that this parallelepiped is represented by the wedge

\displaystyle(\Delta x^1e_1)\wedge\dots\wedge(\Delta x^ne_n)=(\Delta x^1\dots\Delta x^n)e_1\wedge\dots\wedge e_n

which clearly has volume given by the product of all the \Delta x^i.

Now the function f sends this cube to a sort of curvy parallelepiped, consisting of the points f(x^1+t^1,\dots,x^n+t^n), with each t^i in the interval [0,\Delta x^i], and this image will have some volume. Unfortunately, we have no idea as yet how to measure such a volume. But we might be able to approximate it. Instead of using the actual curvy parallelepiped, we’ll build a new one. And if the \Delta x^i are small enough, it will be more or less the same set of points, with the same volume. Or at least close enough for our purposes. We’ll replace the curved path defined by

\displaystyle f(x^1,\dots,x^i+t,\dots,x^n)\qquad0\leq t\leq\Delta x^i

by the displacement vector between the two endpoints:

\displaystyle f(x^1,\dots,x^i+\Delta x^i,\dots,x^n)-f(x^1,\dots,x^i,\dots,x^n)

and use these new vectors to build a new parallelepiped

\displaystyle\left(f(x^1+\Delta x^1,\dots,x^n)-f(x^1,\dots,x^n)\right)\wedge\dots\wedge\left(f(x^1,\dots,x^n+\Delta x^n)-f(x^1,\dots,x^n)\right)

But this is still an awkward volume to work with. However, we can use the differential to approximate each of these differences

\displaystyle\begin{aligned}f(x^1,\dots,x^k+\Delta x^k,\dots,x^n)&-f(x^1,\dots,x^k,\dots,x^n)\\&\approx df(x^1,\dots,x^n;0,\dots,\Delta x^k,\dots,0)\\&=\Delta x^kdf(x^1,\dots,x^n;0,\dots,1,\dots,0)\\&=\Delta x^kdf^i(x^1,\dots,x^n;0,\dots,1,\dots,0)e_i\\&=\Delta x^k\frac{\partial f^i}{\partial x^k}\bigg\vert_{(x^1,\dots,x^n)}e_i\end{aligned}

with no summation here on the index k.

Now we can easily calculate the volume of this parallelepiped, represented by the wedge

\displaystyle\left(\Delta x^1\frac{\partial f^i}{\partial x^1}\bigg\vert_{(x^1,\dots,x^n)}e_i\right)\wedge\dots\wedge\left(\Delta x^n\frac{\partial f^i}{\partial x^n}\bigg\vert_{(x^1,\dots,x^n)}e_i\right)

which can be rewritten as

\displaystyle\left(\Delta x^1\dots\Delta x^n\right)\left(\frac{\partial f^i}{\partial x^1}\bigg\vert_{(x^1,\dots,x^n)}e_i\right)\wedge\dots\wedge\left(\frac{\partial f^i}{\partial x^n}\bigg\vert_{(x^1,\dots,x^n)}e_i\right)

which clearly has a volume of \left(\Delta x^1\dots\Delta x^n\right) — the volume of the original parallelepiped — times the Jacobian determinant. That is, the Jacobian determinant at (x^1,\dots,x^n) estimates the factor by which the function f expands small volumes near that point. Or it tells us that locally f reverses the orientation of small regions near the point if the Jacobian determinant is negative.
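
We can watch this happen numerically. Here’s a sketch — with an arbitrary illustrative map f on \mathbb{R}^2 of my own choosing — that pushes the edges of a small coordinate square through f and compares the area of the resulting parallelogram to the Jacobian determinant:

```python
import numpy as np

def f(x):
    """An illustrative differentiable map on R^2."""
    return np.array([x[0] + 0.5 * np.sin(x[1]), x[1] + x[0]**2])

x = np.array([0.4, 1.1])
dx = 1e-4  # side length of the small coordinate square at x

# Edge vectors of the approximating image parallelogram, rescaled by
# 1/dx so that their determinant is (image area) / (original area)
edges = np.array([(f(x + dx * e) - f(x)) / dx for e in np.eye(2)])
ratio = np.linalg.det(edges)

# The Jacobian determinant at x, by central differences
h = 1e-6
J = np.column_stack([(f(x + h * e) - f(x - h * e)) / (2 * h)
                     for e in np.eye(2)])
print(ratio, np.linalg.det(J))  # nearly equal (both positive here)
```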

November 11, 2009 Posted by | Analysis, Calculus | 22 Comments

The Cross Product and Pseudovectors

Finally we can get to something that is presented to students in multivariable calculus and physics classes as if it were a basic operation: the cross product of three-dimensional vectors. This only works out because the Hodge star defines an isomorphism from A^2(V) to V when \dim(V)=3. We define

u\times v=*(u\wedge v)

All the usual properties of the cross product are really properties of the wedge product combined with the Hodge star. Geometrically, u\times v is defined as a vector perpendicular to the plane spanned by u and v, which is exactly what the Hodge star produces. We choose which perpendicular direction by the “right-hand rule”, but this is only because we choose the basis vectors e_1, e_2, and e_3 (or as these classes often call them: \hat{\imath}, \hat{\jmath}, and \hat{k}) by the same convention, and this defines an orientation we have to stick with when we define the Hodge star. The length of the cross product is the area of the parallelogram spanned by u and v, again as expected from the Hodge star. Algebraically, the cross product is anticommutative and linear in each variable. These are properties of the wedge product, and the Hodge star — being linear — preserves them.
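
In coordinates this is easy to check. Here’s a sketch: reading off the components of u\wedge v in the basis e_2\wedge e_3, e_3\wedge e_1, e_1\wedge e_2 — which the Hodge star sends to e_1, e_2, e_3 — reproduces the classical cross product formula:

```python
import numpy as np

def wedge_then_star(u, v):
    """Components of *(u ^ v): the bivector components of u ^ v on
    (e2^e3, e3^e1, e1^e2), which the star sends to (e1, e2, e3)."""
    return np.array([
        u[1]*v[2] - u[2]*v[1],   # coefficient of e2 ^ e3  ->  e1
        u[2]*v[0] - u[0]*v[2],   # coefficient of e3 ^ e1  ->  e2
        u[0]*v[1] - u[1]*v[0],   # coefficient of e1 ^ e2  ->  e3
    ])

u = np.array([1.0, 2.0, 3.0])
v = np.array([-1.0, 0.5, 2.0])
print(wedge_then_star(u, v))  # [ 2.5 -5.   2.5]
print(np.cross(u, v))         # identical
```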

The biggest fib we tell students is that the value of the cross product is a vector. It certainly looks like a vector on the surface, but the problem is that it doesn’t transform like a vector. Before the advent of thinking of all these things geometrically, people thought of a vector quantity as a triple of real numbers that transform in a certain way when we change to a different orthonormal basis. This is inspired by the physical world, where there’s no magic orthonormal basis floating out somewhere to pick out coordinates. We should be able to turn our heads and translate the laws of physics to compensate exactly. These rotations form the special orthogonal group of orientation- and inner product-preserving transformations, but we can also throw in reflections to get the whole orthogonal group, of all transformations from one orthonormal basis to another.

So let’s imagine what happens to a cross product when we reflect the world. In fact, stand by a mirror and hold out your right hand in the familiar way, with your index finger along one imagined vector u, your middle finger along another vector v, and your thumb pointing in the direction of the cross product u\times v. Now look in the mirror.

The orientation has been reversed, and mirror-you is holding out its left hand! If mirror-you tried to use its version of the cross product, it would find that the cross product should go in the other direction. The cross product doesn’t behave like all the other vectors in the world, because it doesn’t reflect the same way.

Physicists to this day use the old language describing a triple of real numbers that transform like a vector under rotations, but point the wrong way under reflections. They call such a quantity a “pseudovector”. And they also have a word for a single real number that somehow mysteriously flips its sign when we apply a reflection: a “pseudoscalar”. Whenever we read about scalar, vector, pseudovector, and pseudoscalar quantities, they just mean real numbers (or triples of them) and specify how they change under certain orthogonal transformations.
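
We can demonstrate the transformation law numerically: for an orthogonal matrix R we should find R(u)\times R(v)=\det(R)\,R(u\times v), so rotations respect the cross product while reflections pick up an extra sign. A sketch, where the sample rotation and reflection are arbitrary choices:

```python
import numpy as np

rng = np.random.default_rng(0)
u, v = rng.standard_normal(3), rng.standard_normal(3)

R_rot = np.array([[0., -1., 0.],    # a rotation: det = +1
                  [1.,  0., 0.],
                  [0.,  0., 1.]])
R_ref = np.diag([1., 1., -1.])      # a mirror reflection: det = -1

for R in (R_rot, R_ref):
    lhs = np.cross(R @ u, R @ v)
    rhs = np.linalg.det(R) * (R @ np.cross(u, v))
    print(np.allclose(lhs, rhs))    # True both times
```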

But geometrically we can see exactly what’s going on. These are just the spaces A^0(V)=\mathbb{R}, A^1(V)=V, A^2(V), and A^3(V), along with their representations of the orthogonal group \mathrm{O}(V). And the “pseudo” means we’ve used the Hodge star — which depends essentially on a choice of orientation — to pretend that bivectors in A^2(V) and trivectors in A^3(V) are just like vectors in V and scalars in \mathbb{R}, respectively. And we can get away with it for a long time, until a mirror shows up.

The only essential tool from multivariable calculus or introductory physics built from the cross product that we might have need of is the “triple scalar product”, which takes three vectors u, v, and w. It calculates the cross product v\times w of two of them, and then the inner product \langle u,v\times w\rangle=\langle u,*(v\wedge w)\rangle with the third to get a scalar. But this is the coefficient of our unit cube \omega in the definition of the Hodge star:

\displaystyle\langle u,*(v\wedge w)\rangle\omega=u\wedge**(v\wedge w)=u\wedge v\wedge w

since **(v\wedge w)=(-1)^{2\cdot(3-2)}v\wedge w. That is, the triple scalar product gives the (oriented) volume of the parallelepiped spanned by u, v, and w, just as we remember from those classes. We really don’t need the cross product as a primitive operation at all, and in the long run it only leads to confusion as it identifies vectors and pseudovectors without the explicit use of the orientation-dependent Hodge star to keep us straight.
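
As a quick check of this volume interpretation, the triple scalar product of three sample vectors agrees with the determinant of the matrix having those vectors as rows:

```python
import numpy as np

u = np.array([1.0, 0.0, 2.0])
v = np.array([0.0, 3.0, 1.0])
w = np.array([1.0, 1.0, 1.0])

print(np.dot(u, np.cross(v, w)))           # -4.0
print(np.linalg.det(np.array([u, v, w])))  # the same oriented volume
```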

November 10, 2009 Posted by | rants | 6 Comments

The Hodge Star

Sorry for the delay from last Friday to today, but I was chasing down a good lead.

Anyway, last week I said that I’d talk about a linear map that extends the notion of the correspondence between parallelograms in space and perpendicular vectors.

First of all, we should see why there may be such a correspondence. We’ve identified k-dimensional parallelepipeds in an n-dimensional vector space V with antisymmetric tensors of degree k: A^k(V). Of course, not every such tensor will correspond to a parallelepiped (some will be linear combinations that can’t be written as a single wedge of k vectors), but we’ll just keep going and let our methods apply to such more general tensors. Anyhow, we also know how to count the dimension of the space of such tensors:

\displaystyle\dim\left(A^k(V)\right)=\binom{n}{k}=\frac{n!}{k!(n-k)!}

This formula tells us that A^k(V) and A^{n-k}(V) will have the exact same dimension, and so it makes sense that there might be an isomorphism between them. And we’re going to look for one which defines the “perpendicular” (n-k)-dimensional parallelepiped with the same size.
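
Just to make the dimension count concrete, a one-line check in Python (purely illustrative):

```python
from math import comb

# dim A^k(V) = C(n, k); the symmetry comb(n, k) == comb(n, n - k) is
# what makes the Hodge isomorphism plausible in the first place.
n = 4
print([comb(n, k) for k in range(n + 1)])                       # [1, 4, 6, 4, 1]
print(all(comb(n, k) == comb(n, n - k) for k in range(n + 1)))  # True
```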

So what do we mean by “perpendicular”? It’s not just in terms of the “angle” defined by the inner product. Indeed, in that sense the parallelograms e_1\wedge e_2 and e_1\wedge e_3 are perpendicular. No, we want any vector in the subspace defined by our parallelepiped to be perpendicular to any vector in the subspace defined by the new one. That is, we want the new parallelepiped to span the orthogonal complement to the subspace we start with.

Our definition will also need to take into account the orientation on V. Indeed, considering the parallelogram e_1\wedge e_2 in three-dimensional space, the perpendicular must be ce_3 for some nonzero constant c, or otherwise it won’t be perpendicular to the whole xy plane. And \vert c\vert has to be {1} in order to get the right size. But will it be +e_3 or -e_3? The difference is entirely in the orientation.

Okay, so let’s pick an orientation on V, which gives us a particular top-degree tensor \omega so that \mathrm{vol}(\omega)=1. Now, given some \eta\in A^k(V), we define the Hodge dual *\eta\in A^{n-k}(V) to be the unique antisymmetric tensor of degree n-k satisfying

\displaystyle\zeta\wedge*\eta=\langle\zeta,\eta\rangle\omega

for all \zeta\in A^k(V). Notice here that if \eta and \zeta describe parallelepipeds, and any side of \zeta is perpendicular to all the sides of \eta, then the projection of \zeta onto the subspace spanned by \eta will have zero volume, and thus \langle\zeta,\eta\rangle=0. This is what we expect, for then this side of \zeta must lie within the perpendicular subspace spanned by *\eta, and so the wedge \zeta\wedge*\eta should also be zero.

As a particular example, say we have an orthonormal basis \{e_i\}_{i=1}^n of V so that \omega=e_1\wedge\dots\wedge e_n. Then given a multi-index I=(i_1,\dots,i_k) the basic wedge e_I gives us the subspace spanned by the vectors \{e_{i_1},\dots,e_{i_k}\}. The orthogonal complement is clearly spanned by the remaining basis vectors \{e_{j_1},\dots,e_{j_{n-k}}\}, and so *e_I=\pm e_J, with the sign depending on whether the list (i_1,\dots,i_k,j_1,\dots,j_{n-k}) is an even or an odd permutation of (1,\dots,n).

To be even more explicit, let’s work these out for the cases of dimensions three and four. First off, we have a basis \{e_1,e_2,e_3\}. We work out all the duals of basic wedges as follows:

\displaystyle\begin{aligned}*1&=e_1\wedge e_2\wedge e_3\\ *e_1&=e_2\wedge e_3\\ *e_2&=-e_1\wedge e_3=e_3\wedge e_1\\ *e_3&=e_1\wedge e_2\\ *(e_1\wedge e_2)&=e_3\\ *(e_1\wedge e_3)&=-e_2\\ *(e_2\wedge e_3)&=e_1\\ *(e_1\wedge e_2\wedge e_3)&=1\end{aligned}

This reconstructs the correspondence we had last week between basic parallelograms and perpendicular basis vectors. In the four-dimensional case, the basis \{e_1,e_2,e_3,e_4\} leads to the duals

\displaystyle\begin{aligned}*1&=e_1\wedge e_2\wedge e_3\wedge e_4\\ *e_1&=e_2\wedge e_3\wedge e_4\\ *e_2&=-e_1\wedge e_3\wedge e_4\\ *e_3&=e_1\wedge e_2\wedge e_4\\ *e_4&=-e_1\wedge e_2\wedge e_3\\ *(e_1\wedge e_2)&=e_3\wedge e_4\\ *(e_1\wedge e_3)&=-e_2\wedge e_4\\ *(e_1\wedge e_4)&=e_2\wedge e_3\\ *(e_2\wedge e_3)&=e_1\wedge e_4\\ *(e_2\wedge e_4)&=-e_1\wedge e_3\\ *(e_3\wedge e_4)&=e_1\wedge e_2\\ *(e_1\wedge e_2\wedge e_3)&=e_4\\ *(e_1\wedge e_2\wedge e_4)&=-e_3\\ *(e_1\wedge e_3\wedge e_4)&=e_2\\ *(e_2\wedge e_3\wedge e_4)&=-e_1\\ *(e_1\wedge e_2\wedge e_3\wedge e_4)&=1\end{aligned}

It’s not a difficult exercise to work out the relation **\eta=(-1)^{k(n-k)}\eta for a degree k tensor in an n-dimensional space.
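
If you want the machine to do the bookkeeping, here’s a sketch that computes *e_I for basic wedges by the permutation-parity rule above. It reproduces both tables, and verifies the double-dual sign along the way:

```python
from itertools import combinations

def perm_sign(seq):
    """Sign of a permutation, given as a tuple of distinct integers."""
    sign = 1
    for i in range(len(seq)):
        for j in range(i + 1, len(seq)):
            if seq[i] > seq[j]:
                sign = -sign
    return sign

def hodge_dual(I, n):
    """Return (sign, J) so that *e_I = sign * e_J, where J lists the
    complementary indices in increasing order."""
    J = tuple(j for j in range(1, n + 1) if j not in I)
    return perm_sign(I + J), J

for n in (3, 4):
    for k in range(n + 1):
        for I in combinations(range(1, n + 1), k):
            s, J = hodge_dual(I, n)
            print(f"n={n}: *e_{I} = {'+' if s > 0 else '-'}e_{J}")
            # applying * twice recovers e_I up to (-1)^{k(n-k)}
            s2, K = hodge_dual(J, n)
            assert K == I and s * s2 == (-1) ** (k * (n - k))
```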

November 9, 2009 Posted by | Algebra, Analytic Geometry, Geometry, Linear Algebra | 6 Comments

An Example of a Parallelogram

Today I want to run through an example of how we use our new tools to read geometric information out of a parallelogram.

I’ll work within \mathbb{R}^3 with an orthonormal basis \{e_1, e_2, e_3\} and an identified origin O to give us a system of coordinates. That is, given the point P, we set up a vector \overrightarrow{OP} pointing from O to P (which we can do in a Euclidean space). Then this vector has components in terms of the basis:

\displaystyle\overrightarrow{OP}=xe_1+ye_2+ze_3

and we’ll write the point P as (x,y,z).

So let’s pick four points: (0,0,0), (1,1,0), (2,1,1), and (1,0,1). These four points do, indeed, give the vertices of a parallelogram, since both displacements from (0,0,0) to (1,1,0) and from (1,0,1) to (2,1,1) are e_1+e_2, and similarly the displacements from (0,0,0) to (1,0,1) and from (1,1,0) to (2,1,1) are both e_1+e_3. Alternatively, all four points lie within the plane described by x=y+z, and the region in this plane contained between the vertices consists of points P so that

\displaystyle\overrightarrow{OP}=u(e_1+e_2)+v(e_1+e_3)

for some u and v both in the interval [0,1]. So this is a parallelogram contained between e_1+e_2 and e_1+e_3. Incidentally, note that the fact that all these points lie within a plane means that any displacement vector between two of them is in the kernel of some linear transformation. In this case, it’s the linear functional \langle e_1-e_2-e_3,\underline{\hphantom{X}}\rangle, and the vector e_1-e_2-e_3 is perpendicular to any displacement in this plane, which will come in handy later.

Now in a more familiar approach, we might say that the area of this parallelogram is its base times its height. Let’s work that out to check our answer against later. For the base, we take the length of one vector, say e_1+e_2. We use the inner product to calculate its length as \sqrt{2}. For the height we can’t just take the length of the other vector. Some basic trigonometry shows that we need the length of the other vector (which is again \sqrt{2}) times the sine of the angle between the two vectors. To calculate this angle we again use the inner product to find that its cosine is \frac{1}{2}, and so its sine is \frac{\sqrt{3}}{2}. Multiplying these all together we find a height of \sqrt{\frac{3}{2}}, and thus an area of \sqrt{3}.

On the other hand, let’s use our new tools. We represent the parallelogram as the wedge (e_1+e_2)\wedge(e_1+e_3) — incidentally choosing an orientation of the parallelogram and the entire plane containing it — and calculate its length using the inner product on the exterior algebra:

\displaystyle\begin{aligned}\mathrm{vol}\left((e_1+e_2)\wedge(e_1+e_3)\right)^2&=2!\langle(e_1+e_2)\wedge(e_1+e_3),(e_1+e_2)\wedge(e_1+e_3)\rangle\\&=2!\frac{1}{2!}\det\begin{pmatrix}\langle e_1+e_2,e_1+e_2\rangle&\langle e_1+e_2,e_1+e_3\rangle\\\langle e_1+e_3,e_1+e_2\rangle&\langle e_1+e_3,e_1+e_3\rangle\end{pmatrix}\\&=\det\begin{pmatrix}2&1\\1&2\end{pmatrix}\\&=\left(2\cdot2-1\cdot1\right)=3\end{aligned}

Alternately, we could calculate it by expanding in terms of basic wedges. That is, we can write

\displaystyle\begin{aligned}(e_1+e_2)\wedge(e_1+e_3)&=e_1\wedge e_1+e_1\wedge e_3+e_2\wedge e_1+e_2\wedge e_3\\&=e_2\wedge e_3-e_3\wedge e_1-e_1\wedge e_2\end{aligned}

This tells us that if we take our parallelogram and project it onto the yz plane (which has an orthonormal basis \{e_2,e_3\}) we get an area of {1}. Similarly, projecting our parallelogram onto the xy plane (with orthonormal basis \{e_1,e_2\}) we get an area of -1. That is, the area is {1} and the orientation of the projected parallelogram disagrees with that of the plane. Projecting onto the zx plane (with orthonormal basis \{e_3,e_1\}) likewise gives an area of -1. Anyhow, now the squared area of the parallelogram is the sum of the squares of these projected areas: 1^2+(-1)^2+(-1)^2=3.
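
All of these computations are easy to automate. Here’s a numerical sketch of this example, checking that the Gram determinant and the sum of squared projected areas both give a squared area of 3:

```python
import numpy as np

a = np.array([1.0, 1.0, 0.0])   # e1 + e2
b = np.array([1.0, 0.0, 1.0])   # e1 + e3

# Squared area as the Gram determinant of the two sides
gram = np.array([[a @ a, a @ b],
                 [b @ a, b @ b]])
print(np.linalg.det(gram))       # 3.0

# Components of a ^ b on (e2^e3, e3^e1, e1^e2): the projected areas
wedge = np.array([a[1]*b[2] - a[2]*b[1],
                  a[2]*b[0] - a[0]*b[2],
                  a[0]*b[1] - a[1]*b[0]])
print(wedge, wedge @ wedge)      # [ 1. -1. -1.], 3.0
```

Notice that the components [1, -1, -1] are exactly those of the perpendicular vector e_1-e_2-e_3, which is the observation the next paragraph takes up.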

Notice, now, the similarity between this expression e_2\wedge e_3-e_3\wedge e_1-e_1\wedge e_2 and the perpendicular vector we found before: e_1-e_2-e_3. Each one is the sum of three terms with the same choices of signs. The terms themselves seem to have something to do with each other as well; the wedge e_2\wedge e_3 describes an area in the yz plane, while e_1 describes a length in the perpendicular x-axis. Similarly, e_1\wedge e_2 describes an area in the xy plane, while e_3 describes a length in the perpendicular z-axis. And, magically, the sum of these three perpendicular vectors to these three parallelograms gives the perpendicular vector to their sum!

There is, indeed, a linear correspondence between parallelograms and vectors that extends this idea, which we will explore tomorrow. The seemingly-odd choice of e_3\wedge e_1 to correspond to e_2, though, should be a tip-off that this correspondence is closely bound up with the notion of orientation.

November 5, 2009 Posted by | Analytic Geometry, Geometry | 6 Comments

Parallelepipeds and Volumes III

So, why bother with this orientation stuff, anyway? We’ve got an inner product on spaces of antisymmetric tensors, and that should give us a concept of length. Why can’t we just calculate the size of a parallelepiped by sticking it into this bilinear form twice?

Well, let’s see what happens. Given a k-dimensional parallelepiped with sides v_1 through v_k, we represent the parallelepiped by the wedge \omega=v_1\wedge\dots\wedge v_k. Then we might try defining the volume by using the renormalized inner product

\displaystyle\mathrm{vol}(\omega)^2=k!\langle\omega,\omega\rangle

Let’s expand one copy of the wedge \omega out in terms of our basis of wedges of basis vectors

\displaystyle k!\langle\omega,\omega\rangle=k!\langle\omega,\omega^Ie_I\rangle=k!\langle\omega,e_I\rangle\omega^I

where the multi-index I runs over all increasing k-tuples of indices 1\leq i_1<\dots<i_k\leq n. But we already know that \omega^I=k!\langle\omega,e_I\rangle, and so this squared volume is the sum of the squares of these components, just like we’re familiar with. Then we can define the k-volume of the parallelepiped as the square root of this sum.

Let’s look specifically at what happens for top-dimensional parallelepipeds, where k=n. Then we only have one possible multi-index I=(1,\dots,n), with coefficient

\displaystyle\omega^{1\dots n}=n!\langle e_1\wedge\dots\wedge e_n,v_1\wedge\dots\wedge v_n\rangle=\det\left(v_j^i\right)

and so our formula reads

\displaystyle\mathrm{vol}(\omega)=\sqrt{\left(\det\left(v_j^i\right)\right)^2}=\left\lvert\det\left(v_j^i\right)\right\rvert

So we get the magnitude of the volume without having to worry about choosing an orientation. Why even bother?

Because we already do care about orientation. Let’s go all the way back to one-dimensional parallelepipeds, which are just described by vectors. A vector doesn’t just describe a certain length, it describes a length along a certain line in space. And it doesn’t just describe a length along that line, it describes a length in a certain direction along that line. A vector picks out three things:

  • A one-dimensional subspace L of the ambient space V.
  • An orientation of the subspace L.
  • A volume (length) of this oriented subspace.

And just like vectors, nondegenerate k-dimensional parallelepipeds pick out three things:

  • A k-dimensional subspace L of the ambient space V.
  • An orientation of the subspace L.
  • A k-dimensional volume of this oriented subspace.

The difference is that when we get up to the top dimension the space itself can have its own orientation, which may or may not agree with the orientation induced by the parallelepiped. We don’t always care about this disagreement, and we can just take the absolute value to get rid of a sign if we don’t care, but it might come in handy.

November 4, 2009 Posted by | Analytic Geometry, Geometry | 5 Comments

Parallelepipeds and Volumes II

Yesterday we established that the k-dimensional volume of a parallelepiped with k sides should be an alternating multilinear functional of those k sides. But now we want to investigate which one.

The universal property of spaces of antisymmetric tensors says that any such functional corresponds to a unique linear functional V_k:A^k\left(\mathbb{R}^n\right)\rightarrow\mathbb{R}. That is, we take the parallelepiped with sides v_1 through v_k and represent it by the antisymmetric tensor v_1\wedge\dots\wedge v_k. Notice, in particular, that if the parallelepiped is degenerate then this tensor is {0}, as we hoped. Then volume is some linear functional that takes in such an antisymmetric tensor and spits out a real number. But which linear functional?

I’ll start by answering this question for n-dimensional parallelepipeds in n-dimensional space. Such a parallelepiped is represented by an antisymmetric tensor with the n sides as its tensorands. But we’ve calculated the dimension of the space of such tensors: \dim\left(A^n\left(\mathbb{R}^n\right)\right)=1. That is, once we represent these parallelepipeds by antisymmetric tensors there’s only one parameter left to distinguish them: their volume. So if we specify the volume of one parallelepiped linearity will take care of all the others.

There’s one parallelepiped whose volume we know already. The unit n-cube must have unit volume. So, to this end, pick an orthonormal basis \left\{e_i\right\}_{i=1}^n. A parallelepiped with these sides corresponds to the antisymmetric tensor e_1\wedge\dots\wedge e_n, and the volume functional must send this to {1}. But be careful! The volume doesn’t depend just on the choice of basis, but on the order of the basis elements. Swap two of the basis elements and we should swap the sign of the volume. So we’ve got two different choices of volume functional here, which differ exactly by a sign. We call these two choices “orientations” on our vector space.

This is actually not as esoteric as it may seem. Almost all introductions to vectors — from multivariable calculus to vector-based physics — talk about “left-handed” and “right-handed” coordinate systems. These differ by a reflection, which would change the signs of all parallelepipeds. So we must choose one or the other, and choose which unit cube will have volume {1} and which will have volume -1. The isomorphism from \Lambda(V) to \Lambda(V)^* then gives us a “volume form” \mathrm{vol}\left(\underline{\hphantom{X}}\right)=n!\langle e_1\wedge\dots\wedge e_n,\underline{\hphantom{X}}\rangle, which will give us the volume of a parallelepiped represented by a given top-degree wedge.

Once we’ve made that choice, what about general parallelepipeds? If we have sides \left\{v_i\right\}_{i=1}^n — written in components as v_i^je_j — we represent the parallelepiped by the wedge v_1\wedge\dots\wedge v_n. This is the image of our unit cube under the transformation sending e_i to v_i, and so we find

\displaystyle\begin{aligned}\mathrm{vol}\left(v_1\wedge\dots\wedge v_n\right)&=n!\langle e_1\wedge\dots\wedge e_n,v_1\wedge\dots\wedge v_n\rangle\\&=\det\left(\langle e_i,v_j\rangle\right)\\&=\det\left(v_j^i\right)\end{aligned}

The volume of the parallelepiped is the determinant of this transformation.
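
In coordinates, then, computing a signed volume is nothing but a determinant, and swapping two sides flips the sign, exactly as the orientation discussion predicts. A minimal sketch, with three sides chosen arbitrarily for illustration:

```python
import numpy as np

v1 = np.array([1.0, 1.0, 0.0])
v2 = np.array([0.0, 2.0, 1.0])
v3 = np.array([1.0, 0.0, 3.0])

# Signed volume of the parallelepiped: det of the matrix of components
print(np.linalg.det(np.array([v1, v2, v3])))  #  7.0
# Swapping two sides reverses the orientation, and so the sign
print(np.linalg.det(np.array([v2, v1, v3])))  # -7.0
```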

Incidentally, this gives a geometric meaning to the special orthogonal group \mathrm{SO}(n,\mathbb{R}). Orthogonal transformations send orthonormal bases to other orthonormal bases, which will send unit cubes to other unit cubes. But the determinant of an orthogonal transformation may be either +1 or -1. Transformations of the first kind make up the special orthogonal group, while transformations of the second kind send “positive” unit cubes to “negative” ones, and vice-versa. That is, they involve some sort of reflection, swapping the choice of orientation we made above. Special orthogonal transformations are those which preserve not only lengths and angles, but the orientation of the space. More generally, there is a homomorphism \mathrm{GL}(n,\mathbb{R})\rightarrow\mathbb{Z}_2 sending a transformation to the sign of its determinant. Transformations with positive determinant are said to be “orientation-preserving”, while those with negative determinant are said to be “orientation-reversing”.

November 3, 2009 Posted by | Analytic Geometry, Geometry | 5 Comments

Parallelepipeds and Volumes I

And we’re back with more of what Mr. Martinez of Harvard’s Medical School assures me is onanism of the highest caliber. I’m sure he, too, blames me for not curing cancer.

Coming up in our study of calculus in higher dimensions we’ll need to understand parallelepipeds, and in particular their volumes. First of all, what is a parallelepiped? Or, more specifically, what is a k-dimensional parallelepiped in n-dimensional space? It’s a collection of points in space that we can describe as follows. Take a point p and k vectors \left\{v_i\right\}_{i=1}^k in \mathbb{R}^n. The parallelepiped is the collection of points reachable by moving from p by some fraction of each of the vectors v_i. That is, we pick k values t^i, each in the interval \left[0,1\right], and use them to specify the point p+t^iv_i. The collection of all such points is the parallelepiped with corner p and sides v_i.

One possible objection is that these sides may not be linearly independent. If the sides are linearly independent, then they span a k-dimensional subspace of the ambient space, justifying our calling it k-dimensional. But if they’re not, then the subspace they span has a lower dimension. We’ll deal with this by calling such a parallelepiped “degenerate”, and the nice ones with linearly independent sides “nondegenerate”. Trust me, things will be more elegant in the long run if we just deal with them both on the same footing.

Now we want to consider the volume of a parallelepiped. The first observation is that the volume doesn’t depend on the corner point p. Indeed, we should be able to slide the corner around to any point in space as long as we bring the same displacement vectors along with us. So the volume should be a function only of the sides.

The second observation is that as a function of the sides, the volume function should commute with scalar multiplication in each variable separately. That is, if we multiply v_i by a non-negative factor of \lambda, then we multiply the whole volume of the parallelepiped by \lambda as well. But what about negative scaling factors? What if we reflect the side (and thus the whole parallelepiped) to point the other way? One answer might be that we get the same volume, but it’s going to be easier (and again more elegant) if we say that the new parallelepiped has the negative of the original one’s volume.

Negative volume? What could that mean? Well, we’re going to move away from the usual notion of volume just a little. Instead, we’re going to think of “signed” volume, which includes the possibility of being positive or negative. By itself, this sign will be less than clear at first, but we’ll get a better understanding as we go. As a first step we’ll say that two parallelepipeds related by a reflection have opposite signs. This won’t only cover the above behavior under scaling sides, but also what happens when we exchange the order of two sides. For example, the parallelogram with sides v_1=a and v_2=b and the parallelogram with sides v_1=b and v_2=a have the same areas with opposite signs. Similarly, swapping the order of two sides in a given parallelepiped will flip its sign.

The third observation is that the volume function should be additive in each variable. One way to see this is that the k-dimensional volume of the parallelepiped with sides v_1 through v_k should be the product of the (k-1)-dimensional volume of the parallelepiped with sides v_1 through v_{k-1} and the length of the component of v_k perpendicular to all the other sides, and this length is a linear function of v_k. Since there’s nothing special here about the last side, we could repeat the argument with the other sides.

The other way to see this fact is to consider the following diagram, helpfully supplied by Kate from over at f(t):

[Figure: Parallelograms]

The side of one parallelogram is the (vector) sum of the sides of the other two, and we can see that the area of the one parallelogram is the sum of the areas of the other two. This justifies the assertion that for parallelograms in the plane, the area is additive as a function of one side (and, similarly, of the other). Similar diagrams should be apparent to justify the assertion for higher-dimensional parallelepipeds in higher-dimensional spaces.
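
As a numeric stand-in for the diagram, here’s a sketch in the plane — where, anticipating the formula we’ll eventually derive, the signed area of a parallelogram is the 2\times2 determinant of its sides — checking additivity in one side, along with the sign flip under swapping sides:

```python
import numpy as np

def area(u, v):
    """Signed area of the plane parallelogram with sides u and v."""
    return np.linalg.det(np.array([u, v]))

u = np.array([2.0, 1.0])
v = np.array([0.5, 1.5])
w = np.array([-1.0, 2.0])

print(area(u, v + w), area(u, v) + area(u, w))  # both 7.5: additivity
print(area(u, v), area(v, u))                   # 2.5 and -2.5: sign flip
```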

Putting all these together, we find that the k-dimensional volume of a parallelepiped with k sides is an alternating multilinear functional, with the k sides as variables, and so it lives somewhere in the exterior algebra \Lambda(V^*). We’ll have to work out which particular functional gives us a good notion of volume as we continue.

November 2, 2009 Posted by | Analytic Geometry, Geometry | 5 Comments