The Unapologetic Mathematician

Mathematics for the interested outsider

The Determinant of Unitary and Orthogonal Transformations

Okay, we’ve got groups of unitary and orthogonal transformations (and the latter we can generalize to groups of matrices over arbitrary fields). These are defined by certain relations involving transformations and their adjoints (transposes of matrices over more general fields). So now that we’ve got information about how the determinant and the adjoint interact, we can see what happens when we restrict the determinant homomorphism to these subgroups of \mathrm{GL}(V).

First the orthogonal groups. This covers orthogonality with respect to general (nondegenerate) forms on an inner product space \mathrm{O}(V,B), the special case of orthogonality with respect to the underlying inner product \mathrm{O}(V), and the orthogonal matrix group over arbitrary fields \mathrm{O}(n,\mathbb{F})\subseteq\mathrm{GL}(n,\mathbb{F}). The general form describing all of these cases is

\displaystyle O^*BO=B

where O^* is the adjoint or the matrix transpose, as appropriate. Now we can take the determinant of both sides of this equation, using the fact that the determinant is a homomorphism. We find

\displaystyle\det(O^*)\det(B)\det(O)=\det(B)

Next we can use the fact that \det(O^*)=\det(O). We can also divide out by \det(B), since we know that B is invertible, and so its determinant is nonzero. We’re left with the observation that

\displaystyle\det(O)^2=1

And thus the determinant of an orthogonal transformation O must be a square root of 1 in our field. For both real and complex matrices, this says \det(O)=\pm1, landing in the “sign group” (which is isomorphic to \mathbb{Z}_2).
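
As a quick sanity check, here is a minimal sketch in Python for the familiar case of 2\times2 real matrices. The helper names (det2, rotation, reflection) are made up for this illustration, and matrices are assumed to be plain nested lists.

```python
import math

def det2(m):
    """Determinant of a 2x2 matrix stored as nested lists."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def rotation(theta):
    """Rotation of the plane by the angle theta."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]

def reflection(theta):
    """Reflection across the line through the origin at angle theta / 2."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, s], [s, -c]]

rot_det = det2(rotation(0.7))    # cos^2 + sin^2
ref_det = det2(reflection(0.7))  # -(cos^2 + sin^2)
print(rot_det, ref_det)
```

Up to floating-point rounding, the rotation should land on +1 and the reflection on -1, the two elements of the sign group.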

What about unitary transformations? Here we just look at the unitarity condition

\displaystyle U^*U=I_V

We take determinants

\displaystyle\det(U^*)\det(U)=\det(I_V)=1

and use the fact that the determinant of the adjoint is the conjugate of the determinant

\displaystyle\overline{\det(U)}\det(U)=\lvert\det(U)\rvert^2=1

So the determinant of a unitary transformation U must be a unit complex number in the circle group (which, incidentally, contains the sign group above).
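
To see that a unitary determinant really can be something other than \pm1, here is a small sketch with a hand-picked 2\times2 unitary matrix. The matrix U and the helpers conj_transpose and matmul are assumptions chosen for the example, not anything canonical.

```python
import cmath
import math

def conj_transpose(m):
    """Conjugate transpose of a square complex matrix (nested lists)."""
    return [[z.conjugate() for z in row] for row in zip(*m)]

def matmul(a, b):
    """Product of two square matrices of the same size."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

# A unitary matrix: a phase times (1/sqrt 2) [[1, i], [i, 1]].
phase = cmath.exp(0.4j)
r = 1 / math.sqrt(2)
U = [[phase * r, phase * r * 1j],
     [phase * r * 1j, phase * r]]

UstarU = matmul(conj_transpose(U), U)            # should be the identity
det_U = U[0][0] * U[1][1] - U[0][1] * U[1][0]    # equals phase squared
print(det_U, abs(det_U))
```

Here the determinant works out to the square of the phase, which sits on the unit circle but is neither 1 nor -1.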

It seems, then, that when we take determinants the analogy we’ve been pushing starts to come out. Unitary (and orthogonal) transformations are like complex numbers on the unit circle, and their determinants actually are complex numbers on the unit circle.

July 31, 2009 Posted by John Armstrong | Algebra, Linear Algebra | | 2 Comments

The Determinant of the Adjoint

It will be useful to know what happens to the determinant of a transformation when we pass to its adjoint. Since the determinant doesn’t depend on any particular choice of basis, we can just pick one arbitrarily and do our computations on matrices. And as we saw yesterday, adjoints are rather simple in terms of matrices: over real inner product spaces we take the transpose of our matrix, while over complex inner product spaces we take the conjugate transpose. So it will be convenient for us to just think in terms of matrices over any field for the moment, and see what happens to the determinant of a matrix when we take its transpose.

Okay, so let’s take a linear transformation T:V\rightarrow V and pick a basis \left\{\lvert i\rangle\right\}_{i=1}^n to get the matrix

\displaystyle t_i^j=\langle j\rvert T\lvert i\rangle

and the formula for the determinant reads

\displaystyle\det(T)=\sum\limits_{\pi\in S_n}\mathrm{sgn}(\pi)\prod\limits_{k=1}^nt_k^{\pi(k)}=\sum\limits_{\pi\in S_n}\mathrm{sgn}(\pi)\prod\limits_{k=1}^n\langle\pi(k)\rvert T\lvert k\rangle

and the determinant of the adjoint is

\displaystyle\det(T^*)=\sum\limits_{\pi\in S_n}\mathrm{sgn}(\pi)\prod\limits_{k=1}^n\langle\pi(k)\rvert T^*\lvert k\rangle=\sum\limits_{\pi\in S_n}\mathrm{sgn}(\pi)\prod\limits_{k=1}^n\langle k\rvert T\lvert\pi(k)\rangle

where we’ve now taken the transpose.

Now for the term corresponding to the permutation \pi we can rearrange the multiplication. Instead of multiplying from 1 to n, we multiply from \pi^{-1}(1) to \pi^{-1}(n). All we’re doing is rearranging factors, and our field multiplication is commutative, so this doesn’t change the result at all:

\displaystyle\det(T^*)=\sum\limits_{\pi\in S_n}\mathrm{sgn}(\pi)\prod\limits_{k=1}^n\langle\pi^{-1}(k)\rvert T\lvert\pi(\pi^{-1}(k))\rangle=\sum\limits_{\pi\in S_n}\mathrm{sgn}(\pi)\prod\limits_{k=1}^n\langle\pi^{-1}(k)\rvert T\lvert k\rangle

But as \pi ranges over the symmetric group S_n, so does its inverse \pi^{-1}. So we relabel to find

\displaystyle\det(T^*)=\sum\limits_{\pi\in S_n}\mathrm{sgn}(\pi)\prod\limits_{k=1}^n\langle\pi(k)\rvert T\lvert k\rangle

and we’re back to our formula for the determinant of T itself! That is, when we take the transpose of a matrix we don’t change its determinant at all. And since the transpose of a real matrix corresponds to the adjoint of the transformation on a real inner product space, taking the adjoint doesn’t change the determinant of the transformation.

What about over complex inner product spaces, where adjoints correspond to conjugate transposes? Well, all we have to do is take the complex conjugate of each term in our calculation when we take the transpose of our matrix. Then carrying all these through to the end as we juggle indices around we’re left with

\displaystyle\det(T^*)=\sum\limits_{\pi\in S_n}\mathrm{sgn}(\pi)\prod\limits_{k=1}^n\overline{\langle\pi(k)\rvert T\lvert k\rangle}=\overline{\sum\limits_{\pi\in S_n}\mathrm{sgn}(\pi)\prod\limits_{k=1}^n\langle\pi(k)\rvert T\lvert k\rangle}=\overline{\det(T)}

The determinant of the adjoint is, in this case, the complex conjugate of the determinant.
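
The permutation-sum formula can be transcribed almost literally into code, which gives a way to test both conclusions on concrete matrices. This is a sketch under my own assumptions: the function names and the sample matrices A and C are invented for illustration, and the sum over all of S_n is only practical for small n.

```python
from itertools import permutations

def sign(perm):
    """Sign of a permutation (tuple of 0-based images), by counting inversions."""
    n = len(perm)
    inversions = sum(1 for i in range(n) for j in range(i + 1, n)
                     if perm[i] > perm[j])
    return -1 if inversions % 2 else 1

def leibniz_det(m):
    """Sum over permutations of sgn(pi) times the product of m[pi(k)][k]."""
    n = len(m)
    total = 0
    for perm in permutations(range(n)):
        term = sign(perm)
        for k in range(n):
            term *= m[perm[k]][k]
        total += term
    return total

def transpose(m):
    return [list(row) for row in zip(*m)]

def conj_transpose(m):
    return [[z.conjugate() for z in row] for row in zip(*m)]

A = [[2, -1, 0], [3, 4, 5], [1, 0, -2]]
C = [[1 + 2j, 3 + 0j, 0j], [1j, 2 - 1j, 1 + 0j], [4 + 0j, 0j, 1j]]

det_A = leibniz_det(A)
det_A_transpose = leibniz_det(transpose(A))
det_C = leibniz_det(C)
det_C_adjoint = leibniz_det(conj_transpose(C))
print(det_A, det_A_transpose)
print(det_C, det_C_adjoint)
```

The first pair should agree exactly, and the second pair should be complex conjugates of each other.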

July 30, 2009 Posted by John Armstrong | Algebra, Linear Algebra | | 7 Comments

Unitary and Orthogonal Matrices

Let’s see what happens when we take a unitary or orthogonal transformation and turn it into a matrix by picking a basis for our vector space.

First a unitary transformation on a complex vector space. We pick a basis \left\{\lvert i\rangle\right\}_{i=1}^n and set up the matrix

\displaystyle u_{ij}=\langle i\rvert U\lvert j\rangle

We can also set up the matrix for the adjoint

\displaystyle {u^*}_{ij}=\langle i\rvert U^*\lvert j\rangle=\overline{\langle j\rvert U\lvert i\rangle}=\overline{u_{ji}}

That is, the adjoint matrix is the conjugate transpose. This isn’t really anything new, since we essentially saw it when we considered Hermitian matrices.

But now we want to apply the unitarity condition that U^*U=I_V. It will make our lives easier here to just write out the sum over the basis in the middle and find

\displaystyle\delta_{ij}=\sum_{k=1}^n{u^*}_{ik}u_{kj}=\sum_{k=1}^n\overline{u_{ki}}u_{kj}

Now, this isn’t particularly useful on its face. I mean, what does that mess even mean? But if nothing else it tells us that we can describe unitary matrices in terms of (a lot of) equations involving only complex numbers. We can then pick out all the n\times n complex matrices which represent unitary transformations. They form the “unitary group” \mathrm{U}(n).

What about orthogonal matrices? Again, we pick a basis \left\{\lvert i\rangle\right\}_{i=1}^n to get a matrix

\displaystyle o_{ij}=\langle i\rvert O\lvert j\rangle

and also a matrix for the adjoint

\displaystyle {o^*}_{ij}=\langle i\rvert O^*\lvert j\rangle=\langle j\rvert O\lvert i\rangle=o_{ji}

Here the adjoint matrix is just the transpose, not the conjugate transpose, since we’re working over a real inner product space. Then we can write down the orthogonality condition

\displaystyle\delta_{ij}=\sum_{k=1}^no_{ki}o_{kj}

Again, this doesn’t really seem to tell us much, but we can use these equations to cut out the matrices which represent orthogonal transformations from all n\times n real matrices. They form the “orthogonal group” \mathrm{O}(n).

But there’s something else we should notice here. The equations for the unitary group involved complex conjugation, so we need some structure like that to talk sensibly about unitarity. However, the orthogonality equations only involve basic field operations like addition and multiplication, and so these equations make sense over any field whatsoever. That is, given a field \mathbb{F} we can consider the collection of all n\times n matrices with entries in \mathbb{F}, and then impose the above orthogonality condition to cut out the matrices in the orthogonal group \mathrm{O}(n,\mathbb{F}), while the first orthogonal group is \mathrm{O}(n)=\mathrm{O}(n,\mathbb{R}).
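
Since the orthogonality equations only use addition and multiplication, we can check them over a finite field just as easily. Here is a minimal sketch over the field with seven elements; the prime 7, the two sample matrices, and the helper names are all my own choices for illustration.

```python
def is_orthogonal_mod(m, p):
    """Check sum_k m[k][i] * m[k][j] == delta_ij, with arithmetic mod p."""
    n = len(m)
    return all(
        sum(m[k][i] * m[k][j] for k in range(n)) % p == (1 if i == j else 0)
        for i in range(n) for j in range(n)
    )

def det2_mod(m, p):
    """Determinant of a 2x2 matrix, reduced mod p."""
    return (m[0][0] * m[1][1] - m[0][1] * m[1][0]) % p

P = 7
rotation_like = [[2, 5], [2, 2]]    # 2^2 + 2^2 = 8 = 1 mod 7
reflection_like = [[2, 2], [2, 5]]

for m in (rotation_like, reflection_like):
    print(is_orthogonal_mod(m, P), det2_mod(m, P))
```

Both matrices satisfy the orthogonality condition mod 7. Their determinants come out as 1 and 6, and 6 is -1 in this field.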

One useful orthogonal group is \mathrm{O}(n,\mathbb{C}). This is not the same as the unitary group \mathrm{U}(n), though it can be confusing to keep the two separate at first. The unitary group consists of matrices whose inverses are their conjugate transposes, instead of just their transposes for the complex orthogonal group. The unitary group preserves a sesquilinear inner product, which has a clear geometric interpretation we’ve been talking about. The orthogonal group preserves a bilinear form, which doesn’t have such a clear visual interpretation. They are related in a way, but we’ll be coming back to that subject much later on.
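
To help keep the two groups apart, here is a sketch with a single matrix that lies in \mathrm{O}(2,\mathbb{C}) but not in \mathrm{U}(2). The matrix M, built from hyperbolic functions, is an example of my own choosing, and the helpers are hypothetical names.

```python
import math

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def transpose(m):
    return [list(row) for row in zip(*m)]

def conj_transpose(m):
    return [[z.conjugate() for z in row] for row in zip(*m)]

def is_identity(m, tol=1e-12):
    n = len(m)
    return all(abs(m[i][j] - (1 if i == j else 0)) < tol
               for i in range(n) for j in range(n))

t = 0.5
c, s = math.cosh(t), math.sinh(t)
M = [[complex(c, 0), complex(0, s)],
     [complex(0, -s), complex(c, 0)]]

in_complex_orthogonal = is_identity(matmul(transpose(M), M))   # uses cosh^2 - sinh^2 = 1
in_unitary = is_identity(matmul(conj_transpose(M), M))         # diagonal is cosh^2 + sinh^2
print(in_complex_orthogonal, in_unitary)
```

The plain transpose inverts M, but the conjugate transpose does not, so this matrix preserves the bilinear form without preserving the sesquilinear inner product.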

July 29, 2009 Posted by John Armstrong | Algebra, Linear Algebra | | 4 Comments

Unitary Transformations

Unitary transformations are like orthogonal transformations, except we’re working with a complex inner product space. We’ll focus on just the transformations that are unitary with respect to the inner product itself. That is, we ask that

\displaystyle\langle w\vert v\rangle=\langle U(w)\vert U(v)\rangle=\langle w\rvert U^*U\lvert v\rangle

and so we must have U^*=U^{-1} just as we wrote for orthogonal transformations, but we have to use the adjoint that’s appropriate to our complex inner product. In this way, unitary and orthogonal transformations are related in a way similar to that in which Hermitian and symmetric forms are related.

Now, we’ve got this running analogy between endomorphisms on an inner product space and complex numbers. Taking the adjoint is like complex conjugation, so Hermitian transformations are like real numbers because they’re equal to their own adjoints. But here, we’re looking at transformations whose inverses are equal to their adjoints. What does this look like in terms of our analogy?

Well, we’ve noted that composing an invertible transformation with its own adjoint is a sure way to get a positive-definite transformation T^*T. This is analogous to the way that a complex number times its own conjugate is always nonnegative: \bar{z}z\geq0. In fact, we use this to interpret \bar{z}z as the squared-length of the complex number z. So what’s the analogue of the unitarity condition U^*U=I_V? That’s like asking for \lvert z\rvert^2=\bar{z}z=1, and so z must be a unit-length complex number. Unitary transformations are like the complex numbers on the unit circle.

July 28, 2009 Posted by John Armstrong | Algebra, Linear Algebra | | 8 Comments

Orthogonal transformations

Given a form on a vector space V represented by the transformation B and a linear map T:V\rightarrow V, we’ve seen how to transform B by the action of T. That is, the space of all bilinear forms is a vector space which carries a representation of \mathrm{GL}(V). But given a particular form B, what is the stabilizer of B? That is, what transformations in \mathrm{GL}(V) send B back to itself?

Before we answer this, let’s look at it in a slightly different way. Given a form B we have a way of pairing vectors in V to get scalars. On the other hand, if we have a transformation T we could use it on the vectors before pairing them. We’re looking for those transformations so that for every pair of vectors the result of the pairing by B is the same before and after applying T.

So let’s look at the action we described last time: the form B is sent to T^*BT. So we’re looking for all T so that

\displaystyle T^*BT=B

We say that such a transformation is B-orthogonal, and the subgroup of all such transformations is the “orthogonal group” \mathrm{O}(V,B)\subseteq \mathrm{GL}(V). Sometimes, since the vector space V is sort of implicit in the form B, we abbreviate the group to \mathrm{O}(B).

Now there’s one particular orthogonal group that’s particularly useful. If we’ve got an inner-product space V (the setup for having our bra-ket notation) then the inner product itself is a form, and it’s described by the identity transformation. That is, the orthogonality condition in this case is that

\displaystyle T^*T=I_V

A transformation is orthogonal if its adjoint is the same as its inverse. This is the version of orthogonality that we’re most familiar with. Commonly, when we say that a transformation is “orthogonal” with no qualification about what form we’re using, we just mean that this condition holds.

Let’s take a look at this last condition geometrically. We use the inner product to define a notion of (squared-)length \langle v\vert v\rangle and a notion of (the cosine of) angle \langle w\vert v\rangle. So let’s transform the space by T and see what happens to our inner product, and thus to lengths and angles.

\displaystyle\langle T(w)\vert T(v)\rangle=\langle w\rvert T^*T\lvert v\rangle

First off, note that no matter what invertible T we use, the transformation in the middle is self-adjoint and positive-definite, and so the new form is symmetric and positive-definite, and thus defines another inner product. But when is it the same inner product? When T^*T=I_V, of course! For then we have

\displaystyle\langle T(w)\vert T(v)\rangle=\langle w\rvert T^*T\lvert v\rangle=\langle w\rvert I_V\lvert v\rangle=\langle w\vert v\rangle

So orthogonal transformations are exactly those which preserve the notions of length and angle defined by the inner product. Geometrically, they correspond to rotations and reflections that change orientations, but leave lengths of vectors the same, and leave the angle between any pair of vectors the same.
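
Here is a minimal numerical sketch of that statement in the plane, comparing a rotation with a shear. The particular vectors, the angle, and the helper names are assumptions made for the example.

```python
import math

def apply(m, v):
    """Apply the matrix m (nested lists) to the vector v."""
    return [sum(m[i][k] * v[k] for k in range(len(v))) for i in range(len(m))]

def dot(u, v):
    """The standard inner product on real coordinate vectors."""
    return sum(a * b for a, b in zip(u, v))

theta = 1.1
R = [[math.cos(theta), -math.sin(theta)],
     [math.sin(theta), math.cos(theta)]]   # orthogonal
shear = [[1.0, 1.0],
         [0.0, 1.0]]                       # invertible, but not orthogonal

w, v = [1.0, 2.0], [3.0, -1.0]
before = dot(w, v)
after_rotation = dot(apply(R, w), apply(R, v))
after_shear = dot(apply(shear, w), apply(shear, v))
print(before, after_rotation, after_shear)
```

The rotation should reproduce the original pairing up to rounding, while the shear changes it.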

July 27, 2009 Posted by John Armstrong | Algebra, Linear Algebra | | 7 Comments

Sunday Samples 131

One of the things I really liked about Ben Folds was that his vocal range is almost exactly the same as mine, and so it was always rather easy to sing along to. The general tenor of his lyrics didn’t hurt much either.

I don’t know if this one had any inspiration from last week’s Sample. Either way, it seems interesting that boxing seems to have this cultural resonance with the idea of a washed-up career. From the 1995 debut it’s “Boxing”. Not a bad waltz, either.

July 26, 2009 Posted by John Armstrong | Sunday Samples | | No Comments Yet

Transformations of Bilinear Forms

Sorry for the delay, but at least now everything is in Maryland. Now I just need a job and then I’ll have to move back out of my parents’ house again :/

Anyway, now we want to see how linear maps between vector spaces affect forms on those spaces. We’ve seen a hint when we talked about the category of inner product spaces: if we have a bilinear form B on a space W and a linear map T:V\rightarrow W, then we can “pull back” the form along the map. That is, we take two vectors in V, apply T to both of them, and then stick them into the form on W.

When we write this out with our Dirac notation, we don’t really make much of a distinction between the vector spaces, relying on context to tell which vectors belong to which spaces, as well as which pairing we’re using. Luckily, when we write something sensible, it usually doesn’t matter how we parse things. Dirac notation is very robust! So, given vectors w_1 and w_2 in W we have the form given by

\displaystyle\langle w_1\rvert B\lvert w_2\rangle

as we’ve seen over and over. But now let’s take two vectors v_1 and v_2 in V. First, we hit each of these vectors with T. The ket vector \lvert v_2\rangle clearly becomes T\lvert v_2\rangle, while the bra vector \langle v_1\rvert becomes \langle v_1\rvert T^*. Now using the form from before we get

\displaystyle\langle v_1\rvert T^*BT\lvert v_2\rangle

This looks like a form on V that’s described by the linear transformation T^*BT. Just as a sanity check, we can verify that T sends V to W, B then sends W to itself, and the adjoint T^* sends W back to V. So this is indeed a transformation from V to itself that can be used to represent a form.

As a particular case, we might consider an automorphism S:V\rightarrow V and consider it as a change of basis. When we did this to a linear map T we got a new linear map by conjugation S^{-1}TS, and we say that these two transformations are similar. In one sense, this is “the same” linear map, but described with different “coordinates” on the vector space. But when we apply it to a bilinear form we turn B into S^*BS, and we say that these two are “congruent”. In the same sense as before, they describe “the same” form on V, but using different coordinates on the vector space.

Like similarity, congruence gives an action of \mathrm{GL}(V) on the space of bilinear forms on V. Indeed, if S and T are both automorphisms, we can act on B first by S to get S^*BS, then by T to get T^*S^*BST. But this is the same as \left(ST\right)^*B\left(ST\right), which is the action by the product ST.
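
The compatibility with composition is easy to test with small integer matrices, where everything is exact. This sketch works over the reals, so the adjoint is just the transpose; the matrices B, S, T and the helper act are invented for illustration.

```python
def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def transpose(m):
    return [list(row) for row in zip(*m)]

def act(form, s):
    """Congruence: send the matrix of a form to s^T form s."""
    return matmul(matmul(transpose(s), form), s)

B = [[1, 2], [0, 3]]
S = [[1, 1], [0, 1]]
T = [[2, 0], [1, 1]]

step_by_step = act(act(B, S), T)     # first S, then T
all_at_once = act(B, matmul(S, T))   # the product ST in one go
other_order = act(B, matmul(T, S))   # the product TS, for comparison
print(step_by_step, all_at_once, other_order)
```

Acting by S and then by T matches acting by ST, and in this example it does not match acting by TS, so the order of the product matters.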

July 24, 2009 Posted by John Armstrong | Algebra, Linear Algebra | | 7 Comments

Sunday Samples 130 (one day late)

I sort of wish I had something more momentous to say in my thousandth post. Oh well…

What is it about moving that pushes Simon & Garfunkel into my mind? Maybe something about the way Art Garfunkel got a graduate degree in mathematics, but never did anything significant with it either.

Oh well. This time it’s “The Boxer”, from 1970’s Bridge Over Troubled Water.

July 20, 2009 Posted by John Armstrong | Sunday Samples | | 2 Comments

Nondegenerate Forms II

Okay, we know what a nondegenerate form is, but what does this mean for the transformation that represents the form?

Remember that the form represented by the transformation B is nondegenerate if for every nonzero ket vector \lvert v\rangle there is some bra vector \langle w\rvert so that \langle w\rvert B\lvert v\rangle\neq0. But before we go looking for such a bra vector, the transformation B has turned the ket vector \lvert v\rangle into a new ket vector B\lvert v\rangle=\lvert B(v)\rangle. If we find that B(v)=0, then there can be no suitable vector w with which to pair it. So, at the very least, we must have B(v)\neq0 for every v\neq0. That is, the kernel of B is trivial. Since B is a transformation from the vector space V to itself, the rank-nullity theorem tells us that the image of B is all of V. That is, B must be an invertible transformation.

On the other hand, if B is invertible, then every nonzero ket vector \lvert v\rangle becomes another nonzero ket vector \lvert w\rangle=B\lvert v\rangle. Then we find that

\displaystyle\langle w\rvert B\lvert v\rangle=\langle w\vert w\rangle>0

where this last inequality holds because the bra-ket pairing is an inner product, and is thus positive-definite. Indeed, a positive-definite (or negative-definite) form must be nondegenerate. Thus it is sufficient for B to be invertible.

Incidentally, this approach gives us a good way of constructing a lot of positive-definite transformations. Given an invertible transformation B, we expand

\displaystyle\langle w\vert w\rangle=\langle v\rvert B^*B\lvert v\rangle

Since the form defined by the bra-ket pairing is positive-definite, so is the form defined by B^*B. And this is a sensible concept, since B^*B is self-adjoint. Indeed, we take its adjoint to find

\displaystyle\left(B^*B\right)^*=B^*\left(B^*\right)^*=B^*B

This extends our analogy with the complex numbers. An invertible transformation composed with its adjoint is a self-adjoint, positive-definite transformation, just as a nonzero complex number multiplied by its conjugate is a real, positive number.
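
Here is a small sketch of this construction with a complex 2\times2 matrix whose entries are Gaussian integers, so the arithmetic stays exact. The matrix B, the test vector, and the helper names are assumptions for the example.

```python
def conj_transpose(m):
    return [[z.conjugate() for z in row] for row in zip(*m)]

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def quadratic_form(g, v):
    """Compute <v| g |v>, conjugating the bra side."""
    n = len(v)
    return sum(v[i].conjugate() * g[i][j] * v[j]
               for i in range(n) for j in range(n))

B = [[1 + 1j, 2 + 0j],
     [0j, 1 - 2j]]                     # invertible: determinant is 3 - 1j
G = matmul(conj_transpose(B), B)       # the candidate positive-definite transformation

is_self_adjoint = (G == conj_transpose(G))
value = quadratic_form(G, [1 + 0j, -1 + 0j])   # squared length of B applied to (1, -1)
print(G, is_self_adjoint, value)
```

The product comes out equal to its own conjugate transpose, and pairing the sample vector with itself through it gives a positive real number.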

July 17, 2009 Posted by John Armstrong | Algebra, Linear Algebra | | 2 Comments

The Unapologetic… Book?

Yesterday some lonely voice on Twitter with the moniker @arthegall (which nickname I simply cannot parse) suggested turning this weblog into a book. The idea has come up before, but it usually has a spirit of “Turning weblogs into books is a Thing That People Do”, and so I might someday follow suit. But to look deeply at this particular weblog and say that it should be done? Clearly there are some problems here.

First of all, is there really a market for it? I flatter myself that there really is a “generally interested lay audience”, but how big is it? Unless one says something controversial — or that some crank thinks is controversial — math, physics, and computer science weblog readers tend to be a rather silent bunch. The obvious exceptions are those who have major corporate backing. Not that I disparage major corporate backing or anything, but it might reinforce my point that I don’t have it already. And even if that’s just the whims of chance, I really have no good way of gauging my current audience, let alone who might be willing to shell out for an honest-to-Gauß book. Without an audience, publishers aren’t going to be interested, and without a publisher I’m left with vanity presses. At which point, what’s really the difference between a vanity press and a weblog anyway?

Second of all.. well, just look at this mess. I can’t even keep my blogroll up-to-date. If you’re wondering why you don’t see your weblog there it may well be that I’m reading it and haven’t updated that thing in over a year. I’m worse with bibliographies. One of the nice things about the weblog format is that I don’t really have to do a lot in the way of referencing my sources, especially since nothing I’m saying is particularly new (unless I’m making it up myself). Ninety percent of what you read would get cited as “[stuff] I remember from graduate school [or earlier]”, with the rest filled in with occasional tokes from old textbooks or Wikipedia to make sure I’m getting the details and standard usage more or less correct. See, I really don’t think in terms of where ideas come from. Once they go into my head they’re churned into a sort of lumpy grey paste. Semantic leveling is great when it comes to pulling out oblique analogies (a useful tool in mathematics, actually), but it’s really horrible when it comes to giving credit where it’s due.

And yet, a book could lead to fame and fortune, and what American doesn’t want that? The tour might not get me on with my mancrush Keith, but Rachel tries to extend overtures to math/science/technology geeks from her policy-wonk side of the room. She even made mention of π Day, which I’m sure could provide a great hook for an interview given my well-established stance. And Stewart and Colbert are wide open. Yesterday Stewart’s featured interview was with a guy who wrote a bio of Henry Hudson, of all things. So.. it might not be a bad idea.

But what would I write about? Just “turn blog into book” doesn’t feel right to me. Terry Tao can just make a tarball of a year’s worth of posts, slap an ex post facto theme on it, and ship it out. He can get away with this partially for the reason he’s got a built-in weblog audience in the first place: he’s got a Fields Medal. Me, I can’t get a proper postdoctoral position to save my life, so I have to work a bit harder on coherence. There’s way too much here, with no clear goal. I have local goals, and longer-term goals, but overall this project is very unstructured. I can hold what attention I do because it’s a nice little snack of whatever I happen to be talking about today. But a book needs to have a theme and, in a sense, a goal. Tangents are acceptable, but even they should clearly tie into the overall narrative.

And then I thought of one topic I’ve covered here, which could be stripped down a bit and rearranged into a single coherent thread: the progression from natural numbers to real numbers, and beyond. Natural numbers are, or at least seem to be, simple and intuitive, but by the time we’re talking about real numbers I don’t think most people have a real sense of what these things are. At the time most people first see them (even under the capable eyes of a good schoolteacher) they probably don’t realize what huge conceptual leaps they’re making. Even their teachers may not really know. Even up through the calculus sequence, which is where even most mathematically-able people stop (going on into science or engineering), it’s never really made explicit what’s going on. The fine details that are essential for making calculus work as it does are swept under the rug. By the time many people are ready to put real thought into what a number is, they’re beyond the point where there’s going to be a class telling them that the question is as deep as it really is.

Now, I don’t think I’d be interested in going in a philosophical direction with this, trying to really answer the question in a deep way, but I’d probably have to brush up on some of that sort of thing to make appropriate mention of it. My concern would be more to explore the progression of notions of “number”, and to explore the structures we encounter along the way. I wouldn’t want to go into nearly as much depth on group theory as I’ve done here, but some mention of it in parallel with the step from natural numbers to integers would be useful, for instance. But I’d need a collaborator — or at least an editor — who can understand what I’m doing and keep me honest about my references.

But given the right support I wouldn’t be against the idea…

July 16, 2009 Posted by John Armstrong | Uncategorized | | 13 Comments