Okay, we’ve got groups of unitary and orthogonal transformations (and the latter we can generalize to groups of matrices over arbitrary fields). These are defined by certain relations involving transformations and their adjoints (transposes of matrices over more general fields). So now that we’ve got information about how the determinant and the adjoint interact, we can see what happens when we restrict the determinant homomorphism to these subgroups of $\mathrm{GL}(V)$.
First the orthogonal groups. This covers orthogonality with respect to general (nondegenerate) forms $B$ on an inner product space $V$, the special case of orthogonality with respect to the underlying inner product $\langle\cdot\vert\cdot\rangle$, and the orthogonal matrix group $\mathrm{O}(n,\mathbb{F})$ over arbitrary fields $\mathbb{F}$. The general form describing all of these cases is

$O^*BO=B$
where $*$ is the adjoint or the matrix transpose, as appropriate. Now we can take the determinant of both sides of this equation, using the fact that the determinant is a homomorphism. We find

$\det(O^*)\det(B)\det(O)=\det(B)$
Next we can use the fact that $\det(O^*)=\det(O)$. We can also divide out by $\det(B)$, since we know that $B$ is invertible, and so its determinant is nonzero. We’re left with the observation that

$\det(O)^2=1$
And thus the determinant of an orthogonal transformation must be a square root of $1$ in our field. For both real and complex matrices, this says $\det(O)=\pm1$, landing in the “sign group” $\{+1,-1\}$ (which is isomorphic to $\mathbb{Z}/2\mathbb{Z}$).
What about unitary transformations? Here we just look at the unitarity condition

$U^*U=1_V$
We take determinants

$\det(U^*)\det(U)=1$
and use the fact that the determinant of the adjoint is the conjugate of the determinant

$\overline{\det(U)}\det(U)=\lvert\det(U)\rvert^2=1$
So the determinant of a unitary transformation must be a unit complex number in the circle group $\mathrm{U}(1)$ (which, incidentally, contains the sign group above).
It seems, then, that when we take determinants the analogy we’ve been pushing starts to come out. Unitary (and orthogonal) transformations are like complex numbers on the unit circle, and their determinants actually are complex numbers on the unit circle.
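We can see this numerically with a quick NumPy sketch (my own illustration, not part of the argument above; the orthogonal and unitary matrices are built by QR decomposition, which is just a convenient way to generate examples):

```python
# Determinants of orthogonal and unitary matrices land in the sign group
# and circle group, respectively.
import numpy as np

rng = np.random.default_rng(0)

# A random orthogonal matrix: the Q factor of a real QR decomposition.
Q, _ = np.linalg.qr(rng.standard_normal((4, 4)))
det_Q = np.linalg.det(Q)
print(np.isclose(abs(det_Q), 1.0))  # True: det(Q) is +1 or -1

# A random unitary matrix: the Q factor of a complex QR decomposition.
A = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
U, _ = np.linalg.qr(A)
det_U = np.linalg.det(U)
print(np.isclose(abs(det_U), 1.0))  # True: det(U) has unit modulus
```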
It will be useful to know what happens to the determinant of a transformation when we pass to its adjoint. Since the determinant doesn’t depend on any particular choice of basis, we can just pick one arbitrarily and do our computations on matrices. And as we saw yesterday, adjoints are rather simple in terms of matrices: over real inner product spaces we take the transpose of our matrix, while over complex inner product spaces we take the conjugate transpose. So it will be convenient for us to just think in terms of matrices over any field for the moment, and see what happens to the determinant of a matrix when we take its transpose.
Okay, so let’s take a linear transformation $T:V\rightarrow V$ and pick a basis to get the $n\times n$ matrix with entries $t_{ij}$,
and the formula for the determinant reads

$\det(T)=\sum\limits_{\pi\in S_n}\mathrm{sgn}(\pi)\prod\limits_{i=1}^n t_{i\,\pi(i)}$
and the determinant of the adjoint is

$\det(T^*)=\sum\limits_{\pi\in S_n}\mathrm{sgn}(\pi)\prod\limits_{i=1}^n t_{\pi(i)\,i}$
where we’ve now taken the transpose.
Now for the term corresponding to the permutation $\pi$ we can rearrange the multiplication. Instead of multiplying the factors in order of the second index, from $i=1$ to $i=n$, we multiply them in order of the first index, from $\pi(i)=1$ to $\pi(i)=n$. All we’re doing is rearranging factors, and our field multiplication is commutative, so this doesn’t change the result at all:

$\prod\limits_{i=1}^n t_{\pi(i)\,i}=\prod\limits_{i=1}^n t_{i\,\pi^{-1}(i)}$
But as $\pi$ ranges over the symmetric group $S_n$, so does its inverse $\pi^{-1}$, and $\mathrm{sgn}(\pi^{-1})=\mathrm{sgn}(\pi)$. So we relabel to find

$\det(T^*)=\sum\limits_{\pi\in S_n}\mathrm{sgn}(\pi)\prod\limits_{i=1}^n t_{i\,\pi(i)}$
and we’re back to our formula for the determinant of $T$ itself! That is, when we take the transpose of a matrix we don’t change its determinant at all. And since the transpose of a real matrix corresponds to the adjoint of the transformation on a real inner product space, taking the adjoint doesn’t change the determinant of the transformation.
What about over complex inner product spaces, where adjoints correspond to conjugate transposes? Well, all we have to do is take the complex conjugate of each term in our calculation when we take the transpose of our matrix. Then carrying all these through to the end as we juggle indices around we’re left with

$\det(T^*)=\overline{\det(T)}$
The determinant of the adjoint is, in this case, the complex conjugate of the determinant.
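Both facts are easy to verify numerically (a NumPy sketch of my own; the random complex matrix is arbitrary):

```python
# Transposing a matrix leaves its determinant alone, while the conjugate
# transpose conjugates it.
import numpy as np

rng = np.random.default_rng(1)
T = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))
d = np.linalg.det(T)

print(np.isclose(np.linalg.det(T.T), d))                       # transpose: same determinant
print(np.isclose(np.linalg.det(T.conj().T), d.conjugate()))    # adjoint: conjugate determinant
```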
First a unitary transformation $U$ on a complex vector space. We pick a basis $\left\{e_i\right\}$ and set up the matrix

$u_{ij}=\langle e_i\vert U\vert e_j\rangle$
We can also set up the matrix for the adjoint

$(u^*)_{ij}=\langle e_i\vert U^*\vert e_j\rangle=\overline{\langle e_j\vert U\vert e_i\rangle}=\overline{u_{ji}}$
That is, the adjoint matrix is the conjugate transpose. This isn’t really anything new, since we essentially saw it when we considered Hermitian matrices.
But now we want to apply the unitarity condition that $U^*U=1_V$. It will make our lives easier here to just write out the sum over the basis in the middle and find

$\sum\limits_k\overline{u_{ki}}u_{kj}=\delta_{ij}$
Now, this isn’t particularly useful on its face. I mean, what does that mess even mean? But if nothing else it tells us that we can describe unitary matrices in terms of (a lot of) equations involving only complex numbers. We can then pick out all the complex $n\times n$ matrices which represent unitary transformations. They form the “unitary group” $\mathrm{U}(n)$.
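Here’s a small check of those entrywise equations (a sketch under my own assumptions: NumPy, and a unitary matrix generated by a QR decomposition):

```python
# The entrywise unitarity equations: sum_k conj(u_ki) * u_kj = delta_ij.
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))
U, _ = np.linalg.qr(A)  # the Q factor is unitary
n = U.shape[0]

for i in range(n):
    for j in range(n):
        s = sum(U[k, i].conjugate() * U[k, j] for k in range(n))
        assert np.isclose(s, 1.0 if i == j else 0.0)
print("unitarity equations hold")
```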
What about orthogonal matrices? Again, we pick a basis to get a matrix

$o_{ij}=\langle e_i\vert O\vert e_j\rangle$
and also a matrix for the adjoint

$(o^*)_{ij}=o_{ji}$
Here the adjoint matrix is just the transpose, not the conjugate transpose, since we’re working over a real inner product space. Then we can write down the orthogonality condition

$\sum\limits_k o_{ki}o_{kj}=\delta_{ij}$
Again, this doesn’t really seem to tell us much, but we can use these equations to cut out the matrices which represent orthogonal transformations from all real $n\times n$ matrices. They form the “orthogonal group” $\mathrm{O}(n)$.
But there’s something else we should notice here. The equations for the unitary group involved complex conjugation, so we need some structure like that to talk sensibly about unitarity. However, the orthogonality equations only involve basic field operations like addition and multiplication, and so these equations make sense over any field whatsoever. That is, given a field $\mathbb{F}$ we can consider the collection of all $n\times n$ matrices with entries in $\mathbb{F}$, and then impose the above orthogonality condition to cut out the matrices in the orthogonal group $\mathrm{O}(n,\mathbb{F})$, while the first orthogonal group is $\mathrm{O}(n)=\mathrm{O}(n,\mathbb{R})$.
One useful orthogonal group is $\mathrm{O}(n,\mathbb{C})$. This is not the same as the unitary group $\mathrm{U}(n)$, though it can be confusing to keep the two separate at first. The unitary group consists of matrices whose inverses are their conjugate transposes, while the complex orthogonal group consists of matrices whose inverses are just their transposes. The unitary group preserves a sesquilinear inner product, which has the clear geometric interpretation we’ve been talking about. The complex orthogonal group preserves a bilinear form, which doesn’t have such a clear visual interpretation. They are related in a way, but we’ll be coming back to that subject much later on.
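To see the two groups really are different, here’s an example of my own: a “rotation” through the imaginary angle $z=i$ lies in $\mathrm{O}(2,\mathbb{C})$ but not in $\mathrm{U}(2)$, since $\cos^2 z+\sin^2 z=1$ holds for complex $z$ while the conjugate-transpose condition fails:

```python
# A complex orthogonal matrix that is not unitary.
import numpy as np

z = 1j  # an imaginary "angle"
Q = np.array([[np.cos(z), -np.sin(z)],
              [np.sin(z),  np.cos(z)]])

print(np.allclose(Q.T @ Q, np.eye(2)))         # True: Q is complex orthogonal
print(np.allclose(Q.conj().T @ Q, np.eye(2)))  # False: Q is not unitary
```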
Unitary transformations are like orthogonal transformations, except we’re working with a complex inner product space. We’ll focus on just the transformations that are unitary with respect to the inner product itself. That is, we ask that

$\langle U(v)\vert U(w)\rangle=\langle v\vert w\rangle$
and so we must have $U^*U=1_V$, just as we wrote for orthogonal transformations, but we have to use the adjoint that’s appropriate to our complex inner product. In this way, unitary and orthogonal transformations are related in a way similar to that in which Hermitian and symmetric forms are related.
Now, we’ve got this running analogy between endomorphisms on an inner product space and complex numbers. Taking the adjoint is like complex conjugation, so Hermitian transformations are like real numbers because they’re equal to their own adjoints. But here, we’re looking at transformations whose inverses are equal to their adjoints. What does this look like in terms of our analogy?
Well, we’ve noted that a transformation composed with its own adjoint is a sure way to get a positive-definite transformation $T^*T$. This is analogous to the way that a complex number times its own conjugate is always nonnegative: $\bar{z}z=\lvert z\rvert^2\geq0$. In fact, we use this to interpret $\bar{z}z$ as the squared-length of the complex number $z$. So what’s the analogue of the unitarity condition $U^*U=1_V$? That’s like asking for $\bar{z}z=1$, and so $z$ must be a unit-length complex number. Unitary transformations are like the complex numbers on the unit circle.
Given a form on a vector space $V$ represented by the transformation $B$, and a linear map $T$, we’ve seen how to transform the form by the action of $T$. That is, the space of all bilinear forms on $V$ is a vector space which carries a representation of $\mathrm{GL}(V)$. But given a particular form $B$, what is the stabilizer of $B$? That is, what transformations in $\mathrm{GL}(V)$ send $B$ back to itself?
Before we answer this, let’s look at it in a slightly different way. Given a form $B$ we have a way of pairing vectors in $V$ to get scalars. On the other hand, if we have a transformation $T$ we could use it on the vectors before pairing them. We’re looking for those transformations $T$ so that for every pair of vectors the result of the pairing by $B$ is the same before and after applying $T$.
So let’s look at the action we described last time: the form $B$ is sent to $T^*BT$. So we’re looking for all $T$ so that

$T^*BT=B$
We say that such a transformation is $B$-orthogonal, and the subgroup of all such transformations is the “orthogonal group” $\mathrm{O}(V,B)$. Sometimes, since the vector space $V$ is sort of implicit in the form $B$, we abbreviate the group to $\mathrm{O}(B)$.
Now there’s one orthogonal group that’s particularly useful. If we’ve got an inner-product space (the setup for having our bra-ket notation) then the inner product itself is a form, and it’s described by the identity transformation $1_V$. That is, the orthogonality condition in this case is that

$T^*T=T^*1_VT=1_V$
A transformation is orthogonal if its adjoint is the same as its inverse. This is the version of orthogonality that we’re most familiar with. Commonly, when we say that a transformation is “orthogonal” with no qualification about what form we’re using, we just mean that this condition holds.
Let’s take a look at this last condition geometrically. We use the inner product to define a notion of (squared-)length $\langle v\vert v\rangle$ and a notion of (the cosine of) angle $\frac{\langle v\vert w\rangle}{\sqrt{\langle v\vert v\rangle\langle w\vert w\rangle}}$. So let’s transform the space by $T$ and see what happens to our inner product, and thus to lengths and angles.
First off, note that no matter what invertible $T$ we use, the transformation $T^*T$ in the middle is self-adjoint and positive-definite, and so the new form $\langle v\vert T^*T\vert w\rangle$ is symmetric and positive-definite, and thus defines another inner product. But when is it the same inner product? When $T^*T=1_V$, of course! For then we have

$\langle T(v)\vert T(w)\rangle=\langle v\vert T^*T\vert w\rangle=\langle v\vert w\rangle$
So orthogonal transformations are exactly those which preserve the notions of length and angle defined by the inner product. Geometrically, they correspond to rotations, and to reflections that reverse orientation; both leave the lengths of vectors the same, and leave the angle between any pair of vectors the same.
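This preservation of lengths and angles is easy to watch happen (a NumPy sketch of my own; the orthogonal matrix comes from a QR decomposition, and the vectors are random):

```python
# Orthogonal transformations preserve the lengths and angles defined by
# the standard inner product.
import numpy as np

rng = np.random.default_rng(3)
Q, _ = np.linalg.qr(rng.standard_normal((3, 3)))  # a random orthogonal matrix
v, w = rng.standard_normal(3), rng.standard_normal(3)

def cos_angle(a, b):
    # cosine of the angle between a and b
    return np.dot(a, b) / np.sqrt(np.dot(a, a) * np.dot(b, b))

print(np.isclose(np.dot(Q @ v, Q @ v), np.dot(v, v)))   # squared length preserved
print(np.isclose(cos_angle(Q @ v, Q @ w), cos_angle(v, w)))  # angle preserved
```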
Sorry for the delay, but at least now everything is in Maryland. Now I just need a job, and then I’ll have to move back out of my parents’ house again.
Anyway, now we want to see how linear maps between vector spaces affect forms on those spaces. We’ve seen a hint when we talked about the category of inner product spaces: if we have a bilinear form on a space $W$ and a linear map $T:V\rightarrow W$, then we can “pull back” the form along the map. That is, we take two vectors in $V$, apply $T$ to both of them, and then stick them into the form on $W$.
When we write this out with our Dirac notation, we don’t really make much of a distinction between the vector spaces, relying on context to tell which vectors belong to which spaces, as well as which pairing we’re using. Luckily, when we write something sensible, it usually doesn’t matter how we parse things. Dirac notation is very robust! So, given vectors $\vert v\rangle$ and $\vert w\rangle$ in $W$ we have the form given by

$\langle v\vert B\vert w\rangle$
as we’ve seen over and over. But now let’s take two vectors $\vert v\rangle$ and $\vert w\rangle$ in $V$. First, we hit each of these vectors with $T$. The ket vector clearly becomes $T\vert w\rangle$, while the bra vector becomes $\langle v\vert T^*$. Now using the form from before we get

$\langle v\vert T^*BT\vert w\rangle$
This looks like a form on $V$ that’s described by the linear transformation $T^*BT$. Just as a sanity check, we can verify that $T$ sends $V$ to $W$, then $B$ sends $W$ to itself, and the adjoint $T^*$ sends $W$ back to $V$. So this is indeed a transformation from $V$ to itself that can be used to represent a form.
As a particular case, we might consider an automorphism $S:V\rightarrow V$ and consider it as a change of basis. When we did this to a linear map $T$ we got a new linear map by conjugation $S^{-1}TS$, and we say that these two transformations are similar. In one sense, this is “the same” linear map, but described with different “coordinates” on the vector space. But when we apply it to a bilinear form we turn $B$ into $S^*BS$, and we say that these two are “congruent”. In the same sense as before, they describe “the same” form on $V$, but using different coordinates on the vector space.
Like similarity, congruence gives an action of $\mathrm{GL}(V)$ on the space of bilinear forms on $V$. Indeed, if $S$ and $T$ are both automorphisms, we can act on $B$ first by $S$ to get $S^*BS$, then by $T$ to get $T^*S^*BST$. But this is the same as $(ST)^*B(ST)$, which is the action by the product $ST$.
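Both the pullback formula and the composition of the action can be checked numerically (a sketch under my own conventions: real matrices, so the adjoint is the plain transpose):

```python
# The pulled-back form T*BT pairs v and w the way B pairs T(v) and T(w),
# and acting by S then T is the same as acting by the product ST.
import numpy as np

rng = np.random.default_rng(4)
n = 3
B = rng.standard_normal((n, n))         # a (generic) bilinear form
S = rng.standard_normal((n, n))         # automorphisms, almost surely invertible
T = rng.standard_normal((n, n))
v, w = rng.standard_normal(n), rng.standard_normal(n)

pulled = T.T @ B @ T
print(np.isclose(v @ pulled @ w, (T @ v) @ B @ (T @ w)))          # pullback formula
print(np.allclose(T.T @ (S.T @ B @ S) @ T, (S @ T).T @ B @ (S @ T)))  # action composes
```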
Okay, we know what a nondegenerate form is, but what does this mean for the transformation $B$ that represents the form?
Remember that the form represented by the transformation $B$ is nondegenerate if for every nonzero ket vector $\vert v\rangle$ there is some bra vector $\langle w\vert$ so that $\langle w\vert B\vert v\rangle\neq0$. But before we go looking for such a bra vector, the transformation $B$ has turned the ket vector $\vert v\rangle$ into a new ket vector $B\vert v\rangle$. If we find that $B\vert v\rangle=0$, then there can be no suitable bra vector with which to pair it. So, at the very least, we must have $B\vert v\rangle\neq0$ for every $\vert v\rangle\neq0$. That is, the kernel of $B$ is trivial. Since $B$ is a transformation from the vector space $V$ to itself, the rank-nullity theorem tells us that the image of $B$ is all of $V$. That is, $B$ must be an invertible transformation.
On the other hand, if $B$ is invertible, then every nonzero ket vector $\vert v\rangle$ becomes another nonzero ket vector $B\vert v\rangle$. Then we find that

$\langle B(v)\vert B\vert v\rangle=\langle B(v)\vert B(v)\rangle>0$
where this last inequality holds because the bra-ket pairing is an inner product, and is thus positive-definite. Indeed, a positive-definite (or negative-definite) form must be nondegenerate. Thus it is sufficient for $B$ to be invertible.
Incidentally, this approach gives us a good way of constructing a lot of positive-definite transformations. Given an invertible transformation $T$, we expand

$\langle v\vert T^*T\vert v\rangle=\langle T(v)\vert T(v)\rangle>0$

for every $\vert v\rangle\neq0$.
Since the form defined by the bra-ket pairing is positive-definite, so is the form defined by $T^*T$. And this is a sensible concept, since $T^*T$ is self-adjoint. Indeed, we take its adjoint to find

$\left(T^*T\right)^*=T^*T^{**}=T^*T$
This extends our analogy with the complex numbers. An invertible transformation composed with its adjoint is a self-adjoint, positive-definite transformation, just as a nonzero complex number multiplied by its conjugate is a real, positive number.
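Here’s the construction in action (my own NumPy sketch; the conjugate transpose plays the role of the adjoint, and invertibility of the random matrix is checked rather than assumed):

```python
# For an invertible T, the product T*T is self-adjoint with strictly
# positive eigenvalues -- i.e. positive-definite.
import numpy as np

rng = np.random.default_rng(5)
T = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
assert abs(np.linalg.det(T)) > 1e-8  # T is invertible

P = T.conj().T @ T
print(np.allclose(P, P.conj().T))        # True: self-adjoint
print(bool(np.all(np.linalg.eigvalsh(P) > 0)))  # True: positive-definite
```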
The notion of a positive semidefinite form opens up the possibility that, in a sense, a vector may be “orthogonal to itself”. That is, if we let $B$ be the self-adjoint transformation corresponding to our (conjugate) symmetric form, we might have a nonzero vector $\vert v\rangle$ such that $\langle v\vert B\vert v\rangle=0$. However, the vector need not be completely trivial as far as the form is concerned. There may be another vector $\vert w\rangle$ so that $\langle w\vert B\vert v\rangle\neq0$.
Let us work out a very concrete example. For our vector space, we take $\mathbb{R}^2$ with the standard basis, and we’ll write the ket vectors as columns, so:

$\vert1\rangle=\begin{pmatrix}1\\0\end{pmatrix}\qquad\vert2\rangle=\begin{pmatrix}0\\1\end{pmatrix}$
Then we will write the bra vectors as rows — the transposes of ket vectors:

$\langle1\vert=\begin{pmatrix}1&0\end{pmatrix}\qquad\langle2\vert=\begin{pmatrix}0&1\end{pmatrix}$
If we were working over a complex vector space we’d take conjugate transposes instead, of course. Now it will hopefully make the bra-ket and matrix connection clear if we note that the bra-ket pairing now becomes multiplication of the corresponding matrices. For example:

$\langle1\vert1\rangle=\begin{pmatrix}1&0\end{pmatrix}\begin{pmatrix}1\\0\end{pmatrix}=1\qquad\langle1\vert2\rangle=\begin{pmatrix}1&0\end{pmatrix}\begin{pmatrix}0\\1\end{pmatrix}=0$
The bra-ket pairing is exactly the inner product we get by declaring our basis to be orthonormal.
Now let’s insert a transformation between the bra and ket to make a form. Specifically, we’ll use the one with the matrix

$B=\begin{pmatrix}0&1\\1&0\end{pmatrix}$

Then the basis vector $\vert1\rangle$ is just such a one of these vectors “orthogonal” to itself (with respect to our new bilinear form). Indeed, we can calculate

$\langle1\vert B\vert1\rangle=\begin{pmatrix}1&0\end{pmatrix}\begin{pmatrix}0&1\\1&0\end{pmatrix}\begin{pmatrix}1\\0\end{pmatrix}=\begin{pmatrix}1&0\end{pmatrix}\begin{pmatrix}0\\1\end{pmatrix}=0$
However, this vector is not totally trivial with respect to the form $B$. For we can calculate

$\langle2\vert B\vert1\rangle=\begin{pmatrix}0&1\end{pmatrix}\begin{pmatrix}0&1\\1&0\end{pmatrix}\begin{pmatrix}1\\0\end{pmatrix}=\begin{pmatrix}0&1\end{pmatrix}\begin{pmatrix}0\\1\end{pmatrix}=1\neq0$
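The same calculations can be done with NumPy matrices (the off-diagonal matrix here is a standard example of a form pairing a nonzero vector to zero with itself but not with everything):

```python
# A nonzero vector "orthogonal to itself" under a symmetric bilinear form,
# yet not annihilated by the form.
import numpy as np

ket1 = np.array([[1], [0]])
ket2 = np.array([[0], [1]])
bra1, bra2 = ket1.T, ket2.T
B = np.array([[0, 1],
              [1, 0]])

print((bra1 @ B @ ket1).item())  # 0: |1> is "orthogonal" to itself
print((bra2 @ B @ ket1).item())  # 1: but |1> is not trivial for the form
```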
Now, all this is prologue to a definition. We say that a form (symmetric or not) is “degenerate” if there is some nonzero ket vector $\vert v\rangle$ so that for every bra vector $\langle w\vert$ we find

$\langle w\vert B\vert v\rangle=0$
And, conversely, we say that a form is “nondegenerate” if for every nonzero ket vector $\vert v\rangle$ there exists some bra vector $\langle w\vert$ so that

$\langle w\vert B\vert v\rangle\neq0$
We’ll stick with the background vector space $V$ with inner product $\langle\cdot\vert\cdot\rangle$. If we want another inner product to actually work with, we need to pick out a bilinear form (or sesquilinear, over $\mathbb{C}$). So this means we need a transformation $B$ to stick between bras and kets.
Now, for our new bilinear form to be an inner product it must be symmetric (or conjugate-symmetric). This is satisfied by picking our transformation $B$ to be symmetric (or Hermitian). But we also need our form to be “positive-definite”. That is, we need

$\langle v\vert B\vert v\rangle\geq0$
for all vectors $\vert v\rangle$, and for equality to obtain only when $\vert v\rangle=0$.
So let’s look at this condition on its own, first over $\mathbb{R}$. If $B$ is antisymmetric, then we see $\langle v\vert B\vert v\rangle=-\langle v\vert B\vert v\rangle$ by taking the adjoint, and thus $\langle v\vert B\vert v\rangle$ must be zero. But an arbitrary transformation $B$ can be split into a symmetric part $B_+=\frac{1}{2}\left(B+B^*\right)$ and an antisymmetric part $B_-=\frac{1}{2}\left(B-B^*\right)$. It’s easy to check that $\langle v\vert B\vert v\rangle=\langle v\vert B_+\vert v\rangle$. So the antisymmetric part of $B$ contributes nothing to the condition, and the concept of being “positive-definite” only makes real sense for symmetric transformations.
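The splitting is easy to check numerically (a NumPy sketch of my own; the names B_sym and B_anti are mine):

```python
# Only the symmetric part of B contributes to <v|B|v>.
import numpy as np

rng = np.random.default_rng(6)
B = rng.standard_normal((4, 4))
B_sym = (B + B.T) / 2    # symmetric part
B_anti = (B - B.T) / 2   # antisymmetric part
v = rng.standard_normal(4)

print(np.isclose(v @ B_anti @ v, 0.0))       # antisymmetric part pairs v with itself to zero
print(np.isclose(v @ B @ v, v @ B_sym @ v))  # only the symmetric part matters
```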
What happens over $\mathbb{C}$? Now we want to interpret the positivity condition as saying that $\langle v\vert B\vert v\rangle$ is first and foremost a real number. Then, taking adjoints we see that $\langle v\vert B^*\vert v\rangle=\overline{\langle v\vert B\vert v\rangle}=\langle v\vert B\vert v\rangle$. Thus the transformation $B-B^*$ must always give zero when we feed it two copies of the same vector.
But now we have the polarization identities to work with! The real and imaginary parts of $\langle v\vert\left(B-B^*\right)\vert w\rangle$ are completely determined in terms of expressions like $\langle u\vert\left(B-B^*\right)\vert u\rangle$. But since these are always zero, so is the rest of the form. And thus we conclude that $B=B^*$. That is, positive-definiteness only really makes sense for Hermitian transformations.
Actually, this all sort of makes sense. Self-adjoint transformations (symmetric or Hermitian) are analogous to the real numbers sitting inside the complex numbers. Within these, positive-definite matrices are sort of like the positive real numbers. It doesn’t make sense to talk about “positive” complex numbers, and it doesn’t make sense to talk about “positive-definite” transformations in general.
Now, there are three variations that I should also mention. The most obvious one is for a transformation to be “negative-definite”. In this case, we have $\langle v\vert B\vert v\rangle\leq0$, with equality only for $\vert v\rangle=0$. We can also have transformations which are “positive-semidefinite” and “negative-semidefinite”. These are just the same as the definite versions, except we don’t require that equality obtain only for $\vert v\rangle=0$.
The simplest structure we can look for in our bilinear forms is that they be symmetric, antisymmetric, or (if we’re working over the complex numbers) Hermitian. A symmetric form gives the same answer, an antisymmetric form negates the answer, and a Hermitian form conjugates the answer when we swap inputs. Thus if we have a symmetric form given by the linear operator $S$, an antisymmetric form given by the operator $A$, and a Hermitian form given by $H$, we can write

$\langle v\vert S\vert w\rangle=\langle w\vert S\vert v\rangle$
$\langle v\vert A\vert w\rangle=-\langle w\vert A\vert v\rangle$
$\langle v\vert H\vert w\rangle=\overline{\langle w\vert H\vert v\rangle}$
Each of these conditions can immediately be translated into a condition on the corresponding linear operator. We’ll flip over each of the terms on the left, using the symmetry of the inner product and the adjoint property. In the third line, though, we’ll use the conjugate-symmetry of the complex inner product.

$\langle w\vert S^*\vert v\rangle=\langle w\vert S\vert v\rangle$
$\langle w\vert A^*\vert v\rangle=-\langle w\vert A\vert v\rangle$
$\overline{\langle w\vert H^*\vert v\rangle}=\overline{\langle w\vert H\vert v\rangle}$
We can conjugate both sides of the last line to simplify it: $\langle w\vert H^*\vert v\rangle=\langle w\vert H\vert v\rangle$. Similarly, we can use linearity in the second line to rewrite $\langle w\vert A^*\vert v\rangle=\langle w\vert\left(-A\right)\vert v\rangle$.
Now in each line we have one operator on the left and another operator on the right, and these operators give rise to the same forms. I say that this means the operators themselves must be the same. To show this, consider the general case

$\langle w\vert X\vert v\rangle=\langle w\vert Y\vert v\rangle$

for all bras $\langle w\vert$ and kets $\vert v\rangle$.
Pulling both forms to one side and using linearity we find

$\langle w\vert\left(X-Y\right)\vert v\rangle=0$
Now, if the difference $X-Y$ is not the zero transformation, then there is some $\vert v\rangle$ so that $\left(X-Y\right)\vert v\rangle\neq0$. Then we can consider

$\langle\left(X-Y\right)(v)\vert\left(X-Y\right)\vert v\rangle=\langle\left(X-Y\right)(v)\vert\left(X-Y\right)(v)\rangle>0$

which contradicts the fact that this pairing must always be zero.
And so we must have $X=Y$.
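Underlying this argument is the fact that the form determines the operator: every matrix entry is itself a form value on basis vectors. A quick NumPy sketch (my own restatement) makes this concrete:

```python
# If two matrices define the same form on all pairs of vectors, they agree
# entry by entry: the (i, j) entry of X is just <e_i| X |e_j>.
import numpy as np

rng = np.random.default_rng(7)
X = rng.standard_normal((3, 3))
e = np.eye(3)  # rows are the standard basis vectors

# Recover each entry of X as a form value on basis vectors.
recovered = np.array([[e[i] @ X @ e[j] for j in range(3)] for i in range(3)])
print(np.allclose(recovered, X))  # True: the form determines the operator
```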
In particular, this shows that if we have a symmetric form, it’s described by a self-adjoint transformation $S^*=S$. Hermitian forms are also described by self-adjoint transformations $H^*=H$. And antisymmetric forms are described by “skew-adjoint” transformations $A^*=-A$.
So what’s the difference between a symmetric and a Hermitian form? It’s all in the fact that a symmetric form is based on a vector space with a symmetric inner product, while a Hermitian form is based on a complex vector space with a conjugate-symmetric inner product. The different properties of the two inner products account for the different ways that adjoint transformations behave.