Every Self-Adjoint Transformation has an Eigenvector
Okay, this tells us nothing in the complex case, but for real transformations we have no reason to assume that a given transformation has any eigenvalues at all. But if our transformation is self-adjoint it must have one.
When we found this in the complex case we saw that the characteristic polynomial had to have a root, since $\mathbb{C}$ is algebraically closed. It’s the fact that $\mathbb{R}$ isn’t algebraically closed that causes our trouble. But since $\mathbb{R}$ sits inside $\mathbb{C}$ we can consider any real polynomial as a complex polynomial. That is, the characteristic polynomial of our transformation, considered as a complex polynomial (whose coefficients just happen to all be real) must have a complex root.
This really feels like a dirty trick, so let’s try to put it on a bit firmer ground. We’re looking at a transformation $T$ on a vector space $V$ over $\mathbb{R}$. What we’re going to do is “complexify” our space, so that we can use some things that only work over the complex numbers. To do this, we’ll consider $\mathbb{C}$ itself as a two-dimensional vector space over $\mathbb{R}$ and form the tensor product $V\otimes_{\mathbb{R}}\mathbb{C}$. The transformation $T$ immediately induces a transformation $T\otimes1_{\mathbb{C}}$ by defining $[T\otimes1_{\mathbb{C}}](v\otimes z)=T(v)\otimes z$. The space $V\otimes_{\mathbb{R}}\mathbb{C}$ is a complex vector space, since given a complex constant $c$ we can define the scalar product of $v\otimes z$ by $c$ as $c(v\otimes z)=v\otimes(cz)$. Finally, $T\otimes1_{\mathbb{C}}$ is complex-linear since it commutes with our complex scalar product.
What have we done? Maybe it’ll be clearer if we pick a basis $\{e_i\}$ for $V$. That is, any vector in $V$ is a linear combination of the $e_i$ in a unique way. Then every (real) vector in $V\otimes_{\mathbb{R}}\mathbb{C}$ is a unique linear combination of the $e_i\otimes1$ and the $e_i\otimes i$ (this latter $i$ is the complex number, not the index; try to keep them separate). But as complex vectors, we have $e_i\otimes i=i(e_i\otimes1)$, and so every vector is a unique complex linear combination of the $e_i\otimes1$. It’s like we’ve kept the same basis, but just decided to allow complex coefficients too.
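And what about the matrix of $T\otimes1_{\mathbb{C}}$ with respect to this (complex) basis of $V\otimes_{\mathbb{R}}\mathbb{C}$? Well it’s just the same as the old matrix of $T$ with respect to the $e_i$! Just write
$$[T\otimes1_{\mathbb{C}}](e_i\otimes1)=T(e_i)\otimes1=\Bigl(\sum_j t_{j,i}e_j\Bigr)\otimes1=\sum_j t_{j,i}(e_j\otimes1)$$
Then if $T$ is self-adjoint its matrix (with respect to an orthonormal basis $\{e_i\}$) will be symmetric, and so will the matrix of $T\otimes1_{\mathbb{C}}$. Since a real symmetric matrix equals its own conjugate transpose, $T\otimes1_{\mathbb{C}}$ must then be self-adjoint as well. And we can calculate the characteristic polynomial of $T$ from its matrix, so the characteristic polynomial of $T\otimes1_{\mathbb{C}}$ will be the same, except it will be a complex polynomial whose coefficients all just happen to be real.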
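Okay so back to the point. Since $T\otimes1_{\mathbb{C}}$ is a transformation on a complex vector space it must have an eigenvalue $\lambda$ and a corresponding eigenvector $v$. And I say that since $T\otimes1_{\mathbb{C}}$ is self-adjoint, the eigenvalue $\lambda$ must be real. Indeed, we can calculate
$$\lambda\langle v,v\rangle=\langle v,\lambda v\rangle=\langle v,[T\otimes1_{\mathbb{C}}](v)\rangle=\langle[T\otimes1_{\mathbb{C}}](v),v\rangle=\langle\lambda v,v\rangle=\bar{\lambda}\langle v,v\rangle$$
and thus $\lambda=\bar{\lambda}$ (since $\langle v,v\rangle\neq0$ for an eigenvector), so $\lambda$ is real.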
Therefore, we have found a real number $\lambda$ so that when we plug it into the characteristic polynomial of $T\otimes1_{\mathbb{C}}$, we get zero. But then we also get zero when we plug it into the characteristic polynomial of $T$, and thus it’s also an eigenvalue of $T$.
And so, finally, every self-adjoint transformation on a real vector space has at least one eigenvector.
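As a quick sanity check in the smallest interesting case, here is a minimal Python sketch (not part of the argument above; the function name `symmetric_2x2_eigenvalues` is made up for illustration). For a real symmetric matrix with rows $(a, b)$ and $(b, d)$, the characteristic polynomial $x^2-(a+d)x+(ad-b^2)$ has discriminant $(a-d)^2+4b^2\geq0$, so its roots are forced to be real.

```python
import math

def symmetric_2x2_eigenvalues(a, b, d):
    """Eigenvalues of the real symmetric matrix [[a, b], [b, d]].

    The characteristic polynomial is x^2 - (a + d) x + (a d - b^2).
    Its discriminant simplifies to (a - d)^2 + 4 b^2, which is never
    negative, so both roots are real numbers.
    """
    trace = a + d
    disc = (a - d) ** 2 + 4 * b * b   # always >= 0 for symmetric input
    root = math.sqrt(disc)            # so this square root is always defined
    return ((trace - root) / 2, (trace + root) / 2)

# Illustrative input: the matrix [[2, 1], [1, 2]].
low, high = symmetric_2x2_eigenvalues(2.0, 1.0, 2.0)
print(low, high)
```

Dropping the symmetry (say, using different off-diagonal entries) would let the discriminant go negative, which is exactly the trouble with general real transformations.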
Invariant Subspaces of Self-Adjoint Transformations
Okay, today I want to nail down a lemma about the invariant subspaces (and, in particular, eigenspaces) of self-adjoint transformations. Specifically, the orthogonal complement of an invariant subspace is also invariant.
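So let’s say we’ve got a subspace $W\subseteq V$ and its orthogonal complement $W^\perp$. We also have a self-adjoint transformation $S:V\to V$ so that $S(w)\in W$ for all $w\in W$. What we want to show is that for every $v\in W^\perp$, we also have $S(v)\in W^\perp$.
Okay, so let’s try to calculate the inner product $\langle S(v),w\rangle$ for an arbitrary $w\in W$.
$$\langle S(v),w\rangle=\langle v,S(w)\rangle=0$$
since $S$ is self-adjoint, $S(w)$ is in $W$, and $v$ is in $W^\perp$. Then since this is zero no matter what $w\in W$ we pick, we see that $S(v)\in W^\perp$. Neat!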
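To see the lemma in action, here is a small Python sketch on a hypothetical example: the symmetric matrix `S`, the vector `w` spanning the invariant subspace, and the chosen basis of the complement are all assumptions made for illustration.

```python
def apply(matrix, vec):
    """Multiply a square matrix (list of rows) by a vector."""
    return [sum(row[j] * vec[j] for j in range(len(vec))) for row in matrix]

def dot(u, v):
    """Real inner product of two vectors."""
    return sum(x * y for x, y in zip(u, v))

# A symmetric (hence self-adjoint) matrix on R^3, chosen for illustration.
S = [[2, 1, 0],
     [1, 2, 0],
     [0, 0, 5]]

w = [1, 1, 0]                           # spans W; S(w) = 3w, so W is invariant
complement = [[1, -1, 0], [0, 0, 1]]    # a basis of the orthogonal complement of W

# Apply S to each complement vector and pair the result with w.
images = [apply(S, v) for v in complement]
pairings = [dot(img, w) for img in images]
print(apply(S, w), images, pairings)
```

Every pairing comes out to zero, so the images of the complement stay inside the complement, as the lemma predicts.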
The Complex Spectral Theorem
We’re now ready to characterize those transformations on complex vector spaces which have a diagonal matrix with respect to some orthonormal basis. First of all, such a transformation must be normal. If we have a diagonal matrix we can find the matrix of the adjoint by taking its conjugate transpose, and this will again be diagonal. Since any two diagonal matrices commute, the transformation must commute with its adjoint, and is therefore normal.
On the other hand, let’s start with a normal transformation $T$ and see what happens as we try to diagonalize it. First, since we’re working over $\mathbb{C}$ here, we can pick an orthonormal basis that gives us an upper-triangular matrix and call the basis $\{e_i\}_{i=1}^n$. Now, I assert that this matrix already is diagonal when $T$ is normal.
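Let’s write out the matrices for $T$
$$\begin{pmatrix}t_{1,1}&\cdots&t_{1,n}\\&\ddots&\vdots\\0&&t_{n,n}\end{pmatrix}$$
and $T^*$
$$\begin{pmatrix}\overline{t_{1,1}}&&0\\\vdots&\ddots&\\\overline{t_{1,n}}&\cdots&\overline{t_{n,n}}\end{pmatrix}$$
Now we can see that $T(e_1)=t_{1,1}e_1$, while $T^*(e_1)=\overline{t_{1,1}}e_1+\dots+\overline{t_{1,n}}e_n$. Since this basis is orthonormal, it’s easy to calculate the squared-lengths of these two:
$$\begin{aligned}\lVert T(e_1)\rVert^2&=\lvert t_{1,1}\rvert^2\\\lVert T^*(e_1)\rVert^2&=\lvert t_{1,1}\rvert^2+\dots+\lvert t_{1,n}\rvert^2\end{aligned}$$
But since $T$ is normal, these two must be the same. And so all the entries other than maybe $t_{1,1}$ in the first row of our matrix must be zero. We can then repeat this reasoning with the basis vector $e_2$, and reach a similar conclusion about the second row, and so on until we see that all the entries above the diagonal must be zero.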
That is, not only is it necessary that a transformation be normal in order to diagonalize it, it’s also sufficient. Any normal transformation on a complex vector space has an orthonormal basis of eigenvectors.
Now if we have an arbitrary orthonormal basis (say $T$ is a transformation on $\mathbb{C}^n$ with the standard basis already floating around) we may want to work with the matrix of $T$ with respect to this basis. If this were our basis of eigenvectors, $T$ would have the diagonal matrix $\Lambda$. But we may not be so lucky. Still, we can perform a change of basis using the basis of eigenvectors to fill in the columns of the change-of-basis matrix. And since we’re going from one orthonormal basis to another, this will be unitary!
Thus a normal transformation is not only equivalent to a diagonal transformation, it is unitarily equivalent. That is, the matrix of any normal transformation can be written as $U\Lambda U^{-1}$ for a diagonal matrix $\Lambda$ and a unitary matrix $U$. And any matrix which is unitarily equivalent to a diagonal matrix is normal. That is, if you take the subspace of diagonal matrices within the space of all matrices, then use the unitary group to act by conjugation on this subspace, the result is the set of all normal matrices, which represent normal transformations.
Often, you’ll see this written as $U\Lambda U^*$, which is really the same thing of course, but there’s an interesting semantic difference. Writing it using the inverse is a similarity, which is our notion of equivalence for transformations. So if we’re thinking of our matrix as acting on a vector space, this is the “right way” to think of the spectral theorem. On the other hand, using the conjugate transpose is a congruence, which is our notion of equivalence for bilinear forms. So if we’re thinking of our matrix as representing a bilinear form, this is the “right way” to think of the spectral theorem. But of course since we’re using unitary transformations here, it doesn’t matter! Unitary equivalence of endomorphisms and of bilinear forms is exactly the same thing.
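Here is a minimal Python sketch of the factorization $U\Lambda U^*$ on a hypothetical $2\times2$ example. The particular unitary matrix `U`, the diagonal matrix `Lam`, and the helper names are assumptions chosen for illustration; the point is only that conjugating a diagonal matrix by a unitary one produces a normal matrix whose eigenvectors are the columns of `U`.

```python
import math

def matmul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def conj_transpose(A):
    """Conjugate transpose (the matrix of the adjoint)."""
    return [[complex(A[j][i]).conjugate() for j in range(len(A))]
            for i in range(len(A[0]))]

def max_abs_diff(A, B):
    """Largest entrywise difference, used as a floating-point tolerance check."""
    return max(abs(A[i][j] - B[i][j])
               for i in range(len(A)) for j in range(len(A[0])))

s = 1 / math.sqrt(2)
U = [[s, s], [-1j * s, 1j * s]]       # columns are orthonormal eigenvectors
Lam = [[1 + 1j, 0], [0, 1 - 1j]]      # eigenvalues on the diagonal

U_star = conj_transpose(U)
N = matmul(matmul(U, Lam), U_star)    # the matrix U Lam U^*

# U is unitary, so U^* is also its inverse.
unitary_error = max_abs_diff(matmul(U_star, U), [[1, 0], [0, 1]])
# N commutes with its adjoint, so it is normal.
N_star = conj_transpose(N)
normal_error = max_abs_diff(matmul(N, N_star), matmul(N_star, N))
print(N, unitary_error, normal_error)
```

Up to rounding, `N` comes out as the real matrix with rows $(1,-1)$ and $(1,1)$, which has no real eigenvalues but is perfectly diagonalizable over $\mathbb{C}$.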
Unitary and Orthogonal Matrices and Orthonormal Bases
I almost forgot to throw in this little observation about unitary and orthogonal matrices that will come in handy.
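Let’s say we’ve got a unitary transformation $U$ and an orthonormal basis $\{e_i\}_{i=1}^n$. We can write down the matrix as before
$$\begin{pmatrix}u_{1,1}&\cdots&u_{1,n}\\\vdots&\ddots&\vdots\\u_{n,1}&\cdots&u_{n,n}\end{pmatrix}$$
Now, each column is a vector. In particular, it’s the result of transforming a basis vector $e_i$ by $U$.
$$U(e_i)=u_{1,i}e_1+\dots+u_{n,i}e_n$$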
What do these vectors have to do with each other? Well, let’s take their inner products and find out.
$$\langle U(e_i),U(e_j)\rangle=\langle e_i,e_j\rangle=\delta_{i,j}$$
since $U$ preserves the inner product. That is, the collection of columns of the matrix of $U$ forms another orthonormal basis.
On the other hand, what if we have in mind some other orthonormal basis $\{f_j\}_{j=1}^n$? We can write each of these vectors out in terms of the original basis
$$f_j=a_{1,j}e_1+\dots+a_{n,j}e_n$$
and even get a change-of-basis transformation (like we did for general linear transformations) defined by
$$A(e_j)=f_j=a_{1,j}e_1+\dots+a_{n,j}e_n$$
so the $a_{i,j}$ are the matrix entries for $A$ with respect to the basis $\{e_i\}$. This transformation $A$ will then be unitary.
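Indeed, take arbitrary vectors $v=\sum_iv_ie_i$ and $w=\sum_jw_je_j$. Their inner product is
$$\langle v,w\rangle=\sum_{i,j}\overline{v_i}w_j\langle e_i,e_j\rangle=\sum_{i,j}\overline{v_i}w_j\delta_{i,j}$$
On the other hand, after acting by $A$ we find
$$\langle A(v),A(w)\rangle=\sum_{i,j}\overline{v_i}w_j\langle A(e_i),A(e_j)\rangle=\sum_{i,j}\overline{v_i}w_j\langle f_i,f_j\rangle=\sum_{i,j}\overline{v_i}w_j\delta_{i,j}$$
since the basis $\{f_j\}$ is orthonormal as well.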
To sum up: with respect to an orthonormal basis, the columns of a unitary matrix form another orthonormal basis. Conversely, writing any other orthonormal basis in terms of the original basis and using these coefficients as the columns of a matrix gives a unitary matrix. The same holds true for orthogonal matrices, with similar reasoning all the way through. And both of these are parallel to the situation for general linear transformations: the columns of an invertible matrix with respect to any basis form another basis, and conversely.
Eigenvalues and Eigenvectors of Normal Transformations
Let’s say we have a normal transformation $N$. It turns out we can say some interesting things about its eigenvalues and eigenvectors.
First off, it turns out that the eigenvalues of $N^*$ are exactly the complex conjugates of those of $N$ (the same, if we’re working over $\mathbb{R}$). Actually, this isn’t even special to normal operators. Indeed, if $T-\lambda I_V$ has a nontrivial kernel, then we can take the adjoint to find that $(T-\lambda I_V)^*=T^*-\bar{\lambda}I_V$ must have a nontrivial kernel as well. But if our transformation is normal, it turns out that not only do we have conjugate eigenvalues, they correspond to the same eigenvectors as well!
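To see this, we do almost the same thing as before. But we get more than just a nontrivial kernel this time. Given an eigenvector $v$ we know that $(N-\lambda I_V)v=0$, and so it must have length zero. But if $N$ is normal then so is $N-\lambda I_V$:
$$\begin{aligned}(N-\lambda I_V)(N-\lambda I_V)^*&=(N-\lambda I_V)(N^*-\bar{\lambda}I_V)\\&=NN^*-\bar{\lambda}N-\lambda N^*+\lambda\bar{\lambda}I_V\\&=N^*N-\lambda N^*-\bar{\lambda}N+\bar{\lambda}\lambda I_V\\&=(N^*-\bar{\lambda}I_V)(N-\lambda I_V)\\&=(N-\lambda I_V)^*(N-\lambda I_V)\end{aligned}$$
and so acting by $N-\lambda I_V$ gives the same length as acting by $(N-\lambda I_V)^*$. That is:
$$0=\lVert(N-\lambda I_V)v\rVert=\lVert(N-\lambda I_V)^*v\rVert=\lVert(N^*-\bar{\lambda}I_V)v\rVert$$
thus by the definiteness of length, we know that $(N^*-\bar{\lambda}I_V)v=0$. That is, $v$ is also an eigenvector of $N^*$, with eigenvalue $\bar{\lambda}$.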
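Then as a corollary we can find that not only are the eigenvectors corresponding to distinct eigenvalues linearly independent, they are actually orthogonal! Indeed, if $v$ and $w$ are eigenvectors of $N$ with distinct eigenvalues $\lambda$ and $\mu$, respectively, then we find
$$\lambda\langle v,w\rangle=\langle\bar{\lambda}v,w\rangle=\langle N^*v,w\rangle=\langle v,Nw\rangle=\langle v,\mu w\rangle=\mu\langle v,w\rangle$$
Since $\lambda\neq\mu$ we must conclude that $\langle v,w\rangle=0$, and that the two eigenvectors are orthogonal.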
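Both facts can be checked by hand on a small case. The Python sketch below uses a hypothetical normal matrix `N` together with eigenpairs worked out by hand; these specific numbers are assumptions for illustration. The inner product is taken to be conjugate-linear in its first slot.

```python
def apply(M, v):
    """Multiply a square matrix (list of rows) by a vector."""
    return [sum(row[j] * v[j] for j in range(len(v))) for row in M]

def inner(v, w):
    """Inner product, conjugate-linear in the first slot."""
    return sum(complex(a).conjugate() * b for a, b in zip(v, w))

N = [[1, -1], [1, 1]]        # a normal matrix, chosen for illustration
N_adj = [[1, 1], [-1, 1]]    # its conjugate transpose

lam, v = 1 + 1j, [1, -1j]    # N v = lam v
mu, w = 1 - 1j, [1, 1j]      # N w = mu w

# Residuals; each should vanish if the claims hold for this example.
eig_v = [a - lam * b for a, b in zip(apply(N, v), v)]
eig_w = [a - mu * b for a, b in zip(apply(N, w), w)]
adj_v = [a - lam.conjugate() * b for a, b in zip(apply(N_adj, v), v)]
overlap = inner(v, w)
print(eig_v, eig_w, adj_v, overlap)
```

Here `adj_v` checks that `v` is an eigenvector of the adjoint with the conjugate eigenvalue, and `overlap` checks that the two eigenvectors are orthogonal.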
Normal Transformations
All the transformations in our analogy, whether self-adjoint, unitary (or orthogonal), or even anti-self-adjoint (antisymmetric and “skew-Hermitian”) transformations satisfying $T^*=-T$, satisfy one slightly subtle but very interesting property: they all commute with their adjoints. Self-adjoint and anti-self-adjoint transformations do because any transformation commutes with itself and also with its negative, since negation is just scalar multiplication. Orthogonal and unitary transformations do because every invertible transformation commutes with its own inverse.
Now in general most pairs of transformations do not commute, so there’s no reason to expect this to happen commonly. Still, if we have a transformation $N$ so that $N^*N=NN^*$, we call it a “normal” transformation.
Let’s bang out an equivalent characterization of normal operators while we’re at it, so we can get an idea of what they look like geometrically. Take any vector $v$, hit it with $N$, and calculate its squared-length (I’m not specifying real or complex, since the notation is the same either way). We get
$$\lVert Nv\rVert^2=\langle Nv,Nv\rangle=\langle v,N^*Nv\rangle$$
On the other hand, we could do the same thing but using $N^*$ instead of $N$.
$$\lVert N^*v\rVert^2=\langle N^*v,N^*v\rangle=\langle v,NN^*v\rangle$$
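But if $N$ is normal, then $N^*N$ and $NN^*$ are the same, and thus $\lVert Nv\rVert=\lVert N^*v\rVert$ for all vectors $v$.
Conversely, if $\lVert Nv\rVert=\lVert N^*v\rVert$ for all vectors $v$, then we can use the polarization identities to conclude that $N^*N=NN^*$.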
So normal transformations are exactly those for which the image of a vector has the same length whether we use the transformation or its adjoint. For self-adjoint and anti-self-adjoint transformations this is pretty obvious since they’re (almost) the same thing anyway. For orthogonal and unitary transformations, they don’t change the lengths of vectors at all, so this makes sense.
Just to be clear, though, there are matrices that are normal, but which aren’t any of the special kinds we’ve talked about so far. For example, the transformation represented by the matrix
$$\begin{pmatrix}1&-1\\1&1\end{pmatrix}$$
has its adjoint represented by
$$\begin{pmatrix}1&1\\-1&1\end{pmatrix}$$
which is neither the original transformation nor its negative, so it’s neither self-adjoint nor anti-self-adjoint. We can calculate their product in either order to get
$$\begin{pmatrix}2&0\\0&2\end{pmatrix}$$
Since we get the same answer, the transformation is normal, but it’s clearly not unitary because if it were we’d get the identity matrix here.
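The two characterizations of normality (commuting with the adjoint, and giving images of the same length as the adjoint does) can be compared side by side in a short Python sketch. The matrices `N` and `shear` and the test vector `v` are hypothetical examples picked for illustration, and the exact equality test in `is_normal` is only appropriate because the entries are integers.

```python
def matmul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def transpose(A):
    """Transpose; for real matrices this is the matrix of the adjoint."""
    return [[A[j][i] for j in range(len(A))] for i in range(len(A[0]))]

def apply(M, v):
    """Multiply a square matrix (list of rows) by a vector."""
    return [sum(row[j] * v[j] for j in range(len(v))) for row in M]

def squared_length(v):
    return sum(x * x for x in v)

def is_normal(M):
    """Exact check that a real integer matrix commutes with its adjoint."""
    return matmul(M, transpose(M)) == matmul(transpose(M), M)

N = [[1, -1], [1, 1]]        # normal, but not self-adjoint or unitary
shear = [[1, 1], [0, 1]]     # a shear, which is not normal
v = [3, 4]

normal_lengths = (squared_length(apply(N, v)),
                  squared_length(apply(transpose(N), v)))
shear_lengths = (squared_length(apply(shear, v)),
                 squared_length(apply(transpose(shear), v)))
print(is_normal(N), normal_lengths, is_normal(shear), shear_lengths)
```

For the normal matrix the two squared lengths agree, while for the shear they differ, matching the length characterization above.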
The Determinant of a Positive-Definite Transformation
Let’s keep pushing the analogy we’ve got going.
First, we know that the determinant of the adjoint of a transformation is the complex conjugate of the determinant of the original transformation (or just the same, for a real transformation). So what about self-adjoint transformations? We’ve said that these are analogous to real numbers, and indeed their determinants are real numbers. If we have a transformation $H$ satisfying $H^*=H$, then we can take determinants to find
$$\overline{\det(H)}=\det(H^*)=\det(H)$$
and so the determinant is real.
What if $H$ is not only self-adjoint, but positive-definite? We would like the determinant to actually be a positive real number.
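Well, first let’s consider the eigenvalues of $H$. If $v$ is an eigenvector we have $H(v)=\lambda v$ for some scalar $\lambda$. Then we can calculate
$$\langle v,H(v)\rangle=\langle v,\lambda v\rangle=\lambda\langle v,v\rangle$$
If $H$ is to be positive-definite, this must be positive, and since $\langle v,v\rangle$ is positive, $\lambda$ itself must be positive. Thus the eigenvalues of a positive-definite transformation are all positive.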
Now if we’re working with a complex transformation we’re done. We can pick a basis so that the matrix for $H$ is upper-triangular, and then its determinant is the product of its eigenvalues. Since the eigenvalues are all positive, so is the determinant.
But what happens over the real numbers? Now we might not be able to put the transformation into an upper-triangular form. But we can put it into an almost upper-triangular form. The determinant is then the product of the determinants of the blocks along the diagonal. The $1\times1$ blocks are just eigenvalues, which still must be positive.
The $2\times2$ blocks, on the other hand, correspond to eigenpairs. They have trace $\tau$ and determinant $\delta$, and these must satisfy $\tau^2<4\delta$, or else we could decompose the block further. But $\tau^2$ is definitely nonnegative, and so the determinant $\delta$ has to be positive as well in any of these blocks. And thus the product of the determinants of the blocks down the diagonal is again positive.
So either way, the determinant of a positive-definite transformation is positive.
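The $2\times2$ block condition is easy to play with numerically. In the Python sketch below, the matrices `H` and `B` are hypothetical examples chosen for illustration, and `has_real_eigenvalue` simply tests the sign of the discriminant $\tau^2-4\delta$ of the characteristic polynomial $x^2-\tau x+\delta$.

```python
def det2(m):
    """Determinant of a 2x2 matrix given as a list of rows."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def trace2(m):
    return m[0][0] + m[1][1]

def has_real_eigenvalue(m):
    """x^2 - tau x + delta has a real root exactly when tau^2 >= 4 delta."""
    tau, delta = trace2(m), det2(m)
    return tau * tau >= 4 * delta

def quadratic_form(m, v):
    """The pairing <v, m v> for a real 2x2 matrix and a real vector."""
    return sum(v[i] * m[i][j] * v[j] for i in range(2) for j in range(2))

H = [[2, 1], [1, 2]]     # symmetric, eigenvalues 1 and 3
B = [[1, -2], [2, 1]]    # no real eigenvalues: an irreducible 2x2 block

samples = [[1, 0], [0, 1], [1, -1], [3, -2]]
forms_H = [quadratic_form(H, v) for v in samples]
forms_B = [quadratic_form(B, v) for v in samples]
print(det2(H), has_real_eigenvalue(H), forms_H)
print(det2(B), has_real_eigenvalue(B), forms_B)
```

Checking a handful of sample vectors does not prove positive-definiteness, of course; it only illustrates that both determinants come out positive, and that the block without real eigenvalues satisfies $\tau^2<4\delta$.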
