Finally, we come to the analogue of Jordan normal form over the real numbers.
Given a linear transformation $T$ on a real vector space $V$ of dimension $d$, we can find its characteristic polynomial. We can factor a real polynomial into the product of linear terms $(x-\lambda)$ and irreducible quadratic terms $(x^2-\tau x+\delta)$ with $\tau^2<4\delta$. These give us a list of eigenvalues $\lambda$ and eigenpairs $(\tau,\delta)$ for $T$.
For each distinct eigenvalue $\lambda_i$ we get a subspace of generalized eigenvectors, with $k$ distinct eigenvalues in total. Similarly, for each distinct eigenpair $(\tau_j,\delta_j)$ we get a subspace of generalized eigenvectors, with $l$ distinct eigenpairs in total.
We know that these subspaces are mutually disjoint. We also know that the dimension of the generalized eigenspace of $\lambda_i$ is equal to the multiplicity of $\lambda_i$, which is the number of factors of $(x-\lambda_i)$ in the characteristic polynomial. Similarly, the dimension of the generalized eigenspace of $(\tau_j,\delta_j)$ is twice the multiplicity of $(\tau_j,\delta_j)$, which is the number of factors of $(x^2-\tau_jx+\delta_j)$ in the characteristic polynomial. Since each linear factor contributes $1$ to the degree of the polynomial, while each irreducible quadratic contributes $2$, we can see that the sum of the dimensions of these generalized eigenspaces is equal to the degree of the characteristic polynomial, which is the dimension of $V$ itself.
That is, we have a decomposition of $V$ as a direct sum of invariant subspaces

$$V=U_1\oplus\dots\oplus U_k\oplus W_1\oplus\dots\oplus W_l$$

where $U_i$ is the generalized eigenspace of the eigenvalue $\lambda_i$ and $W_j$ is the generalized eigenspace of the eigenpair $(\tau_j,\delta_j)$.
I’ll leave it to you to work out what this last property implies for the matrix of $T$ restricted to the generalized eigenspace of an eigenpair, in analogy with a Jordan block for an eigenvalue.
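The dimension count can be checked numerically. The following sketch uses a hypothetical $4\times4$ matrix built from a Jordan-type block for the eigenvalue $2$ and a rotation-type block for the eigenpair $(0,1)$ (the factor $x^2+1$); kernel dimensions are computed via rank-nullity.

```python
import numpy as np

# Hypothetical 4x4 example: eigenvalue 2 with multiplicity 2 (a Jordan-type
# block), plus the eigenpair (tau, delta) = (0, 1), i.e. the factor x^2 + 1
T = np.array([[2.0, 1.0, 0.0,  0.0],
              [0.0, 2.0, 0.0,  0.0],
              [0.0, 0.0, 0.0, -1.0],
              [0.0, 0.0, 1.0,  0.0]])
d = T.shape[0]
I = np.eye(d)

def kernel_dim(M):
    # dim Ker(M) = d - rank(M), by rank-nullity
    return d - np.linalg.matrix_rank(M)

# generalized eigenspace dimensions for the eigenvalue and the eigenpair
U_dim = kernel_dim(np.linalg.matrix_power(T - 2.0 * I, d))
W_dim = kernel_dim(np.linalg.matrix_power(T @ T + I, d))

print(U_dim, W_dim)   # the two dimensions sum to dim V = 4
```

Computing kernel dimensions by numerical rank is only a sketch: for less tidy matrices the rank decision would need an explicit tolerance.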
As usual, let $T$ be a linear transformation on a real vector space $V$ of dimension $d$. We know that $T$ can be put into an almost upper-triangular form

$$\begin{pmatrix}A_1&&*\\&\ddots&\\0&&A_m\end{pmatrix}$$
where each block $A_i$ is either a $1\times1$ matrix or a $2\times2$ matrix with no eigenvalues. Now I assert that
- If $\lambda$ is a real number, then exactly $\dim\mathrm{Ker}\left((T-\lambda I_V)^d\right)$ of the blocks are the $1\times1$ matrix whose single entry is $\lambda$.
- If $(\tau,\delta)$ is a pair of real numbers with $\tau^2<4\delta$, then exactly

$$\frac{1}{2}\dim\mathrm{Ker}\left(\left(T^2-\tau T+\delta I_V\right)^d\right)$$

of the blocks are $2\times2$ matrices with characteristic polynomial $x^2-\tau x+\delta$. Incidentally, this shows that the dimension of this generalized eigenspace is even.
This statement is parallel to the one about multiplicities of eigenvalues over algebraically closed fields. And we’ll use a similar proof. First, let’s define the polynomial $p(x)$ to be $x-\lambda$ if we’re trying to prove the first part, and to be $x^2-\tau x+\delta$ if we’re trying to prove the second part, and let $r$ be the degree of $p$ (so $r=1$ in the first case and $r=2$ in the second). We’ll proceed by induction on the number $m$ of blocks along the diagonal.
If $m=1$ then $V$ is either one- or two-dimensional. Then the statement reduces to what we worked out by cases earlier. So from here we’ll assume that $m>1$, and that the statement holds for all matrices with $m-1$ blocks.
Let’s define $V'$ to be the subspace spanned by the first $m-1$ blocks, and $V''$ to be the subspace spanned by the last block. That is, $V'$ consists of all but the last one or two rows and columns of the matrix, depending on whether $A_m$ is $1\times1$ or $2\times2$. Clearly $V'$ is invariant under $T$, and the restriction $T'$ of $T$ to $V'$ has matrix

$$\begin{pmatrix}A_1&&*\\&\ddots&\\0&&A_{m-1}\end{pmatrix}$$
with $m-1$ blocks. Thus the inductive hypothesis tells us that exactly

$$\frac{1}{r}\dim\mathrm{Ker}\left(p(T')^d\right)$$

of the blocks from $A_1$ to $A_{m-1}$ have characteristic polynomial $p$ (here $r$ is the degree of $p$).
We’ll also define $T''$ to be the linear transformation acting by the block $A_m$. This is essentially the action induced by $T$ on the quotient space $V/V'$, but we’re viewing $V''$ as giving representatives in $V$ for vectors in the quotient space. This way, if $v''$ is a vector in this subspace of representatives we can write $T(v'')=T''(v'')+u'$ for some $u'\in V'$. Further, $T^2(v'')=T''^2(v'')+w'$ for some other vector $w'\in V'$. No matter which form of $p$ we’re using, we can see that $p(T)(v'')=p(T'')(v'')+u'$ for some $u'\in V'$, and further that $p(T)^n(v'')=p(T'')^n(v'')+u'_n$ for some $u'_n\in V'$.
Now, either $A_m$ has characteristic polynomial $p$ or not. If not, then I say that $\mathrm{Ker}\left(p(T)^d\right)\subseteq V'$. This implies that

$$\mathrm{Ker}\left(p(T)^d\right)=\mathrm{Ker}\left(p(T')^d\right)$$

and thus that both the dimension of this kernel and the number of blocks with characteristic polynomial $p$ are the same for $T$ as for $T'$.
So let’s assume $p(T)^d(v)=0$ and write $v=v'+v''$ with $v'\in V'$ and $v''\in V''$. Then

$$0=p(T)^d(v)=p(T')^d(v')+p(T'')^d(v'')+u'$$

for some $u'\in V'$. This implies that $p(T'')^d(v'')=0$, but since $p$ is not the characteristic polynomial of $A_m$, the transformation $p(T'')$ is invertible on $V''$. Thus $v''=0$ and $v=v'\in V'$.
On the other hand, if the characteristic polynomial of $A_m$ is $p$, then we want to show that

$$\dim\mathrm{Ker}\left(p(T)^d\right)=\dim\mathrm{Ker}\left(p(T')^d\right)+r$$

where $r$ is the degree of $p$. The inclusion-exclusion principle tells us that

$$\dim\mathrm{Ker}\left(p(T)^d\right)=\dim\left(\mathrm{Ker}\left(p(T)^d\right)\cap V'\right)+\dim\left(\mathrm{Ker}\left(p(T)^d\right)+V'\right)-\dim V'$$

The intersection $\mathrm{Ker}\left(p(T)^d\right)\cap V'$ is exactly $\mathrm{Ker}\left(p(T')^d\right)$, and $\dim V'=d-r$. We’ll show that $\mathrm{Ker}\left(p(T)^d\right)+V'=V$, and so its dimension is $d$, and we have the result we want.
So take $v''\in V''$. Because the characteristic polynomial of $A_m$ is $p$, we know that $p(T'')=0$ on $V''$. Thus $p(T)(v'')=u'$ for some $u'\in V'$. Then

$$p(T)^d(v'')=p(T)^{d-1}(u')=p(T')^{d-1}(u')\in\mathrm{Im}\left(p(T')^{d-1}\right)=\mathrm{Im}\left(p(T')^d\right)$$

where the last equality holds because the dimension of $V'$ is $d-r<d$, and so the image has stabilized by this point. Thus we can choose $w'\in V'$ so that $p(T')^d(w')=p(T)^d(v'')$. And so

$$p(T)^d(v''-w')=p(T)^d(v'')-p(T')^d(w')=0$$

which shows that $v''-w'\in\mathrm{Ker}\left(p(T)^d\right)$. And thus $v''=(v''-w')+w'$ is in $\mathrm{Ker}\left(p(T)^d\right)+V'$. Since $v''$ was arbitrary, the whole subspace $V''$ lies in $\mathrm{Ker}\left(p(T)^d\right)+V'$, which shows that $\mathrm{Ker}\left(p(T)^d\right)+V'=V$, which completes our proof.
Now, with all of that handled, we turn to calculate the characteristic polynomial of $T$, only to find that it’s the product of the characteristic polynomials of all the blocks $A_i$. That is, we will have $\dim\mathrm{Ker}\left((T-\lambda I_V)^d\right)$ factors of $(x-\lambda)$ and $\frac{1}{2}\dim\mathrm{Ker}\left(\left(T^2-\tau T+\delta I_V\right)^d\right)$ factors of $(x^2-\tau x+\delta)$. We can thus define this half-dimension to be the multiplicity of the eigenpair $(\tau,\delta)$. Like the multiplicity of an eigenvalue, it counts both the number of times the corresponding factor shows up in the characteristic polynomial of $T$, and the number of blocks on the diagonal of an almost upper-triangular matrix for $T$ that have this characteristic polynomial.
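As a quick numerical illustration (separate from the argument above), here is a hypothetical almost upper-triangular matrix whose characteristic polynomial we can compare against the product of the blocks' characteristic polynomials:

```python
import numpy as np

# Hypothetical almost upper-triangular 4x4 matrix: two 1x1 blocks with
# entry 3, then a 2x2 block with tau = 0, delta = 1 (factor x^2 + 1)
T = np.array([[3.0, 5.0, 7.0,  2.0],
              [0.0, 3.0, 1.0,  4.0],
              [0.0, 0.0, 0.0, -1.0],
              [0.0, 0.0, 1.0,  0.0]])

# np.poly gives the coefficients of the characteristic polynomial det(xI - T)
char_poly = np.poly(T)

# the product of the blocks' characteristic polynomials: (x-3)^2 (x^2 + 1)
product = np.polymul(np.polymul([1.0, -3.0], [1.0, -3.0]), [1.0, 0.0, 1.0])
```

Both computations should give the same degree-four polynomial, one factor per column of each block.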
When working over an algebraically closed field we found that generalized eigenspaces are invariant and disjoint from each other. The same holds now that we’re allowing eigenpairs for transformations on real vector spaces.
First off, the generalized eigenspace of an eigenpair $(\tau,\delta)$ is the kernel of $\left(T^2-\tau T+\delta I_V\right)^d$, which is a polynomial in $T$. Just like before, this kernel is automatically invariant under $T$, just as the generalized eigenspace $\mathrm{Ker}\left((T-\lambda I_V)^d\right)$ of an eigenvalue is.
Generalized eigenspaces of distinct eigenvalues are disjoint, as before. But let $\lambda$ be an eigenvalue, $(\tau,\delta)$ be an eigenpair, and $v$ be a nonzero vector in both generalized eigenspaces. The invariance of $\mathrm{Ker}\left(\left(T^2-\tau T+\delta I_V\right)^d\right)$ under $T$ shows that if $v$ is a generalized eigenvector of $(\tau,\delta)$, then so is $(T-\lambda I_V)v$. Just like we did before, we can keep hitting $v$ with $T-\lambda I_V$ until one step before it vanishes (which it eventually must, since $v$ is a generalized eigenvector of $\lambda$). So without loss of generality we can assume that $v$ is an actual eigenvector of $\lambda$ and a generalized eigenvector of $(\tau,\delta)$.
Now we can use the generalized eigenvector property to write

$$0=\left(T^2-\tau T+\delta I_V\right)^dv$$

but since $v$ is an eigenvector with eigenvalue $\lambda$, this says

$$0=\left(\lambda^2-\tau\lambda+\delta\right)^dv$$

If $v$ is nonzero, this can only be true if $\lambda$ is a root of $x^2-\tau x+\delta$, which we assumed not to be the case, since this quadratic is irreducible.
Finally we consider two distinct eigenpairs $(\tau_1,\delta_1)$ and $(\tau_2,\delta_2)$, and a nonzero generalized eigenvector $v$ of both. Another argument like that above shows that without loss of generality we can assume $v$ is an actual eigenvector of $(\tau_1,\delta_1)$. This eigenspace is the kernel $\mathrm{Ker}\left(T^2-\tau_1T+\delta_1I_V\right)$, which is thus invariant, and another argument like before lets us assume that $v$ is an actual eigenvector of both eigenpairs. Thus we have

$$\left(T^2-\tau_1T+\delta_1I_V\right)v=0=\left(T^2-\tau_2T+\delta_2I_V\right)v$$

Subtracting, we find

$$\left((\tau_2-\tau_1)T+(\delta_1-\delta_2)I_V\right)v=0$$

If $\tau_1\neq\tau_2$, this makes $v$ an eigenvector with eigenvalue $\frac{\delta_2-\delta_1}{\tau_2-\tau_1}$, which we have just seen is impossible for an eigenvector of an eigenpair. On the other hand, if $\tau_1=\tau_2$ then $(\delta_1-\delta_2)v=0$, and we conclude that $\delta_1=\delta_2$, contradicting the assumption that the eigenpairs were distinct.
At the end of the day, no nonzero vector can be a generalized eigenvector of more than one eigenvalue or eigenpair.
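The disjointness can be made concrete in a small numerical sketch with a hypothetical operator: the dimension of the intersection of two kernels can be computed by stacking the two maps vertically, since a vector lies in both kernels exactly when the stacked map kills it.

```python
import numpy as np

# Hypothetical operator on R^4 with eigenvalue 2 and eigenpair (0, 1)
T = np.array([[2.0, 1.0, 0.0,  0.0],
              [0.0, 2.0, 0.0,  0.0],
              [0.0, 0.0, 0.0, -1.0],
              [0.0, 0.0, 1.0,  0.0]])
d = T.shape[0]
I = np.eye(d)

A = np.linalg.matrix_power(T - 2.0 * I, d)   # Ker(A): generalized eigenspace of 2
B = np.linalg.matrix_power(T @ T + I, d)     # Ker(B): generalized eigenspace of (0, 1)

# dim(Ker(A) ∩ Ker(B)) = d - rank of the two maps stacked on top of each other
intersection_dim = d - np.linalg.matrix_rank(np.vstack([A, B]))
print(intersection_dim)   # trivial intersection
```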
Just as we saw when dealing with eigenvalues, eigenvectors alone won’t cut it. We want to consider the kernel not just of one transformation, but of its powers. Specifically, we will say that $v$ is a generalized eigenvector of the eigenpair $(\tau,\delta)$ if for some power $n$ we have

$$\left(T^2-\tau T+\delta I_V\right)^nv=0$$

The same argument as before tells us that the kernel will stabilize by the time we take $d=\dim V$ powers of an operator, so we define the generalized eigenspace of an eigenpair $(\tau,\delta)$ to be

$$\mathrm{Ker}\left(\left(T^2-\tau T+\delta I_V\right)^d\right)$$
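The stabilization can be watched directly in a small case. This sketch applies increasing powers of a hypothetical eigenpair operator and records the kernel dimensions:

```python
import numpy as np

# A hypothetical operator whose eigenspace for the eigenpair (0, 1) is
# 2-dimensional but whose generalized eigenspace is all of R^4
T = np.array([[0.0, -1.0, 1.0,  0.0],
              [1.0,  0.0, 0.0,  1.0],
              [0.0,  0.0, 0.0, -1.0],
              [0.0,  0.0, 1.0,  0.0]])
d = T.shape[0]
I = np.eye(d)

M = T @ T + I   # the eigenpair operator for (tau, delta) = (0, 1)

# kernel dimension of each power M^n, for n = 1 up to 2d
dims = [d - np.linalg.matrix_rank(np.linalg.matrix_power(M, n))
        for n in range(1, 2 * d + 1)]

print(dims)   # the sequence grows, then stays constant from the d-th power on
```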
Let’s look at these subspaces a little more closely, along with the older ones of the form $\mathrm{Ker}\left((T-\lambda I_V)^d\right)$, just to make sure they’re as well-behaved as our earlier generalized eigenspaces are. First, let $V$ be one-dimensional, so $T$ must be multiplication by some scalar $\mu$. Then the kernel of $(T-\lambda I_V)^d$ is all of $V$ if $\lambda=\mu$, and is trivial otherwise. On the other hand, what happens with an eigenpair $(\tau,\delta)$? Well, one application of the operator gives

$$\left(T^2-\tau T+\delta I_V\right)v=\left(\mu^2-\tau\mu+\delta\right)v$$

for any nonzero $v$. But this will always be itself nonzero, since we’re assuming that the polynomial $x^2-\tau x+\delta$ has no real roots. Thus the generalized eigenspace of $(\tau,\delta)$ will be trivial.
Next, if $V$ is two-dimensional, either $T$ has an eigenvalue or it doesn’t. If it does, then this gives a one-dimensional invariant subspace. The argument above shows that the generalized eigenspace of any eigenpair is again trivial. But if $T$ has no eigenvalues, then the generalized eigenspace of any eigenvalue is trivial. On the other hand, we’ve seen that the kernel of $\left(T^2-\tau T+\delta I_V\right)^d$ is either the whole of $V$ or nothing, and the former case happens exactly when $\tau$ is the trace of $T$ and $\delta$ is its determinant.
Now if is a real vector space of any finite dimension we know we can find an almost upper-triangular form. This form is highly non-unique, but there are some patterns we can exploit as we move forward.
An eigenvalue $\lambda$ of a linear transformation $T$ is the same thing as a root of the characteristic polynomial of $T$. That is, the characteristic polynomial has a factor $(x-\lambda)$. We can evaluate this polynomial at $T$ to get the linear transformation $T-\lambda I_V$. Vectors in the kernel of this transformation are the eigenvectors corresponding to the eigenvalue $\lambda$.
Now we want to do the same thing with an eigenpair $(\tau,\delta)$. This corresponds to an irreducible quadratic factor $(x^2-\tau x+\delta)$ in the characteristic polynomial of $T$. Evaluating this polynomial at $T$ we get the transformation $T^2-\tau T+\delta I_V$, which I assert has a nontrivial kernel. Specifically, I want to focus in on some two-dimensional invariant subspace on which $T$ has no eigenvalues. This corresponds to a $2\times2$ block in an almost upper-triangular representation of $T$. So we’ll just assume for the moment that $V$ has dimension $2$.
What I assert is this: if $p$ is a monic polynomial (with leading coefficient $1$) of degree two, then either $p$ is the characteristic polynomial of $T$ or it’s not. If it is, then $p(T)=0$, and its kernel is the whole of $V$. If not, then the kernel of $p(T)$ is trivial, and $p(T)$ is invertible.
In the first case we can just pick a basis, find a matrix, and crank out the calculation. If the matrix of $T$ is

$$\begin{pmatrix}a&b\\c&d\end{pmatrix}$$

then the characteristic polynomial is $x^2-(a+d)x+(ad-bc)$. We substitute the matrix into this polynomial to find

$$\begin{pmatrix}a&b\\c&d\end{pmatrix}^2-(a+d)\begin{pmatrix}a&b\\c&d\end{pmatrix}+(ad-bc)\begin{pmatrix}1&0\\0&1\end{pmatrix}=\begin{pmatrix}0&0\\0&0\end{pmatrix}$$
On the other hand, if $q$ is the characteristic polynomial of $T$ and $p$ is any other monic polynomial of degree two, then $q(T)=0$, as we just showed. Then we can calculate

$$p(T)=p(T)-q(T)=\beta T+\gamma I_V$$

for some constants $\beta$ and $\gamma$, at least one of which must be nonzero. If $\beta=0$, then $p(T)$ is a nonzero multiple of the identity, which is invertible as claimed. On the other hand, if $\beta\neq0$, then

$$p(T)=\beta\left(T+\frac{\gamma}{\beta}I_V\right)$$

which must be invertible since we assumed that $T$ has no eigenvalues (if it weren’t, $-\frac{\gamma}{\beta}$ would be an eigenvalue).
So for any $2\times2$ block with no eigenvalues, the action of an irreducible quadratic polynomial in $T$ either kills off the whole block or has a trivial kernel. This makes it reasonable to define an eigenvector of the eigenpair $(\tau,\delta)$ to be a vector in the kernel of $T^2-\tau T+\delta I_V$, in analogy with the definition of an eigenvector of a given eigenvalue.
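Here is this dichotomy in a small numerical sketch, with a hypothetical $2\times2$ block that has no real eigenvalues:

```python
import numpy as np

# Hypothetical 2x2 block with no real eigenvalues: trace 2, determinant 5,
# so tau^2 = 4 < 20 = 4*delta
T = np.array([[1.0, -2.0],
              [2.0,  1.0]])
I = np.eye(2)
tau = np.trace(T)          # 2
delta = np.linalg.det(T)   # 5

# The block's own characteristic polynomial kills it (Cayley-Hamilton)...
CH = T @ T - tau * T + delta * I

# ...while a different monic quadratic, say x^2 - x + 1, is invertible on it
P = T @ T - 1.0 * T + 1.0 * I
```

The first matrix comes out as zero; the second has nonzero determinant, so its kernel is trivial.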
We continue working over the field of real numbers. Again, let $T$ be a linear transformation from a real vector space $V$ of dimension $d$ to itself. We want to find the characteristic polynomial of this linear transformation.
When we had an algebraically closed field, this was easy. We took an upper-triangular matrix, and then the determinant was just the product down the diagonal. This gave one factor of the form $(x-\lambda)$ for each diagonal entry $\lambda$, which established that the diagonal entries of an upper-triangular matrix were exactly the eigenvalues of the linear transformation.
Now we don’t always have an upper-triangular matrix, but we can always find a matrix that’s almost upper-triangular. That is, one that looks like

$$\begin{pmatrix}A_1&&*\\&\ddots&\\0&&A_m\end{pmatrix}$$

where the blocks $A_i$ are all either $1\times1$ matrices or $2\times2$ matrices

$$\begin{pmatrix}a&b\\c&d\end{pmatrix}$$

In this latter case, we define $\tau$ to be the trace $a+d$, and $\delta$ to be the determinant $ad-bc$. We must find that $\tau^2<4\delta$, for otherwise we could find another basis which breaks this block up into two $1\times1$ blocks. Let’s go a step further and insist that all the $1\times1$ blocks show up first, followed by all the $2\times2$ blocks.
Now we can start calculating the determinant of $xI_V-T$, summing over permutations. Just like we saw with an upper-triangular matrix, if we have a $1\times1$ block in the lower-right we have to choose the rightmost entry in the bottom row, or the whole term will be zero. So we start racking up factors $(x-\lambda)$ just like before. Each $1\times1$ block, then, gives us a root of the characteristic polynomial, which is an eigenvalue. So far everything is the same as in the upper-triangular case.
Once we get to the $2\times2$ blocks we have to be a bit more careful. We have two choices of a nonzero entry in the lowest row of such a block: $-c$ or $x-d$. But if we choose $x-d$ then we can only choose $x-a$ on the next row up to have a chance of a nonzero term. On the other hand, if we choose $-c$ on the lowest row we are forced to choose $-b$ next. The choice between these two is independent of any other choices we might make in calculating the determinant. The first always gives a factor of $(x-a)(x-d)$ to the term corresponding to that permutation, while the second always gives a factor of $(-b)(-c)=bc$ to its term. These permutations (no matter what other choices we might make) differ by exactly one swap, and so they enter the determinant with opposite signs.
Now we can collect together all the permutations where we make one choice in this block, and all the permutations where we make the other choice. From the first collection we can factor out $(x-a)(x-d)$, and from the second we can factor out $bc$. What remains after we pull these factors out is the same in either case, so the upshot is that the block contributes a factor of $(x-a)(x-d)-bc$ to the determinant. Some calculation simplifies this:

$$(x-a)(x-d)-bc=x^2-(a+d)x+(ad-bc)=x^2-\tau x+\delta$$

which is a quadratic factor with no real roots (since we assumed that $\tau^2<4\delta$).
But a factor of the characteristic polynomial of this form is exactly what we defined to be an eigenpair. That is, just as eigenvalues — roots of the characteristic polynomial — correspond to one-dimensional invariant subspaces, so too do eigenpairs — irreducible quadratic factors of the characteristic polynomial — correspond to two-dimensional invariant subspaces. The $2\times2$ blocks that show up along the diagonal of the almost upper-triangular matrix give rise to the eigenpairs $(\tau,\delta)$ of $T$.
Over an algebraically closed field we can always find an upper-triangular matrix for any linear endomorphism. Over the real numbers we’re not quite so lucky, but we can come close.
Let $T$ be a linear transformation from a real vector space $V$ of dimension $d$ to itself. We might not be able to find an eigenvector — a one-dimensional invariant subspace — but we know that we can find either a one-dimensional or a two-dimensional invariant subspace $U$. Just like before we get an action of $T$ on the quotient space $V/U$. Why? Because if we have two representatives $v$ and $w$ of the same vector in the quotient space, then we can write $w=v+u$ for some $u\in U$. Acting by $T$, we find $T(w)=T(v)+T(u)$. And since $T(u)\in U$, the vectors $T(v)$ and $T(w)$ are again equivalent in the quotient space.
Now we can find a subspace $\bar{W}\subseteq V/U$ which is invariant under this action of $T$. Is this an invariant subspace of $V$? No, it’s not even a subspace of $V$. But we could pick some $W\subseteq V$ containing a unique representative for each vector in $\bar{W}$. For instance, we could pick a basis of $\bar{W}$, a representative for each basis vector, and let $W$ be the span of these representatives. Is this an invariant subspace? Still, the answer is no. Let’s say $w$ is the identified representative of $\bar{w}\in\bar{W}$. Then all we know is that $T(w)$ is a representative of $T(\bar{w})$, not that it’s the identified representative. It could have some components spilling out into $U$.
As we proceed, picking up either a one- or two-dimensional subspace at each step, we can pick a basis of each subspace. The action of $T$ sends each basis vector into the current subspace and possibly earlier subspaces. Writing it all out, we get a matrix that looks like

$$\begin{pmatrix}A_1&&*\\&\ddots&\\0&&A_m\end{pmatrix}$$

where each $A_i$ is either a $1\times1$ matrix or a $2\times2$ matrix with no eigenvalues. The $1\times1$ blocks come from the one-dimensional invariant subspaces in the construction, while the $2\times2$ blocks come from the two-dimensional subspaces in the construction, though they may not be invariant once we put them back into $V$. Above the diagonal we have no control (yet) over the entries, but below the diagonal almost all the entries are zero. The only exceptions are in the $2\times2$ blocks, where we poke just barely down by one row.
We can note here that if there are $j$ two-dimensional blocks and $k$ one-dimensional blocks, then the total number of columns will be $k+2j=d$. Thus we must have at least $\lceil d/2\rceil$ blocks, and at most $d$ blocks. The latter extreme corresponds to an actual upper-triangular matrix.
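This almost upper-triangular shape is a standard object in numerical linear algebra, where it is called the real Schur form, and SciPy computes it directly. A sketch with a hypothetical random matrix:

```python
import numpy as np
from scipy.linalg import schur

# A hypothetical random real matrix; any complex eigenvalues come in
# conjugate pairs, so a genuinely upper-triangular real form may not exist
rng = np.random.default_rng(42)
A = rng.standard_normal((5, 5))

# Real Schur decomposition: A = Z S Z^T with Z orthogonal and S
# quasi-upper-triangular (1x1 and 2x2 blocks along the diagonal)
S, Z = schur(A, output='real')
```

The matrix $S$ represents the same transformation in a new orthonormal basis, and every entry below the first subdiagonal is zero, exactly the shape described above.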