Okay, there’s been a bit of a gap while things have gotten hectic around here, but let’s try and get back to work.
When we worked over an algebraically closed field, every polynomial had a root. Applying this to the characteristic polynomial of a linear transformation, we found that it must have a root, which would by definition be an eigenvalue of the transformation. There would then be an eigenvector, which gave a one-dimensional invariant subspace. But when we look at real vector spaces we might not have any one-dimensional invariant subspaces. However, if we don’t, we will be sure to have a two-dimensional invariant subspace.
So let’s start with a linear transformation $T:V\to V$ from a real vector space $V$ of dimension $d$ to itself. Pick any nonzero vector $v\in V$ and construct the sequence of images $Tv$, $T^2v$, and so on up to $T^dv$. Together with the original vector, these $d+1$ vectors cannot be linearly independent, since there are more than $d$ of them. Thus we have a linear relation
$$a_0v+a_1Tv+a_2T^2v+\dots+a_dT^dv=0$$
in which not all of the $a_i$ are zero.
We can regard this as a polynomial in $T$ applied to $v$, which has real coefficients. We can factor it to write
$$c(T-\lambda_1I_V)\cdots(T-\lambda_mI_V)(T^2-\tau_1T+\delta_1I_V)\cdots(T^2-\tau_nT+\delta_nI_V)v=0$$
where each quadratic factor has no real roots.
Note that either $m$ or $n$ could be zero, in which case there are no factors of that form. Also, note that we have no reason to believe that this linear combination has anything to do with the characteristic polynomial, so this factorization is not necessarily giving us eigenvalues or eigenpairs.
All that we can conclude is that at least one of these factors is not injective. If it’s a factor $(T-\lambda_iI_V)$, then all the factors after that point act on $v$ to give a nonzero vector $u$ satisfying
$$(T-\lambda_iI_V)u=0$$
This gives a one-dimensional invariant subspace. On the other hand, if one of the quadratic factors $(T^2-\tau_jT+\delta_jI_V)$ is not injective, then all the factors after that point act on $v$ to give a nonzero vector $u$ satisfying
$$T^2u=\tau_jTu-\delta_ju$$
which shows that the vectors $u$ and $Tu$ span a two-dimensional invariant subspace, since both basis vectors are sent to a linear combination of each other under the action of $T$. (They really are linearly independent: if $Tu=\mu u$ for a real $\mu$, then $\mu$ would be a real root of the quadratic factor.)
Thus we can always find an invariant subspace of dimension one or two. It’s not quite as neat as over the complex numbers, but it’s something we can work with.
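As a quick sanity check, here is the construction run on the quarter-turn rotation of the plane. The matrix, the starting vector, and the names $T$, $v$, $u$ are illustrative choices for this sketch.

```latex
\[
T=\begin{pmatrix}0&-1\\1&0\end{pmatrix},\qquad
v=\begin{pmatrix}1\\0\end{pmatrix},\qquad
Tv=\begin{pmatrix}0\\1\end{pmatrix},\qquad
T^2v=\begin{pmatrix}-1\\0\end{pmatrix}
\]
% Three vectors in a two-dimensional space must be linearly dependent:
\[
v+T^2v=0,\qquad\text{that is,}\qquad (T^2+I)v=0
\]
% X^2+1 has no real roots, so there are no linear factors, and the
% single quadratic factor already kills u=v:
\[
T^2u=-u,\qquad \operatorname{span}\{u,Tu\}=\mathbb{R}^2
\]
```

Here there is no one-dimensional invariant subspace at all, and the two-dimensional one is the whole plane.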
I’m a little under two hours late, but here’s to all who celebrate today, including the contingent of grad students I knew back at Yale.
And remember: Nowruz is good ruz.
I’ve made my opinion clear about today. It’s completely based on two accidents. One is the use of decimal notation, and one is the use of the Gregorian calendar. And it reduces mathematics to a caricature, with no real understanding even of its referent.
And now even Rachel is getting in on it, which I figured she probably would. Okay, so Rachel, I’ve got a deal for you: if non-public-policy geekdom is actually of interest to you beyond “One More Thing” fodder, I’d be glad to come on the show next year and explain why these celebrations are actually detrimental. I’ll be waiting for your email.
Well, Wednesday I was up at the University of Pennsylvania again, and yesterday I was making arrangements for a visit to San Diego in a couple weeks. And next week is an exam week, so I’ll have to inch forward today.
We’ve seen a lot about Jordan normal forms, which can pretty much capture the behavior of any single linear transformation over an algebraically closed field. But not all fields are algebraically closed, and one of them is very important to us. We want to investigate the situation over the field of real numbers a little more deeply.
The key point about algebraically closed fields is that we can find some upper-triangular matrix. And the crux of that is the fact that any linear transformation has at least one eigenvalue. And that happens because the characteristic polynomial always has a root over an algebraically closed field. So if your field isn’t algebraically closed a characteristic polynomial might not have roots, and your transformation might have no eigenvalues.
And indeed, some real polynomials have no roots. But all is not lost! We do know something about factoring real polynomials. We can break any one down into the product of linear terms like $(X-\lambda)$ and quadratic terms like $(X^2-\tau X+\delta)$. If we’re factoring the characteristic polynomial of a linear endomorphism $T$, then a linear term $(X-\lambda)$ gives us an eigenvalue $\lambda$, so the new and interesting stuff is in the quadratic terms. I’m going to use the nonstandard term “eigenpair” to describe a pair $(\tau,\delta)$ of real numbers that shows up in this way.
If we were working over the complex numbers, we could factor a quadratic term into a pair of linear terms:
$$X^2-\tau X+\delta=\left(X-\frac{\tau+\sqrt{\tau^2-4\delta}}{2}\right)\left(X-\frac{\tau-\sqrt{\tau^2-4\delta}}{2}\right)$$
which gives us two complex eigenvalues
$$\lambda_\pm=\frac{\tau\pm\sqrt{\tau^2-4\delta}}{2}$$
This gives us no problem over the real numbers if $\tau^2\geq4\delta$, since then both roots are real, so an eigenpair must have $\tau^2<4\delta$. In this case the two complex roots are a conjugate pair. Their sum is $\tau$, and their product is $\delta$.
So how can this arise in practice? Well, since it’s a quadratic term it’s the characteristic polynomial of an endomorphism on $\mathbb{R}^2$. So let’s write down a matrix and take a look:
$$\begin{pmatrix}a&b\\c&d\end{pmatrix}$$
The characteristic polynomial is the determinant of $X$ times the identity matrix minus this matrix. We calculate
$$\det\begin{pmatrix}X-a&-b\\-c&X-d\end{pmatrix}=(X-a)(X-d)-bc=X^2-(a+d)X+(ad-bc)$$
So we can define $\tau=a+d$ to be the trace of this matrix, and $\delta=ad-bc$ to be its determinant. If $\tau^2<4\delta$, we’ve got an eigenpair.
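For a concrete instance, here is a small worked computation. The matrix is an arbitrary illustrative choice, and $\tau$, $\delta$ are used here as names for its trace and determinant.

```latex
\[
A=\begin{pmatrix}1&-2\\1&3\end{pmatrix},\qquad
\tau=1+3=4,\qquad
\delta=1\cdot3-(-2)\cdot1=5
\]
% Characteristic polynomial, and the test for an eigenpair:
\[
\det(XI-A)=X^2-4X+5,\qquad \tau^2=16<20=4\delta
\]
% Over the complex numbers the roots are a conjugate pair:
\[
\lambda_\pm=\frac{4\pm\sqrt{16-20}}{2}=2\pm i,\qquad
\lambda_++\lambda_-=4=\tau,\qquad
\lambda_+\lambda_-=5=\delta
\]
```

So this matrix has no real eigenvalues, and $(4,5)$ is an eigenpair in the sense used above.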
Here’s a nice little thing we can do with Jordan normal forms: show that every invertible linear endomorphism $S$ on a vector space $V$ of finite dimension $d$ over an algebraically closed field $\mathbb{F}$ (of characteristic other than $2$, so that the fractions below make sense) has a square root $T$ so that $T^2=S$. Over at the Secret Blogging Seminar, they discussed this without using the Jordan normal form, and they dealt with the nilpotent blocks later, which has a similar feel to what we’re doing.
First, we find the Jordan normal form for $S$. This decomposes $V$ into Jordan blocks, none of which have eigenvalue zero since we’re assuming that $S$ is invertible. The transformation $S$ acts without mixing up the blocks, so if we can find a square root for each block then we can put those square roots together into a square root for the whole of $S$. So we may as well restrict our attention to a single block $J_n(\lambda)$, with $\lambda\neq0$.
We can write this block as $\lambda(I_n+N)$, where $N$ is a nilpotent matrix. In fact, it’s the matrix with $\frac{1}{\lambda}$ just above the diagonal and zeroes everywhere else. Since we’re working over an algebraically closed field, the scalar $\lambda$ must have a square root. So if $I_n+N$ has a square root, we’ll be done.
Now, it might seem like a really weird digression, but let’s look at the Taylor series for the function $\sqrt{1+x}$. Yes, that was purely a product of our work on analysis over $\mathbb{R}$, but let’s just consider it formally. It’s an infinite series
$$\sqrt{1+x}=1+\frac{1}{2}x-\frac{1}{8}x^2+\frac{1}{16}x^3-\frac{5}{128}x^4+\dots$$
which has the formal algebraic property that if we multiply it by itself (and wave our hands frantically at all the convergence issues) we’ll get the polynomial $1+x$. But now if we put $N$ in for $x$ we notice something: after a while, all the powers of $N$ are zero, since $N$ is nilpotent! That is, we don’t have a power series, but just a nice polynomial in $N$. And then if we multiply this polynomial by itself, we get a bigger polynomial. But once we take into account the fact that $N$ is nilpotent, the only terms that survive are $I_n+N$.
To be a little more explicit, we’re trying to find a square root of $J_n(\lambda)=\lambda(I_n+N)$, where $N=\frac{1}{\lambda}J_n(0)$. So we work out the Taylor series above and write down the transformation
$$\sqrt{\lambda}\left(I_n+\frac{1}{2}N-\frac{1}{8}N^2+\frac{1}{16}N^3-\dots\right)$$
where the sum stops before the power $N^n=0$. Squaring this transformation gives $\lambda(I_n+N)=J_n(\lambda)$.
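To see the recipe in action, here is a sketch for one $3\times3$ block with eigenvalue $4$. The block size and eigenvalue are illustrative choices, and $J_3(\lambda)$ is assumed to denote the Jordan block with $\lambda$ on the diagonal and ones just above it.

```latex
% N is the block's nilpotent part divided by the eigenvalue 4; N^3 = 0.
\[
J_3(4)=4(I_3+N),\qquad
N=\begin{pmatrix}0&\tfrac14&0\\0&0&\tfrac14\\0&0&0\end{pmatrix},\qquad
N^2=\begin{pmatrix}0&0&\tfrac1{16}\\0&0&0\\0&0&0\end{pmatrix}
\]
% Truncate the series for sqrt(1+x) after the x^2 term:
\[
R=\sqrt4\left(I_3+\tfrac12N-\tfrac18N^2\right)
=\begin{pmatrix}2&\tfrac14&-\tfrac1{64}\\0&2&\tfrac14\\0&0&2\end{pmatrix}
\]
% Squaring: the N^2 coefficient is 2(-1/8) + (1/2)^2 = 0, and N^3 = 0.
\[
R^2=4\left(I_3+N+\left(\tfrac14-\tfrac14\right)N^2\right)=4(I_3+N)=J_3(4)
\]
```

The corner entry of $R^2$ works out to $2\cdot\left(-\tfrac1{64}\right)\cdot2+\tfrac14\cdot\tfrac14=0$, as it should.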
We’ve previously considered the representation of the general linear group $\mathrm{GL}_n(\mathbb{F})$ on the vector space of $n\times n$ matrices over $\mathbb{F}$ by conjugation. What we want to consider now are the orbits of this group action. That is, given two matrices $S$ and $T$, we will consider them equivalent if there is an invertible matrix $G$ so that $T=GSG^{-1}$. We want to describe these equivalence classes explicitly, and have a test to determine whether two matrices are conjugate or not.
This isn’t just a toy problem. Remember that any matrix is a linear transformation $S:\mathbb{F}^n\to\mathbb{F}^n$, and conjugation by an automorphism is how we render a change of basis into the language of matrices. That is, what we’re really asking is, “given two matrices, do they represent the same linear transformation with respect to two different bases?” This will be true if and only if they are conjugate by a change-of-basis matrix $G$.
To get an idea of how to solve this problem, let’s consider a couple other orbit-finding problems for a moment.
First, let’s let the integers $\mathbb{Z}$ act on the real numbers $\mathbb{R}$ by addition. That is, given a real number $x$ and an integer $n$ we set $n\cdot x=x+n$. This just slides the real line along itself by integer steps. We consider two real numbers to be equivalent if they differ by an integer. How do we classify the orbits? A common way is to pick one point in each orbit to represent the whole. In general, making such a choice can be difficult (or impossible!), but it turns out we have a way of picking out such a representative without having to do much work: pick the one point in the interval $\left[0,1\right)$. Notice that every real number $x$ satisfies $n\leq x<n+1$ for exactly one integer $n$ (using the standard inclusion $\mathbb{Z}\subseteq\mathbb{R}$), and if we subtract the lower of the two integers we get a number at least $0$ and less than $1$. Further, no two of these numbers are equivalent to each other. This interval contains exactly one representative for each orbit.
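A couple of sample values make the choice of representative concrete. The numbers are arbitrary, and the floor notation $\lfloor x\rfloor$ (the largest integer not exceeding $x$) is introduced only for this sketch.

```latex
% The representative of the orbit of x is x - floor(x), which lies in [0,1).
\[
3.7-\lfloor3.7\rfloor=3.7-3=0.7,\qquad
-1.3-\lfloor-1.3\rfloor=-1.3-(-2)=0.7
\]
% Same representative, so 3.7 and -1.3 lie in one orbit: they differ by 5.
\[
-1.2-\lfloor-1.2\rfloor=-1.2-(-2)=0.8\neq0.7
\]
% A different representative, so -1.2 lies in a different orbit.
```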
Next, let’s let the real numbers $\mathbb{R}$ act on the plane with an identified point. We haven’t really talked much about the plane, but an intuitive notion will suffice here. The action of the real number $\theta$ will be to rotate the plane by an angle $\theta$ around the identified point (again, a detailed understanding of “angle” doesn’t really matter here). The orbits here are circles around the common center, together with the center itself. To classify them, we again want to pick out a “canonical” representative point in each orbit. To do this, we can draw a ray from the center out towards infinity. Then each circle meets this ray exactly once, and no two points on the ray are equivalent.
So let’s try to attack the problem at hand in a similar manner. We have the general linear group throwing matrices around the orbits. We want to pick exactly one matrix in each orbit as a representative of the whole.
But we know how to do this! Use a matrix in Jordan normal form! We know that within a given conjugacy class, the Jordan normal form is unique — up to rearrangement of the Jordan blocks. So we don’t quite have a unique representative, but it’s a lot easier to check if two Jordan matrices are equivalent than to search by hand for a conjugating matrix.
We have an answer to the problem of determining whether two matrices $S$ and $T$ are conjugate. Put each one into Jordan normal form, and compare the resulting matrices. If they differ only by a rearrangement of the blocks, then $S$ and $T$ are equivalent, and if not they’re not.
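Two small comparisons illustrate the test. The matrices below are illustrative choices, already in Jordan normal form, and the names $A$, $B$, $C$, $D$, $G$ are introduced only for this sketch.

```latex
% Same blocks in a different order: conjugate, via the swap matrix G.
\[
A=\begin{pmatrix}2&0\\0&3\end{pmatrix},\quad
B=\begin{pmatrix}3&0\\0&2\end{pmatrix},\quad
G=\begin{pmatrix}0&1\\1&0\end{pmatrix}=G^{-1},\quad
GAG^{-1}=B
\]
% Same characteristic polynomial (X-1)^2 but different blocks:
% one block of size 2 versus two blocks of size 1.
\[
C=\begin{pmatrix}1&1\\0&1\end{pmatrix},\qquad
D=\begin{pmatrix}1&0\\0&1\end{pmatrix},\qquad
GDG^{-1}=D\neq C\ \text{for every invertible }G
\]
```

The second pair shows that comparing eigenvalues alone is not enough; the block structure matters.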
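So we’ve got a Jordan normal form for every linear endomorphism $T$ on a vector space $V$ of finite dimension $d$ over an algebraically closed base field $\mathbb{F}$. That is, we can always pick a basis with respect to which the matrix of $T$ is block-diagonal, and each block is a “Jordan block” $J_n(\lambda)$. This is an $n\times n$ matrix
$$J_n(\lambda)=\begin{pmatrix}\lambda&1&&&0\\&\lambda&1&&\\&&\ddots&\ddots&\\&&&\lambda&1\\0&&&&\lambda\end{pmatrix}$$
with the eigenvalue $\lambda$ down the diagonal and $1$ just above the diagonal.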
Abstractly, this is a decomposition of $V$ as a direct sum of various subspaces, each of which is invariant under the action of $T$. And, in fact, it’s the only “complete” decomposition of the sort. Decomposing $V$ into generalized eigenspaces can really be done in only one way, and breaking each eigenspace into Jordan blocks is also essentially unique. The biggest Jordan block comes from picking one vector $v$ that lives through as many applications of $T-\lambda I_V$ as possible. As we apply $T-\lambda I_V$ over and over to $v$, we expand until (after no more than $d$ iterations) we fill out an invariant subspace. Not only that, but we know that we can break up our generalized eigenspace as the direct sum of this block and another subspace, which is also invariant. And that lets us continue the process, splitting off blocks until we’ve used up the whole generalized eigenspace.
Now, there is one sense in which this process is not unique. Direct sums are commutative (up to isomorphism), so we can rearrange the Jordan blocks for a given endomorphism $T$, and the result is still a Jordan normal form. But that’s the only way that two Jordan normal forms for the same endomorphism can differ.
Okay, let’s put everything together now. Start with a linear endomorphism $T$ on a vector space $V$ of finite dimension $d$ over an algebraically closed field $\mathbb{F}$. If you want to be specific, use the complex numbers $\mathbb{C}$.
Now we can calculate the characteristic polynomial of $T$, whose roots are the eigenvalues of $T$. For each eigenvalue $\lambda$, we can define the generalized eigenspace as the kernel $\mathrm{Ker}\left((T-\lambda I_V)^d\right)$, since if some power of $T-\lambda I_V$ kills a vector then the $d$th power will.
These generalized eigenspaces do not overlap (any two of them meet only in the zero vector), and each one is invariant under $T$. The dimension of the generalized eigenspace associated to $\lambda$ is the multiplicity of $\lambda$, which is the number of times $\lambda$ shows up on the diagonal of an upper-triangular matrix for $T$. Since the total number of diagonal entries is $d$, we see that the dimensions of all the generalized eigenspaces add up to the dimension of the entire space $V$. Thus, we have a decomposition
$$V=\bigoplus_\lambda\mathrm{Ker}\left((T-\lambda I_V)^d\right)$$
of $V$ as the direct sum of these generalized eigenspaces, where $\lambda$ runs over the roots of the characteristic polynomial.
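Next, on the generalized eigenspace of each eigenvalue $\lambda$, the restriction of $T-\lambda I_V$ is nilpotent, so we can pick a Jordan basis for it. With respect to this basis, the restriction has a block-diagonal matrix
$$\begin{pmatrix}A_1&&0\\&\ddots&\\0&&A_k\end{pmatrix}$$
where each block has the form
$$A_i=\begin{pmatrix}0&1&&0\\&\ddots&\ddots&\\&&0&1\\0&&&0\end{pmatrix}$$
We can now add back in the eigenvalue $\lambda$ times the identity transformation to the restriction of $T-\lambda I_V$, which recovers the restriction of $T$ itself. Now we still have the block-diagonal form, but the blocks themselves now have the form
$$\begin{pmatrix}\lambda&1&&0\\&\ddots&\ddots&\\&&\lambda&1\\0&&&\lambda\end{pmatrix}$$
where, of course, a block could be a single $1\times1$ matrix whose only entry is $\lambda$.
Putting these together for all the different eigenvalues, we have a Jordan basis for $T$. This puts the matrix into “Jordan normal form”. That is, the matrix of $T$ with respect to a Jordan basis is block-diagonal, with each block consisting of one eigenvalue $\lambda$ down its diagonal, and $1$s just above the diagonal.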
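Here is the whole procedure on a small example, as a sketch. The matrix and the names $w_1$, $w_2$ for the basis vectors are illustrative choices.

```latex
% One eigenvalue, 2, with multiplicity 2:
\[
T=\begin{pmatrix}3&1\\-1&1\end{pmatrix},\qquad
\det(XI-T)=X^2-4X+4=(X-2)^2
\]
% T-2I is nilpotent, so the generalized eigenspace is all of the plane:
\[
T-2I=\begin{pmatrix}1&1\\-1&-1\end{pmatrix},\qquad (T-2I)^2=0
\]
% Pick a vector that T-2I does not kill, and follow it until it dies:
\[
w_2=\begin{pmatrix}1\\0\end{pmatrix},\qquad
w_1=(T-2I)w_2=\begin{pmatrix}1\\-1\end{pmatrix},\qquad
(T-2I)w_1=0
\]
% So Tw_1 = 2w_1 and Tw_2 = w_1 + 2w_2; in the basis (w_1, w_2):
\[
\begin{pmatrix}2&1\\0&2\end{pmatrix}
\]
```

The result is a single $2\times2$ Jordan block with eigenvalue $2$.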
Unfortunately, if the base field is not algebraically closed, we may not have any upper-triangular matrix for $T$, and so we can only put the portion of $V$ captured by generalized eigenspaces into Jordan normal form. There may still be another direct summand which contains no generalized eigenvectors at all. Over an arbitrary field, this sort of thing gets complicated quickly, but it will be useful for us to consider what happens over the real numbers $\mathbb{R}$. We’ll come back to this.
Sorry for the delays. It’s been a bit busy.
Today we’ll finish off our treatment of nilpotent transformations by taking last Thursday’s lemma and using it to find a really nice basis for a nilpotent transformation, called a “Jordan basis”. This is one that makes the matrix of the transformation break up into a block-diagonal form
$$\begin{pmatrix}A_1&&0\\&\ddots&\\0&&A_k\end{pmatrix}$$
where each block has the form
$$A_i=\begin{pmatrix}0&1&&0\\&\ddots&\ddots&\\&&0&1\\0&&&0\end{pmatrix}$$
with zeroes down the diagonal (as we should have for any nilpotent transformation in upper-triangular form) and ones just above it. Note that a block could just be a single zero in a $1\times1$ matrix, with no space above the diagonal to put any ones anyway.
So, remember that we’re talking about a linear transformation $N:V\to V$ with the property that $N^m=0$ for some $m$. We’ll proceed by induction on the dimension $d=\dim(V)$. If $d=1$ then $N$ has to be the zero transformation, and any basis will be a Jordan basis.
Now we’ll assume that the statement holds for all vector spaces of dimension less than $d$. If we pick $m$ to be the smallest power so that $N^m=0$, then we know $N^{m-1}\neq0$. Thus there is some vector $v$ so that $N^{m-1}v\neq0$. This vector spans a one-dimensional subspace which trivially intersects the kernel of $N^{m-1}$. Our lemma then tells us that we can decompose $V$ as a direct sum:
$$V=\langle v,Nv,\dots,N^{m-1}v\rangle\oplus W$$
for some invariant subspace $W$. Now, $W$ can’t be all of $V$, since we know $v$ isn’t in it. It could be nontrivial, though. Then we’ve broken $V$ into the direct sum of two invariant subspaces, each of dimension less than $d$, and we can restrict $N$ to each subspace and use the inductive hypothesis to get a Jordan basis for each part. Putting those bases together gives us a Jordan basis for all of $V$.
What’s left is the case where $W$ is trivial, and so $V=\langle v,Nv,\dots,N^{m-1}v\rangle$. This means that $V$ is spanned by the $m$ vectors $v,Nv,\dots,N^{m-1}v$. Thus $d\leq m$, since a basis is a minimal spanning set. But we also know that $N^d=0$, so $m\leq d$. Thus $m=d$, and the spanning set is actually a basis.
Reversing the order in the list above, we see that $N(N^{d-1}v)=N^dv=0$, so the first column in the matrix will be zero. Next, $N(N^{d-2}v)=N^{d-1}v$, so the second column will have a one in the first row, and zeros elsewhere. As we go forward, the $k$th column has a one just above the $k$th row, and zeroes elsewhere, showing that this is a Jordan basis, and finishing the theorem.
Essentially what we’ve done is this: we pick one vector that lives as long as possible, and follow it until it dies, adding new basis elements as we go. Then if we haven’t used up the whole space, we pick another vector that lives as long as possible (which might be not as long as the first one did) and repeat the process until eventually we fill up all of $V$. The lemma is important in that it tells us that these later streams of basis vectors will never step on the toes of earlier streams.
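Here is the process on a small example with two streams, as a sketch. The matrix is an illustrative choice, $e_1,e_2,e_3$ denote the standard basis vectors of a three-dimensional space, and $W$ names the complementary invariant subspace.

```latex
% N kills e_1 and sends both e_2 and e_3 to e_1, so N^2 = 0 and m = 2.
\[
N=\begin{pmatrix}0&1&1\\0&0&0\\0&0&0\end{pmatrix},\qquad N^2=0
\]
% First stream: v = e_2 lives as long as possible.
\[
v=e_2,\qquad Nv=e_1,\qquad N^2v=0
\]
% A complementary invariant subspace, giving the second stream:
\[
W=\langle e_3-e_2\rangle,\qquad N(e_3-e_2)=e_1-e_1=0
\]
% In the basis (Nv, v, e_3-e_2) = (e_1, e_2, e_3-e_2) the matrix of N is
\[
\begin{pmatrix}0&1&0\\0&0&0\\0&0&0\end{pmatrix}
\]
```

That is one $2\times2$ block followed by one $1\times1$ zero block. The second stream dies immediately, sooner than the first one did.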
We should point out here that this result about nilpotent transformations works no matter what base field we’re working with. A single nilpotent transformation on its own is actually a remarkably simple thing, which can be described very succinctly.