The Unapologetic Mathematician

Mathematics for the interested outsider

Real Invariant Subspaces

Okay, there’s been a bit of a gap while things have gotten hectic around here, but let’s try and get back to work.

When we worked over an algebraically closed field, every polynomial had a root. Applying this to the characteristic polynomial of a linear transformation, we found that it must have a root, which would by definition be an eigenvalue of the transformation. There would then be an eigenvector, which gave a one-dimensional invariant subspace. But when we look at real vector spaces we might not have any one-dimensional invariant subspaces. However, if we don’t, we are sure to have a two-dimensional invariant subspace.

So let’s start with a linear transformation T:V\rightarrow V from a real vector space V of dimension d to itself. Pick any vector v and construct the sequence of images Tv, T^2v, and so on up to T^dv. Together with the original vector, these {d+1} vectors cannot be linearly independent, since there are more than d of them. Thus we have a nontrivial linear relation

\displaystyle a_0v+a_1Tv+a_2T^2v+...+a_dT^dv=0

We can regard this as a polynomial in T, with real coefficients, applied to v. Factoring the polynomial into linear and quadratic terms, we can write

\displaystyle c(T-\lambda_1I_V)...(T-\lambda_mI_V)(T^2-\tau_1T+\delta_1I_V)...(T^2-\tau_MT+\delta_MI_V)v=0

Note that either m or M could be zero, in which case there are no factors of that form. Also, note that we have no reason to believe that this linear combination has anything to do with the characteristic polynomial, so this factorization is not necessarily giving us eigenvalues or eigenpairs.

All that we can conclude is that at least one of these factors is not injective. If it’s a factor (T-\lambda_jI_V), then all the factors after that point act on v to give a vector u satisfying

\displaystyle Tu={\lambda_j}u

This gives a one-dimensional invariant subspace. On the other hand, if one of the quadratic factors T^2-\tau_jT+\delta_jI_V is not injective, then all the factors after that point act on v to give a vector u satisfying

\displaystyle T^2u=\tau_jTu-\delta_ju

which shows that the vectors u and Tu span a two-dimensional invariant subspace, since T sends each of these spanning vectors to a linear combination of the two.

Thus we can always find an invariant subspace of dimension one or two. It’s not quite as neat as over the complex numbers, but it’s something we can work with.
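To see the quadratic case in action, here is a minimal numerical sketch using numpy (the example matrix is my own, not from the post): a transformation of \mathbb{R}^3 that rotates the first two coordinates has no eigenvector in that plane, but the quadratic relation above hands us a two-dimensional invariant subspace.

```python
import numpy as np

# Hypothetical example (not from the post): rotate the first two coordinates
# of R^3 by 45 degrees and scale the third.  No vector in the rotation plane
# spans a one-dimensional invariant subspace, but the plane itself is invariant.
theta = np.pi / 4
T = np.array([
    [np.cos(theta), -np.sin(theta), 0.0],
    [np.sin(theta),  np.cos(theta), 0.0],
    [0.0,            0.0,           2.0],
])

v = np.array([1.0, 0.0, 0.0])         # any vector in the rotation plane
tau, delta = 2 * np.cos(theta), 1.0   # the quadratic factor is T^2 - tau*T + delta*I

# The quadratic factor kills v, i.e. T^2 v = tau*Tv - delta*v ...
print(np.allclose(T @ T @ v, tau * (T @ v) - delta * v))

# ... so u = v and Tu = Tv span a two-dimensional invariant subspace:
# T^2 v is again a linear combination of v and Tv.
U = np.column_stack([v, T @ v])
coeffs, *_ = np.linalg.lstsq(U, T @ T @ v, rcond=None)
print(np.allclose(U @ coeffs, T @ T @ v))
```

The least-squares step at the end just verifies that T^2v lies back in the span of v and Tv.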

March 31, 2009 Posted by | Algebra, Linear Algebra | 6 Comments

Happy New Year

I’m a little under two hours late, but here’s to all who celebrate today, including the contingent of grad students I knew back at Yale.

And remember: Nowruz is good ruz.

March 20, 2009 Posted by | Uncategorized | Leave a comment

It’s that time again

I’ve made my opinion clear about today. It’s completely based on two accidents. One is the use of decimal notation, and the other is the use of the Gregorian calendar. And it reduces mathematics to a caricature, with no real understanding even of its referent.

And now even Rachel is getting in on it, which I figured she probably would. Okay, so Rachel, I’ve got a deal for you: if non-public-policy geekdom is actually of interest to you beyond “One More Thing” fodder, I’d be glad to come on the show next year and explain why these celebrations are actually detrimental. I’ll be waiting for your email.

March 14, 2009 Posted by | rants | 9 Comments


Well, Wednesday I was up at the University of Pennsylvania again, and yesterday I was making arrangements for a visit to San Diego in a couple weeks. And next week is an exam week, so I’ll have to inch forward today.

We’ve seen a lot about Jordan normal forms, which can pretty much capture the behavior of any single linear transformation over an algebraically closed field. But not all fields are algebraically closed, and one of them is very important to us. We want to investigate the situation over the field \mathbb{R} of real numbers a little more deeply.

The key point about algebraically closed fields is that every linear transformation has an upper-triangular matrix with respect to some basis. And the crux of that is the fact that any linear transformation has at least one eigenvalue. And that happens because the characteristic polynomial always has a root over an algebraically closed field. So if your field isn’t algebraically closed, a characteristic polynomial might not have roots, and your transformation might have no eigenvalues.

And indeed, some real polynomials have no roots. But all is not lost! We do know something about factoring real polynomials. We can break any one down into the product of linear terms like (X-\lambda) and quadratic terms like (X^2-\tau X+\delta). If we’re factoring the characteristic polynomial of a linear endomorphism T, then a linear term (X-\lambda) gives us an eigenvalue \lambda, so the new and interesting stuff is in the quadratic terms. I’m going to use the nonstandard term “eigenpair” to describe a pair of real numbers (\tau,\delta) that shows up in this way.

If we were working over the complex numbers, we could factor a quadratic term into a pair of linear terms:

\displaystyle X^2-\tau X+\delta=\left(X-\frac{\tau+\sqrt{\tau^2-4\delta}}{2}\right)\left(X-\frac{\tau-\sqrt{\tau^2-4\delta}}{2}\right)

which gives us two complex eigenvalues

\displaystyle\lambda=\frac{\tau\pm\sqrt{\tau^2-4\delta}}{2}
This gives us no problem over the real numbers if \tau^2\geq4\delta, so an eigenpair must have \tau^2<4\delta. In this case the two complex roots are a conjugate pair. Their sum is \tau, and their product is \delta.

So how can this arise in practice? Well, since it’s a quadratic term it’s the characteristic polynomial of an endomorphism on \mathbb{R}^2. So let’s write down a 2\times2 matrix and take a look:

\displaystyle\begin{pmatrix}a&b\\c&d\end{pmatrix}
The characteristic polynomial is the determinant of {X} times the identity matrix minus this matrix. We calculate

\displaystyle\det\begin{pmatrix}X-a&-b\\-c&X-d\end{pmatrix}=(X-a)(X-d)-bc=X^2-(a+d)X+(ad-bc)
So we can define \tau to be the trace of this matrix, and \delta to be its determinant. If \tau^2<4\delta, we’ve got an eigenpair.
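As a quick sanity check (the example matrix is mine, not from the post), we can compute \tau and \delta with numpy and confirm the conjugate-pair behavior:

```python
import numpy as np

# Hypothetical example matrix; tau is the trace and delta the determinant.
A = np.array([[1.0, -2.0],
              [3.0,  1.0]])
tau = np.trace(A)
delta = np.linalg.det(A)
print(tau**2 < 4 * delta)   # an eigenpair: tau^2 = 4 < 28 = 4*delta

# The eigenvalues form a complex-conjugate pair with sum tau and product delta.
roots = np.linalg.eigvals(A)
print(np.isclose(roots.sum(), tau), np.isclose(roots.prod(), delta))
```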

March 13, 2009 Posted by | Algebra, Linear Algebra | 1 Comment

Square Roots

Here’s a nice little thing we can do with Jordan normal forms: show that every invertible linear endomorphism S:V\rightarrow V on a vector space V of finite dimension d over an algebraically closed field \mathbb{F} has a square root T:V\rightarrow V so that T^2=S. Over at the Secret Blogging Seminar, they discussed this without using the Jordan normal form, and they dealt with the nilpotent blocks later, which has a similar feel to what we’re doing.

First, we find the Jordan normal form for S. This decomposes V into Jordan blocks, none of which have eigenvalue zero since we’re assuming that S is invertible. The transformation S acts without mixing up the blocks, so if we can find a square root for each block then we can put those square roots together into a square root for the whole of S. So we may as well restrict our attention to a single block J_n(\lambda), with \lambda\neq0.

We can write this block as \lambda(1_V+N), where N is a nilpotent matrix. In fact, it’s the matrix with \frac{1}{\lambda} just above the diagonal and zeroes everywhere else. Since we’re working over an algebraically closed field, the scalar \lambda must have a square root. So if 1_V+N has a square root, we’ll be done.

Now, it might seem like a really weird digression, but let’s look at the Taylor series for the function \sqrt{1+x}. Yes, that was purely a product of our work on analysis over \mathbb{R}, but let’s just consider it formally. It’s an infinite series

\displaystyle\sqrt{1+x}=1+\frac{1}{2}x-\frac{1}{8}x^2+\frac{1}{16}x^3-...=\sum\limits_{k=0}^{\infty}\binom{1/2}{k}x^k
which has the formal algebraic property that if we multiply it by itself (and wave our hands frantically at all the convergence issues) we’ll get the polynomial 1+x. But now if we put N in for x we notice something: after a while, all the powers of N are zero, since N is nilpotent! That is, we don’t have a power series, but just a nice polynomial in N. And then if we multiply this polynomial by itself, we get a bigger polynomial. But once we take into account the fact that N is nilpotent, the only terms that survive are 1_V+N.

To be a little more explicit, we’re trying to find a square root of 1_V+N, where N^n=0. So we work out the Taylor series above and write down the transformation

\displaystyle T=1_V+\frac{1}{2}N-\frac{1}{8}N^2+...=\sum\limits_{k=0}^{n-1}\binom{1/2}{k}N^k
Squaring this transformation gives 1_V+N+N^nP(N)=1_V+N.
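This recipe is easy to test numerically. Here is a minimal sketch with numpy; the helper names are my own, not from the post:

```python
import numpy as np

# jordan_block and jordan_block_sqrt are hypothetical helper names.
def jordan_block(lam, n):
    return lam * np.eye(n) + np.diag(np.ones(n - 1), 1)

def jordan_block_sqrt(lam, n):
    # J_n(lam) = lam*(I + N), where N has 1/lam just above the diagonal.
    N = np.diag(np.full(n - 1, 1.0 / lam), 1)
    S = np.zeros((n, n))
    coeff, Npow = 1.0, np.eye(n)
    for k in range(n):            # the series truncates because N^n = 0
        S += coeff * Npow         # coeff is the binomial coefficient (1/2 choose k)
        Npow = Npow @ N
        coeff *= (0.5 - k) / (k + 1)
    return np.sqrt(lam) * S       # sqrt(lam) times sqrt(I + N)

J = jordan_block(4.0, 5)
T = jordan_block_sqrt(4.0, 5)
print(np.allclose(T @ T, J))
```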

March 10, 2009 Posted by | Algebra, Linear Algebra | 3 Comments

Orbits of the General Linear Action on Matrices

We’ve previously considered the representation of the general linear group \mathrm{GL}_n(\mathbb{F}) on the vector space of n\times n matrices over \mathbb{F} by conjugation. What we want to consider now are the orbits of this group action. That is, given two matrices S and T, we will consider them equivalent if there is an invertible matrix G so that S=G^{-1}TG. We want to describe these equivalence classes explicitly, and have a test to determine whether two matrices are conjugate or not.

This isn’t just a toy problem. Remember that any matrix is a linear transformation S:\mathbb{F}^n\rightarrow\mathbb{F}^n, and conjugation by an automorphism is how we render a change of basis into the language of matrices. That is, what we’re really asking is, “given two matrices, do they represent the same linear transformation with respect to two different bases?” This will be true if and only if they are conjugate by a change-of-basis matrix G\in\mathrm{GL}_n(\mathbb{F}).

To get an idea of how to solve this problem, let’s consider a couple other orbit-finding problems for a moment.

First, let’s let the integers act on the real numbers by addition. That is, given a real number x and an integer n we set n(x)=n+x. This just slides the real line along itself by integer steps. We consider two real numbers to be equivalent if they differ by an integer. How do we classify the orbits? A common way is to pick one point in each orbit to represent the whole. In general, making such a choice can be difficult (or impossible!), but here we can pick out a representative without much work: take the one point of the orbit in the interval \left[0,1\right). Notice that every real number lies between two consecutive integers, and if we subtract the lower of the two we get a number between {0} and {1}. Further, no two of these numbers are equivalent to each other. This interval contains exactly one representative for each orbit.
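This first recipe is short enough to write down as code; a two-line sketch (the function name is mine):

```python
import math

# orbit_rep is a hypothetical name for the recipe above: subtract the integer
# part to land on the unique representative in the interval [0, 1).
def orbit_rep(x):
    return x - math.floor(x)

print(orbit_rep(2.5), orbit_rep(-0.25), orbit_rep(3.0))   # 0.5 0.75 0.0
```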

Next, let’s let the real numbers act on the plane with an identified point. We haven’t really talked much about the plane, but an intuitive notion will suffice here. The action of the real number x will be to rotate the plane by an angle x around the identified point (again, a detailed understanding of “angle” doesn’t really matter here). The orbits here are circles around the common center. To classify them, we again want to pick out a “canonical” representative point in each orbit. To do this, we can draw a ray from the center out towards infinity. Then each circle meets this ray exactly once, and no two points on the ray are equivalent.

So let’s try to attack the problem at hand in a similar manner. We have the general linear group throwing matrices around the orbits. We want to pick exactly one matrix in each orbit as a representative of the whole.

But we know how to do this! Use a matrix in Jordan normal form! We know that within a given conjugacy class, the Jordan normal form is unique — up to rearrangement of the Jordan blocks. So we don’t quite have a unique representative, but it’s a lot easier to check if two Jordan matrices are equivalent than to search by hand for a conjugating matrix.

We have an answer to the problem of determining whether two matrices S and T are conjugate. Put each one into Jordan normal form, and compare the resulting matrices. If they differ only by a rearrangement of the blocks, then S and T are equivalent, and if not they’re not.
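In floating point, actually computing a Jordan normal form is numerically delicate, so here is a hedged sketch of the comparison using an equivalent criterion: for each (known) eigenvalue \lambda, conjugate matrices have the same rank of (A-\lambda I)^k for every k, and these ranks determine the Jordan block structure. The example matrix and helper name are my own:

```python
import numpy as np

# rank_profile is a hypothetical helper: for each candidate eigenvalue lam it
# records rank((A - lam*I)^k) for k = 1..n.  Conjugate matrices have identical
# profiles, and the profile determines the sizes of the Jordan blocks.
def rank_profile(A, eigvals, tol=1e-8):
    n = A.shape[0]
    profile = {}
    for lam in eigvals:
        M = A - lam * np.eye(n)
        P = np.eye(n)
        ranks = []
        for _ in range(n):
            P = P @ M
            ranks.append(int(np.linalg.matrix_rank(P, tol=tol)))
        profile[lam] = ranks
    return profile

J = np.array([[2.0, 1, 0],
              [0,   2, 0],
              [0,   0, 2]])   # Jordan blocks J_2(2) and J_1(2)
G = np.random.default_rng(0).normal(size=(3, 3))
B = np.linalg.inv(G) @ J @ G  # a conjugate of J, so in the same orbit

print(rank_profile(J, [2.0]) == rank_profile(B, [2.0]))
```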

March 6, 2009 Posted by | Algebra, Linear Algebra | 3 Comments

Uniqueness of Jordan Normal Forms

So we’ve got a Jordan normal form for every linear endomorphism T:V\rightarrow V on a vector space V of finite dimension d over an algebraically closed base field \mathbb{F}. That is, we can always pick a basis with respect to which the matrix of T is block-diagonal, and each block is a “Jordan block” J_n(\lambda). This is an n\times n matrix

\displaystyle J_n(\lambda)=\begin{pmatrix}\lambda&1&&&0\\&\lambda&1&&\\&&\ddots&\ddots&\\&&&\lambda&1\\0&&&&\lambda\end{pmatrix}
with the eigenvalue \lambda down the diagonal and {1} just above the diagonal.

Abstractly, this is a decomposition of V as a direct sum of various subspaces, each of which is invariant under the action of T. And, in fact, it’s the only “complete” decomposition of the sort. Decomposing into generalized eigenspaces can really be done in only one way, and breaking each eigenspace into Jordan blocks is also essentially unique. The biggest Jordan block comes from picking one vector v that lives through as many applications of T-\lambda1_V as possible. As we apply T-\lambda1_V over and over to v, we expand until (after no more than d iterations) we fill out an invariant subspace. Not only that, but we know that we can break up our generalized eigenspace as the direct sum of this block and another subspace, which is also invariant. And that lets us continue the process, splitting off blocks until we’ve used up the whole generalized eigenspace.

Now, there is one sense in which this process is not unique. Direct sums are commutative (up to isomorphism), so we can rearrange the Jordan blocks for a given endomorphism T, and the result is still a Jordan normal form. But that’s the only way that two Jordan normal forms for the same endomorphism can differ.

March 5, 2009 Posted by | Algebra, Linear Algebra | 1 Comment

Jordan Normal Form

Okay, let’s put everything together now. Start with a linear endomorphism T:V\rightarrow V on a vector space V of finite dimension d over an algebraically closed field \mathbb{F}. If you want to be specific, use the complex numbers \mathbb{C}.

Now we can calculate the characteristic polynomial of T, whose roots are the eigenvalues of T. For each eigenvalue \lambda, we can define the generalized eigenspace as the kernel U_\lambda=\mathrm{Ker}\left((T-\lambda1_V)^d\right), since if some power of T-\lambda1_V kills a vector then the dth power will.

These generalized eigenspaces do not overlap, and each one is invariant under T. The dimension of the generalized eigenspace U_\lambda associated to \lambda is the multiplicity of \lambda, which is the number of times \lambda shows up on the diagonal of an upper-triangular matrix for T. Since the total number of diagonal entries is d, we see that the dimensions of all the generalized eigenspaces add up to the dimension of the entire space V. Thus, we have a decomposition

\displaystyle V=\bigoplus\limits_{\lambda}U_\lambda

of V as the direct sum of these generalized eigenspaces, where \lambda runs over the roots of the characteristic polynomial.
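A quick numerical check of the dimension count (the example matrix is mine): the dimension of each U_\lambda is d minus the rank of (T-\lambda1_V)^d, and these dimensions add up to d.

```python
import numpy as np

# Hypothetical example matrix with eigenvalue 5 (multiplicity 2) and 3.
A = np.array([[5.0, 1, 0],
              [0,   5, 0],
              [0,   0, 3]])
d = A.shape[0]

# dim U_lam = d - rank((A - lam*I)^d), and the dimensions sum to d.
dims = {}
for lam in (5.0, 3.0):
    P = np.linalg.matrix_power(A - lam * np.eye(d), d)
    dims[lam] = d - int(np.linalg.matrix_rank(P))
print(dims, sum(dims.values()) == d)
```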

If we restrict T to the generalized eigenspace U_\lambda with eigenvalue \lambda, the transformation T\vert_{U_\lambda}-\lambda1_{U_\lambda} is nilpotent. Thus we can find a Jordan basis for U_\lambda, which puts T\vert_{U_\lambda}-\lambda1_{U_\lambda} into the block-diagonal form

\displaystyle\begin{pmatrix}B_1&&0\\&\ddots&\\0&&B_k\end{pmatrix}
where each block has the form

\displaystyle\begin{pmatrix}0&1&&0\\&\ddots&\ddots&\\&&0&1\\0&&&0\end{pmatrix}
We can now add back in the eigenvalue \lambda times the identity transformation to the restriction of T. Now we still have the block-diagonal form, but the blocks themselves now have the form

\displaystyle\begin{pmatrix}\lambda&1&&0\\&\ddots&\ddots&\\&&\lambda&1\\0&&&\lambda\end{pmatrix}
where, of course, a block could be a single 1\times1 matrix whose only entry is \lambda.

Putting these together for all the different eigenvalues, we have a Jordan basis for V. This puts the matrix of T into “Jordan normal form”. That is, the matrix of T with respect to a Jordan basis is block-diagonal, with each block consisting of one eigenvalue \lambda down its diagonal, and {1}s just above the diagonal.

Unfortunately, if the base field \mathbb{F} is not algebraically closed, we may not have any upper-triangular matrix for T, and so we can only put the portion of T captured by generalized eigenspaces into Jordan normal form. There may still be another direct summand which contains no generalized eigenvectors at all. Over an arbitrary field, this sort of thing gets complicated quickly, but it will be useful for us to consider what happens over the real numbers \mathbb{R}. We’ll come back to this.

March 4, 2009 Posted by | Algebra, Linear Algebra | 8 Comments

Nilpotent Transformations II

Sorry for the delays. It’s been a bit busy.

Today we’ll finish off our treatment of nilpotent transformations by taking last Thursday’s lemma and using it to find a really nice basis for a nilpotent transformation, called a “Jordan basis”. This is one that makes the matrix of the transformation break up into a block-diagonal form

\displaystyle\begin{pmatrix}B_1&&0\\&\ddots&\\0&&B_k\end{pmatrix}
where each block has the form

\displaystyle\begin{pmatrix}0&1&&0\\&\ddots&\ddots&\\&&0&1\\0&&&0\end{pmatrix}
with zeroes down the diagonal — as we should have for any nilpotent transformation in upper-triangular form — and ones just above it. Note that a block could just be a single 1\times1 zero matrix, with no room above the diagonal to put any ones anyway.

So, remember that we’re talking about a linear transformation N:V\rightarrow V with the property that N^k=0 for some k. We’ll proceed by induction on the dimension d=\dim(V). If d=1 then N has to be the zero transformation, and any basis will be a Jordan basis.

Now we’ll assume that the statement holds for all vector spaces of dimension less than d. If we pick k to be the smallest power so that N^k=0, then we know N^{k-1}\neq0. Thus there is some vector u so that N^{k-1}u\neq0. This vector spans a one-dimensional subspace U which trivially intersects the kernel of N^{k-1}. Our lemma then tells us that we can decompose V as a direct sum:

\displaystyle V=(U+NU+...+N^{k-1}U)\oplus W

for some invariant subspace W. Now, W can’t be all of V, since we know u isn’t in it. It could be nontrivial, though. Then we’ve broken V into the direct sum of two invariant subspaces, and we can restrict N to each subspace and use the inductive hypothesis to get a Jordan basis for each part. Putting those bases together gives us a Jordan basis for all of V.

What’s left is the case where W is trivial, and so V=U+NU+...+N^{k-1}U. This means that V is spanned by the k vectors \{u,Nu,...,N^{k-1}u\}. Thus k\geq d, since a basis is a minimal spanning set. But we also know that N^d=0, so k\leq d. Thus k=d, and the spanning set is actually a basis.

Reversing the order in the list above, we see that N(N^{k-1}u)=0, so the first column in the matrix will be zero. Next, N(N^{k-2}u)=N^{k-1}u, so the second column will have a one in the first row, and zeroes elsewhere. In general, the ith column has a single one in the (i-1)th row, just above the diagonal, and zeroes elsewhere, showing that this is a Jordan basis, and finishing the theorem.

Essentially what we’ve done is this: we pick one vector that lives as long as possible, and follow it until it dies, adding new basis elements as we go. Then if we haven’t used up the whole space, we pick another vector that lives as long as possible — which might be not as long as the first one did — and repeat the process until eventually we fill up all of V. The lemma is important in that it tells us that these later streams of basis vectors will never step on the toes of earlier streams.
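The single-stream case is easy to check numerically; here is a sketch with a nilpotent matrix of my own choosing:

```python
import numpy as np

# Hypothetical example: a nilpotent N on R^3 with N^2 != 0, so one stream
# fills the whole space.
N = np.array([[0.0, 1, 1],
              [0,   0, 1],
              [0,   0, 0]])

u = np.array([0.0, 0, 1])     # a vector that lives as long as possible: N^2 u != 0
B = np.column_stack([N @ N @ u, N @ u, u])   # the Jordan basis, in reversed order

# In this basis, N is a single Jordan block: zeroes down the diagonal, ones above.
print(np.linalg.inv(B) @ N @ B)
```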

We should point out here that this result about nilpotent transformations works no matter what base field we’re working with. A single nilpotent transformation on its own is actually a remarkably simple thing, which can be described very succinctly.

March 3, 2009 Posted by | Algebra, Linear Algebra | 2 Comments