A Trace Criterion for Nilpotence
We’re going to need another way of identifying nilpotent endomorphisms. Let $A\subseteq B$ be two subspaces of endomorphisms on a finite-dimensional space $V$, and let $M$ be the collection of $x\in\mathfrak{gl}(V)$ such that $\mathrm{ad}(x)$ sends $B$ into $A$. If $x\in M$ satisfies $\mathrm{tr}(xy)=0$ for all $y\in M$, then $x$ is nilpotent.
The first thing we do is take the Jordan-Chevalley decomposition of $x$ — $x=x_s+x_n$ — and fix a basis of $V$ that diagonalizes $x_s$ with eigenvalues $a_1,\dots,a_m$. We define $E$ to be the $\mathbb{Q}$-subspace of $\mathbb{F}$ spanned by the eigenvalues. If we can prove that this space is trivial, then all the eigenvalues of $x_s$ must be zero, and thus $x_s$ itself must be zero.
We proceed by showing that any $\mathbb{Q}$-linear functional $f:E\to\mathbb{Q}$ must be zero. Taking one, we define $y$ to be the endomorphism whose matrix with respect to our fixed basis is diagonal: its diagonal entries are $f(a_1),\dots,f(a_m)$. If $\{e_{ij}\}$ is the corresponding basis of $\mathfrak{gl}(V)$ we can calculate that
$$\mathrm{ad}(x_s)(e_{ij})=(a_i-a_j)e_{ij}\qquad\mathrm{ad}(y)(e_{ij})=\left(f(a_i)-f(a_j)\right)e_{ij}$$
Now we can find some polynomial $r(T)$ (Lagrange interpolation will do) such that $r(a_i-a_j)=f(a_i)-f(a_j)$; there is no ambiguity here since if $a_i-a_j=a_k-a_l$ then the $\mathbb{Q}$-linearity of $f$ implies that $f(a_i)-f(a_j)=f(a_k)-f(a_l)$. Further, picking $i=j$ we can see that $r(0)=0$, so $r$ has no constant term. It should be apparent that $\mathrm{ad}(y)=r(\mathrm{ad}(x_s))$.
Now, we know that $\mathrm{ad}(x_s)$ is the semisimple part of $\mathrm{ad}(x)$, so the Jordan-Chevalley decomposition lets us write it as a polynomial in $\mathrm{ad}(x)$ with no constant term. But then we can write $\mathrm{ad}(y)=r(\mathrm{ad}(x_s))$ as a polynomial in $\mathrm{ad}(x)$ with no constant term as well. Since $\mathrm{ad}(x)$ maps $B$ into $A$, so does $\mathrm{ad}(y)$ (that is, $y\in M$), and our hypothesis tells us that
$$\mathrm{tr}(xy)=\sum\limits_{i=1}^ma_if(a_i)=0$$
Hitting this with $f$ we find that the sum of the squares of the $f(a_i)$ is also zero, but since these are rational numbers they must all be zero.
Thus, as we asserted, the only possible $\mathbb{Q}$-linear functional on $E$ is zero, meaning that $E$ is trivial, all the eigenvalues of $x_s$ are zero, and $x=x_n$ is nilpotent, as asserted.
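By the way, the interpolation step is easy to play with by machine. Here is a quick sketch (assuming SymPy, with a made-up pair of eigenvalues $1$ and $\sqrt{2}$ and the $\mathbb{Q}$-linear functional $f(p+q\sqrt{2})=p$) checking that the interpolating polynomial $r$ matches the data and has no constant term:

```python
import sympy as sp

t = sp.symbols('t')
a1, a2 = sp.Integer(1), sp.sqrt(2)   # made-up eigenvalues spanning E over Q
f = {a1: 1, a2: 0}                   # a Q-linear functional on E: f(p + q*sqrt(2)) = p

# data points r(a_i - a_j) = f(a_i) - f(a_j); well-defined by Q-linearity of f
pts = {}
for ai in (a1, a2):
    for aj in (a1, a2):
        pts[sp.simplify(ai - aj)] = f[ai] - f[aj]

r = sp.interpolate(list(pts.items()), t)     # Lagrange interpolation
assert all(sp.simplify(r.subs(t, d) - v) == 0 for d, v in pts.items())
assert sp.simplify(r.subs(t, 0)) == 0        # no constant term, as claimed
```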
Uses of the Jordan-Chevalley Decomposition
Now that we’ve given the proof, we want to mention a few uses of the Jordan-Chevalley decomposition.
First, we let $A$ be any finite-dimensional $\mathbb{F}$-algebra — associative, Lie, whatever — and remember that $\mathfrak{gl}(A)$ contains the Lie algebra of derivations $\mathrm{Der}(A)$. I say that if $\delta\in\mathrm{Der}(A)$ then so are its semisimple part $\sigma$ and its nilpotent part $\nu$; since $\nu=\delta-\sigma$ and $\mathrm{Der}(A)$ is a subspace, it’s enough to show that $\sigma$ is.
Just like we decomposed $V$ in the proof of the Jordan-Chevalley decomposition, we can break $A$ down into the (generalized) eigenspaces of $\delta$ — or, equivalently, of $\sigma$. But this time we will index them by the eigenvalue: $A_a$ consists of those $x\in A$ such that
$$\left(\delta-a1_A\right)^k(x)=0$$
for sufficiently large $k$.
Now we have the identity:
$$\left(\delta-(a+b)1_A\right)^n(xy)=\sum\limits_{i=0}^n\binom{n}{i}\left(\left(\delta-a1_A\right)^i(x)\right)\left(\left(\delta-b1_A\right)^{n-i}(y)\right)$$
which is easily verified. If a sufficiently large power of $\delta-a1_A$ applied to $x$ and a sufficiently large power of $\delta-b1_A$ applied to $y$ are both zero, then for sufficiently large $n$ one or the other factor in each term will be zero, and so the entire sum is zero. Thus we verify that $A_aA_b\subseteq A_{a+b}$.
If we take $x\in A_a$ and $y\in A_b$ then $xy\in A_{a+b}$, and thus $\sigma(xy)=(a+b)xy$. On the other hand,
$$\sigma(x)y+x\sigma(y)=axy+bxy=(a+b)xy$$
And thus $\sigma$ satisfies the derivation property
$$\sigma(xy)=\sigma(x)y+x\sigma(y)$$
(at least on the eigenspaces, which together span all of $A$), so $\sigma$ and $\nu=\delta-\sigma$ are both in $\mathrm{Der}(A)$.
Next, we note that, just as the adjoint of a nilpotent endomorphism is nilpotent, the adjoint of a semisimple endomorphism is semisimple. Indeed, if $\{v_1,\dots,v_n\}$ is a basis of $V$ such that the matrix of $x$ is diagonal with eigenvalues $a_1,\dots,a_n$, then we let $e_{ij}$ be the standard basis element of $\mathfrak{gl}(n,\mathbb{F})$, which is isomorphic to $\mathfrak{gl}(V)$ using the basis $\{v_i\}$. It’s a straightforward calculation to verify that
$$\mathrm{ad}(x)(e_{ij})=(a_i-a_j)e_{ij}$$
and thus $\mathrm{ad}(x)$ is diagonal with respect to this basis.
So now if $x=x_s+x_n$ is the Jordan-Chevalley decomposition of $x$, then $\mathrm{ad}(x_s)$ is semisimple and $\mathrm{ad}(x_n)$ is nilpotent. They commute, since
$$[\mathrm{ad}(x_s),\mathrm{ad}(x_n)]=\mathrm{ad}\left([x_s,x_n]\right)=\mathrm{ad}(0)=0$$
Since $\mathrm{ad}(x)=\mathrm{ad}(x_s)+\mathrm{ad}(x_n)$ is a decomposition of $\mathrm{ad}(x)$ into a semisimple and a nilpotent part which commute with each other, it is the Jordan-Chevalley decomposition of $\mathrm{ad}(x)$.
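This last calculation is easy to check symbolically. Here is a minimal sketch (assuming SymPy, with a $3\times3$ diagonal matrix standing in for the semisimple endomorphism):

```python
import sympy as sp

n = 3
a = sp.symbols('a1:4')               # symbolic eigenvalues a_1, a_2, a_3
s = sp.diag(*a)                      # the matrix of a semisimple x in its eigenbasis

def E(i, j):
    m = sp.zeros(n, n)
    m[i, j] = 1
    return m

# ad(s) sends e_{ij} to (a_i - a_j) e_{ij}, so ad(s) is diagonal on the e_{ij} basis
for i in range(n):
    for j in range(n):
        assert s * E(i, j) - E(i, j) * s == (a[i] - a[j]) * E(i, j)
```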
The Jordan-Chevalley Decomposition (proof)
We now give the proof of the Jordan-Chevalley decomposition. We let $x$ have distinct eigenvalues $a_1,\dots,a_k$ with multiplicities $m_1,\dots,m_k$, so the characteristic polynomial of $x$ is
$$\prod\limits_{i=1}^k(T-a_i)^{m_i}$$
We set $V_i=\ker\left((x-a_i1_V)^{m_i}\right)$ so that $V$ is the direct sum of these subspaces, each of which is fixed by $x$.
On the subspace $V_i$, $x$ has the characteristic polynomial $(T-a_i)^{m_i}$. What we want is a single polynomial $p(T)$ such that
$$\begin{aligned}p(T)&\equiv a_i\pmod{(T-a_i)^{m_i}}\\p(T)&\equiv0\pmod{T}\end{aligned}$$
That is, $p(T)$ has no constant term, and for each $i$ there is some polynomial $q_i(T)$ such that
$$p(T)=a_i+q_i(T)(T-a_i)^{m_i}$$
Thus, if we evaluate $p(x)$ on the $V_i$ block we get $a_i1_{V_i}$.
To do this, we will make use of a result that usually comes up in number theory called the Chinese remainder theorem. Unfortunately, I didn’t have the foresight to cover number theory before Lie algebras, so I’ll just give the statement: any system of congruences — like the one above — where the moduli are relatively prime — as they are above, unless $0$ is an eigenvalue of $x$, in which case just leave out the last congruence since we don’t need it — has a common solution, which is unique modulo the product of the separate moduli. For example, a system like
$$\begin{aligned}x&\equiv2\pmod3\\x&\equiv3\pmod5\\x&\equiv2\pmod7\end{aligned}$$
has the solution $x=23$, which is unique modulo $3\cdot5\cdot7=105$. This is pretty straightforward to understand for integers, but it works as stated over any principal ideal domain — like the polynomial ring $\mathbb{F}[T]$ — and, suitably generalized, over any commutative ring.
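For what it’s worth, SymPy will happily solve integer systems like the toy one above (this is only an illustration, not anything the proof needs):

```python
from sympy.ntheory.modular import crt

residue, modulus = crt([3, 5, 7], [2, 3, 2])   # x = 2 (mod 3), x = 3 (mod 5), x = 2 (mod 7)
assert (residue, modulus) == (23, 105)
assert all(residue % m == r for m, r in [(3, 2), (5, 3), (7, 2)])
```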
So anyway, such a $p(T)$ exists, and it’s the polynomial we need to get the semisimple part of $x$. Indeed, on any block $V_i$ the endomorphism $x_s=p(x)$ acts as multiplication by $a_i$, so it differs from $x$ by stripping off any off-diagonal elements. Then we can just set $q(T)=T-p(T)$ and find $x_n=q(x)=x-x_s$, which is nilpotent since on each block it equals $x-a_i1_{V_i}$, whose $m_i$-th power vanishes. Any two polynomials in $x$ must commute — indeed we can simply calculate
$$x^ix^j=x^{i+j}=x^jx^i$$
Finally, if $x$ sends the subspace $B$ into the subspace $A\subseteq B$, then so must any polynomial in $x$ with no constant term, so the last assertion of the decomposition holds.
The only thing left is the uniqueness of the decomposition. Let’s say that $x=s+n$ is a different decomposition into a semisimple and a nilpotent part which commute with each other. Then we have $x_s-s=n-x_n$, and all four of these endomorphisms commute with each other, since $s$ and $n$ commute with $x$ and hence with any polynomial in $x$. But the left-hand side is semisimple — diagonalizable — while the right-hand side is nilpotent, which means its only possible eigenvalue is zero. Thus $s=x_s$ and $n=x_n$.
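If you’d rather see the decomposition computed than constructed by interpolation, here is a small sketch (assuming SymPy, with a made-up defective matrix) that reads the semisimple part off the Jordan form and checks the claimed properties:

```python
import sympy as sp

x = sp.Matrix([[3, 1],
               [-1, 1]])     # made-up example: single eigenvalue 2, one 2x2 Jordan block

P, J = x.jordan_form()       # x = P * J * P**-1
S = sp.diag(*[J[i, i] for i in range(J.rows)])   # keep only the diagonal of J
x_s = P * S * P.inv()        # semisimple part
x_n = x - x_s                # nilpotent part

assert x_s * x_n == x_n * x_s                      # the parts commute
assert x_n ** x.rows == sp.zeros(x.rows, x.cols)   # x_n is nilpotent
assert x_s == 2 * sp.eye(2)                        # here x_s is just twice the identity
```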
The Jordan-Chevalley Decomposition
We recall that any linear endomorphism of a finite-dimensional vector space over an algebraically closed field can be put into Jordan normal form: we can find a basis such that its matrix is the sum of blocks that look like
$$\begin{pmatrix}\lambda&1&&&\\&\lambda&1&&\\&&\ddots&\ddots&\\&&&\lambda&1\\&&&&\lambda\end{pmatrix}$$
where $\lambda$ is some eigenvalue of the transformation. We want a slightly more abstract version of this, and it hinges on the idea that matrices in Jordan normal form have an obvious diagonal part, and a bunch of entries just above the diagonal. This off-diagonal part is all in the upper-triangle, so it is nilpotent; the diagonalizable part we call “semisimple”. And what makes this particular decomposition special is that the two parts commute. Indeed, the block-diagonal form means we can carry out the multiplication block-by-block, and in each block one factor is a constant multiple of the identity, which clearly commutes with everything.
More generally, we will have the Jordan-Chevalley decomposition of an endomorphism: any $x\in\mathfrak{gl}(V)$ can be written uniquely as the sum $x=x_s+x_n$, where $x_s$ is semisimple — diagonalizable — and $x_n$ is nilpotent, and where $x_s$ and $x_n$ commute with each other.
Further, we will find that there are polynomials $p(T)$ and $q(T)$ — each with no constant term — such that $x_s=p(x)$ and $x_n=q(x)$. And thus we will find that any endomorphism that commutes with $x$ will also commute with both $x_s$ and $x_n$.
Finally, if $A\subseteq B\subseteq V$ is any pair of subspaces such that $x(B)\subseteq A$ then the same is true of both $x_s$ and $x_n$.
We will prove these next time, but let’s see that this is actually true of the Jordan normal form. The first part we’ve covered.
For the second, set aside the assertion about $p(T)$ and $q(T)$; any endomorphism commuting with $x$ either multiplies each block by a constant or shuffles similar blocks, and both of these operations commute with both $x_s$ and $x_n$.
For the last part, we may as well assume that $B=V$, since otherwise we can just restrict to $B$. If $x(V)\subseteq A$ then the Jordan normal form shows us that any complementary subspace to $A$ must be spanned by vectors coming from blocks with eigenvalue $0$. In particular, such a complement can only touch the last row of any such block. But none of these rows are in the range of either the diagonal or off-diagonal portions of the matrix, so $x_s$ and $x_n$ send $V$ into $A$ as well.
Flags
We’d like to have matrix-oriented versions of Engel’s theorem and Lie’s theorem, and to do that we’ll need flags. I’ve actually referred to flags long, long ago, but we’d better go through them now.
In its simplest form, a flag is simply a strictly-increasing sequence of subspaces
$$0=V_0\subseteq V_1\subseteq\dots\subseteq V_n=V$$
of a given finite-dimensional vector space $V$. And we almost always say that a flag starts with $0$ and ends with $V$. In the middle we have some other subspaces, each one strictly including the one below it. We say that a flag is “complete” if $\dim(V_i)=i$ — and thus $n=\dim(V)$ — and for our current purposes all flags will be complete unless otherwise mentioned.
The useful thing about flags is that they’re a little more general and “geometric” than ordered bases. Indeed, given an ordered basis $(e_1,\dots,e_n)$ we have a flag on $V$: define $V_i$ to be the span of $\{e_1,\dots,e_i\}$. As a partial converse, given any (complete) flag we can come up with a not-at-all-unique basis: at each step let $e_i$ be a preimage in $V_i$ of some nonzero vector in the one-dimensional space $V_i/V_{i-1}$.
We say that an endomorphism of $V$ “stabilizes” a flag if it sends each $V_i$ back into itself. In fact, we saw something like this in the proof of Lie’s theorem: we built a complete flag on the subspace $W_n$ spanned by the vectors $x^iw$, building the subspace up one basis element at a time, and then showed that each $k\in K$ stabilized that flag. More generally, we say a collection of endomorphisms stabilizes a flag if all the endomorphisms in the collection do.
So, what do Lie’s and Engel’s theorems tell us about flags? Well, Lie’s theorem tells us that if $L\subseteq\mathfrak{gl}(V)$ is solvable then it stabilizes some flag in $V$. Equivalently, there is some basis with respect to which the matrices of all elements of $L$ are upper-triangular. In other words, $L$ is isomorphic to some subalgebra of $\mathfrak{t}(n,\mathbb{F})$. We see that not only is $\mathfrak{t}(n,\mathbb{F})$ solvable, it is in a sense the archetypal solvable Lie algebra.
The proof is straightforward: Lie’s theorem tells us that $L$ has a common eigenvector $v_1\in V$. We let this span the one-dimensional subspace $V_1$ and consider the action of $L$ on the quotient $V/V_1$. Since we know that the image of $L$ in $\mathfrak{gl}(V/V_1)$ will again be solvable, we get a common eigenvector $\bar{v}_2\in V/V_1$. Choosing a pre-image $v_2\in V$ with $v_2+V_1=\bar{v}_2$ we get our second basis vector. We can continue like this, building up a basis $\{v_1,\dots,v_n\}$ of $V$ such that at each step we can write
$$xv_i\equiv\lambda_i(x)v_i\mod V_{i-1}$$
for all $x\in L$ and some linear functional $\lambda_i$.
For nilpotent $L$, the same is true — of course, nilpotent Lie algebras are automatically solvable — but Engel’s theorem tells us more: when $L$ consists of nilpotent endomorphisms the functionals $\lambda_i$ must be zero, and the diagonal entries of the above matrices are all zero. We conclude that any such $L$ is isomorphic to some subalgebra of $\mathfrak{n}(n,\mathbb{F})$. That is, not only is $\mathfrak{n}(n,\mathbb{F})$ nilpotent, it is the archetype of all nilpotent Lie algebras in just the same way as $\mathfrak{t}(n,\mathbb{F})$ is the archetypal solvable Lie algebra.
More generally, if $L$ is any solvable (nilpotent) Lie algebra and $\phi:L\to\mathfrak{gl}(V)$ is any finite-dimensional representation of $L$, then we know that the image $\phi(L)$ is a solvable (nilpotent) linear Lie algebra acting on $V$, and thus it must stabilize some flag of $V$. As a particular example, consider the adjoint action $\mathrm{ad}:L\to\mathfrak{gl}(L)$; a subspace of $L$ invariant under the adjoint action of $L$ is just the same thing as an ideal of $L$, so we find that there must be some chain of ideals:
$$0=I_0\subseteq I_1\subseteq\dots\subseteq I_n=L$$
where $\dim(I_k)=k$. Given such a chain, we can of course find a basis of $L$ with respect to which the matrices of the adjoint action are all in $\mathfrak{t}(n,\mathbb{F})$ (respectively $\mathfrak{n}(n,\mathbb{F})$).
In either case, we find that $[L,L]$ is nilpotent. Indeed, if $L$ is already nilpotent this is trivial. But if $L$ is merely solvable, we see that the matrices of the commutators $[\mathrm{ad}(x),\mathrm{ad}(y)]$ for $x,y\in L$ lie in
$$\left[\mathfrak{t}(n,\mathbb{F}),\mathfrak{t}(n,\mathbb{F})\right]=\mathfrak{n}(n,\mathbb{F})$$
But since $\mathrm{ad}$ is a homomorphism, this is the matrix of $\mathrm{ad}([x,y])$ acting on $L$, and obviously its action on the subalgebra $[L,L]$ is nilpotent as well. Thus each element of $[L,L]$ is ad-nilpotent, and Engel’s theorem then tells us that $[L,L]$ is a nilpotent Lie algebra.
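Here is a quick sanity check (assuming SymPy, with randomly chosen integer matrices) of the inclusion $[\mathfrak{t}(n,\mathbb{F}),\mathfrak{t}(n,\mathbb{F})]\subseteq\mathfrak{n}(n,\mathbb{F})$ that this last step leans on:

```python
import random
import sympy as sp

n = 4
def random_upper_triangular():
    m = sp.zeros(n, n)
    for i in range(n):
        for j in range(i, n):
            m[i, j] = random.randint(-5, 5)
    return m

for _ in range(10):
    a, b = random_upper_triangular(), random_upper_triangular()
    c = a * b - b * a
    # the bracket is strictly upper-triangular: zero on and below the diagonal
    assert all(c[i, j] == 0 for i in range(n) for j in range(n) if j <= i)
```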
Lie’s Theorem
The lemma leading to Engel’s theorem boils down to the assertion that there is some common eigenvector $v$ for all the endomorphisms in a linear Lie algebra $L\subseteq\mathfrak{gl}(V)$ of nilpotent endomorphisms on a finite-dimensional nonzero vector space $V$. Lie’s theorem says that the same is true of solvable linear Lie algebras. Of course, in the nilpotent case the only possible eigenvalue was zero, so we may find things a little more complicated now. We will, however, have to assume that $\mathbb{F}$ is algebraically closed and that no multiple of the unit in $\mathbb{F}$ is zero (that is, $\mathbb{F}$ has characteristic zero).
We will proceed by induction on the dimension of $L$ using the same four basic steps as in the lemma: find an ideal $K\subseteq L$ of codimension one, so we can write $L=K+\mathbb{F}z$ for some $z\in L\setminus K$; find common eigenvectors for $K$; find a subspace of such common eigenvectors stabilized by $L$; find in that space an eigenvector for $z$.
First, solvability says that $L$ properly includes $[L,L]$, or else the derived series wouldn’t be able to even start heading towards $0$. The quotient $L/[L,L]$ must be abelian, with all brackets zero, so we can pick any subspace of this quotient with codimension one and it will be an ideal. The preimage of this subspace under the quotient projection will then be an ideal $K\subseteq L$ of codimension one.
Now, $K$ is a subalgebra of $L$, so we know it’s also solvable, so induction tells us that there’s a common eigenvector $v\in V$ for the action of $K$. If $K$ is zero, then $L$ must be one-dimensional abelian, in which case the proof is obvious. Otherwise there is some linear functional $\lambda\in K^*$ defined by
$$kv=\lambda(k)v\qquad\text{for all }k\in K$$
Of course, $v$ is not the only such eigenvector; we define the (nonzero) subspace $W\subseteq V$ by
$$W=\{w\in V\mid kw=\lambda(k)w\text{ for all }k\in K\}$$
Next we must show that $L$ sends $W$ back into itself. To see this, pick $x\in L$, $w\in W$, and $k\in K$ and check that
$$k(xw)=x(kw)-[x,k]w=\lambda(k)(xw)-\lambda([x,k])w$$
But if $xw$ is to land back in $W$, we’d need $k(xw)=\lambda(k)(xw)$; so we need to verify that $\lambda([x,k])=0$. In the nilpotent case — Engel’s theorem — the functional $\lambda$ was constantly zero, so this was easy, but it’s a bit harder here.
Fixing $x\in L$ and $w\in W$, we pick $n$ to be the first index where the collection $\{w,xw,x^2w,\dots,x^nw\}$ is linearly dependent — the first one where we can express $x^nw$ as a linear combination of all the previous $x^iw$. If we write $W_i$ for the subspace spanned by the first $i$ of these vectors — $w,xw,\dots,x^{i-1}w$ — then the dimension of $W_i$ grows one-by-one until we get to $W_n$, and $W_{n+i}=W_n$ from then on.
I say that each of the $W_i$ are invariant under each $k\in K$. Indeed, we can prove the congruence
$$kx^iw\equiv\lambda(k)x^iw\mod W_i$$
that is, $k$ acts on $x^iw$ by multiplication by $\lambda(k)$, plus some “lower-order terms”. For $i=0$ this is the definition of $W$; in general we have
$$kx^iw=xkx^{i-1}w-[x,k]x^{i-1}w=x\left(\lambda(k)x^{i-1}w+w'\right)-[x,k]x^{i-1}w$$
for some $w'\in W_{i-1}$. The first term is $\lambda(k)x^iw$ plus something in $W_i$, while the second term lies in $W_i$ by the induction hypothesis applied to $[x,k]\in K$.
And so we conclude that, using the obvious basis $\{w,xw,\dots,x^{n-1}w\}$ of $W_n$, the action of $k$ on this subspace is in the form of an upper-triangular matrix with $\lambda(k)$ down the diagonal. The trace of this matrix is $n\lambda(k)$. And in particular, the trace of the action of $[x,k]$ on $W_n$ is $n\lambda([x,k])$. But $x$ and $k$ both act as endomorphisms of $W_n$ — the one by design and the other by the above proof — and the trace of any commutator is zero! Since $n$ must have an inverse (this is where we need no multiple of the unit in $\mathbb{F}$ to be zero) we conclude that $\lambda([x,k])=0$.
Okay, so that checks out that the action of $L$ sends $W$ back into itself. We finish up by picking some eigenvector $v\in W$ of $z$, which we know must exist because we’re working over an algebraically closed field. Incidentally, we can then extend $\lambda$ to all of $L$ by using $zv=\lambda(z)v$.
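To see the statement in the smallest interesting case, here is a sketch (assuming SymPy; the matrices are made up) of a two-dimensional solvable linear Lie algebra with a common eigenvector and the functional $\lambda$ it defines:

```python
import sympy as sp

a = sp.Matrix([[1, 0], [0, 2]])
b = sp.Matrix([[0, 1], [0, 0]])
assert a * b - b * a == -b       # span{a, b} is closed under brackets, and is solvable

v = sp.Matrix([1, 0])            # a common eigenvector
assert a * v == 1 * v            # lambda(a) = 1
assert b * v == 0 * v            # lambda(b) = 0
```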
Engel’s Theorem
When we say that a Lie algebra $L$ is nilpotent, another way of putting it is that for any sufficiently long sequence $x_1,x_2,\dots,x_n$ of elements of $L$ the nested adjoint
$$\mathrm{ad}(x_1)\mathrm{ad}(x_2)\cdots\mathrm{ad}(x_n)(y)=[x_1,[x_2,[\dots,[x_n,y]\dots]]]$$
is zero for all $y\in L$. In particular, applying $\mathrm{ad}(x)$ enough times will eventually kill any element of $L$. That is, each $x\in L$ is ad-nilpotent. It turns out that the converse is also true, which is the content of Engel’s theorem.
But first we prove this lemma: if $L\subseteq\mathfrak{gl}(V)$ is a linear Lie algebra on a finite-dimensional, nonzero vector space $V$ that consists of nilpotent endomorphisms, then there is some nonzero $v\in V$ for which $xv=0$ for all $x\in L$.
We proceed by induction on the dimension of $L$. If $\dim(L)=1$ then $L$ is spanned by a single nilpotent endomorphism, which has only the eigenvalue zero, and must have an eigenvector $v$, proving the lemma in this case.
If $K$ is any nontrivial proper subalgebra of $L$ then $\mathrm{ad}(k)$ is nilpotent for all $k\in K$, since $k$ itself is a nilpotent endomorphism. We also get an everywhere-nilpotent action of $K$ on the quotient vector space $L/K$. But since $\dim(K)<\dim(L)$, the induction hypothesis gives us a nonzero vector $x+K\in L/K$ that gets killed by every $k\in K$. But this means that $[k,x]\in K$ for all $k\in K$, while $x\notin K$. That is, $K$ is strictly contained in the normalizer $N_L(K)$.
Now instead of just taking any subalgebra, let $K$ be a maximal proper subalgebra in $L$. Since $K$ is properly contained in $N_L(K)$, we must have $N_L(K)=L$, and thus $K$ is actually an ideal of $L$. If $\dim(L/K)>1$ then we could find an even larger proper subalgebra of $L$ containing $K$ (the preimage of any one-dimensional subalgebra of $L/K$), in contradiction to our assumption, so as vector spaces we can write $L=K+\mathbb{F}z$ for any $z\in L\setminus K$.
Finally, let $W\subseteq V$ consist of those vectors killed by all $k\in K$, which the inductive hypothesis tells us is a nonzero subspace. Since $K$ is an ideal, $L$ sends $W$ back into itself: $k(xw)=x(kw)-[x,k]w=0$. Picking a $z\in L\setminus K$ as above, its action on $W$ is nilpotent, so it must have an eigenvector $v\in W$ with $zv=0$. Thus $xv=0$ for all $x\in L$.
So, now, to Engel’s theorem. We take a Lie algebra $L$ consisting of ad-nilpotent elements. Thus the algebra $\mathrm{ad}(L)\subseteq\mathfrak{gl}(L)$ consists of nilpotent endomorphisms on the vector space $L$, and there is thus some nonzero $z\in L$ for which $[x,z]=0$ for all $x\in L$. That is, $L$ has a nontrivial center — $Z(L)\neq0$.
The quotient $L/Z(L)$ thus has a lower dimension than $L$, and it also consists of ad-nilpotent elements. By induction on the dimension of $L$ we may assume that $L/Z(L)$ is actually nilpotent, and we already know that an algebra whose quotient by its center is nilpotent is itself nilpotent, which proves that $L$ itself is nilpotent.
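As a concrete instance of the lemma, every strictly upper-triangular matrix kills the first standard basis vector; here is a quick check (assuming SymPy, in the $3\times3$ case):

```python
import sympy as sp

basis = [sp.Matrix(3, 3, lambda r, c: 1 if (r, c) == (i, j) else 0)
         for i in range(3) for j in range(3) if i < j]     # basis of n(3, F)
e1 = sp.Matrix([1, 0, 0])

assert all(m ** 3 == sp.zeros(3, 3) for m in basis)        # each element is nilpotent
assert all(m * e1 == sp.zeros(3, 1) for m in basis)        # and each one kills e_1
```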
Facts About Solvability and Nilpotence
Solvability is an interesting property of a Lie algebra $L$, in that it tends to “infect” many related algebras. For one thing, all subalgebras and quotient algebras of a solvable $L$ are also solvable. For the first count, it should be clear that if $K\subseteq L$ is a subalgebra then $K^{(n)}\subseteq L^{(n)}$. On the other hand, if $\pi:L\to L/I$ is a quotient epimorphism then any element in $(L/I)^{(n)}$ has a representative in $L^{(n)}$, so if the derived series of $L$ bottoms out at $L^{(n)}=0$ then so must the derived series of $L/I$.
As a sort of converse, suppose that $L/I$ is a solvable quotient of $L$ by a solvable ideal $I$; then $L$ is itself solvable. Indeed, if $(L/I)^{(n)}=0$ and $\pi:L\to L/I$ is the quotient epimorphism then $\pi(L^{(n)})=(L/I)^{(n)}=0$, as we saw above. That is, $L^{(n)}\subseteq I$, but since $I$ is solvable this means that $L^{(n)}$ — as a subalgebra of $I$ — is solvable, and thus $L$ is as well.
Finally, if $I$ and $J$ are solvable ideals of $L$ then so is $I+J$. Here, we can use the third isomorphism theorem to establish an isomorphism $(I+J)/J\cong I/(I\cap J)$. The right hand side is a quotient of $I$, and so it’s solvable, which makes $(I+J)/J$ a solvable quotient of $I+J$ by the solvable ideal $J$, meaning that $I+J$ is itself solvable.
As an application, let $L$ be any Lie algebra and let $R$ be a maximal solvable ideal, contained in no larger solvable ideal. If $I$ is any other solvable ideal, then $R+I$ is solvable as well, and it obviously contains $R$. But maximality then tells us that $R+I=R$, from which we conclude that $I\subseteq R$. Thus we conclude that the maximal solvable ideal $R$ is unique; we call it the “radical” of $L$, written $\mathrm{Rad}(L)$.
In the case that the radical of $L$ is zero, we say that $L$ is “semisimple”. In particular, a simple Lie algebra is semisimple, since the only ideals of $L$ are $L$ itself and $0$, and $L$ is not solvable.
In general, the quotient $L/\mathrm{Rad}(L)$ is semisimple, since if it had a nonzero solvable ideal it would have to be of the form $I/\mathrm{Rad}(L)$ for some ideal $I\subseteq L$ containing $\mathrm{Rad}(L)$. But if $I/\mathrm{Rad}(L)$ is a solvable quotient of $I$ by a solvable ideal, then $I$ must be solvable, which means it must be contained in the radical of $L$. Thus the only solvable ideal of $L/\mathrm{Rad}(L)$ is $0$, as we said.
We also have some useful facts about nilpotent algebras. First off, just as for solvable algebras all subalgebras and quotient algebras of a nilpotent algebra are nilpotent. Even the proof is all but identical.
Next, if $L/Z(L)$ — where $Z(L)$ is the center of $L$ — is nilpotent then $L$ is as well. Indeed, to say that $(L/Z(L))^n=0$ is to say that $L^n\subseteq Z(L)$ for some $n$. But then $L^{n+1}=[L,L^n]\subseteq[L,Z(L)]=0$.
Finally, if $L\neq0$ is nilpotent, then $Z(L)\neq0$. To see this, note that if $L^n$ is the first term of the descending central series that equals zero, then $L^{n-1}\subseteq Z(L)$, since the brackets of everything in $L^{n-1}$ with anything in $L$ are all zero.
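The algebra $\mathfrak{n}(3,\mathbb{F})$ (the Heisenberg algebra) is a nice test case for these last two facts; here is a short sketch (assuming SymPy) checking that its derived algebra is central, so the quotient by the center is abelian and the algebra itself is nilpotent:

```python
import sympy as sp

e12 = sp.Matrix([[0, 1, 0], [0, 0, 0], [0, 0, 0]])
e23 = sp.Matrix([[0, 0, 0], [0, 0, 1], [0, 0, 0]])
e13 = sp.Matrix([[0, 0, 1], [0, 0, 0], [0, 0, 0]])
br = lambda a, b: a * b - b * a

assert br(e12, e23) == e13                                           # [L, L] contains e13 ...
assert all(br(e13, z) == sp.zeros(3, 3) for z in (e12, e23, e13))    # ... and e13 is central
# so L/Z(L) is abelian (hence nilpotent), L^2 lies in the center, and L^3 = 0
```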
Nilpotent and Solvable Lie Algebras
There are two big types of Lie algebras that we want to take care of right up front, and both of them are defined similarly. We remember that if $I$ and $J$ are ideals of a Lie algebra $L$, then $[I,J]$ — the collection spanned by brackets of elements of $I$ and $J$ — is also an ideal of $L$. And since the bracket of any element of $I$ with any element of $J$ is back in $I$, we can see that $[I,J]\subseteq I$. Similarly we conclude $[I,J]\subseteq J$, so $[I,J]\subseteq I\cap J$.
Now, starting from $L$ we can build up a tower of ideals starting with $L^{(0)}=L$ and moving down by $L^{(n+1)}=[L^{(n)},L^{(n)}]$. We call this the “derived series” of $L$. If this tower eventually bottoms out at $0$ we say that $L$ is “solvable”. If $L$ is abelian we see that $L^{(1)}=[L,L]=0$, so $L$ is automatically solvable. At the other extreme, if $L$ is simple — and thus not abelian — the only possibility is $[L,L]=L$, so the derived series never gets down to $0$, and thus $L$ is not solvable.
We can build up another tower, again starting with $L^0=L$, but this time moving down by $L^{n+1}=[L,L^n]$. We call this the “lower central series” or “descending central series” of $L$. If this tower eventually bottoms out at $0$ we say that $L$ is “nilpotent”. Just as above we see that abelian Lie algebras are automatically nilpotent, while simple Lie algebras are never nilpotent.
It’s not too hard to see that $L^{(n)}\subseteq L^n$ for all $n$. Indeed, $L^{(0)}=L=L^0$ to start. Then if $L^{(n)}\subseteq L^n$ then
$$L^{(n+1)}=[L^{(n)},L^{(n)}]\subseteq[L,L^n]=L^{n+1}$$
so the assertion follows by induction. Thus we see that any nilpotent algebra is solvable, but solvable algebras are not necessarily nilpotent.
As some explicit examples, we look back at the algebras $\mathfrak{t}(n,\mathbb{F})$ of upper-triangular matrices and $\mathfrak{n}(n,\mathbb{F})$ of strictly upper-triangular matrices. The second, as we might guess, is nilpotent, and thus solvable. The first, though, is merely solvable.
First, let’s check that $\mathfrak{n}(n,\mathbb{F})$ is nilpotent. The obvious basis consists of all the standard matrix units $e_{ij}$ with $i<j$, and we know that
$$[e_{ij},e_{kl}]=\delta_{jk}e_{il}-\delta_{li}e_{kj}$$
We have an obvious sense of the “level” of an element: the difference $j-i$, which is well-defined on each basis element. We can tell that the bracket of two basis elements gives either zero or another basis element whose level is the sum of the levels of the first two basis elements. The ideal $[\mathfrak{n}(n,\mathbb{F}),\mathfrak{n}(n,\mathbb{F})]$ is spanned by all the basis elements of level $2$ or higher. The ideal $\mathfrak{n}(n,\mathbb{F})^2$ is then spanned by basis elements of level $3$ or higher. And so it goes, each $\mathfrak{n}(n,\mathbb{F})^k$ spanned by basis elements of level $k+1$ or higher. But this must run out soon enough, since the highest possible level is $n-1$. In terms of the matrix, elements of $\mathfrak{n}(n,\mathbb{F})$ are zero everywhere on or below the diagonal; elements of $[\mathfrak{n}(n,\mathbb{F}),\mathfrak{n}(n,\mathbb{F})]$ are also zero one step above the diagonal; and so on, each step pushing the nonzero elements “off the edge” to the upper-right of the matrix. Thus $\mathfrak{n}(n,\mathbb{F})$ is nilpotent, and thus solvable as well.
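Here is a quick spot-check (assuming SymPy, in $\mathfrak{n}(4,\mathbb{F})$, with arbitrarily chosen indices) of the claim that brackets of basis elements add their levels:

```python
import sympy as sp

def E(i, j, n=4):
    m = sp.zeros(n, n)
    m[i, j] = 1
    return m

br = lambda a, b: a * b - b * a

assert br(E(0, 1), E(1, 2)) == E(0, 2)         # level 1 + level 1 = level 2
assert br(E(0, 2), E(2, 3)) == E(0, 3)         # level 2 + level 1 = level 3
assert br(E(0, 1), E(2, 3)) == sp.zeros(4, 4)  # or else the bracket is zero
assert br(E(0, 3), E(1, 2)) == sp.zeros(4, 4)  # level 3 + level 1 would exceed n - 1 = 3
```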
Turning to $\mathfrak{t}(n,\mathbb{F})$, we already know that $[\mathfrak{t}(n,\mathbb{F}),\mathfrak{t}(n,\mathbb{F})]=\mathfrak{n}(n,\mathbb{F})$, which we just showed to be solvable! We see that $\mathfrak{t}(n,\mathbb{F})^{(k+1)}=\mathfrak{n}(n,\mathbb{F})^{(k)}$, which will eventually bottom out at $0$, thus $\mathfrak{t}(n,\mathbb{F})$ is solvable as well. However, we can also calculate that
$$[\mathfrak{t}(n,\mathbb{F}),\mathfrak{n}(n,\mathbb{F})]=\mathfrak{n}(n,\mathbb{F})$$
and so the lower central series of $\mathfrak{t}(n,\mathbb{F})$ stabilizes at $\mathfrak{n}(n,\mathbb{F})$ after the first step and never reaches $0$. Thus this algebra is solvable, but not nilpotent.
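And here is the corresponding spot-check (again assuming SymPy, with $n=3$) that every basis element of $\mathfrak{n}(3,\mathbb{F})$ already appears as a bracket of a diagonal matrix in $\mathfrak{t}(3,\mathbb{F})$ with an element of $\mathfrak{n}(3,\mathbb{F})$, which is why the lower central series gets stuck at $\mathfrak{n}(3,\mathbb{F})$:

```python
import sympy as sp

def E(i, j, n=3):
    m = sp.zeros(n, n)
    m[i, j] = 1
    return m

br = lambda a, b: a * b - b * a
d1, d2 = sp.diag(1, 0, 0), sp.diag(0, 1, 0)    # diagonal elements of t(3, F)

assert br(d1, E(0, 1)) == E(0, 1)
assert br(d2, E(1, 2)) == E(1, 2)
assert br(d1, E(0, 2)) == E(0, 2)
```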
An Explicit Example
Let’s pause and catch our breath with an actual example of some of the things we’ve been talking about. Specifically, we’ll consider $\mathfrak{sl}(2,\mathbb{F})$ — the special linear Lie algebra on a two-dimensional vector space. This is a nice example not only because it’s nicely representative of some general phenomena, but also because the algebra itself is three-dimensional, which helps keep clear the distinction between $\mathfrak{sl}(2,\mathbb{F})$ as a Lie algebra and the adjoint action of $\mathfrak{sl}(2,\mathbb{F})$ on itself, particularly since these are both thought of in terms of matrix multiplications.
Now, we know a basis for this algebra:
$$x=\begin{pmatrix}0&1\\0&0\end{pmatrix}\qquad h=\begin{pmatrix}1&0\\0&-1\end{pmatrix}\qquad y=\begin{pmatrix}0&0\\1&0\end{pmatrix}$$
which we will take in this order. We want to check each of the brackets of these basis elements:
$$[x,y]=h\qquad[h,x]=2x\qquad[h,y]=-2y$$
Writing out each bracket of basis elements as a (unique) linear combination of basis elements specifies the bracket completely, by linearity. We call the coefficients the “structure constants” of $\mathfrak{sl}(2,\mathbb{F})$, and they determine the algebra up to isomorphism.
Okay, now we want to use this basis $(x,h,y)$ of the vector space $\mathfrak{sl}(2,\mathbb{F})$ and write down matrices for the action of $\mathrm{ad}(x)$, $\mathrm{ad}(h)$, and $\mathrm{ad}(y)$ on $\mathfrak{sl}(2,\mathbb{F})$:
$$\mathrm{ad}(x)=\begin{pmatrix}0&-2&0\\0&0&1\\0&0&0\end{pmatrix}\qquad\mathrm{ad}(h)=\begin{pmatrix}2&0&0\\0&0&0\\0&0&-2\end{pmatrix}\qquad\mathrm{ad}(y)=\begin{pmatrix}0&0&0\\-1&0&0\\0&2&0\end{pmatrix}$$
Now, both $\mathrm{ad}(x)$ and $\mathrm{ad}(y)$ are nilpotent. In the case of $\mathrm{ad}(x)$ we can see that it sends the line spanned by $y$ to the line spanned by $h$, the line spanned by $h$ to the line spanned by $x$, and the line spanned by $x$ to zero. So we can calculate the powers:
$$\mathrm{ad}(x)^2=\begin{pmatrix}0&0&-2\\0&0&0\\0&0&0\end{pmatrix}\qquad\mathrm{ad}(x)^3=0$$
and the exponential:
$$\exp(\mathrm{ad}(x))=1+\mathrm{ad}(x)+\frac{1}{2}\mathrm{ad}(x)^2=\begin{pmatrix}1&-2&-1\\0&1&1\\0&0&1\end{pmatrix}$$
Similarly we can calculate the exponential of $\mathrm{ad}(-y)=-\mathrm{ad}(y)$:
$$\exp(\mathrm{ad}(-y))=1-\mathrm{ad}(y)+\frac{1}{2}\mathrm{ad}(y)^2=\begin{pmatrix}1&0&0\\1&1&0\\-1&-2&1\end{pmatrix}$$
So now it’s a simple matter to write down the following automorphism of $\mathfrak{sl}(2,\mathbb{F})$:
$$\sigma=\exp(\mathrm{ad}(x))\exp(\mathrm{ad}(-y))\exp(\mathrm{ad}(x))=\begin{pmatrix}0&0&-1\\0&-1&0\\-1&0&0\end{pmatrix}$$
In other words, $\sigma(x)=-y$, $\sigma(h)=-h$, and $\sigma(y)=-x$.
We can also see that $x$ and $y$ themselves are also nilpotent, as endomorphisms of the vector space $\mathbb{F}^2$. We can calculate their exponentials:
$$\exp(x)=\begin{pmatrix}1&1\\0&1\end{pmatrix}\qquad\exp(-y)=\begin{pmatrix}1&0\\-1&1\end{pmatrix}$$
and the product:
$$s=\exp(x)\exp(-y)\exp(x)=\begin{pmatrix}0&1\\-1&0\end{pmatrix}$$
It’s easy to check from here that conjugation by $s$ has the exact same effect as the action of $\sigma$:
$$sms^{-1}=\sigma(m)\qquad\text{for all }m\in\mathfrak{sl}(2,\mathbb{F})$$
This is a very general phenomenon: if $L\subseteq\mathfrak{gl}(V)$ is any linear Lie algebra and $x\in L$ is nilpotent, then conjugation by the exponential of $x$ is the same as applying the exponential of the adjoint of $x$:
$$\exp(x)\,y\,\exp(x)^{-1}=\left(\exp(\mathrm{ad}(x))\right)(y)$$
Indeed, considering $\mathrm{ad}(x)$, we can write it as $\lambda_x-\rho_x$, where $\lambda_x$ and $\rho_x$ are left- and right-multiplication by $x$ in $\mathrm{End}(V)$. Since these two commute with each other and both are nilpotent we can write
$$\exp(\mathrm{ad}(x))=\exp(\lambda_x-\rho_x)=\exp(\lambda_x)\exp(-\rho_x)=\lambda_{\exp(x)}\rho_{\exp(-x)}$$
That is, the action of $\exp(\mathrm{ad}(x))$ is the same as left-multiplication by $\exp(x)$ followed by right-multiplication by $\exp(-x)$. All we need now is to verify that this is the inverse of $\exp(x)$, but the expanded Leibniz identity from last time tells us that $\exp(x)\exp(-x)=\exp(x-x)=1_V$, thus proving our assertion.
We can also tell at this point that the nilpotency of $x$ and $y$ and that of $\mathrm{ad}(x)$ and $\mathrm{ad}(y)$ are not unrelated. Indeed, if $x\in\mathfrak{gl}(V)$ is nilpotent then $\mathrm{ad}(x)$ is, too: since $\lambda_x$ and $\rho_x$ are commuting nilpotents, their difference — $\mathrm{ad}(x)=\lambda_x-\rho_x$ — is again nilpotent.
We must be careful to note that the converse is not true. Indeed, the identity endomorphism $1_V$ is ad-nilpotent — $\mathrm{ad}(1_V)=0$ — but $1_V$ itself is certainly not nilpotent.
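All of the arithmetic in this post is easy to double-check by machine; here is a short sketch (assuming SymPy) verifying the structure constants and the fact that conjugating by $s$ agrees with $\sigma$:

```python
import sympy as sp

x = sp.Matrix([[0, 1], [0, 0]])
h = sp.Matrix([[1, 0], [0, -1]])
y = sp.Matrix([[0, 0], [1, 0]])
br = lambda a, b: a * b - b * a

assert br(x, y) == h and br(h, x) == 2 * x and br(h, y) == -2 * y

s = x.exp() * (-y).exp() * x.exp()              # s = exp(x) exp(-y) exp(x)
assert s == sp.Matrix([[0, 1], [-1, 0]])
assert s * x * s.inv() == -y                    # conjugation by s acts just like sigma
assert s * h * s.inv() == -h
assert s * y * s.inv() == -x
```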