Remember that we call a $G$-module irreducible or “simple” if it has no nontrivial submodules. In general, an object in any category is simple if it has no nontrivial subobjects. If a morphism in a category has a kernel and an image, as we’ve seen all $G$-morphisms do, then these are subobjects of the source and target objects, respectively.
So now we have everything we need to state and prove Schur’s lemma. Working in a category where every morphism has both a kernel and an image, if $f:V\to W$ is a morphism between two simple objects, then either $f$ is an isomorphism or it’s the zero morphism from $V$ to $W$. Indeed, since $V$ is simple it has no nontrivial subobjects. The kernel of $f$ is a subobject of $V$, so it must either be $V$ itself or the zero object. Similarly, since $W$ is simple, the image of $f$ must either be $W$ itself or the zero object. If either $\mathrm{Ker}(f)=V$ or $\mathrm{Im}(f)=\mathbf{0}$ then $f$ is the zero morphism. On the other hand, if $\mathrm{Ker}(f)=\mathbf{0}$ and $\mathrm{Im}(f)=W$ we have an isomorphism.
To see how this works in the case of $G$-modules, every time I say “object” in the preceding paragraph replace it by “$G$-module”. Morphisms are $G$-morphisms, the zero morphism is the linear map sending every vector to $0$, and the zero object is the trivial vector space $\mathbf{0}$. If it feels more comfortable, walk through the preceding proof making the required substitutions to see how it works for $G$-modules.
In terms of matrix representations, let’s say $X$ and $Y$ are two irreducible matrix representations of $G$, and let $T$ be any matrix so that $TX(g)=Y(g)T$ for all $g\in G$. Then Schur’s lemma tells us that either $T$ is invertible (it’s the matrix of an isomorphism) or it’s the zero matrix.
A nice quick one today. Let’s take two $G$-modules $V$ and $W$. We’ll write $\hom_G(V,W)$ for the vector space of intertwinors from $V$ to $W$. This is pretty appropriate because these are the morphisms in the category of $G$-modules. It turns out that this category has kernels and has images. Those two references are pretty technical, so we’ll talk in more down-to-earth terms.
Any intertwinor $f\in\hom_G(V,W)$ is first and foremost a linear map $f:V\to W$. And as usual the kernel of $f$ is the subspace $\mathrm{Ker}(f)\subseteq V$ of vectors $v$ for which $f(v)=0$. I say that this isn’t just a subspace of $V$, but it’s a submodule as well. That is, $\mathrm{Ker}(f)$ is an invariant subspace of $V$. Indeed, we check that if $v\in\mathrm{Ker}(f)$ and $g$ is any element of $G$, then $f(gv)=gf(v)=g0=0$, so $gv\in\mathrm{Ker}(f)$ as well.
Similarly, as usual the image of $f$ is the subspace $\mathrm{Im}(f)\subseteq W$ of vectors $w$ for which there’s some $v\in V$ with $f(v)=w$. And again I say that this is an invariant subspace. Indeed, if $w=f(v)\in\mathrm{Im}(f)$ and $g$ is any element of $G$, then $gw=gf(v)=f(gv)\in\mathrm{Im}(f)$ as well.
Thus these images and kernels are not just subspaces of the vector spaces $V$ and $W$, but submodules to boot. That is, they can act as images and kernels in the category of $G$-modules just like they do in the category of complex vector spaces.
Maschke’s theorem is a fundamental result that will make our project of understanding the representation theory of finite groups — and of symmetric groups in particular — far simpler. It tells us that every representation of a finite group is completely reducible.
We saw last time that in the presence of an invariant form, any reducible representation is decomposable, and so any representation with an invariant form is completely reducible. Maschke’s theorem works by showing that there is always an invariant form!
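Let’s start by picking any form whatsoever. We know that we can do this by picking a basis $\{e_i\}$ of $V$ and declaring it to be orthonormal. We don’t need anything fancy like Gram-Schmidt, which is used to find orthonormal bases for a given inner product. No, we just define our inner product by saying that $\langle e_i,e_j\rangle=\delta_{ij}$ (the Kronecker delta, with value $1$ when its indices are the same and $0$ otherwise) and extend the only way we can. If we have $v=\sum_iv^ie_i$ and $w=\sum_jw^je_j$ then we find
$$\langle v,w\rangle=\Big\langle\sum_iv^ie_i,\sum_jw^je_j\Big\rangle=\sum_{i,j}\overline{v^i}w^j\langle e_i,e_j\rangle=\sum_{i,j}\overline{v^i}w^j\delta_{ij}=\sum_i\overline{v^i}w^i$$
so this does uniquely define an inner product. But there’s no reason at all to believe it’s $G$-invariant.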
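We will use this arbitrary form to build an invariant form by a process of averaging. For any vectors $v$ and $w$, define
$$\langle v,w\rangle_G=\frac{1}{|G|}\sum_{g\in G}\langle gv,gw\rangle$$
Showing that this satisfies the definition of an inner product is a straightforward exercise. As for invariance, we want to show that for any $h\in G$ we have $\langle hv,hw\rangle_G=\langle v,w\rangle_G$. Indeed:
$$\langle hv,hw\rangle_G=\frac{1}{|G|}\sum_{g\in G}\langle ghv,ghw\rangle=\frac{1}{|G|}\sum_{k\in G}\langle kv,kw\rangle=\langle v,w\rangle_G$$
where the essential second equality follows because as $g$ ranges over $G$, the product $k=gh$ ranges over $G$ as well, just in a different order.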
And so we conclude that if $V$ is a representation of $G$ then we can take any inner product whatsoever on $V$ and “average” it to obtain an invariant form. Then with this invariant form in hand, we know that $V$ is completely reducible.
Why doesn’t this work for our counterexample representation of $\mathbb{Z}$? Because the group $\mathbb{Z}$ is infinite, and so the averaging process breaks down. This approach only works for finite groups, where the average over all $g\in G$ only involves a finite sum.
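The averaging trick is easy to try out on a small example. The sketch below is only an illustration under my own assumptions: it uses a two-element group acting on a real two-dimensional space by a hypothetical matrix `s` with `s*s = I` (so complex conjugation plays no role), and it represents a form by its Gram matrix. The helper names are made up for this example.

```python
from fractions import Fraction

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(row) for row in zip(*a)]

def pulled_back(gram, m):
    """Gram matrix of the form (v, w) -> <m v, m w> (real case)."""
    return matmul(transpose(m), matmul(gram, m))

def average_form(gram, group_matrices):
    """Gram matrix of <v, w>_G = (1/|G|) * sum over g of <g v, g w>."""
    n = len(gram)
    total = [[Fraction(0)] * n for _ in range(n)]
    for m in group_matrices:
        term = pulled_back(gram, m)
        total = [[total[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return [[entry / len(group_matrices) for entry in row] for row in total]

def is_invariant(gram, group_matrices):
    return all(pulled_back(gram, m) == gram for m in group_matrices)

# Assumed example: the group {identity, s} with s*s = identity.
identity = [[Fraction(1), Fraction(0)], [Fraction(0), Fraction(1)]]
s = [[Fraction(1), Fraction(1)], [Fraction(0), Fraction(-1)]]
group = [identity, s]

standard = identity                      # basis declared orthonormal
averaged = average_form(standard, group)

print(is_invariant(standard, group))     # the arbitrary form is not invariant
print(is_invariant(averaged, group))     # the averaged form is
```

The standard form fails the invariance test for this `s`, while the averaged one passes, which is exactly the reindexing argument above carried out on two group elements.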
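A very useful structure to have on a complex vector space $V$ carrying a representation $\rho$ of a group $G$ is an “invariant form”. To start with, this is a complex inner product $\langle\cdot,\cdot\rangle$, which we recall means that it is
- linear in the second slot: $\langle u,av+bw\rangle=a\langle u,v\rangle+b\langle u,w\rangle$
- conjugate symmetric: $\langle v,w\rangle=\overline{\langle w,v\rangle}$
- positive definite: $\langle v,v\rangle>0$ for all $v\neq0$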
Again as usual these imply conjugate linearity in the first slot, so the form isn’t quite bilinear. Still, people are often sloppy and say “invariant bilinear form”.
Anyhow, now we add a new condition to the form. We demand that it be
- invariant under the action of $G$: $\langle gv,gw\rangle=\langle v,w\rangle$
Here I have started to write $gv$ as shorthand for $\rho(g)v$. We will only do this when the representation in question is clear from the context.
The inner product gives us a notion of length and angle. Invariance now tells us that these notions are unaffected by the action of $G$. That is, the vectors $v$ and $gv$ have the same length for all $v\in V$ and $g\in G$. Similarly, the angle between vectors $v$ and $w$ is exactly the same as the angle between $gv$ and $gw$. Another way to say this is that if the form $B$ is invariant for the representation $\rho:G\to GL(V)$, then the image of $\rho$ is actually contained in the orthogonal group [commenter Eric Finster, below, reminds me that since we’ve got a complex inner product we’re using the group of unitary transformations with respect to the inner product $B$: $\rho:G\to U(V,B)$].
More important than any particular invariant form is this: if we have an invariant form on our space $V$, then any reducible representation is decomposable. That is, if $W\subseteq V$ is a submodule, we can find another submodule $U\subseteq V$ so that $V=U\oplus W$ as $G$-modules.
If we just consider them as vector spaces, we already know this: the orthogonal complement $W^\perp$ is exactly the subspace we need, for $V=W\oplus W^\perp$. I say that if $W$ is a $G$-invariant subspace of $V$, then $W^\perp$ is as well, and so they are both submodules. Indeed, if $v\in W^\perp$, then we check that $gv$ is as well. For any $w\in W$:
$$\langle gv,w\rangle=\langle g^{-1}gv,g^{-1}w\rangle=\langle v,g^{-1}w\rangle=0$$
where the first equality follows from the $G$-invariance of our form; the second from the representation property; and the third from the fact that $W$ is an invariant subspace, so $g^{-1}w\in W$.
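Here is a small check of this argument, under assumptions of my own choosing: the symmetric group on three letters permuting the coordinates of a three-dimensional real space, the standard dot product as the form, and the line spanned by the all-ones vector as the invariant subspace. None of these names come from the text above; they are just a convenient test case.

```python
from itertools import permutations

def permute(p, v):
    """Permutation p (tuple of images, 0-indexed) sends e_i to e_{p[i]}."""
    out = [0] * len(v)
    for i, coefficient in enumerate(v):
        out[p[i]] = coefficient
    return out

def dot(v, w):
    return sum(a * b for a, b in zip(v, w))

group = list(permutations(range(3)))
basis = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]

# The dot product is invariant: checking it on basis vectors is enough.
form_is_invariant = all(
    dot(permute(p, a), permute(p, b)) == dot(a, b)
    for p in group for a in basis for b in basis
)

# W is spanned by (1, 1, 1); its orthogonal complement is spanned by these two.
w_vector = [1, 1, 1]
complement_basis = [[1, -1, 0], [0, 1, -1]]

# Every group element sends the complement back into the complement.
complement_is_invariant = all(
    dot(permute(p, v), w_vector) == 0
    for p in group for v in complement_basis
)

print(form_is_invariant, complement_is_invariant)
```

Both flags come out true, so in this example the complement really is a submodule.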
So in the presence of an invariant form, all finite-dimensional representations are “completely reducible”. That is, they can be decomposed as the direct sum of a number of irreducible submodules. If the representation $V$ is irreducible to begin with, we’re done. If not, it must have some nontrivial submodule $W$. Then the orthogonal complement $W^\perp$ is also a submodule, and we can write $V=W\oplus W^\perp$. Then we can treat both $W$ and $W^\perp$ the same way. The process must eventually bottom out, since each of $W$ and $W^\perp$ has dimension smaller than that of $V$, which was finite to begin with. Each step brings the dimension down further and further, and it must stop by the time it reaches $1$.
This tells us, for instance, that there can be no inner product on the representing space that is invariant under the representation of the group of integers $\mathbb{Z}$ we laid out at the end of last time. Indeed, that was an example of a reducible representation that is not decomposable, but if there were an invariant form it would have to decompose.
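To see the failure concretely, here is a sketch under a loudly stated assumption: the representation of the integers is not written out in this section, so I take the standard example, the two-dimensional representation sending $n$ to the matrix with rows $(1,n)$ and $(0,1)$. If the representation meant above is a different one, the code illustrates the same phenomenon rather than that exact case.

```python
def rho(n):
    """Assumed representation of the integers: n -> [[1, n], [0, 1]]."""
    return [[1, n], [0, 1]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def apply(m, v):
    return [m[0][0] * v[0] + m[0][1] * v[1], m[1][0] * v[0] + m[1][1] * v[1]]

def parallel(v, w):
    """True when v and w span the same line (2x2 determinant vanishes)."""
    return v[0] * w[1] - v[1] * w[0] == 0

def line_is_invariant(v):
    """The integers are generated by 1, so checking rho(1) and rho(-1) suffices."""
    return parallel(apply(rho(1), v), v) and parallel(apply(rho(-1), v), v)

# The line spanned by (1, 0) is invariant, so the representation is reducible.
print(line_is_invariant([1, 0]))

# Any complementary line is spanned by some (a, 1); none of these is invariant.
print(any(line_is_invariant([a, 1]) for a in range(-10, 11)))
```

The determinant in `parallel` works out to $1$ for every vector of the form $(a,1)$, so the sample of integer values of `a` is only for show: no complementary line can be invariant, and hence no invariant inner product can exist for this representation.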
Today I’d like to cover a stronger condition than reducibility: decomposability. We say that a module $V$ is “decomposable” if we can write it as the direct sum $V=U\oplus W$ of two nontrivial submodules $U$ and $W$. The direct sum gives us inclusion morphisms from $U$ and $W$ into $V$, and so any decomposable module is reducible.
What does this look like in terms of matrices? Well, saying that $V=U\oplus W$ means that we can write any vector $v\in V$ uniquely as a sum $v=u+w$ with $u\in U$ and $w\in W$. Then if we have a basis $\{u_i\}_{i=1}^m$ of $U$ and a basis $\{w_j\}_{j=1}^n$ of $W$, then we can write $u$ and $w$ uniquely in terms of these basis vectors. Thus we can write any vector $v\in V$ uniquely in terms of the $\{u_1,\dots,u_m,w_1,\dots,w_n\}$, and so these constitute a basis of $V$.
If we write the matrices $\rho(g)$ in terms of this basis, we find that the image of any $u_i$ can be written in terms of the $u$ basis vectors alone because $U$ is $G$-invariant. Similarly, the $G$-invariance of $W$ tells us that the image of each $w_j$ can be written in terms of the $w$ basis vectors alone. The same reasoning as last time now allows us to conclude that the matrices of the $\rho(g)$ all have the form
$$\rho(g)=\begin{pmatrix}\alpha(g)&0\\0&\gamma(g)\end{pmatrix}$$
Conversely, if we can write each of the $\rho(g)$ in this form, then this gives us a decomposition of $V$ as the direct sum of two $G$-invariant subspaces, and the representation is decomposable.
Now, I said above that decomposability is stronger than reducibility. Indeed, in general there do exist modules which are reducible, but not decomposable. In categorical terms this is the statement that for some groups there are short exact sequences which do not split. To chase this down a little further, our work yesterday showed that even in the reducible case we have the equation $\gamma(gh)=\gamma(g)\gamma(h)$. This $\gamma$ is the representation of $G$ on the quotient space $V/W$, which gives our short exact sequence
$$\mathbf{0}\to W\to V\to V/W\to\mathbf{0}$$
But in general this sequence may not split; we may not be able to write $V\cong W\oplus V/W$ as $G$-modules. Indeed, we’ve seen a representation of the group of integers $\mathbb{Z}$ that is reducible but not decomposable.
We say that a module is “reducible” if it contains a nontrivial submodule. Thus our examples last time show that the left regular representation is always reducible, since it always contains a copy of the trivial representation as a nontrivial submodule. Notice that we have to be careful about what we mean by each use of “trivial” here.
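If the $n$-dimensional representation $V$ has a nontrivial $m$-dimensional submodule $W\subseteq V$ (so $m\neq0$ and $m\neq n$), then we can pick a basis $\{w_1,\dots,w_m\}$ of $W$. And then we know that we can extend this to a basis for all of $V$: $\{w_1,\dots,w_m,v_{m+1},\dots,v_n\}$.
Now since $W$ is a $G$-invariant subspace of $V$, we find that for any vector $w\in W$ and $g\in G$ the image $\rho(g)w$ is again a vector in $W$, and can be written out in terms of the $w_i$ basis vectors. In particular, we find $\rho(g)w_i=\sum_{j=1}^m\alpha_{ji}(g)w_j$, and all the coefficients of $v_{m+1}$ through $v_n$ are zero. That is, the matrix of $\rho(g)$ has the following form:
$$\rho(g)=\begin{pmatrix}\alpha(g)&\beta(g)\\0&\gamma(g)\end{pmatrix}$$
where $\alpha(g)$ is an $m\times m$ matrix, $\beta(g)$ is an $m\times(n-m)$ matrix, and $\gamma(g)$ is an $(n-m)\times(n-m)$ matrix. And, in fact, this same form holds for all $g\in G$. We can use the rule for block-multiplying matrices to find:
$$\begin{pmatrix}\alpha(gh)&\beta(gh)\\0&\gamma(gh)\end{pmatrix}=\rho(gh)=\rho(g)\rho(h)=\begin{pmatrix}\alpha(g)\alpha(h)&\alpha(g)\beta(h)+\beta(g)\gamma(h)\\0&\gamma(g)\gamma(h)\end{pmatrix}$$
and we see that $\alpha(g)$ actually provides us with the matrix for the representation we get when restricting to the submodule $W$. This shows us that the converse is also true: if we can find a basis for $V$ so that the matrix $\rho(g)$ has the above form for every $g\in G$, then the subspace spanned by the first $m$ basis vectors is $G$-invariant, and so it gives us a subrepresentation.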
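As an example, consider the defining representation $V$ of $S_3$, which is a permutation representation arising from the action of $S_3$ on the set $\{1,2,3\}$. This representation comes with the standard basis $\{\mathbf{1},\mathbf{2},\mathbf{3}\}$, and it’s easy to see that every permutation leaves the vector $\mathbf{1}+\mathbf{2}+\mathbf{3}$, along with the subspace $W$ that it spans, fixed. Thus $W$ carries a copy of the trivial representation as a submodule of $V$. We can take the given vector as a basis of $W$ and throw in two others to get a new basis for $V$, for instance $\{\mathbf{1}+\mathbf{2}+\mathbf{3},\mathbf{2},\mathbf{3}\}$.
Now we can take a permutation, say $(1\,2)$, and calculate its action in terms of the new basis:
$$\begin{aligned}(1\,2)(\mathbf{1}+\mathbf{2}+\mathbf{3})&=\mathbf{1}+\mathbf{2}+\mathbf{3}\\(1\,2)\mathbf{2}&=\mathbf{1}=(\mathbf{1}+\mathbf{2}+\mathbf{3})-\mathbf{2}-\mathbf{3}\\(1\,2)\mathbf{3}&=\mathbf{3}\end{aligned}$$
The others all work similarly. Then we can write these out as matrices, whose columns hold the coordinates of the images of the new basis vectors:
$$\begin{aligned}\rho(e)&=\begin{pmatrix}1&0&0\\0&1&0\\0&0&1\end{pmatrix}&\rho((1\,2))&=\begin{pmatrix}1&1&0\\0&-1&0\\0&-1&1\end{pmatrix}&\rho((1\,3))&=\begin{pmatrix}1&0&1\\0&1&-1\\0&0&-1\end{pmatrix}\\\rho((2\,3))&=\begin{pmatrix}1&0&0\\0&0&1\\0&1&0\end{pmatrix}&\rho((1\,2\,3))&=\begin{pmatrix}1&0&1\\0&0&-1\\0&1&-1\end{pmatrix}&\rho((1\,3\,2))&=\begin{pmatrix}1&1&0\\0&-1&1\\0&-1&0\end{pmatrix}\end{aligned}$$
Notice that these all have the required form:
$$\rho(g)=\begin{pmatrix}1&\beta(g)\\0&\gamma(g)\end{pmatrix}$$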
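These change-of-basis computations are easy to get wrong by hand, so here is a sketch that redoes them mechanically. It assumes the new basis is the all-ones vector followed by the second and third standard basis vectors (my choice for illustration; any two extra vectors completing a basis would do), and it stores a permutation as a 0-indexed tuple of images.

```python
from itertools import permutations

def compose(p, q):
    """Apply q first, then p."""
    return tuple(p[q[i]] for i in range(len(q)))

def coords_of_standard_vector(k):
    """Coordinates of e_k in the assumed basis (e_1 + e_2 + e_3, e_2, e_3)."""
    if k == 0:
        return [1, -1, -1]          # e_1 = (e_1 + e_2 + e_3) - e_2 - e_3
    coords = [0, 0, 0]
    coords[k] = 1
    return coords

def matrix_in_new_basis(p):
    """Columns are the images of the new basis vectors under p."""
    columns = [
        [1, 0, 0],                          # the all-ones vector is fixed
        coords_of_standard_vector(p[1]),    # image of e_2
        coords_of_standard_vector(p[2]),    # image of e_3
    ]
    return [[columns[j][i] for j in range(3)] for i in range(3)]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

group = list(permutations(range(3)))
matrices = {p: matrix_in_new_basis(p) for p in group}

# Block form: below the top-left entry, the first column is zero.
has_block_form = all(m[1][0] == 0 and m[2][0] == 0 for m in matrices.values())
print(has_block_form)
print(matrices[(1, 0, 2)])   # the transposition swapping the first two letters
```

The zero entries below the top-left corner are the matrix shadow of the invariant line, and the $1$ in the corner is the trivial representation it carries.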
Representations that are not reducible — those modules that have no nontrivial submodules — are called “irreducible representations”, or sometimes “irreps” for short. They’re also called “simple” modules, using the general term from category theory for an object with no nontrivial subobjects.
Fancy words: a submodule is a subobject in the category of group representations. What this means is that if $V$ and $W$ are $G$-modules, and if we have an injective morphism of $G$-modules $\iota:W\to V$, then we say that $W$ is a “submodule” of $V$. And, just to be clear, a $G$-morphism is injective if and only if it’s injective as a linear map; its kernel is zero. We call $\iota$ the “inclusion map” of the submodule.
In practice, we often identify a $G$-submodule with the image of its inclusion map. We know from general principles that since $\iota$ is injective, $W$ is isomorphic to its image, so this isn’t really a big difference. What we can tell, though, is that the action of $G$ sends the image back into itself.
That is, let’s say that $v=\iota(w)$ is the image of some vector $w\in W$. I say that for any group element $g$, acting by $g$ on $v$ gives us some other vector that’s also in the image of $\iota$. Indeed, we check that
$$gv=g\iota(w)=\iota(gw)$$
which is again in the image of $\iota$, as asserted. We say that the image of $\iota$ is “$G$-invariant”.
The flip side of this is that any time we find such a $G$-invariant subspace of $V$, it gives us a submodule. That is, if $(V,\rho_V)$ is a $G$-module, and $W\subseteq V$ is a $G$-invariant subspace, then we can define a new representation on $W$ by restriction: $\rho_W(g)=\rho_V(g)|_W$. The inclusion map that takes any vector $w\in W$ and considers it as a vector in $V$ clearly intertwines the original action $\rho_V$ and the restricted action $\rho_W$, and its kernel is trivial. Thus $W$ constitutes a $G$-submodule.
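As an example, let $V=\mathbb{C}[G]$ carry the left regular representation of a finite group $G$, and let $W$ be the one-dimensional subspace spanned by the vector $v=\sum_{g\in G}\mathbf{g}$. That is, $W$ consists of all vectors $\sum_{g\in G}c_g\mathbf{g}$ for which all the coefficients $c_g$ are equal. I say that this subspace $W$ is $G$-invariant. Indeed, for any $h\in G$ we calculate
$$hv=h\sum_{g\in G}\mathbf{g}=\sum_{g\in G}\mathbf{hg}$$
But this last sum runs through all the elements of $G$, just in a different order. That is, $hv=v$, and so $W$ carries the one-dimensional trivial representation of $G$. That is, we’ve found a copy of the trivial representation of $G$ as a submodule of the left regular representation.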
As another example, let $G=S_n$ be one of the symmetric groups. Again, let $V=\mathbb{C}[S_n]$ carry the left regular representation, but now let $W$ be the one-dimensional space spanned by
$$\sum_{\pi\in S_n}\mathrm{sgn}(\pi)\boldsymbol{\pi}$$
It’s a straightforward exercise to show that $W$ is a one-dimensional submodule carrying a copy of the signum representation.
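For anyone who would rather let a machine do the exercise in a small case, here is a sketch for the symmetric group on three letters. The encoding is my own: a permutation is a 0-indexed tuple of images, and a vector in the group algebra is a dictionary from group elements to coefficients.

```python
from itertools import permutations

def compose(p, q):
    """Apply q first, then p."""
    return tuple(p[q[i]] for i in range(len(q)))

def sign(p):
    """Sign of a permutation, computed by counting inversions."""
    inversions = sum(1 for i in range(len(p))
                     for j in range(i + 1, len(p)) if p[i] > p[j])
    return -1 if inversions % 2 else 1

def act(h, vector):
    """Left regular action: h sends the basis vector g to the basis vector hg."""
    return {compose(h, g): c for g, c in vector.items()}

def scale(c, vector):
    return {g: c * coefficient for g, coefficient in vector.items()}

group = list(permutations(range(3)))
all_ones = {g: 1 for g in group}        # the sum of all group elements
signed = {g: sign(g) for g in group}    # the sum weighted by signs

# Trivial representation: every h fixes the all-ones vector.
print(all(act(h, all_ones) == all_ones for h in group))

# Signum representation: every h scales the signed vector by sgn(h).
print(all(act(h, signed) == scale(sign(h), signed) for h in group))
```

Both checks succeed, matching the two one-dimensional submodules described above.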
Every $G$-module $V$ contains two obvious submodules: the zero subspace $\mathbf{0}$ and the entire space $V$ itself are both clearly $G$-invariant. We call these submodules “trivial”, and all others “nontrivial”.
Since every representation of $G$ is a $G$-module, we have an obvious notion of a morphism between them. But let’s be explicit about it.
A $G$-morphism from a $G$-module $(V,\rho_V)$ to another $G$-module $(W,\rho_W)$ is a linear map $T:V\to W$ between the vector spaces $V$ and $W$ that commutes with the actions of $G$. That is, for every $g\in G$ we have $\rho_W(g)\circ T=T\circ\rho_V(g)$. Even more explicitly, if $g\in G$ and $v\in V$, then
$$\rho_W(g)\big(T(v)\big)=T\big(\rho_V(g)v\big)$$
We can also express this with a commutative diagram:
$$\begin{array}{ccc}V&\xrightarrow{\;T\;}&W\\\Big\downarrow{\scriptstyle\rho_V(g)}&&\Big\downarrow{\scriptstyle\rho_W(g)}\\V&\xrightarrow{\;T\;}&W\end{array}$$
For each group element $g\in G$ our representations give us vertical arrows $\rho_V(g):V\to V$ and $\rho_W(g):W\to W$. The linear map $T$ provides horizontal arrows $T:V\to W$. To say that the diagram “commutes” means that if we compose the arrows along the top and right to get a linear map from $V$ to $W$, and if we compose the arrows along the left and bottom to get another, we’ll find that we actually get the same function. In other words, if we start with a vector $v\in V$ in the upper-left and move it by the arrows around either side of the square to get to a vector in $W$, we’ll get the same result on each side. We get one of these diagrams (one of these equations) for each $g\in G$, and they must all commute for $T$ to be a $G$-morphism.
Another common word that comes up in these contexts is “intertwine”, as in saying that the map $T$ “intertwines” the representations $\rho_V$ and $\rho_W$, or that it is an “intertwinor” for the representations. This language goes back towards the viewpoint that takes the representing functions $\rho_V$ and $\rho_W$ to be fundamental, while $G$-morphism tends to be more associated with the viewpoint emphasizing the representing spaces $V$ and $W$.
If, as will usually be the case for the time being, we have a presentation of our group by generators and relations, then we’ll only need to check that $T$ intertwines the actions of the generators. Indeed, if $T$ intertwines the actions of $g$ and $h$, then it intertwines the actions of $gh$. We can see this in terms of diagrams by stacking the diagram for $h$ on top of the diagram for $g$. In terms of equations, we check that
$$\rho_W(gh)T=\rho_W(g)\rho_W(h)T=\rho_W(g)T\rho_V(h)=T\rho_V(g)\rho_V(h)=T\rho_V(gh)$$
So if we’re given a set of generators and we can write every group element as a finite product of these generators, then as soon as we check that the intertwining equation holds for the generators we know it will hold for all group elements.
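As a quick illustration of checking only generators, here is a sketch using an example of my own: the symmetric group on three letters acting by permutation matrices on both sides, with the all-ones matrix as a candidate intertwinor and a diagonal matrix as a candidate that should fail. The generators chosen are one transposition and one three-cycle.

```python
from itertools import permutations

def perm_matrix(p):
    """Permutation matrix sending e_j to e_{p[j]} (p is a 0-indexed tuple)."""
    n = len(p)
    return [[1 if p[j] == i else 0 for j in range(n)] for i in range(n)]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def intertwines(t, rho_v, rho_w, elements):
    """Check rho_w(g) t == t rho_v(g) for each listed group element g."""
    return all(matmul(rho_w(g), t) == matmul(t, rho_v(g)) for g in elements)

group = list(permutations(range(3)))
generators = [(1, 0, 2), (1, 2, 0)]     # a transposition and a three-cycle

all_ones = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]
diagonal = [[1, 0, 0], [0, 2, 0], [0, 0, 3]]

# Passing on the generators predicts passing on the whole group.
print(intertwines(all_ones, perm_matrix, perm_matrix, generators))
print(intertwines(all_ones, perm_matrix, perm_matrix, group))

# A map that is not an intertwinor already fails on a generator.
print(intertwines(diagonal, perm_matrix, perm_matrix, generators))
```

The generator check is two matrix comparisons instead of six here; the saving grows quickly for larger groups.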
There are also deep connections between $G$-morphisms and natural transformations, in the categorical viewpoint. Those who are really interested in that can dig into the archives a bit.
Next up is a family of interesting representations that are also applicable to any group $G$. The main ingredient is a subgroup $H\subseteq G$: a subset of the elements of $G$ so that the inverse of any element in $H$ is also in $H$, and the product of any two elements of $H$ is also in $H$.
Our next step is to use $H$ to break $G$ up into cosets. We consider $g_1$ and $g_2$ to be equivalent if $g_2^{-1}g_1\in H$. It’s easy to check that this is actually an equivalence relation (reflexive, symmetric, and transitive), and so it breaks $G$ up into equivalence classes. We write the coset of all $g'$ that are equivalent to $g$ as $gH$, and we write the collection of all cosets of $H$ as $G/H$.
We should note that we don’t need to worry about $H$ being a normal subgroup of $G$, since we only care about the set of cosets. We aren’t trying to make this set into a group (the quotient group) here.
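Let’s work out an example to see this a bit more explicitly. For our group, take the symmetric group $S_3$, and for our subgroup let $H=\{e,(2\,3)\}$. Indeed, $H$ is closed under both inversion and multiplication. And we can break $S_3$ up into cosets (composing permutations from right to left):
$$S_3=\{e,(2\,3)\}\uplus\{(1\,2),(1\,2\,3)\}\uplus\{(1\,3),(1\,3\,2)\}=H\uplus(1\,2)H\uplus(1\,3)H$$
where we have picked a “transversal”: one representative of each coset, so that we can write them down more simply. It doesn’t matter whether we write $(1\,2)H$ or $(1\,2\,3)H$, since both are really the same set. Now we can write down the multiplication table for the group action. It takes $g\in S_3$ and a coset $xH$, and tells us which coset $gx$ falls in:
$$\begin{array}{c|ccc}&H&(1\,2)H&(1\,3)H\\\hline e&H&(1\,2)H&(1\,3)H\\(1\,2)&(1\,2)H&H&(1\,3)H\\(1\,3)&(1\,3)H&(1\,2)H&H\\(2\,3)&H&(1\,3)H&(1\,2)H\\(1\,2\,3)&(1\,2)H&(1\,3)H&H\\(1\,3\,2)&(1\,3)H&H&(1\,2)H\end{array}$$
This is our group action. Since there are three elements in the set $S_3/H$, the permutation representation we get will be three-dimensional. We can write down all the matrices just by working them out from this multiplication table:
$$\begin{aligned}\rho(e)&=\begin{pmatrix}1&0&0\\0&1&0\\0&0&1\end{pmatrix}&\rho((1\,2))&=\begin{pmatrix}0&1&0\\1&0&0\\0&0&1\end{pmatrix}&\rho((1\,3))&=\begin{pmatrix}0&0&1\\0&1&0\\1&0&0\end{pmatrix}\\\rho((2\,3))&=\begin{pmatrix}1&0&0\\0&0&1\\0&1&0\end{pmatrix}&\rho((1\,2\,3))&=\begin{pmatrix}0&0&1\\1&0&0\\0&1&0\end{pmatrix}&\rho((1\,3\,2))&=\begin{pmatrix}0&1&0\\0&0&1\\1&0&0\end{pmatrix}\end{aligned}$$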
It turns out that these matrices are the same as we saw when writing down the defining representation of $S_3$. There’s a reason for this, which we will examine later.
As special cases, if $H=\{e\}$, then there is one coset for each element of $G$, and the coset representation is the same as the left regular representation. At the other extreme, if $H=G$, then there is only one coset and we get the trivial representation.
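The coset construction can also be carried out mechanically. The sketch below assumes a specific example of my choosing: the symmetric group on three letters with the two-element subgroup that fixes the first letter. Permutations are 0-indexed tuples of images and are composed right to left; all function names are invented for this illustration.

```python
from itertools import permutations

def compose(p, q):
    """Apply q first, then p."""
    return tuple(p[q[i]] for i in range(len(q)))

group = list(permutations(range(3)))
subgroup = [(0, 1, 2), (0, 2, 1)]       # assumed H: identity and one transposition

def coset(g):
    """The left coset gH as a set of group elements."""
    return frozenset(compose(g, h) for h in subgroup)

# Collect the distinct cosets in the order they are first met.
cosets = []
for g in group:
    if coset(g) not in cosets:
        cosets.append(coset(g))

def coset_matrix(g):
    """Matrix of g acting on the cosets: column j has a 1 in the row of g * (coset j)."""
    n = len(cosets)
    matrix = [[0] * n for _ in range(n)]
    for j, c in enumerate(cosets):
        representative = next(iter(c))  # any representative gives the same coset
        i = cosets.index(coset(compose(g, representative)))
        matrix[i][j] = 1
    return matrix

def perm_matrix(p):
    """Defining representation: e_j is sent to e_{p[j]}."""
    return [[1 if p[j] == i else 0 for j in range(3)] for i in range(3)]

print(len(cosets))
print(all(coset_matrix(g) == perm_matrix(g) for g in group))
```

With this ordering of the cosets, the coset matrices agree entry for entry with the permutation matrices of the defining representation. A different subgroup or ordering would give matrices that agree only after relabeling.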
Now it comes time to introduce what’s probably the most important representation of any group, the “left regular representation”. This arises because any group $G$ acts on itself by left-multiplication. That is, we have a function $G\times G\to G$ given by $(g,h)\mapsto gh$. Indeed, this is an action because first $eh=h$; and second $g_1(g_2h)=g_1g_2h$, and $(g_1g_2)h=g_1g_2h$ as well.
So, as with any group action on a finite set, we get a finite-dimensional permutation representation. The representing space $\mathbb{C}[G]$ has a standard basis corresponding to the elements of $G$. That is, to every element $g\in G$ we have a basis vector $\mathbf{g}$. But we can recognize this as the standard basis of the group algebra $\mathbb{C}[G]$. That is, the group algebra itself carries a representation.
Of course, this shouldn’t really surprise us. After all, representations of $G$ are equivalent to modules for the group algebra; and the very fact that $\mathbb{C}[G]$ is an algebra means that it comes with a bilinear function $\mathbb{C}[G]\times\mathbb{C}[G]\to\mathbb{C}[G]$, which makes it into a module over itself.
We should note that since this is the left regular representation, there is also such a thing as the right regular representation, which arises from the action of $G$ on itself by multiplication on the right. But by itself right-multiplication doesn’t really give an action, because it reverses the order of multiplication. Indeed, for a group action as we’ve defined it first acting by $g_2$ and then acting by $g_1$ is the same as acting by the product $g_1g_2$. But if we first multiply on the right by $g_2$ and then by $g_1$ we get $hg_2g_1$, which is the same as acting by $g_2g_1$. The order has been reversed.
To compensate for this, we define the right regular representation by the function $g\cdot h=hg^{-1}$. Then $g_1\cdot(g_2\cdot h)=g_1\cdot(hg_2^{-1})=hg_2^{-1}g_1^{-1}$, and $(g_1g_2)\cdot h=h(g_1g_2)^{-1}=hg_2^{-1}g_1^{-1}$ as well.
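As an exercise, let’s work out the matrices of the left regular representation for the cyclic group $\mathbb{Z}_4$ with respect to its standard basis. We have four elements in this group: $g^0=e$, $g^1$, $g^2$, and $g^3$. Thus the regular representation will be four-dimensional, and we will index the rows and columns of our matrices by the exponents $0$, $1$, $2$, and $3$. Then in the matrix $\rho(g^k)$ the entry in the $i$th row and $j$th column is $\delta_{g^kg^j,g^i}$. The multiplication rule tells us that $g^kg^j=g^{k+j}$, where the exponent is defined up to a multiple of four, and so the matrix entry is $1$ if $i\equiv j+k\pmod 4$, and $0$ otherwise. That is:
$$\begin{aligned}\rho(g^0)&=\begin{pmatrix}1&0&0&0\\0&1&0&0\\0&0&1&0\\0&0&0&1\end{pmatrix}&\rho(g^1)&=\begin{pmatrix}0&0&0&1\\1&0&0&0\\0&1&0&0\\0&0&1&0\end{pmatrix}\\\rho(g^2)&=\begin{pmatrix}0&0&1&0\\0&0&0&1\\1&0&0&0\\0&1&0&0\end{pmatrix}&\rho(g^3)&=\begin{pmatrix}0&1&0&0\\0&0&1&0\\0&0&0&1\\1&0&0&0\end{pmatrix}\end{aligned}$$
You can check for yourself that these matrices indeed give a representation of the cyclic group.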
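If you would rather have that check done for you, here is a minimal sketch. It builds each matrix directly from the rule that the entry in row $i$ and column $j$ is $1$ exactly when $i\equiv j+k$ modulo four, and then tests the homomorphism property. The function names are mine, not part of any library.

```python
def regular_matrix(k, n=4):
    """Matrix of the k-th power of the generator in the left regular
    representation of the cyclic group of order n."""
    return [[1 if i == (j + k) % n else 0 for j in range(n)] for i in range(n)]

def matmul(a, b):
    return [[sum(a[i][m] * b[m][j] for m in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

# Homomorphism property: the matrix of a product is the product of the matrices.
is_representation = all(
    matmul(regular_matrix(a), regular_matrix(b)) == regular_matrix((a + b) % 4)
    for a in range(4) for b in range(4)
)

print(is_representation)
for row in regular_matrix(1):
    print(row)
```

The same function works for any cyclic group by changing `n`, since nothing in the rule is special to four.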