The Unapologetic Mathematician

Mathematics for the interested outsider

Schur’s Lemma

Now that we know that images and kernels of G-morphisms between G-modules are G-modules as well, we can bring in a very general result.

Remember that we call a G-module irreducible or “simple” if it has no nontrivial submodules. In general, an object in any category is simple if it has no nontrivial subobjects. If a morphism in a category has a kernel and an image — as we’ve seen all G-morphisms do — then these are subobjects of the source and target objects.

So now we have everything we need to state and prove Schur’s lemma. Working in a category where every morphism has both a kernel and an image, if f:V\to W is a morphism between two simple objects, then either f is an isomorphism or it’s the zero morphism from V to W. Indeed, since V is simple it has no nontrivial subobjects. The kernel of f is a subobject of V, so it must either be V itself, or the zero object. Similarly, the image of f must either be W itself or the zero object. If either \mathrm{Ker}(f)=V or \mathrm{Im}(f)=\mathbf{0} then f is the zero morphism. On the other hand, if \mathrm{Ker}(f)=\mathbf{0} and \mathrm{Im}(f)=W we have an isomorphism.

To see how this works in the case of G-modules, every time I say “object” in the preceding paragraph replace it by “G-module”. Morphisms are G-morphisms, the zero morphism is the linear map sending every vector to 0, and the zero object is the trivial vector space \mathbf{0}. If it feels more comfortable, walk through the preceding proof making the required substitutions to see how it works for G-modules.

In terms of matrix representations, let’s say X and Y are two irreducible matrix representations of G, and let T be any matrix so that TX(g)=Y(g)T for all g\in G. Then Schur’s lemma tells us that either T is invertible — it’s the matrix of an isomorphism — or it’s the zero matrix.
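To make the matrix statement concrete, here is a small brute-force sketch in Python. It takes X=Y to be a two-dimensional irreducible representation of S_3, given by integer matrices for the generators (1\,2) and (1\,2\,3). The particular matrices, the helper names, and the search range \{-2,\dots,2\} are my own illustrative choices, not anything fixed by the lemma. Every T it finds that satisfies TX(g)=X(g)T on the generators is either the zero matrix or invertible.

```python
from itertools import product

# Assumed example data: matrices of the generators (1 2) and (1 2 3) of S_3
# in a two-dimensional irreducible representation, written with respect to
# a non-orthonormal basis so that all entries are integers.
X_SWAP = ((-1, 0), (-1, 1))
X_CYCLE = ((0, -1), (1, -1))


def matmul(a, b):
    """Multiply two square matrices stored as tuples of row tuples."""
    n = len(a)
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n))
        for i in range(n)
    )


def intertwines(t, generators):
    """True if T X(g) = X(g) T for every listed generator matrix."""
    return all(matmul(t, x) == matmul(x, t) for x in generators)


def det2(t):
    """Determinant of a 2x2 matrix."""
    return t[0][0] * t[1][1] - t[0][1] * t[1][0]


ZERO = ((0, 0), (0, 0))

# Search all 2x2 matrices with entries in {-2, ..., 2}.
solutions = [
    ((a, b), (c, d))
    for a, b, c, d in product(range(-2, 3), repeat=4)
    if intertwines(((a, b), (c, d)), (X_SWAP, X_CYCLE))
]

for t in solutions:
    label = "zero" if t == ZERO else f"invertible, det = {det2(t)}"
    print(t, label)
```

A finite search like this proves nothing, of course, but it shows the dichotomy in action: nothing "in between" zero and invertible survives the intertwining condition.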

September 30, 2010 | Algebra, Category theory, Group theory, Representation Theory

Images and Kernels

A nice quick one today. Let’s take two G-modules V and W. We’ll write \hom_G(V,W) for the vector space of intertwinors from V to W. This is pretty appropriate because these are the morphisms in the category of G-modules. It turns out that this category has kernels and has images. The general definitions of those two notions are pretty technical, so we’ll talk in more down-to-earth terms.

Any intertwinor f\in\hom_G(V,W) is first and foremost a linear map f:V\to W. And as usual the kernel of f is the subspace \mathrm{Ker}(f)\subseteq V of vectors v for which f(v)=0. I say that this isn’t just a subspace of V, but it’s a submodule as well. That is, \mathrm{Ker}(f) is an invariant subspace of V. Indeed, we check that if v\in\mathrm{Ker}(f) and g is any element of G, then f(gv)=gf(v)=g0=0, so gv\in\mathrm{Ker}(f) as well.

Similarly, as usual the image of f is the subspace \mathrm{Im}(f)\subseteq W of vectors w for which there’s some v\in V with f(v)=w. And again I say that this is an invariant subspace. Indeed, if w=f(v)\in\mathrm{Im}(f) and g is any element of G, then gw=gf(v)=f(gv)\in\mathrm{Im}(f) as well.

Thus these images and kernels are not just subspaces of the vector spaces V and W, but submodules to boot. That is, they can act as images and kernels in the category of G-modules just like they do in the category of complex vector spaces.
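Here is a minimal Python sketch of both checks on one example. The example itself is my own assumption: S_3 acts on \mathbb{C}^3 by permuting coordinates, and the sample intertwinor f sends v to the vector all of whose entries equal the sum of the entries of v. The function names are hypothetical.

```python
from itertools import permutations


def act(p, v):
    """Permutation p (p[i] is the image of i) acting on a coordinate vector:
    the basis vector e_i is sent to e_{p[i]}."""
    out = [0] * len(v)
    for i, coeff in enumerate(v):
        out[p[i]] += coeff
    return tuple(out)


def f(v):
    """Sample linear map: send v to (s, s, s), where s is the sum of its entries."""
    s = sum(v)
    return (s,) * len(v)


def in_kernel(v):
    """v lies in Ker(f) exactly when f(v) is the zero vector."""
    return all(x == 0 for x in f(v))


def in_image(w):
    """Im(f) is the line of vectors whose coordinates are all equal."""
    return len(set(w)) == 1


GROUP = list(permutations(range(3)))
STANDARD_BASIS = [(1, 0, 0), (0, 1, 0), (0, 0, 1)]
KERNEL_BASIS = [(1, -1, 0), (0, 1, -1)]
IMAGE_BASIS = [(1, 1, 1)]

# f commutes with the action on a basis, hence everywhere by linearity.
is_intertwinor = all(
    f(act(g, e)) == act(g, f(e)) for g in GROUP for e in STANDARD_BASIS
)
# Acting on kernel vectors stays in the kernel; likewise for the image.
kernel_invariant = all(in_kernel(act(g, v)) for g in GROUP for v in KERNEL_BASIS)
image_invariant = all(in_image(act(g, w)) for g in GROUP for w in IMAGE_BASIS)

print(is_intertwinor, kernel_invariant, image_invariant)
```

Checking invariance on a basis of the kernel and of the image is enough, since the action is linear.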

September 29, 2010 | Algebra, Group theory, Representation Theory

Maschke’s Theorem

Maschke’s theorem is a fundamental result that will make our project of understanding the representation theory of finite groups — and of symmetric groups in particular — far simpler. It tells us that every representation of a finite group is completely reducible.

We saw last time that in the presence of an invariant form, any reducible representation is decomposable, and so any representation with an invariant form is completely reducible. Maschke’s theorem works by showing that there is always an invariant form!

Let’s start by picking any form whatsoever. We know that we can do this by picking a basis \{e_i\} of V and declaring it to be orthonormal. We don’t need anything fancy like Gram-Schmidt, which is used to find orthonormal bases for a given inner product. No, we just define our inner product by saying that \langle e_i,e_j\rangle=\delta_{i,j} (the Kronecker delta, with value 1 when its indices are the same and 0 otherwise) and extend the only way we can. If we have v=\sum v^ie_i and w=\sum w^je_j then we find

\displaystyle\begin{aligned}\langle v,w\rangle&=\left\langle\sum\limits_{i=1}^{\dim(V)}v^ie_i,\sum\limits_{j=1}^{\dim(V)}w^je_j\right\rangle\\&=\sum\limits_{i=1}^{\dim(V)}\sum\limits_{j=1}^{\dim(V)}\left\langle v^ie_i,w^je_j\right\rangle\\&=\sum\limits_{i=1}^{\dim(V)}\sum\limits_{j=1}^{\dim(V)}\overline{v^i}w^j\left\langle e_i,e_j\right\rangle\\&=\sum\limits_{i=1}^{\dim(V)}\sum\limits_{j=1}^{\dim(V)}\overline{v^i}w^j\delta_{i,j}\\&=\sum\limits_{i=1}^{\dim(V)}\overline{v^i}w^i\end{aligned}

so this does uniquely define an inner product. But there’s no reason at all to believe it’s G-invariant.

We will use this arbitrary form to build an invariant form by a process of averaging. For any vectors v and w, define

\displaystyle\langle v,w\rangle_G=\sum\limits_{g\in G}\langle gv,gw\rangle

Showing that this satisfies the definition of an inner product is a straightforward exercise. As for invariance, we want to show that for any h\in G we have \langle hv,hw\rangle_G=\langle v,w\rangle_G. Indeed:

\displaystyle\begin{aligned}\langle hv,hw\rangle_G&=\sum\limits_{g\in G}\langle ghv,ghw\rangle\\&=\sum\limits_{k\in G}\langle kv,kw\rangle\\&=\langle v,w\rangle_G\end{aligned}

where the essential second equality follows because as g ranges over G, the product k=gh ranges over G as well, just in a different order.

And so we conclude that if V is a representation of G then we can take any inner product whatsoever on V and “average” it to obtain an invariant form. Then with this invariant form in hand, we know that V is completely reducible.
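The averaging process is easy to try out by machine. The sketch below is only an illustration under my own assumptions: it uses a two-dimensional representation of S_3 written in a basis that is not orthonormal for any invariant form, with integer matrices for the generators (1\,2) and (1\,2\,3), and it stores a form by its Gram matrix A, so that \langle v,w\rangle=v^TAw. Everything is real here, so complex conjugation plays no role.

```python
IDENTITY = ((1, 0), (0, 1))

# Assumed example data: generators (1 2) and (1 2 3) of S_3 acting on a
# two-dimensional space, in a basis chosen so the entries are integers.
GENERATORS = [((-1, 0), (-1, 1)), ((0, -1), (1, -1))]


def matmul(a, b):
    """Multiply two square matrices stored as tuples of row tuples."""
    n = len(a)
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n))
        for i in range(n)
    )


def transpose(a):
    """Transpose of a matrix stored as a tuple of row tuples."""
    return tuple(zip(*a))


def generate_group(generators):
    """Close the generators under multiplication (a finite group is assumed)."""
    elements = {IDENTITY}
    frontier = [IDENTITY]
    while frontier:
        g = frontier.pop()
        for s in generators:
            h = matmul(s, g)
            if h not in elements:
                elements.add(h)
                frontier.append(h)
    return elements


def pull_back(gram, g):
    """Gram matrix of the form (v, w) -> <gv, gw>, namely g^T A g for real g."""
    return matmul(transpose(g), matmul(gram, g))


def sum_over_group(gram, group):
    """Gram matrix of <v, w>_G, the sum of <gv, gw> over all g in the group."""
    n = len(gram)
    total = [[0] * n for _ in range(n)]
    for g in group:
        term = pull_back(gram, g)
        for i in range(n):
            for j in range(n):
                total[i][j] += term[i][j]
    return tuple(tuple(row) for row in total)


GROUP = generate_group(GENERATORS)
start = IDENTITY  # declare the given basis orthonormal
averaged = sum_over_group(start, GROUP)

print("starting form invariant:", all(pull_back(start, g) == start for g in GROUP))
print("averaged form:", averaged)
print("averaged form invariant:", all(pull_back(averaged, g) == averaged for g in GROUP))
```

The starting form is not invariant, but the summed one is, exactly because multiplying by a fixed h just reorders the terms of the sum.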

Why doesn’t this work for our counterexample representation of \mathbb{Z}? Because the group \mathbb{Z} is infinite, and so the averaging process breaks down. This approach only works for finite groups, where the average over all g\in G only involves a finite sum.

September 28, 2010 | Algebra, Group theory, Representation Theory

Invariant Forms

A very useful structure to have on a complex vector space V carrying a representation \rho of a group G is an “invariant form”. To start with, this is a complex inner product (v,w)\mapsto\langle v,w\rangle, which we recall means that it is

  • linear in the second slot — \langle u,av+bw\rangle=a\langle u,v\rangle+b\langle u,w\rangle
  • conjugate symmetric — \langle v,w\rangle=\overline{\langle w,v\rangle}
  • positive definite — \langle v,v\rangle>0 for all v\neq0

Again as usual these imply conjugate linearity in the first slot, so the form isn’t quite bilinear. Still, people are often sloppy and say “invariant bilinear form”.

Anyhow, now we add a new condition to the form. We demand that it be

  • invariant under the action of G: \langle gv,gw\rangle=\langle v,w\rangle

Here I have started to write gv as shorthand for \rho(g)v. We will only do this when the representation in question is clear from the context.

The inner product gives us a notion of length and angle. Invariance now tells us that these notions are unaffected by the action of G. That is, the vectors v and gv have the same length for all v\in V and g\in G. Similarly, the angle between vectors v and w is exactly the same as the angle between gv and gw. Another way to say this is that if the form B is invariant for the representation \rho:G\to GL(V), then the image of \rho is actually contained in the orthogonal group [commenter Eric Finster, below, reminds me that since we’ve got a complex inner product we’re using the group of unitary transformations with respect to the inner product B: \rho:G\to U(V,B)].

More important than any particular invariant form is this: if we have an invariant form on our space V, then any reducible representation is decomposable. That is, if W\subseteq V is a submodule, we can find another submodule U\subseteq V so that V=U\oplus W as G-modules.

If we just consider them as vector spaces, we already know this: the orthogonal complement W^\perp=\left\{v\in V\vert\forall w\in W,\langle v,w\rangle=0\right\} is exactly the subspace we need, for V=W\oplus W^\perp. I say that if W is a G-invariant subspace of V, then W^\perp is as well, and so they are both submodules. Indeed, if v\in W^\perp, then we check that gv is as well:

\displaystyle\begin{aligned}\langle gv,w\rangle&=\langle g^{-1}gv,g^{-1}w\rangle\\&=\langle v,g^{-1}w\rangle\\&=0\end{aligned}

where the first equality follows from the G-invariance of our form; the second from the representation property; and the third from the fact that W is an invariant subspace, so g^{-1}w\in W.

So in the presence of an invariant form, all finite-dimensional representations are “completely reducible”. That is, they can be decomposed as the direct sum of a number of irreducible submodules. If the representation V is irreducible to begin with, we’re done. If not, it must have some nontrivial submodule W. Then the orthogonal complement W^\perp is also a submodule, and we can write V=W\oplus W^\perp. Then we can treat both W and W^\perp the same way. The process must eventually bottom out, since each of W and W^\perp has dimension smaller than that of V, which was finite to begin with. Each step brings the dimension down further and further, and it must stop by the time it reaches 1.

This tells us, for instance, that there can be no inner product on \mathbb{C}^2 that is invariant under the representation of the group of integers \mathbb{Z} we laid out at the end of last time. Indeed, that was an example of a reducible representation that is not decomposable, but if there were an invariant form it would have to decompose.

September 27, 2010 | Algebra, Group theory, Linear Algebra, Representation Theory


Decomposability

Today I’d like to cover a stronger condition than reducibility: decomposability. We say that a module V is “decomposable” if we can write it as the direct sum of two nontrivial submodules U and W. The direct sum gives us inclusion morphisms from U and W into V, and so any decomposable module is reducible.

What does this look like in terms of matrices? Well, saying that V=U\oplus W means that we can write any vector v\in V uniquely as a sum v=u+w with u\in U and w\in W. Then if we have a basis \{u_i\}_{i=1}^m of U and a basis \{w_j\}_{j=1}^n of W, then we can write u and w uniquely in terms of these basis vectors. Thus we can write any vector v\in V uniquely in terms of the \{u_1,\dots,u_m,w_1,\dots,w_n\}, and so these constitute a basis of V.

If we write the matrices \rho(g) in terms of this basis, we find that the image of any u_i can be written in terms of the u_i alone because U is G-invariant. Similarly, the G-invariance of W tells us that the image of each w_j can be written in terms of the w_j alone. The same reasoning as last time now allows us to conclude that the matrices of the \rho(g) all have the form

\displaystyle\rho(g)=\begin{pmatrix}\alpha(g)&0\\{0}&\gamma(g)\end{pmatrix}

Conversely, if we can write each of the \rho(g) in this form, then this gives us a decomposition of V as the direct sum of two G-invariant subspaces, and the representation is decomposable.

Now, I said above that decomposability is stronger than reducibility. Indeed, in general there do exist modules which are reducible, but not decomposable. Indeed, in categorical terms this is the statement that for some groups G there are short exact sequences which do not split. To chase this down a little further, our work yesterday showed that even in the reducible case we have the equation \gamma(g)\gamma(h)=\gamma(gh). This \gamma is the representation of G on the quotient space, which gives our short exact sequence

\displaystyle\mathbf{0}\to W\to V\to V/W\to\mathbf{0}

But in general this sequence may not split; we may not be able to write V\cong W\oplus V/W as G-modules. Indeed, we’ve seen that the representation of the group of integers

\displaystyle n\mapsto\begin{pmatrix}1&n\\{0}&1\end{pmatrix}

is indecomposable.
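For completeness, here is a short direct check of that claim. It is a sketch of my own and not necessarily the argument used in the earlier post. A one-dimensional submodule is spanned by a common eigenvector of all these matrices, so suppose the column vector with entries x and y is one, with eigenvalue \lambda for the matrix of n:

```latex
\begin{aligned}
\begin{pmatrix}1&n\\0&1\end{pmatrix}\begin{pmatrix}x\\y\end{pmatrix}
  &=\begin{pmatrix}x+ny\\y\end{pmatrix}
   =\lambda\begin{pmatrix}x\\y\end{pmatrix}\\
y\neq0&\implies\lambda=1\implies x+ny=x\implies ny=0
\end{aligned}
```

Taking n=1 rules out y\neq0, so the only invariant line is the one spanned by the first basis vector. A decomposition of \mathbb{C}^2 into two nontrivial submodules would need two different invariant lines, and there is only one.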

September 24, 2010 | Algebra, Group theory, Representation Theory


Reducibility

We say that a module is “reducible” if it contains a nontrivial submodule. Thus our examples last time show that the left regular representation is always reducible, since it always contains a copy of the trivial representation as a nontrivial submodule. Notice that we have to be careful about what we mean by each use of “trivial” here.

If the n-dimensional representation V has a nontrivial m-dimensional submodule W\subseteq V (so m\neq0 and m\neq n), then we can pick a basis \{w^1,\dots,w^m\} of W. And then we know that we can extend this to a basis for all of V: \{w^1,\dots,w^m,v^{m+1},\dots,v^n\}.

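Now since W is a G-invariant subspace of V, we find that for any vector w\in W and g\in G the image \left[\rho(g)\right](w) is again a vector in W, and can be written out in terms of the w^i basis vectors. In particular, we find \left[\rho(g)\right](w^i)=\rho_j^iw^j, and all the coefficients of v^{m+1} through v^n are zero. That is, the matrix of \rho(g) has the following form:

\displaystyle\rho(g)=\begin{pmatrix}\alpha(g)&\beta(g)\\{0}&\gamma(g)\end{pmatrix}

where \alpha(g) is an m\times m matrix, \beta(g) is an m\times(n-m) matrix, and \gamma(g) is an (n-m)\times(n-m) matrix. And this same form holds for all g. In fact, we can use the rule for block-multiplying matrices to find:

\displaystyle\begin{aligned}\begin{pmatrix}\alpha(gh)&\beta(gh)\\{0}&\gamma(gh)\end{pmatrix}&=\rho(gh)=\rho(g)\rho(h)\\&=\begin{pmatrix}\alpha(g)&\beta(g)\\{0}&\gamma(g)\end{pmatrix}\begin{pmatrix}\alpha(h)&\beta(h)\\{0}&\gamma(h)\end{pmatrix}\\&=\begin{pmatrix}\alpha(g)\alpha(h)&\alpha(g)\beta(h)+\beta(g)\gamma(h)\\{0}&\gamma(g)\gamma(h)\end{pmatrix}\end{aligned}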

and we see that \alpha(g) actually provides us with the matrix for the representation we get when restricting \rho to the submodule W. This shows us that the converse is also true: if we can find a basis for V so that the matrix \rho(g) has the above form for every g\in G, then the subspace spanned by the first m basis vectors is G-invariant, and so it gives us a subrepresentation.

As an example, consider the defining representation V of S_3, which is a permutation representation arising from the action of S_3 on the set \{1,2,3\}. This representation comes with the standard basis \{\mathbf{1},\mathbf{2},\mathbf{3}\}, and it’s easy to see that every permutation leaves the vector \mathbf{1}+\mathbf{2}+\mathbf{3} — along with the subspace W that it spans — fixed. Thus W carries a copy of the trivial representation as a submodule of V. We can take the given vector as a basis and throw in two others to get a new basis for V: \{\mathbf{1}+\mathbf{2}+\mathbf{3},\mathbf{2},\mathbf{3}\}.

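Now we can take a permutation, say (1\,2), and calculate its action in terms of the new basis:

\displaystyle\begin{aligned}\left[\rho((1\,2))\right](\mathbf{1}+\mathbf{2}+\mathbf{3})&=\mathbf{1}+\mathbf{2}+\mathbf{3}\\\left[\rho((1\,2))\right](\mathbf{2})&=\mathbf{1}=(\mathbf{1}+\mathbf{2}+\mathbf{3})-\mathbf{2}-\mathbf{3}\\\left[\rho((1\,2))\right](\mathbf{3})&=\mathbf{3}\end{aligned}

The others all work similarly. Then we can write these out as matrices, whose columns record the images of the three new basis vectors:

\displaystyle\begin{aligned}\rho(e)&=\begin{pmatrix}1&0&0\\{0}&1&0\\{0}&0&1\end{pmatrix}&\rho((1\,2))&=\begin{pmatrix}1&1&0\\{0}&-1&0\\{0}&-1&1\end{pmatrix}&\rho((1\,3))&=\begin{pmatrix}1&0&1\\{0}&1&-1\\{0}&0&-1\end{pmatrix}\\\rho((2\,3))&=\begin{pmatrix}1&0&0\\{0}&0&1\\{0}&1&0\end{pmatrix}&\rho((1\,2\,3))&=\begin{pmatrix}1&0&1\\{0}&0&-1\\{0}&1&-1\end{pmatrix}&\rho((1\,3\,2))&=\begin{pmatrix}1&1&0\\{0}&-1&1\\{0}&-1&0\end{pmatrix}\end{aligned}

Notice that these all have the required form:

\displaystyle\begin{pmatrix}1&\ast&\ast\\{0}&\ast&\ast\\{0}&\ast&\ast\end{pmatrix}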

Representations that are not reducible — those modules that have no nontrivial submodules — are called “irreducible representations”, or sometimes “irreps” for short. They’re also called “simple” modules, using the general term from category theory for an object with no nontrivial subobjects.
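The S_3 example above is easy to check by machine. The following Python sketch is my own illustration, with hypothetical function names and with the points 1, 2, 3 relabeled as 0, 1, 2. It rewrites each permutation of the defining representation in the basis \{\mathbf{1}+\mathbf{2}+\mathbf{3},\mathbf{2},\mathbf{3}\} and prints the resulting matrix, so the column of zeroes under the upper-left entry can be read off directly.

```python
from itertools import permutations

# The new basis in standard coordinates: 1+2+3, 2, 3 (points relabeled 0, 1, 2).
NEW_BASIS = [(1, 1, 1), (0, 1, 0), (0, 0, 1)]


def act(p, v):
    """Permutation p (p[i] is the image of i) acting on a coordinate vector:
    the basis vector e_i is sent to e_{p[i]}."""
    out = [0] * len(v)
    for i, coeff in enumerate(v):
        out[p[i]] += coeff
    return tuple(out)


def coordinates(v):
    """Coordinates of v = (x, y, z) in the new basis, using
    v = x*(1,1,1) + (y-x)*(0,1,0) + (z-x)*(0,0,1)."""
    x, y, z = v
    return (x, y - x, z - x)


def matrix_in_new_basis(p):
    """Matrix of p in the new basis; its columns are the coordinates of the
    images of the new basis vectors."""
    columns = [coordinates(act(p, b)) for b in NEW_BASIS]
    return tuple(tuple(col[i] for col in columns) for i in range(3))


matrices = {p: matrix_in_new_basis(p) for p in permutations(range(3))}

for p, m in matrices.items():
    print(p, m)
```

In this labeling the transposition (1\,2) is the tuple (1, 0, 2), and its matrix has first column (1, 0, 0), as the block form demands.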

September 23, 2010 | Algebra, Group theory, Representation Theory


Submodules

Fancy words: a submodule is a subobject in the category of group representations. What this means is that if (V,\rho_V) and (W,\rho_W) are G-modules, and if we have an injective morphism of G-modules \iota:W\to V, then we say that W is a “submodule” of V. And, just to be clear, a G-morphism is injective if and only if it’s injective as a linear map; that is, its kernel is zero. We call \iota the “inclusion map” of the submodule.

In practice, we often identify a G-submodule with the image of its inclusion map. We know from general principles that since \iota is injective, then W is isomorphic to its image, so this isn’t really a big difference. What we can tell, though, is that the action of g sends the image back into itself.

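That is, let’s say that \iota(w) is the image of some vector w\in W. I say that for any group element g, acting by g on \iota(w) gives us some other vector that’s also in the image of \iota. Indeed, since \iota is a G-morphism we check that

\displaystyle\left[\rho_V(g)\right]\left(\iota(w)\right)=\iota\left(\left[\rho_W(g)\right](w)\right)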

which is again in the image of \iota, as asserted. We say that the image of \iota is “G-invariant”.

The flip side of this is that any time we find such a G-invariant subspace of V, it gives us a submodule. That is, if (V,\rho_V) is a G-module, and W\subseteq V is a G-invariant subspace, then we can define a new representation on W by restriction: \rho_W(g)=\rho_V(g)\vert_W. The inclusion map that takes any vector w\in W\subseteq V and considers it as a vector in V clearly intertwines the original action \rho_V and the restricted action \rho_W, and its kernel is trivial. Thus W constitutes a G-submodule.

As an example, let G be any finite group, and let \mathbb{C}[G] be its group algebra, which carries the left regular representation \rho. Now, consider the subspace V spanned by the vector

\displaystyle v=\sum\limits_{g\in G}\mathbf{g}

That is, V consists of all vectors for which all the coefficients c_g are equal. I say that this subspace V\subseteq\mathbb{C}[G] is G-invariant. Indeed, we calculate

\displaystyle\left[\rho(g)\right](cv)=c\left[\rho(g)\right]\left(\sum\limits_{g'\in G}\mathbf{g'}\right)=c\sum\limits_{g'\in G}\left[\rho(g)\right](\mathbf{g'})=c\sum\limits_{g'\in G}\mathbf{gg'}

But this last sum runs through all the elements of G, just in a different order. That is, \displaystyle\left[\rho(g)\right](cv)=cv, and so V carries the one-dimensional trivial representation of G. That is, we’ve found a copy of the trivial representation of G as a submodule of the left regular representation.

As another example, let G=S_n be one of the symmetric groups. Again, let \mathbb{C}[G] carry the left regular representation, but now let W be the one-dimensional space spanned by

\displaystyle w=\sum\limits_{g\in G}\mathrm{sgn}(g)\mathbf{g}

It’s a straightforward exercise to show that W is a one-dimensional submodule carrying a copy of the signum representation.
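Both examples can be checked directly for G=S_3. In this Python sketch, which is my own illustration with hypothetical names, a vector in \mathbb{C}[G] is a dictionary from group elements to coefficients, permutations are tuples on the points 0, 1, 2, and the left regular representation sends the basis vector \mathbf{h} to \mathbf{gh}.

```python
from itertools import permutations


def compose(p, q):
    """The product pq of two permutations: apply q first, then p."""
    return tuple(p[q[i]] for i in range(len(q)))


def sign(p):
    """Signum of a permutation, computed by counting inversions."""
    n = len(p)
    inversions = sum(1 for i in range(n) for j in range(i + 1, n) if p[i] > p[j])
    return -1 if inversions % 2 else 1


def left_multiply(g, vector):
    """Left regular representation: the basis vector h is sent to gh."""
    return {compose(g, h): coeff for h, coeff in vector.items()}


def scale(c, vector):
    """Multiply every coefficient of a group algebra vector by c."""
    return {h: c * coeff for h, coeff in vector.items()}


G = list(permutations(range(3)))
v = {g: 1 for g in G}          # the sum of all group elements
w = {g: sign(g) for g in G}    # the signed sum of all group elements

v_is_fixed = all(left_multiply(g, v) == v for g in G)
w_is_scaled_by_sign = all(left_multiply(g, w) == scale(sign(g), w) for g in G)

print(v_is_fixed, w_is_scaled_by_sign)
```

So v spans a copy of the trivial representation, and w spans a copy of the signum representation, at least in this small case.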

Every G-module V contains two obvious submodules: the zero subspace \{0\} and the entire space V itself are both clearly G-invariant. We call these submodules “trivial”, and all others “nontrivial”.

September 22, 2010 | Algebra, Group theory, Representation Theory

Morphisms Between Representations

Since every representation of G is a G-module, we have an obvious notion of a morphism between them. But let’s be explicit about it.

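A G-morphism from a G-module (V,\rho_V) to another G-module (W,\rho_W) is a linear map T:V\to W between the vector spaces V and W that commutes with the actions of G. That is, for every g\in G we have \rho_W(g)\circ T=T\circ\rho_V(g). Even more explicitly, if g\in G and v\in V, then

\displaystyle\left[\rho_W(g)\right]\left(T(v)\right)=T\left(\left[\rho_V(g)\right](v)\right)

We can also express this with a commutative diagram:

\displaystyle\begin{array}{ccc}V&\xrightarrow{\;T\;}&W\\\Big\downarrow{\scriptstyle\rho_V(g)}&&\Big\downarrow{\scriptstyle\rho_W(g)}\\V&\xrightarrow{\;T\;}&W\end{array}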

For each group element g\in G our representations give us vertical arrows \rho_V(g):V\to V and \rho_W(g):W\to W. The linear map T provides horizontal arrows T:V\to W. To say that the diagram “commutes” means that if we compose the arrows along the top and right to get a linear map from V to W, and if we compose the arrows along the left and bottom to get another, we’ll find that we actually get the same function. In other words, if we start with a vector v\in V in the upper-left and move it by the arrows around either side of the square to get to a vector in W, we’ll get the same result on each side. We get one of these diagrams — one of these equations — for each g\in G, and they must all commute for T to be a G-morphism.

Another common word that comes up in these contexts is “intertwine”, as in saying that the map T “intertwines” the representations \rho_V and \rho_W, or that it is an “intertwinor” for the representations. This language goes back towards the viewpoint that takes the representing functions \rho_V and \rho_W to be fundamental, while G-morphism tends to be more associated with the viewpoint emphasizing the representing spaces V and W.

If, as will usually be the case for the time being, we have a presentation of our group by generators and relations, then we’ll only need to check that T intertwines the actions of the generators. Indeed, if T intertwines the actions of g and h, then it intertwines the actions of gh. We can see this in terms of diagrams by stacking the diagram for h on top of the diagram for g. In terms of equations, we check that

\displaystyle\begin{aligned}\rho_W(gh)\circ T&=\rho_W(g)\circ\rho_W(h)\circ T\\&=\rho_W(g)\circ T\circ\rho_V(h)\\&=T\circ\rho_V(g)\circ\rho_V(h)\\&=T\circ\rho_V(gh)\end{aligned}

So if we’re given a set of generators and we can write every group element as a finite product of these generators, then as soon as we check that the intertwining equation holds for the generators we know it will hold for all group elements.
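As a quick illustration of checking only the generators, here is a Python sketch. The choices are my own assumptions: both representations are the defining representation of S_3 by permutation matrices, the generators are (1\,2) and (1\,2\,3) with the points relabeled 0, 1, 2, and the candidate maps T and T_BAD are sample matrices picked for the example.

```python
from itertools import permutations


def perm_matrix(p):
    """Permutation matrix sending the basis vector e_j to e_{p[j]}."""
    n = len(p)
    return tuple(tuple(1 if p[j] == i else 0 for j in range(n)) for i in range(n))


def matmul(a, b):
    """Multiply two square matrices stored as tuples of row tuples."""
    n = len(a)
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n))
        for i in range(n)
    )


def intertwines(t, rho_v, rho_w, elements):
    """True if rho_W(g) T = T rho_V(g) for every g in the given collection."""
    return all(matmul(rho_w(g), t) == matmul(t, rho_v(g)) for g in elements)


GENERATORS = [(1, 0, 2), (1, 2, 0)]          # (1 2) and (1 2 3), zero-indexed
ALL_ELEMENTS = list(permutations(range(3)))  # all of S_3

T = ((3, 1, 1), (1, 3, 1), (1, 1, 3))        # twice the identity plus all ones
T_BAD = ((1, 0, 0), (0, 2, 0), (0, 0, 3))    # a diagonal map that is not a G-morphism

print(intertwines(T, perm_matrix, perm_matrix, GENERATORS))
print(intertwines(T, perm_matrix, perm_matrix, ALL_ELEMENTS))
print(intertwines(T_BAD, perm_matrix, perm_matrix, GENERATORS))
```

The two generator equations already decide the matter: T passes them and therefore passes all six, while T_BAD fails at the generators and needs no further checking.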

There are also deep connections between G-morphisms and natural transformations, in the categorical viewpoint. Those who are really interested in that can dig into the archives a bit.

September 21, 2010 | Algebra, Group theory, Representation Theory

Coset Representations

Next up is a family of interesting representations that are also applicable to any group G. The main ingredient is a subgroup H — a subset of the elements of G so that the inverse of any element in H is also in H, and the product of any two elements of H is also in H.

Our next step is to use H to break G up into cosets. We consider g_1 and g_2 to be equivalent if g_2^{-1}g_1\in H. It’s easy to check that this is actually an equivalence relation (reflexive, symmetric, and transitive), and so it breaks G up into equivalence classes. We write the coset of all g'\in G that are equivalent to g as gH, and we write the collection of all cosets of H as G/H.

We should note that we don’t need to worry about H being a normal subgroup of G, since we only care about the set of cosets. We aren’t trying to make this set into a group — the quotient group — here.

Okay, now multiplication on the left by G shuffles around the cosets. That is, we have a group action of G on the quotient set G/H, and this gives us a permutation representation of G!

Let’s work out an example to see this a bit more explicitly. For our group, take the symmetric group S_3, and for our subgroup let H=\{e,(2\,3)\}. Indeed, H is closed under both inversion and multiplication. And we can break G up into cosets:

\displaystyle\begin{aligned}H&=\{e,(2\,3)\}\\(1\,2)H&=\{(1\,2),(1\,2\,3)\}\\(1\,3)H&=\{(1\,3),(1\,3\,2)\}\end{aligned}

where we have picked a “transversal” — one representative of each coset so that we can write them down more simply. It doesn’t matter whether we write (1\,2)H or (1\,2\,3)H, since both are really the same set. Now we can write down the multiplication table for the group action. It takes g_1\in G and g_2H\in G/H, and tells us which coset g_1g_2 falls in:

\displaystyle\begin{tabular}{c|rrr}&H&(1\,2)H&(1\,3)H\\\hline e&H&(1\,2)H&(1\,3)H\\(1\,2)&(1\,2)H&H&(1\,3)H\\(1\,3)&(1\,3)H&(1\,2)H&H\\(2\,3)&H&(1\,3)H&(1\,2)H\\(1\,2\,3)&(1\,2)H&(1\,3)H&H\\(1\,3\,2)&(1\,3)H&H&(1\,2)H\end{tabular}

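This is our group action. Since there are three elements in the set G/H, the permutation representation we get will be three-dimensional. Ordering the basis as H, (1\,2)H, (1\,3)H, we can write down all the matrices just by working them out from this multiplication table:

\displaystyle\begin{aligned}\rho(e)&=\begin{pmatrix}1&0&0\\{0}&1&0\\{0}&0&1\end{pmatrix}&\rho((1\,2))&=\begin{pmatrix}0&1&0\\1&0&0\\{0}&0&1\end{pmatrix}&\rho((1\,3))&=\begin{pmatrix}0&0&1\\{0}&1&0\\1&0&0\end{pmatrix}\\\rho((2\,3))&=\begin{pmatrix}1&0&0\\{0}&0&1\\{0}&1&0\end{pmatrix}&\rho((1\,2\,3))&=\begin{pmatrix}0&0&1\\1&0&0\\{0}&1&0\end{pmatrix}&\rho((1\,3\,2))&=\begin{pmatrix}0&1&0\\{0}&0&1\\1&0&0\end{pmatrix}\end{aligned}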

It turns out that these matrices are the same as we saw when writing down the defining representation of S_3. There’s a reason for this, which we will examine later.

As special cases, if H=\{e\}, then there is one coset for each element of G, and the coset representation is the same as the left regular representation. At the other extreme, if H=G, then there is only one coset and we get the trivial representation.
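The S_3 example above can be reproduced with a few lines of Python. This is a sketch under my own conventions: permutations are tuples on the points 0, 1, 2 (so (2\,3) becomes the tuple (0, 2, 1)), products are composed right to left, and the function names are hypothetical. It computes the cosets of H and, for each group element, the permutation of the cosets it induces.

```python
from itertools import permutations


def compose(p, q):
    """The product pq of two permutations: apply q first, then p."""
    return tuple(p[q[i]] for i in range(len(q)))


G = list(permutations(range(3)))
IDENTITY = (0, 1, 2)
H = frozenset({IDENTITY, (0, 2, 1)})  # the subgroup {e, (2 3)}, zero-indexed


def left_coset(g, subgroup):
    """The coset gH as a set of group elements."""
    return frozenset(compose(g, h) for h in subgroup)


# Distinct cosets in a fixed order; with this sort key the order comes out
# as H, (1 2)H, (1 3)H.
cosets = sorted({left_coset(g, H) for g in G}, key=sorted)


def action_row(g):
    """For each coset, the index of the coset that g moves it to."""
    return tuple(
        cosets.index(frozenset(compose(g, x) for x in coset)) for coset in cosets
    )


for g in G:
    print(g, action_row(g))
```

With these conventions each group element permutes the three cosets exactly as it permutes the three points, which is the coincidence with the defining representation noted above.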

September 20, 2010 | Algebra, Group theory, Representation Theory

The (Left) Regular Representation

Now it comes time to introduce what’s probably the most important representation of any group, the “left regular representation”. This arises because any group G acts on itself by left-multiplication. That is, we have a function G\times G\to G given by (g,h)\mapsto gh. Indeed, this is an action because first (e,g_1)\mapsto g_1; and second (g_1,(g_2,h))\mapsto(g_1,g_2h)\mapsto g_1g_2h, and (g_1g_2,h)\mapsto g_1g_2h as well.

So, as with any group action on a finite set, we get a finite-dimensional permutation representation. The representing space \mathbb{C}G has a standard basis corresponding to the elements of G. That is, to every element g\in G we have a basis vector \mathbf{g}\in\mathbb{C}G. But we can recognize this as the standard basis of the group algebra \mathbb{C}[G]. That is, the group algebra itself carries a representation.

Of course, this shouldn’t really surprise us. After all, representations of G are equivalent to modules for the group algebra; and the very fact that \mathbb{C}[G] is an algebra means that it comes with a bilinear function \mathbb{C}[G]\times\mathbb{C}[G]\to\mathbb{C}[G], which makes it into a module over itself.

We should note that since this is the left regular representation, there is also such a thing as the right regular representation, which arises from the action of G on itself by multiplication on the right. But by itself right-multiplication doesn’t really give an action, because it reverses the order of multiplication. Indeed, for a group action as we’ve defined it first acting by g_2 and then acting by g_1 is the same as acting by the product g_1g_2. But if we first multiply on the right by g_2 and then by g_1 we get hg_2g_1, which is the same as acting by g_2g_1. The order has been reversed.

To compensate for this, we define the right regular representation by the function (g,h)\mapsto hg^{-1}. Then (g_1,(g_2,h))\mapsto(g_1,hg_2^{-1})\mapsto hg_2^{-1}g_1^{-1}=h(g_1g_2)^{-1}, and (g_1g_2,h)\mapsto h(g_1g_2)^{-1} as well.

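As an exercise, let’s work out the matrices of the left regular representation for the cyclic group \mathbb{Z}_4 with respect to its standard basis. We have four elements in this group: \{g^0,g^1,g^2,g^3\} and g^4=g^0. Thus the regular representation will be four-dimensional, and we will index the rows and columns of our matrices by the exponents 0, 1, 2, and 3. Then in the matrix \rho(g^k) the entry in the ith row and jth column is \delta_{g^kg^j,g^i}. The multiplication rule tells us that (g^i,g^j)\mapsto g^{i+j}, where the exponent is defined up to a multiple of four, and so the matrix entry is 1 if i=j+k (up to a multiple of four), and 0 otherwise. That is:

\displaystyle\begin{aligned}\rho(g^0)&=\begin{pmatrix}1&0&0&0\\{0}&1&0&0\\{0}&0&1&0\\{0}&0&0&1\end{pmatrix}&\rho(g^1)&=\begin{pmatrix}0&0&0&1\\1&0&0&0\\{0}&1&0&0\\{0}&0&1&0\end{pmatrix}\\\rho(g^2)&=\begin{pmatrix}0&0&1&0\\{0}&0&0&1\\1&0&0&0\\{0}&1&0&0\end{pmatrix}&\rho(g^3)&=\begin{pmatrix}0&1&0&0\\{0}&0&1&0\\{0}&0&0&1\\1&0&0&0\end{pmatrix}\end{aligned}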

You can check for yourself that these matrices indeed give a representation of the cyclic group.
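If you would rather let a machine do the checking, here is a minimal Python sketch. The function names are my own. It builds each matrix from the rule that the entry in row i and column j of \rho(g^k) is 1 exactly when i=j+k modulo 4, and then tests \rho(g^a)\rho(g^b)=\rho(g^{a+b}) for all sixteen pairs of exponents.

```python
N = 4  # order of the cyclic group


def rho(k):
    """Matrix of g^k in the left regular representation: the entry in
    row i, column j is 1 when i = j + k modulo N, and 0 otherwise."""
    return tuple(
        tuple(1 if i == (j + k) % N else 0 for j in range(N)) for i in range(N)
    )


def matmul(a, b):
    """Multiply two square matrices stored as tuples of row tuples."""
    n = len(a)
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n))
        for i in range(n)
    )


is_representation = all(
    matmul(rho(a), rho(b)) == rho((a + b) % N)
    for a in range(N)
    for b in range(N)
)

for k in range(N):
    print(k, rho(k))
print(is_representation)
```

Changing N gives the regular representation of any other finite cyclic group by the same rule.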

September 17, 2010 | Algebra, Group theory, Representation Theory