Group Actions and Representations
From the module perspective, we’re led back to the concept of a group action. This is like a -module, but “discrete”. Let’s just write down the axioms for easy reference: we’ve got a set
and a function
such that
preserves the identity:
.
preserves the group operation:
.
Notice how this looks almost exactly like the axioms for a -module, except since
is just a set we don’t have any sense of “linearity”.
Now, from a group action on a finite set we can get a finite-dimensional representation. We let
— the vector space defined to have
as a basis. That is, vectors in
are of the form
for some complex coefficients . We get a
-module by extending
to a bilinear function
. We already know how it behaves on the basis of the form
, and the extension to a bilinear function is uniquely defined. We call
the “permutation representation” associated to
, and the elements
for
we call the “standard basis”.
As an example, the group is defined from the very beginning by the fact that it acts on the set
by shuffling the numbers around. And so we get a representation from this action, which we call the “defining representation”. By definition, it has dimension
, since it has a basis given by
. To be even more explicit, let me write out the defining matrix representation for
. Technically, going from an abstract representation to a matrix representation requires not just a basis, but an ordered basis, but the order should be pretty clear in this case. And so, with no further ado:
To see how this works, note that the permutation sends
to
. Similarly, we find that
. That is:
We also see that composition of permutations turns into matrix multiplication. For example, . In terms of the matrices we calculate:
You can check for yourself all the other cases that you care to.
Notice that in general the matrices are index by two elements of , and the matrix element
— the one in the
th row and
th column — is
. That is, it’s
if
— if the action of
sends
to
— and
otherwise. This guarantees that every entry will be either
or
, and that each row and each column will have exactly one
. Such a matrix we call a “permutation matrix”, and we see that the matrices that occur in permutation representations are permutation matrices.
Modules
With the group algebra in hand, we now define a “-module” to be a module for the group algebra of
. That is, it’s a (finite-dimensional) vector space
and a bilinear map
. This map must satisfy
and
.
This is really the same thing as a representation, since we may as well pick a basis for
and write
. Then for any
we can write
That is, is a linear map from
to itself, with its matrix entries given by
. We define this matrix to be
, which must be a representation because of the conditions on
above.
Conversely, if we have a matrix representation , we can define a module map for
as
where we apply the matrix to the column vector
. This must satisfy the above conditions, since they reflect the fact that
is a representation.
In fact, to define , all we really need to do is to define it for the basis elements
. Then linearity will take care of the rest of the group algebra. That is, we can just as well say that a
-module is a vector space
and a function
satisfying the following three conditions:
is linear in
:
.
preserves the identity:
.
preserves the group operation:
.
The difference between the representation viewpoint and the -module viewpoint is that representations emphasize the group elements and their actions, while
-modules emphasize the representing space
. This viewpoint will be extremely helpful when we want to consider a representation as a thing in and of itself. It’s easier to do this when we think of it as a vector space equipped with the extra structure of a
-action.
The Group Algebra
A useful construction for our purposes is the group algebra . We’ve said a lot about this before, and showed a number of things about it, but most of those we can ignore for now. All that’s really important is that
is an algebra whose representations are intimately connected with those of
.
When we say it’s an algebra, we just mean that it’s a vector space with a distributive multiplication defined on it. And in our case of a finite group it’s easy to define both. For every group element
we have a basis vector
. That is, we get every vector in the algebra by picking one complex coefficient
for each element
, and adding them all up:
Multiplication is exactly what we might expect: the product of two basis vectors and
is the basis vector
, and we extend everything else by linearity!
We often rearrange this sum to collect all the terms with a given basis vector together:
Neat, huh?
And we can go from representations of a group to representations of its group algebra. Indeed, if is a representation, then we can define
Indeed, each is a matrix, and we can multiply matrices by complex numbers and add them together, so the right hand side is perfectly well-defined as a
matrix in the matrix algebra
. It’s a simple matter to verify that
preserves addition of vectors, scalar multiples of vectors, and products of vectors.
Conversely, if is a representation, then we can restrict it to our basis vectors and get a map
. The image of each basis vector must be invertible, for we have
Some Sample Representations
As promised, we want to see some examples of matrix representations for those who might not have seen much of them before. These are homomorphisms from a group to the group
— we will often simply write
— of invertible
complex matrices. The positive integer
is called the “dimension” or “degree” of the representation. The first few of our examples will be one-dimensional, largely because that makes the examples simpler. Indeed, a
matrix is simply a complex number, and multiplication of such matrices is just complex multiplication!
First off, every group has the “trivial representation”, which sends each group element to the matrix
. It should be clear that the identity of
is sent to the identity matrix — indeed, every group element is sent to the identity matrix — and that the image of the product of two group elements is the product of their images. Writing
for the trivial representation we find:
Another one we’ve seen is the signum representation of a permutation group, which sends even permutations to
and odd permutations to
.
Let’s diverge from the symmetric group for a moment and consider the cyclic group . This consists of powers of a single generator
with the relation
. That is, we have
. The definition of a representation tells us that we must send
to
, but we may have some latitude in choosing where to send the generator
. Since it has to go somewhere, let’s set
. And now we know what happens to everything else in the group!
in particular, we have , so these must go to the same value. That is,
and we must have . Thus we have such a representation for each of the
th roots of unity, and these are all possible one-dimensional representations of
.
This illustrates a common technique for finding representations, at least when we have a nice finite presentation of our group: choose an image for each of the generators, and use the relations to give us equations which these images must satisfy. The fact that this works out is deeply wrapped up in the universal properties of free groups, but if that sounds scary you don’t have to worry about it!
We should include at least one example of a higher-degree representation. A nice one is the following representation of the group of integers. Remember that the usual operation on integers is addition, not “multiplication” like we usually say for groups. With that in mind, we set
It’s straightforward to check that the additive identity is sent to the identity matrix, and that the group operation is preserved:
Conjugates
For some more review, let’s recall the idea of conjugation in a group . We say that two elements
and
are “conjugate” if
for some
.
This is an equivalence relation — reflexive, symmetric, and transitive. Any element is conjugate to itself by the group identity; if
, then
; if
and
, then
. Thus the set underlying the group
can be partitioned into “conjugacy classes”: two group elements are in the same conjugacy class if and only if they are conjugate. The conjugacy class containing a group
is commonly written
. The different conjugacy classes are pairwise disjoint, and their union is all of
.
Now, since these days we’re concerned with the symmetric group. And it turns out that some nice things happen in this case. We’ve actually already seen a lot of these!
First of all, when we write a permutation in cycle notation it’s easy to see what conjugation does to it. Indeed, if is a
-cycle and if
is any other permutation, then we can write down the conjugate:
And the same goes for any other permutation written in cycle notation: the conjugate by is given by applying
to each symbol in the cycle notation. In particular, any two conjugate permutations have the same cycle type. In fact, the converse is also true: given any two permutations with the same cycle type, we can find a permutation that sends the one to the other.
For example, consider the permutations and
. Stack them on top of each other:
turn this into two-line notation
and we’ve got a permutation which sends
to
:
which is equivalent to . That is, two permutations are conjugate if and only if they have the same cycle type. This is really big. Given a cycle type
, we will write the corresponding conjugacy class as
.
We also know from some general facts about group actions that the number of elements in the conjugacy class is equal to the number of cosets of the “centralizer”
. We recall that the centralizer of
is the collection of group elements that commute with
. That is:
and we have the equation
In the case of — and armed with our formula for conjugating permutations in cycle notation — we can use this to count the size of each
. We know that
, so all we need is to find out how many permutations
leave a permutation
(with cycle type
) the same when we conjugate it to get
. We will write this number as
, and similarly we will write
.
So, how can we change the cycle notation of and get something equivalent back again? Well, for any
-cycle in
we can rotate it around in
different ways. For instance,
is the same as
is the same as
. We can also shuffle around any cycles that have the same length:
is the same as
.
Thus if has cycle type
, then we can shuffle the
-cycles
ways; we can shuffle the
-cycles
ways; and so on until we can shuffle the
-cycles
ways. Each
-cycle can be rotated into
different position for a total of
choices; each
-cycle can be rotated into
different positions for a total of
choices; and so on until each
-cycle can be rotated into
different positions for a total of
choices. Therefore we have a total of
different ways of writing the same permutation . And each of these ways corresponds to a permutation in
. We have thus calculated
and we conclude
As a special case, how many transpositions are there in the group ? Recall that a transposition is a permutation of the form
, which has cycle type
. Our formula tells us that there are
permutations in this conjugacy class, as we could expect.
Cycle Type
One concept I’d like to introduce is that of the “cycle type” of a permutation. This simply counts the number of cycles of various length in a permutation. For example — using the sample permutations from last time — has two cycles of length two, while
has one cycle of length three and one “fixed point” — a cycle of length one. No cycle in a permutation in
can be longer than
, so we only need to count up to
. We collect the information in a cycle type into a tuple of the form
. The cycle type of
is
, and that of
is
.
It should be clear that the sum of the cycle lengths is . In a formula:
That is, the cycle type breaks up or “partitions” into chunks whose total size adds up to
. In general, a partition
of
is a sequence of numbers
in nonincreasing order, whose sum is
. Thus the cycle type of
gives us the partition
, while the cycle type of
gives us the partition
.
One more example, from the beginning: the two-line notation
describing a permutation in has the cycle notation
. Its cycle type is
, which corresponds to the partition
.
Some Review
Before we push on into our new topic, let’s look back at some of the background that we’ve already covered.
We’re talking about symmetric groups, which are, of course, groups. We have various ways of writing down an element of , including the two-line notation and the cycle notation that are covered in our earlier description of the symmetric groups. As an example, the two-line notation
and the cycle notation both describe the permutation
that sends
to
,
back to
, and similarly swaps
and
. Similarly, the two-line notation
the composition of
and the cycle notation or (equivalently)
describe the permutation
that cycles the elements
,
, and
(in that order) and leaves
untouched.
We’re specifically concerned with complex representations of these groups. That is, we want to pick some complex vector space , and for each permutation
we want to come up with some linear transformation
for which the composition of linear transformations and the composition of permutations are “the same” in the sense that given two permutations
and
, the transportation corresponding to the composite
is equal to the composite of the corresponding transformations
.
We’re primarily interested in finite-dimensional representations. That is, ones for which is a finite-dimensional complex vector space. In this case, we know that we can always just assume that
— the space of
-tuples of complex numbers — and that linear transformations are described by matrices. Composition of transformations is reflected in matrix multiplication. That is, for every permutation
we want to come up with an
matrix
so that the matrix
corresponding to the composition of two permutations is the product
of the matrices corresponding to the two permutations. I’ll be giving some more explicit examples soon.
New Topic: The Representation Theory of the Symmetric Group
Okay, I’m done with measure theory (for now, at least), and not a moment too soon. It’s been good for me to work through all this stuff again, and I hope it’s provided a good resource, but my traffic has really taken a hit, at least as measured by daily page views. I think maybe everybody else hates analysis too?
So let’s go in a completely different direction! I want to talk about the representation theory of permutation groups. Now at least on the surface you might not think there’s a lot to say, but it’s a surprisingly detailed subject. And since every finite group can be embedded in a permutation group — its action on itself by left multiplication permutes its own elements — and many natural symmetries come in the form of permutations, it’s a very useful subject as well.
But as a niche it doesn’t get taught very much. Even the most direct application I know (the representation theory of classical groups) usually avoids getting into the details of this topic. And so it’s likely that many if not most readers haven’t really seen much of it. Along the way we’ll learn a certain amount about the representation theory of more general finite groups, so if you don’t know much about it yet, don’t worry! I’ll try to link back to more general background information where appropriate — and please ask me to fill in points that I seem to skim over — but the coverage should be pretty self-contained. The great thing about this is that you don’t have to have read anything else I’ve written, and you don’t have to be a particularly expert mathematician to follow along! And of course neither do your friends and acquaintances, so this is the perfect chance to pass the word along. Don’t be shy about telling other people to join in!
There’s also going to be a fair amount of focus on the combinatorics involved in the representation theory, and there are some really amazingly beautiful patterns. But we’ll also see some very explicit descriptions of how to actually count these things. If I feel up to it, I may try to actually implement some of the algorithms we see and talk about that over at my new programming weblog The Unapologetic Programmer. Oh yeah: if you haven’t heard yet, I’ve got a new weblog about programming, so all of you into software design should be heading over there and adding it to your RSS reader. And since my work there is a lot more casual, you should tell your other programmer friends about it too!
So enough shameless self-promotion; let’s get down to business.
Some Continuous Duals
I really wish I could just say in post titles.
Anyway, I want to investigate the continuous dual of for
. That is, we’re excluding the case where either
(but not its Hölder conjugate
) is infinite. And I say that when
is
-finite, the space
of bounded linear functionals on
is isomorphic to
.
First, I’m going to define a linear map . Given a function
, let
be the linear functional defined for any
by
It’s clear from the linearity of multiplication and of the integral itself, that this is a linear functional on . Hölder’s inequality itself shows us that not only does the integral on the right exist, but
That is, is a bounded linear functional, and the operator norm
is at most the
norm of
. The extremal case of Hölder’s inequality shows that there is some
for which this is an equality, and thus we conclude that
. That is,
is an isometry of normed vector spaces. Such a mapping has to be an injection, because if
then
, which implies that
.
Now I say that is also a surjection. That is, any bounded linear functional
is of the form
for some
. Indeed, if
then we can just pick
as a preimage. Thus we may assume that
is a nonzero bounded linear functional on
, and
. We first deal with the case of a totally finite measure space.
In this case, we define a set function on measurable sets by . It’s straightforward to see that
is additive. To prove countable additivity, suppose that
is the countable disjoint union of a sequence
. If we write
for the union of
through
, we find that
Since is continuous, we conclude that
, and thus that
is a (signed) measure. It should also be clear that
implies
, and so
. The Radon-Nikodym theorem now tells us that there exists an integrable function
so that
Linearity tells us that
for simple functions , and also for every
, since each such function is the uniform limit of simple functions. We want to show that
.
If , then we must show that
is essentially bounded. In this case, we find
for every measurable , from which we conclude that
a.e., or else we could find some set on which this inequality was violated. Thus
.
For other , we can find a measurable
with
so that
. Setting
and defining
, we find that
on
,
, and so
We thus find
and thus
Applying the monotone convergence theorem as we find that
.
Thus in either case we’ve found a so that
.
In the -finite case, we can write
as the countable disjoint union of sets
with
. We let
be the union of the first
of these sets. We note that
for every measurable set
, so
is a linear functional on
of norm at most
. The finite case above shows us that there are functions
on
so that
.
We can define if
, and let
be the sum of all these
. We see that
for every , and since
we find that
. Then Fatou’s lemma shows us that
. Thus the
-finite case is true as well.
One case in particular is especially worthy of note: since is Hölder-coonjugate to itself, we find that
is isomorphic to its own continuous dual space in the same way that a finite-dimensional inner-product space is isomorphic to its own dual space.
Bounded Linear Transformations
In the context of normed vector spaces we have a topology on our spaces and so it makes sense to ask that maps between them be continuous. In the finite-dimensional case, all linear functions are continuous, so this hasn’t really come up before in our study of linear algebra. But for functional analysis, it becomes much more important.
Now, really we only need to require continuity at one point — the origin, to be specific — because if it’s continuous there then it’ll be continuous everywhere. Indeed, continuity at means that for any
there is a
so that
implies
. In particular, if
, then this means
implies
. Clearly if this holds, then the general version also holds.
But it turns out that there’s another equivalent condition. We say that a linear transformation is “bounded” if there is some
such that
for all
. That is, the factor by which
stretches the length of a vector is bounded. By linearity, we only really need to check this on the unit sphere
, but it’s often just as easy to test it everywhere.
Anyway, I say that a linear transformation is continuous if and only if it’s bounded. Indeed, if is bounded, then we find
so as we let approach
— as
approaches
— the difference between
and
approaches zero as well. And so
is continuous.
Conversely, if is continuous, then it is bounded. Since it’s continuous, we let
and find a
so that
for all vectors
with
. Thus for all nonzero
we find
Thus we can use and conclude that
is bounded.
The least such that works in the condition for
to be bounded is called the “operator norm” of
, which we write as
. It’s straightforward to verify that
, and that
if and only if
is the zero operator. It remains to verify the triangle identity.
Let’s say that we have bounded linear transformations and
with operator norms
and
, respectively. We will show that
works as a bound for
, and thus conclude that
. Indeed, we check that
and our assertion follows. In particular, when our base field is itself a normed linear space (like or
itself) we can conclude that the “continuous dual space”
consisting of bounded linear functionals
is a normed linear space using the operator norm on
.
