The Unapologetic Mathematician

Mathematics for the interested outsider

Group Actions and Representations

From the module perspective, we’re led back to the concept of a group action. This is like a G-module, but “discrete”. Let’s just write down the axioms for easy reference: we’ve got a set S and a function A:G\times S\to S such that

  • A preserves the identity: A(e,s)=s.
  • A preserves the group operation: A(g,A(h,s))=A(gh,s).

Notice how this looks almost exactly like the axioms for a G-module, except since S is just a set we don’t have any sense of “linearity”.

Now, from a group action on a finite set S we can get a finite-dimensional representation. We let V=\mathbb{C}S — the vector space defined to have S as a basis. That is, vectors in \mathbb{C}S are of the form

\displaystyle\sum\limits_{s\in S}c_s\mathbf{s}

for some complex coefficients c_s. We get a G-module by extending A to a bilinear function \mathbb{C}[G]\times\mathbb{C}S\to\mathbb{C}S. We already know how it behaves on the basis of the form (\mathbf{g},\mathbf{s}), and the extension to a bilinear function is uniquely defined. We call \mathbb{C}S the “permutation representation” associated to S, and the elements \mathbf{s} for s\in S we call the “standard basis”.

As an example, the group S_n is defined from the very beginning by the fact that it acts on the set \{1,2,\dots,n\} by shuffling the numbers around. And so we get a representation from this action, which we call the “defining representation”. By definition, it has dimension n, since it has a basis given by \{\mathbf{1},\dots,\mathbf{n}\}. To be even more explicit, let me write out the defining matrix representation for S_3. Technically, going from an abstract representation to a matrix representation requires not just a basis, but an ordered basis, but the order should be pretty clear in this case. And so, with no further ado:

\displaystyle\begin{aligned}\rho\left(e\right)&=\begin{pmatrix}1&0&0\\{0}&1&0\\{0}&0&1\end{pmatrix}\\\rho\left((1\,2)\right)&=\begin{pmatrix}{0}&1&0\\1&0&0\\{0}&0&1\end{pmatrix}\\\rho\left((1\,3)\right)&=\begin{pmatrix}{0}&0&1\\{0}&1&0\\1&0&0\end{pmatrix}\\\rho\left((2\,3)\right)&=\begin{pmatrix}1&0&0\\{0}&0&1\\{0}&1&0\end{pmatrix}\\\rho\left((1\,2\,3)\right)&=\begin{pmatrix}{0}&0&1\\1&0&0\\{0}&1&0\end{pmatrix}\\\rho\left((1\,3\,2)\right)&=\begin{pmatrix}{0}&1&0\\{0}&0&1\\1&0&0\end{pmatrix}\end{aligned}

To see how this works, note that the permutation (1\,2\,3) sends 3 to 1. Similarly, we find that \left[\rho\left((1\,2\,3)\right)\right](\mathbf{3})=\mathbf{1}. That is:

\displaystyle\begin{pmatrix}{0}&0&1\\1&0&0\\{0}&1&0\end{pmatrix}\begin{pmatrix}{0}\\{0}\\1\end{pmatrix}=\begin{pmatrix}1\\{0}\\{0}\end{pmatrix}

We also see that composition of permutations turns into matrix multiplication. For example, (1\,2\,3)(2\,3)=(1\,2). In terms of the matrices we calculate:

\displaystyle\begin{pmatrix}{0}&0&1\\1&0&0\\{0}&1&0\end{pmatrix}\begin{pmatrix}1&0&0\\{0}&0&1\\{0}&1&0\end{pmatrix}=\begin{pmatrix}{0}&1&0\\1&0&0\\{0}&0&1\end{pmatrix}

You can check for yourself all the other cases that you care to.

Notice that in general the matrices are index by two elements of S, and the matrix element \rho(g)_s^t — the one in the sth row and tth column — is \delta_{A(g,t),s}. That is, it’s 1 if A(g,t)=s — if the action of g sends s to t — and 0 otherwise. This guarantees that every entry will be either 0 or 1, and that each row and each column will have exactly one 1. Such a matrix we call a “permutation matrix”, and we see that the matrices that occur in permutation representations are permutation matrices.

September 16, 2010 Posted by | Algebra, Group Actions, Group theory, Representation Theory, Representations of Symmetric Groups | 9 Comments

Modules

With the group algebra in hand, we now define a “G-module” to be a module for the group algebra of G. That is, it’s a (finite-dimensional) vector space V and a bilinear map A:\mathbb{C}[G]\times V\to V. This map must satisfy A(\mathbf{e},v)=v and A(\mathbf{g},A(\mathbf{h},v))=A(\mathbf{gh},v).

This is really the same thing as a representation, since we may as well pick a basis \{e_i\} for V and write V=\mathbb{C}^d. Then for any g\in G we can write

\displaystyle A(\mathbf{g},e_i)=\sum\limits_{j=1}^dm_i^je_j

That is, A(\mathbf{g},\underbar{\hphantom{X}}) is a linear map from V to itself, with its matrix entries given by m_i^j. We define this matrix to be \rho(g), which must be a representation because of the conditions on A above.

Conversely, if we have a matrix representation \rho:G\to GL_d, we can define a module map for \mathbb{C}^d as

\displaystyle A(\mathbf{g},v)=\rho(g)v

where we apply the matrix \rho(g) to the column vector v. This must satisfy the above conditions, since they reflect the fact that \rho is a representation.

In fact, to define A, all we really need to do is to define it for the basis elements \mathbf{g}\in\mathbb{C}[G]. Then linearity will take care of the rest of the group algebra. That is, we can just as well say that a G-module is a vector space V and a function A:G\times V\to V satisfying the following three conditions:

  • A is linear in V: A(g,cv+dw)=cA(g,v)+dA(g,w).
  • A preserves the identity: A(e,v)=v.
  • A preserves the group operation: A(g,A(h,v))=A(gh,v).

The difference between the representation viewpoint and the G-module viewpoint is that representations emphasize the group elements and their actions, while G-modules emphasize the representing space V. This viewpoint will be extremely helpful when we want to consider a representation as a thing in and of itself. It’s easier to do this when we think of it as a vector space equipped with the extra structure of a G-action.

September 15, 2010 Posted by | Algebra, Group theory, Representation Theory | 11 Comments

The Group Algebra

A useful construction for our purposes is the group algebra \mathbb{C}[G]. We’ve said a lot about this before, and showed a number of things about it, but most of those we can ignore for now. All that’s really important is that \mathbb{C}[G] is an algebra whose representations are intimately connected with those of G.

When we say it’s an algebra, we just mean that it’s a vector space with a distributive multiplication defined on it. And in our case of a finite group G it’s easy to define both. For every group element g\in G we have a basis vector \mathbf{g}\in\mathbb{C}[G]. That is, we get every vector in the algebra by picking one complex coefficient c_g for each element g\in G, and adding them all up:

\displaystyle\sum\limits_{g\in G}c_g\mathbf{g}

Multiplication is exactly what we might expect: the product of two basis vectors \mathbf{g} and \mathbf{h} is the basis vector \mathbf{gh}, and we extend everything else by linearity!

\displaystyle\left(\sum\limits_{g\in G}c_g\mathbf{g}\right)\left(\sum\limits_{h\in G}d_h\mathbf{h}\right)=\sum\limits_{g,h\in G}c_gd_h\mathbf{gh}

We often rearrange this sum to collect all the terms with a given basis vector together:

\displaystyle\sum\limits_{g\in G}\sum\limits_{g_1g_2=g}c_{g_1}d_{g_2}\mathbf{g}=\sum\limits_{g\in G}\left(\sum\limits_{h\in G}c_{gh^{-1}}d_h\right)\mathbf{g}

Neat, huh?

And we can go from representations of a group to representations of its group algebra. Indeed, if \rho:G\to GL_d is a representation, then we can define

\displaystyle\rho\left(\sum\limits_{g\in G}c_g\mathbf{g}\right)=\sum\limits_{g\in G}c_g\rho(g)

Indeed, each \rho(g) is a matrix, and we can multiply matrices by complex numbers and add them together, so the right hand side is perfectly well-defined as a d\times d matrix in the matrix algebra M_d. It’s a simple matter to verify that \rho:\mathbb{C}[G]\to M_d preserves addition of vectors, scalar multiples of vectors, and products of vectors.

Conversely, if \rho:\mathbb{C}[G]\to M_d is a representation, then we can restrict it to our basis vectors and get a map \rho:G\to GL_d. The image of each basis vector must be invertible, for we have

\displaystyle\rho(\mathbf{g})\rho(\mathbf{g^{-1}})=\rho(\mathbf{gg^{-1}})=\rho(\mathbf{e})=1

September 14, 2010 Posted by | Algebra, Group theory, Representation Theory | 8 Comments

Some Sample Representations

As promised, we want to see some examples of matrix representations for those who might not have seen much of them before. These are homomorphisms from a group G to the group GL_d(\mathbb{C}) — we will often simply write GL_d — of invertible d\times d complex matrices. The positive integer d is called the “dimension” or “degree” of the representation. The first few of our examples will be one-dimensional, largely because that makes the examples simpler. Indeed, a 1\times1 matrix is simply a complex number, and multiplication of such matrices is just complex multiplication!

First off, every group has the “trivial representation”, which sends each group element g\in G to the matrix \begin{pmatrix}1\end{pmatrix}. It should be clear that the identity of G is sent to the identity matrix — indeed, every group element is sent to the identity matrix — and that the image of the product of two group elements is the product of their images. Writing \rho_1 for the trivial representation we find:

\displaystyle\rho_1(g)\rho_1(h)=\begin{pmatrix}1\end{pmatrix}\begin{pmatrix}1\end{pmatrix}=\begin{pmatrix}1\end{pmatrix}=\rho_1(gh)

Another one we’ve seen is the signum representation \mathrm{sgn} of a permutation group, which sends even permutations to \begin{pmatrix}1\end{pmatrix} and odd permutations to \begin{pmatrix}-1\end{pmatrix}.

Let’s diverge from the symmetric group for a moment and consider the cyclic group \mathbb{Z}_n. This consists of powers of a single generator g with the relation g^n=e. That is, we have G=\left\{e,g,g^2,g^3,\dots,g^n=e\right\}. The definition of a representation tells us that we must send e to \begin{pmatrix}1\end{pmatrix}, but we may have some latitude in choosing where to send the generator g. Since it has to go somewhere, let’s set \rho_c(g)=\begin{pmatrix}c\end{pmatrix}. And now we know what happens to everything else in the group!

\displaystyle\rho_c(g^k)=\rho_c(g)^k=\begin{pmatrix}c\end{pmatrix}^k=\begin{pmatrix}c^k\end{pmatrix}

in particular, we have g^n=e, so these must go to the same value. That is,

\displaystyle\begin{pmatrix}c^n\end{pmatrix}=\rho_c(g^n)=\rho_c(e)=\begin{pmatrix}1\end{pmatrix}

and we must have c^n=1. Thus we have such a representation for each of the nth roots of unity, and these are all possible one-dimensional representations of \mathbb{Z}_n.

This illustrates a common technique for finding representations, at least when we have a nice finite presentation of our group: choose an image for each of the generators, and use the relations to give us equations which these images must satisfy. The fact that this works out is deeply wrapped up in the universal properties of free groups, but if that sounds scary you don’t have to worry about it!

We should include at least one example of a higher-degree representation. A nice one is the following representation of the group \mathbb{Z} of integers. Remember that the usual operation on integers is addition, not “multiplication” like we usually say for groups. With that in mind, we set

\displaystyle\rho(n)=\begin{pmatrix}1&n\\{0}&1\end{pmatrix}

It’s straightforward to check that the additive identity 0 is sent to the identity matrix, and that the group operation is preserved:

\displaystyle\rho(m)\rho(n)=\begin{pmatrix}1&m\\{0}&1\end{pmatrix}\begin{pmatrix}1&n\\{0}&1\end{pmatrix}=\begin{pmatrix}1&m+n\\{0}&1\end{pmatrix}=\rho(m+n)

September 13, 2010 Posted by | Algebra, Group theory, Representation Theory | Leave a comment

Conjugates

For some more review, let’s recall the idea of conjugation in a group G. We say that two elements g and h are “conjugate” if g=khk^{-1} for some k\in G.

This is an equivalence relation — reflexive, symmetric, and transitive. Any element g is conjugate to itself by the group identity; if g=khk^{-1}, then h=k^{-1}g\left(k^{-1}\right)^{-1}; if f=jgj^{-1} and g=khk^{-1}, then f=(jk)h(jk)^{-1}. Thus the set underlying the group G can be partitioned into “conjugacy classes”: two group elements are in the same conjugacy class if and only if they are conjugate. The conjugacy class containing a group g is commonly written K_g. The different conjugacy classes are pairwise disjoint, and their union is all of G.

Now, since these days we’re concerned with the symmetric group. And it turns out that some nice things happen in this case. We’ve actually already seen a lot of these!

First of all, when we write a permutation in cycle notation it’s easy to see what conjugation does to it. Indeed, if (a_1\,a_2\,\dots\,a_k) is a k-cycle and if g is any other permutation, then we can write down the conjugate:

\displaystyle g(a_1\,\dots\,a_k)g^{-1}=(g(a_1)\,\dots\,g(a_k))

And the same goes for any other permutation written in cycle notation: the conjugate by g is given by applying g to each symbol in the cycle notation. In particular, any two conjugate permutations have the same cycle type. In fact, the converse is also true: given any two permutations with the same cycle type, we can find a permutation that sends the one to the other.

For example, consider the permutations (1\,5\,2)(3\,4) and (2\,3\,1)(4\,5). Stack them on top of each other:

\displaystyle\begin{aligned}(1\,5\,2)&(3\,4)\\(2\,3\,1)&(4\,5)\end{aligned}

turn this into two-line notation

\displaystyle\left\lvert\begin{matrix}1&5&2&3&4\\2&3&1&4&5\end{matrix}\right\rvert

and we’ve got a permutation (1\,2)(3\,4\,5) which sends (1\,5\,2)(3\,4) to (2\,3\,1)(4\,5):

\displaystyle(1\,2)(3\,4\,5)\cdot(1\,5\,2)(3\,4)\cdot(2\,1)(5\,4\,3)=(1\,2\,3)(4\,5)

which is equivalent to (2\,3\,1)(4\,5). That is, two permutations are conjugate if and only if they have the same cycle type. This is really big. Given a cycle type \lambda, we will write the corresponding conjugacy class as K_\lambda.

We also know from some general facts about group actions that the number of elements in the conjugacy class K_g is equal to the number of cosets of the “centralizer” Z_g. We recall that the centralizer of g is the collection of group elements that commute with g. That is:

\displaystyle Z_g=\left\{h\in G\vert hgh^{-1}=g\right\}

and we have the equation

\displaystyle\lvert K_g\rvert=\frac{\lvert G\rvert}{\lvert Z_g\rvert}

In the case of S_n — and armed with our formula for conjugating permutations in cycle notation — we can use this to count the size of each K_\lambda. We know that \lvert S_n\rvert=n!, so all we need is to find out how many permutations h leave a permutation g (with cycle type \lambda) the same when we conjugate it to get hgh^{-1}. We will write this number as z_\lambda=\lvert Z_g\rvert, and similarly we will write k_\lambda=\lvert K_\lambda\rvert=\lvert K_g\rvert.

So, how can we change the cycle notation of g and get something equivalent back again? Well, for any k-cycle in g we can rotate it around in k different ways. For instance, (1\,2\,3) is the same as (2\,3\,1) is the same as (3\,1\,2). We can also shuffle around any cycles that have the same length: (1\,2)(3\,4) is the same as (3\,4)(1\,2).

Thus if g has cycle type \lambda=(1^{m_1},2^{m_2},\dots,n^{m_n}), then we can shuffle the 1-cycles m_1! ways; we can shuffle the 2-cycles m_2! ways; and so on until we can shuffle the n-cycles m_n! ways. Each 1-cycle can be rotated into 1 different position for a total of 1^{m_1} choices; each 2-cycle can be rotated into 2 different positions for a total of 2^{m_2} choices; and so on until each n-cycle can be rotated into n different positions for a total of n^{m_n} choices. Therefore we have a total of

1^{m_1}m_1!2^{m_2}m_2!\dots n^{m_n}m_n!

different ways of writing the same permutation g. And each of these ways corresponds to a permutation in Z_g. We have thus calculated

\lvert Z_g\rvert=1^{m_1}m_1!2^{m_2}m_2!\dots n^{m_n}m_n!

and we conclude

\displaystyle k_\lambda=\lvert K_g\rvert=\frac{\lvert S_n\rvert}{\lvert Z_g\rvert}=\frac{n!}{z_\lambda}=\frac{n!}{1^{m_1}m_1!2^{m_2}m_2!\dots n^{m_n}m_n!}

As a special case, how many transpositions are there in the group S_n? Recall that a transposition is a permutation of the form (a\,b), which has cycle type (1^{n-2},2^1,3^0,\dots,n^0). Our formula tells us that there are

\displaystyle\frac{n!}{1^{n-2}(n-2)!2^11!3^00!\dots n^00!}=\frac{n!}{(n-2)!2}=\binom{n}{2}

permutations in this conjugacy class, as we could expect.

September 10, 2010 Posted by | Algebra, Group theory, Representation Theory, Representations of Symmetric Groups | 4 Comments

Cycle Type

One concept I’d like to introduce is that of the “cycle type” of a permutation. This simply counts the number of cycles of various length in a permutation. For example — using the sample permutations from last time\alpha=(1\,2)(3\,4) has two cycles of length two, while \beta=(1\,4\,3)(2) has one cycle of length three and one “fixed point” — a cycle of length one. No cycle in a permutation in S_n can be longer than n, so we only need to count up to n. We collect the information in a cycle type into a tuple of the form (1^{m_1},2^{m_2},\dots,n^{m_n}). The cycle type of \alpha is (1^0,2^2,3^0), and that of \beta is (1^1,2^0,3^1).

It should be clear that the sum of the cycle lengths is n. In a formula:

\displaystyle\sum\limits_{k=1}^nkm_k=n

That is, the cycle type breaks up or “partitions” n into chunks whose total size adds up to n. In general, a partition \lambda of n is a sequence of numbers (\lambda_1,\dots,\lambda_l) in nonincreasing order, whose sum is n. Thus the cycle type of \alpha gives us the partition (2,2), while the cycle type of \beta gives us the partition (3,1).

One more example, from the beginning: the two-line notation

\displaystyle\left\lvert\begin{matrix}1&2&3&4&5\\2&3&1&4&5\end{matrix}\right\rvert

describing a permutation in S_5 has the cycle notation (1\,2\,3)(4)(5). Its cycle type is (1^2,2^0,3^1,4^0,5^0), which corresponds to the partition (3,1,1).

September 9, 2010 Posted by | Algebra, Group theory, Representation Theory, Representations of Symmetric Groups | 1 Comment

Some Review

Before we push on into our new topic, let’s look back at some of the background that we’ve already covered.

We’re talking about symmetric groups, which are, of course, groups. We have various ways of writing down an element of S_n, including the two-line notation and the cycle notation that are covered in our earlier description of the symmetric groups. As an example, the two-line notation

\displaystyle\left\lvert\begin{matrix}1&2&3&4\\2&1&4&3\end{matrix}\right\rvert

and the cycle notation (1\,2)(3\,4) both describe the permutation \alpha\in S_4 that sends 1 to 2, 2 back to 1, and similarly swaps 3 and 4. Similarly, the two-line notation
the composition of
\displaystyle\left\lvert\begin{matrix}1&2&3&4\\4&2&1&3\end{matrix}\right\rvert

and the cycle notation (1\,4\,3)(2) or (equivalently) (1\,4\,3) describe the permutation \beta that cycles the elements 1, 4, and 3 (in that order) and leaves 2 untouched.

We’re specifically concerned with complex representations of these groups. That is, we want to pick some complex vector space V, and for each permutation \sigma\in S_n we want to come up with some linear transformation \rho(\sigma):V\to V for which the composition of linear transformations and the composition of permutations are “the same” in the sense that given two permutations \sigma and \tau, the transportation corresponding to the composite \rho(\sigma\tau) is equal to the composite of the corresponding transformations \rho(\sigma)\rho(\tau).

We’re primarily interested in finite-dimensional representations. That is, ones for which V is a finite-dimensional complex vector space. In this case, we know that we can always just assume that V=\mathbb{C}^k — the space of k-tuples of complex numbers — and that linear transformations are described by matrices. Composition of transformations is reflected in matrix multiplication. That is, for every permutation \sigma\in S_n we want to come up with an k\times k matrix \rho(\sigma) so that the matrix \rho(\sigma\tau) corresponding to the composition of two permutations is the product \rho(\sigma)\rho(\tau) of the matrices corresponding to the two permutations. I’ll be giving some more explicit examples soon.

September 8, 2010 Posted by | Algebra, Group theory, Representation Theory, Representations of Symmetric Groups | 5 Comments

New Topic: The Representation Theory of the Symmetric Group

Okay, I’m done with measure theory (for now, at least), and not a moment too soon. It’s been good for me to work through all this stuff again, and I hope it’s provided a good resource, but my traffic has really taken a hit, at least as measured by daily page views. I think maybe everybody else hates analysis too?

So let’s go in a completely different direction! I want to talk about the representation theory of permutation groups. Now at least on the surface you might not think there’s a lot to say, but it’s a surprisingly detailed subject. And since every finite group can be embedded in a permutation group — its action on itself by left multiplication permutes its own elements — and many natural symmetries come in the form of permutations, it’s a very useful subject as well.

But as a niche it doesn’t get taught very much. Even the most direct application I know (the representation theory of classical groups) usually avoids getting into the details of this topic. And so it’s likely that many if not most readers haven’t really seen much of it. Along the way we’ll learn a certain amount about the representation theory of more general finite groups, so if you don’t know much about it yet, don’t worry! I’ll try to link back to more general background information where appropriate — and please ask me to fill in points that I seem to skim over — but the coverage should be pretty self-contained. The great thing about this is that you don’t have to have read anything else I’ve written, and you don’t have to be a particularly expert mathematician to follow along! And of course neither do your friends and acquaintances, so this is the perfect chance to pass the word along. Don’t be shy about telling other people to join in!

There’s also going to be a fair amount of focus on the combinatorics involved in the representation theory, and there are some really amazingly beautiful patterns. But we’ll also see some very explicit descriptions of how to actually count these things. If I feel up to it, I may try to actually implement some of the algorithms we see and talk about that over at my new programming weblog The Unapologetic Programmer. Oh yeah: if you haven’t heard yet, I’ve got a new weblog about programming, so all of you into software design should be heading over there and adding it to your RSS reader. And since my work there is a lot more casual, you should tell your other programmer friends about it too!

So enough shameless self-promotion; let’s get down to business.

September 7, 2010 Posted by | Algebra, Combinatorics, Group theory, Representation Theory, Representations of Symmetric Groups | 15 Comments

Some Continuous Duals

I really wish I could just say L^p in post titles.

Anyway, I want to investigate the continuous dual of L^p(\mu) for 1\leq p<\infty. That is, we’re excluding the case where either p (but not its Hölder conjugate q) is infinite. And I say that when (X,\mathcal{S},\mu) is \sigma-finite, the space L^p(\mu)' of bounded linear functionals on L^p(\mu) is isomorphic to L^q(\mu).

First, I’m going to define a linear map \kappa_p:L^q\to\left(L^p\right)'. Given a function g\in L^q, let \kappa_p(g) be the linear functional defined for any f\in L^p by

\displaystyle\left[\kappa_p(g)\right](f)=\int fg\,d\mu

It’s clear from the linearity of multiplication and of the integral itself, that this is a linear functional on L^p. Hölder’s inequality itself shows us that not only does the integral on the right exist, but

\displaystyle\lvert\left[\kappa_p(g)\right](f)\rvert\leq\lVert fg\rVert_1\leq\lVert f\rVert_p\lVert g\rVert_q

That is, \kappa_p(g) is a bounded linear functional, and the operator norm \lVert\kappa_p(g)\rVert_\text{op} is at most the L^q norm of g. The extremal case of Hölder’s inequality shows that there is some f for which this is an equality, and thus we conclude that \lVert\kappa_p(g)\rVert_\text{op}=\lVert g\rVert_q. That is, \kappa_p:L^q\to\left(L^p\right)' is an isometry of normed vector spaces. Such a mapping has to be an injection, because if \kappa_p(g)=0 then 0=\lVert\kappa_p(g)\rVert_\text{op}=\lVert g\rVert_q, which implies that g=0.

Now I say that \kappa_p is also a surjection. That is, any bounded linear functional \Lambda:L^p\to\mathbb{R} is of the form \kappa_p(g) for some g\in L^q. Indeed, if \Lambda=0 then we can just pick g=0 as a preimage. Thus we may assume that \Lambda is a nonzero bounded linear functional on L^p, and \lVert\Lambda\rVert_\text{op}>0. We first deal with the case of a totally finite measure space.

In this case, we define a set function on measurable sets by \lambda(E)=\Lambda(\chi_E). It’s straightforward to see that \lambda is additive. To prove countable additivity, suppose that E is the countable disjoint union of a sequence \{E_n\}. If we write A_k for the union of E_1 through E_k, we find that

\displaystyle\lVert\chi_E-\chi_{A_k}\rVert_p=\left(\mu(E\setminus A_k)\right)^\frac{1}{p}\to0

Since \Lambda is continuous, we conclude that \lambda(A_k)\to\lambda(E), and thus that \lambda is a (signed) measure. It should also be clear that \mu(E)=0 implies \lambda(E)=0, and so \lambda\ll\mu. The Radon-Nikodym theorem now tells us that there exists an integrable function g so that

\displaystyle\Lambda(\chi_E)=\lambda(E)=\int\limits_Eg\,d\mu=\int\chi_Eg\,d\mu

Linearity tells us that

\displaystyle\Lambda(f)=\int fg\,d\mu

for simple functions f, and also for every f\in L^\infty(\mu), since each such function is the uniform limit of simple functions. We want to show that g\in L^q.

If p=1, then we must show that g is essentially bounded. In this case, we find

\displaystyle\left\lvert\int\limits_Eg\,d\mu\right\rvert\leq\lVert\Lambda\rVert_\text{op}\,\lVert\chi_E\rVert_1=\lVert\Lambda\rVert_\text{op}\mu(E)

for every measurable E, from which we conclude that \lvert g(x)\rvert\leq\lVert\Lambda\rVert_\text{op} a.e., or else we could find some set on which this inequality was violated. Thus \lVert g\rVert_\infty\leq\lVert\Lambda\rVert_\text{op}.

For other p, we can find a measurable \alpha with \lvert\alpha\rvert=1 so that \alpha g=\lvert g\rvert. Setting E_n=\{x\in X\vert n\geq\lvert g(x)\rvert\} and defining f=\chi_{E_n}\lvert g\rvert^{q-1}\alpha, we find that \lvert f\rvert^p=\lvert g\rvert^q on E_n, f\in L^\infty, and so

\displaystyle\int\limits_{E_n}\lvert g\rvert^q\,d\mu=\int\limits_Xfg\,d\mu=\Lambda(f)\leq\lVert\Lambda\rVert_\text{op}=\lVert\Lambda\rVert_\text{op}\left(\int\limits_{E_n}\lvert g\rvert^q\,d\mu\right)^\frac{1}{p}

We thus find

\displaystyle\left(\int\limits_{E_n}\lvert g\rvert^q\,d\mu\right)^\frac{1}{q}=\left(\int\limits_{E_n}\lvert g\rvert^q\,d\mu\right)^{1-\frac{1}{p}}\leq\lVert\Lambda\rVert_\text{op}

and thus

\displaystyle\int\limits_X\chi_{E_n}\lvert g\rvert^q\,d\mu\leq\lVert\Lambda\rVert_\text{op}^q

Applying the monotone convergence theorem as n\to\infty we find that \lVert g\rVert_q\leq\lVert\Lambda\rVert_\text{op}.

Thus in either case we’ve found a g\in L^q so that \Lambda=\kappa_p(g).

In the \sigma-finite case, we can write X as the countable disjoint union of sets X_i with \mu(X_i)<\infty. We let Y_k be the union of the first k of these sets. We note that \lVert\chi_E f\rVert_p\leq\lVert f\rVert_p for every measurable set E, so f\mapsto\Lambda(\chi_Ef) is a linear functional on L^p of norm at most \lVert\Lambda\rVert_\text{op}. The finite case above shows us that there are functions g_i on X_i so that

\displaystyle\Lambda(\chi_{X_i}f)=\int\limits_{X_i}fg_i\,d\mu.

We can define g_i(x)=0 if x\notin X_i, and let g be the sum of all these g_i. We see that

\displaystyle\Lambda(\chi_{Y_k}f)=\int\limits_{Y_k}f(g_1+\dots+g_k)\,d\mu

for every f\in L^p, and since \mu(Y_k)<\infty we find that \lVert g_1+\dots+g_k\rVert_q\leq\lVert\Lambda\rVert_\text{op}. Then Fatou’s lemma shows us that \lVert g\rVert_q\leq\lVert\Lambda\rVert_\text{op}. Thus the \sigma-finite case is true as well.

One case in particular is especially worthy of note: since 2 is Hölder-coonjugate to itself, we find that L^2 is isomorphic to its own continuous dual space in the same way that a finite-dimensional inner-product space is isomorphic to its own dual space.

September 3, 2010 Posted by | Analysis, Functional Analysis, Measure Theory | Leave a comment

Bounded Linear Transformations

In the context of normed vector spaces we have a topology on our spaces and so it makes sense to ask that maps f:V\to W between them be continuous. In the finite-dimensional case, all linear functions are continuous, so this hasn’t really come up before in our study of linear algebra. But for functional analysis, it becomes much more important.

Now, really we only need to require continuity at one point — the origin, to be specific — because if it’s continuous there then it’ll be continuous everywhere. Indeed, continuity at v' means that for any \epsilon>0 there is a \delta>0 so that \lVert v-v'\rVert_V<\delta implies \lVert f(v)-f(v')\rVert_W=\lVert f(v-v')\rVert_W<\epsilon. In particular, if v'=0, then this means \lVert v\rVert_V<\delta implies \lVert f(v)\rVert_W<\epsilon. Clearly if this holds, then the general version also holds.

But it turns out that there’s another equivalent condition. We say that a linear transformation f:V\to W is “bounded” if there is some M>0 such that \lVert f(v)\rVert_W\leq M\lVert v\rVert_V for all v\in V. That is, the factor by which f stretches the length of a vector is bounded. By linearity, we only really need to check this on the unit sphere \lVert v\rVert_V=1, but it’s often just as easy to test it everywhere.

Anyway, I say that a linear transformation is continuous if and only if it’s bounded. Indeed, if f:V\to W is bounded, then we find

\displaystyle M\lVert h \rVert_V\geq\lVert f(h)\rVert_W=\lVert f(v+h)-f(v)\rVert_W

so as we let h approach 0 — as v+h approaches v — the difference between f(v+h) and f(v) approaches zero as well. And so f is continuous.

Conversely, if f is continuous, then it is bounded. Since it’s continuous, we let \epsilon=1 and find a \delta so that \lVert f(h)\rVert_W<1 for all vectors h with \lVert h\rVert_V<\delta. Thus for all nonzero v\in V we find

\displaystyle\begin{aligned}\lVert f(v)\rVert_W&=\left\lVert\frac{\lVert v\rVert_V}{\delta}f\left(\delta\frac{v}{\lVert v\rvert_V}\right)\right\rVert_W\\&=\frac{\lVert v\rVert_V}{\delta}\left\lVert f\left(\delta\frac{v}{\lVert v\rVert_V}\right)\right\rVert_W\\&\leq\frac{\lVert v\rVert_V}{\delta}\,1\\&=\frac{1}{\delta}\lVert v\rVert_V\end{aligned}

Thus we can use M=\frac{1}{\delta} and conclude that f is bounded.

The least such M that works in the condition for f to be bounded is called the “operator norm” of f, which we write as \lVert f\rVert_\text{op}. It’s straightforward to verify that \lVert cf\rVert_\text{op}=\lvert c\rvert\lVert f\rVert_\text{op}, and that \lVert f\rVert_\text{op}=0 if and only if f is the zero operator. It remains to verify the triangle identity.

Let’s say that we have bounded linear transformations f:V\to W and g:T\to W with operator norms M=\lVert f\rVert_\text{op} and N=\lVert g\rVert_\text{op}, respectively. We will show that M+N works as a bound for f+g, and thus conclude that \lVert f+g\rVert_\text{op}\leq\lVert f\rVert_\text{op}+\lVert g\rVert_\text{op}. Indeed, we check that

\displaystyle\begin{aligned}\lVert[f+g](v)\rVert_W&=\lVert f(v)+g(v)\rVert_W\\&\leq\lVert f(v)\rVert_W+\lVert g(v)\rVert_W\\&\leq M\lVert v\rVert_V+N\lVert v\rVert_V\\&=(M+N)\lVert v\rVert_V\end{aligned}

and our assertion follows. In particular, when our base field is itself a normed linear space (like \mathbb{C} or \mathbb{R} itself) we can conclude that the “continuous dual spaceV' consisting of bounded linear functionals \Lambda:V\to\mathbb{F} is a normed linear space using the operator norm on V'.

September 2, 2010 Posted by | Analysis, Functional Analysis, Measure Theory | 2 Comments

Follow

Get every new post delivered to your Inbox.

Join 366 other followers