Representations of a Polynomial Algebra
Sorry for the delays, but tests are killing me this week.
Okay, so let’s take the algebra of polynomials, $\mathbb{F}[X]$, and consider its representation theory.
What is a representation of this algebra? It’s a homomorphism of $\mathbb{F}$-algebras $\rho:\mathbb{F}[X]\to\mathrm{End}(V)$. But the algebra of polynomials satisfies a universal property! A homomorphism of $\mathbb{F}$-algebras is uniquely determined by the image of the single element $X$, and we can pick this image freely. That is, once we pick a linear transformation $T\in\mathrm{End}(V)$ and set $\rho(X)=T$, then we are forced to use $\rho(p(X))=p(T)$ for all the other polynomials. That is, representations of $\mathbb{F}[X]$ on $V$ are in bijection with the linear transformations $T\in\mathrm{End}(V)$.
But remember that these representations don’t live in a vacuum. No, they’re just the objects of a whole category of representations. We need to consider the morphisms between representations too!
So if $S:V\to V$ and $T:W\to W$ are linear transformations, what’s a morphism $f:S\to T$? It’s a linear map $f:V\to W$ such that $f\circ S=T\circ f$. Notice that if $f$ intertwines the linear maps $S$ and $T$, then it will automatically intertwine the values of $p(S)$ and $p(T)$ for every polynomial $p$.
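To spell out why the single relation is enough, here is a short sketch of the induction. The labels are my own assumptions: $f$ is the intertwiner, with $f\circ S=T\circ f$, and the polynomial is $p(X)=\sum_k c_kX^k$.

```latex
\begin{aligned}
f\circ S^{k+1} &= (f\circ S^k)\circ S = (T^k\circ f)\circ S = T^k\circ(T\circ f) = T^{k+1}\circ f,\\
f\circ p(S) &= \sum_k c_k\,f\circ S^k = \sum_k c_k\,T^k\circ f = p(T)\circ f.
\end{aligned}
```

The first line is the inductive step (the case $k=0$ is trivial), and the second uses nothing but the linearity of $f$.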
Rather than try to examine this condition in detail (which leads to an interesting problem in the theory of quivers, if I recall), let’s just consider which representations are isomorphic. That is, let’s decategorify this category.
So we ask that the linear map $f$ be an isomorphism, with inverse $f^{-1}$. Then we can take the intertwining relation $f\circ S=T\circ f$ and compose on the left with $f^{-1}$ to find $S=f^{-1}\circ T\circ f$. But this uniquely specifies $S$ given $T$ and $f$. That is, given a representation $T$ and an isomorphism $f:V\to W$, there is a unique representation $S$ so that $f$ is a natural isomorphism.
And we’re drawn again to consider the special case where $V=W$. Now an isomorphism is just a change of basis. Representations of $\mathbb{F}[X]$ are equivalent if they do “the same thing” to the vector space $V$, but just express it with different coordinates.
So here’s the upshot: the general linear group $\mathrm{GL}(V)$ acts on the hom-set $\hom(V,V)=\mathrm{End}(V)$ by conjugation, that is, by basis changes. In fact, this is a representation of the group, but I’m not ready to go into that detail right now. What I can say is that the orbits of this action are in bijection with the equivalence classes of representations of $\mathbb{F}[X]$ on $V$.
The Category of Representations
Now let’s narrow back in to representations of algebras, and the special case of representations of groups, but with an eye to the categorical interpretation. So, representations are functors. And this immediately leads us to the category of such functors. The objects, recall, are functors, while the morphisms are natural transformations. Now let’s consider what, exactly, a natural transformation consists of in this case.
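Let’s say we have representations $\rho:A\to\mathrm{End}(V)$ and $\sigma:A\to\mathrm{End}(W)$. That is, we have functors $\rho$ and $\sigma$ with $\rho(*)=V$, $\sigma(*)=W$ (where $*$ is the single object of $A$, when it’s considered as a category) and the given actions on morphisms. We want to consider a natural transformation $\phi:\rho\to\sigma$.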
Such a natural transformation consists of a list of morphisms indexed by the objects of the category $A$. But $A$ has only one object: $*$. Thus we only have one morphism, $\phi_*:V\to W$, which we will just call $\phi$.
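Now we must impose the naturality condition. For each arrow $a:*\to*$ in $A$ we ask that the diagram
$$\begin{array}{ccc}V&\xrightarrow{\rho(a)}&V\\\downarrow{\scriptstyle\phi}&&\downarrow{\scriptstyle\phi}\\W&\xrightarrow{\sigma(a)}&W\end{array}$$
commute. That is, we want $\phi\circ\rho(a)=\sigma(a)\circ\phi$ for every algebra element $a\in A$. We call such a transformation an “intertwiner” of the representations. These intertwiners are the morphisms in the category $\mathbf{Rep}(A)$ of representations of $A$. If we want to be more particular about the base field, we might also write $\mathbf{Rep}_\mathbb{F}(A)$.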
Here’s another way of putting it. Think of $\phi$ as a “translation” from $V$ to $W$. If $\phi$ is an isomorphism of vector spaces, for instance, it could be a change of basis. We want to take a transformation from the algebra $A$ and apply it, and we also want to translate. We could first apply the transformation in $V$, using the representation $\rho$, and then translate to $W$. Or we could first translate from $V$ to $W$ and then apply the transformation, now using the representation $\sigma$. Our condition is that either order gives the same result, no matter which element of $A$ we’re considering.
Category Representations
We’ve seen how group representations are special kinds of algebra representations. But even more general than that is the representation of a category.
A group is a special monoid, within which each element is invertible. And a monoid is just a category with a single object. Similarly, an $\mathbb{F}$-algebra is just like a monoid but enriched over the category of vector spaces over $\mathbb{F}$. That is, it’s a one-object category with an $\mathbb{F}$-bilinear composition. It makes sense to regard both of these structures as categories of sorts. A representation will then be a functor from one of these categories.
The clear target category is $\mathbf{Vect}_\mathbb{F}$. So what’s a functor $\rho$ from, say, a group $G$ (considered as a category) to $\mathbf{Vect}_\mathbb{F}$? First the single object of the category $G$ picks out some object $V\in\mathbf{Vect}_\mathbb{F}$. That is, $V$ is a vector space over $\mathbb{F}$. Then for each arrow $g$ in $G$ (each group element) we have an arrow $\rho(g):V\to V$. Since $g$ has to be invertible, this $\rho(g)$ must be invertible: an element of $\mathrm{GL}(V)$.
What about an algebra? Now our source category $A$ and our target category $\mathbf{Vect}_\mathbb{F}$ are both enriched over $\mathbf{Vect}_\mathbb{F}$. It only makes sense, then, for us to consider $\mathbb{F}$-linear functors. Such a functor $\rho$ again picks out a single vector space $V$ for the single object of $A$ (considered as a category). Every arrow $a$ in $A$ gets sent to an arrow $\rho(a):V\to V$. This mapping is linear over the field $\mathbb{F}$.
So what do category representations get us? Well, one thing is this: consider a combinatorial graph — a collection of “vertices” with some directed “edges” joining them. A path in the graph is a sequence of directed edges joined tip-to-tail, and the collection of all paths in the graph constitutes the “path category” of the graph (exercise: identify the identity paths). A representation of this path category is what mathematicians call a “quiver representation”, and they’re big business.
More interesting to me is this: the category $\mathcal{T}ang$ of tangles (or $\mathcal{OT}ang$ of oriented tangles, $\mathcal{F}Tang$ of framed tangles, or $\mathcal{FOT}ang$ of framed, oriented tangles). This is a monoidal category with duals, as is $\mathbf{Vect}_\mathbb{F}$, and so it only makes sense to ask that our functors respect those structures as well. We don’t ask that it send the braiding to the symmetry on $\mathbf{Vect}_\mathbb{F}$, since that would trivialize the structure.
So what is a representation of the category $\mathcal{T}ang$? It is my contention that this is nothing but a knot invariant, viewed in a more natural habitat. A little more generally, knot invariants are the restrictions to knots (and links) of functors defined on the category of tangles, which can often (always?) be decategorified, or otherwise rendered down, into representations of $\mathcal{T}ang$. This is my work: to translate existing knot theoretical ideas into this algebraic language, where I believe they find a better home.
Algebra Representations
We’ve defined a representation of the group $G$ as a homomorphism $\rho:G\to\mathrm{GL}(V)$ for some vector space $V$. But where did we really use the fact that $G$ is a group?
This leads us to the more general idea of representing a monoid $M$. Of course, now we don’t need the image of a monoid element to be invertible, so we may as well just consider a homomorphism of monoids $\rho:M\to\mathrm{End}(V)$, where we consider this endomorphism algebra as a monoid under composition.
And, of course, once we’ve got monoids and $\mathbb{F}$-linearity floating around, we’re inexorably drawn (Serge would say we have an irresistible compulsion) to consider monoid objects in the category of $\mathbb{F}$-modules. That is: $\mathbb{F}$-algebras.
And, indeed, things work nicely for $\mathbb{F}$-algebras. We say a representation of an $\mathbb{F}$-algebra $A$ is a homomorphism $\rho:A\to\mathrm{End}(V)$ for some vector space $V$ over $\mathbb{F}$. How else can we view such a homomorphism?
Well, it turns an algebra element into an endomorphism. And the most important thing about an endomorphism is that it does something to vectors. So given an algebra element $a\in A$, and a vector $v\in V$, we get a new vector $\left[\rho(a)\right](v)$. And this operation is $\mathbb{F}$-linear in both of its variables. So we have a linear map $A\otimes V\to V$, built from the representation $\rho$ and the evaluation map $\mathrm{ev}:\mathrm{End}(V)\otimes V\to V$. But this is just a left $A$-module!
In fact, the evaluation above is the counit of the adjunction between $-\otimes V$ and the internal $\hom$ functor $\hom(V,-)$. This adjunction is a natural isomorphism of $\hom$ sets: $\hom(A\otimes V,V)\cong\hom(A,\hom(V,V))$. That is, left $A$-modules are in natural bijection with representations of $A$. In practice, we just consider the two structures to be the same, and we talk interchangeably about modules and representations.
As it would happen, the notion of an algebra representation properly extends that of a group representation. Given any group $G$ we can build the group algebra $\mathbb{F}[G]$. As a vector space, this has a basis vector $e_g$ for each group element $g\in G$. We then define a multiplication on pairs of basis elements by $e_ge_h=e_{gh}$, and extend by bilinearity.
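Now it turns out that representations of the group $G$ and representations of the group algebra $\mathbb{F}[G]$ are in bijection. Indeed, the basis vectors $e_g$ are invertible in the algebra $\mathbb{F}[G]$. Thus, given a homomorphism $\rho:\mathbb{F}[G]\to\mathrm{End}(V)$, the linear maps $\rho(e_g)$ must be invertible. And so we have a group representation $g\mapsto\rho(e_g)\in\mathrm{GL}(V)$. Conversely, if $\rho:G\to\mathrm{GL}(V)$ is a representation of the group $G$, then we can define $\rho(e_g)=\rho(g)$ and extend by linearity to get an algebra representation $\rho:\mathbb{F}[G]\to\mathrm{End}(V)$.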
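To make the group algebra a little more concrete, here is a minimal sketch in Python. It assumes a hypothetical choice of group (the cyclic group of order 3, written additively) and stores an element of the group algebra as a dictionary from group elements to rational coefficients. None of these names come from the discussion above.

```python
from fractions import Fraction
from itertools import product

N = 3  # hypothetical choice: the cyclic group Z/3 under addition mod 3


def basis(g):
    """The basis vector e_g of the group algebra."""
    return {g % N: Fraction(1)}


def multiply(a, b):
    """Multiply two group-algebra elements stored as {group element: coefficient}."""
    result = {}
    for (g, x), (h, y) in product(a.items(), b.items()):
        gh = (g + h) % N  # the rule e_g e_h = e_{gh}, with the group written additively
        result[gh] = result.get(gh, 0) + x * y
    # Drop zero coefficients so equal elements compare equal as dictionaries.
    return {g: c for g, c in result.items() if c != 0}


# Each basis vector is invertible: e_1 e_2 = e_0, the unit of the algebra.
assert multiply(basis(1), basis(2)) == basis(0)

# Extending by bilinearity: (e_0 + 2 e_1)(e_1 - e_2) = -2 e_0 + e_1 + e_2.
a = {0: Fraction(1), 1: Fraction(2)}
b = {1: Fraction(1), 2: Fraction(-1)}
assert multiply(a, b) == {0: -2, 1: 1, 2: 1}
```

The invertibility check is the fact the bijection leans on: any algebra homomorphism must send each basis vector to an invertible linear map.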
So we have representations of algebras. Within that we have the special cases of representations of groups. These allow us to cast abstract algebraic structures into concrete forms, acting as transformations of vector spaces.
Group Representations
We’ve now got the general linear group $\mathrm{GL}(V)$ of all invertible linear maps from a vector space $V$ to itself. Incidentally this lives inside the endomorphism algebra $\mathrm{End}(V)$ of all linear transformations from $V$ to itself. In fact, in ring-theory terms it’s the group of units of that algebra. So what can we do with it?
One of the biggest uses is to provide representations for other algebraic structures. Let’s say we’ve got some abstract group $G$. It’s a set with some binary operation defined on it, sure, but what does it do? We’ve seen groups acting on sets before, where we interpret a group element as a permutation of an actual collection of elements. Alternatively, an action of a group $G$ is a homomorphism from $G$ to the group of permutations of some set $S$: $G\to\mathrm{Aut}(S)$.
Another concrete representation of a group is as symmetries of some vector space. That is, we’re interested in homomorphisms $\rho:G\to\mathrm{GL}(V)$. A “representation” of a group $G$ is a vector space $V$ with such a homomorphism.
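In fact, this extends the notion of a group acting on a set. Indeed, for any set $S$ we can build the free vector space $\mathbb{F}[S]$ with a basis vector $e_s$ for each $s\in S$. Given a permutation $\pi$ on $S$ we get a linear map $\mathbb{F}[\pi]:\mathbb{F}[S]\to\mathbb{F}[S]$ defined by setting $\mathbb{F}[\pi](e_s)=e_{\pi(s)}$ and extending by linearity.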
We thus get a homomorphism from the group of permutations of $S$ to $\mathrm{GL}(\mathbb{F}[S])$. And then if we have a group action on $S$ we can promote it to a representation on the vector space $\mathbb{F}[S]$. We call such a representation a “permutation representation”.
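As a quick sanity check on the permutation representation, here is a small sketch in Python. It assumes the set is $\{0,1,2\}$, stores a permutation as a tuple `pi` with `pi[s]` the image of `s`, and uses hypothetical helper names throughout.

```python
from itertools import permutations


def perm_matrix(pi):
    """Matrix sending e_s to e_{pi(s)}: column s has its single 1 in row pi[s]."""
    n = len(pi)
    return [[1 if pi[s] == r else 0 for s in range(n)] for r in range(n)]


def matmul(a, b):
    """Product of two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def compose(pi, sigma):
    """The permutation s -> pi(sigma(s))."""
    return tuple(pi[sigma[s]] for s in range(len(sigma)))


# The assignment pi -> perm_matrix(pi) respects composition on all of S_3.
S3 = list(permutations(range(3)))
assert all(
    perm_matrix(compose(p, q)) == matmul(perm_matrix(p), perm_matrix(q))
    for p in S3
    for q in S3
)
```

Every matrix produced this way has exactly one 1 in each row and column, so each is invertible, with inverse given by the inverse permutation.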
General Linear Groups — Generally
Monday, we saw that the general linear groups are matrix groups, specifically consisting of those whose columns are linearly independent. But what about more general vector spaces?
Well, we know that every finite-dimensional vector space has a basis, and is thus isomorphic to $\mathbb{F}^n$, where $n$ is the cardinality of the basis. So given a vector space $V$ with a basis $\{f_i\}$ of cardinality $n$, we have the isomorphism $S:\mathbb{F}^n\to V$ defined by $S(e_i)=f_i$ and $S^{-1}(f_i)=e_i$.
This isomorphism of vector spaces then induces an isomorphism of their automorphism groups. That is, $\mathrm{GL}(V)\cong\mathrm{GL}_n(\mathbb{F})$. Given an invertible linear transformation $T:V\to V$, we can conjugate it by $S$ to get $S^{-1}TS:\mathbb{F}^n\to\mathbb{F}^n$. This has inverse $S^{-1}T^{-1}S$, and so is an element of $\mathrm{GL}_n(\mathbb{F})$. Thus (not unexpectedly) every invertible linear transformation from a vector space $V$ to itself gets an invertible matrix.
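But this assignment depends essentially on the arbitrary choice of the basis $\{f_i\}$ for $V$. What if we choose a different basis $\{\tilde{f}_i\}$? Then we get a new transformation $\tilde{S}:\mathbb{F}^n\to V$ and a new isomorphism of groups $T\mapsto\tilde{S}^{-1}T\tilde{S}$. But this gives us an inner automorphism of $\mathrm{GL}_n(\mathbb{F})$. Given a transformation $M:\mathbb{F}^n\to\mathbb{F}^n$, we get the transformation
$$\tilde{S}^{-1}SMS^{-1}\tilde{S}=\left(\tilde{S}^{-1}S\right)M\left(\tilde{S}^{-1}S\right)^{-1}$$
This composite $\tilde{S}^{-1}S$ sends $\mathbb{F}^n$ to itself, and it has an inverse. Thus changing the basis on $V$ induces an inner automorphism of the matrix group $\mathrm{GL}_n(\mathbb{F})$.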
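Now let’s consider a linear transformation $T:V\to V$. We have two bases for $V$, and thus two different matrices (two different elements of $\mathrm{End}(\mathbb{F}^n)$) corresponding to $T$: $S^{-1}TS$ and $\tilde{S}^{-1}T\tilde{S}$. We get from one to the other by conjugation with $\tilde{S}^{-1}S$:
$$\tilde{S}^{-1}T\tilde{S}=\left(\tilde{S}^{-1}S\right)\left(S^{-1}TS\right)\left(\tilde{S}^{-1}S\right)^{-1}$$
And what is this transformation $\tilde{S}^{-1}S$? How does it act on a basis vector in $\mathbb{F}^n$? We calculate:
$$\tilde{S}^{-1}\left(S(e_j)\right)=\tilde{S}^{-1}(f_j)=\tilde{S}^{-1}\left(x_j^i\tilde{f}_i\right)=x_j^i\tilde{S}^{-1}\left(\tilde{f}_i\right)=x_j^ie_i$$
where $f_j=x_j^i\tilde{f}_i$ expresses the vectors in one basis for $V$ in terms of those of the other. That is, the $j$th column of the matrix $X$ consists of the components of $f_j$ written in terms of the $\tilde{f}_i$. Similarly, the inverse matrix $X^{-1}$, with entries $\tilde{x}_j^i$, writes the $\tilde{f}_j$ in terms of the $f_i$: $\tilde{f}_j=\tilde{x}_j^if_i$.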
It is these “change-of-basis” matrices that effect all of our, well, changes of basis. For example, say we have a vector $v\in V$ with components $v=v^jf_j$. Then we can expand this:
$$v=v^jf_j=v^jx_j^i\tilde{f}_i$$
So our components in the new basis are $\tilde{v}^i=x_j^iv^j$.
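As another example, say that we have a linear transformation $T:V\to V$ with matrix components $t_i^j$ with respect to the basis $\{f_i\}$. That is, $T(f_i)=t_i^jf_j$. Then we can calculate:
$$T(\tilde{f}_i)=T\left(\tilde{x}_i^kf_k\right)=\tilde{x}_i^kT(f_k)=\tilde{x}_i^kt_k^jf_j=\tilde{x}_i^kt_k^jx_j^l\tilde{f}_l$$
and we have the new matrix components $\tilde{t}_i^l=\tilde{x}_i^kt_k^jx_j^l$.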
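The two change-of-basis rules can be checked against each other numerically. This is only a sketch with made-up data: `X` stands for a change-of-basis matrix whose $j$th column holds the components of the old basis vector $f_j$ in the new basis, `A` is the matrix of a transformation in the old basis, and in matrix form the new components become `X @ A @ X^{-1}` for the transformation and `X @ v` for a vector.

```python
from fractions import Fraction as Fr


def matmul(a, b):
    """Product of matrices stored as lists of rows (column vectors are n-by-1)."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def inverse_2x2(m):
    """Inverse of a 2-by-2 matrix with nonzero determinant."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


# Hypothetical data, chosen only for illustration.
X = [[Fr(1), Fr(1)],
     [Fr(0), Fr(1)]]        # column j: components of f_j in the new basis
A = [[Fr(2), Fr(0)],
     [Fr(1), Fr(3)]]        # column i: components of T(f_i) in the old basis
v = [[Fr(1)], [Fr(2)]]      # components of a vector in the old basis

A_new = matmul(matmul(X, A), inverse_2x2(X))  # matrix of T in the new basis
v_new = matmul(X, v)                          # components of v in the new basis

# Applying T and then changing basis agrees with changing basis first
# and then applying the new matrix.
assert matmul(X, matmul(A, v)) == matmul(A_new, v_new)
```

Exact rational arithmetic is used so that the final comparison is an honest equality rather than a floating-point approximation.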
The General Linear Groups
Not just any general linear group $\mathrm{GL}(V)$ for any vector space $V$, but the particular groups $\mathrm{GL}_n(\mathbb{F})$. I can’t put LaTeX, or even HTML subscripts in post titles, so this will have to do.
The general linear group $\mathrm{GL}_n(\mathbb{F})$ is the automorphism group of the vector space $\mathbb{F}^n$ of $n$-tuples of elements of $\mathbb{F}$. That is, it’s the group of all invertible linear transformations sending this vector space to itself. The vector space $\mathbb{F}^n$ comes equipped with a basis $\{e_i\}$, where $e_i$ has a $1$ in the $i$th place, and $0$ elsewhere. And so we can write any such transformation as an $n\times n$ matrix.
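Let’s look at the matrix of some invertible transformation $T$:
$$\begin{pmatrix}t_1^1&t_2^1&\cdots&t_n^1\\t_1^2&t_2^2&\cdots&t_n^2\\\vdots&\vdots&\ddots&\vdots\\t_1^n&t_2^n&\cdots&t_n^n\end{pmatrix}$$
How does it act on a basis element? Well, let’s consider its action on $e_1$:
$$\begin{pmatrix}t_1^1&t_2^1&\cdots&t_n^1\\t_1^2&t_2^2&\cdots&t_n^2\\\vdots&\vdots&\ddots&\vdots\\t_1^n&t_2^n&\cdots&t_n^n\end{pmatrix}\begin{pmatrix}1\\0\\\vdots\\0\end{pmatrix}=\begin{pmatrix}t_1^1\\t_1^2\\\vdots\\t_1^n\end{pmatrix}$$
It just reads off the first column of the matrix of $T$. Similarly, $T(e_i)$ will read off the $i$th column of the matrix of $T$. This works for any linear endomorphism of $\mathbb{F}^n$: its columns are the images of the standard basis vectors. But as we said last time, an invertible transformation must send a basis to another basis. So the columns of the matrix of $T$ must form a basis for $\mathbb{F}^n$.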
Checking that they’re a basis turns out to be made a little easier by the special case we’re in. The vector space $\mathbb{F}^n$ has dimension $n$, and we’ve got $n$ column vectors to consider. If all $n$ are linearly independent, then the column rank of the matrix is $n$. Then the dimension of the image of $T$ is $n$, and thus $T$ is surjective.
On the other hand, any vector $T(v)$ in the image of $T$ is a linear combination of the columns of the matrix of $T$ (use the components of $v$ as coefficients). If these columns are linearly independent, then the only combination adding up to the zero vector has all coefficients equal to $0$. And so $T(v)=0$ implies $v=0$, and $T$ is injective.
Thus we only need to check that the columns of the matrix of $T$ are linearly independent to know that $T$ is invertible.
Conversely, say we’re given a list of $n$ linearly independent vectors $f_i$ in $\mathbb{F}^n$. They must be a basis, since any linearly independent set can be completed to a basis, and a basis of $\mathbb{F}^n$ must have exactly $n$ elements, which we already have. Then we can use the $f_i$ as the columns of a matrix. The corresponding transformation $T$ has $T(e_i)=f_i$, and extends from there by linearity. It sends a basis to a basis, and so must be invertible.
The upshot is that we can consider this group as a group of matrices. They are exactly the ones so that the set of columns is linearly independent.
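The criterion is easy to test by machine. Here is a sketch in Python that computes the rank by Gaussian elimination over the rationals; it row-reduces, which is enough because row rank and column rank agree (a standard fact that the argument above does not itself prove). The function names are hypothetical.

```python
from fractions import Fraction


def rank(matrix):
    """Rank of a matrix (list of rows), by exact Gaussian elimination."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    found = 0  # number of pivots found so far
    for col in range(len(rows[0]) if rows else 0):
        # Look for a row at or below position `found` with a nonzero entry here.
        pivot = next((r for r in range(found, len(rows)) if rows[r][col] != 0), None)
        if pivot is None:
            continue
        rows[found], rows[pivot] = rows[pivot], rows[found]
        # Clear this column in every other row.
        for r in range(len(rows)):
            if r != found and rows[r][col] != 0:
                factor = rows[r][col] / rows[found][col]
                rows[r] = [x - factor * y for x, y in zip(rows[r], rows[found])]
        found += 1
    return found


def is_invertible(matrix):
    """A square matrix is invertible exactly when its n columns are independent."""
    n = len(matrix)
    return all(len(row) == n for row in matrix) and rank(matrix) == n


assert is_invertible([[1, 2], [3, 4]])
assert not is_invertible([[1, 2], [2, 4]])  # second column is twice the first
```

Using `Fraction` keeps the arithmetic exact, so a column that is a genuine linear combination of the others is detected without any tolerance parameter.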
Isomorphisms of Vector Spaces
Okay, after that long digression into power series and such, I’m coming back to linear algebra. What we want to talk about now is how two vector spaces can be isomorphic. Of course, this means that they are connected by an invertible linear transformation (which preserves the addition and scalar multiplication operations):
$$T:V\to W$$
First off, to be invertible the kernel of $T$ must be trivial. Otherwise we’d have two vectors in $V$ mapping to the same vector in $W$, and we wouldn’t be able to tell which one it came from in order to invert the map. Similarly, the cokernel of $T$ must be trivial, or we’d have missed some vectors in $W$, and we couldn’t tell where in $V$ to send them under the inverse map. This tells us that the index of an isomorphism must be zero, and thus that the vector spaces must have the same dimension. This seems sort of obvious, that isomorphic vector spaces would have to have the same dimension, but you can’t be too careful.
Next we note that an isomorphism sends bases to bases. That is, if $\{e_i\}$ is a basis for $V$, then the collection of $\{T(e_i)\}$ will form a basis for $W$.
Since $T$ is surjective, given any $w\in W$ there is some $v\in V$ with $T(v)=w$. But $v=v^ie_i$ uniquely (remember the summation convention) because the $e_i$ form a basis. Then $w=T(v)=T(v^ie_i)=v^iT(e_i)$, and so we have an expression of $w$ as a linear combination of the $T(e_i)$. The collection $\{T(e_i)\}$ thus spans $W$.
On the other hand, if we have a linear combination $v^iT(e_i)=0$, then we can write $T(v^ie_i)=0$. Since $T$ is injective we find $v^ie_i=0$, and thus each $v^i=0$, since the $e_i$ form a basis. Thus the spanning set $\{T(e_i)\}$ is linearly independent, and thus forms a basis.
The converse, it turns out, is also true. If $\{e_i\}$ is a basis of $V$, and $\{f_i\}$ is a basis of $W$ indexed by the same set, then the map $T:V\to W$ defined by $T(e_i)=f_i$ (and extending by linearity) is an isomorphism. Indeed, we can define an inverse straight away: $T^{-1}(f_i)=e_i$, and extend by linearity.
The upshot of these facts is that two vector spaces are isomorphic exactly when they have the same dimension. That is, just the same way that the cardinality of a set determines its isomorphism class in the category of sets, the dimension of a vector space determines its isomorphism class in the category of vector spaces.
Now let’s step back and consider what happens in any category and throw away all the morphisms that aren’t invertible. We’re left with a groupoid, and like any groupoid it falls apart into a bunch of “connected” pieces: the isomorphism classes. In this case, the isomorphism classes are given by the dimensions of the vector spaces.
Each of these connected pieces, then, is equivalent (as a groupoid) to the automorphism group of any of its objects, all of which such groups are isomorphic. In this case, we have a name for these automorphism groups.
Given any vector space $V$, all the interesting information about isomorphisms to or from this space can be summed up in the “general linear group” of $V$, which consists of all invertible linear maps from $V$ to itself. We write this automorphism group as $\mathrm{GL}(V)$.
We have a special name in the case when $V$ is the vector space $\mathbb{F}^n$ of $n$-tuples of elements of the base field $\mathbb{F}$. In this case we write the general linear group as $\mathrm{GL}(n,\mathbb{F})$ or as $\mathrm{GL}_n(\mathbb{F})$. Since every finite-dimensional vector space over $\mathbb{F}$ is isomorphic to one of these (specifically, the one with $n=\dim(V)$), we have $\mathrm{GL}(V)\cong\mathrm{GL}_{\dim(V)}(\mathbb{F})$. These particular general linear groups are thus extremely important for understanding isomorphisms of finite-dimensional vector spaces. We’ll investigate these groups as we move forward.
Pi: A Wrap-Up
A couple months ago, in a post on World Series odds (how are those working out, Michael?), a commenter by the moniker of Kurt Osis asked a random question:
Ok now to my random question for the day. Is all human knowledge based on Pi? This just occurred to me the other day, if knowledge is based on measurement and the only objective form of measurement we have is the ratio between a circle’s circumference and diameter then is all knowledge really based on Pi?
Naturally, this sounds like just the sort of woo that I’ve decried in The Tao of Physics and The Dancing Wu Li Masters. It also smacks of “mathing up” the fuzzy ideas to give them the veneer of rigor and respectability. I’ve seen politicians do it, we’ve all seen poststructuralists do it, and there are a lot of others that do too. And one of the very few undeniably mathy words that almost everyone knows is that blasted Greek constant $\pi$, so it gets called into service a lot.
Clearly, I had to nip this in the bud.
I pointed out that this idea of wrapping things up with “measurement” really gave away that this was nonsense. I cited that curvature of spacetime throws off exactly such measurements (a point I recently brought up with Todd, but he hadn’t thrown “measurement” out there himself). At that point Kurt backtracked and said that $\pi$ was an idealization, and the measured discrepancies were knowledge. Of course I had forgotten about how slippery arguments can be with someone who only cares for the veneer of rigor.
Still I pressed onwards. I pointed out that $\pi$ has nothing to do with any real, physical measurement. The Cabibbo angle, or the fine-structure constant: those are the real-world constants that are actually interesting because there is (as yet) no reason why they have to have the values that they do.
Then the discussion moved from an unrelated post on Michael’s weblog to an unrelated post on mine.
Again, Kurt advances the “epistemic $\pi$” hypothesis as if it’s remotely coherent. Now he asserts that he was “trying to think of something independent of the number system itself”, and finally I had something. Here I made my stand:
$\pi$ is far from independent of the number system. It is what it is exactly because of the way the real number system is structured.
Then and there I decided to stop what I was working on about linear algebra. Instead, I set off on power series and how power series expansions can be used to express analytic functions. Then I showed how power series can be used to solve certain differential equations, which led us to defining the functions sine and cosine. Then I showed that the sine function must have a least positive zero, which we define to be $\pi$.
The assumptions that have led to the definition of $\pi$ are just those of the real number system: we are working within the unique (up to isomorphism) largest archimedean ordered field. There is no measurement, no knowledge, no science, and no epistemology to it. Kurt’s real question (the one he hops onto other mathematical weblogs’ unrelated comment threads to ask) is really about philosophy. He’s asking for a final answer to the entire field of epistemic research. It’s not forthcoming; not on a math weblog, not on a philosophy weblog, not anywhere. It’s been around in its current form for hundreds of years, and I don’t see a resolution on the horizon. But it certainly doesn’t lie in an accidental quirk of the real number system that society has for some reason decided to exalt far beyond its true value.
Properties of the Sine and Cosine
Blaise got most of the classic properties of the sine and cosine in the comments to the last post, so I’ll crib generously from his work. As a note: I know many people write powers of the sine and cosine functions as (for example) $\sin^2(x)$ instead of $\sin(x)^2$. As I tell my calculus students every year I refuse to do that myself because that should mean $\sin(\sin(x))$, and I guarantee people will get confused between $\sin^{-1}(x)=\arcsin(x)$ and $\sin^{-1}(x)=\frac{1}{\sin(x)}$.
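First, let’s consider the function $\sin(x)^2+\cos(x)^2$. We can take its derivative using the rules for derivatives of trigonometric functions from last time:
$$\frac{d}{dx}\left(\sin(x)^2+\cos(x)^2\right)=2\sin(x)\cos(x)-2\cos(x)\sin(x)=0$$
So this function is a constant. We easily check that $\sin(0)^2+\cos(0)^2=1$, and so $\sin(x)^2+\cos(x)^2=1$ for all $x$.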
What does this mean? It tells us that if $\sin(x)$ and $\cos(x)$ are the lengths of the legs of a right triangle, the hypotenuse will have length $1$. Alternately, the point with coordinates $(\cos(x),\sin(x))$ in the standard coordinate plane will lie on the unit circle. We haven’t talked yet about using integration to calculate the length of a path in the plane, but when we do we’ll see that the length of the arc on the circle from $(1,0)$ to $(\cos(x),\sin(x))$ is exactly $x$.
This gives us another definition for the sine and cosine functions, one closer to the usual one people see in a trigonometry class. Given an input value $x$, walk that far around the unit circle, starting from the point $(1,0)$. The coordinates of the point you end up at are the cosine and sine of $x$, in that order. And this gives us our “original” definitions: given a right triangle, it is similar to a right triangle whose hypotenuse has length $1$, and the sine and cosine are the lengths of the two legs.
Now, since $\sin(x)^2$ and $\cos(x)^2$ are both nonnegative, they must each be bounded above by $1$. Thus $-1\leq\sin(x)\leq1$ and $-1\leq\cos(x)\leq1$. More specifically, any time that $\sin(x)=0$ we must have $\cos(x)=\pm1$.
We know that $\sin(0)=0$ and $\cos(0)=1$, so if we ever have another point $x_0$ where $\sin(x_0)=0$ and $\cos(x_0)=1$ we have a period. This is because the differential equation will determine the future behavior of $\sin(x_0+x)$ the same way it determined the behavior of $\sin(x)$. In fact, if $\sin(x_0)=0$ and $\cos(x_0)=-1$, then the future behavior of $\sin(x_0+x)$ will be exactly the negative of the behavior of $\sin(x)$, and so eventually $\sin(2x_0)=0$ and $\cos(2x_0)=1$ again.
Admittedly, I’m sort of waving my hands here without an existence/uniqueness proof for solving differential equations. But the geometric intuition should suffice for the idea that since the function’s value and first derivative at $0$ are enough to determine the function, then the specific point we know them at shouldn’t matter.
So, does the sine function have a positive zero? That is, is there some $x_0>0$ so that $\sin(x_0)=0$? If so, the lowest such one would have to have $\cos(x_0)=-1$ (because positive numbers near $0$ have positive sines). The next one would then be $2x_0$ with $\cos(2x_0)=1$, and the whole thing repeats with period $2x_0$.
The function $\sin(x)$ starts out increasing, and so $\cos(x)$ decreases (since $\cos'(x)=-\sin(x)$). If $\sin(x)$ has a maximum, then $\cos(x)$ (its derivative) must cross zero. Then $\sin(x)$ is decreasing, and it cannot increase again unless $\cos(x)$ crosses zero again. But if $\cos(x)$ crosses zero again it must have passed through a local extremum (Rolle) and so $\sin(x)$ cannot increase again before it crosses zero itself.
So if we are to avoid having a positive zero, it must either increase to some asymptote below $1$, or it must increase to a maximum and then decrease to some asymptote below $1$. But for a function to have an asymptote it must approach a horizontal line, and its derivative must approach $0$. That is, we can only have $\sin(x)$ approaching an asymptote at $\pm1$, while $\cos(x)$ approaches an asymptote at $0$.
But if $\cos(x)$ approaches an asymptote, its derivative must also asymptotically approach $0$. But this derivative is $-\sin(x)$, which we are assuming approaches $\pm1$! And so none of these asymptotes are possible!
So the sine function must have a positive zero: $x_0$. And thus the sine and cosine (and all other solutions to this differential equation) will have period $2x_0$.
Finally, what the heck is this value $x_0$? In point of fact, we have no way of telling. But it might come in handy, so we’ll define this number and give it a new name: $\pi$. Whenever we say $\pi$ we’ll mean “the first positive zero of the sine function”.
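Although there is no closed form to be had, this definition is concrete enough to compute with. Here is a rough sketch in Python that sums the power series for the sine and then bisects for a sign change. Two assumptions are mine: the starting bracket $[3,4]$ was picked by looking at values of the series, and the claim that no zero occurs earlier is suggested by the positivity argument above but is not checked by the code.

```python
import math


def sin_series(x, terms=40):
    """Partial sum of the power series x - x^3/3! + x^5/5! - ..."""
    total, term = 0.0, x
    for k in range(terms):
        total += term
        # Each term comes from the previous one by multiplying
        # by -x^2 / ((2k+2)(2k+3)).
        term *= -x * x / ((2 * k + 2) * (2 * k + 3))
    return total


def first_positive_zero(lo=3.0, hi=4.0, iterations=60):
    """Bisect for a zero of the series on a bracket where it changes sign.

    Assumption: the default bracket [3, 4] contains the first positive zero.
    """
    assert sin_series(lo) > 0 > sin_series(hi)
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if sin_series(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


# Compare with the library constant, which plays no role in the computation.
print(first_positive_zero(), math.pi)
```

Nothing here measures a circle: the only inputs are the coefficients of the series, which come straight from the differential equation.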
Here I want to point out that I’ve fulfilled my boast of a few months ago on some other weblog. In my tireless rant against the $\pi$-fetishism that infests the geek community, I told someone that $\pi$ can be derived, ultimately, from solely the properties of the real number system. Studying this field (itself uniquely specified on algebraic and topological grounds) leads us to both differential calculus and to power series, and from there to series solutions to differential equations. One of the most natural differential equations in the world thus gives rise to the trigonometric functions, and the definition of $\pi$ follows from their properties. There is no possible way it could be anything other than what it is when you see it from this side, while the geometric definition hinges on some very deep assumptions on the geometry of spacetime.
