Finally, all the pieces are in place to state the theorem I’ve been driving at for a while now: the category $\mathbf{Mat}(\mathbb{F})$ of matrices over $\mathbb{F}$ is equivalent to the category of finite-dimensional vector spaces over $\mathbb{F}$.
To show this, we must show that the functor is full, faithful, and essentially surjective. The first two conditions say that given natural numbers $m$ and $n$, the linear transformations $\mathbb{F}^m\rightarrow\mathbb{F}^n$ and the $n\times m$ matrices over $\mathbb{F}$ are in bijection.
But this is just what we’ve been showing! The vector spaces $\mathbb{F}^n$ of $n$-tuples come with their canonical bases, and given these bases every linear transformation gets a uniquely-defined matrix. Conversely, every matrix defines a unique linear transformation when we’ve got the bases to work with. So fullness and faithfulness are straightforward.
Now for essential surjectivity. This says that given any finite-dimensional vector space $V$ we have some $n$ so that $V\cong\mathbb{F}^n$. But we know that every vector space has a basis, and for $V$ it must be finite; that’s what “finite-dimensional” means! Let’s say that we’ve got a basis $\{f_1,\dots,f_n\}$ consisting of $n$ vectors.
Now we just line up the canonical basis $\{e_1,\dots,e_n\}$ of $\mathbb{F}^n$ and define linear transformations $S:\mathbb{F}^n\rightarrow V$ and $T:V\rightarrow\mathbb{F}^n$ by $S(e_i)=f_i$ and $T(f_i)=e_i$. Remember that we can define a linear transformation by specifying its values on a basis (which can all be picked independently) and then extending by linearity. Thus we do have two well-defined linear transformations here. But just as clearly we see that for any basis vector $f_i$ we have

$S(T(f_i))=S(e_i)=f_i$

and a similar equation holds for every $n$-tuple in $\mathbb{F}^n$. Thus $S$ and $T$ are inverses of each other, and are the isomorphism we need.
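Since everything here is concrete, we can sketch the isomorphism numerically. In this hypothetical example the “abstract” space $V$ is modeled by writing its basis vectors $f_i$ as the columns of an invertible matrix $B$; then $S$ is multiplication by $B$ and $T$ is multiplication by its inverse:

```python
import numpy as np

# Hypothetical basis f_1, f_2, f_3 for a 3-dimensional space V,
# written as the columns of an invertible matrix B.
B = np.array([[1.0, 1.0, 0.0],
              [0.0, 1.0, 1.0],
              [0.0, 0.0, 1.0]])

S = B                 # S(e_i) = f_i, extended by linearity
T = np.linalg.inv(B)  # T(f_i) = e_i

# T∘S and S∘T are both the identity, so S and T are inverses.
assert np.allclose(T @ S, np.eye(3))
assert np.allclose(S @ T, np.eye(3))
```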
This tells us that the language of linear transformations between finite-dimensional vector spaces is entirely equivalent to that of matrices. But we gain some conceptual advantages by thinking in terms of finite-dimensional vector spaces. One I can point to right here is how we can tell the difference between a vector space $V$ and its dual $V^*$. Sure, they’ve got the same dimension, and so there’s some isomorphism between them. Still, when we’re dealing with both at the same time they behave differently, and it’s valuable to keep our eye on that difference.
On the other hand, there are benefits to matrices. For one thing, we can actually write them down and calculate with them. A lot of people are — surprise! — interested in using mathematics to solve problems. And the problems that linear algebra is most directly applicable to are naturally stated in terms of matrices.
What the theorem tells us is that none of this matters. We can translate problems from the category of matrices to the category of vector spaces and back, and nothing is lost in the process.
Now here’s the really important thing: There’s a functor from $\mathbf{Mat}(\mathbb{F})$ to the category of finite-dimensional vector spaces over $\mathbb{F}$ that assigns the vector space $\mathbb{F}^n$ of $n$-tuples of elements of $\mathbb{F}$ to each object $n$ of $\mathbf{Mat}(\mathbb{F})$. Such a vector space of $n$-tuples comes with the basis $\{e_i\}$, where the vector $e_i$ has a $1$ in the $i$th place and a $0$ elsewhere. In matrix notation:

$e_1=\begin{pmatrix}1\\0\\\vdots\\0\end{pmatrix}\qquad e_2=\begin{pmatrix}0\\1\\\vdots\\0\end{pmatrix}$

and so on. We can write $e_i^j=\delta_i^j$ (remember the summation convention), so the vector components of the basis vectors are given by the Kronecker delta. We will think of other vectors as column vectors.
Given an $n\times m$ matrix $\left(t_i^j\right)$ we clearly see a linear transformation from $\mathbb{F}^m$ to $\mathbb{F}^n$. Given a column vector with components $v^i$ (where the index satisfies $1\leq i\leq m$), we construct the column vector $w^j=t_i^jv^i$ (here $1\leq j\leq n$). But we’ve already established that matrix multiplication represents composition of linear transformations. Further, it’s straightforward to see that the linear transformation corresponding to the matrix $\left(\delta_i^j\right)$ is the identity on $\mathbb{F}^n$ (depending on the range of the indices on the Kronecker delta). This establishes that we really have defined a functor.
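A quick numerical sketch of these two facts, using random matrices as stand-in morphisms (the sizes are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
m, n, p = 3, 4, 2
t = rng.standard_normal((n, m))   # a morphism m -> n
s = rng.standard_normal((p, n))   # a morphism n -> p
v = rng.standard_normal(m)        # a column vector in F^m

# Applying s after t agrees with applying the matrix product s @ t:
# matrix multiplication is composition of linear transformations.
assert np.allclose(s @ (t @ v), (s @ t) @ v)

# The Kronecker delta (the identity matrix) is the identity morphism.
assert np.allclose(np.eye(n) @ t, t)
assert np.allclose(t @ np.eye(m), t)
```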
But wait, there’s more! The functor is linear over $\mathbb{F}$, so it’s a functor enriched over $\mathbf{Vec}(\mathbb{F})$. The Kronecker product of matrices corresponds to the monoidal (tensor) product of linear transformations, so the functor is monoidal, too. Following the definitions, we can even find that our functor preserves duals.
So we’ve got a functor from our category of matrices to the category of finite-dimensional vector spaces, and it preserves all of the relevant structure.
We define the monoidal product on objects by multiplication — $m\otimes n=mn$ — and on morphisms by using the Kronecker product. That is, if we have an $m_1\times n_1$ matrix $\left(s_{i_1}^{j_1}\right)$ and an $m_2\times n_2$ matrix $\left(t_{i_2}^{j_2}\right)$, then we get the Kronecker product

$\left(s_{i_1}^{j_1}t_{i_2}^{j_2}\right)$

Here we have to be careful about what we’re saying. In accordance with our convention, the pair of indices $(i_1,i_2)$ (with $1\leq i_1\leq m_1$ and $1\leq i_2\leq m_2$) should be considered as the single index $(i_1-1)m_2+i_2$. It’s clear that this quantity then runs between $1$ and $m_1m_2$. A similar interpretation goes for the index pairs $(j_1,j_2)$.
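NumPy’s `np.kron` uses exactly this pairing of indices (counted from zero rather than one), so we can check the convention on a random example:

```python
import numpy as np

rng = np.random.default_rng(1)
s = rng.standard_normal((2, 3))   # an m1 x n1 matrix
t = rng.standard_normal((4, 5))   # an m2 x n2 matrix
k = np.kron(s, t)

# The monoidal product multiplies dimensions: m1*m2 x n1*n2.
assert k.shape == (2 * 4, 3 * 5)

# Entry at the paired index ((i1,i2),(j1,j2)) is s[i1,j1] * t[i2,j2],
# with (i1,i2) packed as the single (zero-based) row index i1*m2 + i2.
i1, i2, j1, j2 = 1, 3, 2, 4
assert np.isclose(k[i1 * 4 + i2, j1 * 5 + j2], s[i1, j1] * t[i2, j2])
```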
Of course, we need some relations for this to be a monoidal structure. Strict associativity is straightforward:

$\left(r_{i_1}^{j_1}s_{i_2}^{j_2}\right)t_{i_3}^{j_3}=r_{i_1}^{j_1}\left(s_{i_2}^{j_2}t_{i_3}^{j_3}\right)$
For our identity object, we naturally use $1$, with its identity morphism $\left(1\right)$. Note that the first of these is the object — the natural number $1$ — while the second is the $1\times1$ matrix whose single entry is the field element $1$. Then we can calculate the Kronecker product to find

$1\cdot s_i^j=s_i^j=s_i^j\cdot1$

and so the unit laws hold strictly as well.
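Both relations can be spot-checked with `np.kron` (random matrices again, a sketch rather than a proof):

```python
import numpy as np

rng = np.random.default_rng(2)
r = rng.standard_normal((2, 2))
s = rng.standard_normal((3, 2))
t = rng.standard_normal((2, 4))

# Strict associativity of the Kronecker product.
assert np.allclose(np.kron(np.kron(r, s), t), np.kron(r, np.kron(s, t)))

# The 1x1 matrix (1) is a strict unit on either side.
one = np.array([[1.0]])
assert np.allclose(np.kron(one, s), s)
assert np.allclose(np.kron(s, one), s)
```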
The category of matrices also has duals. In fact, each object is self-dual! That is, we set $n^*=n$. We then need our arrows $\eta_n:1\rightarrow n\otimes n$ and $\epsilon_n:n\otimes n\rightarrow1$.
The morphism $\eta_n$ will be an $n^2\times1$ matrix. Specifically, we’ll use $\left(\delta^{i_1i_2}\right)$, with $i_1$ and $i_2$ both running between $1$ and $n$. Again, we interpret an index pair $(i_1,i_2)$ as described above. The symbol $\delta^{i_1i_2}$ is another form of the Kronecker delta, which takes the value $1$ when its indices agree and $0$ when they don’t.

Similarly, $\epsilon_n$ will be a $1\times n^2$ matrix: $\left(\delta_{j_1j_2}\right)$, using yet another form of the Kronecker delta.
Now we have compatibility relations. Since the monoidal structure is strict, these are simpler than usual:

$\left(\epsilon_n\otimes1_n\right)\circ\left(1_n\otimes\eta_n\right)=1_n=\left(1_n\otimes\epsilon_n\right)\circ\left(\eta_n\otimes1_n\right)$

But now all the basic matrices in sight are various Kronecker deltas! The first equation reads

$\delta_{ij}\delta^{jk}=\delta_i^k$

(with the sum over $j$ implicit), which is true. You should be able to verify the second one similarly.
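Here is a numerical sketch of both relations. With the zero-based, row-major index pairing that `np.kron` uses, $\eta_n$ is the identity matrix flattened into a column and $\epsilon_n$ is the same thing as a row:

```python
import numpy as np

n = 3
I = np.eye(n)
# eta: 1 -> n⊗n is the n²x1 matrix with entries δ^{ij}; under the
# row-major pairing convention this is the flattened identity matrix.
eta = I.reshape(n * n, 1)
# epsilon: n⊗n -> 1 is the 1xn² matrix with entries δ_{ij}.
eps = I.reshape(1, n * n)

# The two zig-zag (compatibility) relations, written out with
# Kronecker products and ordinary matrix multiplication.
assert np.allclose(np.kron(eps, I) @ np.kron(I, eta), I)
assert np.allclose(np.kron(I, eps) @ np.kron(eta, I), I)
```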
The upshot is that we’ve got the structure of a monoidal category with duals on $\mathbf{Mat}(\mathbb{F})$.
What we’ve been building up to is actually the definition of a category. Given a field $\mathbb{F}$ we define the category $\mathbf{Mat}(\mathbb{F})$ of matrices over $\mathbb{F}$.
Most of our other categories have been named after their objects — groups are the objects of $\mathbf{Grp}$, commutative monoids are the objects of $\mathbf{CMon}$, and so on — but not here. In this case, matrices will be the morphisms, and the category of matrices illustrates in a clearer way than any we’ve seen yet how similar categories are to other algebraic structures that are usually seen as simpler and more concrete.
Down to business: the objects of $\mathbf{Mat}(\mathbb{F})$ will be the natural numbers $0,1,2,\dots$, and the morphisms in $\hom(m,n)$ are the $n\times m$ matrices. That is, a morphism is a collection of field elements $t_i^j$ where $i$ runs from $1$ to $m$ and $j$ runs from $1$ to $n$.
The category of matrices is actually enriched over the category of vector spaces over $\mathbb{F}$. This means that each set of morphisms $\hom(m,n)$ is actually a vector space over $\mathbb{F}$. Specifically, we add matrices of the same dimensions and multiply matrices by scalars component-by-component.
We have yet to speak very clearly about identities. The axioms of an enriched category state that for each object (natural number) $n$ there must be a linear function $I_n:\mathbb{F}\rightarrow\hom(n,n)$. Because of linearity, this function is completely determined by its value at $1\in\mathbb{F}$: the matrix $I_n(1)$. We must pick this matrix so that it acts as an identity for matrix multiplication, and we choose the Kronecker delta for this purpose: $I_n(1)=\left(\delta_i^j\right)$. That is, we use an $n\times n$ matrix whose entries are $1$ if the indices are equal and $0$ otherwise. It’s straightforward to check that this is indeed an identity.
Other properties I’ve skipped over, but which aren’t hard to check, are that matrix multiplication is bilinear and associative. Both of these are straightforward once written out in terms of the summation convention; sometimes deceptively so. For example, the associativity condition reads $\left(r_i^js_j^k\right)t_k^l=r_i^j\left(s_j^kt_k^l\right)$. But remember that there are hidden summation signs in here, so it should really read:

$\sum\limits_k\left(\sum\limits_jr_i^js_j^k\right)t_k^l=\sum\limits_jr_i^j\left(\sum\limits_ks_j^kt_k^l\right)$
so there’s an implicit change in the order of summation here. Since we’re just doing finite sums, this is no problem, but it’s still worth keeping an eye on.
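The exchange of summation order is easy to see numerically; a sketch with the sums written out as explicit loops (random matrices, arbitrary sizes):

```python
import numpy as np

rng = np.random.default_rng(3)
r = rng.standard_normal((2, 3))
s = rng.standard_normal((3, 4))
t = rng.standard_normal((4, 2))

lhs = np.zeros((2, 2))
rhs = np.zeros((2, 2))
for i in range(2):
    for l in range(2):
        # sum over k of ( sum over j of r[i,j] s[j,k] ) t[k,l]
        lhs[i, l] = sum(sum(r[i, j] * s[j, k] for j in range(3)) * t[k, l]
                        for k in range(4))
        # sum over j of r[i,j] ( sum over k of s[j,k] t[k,l] )
        rhs[i, l] = sum(r[i, j] * sum(s[j, k] * t[k, l] for k in range(4))
                        for j in range(3))

# The two orders of summation agree, and both equal the matrix product.
assert np.allclose(lhs, rhs)
assert np.allclose(lhs, r @ s @ t)
```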
Now I need to find a follow-on to this paper and start applying to those financial math jobs.
Long-time readers may remember that back in September I went to a conference at the University of Texas at Tyler. Well, it turns out that the AMS wants to publish a proceedings of the conference. So I’m trying to throw together a paper on the stuff I was talking about.
As I’m doing so, I’m recognizing that one part of my original talk — the whole business about anafunctors — isn’t quite ready for prime-time, and the whole thing hangs together better without it. And this brings up the design philosophy I talked about recently. In this case, writing the smaller paper first is being sort of forced on me by a short deadline.
Still, it’s crunch time, and I’m trying to crank this paper out while also teaching, applying for jobs, and dealing with car troubles you wouldn’t even believe. I don’t really feel like working up the next post in the calculus series today, and so I thought I’d toss up an alpha version of the paper. I’ll keep tweaking it, and replacing the version here as I finish more chunks, until I get to a beta version, which I’ll update here and post to the arXiv.
One particular note on the incompleteness: I haven’t even started writing the introductory section or the abstract yet. I’m finding that I tend to do better if I just dive into the mathematics and then come back later to say what I said.
And finally: it looks like I’ll be talking about this stuff at the University of Pennsylvania on March 19. Mark your calendars!
[UPDATE] 02/26: Still sans abstract and intro, but with all mathematical content there, I present a new version. Bibliography suggestions are particularly appreciated (thanks Scott).
[UPDATE] 03/04: Now with the abstract and introduction, a beta version. Bibliography suggestions would still be appreciated.
First we’ll need to pick out our base category, in which we’ll find our hom-objects. Consider the set of nonnegative real numbers with their real-number order, and add in a point called $\infty$ that’s above all the other points. This is a totally ordered set, and orders are categories. Let’s take the opposite of this category. That is, the objects of our category are the points in the “interval” $[0,\infty]$, and we have an arrow $x\rightarrow y$ exactly when $x\geq y$.
This turns out to be a monoidal category, and the monoidal structure is just addition. Clearly this gives a monoid on the set of objects, but we need to check it on morphisms to see it’s functorial. But if $x_1\geq y_1$ and $x_2\geq y_2$ then $x_1+x_2\geq y_1+y_2$, and so we can see addition as a functor.
So we’ve got a monoidal category, and we can now use it to form enriched categories. Let’s keep our lives simple by considering a small category enriched over $[0,\infty]$. Here’s how the definition looks.
We have a set of objects, which we’ll call “points”, forming a set $X$. Between any two points $x$ and $y$ we need a hom-object $\hom(x,y)\in[0,\infty]$. That is, we have a function $d:X\times X\rightarrow[0,\infty]$.

For a triple $(x,y,z)$ of objects we need an arrow $\hom(x,y)\otimes\hom(y,z)\rightarrow\hom(x,z)$. In more quotidian terms, this means that $d(x,y)+d(y,z)\geq d(x,z)$.

Also, for each point $x$ there is an arrow from the identity object $0$ to the hom-object $\hom(x,x)$. That is, $0\geq d(x,x)$, so $d(x,x)=0$.
These conditions are the first, fourth, and half of the second conditions in the definition of a metric space! In fact, there’s a weaker notion of a “pseudometric” space, wherein the second condition is simply that $d(x,x)=0$ — distinct points are allowed to sit at distance zero — and so we’re almost exactly giving the definition of a pseudometric space.

The only thing we’re missing is the symmetry requirement that $d(x,y)=d(y,x)$. The case can be made (and has been, by Lawvere) that this requirement is actually extraneous, and that it’s in some sense more natural to work with “asymmetric” (pseudo)metric spaces that are exactly those given by this enriched categorical framework.
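A toy example of such an asymmetric distance (a hypothetical illustration, not from the post): the cost of moving between heights when going up costs the height difference and going down is free. The enriched-category axioms hold, but symmetry fails:

```python
def d(x, y):
    # Asymmetric "distance": pay to go up, descend for free.
    return max(y - x, 0.0)

points = [0.0, 1.0, 2.0, 5.0]

for x in points:
    assert d(x, x) == 0.0                        # 0 >= d(x,x)
    for y in points:
        for z in points:
            assert d(x, y) + d(y, z) >= d(x, z)  # triangle inequality

# Symmetry fails, so this is a Lawvere metric but not a classical
# (pseudo)metric.
assert d(0.0, 5.0) != d(5.0, 0.0)
```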
We’ve defined topological spaces and continuous maps between them. Together these give us a category $\mathbf{Top}$. We’d like to understand a few of our favorite categorical constructions as they work in this context.
First off, the empty set $\varnothing$ has a unique topology, since its only subset at all is $\varnothing$ itself. Given any other space $X$ (we’ll omit explicit mention of its topology) there is a unique function $\varnothing\rightarrow X$, and it is continuous since the preimage of any subset of $X$ is empty. Thus $\varnothing$ is the initial object in $\mathbf{Top}$.

On the other side, any singleton set $\{*\}$ also has a unique topology, since the only subsets are the whole set and the empty set, which must both be open. Given any other space $X$ there is a unique function $X\rightarrow\{*\}$, and it is continuous because the preimage of the empty set is empty and the preimage of the single point is the whole of $X$, both of which are open in $X$. Thus $\{*\}$ is a terminal object in $\mathbf{Top}$.
Now for products. Given a family $\{X_\alpha\}_{\alpha\in A}$ of topological spaces indexed by $A$, we can form the product set $\prod\limits_{\alpha\in A}X_\alpha$, which comes with projection functions $\pi_\alpha:\prod X_\alpha\rightarrow X_\alpha$ and satisfies a universal property in $\mathbf{Set}$. We want to use this same set and these same functions to describe the product in $\mathbf{Top}$, so we must choose our topology on the product set so that these projections will be continuous. Given an open set $U\subseteq X_\alpha$, then, its preimage $\pi_\alpha^{-1}(U)$ must be open in $\prod X_\alpha$. Let’s take these preimages to be a subbase and consider the topology they generate.

If $Y$ is any other space with a family of continuous maps $f_\alpha:Y\rightarrow X_\alpha$, then the universal property in $\mathbf{Set}$ gives us a unique function $f:Y\rightarrow\prod X_\alpha$. But will it be a continuous map? To check this, remember that we only need to verify it on a subbase for the topology on the product space, and we have one ready to work with. Each set in the subbase is the preimage $\pi_\alpha^{-1}(U)$ of an open set $U$ in some $X_\alpha$, and then its preimage under $f$ is $f^{-1}\left(\pi_\alpha^{-1}(U)\right)=f_\alpha^{-1}(U)$, which is open by the assumption that each $f_\alpha$ is continuous. And so the product set equipped with the product topology described above is the categorical product of the topological spaces $X_\alpha$.
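On finite spaces the whole construction can be carried out by brute force. A sketch (hypothetical helper `topology_from_subbase`, feasible only because everything is tiny): close the subbase under finite intersections to get a base, then under unions to get the topology, and check that both projections are continuous.

```python
from itertools import combinations, product

def topology_from_subbase(subbase, space):
    """Generate a topology on a finite space from a subbase."""
    subbase = list(subbase)
    base = {space}  # the empty intersection is the whole space
    for r in range(1, len(subbase) + 1):
        for sets in combinations(subbase, r):
            inter = space
            for s in sets:
                inter = inter & s
            base.add(inter)
    base = list(base)
    top = {frozenset()}  # the empty union
    for r in range(1, len(base) + 1):
        for sets in combinations(base, r):
            top.add(frozenset().union(*sets))
    return top

# Two copies of the Sierpinski space as a toy example.
X1, T1 = frozenset('ab'), [frozenset(), frozenset('a'), frozenset('ab')]
X2, T2 = frozenset([0, 1]), [frozenset(), frozenset([0]), frozenset([0, 1])]

prod_space = frozenset(product(X1, X2))
# Subbase: preimages of open sets under the two projections.
subbase = {frozenset(p for p in prod_space if p[0] in U) for U in T1} \
        | {frozenset(p for p in prod_space if p[1] in V) for V in T2}

top = topology_from_subbase(subbase, prod_space)

# Both projections are continuous: every subbase preimage is open.
assert all(s in top for s in subbase)
assert prod_space in top and frozenset() in top
```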
What about coproducts? Let’s again start with the coproduct in $\mathbf{Set}$, which is the disjoint union $\biguplus\limits_{\alpha\in A}X_\alpha$, and which comes with canonical injections $\iota_\alpha:X_\alpha\rightarrow\biguplus X_\alpha$. This time let’s jump right into the universal property, which says that given another space $Y$ and functions $f_\alpha:X_\alpha\rightarrow Y$, we have a unique function $f:\biguplus X_\alpha\rightarrow Y$. Now we need any function we get like this to be continuous. The preimage of an open set $U\subseteq Y$ will be the union of the preimages $f_\alpha^{-1}(U)$, each sitting inside its own piece of the disjoint union. By choosing $Y$, the $f_\alpha$, and $U$ judiciously, we can get the preimage to be any open set we want in each $X_\alpha$, so the open sets in the disjoint union should consist precisely of those subsets $V$ whose preimage $\iota_\alpha^{-1}(V)$ is open for each $\alpha\in A$. It’s easy to verify that this collection is actually a topology, which then gives us the categorical coproduct in $\mathbf{Top}$.
If we start with a topological space $X$ and take any subset $S\subseteq X$ then we can ask for the coarsest topology on $S$ that makes the inclusion map $\iota:S\rightarrow X$ continuous, sort of like how we defined the product topology above. The open sets in $S$ will be any set of the form $U\cap S$ for an open subset $U\subseteq X$. Then given another space $Y$, a function $f:Y\rightarrow S$ will be continuous if and only if $\iota\circ f:Y\rightarrow X$ is continuous. Indeed, the preimage $(\iota\circ f)^{-1}(U)=f^{-1}(U\cap S)$ clearly shows this equivalence. We call this the subspace topology on $S$.
In particular, if we have two continuous maps $f,g:X\rightarrow Y$, then we can consider the subspace $E\subseteq X$ consisting of those points $x\in X$ satisfying $f(x)=g(x)$. Given any other space $Z$ and a continuous map $h:Z\rightarrow X$ such that $f\circ h=g\circ h$, clearly $h$ sends all of $Z$ into the set $E$; the function factors as $h=\iota\circ h'$, where $\iota:E\rightarrow X$ is the inclusion map. Then $h'$ must be continuous because $h$ is, and so the subspace $E$ is the equalizer of the maps $f$ and $g$.
Dually, given a topological space $X$ and an equivalence relation $\sim$ on the underlying set of $X$ we can define the quotient space $X/\sim$ to be the set of equivalence classes of points of $X$. This comes with a canonical function $\pi:X\rightarrow X/\sim$, which we want to be continuous. Further, we know that if $f:X\rightarrow Y$ is any function for which $x_1\sim x_2$ implies $f(x_1)=f(x_2)$, then $f$ factors as $f=\bar{f}\circ\pi$ for some function $\bar{f}:X/\sim\rightarrow Y$. We want to define the topology on the quotient set so that $\bar{f}$ is continuous if and only if $f$ is. Given an open set $U\subseteq Y$, its preimage $\bar{f}^{-1}(U)$ is the set of equivalence classes that get sent into $U$, while its preimage $f^{-1}(U)$ is the set of all points that get sent into $U$. And so we say a subset of the quotient space is open if and only if its preimage under $\pi$ — the union of the equivalence classes it contains — is open in $X$.

In particular, if we have two maps $f,g:Y\rightarrow X$ we get an equivalence relation on $X$ by defining $x_1\sim x_2$ if there is a $y\in Y$ so that $x_1=f(y)$ and $x_2=g(y)$, and then closing up to make the relation reflexive, symmetric, and transitive. If we walk through the above description of the quotient space we find that this construction gives us the coequalizer of $f$ and $g$.
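Here is a finite sketch of the quotient topology (a hypothetical example, not from the post): take $X=\{0,1,2,3\}$ with open sets $\varnothing$, $\{0,2\}$, $\{1,3\}$, $X$, and glue $0$ to $2$. A set of equivalence classes is open exactly when the union of its classes is open in $X$:

```python
from itertools import chain, combinations

X = frozenset({0, 1, 2, 3})
top_X = {frozenset(), frozenset({0, 2}), frozenset({1, 3}), X}

# Equivalence classes of the relation identifying 0 with 2.
classes = [frozenset({0, 2}), frozenset({1}), frozenset({3})]

def preimage(class_subset):
    # Preimage under the projection: the union of the classes themselves.
    return frozenset(chain.from_iterable(class_subset))

# A subset of the quotient is open iff its preimage is open in X.
quotient_top = {
    frozenset(S)
    for r in range(len(classes) + 1)
    for S in combinations(classes, r)
    if preimage(S) in top_X
}

assert frozenset({frozenset({0, 2})}) in quotient_top   # {[0]} is open
assert frozenset({frozenset({1})}) not in quotient_top  # {[1]} is not
assert frozenset(classes) in quotient_top and frozenset() in quotient_top
```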
And now, the existence theorem for limits tells us that all limits and colimits exist in $\mathbf{Top}$. That is, the category of topological spaces is both complete and cocomplete.
As a particularly useful example, let’s look at a pushout. If we have two topological spaces $X$ and $Y$ and a third space $A$ with maps $A\rightarrow X$ and $A\rightarrow Y$ making $A$ into a subspace of both $X$ and $Y$, then we can construct the pushout of $X$ and $Y$ over $A$. The general rule is to first construct the coproduct of $X$ and $Y$, and then pass to an appropriate coequalizer. That is, we take the disjoint union $X\uplus Y$ and then identify the points in the copy of $A$ sitting inside $X$ with those in the copy of $A$ sitting inside $Y$. That is, we get the union of $X$ and $Y$, “glued” along $A$.
Okay, so we’ve defined a topology $\tau$ on a set $X$. But we also love categories, so we want to see this in terms of categories. And, indeed, every topology is a category!

First, remember that the collection of subsets of $X$, like the collection of subobjects on an object in any category, is partially ordered by inclusion. And since every partially ordered set is a category, so is the collection of subsets of $X$.
In fact, it’s a lattice, since we can use union and intersection as our join and meet, respectively. When we say that a poset has pairwise least upper bounds it’s the same as saying when we consider it as a category it has finite coproducts, and similarly pairwise greatest lower bounds are the same as finite products. But here we can actually take the union or intersection of any collection of subsets and get a subset, so we have all products and coproducts. In the language of posets, we have a “complete lattice”.
So now we want to talk about topologies. A topology on $X$ is just a collection $\tau$ of the subsets that’s closed under finite intersections and arbitrary unions. We can use the same order (inclusion of subsets) to make a topology into a partially-ordered set. In the language of posets, the requirements are that we have a sublattice (finite meets and joins, along with the same top and bottom elements) with arbitrary joins — the topology contains the least upper bound of any collection of its elements.
And now we translate the partial order language into category theory. A topology on $X$ is a subcategory $\tau$ of the category of subsets of $X$ with finite products and all coproducts. That is, we have an arrow $U\rightarrow V$ from the object $U$ to the object $V$ if and only if $U\subseteq V$ as subsets of $X$. Given any finite collection $\{U_1,\dots,U_n\}$ of objects we have their product $U_1\cap\dots\cap U_n$, and given any collection $\{U_\alpha\}$ of objects we have their coproduct $\bigcup U_\alpha$. In particular we have the empty product — the terminal object $X$ — and we have the empty coproduct — the initial object $\varnothing$. And all the arrows in our category just tell us how various open sets sit inside other open sets. Neat!
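On a finite set all of this is checkable by exhaustion; a sketch with a small toy topology (a hypothetical example):

```python
from itertools import chain, combinations

X = frozenset({1, 2, 3})
# A topology on X, viewed as a poset (hence category) under inclusion.
tau = [frozenset(), frozenset({1}), frozenset({1, 2}), X]

# Finite products exist: pairwise intersections stay in the topology.
for U in tau:
    for V in tau:
        assert U & V in tau

# All coproducts exist: every possible union stays in the topology.
for subset in chain.from_iterable(
        combinations(tau, r) for r in range(len(tau) + 1)):
    assert frozenset(chain.from_iterable(subset)) in tau

# Empty product = terminal object X; empty coproduct = initial object ∅.
assert X in tau and frozenset() in tau
```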
Part of the disappointment I mentioned is that the road I was on just looked so pretty. I’ve said in various places that I agree with (what I understand to be) David Corfield’s view of mathematics as a process of telling good stories, and this was a great story, but unfortunately it just doesn’t quite ring true. Before I purge it, I want to show you the picture of the tensorator as I thought it would work.
Across the top are two tensor products of one span and one object each, and across the bottom are the other two, giving the compositions in both orders. The squares (that look like triangles) at the top and bottom are pullbacks, giving the actual composite spans. Then we can put the tensor product in the middle, and get arrows up and down from the universal properties of the pullback squares. And it even looks like a big tensor product symbol!