The Unapologetic Mathematician

Mathematics for the interested outsider

The Category of Matrices IV

Finally, all the pieces are in place to state the theorem I’ve been driving at for a while now:

The functor that we described from \mathbf{Mat}(\mathbb{F}) to \mathbf{FinVec}(\mathbb{F}) is an equivalence.

To show this, we must show that it is full, faithful, and essentially surjective. The first two conditions say that given natural numbers m and n the linear transformations \mathbb{F}^m\rightarrow\mathbb{F}^n and the n\times m matrices over \mathbb{F} are in bijection.

But this is just what we’ve been showing! The vector spaces of n-tuples come with their canonical bases, and given these bases every linear transformation gets a uniquely-defined matrix. Conversely, every matrix defines a unique linear transformation when we’ve got the bases to work with. So fullness and faithfulness are straightforward.

Now for essential surjectivity. This says that given any finite-dimensional vector space V we have some n so that V\cong\mathbb{F}^n. But we know that every vector space has a basis, and for V it must be finite; that’s what “finite-dimensional” means! Let’s say that we’ve got a basis \left\{f_i\right\} consisting of n vectors.

Now we just line up the canonical basis \left\{e_i\right\} of \mathbb{F}^n and define linear transformations by S(e_i)=f_i and T(f_i)=e_i. Remember that we can define a linear transformation by specifying its values on a basis (which can all be picked independently) and then extending by linearity. Thus we do have two well-defined linear transformations here. But just as clearly we see that for any v\in V we have


and a similar equation holds for every n-tuple in \mathbb{F}^n. Thus S and T are inverses of each other, and are the isomorphism we need.

This tells us that the language of linear transformations between finite-dimensional vector spaces is entirely equivalent to that of matrices. But we gain some conceptual advantages by thinking in terms of finite-dimensional vector spaces. One I can point to right here is how we can tell the difference between a vector space and its dual. Sure, they’ve got the same dimension, and so there’s some isomorphism between them. Still, when we’re dealing with both at the same time they behave differently, and it’s valuable to keep our eye on that difference.

On the other hand, there are benefits to matrices. For one thing, we can actually write them down and calculate with them. A lot of people are — surprise! — interested in using mathematics to solve problems. And the problems that linear algebra is most directly applicable to are naturally stated in terms of matrices.

What the theorem tells us is that none of this matters. We can translate problems from the category of matrices to the category of vector spaces and back, and nothing is lost in the process.


June 24, 2008 Posted by | Algebra, Category theory, Linear Algebra | Leave a comment

The Category of Matrices III

At long last, let’s get back to linear algebra. We’d laid out the category of matrices \mathbf{Mat}(\mathbb{F}), and we showed that it’s a monoidal category with duals.

Now here’s the really important thing: There’s a functor \mathbf{Mat}(\mathbb{F})\rightarrow\mathbf{FinVec}(\mathbb{F}) that assigns the finite-dimensional vector space \mathbb{F}^n of n-tuples of elements of \mathbb{F} to each object n of \mathbf{Mat}(\mathbb{F}). Such a vector space of n-tuples comes with the basis \left\{e_i\right\}, where the vector e_i has a {1} in the ith place and a {0} elsewhere. In matrix notation:

\displaystyle e_1=\begin{pmatrix}1\\{0}\\\vdots\\{0}\end{pmatrix}
\displaystyle e_2=\begin{pmatrix}{0}\\1\\\vdots\\{0}\end{pmatrix}

and so on. We can write e_i=\delta_i^je_j (remember the summation convention), so the vector components of the basis vectors are given by the Kronecker delta. We will think of other vectors as column vectors.

Given a matrix \left(t_i^j\right)\in\hom(m,n) we clearly see a linear transformation from \mathbb{F}^m to \mathbb{F}^n. Given a column vector with components v^i (where the index satisfies 1\leq i\leq m), we construct the column vector t_i^jv^i (here 1\leq j\leq n). But we’ve already established that matrix multiplication represents composition of linear transformations. Further, it’s straightforward to see that the linear transformation corresponding to a matrix \left(\delta_i^j\right) is the identity on \mathbb{F}^n (depending on the range of the indices on the Kronecker delta). This establishes that we really have defined a functor.

But wait, there’s more! The functor is linear over \mathbb{F}, so it’s a functor enriched over \mathbb{F}. The Kronecker product of matrices corresponds to the monoidal product of linear transformations, so the functor is monoidal, too. Following the definitions, we can even find that our functor preserves duals.

So we’ve got a functor from our category of matrices to the category of finite-dimensional vector spaces, and it preserves all of the relevant structure.

June 23, 2008 Posted by | Algebra, Category theory, Linear Algebra | 1 Comment

The Category of Matrices II

As we consider the category \mathbf{Mat}(\mathbb{F}) of matrices over the field \mathbb{F}, we find a monoidal structure.

We define the monoidal product \boxtimes on objects by multiplication — m\boxtimes n=mn — and on morphisms by using the Kronecker product. That is, if we have an m_1\times n_1 matrix \left(s_{i_1}^{j_1}\right)\in\hom(n_1,m_1) and an m_2\times n_2 matrix \left(t_{i_2}^{j_2}\right)\in\hom(n_2,m_2), then we get the Kronecker product


Here we have to be careful about what we’re saying. In accordance with our convention, the pair of indices (i_1,i_2) (with 1\leq i_1\leq m_1 and 1\leq i_2\leq m_2) should be considered as the single index (i_1-1)m_2+i_2. It’s clear that this quantity then runs between {1} and m_1m_2. A similar interpretation goes for the index pairs (j_1,j_2).

Of course, we need some relations for this to be a monoidal structure. Strict associativity is straightforward:


For our identity object, we naturally use {1}, with its identity morphism \left(1\right). Note that the first of these is the object the natural number {1}, while the second is the 1\times1 matrix whose single entry is the field element {1}. Then we can calculate the Kronecker product to find


and so strict associativity holds as well.

The category of matrices also has duals. In fact, each object is self-dual! That is, we set n^*=n. We then need our arrows \eta_n:1\rightarrow n\boxtimes n and \epsilon_n:n\boxtimes n\rightarrow1.

The morphism \eta_n will be a 1\times n^2 matrix. Specifically, we’ll use \eta_n=\left(\delta^{i,j}\right), with i and j both running between {1} and n. Again, we interpret an index pair as described above. The symbol \delta^{i,j} is another form of the Kronecker delta, which takes the value {1} when its indices agree and {0} when they don’t.

Similarly, \epsilon_n will be an n^2\times1 matrix: \epsilon_n=\left(\delta_{i,j}\right), using yet another form of the Kronecker delta.

Now we have compatibility relations. Since the monoidal structure is strict, these are simpler than usual:


But now all the basic matrices in sight are various Kronecker deltas! The first equation reads


which is true. You should be able to verify the second one similarly.

The upshot is that we’ve got the structure of a monoidal category with duals on \mathbf{Mat}(\mathbb{F}).

June 3, 2008 Posted by | Algebra, Category theory, Linear Algebra | 1 Comment

The Category of Matrices I

What we’ve been building up to is actually the definition of a category. Given a field \mathbb{F} we define the category \mathbf{Mat}(\mathbb{F}) of matrices over \mathbb{F}.

Most of our other categories have been named after their objects — groups are the objects of \mathbf{Grp}, commutative monoids are the objects of \mathbf{CMon}, and so on — but not here. In this case, matrices will be the morphisms, and the category of matrices illustrates in a clearer way than any we’ve seen yet how similar categories are to other algebraic structures that are usually seen as simpler and more concrete.

Down to business: the objects of \mathbf{Mat}(\mathbb{F}) will be the natural numbers \mathbb{N}, and the morphisms in \hom(m,n) are the n\times m matrices. That is, a morphism is a collection of field elements \left(t_i^j\right) where i runs from {1} to m and j runs from {1} to n.

We compose two morphisms by the process of matrix multiplication. If \left(s_i^j\right) is an n\times m matrix in \hom(m,n) and \left(t_j^k\right) is a p\times n matrix in \hom(n,p), then their product \left(s_i^jt_j^k\right) is a p\times m matrix in \hom(m,p) (remember the summation convention).

The category of matrices is actually enriched over the category of vector spaces over \mathbb{F}. This means that each set of morphisms is actually a vector space over \mathbb{F}. Specifically, we add matrices of the same dimensions and multiply matrices by scalars component-by-component.

We have yet to speak very clearly about identities. The axioms of an enriched category state that for each object (natural number) n there must be a linear function I_n:\mathbb{F}\rightarrow\hom(n,n). Because of linearity, this function is completely determined by its value at 1\in\mathbb{F}: I_n(x)=xI_n(1). We must pick this matrix I_n(1) so that it acts as an identity for matrix multiplication, and we choose the Kronecker delta for this purpose: I_n(1)=\left(\delta_i^j\right). That is, we use an n\times n matrix whose entries are {1} if the indices are equal and {0} otherwise. It’s straightforward to check that this is indeed an identity.

Other properties I’ve skipped over, but which aren’t hard to check, are that matrix multiplication is bilinear and associative. Both of these are straightforward once written out in terms of the summation convention; sometimes deceptively so. For example, the associativity condition reads (r_i^js_j^k)t_k^l=r_i^j(s_j^kt_k^l). But remember that there are hidden summation signs in here, so it should really read:


so there’s an implicit change in the order of summation here. Since we’re just doing finite sums, this is no problem, but it’s still worth keeping an eye on.

June 2, 2008 Posted by | Algebra, Category theory, Linear Algebra | 9 Comments

Category Theory Isn’t Useless After All!

Today on the arXiv, we find a posting of an old paper, which uses spans of “reflexive graphs” to give an algebraic framework for describing partita doppia — double-entry bookkeeping.

Now I need to find a follow-on to this paper and start applying to those financial math jobs.

March 18, 2008 Posted by | Algebra, Category theory | Leave a comment

Drafting a Paper

Long-time readers may remember that back in September I went to a conference at the University of Texas at Tyler. Well, it turns out that the AMS wants to publish a proceedings of the conference. So I’m trying to throw together a paper on the stuff I was talking about.

As I’m doing so, I’m recognizing that one part of my original talk — the whole business about anafunctors — isn’t quite ready for prime-time, and the whole thing hangs together better without it. And this brings up the design philosophy I talked about recently. In this case, writing the smaller paper first is being sort of forced on me by a short deadline.

Still, it’s crunch time, and I’m trying to crank this paper out while also teaching, applying for jobs, and dealing with car troubles you wouldn’t even believe. I don’t really feel like working up the next post in the calculus series today, and so I thought I’d toss up an alpha version of the paper. I’ll keep tweaking it, and replacing the version here as I finish more chunks, until I get to a beta version, which I’ll update here and post to the arXiv.

One particular note on the incompleteness: I haven’t even started writing the introductory section or the abstract yet. I’m finding that I tend to do better if I just dive into the mathematics and then come back later to say what I said.

And finally: it looks like I’ll be talking about this stuff at the University of Pennsylvania on March 19. Mark your calendars!

[UPDATE] 02/26: Still sans abstract and intro, but with all mathematical content there, I present a new version. Bibliography suggestions are particularly appreciated (thanks Scott).

[UPDATE] 03/04: Now with the abstract and introduction, a beta version. Bibliography suggestions would still be appreciated.

February 22, 2008 Posted by | Category theory, Knot theory | 15 Comments

Metric Spaces are Categories!

A guest post by Tom Leinster over at The n-Category Café reminded me of an interesting fact I haven’t mentioned yet: a metric space is actually an example of an enriched category!

First we’ll need to pick out our base category \mathcal{V}, in which we’ll find our hom-objects. Consider the set of nonnegative real numbers with their real-number order, and add in a point called \infty that’s above all the other points. This is a totally ordered set, and orders are categories. Let’s take the opposite of this category. That is, the objects of our category V are the points in the “interval” \left[0,\infty\right], and we have an arrow x\rightarrow y exactly when x\geq y.

This turns out to be a monoidal category, and the monoidal structure is just addition. Clearly this gives a monoid on the set of objects, but we need to check it on morphisms to see it’s functorial. But if x_1\geq y_1 and x_2\geq y_2 then x_1+x_2\geq y_1+y_2, and so we can see addition as a functor.

So we’ve got a monoidal category, and we can now use it to form enriched categories. Let’s keep out lives simple by considering a small \mathcal{V}-category \mathcal{C}. Here’s how the definition looks.

We have a set of objects \mathrm{Ob}(\mathcal{C}) that we’ll call “points” in a set X. Between any two points p_1 and p_2 we need a hom-object \hom_\mathcal{C}(p_1,p_2)\in\mathrm{Ob}(\mathcal{V}). That is, we have a function d:X\times X\rightarrow\left[0,\infty\right].

For a triple (p_1,p_2,p_3) of objects we need an arrow \hom_\mathcal{C}(p_2,p_3)\otimes\hom_\mathcal{C}(p_1,p_2)\rightarrow\hom_\mathcal{C}(p_1,p_3). In more quotidian terms, this means that d(p_2,p_3)+d(p_1,p_2)\geq d(p_1,p_3).

Also, for each point p there is an arrow from the identity object of \mathcal{V} to the hom-object \hom_\mathcal{C}(p,p). That is, 0\geq d(p,p), so d(p,p)=0.

These conditions are the first, fourth, and half of the second conditions in the definition of a metric space! In fact, there’s a weaker notion of a “pseudometric” space, wherein the second condition is simply that d(p,p)=0, and so we’re almost exactly giving the definition of a pseudometric space.

The only thing we’re missing is the requirement that d(p_1,p_2)=d(p_2,p_1). The case can be made (and has been, by Lawvere) that this requirement is actually extraneous, and that it’s in some sense more natural to work with “asymmetric” (pseudo)metric spaces that are exactly those given by this enriched categorical framework.

February 11, 2008 Posted by | Algebra, Category theory, Point-Set Topology, Topology | 7 Comments

Limits of Topological Spaces

We’ve defined topological spaces and continuous maps between them. Together these give us a category \mathbf{Top}. We’d like to understand a few of our favorite categorical constructions as they work in this context.

First off, the empty set \varnothing has a unique topology, since it only has the one subset at all. Given any other space X (we’ll omit explicit mention of its topology) there is a unique function \varnothing\rightarrow X, and it is continuous since the preimage of any subset of X is empty. Thus \varnothing is the initial object in \mathbf{Top}.

On the other side, any singleton set \{*\} also has a unique topology, since the only subsets are the whole set and the empty set, which must both be open. Given any other space X there is a unique function X\rightarrow\{*\}, and it is continuous because the preimage of the empty set is empty and the preimage of the single point is the whole of X, both of which are open in X. Thus \{*\} is a terminal object in \mathbf{Top}.

Now for products. Given a family X_\alpha of topological spaces indexed by \alpha\in A, we can form the product set \prod\limits_{\alpha\in A}X_\alpha, which comes with projection functions \pi_\beta:\prod\limits_{\alpha\in A}X_\alpha\rightarrow X_\beta and satisfies a universal property. We want to use this same set and these same functions to describe the product in \mathbf{Top}, so we must choose our topology on the product set so that these projections will be continuous. Given an open set U\subseteq X_\beta, then, its preimage \pi_\beta^{-1}(U) must be open in \prod\limits_{\alpha\in A}X_\alpha. Let’s take these preimages to be a subbase and consider the topology they generate.

If X is any other space with a family of continuous maps f_\alpha:X\rightarrow X_\alpha, then the universal property in \mathbf{Set} gives us a unique function f:X\rightarrow\prod\limits_{\alpha\in A}X_\alpha. But will it be a continuous map? To check this, remember that we only need to verify it on a subbase for the topology on the product space, and we have one ready to work with. Each set in the subbase is the preimage \pi_\beta^{-1}(U) of an open set in some X_\beta, and then its preimage under f is f^{-1}(\pi_\beta^{-1}(U))=(\pi_\beta\circ f)^{-1}(U)=f_\beta^{-1}(U), which is open by the assumption that each f_\beta is continuous. And so the product set equipped with the product topology described above is the categorical product of the topological spaces X_\alpha.

What about coproducts? Let’s again start with the coproduct in \mathbf{Set}, which is the disjoint union \biguplus\limits_{\alpha\in A}X_\alpha, and which comes with canonical injections \iota_\beta:X_\beta\rightarrow\biguplus\limits_{\alpha\in A}X_\alpha. This time let’s jump right into the universal property, which says that given another space X and functions f_\alpha:X_\alpha\rightarrow X, we have a unique function f:\biguplus\limits_{\alpha\in A}X_\alpha\rightarrow X. Now we need any function we get like this to be continuous. The preimage of an open set U\subseteq X will be the union of the preimages of each of the f_\alpha, sitting inside the disjoint union. By choosing X, the f_\alpha, and U judiciously, we can get the preimage f_\alpha^{-1}(U) to be any open set we want in X_\alpha, so the open sets in the disjoint union should consist precisely of those subsets V whose preimage \iota_\alpha^{-1}(V)\subseteq X_\alpha is open for each \alpha\in A. It’s easy to verify that this collection is actually a topology, which then gives us the categorical coproduct in \mathbf{Top}.

If we start with a topological space X and take any subset S\subseteq X then we can ask for the coarsest topology on S that makes the inclusion map i:S\rightarrow X continuous, sort of like how we defined the product topology above. The open sets in S will be any set of the form S\cap U for an open subset U\subseteq X. Then given another space Y, a function f:Y\rightarrow S will be continuous if and only if i\circ f:Y\rightarrow X is continuous. Indeed, the preimage (i\circ f)^{-1}(U)=f^{-1}(S\cap U) clearly shows this equivalence. We call this the subspace topology on S.

In particular, if we have two continuous maps f:X\rightarrow Y and g:X\rightarrow Y, then we can consider the subspace E\subseteq X consisting of those points x\in X satisfying f(x)=g(x). Given any other space Z and a continuous map h:Z\rightarrow X such that f\circ h=g\circ h, clearly h sends all of Z into the set E; the function h factors as e\circ h', where e:E\rightarrow X is the inclusion map. Then h' must be continuous because h is, and so the subspace E is the equalizer of the maps f and g.

Dually, given a topological space X and an equivalence relation \sim on the underlying set of X we can define the quotient space X/\sim to be the set of equivalence classes of points of X. This comes with a canonical function p:X\rightarrow X/\sim, which we want to be continuous. Further, we know that if g:X\rightarrow Y is any function for which x_1\sim x_2 implies g(x_1)=g(x_2), then g factors as g=g'\circ p for some function g':X/\sim\rightarrow Y. We want to define the topology on the quotient set so that g is continuous if and only if g' is. Given an open set U\in Y, its preimage g'^{-1}(U) is the set of equivalence classes that get sent into U, while its preimage g^{-1}(U) is the set of all points that get sent to U. And so we say a subset V of the quotient space X/\sim is open if and only if its preimage — the union of the equivalence classes in V is open in X.

In particular, if we have two maps f:Y\rightarrow X and g:Y\rightarrow X we get an equivalence relation on X by defining x_1\sim x_2 if there is a y\in Y so that f(y)=x_1 and g(y)=x_2. If we walk through the above description of the quotient space we find that this construction gives us the coequalizer of f and g.

And now, the existence theorem for limits tells us that all limits and colimits exist in \mathbf{Top}. That is, the category of topological spaces is both complete and cocomplete.

As a particularly useful example, let’s look at an example of a pushout. If we have two topological spaces U and V and a third space A with maps A\rightarrow U and A\rightarrow V making A into a subspace of both U and V, then we can construct the pushout of U and V over A. The general rule is to first construct the coproduct of U and V, and then pass to an appropriate coequalizer. That is, we take the disjoint union U\uplus V and then identify the points in the copy of A sitting inside U with those in the copy of A sitting inside V. That is, we get the union of U and V, “glued” along A.

November 26, 2007 Posted by | Category theory, Topology | 25 Comments

Topologies as Categories

Okay, so we’ve defined a topology on a set X. But we also love categories, so we want to see this in terms of categories. And, indeed, every topology is a category!

First, remember that the collection of subsets of X, like the collection of subobjects on an object in any category, is partially ordered by inclusion. And since every partially ordered set is a category, so is the collection of subsets of X.

In fact, it’s a lattice, since we can use union and intersection as our join and meet, respectively. When we say that a poset has pairwise least upper bounds it’s the same as saying when we consider it as a category it has finite coproducts, and similarly pairwise greatest lower bounds are the same as finite products. But here we can actually take the union or intersection of any collection of subsets and get a subset, so we have all products and coproducts. In the language of posets, we have a “complete lattice”.

So now we want to talk about topologies. A topology is just a collection of the subsets that’s closed under finite intersections and arbitrary unions. We can use the same order (inclusion of subsets) to make a topology into a partially-ordered set. In the language of posets, the requirements are that we have a sublattice (finite meets and joins, along with the same top and bottom element) with arbitrary meets — the topology contains the least upper bound of any collection of its elements.

And now we translate the partial order language into category theory. A topology is a subcategory of the category of subsets of X with finite products and all coproducts. That is, we have an arrow from the object U to the object V if and only if U\subseteq V as subsets of X. Given any finite collection \{U_i\}_{i=1}^n of objects we have their product \bigcap\limits_{i=1}^nU_i, and given any collection \{U_\alpha\}_{\alpha\in A} of objects we have their coproduct \bigcup\limits_{\alpha\in A}U_\alpha. In particular we have the empty product — the terminal object X — and we have the empty coproduct — the initial object \varnothing. And all the arrows in our category just tell us how various open sets sit inside other open sets. Neat!

November 9, 2007 Posted by | Category theory, Orders, Point-Set Topology, Topology | 8 Comments

(Not) The Tensorator for Span 2-categories

Part of the disappointment I mentioned is that the road I was on just looked so pretty. I’ve said in various places that I agree with (what I understand to be) David Corfield’s view of mathematics as a process of telling good stories, and this was a great story, but unfortunately it just doesn’t quite ring true. Before I purge it, I want to show you the picture of the tensorator as I thought it would work.


Across the top are two tensor products of one span and one object each, and across the bottom are the other two, giving the compositions in both orders. The squares (that look like triangles) at the top and bottom are pullbacks, giving the actual composite spans. Then we can put the tensor product F\otimes G in the middle, and get arrows up and down from the universal properties of the pullback squares. And it even looks like a big tensor product symbol!

But ultimately there’s no way to make this span we get always be unitary, or even invertible. And all the pretty pictures in the world can’t save a deeply flawed story. Just ask Michael Bay.

November 7, 2007 Posted by | Category theory | Leave a comment