The Unapologetic Mathematician

Mathematics for the interested outsider

Set — the card game

I’d like to talk about the card game Set. Why? Because the goal is to find affine lines! Huh?

Well, first you have to understand that we’re not working over the usual fields we draw our intuition from. We’re using the field \mathbb{Z}_3 of integers modulo 3. A bit more explicitly, we start with the ring \mathbb{Z} of integers and quotient out by the ideal 3\mathbb{Z} of multiples of 3. This is a maximal ideal, because if we add in any other integer we can subtract off the closest multiple of 3. Then we’re left with either {1} or -1, and this gives us all the integers in our expanded ideal. But we know that the quotient of a ring by a maximal ideal is a field! Thus adding and multiplying integers modulo 3 gives us a field. For the three elements of \mathbb{Z}_3 we’ll use \{1,2,3\} (using 3 instead of the usual {0}).

So what does this have to do with the card game? Well, each one has four features, each picked from three choices:

  • Colors: red, green, or purple
  • Symbols: squiggles, diamonds, or ovals
  • Patterns: solid, striped, or outlined
  • Numbers: one, two, or three

Let’s consider the color alone for now. We choose (somewhat arbitrarily) to correlate the colors red, green, and blue with the elements {1}, 2, and 3 of \mathbb{Z}_3, respectively. We don’t want to think of them as “being” the field elements, nor even as the elements of a one-dimensional vector space over \mathbb{Z}_3. The arbitrariness of our correspondence points to the fact that the colors constitute an affine line over \mathbb{Z}_3!

Similarly, we have lines for symbols, patterns, and numbers. And it’s pretty straightforward to see that we can take the product these affine lines to get a four-dimensional affine space. That is, the cards in Set form an affine four-space over \mathbb{Z}_3, and this is the stage on which we play our game.

Now what’s the goal of the game? You have to identify a collection of three cards so that in each of the four characteristics they’re all the same or all different. What I assert is that these conditions identify affine lines in the affine four-space. So what’s an affine line in this context?

Well, first let’s move from the affine four-space of cards to the affine four-space of four-tuples of field elements. We’ll just use the lists of choices above and identify them with {1}, 2, and 3, respectively, and read them in the same order as above. That is, the four-tuple \left(3,1,2,2\right) would mean the card with two purple, striped squiggles. We can do this because (since they’re torsors) all affine spaces of the same dimension over the same field are isomorphic (but not canonically so!).

Now we can say that an affine line is a map from the standard affine line (given by “one-tuples” of field elements) to this affine four-space. But it can’t be just any function. It has to preserve relative differences! Since we have only three points to consider, we can reduce this question somewhat: an affine line is a list of three four-tuples of field elements, and the difference from the first to the second must equal the difference from the second to the third.

So we can take any two points, and then we can work out what the third one has to be from there. And we can tell that each of the four components works independently from all the others. So let’s look at colors alone again. Let’s say that the first two cards have colors c_1 and c_2. Then the affine condition says that c_3-c_2=c_2-c_1. That is, c_3=2c_2-c_1. It’s straightforward from here to see that if c_1=c_2, then c_3 is again the same as those two. On the other hand, if c_1 and c_2 are different, then c_3 must be the remaining choice.

What have we seen here? If we start with two cards with the same color, the third card on the affine line will have the same color as well. If the first two cards have different colors, the third one will be the remaining color. And the same analysis applies to symbols, patterns, and numbers. Thus affine lines are sets of three cards which are all the same or all different in each of the four characteristics.

Now that we know that, what can we do with it? Well, look at the gameplay. We don’t have all of the cards spread out at once: we deal out a certain number at a time. So the question is: how many points in the affine space \mathbb{Z}_3^4 can we take without them containing an affine line? This is analogous to the problem of placing eight queens on a chessboard. A more advanced problem is to give the expected number of affine lines in a subset of size n.


July 17, 2008 Posted by | Algebra, Linear Algebra | 18 Comments

Affine Spaces

Today we’re still considering the solution set of an inhomogenous linear system a_i^jx^i=y^j and its associated homogenous system a_i^jx^i=0. Remember that we also wrote these systems in more abstract notation as Ax=y and Ax=0. The solution space to the homogenous system is the kernel \mathrm{Ker}(A), and any two solutions of the inhomogenous system differ by a vector in this subspace.

We call such a collection of points an “affine space”. We can also talk about such a thing from the inside, without seeing it as embedded in some ambient vector space as a coset. Algebraically, we characterize an affine space S as a collection of points and a function \Theta:S\times S\rightarrow V, where V is some associated vector space. The value \Theta(a,b) — also written a-b — is thought of as the “displacement vector” from b to a.

We require two properties for this map: first that \Theta(a,b)+\Theta(b,c)=\Theta(a,c); second that for every b the map a\mapsto\Theta(a,b) from S to V is a bijection. The former condition provides coherence for the interpretation as displacement vectors. The latter implements the idea that an affine space S “looks like” the associated vector space V.

Given a subspace V of a vector space W, any coset w+V is an affine space associated to V. As a degenerate case, we can consider V as a subspace of itself, and V itself is its only coset. Thus any vector space can be considered as an affine space associated to itself. In fact, since any affine space is in bijection with its associated vector space, we can get any one of them by this construction. Thus any two affine spaces associated to a given vector space are isomorphic, but not canonically so. It’s this lack of a canonical isomorphism that makes things interesting, because we can’t justify simply identifying non-canonically isomorphic spaces.

Another consequence of the bijection is that we can “add” a vector v to a point b in an affine space. Since a\mapsto\Theta(a,b) is a bijection, there must be a unique point we’ll call b+v so that \Theta(b+v,b)=v. It’s straightforward from here to show that this gives an action of the vector space V (considered as an abelian group) on the affine space S.

In fact, considering an affine space in the category of V-actions, the bijection shows us that S is isomorphic to V‘s action on itself by addition. We can even use this to characterize an affine space exactly as a V-set isomorphic to V‘s action on itself. In other words, it’s a torsor for V.

July 16, 2008 Posted by | Algebra, Linear Algebra | 6 Comments

Inhomogenous Linear Systems

In distinction from homogenous systems we have inhomogenous systems. These are systems of linear equations where some of the constants on the right of the equations are nonzero. In our matrix notation we have a_i^jx^i=y^j, where at least one y^j\neq0.

Now we no longer have the nice property that solutions form a vector space. For example, if \bar{x}^i are the components of one solution, and \hat{x}^i are the components of another solution, then their sum is definitely not a solution. Indeed, we calculate


And this only works out as a solution to our system if 2y^j=y^j for each index j, implying that each y^j=0.

However, we do have something interesting. What if we take the difference between two solutions? Now we calculate


That is, the difference between the two solutions satisfies the “associated homogenous equation” a_i^jx^i=0. If we take some solution to the inhomogenous equation, any other solution is the sum of this particular solution and a solution of the associated homogenous equation.

So we’re still very interested in solving homogenous equations. Once we have a complete solution to the homogenous equation — usually through having a basis for the kernel of A — we just need a particular solution \bar{x} (with components \bar{x}^i) to the inhomogenous equation and we can parametrize all the solutions. If we remember that a vector space is built up from an abelian group, we might think back to group theory and the language of cosets. The solution set to the inhomogenous equation is the coset \bar{x}+\mathrm{Ker}(A).

Of course, we should also remember that expressions for cosets are far from unique. Any particular solution in the coset is just as good as any other. If \hat{x} is another solution, then \bar{x}+\mathrm{Ker}(A)=\hat{x}+\mathrm{Ker}(A). Both cosets consist of the same collection of vectors, but they are written with different offsets (picked from \mathrm{Ker}(A)) from different base-points — \bar{x} and \hat{x}. This illustrates that, unlike in a vector space, the solution set has no special element like the identity. All solutions are on the same footing.

July 15, 2008 Posted by | Algebra, Linear Algebra | 4 Comments

Homogenous Linear Systems

An important special case of a linear system is a set of homogenous equations. All this means (in this case) is that the right side of each of the equations is zero.

In matrix notation (using the summation convention), we have the equation a_i^jx^i=0. Remember that this is actually a collection of n equations, one for each value of the index j. And in our more abstract notation we write Ax=0, where the right had side is the zero vector in \mathbb{F}^n.

So what is a solution of this system? It’s a vector x\in\mathbb{F}^m that gets sent to 0\in\mathbb{F}^n by the linear transformation A. But a vector that gets sent to the zero vector is exactly one in the kernel \mathrm{Ker}(A). So solving the homogenous system a_i^jx^i=0 is equivalent to determining the kernel of the linear transformation A.

We don’t yet have any tools for making this determination yet, but we can say some things about the set of solutions. For one thing, they form a subspace of \mathbb{F}^m. That is, the sum of any two solutions is again a solution, and a constant multiple of any solution is again a solution. We’re interested, then, in finding linearly independent solutions, because from them we can construct more solutions without redundancy.

A maximal collection of linearly independent solutions will be a basis for the subspace of solutions — for the kernel of the linear map. As such, the number of solutions in any maximal collection will be the dimension of this subspace, which we called the nullity of the linear transformation A. The rank-nullity theorem then tells us that we have a relationship between the number of independent solutions to the system (the nullity), the number of variables in the system (the dimension of \mathbb{F}^m), and the rank of A, which we will also call the rank of the system. Thus if we can learn ways to find the rank of the system then we can determine the number of independent solutions.

July 14, 2008 Posted by | Algebra, Linear Algebra | 5 Comments

The Matrix of a Linear System

As I wait for the iTunes store to be less busy so it can reauthorize my iPhone to work with the updated firmware, we can finally get back on track.

Let’s consider a system of linear equations. We’ll use the m variables x^1, x^2, and so on up to x^m; and we’ll let there be n equations. Let’s write these out:

a_1^1x^1 + a_2^1x^2 + ... + a_m^1x^m = y^1
a_1^2x^1 + a_2^2x^2 + ... + a_m^2x^m = y^2

a_1^nx^1 + a_2^nx^2 + ... + a_m^nx^m = y^n

Here the constant a_i^j are the coefficient of x^i in the jth equation, and y^j is the constant term on the right hand side of the jth equation.

But this is all but writing out exactly our matrix notation! We can take the above system and rewrite it as


Picking values for the variables x^i is the same as picking the components of a column vector x=x^ie_i. We can collect the right hand sides of all our equations into one column vector y=y^jf_j, and the coefficients give a (linear) formula for taking the values we choose for the variables and turning them into the n values on the right of our equations. That is, they define a linear map A:\mathbb{F}^m\rightarrow\mathbb{F}^n. We can thus rewrite our system in a more abstract notation as:


Suddenly it looks a lot more like the first — and simplest — linear equation we wrote down. But now we can’t just “divide by A” to solve it. We need heavier tools to manage this task, or even just to show when it can be managed at all! In short: we need linear algebra.

Incidentally, now we see why we indexed the variables with superscripts: because that’s how we wrote the components of a vector, and the variables are the components of a single vector variable. And if you’re still on the fence, I’ll note that physicists use superscripts all the time to index variables (for similar purposes), and they even do it when the equations aren’t all linear. Just try it. You’ll get used to it.

July 11, 2008 Posted by | Algebra, Linear Algebra | 3 Comments

More on the C-G Eversion

Some people had trouble grabbing the whole 50MB file that I posted, so Scott Carter broke it into pieces. He also included these comments:

The red, blue, and purple curves on the large (distorted) spherical objects at the bottom of each page of the eversion are the preimages of the the folds (color coded of course) and the double decker sets. Since at each time the sphere is immersed it may have double and triple points. Each arc of double points lifts to a pair of arcs on the ambient sphere, and each triple point lifts to three points on the ambient sphere. These lifts are the “decker sets.”

They are obtained via Gauss-Morse codes. Pick a base point and orientation on each curve in a movie. These are chosen
consistently from one still to the next. Label the double points and the optima and read the labels as they are encountered upon a single journey around the curve. The labels, too, are chosen consistently from one still to the next. Write these down for each curve in a movie, and connect the letters in the words as the curves change according to the basic changes that occur in each of the movie scenes.

These curves then are instructions on how to immerse the ambient sphere to create the illustrations.

Sarah’s thesis computes that the fold set is an annulus, the double point set is the connected sum of three projective planes, and the double decker set is the connected orientation double cover: a genus 2 surface.

So here are the pieces:

  1. Immersed spheres as movies (2.2 MB)
  2. The basic movie moves (3.4 MB)
  3. The eversion from the red side to the quadruple point (19 MB)
  4. Half of the eversion from the quadruple point halfway to the blue side (24 MB)
  5. The other half of the eversion from the quadruple point halfway to the blue side (17 MB)

There’s a glitch in part 4, so I’ll post that as soon as I can.

July 10, 2008 Posted by | Category theory, Knot theory, Topology | 1 Comment

The Carter-Gelsinger Eversion

I’ve mentioned Outside In before. That video shows a way of turning a sphere inside out. It’s simpler than the first explicit eversions to be discovered, but the simplicity is connected to a high degree of symmetry. This leads to very congested parts of the movie, where it’s very difficult to see what’s going on. Further, many quadruple points — where four sections of the surface pass through the same point — occur simultaneously, and even higher degree points occur. We need a simpler version.

What would constitute “simple” for us, then? We want as few multiple points as possible, and as few at a time as possible. In fact, it would be really nice if we could write it down algebraically, in some sense? But what sense?

Go back to the diagrammatics of braided monoidal categories with duals. There we could draw knots and links to represent morphisms from the monoidal identity object to itself. And topologically deformed versions of the same knot encoded the same morphism. This is the basic idea of the category \mathcal{T}ang of tangles.

But if we shift our perspective a bit, we consider the 2-category of tangles. Instead of saying that deformations are “the same” tangle, we consider explicit 2-isomorphisms between tangles. We’ve got basic 2-isomorphisms for each of the Reidemeister moves, and a couple to create or cancel caps and cups in pairs (duality) and to pull crossings past caps or cups (naturality). Just like we can write out any link diagram in terms of a small finite collection of basic tangles, we can write out any link diagram isotopy in terms of a small finite collection of basic moves.

What does a link diagram isotopy describe? Links (in our picture) are described by collections of points moving around in the plane. As we stack up pictures of these planes the points trace out a link. So now we’ve got links moving around in space. As we stack up pictures of these spaces, the links trace out linked surfaces in four-dimensional space. And we can describe any such surface in terms of a small collection of basic 2-morphisms in the braided monoidal 2-category of 2-tangles. These are analogous to the basic cups, caps, and crossings for tangles.

Of course the natural next step is to consider how to deform 2-tangles into each other. And we again have a small collection of basic 3-morphisms that can be used to describe any morphisms of 2-tangles. These are analogous to the Reidemeister moves. Any deformation of a surface (which is written in terms of the basic 2-morphisms) can be written out in terms of these basic 3-morphisms.

We can simplify our picture a bit. Instead of knotting surfaces in four-dimensional space, let’s just let them intersect each other in three-dimensional space. To do this, we need to use a symmetric monoidal 3-category with duals, since there’s no distinction between two types of crossings.

And now we come back to eversions. We write the sphere as a 2-dimensional cup followed by a 2-dimensional cap. Since we have duals, we can consider one side to be “painted red” and one side “painted blue”. One way of writing the sphere has the outside painted red and the other side is painted blue. An eversion in our language will be an explicit list of 3-morphisms that run from one of these spheres to the other.

Scott Carter and Sarah Gelsinger have now created just such an explicit list of directions to evert a sphere. And, what’s more, they’ve rendered pictures of it! Here, for the first time in public, is a 50MB PDF file showing the Carter-Gelsinger eversion.

First they illustrate the basic pieces of a diagram of knotted surfaces (pp. 1-4). Then they illustrate the basic 2-morphisms that build up surfaces (pp. 5-6), and write out a torus as an example (p. 7). Then come a few more basic 2-morphisms that involve self-intersections (pp. 8-9) and a more complicated immersed sphere (pp. 10-11). Each of these is written out also as a “movie” of self-intersecting loops in the plane. Next come the “movie moves” — the 3-morphisms connecting the 2-morphism “movies” (pp. 12-17). These are the basic pieces that let us move from one immersed surface to another.

Finally, the eversion itself, consisting of the next 79 pages. Each one consists of an immersed sphere, rendered in a number of different ways. On the left is a movie of immersed plane curves. On the top are three views of the sphere as a whole — a “solid” view on the right, a sketch of the double-point curves in the middle, and a “see-through” view on the left. The largest picture on each page is a more schematic view I don’t want to say too much about.

The important thing to see here is that between each two frames of this movie is exactly one movie move. Everything here is rendered into pictures, but we could write out the movie on each page as a sequence of 2-morphisms form the top of the page to the bottom. Then moving from one page to the next we trace out a sequence of 3-morphisms, writing out the eversion explicitly in terms of the basic 3-morphisms. As an added bonus, there’s only ever one quadruple point — where we pass from Red 26 to Blue 53 — and no higher degree points.

I’d like to thank Scott for not only finishing off this rendering he’s been promising for ages, but for allowing me to host its premiere weblog appearance. I, for one, am looking forward to the book, although I’m not sure this one will be better than the movie.

[UPDATE] Some people have been having trouble with the whole 50MB PDF (and more people might as the Carnival comes to see this page. Scott Carter broke the file up into five pieces, and I’ve put them up here in a new post. There’s a glitch in part 4, but I’ll have that one up as soon as I can.

July 6, 2008 Posted by | Category theory, Knot theory, Topology | 7 Comments

Linear Equations

Okay, now I really should introduce one of the most popular applications of linear algebra, at least outside mathematics. Matrices can encode systems of linear equations, and matrix algebra can be used to solve them.

What is a linear equation? It’s simply an algebraic equation where each variable shows up at most to the first power. For example, we could consider the equation


and clearly we can solve this equation by dividing by 3 on each side to find x=4. Of course, there could be more than one variable.

Sometimes people might use different names like x, y, and z, but since we want to be open-ended about things we’ll just say x^1, x^2, x^3, and so on. Notice here that because variables can only show up to the first power, there is no ambiguity about writing our indices as superscripts — something we’ve done before. Anyhow, we might write an equation


Now we have many different possible solutions. We could set x^1=4 and x^2=0, or we could set x^1=0 and x^2=3, or all sorts of combinations in between. This one equation is not enough to specify a unique solution.

Things might change, though, if we add more equations. Consider the system


Now we can rewrite the second equation as x^1=11+x^2, and then drop this into the first equation to find 3(11+x^2)+4x^2=12, or 33+7x^2=12. This just involves the one variable, and we can easily solve it to find x^2=-3, which quickly yields x^1=8. We now have a single solution for the pair of equations.

What if we add another equation? Consider


Now we know that the only values solving both of the first two equations are x^1=8 and x^2=-3. But then x^1+x^2=5, so the third equation cannot possibly be satisfied. Too many equations can be impossible to solve.

In general, things work out when we’ve got one equation for each variable. As we move forwards we’ll see how to express this fact more concretely in terms of matrices and vector spaces.

July 3, 2008 Posted by | Algebra, Linear Algebra | 19 Comments

Row Rank

Yesterday we defined the column rank of a matrix to be the maximal number of linearly independent columns. Flipping over, we can consider the obviously analogous quantity for rows. The “row rank” is the maximal number of linearly independent rows in a matrix. But what’s a row? It’s a vector in a dual space.

Okay, let’s talk a bit more concretely. Consider a linear transformation T:\mathbb{F}^m\rightarrow\mathbb{F}^n, which is described by the n\times m matrix \left(t_i^j\right). Then for each index j we can take the jth row in this matrix:


And use it as a linear functional on the space of column vectors \mathbb{F}^m. Specifically, it sends the vector with components v^i to the number t_i^jv^i. Thus we get n linear functionals — elements of the dual space \left(\mathbb{F}^m\right)^*.

But what linear functionals are these? They must be something special, and indeed they are. Remember that we’ve got our canonical basis \left\{f_j\right\} for the target space \mathbb{F}^n. We also immediately have the dual basis \left\{\phi^j\right\} of \left(\mathbb{F}^n\right)^*. The linear functionals we got from the rows are then just the pullbacks of these basic linear functionals \phi^j\circ T.

To see this, notice that the linear functional \phi^k is given by the 1\times n matrix with a {1} in the kth component and {0} everywhere else. That is, its components come from the delta: \delta_j^k. We get a linear functional on \mathbb{F}^m by first hitting a column vector with the matrix \left(t_i^j\right) and then with this 1\times n matrix. We use matrix multiplication to see that this is equivalent to just using the 1\times m matrix — the element of \left(\mathbb{F}^m\right)^* — with components \delta_j^kt_i^j=t_i^k. And this is just the kth row of the matrix!

So the rows span the image of the dual space \left(\mathbb{F}^n\right)^* under the dual map T^*. This is, of course, a subspace of \left(\mathbb{F}^m\right)^*, and its dimension is exactly the row rank of T. We’ll come back later and show that this must actually be the same as the column rank.

July 2, 2008 Posted by | Algebra, Linear Algebra | Leave a comment

Column Rank

Let’s go back and consider a linear map T:V\rightarrow W. Remember that we defined its rank to be the dimension of its image. Let’s consider this a little more closely.

Any vector in the image of T can be written as T(v) for some vector v\in V. If we pick a basis \left\{e_i\right\} of V, then we can write T(v)=T(v^ie_i)=v^iT(e_i). Thus the vectors T(e_i)\in W span the image of T. And thus they contain a basis for the image.

More specifically, we can get a basis for the image by throwing out some of these vectors until those that remain are linearly independent. The number that remain must be the dimension of the image — the rank — and so must be independent of which vectors we throw out. Looking back at the maximality property of a basis, we can state a new characterization of the rank: it is the cardinality of the largest linearly independent subset of \left\{T(e_i)\right\}.

Now let’s consider in particular a linear transformation T:\mathbb{F}^m\rightarrow\mathbb{F}^n. Remember that these spaces of column vectors come with built-in bases \left\{e_i\right\} and \left\{f_j\right\} (respectively), and we have a matrix T(e_i)=t_i^jf_j. For each index i, then, we have the column vector

\displaystyle T(e_i)=\begin{pmatrix}t_i^1\\t_i^2\\\vdots\\t_i^n\end{pmatrix}

appearing as a column in the matrix \left(t_i^j\right).

So what is the rank of T? It’s the maximum number of linearly independent columns in the matrix of T. This quantity we will call the “column rank” of the matrix.

July 1, 2008 Posted by | Algebra, Linear Algebra | 2 Comments