The Unapologetic Mathematician

Mathematics for the interested outsider

Duality terminology

I should have mentioned this before, but usually dual notions are marked by the prefix “co-”. As an example, we have “well-powered” and “co-well-powered” categories.

Another example: We know through Erdős that “a mathematician is a device for turning coffee into theorems”. It thus follows by duality that a comathematician is a device for turning cotheorems into ffee.

May 31, 2007 Posted by | Category theory | 3 Comments

The Opposite Category

One of the most interesting general facts about categories is how ubiquitous the notion of duality is. Pretty much everything has a “mirror image”, and the mirror is the opposite category.

Given a category \mathcal{C}, with objects {\rm Ob}(\mathcal{C}) and morphisms {\rm Mor}(\mathcal{C}), we can construct the “opposite category” \mathcal{C}^{\rm op} simply enough. In fact, it has the exact same objects and morphisms as \mathcal{C} does. The difference comes in how they relate to each other.

Remember that we had two functions, s(m) and t(m) assigning the “source” and “target” objects to any arrow. To get the opposite category we just swap them. Given a morphism, its source in \mathcal{C}^{\rm op} is its target in \mathcal{C}, and vice versa. Of course, now we have to swap the order of composition. If we have f:A\rightarrow B and g:B\rightarrow C in \mathcal{C}, then we get f:B\rightarrow A and g:C\rightarrow B in \mathcal{C}^{\rm op}. In \mathcal{C} the composition g\circ f:A\rightarrow C is defined, but in \mathcal{C}^{\rm op} the composition f\circ g:C\rightarrow A is defined.
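
To make the bookkeeping concrete, here is a toy sketch in Python (my own illustration, not part of the original discussion), modeling a morphism as nothing but a labeled source/target pair and showing how the composable pairs reverse:

    # A morphism is modeled as a (source, target, label) triple.
    # Composition in C requires t(f) = s(g) and runs "right to left".
    def compose(g, f):
        assert f[1] == g[0], "need t(f) = s(g)"
        return (f[0], g[1], g[2] + "." + f[2])

    # Passing to C^op just swaps source and target on every arrow.
    def op(f):
        return (f[1], f[0], f[2])

    f = ("A", "B", "f")
    g = ("B", "C", "g")

    print(compose(g, f))          # ('A', 'C', 'g.f'): g.f is defined in C
    print(compose(op(f), op(g)))  # ('C', 'A', 'f.g'): f.g is defined in C^op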

Most definitions we give will automatically come with a “dual” definition, which we get by reversing all the arrows like this. For example, monos and epis are dual notions, as are subobjects and quotient objects. Just write down one definition in terms of morphisms, reverse all the morphisms (and the order of composition), and you get the other.

Theorems work this way too. If you dualize the hypotheses and the conclusions, then you can dualize each step of the proof to prove the dual theorem. I can prove that any injection is monic, so it follows immediately by duality that any surjection is epic. Many texts actually completely omit even the statements of dual notions and theorems once they define the opposite category, but I’ll try to be explicit about the duals (though of course I won’t need to give the proofs).

Another place duality comes up is in defining “contravariant” functors. This is just a functor F:\mathcal{C}^{\rm op}\rightarrow\mathcal{D}. It sends each object of \mathcal{C} to an object of \mathcal{D}, and sends a morphism f:A\rightarrow B in \mathcal{C} to a morphism F(f):F(B)\rightarrow F(A) in \mathcal{D}. See how the direction of the image morphism flipped? Early on, contravariant and regular (“covariant”) functors were treated somewhat separately, but really they’re just the same thing once you take the opposite category into account. Sometimes, though, it’s easier to think in terms of contravariant functors rather than mentally flipping all the arrows.

I’ll close with an example of a contravariant functor we’ve seen before. Consider a ring R with unit and a left module M over R. That is, M is an object in the category R-{\bf mod}. We can construct the dual module M^*, which is now an object in the category {\bf mod}-R of right R-modules. I say that this is a contravariant functor. We’ve specified how the dual module construction behaves on objects, but we need to see how it behaves on morphisms. This is what makes it functorial.

So let’s say we have two left R-modules M and N, and an R-module homomorphism f:M\rightarrow N. Since we want this to be a contravariant functor we need to find a morphism f^*:N^*\rightarrow M^*. But notice that M^*=\hom_{R-{\bf mod}}(M,R), and similarly for N. Then we have the composition of R-module homomorphisms \circ:\hom_{R-{\bf mod}}(N,R)\otimes\hom_{R-{\bf mod}}(M,N)\rightarrow\hom_{R-{\bf mod}}(M,R). If \nu is a linear functional on N, then we get \nu\circ f:M\rightarrow R as a linear functional on M. We can define f^*(\nu)=\nu\circ f.
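
In code, the dual map is nothing but precomposition. Here's a minimal Python sketch of the idea (my own illustration, taking R=\mathbb{Z} and writing homomorphisms and functionals as plain functions on tuples):

    # f: M -> N is a module homomorphism; nu: N -> Z is a functional on N.
    # The dual map f^* sends nu to nu o f, which is a functional on M.
    def dual(f):
        return lambda nu: (lambda m: nu(f(m)))

    f = lambda m: (m[0], m[1], m[0] + m[1])   # f: Z^2 -> Z^3
    nu = lambda n: 2 * n[0] - n[2]            # nu in (Z^3)^*

    f_star_nu = dual(f)(nu)                   # f^*(nu) in (Z^2)^*
    print(f_star_nu((1, 5)))                  # nu(f(1, 5)) = nu(1, 5, 6) = -4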

Now, is this construction functorial? We have to check that it preserves identities and compositions. For identities it’s simple: 1_N^*(\nu)=\nu\circ1_N=\nu, so every linear functional on N gets sent back to itself. For compositions we have to be careful. The order has to switch around because this is a contravariant functor. We take f:M\rightarrow N and g:N\rightarrow L and check (g\circ f)^*(\lambda)=\lambda\circ g\circ f=g^*(\lambda)\circ f=f^*(g^*(\lambda))=(f^*\circ g^*)(\lambda), as it should.
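
We can spot-check that reversal with the same precomposition helper (again just an illustrative sketch, not the only way to set it up):

    def dual(f):
        return lambda nu: (lambda m: nu(f(m)))

    compose = lambda g, f: (lambda x: g(f(x)))

    f = lambda m: (m[0], m[1], m[0] + m[1])   # f: Z^2 -> Z^3
    g = lambda n: (n[0] + n[1] + n[2],)       # g: Z^3 -> Z^1
    lam = lambda l: 7 * l[0]                  # a functional on Z^1

    lhs = dual(compose(g, f))(lam)            # (g o f)^*(lam)
    rhs = compose(dual(f), dual(g))(lam)      # (f^* o g^*)(lam)
    print(lhs((1, 5)) == rhs((1, 5)))         # True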

May 31, 2007 Posted by | Category theory | 7 Comments

Solution to the ARML Scrimmage Power Question

Today I’ll give my own solution to the power question I posted last week. I’m going to present it as I always wished I could have: as a whole with answers to the specific questions spun off as they naturally arise, rather than separated part by part.

First I’ll restate the setup in more natural language. This is really a question about certain endomorphisms of vector spaces over the field \mathbb{Z}_2, the quotient of the ring of integers by the maximal ideal generated by 2. If you haven’t seen it before, you should be able to construct it from what I’ve said so far.

Anyhow, the “arrangements” are just vectors in the vector space \mathbb{Z}_2^n. Such a space comes with a basis \{e_i\}_{i=1}^n, where e_i has a 1 in place i and 0 elsewhere. Each of these spaces of course has the identity endomorphism I_n, and it also has a “left rotation” endomorphism L_n sending e_i to e_{i-1} and e_1 to e_n, and its inverse R_n — right rotation. The transformation we are concerned with is T_n=I_n+L_n.

Since we are given bases we are justified in writing out matrices for transformations, and even for vectors (because V\cong\hom(\mathbb{Z}_2,V)). The transformation T_n has the matrix (t_{i,j}) where t_{i,j} is 1 if j=i or j=i+1 (taking indices mod n, so that t_{n,1}=1) and 0 otherwise. A vector of length n will be an n\times1 matrix.
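
Since everything that follows is a finite computation, it may help to have the operators in executable form. Here's a minimal Python sketch (my own; arrangements are bit-lists indexed from 0 rather than 1, and the helper names L, T, and T_matrix are mine):

    def L(a):
        # left rotation e_i -> e_{i-1}: coordinate j picks up a[j+1], wrapping
        return a[1:] + a[:1]

    def T(a):
        # T_n = I_n + L_n over Z_2: entry j becomes (a[j] + a[j+1]) mod 2
        return [(x + y) % 2 for x, y in zip(a, L(a))]

    def T_matrix(n):
        # t[i][j] = 1 if j == i or j == i+1 (mod n), as in the text
        return [[1 if j in (i, (i + 1) % n) else 0 for j in range(n)]
                for i in range(n)]

    print(T([1, 0, 1, 1]))   # [1, 1, 0, 0]
    print(T_matrix(3))       # [[1, 1, 0], [0, 1, 1], [1, 0, 1]]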

Numbers 1 and 2 are just calculations, so I’ll omit them.

We can write T_2 as the matrix \begin{pmatrix}1&1\\1&1\end{pmatrix}, so T_2^2 has matrix \begin{pmatrix}1&1\\1&1\end{pmatrix}\begin{pmatrix}1&1\\1&1\end{pmatrix}=\begin{pmatrix}0&0\\ 0&0\end{pmatrix}, which sends every vector to the zero vector. This is part 4.
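
Checking all four arrangements of length 2 with the helpers above:

    def L(a): return a[1:] + a[:1]
    def T(a): return [(x + y) % 2 for x, y in zip(a, L(a))]

    for a in ([0, 0], [0, 1], [1, 0], [1, 1]):
        print(a, "->", T(a), "->", T(T(a)))   # the last column is always [0, 0]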

Now one interesting property we can see straight off. We can tell whether there are an even or an odd number of ones in a given arrangement by adding up all the entries. That is, we can take the product with the 1\times n matrix p consisting of all ones. If we first transform an arrangement a, then measure the number of ones, this is like taking the product pT_na. But each column of the matrix for T_n has exactly two ones in it, so the product pT_n consists of all zeroes, and thus pT_na is always zero. That is, the image of any arrangement after a transformation always has an even number of ones. That’s number 10.
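
A brute-force check of this parity invariant for, say, n=5 (same helpers as before):

    import itertools

    def L(a): return a[1:] + a[:1]
    def T(a): return [(x + y) % 2 for x, y in zip(a, L(a))]

    # Number 10: every arrangement in the image of T_5 has evenly many ones.
    print(all(sum(T(list(a))) % 2 == 0
              for a in itertools.product([0, 1], repeat=5)))   # True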

What arrangements are fixed by the transformation? This amounts to solving the equation
a=T_na=I_na+L_na=a+L_na
so we must have 0=L_na, and applying R_n to both sides gives a=R_n0=0. The zero vector is the only fixed point.
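
Enumerating fixed points for a small case, say n=5, agrees:

    import itertools

    def L(a): return a[1:] + a[:1]
    def T(a): return [(x + y) % 2 for x, y in zip(a, L(a))]

    fixed = [a for a in itertools.product([0, 1], repeat=5)
             if T(list(a)) == list(a)]
    print(fixed)   # [(0, 0, 0, 0, 0)]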

What vectors get sent to this fixed point? This is the kernel of the transformation — the vectors a such that 0=a+L_na. Equivalently, these satisfy a=L_na (why?). Thus all the entries must be the same, and the vector must consist of all zeroes or all ones. That’s number 9.
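
And enumerating the kernel for n=6 confirms it:

    import itertools

    def L(a): return a[1:] + a[:1]
    def T(a): return [(x + y) % 2 for x, y in zip(a, L(a))]

    kernel = [a for a in itertools.product([0, 1], repeat=6)
              if all(v == 0 for v in T(list(a)))]
    print(kernel)   # [(0, 0, 0, 0, 0, 0), (1, 1, 1, 1, 1, 1)]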

Now we see that if T_n^ka=0 then T_n^{k-1}a is in the kernel of T_n, and is thus either all zeroes or all ones. But if n is odd, the vector consisting of all ones has an odd number of ones, and so by number 10 it is not in the image of T_n. Thus if we don’t start with a vector in the kernel we’ll never land in the kernel, and we’ll never get to the vector of all zeroes. That’s number 11. As a special case we have number 3.

Let’s consider T_n^2. We expand this as (I_n+L_n)(I_n+L_n)=I_n+L_n+L_n+L_n^2=I_n+L_n^2, since we’re working over \mathbb{Z}_2. Similarly, if we square this we get I_n+L_n^4. In fact, we have that T_n^{2^k}=I_n+L_n^{2^k}. Indeed, this is true by definition for k=0, and if it’s true for k then
T_n^{2^{k+1}}=(T_n^{2^k})^2=(I_n+L_n^{2^k})^2=I_n+(L_n^{2^k})^2=I_n+L_n^{2^{k+1}}
so the claim is true by induction.
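
The identity is easy to verify exhaustively in a small case, say n=6 and k=2 (using the helpers from before, plus hypothetical iterate and rot utilities of my own):

    import itertools

    def L(a): return a[1:] + a[:1]
    def T(a): return [(x + y) % 2 for x, y in zip(a, L(a))]

    def iterate(f, k, a):
        for _ in range(k):
            a = f(a)
        return a

    def rot(a, s):
        # L^s: rotate left by s places
        s %= len(a)
        return a[s:] + a[:s]

    # T_6^4 = I_6 + L_6^4 on every arrangement of length 6
    print(all(iterate(T, 4, list(a)) ==
              [(x + y) % 2 for x, y in zip(a, rot(list(a), 4))]
              for a in itertools.product([0, 1], repeat=6)))   # True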

This means that after 2^k transformations a vector is sent to the sum (in \mathbb{Z}_2) of itself with its left-rotation by 2^k places. Thus if 2^k|n we can divide the entries in a into 2^k vectors of length \frac{n}{2^k} each — just pick the entries spaced out by separations of 2^k. Then T_n^{2^k} acts on a the same way T_{\frac{n}{2^k}} acts on each of the subvectors, since the shift by 2^k places sends each subvector to itself, rotated one place. That’s number 7. Parts 5 and 6 are now special cases.
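
For instance, with n=6 and 2^k=2, the even- and odd-indexed entries evolve independently under T_3:

    def L(a): return a[1:] + a[:1]
    def T(a): return [(x + y) % 2 for x, y in zip(a, L(a))]

    a = [1, 0, 0, 1, 1, 0]
    b = T(T(a))                    # T_6^2 applied to a
    print(b[0::2] == T(a[0::2]))   # True: even-indexed subvector evolves by T_3
    print(b[1::2] == T(a[1::2]))   # True: so does the odd-indexed subvector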

Also, if n=2^k then L_n^{2^k}=I_n, so T_n^{2^k}=0, so 2^k transformations send every vector of length 2^k to zero. That’s number 8.
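
Again this can be checked exhaustively, say for n=8:

    import itertools

    def L(a): return a[1:] + a[:1]
    def T(a): return [(x + y) % 2 for x, y in zip(a, L(a))]

    def iterate(f, k, a):
        for _ in range(k):
            a = f(a)
        return a

    # 8 transformations send every arrangement of length 8 to zero
    print(all(iterate(T, 8, list(a)) == [0] * 8
              for a in itertools.product([0, 1], repeat=8)))   # True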

Finally, let n be any even number that is not a power of 2, and write n=2^km with m=\frac{n}{2^k} odd (and greater than 1). Now the result of T_n^{q2^k}a is the same as applying T_m^q to each of the 2^k subvectors as described above. But now each subvector has odd length. If a has exactly one 1 in it, then one of these subvectors must contain it, and that subvector is neither all zeroes nor all ones. By number 11 this subvector can never be sent to zero, so T_n^{q2^k}a is never zero. And if T_n^ja were ever zero then T_n^{q2^k}a would be zero for a large enough q, which will never happen. That’s part 12.
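
Since the state space is finite we can also just watch one orbit cycle without ever reaching zero, say for n=6 starting from a lone 1:

    def L(a): return a[1:] + a[:1]
    def T(a): return [(x + y) % 2 for x, y in zip(a, L(a))]

    a, seen = (1, 0, 0, 0, 0, 0), set()
    while a not in seen:            # run until the (finite) orbit repeats
        seen.add(a)
        a = tuple(T(list(a)))
    print((0,) * 6 in seen)         # False: the zero arrangement never appears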

May 31, 2007 Posted by | Uncategorized | 2 Comments