The Unapologetic Mathematician

Mathematics for the interested outsider

Blath Captchas

Every so often people start talking about captchas. There was a bubble of them last year when all the new blatherers came out of the woodwork. Well, today I ran across one perfect for our community’s needs (click thumbnail to embiggen):

and another one:

May 30, 2008 Posted by John Armstrong | Uncategorized | | 6 Comments

Matrix notation

I just spent all day on the road back to NOLA to handle some end-of-month business, clean out my office, and so on. This one will have to do for today and tomorrow.

It gets annoying to write out matrices using the embedded LaTeX here, but I suppose I really should, just for thoroughness’ sake.

In general, a matrix is a collection of field elements with an upper and a lower index. We can write out all these elements in a rectangular array. The upper index should list the rows of our array, while the lower index should list the columns. The matrix \left(t_i^j\right) with entries t_i^j for i running from {1} to m and n running from {1} to n is written out in full as

\displaystyle\begin{pmatrix}t_1^1&t_2^1&\cdots&t_m^1\\t_1^2&t_2^2&\cdots&t_m^2\\\vdots&\vdots&\ddots&\vdots\\t_1^n&t_2^n&\cdots&t_m^n\end{pmatrix}

We call this an n\times m matrix, because the array is n rows high and m columns wide.

There is a natural isomorphism V\cong\hom(\mathbb{F},V). This means that every vector in dimension m, written out in the components relative to a given basis, can be seen as an m\times1 “column vector”:

\displaystyle\begin{pmatrix}v^1\\v^2\\\vdots\\v^m\end{pmatrix}

Similarly, a linear functional on an n-dimensional space can be written as a 1\times n “row vector”:

\displaystyle\begin{pmatrix}\mu_1&\mu_2&\cdots&\mu_n\end{pmatrix}

Notice that evaluation of linear transformations is now just a special case of matrix multiplication! Let’s practice by writing out the composition of a linear functional \mu\in\left(\mathbb{F}^n\right)^*, a linear map T:\mathbb{F}^m\rightarrow\mathbb{F}^n, and a vector v\in\mathbb{F}^m.

\displaystyle\begin{pmatrix}\mu_1&\mu_2&\cdots&\mu_n\end{pmatrix}\begin{pmatrix}t_1^1&t_2^1&\cdots&t_m^1\\t_1^2&t_2^2&\cdots&t_m^2\\\vdots&\vdots&\ddots&\vdots\\t_1^n&t_2^n&\cdots&t_m^n\end{pmatrix}\begin{pmatrix}v^1\\v^2\\\vdots\\v^m\end{pmatrix}

A matrix product makes sense if and only if the number of columns in the left-hand matrix is the same as the number of rows in the right-hand matrix. That is, an m\times n and an n\times p can be multiplied. The result will be an m\times p matrix. We calculate it by taking a row from the left-hand matrix and a column from the right-hand matrix. Since these are the same length (by assumption) we can multiply corresponding elements and sum up.

In the example above, the n\times m matrix \left(t_i^j\right) and the m\times1 matrix \left(v^i\right) can be multiplied. There is only one column in the latter to pick, so we simply choose row j out of n on the left: \begin{pmatrix}t_1^j&t_2^j&\cdots&t_m^j\end{pmatrix}. Multiplying corresponding elements and summing gives the single field element t_i^jv^i (remember the summation convention). We get n of these elements — one for each row — and we arrange them in a new n\times1 matrix:

\displaystyle\begin{pmatrix}t_1^1&t_2^1&\cdots&t_m^1\\t_1^2&t_2^2&\cdots&t_m^2\\\vdots&\vdots&\ddots&\vdots\\t_1^n&t_2^n&\cdots&t_m^n\end{pmatrix}\begin{pmatrix}v^1\\v^2\\\vdots\\v^m\end{pmatrix}=\begin{pmatrix}t_i^1v^i\\t_i^2v^i\\\vdots\\t_i^nv^i\end{pmatrix}

Then we can multiply the row vector \begin{pmatrix}\mu_1&\mu_2&\cdots&\mu_n\end{pmatrix} by this column vector to get the 1\times1 matrix:

\displaystyle\begin{pmatrix}\mu_1&\mu_2&\cdots&\mu_n\end{pmatrix}\begin{pmatrix}t_1^1&t_2^1&\cdots&t_m^1\\t_1^2&t_2^2&\cdots&t_m^2\\\vdots&\vdots&\ddots&\vdots\\t_1^n&t_2^n&\cdots&t_m^n\end{pmatrix}\begin{pmatrix}v^1\\v^2\\\vdots\\v^m\end{pmatrix}=\begin{pmatrix}\mu_jt_i^jv^i\end{pmatrix}

Just like we slip back and forth between vectors and m\times1 matrices, we will usually consider a field element and the 1\times1 matrix with that single entry as being pretty much the same thing.

The first multiplication here turned an m-dimensional (column) vector into an n-dimensional one, reflecting the source and target of the transformation T. Then we evaluated the linear functional \mu on the resulting vector. But by the associativity of matrix multiplication we could have first multiplied on the left:

\displaystyle\begin{pmatrix}\mu_1&\mu_2&\cdots&\mu_n\end{pmatrix}\begin{pmatrix}t_1^1&t_2^1&\cdots&t_m^1\\t_1^2&t_2^2&\cdots&t_m^2\\\vdots&\vdots&\ddots&\vdots\\t_1^n&t_2^n&\cdots&t_m^n\end{pmatrix}=\begin{pmatrix}\mu_jt_1^j&\mu_jt_2^j&\cdots&\mu_jt_m^j\end{pmatrix}

turning the linear functional on \mathbb{F}^n into one on \mathbb{F}^m. But this is just the dual transformation T^*:\left(\mathbb{F}^n\right)^*\rightarrow\left(\mathbb{F}^m\right)^*! Then we can evaluate this on the column vector to get the same result: \mu_jt_i^jv^j.

There is one slightly touchy thing we need to be careful about: Kronecker products. When the upper index is a pair (i_1,i_2) with 1\leq i_1\leq n_1 and 1\leq i_2\leq n_2 we have to pick an order on the set of such pairs. We’ll always use the “lexicographic” order. That is, we start with (1,1), then (1,2), and so on until (1,n_2) before starting over with (2,1), (2,2), and so on. Let’s write out a couple examples just to be clear:

\displaystyle\begin{pmatrix}s_1^1&s_2^1\\s_1^2&s_2^2\end{pmatrix}\boxtimes\begin{pmatrix}t_1^1&t_2^1&t_3^1\\t_1^2&t_2^2&t_3^2\end{pmatrix}=\begin{pmatrix}s_1^1t_1^1&s_1^1t_2^1&s_1^1t_3^1&s_2^1t_1^1&s_2^1t_2^1&s_2^1t_3^1\\s_1^1t_1^2&s_1^1t_2^2&s_1^1t_3^2&s_2^1t_1^2&s_2^1t_2^2&s_2^1t_3^2\\s_1^2t_1^1&s_1^2t_2^1&s_1^2t_3^1&s_2^2t_1^1&s_2^2t_2^1&s_2^2t_3^1\\s_1^2t_1^2&s_1^2t_2^2&s_1^2t_3^2&s_2^2t_1^2&s_2^2t_2^2&s_2^2t_3^2\end{pmatrix}

\displaystyle\begin{pmatrix}t_1^1&t_2^1&t_3^1\\t_1^2&t_2^2&t_3^2\end{pmatrix}\boxtimes\begin{pmatrix}s_1^1&s_2^1\\s_1^2&s_2^2\end{pmatrix}=\begin{pmatrix}t_1^1s_1^1&t_1^1s_2^1&t_2^1s_1^1&t_2^1s_2^1&t_3^1s_1^1&t_3^1s_2^1\\t_1^1s_1^2&t_1^1s_2^2&t_2^1s_1^2&t_2^1s_2^2&t_3^1s_1^2&t_3^1s_2^2\\t_1^2s_1^1&t_1^2s_2^1&t_2^2s_1^1&t_2^2s_2^1&t_3^2s_1^1&t_3^2s_2^1\\t_1^2s_1^2&t_1^2s_2^2&t_2^2s_1^2&t_2^2s_2^2&t_3^2s_1^2&t_3^2s_2^2\end{pmatrix}

So the Kronecker product depends on the order of multiplication. But this dependence is somewhat illusory. The only real difference is reordering the bases we use for the tensor products of the vector spaces involved, and so a change of basis can turn one into the other. This is an example of how matrices can carry artifacts of our choice of bases.

May 30, 2008 Posted by John Armstrong | Algebra, Linear Algebra | | 7 Comments

Matrices IV

Like we saw with the tensor product of vector spaces, the dual space construction turns out to be a functor. In fact, it’s a contravariant functor. That is, if we have a linear transformation T:U\rightarrow V we get a linear transformation T^*:V^*\rightarrow U^*. As usual, we ask what this looks like for matrices.

First, how do we define the dual transformation? It turns out this is the contravariant functor represented by \mathbb{F}. That is, if \mu:V\rightarrow\mathbb{F} is a linear functional, we define T^*(\mu)=\mu\circ T:U\rightarrow\mathbb{F}. In terms of the action on vectors, \left[T^*(\mu)\right](v)=\mu(T(v))

Now let’s assume that U and V are finite-dimensional, and pick bases \left\{e_i\right\} and \left\{f_k\right\} for U and V, respectively. Then the linear transformation T has matrix coefficients t_i^k. We also get the dual bases \left\{\epsilon^j\right\} of U^* and \left\{\phi^l\right\} of V^*.

Given a basic linear functional \phi^l on V, we want to write T^*(\phi^l) in terms of the \epsilon^j. So let’s evaluate it on a generic basis vector e_i and see what we get. The formula above shows us that

\left[T^*(\phi^l)\right](e_i)=\phi^l(T(e_i))=\phi^l(t_i^kf_k)=t_i^k\delta_k^l=t_i^l

In other words, we can write T^*(\phi^l)=t_j^l\epsilon^j. The same matrix works, but we use its indices differently.

In general, given a linear functional \mu with coefficients \mu_l we find the coefficients of T^*(\mu) as t_j^l\mu_l. The value \left[T^*(\mu)\right](v)=\mu(T(u)) becomes \mu_lt_i^lu^i. Notice that the summation convention tells us this must be a scalar (as we expect) because there are no unpaired indices. Also notice that because we can use the same matrix for two different transformations we seem to have an ambiguity: is the lower index running over a basis for U or one for U^*? Luckily, since every basis gives rise to a dual basis, we don’t need to care. Both spaces have the same dimension anyhow.

May 28, 2008 Posted by John Armstrong | Algebra, Linear Algebra | | 1 Comment

Dual Spaces

Another thing vector spaces come with is duals. That is, given a vector space V we have the dual vector space V^*=\hom(V,\mathbb{F}) of “linear functionals” on V — linear functions from V to the base field \mathbb{F}. Again we ask how this looks in terms of bases.

So let’s take a finite-dimensional vector space V with basis \left\{e_i\right\}, and consider some linear functional \mu\in V^*. Like any linear function, we can write down matrix coefficients \mu_i=\mu(e_i). Notice that since our target space (the base field \mathbb{F}) is only one-dimensional, we don’t need another index to count its basis.

Now let’s consider a specially-crafted linear functional. We can define one however we like on the basis vectors e_i and then let linearity handle the rest. So let’s say our functional takes the value {1} on e_1 and the value {0} on every other basis element. We’ll call this linear functional \epsilon^1. Notice that on any vector we have

\epsilon^1(v)=\epsilon^1(v^ie_i)=v^i\epsilon^1(e_i)=v^1

so it returns the coefficient of e_1. There’s nothing special about e_1 here, though. We can define functionals \epsilon^j by setting \epsilon^j(e_i)=\delta_i^j. This is the “Kronecker delta”, and it has the value {1} when its two indices match, and {0} when they don’t.

Now given a linear functional \mu with matrix coefficients \mu_j, let’s write out a new linear functional \mu_j\epsilon^j. What does this do to basis elements?

\mu_j\epsilon^j(e_i)=\mu_j\delta_i^j=\mu_i

so this new transformation has exactly the same matrix as \mu does. It must be the same transformation! So any linear functional can be written uniquely as a linear combination of the \epsilon^j, and thus they form a basis for the dual space. We call \left\{\epsilon^j\right\} the “dual basis” to \left\{e_i\right\}.

Now if we take a generic linear functional \mu and evaluate it on a generic vector v we find

\mu(v)=\mu_j\epsilon^j(v^ie_i)=\mu_jv^i\epsilon^j(e_i)=\mu_jv^i\delta_i^j=\mu_iv^i

Once we pick a basis for V we immediately get a basis for V^*, and evaluation of a linear functional on a vector looks neat in terms of these bases.

May 27, 2008 Posted by John Armstrong | Algebra, Linear Algebra | | 6 Comments

Matrices III

Given two finite-dimensional vector spaces U and V, with bases \left\{e_i\right\} and \left\{f_j\right\} respectively, we know how to build a tensor product: use the basis \left\{e_i\otimes f_j\right\}.

But an important thing about the tensor product is that it’s a functor. That is, if we have linear transformations S:U\rightarrow U' and T:V\rightarrow V', then we get a linear transformation S\otimes T:U\otimes V\rightarrow U'\otimes V'. So what does this operation look like in terms of matrices?

First we have to remember exactly how we get the tensor product S\otimes T. Clearly we can consider the function S\times T:U\times V\rightarrow U'\times V'. Then we can compose with the bilinear function U'\times V'\rightarrow U'\otimes V' to get a bilinear function from U\times V to U'\otimes V'. By the universal property, this must factor uniquely through a linear function U\otimes V\rightarrow U'\otimes V'. It is this map we call S\otimes T.

We have to pick bases \left\{e_k'\right\} of U' and \left\{f_l'\right\} of V'. This gives us a matrix coefficients s_i^k for S and t_j^l for T. To calculate the matrix for S\otimes T we have to evaluate it on the basis elements e_i\otimes f_j of U\otimes V. By definition we find:

\left[S\otimes T\right](e_i\otimes f_j)=S(e_i)\otimes T(f_j)=\left(s_i^ke_k'\right)\otimes\left(t_j^lf_l'\right)=s_i^kt_j^le_k'\otimes f_l'

that is, the matrix coefficient between the index pair (i,j) and the index pair (k,l) is s_i^kt_j^l.

It’s not often taught anymore, but there is a name for this operation: the Kronecker product. If we write the matrices (as opposed to just their coefficients) \left(s_i^k\right) and \left(t_j^l\right), then we write the Kronecker product \left(s_i^k\right)\boxtimes\left(t_j^l\right)=\left(s_i^kt_j^l\right).

May 26, 2008 Posted by John Armstrong | Algebra, Linear Algebra | | 4 Comments

Sunday Samples 70

Today some people I know are getting together and heading out to a firing range out in the boondocks of Maryland. And so I think I’ll pull out Moby’s cover of “That’s When I Reach For My Revolver”. Yes, I know the original was by Mission of Burma, so if you want a more political spin today you can go listen to that version.
Read more »

May 25, 2008 Posted by John Armstrong | Sunday Samples | | 1 Comment

Go Speed Go!

I know I don’t usually do reviews here, but I trekked out to the Udvar-Hazy annex of the Air & Space Museum last night (all the way in Chantilly) to see Speed Racer on IMAX. Go see it, but only if you can find it near you in IMAX. It’s just not worth it for a regular screen.

The quote of the night:
“I should have worn my glasses. All I saw was bright colors, blurs, and I couldn’t tell what was going on.”
“No, that’s what it was supposed to look like.”

May 24, 2008 Posted by John Armstrong | Uncategorized | | No Comments

Tensor Products and Bases

Since we’re looking at vector spaces, which are special kinds of modules, we know that \mathbf{Vec}(\mathbb{F}) has a tensor product structure. Let’s see what this means when we pick bases.

First off, let’s remember what the tensor product of two vector spaces U and V is. It’s a new vector space U\otimes V and a bilinear (linear in each of two variables separately) function B:U\times V\rightarrow U\otimes V satisfying a certain universal property. Specifically, if F:U\times V\rightarrow W is any bilinear function it must factor uniquely through B as F=\bar{F}\circ B. The catch here is that when we say “linear” and “bilinear” we mean that the functions preserve both addition and scalar multiplication. As with any other universal property, such a tensor product will be uniquely defined up to isomorphism.

So let’s take finite-dimensional vector spaces U and V, and bases \left\{e_i\right\} of U and \left\{f_j\right\} of V. I say that the vector space with basis \left\{e_i\otimes f_j\right\}, and with the bilinear function B(e_i,f_j)=e_i\otimes f_j is a tensor product. Here the expression e_i\otimes f_j is just a name for a basis element of the new vector space. Such elements are indexed by the set of pairs (i,j), where i indexes a basis for U and j indexes a basis for V.

First off, what do I mean by the bilinear function B(e_i,f_j)=e_i\otimes f_j? Just as for linear functions, we can define bilinear functions by defining them on bases. That is, if we have u=u^ie_i\in U and v=v^jf_j\in V, we get the vector

B(u,v)=B(u^ie_i,v^je_j)=u^iv^jB(e_i,f_j)=u^iv^je_i\otimes f_j

in our new vector space, with coefficients u^iv^j.

So let’s take a bilinear function F and define a linear function \bar{F} by setting

\bar{F}(e_i\otimes f_j)=F(e_i,f_j)

We can easily check that F does indeed factor as desired, since

\bar{F}(B(e_i,f_j))=\bar{F}(e_i\otimes f_j)=F(e_i,f_j)

so F=\bar{F}\circ B on basis elements. By linearity, they must agree for all pairs (u,v). It should also be clear that we can’t define \bar{F} any other way and hope to satisfy this equation, so the factorization is unique.

Thus if we have bases \left\{e_i\right\} of U and \left\{f_j\right\} of V, we immediately get a basis \left\{e_i\otimes f_j\right\} of U\otimes V. As a side note, we immediately see that the dimension of the tensor product of two vector spaces is the product of their dimensions.

May 23, 2008 Posted by John Armstrong | Algebra, Linear Algebra | | 2 Comments

Accomplishment?

To show that my underemployment isn’t wasted:
Not Anymore!
This used to be a tree.

Insert your own joke about runaway pruning algorithms here.

May 22, 2008 Posted by John Armstrong | Uncategorized | | No Comments

Matrices II

With the summation convention firmly in hand, we continue our discussion of matrices.

We’ve said before that the category of vector space is enriched over itself. That is, if we have vector spaces U and V over the field \mathbb{F}, the set of linear transformations \hom(U,V) is itself a vector space over \mathbb{F}. In fact, it inherits this structure from the one on V. We define the sum and the scalar product

\left[S+T\right](u)=S(u)+T(u)
\left[cT\right](u)=cT(u)

for linear transformations S and T from U to V, and for a constant c\in\mathbb{F}. Verifying that these are also linear transformations is straightforward.

So what do these structures look like in the language of matrices? If U and V are finite-dimensional, let’s pick bases \left\{e_i\right\} of U and \left\{f_j\right\} of V. Now we get matrix coefficients s_i^j and t_i^j, where i indexes the basis of U and j indexes the basis of V. Now we can calculate the matrices of the sum and scalar product above.

We do this, as usual, by calculating the value the transformations take at each basis element. First, the sum:

\left[S+T\right](e_i)=S(e_i)+T(e_i)=s_i^jf_j+t_i^jf_j=(s_i^j+t_i^j)f_j

and now the scalar product:

\left[cT\right](e_i)=cT(e_i)=(ct_i^j)f_j

so we calculate the matrix coefficients of the sum of two linear transformations by adding the corresponding matrix coefficients of each transformation, and the matrix coefficients of the scalar product by multiplying each coefficient by the same scalar.

May 22, 2008 Posted by John Armstrong | Algebra, Linear Algebra | | 1 Comment