# The Unapologetic Mathematician

## Matrix notation

I just spent all day on the road back to NOLA to handle some end-of-month business, clean out my office, and so on. This one will have to do for today and tomorrow.

It gets annoying to write out matrices using the embedded LaTeX here, but I suppose I really should, just for thoroughness’ sake.

In general, a matrix is a collection of field elements with an upper and a lower index. We can write out all these elements in a rectangular array. The upper index labels the rows of our array, while the lower index labels the columns. The matrix $\left(t_i^j\right)$ with entries $t_i^j$ for $i$ running from $1$ to $m$ and $j$ running from $1$ to $n$ is written out in full as

$\displaystyle\begin{pmatrix}t_1^1&t_2^1&\cdots&t_m^1\\t_1^2&t_2^2&\cdots&t_m^2\\\vdots&\vdots&\ddots&\vdots\\t_1^n&t_2^n&\cdots&t_m^n\end{pmatrix}$

We call this an $n\times m$ matrix, because the array is $n$ rows high and $m$ columns wide.
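To make the index convention concrete, here is a minimal sketch in Python. The helper name `matrix_from_entries` and the list-of-rows storage are assumptions made for illustration, not anything fixed by the notation itself. It lays out the entries $t_i^j$ so that the upper index picks the row and the lower index picks the column.

```python
def matrix_from_entries(entry, m, n):
    """Lay out entries t_i^j as an n x m array, stored as a list of rows.

    `entry(i, j)` returns t_i^j, where the lower index i runs over
    1..m (columns) and the upper index j runs over 1..n (rows).
    (Hypothetical helper, named here only for illustration.)
    """
    return [[entry(i, j) for i in range(1, m + 1)] for j in range(1, n + 1)]


# Use symbolic labels so the layout is visible: m = 3 columns, n = 2 rows.
labels = matrix_from_entries(lambda i, j: f"t_{i}^{j}", m=3, n=2)
for row in labels:
    print(" ".join(row))
```

Each printed line is one row of the array, so the upper index stays constant along a line while the lower index increases.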

There is a natural isomorphism $V\cong\hom(\mathbb{F},V)$, sending a vector $v$ to the linear map that takes $1\in\mathbb{F}$ to $v$. This means that every vector in an $m$-dimensional space, written out in components relative to a given basis, can be seen as an $m\times1$ “column vector”:

$\displaystyle\begin{pmatrix}v^1\\v^2\\\vdots\\v^m\end{pmatrix}$

Similarly, a linear functional on an $n$-dimensional space can be written as a $1\times n$ “row vector”:

$\displaystyle\begin{pmatrix}\mu_1&\mu_2&\cdots&\mu_n\end{pmatrix}$

Notice that evaluation of linear transformations is now just a special case of matrix multiplication! Let’s practice by writing out the composition of a linear functional $\mu\in\left(\mathbb{F}^n\right)^*$, a linear map $T:\mathbb{F}^m\rightarrow\mathbb{F}^n$, and a vector $v\in\mathbb{F}^m$.

$\displaystyle\begin{pmatrix}\mu_1&\mu_2&\cdots&\mu_n\end{pmatrix}\begin{pmatrix}t_1^1&t_2^1&\cdots&t_m^1\\t_1^2&t_2^2&\cdots&t_m^2\\\vdots&\vdots&\ddots&\vdots\\t_1^n&t_2^n&\cdots&t_m^n\end{pmatrix}\begin{pmatrix}v^1\\v^2\\\vdots\\v^m\end{pmatrix}$

A matrix product makes sense if and only if the number of columns in the left-hand matrix is the same as the number of rows in the right-hand matrix. That is, an $m\times n$ matrix and an $n\times p$ matrix can be multiplied, and the result will be an $m\times p$ matrix. (Here $m$, $n$, and $p$ are generic sizes, not the ones from our example.) We calculate each entry by taking a row from the left-hand matrix and a column from the right-hand matrix. Since these are the same length (by assumption) we can multiply corresponding elements and sum up.
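The row-times-column rule can be sketched in a few lines of Python. This is only an illustration under the assumption that a matrix is stored as a list of rows; the function name `mat_mul` and the sample entries are made up for the example.

```python
def mat_mul(a, b):
    """Multiply two matrices stored as lists of rows."""
    rows_a, cols_a = len(a), len(a[0])
    rows_b, cols_b = len(b), len(b[0])
    # The product only makes sense when columns of `a` match rows of `b`.
    if cols_a != rows_b:
        raise ValueError(
            f"cannot multiply {rows_a}x{cols_a} by {rows_b}x{cols_b}"
        )
    # Entry (r, c) pairs row r of `a` with column c of `b`:
    # multiply corresponding elements and sum.
    return [
        [sum(a[r][k] * b[k][c] for k in range(cols_a)) for c in range(cols_b)]
        for r in range(rows_a)
    ]


a = [[1, 2, 3],
     [4, 5, 6]]           # 2x3
b = [[1, 0],
     [0, 1],
     [2, 2]]              # 3x2
product = mat_mul(a, b)   # 2x2
print(product)
```

Trying `mat_mul(a, a)` raises a `ValueError`, since a $2\times3$ matrix has three columns but only two rows.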

In the example above, the $n\times m$ matrix $\left(t_i^j\right)$ and the $m\times1$ matrix $\left(v^i\right)$ can be multiplied. There is only one column in the latter to pick, so we simply choose row $j$ out of $n$ on the left: $\begin{pmatrix}t_1^j&t_2^j&\cdots&t_m^j\end{pmatrix}$. Multiplying corresponding elements and summing gives the single field element $t_i^jv^i$ (remember the summation convention). We get $n$ of these elements — one for each row — and we arrange them in a new $n\times1$ matrix:

$\displaystyle\begin{pmatrix}t_1^1&t_2^1&\cdots&t_m^1\\t_1^2&t_2^2&\cdots&t_m^2\\\vdots&\vdots&\ddots&\vdots\\t_1^n&t_2^n&\cdots&t_m^n\end{pmatrix}\begin{pmatrix}v^1\\v^2\\\vdots\\v^m\end{pmatrix}=\begin{pmatrix}t_i^1v^i\\t_i^2v^i\\\vdots\\t_i^nv^i\end{pmatrix}$
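As a concrete instance, take $n=2$ and $m=3$, with entries picked arbitrarily just for illustration. Each row of the matrix is paired with the single column, and the sums $t_i^jv^i$ are written out explicitly:

$\displaystyle\begin{pmatrix}1&2&0\\0&1&3\end{pmatrix}\begin{pmatrix}2\\1\\1\end{pmatrix}=\begin{pmatrix}1\cdot2+2\cdot1+0\cdot1\\0\cdot2+1\cdot1+3\cdot1\end{pmatrix}=\begin{pmatrix}4\\4\end{pmatrix}$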

Then we can multiply the row vector $\begin{pmatrix}\mu_1&\mu_2&\cdots&\mu_n\end{pmatrix}$ by this column vector to get the $1\times1$ matrix:

$\displaystyle\begin{pmatrix}\mu_1&\mu_2&\cdots&\mu_n\end{pmatrix}\begin{pmatrix}t_1^1&t_2^1&\cdots&t_m^1\\t_1^2&t_2^2&\cdots&t_m^2\\\vdots&\vdots&\ddots&\vdots\\t_1^n&t_2^n&\cdots&t_m^n\end{pmatrix}\begin{pmatrix}v^1\\v^2\\\vdots\\v^m\end{pmatrix}=\begin{pmatrix}\mu_jt_i^jv^i\end{pmatrix}$

Just like we slip back and forth between vectors and $m\times1$ matrices, we will usually consider a field element and the $1\times1$ matrix with that single entry as being pretty much the same thing.

The first multiplication here turned an $m$-dimensional (column) vector into an $n$-dimensional one, reflecting the source and target of the transformation $T$. Then we evaluated the linear functional $\mu$ on the resulting vector. But by the associativity of matrix multiplication we could have first multiplied on the left:

$\displaystyle\begin{pmatrix}\mu_1&\mu_2&\cdots&\mu_n\end{pmatrix}\begin{pmatrix}t_1^1&t_2^1&\cdots&t_m^1\\t_1^2&t_2^2&\cdots&t_m^2\\\vdots&\vdots&\ddots&\vdots\\t_1^n&t_2^n&\cdots&t_m^n\end{pmatrix}=\begin{pmatrix}\mu_jt_1^j&\mu_jt_2^j&\cdots&\mu_jt_m^j\end{pmatrix}$

turning the linear functional on $\mathbb{F}^n$ into one on $\mathbb{F}^m$. But this is just the dual transformation $T^*:\left(\mathbb{F}^n\right)^*\rightarrow\left(\mathbb{F}^m\right)^*$! Then we can evaluate this on the column vector to get the same result: $\mu_jt_i^jv^i$.
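We can check the two orders of multiplication numerically. The sketch below assumes matrices stored as lists of rows and uses small integer entries chosen only for illustration; `mat_mul` is a throwaway helper, not a library function.

```python
def mat_mul(a, b):
    """Multiply two matrices stored as lists of rows."""
    if len(a[0]) != len(b):
        raise ValueError("columns of the left factor must match rows of the right")
    return [
        [sum(a[r][k] * b[k][c] for k in range(len(b))) for c in range(len(b[0]))]
        for r in range(len(a))
    ]


mu = [[1, 2]]              # 1x2 row vector: a functional on F^2 (n = 2)
t = [[1, 2, 0],
     [0, 1, 3]]            # 2x3 matrix: a map F^3 -> F^2 (m = 3)
v = [[2], [1], [1]]        # 3x1 column vector in F^3

# Multiply on the right first: apply T to v, then evaluate mu.
right_first = mat_mul(mu, mat_mul(t, v))

# Multiply on the left first: pull mu back along T (the dual map), then evaluate on v.
pulled_back = mat_mul(mu, t)
left_first = mat_mul(pulled_back, v)

print(pulled_back, right_first, left_first)
```

Both orders produce the same $1\times1$ matrix, and `pulled_back` is the row vector representing $T^*\mu$ as a functional on $\mathbb{F}^3$.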

There is one slightly touchy thing we need to be careful about: Kronecker products. When the upper index is a pair $(i_1,i_2)$ with $1\leq i_1\leq n_1$ and $1\leq i_2\leq n_2$ we have to pick an order on the set of such pairs. We’ll always use the “lexicographic” order. That is, we start with $(1,1)$, then $(1,2)$, and so on until $(1,n_2)$ before starting over with $(2,1)$, $(2,2)$, and so on. Let’s write out a couple of examples just to be clear:

$\displaystyle\begin{pmatrix}s_1^1&s_2^1\\s_1^2&s_2^2\end{pmatrix}\boxtimes\begin{pmatrix}t_1^1&t_2^1&t_3^1\\t_1^2&t_2^2&t_3^2\end{pmatrix}=\begin{pmatrix}s_1^1t_1^1&s_1^1t_2^1&s_1^1t_3^1&s_2^1t_1^1&s_2^1t_2^1&s_2^1t_3^1\\s_1^1t_1^2&s_1^1t_2^2&s_1^1t_3^2&s_2^1t_1^2&s_2^1t_2^2&s_2^1t_3^2\\s_1^2t_1^1&s_1^2t_2^1&s_1^2t_3^1&s_2^2t_1^1&s_2^2t_2^1&s_2^2t_3^1\\s_1^2t_1^2&s_1^2t_2^2&s_1^2t_3^2&s_2^2t_1^2&s_2^2t_2^2&s_2^2t_3^2\end{pmatrix}$

$\displaystyle\begin{pmatrix}t_1^1&t_2^1&t_3^1\\t_1^2&t_2^2&t_3^2\end{pmatrix}\boxtimes\begin{pmatrix}s_1^1&s_2^1\\s_1^2&s_2^2\end{pmatrix}=\begin{pmatrix}t_1^1s_1^1&t_1^1s_2^1&t_2^1s_1^1&t_2^1s_2^1&t_3^1s_1^1&t_3^1s_2^1\\t_1^1s_1^2&t_1^1s_2^2&t_2^1s_1^2&t_2^1s_2^2&t_3^1s_1^2&t_3^1s_2^2\\t_1^2s_1^1&t_1^2s_2^1&t_2^2s_1^1&t_2^2s_2^1&t_3^2s_1^1&t_3^2s_2^1\\t_1^2s_1^2&t_1^2s_2^2&t_2^2s_1^2&t_2^2s_2^2&t_3^2s_1^2&t_3^2s_2^2\end{pmatrix}$

So the Kronecker product depends on the order of multiplication. But this dependence is somewhat illusory. The only real difference is reordering the bases we use for the tensor products of the vector spaces involved, and so a change of basis can turn one into the other. This is an example of how matrices can carry artifacts of our choice of bases.
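Here is a rough Python sketch of the lexicographic convention and of the reordering that relates the two products. The names `kron` and `swap_order`, the list-of-rows storage, and the numeric entries are all assumptions made for the sake of the example.

```python
def kron(a, b):
    """Kronecker product of matrices stored as lists of rows.

    Rows are indexed by pairs (j1, j2) and columns by pairs (i1, i2),
    both in lexicographic order; the entry is a[j1][i1] * b[j2][i2].
    """
    return [
        [a[j1][i1] * b[j2][i2]
         for i1 in range(len(a[0])) for i2 in range(len(b[0]))]
        for j1 in range(len(a)) for j2 in range(len(b))
    ]


def swap_order(n_first, n_second):
    """Lexicographic positions of the pairs (x, y), listed in the
    lexicographic order of the swapped pairs (y, x)."""
    return [x * n_second + y for y in range(n_second) for x in range(n_first)]


s = [[1, 2],
     [3, 4]]               # 2x2
t = [[1, 0, 2],
     [0, 1, 3]]            # 2x3

st = kron(s, t)
ts = kron(t, s)

# Reorder the rows and columns of s (x) t, i.e. reorder the two bases.
row_perm = swap_order(len(s), len(t))
col_perm = swap_order(len(s[0]), len(t[0]))
reordered = [[st[r][c] for c in col_perm] for r in row_perm]

print(st != ts, reordered == ts)
```

The two products differ as arrays, but permuting the rows and columns of one recovers the other, which is exactly the change of basis described above.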

May 30, 2008 - Posted by | Algebra, Linear Algebra

1. This isn’t exactly relevant to what’s going on right now, but might be entertaining for people interested in the category theory entries from last year, and with the tensor symbols flying around again, maybe here later:

Categorical semantics of linear logic: a survey

http://www.pps.jussieu.fr/~mellies/

A decent workout on a fair range of basic concepts.

Comment by Avery Andrews | May 31, 2008

2. That’s a good point, Avery. I think you already know where I’m going with this next week.

Comment by John Armstrong | May 31, 2008
