# The Unapologetic Mathematician

## Matrix notation

I just spent all day on the road back to NOLA to handle some end-of-month business, clean out my office, and so on. This one will have to do for today and tomorrow.

It gets annoying to write out matrices using the embedded LaTeX here, but I suppose I really should, just for thoroughness’ sake.

In general, a matrix is a collection of field elements with an upper and a lower index. We can write out all these elements in a rectangular array. The upper index should list the rows of our array, while the lower index should list the columns. The matrix $\left(t_i^j\right)$ with entries $t_i^j$ for $i$ running from ${1}$ to $m$ and $j$ running from ${1}$ to $n$ is written out in full as $\displaystyle\begin{pmatrix}t_1^1&t_2^1&\cdots&t_m^1\\t_1^2&t_2^2&\cdots&t_m^2\\\vdots&\vdots&\ddots&\vdots\\t_1^n&t_2^n&\cdots&t_m^n\end{pmatrix}$

We call this an $n\times m$ matrix, because the array is $n$ rows high and $m$ columns wide.
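To make the index convention concrete, here is a minimal sketch in Python that stores a matrix as a list of rows. The helper names `matrix_from_entries` and `shape` are made up for illustration, and the entries are just strings labelling their own indices, so the printout shows which index runs along rows and which along columns.

```python
# Hypothetical helper: build the n x m array whose entry t_i^j sits in
# row j (upper index, 1..n) and column i (lower index, 1..m).
def matrix_from_entries(entry, m, n):
    return [[entry(i, j) for i in range(1, m + 1)] for j in range(1, n + 1)]


def shape(matrix):
    """Return (number of rows, number of columns)."""
    return (len(matrix), len(matrix[0]))


# Label each entry by its indices so the layout is visible.
T = matrix_from_entries(lambda i, j: f"t_{i}^{j}", m=3, n=2)
for row in T:
    print(row)
# ['t_1^1', 't_2^1', 't_3^1']
# ['t_1^2', 't_2^2', 't_3^2']
print(shape(T))  # (2, 3): n = 2 rows high, m = 3 columns wide
```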

There is a natural isomorphism $V\cong\hom(\mathbb{F},V)$, sending a vector $v$ to the linear map that takes ${1}$ to $v$. This means that every vector in an $m$-dimensional space, written out in components relative to a given basis, can be seen as an $m\times1$ “column vector”: $\displaystyle\begin{pmatrix}v^1\\v^2\\\vdots\\v^m\end{pmatrix}$

Similarly, a linear functional on an $n$-dimensional space can be written as a $1\times n$ “row vector”: $\displaystyle\begin{pmatrix}\mu_1&\mu_2&\cdots&\mu_n\end{pmatrix}$

Notice that evaluation of linear transformations is now just a special case of matrix multiplication! Let’s practice by writing out the composition of a linear functional $\mu\in\left(\mathbb{F}^n\right)^*$, a linear map $T:\mathbb{F}^m\rightarrow\mathbb{F}^n$, and a vector $v\in\mathbb{F}^m$. $\displaystyle\begin{pmatrix}\mu_1&\mu_2&\cdots&\mu_n\end{pmatrix}\begin{pmatrix}t_1^1&t_2^1&\cdots&t_m^1\\t_1^2&t_2^2&\cdots&t_m^2\\\vdots&\vdots&\ddots&\vdots\\t_1^n&t_2^n&\cdots&t_m^n\end{pmatrix}\begin{pmatrix}v^1\\v^2\\\vdots\\v^m\end{pmatrix}$

A matrix product makes sense if and only if the number of columns in the left-hand matrix is the same as the number of rows in the right-hand matrix. That is, an $m\times n$ matrix and an $n\times p$ matrix can be multiplied, and the result will be an $m\times p$ matrix. We calculate each entry by taking a row from the left-hand matrix and a column from the right-hand matrix. Since these are the same length (by assumption) we can multiply corresponding elements and sum up.
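The rule just described can be sketched in a few lines of plain Python. This is only an illustration under the assumption that matrices are stored as lists of rows; the function name `matmul` and the sample matrices are my own choices, not anything fixed by the discussion.

```python
def matmul(A, B):
    """Multiply matrices stored as lists of rows, checking that shapes match."""
    rows_a, cols_a = len(A), len(A[0])
    rows_b, cols_b = len(B), len(B[0])
    if cols_a != rows_b:
        raise ValueError(
            f"cannot multiply a {rows_a}x{cols_a} matrix by a {rows_b}x{cols_b} matrix"
        )
    # Entry (r, c): pair row r of A with column c of B, multiply
    # corresponding elements, and sum.
    return [
        [sum(A[r][k] * B[k][c] for k in range(cols_a)) for c in range(cols_b)]
        for r in range(rows_a)
    ]


A = [[1, 2, 3], [4, 5, 6]]    # 2 x 3
B = [[1, 0], [0, 1], [1, 1]]  # 3 x 2
print(matmul(A, B))  # [[4, 5], [10, 11]], a 2 x 2 matrix
```

Calling `matmul(A, A)` raises a `ValueError`, since a $2\times3$ matrix has three columns but only two rows.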

In the example above, the $n\times m$ matrix $\left(t_i^j\right)$ and the $m\times1$ matrix $\left(v^i\right)$ can be multiplied. There is only one column in the latter to pick, so we simply choose row $j$ out of $n$ on the left: $\begin{pmatrix}t_1^j&t_2^j&\cdots&t_m^j\end{pmatrix}$. Multiplying corresponding elements and summing gives the single field element $t_i^jv^i$ (remember the summation convention). We get $n$ of these elements — one for each row — and we arrange them in a new $n\times1$ matrix: $\displaystyle\begin{pmatrix}t_1^1&t_2^1&\cdots&t_m^1\\t_1^2&t_2^2&\cdots&t_m^2\\\vdots&\vdots&\ddots&\vdots\\t_1^n&t_2^n&\cdots&t_m^n\end{pmatrix}\begin{pmatrix}v^1\\v^2\\\vdots\\v^m\end{pmatrix}=\begin{pmatrix}t_i^1v^i\\t_i^2v^i\\\vdots\\t_i^nv^i\end{pmatrix}$

Then we can multiply the row vector $\begin{pmatrix}\mu_1&\mu_2&\cdots&\mu_n\end{pmatrix}$ by this column vector to get the $1\times1$ matrix: $\displaystyle\begin{pmatrix}\mu_1&\mu_2&\cdots&\mu_n\end{pmatrix}\begin{pmatrix}t_1^1&t_2^1&\cdots&t_m^1\\t_1^2&t_2^2&\cdots&t_m^2\\\vdots&\vdots&\ddots&\vdots\\t_1^n&t_2^n&\cdots&t_m^n\end{pmatrix}\begin{pmatrix}v^1\\v^2\\\vdots\\v^m\end{pmatrix}=\begin{pmatrix}\mu_jt_i^jv^i\end{pmatrix}$

Just like we slip back and forth between vectors and $m\times1$ matrices, we will usually consider a field element and the $1\times1$ matrix with that single entry as being pretty much the same thing.

The first multiplication here turned an $m$-dimensional (column) vector into an $n$-dimensional one, reflecting the source and target of the transformation $T$. Then we evaluated the linear functional $\mu$ on the resulting vector. But by the associativity of matrix multiplication we could have first multiplied on the left: $\displaystyle\begin{pmatrix}\mu_1&\mu_2&\cdots&\mu_n\end{pmatrix}\begin{pmatrix}t_1^1&t_2^1&\cdots&t_m^1\\t_1^2&t_2^2&\cdots&t_m^2\\\vdots&\vdots&\ddots&\vdots\\t_1^n&t_2^n&\cdots&t_m^n\end{pmatrix}=\begin{pmatrix}\mu_jt_1^j&\mu_jt_2^j&\cdots&\mu_jt_m^j\end{pmatrix}$

turning the linear functional on $\mathbb{F}^n$ into one on $\mathbb{F}^m$. But this is just the dual transformation $T^*:\left(\mathbb{F}^n\right)^*\rightarrow\left(\mathbb{F}^m\right)^*$! Then we can evaluate this on the column vector to get the same result: $\mu_jt_i^jv^i$.
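For a quick sanity check of the two ways of bracketing the product, here is a small Python sketch with $n=2$ and $m=3$. The numbers are arbitrary and the helper `matmul` is a hypothetical name; all it assumes is that the shapes are compatible.

```python
def matmul(A, B):
    """Multiply matrices stored as lists of rows (shapes assumed compatible)."""
    return [
        [sum(A[r][k] * B[k][c] for k in range(len(B))) for c in range(len(B[0]))]
        for r in range(len(A))
    ]


mu = [[2, 1]]               # 1 x 2 row vector: a functional on F^2
T = [[1, 2, 3], [4, 5, 6]]  # 2 x 3 matrix: a map from F^3 to F^2
v = [[1], [0], [-1]]        # 3 x 1 column vector in F^3

Tv = matmul(T, v)    # apply T first: a 2 x 1 column vector
muT = matmul(mu, T)  # apply the dual map first: a 1 x 3 row vector

print(Tv)               # [[-2], [-2]]
print(muT)              # [[6, 9, 12]]
print(matmul(mu, Tv))   # [[-6]]
print(matmul(muT, v))   # [[-6]]
```

Both orders give the same $1\times1$ matrix, as associativity says they must.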

There is one slightly touchy thing we need to be careful about: Kronecker products. When the upper index is a pair $(i_1,i_2)$ with $1\leq i_1\leq n_1$ and $1\leq i_2\leq n_2$ we have to pick an order on the set of such pairs, and similarly for the lower index. We’ll always use the “lexicographic” order. That is, we start with $(1,1)$, then $(1,2)$, and so on until $(1,n_2)$ before starting over with $(2,1)$, $(2,2)$, and so on. Let’s write out a couple of examples just to be clear:

$\displaystyle\begin{pmatrix}s_1^1&s_2^1\\s_1^2&s_2^2\end{pmatrix}\boxtimes\begin{pmatrix}t_1^1&t_2^1&t_3^1\\t_1^2&t_2^2&t_3^2\end{pmatrix}=\begin{pmatrix}s_1^1t_1^1&s_1^1t_2^1&s_1^1t_3^1&s_2^1t_1^1&s_2^1t_2^1&s_2^1t_3^1\\s_1^1t_1^2&s_1^1t_2^2&s_1^1t_3^2&s_2^1t_1^2&s_2^1t_2^2&s_2^1t_3^2\\s_1^2t_1^1&s_1^2t_2^1&s_1^2t_3^1&s_2^2t_1^1&s_2^2t_2^1&s_2^2t_3^1\\s_1^2t_1^2&s_1^2t_2^2&s_1^2t_3^2&s_2^2t_1^2&s_2^2t_2^2&s_2^2t_3^2\end{pmatrix}$

$\displaystyle\begin{pmatrix}t_1^1&t_2^1&t_3^1\\t_1^2&t_2^2&t_3^2\end{pmatrix}\boxtimes\begin{pmatrix}s_1^1&s_2^1\\s_1^2&s_2^2\end{pmatrix}=\begin{pmatrix}t_1^1s_1^1&t_1^1s_2^1&t_2^1s_1^1&t_2^1s_2^1&t_3^1s_1^1&t_3^1s_2^1\\t_1^1s_1^2&t_1^1s_2^2&t_2^1s_1^2&t_2^1s_2^2&t_3^1s_1^2&t_3^1s_2^2\\t_1^2s_1^1&t_1^2s_2^1&t_2^2s_1^1&t_2^2s_2^1&t_3^2s_1^1&t_3^2s_2^1\\t_1^2s_1^2&t_1^2s_2^2&t_2^2s_1^2&t_2^2s_2^2&t_3^2s_1^2&t_3^2s_2^2\end{pmatrix}$

So the Kronecker product depends on the order of multiplication. But this dependence is somewhat illusory. The only real difference is reordering the bases we use for the tensor products of the vector spaces involved, and so a change of basis can turn one into the other. This is an example of how matrices can carry artifacts of our choice of bases.
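Here is a rough Python sketch of that claim, again storing matrices as lists of rows. The names `kron` and `swap_order` are hypothetical helpers of my own: the first builds the Kronecker product with lexicographically ordered index pairs, and the second just relabels rows and columns, which is what reordering the bases of the tensor products amounts to.

```python
def kron(A, B):
    """Kronecker product with index pairs in lexicographic order."""
    # Rows run over pairs (row of A, row of B); columns over pairs
    # (column of A, column of B). In both, the A index varies slowest.
    return [[a * b for a in row_a for b in row_b] for row_a in A for row_b in B]


def swap_order(M, rows_a, rows_b, cols_a, cols_b):
    """Relabel M = kron(A, B) so that the B index comes first in each pair."""
    return [
        [M[ra * rows_b + rb][ca * cols_b + cb]
         for cb in range(cols_b) for ca in range(cols_a)]
        for rb in range(rows_b) for ra in range(rows_a)
    ]


S = [[1, 2], [3, 4]]        # 2 x 2
T = [[1, 2, 3], [4, 5, 6]]  # 2 x 3

print(kron(S, T)[0])  # [1, 2, 3, 2, 4, 6]
print(kron(T, S)[0])  # [1, 2, 2, 4, 3, 6]

# The two products differ as arrays, but only by a reordering of indices.
print(kron(S, T) == kron(T, S))                          # False
print(swap_order(kron(S, T), 2, 2, 2, 3) == kron(T, S))  # True
```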

May 30, 2008 | Algebra, Linear Algebra