The Unapologetic Mathematician

The Einstein Summation Convention

Look at the formulas we were using yesterday. There are a lot of summations in there, and a lot of big sigmas. Those get really tiring to write over and over, and they get tiring really quickly. Back when Einstein was writing up his papers, he used a lot of linear transformations and wrote them all out in matrices. Accordingly, he used a lot of those big sigmas.

When we’re typing nowadays, or when we write on a pad or on the board, this isn’t a problem. But remember that up until very recently, publications had to actually set type. Actual little pieces of metal with characters raised (and reversed!) on them would get slathered with ink and pressed to paper. Incidentally, this is why companies that produce fonts are called “type foundries”. They actually forged those metal bits with letter shapes in different styles, and sold sets of them to printers.

Now Einstein was using a lot of these big sigmas, and there were pages that had so many of them that the printer would run out! Even if they set one page at once and printed them off, they just didn’t have enough little pieces of metal with big sigmas on them to handle it. Clearly something needed to be done to cut down on demand for them.

Here we note that we’re always summing over some basis. Even if there’s no basis element in a formula — say, the formula for a matrix product — the summation is over the dimension of some vector space. We also notice that when we chose to write some of our indices as subscripts and some as superscripts, we’re always summing over one of each. We now adopt the convention that if we ever see a repeated index — once as a superscript and once as a subscript — we’ll read that as summing over an appropriate basis.

For example, when we wanted to write a vector $v\in V$, we had to take the basis $\{f_j\}$ of $V$ and write out the sum

$\displaystyle v=\sum\limits_{j=1}^{\dim(V)}v^jf_j$

but now we just write $v^jf_j$. The repeated index and the fact that we’re talking about a vector in $V$ means we sum for $j$ running from ${1}$ to the dimension of $V$. Similarly we write out the value of a linear transformation on a basis vector: $T(f_j)=t_j^kg_k$. Here we determine from context that $k$ should run from ${1}$ to the dimension of $W$.
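As an aside, this convention is built directly into NumPy's `einsum` function (named for exactly this reason): a repeated index in the subscript string is summed over. Here is a small sketch of the expansion $v = v^jf_j$, with a made-up three-dimensional $V$ and an invented basis stored as the rows of a matrix — all the particular numbers are just illustrative.

```python
import numpy as np

# Hypothetical basis {f_1, f_2, f_3} of a 3-dimensional V, stored as rows of F.
F = np.array([[1.0, 0.0, 0.0],
              [1.0, 1.0, 0.0],
              [1.0, 1.0, 1.0]])
v_coeffs = np.array([2.0, -1.0, 3.0])   # the components v^j

# v = v^j f_j: j is repeated in the subscripts, so einsum sums over it.
v = np.einsum('j,ji->i', v_coeffs, F)

# Same thing written with an explicit big sigma:
v_explicit = sum(v_coeffs[j] * F[j] for j in range(3))
assert np.allclose(v, v_explicit)
```

The subscript string `'j,ji->i'` plays the role of the index placement in $v^jf_j$: `j` appears twice on the left and not at all on the right, so it is the summed index.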

What about finding the coefficients of a linear transformation acting on a vector? Before we wrote this as

$\displaystyle T(v)^k=\sum\limits_{j=1}^{\dim(V)}t_j^kv^j$

Now we just write the result as $t_j^kv^j$. Since the $v^j$ are the coefficients of a vector in $V$, $j$ must run from ${1}$ to the dimension of $V$.
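We can spell out $T(v)^k = t_j^kv^j$ the same way. The dimensions and entries below are invented for illustration; I'm storing the matrix so that the upper index $k$ labels rows and the lower index $j$ labels columns, i.e. `T_mat[k, j]` holds $t_j^k$.

```python
import numpy as np

# Hypothetical matrix of T : V -> W, with T_mat[k, j] = t_j^k.
T_mat = np.array([[1.0, 2.0, 0.0],
                  [0.0, 1.0, -1.0]])   # dim V = 3, dim W = 2
v = np.array([2.0, -1.0, 3.0])         # components v^j of a vector in V

# T(v)^k = t_j^k v^j: j appears once up and once down, so it is summed over.
Tv = np.einsum('kj,j->k', T_mat, v)

# This is exactly the ordinary matrix-vector product.
assert np.allclose(Tv, T_mat @ v)
```

The free index $k$ survives on the right of the arrow in `'kj,j->k'`, matching the fact that $T(v)^k$ still carries one index after the sum over $j$.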

And similarly given linear transformations $S:U\rightarrow V$ and $T:V\rightarrow W$ represented (given choices of bases) by the matrices with components $s_i^j$ and $t_j^k$, the matrix of their product is then written $s_i^jt_j^k$. Again, we determine from context that we should be summing $j$ over a set indexing a basis for $V$.
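The composite $s_i^jt_j^k$ can be sketched the same way. Again the matrices below are hypothetical examples, stored with upper index as row and lower index as column, so the sum over $j$ — a basis of $V$ — is just the inner dimension of a matrix product.

```python
import numpy as np

# Hypothetical matrices: S : U -> V with S_mat[j, i] = s_i^j,
# and T : V -> W with T_mat[k, j] = t_j^k.
S_mat = np.array([[1.0, 0.0],
                  [2.0, 1.0],
                  [0.0, 3.0]])          # dim U = 2, dim V = 3
T_mat = np.array([[1.0, 2.0, 0.0],
                  [0.0, 1.0, -1.0]])    # dim V = 3, dim W = 2

# (T after S) has components s_i^j t_j^k: the repeated j is summed over V.
TS = np.einsum('kj,ji->ki', T_mat, S_mat)

# einsum with this subscript string reproduces the matrix product.
assert np.allclose(TS, T_mat @ S_mat)
```

Note that $i$ indexes a basis of $U$ and $k$ a basis of $W$; only $j$, the index for $V$, is repeated, and only it disappears in the sum.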

One very important thing to note here is that it's not going to matter what basis for $V$ we use! I'm not going to prove this quite yet, but built right into this notation is the fact that the composite of the two transformations is completely independent of the choice of basis of $V$. Of course, the matrix of the composite still depends on the bases of $U$ and $W$ we pick, but the dependence on the basis of $V$ vanishes as we take the sum.
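Though the proof comes later, we can at least check this numerically in a made-up example. If $P$ is an invertible change-of-basis matrix on $V$, then rewriting both factors in the new basis sends the matrix of $S$ to $P^{-1}S$ and the matrix of $T$ to $TP$, and the $P$ and $P^{-1}$ cancel in the sum over $j$. All the matrices here are invented for illustration.

```python
import numpy as np

# Hypothetical matrices for S : U -> V and T : V -> W in some fixed bases.
S_mat = np.array([[1.0, 0.0],
                  [2.0, 1.0],
                  [0.0, 3.0]])          # dim U = 2, dim V = 3
T_mat = np.array([[1.0, 2.0, 0.0],
                  [0.0, 1.0, -1.0]])    # dim W = 2

# An invertible change-of-basis matrix on V (det = 1, so certainly invertible).
P = np.array([[1.0, 1.0, 0.0],
              [0.0, 1.0, 1.0],
              [0.0, 0.0, 1.0]])

# In the new basis of V: S picks up P^{-1} on its output, T picks up P on its input.
S_new = np.linalg.inv(P) @ S_mat
T_new = T_mat @ P

# The composite t_j^k s_i^j is unchanged by the change of basis on V.
assert np.allclose(T_new @ S_new, T_mat @ S_mat)
```

The bases of $U$ and $W$ were held fixed throughout; only the matrix of the composite's dependence on the basis of $V$ was put to the test here.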

Einstein had a slightly easier time of things: he was always dealing with four-dimensional vector spaces, so all his indices had the same range of summation. We’ve got to pay some attention here and be careful about what vector space a given index is talking about, but in the long run it saves a lot of time.

May 21, 2008