The Unapologetic Mathematician

Mathematics for the interested outsider

The Einstein Summation Convention

Look at the formulas we were using yesterday. There are a lot of summations in there, and a lot of big sigmas. Those get really tiring to write over and over, and they get tiring really quickly. Back when Einstein was writing up his papers, he used a lot of linear transformations, and wrote them all out in matrices. Accordingly, he used a lot of those big sigmas.

When we’re typing nowadays, or when we write on a pad or on the board, this isn’t a problem. But remember that up until very recently, publications had to actually set type. Actual little pieces of metal with characters raised (and reversed!) on them would get slathered with ink and pressed to paper. Incidentally, this is why companies that produce fonts are called “type foundries”. They actually cast those metal bits with letter shapes in different styles, and sold sets of them to printers.

Now Einstein was using a lot of these big sigmas, and there were pages that had so many of them that the printer would run out! Even if they set one page at a time and printed them off, they just didn’t have enough little pieces of metal with big sigmas on them to handle it. Clearly something needed to be done to cut down on demand for them.

Here we note that we’re always summing over some basis. Even if there’s no basis element in a formula (say, the formula for a matrix product), the summation index still runs from 1 to the dimension of some vector space. We also notice that when we choose to write some of our indices as subscripts and some as superscripts, we’re always summing over one of each. We now adopt the convention that if we ever see a repeated index, once as a superscript and once as a subscript, we’ll read that as summing over an appropriate basis.

For example, when we wanted to write a vector v\in V, we had to take the basis \{f_j\} of V and write out the sum

\displaystyle v=\sum\limits_{j=1}^{\dim(V)}v^jf_j

but now we just write v^jf_j. The repeated index and the fact that we’re talking about a vector in V mean we sum for j running from 1 to the dimension of V. Similarly we write out the value of a linear transformation on a basis vector: T(f_j)=t_j^kg_k. Here we determine from context that k should run from 1 to the dimension of W.
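To make the bookkeeping concrete, here is a minimal sketch in plain Python of what the shorthand v^jf_j unpacks to. The particular basis vectors, components, and the names `expand`, `basis_f`, `basis_g` are made up for illustration, and I’m storing t_j^k as `t[j][k]`, lower index first, which is just one possible choice.

```python
# Hypothetical data: a basis {f_1, f_2} of a 2-dimensional V, written in
# ordinary coordinates, and the components v^1, v^2 of a vector.
basis_f = [(1, 1), (0, 2)]   # f_1, f_2
comps_v = [3, 5]             # v^1, v^2

def expand(components, basis):
    """Evaluate c^j b_j, summing over the repeated index j."""
    dim = len(basis)          # range of j, read off from context: the size of the basis
    size = len(basis[0])      # number of coordinates of each basis vector
    total = [0] * size
    for j in range(dim):
        for a in range(size):
            total[a] += components[j] * basis[j][a]
    return tuple(total)

# v = v^j f_j
v = expand(comps_v, basis_f)

# T(f_j) = t_j^k g_k: row j of t holds the components of T(f_j)
# with respect to a (hypothetical) basis {g_1, g_2, g_3} of W.
basis_g = [(1, 0, 0), (0, 1, 0), (0, 0, 1)]
t = [[1, 2, 0],   # t_1^1, t_1^2, t_1^3
     [0, 1, 4]]   # t_2^1, t_2^2, t_2^3
T_f1 = expand(t[0], basis_g)   # here k runs from 1 to dim(W) = 3
```

The only thing the code has to “know” that the formula leaves implicit is the range of the loop, and it reads that off from the length of the basis, exactly as we read it off from context.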

What about finding the coefficients of a linear transformation acting on a vector? Before we wrote this as

\displaystyle T(v)^k=\sum\limits_{j=1}^{\dim(V)}t_j^kv^j

but now we write the result as t_j^kv^j. Since the v^j are the coefficients of a vector in V, j must run from 1 to the dimension of V.
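As a rough sketch of the same computation in plain Python: the repeated index j becomes the loop we sum over, while the free index k just labels the entries of the answer. The function name `apply` and the sample numbers are assumptions for illustration, and t_j^k is stored as `t[j][k]`.

```python
def apply(t, v):
    """Return the components T(v)^k = t_j^k v^j, where t[j][k] holds t_j^k."""
    dim_V = len(v)       # j is repeated, so it is summed over a basis of V
    dim_W = len(t[0])    # k is free, so it labels a basis of W
    return [sum(t[j][k] * v[j] for j in range(dim_V)) for k in range(dim_W)]

# Hypothetical example: T goes from a 2-dimensional V to a 3-dimensional W.
t = [[1, 2, 0],   # t_1^k
     [0, 1, 4]]   # t_2^k
v = [3, 5]        # v^1, v^2

Tv = apply(t, v)  # components of T(v) with respect to the basis of W
```

Note that the result has one entry for each value of the free index k, and none for j, which has been summed away.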

And similarly, given linear transformations S:U\rightarrow V and T:V\rightarrow W represented (given choices of bases) by the matrices with components s_i^j and t_j^k, the matrix of their product (the composite of S followed by T) is then written s_i^jt_j^k. Again, we determine from context that we should be summing j over a set indexing a basis for V.
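Here is a small sketch of that product in plain Python, again with lower indices first (`s[i][j]` for s_i^j and `t[j][k]` for t_j^k). The names `compose` and `apply` and the sample matrices are hypothetical, chosen only to have something to run.

```python
def compose(s, t):
    """Matrix of S followed by T: c_i^k = s_i^j t_j^k, summed over j."""
    dim_U, dim_V, dim_W = len(s), len(t), len(t[0])
    return [[sum(s[i][j] * t[j][k] for j in range(dim_V))
             for k in range(dim_W)]
            for i in range(dim_U)]

def apply(m, x):
    """Components m_a^b x^a, summed over a, where m[a][b] holds m_a^b."""
    return [sum(m[a][b] * x[a] for a in range(len(x))) for b in range(len(m[0]))]

# Hypothetical example: U and V are 2-dimensional, W is 3-dimensional.
s = [[1, 0],
     [2, 1]]          # s_i^j
t = [[1, 2, 0],
     [0, 1, 4]]       # t_j^k

c = compose(s, t)     # c_i^k = s_i^j t_j^k

# Sanity check on one vector u in U: applying S and then T
# agrees with applying the single matrix c.
u = [1, 1]
two_steps = apply(t, apply(s, u))
one_step = apply(c, u)
```

In the product only i and k survive as free indices, so the result is a matrix from U to W. The index j, which refers to V, shows up once up and once down and is summed away.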

One very important thing to note here is that it’s not going to matter what basis for V we use here! I’m not going to prove this quite yet, but built right into this notation is the fact that the composite of the two transformations is completely independent of the choice of basis of V. Of course, the matrix of the composite still depends on the bases of U and W we pick, but the dependence on V vanishes as we take the sum.
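This is no substitute for the proof, but we can at least check one example numerically. The sketch below assumes the usual change-of-basis rules, which haven’t been established in this post: if the new basis of V is f'_a=p_a^jf_j and q is the inverse matrix of p, then s_i^j becomes s_i^jq_j^a and t_j^k becomes p_a^jt_j^k. The matrices and the name `matmul` are hypothetical.

```python
def matmul(x, y):
    """z_i^k = x_i^j y_j^k, summed over j, with lower indices stored first."""
    return [[sum(x[i][j] * y[j][k] for j in range(len(y)))
             for k in range(len(y[0]))]
            for i in range(len(x))]

# Hypothetical matrices of S: U -> V and T: V -> W in the original bases.
s = [[1, 0],
     [2, 1]]          # s_i^j
t = [[1, 2, 0],
     [0, 1, 4]]       # t_j^k

# Assumed change of basis on V: f'_a = p_a^j f_j, with q the inverse of p.
p = [[1, 1],
     [0, 1]]
q = [[1, -1],
     [0, 1]]
identity = matmul(q, p)   # q_j^a p_a^l should be the Kronecker delta

# Matrices of the same S and T, using the new basis of V.
s_new = matmul(s, q)      # s_i^j q_j^a
t_new = matmul(p, t)      # p_a^j t_j^k

old_product = matmul(s, t)
new_product = matmul(s_new, t_new)
```

The two matrices in the middle both change, but `old_product` and `new_product` come out the same, since q and p cancel inside the sum over the index for V.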

Einstein had a slightly easier time of things: he was always dealing with four-dimensional vector spaces, so all his indices had the same range of summation. We’ve got to pay some attention here and be careful about what vector space a given index is talking about, but in the long run it saves a lot of time.

May 21, 2008 - Posted by John Armstrong | Fundamentals, Linear Algebra | 8 Comments

8 Comments »

  1. Oh yuck. I’m firmly with Spivak on the matter of the Einstein summation convention - it’s awful, makes the mathematics unreadable and confusing.

    I tried thrice to learn differential geometry. I failed each time at the task. And one of the first things that really turned me off each time was the summation convention.

    That said I don’t want to turn into yet another troll hounding your poor blog; so I’ll shut up about the summation convention after this.

    Comment by Mikael Vejdemo Johansson | May 21, 2008

  2. Mikael, I think the problem is not the summation convention so much as working with matrices in the first place. Once you’re committed to using bases, the convention just simplifies notation. Unreadable formulas in terms of the summation convention are even more unreadable without it.

    While I talk about linear algebra I’ll use matrices, and I’ll use the summation convention to simplify the notation. However (and as I’ll say more explicitly soon) the point is to tie matrices to the abstract formulations and lift concepts up. That is, many people may have seen matrices and matrix operations. Often a problem arises in terms of matrices. Part of my coverage of linear algebra is about making the transition from matrices and bases to abstract linear transformations. When speaking abstractly I won’t have to use the summation convention.

    Comment by John Armstrong | May 21, 2008

