# The Unapologetic Mathematician

## The Einstein Summation Convention

Look at the formulas we were using yesterday. There are a lot of summations in there, and a lot of big sigmas. Those get really tiring to write over and over, and they get tiring really quickly. Back when Einstein was writing up his papers, he used a lot of linear transformations, and wrote them all out in matrices. Accordingly, he used a lot of those big sigmas.

When we’re typing nowadays, or when we write on a pad or on the board, this isn’t a problem. But remember that up until very recently, publications had to actually set type. Actual little pieces of metal with characters raised (and reversed!) on them would get slathered with ink and pressed to paper. Incidentally, this is why companies that produce fonts are called “type foundries”. They actually cast those metal bits with letter shapes in different styles, and sold sets of them to printers.

Now Einstein was using a lot of these big sigmas, and there were pages that had so many of them that the printer would run out! Even if they set one page at a time and printed them off, they just didn’t have enough little pieces of metal with big sigmas on them to handle it. Clearly something needed to be done to cut down on demand for them.

Here we note that we’re always summing over some basis. Even if there’s no basis element in a formula (say, the formula for a matrix product), the summation index still runs from $1$ to the dimension of some vector space. We also notice that when we choose to write some of our indices as subscripts and some as superscripts, we’re always summing over one of each. We now adopt the convention that if we ever see a repeated index, once as a superscript and once as a subscript, we’ll read that as summing over an appropriate basis.

For example, when we wanted to write a vector $v\in V$, we had to take the basis $\{f_j\}$ of $V$ and write out the sum

$\displaystyle v=\sum\limits_{j=1}^{\dim(V)}v^jf_j$

but now we just write $v^jf_j$. The repeated index and the fact that we’re talking about a vector in $V$ mean we sum for $j$ running from $1$ to the dimension of $V$. Similarly, we write out the value of a linear transformation $T:V\rightarrow W$ on a basis vector: $T(f_j)=t_j^kg_k$, where $\{g_k\}$ is a basis of $W$. Here we determine from context that $k$ should run from $1$ to the dimension of $W$.
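
To make the shorthand concrete, suppose (purely for illustration) that $V$ has dimension $3$ and $W$ has dimension $2$. Then the two expressions above unpack as

$\displaystyle v^jf_j=v^1f_1+v^2f_2+v^3f_3$

$\displaystyle T(f_j)=t_j^kg_k=t_j^1g_1+t_j^2g_2$

Note that $j$ is not summed in the second line. It appears only once on each side, so it is a free index, and we get one such equation for each value of $j$.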

What about finding the coefficients of a linear transformation acting on a vector? Before, we wrote this as

$\displaystyle T(v)^k=\sum\limits_{j=1}^{\dim(V)}t_j^kv^j$

Now we write the result simply as $t_j^kv^j$. Since the $v^j$ are the coefficients of a vector in $V$, $j$ must run from $1$ to the dimension of $V$.
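
As a small sketch of what the convention is abbreviating, here is the same contraction in plain Python. The names `t`, `v`, and `apply_transformation`, the sample numbers, and the choice to store $t_j^k$ as `t[k][j]` are all assumptions made for this illustration, not anything fixed by the notation.

```python
# Hypothetical example data: T maps a 3-dimensional V to a 2-dimensional W.
# t[k][j] stores the component t_j^k (upper index k = row, lower index j = column).
t = [[1, 0, 2],
     [0, 3, 1]]
v = [4, 5, 6]  # v[j] stores the coefficient v^j

def apply_transformation(t, v):
    """Return the coefficients T(v)^k = t_j^k v^j.

    The repeated index j is summed from 0 to dim(V) - 1 (Python counts from 0);
    the free index k labels the entries of the result.
    """
    return [sum(row[j] * v[j] for j in range(len(v))) for row in t]

result = apply_transformation(t, v)
print(result)  # prints [16, 21]
```

The range of the sum never appears as a separate parameter: it is read off from the length of `v`, just as the convention reads it off from the space that $v$ lives in.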

Similarly, given linear transformations $S:U\rightarrow V$ and $T:V\rightarrow W$ represented (given choices of bases) by the matrices with components $s_i^j$ and $t_j^k$, the matrix of their product (the composite $T\circ S:U\rightarrow W$) is then written $s_i^jt_j^k$. Again, we determine from context that we should be summing $j$ over a set indexing a basis for $V$.
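
The same idea can be sketched for the matrix product. This is only an illustration: the function name `compose`, the sample matrices, and the storage layout (upper index as the row, lower index as the column) are assumptions of the example.

```python
# Hypothetical example data: U is 2-dimensional, V is 3-dimensional, W is 2-dimensional.
# s[j][i] stores s_i^j for S: U -> V, and t[k][j] stores t_j^k for T: V -> W.
s = [[1, 2],
     [0, 1],
     [3, 0]]
t = [[1, 0, 2],
     [0, 3, 1]]

def compose(t, s):
    """Return c with c[k][i] = s_i^j t_j^k, summing over the repeated index j.

    The free indices i and k survive into the result; j runs over a basis of V,
    so its range is the number of rows of s (equivalently, columns of t).
    """
    dim_v = len(s)
    return [[sum(s[j][i] * t[k][j] for j in range(dim_v))
             for i in range(len(s[0]))]
            for k in range(len(t))]

product = compose(t, s)
print(product)  # prints [[7, 2], [3, 3]]
```

Only `i` and `k` index the result. The index `j` is used up by the sum, which is exactly what the repeated upper and lower index signals.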

One very important thing to note here is that it’s not going to matter what basis for $V$ we use here! I’m not going to prove this quite yet, but built right into this notation is the fact that the composite of the two transformations is completely independent of the choice of basis of $V$. Of course, the matrix of the composite still depends on the bases of $U$ and $W$ we pick, but the dependence on $V$ vanishes as we take the sum.
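
This is not a proof, but a quick numeric sanity check may make the claim more believable. The sketch below assumes small integer matrices and a hypothetical change of basis `p` for $V$ (with inverse `p_inv`); all names and numbers are made up for the example. Changing the basis of $V$ multiplies the matrix of $S$ by `p_inv` on its $V$ index and the matrix of $T$ by `p` on its $V$ index, and the two changes cancel in the sum.

```python
def contract(a, b):
    """Return c with c[k][i] = a[k][j] * b[j][i], summed over the repeated index j."""
    return [[sum(a[k][j] * b[j][i] for j in range(len(b)))
             for i in range(len(b[0]))]
            for k in range(len(a))]

# Hypothetical components, stored with the upper index as the row.
s = [[1, 2], [3, 4]]  # s[j][i] = s_i^j, for S: U -> V
t = [[0, 1], [5, 2]]  # t[k][j] = t_j^k, for T: V -> W

# A hypothetical change of basis for V: column a of p holds the old-basis
# components of the a-th new basis vector. p_inv is its inverse.
p = [[2, 1], [1, 1]]
p_inv = [[1, -1], [-1, 2]]

# Components of S and T with respect to the new basis of V.
# The bases of U and W are left alone.
s_new = contract(p_inv, s)
t_new = contract(t, p)

old = contract(t, s)          # composite computed in the old basis of V
new = contract(t_new, s_new)  # composite computed in the new basis of V
print(old)         # prints [[3, 4], [11, 18]]
print(new == old)  # prints True
```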

Einstein had a slightly easier time of things: he was always dealing with four-dimensional vector spaces, so all his indices had the same range of summation. We’ve got to pay some attention here and be careful about what vector space a given index is talking about, but in the long run it saves a lot of time.

May 21, 2008 - Posted by | Fundamentals, Linear Algebra

1. Oh yuck. I’m firmly with Spivak on the matter of the Einstein summation convention – it’s awful, makes the mathematics unreadable and confusing.

I tried thrice to learn differential geometry. I failed each time at the task. And one of the first things that really turned me off each time was the summation convention.

That said I don’t want to turn into yet another troll hounding your poor blog; so I’ll shut up about the summation convention after this.

Comment by Mikael Vejdemo Johansson | May 21, 2008

2. Mikael, I think the problem is not the summation convention so much as working with matrices in the first place. Once you’re committed to using bases, the convention just simplifies notation. Unreadable formulas in terms of the summation convention are even more unreadable without it.

While I talk about linear algebra I’ll use matrices, and I’ll use the summation convention to simplify the notation. However (and as I’ll say more explicitly soon) the point is to tie matrices to the abstract formulations and lift concepts up. That is, many people may have seen matrices and matrix operations. Often a problem arises in terms of matrices. Part of my coverage of linear algebra is about making the transition from matrices and bases to abstract linear transformations. When speaking abstractly I won’t have to use the summation convention.

Comment by John Armstrong | May 21, 2008
