The Unapologetic Mathematician

Mathematics for the interested outsider

The Gram-Schmidt Process

Now that we have a real or complex inner product, we have notions of length and angle. This lets us define what it means for a collection of vectors to be “orthonormal”: each pair of distinct vectors is perpendicular, and each vector has unit length. In formulas, we say that the collection \left\{e_i\right\}_{i=1}^n is orthonormal if \langle e_i,e_j\rangle=\delta_{i,j}. These can be useful things to have, but how do we get our hands on them?

It turns out that if we have a linearly independent collection of vectors \left\{v_i\right\}_{i=1}^n then we can come up with an orthonormal collection \left\{e_i\right\}_{i=1}^n spanning the same subspace of V. Even better, we can pick it so that the first k vectors \left\{e_i\right\}_{i=1}^k span the same subspace as \left\{v_i\right\}_{i=1}^k. The method goes back to Laplace and Cauchy, but gets its name from Jørgen Gram and Erhard Schmidt.

We proceed by induction on the number of vectors in the collection. If n=1, then we simply set

\displaystyle e_1=\frac{v_1}{\lVert v_1\rVert}

This “normalizes” the vector to have unit length, but doesn’t change its direction. It spans the same one-dimensional subspace, and since it’s alone it forms an orthonormal collection.

Now, lets assume the procedure works for collections of size n-1 and start out with a linearly independent collection of n vectors. First, we can orthonormalize the first n-1 vectors using our inductive hypothesis. This gives a collection \left\{e_i\right\}_{i=1}^{n-1} which spans the same subspace as \left\{v_i\right\}_{i=1}^{n-1} (and so on down, as noted above). But v_n isn’t in the subspace spanned by the first n-1 vectors (or else the original collection wouldn’t have been linearly independent). So it points at least somewhat in a new direction.

To find this new direction, we define

\displaystyle w_n=v_n-\langle e_1,v_n\rangle e_1-...-\langle e_{n-1},v_n\rangle e_{n-1}

This vector will be orthogonal to all the vectors from e_1 to e_{n-1}, since for any such e_j we can check

\displaystyle\begin{aligned}\langle e_j,w_n&=\langle e_j,v_n-\langle e_1,v_n\rangle e_1-...-\langle e_{n-1},v_n\rangle e_{n-1}\rangle\\&=\langle e_j,v_n\rangle-\langle e_1,v_n\rangle\langle e_j,e_1\rangle-...-\langle e_{n-1},v_n\rangle\langle e_j,e_{n-1}\rangle\\&=\langle e_j,v_n\rangle-\langle e_j,v_n\rangle=0\end{aligned}

where we use the orthonormality of the collection \left\{e_i\right\}_{i=1}^{n-1} to show that most of these inner products come out to be zero.

So we’ve got a vector orthogonal to all the ones we collected so far, but it might not have unit length. So we normalize it:

\displaystyle e_n=\frac{w_n}{\lVert w_n\rVert}

and we’re done.

April 28, 2009 Posted by | Algebra, Linear Algebra | 37 Comments