The Unapologetic Mathematician

Mathematics for the interested outsider

Orthonormal Bases

Now that we have the Gram-Schmidt process as a tool, we can use it to come up with orthonormal bases.

Any inner product space V with finite dimension d has a finite basis \left\{v_i\right\}_{i=1}^d. This is exactly what it means for V to have dimension d. And now we can apply the Gram-Schmidt process to turn this basis into an orthonormal basis \left\{e_i\right\}_{i=1}^d.

We also know that any linearly independent set can be expanded to a basis. In fact, we can also extend any orthonormal collection of vectors to an orthonormal basis. Indeed, if \left\{e_i\right\}_{i=1}^n is an orthonormal collection, we can add the vectors \left\{v_i\right\}_{i=n+1}^d to fill out a basis. Then when we apply the Gram-Schmidt process to this basis it will start with e_1, which is already normalized. It then moves on to e_2, which is already orthogonal to e_1 and normalized, and so on. Each of the e_i is left unchanged, and the v_i are modified to make them orthonormal with the existing collection.
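
For a concrete check, here is a minimal numerical sketch assuming NumPy (none of the names below come from the post): start with a single unit vector in \mathbb{R}^3, pad it out to a basis with standard basis vectors, and run the Gram-Schmidt steps by hand. The vector we started with comes through unchanged.

```python
import numpy as np

# Start with an orthonormal "collection" of one vector in R^3...
e1 = np.array([1.0, 1.0, 0.0]) / np.sqrt(2)
# ...and pad it with vectors that fill out a basis.
v2, v3 = np.array([1.0, 0.0, 0.0]), np.array([0.0, 0.0, 1.0])

# Gram-Schmidt step for v2: subtract its component along e1, then normalize.
w2 = v2 - np.dot(e1, v2) * e1
e2 = w2 / np.linalg.norm(w2)

# Gram-Schmidt step for v3: subtract components along e1 and e2, then normalize.
w3 = v3 - np.dot(e1, v3) * e1 - np.dot(e2, v3) * e2
e3 = w3 / np.linalg.norm(w3)

E = np.column_stack([e1, e2, e3])
print(np.allclose(E.T @ E, np.eye(3)))  # True: an orthonormal basis
# e1 itself was never modified: it came first and was already normalized.
```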

April 30, 2009 Posted by | Algebra, Linear Algebra | 5 Comments

The Gram-Schmidt Process

Now that we have a real or complex inner product, we have notions of length and angle. This lets us define what it means for a collection of vectors to be “orthonormal”: each pair of distinct vectors is perpendicular, and each vector has unit length. In formulas, we say that the collection \left\{e_i\right\}_{i=1}^n is orthonormal if \langle e_i,e_j\rangle=\delta_{i,j}. These can be useful things to have, but how do we get our hands on them?

It turns out that if we have a linearly independent collection of vectors \left\{v_i\right\}_{i=1}^n then we can come up with an orthonormal collection \left\{e_i\right\}_{i=1}^n spanning the same subspace of V. Even better, we can pick it so that the first k vectors \left\{e_i\right\}_{i=1}^k span the same subspace as \left\{v_i\right\}_{i=1}^k. The method goes back to Laplace and Cauchy, but gets its name from Jørgen Gram and Erhard Schmidt.

We proceed by induction on the number of vectors in the collection. If n=1, then we simply set

\displaystyle e_1=\frac{v_1}{\lVert v_1\rVert}

This “normalizes” the vector to have unit length, but doesn’t change its direction. It spans the same one-dimensional subspace, and since it’s alone it forms an orthonormal collection.

Now, let's assume the procedure works for collections of size n-1 and start out with a linearly independent collection of n vectors. First, we can orthonormalize the first n-1 vectors using our inductive hypothesis. This gives a collection \left\{e_i\right\}_{i=1}^{n-1} which spans the same subspace as \left\{v_i\right\}_{i=1}^{n-1} (and so on down, as noted above). But v_n isn’t in the subspace spanned by the first n-1 vectors (or else the original collection wouldn’t have been linearly independent). So it points at least somewhat in a new direction.

To find this new direction, we define

\displaystyle w_n=v_n-\langle e_1,v_n\rangle e_1-...-\langle e_{n-1},v_n\rangle e_{n-1}

This vector will be orthogonal to all the vectors from e_1 to e_{n-1}, since for any such e_j we can check

\displaystyle\begin{aligned}\langle e_j,w_n\rangle&=\langle e_j,v_n-\langle e_1,v_n\rangle e_1-...-\langle e_{n-1},v_n\rangle e_{n-1}\rangle\\&=\langle e_j,v_n\rangle-\langle e_1,v_n\rangle\langle e_j,e_1\rangle-...-\langle e_{n-1},v_n\rangle\langle e_j,e_{n-1}\rangle\\&=\langle e_j,v_n\rangle-\langle e_j,v_n\rangle=0\end{aligned}

where we use the orthonormality of the collection \left\{e_i\right\}_{i=1}^{n-1} to show that most of these inner products come out to be zero.

So we’ve got a vector orthogonal to all the ones we collected so far, but it might not have unit length. So we normalize it:

\displaystyle e_n=\frac{w_n}{\lVert w_n\rVert}

and we’re done.
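
Here is a minimal implementation sketch of the procedure, assuming NumPy and the standard dot product on \mathbb{R}^n (the function name and the error check are my own additions, not from the post); it follows the inductive construction literally rather than aiming for numerical robustness.

```python
import numpy as np

def gram_schmidt(vectors):
    """Orthonormalize a linearly independent collection of real vectors.

    Each new vector has its components along the already-built e_i
    subtracted off, and the remainder is then normalized.
    """
    es = []
    for v in vectors:
        w = v - sum(np.dot(e, v) * e for e in es)  # w_n = v_n - sum <e_i, v_n> e_i
        norm = np.linalg.norm(w)
        if np.isclose(norm, 0.0):
            raise ValueError("vectors are not linearly independent")
        es.append(w / norm)                        # e_n = w_n / ||w_n||
    return es

rng = np.random.default_rng(0)
vs = list(rng.normal(size=(4, 4)))    # four random vectors in R^4
es = gram_schmidt(vs)
G = np.array([[np.dot(a, b) for b in es] for a in es])
print(np.allclose(G, np.eye(4)))      # True: <e_i, e_j> = delta_ij
```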

April 28, 2009 Posted by | Algebra, Linear Algebra | 37 Comments

The Parallelogram Law

There’s an interesting little identity that holds for norms — translation-invariant metrics on vector spaces over \mathbb{R} or \mathbb{C} — that come from inner products. Even more interestingly, it actually characterizes such norms.

Geometrically, if we have a parallelogram whose two sides from the same point are given by the vectors v and w, then we can construct the two diagonals v+w and v-w. It then turns out that the sum of the squares on all four sides is equal to the sum of the squares on the diagonals. We write this formally by saying

\displaystyle\lVert v+w\rVert^2+\lVert v-w\rVert^2=2\lVert v\rVert^2+2\lVert w\rVert^2

where we’ve used the fact that opposite sides of a parallelogram have the same length. Verifying this identity is straightforward, using the definition of the norm-squared:

\displaystyle\begin{aligned}\lVert v+w\rVert^2+\lVert v-w\rVert^2&=\langle v+w,v+w\rangle+\langle v-w,v-w\rangle\\&=\langle v,v\rangle+\langle v,w\rangle+\langle w,v\rangle+\langle w,w\rangle\\&+\langle v,v\rangle-\langle v,w\rangle-\langle w,v\rangle+\langle w,w\rangle\\&=2\langle v,v\rangle+2\langle w,w\rangle\\&=2\lVert v\rVert^2+2\lVert w\rVert^2\end{aligned}
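
For instance, here is a quick numerical confirmation of the law, assuming NumPy and the standard dot product on \mathbb{R}^3:

```python
import numpy as np

rng = np.random.default_rng(1)
v, w = rng.normal(size=3), rng.normal(size=3)

lhs = np.linalg.norm(v + w)**2 + np.linalg.norm(v - w)**2
rhs = 2 * np.linalg.norm(v)**2 + 2 * np.linalg.norm(w)**2
print(np.isclose(lhs, rhs))  # True for any v, w: the parallelogram law
```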

On the other hand, what if we have a norm that satisfies this parallelogram law? Then we can use the polarization identities to define a unique inner product.

\displaystyle\langle v,w\rangle=\frac{\lVert v+w\rVert^2-\lVert v-w\rVert^2}{4}+i\frac{\lVert v-iw\rVert^2-\lVert v+iw\rVert^2}{4}

where we ignore the second term when working over real vector spaces.

However, if we have a norm that does not satisfy the parallelogram law and try to use it in these formulas, then the resulting form must fail to be an inner product. If we did get an inner product, then the norm would satisfy the parallelogram law, which it doesn’t.

Now, I haven’t given any examples of norms on vector spaces which don’t satisfy the parallelogram law, but they show up all the time in functional analysis. For now I just want to point out that such things do, in fact, exist.
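
As one standard example, not worked out in the post itself and offered here only as an aside: the \ell^1 (“taxicab”) norm on \mathbb{R}^2 already fails the law, which a two-line computation (sketched with NumPy) makes plain.

```python
import numpy as np

def norm1(x):
    return np.sum(np.abs(x))  # the l^1 ("taxicab") norm

v, w = np.array([1.0, 0.0]), np.array([0.0, 1.0])

lhs = norm1(v + w)**2 + norm1(v - w)**2   # 4 + 4 = 8
rhs = 2 * norm1(v)**2 + 2 * norm1(w)**2   # 2 + 2 = 4
print(lhs, rhs, np.isclose(lhs, rhs))     # 8.0 4.0 False: the law fails
```

So the taxicab norm cannot come from any inner product.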

April 24, 2009 Posted by | Algebra, Linear Algebra | 4 Comments

The Polarization Identities

If we have an inner product on a real or complex vector space, we get a notion of length called a “norm”. It turns out that the norm completely determines the inner product.

Let’s take the sum of two vectors v and w. We can calculate its norm-squared as usual:

\displaystyle\begin{aligned}\lVert v+w\rVert^2&=\langle v+w,v+w\rangle\\&=\langle v,v\rangle+\langle v,w\rangle+\langle w,v\rangle+\langle w,w\rangle\\&=\lVert v\rVert^2+\lVert w\rVert^2+\langle v,w\rangle+\overline{\langle v,w\rangle}\\&=\lVert v\rVert^2+\lVert w\rVert^2+2\Re\left(\langle v,w\rangle\right)\end{aligned}

where \Re(z) denotes the real part of the complex number z. If z is already a real number, it does nothing.

So we can rewrite this equation as

\displaystyle\Re\left(\langle v,w\rangle\right)=\frac{1}{2}\left(\lVert v+w\rVert^2-\lVert v\rVert^2-\lVert w\rVert^2\right)

If we’re working over a real vector space, this is the inner product itself. Over a complex vector space, this only gives us the real part of the inner product. But all is not lost! We can also work out

\displaystyle\begin{aligned}\lVert v+iw\rVert^2&=\langle v+iw,v+iw\rangle\\&=\langle v,v\rangle+\langle v,iw\rangle+\langle iw,v\rangle+\langle iw,iw\rangle\\&=\lVert v\rVert^2+\lVert iw\rVert^2+\langle v,iw\rangle+\overline{\langle v,iw\rangle}\\&=\lVert v\rVert^2+\lVert w\rVert^2+2\Re\left(i\langle v,w\rangle\right)\\&=\lVert v\rVert^2+\lVert w\rVert^2-2\Im\left(\langle v,w\rangle\right)\end{aligned}

where \Im(z) denotes the imaginary part of the complex number z. The last equality holds because

\displaystyle\Re\left(i(a+bi)\right)=\Re(ai-b)=-b=-\Im(a+bi)

so we can write

\displaystyle\Im\left(\langle v,w\rangle\right)=\frac{1}{2}\left(\lVert v\rVert^2+\lVert w\rVert^2-\lVert v+iw\rVert^2\right)

We can also write these identities out in a couple other ways. If we started with v-w, we could find the identities

\displaystyle\Re\left(\langle v,w\rangle\right)=\frac{1}{2}\left(\lVert v\rVert^2+\lVert w\rVert^2-\lVert v-w\rVert^2\right)
\displaystyle\Im\left(\langle v,w\rangle\right)=\frac{1}{2}\left(\lVert v-iw\rVert^2-\lVert v\rVert^2-\lVert w\rVert^2\right)

Or we could combine both forms above to write

\displaystyle\Re\left(\langle v,w\rangle\right)=\frac{1}{4}\left(\lVert v+w\rVert^2-\lVert v-w\rVert^2\right)
\displaystyle\Im\left(\langle v,w\rangle\right)=\frac{1}{4}\left(\lVert v-iw\rVert^2-\lVert v+iw\rVert^2\right)

In all these ways we see that not only does an inner product on a real or complex vector space give us a norm, but the resulting norm completely determines the inner product. Different inner products necessarily give rise to different norms.
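
Here is a quick numerical check of the combined identities, assuming NumPy; np.vdot conjugates its first argument, which matches the linearity-in-the-second-slot convention used in the computations above.

```python
import numpy as np

rng = np.random.default_rng(2)
v = rng.normal(size=3) + 1j * rng.normal(size=3)
w = rng.normal(size=3) + 1j * rng.normal(size=3)

def nsq(x):                      # the norm-squared coming from the inner product
    return np.linalg.norm(x)**2

re = (nsq(v + w) - nsq(v - w)) / 4
im = (nsq(v - 1j*w) - nsq(v + 1j*w)) / 4

# The norm alone recovers the full inner product <v, w>.
print(np.isclose(re + 1j*im, np.vdot(v, w)))  # True
```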

April 23, 2009 Posted by | Algebra, Linear Algebra | 5 Comments

Complex Inner Products

Now consider a complex vector space. We can define bilinear forms, and even ask that they be symmetric and nondegenerate. But there’s no way for such a form to be positive-definite. Indeed, we saw that there isn’t even a notion of “order” on the field of complex numbers. They do contain the real numbers as a subfield, but we can’t manage to stay in the positive real numbers. Indeed, if we have \langle v,v\rangle=a+0i for some real a\geq0, then we also have \langle iv,iv\rangle=i^2(a+0i)=-a+0i. So it seems we aren’t going to get the same geometric interpretations this way.

But let’s slow down and look at a one-dimensional complex vector space — the field of complex numbers itself. We do have a notion of length here. We define the length of a complex number z=a+bi as the square root of \bar{z}z=(a-bi)(a+bi)=a^2+b^2. This quantity is always a nonnegative real number, and thus always has a square root. And it looks sort of like how we compute the squared length of a vector with a bilinear form. Indeed, if we think of \mathbb{C} as a real vector space with basis \{1,i\}, it’s exactly the norm we get when we define this basis to be orthonormal. The only thing weird is that conjugation.

Well, let’s run with this a while. Given a complex vector space V, we want a form \langle\underline{\hphantom{X}},\underline{\hphantom{X}}\rangle which is

  • linear in the second slot — \langle u,av+bw\rangle=a\langle u,v\rangle+b\langle u,w\rangle
  • conjugate symmetric — \langle v,w\rangle=\overline{\langle w,v\rangle}

Conjugate symmetry implies that the form is conjugate linear in the first slot — \langle av+bw,u\rangle=\bar{a}\langle v,u\rangle+\bar{b}\langle w,u\rangle — and also that \langle v,v\rangle=\overline{\langle v,v\rangle} is always real. This makes it reasonable to also ask that the form be

  • positive definite — \langle v,v\rangle>0 for all v\neq0

This mixture of being linear in one variable and “half-linear” in the other makes the whole form “one and a half” times linear, or “sesquilinear”.

Anyhow, now we do get a notion of length, defined by setting \lVert v\rVert^2=\langle v,v\rangle as before. What about angle? That will depend directly on the Cauchy-Schwarz inequality, assuming it holds. We’ll check that now.

Our previous proof doesn’t really work, since our scalars are now complex, and we can’t argue that certain polynomials have no zeroes. But we can modify it. We start similarly, calculating

\displaystyle0\leq\langle v-tw,v-tw\rangle=\langle v,v\rangle-t\langle v,w\rangle-\bar{t}\langle w,v\rangle+\bar{t}t\langle w,w\rangle

Now the Cauchy-Schwarz inequality is trivial if w=0, so we may assume \langle w,w\rangle\neq0, and set t=\frac{\langle w,v\rangle}{\langle w,w\rangle}. Then we see

\displaystyle\begin{aligned}0&\leq\langle v,v\rangle-\frac{\langle w,v\rangle\langle v,w\rangle}{\langle w,w\rangle}-\frac{\langle v,w\rangle\langle w,v\rangle}{\langle w,w\rangle}+\frac{\langle w,v\rangle\langle v,w\rangle}{\langle w,w\rangle}\\&=\langle v,v\rangle-\frac{\lvert\langle v,w\rangle\rvert^2}{\langle w,w\rangle}\end{aligned}

Multiplying through by \langle w,w\rangle and rearranging, we find

\displaystyle\lvert\langle v,w\rangle\rvert^2\leq\langle v,v\rangle\langle w,w\rangle

which is the complex version of the Cauchy-Schwarz inequality. And then just as in the real case we can write it as

\displaystyle\frac{\lvert\langle v,w\rangle\rvert^2}{\lVert v\rVert^2\lVert w\rVert^2}\leq1

which implies that

\displaystyle0\leq\frac{\lvert\langle v,w\rangle\rvert}{\lVert v\rVert\lVert w\rVert}\leq1

which we can again interpret as the cosine of an angle.

So all the same notions of length and angle can be recovered from this sort of complex inner product.
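
A numerical spot check of the complex Cauchy-Schwarz inequality, assuming NumPy (again, np.vdot conjugates its first argument, matching the sesquilinearity convention here):

```python
import numpy as np

rng = np.random.default_rng(3)
v = rng.normal(size=4) + 1j * rng.normal(size=4)
w = rng.normal(size=4) + 1j * rng.normal(size=4)

lhs = abs(np.vdot(v, w))**2                     # |<v, w>|^2
rhs = np.vdot(v, v).real * np.vdot(w, w).real   # <v, v><w, w>, both real and positive
print(lhs <= rhs + 1e-12)                       # True: complex Cauchy-Schwarz
print(abs(np.vdot(v, w)) / (np.linalg.norm(v) * np.linalg.norm(w)))  # lands in [0, 1]
```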

April 22, 2009 Posted by | Algebra, Linear Algebra | 13 Comments

Inner Products and Lengths

We’re still looking at a real vector space V with an inner product. We used the Cauchy-Schwarz inequality to define a notion of angle between two vectors.

\displaystyle\cos(\theta)=\frac{\lvert\langle v,w\rangle\rvert}{\langle v,v\rangle^{1/2}\langle w,w\rangle^{1/2}}

Let’s take a closer look at those terms in the denominator. What happens when we compute \langle v,v\rangle? Well, if we’ve got an orthonormal basis around and we write v=v^ie_i in components, we can calculate

\displaystyle\langle v,v\rangle=\sum\limits_{i=1}^d\left(v^i\right)^2

The v^i are distances we travel in each of the mutually-orthogonal directions given by the vectors e_i. But then this formula looks a lot like the Pythagorean theorem about calculating the square of the resulting distance. It may make sense to define this as the square of the length of v, and so the quantities in the denominator above were the lengths of v and w, respectively.

Let’s be a little more formal. We want to define something called a “norm”, which is a notion of length on a vector space. If we think of a vector v as an arrow pointing from the origin (the zero vector) to the point at its tip, we should think of the norm \lVert v\rVert as the distance between these two points. Similarly, the distance between the tips of v and w should be the length of the displacement vector v-w which points from one to the other. But a notion of distance is captured in the idea of a metric! So whatever a norm is, it should give rise to a metric by defining the distance d(v,w) as the norm of v-w.

Here are some axioms: A function from V to \mathbb{R} is a norm, written \lVert v\rVert, if

  • For all vectors v and scalars c, we have \lVert cv\rVert=\lvert c\rvert\lVert v\rVert.
  • For all vectors v and w, we have \lVert v+w\rVert\leq\lVert v\rVert+\lVert w\rVert.
  • The norm \lVert v\rVert is zero if and only if the vector v is the zero vector.

The first of these is eminently sensible, stating that multiplying a vector by a scalar should multiply the length of the vector by the size (absolute value) of the scalar. The second is essentially the triangle inequality in a different guise, and the third says that nonzero vectors have nonzero lengths.

Putting these axioms together we can work out

\displaystyle0=\lVert0\rVert=\lVert v-v\rVert\leq\lVert v\rVert+\lVert -v\rVert=\lVert v\rVert+\lvert-1\rvert\lVert v\rVert=2\lVert v\rVert

And thus every vector’s norm is nonnegative. From here it’s straightforward to check the conditions in the definition of a metric.

All this is well and good, but does an inner product give rise to a norm? Well, the third condition is direct from the definiteness of the inner product. For the first condition, let’s check

\displaystyle\sqrt{\langle cv,cv\rangle}=\sqrt{c^2\langle v,v\rangle}=\sqrt{c^2}\sqrt{\langle v,v\rangle}=\lvert c\rvert\sqrt{\langle v,v\rangle}

as we’d hope. Finally, let’s check the triangle inequality. We’ll start with

\displaystyle\begin{aligned}\lVert v+w\rVert^2&=\langle v+w,v+w\rangle\\&=\langle v,v\rangle+2\langle v,w\rangle+\langle w,w\rangle\\&\leq\lVert v\rVert^2+2\lvert\langle v,w\rangle\rvert+\lVert w\rVert^2\\&\leq\lVert v\rVert^2+2\lVert v\rVert\lVert w\rVert+\lVert w\rVert^2\\&=\left(\lVert v\rVert+\lVert w\rVert\right)^2\end{aligned}

where the second inequality uses the Cauchy-Schwarz inequality. Taking square roots (which preserves order) gives us the triangle inequality, and thus verifies that we do indeed get a norm, and a notion of length.
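
As a sketch of all this, assuming NumPy and the standard inner product on \mathbb{R}^n (the helper name is mine), we can define the norm from the inner product and spot-check the three axioms on random vectors:

```python
import numpy as np

def norm(v):
    return np.sqrt(np.dot(v, v))   # ||v|| = sqrt(<v, v>)

rng = np.random.default_rng(4)
v, w = rng.normal(size=5), rng.normal(size=5)
c = rng.normal()

print(np.isclose(norm(c * v), abs(c) * norm(v)))   # scaling: ||cv|| = |c| ||v||
print(norm(v + w) <= norm(v) + norm(w) + 1e-12)    # triangle inequality
print(norm(np.zeros(5)) == 0 and norm(v) > 0)      # only the zero vector has norm zero
```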

April 21, 2009 Posted by | Algebra, Geometry, Linear Algebra | 18 Comments

Inner Products and Angles

We again consider a real vector space V with an inner product. We’re going to use the Cauchy-Schwarz inequality to give geometric meaning to this structure.

First of all, for nonzero vectors v and w we can rewrite the inequality as

\displaystyle\frac{\langle v,w\rangle^2}{\langle v,v\rangle\langle w,w\rangle}\leq1

Since the inner product is positive definite, the denominator is positive and the whole quantity is nonnegative. And so we can take its square root to find

\displaystyle-1\leq\frac{\lvert\langle v,w\rangle\rvert}{\langle v,v\rangle^{1/2}\langle w,w\rangle^{1/2}}\leq1

This range is exactly that of the cosine function. Let’s consider the cosine restricted to the interval \left[0,\pi\right], where it’s injective. Here we can define an inverse function, the “arccosine”. Using the geometric view on the cosine, the inverse takes a value between -1 and 1 and considers the point with that x-coordinate on the upper half of the unit circle. The arccosine is then the angle made between the positive x-axis and the ray through this point, as a number between 0 and \pi.

So let’s take this arccosine function and apply it to the value above. We define the angle \theta between vectors v and w by

\displaystyle\cos(\theta)=\frac{\lvert\langle v,w\rangle\rvert}{\langle v,v\rangle^{1/2}\langle w,w\rangle^{1/2}}

Some immediate consequences show that this definition makes sense. First of all, what’s the angle between v and itself? We find

\displaystyle\cos(\theta)=\frac{\lvert\langle v,v\rangle\rvert}{\langle v,v\rangle^{1/2}\langle v,v\rangle^{1/2}}=1

and so \theta=0. A vector makes no angle with itself. Secondly, what if we take two vectors from an orthonormal basis \left\{e_i\right\}? We calculate

\displaystyle\cos(\theta_{ij})=\frac{\lvert\langle e_i,e_j\rangle\rvert}{\langle e_i,e_i\rangle^{1/2}\langle e_j,e_j\rangle^{1/2}}=\delta_{ij}

If we pick the same vector twice, we already know we get \theta_{ii}=0, but if we pick two different vectors we find that \cos(\theta_{ij})=0, and thus \theta_{ij}=\frac{\pi}{2}. That is, two different vectors in an orthonormal basis are perpendicular, or “orthogonal”.
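
For example, a small numerical sketch with NumPy (the function name is mine): the angle between (1,0) and (1,1) comes out to \pi/4, two distinct standard basis vectors give \pi/2, and a vector makes no angle with itself.

```python
import numpy as np

def angle(v, w):
    """Angle defined by cos(theta) = |<v, w>| / (||v|| ||w||)."""
    c = abs(np.dot(v, w)) / (np.linalg.norm(v) * np.linalg.norm(w))
    return np.arccos(np.clip(c, -1.0, 1.0))   # clip guards against roundoff

print(np.isclose(angle(np.array([1.0, 0.0]), np.array([1.0, 1.0])), np.pi / 4))  # True
print(np.isclose(angle(np.array([1.0, 0.0]), np.array([0.0, 1.0])), np.pi / 2))  # True
print(np.isclose(angle(np.array([2.0, 3.0]), np.array([2.0, 3.0])), 0.0))        # True
```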

April 17, 2009 Posted by | Algebra, Geometry, Linear Algebra | 15 Comments

The Cauchy-Schwarz Inequality

Today I want to present a deceptively simple fact about spaces equipped with inner products. The Cauchy-Schwarz inequality states that

\displaystyle\langle v,w\rangle^2\leq\langle v,v\rangle\langle w,w\rangle

for any vectors v,w\in V. The proof uses a neat little trick. We take a scalar t and construct the vector v+tw. Now the positive-definiteness, bilinearity, and symmetry of the inner product tells us that

\displaystyle0\leq\langle v+tw,v+tw\rangle=\langle v,v\rangle+2\langle v,w\rangle t+t^2\langle w,w\rangle

This is a quadratic function of the real variable t (the inequality is trivial if w=0, so we may assume the leading coefficient \langle w,w\rangle is nonzero). It can have at most one zero, which happens when there is some value t_0 such that v+t_0w is the zero vector, but it definitely can’t have two distinct zeroes. That is, it’s either a perfect square or an irreducible quadratic. In either case its discriminant must be nonpositive, and so we conclude

\displaystyle\left(2\langle v,w\rangle\right)^2-4\langle w,w\rangle\langle v,v\rangle\leq0

which is easily seen to be equivalent to the Cauchy-Schwarz inequality above. As a side effect, we see that we only get an equality (rather than an inequality) when v and w are linearly dependent.
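
Here is a quick check of both the discriminant argument and the resulting inequality, assuming NumPy and the standard dot product on \mathbb{R}^n:

```python
import numpy as np

rng = np.random.default_rng(5)
v, w = rng.normal(size=6), rng.normal(size=6)

# The quadratic q(t) = <v,v> + 2<v,w> t + t^2 <w,w> is nonnegative for all t...
disc = (2 * np.dot(v, w))**2 - 4 * np.dot(w, w) * np.dot(v, v)
print(disc <= 0)                                           # ...so its discriminant is <= 0
print(np.dot(v, w)**2 <= np.dot(v, v) * np.dot(w, w))      # equivalently, Cauchy-Schwarz

# Equality holds exactly when v and w are linearly dependent.
u = 2.5 * v
print(np.isclose(np.dot(v, u)**2, np.dot(v, v) * np.dot(u, u)))  # True
```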

April 16, 2009 Posted by | Algebra, Linear Algebra | 5 Comments

Real Inner Products

Now that we’ve got bilinear forms, let’s focus in on when the base field is \mathbb{R}. We’ll also add the requirement that our bilinear forms be symmetric. As we saw, a bilinear form B:V\otimes V\rightarrow\mathbb{R} corresponds to a linear transformation B_1:V\rightarrow V^*. Since B is symmetric, the matrix of B_1 must itself be symmetric with respect to any basis. So let’s try to put it into a canonical form!

We know that we can put B into the almost upper-triangular form

\displaystyle\begin{pmatrix}A_1&&*\\&\ddots&\\{0}&&A_m\end{pmatrix}

but now all the blocks above the diagonal must be zero, since they have to equal the blocks below the diagonal. On the diagonal, the 1\times1 blocks are fine, but the 2\times2 blocks must themselves be symmetric. That is, they must look like

\displaystyle\begin{pmatrix}a&b\\b&d\end{pmatrix}

which gives a characteristic polynomial of X^2-(a+d)X+(ad-b^2) for the block. But recall that we could only use this block if there were no eigenvalues. And, indeed, we can check

\displaystyle\begin{aligned}\tau^2-4\delta&=(a+d)^2-4(ad-b^2)\\&=a^2+2ad+d^2-4ad+4b^2\\&=a^2-2ad+d^2+4b^2\\&=(a-d)^2+4b^2\geq0\end{aligned}

The discriminant is nonnegative, so the characteristic polynomial has real roots, and this 2\times2 block will break down into two 1\times1 blocks. Thus any symmetric real matrix can be diagonalized, which means that any symmetric real bilinear form has a basis with respect to which its matrix is diagonal.

Let \left\{e_i\right\} be such a basis. To be explicit, this means that \langle e_i,e_j\rangle=b_i\delta_{ij}, where the b_i are real numbers and \delta_{ij} is the Kronecker delta: 1 if its indices match, and 0 if they don’t. But we still have some freedom. If I multiply e_i by a scalar c, we find \langle ce_i,ce_i\rangle=c^2b_i. Whenever b_i\neq0 we can find some c so that c^2=\frac{1}{|b_i|}, and so we can always pick our basis so that each b_i is 1, -1, or 0. We’ll call such a basis “orthonormal”.

The number of diagonal entries b_i with each of these three values won’t depend on the orthonormal basis we choose. The form is nondegenerate if and only if there are no 0 entries on the diagonal. If not, we can decompose V as the direct sum of the subspace \bar{V} on which the form is nondegenerate, and the remainder W on which the form is completely degenerate. That is, \langle w_1,w_2\rangle=0 for all w_1,w_2\in W. We’ll only consider nondegenerate bilinear forms from here on out.

We write p for the number of diagonal entries equal to 1, and q for the number equal to -1. Then the pair (p,q) is called the signature of the form. Clearly for nondegenerate forms, p+q=d, the dimension of V. We’ll have reason to consider some different signatures in the future, but for now we’ll be mostly concerned with the signature (d,0). In this case we call the form positive definite, since we can calculate

\displaystyle\langle v,v\rangle=v^iv^j\langle e_i,e_j\rangle=v^iv^j\delta_{ij}=\sum\limits_{i=1}^d\left(v^i\right)^2

The form is called “positive”, since this result is always nonnegative, and “definite”, since this result can only be zero if v is the zero vector.

This is what we’ll call an inner product on a real vector space V — a nondegenerate, positive definite, symmetric bilinear form \langle\underbar{\hphantom{X}},\underbar{\hphantom{X}}\rangle:V\otimes V\rightarrow\mathbb{R}. Notice that choosing such a form picks out a certain class of bases as orthonormal. Conversely, if we choose any basis \left\{e_i\right\} at all we can create a form by insisting that this basis be orthonormal. Just define \langle e_i,e_j\rangle=\delta_{ij} and extend by bilinearity.
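
Here is a short sketch of that last construction, assuming NumPy (the names are mine): pick any basis of \mathbb{R}^2, declare it orthonormal, and compute the resulting form by expanding vectors in that basis.

```python
import numpy as np

# Any basis of R^2, written as the columns of an invertible matrix E.
E = np.array([[2.0, 1.0],
              [0.0, 1.0]])
E_inv = np.linalg.inv(E)

def form(v, w):
    """The bilinear form defined by declaring the columns of E orthonormal."""
    # Expand v and w in the chosen basis and take the standard dot product of
    # their coordinate vectors, so that <e_i, e_j> = delta_ij by construction.
    return np.dot(E_inv @ v, E_inv @ w)

e1, e2 = E[:, 0], E[:, 1]
print(np.isclose(form(e1, e1), 1.0), np.isclose(form(e1, e2), 0.0))  # True True
v = np.array([3.0, -1.0])
print(form(v, v) > 0)   # positive definite: a nonzero vector has positive squared length
```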

April 15, 2009 Posted by | Algebra, Linear Algebra | 14 Comments

Bilinear Forms

Now that we’ve said a lot about individual operators on vector spaces, I want to go back and consider some other sorts of structures we can put on the space itself. Foremost among these is the idea of a bilinear form. This is really nothing but a bilinear function to the base field: B:V\times V\rightarrow\mathbb{F}. Of course, this means that it’s equivalent to a linear function from the tensor square: B:V\otimes V\rightarrow\mathbb{F}.

Instead of writing this as a function, we will often use a slightly different notation. We write a bracket B(v,w)=\langle v,w\rangle, or sometimes \langle v,w\rangle_B if we need to specify which of several bilinear forms is under consideration.

Another viewpoint comes from recognizing that we’ve got a duality for vector spaces. This lets us rewrite our bilinear form B:V\otimes V\rightarrow\mathbb{F} as a linear transformation B_1:V\rightarrow V^*. We can view this as saying that once we pick the first vector v\in V, the bilinear form reduces to a linear functional B_1(v)=\langle v,\underbar{\hphantom{X}}\rangle:V\rightarrow\mathbb{F}, which is a vector in the dual space V^*. Or we could focus on the other slot and define B_2(v)=\langle\underbar{\hphantom{X}},v\rangle\in V^*.

We know that the dual space of a finite-dimensional vector space has the same dimension as the space itself, which raises the possibility that B_1 or B_2 is an isomorphism from V to V^*. If either one is, then both are, and we say that the bilinear form B is nondegenerate.

We can also note that there is a symmetry on the category of vector spaces. That is, we have a linear transformation \tau_{V,V}:V\otimes V\rightarrow V\otimes V defined by \tau_{V,V}(v\otimes w)=w\otimes v. This makes it natural to ask what effect this has on our form. Two obvious possibilities are that B\circ\tau_{V,V}=B and that B\circ\tau_{V,V}=-B. In the first case we’ll call the bilinear form “symmetric”, and in the second we’ll call it “antisymmetric”. In terms of the maps B_1 and B_2, we see that composing B with the symmetry swaps the roles of these two functions. For symmetric bilinear forms, B_1=B_2, while for antisymmetric bilinear forms we have B_1=-B_2.

This leads us to consider nondegenerate bilinear forms a little more. If B_2 is an isomorphism it has an inverse B_2^{-1}. Then we can form the composite B_2^{-1}\circ B_1:V\rightarrow V. If B is symmetric then this composition is the identity transformation on V. On the other hand, if B is antisymmetric then this composition is the negative of the identity transformation. Thus, the composite transformation measures how much the bilinear form diverges from symmetry. Accordingly, we call it the asymmetry of the form B.

Finally, if we’re working over a finite-dimensional vector space we can pick a basis \left\{e_i\right\} for V, and get a matrix for B. We define the matrix entry B_{ij}=\langle e_i,e_j\rangle_B. Then if we have vectors v=v^ie_i and w=w^je_j we can calculate

\displaystyle\langle v,w\rangle=\langle v^ie_i,w^je_j\rangle=v^iw^j\langle e_i,e_j\rangle=v^iw^jB_{ij}

In terms of this basis and its dual basis \left\{\epsilon^j\right\}, we find that the linear transformation B_1 sends v to the linear functional B_1(v)=\langle v,\underbar{\hphantom{X}}\rangle=v^iB_{ij}\epsilon^j. That is, the matrix can also be used to represent the partial maps B_1 and B_2. If B is symmetric, then the matrix is symmetric, B_{ij}=B_{ji}, while if it’s antisymmetric then B_{ij}=-B_{ji}.
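
In coordinates this is just a matrix computation, \langle v,w\rangle=v^\top Bw. Here is a brief NumPy sketch, with a matrix chosen arbitrarily for illustration:

```python
import numpy as np

# The matrix of a bilinear form on R^2 with respect to the standard basis:
# B[i, j] = <e_i, e_j>.  This particular B is neither symmetric nor antisymmetric.
B = np.array([[1.0, 2.0],
              [0.0, 3.0]])

def bform(v, w):
    return v @ B @ w          # <v, w> = v^i w^j B_ij

v, w = np.array([1.0, -1.0]), np.array([2.0, 1.0])
print(bform(v, w), bform(w, v))   # generally different values for this B
print(np.allclose(B, B.T))        # False: this form is not symmetric

# The partial map B_1(v) = <v, _> is the covector with components v^i B_ij.
print(v @ B)                      # its components in the dual basis
```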

April 14, 2009 Posted by | Algebra, Linear Algebra | 9 Comments