The Unapologetic Mathematician

Mathematics for the interested outsider

The Trace of a Linear Transformation

Again, let’s take a linear endomorphism T on a vector space V of finite dimension d. We know that its characteristic polynomial can be defined without reference to a basis of V, and so each of its coefficients is independent of any choice of basis. The leading coefficient is always {1}, so that’s not very interesting. The constant term gives (up to sign) the determinant, which we’d already met from other considerations. There’s one more coefficient we’re interested in, partly for the interesting properties we’ll explore, and partly for its ease of computation. This is the coefficient of \lambda^{d-1}.

So, let’s go back to our formula for the characteristic polynomial:

\displaystyle\sum\limits_{\pi\in S_d}\mathrm{sgn}(\pi)\prod\limits_{k=1}^d(\lambda\delta_k^{\pi(k)}-t_k^{\pi(k)})

Which terms can involve \lambda^{d-1}? Well, we can get one factor of \lambda every time we have k=\pi(k), and so we need this to happen at least d-1 times to get d-1 factors of \lambda. But if the permutation \pi sends all but one index back to itself, then the last index must also be fixed, since there’s nowhere else for it to go! So we only have to look at the term corresponding to the identity permutation. This will simplify our lives immensely.

Now we’re considering the product

\displaystyle\prod\limits_{k=1}^d(\lambda-t_k^k)

When we multiply this out, we make d choices. At each step we can either take the \lambda, or we can take the t_k^k. We’re interested in the terms where we take the \lambda exactly d-1 times. There are d ways of making this choice, corresponding to which one index gets the matrix entry rather than the \lambda. Incidentally, we could also think of this in terms of combinations, as d=\binom{d}{d-1}.

Anyhow, for each choice of one index to use the matrix entry instead of the variable, we’ll have a term -t_k^k\lambda^{d-1}. We add all of these up, summing over k — as our notation suggests we should! And now we have the second coefficient of the characteristic polynomial. We drop the negative sign and call this the “trace” of T:

\displaystyle\mathrm{Tr}(T)=\sum\limits_{k=1}^dt_k^k=t_k^k

where in the last formula we’re using the summation convention again. Incidentally, “trace” should be read as referring to a telltale sign that T has left behind, like a hunted animal’s… um… “leavings”.

Anyhow, we can now write out a few of the terms in the characteristic polynomial:

\det(\lambda1_V-T)=\lambda^d-\mathrm{Tr}(T)\lambda^{d-1}+\dots+(-1)^d\det(T)
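As a quick check on all of this (a sketch of ours, not part of the original argument), we can ask SymPy to expand \det(\lambda1_V-T) for a concrete matrix; the matrix M below is an arbitrary choice for illustration:

import sympy as sp

lam = sp.symbols('lam')

# An arbitrary 3x3 matrix, just for illustration.
M = sp.Matrix([[2, 1, 0],
               [0, 3, 4],
               [5, 0, 1]])
d = M.shape[0]

# The characteristic polynomial det(lambda*1_V - T).
p = (lam * sp.eye(d) - M).det().expand()

# The coefficient of lambda^(d-1) should be minus the trace.
assert p.coeff(lam, d - 1) == -M.trace()  # trace = 2 + 3 + 1 = 6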

January 30, 2009 Posted by | Algebra, Linear Algebra | 5 Comments

Extracting the Determinant from the Characteristic Polynomial

This one’s a pretty easy entry. If we know the characteristic polynomial of an endomorphism T on a vector space V of finite dimension d, then we can get its determinant from the constant term.

First let’s look at the formula for the characteristic polynomial in terms of the matrix entries of T:

\displaystyle\sum\limits_{\pi\in S_d}\mathrm{sgn}(\pi)\prod\limits_{k=1}^d(\lambda\delta_k^{\pi(k)}-t_k^{\pi(k)})

Now we’re interested in \det(T), which is exactly what we calculate to determine if the kernel of T is nontrivial. But the kernel of T is the eigenspace corresponding to eigenvalue {0}, so this should have something to do with the characteristic polynomial at \lambda=0. So let’s see what happens.

\displaystyle\sum\limits_{\pi\in S_d}\mathrm{sgn}(\pi)\prod\limits_{k=1}^d(-t_k^{\pi(k)})=\sum\limits_{\pi\in S_d}\mathrm{sgn}(\pi)(-1)^d\prod\limits_{k=1}^dt_k^{\pi(k)}=(-1)^d\sum\limits_{\pi\in S_d}\mathrm{sgn}(\pi)\prod\limits_{k=1}^dt_k^{\pi(k)}

This is just (-1)^d times our formula for the determinant of T. But of course we know the dimension ahead of time, so we know whether to flip the sign or not. So just take the characteristic polynomial, evaluate it at zero, and flip the sign if necessary to get the determinant.

There’s one thing to note here, even though it doesn’t really tell us anything new. We’ve said that T is noninvertible if and only if its determinant is zero. Now we know that this will happen if and only if the constant term of the characteristic polynomial is zero. In this case, the polynomial must have a root at \lambda=0, which means that the {0}-eigenspace of T is nontrivial. But this just says that the kernel of T is nontrivial. Thus (as we already know) a linear transformation is noninvertible if and only if its kernel is nontrivial.
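Here’s a minimal SymPy sketch of this recipe (the particular matrix is our own choice for the example):

import sympy as sp

lam = sp.symbols('lam')

M = sp.Matrix([[1, 2],
               [3, 4]])
d = M.shape[0]

# The characteristic polynomial det(lambda*1_V - T)...
p = (lam * sp.eye(d) - M).det().expand()

# ...evaluated at lambda = 0 gives (-1)^d times the determinant,
# so multiplying by (-1)^d recovers det(T).
assert (-1)**d * p.subs(lam, 0) == M.det()  # both sides equal -2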

January 29, 2009 Posted by | Algebra, Linear Algebra | 1 Comment

The Characteristic Polynomial

Given a linear endomorphism T:V\rightarrow V on a vector space V of dimension d, we’ve defined a function \det(T-\lambda1_V), whose zeroes give exactly those field elements \lambda so that the \lambda-eigenspace of T is nontrivial. Actually, we’ll switch it up a bit and use the function \det(\lambda1_V-T), which has the same useful property. Now let’s consider this function a little more deeply.

First off, if we choose a basis for V we have matrix representations of endomorphisms, and thus a formula for their determinants. For instance, if T is represented by the matrix with entries t_i^j, then its determinant is given by

\displaystyle\sum\limits_{\pi\in S_d}\mathrm{sgn}(\pi)\prod\limits_{k=1}^dt_k^{\pi(k)}

which is a sum of products of matrix entries. Now, the matrix entries for the transformation \lambda1_V-T are given by \lambda\delta_i^j-t_i^j. Each of these new entries is a polynomial (either constant or linear) in the variable \lambda. Any sum of products of polynomials is again a polynomial, and so our function is actually a polynomial in \lambda. We call it the “characteristic polynomial” of the transformation T. In terms of the matrix entries of T itself, we get

\displaystyle\sum\limits_{\pi\in S_d}\mathrm{sgn}(\pi)\prod\limits_{k=1}^d(\lambda\delta_k^{\pi(k)}-t_k^{\pi(k)})

What’s the degree of this polynomial? Well, first let’s consider the degree of each term in the sum. Given a permutation \pi\in S_d, the term is the product of d factors. The kth of these factors will be a field element if k\neq\pi(k), and will be a linear polynomial if k=\pi(k). Since multiplying polynomials adds their degrees, the degree of the \pi term will be the number of k such that k=\pi(k). Thus the highest possible degree happens if k=\pi(k) for all index values k. This only happens for one permutation — the identity — so there can’t be another term of the same degree to cancel the highest-degree monomial when we add them up. And so the characteristic polynomial has degree d, equal to the dimension of the vector space V.

What’s the leading coefficient? Again, the degree-d monomial can only show up once, in the term corresponding to the identity permutation. Specifically, this term is

\displaystyle\prod\limits_{k=1}^d(\lambda-t_k^k)

Each factor gives \lambda a coefficient of {1}, and so the coefficient of the \lambda^d term is also {1}. Thus the leading coefficient of the characteristic polynomial is {1} — a fact which turns out to be useful.
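To make the permutation sum concrete, here’s a sketch (the helper char_poly is our own name, checked against SymPy’s built-in charpoly) that computes the characteristic polynomial directly from the formula above and confirms it’s monic of degree d:

import itertools
import sympy as sp
from sympy.combinatorics import Permutation

lam = sp.symbols('lam')

def char_poly(M):
    """Characteristic polynomial via the sum over permutations."""
    d = M.shape[0]
    total = sp.Integer(0)
    for perm in itertools.permutations(range(d)):
        sign = Permutation(list(perm)).signature()  # sgn(pi)
        term = sp.Integer(1)
        for k in range(d):
            delta = 1 if perm[k] == k else 0        # delta_k^{pi(k)}
            term *= lam * delta - M[perm[k], k]     # lambda*delta - t
        total += sign * term
    return sp.expand(total)

M = sp.Matrix([[0, 1, 1],
               [1, 0, 1],
               [1, 1, 0]])
p = char_poly(M)

# Degree d, leading coefficient 1, and it matches the built-in.
assert sp.degree(p, lam) == 3 and p.coeff(lam, 3) == 1
assert p == M.charpoly(lam).as_expr()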

January 28, 2009 Posted by | Algebra, Linear Algebra | 13 Comments

Determining Eigenvalues

Yesterday, we defined eigenvalues, eigenvectors, and eigenspaces. But we didn’t talk about how to actually find them (though one commenter decided to jump the gun a bit). It turns out that determining the eigenspace for any given eigenvalue is the same sort of problem as determining the kernel.

Let’s say we’ve got a linear endomorphism T:V\rightarrow V and a scalar value \lambda. We want to determine the subspace of V consisting of those eigenvectors v satisfying the equation

T(v)=\lambda v

First, let’s adjust the right hand side. Instead of thinking of the scalar product of \lambda and v, we can write it as the action of the transformation \lambda1_V, where 1_V is the identity transformation on V. That is, we have the equation

T(v)=\left[\lambda1_V\right](v)

Now we can do some juggling to combine these two linear transformations being evaluated at the same vector v:

\left[T-\lambda1_V\right](v)=0

And we find that the \lambda-eigenspace of T is the kernel \mathrm{Ker}(T-\lambda1_V).

Now, as I stated yesterday most of these eigenspaces will be trivial, just as the kernel may be trivial. The interesting stuff happens when \mathrm{Ker}(T-\lambda1_V) is nontrivial. In this case, we’ll call \lambda an eigenvalue of the transformation T (thus the eigenvalues of a transformation are those scalars which correspond to nonzero eigenvectors). So how can we tell whether or not a kernel is trivial? Well, we know that the kernel of an endomorphism is trivial if and only if the endomorphism is invertible. And the determinant provides a test for invertibility!

So we can take the determinant \det(T-\lambda1_V) and consider it as a function of \lambda. If we get the value {0}, then the \lambda-eigenspace of T is nontrivial, and \lambda is an eigenvalue of T. Then we can use other tools to actually determine the eigenspace if we need to.
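In coordinates, this is a computation we can hand off to a computer algebra system. Here’s a SymPy sketch (the small symmetric matrix is our own choice of example): we find the eigenvalues as roots of \det(T-\lambda1_V), then each eigenspace as a kernel.

import sympy as sp

lam = sp.symbols('lam')

M = sp.Matrix([[2, 1],
               [1, 2]])
d = M.shape[0]

# lambda is an eigenvalue exactly when det(T - lambda*1_V) = 0.
eigenvalues = sp.solve((M - lam * sp.eye(d)).det(), lam)  # roots 1 and 3

# Each lambda-eigenspace is the kernel Ker(T - lambda*1_V).
for ev in eigenvalues:
    basis = (M - ev * sp.eye(d)).nullspace()
    print(ev, basis)  # 1: span{(-1, 1)};  3: span{(1, 1)}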

January 27, 2009 Posted by | Algebra, Linear Algebra | 4 Comments

Eigenvalues, Eigenvectors, and Eigenspaces

Okay, I’m back in Kentucky, and things should get back up to speed. For the near future I’ll be talking more about linear endomorphisms — transformations from a vector space to itself.

The absolute simplest thing that a linear transformation can do to a vector is to kill it off entirely. That is, given a linear transformation T:V\rightarrow V, it’s possible that T(v)=0 for some vectors v\in V. This is just what we mean by saying that v is in the kernel \mathrm{Ker}(T). We’ve seen before that the vectors in the kernel form a subspace of V.

What other simple things could T do to the vector v? One possibility is that T does nothing to v at all. That is, T(v)=v. We can call this vector v a “fixed point” of the transformation T. Notice that if v is a fixed point, then so is any scalar multiple of v. Indeed, T(cv)=cT(v)=cv by the linearity of T. Similarly, if v and w are both fixed points, then T(v+w)=T(v)+T(w)=v+w. Thus the fixed points of T also form a subspace of V.

What else could happen? Well, notice that the two above cases are related. The condition that v is in the kernel can be written T(v)=0v. The condition that it’s a fixed point can be written T(v)=1v. Each one says that the action of T on v is to multiply it by some fixed scalar. Let’s change that scalar and see what happens.

We’re now considering a linear transformation T and a vector v so that T(v)=\lambda v for some scalar \lambda. That is, T hasn’t changed the direction of v, but only its length. We call such a vector an “eigenvector” of T, and the corresponding scalar its “eigenvalue”. In contexts where our vector space is a space of functions, it’s common (especially among quantum physicists) to use the term “eigenfunction” instead of eigenvector, and even weirder applications of the “eigen-” prefix, but these are almost always just special cases of eigenvectors.

Now it turns out that the eigenvectors associated to any particular eigenvalue form a subspace of V. If we assume that v and w are both eigenvectors with eigenvalue \lambda, and that c is another scalar, then we can check

T(cv)=cT(v)=c\lambda v=\lambda cv

and

T(v+w)=T(v)+T(w)=\lambda v+\lambda w=\lambda(v+w)

We call the subspace of eigenvectors with eigenvalue \lambda the \lambda-eigenspace of T. Notice here that the {0}-eigenspace is the kernel of T, and the {1}-eigenspace is the subspace of fixed points. The eigenspace makes sense for all scalars \lambda, but any given eigenspace might be trivial, just as the transformation might have a trivial kernel.
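As a concrete illustration of these two special eigenspaces (our own example, using the projection onto the first coordinate axis):

import sympy as sp

# Projection onto the first coordinate: kills e2 and fixes e1.
P = sp.Matrix([[1, 0],
               [0, 0]])

# The 0-eigenspace is the kernel, and the 1-eigenspace is the
# subspace of fixed points.
kernel = (P - 0 * sp.eye(2)).nullspace()  # spanned by (0, 1)
fixed = (P - 1 * sp.eye(2)).nullspace()   # spanned by (1, 0)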

Now, all of this is basically definitional. What’s surprising is how much of the behavior of any linear transformation is caught up in the behavior of its eigenvalues. We’ll see this more and more as we go further.

January 26, 2009 Posted by | Algebra, Linear Algebra | 19 Comments

The Determinant of a Noninvertible Transformation

We’ve defined and calculated the determinant representation of \mathrm{GL}(V) for a finite-dimensional vector space V. But we can extend this definition to apply to any linear transformation sending V to itself.

So what happens when T:V\rightarrow V fails to be invertible? Its image must miss some vectors in V, or its kernel must be nontrivial. In fact both happen: the index of T is zero, so a trivial kernel would mean a trivial cokernel. We would then have a one-to-one and onto linear transformation, and T would be invertible. So a noninvertible T always has a nontrivial kernel.

Let’s take a basis of \mathrm{Ker}(T). Since this is a linearly independent set spanning a subspace of V, it can be completed to a basis \{e_i\} for all of V. Now we can use this basis of V to write out the matrix of T and use our formula from last time to calculate \det(T).

The ith column of the matrix is the vector T(e_i) written out in terms of our basis. But since the first few basis vectors are in the kernel of T we have at least T(e_1)=0. So the first column of the matrix must be all zeroes. Now as we pick a permutation and walk down the rows, some row is going to tell us to multiply by the entry in the first column, which we have just seen is zero. That is, for every permutation, the term in our determinant formula for that permutation is zero, and so the sum of them all — the determinant itself — must also be zero.

Notice that this still lets us think of the determinant as preserving multiplications in the algebra of endomorphisms of V. Any noninvertible linear transformation is sent to zero. The product of a noninvertible transformation and any other transformation will be noninvertible, and the product of their determinants will be zero. This also gives us a test for invertibility! Take the linear transformation T and run it through the determinant function. If the result is zero, then T is noninvertible. If the result is nonzero, then T is invertible.
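A quick sketch of this test in SymPy (both matrices are our own choices for the example):

import sympy as sp

# A rank-one matrix: its kernel is nontrivial, so its determinant
# must vanish.
A = sp.Matrix([[1, 2],
               [2, 4]])
assert A.det() == 0 and len(A.nullspace()) == 1

# The product of a noninvertible transformation with any other is
# again noninvertible, matching det(A*B) = det(A)*det(B) = 0.
B = sp.Matrix([[0, 1],
               [1, 1]])
assert B.det() != 0          # B alone is invertible...
assert (A * B).det() == 0    # ...but A*B is not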

January 14, 2009 Posted by | Algebra, Linear Algebra | 1 Comment

Joint Meetings 2009 — Day 4

Sorry for not posting yesterday, but I was sort of run-down. I’m taking today as a break before I join Knots in Washington tomorrow, already in progress. Anyhow, the last day of the Joint Meetings is always slow, and there’s nothing much for me to talk about here. Instead, I’ll put up some pictures.
Continue reading

January 9, 2009 Posted by | Uncategorized | Leave a comment

Joint Meetings 2009 — Day 3

Today’s special session on homotopy theory and higher categories seems to have pushed its higher categories to the afternoon, so I got a chance to see Dan Teague (of Dartmouth) talk about “Making Math out of Style”.

This was rather interesting to me, since the jumping-off point was the identification of Pollock paintings by box-counting dimensions. I really liked this story when it came out, since it’s a great story to relate mathematics to art. The talk continued to discuss efforts to identify authorship of some of the Federalist Papers, and of one of the Wizard of Oz books. Then there was identifying forged Van Gogh paintings (which I think I saw on Scientific American Frontiers a few months ago). Neat stuff.

In the afternoon, John Baez led off, talking about the classifying space of a 2-group. I’ll also see him later discussing groupoidification in the session on categorification and link homology. I wanted to make this post now in a bit of down time so I could remind people that I’ll be at Tryst at 8, and to pass on this bit of wisdom from Baez’ first talk: X is just M in a really weird font.

[UPDATE]: Paul is right in his comment below. Dan Rockmore gave the talk, and Dan Teague introduced him. I met someone else the next day who confirmed both this fact and that he was initially confused by it as well.

January 7, 2009 Posted by | Uncategorized | 4 Comments

Joint Meetings 2009 — day 2

Today was a little thin. I had to meet someone in the exhibit hall when it opened, and the talks I saw after that in the morning were sort of lackluster. After lunch I saw a couple topology and applied mathematics talks, but then had to head off for another meeting about a job prospect. After that rather than sticking around for Mikhail Khovanov’s invited address (I probably know what he’d say anyway), I decided to hit the metro and try to beat the beltway traffic.

One talk in the early morning caught my attention. David Clark talked about the functoriality of the \mathfrak{sl}_3 analogue of Khovanov homology (which is based around \mathfrak{sl}_2). The talk itself I don’t care to talk about much here, but I was glad to see that the first step was to pass from links to tangles, and to treat them as the natural setting. Now if I can just get the term “tangle covariant” to catch on…

Oh, I wasn’t able to join the Secret Blogging Seminar’s drinking tonight. Tomorrow night, however, I’m thinking I’ll be at Tryst. I’m done with special sessions by 6, and I’ll be wanting to have dinner of course. So I’ll say “8 PM”. Here’s a map.

January 7, 2009 Posted by | Uncategorized | Leave a comment

Joint Meetings 2009 — day 1

Well, I made some good non-academic contacts already. I’d rather not go into details, since being too talky might be a problem when prospective employers look to Google and find me as the top hit for my name and subject. But I’m feeling good about my prospects, even without looking at the wilds of federal government and contracting jobs.

Anyhow, as for mathematics there were a number of good talks, but most of them were either what was expected from the speaker, or felt pretty technical. One, though, really grabbed me. Kerry Luse, formerly a student of Yongwu Rong’s at George Washington, spoke about “A transition polynomial for signed Feynman diagrams”. She started with the chord diagram of a knot and added a sign to each chord. If you read the signs as orientations, you get Feynman diagrams for a single species of noninteracting, non-self-dual particles. Alternately, you can interpret the diagrams as arising from RNA secondary structures, as she did. Either way, she was looking for a polynomial invariant to be calculated from such a diagram, and she came up with some really interesting results from her choice. One property in particular was the fact that the resulting polynomial (as applied to chord diagrams arising from knots) is multiplicative under connected sums of links. This makes me think it’s got something to do with the Alexander polynomial.

She also mentioned chord diagrams for links, with more than one loop, which I don’t think I’ve ever considered as such. Immediately this made me think of extending to tangles (naturally), and then that these chord diagrams may themselves form a category of their own. Is there some sort of duality here? If so it might turn connected sum on one side into disjoint union on the other side, which could provide a fascinating connection between classical and quantum topology…

See, I’m not going to be stopping research, and definitely not stopping this project here (thanks, btw, for the comments), but I just need to get out of the academic game I’ve been playing the last few years.

Anyhow, since it’s been weeks since I’ve been at home cooking for myself (thanks to Dad insisting on doing it all), I figured I’d have a bonus “I (Didn’t) Made It!”:

Eating at the Afghan Grill

At the Afghan Grill, just around the corner from the Marriott, I’m having the mantoo, and across the table from me is the lamb qabili palao. So who ordered the Afghan equivalent of biryani?

Mikael Vejdemo Johansson

This guy! I also ran into Jesse Johnson this morning. Oh, and if Sarah from Johns Hopkins is reading this and is at the meetings, she really needs to drop me an email so she can join the fun.

January 6, 2009 Posted by | Uncategorized | 1 Comment