## Factoring Real Polynomials

Okay, we know that we can factor any complex polynomial into linear factors because the complex numbers are algebraically closed. But we also know that real polynomials can have too few roots. Now, there are a lot of fields out there that aren’t algebraically closed, and I’m not about to go through all of them. But we use the real numbers so much because of the unique position it holds by virtue of the interplay between its topology and its algebra. So it’s useful to see what we *can* say about real polynomials.

We start by noting that since the real numbers sit inside the complex numbers, we can consider any real polynomial as a complex polynomial. If the polynomial has a real root , then the complex polynomial has a root . So all the real roots still show up.

Now, we might not have as many real roots as the degree would indicate. But we are sure to have as many *complex* roots as the degree, which will include the real roots. Some of the roots may actually be complex numbers like . Luckily, one really interesting thing happens here: if is a root, then so is its complex conjugate .

Let’s write out our polynomial , where all the are real numbers. To say that is a root means that when we substitute it for we get the equation

Now we can take the complex conjugate of this equation

But complex conjugation is a field automorphism, so it preserves both addition and multiplication

Now since all the (and ) are real, complex conjugation leaves them alone. Conjugation sends to , and so we find

So is a root as well. Thus if we have a (complex) linear factor like we’ll also have another one like . These multiply to give

which is a real polynomial again.

Now let’s start with our polynomial of degree . We know that over the complex numbers it has a root . If this root is real, then we can write

where is another real polynomial which has degree . On the other hand, if is complex then is also a root of , and so we can write

where is another real polynomial which has degree .

Either way, now we can repeat our reasoning starting with . At each step we can pull off either a linear term or a quadratic term (which can’t be factor into two real linear terms).

Thus every real polynomial factors into the product of a bunch of linear polynomials and a bunch of irreducible quadratic polynomials, and the number of linear factors plus twice the number of quadratic factors must add up to the degree of . It’s not *quite* so nice as the situation over the complex numbers, but it’s still pretty simple. We’ll see many situations into the future where this split between two distinct real roots and a conjugate pair of complex roots (and the border case of two equal real roots) shows up with striking qualitative effects.

## The Complex Numbers

Yesterday we defined a field to be algebraically closed if any polynomial over the field always has exactly as many roots (counting multiplicities) as we expect from its degree. But we don’t know a single example of an algebraically complete field. Today we’ll (partially) remedy that problem.

First, remember the example we used for a polynomial with no roots over the real numbers. That is . The problem is that we have no field element whose square is . So let’s just postulate one! This, of course, has the same advantages as those of theft over honest toil. We write our new element as for “imaginary”, and throw it in with the rest of the real numbers .

Okay, just like when we threw in as a new element, we can build up sums and products involving real numbers and this new element . But there’s one big difference here: we have a relation that must satisfy. When we use the evaluation map we must find . And, of course, any polynomial which includes as a factor must evaluate to as well. But this is telling us that the kernel of the evaluation homomorphism for contains the principal ideal .

Can it contain anything else? If is a polynomial in the kernel, but is *not* divisible by , then Euclid’s algorithm gives us a greatest common divisor of and , which is a linear combination of these two, and must have degree either or . In the former case, we would find that the evaluation map would have to send everything — even the constant polynomials — to zero. In the latter case, we’d have a linear factor of , which would be a root. Clearly neither of these situations can occur, so the kernel of the evaluation homomorphism at is exactly the principal ideal .

Now the first isomorphism theorem for rings tells us that we can impose our relation by taking the quotient ring . But what we just discussed above further goes to show that is a maximal ideal, and the quotient of a ring by a maximal ideal is a field! Thus when we take the real numbers and adjoin a square root of to get a ring we might call , the result is a field. This is the field of “complex numbers”, which is more usually written .

Now we’ve gone through a lot of work to just add one little extra element to our field, but it turns out this is all we need. Luckily enough, the complex numbers are already algebraically complete! This is very much not the case if we were to try to algebraically complete other fields (like the rational numbers). Unfortunately, the proof really is essentially analytic. It seems to be a completely algebraic statement, but remember all the messy analysis and topology that went into defining the real numbers.

Don’t worry, though. We’ll come back and prove this fact once we’ve got a bit more analysis under our belts. We’ll also talk a lot more about how to think about complex numbers. But for now all we need to know is that they’re the “algebraic closure” of the real numbers, we get them by adding a square root of that we call , and we can use them as an example of an algebraically closed field.

One thing we can point out now, though, is the inherent duality of our situation. You see, we didn’t just add one square root of . Indeed, once we have complex numbers to work with we can factor as (test this by multiplying it out and imposing the relation). Then we have another root: . This is just as much a square root of as was, and anything we can do with we can do with . That is, there’s a symmetry in play that exchanges and . We can pick one and work with it, but we must keep in mind that whenever we do we’re making a non-canonical choice.

## Polynomials with Too Few Roots

Okay, we saw that roots of polynomials exactly correspond to linear factors, and that a polynomial can have at most as many roots as its degree. In fact, there’s an expectation that a polynomial of degree will have *exactly* roots. Today I want to talk about two ways this can fail.

First off, let’s work over the real numbers and consider the polynomial . A root of will be a real number so that , but a little playing around will show us that is the only possible solution. The degree of the polynomial is two, so we expect two roots, but we only find one. What went wrong?

Well, we know from the fact that is a root that is a factor of . Our division algorithm shows us that we can write . The factor that gives us the root shows up twice! But since we’ve already counted the root once the second occurrence doesn’t do anything.

To remedy this, let’s define the “multiplicity” of a root to be the number of times we can evenly divide out a factor of . So in our example the root has multiplicity . When we count the roots along with their multiplicities, we get back exactly the degree.

So do multiplicities handle all our problems? Unfortunately, no. An example here over the real numbers is the polynomial . A root of this polynomial would be a real number with . But since the square of any real number is nonnegative, this can’t be satisfied. So there exist polynomials with fewer roots than their degrees would indicate; even with no roots at all!

Now, some fields are well-behaved. We say that a field is “algebraically closed” if every polynomial over that field has a root . In that case we can divide out by to get a polynomial of one degree less, which must again have a root by algebraic closure. We can keep going until we write the polynomial as the product of a bunch of linear factors — the number is the same as the degree of the polynomial — and leave one field element left once we get down to degree zero. Thus over an algebraically closed field every polynomial has exactly as many roots as its degree indicates.. if you count them with their multiplicities!

## Euclid’s Algorithm for Polynomials

Consider the division algorithm. It looks a lot like division for integers, as it should. Specifically, it’s got all the features we need for the setup of Euclid’s Algorithm, and we have an analogue here!

Just like we did back then, we start with two polynomials and , and we divide them to find . Then if we turn around and divide to find . We keep going until we end up with a remainder of , at which point we can’t divide anymore, and so we stop. And we must eventually stop, because the degree of the remainders keep going down.

At the end, the last nonzero remainder will be the “greatest common divisor” of and . That is, it’s a polynomial that divides both and (leaving no remainder), and any other such common divisor must itself evenly divide .

Euclid’s algorithm also gives us a way of writing out the greatest common divisor as a linear combination . Just like for the integers, this marks the greatest common divisor as the linear combination with the least degree.

Now how to show all this? It’s easy: just go back to the proof of Euclid’s algorithm for integers! Everything works the same way again. We just have to stick in references to the degree where appropriate to measure the “size” of a polynomial.

In fact, if we can equip any integral domain with a degree function (where is the set of nonzero ring elements) so that we have a division algorithm like those for integers and polynomials, then we’ve got the setup for Euclid’s algorithm. In this case, we say we have a “Euclidean domain”. So the point here is that is Euclidean, and so it has a form of Euclid’s algorithm, and all that follows from it.

Notice that in the case of the integers we had some ambiguity about common divisors, since would both work equally well. The point here is that they differ by multiplication by a unit, and so each divides the other. This sort of thing happens in the divisibility preorder for any ring. For polynomials, the units are just the nonzero elements of the base field, considered as constant (degree zero) polynomials. Thus the greatest common divisor is only determined up to a constant multiple, but we can easily ignore that little ambiguity. The “greatest” just has to be read as referring to the degree of the polynomial.

## Polynomial Division

Polynomials generally form a ring, and not a field. However, the integers also form a ring, and we have a notion of division from arithmetic. For polynomials we have a similar form of division we’ll discuss now. Instead of using the absolute value of an integer to measure its size, we’ll use the degree of a polynomial to measure its size.

Given two polynomials and , we can find two more polynomials — the quotient and the remainder — so that , and . These polynomials are unique. The fact that we can find them is called the “division algorithm”, even though it’s a theorem. There are various algorithms we can use to establish this fact, but one of the most straightforward is polynomial long division, which we were all taught in high school algebra.

We start by writing . If the degree of is less than that of , we’re done. Otherwise let’s write and . The polynomials lead off with terms and . If we multiply by its leading term becomes exactly the same as that of , so when we subtract the top degree terms cancel out. That is, we write , and we see that has degree less than that of .

Now if the degree of is still greater than or equal to that of we can keep going. Multiply by a term so that when we subtract the product from we’ll kill off the highest remaining term and drop its degree. Since the degree of the remainder keeps going down, it must eventually drop below that of . Even if the degree of is zero, we can get the remainder down to the polynomial , which we decided to give the degree . This gives us our and

Now for uniqueness. Let’s say we have and , and both and have degrees less than that of . We can subtract to find . Now the degree on the right must be less than that of , since adding and subtracting polynomials can’t increase the degree beyond the larger of the two. On the left, on the other hand, the degree is at *least* that of , with one exception: if . In this case the left hand side vanishes. Since this is the only way to drop the degree of the left-hand side we must have . And then on the right we find .

## Roots of Polynomials II

This one might have to suffice for two days. The movers are almost done loading the truck here in New Orleans, and I’ll be taking today and tomorrow to drive to Maryland.

We can actually tease out more information from the factorization we constructed yesterday. Bur first we need a little definition.

Remember that when we set up the algebra of polynomials we noted that the coefficients have to be all zero after some finite number of them. Thus there must be a greatest nonzero coefficient . The index corresponding to this coefficient we call the “degree” of the polynomial. If all the coefficients are zero — giving the zero polynomial — then a common convention is to assign it degree . This actually isn’t completely arbitrary, but the details won’t concern us until later.

So, armed with this information, look at how we constructed the factorization . We replaced each term in with the term . Then we factored out from this term, giving . So the highest power of that shows up in this term (with a nonzero coefficient) is . And the highest power coming from all the terms of will be . The power shows up only once in the expression for , so there’s no way for two such terms to add together and make its coefficient turn out to be zero, and no higher power of ever shows up at all. Thus the degree of is one less than that of .

So what does this gain us? Well, each time we find a root we can factor out a term like , which reduces the degree by . So if has degree there can only be at most roots!

A nonzero constant polynomial has degree , but it also has no roots! Perfect.

A linear polynomial has degree , and it has exactly one root: .

Now let’s assume that our statement is true for all polynomials of degree : they have or fewer roots. Then given a polynomial of degree either has a root or it doesn’t. If it doesn’t, then we’re already done. If it does, then we can factor , where has degree . But then can have at most roots, and thus can have at most roots!

A nice little corollary of this is that if our base field is infinite (like it is for the most familiar examples) then only the zero polynomial can give us the zero function when we evaluate it at various field elements. That is, if for all , then . This must be true because has an infinite number of roots, and so no finite degree polynomial can possibly have that many roots. The only possibility left is the zero polynomial.

Just to be clear, though, let’s look at this one counterexample. Think about the field we used when we talked about Set. The polynomial is not the zero polynomial, but is the zero function. Indeed , , and , which is divisible by , and so is the same as in this field.

## Roots of Polynomials I

When we consider a polynomial as a function, we’re particularly interested in those field elements so that . We call such an a “zero” or a “root” of the polynomial .

One easy way to get this to happen is for to have a factor of . Indeed, in that case if we write for some other polynomial then we evaluate to find

The interesting thing is that this is the *only* way for a root to occur, other than to have the zero polynomial. Let’s say we have the polynomial

and let’s also say we’ve got a root so that . But that means

This is not just a field element — it’s the zero polynomial! So we can subtract it from to find

Now for any we can use the identity

to factor out from each term above. This gives the factorization we were looking for.

## Polynomials

Okay, we’re going to need some other algebraic tools before we go any further into linear algebra. Specifically, we’ll need to know a few things about the algebra of polynomials. Specifically (and diverging from the polynomials discussed earlier) we’re talking about polynomials in one variable, and with coefficients in the field we’re building our vector spaces over already.

We’ll write this algebra as , where is now not a “variable”, like it was back in high school calculus. It’s a new element of the algebra. We start with the field which is trivially an algebra over itself. Then we just throw in this new thing called . Then, since we want to still be an algebra over , we have to be able to multiply elements. Defining a scalar multiple for each is a good start, but we also have to multiply by itself to get . There’s no reason this should be anything we’ve seen yet, so it’s new. Going along, we get , , and so on. Each of these is a new element, and we also get scalar multiples , and even linear combinations:

as long as there are only a finite number of nonzero terms in this sum. That is, the coefficients are all zero after some point. We customarily take — the unit of the algebra.

Note here that we’re not using the summation convention for polynomials, though we could in principle. Remember, an algebra is a vector space, and what we’ve said above establishes that the set constitutes a *basis* for this vector space!

The algebra structure can be specified by defining it on pairs of basis elements. Remember that the structure is just a bilinear multiplication, which is just a linear map from the tensor square to . And we know that the basis for a tensor product consists of pairs of basis elements. So we can specify this linear map on a basis and extend by linearity — bilinearity — whatever…

Anyhow, how should we define the multiplication? Simply: . Then the whole rest of the algebra structure is defined for us. Now this *looks* like adding exponents, but remember we can just as well think of these as indices on basis elements that just *happen* to add when we multiply corresponding basis elements. Thus we wouldn’t be out of place using the summation convention here, though we won’t for the moment.