The Unapologetic Mathematician

Convergence of Power Series

Now that we’ve imported a few rules about convergent series of complex numbers we can talk about when the series we get from evaluating power series converge or not.

We’ll just consider our power series to be in $\mathbb{C}[[X]]$, because if we have a real power series we can consider each coefficient as a complex number instead. Now we take a complex number $z$ and try to evaluate the power series $\sum\limits_{k=0}^\infty c_kX^k$ at this point. We get a series of complex numbers

$\displaystyle\sum\limits_{k=0}^\infty c_kz^k=\lim\limits_{n\rightarrow\infty}\sum\limits_{k=0}^nc_kz^k$

by evaluating each polynomial truncation of the power series at $z$ and taking the limit of the sequence. For some $z$ this series may converge and for others it may not. The amazing fact is, though, that we can always draw a circle in the complex plane — $|z|=R$ — within which the series always converges absolutely, and outside of which it always diverges. We’ll say nothing in general about whether it converges on the circle, though.

The tool here is the root test. We take the $n$th root of the size of the $n$th term in the series to find $\sqrt[n]{|c_nz^n|}=|z|\sqrt[n]{|c_n|}$. Then we can pull the $z$-dependance completely out of the limit superior to write $\rho=|z|\limsup\limits_{n\rightarrow\infty}\sqrt[n]{|c_n|}$. The root test tells us that if this is less than ${1}$ the series will converge absolutely, while if it’s greater than ${1}$ the series will diverge.

So let’s define $\lambda=\limsup\limits_{n\rightarrow\infty}\sqrt[n]{|c_n|}$. The root test now says that if $|z|<\frac{1}{\lambda}$ we have absolute convergence, while if $|z|>\frac{1}{\lambda}$ the series diverges. Thus $\frac{1}{\lambda}$ is the radius of convergence that we seek.

Now there are examples of series with all sorts of behavior on the boundary of this disk. The series $\sum\limits_{k=0}^\infty z^k$ has radius of convergence ${1}$ (as we can tell from the above procedure), but it doesn’t converge anywhere on the boundary circle. On the other hand, the series $\sum\limits_{k=1}^\infty\frac{1}{k^2}z^k$ has the same radius of convergence, but it converges everywhere on the boundary circle. And, just to be perverse, the series $\sum\limits_{k=1}^\infty\frac{1}{k}z^k$ has the same radius of convergence but converges everywhere on the boundary but the single point $z=1$.

August 29, 2008 Posted by | Analysis, Calculus, Power Series | 6 Comments

Convergence of Complex Series

Today, I want to note that all of our work on convergence of infinite series carries over — with slight modifications — to complex numbers.

First we have to get down an explicit condition on convergence of complex sequences. In any metric space we can say that the sequence $z_n$ converges to a limit $z$ if for every $\epsilon>0$ there is some $N$ so that $d(z_n,z)<\epsilon$ for all $n>N$. Of course, here we’ll be using our complex distance function $|z_n-z|=\sqrt{(z_n-z)\overline{(z_n-z)}}$. Now we just have to replace any reference to real absolute values with complex absolute values and we should be good.

Cauchy’s condition comes in to say that the series $\sum\limits_{k=0}^\infty a_k$ converges if and only for every $\epsilon>0$ there is an $N$ so that for all $n\geq m>N$ the sum $\left|\sum\limits_{k=m}^na_k\right|<\epsilon$.

Similarly, we say that the series $\sum\limits_{k=0}^\infty a_k$ is absolutely convergent if the series $\sum\limits_{k=0}^\infty|a_k|$ is convergent, and this implies that the original series converges.

Since the complex norm is multiplicative, everything for the geometric series goes through again: $\sum\limits_{k=0}^\infty c_0r^k=\frac{c_0}{1-r}$ if $|r|<1$, and it diverges if $|r|>1$. The case where $|r|=1$ is more complicated, but it can be shown to diverge as well.

The ratio and root tests are basically proven by comparing series of norms with geometric series. Since once we take the norm we’re dealing with real numbers, and since the norm is multiplicative, we find that the proofs go through again.

August 28, 2008 Posted by | Analysis, Calculus, Power Series | 1 Comment

Evaluating Power Series

Now that we’ve got some topological fields to use as examples, let’s focus in on power series over $\mathbb{R}$ or $\mathbb{C}$.

Remember that a power series is like an infinite polynomial. In fact, we introduced a topology so we could see in any power series a sequence of polynomials that converged to it. To be explicit, we write the series as a limit

$\displaystyle S=\sum\limits_{k=0}^\infty c_kX^k=\lim\limits_{n\rightarrow\infty}\sum\limits_{k=0}^nc_kX^k$

where the $c_k$ are coefficients selected from our base field.

Now evaluation of power series is specified by two conditions: it should agree with evaluation of polynomials when we’ve got a power series that cuts off after a finite number of terms, and it should be continuous.

The first condition says that each of our approximating polynomials should evaluate just the same as it did before. That is, if we cut off after the degree-$n$ term and evaluate at the point $x$ in the base field, we should get $\sum\limits_{k=0}^nc_kx^k$.

The second condition says that evaluation should preserve limits. And we’ve got a sequence right here: the $n$th term is the evaluation of the $n$th approximating polynomial! So the power series should evaluate to the limit $\lim\limits_{n\rightarrow\infty}\sum\limits_{k=0}^nc_kx^k$. If this limit exists, that is. And that’s why we need a topological field to make sense of evaluations.

Now we’re back in the realm of infinite series, and taking the limit of a sequence of partial sums. The series in question has as its $n$th term the evaluated monomial $c_nx^n$. We can start using our old techniques to sum these series.

August 27, 2008 Posted by | Analysis, Calculus, Power Series | 4 Comments

Some Topological Fields

Okay, hopefully now that I’m back in Kentucky and I’ve gotten the first day of all three of my classes under my belt, I can get back into the groove.

In a little while we’re going to want to talk about “evaluating” a power series like we did when we considered polynomials as functions. But when we try to map into our base field, we get a sequence of values and we ask questions about convergence. That means we need to have a topology on our field! Luckily there are a few hanging around.

The real numbers have a topology. In fact, that’s really their main characteristic. The rational numbers have a topology too, but the whole motivation of the construction of the real numbers is that rational numbers have a lot of holes that sequence limits fall through. Since we’re talking about sequences here we really need the tighter weave that the real numbers provide.

What else can we use? The complex numbers form a two-dimensional vector space over the reals, which means that as a vector space we have the isomorphism $\mathbb{C}\cong\mathbb{R}\times\mathbb{R}$. So let’s just use the product metric, along with the topology it gives on the complex numbers.

Let’s be a little explicit here: the product metric depends on picking the basis $\{1,i\}$ for $\mathbb{C}$ as a vector space over $\mathbb{R}$. We get the “size” of a complex number $|a+bi|=\sqrt{a^2+b^2}$, and then we define the distance between two complex numbers as the size of their difference.

I said before that there may be many different distance functions that give the same topology, so why do we really want to use this one? Well, it turns out that this formula can actually be written in a really neat way in complex number language. If we have a complex number $z=a+bi$ we also have its conjugate $\bar{z}=a-bi$. Then we can calculate the product

$z\bar{z}=(a+bi)(a-bi)=(a^2+b^2)+0i$

This is just the square of our distance function, written as a complex number! But also notice that this formula is symmetric between a complex number and its conjugate. That is, conjugation preserves the size of complex numbers, as we should expect because there’s no a priori difference between the two.

Now we need to prove that the field operations are continuous. For multiplication, for example, we need to ask that if $z_2$ is close to $z_1$ and $w_2$ is close to $w_1$, then $z_2w_2$ is close to $z_1w_1$. We write $z_2=z_1+\Delta z$ and $w_2=w_1+\Delta w$, and multiply out

$z_2w_2=z_1w_1+z_1\Delta w+\Delta zw_1+\Delta z\Delta w$

The condition is that for any real $\epsilon>0$ we can find real $\delta_z$ and $\delta_w$ so that $\delta_z>|\Delta z|$ and $\delta_w>|\Delta w|$ together imply that $|z_1\Delta w+\Delta zw_1+\Delta z\Delta w|<\epsilon$. From here it’s a bunch of mucking around with our formula, but none of it is very difficult.

At the end of the day, we’ve got two topological fields to work with — the real numbers $\mathbb{R}$ and the complex numbers $\mathbb{C}$ — and we can talk about evaluating power series from $\mathbb{R}[[X]]$ or $\mathbb{C}[[X]]$.

UPDATE: I forgot to mention that it’s also easy to see that the norm is multiplicative. That is, $|zw|=\sqrt{zw\overline{zw}}=\sqrt{z\bar{z}}\sqrt{w\bar{w}}=|z||w|$.

August 26, 2008 Posted by | Algebra, Ring theory | 5 Comments

Today is my last long drive for a while. I’ll try to post a Sunday Sample tonight, or maybe tomorrow. But for now, here’s a problem to chew on.

In a multiple choice test, one question was illegible, but the following choice of answers was clearly printed. Which must be the right answer?

1. All of the below
2. None of the below
3. All of the above
4. One of the above
5. None of the above
6. None of the above

August 24, 2008 Posted by | Uncategorized | 15 Comments

Zero-Knowledge Sudoku

Scientific American has been doing a lot about privacy, since that’s the theme of its latest issue. One of its posts today linked to this one from two years ago at a weblog that my old teacher Bill Gasarch has something to do with. It’s a pretty cogent explanation of zero-knowledge proofs, with a sudoku puzzle as the centerpiece.

A zero-knowledge proof is a way of convincing you that I know something without you getting any information about what that thing is. I know, it sounds like LitCrit, but this actually makes sense.

The classic motivating example is a cave with two entrances. Way back deep in the cave there’s a door with a combination lock I tell you I can open. I go into one of the cave entrances (you don’t see which, but if I can’t open the door I’m stuck in whichever one I picked). Then you come to the mouth(s) of the cave and yell out which entrance I should come out of.

If I can open the door, I can come out either entrance, so I come out the one you want. If I can’t open the door, then I can only come out the one you want if that’s the one I went in. Juggling the numbers a bit we can find that the probability I don’t know the combination given that I came out the correct entrance is $\frac{1}{2}$.

This means that half the time I could have gotten the right answer by luck rather than by skill. That’s not very convincing, but we can repeat the experiment, and each run is independent of all the others. So if we go through the motions $n$ times and I’ve met your challenge every single time, then the probability that I’ve just been lucky is $\frac{1}{2^n}$, which gets very small very quickly. After a while you’ll be sufficiently convinced that I know the answer.

In a way it’s more like a physics “proof” than the ones mathematicians are used to. The hypothesis can be falsified at any step, and each confirmation just increases the probability that it’s accurate.

August 20, 2008 Posted by | Cryptography | 4 Comments

Products of Metric Spaces

Shortly we’re going to need a construction that’s sort of interesting in its own right.

We know about products of topological spaces. We can take products of metric spaces, too, and one method comes down to us all the way from Pythagoras.

The famous Pythagorean theorem tells us that in a right triangle the length $c$ of the side opposite the right angle stands in a certain relation to the lengths $a$ and $b$ of the other two sides: $c^2=a^2+b^2$. So let’s say we’ve got metric spaces $(M_1,d_1)$ and $(M_2,d_2)$. For the moment we’ll think of them as being perpendicular and define a distance function $d$ on $M_1\times M_2$ by

$d((x_1,x_2),(y_1,y_2)=\sqrt{d_1(x_1,y_1)^2+d_2(x_2,y_2)^2}$

The quantity inside the radical here must be nonnegative, since it’s the sum of two nonnegative numbers. Since the result needs to be nonnegative, we take the unique nonnegative square root.

Oops, I don’t think I mentioned this before. Since the function $f(x)=x^2$ has $f'(x)=2x$ as its derivative, it’s always increasing where $x$ is positive. And since we can eventually a square above any real number we choose, its values run from zero all the way up to infinity. Now the same sort of argument as we used to construct the exponential function gives us an inverse sending any nonnegative number to a unique nonnegative square root.

Okay, that taken care of, we’ve got a distance function. It’s clearly nonnegative and symmetric. The only way for it to be zero is for the quantity in the radical to be zero, and this only happens if each of the terms $d_1(x_1,y_1)$ and $d_2(x_2,y_2)$ are zero. But since these are distance functions, that means $x_1=y_1$ and $x_2=y_2$, so $(x_1,x_2)=(y_1,y_2)$.

The last property we need is the triangle inequality. That is, for any three pairs $(x_1,x_2)$, $(y_1,y_2)$, $(z_1,z_2)$ we have the inequality

$d((x_1,x_2),(z_1,z_2))\leq d((x_1,x_2),(y_1,y_2))+d((y_1,y_2),(z_1,z_2))$

Substituting from the definition of $d$ we get the statement

$\sqrt{d_1(x_1,z_1)^2+d_2(x_2,z_2)^2}\leq\sqrt{d_1(x_1,y_1)^2+d_2(x_2,y_2)^2}+\sqrt{d_1(y_1,z_1)^2+d_2(y_2,z_2)^2}$

The triangle inequalities for $d_1$ and $d_2$ tell us that $d_1(x_1,z_1)\leq d_1(x_1,y_1)+d_1(y_1,z_1)$ and $d_2(x_2,z_2)\leq d_2(x_2,y_2)+d_1(y_2,z_2)$. So if we make these substitutions on the left, it increases the left side of the inequality we want. Thus if we can prove the stronger inequality

\begin{aligned}\sqrt{d_1(x_1,y_1)^2+2d_1(x_1,y_1)d_1(y_1,z_1)+d_1(y_1,z_1)^2+d_2(x_2,y_2)^2+2d_2(x_2,y_2)d_2(y_2,z_2)+d_2(y_2,z_2)^2}\\\leq\sqrt{d_1(x_1,y_1)^2+d_2(x_2,y_2)^2}+\sqrt{d_1(y_1,z_1)^2+d_2(y_2,z_2)^2}\end{aligned}

we’ll get the one we really want. Now since squaring preserves the order on the nonnegative reals, we can find this equivalent to

\begin{aligned}d_1(x_1,y_1)^2+2d_1(x_1,y_1)d_1(y_1,z_1)+d_1(y_1,z_1)^2+d_2(x_2,y_2)^2+2d_2(x_2,y_2)d_2(y_2,z_2)+d_2(y_2,z_2)^2\\\leq d_1(x_1,y_1)^2+d_2(x_2,y_2)^2+2\sqrt{d_1(x_1,y_1)^2+d_2(x_2,y_2)^2}\sqrt{d_1(y_1,z_1)^2+d_2(y_2,z_2)^2}+d_1(y_1,z_1)^2+d_2(y_2,z_2)^2\end{aligned}

Some cancellations later:

\begin{aligned}d_1(x_1,y_1)d_1(y_1,z_1)+d_2(x_2,y_2)d_2(y_2,z_2)\\\leq \sqrt{d_1(x_1,y_1)^2d_1(y_1,z_1)^2+d_1(x_1,y_1)^2d_2(y_2,z_2)^2+d_2(x_2,y_2)^2d_1(y_1,z_1)^2+d_2(x_2,y_2)^2d_2(y_2,z_2)^2}\end{aligned}

We square and cancel some more:

$2d_1(x_1,y_1)d_1(y_1,z_1)d_2(x_2,y_2)d_2(y_2,z_2)\leq d_1(x_1,y_1)^2d_2(y_2,z_2)^2+d_2(x_2,y_2)^2d_1(y_1,z_1)^2$

Moving these terms around we find

\begin{aligned}0\leq\left(d_1(x_1,y_1)d_2(y_2,z_2)\right)^2-2\left(d_1(x_1,y_1)d_2(y_2,z_2)\right)\left(d_2(x_2,y_2)d_1(y_1,z_1)\right)+\left(d_2(x_2,y_2)d_1(y_1,z_1)\right)^2\\=\left(d_1(x_1,y_1)d_2(y_2,z_2)-d_2(x_2,y_2)d_1(y_1,z_1)\right)^2\end{aligned}

So at the end of the day, our triangle inequality is equivalent to asking if a certain quantity squared is nonnegative, which it clearly is!

Now here’s the important thing at the end of all that calculation: this is just one way to get a metric on the product of two metric spaces. There are many other ones which give rise to different distance functions, but the same topology and the same uniform structure. And often it’s the topology that we’ll be most interested in.

In particular, this will give us a topology on any finite-dimensional vector space over the real numbers, but we don’t want to automatically equip that vector space with this norm unless we say so very explicitly. In fact, we don’t even want to make that same assumption about the two spaces being perpendicular to each other. The details of exactly why this is so I’ll leave until we get back to linear algebra, but I want to be clear right now that topology comes for free, but we may have good reason to use different “distances”.

August 19, 2008

Power Series

Prodded by some comments, I think I’ll go even further afield from linear algebra. It’s a slightly different order than I’d originally thought of, but it will lead to some more explicit examples when we’re back in the realm of linear algebra, so it’s got its own benefits.

I’ll note here in passing that mathematics actually doesn’t proceed in a straight line, despite the impression most people get. The lower-level classes are pretty standard, yes — natural-number arithmetic, fractions, algebra, geometry, calculus, and so on. But at about this point where most people peter out, the subject behaves more like an alluvial fan — many parallel rivulets carry off in different directions, but they’re all ultimately part of the same river. So in that metaphor, I’m pulling a bit of an avulsion.

Anyhow, power series are sort of like polynomials, except that the coefficients don’t have to die out at infinity. That is, when we consider the algebra of polynomials $\mathbb{F}[X]$ as a vector space over $\mathbb{F}$ it’s isomorphic to the infinite direct sum

$\displaystyle\mathbb{F}[X]\cong\bigoplus\limits_{k=0}^\infty\mathbb{F}X^k$

but the algebra of power series — written $\mathbb{F}[[X]]$ — is isomorphic to the infinite direct product

$\displaystyle\mathbb{F}[[X]]\cong\prod\limits_{k=0}^\infty\mathbb{F}X^k$

It’s important to note here that the $X^i$ do not form a basis here, since we can’t write an arbitrary power series as a finite linear combination of them. But really they should behave like a basis, because they capture the behavior of every power series. In particular, if we specify that $\mu(X^m,X^n)=X^{m+n}$ then we have a well-defined multiplication extending that of power series.

I don’t want to do all the fine details right now, but I can at least sketch how this all works out, and how we can adjust our semantics to talk about power series as if the $X^i$ were an honest basis. The core idea is that we’re going to introduce a topology on the space of polynomials.

So what polynomials should be considered “close” to each other? It turns out to make sense to consider those which agree in their lower-degree terms to be close. That is, we should have the space of tails

$\displaystyle\bigoplus\limits_{k=n+1}^\infty\mathbb{F}X^k$

as an open set. More concretely, for every polynomial $p$ with degree $n$ there is an open set $U_p$ consisting of those polynomials $q$ so that $X^{n+1}$ divides the difference $q-p$.

Notice here that any power series defines, by cutting it off after successively higher degree terms, a descending sequence of these open sets. More to the point, it defines a sequence of polynomials. If the power series’ coefficients are zero after some point — if it’s a polynomial itself — then this sequence stops and stays at that polynomial. But if not it never quite settles down to any one point in the space. Doesn’t this look familiar?

Exactly. Earlier we had sequences of rational numbers which didn’t converge to a rational number. Then we completed the topology to give us the real numbers. Well here we’re just doing the same thing! It turns out that the topology above gives a uniform structure to the space of polynomials, and we can complete that uniform structure to give the vector space underlying the algebra of power series.

So here’s the punch line: once we do this, it becomes natural to consider not just linear maps, but continuous linear maps. Now the images of the $X^k$ can’t be used to uniquely specify a linear map, but they will specify at most one value for a continuous linear map! That is, any power series comes with a sequence converging to it — its polynomial truncations — and if we know the values $f(X^k)$ then we have uniquely defined images of each of these polynomial truncations since each one is a finite linear combination. Then continuity tells us that the image of the power series must be the limit of this sequence of images, if the limit exists.

August 18, 2008

Factoring Real Polynomials

Okay, we know that we can factor any complex polynomial into linear factors because the complex numbers are algebraically closed. But we also know that real polynomials can have too few roots. Now, there are a lot of fields out there that aren’t algebraically closed, and I’m not about to go through all of them. But we use the real numbers so much because of the unique position it holds by virtue of the interplay between its topology and its algebra. So it’s useful to see what we can say about real polynomials.

We start by noting that since the real numbers sit inside the complex numbers, we can consider any real polynomial as a complex polynomial. If the polynomial has a real root $r$, then the complex polynomial has a root $r+0i$. So all the real roots still show up.

Now, we might not have as many real roots as the degree would indicate. But we are sure to have as many complex roots as the degree, which will include the real roots. Some of the roots may actually be complex numbers like $a+bi$. Luckily, one really interesting thing happens here: if $a+bi$ is a root, then so is its complex conjugate $a-bi$.

Let’s write out our polynomial $c_0+c_1X+...+c_nX^n$, where all the $c_i$ are real numbers. To say that $a+bi$ is a root means that when we substitute it for ${X}$ we get the equation

$c_0+c_1(a+bi)+...+c_n(a+bi)^n=0$

Now we can take the complex conjugate of this equation

$\overline{c_0+c_1(a+bi)+...+c_n(a+bi)^n}=\overline{0}$

But complex conjugation is a field automorphism, so it preserves both addition and multiplication

$\overline{c_0}+\overline{c_1}\left(\overline{a+bi}\right)+...+\overline{c_n}\left(\overline{a+bi}\right)^n=\overline{0}$

Now since all the $c_i$ (and ${0}$) are real, complex conjugation leaves them alone. Conjugation sends $a+bi$ to $a-bi$, and so we find

$c_0+c_1(a-bi)+...+c_n(a-bi)^n=0$

So $a-bi$ is a root as well. Thus if we have a (complex) linear factor like $\left(X-(a+bi)\right)$ we’ll also have another one like $\left(X-(a-bi)\right)$. These multiply to give

\begin{aligned}\left(X-(a+bi)\right)\left(X-(a-bi)\right)=X^2-(a-bi)X-(a+bi)X+(a+bi)(a-bi)\\=X^2-(2a)X+(a^2+b^2)\end{aligned}

which is a real polynomial again.

Now let’s start with our polynomial $p$ of degree $n$. We know that over the complex numbers it has a root $\lambda$. If this root is real, then we can write

$p=(X-\lambda)\tilde{p}$

where $\tilde{p}$ is another real polynomial which has degree $n-1$. On the other hand, if $\lambda=a+bi$ is complex then $\bar{\lambda}$ is also a root of $p$, and so we can write

$p=(X-(a+bi))(X-(a-bi))\tilde{p}=(X^2-(2a)X+(a^2+b^2))\tilde{p}$

where $\tilde{p}$ is another real polynomial which has degree $n-2$.

Either way, now we can repeat our reasoning starting with $\tilde{p}$. At each step we can pull off either a linear term or a quadratic term (which can’t be factor into two real linear terms).

Thus every real polynomial factors into the product of a bunch of linear polynomials and a bunch of irreducible quadratic polynomials, and the number of linear factors plus twice the number of quadratic factors must add up to the degree of $p$. It’s not quite so nice as the situation over the complex numbers, but it’s still pretty simple. We’ll see many situations into the future where this split between two distinct real roots and a conjugate pair of complex roots (and the border case of two equal real roots) shows up with striking qualitative effects.

August 14, 2008

I’ve got internet access again, but I’m busy with a few things today, like assembling my furniture. Luckily, Tim Gowers has a post on “How to use Zorn’s lemma”. His example is the construction of additive (not linear) functions from $\mathbb{R}$ to itself.
In practice, as he points out, this is equivalent to defining linear functions (not just additive) from $\mathbb{R}$ to itself, if we’re considering $\mathbb{R}$ as a vector space over the field $\mathbb{Q}$ of rational numbers! This is a nifty little application within the sort of stuff we’ve been doing lately, and it really explains how we used Zorn’s Lemma when we showed that every vector space has a basis.