The Unapologetic Mathematician

Mathematics for the interested outsider

Complex Numbers and Polar Coordinates

Forgot to hit “publish” earlier…

So we’ve seen that the unit complex numbers can be written in the form e^{i\theta} where \theta denotes the (signed) angle between the point on the circle and 1+0i. We’ve also seen that this view behaves particularly nicely with respect to multiplication: multiplying two unit complex numbers just adds their angles. Today I want to extend this viewpoint to the whole complex plane.

If we start with any nonzero complex number z=a+bi, we can find its absolute value \lvert z\rvert=\sqrt{a^2+b^2}. This is a positive real number which we’ll also call r. We can factor this out of z to find z=r\left(\frac{a}{r}+\frac{b}{r}i\right). The complex number in parentheses has unit absolute value, and so we can write it as e^{i\theta} for some \theta between -\pi and \pi. Thus we’ve written our complex number in the form

\displaystyle z=re^{i\theta}

where the positive real number r is the absolute value of z, and \theta — a real number in the range \left(-\pi,\pi\right] — is the angle z makes with the reference point 1+0i. But this is exactly how we define the polar coordinates (r,\theta) back in high school math courses.
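As a quick numerical sanity check of this decomposition, here is a minimal sketch using Python's standard `cmath` module. The helper names `to_polar` and `from_polar` are made up for this illustration. One caveat: `cmath.polar` reports angles in the closed range [-\pi,\pi], so a negative real number with a negative-zero imaginary part can come back with angle -\pi rather than \pi.

```python
import cmath
import math

def to_polar(z):
    """Return (r, theta) with r = |z| and theta the angle from 1+0i."""
    if z == 0:
        raise ValueError("zero has no well-defined angle")
    return cmath.polar(z)  # (abs(z), phase(z))

def from_polar(r, theta):
    """Rebuild the complex number r * e^{i theta}."""
    return cmath.rect(r, theta)

z = 3 - 4j
r, theta = to_polar(z)
w = from_polar(r, theta)  # should agree with z up to rounding
```

Here `r` is the absolute value \sqrt{3^2+4^2}=5, and `theta` is negative because the point lies below the real axis.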

Just like we saw for unit complex numbers, this notation is very well behaved with respect to multiplication. Given complex numbers r_1e^{i\theta_1} and r_2e^{i\theta_2} we calculate their product:

\displaystyle r_1e^{i\theta_1}r_2e^{i\theta_2}=\left(r_1r_2\right)e^{i\left(\theta_1+\theta_2\right)}

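That is, we multiply their lengths (as we already knew) and add their angles, just like before. This viewpoint also makes division simple:

\displaystyle\frac{r_1e^{i\theta_1}}{r_2e^{i\theta_2}}=\frac{r_1}{r_2}e^{i\left(\theta_1-\theta_2\right)}

In particular we see that

\displaystyle\left(re^{i\theta}\right)^{-1}=\frac{1}{r}e^{-i\theta}=\frac{1}{r^2}\left(re^{-i\theta}\right)

so multiplicative inverses are given in terms of complex conjugates and magnitudes as we already knew.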
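To see the product, quotient, and inverse rules in action, here is a small sketch that works directly on pairs (r,\theta) and compares against ordinary complex arithmetic. The function names are hypothetical, chosen only for this example. The summed angle may fall outside \left(-\pi,\pi\right], which is harmless because e^{i\theta} is periodic.

```python
import cmath

def polar_mul(p, q):
    """Multiply lengths, add angles."""
    (r1, t1), (r2, t2) = p, q
    return (r1 * r2, t1 + t2)

def polar_div(p, q):
    """Divide lengths, subtract angles."""
    (r1, t1), (r2, t2) = p, q
    return (r1 / r2, t1 - t2)

def polar_inv(p):
    """Invert the length, negate the angle."""
    r, t = p
    return (1 / r, -t)

z1, z2 = 1 + 2j, -3 + 0.5j
p1, p2 = cmath.polar(z1), cmath.polar(z2)

product = cmath.rect(*polar_mul(p1, p2))
quotient = cmath.rect(*polar_div(p1, p2))
inverse = cmath.rect(*polar_inv(p1))
```

The last value should match the conjugate of z_1 divided by \lvert z_1\rvert^2, which is the familiar formula for the inverse.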

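Powers (including roots) are also easy, which gives rise to easy ways to remember all those messy double- and triple-angle formulæ from trigonometry:

\displaystyle\left(re^{i\theta}\right)^n=r^ne^{in\theta}

\displaystyle\cos(2\theta)+i\sin(2\theta)=e^{2i\theta}=\left(e^{i\theta}\right)^2=\left(\cos(\theta)^2-\sin(\theta)^2\right)+2\sin(\theta)\cos(\theta)i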

Other angle addition formulæ should be similarly easy to verify from this point.
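If you would rather check the double- and triple-angle formulæ numerically than expand them by hand, a sketch like the following should do. It raises \cos(\theta)+i\sin(\theta) to the n-th power and reads off real and imaginary parts; the helper name `multiple_angle` and the sample angle are arbitrary choices for this example.

```python
import math

def multiple_angle(theta, n):
    """Return (cos(n*theta), sin(n*theta)) as the parts of (e^{i theta})^n."""
    w = complex(math.cos(theta), math.sin(theta)) ** n
    return w.real, w.imag

theta = 0.7
c, s = math.cos(theta), math.sin(theta)

cos2, sin2 = multiple_angle(theta, 2)
cos3, sin3 = multiple_angle(theta, 3)

# The expanded forms obtained by multiplying out (c + i s)^2 and (c + i s)^3.
double_angle = (c * c - s * s, 2 * s * c)
triple_angle = (c ** 3 - 3 * c * s * s, 3 * c * c * s - s ** 3)
```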

In general, since we consider complex numbers multiplicatively so often it will be convenient to have this polar representation of complex numbers at hand. It will also generalize nicely, as we will see.


May 29, 2009 Posted by | Fundamentals, Numbers | 3 Comments

The Circle Group

Yesterday we saw that the unit-length complex numbers are all of the form e^{i\theta}, where \theta measures the oriented angle from 1+0i around to the point in question. Since the absolute value of a complex number is multiplicative, we know that the product of two unit-length complex numbers is again of unit length. We can also see this using the exponential property:

\displaystyle e^{i\theta_1}e^{i\theta_2}=e^{i(\theta_1+\theta_2)}

So multiplying two unit-length complex numbers corresponds to adding their angles.

That is, the complex numbers on the unit circle form a group under multiplication of complex numbers — a subgroup of the multiplicative group of the complex field — and we even have an algebraic description of this group. The function sending the real number \theta to the point on the circle e^{i\theta} is a homomorphism from the additive group of real numbers to the circle group. Since every point on the circle has such a representative, it’s an epimorphism. What is the kernel? It’s the collection of real numbers satisfying

\displaystyle e^{i\theta}=\cos(\theta)+i\sin(\theta)=1+0i

that is, \theta must be an integral multiple of 2\pi — an element of the subgroup 2\pi\mathbb{Z}\subseteq\mathbb{R}. So, algebraically, the circle group is the quotient \mathbb{R}/(2\pi\mathbb{Z}). Or, isomorphically, we can just write \mathbb{R}/\mathbb{Z}.
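A rough floating-point illustration of the homomorphism and its kernel, assuming nothing beyond Python's standard library. The name `wrap` is just a label for the map \theta\mapsto e^{i\theta}, and the tolerance is an arbitrary choice.

```python
import cmath
import math

def wrap(theta):
    """The map from the real line to the circle: theta -> e^{i theta}."""
    return cmath.exp(1j * theta)

a, b = 1.3, 2.9

# Homomorphism: addition of angles goes to multiplication on the circle.
hom_ok = cmath.isclose(wrap(a + b), wrap(a) * wrap(b), abs_tol=1e-12)

# Integral multiples of 2*pi land on 1+0i (up to rounding) ...
kernel_ok = all(
    cmath.isclose(wrap(2 * math.pi * k), 1, abs_tol=1e-12) for k in range(-3, 4)
)

# ... while pi, which is not such a multiple, does not.
pi_in_kernel = cmath.isclose(wrap(math.pi), 1, abs_tol=1e-12)
```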

Something important has happened here. We have in hand two distinct descriptions of the circle. One we get by putting the unit-length condition on points in the plane. The other we get by taking the real line and “wrapping” it around itself periodically. I haven’t really mentioned the topologies, but the first approach inherits the subspace topology from the topology on the complex numbers, while the second approach inherits the quotient topology from the topology on the real numbers. And it turns out that the identity map from one version of the circle to the other one is actually a homeomorphism, which further shows that the two descriptions give us “the same” result.

What’s really different between the two cases is how they generalize. I’ll probably come back to these in more detail later, but for now I’ll point out that the first approach generalizes to spheres in higher dimensions, while the second generalizes to higher-dimensional tori. Thus the circle is sometimes called the one-dimensional sphere S^1, and sometimes called the one-dimensional torus T^1, and each one calls to mind a slightly different vision of the same basic object of study.

May 27, 2009 Posted by | Algebra, Fundamentals, Group theory, Numbers | 3 Comments

Complex Numbers and the Unit Circle

When I first talked about complex numbers there was one perspective I put off, and now need to come back to. It makes deep use of Euler’s formula, which ties exponentials and trigonometric functions together in the relation

\displaystyle e^{i\theta}=\cos(\theta)+i\sin(\theta)

where we’ve written e for \exp(1) and used the exponential property.

Remember that we have a natural basis for the complex numbers as a vector space over the reals: \left\{1,i\right\}. If we ask that this natural basis be orthonormal, we get a real inner product on complex numbers, which in turn gives us lengths and angles. In fact, this notion of length is exactly that which we used to define the absolute value of a complex number, in order to get a topology on the field.

So what happens when we look at e^{i\theta}? First, we can calculate its length using this inner product, getting

\displaystyle\left\lvert e^{i\theta}\right\rvert=\sqrt{\cos(\theta)^2+\sin(\theta)^2}=1

by the famous trigonometric identity. That is, every complex number of the form e^{i\theta} lies a unit distance from the complex number {0}.

In particular, 1+0i=e^{0i} is a nice reference point among such points. We can use it as a fixed post in the complex plane, and measure the angle it makes with any other point. For example, we can calculate the inner product

\displaystyle\left\langle1,e^{i\theta}\right\rangle=1\cdot\cos(\theta)+0\cdot\sin(\theta)=\cos(\theta)

and thus we find that the point e^{i\theta} makes an angle \lvert\theta\rvert with our fixed post {1}, at least for -\pi\leq\theta\leq\pi. We see that e^{i\theta} traces a circle by increasing the angle in one direction as \theta increases from {0} to \pi, and increasing the angle in the other direction as \theta decreases from {0} to -\pi. For values of \theta outside this range, we can use the fact that

\displaystyle e^{2\pi i}=\cos(2\pi)+i\sin(2\pi)=1+0i

to see that the function e^{i\theta} is periodic with period 2\pi. That is, we can add or subtract whatever multiple of 2\pi we need to move \theta within the range -\pi<\theta\leq\pi. Thus, as \theta varies the point e^{i\theta} traces out a circle of unit radius, going around and around with period 2\pi, and every point on the unit circle has a unique representative of this form with \theta in the given range.
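The "add or subtract whatever multiple of 2\pi we need" step can be sketched as a small function. This is one possible way to do it, and the name `principal_angle` is made up here. Exactly at the endpoints floating-point rounding may pick the other end of the range, so treat the boundary behavior as approximate.

```python
import math

def principal_angle(theta):
    """Shift theta by an integer multiple of 2*pi into the range (-pi, pi]."""
    two_pi = 2 * math.pi
    # ceil((theta - pi) / 2pi) counts how many full turns to remove.
    turns = math.ceil((theta - math.pi) / two_pi)
    return theta - two_pi * turns

reduced_a = principal_angle(7.0)   # one full turn removed
reduced_b = principal_angle(-4.0)  # one full turn added
```

Both reduced angles describe the same points on the circle as the originals, since cosine and sine are unchanged by the shift.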

May 26, 2009 Posted by | Fundamentals, Numbers | 3 Comments

Properties of Complex Numbers

Today I’ll collect a few basic properties of complex numbers.

First off, they form a vector space over the reals. We constructed them as an algebra — the quotient of the algebra of polynomials by a certain ideal — and every algebra is a vector space. So what can we say about them as a vector space? The easiest fact is that it’s two-dimensional, and it’s got a particularly useful basis.

To see this, remember that we have a basis for the algebra of polynomials, which is given by the powers of the variable. So here when we throw in the formal element i, its powers form a basis of the ring \mathbb{R}[i]. But we have a relation, and that cuts things down a bit. Specifically, the element i^2 is the same as the element -1.

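Given a polynomial p in the “variable” i, we can write it as

\displaystyle p=c_0+c_1i+c_2i^2+c_3i^3+\dots+c_ni^n

We can peel off the constant and linear terms, and then pull out a factor of i^2:

\displaystyle p=c_0+c_1i+i^2\left(c_2+c_3i+\dots+c_ni^{n-2}\right)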

Now this factor of i^2 can be replaced by -1, which drops the overall degree. We can continue like this, eventually rewriting any term involving higher powers of i using only constant and linear terms. That is, any complex number can be written as c_0+c_1i, where c_0 and c_1 are real constants. Further, this representation is unique. This establishes the set \{1,i\} as a basis for \mathbb{C} as a vector space over \mathbb{R}.
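The reduction procedure is easy to mechanize. Here is a sketch that takes a list of real coefficients (with `coeffs[k]` the coefficient of i^k, a convention chosen for this example) and repeatedly replaces the top power i^n by -i^{n-2} until only constant and linear terms remain.

```python
def reduce_powers_of_i(coeffs):
    """Rewrite sum(coeffs[k] * i**k) as c0 + c1*i using the relation i^2 = -1."""
    coeffs = list(coeffs)
    while len(coeffs) > 2:
        top = coeffs.pop()   # coefficient of i^n with n >= 2
        coeffs[-2] -= top    # i^n = -i^(n-2), so fold it two places down
    coeffs += [0] * (2 - len(coeffs))  # pad constants like [5] to [5, 0]
    return coeffs[0], coeffs[1]

# 1 + 2i + 3i^2 + 4i^3 + 5i^4
c0, c1 = reduce_powers_of_i([1, 2, 3, 4, 5])

# Cross-check against Python's built-in complex numbers.
direct = sum(c * 1j ** k for k, c in enumerate([1, 2, 3, 4, 5]))
```

For this input the reduced form is 3-2i, which agrees with evaluating the polynomial directly.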

Now the additive parts of the field structure are clear from the vector space structure here. We can write the sum of two complex numbers a_1+b_1i and a_2+b_2i simply by adding the components: (a_1+a_2)+(b_1+b_2)i. We get the negative of a complex number by taking the negatives of the components.

We can also write out products pretty simply, since we know the product of pairs of basis elements. The only one that doesn’t involve the unit of the algebra is i\otimes i\mapsto-1. So in terms of components we can write out the product of the complex numbers above as (a_1a_2-b_1b_2)+(a_1b_2+a_2b_1)i.

Notice here that the field of real numbers sits inside that of complex numbers, using scalar multiples of the complex unit. This is characteristic of algebras, but it’s worth pointing out here. Any real number a can be considered as the complex number a+0i. This preserves all the field structures, but it ignores the order on the real numbers. A small price to pay, but an important one in certain ways.

We also mentioned the symmetry between i and -i. Either one is just as valid as a square root of -1 as the other is, so if we go through consistently replacing i with -i, and -i with i, we can’t tell the difference. This leads to an automorphism of fields called “complex conjugation”. It sends the complex number a+bi to its “conjugate” a-bi. This preserves all the field structure — additive and multiplicative — and it fixes the real numbers sitting inside the complex numbers.
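To make the component formulas and the conjugation symmetry concrete, here is a sketch that treats a complex number as a bare pair (a,b) of components. The function names are hypothetical, and integer components are used so every comparison is exact.

```python
def add(z, w):
    """Componentwise sum of (a1, b1) and (a2, b2)."""
    return (z[0] + w[0], z[1] + w[1])

def mul(z, w):
    """Product using i*i = -1: (a1 a2 - b1 b2) + (a1 b2 + a2 b1) i."""
    (a1, b1), (a2, b2) = z, w
    return (a1 * a2 - b1 * b2, a1 * b2 + a2 * b1)

def conj(z):
    """Complex conjugation: swap i for -i."""
    return (z[0], -z[1])

z, w = (1, 2), (3, -4)

# Conjugation should commute with both field operations.
respects_sum = conj(add(z, w)) == add(conj(z), conj(w))
respects_product = conj(mul(z, w)) == mul(conj(z), conj(w))

# It should also fix real numbers a + 0i.
fixes_reals = conj((7, 0)) == (7, 0)
```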

Studying this automorphism, and similar structures of other field extensions forms the core of what algebraists call “Galois theory”. I’m not going there now, but it’s a huge part of modern mathematics, and its study is ultimately the root of all of our abstract algebraic techniques. The first groups were automorphism groups shuffling around roots of polynomials.

August 8, 2008 Posted by | Fundamentals, Numbers | 25 Comments

The Complex Numbers

Yesterday we defined a field to be algebraically closed if any polynomial over the field always has exactly as many roots (counting multiplicities) as we expect from its degree. But we don’t know a single example of an algebraically closed field. Today we’ll (partially) remedy that problem.

First, remember the example we used for a polynomial with no roots over the real numbers. That is p=X^2+1. The problem is that we have no field element whose square is -1. So let’s just postulate one! This, of course, has the same advantages as those of theft over honest toil. We write our new element as i for “imaginary”, and throw it in with the rest of the real numbers \mathbb{R}.

Okay, just like when we threw in X as a new element, we can build up sums and products involving real numbers and this new element i. But there’s one big difference here: we have a relation that i must satisfy. When we use the evaluation map we must find \mathrm{ev}(X^2+1,i)=0. And, of course, any polynomial which includes X^2+1 as a factor must evaluate to {0} as well. But this is telling us that the kernel of the evaluation homomorphism for i contains the principal ideal (X^2+1).

Can it contain anything else? If q\in\mathbb{R}[X] is a polynomial in the kernel, but q is not divisible by X^2+1, then Euclid’s algorithm gives us a greatest common divisor of q and X^2+1, which is a linear combination of these two (and so also lies in the kernel), and must have degree either {0} or {1}. In the former case, the kernel contains a nonzero constant, so the evaluation map would have to send everything (even the constant polynomials) to zero. In the latter case, we’d have a linear factor of X^2+1, which would give a real root. Clearly neither of these situations can occur, so the kernel of the evaluation homomorphism at i is exactly the principal ideal (X^2+1).

Now the first isomorphism theorem for rings tells us that we can impose our relation by taking the quotient ring \mathbb{R}[X]/(X^2+1). But what we just discussed above further goes to show that (X^2+1) is a maximal ideal, and the quotient of a ring by a maximal ideal is a field! Thus when we take the real numbers and adjoin a square root of -1 to get a ring we might call \mathbb{R}[i], the result is a field. This is the field of “complex numbers”, which is more usually written \mathbb{C}.

Now we’ve gone through a lot of work to just add one little extra element to our field, but it turns out this is all we need. Luckily enough, the complex numbers are already algebraically closed! This is very much not the case if we were to try to algebraically close other fields (like the rational numbers). Unfortunately, the proof really is essentially analytic. It seems to be a completely algebraic statement, but remember all the messy analysis and topology that went into defining the real numbers.

Don’t worry, though. We’ll come back and prove this fact once we’ve got a bit more analysis under our belts. We’ll also talk a lot more about how to think about complex numbers. But for now all we need to know is that they’re the “algebraic closure” of the real numbers, we get them by adding a square root of -1 that we call i, and we can use them as an example of an algebraically closed field.

One thing we can point out now, though, is the inherent duality of our situation. You see, we didn’t just add one square root of -1. Indeed, once we have complex numbers to work with we can factor X^2+1 as (X-i)(X+i) (test this by multiplying it out and imposing the relation). Then we have another root: -i. This is just as much a square root of -1 as i was, and anything we can do with i we can do with -i. That is, there’s a symmetry in play that exchanges i and -i. We can pick one and work with it, but we must keep in mind that whenever we do we’re making a non-canonical choice.
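The suggested test of the factorization can be done by machine as well as by hand. This sketch multiplies polynomials stored as coefficient lists (lowest degree first, a convention picked for this example) and uses Python's built-in `1j` for i.

```python
def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists, lowest degree first."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

# (X - i)(X + i), written as [-i, 1] times [i, 1]
product = poly_mul([-1j, 1], [1j, 1])

def x_squared_plus_one(x):
    return x * x + 1

# Both i and -i should be roots of X^2 + 1.
values_at_roots = [x_squared_plus_one(1j), x_squared_plus_one(-1j)]
```

The product comes out with coefficients 1, 0, 1, that is, X^2+1, and neither root is preferred over the other.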

August 7, 2008 Posted by | Algebra, Fundamentals, Numbers, Polynomials, Ring theory | 10 Comments

Archimedean Groups and the Largest Archimedean Field

Okay, I’d promised to get back to the fact that the real numbers form the “largest” Archimedean field. More precisely, any Archimedean field is order-isomorphic to a subfield of \mathbb{R}.

There’s an interesting side note here. I was thinking about this and couldn’t quite see my way forward. So I started asking around Tulane’s math department and seeing if anyone knew. Someone pointed me towards Mike Mislove, and when I asked him, he suggested we ask Laszlo Fuchs around the corner from him. Dr. Fuchs, it turned out, did know the answer, and it was in a book he’d written himself: Partially Ordered Algebraic Systems. It’s an interesting little volume, which I may come back and mine later for more topics.

Anyhow, we’ll do this a little more generally. First let’s talk about Archimedean ordered groups a bit. In a totally-ordered group G we’ll say two elements a and b are “Archimedean equivalent” (a\sim b) if there are natural numbers m and n so that |a|<|b|^m and |b|<|a|^n (here I’m using the absolute value that comes with any totally-ordered group). That is, neither one is infinitesimal with respect to the other. This can be shown to be an equivalence relation, so it chops the elements of G into equivalence classes. There are always at least two in any nontrivial group because the identity element is infinitesimal with respect to everything else. We say a group is Archimedean if there are only two Archimedean equivalence classes. That is, for any a and b other than the identity, there is a natural number n with |a|<|b|^n.

Now we have a theorem of Hölder which says that any Archimedean group is order-isomorphic to a subgroup of the real numbers with addition. In particular, we will see that any Archimedean group is commutative.

Now either G has a least positive element g or it doesn’t. If it does, then e\leq x<g implies that x=e (e is the identity of the group). By the Archimedean property, any element a has an integer n so that g^n\leq a<g^{n+1}. Then we can multiply by g^{-n} to find that e\leq g^{-n}a<g, so g^{-n}a=e. Every element is thus some power of g, and the group is isomorphic to the integers \mathbb{Z}\subseteq\mathbb{R}.

On the other hand, what if given a positive x we can always find a positive y with y<x? If y^2\leq x we can take z=y. Otherwise y^2 is greater than x, but then we can show that (xy^{-1})^2\leq x, and xy^{-1} itself is less than x, so either way we have an element z with e<z<x and z^2\leq x.

Now if two positive elements a and b fail to commute then without loss of generality we can assume ba<ab. Then we pick x=aba^{-1}b^{-1}>e and choose a z to go with this x. By the Archimedean property we’ll have numbers m and n with z^m\leq a<z^{m+1} and z^n\leq b<z^{n+1}. Thus we find that x<z^2, which contradicts how we picked z. And thus G is commutative.

So we can pick some positive element a\in G and just set f(a)=1\in\mathbb{R}. Now we need to find where to send every other element. To do this, note that for any b\in G and any rational number \frac{m}{n}\in\mathbb{Q} we’ll either have a^m\leq b^n or a^m\geq b^n, and both of these situations must arise by the Archimedean property. This separates the rational numbers into two nonempty collections — a cut! So we define f(b) to be the real number specified by this cut. It’s straightforward now to show that f(bc)=f(b)+f(c), and thus establish the order isomorphism.
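To get a feel for how the cut pins down f(b), here is a sketch in the simplest possible setting: the group is taken to be the rationals under addition (an assumption made purely for illustration), so a^m\leq b^n reads ma\leq nb. The function only ever adds group elements and compares them, and the name `holder_lower_approx` is invented for this example.

```python
from fractions import Fraction

def holder_lower_approx(a, b, n):
    """Largest m/n with m*a <= n*b, for positive a and b in an additive group.

    The loop terminates because of the Archimedean property.
    """
    m = 0
    while (m + 1) * a <= n * b:
        m += 1
    return Fraction(m, n)

a = Fraction(3, 2)  # the element we declare to have f(a) = 1
b = Fraction(7, 4)

approximations = [holder_lower_approx(a, b, n) for n in (1, 6, 10, 100)]
```

In this toy case the cut is at the rational number 7/6, which is just b/a, and the approximation with denominator n is never off by more than 1/n.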

So all Archimedean groups are just subgroups of \mathbb{R} with addition as its operation. In fact, homomorphisms of such groups are just as simple.

Say that we have a nontrivial Archimedean group A\subseteq\mathbb{R}, a (possibly trivial) Archimedean group B\subseteq\mathbb{R}, and a homomorphism f:A\rightarrow B. If f(a)=0 for some positive a\in A then this is just the trivial homomorphism sending everything to zero, since for any positive x there is a natural number n so that x<na. In this case the homomorphism is “multiply by {0}”.

On the other hand, take any two positive elements a_1,a_2\in A and consider the quotients (in \mathbb{R}) \frac{a_1}{a_2} and \frac{f(a_1)}{f(a_2)}. If they’re different (say, \frac{f(a_1)}{f(a_2)}<\frac{a_1}{a_2}) then we can pick a rational number \frac{m}{n} between them. Then nf(a_1)<mf(a_2), while ma_2<na_1, which contradicts the order-preserving property of the homomorphism! Thus we find the ratio \frac{f(a)}{a} must be a constant r>0, and the homomorphism is “multiply by r”.

Now let’s move up to Archimedean rings, whose definition is the same as that for Archimedean fields. In this case, either the product of any two elements is {0} (we have a “zero ring”) and the additive group is order-isomorphic to a subgroup of \mathbb{R}, or the ring is order-isomorphic to a subring of \mathbb{R}. If we have a zero ring, then the only data left is an Archimedean group, which the above discussion handles, so we’ll just assume that we have some nonzero product and show that we have an order-isomorphism with a subring of \mathbb{R}.

So we’ve got some Archimedean ring R and its additive group R_+. By the theorem above, R_+ is order-isomorphic to a subgroup of \mathbb{R}. We also know that for any positive a\in R the operation \lambda_a(x)=a\cdot x (the dot will denote the product in R) is an order-homomorphism from R_+ to itself. Thus there is some non-negative real number r_a so that \lambda_a(x)=r_ax. If we define r_{-a}=-r_a then the assignment a\mapsto r_a gives us an order-homomorphism from R_+ to some group S_+\subseteq\mathbb{R}.

Again, we must have r_a=sa for some non-negative real number s. If s=0 then all multiplications in R would give zero, so 0<s, and so the assignment is invertible. Now we see that a\cdot b=r_ab=sab. Similarly, we have r_{a\cdot b}=s(a\cdot b)=(sa)(sb)=r_ar_b, and so the function a\mapsto r_a is an order-isomorphism of rings.

In particular, a field \mathbb{F} can’t be a zero ring, and so there must be an injective order-homomorphism \mathbb{F}\rightarrow\mathbb{R}. In fact, there can be only one, for if there were more than one the images would be related by multiplication by some positive r\in\mathbb{R}: \phi_1(a)=r\phi_2(a). But then r\phi_2(a)\phi_2(b)=r\phi_2(a\cdot b)=\phi_1(a\cdot b)=\phi_1(a)\phi_1(b)=r^2\phi_2(a)\phi_2(b), and so r=1.

We can sum this up by saying that the real numbers \mathbb{R} are a terminal object in the category of Archimedean fields.

December 17, 2007 Posted by | Algebra, Fundamentals, Group theory, Numbers, Orders, Ring theory | 3 Comments

Archimedean Fields

Whether we use Dedekind cuts or Cauchy sequences to construct the ordered field of real numbers \mathbb{R} (and it doesn’t matter which), we are taking the ordered field of rational numbers and enlarging it to be “complete” in some sense or another. But we also aren’t making it too much bigger. The universality property we got from completing the uniform structure already gives evidence of that, but there’s another property which we can show is true of \mathbb{R}, and which shows that the real numbers aren’t too unwieldy.

In The Sand Reckoner, the ancient Greek mathematician Archimedes set about showing the number of grains of sand in existence to be finite. He does this by determining a (very weak) upper bound: the number of grains of sand it would take to fill up the entire universe, as he understood the latter term. He writes:

There are some … who think that the number of the sand is infinite in multitude; and I mean by the sand not only that which exists about Syracuse and the rest of Sicily but also that which is found in every region whether inhabited or uninhabited. Again there are some who, without regarding it as infinite, yet think that no number has been named which is great enough to exceed its multitude. And it is clear that they who hold this view, if they imagined a mass made up of sand in other respects as large as the mass of the earth filled up to a height equal to that of the highest of the mountains, would be many times further still from recognizing that any number could be expressed which exceeded the multitude of the sand so taken. But I will try to show you by means of geometrical proofs, which you will be able to follow, that, of the numbers named by me … some exceed not only the number of the mass of sand equal in magnitude to the earth filled up in the way described, but also that of a mass equal in magnitude to the universe

The deep fact here is a fundamental realization about numbers: the set of natural numbers has no upper bound in the real number system. That is, no matter how huge a real number we pick there’s always a natural number bigger than it. Equivalently, given any positive real number x — even as small as the volume of a grain of sand — and another positive real number y — even as large as the volume of (the ancient Greek conception of) the universe — there’s some natural number n so that nx\geq y. When this happens in a given ordered field we say that the field is “Archimedean”.

So let’s show that \mathbb{R} is Archimedean. If there were positive real numbers x and y so that nx\leq y for all natural numbers n, then y would be an upper bound for the set of nx. Then Dedekind completeness gives us a least upper bound \sup\{nx\}, and we can just take y to be this least upper bound. Now nx\leq y, and also (n+1)x\leq y, and so nx\leq y-x. That is, y-x is another upper bound for the set of multiples of x. But since x was chosen to be positive we see that y-x<y, contradicting the assumption that y was the least such upper bound. So such a pair of real numbers can’t exist.

In particular, we can take a positive real number x and consider the set of natural numbers n which are larger than it. Since the natural numbers are well-ordered, there is a least such number, and it can’t be {0} because we assume x>0. Subtracting one from this number will then give the largest natural number that is still below x in the real number order, and we denote this number by \lfloor x\rfloor. We can thus write any positive number uniquely as the sum \lfloor x\rfloor+r of a natural number and some remainder with 0\leq r<1.
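The construction of \lfloor x\rfloor can be followed literally in code. This sketch uses exact rationals from Python's `fractions` module as stand-ins for real numbers (an assumption for the sake of exact arithmetic), and the name `floor_by_well_ordering` is made up here. The search loop stops precisely because of the Archimedean property.

```python
from fractions import Fraction

def floor_by_well_ordering(x):
    """Find the least natural number strictly greater than x, then step back by one."""
    if x <= 0:
        raise ValueError("this sketch only handles positive x")
    n = 0
    while n <= x:  # terminates: some natural number exceeds x
        n += 1
    return n - 1

x = Fraction(22, 7)
whole = floor_by_well_ordering(x)
remainder = x - whole  # should satisfy 0 <= remainder < 1
```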

It turns out that the real numbers are actually the largest Archimedean field. That is, if \mathbb{F} is any ordered field satisfying the Archimedean property, there will be a monomorphism of ordered fields \mathbb{F}\rightarrow\mathbb{R}, making (the image of) \mathbb{F} a subfield of \mathbb{R}. I won’t prove this here, but I will note one thing about the meaning of this result: the Archimedean property essentially limits the size of an ordered field. That is, an ordered field can’t get too big without breaking this property. Dually, an ordered field can’t get too small without breaking Dedekind completeness or uniform completeness. Completeness pulls the field one way, while the Archimedean property pulls the other way, and the two reach a sort of equilibrium in the real numbers, living both at the top of one world and the bottom of the other.

December 7, 2007 Posted by | Fundamentals, Numbers | 7 Comments

Cuts and Sequences are Equivalent

Sorry to not get this posted until so late, but the end of the semester has been a bit hectic.

We’ve used Dedekind cuts to “complete” the order on the rational numbers — to make sure that every nonempty set of numbers with an upper bound has a least upper bound. We’ve also used Cauchy sequences to “complete” the uniform structure on the rational numbers — to make sure that every Cauchy sequence converges. But do we actually get the same thing in each case?

If we take a real number x represented by a Cauchy sequence x_n it’s easy to come up with a cut. Given a rational number q we use the constant sequence q_n=q and compare it to x_n. If x_n-q_n is eventually nonnegative then q is less than x, and should go into the left set X_L. On the other hand, if it’s eventually nonpositive then q is greater than x and should go into the right set X_R. It’s straightforward to show that this function from \mathbb{R} to the set of cuts preserves the order.

Now let’s start with the cut (X_L,X_R) and write down a Cauchy sequence. Pick some x_L\in X_L and x_R\in X_R, and construct the sequence as follows. First write down x_0=x_L and x_1=x_R. Now set x_2=\frac{x_L+x_R}{2}. This value will either be in X_L or it won’t. If it is, replace x_L by x_2, and otherwise replace x_R by x_2. Then define x_3 as the midpoint between our two left and right points, and again replace either the left or the right point. Keep going, and we see that all future numbers in the sequence are closer to each other than the current x_L and x_R are to each other. And these two always keep moving closer and closer to each other, halving their distance at each step. So the sequence has to be Cauchy. If we picked a different x_L and x_R to start with, we’d get an equivalent sequence. I’ll leave this to you to show.
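Here is a sketch of the bisection construction, run on the cut that ought to correspond to \sqrt{2}. The membership test `in_left_set` and the starting points 1 and 2 are choices made for this example, and exact rationals keep everything inside \mathbb{Q}.

```python
from fractions import Fraction

def in_left_set(q):
    """Left set of the cut for sqrt(2): rationals q with q < 0 or q*q <= 2."""
    return q < 0 or q * q <= 2

def bisection_sequence(in_left, x_left, x_right, steps):
    """Build x_0, x_1, x_2, ... by repeatedly taking midpoints across the cut."""
    sequence = [x_left, x_right]
    for _ in range(steps):
        mid = (x_left + x_right) / 2
        sequence.append(mid)
        if in_left(mid):
            x_left = mid   # the midpoint is in X_L, so move the left point up
        else:
            x_right = mid  # otherwise move the right point down
    return sequence, x_left, x_right

sequence, lo, hi = bisection_sequence(in_left_set, Fraction(1), Fraction(2), 20)
```

After twenty steps the two points straddle the cut and are only 2^{-20} apart, which is the halving behavior that makes the sequence Cauchy.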

Notice here that the points in the sequence that lie in X_L are moving steadily upwards towards the cut, and those in X_R are moving steadily downwards towards it. Eventually, the sequence will rise above any point in X_L and fall below any point in X_R, and so if we take this sequence and build a cut from it we will get back the exact same cut we started with. Also, if we build a cut from a Cauchy sequence, and then a sequence from it, we get back an equivalent sequence. Thus we have set up a bijection between the set of cuts and the set of equivalence classes of Cauchy sequences, and we’ve already seen that it preserves the order structure.

Now let’s look at the map from sequences to cuts and verify that it preserves addition and multiplication of positive numbers. This will make the map into an isomorphism of ordered fields, and so both constructions are describing essentially the same thing. So if we have Cauchy sequences x_n and y_n, which give right sets X_R and Y_R of rational numbers, then what’s the right set of the sequence x_n+y_n? It’s the set of rational numbers q so that x_n+y_n-q is eventually nonpositive. But any such q can be broken up as q=q_x+q_y, where x_n-q_x and y_n-q_y are both eventually nonpositive. That is, (X+Y)_R is the set of sums of elements of X_R and Y_R, and so addition is preserved. The proof for multiplication is essentially the same.

So both methods of extending the rational numbers give us essentially the same ordered field, which is thus both complete as a uniform space and Dedekind complete.

December 7, 2007 Posted by | Fundamentals, Numbers | 1 Comment

Dedekind Completion

There’s another sense in which the rational numbers are lacking and the real numbers fix them up. This one is completely about the order structure on \mathbb{Q}, and will lead to another construction of the real numbers.

Okay, so what’s wrong with the rational numbers now? In any partial order we can consider least upper or greatest lower bounds. That is, given a nonempty set of rational numbers S the least upper bound or “supremum” \sup S is a rational number so that \sup S\geq s for all s\in S — it’s an upper bound — and if b is any upper bound then \sup S\leq b. Similarly the “infimum” \inf S is the greatest lower bound — a lower bound that’s greater than all the other lower bounds.

There’s just one problem: there might not be a supremum of a given set. Even if the set is bounded above, there may be no least upper bound. For instance, the set of all rational numbers r so that r^2\leq 2 is bounded above, since 2 is an example of an upper bound, and \frac{3}{2} is another. But no matter what upper bound we have in hand, we can always find a lower number which is still an upper bound for this set. In fact, the least upper bound should be \sqrt{2}, but this isn’t a rational number!
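One concrete way to produce a lower upper bound is sketched below. The averaging formula is a Newton-style step that I am supplying for illustration, not something taken from the argument above: for a positive rational b with b^2>2, the average of b and 2/b is again rational, is smaller than b, and still has square greater than 2.

```python
from fractions import Fraction

def smaller_upper_bound(b):
    """Given a rational b > 0 with b*b > 2, return a smaller rational with square > 2."""
    if b <= 0 or b * b <= 2:
        raise ValueError("b must be a positive upper bound with b*b > 2")
    return (b + 2 / b) / 2

bounds = [Fraction(3, 2)]
for _ in range(4):
    bounds.append(smaller_upper_bound(bounds[-1]))
```

Starting from 3/2 the next bound is 17/12, and the process never stops, since no rational number squares to exactly 2.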

Now we define an ordered field to be “Dedekind complete” if every nonempty set S with an upper bound has a least upper bound \sup S. Considering the set consisting of the negatives of elements of S, every nonempty set with a lower bound will have a greatest lower bound. The flaw in the rational numbers is that they are not Dedekind complete.

So, in order to complete them, we will use the method of “Dedekind cuts”. Given a nonempty set S with any upper bounds at all, the collection of all the upper bounds forms another set S', and any element of S gives a lower bound for S'. We then have the collection of all lower bounds of S', which we call S''. Every rational number will be in either S' or S''. Given a rational number r if there is an s\in S with s\geq r then r is a lower bound for S', and is thus in S''. If not, then r is an upper bound for S and is thus in S'. However, one number may be in both S' and S''. If S has a rational supremum then it is in both S' and S''. There can be no more overlap.

So we’ve come up with a way of cutting the rational numbers into a pair of sets (X_L,X_R), the “left set” and “right set” of the “cut” X, with x_L\leq x_R for every x_L\in X_L and x_R\in X_R. We define a new total order on the set of cuts by (X_L,X_R)\leq(Y_L,Y_R) if X_L\subseteq Y_L. Every rational number corresponds to one of the cuts which contains a one-point overlap at that rational number, and clearly this inclusion preserves the order on the rational numbers.

Now if I take any nonempty collection of cuts (X^\alpha_L,X^\alpha_R) with an upper bound (in the order on the set of cuts) we can take the union of all the X^\alpha_L and the intersection of all the X^\alpha_R to get a new cut. The left set contains all the X^\alpha_L, and it’s contained in any other left set which contains them, so it’s the least upper bound. Thus the collection of cuts is now Dedekind complete.

We can also add field structures to this completion of an ordered field. For this purpose, it will be useful to denote a generic element of X_L by x_L, and assume x_L runs over all elements of X_L wherever it appears. For instance, given cuts (X_L,X_R) and (Y_L,Y_R), we define their sum to be (\{x_L+y_L\},\{x_R+y_R\}). The negative of a cut (X_L,X_R) will be (\{-x_R\},\{-x_L\}). The product of two positive cuts (X_L,X_R) and (Y_L,Y_R) will have as its right set the collection of all products \{x_Ry_R\}, and its left set defined to make this a cut. Finally, since taking reciprocals of positive numbers reverses the order, the reciprocal of a positive cut (X_L,X_R) will have as its right set the collection of all upper bounds of the set \{\frac{1}{x_R}\}, and its left set defined to make this a cut.

This suffices to define all the field operations on the set of cuts, and if we start with \mathbb{Q} we get another model of \mathbb{R}. I’ll leave verification of the field axioms as an exercise, and come back to prove that the method of cuts and the method of Cauchy sequences are equivalent. Once you play with cuts for a while, you may understand why I came at the real numbers with Cauchy sequences first. The cut approach seems to have a certain simplicity, and it’s less ontologically demanding since we’re only ever talking about pairs of subsets of the rational numbers rather than gigantic equivalence classes of sequences. But in the end I always find cuts to be extremely difficult to work with. Luckily, once we’ve shown them to be equivalent to Cauchy sequences it will establish that the real numbers we’ve been talking about are Dedekind complete, and we can put the messiness of this definition behind us.

December 5, 2007 Posted by | Fundamentals, Numbers, Orders | 15 Comments

The Order on the Real Numbers

We’ve defined the real numbers \mathbb{R} as a topological field by completing the rational numbers \mathbb{Q} as a uniform space, and then extending the field operations to the new points by continuity. Now we extend the order on the rational numbers to make \mathbb{R} into an ordered field.

First off, we can simplify our work greatly by recognizing that we just need to determine the subset \mathbb{R}^+ of positive real numbers — those x\in\mathbb{R} with x\geq0. Then we can say x\geq y if x-y\geq0. Now, each real number is represented by a Cauchy sequence of rational numbers, and so we say x\geq0 if x has a representative sequence x_n with each point x_n\geq 0.

What we need to check is that the positive numbers are closed under both addition and multiplication. But clearly if we pick x_n and y_n to be nonnegative Cauchy sequences representing x and y, respectively, then x+y is represented by x_n+y_n and xy is represented by x_ny_n, and these will be nonnegative since \mathbb{Q} is an ordered field.

Now for each x, x-x=0\geq0, so x\geq x. Also, if x\geq y and y\geq z, then x-y\geq0 and y-z\geq0, so x-z=(x-y)+(y-z)\geq0, and so x\geq z. These show that \geq defines a preorder on \mathbb{R}, since it is reflexive and transitive. Further, if x\geq y and y\geq x then x-y\geq0 and y-x\geq0, so x-y=0 and thus x=y. This shows that \geq is a partial order. This order is total because any real number either has a nonnegative representative or it doesn’t, and in the latter case its negative does.

One thing is a little hazy here. We asserted that if a number and its negative are both greater than or equal to zero, then it must be zero itself. Why is this? Well if x_n is a nonnegative Cauchy sequence representing x then -x_n represents -x. Now can we find a nonnegative Cauchy sequence y_n equivalent to -x_n? The lowest rational number that y_n can be is, of course, zero, and so \left|y_n-(-x_n)\right|\geq x_n. But for -x_n and y_n to be equivalent we must have for each positive rational r an N so that r\geq\left|y_n-(-x_n)\right|\geq x_n for n\geq N. But this just says that x_n converges to 0!

So \mathbb{R} is an ordered field, so what does this tell us? First off, we get an absolute value \left|x\right| just like we did for the rationals. Secondly, we’ll get a uniform structure as we do for any ordered group. This uniform topology has a subbase consisting of all the half-infinite intervals (x,\infty) and (-\infty,x) for all real x. But this is also a subbase for the metric topology we got from completing the rationals, and so the two topologies coincide!

One more very important thing holds for all ordered fields. As a field, \mathbb{F} is a kind of ring with unit, and like any ring with unit there is a unique ring homomorphism \mathbb{Z}\rightarrow\mathbb{F}. Now since 1>0 in any ordered field, we have 2=1+1>0, and 3=2+1>0, and so on, to show that no nonzero integer can become zero under this map. Since we have an injective homomorphism of rings, the universal property of the field of fractions gives us a unique field homomorphism \mathbb{Q}\rightarrow\mathbb{F} extending the ring homomorphism from the integers.
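As a sanity check on how these maps are forced, here is a hedged Python sketch with one specific ordered field standing in for \mathbb{F}: numbers of the form a+b\sqrt{2} with a and b rational. The class QSqrt2 and the helpers integer_image and rational_image are names I made up for illustration. The only point is that an integer n must go to n copies of the unit added together, and a fraction \frac{p}{q} must go to the image of p times the inverse of the image of q.

```python
from dataclasses import dataclass
from fractions import Fraction

@dataclass(frozen=True)
class QSqrt2:
    """Element a + b*sqrt(2), with a and b rational (a hypothetical target field F)."""
    a: Fraction
    b: Fraction

    def __add__(self, other: "QSqrt2") -> "QSqrt2":
        return QSqrt2(self.a + other.a, self.b + other.b)

    def __neg__(self) -> "QSqrt2":
        return QSqrt2(-self.a, -self.b)

    def __mul__(self, other: "QSqrt2") -> "QSqrt2":
        # (a + b*s)(c + d*s) = (ac + 2bd) + (ad + bc)*s, where s*s = 2
        return QSqrt2(self.a * other.a + 2 * self.b * other.b,
                      self.a * other.b + self.b * other.a)

    def inverse(self) -> "QSqrt2":
        # 1/(a + b*s) = (a - b*s)/(a^2 - 2b^2); the denominator vanishes only at zero
        norm = self.a * self.a - 2 * self.b * self.b
        if norm == 0:
            raise ZeroDivisionError("zero has no inverse")
        return QSqrt2(self.a / norm, -self.b / norm)

ZERO = QSqrt2(Fraction(0), Fraction(0))
ONE = QSqrt2(Fraction(1), Fraction(0))

def integer_image(n: int) -> QSqrt2:
    """Image of n under the ring homomorphism from the integers: |n| copies of ONE, with sign."""
    total = ZERO
    for _ in range(abs(n)):
        total = total + ONE
    return total if n >= 0 else -total

def rational_image(q: Fraction) -> QSqrt2:
    """Image of p/d: the image of p times the inverse of the image of d."""
    return integer_image(q.numerator) * integer_image(q.denominator).inverse()
```

For example, rational_image(Fraction(3, 4)) comes out as QSqrt2(Fraction(3, 4), Fraction(0)), and sums and products of rationals are carried to sums and products of their images.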

Now if \mathbb{F} is complete in the uniform structure defined by its order, this homomorphism will be uniformly continuous. Therefore by the universal property of uniform completions, we will find a unique extension \mathbb{R}\rightarrow\mathbb{F}. That is, given any (uniformly) complete ordered field there is a unique uniformly continuous homomorphism of fields from the real numbers to the field in question. Thus \mathbb{R} is the universal such field, which characterizes it uniquely up to isomorphism!

So we can unambiguously speak of “the” real numbers, even if we use a different method of constructing them, or even no method at all. We can work out the rest of the theory of real numbers from these properties (though for the first few we might fall back on our construction) just as we could work out the theory of natural numbers from the Peano axioms.

December 4, 2007 Posted by | Fundamentals, Numbers, Point-Set Topology, Topology | 3 Comments