Whether we use Dedekind cuts or Cauchy sequences to construct the ordered field of real numbers (and it doesn’t matter which), we are taking the ordered field of rational numbers and enlarging it to be “complete” in some sense or another. But we also aren’t making it too much bigger. The universality property we got from completing the uniform structure already gives evidence of that, but there’s another property which we can show is true of $\mathbb{R}$, and which shows that the real numbers aren’t too unwieldy.
In The Sand Reckoner, the ancient Greek mathematician Archimedes set about the problem of showing the number of grains of sand in existence to be finite. He does this by determining a (very weak) upper bound: the number of grains of sand it would take to fill up the entire universe, as he understood the latter term. He writes:
There are some … who think that the number of the sand is infinite in multitude; and I mean by the sand not only that which exists about Syracuse and the rest of Sicily but also that which is found in every region whether inhabited or uninhabited. Again there are some who, without regarding it as infinite, yet think that no number has been named which is great enough to exceed its multitude. And it is clear that they who hold this view, if they imagined a mass made up of sand in other respects as large as the mass of the earth filled up to a height equal to that of the highest of the mountains, would be many times further still from recognizing that any number could be expressed which exceeded the multitude of the sand so taken. But I will try to show you by means of geometrical proofs, which you will be able to follow, that, of the numbers named by me … some exceed not only the number of the mass of sand equal in magnitude to the earth filled up in the way described, but also that of a mass equal in magnitude to the universe
The deep fact here is a fundamental realization about numbers: the set of natural numbers has no upper bound in the real number system. That is, no matter how huge a real number we pick there’s always a natural number bigger than it. Equivalently, given any positive real number $a$ (even as small as the volume of a grain of sand) and another positive real number $b$ (even as large as the volume of the ancient Greek conception of the universe), there’s some natural number $n$ so that $na>b$. When this happens in a given ordered field we say that the field is “Archimedean”.
So let’s show that $\mathbb{R}$ is Archimedean. If there were positive real numbers $a$ and $b$ so that $na\leq b$ for all natural numbers $n$, then $b$ would be an upper bound for the set of multiples $na$. Then Dedekind completeness gives us a least upper bound, and we can just take $b$ to be this least upper bound. Now $na\leq b$ for every $n$, and also $(n+1)a\leq b$, and so $na\leq b-a$. That is, $b-a$ is another upper bound for the set of multiples of $a$. But since $a$ was chosen to be positive we see that $b-a<b$, contradicting the assumption that $b$ was the least such upper bound. So such a pair of real numbers can’t exist.
In particular, we can take a positive real number $x$ and consider the set of natural numbers which are larger than it. This set is nonempty by the Archimedean property. Since the natural numbers are well-ordered, there is a least such number, and it can’t be $0$ because we assume $x>0$. Subtracting one from this number will then give the largest natural number that is still no greater than $x$ in the real number order, and we denote this number by $\lfloor x\rfloor$. We can thus write any positive number $x$ uniquely as the sum of a natural number $\lfloor x\rfloor$ and some remainder $r$ with $0\leq r<1$.
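The two arguments above can be tried out on exact rational numbers. This is only a sketch: it uses Python's `Fraction` as a stand-in for real numbers, and the function names `archimedean_witness` and `floor_and_remainder` are made up for this illustration.

```python
from fractions import Fraction

def archimedean_witness(a, b):
    """Return the least natural number n with n*a > b, for positive a and b."""
    if a <= 0 or b <= 0:
        raise ValueError("a and b must be positive")
    n = 0
    # The Archimedean property is exactly what guarantees this loop stops.
    while n * a <= b:
        n += 1
    return n

def floor_and_remainder(x):
    """Split a positive x into (floor, r) with x = floor + r and 0 <= r < 1."""
    if x <= 0:
        raise ValueError("x must be positive")
    # Least natural number strictly larger than x, then step back by one.
    least_above = archimedean_witness(Fraction(1), x)
    floor = least_above - 1
    return floor, x - floor

grain = Fraction(1, 1000)      # a "small" positive number
universe = Fraction(7, 2)      # a "large" positive number
n = archimedean_witness(grain, universe)
print(n, n * grain > universe)
print(floor_and_remainder(Fraction(7, 2)))
```

The linear search is deliberately naive. It mirrors the well-ordering argument (take the least natural number past $x$) rather than aiming for efficiency.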
It turns out that the real numbers are actually the largest Archimedean field. That is, if $F$ is any ordered field satisfying the Archimedean property, there will be a monomorphism of ordered fields $F\to\mathbb{R}$, making (the image of) $F$ a subfield of $\mathbb{R}$. I won’t prove this here, but I will note one thing about the meaning of this result: the Archimedean property essentially limits the size of an ordered field. That is, an ordered field can’t get too big without breaking this property. Dually, an ordered field can’t get too small without breaking Dedekind completeness or uniform completeness. Completeness pulls the field one way, while the Archimedean property pulls the other way, and the two reach a sort of equilibrium in the real numbers, living both at the top of one world and the bottom of the other.
Sorry to not get this posted until so late, but the end of the semester has been a bit hectic.
We’ve used Dedekind cuts to “complete” the order on the rational numbers — to make sure that every nonempty set of numbers with an upper bound has a least upper bound. We’ve also used Cauchy sequences to “complete” the uniform structure on the rational numbers — to make sure that every Cauchy sequence converges. But do we actually get the same thing in each case?
If we take a real number $x$ represented by a Cauchy sequence $x_n$ it’s easy to come up with a cut. Given a rational number $q$ we use the constant sequence $q_n=q$ and compare it to $x_n$. If $x_n-q$ is eventually nonnegative then $q$ is less than $x$, and should go into the left set $X_L$. On the other hand, if it’s eventually nonpositive then $q$ is greater than $x$ and should go into the right set $X_R$. It’s straightforward to show that this function from $\mathbb{R}$ to the set of cuts preserves the order.
Now let’s start with the cut $(X_L,X_R)$ and write down a Cauchy sequence. Pick some $x_L\in X_L$ and $x_R\in X_R$, and construct the sequence as follows. First write down $x_0=x_L$ and $x_1=x_R$. Now set $x_2=\frac{x_L+x_R}{2}$. This value will either be in $X_L$ or it won’t. If it is, replace $x_L$ by $x_2$, and otherwise replace $x_R$ by $x_2$. Then define $x_3$ as the midpoint between our two new left and right points, and again replace either the left or the right point. Keep going, and we see that all future numbers in the sequence are closer to each other than the current $x_L$ and $x_R$ are to each other. And these two always keep moving closer and closer to each other, halving their distance at each step. So the sequence has to be Cauchy. If we picked a different $x_L$ and $x_R$ to start with, we’d get an equivalent sequence. I’ll leave this to you to show.
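Here is a small sketch of the midpoint construction, run on one concrete cut. The choice of cut (the one that ought to be $\sqrt{2}$), the predicate `in_left_set`, and the function name `bisection_sequence` are assumptions made for the example, not part of the construction above.

```python
from fractions import Fraction

def in_left_set(q):
    """Membership test for the left set of the cut at sqrt(2):
    every negative rational, and every rational whose square is at most 2."""
    return q < 0 or q * q <= 2

def bisection_sequence(in_left, x_left, x_right, steps):
    """Return the sequence x_0 = x_left, x_1 = x_right, then successive midpoints."""
    seq = [x_left, x_right]
    for _ in range(steps):
        mid = (x_left + x_right) / 2
        seq.append(mid)
        # Replace whichever endpoint lies on the same side of the cut as mid.
        if in_left(mid):
            x_left = mid
        else:
            x_right = mid
    return seq

seq = bisection_sequence(in_left_set, Fraction(1), Fraction(2), 20)
print(seq[2], seq[3], seq[4])
print(float(seq[-1]), float(abs(seq[-1] - seq[-2])))
```

Because the gap between the left and right points halves at every step, the distance between consecutive terms shrinks geometrically, which is the Cauchy property in miniature.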
Notice here that the points in the sequence that lie in $X_L$ are moving steadily upwards towards the cut, and those in $X_R$ are moving steadily downwards towards it. Eventually, the sequence will rise above any point in $X_L$ and fall below any point in $X_R$, and so if we take this sequence and build a cut from it we will get back the exact same cut we started with. Also, if we build a cut from a Cauchy sequence, and then a sequence from it, we get back an equivalent sequence. Thus we have set up a bijection between the set of cuts and the set of equivalence classes of Cauchy sequences, and we’ve already seen that it preserves the order structure.
Now let’s look at the map from sequences to cuts and verify that it preserves addition and multiplication of positive numbers. This will make the map into an isomorphism of ordered fields, and so both constructions are describing essentially the same thing. So if we have Cauchy sequences $x_n$ and $y_n$, which give right sets $X_R$ and $Y_R$ of rational numbers, then what’s the right set of the sequence $x_n+y_n$? It’s the set of rational numbers $q$ so that $q-(x_n+y_n)$ is eventually nonnegative. But any such $q$ can be broken up as $q=q_x+q_y$, where $q_x-x_n$ and $q_y-y_n$ are both eventually nonnegative. That is, the right set of the sum is the set of sums of elements of $X_R$ and $Y_R$, and so addition is preserved. The proof for multiplication is essentially the same.
So both methods of extending the rational numbers give us essentially the same ordered field, which is thus both complete as a uniform space and Dedekind complete.
There’s another sense in which the rational numbers are lacking and the real numbers fix them up. This one is completely about the order structure on $\mathbb{Q}$, and will lead to another construction of the real numbers.
Okay, so what’s wrong with the rational numbers now? In any partial order we can consider least upper or greatest lower bounds. That is, given a nonempty set $S$ of rational numbers the least upper bound or “supremum” $\sup S$ is a rational number so that $s\leq\sup S$ for all $s\in S$ (it’s an upper bound) and if $b$ is any upper bound then $\sup S\leq b$. Similarly the “infimum” $\inf S$ is the greatest lower bound: a lower bound that’s greater than all the other lower bounds.
There’s just one problem: there might not be a supremum of a given set. Even if the set is bounded above, there may be no least upper bound. For instance, the set of all rational numbers $q$ so that $q^2<2$ is bounded above, since $2$ is an example of an upper bound, and $\frac{3}{2}$ is another. But no matter what rational upper bound we have in hand, we can always find a lower rational number which is still an upper bound for this set. In fact, the least upper bound should be $\sqrt{2}$, but this isn’t a rational number!
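To make "we can always find a lower number which is still an upper bound" concrete, here is one way to produce such a number. The averaging step used below (the Babylonian or Newton step) is a choice made for this sketch, not something the argument depends on, and `smaller_upper_bound` is a made-up name.

```python
from fractions import Fraction

def smaller_upper_bound(u):
    """Given a rational u > 0 with u*u > 2, return a strictly smaller rational
    that is still an upper bound for the rationals whose square is below 2."""
    if u <= 0 or u * u <= 2:
        raise ValueError("u must be a positive rational with u*u > 2")
    # v < u because 2/u < u, and v*v - 2 = ((u - 2/u)/2)**2 > 0.
    return (u + 2 / u) / 2

u = Fraction(2)
for _ in range(4):
    v = smaller_upper_bound(u)
    print(u, "->", v)
    u = v
```

Each output is again a valid input, so the process never terminates at a least rational upper bound.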
Now we define an ordered field to be “Dedekind complete” if every nonempty set $S$ with an upper bound has a least upper bound $\sup S$. Considering the set consisting of the negatives of elements of $S$, every nonempty set with a lower bound will then have a greatest lower bound. The flaw in the rational numbers is that they are not Dedekind complete.
So, in order to complete them, we will use the method of “Dedekind cuts”. Given a nonempty set $S$ with any upper bounds at all, the collection of all the upper bounds forms another set $X_R$, and any element of $S$ gives a lower bound for $X_R$. We then have the collection of all lower bounds of $X_R$, which we call $X_L$. Every rational number will be in either $X_L$ or $X_R$. Given a rational number $q$, if there is an $s\in S$ with $q\leq s$ then $q$ is a lower bound for $X_R$, and is thus in $X_L$. If not, then $q$ is an upper bound for $S$ and is thus in $X_R$. However, one number may be in both $X_L$ and $X_R$. If $S$ has a rational supremum then it is in both $X_L$ and $X_R$. There can be no more overlap.
So we’ve come up with a way of cutting the rational numbers into a pair of sets (the “left set” $X_L$ and “right set” $X_R$ of the “cut” $(X_L,X_R)$) with $x_L\leq x_R$ for every $x_L\in X_L$ and $x_R\in X_R$. We define a new total order on the set of cuts by $(X_L,X_R)\leq(Y_L,Y_R)$ if $X_L\subseteq Y_L$. Every rational number corresponds to one of the cuts which contains a one-point overlap at that rational number, and clearly this inclusion preserves the order on the rational numbers.
Now if we take any nonempty collection of cuts $(X_L^\alpha,X_R^\alpha)$ with an upper bound (in the order on the set of cuts) we can take the union of all the $X_L^\alpha$ and the intersection of all the $X_R^\alpha$ to get a new cut. The left set contains all the $X_L^\alpha$, and it’s contained in any other left set which contains them, so it’s the least upper bound. Thus the collection of cuts is now Dedekind complete.
We can also add field structures to this completion of an ordered field. For this purpose, it will be useful to denote a generic element of $X_L$ by $x_L$ (and similarly for the other sets), and assume $x_L$ runs over all elements of $X_L$ wherever it appears. For instance, given cuts $(X_L,X_R)$ and $(Y_L,Y_R)$, we define their sum to be $(\{x_L+y_L\},\{x_R+y_R\})$. The negative of a cut $(X_L,X_R)$ will be $(\{-x_R\},\{-x_L\})$. The product of two positive cuts $(X_L,X_R)$ and $(Y_L,Y_R)$ will have as its right set the collection of all products $\{x_Ry_R\}$, and its left set defined to make this a cut. Finally, the reciprocal of a positive cut $(X_L,X_R)$ will have the set $\{\frac{1}{x_L}:x_L>0\}$ as its right set, and its left set defined to make this a cut.
This suffices to define all the field operations on the set of cuts, and if we start with $\mathbb{Q}$ we get another model of $\mathbb{R}$. I’ll leave verification of the field axioms as an exercise, and come back to prove that the method of cuts and the method of Cauchy sequences are equivalent. Once you play with cuts for a while, you may understand why I came at the real numbers with Cauchy sequences first. The cut approach seems to have a certain simplicity, and it’s less ontologically demanding since we’re only ever talking about pairs of subsets of the rational numbers rather than gigantic equivalence classes of sequences. But in the end I always find cuts to be extremely difficult to work with. Luckily, once we’ve shown them to be equivalent to Cauchy sequences it will establish that the real numbers we’ve been talking about are Dedekind complete, and we can put the messiness of this definition behind us.
We’ve defined the real numbers as a topological field by completing the rational numbers as a uniform space, and then extending the field operations to the new points by continuity. Now we extend the order on the rational numbers to make $\mathbb{R}$ into an ordered field.
First off, we can simplify our work greatly by recognizing that we just need to determine the subset of positive real numbers: those $x$ with $x\geq0$. Then we can say $x\geq y$ if $x-y\geq0$. Now, each real number is represented by a Cauchy sequence of rational numbers, and so we say $x\geq0$ if $x$ has a representative sequence $x_n$ with each point $x_n\geq0$.
What we need to check is that the positive numbers are closed under both addition and multiplication. But clearly if we pick $x_n$ and $y_n$ to be nonnegative Cauchy sequences representing $x$ and $y$, respectively, then $x+y$ is represented by $x_n+y_n$ and $xy$ is represented by $x_ny_n$, and these will be nonnegative since $\mathbb{Q}$ is an ordered field.
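Now for each $x$, $x-x=0\geq0$, so $x\geq x$. Also, if $x\geq y$ and $y\geq z$, then $x-y\geq0$ and $y-z\geq0$, so $x-z=(x-y)+(y-z)\geq0$, and so $x\geq z$. These show that $\geq$ defines a preorder on $\mathbb{R}$, since it is reflexive and transitive. Further, if $x\geq y$ and $y\geq x$ then $x-y\geq0$ and $y-x\geq0$, so $x-y=0$ and thus $x=y$. This shows that $\geq$ is a partial order. Clearly this order is total because any real number either has a nonnegative representative or it doesn’t, and in the latter case its negative does.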
One thing is a little hazy here. We asserted that if a number and its negative are both greater than or equal to zero, then it must be zero itself. Why is this? Well if $x_n$ is a nonnegative Cauchy sequence representing $x$ then $-x_n$ represents $-x$. Now can we find a nonnegative Cauchy sequence $y_n$ equivalent to $-x_n$? The lowest rational number that $y_n$ can be is, of course, zero, and so $|y_n-(-x_n)|=y_n+x_n\geq x_n$. But for $y_n$ and $-x_n$ to be equivalent we must have for each positive rational $r$ an $N$ so that $r>|y_n-(-x_n)|\geq x_n$ for $n\geq N$. But this just says that $x_n$ converges to $0$!
So $\mathbb{R}$ is an ordered field, so what does this tell us? First off, we get an absolute value $|x|$ just like we did for the rationals. Secondly, we’ll get a uniform structure as we do for any ordered group. This uniform topology has a subbase consisting of all the half-infinite intervals $(x,\infty)$ and $(-\infty,x)$ for all real $x$. But this is also a subbase for the metric we got from completing the rationals, and so the two topologies coincide!
One more very important thing holds for all ordered fields. An ordered field $F$ is in particular a ring with unit, and for any ring with unit there is a unique ring homomorphism $\mathbb{Z}\to F$. Now since $1>0$ in any ordered field, we have $1+1>1>0$, and $1+1+1>1+1>0$, and so on, to show that no nonzero integer can become zero under this map. Since we have an injective homomorphism of rings, the universal property of the field of fractions gives us a unique field homomorphism $\mathbb{Q}\to F$ extending the ring homomorphism from the integers.
Now if $F$ is complete in the uniform structure defined by its order, this homomorphism will be uniformly continuous. Therefore by the universal property of uniform completions, we will find a unique extension $\mathbb{R}\to F$. That is, given any (uniformly) complete ordered field there is a unique uniformly continuous homomorphism of fields from the real numbers to the field in question. Thus $\mathbb{R}$ is the universal such field, which characterizes it uniquely up to isomorphism!
So we can unambiguously speak of “the” real numbers, even if we use a different method of constructing them, or even no method at all. We can work out the rest of the theory of real numbers from these properties (though for the first few we might fall back on our construction) just as we could work out the theory of natural numbers from the Peano axioms.
We’ve defined the topological space we call the real number line as the completion of the rational numbers as a uniform space. But we want to be able to do things like arithmetic on it. That is, we want to put the structure of a field on this set. And because we’ve also got the structure of a topological space, we want the field operations to be continuous maps. Then we’ll have a topological field, or a “field object” (analogous to a group object) in the category of topological spaces.
Not only do we want the field operations to be continuous, we want them to agree with those on the rational numbers. And since $\mathbb{Q}$ is dense in $\mathbb{R}$ (and similarly $\mathbb{Q}\times\mathbb{Q}$ is dense in $\mathbb{R}\times\mathbb{R}$), we will get unique continuous maps to extend our field operations. In fact the uniqueness is the easy part, due to the following general property of dense subsets.
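Consider a topological space $X$ with a dense subset $D$. Then every point $x\in X$ has a sequence $x_n\in D$ with $\lim x_n=x$. Now if $f:X\to Y$ and $g:X\to Y$ are two continuous functions which agree for every point in $D$, then they agree for all points in $X$. Indeed, picking a sequence $x_n$ in $D$ converging to $x$ we have

$$f(x)=f(\lim x_n)=\lim f(x_n)=\lim g(x_n)=g(\lim x_n)=g(x)$$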
So if we can show the existence of a continuous extension of, say, addition of rational numbers to all real numbers, then the extension is unique. In fact, the continuity will be enough to tell us what the extension should look like. Let’s take real numbers $x$ and $y$, and sequences of rational numbers $x_n$ and $y_n$ converging to $x$ and $y$, respectively. Writing $s$ for the addition function, we should have

$$s(x,y)=s(\lim x_n,\lim y_n)=\lim s(x_n,y_n)=\lim(x_n+y_n)$$
but how do we know that the limit on the right exists? Well if we can show that the sequence $x_n+y_n$ is a Cauchy sequence of rational numbers, then it must converge because $\mathbb{R}$ is complete.
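Given a rational number $r>0$ we must show that there exists a natural number $N$ so that $|(x_m+y_m)-(x_n+y_n)|<r$ for all $m,n\geq N$. But we know that there’s a number $N_x$ so that $|x_m-x_n|<\frac{r}{2}$ for $m,n\geq N_x$, and a number $N_y$ so that $|y_m-y_n|<\frac{r}{2}$ for $m,n\geq N_y$. Then we can choose $N$ to be the larger of $N_x$ and $N_y$ and find

$$|(x_m+y_m)-(x_n+y_n)|=|(x_m-x_n)+(y_m-y_n)|\leq|x_m-x_n|+|y_m-y_n|<\frac{r}{2}+\frac{r}{2}=r$$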
So the sequence of sums is Cauchy, and thus converges.
What if we chose different sequences $x'_n$ and $y'_n$ converging to $x$ and $y$? Then we get another Cauchy sequence $x'_n+y'_n$ of rational numbers. To show that addition of real numbers is well-defined, we need to show that it’s equivalent to the sequence $x_n+y_n$. So given a rational number $r>0$ does there exist an $N$ so that $|(x_n+y_n)-(x'_n+y'_n)|<r$ for all $n\geq N$? This is almost exactly the same as the above argument that each sequence is Cauchy! As such, I’ll leave it to you.
So we’ve got a continuous function $s$ taking two real numbers and giving back another one, and which agrees with addition of rational numbers. Does it define an Abelian group? The uniqueness property for functions defined on dense subspaces will come to our rescue! We can write down two functions from $\mathbb{R}^3$ to $\mathbb{R}$ defined by $s(s(x,y),z)$ and $s(x,s(y,z))$. Since $s$ agrees with addition on rational numbers, and since triples of rational numbers are dense in the set of triples of real numbers, these two functions agree on a dense subset of their domains, and so must be equal. If we take the $0$ from $\mathbb{Q}$ as the additive identity we can also verify that it acts as an identity for real number addition. We can also find the negative of a real number $x$ by negating each term of a Cauchy sequence converging to $x$, and verify that this behaves as an additive inverse, and we can show this addition to be commutative, all using the same techniques as above. From here we’ll just write $x+y$ for the sum of real numbers $x$ and $y$.
What about the multiplication? Again, we’ll want to choose rational sequences $x_n$ and $y_n$ converging to $x$ and $y$, and define our function $m$ by

$$m(x,y)=m(\lim x_n,\lim y_n)=\lim m(x_n,y_n)=\lim(x_ny_n)$$
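so it will be continuous and agree with rational number multiplication. Now we must show that for every rational number $r>0$ there is an $N$ so that $|x_my_m-x_ny_n|<r$ for all $m,n\geq N$. This will be a bit clearer if we start by noting that for each rational $r_x>0$ there is an $N_x$ so that $|x_m-x_n|<r_x$ for all $m,n\geq N_x$. In particular, for sufficiently large $n$ we have $|x_n|<|x_{N_x}|+r_x$, so the sequence $|x_n|$ is bounded above by some $b_x$. Similarly, given $r_y>0$ we can pick $N_y$ so that $|y_m-y_n|<r_y$ for $m,n\geq N_y$ and get an upper bound $b_y\geq|y_n|$ for all $n$. Then choosing $N$ to be the larger of $N_x$ and $N_y$ we will have

$$|x_my_m-x_ny_n|=|(x_m-x_n)y_m+x_n(y_m-y_n)|\leq|x_m-x_n||y_m|+|x_n||y_m-y_n|\leq r_xb_y+b_xr_y$$

for $m,n\geq N$. Now given a rational $r>0$ we can (with a little work) find $r_x$ and $r_y$ so that the expression on the right will be less than $r$, and so the sequence $x_ny_n$ is Cauchy, as desired.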
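The key step in the product estimate is splitting $x_my_m-x_ny_n$ into two pieces and applying the triangle inequality. The sketch below checks that inequality exactly on two sample rational Cauchy sequences. The sequences (Newton iterations heading towards $\sqrt{2}$ and $\sqrt{3}$) and the helper names are assumptions chosen for illustration.

```python
from fractions import Fraction

def newton_sqrt_sequence(c, terms):
    """Rational sequence x_0 = c, x_{k+1} = (x_k + c/x_k)/2, which approaches sqrt(c)."""
    x = Fraction(c)
    seq = []
    for _ in range(terms):
        seq.append(x)
        x = (x + c / x) / 2
    return seq

def product_gap(xs, ys, m, n):
    """Return (lhs, rhs) for |x_m y_m - x_n y_n| <= |x_m - x_n||y_m| + |x_n||y_m - y_n|."""
    lhs = abs(xs[m] * ys[m] - xs[n] * ys[n])
    rhs = abs(xs[m] - xs[n]) * abs(ys[m]) + abs(xs[n]) * abs(ys[m] - ys[n])
    return lhs, rhs

xs = newton_sqrt_sequence(2, 6)
ys = newton_sqrt_sequence(3, 6)
for m, n in [(1, 0), (3, 2), (5, 4)]:
    lhs, rhs = product_gap(xs, ys, m, n)
    print(m, n, float(lhs), float(rhs), lhs <= rhs)
```

Both sides shrink quickly as the indices grow, which is what the choice of $r_x$ and $r_y$ exploits.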
Then, as for addition, it turns out that a similar proof will show that this definition doesn’t depend on the choice of sequences converging to $x$ and $y$, so we get a multiplication. Again, we can use the density of the rational numbers to show that it’s associative and commutative, that $1\in\mathbb{Q}$ serves as its unit, and that multiplication distributes over addition. We’ll just write $xy$ for the product of real numbers $x$ and $y$ from here on.
To show that $\mathbb{R}$ is a field we need a multiplicative inverse for each nonzero real number. That is, for each Cauchy sequence $x_n$ of rational numbers that doesn’t converge to $0$, we would like to consider the sequence $\frac{1}{x_n}$, but some of the $x_n$ might equal zero and thus throw us off. However, there can only be a finite number of zeroes in the sequence or else $0$ would be an accumulation point of the sequence and it would either converge to $0$ or fail to be Cauchy. So we can just change each of those to some nonzero rational number without breaking the Cauchy property or changing the real number it converges to. Then another argument similar to that for multiplication shows that this defines a function from the nonzero reals to themselves which acts as a multiplicative inverse.
Well, yesterday was given over to exam-writing, so today I’ll pick up a few scraps I mentioned in passing on Thursday.
First of all, the rational numbers are countable. To be explicit, in case I haven’t been before, this means that there is an injective function from the set of rational numbers to the set of natural numbers. Really, I’ll just handle the positive rationals, but it’s straightforward how to include the negatives as well. To every positive rational number we can get a pair $(m,n)$ of natural numbers: the numerator and the denominator. Then we can send the pair $(m,n)$ to a single natural number by a pairing function that is a bijection between the set of all pairs of natural numbers and all natural numbers. Clearly the rationals contain the natural numbers, so the set of rational numbers is countably infinite.
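As an illustration, here is one standard pairing function, the Cantor pairing $(m,n)\mapsto\frac{(m+n)(m+n+1)}{2}+n$. Using this particular formula is an assumption of the sketch (any bijection from pairs to natural numbers would do), and the function names are invented for the example.

```python
from fractions import Fraction

def cantor_pair(m, n):
    """Cantor pairing: a bijection from pairs of natural numbers to natural numbers."""
    return (m + n) * (m + n + 1) // 2 + n

def encode_positive_rational(q):
    """Injective code for a positive rational, via its numerator and denominator
    in lowest terms (Fraction normalizes automatically)."""
    q = Fraction(q)
    if q <= 0:
        raise ValueError("q must be positive")
    return cantor_pair(q.numerator, q.denominator)

print([cantor_pair(m, n) for m in range(3) for n in range(3)])
print(encode_positive_rational(Fraction(2, 4)), encode_positive_rational(Fraction(1, 2)))
```

Reducing to lowest terms first is what makes the code well-defined on rational numbers rather than on numerator and denominator pairs.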
Now, equivalence relations. Given any relation $R$ on a set $X$ we can build it up into an equivalence relation. First throw in the diagonal $\{(x,x)\}$ to make it reflexive. Then throw in all the points $(y,x)$ for $(x,y)\in R$ to make it symmetric. For transitivity, we can similarly start throwing elements into the relation until this condition is satisfied.
But that’s all sort of ugly. Here’s a more elegant way of doing it: given any relation $R$, consider all the relations $S$ on $X$ that contain $R$, so that $R\subseteq S\subseteq X\times X$. Some of these will be equivalence relations. In particular, the whole product $X\times X$ is an equivalence relation, so there is at least one such. It’s simple to verify that the intersection of any family of equivalence relations on $X$ is again an equivalence relation, so we can take the intersection of all equivalence relations on $X$ containing $R$ to get the smallest such relation. Notice, by the way, how this is similar to generating a topology from a subbase, or to taking the closure of a subset in a topological space.
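On a finite set the "ugly" step-by-step procedure is easy to run, and it produces the same smallest equivalence relation that the intersection describes. A minimal sketch, with `equivalence_closure` as a made-up name and relations stored as sets of pairs:

```python
def equivalence_closure(elements, relation):
    """Smallest equivalence relation on `elements` containing `relation`."""
    closure = set(relation)
    # Reflexive: throw in the diagonal.
    closure |= {(x, x) for x in elements}
    # Symmetric: throw in the reversed pairs.
    closure |= {(y, x) for (x, y) in closure}
    # Transitive: keep adding composites until nothing new appears.
    changed = True
    while changed:
        changed = False
        for (a, b) in list(closure):
            for (c, d) in list(closure):
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure

rel = equivalence_closure({1, 2, 3, 4, 5}, {(1, 2), (2, 3), (4, 5)})
print(sorted(rel))
```

Adding composites of a symmetric relation keeps it symmetric, so the three steps do not need to be repeated in a loop.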
Finally, absolute values. In any totally ordered group $G$ we have the positive “cone” of all elements $x$ with $x\geq e$, and the negative “cone” of all elements $x$ with $x\leq e$. In the latter case, we can multiply both sides by the inverse of $x$ to get $x^{-1}\geq e$ in the positive cone. Notice that the identity is in both cones, but the reflection described leaves it fixed. So for every element $x$ in $G$ we get a well-defined element $|x|$ called its absolute value. Of course, we often assume that $G$ is abelian and write this all additively instead of multiplicatively.
This function has a number of nice properties. First of all, $|x|$ is always in the positive cone. Secondly, $|x|$ is the identity in $G$ if and only if $x$ itself is the identity. Thirdly, $|x^{-1}|=|x|$. And finally, if $G$ is abelian we have the “triangle inequality” $|xy|\leq|x||y|$.
Okay, does that catch us up?
Okay, so we’ve defined a topology on a set $X$. But we also love categories, so we want to see this in terms of categories. And, indeed, every topology is a category!
First, remember that the collection of subsets of $X$, like the collection of subobjects on an object in any category, is partially ordered by inclusion. And since every partially ordered set is a category, so is the collection of subsets of $X$.
In fact, it’s a lattice, since we can use union and intersection as our join and meet, respectively. Saying that a poset has pairwise least upper bounds is the same as saying that, considered as a category, it has finite coproducts, and similarly pairwise greatest lower bounds are the same as finite products. But here we can actually take the union or intersection of any collection of subsets and get a subset, so we have all products and coproducts. In the language of posets, we have a “complete lattice”.
So now we want to talk about topologies. A topology is just a collection of the subsets that’s closed under finite intersections and arbitrary unions. We can use the same order (inclusion of subsets) to make a topology into a partially-ordered set. In the language of posets, the requirements are that we have a sublattice (finite meets and joins, along with the same top and bottom element) with arbitrary joins: the topology contains the least upper bound of any collection of its elements.
And now we translate the partial order language into category theory. A topology is a subcategory of the category of subsets of $X$ with finite products and all coproducts. That is, we have an arrow from the object $U$ to the object $V$ if and only if $U\subseteq V$ as subsets of $X$. Given any finite collection $\{U_i\}_{i=1}^n$ of objects we have their product $\bigcap_{i=1}^nU_i$, and given any collection $\{U_\alpha\}$ of objects we have their coproduct $\bigcup_\alpha U_\alpha$. In particular we have the empty product (the terminal object $X$) and we have the empty coproduct (the initial object $\varnothing$). And all the arrows in our category just tell us how various open sets sit inside other open sets. Neat!
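For a finite set all of this can be checked mechanically. The sketch below assumes open sets are given as Python sets. Since a finite collection has only finitely many members, closure under pairwise unions is enough to get arbitrary unions. The names `is_topology` and `has_arrow` are made up for the example.

```python
from itertools import combinations

def is_topology(space, opens):
    """Check that `opens` is a topology on the finite set `space`."""
    space = frozenset(space)
    opens = {frozenset(u) for u in opens}
    # Empty coproduct (initial object) and empty product (terminal object).
    if frozenset() not in opens or space not in opens:
        return False
    if not all(u <= space for u in opens):
        return False
    # Binary products (intersections) and coproducts (unions) must stay inside.
    for u, v in combinations(opens, 2):
        if (u & v) not in opens or (u | v) not in opens:
            return False
    return True

def has_arrow(u, v):
    """There is an arrow u -> v exactly when u is a subset of v."""
    return frozenset(u) <= frozenset(v)

print(is_topology({1, 2, 3}, [set(), {1}, {1, 2}, {1, 2, 3}]))
print(is_topology({1, 2, 3}, [set(), {1}, {2}, {1, 2, 3}]))
print(has_arrow({1}, {1, 2}))
```

The second collection fails because the coproduct of $\{1\}$ and $\{2\}$, namely $\{1,2\}$, is missing.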
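A poset which has both least upper bounds and greatest lower bounds is called a lattice. In more detail, let’s say we have a poset $(P,\leq)$ and give it two operations: meet (written $\wedge$) and join (written $\vee$). These satisfy the requirements that
- $x\leq x\vee y$ and $y\leq x\vee y$.
- If $x\leq z$ and $y\leq z$ then $x\vee y\leq z$.
- $x\wedge y\leq x$ and $x\wedge y\leq y$.
- If $z\leq x$ and $z\leq y$ then $z\leq x\wedge y$.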
Not every poset has a meet and a join operation, but if these operations do exist they are uniquely specified by these requirements. In fact, we can see this sort of like how we saw that direct products of groups are unique up to isomorphism: if we have two least upper bounds for a pair of elements then they must each be below or equal to the other, so they must be the same.
We can derive the following properties of the operations:
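- $x\vee y=y\vee x$ and $x\wedge y=y\wedge x$.
- $(x\vee y)\vee z=x\vee(y\vee z)$ and $(x\wedge y)\wedge z=x\wedge(y\wedge z)$.
- $x\vee(x\wedge y)=x$ and $x\wedge(x\vee y)=x$.
From these we see that there’s a sort of duality between the two operations. In fact, we can see that these provide two commutative semigroup structures that happen to interact in a certain nice way.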
Actually, it gets even better. If we have two operations on any set satisfying these properties then we can define a partial order: $x\leq y$ if and only if $x=x\wedge y$. So we can define a lattice either by the order property and get the algebraic properties, or we can define it by the algebraic properties and get the order property from them.
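In many cases, a lattice also satisfies $x\vee(y\wedge z)=(x\vee y)\wedge(x\vee z)$, or equivalently $x\wedge(y\vee z)=(x\wedge y)\vee(x\wedge z)$. In this case we call it “distributive”. A bit weaker is to require that if $x\leq z$ then $x\vee(y\wedge z)=(x\vee y)\wedge z$ for all $y$. In this case we call the lattice “modular”.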
A lattice may have elements above everything else or below everything else. We call a greatest element of a lattice $1$ and a least element $0$. In this case we can define “complements”: $x$ and $y$ are complements if $x\vee y=1$ and $x\wedge y=0$. If the lattice is distributive, then the complement of $x$ is unique if it exists. A distributive lattice where every element has a complement is called “Boolean”.
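A familiar example to experiment with is the set of divisors of a number, ordered by divisibility, with greatest common divisor as meet and least common multiple as join. Choosing this example, and the helper names below, is an assumption of the sketch.

```python
from math import gcd

def lcm(a, b):
    """Least common multiple: the join in the divisibility order."""
    return a * b // gcd(a, b)

def divisors(n):
    return [d for d in range(1, n + 1) if n % d == 0]

def is_distributive(elems, meet, join):
    """Check x v (y ^ z) == (x v y) ^ (x v z) for all triples."""
    return all(join(x, meet(y, z)) == meet(join(x, y), join(x, z))
               for x in elems for y in elems for z in elems)

def complements(elems, meet, join, bottom, top, x):
    """All y with x v y == top and x ^ y == bottom."""
    return [y for y in elems if join(x, y) == top and meet(x, y) == bottom]

d30 = divisors(30)
print(is_distributive(d30, gcd, lcm))
print({x: complements(d30, gcd, lcm, 1, 30, x) for x in d30})
print(complements(divisors(12), gcd, lcm, 1, 12, 2))
```

In the divisors of 30 the least element is 1 and the greatest is 30, and every element has exactly one complement, so this lattice is Boolean. In the divisors of 12 the element 2 has no complement at all.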
We use cardinal numbers to count how many elements are in a set. Another thing we think of numbers for is listing elements. That is, we put things in order: first, second, third, and so on.
We identified a cardinal number as an isomorphism class of sets. Ordinal numbers work much the same way, but we use sets equipped with well-orders. Now we don’t allow all the functions between two sets. We just consider the order-preserving functions. If $(X,\leq_X)$ and $(Y,\leq_Y)$ are two well-ordered sets, a function $f:X\to Y$ preserves the order if whenever $x_1\leq_Xx_2$ then $f(x_1)\leq_Yf(x_2)$. We consider two well-ordered sets to be equivalent if there is an order-preserving bijection between them, and define an ordinal number to be an equivalence class of well-ordered sets under this relation.
If two well-ordered sets are equivalent, they must have the same cardinality. Indeed, we can just forget the order structure and we have a bijection between the two sets. This means that two sets representing the same ordinal number also represent the same cardinal number.
Now let’s just look at finite sets for a moment. If two finite well-ordered sets have the same number of elements, then it turns out they are order-equivalent too. It can be a little tricky to do this straight through, so let’s sort of come at it from the side. We’ll use finite ordinal numbers to give a model of the natural numbers. Since the finite cardinals also give such a model there must be an isomorphism (as models of $\mathbb{N}$) between finite ordinals and finite cardinals. We’ll see that the isomorphism required by the universal property sends each ordinal to its cardinality. If two distinct ordinals had the same cardinality, then this couldn’t be an isomorphism, so distinct finite ordinals have distinct cardinalities. We’ll also blur the distinction between a well-ordered set and the ordinal number it represents.
So here’s the construction. We start with the empty set $\varnothing$, which has exactly one order. It can seem a little weird, but if you just follow the definitions it makes sense: any relation from $\varnothing$ to itself is a subset of $\varnothing\times\varnothing=\varnothing$, and there’s only one of them. Reading the definitions carefully, it uses a lot of “for every”, but no “there exists”. Each time we say “for every” it’s trivially true, since there’s nothing that can make it false. Since we never require the existence of an element having a certain property, that’s not a problem. Anyhow, we call the empty set with this (trivial) well-ordering the ordinal $0$. Notice that it has (cardinal number) zero elements.
Now given an ordinal number $n$ we define $S(n)=n\cup\{n\}$. That is, each new number has the set of all the ordinals that came before it as elements. We need to put a well-ordering on this set, which is just the order in which the ordinals showed up. In fact, we can say this a bit more concisely: $m<n$ if $m\in n$. More explicitly, each ordinal number is an element of every one that comes after it. Also notice that each time we make a new ordinal out of the ones that came before it, we add one new element. The successor function here adds one to the cardinality, meaning it corresponds to the successor in the cardinal number model of $\mathbb{N}$. This gives a function from the finite ordinals onto the finite cardinals.
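The construction is short enough to write out directly. This sketch models sets with Python's `frozenset` so that a set can be an element of another set. The names `successor` and `ordinal` are chosen for the example.

```python
def successor(n):
    """S(n) = n together with n itself as one new element."""
    return n | frozenset([n])

def ordinal(k):
    """Build the k-th finite ordinal by applying the successor k times to the empty set."""
    n = frozenset()
    for _ in range(k):
        n = successor(n)
    return n

three = ordinal(3)
print(len(three))                 # cardinality of the ordinal
print(ordinal(2) in three)        # earlier ordinals are elements of later ones
print(three in ordinal(2))
```

The length check is the map from finite ordinals to finite cardinals: each application of the successor adds exactly one element.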
What’s left to check is the universal property. Here we can leverage the cardinal number model and this surjection of finite ordinals onto finite cardinals. I’ll leave the details to you, but if you draw out the natural numbers diagram it should be pretty clear how to show that the universal property is satisfied.
The upshot of all of this is that finite ordinals, like finite cardinals, give another model of the natural numbers, which is why natural numbers seem to show up when we list things.
I’ve said a bunch about natural numbers, but I seem to have ignored what we’re most used to doing with them: counting things! The reason is that we actually don’t use natural numbers to count, we use something called cardinal numbers.
So let’s go back and think about sets and functions. In fact, for the moment let’s just think about finite sets. It seems pretty straightforward to say there are three elements in the set $\{a,b,c\}$, and that there are also three elements in the set $\{x,y,z\}$. Step back for a moment, though, and consider why there are the same number of elements in these two sets. Try to do it without counting the numbers first. I’ll wait.
The essential thing that says there’s something the same about these two sets is that there is a bijection between them. For example, I could define a function $f:\{a,b,c\}\to\{x,y,z\}$ by $f(a)=x$, $f(b)=y$, and $f(c)=z$. Every element of $\{x,y,z\}$ is hit by exactly one element of $\{a,b,c\}$, so this is a bijection. Of course, it’s not the only one, but we’ll leave that alone for now.
So now let’s move back to all (possibly infinite) sets and define a relation. Say that sets $X$ and $Y$ are “in bijection” (and write $X\leftrightarrow Y$) if there is some bijection $f:X\to Y$. This is an equivalence relation! Any set is in bijection with itself, using the identity function. If $X$ is in bijection with $Y$ then we can use the inverse function to see that $Y\leftrightarrow X$. Finally, if $f:X\to Y$ and $g:Y\to Z$ are bijections, then $g\circ f:X\to Z$ is a bijection.
Any time we have an equivalence relation we can split things up into equivalence classes. Now I define a cardinal number to be a bijection class of sets: every set in the class is in bijection with every other, and with none outside the class.
So what does this have to do with natural numbers? Well, let’s focus in on finite sets again. There’s only one empty set $\varnothing$, so let’s call its cardinal number $0$. Now given any finite set $X$ with cardinal number (bijection class) $c$, there’s something not in $X$. Pick any such something, call it $x$, and look at the set $X\cup\{x\}$. If I took any other set $Y$ in bijection with $X$ and anything $y$ not in $Y$ then there is a bijection between $X\cup\{x\}$ and $Y\cup\{y\}$. Just apply the bijection from $X$ to $Y$ on those elements from $X$, and send $x$ to $y$. This shows that the bijection class (the cardinal number) of $X\cup\{x\}$ doesn’t depend on what choices we made along the way. Since it’s well-defined we can call it the successor $S(c)$.
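The well-definedness argument amounts to extending a bijection by one extra pair. Here is a sketch with finite sets, storing a bijection as a Python dictionary. The names `is_bijection` and `extend_bijection` and the sample sets are invented for illustration.

```python
def is_bijection(f, domain, codomain):
    """Check that the dict f is a bijection from `domain` onto `codomain`."""
    values = list(f.values())
    return (set(f) == set(domain)
            and len(set(values)) == len(values)    # injective
            and set(values) == set(codomain))      # surjective

def extend_bijection(f, x, y):
    """Given a bijection f from X to Y, a new x not in X and a new y not in Y,
    return the bijection from X with x added to Y with y added."""
    if x in f or y in f.values():
        raise ValueError("x and y must be new elements")
    g = dict(f)
    g[x] = y
    return g

f = {"a": 1, "b": 2}
g = extend_bijection(f, "c", 3)
print(g, is_bijection(g, {"a", "b", "c"}, {1, 2, 3}))
```

Whatever new elements are picked, the extended map is again a bijection, which is why the successor of a cardinal number does not depend on the choices.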
We look at the set of all bijection classes of finite sets. We’ve got an identified element $0$, and a successor function. In fact, this satisfies the universal property for natural numbers. The set of cardinal numbers of finite sets is (isomorphic to) the set of natural numbers!
And that’s how we count things.