Okay, defining the integral as the limit of a net of Riemann sums is all well and good, but it’s a huge net, and it seems impossible to calculate with. We need a better way of getting a handle on these things. What we’ll use is a little trick for evaluating limits of nets that I haven’t mentioned yet: “cofinal sets”.
Given a directed set $(D,\preceq)$, a directed subset $S\subseteq D$ is cofinal if for every $d\in D$ there is some $s\in S$ with $s\succeq d$. Now watch what happens when we try to show that the limit of a net $x_\alpha$ is a point $x$. We need to find for every neighborhood $U$ of $x$ an index $\alpha_0$ so that for every $\alpha\succeq\alpha_0$ we have $x_\alpha\in U$. But if $\alpha_0$ is such an index, then there is some $\beta_0\in S$ above it, and every $\beta\in S$ above that is also above $\alpha_0$, and so $x_\beta\in U$. That is, if the limit over $D$ exists, then the limit over $S$ exists and has the same value.
Let’s give a cofinal set of tagged partitions by giving a rule for picking the tags that go with any partition. Then our net consists just of partitions of the interval $[a,b]$, and the tags come for free. If the function $f$ is Riemann-integrable, then the limit over this cofinal set will be the integral. Here’s our rule: in the closed subinterval $[x_{i-1},x_i]$ pick a point $t_i$ so that $\lim_{x\to t_i}f(x)$ is the supremum of the values of $f$ in that subinterval. If the function is continuous it will attain a maximum at our tag, and if not it’ll get close or shoot off to infinity (if there is no supremum).
Why is this cofinal? Let’s imagine a tagged partition $((x_0,\dots,x_n),(t_1,\dots,t_n))$ where $t_i$ is not chosen according to this rule. Then we can refine the partition by splitting up the $i$th strip in such a way that $f(t_i)$ is the maximum in one of the new strips, and choosing all the new tags according to the rule. Then we’ve found a good partition above the one we started with. Similarly, we can build another cofinal set by always choosing the tags where $f(x)$ approaches an infimum.
When we consider a partition $x=(x_0,x_1,\dots,x_n)$ in the first cofinal set we can set up something closely related to the Riemann sums: the “upper Darboux sums”
$$U_x(f)=\sum_{i=1}^{n}M_i\,(x_i-x_{i-1})$$
where $M_i$ is the supremum of $f$ on the interval $[x_{i-1},x_i]$, or infinity if the value of $f$ is unbounded above here. Similarly, we can define the “lower Darboux sum”
$$L_x(f)=\sum_{i=1}^{n}m_i\,(x_i-x_{i-1})$$
where now $m_i$ is the infimum of $f$ on $[x_{i-1},x_i]$ (or negative infinity). If the function is Riemann-integrable, then the limits over these cofinal sets both exist and are both equal to the Riemann integral. So we define a function to be “Darboux-integrable” if the limits of the upper and lower Darboux sums both exist and have the same value. Then the Darboux integral is defined to be this common value. Notice that if the function ever shoots off to positive or negative infinity we’ll get an infinite value for one of the terms, and we can never converge, so such functions are not Darboux-integrable.
We should notice here that given any partition $x$, the upper Darboux sum must be at least as large as any Riemann sum with that same partition, since no matter how we choose the tag $t_i$ we’ll find that $f(t_i)\le\sup\{f(t)\mid x_{i-1}\le t\le x_i\}$ by definition. Similarly, the lower Darboux sum must be no larger than any Riemann sum on the same partition. Now let’s say that the upper and lower Darboux sums both converge to the same value $s$. Then given any neighborhood of $s$ (an interval around $s$, say) we can find a partition $x_U$ so that every upper Darboux sum over a refinement of $x_U$ is in the neighborhood, and a similar partition $x_L$ for the lower Darboux sums. Choosing a common refinement $x_R$ of both (which we can do because partitions form a directed set), both its upper and lower Darboux sums (and those of any of its refinements) will be in our neighborhood. Then we can choose any tags in $x_R$ we want, and the Riemann sum, squeezed between the lower and upper Darboux sums, will again be in the neighborhood. Thus a Darboux-integrable function is also Riemann-integrable.
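To see the squeeze in action, here is a minimal sketch, not a general integrator. It assumes the function is nondecreasing, so that its infimum and supremum on each subinterval sit at the left and right endpoints. The names `darboux_sums`, `riemann_sum`, and `uniform_partition` are made up for this illustration.

```python
from fractions import Fraction

def darboux_sums(f, partition):
    """Return (upper, lower) Darboux sums of f over the partition.

    Assumption: f is nondecreasing, so on [left, right] its infimum is
    f(left) and its supremum is f(right).
    """
    upper = lower = 0
    for left, right in zip(partition, partition[1:]):
        width = right - left
        lower += f(left) * width
        upper += f(right) * width
    return upper, lower

def riemann_sum(f, partition, tags):
    """Riemann sum for a tagged partition, one tag per subinterval."""
    return sum(f(t) * (r - l) for l, r, t in zip(partition, partition[1:], tags))

def uniform_partition(a, b, n):
    """Partition of [a, b] into n strips of equal width."""
    return [a + (b - a) * Fraction(i, n) for i in range(n + 1)]

def square(x):
    return x * x

for n in (4, 16, 64):
    p = uniform_partition(Fraction(0), Fraction(1), n)
    midpoints = [(l + r) / 2 for l, r in zip(p, p[1:])]
    upper, lower = darboux_sums(square, p)
    middle = riemann_sum(square, p, midpoints)
    # Any choice of tags lands between the two Darboux sums.
    assert lower <= middle <= upper
    print(n, float(lower), float(middle), float(upper))
```

For this particular function the gap between the upper and lower sums on a uniform partition is exactly $1/n$, so refining the partition pins down any Riemann sum caught in between.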
So this new notion of Darboux-integrability is really the same one as Riemann-integrability, but it involves taking two limits over a much less complicated directed set. For now, we’ll just call a function which satisfies either of these two equivalent conditions “integrable” and be done with it, using whichever construction of the integral is most appropriate to our needs at the time.
Okay, I’d promised to get back to the fact that the real numbers form the “largest” Archimedean field. More precisely, any Archimedean field is order-isomorphic to a subfield of $\mathbb{R}$.
There’s an interesting side note here. I was thinking about this and couldn’t quite see my way forward. So I started asking around Tulane’s math department and seeing if anyone knew. Someone pointed me towards Mike Mislove, and when I asked him, he suggested we ask Laszlo Fuchs around the corner from him. Dr. Fuchs, it turned out, did know the answer, and it was in a book he’d written himself: Partially Ordered Algebraic Systems. It’s an interesting little volume, which I may come back and mine later for more topics.
Anyhow, we’ll do this a little more generally. First let’s talk about Archimedean ordered groups a bit. In a totally-ordered group $G$ we’ll say two elements $a$ and $b$ are “Archimedean equivalent” ($a\sim b$) if there are natural numbers $m$ and $n$ so that $|a|\le|b|^m$ and $|b|\le|a|^n$ (here I’m using the absolute value that comes with any totally-ordered group). That is, neither one is infinitesimal with respect to the other. This can be shown to be an equivalence relation, so it chops the elements of $G$ into equivalence classes. There are always at least two in any nontrivial group because the identity element is infinitesimal with respect to everything else. We say a group is Archimedean if there are only two Archimedean equivalence classes. That is, for any $a$ and $b$ other than the identity, there is a natural number $n$ with $|a|<|b|^n$.
Now we have a theorem of Hölder which says that any Archimedean group is order-isomorphic to a subgroup of the real numbers with addition. In particular, we will see that any Archimedean group is commutative.
Now either $G$ has a least positive element $g$ or it doesn’t. If it does, then $e\le x<g$ implies that $x=e$ ($e$ is the identity of the group). By the Archimedean property, any element $a$ has an integer $n$ so that $g^n\le a<g^{n+1}$. Then we can multiply by $g^{-n}$ to find that $e\le g^{-n}a<g$, so $a=g^n$. Every element is thus some power of $g$, and the group is isomorphic to the integers $\mathbb{Z}$.
On the other hand, what if given a positive $x$ we can always find a positive $y$ with $y<x$? Then $y^2$ may be greater than $x$, but if so we can show that $(xy^{-1})^2<x$, and $xy^{-1}$ itself is less than $x$, so in either case we have an element $z$ with $e<z<x$ and $z^2\le x$.
Now if two positive elements $a$ and $b$ fail to commute then without loss of generality we can assume $ba<ab$. Then we pick $x=aba^{-1}b^{-1}>e$ and choose a $z$ to go with this $x$. By the Archimedean property we’ll have numbers $m$ and $n$ with $z^m\le a<z^{m+1}$ and $z^n\le b<z^{n+1}$. Thus we find that $x=aba^{-1}b^{-1}<z^{m+1}z^{n+1}z^{-m}z^{-n}=z^2$, which contradicts how we picked $z$. And thus $G$ is commutative.
So we can pick some positive element $a\in G$ and just set $f(a)=1$. Now we need to find where to send every other element. To do this, note that for any $b\in G$ and any rational number $\frac{m}{n}$ (with $n>0$) we’ll either have $a^m\le b^n$ or $a^m>b^n$, and both of these situations must arise by the Archimedean property. This separates the rational numbers into two nonempty collections: a cut! So we define $f(b)$ to be the real number specified by this cut. It’s straightforward now to show that $f(bc)=f(b)+f(c)$, and thus establish the order isomorphism.
So all Archimedean groups are just subgroups of $\mathbb{R}$ with addition as its operation. In fact, homomorphisms of such groups are just as simple.
Say that we have a nontrivial Archimedean group $A\subseteq\mathbb{R}$, a (possibly trivial) Archimedean group $B\subseteq\mathbb{R}$, and an order-preserving homomorphism $f:A\to B$. If $f(a)=0$ for some positive $a\in A$ then this is just the trivial homomorphism sending everything to zero, since for any positive $x\in A$ there is a natural number $n$ so that $x<na$, and so $0\le f(x)\le nf(a)=0$. In this case the homomorphism is “multiply by $0$”.
On the other hand, take any two positive elements $a_1,a_2\in A$ and consider the quotients (in $\mathbb{R}$) $\frac{a_1}{a_2}$ and $\frac{f(a_1)}{f(a_2)}$. If they’re different (say, $\frac{a_1}{a_2}<\frac{f(a_1)}{f(a_2)}$) then we can pick a rational number $\frac{m}{n}$ between them. Then $na_1<ma_2$, while $nf(a_1)>mf(a_2)$, which contradicts the order-preserving property of the homomorphism! Thus we find the ratio $\frac{f(a)}{a}$ must be a constant $r$, and the homomorphism is “multiply by $r$”.
Now let’s move up to Archimedean rings, whose definition is the same as that for Archimedean fields. In this case, either the product of any two elements is $0$ (we have a “zero ring”) and the additive group is order-isomorphic to a subgroup of $\mathbb{R}$, or the ring is order-isomorphic to a subring of $\mathbb{R}$. If we have a zero ring, then the only data left is an Archimedean group, which the above discussion handles, so we’ll just assume that we have some nonzero product and show that we have an order-isomorphism with a subring of $\mathbb{R}$.
So we’ve got some Archimedean ring $R$ and its additive group $R^+$. By the theorem above, $R^+$ is order-isomorphic to a subgroup of $\mathbb{R}$. We also know that for any positive $a\in R$ the operation $x\mapsto a\cdot x$ (the dot will denote the product in $R$) is an order-homomorphism from $R^+$ to itself. Thus there is some non-negative real number $r_a$ so that $a\cdot x=r_ax$. If we define $r_{-a}=-r_a$ then the assignment $a\mapsto r_a$ gives us an order-homomorphism from $R^+$ to some group $S\subseteq\mathbb{R}$.
Again, we must have $r_a=sa$ for some non-negative real number $s$. If $s=0$ then all multiplications in $R$ would give zero, so $s\neq0$, and so the assignment is invertible. Now we see that $r_{a\cdot b}=s(a\cdot b)=sr_ab=r_ar_b$. Similarly, we have $r_{a+b}=r_a+r_b$, and so the function $a\mapsto r_a$ is an order-isomorphism of rings.
In particular, a field $F$ can’t be a zero ring, and so there must be an injective order-homomorphism $F\to\mathbb{R}$. In fact, there can be only one, for if there were more than one the images would be related by multiplication by some positive $r\in\mathbb{R}$: $f_1(x)=rf_2(x)$. But then $1=f_1(1)=rf_2(1)=r$, and so $f_1=f_2$.
We can sum this up by saying that the real numbers are a terminal object in the category of Archimedean fields.
There’s another sense in which the rational numbers are lacking and the real numbers fix them up. This one is completely about the order structure on $\mathbb{Q}$, and will lead to another construction of the real numbers.
Okay, so what’s wrong with the rational numbers now? In any partial order we can consider least upper or greatest lower bounds. That is, given a nonempty set of rational numbers $S\subseteq\mathbb{Q}$ the least upper bound or “supremum” $\sup S$ is a rational number so that $s\le\sup S$ for all $s\in S$ (it’s an upper bound) and if $b$ is any upper bound then $\sup S\le b$. Similarly the “infimum” $\inf S$ is the greatest lower bound: a lower bound that’s greater than all the other lower bounds.
There’s just one problem: there might not be a supremum of a given set. Even if the set is bounded above, there may be no least upper bound. For instance, the set of all rational numbers $r$ so that $r^2\le2$ is bounded above, since $2$ is an example of an upper bound, and $\frac{3}{2}$ is another. But no matter what upper bound we have in hand, we can always find a lower number which is still an upper bound for this set. In fact, the upper bound should be $\sqrt{2}$, but this isn’t a rational number!
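Here is a small sketch of why no rational upper bound can be the least one. The trick of averaging $b$ with $2/b$ is my own choice for the illustration (it is the Newton step for $\sqrt{2}$), and the function names are hypothetical.

```python
from fractions import Fraction

def is_upper_bound(b):
    """True if the rational b bounds {r in Q : r*r <= 2} from above."""
    return b > 0 and b * b >= 2

def smaller_upper_bound(b):
    """Given a rational upper bound b, return a strictly smaller one.

    Since b*b > 2 we have 2/b < b, so the average of b and 2/b is below b,
    and its square is still above 2.
    """
    assert is_upper_bound(b)
    return (b + 2 / b) / 2

b = Fraction(2)
for _ in range(4):
    nxt = smaller_upper_bound(b)
    # Still an upper bound, but strictly lower than the one we had.
    assert nxt < b and is_upper_bound(nxt)
    b = nxt
    print(b, float(b))
```

Every rational upper bound can be pushed down this way, so the set has no least upper bound among the rationals.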
Now we define an ordered field $F$ to be “Dedekind complete” if every nonempty set $S\subseteq F$ with an upper bound has a least upper bound $\sup S$. Considering the set $-S$ consisting of the negatives of elements of $S$, every nonempty set with a lower bound will have a greatest lower bound. The flaw in the rational numbers is that they are not Dedekind complete.
So, in order to complete them, we will use the method of “Dedekind cuts”. Given a nonempty set $S$ with any upper bounds at all, the collection of all the upper bounds forms another set $X_R$, and any element of $S$ gives a lower bound for $X_R$. We then have the collection of all lower bounds of $X_R$, which we call $X_L$. Every rational number will be in either $X_L$ or $X_R$. Given a rational number $x$, if there is an $s\in S$ with $x\le s$ then $x$ is a lower bound for $X_R$, and is thus in $X_L$. If not, then $x$ is an upper bound for $S$ and is thus in $X_R$. However, one number may be in both $X_L$ and $X_R$. If $S$ has a rational supremum then it is in both $X_L$ and $X_R$. There can be no more overlap.
So we’ve come up with a way of cutting the rational numbers into a pair of sets, the “left set” $X_L$ and “right set” $X_R$ of the “cut” $(X_L,X_R)$, with $x_L\le x_R$ for every $x_L\in X_L$ and $x_R\in X_R$. We define a new total order on the set of cuts by $(X_L,X_R)\le(Y_L,Y_R)$ if $X_L\subseteq Y_L$. Every rational number corresponds to one of the cuts which contains a one-point overlap at that rational number, and clearly this inclusion preserves the order on the rational numbers.
Now if I take any nonempty collection of cuts with an upper bound (in the order on the set of cuts) we can take the union of all the left sets $X_L$ and the intersection of all the right sets $X_R$ to get a new cut. The left set contains all the $X_L$, and it’s contained in any other left set which contains them, so it’s the least upper bound. Thus the collection of cuts is now Dedekind complete.
We can also add field structures to this completion of an ordered field. For this purpose, it will be useful to denote a generic element of $X_L$ by $x_L$, and assume $x_L$ runs over all elements of $X_L$ wherever it appears (and similarly for $x_R$ and $X_R$). For instance, given cuts $X=(X_L,X_R)$ and $Y=(Y_L,Y_R)$, we define their sum to be $(\{x_L+y_L\},\{x_R+y_R\})$. The negative of a cut $X$ will be $(\{-x_R\},\{-x_L\})$. The product of two positive cuts $X$ and $Y$ will have as its right set the collection of all products $x_Ry_R$, and its left set defined to make this a cut. Finally, the reciprocal of a positive cut $X$ will have the set $\{1/x_L\}$ (taken over the positive $x_L$) as its right set, and its left set defined to make this a cut.
This suffices to define all the field operations on the set of cuts, and if we start with $\mathbb{Q}$ we get another model of $\mathbb{R}$. I’ll leave verification of the field axioms as an exercise, and come back to prove that the method of cuts and the method of Cauchy sequences are equivalent. Once you play with cuts for a while, you may understand why I came at the real numbers with Cauchy sequences first. The cut approach seems to have a certain simplicity, and it’s less ontologically demanding since we’re only ever talking about pairs of subsets of the rational numbers rather than gigantic equivalence classes of sequences. But in the end I always find cuts to be extremely difficult to work with. Luckily, once we’ve shown them to be equivalent to Cauchy sequences it will establish that the real numbers we’ve been talking about are Dedekind complete, and we can put the messiness of this definition behind us.
Well, yesterday was given over to exam-writing, so today I’ll pick up a few scraps I mentioned in passing on Thursday.
First of all, the rational numbers are countable. To be explicit, in case I haven’t been before, this means that there is an injective function from the set of rational numbers to the set of natural numbers. Really, I’ll just handle the positive rationals, but it’s straightforward how to include the negatives as well. To every positive rational number we can get a pair of natural numbers: the numerator and the denominator (in lowest terms). Then we can send the pair $(m,n)$ to the number $\frac{(m+n)(m+n+1)}{2}+n$, which is a bijection between the set of all pairs of natural numbers and all natural numbers. Clearly they contain the natural numbers, so the set of rational numbers is countably infinite.
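As a quick sanity check, here is a sketch of one standard pairing, the Cantor pairing function, together with its inverse. I’m assuming the natural numbers start at zero, and the function names are made up for the illustration.

```python
from fractions import Fraction

def cantor_pair(m, n):
    """Send the pair (m, n) of naturals to a single natural number."""
    return (m + n) * (m + n + 1) // 2 + n

def cantor_unpair(k):
    """Inverse of cantor_pair: recover (m, n) from k."""
    # Find the diagonal s = m + n on which k sits.
    s = 0
    while (s + 1) * (s + 2) // 2 <= k:
        s += 1
    n = k - s * (s + 1) // 2
    return s - n, n

def index_of_rational(q):
    """Injective map from positive rationals into the naturals."""
    q = Fraction(q)  # Fraction keeps numerator and denominator in lowest terms
    return cantor_pair(q.numerator, q.denominator)

# Pairs on the first ten diagonals hit 0, 1, ..., 54 exactly once each.
values = sorted(cantor_pair(m, s - m) for s in range(10) for m in range(s + 1))
assert values == list(range(55))
print(index_of_rational(Fraction(1, 2)), index_of_rational(Fraction(2, 4)))
```

Reducing to lowest terms first is what makes the map well defined on rationals rather than on pairs: $\frac{1}{2}$ and $\frac{2}{4}$ get the same index.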
Now, equivalence relations. Given any relation $R$ on a set $X$ we can build it up into an equivalence relation. First throw in the diagonal $\{(x,x)\mid x\in X\}$ to make it reflexive. Then throw in all the points $(y,x)$ for $(x,y)\in R$ to make it symmetric. For transitivity, we can similarly start throwing elements into the relation until this condition is satisfied.
But that’s all sort of ugly. Here’s a more elegant way of doing it: given any relation $R\subseteq X\times X$, consider all the relations $S$ on $X$ that contain $R$, so that $R\subseteq S\subseteq X\times X$. Some of these will be equivalence relations. In particular, the whole product $X\times X$ is an equivalence relation, so there is at least one such. It’s simple to verify that the intersection of any family of equivalence relations on $X$ is again an equivalence relation, so we can take the intersection of all equivalence relations on $X$ containing $R$ to get the smallest such relation. Notice, by the way, how this is similar to generating a topology from a subbase, or to taking the closure of a subset in a topological space.
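For a finite set the “ugly” construction is easy to carry out by hand, and it lands on the same smallest equivalence relation. Here is a minimal sketch, with a relation stored as a set of pairs; the function name is hypothetical.

```python
def equivalence_closure(relation, universe):
    """Smallest equivalence relation on `universe` containing `relation`."""
    closure = set(relation)
    # Throw in the diagonal to make it reflexive.
    closure |= {(x, x) for x in universe}
    # Throw in the reversed pairs to make it symmetric.
    closure |= {(y, x) for (x, y) in closure}
    # Keep adding composites until nothing new shows up (transitivity).
    changed = True
    while changed:
        new_pairs = {(x, w) for (x, y) in closure for (z, w) in closure if y == z}
        changed = not new_pairs <= closure
        closure |= new_pairs
    return closure

rel = {(0, 1), (1, 2), (3, 4)}
closed = equivalence_closure(rel, range(5))
print(sorted(closed))
```

Here the closure glues the five points into the two classes $\{0,1,2\}$ and $\{3,4\}$.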
Finally, absolute values. In any totally ordered group $G$ we have the positive “cone” $G^+$ of all elements $x$ with $x\ge e$, and the negative “cone” $G^-$ of all elements $x$ with $x\le e$. In the latter case, we can multiply both sides by the inverse of $x$ to get $x^{-1}\ge e$ in the positive cone. Notice that the identity is in both cones, but the reflection described leaves it fixed. So for every element $x$ in $G$ we get a well-defined element $|x|\in G^+$ called its absolute value. Of course, we often assume that $G$ is abelian and write this all additively instead of multiplicatively.
This function has a number of nice properties. First of all, $|x|$ is always in $G^+$. Secondly, $|x|$ is the identity in $G$ if and only if $x$ itself is the identity. Thirdly, $|x^{-1}|=|x|$. And finally, if $G$ is abelian we have the “triangle inequality” $|xy|\le|x||y|$.
Okay, does that catch us up?
Okay, so we’ve defined a topology on a set $X$. But we also love categories, so we want to see this in terms of categories. And, indeed, every topology is a category!
First, remember that the collection of subsets of $X$, like the collection of subobjects on an object in any category, is partially ordered by inclusion. And since every partially ordered set is a category, so is the collection of subsets of $X$.
In fact, it’s a lattice, since we can use union and intersection as our join and meet, respectively. When we say that a poset has pairwise least upper bounds it’s the same as saying when we consider it as a category it has finite coproducts, and similarly pairwise greatest lower bounds are the same as finite products. But here we can actually take the union or intersection of any collection of subsets and get a subset, so we have all products and coproducts. In the language of posets, we have a “complete lattice”.
So now we want to talk about topologies. A topology is just a collection of the subsets that’s closed under finite intersections and arbitrary unions. We can use the same order (inclusion of subsets) to make a topology into a partially-ordered set. In the language of posets, the requirements are that we have a sublattice (finite meets and joins, along with the same top and bottom element) with arbitrary joins: the topology contains the least upper bound of any collection of its elements.
And now we translate the partial order language into category theory. A topology is a subcategory of the category of subsets of $X$ with finite products and all coproducts. That is, we have an arrow from the object $U$ to the object $V$ if and only if $U\subseteq V$ as subsets of $X$. Given any finite collection $\{U_i\}_{i=1}^n$ of objects we have their product $\bigcap_{i=1}^nU_i$, and given any collection $\{U_\alpha\}$ of objects we have their coproduct $\bigcup_\alpha U_\alpha$. In particular we have the empty product (the terminal object $X$) and we have the empty coproduct (the initial object $\varnothing$). And all the arrows in our category just tell us how various open sets sit inside other open sets. Neat!
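On a finite set all of this can be checked mechanically. The sketch below assumes a finite collection of open sets, where closure under pairwise unions together with the empty set already gives arbitrary unions; `is_topology` and `has_arrow` are names I’m inventing for the illustration.

```python
from itertools import combinations

def is_topology(opens, space):
    """Check that a finite collection of subsets of `space` is a topology."""
    opens = {frozenset(u) for u in opens}
    space = frozenset(space)
    # The empty coproduct (initial object) and empty product (terminal object).
    if frozenset() not in opens or space not in opens:
        return False
    # Binary products are intersections, binary coproducts are unions.
    return all(u & v in opens and u | v in opens
               for u, v in combinations(opens, 2))

def has_arrow(u, v):
    """There is an arrow u -> v exactly when u is a subset of v."""
    return frozenset(u) <= frozenset(v)

X = {1, 2, 3}
nested = [set(), {1}, {1, 2}, {1, 2, 3}]
broken = [set(), {1}, {2}, {1, 2, 3}]   # missing the union {1, 2}
print(is_topology(nested, X), is_topology(broken, X))
print(has_arrow({1}, {1, 2}), has_arrow({1, 2}, {1}))
```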
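A poset which has both least upper bounds and greatest lower bounds is called a lattice. In more detail, let’s say we have a poset $(P,\le)$ and give it two operations: meet (written $\wedge$) and join (written $\vee$). These satisfy the requirements that
- $x\wedge y\le x$ and $x\wedge y\le y$.
- If $z\le x$ and $z\le y$ then $z\le x\wedge y$.
- $x\le x\vee y$ and $y\le x\vee y$.
- If $x\le z$ and $y\le z$ then $x\vee y\le z$.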
Not every poset has a meet and a join operation, but if these operations do exist they are uniquely specified by these requirements. In fact, we can see this sort of like how we saw that direct products of groups are unique up to isomorphism: if we have two least upper bounds for a pair of elements then they must each be below or equal to the other, so they must be the same.
We can derive the following properties of the operations:
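- $x\wedge y=y\wedge x$ and $x\vee y=y\vee x$.
- $(x\wedge y)\wedge z=x\wedge(y\wedge z)$ and $(x\vee y)\vee z=x\vee(y\vee z)$.
- $x\wedge(x\vee y)=x$ and $x\vee(x\wedge y)=x$.
From these we see that there’s a sort of duality between the two operations. In fact, we can see that these provide two commutative semigroup structures that happen to interact in a certain nice way.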
Actually, it gets even better. If we have two operations on any set satisfying these properties then we can define a partial order: $x\le y$ if and only if $x\wedge y=x$. So we can define a lattice either by the order property and get the algebraic properties, or we can define it by the algebraic properties and get the order property from them.
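In many cases, a lattice also satisfies $x\vee(y\wedge z)=(x\vee y)\wedge(x\vee z)$, or equivalently $x\wedge(y\vee z)=(x\wedge y)\vee(x\wedge z)$. In this case we call it “distributive”. A bit weaker is to require that if $x\le z$ then $x\vee(y\wedge z)=(x\vee y)\wedge z$ for all $y$. In this case we call the lattice “modular”.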
A lattice may have elements above everything else or below everything else. We call a greatest element of a lattice $1$ and a least element $0$. In this case we can define “complements”: $x$ and $y$ are complements if $x\vee y=1$ and $x\wedge y=0$. If the lattice is distributive, then the complement of $x$ is unique if it exists. A distributive lattice where every element has a complement is called “Boolean”.
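A concrete example helps here. This sketch uses the divisors of a number ordered by divisibility, with the greatest common divisor as meet and the least common multiple as join. That choice of example is mine, not something from the discussion above, and the helper names are hypothetical.

```python
from math import gcd

def lcm(a, b):
    """Join in the divisor lattice."""
    return a * b // gcd(a, b)

def divisors(n):
    return [d for d in range(1, n + 1) if n % d == 0]

def satisfies_absorption(elems):
    return all(gcd(x, lcm(x, y)) == x and lcm(x, gcd(x, y)) == x
               for x in elems for y in elems)

def is_distributive(elems):
    return all(gcd(x, lcm(y, z)) == lcm(gcd(x, y), gcd(x, z))
               for x in elems for y in elems for z in elems)

def complements(x, elems, top, bottom=1):
    """All y with join(x, y) = top and meet(x, y) = bottom."""
    return [y for y in elems if lcm(x, y) == top and gcd(x, y) == bottom]

d30, d12 = divisors(30), divisors(12)
# The order is recovered from the meet: x divides y iff gcd(x, y) == x.
assert all((y % x == 0) == (gcd(x, y) == x) for x in d30 for y in d30)
print(is_distributive(d30), is_distributive(d12))
print(complements(6, d30, 30), complements(2, d12, 12))
```

Both lattices are distributive, but only the divisors of $30$ form a Boolean lattice: among the divisors of $12$ the element $2$ has no complement.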
We use cardinal numbers to count how many elements are in a set. Another thing we think of numbers for is listing elements. That is, we put things in order: first, second, third, and so on.
We identified a cardinal number as an isomorphism class of sets. Ordinal numbers work much the same way, but we use sets equipped with well-orders. Now we don’t allow all the functions between two sets. We just consider the order-preserving functions. If $(X,\le)$ and $(Y,\preceq)$ are two well-ordered sets, a function $f:X\to Y$ preserves the order if whenever $x_1\le x_2$ then $f(x_1)\preceq f(x_2)$. We consider two well-ordered sets to be equivalent if there is an order-preserving bijection between them, and define an ordinal number to be an equivalence class of well-ordered sets under this relation.
If two well-ordered sets are equivalent, they must have the same cardinality. Indeed, we can just forget the order structure and we have a bijection between the two sets. This means that two sets representing the same ordinal number also represent the same cardinal number.
Now let’s just look at finite sets for a moment. If two finite well-ordered sets have the same number of elements, then it turns out they are order-equivalent too. It can be a little tricky to do this straight through, so let’s sort of come at it from the side. We’ll use finite ordinal numbers to give a model of the natural numbers. Since the finite cardinals also give such a model there must be an isomorphism (as models of $\mathbb{N}$) between finite ordinals and finite cardinals. We’ll see that the isomorphism required by the universal property sends each ordinal to its cardinality. If two ordinals had the same cardinality, then this couldn’t be an isomorphism, so distinct finite ordinals have distinct cardinalities. We’ll also blur the distinction between a well-ordered set and the ordinal number it represents.
So here’s the construction. We start with the empty set, which has exactly one order. It can seem a little weird, but if you just follow the definitions it makes sense: any relation from $\varnothing$ to itself is a subset of $\varnothing\times\varnothing=\varnothing$, and there’s only one of them. Reading the definitions carefully, it uses a lot of “for every”, but no “there exists”. Each time we say “for every” it’s trivially true, since there’s nothing that can make it false. Since we never require the existence of an element having a certain property, that’s not a problem. Anyhow, we call the empty set with this (trivial) well-ordering the ordinal $0$. Notice that it has (cardinal number) zero elements.
Now given an ordinal number $n$ we define $n+1=n\cup\{n\}$. That is, each new number has the set of all the ordinals that came before it as elements. We need to put a well-ordering on this set, which is just the order in which the ordinals showed up. In fact, we can say this a bit more concisely: $m<n$ if $m\in n$. More explicitly, each ordinal number is an element of every one that comes after it. Also notice that each time we make a new ordinal out of the ones that came before it, we add one new element. The successor function here adds one to the cardinality, meaning it corresponds to the successor in the cardinal number model of $\mathbb{N}$. This gives a function from the finite ordinals onto the finite cardinals.
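The construction is concrete enough to build directly. Here is a sketch using Python’s `frozenset` to stand in for sets of sets; the names `zero`, `successor`, and `ordinal` are just labels for this illustration.

```python
zero = frozenset()  # the empty set, our ordinal 0

def successor(n):
    """The next ordinal: n together with n itself as a new element."""
    return n | frozenset([n])

def ordinal(k):
    """Build the k-th finite ordinal by applying successor k times."""
    n = zero
    for _ in range(k):
        n = successor(n)
    return n

three = ordinal(3)
# Each ordinal is the set of all the ordinals that came before it.
assert three == frozenset({ordinal(0), ordinal(1), ordinal(2)})
# The order is membership: m < n exactly when m is an element of n.
assert ordinal(2) in ordinal(4) and ordinal(4) not in ordinal(2)
# The successor adds one to the cardinality.
print([len(ordinal(k)) for k in range(6)])
```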
What’s left to check is the universal property. Here we can leverage the cardinal number model and this surjection of finite ordinals onto finite cardinals. I’ll leave the details to you, but if you draw out the natural numbers diagram it should be pretty clear how to show that the universal property is satisfied.
The upshot of all of this is that finite ordinals, like finite cardinals, give another model of the natural numbers, which is why natural numbers seem to show up when we list things.
A well-ordering on a set is a special kind of total order: one in which every non-empty subset contains a least element.
The two best examples of ordered sets we have handy are the natural numbers $\mathbb{N}$ and the integers $\mathbb{Z}$. For now, forget all the other algebraic structure we’ve talked about. The natural numbers are well-ordered, but the integers are not.
Seeing that the integers are not well-ordered is pretty easy: just take the subset of negative integers. There’s no negative number that’s less than all the others. Seeing that the natural numbers are well-ordered is a little more difficult. Given a set $S\subseteq\mathbb{N}$, either $0$ is in $S$ or it isn’t. If it is, it’s the least element: nothing could be less than it. If it isn’t, then we move to $1$. As we keep going, either we eventually hit an element of $S$ or we don’t. If we do, the first one is the least element of $S$. If not, then by induction the set of natural numbers not in $S$ is all of them, and $S$ must be empty!
It turns out that in the standard logical framework I’m using, any set can be equipped with a well-order. For instance, we could well-order the integers by saying $a\preceq b$ if either $|a|<|b|$ or $|a|=|b|$ and $a\le b$. We can list the integers in this order: $0,-1,1,-2,2,-3,3,\dots$. The problem is that this doesn’t mesh with the algebraic structure very well.
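One way to play with a well-order like this is as a sort key. The sketch below assumes the order “compare absolute values first, then break ties with the usual order”; `well_order_key` and `least_element` are hypothetical names.

```python
def well_order_key(a):
    """Compare integers by absolute value first, then by the usual order."""
    return (abs(a), a)

def least_element(subset):
    """Least element of a nonempty finite set of integers in this order."""
    return min(subset, key=well_order_key)

print(sorted(range(-3, 4), key=well_order_key))
# The negative integers have no least element in the usual order,
# but in this order any nonempty subset of them does.
print(least_element(range(-5, 0)))
```

Of course a finite sample can’t show that every nonempty subset has a least element, but it does make the listing order easy to see.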
Now as we’ll see when we get to them, the natural order on the real numbers is even further from being well-ordered, and there doesn’t seem to be any sensible way to well-order them. Still, we’ll see eventually that though the well-ordering principle is “plainly false” — very counterintuitive — it turns out to be logically equivalent to another tool that’s so incredibly useful we really should accept this unexpected fact.
As a bonus today, I want to define a few more kinds of relations.
A preorder is a relation on a set which is reflexive and transitive. We often write a general preorder as $\preceq$ and say that $a$ precedes $b$ or that $b$ succeeds $a$ when $a\preceq b$. A set equipped with a preorder is called a preordered set. If we also have that for any two elements $a$ and $b$ there is some element $c$ (possibly the same as $a$ or $b$) that succeeds both of them we call the structure a directed set.
A partial order is a preorder which is also antisymmetric: the only way to have both $a\preceq b$ and $b\preceq a$ is for $a$ and $b$ to be the same element. We call a set with a partial order a partially-ordered set or a “poset”.
Any set $X$ gives a partial order on its set of subsets, given by inclusion: if $A$ and $B$ are subsets of a set $X$, then $A$ precedes $B$ if $A$ is contained in $B$. This has the further nice property that it has a top element, $X$ itself, that succeeds every element. It also has a bottom element, the empty subset, that precedes everything. The same sort of construction applies to give the poset of subgroups of any given group. These kinds of partially-ordered sets are very important in logic and set theory, and they’ll come up in more detail later.
Finally, a partial order where for any two elements $a$ and $b$ we either have $a\preceq b$ or $b\preceq a$ is called a total order. Total orders show up over and over, and they’re nice things to have around. I must admit, though, that as far as I’m concerned they’re pretty boring in and of themselves.
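For a relation on a finite set, each of these definitions turns into a short check. In this sketch a relation is a set of pairs $(a,b)$ meaning $a\preceq b$. The two examples, divisibility and the usual order on $\{1,\dots,6\}$, are my own choices, and the function names are hypothetical.

```python
def is_reflexive(rel, elems):
    return all((a, a) in rel for a in elems)

def is_transitive(rel):
    return all((a, d) in rel for (a, b) in rel for (c, d) in rel if b == c)

def is_antisymmetric(rel):
    return all(a == b for (a, b) in rel if (b, a) in rel)

def is_total(rel, elems):
    return all((a, b) in rel or (b, a) in rel for a in elems for b in elems)

def is_directed(rel, elems):
    """Every pair of elements has some element succeeding both."""
    return all(any((a, c) in rel and (b, c) in rel for c in elems)
               for a in elems for b in elems)

elems = range(1, 7)
divides = {(a, b) for a in elems for b in elems if b % a == 0}
usual = {(a, b) for a in elems for b in elems if a <= b}

for name, rel in (("divides", divides), ("usual", usual)):
    print(name,
          is_reflexive(rel, elems), is_transitive(rel),
          is_antisymmetric(rel), is_total(rel, elems),
          is_directed(rel, elems))
```

Divisibility on this set is a partial order that is neither total (neither of $2$ and $3$ divides the other) nor directed (nothing in the set is a multiple of both $4$ and $5$), while the usual order is total and so automatically directed.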