The Unapologetic Mathematician

Mathematics for the interested outsider

The Differential Mean Value Theorem

Let’s say we’ve got a function f that’s continuous on the closed interval \left[a,b\right] and differentiable on (a,b). We don’t even assume the function is defined outside the interval, so we can’t really set up the limit for differentiability at the endpoints, but they don’t matter much in the end.

Anyhow, if we look at the graph of f we could just draw a straight line from the point (a,f(a)) to the point (b,f(b)). The graph itself wanders away from this line and back, but the line tells us that on average we’re moving from f(a) to f(b) at a certain rate — the slope of the line. Since this is an average behavior, sometimes we must be going faster and sometimes slower. The differential mean value theorem says that there’s at least one point where we’re going exactly that fast. Geometrically, this means that the tangent line will be parallel to the secant we drew between the endpoints. In formulas we say there is a point c\in(a,b) with f'(c)=\frac{f(b)-f(a)}{b-a}.

First let’s nail down a special case, called “Rolle’s theorem”. If f(a)=0=f(b), we’re asserting that there is some point c\in(a,b) with f'(c)=0. Since \left[a,b\right] is compact and f is continuous, the extreme value theorem tells us that f must take a maximum and a minimum. If these are both zero, then we’re looking at the constant function f(x)=0, and any point in the middle satisfies f'(c)=0. On the other hand, if either the maximum or minimum is nonzero, then it can’t be attained at an endpoint, since f is zero at both of them. So we have a local extremum at a point c\in(a,b) where f is differentiable (since it’s differentiable all through the open interval). Now Fermat’s theorem tells us that f'(c)=0 since c is a local extremum! Thus Rolle’s theorem is proved.

Now for the general case. Start with the function f and build from it the function g(x)=f(x)-\frac{f(b)-f(a)}{b-a}(x-a)-f(a). On the graph, this corresponds to applying an “affine transformation” (which sends straight lines in the plane to other straight lines in the plane) to pull both f(a) and f(b) down to zero. In fact, it’s a straightforward calculation to see that g(a)=0=g(b). Thus Rolle’s theorem applies and we find a point c with g'(c)=0. But applying our laws of differentiation, we see that g'(c)=f'(c)-\frac{f(b)-f(a)}{b-a}. And so f'(c)=\frac{f(b)-f(a)}{b-a}, as desired.
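To see the theorem in action, here is a small numerical sketch. It hunts for a point c where the derivative matches the slope of the secant line, using bisection. The function name mean_value_point, the tolerance, and the test function f(x)=x^3 on \left[0,2\right] are all choices made for this illustration. Bisection also assumes that f' is continuous and that f' minus the secant slope changes sign between the endpoints, which the theorem itself does not require.

```python
import math


def mean_value_point(f, df, a, b, tol=1e-12):
    """Locate c in (a, b) with df(c) equal to the secant slope, by bisection.

    Assumption for this sketch only: df is continuous and df - slope
    changes sign between a and b. The theorem itself needs neither.
    """
    slope = (f(b) - f(a)) / (b - a)

    def h(x):
        # h vanishes exactly where the tangent is parallel to the secant
        return df(x) - slope

    lo, hi = a, b
    if h(lo) * h(hi) > 0:
        raise ValueError("no sign change at the endpoints; bisection does not apply")
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if h(lo) * h(mid) <= 0:
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2, slope


# f(x) = x^3 on [0, 2]: the secant slope is 4, and 3c^2 = 4 gives c = 2/sqrt(3)
c, slope = mean_value_point(lambda x: x**3, lambda x: 3 * x**2, 0.0, 2.0)
print(c, slope, 2 / math.sqrt(3))
```

For this particular f the point c can be found by hand, which makes it a convenient check on the numerical answer.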

January 22, 2008 Posted by John Armstrong | Analysis, Calculus | | 15 Comments

Fermat’s Theorem

Okay, the Heine-Borel theorem tells us that a continuous real-valued function f on a compact space X takes a maximum and a minimum value. In particular, this holds for functions on closed intervals. But how can we recognize a maximum or a minimum when we see one?

First of all, what we get from the Heine-Borel theorem is a global maximum and minimum. That is, a point c\in X so that for any x\in X we have f(c)\geq f(x) (or f(c)\leq f(x)). We also can consider “local” maxima and minima. As you might guess from local connectedness and local compactness, a local maximum (minimum) c is a global maximum (minimum) in some neighborhood U\in\mathcal{N}(c). For example, if f is a function on some region in \mathbb{R} then having a local maximum at c means that there is some interval (a,b) with a<c<b, and for every x\in(a,b) we have f(c)\geq f(x).

So a function may have a number of local maxima and minima, but they’re not all global. Still, finding local maxima and minima is an important first step. In practice there’s only a finite number of them, and we can easily pick out which of them are global by just computing the function. So what do they look like?

For functions on regions in \mathbb{R}, the biggest part of the answer comes from Fermat’s theorem. The theorem itself actually talks about differentiable functions, so the first thing we’ll say is that an extremum may occur at a point where the function is not differentiable (though a point of nondifferentiability is not a sure sign of being an extremum).

Now, let’s say that we have a local maximum at c and that f is differentiable at c. We can set up the difference quotient \frac{f(x)-f(c)}{x-c}. When we take our limit as x goes to c, we can restrict to the neighborhood where c gives a global maximum, so f(x)-f(c)\leq0. To the right of c, x-c>0, so the difference quotient is at most zero here. To the left of c, x-c<0, so the difference quotient is at least zero here. Since the limit exists, it must be the limit from each side, so it is both at most zero and at least zero, and it must be 0. That is, f'(c)=0. And the same thing happens for local minima.

So let’s define a “critical point” of a function to be one where either f isn’t differentiable or f'(c)=0. Then any local extremum must happen at a critical point. But not every critical point is a local extremum. The easiest example is f(x)=x^3, which has derivative f'(x)=3x^2. Then the only critical point is x=0, for which f(x)=0, but any neighborhood of x=0 has both positive and negative values of f(x), so it’s not a local maximum or minimum.
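A quick numerical sketch of both halves of this story, with illustrative choices throughout: the helper difference_quotient, the sample function -(x-1)^2 with its local maximum at 1, and the step sizes are all picked just for the demonstration. It checks the signs of the difference quotients on each side of a local maximum, and then checks that x^3 takes values on both sides of f(0) arbitrarily close to its critical point.

```python
def difference_quotient(f, c, x):
    """The quotient (f(x) - f(c)) / (x - c) whose limit defines f'(c)."""
    return (f(x) - f(c)) / (x - c)


def bump(x):
    # illustrative function with a local (in fact global) maximum at x = 1
    return -(x - 1.0) ** 2


def cube(x):
    return x**3


steps = [10.0 ** (-k) for k in range(1, 6)]

# To the right of the maximum the quotients are <= 0, to the left >= 0
right = [difference_quotient(bump, 1.0, 1.0 + h) for h in steps]
left = [difference_quotient(bump, 1.0, 1.0 - h) for h in steps]
print(right)
print(left)

# x = 0 is a critical point of x^3, but every neighborhood contains
# values both below and above cube(0), so it is not a local extremum
straddles = all(cube(-h) < cube(0.0) < cube(h) for h in steps)
print(straddles)
```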

Geometrically, we should have expected as much as this. Remember that the derivative is the slope of the tangent line. At a local maximum, the function rises to the crest and falls again, and at the top the tangent line balances perfectly level with zero slope. We can see this when we draw the graph, and it provides the intuition behind Fermat’s theorem, but to speak with certainty we need the analytic definitions and the proof of the theorem.

January 21, 2008 Posted by John Armstrong | Analysis, Calculus | | 3 Comments

Sunday Samples 52 (One Day Late)

Sorry, but the Sunday Sample just slipped my mind yesterday.

Fifty-two weeks. I’ve been doing this a year now. Okay, so how about something along those lines?

One of the ways I have to completely date myself is that I went to college right after Rent came out. Since I was into the theater group back in high school (Do you want to do any teaching? Take a drama course.) this was just huge. But for some reason, people just a couple years younger than me completely missed it. When the (terrible) movie adaptation came out, one particularly well-read and culturally-aware friend of mine told me she’d never even heard of the original musical. What’s that noise?

Really, it’s a great show, and the music is a big part of that. If you look for it, though, try to find the original Broadway soundtrack, not the one for the movie. I heard the motion picture soundtrack before the movie itself came out and it should have been my first warning. It cuts a lot of the recitative, hacks up the arrangements, and even screws with the order of the songs in some instances. Just terrible.

Anyhow, why Rent? The musical itself takes place over a single year, mostly contained in the second act, and the entr’acte becomes a repeated theme through this part of the story. Standing apart from the narrative itself, it reflects on those changes that happen when you consider a year all at once: where were you a year ago? and where will you be a year from now? It effectively became the theme song of the musical, and on the Broadway soundtrack there’s even a bonus cover of the song by Stevie Wonder with the cast as backup.

From the 1996 Broadway musical Rent (which is closing its 12-year Broadway run on June 1 so GET YOUR TICKETS NOW BEFORE YOU’RE STUCK WATCHING CHRIS COLUMBUS’ HORRIBLE CINEMATIC ADAPTATION): “Seasons Of Love” (yes, the video is from the movie, but I couldn’t find a video with the original that I really liked).


January 21, 2008 Posted by John Armstrong | Sunday Samples | | No Comments

The Heine-Borel Theorem

We’ve talked about compact subspaces, particularly of compact spaces and Hausdorff spaces (and, of course, compact Hausdorff spaces). So how can we use this to understand the space \mathbb{R} of real numbers, or higher-dimensional versions like \mathbb{R}^n?

First off, \mathbb{R} is Hausdorff, which should be straightforward to prove. Unfortunately, it’s not compact. To see this, consider the open sets of the form (-x,x) for all positive real numbers x. Given any real number y we can find an x with |y|<x, so y\in(-x,x). Therefore the collection of these open intervals covers \mathbb{R}. But if we take any finite number of them, one will be the biggest, and so we must miss some real numbers. This open cover does not have a finite subcover, and \mathbb{R} is not compact. We can similarly show that \mathbb{R}^n is Hausdorff, but not compact.
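The “one will be the biggest” step is easy to make concrete. In this sketch (the name missed_point is made up for the illustration) we hand over any finite list of radii x and get back a real number that none of the intervals (-x,x) contains, namely the largest radius itself.

```python
def missed_point(radii):
    """Given finitely many radii x > 0, return a real number lying
    outside every open interval (-x, x)."""
    if not radii or any(r <= 0 for r in radii):
        raise ValueError("need a nonempty list of positive radii")
    # The union of the intervals is (-max, max), which is open,
    # so the endpoint max(radii) is left out.
    return max(radii)


radii = [1.0, 5.0, 2.5]
y = missed_point(radii)
print(y, [(-r < y < r) for r in radii])
```

No finite list can escape this, which is exactly why the cover has no finite subcover.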

So, since \mathbb{R}^n is Hausdorff, any compact subset of \mathbb{R}^n must be closed. But not every closed subset is compact. What else does compactness imply? Well, we can take the proof that \mathbb{R}^n isn’t compact and adapt it to any subset A\subseteq\mathbb{R}^n. We take the collection of all open “cubes” (-x,x)^n consisting of n-tuples of real numbers, each of which is between -x and x, and we form open subsets of A by the intersections U_x=(-x,x)^n\cap A. Now the only way for there to be a finite subcover of this open cover of A is for there to be some x so that U_x=A. That is, every component of every point of A has absolute value less than x, and so we say that A is “bounded”.

We see now that every compact subset of \mathbb{R}^n is closed and bounded. It turns out that being closed and bounded is not only necessary for compactness, but also sufficient! To see this, we’ll show that the closed cube \left[-x,x\right]^n is compact. Then a closed and bounded set A is contained in some such cube, and a closed subset of a compact space is compact. This is the Heine-Borel theorem.

In the n=1 case, we just need to see that the interval \left[-x,x\right] is compact. Take an open cover \{U_i\} of this interval, and define the set S to consist of all y\in\left[-x,x\right] so that a finite collection of the U_i cover \left[-x,y\right]. Then define t to be the least upper bound of S. Basically, t is as far along the interval as we can get with a finite number of sets, and we’re hoping to show that t=x. Clearly it can’t go past x, since S\subseteq\left[-x,x\right]. But can it be less than x?

In fact it can’t. We can find some open set U from the cover that contains t. As an open neighborhood of t, the set U contains some interval (t-\epsilon,t+\epsilon). Since t is the least upper bound of S, some point of S lies above t-\epsilon, and so there is some finite collection of the U_i which covers \left[-x,t-\epsilon\right]. Adding in U gives a finite collection of the U_i which covers \left[-x,t\right], so t itself is in S. But if t were less than x, then (shrinking \epsilon if necessary to stay inside the interval) the same finite collection would cover \left[-x,t+\frac{\epsilon}{2}\right], and this contradicts the fact that t is the supremum of S. Thus t=x is in S and there is a finite subcover of \left[-x,x\right], making this closed interval compact!
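The idea of pushing t as far along the interval as possible can be sketched in code. A big caveat: a program can only be handed a finite list of open intervals, so this illustrates the bookkeeping of the argument and not compactness itself. The function name finite_subcover and the sample cover are made up for the illustration.

```python
def finite_subcover(cover, lo, hi):
    """Choose intervals from `cover`, a list of open intervals (l, r),
    that together cover the closed interval [lo, hi].

    Raises ValueError if some point of [lo, hi] is not covered.
    """
    chosen = []
    t = lo  # every point of [lo, t) is covered by `chosen`; t is not yet
    while True:
        # open intervals from the cover that contain the frontier point t
        candidates = [(l, r) for (l, r) in cover if l < t < r]
        if not candidates:
            raise ValueError(f"the point {t} is not covered")
        # take the one reaching farthest to the right
        best = max(candidates, key=lambda interval: interval[1])
        chosen.append(best)
        if best[1] > hi:
            return chosen
        # the right endpoint is not in the open interval, so it is
        # the next point that still needs covering
        t = best[1]


cover = [(-1.5, -0.5), (-0.7, 0.2), (-0.6, 0.1), (0.0, 0.8), (0.5, 1.5), (-0.2, 0.3)]
sub = finite_subcover(cover, -1.0, 1.0)
print(sub)
```

Each pass strictly increases t, just as the proof shows that the supremum cannot stall anywhere short of the right endpoint.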

Now Tychonoff’s Theorem tells us that products of closed intervals are also compact. In particular, the closed cube \left[-x,x\right]^n\subseteq\mathbb{R}^n is compact. And since any closed and bounded set is contained in some such cube, it will be compact as a closed subspace of a compact space. Incidentally, since n is finite, we don’t need to wave the Zorn talisman to get this invocation of the Tychonoff magic to work.

As a special case, we can look back at the one-dimensional case to see that a compact, connected subset of \mathbb{R} must be a closed interval \left[a,b\right]. Then we know that the image of a connected space is connected, and that the image of a compact space is compact, so the image of a closed interval under a continuous function f:\mathbb{R}\rightarrow\mathbb{R} is another closed interval.

The fact that this image is an interval gave us the intermediate value theorem. The fact that it’s closed now gives us the extreme value theorem: a continuous, real-valued function f on a closed interval \left[a,b\right] attains a maximum and a minimum. That is, there is some c\in\left[a,b\right] so that f(c)\geq f(x) for all x\in\left[a,b\right], and similarly there is some d\in\left[a,b\right] so that f(d)\leq f(x) for all x\in\left[a,b\right].

January 18, 2008 Posted by John Armstrong | Analysis, Calculus, Point-Set Topology, Topology | | 2 Comments

A sad day for chess

White on white translucent black knights
Back on the rack
Bobby Fischer is dead

The rooks have left the castle
The bishops have all fled
Red velvet lines the time clock
Bobby Fischer is dead
Undead undead undead

The virginal pawns file past his tomb
Strewn with time delays
Bereft in deathly bloom
Alone in a darkened room
The king
Bobby Fischer is dead
Undead undead undead

January 18, 2008 Posted by John Armstrong | Uncategorized | | 14 Comments

Tychonoff’s Theorem

One of the biggest results in point-set topology is Tychonoff’s Theorem: the fact that the product of any family \{X_i\}_{i\in\mathcal{I}} of compact spaces is again compact. Unsurprisingly, the really tough bit comes in when we look at an infinite product. Our approach will use the dual definition of compactness.

Let’s say that a collection \mathcal{F} of closed sets has the finite intersection hypothesis if all finite intersections of members of the collection are nonempty, so compactness says that any collection satisfying the finite intersection hypothesis has nonempty intersection. We can then form the collection \Omega=\{\mathcal{F}\} of all collections of sets satisfying the finite intersection hypothesis. This can be partially ordered by containment — \mathcal{F}'\leq\mathcal{F} if every set in \mathcal{F}' is also in \mathcal{F}.

Given any particular collection \mathcal{F} we can find a maximal collection containing it by finding the longest increasing chain in \Omega starting at \mathcal{F}. Then we simply take the union of all these collections to find the collection at its top. This is almost exactly the same thing as we did back when we showed that every vector space is a free module! And just like then, we need Zorn’s lemma to tell us that we can manage the trick in general, but if we look closely at how we’re going to use it we’ll see that we can get away without Zorn’s lemma for finite products.

Anyhow, this maximal collection \mathcal{F} has two nice properties: it contains all of its own finite intersections, and it contains any set which intersects each set in \mathcal{F}. These are both true because if \mathcal{F} didn’t contain one of these sets we could throw it in, make \mathcal{F} strictly larger, and still satisfy the finite intersection hypothesis.

Now let’s assume that \mathcal{F} is a collection of closed subsets of \prod\limits_{i\in\mathcal{I}}X_i satisfying the finite intersection hypothesis. We can then get a maximal collection \mathcal{G} containing \mathcal{F}. Then given an index i\in\mathcal{I} we can consider the collection \{\overline{\pi_i(G)}\}_{G\in\mathcal{G}} of closed subsets of X_i and see that it, too, satisfies the finite intersection hypothesis. Thus by compactness of X_i the intersection of this collection is nonempty. Pick one of these intersection points x_i, and let U_i be any open set containing it. Since x_i is in the closure of each \pi_i(G), the set U_i meets each \pi_i(G). Thus the preimage \pi_i^{-1}(U_i) meets every G\in\mathcal{G}, and so must itself be in \mathcal{G}.

Okay, so let’s take the point x_i for each index and consider the point p in \prod\limits_{i\in\mathcal{I}}X_i with i-th coordinate x_i. Then pick some set V=\prod\limits_{i\in\mathcal{I}}V_i containing p from the base for the product topology. For all but a finite number of the i, V_i=X_i. For those finite number where it’s smaller, V_i is an open set containing the point x_i\in X_i, and so \pi_i^{-1}(V_i) is in \mathcal{G}. Since \mathcal{G} contains all of its own finite intersections, the intersection of these finitely many preimages is in \mathcal{G} too, and that intersection is V itself!

Now, since V is in \mathcal{G}, it must intersect each of the closed sets in the original collection \mathcal{F}. Since the only constraint on V is that it contain p, this point must be a limit point of each of the sets in \mathcal{F}. And because they’re closed, they must contain all of their limit points. Thus the intersection of all the sets in \mathcal{F} is nonempty, and the product space is compact!

January 17, 2008 Posted by John Armstrong | Point-Set Topology, Topology | | 3 Comments

The Image of a Compact Space

One of the nice things about connectedness is that it’s preserved under continuous maps. It turns out that compactness is the same way — the image of a compact space X under a continuous map f:X\rightarrow Y is compact.

Let’s take an open cover \{U_i\} of the image f(X). Since f is continuous, we can take the preimage of each of these open sets \{f^{-1}(U_i)\} to get a bunch of open sets in X. Clearly every point of X lies in the preimage of some point of f(X), so the f^{-1}(U_i) form an open cover of X. Then we can take a finite subcover by compactness of X, picking out some finite collection of indices. Then looking back at the U_i corresponding to these indices (instead of their preimages) we get a finite subcover of f(X). Thus any open cover of the image has a finite subcover, and the image is compact.

January 16, 2008 Posted by John Armstrong | Point-Set Topology, Topology | | 2 Comments

Some compact subspaces

Let’s say we have a compact space X. A subset C\subseteq X may not be itself compact, but there’s one useful case in which it will be. If C is closed, then C is compact.

Let’s take an open cover \{F_i\}_{i\in\mathcal{I}} of C. The sets F_i are open subsets of C, but they may not be open as subsets of X. But by the definition of the subspace topology, each one must be the intersection of C with an open subset of X. So we can swap each F_i for such an open subset of X, and just say that each F_i is an open subset of X to begin with.

Now, we have one more open set floating around. The complement of C is open, since C is closed! So between the collection \{F_i\} and the extra set X\setminus C we’ve got an open cover of X. By compactness of X, this open cover has a finite subcover. We can throw out X\setminus C from the subcover if it’s in there, and we’re left with a finite open cover of C, and so C is compact.

In fact, if we restrict to Hausdorff spaces, C must be closed to be compact. Indeed, we proved that if C is compact and X is Hausdorff then any point x\in X\setminus C can be separated from C by a neighborhood U\subseteq X\setminus C. Since there is such an open neighborhood, x must be an interior point of X\setminus C. And since x was arbitrary, every point of X\setminus C is an interior point, and so X\setminus C must be open.

Putting these two sides together, we can see that if X is compact Hausdorff, then a subset C\subseteq X is compact exactly when it’s closed.

January 15, 2008 Posted by John Armstrong | Point-Set Topology, Topology | | 1 Comment

Compact Spaces

An amazingly useful property for a space X is that it be “compact”. We define this term by saying that if \{U_i\}_{i\in\mathcal{I}} is any collection of open subsets of X, indexed by any (possibly infinite) set \mathcal{I}, whose union \bigcup\limits_{i\in\mathcal{I}}U_i is the whole of X (the sexy words are “open cover”), then there is some finite subset \mathcal{A}\subseteq\mathcal{I} of the index set so that the union of this finite number of open sets \bigcup\limits_{i\in\mathcal{A}}U_i still contains all of X (the sexy words are “has a finite subcover”).

So why does this matter? Well, let’s consider a Hausdorff space X, a point x\in X, and a finite collection of points A\subseteq X not containing x. Given any point a\in A, we can separate x and a by disjoint open neighborhoods x\in U_a and a\in V_a, precisely because X is Hausdorff. Then we can take the intersection U=\bigcap\limits_{a\in A}U_a and the union V=\bigcup\limits_{a\in A}V_a. The set U is a neighborhood of x, since it’s a finite intersection of neighborhoods, while the set V is a neighborhood of A. These two sets can’t intersect, since any point of V lies in some V_a, which misses U_a and thus misses U. And so we have separated x and A by neighborhoods.
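On the real line the construction can be carried out explicitly. In this sketch (the name separate and the choice of radius |x-a|/2 are assumptions made for the illustration) each U_a and V_a is an open interval of radius |x-a|/2, so the finite intersection of the U_a is just the smallest of them.

```python
def separate(x, points):
    """Separate the real number x from a finite nonempty list of reals
    (none equal to x) by open intervals.

    Returns U, an open interval around x, and V, a list of open
    intervals whose union contains every point of `points`.
    """
    if x in points:
        raise ValueError("x must not belong to the finite set")
    # U_a = (x - d/2, x + d/2) and V_a = (a - d/2, a + d/2), where d = |x - a|
    radius = min(abs(x - a) for a in points) / 2
    U = (x - radius, x + radius)  # the intersection of the finitely many U_a
    V = [(a - abs(x - a) / 2, a + abs(x - a) / 2) for a in points]
    return U, V


U, V = separate(0.0, [1.0, -2.0, 3.0])
print(U, V)
```

With infinitely many points the minimum of the distances would become an infimum, which could be zero, and U would collapse.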

But what if A is an infinite set? Then the infinite intersection \bigcap\limits_{a\in A}U_a may not be a neighborhood of x! Infinite operations sometimes cause problems in topology, but compactness can make them finite. If A is a compact subset of X, then we can proceed as before. For each a\in A we have open neighborhoods x\in U_a and a\in V_a, and so A\subseteq\bigcup\limits_{a\in A}V_a — the open sets V_a form a cover of A. Then compactness tells us that we can pick a finite collection A'\subseteq A so that the union V=\bigcup\limits_{a\in A'}V_a of that finite collection of sets still covers A — we only need a finite number of the V_a to cover A. The finite intersection U=\bigcap\limits_{a\in A'}U_a will then be a neighborhood of x which doesn’t touch V, and so we can separate any point x\in X and any compact set A\subseteq X by neighborhoods.

As an exercise, do the exact same thing again to show that in a Hausdorff space X we can separate any two compact sets A\subseteq X and B\subseteq X by neighborhoods.

In a sense, this shows that while compact spaces may be infinite, they sometimes behave as nicely as finite sets. This can make a lot of things simpler in the long run. And just like we saw for connectivity, we are often interested in things behaving nicely near a point. We thus define a space to be “locally compact” if every point has a neighborhood which is compact (in the subspace topology).

There’s an equivalent definition in terms of closed sets, which is dual to this one. Let’s say we have a collection \{F_i\}_{i\in\mathcal{I}} of closed subsets of X so that the intersection of any finite collection of the F_i is nonempty. Then I assert that the intersection of all of the F_i will be nonempty as well if X is compact. To see this, assume that the intersection is empty:
\bigcap\limits_{i\in\mathcal{I}}F_i=\varnothing
Then the complement of this intersection is all of X. We can rewrite this as the union of the complements of the F_i:
X=\bigcup\limits_{i\in\mathcal{I}}F_i^c
Since we’re assuming X to be compact, we can find some finite subcollection \mathcal{A}\subseteq\mathcal{I} so that
X=\bigcup\limits_{i\in\mathcal{A}}F_i^c
which, taking complements again, implies that
\bigcap\limits_{i\in\mathcal{A}}F_i=\varnothing
but we assumed that all of the finite intersections were nonempty!
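The passage between covers and intersections is just De Morgan’s law, which we can check by brute force on a toy example. The finite set X and the three subsets standing in for the closed sets F_i are made up for the illustration; nothing here involves an actual topology.

```python
from functools import reduce
from itertools import combinations

X = frozenset(range(6))
# hypothetical stand-ins for the closed sets F_i
closed_sets = [frozenset({0, 1, 2}), frozenset({2, 3}), frozenset({4, 5})]


def intersect_all(sets):
    return reduce(frozenset.intersection, sets, X)


def unite_all(sets):
    return reduce(frozenset.union, sets, frozenset())


def duality_holds(sets):
    """The complements cover X exactly when the sets have empty intersection."""
    complements = [X - F for F in sets]
    return (unite_all(complements) == X) == (len(intersect_all(sets)) == 0)


# check every nonempty subfamily of the F_i
subfamilies = [
    list(family)
    for k in range(1, len(closed_sets) + 1)
    for family in combinations(closed_sets, k)
]
print(all(duality_holds(family) for family in subfamilies))
```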

Now turn this around and show that if we assume this “finite intersection property” (whenever all finite intersections of a collection of closed sets F_i are nonempty, the intersection of all the F_i is nonempty) then we can derive the first definition of compactness from it.

January 14, 2008 Posted by John Armstrong | Point-Set Topology, Topology | | 3 Comments

Ambivalence

The adult in me says this is horrible, but the nerdy kid in me really wishes I could use the Metro or the T or the BART or something like that for my model train set.

January 14, 2008 Posted by John Armstrong | Uncategorized | | No Comments