I really wish I could just say $L^p$ in post titles.
Anyway, I want to investigate the continuous dual of $L^p(\mu)$ for $1\leq p<\infty$. That is, we’re excluding the case where $p$ (but not its Hölder conjugate $q$) is infinite. And I say that when $(X,\mathcal{S},\mu)$ is $\sigma$-finite, the space $L^p(\mu)'$ of bounded linear functionals on $L^p(\mu)$ is isomorphic to $L^q(\mu)$.
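First, I’m going to define a linear map $\kappa_p:L^q\to\left(L^p\right)'$. Given a function $g\in L^q$, let $\kappa_p(g)$ be the linear functional defined for any $f\in L^p$ by
$$\left[\kappa_p(g)\right](f)=\int fg\,d\mu$$
It’s clear from the linearity of multiplication and of the integral itself that this is a linear functional on $L^p$. Hölder’s inequality itself shows us that not only does the integral on the right exist, but
$$\left|\left[\kappa_p(g)\right](f)\right|\leq\lVert g\rVert_q\lVert f\rVert_p$$
That is, $\kappa_p(g)$ is a bounded linear functional, and the operator norm $\lVert\kappa_p(g)\rVert_\text{op}$ is at most the $L^q$ norm of $g$. The extremal case of Hölder’s inequality shows that there is some $f$ for which this is an equality (or, when $p=1$, a sequence of functions $f$ that comes arbitrarily close to equality), and thus we conclude that $\lVert\kappa_p(g)\rVert_\text{op}=\lVert g\rVert_q$. That is, $\kappa_p$ is an isometry of normed vector spaces. Such a mapping has to be an injection, because if $\kappa_p(g)=0$ then $0=\lVert\kappa_p(g)\rVert_\text{op}=\lVert g\rVert_q$, which implies that $g=0$.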
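To make the isometry concrete, here is a small numerical sketch. It assumes the simplest possible measure space, a finite set with counting measure, so every integral is a finite sum; the sample vectors and the helper names (`p_norm`, `pairing`, `extremal`) are mine, purely for illustration.

```python
def p_norm(values, p):
    """The L^p norm of a function on a finite set with counting measure."""
    return sum(abs(v) ** p for v in values) ** (1.0 / p)


def pairing(f, g):
    """The functional determined by g applied to f: the integral of f*g is a sum."""
    return sum(a * b for a, b in zip(f, g))


def sign(t):
    """Return -1, 0, or 1 according to the sign of t."""
    return (t > 0) - (t < 0)


def extremal(g, q):
    """A function f with |f|^p = |g|^q and f*g = |g|^q, the equality case of Hölder."""
    return [sign(t) * abs(t) ** (q - 1) for t in g]


p = 3.0
q = p / (p - 1)  # Hölder conjugate exponent: 1/p + 1/q = 1

g = [2.0, -1.0, 0.5, 3.0]    # hypothetical sample data
f = [1.0, 4.0, -2.0, 0.25]   # hypothetical sample data

# Hölder's inequality bounds the pairing by the product of the norms.
bound = p_norm(g, q) * p_norm(f, p)

# For the extremal f, the ratio |pairing| / ||f||_p recovers ||g||_q.
f_star = extremal(g, q)
ratio = pairing(f_star, g) / p_norm(f_star, p)
```

Here `abs(pairing(f, g))` stays below `bound` for the arbitrary `f`, while `ratio` agrees with `p_norm(g, q)` up to rounding, which is the finite-dimensional shadow of the statement that the operator norm equals the $L^q$ norm.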
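Now I say that $\kappa_p$ is also a surjection. That is, any bounded linear functional $\Lambda:L^p\to\mathbb{R}$ is of the form $\kappa_p(g)$ for some $g\in L^q$. Indeed, if $\Lambda=0$ then we can just pick $g=0$ as a preimage. Thus we may assume that $\Lambda$ is a nonzero bounded linear functional on $L^p$, and $\lVert\Lambda\rVert_\text{op}>0$. We first deal with the case of a totally finite measure space.
In this case, we define a set function on measurable sets by $\lambda(E)=\Lambda(\chi_E)$. It’s straightforward to see that $\lambda$ is additive. To prove countable additivity, suppose that $E$ is the countable disjoint union of a sequence $\{E_n\}$. If we write $A_k$ for the union of $E_1$ through $E_k$, we find that
$$\lVert\chi_E-\chi_{A_k}\rVert_p=\left(\mu(E\setminus A_k)\right)^\frac{1}{p}\to0$$
Since $\Lambda$ is continuous, we conclude that $\lambda(A_k)\to\lambda(E)$, and thus that $\lambda$ is a (signed) measure. It should also be clear that $\mu(E)=0$ implies $\lambda(E)=0$, since then $\chi_E$ is zero as an element of $L^p$, and so $\lambda\ll\mu$. The Radon-Nikodym theorem now tells us that there exists an integrable function $g$ so that
$$\Lambda(\chi_E)=\lambda(E)=\int_Eg\,d\mu=\int\chi_Eg\,d\mu$$
Linearity tells us that
$$\Lambda(f)=\int fg\,d\mu$$
for simple functions $f$, and also for every $f\in L^\infty(\mu)$, since each such function is the uniform limit of simple functions. We want to show that $g\in L^q$.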
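If $p=1$, then we must show that $g$ is essentially bounded. In this case, we find
$$\left|\int_Eg\,d\mu\right|=\left|\Lambda(\chi_E)\right|\leq\lVert\Lambda\rVert_\text{op}\lVert\chi_E\rVert_1=\lVert\Lambda\rVert_\text{op}\,\mu(E)$$
for every measurable $E$, from which we conclude that $|g(x)|\leq\lVert\Lambda\rVert_\text{op}$ a.e., or else we could find some set on which this inequality was violated. Thus $\lVert g\rVert_\infty\leq\lVert\Lambda\rVert_\text{op}$.
For other $p$, we can find a measurable $\alpha$ with $|\alpha|=1$ so that $\alpha g=|g|$. Setting $E_n=\{x\,|\,|g(x)|\leq n\}$ and defining $f=\chi_{E_n}|g|^{q-1}\alpha$, we find that $|f|^p=|g|^q$ on $E_n$ (because $(q-1)p=q$), $f\in L^\infty$, and so
$$\int_{E_n}|g|^q\,d\mu=\int fg\,d\mu=\Lambda(f)\leq\lVert\Lambda\rVert_\text{op}\lVert f\rVert_p=\lVert\Lambda\rVert_\text{op}\left(\int_{E_n}|g|^q\,d\mu\right)^\frac{1}{p}$$
Dividing through by the last factor and raising both sides to the $q$th power, we thus find
$$\int\chi_{E_n}|g|^q\,d\mu\leq\lVert\Lambda\rVert_\text{op}^q$$
Applying the monotone convergence theorem as $n\to\infty$ we find that $\lVert g\rVert_q\leq\lVert\Lambda\rVert_\text{op}$.
Thus in either case we’ve found a $g\in L^q$ so that $\Lambda(f)=\int fg\,d\mu$ for every $f\in L^\infty$. Both sides are continuous in $f$ with respect to the $L^p$ norm, and $L^\infty$ is dense in $L^p$ on a totally finite measure space, so the same formula holds on all of $L^p$. That is, $\Lambda=\kappa_p(g)$.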
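In the $\sigma$-finite case, we can write $X$ as the countable disjoint union of sets $X_i$ with $\mu(X_i)<\infty$. We let $Y_k$ be the union of the first $k$ of these sets. We note that $\lVert\chi_Ef\rVert_p\leq\lVert f\rVert_p$ for every measurable set $E$, so $f\mapsto\Lambda(\chi_Ef)$ is a linear functional on $L^p$ of norm at most $\lVert\Lambda\rVert_\text{op}$. The finite case above shows us that there are functions $g_i$ on $X_i$ so that
$$\Lambda(\chi_{X_i}f)=\int_{X_i}fg_i\,d\mu$$
We can define $g_i(x)=0$ if $x\notin X_i$, and let $g$ be the sum of all these $g_i$. We see that
$$\Lambda(\chi_{Y_k}f)=\int_{Y_k}f(g_1+\dots+g_k)\,d\mu$$
for every $f\in L^p$, and since $\mu(Y_k)<\infty$ we find that $\lVert g_1+\dots+g_k\rVert_q\leq\lVert\Lambda\rVert_\text{op}$. Then Fatou’s lemma shows us that $\lVert g\rVert_q\leq\lVert\Lambda\rVert_\text{op}$. Finally, $\chi_{Y_k}f\to f$ in $L^p$ as $k\to\infty$, so the continuity of $\Lambda$ on the left and the dominated convergence theorem on the right give $\Lambda(f)=\int fg\,d\mu$. Thus the $\sigma$-finite case is true as well.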
One case in particular is especially worthy of note: since $2$ is Hölder-conjugate to itself, we find that $L^2$ is isomorphic to its own continuous dual space in the same way that a finite-dimensional inner-product space is isomorphic to its own dual space.
In the context of normed vector spaces we have a topology on our spaces and so it makes sense to ask that maps between them be continuous. In the finite-dimensional case, all linear functions are continuous, so this hasn’t really come up before in our study of linear algebra. But for functional analysis, it becomes much more important.
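Now, really we only need to require continuity at one point (the origin, to be specific) because if a linear map $M:V\to W$ is continuous there then it’ll be continuous everywhere. Indeed, continuity at $v'$ means that for any $\epsilon>0$ there is a $\delta>0$ so that $\lVert v-v'\rVert<\delta$ implies $\lVert M(v)-M(v')\rVert<\epsilon$. In particular, if $v'=0$, then this means $\lVert v\rVert<\delta$ implies $\lVert M(v)\rVert<\epsilon$. Clearly if this holds, then the general version also holds, since linearity gives $M(v)-M(v')=M(v-v')$.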
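But it turns out that there’s another equivalent condition. We say that a linear transformation $M:V\to W$ is “bounded” if there is some $c>0$ such that $\lVert M(v)\rVert_W\leq c\lVert v\rVert_V$ for all $v\in V$. That is, the factor by which $M$ stretches the length of a vector is bounded. By linearity, we only really need to check this on the unit sphere $\lVert v\rVert=1$, but it’s often just as easy to test it everywhere.
Anyway, I say that a linear transformation is continuous if and only if it’s bounded. Indeed, if $M$ is bounded, then we find
$$\lVert M(v)-M(v')\rVert=\lVert M(v-v')\rVert\leq c\lVert v-v'\rVert$$
so as we let $v'$ approach $v$ (that is, as $\lVert v-v'\rVert$ approaches $0$) the difference between $M(v)$ and $M(v')$ approaches zero as well. And so $M$ is continuous.
Conversely, if $M$ is continuous, then it is bounded. Since it’s continuous at the origin, we let $\epsilon=1$ and find a $\delta>0$ so that $\lVert M(v)\rVert\leq1$ for all vectors $v$ with $\lVert v\rVert\leq\delta$ (shrinking $\delta$ slightly if necessary). Thus for all nonzero $v\in V$ we find
$$\lVert M(v)\rVert=\left\lVert\frac{\lVert v\rVert}{\delta}M\left(\delta\frac{v}{\lVert v\rVert}\right)\right\rVert=\frac{\lVert v\rVert}{\delta}\left\lVert M\left(\delta\frac{v}{\lVert v\rVert}\right)\right\rVert\leq\frac{1}{\delta}\lVert v\rVert$$
Thus we can use $c=\frac{1}{\delta}$ and conclude that $M$ is bounded.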
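The least such $c$ that works in the condition for $M$ to be bounded is called the “operator norm” of $M$, which we write as $\lVert M\rVert_\text{op}$. It’s straightforward to verify that $\lVert cM\rVert_\text{op}=|c|\lVert M\rVert_\text{op}$, and that $\lVert M\rVert_\text{op}=0$ if and only if $M$ is the zero operator. It remains to verify the triangle inequality.
Let’s say that we have bounded linear transformations $M_1:V\to W$ and $M_2:V\to W$ with operator norms $c_1$ and $c_2$, respectively. We will show that $c_1+c_2$ works as a bound for $M_1+M_2$, and thus conclude that $\lVert M_1+M_2\rVert_\text{op}\leq\lVert M_1\rVert_\text{op}+\lVert M_2\rVert_\text{op}$. Indeed, we check that
$$\lVert[M_1+M_2](v)\rVert_W\leq\lVert M_1(v)\rVert_W+\lVert M_2(v)\rVert_W\leq c_1\lVert v\rVert_V+c_2\lVert v\rVert_V=(c_1+c_2)\lVert v\rVert_V$$
and our assertion follows. In particular, when the target space is the base field $\mathbb{F}$ itself, regarded as a normed linear space (like $\mathbb{R}$ or $\mathbb{C}$), we can conclude that the “continuous dual space” $V'$ consisting of bounded linear functionals $\lambda:V\to\mathbb{F}$ is a normed linear space using the operator norm on $V'$.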
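As a sanity check on the operator norm, here is a finite-dimensional sketch. It assumes we put the max norm on the plane, in which case the operator norm of a matrix is known to be its largest absolute row sum; the matrices and test vectors are hypothetical sample data, and the helper names are mine.

```python
def max_norm(v):
    """The max norm of a vector: its largest absolute coordinate."""
    return max(abs(t) for t in v)


def apply(A, v):
    """Apply the matrix A (a list of rows) to the vector v."""
    return [sum(a * t for a, t in zip(row, v)) for row in A]


def op_norm(A):
    """Operator norm with respect to the max norm: the largest absolute row sum."""
    return max(sum(abs(a) for a in row) for row in A)


def add(A, B):
    """Entrywise sum of two matrices of the same shape."""
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


A = [[1.0, -2.0], [0.5, 3.0]]
B = [[-1.0, 1.0], [2.0, -4.0]]
vectors = [[1.0, 1.0], [1.0, -1.0], [-0.3, 0.7], [2.0, 5.0]]

# The bound ||A v|| <= ||A||_op ||v|| holds for every sample vector.
bounded = all(
    max_norm(apply(A, v)) <= op_norm(A) * max_norm(v) + 1e-12 for v in vectors
)

# The triangle inequality for the operator norm.
triangle = op_norm(add(A, B)) <= op_norm(A) + op_norm(B)
```

With these numbers the bound for `A` is actually attained at the vector `[1.0, 1.0]`, which is why no smaller constant would work.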
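To complete what we were saying about the $L^p$ spaces, we need to show that they’re complete. As it turns out, we can adapt the proof that mean convergence is complete, but we will take a somewhat different approach. It suffices to show that for any sequence of functions $\{u_n\}$ in $L^p$ so that the series of $L^p$-norms converges
$$\sum_{n=1}^\infty\lVert u_n\rVert_p<\infty$$
the series of functions converges to some function $f\in L^p$.
For finite $p$, Minkowski’s inequality allows us to conclude that for every $N$
$$\left\lVert\sum_{n=1}^N|u_n|\right\rVert_p\leq\sum_{n=1}^N\lVert u_n\rVert_p\leq\sum_{n=1}^\infty\lVert u_n\rVert_p<\infty$$
The monotone convergence theorem now tells us that $\sum|u_n|$ has a finite $L^p$ norm, so the series converges absolutely a.e., and the limiting function
$$f=\sum_{n=1}^\infty u_n$$
is defined a.e., and that $f\in L^p$. The dominated convergence theorem can now verify that the partial sums of the series are $L^p$-convergent to $f$:
$$\lim_{N\to\infty}\int\left|f-\sum_{n=1}^Nu_n\right|^p\,d\mu=0$$
In the case $p=\infty$, we can write $c_n=\lVert u_n\rVert_\infty$. Then $|u_n(x)|\leq c_n$ except on some set $E_n$ of measure zero. The union of all the $E_n$ must also be negligible, and so we can throw it all out and just have $|u_n(x)|\leq c_n$. Now the series of the $c_n$ converges by assumption, and thus the series of the $u_n$ must converge uniformly to some function bounded by the sum of the $c_n$ (except on the union of the $E_n$).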
Don’t worry about that little $1$ dangling off of the norm $\lVert f\rVert_1=\int|f|\,d\mu$, or why we call this the “$L^1$ norm”. That will become clear later when we generalize.
We can easily verify that $\lVert cf\rVert_1=|c|\lVert f\rVert_1$ and that $\lVert f+g\rVert_1\leq\lVert f\rVert_1+\lVert g\rVert_1$, using our properties of integrals. The catch is that $\lVert f\rVert_1=0$ doesn’t imply that $f$ is identically zero, but only that $f=0$ almost everywhere. But really throughout our treatment of integration we’re considering two functions that are equal a.e. to be equivalent, and so this isn’t really a problem: $\lVert f\rVert_1=0$ implies that $f$ is equivalent to the constant zero function for our purposes.
Of course, a norm gives rise to a metric:
$$\rho(f,g)=\lVert g-f\rVert_1=\int|g-f|\,d\mu$$
and this gives us a topology on the space of integrable simple functions. And with a topology comes a notion of convergence!
We say that a sequence $\{f_n\}$ of integrable functions is “Cauchy in the mean” or is “mean Cauchy” if $\lVert f_m-f_n\rVert_1\to0$ as $m$ and $n$ get arbitrarily large. We won’t talk quite yet about convergence because our situation is sort of like the one with rational numbers; we have a sense of when functions are getting close to each other, but most of these mean Cauchy sequences actually don’t converge within our space. That is, the normed vector space is not a Banach space.
However we can say some things about this notion of convergence. For one, a sequence that is Cauchy in the mean is Cauchy in measure as well. Indeed, for any $\epsilon>0$ we can define the sets
$$E_{mn}=\{x\in X\,|\,|f_m(x)-f_n(x)|\geq\epsilon\}$$
And then we find that
$$\lVert f_m-f_n\rVert_1=\int|f_m-f_n|\,d\mu\geq\int_{E_{mn}}|f_m-f_n|\,d\mu\geq\epsilon\mu(E_{mn})$$
As $m$ and $n$ get arbitrarily large, the fact that the sequence is mean Cauchy tells us that the left hand side of this inequality gets pushed down to zero, and so the right hand side must as well.
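The inequality behind this argument is easy to test by hand. The sketch below assumes a toy measure space with five points, each carrying a weight; the weights and the two functions are hypothetical sample data, and the helper names are mine.

```python
# mu({x_i}) for five points, and two functions given by their values there.
weights = [0.5, 0.25, 1.0, 0.25, 2.0]
f_m = [1.0, 0.0, 2.0, -1.0, 0.5]
f_n = [0.9, 0.4, 2.0, -1.5, 0.45]


def l1_distance(f, g, mu):
    """The L^1 distance: the integral of |f - g| against the weights."""
    return sum(abs(a - b) * w for a, b, w in zip(f, g, mu))


def measure_where_far(f, g, mu, eps):
    """The measure of the set where |f - g| >= eps."""
    return sum(w for a, b, w in zip(f, g, mu) if abs(a - b) >= eps)


distance = l1_distance(f_m, f_n, weights)

# For each eps, compare eps * mu(E) with the L^1 distance.
checks = [
    eps * measure_where_far(f_m, f_n, weights, eps) <= distance
    for eps in (0.02, 0.3, 0.45)
]
```

Every entry of `checks` comes out true: the measure of the set where the two functions differ by at least $\epsilon$ is at most the $L^1$ distance divided by $\epsilon$, so driving the distance to zero forces that measure to zero as well.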
This notion of convergence will play a major role in our study of integration.
Given a sequence $\{f_n\}$ of extended real-valued functions on a measure space $(X,\mathcal{S},\mu)$, we say that $f_n$ converges a.e. to the function $f$ if there is a set $E_0\subseteq X$ with $\mu(E_0)=0$ so that $\lim\limits_{n\to\infty}f_n(x)=f(x)$ for all $x\in X\setminus E_0$. Similarly, we say that the sequence is Cauchy a.e. if there exists a set $E_0$ of measure zero so that $\{f_n(x)\}$ is a Cauchy sequence of real numbers for all $x\in X\setminus E_0$. That is, given $x\in X\setminus E_0$ and $\epsilon>0$ there is some natural number $N$ depending on $x$ and $\epsilon$ so that whenever $m,n\geq N$ we have
$$|f_m(x)-f_n(x)|<\epsilon$$
Because the real numbers form a complete metric space, being Cauchy and being convergent are equivalent (a sequence of finite real numbers is convergent if and only if it is Cauchy), and a similar thing happens here. If a sequence of finite-valued functions $\{f_n\}$ is convergent a.e., then $\{f_n(x)\}$ converges to $f(x)$ away from a set of measure zero. Each of these sequences $\{f_n(x)\}$ is thus Cauchy, and so $\{f_n\}$ is Cauchy almost everywhere. On the other hand, if $\{f_n\}$ is Cauchy a.e. then the sequences $\{f_n(x)\}$ are Cauchy away from a set of measure zero, and these sequences then converge.
We can also define what it means for a sequence of functions $\{f_n\}$ to converge uniformly almost everywhere to $f$. That is, there is some set $E_0$ of measure zero so that for every $\epsilon>0$ we can find a natural number $N$ so that for all $n\geq N$ and $x\notin E_0$ we have $|f_n(x)-f(x)|<\epsilon$. The uniformity means that $N$ is independent of $x$, but if we choose a different negligible $E_0$ we may have to choose different values of $N$ to get the desired control on the sequence.
As it happens, the topology defined by uniform a.e. convergence comes from a norm: the essential supremum; using this notion of convergence makes the algebra of essentially bounded measurable functions on a measure space into a normed vector space. Indeed, we can check what it means for a sequence of functions $\{f_n\}$ to converge to $f$ under the essential supremum norm: for any $\epsilon>0$ there is some $N$ so that for all $n\geq N$ we have $\operatorname{ess\,sup}\left(|f_n-f|\right)<\epsilon$. Unpacking the definition of the essential supremum, this means that there is some measurable set $E_0$ with measure zero so that $|f_n(x)-f(x)|<\epsilon$ for all $x\notin E_0$ (the countably many exceptional sets for the various $n$ and for $\epsilon=\frac{1}{k}$ can be collected into a single negligible set), which is exactly what we said for uniform a.e. convergence above.
We can also turn around and define what it means for a sequence to be uniformly Cauchy almost everywhere: for any $\epsilon>0$ there is some $N$ so that for all $m,n\geq N$ we have $\operatorname{ess\,sup}\left(|f_m-f_n|\right)<\epsilon$. Unpacking again, there is some measurable set $E_0$ of measure zero so that $|f_m(x)-f_n(x)|<\epsilon$ for all $x\notin E_0$. It’s straightforward to check that a sequence that converges uniformly a.e. is uniformly Cauchy a.e., and vice versa. That is, the topology defined by the essential supremum norm is complete, and the algebra of essentially bounded measurable functions on a measure space is a Banach space.
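A toy computation may help separate the essential supremum from the plain supremum. The sketch assumes a three-point measure space in which the middle point has measure zero, so that the null sets are exactly the subsets of that point; the weights, the sequence `f`, and the helper names are all hypothetical.

```python
def sup_norm(values):
    """The plain supremum of |f| over every point."""
    return max(abs(v) for v in values)


def ess_sup(values, mu):
    """The essential supremum: the largest |f| over points of positive measure."""
    return max(abs(v) for v, w in zip(values, mu) if w > 0)


mu = [1.0, 0.0, 2.0]  # the middle point is a set of measure zero


def f(n):
    """The n-th function: tiny off the null point, huge on it."""
    return [1.0 / n, float(n), -1.0 / n]


ess_sups = [ess_sup(f(n), mu) for n in (1, 10, 100)]
sups = [sup_norm(f(n)) for n in (1, 10, 100)]
```

The entries of `ess_sups` shrink like $\frac{1}{n}$ while those of `sups` grow like $n$, so this sequence converges to zero uniformly a.e. even though it does not converge uniformly.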
Before we move on, we want to define some structures that blend algebraic and topological notions. These are all based on vector spaces. And, particularly, we care about infinite-dimensional vector spaces. Finite-dimensional vector spaces are actually pretty simple, topologically. For pretty much all purposes you have a topology on your base field $\mathbb{F}$, and the vector space (which is isomorphic to $\mathbb{F}^n$ for some $n$) will get the product topology.
But for infinite-dimensional spaces the product topology is often not going to be particularly useful. For example, the space of functions $f:X\to\mathbb{R}$ is a product; we write $\mathbb{R}^X$ to mean the product of one copy of $\mathbb{R}$ for each point in $X$. Limits in this topology are “pointwise” limits of functions, but this isn’t always the most useful way to think about limits of functions. For example, the sequence
$$f_n=\chi_{\left[0,\frac{1}{n}\right]}$$
converges pointwise to a function with $f(x)=0$ for $x\neq0$ and $f(0)=1$. But we will find it useful to be able to ignore this behavior at the one isolated point and say that $f_n\to0$. It’s this connection with spaces of functions that brings such infinite-dimensional topological vector spaces into the realm of “functional analysis”.
Okay, so to get a topological vector space, we take a vector space and put a (surprise!) topology on it. But not just any topology will do: Remember that every point in a vector space looks pretty much like every other one. The transformation $u\mapsto u+v$ has an inverse $u\mapsto u-v$, and it only makes sense that these be homeomorphisms. And to capture this, we put a uniform structure on our space. That is, we specify what the neighborhoods are of $0$, and just translate them around to all the other points.
Now, a common way to come up with such a uniform structure is to define a norm on our vector space. That is, to define a function $v\mapsto\lVert v\rVert$ satisfying the three axioms
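- For all vectors $v$ and scalars $c$, we have $\lVert cv\rVert=|c|\lVert v\rVert$.
- For all vectors $v$ and $w$, we have $\lVert v+w\rVert\leq\lVert v\rVert+\lVert w\rVert$.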
- The norm is zero if and only if the vector is the zero vector.
Notice that we need to be working over a field in which we have a notion of absolute value, so we can measure the size of scalars. We might also want to do away with the last condition and use a “seminorm”. In any event, it’s important to note that though our earlier examples of norms all came from inner products we do not need an inner product to have a norm. In fact, there exist norms that come from no inner product at all.
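One standard way to see that a norm comes from no inner product is to test the parallelogram law, which any norm induced by an inner product must satisfy. The sketch below compares the Euclidean norm with the max norm on the plane; the choice of test vectors and the helper names are mine.

```python
import math


def euclidean(v):
    """The norm induced by the usual dot product."""
    return math.sqrt(sum(t * t for t in v))


def max_norm(v):
    """The max norm: the largest absolute coordinate."""
    return max(abs(t) for t in v)


def parallelogram_gap(norm, v, w):
    """||v+w||^2 + ||v-w||^2 - 2(||v||^2 + ||w||^2); zero for inner-product norms."""
    plus = [a + b for a, b in zip(v, w)]
    minus = [a - b for a, b in zip(v, w)]
    return norm(plus) ** 2 + norm(minus) ** 2 - 2 * (norm(v) ** 2 + norm(w) ** 2)


v = [1.0, 0.0]
w = [0.0, 1.0]

gap_euclidean = parallelogram_gap(euclidean, v, w)  # zero up to rounding
gap_max = parallelogram_gap(max_norm, v, w)         # 1 + 1 - 2 * (1 + 1) = -2
```

The gap vanishes for the Euclidean norm but not for the max norm, so the max norm satisfies all three axioms without being induced by any inner product.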
So if we define a norm we get a “normed vector space”. This is a metric space, with a metric function defined by $d(v,w)=\lVert v-w\rVert$. This is nice because metric spaces are first-countable, and thus sequential. That is, we can define the topology of a (semi-)normed vector space by defining exactly what it means for a sequence of vectors to converge, and in particular what it means for them to converge to zero.
Finally, if we’ve got a normed vector space, it’s a natural question to ask whether or not this vector space is complete. That is, we have all the pieces in place to define Cauchy sequences in our vector space, and we would like for all of these sequences to converge under our uniform structure. If this happens (if we have a complete normed vector space) we call our structure a “Banach space”. Most of the spaces we’re concerned with in functional analysis are Banach spaces.
Again, for finite-dimensional vector spaces (at least over $\mathbb{R}$ or $\mathbb{C}$) this is all pretty easy; we can always define an inner product, and this gives us a norm. If our underlying topological field is complete, then the vector space will be as well. Even without considering a norm, convergence of sequences is just given component-by-component. But infinite-dimensional vector spaces get hairier. Since our algebraic operations only give us finite sums, we have to take some sorts of limits to even talk about most vectors in the space in the first place, and taking limits of such vectors could just complicate things further. Studying these interesting topologies and seeing how linear algebra (the study of vector spaces and linear transformations) behaves in the infinite-dimensional context is the taproot of functional analysis.
So, what’s so great right now about uniform convergence?
As we’ve said before, when we evaluate a power series we get a regular series at each point, which may or may not converge. If we restrict to those points where it converges, we get a function. That is, the series of functions converges pointwise to a limiting function. What’s great is that for any compact set contained within the disk of convergence of the series, this convergence is uniform!
To be specific, take a power series $\sum\limits_{n=0}^\infty c_nz^n$ which converges for $|z|<R$, and let $T$ be a compact subset of the disk of radius $R$. Now the function $|z|$ is a continuous, real-valued function on $T$, and the image of a compact space is compact, so $|z|$ takes some maximum value on $T$.
That is, there is some point $p\in T$ so that for every point $z\in T$ we have $|z|\leq|p|<R$. And thus we have $|c_nz^n|\leq|c_np^n|$ for all $z\in T$. Setting $M_n=|c_np^n|$, we invoke the Weierstrass M-test: the series $\sum M_n$ converges because $p$ is within the disk of convergence, and thus evaluation at $p$ converges absolutely.
Now every point within the disk of convergence lies in the interior of some compact set (closed disks are compact, so pick a closed disk around the point with radius less than the distance from the point to the boundary of the disk of convergence), within which the convergence is uniform. Since each term is continuous, the uniform limit will also be continuous at the point in question. Thus inside the radius of convergence a power series evaluates to a continuous function.
This gives us our first hint as to what can block a power series. As an explicit example, consider the geometric series $\sum\limits_{n=0}^\infty z^n$, which converges for $|z|<1$ to the function $\frac{1}{1-z}$. This function is clearly discontinuous at $z=1$, and so the power series can’t converge in any disk containing that point, since if it did it would have to be continuous there. And indeed, we can calculate the radius of convergence to be exactly $1$.
It’s important to note something in this example. For $|z|<1$, we have $\sum\limits_{n=0}^\infty z^n=\frac{1}{1-z}$, but these two functions are definitely not equal outside that region. Indeed, at $z=2$ the function clearly has the value $-1$, while the geometric series diverges wildly. The equality only holds within the radius of convergence.
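We can watch this happen numerically. The sketch below compares partial sums of the geometric series with the closed form $\frac{1}{1-z}$ at one point inside the unit disk and one point outside it; the sample points $z=0.5$ and $z=2$ and the helper name `partial_sum` are my choices for illustration.

```python
def partial_sum(z, n):
    """The partial sum 1 + z + z^2 + ... + z^n of the geometric series."""
    return sum(z ** k for k in range(n + 1))


def closed_form(z):
    """The function 1 / (1 - z), which is defined everywhere except z = 1."""
    return 1.0 / (1.0 - z)


cutoffs = (5, 10, 20, 40)

# Inside the disk of convergence the partial sums settle down to the closed form.
inside = [partial_sum(0.5, n) for n in cutoffs]

# Outside it the partial sums run off to infinity, although the closed form is finite.
outside = [partial_sum(2.0, n) for n in cutoffs]
```

The list `inside` closes in on `closed_form(0.5)`, which is $2$, while `outside` grows without bound even though `closed_form(2.0)` is a perfectly respectable $-1$.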
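Since series of anything are special cases of sequences, we can import our notions to series. We say that a series $\sum\limits_{n=0}^\infty f_n$ converges uniformly to a sum $s$ if the sequence of partial sums $s_n=\sum\limits_{k=0}^nf_k$ converges uniformly to $s$. That is, if for every $\epsilon>0$ there is an $N$ so that $n>N$ implies that $\left|s(x)-\sum\limits_{k=0}^nf_k(x)\right|<\epsilon$ for all $x$ in the domain under consideration.
And we’ve got Cauchy’s condition: a series converges uniformly if for every $\epsilon>0$ there is an $N$ so that $m$ and $n$ both greater than $N$ (with $m\leq n$) implies that $\left|\sum\limits_{k=m}^nf_k(x)\right|<\epsilon$ for all $x$ in the domain.
Here’s a great way to put this to good use: the Weierstrass M-test, which is sort of like the comparison test. Say that we have a positive bound for the size of each term in the series: $|f_n(x)|\leq M_n$ for all $x$ in the domain. And further assume that the series $\sum\limits_{n=0}^\infty M_n$ converges. Then the series $\sum\limits_{n=0}^\infty f_n$ must converge uniformly.
Since the series of the $M_n$ converges, Cauchy’s condition for series of numbers tells us that for every $\epsilon>0$ there is some $N$ so that when $m$ and $n$ are bigger than $N$, $\sum\limits_{k=m}^nM_k<\epsilon$. But now when we consider $\left|\sum\limits_{k=m}^nf_k(x)\right|$ we note that it’s just a finite sum, and so we can use the triangle inequality to write
$$\left|\sum\limits_{k=m}^nf_k(x)\right|\leq\sum\limits_{k=m}^n|f_k(x)|\leq\sum\limits_{k=m}^nM_k<\epsilon$$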
So Cauchy’s condition tells us that the series converges uniformly in the domain under consideration.
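Here is a small numerical sketch of the M-test at work. It assumes the specific series with terms $\frac{\sin(kx)}{k^2}$, which is not taken from the discussion above, bounded termwise by $M_k=\frac{1}{k^2}$; the grid of sample points and the helper names are likewise mine.

```python
import math


def term(k, x):
    """The k-th term sin(kx) / k^2 of the sample series."""
    return math.sin(k * x) / k ** 2


def block_sum(m, n, x):
    """The block of terms from index m through index n, evaluated at x."""
    return sum(term(k, x) for k in range(m, n + 1))


def block_bound(m, n):
    """The matching block of the dominating series: the sum of 1 / k^2."""
    return sum(1.0 / k ** 2 for k in range(m, n + 1))


# Sample points spread over the interval [-5, 5].
grid = [i * 0.1 for i in range(-50, 51)]

# The worst block of terms over the whole grid, against one bound independent of x.
worst = max(abs(block_sum(20, 200, x)) for x in grid)
bound = block_bound(20, 200)
```

A single number `bound`, which does not depend on the point at all, controls the block of terms at every grid point at once. That independence from the point is exactly what makes the convergence uniform.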
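Specifically, a sequence $f_n$ converges uniformly to a function $f$ if and only if for every $\epsilon>0$ there exists an $N$ so that $m>N$ and $n>N$ imply that $|f_m(x)-f_n(x)|<\epsilon$ for every $x$ in the domain.
One direction is straightforward. Assume that $f_n$ converges uniformly to $f$. Given $\epsilon>0$ we can pick $N$ so that $n>N$ implies that $|f_n(x)-f(x)|<\frac{\epsilon}{2}$ for all $x$. Then if $m>N$ and $n>N$ we have
$$|f_m(x)-f_n(x)|\leq|f_m(x)-f(x)|+|f(x)-f_n(x)|<\frac{\epsilon}{2}+\frac{\epsilon}{2}=\epsilon$$
In the other direction, if the Cauchy condition holds for the sequence of functions, then the Cauchy condition holds for the sequence of numbers we get by evaluating at each point $x$. So at least we know that the sequence of functions must converge pointwise. We set $f$ to be this limit, and we’re left to show that the convergence is uniform.
Given an $\epsilon>0$ the Cauchy condition tells us that we have an $N$ so that $n>N$ implies that $|f_n(x)-f_{n+p}(x)|<\frac{\epsilon}{2}$ for every natural number $p$ and every $x$. Then taking the limit over $p$ we find
$$|f_n(x)-f(x)|=\lim\limits_{p\to\infty}|f_n(x)-f_{n+p}(x)|\leq\frac{\epsilon}{2}<\epsilon$$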
Thus the convergence is uniform.
Today we’ll give the answer to the problem of pointwise convergence. It’s analogous to the notion of uniform continuity in a metric space. In that case we noted that things became nicer if we could choose our $\delta$ the same for every point, and something like that will happen here.
To reiterate: we say that a sequence $f_n$ converges pointwise to a function $f$ if for every $x$, and for every $\epsilon>0$, there is an $N$ so that $n>N$ implies that $|f_n(x)-f(x)|<\epsilon$. Just like we did for uniform continuity we’re going to move around the quantifiers so that $N$ can depend only on $\epsilon$, not on $x$.
We say that a sequence of functions $f_n$ converges uniformly to a function $f$ if for every $\epsilon>0$ there is an $N$ so that for every $x$, $n>N$ implies that $|f_n(x)-f(x)|<\epsilon$. In pointwise convergence, the value at each point does converge to the value of the limiting function, but the rates can vary widely enough to make it impossible to control convergence at two different parts of the domain simultaneously. But in uniform convergence we have “uniform” control of the convergence over the entire domain.
So let’s see how we can use this to show that the limiting function $f$ is continuous if each function $f_n$ in the sequence is. Uniform convergence tells us that for every $\epsilon>0$ there is an $N$ so that $n>N$ implies that $|f_n(x)-f(x)|<\frac{\epsilon}{3}$ for every $x$. Fix one such $n$. Since $f_n$ is continuous at $x_0$ there is some $\delta>0$ so that $|x-x_0|<\delta$ implies that $|f_n(x)-f_n(x_0)|<\frac{\epsilon}{3}$.
And now we can use this to show the continuity of $f$. For if $|x-x_0|<\delta$, we find
$$|f(x)-f(x_0)|\leq|f(x)-f_n(x)|+|f_n(x)-f_n(x_0)|+|f_n(x_0)-f(x_0)|<\frac{\epsilon}{3}+\frac{\epsilon}{3}+\frac{\epsilon}{3}=\epsilon$$
The essential point here is that we were able to keep control of the convergence of the sequence both at the point of interest $x_0$, and at all points in the $\delta$-wide neighborhood.
Uniform convergence isn’t the only way to be assured of continuity in the limit, but it’s surely one of the most convenient. One thing that’s especially nice about uniform convergence is the way that we can control the separation of sequence terms from the limiting function by a single number instead of a whole function of them.
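That is, instead of fixing an $\epsilon$, fix an $N$ and consider how far sequence terms past $N$ can be from the limit. Take the supremum
$$\sup\limits_{n>N}|f_n(x)-f(x)|$$
This depends on $x$, but if the convergence is uniform we can keep it down below some constant function, and we can make that constant as small as we like by increasing $N$. For pointwise convergence that isn’t uniform, there is some fixed $\epsilon$ so that no matter how big we pick the $N$ there will still be points $x$ where this difference is at least $\epsilon$.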
In this way, uniform convergence is more like convergence of numbers than pointwise convergence of functions. Uniform convergence just isn’t as floppy as pointwise convergence can be.
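The standard example of this floppiness is the sequence $x^n$ on the half-open interval from $0$ to $1$, which is my choice of illustration rather than something taken from the discussion above. It converges pointwise to zero, yet for every $n$ there is a point where the function is still $\frac{1}{2}$; on a smaller closed interval the convergence becomes uniform. The helper names below are hypothetical.

```python
def f(n, x):
    """The n-th function in the sample sequence: x to the n-th power."""
    return x ** n


def witness(n):
    """A point of [0, 1) where f(n, x) is still 1/2, however large n is."""
    return 0.5 ** (1.0 / n)


def sup_on_closed_interval(n, b):
    """The supremum of x^n over [0, b]; x^n is increasing there, so it is b^n."""
    return b ** n


indices = (1, 10, 100, 1000)

# At a fixed point the values die away: this is pointwise convergence to zero.
at_half = [f(n, 0.5) for n in indices]

# But some point of [0, 1) always resists, so the convergence there is not uniform.
resisting = [f(n, witness(n)) for n in indices]

# On [0, 0.9] one number bounds the whole function, and it shrinks to zero.
uniform_bounds = [sup_on_closed_interval(n, 0.9) for n in indices]
```

The entries of `resisting` all sit at about $\frac{1}{2}$, while the entries of `uniform_bounds` collapse toward zero, matching the single-number control described above.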