## The Tangent Space of a Product

Let $M$ and $N$ be smooth manifolds, with $M\times N$ the $(m+n)$-dimensional product manifold. Given points $p\in M$ and $q\in N$ we want to investigate the tangent space of this product at the point $(p,q)$.

For some notation, remember that we have the projections $\pi:M\times N\to M$ and $\rho:M\times N\to N$. Also, if we have a point $q\in N$ we get a smooth inclusion mapping $\iota_q:M\to M\times N$ defined by $\iota_q(x)=(x,q)$. Similarly, given a point $p\in M$ we get an inclusion map $\iota_p:N\to M\times N$ defined by $\iota_p(y)=(p,y)$. These maps satisfy the relations

$$\pi\circ\iota_q=1_M\qquad\rho\circ\iota_p=1_N\qquad\rho\circ\iota_q=q\qquad\pi\circ\iota_p=p$$

where the last two are the constant maps with the given values. We can thus use the chain rule to calculate the derivatives of these relations

$$\pi_*\circ\iota_{q*}=1_{T_pM}\qquad\rho_*\circ\iota_{p*}=1_{T_qN}\qquad\rho_*\circ\iota_{q*}=0\qquad\pi_*\circ\iota_{p*}=0$$

These are four of the five relations we need to show that $T_{(p,q)}(M\times N)$ decomposes as the direct sum of $T_pM$ and $T_qN$. The remaining one states

$$\iota_{q*}\circ\pi_*+\iota_{p*}\circ\rho_*=1_{T_{(p,q)}(M\times N)}$$

where the left-hand side is a linear map from $T_{(p,q)}(M\times N)$ to itself. The real content of the first four relations is effectively that the maps $P:v\mapsto(\pi_*v,\rho_*v)$ and $I:(u,w)\mapsto\iota_{q*}u+\iota_{p*}w$ satisfy

$$P\circ I=1_{T_pM\oplus T_qN}$$

That is, we know that $I$ is a right-inverse of $P$, and we want to know if it’s a left-inverse as well — which is exactly what the remaining relation says. But this follows since both vector spaces $T_{(p,q)}(M\times N)$ and $T_pM\oplus T_qN$ have dimension $m+n$. Thus the tangent space of the product decomposes canonically as the direct sum of the tangent spaces of the factors. In terms of our geometric intuition, there are $m$ directions we can go “along $M$“, and $n$ directions “along $N$“, and any other direction we can go in is a linear combination of one of each.
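
All five relations can be checked concretely in local coordinates, where each derivative becomes a block matrix. Here is a minimal numerical sketch of that check; the sample dimensions $m=2$, $n=3$ are an arbitrary choice of mine, not from the text:

```python
import numpy as np

# In coordinates at (p, q), the tangent space of the product is R^(m+n),
# and the four derivatives become block matrices.
m, n = 2, 3

pi_M = np.hstack([np.eye(m), np.zeros((m, n))])    # pi_*: R^(m+n) -> R^m
pi_N = np.hstack([np.zeros((n, m)), np.eye(n)])    # rho_*: R^(m+n) -> R^n
iota_q = np.vstack([np.eye(m), np.zeros((n, m))])  # (iota_q)_*: R^m -> R^(m+n)
iota_p = np.vstack([np.zeros((m, n)), np.eye(n)])  # (iota_p)_*: R^n -> R^(m+n)

# The four relations obtained from the chain rule:
assert np.array_equal(pi_M @ iota_q, np.eye(m))
assert np.array_equal(pi_N @ iota_p, np.eye(n))
assert np.array_equal(pi_N @ iota_q, np.zeros((n, m)))
assert np.array_equal(pi_M @ iota_p, np.zeros((m, n)))

# The fifth relation, which gives the direct-sum decomposition:
assert np.array_equal(iota_q @ pi_M + iota_p @ pi_N, np.eye(m + n))
```
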

Note how this dovetails with our discussion of submanifolds. The projection $\rho:M\times N\to N$ is a smooth map, and every point $q\in N$ is a regular value. Its preimage $\rho^{-1}(q)$ is a submanifold diffeomorphic to $M$. The embedding realizing this diffeomorphism is $\iota_q$. The tangent space $T_{(p,q)}(M\times N)$ at a point on the submanifold is mapped by $\rho_*$ onto $T_qN$, and the kernel of this map is exactly the image of the inclusion $\iota_{q*}$. The same statements hold with $M$ and $N$ swapped appropriately, which is what gives us a canonical decomposition in this case.

## Tangent Spaces and Regular Values

If we have a smooth map $f:M^m\to N^n$ and a regular value $q\in N$ of $f$, we know that the preimage $A=f^{-1}(q)$ is a smooth $(m-n)$-dimensional submanifold. It turns out that we also have a nice decomposition of the tangent space $T_pM$ for every point $p\in A$.

The key observation is that the inclusion $\iota:A\to M$ induces an inclusion of each tangent space by using the derivative $\iota_*:T_pA\to T_pM$. The directions in this subspace are those “tangent to” the submanifold $A$, and so these are the directions in which $f$ doesn’t change, “to first order”. Heuristically, in any direction tangent to $A$ we can set up a curve $c$ with that tangent vector which lies entirely within $A$. Along this curve, the value of $f$ is constantly $q$, and so the derivative of $f\circ c$ is zero. Since the derivative of $f$ in the direction $v=c'(0)$ only depends on $v$ and not the specific choice of curve $c$, we conclude that $f_*(v)$ should be zero.

This still feels a little handwavy. To be more precise, if $v\in T_pA$ and $\phi$ is a smooth function on a neighborhood of $q$, then we calculate

$$\left[f_*\left(\iota_*(v)\right)\right](\phi)=\left[(f\circ\iota)_*(v)\right](\phi)=v\left(\phi\circ f\circ\iota\right)=0$$

since $f\circ\iota$ is the constant map with value $q$, and any tangent vector applied to a constant function is automatically zero. Thus we conclude that $\iota_*\left(T_pA\right)\subseteq\ker(f_*)$. In fact, we can say more. The rank-nullity theorem tells us that the dimension of $\ker(f_*)$ and the dimension of $\mathrm{im}(f_*)$ add up to the dimension of $T_pM$, which of course is $m$. But the assumption that $p$ is a regular point means that the rank of $f_*$ is $n$, so the dimension of the kernel is $m-n$. And this is exactly the dimension of $A$, and thus of its tangent space $T_pA$! Since the subspace $\iota_*\left(T_pA\right)$ has the same dimension as $\ker(f_*)$, we conclude that they are in fact equal.
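
We can sanity-check this argument on a concrete map. The sketch below (my sample, not from the text) takes $f(x,y,z)=x^2+y^2+z^2$ with regular value $1$, so that $A$ is the unit sphere, and verifies both the rank-nullity count at a point of $A$ and the vanishing of the derivative of $f$ along a curve inside $A$:

```python
import sympy as sp

x, y, z, t = sp.symbols('x y z t')
f = x**2 + y**2 + z**2            # sample f: R^3 -> R; q = 1 is a regular value

# Derivative of f at p = (0, 0, 1), a point of the preimage A = f^{-1}(1)
J = sp.Matrix([f]).jacobian([x, y, z]).subs({x: 0, y: 0, z: 1})  # [0, 0, 2]

# Rank-nullity with m = 3, n = 1: rank is 1 at a regular point, so the
# kernel has dimension m - n = 2, matching the dimension of A
assert J.rank() == 1
assert len(J.nullspace()) == 3 - 1

# A curve lying entirely in A: f is constant along it, so its derivative is 0
f_on_curve = f.subs({x: sp.sin(t), y: 0, z: sp.cos(t)})
assert sp.simplify(sp.diff(f_on_curve, t)) == 0
```
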

What does this mean? It tells us that not only are the tangent directions to $A$ contained in the kernel of the derivative $f_*$, every vector in the kernel is tangent to $A$. Thus we can break down any tangent vector in $T_pM$ into a part that goes “along” $A$ and a part that goes across it. Unfortunately, this isn’t really canonical, since we don’t have a specific complementary subspace to $T_pA$ in mind. Still, it’s a useful framework to keep in mind, reinforcing the idea that near the submanifold $A$ the manifold $M$ “looks like” the product of $\mathbb{R}^{m-n}$ (from $A$) and $\mathbb{R}^n$, and we can even pick coordinates that reflect this “decomposition”.

## Spheres as Submanifolds

With our extension of the implicit function theorem in hand, we have another way of getting at the sphere, this time as a submanifold.

Start with the Euclidean space $\mathbb{R}^{n+1}$ and take the smooth function $f:\mathbb{R}^{n+1}\to\mathbb{R}$ defined by $f(x)=\langle x,x\rangle$. In components, this is $f(x)=\sum_{i=1}^{n+1}\left(x^i\right)^2$, where the $x^i$ are the canonical coordinates on $\mathbb{R}^{n+1}$. We can easily calculate the derivative in these coordinates: $df=2x^1\,dx^1+\dots+2x^{n+1}\,dx^{n+1}$. This is the zero function if and only if $x=0$, and so $f_*$ has rank $1$ at any nonzero point $x$. The point $0$ is a critical point, and every other point is regular.

On the image side, we see that $f(0)=0$, so the only critical value is $0$. Every other value is regular, though $f^{-1}(a)$ is empty for $a<0$. For $a>0$ we have a nonempty preimage, which by our result is a manifold of dimension $n$. This is the $n$-dimensional sphere of radius $\sqrt{a}$, though we aren’t going to care so much about the radius for now.

Anyway, is this really the same sphere as before? Remember, when we first saw the two-dimensional sphere $S^2$ as an example, we picked coordinate patches by hand. Now we have the same set of points — those with a fixed squared-distance from the origin — but we might have a different differentiable manifold structure. But if we can show that the inclusion mapping that takes each of our handcrafted coordinate patches into $\mathbb{R}^3$ is an immersion, then they must be compatible with the submanifold structure.

We only really need to check this for a single patch, since all six are very similar. We take the local coordinates $(x,y)$ from our patch and the canonical coordinates on $\mathbb{R}^3$ to write out the inclusion map:

$$\iota(x,y)=\left(x,y,\sqrt{1-x^2-y^2}\right)$$

Then we use these coordinates to calculate the derivative

$$d\iota=\begin{pmatrix}1&0\\0&1\\\frac{-x}{\sqrt{1-x^2-y^2}}&\frac{-y}{\sqrt{1-x^2-y^2}}\end{pmatrix}$$

This clearly always has rank $2$ for $x^2+y^2<1$, and so the inclusion of our original sphere into $\mathbb{R}^3$ is an immersion, which must then be equivalent to the inclusion of the submanifold $f^{-1}(1)$, since they give the same subspace of $\mathbb{R}^3$.
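
As a quick symbolic check of this rank computation (the sample point inside the unit disk is my choice):

```python
import sympy as sp

x, y = sp.symbols('x y')
w = sp.sqrt(1 - x**2 - y**2)     # third coordinate on the upper hemisphere patch

# Inclusion of the coordinate patch of S^2 into R^3
incl = sp.Matrix([x, y, w])
J = incl.jacobian([x, y])

# The upper 2x2 block is the identity, so the rank is 2 wherever the patch
# is defined; verify at a sample interior point of the unit disk
assert J[:2, :] == sp.eye(2)
assert J.subs({x: sp.Rational(1, 2), y: 0}).rank() == 2
```
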

## Regular and Critical Points

Let $f:M^m\to N^n$ be a smooth map between manifolds. We say that a point $p\in M$ is a “regular point” if the derivative $f_*:T_pM\to T_{f(p)}N$ has rank $n$; otherwise, we say that $p$ is a “critical point”. A point $q\in N$ is called a “regular value” if its preimage $f^{-1}(q)$ contains no critical points.

The first thing to notice is that this is only nontrivial if $m\geq n$. If $m<n$ then $f_*$ can have rank at most $m<n$, and thus every point is critical. Another observation is that any $q$ not in the image of $f$ is automatically regular; since its preimage is empty, it cannot contain any critical points.

Regular values are useful because of the generalization of the first part of the implicit function theorem: if $q$ is a regular value of $f$, then $A=f^{-1}(q)$ is a topological manifold of dimension $m-n$. Or, to put it another way, $A$ is a submanifold of “codimension” $n$. Further, there is a unique differentiable structure for which $A$ is a smooth submanifold of $M$.

Indeed, let $(V,\psi)$ be a coordinate patch around $q$ with $\psi(q)=0$. Given $p\in A$, pick a coordinate patch $(U,\phi)$ of $M$ with $\phi(p)=0$. Let $\pi:\mathbb{R}^m\to\mathbb{R}^n$ be the projection onto the first $n$ components; let $\rho:\mathbb{R}^m\to\mathbb{R}^{m-n}$ be the projection onto the last $m-n$ components; and let $\iota:\mathbb{R}^{m-n}\to\mathbb{R}^m$ be the inclusion of the subspace whose first $n$ components are $0$.

Now, we can write down the composition $\psi\circ f\circ\phi^{-1}$, defined on a neighborhood of $0$ in $\mathbb{R}^m$. Since this has (by assumption) maximal rank at $0$, the implicit function theorem tells us that there is a coordinate patch $(W,\chi)$ in a neighborhood of $0$ such that $\left(\psi\circ f\circ\phi^{-1}\right)\circ\chi^{-1}=\pi$. So we can set $U'=\phi^{-1}(W)$, which is open in $M$, and get

$$\psi\circ f=\pi\circ\chi\circ\phi$$

on $U'$. Setting $\bar{\phi}=\chi\circ\phi$ we conclude that $\pi\left(\bar{\phi}(U'\cap A)\right)=0$, since all these points are sent by $f$ to $q$, whose preimage we are considering.

Now we claim that $\bar{\phi}(U'\cap A)$ is not just any subset of $\mathbb{R}^m$, but in fact $\bar{\phi}(U')\cap\iota\left(\mathbb{R}^{m-n}\right)$. Clearly $\bar{\phi}(U'\cap A)$ is contained in this intersection, since

$$\pi\left(\bar{\phi}(u)\right)=\psi\left(f(u)\right)=\psi(q)=0$$

for every $u\in U'\cap A$. On the other hand, if $x$ is in this intersection, then $x=\bar{\phi}(u)$ for a unique $u\in U'$ — unique because $\phi$ and $\chi$ are both coordinate maps and thus invertible — and we have

$$\psi\left(f(u)\right)=\pi\left(\bar{\phi}(u)\right)=\pi(x)$$

meaning that since the first $n$ components of $x$ must be $0$, we get $\psi(f(u))=0$, and thus $f(u)=q$ and $u\in U'\cap A$. Thus $\bar{\phi}(U'\cap A)=\bar{\phi}(U')\cap\iota\left(\mathbb{R}^{m-n}\right)$.

Therefore $\bar{\phi}$ maps $U'\cap A$ homeomorphically onto a neighborhood of $\bar{\phi}(p)$ in $\iota\left(\mathbb{R}^{m-n}\right)$, in the subspace topology induced by $\mathbb{R}^m$. But this means that $\rho\circ\bar{\phi}$ acts as a coordinate patch on $A$! Since every point $p\in A$ can be found in some local coordinate patch, $A$ is a topological manifold. For its differentiable structure we’ll just take the one induced by these patches.

Finally, we have to check that the inclusion $A\hookrightarrow M$ is smooth, so $A$ is a smooth submanifold — that its differentiable structure is compatible with that of $M$. But this is easy, since at any point $p\in A$ we can go through the above process and get all these functions. We check smoothness by using the local coordinates $\rho\circ\bar{\phi}$ on $U'\cap A$ and $\bar{\phi}$ on $U'$, concluding that in these coordinates the inclusion is exactly $\iota$, which is clearly smooth.
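
The whole construction can be played out concretely. In the sketch below (my example, not from the text) $f(x,y)=x^2+y^2$ with regular value $1$, so $A$ is the unit circle, and near $p=(1,0)$ the flattening coordinates $\bar{\phi}$ can be taken to be $(x^2+y^2-1,\,y)$ — first component $\psi\circ f$, remaining component copied over:

```python
import sympy as sp

x, y, t = sp.symbols('x y t')
f = x**2 + y**2                  # q = 1 is a regular value; A is the unit circle

# Candidate flattening coordinates near p = (1, 0): first the n = 1 component
# psi(f(x,y)) = f - 1, then one copied coordinate for the m - n = 1 directions
chi = sp.Matrix([f - 1, y])

# chi is a genuine coordinate map near p: its Jacobian there is invertible
assert chi.jacobian([x, y]).subs({x: 1, y: 0}).det() != 0

# A point lies on the circle exactly when the first chi-coordinate vanishes;
# check along the circle point (cos t, sin t)
first = chi[0].subs({x: sp.cos(t), y: sp.sin(t)})
assert sp.simplify(first) == 0
```
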

## Submanifolds

At last we can actually define submanifolds. If $S$ and $M$ are both manifolds with $S\subseteq M$ as topological spaces — the points of $S$ form a subset of the points of $M$ and the topology of $S$ agrees with the subspace topology from $M$ — then we say that $S$ is a submanifold of $M$ if the inclusion map $S\hookrightarrow M$ is an embedding. If the inclusion is only an immersion, we say that $S$ is an “immersed submanifold” of $M$.

Now, if $f:A\to M$ is any embedding of one manifold into another, then the image $f(A)$ is a submanifold, as defined above. Similarly, the image of an injective immersion is an immersed submanifold. The tricky bit here is that if we have a situation like the second of our pathological immersions, we have to consider the topology on the image that does *not* consider the endpoints to be “close” to the middle point on the curve that they approach.

This motivates us to define an equivalence relation on injective immersions into $M$: if $f:A\to M$ and $g:B\to M$ are two such maps, we consider them equivalent if there is a diffeomorphism $h:A\to B$ so that $f=g\circ h$. Clearly, this is reflexive (we just let $h$ be the identity map), symmetric (a diffeomorphism is invertible), and transitive (the composition of two diffeomorphisms is another one).

The nice thing about this equivalence relation is that every injective immersion is equivalent to the inclusion of a unique immersed submanifold, and so there is no real loss in speaking about an injective immersion as “being” an immersed submanifold. And of course the same goes for embeddings “being” submanifolds as well.

## Immersions are Locally Embeddings

In both of our pathological examples last time, the problems were very isolated. They depended on two separated parts of the domain manifold interacting with each other. And since manifolds can be carved up easily, we can always localize and find patches of the domain where the immersion map is a well-behaved embedding.

More specifically, if $f:M^m\to N^n$ is an immersion, with $f_*:T_pM\to T_{f(p)}N$ an injection for every $p\in M$, then for every point $p\in M$ there exists a neighborhood $U$ of $p$ and a coordinate map $\psi'$ around $f(p)$ so that a point $x$ in the coordinate patch lies in $f(U)$ if and only if $\psi'^{m+1}(x)=\dots=\psi'^n(x)=0$. Further, the restriction $f\vert_U$ is an embedding.

This is basically the actual extension of the second part of the implicit function theorem to manifolds. Appropriately, then, we’ll let $\iota:\mathbb{R}^m\to\mathbb{R}^n$ be the same inclusion into the first $m$ coordinates. We pick a coordinate map $\phi$ around $p$ with $\phi(p)=0$, and another map $\psi$ around $f(p)$ with $\psi(f(p))=0$. Then we get a map $\psi\circ f\circ\phi^{-1}$ from a neighborhood of $0\in\mathbb{R}^m$ to a neighborhood of $0\in\mathbb{R}^n$.

Now, the assumption on $f$ is that $f_*$ is injective, meaning it has maximal rank $m$ at every point. Since $\phi$ and $\psi$ are diffeomorphisms, the composite $\psi\circ f\circ\phi^{-1}$ also has maximal rank at $0$. The implicit function theorem tells us there is a coordinate map $\chi$ in some neighborhood of $0\in\mathbb{R}^n$ and a neighborhood $W$ of $0\in\mathbb{R}^m$ such that $\chi\circ\psi\circ f\circ\phi^{-1}=\iota$ on $W$.

We set $U=\phi^{-1}(W)$, and $\psi'=\chi\circ\psi$, restricting the domain of $\psi'$, if necessary. This establishes the first part of our assertion. Next we need to show that $f\vert_U$ is an embedding. But $f\vert_U=\psi'^{-1}\circ\iota\circ\phi\vert_U$, which is a composition of embeddings, and is thus an embedding itself.

If $f$ is already an embedding at the outset, then $f(U)=f(M)\cap V$ for some open $V\subseteq N$. In this case, with $\psi'$ as in the theorem — restricted, if necessary, so its domain lies in $V$ — we have

$$f(M)\cap\mathrm{dom}(\psi')=\left\{x\in\mathrm{dom}(\psi')\mid\psi'^{m+1}(x)=\dots=\psi'^n(x)=0\right\}$$

That is, there is always a set of local coordinates in $N$ so that the image of $f$ is locally the hyperplane spanned by the first $m$ of them.
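
This straightening can be carried out explicitly in a simple case. The sketch below (my sample, not from the text) embeds the parabola $f(t)=(t,t^2)$ in $\mathbb{R}^2$ and straightens it with the coordinates $\psi'(x,y)=(x,\,y-x^2)$, in which the image becomes the inclusion $\iota(t)=(t,0)$:

```python
import sympy as sp

t, x, y = sp.symbols('t x y')

# A sample embedding f: R -> R^2 (the parabola); m = 1, n = 2
f = sp.Matrix([t, t**2])

# Straightening coordinates psi' on R^2 (an illustrative choice)
psi = sp.Matrix([x, y - x**2])

# psi' is a valid coordinate map: its Jacobian is everywhere invertible
assert psi.jacobian([x, y]).det() == 1

# psi'(f(t)) = (t, 0): the image is locally the hyperplane where the
# last n - m = 1 coordinate vanishes
straightened = psi.subs({x: f[0], y: f[1]})
assert sp.simplify(straightened[0] - t) == 0
assert sp.simplify(straightened[1]) == 0
```
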

## Immersions and Embeddings

As we said before, the notion of a “submanifold” gets a little more complicated than a naïve, purely categorical approach might suggest. Instead, we work from the concepts of immersions and embeddings.

A map $f:M^m\to N^n$ of manifolds is called an “immersion” if the derivative $f_*:T_pM\to T_{f(p)}N$ is injective at every point $p\in M$. Immediately we can tell that this can only happen if $m\leq n$.

Notice now that this does not guarantee that $f$ itself is injective. For instance, if $M=\mathbb{R}$ and $N=\mathbb{R}^2$, then we can form the mapping $f(t)=\left(t^3-t,t^2-1\right)$. Using the coordinate $t$ on $\mathbb{R}$ and the coordinates $(x,y)$ on $\mathbb{R}^2$, we can calculate the derivative in coordinates:

$$f_*\left(\frac{d}{dt}\right)=\left(3t^2-1\right)\frac{\partial}{\partial x}+2t\frac{\partial}{\partial y}$$

The second component of this vector is only zero if $t$ itself is, but in this case the first component is $-1$, thus $f_*$ is never the zero map between the tangent spaces. But $f(-1)=f(1)=(0,0)$, so $f$ is not injective in terms of the underlying point sets of $\mathbb{R}$ and $\mathbb{R}^2$.

Courtesy of Wolfram Alpha, we can plot this map to see what’s going on.

The image of the curve crosses itself at the origin, but if we restrict ourselves to, say, the intervals $(-\infty,0)$ and $(0,\infty)$, there is no self-intersection in each interval.
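
The two claims about this example — that it is an immersion but not injective — are easy to verify symbolically:

```python
import sympy as sp

t = sp.Symbol('t')
f = sp.Matrix([t**3 - t, t**2 - 1])

# f is an immersion: the two components of the derivative (3t^2 - 1, 2t)
# never vanish simultaneously, so f_* is injective at every parameter value
df = f.diff(t)
assert not set(sp.solve(df[0], t)) & set(sp.solve(df[1], t))

# ...but f itself is not injective: t = 1 and t = -1 hit the same point
assert f.subs(t, 1) == f.subs(t, -1) == sp.Matrix([0, 0])
```
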

There is another, more subtle pathology to be careful about. Let $I$ be the open interval $(-\pi,\pi)$, and let $g:I\to\mathbb{R}^2$ be defined by $g(t)=\left(\sin(2t),\sin(t)\right)$. We plot this curve, stopping just slightly shy of each endpoint.

We see that there’s never quite a self-intersection like before, but the ends of the curve come right up to almost touch the curve in the middle. Going all the way to the limit, the image of $g$ is a figure eight, which includes the crossing point in the middle and is thus not a manifold, even though the parameter space $I$ is.
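
Again we can confirm the key features symbolically — the derivative never vanishes, the curve passes through the origin once, and both ends of the parameter interval approach that very same point:

```python
import sympy as sp

t = sp.Symbol('t')
g = sp.Matrix([sp.sin(2*t), sp.sin(t)])   # defined on the open interval (-pi, pi)

# g is an immersion: its derivative (2cos 2t, cos t) never vanishes, since
# cos t = 0 forces t = ±pi/2, where cos 2t = -1
dg = g.diff(t)
assert dg.subs(t, sp.pi/2) == sp.Matrix([-2, 0])
assert dg.subs(t, -sp.pi/2) == sp.Matrix([-2, 0])

# The crossing point g(0) = (0,0) is in the image, and the ends of the
# parameter interval approach the very same point without reaching it
assert g.subs(t, 0) == sp.Matrix([0, 0])
assert sp.limit(g[0], t, sp.pi, '-') == 0 and sp.limit(g[1], t, sp.pi, '-') == 0
```
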

To keep away from these pathologies, we define an “embedding” to be an immersion $f:M\to N$ where the image $f(M)$ — endowed with the subspace topology — is homeomorphic to $M$ itself by $f$. This is closer to the geometrically intuitive notion of a submanifold, but we will still find the notion of an immersion to be useful.

As a particular example, notice (and check!) that the inclusion map of an open submanifold, as defined earlier, is an embedding.

## The Implicit Function Theorem

We can also recall the implicit function theorem. This is less directly generalizable to manifolds, since talking about a function is effectively considering a manifold with a particular product structure: the product between the function’s domain and range.

Still, we can go back and clean up not only the statement of the implicit function theorem, but its proof, as well. And we can even extend to a different, related statement, all using the inverse function theorem for manifolds.

So, take a smooth function $f:\mathbb{R}^m\to\mathbb{R}^n$, where $m>n$. Suppose further that $f_*$ has maximal rank $n$ at a point $a$. If we write $\pi:\mathbb{R}^m\to\mathbb{R}^n$ for the projection onto the first $n$ components of $\mathbb{R}^m$, then there is some coordinate patch $(W,\chi)$ of $\mathbb{R}^m$ around $a$ so that $f\circ\chi^{-1}=\pi$ in that patch.

This is pretty much just like the original proof. We can clearly define $g:\mathbb{R}^m\to\mathbb{R}^m$ to agree with $f$ in its first $n$ components and just to copy over the $j$th component $x^j$ for $n<j\leq m$. That is,

$$g\left(x^1,\dots,x^m\right)=\left(f^1(x),\dots,f^n(x),x^{n+1},\dots,x^m\right)$$

Then $f=\pi\circ g$, and the Jacobian of $g$ is

$$dg=\begin{pmatrix}\left(\frac{\partial f^i}{\partial x^j}\right)_{j\leq n}&\left(\frac{\partial f^i}{\partial x^j}\right)_{j>n}\\0&I_{m-n}\end{pmatrix}$$

After possibly rearranging the arguments of $f$, we may assume that the $n\times n$ matrix in the upper-left has nonzero determinant — $f_*$ has rank $n$ at $a$, by assumption — and so the Jacobian of $g$ also has nonzero determinant. By the inverse function theorem, $g$ has a neighborhood $W$ of $a$ on which it’s a diffeomorphism; we write $\chi=g\vert_W$. Thus on $W$ we conclude

$$f\circ\chi^{-1}=\pi$$

This is basically the implicit function theorem from before. But now let’s consider what happens when $m<n$. Again, we assume that $f_*$ has maximal rank — this time it’s $m$ — at a point $a$. If we write $\iota:\mathbb{R}^m\to\mathbb{R}^n$ for the inclusion of $\mathbb{R}^m$ into the first $m$ components of $\mathbb{R}^n$, then I say that there is a coordinate patch $(W,\chi)$ around $f(a)$ so that $\chi\circ f=\iota$ in a neighborhood of $a$.

This time, we take the product $\mathbb{R}^n=\mathbb{R}^m\times\mathbb{R}^{n-m}$ and define the function $g:\mathbb{R}^m\times\mathbb{R}^{n-m}\to\mathbb{R}^n$ by

$$g(x,y)=f(x)+\left(0,\dots,0,y^1,\dots,y^{n-m}\right)$$

Then $g\circ\iota=f$, and the Jacobian of $g$ at $(a,0)$ is

$$dg=\begin{pmatrix}\left(\frac{\partial f^i}{\partial x^j}\right)_{i\leq m}&0\\\left(\frac{\partial f^i}{\partial x^j}\right)_{i>m}&I_{n-m}\end{pmatrix}$$

Just as before, by rearranging the components of $f$ we can assume that the determinant of the $m\times m$ matrix in the upper-left is nonzero, and thus the determinant of the whole Jacobian is nonzero. And thus $g$ is a diffeomorphism on some neighborhood of $(a,0)$. We let $W$ be the image of this neighborhood, and write $\chi=g^{-1}$ on $W$. Thus on some neighborhood of $a$ we conclude

$$\chi\circ f=\iota$$

Either way, the conclusion is that we can always pick local coordinates on the larger-dimensional space so that $f$ is effectively just a simple inclusion or projection with respect to those coordinates.
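
A concrete instance of the first construction, with $m=2$, $n=1$ (the particular function $f(x,y)=x+y^2$ is my sample choice, not from the text):

```python
import sympy as sp

x, y, u, v = sp.symbols('x y u v')

# A sample f: R^2 -> R of maximal rank everywhere
f = x + y**2

# g agrees with f in its first component and copies the remaining argument
g = sp.Matrix([f, y])
assert g.jacobian([x, y]).det() != 0      # so g is a local diffeomorphism chi

# Invert g explicitly: g(x, y) = (u, v) gives x = u - v^2, y = v
g_inv = sp.Matrix([u - v**2, v])
assert sp.simplify(g.subs({x: g_inv[0], y: g_inv[1]}) - sp.Matrix([u, v])) == sp.Matrix([0, 0])

# In the chi-coordinates, f is exactly the projection onto the first component
assert sp.simplify(f.subs({x: g_inv[0], y: g_inv[1]}) - u) == 0
```
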

## The Inverse Function Theorem

Recall the inverse function theorem from multivariable calculus: if $f:U\to\mathbb{R}^n$ is a map defined on an open region $U\subseteq\mathbb{R}^n$, and if the Jacobian of $f$ has maximal rank $n$ at a point $a\in U$, then there is some neighborhood $V\subseteq U$ of $a$ so that the restriction $f\vert_V$ is a diffeomorphism onto its image. This is slightly different than how we stated it before, but it’s a pretty straightforward translation.

Anyway, this generalizes immediately to more general manifolds. We know that the proper generalization of the Jacobian is the derivative $f_*$ of a smooth map $f:U\to N$, where $U$ is an open region of an $n$-manifold $M$ and $N$ is another $n$-manifold. If the derivative $f_*$ has maximal rank at a point $p\in U$, then there is some neighborhood $V$ of $p$ for which $f\vert_V$ is a diffeomorphism onto its image.

Well, this is actually pretty simple to prove. Just take coordinates $\phi$ at $p$ and $\psi$ at $f(p)$. We can restrict the domain $U$ of $f$ to assume that $f(U)$ is entirely contained in the coordinate patch of $\psi$. Then we can set up the function $\psi\circ f\circ\phi^{-1}:\phi(U)\to\mathbb{R}^n$.

Since $f_*$ has maximal rank, so does the matrix of $f_*$ with respect to the bases of coordinate vectors $\frac{\partial}{\partial\phi^i}$ and $\frac{\partial}{\partial\psi^j}$, which is exactly the Jacobian of $\psi\circ f\circ\phi^{-1}$. Thus the original inverse function theorem applies to show that there is some $V'\subseteq\phi(U)$ on which $\psi\circ f\circ\phi^{-1}$ is a diffeomorphism. Since the coordinate maps $\phi$ and $\psi$ are diffeomorphisms we can write $V=\phi^{-1}(V')$, and conclude that $f\vert_V=\psi^{-1}\circ\left(\psi\circ f\circ\phi^{-1}\right)\circ\phi\vert_V$ is a diffeomorphism, as asserted.
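
A classic illustration of the local nature of this theorem (the polar-coordinate map is my sample, not from the text): it has maximal rank wherever $r>0$, so it is a diffeomorphism near every such point, even though it is not globally injective.

```python
import sympy as sp

r, th = sp.symbols('r theta', positive=True)

# The polar-coordinate map on the open region r > 0
f = sp.Matrix([r*sp.cos(th), r*sp.sin(th)])

# Its Jacobian determinant is r, nonzero on the region, so around any point
# the map restricts to a diffeomorphism -- even though theta and theta + 2pi
# give the same image point, so the map is not globally invertible
assert sp.simplify(f.jacobian([r, th]).det() - r) == 0
```
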

## Cotangent Vectors, Differentials, and the Cotangent Bundle

There’s another construct in differential topology and geometry that isn’t quite so obvious as a tangent vector, but which is every bit as useful: a cotangent vector. A cotangent vector at a point $p\in M$ is just an element of the dual space to $T_pM$, which we write as $T^*_pM$.

We actually have a really nice example of cotangent vectors already: a gadget that takes a tangent vector at a point $p$ and gives back a number. It’s the differential of a smooth function, which when given a vector returns the directional derivative in that direction. And we can generalize that right away.

Indeed, if $f$ is a smooth germ at $p$, then we have a linear functional $v\mapsto v(f)$ defined for all tangent vectors $v\in T_pM$. We will call this functional the differential of $f$ at $p$, and write $df(p)$, so that $\left[df(p)\right](v)=v(f)$.

If we have local coordinates $(U,x)$ at $p$, then each coordinate function $x^i$ is a smooth function, which has differential $dx^i(p)$. These actually furnish the dual basis to the coordinate vectors $\frac{\partial}{\partial x^i}(p)$. Indeed, we calculate

$$\left[dx^i(p)\right]\left(\frac{\partial}{\partial x^j}(p)\right)=\frac{\partial x^i}{\partial x^j}(p)=\delta^i_j$$

That is, evaluating the coordinate differential $dx^i(p)$ on the coordinate vector $\frac{\partial}{\partial x^j}(p)$ gives the value $1$ if $i=j$ and $0$ otherwise.

Of course, the $dx^i(p)$ define a basis of $T^*_pM$ at every point $p\in U$, just like the $\frac{\partial}{\partial x^i}(p)$ define a basis of $T_pM$ at every point $p\in U$. This was exactly what we needed to compare vectors — at least to some extent — at points within a local coordinate patch, and let us define the tangent bundle $TM$ as a $2n$-dimensional manifold.

In exactly the same way, we can define the cotangent bundle $T^*M$. Given the coordinate patch $(U,x)$ we define a coordinate patch covering all the cotangent spaces $T^*_pM$ with $p\in U$. The coordinate map is defined on a cotangent vector $\lambda\in T^*_pM$ by

$$\lambda\mapsto\left(x^1(p),\dots,x^n(p),\lambda\left(\frac{\partial}{\partial x^1}(p)\right),\dots,\lambda\left(\frac{\partial}{\partial x^n}(p)\right)\right)$$

Everything else in the construction of the cotangent bundle proceeds exactly as it did for the tangent bundle, but we’re missing one thing: how to translate from one basis of coordinate differentials to another.

So, let’s say $(U,x)$ and $(V,y)$ are two coordinate maps at $p$, defining coordinate differentials $dx^i(p)$ and $dy^j(p)$. How are these two bases related? We can calculate this by applying $dy^j(p)$ to $\frac{\partial}{\partial x^i}(p)$:

$$\left[dy^j(p)\right]\left(\frac{\partial}{\partial x^i}(p)\right)=\frac{\partial y^j}{\partial x^i}(p)$$

where the $\frac{\partial y^j}{\partial x^i}(p)$ are the components of the Jacobian matrix of the transition function $y\circ x^{-1}$. What does this mean? Well, consider the linear functional

$$\sum\limits_{i=1}^n\frac{\partial y^j}{\partial x^i}(p)dx^i(p)$$

This has the same values on each of the $\frac{\partial}{\partial x^i}(p)$ as $dy^j(p)$ does, and we conclude that they are, in fact, the same cotangent vector:

$$dy^j(p)=\sum\limits_{i=1}^n\frac{\partial y^j}{\partial x^i}(p)dx^i(p)$$

On the other hand, recall that

$$\frac{\partial}{\partial x^i}(p)=\sum\limits_{j=1}^n\frac{\partial y^j}{\partial x^i}(p)\frac{\partial}{\partial y^j}(p)$$

That is, we use the Jacobian of the transition function to transform from the basis $\left\{\frac{\partial}{\partial y^j}\right\}$ to the basis $\left\{\frac{\partial}{\partial x^i}\right\}$ of $T_pM$, but the transpose of the same Jacobian to transform from the basis $\left\{dx^i\right\}$ to the basis $\left\{dy^j\right\}$ of $T^*_pM$. And this is actually just as we expect, since the transpose is actually the adjoint transformation, which automatically connects the dual spaces.
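
These two transformation rules fit together so that the pairing of a covector with a vector doesn’t depend on the coordinates used. A numerical sketch (the particular transition function and component values are my sample choices):

```python
import numpy as np

# Jacobian J[j][i] = dy^j/dx^i of a sample transition function, say
# y(x) = (x^1 + (x^2)^2, x^2), evaluated at the point with x = (1, 2)
J = np.array([[1.0, 4.0],
              [0.0, 1.0]])

# Components of a tangent vector transform by J: if v has components v_x
# in the x-coordinate basis, its components in the y-basis are J @ v_x
v_x = np.array([3.0, -1.0])
v_y = J @ v_x

# Components of a cotangent vector transform by the transpose: if lambda
# has components lam_y in the dy basis, its dx components are J.T @ lam_y
lam_y = np.array([0.5, 4.0])
lam_x = J.T @ lam_y

# The pairing of a covector with a vector comes out the same either way
assert np.isclose(lam_x @ v_x, lam_y @ v_y)
```
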