## The Tangent Space of a Product

Let $M$ and $N$ be smooth manifolds, with $M\times N$ the $(m+n)$-dimensional product manifold. Given points $p\in M$ and $q\in N$ we want to investigate the tangent space of this product at the point $(p,q)$.

For some notation, remember that we have the projections $\pi:M\times N\to M$ and $\rho:M\times N\to N$. Also, if we have a point $q\in N$ we get a smooth inclusion mapping $\iota_q:M\to M\times N$ defined by $\iota_q(x)=(x,q)$. Similarly, given a point $p\in M$ we get an inclusion map $\iota_p:N\to M\times N$ defined by $\iota_p(y)=(p,y)$. These maps satisfy the relations

$$\pi\circ\iota_q=1_M\qquad\rho\circ\iota_p=1_N\qquad\rho\circ\iota_q=q\qquad\pi\circ\iota_p=p$$

where the last two are the constant maps with the given values. We can thus use the chain rule to calculate the derivatives of these relations

$$\pi_*\circ\iota_{q*}=1_{T_pM}\qquad\rho_*\circ\iota_{p*}=1_{T_qN}\qquad\rho_*\circ\iota_{q*}=0\qquad\pi_*\circ\iota_{p*}=0$$

These are four of the five relations we need to show that $T_{(p,q)}(M\times N)$ decomposes as the direct sum of $T_pM$ and $T_qN$. The remaining one states

$$\iota_{q*}\circ\pi_*+\iota_{p*}\circ\rho_*=1_{T_{(p,q)}(M\times N)}$$

where the left-hand side is a linear map from $T_{(p,q)}(M\times N)$ to itself. The real content of the first four relations is effectively that the maps $P:v\mapsto(\pi_*v,\rho_*v)$ and $I:(u,w)\mapsto\iota_{q*}u+\iota_{p*}w$ satisfy

$$P\circ I=1_{T_pM\oplus T_qN}$$

That is, we know that $I$ is a right-inverse of $P$, and we want to know if it’s a left-inverse as well — which is exactly what the remaining relation says. But this follows since both vector spaces $T_{(p,q)}(M\times N)$ and $T_pM\oplus T_qN$ have dimension $m+n$. Thus the tangent space of the product decomposes canonically as the direct sum of the tangent spaces of the factors. In terms of our geometric intuition, there are $m$ directions we can go “along $M$“, and $n$ directions “along $N$“, and any other direction we can go in is a linear combination of one of each.
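
All five relations can be checked concretely in local coordinates, where each derivative becomes a block matrix. Here is a minimal numerical sketch of that check; the sample dimensions $m=2$, $n=3$ are an arbitrary choice of mine, not from the text:

```python
import numpy as np

# In coordinates at (p, q), the tangent space of the product is R^(m+n),
# and the four derivatives become block matrices.
m, n = 2, 3

pi_M = np.hstack([np.eye(m), np.zeros((m, n))])    # pi_*: R^(m+n) -> R^m
pi_N = np.hstack([np.zeros((n, m)), np.eye(n)])    # rho_*: R^(m+n) -> R^n
iota_q = np.vstack([np.eye(m), np.zeros((n, m))])  # (iota_q)_*: R^m -> R^(m+n)
iota_p = np.vstack([np.zeros((m, n)), np.eye(n)])  # (iota_p)_*: R^n -> R^(m+n)

# The four relations obtained from the chain rule:
assert np.array_equal(pi_M @ iota_q, np.eye(m))
assert np.array_equal(pi_N @ iota_p, np.eye(n))
assert np.array_equal(pi_N @ iota_q, np.zeros((n, m)))
assert np.array_equal(pi_M @ iota_p, np.zeros((m, n)))

# The fifth relation, which gives the direct-sum decomposition:
assert np.array_equal(iota_q @ pi_M + iota_p @ pi_N, np.eye(m + n))
```
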

Note how this dovetails with our discussion of submanifolds. The projection $\rho:M\times N\to N$ is a smooth map, and every point $q\in N$ is a regular value. Its preimage $\rho^{-1}(q)$ is a submanifold diffeomorphic to $M$. The embedding realizing this diffeomorphism is $\iota_q$. The tangent space $T_{(p,q)}(M\times N)$ at a point on the submanifold is mapped by $\rho_*$ onto $T_qN$, and the kernel of this map is exactly the image of the inclusion $\iota_{q*}$. The same statements hold with $M$ and $N$ swapped appropriately, which is what gives us a canonical decomposition in this case.

## Tangent Spaces and Regular Values

If we have a smooth map $f:M^m\to N^n$ and a regular value $q\in N$ of $f$, we know that the preimage $A=f^{-1}(q)$ is a smooth $(m-n)$-dimensional submanifold. It turns out that we also have a nice decomposition of the tangent space $T_pM$ for every point $p\in A$.

The key observation is that the inclusion $\iota:A\to M$ induces an inclusion of each tangent space by using the derivative $\iota_*:T_pA\to T_pM$. The directions in this subspace are those “tangent to” the submanifold $A$, and so these are the directions in which $f$ doesn’t change, “to first order”. Heuristically, in any direction tangent to $A$ we can set up a curve $c$ with that tangent vector which lies entirely within $A$. Along this curve, the value of $f$ is constantly $q$, and so the derivative of $f\circ c$ is zero. Since the derivative of $f$ in the direction $v=c'(0)$ only depends on $v$ and not the specific choice of curve $c$, we conclude that $f_*(v)$ should be zero.

This still feels a little handwavy. To be more precise, if $v\in T_pA$ and $\phi$ is a smooth function on a neighborhood of $q$, then we calculate

$$\left[f_*\left(\iota_*(v)\right)\right](\phi)=\left[(f\circ\iota)_*(v)\right](\phi)=v\left(\phi\circ f\circ\iota\right)=0$$

since $f\circ\iota$ is the constant map with value $q$, and any tangent vector applied to a constant function is automatically zero. Thus we conclude that $\iota_*\left(T_pA\right)\subseteq\ker(f_*)$. In fact, we can say more. The rank-nullity theorem tells us that the dimension of $\ker(f_*)$ and the dimension of $\mathrm{im}(f_*)$ add up to the dimension of $T_pM$, which of course is $m$. But the assumption that $p$ is a regular point means that the rank of $f_*$ is $n$, so the dimension of the kernel is $m-n$. And this is exactly the dimension of $A$, and thus of its tangent space $T_pA$! Since the subspace $\iota_*\left(T_pA\right)$ has the same dimension as $\ker(f_*)$, we conclude that they are in fact equal.
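
We can sanity-check this argument on a concrete map. The sketch below (my sample, not from the text) takes $f(x,y,z)=x^2+y^2+z^2$ with regular value $1$, so that $A$ is the unit sphere, and verifies both the rank-nullity count at a point of $A$ and the vanishing of the derivative of $f$ along a curve inside $A$:

```python
import sympy as sp

x, y, z, t = sp.symbols('x y z t')
f = x**2 + y**2 + z**2            # sample f: R^3 -> R; q = 1 is a regular value

# Derivative of f at p = (0, 0, 1), a point of the preimage A = f^{-1}(1)
J = sp.Matrix([f]).jacobian([x, y, z]).subs({x: 0, y: 0, z: 1})  # [0, 0, 2]

# Rank-nullity with m = 3, n = 1: rank is 1 at a regular point, so the
# kernel has dimension m - n = 2, matching the dimension of A
assert J.rank() == 1
assert len(J.nullspace()) == 3 - 1

# A curve lying entirely in A: f is constant along it, so its derivative is 0
f_on_curve = f.subs({x: sp.sin(t), y: 0, z: sp.cos(t)})
assert sp.simplify(sp.diff(f_on_curve, t)) == 0
```
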

What does this mean? It tells us that not only are the tangent directions to $A$ contained in the kernel of the derivative $f_*$, every vector in the kernel is tangent to $A$. Thus we can break down any tangent vector in $T_pM$ into a part that goes “along” $A$ and a part that goes across it. Unfortunately, this isn’t really canonical, since we don’t have a specific complementary subspace to $T_pA$ in mind. Still, it’s a useful framework to keep in mind, reinforcing the idea that near the submanifold $A$ the manifold $M$ “looks like” the product of $\mathbb{R}^{m-n}$ (from $A$) and $\mathbb{R}^n$, and we can even pick coordinates that reflect this “decomposition”.

## Spheres as Submanifolds

With our extension of the implicit function theorem in hand, we have another way of getting at the sphere, this time as a submanifold.

Start with the Euclidean space $\mathbb{R}^{n+1}$ and take the smooth function $f:\mathbb{R}^{n+1}\to\mathbb{R}$ defined by $f(x)=\langle x,x\rangle$. In components, this is $f(x)=\sum_{i=1}^{n+1}\left(x^i\right)^2$, where the $x^i$ are the canonical coordinates on $\mathbb{R}^{n+1}$. We can easily calculate the derivative in these coordinates: $df=2x^1\,dx^1+\dots+2x^{n+1}\,dx^{n+1}$. This is the zero function if and only if $x=0$, and so $f_*$ has rank $1$ at any nonzero point $x$. The point $0$ is a critical point, and every other point is regular.

On the image side, we see that $f(0)=0$, so the only critical value is $0$. Every other value is regular, though $f^{-1}(a)$ is empty for $a<0$. For $a>0$ we have a nonempty preimage, which by our result is a manifold of dimension $n$. This is the $n$-dimensional sphere of radius $\sqrt{a}$, though we aren’t going to care so much about the radius for now.

Anyway, is this really the same sphere as before? Remember, when we first saw the two-dimensional sphere $S^2$ as an example, we picked coordinate patches by hand. Now we have the same set of points — those with a fixed squared-distance from the origin — but we might have a different differentiable manifold structure. But if we can show that the inclusion mapping that takes each of our handcrafted coordinate patches into $\mathbb{R}^3$ is an immersion, then they must be compatible with the submanifold structure.

We only really need to check this for a single patch, since all six are very similar. We take the local coordinates $(x,y)$ from our patch and the canonical coordinates on $\mathbb{R}^3$ to write out the inclusion map:

$$\iota(x,y)=\left(x,y,\sqrt{1-x^2-y^2}\right)$$

Then we use these coordinates to calculate the derivative

$$d\iota=\begin{pmatrix}1&0\\0&1\\\frac{-x}{\sqrt{1-x^2-y^2}}&\frac{-y}{\sqrt{1-x^2-y^2}}\end{pmatrix}$$

This clearly always has rank $2$ for $x^2+y^2<1$, and so the inclusion of our original sphere into $\mathbb{R}^3$ is an immersion, which must then be equivalent to the inclusion of the submanifold $f^{-1}(1)$, since they give the same subspace of $\mathbb{R}^3$.
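
As a quick symbolic check of this rank computation (the sample point inside the unit disk is my choice):

```python
import sympy as sp

x, y = sp.symbols('x y')
w = sp.sqrt(1 - x**2 - y**2)     # third coordinate on the upper hemisphere patch

# Inclusion of the coordinate patch of S^2 into R^3
incl = sp.Matrix([x, y, w])
J = incl.jacobian([x, y])

# The upper 2x2 block is the identity, so the rank is 2 wherever the patch
# is defined; verify at a sample interior point of the unit disk
assert J[:2, :] == sp.eye(2)
assert J.subs({x: sp.Rational(1, 2), y: 0}).rank() == 2
```
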

## Regular and Critical Points

Let $f:M^m\to N^n$ be a smooth map between manifolds. We say that a point $p\in M$ is a “regular point” if the derivative $f_*:T_pM\to T_{f(p)}N$ has rank $n$; otherwise, we say that $p$ is a “critical point”. A point $q\in N$ is called a “regular value” if its preimage $f^{-1}(q)$ contains no critical points.

The first thing to notice is that this is only nontrivial if $m\geq n$. If $m<n$ then $f_*$ can have rank at most $m<n$, and thus every point is critical. Another observation is that any $q$ not in the image of $f$ is automatically regular; since its preimage is empty, it cannot contain any critical points.

Regular values are useful because of the generalization of the first part of the implicit function theorem: if $q$ is a regular value of $f$, then $A=f^{-1}(q)$ is a topological manifold of dimension $m-n$. Or, to put it another way, $A$ is a submanifold of “codimension” $n$. Further, there is a unique differentiable structure for which $A$ is a smooth submanifold of $M$.

Indeed, let $(V,\psi)$ be a coordinate patch around $q$ with $\psi(q)=0$. Given $p\in A$, pick a coordinate patch $(U,\phi)$ of $M$ with $\phi(p)=0$. Let $\pi:\mathbb{R}^m\to\mathbb{R}^n$ be the projection onto the first $n$ components; let $\rho:\mathbb{R}^m\to\mathbb{R}^{m-n}$ be the projection onto the last $m-n$ components; and let $\iota:\mathbb{R}^{m-n}\to\mathbb{R}^m$ be the inclusion of the subspace whose first $n$ components are $0$.

Now, we can write down the composition $\psi\circ f\circ\phi^{-1}$, defined on a neighborhood of $0$ in $\mathbb{R}^m$. Since this has (by assumption) maximal rank at $0$, the implicit function theorem tells us that there is a coordinate patch $(W,\chi)$ in a neighborhood of $0$ such that $\left(\psi\circ f\circ\phi^{-1}\right)\circ\chi^{-1}=\pi$. So we can set $U'=\phi^{-1}(W)$, which is open in $M$, and get

$$\psi\circ f=\pi\circ\chi\circ\phi$$

on $U'$. Setting $\bar{\phi}=\chi\circ\phi$ we conclude that $\pi\left(\bar{\phi}(U'\cap A)\right)=0$, since all these points are sent by $f$ to $q$, whose preimage we are considering.

Now we claim that $\bar{\phi}(U'\cap A)$ is not just any subset of $\mathbb{R}^m$, but in fact $\bar{\phi}(U')\cap\iota\left(\mathbb{R}^{m-n}\right)$. Clearly $\bar{\phi}(U'\cap A)$ is contained in this intersection, since

$$\pi\left(\bar{\phi}(u)\right)=\psi\left(f(u)\right)=\psi(q)=0$$

for every $u\in U'\cap A$. On the other hand, if $x$ is in this intersection, then $x=\bar{\phi}(u)$ for a unique $u\in U'$ — unique because $\phi$ and $\chi$ are both coordinate maps and thus invertible — and we have

$$\psi\left(f(u)\right)=\pi\left(\bar{\phi}(u)\right)=\pi(x)$$

meaning that since the first $n$ components of $x$ must be $0$, we get $\psi(f(u))=0$, and thus $f(u)=q$ and $u\in U'\cap A$. Thus $\bar{\phi}(U'\cap A)=\bar{\phi}(U')\cap\iota\left(\mathbb{R}^{m-n}\right)$.

Therefore $\bar{\phi}$ maps $U'\cap A$ homeomorphically onto a neighborhood of $\bar{\phi}(p)$ in $\iota\left(\mathbb{R}^{m-n}\right)$, in the subspace topology induced by $\mathbb{R}^m$. But this means that $\rho\circ\bar{\phi}$ acts as a coordinate patch on $A$! Since every point $p\in A$ can be found in some local coordinate patch, $A$ is a topological manifold. For its differentiable structure we’ll just take the one induced by these patches.

Finally, we have to check that the inclusion $A\hookrightarrow M$ is smooth, so $A$ is a smooth submanifold — that its differentiable structure is compatible with that of $M$. But this is easy, since at any point $p\in A$ we can go through the above process and get all these functions. We check smoothness by using the local coordinates $\rho\circ\bar{\phi}$ on $U'\cap A$ and $\bar{\phi}$ on $U'$, concluding that in these coordinates the inclusion is exactly $\iota$, which is clearly smooth.
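
The whole construction can be played out concretely. In the sketch below (my example, not from the text) $f(x,y)=x^2+y^2$ with regular value $1$, so $A$ is the unit circle, and near $p=(1,0)$ the flattening coordinates $\bar{\phi}$ can be taken to be $(x^2+y^2-1,\,y)$ — first component $\psi\circ f$, remaining component copied over:

```python
import sympy as sp

x, y, t = sp.symbols('x y t')
f = x**2 + y**2                  # q = 1 is a regular value; A is the unit circle

# Candidate flattening coordinates near p = (1, 0): first the n = 1 component
# psi(f(x,y)) = f - 1, then one copied coordinate for the m - n = 1 directions
chi = sp.Matrix([f - 1, y])

# chi is a genuine coordinate map near p: its Jacobian there is invertible
assert chi.jacobian([x, y]).subs({x: 1, y: 0}).det() != 0

# A point lies on the circle exactly when the first chi-coordinate vanishes;
# check along the circle point (cos t, sin t)
first = chi[0].subs({x: sp.cos(t), y: sp.sin(t)})
assert sp.simplify(first) == 0
```
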

## Submanifolds

At last we can actually define submanifolds. If $S$ and $M$ are both manifolds with $S\subseteq M$ as topological spaces — the points of $S$ form a subset of the points of $M$ and the topology of $S$ agrees with the subspace topology from $M$ — then we say that $S$ is a submanifold of $M$ if the inclusion map $S\hookrightarrow M$ is an embedding. If the inclusion is only an immersion, we say that $S$ is an “immersed submanifold” of $M$.

Now, if $f:A\to M$ is any embedding of one manifold into another, then the image $f(A)$ is a submanifold, as defined above. Similarly, the image of an injective immersion is an immersed submanifold. The tricky bit here is that if we have a situation like the second of our pathological immersions, we have to consider the topology on the image that does *not* consider the endpoints to be “close” to the middle point on the curve that they approach.

This motivates us to define an equivalence relation on injective immersions into $M$: if $f:A\to M$ and $g:B\to M$ are two such maps, we consider them equivalent if there is a diffeomorphism $h:A\to B$ so that $f=g\circ h$. Clearly, this is reflexive (we just let $h$ be the identity map), symmetric (a diffeomorphism is invertible), and transitive (the composition of two diffeomorphisms is another one).

The nice thing about this equivalence relation is that every injective immersion is equivalent to the inclusion of a unique immersed submanifold, and so there is no real loss in speaking about an injective immersion as “being” an immersed submanifold. And of course the same goes for embeddings “being” submanifolds as well.

## Immersions are Locally Embeddings

In both of our pathological examples last time, the problems were very isolated. They depended on two separated parts of the domain manifold interacting with each other. And since manifolds can be carved up easily, we can always localize and find patches of the domain where the immersion map is a well-behaved embedding.

More specifically, if $f:M^m\to N^n$ is an immersion, with $f_*:T_pM\to T_{f(p)}N$ an injection for every $p\in M$, then for every point $p\in M$ there exists a neighborhood $U$ of $p$ and a coordinate map $\psi'$ around $f(p)$ so that a point $x$ in the coordinate patch lies in $f(U)$ if and only if $\psi'^{m+1}(x)=\dots=\psi'^n(x)=0$. Further, the restriction $f\vert_U$ is an embedding.

This is basically the actual extension of the second part of the implicit function theorem to manifolds. Appropriately, then, we’ll let $\iota:\mathbb{R}^m\to\mathbb{R}^n$ be the same inclusion into the first $m$ coordinates. We pick a coordinate map $\phi$ around $p$ with $\phi(p)=0$, and another map $\psi$ around $f(p)$ with $\psi(f(p))=0$. Then we get a map $\psi\circ f\circ\phi^{-1}$ from a neighborhood of $0\in\mathbb{R}^m$ to a neighborhood of $0\in\mathbb{R}^n$.

Now, the assumption on $f$ is that $f_*$ is injective, meaning it has maximal rank $m$ at every point. Since $\phi$ and $\psi$ are diffeomorphisms, the composite $\psi\circ f\circ\phi^{-1}$ also has maximal rank at $0$. The implicit function theorem tells us there is a coordinate map $\chi$ in some neighborhood of $0\in\mathbb{R}^n$ and a neighborhood $W$ of $0\in\mathbb{R}^m$ such that $\chi\circ\psi\circ f\circ\phi^{-1}=\iota$ on $W$.

We set $U=\phi^{-1}(W)$, and $\psi'=\chi\circ\psi$, restricting the domain of $\psi'$, if necessary. This establishes the first part of our assertion. Next we need to show that $f\vert_U$ is an embedding. But $f\vert_U=\psi'^{-1}\circ\iota\circ\phi\vert_U$, which is a composition of embeddings, and is thus an embedding itself.

If $f$ is already an embedding at the outset, then $f(U)=f(M)\cap V$ for some open $V\subseteq N$. In this case, with $\psi'$ as in the theorem — restricted, if necessary, so its domain lies in $V$ — we have

$$f(M)\cap\mathrm{dom}(\psi')=\left\{x\in\mathrm{dom}(\psi')\mid\psi'^{m+1}(x)=\dots=\psi'^n(x)=0\right\}$$

That is, there is always a set of local coordinates in $N$ so that the image of $f$ is locally the hyperplane spanned by the first $m$ of them.
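
This straightening can be carried out explicitly in a simple case. The sketch below (my sample, not from the text) embeds the parabola $f(t)=(t,t^2)$ in $\mathbb{R}^2$ and straightens it with the coordinates $\psi'(x,y)=(x,\,y-x^2)$, in which the image becomes the inclusion $\iota(t)=(t,0)$:

```python
import sympy as sp

t, x, y = sp.symbols('t x y')

# A sample embedding f: R -> R^2 (the parabola); m = 1, n = 2
f = sp.Matrix([t, t**2])

# Straightening coordinates psi' on R^2 (an illustrative choice)
psi = sp.Matrix([x, y - x**2])

# psi' is a valid coordinate map: its Jacobian is everywhere invertible
assert psi.jacobian([x, y]).det() == 1

# psi'(f(t)) = (t, 0): the image is locally the hyperplane where the
# last n - m = 1 coordinate vanishes
straightened = psi.subs({x: f[0], y: f[1]})
assert sp.simplify(straightened[0] - t) == 0
assert sp.simplify(straightened[1]) == 0
```
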

## Immersions and Embeddings

As we said before, the notion of a “submanifold” gets a little more complicated than a naïve, purely categorical approach might suggest. Instead, we work from the concepts of immersions and embeddings.

A map $f:M^m\to N^n$ of manifolds is called an “immersion” if the derivative $f_*:T_pM\to T_{f(p)}N$ is injective at every point $p\in M$. Immediately we can tell that this can only happen if $m\leq n$.

Notice now that this does not guarantee that $f$ itself is injective. For instance, if $M=\mathbb{R}$ and $N=\mathbb{R}^2$, then we can form the mapping $f(t)=\left(t^3-t,t^2-1\right)$. Using the coordinate $t$ on $\mathbb{R}$ and the coordinates $(x,y)$ on $\mathbb{R}^2$, we can calculate the derivative in coordinates:

$$f_*\left(\frac{d}{dt}\right)=\left(3t^2-1\right)\frac{\partial}{\partial x}+2t\frac{\partial}{\partial y}$$

The second component of this vector is only zero if $t$ itself is, but in this case the first component is $-1$, thus $f_*$ is never the zero map between the tangent spaces. But $f(-1)=f(1)=(0,0)$, so $f$ is not injective in terms of the underlying point sets of $\mathbb{R}$ and $\mathbb{R}^2$.

Courtesy of Wolfram Alpha, we can plot this map to see what’s going on.

The image of the curve crosses itself at the origin, but if we restrict ourselves to, say, the intervals $(-\infty,0)$ and $(0,\infty)$, there is no self-intersection in each interval.
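
The two claims about this example — that it is an immersion but not injective — are easy to verify symbolically:

```python
import sympy as sp

t = sp.Symbol('t')
f = sp.Matrix([t**3 - t, t**2 - 1])

# f is an immersion: the two components of the derivative (3t^2 - 1, 2t)
# never vanish simultaneously, so f_* is injective at every parameter value
df = f.diff(t)
assert not set(sp.solve(df[0], t)) & set(sp.solve(df[1], t))

# ...but f itself is not injective: t = 1 and t = -1 hit the same point
assert f.subs(t, 1) == f.subs(t, -1) == sp.Matrix([0, 0])
```
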

There is another, more subtle pathology to be careful about. Let $I$ be the open interval $(-\pi,\pi)$, and let $g:I\to\mathbb{R}^2$ be defined by $g(t)=\left(\sin(2t),\sin(t)\right)$. We plot this curve, stopping just slightly shy of each endpoint.

We see that there’s never quite a self-intersection like before, but the ends of the curve come right up to almost touch the curve in the middle. Going all the way to the limit, the image of $g$ is a figure eight, which includes the crossing point in the middle and is thus not a manifold, even though the parameter space $I$ is.
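
Again we can confirm the key features symbolically — the derivative never vanishes, the curve passes through the origin once, and both ends of the parameter interval approach that very same point:

```python
import sympy as sp

t = sp.Symbol('t')
g = sp.Matrix([sp.sin(2*t), sp.sin(t)])   # defined on the open interval (-pi, pi)

# g is an immersion: its derivative (2cos 2t, cos t) never vanishes, since
# cos t = 0 forces t = ±pi/2, where cos 2t = -1
dg = g.diff(t)
assert dg.subs(t, sp.pi/2) == sp.Matrix([-2, 0])
assert dg.subs(t, -sp.pi/2) == sp.Matrix([-2, 0])

# The crossing point g(0) = (0,0) is in the image, and the ends of the
# parameter interval approach the very same point without reaching it
assert g.subs(t, 0) == sp.Matrix([0, 0])
assert sp.limit(g[0], t, sp.pi, '-') == 0 and sp.limit(g[1], t, sp.pi, '-') == 0
```
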

To keep away from these pathologies, we define an “embedding” to be an immersion $f:M\to N$ where the image $f(M)$ — endowed with the subspace topology — is homeomorphic to $M$ itself by $f$. This is closer to the geometrically intuitive notion of a submanifold, but we will still find the notion of an immersion to be useful.

As a particular example, notice (and check!) that the inclusion map of an open submanifold, as defined earlier, is an embedding.

## The Implicit Function Theorem

We can also recall the implicit function theorem. This is less directly generalizable to manifolds, since talking about a function is effectively considering a manifold with a particular product structure: the product between the function’s domain and range.

Still, we can go back and clean up not only the statement of the implicit function theorem, but its proof, as well. And we can even extend to a different, related statement, all using the inverse function theorem for manifolds.

So, take a smooth function $f:\mathbb{R}^m\to\mathbb{R}^n$, where $m>n$. Suppose further that $f_*$ has maximal rank $n$ at a point $a$. If we write $\pi:\mathbb{R}^m\to\mathbb{R}^n$ for the projection onto the first $n$ components of $\mathbb{R}^m$, then there is some coordinate patch $(W,\chi)$ of $\mathbb{R}^m$ around $a$ so that $f\circ\chi^{-1}=\pi$ in that patch.

This is pretty much just like the original proof. We can clearly define $g:\mathbb{R}^m\to\mathbb{R}^m$ to agree with $f$ in its first $n$ components and just to copy over the $j$th component $x^j$ for $n<j\leq m$. That is,

$$g\left(x^1,\dots,x^m\right)=\left(f^1(x),\dots,f^n(x),x^{n+1},\dots,x^m\right)$$

Then $f=\pi\circ g$, and the Jacobian of $g$ is

$$dg=\begin{pmatrix}\left(\frac{\partial f^i}{\partial x^j}\right)_{j\leq n}&\left(\frac{\partial f^i}{\partial x^j}\right)_{j>n}\\0&I_{m-n}\end{pmatrix}$$

After possibly rearranging the arguments of $f$, we may assume that the $n\times n$ matrix in the upper-left has nonzero determinant — $f_*$ has rank $n$ at $a$, by assumption — and so the Jacobian of $g$ also has nonzero determinant. By the inverse function theorem, $g$ has a neighborhood $W$ of $a$ on which it’s a diffeomorphism; we write $\chi=g\vert_W$. Thus on $W$ we conclude

$$f\circ\chi^{-1}=\pi$$

This is basically the implicit function theorem from before. But now let’s consider what happens when $m<n$. Again, we assume that $f_*$ has maximal rank — this time it’s $m$ — at a point $a$. If we write $\iota:\mathbb{R}^m\to\mathbb{R}^n$ for the inclusion of $\mathbb{R}^m$ into the first $m$ components of $\mathbb{R}^n$, then I say that there is a coordinate patch $(W,\chi)$ around $f(a)$ so that $\chi\circ f=\iota$ in a neighborhood of $a$.

This time, we take the product $\mathbb{R}^n=\mathbb{R}^m\times\mathbb{R}^{n-m}$ and define the function $g:\mathbb{R}^m\times\mathbb{R}^{n-m}\to\mathbb{R}^n$ by

$$g(x,y)=f(x)+\left(0,\dots,0,y^1,\dots,y^{n-m}\right)$$

Then $g\circ\iota=f$, and the Jacobian of $g$ at $(a,0)$ is

$$dg=\begin{pmatrix}\left(\frac{\partial f^i}{\partial x^j}\right)_{i\leq m}&0\\\left(\frac{\partial f^i}{\partial x^j}\right)_{i>m}&I_{n-m}\end{pmatrix}$$

Just as before, by rearranging the components of $f$ we can assume that the determinant of the $m\times m$ matrix in the upper-left is nonzero, and thus the determinant of the whole Jacobian is nonzero. And thus $g$ is a diffeomorphism on some neighborhood of $(a,0)$. We let $W$ be the image of this neighborhood, and write $\chi=g^{-1}$ on $W$. Thus on some neighborhood of $a$ we conclude

$$\chi\circ f=\iota$$

Either way, the conclusion is that we can always pick local coordinates on the larger-dimensional space so that $f$ is effectively just a simple inclusion or projection with respect to those coordinates.
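
A concrete instance of the first construction, with $m=2$, $n=1$ (the particular function $f(x,y)=x+y^2$ is my sample choice, not from the text):

```python
import sympy as sp

x, y, u, v = sp.symbols('x y u v')

# A sample f: R^2 -> R of maximal rank everywhere
f = x + y**2

# g agrees with f in its first component and copies the remaining argument
g = sp.Matrix([f, y])
assert g.jacobian([x, y]).det() != 0      # so g is a local diffeomorphism chi

# Invert g explicitly: g(x, y) = (u, v) gives x = u - v^2, y = v
g_inv = sp.Matrix([u - v**2, v])
assert sp.simplify(g.subs({x: g_inv[0], y: g_inv[1]}) - sp.Matrix([u, v])) == sp.Matrix([0, 0])

# In the chi-coordinates, f is exactly the projection onto the first component
assert sp.simplify(f.subs({x: g_inv[0], y: g_inv[1]}) - u) == 0
```
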

## The Inverse Function Theorem

Recall the inverse function theorem from multivariable calculus: if $f:U\to\mathbb{R}^n$ is a map defined on an open region $U\subseteq\mathbb{R}^n$, and if the Jacobian of $f$ has maximal rank $n$ at a point $a\in U$, then there is some neighborhood $V\subseteq U$ of $a$ so that the restriction $f\vert_V$ is a diffeomorphism onto its image. This is slightly different than how we stated it before, but it’s a pretty straightforward translation.

Anyway, this generalizes immediately to more general manifolds. We know that the proper generalization of the Jacobian is the derivative $f_*$ of a smooth map $f:U\to N$, where $U$ is an open region of an $n$-manifold $M$ and $N$ is another $n$-manifold. If the derivative $f_*$ has maximal rank at a point $p\in U$, then there is some neighborhood $V$ of $p$ for which $f\vert_V$ is a diffeomorphism onto its image.

Well, this is actually pretty simple to prove. Just take coordinates $\phi$ at $p$ and $\psi$ at $f(p)$. We can restrict the domain $U$ of $f$ to assume that $f(U)$ is entirely contained in the coordinate patch of $\psi$. Then we can set up the function $\psi\circ f\circ\phi^{-1}:\phi(U)\to\mathbb{R}^n$.

Since $f_*$ has maximal rank, so does the matrix of $f_*$ with respect to the bases of coordinate vectors $\frac{\partial}{\partial\phi^i}$ and $\frac{\partial}{\partial\psi^j}$, which is exactly the Jacobian of $\psi\circ f\circ\phi^{-1}$. Thus the original inverse function theorem applies to show that there is some $V'\subseteq\phi(U)$ on which $\psi\circ f\circ\phi^{-1}$ is a diffeomorphism. Since the coordinate maps $\phi$ and $\psi$ are diffeomorphisms we can write $V=\phi^{-1}(V')$, and conclude that $f\vert_V=\psi^{-1}\circ\left(\psi\circ f\circ\phi^{-1}\right)\circ\phi\vert_V$ is a diffeomorphism, as asserted.
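
A classic illustration of the local nature of this theorem (the polar-coordinate map is my sample, not from the text): it has maximal rank wherever $r>0$, so it is a diffeomorphism near every such point, even though it is not globally injective.

```python
import sympy as sp

r, th = sp.symbols('r theta', positive=True)

# The polar-coordinate map on the open region r > 0
f = sp.Matrix([r*sp.cos(th), r*sp.sin(th)])

# Its Jacobian determinant is r, nonzero on the region, so around any point
# the map restricts to a diffeomorphism -- even though theta and theta + 2pi
# give the same image point, so the map is not globally invertible
assert sp.simplify(f.jacobian([r, th]).det() - r) == 0
```
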

## Cotangent Vectors, Differentials, and the Cotangent Bundle

There’s another construct in differential topology and geometry that isn’t quite so obvious as a tangent vector, but which is every bit as useful: a cotangent vector. A cotangent vector at a point $p\in M$ is just an element of the dual space to $T_pM$, which we write as $T^*_pM$.

We actually have a really nice example of cotangent vectors already: a gadget that takes a tangent vector at a point $p$ and gives back a number. It’s the differential of a smooth function, which when given a vector returns the directional derivative in that direction. And we can generalize that right away.

Indeed, if $f$ is a smooth germ at $p$, then we have a linear functional $v\mapsto v(f)$ defined for all tangent vectors $v\in T_pM$. We will call this functional the differential of $f$ at $p$, and write $df(p)$, so that $\left[df(p)\right](v)=v(f)$.

If we have local coordinates $(U,x)$ at $p$, then each coordinate function $x^i$ is a smooth function, which has differential $dx^i(p)$. These actually furnish the dual basis to the coordinate vectors $\frac{\partial}{\partial x^i}(p)$. Indeed, we calculate

$$\left[dx^i(p)\right]\left(\frac{\partial}{\partial x^j}(p)\right)=\frac{\partial x^i}{\partial x^j}(p)=\delta^i_j$$

That is, evaluating the coordinate differential $dx^i(p)$ on the coordinate vector $\frac{\partial}{\partial x^j}(p)$ gives the value $1$ if $i=j$ and $0$ otherwise.

Of course, the $dx^i(p)$ define a basis of $T^*_pM$ at every point $p\in U$, just like the $\frac{\partial}{\partial x^i}(p)$ define a basis of $T_pM$ at every point $p\in U$. This was exactly what we needed to compare vectors — at least to some extent — at points within a local coordinate patch, and let us define the tangent bundle $TM$ as a $2n$-dimensional manifold.

In exactly the same way, we can define the cotangent bundle $T^*M$. Given the coordinate patch $(U,x)$ we define a coordinate patch covering all the cotangent spaces $T^*_pM$ with $p\in U$. The coordinate map is defined on a cotangent vector $\lambda\in T^*_pM$ by

$$\lambda\mapsto\left(x^1(p),\dots,x^n(p),\lambda\left(\frac{\partial}{\partial x^1}(p)\right),\dots,\lambda\left(\frac{\partial}{\partial x^n}(p)\right)\right)$$

Everything else in the construction of the cotangent bundle proceeds exactly as it did for the tangent bundle, but we’re missing one thing: how to translate from one basis of coordinate differentials to another.

So, let’s say $(U,x)$ and $(V,y)$ are two coordinate maps at $p$, defining coordinate differentials $dx^i(p)$ and $dy^j(p)$. How are these two bases related? We can calculate this by applying $dy^j(p)$ to $\frac{\partial}{\partial x^i}(p)$:

$$\left[dy^j(p)\right]\left(\frac{\partial}{\partial x^i}(p)\right)=\frac{\partial y^j}{\partial x^i}(p)$$

where the $\frac{\partial y^j}{\partial x^i}(p)$ are the components of the Jacobian matrix of the transition function $y\circ x^{-1}$. What does this mean? Well, consider the linear functional

$$\sum\limits_{i=1}^n\frac{\partial y^j}{\partial x^i}(p)dx^i(p)$$

This has the same values on each of the $\frac{\partial}{\partial x^i}(p)$ as $dy^j(p)$ does, and we conclude that they are, in fact, the same cotangent vector:

$$dy^j(p)=\sum\limits_{i=1}^n\frac{\partial y^j}{\partial x^i}(p)dx^i(p)$$

On the other hand, recall that

$$\frac{\partial}{\partial x^i}(p)=\sum\limits_{j=1}^n\frac{\partial y^j}{\partial x^i}(p)\frac{\partial}{\partial y^j}(p)$$

That is, we use the Jacobian of the transition function to transform from the basis $\left\{\frac{\partial}{\partial y^j}\right\}$ to the basis $\left\{\frac{\partial}{\partial x^i}\right\}$ of $T_pM$, but the transpose of the same Jacobian to transform from the basis $\left\{dx^i\right\}$ to the basis $\left\{dy^j\right\}$ of $T^*_pM$. And this is actually just as we expect, since the transpose is actually the adjoint transformation, which automatically connects the dual spaces.
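
These two transformation rules fit together so that the pairing of a covector with a vector doesn’t depend on the coordinates used. A numerical sketch (the particular transition function and component values are my sample choices):

```python
import numpy as np

# Jacobian J[j][i] = dy^j/dx^i of a sample transition function, say
# y(x) = (x^1 + (x^2)^2, x^2), evaluated at the point with x = (1, 2)
J = np.array([[1.0, 4.0],
              [0.0, 1.0]])

# Components of a tangent vector transform by J: if v has components v_x
# in the x-coordinate basis, its components in the y-basis are J @ v_x
v_x = np.array([3.0, -1.0])
v_y = J @ v_x

# Components of a cotangent vector transform by the transpose: if lambda
# has components lam_y in the dy basis, its dx components are J.T @ lam_y
lam_y = np.array([0.5, 4.0])
lam_x = J.T @ lam_y

# The pairing of a covector with a vector comes out the same either way
assert np.isclose(lam_x @ v_x, lam_y @ v_y)
```
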