# The Unapologetic Mathematician

## The Tangent Space of a Product

Let $M^m$ and $N^n$ be smooth manifolds, with $M\times N$ the $m+n$-dimensional product manifold. Given points $p\in M$ and $q\in N$ we want to investigate the tangent space $\mathcal{T}_{(p,q)}M\times N$ of this product at the point $(p,q)$.

For some notation, remember that we have the projections $\pi_1:M\times N\to M$ and $\pi_2:M\times N\to N$. Also, if we have a point $q\in N$ we get a smooth inclusion mapping $i_q:M\to M\times N$ defined by $i_q(p)=(p,q)$. Similarly, given a point $p\in M$ we get an inclusion map $j_p:N\to M\times N$ defined by $j_p(q)=(p,q)$. These maps satisfy the relations

$\displaystyle\begin{aligned}\pi_1\circ i_q&=1_M\\\pi_2\circ j_p&=1_N\\\pi_1\circ j_p&=p\\\pi_2\circ i_q&=q\end{aligned}$

where the last two are the constant maps with the given values. We can thus use the chain rule to calculate the derivatives of these relations

$\displaystyle\begin{aligned}\pi_{1*(p,q)}\circ i_{q*p}&=1_{\mathcal{T}_pM}\\\pi_{2*(p,q)}\circ j_{p*q}&=1_{\mathcal{T}_qN}\\\pi_{1*(p,q)}\circ j_{p*q}&=0\\\pi_{2*(p,q)}\circ i_{q*p}&=0\end{aligned}$

These are four of the five relations we need to show that $\mathcal{T}_{(p,q)}M\times N$ decomposes as the direct sum of $\mathcal{T}_pM$ and $\mathcal{T}_qN$. The remaining one states

$\displaystyle 1_{\mathcal{T}_{(p,q)}M\times N}=i_{q*p}\circ\pi_{1*(p,q)}+j_{p*q}\circ\pi_{2*(p,q)}=L\circ(\pi_{1*(p,q)},\pi_{2*(p,q)})$

where $L(u,v)=i_{q*p}(u)+j_{p*q}(v)$ is a linear map from $\mathcal{T}_pM\times\mathcal{T}_qN$ to $\mathcal{T}_{(p,q)}M\times N$. The real content of the first four relations is effectively that

$\displaystyle (\pi_{1*(p,q)},\pi_{2*(p,q)})\circ L=1_{\mathcal{T}_pM\times\mathcal{T}_qN}$

That is, we know that $L$ is a right-inverse of $(\pi_{1*(p,q)},\pi_{2*(p,q)})$, and we want to know if it’s a left-inverse as well. But this follows since both vector spaces $\mathcal{T}_{(p,q)}M\times N$ and $\mathcal{T}_pM\times\mathcal{T}_qN$ have dimension $m+n$: a one-sided inverse between vector spaces of the same finite dimension is automatically two-sided. Thus the tangent space of the product decomposes canonically as the direct sum of the tangent spaces of the factors. In terms of our geometric intuition, there are directions we can go “along $M$”, and directions “along $N$”, and any other direction we can go in $M\times N$ is a linear combination of one of each.
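
As a concrete sketch of this decomposition, assume we pick local coordinates $(x^1,\dots,x^m)$ around $p$ and $(y^1,\dots,y^n)$ around $q$, and use $(x^1\circ\pi_1,\dots,x^m\circ\pi_1,y^1\circ\pi_2,\dots,y^n\circ\pi_2)$ as coordinates around $(p,q)$. These product coordinates, the abbreviated names for their coordinate vectors, and the components $a^i$ and $b^j$ are all choices made here for illustration. A tangent vector $v$ at $(p,q)$ then splits as

$\displaystyle\begin{aligned}v&=\sum_{i=1}^ma^i\frac{\partial}{\partial x^i}(p,q)+\sum_{j=1}^nb^j\frac{\partial}{\partial y^j}(p,q)\\\pi_{1*(p,q)}(v)&=\sum_{i=1}^ma^i\frac{\partial}{\partial x^i}(p)\\\pi_{2*(p,q)}(v)&=\sum_{j=1}^nb^j\frac{\partial}{\partial y^j}(q)\\L\left(\pi_{1*(p,q)}(v),\pi_{2*(p,q)}(v)\right)&=v\end{aligned}$

so the $a^i$ record the part of $v$ along $M$ and the $b^j$ record the part along $N$.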

Note how this dovetails with our discussion of submanifolds. The projection $\pi_1:M\times N\to M$ is a smooth map, and every point $p\in M$ is a regular value. Its preimage $\pi_1^{-1}(p)$ is a submanifold diffeomorphic to $N$. The embedding realizing this diffeomorphism is $j_p$. The tangent space at a point $(p,q)$ on the submanifold $j_p(N)$ is mapped by $\pi_{1*(p,q)}$ to $\mathcal{T}_pM$, and the kernel of this map is exactly the image of the inclusion $j_{p*q}$. The same statements hold with $M$ and $N$ swapped appropriately, which is what gives us a canonical decomposition in this case.

April 27, 2011

## Tangent Spaces and Regular Values

If we have a smooth map $f:M^m\to N^n$ and a regular value $q\in N$ of $f$, we know that the preimage $f^{-1}(q)=A\subseteq M$ is a smooth $m-n$-dimensional submanifold. It turns out that we also have a nice decomposition of the tangent space $\mathcal{T}_pM$ for every point $p\in A$.

The key observation is that the inclusion $\iota:A\to M$ induces an inclusion of each tangent space by using the derivative $\iota_{*p}(\mathcal{T}_pA)\subseteq\mathcal{T}_pM$. The directions in this subspace are those “tangent to” the submanifold $A$, and so these are the directions in which $f$ doesn’t change, “to first order”. Heuristically, in any direction $v$ tangent to $A$ we can set up a curve $\gamma$ with that tangent vector which lies entirely within $A$. Along this curve, the value of $f$ is constantly $q\in N$, and so the derivative of $f\circ\gamma$ is zero. Since the derivative of $f$ in the direction $v$ only depends on $v$ and not the specific choice of curve $\gamma$, we conclude that $f_{*p}(v)$ should be zero.

This still feels a little handwavy. To be more precise, if $v\in\mathcal{T}_pA$ and $\phi$ is a smooth function on a neighborhood of $q\in N$, then we calculate

$\displaystyle\begin{aligned}\left[f_{*p}(\iota_{*p}(v))\right](\phi)&=\left[\left[f\circ\iota\right]_{*p}(v)\right](\phi)\\&=v\left(\phi\circ f\circ\iota\right)\\&=v\left(\phi(q)\right)\\&=0\end{aligned}$

since any tangent vector applied to a constant function is automatically zero. Thus we conclude that $\iota_{*p}(\mathcal{T}_pA)\subseteq\mathrm{Ker}(f_{*p})$. In fact, we can say more. The rank-nullity theorem tells us that the dimension of $\mathrm{Ker}(f_{*p})$ and the dimension of $\mathrm{Im}(f_{*p})$ add up to the dimension of $\mathcal{T}_pM$, which of course is $m$. But the assumption that $p$ is a regular point means that the rank of $f_{*p}$ is $n=\dim(N)$, so the dimension of the kernel is $m-n$. And this is exactly the dimension of $A$, and thus of its tangent space $\mathcal{T}_pA$! Since the subspace $\iota_{*p}(\mathcal{T}_pA)$ has the same dimension as $\mathrm{Ker}(f_{*p})$, we conclude that they are in fact equal.

What does this mean? It tells us that not only are the tangent directions to $A$ contained in the kernel of the derivative $f_*$, every vector in the kernel is tangent to $A$. Thus we can break down any tangent vector in $\mathcal{T}_pM$ into a part that goes “along” $A$ and a part that goes across it. Unfortunately, this isn’t really canonical, since we don’t have a specific complementary subspace to $\mathcal{T}_pA$ in mind. Still, it’s a useful framework to keep in mind, reinforcing the idea that near the submanifold $A$ the manifold $M$ “looks like” the product of $\mathbb{R}^{m-n}$ (from $A$) and $\mathbb{R}^n$, and we can even pick coordinates that reflect this “decomposition”.
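
For a hedged illustration, take the function $f(x)=\langle x,x\rangle$ on $\mathbb{R}^{n+1}$, with derivative $f_{*p}(v)=2\langle p,v\rangle$, and its regular value $1$, so that $A$ is the unit sphere. Here we identify each tangent space of $\mathbb{R}^{n+1}$ with $\mathbb{R}^{n+1}$ itself, and the complement spanned by $p$ is a choice made for this example; any other complement would serve as well. For $p\in A$ we get

$\displaystyle\begin{aligned}\mathcal{T}_pA=\mathrm{Ker}(f_{*p})&=\left\{v\in\mathbb{R}^{n+1}\,\vert\,\langle p,v\rangle=0\right\}\\v&=\left(v-\langle p,v\rangle p\right)+\langle p,v\rangle p\end{aligned}$

The first term is tangent to the sphere, since $\langle p,p\rangle=1$, and the second points across it. The inner product is what singles out this particular complement, and that is extra structure beyond the smooth map $f$ alone.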

April 26, 2011

## Spheres as Submanifolds

With our extension of the implicit function theorem in hand, we have another way of getting at the sphere, this time as a submanifold.

Start with the Euclidean space $\mathbb{R}^{n+1}$ and take the smooth function $f:\mathbb{R}^{n+1}\to\mathbb{R}$ defined by $f(x)=\langle x,x\rangle$. In components, this is $\sum_i\left(x^i\right)^2$, where the $x^i$ are the canonical coordinates on $\mathbb{R}^{n+1}$. We can easily calculate the derivative in these coordinates: $f_{*x}(v)=2\langle x,v\rangle$. This is the zero function if and only if $x=0$, and so $f_{*x}$ has rank $1$ at any nonzero point $x$. The point $x=0$ is a critical point, and every other point is regular.

On the image side, we see that $f(0)=0$, so the only critical value is $0$. Every other value is regular, though $f^{-1}(y)$ is empty for $y<0$. For $f^{-1}(a^2)$ we have a nonempty preimage, which by our result is a manifold of dimension $(n+1)-1=n$. This is the $n$-dimensional sphere of radius $a$, though we aren’t going to care so much about the radius for now.

Anyway, is this really the same sphere as before? Remember, when we first saw the two-dimensional sphere as an example, we picked coordinate patches by hand. Now we have the same set of points — those with a fixed squared-distance from the origin — but we might have a different differentiable manifold structure. But if we can show that the inclusion mapping that takes each of our handcrafted coordinate patches into $\mathbb{R}^3$ is an immersion, then they must be compatible with the submanifold structure.

We only really need to check this for a single patch, since all six are very similar. We take the local coordinates from our patch and the canonical coordinates on $\mathbb{R}^3$ to write out the inclusion map:

$\displaystyle g(x,y)=\left(x,y,\sqrt{1-x^2-y^2}\right)$

Then we use these coordinates to calculate the derivative

$\displaystyle\begin{aligned}g_{*(x,y)}(u,v)&=\left(1,0,\frac{-x}{\sqrt{1-x^2-y^2}}\right)u+\left(0,1,\frac{-y}{\sqrt{1-x^2-y^2}}\right)v\\&=\left(u,v,\frac{-xu-yv}{\sqrt{1-x^2-y^2}}\right)\end{aligned}$

This clearly always has rank $2$ for $x^2+y^2<1$, and so the inclusion of our original sphere into $\mathbb{R}^3$ is an immersion, which must then be equivalent to the inclusion of the submanifold $f^{-1}(1)$, since they give the same subspace of $\mathbb{R}^3$.
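
As a quick consistency check (a sketch using only the formulas $f(x)=\langle x,x\rangle$ and $f_{*x}(v)=2\langle x,v\rangle$ from this post), the image of $g_{*(x,y)}$ should land in the kernel of $f_{*g(x,y)}$, since $f\circ g$ is constantly $1$:

$\displaystyle\begin{aligned}f_{*g(x,y)}\left(g_{*(x,y)}(u,v)\right)&=2\left\langle\left(x,y,\sqrt{1-x^2-y^2}\right),\left(u,v,\frac{-xu-yv}{\sqrt{1-x^2-y^2}}\right)\right\rangle\\&=2\left(xu+yv-xu-yv\right)\\&=0\end{aligned}$

So every vector tangent to the handcrafted patch is indeed tangent to the level set $f^{-1}(1)$.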

April 25, 2011

## Regular and Critical Points

Let $f:M^m\to N^n$ be a smooth map between manifolds. We say that a point $p\in M$ is a “regular point” if the derivative $f_{*p}$ has rank $n$; otherwise, we say that $p$ is a “critical point”. A point $q\in N$ is called a “regular value” if its preimage $f^{-1}(q)$ contains no critical points.

The first thing to notice is that this is only nontrivial if $m\geq n$. If $m<n$ then $f_{*p}$ can have rank at most $m$, and thus every point is critical. Another observation is that if $q\notin f(M)$ then $q$ is automatically regular; if its preimage is empty then it cannot contain any critical points.

Regular values are useful because of the generalization of the first part of the implicit function theorem: if $q$ is a regular value of $f:M\to N$, then $A=f^{-1}(q)\subseteq M$ is a topological manifold of dimension $m-n$. Or, to put it another way, $A$ is a submanifold of “codimension” $n=\dim(N)$. Further, there is a unique differentiable structure for which $A$ is a smooth submanifold of $M$.

Indeed, let $(V,y)$ be a coordinate patch around $q$ with $y(q)=0$. Given $p\in A$, pick a coordinate patch $(U,x)$ of $M$ with $x(p)=0$. Let $\pi_1:\mathbb{R}^m\to\mathbb{R}^n$ be the projection onto the first $n$ components; let $\pi_2:\mathbb{R}^m\to\mathbb{R}^{m-n}$ be the projection onto the last $m-n$ components; and let $\iota_2:\mathbb{R}^{m-n}\to\mathbb{R}^m$ be the inclusion of the subspace whose first $n$ components are $0$.

Now, we can write down the composition $y\circ f\circ x^{-1}$. Since this has (by assumption) maximal rank at $0\in\mathbb{R}^m$, the implicit function theorem tells us that there is a coordinate patch $(W,h)$ in a neighborhood of $0$ such that $y\circ f\circ x^{-1}\circ h=\pi_1\vert_W$. So we can set $\tilde{W}=\pi_2(W)$, which is open in $\mathbb{R}^{m-n}$, and get

$\displaystyle y\circ f\circ x^{-1}\circ h\circ\iota_2\vert_{\tilde{W}}=\pi_1\circ\iota_2\vert_{\tilde{W}}=0$

Setting $z=x^{-1}\circ h\circ\iota_2\vert_{\tilde{W}}$ we conclude that $z(\tilde{W})\subseteq A$, since all these points are sent by $f$ to the preimage $y^{-1}(0)=q$.

Now we claim that $z(\tilde{W})$ is not just any subset of $A$, but in fact $z(\tilde{W})=A\cap x^{-1}(h(W))$. Clearly $z(\tilde{W})$ is contained in this intersection, since

$\displaystyle z(\tilde{W})=x^{-1}(h(\iota_2(\tilde{W})))=x^{-1}(h(W\cap(0\times\mathbb{R}^{m-n})))$

On the other hand, if $\tilde{p}$ is in this intersection, then $\tilde{p}=x^{-1}(h(u))$ for a unique $u\in W$ — unique because $x$ and $h$ are both coordinate maps and thus invertible — and we have

$\displaystyle0=y(f(\tilde{p}))=y(f(x^{-1}(h(u))))=\pi_1(u)$

meaning that the first $n$ components of $u$ must be $0$, and thus $u\in(0\times\tilde{W})$. Thus $\tilde{p}\in z(\tilde{W})$.

Therefore $z$ maps $\tilde{W}\subseteq\mathbb{R}^{m-n}$ homeomorphically onto a neighborhood of $p\in A$ in the subspace topology induced by $M$. But this means that $(z(\tilde{W}),z^{-1})$ acts as a coordinate patch on $A$! Since every point $p\in A$ can be found in some local coordinate patch, $A$ is a topological manifold. For its differentiable structure we’ll just take the one induced by these patches.

Finally, we have to check that the inclusion $\iota:A\to M$ is smooth, so $A$ is a smooth submanifold — that its differentiable structure is compatible with that of $M$. But this is easy, since at any point $p$ we can go through the above process and get all these functions. We check smoothness by using local coordinates $x$ on $M$ and $z^{-1}$ on $A$, concluding that $x\circ\iota\circ(z^{-1})^{-1}=h\circ\iota_2$, which is clearly smooth.
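
To see why the regularity hypothesis matters, here is a small example chosen for illustration (it does not appear elsewhere in these posts): take $f:\mathbb{R}^2\to\mathbb{R}$ with $f(x,y)=x^2-y^2$.

$\displaystyle\begin{aligned}f_{*(x,y)}(u,v)&=2xu-2yv\\f_{*(x,y)}=0&\iff(x,y)=(0,0)\\f^{-1}(c)&=\left\{(x,y)\,\vert\,x^2-y^2=c\right\}\\f^{-1}(0)&=\left\{(x,y)\,\vert\,y=x\right\}\cup\left\{(x,y)\,\vert\,y=-x\right\}\end{aligned}$

The only critical point is the origin, so the only critical value is $0$. For $c\neq0$ the preimage is a hyperbola, a one-dimensional submanifold as the theorem promises. The preimage of the critical value $0$ is a pair of lines crossing at the origin, which is not a manifold near the crossing point.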

April 21, 2011

## Submanifolds

At last we can actually define submanifolds. If $M$ and $N$ are both manifolds with $M\subseteq N$ as topological spaces — the points of $M$ form a subset of the points of $N$ and the topology of $M$ agrees with the subspace topology from $N$ — then we say that $M$ is a submanifold of $N$ if the inclusion map $\iota:M\to N$ is an embedding. If the inclusion is only an immersion, we say that $M$ is an “immersed submanifold” of $N$.

Now, if $f:M\to N$ is any embedding of one manifold into another, then the image $f(M)\subseteq N$ is a submanifold, as defined above. Similarly, the image of an injective immersion is an immersed submanifold. The tricky bit here is that if we have a situation like the second of our pathological immersions, we have to consider the topology on the image that does not consider the endpoints to be “close” to the middle point on the curve that they approach.

This motivates us to define an equivalence relation on injective immersions into $N$: if $f_1:M_1\to N$ and $f_2:M_2\to N$ are two maps, we consider them equivalent if there is a diffeomorphism $g:M_1\to M_2$ so that $f_1=f_2\circ g$. Clearly, this is reflexive (we just let $g$ be the identity map), symmetric (a diffeomorphism $g$ is invertible), and transitive (the composition of two diffeomorphisms is another one).

The nice thing about this equivalence relation is that every injective immersion is equivalent to a unique immersed submanifold, and so there is no real loss in speaking about an injective immersion $f:M\to N$ as “being” an immersed submanifold. And of course the same goes for embeddings “being” submanifolds as well.

April 20, 2011

## Immersions are Locally Embeddings

In both of our pathological examples last time, the problems were very isolated. They depended on two separated parts of the domain manifold interacting with each other. And since manifolds can be carved up easily, we can always localize and find patches of the domain where the immersion map is a well-behaved embedding.

More specifically, if $f:M^m\to N^n$ is an immersion, with $f_{*p}$ an injection for every $p\in M$, then for every point $p$ there exists a neighborhood $U\subseteq M$ of $p$ and a coordinate patch $(V,y)$ around $f(p)\in N$ so that $q\in f(U)\cap V$ if and only if $y^{m+1}(q)=\dots=y^n(q)=0$. Further, the restriction $f\vert_U$ is an embedding.

This is basically the actual extension of the second part of the implicit function theorem to manifolds. Appropriately, then, we’ll let $\iota:\mathbb{R}^m\to\mathbb{R}^n$ be the same inclusion into the first $m$ coordinates. We pick a coordinate map $x$ around $p$ with $x(p)=0$, and another map $\tilde{y}$ around $f(p)$ with $\tilde{y}(f(p))=0$. Then we get a map $\tilde{y}\circ f\circ x^{-1}$ from a neighborhood of $0\in\mathbb{R}^m$ to a neighborhood of $0\in\mathbb{R}^n$.

Now, the assumption on $f$ is that $f_{*p}$ is injective, meaning it has maximal rank $m$ at every point. Since $x^{-1}$ and $\tilde{y}$ are diffeomorphisms, the composite also has maximal rank $m$ at $0$. The implicit function theorem tells us there is a coordinate map $g$ in some neighborhood of $0\in\mathbb{R}^n$ and a neighborhood $W$ of $0\in\mathbb{R}^m$ such that $g\circ\tilde{y}\circ f\circ x^{-1}\vert_W=\iota\vert_W$.

We set $U=x^{-1}(W)\subseteq M$, and $y=g\circ\tilde{y}$, restricting the domain of $g$, if necessary. This establishes the first part of our assertion. Next we need to show that $f\vert_U$ is an embedding. But $f\vert_U=y^{-1}\circ\iota\circ x\vert_U$, which is a composition of embeddings, and is thus an embedding itself.

If $f$ is already an embedding at the outset, then $f(U)=f(M)\cap W$ for some open $W\subseteq N$. In this case, with $V$ as in the theorem, we have

$\displaystyle f(M)\cap V=\{q\in V\vert y^{m+1}(q)=\dots=y^n(q)=0\}$

That is, there is always a set of local coordinates in $N$ so that the image of $M$ is locally the hyperplane spanned by the first $m$ of them.
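
A minimal sketch of such coordinates, using a map chosen here for illustration: take $m=1$, $n=2$, and the parabola $f:\mathbb{R}\to\mathbb{R}^2$ below, with $(a,b)$ standing for the canonical coordinates on $\mathbb{R}^2$.

$\displaystyle\begin{aligned}f(t)&=(t,t^2)\\y(a,b)&=(a,b-a^2)\\y(f(t))&=(t,0)=\iota(t)\\f(\mathbb{R})&=\left\{q\in\mathbb{R}^2\,\vert\,y^2(q)=0\right\}\end{aligned}$

The map $y$ is a diffeomorphism of $\mathbb{R}^2$, with inverse $(a,b)\mapsto(a,b+a^2)$, so it is a legitimate coordinate map. In these coordinates the parabola is straightened out into the line where the second coordinate vanishes.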

April 19, 2011

## Immersions and Embeddings

As we said before, the notion of a “submanifold” gets a little more complicated than a naïve, purely categorical approach might suggest. Instead, we work from the concepts of immersions and embeddings.

A map $f:M^m\to N^n$ of manifolds is called an “immersion” if the derivative $f_{*p}:\mathcal{T}_pM\to\mathcal{T}_{f(p)}N$ is injective at every point $p\in M$. Immediately we can tell that this can only happen if $m\leq n$.

Notice now that this does not guarantee that $f$ itself is injective. For instance, if $M=\mathbb{R}^1$ and $N=\mathbb{R}^2$, then we can form the mapping $f(t)=(t-t^3,1-t^2)$. Using the coordinates $t$ on $M$ and $(x,y)$ on $N$, we can calculate the derivative in coordinates:

$\displaystyle f_{*t}:\frac{\partial}{\partial t}(t)\mapsto(1-3t^2)\frac{\partial}{\partial x}(f(t))-2t\frac{\partial}{\partial y}(f(t))$

The second component of this vector is only zero if $t$ itself is, but in this case the first component is $1$, thus $f_{*t}$ is never the zero map between the tangent spaces. But $f(1)=f(-1)=(0,0)$, so $f$ is not injective in terms of the underlying point sets of $M$ and $N$.

Courtesy of Wolfram Alpha, we can plot this map to see what’s going on:

The image of the curve crosses itself at the origin, but if we restrict ourselves to, say, the intervals $(-2,0)$ and $(0,2)$, there is no self-intersection in each interval.

There is another, more subtle pathology to be careful about. Let $M$ be the open interval $(0,2\pi)$, and let $f(t)=(\sin(t),\sin(2t))$. We plot this curve, stopping just slightly shy of each endpoint:

We see that there’s never quite a self-intersection like before, but the ends of the curve come right up to almost touch the curve in the middle. Going all the way to the limit, the image of $f$ is a figure eight, which includes the crossing point in the middle and is thus not a manifold, even though the parameter space is.
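
To back up the claim that this $f$ is still an immersion, here is a short check in the coordinate $t$; the computation is a sketch added for illustration.

$\displaystyle\begin{aligned}f'(t)&=\left(\cos(t),2\cos(2t)\right)\\\cos(t)=0&\Rightarrow t\in\left\{\tfrac{\pi}{2},\tfrac{3\pi}{2}\right\}\Rightarrow2\cos(2t)=-2\neq0\\\lim_{t\to0^+}f(t)&=\lim_{t\to2\pi^-}f(t)=(0,0)=f(\pi)\end{aligned}$

So the derivative never vanishes on $(0,2\pi)$, while both ends of the interval approach the image of the middle point $t=\pi$. This is why the subspace topology on the image differs from the topology of the interval.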

To keep away from these pathologies, we define an “embedding” to be an immersion where the image $f(M)\subseteq N$ — endowed with the subspace topology — is homeomorphic to $M$ itself by $f$. This is closer to the geometrically intuitive notion of a submanifold, but we will still find the notion of an immersion to be useful.

As a particular example, notice (and check!) that the inclusion map of an open submanifold, as defined earlier, is an embedding.

April 18, 2011

## The Implicit Function Theorem

We can also recall the implicit function theorem. This is less directly generalizable to manifolds, since talking about a function is effectively considering a manifold with a particular product structure: the product between the function’s domain and range.

Still, we can go back and clean up not only the statement of the implicit function theorem, but its proof, as well. And we can even extend to a different, related statement, all using the inverse function theorem for manifolds.

So, take a smooth function $f:U\to\mathbb{R}^n$, where $U\subseteq\mathbb{R}^m$ with $m\geq n$. Suppose further that $f$ has maximal rank $n$ at a point $p\in U$. If we write $\pi:\mathbb{R}^m\to\mathbb{R}^n$ for the projection onto the first $n$ components of $\mathbb{R}^m$, then there is some coordinate patch $(V,h)$ of $\mathbb{R}^m$ around $p$ so that $f\circ h^{-1}=\pi$ in that patch.

This is pretty much just like the original proof. We can clearly define $F:U\to\mathbb{R}^m$ to agree with $f$ in its first $n$ components and just to copy over the $j$th component for $n+1\leq j\leq m$. That is,

$\displaystyle F(a^1,\dots,a^m)=\left(f^1(a^1,\dots,a^m),\dots,f^n(a^1,\dots,a^m),a^{n+1},\dots,a^m\right)$

Then $f=\pi\circ F$, and the Jacobian of $F$ at $p$ is

$\displaystyle\begin{pmatrix}\left(D_jf^i(p)\right)_{1\leq j\leq n}&\left(D_jf^i(p)\right)_{n+1\leq j\leq m}\\{0}&I_{m-n}\end{pmatrix}$

After possibly rearranging the arguments of $f$, we may assume that the matrix in the upper-left has nonzero determinant, since $f$ has rank $n$ at $p$ by assumption, and so the Jacobian of $F$ also has nonzero determinant. By the inverse function theorem, there is a neighborhood $V\subseteq U$ of $p$ on which $F$ restricts to a diffeomorphism $F\vert_V=h:V\to h(V)\subseteq\mathbb{R}^m$. Thus on $h(V)$ we conclude

$\displaystyle f\circ h^{-1}=\pi\circ F\circ h^{-1}=\pi$

This is basically the implicit function theorem from before. But now let’s consider what happens when $m\leq n$. Again, we assume that $f$ has maximal rank — this time it’s $m$ — at a point $p\in U$. If we write $\iota:\mathbb{R}^m\to\mathbb{R}^n$ for the inclusion of $\mathbb{R}^m$ into the first $m$ components of $\mathbb{R}^n$, then I say that there is a coordinate patch $(V,g)$ around $f(p)$ so that $g\circ f=\iota$ in a neighborhood of $p$.

This time, we take the product $U\times\mathbb{R}^{n-m}\subseteq\mathbb{R}^n$ and define the function $F:U\times\mathbb{R}^{n-m}\to\mathbb{R}^n$ by

$\displaystyle F(a^1,\dots,a^n)=f(a^1,\dots,a^m)+\left(0,\dots,0,a^{m+1},\dots,a^n\right)$

Then $F\circ\iota=f$, and the Jacobian of $F$ at $\iota(p)=(p,0)$ is

$\displaystyle\begin{pmatrix}\left(D_jf^i(p)\right)_{1\leq i\leq m}&0\\\left(D_jf^i(p)\right)_{m+1\leq i\leq n}&I_{n-m}\end{pmatrix}$

Just as before, by rearranging the components of $f$ we can assume that the determinant of the matrix in the upper-left is nonzero, and thus the determinant of the whole Jacobian is nonzero. And thus $F$ is a diffeomorphism on some neighborhood $W\subseteq U\times\mathbb{R}^{n-m}$. We let $V=F(W)$ be the image of this neighborhood, and write $g=\left(F\vert_W\right)^{-1}:V\to W\subseteq\mathbb{R}^n$. Thus on some neighborhood we conclude

$\displaystyle g\circ f=g\circ F\circ\iota=\iota$

Either way, the conclusion is that we can always pick local coordinates on the larger-dimensional space so that $f$ is effectively just a simple inclusion or projection with respect to those coordinates.
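
A small worked instance of the projection case may help; the function, the point, and the names $b^1,b^2$ below are chosen here purely for illustration. Take $m=2$, $n=1$, $f(a^1,a^2)=(a^1)^2+(a^2)^2$, and $p=(1,0)$:

$\displaystyle\begin{aligned}F(a^1,a^2)&=\left((a^1)^2+(a^2)^2,a^2\right)\\\det\begin{pmatrix}2a^1&2a^2\\{0}&1\end{pmatrix}&=2a^1=2\quad\text{at }p\\h^{-1}(b^1,b^2)&=\left(\sqrt{b^1-(b^2)^2},b^2\right)\\f\left(h^{-1}(b^1,b^2)\right)&=b^1-(b^2)^2+(b^2)^2=b^1\end{aligned}$

Here $h=F\vert_V$ on a small enough neighborhood $V$ of $p$, and we take the positive square root because $a^1$ is near $1$. In the coordinates $(b^1,b^2)$ near $F(p)=(1,0)$, the function $f$ is just the projection onto the first component.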

April 15, 2011

## The Inverse Function Theorem

Recall the inverse function theorem from multivariable calculus: if $f:U\to\mathbb{R}^n$ is a $C^1$ map defined on an open region $U\subseteq\mathbb{R}^n$, and if the Jacobian of $f$ has maximal rank $n$ at a point $p\in U$ then there is some neighborhood $V$ of $p$ so that the restriction $f\vert_V:V\to f(V)\subseteq\mathbb{R}^n$ is a diffeomorphism. This is slightly different than how we stated it before, but it’s a pretty straightforward translation.

Anyway, this generalizes immediately to more general manifolds. We know that the proper generalization of the Jacobian is the derivative of a smooth map $f:U\to N$, where $U\subseteq M$ is an open region of an $n$-manifold and $N$ is another $n$-manifold. If the derivative $f_{*p}:\mathcal{T}_pM\to\mathcal{T}_{f(p)}N$ has maximal rank $n$ at $p$, then there is some neighborhood $V\subseteq M$ of $p$ for which $f\vert_V:V\to f(V)\subseteq N$ is a diffeomorphism.

Well, this is actually pretty simple to prove. Just take coordinates $x$ at $p\in M$ and $y$ at $f(p)\in N$. We can restrict the domain of $f$ to assume that $U$ is entirely contained in the $x$ coordinate patch. Then we can set up the function $y\circ f\circ x^{-1}:x(U)\to\mathbb{R}^n$.

Since $f_{*p}$ has maximal rank, so does its matrix with respect to the bases of coordinate vectors $\frac{\partial}{\partial x^i}$ and $\frac{\partial}{\partial y^j}$, which is exactly the Jacobian of $y\circ f\circ x^{-1}$. Thus the original inverse function theorem applies to show that there is some $W\subseteq x(U)$ on which $y\circ f\circ x^{-1}$ is a diffeomorphism. Since the coordinate maps $x$ and $y$ are diffeomorphisms we can write $W=x(V)$ for some $V\subseteq M$, and conclude that $f:V\to f(V)$ is a diffeomorphism, as asserted.

April 14, 2011

## Cotangent Vectors, Differentials, and the Cotangent Bundle

There’s another construct in differential topology and geometry that isn’t quite so obvious as a tangent vector, but which is every bit as useful: a cotangent vector. A cotangent vector $\lambda$ at a point $p\in M$ is just an element of the dual space to $\mathcal{T}_pM$, which we write as $\mathcal{T}^*_pM$.

We actually have a really nice example of cotangent vectors already: a gadget that takes a tangent vector at $p$ and gives back a number. It’s the differential, which when given a vector returns the directional derivative in that direction. And we can generalize that right away.

Indeed, if $f$ is a smooth germ at $p$, then we have a linear functional $v\mapsto v(f)$ defined for all tangent vectors $v\in\mathcal{T}_pM$. We will call this functional the differential of $f$ at $p$, and write $\left[df(p)\right](v)=v(f)$.

If we have local coordinates $(U,x)$ at $p$, then each coordinate function $x^i$ is a smooth function, which has differential $dx^i(p)$. These actually furnish the dual basis to the coordinate vectors $\frac{\partial}{\partial x^i}(p)$. Indeed, we calculate

$\displaystyle\begin{aligned}\left[dx^i(p)\right]\left(\frac{\partial}{\partial x^j}(p)\right)&=\left[\frac{\partial}{\partial x^j}(p)\right](x^i)\\&=\left[D_j(u^i\circ x\circ x^{-1})\right](x(p))\\&=\delta_j^i\end{aligned}$

That is, evaluating the coordinate differential $dx^i(p)$ on the coordinate vector $\frac{\partial}{\partial x^j}(p)$ gives the value $1$ if $i=j$ and $0$ otherwise.
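
As a sketch of how the dual basis gets used, write a tangent vector as $v=\sum_jv^j\frac{\partial}{\partial x^j}(p)$ (the component names $v^j$ are introduced here for illustration). Then $\left[dx^i(p)\right](v)=v^i$, and the differential of any smooth germ $f$ at $p$ expands in the coordinate differentials:

$\displaystyle\begin{aligned}\left[df(p)\right](v)&=v(f)=\sum_jv^j\left[\frac{\partial}{\partial x^j}(p)\right](f)=\sum_j\left[\frac{\partial}{\partial x^j}(p)\right](f)\left[dx^j(p)\right](v)\\df(p)&=\sum_j\left[\frac{\partial}{\partial x^j}(p)\right](f)\,dx^j(p)\end{aligned}$

This recovers the familiar formula for the differential in terms of partial derivatives.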

Of course, the $dx^j(p)$ define a basis of $\mathcal{T}^*_pM$ at every point $p\in U$, just like the $\frac{\partial}{\partial x^j}(p)$ define a basis of $\mathcal{T}_pM$ at every point $p\in U$. This was exactly what we needed to compare vectors — at least to some extent — at points within a local coordinate patch, and let us define the tangent bundle as a $2n$-dimensional manifold.

In exactly the same way, we can define the cotangent bundle $\mathcal{T}^*M$. Given the coordinate patch $(U,x)$ we define a coordinate patch covering all the cotangent spaces $\mathcal{T}^*_pM$ with $p\in U$. The coordinate map is defined on a cotangent vector $\lambda\in\mathcal{T}^*_pM$ by

$\displaystyle\tilde{x}(\lambda)=\left(x^1(p),\dots,x^n(p),\lambda\left(\frac{\partial}{\partial x^1}(p)\right),\dots,\lambda\left(\frac{\partial}{\partial x^n}(p)\right)\right)$

Everything else in the construction of the cotangent bundle proceeds exactly as it did for the tangent bundle, but we’re missing one thing: how to translate from one basis of coordinate differentials to another.

So, let’s say $x$ and $y$ are two coordinate maps at $p$, defining coordinate differentials $dx^i(p)$ and $dy^j(p)$. How are these two bases related? We can calculate this by applying $dy^j(p)$ to $\frac{\partial}{\partial x^i}(p)$:

$\displaystyle\begin{aligned}\left[dy^j(p)\right]\left(\frac{\partial}{\partial x^i}(p)\right)&=\left[\frac{\partial}{\partial x^i}(p)\right](y^j)\\&=\left[D_i(u^j\circ y\circ x^{-1})\right](x(p))\\&=J_i^j(p)\end{aligned}$

where $J_i^j(p)$ are the components of the Jacobian matrix of the transition function $y\circ x^{-1}$. What does this mean? Well, consider the linear functional

$\displaystyle\sum\limits_iJ_i^j(p)dx^i(p)$

This has the same values on each of the $\frac{\partial}{\partial x^i}(p)$ as $dy^j$ does, and we conclude that they are, in fact, the same cotangent vector:

$\displaystyle dy^j(p)=\sum\limits_iJ_i^j(p)dx^i(p)$

On the other hand, recall that

$\displaystyle\frac{\partial}{\partial x^i}(p)=\sum\limits_jJ_i^j(p)\frac{\partial}{\partial y^j}(p)$

That is, we use the Jacobian of one transition function to transform from the $dx^i(p)$ basis to the $dy^j(p)$ basis of $\mathcal{T}^*_pM$, but the transpose of the same Jacobian to transform from the $\frac{\partial}{\partial x^i}(p)$ basis to the $\frac{\partial}{\partial y^j}(p)$ basis of $\mathcal{T}_pM$. And this is actually just as we expect, since the transpose is actually the adjoint transformation, which automatically connects the dual spaces.
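
For a concrete case, assume $x=(x^1,x^2)$ are the usual Cartesian coordinates on the plane away from the origin and $y=(y^1,y^2)=(r,\theta)$ are polar coordinates, so that $r^2=(x^1)^2+(x^2)^2$. This pair of charts is an example chosen here, not one taken from the discussion above. Suppressing the point $p$, the two transformation rules read

$\displaystyle\begin{aligned}dr&=\frac{x^1}{r}dx^1+\frac{x^2}{r}dx^2\\d\theta&=-\frac{x^2}{r^2}dx^1+\frac{x^1}{r^2}dx^2\\\frac{\partial}{\partial x^1}&=\frac{x^1}{r}\frac{\partial}{\partial r}-\frac{x^2}{r^2}\frac{\partial}{\partial\theta}\\\frac{\partial}{\partial x^2}&=\frac{x^2}{r}\frac{\partial}{\partial r}+\frac{x^1}{r^2}\frac{\partial}{\partial\theta}\end{aligned}$

The same four partial derivatives $J_i^j$ appear in both sets of formulas: each differential $dy^j$ collects the entries with fixed $j$, while each coordinate vector $\frac{\partial}{\partial x^i}$ collects the entries with fixed $i$.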

April 13, 2011