# The Unapologetic Mathematician

## One-Parameter Groups

Let $\Phi:\mathbb{R}\times M\to M$ be a differentiable map. For each $t\in\mathbb{R}$ we can define the differentiable map $\Phi_t:M\to M$ by $\Phi_t(p)=\Phi(t,p)$. We call the collection $\left\{\Phi_t\right\}_{t\in\mathbb{R}}$ a “one-parameter group” of diffeomorphisms — since it has, obviously, a single parameter — so long as it satisfies the two conditions $\Phi_0=1_M$ and $\Phi_{t_1+t_2}=\Phi_{t_1}\circ\Phi_{t_2}$. That is, $t\mapsto\Phi_t$ is a homomorphism from the additive group of real numbers to the diffeomorphism group of $M$. Indeed, each $\Phi_t$ is a diffeomorphism of $M$ — a differentiable isomorphism of the manifold to itself — with inverse $\Phi_{-t}$.

If we define a vector field $X$ by $X(p)=\Phi_*\frac{\partial}{\partial t}(0,p)$ then $\Phi$ is a flow for this vector field. Indeed, it’s a maximal flow, since it’s defined for all time at each point.

Conversely, if $\Phi:W\to M$ is the maximal flow of a vector field $X\in\mathfrak{X}M$, then $\Phi$ defines something like a one-parameter group. Indeed, “flowing forward” by $t_2$ and then again by $t_1$ is the same as flowing forward by $t_1+t_2$ along each integral curve, and so $\Phi_{t_1+t_2}=\Phi_{t_1}\circ\Phi_{t_2}$ wherever both sides of this equation are well-defined. But they might not be, since even if both $(t_2,p)$ and $(t_1,\Phi(t_2,p))$ are in $W$ the point $(t_1+t_2,p)$ might not be. But if every integral curve can be extended for all times, then we call the vector field “complete” and conclude that its maximal flow is a one-parameter group of diffeomorphisms.
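To make the group law concrete, here is a small numerical sketch; the complete vector field $X(x)=x$ on $M=\mathbb{R}$ and its flow $\Phi(t,p)=pe^t$ are my own illustrative choices, not anything asserted above.

```python
import math

# An illustrative complete vector field on M = R: X(x) = x.
# Its maximal flow is Phi(t, p) = p * e^t, defined for all t,
# so {Phi_t} is a one-parameter group of diffeomorphisms.

def X(x):
    return x

def Phi(t, p):
    return p * math.exp(t)

p = 1.7
t1, t2 = 0.3, -1.1

# Phi_0 is the identity
assert abs(Phi(0.0, p) - p) < 1e-12

# Group law: Phi_{t1+t2} = Phi_{t1} o Phi_{t2}
assert abs(Phi(t1 + t2, p) - Phi(t1, Phi(t2, p))) < 1e-12

# Phi is a flow for X: d/dt Phi(t,p) = X(Phi(t,p)), checked by a
# central finite difference in the time direction
h = 1e-6
deriv = (Phi(t1 + h, p) - Phi(t1 - h, p)) / (2 * h)
assert abs(deriv - X(Phi(t1, p))) < 1e-6
```

Each $\Phi_t$ here is invertible with inverse $\Phi_{-t}$, exactly as the group law predicts.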

May 31, 2011

## The Maximal Flow of a Vector Field

Given a smooth vector field $X\in\mathfrak{X}M$ we know what it means for a curve $c$ to be an integral curve of $X$. We even know how to find them by starting at a point $p$ and solving differential equations as far out as we can. For every $p\in M$, let $I_p$ be the maximal open interval containing $0$ on which we can define the integral curve $\Phi_p$ with $\Phi_p(0)=p$.

Now, I say that there is a unique open set $W\subseteq\mathbb{R}\times M$ and a unique smooth map $\Phi:W\to M$ such that $W\cap(\mathbb{R}\times\{p\})=I_p\times\{p\}$ — the set $W$ cuts out the interval $I_p$ from the copy of $\mathbb{R}$ at $p$ — and further $\Phi(t,p)=\Phi_p(t)$ for all $(t,p)\in W$. This is called the “maximal flow” of $X$.

Since there is some integral curve through each point $p\in M$, we can see that $\{0\}\times M\subseteq W$. Further, it should be immediately apparent that $\Phi$ is also a local flow. What needs to be proven is that $W$ is open, and that $\Phi$ is smooth.

Given a $p\in M$, let $I\subseteq I_p$ be the collection of $t$ for which there is a neighborhood of $(t,p)$ contained in $W$ on which $\Phi$ is differentiable. We will show that $I$ is nonempty, open, and closed in $I_p$, meaning that it must be the whole interval.

Nonemptiness is obvious, since it just means that $p$ is contained in some local flow, which we showed last time. Openness also follows directly from the definition of $I$.

As for closedness, let $t_0$ be any point in $\bar{I}$, the closure of $I$. We know there exists some local flow $\Phi':I'\times V'\to M$ with $0\in I'$ and $\Phi_p(t_0)\in V'$. Now pick a $t_1\in I$ close enough to $t_0$ so that $t_0-t_1\in I'$ and $\Phi_p(t_1)\in V'$ — this is possible since $t_0$ is in the closure of $I$ and $\Phi_p$ is continuous. Then choose an interval $I_0$ around $t_0$ so that $t-t_1\in I'$ for each $t\in I_0$. And finally the continuity of $\Phi$ at $(t_1,p)$ tells us that there is a neighborhood $V$ of $p$ so that $\Phi(\{t_1\}\times V)\subseteq V'$.

Now, $\Phi$ is defined and differentiable on $I_0\times V$, showing that $t_0\in I$. Indeed, if $t\in I_0$ and $q\in V$, then $t-t_1\in I'$ and $\Phi(t_1,q)\in V'$, so $\Phi'(t-t_1,\Phi(t_1,q))$ is defined. The curve $s\mapsto\Phi'(s-t_1,\Phi(t_1,q))$ is an integral curve of $X$, and at $s=t_1$ it equals $\Phi(t_1,q)$. Uniqueness tells us that $\Phi(t,q)=\Phi'(t-t_1,\Phi(t_1,q))$ is defined, and $\Phi$ is thus differentiable at $(t,q)$.

May 30, 2011

## Integral Curves and Local Flows

Let $X\in\mathfrak{X}M$ be a vector field on the manifold $M$ and let $q$ be any point in $M$. Then I say there exists a neighborhood $V\subseteq M$ of $q$, an interval $I\subseteq\mathbb{R}$ around $0$, and a differentiable map $\Phi:I\times V\to M$ such that

\displaystyle\begin{aligned}\Phi(0,p)&=p\\\Phi_*\left(\frac{\partial}{\partial t}(t,p)\right)&=X\left(\Phi(t,p)\right)\end{aligned}

for all $t\in I$ and $p\in V$. These should look familiar, since they’re very similar to the conditions we wrote down for the flow of a differential equation.

It might help a bit to clarify that $\frac{\partial}{\partial t}(t,p)$ is the inclusion $\iota_{p*}\left(\frac{d}{dt}(t)\right)$ of the canonical vector $\frac{d}{dt}(t)\in\mathcal{T}_t\mathbb{R}$ which points in the direction of increasing $t$. That is, $\iota_p:I\to I\times V$ includes the interval $I$ into $I\times V$ “at the point $p\in V$”, and thus its derivative carries along its tangent bundle. At each point of an (oriented) interval $I$ there’s a canonical vector, and $\frac{\partial}{\partial t}(t,p)$ is the image of that vector.

Further, take note that we can write the left side of our second condition as

$\displaystyle\Phi_*\left(\iota_{p*}\left(\frac{d}{dt}(t)\right)\right)$

The chain rule lets us combine these two outer derivatives into one:

$\displaystyle\left[\Phi\circ\iota_p\right]_*\left(\frac{d}{dt}(t)\right)$

But this is exactly how we defined the derivative of a curve! That is, we can write down a function $c=\Phi\circ\iota_p:I\to M$ which satisfies $c'(t)=X(c(t))$ for every $t\in I$. We call such a curve an “integral curve” of the vector field $X$, and when they’re collected together as in $\Phi$ we call it a “local flow” of $X$.

So how do we prove this? We just take local coordinates and use our good old existence theorem! Indeed, if $(U,x)$ is a coordinate patch around $q$ then we can set $G=x(U)$, $a=x(q)$, and

$\displaystyle F=(X^1,\dots,X^n)\circ x^{-1}:G\to\mathbb{R}^n$

where the $X^i$ are the components $Xx^i$ of $X$ relative to the given local coordinates.

Now our existence theorem tells us there is a neighborhood $W\subseteq G$ of $a$, an interval $I$ around $0$, and a map $\psi:I\times W\to G$ satisfying the conditions for a flow. Setting $V=x^{-1}(W)$ and $\Phi(t,p)=x^{-1}\left(\psi(t,x(p))\right)$ we find our local flow.

We can also do the same thing with our uniqueness theorem: if $c$ and $\tilde{c}$ are two integral curves of $X$ defined on the same interval $I$, and if $c(t_0)=\tilde{c}(t_0)$ for some $t_0\in I$, then $c=\tilde{c}$.

Thus we find the geometric meaning of that messy foray into analysis: a smooth vector field has a smooth local flow around every point, and integral curves of vector fields are unique.
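As a concrete sketch of constructing a local flow in coordinates, here the abstract existence theorem is stood in for by an off-the-shelf numerical integrator; the rotation field $F(x,y)=(-y,x)$ and the RK4 scheme are my own illustrative choices.

```python
import math

# Illustrative local flow in coordinates: integrate v' = F(v) numerically
# for the rotation field F(x, y) = (-y, x), whose exact integral curves
# are rotations of the starting point by angle t.

def F(v):
    x, y = v
    return (-y, x)

def rk4_flow(t, p, steps=1000):
    """Approximate Phi(t, p) by integrating v' = F(v) with classical RK4."""
    h = t / steps
    v = p
    for _ in range(steps):
        k1 = F(v)
        k2 = F((v[0] + h/2*k1[0], v[1] + h/2*k1[1]))
        k3 = F((v[0] + h/2*k2[0], v[1] + h/2*k2[1]))
        k4 = F((v[0] + h*k3[0], v[1] + h*k3[1]))
        v = (v[0] + h/6*(k1[0] + 2*k2[0] + 2*k3[0] + k4[0]),
             v[1] + h/6*(k1[1] + 2*k2[1] + 2*k3[1] + k4[1]))
    return v

p = (1.0, 0.0)
t = 1.0
approx = rk4_flow(t, p)
exact = (math.cos(t), math.sin(t))  # rotation of p by angle t
assert abs(approx[0] - exact[0]) < 1e-9
assert abs(approx[1] - exact[1]) < 1e-9
```

The uniqueness statement shows up numerically too: any two accurate integrations started from the same point at the same time stay together.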

May 28, 2011

## Identifying Vector Fields

We know what vector fields are on a region $U\subseteq M$, but to identify them in the wild we need to verify that a given function sending each $p\in U$ to a vector in $\mathcal{T}_pM$ is smooth. This might not always be so easy to check directly, so we need some equivalent conditions. First we need to define how vector fields act on functions.

If $X\in\mathfrak{X}U$ is a vector field and $f\in\mathcal{O}U$ is a smooth function then we get another function $Xf$ by defining $Xf(p)=\left[X(p)\right](f)$. Indeed, $X(p)\in\mathcal{T}_pM$, so it can take (the germ of) a smooth function at $p$ and give us a number. Essentially, at each point the vector field defines a displacement, and we ask how the function $f$ changes along this displacement. This action is key to our conditions, and to how we will actually use vector fields.

Firstly, if $X$ is a vector field — a differentiable function — and if $(V,x)$ is a chart with $V\subseteq U$, then $Xx^i$ is always smooth. Indeed, remember that $(V,x)$ gives us a coordinate patch $(\pi^{-1}(V),\bar{x})$ on the tangent bundle. Since $\bar{x}$ is smooth and $X$ is smooth, the composition

$\displaystyle\bar{x}\circ X\vert_V=(x\vert_V;X\vert_V(x^1),\dots,X\vert_V(x^n))$

is also smooth. And thus each component $Xx^i$ is smooth on $V$.

Next, we do not assume that $X$ is a vector field — it is a function but not necessarily a differentiable one — but we assume that it satisfies the conclusion of the preceding paragraph. That is, for every chart $(V,x)$ with $V\subseteq U$ each $Xx^i$ is smooth. Now we will show that $Xf$ is smooth for every smooth $f\in\mathcal{O}V$, not just those that arise as coordinate functions. To see this, we use the decomposition of $X$ into coordinate vector fields:

$\displaystyle X=\sum\limits_{i=1}^nX^i\frac{\partial}{\partial x^i}$

which didn’t assume that $X$ was smooth, except to show that the coefficient functions were smooth. We can now calculate that $X^i=Xx^i$, since

$\displaystyle Xx^i=\sum\limits_{j=1}^nX^j\frac{\partial x^i}{\partial x^j}=\sum\limits_{j=1}^nX^j\delta^i_j=X^i$

But this means we can write

$\displaystyle Xf=\sum\limits_{i=1}^nXx^i\frac{\partial f}{\partial x^i}$

which makes $Xf$ a linear combination of the smooth (by assumption) functions $Xx^i$ with the coefficients $\frac{\partial f}{\partial x^i}$, proving that it is itself smooth.
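This component formula can be checked numerically; the particular field $X=y\,\partial/\partial x+x^2\,\partial/\partial y$ on $\mathbb{R}^2$ and the test function $f$ below are my own illustrative choices.

```python
# Numerical check that a vector field acts on functions by
#   Xf = sum_i (X x^i) * (partial f / partial x^i)
# in standard coordinates on R^2, using central finite differences.

h = 1e-6

def Xcomp(p):
    """Components X^1, X^2 of the illustrative field X = y d/dx + x^2 d/dy."""
    x, y = p
    return (y, x * x)

def f(p):
    x, y = p
    return x * y + y ** 3

def Xf_direct(p):
    """(Xf)(p) as the derivative of f along t -> p + t*X(p) at t = 0."""
    x, y = p
    vx, vy = Xcomp(p)
    return (f((x + h * vx, y + h * vy)) - f((x - h * vx, y - h * vy))) / (2 * h)

def Xf_formula(p):
    """(Xf)(p) via the component formula sum_i X^i * df/dx^i."""
    x, y = p
    vx, vy = Xcomp(p)
    dfdx = (f((x + h, y)) - f((x - h, y))) / (2 * h)
    dfdy = (f((x, y + h)) - f((x, y - h))) / (2 * h)
    return vx * dfdx + vy * dfdy

p = (0.7, -1.2)
assert abs(Xf_direct(p) - Xf_formula(p)) < 1e-5
```

Here the coordinate functions $x^i$ are just the standard coordinates, so $Xx^i$ is literally the $i$th component function.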

Okay, now I say that if $Xf$ is smooth for every smooth function $f\in\mathcal{O}V$ on some region $V\subseteq U$, then $X$ is smooth as a function, and thus is a vector field. In this case around any $p\in U$ we can find some coordinate patch $(V,x)$. Now we go back up to the composition above:

$\displaystyle\bar{x}\circ X\vert_V=(x\vert_V;X\vert_V(x^1),\dots,X\vert_V(x^n))$

Everything in sight on the right is smooth, and so the left is also smooth. But this is exactly what we need to check when we’re using the local coordinates $(V,x)$ and $(\pi^{-1}(V),\bar{x})$ to verify the smoothness of $X$ at $p$.

The upshot is that when we want to verify that a function $X$ really is a smooth vector field, we take an arbitrary smooth “test function” and feed it into $X$. If the result is always smooth, then $X$ is smooth. In fact, some authors take this as the definition, regarding the action of $X$ on functions as fundamental, and only later talking in terms of its “value at a point”.

May 25, 2011

## Coordinate Vector Fields

If we consider an open subset $U\subseteq M$ along with a suitable map $x:U\to\mathbb{R}^n$ such that $(U,x)$ is a coordinate patch, it turns out that we can actually give an explicit basis of the module $\mathfrak{X}U$ of vector fields over the ring $\mathcal{O}U$.

Indeed, at each point $p\in U$ we can define the $n$ coordinate vectors:

$\displaystyle\frac{\partial}{\partial x^i}(p)\in\mathcal{T}_pM$

Thus each $\frac{\partial}{\partial x^i}$ itself qualifies as a vector field on $U$ as long as the map $p\mapsto\frac{\partial}{\partial x^i}(p)$ is smooth. But we can check this using the coordinates $(U,x)$ on $M$ and the coordinate patch induced by $(U,x)$ on the tangent bundle. With this choice of source and target coordinates the map is just the inclusion of $U$ into the subspace

$\displaystyle U\times\left\{(0,\dots,0,1,0,\dots,0)\right\}\subseteq U\times\mathbb{R}^n$

where the $1$ occurs in the $i$th place. This is clearly smooth.

Now we know at each point that the coordinate vectors span the tangent space. So let’s take a vector field $X\in\mathfrak{X}U$ and break up the vector $X(p)$. We can write

$\displaystyle X(p)=\sum\limits_{i=1}^nX^i(p)\frac{\partial}{\partial x^i}(p)$

which defines the $X^i(p)$ as real-valued functions on $U$. Each $X^i$ is also smooth; we know that $X:U\to U\times\mathbb{R}^n$ is smooth by the definition of a vector field and the same choice of local coordinates as above, and passing from $X(p)$ to $X^i(p)$ is really just the projection onto the $i$th component of $\mathbb{R}^n$ in these local coordinates.

Since this holds at every point $p$, we can drop it from the notation and write

$\displaystyle X=\sum\limits_{i=1}^nX^i\frac{\partial}{\partial x^i}$

which describes an arbitrary vector field $X$ as a linear combination of the coordinate vector fields times “scalar coefficient” functions $X^i\in\mathcal{O}U$, showing that these coordinate vector fields span the whole module $\mathfrak{X}U$. It should be clear that they’re independent, because if we had a nontrivial linear combination between them we’d have one between the coordinate vectors at at least one point, which we know doesn’t exist.

We should note here that just because $\mathfrak{X}U$ is a free module — not a vector space, since $\mathcal{O}U$ is a ring but usually not a field — in the case where $(U,x)$ is a coordinate patch does not mean that all the $\mathfrak{X}U$ are free modules over their respective rings of smooth functions. But in a sense every “sufficiently small” open region $U$ can be contained in some coordinate patch, and thus $\mathfrak{X}U$ will always be a free module in this case.

May 24, 2011

## Vector Fields

At last, we get back to the differential geometry and topology. Let’s say that we have a manifold $M$ with tangent bundle $\mathcal{T}M$, which of course comes with a projection map $\pi:\mathcal{T}M\to M$. If $U\subseteq M$ is an open submanifold, we can restrict the bundle to the tangent bundle $\pi:\mathcal{T}U\to U$ with no real difficulty.

Now a “vector field” on $U$ is a “section” of this projection map. That is, it’s a function $v:U\to\mathcal{T}U$ so that the composition $\pi\circ v:U\to U$ is the identity map on $U$. In other words, to every point $p\in U$ we get a vector $v(p)\in\mathcal{T}_pU$ at that point.

I should step aside to dissuade people from a common mistake. Back in multivariable calculus, it’s common to say that a vector field in $\mathbb{R}^3$ is a function which assigns “a vector” to every point in some region $U\subseteq\mathbb{R}^3$; that is, a function $U\to\mathbb{R}^3$. The problem here is that it’s assuming that every point gets a vector in the same vector space, when actually each point gets assigned a vector in its own tangent space.

The confusion comes because we know that if $M$ has dimension $n$ then each tangent space $\mathcal{T}_pM$ has dimension $n$, and thus they’re all isomorphic. Worse, when working over Euclidean space there is a canonical identification between a tangent space $\mathcal{T}_pE$ and the space $E$ itself, and thus between any two tangent spaces. But when we’re dealing with an arbitrary manifold there is no such canonical way to compare vectors based at different points; we have to be careful to keep them separate.

For each $U\subseteq M$ we have a collection of vector fields, which we will write $\mathfrak{X}_MU$, or $\mathfrak{X}U$ for short. It should be apparent that if $V\subseteq U$ is an open subspace we can restrict a vector field on $U$ to one on $V$, which means we’re talking about a presheaf. In fact, it’s not hard to see that we can uniquely glue together vector fields which agree on shared domains, meaning we have a sheaf of vector fields.

For any $U$, we can define the sum and scalar multiple of vector fields on $U$ just by defining them pointwise. That is, if $v_1$ and $v_2$ are vector fields on $U$ and $a_1$ and $a_2$ are real scalars, then we define

$\displaystyle\left[a_1v_1+a_2v_2\right](p)=a_1v_1(p)+a_2v_2(p)$

using the addition and scalar multiplication in $\mathcal{T}_pM$. But that’s not all; we can also multiply a vector field $v\in\mathfrak{X}U$ by any function $f\in\mathcal{O}U$:

$\displaystyle\left[fv\right](p)=f(p)v(p)$

using the scalar multiplication in $\mathcal{T}_pM$. This makes $\mathfrak{X}_M$ into a sheaf of modules over the sheaf of rings $\mathcal{O}_M$.
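The pointwise operations above can be sketched directly; modeling a vector field on $U\subseteq\mathbb{R}^2$ as a function from points to component tuples is my own illustrative representation, not a construction from the text.

```python
# Model a vector field on U in R^2 as a function p -> (components of v(p)),
# and check that the module operations are defined pointwise.

def v1(p):
    x, y = p
    return (y, -x)

def v2(p):
    x, y = p
    return (1.0, x * y)

def f(p):          # a "scalar" smooth function in O(U)
    x, y = p
    return x + 2 * y

def lin_comb(a1, w1, a2, w2):
    """a1*w1 + a2*w2 with real scalars a1, a2, computed pointwise."""
    return lambda p: tuple(a1 * c1 + a2 * c2 for c1, c2 in zip(w1(p), w2(p)))

def fmul(g, w):
    """Multiply a vector field w by a function g, pointwise."""
    return lambda p: tuple(g(p) * c for c in w(p))

p = (2.0, 3.0)
# v1(p) = (3, -2), v2(p) = (1, 6), f(p) = 8
assert lin_comb(2.0, v1, -1.0, v2)(p) == (5.0, -10.0)
assert fmul(f, v1)(p) == (24.0, -16.0)
```

Both operations land back in the same space of section-like functions, which is the module structure over $\mathcal{O}U$ in miniature.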

May 23, 2011

## Lie Algebras from Associative Algebras

There is a great source for generating many Lie algebras: associative algebras. Specifically, if we have an associative algebra $A$ we can build a Lie algebra $L(A)$ on the same underlying vector space by letting the bracket be the “commutator” from $A$. That is, for any algebra elements $a$ and $b$ we define

$\displaystyle[a,b]=ab-ba$

In fact, this is such a common way of coming up with Lie algebras that many people think of the bracket as a commutator by definition.

Clearly this is bilinear and antisymmetric, but does it satisfy the Jacobi identity? Well, let’s take three algebra elements and form the double bracket

\displaystyle\begin{aligned}\left[a,[b,c]\right]&=[a,bc-cb]\\&=a(bc-cb)-(bc-cb)a\\&=abc-acb-bca+cba\end{aligned}

We can find the other orders just as easily

\displaystyle\begin{aligned}\left[a,[b,c]\right]&=abc-acb-bca+cba\\\left[c,[a,b]\right]&=cab-cba-abc+bac\\\left[b,[c,a]\right]&=bca-bac-cab+acb\end{aligned}

and when we add these all up each term cancels against another term, leaving zero. Thus the commutator in an associative algebra does indeed act as a bracket.
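The cancellation can be watched happen in a familiar associative algebra; the $2\times2$ real matrices and the particular entries below are my own illustrative choices.

```python
# The commutator [a,b] = ab - ba on 2x2 real matrices (an associative
# algebra under matrix multiplication) satisfies the Jacobi identity.

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def sub(a, b):
    return [[a[i][j] - b[i][j] for j in range(2)] for i in range(2)]

def add(a, b):
    return [[a[i][j] + b[i][j] for j in range(2)] for i in range(2)]

def bracket(a, b):
    return sub(matmul(a, b), matmul(b, a))

a = [[1.0, 2.0], [3.0, 4.0]]
b = [[0.0, 1.0], [-1.0, 0.0]]
c = [[2.0, -1.0], [0.5, 3.0]]

# [a,[b,c]] + [c,[a,b]] + [b,[c,a]] = 0, term by term as in the expansion
total = add(add(bracket(a, bracket(b, c)),
                bracket(c, bracket(a, b))),
            bracket(b, bracket(c, a)))
assert all(abs(total[i][j]) < 1e-12 for i in range(2) for j in range(2))
```

This is exactly the Lie algebra $L(A)$ for $A$ the algebra of $2\times2$ matrices, usually written $\mathfrak{gl}_2$.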

May 18, 2011

## Lie Algebras

One more little side trip before we proceed with the differential geometry: Lie algebras. These are like “regular” associative algebras in that we take a module (often a vector space) and define a bilinear operation on it. This much is covered at the top of the post on algebras.

The difference is that instead of insisting that the operation be associative, we impose different conditions. Also, instead of writing our operation like a multiplication (and using the word “multiplication”), we will write it as $[A,B]$ and call it the “bracket” of $A$ and $B$. Now, our first condition is that the bracket be antisymmetric:

$\displaystyle[A,B]=-[B,A]$

Secondly, and more importantly, we demand that the bracket should satisfy the “Jacobi identity”:

$\displaystyle[A,[B,C]]=[[A,B],C]+[B,[A,C]]$

What this means is that the operation of “bracketing with $A$” acts like a derivation on the Lie algebra; we can apply $[A,\underline{\hphantom{X}}]$ to the bracket $[B,C]$ by first applying it to $B$ and bracketing the result with $C$, then bracketing $B$ with the result of applying the operation to $C$, and adding the two together.

This condition is often stated in the equivalent form

$\displaystyle[A,[B,C]]+[C,[A,B]]+[B,[C,A]]=0$

It’s a nice exercise to show that (assuming antisymmetry) these two equations are indeed equivalent. This form of the Jacobi identity is neat in the way it shows a rotational symmetry among the three algebra elements, but I feel that it misses the deep algebraic point about why the Jacobi identity is so important: it makes for an algebra that acts on itself by derivations of its own structure.

It turns out that we already know of an example of a Lie algebra: the cross product of vectors in $\mathbb{R}^3$. Indeed, take three vectors $u$, $v$, and $w$ and try multiplying them out in all three orders:

\displaystyle\begin{aligned}u\times&(v\times w)\\w\times&(u\times v)\\v\times&(w\times u)\end{aligned}

and add the results together to see that you always get zero, thus satisfying the Jacobi identity.
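That exercise is easy to spot-check numerically; the particular vectors below are my own choices.

```python
# Check that the cross product on R^3 satisfies the Jacobi identity:
#   u x (v x w) + w x (u x v) + v x (w x u) = 0

def cross(a, b):
    return (a[1]*b[2] - a[2]*b[1],
            a[2]*b[0] - a[0]*b[2],
            a[0]*b[1] - a[1]*b[0])

def vadd(a, b):
    return tuple(x + y for x, y in zip(a, b))

u = (1.0, -2.0, 0.5)
v = (3.0, 0.0, 4.0)
w = (-1.0, 2.5, 2.0)

total = vadd(vadd(cross(u, cross(v, w)),
                  cross(w, cross(u, v))),
             cross(v, cross(w, u)))
assert all(abs(t) < 1e-9 for t in total)
```

Antisymmetry is immediate from the formula, so the cross product makes $\mathbb{R}^3$ into a Lie algebra.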

May 17, 2011

## Smooth Dependence on Initial Conditions

Now that we’ve got the existence and uniqueness of our solutions down, we have one more of our promised results: the smooth dependence of solutions on initial conditions. That is, if we use our existence and uniqueness theorems to construct a unique “flow” function $\psi:I\times U\to\mathbb{R}^n$ satisfying

\displaystyle\begin{aligned}\frac{\partial}{\partial t}\psi(t,u)&=F(\psi(t,u))\\\psi(0,u)&=u\end{aligned}

by setting $\psi(t,u)=v_u(t)$ — where $v_u$ is the unique solution with initial condition $v_u(0)=u$ — then $\psi$ is continuously differentiable.

Now, we already know that $\psi$ is continuously differentiable in the time direction by definition. What we need to show is that the directional derivatives involving directions in $U$ exist and are continuous. To that end, let $a\in U$ be a base point and $h$ be a small enough displacement that $a+h\in U$ as well. Similarly, let $t_0$ be a fixed point in time and let $\Delta t$ be a small change in time:

\displaystyle\begin{aligned}\lVert\psi(t_0+\Delta t,a+h)-\psi(t_0,a)\rVert=&\lVert v_{a+h}(t_0+\Delta t)-v_a(t_0)\rVert\\\leq&\lVert v_{a+h}(t_0+\Delta t)-v_a(t_0+\Delta t)\rVert\\&+\lVert v_a(t_0+\Delta t)-v_a(t_0)\rVert\end{aligned}

But now our result from last time tells us that these solutions can diverge no faster than exponentially. Thus we conclude that

$\displaystyle\lVert v_{a+h}(t_0+\Delta t)-v_a(t_0+\Delta t)\rVert\leq\lVert h\rVert e^{K(t_0+\Delta t)}$

and so as $\lVert h\rVert\to0$ this term must go to zero as well. Meanwhile, the second term also goes to zero by the differentiability of $v_a$. We can now see that the directional derivative at $(t_0,a)$ in the direction of $(\Delta t,h)$ exists.

But are these directional derivatives continuous? This turns out to be a lot messier, but essentially doable by similar methods and a generalization of Gronwall’s inequality. For the sake of getting back to differential geometry I’m going to just assert that not only do all directional derivatives exist, but they’re continuous, and thus the flow is $C^1$.
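The differentiability in the initial condition can at least be seen in a case where the flow is known in closed form; the field $F(v)=v$, with flow $\psi(t,a)=ae^t$, is my own illustrative choice.

```python
import math

# For F(v) = v the flow is psi(t, a) = a * e^t, so the derivative of psi
# in the initial condition a is e^t: it exists and is continuous in t,
# as the C^1 claim asserts.

def psi(t, a):
    return a * math.exp(t)

t, a, h = 0.8, 1.3, 1e-6
d_da = (psi(t, a + h) - psi(t, a - h)) / (2 * h)
assert abs(d_da - math.exp(t)) < 1e-8
```

Note the derivative $e^t$ grows exponentially in $t$, matching the exponential divergence bound used in the argument.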

May 16, 2011

## Control on the Divergence of Solutions

Now we can establish some control on how nearby solutions to the differential equation

$\displaystyle v'(t)=F(v(t))$

diverge. That is, as time goes by, how quickly can the solutions move apart from each other?

Let $x$ and $y$ be two solutions satisfying initial conditions $x(t_0)=x_0$ and $y(t_0)=y_0$, respectively. The existence and uniqueness theorems we’ve just proven show that $x$ and $y$ are uniquely determined by this choice in some interval, and we’ll pick a $t_1$ so they’re both defined on the closed interval $[t_0,t_1]$. Now for every $t$ in this interval we have

$\displaystyle\lVert y(t)-x(t)\rVert\leq\lVert y_0-x_0\rVert e^{K(t-t_0)}$

where $K$ is a Lipschitz constant for $F$ in the region we’re concerned with. That is, the separation between the solutions $x(t)$ and $y(t)$ can increase no faster than exponentially.

So, let’s define $d(t)=\lVert y(t)-x(t)\rVert$ to be this distance. Converting to integral equations, it’s clear that

$\displaystyle y(t)-x(t)=y_0-x_0+\int\limits_{t_0}^t\left(F(y(s))-F(x(s))\right)\,ds$

and thus

\displaystyle\begin{aligned}d(t)&\leq\lVert y(t_0)-x(t_0)\rVert+\int\limits_{t_0}^t\left\lVert F(y(s))-F(x(s))\right\rVert\,ds\\&\leq\lVert y(t_0)-x(t_0)\rVert+\int\limits_{t_0}^tK\lVert y(s)-x(s)\rVert\,ds\\&=d(t_0)+\int\limits_{t_0}^tKd(s)\,ds\end{aligned}

Now Gronwall’s inequality tells us that $d(t)\leq d(t_0)e^{K(t-t_0)}$, which is exactly the inequality we asserted above.
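The bound can be watched numerically; the equation $v'=\sin(v)$ (which has Lipschitz constant $K=1$), the starting points, and the midpoint integrator are my own illustrative choices.

```python
import math

# For v' = sin(v), the Lipschitz constant is K = 1, so two solutions
# should separate no faster than e^{K(t - t0)} times their initial gap.

def F(v):
    return math.sin(v)

def solve(v0, t0, t1, steps=10000):
    """Integrate v' = F(v) from t0 to t1 with the midpoint method."""
    h = (t1 - t0) / steps
    v = v0
    for _ in range(steps):
        vmid = v + h / 2 * F(v)
        v = v + h * F(vmid)
    return v

K = 1.0
t0, t1 = 0.0, 2.0
x0, y0 = 0.5, 0.6

x1 = solve(x0, t0, t1)
y1 = solve(y0, t0, t1)

# The separation obeys Gronwall's bound (with a little numerical slack)
assert abs(y1 - x1) <= abs(y0 - x0) * math.exp(K * (t1 - t0)) + 1e-6
```

In fact the actual divergence here is slower than the bound, since $|\cos(v)|<1$ away from the fixed points; Gronwall only promises a worst case.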

May 13, 2011