The Unapologetic Mathematician

Mathematics for the interested outsider

Transforming Differential Operators

Because of the chain rule and Cauchy’s invariant rule, we know that we can transform differentials along with functions. For example, if we write

\displaystyle\begin{aligned}x&=r\cos(\theta)\\y&=r\sin(\theta)\end{aligned}

we can write the differentials of x and y in terms of the differentials of r and \theta:

\displaystyle\begin{aligned}dx&=\cos(\theta)dr-r\sin(\theta)d\theta\\dy&=\sin(\theta)dr+r\cos(\theta)d\theta\end{aligned}

It turns out that the chain rule also tells us how to rewrite differential operators under a change of variables, but these go in the other direction. We wrote the differentials dx and dy in terms of dr and d\theta; for the operators, we instead write \frac{\partial}{\partial r} and \frac{\partial}{\partial\theta} in terms of \frac{\partial}{\partial x} and \frac{\partial}{\partial y}.

First of all, let’s take a differentiable function f and write down its differential, both in terms of x and y and in terms of r and \theta:

\displaystyle\begin{aligned}df&=\frac{\partial f}{\partial x}dx+\frac{\partial f}{\partial y}dy\\df&=\frac{\partial f}{\partial r}dr+\frac{\partial f}{\partial\theta}d\theta\end{aligned}

and now we can rewrite dx and dy in terms of dr and d\theta.

\displaystyle\begin{aligned}df&=\frac{\partial f}{\partial x}\left(\cos(\theta)dr-r\sin(\theta)d\theta\right)+\frac{\partial f}{\partial y}\left(\sin(\theta)dr+r\cos(\theta)d\theta\right)\\&=\frac{\partial f}{\partial x}\cos(\theta)dr-\frac{\partial f}{\partial x}r\sin(\theta)d\theta+\frac{\partial f}{\partial y}\sin(\theta)dr+\frac{\partial f}{\partial y}r\cos(\theta)d\theta\\&=\left(\cos(\theta)\frac{\partial f}{\partial x}+\sin(\theta)\frac{\partial f}{\partial y}\right)dr+\left(-r\sin(\theta)\frac{\partial f}{\partial x}+r\cos(\theta)\frac{\partial f}{\partial y}\right)d\theta\end{aligned}

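Now the coefficients of dr and d\theta in the differential are unique, so we can compare this with the second expression for df and read off the partial derivatives of f in terms of r and \theta: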

\displaystyle\begin{aligned}\frac{\partial f}{\partial r}&=\cos(\theta)\frac{\partial f}{\partial x}+\sin(\theta)\frac{\partial f}{\partial y}\\\frac{\partial f}{\partial\theta}&=-r\sin(\theta)\frac{\partial f}{\partial x}+r\cos(\theta)\frac{\partial f}{\partial y}\end{aligned}
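As a quick sanity check, consider the function f=x^2+y^2, which is an example chosen here for illustration and not part of the derivation above. In polar coordinates it is just f=r^2, so we expect \frac{\partial f}{\partial r}=2r and \frac{\partial f}{\partial\theta}=0. Substituting \frac{\partial f}{\partial x}=2x=2r\cos(\theta) and \frac{\partial f}{\partial y}=2y=2r\sin(\theta) into the formulas gives

\displaystyle\begin{aligned}\frac{\partial f}{\partial r}&=\cos(\theta)\left(2r\cos(\theta)\right)+\sin(\theta)\left(2r\sin(\theta)\right)=2r\\\frac{\partial f}{\partial\theta}&=-r\sin(\theta)\left(2r\cos(\theta)\right)+r\cos(\theta)\left(2r\sin(\theta)\right)=0\end{aligned}

which agrees with differentiating r^2 directly.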

Finally, we pull all mention of f out of our notation and just write out the differential operators.

\displaystyle\begin{aligned}\frac{\partial}{\partial r}&=\cos(\theta)\frac{\partial}{\partial x}+\sin(\theta)\frac{\partial}{\partial y}\\\frac{\partial}{\partial\theta}&=-r\sin(\theta)\frac{\partial}{\partial x}+r\cos(\theta)\frac{\partial}{\partial y}\end{aligned}

Now we’re done rewriting, but for good form we should express these coefficients in terms of x and y, using r=\sqrt{x^2+y^2}, r\cos(\theta)=x, and r\sin(\theta)=y.

\displaystyle\begin{aligned}\frac{\partial}{\partial r}&=\frac{x}{\sqrt{x^2+y^2}}\frac{\partial}{\partial x}+\frac{y}{\sqrt{x^2+y^2}}\frac{\partial}{\partial y}\\\frac{\partial}{\partial\theta}&=-y\frac{\partial}{\partial x}+x\frac{\partial}{\partial y}\end{aligned}

It’s important to note that there’s really no difference between these last two forms. The first writes the coefficients in the variables r and \theta while the second writes them in x and y, but they express the exact same functions, given the original substitutions above.
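The operator identities can also be checked numerically. The following is only a minimal sketch: the test function `f`, its hand-computed partials `f_x` and `f_y`, and the helper names are all assumptions made for illustration, and central differences stand in for exact derivatives.

```python
import math

def f(x, y):
    # Hypothetical test function, chosen only for illustration.
    return x**3 * y + math.sin(y)

def f_x(x, y):
    # Partial derivative of f with respect to x, computed by hand.
    return 3 * x**2 * y

def f_y(x, y):
    # Partial derivative of f with respect to y, computed by hand.
    return x**3 + math.cos(y)

def f_polar(r, theta):
    # The same function after substituting x = r cos(theta), y = r sin(theta).
    return f(r * math.cos(theta), r * math.sin(theta))

def central_diff(func, t, h=1e-6):
    # Central-difference approximation to the derivative of func at t.
    return (func(t + h) - func(t - h)) / (2 * h)

def check(r, theta):
    """Return (error in d/dr identity, error in d/dtheta identity)."""
    x, y = r * math.cos(theta), r * math.sin(theta)
    # Left-hand sides: differentiate numerically in the polar variables.
    df_dr = central_diff(lambda s: f_polar(s, theta), r)
    df_dtheta = central_diff(lambda s: f_polar(r, s), theta)
    # Right-hand sides: the operators with coefficients written in x and y.
    rho = math.hypot(x, y)
    rhs_r = (x / rho) * f_x(x, y) + (y / rho) * f_y(x, y)
    rhs_theta = -y * f_x(x, y) + x * f_y(x, y)
    return df_dr - rhs_r, df_dtheta - rhs_theta

err_r, err_theta = check(2.0, 0.7)
print(err_r, err_theta)
```

Both differences should be tiny compared with the values of the derivatives themselves, limited only by the accuracy of the finite-difference approximation.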

More generally, let’s say we have a vector-valued function g:\mathbb{R}^m\rightarrow\mathbb{R}^n defining a substitution

\displaystyle\begin{aligned}y^1&=g^1(x^1,\dots,x^m)\\&\vdots\\y^n&=g^n(x^1,\dots,x^m)\end{aligned}

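Cauchy’s invariant rule tells us that this gives rise to a substitution for differentials. In the last expression on each line we use the summation convention, so the repeated index i is summed from 1 to m: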

\displaystyle\begin{aligned}dy^1=dg^1(x^1,\dots,x^m)&=\frac{\partial g^1}{\partial x^1}dx^1+\dots+\frac{\partial g^1}{\partial x^m}dx^m=\frac{\partial g^1}{\partial x^i}dx^i\\&\vdots\\dy^n=dg^n(x^1,\dots,x^m)&=\frac{\partial g^n}{\partial x^1}dx^1+\dots+\frac{\partial g^n}{\partial x^m}dx^m=\frac{\partial g^n}{\partial x^i}dx^i\end{aligned}

We can play it a little loose and write this out in matrix notation:

\displaystyle\begin{pmatrix}dy^1\\\vdots\\dy^n\end{pmatrix}=\begin{pmatrix}\frac{\partial g^1}{\partial x^1}&\dots&\frac{\partial g^1}{\partial x^m}\\\vdots&\ddots&\vdots\\\frac{\partial g^n}{\partial x^1}&\dots&\frac{\partial g^n}{\partial x^m}\end{pmatrix}\begin{pmatrix}dx^1\\\vdots\\dx^m\end{pmatrix}
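For instance, the polar example fits this pattern with m=n=2, where r and \theta play the role of the x variables and the Cartesian coordinates x and y play the role of the y variables (an unfortunate clash of names, but only a relabeling). The matrix form of the substitution for differentials we found at the start is then

\displaystyle\begin{pmatrix}dx\\dy\end{pmatrix}=\begin{pmatrix}\cos(\theta)&-r\sin(\theta)\\\sin(\theta)&r\cos(\theta)\end{pmatrix}\begin{pmatrix}dr\\d\theta\end{pmatrix}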

Now if we have a function f in terms of the y variables, we can use the substitution above to write it as a function of the x variables. We can write the differential of f in terms of each set of variables:

\displaystyle\begin{aligned}df&=\frac{\partial f}{\partial y^j}dy^j\\df&=\frac{\partial f}{\partial x^i}dx^i\end{aligned}

Next we use the substitutions of the differentials to rewrite the first form as

\displaystyle df=\frac{\partial f}{\partial y^j}\frac{\partial g^j}{\partial x^i}dx^i

Then uniqueness allows us to match up the coefficients and write out the partial derivatives in terms of the x variables

\displaystyle\frac{\partial f}{\partial x^i}=\frac{\partial g^j}{\partial x^i}\frac{\partial f}{\partial y^j}

It is in this form that the chain rule is most often introduced, or the similar form

\displaystyle\frac{\partial f}{\partial x^i}=\frac{\partial y^j}{\partial x^i}\frac{\partial f}{\partial y^j}

And now we can remove mention of f from the formulæ and speak directly in terms of the operators

\displaystyle\frac{\partial}{\partial x^i}=\frac{\partial y^j}{\partial x^i}\frac{\partial}{\partial y^j}

Again, we can play it a little loose and write this in matrix notation

\displaystyle\begin{pmatrix}\frac{\partial}{\partial x^1}\\\vdots\\\frac{\partial}{\partial x^m}\end{pmatrix}=\begin{pmatrix}\frac{\partial y^1}{\partial x^1}&\dots&\frac{\partial y^n}{\partial x^1}\\\vdots&\ddots&\vdots\\\frac{\partial y^1}{\partial x^m}&\dots&\frac{\partial y^n}{\partial x^m}\end{pmatrix}\begin{pmatrix}\frac{\partial}{\partial y^1}\\\vdots\\\frac{\partial}{\partial y^n}\end{pmatrix}

This is very similar to the substitution for differentials written in matrix notation. The differences are that we transform from y-derivations to x-derivations instead of from x-differentials to y-differentials, and the two substitution matrices are the transposes of each other. Those who have been following closely (or who have some background in differential geometry) should start to see the importance of this latter fact, but for now we’ll consider this a statement about formulas and methods of calculation. We’ll come to the deeper geometric meaning when we come through again in a wider context.
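The transposed-matrix formula can be checked numerically as well. This is a rough sketch under stated assumptions: the substitution `g` (from two x variables to three y variables), the function `f`, and all helper names are hypothetical choices for illustration, and partial derivatives are estimated by central differences instead of being computed exactly.

```python
import math

def g(x):
    # Hypothetical substitution g: R^2 -> R^3, chosen only for illustration.
    u, v = x
    return [u * v, u + math.sin(v), math.exp(u) * v]

def f(y):
    # Hypothetical scalar function of the three y variables.
    a, b, c = y
    return a * b + c**2

def partial(func, point, i, h=1e-6):
    # Central-difference estimate of the i-th partial of a scalar function.
    up = list(point)
    dn = list(point)
    up[i] += h
    dn[i] -= h
    return (func(up) - func(dn)) / (2 * h)

def jacobian(sub, x, n):
    # J[j][i] approximates the partial of the j-th component of sub by x^i.
    return [[partial(lambda p, j=j: sub(p)[j], x, i) for i in range(len(x))]
            for j in range(n)]

def gradient(func, point):
    # List of all partial derivatives of a scalar function at point.
    return [partial(func, point, k) for k in range(len(point))]

def x_partials_via_chain_rule(sub, func, x, n):
    # Apply the transposed matrix: sum over j of (dy^j/dx^i) * (df/dy^j).
    J = jacobian(sub, x, n)
    grad_y = gradient(func, sub(x))
    return [sum(J[j][i] * grad_y[j] for j in range(n)) for i in range(len(x))]

x0 = [0.5, 1.2]
via_chain_rule = x_partials_via_chain_rule(g, f, x0, 3)
direct = gradient(lambda p: f(g(p)), x0)
print(via_chain_rule)
print(direct)
```

The two printed lists should agree up to finite-difference error. Note that the code sums over the row index j of the matrix used for the differentials, which is exactly what multiplying by the transpose means.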

October 12, 2009 Posted by | Analysis, Calculus

   
