Cauchy’s Invariant Rule
An immediate corollary of the chain rule is another piece of “syntactic sugar”.
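If we have functions $g:X\to\mathbb{R}^n$ and $f:Y\to\mathbb{R}^p$ for some open regions $X\subseteq\mathbb{R}^m$ and $Y\subseteq\mathbb{R}^n$ so that the image $g(X)$ is contained in $Y$, we can compose the two functions to get a new function $h=f\circ g:X\to\mathbb{R}^p$. In terms of formulas, we can choose coordinates $y^1,\dots,y^n$ on $\mathbb{R}^n$ and write out both the function $f(y^1,\dots,y^n)$ and the component functions $g^i(x^1,\dots,x^m)$. We get a formula for $h$ by substituting $g^i(x)$ for $y^i$ in the formula for $f$, and write $h(x)=f(g^1(x),\dots,g^n(x))$.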
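The language there seems a little convoluted, so I’d like to give an example. We might define a function $f(x,y)=x^2+y^2$ for all points $(x,y)$ in the plane $\mathbb{R}^2$. This is all well and good, but we might want to talk about the function in polar coordinates. To this end, we may define $x=g^1(r,\theta)=r\cos(\theta)$ and $y=g^2(r,\theta)=r\sin(\theta)$. These are the component functions describing a transformation $g$ from the region $(0,\infty)\times(-\pi,\pi)$ to the region where $(x,y)\neq(0,0)$. We can substitute $r\cos(\theta)$ for $x$ and $r\sin(\theta)$ for $y$ in our formula for $f$ to get a new function $h=f\circ g$ with formula
$$h(r,\theta)=r^2\cos^2(\theta)+r^2\sin^2(\theta)=r^2$$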
This much is straightforward. The thing is, now we want to take differentials. What Cauchy’s invariant rule tells us is that we can calculate the differential of $h=f\circ g$ by not only substituting $g^i$ for $y^i$, but also substituting $dg^i$ for $dy^i$ in the formula for $df$. That is, if
$$df=\sum_{i=1}^n a_i(y)\,dy^i$$
then we have the equivalence
$$dh=\sum_{i=1}^n a_i(g(x))\,dg^i$$
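In our particular example, we can easily calculate the differential of $f$ using our first formula, $f(x,y)=x^2+y^2$:
$$df=2x\,dx+2y\,dy$$
or using our second formula, $h(r,\theta)=r^2$:
$$dh=2r\,dr$$
We want to call both of these simply $df$. But can we do so unambiguously? Indeed, if $x=r\cos(\theta)$ then we find
$$dx=\cos(\theta)\,dr-r\sin(\theta)\,d\theta$$
and if $y=r\sin(\theta)$ then we find
$$dy=\sin(\theta)\,dr+r\cos(\theta)\,d\theta$$
We substitute these into our formula for $df$ to find
$$\begin{aligned}df&=2r\cos(\theta)\left(\cos(\theta)\,dr-r\sin(\theta)\,d\theta\right)+2r\sin(\theta)\left(\sin(\theta)\,dr+r\cos(\theta)\,d\theta\right)\\&=2r\left(\cos^2(\theta)+\sin^2(\theta)\right)dr+2r^2\left(\sin(\theta)\cos(\theta)-\cos(\theta)\sin(\theta)\right)d\theta\\&=2r\,dr\end{aligned}$$
just the same as if we calculated directly from the formula in terms of $r$ and $\theta$.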
That is, we can substitute our formulæ for the coordinate functions before taking the differential in terms of the $x^j$, or we can take the differential in terms of the $y^i$ and then substitute our formulæ for the coordinate functions $g^i$ and their differentials $dg^i$ into the result. Either way, we end up in the same place, so we don’t have to worry about ending up with two (or more!) “different” differentials of $f$.
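The polar-coordinate check can also be run numerically. The sketch below is only an illustration, and it assumes the example function is $f(x,y)=x^2+y^2$ with $g(r,\theta)=(r\cos\theta,r\sin\theta)$; the names `pulled_back_df` and `direct_dh` are made up for this sketch. It compares the coefficients of $dr$ and $d\theta$ obtained by substituting $g$ and $dg$ into $df$ against central-difference estimates of the partial derivatives of $h=f\circ g$.

```python
import math

def f(x, y):
    # Assumed example function for this sketch.
    return x**2 + y**2

def g(r, theta):
    # Polar-to-Cartesian coordinate functions.
    return r * math.cos(theta), r * math.sin(theta)

def h(r, theta):
    # The composite h = f o g, obtained by substitution.
    return f(*g(r, theta))

def pulled_back_df(r, theta):
    """Coefficients of dr and dtheta after substituting g and dg
    into df = 2x dx + 2y dy."""
    x, y = g(r, theta)
    fx, fy = 2 * x, 2 * y
    # dx = cos(theta) dr - r sin(theta) dtheta
    dx_dr, dx_dtheta = math.cos(theta), -r * math.sin(theta)
    # dy = sin(theta) dr + r cos(theta) dtheta
    dy_dr, dy_dtheta = math.sin(theta), r * math.cos(theta)
    return (fx * dx_dr + fy * dy_dr, fx * dx_dtheta + fy * dy_dtheta)

def direct_dh(r, theta, eps=1e-6):
    """Central-difference estimates of the partials of h in r and theta."""
    dh_dr = (h(r + eps, theta) - h(r - eps, theta)) / (2 * eps)
    dh_dtheta = (h(r, theta + eps) - h(r, theta - eps)) / (2 * eps)
    return dh_dr, dh_dtheta

print(pulled_back_df(1.5, 0.7))
print(direct_dh(1.5, 0.7))
```

Both calls should report a $dr$ coefficient close to $2r$ and a $d\theta$ coefficient close to zero, up to floating-point and finite-difference error.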
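So, how do we verify this using the chain rule? Just write the differentials out using partial derivatives. For example, we know that
$$dg^i=\sum_{j=1}^m\frac{\partial g^i}{\partial x^j}\,dx^j$$
and so on. So, performing our substitutions (with each $\frac{\partial f}{\partial y^i}$ evaluated at $y=g(x)$) we can find:
$$\sum_{i=1}^n\frac{\partial f}{\partial y^i}\,dg^i=\sum_{i=1}^n\frac{\partial f}{\partial y^i}\sum_{j=1}^m\frac{\partial g^i}{\partial x^j}\,dx^j=\sum_{j=1}^m\left(\sum_{i=1}^n\frac{\partial f}{\partial y^i}\frac{\partial g^i}{\partial x^j}\right)dx^j=\sum_{j=1}^m\frac{\partial h}{\partial x^j}\,dx^j=dh$$
The important part here is the passage from products of two partial derivatives to single partial derivatives of $h=f\circ g$. This works out because when we consider differentials as linear transformations, the matrix entries are the partial derivatives. The composition of the linear transformations $df(g(x))$ and $dg(x)$ is given by the product of these matrices, and the entries of the resulting matrix must (by uniqueness) be the partial derivatives of the composite function.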
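The matrix statement can be spot-checked numerically as well. The following is a rough sketch, not a proof: the maps `g` (from $\mathbb{R}^2$ to $\mathbb{R}^3$) and `f` (from $\mathbb{R}^3$ to $\mathbb{R}^2$) are arbitrary hypothetical choices, and `jacobian`, `matmul`, and `chain_rule_gap` are helper names invented here. It approximates each Jacobian by central differences and compares the Jacobian of the composite with the product of the two Jacobians.

```python
import math

def jacobian(func, point, eps=1e-6):
    """Central-difference approximation of the Jacobian (list of rows)."""
    columns = []
    for j in range(len(point)):
        plus, minus = list(point), list(point)
        plus[j] += eps
        minus[j] -= eps
        columns.append([(a - b) / (2 * eps)
                        for a, b in zip(func(plus), func(minus))])
    # columns[j][i] approximates d(func^i)/d(x^j); transpose into rows.
    return [list(row) for row in zip(*columns)]

def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

# Hypothetical test maps, chosen only for illustration.
def g(x):
    return [x[0] * x[1], math.sin(x[0]), x[1] ** 2]

def f(y):
    return [y[0] + y[1] * y[2], math.exp(y[0]) * y[2]]

def h(x):
    # The composite h = f o g.
    return f(g(x))

def chain_rule_gap(x):
    """Largest entrywise difference between J_h(x) and J_f(g(x)) J_g(x)."""
    direct = jacobian(h, x)
    product = matmul(jacobian(f, g(x)), jacobian(g, x))
    return max(abs(d - p)
               for d_row, p_row in zip(direct, product)
               for d, p in zip(d_row, p_row))

print(chain_rule_gap([0.3, -1.2]))
```

The printed gap should be tiny, limited only by finite-difference and rounding error, which is what the uniqueness argument predicts.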
