The Chain Rule
Today we get another rule for manipulating derivatives. Along the way we’ll see another way of viewing the definition of the derivative which will come in handy in the future.
Okay, we defined the derivative of the function $f$ at the point $x_0$ as the limit of the difference quotient:
$$f'(x_0) = \lim_{\Delta x\to0}\frac{f(x_0+\Delta x)-f(x_0)}{\Delta x}$$
The point of the derivative-as-limit-of-difference-quotient is that if we adjust our input by $\Delta x$, we adjust our output “to first order” by $f'(x_0)\Delta x$. That is, the change in output is roughly the change in input times the derivative, and we have a good idea of how to control the error:
$$f(x_0+\Delta x) = f(x_0) + f'(x_0)\Delta x + \epsilon(\Delta x)\Delta x$$
where $\epsilon$ is a function of $\Delta x$ satisfying $\lim_{\Delta x\to0}\epsilon(\Delta x)=0$. This means the difference between the actual change in output and the change predicted by the derivative not only goes to zero as we look closer and closer to $x_0$, but it goes to zero fast enough that we can divide it by $\Delta x$ and still it goes to zero. (Does that make sense?)
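If it helps to see the error term shrink, here is a minimal numerical sketch. The function $f(x)=x^3$, the point $x_0=2$, and the helper name `error_term` are illustrative choices of mine, not anything fixed by the argument; the code just solves the error formula for $\epsilon(\Delta x)$ and evaluates it for smaller and smaller steps.

```python
def error_term(f, df, x0, dx):
    """Solve f(x0 + dx) = f(x0) + f'(x0) dx + epsilon(dx) dx for epsilon(dx)."""
    return (f(x0 + dx) - f(x0)) / dx - df(x0)

# Illustrative (assumed) example: f(x) = x**3 at x0 = 2, so f'(x0) = 12.
f = lambda x: x ** 3
df = lambda x: 3 * x ** 2
x0 = 2.0

# Shrink the step by a factor of ten each time and record epsilon(dx).
steps = [10.0 ** -k for k in range(1, 6)]
errors = [error_term(f, df, x0, dx) for dx in steps]

for dx, eps in zip(steps, errors):
    print(f"dx = {dx:.0e}   epsilon(dx) = {eps:.6e}")
```

For this particular $f$ the error term works out by hand to $\epsilon(\Delta x) = 6\Delta x + \Delta x^2$, so the printed values should shrink roughly in proportion to the step.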
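Okay, so now we can use this viewpoint on the derivative to look at what happens when we follow one function by another. We want to consider the composite function $g\circ f$ at the point $x_0$ where $f$ is differentiable. We’re also going to assume that $g$ is differentiable at the point $y_0=f(x_0)$. The differentiability of $f$ at $x_0$ tells us that
$$f(x_0+\Delta x) = f(x_0) + f'(x_0)\Delta x + \epsilon(\Delta x)\Delta x$$
and the differentiability of $g$ at $y_0$ tells us that
$$g(y_0+\Delta y) = g(y_0) + g'(y_0)\Delta y + \eta(\Delta y)\Delta y$$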
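where $\lim_{\Delta x\to0}\epsilon(\Delta x)=0$, and similarly for $\eta$. Now when we compose the functions $f$ and $g$ we set $y_0=f(x_0)$, and $y_0+\Delta y$ is exactly the value described in the first line, so that $\Delta y = f'(x_0)\Delta x+\epsilon(\Delta x)\Delta x$! That is,
$$\begin{aligned}g(f(x_0+\Delta x)) &= g\big(f(x_0)+f'(x_0)\Delta x+\epsilon(\Delta x)\Delta x\big)\\ &= g(f(x_0)) + g'(f(x_0))\big(f'(x_0)\Delta x+\epsilon(\Delta x)\Delta x\big) + \eta(\Delta y)\big(f'(x_0)\Delta x+\epsilon(\Delta x)\Delta x\big)\\ &= g(f(x_0)) + g'(f(x_0))f'(x_0)\Delta x + \big(g'(f(x_0))\epsilon(\Delta x)+\eta(\Delta y)f'(x_0)+\eta(\Delta y)\epsilon(\Delta x)\big)\Delta x\end{aligned}$$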
The last quantity in parentheses which we multiply by $\Delta x$ goes to zero as $\Delta x$ does. First, $\epsilon(\Delta x)$ does by assumption. Then as $\Delta x$ goes to zero, so does $\Delta y$, since $f$ must be continuous. Thus $\eta(\Delta y)$ must go to zero, and the whole quantity is then zero in the limit. This establishes that not only is $g\circ f$ differentiable at $x_0$, but that its derivative there is
$$(g\circ f)'(x_0) = g'(f(x_0))f'(x_0)$$
This means that since “to first order” we get the change in the output of $f$ by multiplying the change in its input by $f'(x_0)$, and “to first order” we get the change in the output of $g$ by multiplying the change in its input by $g'(y_0)$, we get the change in the output of their composite by multiplying first by $f'(x_0)$ and then by $g'(f(x_0))$.
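As a sanity check on the rule $(g\circ f)'(x_0) = g'(f(x_0))f'(x_0)$, here is a small sketch that compares the chain-rule value against a plain difference quotient of the composite. The specific functions ($f(x)=x^2$ followed by $g(y)=\sin y$), the point $x_0=1.5$, and the step size are assumptions picked for illustration only.

```python
import math

# Illustrative (assumed) pair of functions: f(x) = x**2 followed by g(y) = sin(y).
def f(x):
    return x ** 2

def df(x):
    return 2 * x

def g(y):
    return math.sin(y)

def dg(y):
    return math.cos(y)

def composite(x):
    """The composite function g(f(x))."""
    return g(f(x))

x0 = 1.5
dx = 1e-6

# Chain rule: multiply first by f'(x0), then by g' evaluated at f(x0).
chain_rule_value = dg(f(x0)) * df(x0)

# Forward difference quotient of the composite, for comparison.
difference_quotient = (composite(x0 + dx) - composite(x0)) / dx

print("chain rule:         ", chain_rule_value)
print("difference quotient:", difference_quotient)
```

The two printed numbers should agree to several decimal places, with the small gap coming from the error term that vanishes as the step shrinks. Note where each derivative is evaluated: `df` at `x0`, but `dg` at `f(x0)`.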
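Another way we often write the chain rule is by setting $y=f(x)$ and $z=g(y)$. Then the derivative $f'(x)$ is written $\frac{dy}{dx}$, while $g'(y)$ is written $\frac{dz}{dy}$. The chain rule then says:
$$\frac{dz}{dx} = \frac{dz}{dy}\frac{dy}{dx}$$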
This is nice since it looks like we’re multiplying fractions. The drawback is that we have to remember in our heads where to evaluate each derivative.
Now we can take this rule and use it to find the derivative of the inverse of an invertible function $f$. More specifically, if a function $f$ is one-to-one in some neighborhood of a point $x_0$, we can find another function $f^{-1}$ whose domain is the set of values $f$ takes (the range of $f$) and so that $f^{-1}(f(x))=x$. Then if the function $f$ is differentiable at $x_0$ and the derivative $f'(x_0)$ is not zero, the inverse function will be differentiable, with a derivative we will calculate.
First we set $y=f(x)$ and $x=f^{-1}(y)$. Then we take the derivative of the defining equation of the inverse to get $\frac{dx}{dy}\frac{dy}{dx}=1$, which we could write even more suggestively as $\frac{dx}{dy}=\frac{1}{\frac{dy}{dx}}$. That is, the derivative of the compositional inverse of our function is the multiplicative inverse of the derivative. But as we noted above, we have to remember where to evaluate everything. So let’s do it again in the other notation.
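Since $f^{-1}(f(x))=x$, we differentiate to find $\left(f^{-1}\right)'(f(x))f'(x)=1$. Then we substitute $y=f(x)$ and juggle some algebra to write
$$\left(f^{-1}\right)'(y) = \frac{1}{f'\left(f^{-1}(y)\right)}$$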
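To see the bookkeeping about where to evaluate things in action, here is a short sketch that checks the inverse-function formula $\left(f^{-1}\right)'(y) = 1/f'\left(f^{-1}(y)\right)$ numerically. The choice $f=\exp$ with inverse $\log$, the point $y_0=3$, and the step size are assumptions made purely for illustration.

```python
import math

# Illustrative (assumed) example: f = exp, whose inverse is log and whose
# derivative is exp again.
f = math.exp
df = math.exp
f_inverse = math.log

y0 = 3.0
dy = 1e-6

# Formula: evaluate f' at the point f_inverse(y0), then take the reciprocal.
formula_value = 1.0 / df(f_inverse(y0))

# Forward difference quotient of the inverse function itself, for comparison.
difference_quotient = (f_inverse(y0 + dy) - f_inverse(y0)) / dy

print("1 / f'(f^{-1}(y0)):  ", formula_value)
print("difference quotient:", difference_quotient)
```

In this example the formula collapses to $1/y_0$, the familiar derivative of the logarithm, and the difference quotient should land close to it.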