Okay, partial derivatives don’t work as an extension of derivation to higher-dimensional spaces. Even generalizing them to directional derivatives doesn’t give us what we want. What we need is not just the separate existence of a bunch of directional derivatives, but a single object which gives us all directional derivatives at once. To find it, let’s look back at the derivative in one dimension.
If we know the derivative $f'(x)$ of a function $f$ at a point $x$, we can use it to build a close linear approximation to the function near that point. This is what we mean when we say that the derivative is the slope of the tangent line. It says that if we move away from $x$ by an amount $t$, we can approximate the change in the function’s value
$$f(x+t) - f(x) \approx f'(x)\,t$$
I’m going to write the part on the right-hand side as one function: $df(x;t) = f'(x)\,t$. We use a semicolon here to distinguish the very different roles that $x$ and $t$ play. Before the semicolon we pick a point at which to approximate $f$. After the semicolon we pick a displacement from our starting point. The “differential” $df$ approximates how much the function will change from its value at $x$ when we move away by a displacement $t$. Importantly, for a fixed value of $x$ the differential is a linear function of $t$. This is obvious here, since once we pick $x$, the value of $df(x;t)$ is determined by multiplying $t$ by some real number $f'(x)$, and multiplication of real numbers is linear.
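To make the two roles concrete, here is a minimal sketch in Python, taking the differential to be the derivative times the displacement, $df(x;t) = f'(x)\,t$. The example function $f(x) = x^2$ and the names `f`, `f_prime`, and `df` are my own choices for illustration, not anything fixed by the discussion above.

```python
def f(x):
    """Example function (an assumption for illustration): f(x) = x^2."""
    return x * x


def f_prime(x):
    """Derivative of the example function, worked out by hand: f'(x) = 2x."""
    return 2 * x


def df(x, t):
    """Differential df(x; t): base point x first, displacement t second."""
    return f_prime(x) * t


# Fix a base point, then check linearity in the displacement slot.
x = 3.0
s, t = 0.5, 0.25
a, b = 2.0, 4.0

combined = df(x, a * s + b * t)          # differential of a combined displacement
separate = a * df(x, s) + b * df(x, t)   # same combination of differentials

print(combined, separate)
```

Once `x` is fixed, `df(x, t)` is just multiplication of `t` by the single number `f_prime(x)`, which is why the two printed values agree.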
What’s less obvious is also more important: the differential $df(x;t)$ approximates the difference $f(x+t) - f(x)$. By this, we can’t just mean that the distance between the two goes to zero as $t$ does. This is obvious, since both of them must themselves go to zero by linearity and continuity, respectively. No, we want them to agree more closely than that. We want something better than just continuity.
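I say that if $f$ has a finite derivative at $x$ (so the differential exists), then for every $\epsilon > 0$ there is a $\delta > 0$ so that if $\delta > |t| > 0$ we have
$$\bigl|[f(x+t) - f(x)] - df(x;t)\bigr| < \epsilon\,|t|$$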
That is, not only does the difference get small (as the limit property would say), but it gets small even faster than $|t|$ does. And indeed this is the case. We can divide both sides by $|t|$, which (since $t$ is small) magnifies the difference on the left side.
$$\left|\frac{f(x+t) - f(x)}{t} - f'(x)\right| < \epsilon$$
But if we can always find a neighborhood where this inequality holds, we have exactly the statement of the limit
$$f'(x) = \lim_{t \to 0} \frac{f(x+t) - f(x)}{t}$$
which is exactly what it means for $f'(x)$ to be the derivative of $f$ at $x$.
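A quick numerical sanity check of the “faster than $|t|$” claim can be sketched as follows. The function $g(x) = x^3$, the base point, and the names `g`, `g_prime`, `dg`, and `error_ratio` are all assumptions chosen for illustration. A handful of sample displacements is evidence, not a proof.

```python
def g(x):
    """Example function (an assumption for illustration): g(x) = x^3."""
    return x ** 3


def g_prime(x):
    """Derivative of the example function, worked out by hand: g'(x) = 3x^2."""
    return 3 * x ** 2


def dg(x, t):
    """Differential dg(x; t) = g'(x) * t."""
    return g_prime(x) * t


def error_ratio(x, t):
    """|[g(x+t) - g(x)] - dg(x; t)| divided by |t|, for nonzero t."""
    change = g(x + t) - g(x)
    return abs(change - dg(x, t)) / abs(t)


x0 = 1.0
displacements = (0.1, 0.01, 0.001, -0.001)
ratios = [error_ratio(x0, t) for t in displacements]

for t, r in zip(displacements, ratios):
    print(f"t = {t:>7}: error / |t| = {r:.6f}")
```

For this particular $g$ the ratio works out by hand to $|3t + t^2|$ at $x = 1$, so it shrinks along with $t$. That is what the inequality asks for: given any $\epsilon$, the ratio drops below it once $|t|$ is small enough.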
So for a single-variable function, having a derivative — the limit of a difference quotient — is equivalent to being differentiable — having a differential. And it’s differentials that generalize nicely.
Now let $f$ be a real-valued function defined on some open region $S$ in $\mathbb{R}^n$. The differential of $f$ at a point $x \in S$, if it exists, is a function $df$ satisfying the properties
- The function $df$ takes two variables in $\mathbb{R}^n$. The values $df(x;t)$ are defined for every value of $t \in \mathbb{R}^n$, and for some region of $x$ values containing the point under consideration. Typically, we’ll be looking for it to be defined in the same region $S$.
- The differential is linear in the second variable. That is, given two vectors $s$ and $t$ in $\mathbb{R}^n$, and real scalars $a$ and $b$, we must have $df(x; as + bt) = a\,df(x;s) + b\,df(x;t)$.
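- The differential closely approximates the change in the value of $f$ as we move away from the point $x$, in the sense that for every $\epsilon > 0$ there is a $\delta > 0$ so that if $\delta > \lVert t \rVert > 0$ we have $\bigl|[f(x+t) - f(x)] - df(x;t)\bigr| < \epsilon\,\lVert t \rVert$.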
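To see what these three properties ask for in practice, here is a hedged sketch for one hand-picked function of two variables. Everything in it is an assumption for illustration: the function $F(x, y) = x^2 y$, the base point, the candidate `dF`, and the helper names. It only tests one candidate along one direction, so it says nothing about existence or uniqueness in general.

```python
import math


def F(p):
    """Example function on R^2 (an assumption for illustration): F(x, y) = x^2 * y."""
    x, y = p
    return x * x * y


def dF(p, t):
    """Candidate differential dF(p; t), guessed by hand for this F.

    It is linear in the displacement t = (t1, t2) for a fixed base point p.
    """
    x, y = p
    t1, t2 = t
    return 2 * x * y * t1 + x * x * t2


def error_ratio_2d(p, t):
    """|[F(p + t) - F(p)] - dF(p; t)| divided by the Euclidean norm of t."""
    moved = (p[0] + t[0], p[1] + t[1])
    change = F(moved) - F(p)
    return abs(change - dF(p, t)) / math.hypot(t[0], t[1])


p = (1.0, 2.0)

# Second property: linearity in the displacement slot.
s, t = (1.0, 0.0), (0.5, -2.0)
a, b = 2.0, 4.0
combo = (a * s[0] + b * t[0], a * s[1] + b * t[1])
linear_gap = abs(dF(p, combo) - (a * dF(p, s) + b * dF(p, t)))

# Third property: the error shrinks faster than the norm of the displacement.
direction = (0.6, 0.8)  # unit vector, so the displacement h * direction has norm h
ratios_2d = [
    error_ratio_2d(p, (h * direction[0], h * direction[1]))
    for h in (0.1, 0.01, 0.001)
]

print(linear_gap, ratios_2d)
```

The first property shows up in the signature: `dF` accepts any displacement `t`, while the base point `p` ranges over wherever `F` itself is defined.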
I’m not making any sort of assertion about whether or not such a function exists, or under what conditions it exists. More subtly, I’m not yet making any assertion that if such a function exists it is unique. All I’m saying for the moment is that having this sort of linear approximation to the function near $x$ is the right generalization of the one-variable notion of differentiability.