# The Unapologetic Mathematician

## Differentials

Okay, partial derivatives don’t work as an extension of differentiation to higher-dimensional spaces. Even generalizing them to directional derivatives doesn’t give us what we want. What we need is not just the separate existence of a bunch of directional derivatives, but a single object which gives us all directional derivatives at once. To find it, let’s look back at the derivative in one dimension.

If we know the derivative of a function $f$ at a point $x$, we can use it to build a close linear approximation to the function near that point. This is what we mean when we say that the derivative is the slope of the tangent line. It says that if we move away from $x$ by an amount $t$, we can approximate the change in the function’s value

$\displaystyle f(x+t)-f(x)\approx f'(x)t$

I’m going to write the part on the right-hand side as one function: $df(x;t)=f'(x)t$. We use a semicolon here to distinguish the very different roles that $x$ and $t$ play. Before the semicolon we pick a point at which to approximate $f$. After the semicolon we pick a displacement from our starting point. The “differential” $df$ approximates how much the function will change from its value at $x$ when we move away by a displacement $t$. Importantly, for a fixed value of $x$ this differential is a linear function of $t$. This is obvious here, since once we pick $x$, the value of $df(x;t)$ is determined by multiplying $t$ by the fixed real number $f'(x)$, and multiplication by a fixed real number is linear.
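To make the two roles concrete, here is a minimal sketch in Python. The function $f(x)=x^3$, the point $x=2$, and the names `f`, `f_prime`, and `df` are my own choices for illustration, not anything fixed by the discussion. It compares the actual change with the differential for a few displacements, and then checks linearity in the second slot with the point held fixed.

```python
def f(x):
    # Example function, chosen only for illustration.
    return x ** 3

def f_prime(x):
    # Its derivative, computed by hand.
    return 3 * x ** 2

def df(x, t):
    """Differential of f at the point x, applied to the displacement t."""
    return f_prime(x) * t

x = 2.0
for t in (0.1, 0.01, 0.001):
    actual = f(x + t) - f(x)
    print(f"t={t}: actual change={actual:.9f}, df(x;t)={df(x, t):.9f}")

# Linearity in the displacement, with the point x held fixed.
a, b, s, t = 2.0, -3.0, 0.5, 0.25
lhs = df(x, a * s + b * t)
rhs = a * df(x, s) + b * df(x, t)
print("linear in t:", abs(lhs - rhs) < 1e-12)
```

The point comes before the displacement in `df(x, t)`, mirroring the semicolon notation: swapping the arguments gives a different number, because only the second slot is linear.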

What’s less obvious is also more important: the differential $df(x;t)$ approximates the difference $f(x+t)-f(x)$. By this, we can’t just mean that the distance between the two goes to zero as $t$ does. That much is obvious, since each of them goes to zero on its own: the differential by linearity, and the difference by continuity. No, we want them to agree more closely than that. We want something better than just continuity.

I say that if $f$ has a finite derivative at $x$ (so the differential exists), then for every $\epsilon>0$ there is a $\delta>0$ so that if $\delta>\lvert t\rvert>0$ we have

$\displaystyle\lvert\left[f(x+t)-f(x)\right]-df(x;t)\rvert<\epsilon\lvert t\rvert$

That is, not only does the difference get small (as the limit property would say), but it gets small even faster than $\lvert t\rvert$ does. And indeed this is the case. We can divide both sides by $\lvert t\rvert$, which (since $t$ is small) magnifies the difference on the left side.

$\displaystyle\left\lvert\frac{\left[f(x+t)-f(x)\right]-df(x;t)}{t}\right\rvert=\left\lvert\frac{f(x+t)-f(x)}{t}-f'(x)\right\rvert<\epsilon$

But if we can always find a neighborhood where this inequality holds, we have exactly the statement of the limit

$\displaystyle f'(x)=\lim\limits_{t\to0}\frac{f(x+t)-f(x)}{t}$

which is exactly what it means for $f'(x)$ to be the derivative of $f$ at $x$.
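We can watch this happen numerically. The following is only a sketch, with $f=\sin$ and $x=1$ picked arbitrarily and `error_ratio` a name I’m making up for the left-hand side of the divided inequality. If the approximation really is better than continuity alone gives us, this ratio should itself shrink as $t$ does.

```python
import math

def error_ratio(f, f_prime, x, t):
    """|[f(x+t) - f(x)] - f'(x) t| / |t| for a nonzero displacement t."""
    return abs((f(x + t) - f(x)) - f_prime(x) * t) / abs(t)

x = 1.0
# Displacements 1e-1, 1e-2, ..., 1e-5.
ratios = [error_ratio(math.sin, math.cos, x, 10.0 ** -k) for k in range(1, 6)]
for k, r in enumerate(ratios, start=1):
    print(f"|t| = 1e-{k}: ratio = {r:.3e}")
```

Given any $\epsilon$, we just read down the list until the ratio drops below it, and that tells us roughly how small $\delta$ has to be for this particular function and point. For much smaller displacements floating-point roundoff starts to dominate, so the experiment is a sanity check and not a proof.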

So for a single-variable function, having a derivative — the limit of a difference quotient — is equivalent to being differentiable — having a differential. And it’s differentials that generalize nicely.

Now let $f$ be a real-valued function defined on some open region $S$ in $\mathbb{R}^n$. The differential of $f$ at a point $x\in S$, if it exists, is a function $df$ satisfying the properties

• The function $df$ takes two arguments, each a vector in $\mathbb{R}^n$. The values $df(x;t)$ are defined for every value of $t\in\mathbb{R}^n$, and for some region of $x$ values containing the point under consideration. Typically, we’ll be looking for it to be defined in the same region $S$.
• The differential is linear in the second variable. That is, given two vectors $s$ and $t$ in $\mathbb{R}^n$, and real scalars $a$ and $b$, we must have

$\displaystyle df(x;as+bt)=a\,df(x;s)+b\,df(x;t)$

• The differential closely approximates the change in the value of $f$ as we move away from the point $x$, in the sense that for every $\epsilon>0$ there is a $\delta>0$ so that if $\delta>\lVert t\rVert>0$ we have

$\displaystyle\lvert\left[f(x+t)-f(x)\right]-df(x;t)\rvert<\epsilon\lVert t\rVert$

I’m not making any sort of assertion about whether or not such a function exists, or under what conditions it exists. More subtly, I’m not yet making any assertion that if such a function exists it is unique. All I’m saying for the moment is that having this sort of linear approximation to the function near $x$ is the right generalization of the one-variable notion of differentiability.
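Still, nothing stops us from testing a candidate against the definition. In the sketch below, the function $f(x,y)=x^2y$, the point $(1,2)$, the direction $(0.6,-0.8)$, and the candidate `df` itself are all my own assumptions for illustration. The candidate is a guess written down by hand for this one polynomial, and the code only checks the approximation property along a single direction, which is far from establishing it for all small $t$.

```python
import math

def f(p):
    # Example function f(x, y) = x^2 * y, chosen only for illustration.
    x, y = p
    return x * x * y

def df(p, t):
    """Hypothetical candidate differential at p, linear in the displacement t."""
    x, y = p
    return 2 * x * y * t[0] + x * x * t[1]

def error_ratio(p, t):
    """|[f(p+t) - f(p)] - df(p;t)| / ||t|| for a nonzero displacement t."""
    moved = (p[0] + t[0], p[1] + t[1])
    return abs((f(moved) - f(p)) - df(p, t)) / math.hypot(t[0], t[1])

p = (1.0, 2.0)
direction = (0.6, -0.8)  # a unit vector, so ||t|| is just the scale h below
ratios = []
for k in range(1, 6):
    h = 10.0 ** -k
    t = (h * direction[0], h * direction[1])
    ratios.append(error_ratio(p, t))
    print(f"||t|| = 1e-{k}: ratio = {ratios[-1]:.3e}")
```

The ratio shrinking along with $\lVert t\rVert$ is what the third property asks for, restricted here to one line through the point. Whether such a candidate exists in general, and whether it is the only one, is exactly what remains open at this stage.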