# The Unapologetic Mathematician

## An Existence Condition for the Differential

To this point we’ve seen what happens when a function $f$ does have a differential at a given point $x$, but we haven’t yet seen any conditions that tell us that any such function $df(x;t)$ exists. We know from the uniqueness proof that if it does exist, then given an orthonormal basis we have all partial derivatives, and the differential must be given by the formula

$\displaystyle df(x;t)=\left[D_if\right](x)t^i$

where $D_if$ is the partial derivative of $f$ in the $i$th coordinate direction. This is clearly linear in the displacement $t$, so all that remains is to see whether the inequality in the definition of the differential can be satisfied.

We must have all partial derivatives to write down this formula, but that can’t be sufficient for differentiability: if it were, then having all partial derivatives would imply continuity, and we know that it doesn’t. What will be sufficient is to ask that not only do all partial derivatives exist in a neighborhood of $x$, but that they are themselves continuous at $x$. Note, though, that I’m not asserting that this condition is necessary for a function to be differentiable. Indeed, it’s possible to construct differentiable functions whose partial derivatives all exist, but are not continuous at $x$. This is an example of the way that analysis tends to be shot through with “counterexamples”, as Michael was talking about recently.
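For concreteness, here are two standard examples in the plane. They are illustrations chosen for this sketch rather than part of the argument, and the names $\phi$ and $\psi$ are just labels I’m introducing here. First, a function with both partial derivatives at the origin that fails to be continuous there: set $\phi(0,0)=0$ and otherwise

$\displaystyle\phi(x,y)=\frac{xy}{x^2+y^2}$

This vanishes along both coordinate axes, so $\left[D_1\phi\right](0,0)=\left[D_2\phi\right](0,0)=0$, and yet $\phi(s,s)=\frac{1}{2}$ for every $s\neq0$. Second, a differentiable function whose partial derivatives are not continuous: set $\psi(0,0)=0$ and otherwise

$\displaystyle\psi(x,y)=\left(x^2+y^2\right)\sin\left(\frac{1}{\sqrt{x^2+y^2}}\right)$

Since $\lvert\psi(t)\rvert\leq\lVert t\rVert^2$, the zero function works as a differential at the origin. But along the $x$-axis with $x\neq0$ we find

$\displaystyle\left[D_1\psi\right](x,0)=2x\sin\left(\frac{1}{\lvert x\rvert}\right)-\mathrm{sgn}(x)\cos\left(\frac{1}{\lvert x\rvert}\right)$

which has no limit as $x$ goes to zero.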

Okay, so let’s assume that all these partial derivatives $D_if$ exist and are continuous at $x$. We have to show that for any $\epsilon>0$ there is some $\delta>0$ so that if $\delta>\lVert t\rVert>0$ we have the inequality

$\displaystyle\left\lvert\left[f(x+t)-f(x)\right]-\left[D_if\right](x)t^i\right\rvert<\epsilon\lVert t\rVert$

We’re going to take the difference $f(x+t)-f(x)$ and break it into $n$ terms, each of which will approximate one of the partial derivative terms.

First off, since each $D_if$ is continuous at $x$, there is some $\delta$ so that if $\lVert t\rVert<\delta$ then $\lvert\left[D_if\right](x+t)-\left[D_if\right](x)\rvert<\frac{\epsilon}{n}$. In fact, there’s a $\delta$ for each index $i$, but we can just take the smallest of all these, and that one will work for each index. From this point on, we’ll assume that $\lVert t\rVert$ is actually less than $\frac{\delta}{2}$. We’ll write $t=\lambda u$, where $u$ is a unit vector and $\lambda$ is a scalar so that $\lvert\lambda\rvert=\lVert t\rVert<\frac{\delta}{2}$. We’ll also write $u$ in terms of our orthonormal basis $u=u^ie_i$.

Now we can build up our displacement direction $u$ step-by-step as a sequence of vectors $v_0=0$, $v_1=u^1e_1$, and so on, stepping in the $i$th direction on the $i$th step: $v_k=v_{k-1}+u^ke_k$ (not summing on $k$ here). So we can break up the difference of function values as

$\displaystyle f(x+\lambda u)-f(x)=\sum\limits_{k=1}^n\left[f(x+\lambda v_{k-1}+\lambda u^ke_k)-f(x+\lambda v_{k-1})\right]$
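As a quick sanity check on this telescoping sum, take the case $n=2$, which is my choice of illustration. Here $v_0=0$, $v_1=u^1e_1$, and $v_2=u$, so the sum reads

$\displaystyle f(x+\lambda u)-f(x)=\left[f(x+\lambda u^1e_1)-f(x)\right]+\left[f(x+\lambda u^1e_1+\lambda u^2e_2)-f(x+\lambda u^1e_1)\right]$

and the two occurrences of $f(x+\lambda u^1e_1)$ cancel, leaving exactly the left-hand side.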

So now each step only changes the $k$th coordinate, and the points at each end both lie within the ball of radius $\frac{\delta}{2}$ around $x$, since each $v_k$ is no longer than $u$, which has unit length. To look closer at the step from $f(x+\lambda v_{k-1})$ to $f(x+\lambda v_{k-1}+\lambda u^ke_k)$, we introduce a new function of one real variable:

$\displaystyle g(\alpha)=f(x+\lambda v_{k-1}+\alpha e_k)$

for $-\lvert\lambda u^k\rvert\leq\alpha\leq\lvert\lambda u^k\rvert$. This lets us write our step as $g(\lambda u^k)-g(0)$. It turns out that everywhere in this closed interval, the function $g$ is differentiable! Indeed, we have

$\displaystyle\frac{g(\alpha+h)-g(\alpha)}{h}=\frac{f(x+\lambda v_{k-1}+\alpha e_k+he_k)-f(x+\lambda v_{k-1}+\alpha e_k)}{h}$

So as $h$ goes to zero, we find $g'(\alpha)=\left[D_kf\right](x+\lambda v_{k-1}+\alpha e_k)$, which exists because we’re in a small enough ball around $x$. Now the mean value theorem can be brought to bear, which says

$\displaystyle g(\lambda u^k)-g(0)=\lambda u^kg'(\alpha_k)$

for some $\alpha_k$ between $0$ and $\lambda u^k$, so in particular $-\lvert\lambda u^k\rvert\leq\alpha_k\leq\lvert\lambda u^k\rvert$. And now the difference of function values can be written

$\displaystyle\begin{aligned}f(x+t)-f(x)&=\lambda\sum\limits_{k=1}^nu^k\left[D_kf\right](x+\lambda v_{k-1}+\alpha_ke_k)\\&=\sum\limits_{k=1}^n\left[D_kf\right](x)t^k+\lambda\sum\limits_{k=1}^nu^k\left[\left[D_kf\right](x+\lambda v_{k-1}+\alpha_ke_k)-\left[D_kf\right](x)\right]\end{aligned}$

since $t^k=\lambda u^k$.

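Now $\lVert\lambda v_{k-1}+\alpha_ke_k\rVert\leq\lvert\lambda\rvert+\lvert\lambda u^k\rvert\leq2\lvert\lambda\rvert<\delta$, and so we find that each of these differences of partial derivative evaluations is less than $\frac{\epsilon}{n}$ in absolute value. Since $\lvert u^k\rvert\leq1$ for each $k$, the $n$ terms of the second sum together contribute less than $\lvert\lambda\rvert\epsilon$. And thus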

$\displaystyle\left\lvert\left[f(x+t)-f(x)\right]-\sum\limits_{k=1}^n\left[D_kf\right](x)t^k\right\rvert<\lvert\lambda\rvert\epsilon=\epsilon\lVert t\rVert$

which establishes the inequality we need.
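To see the inequality at work numerically, here is a minimal sketch in Python. The sample function $f(x,y)=\sin(x)e^y$, the base point, and the direction are all assumptions picked for illustration; any function with continuous partial derivatives should behave similarly. The script computes the ratio $\lvert f(x+t)-f(x)-\left[D_if\right](x)t^i\rvert/\lVert t\rVert$ for $t=\lambda u$ as $\lambda$ shrinks.

```python
import math


def f(x, y):
    # Hypothetical sample function: smooth, so its partials are continuous.
    return math.sin(x) * math.exp(y)


def partials(x, y):
    # Hand-computed partial derivatives D_1 f and D_2 f of the sample f.
    return (math.cos(x) * math.exp(y), math.sin(x) * math.exp(y))


def error_ratio(point, t):
    # |f(x+t) - f(x) - D_i f(x) t^i| / ||t||, the quantity the proof bounds.
    x, y = point
    t1, t2 = t
    d1, d2 = partials(x, y)
    linear_part = d1 * t1 + d2 * t2
    actual_change = f(x + t1, y + t2) - f(x, y)
    return abs(actual_change - linear_part) / math.hypot(t1, t2)


point = (0.3, -0.2)      # base point x (arbitrary choice)
direction = (0.6, 0.8)   # unit vector u (arbitrary choice)
scales = (1e-1, 1e-2, 1e-3, 1e-4)

# Evaluate the ratio along t = lambda * u for shrinking lambda.
ratios = [
    error_ratio(point, (lam * direction[0], lam * direction[1]))
    for lam in scales
]

for lam, ratio in zip(scales, ratios):
    print(f"lambda = {lam:g}   ratio = {ratio:.3e}")
```

If the differential exists, the ratio must eventually drop below any chosen $\epsilon$, and for this sample function it shrinks roughly in proportion to $\lambda$. Of course a finite table of numbers proves nothing; it only illustrates what the estimate above guarantees.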

October 1, 2009 | Analysis, Calculus