The Unapologetic Mathematician

Directional Derivatives

Okay, now let’s generalize away from partial derivatives. The conceptual problem there was picking a bunch of specific directions as our basis, and restricting ourselves to that basis. So instead, let’s pick any direction at all, or even more generally, any vector at all.

Given a vector $u\in\mathbb{R}^n$, we define the directional derivative of the function $f:\mathbb{R}^n\rightarrow\mathbb{R}$ in the direction of $u$ by

$\displaystyle\left[D_uf\right](x)=\lim\limits_{t\to0}\frac{f(x+ut)-f(x)}{t}$

It’s common to omit the brackets I’ve written in here, but that doesn’t make it as clear that we have a new function $D_uf$, and we’re asking for its value at $x$. Instead, $D_uf(x)$ can suggest that we’re applying $D_u$ to the value $f(x)$. It’s also common to restrict $u$ to be a unit vector, which is then used as a representative vector for all of those pointing in the same direction. I find that to be a needless hindrance, but others may disagree.

Anyhow, this looks a lot like our familiar derivative. Indeed, if we’re working in $\mathbb{R}^1$ and we set $u=1$ we recover our regular derivative. And we have the same sort of interpretation: if we move a little bit $\Delta t$ in the direction of $u$ then we can approximate the change in $f$

$\displaystyle f(x+u\Delta t)\approx f(x)+\left[D_uf\right](x)\Delta t$
$\displaystyle\Delta f=f(x+u\Delta t)-f(x)\approx\left[D_uf\right](x)\Delta t$
$\displaystyle\frac{\Delta f}{\Delta t}\approx\left[D_uf\right](x)$
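To make the approximation concrete, here’s a minimal numerical sketch. Everything named in it is my own assumption for illustration: the helper `directional_derivative`, the test function $g(x,y)=x^2+3xy$, the point $(1,2)$, the vector $u=(2,-1)$, and the step sizes. For this $g$ the difference quotient works out by hand to exactly $13-2t$, so the estimate should land near $13$.

```python
def directional_derivative(f, x, u, t=1e-6):
    """Estimate [D_u f](x) by the difference quotient (f(x + u t) - f(x)) / t.

    The step t is an arbitrary small value standing in for the limit,
    and u is not required to be a unit vector.
    """
    shifted = [xi + ui * t for xi, ui in zip(x, u)]
    return (f(shifted) - f(x)) / t


def g(p):
    # Hypothetical test function g(x, y) = x^2 + 3xy, chosen only for illustration.
    x, y = p
    return x * x + 3 * x * y


x0 = [1.0, 2.0]
u = [2.0, -1.0]
estimate = directional_derivative(g, x0, u)

# Linear approximation from the text: g(x0 + u dt) is roughly g(x0) + [D_u g](x0) dt
dt = 0.01
actual = g([x0[0] + u[0] * dt, x0[1] + u[1] * dt])
approx = g(x0) + estimate * dt

print(estimate)         # close to 13
print(actual - approx)  # small, on the order of dt^2
```

A finite step is only a stand-in for the limit, of course, so this is a sanity check rather than a computation of the derivative itself.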

Now, does the existence of these limits guarantee the continuity of $f$ at $x$? No, not even the existence of all directional derivatives at a point assures us that the function will be continuous at that point. Indeed, we can consider another of our pathological cases

$\displaystyle f(x,y)=\frac{x^2y}{x^4+y^2}$

and patch it by defining $f(0,0)=0$. We take the directional derivative at $(x,y)=(0,0)$ using the direction vector $(u,v)$

$\displaystyle\begin{aligned}\left[D_{(u,v)}f\right](0,0)&=\lim\limits_{t\to0}\frac{f(ut,vt)-f(0,0)}{t}\\&=\lim\limits_{t\to0}\frac{\frac{t^3u^2v}{t^4u^4+t^2v^2}}{t}\\&=\lim\limits_{t\to0}\frac{u^2v}{t^2u^4+v^2}\end{aligned}$

If $v\neq0$ then we find $\left[D_{(u,v)}f\right](0,0)=\frac{u^2}{v}$, while if $v=0$ we find $\left[D_{(u,v)}f\right](0,0)=0$. But we know that this function can’t be continuous, since if we approach the origin along the parabola $y=x^2$ we get a limit of $\frac{1}{2}$ instead of $f(0,0)=0$.
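We can watch both behaviors numerically. The following is a rough sketch, with the function names `f` and `quotient`, the sample directions, and the step sizes all being my own choices for illustration. It evaluates the difference quotient at the origin for a direction with $v\neq0$ and one with $v=0$, and then samples $f$ along the parabola $y=x^2$.

```python
def f(x, y):
    # The pathological function from the text, patched with f(0, 0) = 0.
    if x == 0 and y == 0:
        return 0.0
    return x * x * y / (x ** 4 + y * y)


def quotient(u, v, t):
    # Difference quotient at the origin in the direction (u, v).
    return (f(u * t, v * t) - f(0.0, 0.0)) / t


# Direction with v != 0: the quotient tends to u^2 / v = 1 / 2.
slanted = quotient(1.0, 2.0, 1e-3)

# Direction with v == 0: the quotient is identically 0.
horizontal = quotient(3.0, 0.0, 1e-3)

# Along the parabola y = x^2 the values stay near 1/2, however small x gets,
# even though f(0, 0) = 0.
parabola_values = [f(x, x * x) for x in (1e-1, 1e-3, 1e-6)]

print(slanted, horizontal, parabola_values)
```

So every straight line through the origin sees a perfectly good derivative, while the parabola sees values stuck near $\frac{1}{2}$.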

Again, the problem is that directional derivatives imply continuity along straight lines in various directions, but even continuity along every straight line through the point isn’t enough to assure continuity as a function of two variables, let alone more. We need something even stronger than directional derivatives.

On the other hand, directional derivatives are definitely stronger than partial derivatives. First of all, we haven’t had to make any choice of an orthonormal basis. But if we do have an orthonormal basis $\left\{e_i\right\}_{i=1}^n$ at hand, we find that partial derivatives are just particular directional derivatives

$\displaystyle\begin{aligned}\left[D_{e_k}f\right](x)&=\lim\limits_{t\to0}\frac{f(x+e_kt)-f(x)}{t}\\&=\lim\limits_{t\to0}\frac{f(x^ie_i+e_kt)-f(x^ie_i)}{t}\\&=\lim\limits_{t\to0}\frac{f(x^1,\dots,x^k+t,\dots,x^n)-f(x^1,\dots,x^k,\dots,x^n)}{t}\\&=f_k(x^1,\dots,x^n)\end{aligned}$

Incidentally, I’ve done two things here worth noting. First of all, I’ve gone back to using superscript indices for vector components. This allows the second thing, which is the transition from writing a function as taking one vector variable $f(x)$ to rewriting the vector in terms of the basis at hand $f(x^ie_i)$ to writing the function as taking $n$ real variables $f(x^1,\dots,x^n)$. I know that some people don’t like superscript indices and the summation convention, but they’ll be standard when we get to more general spaces later, so we may as well get used to them now. Luckily, when we really understand something we shouldn’t have to pick coordinates, and indices only come into play when we do pick coordinates. Thus all the really meaningful statements shouldn’t have many indices to confuse us.
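As a quick check that partial derivatives really are directional derivatives along basis vectors, here’s a small sketch using the standard basis of $\mathbb{R}^3$. The helpers `directional_derivative` and `basis_vector`, the test function $h(x^1,x^2,x^3)=x^1x^2+(x^3)^2$, and the step size are all assumptions of mine, picked only for illustration. Note that the code indexes components from $0$, while the text counts from $1$.

```python
def directional_derivative(f, x, u, t=1e-6):
    # Difference quotient (f(x + u t) - f(x)) / t with a small fixed step t.
    shifted = [xi + ui * t for xi, ui in zip(x, u)]
    return (f(shifted) - f(x)) / t


def basis_vector(k, n):
    # k-th standard basis vector of R^n (k counted from 0 here).
    return [1.0 if i == k else 0.0 for i in range(n)]


def h(p):
    # Hypothetical test function h(x1, x2, x3) = x1 * x2 + (x3)^2.
    return p[0] * p[1] + p[2] ** 2


x0 = [1.0, 2.0, 3.0]

# Directional derivatives along e_1, e_2, e_3 at x0.
partials = [directional_derivative(h, x0, basis_vector(k, 3)) for k in range(3)]

# By hand, the partial derivatives of h at (1, 2, 3) are 2, 1 and 6.
print(partials)
```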