# The Unapologetic Mathematician

## Directional Derivatives

Okay, now let’s generalize away from partial derivatives. The conceptual problem there was picking a bunch of specific directions as our basis, and restricting ourselves to that basis. So instead, let’s pick any direction at all, or even more generally than that.

Given a vector $u\in\mathbb{R}^n$, we define the directional derivative of the function $f:\mathbb{R}^n\rightarrow\mathbb{R}$ in the direction of $u$ by

$\displaystyle\left[D_uf\right](x)=\lim\limits_{t\to0}\frac{f(x+ut)-f(x)}{t}$

It’s common to omit the brackets I’ve written in here, but that doesn’t make it as clear that we have a new function $D_uf$, and we’re asking for its value at $x$. Instead, $D_uf(x)$ can suggest that we’re applying $D_u$ to the value $f(x)$. It’s also common to restrict $u$ to be a unit vector, which is then used as a representative vector for all of those pointing in the same direction. I find that to be a needless hindrance, but others may disagree.

Anyhow, this looks a lot like our familiar derivative. Indeed, if we’re working in $\mathbb{R}^1$ and we set $u=1$ we recover our regular derivative. And we have the same sort of interpretation: if we move a little bit $\Delta t$ in the direction of $u$ then we can approximate the change in $f$

$\displaystyle f(x+u\Delta t)\approx f(x)+\left[D_uf\right](x)\Delta t$

$\displaystyle\Delta f=f(x+u\Delta t)-f(x)\approx\left[D_uf\right](x)\Delta t$

$\displaystyle\frac{\Delta f}{\Delta t}\approx\left[D_uf\right](x)$
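To see this approximation with actual numbers, here is a minimal Python sketch. It is only an illustration under stated assumptions: the test function `g`, the point `x0`, the direction `u`, and the step sizes are all made up for the example, and the limit is replaced by a difference quotient at one small fixed $t$, so the result is an estimate rather than the derivative itself.

```python
import math

def directional_derivative(f, x, u, t=1e-6):
    """Difference quotient (f(x + u t) - f(x)) / t for one small, fixed t."""
    shifted = [xi + ui * t for xi, ui in zip(x, u)]
    return (f(shifted) - f(x)) / t

def g(p):
    # Hypothetical test function g(x, y) = x^2 y + sin(y); not from the post.
    x, y = p
    return x * x * y + math.sin(y)

x0 = [1.0, 2.0]
u = [3.0, -1.0]  # deliberately not a unit vector

estimate = directional_derivative(g, x0, u)
# Derivative of t -> g(x0 + u t) at t = 0, worked out by hand for this g.
exact = 11.0 - math.cos(2.0)

# Linear approximation of the change in g after a small step dt along u.
dt = 0.01
actual_change = g([x0[0] + u[0] * dt, x0[1] + u[1] * dt]) - g(x0)
predicted_change = estimate * dt

print(estimate, exact)
print(actual_change, predicted_change)
```

The two printed pairs should agree closely, and the second pair agrees better as `dt` shrinks. Note that `u` is not normalized here, in keeping with the definition above.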

Now, does the existence of these limits guarantee the continuity of $f$ at $x$? No, not even the existence of all directional derivatives at a point assures us that the function will be continuous at that point. Indeed, we can consider another of our pathological cases

$\displaystyle f(x,y)=\frac{x^2y}{x^4+y^2}$

and patch it by defining $f(0,0)=0$. We take the directional derivative at $(x,y)=(0,0)$ using the direction vector $(u,v)$

$\displaystyle\begin{aligned}\left[D_{(u,v)}f\right](0,0)&=\lim\limits_{t\to0}\frac{f(ut,vt)-f(0,0)}{t}\\&=\lim\limits_{t\to0}\frac{\frac{t^3u^2v}{t^4u^4+t^2v^2}}{t}\\&=\lim\limits_{t\to0}\frac{u^2v}{t^2u^4+v^2}\end{aligned}$

If $v\neq0$ then we find $\left[D_{(u,v)}f\right](0,0)=\frac{u^2}{v}$, while if $v=0$ we find $\left[D_{(u,v)}f\right](0,0)=0$. But we know that this function can’t be continuous, since if we approach the origin along the parabola $y=x^2$ we get a limit of $\frac{1}{2}$ instead of $f(0,0)=0$. Indeed, parametrizing the parabola as $(x,y)=(t,t^2)$ gives $f(t,t^2)=\frac{t^4}{t^4+t^4}=\frac{1}{2}$ for every $t\neq0$.
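Both halves of this example are easy to check numerically. The following Python sketch is only a sanity check, not a proof: the sample directions and step sizes are arbitrary choices, and the helper name `quotient` is made up for the illustration.

```python
def f(x, y):
    # The patched function from the text: f(0, 0) is defined to be 0.
    if x == 0.0 and y == 0.0:
        return 0.0
    return x * x * y / (x ** 4 + y * y)

def quotient(u, v, t):
    """Difference quotient at the origin in the direction (u, v)."""
    return (f(u * t, v * t) - f(0.0, 0.0)) / t

# Along straight lines the quotient settles down to u^2 / v (or 0 when v = 0).
t = 1e-4
line_results = []
for u, v in [(1.0, 2.0), (3.0, -1.0), (1.0, 0.0)]:
    expected = u * u / v if v != 0.0 else 0.0
    line_results.append((quotient(u, v, t), expected))
    print((u, v), quotient(u, v, t), expected)

# Along the parabola y = x^2 the values stay at 1/2 however close we get.
parabola_values = [f(s, s * s) for s in (1e-1, 1e-3, 1e-5)]
print(parabola_values)
```

So every straight line through the origin sees values of $f$ tending to $0$, while the parabola sees $\frac{1}{2}$ all the way in.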

Again, the problem is that directional derivatives imply continuity along straight lines in various directions, but even continuity along every straight line through the point isn’t enough to assure continuity as a function of two variables, let alone more. We need something even stronger than directional derivatives.

On the other hand, directional derivatives are definitely stronger than partial derivatives. First of all, we haven’t had to make any choice of an orthonormal basis. But if we do have an orthonormal basis $\left\{e_i\right\}_{i=1}^n$ at hand, we find that partial derivatives are just particular directional derivatives

$\displaystyle\begin{aligned}\left[D_{e_k}f\right](x)&=\lim\limits_{t\to0}\frac{f(x+e_kt)-f(x)}{t}\\&=\lim\limits_{t\to0}\frac{f(x^ie_i+e_kt)-f(x^ie_i)}{t}\\&=\lim\limits_{t\to0}\frac{f(x^1,\dots,x^k+t,\dots,x^n)-f(x^1,\dots,x^k,\dots,x^n)}{t}\\&=f_k(x^1,\dots,x^n)\end{aligned}$
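As a rough illustration of this calculation, the Python sketch below feeds the standard basis vectors of $\mathbb{R}^3$ into a difference quotient and compares the results with partial derivatives worked out by hand. The test function `h`, the point, and the helper names are assumptions made for the example. The code also indexes the basis from 0, whereas the text counts from 1.

```python
def directional_derivative(f, x, u, t=1e-6):
    """Difference quotient (f(x + u t) - f(x)) / t for one small, fixed t."""
    shifted = [xi + ui * t for xi, ui in zip(x, u)]
    return (f(shifted) - f(x)) / t

def basis_vector(k, n):
    """k-th standard basis vector of R^n, with k counted from 0."""
    return [1.0 if i == k else 0.0 for i in range(n)]

def h(p):
    # Hypothetical test function h(x1, x2, x3) = x1 * x2^2 + 3 * x3.
    x1, x2, x3 = p
    return x1 * x2 ** 2 + 3.0 * x3

point = [2.0, -1.0, 5.0]

# Directional derivatives along e_1, e_2, e_3.
along_basis = [directional_derivative(h, point, basis_vector(k, 3))
               for k in range(3)]

# Partial derivatives of h by hand: (x2^2, 2 * x1 * x2, 3) at the point.
by_hand = [point[1] ** 2, 2.0 * point[0] * point[1], 3.0]

print(along_basis)
print(by_hand)
```

The two lists should match up to the error of the difference quotient. Nothing about the helper is specific to basis vectors: it takes any `u`, and the partial derivatives are just what comes out when `u` happens to be some $e_k$.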

Incidentally, I’ve done two things here worth noting. First of all, I’ve gone back to using superscript indices for vector components. This allows the second thing, which is the transition from writing a function as taking one vector variable $f(x)$ to rewriting the vector in terms of the basis at hand $f(x^ie_i)$ to writing the function as taking $n$ real variables $f(x^1,\dots,x^n)$. I know that some people don’t like superscript indices and the summation convention, but they’ll be standard when we get to more general spaces later, so we may as well get used to them now. Luckily, when we really understand something we shouldn’t have to pick coordinates, and indices only come into play when we do pick coordinates. Thus all the really meaningful statements shouldn’t have many indices to confuse us.

September 23, 2009 - Posted by | Analysis, Calculus

1. This series of posts is great! I’m TA-ing Calc III for the first time this quarter, so it is nice to see lots of motivation that I haven’t thought about for awhile (of course the general Calc III student is far more worried about remembering how to do the calculations than any of the “why” behind it).

Comment by hilbertthm90 | September 23, 2009

2. Is this your first time TAing that course? You do learn more when you take an undergraduate advanced calculus course, but I think it really solidifies when you turn around to teach the material, once you’ve already had a really deep view inside what’s going on.

Comment by John Armstrong | September 23, 2009

3. I’m sorry this is probably very elementary but how did you get this part

But we know that this function can’t be continuous, since if we approach the origin along the parabola $y=x^2$ we get a limit of $\frac{1}{2}$ instead of $f(0,0)=0$.

I’m sure once I see it I will have an ah hah of course moment.

Comment by David | September 23, 2009

4. David, I ran through that example in the post on multivariable limits that I linked to above with the text “pathological cases”. Basically, try parametrizing that curve as $(x,y)=(t,t^2)$ and take the limit as $t$ goes to zero.

Comment by John Armstrong | September 23, 2009

5. *snicker*
I think I deserved that one. 😀

Comment by Mikael Vejdemo Johansson | September 23, 2009

6. You’re not the only one, Mikael. You’re just the one who spoke up, and it’s worth noting that not everyone agrees on some of these notational issues.

Comment by John Armstrong | September 23, 2009
