The Unapologetic Mathematician

Differentials

Okay, partial derivatives don’t work as an extension of differentiation to higher-dimensional spaces. Even generalizing them to directional derivatives doesn’t give us what we want. What we need is not just the separate existence of a bunch of directional derivatives, but a single object which gives us all directional derivatives at once. To find it, let’s look back at the derivative in one dimension.

If we know the derivative of a function $f$ at a point $x$, we can use it to build a close linear approximation to the function near that point. This is what we mean when we say that the derivative is the slope of the tangent line. It says that if we move away from $x$ by an amount $t$, we can approximate the change in the function’s value

$\displaystyle f(x+t)-f(x)\approx f'(x)t$

I’m going to write the part on the right-hand side as one function: $df(x;t)=f'(x)t$. We use a semicolon here to distinguish the very different roles that $x$ and $t$ play. Before the semicolon we pick a point at which to approximate $f$. After the semicolon we pick a displacement from our starting point. The “differential” $df$ approximates how much the function will change from its value at $x$ when we move away by a displacement $t$. Importantly, for a fixed value of $x$ this displacement is a linear function of $t$. This is obvious here, since once we pick $x$, the value of $df(x;t)$ is determined by multiplying by some real number, and multiplication of real numbers is linear.
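To see this approximation in action, here’s a quick numerical sketch. The particular function ($f(x)=\sin(x)$) and the point are my own arbitrary choices for illustration, not anything from the discussion above:

```python
import math

# An arbitrary smooth example: f(x) = sin(x), with f'(x) = cos(x).
def f(x):
    return math.sin(x)

def df(x, t):
    # The differential df(x; t) = f'(x) * t: once x is fixed,
    # this is just multiplication by the number cos(x), hence linear in t.
    return math.cos(x) * t

x = 1.0
t = 0.01
actual = f(x + t) - f(x)   # the true change in the function's value
approx = df(x, t)          # the differential's linear estimate
print(actual, approx)      # the two agree to roughly t^2
```

The two printed values differ only at about the fourth decimal place, which is the “closer than linear” agreement made precise below.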

What’s less obvious is also more important: the differential $df(x;t)$ approximates the difference $f(x+t)-f(x)$. By this we can’t just mean that the distance between the two goes to zero as $t$ does; that much is automatic, since the differential goes to zero by linearity and the difference goes to zero by continuity. No, we want them to agree more closely than that: something better than mere continuity.

I say that if $f$ has a finite derivative at $x$ (so the differential exists), then for every $\epsilon>0$ there is a $\delta>0$ so that if $\delta>\lvert t\rvert>0$ we have

$\displaystyle\lvert\left[f(x+t)-f(x)\right]-df(x;t)\rvert<\epsilon\lvert t\rvert$

That is, not only does the difference get small (as the limit property would say), but it gets small even faster than $\lvert t\rvert$ does. And indeed this is the case. We can divide both sides by $\lvert t\rvert$, which (since $t$ is small) magnifies the difference on the left side.

$\displaystyle\left\lvert\frac{\left[f(x+t)-f(x)\right]-df(x;t)}{t}\right\rvert=\left\lvert\frac{f(x+t)-f(x)}{t}-f'(x)\right\rvert<\epsilon$

But if we can always find a neighborhood where this inequality holds, we have exactly the statement of the limit

$\displaystyle f'(x)=\lim\limits_{t\to0}\frac{f(x+t)-f(x)}{t}$

which is exactly what it means for $f'(x)$ to be the derivative of $f$ at $x$.

So for a single-variable function, having a derivative — the limit of a difference quotient — is equivalent to being differentiable — having a differential. And it’s differentials that generalize nicely.
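This “shrinks faster than $\lvert t\rvert$” behavior is easy to watch numerically. A quick sketch (the choice $f(x)=e^x$ and the point $x=0.5$ are mine, purely for illustration):

```python
import math

def f(x):
    return math.exp(x)  # arbitrary example; f'(x) = e^x

def df(x, t):
    return math.exp(x) * t  # the differential at x

x = 0.5
# The error [f(x+t) - f(x)] - df(x; t) shrinks faster than |t|:
# the ratio error/|t| itself tends to 0 as t does.
for t in [1e-1, 1e-2, 1e-3, 1e-4]:
    error = (f(x + t) - f(x)) - df(x, t)
    print(t, abs(error) / abs(t))
```

Each time $t$ shrinks by a factor of ten, the printed ratio shrinks by roughly a factor of ten as well, which is exactly the $\epsilon$–$\delta$ statement above in numerical form.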

Now let $f$ be a real-valued function defined on some open region $S$ in $\mathbb{R}^n$. The differential of $f$ at a point $x\in S$, if it exists, is a function $df$ satisfying the properties

• The function $df$ takes two variables in $\mathbb{R}^n$. The values $df(x;t)$ are defined for every value of $t\in\mathbb{R}^n$, and for some region of $x$ values containing the point under consideration. Typically, we’ll be looking for it to be defined in the same region $S$.
• The differential is linear in the second variable. That is, given two vectors $s$ and $t$ in $\mathbb{R}^n$, and real scalars $a$ and $b$, we must have

$\displaystyle df(x;as+bt)=adf(x;s)+bdf(x;t)$

• The differential closely approximates the change in the value of $f$ as we move away from the point $x$, in the sense that for every $\epsilon>0$ there is a $\delta>0$ so that if $\delta>\lVert t\rVert>0$ we have

$\displaystyle\lvert\left[f(x+t)-f(x)\right]-df(x;t)\rvert<\epsilon\lVert t\rVert$

I’m not making any sort of assertion about whether or not such a function exists, or under what conditions it exists. More subtly, I’m not yet making any assertion that if such a function exists it is unique. All I’m saying for the moment is that having this sort of linear approximation to the function near $x$ is the right generalization of the one-variable notion of differentiability.
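For one concrete instance where such a function does exist, here is a sketch with a two-variable example of my own choosing, $f(x,y)=x^2y$. Its candidate differential at $p=(x,y)$, applied to a displacement $t=(s,u)$, is $2xys+x^2u$; the code checks the linearity property and the close-approximation property numerically:

```python
import math

# Illustrative example (not from the post): f(x, y) = x^2 * y on R^2.
def f(p):
    x, y = p
    return x * x * y

def df(p, t):
    # Candidate differential: partial derivatives (2xy, x^2) paired with t.
    x, y = p
    s, u = t
    return 2 * x * y * s + x * x * u

p = (1.0, 2.0)

# Linearity in the second slot: df(p; a*s + b*t) = a*df(p; s) + b*df(p; t)
s, t = (1.0, 0.0), (0.0, 1.0)
a, b = 3.0, -2.0
lhs = df(p, (a * s[0] + b * t[0], a * s[1] + b * t[1]))
rhs = a * df(p, s) + b * df(p, t)

# Close approximation: the error is small even compared with ||h||
h = (1e-3, -2e-3)
error = (f((p[0] + h[0], p[1] + h[1])) - f(p)) - df(p, h)
norm = math.hypot(h[0], h[1])
print(lhs, rhs, abs(error) / norm)
```

The first two printed values agree exactly (linearity), and the third is far smaller than $1$, reflecting that the error is small relative to $\lVert h\rVert$ and not merely small.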

September 24, 2009 - Posted by | Analysis, Calculus

1. I’m still confused about the distinction between a “derivative” and a (the?) “differential”.

Is it just a matter of how they’re specified? If both exist, do they always denote the same function? Will the distinction become clearer once we move into higher-dimensional spaces?

Comment by Joe English | September 26, 2009 | Reply

2. In one dimension, the distinction is mostly semantic, but semantics are important. The differential of a function (at a point) is a linear function which, given a change in the input, gives an estimate of the resulting change in the output. It’s the best such linear estimate, as we’ll see when we establish uniqueness.

So, for a single-valued function of a single real variable what does this mean? It’s a linear transformation from $\mathbb{R}^1$ to $\mathbb{R}^1$, since changes in both the input and output are one-dimensional. And thus we can represent this transformation with a $1\times1$ matrix with a single real entry. What is that entry? The derivative of the function at that point.

That is, the derivative is a number, and the differential is the linear transformation of multiplication by that number. In one dimension it doesn’t really look like there’s any difference, but in higher dimensions it’s the second interpretation that generalizes.

Comment by John Armstrong | September 26, 2009 | Reply

18. So, let me see if I get the generalization to higher dimensions: in a (differentiable) map from R^n to R^m, the differential will be the m×n Jacobian matrix (evaluated at a vector dx of “infinitesimal displacements” dx_1, dx_2, …, dx_n), and the derivative is the higher-dimensional version of the tangent-plane approximation? Also: where do these differentials “live”; are they elements of the dual of differentiable functions?

Comment by Esteban | May 5, 2013 | Reply

19. Pretty much, yeah. The differential is the Jacobian of a function from $\mathbb{R}^n$ to $\mathbb{R}$. As for where it lives, the differential $df$ at a point $p$ is in the dual of the tangent space $T_p\mathbb{R}^n$, so $df$ itself is a section of the cotangent bundle $T^*\mathbb{R}^n$ (which I don’t think I really got around to discussing properly before other demands on my time drew me further and further away from this blog).

Comment by John Armstrong | May 6, 2013 | Reply