# The Unapologetic Mathematician

## Rubik’s 7x7x7 Cube

From Alexandre Borovik I find this video of someone solving the “order seven” Rubik’s Cube.

I’m not about to sit down and work up a solution like we did before, but it shouldn’t be impossible to repeat the same sort of analysis. I will point out, however, that the solver in this video is making heavy use of both of our solution techniques: commutators and a tower of nested subgroups.

The nested subgroups are obvious. As the solution progresses, more and more structure becomes apparent, and is preserved as the solution continues. In particular, the solver builds up the centers of faces and then slips to the subgroup of maneuvers which leaves such “big centers” fixed in place. Near the end, almost all of the moves are twists of the outer faces, because these are assured not to affect anything but the edge and corner cubies.

The commutators take a quicker eye to spot, but they’re in there. Watch how many times he’ll do a couple twists, a short maneuver, and then undo those couple twists. Just as we used such commutators, these provide easy generalizations of basic cycles, and they form the heart of this solver’s algorithm.

Alexandre asked a question about the asymptotic growth of the “worst assembly time” for the $n\times n\times n$ cube. What this is really asking is for the “diameter” of the $n$th Rubik’s group $G_n$. I don’t know offhand what this would be, but here’s a way to get at a rough estimate.

First, find a similar expression for the structure of $G_n$ as we found before for $G_3$. Then what basic twists do we have? For $n=3$ we had all six faces, which could be turned either way, and we let the center slices be fixed. In general we’ll have $\lfloor\frac{n}{2}\rfloor$ slices in each of six directions, each of which can be turned either way, for a total of $12\lfloor\frac{n}{2}\rfloor$ generators (and their inverses). But each generator should (usually) be followed by a different one, and definitely not by its own inverse. Thus we can estimate the number of words of length $l$ as $\left(12\lfloor\frac{n}{2}\rfloor-2\right)^l$. Then the structure of $G_n$ gives us a total size of the group, and the diameter should be about $\log_{\left(12\lfloor\frac{n}{2}\rfloor-2\right)}(|G_n|)$. Notice that for $n=3$ this gives us $20$, which isn’t far off from the known upper bound of $26$ quarter-turns.

February 19, 2008

## How to Use the FToC

I’ll get back to deconstructing comics another time. For now, I want to push on with some actual mathematics.

After much blood, toil, tears, and sweat, we’ve proven the Fundamental Theorem of Calculus. So what do we do with it? The answer’s in this diagram:

This is sort of schematic rather than something we can interpret literally.

On the left we have real-valued functions — we’re being vague about their domains — and collections of “signed” points. We also have a way of pairing a function with a collection of points: evaluate the function at each point, and then add up all the values or their negatives, depending on the sign of the point. On the right we also have real-valued functions, but now we consider intervals of the real line. We have another way of pairing a function with an interval: integration!

At the top of the diagram, we can take a function and differentiate it to get back another function. At the bottom, we can take an interval and get a collection of signed points by moving to the boundary. The interval $\left[a,b\right]$ has the boundary points $\{a^-,b^+\}$, where we consider $a$ to be “negatively signed”.

Now, what does the FToC tell us? If we start with a function $F$ in the upper left and an interval $\left[a,b\right]$ in the lower right, we have two ways of trying to pair them off. First, we could take the derivative of $F$ and then integrate it from $a$ to $b$ to get $\int_a^b F'(x)dx$. On the other hand, we could take the boundary of the interval and add up the function values along the boundary to get $F(b)-F(a)$. The FToC tells us that these two give us the same answer!

To write this in a diagram seems a little much, but keep the diagram in mind. We’ll come back to it later. For now, though, we can use it to understand how to use the FToC to handle integration.

Say we have a function $f$ and an interval $\left[a,b\right]$, and we need to find $\int_a^bf(x)dx$. We’ve got these big, messy Riemann sums (or Darboux sums), and there’s a lot of work to compute the integral by hand. But notice that the integral is living on the right side of the diagram. If we could move it over to the left, we’d just have to evaluate a function twice and add up the results.

Moving the interval to the left of the diagram is easy: we can just read off the boundary. Moving the function is harder. What we need is to find an antiderivative $F(x)$ so that $F'(x)=f(x)$. Then we move to the left of the diagram by switch attention from $f$ to $F$. Then we can evaluate $F(b)-F(a)$ and get exactly the same value as the integral we set out to calculate. So if we want to find integrals, we’d better get good at finding antiderivatives!

This has an immediate consequence. Our basic rules of antiderivatives carry over to give some basic rules for integration. In particular, we know that integrals play nicely with sums and scalar multiples:

$\displaystyle\int\limits_a^bf(x)+g(x)dx=\int\limits_a^bf(x)dx+\int\limits_a^bg(x)dx$
$\displaystyle\int\limits_a^bkf(x)dx=k\int\limits_a^bf(x)dx$

February 18, 2008 Posted by | Analysis, Calculus | 16 Comments

## XKCD… WTF?

Okay, usually I’m all behind XKCD, but today’s installment is a bit of a head-scratcher.

The title seems especially ill-chosen. I mean, I know that Randall’s not a doctor of linguistics, but he’s usually pretty on the ball. Clearly he can’t mean the title as a normative statement, but he also has to understand that “how it works” will commonly parse as “how it should work”. The fact that there’s no comeuppance for the jerk doesn’t help here. Without further comment, it’s easy to read the comic as an endorsement of this attitude.

The other thing that leaves a bad taste in my mouth is that the guy on the left is not a clearly-defined character we all know to be unpleasant already. Yes, I know this is arguing semiotics, but there’s a reason Goofus and Gallant comics are so easily read: a generic character will be interpreted as a generic person. Their behavior is then also taken as generic. Putting the Hat Guy in there would go a long way towards making this not seem like an endorsement.

And then the details are off. The characters are looking at a calculus problem. I don’t know anyone — at least any instructor — in this day and age who thinks like this at the calculus level. As far as I know, the psychological damage is usually done by this point. The attitude comes in during grade-school, so an arithmetic problem (and younger characters at the board) would be more appropriate. That is, unless Randall is asserting that this attitude is endemic (remember generic character => generic person) among calculus instructors.

In that case I really have to disagree with him on the strongest possible terms. But again, there’s no further comment, and the whole thing just feels disappointing as a result.

February 18, 2008 Posted by | rants | 42 Comments

## FToC Flame War Wrap-Up

Well, we’ve certainly had a lively time the last few days. Regular commentercomplainer Michael Livshits kicked it off by noting that I presented

The same pathetic proof that mixes apples and oranges, and makes the reader believe that MVT has anything to do woth FTC!

Then came some back-and-forth. I argued that there are many approaches, and due to different motivations we’ve chosen different ones. Michael argued that I was part of some “Church of Limitology” which “indoctrinates” calculus students, and that my proofs “suck”, are “trashy”, and are “in bad taste”. My point that this is not the approach I actually take in a classroom setting was ignored.

However, he did have some points. One of them was that we can weaken the theorem to only assume that the function $f$ is continuous at the point $x$, and my proof assumed way too much to say the function is continuous everywhere. But let’s consider where I actually used continuity. First it shows up in invoking the Integral Mean Value Theorem, but there I really only need it to say there’s a maximum and a minimum, so the Darboux sums work out. An integrable function still manages to satisfy this condition. Then I use continuity to show that $\lim\limits_{c\rightarrow x}f(c)=f(x)$, which really only needs continuity at $x$. In fact, my proof already works in Michael’s extended context.

He also tried presenting his own proof of the crucial step. He argues by continuity at $x$ that for any $\epsilon$ there is a $\delta$ so that $|f(x+\Delta x)-f(x)|<\epsilon$ when $|\Delta x|<\delta$. This fact then shows that $\int_x^{x+\Delta x}f(t)-f(x)dt<\epsilon|\Delta x|$, and so the limit (there’s that awful word again!) that I claimed in the post works out.

But what does his proof really mean? Go on, try and draw a picture. It’s saying that the difference in area we add by integrating from $x$ to $\Delta x$ from the area we’d add if we just used a constant height of $f(x)$ by less than any constant multiple of the width $\Delta x$. And that means… it’s obscure to me, at least.

On the other hand, my proof says that the area function changes by some amount as we go from $x$ to $x+\Delta x$, which means there’s some average (“mean”) rate of change over that interval. At some point along the way, the derivative actually attains that mean value, and as we contract the interval we push that middle point down to $x$. Now that makes sense to me.

Now, the real genius came later, after I sewed the two parts of the FToC together. Eventually, Michael said:

By MVT I meant the one for continuous function, that it hits the zero if it changes sign. Is it the one you were talking about, or you were talking abouth the Lagrange theorem (= MVT for the derivative)? I’m a bit confused. Well, either way it’s not too important.

And here it all runs off the rails. This whole time he hasn’t actually been reading a single thing about my proof, and evidently he hasn’t read the proofs in the calculus textbooks he so despises. He doesn’t even know which theorem I’m invoking! And it is important, because the different theorems say vastly different things.

Now it’s plain as day that Michael is a crank, pushing his pet theories while remaining so embittered to the “system” that “indoctrinates” students against him. Either that or he’s been trolling. The actual merits of my own proof — which I hope I’ve shown above to meet his tests — never mattered at all. I do hope he will set up his own weblog to present all his work in his own space, and then interested readers can judge for themselves the merits and demerits of different proofs.

As for this discussion, it’s closed. I have a proof, and Michael has a proof, and they both work. Our proofs emphasize different aspects of the theorem, and we choose between them depending on what we want to highlight for our current audience. Despite all his ranting, neither one is “the right way” or “the wrong way”, independent of context. I’m glad to hear alternative approaches here, since they might highlight points that I missed. But as a word to future ranters: don’t even try to use my weblog as your soapbox. That sort of behavior really is trashy, and in bad taste.

Besides, I’m the Dennis Miller around here.

[UPDATE]: I’ve come to a decision, since the war seems to rage on unabated, and Mr. Livshits refuses to take the olive branches of equanimity I’ve been offering since the beginning. As of midnight (Central Standard Time) tonight, this is over. Mr. Livshits goes in the kill file, and I wash my hands of the whole business. I’m sure he’ll cry foul, and oppression, and maybe he’s right. However, this whole mess just distracts from my work here, and I’m sick of it. From sideline conversations with numerous non-commenting readers, I’m not the only one.

I’ve made my case, and tried over and over to say that ultimately the whole debate comes down to aesthetics. His approach has its merits, as does mine. He really dislikes my approach, so much so that he’s willing to fight tooth and nail. Ultimately, I really don’t care to fight this any more. But since I can’t seem to continue this project without having my “mathematical taste” insulted left and right, I’m using my authority as the owner of this space to cut off debate. This does not continue here.

If Mr. Livshits wants to continue his tirades, he’s free to set up his own weblog, as I’ve encouraged him to do time and again. He can even continue to read and post to his own space in parallel to my coverage. If he’s right and a significant majority of my readers want to hear his side, he’ll have a built-in audience ready and waiting, and he’s welcome to it. Just like the sky, there’s a lot of blogosphere out there. Of course, he’ll eventually have to come up with something to fill his space, since I’m not spending the rest of my life here on calculus and elementary analysis.

February 16, 2008 Posted by | Analysis, Calculus | 1 Comment

## The Fundamental Theorem of Calculus (all together now)

So we’ve seen two sides of the FToC: the first part, which says that given a continuous function $f:\left[a,b\right]\rightarrow\mathbb{R}$ we can integrate and differentiate to get our function back:

$\displaystyle\frac{d}{dx}\int\limits_a^xf(t)dt=f(x)$

and the second part, which says that given a differentiable function $F:\left[a,b\right]\rightarrow\mathbb{R}$ whose derivative is the continuous function $f$, we can integrate to get (part of) our function back again:

$\displaystyle F(b)-F(a)=\int\limits_a^bf(x)dx$

Now, we proved these two sides in very different ways, but it turns out that we can get from one to the other.

Let’s assume the first part holds. Then we take the function $F$ and define $f(x)=F'(x)$ as its derivative. The first part of the theorem tells us that we know a function whose derivative is $f$: the function defined by $G(x)=\int_a^xf(t)dt$. And we know that any two functions with the same derivative must differ by a constant! That is, there is some real number $C$ with $F(x)=G(x)+C$. Using this to evaluate $F(b)-F(a)$ we find:

$\displaystyle F(b)-F(a)=(G(b)+C)-(G(a)+C)=G(b)-G(a)=$
$\displaystyle\int\limits_a^bf(x)dx-\int\limits_a^af(x)dx=\int\limits_a^bf(x)dx$

Which gives us the second part of the theorem.

On the other hand, what if we assume the second part of the theorem holds? Then we start with a continuous function $f:\left[a,b\right]\rightarrow\mathbb{R}$. Given $x\in\left[a,b\right]$, the function is continuous on the subinterval $\left[a,x\right]$, and so the second part of the FToC says that $F(x)-F(a)=\int_a^xf(t)dt$. That is, the integral in the first part of the FToC differs by a constant ($F(a)$) from the function $F$ we assumed to be an antiderivative of $f$. Thus it must itself be an antiderivative of $f$.

So each half of the Fundamental Theorem implies the other, and we can prove either one first before immediately deriving the other.

February 15, 2008 Posted by | Analysis, Calculus | 125 Comments

## The Fundamental Theorem of Calculus II

And now we come to the second part of the FToC. This takes the first part and flips it around.

We again start with a continuous function $f:\left[a,b\right]\rightarrow\mathbb{R}$, but now we take any antiderivative $F$, so that $f(x)=F'(x)$. Then the FToC asserts that

$\displaystyle F(b)-F(a)=\int\limits_a^bf(x)dx$

Before we differentiated a function we got by integrating to get back where we started. Now we’re integrating a function we get by differentiating, and again get back where we started. Integration and differentiation are two sides of the same coin.

Let’s consider a partition of $\left[a,b\right]$ with points $a=x_0,x_1,...,x_{n-1},x_n=b$. Then we see that $F(b)-F(a)=F(x_n)-F(x_0)$. We can add and subtract the value of $F$ at each of the intermediate points to see that

$\displaystyle F(b)-F(a)=F(x_n)-F(x_{n-1})+F(x_{n-1})-...-F(x_1)+F(x_1)-F(x_0)$
$\displaystyle F(b)-F(a)=\sum\limits_{i=1}^nF(x_i)-F(x_{i-1})$

Now the Differential Mean Value Theorem tells us that there’s a point $c_i\in\left[x_{i-1},x_i\right]$ so that $F(x_i)-F(x_{i-1})=(x_i-x_{i-1})F'(c_i)$. And we assumed that $F'(c_i)=f(c_i)$, so we have

$\displaystyle F(b)-F(a)=\sum\limits_{i=1}^n(x_i-x_{i-1})f(c_i)$

But this is a Riemann sum for the partition we chose, using the points $c_i$ as the tags. Since every partition, no matter how fine, has such a Riemann sum, the integral must take this value, and the second part of the FToC holds.

February 14, 2008 Posted by | Analysis, Calculus | 18 Comments

## The Fundamental Theorem of Calculus I

Today we get to the Fundamental Theorem of Calculus, which comes in two parts. This theorem is essential, in that it shows how the two seemingly-dissimilar fields of integral and differential calculus are actually two sides of the same coin. From this point, most of the basic theory of integration comes down to finding the “mirror image” of facts about differentiation.

First, let’s start with some continuous function $f:\left[a,b\right]\rightarrow\mathbb{R}$ and define a new function by $F(x)=\int_a^xf(t)dt$. Now the fundamental theorem tells us that this new function is differentiable, and its derivative is the function we started with! That is:

$\displaystyle\frac{d}{dx}\int\limits_a^xf(t)dt=f(x)$

To see this, let’s consider the difference $F(x+\Delta x)-F(x)$. The first term here is the integral $\int_a^{x+\Delta x}f(t)dt$. Then we can split this interval up to get the sum of the integrals $\int_a^{x}f(t)dt+\int_x^{x+\Delta x}f(t)dt$. But the first part here is just $F(x)$, which we’re about to subtract off. Then the difference quotient is $\frac{1}{\Delta x}\int_x^{x+\Delta x}f(t)dt$. The derivative $F'(x)$ will be the limit of this difference quotient as $\Delta x$ goes to ${0}$.

So now let’s use the Integral Mean Value Theorem to get at the integral here. It tells us that there’s some $c$ between $x$ and $x+\Delta x$ with $f(c)=\frac{1}{\Delta x}\int_x^{x+\Delta x}f(t)dt$ — the difference quotient exactly! And as $\Delta x$ gets smaller and smaller, $c$ gets squeezed closer and closer to $x$. And because $f$ is continuous, we find that $\lim\limits_{c\rightarrow x}f(c)=f(x)$. Presto!

A very common metaphor here is to think of a carpet whose width at a point $x$ along its length is $f(x)$. Then its total area from the starting point $a$ up to $x$ is the integral $\int_a^xf(t)dt$. How fast is the area increasing as we unroll more carpet? As we unroll $dx$ more length we get $f(x)dx$ more area, and so the derivative of the area is the width of the carpet.

February 13, 2008 Posted by | Analysis, Calculus | 39 Comments

## The Integral Mean Value Theorem

Okay: time to get back on track. Today, we’ll see a theorem about integrals that’s similar to the Differential Mean Value Theorem. Specifically, it states that if we have a continuous function $f:\left[a,b\right]\rightarrow\mathbb{R}$ then there is some $c\in\left[a,b\right]$ so that

$\displaystyle f(c)=\frac{1}{b-a}\int\limits_a^bf(x)dx$

Let’s consider the Darboux sums we use to define the integral. We know that if we choose a partition, then its upper Darboux sum is greater than any Riemann sum of any refinement of that partition. So let’s take the absolute coarsest possible partition: the one where we just have partition points $a$ and $b$. Then the upper Darboux sum is $(b-a)M$, where $M$ is the maximum value of $f$ on the interval $\left[a,b\right]$. Similarly, the lower Darboux sum on this interval is $(b-a)m$ (where $m$ is the minimum value of $f$), and it’s the lowest possible Darboux sum. Then we can divide everything in sight by $b-a$ to get the inequality

$\displaystyle m\leq\frac{1}{b-a}\int\limits_a^bf(x)dx\leq M$

Now the Intermediate Value Theorem tells us that $f$ must take every value between $m$ and $M$ at some point between $a$ and $b$. And thus there must exist a $c\in\left[a,b\right]$ so that

$\displaystyle f(c)=\frac{1}{b-a}\int\limits_a^bf(x)dx$

just as we wanted.

February 12, 2008 Posted by | Analysis, Calculus | 7 Comments

## Metric Spaces are Categories!

A guest post by Tom Leinster over at The n-Category Café reminded me of an interesting fact I haven’t mentioned yet: a metric space is actually an example of an enriched category!

First we’ll need to pick out our base category $\mathcal{V}$, in which we’ll find our hom-objects. Consider the set of nonnegative real numbers with their real-number order, and add in a point called $\infty$ that’s above all the other points. This is a totally ordered set, and orders are categories. Let’s take the opposite of this category. That is, the objects of our category $V$ are the points in the “interval” $\left[0,\infty\right]$, and we have an arrow $x\rightarrow y$ exactly when $x\geq y$.

This turns out to be a monoidal category, and the monoidal structure is just addition. Clearly this gives a monoid on the set of objects, but we need to check it on morphisms to see it’s functorial. But if $x_1\geq y_1$ and $x_2\geq y_2$ then $x_1+x_2\geq y_1+y_2$, and so we can see addition as a functor.

So we’ve got a monoidal category, and we can now use it to form enriched categories. Let’s keep out lives simple by considering a small $\mathcal{V}$-category $\mathcal{C}$. Here’s how the definition looks.

We have a set of objects $\mathrm{Ob}(\mathcal{C})$ that we’ll call “points” in a set $X$. Between any two points $p_1$ and $p_2$ we need a hom-object $\hom_\mathcal{C}(p_1,p_2)\in\mathrm{Ob}(\mathcal{V})$. That is, we have a function $d:X\times X\rightarrow\left[0,\infty\right]$.

For a triple $(p_1,p_2,p_3)$ of objects we need an arrow $\hom_\mathcal{C}(p_2,p_3)\otimes\hom_\mathcal{C}(p_1,p_2)\rightarrow\hom_\mathcal{C}(p_1,p_3)$. In more quotidian terms, this means that $d(p_2,p_3)+d(p_1,p_2)\geq d(p_1,p_3)$.

Also, for each point $p$ there is an arrow from the identity object of $\mathcal{V}$ to the hom-object $\hom_\mathcal{C}(p,p)$. That is, $0\geq d(p,p)$, so $d(p,p)=0$.

These conditions are the first, fourth, and half of the second conditions in the definition of a metric space! In fact, there’s a weaker notion of a “pseudometric” space, wherein the second condition is simply that $d(p,p)=0$, and so we’re almost exactly giving the definition of a pseudometric space.

The only thing we’re missing is the requirement that $d(p_1,p_2)=d(p_2,p_1)$. The case can be made (and has been, by Lawvere) that this requirement is actually extraneous, and that it’s in some sense more natural to work with “asymmetric” (pseudo)metric spaces that are exactly those given by this enriched categorical framework.

February 11, 2008

## Some integrable functions

Okay, we know what it means for a function to be integrable (in either of the equivalent Riemann or Darboux senses), but we don’t yet know any functions to actually be integrable. I won’t give the whole story now, but a just a large enough part to work with for the moment.

The major theorem here is that a continuous function $f$ on a closed interval $\left[a,b\right]$ is integrable. Notice from the Heine-Cantor theorem that $f$ is automatically uniformly continuous. That is, for any $\epsilon$ there is some $\delta$ so that for all $x$ and $y$ in $\left[a,b\right]$ with $|y-x|<\delta$ we have $|f(y)-(x)|>\epsilon$. Again, the important thing here is that we can choose our $\delta$ independently of the point $x$, while continuity says $\delta$ might depend on $x$.

So now we need to take our function and show that the upper and lower Darboux sums converge to the same value. Equivalently, we can show that their difference converges to zero. So given an $\epsilon$ we want to show that there is some partition $x$ so that the difference

$\displaystyle U_x(f)-L_x(f)=\sum\limits_{i=1}^n(M_i-m_i)(x_i-x_{i-1})$

is less than $\epsilon$, and the same is true of any refinement of this partition.

We’ll choose our partition with every slice having constant width $\delta$, so there are $\frac{b-a}{\delta}$ of them. By the uniform continuity of $f$ we can find a $\delta$ so that for any points $x$ and $y$ with $|y-x|<\delta$ we’ll have $|f(y)-f(x)|<\frac{\epsilon}{b-a}$. Then in particular the difference $M_i-m_i$ will be less than $\frac{\epsilon}{b-a}$, while $x_i-x_{i-1}=\delta$ and $n=\frac{b-a}{\delta}$. Thus the difference in the Darboux sums will be less than $\epsilon$, as we wanted.

What about refinements? Well, any refinement of a partition can only lower the upper Darboux sum and raise the lower one. This is because adding a point to a partition can’t raise the maximum in either of the new subintervals or lower the minimum, and in fact adding a point will usually lower the maximum and raise the minimum. So our partition has a small enough difference in the Darboux sums, and any refinement will make the difference even smaller, and thus we have the convergence we need.

Now, can we do better than continuous functions? Well, we can relax continuity at the endpoints a bit. If the function jumps to a different value at $a$ or $b$ than the limit seems to indicate, we can still get uniform continuity everywhere but that one point, and we’re still good. We still have problems with asymptotes where the function shoots off to infinity, like $f(x)=\frac{1}{x}$ does at the left endpoint of $\left[0,1\right]$.

What else? Well, we can allow a finite number of discontinuities, as long as none of them are asymptotes. If a discontinuity happens at $c\in\left[a,b\right]$, we can choose $c$ to be a partition point, and so on. Then a partition with these selected points is just the same as a partition on each of the continuous sections of the function in between the discontinuities, and we know that they’re all good.

Incidentally, we can use this same method of picking some of the partition points ahead of time to show another nice property of the integrals: we can break the integral in the middle of the interval and evaluate the two pieces separately, then add them together. That is, if $c\in\left[a,b\right]$ then we have the equation

$\displaystyle\int_a^bf(x)dx=\int_a^cf(x)dx+\int_c^bf(x)dx$

as long as both sides are integrable.

February 1, 2008 Posted by | Analysis, Calculus | 2 Comments