The Unapologetic Mathematician

Mathematics for the interested outsider

The Meaning of the Speed of Light

Let’s pick up where we left off last time converting Maxwell’s equations into differential forms:

\displaystyle\begin{aligned}*d*\epsilon&=\mu_0c^2\rho\\d\beta&=0\\d\epsilon&=-\frac{\partial\beta}{\partial t}\\{}*d*\beta&=\mu_0\iota+\frac{1}{c^2}\frac{\partial\epsilon}{\partial t}\end{aligned}

Now let’s notice that while the electric field has units of force per unit charge, the magnetic field has units of force per unit charge per unit velocity. Further, from our polarized plane-wave solutions to Maxwell’s equations, we see that for these waves the magnitude of the electric field is c — a velocity — times the magnitude of the magnetic field. So let’s try collecting together factors of c\beta:

\displaystyle\begin{aligned}*d*\epsilon&=\mu_0c^2\rho\\d(c\beta)&=0\\d\epsilon&=-\frac{1}{c}\frac{\partial(c\beta)}{\partial t}\\{}*d*(c\beta)&=\mu_0c\iota+\frac{1}{c}\frac{\partial\epsilon}{\partial t}\end{aligned}

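Now each of the time derivatives comes along with a factor of \frac{1}{c}. We can absorb this by introducing a new variable \tau=ct, which is measured in units of distance rather than time. Then we can write:

\displaystyle\begin{aligned}*d*\epsilon&=\mu_0c^2\rho\\d(c\beta)&=0\\d\epsilon&=-\frac{\partial(c\beta)}{\partial\tau}\\{}*d*(c\beta)&=\mu_0c\iota+\frac{\partial\epsilon}{\partial\tau}\end{aligned}
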
The easy thing here is to just write t instead of \tau, but this hides a deep insight: the speed of light c is acting like a conversion factor from units of time to units of distance. That is, we don’t just say that light moves at a speed of c=299\,792\,458\frac{\mathrm{m}}{\mathrm{s}}, we say that one second of time is 299,792,458 meters of distance. This is an incredible identity that allows us to treat time and space on an equal footing, and it is borne out in many more or less direct experiments. I don’t want to get into all the consequences of this fact (the name for them as a collection is “special relativity”), but I do want to use it.

This lets us go back and write \beta instead of c\beta, since the factor of c here is just an artifact of using a coordinate system that treats time and distance separately; the electric and magnetic fields in a propagating electromagnetic plane wave are “really” the same size. We can also just write t instead of ct for the same reason. Finally, we can collect c\rho together to put it on the exact same footing as \iota.

\displaystyle\begin{aligned}*d*\epsilon&=\mu_0c\rho\\d\beta&=0\\d\epsilon&=-\frac{\partial\beta}{\partial t}\\{}*d*\beta&=\mu_0c\iota+\frac{\partial\epsilon}{\partial t}\end{aligned}

The meanings of these terms are getting further and further from familiarity. The 1-form \epsilon is still made of the same components as the electric field; the 2-form \beta is c times the Hodge star of the 1-form whose components are those of the magnetic field; the function \rho is c times the charge density; and the 1-form \iota has the components of the current density.
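As a small illustration of reading c as a conversion factor between time and distance, here is a minimal Python sketch. The function names are made up for this example; the only physical input is the exact SI value of c.

```python
# Treat c as a pure conversion factor between seconds and meters.
C_METERS_PER_SECOND = 299_792_458  # exact, by the SI definition of the meter


def seconds_to_meters(t_seconds):
    """Express a time interval as the distance tau = c * t."""
    return C_METERS_PER_SECOND * t_seconds


def meters_to_seconds(d_meters):
    """Express a distance as the time light needs to cross it."""
    return d_meters / C_METERS_PER_SECOND


one_second = seconds_to_meters(1)           # one second, measured in meters
one_nanosecond = seconds_to_meters(1e-9)    # roughly thirty centimeters
round_trip = meters_to_seconds(one_second)  # back to one second
```

In these units a time derivative and a space derivative carry the same dimensions, which is why the factors of c drop out of the equations above.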

February 24, 2012 Posted by | Electromagnetism, Mathematical Physics | 4 Comments

Maxwell’s Equations in Differential Forms

To this point, we’ve mostly followed a standard approach to classical electromagnetism, and nothing I’ve said should be all that new to a former physics major, although at some points we’ve infused more mathematical rigor than is typical. But now I want to go in a different direction.

Starting again with Maxwell’s equations, we see all these divergences and curls which, though familiar to many, are really heavy-duty equipment. In particular, they rely on the Riemannian structure on \mathbb{R}^3. We want to strip this away to find something that works without this assumption, and as a first step we’ll flip things over into differential forms.

So let’s say that the magnetic field B corresponds to a 1-form \beta, while the electric field E corresponds to a 1-form \epsilon. To avoid confusion between \epsilon and the electric constant \epsilon_0, let’s also replace some of our constants with the speed of light — \epsilon_0\mu_0=\frac{1}{c^2}. At the same time, we’ll replace J with a 1-form \iota. Now Maxwell’s equations look like:

\displaystyle\begin{aligned}*d*\epsilon&=\mu_0c^2\rho\\{}*d*\beta&=0\\{}*d\epsilon&=-\frac{\partial\beta}{\partial t}\\{}*d\beta&=\mu_0\iota+\frac{1}{c^2}\frac{\partial\epsilon}{\partial t}\end{aligned}

Now I want to juggle around some of these Hodge stars:

\displaystyle\begin{aligned}*d*\epsilon&=\mu_0c^2\rho\\d(*\beta)&=0\\d\epsilon&=-\frac{\partial(*\beta)}{\partial t}\\{}*d*(*\beta)&=\mu_0\iota+\frac{1}{c^2}\frac{\partial\epsilon}{\partial t}\end{aligned}

Notice that we’re never just using the 1-form \beta, but rather the 2-form *\beta. Let’s actually go back and use \beta to represent a 2-form, so that B corresponds to the 1-form *\beta:

\displaystyle\begin{aligned}*d*\epsilon&=\mu_0c^2\rho\\d\beta&=0\\d\epsilon&=-\frac{\partial\beta}{\partial t}\\{}*d*\beta&=\mu_0\iota+\frac{1}{c^2}\frac{\partial\epsilon}{\partial t}\end{aligned}

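In the static case, where time derivatives are zero, we see how symmetric this new formulation is:

\displaystyle\begin{aligned}*d*\epsilon&=\mu_0c^2\rho\\d\epsilon&=0\\d\beta&=0\\{}*d*\beta&=\mu_0\iota\end{aligned}
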
For both the 1-form \epsilon and the 2-form \beta, the exterior derivative vanishes, and the operator *d* connects the fields to sources of physical charge and current.

February 22, 2012 Posted by | Electromagnetism, Mathematical Physics | 2 Comments

A Short Rant about Electromagnetism Texts

I’d like to step aside from the main line to make one complaint. In refreshing my background in classical electromagnetism for this series I’ve run into something that bugs the hell out of me as a mathematician. I remember it from my own first course, but I’m shocked to see that it survives into every upper-level treatment I’ve seen.

It’s about the existence of potentials, and the argument usually goes like this: as Faraday’s law tells us, for a static electric field we have \nabla\times E=0; therefore E=\nabla\phi for some potential function \phi because the curl of a gradient is zero.


Let’s break this down to simple formal logic that any physics undergrad can follow. Let P be the statement that there exists a \phi such that E=\nabla\phi. Let Q be the statement that \nabla\times E=0. The curl of a gradient being zero is the implication P\implies Q. So here’s the logic:

\displaystyle\begin{aligned}&Q\\&P\implies Q\\&\therefore P\end{aligned}

and that doesn’t make sense at all. It’s a textbook case of “affirming the consequent”.
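To make the failure concrete, here is a minimal Python sketch that enumerates the truth table. The helper `implies` and the variable names are introduced only for this illustration.

```python
from itertools import product


def implies(p, q):
    """Material implication: p -> q is false only when p is true and q is false."""
    return (not p) or q


# Rows of the truth table where both premises hold: Q, and P implies Q.
premises_hold = [(p, q) for p, q in product([False, True], repeat=2)
                 if q and implies(p, q)]

# The inference would be valid only if P were true in every such row.
counterexamples = [(p, q) for p, q in premises_hold if not p]
```

The list `counterexamples` is not empty: it contains the row with P false and Q true, which is exactly the situation of a curl-free field with no potential.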

Saying that E has a potential function is a nice, convenient way of satisfying the condition that its curl should vanish, but this argument gives no rationale for believing it’s the only option.

If we flip over to the language of differential forms, we know that the curl operator on a vector field corresponds to the operator \alpha\mapsto*d\alpha on 1-forms, while the gradient operator corresponds to f\mapsto df. We indeed know that *ddf=0 automatically — the curl of a gradient vanishes — but knowing that d\alpha=0 is not enough to conclude that \alpha=df for some f. In fact, this question is exactly what de Rham cohomology is all about!

So what’s missing? Full formality demands that we justify that the first de Rham cohomology of our space vanishes. Now, I’m not suggesting that we make physics undergrads learn about homology (it might not be a terrible idea, though), but we can satisfy this in the context of a course just by admitting that (a) we are being a little sloppy here, and (b) the justification is that (for our purposes) the electric field E is defined in some simply-connected region of space which has no “holes” one could wrap a path around. In fact, if the students have had a decent course in multivariable calculus they’ve probably seen the explicit construction of a potential function for a vector field whose curl vanishes subject to the restriction that we’re working over a simply-connected space.
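A standard counterexample shows what goes wrong when the region does have a hole. The sketch below, in plain Python, uses the planar field (-y, x)/(x^2+y^2) on the plane with the origin removed; this particular field and the function names are my choice for illustration, not something taken from the texts being criticized. Its curl vanishes wherever it is defined, yet its circulation around the origin is not zero, so it cannot be a gradient.

```python
import math


def field(x, y):
    """The planar field (-y, x) / (x^2 + y^2), defined everywhere except the origin."""
    r2 = x * x + y * y
    return (-y / r2, x / r2)


def curl_z(x, y, h=1e-5):
    """Central-difference estimate of dFy/dx - dFx/dy at (x, y)."""
    dFy_dx = (field(x + h, y)[1] - field(x - h, y)[1]) / (2 * h)
    dFx_dy = (field(x, y + h)[0] - field(x, y - h)[0]) / (2 * h)
    return dFy_dx - dFx_dy


def circulation(radius=1.0, steps=10_000):
    """Midpoint-rule line integral of the field around a circle about the origin."""
    total = 0.0
    dtheta = 2 * math.pi / steps
    for i in range(steps):
        theta = (i + 0.5) * dtheta
        x, y = radius * math.cos(theta), radius * math.sin(theta)
        fx, fy = field(x, y)
        # Tangent to the circle, scaled by the step in theta.
        dx = -radius * math.sin(theta) * dtheta
        dy = radius * math.cos(theta) * dtheta
        total += fx * dx + fy * dy
    return total


curl_sample = curl_z(0.7, -0.4)  # numerically zero away from the origin
loop_integral = circulation()    # close to 2*pi, not zero
```

If the field were a gradient, the integral around any closed loop would vanish; the loop around the hole is what detects the nontrivial first cohomology.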

The problem arises again in justifying the existence of a vector potential: as Gauss’ law for magnetism tells us, for a magnetic field we have \nabla\cdot B=0; therefore B=\nabla\times A for some vector potential A because the divergence of a curl is zero.

Again we see the same problem of affirming the consequent. And again the real problem hinges on the unspoken assumption that the second de Rham cohomology of our space vanishes. Yes, this is true for contractible spaces, but we must make mention of the fact that our space is contractible! In fact, I did exactly that when I needed to get ahold of the magnetic potential once.

Again: we don’t need to stop simplifying and sweeping some of these messier details of our arguments under the rug when dealing with undergraduate students, but we do need to be honest that those details were there to be swept in the first place. The alternative most texts and notes choose now is to include statements which are blatantly false, and to rely on our authority to make students accept them unquestioningly.

February 18, 2012 Posted by | Electromagnetism, Mathematical Physics | 27 Comments

Conservation of Electromagnetic Energy

Let’s start with Ampère’s law, including Maxwell’s correction:

\displaystyle\nabla\times B=\mu_0J+\epsilon_0\mu_0\frac{\partial E}{\partial t}

Now let’s take the dot product of this with the electric field:

\displaystyle E\cdot(\nabla\times B)=\mu_0E\cdot J+\epsilon_0\mu_0E\cdot\frac{\partial E}{\partial t}

On the left, we can run a product rule in reverse:

\displaystyle B\cdot(\nabla\times E)-\nabla\cdot(E\times B)=\mu_0E\cdot J+\epsilon_0\mu_0E\cdot\frac{\partial E}{\partial t}

Now, Faraday’s law tells us that

\displaystyle\nabla\times E=-\frac{\partial B}{\partial t}

so we can write:

\displaystyle-B\cdot\frac{\partial B}{\partial t}-\nabla\cdot(E\times B)=\mu_0E\cdot J+\epsilon_0\mu_0E\cdot\frac{\partial E}{\partial t}

Let’s rearrange this a bit:

\displaystyle-\frac{1}{\mu_0}B\cdot\frac{\partial B}{\partial t}-\epsilon_0E\cdot\frac{\partial E}{\partial t}=\nabla\cdot\left(\frac{E\times B}{\mu_0}\right)+E\cdot J

The dot product of a vector field with its own derivative should look familiar; we can rewrite:

\displaystyle-\frac{\partial}{\partial t}\left(\frac{1}{2\mu_0}B\cdot B+\frac{\epsilon_0}{2}E\cdot E\right)=\nabla\cdot\left(\frac{E\times B}{\mu_0}\right)+E\cdot J

But now we should recognize almost all the terms in sight! On the left, we’re taking the derivative of the combined energy densities of the electric and magnetic fields:

\displaystyle U=\frac{\epsilon_0}{2}\lvert E\rvert^2+\frac{1}{2\mu_0}\lvert B\rvert^2

The second term on the right is the energy density lost to Joule heating per unit time. The only thing left is this vector field:

\displaystyle u=\frac{E\times B}{\mu_0}

which we call the “Poynting vector”. It’s really named after British physicist John Henry Poynting, but generations of students remember it because it “points” in the direction electromagnetic energy flows.

To see this, look at the final form of our equation:

\displaystyle-\frac{\partial U}{\partial t}=\nabla\cdot u+E\cdot J

On the left we have the rate at which the electromagnetic energy is going down at any given point. On the right, we have two terms; the second is the rate electromagnetic energy density is being lost to heat energy at the point, while the first is the rate electromagnetic energy is “flowing away from” the point.

Compare this with the conservation of charge:

\displaystyle-\frac{\partial\rho}{\partial t}=\nabla\cdot J

where the rate at which charge density decreases is equal to the rate that charge is “flowing away” through currents. The only difference is that there is no dissipation term for charge like there is for energy.

One other important thing to notice is what this tells us about our plane wave solutions. If we take such an electromagnetic wave propagating in the direction k and with the electric field polarized in some particular direction, then we can determine that

\displaystyle u=\frac{E\times B}{\mu_0}=\frac{\lvert E\rvert^2}{\mu_0c}k=\epsilon_0c\lvert E\rvert^2k

showing that electromagnetic waves carry electromagnetic energy in the direction that they propagate.
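Here is a quick numerical check of that last formula in Python, as a sketch under stated assumptions: the wave propagates along z, the electric field points along x with a made-up amplitude `E0`, and the magnetic field points along y with magnitude E0/c, as in the polarized plane-wave solutions.

```python
import math

MU_0 = 4e-7 * math.pi          # magnetic constant in H/m (classical defined value)
C = 299_792_458.0              # speed of light in m/s
EPSILON_0 = 1 / (MU_0 * C**2)  # from epsilon_0 * mu_0 = 1 / c^2


def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def poynting(E, B):
    """The Poynting vector (E x B) / mu_0."""
    return tuple(component / MU_0 for component in cross(E, B))


E0 = 100.0              # hypothetical field amplitude in V/m
E = (E0, 0.0, 0.0)      # electric field along x
B = (0.0, E0 / C, 0.0)  # magnetic field along y, smaller by a factor of c
u = poynting(E, B)
expected_magnitude = EPSILON_0 * C * E0**2
```

The only nonzero component of `u` is the z-component, and it agrees with `expected_magnitude` up to rounding.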

February 17, 2012 Posted by | Electromagnetism, Mathematical Physics | 2 Comments

Ohm’s Law

When calculating the potential energy of the magnetic field, we calculated the power needed to run a certain current around a certain circuit. Let’s look into that a little more deeply.

We start with Ohm’s law, which basically says that — as a first approximation — the electromotive force around a circuit is proportional to the current around it; push harder and you’ll move charge faster. As a formula:

\displaystyle V=IR

The electromotive force — or “voltage” — on the left is equal to the current around the circuit times the “resistance”. What’s the resistance? Well, here it’s basically just a constant of proportionality, which we read as “how hard is it to push charge around this circuit?”

But let’s dig in a bit more. A current doesn’t really flow around an infinitely-thin wire; it flows around a wire with some thickness. The thicker the wire is — the bigger its cross-sectional area — the easier it should be to push charge around, while the longer the circuit is, the harder. We’ll write down our resistance in the form

\displaystyle R=\eta\frac{l}{A}

where l is the length of the wire, A is its cross-sectional area, and \eta is a new proportionality constant we call “resistivity”. Putting this together with the first form of Ohm’s law we find

\displaystyle V=\eta\frac{l}{A}I

But look at this: the current is made up of a current density flowing along the wire, integrated across a cross-section. If the wire is running in the z direction and the current density in that direction is constantly J_z, then I=J_zA. Further, at least to a first approximation, the electromotive force is the z-component of the electric field E_z times the length l traveled in that direction: V=E_zl.

Thus we conclude that E_z=\eta J_z. But since there’s nothing really special about the z direction, we actually find that

\displaystyle E=\eta J

which is Ohm’s law again, but now in terms of fields and current distributions.

But what about the power? We’ve got a battery pushing a current around a circuit and using power to do it; where does the energy go? Well, if we think about pushing little bits of charge around the wire, they’re going to hit parts of the wire and lose some energy in the process. The parts they hit get shaken up, and this appears as heat energy; the process is called “Ohmic” or “Joule” heating, the latter from Joule’s own experiments using a resistive wire to heat up a tub of water.

If we have a current I made up of N bits of charge q per unit time, then each bit takes an energy of qV to go around the circuit once. This happens N times per unit time, so the total power expenditure is

\displaystyle P=NqV=IV

just as we said last time. But now we can do the same trick as above and write

\displaystyle P=IV=(J\cdot E)Al

or

\displaystyle\frac{P}{Al}=E\cdot J

which measures the power per unit volume dissipated through Joule heating in the circuit.
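To see the circuit and field versions of Ohm’s law agree, here is a small Python sketch with made-up numbers. The resistivity is only an assumed round value, roughly that of copper, and the wire dimensions and current are hypothetical.

```python
import math

resistivity = 1.7e-8  # ohm * meter; assumed value, roughly copper
length = 10.0         # meters of wire (hypothetical)
radius = 0.5e-3       # wire radius in meters (hypothetical)
current = 2.0         # amperes (hypothetical)

area = math.pi * radius**2

# Circuit picture: R = eta * l / A, V = I R, P = I V.
resistance = resistivity * length / area
voltage = current * resistance
power = current * voltage

# Field picture: J = I / A and E = V / l, so E = eta * J.
J = current / area
E = voltage / length
power_density = E * J  # watts per cubic meter
```

Multiplying `power_density` by the wire’s volume `area * length` recovers `power`, and `E` equals `resistivity * J`, as the derivation claims.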

February 16, 2012 Posted by | Electromagnetism, Mathematical Physics | 1 Comment

Energy and the Magnetic Field

Last time we calculated the energy of the electric field. Now let’s repeat with the magnetic field, and let’s try to be a little more careful about it since magnetic fields can be slippery.

Let’s consider a static magnetic field B generated by a collection of circuits C_i, each carrying a current I_i. Recall that Gauss’ law for magnetism tells us that \nabla\cdot B=0; since space is contractible, we know that its homology is trivial, and thus B must be the curl of some other vector field A, which we call the “magnetic potential” or “vector potential”. Now we can write down the flux of the magnetic field through each circuit:

\displaystyle\Phi_i=\int\limits_{S_i}B\cdot dS_i=\int\limits_{C_i}A\cdot dr_i

Now Faraday’s law tells us about the electromotive force induced on the circuit:

\displaystyle V_i=-\frac{d\Phi_i}{dt}

This electromotive force must be counterbalanced by a battery maintaining the current or else the magnetic field wouldn’t be static.

We can determine how much power the battery must expend to maintain the current; a charge q moving around the circuit has its energy changed by qV_i, which the battery must offset with a work of -qV_i to send it around again. If n such charges pass around in unit time, this is a work of -nqV_i per unit time; since nq=I_i is the current, we find that the power expenditure is P_i=-I_iV_i, or

\displaystyle P_i=I_i\frac{d\Phi_i}{dt}

Thus if we want to ramp the currents — and the field — up from a cold start in a time T it takes a total work of

\displaystyle W=\sum\limits_{i=1}^N\int\limits_0^TI_i\frac{d\Phi_i}{dt}\,dt

which is then the energy stored in the magnetic field.

This expression doesn’t depend on exactly how the field turns on, so let’s say the currents ramp up linearly:

\displaystyle I_i(t)=I_i(T)\frac{t}{T}

and since the fluxes are proportional to the currents they must also ramp up linearly:

\displaystyle\Phi_i(t)=\Phi_i(T)\frac{t}{T}

Plugging these in above, we find:

\displaystyle W=\sum\limits_{i=1}^N\int\limits_0^TI_i(T)\Phi_i(T)\frac{t}{T^2}\,dt=\frac{1}{2}\sum\limits_{i=1}^NI_i(T)\Phi_i(T)
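The claim that the answer does not depend on how the field turns on can be checked numerically. The following Python sketch treats a single circuit whose flux is proportional to its current; the proportionality constant `flux_per_current`, the ramp shapes, and the numbers are all hypothetical choices for illustration.

```python
def work_to_ramp(shape, final_current, flux_per_current, steps=100_000):
    """Estimate the integral of I dPhi over a ramp, with Phi = flux_per_current * I.

    `shape` maps the fraction of elapsed time s = t/T in [0, 1] to the fraction
    of the final current, with shape(0) = 0 and shape(1) = 1. The total time T
    drops out, since I (dPhi/dt) dt = I dPhi.
    """
    total = 0.0
    ds = 1.0 / steps
    for n in range(steps):
        s_lo, s_hi = n * ds, (n + 1) * ds
        current_mid = final_current * shape(0.5 * (s_lo + s_hi))
        d_flux = flux_per_current * final_current * (shape(s_hi) - shape(s_lo))
        total += current_mid * d_flux
    return total


def linear_ramp(s):
    return s


def quadratic_ramp(s):
    return s * s


final_current = 3.0     # amperes (hypothetical)
flux_per_current = 0.2  # webers per ampere (hypothetical)

w_linear = work_to_ramp(linear_ramp, final_current, flux_per_current)
w_quadratic = work_to_ramp(quadratic_ramp, final_current, flux_per_current)
half_I_phi = 0.5 * final_current * (flux_per_current * final_current)
```

Both ramps give the same work, equal to one half of the final current times the final flux.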

Now we can plug in our original expression for the flux:

\displaystyle W=\frac{1}{2}\sum\limits_{i=1}^NI_i\int\limits_{C_i}A\cdot dr_i

This is great. But to be more general, let’s replace our currents with a current distribution:

\displaystyle W=\frac{1}{2}\int\limits_{\mathbb{R}^3}A\cdot J\,dV

Now we can use Ampère’s law to write

\displaystyle\begin{aligned}W&=\frac{1}{2\mu_0}\int\limits_{\mathbb{R}^3}A\cdot(\nabla\times B)\,dV\\&=\frac{1}{2\mu_0}\int\limits_{\mathbb{R}^3}B\cdot(\nabla\times A)-\nabla\cdot(A\times B)\,dV\\&=\frac{1}{2\mu_0}\int\limits_{\mathbb{R}^3}B\cdot B\,dV-\frac{1}{2\mu_0}\int\limits_{\mathbb{R}^3}\nabla\cdot(A\times B)\,dV\end{aligned}

We can pull the same sort of trick as last time to make the second integral go away; use the divergence theorem to convert to

\displaystyle\frac{1}{2\mu_0}\lim\limits_{R\to\infty}\int\limits_{S_R}(A\times B)\cdot dA

and take the surface far enough away that the integral becomes negligible. We handwave that A\times B falls off roughly as the inverse fifth power of R, while the area of S_R only grows as the second power, and say that the term goes to zero.

So now we have a similar expression as last time for a magnetic energy density:

\displaystyle u_B=\frac{1}{2\mu_0}\lvert B\rvert^2

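Again, we can check the units; the magnetic field has units of force per unit charge per unit velocity:

\displaystyle\frac{\mathrm{N}}{\mathrm{C}\cdot\frac{\mathrm{m}}{\mathrm{s}}}=\frac{\mathrm{N}\cdot\mathrm{s}}{\mathrm{C}\cdot\mathrm{m}}=\frac{\mathrm{kg}}{\mathrm{C}\cdot\mathrm{s}}

while the magnetic constant has units of henries per meter:

\displaystyle\frac{\mathrm{H}}{\mathrm{m}}=\frac{\mathrm{kg}\cdot\mathrm{m}}{\mathrm{C}^2}

Putting together an inverse factor of the magnetic constant and two of the magnetic field, we get:

\displaystyle\frac{\mathrm{C}^2}{\mathrm{kg}\cdot\mathrm{m}}\cdot\frac{\mathrm{kg}^2}{\mathrm{C}^2\cdot\mathrm{s}^2}=\frac{\mathrm{kg}}{\mathrm{m}\cdot\mathrm{s}^2}=\frac{\mathrm{kg}\cdot\mathrm{m}^2}{\mathrm{s}^2}\cdot\frac{1}{\mathrm{m}^3}=\frac{\mathrm{J}}{\mathrm{m}^3}
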
or, units of energy per unit volume, just like we expect for an energy density.
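The same bookkeeping can be done mechanically. This is a minimal Python sketch that tracks exponents of kilograms, meters, seconds, and coulombs; the helpers `dims`, `times`, and `power` are invented for this example, and the henry is taken as a joule per square ampere.

```python
def dims(kg=0, m=0, s=0, C=0):
    """A unit, recorded as exponents of kilogram, meter, second, and coulomb."""
    return {"kg": kg, "m": m, "s": s, "C": C}


def times(a, b):
    """Multiply two units by adding exponents."""
    return {key: a[key] + b[key] for key in a}


def power(a, n):
    """Raise a unit to an integer power by scaling exponents."""
    return {key: a[key] * n for key in a}


NEWTON = dims(kg=1, m=1, s=-2)
JOULE = times(NEWTON, dims(m=1))
VELOCITY = dims(m=1, s=-1)
CHARGE = dims(C=1)

# Magnetic field: force per unit charge per unit velocity.
B_FIELD = times(NEWTON, power(times(CHARGE, VELOCITY), -1))

# Henry = joule per ampere squared = joule * second^2 / coulomb^2.
HENRY = times(JOULE, dims(s=2, C=-2))
MU_0 = times(HENRY, dims(m=-1))

energy_density = times(power(MU_0, -1), power(B_FIELD, 2))
joules_per_cubic_meter = times(JOULE, dims(m=-3))
```

The two dictionaries `energy_density` and `joules_per_cubic_meter` come out identical.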

February 14, 2012 Posted by | Electromagnetism, Mathematical Physics | 4 Comments

Energy and the Electric Field

Okay, now let’s consider the electric field from the perspective of energy. We have an idea that this might be interesting because we know that the field produces a force, and forces and energies interact in interesting ways.

So recall that if we have a “test charge” q at a point p in an electric field E it experiences a force F=qE(p). As we saw when discussing Faraday’s law, for a static electric field we can write E=-\nabla\phi for some “electric potential” function \phi. Thus we can also write F=-\nabla U for the potential energy function U=q\phi.

Now, say the field is generated by a charge distribution \rho; how much potential energy is contained in the force the field exerts on the little bit of charge at p? We count U=\rho(p)\phi(p), but this is too much: half of it is due to the rest of the distribution acting on the bit of charge at p and half of it comes from the charge at p acting back. We can thus find the total potential energy by integrating

\displaystyle U=\frac{1}{2}\int\limits_{\mathbb{R}^3}\rho(p)\phi(p)\,d^3p

Now, Gauss’ law tells us that \rho=\epsilon_0\nabla\cdot E, so we substitute:

\displaystyle U=\frac{1}{2}\int\limits_{\mathbb{R}^3}\epsilon_0(\nabla\cdot E)\phi\,dV

Next we use a form of the product rule — \nabla\cdot(fV)=(\nabla f)\cdot V+f(\nabla\cdot V) — and run it backwards to write:

\displaystyle\begin{aligned}U&=\frac{\epsilon_0}{2}\int\limits_{\mathbb{R}^3}\nabla\cdot(\phi E)-(\nabla\phi)\cdot E\,dV\\&=\frac{\epsilon_0}{2}\int\limits_{\mathbb{R}^3}\nabla\cdot(\phi E)\,dV+\frac{\epsilon_0}{2}\int\limits_{\mathbb{R}^3}(-\nabla\phi)\cdot E\,dV\\&=\frac{\epsilon_0}{2}\lim\limits_{R\to\infty}\int\limits_{B_R}\nabla\cdot(\phi E)\,dV+\frac{\epsilon_0}{2}\int\limits_{\mathbb{R}^3}E\cdot E\,dV\end{aligned}

where we evaluate the first integral over space by evaluating it over the solid ball of radius R and taking the limit as R goes off to infinity. The divergence theorem says we can write:

\displaystyle\begin{aligned}U&=\frac{\epsilon_0}{2}\lim\limits_{R\to\infty}\int\limits_{S_R}\phi E\cdot dA+\frac{\epsilon_0}{2}\int\limits_{\mathbb{R}^3}E\cdot E\,dV\\&=\frac{\epsilon_0}{2}\int\limits_{\mathbb{R}^3}E\cdot E\,dV\end{aligned}

where, as usual, we have taken the charge distribution to be compactly supported, so as our sphere gets large enough, the potential \phi goes to zero. Yes, this is very hand-wavy, but this is how the physicists do it.

Anyway, what does this tell us? It means that a static electric field contains energy with a density

\displaystyle u_E=\frac{1}{2}\epsilon_0\lvert E\rvert^2

which we can integrate over any region of space to find the electrostatic potential energy contained in the field.
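As a sanity check that the two expressions for the energy agree, here is a Python sketch for a uniformly charged spherical shell. It assumes the standard results that the field vanishes inside the shell and equals Q/(4\pi\epsilon_0r^2) outside, with potential Q/(4\pi\epsilon_0a) on the shell; the charge, radius, and function names are hypothetical.

```python
import math

EPSILON_0 = 8.8541878128e-12  # electric constant in F/m (CODATA 2018)


def shell_energy_from_charge(Q, a):
    """(1/2) * integral of rho * phi: all the charge sits at the shell's potential."""
    phi_on_shell = Q / (4 * math.pi * EPSILON_0 * a)
    return 0.5 * Q * phi_on_shell


def shell_energy_from_field(Q, a, steps=100_000):
    """(epsilon_0 / 2) * integral of |E|^2 over the region outside the shell.

    Substituting s = 1/r maps r in [a, infinity) to s in (0, 1/a], so the
    infinite range becomes a finite midpoint sum.
    """
    total = 0.0
    ds = (1.0 / a) / steps
    for n in range(steps):
        s = (n + 0.5) * ds
        r = 1.0 / s
        E = Q / (4 * math.pi * EPSILON_0 * r**2)
        dr = ds / s**2  # size of the radial step corresponding to ds
        total += 0.5 * EPSILON_0 * E**2 * 4 * math.pi * r**2 * dr
    return total


Q = 1e-9  # coulombs (hypothetical)
a = 0.05  # shell radius in meters (hypothetical)
u_charge = shell_energy_from_charge(Q, a)
u_field = shell_energy_from_field(Q, a)
```

Both routes give Q^2/(8\pi\epsilon_0a), one by summing over the charge and the other by summing over the field.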

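We can also check the units here; the electric field has units of force per unit charge:

\displaystyle\frac{\mathrm{N}}{\mathrm{C}}

while the electric constant has units of farads per meter:

\displaystyle\frac{\mathrm{F}}{\mathrm{m}}=\frac{\mathrm{C}^2}{\mathrm{N}\cdot\mathrm{m}^2}

Putting these together, two factors of E and one of \epsilon_0, we find the units:

\displaystyle\frac{\mathrm{C}^2}{\mathrm{N}\cdot\mathrm{m}^2}\cdot\frac{\mathrm{N}^2}{\mathrm{C}^2}=\frac{\mathrm{N}}{\mathrm{m}^2}=\frac{\mathrm{N}\cdot\mathrm{m}}{\mathrm{m}^3}=\frac{\mathrm{J}}{\mathrm{m}^3}
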
Joules per cubic meter — energy per unit of volume, just as we’d expect for an energy density.

February 14, 2012 Posted by | Electromagnetism, Mathematical Physics | 5 Comments

Polarization of Electromagnetic Waves

Let’s look at another property of our plane wave solutions of Maxwell’s equations. Specifically, we’ll assume that the electric and magnetic fields are each plane waves in the directions k_E and k_B, respectively:

\displaystyle\begin{aligned}E(r,t)&=\hat{E}(k_E\cdot r-ct)\\B(r,t)&=\hat{B}(k_B\cdot r-ct)\end{aligned}

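We can take these and plug them into the vacuum version of Maxwell’s equations, and evaluate them at (r,t)=(0,0):

\displaystyle\begin{aligned}k_E\cdot\hat{E}'(0)&=0\\k_E\times\hat{E}'(0)&=c\hat{B}'(0)\\k_B\cdot\hat{B}'(0)&=0\\k_B\times\hat{B}'(0)&=-\frac{1}{c}\hat{E}'(0)\end{aligned}
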
The first equation says that \hat{E}'(0) is perpendicular to k_E, but the second equation implies, in part, that \hat{B}'(0) is also perpendicular to k_E. Similarly, the third and fourth equations say that both \hat{E}'(0) and \hat{B}'(0) are perpendicular to k_B, meaning that k_E and k_B either point in the same direction or in opposite directions. We can always pick our coordinates so that k_E points in the direction of the z-axis and \hat{E}'(0) points in the direction of the x-axis; then \hat{B}'(0) points in the direction of the y-axis. It’s then straightforward to check that k_B=k_E rather than k_B=-k_E. Of course, it’s possible that \hat{E}'(0) — and thus \hat{B}'(0) also — is zero; in this case we can just pick some different time at which to evaluate the equations. There must be some time for which these values are nonzero, or else \hat{E} and \hat{B} are simply constants, which is a pretty vacuous solution that we’ll just subtract off and ignore.

The upshot of this is that E and B must be plane waves traveling in the same direction. We put this back into our assumption:

\displaystyle\begin{aligned}E(r,t)&=\hat{E}(k\cdot r-ct)\\B(r,t)&=\hat{B}(k\cdot r-ct)\end{aligned}

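and then Maxwell’s equations imply

\displaystyle\begin{aligned}k\cdot\hat{E}'&=0\\k\times\hat{E}'&=c\hat{B}'\\k\cdot\hat{B}'&=0\\k\times\hat{B}'&=-\frac{1}{c}\hat{E}'\end{aligned}
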
where these are now full functions and not just evaluations at some conveniently-chosen point. And, incidentally, the second and fourth equations are completely equivalent. Now we can see that \hat{E}' and \hat{B}' are perpendicular at every point. Further, whatever component either vector has in the k direction is constant, and again we will just subtract it off and ignore it.

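As the wave propagates in the direction of k, the electric and magnetic fields move around in the plane perpendicular to k. If we pick our z-axis in the direction of k, we can write \hat{E}=\hat{E}_x\hat{i}+\hat{E}_y\hat{j} and \hat{B}=\hat{B}_x\hat{i}+\hat{B}_y\hat{j}. Then the second (and fourth) equation tells us

\displaystyle-\hat{E}_y'\hat{i}+\hat{E}_x'\hat{j}=c\hat{B}_x'\hat{i}+c\hat{B}_y'\hat{j}

That is, we get two decoupled equations:

\displaystyle\begin{aligned}\hat{E}_x'&=c\hat{B}_y'\\\hat{E}_y'&=-c\hat{B}_x'\end{aligned}
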
This tells us that we can break up our plane wave solution into two different plane wave solutions. In one, the electric field “waves” in the x direction while the magnetic field waves in the y direction; in the other, the electric field waves in the y direction and the magnetic field waves in the -x direction.

This decomposition is the basis of polarized light. We can create filters that only allow waves with the electric field oriented in one direction to pass; generic waves can be decomposed into a component waving in the chosen direction and a component waving in the perpendicular direction, and the latter component gets destroyed as the wave passes through the Polaroid filter — yes, that’s where the company got its name — leaving only the light oriented in the “right” way.

As a quick, familiar application, we can make glasses with a film over the left eye that polarizes light vertically, and one over the right eye that polarizes light horizontally. If we then show a quickly-alternating series of images, each polarized along the opposite axis, they will be presented to each eye separately. This is the basis of the earliest modern stereoscopic (or “3-D”) glasses, which had the problem that if you tilted your head the effect was first lost, and then reversed as your neck’s angle increased. If you’ve been paying attention, you should be able to see why.
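For readers who want to check their answer, here is a rough Python sketch of the head-tilt effect. It assumes ideal filters that pass exactly the component of the electric field along their axis, and the function names are made up for this illustration.

```python
import math


def amplitudes_through_filter(filter_angle):
    """Amplitude fractions passed by an ideal filter tilted from vertical.

    Returns the fraction of a vertically polarized wave and of a horizontally
    polarized wave that lies along the filter's axis.
    """
    vertical_image = math.cos(filter_angle)
    horizontal_image = math.sin(filter_angle)
    return vertical_image, horizontal_image


def what_each_eye_sees(head_tilt):
    """Left filter is nominally vertical, right filter nominally horizontal."""
    left = amplitudes_through_filter(head_tilt)
    right = amplitudes_through_filter(head_tilt + math.pi / 2)
    return left, right


upright = what_each_eye_sees(0.0)
tilted = what_each_eye_sees(math.pi / 4)
sideways = what_each_eye_sees(math.pi / 2)
```

With the head upright each eye gets only its own image; at a 45 degree tilt both eyes get an equal mix of the two images; at 90 degrees the images are swapped between the eyes.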

February 10, 2012 Posted by | Electromagnetism, Mathematical Physics | 4 Comments

The Propagation Velocity of Electromagnetic Waves

Now we’ve derived the wave equation from Maxwell’s equations, and we have worked out the plane-wave solutions. But there’s more to Maxwell’s equations than just the wave equation. Still, let’s take some plane-waves and see what we get.

First and foremost, what’s the propagation velocity of our plane-wave solutions? Well, it’s c for the generic wave equation

\displaystyle\frac{\partial^2F}{\partial t^2}-c^2\nabla^2F=0

while our electromagnetic wave equation is

\displaystyle\begin{aligned}\frac{\partial^2E}{\partial t^2}-\frac{1}{\epsilon_0\mu_0}\nabla^2E&=0\\\frac{\partial^2B}{\partial t^2}-\frac{1}{\epsilon_0\mu_0}\nabla^2B&=0\end{aligned}

so we find the propagation velocity of waves in both electric and magnetic fields is

\displaystyle c=\frac{1}{\sqrt{\epsilon_0\mu_0}}


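Conveniently, I already gave values for both \epsilon_0 and \mu_0:

\displaystyle\begin{aligned}\epsilon_0&=8.85418782\times10^{-12}\frac{\mathrm{F}}{\mathrm{m}}\\\mu_0&=1.25663706\times10^{-6}\frac{\mathrm{H}}{\mathrm{m}}\end{aligned}

Multiplying, we find:

\displaystyle\epsilon_0\mu_0=1.11265006\times10^{-17}\frac{\mathrm{s}^2}{\mathrm{m}^2}
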
which means that

\displaystyle c=\frac{1}{\sqrt{\epsilon_0\mu_0}}=0.299792458\times10^9\frac{\mathrm{m}}{\mathrm{s}}=299\,792\,458\frac{\mathrm{m}}{\mathrm{s}}

And this is a number which should look very familiar: it’s the speed of light. In an 1864 paper, Maxwell himself noted:

The agreement of the results seems to show that light and magnetism are affections of the same substance, and that light is an electromagnetic disturbance propagated through the field according to electromagnetic laws.

Indeed, this supposition has been borne out in experiment after experiment over the last century and a half: light is an electromagnetic wave.
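The arithmetic is easy to reproduce. This short Python sketch uses the CODATA 2018 recommended values for the two constants, which are measured quantities and carry more digits than any values quoted in the post.

```python
import math

EPSILON_0 = 8.8541878128e-12  # electric constant in F/m (CODATA 2018)
MU_0 = 1.25663706212e-6       # magnetic constant in H/m (CODATA 2018)

# Propagation velocity of electromagnetic waves: c = 1 / sqrt(epsilon_0 * mu_0).
c = 1 / math.sqrt(EPSILON_0 * MU_0)

# Exact speed of light fixed by the SI definition of the meter, for comparison.
C_DEFINED = 299_792_458
difference = abs(c - C_DEFINED)
```

The computed `c` differs from the defined speed of light by well under one meter per second.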

February 9, 2012 Posted by | Electromagnetism, Mathematical Physics | 4 Comments

Plane Waves

We’ve derived a “wave equation” from Maxwell’s equations, but it’s not clear what it means, or even why this is called a wave equation. Let’s consider the abstracted form, which both electric and magnetic fields satisfy:

\displaystyle\frac{\partial^2F}{\partial t^2}-c^2\nabla^2F=0

where \nabla^2 is the “Laplacian” operator, defined on scalar functions by taking the gradient followed by the divergence, and extended linearly to vector fields. If we have a Cartesian coordinate system — and remember we’re working in good, old \mathbb{R}^3 so it’s possible to pick just such coordinates, albeit not canonically — we can write

\displaystyle\frac{\partial^2F_x}{\partial t^2}-c^2\nabla^2F_x=0

where F_x is the x-component of F, and a similar equation holds for the y and z components as well. We can also write out the Laplacian in terms of coordinate derivatives:

\displaystyle\frac{\partial^2f}{\partial t^2}-c^2\left(\frac{\partial^2f}{\partial x^2}+\frac{\partial^2f}{\partial y^2}+\frac{\partial^2f}{\partial z^2}\right)=0

Let’s simplify further to just consider functions that depend on x and t, and which are constant in the y and z directions:

\displaystyle\frac{\partial^2f}{\partial t^2}-c^2\frac{\partial^2f}{\partial x^2}=\left[\frac{\partial^2}{\partial t^2}-c^2\frac{\partial^2}{\partial x^2}\right]f=0

We can take this big operator and “factor” it:

\displaystyle\left[\left(\frac{\partial}{\partial t}+c\frac{\partial}{\partial x}\right)\left(\frac{\partial}{\partial t}-c\frac{\partial}{\partial x}\right)\right]f=0

Any function which either “factor” sends to zero will be a solution of the whole equation. We find solutions like

\displaystyle\begin{aligned}\left[\frac{\partial}{\partial t}+c\frac{\partial}{\partial x}\right]A(x-ct)&=A'(x-ct)\frac{\partial(x-ct)}{\partial t}+cA'(x-ct)\frac{\partial(x-ct)}{\partial x}\\&=A'(x-ct)(-c+c)=0\\\left[\frac{\partial}{\partial t}-c\frac{\partial}{\partial x}\right]B(x+ct)&=B'(x+ct)\frac{\partial(x+ct)}{\partial t}-cB'(x+ct)\frac{\partial(x+ct)}{\partial x}\\&=B'(x+ct)(c-c)=0\end{aligned}

where A and B are pretty much any function that’s at least mildly well-behaved.

We call solutions of the first form “right-moving”, for if we view t as time and watch as it increases, the “shape” of A(x-ct) stays the same; it just moves in the increasing x direction. That is, at time t_0+\Delta t we see the same thing at x that we saw at x-c\Delta t (that is, c\Delta t units to the left) at time t_0. Similarly, we call solutions of the second form “left-moving”. In each family, solutions propagate at a rate of c, which was the constant from our original equation. Any solution of this simplified, one-dimensional wave equation will be the sum of a right-moving and a left-moving term.
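Both claims, that A(x-ct) solves the equation and that its shape simply slides to the right, can be checked numerically. In this Python sketch the Gaussian `profile`, the sample point, and the value of `c` are arbitrary choices for illustration.

```python
import math


def profile(s):
    """An arbitrary smooth shape; any twice-differentiable function would do."""
    return math.exp(-s * s)


def right_moving(x, t, c):
    """A right-moving solution A(x - c t)."""
    return profile(x - c * t)


def wave_residual(f, x, t, c, h=1e-3):
    """Central-difference estimate of f_tt - c^2 f_xx at (x, t)."""
    f_tt = (f(x, t + h, c) - 2 * f(x, t, c) + f(x, t - h, c)) / h**2
    f_xx = (f(x + h, t, c) - 2 * f(x, t, c) + f(x - h, t, c)) / h**2
    return f_tt - c**2 * f_xx


c = 2.0
x, t0, dt = 0.3, 0.1, 0.25

residual = wave_residual(right_moving, x, t0, c)

# The value seen at x at time t0 + dt is the value seen at x - c*dt at time t0.
later_here = right_moving(x, t0 + dt, c)
earlier_to_the_left = right_moving(x - c * dt, t0, c)
```

The residual is zero up to discretization error, and the two sampled values agree.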

More generally, for the three-dimensional version we have “plane-wave” solutions propagating in any given direction we want. We could do a big, messy calculation, but note that if k is any unit vector, we can pick a Cartesian coordinate system where k is the unit vector in the x direction, in which case we’re back to the right-moving solutions from above. And of course there’s no reason we can’t let A be a vector-valued function. Such a solution looks like

\displaystyle A(r,t)=\hat{A}(k\cdot r-ct)

The bigger t is, the further in the k direction the position vector r must extend to compensate; the shape \hat{A} stays the same, but moves in the direction of k with a velocity of c.

It will be helpful to work out some of the basic derivatives of such solutions. Time is easy:

\displaystyle\begin{aligned}\frac{\partial}{\partial t}A(r,t)&=\frac{\partial}{\partial t}\hat{A}(k\cdot r-ct)\\&=\hat{A}'(k\cdot r-ct)\frac{\partial}{\partial t}(k\cdot r-ct)\\&=-c\hat{A}'(k\cdot r-ct)\end{aligned}

Spatial derivatives are a little trickier. We pick a Cartesian coordinate system to write:

\displaystyle\begin{aligned}\frac{\partial}{\partial x}A(r,t)&=\frac{\partial}{\partial x}\hat{A}(k\cdot r-ct)\\&=\hat{A}'(k\cdot r-ct)\frac{\partial}{\partial x}(k\cdot r-ct)\\&=k_x\hat{A}'(k\cdot r-ct)\\\frac{\partial}{\partial y}A(r,t)&=k_y\hat{A}'(k\cdot r-ct)\\\frac{\partial}{\partial z}A(r,t)&=k_z\hat{A}'(k\cdot r-ct)\end{aligned}

We don’t really want to depend on coordinates, so luckily it’s easy enough to figure out:

\displaystyle\begin{aligned}\nabla\cdot A(r,t)&=k\cdot\hat{A}'(k\cdot r-ct)\\\nabla\times A(r,t)&=k\times\hat{A}'(k\cdot r-ct)\end{aligned}

which will make our lives much easier to have worked out in advance.
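These two formulas can be spot-checked with finite differences. In the Python sketch below the direction `K`, the speed `C`, and the component functions in `shape` are arbitrary choices made for illustration.

```python
import math

K = (1 / 3, 2 / 3, 2 / 3)  # a unit vector: (1, 2, 2) / 3
C = 1.5                    # propagation speed (arbitrary)


def shape(s):
    """A vector-valued profile with three unrelated smooth components."""
    return (math.sin(s), math.cos(2 * s), math.exp(-s))


def shape_prime(s):
    """Derivative of each component of the profile."""
    return (math.cos(s), -2 * math.sin(2 * s), -math.exp(-s))


def phase(r, t):
    return sum(k_i * r_i for k_i, r_i in zip(K, r)) - C * t


def plane_wave(r, t):
    return shape(phase(r, t))


def partial(component, axis, r, t, h=1e-5):
    """Central difference of one component along one coordinate axis."""
    fwd, bwd = list(r), list(r)
    fwd[axis] += h
    bwd[axis] -= h
    return (plane_wave(fwd, t)[component] - plane_wave(bwd, t)[component]) / (2 * h)


def divergence(r, t):
    return sum(partial(i, i, r, t) for i in range(3))


def curl(r, t):
    return (partial(2, 1, r, t) - partial(1, 2, r, t),
            partial(0, 2, r, t) - partial(2, 0, r, t),
            partial(1, 0, r, t) - partial(0, 1, r, t))


r, t = (0.4, -0.2, 0.7), 0.3
d = shape_prime(phase(r, t))
k_dot = sum(k_i * d_i for k_i, d_i in zip(K, d))
k_cross = (K[1] * d[2] - K[2] * d[1],
           K[2] * d[0] - K[0] * d[2],
           K[0] * d[1] - K[1] * d[0])
numeric_div = divergence(r, t)
numeric_curl = curl(r, t)
```

The numerical divergence matches `k_dot` and the numerical curl matches `k_cross` to within the finite-difference error.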

February 8, 2012 Posted by | Analysis, Differential Equations | 3 Comments

