The Meaning of the Speed of Light
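Let’s pick up where we left off last time converting Maxwell’s equations into differential forms:
$$\begin{aligned}*d*\epsilon&=\mu_0c^2\rho\\d\epsilon&=-\frac{\partial\beta}{\partial t}\\d\beta&=0\\*d*\beta&=\mu_0\iota+\frac{1}{c^2}\frac{\partial\epsilon}{\partial t}\end{aligned}$$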
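Now let’s notice that while the electric field has units of force per unit charge, the magnetic field has units of force per unit charge per unit velocity. Further, from our polarized plane-wave solutions to Maxwell’s equations, we see that for these waves the magnitude of the electric field is $c$ (a velocity) times the magnitude of the magnetic field. So let’s try collecting together factors of $c\beta$:
$$\begin{aligned}*d*\epsilon&=\mu_0c^2\rho\\d\epsilon&=-\frac{1}{c}\frac{\partial(c\beta)}{\partial t}\\d(c\beta)&=0\\*d*(c\beta)&=\mu_0c\iota+\frac{1}{c}\frac{\partial\epsilon}{\partial t}\end{aligned}$$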
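Now each of the time derivatives comes along with a factor of $\frac{1}{c}$. We can absorb this by introducing a new variable $\tau=ct$, which is measured in units of distance rather than time. Then we can write:
$$\begin{aligned}*d*\epsilon&=\mu_0c^2\rho\\d\epsilon&=-\frac{\partial(c\beta)}{\partial\tau}\\d(c\beta)&=0\\*d*(c\beta)&=\mu_0c\iota+\frac{\partial\epsilon}{\partial\tau}\end{aligned}$$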
The easy thing here is to just write $t$ instead of $\tau$, but this hides a deep insight: the speed of light $c$ is acting like a conversion factor from units of time to units of distance. That is, we don’t just say that light moves at a speed of $c=299{,}792{,}458\,\frac{\mathrm{m}}{\mathrm{s}}$, we say that one second of time is 299,792,458 meters of distance. This is an incredible identity that allows us to treat time and space on an equal footing, and it is borne out in many more or less direct experiments. I don’t want to get into all the consequences of this fact (the name for them as a collection is “special relativity”) but I do want to use it.
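As a tiny illustration of treating $c$ as a unit conversion rather than a speed, here is a minimal sketch. The function names are hypothetical, chosen just for this example; the only physical input is the SI value of $c$.

```python
# Speed of light in meters per second (exact by the SI definition of the meter).
C = 299_792_458

def time_to_distance(seconds: float) -> float:
    """Convert a duration t into the equivalent length tau = c * t, in meters."""
    return C * seconds

def distance_to_time(meters: float) -> float:
    """Convert a length tau back into a duration t = tau / c, in seconds."""
    return meters / C

print(time_to_distance(1.0))   # one second, expressed in meters
print(distance_to_time(1.0))   # one meter, expressed in seconds
```

Nothing deep happens in the code, which is rather the point: once $\tau=ct$ is the time variable, $c$ shows up only where we convert units.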
This lets us go back and write $\beta$ instead of $c\beta$, since the factor of $c$ here is just an artifact of using some coordinate system that treats time and distance separately; we see that the electric and magnetic fields in a propagating electromagnetic plane-wave are “really” the same size, and the factor of $c$ is just an artifact of our coordinate system. We can also just write $t$ instead of $\tau$ for the same reason. Finally, we can collect $c\rho$ together to put it on the exact same footing as $\iota$.
$$\begin{aligned}*d*\epsilon&=\mu_0c\rho\\d\epsilon&=-\frac{\partial\beta}{\partial t}\\d\beta&=0\\*d*\beta&=\mu_0c\iota+\frac{\partial\epsilon}{\partial t}\end{aligned}$$
The meanings of these terms are getting further and further from familiarity. The $1$-form $\epsilon$ is still made of the same components as the electric field; the $2$-form $\beta$ is $c$ times the Hodge star of the $1$-form whose components are those of the magnetic field; the function $\rho$ is $c$ times the charge density; and the vector field $\iota$ is the current density.
Maxwell’s Equations in Differential Forms
To this point, we’ve mostly followed a standard approach to classical electromagnetism, and nothing I’ve said should be all that new to a former physics major, although at some points we’ve infused more mathematical rigor than is typical. But now I want to go in a different direction.
Starting again with Maxwell’s equations, we see all these divergences and curls which, though familiar to many, are really heavy-duty equipment. In particular, they rely on the Riemannian structure on $\mathbb{R}^3$. We want to strip this away to find something that works without this assumption, and as a first step we’ll flip things over into differential forms.
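So let’s say that the magnetic field $B$ corresponds to a $1$-form $\beta$, while the electric field $E$ corresponds to a $1$-form $\epsilon$. To avoid confusion between $\epsilon$ and the electric constant $\epsilon_0$, let’s also replace some of our constants with the speed of light: $\epsilon_0=\frac{1}{\mu_0c^2}$. At the same time, we’ll replace $J$ with a $1$-form $\iota$. Now Maxwell’s equations look like:
$$\begin{aligned}*d*\epsilon&=\mu_0c^2\rho\\ *d\epsilon&=-\frac{\partial\beta}{\partial t}\\ *d*\beta&=0\\ *d\beta&=\mu_0\iota+\frac{1}{c^2}\frac{\partial\epsilon}{\partial t}\end{aligned}$$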
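Now I want to juggle around some of these Hodge stars:
$$\begin{aligned}*d*\epsilon&=\mu_0c^2\rho\\d\epsilon&=-\frac{\partial(*\beta)}{\partial t}\\d(*\beta)&=0\\ *d*(*\beta)&=\mu_0\iota+\frac{1}{c^2}\frac{\partial\epsilon}{\partial t}\end{aligned}$$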
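Notice that we’re never just using the $1$-form $\beta$, but rather the $2$-form $*\beta$. Let’s actually go back and use $\beta$ to represent a $2$-form, so that $B$ corresponds to the $1$-form $*\beta$:
$$\begin{aligned}*d*\epsilon&=\mu_0c^2\rho\\d\epsilon&=-\frac{\partial\beta}{\partial t}\\d\beta&=0\\ *d*\beta&=\mu_0\iota+\frac{1}{c^2}\frac{\partial\epsilon}{\partial t}\end{aligned}$$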
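In the static case, where time derivatives are zero, we see how symmetric this new formulation is:
$$\begin{aligned}*d*\epsilon&=\mu_0c^2\rho\\d\epsilon&=0\\d\beta&=0\\ *d*\beta&=\mu_0\iota\end{aligned}$$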
For both the $1$-form $\epsilon$ and the $2$-form $\beta$, the exterior derivative vanishes, and the operator $*d*$ connects the fields to sources of physical charge and current.
A Short Rant about Electromagnetism Texts
I’d like to step aside from the main line to make one complaint. In refreshing my background in classical electromagnetism for this series I’ve run into something that bugs the hell out of me as a mathematician. I remember it from my own first course, but I’m shocked to see that it survives into every upper-level treatment I’ve seen.
It’s about the existence of potentials, and the argument usually goes like this: as Faraday’s law tells us, for a static electric field we have $\nabla\times E=0$; therefore $E=-\nabla\phi$ for some potential function $\phi$ because the curl of a gradient is zero.
What?
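Let’s break this down to simple formal logic that any physics undergrad can follow. Let $P$ be the statement that there exists a $\phi$ such that $E=-\nabla\phi$. Let $Q$ be the statement that $\nabla\times E=0$. The curl of a gradient being zero is the implication $P\implies Q$. So here’s the logic:
$$\begin{aligned}&P\implies Q\\&Q\\&\therefore P\end{aligned}$$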
and that doesn’t make sense at all. It’s a textbook case of “affirming the consequent”.
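If you’d rather not take my word for it, the fallacy can be checked by brute force. This is a minimal sketch that enumerates the truth table for two propositions `p` and `q` (standing in for $P$ and $Q$) and looks for rows where both premises hold but the conclusion fails; the helper name `implies` is just for this example.

```python
from itertools import product

def implies(p: bool, q: bool) -> bool:
    """Material implication: p => q is false only when p is true and q is false."""
    return (not p) or q

# Rows of the truth table where both premises hold: (P => Q) and Q.
premises_hold = [(p, q) for p, q in product([False, True], repeat=2)
                 if implies(p, q) and q]

# The argument form is valid only if P is true in every such row.
counterexamples = [(p, q) for p, q in premises_hold if not p]
print(counterexamples)
```

Any row in `counterexamples` is a situation where the curl vanishes and yet there is no potential, as far as pure logic can tell. Ruling that row out takes an actual theorem about the space we’re working in.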
Saying that $E$ has a potential function is a nice, convenient way of satisfying the condition that its curl should vanish, but this argument gives no rationale for believing it’s the only option.
If we flip over to the language of differential forms, we know that the curl operator on a vector field corresponds to the operator $*d$ on $1$-forms, while the gradient operator corresponds to $d$. We indeed know that $*d(d\phi)=0$ automatically (the curl of a gradient vanishes) but knowing that $d\epsilon=0$ is not enough to conclude that $\epsilon=d\phi$ for some $\phi$. In fact, this question is exactly what de Rham cohomology is all about!
So what’s missing? Full formality demands that we justify that the first de Rham cohomology of our space vanishes. Now, I’m not suggesting that we make physics undergrads learn about homology (it might not be a terrible idea, though), but we can satisfy this in the context of a course just by admitting that (a) we are being a little sloppy here, and (b) the justification is that (for our purposes) the electric field is defined in some simply-connected region of space which has no “holes” one could wrap a path around. In fact, if the students have had a decent course in multivariable calculus they’ve probably seen the explicit construction of a potential function for a vector field whose curl vanishes, subject to the restriction that we’re working over a simply-connected space.
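To see that the simply-connected hypothesis is doing real work, here is a minimal numerical sketch of the standard counterexample. This example is not from the physics above: it is the planar field $\left(\frac{-y}{x^2+y^2},\frac{x}{x^2+y^2}\right)$, whose curl vanishes everywhere it is defined, but which lives on the plane with the origin removed. If it had a potential, its line integral around any closed loop would be zero.

```python
import math

def field(x, y):
    """A planar field that is curl-free away from the origin, where it is undefined."""
    r2 = x * x + y * y
    return (-y / r2, x / r2)

def circulation(radius=1.0, steps=10_000):
    """Midpoint-rule line integral of `field` once counterclockwise around the origin."""
    total = 0.0
    dtheta = 2 * math.pi / steps
    for i in range(steps):
        theta = (i + 0.5) * dtheta
        x, y = radius * math.cos(theta), radius * math.sin(theta)
        fx, fy = field(x, y)
        # On a circle centered at the origin, dr = (-y, x) dtheta.
        total += (fx * (-y) + fy * x) * dtheta
    return total

print(circulation(), 2 * math.pi)
```

The circulation comes out to $2\pi$ no matter what radius we pick, so there is no potential function on the punctured plane, even though the curl is zero at every point of it.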
The problem arises again in justifying the existence of a vector potential: as Gauss’ law for magnetism tells us, for a magnetic field we have $\nabla\cdot B=0$; therefore $B=\nabla\times A$ for some vector potential $A$ because the divergence of a curl is zero.
Again we see the same problem of affirming the consequent. And again the real problem hinges on the unspoken assumption that the second de Rham cohomology of our space vanishes. Yes, this is true for contractible spaces, but we must make mention of the fact that our space is contractible! In fact, I did exactly that when I needed to get ahold of the magnetic potential once.
Again: we don’t need to stop simplifying and sweeping some of these messier details of our arguments under the rug when dealing with undergraduate students, but we do need to be honest that those details were there to be swept in the first place. The alternative most texts and notes choose now is to include statements which are blatantly false, and to rely on our authority to make students accept them unquestioningly.
Conservation of Electromagnetic Energy
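Let’s start with Ampère’s law, including Maxwell’s correction:
$$\nabla\times B=\mu_0J+\mu_0\epsilon_0\frac{\partial E}{\partial t}$$
Now let’s take the dot product of this with the electric field:
$$E\cdot(\nabla\times B)=\mu_0E\cdot J+\mu_0\epsilon_0E\cdot\frac{\partial E}{\partial t}$$
On the left, we can run a product rule in reverse, using $\nabla\cdot(E\times B)=B\cdot(\nabla\times E)-E\cdot(\nabla\times B)$:
$$B\cdot(\nabla\times E)-\nabla\cdot(E\times B)=\mu_0E\cdot J+\mu_0\epsilon_0E\cdot\frac{\partial E}{\partial t}$$
Now, Faraday’s law tells us that
$$\nabla\times E=-\frac{\partial B}{\partial t}$$
so we can write:
$$-B\cdot\frac{\partial B}{\partial t}-\nabla\cdot(E\times B)=\mu_0E\cdot J+\mu_0\epsilon_0E\cdot\frac{\partial E}{\partial t}$$
Let’s rearrange this a bit:
$$-\epsilon_0E\cdot\frac{\partial E}{\partial t}-\frac{1}{\mu_0}B\cdot\frac{\partial B}{\partial t}=\nabla\cdot\left(\frac{E\times B}{\mu_0}\right)+E\cdot J$$
The dot product of a vector field with its own derivative should look familiar; we can rewrite:
$$-\frac{\partial}{\partial t}\left(\frac{\epsilon_0}{2}|E|^2+\frac{1}{2\mu_0}|B|^2\right)=\nabla\cdot\left(\frac{E\times B}{\mu_0}\right)+E\cdot J$$
But now we should recognize almost all the terms in sight! On the left, we’re taking the derivative of the combined energy densities of the electric and magnetic fields:
$$u=\frac{\epsilon_0}{2}|E|^2+\frac{1}{2\mu_0}|B|^2$$
The second term on the right is the energy density lost to Joule heating per unit time. The only thing left is this vector field:
$$S=\frac{E\times B}{\mu_0}$$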
which we call the “Poynting vector”. It’s really named after British physicist John Henry Poynting, but generations of students remember it because it “points” in the direction electromagnetic energy flows.
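To see this, look at the final form of our equation:
$$-\frac{\partial}{\partial t}\left(\frac{\epsilon_0}{2}|E|^2+\frac{1}{2\mu_0}|B|^2\right)=\nabla\cdot\left(\frac{E\times B}{\mu_0}\right)+E\cdot J$$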
On the left we have the rate at which the electromagnetic energy is going down at any given point. On the right, we have two terms; the second is the rate electromagnetic energy density is being lost to heat energy at the point, while the first is the rate electromagnetic energy is “flowing away from” the point.
Compare this with the conservation of charge:
$$-\frac{\partial\rho}{\partial t}=\nabla\cdot J$$
where the rate at which charge density decreases is equal to the rate that charge is “flowing away” through currents. The only difference is that there is no dissipation term for charge like there is for energy.
One other important thing to notice is what this tells us about our plane wave solutions. If we take such an electromagnetic wave propagating in the direction $k$ and with the electric field polarized in some particular direction, then the magnetic field is $B=\frac{1}{c}k\times E$ and we can determine that
$$\frac{E\times B}{\mu_0}=\frac{|E|^2}{\mu_0c}k$$
showing that electromagnetic waves carry electromagnetic energy in the direction that they propagate.
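Here is a quick numerical sketch of that last claim. The propagation direction, polarization, and field amplitude are hypothetical values picked for the example, and I’m assuming the plane-wave relation $B=\frac{1}{c}k\times E$ between the two fields.

```python
import math

MU_0 = 4e-7 * math.pi      # magnetic constant in H/m (classical defined value)
C = 299_792_458.0          # speed of light in m/s

def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

k = (0.0, 0.0, 1.0)        # unit propagation direction (hypothetical choice)
E = (100.0, 0.0, 0.0)      # electric field in V/m, polarized along x (hypothetical)

# Assumed plane-wave relation: B = (k x E) / c.
B = tuple(component / C for component in cross(k, E))

# Poynting vector: (E x B) / mu_0.
S = tuple(component / MU_0 for component in cross(E, B))
print(S)
```

The only nonzero component of `S` lies along `k`, and its size matches $\frac{|E|^2}{\mu_0c}$.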
Ohm’s Law
When calculating the potential energy of the magnetic field, we calculated the power needed to run a certain current around a certain circuit. Let’s look into that a little more deeply.
We start with Ohm’s law, which basically says that, as a first approximation, the electromotive force around a circuit is proportional to the current around it; push harder and you’ll move charge faster. As a formula:
$$V=IR$$
The electromotive force — or “voltage” — on the left is equal to the current around the circuit times the “resistance”. What’s the resistance? Well, here it’s basically just a constant of proportionality, which we read as “how hard is it to push charge around this circuit?”
But let’s dig in a bit more. A current doesn’t really flow around an infinitely-thin wire; it flows around a wire with some thickness. The thicker the wire is (the bigger its cross-sectional area) the easier it should be to push charge around, while the longer the circuit is, the harder. We’ll write down our resistance in the form
$$R=\eta\frac{L}{A}$$
where $L$ is the length of the wire, $A$ is its cross-sectional area, and $\eta$ is a new proportionality constant we call “resistivity”. Putting this together with the first form of Ohm’s law we find
$$V=\eta\frac{L}{A}I$$
But look at this: the current is made up of a current density flowing along the wire, integrated across a cross-section. If the wire is running in the $z$ direction and the current density in that direction is constantly $J_z$, then $I=J_zA$. Further, at least to a first approximation, the electromotive force is the $z$-component of the electric field $E_z$ times the length $L$ traveled in that direction.
Thus we conclude that $E_z=\eta J_z$. But since there’s nothing really special about the $z$ direction, we actually find that
$$E=\eta J$$
which is Ohm’s law again, but now in terms of fields and current distributions.
But what about the power? We’ve got a battery pushing a current around a circuit and using power to do it; where does the energy go? Well, if we think about pushing little bits of charge around the wire, they’re going to hit parts of the wire and lose some energy in the process. The parts they hit get shaken up, and this appears as heat energy; the process is called “Ohmic” or “Joule” heating, the latter from Joule’s own experiments using a resistive wire to heat up a tub of water.
If we have a current $I$ made up of $N$ bits of charge $q$ per unit time, then each bit takes an energy of $qV$ to go around the circuit once. This happens $N$ times per unit time, so the total power expenditure is
$$P=NqV=IV$$
just as we said last time. But now we can do the same trick as above and write
$$P=IV=(J_zA)(E_zL)=(E_zJ_z)(AL)$$
or
$$\frac{P}{AL}=E\cdot J$$
which measures the power per unit volume dissipated through Joule heating in the circuit.
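To put some numbers on this, here is a minimal sketch for a straight copper wire. The wire dimensions and the current are hypothetical, and the resistivity is the usual approximate room-temperature figure for copper rather than anything derived here. It checks that the circuit form and the field form of Ohm’s law agree, and that the power density times the volume gives back the total power.

```python
import math

ETA_COPPER = 1.68e-8            # resistivity of copper in ohm-meters (approximate)

length = 10.0                   # L in meters (hypothetical wire)
area = math.pi * (0.5e-3) ** 2  # A in square meters, for a 0.5 mm radius
current = 2.0                   # I in amperes (hypothetical)

resistance = ETA_COPPER * length / area   # R = eta L / A
voltage = current * resistance            # V = I R

current_density = current / area          # J_z = I / A
field = voltage / length                  # E_z = V / L

power = current * voltage                 # P = I V
power_density = field * current_density   # E . J, in watts per cubic meter

print(f"R = {resistance:.3f} ohm, P = {power:.3f} W")
print(f"E = {field:.4f} V/m, eta * J = {ETA_COPPER * current_density:.4f} V/m")
```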
Energy and the Magnetic Field
Last time we calculated the energy of the electric field. Now let’s repeat with the magnetic field, and let’s try to be a little more careful about it since magnetic fields can be slippery.
Let’s consider a static magnetic field $B$ generated by a collection of circuits $C_i$, each carrying a current $I_i$. Recall that Gauss’ law for magnetism tells us that $\nabla\cdot B=0$; since space is contractible, we know that its homology is trivial, and thus $B$ must be the curl of some other vector field $A$, which we call the “magnetic potential” or “vector potential”. Now we can write down the flux of the magnetic field through each circuit, where $S_i$ is a surface bounded by $C_i$:
$$\Phi_i=\int_{S_i}B\cdot dS=\int_{S_i}(\nabla\times A)\cdot dS=\oint_{C_i}A\cdot dr$$
Now Faraday’s law tells us about the electromotive force induced on the circuit:
$$V_i=-\frac{d\Phi_i}{dt}$$
This electromotive force must be counterbalanced by a battery maintaining the current or else the magnetic field wouldn’t be static.
We can determine how much power the battery must expend to maintain the current; a charge $q$ moving around the circuit goes down by $-qV_i$ in potential energy, which the battery must replace to send it around again. If $N$ such charges pass around in unit time, this is a work of $-NqV_i$ per unit time; since $Nq=I_i$ (the current) we find that the power expenditure is $-I_iV_i$, or $I_i\frac{d\Phi_i}{dt}$.
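Thus if we want to ramp the currents (and the field) up from a cold start in a time $T$ it takes a total work of
$$W=\sum_i\int_0^TI_i(t)\frac{d\Phi_i}{dt}\,dt$$
which is then the energy stored in the magnetic field.
This expression doesn’t depend on exactly how the field turns on, so let’s say the currents ramp up linearly:
$$I_i(t)=\frac{t}{T}I_i(T)$$
and since the fluxes are proportional to the currents they must also ramp up linearly:
$$\Phi_i(t)=\frac{t}{T}\Phi_i(T)$$
Plugging these in above, we find:
$$W=\sum_i\int_0^T\frac{t}{T}I_i(T)\,\frac{1}{T}\Phi_i(T)\,dt=\frac{1}{2}\sum_iI_i(T)\Phi_i(T)$$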
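The claim that the answer doesn’t depend on how the field turns on is easy to test numerically. This is a minimal sketch for a single circuit, assuming the current and the flux follow the same ramp profile (which is what “the fluxes are proportional to the currents” gives us). The final current and flux are hypothetical values, and `work_to_ramp` is a helper made up for this example.

```python
def work_to_ramp(ramp, d_ramp, I_final, Phi_final, T=1.0, steps=100_000):
    """Midpoint-rule estimate of the integral of I(t) * dPhi/dt over [0, T].

    `ramp(s)` runs from 0 at s = 0 to 1 at s = 1, with derivative `d_ramp(s)`;
    the current and the flux are both scaled by it.
    """
    dt = T / steps
    total = 0.0
    for n in range(steps):
        s = (n + 0.5) / steps                    # fraction of the ramp completed
        current = I_final * ramp(s)
        dPhi_dt = Phi_final * d_ramp(s) / T      # chain rule: d/dt = (1/T) d/ds
        total += current * dPhi_dt * dt
    return total

I_final, Phi_final = 3.0, 0.2   # amperes and webers (hypothetical)

linear = work_to_ramp(lambda s: s, lambda s: 1.0, I_final, Phi_final)
quadratic = work_to_ramp(lambda s: s * s, lambda s: 2 * s, I_final, Phi_final)

print(linear, quadratic, 0.5 * I_final * Phi_final)
```

Both ramps land on half the product of the final current and the final flux, up to the accuracy of the numerical integration.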
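Now we can plug in our original expression for the flux:
$$W=\frac{1}{2}\sum_iI_i\oint_{C_i}A\cdot dr$$
This is great. But to be more general, let’s replace our currents with a current distribution $J$:
$$W=\frac{1}{2}\int_{\mathbb{R}^3}A\cdot J\,dV$$
Now we can use Ampère’s law, which in the static case reads $\nabla\times B=\mu_0J$, to write
$$W=\frac{1}{2\mu_0}\int_{\mathbb{R}^3}A\cdot(\nabla\times B)\,dV=\frac{1}{2\mu_0}\int_{\mathbb{R}^3}B\cdot(\nabla\times A)\,dV+\frac{1}{2\mu_0}\int_{\mathbb{R}^3}\nabla\cdot(B\times A)\,dV$$
We can pull the same sort of trick as last time to make the second integral go away; use the divergence theorem to convert it to a surface integral $\frac{1}{2\mu_0}\oint_S(B\times A)\cdot dS$ and take the surface far enough away that the integral becomes negligible. We handwave that $B\times A$ falls off roughly as the inverse fifth power of $r$, while the area of $S$ only grows as the second power, and say that the term goes to zero.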
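So now we have an expression for a magnetic energy density similar to last time’s:
$$u_B=\frac{1}{2\mu_0}|B|^2$$
Again, we can check the units; the magnetic field has units of force per unit charge per unit velocity:
$$\frac{\mathrm{N}\cdot\mathrm{s}}{\mathrm{C}\cdot\mathrm{m}}$$
while the magnetic constant has units of henries per meter:
$$\frac{\mathrm{H}}{\mathrm{m}}=\frac{\mathrm{N}\cdot\mathrm{s}^2}{\mathrm{C}^2}$$
Putting together an inverse factor of the magnetic constant and two of the magnetic field, we get:
$$\frac{\mathrm{C}^2}{\mathrm{N}\cdot\mathrm{s}^2}\cdot\frac{\mathrm{N}^2\cdot\mathrm{s}^2}{\mathrm{C}^2\cdot\mathrm{m}^2}=\frac{\mathrm{N}}{\mathrm{m}^2}=\frac{\mathrm{J}}{\mathrm{m}^3}$$
or units of energy per unit volume, just like we expect for an energy density.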
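Unit bookkeeping like this is mechanical enough to hand to a computer. Here is a minimal sketch that represents each unit as a dictionary of exponents over the base units kilogram, meter, second, and coulomb; the `combine` helper is hypothetical, written just for this check.

```python
def combine(*factors):
    """Multiply units given as (unit, power) pairs.

    Each unit is a dict mapping a base-unit name to its exponent.
    """
    result = {}
    for unit, power in factors:
        for base, exponent in unit.items():
            result[base] = result.get(base, 0) + exponent * power
    # Drop base units whose exponents cancelled out.
    return {base: e for base, e in result.items() if e != 0}

KG, M, S, C = {"kg": 1}, {"m": 1}, {"s": 1}, {"C": 1}
NEWTON = combine((KG, 1), (M, 1), (S, -2))
JOULE = combine((NEWTON, 1), (M, 1))

B_FIELD = combine((NEWTON, 1), (S, 1), (C, -1), (M, -1))  # force per charge per velocity
MU_0 = combine((NEWTON, 1), (S, 2), (C, -2))              # henries per meter

magnetic_energy_density = combine((MU_0, -1), (B_FIELD, 2))
joules_per_cubic_meter = combine((JOULE, 1), (M, -3))

print(magnetic_energy_density)
print(joules_per_cubic_meter)
```

The two dictionaries agree, which is the same conclusion as the calculation by hand.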
Energy and the Electric Field
Okay, now let’s consider the electric field from the perspective of energy. We have an idea that this might be interesting because we know that the field produces a force, and forces and energies interact in interesting ways.
So recall that if we have a “test charge” $q$ at a point $p$ in an electric field $E$ it experiences a force $F=qE(p)$. As we saw when discussing Faraday’s law, for a static electric field we can write $E=-\nabla\phi$ for some “electric potential” function $\phi$. Thus we can also write $F=-\nabla U$ for the potential energy function $U=q\phi$.
Now, say the field is generated by a charge distribution $\rho$; how much potential energy is contained in the force the field exerts on the little bit of charge at $p$? We count $\rho(p)\phi(p)$, but this is too much: half of it is due to the rest of the distribution acting on the bit of charge at $p$ and half of it comes from $\rho(p)$ acting back. We can thus find the total potential energy by integrating
$$U=\frac{1}{2}\int_{\mathbb{R}^3}\rho\phi\,dV$$
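Now, Gauss’ law tells us that $\rho=\epsilon_0\nabla\cdot E$, so we substitute:
$$U=\frac{\epsilon_0}{2}\int_{\mathbb{R}^3}(\nabla\cdot E)\phi\,dV$$
Next we use a form of the product rule, $\nabla\cdot(\phi E)=\nabla\phi\cdot E+\phi\nabla\cdot E$, and run it backwards to write:
$$U=\frac{\epsilon_0}{2}\int_{\mathbb{R}^3}\nabla\cdot(\phi E)\,dV-\frac{\epsilon_0}{2}\int_{\mathbb{R}^3}\nabla\phi\cdot E\,dV$$
where we evaluate the first integral over space by evaluating it over the solid ball of radius $R$ and taking the limit as $R$ goes off to infinity. The divergence theorem says we can write:
$$\int_{\mathbb{R}^3}\nabla\cdot(\phi E)\,dV=\lim_{R\to\infty}\oint_{S_R}\phi E\cdot dS=0$$
where, as usual, we have taken the charge distribution to be compactly supported, so as our sphere $S_R$ gets large enough the integrand $\phi E$ falls off like $R^{-3}$ while the area only grows like $R^2$, and the surface term goes to zero. Yes, this is very hand-wavy, but this is how the physicists do it. Since $E=-\nabla\phi$, what survives is
$$U=\frac{\epsilon_0}{2}\int_{\mathbb{R}^3}|E|^2\,dV$$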
Anyway, what does this tell us? It means that a static electric field contains energy with a density
$$u_E=\frac{\epsilon_0}{2}|E|^2$$
which we can integrate over any region of space to find the electrostatic potential energy contained in the field.
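We can also check the units here; the electric field has units of force per unit charge:
$$\frac{\mathrm{N}}{\mathrm{C}}$$
while the electric constant has units of farads per meter:
$$\frac{\mathrm{F}}{\mathrm{m}}=\frac{\mathrm{C}^2}{\mathrm{N}\cdot\mathrm{m}^2}$$
Putting these together (two factors of $E$ and one of $\epsilon_0$) we find the units:
$$\frac{\mathrm{C}^2}{\mathrm{N}\cdot\mathrm{m}^2}\cdot\frac{\mathrm{N}^2}{\mathrm{C}^2}=\frac{\mathrm{N}}{\mathrm{m}^2}=\frac{\mathrm{J}}{\mathrm{m}^3}$$
Joules per cubic meter: energy per unit of volume, just as we’d expect for an energy density.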
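As a sanity check on the energy density $\frac{\epsilon_0}{2}|E|^2$, here is a minimal sketch that integrates it over all of space for a uniformly charged spherical shell. This example is not from the discussion above: I’m assuming the standard facts that the field vanishes inside the shell and looks like a point charge’s field outside, and comparing against the textbook electrostatic energy $\frac{Q^2}{8\pi\epsilon_0R}$. The charge and radius are hypothetical.

```python
import math

EPSILON_0 = 8.8541878128e-12   # electric constant in F/m

def field_energy_outside_shell(charge, radius, steps=10_000):
    """Integrate (epsilon_0 / 2) |E|^2 over the region outside a charged shell.

    Uses |E| = Q / (4 pi epsilon_0 r^2) for r > radius, and the substitution
    r = radius / s to map [radius, infinity) onto (0, 1].
    """
    total = 0.0
    ds = 1.0 / steps
    for n in range(steps):
        s = (n + 0.5) * ds
        r = radius / s
        e_field = charge / (4 * math.pi * EPSILON_0 * r * r)
        density = 0.5 * EPSILON_0 * e_field ** 2
        # Volume element 4 pi r^2 dr, with dr = (radius / s^2) ds.
        total += density * 4 * math.pi * r * r * (radius / (s * s)) * ds
    return total

charge, radius = 1e-9, 0.05    # coulombs and meters (hypothetical)
numeric = field_energy_outside_shell(charge, radius)
textbook = charge ** 2 / (8 * math.pi * EPSILON_0 * radius)
print(numeric, textbook)
```

The two numbers agree, so the energy we’d assign to the charges is accounted for by integrating the density of the field they produce.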
Polarization of Electromagnetic Waves
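Let’s look at another property of our plane wave solutions of Maxwell’s equations. Specifically, we’ll assume that the electric and magnetic fields are each plane waves in the directions $k_E$ and $k_B$, respectively:
$$\begin{aligned}E(r,t)&=\hat{E}(k_E\cdot r-ct)\\B(r,t)&=\hat{B}(k_B\cdot r-ct)\end{aligned}$$
We can take these and plug them into the vacuum version of Maxwell’s equations, and evaluate them at $(r,t)=(0,0)$:
$$\begin{aligned}k_E\cdot\hat{E}'(0)&=0\\k_E\times\hat{E}'(0)&=c\hat{B}'(0)\\k_B\cdot\hat{B}'(0)&=0\\k_B\times\hat{B}'(0)&=-\frac{1}{c}\hat{E}'(0)\end{aligned}$$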
The first equation says that $\hat{E}'(0)$ is perpendicular to $k_E$, but the second equation implies, in part, that $\hat{B}'(0)$ is also perpendicular to $k_E$. Similarly, the third and fourth equations say that both $\hat{B}'(0)$ and $\hat{E}'(0)$ are perpendicular to $k_B$, meaning that $k_E$ and $k_B$ either point in the same direction or in opposite directions. We can always pick our coordinates so that $\hat{E}'(0)$ points in the direction of the $x$-axis and $\hat{B}'(0)$ points in the direction of the $y$-axis; then $k_E$ points in the direction of the $z$-axis. It’s then straightforward to check that $k_B=k_E$ rather than $k_B=-k_E$. Of course, it’s possible that $\hat{E}'(0)$ (and thus $\hat{B}'(0)$ also) is zero; in this case we can just pick some different time at which to evaluate the equations. There must be some time for which these values are nonzero, or else $E$ and $B$ are simply constants, which is a pretty vacuous solution that we’ll just subtract off and ignore.
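The upshot of this is that $E$ and $B$ must be plane waves traveling in the same direction. We put this back into our assumption:
$$\begin{aligned}E(r,t)&=\hat{E}(k\cdot r-ct)\\B(r,t)&=\hat{B}(k\cdot r-ct)\end{aligned}$$
and then Maxwell’s equations imply
$$\begin{aligned}k\cdot\hat{E}'&=0\\k\times\hat{E}'&=c\hat{B}'\\k\cdot\hat{B}'&=0\\k\times\hat{B}'&=-\frac{1}{c}\hat{E}'\end{aligned}$$
where these are now full functions and not just evaluations at some conveniently-chosen point. And, incidentally, the second and fourth equations are completely equivalent. Now we can see that $\hat{E}'$ and $\hat{B}'$ are perpendicular at every point. Further, whatever component either vector has in the $k$ direction is constant, and again we will just subtract it off and ignore it.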
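As the wave propagates in the direction of $k$, the electric and magnetic fields move around in the plane perpendicular to $k$. If we pick our $z$-axis in the direction of $k$, we can write $\hat{E}=(\hat{E}_x,\hat{E}_y,0)$ and $\hat{B}=(\hat{B}_x,\hat{B}_y,0)$. Then the second (and fourth) equation tells us
$$(-\hat{E}_y',\hat{E}_x',0)=c(\hat{B}_x',\hat{B}_y',0)$$
That is, we get two decoupled equations:
$$\begin{aligned}\hat{E}_x'&=c\hat{B}_y'\\\hat{E}_y'&=-c\hat{B}_x'\end{aligned}$$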
This tells us that we can break up our plane wave solution into two different plane wave solutions. In one, the electric field “waves” in the $x$ direction while the magnetic field waves in the $y$ direction; in the other, the electric field waves in the $y$ direction and the magnetic field waves in the $x$ direction.
This decomposition is the basis of polarized light. We can create filters that only allow waves with the electric field oriented in one direction to pass; generic waves can be decomposed into a component waving in the chosen direction and a component waving in the perpendicular direction, and the latter component gets destroyed as the wave passes through the Polaroid filter — yes, that’s where the company got its name — leaving only the light oriented in the “right” way.
As a quick, familiar application, we can make glasses with a film over the left eye that polarizes light vertically, and one over the right eye that polarizes light horizontally. Then if we show a quickly-alternating series of images, each polarized with the opposite axis, then they will be presented to each eye separately. This is the basis of the earliest modern stereoscopic — or “3-D” — glasses, which had the problem that if you tilted your head the effect was first lost, and then reversed as your neck’s angle increased. If you’ve been paying attention, you should be able to see why.
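If you’d like to check your answer, here is a minimal sketch of the head-tilt effect. It assumes an ideal linear polarizer, which passes only the component of the electric field along its axis, so the passed intensity scales as the squared cosine of the angle between the light’s polarization and the filter axis (Malus’s law). The angles and the function name are choices made for this example.

```python
import math

def passed_fraction(light_angle_deg, filter_angle_deg):
    """Fraction of intensity passed by an ideal linear polarizer (Malus's law)."""
    delta = math.radians(light_angle_deg - filter_angle_deg)
    return math.cos(delta) ** 2

# The left-eye image is polarized vertically (90 degrees), the right-eye image
# horizontally (0 degrees). Tilting the head rotates the left lens's axis.
for tilt in (0, 45, 90):
    left_filter = 90 + tilt
    intended = passed_fraction(90, left_filter)   # left-eye image through the left lens
    leaked = passed_fraction(0, left_filter)      # right-eye image through the left lens
    print(tilt, round(intended, 3), round(leaked, 3))
```

Upright, the left eye sees only its own image. At a 45-degree tilt both images come through equally, and at 90 degrees the left eye sees only the image meant for the right.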
The Propagation Velocity of Electromagnetic Waves
Now we’ve derived the wave equation from Maxwell’s equations, and we have worked out the plane-wave solutions. But there’s more to Maxwell’s equations than just the wave equation. Still, let’s take some plane-waves and see what we get.
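First and foremost, what’s the propagation velocity of our plane-wave solutions? Well, it’s $c$ for the generic wave equation
$$\frac{\partial^2F}{\partial t^2}-c^2\nabla^2F=0$$
while our electromagnetic wave equation is
$$\frac{\partial^2F}{\partial t^2}-\frac{1}{\epsilon_0\mu_0}\nabla^2F=0$$
so we find the propagation velocity of waves in both electric and magnetic fields is
$$c=\frac{1}{\sqrt{\epsilon_0\mu_0}}$$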
Hm.
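Conveniently, I already gave values for both $\epsilon_0$ and $\mu_0$:
$$\begin{aligned}\epsilon_0&\approx8.854187817\times10^{-12}\,\frac{\mathrm{F}}{\mathrm{m}}\\\mu_0&\approx1.2566370614\times10^{-6}\,\frac{\mathrm{H}}{\mathrm{m}}\end{aligned}$$
Multiplying, we find:
$$\epsilon_0\mu_0\approx1.11265006\times10^{-17}\,\frac{\mathrm{s}^2}{\mathrm{m}^2}$$
which means that
$$c=\frac{1}{\sqrt{\epsilon_0\mu_0}}\approx299{,}792{,}458\,\frac{\mathrm{m}}{\mathrm{s}}$$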
And this is a number which should look very familiar: it’s the speed of light. In an 1864 paper, Maxwell himself noted:
The agreement of the results seems to show that light and magnetism are affections of the same substance, and that light is an electromagnetic disturbance propagated through the field according to electromagnetic laws.
Indeed, this supposition has been borne out in experiment after experiment over the last century and a half: light is an electromagnetic wave.
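The arithmetic is easy to reproduce. This is a minimal sketch using the classical values of the two constants: the magnetic constant as $4\pi\times10^{-7}$ henries per meter, and the electric constant rounded to ten significant figures.

```python
import math

EPSILON_0 = 8.854187817e-12    # electric constant in F/m
MU_0 = 4e-7 * math.pi          # magnetic constant in H/m

speed = 1.0 / math.sqrt(EPSILON_0 * MU_0)
print(f"{speed:,.0f} m/s")
```

Two constants measured with capacitors and coils, and out pops the speed of light.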
Plane Waves
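We’ve derived a “wave equation” from Maxwell’s equations, but it’s not clear what it means, or even why this is called a wave equation. Let’s consider the abstracted form, which both electric and magnetic fields satisfy:
$$\frac{\partial^2F}{\partial t^2}-c^2\nabla^2F=0$$
where $\nabla^2$ is the “Laplacian” operator, defined on scalar functions by taking the gradient followed by the divergence, and extended linearly to vector fields. If we have a Cartesian coordinate system (and remember we’re working in good, old $\mathbb{R}^3$ so it’s possible to pick just such coordinates, albeit not canonically) we can write
$$\frac{\partial^2F_x}{\partial t^2}-c^2\nabla^2F_x=0$$
where $F_x$ is the $x$-component of $F$, and a similar equation holds for the $y$ and $z$ components as well. We can also write out the Laplacian in terms of coordinate derivatives:
$$\frac{\partial^2F_x}{\partial t^2}-c^2\left(\frac{\partial^2F_x}{\partial x^2}+\frac{\partial^2F_x}{\partial y^2}+\frac{\partial^2F_x}{\partial z^2}\right)=0$$
Let’s simplify further to just consider functions that depend on $x$ and $t$, and which are constant in the $y$ and $z$ directions:
$$\frac{\partial^2F_x}{\partial t^2}-c^2\frac{\partial^2F_x}{\partial x^2}=0$$
We can take this big operator and “factor” it:
$$\left(\frac{\partial}{\partial t}+c\frac{\partial}{\partial x}\right)\left(\frac{\partial}{\partial t}-c\frac{\partial}{\partial x}\right)F_x=0$$
Any function which either “factor” sends to zero will be a solution of the whole equation. We find solutions like
$$F_x(x,t)=f(x-ct)\qquad F_x(x,t)=g(x+ct)$$
where $f$ and $g$ are pretty much any function that’s at least mildly well-behaved.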
We call solutions of the first form “right-moving”, for if we view $t$ as time and watch as it increases, the “shape” of $f$ stays the same; it just moves in the increasing $x$ direction. That is, at time $t_0+\Delta t$ we see the same thing at $x_0$ that we saw at $x_0-c\Delta t$ (that is, $c\Delta t$ units to the left) at time $t_0$. Similarly, we call solutions of the second form “left-moving”. In each family, solutions propagate at a rate of $c$, which was the constant from our original equation. Any solution of this simplified, one-dimensional wave equation will be the sum of a right-moving and a left-moving term.
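Both claims about right-moving solutions can be checked numerically. Here is a minimal sketch that picks a Gaussian bump as the shape and an arbitrary speed (both hypothetical choices), estimates the one-dimensional wave operator $\frac{\partial^2}{\partial t^2}-c^2\frac{\partial^2}{\partial x^2}$ with central differences, and compares the solution at a shifted time and position.

```python
import math

c = 2.0   # propagation speed (arbitrary choice for the example)

def shape(s):
    """A smooth bump; any mildly well-behaved function would do."""
    return math.exp(-s * s)

def right_mover(x, t):
    return shape(x - c * t)

def wave_residual(u, x, t, h=1e-3):
    """Central-difference estimate of u_tt - c^2 u_xx at the point (x, t)."""
    u_tt = (u(x, t + h) - 2 * u(x, t) + u(x, t - h)) / h ** 2
    u_xx = (u(x + h, t) - 2 * u(x, t) + u(x - h, t)) / h ** 2
    return u_tt - c ** 2 * u_xx

# The residual should be tiny (limited only by the finite-difference error).
print(wave_residual(right_mover, 0.3, 0.1))

# What we see at x0 at time t0 + dt is what was at x0 - c * dt at time t0.
x0, t0, dt = 1.0, 0.5, 0.25
print(right_mover(x0, t0 + dt), right_mover(x0 - c * dt, t0))
```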
More generally, for the three-dimensional version we have “plane-wave” solutions propagating in any given direction we want. We could do a big, messy calculation, but note that if $k$ is any unit vector, we can pick a Cartesian coordinate system where $k$ is the unit vector in the $x$ direction, in which case we’re back to the right-moving solutions from above. And of course there’s no reason we can’t let the shape $\hat{F}$ be a vector-valued function. Such a solution looks like
$$F(r,t)=\hat{F}(k\cdot r-ct)$$
The bigger $t$ is, the further in the $k$ direction the position vector $r$ must extend to compensate; the shape $\hat{F}$ stays the same, but moves in the direction of $k$ with a velocity of $c$.
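It will be helpful to work out some of the basic derivatives of such solutions. Time is easy:
$$\frac{\partial F}{\partial t}(r,t)=-c\hat{F}'(k\cdot r-ct)$$
Spatial derivatives are a little trickier. We pick a Cartesian coordinate system to write:
$$\frac{\partial F_j}{\partial x_i}(r,t)=k_i\hat{F}_j'(k\cdot r-ct)$$
We don’t really want to depend on coordinates, so luckily it’s easy enough to figure out:
$$\begin{aligned}\nabla\cdot F(r,t)&=k\cdot\hat{F}'(k\cdot r-ct)\\\nabla\times F(r,t)&=k\times\hat{F}'(k\cdot r-ct)\end{aligned}$$
which will make our lives much easier to have worked out in advance.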