I’d like to step aside from the main line to make one complaint. In refreshing my background in classical electromagnetism for this series I’ve run into something that bugs the hell out of me as a mathematician. I remember it from my own first course, but I’m shocked to see that it survives into every upper-level treatment I’ve seen.
It’s about the existence of potentials, and the argument usually goes like this: as Faraday’s law tells us, for a static electric field $E$ we have $\nabla\times E=0$; therefore $E=-\nabla\phi$ for some potential function $\phi$ because the curl of a gradient is zero.
Let’s break this down to simple formal logic that any physics undergrad can follow. Let $P$ be the statement that there exists a $\phi$ such that $E=-\nabla\phi$. Let $Q$ be the statement that $\nabla\times E=0$. The curl of a gradient being zero is the implication $P\implies Q$. So here’s the logic:
$$\begin{aligned}&Q\\&P\implies Q\\&\therefore P\end{aligned}$$
and that doesn’t make sense at all. It’s a textbook case of “affirming the consequent”.
Saying that $E$ has a potential function is a nice, convenient way of satisfying the condition that its curl should vanish, but this argument gives no rationale for believing it’s the only option.
If we flip over to the language of differential forms, we know that the curl operator on a vector field corresponds to the operator $\alpha\mapsto *d\alpha$ on $1$-forms, while the gradient operator corresponds to $f\mapsto df$. We indeed know that $*d(df)=0$ automatically (the curl of a gradient vanishes), but knowing that $d\alpha=0$ is not enough to conclude that $\alpha=df$ for some $f$. In fact, this question is exactly what de Rham cohomology is all about!
So what’s missing? Full formality demands that we justify that the first de Rham cohomology of our space vanishes. Now, I’m not suggesting that we make physics undergrads learn about homology (it might not be a terrible idea, though), but we can satisfy this in the context of a course just by admitting that (a) we are being a little sloppy here, and (b) the justification is that (for our purposes) the electric field is defined in some simply-connected region of space which has no “holes” one could wrap a path around. In fact, if the students have had a decent course in multivariable calculus they’ve probably seen the explicit construction of a potential function for a vector field whose curl vanishes, subject to the restriction that we’re working over a simply-connected space.
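To see concretely what goes wrong when the region does have a hole, here is a minimal numerical sketch. It assumes the standard counterexample, the planar field $(-y,x)/(x^2+y^2)$ on the plane with the origin removed; the function names are made up for illustration. The curl vanishes wherever the field is defined, yet the circulation around the hole is $2\pi$ rather than the zero that any gradient field would give.

```python
import math

def field(x, y):
    """The planar field (-y, x) / (x^2 + y^2), defined everywhere except the origin."""
    r2 = x * x + y * y
    return (-y / r2, x / r2)

def curl_z(x, y, h=1e-5):
    """Central-difference estimate of the z-component of the curl at (x, y)."""
    dfy_dx = (field(x + h, y)[1] - field(x - h, y)[1]) / (2 * h)
    dfx_dy = (field(x, y + h)[0] - field(x, y - h)[0]) / (2 * h)
    return dfy_dx - dfx_dy

def circulation(radius=1.0, steps=10000):
    """Line integral of the field once counterclockwise around a circle about the origin."""
    total = 0.0
    dtheta = 2 * math.pi / steps
    for i in range(steps):
        theta = (i + 0.5) * dtheta
        x, y = radius * math.cos(theta), radius * math.sin(theta)
        fx, fy = field(x, y)
        # Dot the field with the circle's tangent vector (-r sin, r cos) d(theta).
        total += (-fx * radius * math.sin(theta) + fy * radius * math.cos(theta)) * dtheta
    return total

print(curl_z(1.0, 0.5))   # numerically indistinguishable from zero
print(circulation())      # close to 2*pi, so no global potential can exist
```

If a potential existed on the whole punctured plane, the circulation around any closed loop would have to vanish; the nonzero answer is exactly the first cohomology class that the usual textbook argument ignores.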
The problem arises again in justifying the existence of a vector potential: as Gauss’ law for magnetism tells us, for a magnetic field $B$ we have $\nabla\cdot B=0$; therefore $B=\nabla\times A$ for some vector potential $A$ because the divergence of a curl is zero.
Again we see the same problem of affirming the consequent. And again the real problem hinges on the unspoken assumption that the second de Rham cohomology of our space vanishes. Yes, this is true for contractible spaces, but we must make mention of the fact that our space is contractible! In fact, I did exactly that when I needed to get ahold of the magnetic potential once.
Again: we don’t need to stop simplifying and sweeping some of these messier details of our arguments under the rug when dealing with undergraduate students, but we do need to be honest that those details were there to be swept in the first place. The alternative most texts and notes choose now is to include statements which are blatantly false, and to rely on our authority to make students accept them unquestioningly.
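Let’s start with Ampère’s law, including Maxwell’s correction:
$$\nabla\times B=\mu_0J+\epsilon_0\mu_0\frac{\partial E}{\partial t}$$
Now let’s take the dot product of this with the electric field:
$$E\cdot(\nabla\times B)=\mu_0E\cdot J+\epsilon_0\mu_0E\cdot\frac{\partial E}{\partial t}$$
On the left, we can run a product rule, $\nabla\cdot(E\times B)=B\cdot(\nabla\times E)-E\cdot(\nabla\times B)$, in reverse:
$$B\cdot(\nabla\times E)-\nabla\cdot(E\times B)=\mu_0E\cdot J+\epsilon_0\mu_0E\cdot\frac{\partial E}{\partial t}$$
Now, Faraday’s law tells us that
$$\nabla\times E=-\frac{\partial B}{\partial t}$$
so we can write:
$$-B\cdot\frac{\partial B}{\partial t}-\nabla\cdot(E\times B)=\mu_0E\cdot J+\epsilon_0\mu_0E\cdot\frac{\partial E}{\partial t}$$
Let’s rearrange this a bit:
$$-\epsilon_0E\cdot\frac{\partial E}{\partial t}-\frac{1}{\mu_0}B\cdot\frac{\partial B}{\partial t}=\nabla\cdot\left(\frac{E\times B}{\mu_0}\right)+E\cdot J$$
The dot product of a vector field with its own derivative should look familiar; we can rewrite:
$$-\frac{\partial}{\partial t}\left(\frac{\epsilon_0}{2}|E|^2+\frac{1}{2\mu_0}|B|^2\right)=\nabla\cdot\left(\frac{E\times B}{\mu_0}\right)+E\cdot J$$
The quantity in parentheses on the left is the combined energy density $U$ of the electric and magnetic fields. The second term on the right is the energy density lost to Joule heating per unit time. The only thing left is this vector field:
$$S=\frac{E\times B}{\mu_0}$$
which we call the “Poynting vector”. It’s really named after British physicist John Henry Poynting, but generations of students remember it because it “points” in the direction electromagnetic energy flows.
To see this, look at the final form of our equation:
$$-\frac{\partial U}{\partial t}=\nabla\cdot S+E\cdot J$$
On the left we have the rate at which the electromagnetic energy is going down at any given point. On the right, we have two terms; the second is the rate electromagnetic energy density is being lost to heat energy at the point, while the first is the rate electromagnetic energy is “flowing away from” the point.
Compare this with the conservation of charge:
$$-\frac{\partial\rho}{\partial t}=\nabla\cdot J$$
where the rate at which charge density decreases is equal to the rate that charge is “flowing away” through currents. The only difference is that there is no dissipation term for charge like there is for energy.
One other important thing to notice is what this tells us about our plane wave solutions. If we take such an electromagnetic wave propagating in the direction $k$ and with the electric field polarized in some particular direction, then we can determine that
$$S=\frac{E\times B}{\mu_0}=\frac{|E|^2}{\mu_0c}k$$
showing that electromagnetic waves carry electromagnetic energy in the direction that they propagate.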
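As a quick numerical sanity check, here is a small sketch that computes the Poynting vector $E\times B/\mu_0$ for one sample plane wave. It assumes the plane-wave relation $B=\frac{1}{c}k\times E$ for a unit propagation direction $k$ perpendicular to $E$; the sample field values and the helper names are invented for illustration.

```python
import math

MU_0 = 4 * math.pi * 1e-7   # magnetic constant in H/m (classical defined value)
C = 299_792_458.0           # speed of light in m/s

def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def poynting(E, k):
    """Poynting vector (E x B) / mu_0, assuming a plane wave with B = (k x E) / c."""
    B = tuple(component / C for component in cross(k, E))
    return tuple(component / MU_0 for component in cross(E, B))

k = (0.0, 0.0, 1.0)    # assumed propagation direction (unit vector)
E = (3.0, 4.0, 0.0)    # assumed field in V/m, perpendicular to k
S = poynting(E, k)
print(S)                        # only the component along k is nonzero
print(25.0 / (MU_0 * C))        # |E|^2 / (mu_0 c) for comparison
```

The energy flow comes out along $k$ with magnitude $|E|^2/(\mu_0c)$, regardless of how $E$ is oriented within the plane perpendicular to $k$.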
When calculating the potential energy of the magnetic field, we calculated the power needed to run a certain current around a certain circuit. Let’s look into that a little more deeply.
We start with Ohm’s law, which basically says that, as a first approximation, the electromotive force around a circuit is proportional to the current around it; push harder and you’ll move charge faster. As a formula:
$$V=RI$$
The electromotive force $V$ (or “voltage”) on the left is equal to the current $I$ around the circuit times the “resistance” $R$. What’s the resistance? Well, here it’s basically just a constant of proportionality, which we read as “how hard is it to push charge around this circuit?”
But let’s dig in a bit more. A current doesn’t really flow around an infinitely-thin wire; it flows around a wire with some thickness. The thicker the wire is (the bigger its cross-sectional area) the easier it should be to push charge around, while the longer the circuit is, the harder. We’ll write down our resistance in the form
$$R=\eta\frac{L}{A}$$
where $L$ is the length of the wire, $A$ is its cross-sectional area, and $\eta$ is a new proportionality constant we call “resistivity”. Putting this together with the first form of Ohm’s law we find
$$V=\eta\frac{L}{A}I$$
But look at this: the current is made up of a current density flowing along the wire, integrated across a cross-section. If the wire is running in the $z$ direction and the current density in that direction is constantly $J_z$, then $I=J_zA$. Further, at least to a first approximation, the electromotive force is the $z$-component of the electric field times the length traveled in that direction: $V=E_zL$.
Thus we conclude that $E_z=\eta J_z$. But since there’s nothing really special about the $z$ direction, we actually find that
$$E=\eta J$$
which is Ohm’s law again, but now in terms of fields and current distributions.
But what about the power? We’ve got a battery pushing a current around a circuit and using power to do it; where does the energy go? Well, if we think about pushing little bits of charge around the wire, they’re going to hit parts of the wire and lose some energy in the process. The parts they hit get shaken up, and this appears as heat energy; the process is called “Ohmic” or “Joule” heating, the latter from Joule’s own experiments using a resistive wire to heat up a tub of water.
If we have a current $I$ made up of $n$ bits of charge $q$ per unit time, then each bit takes an energy of $qV$ to go around the circuit once. This happens $n$ times per unit time, so the total power expenditure is
$$P=nqV=IV$$
just as we said last time. But now we can do the same trick as above and write
$$\frac{P}{AL}=\frac{I}{A}\frac{V}{L}=J\cdot E$$
which measures the power per unit volume dissipated through Joule heating in the circuit.
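To put numbers on this, here is a small sketch that runs both forms of Ohm’s law on one made-up wire. The resistivity is roughly that of copper, and the length, cross-section, and current are arbitrary illustrative values, not anything from the discussion above.

```python
import math

def resistance(resistivity, length, area):
    """R = resistivity * L / A for a uniform wire."""
    return resistivity * length / area

# Illustrative values only: roughly copper, 10 m long, 1 mm^2 cross-section.
RESISTIVITY = 1.7e-8   # ohm * m (assumed, approximately copper)
LENGTH = 10.0          # m
AREA = 1.0e-6          # m^2
CURRENT = 2.0          # A

R = resistance(RESISTIVITY, LENGTH, AREA)
V = R * CURRENT                 # circuit form of Ohm's law
power = CURRENT * V             # total power dissipated, P = IV

J = CURRENT / AREA              # current density along the wire
E = V / LENGTH                  # field component along the wire
power_density = J * E           # Joule heating per unit volume

print(R, V, power)
print(E, RESISTIVITY * J)                     # field form of Ohm's law: E = resistivity * J
print(power_density * AREA * LENGTH, power)   # density times volume recovers IV
```

The two printed pairs agree, which is all the field form of the law claims for a uniform wire: the same information, divided through by the geometry.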
Last time we calculated the energy of the electric field. Now let’s repeat with the magnetic field, and let’s try to be a little more careful about it since magnetic fields can be slippery.
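Let’s consider a static magnetic field $B$ generated by a collection of circuits $C_i$, each carrying a current $I_i$. Recall that Gauss’ law for magnetism tells us that $\nabla\cdot B=0$; since space is contractible, we know that its homology is trivial, and thus $B$ must be the curl of some other vector field $A$, which we call the “magnetic potential” or “vector potential”. Now we can write down the flux of the magnetic field through each circuit, using any surface $S_i$ bounded by $C_i$:
$$\Phi_i=\int_{S_i}B\cdot dS=\int_{S_i}(\nabla\times A)\cdot dS=\oint_{C_i}A\cdot dr$$
As the field is being built up this flux changes, and Faraday’s law tells us that it induces an electromotive force around the circuit:
$$V_i=-\frac{d\Phi_i}{dt}$$
This electromotive force must be counterbalanced by a battery maintaining the current or else the magnetic field wouldn’t be static.
We can determine how much power the battery must expend to maintain the current; a charge $q$ moving around the circuit goes down by $-qV_i$ in potential energy, which the battery must replace to send it around again. If $n$ such charges pass around in unit time, this is a work of $-nqV_i$ per unit time; since $nq=I_i$ is the current, we find that the power expenditure is $-I_iV_i$, or $I_i\frac{d\Phi_i}{dt}$.
Thus if we want to ramp the currents (and the field) up from a cold start in a time $T$ it takes a total work of
$$W=\sum_i\int_0^TI_i(t)\frac{d\Phi_i}{dt}\,dt$$
which is then the energy stored in the magnetic field.
This expression doesn’t depend on exactly how the field turns on, so let’s say the currents ramp up linearly to their final values $I_i$:
$$I_i(t)=I_i\frac{t}{T}$$
and since the fluxes are proportional to the currents they must also ramp up linearly:
$$\Phi_i(t)=\Phi_i\frac{t}{T}$$
Plugging these in above, we find:
$$W=\sum_i\int_0^TI_i\frac{t}{T}\frac{\Phi_i}{T}\,dt=\sum_i\frac{I_i\Phi_i}{T^2}\int_0^Tt\,dt=\frac{1}{2}\sum_iI_i\Phi_i$$
Now we can plug in our original expression for the flux:
$$W=\frac{1}{2}\sum_iI_i\oint_{C_i}A\cdot dr$$
This is great. But to be more general, let’s replace our currents with a current distribution $J$:
$$W=\frac{1}{2}\int_{\mathbb{R}^3}A\cdot J\,dV$$
Now we can use Ampère’s law, $\nabla\times B=\mu_0J$ for static fields, along with the product rule $\nabla\cdot(A\times B)=B\cdot(\nabla\times A)-A\cdot(\nabla\times B)$, to write
$$W=\frac{1}{2\mu_0}\int_{\mathbb{R}^3}A\cdot(\nabla\times B)\,dV=\frac{1}{2\mu_0}\int_{\mathbb{R}^3}|B|^2\,dV-\frac{1}{2\mu_0}\int_{\mathbb{R}^3}\nabla\cdot(A\times B)\,dV$$
We can pull the same sort of trick as last time to make the second integral go away; use the divergence theorem to convert to
$$\frac{1}{2\mu_0}\oint_S(A\times B)\cdot dS$$
and take the surface $S$ far enough away that the integral becomes negligible. We handwave that $A\times B$ falls off roughly as the inverse fifth power of $r$, while the area of $S$ only grows as the second power, and say that the term goes to zero.
So now we have a similar expression as last time for a magnetic energy density:
$$U_B=\frac{1}{2\mu_0}|B|^2$$
Again, we can check the units; the magnetic field has units of force per unit charge per unit velocity:
$$\frac{\mathrm{N}}{\mathrm{C}\cdot\mathrm{m}/\mathrm{s}}=\frac{\mathrm{kg}}{\mathrm{C}\cdot\mathrm{s}}$$
while the magnetic constant has units of henries per meter:
$$\frac{\mathrm{H}}{\mathrm{m}}=\frac{\mathrm{kg}\cdot\mathrm{m}}{\mathrm{C}^2}$$
Putting together an inverse factor of the magnetic constant and two of the magnetic field, we get:
$$\frac{\mathrm{C}^2}{\mathrm{kg}\cdot\mathrm{m}}\cdot\frac{\mathrm{kg}^2}{\mathrm{C}^2\cdot\mathrm{s}^2}=\frac{\mathrm{kg}}{\mathrm{m}\cdot\mathrm{s}^2}=\frac{\mathrm{kg}\cdot\mathrm{m}^2}{\mathrm{s}^2}\cdot\frac{1}{\mathrm{m}^3}=\frac{\mathrm{J}}{\mathrm{m}^3}$$
or, units of energy per unit volume, just like we expect for an energy density.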
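The claim that the work doesn’t depend on how the field turns on is easy to test numerically for a single circuit. The sketch below is only an illustration under the same assumption as the argument above, namely that the flux stays proportional to the current throughout the ramp; the function name, the time rescaling to $0\le t\le1$, and the sample ramp profiles are all made up.

```python
import math

def ramp_work(final_current, final_flux, profile, steps=10000):
    """Approximate the integral of I(t) * dPhi/dt over 0 <= t <= 1.

    Both the current and the flux are assumed to follow the same ramp
    profile(t), which should run from 0 at t = 0 to 1 at t = 1.
    """
    work = 0.0
    dt = 1.0 / steps
    for i in range(steps):
        t0, t1 = i * dt, (i + 1) * dt
        mid_current = final_current * profile(0.5 * (t0 + t1))
        d_flux = final_flux * (profile(t1) - profile(t0))
        work += mid_current * d_flux
    return work

I_FINAL, PHI_FINAL = 2.0, 3.0   # arbitrary illustrative values

linear = ramp_work(I_FINAL, PHI_FINAL, lambda t: t)
quadratic = ramp_work(I_FINAL, PHI_FINAL, lambda t: t * t)
smooth = ramp_work(I_FINAL, PHI_FINAL, lambda t: math.sin(0.5 * math.pi * t))

print(linear, quadratic, smooth)   # each is close to I * Phi / 2
```

All three profiles give the same answer because the integrand is really $I\,d\Phi$ with $\Phi$ proportional to $I$, so only the endpoints matter.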
Okay, now let’s consider the electric field from the perspective of energy. We have an idea that this might be interesting because we know that the field produces a force, and forces and energies interact in interesting ways.
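So recall that if we have a “test charge” $q$ at a point $r$ in an electric field $E$ it experiences a force $F=qE$. As we saw when discussing Faraday’s law, for a static electric field we can write $E=-\nabla\phi$ for some “electric potential” function $\phi$. Thus we can also write $F=-\nabla U$ for the potential energy function $U=q\phi$.
Now, say the field is generated by a charge distribution $\rho$; how much potential energy is contained in the force the field exerts on the little bit of charge at $r$? We count $\rho(r)\phi(r)$, but this is too much: half of it is due to the rest of the distribution acting on the bit of charge at $r$ and half of it comes from that bit acting back. We can thus find the total potential energy by integrating
$$U=\frac{1}{2}\int_{\mathbb{R}^3}\rho\phi\,dV$$
Now, Gauss’ law tells us that $\rho=\epsilon_0\nabla\cdot E$, so we substitute:
$$U=\frac{\epsilon_0}{2}\int_{\mathbb{R}^3}(\nabla\cdot E)\phi\,dV$$
Next we use a form of the product rule, $\nabla\cdot(\phi E)=\nabla\phi\cdot E+\phi\nabla\cdot E$, and run it backwards to write:
$$U=\frac{\epsilon_0}{2}\int_{\mathbb{R}^3}\nabla\cdot(\phi E)\,dV-\frac{\epsilon_0}{2}\int_{\mathbb{R}^3}\nabla\phi\cdot E\,dV=\frac{\epsilon_0}{2}\int_{\mathbb{R}^3}\nabla\cdot(\phi E)\,dV+\frac{\epsilon_0}{2}\int_{\mathbb{R}^3}|E|^2\,dV$$
where we evaluate the first integral over space by evaluating it over the solid ball $B_R$ of radius $R$ and taking the limit as $R$ goes off to infinity. The divergence theorem says we can write:
$$\lim_{R\to\infty}\int_{B_R}\nabla\cdot(\phi E)\,dV=\lim_{R\to\infty}\oint_{\partial B_R}\phi E\cdot dS=0$$
where, as usual, we have taken the charge distribution to be compactly supported, so as our sphere gets large enough, the potential energy goes to zero. Yes, this is very hand-wavy, but this is how the physicists do it.
Anyway, what does this tell us? It means that a static electric field contains energy with a density
$$U_E=\frac{\epsilon_0}{2}|E|^2$$
which we can integrate over any region of space to find the electrostatic potential energy contained in the field.
We can also check the units here; the electric field has units of force per unit charge:
$$\frac{\mathrm{N}}{\mathrm{C}}=\frac{\mathrm{kg}\cdot\mathrm{m}}{\mathrm{C}\cdot\mathrm{s}^2}$$
while the electric constant has units of farads per meter:
$$\frac{\mathrm{F}}{\mathrm{m}}=\frac{\mathrm{C}^2\cdot\mathrm{s}^2}{\mathrm{kg}\cdot\mathrm{m}^3}$$
Putting these together (two factors of $E$ and one of $\epsilon_0$) we find the units:
$$\frac{\mathrm{C}^2\cdot\mathrm{s}^2}{\mathrm{kg}\cdot\mathrm{m}^3}\cdot\frac{\mathrm{kg}^2\cdot\mathrm{m}^2}{\mathrm{C}^2\cdot\mathrm{s}^4}=\frac{\mathrm{kg}}{\mathrm{m}\cdot\mathrm{s}^2}=\frac{\mathrm{kg}\cdot\mathrm{m}^2}{\mathrm{s}^2}\cdot\frac{1}{\mathrm{m}^3}=\frac{\mathrm{J}}{\mathrm{m}^3}$$
Joules per cubic meter: energy per unit of volume, just as we’d expect for an energy density.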
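Here is one hedged check that the two ways of counting the energy agree, using an example that is not in the discussion above: a thin spherical shell of radius $R$ carrying total charge $q$. The sketch assumes the usual facts that the field vanishes inside the shell and is the Coulomb field outside it; the function names and sample values are made up.

```python
import math

EPSILON_0 = 8.8541878128e-12   # electric constant in F/m

def energy_density(q, r):
    """(epsilon_0 / 2) |E|^2 for the Coulomb field of charge q at distance r."""
    E = q / (4 * math.pi * EPSILON_0 * r ** 2)
    return 0.5 * EPSILON_0 * E ** 2

def field_energy(q, R, steps=20000):
    """Integrate the energy density over all space outside radius R.

    Uses the substitution u = 1/r so that the infinite range in r becomes
    the finite range 0 < u <= 1/R, with dr = du / u^2.
    """
    total = 0.0
    du = (1.0 / R) / steps
    for i in range(steps):
        u = (i + 0.5) * du
        r = 1.0 / u
        # Spherical shell volume element 4 pi r^2 dr, rewritten in terms of du.
        total += energy_density(q, r) * 4 * math.pi * r ** 2 * du / u ** 2
    return total

def half_charge_times_potential(q, R):
    """(1/2) * integral of rho * phi, with all the charge sitting at potential phi(R)."""
    phi = q / (4 * math.pi * EPSILON_0 * R)
    return 0.5 * q * phi

q, R = 1.0e-9, 0.1   # illustrative: one nanocoulomb on a 10 cm shell
print(field_energy(q, R))
print(half_charge_times_potential(q, R))
```

Both numbers come out to $q^2/(8\pi\epsilon_0R)$, one computed from the charges and the potential, the other purely from the field filling the space outside the shell.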
Let’s look at another property of our plane wave solutions of Maxwell’s equations. Specifically, we’ll assume that the electric and magnetic fields are each plane waves in the directions $k_E$ and $k_B$, respectively:
$$\begin{aligned}E(r,t)&=\hat{E}(k_E\cdot r-ct)\\B(r,t)&=\hat{B}(k_B\cdot r-ct)\end{aligned}$$
We can take these and plug them into the vacuum version of Maxwell’s equations, and evaluate them at $(r,t)=(0,0)$:
$$\begin{aligned}k_E\cdot\hat{E}'(0)&=0\\k_E\times\hat{E}'(0)&=c\hat{B}'(0)\\k_B\cdot\hat{B}'(0)&=0\\k_B\times\hat{B}'(0)&=-\frac{1}{c}\hat{E}'(0)\end{aligned}$$
The first equation says that $\hat{E}'(0)$ is perpendicular to $k_E$, but the second equation implies, in part, that $\hat{B}'(0)$ is also perpendicular to $k_E$. Similarly, the third and fourth equations say that both $\hat{E}'(0)$ and $\hat{B}'(0)$ are perpendicular to $k_B$, meaning that $k_E$ and $k_B$ either point in the same direction or in opposite directions. We can always pick our coordinates so that $\hat{E}'(0)$ points in the direction of the $x$-axis and $\hat{B}'(0)$ points in the direction of the $y$-axis; then $k_E$ points in the direction of the $z$-axis. It’s then straightforward to check that $k_B=k_E$ rather than $k_B=-k_E$. Of course, it’s possible that $\hat{E}'(0)$ (and thus also $\hat{B}'(0)$) is zero; in this case we can just pick some different time at which to evaluate the equations. There must be some time for which these values are nonzero, or else $\hat{E}$ and $\hat{B}$ are simply constants, which is a pretty vacuous solution that we’ll just subtract off and ignore.
The upshot of this is that $E$ and $B$ must be plane waves traveling in the same direction $k$. We put this back into our assumption:
$$\begin{aligned}E(r,t)&=\hat{E}(k\cdot r-ct)\\B(r,t)&=\hat{B}(k\cdot r-ct)\end{aligned}$$
and then Maxwell’s equations imply
$$\begin{aligned}k\cdot\hat{E}'&=0\\k\times\hat{E}'&=c\hat{B}'\\k\cdot\hat{B}'&=0\\k\times\hat{B}'&=-\frac{1}{c}\hat{E}'\end{aligned}$$
where these are now full functions and not just evaluations at some conveniently-chosen point. And, incidentally, the second and fourth equations are completely equivalent. Now we can see that $\hat{E}'$ and $\hat{B}'$ are perpendicular at every point. Further, whatever component either field has in the $k$ direction is constant, and again we will just subtract it off and ignore it.
As the wave propagates in the direction of $k$, the electric and magnetic fields move around in the plane perpendicular to $k$. If we pick our $z$-axis in the direction of $k$, we can write $\hat{E}=(\hat{E}_x,\hat{E}_y,0)$ and $\hat{B}=(\hat{B}_x,\hat{B}_y,0)$. Then the second (and fourth) equation tells us
$$(-\hat{E}_y',\hat{E}_x',0)=(c\hat{B}_x',c\hat{B}_y',0)$$
That is, we get two decoupled equations:
$$\begin{aligned}\hat{E}_x'&=c\hat{B}_y'\\\hat{E}_y'&=-c\hat{B}_x'\end{aligned}$$
This tells us that we can break up our plane wave solution into two different plane wave solutions. In one, the electric field “waves” in the $x$ direction while the magnetic field waves in the $y$ direction; in the other, the electric field waves in the $y$ direction and the magnetic field waves in the $x$ direction.
This decomposition is the basis of polarized light. We can create filters that only allow waves with the electric field oriented in one direction to pass; generic waves can be decomposed into a component waving in the chosen direction and a component waving in the perpendicular direction, and the latter component gets destroyed as the wave passes through the Polaroid filter — yes, that’s where the company got its name — leaving only the light oriented in the “right” way.
As a quick, familiar application, we can make glasses with a film over the left eye that polarizes light vertically, and one over the right eye that polarizes light horizontally. Then if we show a quickly-alternating series of images, each polarized with the opposite axis, then they will be presented to each eye separately. This is the basis of the earliest modern stereoscopic — or “3-D” — glasses, which had the problem that if you tilted your head the effect was first lost, and then reversed as your neck’s angle increased. If you’ve been paying attention, you should be able to see why.
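If you’d rather check your answer than take it on faith, here is a minimal sketch of the head-tilt effect. It assumes ideal linear polarizers that pass the squared projection of the field direction onto the filter axis (Malus’s law, which the text above does not spell out), and it assumes the left-eye image is polarized vertically and the right-eye image horizontally; the names are invented for illustration.

```python
import math

def transmitted_fraction(light_angle, filter_angle):
    """Fraction of intensity an ideal linear polarizer passes: the squared
    projection of the field direction onto the filter axis."""
    return math.cos(light_angle - filter_angle) ** 2

def what_each_eye_sees(head_tilt):
    """Fractions of each image reaching each eye when the glasses are tilted."""
    vertical, horizontal = math.pi / 2, 0.0
    left_lens = vertical + head_tilt       # left lens passes vertical when upright
    right_lens = horizontal + head_tilt    # right lens passes horizontal when upright
    return {
        "left eye, left image": transmitted_fraction(vertical, left_lens),
        "left eye, right image": transmitted_fraction(horizontal, left_lens),
        "right eye, left image": transmitted_fraction(vertical, right_lens),
        "right eye, right image": transmitted_fraction(horizontal, right_lens),
    }

for degrees in (0, 45, 90):
    print(degrees, what_each_eye_sees(math.radians(degrees)))
```

Upright, each eye gets only its own image; at 45 degrees each eye gets an equal mix of both, so the depth effect is lost; at 90 degrees each eye gets only the other eye’s image, so the effect is reversed.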
Now we’ve derived the wave equation from Maxwell’s equations, and we have worked out the plane-wave solutions. But there’s more to Maxwell’s equations than just the wave equation. Still, let’s take some plane-waves and see what we get.
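First and foremost, what’s the propagation velocity of our plane-wave solutions? Well, it’s $c$ for the generic wave equation
$$\frac{\partial^2F}{\partial t^2}-c^2\nabla^2F=0$$
while our electromagnetic wave equation is
$$\frac{\partial^2F}{\partial t^2}-\frac{1}{\epsilon_0\mu_0}\nabla^2F=0$$
so we find the propagation velocity of waves in both electric and magnetic fields is
$$c=\frac{1}{\sqrt{\epsilon_0\mu_0}}$$
Conveniently, I already gave values for both $\epsilon_0$ and $\mu_0$:
$$\begin{aligned}\epsilon_0&\approx8.854187817\times10^{-12}\,\frac{\mathrm{F}}{\mathrm{m}}\\\mu_0&\approx1.2566370614\times10^{-6}\,\frac{\mathrm{H}}{\mathrm{m}}\end{aligned}$$
Multiplying, we find:
$$\epsilon_0\mu_0\approx1.11265\times10^{-17}\,\frac{\mathrm{s}^2}{\mathrm{m}^2}$$
which means that
$$c=\frac{1}{\sqrt{\epsilon_0\mu_0}}\approx2.99792\times10^{8}\,\frac{\mathrm{m}}{\mathrm{s}}$$
And this is a number which should look very familiar: it’s the speed of light. In an 1864 paper, Maxwell himself noted: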
The agreement of the results seems to show that light and magnetism are affections of the same substance, and that light is an electromagnetic disturbance propagated through the field according to electromagnetic laws.
Indeed, this supposition has been borne out in experiment after experiment over the last century and a half: light is an electromagnetic wave.
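The arithmetic is a one-liner, so here it is as a quick check. The constants below are the standard published values of the electric and magnetic constants, typed in by hand rather than taken from any library.

```python
import math

EPSILON_0 = 8.8541878128e-12   # electric constant in F/m
MU_0 = 1.25663706212e-6        # magnetic constant in H/m

# Propagation velocity from the electromagnetic wave equation.
c = 1.0 / math.sqrt(EPSILON_0 * MU_0)

print(EPSILON_0 * MU_0)   # in s^2 / m^2
print(c)                  # in m/s, compare with the defined speed of light 299792458
```

The result matches the defined value of the speed of light to well within the precision of the two constants.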
We’ve derived a “wave equation” from Maxwell’s equations, but it’s not clear what it means, or even why this is called a wave equation. Let’s consider the abstracted form, which both electric and magnetic fields satisfy:
$$\frac{\partial^2F}{\partial t^2}-c^2\nabla^2F=0$$
where $\nabla^2$ is the “Laplacian” operator, defined on scalar functions by taking the gradient followed by the divergence, and extended linearly to vector fields. If we have a Cartesian coordinate system (and remember we’re working in good, old $\mathbb{R}^3$ so it’s possible to pick just such coordinates, albeit not canonically) we can write
$$\frac{\partial^2F_x}{\partial t^2}-c^2\nabla^2F_x=0$$
where $F_x$ is the $x$-component of $F$, and a similar equation holds for the $y$ and $z$ components as well. We can also write out the Laplacian in terms of coordinate derivatives, for a scalar function $f$:
$$\frac{\partial^2f}{\partial t^2}-c^2\left(\frac{\partial^2f}{\partial x^2}+\frac{\partial^2f}{\partial y^2}+\frac{\partial^2f}{\partial z^2}\right)=0$$
Let’s simplify further to just consider functions that depend on $x$ and $t$, and which are constant in the $y$ and $z$ directions:
$$\frac{\partial^2f}{\partial t^2}-c^2\frac{\partial^2f}{\partial x^2}=0$$
We can take this big operator and “factor” it:
$$\left(\frac{\partial}{\partial t}+c\frac{\partial}{\partial x}\right)\left(\frac{\partial}{\partial t}-c\frac{\partial}{\partial x}\right)f=0$$
Any function which either “factor” sends to zero will be a solution of the whole equation. We find solutions like
$$\begin{aligned}f(x,t)&=g(x-ct)\\f(x,t)&=h(x+ct)\end{aligned}$$
where $g$ and $h$ are pretty much any function that’s at least mildly well-behaved.
We call solutions of the first form “right-moving”, for if we view $t$ as time and watch as it increases, the “shape” of $g$ stays the same; it just moves in the increasing $x$ direction. That is, at time $t_0+\Delta t$ we see the same thing at $x_0$ that we saw at $x_0-c\Delta t$ ($c\Delta t$ units to the left) at time $t_0$. Similarly, we call solutions of the second form “left-moving”. In each family, solutions propagate at a rate of $c$, which was the constant from our original equation. Any solution of this simplified, one-dimensional wave equation will be the sum of a right-moving and a left-moving term.
More generally, for the three-dimensional version we have “plane-wave” solutions propagating in any given direction we want. We could do a big, messy calculation, but note that if $k$ is any unit vector, we can pick a Cartesian coordinate system where $k$ is the unit vector in the $x$ direction, in which case we’re back to the right-moving solutions from above. And of course there’s no reason we can’t let $\hat{F}$ be a vector-valued function. Such a solution looks like
$$F(r,t)=\hat{F}(k\cdot r-ct)$$
The bigger $t$ is, the further in the $k$ direction the position vector $r$ must extend to compensate; the shape stays the same, but moves in the direction of $k$ with a velocity of $c$.
It will be helpful to work out some of the basic derivatives of such solutions. Time is easy:
$$\frac{\partial F}{\partial t}(r,t)=-c\hat{F}'(k\cdot r-ct)$$
Spatial derivatives are a little trickier. We pick a Cartesian coordinate system to write:
$$\frac{\partial F}{\partial x}(r,t)=\frac{\partial}{\partial x}\hat{F}(k_xx+k_yy+k_zz-ct)=k_x\hat{F}'(k\cdot r-ct)$$
and similarly for $y$ and $z$. We don’t really want to depend on coordinates, so luckily it’s easy enough to figure out:
$$\begin{aligned}\nabla\cdot F&=k\cdot\hat{F}'\\\nabla\times F&=k\times\hat{F}'\end{aligned}$$
which will make our lives much easier to have worked out in advance.
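For the skeptical, here is a small finite-difference sketch of the one-dimensional story: a fixed shape sliding to the right at speed $c$ satisfies the wave equation, while the same shape sliding at the wrong speed does not. The Gaussian bump, the value of $c$, and the function names are all arbitrary choices made for illustration.

```python
import math

C = 2.0   # assumed propagation speed for this illustration

def shape(s):
    """An arbitrary smooth profile; a Gaussian bump is used here."""
    return math.exp(-s * s)

def right_mover(x, t):
    """A right-moving solution: the shape evaluated at x - ct."""
    return shape(x - C * t)

def wrong_speed(x, t):
    """The same shape moving at half the speed; not a solution for this C."""
    return shape(x - 0.5 * C * t)

def wave_residual(f, x, t, h=1e-3):
    """Second-difference estimate of f_tt - C^2 f_xx at the point (x, t)."""
    d2t = (f(x, t + h) - 2 * f(x, t) + f(x, t - h)) / h ** 2
    d2x = (f(x + h, t) - 2 * f(x, t) + f(x - h, t)) / h ** 2
    return d2t - C ** 2 * d2x

print(wave_residual(right_mover, 0.3, 0.1))   # tiny: discretization error only
print(wave_residual(wrong_speed, 0.3, 0.1))   # clearly nonzero

# The shape seen at x0 at time t0 + dt is what was at x0 - C*dt at time t0.
x0, t0, dt = 0.7, 0.2, 0.05
print(right_mover(x0, t0 + dt), right_mover(x0 - C * dt, t0))
```

The last line prints the same value twice, which is the “same shape, shifted” statement in numerical form.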
Maxwell’s equations give us a collection of differential equations to describe the behavior of the electric and magnetic fields. Juggling them, we can come up with other differential equations that give us more insight into how these fields interact. And, in particular, we come up with a familiar equation that describes waves.
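Specifically, let’s consider Maxwell’s equations in a vacuum, where there are no charges and no currents:
$$\begin{aligned}\nabla\cdot E&=0\\\nabla\times E&=-\frac{\partial B}{\partial t}\\\nabla\cdot B&=0\\\nabla\times B&=\epsilon_0\mu_0\frac{\partial E}{\partial t}\end{aligned}$$
Now let’s take the curl of both of the curl equations:
$$\begin{aligned}\nabla\times(\nabla\times E)&=-\frac{\partial}{\partial t}(\nabla\times B)=-\epsilon_0\mu_0\frac{\partial^2E}{\partial t^2}\\\nabla\times(\nabla\times B)&=\epsilon_0\mu_0\frac{\partial}{\partial t}(\nabla\times E)=-\epsilon_0\mu_0\frac{\partial^2B}{\partial t^2}\end{aligned}$$
We also have an identity for the double curl:
$$\nabla\times(\nabla\times F)=\nabla(\nabla\cdot F)-\nabla^2F$$
But for both of our fields we have $\nabla\cdot F=0$, meaning we can rewrite our equations as
$$\begin{aligned}\frac{\partial^2E}{\partial t^2}-\frac{1}{\epsilon_0\mu_0}\nabla^2E&=0\\\frac{\partial^2B}{\partial t^2}-\frac{1}{\epsilon_0\mu_0}\nabla^2B&=0\end{aligned}$$
which are the wave equations we were looking for.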
It’s important to note at this point that we didn’t have to start with our experimentally-justified axioms. Maxwell’s equations suffice to derive all the physics we need.
Coulomb’s law is almost as simple. If we have a point charge $q$ it makes sense that it generates a spherically symmetric, radial electric field. Given this assumption, we just need to calculate its magnitude at the radius $r$. To do this, set up a sphere $S$ of that radius around the point; Gauss’ law in integral form tells us that the flow of $E$ out through this sphere is the total charge inside, divided by $\epsilon_0$. But it’s easy to calculate the integral, getting
$$\frac{q}{\epsilon_0}=\oint_SE\cdot dS=4\pi r^2|E(r)|\qquad\text{so}\qquad|E(r)|=\frac{q}{4\pi\epsilon_0r^2}$$
which is the magnitude given by Coulomb’s law.
To get the Biot-Savart law, we can use Ampère’s law to calculate the magnetic field around an infinitely long straight current $I$. We again argue on geometric grounds that the magnitude of the magnetic field should only depend on the distance $r$ from the current and should point directly around the current. If we set up a circle $C$ of radius $r$ then, the total circulation around the circle is, by Ampère’s law:
$$\mu_0I=\oint_CB\cdot dr=2\pi r|B(r)|\qquad\text{so}\qquad|B(r)|=\frac{\mu_0I}{2\pi r}$$
Now, we can compare this to the last time we computed the magnetic field of the straight infinite current by integrating the Biot-Savart law directly, and see that we get essentially the same answer.
Finally, we can derive conservation of charge from Ampère’s law, with Maxwell’s correction, by taking its divergence:
$$\nabla\cdot(\nabla\times B)=\mu_0\nabla\cdot J+\epsilon_0\mu_0\frac{\partial}{\partial t}(\nabla\cdot E)$$
The quantity on the left is the divergence of a curl, so it automatically vanishes. Meanwhile, Gauss’ law tells us that $\nabla\cdot E=\frac{\rho}{\epsilon_0}$, so we conclude
$$\nabla\cdot J+\frac{\partial\rho}{\partial t}=0$$
which is the “continuity equation” expressing the conservation of charge.
The importance is that while we originally derived Maxwell’s equations from four experimentally-justified laws, those laws are themselves essentially derivable from Maxwell’s equations. Thus any reformulation of Maxwell’s equations is just as sufficient a basis for all of electromagnetism as our original physical axioms.
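As one concrete instance of this equivalence, here is a rough numerical sketch comparing the two routes to the field of a long straight wire. It assumes the usual line-integral form of the Biot-Savart law, $B=\frac{\mu_0I}{4\pi}\int\frac{dl\times\hat{r}'}{|r'|^2}$, and approximates the infinite wire by a very long finite one; the function names and sample values are made up for illustration.

```python
import math

MU_0 = 4 * math.pi * 1e-7   # magnetic constant in H/m (classical defined value)

def biot_savart_straight_wire(current, r, half_length, steps=200_000):
    """Field magnitude at distance r from the midpoint of a straight wire
    running from -half_length to +half_length, by direct summation."""
    total = 0.0
    dz = 2 * half_length / steps
    for i in range(steps):
        z = -half_length + (i + 0.5) * dz
        # |dl x r_hat| / |r'|^2 reduces to r dz / (r^2 + z^2)^(3/2) here.
        total += r * dz / (r * r + z * z) ** 1.5
    return MU_0 * current / (4 * math.pi) * total

def ampere_straight_wire(current, r):
    """Field magnitude from Ampere's law around a circle of radius r."""
    return MU_0 * current / (2 * math.pi * r)

I, r = 3.0, 0.5   # illustrative current in A and distance in m
print(biot_savart_straight_wire(I, r, half_length=500.0))
print(ampere_straight_wire(I, r))
```

With the wire a thousand times longer than the distance to it, the two values agree to better than one part in a million, and the small remaining gap is just the missing ends of the wire.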