At last we’re ready to explain the Higgs mechanism. We start where we left off last time: a complex scalar field $\phi$ with a gauged phase symmetry that brings in a (massless) gauge field $A_\mu$. The difference is that now we add a new self-interaction term to the Lagrangian:
$$L = -\frac{1}{4}F_{\mu\nu}F_{\mu\nu} + (D_\mu\phi)^*D_\mu\phi - \left[-m^2\phi^*\phi + \lambda(\phi^*\phi)^2\right]$$
where $\lambda$ is a constant that determines the strength of the self-interaction. We recall the gauged symmetry transformations:
$$\phi'(x) = e^{i\alpha(x)}\phi(x) \qquad A_\mu'(x) = A_\mu(x) + \frac{1}{e}\partial_\mu\alpha(x)$$
If we write down an expression for the energy of a field configuration we get a bunch of derivative terms (basically like kinetic energy) that all occur with positive signs, and then the potential energy term that comes in the brackets above:
$$V(\phi^*\phi) = -m^2\phi^*\phi + \lambda(\phi^*\phi)^2$$
Now, the “ground state” of the system should be one that minimizes the total energy, but the usual choice of setting all the fields equal to zero doesn’t do that here. The potential has a “bump” in the center, like the punt in the bottom of a wine bottle, or like a sombrero.
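So instead of using that as our ground state, we’ll choose one of the points at the bottom of the trough. It doesn’t matter which, but it will be convenient to pick:
$$\phi_0 = \frac{v}{\sqrt{2}}$$
where $v = m/\sqrt{\lambda}$ is chosen to minimize the potential. We can still use the same field $A_\mu$ as before, but now we will write
$$\phi = \frac{1}{\sqrt{2}}\left(v + \chi + i\theta\right)$$
Since the ground state is a point along the real axis in the complex plane, vibrations in the field $\chi$ measure movement that changes the length of $\phi$, while vibrations in $\theta$ measure movement that changes the phase.
We want to consider the case where these vibrations are small (the field basically sticks near its ground state) because when they get big enough we have enough energy flying around in the system that we may as well just work in the more symmetric case anyway. So we are justified in only working out our new Lagrangian in terms up to quadratic order in the fields. This will also make our calculations a lot simpler. Indeed, to quadratic order (and ignoring an irrelevant additive constant) we have
$$V(\phi^*\phi) \approx m^2\chi^2$$
so vibrations of the $\theta$ field don’t show up at all in the quadratic part of the potential.
We should also write out our covariant derivative up to linear terms:
$$D_\mu\phi \approx \frac{1}{\sqrt{2}}\partial_\mu\chi + \frac{i}{\sqrt{2}}\left(\partial_\mu\theta - evA_\mu\right)$$
so that the quadratic Lagrangian is
$$L \approx -\frac{1}{4}F_{\mu\nu}F_{\mu\nu} + \frac{1}{2}\partial_\mu\chi\partial_\mu\chi - m^2\chi^2 + \frac{e^2v^2}{2}\left(A_\mu - \frac{1}{ev}\partial_\mu\theta\right)\left(A_\mu - \frac{1}{ev}\partial_\mu\theta\right)$$
Now, the term in parentheses on the right looks like the mass term of a vector field $B_\mu = A_\mu - \frac{1}{ev}\partial_\mu\theta$ with mass $ev$. But what is the kinetic term of this field?
$$\partial_\mu B_\nu - \partial_\nu B_\mu = \partial_\mu A_\nu - \partial_\nu A_\mu - \frac{1}{ev}\left(\partial_\mu\partial_\nu\theta - \partial_\nu\partial_\mu\theta\right) = F_{\mu\nu}$$
And so we can write down the final form of our quadratic Lagrangian:
$$L \approx -\frac{1}{4}F_{\mu\nu}F_{\mu\nu} + \frac{e^2v^2}{2}B_\mu B_\mu + \frac{1}{2}\partial_\mu\chi\partial_\mu\chi - m^2\chi^2$$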
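Here’s a quick numerical sanity check of the claim that only the length-changing vibration feels a restoring force at quadratic order. It is a sketch under assumptions, not a derivation: it assumes the potential has the wine-bottle form $V = -m^2|\phi|^2 + \lambda|\phi|^4$ and the parametrization $\phi = (v + \chi + i\theta)/\sqrt{2}$. The function names and the parameter values are made up for illustration.

```python
import math

# Illustrative parameter values (assumptions, not taken from the post).
m = 1.3      # mass parameter in the potential
lam = 0.7    # self-interaction strength (lambda)

def potential(phi: complex) -> float:
    """Assumed wine-bottle potential V = -m^2 |phi|^2 + lam |phi|^4."""
    r2 = abs(phi) ** 2
    return -m**2 * r2 + lam * r2**2

# Ground state on the real axis: phi0 = v / sqrt(2) with v = m / sqrt(lam).
v = m / math.sqrt(lam)
phi0 = v / math.sqrt(2)

def second_derivative(f, h=1e-3):
    """Central finite-difference estimate of f''(0)."""
    return (f(h) - 2 * f(0.0) + f(-h)) / h**2

def V_chi(chi):
    """Potential along the radial direction (changes the length of phi)."""
    return potential(complex(phi0 + chi / math.sqrt(2), 0.0))

def V_theta(theta):
    """Potential along the imaginary direction (changes the phase of phi)."""
    return potential(complex(phi0, theta / math.sqrt(2)))

radial_curvature = second_derivative(V_chi)      # should be close to 2 m^2
angular_curvature = second_derivative(V_theta)   # should be close to 0
```

A curvature of $2m^2$ in the $\chi$ direction is what a term $m^2\chi^2$ produces, while the flat direction is the one the gauge field gets to “eat”.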
In order to deal with the fact that our normal vacuum was not a minimum for the energy, we picked a new ground state that did minimize energy. But the new ground state doesn’t have the same symmetry the old one did — we have broken the symmetry — and when we write down the Lagrangian in terms of excitations around the new ground state, we find it convenient to change variables. The previously massless gauge field “eats” part of the scalar field and gains a mass, leaving behind the Higgs field.
This is essentially what’s going on in the Standard Model. The biggest difference is that instead of the initial symmetry being a simple phase, which just amounts to rotations around a circle, we have a (slightly) more complicated symmetry to deal with. For those that are familiar with some classical groups, we start with an action of $SU(2)$ on a column vector $\phi$ made of two complex scalar fields with a potential of the form:
$$V(\phi^\dagger\phi) = -m^2\phi^\dagger\phi + \lambda(\phi^\dagger\phi)^2$$
which is invariant under the obvious action of $SU(2)$ and a phase action of $U(1)$. Since the group $SU(2)$ is three-dimensional there are three gauge fields to introduce for its symmetry and one more for the $U(1)$ symmetry.
When we pick a ground state that breaks the symmetry it doesn’t completely break; a one-dimensional subgroup $U(1)$ still leaves the new ground state invariant, though it’s important to notice that this is not just the $U(1)$ factor, but rather a mixture of this factor and a $U(1)$ subgroup of $SU(2)$. Thus only three of these gauge fields gain mass; they become the $W^\pm$ and $Z^0$ bosons that carry the weak force. The other gauge field remains massless, and becomes $\gamma$, the photon.
At high enough energies — when the fields bounce around enough that the bump doesn’t really affect them — then the symmetry comes back and we see that the electromagnetic and weak interactions are really two different aspects of the same, unified phenomenon, just like electricity and magnetism are really two different aspects of electromagnetism.
Now we’re starting to get to the really meaty stuff. We talked about the phase symmetry of the complex scalar field:
$$\phi'(x) = e^{i\alpha}\phi(x) \qquad \phi'^*(x) = e^{-i\alpha}\phi^*(x)$$
which basically wants to express the idea that the physics of this field only really depends on the length of the complex field values and not on their phases. But another big principle of physics is locality — what happens here doesn’t instantly affect what happens elsewhere — so why should the phase change be global?
To answer this, we “gauge” the symmetry and make it local. The origin of the term is fascinating, but takes us too far afield. The upshot is that we now have the symmetry transformation:
$$\phi'(x) = e^{i\alpha(x)}\phi(x) \qquad \phi'^*(x) = e^{-i\alpha(x)}\phi^*(x)$$
where $\alpha$ is no longer a constant, but a function of the spacetime point $x$.
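And here’s the big problem: since $\alpha$ varies from point to point, it now affects our derivative terms! Before we had
$$\partial_\mu\phi' = \partial_\mu\left(e^{i\alpha}\phi\right) = e^{i\alpha}\partial_\mu\phi$$
and similarly for $\phi^*$. We say that the derivatives are “covariant” under the transformation; they transform in the same way as the underlying fields. And this is what lets us say that
$$\partial_\mu\phi'^*\partial_\mu\phi' = e^{-i\alpha}\partial_\mu\phi^*\,e^{i\alpha}\partial_\mu\phi = \partial_\mu\phi^*\partial_\mu\phi$$
and makes the whole Lagrangian symmetric.
On the other hand, what do we see now?
$$\partial_\mu\phi' = \partial_\mu\left(e^{i\alpha(x)}\phi\right) = e^{i\alpha(x)}\partial_\mu\phi + i\,\partial_\mu\alpha(x)\,e^{i\alpha(x)}\phi$$
We pick up this extra term when we differentiate, and it ruins the symmetry.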
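The way out is to add another field that can “soak up” this extra term. Since the derivative is a vector, we introduce a vector field $A_\mu$ and say that it transforms as
$$A_\mu'(x) = A_\mu(x) + \frac{1}{e}\partial_\mu\alpha(x)$$
Next, we introduce a new derivative operator: $D_\mu = \partial_\mu - ieA_\mu$. That is:
$$D_\mu\phi = \partial_\mu\phi - ieA_\mu\phi$$
And we calculate
$$D_\mu'\phi' = \partial_\mu\left(e^{i\alpha}\phi\right) - ie\left(A_\mu + \frac{1}{e}\partial_\mu\alpha\right)e^{i\alpha}\phi = e^{i\alpha}\left(\partial_\mu\phi + i\,\partial_\mu\alpha\,\phi - ieA_\mu\phi - i\,\partial_\mu\alpha\,\phi\right) = e^{i\alpha}D_\mu\phi$$
So the derivative $D_\mu\phi$ does vary the same way as the underlying field $\phi$ does! We call $D_\mu$ the “covariant derivative”. If we use it in our Lagrangian, we do recover our symmetry, though now we’ve got a new field to contend with. Just as with the electromagnetic potential, we use the derivatives of $A_\mu$ to form $F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu$ and write
$$L = -\frac{1}{4}F_{\mu\nu}F_{\mu\nu} + (D_\mu\phi)^*D_\mu\phi - m^2\phi^*\phi$$
which is now symmetric under the gauged symmetry transformations.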
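If you’d rather see the cancellation happen with actual numbers, here is a small sketch in one dimension. Everything specific in it is an assumption made for illustration: the sign convention $D = \partial - ieA$ with $A' = A + \frac{1}{e}\partial\alpha$, the sample functions chosen for the field, the gauge field, and the local phase, and the value of the coupling constant.

```python
import cmath
import math

E = 0.8  # coupling constant e (illustrative value)

# Sample field with its exact derivative (arbitrary choices).
def phi(x):
    return (1 + 0.5 * x) * cmath.exp(0.3j * x)

def dphi(x):
    return (0.5 + 0.3j * (1 + 0.5 * x)) * cmath.exp(0.3j * x)

# Sample gauge field and local phase alpha(x) with its derivative.
def A(x):
    return x**2

def alpha(x):
    return math.sin(x)

def dalpha(x):
    return math.cos(x)

def covariant_derivative(field, dfield, gauge, x):
    """D phi = d(phi)/dx - i e A phi, under the assumed sign convention."""
    return dfield(x) - 1j * E * gauge(x) * field(x)

# Gauge-transformed field, its derivative (product rule), and gauge field.
def phi_t(x):
    return cmath.exp(1j * alpha(x)) * phi(x)

def dphi_t(x):
    return cmath.exp(1j * alpha(x)) * (dphi(x) + 1j * dalpha(x) * phi(x))

def A_t(x):
    return A(x) + dalpha(x) / E

x = 0.7
phase = cmath.exp(1j * alpha(x))
lhs = covariant_derivative(phi_t, dphi_t, A_t, x)    # D' phi'
rhs = phase * covariant_derivative(phi, dphi, A, x)  # e^{i alpha} D phi
plain_mismatch = abs(dphi_t(x) - phase * dphi(x))    # ordinary derivative fails
```

The two covariant expressions agree to rounding error, while `plain_mismatch` is exactly the size of the extra term that the ordinary derivative picks up.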
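It may not be apparent, but this Lagrangian does contain interaction terms. We can expand out the second term to find:
$$(D_\mu\phi)^*D_\mu\phi = \partial_\mu\phi^*\partial_\mu\phi + ieA_\mu\left(\phi^*\partial_\mu\phi - \phi\partial_\mu\phi^*\right) + e^2A_\mu A_\mu\phi^*\phi$$
Our rules of thumb tell us that if we vary the Lagrangian with respect to $A_\mu$ we get the field equation
$$\partial_\mu F_{\mu\nu} = ej_\nu$$
which, if we expand out $F_{\mu\nu}$ as if it’s the Faraday field into “electric” and “magnetic” fields, gives us Gauss’ and Ampère’s laws in the presence of a charge-current density $ej_\nu$.
The charge-current, in particular, we can write as
$$j_\mu = -i\left(\phi^*\partial_\mu\phi - \phi\partial_\mu\phi^*\right) - 2eA_\mu\phi^*\phi$$
or, in a gauge-invariant manner, as
$$j_\mu = -i\left(\phi^*D_\mu\phi - \phi(D_\mu\phi)^*\right)$$
which is just the conserved current from last time with the regular derivatives replaced by covariant ones. Similarly, varying with respect to the field $\phi^*$ we find the “covariant” Klein-Gordon equation:
$$D_\mu D_\mu\phi + m^2\phi = 0$$
and, when this holds, we can show that $\partial_\mu j_\mu = 0$.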
So we’ve found that if we take the global symmetry of the complex scalar field and “gauge” it, something like electromagnetism naturally pops out, and the particle of the complex scalar field interacts with it like charged particles interact with the real electromagnetic field.
This is part two of a four-part discussion of the idea behind how the Higgs field does its thing. Read Part 1 first.
Okay, now that we’re sold on the Lagrangian formalism you can rest easy: I’m not going to go through the gory details of any more variational calculus. I do want to clear a couple notational things out of the way, though. They might not all matter for the purposes of our discussion, but better safe than sorry.
First off, I’m going to use a coordinate system where the speed of light is 1. That is, if my unit of time is seconds, my unit of distance is light-seconds. Mostly this helps keep annoying constants out of the way of the equations; physicists do this basically all the time. The other thing is that I’m going to work in four-dimensional spacetime, meaning we’ve got four coordinates: $x_0$, $x_1$, $x_2$, and $x_3$. We calculate dot products by writing $v\cdot w = v_1w_1 + v_2w_2 + v_3w_3 - v_0w_0$. Yes, that minus sign is weird, but that’s just how spacetime works.
Also instead of writing spacetime vectors, I’m going to write down their components, indexed by a subscript that’s meant to run from 0 to 3. Usually this will be a Greek letter from the middle of the alphabet like $\mu$ or $\nu$. Similarly, instead of writing $\nabla f$ for the vector composed of the four spacetime derivatives of a field $f$ I’ll just write down the derivatives, and I’ll write $\partial_\mu f$ instead of $\frac{\partial f}{\partial x_\mu}$.
Along with writing down components instead of vectors I won’t be writing dot products explicitly. Instead I’ll use the common convention that when the same index appears twice we’re supposed to sum over it, remembering that the zero component gets a minus sign. That is, $v_\mu w_\mu$ is the dot product from above. Similarly, we can multiply a matrix with entries $A_{\mu\nu}$ by a vector $v_\nu$ to get $w_\mu = A_{\mu\nu}v_\nu$; notice how the summed index $\nu$ gets “eaten up” in the process.
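To make the summation convention concrete, here is a tiny sketch of the dot product with the minus sign on the zero component. The function name `minkowski_dot` is just something I made up for this illustration, and vectors are plain tuples with the time component first.

```python
def minkowski_dot(v, w):
    """Sum v_mu * w_mu over mu = 0..3, with a minus sign on the 0 component."""
    time_part = -v[0] * w[0]
    space_part = sum(a * b for a, b in zip(v[1:], w[1:]))
    return time_part + space_part

# A light ray covers one unit of distance per unit of time (speed of light 1),
# so its displacement vector has squared length zero.
light_ray = (1, 1, 0, 0)
light_ray_length_squared = minkowski_dot(light_ray, light_ray)

# A body sitting still for two time units has negative squared length.
at_rest = (2, 0, 0, 0)
at_rest_length_squared = minkowski_dot(at_rest, at_rest)
```

The sign of the squared length is what tells time-like separations apart from space-like ones, which is why the odd-looking minus sign earns its keep.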
Okay, now even without going through the details there’s a fair bit we can infer from general rules of thumb. Any term in the Lagrangian that contains a derivative of the field we’re varying is almost always going to be the squared-length of that derivative, and the resulting term in the variational equations will be the negative of a second derivative of the field. For any term that involves the plain field we basically take its derivative as if the field were a variable. Any term that doesn’t involve the field at all just goes away. And since we prefer positive second-derivative terms to negative ones, we usually flip the sign of the resulting equation; since the other side is zero this doesn’t matter.
So if, for instance, we have the following Lagrangian of a complex scalar field $\phi$:
$$L = \partial_\mu\phi^*\partial_\mu\phi - m^2\phi^*\phi$$
we get two equations by varying the field and its complex conjugate separately:
$$\partial_\mu\partial_\mu\phi^* + m^2\phi^* = 0 \qquad \partial_\mu\partial_\mu\phi + m^2\phi = 0$$
It may not seem to make sense to vary the field and its complex conjugate separately, but the two equations we get at the end are basically the same anyway, so we’ll let this slide for now. Anyway, what we get is a second derivative of $\phi$ set equal to $-m^2$ times $\phi$ itself, which we call the “Klein-Gordon wave equation” for $\phi$. Since the term $m^2\phi^*\phi$ gives rise to the term $m^2\phi$ in the field equations, we call this the “mass term”.
In the case of electromagnetism in a vacuum we just have the electromagnetic fields and no charge or current distribution. We use the Faraday field $F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu$ to write down the Lagrangian
$$L = -\frac{1}{4}F_{\mu\nu}F_{\mu\nu}$$
which gives rise to the field equations
$$\partial_\mu F_{\mu\nu} = 0$$
or, equivalently in terms of the potential field $A_\mu$:
$$\partial_\nu\partial_\nu A_\mu = 0 \qquad \partial_\mu A_\mu = 0$$
The second equation just expresses a choice we can make to always consider divergence-free potentials without affecting the predictions of electromagnetism; the first equation looks like the Klein-Gordon equation again, except there’s no mass term. Indeed, we know that photons (the particles associated to the electromagnetic field) have no rest mass!
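Turning back to the complex scalar field, we notice that there’s a certain symmetry to this Lagrangian. Specifically, if we replace $\phi$ and $\phi^*$ by
$$\phi' = e^{i\alpha}\phi \qquad \phi'^* = e^{-i\alpha}\phi^*$$
for any constant $\alpha$, we get the same result. This is important, and it turns out to be a clue that leads us (I won’t go into the details) to consider the quantity
$$j_\mu = -i\left(\phi^*\partial_\mu\phi - \phi\partial_\mu\phi^*\right)$$
This is interesting because we can calculate
$$\partial_\mu j_\mu = -i\left(\partial_\mu\phi^*\partial_\mu\phi + \phi^*\partial_\mu\partial_\mu\phi - \partial_\mu\phi\partial_\mu\phi^* - \phi\partial_\mu\partial_\mu\phi^*\right) = -i\left(-m^2\phi^*\phi + m^2\phi\phi^*\right) = 0$$
where we’ve used the results of the Klein-Gordon equations. Since $\partial_\mu j_\mu = 0$, this is a suitable vector field to use as a charge-current distribution; the equation just says that charge is conserved! That is, we can write down a Lagrangian involving both electromagnetism (that is, our “massless vector field” $A_\mu$) and our scalar field:
$$L = -\frac{1}{4}F_{\mu\nu}F_{\mu\nu} + \partial_\mu\phi^*\partial_\mu\phi - m^2\phi^*\phi - ej_\mu A_\mu$$
where $e$ is a “coupling constant” that tells us how important the “interaction term” involving both $\phi$ and $A_\mu$ is. If it’s zero, then the fields don’t actually interact at all, but if it’s large then they affect each other very strongly.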
This is part one of a four-part discussion of the idea behind how the Higgs field does its thing.
Wow, about six months’ hiatus as other parts of my life have taken precedence. But I drag myself slightly out of retirement to try to fill a big gap in the physics blogosphere: how the Higgs mechanism works.
There’s a lot of news about this nowadays, since the Large Hadron Collider has announced evidence of a “Higgs-like” particle. As a quick explanation of that, I use an analogy I made up on Twitter: “If Mirror-Spock exists, he has a goatee. We have found a man with a goatee. We do not yet know if he is Mirror-Spock.”
So, what is the Higgs boson? Well, it’s the particle expression of the Higgs field. That doesn’t explain anything, so we go one step further. What is the Higgs field? It’s the (conjectured) thing that gives some other particles (some of their) mass, in certain situations where normally we wouldn’t expect there to be any mass. And then there’s hand-waving about something like the ether that particles have to push through or shag carpet that they have to rub against that slows them down and hey, mass. Which doesn’t really explain anything, but sort of sounds like it might and so people nod sagely and then either forget about it all or spin their misconceptions into a new wave of Dancing Wu-Li Masters.
I think we can do better, at least for the science geeks out there who are actually interested and not allergic to a little math.
A couple warnings and comments before we begin. First off: I’m not going to go through this in my usual depth because I want to cram it into just three posts, albeit longer ones than usual, even though what I will say touches on all sorts of insanely cool mathematics that disappointingly few people see put together like this. Second: Ironically, that seems to include a lot of the physicists, who are generally more concerned with making predictions than with understanding how the underlying theory connects to everything else and it’s totally fine, honestly, that they’re interested in different aspects than I am. But I’m going to make a relatively superficial pass over describing the theory as physicists talk about it rather than go into those underlying structures. Lastly: I’m not going to describe the actual Higgs particle or field as they exist in the Standard Model; that would require quantum field theory and all sorts of messy stuff like that, when it turns out that the basic idea already shows up in classical field theory, which is a lot easier to explain. Even within classical field theory I’m going to restrict myself to a simpler example of the sort of thing that happens. Because reasons.
That all said, let’s dive in with Lagrangian mechanics. This is a subject that you probably never heard about unless you were a physics major or maybe a math major. Basically, Newtonian mechanics works off of the three laws that were probably drilled into your head by the end of high school science classes:
- Newton’s Laws of Motion
- An object at rest tends to stay at rest; an object in motion tends to stay in that motion.
- Force applied to an object is proportional to the acceleration that object experiences. The constant of proportionality is the object’s mass.
- Every action comes paired with an equal and opposite reaction.
It’s the second one that gets the most use since we can write it down in a formula: $F = ma$. And for most forces we’re interested in the force is a conservative vector field, meaning that it’s the (negative) gradient (fancy word for “derivative” that comes up in more than one dimension) of a potential energy function: $F = -\nabla V$. What this means is that things like to move in the direction that potential energy decreases, and they “feel a force” pushing them in that direction. Upshot for Newton: $ma = -\nabla V$.
Lagrangian mechanics comes at this same formula with a different explanation: objects like to move along paths that (locally) minimize some quantity called “action”. This principle unifies the usual topics of high school Newtonian physics with things like optics where we say that light likes to move along the shortest path between two points. Indeed, the “action” for light rays is just the distance they travel! This also explains things like “the angle of incidence equals the angle of reflection”; if you look at all paths between two points that bounce off of a mirror, the one that satisfies this property has the shortest length, making it a local minimum for the action.
Let’s set this up for a body moving around in some potential field to show you how it works. The action of a suggested path $q$ (the body is at the point $q(t)$ at time $t$) over a time interval $t_1 \leq t \leq t_2$ is:
$$S[q] = \int_{t_1}^{t_2}\left(\frac{1}{2}mv(t)^2 - V(q(t))\right)dt$$
where $v(t) = \dot{q}(t)$ is the velocity vector of the particle, $v(t)^2$ is the square of its length, and $V(q(t))$ is a potential function depending only on the position of the particle. Don’t worry: there’s a big scary integral here, but we aren’t going to actually do any integration.
The function on the inside of the integral is called the Lagrangian function, and we calculate the action of the path by integrating the Lagrangian over the time interval we’re concerned with. We write this as $S[q]$ with square brackets to emphasize that this is a “functional” that takes a function and gives a number back. Of course, as mathematicians there’s really nothing inherently special about functions taking functions as arguments, but for beginners it helps keep things straight.
Now, what happens if we “wiggle” the path a bit? What if we calculate the action of $q + \delta q$, where $\delta q$ is some “small” function called the “variation” of $q$? We calculate:
$$S[q + \delta q] = \int_{t_1}^{t_2}\left(\frac{1}{2}m\left(v + \delta v\right)^2 - V(q + \delta q)\right)dt$$
Taking the derivative is linear, so we see that $\delta v = \frac{d}{dt}\delta q$; “the variation of the derivative is the derivative of the variation”. Plugging this in:
$$S[q + \delta q] = \int_{t_1}^{t_2}\left(\frac{1}{2}mv^2 + mv\cdot\frac{d}{dt}\delta q - V(q) - \nabla V(q)\cdot\delta q\right)dt$$
where we’ve thrown away terms involving second and higher powers of $\delta q$; the variation is small, so the square (and cube, and …) is negligible. So what’s the difference between this and $S[q]$? What’s the variation of the action?
$$\delta S = S[q + \delta q] - S[q] = \int_{t_1}^{t_2}\left(mv\cdot\frac{d}{dt}\delta q - \nabla V(q)\cdot\delta q\right)dt$$
where again we throw away negligible terms. Now we can handle the first term here using integration by parts:
$$\delta S = -\int_{t_1}^{t_2}\left(m\dot{v} + \nabla V(q)\right)\cdot\delta q\,dt$$
“Wait a minute!” those of you paying attention will cry out, “what about the boundary terms!?” Indeed, when we use integration by parts we should pick up $mv\cdot\delta q\,\big|_{t_1}^{t_2}$, but we will assume that we know where the body is at the beginning and the end of our time interval, and we’re just trying to figure out how it gets from one point to the other. That is, $\delta q$ is zero at both endpoints.
So, now we apply our Lagrangian principle: bodies like to move along action-minimizing paths. We know how action changes if we “wiggle” the path by a little variation $\delta q$, and this should remind us about how to find local minima: they happen when no matter how we change the input, the “first derivative” of the output is zero. Here the first derivative is the variation in the action, throwing away the negligible terms. So, what condition will make $\delta S$ zero no matter what function we put in for $\delta q$? Well, the other term in the integrand will have to vanish:
$$m\dot{v}(t) + \nabla V(q(t)) = 0$$
But this is just Newton’s second law from above, coming back again!
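We can watch this principle work numerically. The sketch below is an illustration under assumptions of my own choosing: a body thrown straight up in uniform gravity, so $V(q) = mgq$, with the path chopped into straight segments and the potential averaged over each segment. The function names and parameter values are made up. It compares the action of the Newtonian parabola with the action of paths wiggled by a sine bump that vanishes at both endpoints.

```python
import math

def action(path, dt, m=1.0, g=9.8):
    """Discretized action: sum of (kinetic - potential) * dt over segments."""
    total = 0.0
    for q0, q1 in zip(path, path[1:]):
        velocity = (q1 - q0) / dt
        kinetic = 0.5 * m * velocity**2
        potential = m * g * 0.5 * (q0 + q1)  # V(q) = m g q at the segment average
        total += (kinetic - potential) * dt
    return total

def sample_paths(eps, n=200, T=1.0, g=9.8):
    """Return the Newtonian path, a wiggled path, and the time step."""
    dt = T / n
    times = [k * dt for k in range(n + 1)]
    # Solution of m q'' = -m g that starts and ends at height zero.
    newton = [0.5 * g * t * (T - t) for t in times]
    # Variation eps * sin(pi t / T), which vanishes at both endpoints.
    wiggled = [q + eps * math.sin(math.pi * t / T)
               for q, t in zip(newton, times)]
    return newton, wiggled, dt

newton, wiggled, dt = sample_paths(0.1)
_, wiggled_twice, _ = sample_paths(0.2)
increase = action(wiggled, dt) - action(newton, dt)
increase_twice = action(wiggled_twice, dt) - action(newton, dt)
```

Doubling the size of the wiggle roughly quadruples the increase in the action, which is the numerical shadow of the first-order variation vanishing on the true path.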
Everything we know from Newtonian mechanics can be written down in Lagrangian mechanics by coming up with a suitable action functional, which usually takes the form of an integral of an appropriate Lagrangian function. But lots more things can be described using the Lagrangian formalism, including field theories like electromagnetism.
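In the presence of a charge distribution $\rho$ and a current distribution $\mathbf{j}$, we take the potentials $\phi$ and $\mathbf{A}$ as fundamental and start with the action (suppressing the space and time arguments so we can write $\rho$ instead of $\rho(x,t)$):
$$S[\phi,\mathbf{A}] = \int\left(\frac{\epsilon_0}{2}|\mathbf{E}|^2 - \frac{1}{2\mu_0}|\mathbf{B}|^2 - \rho\phi + \mathbf{j}\cdot\mathbf{A}\right)d^3x\,dt$$
where $\mathbf{E} = -\nabla\phi - \frac{\partial\mathbf{A}}{\partial t}$ and $\mathbf{B} = \nabla\times\mathbf{A}$. When we vary with respect to $\phi$ and insist that the variation of $S$ be zero we get Gauss’ law:
$$\nabla\cdot\mathbf{E} = \frac{\rho}{\epsilon_0}$$
and varying with respect to $\mathbf{A}$ gives Ampère’s law with Maxwell’s correction:
$$\nabla\times\mathbf{B} = \mu_0\mathbf{j} + \mu_0\epsilon_0\frac{\partial\mathbf{E}}{\partial t}$$
The other two of Maxwell’s equations come automatically from taking the potentials as fundamental and coming up with the electric and magnetic fields from them.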
Scientific American has been doing a lot about privacy, since that’s the theme of its latest issue. One of its posts today linked to this one from two years ago at a weblog that my old teacher Bill Gasarch has something to do with. It’s a pretty cogent explanation of zero-knowledge proofs, with a sudoku puzzle as the centerpiece.
A zero-knowledge proof is a way of convincing you that I know something without you getting any information about what that thing is. I know, it sounds like LitCrit, but this actually makes sense.
The classic motivating example is a cave with two entrances. Way back deep in the cave there’s a door with a combination lock I tell you I can open. I go into one of the cave entrances (you don’t see which, but if I can’t open the door I’m stuck in whichever one I picked). Then you come to the mouth(s) of the cave and yell out which entrance I should come out of.
If I can open the door, I can come out either entrance, so I come out the one you want. If I can’t open the door, then I can only come out the one you want if that’s the one I went in. Juggling the numbers a bit we can find that the probability I come out the correct entrance given that I don’t know the combination is $\frac{1}{2}$.
This means that half the time I could have gotten the right answer by luck rather than by skill. That’s not very convincing, but we can repeat the experiment, and each run is independent of all the others. So if we go through the motions $n$ times and I’ve met your challenge every single time, then the probability that I’ve just been lucky is $\frac{1}{2^n}$, which gets very small very quickly. After a while you’ll be sufficiently convinced that I know the answer.
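To get a feel for how fast the luck runs out, here is a small sketch. It computes the exact chance that someone who can’t open the door survives a given number of rounds, and also simulates a bluffer who picks an entrance at random each round. The function names are made up for this illustration, and the simulation assumes both my choice of entrance and your challenge are fair coin flips.

```python
from fractions import Fraction
import random

def bluff_probability(rounds: int) -> Fraction:
    """Exact chance of passing every round by luck alone."""
    return Fraction(1, 2) ** rounds

def simulate_bluffer(rounds: int, trials: int, seed: int = 0) -> float:
    """Fraction of trials in which a bluffer passes all rounds."""
    rng = random.Random(seed)
    passed = 0
    for _ in range(trials):
        # Each round: the bluffer's entrance must match the challenger's call.
        if all(rng.randrange(2) == rng.randrange(2) for _ in range(rounds)):
            passed += 1
    return passed / trials

ten_rounds = bluff_probability(10)
twenty_rounds = bluff_probability(20)
simulated_three_rounds = simulate_bluffer(rounds=3, trials=20000)
```

After ten rounds the bluffer’s odds are already below one in a thousand, and after twenty they’re below one in a million.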
In a way it’s more like a physics “proof” than the ones mathematicians are used to. The hypothesis can be falsified at any step, and each confirmation just increases the probability that it’s accurate.
Randall’s got an analogy for Rubik’s cube. Like the cube, there’s a trick to it. Unlike the cube, it doesn’t really illustrate any interesting mathematics. Also unlike the cube, I’m not about to go telling everyone what the trick is out in public.
I mean, sure, it’s not like I’m using it or anything, but it’s the principle of the thing.
I’m not about to sit down and work up a solution like we did before, but it shouldn’t be impossible to repeat the same sort of analysis. I will point out, however, that the solver in this video is making heavy use of both of our solution techniques: commutators and a tower of nested subgroups.
The nested subgroups are obvious. As the solution progresses, more and more structure becomes apparent, and is preserved as the solution continues. In particular, the solver builds up the centers of faces and then slips to the subgroup of maneuvers which leaves such “big centers” fixed in place. Near the end, almost all of the moves are twists of the outer faces, because these are assured not to affect anything but the edge and corner cubies.
The commutators take a quicker eye to spot, but they’re in there. Watch how many times he’ll do a couple twists, a short maneuver, and then undo those couple twists. Just as we used such commutators, these provide easy generalizations of basic cycles, and they form the heart of this solver’s algorithm.
Alexandre asked a question about the asymptotic growth of the “worst assembly time” for the $n\times n\times n$ cube. What this is really asking is for the “diameter” of the $n$th Rubik’s group. I don’t know offhand what this would be, but here’s a way to get at a rough estimate.
First, find a similar expression for the structure of as we found before for . Then what basic twists do we have? For we had all six faces, which could be turned either way, and we let the center slices be fixed. In general we’ll have slices in each of six directions, each of which can be turned either way, for a total of generators (and their inverses). But each generator should (usually) be followed by a different one, and definitely not by its own inverse. Thus we can estimate the number of words of length as . Then the structure of gives us a total size of the group, and the diameter should be about . Notice that for this gives us , which isn’t far off from the known upper bound of quarter-turns.
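For the ordinary $3\times 3\times 3$ cube the counting argument can be run as a back-of-the-envelope calculation. The sketch below is my own reading of the estimate, not a quoted result: it takes the well-known order of the cube group, assumes twelve quarter-turn moves with eleven sensible choices after the first (anything but the inverse of the move just made), and solves for the word length at which the number of words catches up with the size of the group.

```python
import math

# Order of the 3x3x3 Rubik's cube group.
CUBE_GROUP_ORDER = 43_252_003_274_489_856_000

def diameter_estimate(group_order: int, branching: int) -> float:
    """Solve branching**length = group_order for length."""
    return math.log(group_order) / math.log(branching)

# Assumption: 11 useful choices per move (12 quarter turns minus the inverse
# of the previous move).
estimate = diameter_estimate(CUBE_GROUP_ORDER, 11)
```

This is only a counting heuristic, since many different words give the same position, but it lands in the right neighborhood.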
Over at Not Even Wrong, there’s a discussion of David Vogan’s talks at Columbia about the “orbit method” or “orbit philosophy”. This is the view that there is, or at least there should be, a correspondence between unitary irreps of a Lie group $G$ and the orbits of a certain action of $G$. As Woit puts it
This is described as a “method” or “philosophy” rather than a theorem because it doesn’t always work, and remains poorly understood in some cases, while at the same time having shown itself to be a powerful source of inspiration in representation theory.
What he doesn’t say in so many words (but which I’m just rude enough to) is that the same statement applies to a lot of theoretical physics. Path integrals are, as they currently stand, prima facie nonsense. In some cases we’ve figured out how to make sense of them, and to give real meaning to the conceptual framework of what should happen. And this isn’t a bad thing. Path integrals have proven to be a powerful source of inspiration, and a lot of actual, solid mathematics and physics has come out of trying to determine what the hell they’re supposed to mean.
Where this becomes a problem is when people take the conceptual framework as literal truth rather than as the inspirational jumping-off point it properly is.
I just got the latest issue of the Notices of the American Mathematical Society today, and was excited to find in it an article by David Vogan on the Atlas Project. In particular, there’s a lot of information on how the programming and computation ran. I found it sort of interesting, but I know there are people out there who just love tweaking programs and squeezing that extra bit of performance out. Most of the time it really doesn’t matter, but the Kazhdan-Lusztig polynomials for split $E_8$ are one of those few calculations that are just so unbelievably massive that you have to use every trick you can just to get the damn thing to fit in the computer.
Oh, and if any of you code junkies wants to Slashdot that article, I’d appreciate a “heard it on The UM” nod 😉