In SUSY gauge theories there’s an important distinction between the Wilsonian and the 1PI effective actions. Seiberg emphasizes the difference in his lectures (e.g. see the discussion that arose during his explanation of Seiberg duality at the SIS07 school). This isn’t explained in any of the usual QFT textbooks, so I figured it was worth writing a little note that at least collects some references.

The most critical application of the distinction shows up in the beta function for supersymmetric gauge theories. The difference between the 1PI and Wilsonian effective actions ends up being the difference between the 1-loop exact beta function of the holomorphic coupling and the NSVZ “exact to all orders” beta function that includes higher loops. For a discussion of this, most papers point to Shifman and Vainshtein’s paper, “Solution of the anomaly puzzle in SUSY gauge theories and the Wilson operator expansion” [doi:10.1016/0550-3213(86)90451-7]. It’s worth noting that Arkani-Hamed and Murayama further clarified this ambiguity in terms of the holomorphic versus the canonical gauge coupling in “Holomorphy, Rescaling Anomalies, and Exact beta Functions in SUSY Gauge Theories” [hep-th/9707133].
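To be concrete about the two beta functions (I’m quoting the standard results rather than deriving them, so take the group-theory conventions with a grain of salt): the holomorphic coupling runs only at one loop, while the NSVZ beta function for the canonically normalized coupling resums the matter anomalous dimensions \gamma_i and picks up a gauge-coupling denominator,

```latex
% One-loop exact running of the holomorphic coupling:
\beta(g_h) \;=\; -\frac{g_h^3}{16\pi^2}
  \Bigl( 3\,T(\mathrm{adj}) - \sum_i T(R_i) \Bigr)

% NSVZ beta function for the canonical coupling:
\beta(g) \;=\; -\frac{g^3}{16\pi^2}\,
  \frac{3\,T(\mathrm{adj}) - \sum_i T(R_i)\bigl(1-\gamma_i\bigr)}
       {1 - T(\mathrm{adj})\,g^2/(8\pi^2)}
```

Arkani-Hamed and Murayama’s point is that the two are related by the rescaling anomaly that arises when passing between holomorphic and canonical normalization of the kinetic terms.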

The distinction between the two is roughly this:

  • The Wilsonian effective action is given by setting a scale \mu and integrating out all modes whose mass or momentum are larger than this scale. This quantity has no IR subtleties because IR divergences are cut off. To be explicit, the Wilsonian action is a theory with a cutoff. It is a theory where couplings run according to the Wilsonian RG flow, i.e. it is a theory that we still have to treat quantum mechanically. We still have to perform the path integral.
  • The 1PI effective action is the quantity appearing in the generating functional of 1PI diagrams, usually called \Gamma. This quantity is formally defined to include all virtual contributions coming from loops, so that its tree-level diagrams are exact. (Of course we end up having to calculate in a loop expansion.) The one-loop zero-momentum contribution is the Coleman-Weinberg potential. The 1PI effective action is the quantity that we deal with when we Legendre transform the action with respect to sources and classical background fields. The 1PI effective action is meant to be classical in the sense that all quantum effects are already accounted for. Because it takes into account all virtual modes, it is sensitive to the problems of massless particles. Thus the 1PI effective action can have IR divergences, i.e. it is non-analytic: it can get factors of \log p coming from massless particles running in loops. Seiberg says a good example of this is the chiral Lagrangian for pions.
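For reference, the one-loop zero-momentum contribution mentioned above — the Coleman-Weinberg potential — takes the standard supertrace form (in the \overline{\mathrm{MS}} scheme; the constant 3/2 applies to scalars and fermions, while gauge bosons come with 5/6 instead):

```latex
V^{(1)}_{\mathrm{eff}}(\phi) \;=\; \frac{1}{64\pi^2}\,
  \operatorname{Str}\, \mathcal{M}^4(\phi)
  \left[ \ln\frac{\mathcal{M}^2(\phi)}{\mu^2} - \frac{3}{2} \right]
```

Here \mathcal{M}^2(\phi) is the field-dependent mass matrix and the supertrace weights bosons and fermions with opposite signs. The \ln(\mathcal{M}^2/\mu^2) is exactly the kind of non-analytic structure the bullet point refers to: it blows up as the mass goes to zero.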

Further references not linked to above:

  • Bilal, “(Non) Gauge Invariance of Wilsonian Effective Actions in (SUSY) Gauge Theories: A Critical Discussion.” [0705.0362].
  • Burgess, “An Introduction to EFT.” [hep-th/0701053]. An excellent pedagogical explanation of the Wilsonian vs 1PI and how they are connected. A real pleasure to read.
  • Seiberg, “Naturalness vs. SUSY Non-renormalization.” [hep-ph/9309335]. Mentions the distinction.
  • Polchinski, “Renormalization and effective Lagrangians.” [doi:10.1016/0550-3213(84)90287-6] Only mentions the Wilsonian effective action, but still a nice pedagogical read.
  • Tim Hollowood’s renormalization notes are always worth looking at.

Good news for theoretical physics graduate education: the unofficial buzz is that Perimeter Institute will be making their “Perimeter Scholars International” (PSI) lectures publicly available through the institute’s PIRSA video archive.

For those of you who don’t know, PSI is a new masters-degree program for training theoretical physicists. The program has a unique breakdown of intense 3-week terms where students focus on progressively specialized material. The choice to have more terms with fewer courses (and, I assume, more weekly hours per course) allows the program to recruit some really big-name faculty from around the world to come and lecture.

The course also includes a research component, and I suspect the inaugural batch of students will have quite an experience interacting with the lecturers and each other. This aspect of the program, of course, cannot be reproduced online — but I suspect that the online lectures will serve as an excellent advertisement for prospective students while also acting as a unique archive of pedagogical lectures for those of us who have already started our PhDs. 🙂

Caveat: I am not an experimentalist and I do not pretend to properly understand experimental nuances… but I’m doing my best to try to keep up with what I think are interesting results in particle physics. This post is primarily based on notes from the talk `Updated Oscillation Results from MiniBooNE.’


Oh no! Alien robot drones coming to enslave us! No, just kidding. They're just MiniBooNE photomultiplier tubes. Image from image bank.

The MiniBooNE experiment’s initial goal when it started taking data in 2002 was to test the hypothesis of neutrino mixing with a heavy sterile neutrino that had been proposed to explain the so-called `LSND anomaly.’ In 2006 (07?) many watched as the collaboration revealed data that disproved this hypothesis, though their data set had an unexplained excess of low-energy (below 475 MeV) electrons. Since this was in a region of large background and didn’t affect the fits used in the neutrino mixing analysis, they mentioned this in passing and promised to look into it.

A couple of months ago the collaboration came back with an improved background analysis showing that the low-energy excess still appears with over 3 sigma confidence (0812.2243). One novel model came from the paper `Anomaly-mediated neutrino-photon interactions at finite baryon density,’ (0708.1281), which was apparently a theorists’ favorite.  The model, however, predicted a similar excess for anti-neutrinos, which the latest analysis does not indicate (see the `2009 tour‘ talk of the MiniBooNE spokesperson, R. van de Water).

Some background

Neutrinos are slippery little particles that only interact via the weak force. They have also been of interest for beyond-the-Standard-Model theorists since they are the key to several approaches to new physics, including:

  • lepton flavor physics (the PMNS matrix as the analogy for the CKM matrix in the quark sector)
  • see-saw mechanism (neutrino masses coming from mixing with GUT-scale right-handed neutrinos)
  • majorana mass terms (lepton number violating)
  • leptogenesis (transferring an asymmetry generated via CP violation in the lepton sector to the baryon sector).

One of the big events of 1998 was the discovery of neutrino mixing (i.e. masses). This is actually a rather subtle topic, as recent confusion over the GSI anomaly has shown; my favorite recent pedagogical paper is 0810.4602. (See also 0706.1216 for an excellent discussion of why neutrinos oscillate rather than the charged leptons.)

The mixing probability between two neutrino mass eigenstates goes like \sin^2(\Delta m^2 L/E). I’ve dropped the mixing-angle prefactor and a numerical factor inside the sine, but this is a heuristic discussion anyway. The L and E represent the `baseline’ (distance the neutrinos travel) and the neutrino energy. A similar expression occurs for three-neutrino mixing, such as between the three light neutrinos that we’ve come to know and love since 1998.
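To make the heuristic concrete, here is a tiny sketch (my own, with illustrative parameter values rather than anyone’s fit) of the standard two-flavor formula with the mixing angle restored. The conventional factor 1.27 absorbs the ħ’s and c’s when \Delta m^2 is in eV², L in km, and E in GeV:

```python
import math

def p_oscillation(sin2_2theta, dm2_ev2, L_km, E_GeV):
    """Two-flavor oscillation probability:
    P = sin^2(2*theta) * sin^2(1.27 * dm^2 [eV^2] * L [km] / E [GeV]).
    """
    return sin2_2theta * math.sin(1.27 * dm2_ev2 * L_km / E_GeV) ** 2

# Illustrative short-baseline numbers: L ~ 0.5 km, E ~ 0.8 GeV,
# probing the dm^2 ~ 1 eV^2 region with a small mixing angle.
p = p_oscillation(sin2_2theta=0.004, dm2_ev2=1.0, L_km=0.5, E_GeV=0.8)
```

Note how the probability depends on L and E only through the ratio L/E, which is why experiments at very different scales can probe the same \Delta m^2.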

The early probes of neutrino oscillations came from `medium’ and `long’ baseline experiments, where the detected neutrinos came from cosmic ray showers in the atmosphere and from the sun, respectively. The LSND experiment was the first to probe `short-baseline’ neutrinos, with an L/E of about 30 m / 50 MeV. What LSND found was incompatible with the standard story of three light neutrino mixing (hep-ex/0104049). They found a 3.8 sigma excess of electron anti-neutrinos over what one would expect, leading to the suggestion that this `LSND anomaly’ may have been due to mixing of the light neutrinos with a fourth, heavy `sterile’ neutrino.


MiniBooNE set out to test the sterile neutrino hypothesis by looking at muon neutrino to electron neutrino mixing (LSND looked at anti-mu neutrinos to anti-e neutrinos). The experimental set-up has Fermilab shooting 8 GeV protons into a fixed target to produce pions and kaons. These are focused with a magnetic `horn’ so that they decay into relatively collimated neutrinos (mu and e) and charged particles. The horn can be run with opposite polarity to study the analogous anti-neutrino processes. The decay products then go through around 500 meters of dirt, which provides ample matter to stop the charged particles while leaving the neutrinos to hit an 800 ton mineral oil detector. (The energy and baseline are chosen to match the L/E of LSND.) These neutrinos may produce charged leptons, which produce Cerenkov radiation (the electromagnetic equivalent of sonic booms) that is picked up by an array of photomultiplier tubes to read out information about the particle energy. Apparently the pattern of Cerenkov light is even enough to distinguish muons from electrons.

I don’t know why, but the MiniBooNE people measure their data in terms of protons-on-target (POT). It seems to me that the natural units would be something like luminosity or number of neutrino candidates… but perhaps these are less-meaningful in this sort of experiment?

The first results

Here’s what MiniBooNE had to say in 2007 (0704.1500):

MiniBooNE 2007 Results, from 0704.1500


As is standard in particle physics, the collaboration did a blind analysis, i.e. analyzed the data without looking at the entire dataset, to prevent the analysts from inserting bias in their cuts. They found that their signal does not fit the LSND sterile neutrino hypothesis (pink and green solid lines on the bottom plot). Part of their blind analysis was to focus on the data above 475 MeV, since the data below this had larger backgrounds (top plot). Above this scale their data is very close to a fit to the standard 3-light-neutrino model. Below this, however, they found an odd excess of low energy electrons. Since this region has more difficult backgrounds than the 475+ MeV region, and since the latter region had [conclusively?] ruled out the natural LSND interpretation, they decided to publish their result and look more carefully into the low energy region.

One year later

The group has since put up a further analysis of the sub-475 MeV region (0812.2243), and the result is that there is still a 3-sigma deviation. The new analysis includes several improvements to get better handles on backgrounds. I do not properly understand most of these (“theorist’s naiveté”), but will mention a few to the extent that I am capable:

  • Pion neutral current distributions were reweighted to model pion kinematics properly
  • “Photonuclear absorption” was taken into account. This is the process where a neutral pion decays into two photons and one of the photons is absorbed by a carbon nucleus. The remaining photon Cerenkov radiates and is misidentified in the detector as an electron. (This apparently contributed 25% to the background!)
  • A new cut on the data was imposed to get rid of pion-to-photon decay backgrounds in the dirt (a generic term for earthy matter) immediately outside the detector. Signals that are pointing opposite the neutrino beam and originate near the exterior of the detector are removed since Monte Carlo simulations showed that these events are primarily background.

If I understand correctly, systematic errors are now smaller for the 200-475 MeV region (13% compared to 15% in the 475+ MeV region). The new result is:

Still an electron-like excess at low energies, from 0812.2243


The significance at each energy range is something like

  • 200 – 300 MeV: 1.7 sigma
  • 300 – 475 MeV: 3.4 sigma
  • 475 – 1250 MeV: 0.6 sigma

The 200 – 475 MeV data combine to an overall 3 sigma discrepancy. The MiniBooNE spokesperson also points out that since we now understand this low energy region better than the high-energy region (in terms of systematic errors), this is still a solid indication that there is a MiniBooNE excess. It seems we’ve just traded the `LSND anomaly’ for a `MiniBooNE anomaly’.

[For theorists: this is a familiar concept in duality called anomaly matching. For experimentalists: that was a bad joke.]

At this point, people who were model building for MiniBooNE could rest easy and keep earning their wages.


That is, until we look at the anti-neutrino data. Since the neutrino/anti-neutrino character of the beam is based on the “horn” polarity, the experiment can run in either nu or anti-nu mode, but not both simultaneously. Last December the MiniBooNE collaboration also “unblinded” their antineutrino data and found…

MiniBooNE antineutrino data fits background.


… what is it? A big piece of coal in their stocking. Well, ok, that was overly harsh. In fact, this is actually rather interesting. Recall that LSND was running in the antineutrino mode when it found its anomaly. [In fact, I don’t properly understand why MiniBooNE didn’t run in the antineutrino mode initially. I suspect it may have something to do with how well one can measure electrons vs positrons in the detector.]

The antineutrino data matches the background predictions rather well. No story here. Unfortunately this non-signal killed the favorite model for the electron excess (axial anomaly mediation), which predicted an analogous excess in the antineutrino mode. In fact, it killed most of the interesting interpretations. Apparently the betting game during the blind antineutrino analysis was to use the neutrino data to predict the excess in the antineutrino data in various scenarios. The unblinded data suggests the excess only exists in the neutrino mode.

One of the few models that survived the antineutrino data is based on “multichannel oscillations,” nucl-th/0703023.

Where to go from here

MiniBooNE continues to take data and the collaboration is planning on combining the neutrino and antineutrino data (I’m not sure what this means). They’re waiting on a request for extra running, with some proposals for exploring this anomaly at different baselines. Since the background goes as 1/L^2 while oscillations go as \sin^2(\Delta m^2 L/E), this L/E dependence can shed some light on the nature of the electron excess. One proposal is to actually just move the MiniBooNE detector to 200 m. Since this would be closer, it would only take one year to accumulate data equivalent to the entire seven-year run so far.
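As a back-of-envelope illustration of that L/E argument (a toy sketch with made-up normalizations, not the collaboration’s projection): the raw rate falls as 1/L², so moving the detector from roughly 500 m to 200 m boosts the event rate by about (500/200)² ≈ 6, while the signal-to-background ratio traces out the oscillation pattern:

```python
import math

def rates(L_m, E_MeV=800.0, dm2_ev2=1.0, sin2_2theta=0.004, norm=1.0):
    """Toy (signal, background) rates vs baseline L, in arbitrary units.

    The flux, and hence the background, falls like 1/L^2, while the
    oscillated signal goes like sin^2(1.27 dm^2 L/E) / L^2, so the
    signal-to-background *ratio* carries the oscillatory L/E fingerprint.
    """
    flux = norm / L_m**2
    L_km, E_GeV = L_m / 1000.0, E_MeV / 1000.0
    p_osc = sin2_2theta * math.sin(1.27 * dm2_ev2 * L_km / E_GeV) ** 2
    return flux * p_osc, flux  # (signal, background)

# Moving from ~500 m to 200 m boosts the raw rate by (500/200)^2 = 6.25,
# roughly consistent with "one year instead of seven" of running.
s500, b500 = rates(500.0)
s200, b200 = rates(200.0)
```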

There was also a nice remark by the MiniBooNE spokesperson to `cover all the bases,’ so to speak: even if the low energy excess in MiniBooNE is completely background (i.e. uninteresting), this would still turn out to be very important for long baseline experiments (T2K, NoVA, DUSEL-FNAL) since they operate in this energy range.

The CDF multi-muon anomaly has been an experimental curiosity for a few months now, but it seems to have taken a back seat to PAMELA/ATIC for `exciting experimental directions’ in particle phenomenology. [Of course, it’s still doing better than the LHC…]

The end of 2008 for hep-ph‘ists was marked by three interesting leptonic signals: the CDF multi-muon anomaly, PAMELA/ATIC, and the MiniBooNE excess. Of these three, PAMELA/ATIC have gotten the lion’s share of papers — but there have been rumblings that the Fermi/GLAST preliminary results are favoring a `vanilla’ astrophysical explanation. There have been a couple of notable papers attempting to explain PAMELA/ATIC along these lines (0812.4457, 0902.0376). As for MiniBooNE, not very much has been said about the excess in low-energy electrons. I hope to be able to learn a bit more about this before I blog about it.

The original multi-muon paper is quite a read (as is the associated initial model-building attempt), and indeed produced an interesting `response’ from a theorist (much of which is an excellent starting point for multi-muon model-building), which in turn produced a response on Tommaso’s blog… which eventually turned a bit ugly in the comments section. Anyway, the best `armchair’ reading on the multi-muon anomaly is still Tommaso’s set of notes: part 0, part 1, part 2, part 3, part 4. An excellent theory-side discussion can be found at Resonaances.

There have been a handful of model attempts by theorists since the above discussions. General remarks on the hidden-valley context and how to start thinking about this signal can be found in the aforementioned paper 0811.1560. A connection between CDF multi-muons and the cosmic ray lepton excess was presented in 0812.4240. A very recent paper also attacks CDF + PAMELA with a hidden valley scalar, 0902.2145. As usual any map from possible new signals to variants of the MSSM is surjective (though never one-to-one), so it’s no surprise that people have found a singlet extension to the MSSM to fit the CDF anomaly in 0812.1167. An exploration of `what can we still squeeze out of the Tevatron’ comes from Fermilab, which explains that a very heavy t’ could not only be found at the Tevatron, but could explain the CDF anomaly, 0902.0792.

There are some very respectable theorists trying their hand at multi-muon model building, though there generally seems to be some reluctance from the community as a whole to devote much effort towards it. Maybe people are holding their breath for direct production of new physics at the LHC, or are otherwise convinced that the thing to do right now is construct theories of dark matter since we know dark matter must eventually show up in a particle experiment.

For me, the threshold for jumping into the field head-first was waiting to hear what the D0 collaboration had to say about this. According to rumors, however, it seems like the Tevatron’s other detector won’t have anything to say since it won’t be doing this analysis. From what I understand, this comes from the way that the D0 collaboration skims their data. (What does `skim’ mean?) Rumor has it that it’s very difficult for them to do the same analysis that CDF did, and they’ve decided that (1) the likelihood of new physics is so low that it’s not worth their effort to jump in and try to get in on the glory, and (2) the signal is so absurd that it’s not even worth their effort to do the analysis to disprove their friendly rivals at CDF. Not being an experimentalist I can’t comment on the validity or rationale for this — if it is indeed true — but as a phenomenologist I’m smacking my head.

If the multi-muon signal pans out, it could be an experimental discovery that would launch a thousand theorists (Helen of Troy reference intended). If not, a cross-check with D0 would have definitively (to the extent that anything is definite in science) put the issue to rest. There are some people in the CDF collaboration who are really convinced by their analysis, and I hope that there will be an opportunity in the near future to cross-check those results at another detector.

I’ve been spending some time thinking about spinors on curved spacetime. There exists a decent set of literature out there for this, but unfortunately it’s scattered across different `cultures’ like a mathematical Tower of Babel. Mathematicians, general relativists, string theorists, and particle physicists all have a different set of tools and language to deal with spinors.

Particle physicists — the community from which I hail — are the most recent to use curved-space spinors in mainstream work. It was only a decade ago that the Randall-Sundrum model for a warped extra dimension was first presented, in which the Standard Model was confined to a (3+1)-brane in a 5D anti-de Sitter spacetime. Shortly after, flavor constraints led physicists to start placing fields in the bulk of the RS space. Grossman and Neubert were among the first to show how to place fermion fields in the bulk. The fancy new piece of machinery (by then an old hat for string theorists and a really old hat for relativists) was the spin connection, which allows us to connect the flat-space formalism for spinors to curved spaces. [I should make an apology: supergravity has made use of this formalism for some time now, but I unabashedly classify supergravitists as effective string theorists for the sake of argument.]

One way of looking at the formalism is that spinors live in the tangent space of a manifold. By definition this space is flat, and we may work with spinors as in Minkowski space. The only problem is that one then wants to relate the tangent space at one spacetime point to neighboring points. For this one needs a new kind of covariant derivative (i.e. a new connection) that will translate tangent space spinor indices at one point of spacetime to another.

By the way, now is a fair place to state that mathematicians are likely to be nauseated by my “physicist” language… it’s rather likely that my statements will be mathematically ambiguous or even incorrect. Fair warning.

Mathematicians will use words like the “square root of a principal fiber bundle” or “repère mobile” (moving frame) to refer to this formalism in differential geometry. Relativists and string theorists may use words like “tetrad” or “vielbein,” the latter of which has been adopted by particle physicists.

A truly well-written “for physicists” exposition on spinors can be found in Green, Schwarz, and Witten, volume II, section 12.1. It’s a short section that you can read independently of the rest of the book. I will summarize their treatment in what follows.

We would like to introduce a basis of orthonormal vectors at each point in spacetime, e^a_\mu(x), which we call the vielbein. This translates to `many legs’ in German. One will often also hear the term vierbein, meaning `four legs,’ or fünfbein, meaning `five legs,’ depending on the dimensionality of spacetime one is working with. The index \mu refers to indices on the spacetime manifold (which is curved in general), while the index a labels the different basis vectors.

If this makes sense, go ahead and skip this paragraph. Otherwise, let me add a few words. Imagine the tangent space of a manifold. We’d like a set of basis vectors for this tangent space. Of course, whatever basis we’re using for the manifold induces a basis on the tangent space, but let’s be more general. Let us write down an arbitrary basis. Each basis vector has n components, where n is the dimensionality of the manifold. Thus each basis vector gets an index from 1 to n, which we call \mu. The choice of this label is intentional: the components of this basis map directly (say, by exponentiation) to the manifold itself, so these really are indices relative to the basis on the manifold. We can thus write a particular basis vector of the tangent space at x as e_\mu(x). How many basis vectors are there for the tangent space? There are n. We can thus label the different basis vectors with another letter, a. Hence we may write our vector as e^a_\mu(x).

The point, now, is that these objects allow us to convert from manifold coordinates to tangent space coordinates. (Tautological sanity check: the a are tangent space coordinates because they label a basis for the tangent space.) In particular, we can go from the curved-space indices of a warped spacetime to flat-space indices that spinors understand. The choice of an orthonormal basis of tangent vectors means that

e^a_\mu (x) e_{a\nu}(x) = g_{\mu\nu}(x),

where the a index is raised and lowered with the flat space (Minkowski) metric. In this sense the vielbeins can be thought of as `square roots’ of the metric that relate flat and curved coordinates. (Aside: this was the first thing I ever learned at a group meeting as a grad student.)
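A quick numerical sanity check of the `square root’ statement (my own toy example, not from GSW), using the Euclidean plane in polar coordinates as a stand-in spacetime:

```python
import numpy as np

# Toy check that the vielbein is a 'square root' of the metric:
# g_{mu nu} = e^a_mu eta_{ab} e^b_nu, here for the Euclidean plane in
# polar coordinates (r, phi), where g = diag(1, r^2) and eta = identity.
r = 2.7
g = np.diag([1.0, r**2])   # curved-space metric g_{mu nu}
eta = np.eye(2)            # flat tangent-space metric eta_{ab}
e = np.diag([1.0, r])      # vielbein e^a_mu (rows: a, columns: mu)

assert np.allclose(e.T @ eta @ e, g)
```

Any local rotation of e by an orthogonal matrix acting on the a index leaves this check intact, which is exactly the gauge redundancy discussed next.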

Now here’s the good stuff: there’s nothing `holy’ about a particular orientation of the vielbein at a particular point of spacetime. We could have arbitrarily defined the tangent space z-direction (i.e. a = 3, not \mu = 3) pointing in one direction (x_\mu=(0,0,0,1)) or another (x_\mu=(0,1,0,0)) relative to the manifold’s basis, so long as the two directions are related by a Lorentz transformation. Thus we have an SO(3,1) symmetry (or whatever symmetry applies to the manifold). Further, we could have made this arbitrary choice independently for each point in spacetime. This means that the symmetry is local, i.e. it is a gauge symmetry. Indeed, think back to handy definitions of gauge symmetries in QFT: this is an overall redundancy in how we describe our system; it’s a `non-physical’ degree of freedom that needs to be `modded out’ when describing physical dynamics.

Like any other gauge symmetry, we are required to introduce a gauge field for the Lorentz group, which we shall call \omega_{\mu\phantom{a}b}^{\phantom{\mu}a}(x). From the point of view of Riemannian geometry this is just the connection, so we can alternately call this creature the spin connection. Note that this is all different from the (local) diffeomorphism symmetry of general relativity, for which we have the Christoffel connection.

What do we know about the spin connection? If we want to be consistent with general relativity while adding only minimal structure (which GSW notes is not always the case), we need to impose consistency when we take covariant derivatives. In particular, any vector field with manifold indices, V^\mu(x), can now be recast as a vector field with tangent-space indices, V^a(x) = e^a_\mu(x)V^\mu(x). By requiring that both objects have the same covariant derivative, we get the constraint

D_\mu e^a_\nu(x) = 0.

Note that the covariant derivative is defined as usual for multi-index objects: a partial derivative followed by a connection term for each index. For the manifold index there’s a Christoffel connection, while for the tangent space index there’s a spin connection:

D_\mu e^a_\nu(x) = \partial_\mu e^a_\nu - \Gamma^\lambda_{\mu\nu}e^a_\lambda + \omega_{\mu\phantom{a}b}^{\phantom{\mu}a}e^b_\nu.

This turns out to give just enough information to constrain the spin connection in terms of the vielbeins,

\omega^{ab}_\mu = \frac 12 g^{\rho\nu}e^{[a}_{\phantom{[a}\rho}\partial_{[\mu}e^{b]}_{\phantom{b]}\nu]} + \frac 14 g^{\rho\nu}g^{\tau\sigma}e^{[a}_{\phantom{[a}\rho}e^{b]}_{\phantom{b]}\tau}\partial_{[\sigma}e^c_{\phantom{c}\nu]}e^d_\mu\eta_{cd},

this is precisely equation (11) of hep-ph/9805471 (EFT for a 3-Brane Universe, by Sundrum) and equation (4.28) of hep-ph/0510275 (TASI Lectures on EWSB from XD, by Csaki, Hubisz, Meade). I recommend both references for RS model-building, but note that neither of them actually explains where this equation comes from (well, the latter cites the former)… so I thought it’d be worth explaining this explicitly. GSW make a further note that the spin connection can be determined from the torsion constraint, since these are the only terms that survive the antisymmetry of the torsion tensor.

Going back to our original goal of putting fermions on a curved spacetime: in order to define a Clifford algebra on such a spacetime it is now sufficient to consider objects \Gamma_\mu(x) = e^a_\mu(x)\gamma_a, where the right-hand side contains a flat-space (constant) gamma matrix with its index converted to a spacetime index via the position-dependent vielbein, resulting in a spacetime gamma matrix that is also position dependent (left-hand side). One can check that the spacetime gamma matrices indeed satisfy the Clifford algebra with the curved-space metric, \{\Gamma_\mu(x),\Gamma_\nu(x)\} = 2g_{\mu\nu}(x).
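Sticking with the toy polar-coordinate example from before (2D Euclidean signature, where the Pauli matrices serve as flat gamma matrices with \{\sigma_a,\sigma_b\} = 2\delta_{ab}), one can verify the curved-space Clifford algebra numerically:

```python
import numpy as np

# Toy check of {Gamma_mu(x), Gamma_nu(x)} = 2 g_{mu nu}(x) with
# Gamma_mu = e^a_mu gamma_a. In 2D Euclidean signature the flat gamma
# matrices can be taken to be Pauli matrices: {sigma_a, sigma_b} = 2 delta_ab.
s1 = np.array([[0, 1], [1, 0]], dtype=complex)   # gamma_1
s2 = np.array([[0, -1j], [1j, 0]])               # gamma_2
gammas_flat = [s1, s2]

r = 1.8
e = np.diag([1.0, r])        # vielbein for the polar-coordinate plane
g = np.diag([1.0, r**2])     # the metric it 'squares' to

# Gamma_mu = sum_a e^a_mu gamma_a: position-dependent curved-space gammas
Gamma = [sum(e[a, mu] * gammas_flat[a] for a in range(2)) for mu in range(2)]
for mu in range(2):
    for nu in range(2):
        anti = Gamma[mu] @ Gamma[nu] + Gamma[nu] @ Gamma[mu]
        assert np.allclose(anti, 2 * g[mu, nu] * np.eye(2))
```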

There’s one last elegant thought I wanted to convey from GSW. In a previous post we mentioned the role of topology on the existence of the (quantum mechanical) spin representation of the Lorentz group. Now, once again, topology becomes relevant when dealing with the spin connection. When we wrote down our vielbeins we assumed that it was possible to form a basis of orthonormal vectors on our spacetime. A sensible question to ask is whether this is actually valid globally (rather than just locally). The answer, in general, is no. One simply has to consider the “hairy ball” theorem that states that one cannot have a continuous nowhere-vanishing vector field on the 2-sphere. Thus one cannot always have a nowhere-vanishing global vielbein.

Topologies that can be covered by a single vielbein are actually `comparatively scarce’ and are known as parallelizable manifolds. For non-parallelizable manifolds, the best we can do is to define vielbeins on local regions and patch them together via Lorentz transformations (`transition functions’) along their boundary. Consistency requires that in a region with three overlapping patches, the transition from patch 1 to 2, 2 to 3, and then from 3 to 1 is the identity. This is indeed the case.

Spinors must also be patched together along the manifold in a similar way, but here we run into problems. The consistency condition on a triple-overlap region no longer always holds because of the double-valuedness of the spinor transformation (i.e. the spinor transformation has a sign ambiguity relative to the vector transformation). If it is possible to choose signs on the spinor transformations such that the consistency condition always holds, then the manifold is known as a spin manifold and is said to admit a spin structure. In order to have a consistent theory with fermions, it is necessary to restrict to a spin manifold.

Here’s a new installment to the never-ending debate about the best way to draw figures in LaTeX. (A previous suggestion: Adobe Illustrator.)

Stereographic projection image made using TikZ by Thomas Trzeciak, available at

Stereographic projection image made using TikZ by Thomas Trzeciak, available at

A promising solution is the combination of PGF (“Portable Graphics Format”) and TikZ (“TikZ ist kein Zeichenprogramm,” or “TikZ is not a drawing program”), both developed by Till Tantau, whom you may know better for creating Beamer.

PGF is the ‘base system’ that provides commands to draw vector images. This layer of the graphics system is tedious to use directly since it only provides the most basic tools. TikZ is a frontend that provides a user-friendly environment for writing commands to draw diagrams.

Like other LaTeX drawing packages (the picture environment, axodraw, etc.), the TikZ figures are a series of in-line commands and so are extremely compact and easy to modify. Unlike other LaTeX packages, though, TikZ provides a powerful layer of abstraction that makes it relatively easy to make fairly complicated diagrams.  Further, the entire system is PDFLaTeX-friendly, which is more than one can say for pstricks-based drawing options.

The cost is that the system has a bit of a learning curve. Like LaTeX itself, there are many commands and techniques that one must gradually become familiar with in order to make figures. Fortunately, there is a very pedagogical and comprehensive manual available. Unfortunately the manual is rather lengthy and many of the examples contain small errors that prevent the code from compiling. A gallery of TikZ examples comes to the rescue, however, with source code for each example, including those from the manual. If you’re thinking about learning TikZ, go ahead and browse some of the examples right now; the range of possibilities is really impressive.

To properly learn how to use TikZ, I would suggest setting aside a day or two to go through the tutorials (Part I) of the manual. Start from the beginning and work your way through one page at a time. The manual was written in such a way that you can’t just skip to a picture that you like and copy the source code, you need to be sure to include all the libraries and define all the variables that are discussed over the course of each tutorial. (You can always consult the source code at the gallery of TikZ examples in a pinch.)

Let me just mention two really neat things that TikZ can do which sold me on the system.

TikZ Feynman diagram by K. Fauske.

The first example is, of course, drawing a Feynman diagram. The TikZ code isn’t necessarily any cleaner than what one would generate using Jaxodraw, but TikZ offers much more control in changing the way things look. For example, one could turn all fermions blue without having to modify each line.
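To illustrate that kind of global control, here is a rough sketch of my own (much simpler than Fauske’s diagram, and only an assumption about how one might set it up): all fermion lines share a single style, so recoloring every fermion is a one-word change.

```latex
\documentclass{article}
\usepackage{tikz}
\usetikzlibrary{decorations.markings,decorations.pathmorphing}

\begin{document}
\begin{tikzpicture}[
    % change 'black' to 'blue' here to recolor every fermion line at once
    fermion/.style={black, postaction={decorate},
      decoration={markings, mark=at position 0.55 with {\arrow{>}}}},
    boson/.style={decorate, decoration={snake}}
  ]
  \draw[fermion] (-2,1)  -- (0,0);
  \draw[fermion] (0,0)   -- (-2,-1);
  \draw[boson]   (0,0)   -- (2,0);
  \draw[fermion] (2,0)   -- (4,1);
  \draw[fermion] (4,-1)  -- (2,0);
\end{tikzpicture}
\end{document}
```

The styles act like macros for the drawing commands: the diagram’s content and its appearance are kept separate, which is exactly what Jaxodraw-style point-and-click output lacks.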

TikZ used to draw arrows on a Beamer presentation. Image by K. Fauske.

The next example is a solution to one of the most difficult aspects of Beamer: drawing arrows between elements of a frame. This is consistently the feature that PowerPoint, Keynote, and the ol’ chalkboard always do better than Beamer. No longer!

Check out the source code for the example above. Adding arrows is as easy as defining some nodes and writing one line of code for each line. The lines are curved ‘naturally’ and the trick works with Beamer’s overlays. (Beamer is also built on PGF.)
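The basic trick is sketched below from memory; the macro name \tmark and the frame text are my own, so treat the details as an assumption rather than Fauske’s exact code. Words are wrapped in named TikZ nodes, and an overlay picture then draws between them:

```latex
\documentclass{beamer}
\usepackage{tikz}

% \tmark{name}{text}: typeset 'text' as a named TikZ node we can point at later
\newcommand{\tmark}[2]{%
  \tikz[remember picture, baseline]{\node[anchor=base, inner sep=0pt] (#1) {#2};}}

\begin{document}
\begin{frame}
  The \tmark{lhs}{Wilsonian action} flows to the \tmark{rhs}{1PI action}.
  % overlay: coordinates refer to the whole page; the arrow appears on slide 2
  \begin{tikzpicture}[remember picture, overlay]
    \draw<2->[->, thick, bend left=45] (lhs.north) to (rhs.north);
  \end{tikzpicture}
\end{frame}
\end{document}
```

Note that the remember picture option needs two compilation passes before the node positions settle.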

Anyway, for those with the time to properly work through the tutorials, TikZ has the potential to be a very powerful tool to add to one’s LaTeX arsenal.


Pros:

  • Tools for creating high-quality vector graphics
  • “Node” structure is very useful for drawing charts and Feynman diagrams
  • Works with PDFLaTeX
  • Images are drawn ‘in-line’ (no need to attach extra files)
  • Easy to insert TeX into images


Cons:

  • A bit of a learning curve to overcome
  • No standard GUI interface

Download PGF/TikZ. Installation instructions: place the files into your texmf tree. For Mac OS X users, this means putting everything into a subdirectory of ~/Library/texmf/tex/latex/.

Today I’ll be reviewing P.M. Stevenson, “Dimensional Analysis in Field Theory,” Annals of Physics 132, 383 (1981). It’s a cute paper that helps provide some insight for the renormalization group.

A theory is a black box

A theory is a black box that we can shake to make predictions of physical observables.

We’ve already said a few cursory words on dimensional analysis and renormalization. It turns out that we can use simple dimensional analysis to yield some insight on the nature of the renormalization group without having to think about the technical ‘heavy machinery’ required to do actual calculations.

First let us define a theory as a black box that is characterized by a Lagrangian and its corresponding parameters: coupling constants, masses, fields, etc. All these things, however, are contained within the black box and are in some sense abstract objects. One can ask the black box to predict physical observables, which can then be measured experimentally. Such observables could be cross sections, ratios of cross sections, or potentials, as shown in the image above.

Let’s now restrict ourselves to the case of a ‘naively-scale-invariant’ or ‘naively-dimensionless’ theory, i.e. one with no couplings of nonzero mass dimension. For example, \lambda\phi^4 theory or massless QCD. We shall further restrict to dimensionless observables, such as ratios of cross sections. Let’s call a general observable \rho(Q), where we have inserted a dependence on the energy Q with the foresight that such quantities renormalize with energy scale.

Dimensional Analysis

But one can immediately take a step back and realize that this is ridiculous. How could a dimensionless observable from a dimensionless theory have a nontrivial dependence on a dimensionful quantity, Q? Stevenson makes this more explicit by quoting a theorem of dimensional analysis:

Thm. A function f(x,y) which depends only on two massive variables x,y and which is

  1. dimensionless
  2. uniquely defined
  3. defined without any dimensionful constants

must then be a function of the ratio of x/y only, f(x,y)=f(x/y).

Cor. If f(x,y) is independent of y, then f(x,y) is constant.

Then by the corollary, \rho(Q) must be constant. This is a problem, since our experiments show a Q dependence.

Evading Dimensional Analysis

The answer is that the theorem doesn’t hold: the theory inside the black box is not ‘uniquely defined,’ violating condition 2. This is what we meant by the stuff inside the black box being ‘abstract’: the Lagrangian is actually a one-parameter family of theories with different bare couplings. That is to say, the black box is defined only up to a freedom in the renormalization conditions.

Now that we see that it is possible to have Q-dependence, it’s a bit of a curiosity how our dimensionless theory manages to define a dimensionful dependence for \rho without any dimensionful quantities to draw upon. The simplest way is to have the theory define the first derivative:

\frac {d\rho}{dQ} = \frac{1}{Q} \beta(\rho),

where \beta is the usual beta function calculated in perturbation theory. It is dimensionless and is uniquely defined by the theory. Another way one can define Q dependence is to do so recursively; one can read Stevenson’s paper to see that this is equivalent to defining the \beta function.

One can integrate the equation for \beta to write,

\log Q + constant = \int^\rho_{-\infty} \frac{d\rho'}{\beta(\rho')} \equiv K(\rho).

The constant of integration now characterizes the one-parameter ambiguity of the theory. (The ambiguity can be mapped onto the lack of a boundary condition.) We may parameterize this ambiguity by writing

constant = K_0 - \log \mu,

for some arbitrary \mu of mass dimension 1. (This form is necessary to get a dimensionless logarithm on the left-hand side.) The appearance of this massive constant is something of a ‘virgin birth’ for the naively-dimensionless theory and is called dimensional transmutation. By setting Q=\mu we see that K_0 = K(\rho(\mu)). Thus we see finally that the integral of the \beta equation is

\log(Q/\mu) + K(\rho(\mu)) = K(\rho(Q)).

All of the one-parameter ambiguity of the theory is now packaged into the massive parameter \mu. K is an integral that comes from the \beta function, which is in turn specified by the Lagrangian of the theory. On the left-hand side we have quantities which depend on the arbitrary scale \mu while the right-hand side contains only quantities that depend on the energy scale Q.
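To make this concrete, consider the standard one-loop form \beta(\rho) = -b\rho^2 with b>0; this worked example is mine, not one from Stevenson’s paper. Choosing the integration constant so that K has no additive piece,

```latex
K(\rho) = \int^{\rho} \frac{d\rho'}{-b\,\rho'^2} = \frac{1}{b\rho}
% inserting this into  \log(Q/\mu) + K(\rho(\mu)) = K(\rho(Q))  gives
\frac{1}{b\,\rho(Q)} = \frac{1}{b\,\rho(\mu)} + \log(Q/\mu)
\quad\Longrightarrow\quad
\rho(Q) = \frac{\rho(\mu)}{1 + b\,\rho(\mu)\log(Q/\mu)}
```

which is the familiar logarithmic running of an asymptotically free coupling. Since K(\rho) \to 0 as \rho \to \infty, the special scale \Lambda is where the coupling formally blows up, and the solution collapses to \rho(Q) = 1/(b\log(Q/\Lambda)).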

If K(\rho(\mu)) vanishes for some \mu=\Lambda, then we can write our observable in terms of this scale,

\rho(Q) = K^{-1}(\log(Q/\Lambda)).

Note that \mu is arbitrary, while \Lambda is fixed for a particular theory. This latter quantity is rather interesting: even though it is an intrinsic property of the black box, it is not predicted by the black box; it must be fixed by explicit measurement of an observable.