You are currently browsing the monthly archive for December 2007.

Einstein’s equation $E=mc^2$ describing the equivalence of mass and energy is arguably the most famous equation in physics. But his beautifully elegant derivation of this formula (here is the English translation) from previously understood laws of physics is considerably less famous. (There is an amusing Far Side cartoon in this regard, with the punchline “squared away”, which you can find on-line by searching hard enough, though I will not link to it directly.)

This topic had come up in recent discussion on this blog, so I thought I would present Einstein’s derivation here. Actually, to be precise, in the paper mentioned above, Einstein uses the postulates of special relativity and other known laws of physics to show the following:

Proposition. (Mass-energy equivalence) If a body at rest emits a total energy of E while remaining at rest, then the mass of that body decreases by $E/c^2$.

Assuming that bodies at rest with zero mass necessarily have zero energy, this implies the famous formula $E = mc^2$ – but only for bodies which are at rest. For moving bodies, there is a similar formula, but one has to first decide what the correct definition of mass is for moving bodies; I will not discuss this issue here, but see for instance the Wikipedia entry on this topic.

Broadly speaking, the derivation of the above proposition proceeds via the following five steps:

1. Using the postulates of special relativity, determine how space and time coordinates transform under changes of reference frame (i.e. derive the Lorentz transformations).
2. Using 1., determine how the temporal frequency $\nu$ (and wave number k) of photons transform under changes of reference frame (i.e. derive the formulae for relativistic Doppler shift).
3. Using Planck’s relation $E = h\nu$ (and de Broglie’s law $p = \hbar k$) and 2., determine how the energy E (and momentum p) of photons transform under changes of reference frame.
4. Using the law of conservation of energy (and momentum) and 3., determine how the energy (and momentum) of bodies transform under changes of reference frame.
5. Comparing the results of 4. with the classical Newtonian approximations $KE \approx \frac{1}{2} m|v|^2$ (and $p \approx mv$), deduce the relativistic relationship between mass and energy for bodies at rest (and more generally between mass, velocity, energy, and momentum for moving bodies).

Actually, as it turns out, Einstein’s analysis for bodies at rest only needs to understand changes of reference frame at infinitesimally low velocity, $|v| \ll c$. However, in order to see enough relativistic effects to deduce the mass-energy equivalence, one needs to obtain formulae which are accurate to second order in v (or more precisely, $v/c$), as opposed to those in Newtonian physics which are accurate to first order in v. Also, to understand the relationship between mass, velocity, energy, and momentum for moving bodies rather than bodies at rest, one needs to consider non-infinitesimal changes of reference frame.
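
As a rough illustration of why second-order accuracy is what matters, here is a sketch of the low-velocity computation in one special configuration. The setup is an assumption made for this illustration, not a quotation of the full argument: the body emits two photons of energy $E/2$ each in opposite directions along the axis of relative motion; $E_0, E_1$ denote the body’s energy before and after emission in its rest frame; $E'_0, E'_1$ denote the same quantities in a frame moving at velocity $v$ along that axis; and $\gamma := (1-|v|^2/c^2)^{-1/2}$.

Conservation of energy in the rest frame gives

$E_0 = E_1 + E.$

In the moving frame, the relativistic Doppler shift multiplies the two photon frequencies by $\gamma(1 + |v|/c)$ and $\gamma(1 - |v|/c)$ respectively, so by Planck’s relation the total emitted energy is

$\frac{E}{2} \gamma (1 + |v|/c) + \frac{E}{2} \gamma (1 - |v|/c) = \gamma E,$

and conservation of energy in that frame gives $E'_0 = E'_1 + \gamma E$. Subtracting the two conservation laws,

$(E'_0 - E_0) - (E'_1 - E_1) = (\gamma - 1) E = \frac{1}{2} \frac{E}{c^2} |v|^2 + O( E |v|^4 / c^4 ).$

The first-order Doppler terms $\pm |v|/c$ have cancelled, which is why a theory accurate only to first order sees nothing here. Interpreting $E'_i - E_i$ as the kinetic energy of the body (up to a constant independent of $v$) and comparing with $KE \approx \frac{1}{2} m|v|^2$, the emission has lowered the mass by $E/c^2$, in agreement with the Proposition.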

Important note: Einstein’s argument is, of course, a physical argument rather than a mathematical one. While I will use the language and formalism of pure mathematics here, it should be emphasised that I am not exactly giving a formal proof of the above Proposition in the sense of modern mathematics; these arguments are instead more like the classical proofs of Euclid, in that numerous “self evident” assumptions about space, time, velocity, etc. will be made along the way. (Indeed, there is a very strong analogy between Euclidean geometry and the Minkowskian geometry of special relativity.) One can of course make these assumptions more explicit, and this has been done in many other places, but I will avoid doing so here in order not to overly obscure Einstein’s original argument.

I’m continuing my series of articles for the Princeton Companion to Mathematics through the holiday season with my article on “Differential forms and integration”. This is my attempt to explain the concept of a differential form in differential geometry and several variable calculus, which I view as an extension of the concept of the signed integral in single variable calculus. I briefly touch on the important concept of de Rham cohomology, but mostly I stick to fundamentals. [Added, Feb 24 2018: Typo on page 5: “more any other” should be “more generally, any other”.  Thanks to Michael Drumheller for the correction.]

I would also like to highlight Doron Zeilberger’s PCM article “Enumerative and Algebraic combinatorics”. This article describes the art of how to usefully count the number of objects of a given type exactly; this subject has a rather algebraic flavour to it, in contrast with asymptotic combinatorics, which is more concerned with computing the order of magnitude of the number of objects in a class. The two subjects complement each other; for instance, in my own work, I have found that enumerative and other algebraic methods tend to be useful for controlling “main terms” in a given expression, while asymptotic and other analytic methods tend to be good at controlling “error terms”.
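
To make the contrast between exact and asymptotic counting concrete, here is a minimal Python sketch. The choice of the Catalan numbers as the family being counted, and the function names `catalan_exact` and `catalan_asymptotic`, are assumptions made for this illustration and are not taken from Zeilberger’s article.

```python
from math import comb, pi, sqrt


def catalan_exact(n: int) -> int:
    """Exact count: the n-th Catalan number, C_n = binom(2n, n) / (n + 1)."""
    return comb(2 * n, n) // (n + 1)


def catalan_asymptotic(n: int) -> float:
    """Leading-order asymptotic: C_n ~ 4^n / (sqrt(pi) * n^(3/2))."""
    return 4.0 ** n / (sqrt(pi) * n ** 1.5)


for n in (5, 10, 50, 100):
    exact = catalan_exact(n)
    approx = catalan_asymptotic(n)
    # The ratio tends to 1 as n grows, but is never exactly 1.
    print(n, exact, f"asymptotic/exact = {approx / exact:.4f}")
```

The exact formula is the “enumerative” answer, while the asymptotic one only captures the order of magnitude and the leading constant; the gap between the two is an error term of relative size $O(1/n)$.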

I’m continuing my series of articles for the Princeton Companion to Mathematics with my article on phase space. This brief article, which overlaps to some extent with my article on the Schrödinger equation, introduces the concept of phase space, which is used to describe both the positions and momenta of a system in both classical and quantum mechanics, although in the latter one has to accept a certain amount of ambiguity (or non-commutativity, if one prefers) in this description thanks to the uncertainty principle. (Note that positions alone are not sufficient to fully characterise the state of a system; this observation essentially goes all the way back to Zeno with his arrow paradox.)

Phase space is also used in pure mathematics, where it is used to simultaneously describe position (or time) and frequency; thus the term “time-frequency analysis” is sometimes used to describe phase space-based methods in analysis. The counterpart of classical mechanics is then symplectic geometry and Hamiltonian ODE, while the counterpart of quantum mechanics is the theory of linear differential and pseudodifferential operators. The former is essentially the “high-frequency limit” of the latter; this can be made more precise using the techniques of microlocal analysis, semi-classical analysis, and geometric quantisation.
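
As a small illustration of a Hamiltonian ODE evolving in phase space, here is a sketch in Python. Everything specific in it is an assumption chosen for the example: the Hamiltonian $H(q,p) = \frac{1}{2} p^2 + \frac{1}{2} q^2$ (a unit harmonic oscillator), the leapfrog integrator, and the step size. Leapfrog is used because it is a symplectic scheme, so the computed trajectory stays close to a level set of $H$ in the $(q,p)$ plane.

```python
def hamiltonian(q: float, p: float) -> float:
    """Energy of the (hypothetical) unit harmonic oscillator."""
    return 0.5 * p * p + 0.5 * q * q


def leapfrog_step(q: float, p: float, dt: float) -> tuple[float, float]:
    """One leapfrog step for dq/dt = dH/dp = p, dp/dt = -dH/dq = -q."""
    p_half = p - 0.5 * dt * q          # half kick
    q_new = q + dt * p_half            # full drift
    p_new = p_half - 0.5 * dt * q_new  # half kick
    return q_new, p_new


def simulate(q0: float, p0: float, dt: float, steps: int) -> list[tuple[float, float]]:
    """Return the list of phase-space points (q, p) visited by the integrator."""
    q, p = q0, p0
    trajectory = [(q, p)]
    for _ in range(steps):
        q, p = leapfrog_step(q, p, dt)
        trajectory.append((q, p))
    return trajectory


trajectory = simulate(q0=1.0, p0=0.0, dt=0.01, steps=10_000)
energies = [hamiltonian(q, p) for q, p in trajectory]
print("energy drift:", max(energies) - min(energies))
```

Note that the state passed from step to step is the pair $(q,p)$, not the position $q$ alone, which reflects the remark above that positions by themselves do not characterise the state of the system.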

As usual, I will highlight another author’s PCM article in this post, this one being Frank Kelly’s article “The mathematics of traffic in networks”, a subject which, as a resident of Los Angeles, I can relate to on a personal level :-) . Frank’s article also discusses in detail Braess’s paradox, which is the rather unintuitive fact that adding extra capacity to a network can sometimes increase the overall delay in the network, by inadvertently redirecting more traffic through bottlenecks! If nothing else, this paradox demonstrates that the mathematics of traffic is non-trivial.
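
Braess’s paradox can be checked by hand on a very small network, and the following Python sketch does so. The network and all its numbers are a standard textbook-style example assumed for illustration, not taken from Kelly’s article: 4000 drivers travel from S to E via A or B, the links S→A and B→E take (load/100) minutes, the links A→E and S→B take 45 minutes regardless of load, and the “extra capacity” is a free link A→B.

```python
# Hypothetical four-node network S, A, B, E (not taken from Kelly's article).
DRIVERS = 4000


def route_times(flows):
    """Minutes needed on each route, given the number of drivers per route.

    Routes: "top" = S->A->E, "bottom" = S->B->E, "shortcut" = S->A->B->E.
    Links S->A and B->E are congestible (load / 100 minutes), links A->E
    and S->B always take 45 minutes, and the extra link A->B is free.
    """
    load_sa = flows["top"] + flows["shortcut"]
    load_be = flows["bottom"] + flows["shortcut"]
    return {
        "top": load_sa / 100 + 45,
        "bottom": 45 + load_be / 100,
        "shortcut": load_sa / 100 + load_be / 100,
    }


def is_equilibrium(flows, allowed):
    """Wardrop condition: every used route is as fast as any allowed route."""
    times = route_times(flows)
    best = min(times[r] for r in allowed)
    return all(times[r] <= best + 1e-9 for r in allowed if flows[r] > 0)


without_link = {"top": DRIVERS // 2, "bottom": DRIVERS // 2, "shortcut": 0}
with_link = {"top": 0, "bottom": 0, "shortcut": DRIVERS}

delay_before = route_times(without_link)["top"]
delay_after = route_times(with_link)["shortcut"]

print(is_equilibrium(without_link, ["top", "bottom"]), delay_before)
print(is_equilibrium(with_link, ["top", "bottom", "shortcut"]), delay_after)
# The even split stops being an equilibrium once the shortcut exists:
print(is_equilibrium(without_link, ["top", "bottom", "shortcut"]))
```

Under these assumptions, the selfish equilibrium delay rises from 65 to 80 minutes for every driver once the free link is added, because the shortcut funnels all traffic through both congestible links.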

Camil Muscalu, Christoph Thiele and I have just uploaded to the arXiv our joint paper, “Multi-linear multipliers associated to simplexes of arbitrary length”, submitted to Analysis & PDE. This paper grew out of our project from many years ago to attempt to prove the nonlinear (or “scattering”) version of Carleson’s theorem on the almost everywhere convergence of Fourier series. This version is still open; our original approach was to handle the nonlinear Carleson operator by multilinear expansions in terms of the potential function V, but while the first three terms of this expansion were well behaved, the fourth term was unfortunately divergent, due to the unhelpful location of a certain minus sign. [This survey by Michael Lacey, as well as this paper of ours, covers some of these topics.]

However, what we did find out in this paper was that if we modified the nonlinear Carleson operator slightly, by replacing the underlying Schrödinger equation by a more general AKNS system, then for “generic” choices of this system, the problem of the ill-placed minus sign goes away, and each term in the multilinear series is, in fact, convergent (we have not yet verified that the series itself converges, although in view of the earlier work of Christ and Kiselev on this topic, this seems likely). The verification of this convergence (at least with regard to the scattering data, rather than the more difficult analysis of the eigenfunctions) is the main result of our current paper. It builds upon our earlier estimates of the bilinear term in the expansion (which we dubbed the “biest”, as a multilingual pun). The main new idea in our earlier paper was to decompose the relevant region of frequency space $\{ (\xi_1,\xi_2,\xi_3) \in {\Bbb R}^3: \xi_1 < \xi_2 < \xi_3 \}$ into more tractable regions, a typical one being the region in which $\xi_2$ was much closer to $\xi_1$ than to $\xi_3$. The contribution of each region can then be “parafactored” into a “paracomposition” of simpler operators, such as the bilinear Hilbert transform, which can be treated by standard time-frequency analysis methods. (Much as a paraproduct is a frequency-restricted version of a product, the paracompositions that arise here are frequency-restricted versions of composition.)

A similar analysis happens to work for the multilinear operators associated to the frequency region $S := \{ (\xi_1,\ldots,\xi_n): \xi_1 < \ldots < \xi_n \}$, but the combinatorics are more complicated; each of the component frequency regions has to be indexed by a tree (in a manner reminiscent of the well-separated pairs decomposition), and a certain key “weak Bessel inequality” becomes considerably more delicate. Our ultimate conclusion is that the multilinear operator

$T(V_1,\ldots,V_n) := \int_{(\xi_1,\ldots,\xi_n) \in S} \hat V_1(\xi_1) \ldots \hat V_n(\xi_n) e^{2i (\xi_1+\ldots+\xi_n) x}\ d\xi_1 \ldots d\xi_n$ (1)

(which generalises the bilinear Hilbert transform and the biest) obeys Hölder-type $L^p$ estimates (note that Hölder’s inequality corresponds to the situation in which the (projective) simplex S is replaced by the entire frequency space ${\Bbb R}^n$).
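
To spell out the parenthetical remark in a hedged way: suppose the Fourier transform is normalised so that the inversion formula reads $\int_{\Bbb R} \hat V(\xi) e^{i \xi y}\ d\xi = c V(y)$ for some constant $c > 0$ (the value of $c$ depends on the convention, which is an assumption here rather than something fixed by the text above). If the region $S$ in (1) is replaced by all of ${\Bbb R}^n$, the integral factorises:

$\int_{{\Bbb R}^n} \hat V_1(\xi_1) \ldots \hat V_n(\xi_n) e^{2i (\xi_1+\ldots+\xi_n) x}\ d\xi_1 \ldots d\xi_n = \prod_{j=1}^n \int_{\Bbb R} \hat V_j(\xi_j) e^{2i \xi_j x}\ d\xi_j = c^n V_1(2x) \ldots V_n(2x).$

Hölder’s inequality, together with the scaling identity $\| V_j(2 \cdot) \|_{L^{p_j}({\Bbb R})} = 2^{-1/p_j} \| V_j \|_{L^{p_j}({\Bbb R})}$, then gives

$\| c^n V_1(2 \cdot) \ldots V_n(2 \cdot) \|_{L^p({\Bbb R})} \leq c^n 2^{-1/p} \| V_1 \|_{L^{p_1}({\Bbb R})} \ldots \| V_n \|_{L^{p_n}({\Bbb R})}$

whenever $1 \leq p_1, \ldots, p_n \leq \infty$ and $\frac{1}{p} = \frac{1}{p_1} + \ldots + \frac{1}{p_n}$. A “Hölder-type” estimate for $T$ is a bound of this same shape, but with the frequency integration restricted to the simplex $S$, where the factorisation is no longer available.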

For the remainder of this post, I thought I would describe the “nonlinear Carleson theorem” conjecture, which is still one of my favourite open problems, being an excellent benchmark for measuring progress in the (still nascent) field of “nonlinear Fourier analysis”, while also being of interest in its own right in scattering and spectral theory.

Next quarter, starting on Wednesday January 9, I will be teaching a graduate course entitled “Topics in Ergodic Theory”. As an experiment, I have decided to post my lecture notes on this blog as the course progresses, as it seems to be a good medium to encourage feedback and corrections. (On the other hand, I expect that my frequency of posting on non-ergodic theory topics is going to go down substantially during this quarter.) All of my class posts will be prefaced with the course number, 254A, and will be placed in their own special category.

The topics I plan to cover include

• Topological dynamics;
• Classical ergodic theorems;
• The Furstenberg-Zimmer structure theory of measure preserving systems;
• Multiple recurrence theorems, and the connections with Szemerédi-type theorems;
• Orbits in homogeneous spaces (and in particular, in nilmanifolds);
• (Special cases of) Ratner’s theorem, and applications to number theory (e.g. the Oppenheim conjecture).

If time allows, I will cover some other topics in ergodic theory as well. (I haven’t yet decided exactly which ones to discuss, and might be willing to entertain some suggestions in this regard.)

If this works out well then I plan to also do the same for my spring class, in which I will cover as much of Perelman’s proof of the Poincaré conjecture as I can manage. (Note though that this latter class will build upon a class on Ricci flow given by my colleague William Wylie in the winter quarter, which will thus be a de facto prerequisite for my spring course.)

This post will have only the most tangential connection to mathematics.

I am an Australian citizen (and permanent resident in the US), but I nevertheless take an interest in the upcoming US presidential election in 2008. I’ve recently learned of a grassroots campaign to have one of the presidential debates focused on policy issues in science, technology, health, and the environment. (See also this LA Times op-ed and Wall Street Journal op-ed.) Personally, I think this is an excellent idea, and hope that it succeeds; it seems that they are currently gathering signatures towards this goal.

While on this topic, it is also interesting to see what the political prediction markets are currently forecasting as the outcome of the election…

[Update, Dec 20: Here is a table listing the major candidates and their positions on mostly science-related issues.]

Ciprian Demeter, Michael Lacey, Christoph Thiele and I have just uploaded our joint paper, “The Walsh model for $M_2^*$ Carleson” to the arXiv. This paper (which was recently accepted for publication in Revista Iberoamericana) establishes a simplified model for the key estimate (the “$M_2^*$ Carleson estimate”) in another (much longer) paper of ours on the return times theorem of Bourgain, in which the Fourier transform is replaced by its dyadic analogue, the Walsh-Fourier transform. This model estimate is established by the now-standard techniques of time-frequency analysis: one decomposes the expression to be estimated into a sum over tiles, and then uses combinatorial stopping time arguments to group the tiles into trees, and the trees into forests. One then uses (phase-space localised, and frequency-modulated) versions of classical Calderón-Zygmund theory (or in this particular case, a certain maximal Fourier inequality of Bourgain) to control individual trees and forests, and sums up over the trees and forests using orthogonality methods (excluding an exceptional set if necessary).
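
For readers who have not met the dyadic analogue of the Fourier transform, here is a minimal finite sketch in Python. It is an illustration under stated assumptions, not the object used in the paper: it computes the unnormalised Walsh-Hadamard transform of a vector of length $2^m$ in Hadamard (natural) ordering, and the function name `walsh_hadamard` is hypothetical. The characters $e^{2\pi i k j/N}$ of the discrete Fourier transform are replaced by the signs $(-1)^{b(k,j)}$, where $b(k,j)$ counts the binary digits that $k$ and $j$ share.

```python
def walsh_hadamard(values):
    """Unnormalised fast Walsh-Hadamard transform (Hadamard ordering).

    Entry k of the output equals sum_j values[j] * (-1) ** popcount(k & j).
    """
    n = len(values)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    out = list(values)
    h = 1
    while h < n:
        # Butterfly step: combine blocks of size h into blocks of size 2h.
        for start in range(0, n, 2 * h):
            for i in range(start, start + h):
                a, b = out[i], out[i + h]
                out[i], out[i + h] = a + b, a - b
        h *= 2
    return out


signal = [1, 0, 1, 0, 0, 1, 1, 0]
spectrum = walsh_hadamard(signal)
print(spectrum)                  # [4, 2, 0, -2, 0, 2, 0, 2]
print(walsh_hadamard(spectrum))  # 8 times the original signal
```

The butterfly structure is the same as in the fast Fourier transform, except that no complex phases appear; this absence of phases is part of why Walsh models tend to be technically simpler than their Fourier counterparts.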

Rather than discuss time-frequency analysis in detail here, I thought I would dwell instead on the return times theorem, and sketch how it is connected to the $M_2^*$ Carleson estimate; this is a more complicated version of the “$M_2$ Carleson estimate”, which is logically equivalent to Carleson’s famous theorem (and its extension by Hunt) on the almost everywhere convergence of Fourier series.

This is my final Milliman lecture, in which I talk about the sum-product phenomenon in arithmetic combinatorics, and some selected recent applications of this phenomenon to uniform distribution of exponentials, expander graphs, randomness extractors, and detecting (sieving) almost primes in group orbits, particularly as developed by Bourgain and his co-authors.
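
As a toy illustration of the sum-product phenomenon (a finite set cannot have both a small sumset and a small product set), the following Python sketch compares an arithmetic progression with a geometric progression. The two example sets, and the choice to work over the integers rather than in a finite field, are assumptions made for this illustration and are not taken from the lecture.

```python
def sumset(a):
    """All pairwise sums x + y with x, y in a."""
    return {x + y for x in a for y in a}


def productset(a):
    """All pairwise products x * y with x, y in a."""
    return {x * y for x in a for y in a}


arithmetic = set(range(1, 11))           # {1, 2, ..., 10}
geometric = {2 ** k for k in range(10)}  # {1, 2, 4, ..., 512}

for name, a in (("arithmetic", arithmetic), ("geometric", geometric)):
    # Prints: name, |A|, |A + A|, |A . A|
    print(name, len(a), len(sumset(a)), len(productset(a)))
```

For these ten-element sets, the arithmetic progression has a sumset of size 19 but a product set of size 42, while the geometric progression has a product set of size 19 but a sumset of size 55: in each case one of the two sets is forced to be large.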

This is my second Milliman lecture, in which I talk about recent applications of ideas from additive combinatorics (and in particular, from the inverse Littlewood-Offord problem) to the theory of discrete random matrices.

This week I am visiting the University of Washington in Seattle, giving the Milliman Lecture Series for 2007-2008. My chosen theme here is “Recent developments in arithmetic combinatorics”. In my first lecture, I will speak (once again) on how methods in additive combinatorics have allowed us to detect additive patterns in the prime numbers, in particular discussing my joint work with Ben Green. In the second lecture I will discuss how additive combinatorics has made it possible to study the invertibility and spectral behaviour of random discrete matrices, in particular discussing my joint work with Van Vu; and in the third lecture I will discuss how sum-product estimates have recently led to progress in the theory of expanders relating to Lie groups, as well as to sieving over orbits of such groups, in particular presenting work of Jean Bourgain and his coauthors.