As is now widely reported, the Fields medals for 2010 have been awarded to Elon Lindenstrauss, Ngo Bao Chau, Stas Smirnov, and Cedric Villani. Concurrently, the Nevanlinna prize (for outstanding contributions to mathematical aspects of information science) was awarded to Dan Spielman, the Gauss prize (for outstanding mathematical contributions that have found significant applications outside of mathematics) to Yves Meyer, and the Chern medal (for lifelong achievement in mathematics) to Louis Nirenberg. All of the recipients are of course exceptionally qualified and deserving of these awards; congratulations to all of them. (I should mention that I myself was only very tangentially involved in the awards selection process, and like everyone else, had to wait until the ceremony to find out the winners. I imagine that the work of the prize committees must have been extremely difficult.)

Today, I thought I would mention one result of each of the Fields medalists; by chance, three of the four medalists work in areas reasonably close to my own. (Ngo is rather more distant from my areas of expertise, but I will give it a shot anyway.) This will of course only be a tiny sample of each of their work, and I do not claim to be necessarily describing their “best” achievement, as I only know a portion of the research of each of them, and my selection choice may be somewhat idiosyncratic. (I may discuss the work of Spielman, Meyer, and Nirenberg in a later post.)

— 1. Elon Lindenstrauss —

Elon Lindenstrauss works primarily in ergodic theory and dynamical systems, particularly with regard to the homogeneous dynamics of the action of some subgroup {H} of a Lie group {G} on a quotient {G/\Gamma}, and on the applications of this theory to questions in analytic number theory.

One of the themes of modern mathematics is that within any field of mathematics, there is some sort of underlying action by a group (or group-like object) which endows certain key objects of study in that field with a rich, symmetric structure (both algebraic and geometric). Analytic number theory is no exception to this; many classical questions, for instance, about Minkowski’s geometry of numbers, Diophantine approximation, or quadratic forms can be interpreted through this modern perspective as questions about the actions of groups such as {SL_2({\bf R})} on spaces such as the homogeneous space {SL_2({\bf R})/SL_2({\bf Z})}. In particular, the dynamical properties of such actions (e.g. the behaviour of orbits, the classification of invariant measures, the mixing properties of the action, or averages of the action) often have direct bearing on the number-theoretic properties of these objects. A typical example of this is the resolution of the Oppenheim conjecture on quadratic forms by Margulis, using a classification of orbit closures in the homogeneous space {SL_3({\bf R})/SL_3({\bf Z})} of a certain subgroup {U} of {SL_3({\bf R})}, which is also a special case of Ratner’s theorems; this was discussed earlier on this blog at this writeup of a lecture of Margulis, or at this discussion of Ratner’s theorems. Indeed, even though the Oppenheim conjecture is a purely number-theoretic statement, the only known proofs of this conjecture in full generality proceed via homogeneous dynamics. (There are however more number-theoretic proofs of partial cases of this conjecture.)

Another demonstration of this principle is the remarkable 2006 paper of Einsiedler, Katok, and Lindenstrauss that gives the best partial result known to date on the notorious Littlewood conjecture, which asserts that for any real numbers {\alpha, \beta}, and any {\varepsilon > 0}, one can find a positive integer {n} such that {\| n \alpha \| \| n \beta \| \leq \varepsilon/n}, where {\|x\|} is the distance from {x} to the nearest integer. Note that the Dirichlet approximation theorem already gives {\|n\alpha\| = O(1/n)} for infinitely many {n}, and {\|n\beta\|} is clearly bounded, so the conjecture is tantalisingly close to being “easy”; nevertheless, it has defied proof for about eight decades now.
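To get a feel for the quantity in the conjecture, here is a minimal numerical sketch that computes the smallest value of {n \|n\alpha\| \|n\beta\|} over {1 \leq n \leq N}. The function names and the choice {\alpha = \sqrt{2}}, {\beta = \sqrt{3}} are purely illustrative assumptions, and floating-point arithmetic limits how far one can push {N}; needless to say, an experiment of this type proves nothing about the conjecture itself.

```python
import math


def dist_to_nearest_int(x: float) -> float:
    """Return ||x||, the distance from x to the nearest integer."""
    return abs(x - round(x))


def littlewood_min(alpha: float, beta: float, n_max: int) -> tuple[float, int]:
    """Minimise n * ||n alpha|| * ||n beta|| over 1 <= n <= n_max.

    Returns the minimal value together with the first n attaining it.
    """
    best_value, best_n = math.inf, 0
    for n in range(1, n_max + 1):
        value = n * dist_to_nearest_int(n * alpha) * dist_to_nearest_int(n * beta)
        if value < best_value:
            best_value, best_n = value, n
    return best_value, best_n


if __name__ == "__main__":
    # Illustrative choice of alpha and beta (an assumption, not from the post).
    alpha, beta = math.sqrt(2), math.sqrt(3)
    for n_max in (10, 100, 1000, 10000):
        value, n = littlewood_min(alpha, beta, n_max)
        print(f"n <= {n_max:5d}: minimum {value:.6f} attained at n = {n}")
```

The Littlewood conjecture for {(\alpha,\beta)} is the assertion that this running minimum tends to zero as {N \rightarrow \infty}.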

Einsiedler, Katok, and Lindenstrauss establish the partial result that the Littlewood conjecture is true for “most” {\alpha,\beta}, in the sense that the set of pairs {(\alpha,\beta)} for which the conjecture fails is a subset of {{\bf R}^2} of Hausdorff dimension zero. This is a much stronger statement than the fact that the Littlewood conjecture holds for almost every {(\alpha,\beta)}, which is fairly easy to prove; being dimension zero is much stronger than having measure zero.

How is this connected with homogeneous dynamics? Fix {\alpha,\beta}, and let us consider the orbit {A^+x} of a point {x} in the homogeneous space

\displaystyle  G/\Gamma := SL_3({\bf R}) / SL_3({\bf Z}),

where {x} is given as

\displaystyle  x := \begin{pmatrix} 1 & 0 & 0 \\ \alpha & 1 & 0 \\ \beta & 0 & 1 \end{pmatrix} SL_3({\bf Z})

and {A^+} is the semigroup

\displaystyle  A^+ := \left\{ \begin{pmatrix} e^{-r-s} & 0 & 0 \\ 0 & e^r & 0 \\ 0 & 0 & e^s \end{pmatrix}: r, s \geq 0 \right\}.

The space {G/\Gamma} has finite volume, but is non-compact, having a “cusp” at infinity. Thus, the orbit {A^+x} need not be bounded, but could wander arbitrarily close to the cusp. The connection with the Littlewood conjecture (first observed, I believe, by Margulis) is then

Lemma 1 The Littlewood conjecture is true for {(\alpha,\beta)} if and only if {A^+x} is unbounded.

Proof: We’ll just prove the “if” part; the “only if” part is proven by a slightly more sophisticated version of the arguments below. One can identify {G/\Gamma} with the space of all unimodular lattices in {{\bf R}^3}, by identifying each point {T SL_3({\bf Z})} of {G/\Gamma} with the lattice {T{\bf Z}^3}. The non-compactness of this space arises from the fact that the shortest non-zero vector in such lattices can become arbitrarily small (or to put it contrapositively, as long as one keeps the non-zero vectors of a unimodular lattice away from zero, the space of such lattices becomes (pre)compact). In particular, {A^+x} is unbounded if and only if the vector

\displaystyle  \begin{pmatrix} e^{-r-s} & 0 & 0 \\ 0 & e^r & 0 \\ 0 & 0 & e^s \end{pmatrix} \begin{pmatrix} 1 & 0 & 0 \\ \alpha & 1 & 0 \\ \beta & 0 & 1 \end{pmatrix} \begin{pmatrix} n \\ m_1 \\ m_2 \end{pmatrix}

where {r,s \geq 0} and {n,m_1,m_2} are integers not all zero, can become arbitrarily small. We can multiply out this vector as

\displaystyle  \begin{pmatrix} ne^{-r-s} \\ e^r(m_1+n\alpha) \\ e^s(m_2+n\beta) \end{pmatrix}.

We conclude that if {A^+x} is unbounded, then for any {\varepsilon > 0}, we can find {r,s \geq 0} and {n,m_1,m_2} (not all zero) such that

\displaystyle  |ne^{-r-s}|, |e^r(m_1+n\alpha)|, |e^s(m_2+n\beta)| \leq \varepsilon^{1/3}.

Since {r,s \geq 0} and {n,m_1,m_2} are not all zero, we easily verify that this forces {n} to be non-zero if {\varepsilon} is small enough. Multiplying all three terms together, we obtain

\displaystyle  |ne^{-r-s}| \times |e^r(m_1+n\alpha)| \times |e^s(m_2+n\beta)| \leq \varepsilon

which simplifies to

\displaystyle  |m_1 + n \alpha| |m_2 + n \beta| \leq \varepsilon/|n|.

Since the left-hand side is at least {\|n\alpha\| \|n\beta\| = \| |n| \alpha\| \| |n| \beta\|}, we conclude (using the positive integer {|n|}) that Littlewood’s conjecture is true for {(\alpha,\beta)}, as required. \Box
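The key algebraic point in the above proof is that the product of the three components of the lattice vector does not depend on {r} and {s}. This can be checked numerically with a small sketch; the helper names `mat_vec` and `orbit_vector` are hypothetical and introduced only for this illustration.

```python
import math


def mat_vec(matrix, vector):
    """Multiply a 3x3 matrix (list of rows) by a vector of length 3."""
    return [sum(matrix[i][j] * vector[j] for j in range(3)) for i in range(3)]


def orbit_vector(alpha, beta, r, s, n, m1, m2):
    """Apply diag(e^{-r-s}, e^r, e^s) to the vector of the lattice x Z^3
    with integer coordinates (n, m1, m2)."""
    unipotent = [[1, 0, 0], [alpha, 1, 0], [beta, 0, 1]]
    diagonal = [
        [math.exp(-r - s), 0, 0],
        [0, math.exp(r), 0],
        [0, 0, math.exp(s)],
    ]
    return mat_vec(diagonal, mat_vec(unipotent, [n, m1, m2]))


if __name__ == "__main__":
    alpha, beta = 0.3, 0.7  # illustrative values only
    n, m1, m2 = 5, -1, -3
    for r, s in [(0.0, 0.0), (1.0, 2.0), (3.0, 0.5)]:
        a, b, c = orbit_vector(alpha, beta, r, s, n, m1, m2)
        # The product is n (m1 + n alpha) (m2 + n beta), whatever r and s are.
        print(r, s, a * b * c)
```

Thus making all three components small by a suitable choice of {r,s} forces {|n| |m_1+n\alpha| |m_2+n\beta|} to be small, which is exactly what the Littlewood conjecture asks for.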

The orbits of semigroups such as {A^+} are still not well understood, in contrast to the orbits of unipotently generated groups {U} for which we have the powerful theorems of Ratner. Nevertheless, by a remarkable analysis of the effects of hyperbolicity on the entropy of an invariant measure, Einsiedler, Katok, and Lindenstrauss are able to obtain some general theorems concerning invariant measures of such semigroups, which give the above consequence for Littlewood’s conjecture, as well as many other applications.

— 2. Ngo Bao Chau —

(Caveat: my understanding of the subject matter here is rather superficial, and so what I write below may be somewhat inaccurate in places. Corrections and clarifications would, of course, be greatly appreciated!)

Ngo Bao Chau has made major contributions to the Langlands programme, which among other things seeks to control the automorphic representations of a (connected, reductive) algebraic group {G} by the Langlands dual group {{}^L G}, in a way which is analogous to how the representation theory of a (locally compact) abelian group {G} is controlled by the Pontryagin dual group {\hat G}. Thus, the Langlands programme can be viewed in some sense as a generalisation of Fourier analysis to non-abelian groups {G}. For instance, the classical Poisson summation formula, that relates summations in an abelian group {G} to summations in its Pontryagin dual {\hat G}, has a vast generalisation in the Selberg trace formula and its generalisations (in particular, the Arthur-Selberg trace formula), which very roughly speaking relates summations (of orbital integrals) over conjugacy classes in {G} with sums over the automorphic representations of {G} (which, by Langlands duality, should be in turn controlled somehow by {{}^L G}).

The Poisson summation formula is closely related to the “functorial” properties of the Fourier transform {f \mapsto \hat f}: every homomorphism {\phi: G \rightarrow H} of locally compact abelian groups induces a corresponding adjoint homomorphism {\phi^*: \hat H \rightarrow \hat G} on the Pontryagin dual groups, and one can interpret Poisson summation as being a consequence of the claim that pullback by the homomorphism {\phi} becomes pushforward by the adjoint homomorphism {\phi^*} after taking Fourier transforms. As I understand it (which is, admittedly, not very well), Langlands functoriality is a nonabelian generalisation of this Fourier functoriality. However, the notions of pushforward and pullback become much more complicated in the nonabelian world, as one is working on things such as conjugacy classes and representations, rather than individual elements of the groups involved. One can define these notions relatively easily in a “local” fashion, working one place at a time, but to glue everything together properly into a “global” setting (in which one is working over an adele ring) in such a way that the relevant trace formulae remain compatible requires the fundamental lemma, the establishment of which was then a major goal of the Langlands programme (as it makes the trace formula significantly more useful, particularly for applications to number theory). In 2008, Ngo established this lemma in full generality, building upon several special cases and previous reductions by other mathematicians (including an earlier breakthrough paper of Laumon and Ngo), and also importing a major new tool from geometry and mathematical physics, namely that of a Hitchin fibration. Apparently, a key observation of Ngo is that the sums over conjugacy classes that appear in the trace formula can be naturally interpreted in terms of some geometric data associated to such fibrations. 
This is a deep connection which not only gives the fundamental lemma (after a lot of difficult work), but also gives a better understanding of the Langlands programme as a whole.

— 3. Stas Smirnov —

Stas Smirnov works in a number of related fields, and in particular in complex dynamics and in statistical physics. One of his celebrated results in the latter area is the first rigorous demonstration of conformal invariance for the scaling limit of a percolation model, namely that of percolation on the triangular lattice; this invariance had been conjectured by physicists for some time (being part of a larger philosophy of universality, that asserts that the large-scale behaviour of a statistical system should be largely insensitive to the precise small-scale geometry of that system, after normalising some key parameters). Smirnov’s result, especially when combined with the theory of the Schramm-Loewner evolution (SLE) developed by Lawler, Schramm, and Werner which classifies conformally invariant processes, allows for a rigorous analysis of many random processes in statistical physics, confirming the general intuition of universality, although the underlying explanation for the universality phenomenon is still lacking. (For instance, we still do not have an analogue of Stas’s result for any lattice other than the triangular one.) Note that this universality phenomenon is not directly related to the universality phenomenon observed in random matrix theory (which has guided much of my own recent research with Van Vu), though there are certainly some superficial similarities between the two.

Smirnov’s full result of conformal invariance for the triangular lattice is a bit tricky to state, but it is a bit easier to state a simpler but still highly non-trivial consequence of that result, namely Cardy’s formula for crossing probabilities in the scaling limit. As observed by Carleson, this formula is easiest to state when the domain is a unit equilateral triangle {ABC}, though it is an important consequence of Stas’s conformal invariance result (first conjectured by Aizenman) that it can in fact be phrased for any simply connected domain in the plane.

Pick a large natural number {n}, and subdivide the original unit equilateral triangle into {n^2} subtriangles of sidelength {1/n}. This creates a triangular lattice on {\frac{(n+1)(n+2)}{2}} (the {(n+1)^{th}} triangular number, naturally) vertices or sites. Now randomly colour each site blue or yellow (say), with an independent {1/2} probability of each; this is the critical probability for this lattice, in which neither colour dominates, and thus gives the most interesting behaviour. The site colouring then separates the {\frac{(n+1)(n+2)}{2}} vertices into blue and yellow clusters, with two vertices of the same colour belonging to the same cluster if they are connected by a path in the lattice that only goes through vertices of that colour.

The size, shape, and other geometrical properties of clusters in such lattice models, especially in the scaling limit {n \rightarrow \infty}, are the main subject of study of percolation theory. There are many such aspects to this theory, but let us focus on a relatively simple aspect, namely the crossing probability between two line segments on the boundary of the triangle {ABC}, say {AB} and {CD}, where {D} is a site on the line segment {CB}. This crossing probability is defined to be the probability that there is a blue (say) cluster connecting {AB} with {CD}, or equivalently that there is a blue path that starts at {AB} and ends at {CD}. Clearly, for fixed {n}, this probability is an increasing function of the length {|CD|} of the line segment {CD}, which equals {1} when {D} is equal to {B}, and which we expect to be close to zero at the other extreme when {D} is equal to {C}. Cardy’s formula is the assertion that in the limit {n \rightarrow\infty}, the crossing probability is asymptotically equal to the length {|CD|} of the interval. (There is a similar formula when one replaces the triangle {ABC} by a more general simply connected domain (while still keeping the small-scale triangular lattice structure), and replaces {AB} and {CD} by two arcs on the boundary of that domain, but then one has to apply the Riemann mapping theorem to convert that domain back into the equilateral triangle.)
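The crossing probability described above is easy to estimate by Monte Carlo simulation. The following is a minimal sketch under some conventions of my own choosing (they are assumptions, not part of Smirnov’s argument): sites are labelled by pairs {(i,j)} with {i,j \geq 0} and {i+j \leq n}, with {A=(0,0)}, {B=(n,0)}, {C=(0,n)}, and {D = (d,n-d)}, so that {|CD| = d/n}. All function names are hypothetical.

```python
import random
from collections import deque

# The six neighbours of a site of the triangular lattice in (i, j) coordinates.
NEIGHBOURS = [(1, 0), (-1, 0), (0, 1), (0, -1), (1, -1), (-1, 1)]


def sites(n):
    """All (n+1)(n+2)/2 sites (i, j) of the subdivided triangle."""
    return [(i, j) for i in range(n + 1) for j in range(n + 1 - i)]


def random_colouring(n, rng):
    """Colour each site blue (True) or yellow (False) with probability 1/2."""
    return {site: rng.random() < 0.5 for site in sites(n)}


def has_blue_crossing(n, blue, d):
    """Breadth-first search for a blue path from side AB (j == 0) to the
    segment CD, i.e. the sites (i, n - i) with 0 <= i <= d."""
    start = [(i, 0) for i in range(n + 1) if blue[(i, 0)]]
    seen = set(start)
    queue = deque(start)
    while queue:
        i, j = queue.popleft()
        if i + j == n and i <= d:
            return True
        for di, dj in NEIGHBOURS:
            site = (i + di, j + dj)
            if blue.get(site, False) and site not in seen:
                seen.add(site)
                queue.append(site)
    return False


def crossing_probability(n, d, trials, seed=0):
    """Monte Carlo estimate of the probability of a blue crossing AB -> CD."""
    rng = random.Random(seed)
    hits = sum(
        has_blue_crossing(n, random_colouring(n, rng), d) for _ in range(trials)
    )
    return hits / trials


if __name__ == "__main__":
    n, trials = 30, 500  # small illustrative parameters
    for d in (6, 15, 24):
        estimate = crossing_probability(n, d, trials)
        print(f"|CD| = {d / n:.2f}, estimated crossing probability = {estimate:.3f}")
```

Cardy’s formula predicts that the estimates approach {|CD|} as {n \rightarrow \infty}; for modest {n} one should expect both sampling error and boundary effects.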

The first step in Smirnov’s proof of Cardy’s formula is to consider a two-dimensional extension of the crossing probability, namely the function {h(z)}, defined for {z} in the solid triangle {ABC} as the probability that there is a blue path from {AB} to {CB} that separates {z} from {AC}. Note that when {z} is a boundary point {D} on the edge {CB}, then {h(z)} is basically the crossing probability from {AB} to {CD} (ignoring some boundary effects which are negligible in the asymptotic limit {n \rightarrow \infty}); thus the crossing probability arises as the boundary values of {h}. Smirnov was able to show that the function {h} is asymptotically harmonic and obeys some natural Dirichlet-Neumann boundary conditions, which ultimately gives Cardy’s formula. By pushing this analysis much further, these methods also eventually give the conformal invariance of the entire triangular lattice percolation process.

— 4. Cedric Villani —

Cedric Villani works in several areas of mathematical physics, and particularly in the rigorous theory of continuum mechanics equations such as the Boltzmann equation.

Imagine a gas consisting of a large number of particles traveling at various velocities. To begin with, let us take a ridiculously oversimplified discrete model and suppose that there are only four distinct velocities that the particles can have, namely {v_1, v_2, v_3}, and {v_4}. Let us also make the homogeneity assumption that the distribution of velocities of the gas is independent of the position; then the distribution of the gas at any given time {t} can be described by four densities {f(t,v_1), f(t,v_2), f(t,v_3), f(t,v_4)} adding up to {1}, which describe the proportion of the gas that is currently traveling at velocities {v_1}, etc.

If there were no collisions between the particles that could transfer velocity from one particle to another, then all the quantities {f(t,v_i)} would be constant in time: {\frac{\partial}{\partial t} f(t,v_i) = 0}. But suppose that there is a collision reaction that can take two particles traveling at velocities {v_1, v_2} and change their velocities to {v_3, v_4}, or vice versa, and that no other collision reactions are possible. Making the key heuristic assumption that different particles are distributed more or less independently in space for the purposes of computing the rate of collision (this hypothesis is also known as the molecular chaos or Stosszahlansatz hypothesis), the rate at which the former type of collision occurs will be proportional to {f(t,v_1) f(t,v_2)}, while the rate at which the latter type of collision occurs is proportional to {f(t,v_3) f(t,v_4)}. This leads to equations of motion such as

\displaystyle  \frac{\partial}{\partial t} f(t,v_1) = \kappa ( f(t,v_3) f(t,v_4) - f(t,v_1) f(t,v_2) )

for some rate constant {\kappa > 0}, and similarly for {f(t,v_2)}, {f(t,v_3)}, and {f(t,v_4)}. It is interesting to note that even in this simplified model, we see the emergence of an “arrow of time”: the rate of a collision is determined by the density of the initial velocities rather than the final ones, and so the system is not time reversible, despite being a statistical limit of a time-reversible collision from the velocities {v_1,v_2} to {v_3,v_4} and vice versa.
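The four-velocity model is simple enough to integrate numerically. Here is a minimal forward Euler sketch, under the assumption (natural from the symmetry of the collision, but not spelled out above) that {f(t,v_2)} obeys the same equation as {f(t,v_1)}, while {f(t,v_3)} and {f(t,v_4)} obey the equation with the opposite sign. The initial data, step size, and function names are illustrative choices only.

```python
import math


def step(f, kappa, dt):
    """One forward Euler step of the four-velocity model."""
    f1, f2, f3, f4 = f
    # Net rate of the reaction (v3, v4) -> (v1, v2), multiplied by dt.
    gain = kappa * (f3 * f4 - f1 * f2) * dt
    return (f1 + gain, f2 + gain, f3 - gain, f4 - gain)


def h_functional(f):
    """Discrete analogue of the H functional: sum of f log f."""
    return sum(x * math.log(x) for x in f if x > 0)


def simulate(f0, kappa=1.0, dt=0.01, steps=2000):
    """Return the list of states f(0), f(dt), f(2 dt), ..."""
    history = [tuple(f0)]
    for _ in range(steps):
        history.append(step(history[-1], kappa, dt))
    return history


if __name__ == "__main__":
    history = simulate((0.4, 0.3, 0.2, 0.1))
    for k in (0, 100, 500, 2000):
        f = history[k]
        imbalance = f[2] * f[3] - f[0] * f[1]
        print(f"t = {k * 0.01:5.2f}  f = {f}  "
              f"f3 f4 - f1 f2 = {imbalance:.2e}  H = {h_functional(f):.6f}")
```

Running this, one sees the densities relax towards a state in which {f(t,v_1) f(t,v_2) = f(t,v_3) f(t,v_4)}, with the discrete {H} functional {\sum_i f(t,v_i) \log f(t,v_i)} decreasing along the way; running the same scheme backwards in time does not retrace a solution of the same equations, which is the irreversibility alluded to above.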

To take a less ridiculously oversimplified model, now suppose that particles can take a continuum of velocities, but that we still make the homogeneity assumption that the velocity distribution is independent of position, so that the state is now described by a density function {f(t,v)}, with {v} now ranging continuously over {{\bf R}^3}. There is now a continuum of possible collisions, in which two particles of initial velocity {v', v'_*} (say) collide and emerge with velocities {v, v_*}. If we assume purely elastic collisions between particles of identical mass {m}, then we have the law of conservation of momentum

\displaystyle  mv' + mv'_* = mv + mv_*

and conservation of energy

\displaystyle  \frac{1}{2} m |v'|^2 + \frac{1}{2} m |v'_*|^2 = \frac{1}{2} m |v|^2 + \frac{1}{2} m |v_*|^2.

Some simple Euclidean geometry then shows that the pre-collision velocities {v', v'_*} must be related to the post-collision velocities {v, v_*} by the formulae

\displaystyle  v' = \frac{v+v_*}{2} + \frac{|v-v_*|}{2} \sigma; \quad v'_* = \frac{v+v_*}{2} - \frac{|v-v_*|}{2} \sigma \ \ \ \ \ (1)

for some unit vector {\sigma \in S^2}. Thus a collision can be completely described by the post-collision velocities {v,v_* \in {\bf R}^3} and the pre-collision direction vector {\sigma \in S^2}; assuming Galilean invariance, the physical features of this collision can in fact be described just using the relative post-collision velocity {v-v_*} and the pre-collision direction vector {\sigma}. Using the same independence heuristics used in the four velocities model, we are then led to the equation of motion

\displaystyle  \frac{\partial}{\partial t} f(t,v) = Q(f,f)(t,v)

where {Q(f,f)} is the quadratic expression

\displaystyle  Q(f,f)(t,v) := \int_{{\bf R}^3} \int_{S^2} (f(t,v') f(t,v'_*) - f(t,v) f(t,v_*)) B(v-v_*,\sigma) dv_* d\sigma

for some Boltzmann collision kernel {B(v-v_*,\sigma) > 0}, which depends on the physical nature of the hard spheres, and needs to be specified as part of the dynamics. Here of course {v', v'_*} are given by (1).
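As a sanity check on the parameterisation (1), one can verify numerically that for any unit vector {\sigma}, the velocities {v', v'_*} given by (1) have the same total momentum and energy as {v, v_*} (the common mass {m} cancels, so it is omitted). This is only a minimal sketch; the helper names are hypothetical and the sample velocities are arbitrary.

```python
import math
import random


def random_unit_vector(rng):
    """Sample a unit vector in R^3 by normalising a Gaussian vector."""
    while True:
        x = [rng.gauss(0.0, 1.0) for _ in range(3)]
        norm = math.sqrt(sum(c * c for c in x))
        if norm > 1e-12:
            return [c / norm for c in x]


def pre_collision(v, v_star, sigma):
    """Pre-collision velocities v', v'_* given by formula (1)."""
    centre = [(a + b) / 2 for a, b in zip(v, v_star)]
    half_speed = math.dist(v, v_star) / 2  # |v - v_*| / 2
    v_prime = [c + half_speed * s for c, s in zip(centre, sigma)]
    v_prime_star = [c - half_speed * s for c, s in zip(centre, sigma)]
    return v_prime, v_prime_star


def energy(u, w):
    """Twice the kinetic energy per unit mass: |u|^2 + |w|^2."""
    return sum(c * c for c in u) + sum(c * c for c in w)


if __name__ == "__main__":
    rng = random.Random(0)
    v, v_star = [1.0, -2.0, 0.5], [0.3, 0.4, -1.2]  # arbitrary sample velocities
    v_prime, v_prime_star = pre_collision(v, v_star, random_unit_vector(rng))
    print("momentum before:", [a + b for a, b in zip(v_prime, v_prime_star)])
    print("momentum after: ", [a + b for a, b in zip(v, v_star)])
    print("energy before:", energy(v_prime, v_prime_star))
    print("energy after: ", energy(v, v_star))
```

Geometrically, (1) says that {v'} and {v'_*} are antipodal points on the sphere with diameter joining {v} and {v_*}, which is why both conservation laws hold for every choice of {\sigma}.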

If one now allows the velocity distribution to depend on position {x \in \Omega} in a domain {\Omega \subset {\bf R}^3}, so that the density function is now {f(t,x,v)}, then one has to combine the above equation with a transport equation, leading to the Boltzmann equation

\displaystyle  \frac{\partial}{\partial t} f + v \cdot \nabla_x f = Q(f,f),

together with some boundary conditions on the spatial boundary {\partial \Omega} that will not be discussed here.

One of the most fundamental facts about this equation is the Boltzmann H theorem, which asserts that (given sufficient regularity and integrability hypotheses on {f}, and reasonable boundary conditions), the {H}-functional

\displaystyle  H(f)(t) := \int_{{\bf R}^3} \int_\Omega f(t,x,v) \log f(t,x,v)\ dx dv

is non-increasing in time, with {H} remaining constant if and only if the density function {f} is Gaussian in {v} at each position {x} (but with the mass, mean and variance of the Gaussian distribution being allowed to vary in {x}). Such distributions are known as Maxwellian distributions.

From a physical perspective, {H} is the negative of the entropy of the system, so the H theorem is a manifestation of the second law of thermodynamics, which asserts that the entropy of a system is non-decreasing in time, thus clearly demonstrating the “arrow of time” mentioned earlier.
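The mechanism behind the H theorem is already visible in the four-velocity toy model discussed earlier. Here is a sketch of the computation, under the assumption that {f(t,v_2)} obeys the same equation of motion as {f(t,v_1)}, while {f(t,v_3)} and {f(t,v_4)} obey it with the opposite sign (so that the total mass {\sum_i f(t,v_i)} is conserved); I abbreviate {f_i := f(t,v_i)} and write {H := \sum_i f_i \log f_i} for the discrete analogue of the {H}-functional.

```latex
\begin{align*}
\frac{d}{dt} H
  &= \sum_{i=1}^4 (1 + \log f_i) \frac{\partial}{\partial t} f_i
   = \sum_{i=1}^4 \log f_i \, \frac{\partial}{\partial t} f_i \\
  &= \kappa ( f_3 f_4 - f_1 f_2 ) ( \log f_1 + \log f_2 - \log f_3 - \log f_4 ) \\
  &= - \kappa ( f_3 f_4 - f_1 f_2 ) ( \log (f_3 f_4) - \log (f_1 f_2) ) \leq 0.
\end{align*}
```

The final inequality holds because the logarithm is increasing, so that the two factors {f_3 f_4 - f_1 f_2} and {\log(f_3 f_4) - \log(f_1 f_2)} always have the same sign; equality occurs precisely when {f_1 f_2 = f_3 f_4}, which is the analogue of a Maxwellian in this toy model.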

There are considerable technical issues in ensuring that the derivation of the H theorem is actually rigorous for reasonable regularity hypotheses on {f} (and on {B}), in large part due to the delicate and somewhat singular nature of “grazing collisions” when the pre-collision and post-collision velocities are very close to each other. Important work was done by Villani and his co-authors on resolving this issue, but this is not the result I want to focus on here. Instead, I want to discuss the long-time behaviour of the Boltzmann equation.

As the {H} functional always decreases until a Maxwellian distribution is attained, it is then reasonable to conjecture that the density function {f} must converge (in some suitable topology) to a Maxwellian distribution. Furthermore, even though the {H} theorem allows the Maxwellian distribution to be non-homogeneous in space, the transportation aspects of the Boltzmann equation should serve to homogenise the spatial behaviour, so that the limiting distribution should in fact be a homogeneous Maxwellian. In a remarkable 72-page tour de force, Desvillettes and Villani showed that (under some strong regularity assumptions) this was indeed the case, and furthermore that the convergence to the Maxwellian distribution was quite fast, faster than any polynomial rate of decay in fact. Remarkably, this was a large data result, requiring no perturbative hypotheses on the initial distribution (although a fair amount of regularity was needed). As is usual in PDE, large data results are considerably more difficult due to the lack of perturbative techniques that are initially available; instead, one has to primarily rely on such tools as conservation laws and monotonicity formulae. One of the main tools used here is a quantitative version of the H theorem (also obtained by Villani), but this is not enough; the quantitative bounds on entropy production given by the H theorem involve quantities other than the entropy, for which further equations of motion (or more precisely, differential inequalities on their rate of change) must be found, by means of various inequalities from harmonic analysis and information theory. This ultimately leads to a finite-dimensional system of ordinary differential inequalities that control all the key quantities of interest, which must then be solved to obtain the required convergence.