You are currently browsing the category archive for the ‘obituary’ category.

I am totally stunned to learn that Maryam Mirzakhani died today yesterday, aged 40, after a severe recurrence of the cancer she had been fighting for several years.  I had planned to email her some wishes for a speedy recovery after learning about the relapse yesterday; I still can’t fully believe that she didn’t make it.

My first encounter with Maryam was in 2010, when I was giving some lectures at Stanford – one on Perelman’s proof of the Poincare conjecture, and another on random matrix theory.  I remember a young woman sitting in the front who asked perceptive questions at the end of both talks; it was only afterwards that I learned that it was Mirzakhani.  (I really wish I could remember exactly what the questions were, but I vaguely recall that she managed to put a nice dynamical systems interpretation on both of the topics of my talks.)

After she won the Fields medal in 2014 (as I posted about previously on this blog), we corresponded for a while.  The Fields medal is of course one of the highest honours one can receive in mathematics, and it clearly advances one’s career enormously; but it also comes with a huge initial burst of publicity, a marked increase in the number of responsibilities to the field one is requested to take on, and a strong expectation to serve as a public role model for mathematicians.  As the first female recipient of the medal, and also the first to come from Iran, Maryam was experiencing these pressures to a far greater extent than previous medallists, while also raising a small daughter and fighting off cancer.  I gave her what advice I could on these matters (mostly that it was acceptable – and in fact necessary – to say “no” to the vast majority of requests one receives).

Given all this, it is remarkable how productive she still was mathematically in the last few years.  Perhaps her greatest recent achievement has been her “magic wandtheorem with Alex Eskin, which is basically the analogue of the famous measure classification and orbit closure theorems of Marina Ratner, in the context of moduli spaces instead of unipotent flows on homogeneous spaces.  (I discussed Ratner’s theorems in this previous post.  By an unhappy coincidence, Ratner also passed away this month, aged 78.)  Ratner’s theorems are fundamentally important to any problem to which a homogeneous dynamical system can be associated (for instance, a special case of that theorem shows up in my work with Ben Green and Tamar Ziegler on the inverse conjecture for the Gowers norms, and on linear equations in primes), as it gives a good description of the equidistribution of any orbit of that system (if it is unipotently generated); and it seems the Eskin-Mirzakhani result will play a similar role in problems associated instead to moduli spaces.  The remarkable proof of this result – which now stands at over 200 pages, after three years of revision and updating – uses almost all of the latest techniques that had been developed for homogeneous dynamics, and ingeniously adapts them to the more difficult setting of moduli spaces, in a manner that had not been dreamed of being possible only a few years earlier.

Maryam was an amazing mathematician and also a wonderful and humble human being, who was at the peak of her powers.  Today was a huge loss for Maryam’s family and friends, as well as for mathematics.

[EDIT, Jul 16: New York times obituary here.]

[EDIT, Jul 18: New Yorker memorial here.]


Klaus Roth, who made fundamental contributions to analytic number theory, died this Tuesday, aged 90.

I never met or communicated with Roth personally, but was certainly influenced by his work; he wrote relatively few papers, but they tended to have outsized impact. For instance, he was one of the key people (together with Bombieri) to work on simplifying and generalising the large sieve, taking it from the technically formidable original formulation of Linnik and Rényi to the clean and general almost orthogonality principle that we have today (discussed for instance in these lecture notes of mine). The paper of Roth that had the most impact on my own personal work was his three-page paper proving what is now known as Roth’s theorem on arithmetic progressions:

Theorem 1 (Roth’s theorem on arithmetic progressions) Let {A} be a set of natural numbers of positive upper density (thus {\limsup_{N \rightarrow\infty} |A \cap \{1,\dots,N\}|/N > 0}). Then {A} contains infinitely many arithmetic progressions {a,a+r,a+2r} of length three (with {r} non-zero of course).

At the heart of Roth’s elegant argument was the following (surprising at the time) dichotomy: if {A} had some moderately large density within some arithmetic progression {P}, either one could use Fourier-analytic methods to detect the presence of an arithmetic progression of length three inside {A \cap P}, or else one could locate a long subprogression {P'} of {P} on which {A} had increased density. Iterating this dichotomy by an argument now known as the density increment argument, one eventually obtains Roth’s theorem, no matter which side of the dichotomy actually holds. This argument (and the many descendants of it), based on various “dichotomies between structure and randomness”, became essential in many other results of this type, most famously perhaps in Szemerédi’s proof of his celebrated theorem on arithmetic progressions that generalised Roth’s theorem to progressions of arbitrary length. More recently, my recent work on the Chowla and Elliott conjectures that was a crucial component of the solution of the Erdös discrepancy problem, relies on an entropy decrement argument which was directly inspired by the density increment argument of Roth.

The Erdös discrepancy problem also is connected with another well known theorem of Roth:

Theorem 2 (Roth’s discrepancy theorem for arithmetic progressions) Let {f(1),\dots,f(n)} be a sequence in {\{-1,+1\}}. Then there exists an arithmetic progression {a+r, a+2r, \dots, a+kr} in {\{1,\dots,n\}} with {r} positive such that

\displaystyle  |\sum_{j=1}^k f(a+jr)| \geq c n^{1/4}

for an absolute constant {c>0}.

In fact, Roth proved a stronger estimate regarding mean square discrepancy, which I am not writing down here; as with the Roth theorem in arithmetic progressions, his proof was short and Fourier-analytic in nature (although non-Fourier-analytic proofs have since been found, for instance the semidefinite programming proof of Lovasz). The exponent {1/4} is known to be sharp (a result of Matousek and Spencer).

As a particular corollary of the above theorem, for an infinite sequence {f(1), f(2), \dots} of signs, the sums {|\sum_{j=1}^k f(a+jr)|} are unbounded in {a,r,k}. The Erdös discrepancy problem asks whether the same statement holds when {a} is restricted to be zero. (Roth also established discrepancy theorems for other sets, such as rectangles, which will not be discussed here.)

Finally, one has to mention Roth’s most famous result, cited for instance in his Fields medal citation:

Theorem 3 (Roth’s theorem on Diophantine approximation) Let {\alpha} be an irrational algebraic number. Then for any {\varepsilon > 0} there is a quantity {c_{\alpha,\varepsilon}} such that

\displaystyle  |\alpha - \frac{a}{q}| > \frac{c_{\alpha,\varepsilon}}{q^{2+\varepsilon}}.

From the Dirichlet approximation theorem (or from the theory of continued fractions) we know that the exponent {2+\varepsilon} in the denominator cannot be reduced to {2} or below. A classical and easy theorem of Liouville gives the claim with the exponent {2+\varepsilon} replaced by the degree of the algebraic number {\alpha}; work of Thue and Siegel reduced this exponent, but Roth was the one who obtained the near-optimal result. An important point is that the constant {c_{\alpha,\varepsilon}} is ineffective – it is a major open problem in Diophantine approximation to produce any bound significantly stronger than Liouville’s theorem with effective constants. This is because the proof of Roth’s theorem does not exclude any single rational {a/q} from being close to {\alpha}, but instead very ingeniously shows that one cannot have two different rationals {a/q}, {a'/q'} that are unusually close to {\alpha}, even when the denominators {q,q'} are very different in size. (I refer to this sort of argument as a “dueling conspiracies” argument; they are strangely prevalent throughout analytic number theory.)

Lars Hörmander, who made fundamental contributions to all areas of partial differential equations, but particularly in developing the analysis of variable-coefficient linear PDE, died last Sunday, aged 81.

I unfortunately never met Hörmander personally, but of course I encountered his work all the time while working in PDE. One of his major contributions to the subject was to systematically develop the calculus of Fourier integral operators (FIOs), which are a substantial generalisation of pseudodifferential operators and which can be used to (approximately) solve linear partial differential equations, or to transform such equations into a more convenient form. Roughly speaking, Fourier integral operators are to linear PDE as canonical transformations are to Hamiltonian mechanics (and one can in fact view FIOs as a quantisation of a canonical transformation). They are a large class of transformations, for instance the Fourier transform, pseudodifferential operators, and smooth changes of the spatial variable are all examples of FIOs, and (as long as certain singular situations are avoided) the composition of two FIOs is again an FIO.

The full theory of FIOs is quite extensive, occupying the entire final volume of Hormander’s famous four-volume series “The Analysis of Linear Partial Differential Operators”. I am certainly not going to try to attempt to summarise it here, but I thought I would try to motivate how these operators arise when trying to transform functions. For simplicity we will work with functions {f \in L^2({\bf R}^n)} on a Euclidean domain {{\bf R}^n} (although FIOs can certainly be defined on more general smooth manifolds, and there is an extension of the theory that also works on manifolds with boundary). As this will be a heuristic discussion, we will ignore all the (technical, but important) issues of smoothness or convergence with regards to the functions, integrals and limits that appear below, and be rather vague with terms such as “decaying” or “concentrated”.

A function {f \in L^2({\bf R}^n)} can be viewed from many different perspectives (reflecting the variety of bases, or approximate bases, that the Hilbert space {L^2({\bf R}^n)} offers). Most directly, we have the physical space perspective, viewing {f} as a function {x \mapsto f(x)} of the physical variable {x \in {\bf R}^n}. In many cases, this function will be concentrated in some subregion {\Omega} of physical space. For instance, a gaussian wave packet

\displaystyle  f(x) = A e^{-(x-x_0)^2/\hbar} e^{i \xi_0 \cdot x/\hbar}, \ \ \ \ \ (1)

where {\hbar > 0}, {A \in {\bf C}} and {x_0, \xi_0 \in {\bf R}^n} are parameters, would be physically concentrated in the ball {B(x_0,\sqrt{\hbar})}. Then we have the frequency space (or momentum space) perspective, viewing {f} now as a function {\xi \mapsto \hat f(\xi)} of the frequency variable {\xi \in {\bf R}^n}. For this discussion, it will be convenient to normalise the Fourier transform using a small constant {\hbar > 0} (which has the physical interpretation of Planck’s constant if one is doing quantum mechanics), thus

\displaystyle  \hat f(\xi) := \frac{1}{(2\pi \hbar)^{n/2}} \int_{\bf R} e^{-i\xi \cdot x/\hbar} f(x)\ dx.

For instance, for the gaussian wave packet (1), one has

\displaystyle  \hat f(\xi) = A e^{i\xi_0 \cdot x_0/\hbar} e^{-(\xi-\xi_0)^2/\hbar} e^{-i \xi \cdot x_0/\hbar},

and so we see that {f} is concentrated in frequency space in the ball {B(\xi_0,\sqrt{\hbar})}.

However, there is a third (but less rigorous) way to view a function {f} in {L^2({\bf R}^n)}, which is the phase space perspective in which one tries to view {f} as distributed simultaneously in physical space and in frequency space, thus being something like a measure on the phase space {T^* {\bf R}^n := \{ (x,\xi): x, \xi \in {\bf R}^n\}}. Thus, for instance, the function (1) should heuristically be concentrated on the region {B(x_0,\sqrt{\hbar}) \times B(\xi_0,\sqrt{\hbar})} in phase space. Unfortunately, due to the uncertainty principle, there is no completely satisfactory way to canonically and rigorously define what the “phase space portrait” of a function {f} should be. (For instance, the Wigner transform of {f} can be viewed as an attempt to describe the distribution of the {L^2} energy of {f} in phase space, except that this transform can take negative or even complex values; see Folland’s book for further discussion.) Still, it is a very useful heuristic to think of functions has having a phase space portrait, which is something like a non-negative measure on phase space that captures the distribution of functions in both space and frequency, albeit with some “quantum fuzziness” that shows up whenever one tries to inspect this measure at scales of physical space and frequency space that together violate the uncertainty principle. (The score of a piece of music is a good everyday example of a phase space portrait of a function, in this case a sound wave; here, the physical space is the time axis (the horizontal dimension of the score) and the frequency space is the vertical dimension. Here, the time and frequency scales involved are well above the uncertainty principle limit (a typical note lasts many hundreds of cycles, whereas the uncertainty principle kicks in at {O(1)} cycles) and so there is no obstruction here to musical notation being unambiguous.) Furthermore, if one takes certain asymptotic limits, one can recover a precise notion of a phase space portrait; for instance if one takes the semiclassical limit {\hbar \rightarrow 0} then, under certain circumstances, the phase space portrait converges to a well-defined classical probability measure on phase space; closely related to this is the high frequency limit of a fixed function, which among other things defines the wave front set of that function, which can be viewed as another asymptotic realisation of the phase space portrait concept.

If functions in {L^2({\bf R}^n)} can be viewed as a sort of distribution in phase space, then linear operators {T: L^2({\bf R}^n) \rightarrow L^2({\bf R}^n)} should be viewed as various transformations on such distributions on phase space. For instance, a pseudodifferential operator {a(X,D)} should correspond (as a zeroth approximation) to multiplying a phase space distribution by the symbol {a(x,\xi)} of that operator, as discussed in this previous blog post. Note that such operators only change the amplitude of the phase space distribution, but not the support of that distribution.

Now we turn to operators that alter the support of a phase space distribution, rather than the amplitude; we will focus on unitary operators to emphasise the amplitude preservation aspect. These will eventually be key examples of Fourier integral operators. A physical translation {Tf(x) := f(x-x_0)} should correspond to pushing forward the distribution by the transformation {(x,\xi) \mapsto (x+x_0,\xi)}, as can be seen by comparing the physical and frequency space supports of {Tf} with that of {f}. Similarly, a frequency modulation {Tf(x) := e^{i \xi_0 \cdot x/\hbar} f(x)} should correspond to the transformation {(x,\xi) \mapsto (x,\xi+\xi_0)}; a linear change of variables {Tf(x) := |\hbox{det} L|^{-1/2} f(L^{-1} x)}, where {L: {\bf R}^n \rightarrow {\bf R}^n} is an invertible linear transformation, should correspond to {(x,\xi) \mapsto (Lx, (L^*)^{-1} \xi)}; and finally, the Fourier transform {Tf(x) := \hat f(x)} should correspond to the transformation {(x,\xi) \mapsto (\xi,-x)}.

Based on these examples, one may hope that given any diffeomorphism {\Phi: T^* {\bf R}^n \rightarrow T^* {\bf R}^n} of phase space, one could associate some sort of unitary (or approximately unitary) operator {T_\Phi: L^2({\bf R}^n) \rightarrow L^2({\bf R}^n)}, which (heuristically, at least) pushes the phase space portrait of a function forward by {\Phi}. However, there is an obstruction to doing so, which can be explained as follows. If {T_\Phi} pushes phase space portraits by {\Phi}, and pseudodifferential operators {a(X,D)} multiply phase space portraits by {a}, then this suggests the intertwining relationship

\displaystyle  a(X,D) T_\Phi \approx T_\Phi (a \circ \Phi)(X,D),

and thus {(a \circ \Phi)(X,D)} is approximately conjugate to {a(X,D)}:

\displaystyle  (a \circ \Phi)(X,D) \approx T_\Phi^{-1} a(X,D) T_\Phi. \ \ \ \ \ (2)

The formalisation of this fact in the theory of Fourier integral operators is known as Egorov’s theorem, due to Yu Egorov (and not to be confused with the more widely known theorem of Dmitri Egorov in measure theory).

Applying commutators, we conclude the approximate conjugacy relationship

\displaystyle  \frac{1}{i\hbar} [(a \circ \Phi)(X,D), (b \circ \Phi)(X,D)] \approx T_\Phi^{-1} \frac{1}{i\hbar} [a(X,D), b(X,D)] T_\Phi.

Now, the pseudodifferential calculus (as discussed in this previous post) tells us (heuristically, at least) that

\displaystyle  \frac{1}{i\hbar} [a(X,D), b(X,D)] \approx \{ a, b \}(X,D)


\displaystyle  \frac{1}{i\hbar} [(a \circ \Phi)(X,D), (b \circ \Phi)(X,D)] \approx \{ a \circ \Phi, b \circ \Phi \}(X,D)

where {\{,\}} is the Poisson bracket. Comparing this with (2), we are then led to the compatibility condition

\displaystyle  \{ a \circ \Phi, b \circ \Phi \} \approx \{ a, b \} \circ \Phi,

thus {\Phi} needs to preserve (approximately, at least) the Poisson bracket, or equivalently {\Phi} needs to be a symplectomorphism (again, approximately at least).

Now suppose that {\Phi: T^* {\bf R}^n \rightarrow T^* {\bf R}^n} is a symplectomorphism. This is morally equivalent to the graph {\Sigma := \{ (z, \Phi(z)): z \in T^* {\bf R}^n \}} being a Lagrangian submanifold of {T^* {\bf R}^n \times T^* {\bf R}^n} (where we give the second copy of phase space the negative {-\omega} of the usual symplectic form {\omega}, thus yielding {\omega \oplus -\omega} as the full symplectic form on {T^* {\bf R}^n \times T^* {\bf R}^n}; this is another instantiation of the closed graph theorem, as mentioned in this previous post. This graph is known as the canonical relation for the (putative) FIO that is associated to {\Phi}. To understand what it means for this graph to be Lagrangian, we coordinatise {T^* {\bf R}^n \times T^* {\bf R}^n} as {(x,\xi,y,\eta)} suppose temporarily that this graph was (locally, at least) a smooth graph in the {x} and {y} variables, thus

\displaystyle  \Sigma = \{ (x, F(x,y), y, G(x,y)): x, y \in {\bf R}^n \}

for some smooth functions {F, G: {\bf R}^n \rightarrow {\bf R}^n}. A brief computation shows that the Lagrangian property of {\Sigma} is then equivalent to the compatibility conditions

\displaystyle  \frac{\partial F_i}{\partial x_j} = \frac{\partial F_j}{\partial x_i}

\displaystyle  \frac{\partial G_i}{\partial y_j} = \frac{\partial G_j}{\partial y_i}

\displaystyle  \frac{\partial F_i}{\partial y_j} = - \frac{\partial G_j}{\partial x_i}

for {i,j=1,\ldots,n}, where {F_1,\ldots,F_n, G_1,\ldots,G_n} denote the components of {F,G}. Some Fourier analysis (or Hodge theory) lets us solve these equations as

\displaystyle  F_i = -\frac{\partial \phi}{\partial x_i}; \quad G_j = \frac{\partial \phi}{\partial y_j}

for some smooth potential function {\phi: {\bf R}^n \times {\bf R}^n \rightarrow {\bf R}}. Thus, we have parameterised our graph {\Sigma} as

\displaystyle  \Sigma = \{ (x, -\nabla_x \phi(x,y), y, \nabla_y \phi(x,y)): x,y \in {\bf R}^n \} \ \ \ \ \ (3)

so that {\Phi} maps {(x, -\nabla_x \phi(x,y))} to {(y, \nabla_y \phi(x,y))}.

A reasonable candidate for an operator associated to {\Phi} and {\Sigma} in this fashion is the oscillatory integral operator

\displaystyle  Tf(y) := \frac{1}{(2\pi \hbar)^{n/2}} \int_{{\bf R}^n} e^{i \phi(x,y)/\hbar} a(x,y) f(x)\ dx \ \ \ \ \ (4)

for some smooth amplitude function {a} (note that the Fourier transform is the special case when {a=1} and {\phi(x,y)=xy}, which helps explain the genesis of the term “Fourier integral operator”). Indeed, if one computes an inner product {\int_{{\bf R}^n} Tf(y) \overline{g(y)}\ dy} for gaussian wave packets {f, g} of the form (1) and localised in phase space near {(x_0,\xi_0), (y_0,\eta_0)} respectively, then a Taylor expansion of {\phi} around {(x_0,y_0)}, followed by a stationary phase computation, shows (again heuristically, and assuming {\phi} is suitably non-degenerate) that {T} has (3) as its canonical relation. (Furthermore, a refinement of this stationary phase calculation suggests that if {a} is normalised to be the half-density {|\det \nabla_x \nabla_y \phi|^{1/2}}, then {T} should be approximately unitary.) As such, we view (4) as an example of a Fourier integral operator (assuming various smoothness and non-degeneracy hypotheses on the phase {\phi} and amplitude {a} which we do not detail here).

Of course, it may be the case that {\Sigma} is not a graph in the {x,y} coordinates (for instance, the key examples of translation, modulation, and dilation are not of this form), but then it is often a graph in some other pair of coordinates, such as {\xi,y}. In that case one can compose the oscillatory integral construction given above with a Fourier transform, giving another class of FIOs of the form

\displaystyle  Tf(y) := \frac{1}{(2\pi \hbar)^{n/2}} \int_{{\bf R}^n} e^{i \phi(\xi,y)/\hbar} a(\xi,y) \hat f(\xi)\ d\xi. \ \ \ \ \ (5)

This class of FIOs covers many important cases; for instance, the translation, modulation, and dilation operators considered earlier can be written in this form after some Fourier analysis. Another typical example is the half-wave propagator {T := e^{it \sqrt{-\Delta}}} for some time {t \in {\bf R}}, which can be written in the form

\displaystyle  Tf(y) = \frac{1}{(2\pi \hbar)^{n/2}} \int_{{\bf R}^n} e^{i (\xi \cdot y + t |\xi|)/\hbar} a(\xi,y) \hat f(\xi)\ d\xi.

This corresponds to the phase space transformation {(x,\xi) \mapsto (x+t\xi/|\xi|, \xi)}, which can be viewed as the classical propagator associated to the “quantum” propagator {e^{it\sqrt{-\Delta}}}. More generally, propagators for linear Hamiltonian partial differential equations can often be expressed (at least approximately) by Fourier integral operators corresponding to the propagator of the associated classical Hamiltonian flow associated to the symbol of the Hamiltonian operator {H}; this leads to an important mathematical formalisation of the correspondence principle between quantum mechanics and classical mechanics, that is one of the foundations of microlocal analysis and which was extensively developed in Hörmander’s work. (More recently, numerically stable versions of this theory have been developed to allow for rapid and accurate numerical solutions to various linear PDE, for instance through Emmanuel Candés’ theory of curvelets, so the theory that Hörmander built now has some quite significant practical applications in areas such as geology.)

In some cases, the canonical relation {\Sigma} may have some singularities (such as fold singularities) which prevent it from being written as graphs in the previous senses, but the theory for defining FIOs even in these cases, and in developing their calculus, is now well established, in large part due to the foundational work of Hörmander.

Garth Gaudry, who made many contributions to harmonic analysis and to Australian mathematics, and was also both my undergradaute and masters advisor as well as the head of school during one of my first academic jobs, died yesterday after a long battle with cancer, aged 71.

Garth worked on the interface between real-variable harmonic analysis and abstract harmonic analysis (which, despite their names, are actually two distinct fields, though certainly related to each other).  He was one of the first to realise the central importance of Littlewood-Paley theory as a general foundation for both abstract and real-variable harmonic analysis, writing an influential text with Robert Edwards on the topic.  He also made contributions to Clifford analysis, which was also the topic of my masters thesis.

But, amongst Australian mathematicians at least, Garth will be remembered for his tireless service to the field, most notably for his pivotal role in founding the Australian Mathematical Sciences Institute (AMSI) and then serving as AMSI’s first director, and then in directing the International Centre of Excellence for Education in Mathematics (ICE-EM), the educational arm of AMSI which, among other things, developed a full suite of maths textbooks and related educational materials covering Years 5-10 (which I reviewed here back in 2008).

I knew Garth ever since I was an undergraduate at Flinders University.   He was head of school then (a position roughly equivalent to department chair in the US), but still was able to spare an hour a week to meet with me to discuss real analysis, as I worked my way through Rudin’s “Real and complex analysis” and then Stein’s “Singular integrals”, and then eventually completed a masters thesis under his supervision on Clifford-valued singular integrals.  When Princeton accepted my application for graduate study, he convinced me to take the opportunity without hesitation.  Without Garth, I certainly wouldn’t be where I am at today, and I will always be very grateful for his advisorship.  He was a good person, and he will be missed very much by me and by many others.

Bill Thurston, who made fundamental contributions to our understanding of low-dimensional manifolds and related structures, died on Tuesday, aged 65.

Perhaps Thurston’s best known achievement is the proof of the hyperbolisation theorem for Haken manifolds, which showed that 3-manifolds which obeyed a certain number of topological conditions, could always be given a hyperbolic geometry (i.e. a Riemannian metric that made the manifold isometric to a quotient of the hyperbolic 3-space H^3).  This difficult theorem connecting the topological and geometric structure of 3-manifolds led Thurston to give his influential geometrisation conjecture, which (in principle, at least) completely classifies the topology of an arbitrary compact 3-manifold as a combination of eight model geometries (now known as Thurston model geometries).  This conjecture has many consequences, including Thurston’s hyperbolisation theorem and (most famously) the Poincaré conjecture.  Indeed, by placing that conjecture in the context of a conceptually appealing general framework, of which many other cases could already be verified, Thurston provided one of the strongest pieces of evidence towards the truth of the Poincaré conjecture, until the work of Grisha Perelman in 2002-2003 proved both the Poincaré conjecture and the geometrisation conjecture by developing Hamilton’s Ricci flow methods.  (There are now several variants of Perelman’s proof of both conjectures; in the proof of geometrisation by Bessieres, Besson, Boileau, Maillot, and Porti, Thurston’s hyperbolisation theorem is a crucial ingredient, allowing one to bypass the need for the theory of Alexandrov spaces in a key step in Perelman’s argument.)

One of my favourite results of Thurston’s is his elegant method for everting the sphere (smoothly turning a sphere S^2 in {\bf R}^3 inside out without any folds or singularities).  The fact that sphere eversion can be achieved at all is highly unintuitive, and is often referred to as Smale’s paradox, as Stephen Smale was the first to give a proof that such an eversion exists.  However, prior to Thurston’s method, the known constructions for sphere eversion were quite complicated.  Thurston’s method, relying on corrugating and then twisting the sphere, is sufficiently conceptual and geometric that it can in fact be explained quite effectively in non-technical terms, as was done in the following excellent video entitled “Outside In“, and produced by the Geometry Center:

In addition to his direct mathematical research contributions, Thurston was also an amazing mathematical expositor, having the rare knack of being able to describe the process of mathematical thinking in addition to the results of that process and the intuition underlying it.  His wonderful essay “On proof and progress in mathematics“, which I highly recommend, is the quintessential instance of this; more recent examples include his many insightful questions and answers on MathOverflow.

I unfortunately never had the opportunity to meet Thurston in person (although we did correspond a few times online), but I know many mathematicians who have been profoundly influenced by him and his work.  His death is a great loss for mathematics.

A few days ago, I received the sad news that Yahya Ould Hamidoune had recently died. Hamidoune worked in additive combinatorics, and had recently solved a question on noncommutative Freiman-Kneser theorems posed by myself on this blog last year. Namely, Hamidoune showed

Theorem 1 (Noncommutative Freiman-Kneser theorem for small doubling) Let {0 < \epsilon \leq 1}, and let {S \subset G} be a finite non-empty subset of a multiplicative group {G} such that {|A \cdot S| \leq (2-\epsilon) |S|} for some finite set {A} of cardinality {|A|} at least {|S|}, where {A \cdot S := \{ as: a \in A, s \in S \}} is the product set of {A} and {S}. Then there exists a finite subgroup {H} of {G} with cardinality {|H| \leq C(\epsilon) |S|}, such that {S} is covered by at most {C'(\epsilon)} right-cosets {H \cdot x} of {H}, where {C(\epsilon), C'(\epsilon) > 0} depend only on {\epsilon}.

One can of course specialise here to the case {A=S}, and view this theorem as a classification of those sets {S} of doubling constant at most {2-\epsilon}.

In fact Hamidoune’s argument, which is completely elementary, gives the very nice explicit constants {C(\epsilon) := \frac{2}{\epsilon}} and {C'(\epsilon) := \frac{2}{\epsilon} - 1}, which are essentially optimal except for factors of {2} (as can be seen by considering an arithmetic progression in an additive group). This result was also independently established (in the {A=S} case) by Tom Sanders (unpublished) by a more Fourier-analytic method, in particular drawing on Sanders’ deep results on the Wiener algebra {A(G)} on arbitrary non-commutative groups {G}.

This type of result had previously been known when {2-\epsilon} was less than the golden ratio {\frac{1+\sqrt{5}}{2}}, as first observed by Freiman; see my previous blog post for more discussion.

Theorem 1 is not, strictly speaking, contained in Hamidoune’s paper, but can be extracted from his arguments, which share some similarity with the recent simple proof of the Ruzsa-Plünnecke inequality by Petridis (as discussed by Tim Gowers here), and this is what I would like to do below the fold. I also include (with permission) Sanders’ unpublished argument, which proceeds instead by Fourier-analytic methods. Read the rest of this entry »

Israel Gelfand, who made profound and prolific contributions to many areas of mathematics, including functional analysis, representation theory, operator algebras, and partial differential equations, died on Monday, age 96.

Gelfand’s beautiful theory of {C^*}-algebras and related spaces made quite an impact on me as a graduate student in Princeton, to the point where I was seriously considering working in this area; but there was not much activity in operator algebras at the time there, and I ended up working in harmonic analysis under Eli Stein instead. (Though I am currently involved in another operator algebras project, of which I hope to be able to discuss in the near future. The commutative version of Gelfand’s theory is discussed in these lecture notes of mine.)

I met Gelfand only once, in one of the famous “Gelfand seminars” at the IHES in 2000. The speaker was Tim Gowers, on his new proof of Szemerédi’s theorem. (Endre Szemerédi, incidentally, was Gelfand’s student.) Gelfand’s introduction to the seminar, on the subject of Banach spaces which both mathematicians contributed so greatly to, was approximately as long as Gowers’ talk itself!

There are far too many contributions to mathematics by Gelfand to name here, so I will only mention two. The first are the Gelfand-Tsetlin patterns, induced by an {n \times n} Hermitian matrix {A}. Such matrices have {n} real eigenvalues {\lambda_{n,1} \leq \ldots \leq \lambda_{n,n}}. If one takes the top {n-1 \times n-1} minor, this is another Hermitian matrix, whose {n-1} eigenvalues {\lambda_{n-1,1} \leq \ldots \leq \lambda_{n-1,n-1}} intersperse the {n} eigenvalues of the original matrix: {\lambda_{n,i} \leq \lambda_{n-1,i} \leq \lambda_{n,i+1}} for every {1 \leq i \leq n-1}. This interspersing can be easily seen from the minimax characterisation

\displaystyle  \lambda_{n,i} = \inf_{\hbox{dim}(V)=i} \sup_{v \in V: \|v\|=1} \langle Av, v \rangle

of the eigenvalues of {A}, with the eigenvalues of the {n-1 \times n-1} minor being similar but with {V} now restricted to a subspace of {{\mathbb C}^{n-1}} rather than {{\mathbb C}^n}.

Similarly, the eigenvalues {\lambda_{n-2,1} \leq \ldots \leq \lambda_{n-2,n-2}} of the top {n-2 \times n-2} minor of {A} intersperse those of the previous minor. Repeating this procedure one eventually gets a pyramid of real numbers of height and width {n}, with the numbers in each row interspersing the ones in the row below. Such a pattern is known as a Gelfand-Tsetlin pattern. The space of such patterns forms a convex cone, and (if one fixes the initial eigenvalues {\lambda_{n,1},\ldots,\lambda_{n,n}}) becomes a compact convex polytope. If one fixes the initial eigenvalues {\lambda_{n,1},\ldots,\lambda_{n,n}} of {A} but chooses the eigenvectors randomly (using the Haar measure of the unitary group), then the resulting Gelfand-Tsetlin pattern is uniformly distributed across this polytope; the {n=2} case of this observation is essentially the classic observation of Archimedes that the cross-sectional areas of a sphere and a circumscribing cylinder are the same. (Ultimately, the reason for this is that the Gelfand-Tsetlin pattern almost turns the space of all {A} with a fixed spectrum (i.e. the co-adjoint orbit associated to that spectrum) into a toric variety. More precisely, there exists a mostly diffeomorphic map from the co-adjoint orbit to a (singular) toric variety, and the Gelfand-Tsetlin pattern induces a complete set of action variables on that variety.) There is also a “quantum” (or more precisely, representation-theoretic) version of this observation, in which one can decompose any irreducible representation of the unitary group {U(n)} into a canonical basis (the Gelfand-Tsetlin basis), indexed by integer-valued Gelfand-Tsetlin patterns, by first decomposing this representation into irreducible representations of {U(n-1)}, then {U(n-2)}, and so forth.

The structure, symplectic geometry, and representation theory of Gelfand-Tsetlin patterns was enormously influential in my own work with Allen Knutson on honeycomb patterns, which control the sums of Hermitian matrices and also the structure constants of the tensor product operation for representations of {U(n)}; indeed, Gelfand-Tsetlin patterns arise as the degenerate limit of honeycombs in three different ways, and we in fact discovered honeycombs by trying to glue three Gelfand-Tsetlin patterns together. (See for instance our Notices article for more discussion. The honeycomb analogue of the representation-theoretic properties of these patterns was eventually established by Henriques and Kamnitzer, using {gl(n)} crystals and their Kashiwara bases.)

The second contribution of Gelfand I want to discuss is the Gelfand-Levitan-Marchenko equation for solving the one-dimensional inverse scattering problem: given the scattering data of an unknown potential function {V(x)}, recover {V}. This is already interesting in and of itself, but is also instrumental in solving integrable systems such as the Korteweg-de Vries equation, because the Lax pair formulation of such equations implies that they can be linearised (and solved explicitly) by applying the scattering and inverse scattering transforms associated with the Lax operator. I discuss the derivation of this equation below the fold.

Read the rest of this entry »

I am very saddened (and stunned) to learn that Oded Schramm, who made fundamental contributions to conformal geometry, probability theory, and mathematical physics, died in a hiking accident this Monday, aged 46.  (I knew him as a fellow editor of the Journal of the American Mathematical Society, as well as for his mathematical research, of course.)  It is a loss of both a great mathematician and a great person.

One of Schramm’s most fundamental contributions to mathematics is the introduction of the stochastic Loewner equation (now sometimes called the Schramm-Loewner equation in his honour), together with his subsequent development of the theory of this equation with Greg Lawler and Wendelin Werner.  (This work has been recognised by a number of awards, including the Fields Medal in 2006 to Wendelin.)  This equation (which I state after the jump) describes, for each choice of a parameter \kappa > 0, a random (fractal) curve SLE(\kappa) in the plane; this random curve can be viewed as a nonlinear variant of Brownian motion, although the SLE curves tend to cross themselves much less frequently than Brownian paths do.  By the nature of their construction, the SLE(\kappa) curves are conformally invariant: any conformal transformation of an SLE(\kappa) curve (respecting the boundary) gives another curve which has the same distribution as the original curve.  (Brownian motion is also conformally invariant; given the close connections between Brownian motion and harmonic functions, it is not surprising that this fact is closely related to the fact that the property of a function being harmonic in the plane is preserved under conformal transformations.) Conversely, one can show that any conformally invariant random curve distribution which obeys some additional regularity and locality axioms must be of the form SLE(\kappa) for some \kappa.

The amazing fact is that many other natural processes for generating random curves in the plane – e.g. loop-erased random walk, the boundary of Brownian motion (also known as the “Brownian frontier”), or the limit of percolation on the triangular lattice – are known or conjectured to be distributed according to SLE(\kappa) for some specific \kappa (in the above three examples, \kappa is 2, 8/3, and 6 respectively).  In particular, this implies that the highly non-trivial fact that such distributions are conformally invariant, a phenomenon that had been conjectured by physicists but which only obtained rigorous mathematical proof following the work of Schramm and his coauthors.

[Update, Sep 6: A memorial blog to Oded has been set up by his Microsoft Research group here.  See also these posts by Gil Kalai, Yuval Peres, and Luca Trevisan.]

Read the rest of this entry »

Atle Selberg, who made immense and fundamental contributions to analytic number theory and related areas of mathematics, died last Monday, aged 90.

Selberg’s early work was focused on the study of the Riemann zeta function \zeta(s). In 1942, Selberg showed that a positive fraction of the zeroes of this function lie on the critical line \hbox{Re}(s)=1/2. Apart from improvements in the fraction (the best value currently being a little over 40%, a result of Conrey), this is still one of the strongest partial results we have towards the Riemann hypothesis. (I discuss Selberg’s result, and the method of mollifiers he introduced there, in a little more detail after the jump.)

In working on the zeta function, Selberg developed two powerful tools which are still used routinely in analytic number theory today. The first is the method of mollifiers to smooth out the magnitude oscillations of the zeta function, making the (more interesting) phase oscillation more visible. The second was the method of the Selberg \Lambda^2 sieve, which is a particularly elegant choice of sieve which allows one to count patterns in almost primes (and hence to upper bound patterns in primes) quite accurately. Variants of the Selberg sieve were a crucial ingredient in, for instance, the recent work of Goldston-Yıldırım-Pintz on prime gaps, as well as the work of Ben Green and myself on arithmetic progressions in primes. (I discuss the Selberg sieve, as well as the Selberg symmetry formula below, in my post on the parity problem. Incidentally, Selberg was the first to formalise this problem as a significant obstruction in sieve theory.)

For all of these achievements, Selberg was awarded the Fields Medal in 1950. Around that time, Selberg and Erdős also produced the first elementary proof of the prime number theorem. A key ingredient here was the Selberg symmetry formula, which is an elementary analogue of the prime number theorem for almost primes.

But perhaps Selberg’s greatest contribution to mathematics was his discovery of the Selberg trace formula, which is a non-abelian generalisation of the Poisson summation formula, and which led to many further deep connections between representation theory and number theory, and in particular being one of the main inspirations for the Langlands program, which in turn has had an impact on many different parts of mathematics (for instance, it plays a role in Wiles’ proof of Fermat’s last theorem). For an introduction to the trace formula, its history, and its impact, I recommend the survey article of Arthur.

Other major contributions of Selberg include the Rankin-Selberg theory connecting Artin L-functions from representation theory to the integrals of automorphic forms (very much in the spirit of the Langlands program), and the Chowla-Selberg formula relating the Gamma function at rational values to the periods of elliptic curves with complex multiplication. He also made an influential conjecture on the spectral gap of the Laplacian on quotients of SL(2,{\Bbb R}) by congruence groups, which is still open today (Selberg had the first non-trivial partial result). As an example of this conjecture’s impact, Selberg’s eigenvalue conjecture has inspired some recent work of Sarnak-Xue, Gamburd, and Bourgain-Gamburd on new constructions of expander graphs, and has revealed some further connections between number theory and arithmetic combinatorics (such as sum-product theorems); see this announcement of Bourgain-Gamburd-Sarnak for the most recent developments (this work, incidentally, also employs the Selberg sieve). As observed by Satake, Selberg’s eigenvalue conjecture and the more classical Ramanujan-Petersson conjecture can be unified into a single conjecture, now known as the Ramanujan-Selberg conjecture; the eigenvalue conjecture is then essentially an archimedean (or “non-dyadic“) special case of the general Ramanujan-Selberg conjecture. (The original (dyadic) Ramanujan-Petersson conjecture was finally proved by Deligne-Serre, after many important contributions by other authors, but the non-dyadic version remains open.)

Read the rest of this entry »

I am very saddened to find out (first via Wikipedia, then by several independent confirmations) that Paul Cohen died on Friday, aged 72.

Paul Cohen is of course best known in mathematics for his Fields Medal-winning proof of the undecidability of the continuum hypothesis within the standard Zermelo-Frankel-Choice (ZFC) axioms of set theory, by introducing the now standard method of forcing in model theory. (More precisely, assuming ZFC is consistent, Cohen proved that models of ZFC exist in which the continuum hypothesis fails; Gödel had previously shown under the same assumption that models exist in which the continuum hypothesis is true.) Cohen’s method also showed that the axiom of choice was independent of ZF. The friendliest introduction to forcing is perhaps still Timothy Chow‘s “Forcing for dummies“, though I should warn that Tim has a rather stringent definition of “dummy”.

But Cohen was also a noted analyst. For instance, the Cohen idempotent theorem in harmonic analysis classifies the idempotent measures \mu in a locally compact abelian group G (i.e. the finite regular measures for which \mu * \mu = \mu); specifically, a finite regular measure \mu is idempotent if and only if the Fourier transform \hat \mu of the measure only takes values 0 and 1, and furthermore can be expressed as a finite linear combination of indicator functions of cosets of open subgroups of the Pontryagin dual \hat G of G. (Earlier results in this direction were obtained by Helson and by Rudin; a non-commutative version was subsequently given by Host. These results play an important role in abstract harmonic analysis.) Recently, Ben Green and Tom Sanders connected this classical result to the very recent work on Freiman-type theorems in additive combinatorics, using the latter to create a quantitative version of the former, which in particular is suitable for use in finite abelian groups.

Paul Cohen’s legacy also includes the advisorship of outstanding mathematicians such as the number theorist and analyst Peter Sarnak (who, incidentally, taught me analytic number theory when I was a graduate student). Cohen was in fact my “uncle”; his advisor, Antoni Zygmund, was the advisor of my own advisor Elias Stein.

It is a great loss for the world of mathematics.

[Update, Mar 25: Added the hypothesis that ZFC is consistent to the description of Cohen’s result. Several other minor edits also.]