This week I am visiting the University of Washington in Seattle, giving the Milliman Lecture Series for 2007-2008. My chosen theme here is “Recent developments in arithmetic combinatorics”. In my first lecture, I will speak (once again) on how methods in additive combinatorics have allowed us to detect additive patterns in the prime numbers, in particular discussing my joint work with Ben Green. In the second lecture I will discuss how additive combinatorics has made it possible to study the invertibility and spectral behaviour of random discrete matrices, in particular discussing my joint work with Van Vu; and in the third lecture I will discuss how sum-product estimates have recently led to progress in the theory of expanders relating to Lie groups, as well as to sieving over orbits of such groups, in particular presenting work of Jean Bourgain and his coauthors.

Additive combinatorics is focused, among other things, on the task of studying additive patterns in general sets of integers (or more generally, sets in an additive group). It is descended in part from the more classical subject of additive number theory: the study of additive patterns, structures, and operations on explicit sets of integers, such as the primes {\mathcal P} := \{2,3,5,7,11,\ldots\} and the squares {\mathcal S} := \{0,1,4,9,16,\ldots\}. Here are some typical results and conjectures in the subject for both the primes and the squares, side by side for comparison:

| Squares {\mathcal S} | Primes {\mathcal P} |
| --- | --- |
| Lagrange’s four square theorem: For every positive integer N, there exists a pattern in {\mathcal S} of the form a, b, c, N-a-b-c. | Vinogradov’s theorem: For every sufficiently large integer N, there exists a pattern in {\mathcal P} of the form a, b, c, N-a-b-c. |
| Fermat’s two square theorem: For every prime number N = 1 \hbox{ mod } 4, there exists a pattern in {\mathcal S} of the form a, N-a. | Even Goldbach conjecture: For every even number N \geq 4, there exists a pattern in {\mathcal P} of the form a, N-a. |
| Fermat’s four square theorem: There does not exist any pattern in {\mathcal S} of the form a, a+b, a+2b, a+3b with b \neq 0. | Green-Tao theorem: For any k \geq 1, there exist infinitely many patterns in {\mathcal P} of the form a, a+b, \ldots, a+(k-1)b with b \neq 0. |
| Pell’s equation: There are infinitely many patterns in {\mathcal S} of the form a, 2a+1. | Sophie Germain conjecture: There are infinitely many patterns in {\mathcal P} of the form a, 2a+1. |

I have deliberately phrased the above results in a unified format, namely that of counting additive patterns with one or more free parameters a, b, \ldots in either the squares or the primes. However, this apparent unification is actually an illusion: the results involving square numbers are much older (the Pell equation solutions, for instance, were essentially known to Diophantus, as well as to the ancient Indians) and are proven using completely different methods than those used for the prime numbers. For the square numbers, there are some key algebraic identities and connections, ranging from the high-school factorisations a^2-b^2 = (a-b)(a+b), a^2 - 2b^2 = (a-\sqrt{2}b)(a+\sqrt{2}b) and a^2+b^2 = (a-ib)(a+ib) to deeper connections between quadratic forms and elliptic curves, which allow one to prove the results in the left-hand column via the methods of algebraic number theory. For the primes, on the other hand, there are very few usable algebraic properties available: one has local (mod q) information, such as the fact that all primes are odd (with one exception), or adjacent to a multiple of 6 (with two exceptions), but there are essentially no global algebraic identities or structures to exploit amongst the primes (except, perhaps, for identities such as the Euler product formula \zeta(s) = \prod_p (1 - p^{-s})^{-1} connecting the prime numbers to the Riemann zeta function and its relatives, although these only directly help one count multiplicative patterns in the primes rather than additive ones). So, whereas the square numbers can be profitably studied by cleverly exploiting their special algebraic structure, when dealing with the prime numbers it is in fact better to rely instead on more general tools which require very little structural control on the set being studied. In particular, in recent years we have learned that the methods of additive combinatorics, which offer tools to count additive patterns in arbitrary sets of integers (or more generally, subsets of an additive group), can be remarkably effective in additive prime number theory. Thus – rather counter-intuitively – some of our strongest results about additive patterns in primes have been obtained by using very little information about the primes at all!

To give a very simple example of how additive combinatorics can be applied to the primes, let us consider the problem of finding parallelograms inside the primes – patterns of the form a, a+b, a+c, a+b+c with b, c positive integers; for instance, 3, 7, 43, 47 is a parallelogram of primes. It is very hard to produce any parallelograms of primes by algebraic means (such as an explicit formula); however, there is a simple combinatorial argument that shows that such parallelograms exist in abundance. The only actual information needed about the primes for this argument is the prime number theorem, which says that the number of primes less than a large number N is equal to (1+o(1)) N/\log N in the limit N \to\infty. (Actually, we won’t even need the full strength of the prime number theorem; the weaker statement that there are \gg N/\log N primes less than N, which was known to Chebyshev and can be established by elementary means based on the prime factorisation of \binom{2N}{N}, will suffice.)

Let N be a large number; then there are (1+o(1)) N/\log N primes less than N. This allows us to form roughly (\frac{1}{2}+o(1)) N^2/\log^2 N differences p-q of primes 1 < q < p \leq N. But each of these differences takes values between 1 and N. For N large enough, we can thus use the pigeonhole principle to conclude that there are two differences p-q and p'-q' of primes which have the same value, which implies that the quadruplet p, q, q', p' forms a parallelogram. In fact, a slight refinement of this argument (using the Cauchy-Schwarz inequality, which can provide a more quantitative version of the pigeonhole principle) shows that there are \gg N^3 / \log^4 N parallelograms of primes less than N, and in particular that there are infinitely many parallelograms of primes.
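
To see the argument in action, here is a small Python sketch (not part of the lecture's argument; the cutoff N = 100 and the use of sympy's prime generator are arbitrary choices) that groups pairs of primes by their common difference and reports the first repeated difference:

```python
# Pigeonhole in action: collect the differences p - q of primes up to N;
# any difference realised by two distinct pairs yields a parallelogram
# q, p, q', p' of primes.
from collections import defaultdict
from sympy import primerange

N = 100
primes = list(primerange(2, N + 1))

pairs_by_difference = defaultdict(list)
for i, q in enumerate(primes):
    for p in primes[i + 1:]:
        pairs_by_difference[p - q].append((q, p))

for d, pairs in sorted(pairs_by_difference.items()):
    if len(pairs) >= 2:
        (q1, p1), (q2, p2) = pairs[:2]
        # q1, p1 = q1 + d, q2, p2 = q2 + d: a parallelogram with b = d, c = q2 - q1
        print(f"difference {d}: parallelogram {q1}, {p1}, {q2}, {p2}")
        break
```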

The above example shows how one can detect additive patterns in the primes using very little information about the primes themselves; in the above case, the only information we actually needed about the primes concerned their cardinality. (Indeed, the argument is not really about primes at all, and is best viewed as a general statement about dense sets of integers, known as the Szemerédi cube lemma.) More generally, the strategy of the additive combinatorial approach is to minimise the number of facts one actually needs to establish about the primes, and rely primarily on tools which are valid for rather general classes of sets of integers.

A good example of this type of tool is Szemerédi’s theorem, which asserts that any set of integers A of positive density contains arbitrarily long arithmetic progressions; as with the case of parallelograms, the only information needed about the set is that it is large. This theorem does not directly apply to the prime numbers {\mathcal P}, as they have density zero, but it turns out that there is a trick (which Ben Green and I call the transference principle) which (very roughly speaking) lets one locate a dense set of integers A which “models” the primes, in the sense that there is a relationship between additive patterns in A and additive patterns in {\mathcal P}. (The relationship here is somewhat analogous to Monte Carlo integration, which uses the average value of a function f on a sparse set to approximate the average value of f on a much larger domain.) As a consequence of this principle, Ben and I were able to use Szemerédi’s theorem to establish that the primes contain arbitrarily long arithmetic progressions. There have since been a number of similar results in which Szemerédi-type results for dense sets of integers have been transferred to yield similar statements about the primes and related sets (e.g. constellations in Gaussian primes, or polynomial patterns in the ordinary primes).

In this talk, though, I am not going to discuss the above results further, but instead focus on the task of using additive combinatorics to detect more general classes of additive patterns in sets of integers such as the primes, with the philosophy of always trying to use as little structural information about these sets as possible.

To illustrate some of the ideas, let us consider the odd Goldbach conjecture, which asserts that any odd integer larger than 5 can be expressed as the sum of three primes. Let’s first tackle a model problem in the same spirit: we’ll work in a cyclic group {\Bbb Z}/N{\Bbb Z} instead of the integers, we will pick three sets A, B, C in this group as well as an element x, and we will ask whether x can be expressed as the sum of an element a from A, an element b from B, and an element c from C.

Of course, to make any headway on this problem we have to make some assumptions on A, B, C. Let us first assume that A, B, C are relatively dense subsets of {\Bbb Z}/N{\Bbb Z}, say |A|, |B|, |C| \geq \frac{1}{10} N. Even with such large sets, there is no guarantee that every element x can be expressed as a sum of elements from A, B, and C respectively. For instance, if A=B=C = \{1,\ldots,\lfloor N/10\rfloor+1\}, we see that only about 30% of the numbers in {\Bbb Z}/N{\Bbb Z} can be expressed in this way. Or, if N is a multiple of 10 and A=B=C=\{10,20,30,\ldots\} consist of those elements in {\Bbb Z}/N{\Bbb Z} which are multiples of 10, then only 10% of the elements in {\Bbb Z}/N{\Bbb Z} can be expressed in this fashion. Thus there are some non-trivial obstructions to this Goldbach-type problem.
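
Both obstructions are easy to confirm numerically; here is a short Python sketch (illustrative, with the arbitrary choice N = 1000 and with A = B = C) that counts the proportion of representable residues:

```python
# Fraction of x in Z/NZ expressible as a + b + c with a, b, c in A,
# for the two model sets above (taking A = B = C).
N = 1000

def representable_fraction(A):
    sums = {(a + b + c) % N for a in A for b in A for c in A}
    return len(sums) / N

interval = range(1, N // 10 + 2)   # A = {1, ..., N/10 + 1}
multiples = range(0, N, 10)        # A = the multiples of 10 in Z/NZ

print(representable_fraction(interval))    # about 0.3
print(representable_fraction(multiples))   # exactly 0.1
```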

However, it turns out that if one of the sets, say A, is sufficiently “uniform” or “pseudorandom”, then one can always solve this Goldbach-type problem, regardless of what the other two sets are. This type of fact is often established by Fourier-analytic means (or by closely related techniques, such as spectral theory), but let me give a heuristic combinatorial argument to indicate why one would expect this type of phenomenon to occur. We will work in the contrapositive: we assume that we can find an x which cannot be expressed as the sum of elements from A, B, and C, and somehow eliminate the role of x, B, and C to deduce some “non-uniformity” or “structure” for A.

So, suppose that x \neq a+b+c for all a in A, b in B, c in C. This implies that x-a-b always avoids C. Thus x-a-b does not range freely throughout {\Bbb Z}/N{\Bbb Z}, but is instead concentrated in a set of 90% the size or smaller. Because of this more confined space, one would expect more “collisions” than usual, in the sense that there should be more solutions to the equation x-a-b = x-a'-b' with a, a' in A and b, b' in B than one would normally expect. Rearranging, we conclude that there are more solutions to the equation a-a' = b-b' than one might first expect. This means that the differences a-a' and the differences b-b' have to cluster in the same region of {\Bbb Z}/N{\Bbb Z}, which then suggests that we should have more collisions a-a' = a''-a''' with a, a', a'', a''' in A than one might first expect. To put it another way, A should contain a higher than expected number of parallelograms a, a+r, a+s, a+r+s.

The above argument can be made rigorous by two quick applications of the Cauchy-Schwarz inequality. If we had |A|, |B|, |C|\geq \delta N for some \delta > 0, say, then it is not hard to use Cauchy-Schwarz to show that A will contain at least \delta^4 N^3 parallelograms (where we allow degenerate parallelograms, in order to simplify the formulae a little); but if there existed an x which was not the sum of an element from A, an element from B, and an element from C, one can use this to conclude that A must have a few more parallelograms; in fact, it must contain at least (1+c\delta) \delta^4 N^3 parallelograms for some absolute constant c > 0.

Taking contrapositives, we conclude that if A has a near-minimal number of parallelograms (between \delta^4 N^3 and (1+c\delta) \delta^4 N^3), then we can solve the Goldbach problem for any x and any choice of sets B, C of density \delta.

So, by using elementary additive combinatorics, we can reduce Goldbach-type problems to the problem of counting parallelograms in a given set A. But how can one achieve the latter task? It turns out that for this specific problem, there is an elegant formula from Fourier analysis: the number of parallelograms in a set A \subset {\Bbb Z}/N{\Bbb Z} is equal to

N^3 \sum_{\xi \in {\Bbb Z}/N{\Bbb Z}} |\hat 1_A(\xi)|^4 (1)

where \hat 1_A(\xi) is the Fourier coefficient of the indicator function 1_A at frequency \xi:

\hat 1_A(\xi) := \frac{1}{N} \sum_{x \in A} e^{-2\pi i x \xi/N}.
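
This formula is easy to verify numerically. The following Python sketch (illustrative only; the parameters N = 50 and density 0.3, and the random set, are arbitrary choices) counts parallelograms both by brute force and via the fast Fourier transform, using the normalisation of \hat 1_A given above:

```python
# Check of formula (1): the number of (possibly degenerate) parallelograms
# (a, a+r, a+s, a+r+s) in A equals N^3 * sum_xi |\hat 1_A(xi)|^4.
from itertools import product
import numpy as np

N = 50
rng = np.random.default_rng(0)
A = set(np.flatnonzero(rng.random(N) < 0.3).tolist())   # random set of density ~0.3

brute = sum(1 for a, r, s in product(range(N), repeat=3)
            if {a, (a + r) % N, (a + s) % N, (a + r + s) % N} <= A)

indicator = np.array([1.0 if x in A else 0.0 for x in range(N)])
fourier = np.fft.fft(indicator) / N   # \hat 1_A(xi) = (1/N) sum_{x in A} e^{-2 pi i x xi / N}
formula = N ** 3 * np.sum(np.abs(fourier) ** 4)

print(brute, round(formula))   # the two counts agree
```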

The connection between Fourier analysis and parallelograms a, a+r, a+s, a+r+s might not be immediately evident, but a link between the two can be discerned from the algebraic identity

e^{-2\pi i a \xi / N} e^{2\pi i (a+r) \xi/N} e^{2\pi i (a+s)\xi/N} e^{-2\pi i (a+r+s)\xi/N} = 1

which is a fancy way of saying that the linear function x \mapsto x \xi / N \hbox{ mod } 1 has a vanishing second derivative.
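
This cancellation is easy to check numerically; here is a tiny Python sketch (purely illustrative; the modulus N = 101 and the random trials are arbitrary choices):

```python
# The four linear phases around a parallelogram (a, a+r, a+s, a+r+s)
# cancel exactly: the exponent is -a + (a+r) + (a+s) - (a+r+s) = 0.
import cmath
import random

N = 101
random.seed(0)

def e(x, xi):
    # e^{2 pi i x xi / N}, with the exponent reduced mod N for accuracy
    return cmath.exp(2j * cmath.pi * ((x * xi) % N) / N)

for _ in range(5):
    a, r, s, xi = (random.randrange(N) for _ in range(4))
    value = e(a, xi).conjugate() * e(a + r, xi) * e(a + s, xi) * e(a + r + s, xi).conjugate()
    print(abs(value - 1) < 1e-12)   # True in every trial
```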

Anyway, returning to the formula (1), in the case when A has density exactly \delta, thus |A|=\delta N, we see that the number of parallelograms is equal to

(\delta^4 + \sum_{\xi \neq 0} |\hat 1_A(\xi)|^4) N^3.

Thus we see (informally, at least) that a set A is going to have a near-minimal number of parallelograms precisely when it is Fourier-pseudorandom in the sense that its Fourier coefficients at non-zero frequencies are all small, or in other words that the set A exhibits no correlation or bias with respect to any non-trivial linear phase function x \mapsto e^{2\pi i x \xi/N}. (It is instructive to consider our two counterexamples to the toy Goldbach problem, namely A = \{1,\ldots,\lfloor N/10\rfloor +1\} and A = \{10,20,30,\ldots\}. The first set is biased with respect to the phase x \mapsto e^{2\pi i x/N}; the second set is biased with respect to x \mapsto e^{2\pi i x/10}.)
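
This bias can be computed explicitly; the following Python sketch (illustrative, with the arbitrary choice N = 1000) locates the largest Fourier coefficient at a non-zero frequency for each of the two model sets:

```python
# Both model sets have a large Fourier coefficient at a non-zero frequency,
# so neither is Fourier-pseudorandom.
import numpy as np

N = 1000

def dominant_frequency(A):
    indicator = np.zeros(N)
    indicator[list(A)] = 1.0
    coefficients = np.abs(np.fft.fft(indicator)) / N
    xi = int(np.argmax(coefficients[1:])) + 1   # skip the zero frequency
    return xi, coefficients[xi]

print(dominant_frequency(range(1, N // 10 + 2)))   # biased at xi = 1
print(dominant_frequency(range(0, N, 10)))         # biased at xi = N/10 = 100
```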

This gives us a strategy to solve Goldbach-type problems: if we can show somehow that a set A does not correlate strongly with any non-trivial linear phase function, then it should be sufficiently Fourier-pseudorandom that there is no further obstruction to the Goldbach problem. If instead A does closely resemble something related to a non-trivial linear phase function, then that is quite a bit of structural information on A, and we should be able to solve the Goldbach-type problem by explicit algebraic counting of solutions (as is for instance the case in the two model examples A = \{1,2,\ldots,\lfloor N/10\rfloor+1\} and A = \{10,20,30,\ldots\} discussed earlier).

In the case of sets of integers such as the primes, this type of strategy is known as the Hardy-Littlewood circle method. It was successfully used by Vinogradov to establish his theorem that every sufficiently large odd number is the sum of three primes (and thus every sufficiently large number is the sum of four primes); the problem boils down to getting sufficiently strong estimates for exponential sums over primes such as \sum_{p < N} e^{2\pi i \alpha p}. In the “major arc” case where \alpha is rational (or very close to a rational) with small denominator, methods from multiplicative number theory, based on zeta functions and L-functions, become useful; in contrast, in the complementary “minor arc” case where \alpha behaves irrationally, one can use more analytic methods (based, ultimately, on the equidistribution of multiples of \alpha modulo 1 on the unit circle, and on the obvious fact that the product of two primes is a non-prime) to obtain good estimates. (I hope to discuss this in more detail in a later post.) A similar argument was used by van der Corput to establish that the prime numbers contain infinitely many arithmetic progressions of length three. These arguments are actually quite quantitative and precise; for instance, Vinogradov’s theorem not only gives the existence of a representation N = p_1 + p_2 + p_3 of any sufficiently large odd number as the sum of three primes, but in fact gives an asymptotically precise formula as N \to \infty as to how many such representations exist. Similarly, van der Corput’s argument gives an asymptotically precise formula as to how many arithmetic progressions of length three consisting of primes less than N there are, as N \to \infty.
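
The major arc/minor arc contrast can be observed numerically. In the following Python sketch (an illustration of the dichotomy, not of Vinogradov's actual estimates; the cutoff N = 10^5 and the two frequencies are arbitrary choices), the sum at the rational frequency \alpha = 1/3 is comparable to the trivial bound \pi(N), while the sum at \alpha = \sqrt{2} is far smaller:

```python
# Exponential sums over primes at a major-arc frequency (rational with small
# denominator) versus a minor-arc frequency (a "generic" irrational).
import cmath
from sympy import primerange

N = 10 ** 5
primes = list(primerange(2, N))

def exponential_sum(alpha):
    return abs(sum(cmath.exp(2j * cmath.pi * alpha * p) for p in primes))

print(len(primes))                 # the trivial bound pi(N)
print(exponential_sum(1 / 3))      # major arc: roughly pi(N)/2
print(exponential_sum(2 ** 0.5))   # minor arc: exhibits heavy cancellation
```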

This strategy unfortunately fails miserably for the even Goldbach problem, which asks whether every even number greater than 2 is the sum of two primes; it turns out that there is no useful analogue of the parallelogram in this problem, basically due to the fact that there is only one free parameter in the pattern one is looking for. However, it is possible to adapt the strategy to more complicated patterns with two or more free parameters, such as arithmetic progressions of length greater than three. For instance, if one wants to find arithmetic progressions of length 4 in a set A, it turns out that this problem is controlled by the number of parallelopipeds

a, a+r, a+s, a+t, a+r+s, a+r+t, a+s+t, a+r+s+t

that A contains, in much the same way that the odd Goldbach problem was controlled by parallelograms. So, if one knows how to count how many parallelopipeds there are in a set, one can (in principle) count how many progressions of length 4 there are as well (and one can also count a large variety of other patterns too). One would then hope for an elegant formula analogous to (1) to count these objects, but unfortunately it seems that no such formula exists. Part of the problem is that while parallelograms are closely tied to the linear (or Fourier) phases x \mapsto x \xi / N, because such phases have vanishing second derivative, parallelopipeds are more naturally tied to the larger class of phases which have vanishing third derivative, such as the quadratic phases x \mapsto x^2 \xi / N (see the numerical sketch following the formula below). (Actually, there are also many more “pseudoquadratic” phases, such as x \mapsto \lfloor \alpha x \rfloor \beta x for various real numbers \alpha,\beta, whose third derivative exhibits some cancellation but does not vanish entirely; these are connected to flows on nilmanifolds, but I will not discuss them in detail here.) With this much larger class of potentially relevant phases, it appears that there is no useful analogue of the formula (1) (basically because there are so many such phases out there, most of which have no significant correlation with the set A, that the noise from these irrelevant phases drowns out the signal from those few phases that are actually important). Nevertheless, there is a set of tools, developed initially by Timothy Gowers, in what might loosely be called quadratic Fourier analysis, which can make precise the connection between parallelopipeds and correlations with quadratic (or pseudoquadratic) phases; there are also the beginnings of a more general theory connecting higher dimensional parallelopipeds and higher degree phases. This is still work in progress, but we have already been able to use the theory to understand several types of linear patterns; for instance, Ben and I showed that the number of arithmetic progressions of length four consisting of primes less than a given large number N is equal to

(\frac{3}{4} \prod_{p\geq 5}(1 - \frac{3p-1}{(p-1)^3}) + o(1)) \frac{N^2}{\log^4 N} \approx 0.4764 \frac{N^2}{\log^4 N}.
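
The constant here can be checked by truncating the Euler product; the following short Python sketch (an illustration, with an arbitrary truncation point) reproduces the stated value:

```python
# Evaluating the constant (3/4) * prod_{p >= 5} (1 - (3p-1)/(p-1)^3) by
# truncating the product over primes at 10^5; since the factors are
# 1 - O(1/p^2), the truncation error is negligible at this scale.
from sympy import primerange

constant = 3 / 4
for p in primerange(5, 10 ** 5):
    constant *= 1 - (3 * p - 1) / (p - 1) ** 3

print(constant)   # approximately 0.4764
```

As for the quadratic phases mentioned above, the point that they play the same role for parallelopipeds that linear phases play for parallelograms can also be seen numerically: a quadratic phase cancels along the eight vertices of any parallelopiped (with alternating conjugation), just as a linear phase cancels along the four vertices of a parallelogram in the identity displayed earlier. Here is a small sketch of this cancellation (again illustrative only; the modulus N = 101 and the random trials are arbitrary choices):

```python
# The alternating product of the quadratic phase x -> e^{2 pi i x^2 xi / N}
# over the eight vertices a + (subset sum of r, s, t) equals 1 exactly:
# the alternating sum of x^2 over the vertices is the third finite
# difference of x^2, which vanishes.
import cmath
import random
from itertools import combinations

N = 101
random.seed(0)

def quadratic_phase(x, xi):
    # reduce the exponent mod N first, so the floating point stays accurate
    return cmath.exp(2j * cmath.pi * ((x * x * xi) % N) / N)

for _ in range(5):
    a, r, s, t, xi = (random.randrange(1, N) for _ in range(5))
    value = 1
    for size in range(4):
        for subset in combinations((r, s, t), size):
            term = quadratic_phase(a + sum(subset), xi)
            value *= term if size % 2 == 0 else term.conjugate()
    print(abs(value - 1) < 1e-9)   # True in every trial
```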

Very briefly, the role of additive combinatorics (and generalised Fourier analysis) is to replace problems of counting patterns involving multiple prime parameters with problems of counting correlations that involve a single prime parameter (e.g. computing a sum \sum_{p < N} e^{2\pi i \alpha p^2} for various real numbers \alpha), which is significantly easier (though not entirely trivial) and amenable to current technology from analytic number theory. So we don’t dispense with the number theory entirely, but thanks to combinatorics we can reduce the amount of difficult number-theoretic work that we have to do to a feasible level.