This is my final Milliman lecture, in which I talk about the sum-product phenomenon in arithmetic combinatorics, and some selected recent applications of this phenomenon to uniform distribution of exponentials, expander graphs, randomness extractors, and detecting (sieving) almost primes in group orbits, particularly as developed by Bourgain and his co-authors.
In the previous two lectures we had concentrated on additive combinatorics - the study of additive operations and patterns on sets. Now we will look at arithmetic combinatorics - the simultaneous study of additive and multiplicative operations on sets; this is sort of a combinatorial analogue of commutative algebra.
There are many questions to study here, but the most basic is the sum-product problem, which we can state as follows. Let A be a finite non-empty set of elements of a ring R (e.g. finite sets of integers, or elements of a cyclic group , or sets of matrices over some ring). Then we can form the sum set
and the product set
To avoid degeneracies, let us assume that none (or very few) of the elements in A are zero divisors (as this may cause to become very small). Then it is easy to see that
and
will be at least as large as A itself.
Typically, both of these sets will be much larger than A itself, indeed, if we select A at random, we generically expect and
to have cardinality comparable to
. But when A enjoys additive or multiplicative structure, the sets
or
can be of size comparable to A. For instance, if A is an arithmetic progression
or an additive subgroup in the ring R (modulo zero divisors, such as 0), then
. Similarly, if A is a geometric progression
or a multiplicative subgroup in the ring R, then
. And of course, if A is both an additive and a multiplicative subgroup of R (modulo zero divisors), i.e. if A is a subring of R, then
and
are both comparable in size to |A|. These examples are robust with respect to small perturbations; for instance, if A is a dense subset of an arithmetic progression or additive subgroup, then it is still the case that
is comparable in size to A. There are also slightly more complicated examples of interest, such as generalised arithmetic progressions, but we will not discuss these here.
Now let us work in the ring of integers . This ring has no non-trivial finite additive subgroups or multiplicative subgroups (and it certainly has no non-trivial finite subrings), but it of course has plenty of arithmetic progressions and geometric progressions. But observe that it is rather difficult for a finite set A of integers to resemble both an arithmetic progression and a geometric progression simultaneously (unless A is very small). So one expects at least one of
and
to be significantly larger than A itself. This claim was made precise by Erdős and Szemerédi, who showed that
(1)
for some absolute constant . The value of this constant as improved steadily over the years; the best result currently is due to Solymosi, who showed that one can take
arbitrarily close to 3/11. Erdős and Szemerédi in fact conjectured that one can take
arbitrarily close to 1 (i.e. for any finite set of integers A, either the sum set or product set has to be very close to its maximal size of
), but this conjecture seems out of reach at present. Nevertheless, even just the epsilon improvement over the trivial bound of |A| is actually quite useful. It is the first example of what is now called the sum-product phenomenon: if a finite set A is not close to an actual subring, then either the sum set
or the product set
must be significantly larger than A itself. One can view (1) as a “robust” version of the assertion that the integers contain no non-trivial finite subrings; (1) is asserting that in fact the integers contain no non-trivial finite sets which even come close to behaving like a subring.
In 1999, Tom Wolff posed the question of whether the sum-product phenomenon held true in finite fields of prime order (note that such fields have no non-trivial subrings), and in particular whether (1) was true when
, and A was not close to being all of
, in the sense that
for some
; of course one would need
to depend on
. (Actually, Tom only posed the question for
, being motivated by finite field analogues of the Kakeya problem, but the question was clearly of interest for other ranges of A as well.) This question was solved in the affirmative by Bourgain, Katz, and myself (in the range
) and then by Bourgain, Glibichuk, and Konyagin (in the full range
); the result is now known as the sum-product theorem for
(and there have since been several further proofs and refinements of this theorem). The fact that the field has prime order is key; if for instance we were working in a field of order
, then by taking A to be the subfield of order p we see that both
and
have exactly the same size as A. So any proof of the sum-product theorem must use at some point the fact that the field has prime order.
As in the integers, one can view the sum-product theorem as a robust assertion of the obvious statement that the field contains no non-trivial subrings. So the main difficulty in the proof is to find a proof of this latter fact which is robust enough to generalise to this combinatorial setting. The standard way to classify subrings is to use Lagrange’s theorem that the order of a subgroup divides the order of the whole group, which is proven by partitioning the whole group into cosets of the subgroup, but this argument is very unstable and does not extend to the combinatorial setting. But there are other ways to proceed. The argument of Bourgain, Katz, and myself (which is based on an earlier argument of Edgar and Miller), roughly speaking, proceeds by investigating the “dimension” of
relative to A, or in other words the least number of elements
in
such that every element of
can be expressed in the form
. Note that the number of such representations is equal to
. The key observation is that as
is prime, it cannot equal
if
, and so by the pigeonhole principle some element must have more than one representation. One can use this “linear dependence” to reduce the dimension by 1 (assuming that A behaves a lot like a subring), and so can eventually reduce to the d=1 case, which is prohibited by our assumption
. (The hypothesis
is needed to ensure that the initial dimension d is bounded, so that the iteration only requires a bounded number of steps.) The argument of Bourgain, Glibichuk, and Konyagin uses a more algebraic method (a variant of the polynomial method of Stepanov), using the basic observation that the number of zeroes of a polynomial (counting multiplicity) is bounded by the degree of that polynomial to obtain upper bounds for various sets (such as the number of parallelograms in A). More recently, a short argument of Garaev proceeds using the simple observation that if A is any non-trivial subset of
, then there must exist
such that
; applying this to the “fraction field”
of A one can conclude that Q[A] does not in fact behave like a field, and hence A does not behave like a ring.
The sum-product phenomenon implies that if a set of medium size
is multiplicatively structured (e.g. it is a geometric progression or a multiplicative subgroup) then it cannot be additively structured:
is significantly larger than A. It turns out that with a little bit of extra work, this observation can be iterated: A+A+A+A is even larger than A, and so on and so forth, and in fact one can show that
for some bounded k depending only on
, where
is the k-fold sumset of A. (The key to this iteration essentially lies in the inclusion
, which is a consequence of the distributive law. The use of this law unfortunately breaks the symmetry between multiplication and addition that one sees in the sum-product estimates.) Thus any multiplicatively structured subset A of
of medium size must eventually additively generate the whole field. As a consequence of this, one can show that A is an additive expander, which roughly speaking means that
is spread out on a significantly larger set than B for any medium-sized set B. (In more probabilistic language, if one considered the random walk whose steps were drawn randomly from A, then this walk would converge extremely rapidly to the uniform distribution.) From that observation (and some more combinatorial effort), one can in fact conclude that multiplicatively structured sets must be distributed uniformly in an additive sense; if they concentrated too much in, say, a subinterval of
, then this could be used to contradict the additive expansion property.
Let me note one cute application of this technology, due to Bourgain, to the Diffie-Hellman key exchange protocol and its relatives in cryptography. Suppose we have two people, Alice and Bob, who want to communicate privately and securely, but have never met each other before and can only contact each other via an unsecured network (e.g. the internet, or physical mail), in which anyone can eavesdrop. How can Alice and Bob achieve this?
If one was sending a physical object (e.g. a physical letter) by physical mail (which could be opened by third parties), one could proceed as follows.
- Alice places the object in a box, and locks the box with her own padlock, keeping the key. She then mails the locked box to Bob. Anyone who intercepts the box cannot open it, since they don’t have Alice’s key.
- Of course, Bob can’t open the box either. But what he can do instead is put his own padlock on the box (keeping the key), and sends the doubly locked box back to Alice.
- Alice can’t unlock Bob’s padlock… but she can unlock her own. So she removes her lock, and sends the singly locked box back to Bob.
- Bob can unlock his own padlock, and so retreives the object safely. At no point was the object available to any interceptor.
A similar procedure (a slight variant of the Diffie-Hellman protocol, essentially the Massey-Omura cryptosystem) can be used to transmit a digital message g (which one should think of as just being a number) from Alice to Bob over an unsecured network, as follows:
- Alice and Bob agree (over the unsecured network) on some large prime p (larger than the maximum size of the message g).
- Alice “locks” the message g by raising it to a power a mod p, where Alice generates the “key” a randomly and keeps it secret. She then sends the locked message
to Bob.
- Bob can’t decode this message (he doesn’t know a), but he doubly locks the message by raising the message to his own power b, and returns the doubly locked message
back to Alice.
- Alice then “unlocks” her part of the message by taking the
root (which can be done by Cauchy’s theorem) and sends
back to Bob.
- Bob then takes the
root of the message and recovers g.
An eavesdropper (let’s call her Eve) could intercept p, as well as the three “locked” values , but does not directly recover g. Now, it is possible that one could use this information to reconstruct g (indeed, if one could quickly take discrete logarithms, then this would be a fairly easy task) but no feasible algorithm for this is known (if p is large, e.g. 500+ digits); the problem is generally believed to be roughly comparable in difficulty to that of factoring large numbers. But no-one knows how to rigorously prove that the Diffie-Hellman reconstruction problem is hard (e.g. non-polynomial time); indeed, this would imply
, since this reconstruction problem is easily seen to be in NP (though it is not believed to be NP-complete).
Using the sum-product technology, Bourgain was at least able to show that the Diffie-Hellman protocol was secure (for sufficiently large p) if Eve was only able to see the high bits of , thus pinning down
to intervals. The reason for this is that the set
has a lot of multiplicative structure (indeed, it is a multiplicative subgroup of the ring
) and so should be uniformly distributed in an additive sense (by adapting the above sum-product technology to
).
Another application of sum-product technology was to build efficient randomness extractors - deterministic algorithms that can create high-quality (very uniform) random bits from several independent low-quality (non-uniform) random sources; such extractors are of importance in computer science and cryptography. Basically, the sum-product estimate implies that if are sets of medium size, then the set
is significantly larger than A, B, or C. As a consequence, if X, Y, Z are independent random variables in
which are not too narrowly distributed(in particular, they are not deterministic, and thus distributed only on a single value), one can show (with the assistance of some additive combinatorics) that the random variable
is significantly more uniformly distributed than X, Y, or Z. Iterating this leads to some surprisingly good randomness extractors, as was first observed by Barak, Impagliazzo, and Wigderson.
Another application of the above sum-product technology was to get a product estimate in matrix groups, such as . Indeed, Helfgott was able to show that if A was a subset of
of medium or small size, and it was not trapped inside a proper subgroup of
, then
was significantly larger than A itself. (One needs to work with triple products here instead of double products for a rather trivial reason: if
was the union of a subgroup and some external element, then
is still comparable in size to A, but
will be much larger. This result may not immediately look like a sum-product estimate, because there is no obvious addition, but it is concealed within the matrix multiplication law for
. The key observation in Helfgott’s argument, which relies crucially on the sum-product estimate, is that if V is a collection of diagonal matrices in
of medium size, and g is a non-diagonal matrix element, then the set
is significantly larger than V itself. If one works out explicitly what this trace is, one sees a sum-product type of result emerging. Conversely, if the trace
of a group-like set A is large, then the conjugacy classes in A are fairly small (since trace is conjugation-invariant), which forces many pairs in A to commute, which creates large sets V of simultaneously commuting (and hence simultaneously diagonalisable) elements, due to the fact (specific to
) that if two elements commute with a third, then they are quite likely to commute with each other. The tension between these two implications is what underlies Helfgott’s results.
The estimate of Helfgott shows that multiplication by medium-size sets in expands rapidly across the group (unless it is trapped in a subgroup). As a consequence of Helfgott’s estimate, Bourgain and Gamburd were able to show that if S was any finite symmetric set of matrices in
which generated a sufficiently large (or more precisely, Zariski dense) subgroup of
, and
was the projection of S to
, then the random walk using
on
was very rapidly mixing, so that after about
steps, the walk was very close to uniform. (The precise statement was that the Cayley graph associated to
for each p formed an expander family.) Quite recently, Bourgain, Gamburd, and Sarnak have applied these results (and generalisations thereof) to the problem of detecting (or sieving) almost primes in thin algebraically generated sets. To motivate the problem, we observe that many classical questions in prime number theory can be rephrased as one of detecting prime points
in algebraic subsets
of a lattice
. For instance, the twin prime problem asks whether the line
contains infinitely many prime points. In general, these problems are very difficult, especially once one considers sets described by polynomials rather than linear functions; even the one-dimensional problem of determining whether the set
contains infinitely many primes has been open for quite a long time (though it is worth mentioning the celebrated result of Friedlander and Iwaniec that the somewhat larger set
is known to have infinitely many primes).
So prime points are hard to detect. However, by using methods from sieve theory, one can often detect almost prime points in various sets - points whose coordinates are the products of only a few primes. For instance, a famous theorem of Chen shows that the line
contains infinitely many points which are almost prime in the sense that the first coordinate is prime, and the second coordinate is the product of at most two primes. The basic idea of sieve theory is to sift out primes and almost primes by removing all points whose coordinates are divisible by small factors (and then, due to various generalisations of the inclusion-exclusion principle, one has to add back in points which are divisible by multiple small factors, and so forth). See my post on sieve theory and the parity problem for further discussion. In order for sieve theory to work well, one needs to be able to accurately count the size of the original set
(or more precisely, the size of this set restricted to a ball or a similar object), and also need to count how many points in that set have a certain residue class modulo q, for various values of q. (For instance, to sieve out twin primes or twin almost primes in the interval {1,..,N}, one needs to count how many elements n in that interval are such that n and n+2 are both invertible modulo q (i.e. coprime to q) for various values of q.)
For arbitrary algebraic sets , these tasks are very difficult. For instance, even the basic task of determining whether a set
described by several polynomials is non-empty is essentially Hilbert’s tenth problem, which is undecidable in general. But if the set
is generated by a group
acting on
(in some polynomial fashion), thus
for some point
, then the problems become much more tractable. If the group
is generated by some finite set S, and we restrict attention to group elements with some given word length, the problem of understanding how
is distributed modulo q is equivalent to asking how random walks on S of a given length distribute themselves on
. This latter problem is very close to the problem solved by the mixing results of Bourgain and Gamburd mentioned earlier, which is where the link to sum-product estimates arises from. Indeed, Bourgain, Gamburd, and Sarnak have now shown that rather general classes of algebraic sets generated by subgroups of
will contain infinitely many almost primes, as long as there are no obvious algebraic obstructions; the methods should hopefully extend to more general groups, such as subgroups of
.
[Update, Dec 7: terminology and typos corrected.]

15 comments
Comments feed for this article
6 December, 2007 at 11:05 pm
intoverflow
If I’m not mistaken, the variant on Diffie-Hellman here presented is the Massey-Omura crypto system. Also, I’ve heard rumors that the box-with-locks system was actually used to move shared keys between US embassies prior to the discovery of public key methods ala Cliff Cocks and RSA, although I can’t personally testify to their truth.
I enjoyed your lectures this week very much, and hope you had a pleasant time in Seattle.
7 December, 2007 at 9:50 am
Terence Tao
Dear intoverflow,
Thanks for pointing out the Massey-Omura system! It seems like this system is usually proposed for fields of small characteristic (such as
), but of course it can be implemented in any finite field.
7 December, 2007 at 12:20 pm
Qi
In your definition of A*A, shoud + be * instead?
I really enjoy this post. Thank you.
7 December, 2007 at 3:09 pm
Harald Helfgott
Dear Terence,
(a) As it happens, I proved |A A A|>|A|^{1+epsilon} not just for middle-sized sets, but for all A that are not very large (|A| < |G|^{0.9999}, say). This was, of course, possible thanks to the strengthening of your result due to Konyagin et al.
(Bourgain and Gamburd need the result only for middle-sized sets, but that is another matter.)
(b) By now, I am not sure that looking at how sums and products “hide” inside the law for matrix multiplication is the best thing to do, or the best way to explain why growth happens in SL_2 (and, let us hope, in some other groups). The entries are not independent variables, and so no direct application of sum-product is possible, even for SL_2(Z/pZ).
Rather, I would say that the sum-product theorem is really about multiplication and linearity; since SL_2 is a linear algebraic group, there is a certain tension between its linearity and the fact that it is a group. The argument connecting V and traces is crucial because it shows that a set A that does not grow would have to have a large intersection with a maximal torus; inside the torus, the group operation is simply multiplication of numbers - and thus we obtain the desired tension between linearity and multiplication.
In the general case, the “linearity” will most like show itself through characters, rather than through explicit computations.
7 December, 2007 at 5:10 pm
Terence Tao
Dear Qi and Harald: thanks for the corrections!
Dear Harald: I like this new perspective, it looks somewhat less “ad hoc” than the current arguments, which bodes well for extension to more general groups. So I guess the point is that having an element g outside the maximal torus induces a nontrivial (but linear) conjugation operation on that torus, which when combined with multiplication causes the expansion.
8 December, 2007 at 11:29 am
mathaaron
Prof Tao,
Can you talk more about why Tom Wolff posted the sum product problem
for the range |A|~ p^1/2 ? You mentioned that it was motivated by Kakeya problem in finite fields. Also, as we know, for any big enough promes P , one can construct A whose size ~|p^1/2 , so that the sum-product of A is less that |A|^3/2 .
8 December, 2007 at 2:14 pm
Guest
Prof. Tao,
Is there any chance this year’s lectures will be made available online (like last years)?
8 December, 2007 at 3:42 pm
Terence Tao
Dear Mathaaron,
Tom was working on the Kakeya conjecture in the three dimensional finite vector space
. He showed that Kakeya sets in that setting have cardinality
. He had some heuristic arguments (based to some extent on consideration of the Heisenberg group, which can be thought of as a sort of 5/2-dimensional manifold in
which contains lots of (complex) lines and so behaves much like a Kakeya set) which suggested that one would need to exploit a sum-product phenomenon to improve this bound. In my paper with Bourgain and Katz, we made these heuristics rigorous by using our sum-product estimate to improve the lower bound for the cardinality of Kakeya sets in
to
for some
.
Dear Guest: You would have to ask someone at the UW math department, I do not know what they will be doing with the video of the lectures.
17 December, 2007 at 3:53 am
Larry
hey! your lecture is very understandable! thanks for this lecture!
22 December, 2007 at 1:05 pm
Harald Helfgott
There is also the issue of whether the sum-product theorem is the “right” theorem at all. One of the latest incarnations of the proof works for sets of rather unequal size. What is more - it seems to me that the proof can be transferred effortlessly to things that are not fields, but modules.
To wit: let G be an abelian group acting on an abelian group A. (In other words, A is a
module.) Then, given a subset A’ of A and a subset G’ of G, and given a non-trivial technical condition (namely, the map on A given by
is required to be injective for all distinct g_1, g_2 in G’), either
(a) we can play with G’ and A’ a bounded number of times (eight or so!) to get at least |G’| |A’| elements of A, or
(b) |G’| |A’| is quite large, and we can play with G’ and A’ a bounded number of times to get all of
(i.e., the submodule of A generated by all our friends in G’ and A’).
End of lemma.
My two favourite examples?
(a) G = A = F, where F is a field. For
, we know that
(since F is cyclic) and thus we get the sum-product theorem we know and love (=yours);
(b) S a solvable (linear algebraic) group, G the maximal torus, A any quotient in the abelian tower of the group of nilpotent elements of S. (G acts on the nilpotents by conjugation.) In SL_n, what I called the “technical condition” above becomes “the restriction of any root map
to G’ is injective”. Perhaps something like that is true in general, but I haven’t checked.
* * *
The technical condition looks somewhat nasty, but it is not hard to construct examples in the style (b) showing that it (or something like it) is necessary. I wonder whether it has a nicer-looking avatar, though.
22 December, 2007 at 8:20 pm
Harald Helfgott
Actually, it seems A need not be abelian at all. Thus,
we should really be talking about an abelian group of automorphisms (was: G) of an arbitrary group (was: A).
The same proof should work there.
Here’s an immediate application of this - the group A could be the upper-triangular matrices in SL_n with 1’s in the diagonal, and the group of automorphisms could consist of conjugation by diagonal matrices.
The “technical condition” can be replaced by a much softer quantitative version of itself. Will write more later. Thanks for an incentive to thought…
22 December, 2007 at 8:43 pm
Terence Tao
Sounds very interesting! Beyond the immediate applications to various interesting rings, the fact that these techniques could apply to pairs of sets of different sizes should also be very useful. If nothing else, they may be able to simplify a lot of the current analysis of Bourgain, Bourgain-Gamburd, and others, in which one has to iteratively expand some set until it stabilises in size, all so that one can apply symmetric sum-product theorems.
24 December, 2007 at 7:40 am
Harald Helfgott
When I first saw a statement of sum-product valid for sets of different sizes, I was quite excited; it is very tempting to think that it should lead to direct proofs of expansion - perhaps uniform expansion for all generators.
However, both my methods and all known statements of sum-product do involve operations involving at least two copies of either of the two given sets A, B (and, thus, at least two copies of the larger of the two sets). This makes the possibility of deriving expansion directly rather more doubtful.
* * *
Returning to the nilpotents: that gives us the statement A*A*A > A^{1+eps} for G = SL_n under the assumption that |A| is more than the
square root of |G| or so. This is a rather broader range than that given by the (very elegant) methods of Gowers and Babai-Nikolov-Pyber.
11 January, 2008 at 10:27 am
Distinguished Lecture Series II: Avi Wigderson, “Expander graphs - constructions and applications” « What’s new
[...] The talk was largely based on (a variant of) these slides. Avi also has a recent monograph with Hoory and Linial on these topics. (For a brief introduction to expanders, I can also recommend Peter Sarnak’s Notices article. I also mention expanders to some extent in my third Milliman lecture.) [...]
18 June, 2008 at 7:21 am
The sum-product phenomenon in arbitrary rings « What’s new
[...] also my third Milliman lecture “Sum-product estimates, expanders, and exponential sums”.] Possibly related posts: (automatically generated)“Mittens” In The RingsI dare youLight [...]