In this lecture, we describe the simple but fundamental Furstenberg correspondence principle which connects the “soft analysis” subject of ergodic theory (in particular, recurrence theorems) with the “hard analysis” subject of combinatorial number theory (or more generally with results of “density Ramsey theory” type). Rather than try to set up the most general and abstract version of this principle, we shall instead study the canonical example of this principle in action, namely the equating of the Furstenberg multiple recurrence theorem with Szemerédi’s theorem on arithmetic progressions.
In 1975, Szemerédi established the following theorem, which had been conjectured in 1936 by Erdős and Turán:
Theorem 1. (Szemerédi’s theorem) Let be an integer, and let A be a set of integers of positive upper density, thus . Then A contains a non-trivial arithmetic progression of length k. (By “non-trivial” we mean that .) [More succinctly: every set of integers of positive upper density contains arbitrarily long arithmetic progressions.]
This theorem is trivial for k=1 and k=2. The first non-trivial case is k=3, which was proven by Roth in 1953 and will be discussed in a later lecture. The k=4 case was also established by Szemerédi in 1969.
In 1977, Furstenberg gave another proof of Szemerédi’s theorem, by establishing the following equivalent statement:
Theorem 2. (Furstenberg multiple recurrence theorem) Let be an integer, let be a measure-preserving system, and let E be a set of positive measure. Then there exists such that is non-empty.
Remark 1. The negative signs here can be easily removed because T is invertible, but I have placed them here for consistency with some later results involving non-invertible transformations, in which the negative sign becomes important.
Exercise 1. Prove that Theorem 2 is equivalent to the apparently stronger theorem in which “is non-empty” is replaced by “has positive measure”, and “there exists ” is replaced by “there exist infinitely many “.
Note that the k=1 case of Theorem 2 is trivial, while the k=2 case follows from the Poincaré recurrence theorem (Theorem 1 from Lecture 8). We will prove the higher k cases of this theorem in later lectures. In this one, we will explain why, for any fixed k, Theorem 1 and Theorem 2 are equivalent.
Let us first give the easy implication that Theorem 1 implies Theorem 2. This follows immediately from
Lemma 1. Let be a measure-preserving system, and let E be a set of positive measure. Then there exists a point x in X such that the recurrence set has positive upper density.
Indeed, from Lemma 1 and Theorem 1, we obtain a point x for which the set contains an arithmetic progression of length k and some step r, which implies that is non-empty.
Proof of Lemma 1. Observe (from the shift-invariance of ) that
On the other hand, the integrand is at most 1. We conclude that for each N, the set must have measure at least . This implies that the function is not absolutely integrable even after excluding an arbitrary set of measure up to , which implies that is not finite a.e., and the claim follows (cf. the proof of the Borel-Cantelli lemma).
Now we show how Theorem 2 implies Theorem 1. If we could pretend that “upper density” was a probability measure on the integers, then this implication would be immediate by applying Theorem 2 to the dynamical system . Of course, we know that the integers do not admit a shift-invariant probability measure (and upper density is not even additive, let alone a probability measure). So this does not work directly. Instead, we need to first lift from the integers to a more abstract universal space and use a standard “compactness and contradiction” argument in order to be able to build the desired probability measure properly.
More precisely, let A be as in Theorem 1. Consider the topological boolean Bernoulli dynamical system with the product topology and the shift . The set A can be viewed as a point in this system, and the orbit closure of that point becomes a subsystem of that Bernoulli system, with the relative topology.
Suppose for contradiction that A contains no non-trivial progressions of length k, thus for all . Then, if we define the cylinder set to be the collection of all points in X which (viewed as sets of integers) contain 0, we see (after unpacking all the definitions) that for all .
In order to apply Theorem 2 and obtain the desired contradiction, we need to find a shift-invariant Borel probability measure on X which assigns a positive measure to E.
For each integer N, consider the measure which assigns a mass of to the points in X for , and no mass to the rest of X. Then we see that . Thus, since A has positive upper density, there exists some sequence going to infinity such that . On the other hand, by vague sequential compactness (Lemma 1 of Lecture 7) we know that some subsequence of converges in the vague topology to a probability measure , which then assigns a positive measure to the (clopen) set E. As the are asymptotically shift invariant, we see that is invariant also (as in the proof of Corollary 1 of Lecture 7). As now has all the required properties, we have completed the deduction of Theorem 1 from Theorem 2.
Exercise 2. Show that Theorem 2 in fact implies a seemingly stronger version of Theorem 1, in which the conclusion becomes the assertion that the set has positive upper density for infinitely many r.
Exercise 3. Show that Theorem 1 in fact implies a seemingly stronger version of Theorem 2: If are sets in a probability space with uniformly positive measure (i.e. ), then for any k there exists positive integers n, r such that .
— Varnavides type theorems —
A similar “compactness and contradiction” argument (combined with a preliminary averaging-over-dilations trick of Varnavides) allows us to use Theorem 2 to imply the following apparently stronger statement (observed by Bergelson, Host, McCutcheon, and Parreau):
Theorem 3. (Uniform Furstenberg multiple recurrence theorem) Let be an integer and . Then for any measure-preserving system and any measurable set E with we have
for all , where is a positive quantity which depends only on k and (i.e. it is uniform over all choices of system and of the set E with measure at least ).
Exercise 4. Assuming Theorem 3, show that if N is sufficiently large depending on k and , then any subset of with cardinality at least will contain at least non-trivial arithmetic progressions of length k, for some . (This result for k=3 was first established by Varnavides via an averaging argument from Roth’s theorem.) Conclude in particular that Theorem 3 implies Theorem 1.
It is clear that Theorem 3 implies Theorem 2; let us now establish the converse. We first use an averaging argument of Varnavides to reduce Theorem 3 to a weaker statement, in which the conclusion (2) is not asserted to hold for all N, but instead one asserts that
is true for some depending only on k and (note that the r=0 term in (2′) has been dropped, otherwise the claim is trivial). To see why one can recover (2) from (2′), observe by replacing the shift T with a power that we can amplify (2′) to
for all a. Averaging (2”) over we easily conclude (2).
It remains to prove that (2”) holds under the hypotheses of Theorem 3. Our next reduction is to observe that for it suffices to perform this task for the boolean Bernoulli system with the cylinder set as before. To see this, recall from Example 5 of Lecture 2 that there is a morphism from any measure-preserving system with a distinguished set E to the system with the product -algebra , the usual shift , and the set , and with the push-forward measure . Specifically, sends any point x in X to its recurrence set . Using this morphism it is not difficult to show that the claim (2) for and E would follow from the same claim for and .
We still need to prove (2”) for the boolean system. The point is that by lifting to this universal setting, the dynamical system and the set E have been canonically fixed; the only remaining parameter is the probability measure . But now we can exploit vague sequential compactness again as follows.
Suppose for contradiction that Theorem 3 failed for the boolean system. Then by carefully negating all the quantifiers, we can find such that for any there is a sequence of shift-invariant probability measures on X with ,
as . Note that if (3) holds for one value of , then it also holds for all smaller values of . A standard diagonalisation argument then allows us to build a sequence as above, but which obeys (3) for all .
Now we are finally in a good position to apply vague sequential compactness. By passing to a subsequence if necessary, we may assume that converges vaguely to a limit , which is a shift-invariant probability measure. In particular we have , while from (3) we see that
for all ; thus the sets all have zero measure for . But this contradicts Theorem 2 (and Exercise 1). This completes the deduction of Theorem 3 from Theorem 2.
— Other recurrence theorems and their combinatorial counterparts —
The Furstenberg correspondence principle can be extended to relate several other recurrence theorems to their combinatorial analogues. We give some representative examples here (without proofs). Firstly, there is a multidimensional version of Szemerédi’s theorem (compare with Exercise 7 from Lecture 4):
Theorem 4 (Multidimensional Szemerédi theorem) Let , let , and let be a set of positive upper Banach density (which means that , where ). Then A contains a pattern of the form for some and .
Note that Theorem 1 corresponds to the special case when and .
This theorem was first proven by Furstenberg and Katznelson, who deduced it via the correspondence principle from the following generalisation of Theorem 2:
Theorem 5 (Recurrence for multiple commuting shifts) Let be an integer, let be a probability space, let be measure-preserving bimeasurable maps which commute with each other, and let E be a set of positive measure. Then there exists such that is non-empty.
Exercise 5. Show that Theorem 4 and Theorem 5 are equivalent.
Exercise 6. State an analogue of Theorem 3 for multiple commuting shifts, and prove that it is equivalent to Theorem 5.
There is also a polynomial version of these theorems (cf. Theorem 1 from Lecture 5), which we will also state in general dimension:
Theorem 6 (Multidimensional polynomial Szemerédi theorem) Let , let be polynomials with , and let be a set of positive upper Banach density. Then A contains a pattern of the form for some and .
This theorem was established by Bergelson and Leibman, who deduced it from
Theorem 7 (Polynomial recurrence for multiple commuting shifts) Let k, , , E be as in Theorem 5, and let be as in Theorem 6. Then there exists such that is non-empty, where we adopt the convention (thus we are making the action of on X explicit).
Exercise 7. Show that Theorem 6 and Theorem 7 are equivalent.
Exercise 8. State an analogue of Theorem 3 for polynomial recurrence for multiple commuting shifts, and prove that it is equivalent to Theorem 7. (Hint: first establish this in the case that each of the are monomials, in which case there is enough dilation symmetry to use the Varnavides averaging trick. Interestingly, if one only restricts attention to one-dimensional systems k=1, it does not seem possible to deduce the uniform polynomial recurrence theorem from the non-uniform polynomial recurrence theorem, thus indicating that the averaging trick is less universal in its applicability than the correspondence principle.)
In the above theorems, the underlying action was given by either the integer group or the lattice group . It is not too difficult to generalise these results to the semigroups and (thus dropping the assumption that the shift maps are invertible), by using a trick similar to that used in Exercise 9 of Lecture 4, or by using the correspondence principle back and forth a few times. A bit more surprisingly, it is possible to extend these results to even weaker objects than semigroups. To describe this we need some more notation.
Define a partial semigroup to be a set G together with a partially defined multiplication operation for some subset , which is associative in the sense that whenever is defined, then is defined and equal to , and vice versa. A good example of a partial semigroup is the finite subsets of a fixed set S, where the multiplication operation is disjoint union, or more precisely when A and B are disjoint, and is undefined otherwise.
Remark 2. One can extend a partial semigroup to be a genuine semigroup by adjoining a new element to G, and redefining multiplication to equal if it was previously undefined (or if one of a or b was already equal to ). However, we will avoid using this trick here, as it tends to complicate the notation a little.
One can take Cartesian products of partial semigroups in the obvious manner to obtain more partial semigroups. In particular, we have the partial semigroup for any , defined as the collection of d-tuples of finite sets of natural numbers (not necessarily disjoint), with the partial semigroup law whenever and are disjoint for each .
If is a probability space and is a partial semigroup, we define a measure-preserving action of G on X to be an assignment of a measure-preserving transformation (not necessarily invertible) to each , such that whenever is defined.
An action T of on X is known as an IP system on X; it is generated by a countable number of commuting measure-preserving transformations, with . (Admittedly, it is possible that the action of the empty set is not necessarily the identity, but this turns out to have a negligible impact on matters.) An action T of is then a collection of d simultaneously commuting IP systems.
Furstenberg and Katznelson showed the following generalisation of Theorem 5:
Theorem 8 (IP multiple recurrence theorem) Let T be an action of on a probability space . Then there exists a non-empty set such that is non-empty, where is the group element which equals A in the position and is the empty set otherwise.
It has a number of combinatorial consequences, such as the following strengthening of Szemerédi’s theorem:
Theorem 9. (IP Szemerédi theorem) Let A be a set of integers of positive upper density, let , and let be infinite. Then A contains an arithmetic progression of length k in which r lies in FS(B), the set of finite sums of B (cf. Hindman’s theorem from Lecture 5).
(There is also a multidimensional version of this theorem, but it requires a fair amount of notation to state properly.)
Exercise 9. Deduce Theorem 9 from Theorem 8.
Exercise 10. Using Theorem 9, show that for any k, and any set of integers A of positive upper density, the set of steps r which occur in the arithmetic progressions in A of length k is syndetic.
Exercise 11. Using Theorem 8, show that if is a finite field, and is the canonical vector space over spanned (in the algebraic sense) by a countably infinite number of basis vectors, show that any subset A of of positive upper Banach density (which means that contains affine subspaces of arbitrarily high dimension.
The IP recurrence theorem is already very powerful, but even stronger theorems are known. For instance, Furstenberg and Katznelson established the following deep strengthening of the Hales-Jewett theorem (Theorem 8 from Lecture 5), as well as of Exercise 11 above:
Theorem 10 (Density Hales-Jewett theorem) Let A be a finite alphabet. If E is a subset of of positive upper Banach density, then E contains a combinatorial line.
This theorem was deduced (via an advanced form of the correspondence principle) by a somewhat complicated recurrence theorem which we will not state here; rather than the action of a group, semigroup, or partial semigroup, one instead works with an ensemble of sets (as in Exercise 3), and furthermore one regularises the system of the probability space and set ensemble (which can collectively be viewed as a random process) to be what Furstenberg and Katznelson call a strongly stationary process, which (very) roughly means that the statistics of this process look “the same” when restricted to any combinatorial subspace of a fixed dimension.
Remark 3. Similar correspondence principles can be established connecting property testing results for graphs and hypergraphs to the measure theory of exchangeable measures: see this paper of myself, and of myself and Austin, for details. There is also a correspondence principle connecting ergodic convergence theorems with a (rather complicated looking) finitary analogue; see the papers of Avigad-Gerhardy-Towsner and of myself on this subject. Finally, we have implicitly been using a similar correspondence principle between topological dynamics and colouring Ramsey theorems in our previous lectures (in particular Lecture 3, Lecture 4, and Lecture 5).
Remark 4. The Furstenberg correspondence principle also comes tantalisingly close to deducing my theorem with Ben that the primes contain arbitrarily long arithmetic progressions from Szemerédi’s theorem. More precisely, they show that any subset A of a genuinely random set of integers with logarithmic-type density B, with A having positive relative upper density with respect to B, contains arbitrarily long arithmetic progressions; see this unpublished note of myself. Unfortunately, the almost primes are not known to quite obey enough “correlation conditions” to behave sufficiently pseudorandomly that these arguments apply to the primes, though perhaps there is still a “softer” way to prove our theorem than the way we did it (there is for instance some recent work by Trevisan, Tulsiani, and Vadhan in this direction).