In this lecture, we use topological dynamics methods to prove some other Ramsey-type theorems, and more specifically the polynomial van der Waerden theorem, the hypergraph Ramsey theorem, Hindman’s theorem, and the Hales-Jewett theorem. In proving these statements, I have decided to focus on the ultrafilter-based proofs, rather than the combinatorial or topological proofs, though of course these styles of proof are also available for each of the above theorems.
— The polynomial van der Waerden theorem —
We first prove a significant generalisation of van der Waerden’s theorem (Theorem 2 from the previous lecture):
Theorem 1 (Polynomial van der Waerden theorem). Let be a tuple of be integer-valued polynomials ) (or tuple for short) with . Then whenever the integers are finitely coloured, one of the colour classes will contain a pattern of the form for some and .
This result is due to Bergelson and Leibman, who proved it using “epsilon and delta” topological dynamical methods. A combinatorial proof was only obtained rather recently by Walters. In these notes, I will translate the Bergelson-Leibman argument to the ultrafilter setting.
Note that the case recovers the ordinary van der Waerden theorem. But the result is significantly stronger; it implies for instance that one of the colour classes contains arbitrarily many shifted geometric progressions , which does not obviously follow from the van der Waerden theorem. The result here only claims a single monochromatic pattern , but it is not hard to amplify this theorem to show that at least one colour class contains infinitely many such patterns.
Remark 1. The theorem can fail if the hypothesis is dropped; consider for instance the case , , , and with the integers partitioned (or coloured) into the odd and even integers. More generally, the theorem fails whenever there exists a modulus N such that the polynomials are never simultaneously equal modulo N. This turns out to be the only obstruction; this is a somewhat difficult recent result of Bergelson, Leibman, and Lesigne.
Exercise 1. Show that the polynomial has a root modulo N for every positive integer N, but has no root in the integers. Thus we see that the Bergelson-Leibman-Lesigne result is stronger than the polynomial van der Waerden theorem; it does not seem possible to directly use the latter to conclude that in every finite colouring of the integers, one of the classes contains the pattern .
Here are the topological dynamics and ultrafilter versions of the above theorem.
Theorem 2 (Polynomial van der Waerden theorem, topological dynamics version). Let be a tuple with . Let be an open cover of a topological dynamical system . Then there exists a set in this cover such that for at least one .
Theorem 3 (Polynomial van der Waerden theorem, ultrafilter version). Let be a tuple with , and let be a minimal ultrafilter. Then there exists such that
for all . (1)
Exercise 2. Show that Theorem 1 and Theorem 2 are equivalent, and that Theorem 3 implies Theorem 2 (or Theorem 1). [For the converse implication, see Exercise 21.]
As in the previous lecture, we shall prove this proposition by induction. However, the induction will be more complicated than just inducting on the number k of polynomials involved, or on the degree of these polynomials, but will instead involve a more complicated measure of the “complexity” of the polynomials being measured. Let us say that a tuple obeys the vdW property if the conclusion of Theorem 3 is true for this tuple. Thus for instance, from Proposition 1 from the previous lecture we know that any tuple of linear polynomials which vanish at the origin will obey the vdW property.
Our goal is to show that every tuple of polynomials which simultaneously vanish at the origin has the vdW property. The strategy will be to reduce from any given tuple to a collection of “simpler” tuples. We first begin with an easy observation, that one can always shift one of the polynomials to be zero:
Lemma 1. (Translation invariance) Let Q be any integer-valued polynomial. Then a tuple obeys the vdW property if and only if has the vdW property.
Proof. Let be minimal. If has the vdW property, then we can find such that
for all . (2)
If we then define one easily verifies that (1) holds (with q replaced by q’), and the claim holds. The converse implication is similar.
Now we come to the key inductive step.
Lemma 2 (Inductive step). Let be a tuple with , and let Q be another integer-valued polynomial. Suppose that for every finite set of integers , the tuple has the vdW property. Then also has the vdW property.
Proof. This will be a reprise of the proof of Proposition 1 from the previous lecture. Given any finite number of pairs with , we see from hypothesis that there exists (depending on these pairs) such that
for all and .
Now, for every and , consider the expression defined by
where are defined recursively as above and
From (3) we see that
for all and , and thus
in this case. For i=0, we have the slightly different identity
We let be arbitrary, and set . By the minimality of p, we can find such that . We thus have
for all . If one then sets
where and , one easily verifies (1) as required.
Let’s see how this lemma is used in practice. Suppose we wish to show that the tuple has the vdW property (where we use r to denote the independent variable). Applying Lemma 2 with , we reduce to showing that the tuples have the vdW property for all finite collections of integers. But observe that all the polynomials in these tuples are linear polynomials that vanish at the origin. By the ordinary van der Waerden theorem, these tuples all have the vdW property, and so has the vdW property also.
A similar argument shows that the tuple has the vdW property whenever are linear polynomials that vanish at the origin. Applying Lemma 1, we see that obeys the vdW property when is also linear and vanishing at the origin.
Now, let us consider a tuple where is also a linear polynomial that vanishes at the origin. The vdW property for this tuple follows from the previously established vdW properties by first applying Lemma 1 to reduce to the case , and then applying Lemma 2 with . Continuing in this fashion, we see that a tuple will also obey the vdW property for any linear that vanish at the origin, for any k and l.
Now the vdW property for the tuple follows from the previously established cases and Lemma 2 with .
Remark 2. It is possible to continue this inductive procedure (known as PET induction; the PET stands, variously, for “polynomial ergodic theorem” or “polynomial exhaustion theorem”); this is carried out in Exercise 3 below.
Exercise 3. Define the top order monomial of a non-zero polynomial with to be . Define the top order monomials of a tuple to be the set of top order monomials of the , not counting multiplicity; for instance, the top order monomials of are . Define the weight vector of a tuple relative to one of its members to be the infinite vector , where each denotes the number of monomials of degree d in the top order monomials of . Thus for instance, the tuple has weight vector with respect to 0, but weight vector with respect to (say) . Let us say that one weight vector is larger than another if there exists such that and for all .
- Show that the space of all weight vectors is a well-ordered set.
- Show that if is a tuple with , has the least degree of all the , and are integers with , then the weight vector of with respect to is strictly smaller than the weight vector of with respect to .
- Using 1., 2., Lemma 1, and Lemma 2, deduce Theorem 3.
Exercise 4. Find a direct proof of Theorem 2 analogous to the “epsilon and delta” proof of Theorem 4 from the previous lecture. (You can look up the paper of Bergelson and Leibman if you’re stuck.)
Exercise 5. Let be vector-valued polynomials (thus each of the d components of each of the is a polynomial) which all vanish at the origin. Show that if is finitely coloured, then one of the colour classes contains a pattern of the form for some and .
Exercise 6. Show that for any polynomial sequence taking values in a torus, there exists integers such that converges to . (One can also tweak the argument to make the converge to positive infinity, by the “doubling up” trick of replacing P(n) with (P(n),P(-n)).) On the other hand, show that this claim can fail with exponential sequences such as for certain values of . Thus we see that polynomials have better recurrence properties than exponentials.
— Ramsey’s theorem —
Given any finite palette K of colours, a vertex set V, and an integer , define a K-coloured hypergraph G = (V,E) of order k on V to be a function , where denotes the k-element subsets of V. Thus for instance a hypergraph of order 1 is a vertex colouring, a hypergraph of order 2 is an edge-coloured complete graph, and so forth. We say that a hypergraph G is monochromatic if the edge colouring function E is constant. If W is a subset of V, we refer to the hypergraph as a subhypergraph of G.
We will now prove the following result:
Theorem 4 (Hypergraph Ramsey theorem). Let K be a finite set, let , and let G = (V,E) be a K-coloured hypergraph of order k on a countably infinite vertex set V. Then G contains arbitrarily large finite monochromatic subhypergraphs.
Remark 3. There is a stronger statement known, namely that G contains an infinitely large monochromatic subhypergraph, but we will not prove this statement, known as the infinite hypergraph Ramsey theorem. In the case k=1, these statements are the pigeonhole principle and infinite pigeonhole principle respectively, and are compared in my article on hard and soft analysis.
Exercise 7. Show that Theorem 4 implies a finitary analogue: given any finite K and positive integers k, m, there exists N such that every K-coloured hyeprgraph of order k on contains a monochromatic subhypergraph on m vertices. (Hint: as in Exercise 3 from the previous lecture [note – exercises have been renumbered recently], one should use a compactness and contradiction argument as in my article on hard and soft analysis.)
It is not immediately obvious, but Theorem 4 is a statement about a topological dynamical system, albeit one in which the underlying group is not the integers , but rather the symmetric group , defined as the group of bijections from V to itself which are the identity outside of a finite set. More precisely, we have
Theorem 5. (Hypergraph Ramsey theorem, topological dynamics version) Let V be a countably infinite set, and let W be a finite subset of V, thus is a subgroup of . Let be a -topological dynamical system, thus is compact metrisable and is an action of on X via homeomorphisms. Let be an open cover of X, such that each is -invariant. Then there exists an element of this cover such that for every finite set there exists a group element such that (i.e. there exists such that for all ).
This claim should be compared with Theorem 2 of this lecture, or Theorem 1 of the previous lecture.
Exercise 8. Show that Theorem 4 and Theorem 5 are equivalent. (Hint: At some point, you will need the use the fact that the quotient space is isomorphic to .)
As before, though, we shall only illustrate the ultrafilter approach to Ramsey’s theorem, leaving the other approaches to exercises. Here, we will not work on the compactified integers , but rather on the compactified permutations . (We will view here as a discrete group; one could also give this group the topology inherited from the product topology on , leading to a slightly coarser (and thus less powerful) compactification, though one which is still sufficient for the arguments here.) This is a semigroup with the usual multiplication law
Let us say that is minimal if is a minimal left-ideal of . One can show (by repeating Exercise 10 from Lecture 3) that every left ideal contains at least one minimal element; in particular, minimal elements exist.
Note that if W is a k-element subset of V, then there is an image map which maps a permutation to its inverse image of W. We can compactify this to a map . (Caution: is not the same thing as , for instance the latter is not even compact.) We can now formulate the ultrafilter version of Ramsey’s theorem:
Theorem 6. (Hypergraph Ramsey theorem, ultrafilter version) Let V be countably infinite, and let be minimal. Then for every finite set W, is constant on , thus for all .
This result should be compared with Proposition 1 from the previous lecture (or Theorem 3 from this lecture).
Exercise 9. Show that Theorem 6 implies both Theorem 4 and Theorem 5.
Proof of Theorem 6. By relabeling we may assume and for some k.
Given any integers , let denote the permutation that swaps j with for all , but leaves all other integers unchanged. We select some non-principal ultrafilter and define the sequence by the formula
(Note that the condition will be asymptotically true thanks to the choice of limits here.)
Let , and let be the a permutation which is the identity outside of . Then we have the identity
for every , where are the elements of in order. Taking limits as , and then inserting the resulting formula into (11), we conclude (after discarding the trivial limits and relabeling the rest) that
and in particular that is independent of (if is the identity outside of . Now let , then we have independent of p’ for all . Taking limits we conclude that is constant on . But from construction we see that p’ lies in the closed minimal ideal , thus . The claim follows.
Exercise 10. Establish Theorem 4 directly by a combinatorial argument without recourse to topological dynamics or ultrafilters. (If you are stuck, I recommend reading the classic text of Graham, Rothschild, and Spencer.)
Exercise 11. Establish Theorem 5 directly by a topological dynamics argument, using combinatorial arguments for the k=1 case but then proceeding by induction afterwards (as in the proof of Theorem 4 from the previous lecture).
Remark 4. More generally, one can interpret the theory of graphs and hypergraphs on a vertex set V through the lens of dynamics of actions; I learned this perspective from Balazs Szegedy, and it underlies my paper on the hypergraph correspondence principle, as well as my more recent paper with Tim Austin on hypergraph property testing.
— Idempotent ultrafilters and Hindman’s theorem —
Thus far, we have been using ultrafilter technology rather lightly, and indeed all of the arguments so far can be converted relatively easily to the topological dynamics formalism, or even a purely combinatorial formalism, with only a moderate amount of effort. But now we will exploit some deeper properties of ultrafilters, which are more difficult to replicate in other settings. In particular, we introduce the notion of an idempotent ultrafilter.
Definition 1. Let be a discrete semigroup, and let be given the usual semigroup operation . An element is idempotent if . (We of course define idempotence analogously if the group operation on S is denoted by + instead of .)
Of course, 0 is idempotent, but the remarkable fact is that many other idempotents exist as well. The key tool for creating this is
Lemma 3 (Ellis-Nakamura lemma). Let S be a discrete semigroup, and let K be a compact non-empty sub-semigroup of . Then K contains at least one idempotent.
Proof: A simple application of Zorn’s lemma shows that K contains a compact non-empty sub-semigroup K’ which is minimal with respect to set inclusion. We claim that every element of K’ is idempotent. To see this, let p be an arbitrary element of K’. Then observe that K’p is a compact non-empty sub-semigroup of K’ and must therefore be equal to K’; in particular, . (Note that semigroups need not contain an identity.) In particular, the stabiliser is non-empty. But one easily observes that this stabiliser is also a compact sub-semigroup of K’, and so K”=K’. In particular, p must stabilise itself, i.e. it is idempotent.
Remark 5. A posteriori, this results shows that the minimal non-empty sub-semigroups K’ are in fact just the singleton sets consisting of idempotents. Of course, one cannot see this without first deriving all of Lemma 3.
Idempotence turns out to be particularly powerful when combined with minimality, and to this end we observe the following corollary of the above lemma:
Corollary 4. Let S be a discrete semigroup. For every , there exists which is both minimal and idempotent.
Proof. By Exercise 10 from Lecture 3, there exists which is minimal. It is then easy to see that every element of is minimal. Since is a compact non-empty sub-semigroup of , the claim now follows from Lemma 3.
Remark 6. Somewhat amusingly, minimal idempotent ultrafilters require three distinct applications of Zorn’s lemma to construct: one to define the compactified space , one to locate a minimal left-ideal, and one to locate an idempotent inside that ideal! It seems particularly challenging therefore to define civilised substitutes for this tool which do not explicitly use the axiom of choice.
What can we do with minimal idempotent ultrafilters? One particularly striking example is Hindman’s theorem. Given any set A of positive integers, define FS(A) to be the set of all finite sums from A, where B ranges over all finite non-empty subsets of A. (For instance, if are the powers of 2, then .)
Theorem 7 (Hindman’s theorem) Suppose that the natural numbers are finitely coloured. Then one of the colour classes contains a set of the form FS(A) for some infinite set A.
Remark 7. Theorem 7 implies Folkman’s theorem, which has the same hypothesis but concludes that one of the colour classes contains sets of the form FS(A) for arbitrarily large but finite sets A. It does not seem possible to easily deduce Hindman’s theorem from Folkman’s theorem.
Exercise 12. Folkman’s theorem in turn implies Schur’s theorem, which asserts that if the natural numbers are finitely coloured, one of the colour classes contains a set of the form for some x, y (compare with the k=3 case of van der Waerden’s theorem). Using the Cayley graph construction, deduce Schur’s theorem from Ramsey’s theorem (the k=2 case of Theorem 4). Thus we see that there are some connections between the various Ramsey-type theorems discussed here.
Proof of Theorem 7. By Corollary 4, we can find a minimal idempotent element p in ; note that as no element of is minimal (cf. Exercise 7 from Lecture 3), we know that . Let denote the given colouring function, then is a colour in . Since
we may find a positive integer such that . Now from (14), (15) and the similar calculations
we can find an integer such that , thus for all . Continuing inductively in this fashion, one can find such that for all and all k. If we set , the claim follows.
Remark 8. Purely combinatorial (and quite succinct) proofs of Hindman’s theorem exist – see for instance the one in the text by Graham, Rothschild, and Spencer – but they generally rely on some ad hoc trickery. Here, the trickery has been encapsulated into the existence of minimal idempotent ultrafilters, which can be reused in other contexts (for instance, we will use it to prove the Hales-Jewett theorem below).
Exercise 13. Define an IP-set to be a set of positive integers which contains a subset of the form FS(A) for some infinite A. Show that if an IP-set S is finitely coloured, then one of its colour classes is also an IP-set. (Hint: S contains FS(A) for some infinite . Show that the set is a compact non-empty semigroup and thus contains a minimal idempotent ultrafilter p. Use this p to repeat the proof of Theorem 7.)
— The Hales-Jewett theorem —
Given a finite alphabet A, let be the free semigroup generated by A, i.e. the set of all finite non-empty words using the alphabet A, with concatenation as the group operation. (E.g. if , then contains words such as abc and cbb, with .) If we add another letter * to A (the “wildcard” letter), we create a larger semigroup (e.g. containing words such as ab**c*). We of course assume that * was not already present in A. Given any letter , we have a semigroup homomorphism which substitutes every occurrence of the wildcard * with x and leaves all other letters unchanged. (For instance, .) Define a combinatorial line in to be any set of the form for some . For instance, if , then is a combinatorial line, generated by the word .
We shall prove the following fundamental theorem.
Theorem 8 (Hales-Jewett theorem). Let A be a finite alphabet. If is finitely coloured, then one of the colour classes contains a combinatorial line.
Exercise 14. Show that the Hales-Jewett theorem has the following equivalent formulation: for every finite alphabet A and any there exists N such that if is partitioned into m classes, then one of the classes contains a combinatorial line.
Exercise 15. Assume the Hales-Jewett theorem. In this exercise we compare the strength of this theorem against other Ramsey-type theorems.
- Deduce van der Waerden’s theorem (Theorem 2 from the previous lecture.) Hint: the base k representation of the non-negative natural numbers provides a map from to .
- Deduce the multidimensional van der Waerden’s theorem of Gallai (Exercise 7 from the previous lecture.)
- Deduce the syndetic van der Waerden theorem of Furstenberg: if the integers are finitely coloured and k is a positive integer, then there are infinitely many monochromatic arithmetic progressions of length k, and furthermore the set of all the step sizes r which appear in such progressions is syndetic. (Hint: argue by contradiction, assuming that the set of all step sizes has arbitrarily long gaps, and use the Hales-Jewett theorem in a manner adapted to these gaps.) For an additional challenge, show that there exists a single colour class whose progressions of length k have spacings in a syndetic set for every k.
- Deduce the IP-van der Waerden theorem: If the integers are finitely coloured, k is a positive integer, and S is an IP-set (see Exercise 13), show that there are infinitely many monochromatic arithmetic progressions of length k whose step size lies in S. (For an additional challenge, show that one of the colour classes has the property that for every k, the spacings of the k-term progressions in that class forms an -set, i.e. it has non-empty intersection with every IP-set. There is an even stronger topological dynamics version of this statement, due to Furstenberg and Weiss, which I will not describe here.)
- For any , define a d-dimensional combinatorial subspace of to be a set of the form , where is a word containing at least one copy of each of the d wildcards , and is the homomorphism that substitutes each wildcard with . Show that if is finitely coloured, then one of the colour classes contains arbitrarily high-dimensional combinatorial subspaces.
- Let F be a finite field. If the vector space (the inverse limit of the finite vector spaces ) is finitely coloured, show that one of the colour classes contains arbitrarily high-dimensional affine subspaces over F. (This geometric Ramsey theorem is due to Graham, Leeb, and Rothschild.)
We now give an ultrafilter-based proof of the Hales-Jewett theorem due to Blass. As usual, the first step is to obtain a statement involving ultrafilters rather than colourings:
Proposition 1. (Hales-Jewett theorem, ultrafilter version) Let A be a finite alphabet, and let p be a minimal idempotent element of the semigroup . Then there exists such that for all .
Exercise 16. Deduce Theorem 8 from Proposition 1.
To prove Proposition 1, we need a variant of Corollary 4. If is a discrete semigroup and p and q are two idempotents in , let us write if we have .
Exercise 17. Show that is a partial ordering on the idempotents of , and that an idempotent is minimal in if and only if it is minimal with respect to .
Lemma 4. Let S be a discrete semigroup, and let p be an idempotent in . Then there exists a minimal idempotent q in such that .
Proof of Lemma 4. By Exercise 10 from Lecture 3 (generalised to arbitrary discrete semigroups S), contains a minimal left-ideal . By Lemma 3, contains an idempotent s. Since and p is idempotent, we conclude sp=s. If we then set , we easily check that q is idempotent, that , and (since q lies in the minimal left-ideal ) it is minimal. The claim follows.
Proof of Proposition 1. Since p is an idempotent element of , it is also an idempotent element of . However, it is not minimal in that semigroup; for instance, does not contain p, since and are disconnected from each other. However, by Lemma 4, we can find a minimal idempotent q in such that . By the above discussion, q is not equal to p; since p is minimal in , we see that q must lie outside of .
Now let . Since is a homomorphism,
is also a homomorphism (why?). Since q is idempotent and (note that these are both purely algebraic statements), we conclude that is idempotent and . But is minimal in , hence by Exercise 17, we have . The claim follows.
Exercise 18. Adapt the above proof to give an alternate proof of the ultrafilter version of van der Waerden’s theorem (Proposition 1 from the previous lecture) which relies on idempotence rather than on induction on k. (If you are stuck, read the proof of Proposition 1.55 of Glasner’s book.)
Remark 9. Several of the above Ramsey-type theorems can be unified. For instance, the polynomial van der Waerden theorem and the Hales-Jewett theorem have been unified into the polynomial Hales-Jewett theorem of Bergelson and Leibman (see also the more recent paper of Walters on this topic). This type of Ramsey theory is still an active subject, and we do not yet have a comprehensive and systematic theory (or a “universal” Ramsey theorem) that encompasses all known examples.
Exercise 19. Let X be an at most countable set (with the discrete topology), and let be a family of subsets of X. Show that the following two statements are equivalent:
- Whenever X is finitely coloured, one of the colour classes contains a subset in .
- There exists such that every neighbourhood of p contains a subset in .
Exercise 20. Let X, Y be at most countable sets with the discrete topology, and let be a finite collection of functions. Show that the following two statements are equivalent:
- Whenever X is finitely coloured, one of the colour classes contains a set for some .
- There exists such that .
(Hint: look at the closure of in $(\beta X)^k$.)
Exercise 21. Using the previous exercise, deduce Theorem 3 from Theorem 1, and deduce Theorem 6 from Theorem 4.
[Update, Jan 24: extra factors added to the polynomial in Exercise 1 to sidestep issues at 2 and 3.]
[Update, Jan 30: remarks added and renumbered.]
[Update, Feb 1: More exercises added.]
[Update, Mar 21: Various corrections.]
[Update, Mar 25: Statement of Theorem 5 repaired.]
[Update, Mar 31 2009: Proof of Proposition 1 expanded.]