In this lecture, we use topological dynamics methods to prove some other Ramsey-type theorems, and more specifically the polynomial van der Waerden theorem, the hypergraph Ramsey theorem, Hindman’s theorem, and the Hales-Jewett theorem. In proving these statements, I have decided to focus on the ultrafilter-based proofs, rather than the combinatorial or topological proofs, though of course these styles of proof are also available for each of the above theorems.

– The polynomial van der Waerden theorem –

We first prove a significant generalisation of van der Waerden’s theorem (Theorem 2 from the previous lecture):

Theorem 1 (Polynomial van der Waerden theorem). Let (P_1,\ldots,P_k) be a tuple of be integer-valued polynomials P_1,\ldots,P_k: {\Bbb Z} \to {\Bbb Z}) (or tuple for short) with P_1(0)=\ldots=P_k(0). Then whenever the integers are finitely coloured, one of the colour classes will contain a pattern of the form n + P_1(r), \ldots, n+P_k(r) for some n \in {\Bbb Z} and r \in {\Bbb N}.

This result is due to Bergelson and Leibman, who proved it using “epsilon and delta” topological dynamical methods. A combinatorial proof was only obtained rather recently by Walters. In these notes, I will translate the Bergelson-Leibman argument to the ultrafilter setting.

Note that the case P_j(r) := (j-1)r recovers the ordinary van der Waerden theorem. But the result is significantly stronger; it implies for instance that one of the colour classes contains arbitrarily many shifted geometric progressions n + r, n+r^2, \ldots, n+r^k, which does not obviously follow from the van der Waerden theorem. The result here only claims a single monochromatic pattern n + P_1(r), \ldots, n+P_k(r), but it is not hard to amplify this theorem to show that at least one colour class contains infinitely many such patterns.

Remark 1. The theorem can fail if the hypothesis P_1(0)=\ldots=P_k(0) is dropped; consider for instance the case k = 2, P_1(r) = 0, P_2(r) = 2r+1, and with the integers partitioned (or coloured) into the odd and even integers. More generally, the theorem fails whenever there exists a modulus N such that the polynomials P_1, \ldots, P_k are never simultaneously equal modulo N. This turns out to be the only obstruction; this is a somewhat difficult recent result of Bergelson, Leibman, and Lesigne. \diamond

Exercise 1. Show that the polynomial P(r) := (r^2-2)(r^2-3)(r^2-6)(r^2-7)(r^3-3) has a root modulo N for every positive integer N, but has no root in the integers. Thus we see that the Bergelson-Leibman-Lesigne result is stronger than the polynomial van der Waerden theorem; it does not seem possible to directly use the latter to conclude that in every finite colouring of the integers, one of the classes contains the pattern n, n+P(r). \diamond

Here are the topological dynamics and ultrafilter versions of the above theorem.

Theorem 2 (Polynomial van der Waerden theorem, topological dynamics version). Let (P_1,\ldots,P_k) be a tuple with P_1(0)=\ldots=P_k(0). Let (U_\alpha)_{\alpha \in A} be an open cover of a topological dynamical system (X, {\mathcal F}, T). Then there exists a set U_\alpha in this cover such that T^{P_1(r)} U \cap \ldots \cap T^{P_k(r)} U \neq \emptyset for at least one r > 0.

Theorem 3 (Polynomial van der Waerden theorem, ultrafilter version). Let (P_1,\ldots,P_k) be a tuple with P_1(0)=\ldots=P_k(0), and let p \in \beta {\Bbb Z} be a minimal ultrafilter. Then there exists q \in \beta ({\Bbb Z} \times {\Bbb N}) such that

\lim_{(n,r) \to q} n + P_i(r) + p = p for all 1 \leq i \leq k. (1)

Exercise 2. Show that Theorem 1 and Theorem 2 are equivalent, and that Theorem 3 implies Theorem 2 (or Theorem 1). [For the converse implication, see Exercise 21.] \diamond

As in the previous lecture, we shall prove this proposition by induction. However, the induction will be more complicated than just inducting on the number k of polynomials involved, or on the degree of these polynomials, but will instead involve a more complicated measure of the “complexity” of the polynomials being measured. Let us say that a tuple (P_1,\ldots,P_k) obeys the vdW property if the conclusion of Theorem 3 is true for this tuple. Thus for instance, from Proposition 1 from the previous lecture we know that any tuple of linear polynomials which vanish at the origin will obey the vdW property.

Our goal is to show that every tuple of polynomials which simultaneously vanish at the origin has the vdW property. The strategy will be to reduce from any given tuple to a collection of “simpler” tuples. We first begin with an easy observation, that one can always shift one of the polynomials to be zero:

Lemma 1. (Translation invariance) Let Q be any integer-valued polynomial. Then a tuple (P_1,\ldots,P_k) obeys the vdW property if and only if (P_1-Q, \ldots,P_k-Q) has the vdW property.

Proof. Let p\in \beta {\Bbb Z} be minimal. If (P_1-Q,\ldots,P_k-Q) has the vdW property, then we can find q\in \beta ({\Bbb Z} \times {\Bbb N}) such that

\lim_{(n,r) \to q} n + P_i(r) - Q(r) + p = p for all 1 \leq i \leq k. (2)

If we then define q' := \lim_{(n,r) \to q} (n - Q(r),r) \in \beta({\Bbb Z} \times {\Bbb N}) one easily verifies that (1) holds (with q replaced by q’), and the claim holds. The converse implication is similar. \Box

Now we come to the key inductive step.

Lemma 2 (Inductive step). Let (P_0,P_1,\ldots,P_k) be a tuple with P_0=0, and let Q be another integer-valued polynomial. Suppose that for every finite set of integers h_1,\ldots,h_m, the tuple ( P_i( \cdot + h_j ) - P_i( h_j ) - Q(\cdot) )_{1 \leq i \leq k; 1 \leq j \leq m} has the vdW property. Then (0,P_1,\ldots,P_k) also has the vdW property.

Proof. This will be a reprise of the proof of Proposition 1 from the previous lecture. Given any finite number of pairs (n_1,r_1),\ldots,(n_{b-1},r_{b-1}) \in {\Bbb Z} \times {\Bbb N} with b \geq 1, we see from hypothesis that there exists q_b \in \beta({\Bbb Z} \times {\Bbb N}) (depending on these pairs) such that

\lim_{(n_b,r_b) \to q_b} n_b + P_i(r_{a+1}+\ldots+r_b) - P_i(r_{a+1}+\ldots+r_{b-1})

- Q(r_b) + p = p (3)

for all 0 \leq a < b and 1 \leq i \leq k.

Now, for every 0 \leq a \leq b and 0 \leq i \leq k, consider the expression p_{a,b,i} \in \beta {\Bbb Z} + p defined by

p_{a,b,i} := \lim_{(n_1,r_1) \to q_1} \ldots \lim_{(n_b,r_b) \to q_b} P_i(r_{a+1}+\ldots+r_b) + m_b + p, (4)

where q_1,\ldots,q_b are defined recursively as above and

m_b := \sum_{i=1}^b n_i - Q(r_i) (5).

From (3) we see that

p_{a,b,i} = p_{a,b-1,i} (6)

for all 0 \leq a < b and 1 \leq i \leq k, and thus

p_{a,b,i} = p_{a,a,i} = p_{a,a,0} (7)

in this case. For i=0, we have the slightly different identity

p_{a,b,0} = p_{b,b,0}. (7′)

We let p_* \in \beta {\Bbb Z}/{\Bbb Z} be arbitrary, and set p' := \lim_{a \to p_*} p_{a,a,0} = \lim_{b \to p_*} p_{b,b,0} \in   \beta {\Bbb Z} + p. By the minimality of p, we can find p'' \in \beta{\Bbb Z} such that p''+p'=p. We thus have

\lim_{h \to p''} \lim_{a \to p_*} \lim_{b \to p_*} h + p_{a,b,i} = p (8)

for all 0 \leq i \leq k. If one then sets

q :=  \lim_{h \to p''} \lim_{a \to p_*} \lim_{b \to p_*} \lim_{(n_1,r_1) \to q_1} \ldots \lim_{(n_b,r_b) \to q_b} (n,r) (9)

where n := h+m_b and r := r_{a+1}+\ldots+r_b, one easily verifies (1) as required. \Box

Let’s see how this lemma is used in practice. Suppose we wish to show that the tuple (0, r^2) has the vdW property (where we use r to denote the independent variable). Applying Lemma 2 with Q(r) := r^2, we reduce to showing that the tuples ((r+h_1)^2 - h_1^2 - r^2, \ldots, (r+h_m)^2 - h_m^2 - r^2 ) have the vdW property for all finite collections h_1,\ldots,h_m of integers. But observe that all the polynomials in these tuples are linear polynomials that vanish at the origin. By the ordinary van der Waerden theorem, these tuples all have the vdW property, and so (0,r^2) has the vdW property also.

A similar argument shows that the tuple (0, r^2 + P_1(r), \ldots, r^2 + P_k(r)) has the vdW property whenever P_1,\ldots,P_k are linear polynomials that vanish at the origin. Applying Lemma 1, we see that (Q_1(r), r^2 + P_1(r), \ldots, r^2 + P_k(r)) obeys the vdW property when Q_1 is also linear and vanishing at the origin.

Now, let us consider a tuple (Q_1(r),  Q_2(r), r^2+P_1(r),\ldots,r^2+P_k(r)) where Q_2 is also a linear polynomial that vanishes at the origin. The vdW property for this tuple follows from the previously established vdW properties by first applying Lemma 1 to reduce to the case Q_1=0, and then applying Lemma 2 with Q=Q_2. Continuing in this fashion, we see that a tuple (Q_1(r),\ldots,Q_l(r), r^2+P_1(r),\ldots,r^2+P_k(r)) will also obey the vdW property for any linear Q_1,\ldots,Q_l,P_1,\ldots,P_k that vanish at the origin, for any k and l.

Now the vdW property for the tuple (0, r^2, 2r^2) follows from the previously established cases and Lemma 2 with Q(r) = r^2.

Remark 2. It is possible to continue this inductive procedure (known as PET induction; the PET stands, variously, for “polynomial ergodic theorem” or “polynomial exhaustion theorem”); this is carried out in Exercise 3 below. \diamond

Exercise 3. Define the top order monomial of a non-zero polynomial P(r) = a_d r^d + \ldots + a_0 with a_d \neq 0 to be a_d r^d. Define the top order monomials of a tuple (0,P_1,\ldots,P_k) to be the set of top order monomials of the P_1,\ldots,P_k, not counting multiplicity; for instance, the top order monomials of (0, r^2, r^2+r, 2r^2,  2r^2+r) are \{r^2,2r^2\}. Define the weight vector of a tuple (P_1,\ldots,P_k) relative to one of its members P_i to be the infinite vector (w_1,w_2,\ldots) \in {\Bbb Z}_{\geq 0}^{\Bbb N}, where each w_d denotes the number of monomials of degree d in the top order monomials of (P_1-P_i,\ldots,P_k-P_i). Thus for instance, the tuple (0, r^2, r^2+r, 2r^2,  2r^2+r) has weight vector (0,2,0,\ldots) with respect to 0, but weight vector (1,2,0,\ldots) with respect to (say) r^2. Let us say that one weight vector (w_1,w_2,\ldots) is larger than another (w'_1,w'_2,\ldots) if there exists d \geq 1 such that w_d > w'_d and w_i = w'_i for all i > d.

  1. Show that the space of all weight vectors is a well-ordered set.
  2. Show that if (0,P_1,\ldots,P_k) is a tuple with k \geq 1, P_1 has the least degree of all the P_i, and h_1,\ldots,h_m are integers with m \geq 1, then the weight vector of ( P_i( \cdot + h_j ) - P_i( h_j ) )_{1 \leq i \leq k; 1 \leq j \leq m} with respect to P_1 is strictly smaller than the weight vector of (0,P_1,\ldots,P_k) with respect to P_1.
  3. Using 1., 2., Lemma 1, and Lemma 2, deduce Theorem 3. \diamond

Exercise 4. Find a direct proof of Theorem 2 analogous to the “epsilon and delta” proof of Theorem 4 from the previous lecture. (You can look up the paper of Bergelson and Leibman if you’re stuck.) \diamond

Exercise 5. Let P_1,\ldots,P_k: {\Bbb Z} \to {\Bbb Z}^d be vector-valued polynomials (thus each of the d components of each of the P_i is a polynomial) which all vanish at the origin. Show that if {\Bbb Z}^d is finitely coloured, then one of the colour classes contains a pattern of the form n + P_1(r),\ldots,n+P_k(r) for some n \in {\Bbb Z}^d and r \in {\Bbb N}. \diamond

Exercise 6. Show that for any polynomial sequence P: {\Bbb Z} \to ({\Bbb R}/{\Bbb Z})^d taking values in a torus, there exists integers n_j \to \infty such that P(n_j) converges to P(0). (One can also tweak the argument to make the n_j converge to positive infinity, by the “doubling up” trick of replacing P(n) with (P(n),P(-n)).) On the other hand, show that this claim can fail with exponential sequences such as P(n) := 10^n \alpha \hbox{ mod } 1 \in {\Bbb R}/{\Bbb Z} for certain values of \alpha. Thus we see that polynomials have better recurrence properties than exponentials. \diamond

– Ramsey’s theorem –

Given any finite palette K of colours, a vertex set V, and an integer k  \geq 1, define a K-coloured hypergraph G = (V,E) of order k on V to be a function E: \binom{V}{k} \to K, where \binom{V}{k} := \{ e \subset V: |e|=k\} denotes the k-element subsets of V. Thus for instance a hypergraph of order 1 is a vertex colouring, a hypergraph of order 2 is an edge-coloured complete graph, and so forth. We say that a hypergraph G is monochromatic if the edge colouring function E is constant. If W is a subset of V, we refer to the hypergraph G\downharpoonright_W := (W, E\downharpoonright_{\binom{W}{k}}) as a subhypergraph of G.

We will now prove the following result:

Theorem 4 (Hypergraph Ramsey theorem). Let K be a finite set, let k \geq 1, and let G = (V,E) be a K-coloured hypergraph of order k on a countably infinite vertex set V. Then G contains arbitrarily large finite monochromatic subhypergraphs.

Remark 3. There is a stronger statement known, namely that G contains an infinitely large monochromatic subhypergraph, but we will not prove this statement, known as the infinite hypergraph Ramsey theorem. In the case k=1, these statements are the pigeonhole principle and infinite pigeonhole principle respectively, and are compared in my article on hard and soft analysis. \diamond

Exercise 7. Show that Theorem 4 implies a finitary analogue: given any finite K and positive integers k, m, there exists N such that every K-coloured hyeprgraph of order k on \{1,\ldots,N\} contains a monochromatic subhypergraph on m vertices. (Hint: as in Exercise 3 from the previous lecture [note - exercises have been renumbered recently], one should use a compactness and contradiction argument as in my article on hard and soft analysis.) \diamond

It is not immediately obvious, but Theorem 4 is a statement about a topological dynamical system, albeit one in which the underlying group is not the integers {\Bbb Z}, but rather the symmetric group \hbox{Sym}_0(V), defined as the group of bijections from V to itself which are the identity outside of a finite set. More precisely, we have

Theorem 5. (Hypergraph Ramsey theorem, topological dynamics version) Let V be a countably infinite set, and let W be a finite subset of V, thus \hbox{Sym}_0(W) \times \hbox{Sym}_0(V \backslash W) is a subgroup of \hbox{Sym}_0(V). Let (X, {\mathcal F}, T) be a \hbox{Sym}_0(V)-topological dynamical system, thus (X,{\mathcal F}) is compact metrisable and T: \sigma \mapsto T^\sigma is an action of \hbox{Sym}_0(V) on X via homeomorphisms. Let (U_\alpha)_{\alpha \in A} be an open cover of X, such that each U_\alpha is \hbox{Sym}_0(W) \times \hbox{Sym}_0(V \backslash W)-invariant. Then there exists an element U_\alpha of this cover such that for every finite set \Gamma \subset \hbox{Sym}_0(V) there exists a group element \sigma \in \hbox{Sym}_0(V) such that \bigcap_{\gamma \in \Gamma} (T^{\gamma \sigma})^{-1}(U_\alpha) \neq \emptyset (i.e. there exists x \in X such that T^{\gamma \sigma} x \in U_\alpha for all \gamma \in \Gamma).

This claim should be compared with Theorem 2 of this lecture, or Theorem 1 of the previous lecture.

Exercise 8. Show that Theorem 4 and Theorem 5 are equivalent. (Hint: At some point, you will need the use the fact that the quotient space \hbox{Sym}_0(V)/(\hbox{Sym}_0(W) \times \hbox{Sym}_0(V \backslash W)) is isomorphic to \binom{V}{|W|}.) \diamond

As before, though, we shall only illustrate the ultrafilter approach to Ramsey’s theorem, leaving the other approaches to exercises. Here, we will not work on the compactified integers \beta {\Bbb Z}, but rather on the compactified permutations \beta \hbox{Sym}_0(V). (We will view \hbox{Sym}_0(V) here as a discrete group; one could also give this group the topology inherited from the product topology on V^V, leading to a slightly coarser (and thus less powerful) compactification, though one which is still sufficient for the arguments here.) This is a semigroup with the usual multiplication law

pq := \lim_{\sigma \to p} \lim_{\rho \to q} \sigma \rho. (10)

Let us say that p \in \beta \hbox{Sym}_0(V) is minimal if \beta \hbox{Sym}_0(V) p is a minimal left-ideal of \beta \hbox{Sym}_0(V). One can show (by repeating Exercise 10 from Lecture 3) that every left ideal \beta\hbox{Sym}_0(V) p contains at least one minimal element; in particular, minimal elements exist.

Note that if W is a k-element subset of V, then there is an image map \pi_W: \hbox{Sym}_0(V) \to \binom{V}{k} which maps a permutation \sigma to its inverse image \sigma^{-1}(W) of W. We can compactify this to a map \beta \pi_W: \beta \hbox{Sym}_0(V) \to \beta \binom{V}{k}. (Caution: \beta \binom{V}{k} is not the same thing as \binom{\beta V}{k}, for instance the latter is not even compact.) We can now formulate the ultrafilter version of Ramsey’s theorem:

Theorem 6. (Hypergraph Ramsey theorem, ultrafilter version) Let V be countably infinite, and let p \in \beta \hbox{Sym}_0(V) be minimal. Then for every finite set W, \beta \pi_W is constant on \beta \hbox{Sym}_0(V) p, thus \beta \pi_W( qp ) = \beta \pi_W(p) for all q \in \beta \hbox{Sym}_0(V).

This result should be compared with Proposition 1 from the previous lecture (or Theorem 3 from this lecture).

Exercise 9. Show that Theorem 6 implies both Theorem 4 and Theorem 5. \diamond

Proof of Theorem 6. By relabeling we may assume V = \{1,2,3,\ldots\} and W = \{1,\ldots,k\} for some k.

Given any integers 1 \leq a < i_1 < i_2 < \ldots < i_a, let \sigma_{i_1,\ldots,i_a} \in \hbox{Sym}_0(V) denote the permutation that swaps j with i_j for all 1 \leq j \leq a, but leaves all other integers unchanged. We select some non-principal ultrafilter p_* := \beta V \backslash V and define the sequence p_1, p_2, \ldots \in \beta \hbox{Sym}_0(V) by the formula

p_a := \lim_{i_1 \to p_*} \ldots \lim_{i_a \to p_*} \sigma_{i_1,\ldots,i_a} p. (11)

(Note that the condition a < i_1 < \ldots < i_a will be asymptotically true thanks to the choice of limits here.)

Let a \geq k, and let \alpha \in \hbox{Sym}_0(V) be the a permutation which is the identity outside of \{1,\ldots,a\}. Then we have the identity

\pi_W( \alpha \sigma_{i_1,\ldots,i_a} \rho ) = \pi_W( \sigma_{i_{j_1},\ldots,i_{j_k}} \rho ) (12)

for every \rho \in \hbox{Sym}_0(V), where j_1 < \ldots < j_k are the elements of \alpha^{-1}(\{1,\ldots,k\}) in order. Taking limits as \rho \to p, and then inserting the resulting formula into (11), we conclude (after discarding the trivial limits and relabeling the rest) that

\beta \pi_W(\alpha p_a) = \lim_{i_1 \to p_*} \ldots \lim_{i_k \to p_*} \beta \pi_W(\sigma_{i_1,\ldots,i_k} p) (13)

and in particular that \beta \pi_W(\alpha p_a) is independent of \alpha (if \alpha is the identity outside of \{1,\ldots,j\}. Now let p' := \lim_{a \to p_*} p_a, then we have \beta \pi_W(\alpha p') independent of p’ for all \alpha \in \hbox{Sym}_0(V). Taking limits we conclude that \beta \pi_W is constant on (\beta \hbox{Sym}_0(V)) p'. But from construction we see that p’ lies in the closed minimal ideal (\beta \hbox{Sym}_0(V) )p, thus (\beta \hbox{Sym}_0(V)) p' = (\beta \hbox{Sym}_0(V)) p. The claim follows. \Box

Exercise 10. Establish Theorem 4 directly by a combinatorial argument without recourse to topological dynamics or ultrafilters. (If you are stuck, I recommend reading the classic text of Graham, Rothschild, and Spencer.) \diamond

Exercise 11. Establish Theorem 5 directly by a topological dynamics argument, using combinatorial arguments for the k=1 case but then proceeding by induction afterwards (as in the proof of Theorem 4 from the previous lecture). \diamond

Remark 4. More generally, one can interpret the theory of graphs and hypergraphs on a vertex set V through the lens of dynamics of \hbox{Sym}_0(V) actions; I learned this perspective from Balazs Szegedy, and it underlies my paper on the hypergraph correspondence principle, as well as my more recent paper with Tim Austin on hypergraph property testing. \diamond

– Idempotent ultrafilters and Hindman’s theorem –

Thus far, we have been using ultrafilter technology rather lightly, and indeed all of the arguments so far can be converted relatively easily to the topological dynamics formalism, or even a purely combinatorial formalism, with only a moderate amount of effort. But now we will exploit some deeper properties of ultrafilters, which are more difficult to replicate in other settings. In particular, we introduce the notion of an idempotent ultrafilter.

Definition 1. Let (S,\cdot) be a discrete semigroup, and let \beta S be given the usual semigroup operation \cdot. An element p \in \beta S is idempotent if p \cdot p=p. (We of course define idempotence analogously if the group operation on S is denoted by + instead of \cdot.)

Of course, 0 is idempotent, but the remarkable fact is that many other idempotents exist as well. The key tool for creating this is

Lemma 3 (Ellis-Nakamura lemma). Let S be a discrete semigroup, and let K be a compact non-empty sub-semigroup of \beta S. Then K contains at least one idempotent.

Proof: A simple application of Zorn’s lemma shows that K contains a compact non-empty sub-semigroup K’ which is minimal with respect to set inclusion. We claim that every element of K’ is idempotent. To see this, let p be an arbitrary element of K’. Then observe that K’p is a compact non-empty sub-semigroup of K’ and must therefore be equal to K’; in particular, p \in K'p. (Note that semigroups need not contain an identity.) In particular, the stabiliser K'' := \{ q \in K': qp=p\} is non-empty. But one easily observes that this stabiliser is also a compact sub-semigroup of K’, and so K”=K’. In particular, p must stabilise itself, i.e. it is idempotent. \Box

Remark 5. A posteriori, this results shows that the minimal non-empty sub-semigroups K’ are in fact just the singleton sets consisting of idempotents. Of course, one cannot see this without first deriving all of Lemma 3. \diamond

Idempotence turns out to be particularly powerful when combined with minimality, and to this end we observe the following corollary of the above lemma:

Corollary 4. Let S be a discrete semigroup. For every p \in \beta S, there exists q \in (\beta S) p which is both minimal and idempotent.

Proof. By Exercise 10 from Lecture 3, there exists r \in (\beta S) p which is minimal. It is then easy to see that every element of (\beta S) r is minimal. Since (\beta S) r  \subset (\beta S) p is a compact non-empty sub-semigroup of \beta S, the claim now follows from Lemma 3. \diamond

Remark 6. Somewhat amusingly, minimal idempotent ultrafilters require three distinct applications of Zorn’s lemma to construct: one to define the compactified space \beta S, one to locate a minimal left-ideal, and one to locate an idempotent inside that ideal! It seems particularly challenging therefore to define civilised substitutes for this tool which do not explicitly use the axiom of choice. \diamond

What can we do with minimal idempotent ultrafilters? One particularly striking example is Hindman’s theorem. Given any set A of positive integers, define FS(A) to be the set of all finite sums \sum_{n \in B} n from A, where B ranges over all finite non-empty subsets of A. (For instance, if A = \{1,2,4,\ldots\} are the powers of 2, then FS(A) = {\Bbb N}.)

Theorem 7 (Hindman’s theorem) Suppose that the natural numbers {\Bbb N} are finitely coloured. Then one of the colour classes contains a set of the form FS(A) for some infinite set A.

Remark 7. Theorem 7 implies Folkman’s theorem, which has the same hypothesis but concludes that one of the colour classes contains sets of the form FS(A) for arbitrarily large but finite sets A. It does not seem possible to easily deduce Hindman’s theorem from Folkman’s theorem. \diamond

Exercise 12. Folkman’s theorem in turn implies Schur’s theorem, which asserts that if the natural numbers are finitely coloured, one of the colour classes contains a set of the form FS(\{x,y\}) =\{x,y,x+y\} for some x, y (compare with the k=3 case of van der Waerden’s theorem). Using the Cayley graph construction, deduce Schur’s theorem from Ramsey’s theorem (the k=2 case of Theorem 4). Thus we see that there are some connections between the various Ramsey-type theorems discussed here. \diamond

Proof of Theorem 7. By Corollary 4, we can find a minimal idempotent element p in \beta {\Bbb N}; note that as no element of {\Bbb N} is minimal (cf. Exercise 7 from Lecture 3), we know that p \not \in {\Bbb N}. Let c: {\Bbb N} \to \{1,\ldots,m\} denote the given colouring function, then \beta c(p) is a colour in \{1,\ldots,m\}. Since

\lim_{n \to p} \beta c(n) = \beta c(p) (14)


\lim_{n \to p} \beta c(n+p) = \beta c(p+p) = \beta c(p) (15)

we may find a positive integer n_1 such that \beta c(n_1) = \beta c(n_1+p) = \beta c(p). Now from (14), (15) and the similar calculations

\lim_{n \to p} \beta c(n_1+n) = \beta c(n_1+p) = \beta c(p) (16)


\lim_{n \to p} \beta c(n_1+n+p) = \beta c(n_1+p+p) = \beta c(n_1+p) = \beta c(p) (17)

we can find an integer n_2 > n_1 such that \beta c(n_2) = \beta c(n_2+p) = \beta c(n_2+n_1) = \beta c(n_2+n_1+p) = \beta c(p), thus \beta c(m) = \beta c(m+p)=\beta c(p) for all m \in FS(\{ n_1,n_2\}). Continuing inductively in this fashion, one can find n_1 < n_2 < n_3 < \ldots such that \beta c(m) = \beta c(m+p) = \beta c(p) for all m \in FS(\{n_1,\ldots,n_k\}) and all k. If we set A := \{n_1,n_2,\ldots\}, the claim follows. \Box

Remark 8. Purely combinatorial (and quite succinct) proofs of Hindman’s theorem exist – see for instance the one in the text by Graham, Rothschild, and Spencer – but they generally rely on some ad hoc trickery. Here, the trickery has been encapsulated into the existence of minimal idempotent ultrafilters, which can be reused in other contexts (for instance, we will use it to prove the Hales-Jewett theorem below). \diamond

Exercise 13. Define an IP-set to be a set of positive integers which contains a subset of the form FS(A) for some infinite A. Show that if an IP-set S is finitely coloured, then one of its colour classes is also an IP-set. (Hint: S contains FS(A) for some infinite A = \{a_1,a_2,a_3,\ldots\}. Show that the set \bigcap_{n=1}^\infty \beta FS( \{ a_n,a_{n+1},\ldots \}) is a compact non-empty semigroup and thus contains a minimal idempotent ultrafilter p. Use this p to repeat the proof of Theorem 7.) \diamond

– The Hales-Jewett theorem –

Given a finite alphabet A, let A^{<\omega} be the free semigroup generated by A, i.e. the set of all finite non-empty words using the alphabet A, with concatenation as the group operation. (E.g. if A = \{a,b,c\}, then A^{<\omega} contains words such as abc and cbb, with abc \cdot cbb = abccbb.) If we add another letter * to A (the “wildcard” letter), we create a larger semigroup (A \cup \{*\})^{<\omega} (e.g. containing words such as ab**c*). We of course assume that * was not already present in A. Given any letter x \in A, we have a semigroup homomorphism \pi_x: (A \cup \{*\})^{<\omega} \to A^{<\omega} which substitutes every occurrence of the wildcard * with x and leaves all other letters unchanged. (For instance, \pi_a(ab**c*) = abaaca.) Define a combinatorial line in A^{<\omega} to be any set of the form \{ \pi_x(v): x \in A \} for some v \in (A \cup \{*\})^{<\omega} \backslash A^{<\omega}. For instance, if A = \{a,b,c\}, then \{ abaaca, abbbcb, abcccc\} is a combinatorial line, generated by the word v = ab**c*.

We shall prove the following fundamental theorem.

Theorem 8 (Hales-Jewett theorem). Let A be a finite alphabet. If A^{<\omega} is finitely coloured, then one of the colour classes contains a combinatorial line.

Exercise 14. Show that the Hales-Jewett theorem has the following equivalent formulation: for every finite alphabet A and any m \geq 1 there exists N such that if A^N is partitioned into m classes, then one of the classes contains a combinatorial line. \diamond

Exercise 15. Assume the Hales-Jewett theorem. In this exercise we compare the strength of this theorem against other Ramsey-type theorems.

  1. Deduce van der Waerden’s theorem (Theorem 2 from the previous lecture.) Hint: the base k representation of the non-negative natural numbers provides a map from \{0,\ldots,k-1\}^{<\omega} to {\Bbb Z}_{\geq 0}.
  2. Deduce the multidimensional van der Waerden’s theorem of Gallai (Exercise 7 from the previous lecture.)
  3. Deduce the syndetic van der Waerden theorem of Furstenberg: if the integers are finitely coloured and k is a positive integer, then there are infinitely many monochromatic arithmetic progressions n, n+r, \ldots, n+(k-1)r of length k, and furthermore the set of all the step sizes r which appear in such progressions is syndetic. (Hint: argue by contradiction, assuming that the set of all step sizes has arbitrarily long gaps, and use the Hales-Jewett theorem in a manner adapted to these gaps.) For an additional challenge, show that there exists a single colour class whose progressions of length k have spacings in a syndetic set for every k.
  4. Deduce the IP-van der Waerden theorem: If the integers are finitely coloured, k is a positive integer, and S is an IP-set (see Exercise 13), show that there are infinitely many monochromatic arithmetic progressions of length k whose step size lies in S. (For an additional challenge, show that one of the colour classes has the property that for every k, the spacings of the k-term progressions in that class forms an IP^*-set, i.e. it has non-empty intersection with every IP-set. There is an even stronger topological dynamics version of this statement, due to Furstenberg and Weiss, which I will not describe here.)
  5. For any d \geq 1, define a d-dimensional combinatorial subspace of A^{<\omega} to be a set of the form \{ \pi_{x_1,\ldots,x_d}(v): x_1,\ldots,x_d \in A \}, where v \in (A \cup \{ *_1,\ldots,*_d\})^{<\omega} is a word containing at least one copy of each of the d wildcards *_1,\ldots,*_d, and \pi_{x_1,\ldots,x_d}: (A \cup \{ *_1,\ldots,*_d\})^{<\omega} \to A^{<\omega} is the homomorphism that substitutes each wildcard *_j with x_j. Show that if A^{<\omega} is finitely coloured, then one of the colour classes contains arbitrarily high-dimensional combinatorial subspaces.
  6. Let F be a finite field. If the vector space \lim_{n \to \infty} F^n (the inverse limit of the finite vector spaces F^n) is finitely coloured, show that one of the colour classes contains arbitrarily high-dimensional affine subspaces over F. (This geometric Ramsey theorem is due to Graham, Leeb, and Rothschild.)

We now give an ultrafilter-based proof of the Hales-Jewett theorem due to Blass. As usual, the first step is to obtain a statement involving ultrafilters rather than colourings:

Proposition 1. (Hales-Jewett theorem, ultrafilter version) Let A be a finite alphabet, and let p be a minimal idempotent element of the semigroup \beta( A^{<\omega}). Then there exists q \in \beta (A \cup \{*\})^{<\omega} ) \backslash \beta( A^{<\omega}) such that \beta \pi_x(q) = p for all x \in A.

Exercise 16. Deduce Theorem 8 from Proposition 1. \diamond

To prove Proposition 1, we need a variant of Corollary 4. If (S,\cdot ) is a discrete semigroup and p and q are two idempotents in \beta S, let us write p \prec q if we have pq=qp=p.

Exercise 17. Show that \prec is a partial ordering on the idempotents of \beta S, and that an idempotent is minimal in \beta S if and only if it is minimal with respect to \prec. \diamond

Lemma 4. Let S be a discrete semigroup, and let p be an idempotent in \beta S. Then there exists a minimal idempotent q in \beta S such that q \prec p.

Proof of Lemma 4. By Exercise 10 from Lecture 3 (generalised to arbitrary discrete semigroups S), (\beta S) p contains a minimal left-ideal (\beta S) r. By Lemma 3, (\beta S) r contains an idempotent s. Since s \in (\beta S) p and p is idempotent, we conclude sp=s. If we then set q := ps, we easily check that q is idempotent, that q \prec p, and (since q lies in the minimal left-ideal (\beta S) r) it is minimal. The claim follows. \Box

Proof of Proposition 1. Since p is an idempotent element of \beta( A^{<\omega} ), it is also an idempotent element of \beta (A \cup \{*\}^{<\omega}). However, it is not minimal in that semigroup; for instance, \beta(A \cup \{*\})^{<\omega} * p does not contain p, since \beta(A^{<\omega}) and \beta((A \cup \{*\})^{<\omega} \backslash A^{<\omega}) are disconnected from each other.   However, by Lemma 4, we can find a minimal idempotent q in \beta (A \cup \{*\})^{<\omega} such that q \prec p.  By the above discussion, q is not equal to p; since p is minimal in \beta(A^{<\omega}), we see that q must lie outside of \beta(A^{<\omega}).

Now let x \in A. Since \pi_x:   (A \cup \{*\})^{<\omega} \to A^{<\omega} is a homomorphism,

\beta \pi_x: \beta  (A \cup \{*\})^{<\omega} \to \beta A^{<\omega} is also a homomorphism (why?). Since q is idempotent and q \prec p (note that these are both purely algebraic statements), we conclude that \beta \pi_x(q) is idempotent and \beta \pi_x(q) \prec \beta \pi_x(p). But \beta \pi_x(p) = p is minimal in \beta A^{<\omega}, hence by Exercise 17, we have \beta \pi_x(q) = p. The claim follows. \Box

Exercise 18. Adapt the above proof to give an alternate proof of the ultrafilter version of van der Waerden’s theorem (Proposition 1 from the previous lecture) which relies on idempotence rather than on induction on k. (If you are stuck, read the proof of Proposition 1.55 of Glasner’s book.) \diamond

Remark 9. Several of the above Ramsey-type theorems can be unified. For instance, the polynomial van der Waerden theorem and the Hales-Jewett theorem have been unified into the polynomial Hales-Jewett theorem of Bergelson and Leibman (see also the more recent paper of Walters on this topic). This type of Ramsey theory is still an active subject, and we do not yet have a comprehensive and systematic theory (or a “universal” Ramsey theorem) that encompasses all known examples. \diamond

Exercise 19. Let X be an at most countable set (with the discrete topology), and let {\mathcal F} be a family of subsets of X. Show that the following two statements are equivalent:

  1. Whenever X is finitely coloured, one of the colour classes contains a subset in {\mathcal F}.
  2. There exists p \in \beta X such that every neighbourhood of p contains a subset in {\mathcal F}. \diamond

Exercise 20. Let X, Y be at most countable sets with the discrete topology, and let f_1,\ldots,f_k: Y \to X be a finite collection of functions. Show that the following two statements are equivalent:

  1. Whenever X is finitely coloured, one of the colour classes contains a set \{f_1(y), \ldots, f_k(y)\} for some y \in Y.
  2. There exists q \in \beta Y such that \beta f_1(q) = \ldots = \beta f_k(q).

(Hint: look at the closure of \{ (f_1(y),\ldots,f_k(y)): y \in Y \} in $(\beta X)^k$.) \diamond

Exercise 21. Using the previous exercise, deduce Theorem 3 from Theorem 1, and deduce Theorem 6 from Theorem 4. \diamond

[Update, Jan 24: extra factors added to the polynomial in Exercise 1 to sidestep issues at 2 and 3.]

[Update, Jan 30: remarks added and renumbered.]

[Update, Feb 1: More exercises added.]

[Update, Mar 21: Various corrections.]

[Update, Mar 25: Statement of Theorem 5 repaired.]

[Update, Mar 31 2009: Proof of Proposition 1 expanded.]