You are currently browsing the tag archive for the ‘multiple recurrence’ tag.

Vitaly Bergelson, Tamar Ziegler, and I have just uploaded to the arXiv our joint paper “Multiple recurrence and convergence results associated to ${{\bf F}_{p}^{\omega}}$-actions“. This paper is primarily concerned with limit formulae in the theory of multiple recurrence in ergodic theory. Perhaps the most basic formula of this type is the mean ergodic theorem, which (among other things) asserts that if ${(X,{\mathcal X}, \mu,T)}$ is a measure-preserving ${{\bf Z}}$-system (which, in this post, means that ${(X,{\mathcal X}, \mu)}$ is a probability space and ${T: X \mapsto X}$ is measure-preserving and invertible, thus giving an action ${(T^n)_{n \in {\bf Z}}}$ of the integers), and ${f,g \in L^2(X,{\mathcal X}, \mu)}$ are functions, and ${X}$ is ergodic (which means that ${L^2(X,{\mathcal X}, \mu)}$ contains no ${T}$-invariant functions other than the constants (up to almost everywhere equivalence, of course)), then the average

$\displaystyle \frac{1}{N} \sum_{n=1}^N \int_X f(x) g(T^n x)\ d\mu \ \ \ \ \ (1)$

converges as ${N \rightarrow \infty}$ to the expression

$\displaystyle (\int_X f(x)\ d\mu) (\int_X g(x)\ d\mu);$

see e.g. this previous blog post. Informally, one can interpret this limit formula as an equidistribution result: if ${x}$ is drawn at random from ${X}$ (using the probability measure ${\mu}$), and ${n}$ is drawn at random from ${\{1,\ldots,N\}}$ for some large ${N}$, then the pair ${(x, T^n x)}$ becomes uniformly distributed in the product space ${X \times X}$ (using product measure ${\mu \times \mu}$) in the limit as ${N \rightarrow \infty}$.

If we allow ${(X,\mu)}$ to be non-ergodic, then we still have a limit formula, but it is a bit more complicated. Let ${{\mathcal X}^T}$ be the ${T}$-invariant measurable sets in ${{\mathcal X}}$; the ${{\bf Z}}$-system ${(X, {\mathcal X}^T, \mu, T)}$ can then be viewed as a factor of the original system ${(X, {\mathcal X}, \mu, T)}$, which is equivalent (in the sense of measure-preserving systems) to a trivial system ${(Z_0, {\mathcal Z}_0, \mu_{Z_0}, 1)}$ (known as the invariant factor) in which the shift is trivial. There is then a projection map ${\pi_0: X \rightarrow Z_0}$ to the invariant factor which is a factor map, and the average (1) converges in the limit to the expression

$\displaystyle \int_{Z_0} (\pi_0)_* f(z) (\pi_0)_* g(z)\ d\mu_{Z_0}(x), \ \ \ \ \ (2)$

where ${(\pi_0)_*: L^2(X,{\mathcal X},\mu) \rightarrow L^2(Z_0,{\mathcal Z}_0,\mu_{Z_0})}$ is the pushforward map associated to the map ${\pi_0: X \rightarrow Z_0}$; see e.g. this previous blog post. We can interpret this as an equidistribution result. If ${(x,T^n x)}$ is a pair as before, then we no longer expect complete equidistribution in ${X \times X}$ in the non-ergodic, because there are now non-trivial constraints relating ${x}$ with ${T^n x}$; indeed, for any ${T}$-invariant function ${f: X \rightarrow {\bf C}}$, we have the constraint ${f(x) = f(T^n x)}$; putting all these constraints together we see that ${\pi_0(x) = \pi_0(T^n x)}$ (for almost every ${x}$, at least). The limit (2) can be viewed as an assertion that this constraint ${\pi_0(x) = \pi_0(T^n x)}$ are in some sense the “only” constraints between ${x}$ and ${T^n x}$, and that the pair ${(x,T^n x)}$ is uniformly distributed relative to these constraints.

Limit formulae are known for multiple ergodic averages as well, although the statement becomes more complicated. For instance, consider the expression

$\displaystyle \frac{1}{N} \sum_{n=1}^N \int_X f(x) g(T^n x) h(T^{2n} x)\ d\mu \ \ \ \ \ (3)$

for three functions ${f,g,h \in L^\infty(X, {\mathcal X}, \mu)}$; this is analogous to the combinatorial task of counting length three progressions in various sets. For simplicity we assume the system ${(X,{\mathcal X},\mu,T)}$ to be ergodic. Naively one might expect this limit to then converge to

$\displaystyle (\int_X f\ d\mu) (\int_X g\ d\mu) (\int_X h\ d\mu)$

which would roughly speaking correspond to an assertion that the triplet ${(x,T^n x, T^{2n} x)}$ is asymptotically equidistributed in ${X \times X \times X}$. However, even in the ergodic case there can be additional constraints on this triplet that cannot be seen at the level of the individual pairs ${(x,T^n x)}$, ${(x, T^{2n} x)}$. The key obstruction here is that of eigenfunctions of the shift ${T: X \rightarrow X}$, that is to say non-trivial functions ${f: X \rightarrow S^1}$ that obey the eigenfunction equation ${Tf = \lambda f}$ almost everywhere for some constant (or ${T}$-invariant) ${\lambda}$. Each such eigenfunction generates a constraint

$\displaystyle f(x) \overline{f(T^n x)}^2 f(T^{2n} x) = 1 \ \ \ \ \ (4)$

tying together ${x}$, ${T^n x}$, and ${T^{2n} x}$. However, it turns out that these are in some sense the only constraints on ${x,T^n x, T^{2n} x}$ that are relevant for the limit (3). More precisely, if one sets ${{\mathcal X}_1}$ to be the sub-algebra of ${{\mathcal X}}$ generated by the eigenfunctions of ${T}$, then it turns out that the factor ${(X, {\mathcal X}_1, \mu, T)}$ is isomorphic to a shift system ${(Z_1, {\mathcal Z}_1, \mu_{Z_1}, x \mapsto x+\alpha)}$ known as the Kronecker factor, for some compact abelian group ${Z_1 = (Z_1,+)}$ and some (irrational) shift ${\alpha \in Z_1}$; the factor map ${\pi_1: X \rightarrow Z_1}$ pushes eigenfunctions forward to (affine) characters on ${Z_1}$. It is then known that the limit of (3) is

$\displaystyle \int_\Sigma (\pi_1)_* f(x_0) (\pi_1)_* g(x_1) (\pi_1)_* h(x_2)\ d\mu_\Sigma$

where ${\Sigma \subset Z_1^3}$ is the closed subgroup

$\displaystyle \Sigma = \{ (x_1,x_2,x_3) \in Z_1^3: x_1-2x_2+x_3=0 \}$

and ${\mu_\Sigma}$ is the Haar probability measure on ${\Sigma}$; see this previous blog post. The equation ${x_1-2x_2+x_3=0}$ defining ${\Sigma}$ corresponds to the constraint (4) mentioned earlier. Among other things, this limit formula implies Roth’s theorem, which in the context of ergodic theory is the assertion that the limit (or at least the limit inferior) of (3) is positive when ${f=g=h}$ is non-negative and not identically vanishing.

If one considers a quadruple average

$\displaystyle \frac{1}{N} \sum_{n=1}^N \int_X f(x) g(T^n x) h(T^{2n} x) k(T^{3n} x)\ d\mu \ \ \ \ \ (5)$

(analogous to counting length four progressions) then the situation becomes more complicated still, even in the ergodic case. In addition to the (linear) eigenfunctions that already showed up in the computation of the triple average (3), a new type of constraint also arises from quadratic eigenfunctions ${f: X \rightarrow S^1}$, which obey an eigenfunction equation ${Tf = \lambda f}$ in which ${\lambda}$ is no longer constant, but is now a linear eigenfunction. For such functions, ${f(T^n x)}$ behaves quadratically in ${n}$, and one can compute the existence of a constraint

$\displaystyle f(x) \overline{f(T^n x)}^3 f(T^{2n} x)^3 \overline{f(T^{3n} x)} = 1 \ \ \ \ \ (6)$

between ${x}$, ${T^n x}$, ${T^{2n} x}$, and ${T^{3n} x}$ that is not detected at the triple average level. As it turns out, this is not the only type of constraint relevant for (5); there is a more general class of constraint involving two-step nilsystems which we will not detail here, but see e.g. this previous blog post for more discussion. Nevertheless there is still a similar limit formula to previous examples, involving a special factor ${(Z_2, {\mathcal Z}_2, \mu_{Z_2}, S)}$ which turns out to be an inverse limit of two-step nilsystems; this limit theorem can be extracted from the structural theory in this paper of Host and Kra combined with a limit formula for nilsystems obtained by Lesigne, but will not be reproduced here. The pattern continues to higher averages (and higher step nilsystems); this was first done explicitly by Ziegler, and can also in principle be extracted from the structural theory of Host-Kra combined with nilsystem equidistribution results of Leibman. These sorts of limit formulae can lead to various recurrence results refining Roth’s theorem in various ways; see this paper of Bergelson, Host, and Kra for some examples of this.

The above discussion was concerned with ${{\bf Z}}$-systems, but one can adapt much of the theory to measure-preserving ${G}$-systems for other discrete countable abelian groups ${G}$, in which one now has a family ${(T_g)_{g \in G}}$ of shifts indexed by ${G}$ rather than a single shift, obeying the compatibility relation ${T_{g+h}=T_g T_h}$. The role of the intervals ${\{1,\ldots,N\}}$ in this more general setting is replaced by that of Folner sequences. For arbitrary countable abelian ${G}$, the theory for double averages (1) and triple limits (3) is essentially identical to the ${{\bf Z}}$-system case. But when one turns to quadruple and higher limits, the situation becomes more complicated (and, for arbitrary ${G}$, still not fully understood). However one model case which is now well understood is the finite field case when ${G = {\bf F}_p^\omega = \bigcup_{n=1}^\infty {\bf F}_p^n}$ is an infinite-dimensional vector space over a finite field ${{\bf F}_p}$ (with the finite subspaces ${{\bf F}_p^n}$ then being a good choice for the Folner sequence). Here, the analogue of the structural theory of Host and Kra was worked out by Vitaly, Tamar, and myself in these previous papers (treating the high characteristic and low characteristic cases respectively). In the finite field setting, it turns out that nilsystems no longer appear, and one only needs to deal with linear, quadratic, and higher order eigenfunctions (known collectively as phase polynomials). It is then natural to look for a limit formula that asserts, roughly speaking, that if ${x}$ is drawn at random from a ${{\bf F}_p^\omega}$-system and ${n}$ drawn randomly from a large subspace of ${{\bf F}_p^\omega}$, then the only constraints between ${x, T^n x, \ldots, T^{(p-1)n} x}$ are those that arise from phase polynomials. The main theorem of this paper is to establish this limit formula (which, again, is a little complicated to state explicitly and will not be done here). In particular, we establish for the first time that the limit actually exists (a result which, for ${{\bf Z}}$-systems, was one of the main results of this paper of Host and Kra).

As a consequence, we can recover finite field analogues of most of the results of Bergelson-Host-Kra, though interestingly some of the counterexamples demonstrating sharpness of their results for ${{\bf Z}}$-systems (based on Behrend set constructions) do not seem to be present in the finite field setting (cf. this previous blog post on the cap set problem). In particular, we are able to largely settle the question of when one has a Khintchine-type theorem that asserts that for any measurable set ${A}$ in an ergodic ${{\bf F}_p^\omega}$-system and any ${\epsilon>0}$, one has

$\displaystyle \mu( T_{c_1 n} A \cap \ldots \cap T_{c_k n} A ) > \mu(A)^k - \epsilon$

for a syndetic set of ${n}$, where ${c_1,\ldots,c_k \in {\bf F}_p}$ are distinct residue classes. It turns out that Khintchine-type theorems always hold for ${k=1,2,3}$ (and for ${k=1,2}$ ergodicity is not required), and for ${k=4}$ it holds whenever ${c_1,c_2,c_3,c_4}$ form a parallelogram, but not otherwise (though the counterexample here was such a painful computation that we ended up removing it from the paper, and may end up putting it online somewhere instead), and for larger ${k}$ we could show that the Khintchine property failed for generic choices of ${c_1,\ldots,c_k}$, though the problem of determining exactly the tuples for which the Khintchine property failed looked to be rather messy and we did not completely settle it.

Tim Austin, Tanja Eisner, and I have just uploaded to the arXiv our joint paper Nonconventional ergodic averages and multiple recurrence for von Neumann dynamical systems, submitted to Pacific Journal of Mathematics. This project started with the observation that the multiple recurrence theorem of Furstenberg (and the related multiple convergence theorem of Host and Kra) could be interpreted in the language of dynamical systems of commutative finite von Neumann algebras, which naturally raised the question of the extent to which the results hold in the noncommutative setting. The short answer is “yes for small averages, but not for long ones”.

The Furstenberg multiple recurrence theorem can be phrased as follows: if ${X = (X, {\mathcal X}, \mu)}$ is a probability space with a measure-preserving shift ${T:X \rightarrow X}$ (which naturally induces an isomorphism ${\alpha: L^\infty(X) \rightarrow L^\infty(X)}$ by setting ${\alpha a := a \circ T^{-1}}$), ${a \in L^\infty(X)}$ is non-negative with positive trace ${\tau(a) := \int_X a\ d\mu}$, and ${k \geq 1}$ is an integer, then one has

$\displaystyle \liminf_{N \rightarrow \infty} \frac{1}{N} \sum_{n=1}^N \tau( a (\alpha^n a) \ldots (\alpha^{(k-1)n} a) ) > 0.$

In particular, ${\tau( a (\alpha^n a) \ldots (\alpha^{(k-1)n} a) ) > 0}$ for all ${n}$ in a set of positive upper density. This result is famously equivalent to Szemerédi’s theorem on arithmetic progressions.

The Host-Kra multiple convergence theorem makes the related assertion that if ${a_0,\ldots,a_{k-1} \in L^\infty(X)}$, then the scalar averages

$\displaystyle \frac{1}{N} \sum_{n=1}^N \tau( a_0 (\alpha^n a_1) \ldots (\alpha^{(k-1)n} a_{k-1}) )$

converge to a limit as ${N \rightarrow \infty}$; a fortiori, the function averages

$\displaystyle \frac{1}{N} \sum_{n=1}^N (\alpha^n a_1) \ldots (\alpha^{(k-1)n} a_{k-1})$

converge in (say) ${L^2(X)}$ norm.

The space ${L^\infty(X)}$ is a commutative example of a von Neumann algebra: an algebra of bounded linear operators on a complex Hilbert space ${H}$ which is closed under the weak operator topology, and under taking adjoints. Indeed, one can take ${H}$ to be ${L^2(X)}$, and identify each element ${m}$ of ${L^\infty(X)}$ with the multiplier operator ${a \mapsto ma}$. The operation ${\tau: a \mapsto \int_X a\ d\mu}$ is then a finite trace for this algebra, i.e. a linear map from the algebra to the scalars ${{\mathbb C}}$ such that ${\tau(ab)=\tau(ba)}$, ${\tau(a^*) = \overline{\tau(a)}}$, and ${\tau(a^* a) \geq 0}$, with equality iff ${a=0}$. The shift ${\alpha: L^\infty(X) \rightarrow L^\infty(X)}$ is then an automorphism of this algebra (preserving shift and conjugation).

We can generalise this situation to the noncommutative setting. Define a von Neumann dynamical system ${(M, \tau, \alpha)}$ to be a von Neumann algebra ${M}$ with a finite trace ${\tau}$ and an automorphism ${\alpha: M \rightarrow M}$. In addition to the commutative examples generated by measure-preserving systems, we give three other examples here:

• (Matrices) ${M = M_n({\mathbb C})}$ is the algebra of ${n \times n}$ complex matrices, with trace ${\tau(a) = \frac{1}{n} \hbox{tr}(a)}$ and shift ${\alpha(a) := UaU^{-1}}$, where ${U}$ is a fixed unitary ${n \times n}$ matrix.
• (Group algebras) ${M = \overline{{\mathbb C} G}}$ is the closure of the group algebra ${{\mathbb C} G}$ of a discrete group ${G}$ (i.e. the algebra of finite formal complex combinations of group elements), which acts on the Hilbert space ${\ell^2(G)}$ by convolution (identifying each group element with its Kronecker delta function). A trace is given by ${\alpha(a) = \langle a \delta_0, \delta_0 \rangle_{\ell^2(G)}}$, where ${\delta_0 \in \ell^2(G)}$ is the Kronecker delta at the identity. Any automorphism ${T: G \rightarrow G}$ of the group induces a shift ${\alpha: M \rightarrow M}$.
• (Noncommutative torus) ${M}$ is the von Neumann algebra acting on ${L^2(({\mathbb R}/{\mathbb Z})^2)}$ generated by the multiplier operator ${f(x,y) \mapsto e^{2\pi i x} f(x,y)}$ and the shifted multiplier operator ${f(x,y) \mapsto e^{2\pi i y} f(x+\alpha,y)}$, where ${\alpha \in {\mathbb R}/{\mathbb Z}}$ is fixed. A trace is given by ${\alpha(a) = \langle 1, a1\rangle_{L^2(({\mathbb R}/{\mathbb Z})^2)}}$, where ${1 \in L^2(({\mathbb R}/{\mathbb Z})^2)}$ is the constant function.

Inspired by noncommutative generalisations of other results in commutative analysis, one can then ask the following questions, for a fixed ${k \geq 1}$ and for a fixed von Neumann dynamical system ${(M,\tau,\alpha)}$:

• (Recurrence on average) Whenever ${a \in M}$ is non-negative with positive trace, is it true that$\displaystyle \liminf_{N \rightarrow \infty} \frac{1}{N} \sum_{n=1}^N \tau( a (\alpha^n a) \ldots (\alpha^{(k-1)n} a) ) > 0?$
• (Recurrence on a dense set) Whenever ${a \in M}$ is non-negative with positive trace, is it true that$\displaystyle \tau( a (\alpha^n a) \ldots (\alpha^{(k-1)n} a) ) > 0$for all ${n}$ in a set of positive upper density?
• (Weak convergence) With ${a_0,\ldots,a_{k-1} \in M}$, is it true that$\displaystyle \frac{1}{N} \sum_{n=1}^N \tau( a_0 (\alpha^n a_1) \ldots (\alpha^{(k-1)n} a_{k-1}) )$converges?
• (Strong convergence) With ${a_1,\ldots,a_{k-1} \in M}$, is it true that$\displaystyle \frac{1}{N} \sum_{n=1}^N (\alpha^n a_1) \ldots (\alpha^{(k-1)n} a_{k-1})$converges in using the Hilbert-Schmidt norm ${\|a\|_{L^2(M)} := \tau(a^* a)^{1/2}}$?

Note that strong convergence automatically implies weak convergence, and recurrence on average automatically implies recurrence on a dense set.

For ${k=1}$, all four questions can trivially be answered “yes”. For ${k=2}$, the answer to the above four questions is also “yes”, thanks to the von Neumann ergodic theorem for unitary operators. For ${k=3}$, we were able to establish a positive answer to the “recurrence on a dense set”, “weak convergence”, and “strong convergence” results assuming that ${M}$ is ergodic. For general ${k}$, we have a positive answer to all four questions under the assumption that ${M}$ is asymptotically abelian, which roughly speaking means that the commutators ${[a,\alpha^n b]}$ converges to zero (in an appropriate weak sense) as ${n \rightarrow \infty}$. Both of these proofs adapt the usual ergodic theory arguments; the latter result generalises some earlier work of Niculescu-Stroh-Zsido, Duvenhage, and Beyers-Duvenhage-Stroh. For the ${k=3}$ result, a key observation is that the van der Corput lemma can be used to control triple averages without requiring any commutativity; the “generalised von Neumann” trick of using multiple applications of the van der Corput trick to control higher averages, however, relies much more strongly on commutativity.

In most other situations we have counterexamples to all of these questions. In particular:

• For ${k=3}$, recurrence on average can fail on an ergodic system; indeed, one can even make the average negative. This example is ultimately based on a Behrend example construction and a von Neumann algebra construction known as the crossed product.
• For ${k=3}$, recurrence on a dense set can also fail if the ergodicity hypothesis is dropped. This also uses the Behrend example and the crossed product construction.
• For ${k=4}$, weak and strong convergence can fail even assuming ergodicity. This uses a group theoretic construction, which amusingly was inspired by Grothendieck’s interpretation of a group as a sheaf of flat connections, which I blogged about recently, and which I will discuss below the fold.
• For ${k=5}$, recurrence on a dense set fails even with the ergodicity hypothesis. This uses a fancier version of the Behrend example due to Ruzsa in this paper of Bergelson, Host, and Kra. This example only applies for ${k \geq 5}$; we do not know for ${k=4}$ whether recurrence on a dense set holds for ergodic systems.

In Lecture 11, we studied compact measure-preserving systems – those systems $(X, {\mathcal X}, \mu, T)$ in which every function $f \in L^2(X, {\mathcal X}, \mu)$ was almost periodic, which meant that their orbit $\{ T^n f: n \in {\Bbb Z}\}$ was precompact in the $L^2(X, {\mathcal X}, \mu)$ topology. Among other things, we were able to easily establish the Furstenberg recurrence theorem (Theorem 1 from Lecture 11) for such systems.

In this lecture, we generalise these results to a “relative” or “conditional” setting, in which we study systems which are compact relative to some factor $(Y, {\mathcal Y}, \nu, S)$ of $(X, {\mathcal X}, \mu, T)$. Such systems are to compact systems as isometric extensions are to isometric systems in topological dynamics. The main result we establish here is that the Furstenberg recurrence theorem holds for such compact extensions whenever the theorem holds for the base. The proof is essentially the same as in the compact case; the main new trick is to not to work in the Hilbert spaces $L^2(X,{\mathcal X},\mu)$ over the complex numbers, but rather in the Hilbert module $L^2(X,{\mathcal X},\mu|Y, {\mathcal Y}, \nu)$ over the (commutative) von Neumann algebra $L^\infty(Y,{\mathcal Y},\nu)$. (Modules are to rings as vector spaces are to fields.) Because of the compact nature of the extension, it turns out that results from topological dynamics (and in particular, van der Waerden’s theorem) can be exploited to good effect in this argument.

[Note: this operator-algebraic approach is not the only way to understand these extensions; one can also proceed by disintegrating $\mu$ into fibre measures $\mu_y$ for almost every $y \in Y$ and working fibre by fibre. We will discuss the connection between the two approaches below.]

The primary objective of this lecture and the next few will be to give a proof of the Furstenberg recurrence theorem (Theorem 2 from the previous lecture). Along the way we will develop a structural theory for measure-preserving systems.

The basic strategy of Furstenberg’s proof is to first prove the recurrence theorems for very simple systems – either those with “almost periodic” (or compact) dynamics or with “weakly mixing” dynamics. These cases are quite easy, but don’t manage to cover all the cases. To go further, we need to consider various combinations of these systems. For instance, by viewing a general system as an extension of the maximal compact factor, we will be able to prove Roth’s theorem (which is equivalent to the k=3 form of the Furstenberg recurrence theorem). To handle the general case, we need to consider compact extensions of compact factors, compact extensions of compact extensions of compact factors, etc., as well as weakly mixing extensions of all the previously mentioned factors.

In this lecture, we will consider those measure-preserving systems $(X, {\mathcal X}, \mu, T)$ which are compact or almost periodic. These systems are analogous to the equicontinuous or isometric systems in topological dynamics discussed in Lecture 6, and as with those systems, we will be able to characterise such systems (or more precisely, the ergodic ones) algebraically as Kronecker systems, though this is not strictly necessary for the proof of the recurrence theorem.

In this lecture, we describe the simple but fundamental Furstenberg correspondence principle which connects the “soft analysis” subject of ergodic theory (in particular, recurrence theorems) with the “hard analysis” subject of combinatorial number theory (or more generally with results of “density Ramsey theory” type). Rather than try to set up the most general and abstract version of this principle, we shall instead study the canonical example of this principle in action, namely the equating of the Furstenberg multiple recurrence theorem with Szemerédi’s theorem on arithmetic progressions.
Read the rest of this entry »

In the previous lecture, we established single recurrence properties for both open sets and for sequences inside a topological dynamical system $(X, {\mathcal F}, T)$. In this lecture, we generalise these results to multiple recurrence. More precisely, we shall show

Theorem 1. (Multiple recurrence in open covers) Let $(X,{\mathcal F},T)$ be a topological dynamical system, and let $(U_\alpha)_{\alpha \in A}$ be an open cover of X. Then there exists $U_\alpha$ such that for every $k \geq 1$, we have $U_\alpha \cap T^{-r} U_\alpha \cap \ldots \cap T^{-(k-1)r} U_\alpha \neq \emptyset$ for infinitely many r.

Note that this theorem includes Theorem 1 from the previous lecture as the special case $k=2$. This theorem is also equivalent to the following well-known combinatorial result:

Theorem 2. (van der Waerden’s theorem) Suppose the integers ${\Bbb Z}$ are finitely coloured. Then one of the colour classes contains arbitrarily long arithmetic progressions.

Exercise 1. Show that Theorem 1 and Theorem 2 are equivalent. $\diamond$

Exercise 2. Show that Theorem 2 fails if “arbitrarily long” is replaced by “infinitely long”. Deduce that a similar strengthening of Theorem 1 also fails. $\diamond$

Exercise 3. Use Theorem 2 to deduce a finitary version: given any positive integers m and k, there exists an integer N such that whenever $\{1,\ldots,N\}$ is coloured into m colour classes, one of the colour classes contains an arithmetic progression of length k. (Hint: use a “compactness and contradiction” argument, as in my article on hard and soft analysis.) $\diamond$

We also have a stronger version of Theorem 1:

Theorem 3. (Multiple Birkhoff recurrence theorem) Let $(X,{\mathcal F},T)$ be a topological dynamical system. Then for any $k \geq 1$ there exists a point $x \in X$ and a sequence $r_j \to \infty$ of integers such that $T^{i r_j} x \to x$ as $j \to \infty$ for all $0 \leq i \leq k-1$.

These results already have some application to equidistribution of explicit sequences. Here is a simple example (which is also a consequence of Weyl’s equidistribution theorem):

Corollary 1. Let $\alpha$ be a real number. Then there exists a sequence $r_j \to \infty$ of integers such that $\hbox{dist}(r_j^2 \alpha,{\Bbb Z}) \to 0$ as $j \to \infty$.

Proof. Consider the skew shift system $X = ({\Bbb R}/{\Bbb Z})^2$ with $T(x,y) := (x+\alpha,y+x)$. By Theorem 3, there exists $(x,y) \in X$ and a sequence $n_j \to \infty$ such that $T^{r_j}(x,y)$ and $T^{2r_j}(x,y)$ both convege to $(x,y)$. If we then use the easily verified identity

$(x,y) - 2T^{r_j}(x,y) + T^{2r_j}(x,y) = (0, r_j^2 \alpha)$ (1)

we obtain the claim. $\Box$

Exercise 4. Use Theorem 1 or Theorem 2 in place of Theorem 3 to give an alternate derivation of Corollary 1. $\diamond$

As in the previous lecture, we will give both a traditional topological proof and an ultrafilter-based proof of Theorem 1 and Theorem 3; the reader is invited to see how the various proofs are ultimately equivalent to each other.