In this lecture – the final one on general measure-preserving dynamics – we put together the results from the past few lectures to establish the Furstenberg-Zimmer structure theorem for measure-preserving systems, and then use this to finish the proof of the Furstenberg recurrence theorem.

– The Furstenberg-Zimmer structure theorem –

Let $X = (X,{\mathcal X},\mu,T)$ be a measure-preserving system, and let $Y = (Y, {\mathcal Y},\nu,S)$ be a factor. In Theorem 2 of the previous lecture, we showed that if X was not a weakly mixing extension of Y, then we could find a non-trivial compact extension Z of Y (thus $L^2(Z)$ is a non-trivial superspace of $L^2(Y)$). Combining this with Zorn’s lemma (and starting with the trivial factor Y = pt), one obtains

Theorem 1. (Furstenberg-Zimmer structure theorem) Let $(X,{\mathcal X},\mu,T)$ be a measure-preserving system. Then there exists an ordinal $\alpha$ and a factor $Y_\beta = (Y_\beta, {\mathcal Y}_\beta, \nu_\beta, S_\beta)$ for every $\beta \leq \alpha$ with the following properties:

1. $Y_\emptyset$ is a point.
2. For every successor ordinal $\beta+1 \leq \alpha$, $Y_{\beta+1}$ is a compact extension of $Y_\beta$.
3. For every limit ordinal $\beta \leq \alpha$, $Y_\beta$ is the inverse limit of the $Y_\gamma$ for the $\gamma < \beta$, in the sense that $L^2(Y_\beta)$ is the closure of $\bigcup_{\gamma < \beta} L^2(Y_\gamma)$.
4. X is a weakly mixing extension of $Y_\alpha$.

This theorem should be compared with Furstenberg’s structure theorem for distal systems in topological dynamics (Theorem 2 from Lecture 7). Indeed, in analogy to that theorem, the factors $Y_\beta$ are known as distal measure-preserving systems. The result was proven independently by Furstenberg and by Zimmer.

Exercise 1. Deduce Theorem 1 from Theorem 2 of the previous lecture. $\diamond$

Remark 1. Since the Hilbert spaces $L^2(Y_\beta)$ are increasing inside the separable Hilbert space $L^2(X)$, it is not hard to see that the ordinal $\alpha$ must be at most countable. Conversely, a result of Beleznay and Foreman shows that every countable ordinal can appear as the minimal length of a Furstenberg tower of a given system. Thus, in some sense, the complexity of a system can be as great as any countable ordinal. This is because the structure theorem roots out every last trace of structure from the system, so much so that every remaining function orthogonal to the final factor $L^2(Y_\alpha)$ is weakly mixing. But in many applications one does not need so much weak mixing; for instance to establish k-fold recurrence for a function f, it would be enough to obtain weak mixing control on just a few combinations of f (such as $T^h f \overline{f}$), as we already saw in the proof of Roth’s theorem in Lecture 12. In fact, it is not hard to show that to prove Furstenberg’s recurrence theorem for a fixed k, one only needs to analyse the first k-2 steps of the Furstenberg tower. As one consequence of this, it is possible to avoid the use of Zorn’s lemma (and the axiom of choice) in the proof of the recurrence theorem. $\diamond$

Remark 2. Analogues of the structure theorem exist for other actions, such as the action of ${\Bbb Z}^d$ on a measure space (which can equivalently be viewed as the action of d commuting shifts $T_1,\ldots,T_d: X \to X$). There is a new feature in this case, though: instead of having a tower of purely compact extensions, followed by one weakly mixing extension at the end, one instead has a tower of hybrid extensions (known as primitive extensions), each one of which is compact along one subgroup of ${\Bbb Z}^d$ and weakly mixing along a complementary subgroup. See for instance Furstenberg’s book for details. $\diamond$

– The Furstenberg recurrence theorem –

The Furstenberg recurrence theorem asserts that every measure-preserving system $(X,{\mathcal X},\mu,T)$ has the uniform multiple recurrence (UMR) property, thus

$\liminf_{N \to \infty} \frac{1}{N} \sum_{n=0}^{N-1} \int_X f T^n f \ldots T^{(k-1)n} f\ d\mu > 0$ (1)

whenever $k \geq 1$ and $f \in L^\infty(X)$ is non-negative with positive mean. The UMR property is trivially true for a point, and we have already shown that UMR is preserved by compact extensions (Theorem 1 of Lecture 13) and by weakly mixing extensions (Theorem 1 of Lecture 14). The former result lets us climb the successor ordinal steps of the tower in Theorem 1, while the latter lets us jump from the final distal system $Y_\alpha$ to X. But to clinch the proof of the recurrence theorem, we also need to deal with the limit ordinals. More precisely, we need to prove

Theorem 2. (Limits of chains) Let $(Y_\beta)_{\beta \in B}$ be a totally ordered family of factors of a measure-preserving system X (thus $L^2(Y_\beta)$ is increasing with $\beta$, and let Y be the inverse limit of the $Y_\beta$. If each of the $Y_\beta$ obeys the UMR, then Y does also.

With this theorem, the Furstenberg recurrence theorem (Theorem 1 from Lecture 11) follows from the previous theorems and transfinite induction.

The main difficulty in establishing Theorem 2 is that while each $Y_\beta$ obeys the UMR separately, we do not know that this property holds uniformly in $\beta$. The main new observation needed to establish the theorem is that there is another way to leverage the UMR from a factor to an extension… if the support of the function f is sufficiently “dense”. We motivate this by first considering the unconditional case.

Proposition 1. (UMR for densely supported functions) Let $(X,{\mathcal X},\mu,T)$ be a measure-preserving system, let $k \geq 1$ be an integer, and let $f \in L^\infty(X)$ be a non-negative function whose support $\{ x: f(x) > 0\}$ has measure greater than $1-1/k$. Then (1) holds.

Proof. By monotone convergence, we can find $\varepsilon > 0$ such that $f(x) > \varepsilon$ for all x outside of a set E of measure at most $1/k - \varepsilon$. For any n, this implies that $f(x) T^n f(x) \ldots T^{(k-1) n} f(x) > \varepsilon^k$ for all x outside of the set $E \cup T^n E \cup \ldots \cup T^{(k-1)n} E$, which has measure at most $1-k\varepsilon$. In particular we see that

$\int_X f T^n f \ldots T^{(k-1) n} f\ d\mu > k \varepsilon^{k+1}$ (2)

for all n, and the claim follows. $\Box$

As with the other components of the proof of the recurrence theorem, we will need to upgrade the above proposition to a “relative” version:

Proposition 2. (UMR for relatively densely supported functions) Let $(X,{\mathcal X},\mu,T)$ be an extension of a factor $(Y,{\mathcal Y},\nu, S)$ with the UMR, let $k \geq 1$ be an integer, and let $f \in L^\infty(X)$ be a non-negative function whose support $\Omega := \{ x: f(x) > 0\}$ is such that the set $\{ y \in Y: {\Bbb E}(1_\Omega|Y) > 1-1/k \}$ has positive measure in Y. Then (1) holds.

Proof. By monotone convergence again, we can find $\varepsilon > 0$ such that the set $E := \{ x: f(x) > \varepsilon \}$ is such that the set $F := \{ y \in Y: {\Bbb E}(1_E|Y) > 1-1/k+\varepsilon \}$ has positive measure. Since Y has the UMR, this implies that (1) holds for $1_F$. In other words, there exists $c > 0$ such that

$\nu( F \cap T^n F \cap \ldots \cap T^{(k-1)n} F ) > c$ (3)

for all n in a set of positive lower density.

Now we turn to f. We have the pointwise lower bound $f(x) \geq \varepsilon 1_E(x)$, and so

$f T^n f \ldots T^{(k-1)n} f(x) \geq \varepsilon^k 1_{E \cap T^n E \cap \ldots \cap T^{(k-1)n} E}(x)$. (4)

We have the crude lower bound

$1_{E \cap T^n E \cap \ldots \cap T^{(k-1)n} E}(x) \geq 1 - \sum_{j=0}^{k-1} 1_{T^{jn} E^c}(x)$; (5)

inserting this into (4) and taking conditional expectations, we conclude

${\Bbb E}( f T^n f \ldots T^{(k-1)n} f | Y) (y) \geq \varepsilon^k ( 1 - \sum_{j=0}^{k-1} {\Bbb E}( 1_{T^{jn} E^c} | Y)(y))$ (6)

a.e. On the other hand, we have

${\Bbb E}( 1_{T^{jn} E^c} | Y) = 1 - {\Bbb E}( 1_{T^{jn} E} | Y) = 1 - T^{jn} {\Bbb E}(1_E|Y)$. (7)

By definition of F, we thus see that if y lies in $F \cap T^n F \cap \ldots \cap T^{(k-1)n} F$, then

${\Bbb E}( f T^n f \ldots T^{(k-1)n} f | Y) (y) \geq \varepsilon^k \times k \varepsilon$. (8)

Integrating this and using (3), we obtain

$\int_X f T^n f \ldots T^{(k-1)n} f\ d\mu \geq c \varepsilon^k \times k \varepsilon$ (9)

for all n in a set of positive lower density, and (1) follows. $\Box$

Proof of Theorem 2. Let $f \in L^\infty(Y)$ be non-negative with positive mean $\int_X f\ d\mu = c > 0$; we may normalise f to be bounded by 1. Since Y is the inverse limit of the $Y_\beta$, we see that the orthogonal projections ${\Bbb E}(f|Y_\beta)$ converge in $L^2(X)$ norm to ${\Bbb E}(f|Y) = f$. Thus, for any $\varepsilon$, we can find $\beta$ such that

$\| f - {\Bbb E}(f|Y_\beta) \|_{L^2(X)} \leq \varepsilon$. (10)

Now ${\Bbb E}(f|Y_\beta)$ has the same mean c as f, and is also bounded by 1. Thus the set $E := \{ y: {\Bbb E}(f|Y_\beta)(y) \geq c/2 \}$ must have measure at least c/2 in $Y_\beta$. Now if $\Omega := \{ x: f(x) > 0 \}$, then we have the pointwise bound

$|f - {\Bbb E}(f|Y_\beta)| \geq \frac{c}{2} 1_{\Omega^c} 1_E$; (11)

squaring this and taking conditional expectations we obtain

${\Bbb E}(|f - {\Bbb E}(f|Y_\beta)|^2|Y_\beta)(y) \geq \frac{c^2}{4} (1 - {\Bbb E}(1_{\Omega}|Y_\beta)(y)) 1_E(y)$, (12)

and so by (10) and Markov’s inequality we see that $1 - {\Bbb E}(1_{\Omega}|Y_\beta)(y) 1_E(y) < 1/k$ on a set of measure $O_c(\varepsilon^2)$. Choosing $\varepsilon$ sufficiently small depending on c, we conclude (from the lower bound $\mu(E) \geq c/2$) that ${\Bbb E}(1_{\Omega}|Y_\beta)(y) > 1-1/k$ on a set of positive measure. The claim now follows from Proposition 2. $\Box$

The proof of the Furstenberg recurrence theorem (and thus Szemerédi’s theorem) is finally complete.

Remark 3. The same type of argument yields many further recurrence theorems, and thus (by the correspondence principle) many combinatorial results also. For instance, in the original paper of Furstenberg it was noted that the above arguments allow one to strengthen (1) to

$\liminf_{N \to \infty} \inf_M \frac{1}{N} \sum_{n=M}^{M+N-1} \int_X f T^n f \ldots T^{(k-1)n} f\ d\mu > 0$, (13)

which allows one to conclude that in a set A of positive upper density, the set of n for which $A \cap (A+n) \cap \ldots \cap (A+(k-1)n)$ has positive upper density is syndetic for every k. One can also extend the argument to higher dimensions, and to polynomial recurrence without too many changes in the structure of the proof. But some more serious modifications to the argument are needed for other recurrence results involving IP systems or Hales-Jewett type results; see Lecture 10 for more discussion. $\diamond$

[Update, Mar 6: Statement of Proposition 1 corrected.]