I’ve just uploaded to the arXiv my paper “Norm convergence of multiple ergodic averages for commuting transformations”, submitted to
Ergodic Theory and Dynamical Systems. This paper settles in full generality the norm convergence problem for several commuting transformations. Specifically, if (X, {\mathcal X},\mu) is a probability space and T_1, \ldots, T_l: X \to X are commuting measure-preserving transformations, then for any bounded measurable functions f_1, \ldots, f_l: X \to {\Bbb R}, the multiple average

\frac{1}{N} \sum_{n=0}^{N-1} T_1^n f_1 \ldots T_l^n f_l (1)

is convergent in the L^2(X) norm topology (and thus also converges in probability). The corresponding question of pointwise almost everywhere convergence remains open (and quite difficult, in my opinion). My argument also does not readily establish a formula as to what this limit actually is (it really establishes that the sequence is Cauchy in L^2 rather than convergent).
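
To spell out the distinction between "Cauchy" and "convergent" here, the following display is a small sketch. The shorthand A_N for the N-th average and the convention T^n f := f \circ T^n are notation assumed for this illustration, not taken from the paper.

```latex
% Notation assumed for illustration: A_N is the N-th multiple average,
% and T^n f denotes the pullback f \circ T^n.
\[
  A_N := \frac{1}{N} \sum_{n=0}^{N-1} T_1^n f_1 \cdots T_l^n f_l ,
  \qquad T^n f := f \circ T^n .
\]
% What is established directly is the Cauchy property in L^2(X):
\[
  \lim_{N, M \to \infty} \| A_N - A_M \|_{L^2(X)} = 0 .
\]
% Since L^2(X) is complete, there is then some F in L^2(X) with
\[
  \lim_{N \to \infty} \| A_N - F \|_{L^2(X)} = 0 ,
\]
% but the Cauchy property alone gives no formula for F.
```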

The l=1 case of this theorem is the classical mean ergodic theorem (also known as the von Neumann ergodic theorem). The l=2 case was established by Conze and Lesigne. The higher l case was partially resolved by Frantzikinakis and Kra, under the additional hypotheses that all of the transformations T_i, as well as the quotients T_i T_j^{-1}, are ergodic. The special case T_j=T^j was established by Host-Kra (with another proof given subsequently by Ziegler). Another relevant result is the Furstenberg-Katznelson theorem, which asserts, among other things, that when f_1=\ldots=f_l=f is non-negative and not identically zero, the inner product of the expression (1) with f has a strictly positive limit inferior as N \to \infty. This latter result also implies Szemerédi’s theorem.

It is also known that the Furstenberg-Katznelson theorem can be proven by hypergraph methods, and in fact my paper also proceeds by a hypergraph-inspired approach, although the language of hypergraphs is not explicitly used in the body of the argument. (In contrast to the work of Host-Kra and Ziegler, no nilsystems appear in the proof.)

In fact, the first step in the argument is to replace the infinitary norm convergence result by an equivalent finitary norm convergence result, in exactly the same way that the infinite convergence principle is equivalent to the finite convergence principle. (This is in marked contrast with the usual ergodic-theory approach to these problems, in which one tries to prove the infinitary statement first, by purely infinitary means, and only deduces the finitary counterpart as a corollary at the very end of the argument.) In the finitary counterpart of this norm convergence result, the abstract dynamical system X is replaced by a very concrete one, namely the product ({\Bbb Z}/P{\Bbb Z})^l of l cyclic groups, with the standard set of l commuting shifts. The main advantage of this concrete setting is that it has an obvious Cartesian product structure, which lets one interpret the functions f_1,\ldots,f_l as weighted hypergraphs, and which allows the technology of the hypergraph regularity lemma to be applied. (Actually we will only need a weak version of this lemma, similar to the “Koopman-von Neumann-type theorems” that appear for instance in my work with Ben Green on arithmetic progressions of primes, or to the weak regularity lemma of Frieze and Kannan.)
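
To make the concrete system a little more tangible, here is a minimal Python sketch of the average (1) on ({\Bbb Z}/P{\Bbb Z})^l, assuming the i-th shift adds 1 to the i-th coordinate and that T^n f means f \circ T^n. The function names and the two test functions f1, f2 are hypothetical choices for illustration only; nothing here is taken from the paper.

```python
from itertools import product
from math import sqrt

def multiple_average(fs, P, N):
    """A_N(x) = (1/N) * sum_{n<N} prod_i f_i(x + n*e_i) on (Z/PZ)^l.

    fs is a list of l functions, each taking a point of (Z/PZ)^l as a tuple.
    """
    l = len(fs)
    avg = {}
    for x in product(range(P), repeat=l):
        total = 0.0
        for n in range(N):
            term = 1.0
            for i, f in enumerate(fs):
                # T_i^n moves only the i-th coordinate, by n (mod P).
                y = x[:i] + ((x[i] + n) % P,) + x[i + 1:]
                term *= f(y)
            total += term
        avg[x] = total / N
    return avg

def l2_distance(g, h):
    """L^2 distance with respect to the uniform probability measure."""
    return sqrt(sum((g[x] - h[x]) ** 2 for x in g) / len(g))

# Hypothetical bounded test functions on (Z/7Z)^2, with values in {0, 1}.
P = 7
f1 = lambda x: 1.0 if (x[0] + 2 * x[1]) % P < 3 else 0.0
f2 = lambda x: 1.0 if (x[0] * x[1]) % P in (1, 2, 4) else 0.0

A_short = multiple_average([f1, f2], P, 50)
A_long = multiple_average([f1, f2], P, 500)
print(l2_distance(A_short, A_long))
```

For a fixed P the summands are periodic in n with period P, so convergence of these averages is elementary. The sketch only illustrates the objects involved; what makes a finitary statement substantive is that its bounds do not depend on P, which a single run like this does not test.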

Roughly speaking, the idea is now to induct on the “complexity” of the functions f_1,\ldots,f_l. Informally, a function f on, say, ({\Bbb Z}/P{\Bbb Z})^l (actually for technical reasons we use ({\Bbb Z}/P{\Bbb Z})^{l+1}) is of complexity d if it only depends on d of the coordinates, or if it is a polynomial combination of such functions (with quantitative bounds on the polynomial involved). The regularity lemma allows one to approximate a complexity d function by a complexity d-1 function, plus an error which is sufficiently “pseudorandom” or “uniform” to be negligible for the purposes of the norm convergence problem. We iterate this all the way down to d=1, at which point one is at the level of the mean ergodic theorem and one can proceed by a variety of methods.
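
As a toy illustration of the informal notion of complexity, the Python sketch below checks by brute force which coordinates a function on ({\Bbb Z}/P{\Bbb Z})^l actually depends on. The helper coordinates_used and the functions a, b, g are hypothetical names invented for this example, and the quantitative bounds mentioned above are ignored entirely.

```python
from itertools import product

def coordinates_used(f, P, l):
    """Return the set of indices i such that changing x[i] can change f(x)."""
    used = set()
    for x in product(range(P), repeat=l):
        for i in range(l):
            for v in range(P):
                # y agrees with x except possibly in coordinate i.
                y = x[:i] + (v,) + x[i + 1:]
                if f(y) != f(x):
                    used.add(i)
    return used

P, l = 5, 3
a = lambda x: ((x[0] * x[1]) % P) / P    # depends only on coordinates 0 and 1
b = lambda x: ((x[1] + x[2]) % P) / P    # depends only on coordinates 1 and 2
g = lambda x: a(x) * b(x) + 0.5 * a(x)   # a polynomial combination of a and b

print(coordinates_used(a, P, l))
print(coordinates_used(b, P, l))
print(coordinates_used(g, P, l))
```

Here g depends on all three coordinates, yet in the informal sense above it would still count as complexity 2, because it is a polynomial combination of functions that each depend on only two coordinates.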

One amusing side-effect of the finitary nature of the argument was that I needed a finitary version of the Lebesgue dominated convergence theorem, which I include as an appendix. Actually, one does not strictly speaking need this theorem to run the argument, if one is willing to add a little bit more notation instead, but the finitary convergence theorem may be of some independent interest.

While the methods in the paper are finitary, it seems likely that a more traditional infinitary ergodic proof of this theorem should be possible (once one figures out how to obtain the counterpart of Cartesian product structure in the infinitary setting). It would thus be interesting to obtain a second proof of this result.