Ben Green and I have just uploaded to the arXiv our paper “An arithmetic regularity lemma, an associated counting lemma, and applications”, submitted (a little behind schedule) to the 70th birthday conference proceedings for Endre Szemerédi. In this paper we describe the general-degree version of the arithmetic regularity lemma, which can be viewed as the counterpart of the Szemerédi regularity lemma, in which the object being regularised is a function ${f: [N] \rightarrow [0,1]}$ on a discrete interval ${[N] = \{1,\ldots,N\}}$ rather than a graph, and the patterns one wishes to count are additive patterns (such as arithmetic progressions ${n,n+d,\ldots,n+(k-1)d}$) rather than subgraphs. Very roughly speaking, this regularity lemma asserts that all such functions can be decomposed as a degree ${\leq s}$ nilsequence (or more precisely, a variant of a nilsequence that we call a virtual irrational nilsequence), plus a small error, plus a third error which is extremely tiny in the Gowers uniformity norm ${U^{s+1}[N]}$. In principle, at least, the latter two errors can be readily discarded in applications, so that the regularity lemma reduces many questions in additive combinatorics to questions concerning (virtual irrational) nilsequences. To work with these nilsequences, we also establish an arithmetic counting lemma that gives an integral formula for counting additive patterns weighted by such nilsequences.

The regularity lemma is a manifestation of the “dichotomy between structure and randomness”, as discussed for instance in my ICM article or FOCS article. In the degree ${1}$ case ${s=1}$, this result is essentially due to Green. It is powered by the inverse conjecture for the Gowers norms, which we and Tamar Ziegler have recently established (paper forthcoming shortly; the ${k=4}$ case of our argument is discussed here). The counting lemma is established through the quantitative equidistribution theory of nilmanifolds, which Ben and I set out in this paper.

The regularity and counting lemmas are designed to be used together, and in the paper we give three applications of this combination. Firstly, we give a new proof of Szemerédi’s theorem, which proceeds via an energy increment argument rather than a density increment one. Secondly, we establish a conjecture of Bergelson, Host, and Kra, namely that if ${A \subset [N]}$ has density ${\alpha}$, and ${\epsilon > 0}$, then there exist ${\gg_{\alpha,\epsilon} N}$ shifts ${h}$ for which ${A}$ contains at least ${(\alpha^4 - \epsilon)N}$ arithmetic progressions of length ${k=4}$ of spacing ${h}$. (The ${k=3}$ case of this conjecture was established earlier by Green; the ${k=5}$ case is false, as was shown by Ruzsa in an appendix to the Bergelson-Host-Kra paper.) Thirdly, we establish a variant of a recent result of Gowers-Wolf, showing that the true complexity of a system of linear forms over ${[N]}$ indeed matches the conjectured value predicted in their first paper.

In all three applications, the scheme of proof can be described as follows:

• Apply the arithmetic regularity lemma, and decompose a relevant function ${f}$ into three pieces, ${f_{nil}, f_{sml}, f_{unf}}$.
• The uniform part ${f_{unf}}$ is so tiny in the Gowers uniformity norm that its contribution can be easily dealt with by an appropriate “generalised von Neumann theorem”.
• The contribution of the (virtual, irrational) nilsequence ${f_{nil}}$ can be controlled using the arithmetic counting lemma.
• Finally, one needs to check that the contribution of the small error ${f_{sml}}$ does not overwhelm the main term ${f_{nil}}$. This is the trickiest bit; one often needs to use the counting lemma again to show that one can find a set of arithmetic patterns for ${f_{nil}}$ that is sufficiently “equidistributed” that it is not impacted by the small error.
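
In symbols, and only as a schematic sketch (the counting form ${\Lambda}$ below is notation introduced here for illustration, and is not taken from the paper), the decomposition in the first step reads

${\displaystyle f = f_{nil} + f_{sml} + f_{unf}.}$

If ${\Lambda(f_1,\ldots,f_k)}$ denotes the average of ${f_1(n) f_2(n+d) \ldots f_k(n+(k-1)d)}$ over the progressions ${n, n+d, \ldots, n+(k-1)d}$ in ${[N]}$, then multilinearity gives

${\displaystyle \Lambda(f,\ldots,f) = \sum_{g_1,\ldots,g_k \in \{f_{nil}, f_{sml}, f_{unf}\}} \Lambda(g_1,\ldots,g_k),}$

a sum of ${3^k}$ terms. The terms containing at least one copy of ${f_{unf}}$ are the ones handled by the generalised von Neumann theorem; the single term ${\Lambda(f_{nil},\ldots,f_{nil})}$ is the main term evaluated by the counting lemma; and the remaining ${2^k - 1}$ terms, which involve ${f_{sml}}$ but not ${f_{unf}}$, are the ones that require the equidistribution argument of the final step.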

To illustrate the last point, let us give the following example. Suppose we have a set ${A \subset [N]}$ of some positive density (say ${|A| = 10^{-1} N}$) and we have managed to prove that ${A}$ contains a reasonable number of arithmetic progressions of length ${5}$ (say), e.g. it contains at least ${10^{-10} N^2}$ such progressions. Now we perturb ${A}$ by deleting a small number, say ${10^{-2} N}$, of elements from ${A}$ to create a new set ${A'}$. Can we still conclude that the new set ${A'}$ contains any arithmetic progressions of length ${5}$?

Unfortunately, the answer could be no; conceivably, all of the ${10^{-10} N^2}$ arithmetic progressions in ${A}$ could be wiped out by the ${10^{-2} N}$ elements removed from ${A}$, since each such element of ${A}$ could be associated with up to ${|A|}$ (or even ${5|A|}$) arithmetic progressions in ${A}$.
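
To see how lopsided the worst case is with the illustrative numbers above (a rough count only, using the cruder bound of ${|A|}$ progressions per element): the deleted elements could in principle account for as many as

${\displaystyle 10^{-2} N \times |A| = 10^{-2} N \times 10^{-1} N = 10^{-3} N^2}$

progressions, which vastly exceeds the ${10^{-10} N^2}$ progressions that ${A}$ is known to contain. So a crude count of this type cannot rule out the possibility that every progression is destroyed.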

But suppose we knew that the ${10^{-10} N^2}$ arithmetic progressions in ${A}$ were equidistributed, in the sense that each element in ${A}$ belonged to the same number of such arithmetic progressions, namely ${5 \times 10^{-9} N}$ (each progression has ${5}$ elements, so there are ${5 \times 10^{-10} N^2}$ incidences shared equally among the ${|A| = 10^{-1} N}$ elements). Then each element deleted from ${A}$ only removes at most ${5 \times 10^{-9} N}$ progressions, so deleting ${10^{-2} N}$ elements destroys at most ${5 \times 10^{-11} N^2}$ of the ${10^{-10} N^2}$ progressions, and ${A'}$ still retains at least half of them. The same argument works if the arithmetic progressions are only approximately equidistributed, in the sense that the number of progressions that a given element ${a \in A}$ belongs to concentrates sharply around its mean (for instance, by having a small variance), provided that the equidistribution is sufficiently strong. Fortunately, the arithmetic regularity and counting lemmas are designed to give precisely such a strong equidistribution result.
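
For readers who like to experiment, here is a minimal brute-force sketch of this bookkeeping in Python. It is purely illustrative and has nothing to do with the methods of the paper: the function names `count_progressions` and `surviving_lower_bound` are invented here, and the approach is only feasible for small sets. It counts the ${k}$-term progressions (with positive spacing) inside a finite set, records how many progressions each element belongs to, and then applies the union bound from the argument above to bound from below the number of progressions that survive a deletion.

```python
from collections import Counter


def count_progressions(A, k=5):
    """Count k-term arithmetic progressions with positive spacing inside A.

    Returns (total, per_element), where per_element[a] is the number of
    such progressions that contain the element a.
    """
    members = set(A)
    per_element = Counter()
    total = 0
    if not members:
        return total, per_element
    top = max(members)
    for n in sorted(members):
        d = 1
        # Only spacings for which the last term can still lie in A.
        while n + (k - 1) * d <= top:
            terms = [n + j * d for j in range(k)]
            if all(t in members for t in terms):
                total += 1
                per_element.update(terms)
            d += 1
    return total, per_element


def surviving_lower_bound(total, per_element, removed):
    """Union bound: each removed element destroys at most per_element[x]
    progressions, so at least this many progressions survive."""
    destroyed = sum(per_element[x] for x in set(removed))
    return max(total - destroyed, 0)


if __name__ == "__main__":
    A = range(1, 101)
    total, per_element = count_progressions(A)
    counts = [per_element[a] for a in A]
    print("progressions:", total)
    print("min / max progressions per element:", min(counts), max(counts))
    print("lower bound after deleting 1..5:",
          surviving_lower_bound(total, per_element, range(1, 6)))
```

The gap between the minimum and maximum per-element counts is a crude measure of how far the progressions are from being equidistributed; the closer the counts are to their mean, the closer the union bound comes to the idealised calculation above.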

A succinct (but slightly inaccurate) summary of the regularity+counting lemma strategy would be that in order to solve a problem in additive combinatorics, it “suffices to check it for nilsequences”. But this should come with a caveat, due to the issue of the small error above; in addition to checking it for nilsequences, the answer in the nilsequence case must be sufficiently “dispersed” in a suitable sense, so that it can survive the addition of a small (but not completely negligible) perturbation.

One last “production note”. As with our previous paper with Emmanuel Breuillard, we used Subversion to write this paper, which turned out to be a significant efficiency boost as we could work on different parts of the paper simultaneously (this was particularly important this time round as the paper was somewhat lengthy and complicated, and there was a submission deadline). When doing so, we found it convenient to split the paper into a dozen or so pieces (one for each section of the paper, basically) in order to avoid conflicts, and to help coordinate the writing process. I’m also looking into git (a more advanced version control system), and am planning to use it for another of my joint projects; I hope to be able to comment on the relative strengths of these systems (and of plain old email) in the future.