You are currently browsing the tag archive for the ‘arithmetic regularity lemma’ tag.

Ben Green and I have updated our paper “An arithmetic regularity lemma, an associated counting lemma, and applications” to account for a somewhat serious issue with the paper that was pointed out to us recently by Daniel Altman. This paper contains two core theorems:

• An “arithmetic regularity lemma” that, roughly speaking, decomposes an arbitrary bounded sequence ${f(n)}$ on an interval ${\{1,\dots,N\}}$ as an “irrational nilsequence” ${F(g(n) \Gamma)}$ of controlled complexity, plus some “negligible” errors (where one uses the Gowers uniformity norm as the main norm to control the neglibility of the error); and
• An “arithmetic counting lemma” that gives an asymptotic formula for counting various averages ${{\mathbb E}_{{\bf n} \in {\bf Z}^d \cap P} f(\psi_1({\bf n})) \dots f(\psi_t({\bf n}))}$ for various affine-linear forms ${\psi_1,\dots,\psi_t}$ when the functions ${f}$ are given by irrational nilsequences.

The combination of the two theorems is then used to address various questions in additive combinatorics.

There are no direct issues with the arithmetic regularity lemma. However, it turns out that the arithmetic counting lemma is only true if one imposes an additional property (which we call the “flag property”) on the affine-linear forms ${\psi_1,\dots,\psi_t}$. Without this property, there does not appear to be a clean asymptotic formula for these averages if the only hypothesis one places on the underlying nilsequences is irrationality. Thus when trying to understand the asymptotics of averages involving linear forms that do not obey the flag property, the paradigm of understanding these averages via a combination of the regularity lemma and a counting lemma seems to require some significant revision (in particular, one would probably have to replace the existing regularity lemma with some variant, despite the fact that the lemma is still technically true in this setting). Fortunately, for most applications studied to date (including the important subclass of translation-invariant affine forms), the flag property holds; however our claim in the paper to have resolved a conjecture of Gowers and Wolf on the true complexity of systems of affine forms must now be narrowed, as our methods only verify this conjecture under the assumption of the flag property.

In a bit more detail: the asymptotic formula for our counting lemma involved some finite-dimensional vector spaces ${\Psi^{[i]}}$ for various natural numbers ${i}$, defined as the linear span of the vectors ${(\psi^i_1({\bf n}), \dots, \psi^i_t({\bf n}))}$ as ${{\bf n}}$ ranges over the parameter space ${{\bf Z}^d}$. Roughly speaking, these spaces encode some constraints one would expect to see amongst the forms ${\psi^i_1({\bf n}), \dots, \psi^i_t({\bf n})}$. For instance, in the case of length four arithmetic progressions when ${d=2}$, ${{\bf n} = (n,r)}$, and

$\displaystyle \psi_i({\bf n}) = n + (i-1)r$

for ${i=1,2,3,4}$, then ${\Psi^{[1]}}$ is spanned by the vectors ${(1,1,1,1)}$ and ${(1,2,3,4)}$ and can thus be described as the two-dimensional linear space

$\displaystyle \Psi^{[1]} = \{ (a,b,c,d): a-2b+c = b-2c+d = 0\} \ \ \ \ \ (1)$

while ${\Psi^{[2]}}$ is spanned by the vectors ${(1,1,1,1)}$, ${(1,2,3,4)}$, ${(1^2,2^2,3^2,4^2)}$ and can be described as the hyperplane

$\displaystyle \Psi^{[2]} = \{ (a,b,c,d): a-3b+3c-d = 0 \}. \ \ \ \ \ (2)$

As a special case of the counting lemma, we can check that if ${f}$ takes the form ${f(n) = F( \alpha n, \beta n^2 + \gamma n)}$ for some irrational ${\alpha,\beta \in {\bf R}/{\bf Z}}$, some arbitrary ${\gamma \in {\bf R}/{\bf Z}}$, and some smooth ${F: {\bf R}/{\bf Z} \times {\bf R}/{\bf Z} \rightarrow {\bf C}}$, then the limiting value of the average

$\displaystyle {\bf E}_{n, r \in [N]} f(n) f(n+r) f(n+2r) f(n+3r)$

as ${N \rightarrow \infty}$ is equal to

$\displaystyle \int_{a_1,b_1,c_1,d_1 \in {\bf R}/{\bf Z}: a_1-2b_1+c_1=b_1-2c_1+d_1=0} \int_{a_2,b_2,c_2,d_2 \in {\bf R}/{\bf Z}: a_2-3b_2+3c_2-d_2=0}$

$\displaystyle F(a_1,a_2) F(b_1,b_2) F(c_1,c_2) F(d_1,d_2)$

which reflects the constraints

$\displaystyle \alpha n - 2 \alpha(n+r) + \alpha(n+2r) = \alpha(n+r) - 2\alpha(n+2r)+\alpha(n+3r)=0$

and

$\displaystyle (\beta n^2 + \gamma n) - 3 (\beta(n+r)^2+\gamma(n+r))$

$\displaystyle + 3 (\beta(n+2r)^2 +\gamma(n+2r)) - (\beta(n+3r)^2+\gamma(n+3r))=0.$

These constraints follow from the descriptions (1), (2), using the containment ${\Psi^{[1]} \subset \Psi^{[2]}}$ to dispense with the lower order term ${\gamma n}$ (which then plays no further role in the analysis).

The arguments in our paper turn out to be perfectly correct under the assumption of the “flag property” that ${\Psi^{[i]} \subset \Psi^{[i+1]}}$ for all ${i}$. The problem is that the flag property turns out to not always hold. A counterexample, provided by Daniel Altman, involves the four linear forms

$\displaystyle \psi_1(n,r) = r; \psi_2(n,r) = 2n+2r; \psi_3(n,r) = n+3r; \psi_4(n,r) = n.$

Here it turns out that

$\displaystyle \Psi^{[1]} = \{ (a,b,c,d): d-c=3a; b-2a=2d\}$

and

$\displaystyle \Psi^{[2]} = \{ (a,b,c,d): 24a+3b-4c-8d=0 \}$

and ${\Psi^{[1]}}$ is no longer contained in ${\Psi^{[2]}}$. The analogue of the asymptotic formula given previously for ${f(n) = F( \alpha n, \beta n^2 + \gamma n)}$ is then valid when ${\gamma}$ vanishes, but not when ${\gamma}$ is non-zero, because the identity

$\displaystyle 24 (\beta \psi_1(n,r)^2 + \gamma \psi_1(n,r)) + 3 (\beta \psi_2(n,r)^2 + \gamma \psi_2(n,r))$

$\displaystyle - 4 (\beta \psi_3(n,r)^2 + \gamma \psi_3(n,r)) - 8 (\beta \psi_4(n,r)^2 + \gamma \psi_4(n,r)) = 0$

holds in the former case but not the latter. Thus the output of any purported arithmetic regularity lemma in this case is now sensitive to the lower order terms of the nilsequence and cannot be described in a uniform fashion for all “irrational” sequences. There should still be some sort of formula for the asymptotics from the general equidistribution theory of nilsequences, but it could be considerably more complicated than what is presented in this paper.

Fortunately, the flag property does hold in several key cases, most notably the translation invariant case when ${\Psi^{[1]}}$ contains ${(1,\dots,1)}$, as well as “complexity one” cases. Nevertheless non-flag property systems of affine forms do exist, thus limiting the range of applicability of the techniques in this paper. In particular, the conjecture of Gowers and Wolf (Theorem 1.13 in the paper) is now open again in the non-flag property case.

I’ve just uploaded to the arXiv my paper “Cancellation for the multilinear Hilbert transform“, submitted to Collectanea Mathematica. This paper uses methods from additive combinatorics (and more specifically, the arithmetic regularity and counting lemmas from this paper of Ben Green and myself) to obtain a slight amount of progress towards the open problem of obtaining ${L^p}$ bounds for the trilinear and higher Hilbert transforms (as discussed in this previous blog post). For instance, the trilinear Hilbert transform

$\displaystyle H_3( f_1, f_2, f_3 )(x) := p.v. \int_{\bf R} f_1(x+t) f_2(x+2t) f_3(x+3t)\ \frac{dt}{t}$

is not known to be bounded for any ${L^{p_1}({\bf R}) \times L^{p_2}({\bf R}) \times L^{p_3}({\bf R})}$ to ${L^p({\bf R})}$, although it is conjectured to do so when ${1/p =1/p_1 +1/p_2+1/p_3}$ and ${1 < p_1,p_2,p_3,p < \infty}$. (For ${p}$ well below ${1}$, one can use additive combinatorics constructions to demonstrate unboundedness; see this paper of Demeter.) One can approach this problem by considering the truncated trilinear Hilbert transforms

$\displaystyle H_{3,r,R}( f_1, f_2, f_3 )(x) := \int_{r \leq |t| \leq R} f_1(x+t) f_2(x+2t) f_3(x+3t)\ \frac{dt}{t}$

for ${0 < r < R}$. It is not difficult to show that the boundedness of ${H_3}$ is equivalent to the boundedness of ${H_{3,r,R}}$ with bounds that are uniform in ${R}$ and ${r}$. On the other hand, from Minkowski’s inequality and Hölder’s inequality one can easily obtain the non-uniform bound of ${2 \log \frac{R}{r}}$ for ${H_{3,r,R}}$. The main result of this paper is a slight improvement of this trivial bound to ${o( \log \frac{R}{r})}$ as ${R/r \rightarrow \infty}$. Roughly speaking, the way this gain is established is as follows. First there are some standard time-frequency type reductions to reduce to the task of obtaining some non-trivial cancellation on a single “tree”. Using a “generalised von Neumann theorem”, we show that such cancellation will happen if (a discretised version of) one or more of the functions ${f_1,f_2,f_3}$ (or a dual function ${f_0}$ that it is convenient to test against) is small in the Gowers ${U^3}$ norm. However, the arithmetic regularity lemma alluded to earlier allows one to represent an arbitrary function ${f_i}$, up to a small error, as the sum of such a “Gowers uniform” function, plus a structured function (or more precisely, an irrational virtual nilsequence). This effectively reduces the problem to that of establishing some cancellation in a single tree in the case when all functions ${f_0,f_1,f_2,f_3}$ involved are irrational virtual nilsequences. At this point, the contribution of each component of the tree can be estimated using the “counting lemma” from my paper with Ben. The main term in the asymptotics is a certain integral over a nilmanifold, but because the kernel ${\frac{dt}{t}}$ in the trilinear Hilbert transform is odd, it turns out that this integral vanishes, giving the required cancellation.

The same argument works for higher order Hilbert transforms (and one can also replace the coefficients in these transforms with other rational constants). However, because the quantitative bounds in the arithmetic regularity and counting lemmas are so poor, it does not seem likely that one can use these methods to remove the logarithmic growth in ${R/r}$ entirely, and some additional ideas will be needed to resolve the full conjecture.

In Notes 3, we saw that the number of additive patterns in a given set was (in principle, at least) controlled by the Gowers uniformity norms of functions associated to that set.

Such norms can be defined on any finite additive group (and also on some other types of domains, though we will not discuss this point here). In particular, they can be defined on the finite-dimensional vector spaces ${V}$ over a finite field ${{\bf F}}$.

In this case, the Gowers norms ${U^{d+1}(V)}$ are closely tied to the space ${\hbox{Poly}_{\leq d}(V \rightarrow {\bf R}/{\bf Z})}$ of polynomials of degree at most ${d}$. Indeed, as noted in Exercise 20 of Notes 4, a function ${f: V \rightarrow {\bf C}}$ of ${L^\infty(V)}$ norm ${1}$ has ${U^{d+1}(V)}$ norm equal to ${1}$ if and only if ${f = e(\phi)}$ for some ${\phi \in \hbox{Poly}_{\leq d}(V \rightarrow {\bf R}/{\bf Z})}$; thus polynomials solve the “${100\%}$ inverse problem” for the trivial inequality ${\|f\|_{U^{d+1}(V)} \leq \|f\|_{L^\infty(V)}}$. They are also a crucial component of the solution to the “${99\%}$ inverse problem” and “${1\%}$ inverse problem”. For the former, we will soon show:

Proposition 1 (${99\%}$ inverse theorem for ${U^{d+1}(V)}$) Let ${f: V \rightarrow {\bf C}}$ be such that ${\|f\|_{L^\infty(V)}}$ and ${\|f\|_{U^{d+1}(V)} \geq 1-\epsilon}$ for some ${\epsilon > 0}$. Then there exists ${\phi \in \hbox{Poly}_{\leq d}(V \rightarrow {\bf R}/{\bf Z})}$ such that ${\| f - e(\phi)\|_{L^1(V)} = O_{d, {\bf F}}( \epsilon^c )}$, where ${c = c_d > 0}$ is a constant depending only on ${d}$.

Thus, for the Gowers norm to be almost completely saturated, one must be very close to a polynomial. The converse assertion is easily established:

Exercise 1 (Converse to ${99\%}$ inverse theorem for ${U^{d+1}(V)}$) If ${\|f\|_{L^\infty(V)} \leq 1}$ and ${\|f-e(\phi)\|_{L^1(V)} \leq \epsilon}$ for some ${\phi \in \hbox{Poly}_{\leq d}(V \rightarrow {\bf R}/{\bf Z})}$, then ${\|F\|_{U^{d+1}(V)} \geq 1 - O_{d,{\bf F}}( \epsilon^c )}$, where ${c = c_d > 0}$ is a constant depending only on ${d}$.

In the ${1\%}$ world, one no longer expects to be close to a polynomial. Instead, one expects to correlate with a polynomial. Indeed, one has

Lemma 2 (Converse to the ${1\%}$ inverse theorem for ${U^{d+1}(V)}$) If ${f: V \rightarrow {\bf C}}$ and ${\phi \in \hbox{Poly}_{\leq d}(V \rightarrow {\bf R}/{\bf Z})}$ are such that ${|\langle f, e(\phi) \rangle_{L^2(V)}| \geq \epsilon}$, where ${\langle f, g \rangle_{L^2(V)} := {\bf E}_{x \in G} f(x) \overline{g(x)}}$, then ${\|f\|_{U^{d+1}(V)} \geq \epsilon}$.

Proof: From the definition of the ${U^1}$ norm (equation (18) from Notes 3), the monotonicity of the Gowers norms (Exercise 19 of Notes 3), and the polynomial phase modulation invariance of the Gowers norms (Exercise 21 of Notes 3), one has

$\displaystyle |\langle f, e(\phi) \rangle| = \| f e(-\phi) \|_{U^1(V)}$

$\displaystyle \leq \|f e(-\phi) \|_{U^{d+1}(V)}$

$\displaystyle = \|f\|_{U^{d+1}(V)}$

and the claim follows. $\Box$

In the high characteristic case ${\hbox{char}({\bf F}) > d}$ at least, this can be reversed:

Theorem 3 (${1\%}$ inverse theorem for ${U^{d+1}(V)}$) Suppose that ${\hbox{char}({\bf F}) > d \geq 0}$. If ${f: V \rightarrow {\bf C}}$ is such that ${\|f\|_{L^\infty(V)} \leq 1}$ and ${\|f\|_{U^{d+1}(V)} \geq \epsilon}$, then there exists ${\phi \in \hbox{Poly}_{\leq d}(V \rightarrow {\bf R}/{\bf Z})}$ such that ${|\langle f, e(\phi) \rangle_{L^2(V)}| \gg_{\epsilon,d,{\bf F}} 1}$.

This result is sometimes referred to as the inverse conjecture for the Gowers norm (in high, but bounded, characteristic). For small ${d}$, the claim is easy:

Exercise 2 Verify the cases ${d=0,1}$ of this theorem. (Hint: to verify the ${d=1}$ case, use the Fourier-analytic identities ${\|f\|_{U^2(V)} = (\sum_{\xi \in \hat V} |\hat f(\xi)|^4)^{1/4}}$ and ${\|f\|_{L^2(V)} = (\sum_{\xi \in \hat V} |\hat f(\xi)|^2)^{1/2}}$, where ${\hat V}$ is the space of all homomorphisms ${\xi: x \mapsto \xi \cdot x}$ from ${V}$ to ${{\bf R}/{\bf Z}}$, and ${\hat f(\xi) := \mathop{\bf E}_{x \in V} f(x) e(-\xi \cdot x)}$ are the Fourier coefficients of ${f}$.)

This conjecture for larger values of ${d}$ are more difficult to establish. The ${d=2}$ case of the theorem was established by Ben Green and myself in the high characteristic case ${\hbox{char}({\bf F}) > 2}$; the low characteristic case ${\hbox{char}({\bf F}) = d = 2}$ was independently and simultaneously established by Samorodnitsky. The cases ${d>2}$ in the high characteristic case was established in two stages, firstly using a modification of the Furstenberg correspondence principle, due to Ziegler and myself. to convert the problem to an ergodic theory counterpart, and then using a modification of the methods of Host-Kra and Ziegler to solve that counterpart, as done in this paper of Bergelson, Ziegler, and myself.

The situation with the low characteristic case in general is still unclear. In the high characteristic case, we saw from Notes 4 that one could replace the space of non-classical polynomials ${\hbox{Poly}_{\leq d}(V \rightarrow {\bf R}/{\bf Z})}$ in the above conjecture with the essentially equivalent space of classical polynomials ${\hbox{Poly}_{\leq d}(V \rightarrow {\bf F})}$. However, as we shall see below, this turns out not to be the case in certain low characteristic cases (a fact first observed by Lovett, Meshulam, and Samorodnitsky, and independently by Ben Green and myself), for instance if ${\hbox{char}({\bf F}) = 2}$ and ${d \geq 3}$; this is ultimately due to the existence in those cases of non-classical polynomials which exhibit no significant correlation with classical polynomials of equal or lesser degree. This distinction between classical and non-classical polynomials appears to be a rather non-trivial obstruction to understanding the low characteristic setting; it may be necessary to obtain a more complete theory of non-classical polynomials in order to fully settle this issue.

The inverse conjecture has a number of consequences. For instance, it can be used to establish the analogue of Szemerédi’s theorem in this setting:

Theorem 4 (Szemerédi’s theorem for finite fields) Let ${{\bf F} = {\bf F}_p}$ be a finite field, let ${\delta > 0}$, and let ${A \subset {\bf F}^n}$ be such that ${|A| \geq \delta |{\bf F}^n|}$. If ${n}$ is sufficiently large depending on ${p,\delta}$, then ${A}$ contains an (affine) line ${\{ x, x+r, \ldots, x+(p-1)r\}}$ for some ${x,r \in {\bf F}^n}$ with ${ r\neq 0}$.

Exercise 3 Use Theorem 4 to establish the following generalisation: with the notation as above, if ${k \geq 1}$ and ${n}$ is sufficiently large depending on ${p,\delta}$, then ${A}$ contains an affine ${k}$-dimensional subspace.

We will prove this theorem in two different ways, one using a density increment method, and the other using an energy increment method. We discuss some other applications below the fold.