Ben Green, Tamar Ziegler and I have just uploaded to the arXiv our paper “An inverse theorem for the Gowers $U^4$ norm“.  This paper establishes the next case of the inverse conjecture for the Gowers norm for the integers (after the $U^3$ case, which was done by Ben and myself a few years ago).  This conjecture has a number of combinatorial and number-theoretic consequences, for instance by combining this new inverse theorem with previous results, one can now get the correct asymptotic for the number of arithmetic progressions of primes of length five in any large interval $[N] = \{1,\ldots,N\}$.

To state the inverse conjecture properly requires a certain amount of notation.  Given a function $f: {\Bbb Z} \to {\Bbb C}$ and a shift $h \in {\Bbb Z}$, define the multiplicative derivative

$\Delta_h f(x) := f(x+h) \overline{f(x)}$

and then define the Gowers $U^{s+1}[N]$ norm of a function $f: [N] \to {\Bbb C}$ to (essentially) be the quantity

$\| f\|_{U^{s+1}[N]} := ({\Bbb E}_{h_1,\ldots,h_{s+1} \in [-N,N]} {\Bbb E}_{x \in [N]} |\Delta_{h_1} \ldots \Delta_{h_{s+1}} f(x)|)^{1/2^{s+1}},$

where we extend f by zero outside of $[N]$.  (Actually, we use a slightly different normalisation to ensure that the function 1 has a $U^{s+1}$ norm of 1, but never mind this for now.)

Informally, the Gowers norm $\|f\|_{U^{s+1}[N]}$ measures the amount of bias present in the $(s+1)^{st}$ multiplicative derivatives of $f$.  In particular, if $f = e(P) := e^{2\pi i P}$ for some polynomial $P: {\Bbb Z} \to {\Bbb C}$, then the $(s+1)^{th}$ derivative of $f$ is identically 1, and so is the Gowers norm.

However, polynomial phases are not the only functions with large Gowers norm.  For instance, consider the function $f(n) := e( \lfloor \sqrt{2} n \rfloor \sqrt{3} n )$, which is what we call a quadratic bracket polynomial phase.  This function isn’t quite quadratic, but it is close enough to being quadratic (because one has the approximate linearity relationship $\lfloor x+y \rfloor = \lfloor x \rfloor + \lfloor y \rfloor$ holding a good fraction of the time) that it turns out that third derivative is trivial fairly often, and the Gowers norm $\|f\|_{U^3[N]}$ is comparable to 1.  This bracket polynomial phase can be modeled as a nilsequence $n \mapsto F( g(n) \Gamma )$, where $n \mapsto g(n) \Gamma$ is a polynomial orbit on a nilmanifold $G/\Gamma$, which in this case has step 2.  (The function F is only piecewise smooth, due to the discontinuity in the floor function $\lfloor \rfloor$, so strictly speaking we would classify this as an almost nilsequence rather than a nilsequence, but let us ignore this technical issue here.)  In fact, there is a very close relationship between nilsequences and bracket polynomial phases, but I will detail this in a later post.

The inverse conjecture for the Gowers norm, GI(s), asserts that such nilsequences are the only obstruction to the Gowers norm being small.  Roughly speaking, it goes like this:

Inverse conjecture, GI(s). (Informal statement)  Suppose that $f: [N] \to {\Bbb C}$ is bounded but has large $U^{s+1}[N]$ norm.  Then there is an s-step nilsequence $n \mapsto F( g(n) \Gamma )$ of “bounded complexity” that correlates with f.

This conjecture is trivial for s=0, is a short consequence of Fourier analysis when s=1, and was proven for s=2 by Ben and myself.  In this paper we establish the s=3 case.  An equivalent formulation in this case is that any bounded function $f$ of large $U^4$ norm must correlate with a “bracket cubic phase”, which is the product of a bounded number of phases from the following list

$e( \alpha n^3 + \beta n^2 + \gamma n), e( \lfloor \alpha n \rfloor \beta n^2 ), e( \lfloor \alpha n \rfloor \lfloor \beta n \rfloor \gamma n ), e( \lfloor \alpha n \rfloor \beta n )$ (*)

for various real numbers $\alpha,\beta,\gamma$.

It appears that our methods also work in higher step, though for technical reasons it is convenient to make a number of adjustments to our arguments to do so, most notably a switch from standard analysis to non-standard analysis, about which I hope to say more later.  But there are a number of simplifications available on the s=3 case which make the argument significantly shorter, and so we will be writing the higher s argument in a separate paper.

The arguments largely follow those for the s=2 case (which in turn are based on this paper of Gowers).  Two major new ingredients are a deployment of a normal form and equidistribution theory for bracket quadratic phases, and a combinatorial decomposition of frequency space which we call the sunflower decomposition.  I will sketch these ideas below the fold.

– Sketch of proof –

We will use an inductive argument, relying on the known inverse theory for the $U^3$ norm to establish the $U^4$ result.

Let f be a function with large $U^4$ norm.  The goal is to show that f correlates with some cubic object, such as something generated from the list (*).  By definition of the $U^4$ norm, we already know that the derivatives $\Delta_h f$ have large $U^3$ norm for many h.  Applying the inverse theorem at the $U^3$ level, we conclude that for many such $h$, $\Delta_h f$ correlates with a bracket quadratic phase $\chi_h$, which one can write as a bounded product of phases of the form

$e( \alpha_h n^2 + \beta_h n ), e( \lfloor \xi_h n \rfloor \eta_h n )$

for various frequencies $\alpha_h, \beta_h, \xi_h, \eta_h$ depending on $h$.  To simplify the discussion let us ignore the pure quadratics $e( \alpha_h n^2 + \beta_h n )$ as lower order terms, and suppose that only a single bracket quadratic is present.  We thus have

$\Delta_h f(n) \approx e( \lfloor \xi_h n \rfloor \eta_h n )$

where we will be vague about exactly what the $\approx$ symbol means here.

A key observation of Gowers is that the cocycle nature of $\Delta_h f$ induces some non-trivial relationships between the frequencies $\xi_h, \eta_h$.  For instance, if f had magnitude 1 throughout, then we would have the identity

$\Delta_{h+k} f(n) = \Delta_h f(n+k) \Delta_k f(n)$

which suggests (discarding lower order terms) that one has the relation

$e( \lfloor \xi_{h+k} n \rfloor \eta_{h+k} n ) \approx$

$e( \lfloor \xi_h n \rfloor \eta_h n + \lfloor \xi_k n \rfloor \eta_k n )$ (**)

for many h, k.  (In practice, one actually obtains a slightly more complicated relationship, in which the additive triple $(h,k,h+k)$ is replaced by an additive quadruple $(h_1,h_2,h_3,h_4)$ with $h_1+h_2=h_3+h_4$, but never mind this technicality.)

The relation (**) suggests that one has an identity of the form

$\lfloor \xi_{h+k} n \rfloor \eta_{h+k} n \approx \lfloor \xi_h n \rfloor \eta_h n + \lfloor \xi_k n \rfloor \eta_k n$

modulo lower order terms (and also modulo 1).  How can such an identity occur?  There are two obvious ways this can happen:

1. The $\xi_h$ are independent of h, and $\eta_h$ varies linearly in h.
2. The $\eta_h$ are independent of h, and $\xi_h$ varies linearly in h.  (The error caused by the failure of $\lfloor \rfloor$ to be exactly linear turns out to be lower order and will be absorbed into the vaguely defined $\approx$ symbol.)

It turns out that one can generalise these scenarios a little bit.  Firstly, since we are only seeking structure for many h,k rather than for all h,k, one can replace “linearly in h” in the above scenarios by “bracket-linearly in h”, e.g. $\eta_h$ could be given by $\lfloor \alpha h \rfloor \beta$ for some real numbers $\alpha,\beta$, as this would still be sufficiently close to linear to generate a relation such as (**).  Secondly, one could modify the $\xi_h, \eta_h$ by integers (or more generally, by rational numbers with bounded denominator) without significantly affecting (**) due to the periodicity of the function e() modulo 1.  Finally, there are some identities of bracket calculus that show some non-trivial relations between various quadratic expressions $\lfloor x \rfloor y$, such as

$\lfloor qx \rfloor \frac{y}{q} \approx \lfloor x \rfloor y$

for any non-zero rational q with bounded numerator and denominator, and also the antisymmetry property

$\lfloor x \rfloor y \approx - \lfloor y \rfloor x \hbox{ mod } 1$

(the point being that $\{x\}\{y\} = (x-\lfloor x \rfloor) (y - \lfloor y\rfloor)$ can be considered a lower order term, as can xy, and $\lfloor x \rfloor \lfloor y \rfloor$ vanishes modulo 1).

It turns out though that the above maneuvres basically exhaust all the possible ways in which a relation of the form (**) can occur.    There are two main reasons for this:

1. One can show that if any two of the three pairs $(\xi_h, \eta_h), (\xi_k, \eta_k), (\xi_{h+k}, \eta_{h+k})$ are in “general position”, then no identity of the form (**) can occur.
2. If there is a frequency shared in common among all three frequency sets, e.g. if $\xi_h$ is independent of h, then the other frequency $\eta_h$ must essentially depend linearly (or bracket-linearly) in h.

The reason for claim 1 ultimately reduces to the following simple algebraic lemma (which we call the “Furstenberg-Weiss trick“, though it has appeared in some other contexts also, for instance in this paper of Hrushovski):

Lemma 1. Let H be a subgroup of a product $G_1 \times G_2 \times G_3$ of three groups.  If the projection of H to $G_1 \times G_2$ is surjective, and the projection of H to $G_1 \times G_3$ is surjective, then H contains the subgroup $[G_1,G_1] \times \{1\} \times \{1\}$.

Proof. Let $g, h$ be two elements of $G_1$.  By the first surjectivity property, H contains an element of the form $(g, 1, *)$ where * denotes an unspecified coefficient, and by the second property, H also contains $(h,*,1)$.  Taking commutators yields $([g,h],1,1)$, and the claim follows. $\Box$

Roughly speaking, the reason why this Lemma implies Claim 1 is because the joint orbit of $\lfloor \xi_h n \rfloor \eta_h n$, $\lfloor \xi_k n \rfloor \eta_k n$, $\lfloor \xi_{h+k} n \rfloor \eta_{h+k} n$ can be controlled, using Ratner’ s theorem (or more precisely, a quantitative version of this theorem that Ben and I worked out a few years ago) by a certain subgroup H of the product of three Heisenberg groups (times an abelian group to take care of lower order errors, but let’s ignore this).  The general position hypotheses give us the hypotheses of the lemma, and the conclusion basically tells us that the phase of $\lfloor \xi_h n \rfloor \eta_h n$ (say) can be varied independently of the other two phases, which prevents any relation of the form (**) occurring.  A related argument also gives Claim 2.

Once one has Claim 1 and Claim 2, a combinatorial argument (which we call the sunflower decomposition) that separates the frequencies $\xi_h, \eta_h$ into “core” frequencies (shared by many h) and “petal” frequencies (which vary with h, so that the petals for h and petals for k are in general position for most h,k), and then using standard tools from additive combinatorics (notably the Balog-Szemeredi-Gowers lemma and Freiman’s theorem), one eventually concludes (possibly after reshuffling the bracket polynomials a bit, to take care of the ambiguities mentioned earlier) that of the two frequencies $\xi_h, \eta_h$ that appear, one is independent of h and the other varies bracket linearly in h.  Thus we see now that $\Delta_h f$ resembles an expression which is the product of a bounded number of terms such as

$e( \lfloor \xi n \rfloor \lfloor \alpha h \rfloor \beta n )$

and permutations.  We write this as

$\Delta_h f(n) \approx e( T( h, n, n ) )$

for some “trilinear” form T in three variables.

If T was symmetric in all three variables, then $e(T(h,n,n))$ would resemble the derivative $\Delta_h e( \frac{1}{3} T(n,n,n) )$ of $e( \frac{1}{3} T(n,n,n) )$ (modulo lower order terms), which turns out to imply (after invoking the induction hypothesis GI(2) again) that $f(n)$ correlates with $e(T(n,n,n))$ (times lower order terms).  But by construction, $e( T(n,n,n) )$ is a bracket cubic, and this would give the claim GI(3).

So one seeks to impose a “symmetry” property on T.  The reason for this symmetry ultimately stems from the identity

$\Delta_k \Delta h f(n) = \Delta_h \Delta_k f(n)$

which suggests that

$\Delta_k e( T(h,n,n) ) \approx \Delta_h e( T( k, n, n ) )$

which, when expanded, gives something reasonably close to the required symmetry property.  One can make this argument precise using a carefully chosen sequence of applications of the Cauchy-Schwarz inequality.  One initial obstacle is that the desired symmetry only holds for many pairs (h,k), rather than for all pairs (h,k), but it turns out that one can step over this obstacle by using a theorem of Sarkozy, which asserts that if H is a dense subset of [N], then a suitable sumset $H+H+\ldots+H$ of bounded length will contain a long arithmetic progression (of size comparable to N).  Combining this theorem with the trilinearity properties of T lets one pass from “many” to “all” (or more precisely, “all in a long arithmetic progression”, which is good enough for our purposes).