Tamar Ziegler and I have just uploaded to the arXiv our paper, “The inverse conjecture for the Gowers norm over finite fields via the correspondence principle“, submitted to Analysis & PDE.  As announced a few months ago in this blog post, this paper establishes (most of) the inverse conjecture for the Gowers norm from an ergodic theory analogue of this conjecture (in a forthcoming paper by Vitaly Bergelson, Tamar Ziegler, and myself, which should be ready shortly), using a variant of the Furstenberg correspondence principle.  Our papers were held up for a while due to some unexpected technical difficulties arising in the low characteristic case; as a consequence, our paper only establishes the full inverse conjecture in the high characteristic case $p \geq k$, and gives a partial result in the low characteristic case $p < k$.

In the rest of this post, I would like to describe the inverse conjecture (in both combinatorial and ergodic forms), and sketch how one deduces one from the other via the correspondence principle (together with two additional ingredients, namely a statistical sampling lemma and a local testability result for polynomials).

— Polynomials and the Gowers norm —

Let F be a finite field, and let $F^n$ be a vector space over that field.  Given any function $P: F^n \to {\Bbb R}/{\Bbb Z}$ taking values in the unit circle ${\Bbb R}/{\Bbb Z}$, we define the additive derivative $\Delta^+ P: F^n \to {\Bbb R}/{\Bbb Z}$ of P in the direction h by the formula $\Delta^+_h f(x) := f(x+h) - f(x)$.

Now let $k \geq 0$.  A function $P: F^n \to {\Bbb R}/{\Bbb Z}$ is said to be a polynomial of degree $ if one has $\Delta^+_{h_1} \ldots \Delta^+_{h_k} P = 0$ for all $h_1,\ldots,h_k \in F^n$.  Thus for instance, the only polynomial of degree <0 is the zero function, the only polynomials of degree <1 are the constants, the only polynomial of degree <2 are the affine characters, and so forth.  If $F=F_p$ is a field of prime order, and we make the additional assumption that P takes values in the $p^{th}$ roots of unity (which we can identify with F in the usual manner), then we can express the polynomial P in the customary manner as $P(x_1,\ldots,x_n) = \sum_{i_1+\ldots+i_n < k} c_{i_1,\ldots,i_n} x_1^{i_1} \ldots x_n^{i_n}$ (1)

for some coefficients $c_{i_1,\ldots,i_n} \in F$; let us refer to these as classical polynomials of degree <k.  However, if one does not require P to take values in $p^{th}$ roots of unity, then the above definition also encompasses some non-classical polynomials.  For instance, if $F = F_2$, the function $P: F \to {\Bbb R}/{\Bbb Z}$ defined by P(0)=0 and P(1)=1/4 is of degree <3 (i.e. is a “quadratic” polynomial), but is not classical.  [However, it is a nice exercise to show that in the high characteristic case $p \geq k$, every polynomial of degree $ can be expressed as the sum of a classical polynomial and a constant; thus the genuinely non-classical polynomials are a purely low characteristic phenomenon.  I may write more about the relationship between classical and non-classical polynomials in a future post.]

We can define multiplicative analogues of the above concept.  Given a function $f: F^n \to {\Bbb C}$, we define the multiplicative derivative $\Delta_h^\times f: F^n \to {\Bbb C}$ to be the function $\Delta^\times_h f(x) := f(x+h) \overline{f(x)},$

and say that f is a phase polynomial of degree <k if $\Delta^\times_{h_1} \ldots \Delta^\times_{h_k} f = 1$ for all $h_1,\ldots,h_k \in F^n$.  It is not hard to show that f is a phase polynomial of degree <k if and only if $f=e(P)$ for some polynomial $P: F^n \to {\Bbb R}/{\Bbb Z}$ of degree <k, where $e:{\Bbb R}/{\Bbb Z} \to {\Bbb C}$ is the standard character $e(x) = e^{2\pi i x}$.

The Gowers uniformity norms are a means to measure the extent to which an arbitrary function $f: F^n \to {\Bbb C}$ behaves like a phase polynomial; if $k \geq 1$, they are defined by the formula $\|f\|_{U^k(F^n)} := [{\Bbb E}_{x,h_1,\ldots,h_k \in F^n} \Delta^\times_{h_1} \ldots \Delta^\times_{h_k} f(x)]^{1/2^k}$

where ${\Bbb E}_{a \in A} f(a) := \frac{1}{|A|} \sum_{a \in A} f(a)$ denotes the average of f on A.  Thus, for instance, if f is bounded in magnitude by 1, then $\|f\|_{U^k(F^n)}$ is a number between 0 and 1, and equals 1 if and only if f is a phase polynomial of degree <k.

We have just seen that exact extremisers of the Gowers norms are given by phase polynomials.  This statement is in fact robust, in that near-exact extremisers of the Gowers norms are close to phase polynomials:

Theorem 1 (local testabilty of phase polynomials)  Suppose that $f: F^n \to {\Bbb C}$ is bounded in magnitude by 1 and obeys the bound $\|f\|_{U^k(F^n)} \geq 1-\varepsilon$.  Then there exists a phase polynomial g of degree <k such that ${\Bbb E}_x |f(x)-g(x)| = o_{\varepsilon \to 0;k}(1)$, where $o_{\varepsilon \to 0;k,F}(1)$ denotes a quantity that goes to zero as $\varepsilon \to 0$ for fixed k, F, uniformly in the choice of f or n.

This fact is essentially due to Alon, Kaufman, Krivelevich, Litsyn, and Ron; their paper is focused on the case when $F=F_2$ and P is a classical polynomial, but it is not too difficult to extend the result to the generality of Theorem 1 (this is done in an appendix to our paper, as we need this local testability result in our arguments).  One consequence of this theorem is that one can test (with high confidence) whether a given function is close to a phase polynomial of degree <k, by randomly computing some derivatives $\Delta^\times_{h_1} \ldots \Delta^\times_{h_k} f(x)$ and seeing whether they are consistently close to 1.  Note that this test is local, in the sense that one only has to evaluate f at a bounded number of positions in $F^n$ (even as $n \to \infty$) in order to perform the test.

Theorem 1 describes what happens when a function “behaves like a phase polynomial 99% of the time”, in the sense that the Gowers norm is very close to 1.  The inverse conjecture for the Gowers norm addresses the more general case in which a function “behaves like a phase polynomial 51% of the time”, in the sense that the Gowers norm is bounded below by some small constant $\delta > 0$.  (Note that if $F=F_2$ and $f: F^n \to \{-1,+1\}$ was a random boolean function, then $\Delta^\times_{h_1} \ldots \Delta^\times_{h_k} f(x)$ would equal +1 about 50% of the time and -1 for the other 50%, leading to a very small Gowers norm.)  More precisely, we have

Inverse Conjecture for the Gowers norm over finite fields. Let $f: F^n \to {\Bbb C}$ be a function bounded in magnitude by 1 such that $\|f\|_{U^k(F^n)} \geq \delta > 0$.  Then there exists a phase polynomial g of degree <k such that $|{\Bbb E} f(x) \overline{g(x)}| \geq c(k,F,\delta) > 0$ for some $c(k,F,\delta)$ independent of n.

This conjecture has a number of applications to the structural theory of additive patterns (e.g. arithmetic progressions) in subsets of $F^n$; see for instance this paper of Gowers-Wolf.  It is also connected to various problems in computer science, such as generation of pseudorandom bits (see e.g. this paper of Bogdanov-Viola), and to polynomiality testing (see e.g. this paper of Samorodnitsky-Trevisan).

Previously, this conjecture had been verified for $k \leq 3$ by Samorodnitsky (in even characteristic) and by Ben Green and myself (in odd characteristic).  In high characteristic $p \geq k$, a partial result (assuming that f was already a classical polynomial of bounded degree) was obtained by Ben and myself; but in low characteristic, it was discovered in that paper and simultaneously by Lovett-Meshulam-Samorodnitsky that the conjecture can fail if one restricts g to be a classical phase polynomial.  Nevertheless we believe the conjecture should still in the low characteristic case hold if we allow g to be non-classical.

Our main results are as follows:

Theorem 2. The inverse conjecture is true in the high characteristic case $p \geq k$.

Theorem 3. In general characteristic, a weaker form of the inverse conjecture is true, in which we allow g to be a phase polynomial of degree $ rather than $, for some C(k) depending only on k.

As mentioned above, we expect that we should be able to take C(k)=k in any characteristic, but this seems to require a delicate algebraic analysis and we were not able to establish it here.

— Ergodic theory —

We were able to deduce the combinatorial results in Theorems 2, 3 from ergodic analogues of these results.  Roughly speaking, these ergodic results correspond to “local” versions of the “global” theorems in Theorem 2, 3, because they involve averaging over only a small portion of the underlying space, as opposed to the global averages used to define the Gowers norms.  [In this regard, the ergodic theory results are weaker than the combinatorial ones.  However, there is another direction in which the ergodic results are stronger, in that they encode some additional “measurability” information that is not present in the combinatorial setting; roughly speaking, this information asserts that the function g appearing in the above conjecture can be “locally reconstructed” from f in some sense, as opposed to merely existing in some non-constructive sense.] Fortunately, thanks to some “local-to-global” principles (including the local testability result in Theorem 1), we are able to pass back and forth between the two types of theorems in a fairly “soft” manner.

Whereas the combinatorial results take place in the finitary setting of a finite-dimensional vector space $F^n$, the ergodic theory results take place in the infinitary setting of a probability space $(X, {\mathcal B}, \mu)$ with a measure-preserving action $(T_g)_{g \in F^{\omega}}$ of the countably infinite vector space $F^\omega := \lim_{n \to \infty} F^n = \bigcup_n F^n$.   Given a measurable function $f: X \to {\Bbb C}$, one can define multiplicative derivatives $\Delta^\times_h f(x) := f( T_h x ) \overline{f(x)}$

for $h \in F^\omega$, and we say that f is a phase polynomial of degree <k if $\Delta^\times_{h_1} \ldots \Delta^\times_{h_k} f = 1$ a.e. for all $h_1,\ldots,h_k \in F^\omega$.  We can also define the Gowers-Host-Kra uniformity seminorm $\|f\|_{U^k(X)}$ for $k \geq 1$ by the formula $\displaystyle \|f\|_{U^k(X)} := (\lim_{n_1 \to \infty} \ldots \lim_{n_k \to \infty} {\Bbb E}_{h_1 \in F^{n_1}, \ldots, h_k \in F^{n_k}}$ $\displaystyle \int_X \Delta^\times_{h_1} \ldots \Delta^\times_{h_k} f\ d\mu)^{1/2^k};$

it can be shown that the limits exist and that this is indeed a seminorm.  Thus for instance, as in the combinatorial setting, phase polynomials of degree <k have a Gowers-Host-Kra seminorm of 1.

There is an analogue of the inverse conjecture:

Ergodic Inverse Conjecture for the Gowers norm over finite fields. Suppose that X is ergodic.  Let $f \in L^\infty(X)$ be such that $\|f\|_{U^k(X)} > 0$.  Then there exists a phase polynomial g of degree <k such that $|\int_X f \overline{g}\ d\mu| > 0$.

In a forthcoming paper of Bergelson, Ziegler, and myself, we show

Theorem 2′. The ergodic inverse conjecture is true in the high characteristic case $p \geq k$.

Theorem 3′. In general characteristic, a weaker form of the ergodic inverse conjecture is true, in which we allow g to be a phase polynomial of degree $ rather than $, for some C(k) depending only on k.

The aim of the current paper is to deduce Theorems 2, 3 from Theorems 2′, 3′ respectively.

— Outline of proof —

The main idea here is to use the compactness and contradiction argument as embodied in the Furstenberg correspondence principle (which was discussed in my previous blog post).  Suppose for instance that Theorem 2 failed; then one would have a sequence of bounded functions $f_j: F^{n_j} \to {\Bbb C}$ (where $n_j$ is going off to infinity) with Gowers norms of order k uniformly bounded from below, but such that $f_j$ becomes increasingly orthogonal to all phase polynomials of degree <k.  The idea is then to exploit weak compactness to take a “limit”, which would be a bounded measurable function $f \in L^\infty(X)$ on some measure space X with an $F^\omega$ action.

The problem in doing this is that the convergence given by weak compactness is only in the “weak” or “local” sense; for instance, one can establish convergence results for local statistics, e.g. $\displaystyle {\Bbb E}_{x \in F^{n_j}} f_j(x) f_j(x+h) f_j(x+k) \to \int_X f(x) f(T_h x) f(T_k x)\ d\mu(x)$

for any fixed $h, k \in F^\omega$, but one cannot immediately establish convergence results for global statistics, e.g. $\displaystyle {\Bbb E}_{x,h,k \in F^{n_j}} f_j(x) f_j(x+h) f_j(x+k)$ $\displaystyle \not \to \lim_{N \to \infty} \lim_{M \to \infty} {\Bbb E}_{h \in F^N, k \in F^M} \int_X f(x) f(T_h x) f(T_k x)\ d\mu(x)$ (2)

for the limit objects X, f obtained by the traditional Furstenberg correspondence principle.  To get around this, we first “shuffle” each of the original spaces $F^{n_j}$ by a randomly chosen general linear transformation before applying the correspondence principle argument (actually, to be more precise, we pull back $F^{n_j}$ to $F^\infty$ using a random linear transformation from the latter to the former, but never mind this technical detail).   The point of doing this shuffling is that it makes local averages approximate global averages with high confidence, for exactly the same reason that random polls with moderately large sample sizes can be accurate even when the overall population is enormous (see my previous post for more discussion on this).  To cut a long story short, this shuffling (or “statistical sampling”) enables one to obtain convergence results such as (2).  This serves several useful purposes in our paper.  For instance, it allows us to ensure that the limit system X is ergodic.  But more importantly, it allows us to carry the information that each $f_j$ has large Gowers norm into the limit, concluding that f has large Gowers-Host-Kra seminorm.  Thus we are now in a position to apply the hypothesis Theorem 2′ (or Theorem 3′, if the objective is to prove Theorem 3) and conclude that f correlates with a phase polynomial g.

Now we want to pass back from the limit system to the original finitary systems and conclude the desired contradiction.  A new problem emerges: the nature of the convergence of the latter to the former allows one to relate (local) statistics of $f$ to (local) statistics of $f_j$ (which in turn relate to global statistics of $f_j$), but says nothing about g.  Fortunately, the ergodic inverse theorems tell us that g is measurable, which turns out to imply that g is “nearly local” in the sense that it can be approximated to arbitrary accuracy as some function g’ of finitely many shifts of f, which is almost a phase polynomial and will continue to correlate with f if the approximation is good enough.  This latter object can be passed back to the original finitary systems, creating functions $g'_j$ correlating with $f_j$ which are also almost polynomials.  To finish off the proof, we use the local testability (Theorem 1) to approximate $g'_j$ by a genuine phase polynomial; this means that the $f_j$ do, after all, have significant correlation with a phase polynomial, contradicting our hypothesis and establishing the result.

It is tempting to try to use this type of argument to handle the inverse conjecture for the Gowers norm on the integers ${\Bbb Z}$, especially since ergodic inverse conjectures in this setting are already known thanks to the work of Host-Kra (and also Ziegler).  However, there are two non-trivial obstacles here.  Firstly, one cannot “shuffle” the integers (or related objects, such as cyclic groups ${\Bbb Z}/N{\Bbb Z}$) with the same degree of freedom as one can for a vector space, causing a difficulty in relating local and global statistics.  Secondly, whereas polynomials are known to be local testable, the local testability of the analogous object for the integer case, namely that of nilsequences, is unknown.  These difficulties do not appear to be totally insurmountable, but they would require some additional effort to resolve.