Recently, I had tentatively announced a forthcoming result with Ben Green establishing the “Gowers inverse conjecture” (or more accurately, the “inverse conjecture for the Gowers uniformity norm”) for vector spaces ${\Bbb F}_p^n$ over a finite field ${\Bbb F}_p$, in the special case when p=2 and when the function $f: {\Bbb F}_p^n \to {\Bbb C}$ for which the inverse conjecture is to be applied is assumed to be a polynomial phase of bounded degree (thus $f= e^{2\pi i P/p}$, where $P: {\Bbb F}_p^n \to {\Bbb F}_p$ is a polynomial of some degree $d=O(1)$). See my FOCS article for some further discussion of this conjecture, which has applications both to polynomiality testing and to various structural decompositions involving the Gowers norm.

This conjecture can be informally stated as follows. By iterating the obvious fact that the derivative of a polynomial of degree at most d is a polynomial of degree at most d-1, we see that a function $P: {\Bbb F}_p^n \to {\Bbb F}_p$ is a polynomial of degree at most d if and only if $\sum_{\omega_1,\ldots,\omega_{d+1} \in \{0,1\}} (-1)^{\omega_1+\ldots+\omega_{d+1}} P(x +\omega_1 h_1 + \ldots + \omega_{d+1} h_{d+1}) = 0$

for all $x,h_1,\ldots,h_{d+1} \in {\Bbb F}_p^n$. From this one can deduce that a function $f: {\Bbb F}_p^n \to {\Bbb C}$ bounded in magnitude by 1 is a polynomial phase of degree at most d if and only if the Gowers norm $\|f\|_{U^{d+1}({\Bbb F}_p^n)} := \bigl( {\Bbb E}_{x,h_1,\ldots,h_{d+1} \in {\Bbb F}_p^n} \prod_{\omega_1,\ldots,\omega_{d+1} \in \{0,1\}} {\mathcal C}^{\omega_1+\ldots+\omega_{d+1}} f(x + \omega_1 h_1 + \ldots + \omega_{d+1} h_{d+1}) \bigr)^{1/2^{d+1}}$ (where ${\mathcal C}$ denotes complex conjugation)

is equal to its maximal value of 1. The inverse conjecture for the Gowers norm, in its usual formulation, says that, more generally, if a function $f: {\Bbb F}_p^n \to {\Bbb C}$ bounded in magnitude by 1 has large Gowers norm (e.g. $\|f\|_{U^{d+1}} \geq \varepsilon$) then f has some non-trivial correlation with some polynomial phase g of degree at most d (e.g. $|\langle f, g \rangle| \geq c(\varepsilon)$ for some $c(\varepsilon) > 0$). Informally, this conjecture asserts that if a function has biased $(d+1)^{th}$ derivatives, then one should be able to “integrate” this bias and conclude that the function is biased relative to a polynomial of degree d. The conjecture has already been proven for $d \leq 2$. There are analogues of this conjecture for cyclic groups which are of relevance to Szemerédi’s theorem and to counting linear patterns in primes, but I will not discuss those here.
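To make the definition of the Gowers norm concrete, here is a minimal brute-force sketch in Python, restricted to the case p=2 (so that addition in ${\Bbb F}_2^n$ is XOR of bitmasks). The names `gowers_norm` and `phase` are my own illustrative choices, and the computation is only feasible for very small n. It checks the $U^3$ norm of a quadratic phase against that of a cubic phase on ${\Bbb F}_2^3$.

```python
from itertools import product


def gowers_norm(f, n, k):
    """Brute-force U^k norm of f on F_2^n.

    f is a list of 2**n complex values indexed by bitmask; addition in
    F_2^n is XOR of bitmasks. Only usable for very small n and k.
    """
    size = 1 << n
    total = 0
    for x in range(size):
        for hs in product(range(size), repeat=k):
            term = 1
            for omega in product((0, 1), repeat=k):
                point = x
                for w, h in zip(omega, hs):
                    if w:
                        point ^= h
                value = f[point]
                # C^{omega_1 + ... + omega_k}: conjugate when the sum is odd.
                if sum(omega) % 2:
                    value = value.conjugate()
                term *= value
            total += term
    mean = total / size ** (k + 1)
    return abs(mean) ** (1 / 2 ** k)


def phase(poly, n):
    """The function (-1)^P as a list, for P a 0/1-valued callable on bit lists."""
    return [complex((-1) ** poly([(x >> i) & 1 for i in range(n)]))
            for x in range(1 << n)]


# A quadratic phase (d = 2) and a cubic phase on F_2^3.
quadratic = phase(lambda v: (v[0] * v[1] + v[2]) % 2, 3)
cubic = phase(lambda v: v[0] * v[1] * v[2], 3)

u3_quadratic = gowers_norm(quadratic, 3, 3)
u3_cubic = gowers_norm(cubic, 3, 3)
print(u3_quadratic, u3_cubic)
```

In this sketch the quadratic phase should attain the maximal $U^3$ norm of 1, while the cubic phase should fall strictly below 1, in line with the characterisation above.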

At the time of the announcement, our paper had not quite been fully written up. This turned out to be a little unfortunate, because soon afterwards we discovered that our arguments at one point had to go through a version of Newton’s interpolation formula, which involves a factor of d! in the denominator and so is only valid when the characteristic p of the field exceeds the degree. So our arguments in fact are only valid in the range $p > d$, and in particular are rather trivial in the important case $p=2$; my previous announcement should thus be amended accordingly.

On investigating this further, we found that the conjecture as stated above is in fact false in the characteristic 2 case: specifically, the symmetric quartic polynomial $S_4: {\Bbb F}_2^n \to {\Bbb F}_2$ defined by $S_4(x_1,\ldots,x_n) := \sum_{1 \leq i < j < k < l \leq n} x_i x_j x_k x_l$ (1)

has no significant correlation with any cubic polynomial, but nevertheless exhibits “pseudocubic” behaviour in the sense that its fourth derivative $S_4(x+a+b+c+d)-S_4(x+a+b+c)-\ldots - S_4(x+d) + S_4(x)$ (2)

(there are 16 terms in this alternating sum) is biased to be 0 (in fact, it is 0 about 9/16 of the time), basically because the above expression can be factorised as $B(a,b) B(c,d) + B(a,d) B(b,c) + B(a,c) B(b,d)$ (3)

where B is the symmetric bilinear form $B(a,b) := \sum_{1 \leq i < j \leq n} (a_i b_j + a_j b_i)$.
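The identity between (2) and (3), and the 9/16 bias, are easy to test numerically. The following is a rough Python sketch under my own conventions (vectors in ${\Bbb F}_2^n$ stored as integer bitmasks, with hypothetical helper names such as `fourth_derivative` and `factorised`); it uses the facts that $S_4(x)$ is the parity of $\binom{|x|}{4}$, where $|x|$ is the number of non-zero coordinates, and that $B(a,b) = |a| |b| - |a \wedge b|$ mod 2. It samples random $x,a,b,c,d$ rather than enumerating them all.

```python
import random
from math import comb


def popcount(v):
    return bin(v).count("1")


def s4(v):
    # S_4(v) counts the 4-element subsets of the support of v, mod 2.
    return comb(popcount(v), 4) % 2


def bilinear(a, b):
    # B(a,b) = sum_{i != j} a_i b_j = |a||b| - |a & b|  (mod 2).
    return (popcount(a) * popcount(b) - popcount(a & b)) % 2


def fourth_derivative(x, a, b, c, d):
    # Signs are irrelevant in characteristic 2, so (2) is a plain sum mod 2
    # over the 16 vertices x + (subset sum of a, b, c, d).
    offsets = [0]
    for h in (a, b, c, d):
        offsets += [o ^ h for o in offsets]
    return sum(s4(x ^ o) for o in offsets) % 2


def factorised(a, b, c, d):
    # The right-hand side (3).
    return (bilinear(a, b) * bilinear(c, d)
            + bilinear(a, d) * bilinear(b, c)
            + bilinear(a, c) * bilinear(b, d)) % 2


def sample(n, trials, seed=0):
    """Return (mismatches between (2) and (3), fraction of samples where (2) is 0)."""
    rng = random.Random(seed)
    mismatches = zeros = 0
    for _ in range(trials):
        x, a, b, c, d = (rng.getrandbits(n) for _ in range(5))
        value = fourth_derivative(x, a, b, c, d)
        mismatches += value != factorised(a, b, c, d)
        zeros += value == 0
    return mismatches, zeros / trials


mismatches, zero_fraction = sample(n=16, trials=20000)
print(mismatches, zero_fraction)
```

The 9/16 figure is what one gets if the six values $B(a,b), \ldots, B(c,d)$ behave like independent fair coins: each of the three products in (3) is then 1 with probability 1/4, and the sum of three such independent bits is even with probability $(1 + (1/2)^3)/2 = 9/16$.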

This same example had also been discovered shortly beforehand by Lovett, Meshulam, and Samorodnitsky (private communication), who are now in the process of generalising this example to higher characteristics and higher degrees, and obtaining strong bounds on the discorrelation of these examples with lower degree polynomial phases. (On the other hand, for characteristics higher than the degree, this phenomenon does not occur; our forthcoming paper will show, for instance, that every pseudocubic quartic correlates with a genuine cubic in characteristic 5 and higher.)

It seems intuitively obvious that the quartic $S_4$ does not correlate with any lower degree polynomials, but sometimes there are non-obvious correlations. For instance, the first three symmetric functions $S_1(x_1,\ldots,x_n) := \sum_{1 \leq i \leq n} x_i$, $S_2(x_1,\ldots,x_n) := \sum_{1 \leq i < j \leq n} x_i x_j$, $S_3(x_1,\ldots,x_n) := \sum_{1 \leq i < j < k \leq n} x_i x_j x_k$

do not seem to be obviously related, but over ${\Bbb F}_2$, they turn out to obey the relationship $S_3 = S_1 S_2$

and as a consequence, the cubic function $S_3$ correlates with the linear function $S_1$ and the quadratic function $S_2$. Indeed, if $|x|$ denotes the number of indices i for which $x_i=1$, we see that $S_1, S_2, S_3$ are equal to 1 if and only if $|x| \hbox{ mod } 4$ lies in $\{1,3\}$, $\{2,3\}$, and $\{3\}$ respectively.

In contrast, $S_4$ is equal to 1 if and only if $|x| \hbox{ mod } 8$ lies in $\{4,5,6,7\}$. (The pattern continues using Pascal’s triangle modulo 2, or equivalently the infinite Sierpinski gasket.) So it is clear that $S_4$ cannot be expressed in terms of $S_1, S_2, S_3$; but this does not yet rule out the possibility that $S_4$ instead correlates with some other linear, quadratic, or cubic polynomials.
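These residue patterns, and the relation $S_3 = S_1 S_2$, can be confirmed directly from the definitions for a small number of variables. Below is a brute-force Python sketch; the function names and the choice n=9 are mine, and n just needs to be at least 7 so that every residue mod 8 occurs as some $|x|$.

```python
from itertools import combinations, product


def elementary_symmetric(k, x):
    """S_k(x) over F_2, computed straight from the definition."""
    total = 0
    for idx in combinations(range(len(x)), k):
        term = 1
        for i in idx:
            term *= x[i]
        total += term
    return total % 2


def residues_where_one(k, modulus, n):
    """Residues of |x| mod `modulus` for which S_k(x) = 1, over all x in F_2^n."""
    ones, zeros = set(), set()
    for x in product((0, 1), repeat=n):
        r = sum(x) % modulus
        (ones if elementary_symmetric(k, x) else zeros).add(r)
    # Fails if S_k(x) is not determined by |x| mod `modulus`.
    assert not ones & zeros
    return ones


n = 9
patterns = {
    1: residues_where_one(1, 4, n),
    2: residues_where_one(2, 4, n),
    3: residues_where_one(3, 4, n),
    4: residues_where_one(4, 8, n),
}
product_identity = all(
    elementary_symmetric(3, x)
    == (elementary_symmetric(1, x) * elementary_symmetric(2, x)) % 2
    for x in product((0, 1), repeat=n)
)
print(patterns, product_identity)
```

The underlying reason is that $S_k(x)$ is the parity of the binomial coefficient $\binom{|x|}{k}$, which is where Pascal's triangle modulo 2 enters.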

The forthcoming paper of Lovett, Meshulam, and Samorodnitsky will establish quite a strong discorrelation estimate on $S_4$; indeed, they show that $\langle (-1)^{S_4}, (-1)^{Q} \rangle = O( 2^{-cn} )$

for all cubic polynomials $Q: {\Bbb F}_2^n \to {\Bbb F}_2$, where c is some positive absolute constant and $\langle, \rangle$ is the usual normalised inner product on ${\Bbb F}_2^n$. I will not present that argument here (though it is fairly short); instead, I would like to discuss a very pretty argument of Alon and Beigel which uses Ramsey theory to give the weaker estimate $\langle (-1)^{S_4}, (-1)^{Q} \rangle = o(1)$ (4)

(actually, if one plugs in the best known bounds for Ramsey’s theorem, one gets the more precise bound of $O( 1 / \log n )$ for some c > 0). [We thank Andrej Bogdanov and Emanuele Viola for drawing the Alon-Beigel argument to our attention; we had an earlier argument, using, of all things, a finite field analogue of the multidimensional Szemerédi theorem, which gave inferior bounds.]

The key idea is to use Ramsey theory to symmetrise the polynomial Q. Indeed, if the cubic polynomial Q were completely symmetric with respect to permutations of the variables $x_1,\ldots,x_n$, then it would be a linear combination of $S_1, S_2, S_3$, and then by explicit computation using the previously mentioned relationships between $S_1(x), S_2(x), S_3(x), S_4(x)$ and |x|, we can verify by hand that (4) holds in these cases (in fact from the theory of random walks one soon establishes an exponential decay bound).
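For the symmetric case, the correlation can be computed exactly, since everything depends only on the weight $|x|$ and there are $\binom{n}{w}$ points of weight w. The following Python sketch (with hypothetical function names of my own, and ignoring a possible constant term in Q, which only changes the sign of the correlation) evaluates $\langle (-1)^{S_4}, (-1)^Q \rangle$ for all eight combinations $Q = c_1 S_1 + c_2 S_2 + c_3 S_3$.

```python
from fractions import Fraction
from itertools import product
from math import comb


def symmetric_correlation(n, coeffs):
    """Exact <(-1)^{S_4}, (-1)^Q> on F_2^n for Q = c1*S_1 + c2*S_2 + c3*S_3."""
    total = 0
    for w in range(n + 1):  # w = |x|; there are C(n, w) such points x
        # S_k(x) is the parity of C(|x|, k).
        s = [comb(w, k) % 2 for k in (1, 2, 3, 4)]
        exponent = s[3] + sum(c * v for c, v in zip(coeffs, s[:3]))
        total += comb(n, w) * (-1) ** exponent
    return Fraction(total, 2 ** n)


def worst_symmetric_correlation(n):
    """Largest |correlation| over the eight symmetric cubics without constant term."""
    return max(abs(symmetric_correlation(n, c))
               for c in product((0, 1), repeat=3))


for n in (10, 50, 100, 200):
    print(n, float(worst_symmetric_correlation(n)))
```

Printing the worst case for increasing n gives a quick feel for the decay; this is a numerical illustration only, not a substitute for the random walk argument alluded to above.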

Now suppose that Q is symmetric modulo a linear error, thus $Q(x) = Q_s(x) + \sum_{i \in A} x_i$ (5)

for some symmetric cubic $Q_s$ and some collection of indices $A \subset \{1,\ldots,n\}$. Then we can symmetrise the error $\sum_{i \in A} x_i$ by the simple expedient of the pigeonhole principle: either the set A contains at least $m := \lfloor n/2 \rfloor$ indices, or its complement contains at least m indices. Suppose for sake of argument that A contains at least m indices, and specifically the indices $\{1,\ldots,m\}$. Then the point is that while the linear polynomial $\sum_{i \in A} x_i$ is not symmetric with respect to interchange of all coordinates $x_1,\ldots,x_n$, it is at least symmetric with respect to interchange of the coordinates $x_1,\ldots,x_m$. We then foliate the domain ${\Bbb F}_2^n$ into translates of the smaller domain ${\Bbb F}_2^m$ (by freezing the coordinates $x_{m+1},\ldots,x_n$), use the previous results towards (4) to obtain a correlation of o(1) on each of these translates, and then average everything together using the triangle inequality to obtain (4) in the case (5).
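One point implicit in the foliation step is that $S_4$ stays of the right form on each translate. Writing $x = (x', y)$ with $x' \in {\Bbb F}_2^m$ and y the frozen coordinates, one has $S_4(x', y) = \sum_{j=0}^{4} S_{4-j}(x') S_j(y)$ (with $S_0 = 1$), so for fixed y the restriction is $S_4(x')$ plus a symmetric cubic in $x'$. Here is a small Python sketch of that identity, checked by brute force; the helper names and the sizes m=5 and 3 frozen coordinates are arbitrary choices of mine.

```python
from itertools import combinations, product


def sym(k, x):
    """Elementary symmetric polynomial S_k(x) over F_2, with S_0 = 1."""
    return sum(all(x[i] for i in idx)
               for idx in combinations(range(len(x)), k)) % 2


def restriction_is_symmetric(m, rest):
    """Check S_4(x', y) = sum_j S_{4-j}(x') S_j(y) mod 2 for all x', y."""
    for y in product((0, 1), repeat=rest):
        for xp in product((0, 1), repeat=m):
            lhs = sym(4, xp + y)  # xp + y concatenates the two tuples
            rhs = sum(sym(4 - j, xp) * sym(j, y) for j in range(5)) % 2
            if lhs != rhs:
                return False
    return True


foliation_ok = restriction_is_symmetric(m=5, rest=3)
print(foliation_ok)
```

The same expansion applies to $Q_s$, so on each translate one is again correlating $S_4$ of the free variables against a cubic that is symmetric in those variables.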

Now consider the case when Q is symmetric modulo quadratic errors, thus $Q(x) = Q_s(x) + \sum_{\{i,j\} \in E} x_i x_j + L(x)$

where $Q_s$ is symmetric, E is a set of pairs or “edges” in $\{1,\ldots,n\}$, and L is some linear polynomial. The quadratic $\sum_{\{i,j\} \in E} x_i x_j$ is not symmetric in general, but it can be symmetrised on medium-sized subspaces by an appeal to Ramsey’s theorem. If we view E as the edges of a graph G, this theorem says that for some integer m growing slowly with n (we can take $m \sim \log n$, in fact), the graph G either contains a clique of size m, or an independent set of size m. Let’s say it contains a clique of size m, which without loss of generality we can take to be $\{1,\ldots,m\}$. Then the quadratic $\sum_{\{i,j\} \in E} x_i x_j$ can be expressed as the sum of $\sum_{1 \leq i < j \leq m} x_i x_j$, which is symmetric with respect to permutations of $x_1,\ldots,x_m$, plus a remainder which is at most linear in $x_1,\ldots,x_m$. So if we once again foliate ${\Bbb F}_2^n$ into translates of ${\Bbb F}_2^m$ and use the previous results, we again recover the discorrelation estimate (4) for these polynomials. (Note that the o(1) notation for $n \to \infty$ and the o(1) notation for $m \to \infty$ are interchangeable, because m goes to infinity as n goes to infinity.)
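To illustrate the Ramsey step, here is a Python sketch of the classical greedy proof of Ramsey's theorem for graphs: repeatedly pick a vertex and keep whichever of its neighbourhood or non-neighbourhood (within the surviving candidates) is larger. This yields a clique or independent set of size roughly $\frac{1}{2} \log_2 n$. I am not claiming this is the exact procedure in the Alon-Beigel paper, and the function names are hypothetical.

```python
import random


def homogeneous_set(n, edges):
    """Return ('clique' or 'independent', vertex list) via the greedy Ramsey argument."""
    adj = {v: set() for v in range(n)}
    for i, j in edges:
        adj[i].add(j)
        adj[j].add(i)
    candidates = list(range(n))
    picked = []  # pairs (vertex, True if we kept its neighbours)
    while candidates:
        v = candidates.pop()
        nbrs = [u for u in candidates if u in adj[v]]
        others = [u for u in candidates if u not in adj[v]]
        keep_nbrs = len(nbrs) >= len(others)
        picked.append((v, keep_nbrs))
        candidates = nbrs if keep_nbrs else others
    # A vertex flagged True is adjacent to every later pick; one flagged
    # False is adjacent to none of them.
    clique = [v for v, flag in picked if flag]
    independent = [v for v, flag in picked if not flag]
    if len(clique) >= len(independent):
        return "clique", clique
    return "independent", independent


def is_homogeneous(kind, vertices, edges):
    """Verify that `vertices` really is a clique / independent set."""
    edge_set = {frozenset(e) for e in edges}
    want = kind == "clique"
    return all((frozenset((u, v)) in edge_set) == want
               for i, u in enumerate(vertices) for v in vertices[i + 1:])


rng = random.Random(1)
n = 64
edges = [(i, j) for i in range(n) for j in range(i + 1, n)
         if rng.random() < 0.5]
kind, subset = homogeneous_set(n, edges)
print(kind, subset, is_homogeneous(kind, subset, edges))
```

On the coordinates indexed by the returned set, the quadratic $\sum_{\{i,j\} \in E} x_i x_j$ restricts to either $S_2$ of those coordinates (clique case) or zero (independent case), plus terms that are at most linear in those coordinates.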

Finally, we consider the general case when Q is an arbitrary cubic, thus $Q(x) = \sum_{\{i,j,k\} \in E} x_i x_j x_k + \hbox{ quadratic error}$

for some collection E of unordered triples in $\{1,\ldots,n\}$. One can view E as the edges of a (3-uniform) hypergraph on n vertices. It turns out that Ramsey’s theorem also extends to hypergraphs (it’s the same proof, iterated a few more times), and so once again we can locate medium-sized subspaces on which the cubic $\sum_{\{i,j,k\} \in E} x_i x_j x_k$ becomes symmetric, and so by appeal to the previous results we obtain (4) in full generality.

[A note for the experts: the above iterated use of Ramsey’s theorem was inefficient. It is possible to use just a single application of the 3-uniform hypergraph Ramsey theorem (and no application of the graph Ramsey theorem) to eventually end up with a correlation bound of $O( 1 / \log n )$; details will be provided in our forthcoming paper. Of course, this is still significantly inferior to the exponential bounds that will be obtained by Lovett, Meshulam, and Samorodnitsky.]

Despite the above example, I personally believe that some form of the inverse conjecture for the Gowers norm persists. One particularly promising candidate is to replace the notion of a global polynomial by a local one (as is discussed in one of my papers with Ben). For instance, while $S_4$ is not a globally cubic polynomial, it turns out that it is “locally cubic” on the quadratic surface $\{ x: S_2(x) = 0\}$, in the sense that the fourth derivative (2) of $S_4$ vanishes when all sixteen points $x, x+a, x+b, \ldots, x+a+b+c+d$ of the relevant parallelepiped lie on that quadratic surface. This is basically due to the identity that equates (2) and (3). So it seems there is still much to be investigated regarding this conjecture over finite fields…
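The “locally cubic” claim can also be probed numerically. The sketch below (Python, with my own hypothetical function name and parameters n=8 and 200000 samples) draws random parallelepipeds in ${\Bbb F}_2^n$, keeps those whose sixteen vertices all satisfy $S_2 = 0$, and counts how many of the kept ones have non-vanishing fourth derivative of $S_4$. As before, vectors are bitmasks and $S_k(x)$ is computed as the parity of $\binom{|x|}{k}$.

```python
import random
from math import comb


def locally_cubic_sample(n, trials, seed=0):
    """Return (hits, violations) over random parallelepipeds in F_2^n.

    hits: samples whose 16 vertices all lie on {S_2 = 0}.
    violations: hits on which the fourth derivative of S_4 is non-zero.
    """
    rng = random.Random(seed)
    size = 1 << n
    weight = [bin(v).count("1") for v in range(size)]
    s2 = [comb(w, 2) % 2 for w in weight]
    s4 = [comb(w, 4) % 2 for w in weight]
    hits = violations = 0
    for _ in range(trials):
        points = [rng.getrandbits(n)]  # start from the base point x
        for _step in range(4):  # add the directions a, b, c, d in turn
            h = rng.getrandbits(n)
            points += [p ^ h for p in points]
        if any(s2[p] for p in points):
            continue  # some vertex is off the quadratic surface
        hits += 1
        violations += sum(s4[p] for p in points) % 2
    return hits, violations


hits, violations = locally_cubic_sample(n=8, trials=200000)
print(hits, violations)
```

The reason no violations should appear is that each $B(a,b)$ in (3) is itself the second derivative $S_2(x+a+b) - S_2(x+a) - S_2(x+b) + S_2(x)$, which vanishes when the four points involved lie on the surface.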