Recently, I had tentatively announced a forthcoming result with Ben Green establishing the “Gowers inverse conjecture” (or more accurately, the “inverse conjecture for the Gowers uniformity norm”) for vector spaces {\Bbb F}_p^n over a finite field {\Bbb F}_p, in the special case when p=2 and when the function f: {\Bbb F}_p^n \to {\Bbb C} for which the inverse conjecture is to be applied is assumed to be a polynomial phase of bounded degree (thus f= e^{2\pi i P/|{\Bbb F}|}, where P: {\Bbb F}_p^n \to {\Bbb F}_p is a polynomial of some degree d=O(1)). See my FOCS article for some further discussion of this conjecture, which has applications to both polynomiality testing and to various structural decompositions involving the Gowers norm.

This conjecture can be informally stated as follows. By iterating the obvious fact that the derivative of a polynomial of degree at most d is a polynomial of degree at most d-1, we see that a function P: {\Bbb F}_p^n \to {\Bbb F}_p is a polynomial of degree at most d if and only if

\sum_{\omega_1,\ldots,\omega_{d+1} \in \{0,1\}} (-1)^{\omega_1+\ldots+\omega_{d+1}} P(x +\omega_1 h_1 + \ldots + \omega_{d+1} h_{d+1}) = 0

for all x,h_1,\ldots,h_{d+1} \in {\Bbb F}_p^n. From this one can deduce that a function f: {\Bbb F}_p^n \to {\Bbb C} bounded in magnitude by 1 is a polynomial phase of degree at most d if and only if the Gowers norm

\|f\|_{U^{d+1}({\Bbb F}_p^n)} := \bigl( {\Bbb E}_{x,h_1,\ldots,h_{d+1} \in {\Bbb F}_p^n} \prod_{\omega_1,\ldots,\omega_{d+1} \in \{0,1\}}

{\mathcal C}^{\omega_1+\ldots+\omega_{d+1}} f(x + \omega_1 h_1 + \ldots + \omega_{d+1} h_{d+1}) \bigr)^{1/2^{d+1}}

is equal to its maximal value of 1. The inverse conjecture for the Gowers norm, in its usual formulation, says that, more generally, if a function f: {\Bbb F}_p^n \to {\Bbb C} bounded in magnitude by 1 has large Gowers norm (e.g. \|f\|_{U^{d+1}} \geq \varepsilon) then f has some non-trivial correlation with some polynomial phase g (e.g. \langle f, g \rangle > c(\varepsilon) for some c(\varepsilon) > 0). Informally, this conjecture asserts that if a function has biased (d+1)^{th} derivatives, then one should be able to “integrate” this bias and conclude that the function is biased relative to a polynomial of degree d. The conjecture has already been proven for d \leq 2. There are analogues of this conjecture for cyclic groups which are of relevance to Szemerédi’s theorem and to counting linear patterns in primes, but I will not discuss those here.

At the time of the announcement, our paper had not quite been fully written up. This turned out to be a little unfortunate, because soon afterwards we discovered that our arguments at one point had to go through a version of Newton’s interpolation formula, which involves a factor of d! in the denominator and so is only valid when the characteristic p of the field exceeds the degree. So our arguments in fact are only valid in the range p > d, and in particular are rather trivial in the important case p=2; my previous announcement should thus be amended accordingly.

On investigating this further, we found that the conjecture as stated above is in fact false in the characteristic 2 case: specifically, the symmetric quartic polynomial S_4: {\Bbb F}_2^n \to {\Bbb F}_2 defined by

S_4(x_1,\ldots,x_n) := \sum_{1 \leq i < j < k < l \leq n} x_i x_j x_k x_l (1)

has no significant correlation with any cubic polynomial, but nevertheless exhibits “pseudocubic” behaviour in the sense that its fourth derivative

S_4(x+a+b+c+d)-S_4(x+a+b+c)-\ldots - S_4(x+d) + S_4(x) (2)

(there are 16 terms in this alternating sum) is biased to be 0 (in fact, it is 0 about 9/16 of the time), basically because the above expression can be factorised as

B(a,b) B(c,d) + B(a,d) B(b,c) + B(a,c) B(b,d) (3)

where B is the symmetric bilinear form

B(a,b) := \sum_{1 \leq i < j \leq n} a_i b_j + a_j b_i.

This same example had also been discovered shortly beforehand by Lovett, Meshulam, and Samorodnitsky (private communication), who are now in the process of generalising this example to higher characteristics and higher degrees, and obtaining strong bounds on the discorrelation of these examples with lower degree polynomial phases. (On the other hand, for characteristics higher than the degree, this phenomenon does not occur; our forthcoming paper will show, for instance, that every pseudocubic quartic correlates with a genuine cubic in characteristic 5 and higher.)

It seems intuitively obvious that the quartic S_4 does not correlate with any lower degree polynomials, but sometimes there are non-obvious correlations. For instance, the first three symmetric functions

S_1(x_1,\ldots,x_n) := \sum_{1 \leq i \leq n} x_i

S_2(x_1,\ldots,x_n) := \sum_{1 \leq i < j \leq n} x_i x_j

S_3(x_1,\ldots,x_n) := \sum_{1 \leq i < j < k \leq n} x_i x_j x_k

do not seem to be obviously related, but over {\Bbb F}_2, they turn out to obey the relationship

S_3 = S_1 S_2

and as a consequence, the cubic function S_3 correlates with the linear function S_1 and the quadratic function S_2. Indeed, if |x| denotes the number of indices i for which x_i=1, we see that S_1, S_2, S_3 are equal to 1 if and only if |x| \hbox{ mod } 4 lies in \{1,3\}, \{2,3\}, and \{3\} respectively.

In contrast, S_4 is equal to 1 if and only if |x| \hbox{ mod } 8 lies in \{4,5,6,7\}. (The pattern continues using Pascal’s triangle modulo 2, or equivalently the infinite Sierpinski gasket.) So it is clear that S_4 cannot be expressed in terms of S_1, S_2, S_3; but this does not yet rule out the possibility that S_4 instead correlates with some other linear, quadratic, or cubic polynomials.

The forthcoming paper of Lovett, Meshulam, and Samorodnitsky will establish quite a strong discorrelation estimate on S_4, indeed they show that

\langle (-1)^{S_4}, (-1)^{Q} \rangle = O( 2^{-cn} )

for all cubic polynomials Q: {\Bbb F}_2^n \to {\Bbb F}_2, where c is some positive absolute constant and \langle, \rangle is the usual normalised inner product on {\Bbb F}_2^n. I will not present that argument here (though it is fairly short); instead, I would like to discuss a very pretty argument of Alon and Beigel which uses Ramsey theory to give the weaker estimate

\langle (-1)^{S_4}, (-1)^{Q} \rangle = o(1) (4)

(actually, if one plugs in the best known bounds for Ramsey’s theorem, one gets the more precise bound of O( 1 / \log n ) for some c > 0). [We thank Andrej Bogdanov and Emanuele Viola for drawing the Alon-Beigel argument to our attention; we had an earlier argument, using, of all things, a finite field analogue of the multidimensional Szemerédi theorem, which gave inferior bounds.]
The key idea is to use Ramsey theory to symmetrise the polynomial Q. Indeed, if the cubic polynomial Q was completely symmetric with respect to permutations of the coefficients x_1,\ldots,x_n, then it is a linear combination of S_1, S_2, S_3, and then by explicit computation using the previously mentioned relationships between S_1(x), S_2(x), S_3(x), S_4(x) and |x|, we can verify by hand that (2) holds in these cases (in fact from the theory of random walks one soon establishes an exponential decay bound).

Now suppose that Q is symmetric modulo a linear error, thus

Q(x) = Q_s(x) + \sum_{i \in A} x_i (5)

for some symmetric cubic Q_s and some collection of indices A \subset \{1,\ldots,n\}. Then we can symmetrise the error \sum_{i \in A} x_i by the simple expedient of the pigeonhole principle: either the set A contains at least m := \lfloor n/2 \rfloor indices, or its complement contains at least m indices. Suppose for sake of argument that A contains at least m indices, and specifically the indices \{1,\ldots,m\}. Then the point is that while the linear polynomial \sum_{i \in A} x_i is not symmetric with respect to interchange of all coordinates x_1,\ldots,x_n, it is at least symmetric with respect to interchange of the coordinates x_1,\ldots,x_m. If we then foliate the domain {\Bbb F}_2^n into translates of the smaller domain {\Bbb F}_2^m (by freezing the coordinates x_{m+1},\ldots,x_n), using the previous results towards (4) to obtain a correlation o(1) on each of these translates, and then averaging everything together using the triangle inequality to obtain (4) in the case (5).

Now consider the case when Q is symmetric modulo quadratic errors, thus

Q(x) = Q_s(x) + \sum_{\{i,j\} \in E} x_i x_j + L(x)

where Q_s is symmetric, E is a set of pairs or “edges” in \{1,\ldots,n\}, and L is some linear polynomial. The quadratic \sum_{\{i,j\} \in E} x_i x_j is not symmetric in general, but it can be symmetrised on medium-sized subspaces by an appeal to Ramsey’s theorem. If we view E as the edges of a graph G, this theorem says that for some integer m growing slowly with n (we can take m \sim \log n, in fact), the graph G either contains a clique of size m, or an independent set of size m. Let’s say it contains a clique of size m, which without loss of generality we can take to be \{1,\ldots,m\}. Then the quadratic \sum_{\{i,j\} \in E} x_i x_j can be expressed as the sum of \sum_{1 \leq i < j \leq m} x_i x_j, which is symmetric with respect to permutations of x_1,\ldots,x_m, plus a remainder which is at most linear in x_1,\ldots,x_m. So if we once again foliate {\Bbb F}_2^n into translates of {\Bbb F}_2^m and use the previous results, we again recover the discorrelation estimate (4) for these polynomials. (Note that the o(1) notation for n \to \infty and the o(1) notation for m \to \infty are interchangeable, because m goes to infinity as n goes to infinity.)

Finally, we consider the general case when Q is an arbitrary cubic, thus

Q(x) = \sum_{\{i,j,k\} \in E} x_i x_j x_k + \hbox{ quadratic error}

for some collection E of unordered triples in \{1,\ldots,n\}. One can view E as the edges of a (3-uniform) hypergraph on n vertices. It turns out that Ramsey’s theorem also extends to hypergraphs (it’s the same proof, iterated a few more times), and so once again we can locate medium-sized subspaces on which the cubic \sum_{\{i,j,k\} \in E} x_i x_j x_k becomes symmetric, and so by appeal to the previous results we obtain (4) in full generality.

[A note for the experts: the above iterated use of Ramsey’s theorem was inefficient. It is possible to use just a single application of the 3-uniform hypergraph Ramsey theorem (and no application of the graph Ramsey theorem) to eventually end up with a correlation bound of O( 1 / \log n ); details will be provided in our forthcoming paper. Of course, this is still significantly inferior to the exponential bounds that will be obtained by Lovett, Meshulam, and Samorodnitsky.]

Despite the above example, I personally believe that some form of the inverse conjecture of the Gowers norm persists. One particularly promising candidate is to replace the notion of a global polynomial by a local one (as is discussed in one of my papers with Ben). For instance, while S_4 is not a globally cubic polynomial, it turns out that it is “locally cubic” on the quadratic surface \{ x: S_2(x) = 0\}, in the sense that the fourth derivative (2) of S_4 vanishes when all sixteen points x, x+a, x+b, \ldots, x+a+b+c+d of the relevant parallelopiped lies on that quadratic surface. This is basically due to the identity that equates (2) and (3). So it seems there is still much to be investigated regarding this conjecture over finite fields…