In Notes 3, we saw that the number of additive patterns in a given set was (in principle, at least) controlled by *the Gowers uniformity norms* of functions associated to that set.

Such norms can be defined on any finite additive group (and also on some other types of domains, though we will not discuss this point here). In particular, they can be defined on the finite-dimensional vector spaces over a finite field .

In this case, the Gowers norms are closely tied to the space of polynomials of degree at most . Indeed, as noted in Exercise 20 of Notes 4, a function of norm has norm equal to if and only if for some ; thus polynomials solve the “ inverse problem” for the trivial inequality . They are also a crucial component of the solution to the “ inverse problem” and “ inverse problem”. For the former, we will soon show:

Proposition 1 ( inverse theorem for )Let be such that and for some . Then there exists such that , where is a constant depending only on .

Thus, for the Gowers norm to be almost completely saturated, one must be very close to a polynomial. The converse assertion is easily established:

Exercise 1 (Converse to inverse theorem for )If and for some , then , where is a constant depending only on .

In the world, one no longer expects to be close to a polynomial. Instead, one expects to *correlate* with a polynomial. Indeed, one has

Lemma 2 (Converse to the inverse theorem for )If and are such that , where , then .

*Proof:* From the definition of the norm (equation (18) from Notes 3), the monotonicity of the Gowers norms (Exercise 19 of Notes 3), and the polynomial phase modulation invariance of the Gowers norms (Exercise 21 of Notes 3), one has

and the claim follows.

In the high characteristic case at least, this can be reversed:

Theorem 3 ( inverse theorem for )Suppose that . If is such that and , then there exists such that .

This result is sometimes referred to as the *inverse conjecture for the Gowers norm* (in high, but bounded, characteristic). For small , the claim is easy:

Exercise 2Verify the cases of this theorem. (Hint:to verify the case, use the Fourier-analytic identities and , where is the space of all homomorphisms from to , and are the Fourier coefficients of .)

This conjecture for larger values of are more difficult to establish. The case of the theorem was established by Ben Green and myself in the high characteristic case ; the low characteristic case was independently and simultaneously established by Samorodnitsky. The cases in the high characteristic case was established in two stages, firstly using a modification of the Furstenberg correspondence principle, due to Ziegler and myself. to convert the problem to an ergodic theory counterpart, and then using a modification of the methods of Host-Kra and Ziegler to solve that counterpart, as done in this paper of Bergelson, Ziegler, and myself.

The situation with the low characteristic case in general is still unclear. In the high characteristic case, we saw from Notes 4 that one could replace the space of non-classical polynomials in the above conjecture with the essentially equivalent space of classical polynomials . However, as we shall see below, this turns out not to be the case in certain low characteristic cases (a fact first observed by Lovett, Meshulam, and Samorodnitsky, and independently by Ben Green and myself), for instance if and ; this is ultimately due to the existence in those cases of non-classical polynomials which exhibit no significant correlation with classical polynomials of equal or lesser degree. This distinction between classical and non-classical polynomials appears to be a rather non-trivial obstruction to understanding the low characteristic setting; it may be necessary to obtain a more complete theory of non-classical polynomials in order to fully settle this issue.

The inverse conjecture has a number of consequences. For instance, it can be used to establish the analogue of Szemerédi’s theorem in this setting:

Theorem 4 (Szemerédi’s theorem for finite fields)Let be a finite field, let , and let be such that . If is sufficiently large depending on , then contains an (affine) line for some with .

Exercise 3Use Theorem 4 to establish the following generalisation: with the notation as above, if and is sufficiently large depending on , then contains an affine -dimensional subspace.

We will prove this theorem in two different ways, one using a density increment method, and the other using an energy increment method. We discuss some other applications below the fold.

** — 1. The inverse theorem — **

We now prove Proposition 1. Results of this type for general appear in this paper of Alon, Kaufman, Krivelevich, Litsyn, and Ron (see also this paper of Sudan, Trevisan, and Vadhan for a precursor result), the case was treated previously by by Blum, Luby, and Rubinfeld. The argument here is due to Tamar Ziegler and myself. The argument has a certain “cohomological” flavour (comparing cocycles with coboundaries, determining when a closed form is exact, etc.). Indeed, the inverse theory can be viewed as a sort of “additive combinatorics cohomology”.

Let be as in the theorem. We let all implied constants depend on . We use the symbol to denote various positive constants depending only on . We may assume is sufficiently small depending on , as the claim is trivial otherwise.

The case is easy, so we assume inductively that and that the claim has been already proven for .

The first thing to do is to make unit magnitude. One easily verifies the crude bound

and thus

Since pointwise, we conclude that

As such, differs from a function of unit magnitude by in norm. By replacing with and using the triangle inequality for the Gowers norm (changing and worsening the constant in Proposition 1 if necessary), we may assume without loss of generality that throughout, thus for some .

Since

we see from Markov’s inequality that

for all in a subset of of density . Applying the inductive hypothesis, we see that for each such , we can find a polynomial such that

Now let . Using the cocycle identity

where is the shift operator , we see using Hölder’s inequality that

On the other hand, is a polynomial of order . Also, since is so dense, every element of has at least one representation of the form for some (indeed, out of all possible representations , or can fall outside of for at most of these representations). We conclude that for every there exists a polynomial such that

The new polynomial supercedes the old one ; to reflect this, we abuse notation and write for . Applying the cocycle equation again, we see that

for all . Applying the rigidity of polynomials (Exercise 14 from Notes 4), we conclude that

for some constant . From (2) we in fact have for all .

The expression is known as a *-coboundary* (see this blog post for more discussion). To eliminate it, we use the finite characteristic to discretise the problem as follows. First, we use the cocycle identity

where is the characteristic of the field. Using (1), we conclude that

On the other hand, takes values in some coset of a finite subgroup of (depending only on ), by Lemma 1 of Notes 4. We conclude that this coset must be a shift of by . Since itself takes values in some coset of a finite subgroup, we conclude that there is a finite subgroup (depending only on ) such that each takes values in a shift of by .

Next, we note that we have the freedom to shift each by (adjusting accordingly) without significantly affecting any of the properties already established. Doing so, we can thus ensure that all the take values in itself, which forces to do so also. But since , we conclude that for all , thus is a perfect cocycle:

We may thus integrate and write , where . Thus is a polynomial of degree for each , thus itself is a polynomial of degree . From (1) one has

for all ; averaging in we conclude that

and thus

and Proposition 1 follows.

One consequence of Proposition 1 is that the property of being a classical polynomial of a fixed degree is *locally testable*, which is a notion of interest in theoretical computer science. More precisely, suppose one is given a large finite vector space and two functions . One is told that one of the functions is a classical polynomial of degree at most , while the other is quite far from being such a classical polynomial, in the sense that every polynomial of degree at most will differ with that polynomial on at least of the values in . The task is then to decide with a high degree of confidence which of the functions is a polynomial and which one is not, without inspecting too many of the values of or .

This can be done as follows. Pick at random, and test whether the identities

and

hold; note that one only has to inspect at values in for this. If one of these identities fails, then that function must not be polynomial, and so one has successfully decided which of the functions is polynomials. We claim that the probability that the identity fails for the non-polynomial function is at least for some , and so if one iterates this test times, one will be able to successfully solve the problem with probability arbitrarily close to . To verify the claim, suppose for contradiction that the identity only failed at most of the time for the non-polynomial (say it is ); then , and thus by Proposition 1, is very close in norm to a polynomial; rounding that polynomial to a root of unity we thus see that agrees with high accuracy to a classical polynomial, which leads to a contradiction if is chosen suitably.

** — 2. A partial counterexample in low characteristic — **

We now show a distinction between classical polynomials and non-classical polynomials that causes the inverse conjecture to fail in low characteristic if one insists on using classical polynomials. For simplicity we restrict attention to the characteristic two case . We will use an argument of Alon and Beigel, reproduced in this paper of Green and myself. A different argument (with stronger bounds) appears in this paper of Lovett, Meshulam, and Samorodnitsky.

We work in a standard vector space , with standard basis and coordinates . Among all the classical polynomials on this space are the *symmetric polynomials*

which play a special role.

Exercise 4Let be the digit summation function . Show thatEstablish Lucas’ theorem

where , is the binary expansion of . Show that is the binary coefficient of , and conclude that is a function of . (

Note:These results are closely related to the well-known fact that Pascal’s triangle modulo takes the form of an infinite Sierpinski gasket.)

We define an *an affine coordinate subspace* to be a translate of a subspace of generated by some subset of the standard basis vectors . To put it another way, an affine coordinate subspace is created by freezing some of the coordinates, but letting some other coordinates be arbitrary.

Of course, not all classical polynomials come from symmetric polynomials. However, thanks to an application of Ramsey’s theorem observed by Alon and Biegel, this is true on coordinate subspaces:

Lemma 5 (Ramsey’s theorem for polynomials)Let be a polynomial of degree at most . Then one can partition into affine coordinate subspaces of dimension at least , where as for fixed , such that on each such subspace , is equal to a linear combination of the symmetric polynomials .

*Proof:* We induct on . The claim is trivial for , so suppose that and the claim has already been proven for smaller . The degree term of can be written as

where is a -uniform hypergraph on , i.e. a collection of -element subsets of . Applying Ramsey’s theorem (for hypergraphs), one can find a subcollection of indices with such that either has no edges in , or else contains all the edges in . We then foliate into the affine subspaces formed by translating the coordinate subspace generated by . By construction, we see that on each such subspace, is equal to either or plus a polynomial of degree . The claim then follows by applying the induction hypothesis (and noting that the linear span of on an affine coordinate subspace is equivariant with respect to translation of that subspace).

Because of this, if one wants to concoct a function which is almost orthogonal to all polynomials of degree at most , it will suffice to build a function which is almost orthogonal to the symmetric polynomials on all affine coordinate subspaces of moderately large size. Pursuing this idea, we are led to

Proposition 6 (Counterexample to classical inverse conjecture)Let , and let be the function , where is as in Exercise 4. Then is a non-classical polynomial of degree at most , and so ; but one hasuniformly for all classical polynomials of degree less than , where is bounded in magnitude by a quantity that goes to zero as for each fixed .

*Proof:* We first prove the polynomiality of . Let be the obvious map from to , thus

By linearity, it will suffice to show that each function is a polynomial of degree at most . But one easily verifies that for any , is equal to zero when and equal to when . Iterating this observation times, we obtain the claim.

Now let be a classical polynomial of degree less than . By Lemma 5, we can partition into affine coordinate subspaces of dimension at least such that is a linear combination of on each such subspace. By the pigeonhole principle, we thus can find such a such that

On the other hand, from Exercise 4, the function on depends only on . Now, as , the function (which is essentially the distribution function of a simple random walk of length on ) becomes equidistributed; in particular, for any , the function will take the values and with asymptotically equal frequency on , whilst remains unchanged. As such we see that as , and thus as , and the claim follows.

Exercise 5With the same setup as the previous proposition, show that , but that for all classical polynomials of degree less than .

** — 3. The inverse theorem: sketches of a proof — **

The proof of Theorem 3 is rather difficult once ; even the case is not particularly easy. However, the arguments still have the same cohomological flavour encountered in the theory. We will not give full proofs of this theorem here, but indicate some of the main ideas.

We begin by discussing (quite non-rigorously) the significantly simpler (but still non-trivial) case, established by Ben Green and myself. Unsurprisingly, we will take advantage of the case of the theorem as an induction hypothesis.

Let for some field of characteristic greater than , and be a function with and . We would like to show that correlates with a quadratic phase function (due to the characteristic hypothesis, we may take to be classical), in the sense that .

We expand as . By the pigeonhole principle, we conclude that

for “many” , where by “many” we mean “a proportion of “. Applying the inverse theorem, we conclude that for many , that there exists a linear polynomial (which we may as well take to be classical) such that

This should be compared with the theory. There, we were able to force close to for most ; here, we only have the weaker statement that *correlates* with for *many* (not *most*) . Still, we will keep going. In the theory, we were able to assume had magnitude , which made the cocycle equation available; this then forced an approximate cocycle equation for most (indeed, we were able to use this trick to upgrade “most” to “all”).

This doesn’t quite work in the case. Firstly, need not have magnitude exactly equal to . This is not a terribly serious problem, but the more important difficulty is that correlation, unlike the property of being close, is not transitive or multiplicative: just because correlates with , and correlates with , one cannot then conclude that correlates with ; and even if one had this, and if correlated with , one could not conclude that correlated with .

Despite all these obstacles, it is still possible to extract something resembling a cocycle equation for the , by means of the Cauchy-Schwarz inequality. Indeed, we have the following remarkable observation of Gowers:

Lemma 7 (Gowers’ Cauchy-Schwarz argument)Let be a finite additive group, and let be a function, bounded by . Let be a subset with , and suppose that for each , suppose that we have a function bounded by , such thatuniformly in . Then there exist quadruples with such that

uniformly among the quadruples.

We shall refer to quadruples obeying the relation as *additive quadruples*.

*Proof:* We extend to be zero when lies outside of . Then we have

We expand the left-hand side as

setting , this becomes

From the pigeonhole principle, we conclude that for many values of , one has

Performing Cauchy-Schwarz once in and once in to eliminate the factors, and then re-averaging in , we conclude that

Setting to be the additive quadruple we obtain

Performing the averages we obtain

and the claim follows (note that for the quadruples obeying the stated lower bound, must lie in ).

Applying this lemma to our current situation, we find many additive quadruples for which

In particular, by the equidistribution theory in Notes 4, the polynomial is low rank.

The above discussion is valid in any value of , but is particularly simple when , as the are now linear, and so is now *constant*. Writing for some using the standard dot product on , and some (irrelevant) constant term , we conclude that

for many additive quadruples .

We now have to solve an additive combinatorics problem, namely to classify the functions from to which are “ affine linear” in the sense that the property (3) holds for many additive quadruples; equivalently, the graph in has high “additive energy”, defined as the number of additive quadruples that it contains. An obvious example of a function with this property is an affine-linear function , where is a linear transformation and . As it turns out, this is essentially the only example:

Proposition 8 (Balog-Szemerédi-Gowers-Freiman theorem for vector spaces)Let , and let be a map from to such that (3) holds for additive quadruples in . Then there exists an affine function such that for values of in .

This proposition is a consequence of standard results in additive combinatorics, in particular the Balog-Szemerédi-Gowers lemma and Freiman’s theorem for vector spaces; see Section 11.3 of my book with Van for further discussion. The proof is elementary but a little lengthy and would take us too far afield, so we simply assume this proposition for now and keep going. We conclude that

The most difficult term to deal with here is the quadratic term . To deal with this term, suppose temporarily that is symmetric, thus . Then (since we are in odd characteristic) we can *integrate* as

and thus

for many . Taking norms in , we conclude that the inner product between two copies of and two copies of . Applying the Cauchy-Schwarz-Gowers inequality, followed by the inverse theorem, we conclude that correlates with for some linear phase, and thus itself correlates with for some quadratic phase.

This argument also works (with minor modification) when is *virtually symmetric*, in the sense that there exist a bounded index subspace of such that the restriction of the form to is symmetric, by foliating into cosets of that subspace; we omit the details. On the other hand, if is not virtually symmetric, there is no obvious way to “integrate” the phase to eliminate it as above. (Indeed, in order for to be “exact” in the sense that it is the “derivative” of something (modulo lower order terms), e.g. for some , it must first be “closed” in the sense that in some sense, since we have ; thus we again see the emergence of cohomological concepts in the background.)

To establish the required symmetry on , we return to Gowers’ Cauchy-Schwarz argument from Lemma 7, and tweak it slightly. We start with (4) and rewrite it as

where . We square-average this in to obtain

Now we make the somewhat unusual substitution to obtain

Thus there exists such that

We collect all terms that depend only on (and ) or only on (and ) to obtain

for some bounded functions . Eliminating these functions by two applications of Cauchy-Schwarz, we obtain

or, on making the change of variables ,

Using equidistribution theory, this means that the quadratic form is low rank, which easily implies that is virtually symmetric.

Now we turn to the general case. In principle, the above argument should still work, say for . The main sticking point is that instead of dealing with a vector-valued function that is approximately linear in the sense that (3) holds for many additive quadruples, in the case one is now faced with a \xi_{h_1}

for many additive quadruples , where the matrix has bounded rank. With our current level of additive combinatorics technology, we are not able to deal properly with this bounded rank error (the main difficulty being that the set of low rank matrices has no good “doubling” properties). Because of this obstruction, no generalisation of the above arguments to higher has been found.

There is however another approach, based ultimately on the ergodic theory work of Host-Kra and of Ziegler, that can handle the general case, which was worked out in two papers, one by myself and Ziegler, and one by Bergelson, Ziegler, and myself. It turns out that it is convenient to phrase these arguments in the language of ergodic theory. However, in order not to have to introduce too much additional material, I will try to describe the arguments here in the case without explicitly using ergodic theory notation. To do this, though, I will have to sacrifice a lot of rigour and only work with some illustrative special cases rather than the general case, and also use somewhat vague terminology (e.g. “general position” or “low rank”).

To simplify things further, we will establish the inverse theorem only for a special type of function, namely a quartic phase , where is a classical polynomial of degree . (A good example to keep in mind is the symmetric polynomial phase , though one has to take some care with this example due to the low characteristic.) The claim to show then is that if , then correlates with a cubic phase. In the high characteristic case , this result can be handled by equidistribution theory. Indeed, since

that theory tells us that the quartic polynomial is low rank. On the other hand, in high characteristic one has the Taylor expansion

for some cubic function (as can be seen for instance by decomposing into monomials). From this we easily conclude that itself has low rank (i.e. it is a function of boundedly many cubic (or lower degree) polynomials), at which point it is easy to see from Fourier analysis that will correlate with the exponential of a polynomial of degree at most .

Now we present a different argument that relies slightly less on the quartic nature of ; it is a substantially more difficult argument, and we will skip some steps here to simplify the exposition, but the argument happens to extend to more general situations. As , we have for many , thus by the inverse theorem, correlates with a quadratic phase. Using equidistribution theory, we conclude that the cubic polynomial is low rank.

At present, the low rank property for is only true for many . But from the cocycle identity

we see that if and are both low rank, then so is ; thus the property of being low rank is in some sense preserved by addition. Using this and a bit of additive combinatorics, one can conclude that is low rank for all in a bounded index subspace of ; restricting to that subspace, we will now assume that is low rank for *all* . Thus we have

where is some bounded collection of quadratic polynomials for each , and is some function. To simplify the discussion, let us pretend that in fact consists of just a single quadratic , plus some linear polynomials , thus

There are two extreme cases to consider, depending on how depends on . Consider first a “core” case when is independent of . Thus

If is low rank, then we can absorb it into the factors, so suppose instead thaat is high rank, and thus equidistributed even after fixing the values of .

The function is cubic, and is a high rank quadratic. Because of this, the function must be at most linear in the variable; this can be established by another application of equidistribution theory, see Section 8 of this paper of Ben and myself; thus one can factorise

for some functions . In fact, as is cubic, must be linear, while is cubic.

By comparing the coefficients in the cocycle equation (5), we see that the function is itself a cocycle:

As a consequence, we have for some function . Since is linear, is quadratic; thus we have

With a high characteristic assumption , one can ensure is classical. We will assume that is high rank, as this is the most difficult case.

Suppose first that . In high characteristic, one can then integrate by expressing this as plus lower order terms, thus is an order function in the sense that it is a function of a bounded number of linear functions. In particular, has a large norm for all , which implies that has a large norm, and thus correlates with a quadratic phase. Since can be decomposed by Fourier analysis into a linear combination of quadratic phases, we conclude that correlates with a quadratic phase and one is thus done in this case.

Now consider the other extreme, in which and lie in general position. Then, if we differentiate (8) in , we obtain one has

and then anti-symmetrising in one has

If and are unrelated, then the linear forms will typically be in general position with respect to each other and with , and similarly will be in general position with respect to each other and with . From this, one can show that the above equation is not satisfiable generically, because the mixed terms cannot be cancelled by the simpler terms in the above expression.

An interpolation of the above two arguments can handle the case in which does not depend on . Now we consider the other extreme, in which varies in , so that and are in general position for generic , and similarly for and , or for and . (Note though that we cannot simultaneously assume that are in general position; indeed, might vary linearly in , and indeed we expect this to be the basic behaviour of here, as was observed in the preceding argument.)

To analyse this situation, we return to the cocycle equation (5), which currently reads

Because any two of can be assumed to be in general position, one can show using equidistribution theory that the above equation can only be satisfied when the are linear in the variable, thus

much as before. Furthermore, the coefficients must now be (essentially) constant in in order to obtain (9). Absorbing this constant into the definition of , we now have

We will once again pretend that is just a single linear form . Again we consider two extremes. If is independent of , then by passing to a bounded index subspace (the level set of ) we now see that is quadratic, hence is cubic, and we are done. Now suppose instead that varies in , so that are in general position for generic . We look at the cocycle equation again, which now tells us that obeys the *quasicocycle* condition

where is a quadratic polynomial. With any two of in general position, one can then conclude (using equidistribution theory) that are quadratic polynomials. Thus is quadratic, and is cubic as before. This completes the heuristic discussion of various extreme model cases; the general case is handled by a rather complicated combination of all of these special case methods, and is best performed in the framework of ergodic theory (in particular, the idea of extracting out the coefficient of a key polynomial, such as the coerfficient of , is best captured by the ergodic theory concept of *vertical differentiation*); see this paper of Bergelson, Ziegler, and myself.

** — 4. Consequences of the Gowers inverse conjecture — **

We now discuss briefly some of the consequences of the Gowers inverse conjecture, beginning with Szemerédi’s theorem in vector fields (Theorem 4). We will use the density increment method (an energy increment argument is also possible, but is more complicated; see this paper). Let be a set of density at least containing no lines. This implies that the -linear form

has size . On the other hand, as this pattern has complexity , one has from Notes 3 the bound

whenever are bounded in magnitude by . Splitting , we conclude that

and thus (for large enough)

Applying Theorem 3, we find that there exists a polynomial of degree at most such that

To proceed we need the following analogue of Proposition 5 of Notes 2:

Exercise 6 (Fragmenting a polynomial into subspaces)Let be a classical polynomial of degree . Show that one can partition into affine subspaces of dimension at least , where as for fixed , such that is constant on each . (Hint:Induct on , and use Exercise 6 of Notes 4 repeatedly to find a good initial partition into subspaces on which has degree at most .)

Exercise 7Use the previous exercise to complete the proof of Theorem 4. (Hint:mimic the density increment argument from Notes 2.)

By using the inverse theorem in place of the Fourier-analytic analogue in Lemma 7 of Notes 2, one obtains the following regularity lemma, analogous to Theorem 10 of Notes 2:

Theorem 9 (Strong arithmetic regularity lemma)Suppose that . Let , let , and let be an arbitrary function. Then we can decompose and find such that

- (Nonnegativity) take values in , and have mean zero;
- (Structure) is a function of classical polynomials of degree at most ;
- (Smallness) has an norm of at most ; and
- (Pseudorandomness) One has for all .

For a proof, see this paper of mine. The argument is similar to that appearing in Theorem 10 of Notes 2, but the discrete nature of polynomials in bounded characteristic allows one to avoid a number of technical issues regarding measurability.

This theorem can then be used for a variety of applications in additive combinatorics. For instance, it gives the following variant of a result of Bergelson, Host, and Kra:

Proposition 10 (Bergelson-Host-Kra type result)Let , let , and let with , and let . Then for values of , one has

Roughly speaking, the idea is to apply the regularity lemma to , discard the contribution of the and errors, and then control the structured component using the equidistribution theory from Notes 4. A proof of this result can be found in this paper of Ben Green; see also this paper of Ben and myself for an analogous result in . Curiously, the claim fails when is replaced by any larger number; this is essentially an observation of Ruzsa that appears in the appendix of the paper of Bergelson, Host, and Kra.

The above regularity lemma (or more precisely, a close relative of this lemma) was also used by Gowers and Wolf to determine the true complexity of a linear system:

Theorem 11 (Gowers-Wolf theorem)Let be a collection of linear forms with integer coefficients, with no two forms being linearly dependent. Let have sufficiently large characteristic, and suppose that are functions bounded in magnitude by such thatwhere was the form defined in Notes 3. Then for each there exists a classical polynomial of degree at most such that

where is the true complexity of the system defined in Notes 3. This is best possible.

## 2 comments

Comments feed for this article

24 May, 2010 at 9:19 pm

MadsProf. Tao,

I was wondering if it might be “better” to give interactive references to, i.e., theorems on MathWorld instead of on Wikipedia? (For example in Exercise 4 in this post.)

By the way; thank you for al the interessting entries in this blog! It is ‘gold’ for an amateur matmematician like myself.

29 May, 2010 at 2:15 pm

254B, Notes 6: The inverse conjecture for the Gowers norm II. The integer case « What’s new[...] inverse conjecture for the Gowers norm, nilsequences, polynomial maps | by Terence Tao In Notes 5, we saw that the Gowers uniformity norms on vector spaces in high characteristic were controlled [...]