In the previous lectures, we have focused mostly on the equidistribution or linear patterns on a subset of the integers , and in particular on intervals . The integers are of course a very important domain to study in additive combinatorics; but there are also other fundamental model examples of domains to study. One of these is that of a vector space over a finite field of prime order. Such domains are of interest in computer science (particularly when ) and also in number theory; but they also serve as an important simplified “dyadic model” for the integers. See this survey article of Green for further discussion of this point.

The additive combinatorics of the integers , and of vector spaces over finite fields, are analogous, but not quite identical. For instance, the analogue of an arithmetic progression in is a subspace of . In many cases, the finite field theory is a little bit simpler than the integer theory; for instance, subspaces are closed under addition, whereas arithmetic progressions are only “almost” closed under addition in various senses. (For instance, is closed under addition approximately half of the time.) However, there are some ways in which the integers are better behaved. For instance, because the integers can be generated by a single generator, a homomorphism from to some other group can be described by a single group element : . However, to specify a homomorphism from a vector space to one would need to specify one group element for each dimension of . Thus we see that there is a tradeoff when passing from (or ) to a vector space model; one gains a bounded torsion property, at the expense of conceding the bounded generation property. (Of course, if one wants to deal with arbitrarily large domains, one has to concede one or the other; the only additive groups that have both bounded torsion and boundedly many generators, are bounded.)

The starting point for this course (Notes 1) was the study of equidistribution of polynomials from the integers to the unit circle. We now turn to the parallel theory of equidistribution of polynomials from vector spaces over finite fields to the unit circle. Actually, for simplicity we will mostly focus on the *classical* case, when the polynomials in fact take values in the roots of unity (where is the characteristic of the field ). As it turns out, the non-classical case is also of importance (particularly in low characteristic), but the theory is more difficult; see these notes for some further discussion.

** — 1. Polynomials: basic theory — **

Throughout this section, will be a finite-dimensional vector space over a finite field of prime order .

Recall from the previous notes that a function is a function is a *polynomial of degree at most * if

for all , where . As mentioned in previous notes, this is equivalent to the assertion that the Gowers uniformity norm . The space of polynomials of degree at most will be denoted ; it is clearly an additive group. Note that a polynomial of degree zero is the same thing as a constant function, thus .

An important special case of polynomials are the *classical polynomials*, which take values in (which we identify with the roots of unity in in the obvious manner); the space of such polynomials of degree at most will be denoted ; this is clearly a vector space over . The classical polynomials have a familiar description, once we use a basis to identify with :

Exercise 1Let be a function, and be an integer. Show that is a (classical) polynomial of degree at most if and only if one has a representation of the formfor some coefficients . Furthermore, show that we can restrict the exponents to lie in the range , and that once one does so, the representation is unique. (

Hint:First establish the case, which can be done for instance by a dimension counting argument, and then induct on dimension.)

Exercise 2Show that the cardinality of is at most , with equality if and only if .

Now we study more general polynomials. A basic fact here is that multiplying a polynomial by the characteristic lowers the degree:

*Proof:* Without loss of generality we may take ; an easy induction on then shows it suffices to verify the base case . Our task is now to show that is constant, or equivalently that for all .

Fix . The operator represents a shift by . Since , we conclude that . On the other hand, as has degree at most , , and so

Using the binomial formula, we can factorise the left-hand side as

The first factor can be inverted by Neumann series since acts nilpotently on polynomials. We conclude that as required.

Exercise 3Establish the identityfor an indeterminate and any integer , by testing on roots of unity. (This identity was communicated to me by Andrew Granville.) Use this to give an alternate proof of Lemma 1.

This classifies all polynomials in the high characteristic case :

Corollary 2If , then . In other words, every polynomial of degree at most is the sum of a classical polynomial and a constant.

The situation is more complicated in the low characteristic case , in which non-classical polynomials can occur (polynomials that are not simply a classical polynomial up to constants). For instance, consider the function defined by and . One easily verifies that this is a (non-classical) quadratic (i.e. a polynomial of degree at most ), but is clearly not a shifted version of a classical polynomial since its range is not a shift of the second roots of unity.

Exercise 4Let be a function. Show that is a polynomial of degree at most if and only if the range of is a translate of the roots of unity (i.e. is constant).

For further discussion of non-classical polynomials, see this previous blog post. Henceforth we shall avoid this technical issue by restricting to the high characteristic case (or equivalently, the low degree case ).

** — 2. Equidistribution — **

Let us now consider the equidistribution theory of a classical polynomial , where we think of as being a fixed field (in particular, ), and the dimension of as being very large; will play the role here that the interval played in Notes 1. This theory is classical for linear and quadratic polynomials. The general theory was studied first by Ben Green and myself in the high characteristic case , and extended to the low characteristic case by Kaufman and Lovett. An analogous theory surely exists for the non-classical case, although this is not currently in the literature.

The situation here is simpler because a classical polynomial can only take values, so that in the equidistributed case one expects each value to be obtained about times. Inspired by this, let us call a classical polynomial *-equidistributed* if one has

for all .

Exercise 5Show that this is equivalent to the notion of -equidistribution given in Notes 1, if one gives the metric induced from , and if one is willing to modify by a multiplicative factor depending on in the equivalences.

Before we study equidistribution in earnest, we first give a classical estimate.

Exercise 6 (Chevalley-Warning theorem)Let be a finite dimensional space, and let be a classical polynomial of degree less than . Show that . (Hint:Identify with for some and apply Exercise 1. Use the fact that for all , which can be deduced by using a change of variables .) If furthermore has degree less than , conclude that for every , that is a multiple of . (Hint:Apply Fermat’s little theorem to the quantity .) In particular, if , then there exists at least one further such that .If has degree at most and , obtain the recurrence inequality

Hint:normalise , then average the previous claim over all subspaces of of a certain dimension.

The above exercise goes some way towards establishing equidistribution, by showing that every element in the image of is attained a fairly large number of times. But additional techniques will be needed (together with additional hypotheses on ) in order to obtain full equidistribution. It will be convenient to work in the ultralimit setting. Define a *limit classical polynomial* on a limit finite-dimensional vector space of degree at most to be an ultralimit of classical polynomials of degree at most (we keep and fixed independently of ). We say that a limit classical polynomial is *equidistributed* if one has

for all , where the cardinalities here are of course limit cardinalities.

Exercise 7Let be a limit finite-dimensional vector space. Show that a limit function is a limit classical polynomial of degree at most if and only if it is a classical polynomial of degree at most (observing here that every limit vector space is automatically a vector space).

Exercise 8Let be a limit classical polynomial. Show that is equidistributed if and only if, for every , is -equidistributed for sufficiently close to .

Exercise 9Let be a limit classical polynomial which is linear (i.e. of degree at most ). Show that is equidistributed if and only if is non-constant.

There is an analogue of the Weyl criterion in this setting. Call a limit function *biased* if , and *unbiased* if , where we identify with an element of .

Exercise 10 (Weyl criterion)Let be a limit function. Show that is equidistributed if and only if is unbiased for all non-zero .

Thus to understand the equidistribution of polynomials, it suffices to understand the size of exponential sums . For linear polynomials, this is an easy application of Fourier analysis:

Exercise 11Let be a polynomial of degree at most . Show that equals if is constant, and equals if is not constant. (Note that this is completely consistent with the previous two exercises.)

Next, we turn attention to the quadratic case. Here, we can use the Weyl differencing trick, which we phrase as an identity

for any finite vector space and function , where is the multiplicative derivative. Taking ultralimits, we see that the identity also holds for limit functions on limit finite dimensional vector spaces. In particular, we have

for any limit function on a limit finite dimensional space.

If is quadratic, then is linear. Applying (11), we conclude that if is biased, then must be constant for values of .

On the other hand, by using the cocycle identity

we see that the set of for which is constant is a limit subspace of . On that subspace, is then linear; passing to a codimension one subspace of , is then constant on . As is linear for every , is then linear on each coset of . As , there are only a bounded number of such cosets; thus is piecewise linear, and thus piecewise constant on slightly smaller cosets. Intersecting all the subspaces together, we can thus find another limit subspace with such that is constant on each coset of . To put it another way, if we view as the intersection of a bounded number of kernels of linear homomorphism (where is the codimension of ), then is constant on every simultaneous level set of , and can thus be expressed as a function of these linear polynomials.

More generally, let us say that a limit classical polynomial of degree is *low rank* if it can be expressed as where are a bounded number of polynomials of degree . We can summarise the above discussion (and also Exercise 11) as follows:

Proposition 3Let , and let be a limit classical polynomial. If is biased, then is low rank.In particular, from the Weyl criterion, we see that if is not equidistributed, then is of low rank.

Of course, the claim fails if the low rank hypothesis is dropped. For instance, consider a limit classical quadratic that is the product of two linearly independent linear polynomials . Then attains each non-zero value with a density of rather than (and attains with a density of rather than ).

Exercise 12Suppose that the characteristic of is greater than , and suppose that is a quadratic polynomial of the form , where , , is a symmetric matrix with coefficients in , and is the transpose of . Show that , where is the rank of . Furthermore, if is orthogonal to the kernel of , show that equality is attained, and otherwise vanishes.What happens in the even characteristic case (assuming now that is not symmetric)?

Exercise 13 (Van der Corput lemma)Let be a limit function on a limit finite dimensional vector space , and suppose that there exists a limit subset of which is sparse in the sense that , and such that is equidistributed for all . Show that itself is equidistributed. Use this to give an alternate proof of 3.

Exercise 14 (Space of polynomials is discrete)Let be a polynomial of degree at most such that for some constant . Show that is constant. (Hint:induct on .) Conclude that if are two distinct polynomials of degree at most , that .

The fact that high rank polynomials are equidistributed extends to higher degrees also:

Theorem 4Let be a limit classical polynomial. If is biased, then is low rank.In particular, from the Weyl criterion, we see that if is not equidistributed, then is of low rank.

In the high characteristic case , this claim was obtained by Ben Green and myself; the generalisation to the low characteristic case was carried out by Kaufman and Lovett. The statement is phrased in the language of ultrafilters, but it has an equivalent finitary analogue:

Exercise 15Show that Theorem 4 is equivalent to the claim that for every and , and every classical polynomial of degree at most on a finite-dimensional vector space with , that can be expressed as a function of at most classical polynomials of degree at most .

The proof of Theorem 4 is a little lengthy. It splits up into two pieces. We say that a limit function (not necessarily a polynomial) is *of order * if it can be expressed as a function of a bounded number of polynomials of degree less than . Our task is thus to show that every polynomial of degree which is biased, is of order . We first get within an epsilon of this goal, using an argument of Bogdanov and Viola:

Lemma 5 (Bogdanov-Viola lemma)Let be a limit polynomial of degree which is biased, and let be standard. Then one can find a limit function of order such that .

*Proof:* Let be a small standard number (depending on ) to be chosen later, let be a large standard integer (depending on ) to be chosen later, and let be chosen uniformly at random from . An application of the second moment method (which we leave as an exercise) shows that if is large enough, then with probability at least , one has

for at least choices of . We can rearrange this as

where ; note from hypothesis that . If we let be the nearest root of unity to , then (if is small enough) we conclude that for at least choices of . On the other hand, is clearly of order , and the claim follows.

Exercise 16Establish the claim left as an exercise in the above proof.

To conclude the proof of Theorem 4 from Lemma 5, it thus suffices to show

Proposition 6 (Rigidity)Let be a limit polynomial of degree which is equal to a limit function of order on at least of , where is standard. If is sufficiently small with respect to , then is also of order .

This proposition is somewhat tricky to prove, even in the high characteristic case . We fix and assume inductively that the proposition (and hence) Theorem 4 has been demonstrated for all smaller values of .

The main idea here is to start with the “noisy polynomial” , and perform some sort of “error correction” on to recover ; the key is then to show that this error correction procedure preserves the property of being order . From Exercise 14 we know that *in principle*, this error correction is possible if is small enough; but in order to preserve the order property we need a more explicit error correction algorithm which is tractable for analysis. This is provided by the following lemma.

Lemma 7 (Error correction of polynomials)Let be a (limit) classical polynomial of degree at most , and let be a (limit) function which agrees with at least of the time for some . Then for every , is equal to the most common value (i.e. the mode) of as vary in .

*Proof:* As is a polynomial of degree at most , one has

for all . We rearrange this as

holds unless and differ at for some .

On the other hand, if is fixed and are chosen independently and uniformly at random from , then for each , is also uniformly distributed in , and so the probability that and differ at is at most . Applying the union bound for the values of under consideration, we conclude that (3) happens more than half the time, and the claim follows.

Note that the above argument in fact shows that the mode is attained for at least of the choices of .

In view of this lemma, the goal is now to show that if is of order and is sufficiently close to a polynomial of degree , then the mode of is also of order .

By hypothesis, we have for some standard and some polynomials of degree . To motivate the general argument, let us first work in an easy model case, in which the are polynomials of degree that are linearly independent modulo low rank (i.e. order ) errors, i.e. no non-trivial linear combination of over is of low rank. This is not the most general case, but is somewhat simpler and will serve to illustrate the main ideas.

The linear independence, combined with the inductive hypothesis, implies that any non-trivial linear combination of is unbiased. From this and Fourier analysis, we see that is jointly equidistributed, thus in particular we have

In fact, we have a much stronger equidistribution property than this; not only do we understand the distribution of for a single , but more generally we can control the distribution of an entire parallelopiped

for any standard integer . Because all the components are polynomials of degree , the quantity is constrained to the space , defined as the subspace of consisting of all tuples obeying the constraints

for all faces of dimension , where is the sign of . (This constraint is of course vacuous if .)

Proposition 8is equidistributed in , thusfor all . Furthermore, we have the refined bound

for all and all .

*Proof:* It suffices to prove the second claim. Fix and From the definition of , we see that is uniquely determined by the component and . It will thus suffice to show that

for all , where

By Fourier analysis, it suffices to show that

for any non-zero . In other words, we need to show that

whenever the for are not all zero.

Let be such that , and such that is as large as possible; let us write , so that . Without loss of generality we may take . Suppose (5) failed, then by the pigeonhole principle one can find such that

We write the left-hand side as

where are bounded limit functions depending on that are independent of .

We can eliminate each term in turn by the Cauchy-Schwarz argument used in Notes 3, and conclude that

and thus by the monotonicity of Gowers norms

or in other words that the degree polynomial is biased. By the induction hypothesis, this polynomial must be low rank.

At this point we crucially exploit the high characteristic hypothesis by noting the Taylor expansion formula

The high characteristic is necessary here to invert . We conclude that is of low rank, but this contradicts the hypothesis on the and the non-zero nature of , and the claim follows.

Let and . From the above proposition we have an equidistribution result for a cube pinned at :

In fact, we can do a bit better than this, and obtain equidistribution even after fixing a second vertex:

Exercise 17 (Equidistribution of doubly pinned cubes)Let , let , let . Then for all but elements of , one has(

Hint:One can proceed by applying Proposition 8 with replaced by a larger dimension, such as ; details can be found in the paper of Ben Green and myself.)

We can now establish Proposition 6 in the case where the are independent modulo low rank errors. Let and . It will suffice to show that does not depend on as long as stays inside .

Call an atom *good* if and agree for at least of the elements of ; by Markov’s inequality (and (4)) we see that at least of the atoms are good. From this and an easy counting argument we can find an element in with the specified value of , such that is good for every .

Fix . Now consider all the pinned cubes with for all . By (6), the number of such cubes is . On the other hand, by Proposition 17, the total number of such cubes for which

for some is . We conclude that there exists a pinned cube for which

for all , and in particular (3) holds. However, as is constant on each of the , we see that the right-hand side of (3) does not depend on , and so the left-hand side does also.

This completes the proof of Proposition 6 in the independent case. In the general case, one reduces to a (slight generalisation of) this case by the following regularity lemma:

Lemma 9 (Regularity lemma)Let be a bounded number of limit classical polynomials of degree . Then there exists a limit classical bounded number of polynomials of degree for each , such that each is a function of the for and , and such that for each , the are independent modulo low rank polynomials of degree .

*Proof:* We induct on . The claim is vacuously true for , so suppose that and that the claim has already been proven for .

Let be the space of limit classical polynomials of degree , and let be the subspace of low rank limit classical polynomials. Working in the quotient space , we see that generates a finite-dimensional space here, which thus has a basis , thus are linearly independent modulo low rank polynomials of degree , and the are linear combinations of the plus combinations of some additional polynomials of degree . Applying the induction hypothesis to those additional polynomials, one obtains the claim.

Exercise 18Show that the polynomials appearing in the above lemma are equidistributed in the sense thatfor any with .

Applying the above lemma, one can express any order function in the form . It is then possible to modify the previous arguments to obtain Proposition 6; see the paper of Ben Green and myself for more details. (We phrase the arguments in a finitary setting rather than a nonstandard one, but the two approaches are equivalent; see this blog post for more discussion.)

It is possible to modify the above arguments to handle the low characteristic case, but due to the lack of a good Taylor expansion, one has to regularise the derivatives of the polynomials, as well as the polynomials themselves; see the paper of Kaufman and Lovett for details.

** — 3. Analytic rank — **

Define the *rank* of a degree (limit) classical polynomial to be the least number of degree (limit) classical polynomials such that is a function of . Theorem 3 tells us that is equidistributed whenever the rank is unbounded. However, the proof was rather involved. There is a more elementary approach to equidistribution due to Gowers and Wolf which replaces the rank by a different object, called *analytic rank*, and which can serve as a simpler substitute for the concept of rank in some applications.

Definition 10 (Analytic rank)The analytic rank of a (limit) classical polynomial of degree is defined to be the quantity

From the properties of the Gowers norms we see that this quantity is non-negative, is zero if and only if is a polynomial of degree , and is finite (or limit finite) for . (For , the analytic rank is infinite if is non-constant and zero if is constant.)

Exercise 19Show that if and is a (limit) classical polynomial of degree , then .

Exercise 20Show that if the analytic rank of a limit classical polynomial of degree is unbounded, then is equidistributed.

Exercise 21Suppose we are in the high characteristic case Using Theorem 3, show that a limit classical polynomial has bounded analytic rank if and only if it has bounded rank. (Hint:One direction follows from the preceding exercise. For the other direction, use the Taylor formula .) This is a special case of theinverse conjecture for the Gowers norms, which we will discuss in more detail in later notes.Conclude the following finitary version: if is a classical polynomial of degree on a finite-dimensoinal vector space , and , then ; conversely, if , then .

Exercise 22Show that if is a (limit) classical polynomial of degree , then and for all , and and for all (limit) classical polynomials of degree .

It is clear that the rank obeys the triangle inequality for all (limit) classical polynomials of degree . There is an analogue for analytic rank, due to Gowers and Wolf:

Proposition 11 (Quasi-triangle inequality for analytic rank)Let be (limit) classical polynomials of degree . Then .

*Proof:* Let be the -linear form

(note that the right-hand side is independent of ); similarly define

By definition, we have

and

and thus

We make the substitution . Using the multilinearity of , we can write the left-hand side as

where the are functions bounded in magnitude by that are independent of the variable. Eliminating all these factors by Cauchy-Schwarz as in the previous notes, we can bound the above expression by

which using the substitution and the multilinearity of simplifies to

which by definition of analytic rank is

and the claim follows.

## 8 comments

Comments feed for this article

10 May, 2010 at 2:44 am

MadsProfessor Tau,

An typo has found it’s way into the hint to Exercise 17: “cfan” should be “can”.

[Corrected, thanks – T.]20 May, 2010 at 9:44 pm

254B, Notes 5: The inverse conjecture for the Gowers norm I. The finite field case « What’s new[…] closely tied to the space of polynomials of degree at most . Indeed, as noted in Exercise 20 of Notes 4, a function of norm has norm equal to if and only if for some ; thus polynomials solve the […]

26 May, 2010 at 8:50 pm

PietroDear Terry, I may be misinterpreting the notation horribly, but I don’t think that the map x |–> x^i is a bijection on F_p for all 0<i<p-1. You need i to be prime to p-1 for that to be true. (Otherwise quadratic reciprocity would be too easy!)

Fortunately you don't need this to get the power sums to be zero, all you need is some nonzero b in F s.t. b^i is different from 1, which exists since the multiplicative group F_p\{0} is cyclic. (Or alternatively because the polynomial X^i – 1 has at most i roots.)

26 May, 2010 at 9:00 pm

Terence TaoOops, good point! (I had confused the proof of the vanishing of with the proof of Fermat’s little theorem that is based on the fact that is a bijection.) The correct statement is that is a covering map onto a nontrivial multiplicative subgroup, but it is indeed simpler to proceed by dilating by a non-i^th-root-of-unity b.

3 June, 2010 at 2:38 pm

Sungjin KimIn exercise 6, Sum x^i should be summing over x in F, not i in F.

[Corrected, thanks – T.]30 May, 2010 at 9:19 pm

mmailliw/williamI believe that your estimate for Exercise 14 doesn’t quite hold.

As a counterexample, we can let F = F_3, d = 2, and P = x^2. Then, because e(P(x)) = 1 for x = 0 and (-1 + i sqrt 3)/2 for x nonzero, the expected value of e(P(x)) is (sqrt 3)/3 i, whose absolute value is about .5773 (this also follows from Exercise 12, which gives 1/(sqrt 3) in this case); however, the statement on Exercise 14 would imply that if

|E(e(P(x))| > 1 – 2^(-d + 1) = 1 – 2^(-1) = .5 in this case, then P would have to be constant, which is a contradiction.

The induction proof almost works, though (I believe that the stumbling point is that |E_x (Delta_h P(x))| could equal 1 if h = 0).

30 May, 2010 at 11:01 pm

Terence TaoAh, right, I meant to have a lower bound on E |e(P(x)) – c| for some unit phase c (instead of a lower bound on |E e(P(x)) – c|, which was essentially what was written). The original claim is still true if one replaces by something a bit weaker, of course. I’ve reworded the question accordingly.

3 June, 2010 at 9:20 am

Sungjin KimDear Professor Tao,

In exercise 2, the answer holds when we have no restrictions on exponents i_1, …, i_n. However, by exercise 1, we have a restriction that all i_1, … i_n should be in {0,1, …, p-1}.

[Corrected, thanks – T.]