In the previous lectures, we have focused mostly on the equidistribution or linear patterns on a subset of the integers , and in particular on intervals . The integers are of course a very important domain to study in additive combinatorics; but there are also other fundamental model examples of domains to study. One of these is that of a vector space over a finite field of prime order. Such domains are of interest in computer science (particularly when ) and also in number theory; but they also serve as an important simplified “dyadic model” for the integers. See this survey article of Green for further discussion of this point.
The additive combinatorics of the integers , and of vector spaces over finite fields, are analogous, but not quite identical. For instance, the analogue of an arithmetic progression in is a subspace of . In many cases, the finite field theory is a little bit simpler than the integer theory; for instance, subspaces are closed under addition, whereas arithmetic progressions are only “almost” closed under addition in various senses. (For instance, is closed under addition approximately half of the time.) However, there are some ways in which the integers are better behaved. For instance, because the integers can be generated by a single generator, a homomorphism from to some other group can be described by a single group element : . However, to specify a homomorphism from a vector space to one would need to specify one group element for each dimension of . Thus we see that there is a tradeoff when passing from (or ) to a vector space model; one gains a bounded torsion property, at the expense of conceding the bounded generation property. (Of course, if one wants to deal with arbitrarily large domains, one has to concede one or the other; the only additive groups that have both bounded torsion and boundedly many generators, are bounded.)
The starting point for this course (Notes 1) was the study of equidistribution of polynomials from the integers to the unit circle. We now turn to the parallel theory of equidistribution of polynomials from vector spaces over finite fields to the unit circle. Actually, for simplicity we will mostly focus on the classical case, when the polynomials in fact take values in the roots of unity (where is the characteristic of the field ). As it turns out, the non-classical case is also of importance (particularly in low characteristic), but the theory is more difficult; see these notes for some further discussion.
— 1. Polynomials: basic theory —
Throughout this section, will be a finite-dimensional vector space over a finite field of prime order .
Recall from the previous notes that a function is a function is a polynomial of degree at most if
for all , where . As mentioned in previous notes, this is equivalent to the assertion that the Gowers uniformity norm . The space of polynomials of degree at most will be denoted ; it is clearly an additive group. Note that a polynomial of degree zero is the same thing as a constant function, thus .
An important special case of polynomials are the classical polynomials, which take values in (which we identify with the roots of unity in in the obvious manner); the space of such polynomials of degree at most will be denoted ; this is clearly a vector space over . The classical polynomials have a familiar description, once we use a basis to identify with :
for some coefficients . Furthermore, show that we can restrict the exponents to lie in the range , and that once one does so, the representation is unique. (Hint: First establish the case, which can be done for instance by a dimension counting argument, and then induct on dimension.)
Exercise 2 Show that the cardinality of is at most , with equality if and only if .
Now we study more general polynomials. A basic fact here is that multiplying a polynomial by the characteristic lowers the degree:
Proof: Without loss of generality we may take ; an easy induction on then shows it suffices to verify the base case . Our task is now to show that is constant, or equivalently that for all .
Fix . The operator represents a shift by . Since , we conclude that . On the other hand, as has degree at most , , and so
Using the binomial formula, we can factorise the left-hand side as
The first factor can be inverted by Neumann series since acts nilpotently on polynomials. We conclude that as required.
Exercise 3 Establish the identity
for an indeterminate and any integer , by testing on roots of unity. (This identity was communicated to me by Andrew Granville.) Use this to give an alternate proof of Lemma 1.
This classifies all polynomials in the high characteristic case :
Corollary 2 If , then . In other words, every polynomial of degree at most is the sum of a classical polynomial and a constant.
The situation is more complicated in the low characteristic case , in which non-classical polynomials can occur (polynomials that are not simply a classical polynomial up to constants). For instance, consider the function defined by and . One easily verifies that this is a (non-classical) quadratic (i.e. a polynomial of degree at most ), but is clearly not a shifted version of a classical polynomial since its range is not a shift of the second roots of unity.
Exercise 4 Let be a function. Show that is a polynomial of degree at most if and only if the range of is a translate of the roots of unity (i.e. is constant).
For further discussion of non-classical polynomials, see this previous blog post. Henceforth we shall avoid this technical issue by restricting to the high characteristic case (or equivalently, the low degree case ).
— 2. Equidistribution —
Let us now consider the equidistribution theory of a classical polynomial , where we think of as being a fixed field (in particular, ), and the dimension of as being very large; will play the role here that the interval played in Notes 1. This theory is classical for linear and quadratic polynomials. The general theory was studied first by Ben Green and myself in the high characteristic case , and extended to the low characteristic case by Kaufman and Lovett. An analogous theory surely exists for the non-classical case, although this is not currently in the literature.
The situation here is simpler because a classical polynomial can only take values, so that in the equidistributed case one expects each value to be obtained about times. Inspired by this, let us call a classical polynomial -equidistributed if one has
for all .
Exercise 5 Show that this is equivalent to the notion of -equidistribution given in Notes 1, if one gives the metric induced from , and if one is willing to modify by a multiplicative factor depending on in the equivalences.
Before we study equidistribution in earnest, we first give a classical estimate.
Exercise 6 (Chevalley-Warning theorem) Let be a finite dimensional space, and let be a classical polynomial of degree less than . Show that . (Hint: Identify with for some and apply Exercise 1. Use the fact that for all , which can be deduced by using a change of variables .) If furthermore has degree less than , conclude that for every , that is a multiple of . (Hint: Apply Fermat’s little theorem to the quantity .) In particular, if , then there exists at least one further such that .
If has degree at most and , obtain the recurrence inequality
Hint: normalise , then average the previous claim over all subspaces of of a certain dimension.
The above exercise goes some way towards establishing equidistribution, by showing that every element in the image of is attained a fairly large number of times. But additional techniques will be needed (together with additional hypotheses on ) in order to obtain full equidistribution. It will be convenient to work in the ultralimit setting. Define a limit classical polynomial on a limit finite-dimensional vector space of degree at most to be an ultralimit of classical polynomials of degree at most (we keep and fixed independently of ). We say that a limit classical polynomial is equidistributed if one has
for all , where the cardinalities here are of course limit cardinalities.
Exercise 7 Let be a limit finite-dimensional vector space. Show that a limit function is a limit classical polynomial of degree at most if and only if it is a classical polynomial of degree at most (observing here that every limit vector space is automatically a vector space).
Exercise 8 Let be a limit classical polynomial. Show that is equidistributed if and only if, for every , is -equidistributed for sufficiently close to .
Exercise 9 Let be a limit classical polynomial which is linear (i.e. of degree at most ). Show that is equidistributed if and only if is non-constant.
There is an analogue of the Weyl criterion in this setting. Call a limit function biased if , and unbiased if , where we identify with an element of .
Exercise 10 (Weyl criterion) Let be a limit function. Show that is equidistributed if and only if is unbiased for all non-zero .
Thus to understand the equidistribution of polynomials, it suffices to understand the size of exponential sums . For linear polynomials, this is an easy application of Fourier analysis:
for any finite vector space and function , where is the multiplicative derivative. Taking ultralimits, we see that the identity also holds for limit functions on limit finite dimensional vector spaces. In particular, we have
for any limit function on a limit finite dimensional space.
If is quadratic, then is linear. Applying (11), we conclude that if is biased, then must be constant for values of .
On the other hand, by using the cocycle identity
we see that the set of for which is constant is a limit subspace of . On that subspace, is then linear; passing to a codimension one subspace of , is then constant on . As is linear for every , is then linear on each coset of . As , there are only a bounded number of such cosets; thus is piecewise linear, and thus piecewise constant on slightly smaller cosets. Intersecting all the subspaces together, we can thus find another limit subspace with such that is constant on each coset of . To put it another way, if we view as the intersection of a bounded number of kernels of linear homomorphism (where is the codimension of ), then is constant on every simultaneous level set of , and can thus be expressed as a function of these linear polynomials.
More generally, let us say that a limit classical polynomial of degree is low rank if it can be expressed as where are a bounded number of polynomials of degree . We can summarise the above discussion (and also Exercise 11) as follows:
In particular, from the Weyl criterion, we see that if is not equidistributed, then is of low rank.
Of course, the claim fails if the low rank hypothesis is dropped. For instance, consider a limit classical quadratic that is the product of two linearly independent linear polynomials . Then attains each non-zero value with a density of rather than (and attains with a density of rather than ).
Exercise 12 Suppose that the characteristic of is greater than , and suppose that is a quadratic polynomial of the form , where , , is a symmetric matrix with coefficients in , and is the transpose of . Show that , where is the rank of . Furthermore, if is orthogonal to the kernel of , show that equality is attained, and otherwise vanishes.
What happens in the even characteristic case (assuming now that is not symmetric)?
Exercise 13 (Van der Corput lemma) Let be a limit function on a limit finite dimensional vector space , and suppose that there exists a limit subset of which is sparse in the sense that , and such that is equidistributed for all . Show that itself is equidistributed. Use this to give an alternate proof of 3.
Exercise 14 (Space of polynomials is discrete) Let be a polynomial of degree at most such that for some constant . Show that is constant. (Hint: induct on .) Conclude that if are two distinct polynomials of degree at most , that .
The fact that high rank polynomials are equidistributed extends to higher degrees also:
In particular, from the Weyl criterion, we see that if is not equidistributed, then is of low rank.
In the high characteristic case , this claim was obtained by Ben Green and myself; the generalisation to the low characteristic case was carried out by Kaufman and Lovett. The statement is phrased in the language of ultrafilters, but it has an equivalent finitary analogue:
Exercise 15 Show that Theorem 4 is equivalent to the claim that for every and , and every classical polynomial of degree at most on a finite-dimensional vector space with , that can be expressed as a function of at most classical polynomials of degree at most .
The proof of Theorem 4 is a little lengthy. It splits up into two pieces. We say that a limit function (not necessarily a polynomial) is of order if it can be expressed as a function of a bounded number of polynomials of degree less than . Our task is thus to show that every polynomial of degree which is biased, is of order . We first get within an epsilon of this goal, using an argument of Bogdanov and Viola:
Proof: Let be a small standard number (depending on ) to be chosen later, let be a large standard integer (depending on ) to be chosen later, and let be chosen uniformly at random from . An application of the second moment method (which we leave as an exercise) shows that if is large enough, then with probability at least , one has
for at least choices of . We can rearrange this as
where ; note from hypothesis that . If we let be the nearest root of unity to , then (if is small enough) we conclude that for at least choices of . On the other hand, is clearly of order , and the claim follows.
Exercise 16 Establish the claim left as an exercise in the above proof.
Proposition 6 (Rigidity) Let be a limit polynomial of degree which is equal to a limit function of order on at least of , where is standard. If is sufficiently small with respect to , then is also of order .
This proposition is somewhat tricky to prove, even in the high characteristic case . We fix and assume inductively that the proposition (and hence) Theorem 4 has been demonstrated for all smaller values of .
The main idea here is to start with the “noisy polynomial” , and perform some sort of “error correction” on to recover ; the key is then to show that this error correction procedure preserves the property of being order . From Exercise 14 we know that in principle, this error correction is possible if is small enough; but in order to preserve the order property we need a more explicit error correction algorithm which is tractable for analysis. This is provided by the following lemma.
Lemma 7 (Error correction of polynomials) Let be a (limit) classical polynomial of degree at most , and let be a (limit) function which agrees with at least of the time for some . Then for every , is equal to the most common value (i.e. the mode) of as vary in .
Proof: As is a polynomial of degree at most , one has
for all . We rearrange this as
holds unless and differ at for some .
On the other hand, if is fixed and are chosen independently and uniformly at random from , then for each , is also uniformly distributed in , and so the probability that and differ at is at most . Applying the union bound for the values of under consideration, we conclude that (3) happens more than half the time, and the claim follows.
Note that the above argument in fact shows that the mode is attained for at least of the choices of .
In view of this lemma, the goal is now to show that if is of order and is sufficiently close to a polynomial of degree , then the mode of is also of order .
By hypothesis, we have for some standard and some polynomials of degree . To motivate the general argument, let us first work in an easy model case, in which the are polynomials of degree that are linearly independent modulo low rank (i.e. order ) errors, i.e. no non-trivial linear combination of over is of low rank. This is not the most general case, but is somewhat simpler and will serve to illustrate the main ideas.
The linear independence, combined with the inductive hypothesis, implies that any non-trivial linear combination of is unbiased. From this and Fourier analysis, we see that is jointly equidistributed, thus in particular we have
for all , where .
In fact, we have a much stronger equidistribution property than this; not only do we understand the distribution of for a single , but more generally we can control the distribution of an entire parallelopiped
for any standard integer . Because all the components are polynomials of degree , the quantity is constrained to the space , defined as the subspace of consisting of all tuples obeying the constraints
for all faces of dimension , where is the sign of . (This constraint is of course vacuous if .)
for all . Furthermore, we have the refined bound
for all and all .
Proof: It suffices to prove the second claim. Fix and From the definition of , we see that is uniquely determined by the component and . It will thus suffice to show that
for all , where
By Fourier analysis, it suffices to show that
whenever the for are not all zero.
Let be such that , and such that is as large as possible; let us write , so that . Without loss of generality we may take . Suppose (5) failed, then by the pigeonhole principle one can find such that
We write the left-hand side as
where are bounded limit functions depending on that are independent of .
We can eliminate each term in turn by the Cauchy-Schwarz argument used in Notes 3, and conclude that
and thus by the monotonicity of Gowers norms
or in other words that the degree polynomial is biased. By the induction hypothesis, this polynomial must be low rank.
At this point we crucially exploit the high characteristic hypothesis by noting the Taylor expansion formula
The high characteristic is necessary here to invert . We conclude that is of low rank, but this contradicts the hypothesis on the and the non-zero nature of , and the claim follows.
In fact, we can do a bit better than this, and obtain equidistribution even after fixing a second vertex:
We can now establish Proposition 6 in the case where the are independent modulo low rank errors. Let and . It will suffice to show that does not depend on as long as stays inside .
Call an atom good if and agree for at least of the elements of ; by Markov’s inequality (and (4)) we see that at least of the atoms are good. From this and an easy counting argument we can find an element in with the specified value of , such that is good for every .
for some is . We conclude that there exists a pinned cube for which
This completes the proof of Proposition 6 in the independent case. In the general case, one reduces to a (slight generalisation of) this case by the following regularity lemma:
Lemma 9 (Regularity lemma) Let be a bounded number of limit classical polynomials of degree . Then there exists a limit classical bounded number of polynomials of degree for each , such that each is a function of the for and , and such that for each , the are independent modulo low rank polynomials of degree .
Proof: We induct on . The claim is vacuously true for , so suppose that and that the claim has already been proven for .
Let be the space of limit classical polynomials of degree , and let be the subspace of low rank limit classical polynomials. Working in the quotient space , we see that generates a finite-dimensional space here, which thus has a basis , thus are linearly independent modulo low rank polynomials of degree , and the are linear combinations of the plus combinations of some additional polynomials of degree . Applying the induction hypothesis to those additional polynomials, one obtains the claim.
Exercise 18 Show that the polynomials appearing in the above lemma are equidistributed in the sense that
for any with .
Applying the above lemma, one can express any order function in the form . It is then possible to modify the previous arguments to obtain Proposition 6; see the paper of Ben Green and myself for more details. (We phrase the arguments in a finitary setting rather than a nonstandard one, but the two approaches are equivalent; see this blog post for more discussion.)
It is possible to modify the above arguments to handle the low characteristic case, but due to the lack of a good Taylor expansion, one has to regularise the derivatives of the polynomials, as well as the polynomials themselves; see the paper of Kaufman and Lovett for details.
— 3. Analytic rank —
Define the rank of a degree (limit) classical polynomial to be the least number of degree (limit) classical polynomials such that is a function of . Theorem 3 tells us that is equidistributed whenever the rank is unbounded. However, the proof was rather involved. There is a more elementary approach to equidistribution due to Gowers and Wolf which replaces the rank by a different object, called analytic rank, and which can serve as a simpler substitute for the concept of rank in some applications.
Definition 10 (Analytic rank) The analytic rank of a (limit) classical polynomial of degree is defined to be the quantity
From the properties of the Gowers norms we see that this quantity is non-negative, is zero if and only if is a polynomial of degree , and is finite (or limit finite) for . (For , the analytic rank is infinite if is non-constant and zero if is constant.)
Exercise 19 Show that if and is a (limit) classical polynomial of degree , then .
Exercise 20 Show that if the analytic rank of a limit classical polynomial of degree is unbounded, then is equidistributed.
Exercise 21 Suppose we are in the high characteristic case Using Theorem 3, show that a limit classical polynomial has bounded analytic rank if and only if it has bounded rank. (Hint: One direction follows from the preceding exercise. For the other direction, use the Taylor formula .) This is a special case of the inverse conjecture for the Gowers norms, which we will discuss in more detail in later notes.
Conclude the following finitary version: if is a classical polynomial of degree on a finite-dimensoinal vector space , and , then ; conversely, if , then .
Exercise 22 Show that if is a (limit) classical polynomial of degree , then and for all , and and for all (limit) classical polynomials of degree .
It is clear that the rank obeys the triangle inequality for all (limit) classical polynomials of degree . There is an analogue for analytic rank, due to Gowers and Wolf:
Proposition 11 (Quasi-triangle inequality for analytic rank) Let be (limit) classical polynomials of degree . Then .
Proof: Let be the -linear form
(note that the right-hand side is independent of ); similarly define
By definition, we have
We make the substitution . Using the multilinearity of , we can write the left-hand side as
where the are functions bounded in magnitude by that are independent of the variable. Eliminating all these factors by Cauchy-Schwarz as in the previous notes, we can bound the above expression by
which using the substitution and the multilinearity of simplifies to
which by definition of analytic rank is
and the claim follows.