In the previous lectures, we have focused mostly on the equidistribution or linear patterns on a subset of the integers , and in particular on intervals
. The integers are of course a very important domain to study in additive combinatorics; but there are also other fundamental model examples of domains to study. One of these is that of a vector space
over a finite field
of prime order. Such domains are of interest in computer science (particularly when
) and also in number theory; but they also serve as an important simplified “dyadic model” for the integers. See this survey article of Green for further discussion of this point.
The additive combinatorics of the integers , and of vector spaces
over finite fields, are analogous, but not quite identical. For instance, the analogue of an arithmetic progression in
is a subspace of
. In many cases, the finite field theory is a little bit simpler than the integer theory; for instance, subspaces are closed under addition, whereas arithmetic progressions are only “almost” closed under addition in various senses. (For instance,
is closed under addition approximately half of the time.) However, there are some ways in which the integers are better behaved. For instance, because the integers can be generated by a single generator, a homomorphism from
to some other group
can be described by a single group element
:
. However, to specify a homomorphism from a vector space
to
one would need to specify one group element for each dimension of
. Thus we see that there is a tradeoff when passing from
(or
) to a vector space model; one gains a bounded torsion property, at the expense of conceding the bounded generation property. (Of course, if one wants to deal with arbitrarily large domains, one has to concede one or the other; the only additive groups that have both bounded torsion and boundedly many generators, are bounded.)
The starting point for this course (Notes 1) was the study of equidistribution of polynomials from the integers to the unit circle. We now turn to the parallel theory of equidistribution of polynomials
from vector spaces over finite fields to the unit circle. Actually, for simplicity we will mostly focus on the classical case, when the polynomials in fact take values in the
roots of unity (where
is the characteristic of the field
). As it turns out, the non-classical case is also of importance (particularly in low characteristic), but the theory is more difficult; see these notes for some further discussion.
— 1. Polynomials: basic theory —
Throughout this section, will be a finite-dimensional vector space over a finite field
of prime order
.
Recall from the previous notes that a function is a function is a polynomial of degree at most
if
for all , where
. As mentioned in previous notes, this is equivalent to the assertion that the Gowers uniformity norm
. The space of polynomials of degree at most
will be denoted
; it is clearly an additive group. Note that a polynomial of degree zero is the same thing as a constant function, thus
.
An important special case of polynomials are the classical polynomials, which take values in (which we identify with the
roots of unity in
in the obvious manner); the space of such polynomials of degree at most
will be denoted
; this is clearly a vector space over
. The classical polynomials have a familiar description, once we use a basis to identify
with
:
Exercise 1 Let
be a function, and
be an integer. Show that
is a (classical) polynomial of degree at most
if and only if one has a representation of the form
for some coefficients
. Furthermore, show that we can restrict the exponents
to lie in the range
, and that once one does so, the representation is unique. (Hint: First establish the
case, which can be done for instance by a dimension counting argument, and then induct on dimension.)
Exercise 2 Show that the cardinality of
is at most
, with equality if and only if
.
Now we study more general polynomials. A basic fact here is that multiplying a polynomial by the characteristic lowers the degree:
Proof: Without loss of generality we may take ; an easy induction on
then shows it suffices to verify the base case
. Our task is now to show that
is constant, or equivalently that
for all
.
Fix . The operator
represents a shift by
. Since
, we conclude that
. On the other hand, as
has degree at most
,
, and so
Using the binomial formula, we can factorise the left-hand side as
The first factor can be inverted by Neumann series since acts nilpotently on polynomials. We conclude that
as required.
Exercise 3 Establish the identity
for an indeterminate
and any integer
, by testing on
roots of unity. (This identity was communicated to me by Andrew Granville.) Use this to give an alternate proof of Lemma 1.
This classifies all polynomials in the high characteristic case :
Corollary 2 If
, then
. In other words, every polynomial of degree at most
is the sum of a classical polynomial and a constant.
The situation is more complicated in the low characteristic case , in which non-classical polynomials can occur (polynomials that are not simply a classical polynomial up to constants). For instance, consider the function
defined by
and
. One easily verifies that this is a (non-classical) quadratic (i.e. a polynomial of degree at most
), but is clearly not a shifted version of a classical polynomial since its range is not a shift of the second roots
of unity.
Exercise 4 Let
be a function. Show that
is a polynomial of degree at most
if and only if the range of
is a translate of the
roots of unity (i.e.
is constant).
For further discussion of non-classical polynomials, see this previous blog post. Henceforth we shall avoid this technical issue by restricting to the high characteristic case (or equivalently, the low degree case
).
— 2. Equidistribution —
Let us now consider the equidistribution theory of a classical polynomial , where we think of
as being a fixed field (in particular,
), and the dimension of
as being very large;
will play the role here that the interval
played in Notes 1. This theory is classical for linear and quadratic polynomials. The general theory was studied first by Ben Green and myself in the high characteristic case
, and extended to the low characteristic case by Kaufman and Lovett. An analogous theory surely exists for the non-classical case, although this is not currently in the literature.
The situation here is simpler because a classical polynomial can only take values, so that in the equidistributed case one expects each value to be obtained about
times. Inspired by this, let us call a classical polynomial
-equidistributed if one has
for all .
Exercise 5 Show that this is equivalent to the notion of
-equidistribution given in Notes 1, if one gives
the metric induced from
, and if one is willing to modify
by a multiplicative factor depending on
in the equivalences.
Before we study equidistribution in earnest, we first give a classical estimate.
Exercise 6 (Chevalley-Warning theorem) Let
be a finite dimensional space, and let
be a classical polynomial of degree less than
. Show that
. (Hint: Identify
with
for some
and apply Exercise 1. Use the fact that
for all
, which can be deduced by using a change of variables
.) If furthermore
has degree less than
, conclude that for every
, that
is a multiple of
. (Hint: Apply Fermat’s little theorem to the quantity
.) In particular, if
, then there exists at least one further
such that
.
If
has degree at most
and
, obtain the recurrence inequality
Hint: normalise
, then average the previous claim over all subspaces of
of a certain dimension.
The above exercise goes some way towards establishing equidistribution, by showing that every element in the image of is attained a fairly large number of times. But additional techniques will be needed (together with additional hypotheses on
) in order to obtain full equidistribution. It will be convenient to work in the ultralimit setting. Define a limit classical polynomial
on a limit finite-dimensional vector space
of degree at most
to be an ultralimit of classical polynomials
of degree at most
(we keep
and
fixed independently of
). We say that a limit classical polynomial
is equidistributed if one has
for all , where the cardinalities here are of course limit cardinalities.
Exercise 7 Let
be a limit finite-dimensional vector space. Show that a limit function
is a limit classical polynomial of degree at most
if and only if it is a classical polynomial of degree at most
(observing here that every limit vector space is automatically a vector space).
Exercise 8 Let
be a limit classical polynomial. Show that
is equidistributed if and only if, for every
,
is
-equidistributed for
sufficiently close to
.
Exercise 9 Let
be a limit classical polynomial which is linear (i.e. of degree at most
). Show that
is equidistributed if and only if
is non-constant.
There is an analogue of the Weyl criterion in this setting. Call a limit function biased if
, and unbiased if
, where we identify
with an element of
.
Exercise 10 (Weyl criterion) Let
be a limit function. Show that
is equidistributed if and only if
is unbiased for all non-zero
.
Thus to understand the equidistribution of polynomials, it suffices to understand the size of exponential sums . For linear polynomials, this is an easy application of Fourier analysis:
Exercise 11 Let
be a polynomial of degree at most
. Show that
equals
if
is constant, and equals
if
is not constant. (Note that this is completely consistent with the previous two exercises.)
Next, we turn attention to the quadratic case. Here, we can use the Weyl differencing trick, which we phrase as an identity
for any finite vector space and function
, where
is the multiplicative derivative. Taking ultralimits, we see that the identity also holds for limit functions on limit finite dimensional vector spaces. In particular, we have
for any limit function on a limit finite dimensional space.
If is quadratic, then
is linear. Applying (11), we conclude that if
is biased, then
must be constant for
values of
.
On the other hand, by using the cocycle identity
we see that the set of for which
is constant is a limit subspace of
. On that subspace,
is then linear; passing to a codimension one subspace
of
,
is then constant on
. As
is linear for every
,
is then linear on each coset
of
. As
, there are only a bounded number of such cosets; thus
is piecewise linear, and thus piecewise constant on slightly smaller cosets. Intersecting all the subspaces together, we can thus find another limit subspace
with
such that
is constant on each coset of
. To put it another way, if we view
as the intersection of a bounded number of kernels of linear homomorphism
(where
is the codimension of
), then
is constant on every simultaneous level set of
, and can thus be expressed as a function
of these linear polynomials.
More generally, let us say that a limit classical polynomial of degree
is low rank if it can be expressed as
where
are a bounded number of polynomials of degree
. We can summarise the above discussion (and also Exercise 11) as follows:
Proposition 3 Let
, and let
be a limit classical polynomial. If
is biased, then
is low rank.
In particular, from the Weyl criterion, we see that if
is not equidistributed, then
is of low rank.
Of course, the claim fails if the low rank hypothesis is dropped. For instance, consider a limit classical quadratic that is the product of two linearly independent linear polynomials
. Then
attains each non-zero value with a density of
rather than
(and attains
with a density of
rather than
).
Exercise 12 Suppose that the characteristic
of
is greater than
, and suppose that
is a quadratic polynomial of the form
, where
,
,
is a symmetric
matrix with coefficients in
, and
is the transpose of
. Show that
, where
is the rank of
. Furthermore, if
is orthogonal to the kernel of
, show that equality is attained, and otherwise
vanishes.
What happens in the even characteristic case (assuming now that
is not symmetric)?
Exercise 13 (Van der Corput lemma) Let
be a limit function on a limit finite dimensional vector space
, and suppose that there exists a limit subset
of
which is sparse in the sense that
, and such that
is equidistributed for all
. Show that
itself is equidistributed. Use this to give an alternate proof of 3.
Exercise 14 (Space of polynomials is discrete) Let
be a polynomial of degree at most
such that
for some constant
. Show that
is constant. (Hint: induct on
.) Conclude that if
are two distinct polynomials of degree at most
, that
.
The fact that high rank polynomials are equidistributed extends to higher degrees also:
Theorem 4 Let
be a limit classical polynomial. If
is biased, then
is low rank.
In particular, from the Weyl criterion, we see that if
is not equidistributed, then
is of low rank.
In the high characteristic case , this claim was obtained by Ben Green and myself; the generalisation to the low characteristic case
was carried out by Kaufman and Lovett. The statement is phrased in the language of ultrafilters, but it has an equivalent finitary analogue:
Exercise 15 Show that Theorem 4 is equivalent to the claim that for every
and
, and every classical polynomial
of degree at most
on a finite-dimensional vector space with
, that
can be expressed as a function of at most
classical polynomials of degree at most
.
The proof of Theorem 4 is a little lengthy. It splits up into two pieces. We say that a limit function (not necessarily a polynomial) is of order
if it can be expressed as a function of a bounded number of polynomials of degree less than
. Our task is thus to show that every polynomial of degree
which is biased, is of order
. We first get within an epsilon of this goal, using an argument of Bogdanov and Viola:
Lemma 5 (Bogdanov-Viola lemma) Let
be a limit polynomial of degree
which is biased, and let
be standard. Then one can find a limit function
of order
such that
.
Proof: Let be a small standard number (depending on
) to be chosen later, let
be a large standard integer (depending on
) to be chosen later, and let
be chosen uniformly at random from
. An application of the second moment method (which we leave as an exercise) shows that if
is large enough, then with probability at least
, one has
for at least choices of
. We can rearrange this as
where ; note from hypothesis that
. If we let
be the nearest
root of unity to
, then (if
is small enough) we conclude that
for at least
choices of
. On the other hand,
is clearly of order
, and the claim follows.
Exercise 16 Establish the claim left as an exercise in the above proof.
To conclude the proof of Theorem 4 from Lemma 5, it thus suffices to show
Proposition 6 (Rigidity) Let
be a limit polynomial of degree
which is equal to a limit function
of order
on at least
of
, where
is standard. If
is sufficiently small with respect to
, then
is also of order
.
This proposition is somewhat tricky to prove, even in the high characteristic case . We fix
and assume inductively that the proposition (and hence) Theorem 4 has been demonstrated for all smaller values of
.
The main idea here is to start with the “noisy polynomial” , and perform some sort of “error correction” on
to recover
; the key is then to show that this error correction procedure preserves the property of being order
. From Exercise 14 we know that in principle, this error correction is possible if
is small enough; but in order to preserve the order
property we need a more explicit error correction algorithm which is tractable for analysis. This is provided by the following lemma.
Lemma 7 (Error correction of polynomials) Let
be a (limit) classical polynomial of degree at most
, and let
be a (limit) function which agrees with
at least
of the time for some
. Then for every
,
is equal to the most common value (i.e. the mode) of
as
vary in
.
Proof: As is a polynomial of degree at most
, one has
for all . We rearrange this as
holds unless and
differ at
for some
.
On the other hand, if is fixed and
are chosen independently and uniformly at random from
, then for each
,
is also uniformly distributed in
, and so the probability that
and
differ at
is at most
. Applying the union bound for the
values of
under consideration, we conclude that (3) happens more than half the time, and the claim follows.
Note that the above argument in fact shows that the mode is attained for at least of the choices of
.
In view of this lemma, the goal is now to show that if is of order
and is sufficiently close to a polynomial of degree
, then the mode of
is also of order
.
By hypothesis, we have for some standard
and some polynomials
of degree
. To motivate the general argument, let us first work in an easy model case, in which the
are polynomials of degree
that are linearly independent modulo low rank (i.e. order
) errors, i.e. no non-trivial linear combination of
over
is of low rank. This is not the most general case, but is somewhat simpler and will serve to illustrate the main ideas.
The linear independence, combined with the inductive hypothesis, implies that any non-trivial linear combination of is unbiased. From this and Fourier analysis, we see that
is jointly equidistributed, thus in particular we have
for all , where
.
In fact, we have a much stronger equidistribution property than this; not only do we understand the distribution of for a single
, but more generally we can control the distribution of an entire parallelopiped
for any standard integer . Because all the components
are polynomials of degree
, the quantity
is constrained to the space
, defined as the subspace of
consisting of all tuples
obeying the constraints
for all faces of dimension
, where
is the sign of
. (This constraint is of course vacuous if
.)
Proposition 8
is equidistributed in
, thus
for all
. Furthermore, we have the refined bound
for all
and all
.
Proof: It suffices to prove the second claim. Fix and
From the definition of
, we see that
is uniquely determined by the component
and
. It will thus suffice to show that
for all , where
By Fourier analysis, it suffices to show that
for any non-zero . In other words, we need to show that
whenever the for
are not all zero.
Let be such that
, and such that
is as large as possible; let us write
, so that
. Without loss of generality we may take
. Suppose (5) failed, then by the pigeonhole principle one can find
such that
We write the left-hand side as
where are bounded limit functions depending on
that are independent of
.
We can eliminate each term in turn by the Cauchy-Schwarz argument used in Notes 3, and conclude that
and thus by the monotonicity of Gowers norms
or in other words that the degree polynomial
is biased. By the induction hypothesis, this polynomial must be low rank.
At this point we crucially exploit the high characteristic hypothesis by noting the Taylor expansion formula
The high characteristic is necessary here to invert . We conclude that
is of low rank, but this contradicts the hypothesis on the
and the non-zero nature of
, and the claim follows.
Let and
. From the above proposition we have an equidistribution result for a cube pinned at
:
In fact, we can do a bit better than this, and obtain equidistribution even after fixing a second vertex:
Exercise 17 (Equidistribution of doubly pinned cubes) Let
, let
, let
. Then for all but
elements
of
, one has
(Hint: One can proceed by applying Proposition 8 with
replaced by a larger dimension, such as
; details can be found in the paper of Ben Green and myself.)
We can now establish Proposition 6 in the case where the are independent modulo low rank errors. Let
and
. It will suffice to show that
does not depend on
as long as
stays inside
.
Call an atom good if
and
agree for at least
of the elements of
; by Markov’s inequality (and (4)) we see that at least
of the atoms are good. From this and an easy counting argument we can find an element
in
with the specified value of
, such that
is good for every
.
Fix . Now consider all the pinned cubes
with
for all
. By (6), the number of such cubes is
. On the other hand, by Proposition 17, the total number of such cubes for which
for some is
. We conclude that there exists a pinned cube for which
for all , and in particular (3) holds. However, as
is constant on each of the
, we see that the right-hand side of (3) does not depend on
, and so the left-hand side does also.
This completes the proof of Proposition 6 in the independent case. In the general case, one reduces to a (slight generalisation of) this case by the following regularity lemma:
Lemma 9 (Regularity lemma) Let
be a bounded number of limit classical polynomials of degree
. Then there exists a limit classical bounded number of polynomials
of degree
for each
, such that each
is a function of the
for
and
, and such that for each
, the
are independent modulo low rank polynomials of degree
.
Proof: We induct on . The claim is vacuously true for
, so suppose that
and that the claim has already been proven for
.
Let be the space of limit classical polynomials of degree
, and let
be the subspace of low rank limit classical polynomials. Working in the quotient space
, we see that
generates a finite-dimensional space here, which thus has a basis
, thus
are linearly independent modulo low rank polynomials of degree
, and the
are linear combinations of the
plus combinations of some additional polynomials
of degree
. Applying the induction hypothesis to those additional polynomials, one obtains the claim.
Exercise 18 Show that the polynomials
appearing in the above lemma are equidistributed in the sense that
for any
with
.
Applying the above lemma, one can express any order function
in the form
. It is then possible to modify the previous arguments to obtain Proposition 6; see the paper of Ben Green and myself for more details. (We phrase the arguments in a finitary setting rather than a nonstandard one, but the two approaches are equivalent; see this blog post for more discussion.)
It is possible to modify the above arguments to handle the low characteristic case, but due to the lack of a good Taylor expansion, one has to regularise the derivatives of the polynomials, as well as the polynomials themselves; see the paper of Kaufman and Lovett for details.
— 3. Analytic rank —
Define the rank of a degree
(limit) classical polynomial
to be the least number
of degree
(limit) classical polynomials
such that
is a function of
. Theorem 3 tells us that
is equidistributed whenever the rank is unbounded. However, the proof was rather involved. There is a more elementary approach to equidistribution due to Gowers and Wolf which replaces the rank by a different object, called analytic rank, and which can serve as a simpler substitute for the concept of rank in some applications.
Definition 10 (Analytic rank) The analytic rank
of a (limit) classical polynomial
of degree
is defined to be the quantity
From the properties of the Gowers norms we see that this quantity is non-negative, is zero if and only if is a polynomial of degree
, and is finite (or limit finite) for
. (For
, the analytic rank is infinite if
is non-constant and zero if
is constant.)
Exercise 19 Show that if
and
is a (limit) classical polynomial of degree
, then
.
Exercise 20 Show that if the analytic rank
of a limit classical polynomial
of degree
is unbounded, then
is equidistributed.
Exercise 21 Suppose we are in the high characteristic case
Using Theorem 3, show that a limit classical polynomial has bounded analytic rank if and only if it has bounded rank. (Hint: One direction follows from the preceding exercise. For the other direction, use the Taylor formula
.) This is a special case of the inverse conjecture for the Gowers norms, which we will discuss in more detail in later notes.
Conclude the following finitary version: if
is a classical polynomial of degree
on a finite-dimensoinal vector space
, and
, then
; conversely, if
, then
.
Exercise 22 Show that if
is a (limit) classical polynomial of degree
, then
and
for all
, and
and
for all (limit) classical polynomials
of degree
.
It is clear that the rank obeys the triangle inequality for all (limit) classical polynomials of degree
. There is an analogue for analytic rank, due to Gowers and Wolf:
Proposition 11 (Quasi-triangle inequality for analytic rank) Let
be (limit) classical polynomials of degree
. Then
.
Proof: Let be the
-linear form
(note that the right-hand side is independent of ); similarly define
By definition, we have
and
and thus
We make the substitution . Using the multilinearity of
, we can write the left-hand side as
where the are functions bounded in magnitude by
that are independent of the
variable. Eliminating all these factors by Cauchy-Schwarz as in the previous notes, we can bound the above expression by
which using the substitution and the multilinearity of
simplifies to
which by definition of analytic rank is
and the claim follows.
8 comments
Comments feed for this article
10 May, 2010 at 2:44 am
Mads
Professor Tau,
An typo has found it’s way into the hint to Exercise 17: “cfan” should be “can”.
[Corrected, thanks – T.]
20 May, 2010 at 9:44 pm
254B, Notes 5: The inverse conjecture for the Gowers norm I. The finite field case « What’s new
[…] closely tied to the space of polynomials of degree at most . Indeed, as noted in Exercise 20 of Notes 4, a function of norm has norm equal to if and only if for some ; thus polynomials solve the […]
26 May, 2010 at 8:50 pm
Pietro
Dear Terry, I may be misinterpreting the notation horribly, but I don’t think that the map x |–> x^i is a bijection on F_p for all 0<i<p-1. You need i to be prime to p-1 for that to be true. (Otherwise quadratic reciprocity would be too easy!)
Fortunately you don't need this to get the power sums to be zero, all you need is some nonzero b in F s.t. b^i is different from 1, which exists since the multiplicative group F_p\{0} is cyclic. (Or alternatively because the polynomial X^i – 1 has at most i roots.)
26 May, 2010 at 9:00 pm
Terence Tao
Oops, good point! (I had confused the proof of the vanishing of
with the proof of Fermat’s little theorem
that is based on the fact that
is a bijection.) The correct statement is that
is a covering map onto a nontrivial multiplicative subgroup, but it is indeed simpler to proceed by dilating by a non-i^th-root-of-unity b.
3 June, 2010 at 2:38 pm
Sungjin Kim
In exercise 6, Sum x^i should be summing over x in F, not i in F.
[Corrected, thanks – T.]
30 May, 2010 at 9:19 pm
mmailliw/william
I believe that your estimate for Exercise 14 doesn’t quite hold.
As a counterexample, we can let F = F_3, d = 2, and P = x^2. Then, because e(P(x)) = 1 for x = 0 and (-1 + i sqrt 3)/2 for x nonzero, the expected value of e(P(x)) is (sqrt 3)/3 i, whose absolute value is about .5773 (this also follows from Exercise 12, which gives 1/(sqrt 3) in this case); however, the statement on Exercise 14 would imply that if
|E(e(P(x))| > 1 – 2^(-d + 1) = 1 – 2^(-1) = .5 in this case, then P would have to be constant, which is a contradiction.
The induction proof almost works, though (I believe that the stumbling point is that |E_x (Delta_h P(x))| could equal 1 if h = 0).
30 May, 2010 at 11:01 pm
Terence Tao
Ah, right, I meant to have a lower bound on E |e(P(x)) – c| for some unit phase c (instead of a lower bound on |E e(P(x)) – c|, which was essentially what was written). The original claim is still true if one replaces
by something a bit weaker, of course. I’ve reworded the question accordingly.
3 June, 2010 at 9:20 am
Sungjin Kim
Dear Professor Tao,
In exercise 2, the answer holds when we have no restrictions on exponents i_1, …, i_n. However, by exercise 1, we have a restriction that all i_1, … i_n should be in {0,1, …, p-1}.
[Corrected, thanks – T.]