This is the final continuation of the online reading seminar of Zhang’s paper for the polymath8 project. (There are two other continuations: this previous post, which deals with the combinatorial aspects of the second part of Zhang’s paper, and this previous post, which covers the Type I and Type II sums.) The main purpose of this post is to present (and, hopefully, to improve upon) the treatment of the final and most innovative of the key estimates in Zhang’s paper, namely the Type III estimate.
The main estimate was already stated as Theorem 17 in the previous post, but we quickly recall the relevant definitions here. As in other posts, we always take to be a parameter going off to infinity, with the usual asymptotic notation associated to this parameter.
Definition 1 (Coefficient sequences) A coefficient sequence is a finitely supported sequence that obeys the bounds
for all , where is the divisor function.
- (i) If is a coefficient sequence and is a primitive residue class, the (signed) discrepancy of in the sequence is defined to be the quantity
- (ii) A coefficient sequence is said to be at scale for some if it is supported on an interval of the form .
- (iii) A coefficient sequence at scale is said to be smooth if it takes the form for some smooth function supported on obeying the derivative bounds
for all fixed (note that the implied constant in the notation may depend on ).
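As an informal illustration of part (i), the following Python sketch computes a discrepancy for a toy sequence. It assumes the standard formulation in which the discrepancy is the sum of the sequence over the residue class, minus 1/φ(q) times its sum over all integers coprime to the modulus q. The displayed formula is not reproduced above, so this reading, together with the names `discrepancy` and `alpha` and the dictionary representation of the sequence, is an assumption of the sketch rather than part of the definition.

```python
from fractions import Fraction
from math import gcd

def discrepancy(alpha, a, q):
    """Signed discrepancy of a finitely supported sequence in the class a mod q.

    alpha: dict mapping positive integers n to integer values alpha(n)
           (a hypothetical representation chosen for this sketch).
    Assumed formula:
        sum_{n = a mod q} alpha(n) - (1/phi(q)) * sum_{gcd(n,q)=1} alpha(n)
    """
    if gcd(a, q) != 1:
        raise ValueError("a must be a primitive residue class mod q")
    # Euler totient, computed naively; fine for toy moduli.
    phi = sum(1 for r in range(1, q + 1) if gcd(r, q) == 1)
    in_class = sum(v for n, v in alpha.items() if (n - a) % q == 0)
    coprime = sum(v for n, v in alpha.items() if gcd(n, q) == 1)
    return Fraction(in_class) - Fraction(coprime, phi)

# Toy example: the indicator function of {1, ..., 10}, in the class 1 mod 3.
alpha = {n: 1 for n in range(1, 11)}
print(discrepancy(alpha, 1, 3))
```

Exact rational arithmetic is used so that a perfectly equidistributed sequence gives a discrepancy of exactly zero rather than a rounding residue.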
For any , let denote the square-free numbers whose prime factors lie in . The main result of this post is then the following result of Zhang:
for some fixed . Let be coefficient sequences at scale respectively with smooth. Then for any we have
(This is very slightly stronger than previously claimed, in that the condition has been dropped.)
It turns out that Zhang does not exploit any averaging of the factor, and matters reduce to the following:
for some fixed . Let be smooth coefficient sequences at scales respectively. Then we have
for all and some fixed .
for all , where denotes a quantity that is independent of (but can depend on other quantities such as ). The left-hand side can be rewritten as
From Theorem 3 we have
It remains to establish Theorem 3. This is done by a set of tools similar to those used to control the Type I and Type II sums:
- (i) completion of sums;
- (ii) the Weil conjectures and bounds on Ramanujan sums;
- (iii) factorisation of smooth moduli ;
- (iv) the Cauchy-Schwarz and triangle inequalities (Weyl differencing).
The specifics are slightly different though. For the Type I and Type II sums, it was the classical Weil bound on Kloosterman sums that was the key source of power saving; Ramanujan sums only played a minor role, controlling a secondary error term. For the Type III sums, one needs a significantly deeper consequence of the Weil conjectures, namely the estimate of Bombieri and Birch on a three-dimensional variant of a Kloosterman sum. Furthermore, the Ramanujan sums – which are a rare example of sums that actually exhibit better than square root cancellation, thus going beyond even what the Weil conjectures can offer – make a crucial appearance, when combined with the factorisation of the smooth modulus (this new argument is arguably the most original and interesting contribution of Zhang).
— 1. A three-dimensional exponential sum —
The power savings in Zhang’s Type III argument come from good estimates on the three-dimensional exponential sum
defined for positive integer and (or ). The key estimate is
where is the greatest common divisor of (and we adopt the convention that ). (Here, the denotes a quantity that goes to zero as , rather than as .)
Note that the square root cancellation heuristic predicts as the size for , thus we can achieve better than square root cancellation if has a common factor with that is not shared with . This improvement over the square root heuristic, which is ultimately due to the presence of a Ramanujan sum inside this three-dimensional exponential sum in certain degenerate cases, is crucial to Zhang’s argument.
Proof: Suppose that factors as , thus are coprime. Then we have
(see Lemma 7 of this previous post). From this and the Chinese remainder theorem we see that factorises as
where . Dilating by , we conclude the multiplicative law
Iterating this law, we see that to prove Theorem 4 it suffices to do so in the case when is prime, or more precisely that
Making the change of variables , , this becomes
Performing the sums this becomes
where is the Ramanujan sum
Basic Fourier analysis tells us that equals when and when . The expression (6) then follows from direct computation.
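The evaluation of the Ramanujan sum at a prime modulus can be checked numerically. The sketch below assumes the standard definition of the Ramanujan sum as the sum of e(ah/q) over primitive residues h modulo q, for which the well-known values at a prime p are p-1 when p divides a and -1 otherwise. The function name `ramanujan_sum` is chosen for this illustration only.

```python
import cmath
from math import gcd

def ramanujan_sum(q, a):
    """Sum of exp(2*pi*i*a*h/q) over h in 1..q with gcd(h, q) = 1.

    The true value is an integer, so the floating-point sum is rounded.
    """
    total = sum(cmath.exp(2j * cmath.pi * a * h / q)
                for h in range(1, q + 1) if gcd(h, q) == 1)
    return round(total.real)

p = 7
for a in range(0, 2 * p + 1):
    expected = p - 1 if a % p == 0 else -1
    assert ramanujan_sum(p, a) == expected
print([ramanujan_sum(p, a) for a in range(p)])
```

When p does not divide a, the sum has size 1, far below the square root of p that a square root cancellation heuristic would suggest.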
Next, suppose that and . Making the change of variables , becomes
Performing the summation, this becomes
For each , the summation is a Kloosterman sum and is thus by the classical Weil bound (Theorem 8 from previous notes). This gives a net estimate of as desired. Similarly if .
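For readers who want to see the classical Weil bound in action, here is a small numerical sketch. It assumes the usual Kloosterman sum S(a,b;p), the sum of e((ax + b/x)/p) over nonzero x modulo p with 1/x the modular inverse, and the explicit form |S(a,b;p)| ≤ 2√p of the Weil bound when p does not divide ab. The function name `kloosterman` is hypothetical, and `pow(x, -1, p)` requires Python 3.8 or later.

```python
import cmath
from math import sqrt

def kloosterman(a, b, p):
    """Kloosterman sum S(a, b; p) for a prime p, computed by brute force."""
    total = 0
    for x in range(1, p):
        x_inv = pow(x, -1, p)            # inverse of x modulo p
        phase = (a * x + b * x_inv) % p  # reduce before dividing, for accuracy
        total += cmath.exp(2j * cmath.pi * phase / p)
    return total

p = 101
# Largest absolute value over a small box of nonzero parameters a, b.
worst = max(abs(kloosterman(a, b, p)) for a in range(1, 11) for b in range(1, 11))
print(worst, 2 * sqrt(p))
```

Summing the bound over the outer variable, as in the argument above, costs one further factor of the modulus.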
The only remaining case is when . Here one cannot proceed purely through Ramanujan and Weil bounds, and we need to invoke the deep result of Bombieri and Birch, proven in Theorem 1 of the appendix to this paper of Friedlander and Iwaniec. This bound can be proven by applying Deligne’s proof of the Weil conjectures to a certain L-function attached to the surface ; an elementary but somewhat lengthy second proof is also given in the above appendix.
To deal with factors such as , the following simple lemma will be useful.
As in the previous theorem, here denotes a quantity that goes to zero as , rather than as .
Note that it is important that the term is excluded from the first sum, otherwise one acquires an additional term. In particular,
we can bound
— 2. Cauchy-Schwarz —
We now prove Theorem 3. The reader may wish to track the exponents involved in the model regime
where is any fixed power of (e.g. , in which case can be slightly larger than ).
Let be as in Theorem 3, and let be a sufficiently small fixed quantity. It will suffice to show that
where does not depend on . We rewrite the left-hand side as
and then apply completion of sums (Lemma 6 from this previous post) to rewrite this expression as the sum of the main term
plus the error terms
where is any fixed quantity and
It will be convenient to reduce to the case when and are coprime. More precisely, it will suffice to prove the following claim:
for some fixed .
for any function of , so that (8) can be written as
which we expand as
In order to apply Proposition 6 we need to modify the , constraints. By Möbius inversion one has
We may discard those values of for which is less than one, as the summation is vacuous in that case. We then apply Proposition 6 with replaced by respectively and set equal to , and replaced by and . One can check that all the hypotheses of Proposition 6 are obeyed, so we may bound (12) by
which by the divisor bound is , which is acceptable (after shrinking slightly).
By shifting by for and then averaging, we may write the left-hand side of (14) as
Next, we combine the and summations into a single summation over . We first use a Taylor expansion and (15) to write
for any fixed . If is large enough, then the error term will be acceptable, so it suffices to establish (17) with replaced by for any fixed . We can rewrite
where is such that and
Thus we can estimate the left-hand side of (17) by
Here we have bounded by .
We will eliminate the expression via Cauchy-Schwarz. Observe from the smoothness of that
Note that implies . But from (10) we have , so in fact we have . Thus
From the divisor bound, we see that for each fixed there are choices for , thus
Comparing with the trivial bound of , our task is now to gain a factor of more than over the trivial bound.
We square out (19) as
If we shift by , then relabel by , and use the fact that , we can reduce this to
Next we perform another completion of sums, this time in the variables, to bound
Making the change of variables and and comparing with (5), we see that
Applying Theorem 4 (and recalling that ) we reduce to showing that
We now choose to be a factor of , thus
for some coprime to . We compute the sum on the left-hand side:
Lemma 7 We have
Proof: We first consider the contribution of the diagonal case . This term may be estimated by
The term gives , while the contribution of the non-zero is acceptable by Lemma 5.
For the non-diagonal case , we see from Lemma 5 that
since , we obtain a bound of from this case as required.
From (11) one has
and . As is -smooth, we can thus find with the desired properties by the greedy algorithm. (In view of Corollary 12 from this previous post, one could also have ensured that has no tiny factors, although this does not seem to be of much actual use in the Type III analysis.)
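The greedy step can be sketched as follows. The sketch illustrates only the general principle: if a squarefree number q has all prime factors at most y, and 1 ≤ R ≤ q, then multiplying primes together until the next one would overshoot R produces a divisor r of q with R/y < r ≤ R. The names `greedy_factor`, `primes` and `R` are hypothetical stand-ins, and the specific target size used in the argument above is not reproduced here.

```python
from math import prod

def greedy_factor(primes, R):
    """Return a divisor r of prod(primes) with R / max(primes) < r <= R.

    primes: distinct primes whose product q is assumed to satisfy q >= R >= 1.
    """
    r = 1
    for p in sorted(primes):
        if r * p > R:
            # Stopping here guarantees r > R / p >= R / max(primes).
            break
        r *= p
    return r

primes = [2, 3, 5, 7, 11, 13]   # q = 30030 is 13-smooth and squarefree
R = 100
r = greedy_factor(primes, R)
assert prod(primes) % r == 0
assert R / max(primes) < r <= R
print(r)
```

If the loop never breaks, then r equals q, which forces q = R under the stated assumption, so the conclusion still holds.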