As in previous posts, we use the following asymptotic notation: is a parameter going off to infinity, and all quantities may depend on unless explicitly declared to be “fixed”. The asymptotic notation is then defined relative to this parameter. A quantity is said to be of polynomial size if one has , and bounded if . We also write for , and for .
The purpose of this post is to collect together all the various refinements to the second half of Zhang’s paper that have been obtained as part of the polymath8 project and present them as a coherent argument. In order to state the main result, we need to recall some definitions. If is a bounded subset of , let denote the square-free numbers whose prime factors lie in , and let denote the product of the primes in . Note by the Chinese remainder theorem that the set of primitive congruence classes modulo can be identified with the tuples of primitive congruence classes of congruence classes modulo for each which obey the Chinese remainder theorem
for all coprime , since one can identify with the tuple for each .
If and is a natural number, we say that is -densely divisible if, for every , one can find a factor of in the interval . We say that is doubly -densely divisible if, for every , one can find a factor of in the interval such that is itself -densely divisible. We let denote the set of doubly -densely divisible natural numbers, and the set of -densely divisible numbers.
Given any finitely supported sequence and any primitive residue class , we define the discrepancy
for any fixed , any bounded , and any primitive , where is the von Mangoldt function. Importantly, we do not require or to be fixed, in particular could grow polynomially in , and could grow exponentially in , but the implied constant in (1) would still need to be fixed (so it has to be uniform in and ). (In previous formulations of these estimates, the system of congruence was also required to obey a controlled multiplicity hypothesis, but we no longer need this hypothesis in our arguments.) In this post we will record the proof of the following result, which is currently the best distribution result produced by the ongoing polymath8 project to optimise Zhang’s theorem on bounded gaps between primes:
This improves upon the previous constraint of (see this previous post), although that latter statement was stronger in that it only required single dense divisibility rather than double dense divisibility. However, thanks to the efficiency of the sieving step of our argument, the upgrade of the single dense divisibility hypothesis to double dense divisibility costs almost nothing with respect to the parameter (which, using this constraint, gives a value of as verified in these comments, which then implies a value of ).
This estimate is deduced from three sub-estimates, which require a bit more notation to state. We need a fixed quantity .
for all , where is the divisor function.
- (i) A coefficient sequence is said to be at scale for some if it is supported on an interval of the form .
- (ii) A coefficient sequence at scale is said to obey the Siegel-Walfisz theorem if one has
for any , any fixed , and any primitive residue class .
- (iii) A coefficient sequence at scale (relative to this choice of ) is said to be smooth if it takes the form for some smooth function supported on obeying the derivative bounds
for all fixed (note that the implied constant in the notation may depend on ).
Definition 3 (Type I, Type II, Type III estimates) Let , , and be fixed quantities. We let be an arbitrary bounded subset of , and a primitive congruence class.
- (i) We say that holds if, whenever are quantities with
for some fixed , and are coefficient sequences at scales respectively, with obeying a Siegel-Walfisz theorem, we have
- (ii) We say that holds if the conclusion (7) of holds under the same hypotheses as before, except that (6) is replaced with
for some sufficiently small fixed .
- (iii) We say that holds if, whenever are quantities with
Theorem 1 is then a consequence of the following four statements.
In particular, if
then all values of that are sufficiently close to are admissible.
The proofs of Theorems 4, 5, 6 will be given below the fold, while the proof of Lemma 7 follows from the arguments in this previous post. We remark that in our current arguments, the double dense divisibility is only fully used in the Type I estimates; the Type II and Type III estimates are also valid just with single dense divisibility.
Remark 1 Theorem 6 is vacuously true for , as the condition (10) cannot be satisfied in this case. If we use this trivial case of Theorem 6, while keeping the full strength of Theorems 4 and 5, we obtain Theorem 1 in the regime
— 1. Exponential sum estimates —
It will be convenient to introduce a little bit of formal algebraic notation. Define an integral rational function to be a formal rational function in a formal indeterminate where are polynomials with integer coefficients, and is monic; in particular any polynomial can be identified with the integral rational function . For minor technical reasons we do not equate integral rational functions under cancelling, thus for instance we consider to be distinct from ; we need to do this because the domain of definition of these two functions is a little different (the former is not defined when , but the latter can still be defined here). Because we refuse to cancel, we have to be a little careful how we define algebraic operations: specifically, we define
Note that the denominator always remains monic with respect to these operations. This is not quite a ring with a derivation (the subtraction operation does not quite cancel the addition operation due to the inability to cancel) but this will not bother us in practice. (On the other hand, addition and multiplication remain associative, and the latter continues to distribute over the former, and differentiation obeys the usual sum and product rules.) Note that if is an integral rational function, we can localise it modulo for any modulus to obtain a rational function that is the ratio of two polynomials in , with the denominator monic and hence non-vanishing. We can define the algebraic operations of addition, subtraction, multiplication, and differentiation on integral rational functions modulo by the same formulae as above, and we observe that these operations are completely compatible with their counterparts over (even without the ability to cancel), thus for instance . We say that is divisible by , and write , if the numerator of has all coefficients divisible by .
If is an integral rational function and , then is well defined as an element of except when is a zero divisor in . We adopt the convention that when is a zero divisor in , thus is really shorthand for ; by abuse of notation we view both as a function on and as a -periodic function on . Thus for instance
Note that if , then for all for which is well defined. We define to be the largest factor of for which ; in particular, if is square-free, we have
Note with these conventions that .
We recall the following Chinese remainder theorem:
for all integers , where is the inverse of in and similarly for .
When there is no chance of confusion we will write for (though note that does not qualify as an integral rational function since the constant is not monic).
Proof: See Lemma 7 of this previous post.
Now we give an estimate for complete exponential sums, which combines both Ramanujan sum bounds with Weil conjecture bounds.
Proof: See Proposition 4 of this previous post.
Proposition 10 (Incomplete sums) Let be a positive square-free integer of polynomial size, and let be an integral rational function with of bounded degree with . Let be a fixed integer, and suppose that we have a factorisation . Then for any , and any coefficient sequence at scale of polynomial size, one has
where and .
Proof: Let be a sufficiently large fixed quantity depending on the degrees of and . We first make the technical reduction that it suffices to establish the claim in the case when has no prime factors less than , for otherwise one can factor where is the product of all the prime factors of less than , and by splitting the summation into residue classes and performing the substitution and applying the proposition with replaced by (and adjusting accordingly) we obtain the claim.
By dividing through by (and replacing with ) we may assume without loss of generality that and . As vanishes at infinity, this implies that and (see Lemma 3 from this previous post).
We induct on . We begin with the base case , where the task is to show that
By Proposition 9, we have
so we may delete the condition on the right-hand side without penalty.
By completion of sums (see Lemma 6 of this previous post), the left-hand side is
so it will suffice to show that
By Proposition 9 again, the left-hand side is
Since , we see that , and the claim follows.
and the claim then follows by the induction hypothesis (concatenating and ). Similarly, if , then , and the claim follows from the triangle inequality. Thus we may assume that
Let . We can rewrite as
By Lemma 8 we have , and so by the triangle inequality and the Cauchy-Schwarz inequality
since the summand is only non-zero when is supported on an interval of length . This last expression may be rearranged as
We observe that is an integral rational function whose numerator has lower degree than the denominator. If a prime dividing also divides this rational function, then is divisible by ; if is not divisible by , this implies by telescoping series that for all . This implies that is constant where it is defined; as vanishes at infinity and is defined outside of elements, this implies that (here we use the fact that must exceed , since it divides ). We conclude that
Applying the induction hypothesis and Proposition 9, we may thus bound
The contribution of the first two terms to (14) is acceptable, so the only contribution remaining to control is
If we bound by , we can bound this expression by
which by Lemma 5 of this previous post and the bound is bounded by
which is acceptable.
We record a special case of the above proposition:
Corollary 11 Let be square-free numbers (not necessarily coprime) of polynomial size, let be integers, let , and let be a coefficient sequence at scale . Suppose that is -densely divisible. Let be a residue class with . Then
where for . We also have the variant bound
Proof: We first consider the case , so that the congruence condition can be deleted. By the dense divisibility hypothesis we may factor for some
The first bound then follows from the case of Proposition 10, combined with the bound
Now we consider the case when . Writing , , and , we have from Lemma 8 that
If we then apply the previous results with replaced by (with being -densely divisible) and replaced by (and with suitable alterations to ), we obtain the required claims.
for square-free and .)
Lemma 12 (Correlation of hyper-Kloosterman sums) Let be square-free numbers of polynomial size with . Let , , and . Then
Proof: From Lemma 8 we have
and so it suffices to prove the estimates
By further application of Lemma 8, together with the divisor bound, it suffices to show that
whenever , , and .
Suppose first that , so that . Then , and the left-hand side simplifies to
which can be expanded as
Performing the Fourier summation in , this can be bounded by the sum of
The first term either vanishes or is a Kloosterman sum, and is in either case, while the second term can be calculated by Fourier series to be , and the claim follows. Similarly if .
Now suppose that and . Then we use the Deligne bound to obtain the desired claim. The only remaining case is when and or , so our task is to show that
in this case. A proof of this claim (which uses the full strength of Deligne’s work) can be found in this paper of Michel (see also Proposition 6 of this recent expository note of Fouvry, Kowalski, and Michel).
From this and completion and sums we have
where , and the sum runs over those coprime to .
— 2. Type I estimate —
for some sufficiently slowly decaying , since otherwise we may use the Bombieri-Vinogradov theorem (Theorem 4 from this previous post). Thus, by dyadic decomposition, we need to show that
for any fixed and for any in the range
be a sufficiently small fixed exponent.
By Lemma 11 of this previous post, we know that for all in outside of a small number of exceptions, we have
Specifically, the number of exceptions in the interval is for any fixed . The contribution of the exceptional can be shown to be acceptable by Cauchy-Schwarz and trivial estimates (see Section 5 of this previous post), so we restrict attention to those for which (18) holds. In particular, as is restricted to be doubly -densely divisible we may factor
with coprime and square-free, with -densely divisible with , and
Here we use the easily verified fact that . Since is -densely divisible, we also have .
By dyadic decomposition, it thus sufices to show that
Fix . We abbreviate and by and respectively, thus our task is to show that
We now split the discrepancy
as the sum of the subdiscrepancies
In Section 5 of this previous post, it was established (using the Bombieri-Vinogradov theorem) that
for each individual with , which we now select. It will suffice to prove the slightly stronger statement
for all coprime to , since if one then specialises to the case when and averages over all primitive we obtain (23) from the triangle inequality.
We use the dispersion method. We write the left-hand side of (24) as
for some bounded sequence (which may also depend on , but we suppress this dependence). This expression may be rearranged as
where is subject to the same constraints as (thus and for ), and is some quantity that is independent of .
Observe that must be coprime to and coprime to , with , to have a non-zero contribution to (26). We then rearrange the left-hand side as
note that these inverses in the various rings , , are well-defined thanks to the coprimality hypotheses.
for some independent of , .
At this stage in previous posts we isolated the coprime case as the dominant case, using a controlled multiplicity hypothesis to deal with the non-coprime case. Here, we will carry the non-coprime case with us for a little longer so as not to rely on a controlled multiplicity hypothesis; this introduces some additional factors of into the analysis but they should be ignored on a first reading.
Let us first deal with the main term (29). The contribution of the coprime case does not depend on and can thus be absorbed into the term. Now we consider the contribution of the non-coprime case when . We may estimate the contribution of this case by
We may estimate by . We just estimate the contribution of , as the other case is treated similarly (after shifting by ). We rearrange this contribution as
The summation is . Evaluating the and summations, we obtain a bound of
Since and , we have , and so we may evaluate the summation as
It remains to control (30). We may assume that , as the claim is trivial otherwise. It will suffice to obtain the bound
Using (31), it will suffice to show that
for each .
for each .
Henceforth we work with a single choice of . We pause to verify the relationship
As is -densely divisible, we may now factor where
Factoring out , we may then write where
By dyadic decomposition, it thus suffices to show that
whenever are such that
We rearrange this estimate as
for some bounded sequence which is only non-zero when
By Cauchy-Schwarz and crude estimates, it then suffices to show that
The contribution of the diagonal case is by the divisor bound, which is acceptable since . Thus it suffices to control the off-diagonal case . Observe that for a given choice of , the phase either vanishes identically, or is equal to
for some quantities with
Also, by construction, , and are -densely divisible, so is as well. (Here we use the fact that the least common multiple of two -densely divisible numbers is again -densely divisible, which follows from the more general fact that if , , and is -densely divisible, then is also.) The condition is either not satisfiable, or restricts to a congruence class for some dividing . We can thus apply Corollary 11 and bound
Bounding by , we can thus bound the off-diagonal contribution to (34) by
which sums (using Lemma 5 of this previous post and the divisor bound) to
Discarding some factors of , we reduce to showing that
From (20), this will be implied by
or equivalently that
which by (6) is obeyed whenever
The second condition is implied by the first and may be deleted. The proof of Theorem 4 is now complete.
— 3. Type II estimate —
Now we prove Theorem 5. We repeat the Type I arguments through to (33) (noting that the hypothesis (6) is never used until that point, other than to ensure that ), thus we are again faced with the task of proving
Indeed by (31) this is equivalent to
With this weaker bound (35) we have to perform Cauchy-Schwarz differently. We rearrange the left-hand side as
for some bounded coefficients . Applying Cauchy-Schwarz, it then suffices to show that
The left-hand side may be bounded by
We isolate the diagonal case . By the divisor bound, the contribution of this case is , which is acceptable by (35). So we now restrict attention to the off-diagonal case . The phase either vanishes identically, or takes the form
for some with . By the second part of Corollary 11 we may thus bound the previous expression by
By the divisor bound and Lemma 5 of this previous post, this sums to
Discarding some factors of , it suffices to show that
From (20), this will be implied by
or equivalently that
which by (8) is obeyed whenever
The second condition is implied by the first and may be deleted. The proof of Theorem 5 is now complete.
— 4. Type III estimate —
We now prove Theorem 6. Let be as in the definition of . We will not need the full strength of double dense divisibility here, and work instead with single dense divisibility. By a finer-than-dyadic decomposition (and using the Bombieri-Vinogradov theorem to handle small moduli), it suffices to show that
Henceforth we work with a single choice of , and abbreviate the summation as . The left-hand side may then be written as
for some bounded sequence . So it suffices to show that
for some that is independent of , as the claim then follows by averaging in .
Note that for one has
By Fourier inversion we have
for all , where is the Fourier transform
From the smoothness of , Poisson summation, and integration by parts we have the decay estimates
for any fixed and any . More generally, we also have the derivative estimates
for any fixed and any . We thus have
(say) when , where
Furthermore, for , we can perform a Taylor expansion around and conclude that
for some fixed (depending on ), any , and some coefficients whose exact value will not be of importance to us. We may thus express (37), up to negligible errors, as the sum of a bounded number of expressions of the form
for some bounded sequences whose exact value will not be of importance to us other than their support, which is contained in the sets
respectively, and where
If we then introduce the modified hyper-Kloosterman sum
defined for and , then our objective is now to show that
for some that does not depend on , where is the reciprocal of in and .
We may rewrite as . Observe that is independent of if one of vanishes (as can be seen by dilating ), and similarly if or vanishes. Thus we may delete the case from the above sum, and reduce to showing that
At this point we need to account for a technical problem that the may still share a common factor with even after being restricted to be non-zero. For , let be the product of all the primes in (counting multiplicity) that also divide ; thus where is coprime to . As we shall see, the case is dominant, and on a first reading one may wish to focus exclusively on this case in what follows to simplify the discussion. We then write ; this divides , so we may write . Note that as is -densely divisible, is -densely divisible, thus .
Now we factor . From Lemma 8 we see that
For the second term, we observe that is coprime to for , and so by dilating the variables we have
where is the residue class
and we recall that is the normalised hyper-Kloosterman sum
As for the first term, we have the following estimate:
Lemma 14 We have
where (thus for instance ).
Proof: By further applications of Lemma 8 it suffices to show that
whenever is prime, with , and .
Without loss of generality we may assume that , then we may rewrite as
But this factors as the product of two Ramanujan sums divided by , and the claim then follows by direct computation.
For brevity we write for . We may thus bound the left-hand side of (38) by
where the summations are over the ranges
Writing , so that
and recalling that , we may thus estimate the previous expression by
where is the quantity
where is the third divisor function and
We now focus on estimating .
be a quantity to optimise in later, where
Observe that every that appears in the expression for is -densely divisible and may thus be factored as for some coprime with
with . Thus we may write
From crude estimates we have
and is a smooth cutoff function supported on the interval which equals one on .
Now we estimate . We can expand this expression as
We first dispose of the diagonal case . Here we use the Deligne bound to bound this case in magnitude by
By the divisor bound, for each there are choices for , so this expression is
which sums to .
Applying Lemma 13 for the off-diagonal case , we thus have
Using the bound , this becomes
By Lemma 5 from this previous post we have
and hence also
Similarly, we have
for all , so on summing over all we have
We thus have
Writing , and , we thus have
Performing the summation, this becomes
and then performing the summation we obtain
The net power of here is always at most , so the term in the summation dominates:
To optimise this in , we select
(The quantity comes from equating and .) By construction, we have the second inequality in (40). We also claim the first inequality, since this is equivalent to
which would follow if
Inserting this value of (using for the first two terms in (43)), we conclude that
One should view the first term here as the main term. By (42), we conclude that
Since , , and , we thus have
From Euler products we see that
and so to prove (39) it will suffice to show that
We can rewrite these conditions as upper bounds on :
As and , we can rewrite these conditions as upper bounds on :
Since and , these conditions become
which we may rearrange as