As in previous posts, we use the following asymptotic notation: $x$ is a parameter going off to infinity, and all quantities may depend on $x$ unless explicitly declared to be “fixed”. The asymptotic notation $O()$, $o()$, $\ll$ is then defined relative to this parameter. A quantity $q$ is said to be of polynomial size if one has $q = O(x^{O(1)})$, and bounded if $q = O(1)$. We also write $X \lessapprox Y$ for $X \ll x^{o(1)} Y$, and $X \sim Y$ for $X \ll Y \ll X$.
The purpose of this post is to collect together all the various refinements to the second half of Zhang’s paper that have been obtained as part of the polymath8 project and present them as a coherent argument (though not fully self-contained, as we will need some lemmas from previous posts).
In order to state the main result, we need to recall some definitions.
Definition 1 (Singleton congruence class system) Let , and let denote the square-free numbers whose prime factors lie in . A singleton congruence class system on is a collection of primitive residue classes for each , obeying the Chinese remainder theorem property
whenever are coprime. We say that such a system has controlled multiplicity if the
for any fixed and any congruence class with . Here is the divisor function.
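To make the Chinese remainder theorem property concrete, here is a minimal Python sketch. It assumes the property says that the class attached to a product of two coprime moduli is the intersection of the classes attached to each factor; under that reading, a system is determined by its values at primes. The helper names `crt_pair` and `residue_for` and the sample residues are hypothetical and chosen only for illustration.

```python
from math import gcd

def crt_pair(a, q, b, r):
    """Return the unique c mod q*r with c = a (q) and c = b (r); q, r coprime."""
    if gcd(q, r) != 1:
        raise ValueError("moduli must be coprime")
    # Write c = a + q*t and solve q*t = b - a (mod r) for t.
    t = ((b - a) * pow(q, -1, r)) % r
    return (a + q * t) % (q * r)

def residue_for(primes, prime_residues):
    """Glue the classes a_p (p), for distinct primes p, into one class a_q (q)."""
    a, q = 0, 1
    for p in primes:
        a, q = crt_pair(a, q, prime_residues[p], p), q * p
    return a, q

# Hypothetical sample data: one primitive residue a_p for each prime p.
prime_residues = {3: 2, 5: 3, 7: 1}
a15, q15 = residue_for([3, 5], prime_residues)
a105, q105 = residue_for([3, 5, 7], prime_residues)
```

With this data the class modulo 105 reduces to the class modulo 15 and to the class modulo 7, which is the compatibility the definition asks for. The modular inverse via `pow(q, -1, r)` requires Python 3.8 or later.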
Next we need a relaxation of the concept of -smoothness.
Definition 2 (Dense divisibility) Let $y \geq 1$. A positive integer $q$ is said to be $y$-densely divisible if, for every $1 \leq R \leq q$, there exists a factor of $q$ in the interval $[y^{-1} R, R]$. We let $\mathcal{D}_y$ denote the set of $y$-densely divisible positive integers.
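As a quick illustration of this definition, the following Python sketch tests dense divisibility by brute force. It assumes the definition reads: for every real $R$ between $1$ and the integer, some factor lies in $[R/y, R]$. Under that assumption the condition is equivalent to every pair of consecutive divisors $d < d'$ satisfying $d' \leq y d$. The function names are hypothetical.

```python
def divisors(n):
    """Return the divisors of n in increasing order (trial division)."""
    small, large = [], []
    d = 1
    while d * d <= n:
        if n % d == 0:
            small.append(d)
            if d != n // d:
                large.append(n // d)
        d += 1
    return small + large[::-1]

def is_densely_divisible(n, y):
    """Check that consecutive divisors of n never differ by a factor above y."""
    ds = divisors(n)
    return all(b <= y * a for a, b in zip(ds, ds[1:]))
```

For example, 28 has divisors 1, 2, 4, 7, 14, 28 with consecutive ratios at most 2, so it passes the test with $y = 2$ even though it has the prime factor 7; by contrast 14 has the gap from 2 to 7 and fails for $y = 2$ and $y = 3$. This shows in a small case how dense divisibility relaxes smoothness.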
Now we present a strengthened version of the Motohashi-Pintz-Zhang conjecture , which depends on parameters and .
for any fixed , where is the von Mangoldt function.
The difference between this conjecture and the weaker conjecture is that the modulus is constrained to be -densely divisible rather than -smooth (note that is no longer constrained to lie in ). This relaxation of the smoothness condition improves the Goldston-Pintz-Yildirim type sieving needed to deduce from ; see this previous post.
The main result we will establish is
This improves upon previous constraints of (see this blog comment) and (see Theorem 13 of this previous post), which were also only established for instead of . Inserting Theorem 4 into the Pintz sieve from this previous post gives for (see this blog comment), which when inserted in turn into newly set up tables of narrow prime tuples gives infinitely many prime gaps of separation at most .
— 1. Reduction to Type I/II and Type III estimates —
Following Zhang, we can perform a combinatorial reduction to reduce Theorem 4 to two sub-estimates. To state this properly we need some more notation. We need a large fixed constant (that determines how finely we slice up the scales).
for all .
- (i) If is a coefficient sequence and is a primitive residue class, the (signed) discrepancy of in the sequence is defined to be the quantity
- (ii) A coefficient sequence is said to be at scale for some if it is supported on an interval of the form .
- (iii) A coefficient sequence at scale is said to obey the Siegel-Walfisz theorem if one has
for any , any fixed , and any primitive residue class .
- (iv) A coefficient sequence at scale is said to be smooth if it takes the form for some smooth function supported on obeying the derivative bounds
for all fixed (note that the implied constant in the notation may depend on ).
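The notion of discrepancy in (i) can be computed directly for small examples. The sketch below assumes the usual form of the signed discrepancy: the sum of the sequence over the residue class, minus $1/\phi(q)$ times the sum of the sequence over all integers coprime to $q$. That formula, the dictionary representation of the sequence, and the names `euler_phi` and `discrepancy` are assumptions made for illustration.

```python
from fractions import Fraction
from math import gcd

def euler_phi(q):
    """Count the residues in 1..q that are coprime to q."""
    return sum(1 for r in range(1, q + 1) if gcd(r, q) == 1)

def discrepancy(alpha, a, q):
    """Signed discrepancy of alpha (a dict n -> value) in the class a (q).

    Assumed form: sum over n = a (q), minus the average share of the
    sum over all n coprime to q.
    """
    in_class = sum(v for n, v in alpha.items() if (n - a) % q == 0)
    coprime_total = sum(v for n, v in alpha.items() if gcd(n, q) == 1)
    return Fraction(in_class) - Fraction(coprime_total) / euler_phi(q)

# Toy sequence: the indicator function of the integers 1..10.
alpha = {n: 1 for n in range(1, 11)}
d1 = discrepancy(alpha, 1, 3)
d2 = discrepancy(alpha, 2, 3)
```

In this toy case the two primitive classes modulo 3 have discrepancies of equal size and opposite sign, reflecting the fact that the discrepancies over all primitive classes of a fixed modulus sum to zero under the assumed definition.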
We can now state the two subestimates needed. The first controls sums of Type I or Type II:
with obeying a Siegel-Walfisz theorem. Then for any and any singleton congruence class system with controlled multiplicity we have
This improves upon Theorem 16 in this previous post, in which the modulus was required to be -smooth, and the constraints (9), (10), (11) were replaced by the stronger constraint , and (12) was similarly replaced by a stronger constraint . Of the three constraints (9), (10), (11), the second constraint (10) is more stringent in practice, while the constraint (12) is dominated by other constraints (such as (4)).
The second subestimate controls sums of Type III:
for some fixed . Let be coefficient sequences at scales respectively, with smooth. Then for any , and any singleton congruence class system we have
for any fixed .
This improves upon Theorem 17 in this previous post, in which the modulus was required to be -smooth, and the constraints (15), (16) were replaced by the stronger constraint . Of the two constraints (15), (16), the first constraint (15) is more stringent in practice.
Let us now recall the combinatorial argument (from this previous post) that allows one to deduce Theorem 4 from Theorems 6 and 7. As in Section 3 of this previous post, we let be a fixed integer ( will suffice). Using the Heath-Brown identity as discussed in that section, we reduce to establishing the bound
where , are quantities with the following properties:
- (i) Each is a coefficient sequence at scale . More generally the convolution of the for is a coefficient sequence at scale .
- (ii) If for some fixed , then is smooth.
- (iii) If for some fixed , then obeys a Siegel-Walfisz theorem. More generally, obeys a Siegel-Walfisz theorem if for some fixed .
- (iv) .
We can write for , where the are non-negative reals that sum to . We apply Lemma 6 from this previous post with some parameter
to be chosen later and conclude one of the following:
- (Type 0) There is a with .
- (Type I/II) There is a partition such that
- (Type III) There exist distinct with and .
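The trichotomy above can be explored by brute force on small examples. The sketch below is hedged: the thresholds are not visible in the text here, so it assumes the commonly stated form of the lemma, namely Type 0 when some $t_i \geq 1/2 + \sigma$; Type I/II when some partition has both partial sums strictly between $1/2 - \sigma$ and $1/2 + \sigma$; and Type III when three of the $t_i$ lie in $[2\sigma, 1/2 - \sigma]$ with all pairwise sums at least $1/2 + \sigma$. The function name `classify` is hypothetical.

```python
from fractions import Fraction
from itertools import combinations

def classify(t, sigma):
    """Classify exponents t (summing to 1) under the assumed thresholds.

    Returns "Type 0", "Type I/II", "Type III", or None if no case applies.
    """
    half = Fraction(1, 2)
    n = len(t)
    if any(ti >= half + sigma for ti in t):
        return "Type 0"
    # Type I/II: look for a subset S whose sum s and complementary sum 1 - s
    # satisfy 1/2 - sigma < s <= 1 - s < 1/2 + sigma.
    for size in range(1, n):
        for S in combinations(range(n), size):
            s = sum(t[i] for i in S)
            if half - sigma < s <= 1 - s < half + sigma:
                return "Type I/II"
    # Type III: triples are drawn in sorted order, so ti + tj is the
    # smallest pairwise sum and tk is the largest entry.
    for ti, tj, tk in combinations(sorted(t), 3):
        if 2 * sigma <= ti and tk <= half - sigma and ti + tj >= half + sigma:
            return "Type III"
    return None
```

Exact rational arithmetic is used so that borderline inequalities are not affected by floating-point rounding. The subset search is exponential in the number of exponents, which is harmless for the small fixed number of factors arising from the Heath-Brown identity.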
In the Type 0 case, we can write in a form in which Theorem 15 from this previous post applies. Similarly, in the Type I/II case we can write in a form in which Theorem 6 applies, provided that the conditions (9), (10), (11) are obeyed (the condition (12) is implied by (4)). Now suppose we are in the Type III case. For large enough (e.g. ), we see that for some fixed . Theorem 7 will then apply with
Since , we have
Clearly (22) is implied by (23), and (26) is implied by (27); restricting to the case (which follows from (4)), (28) is also implied by (26). Assuming (4), we also see that (23), (25) are implied by (24), so we reduce to establishing that
but this rearranges to (4).
— 2. Type I/II analysis —
for some sufficiently slowly decaying , since otherwise we may use the Bombieri-Vinogradov theorem (Theorem 4 from this previous post). Thus, by dyadic decomposition, we need to show that
for any in the range
be an exponent to be optimised later (in the Type I case, it will be infinitesimally close to zero, while in the Type II case, it will be infinitesimally larger than ).
By Lemma 11 of this previous post, we know that for all in outside of a small number of exceptions, we have
Specifically, the number of exceptions in the interval is for any fixed . The contribution of the exceptional can be shown to be acceptable by Cauchy-Schwarz and trivial estimates (see Section 5 of this previous post), so we restrict attention to those for which (31) holds. In particular, as is restricted to be -densely divisible we may factor
with coprime and square-free, with with , and
By dyadic decomposition, it thus suffices to show that
Fix . We abbreviate and by and respectively, thus our task is to show that
We now split the discrepancy
as the sum of the subdiscrepancies
In Section 5 of this previous post, it was established that
whenever are good singleton congruence class systems.
By duality and Cauchy-Schwarz exactly as in Section 5 of the previous post, it suffices to show that
where is subject to the same constraints as (thus and for ), and is some quantity that is independent of the choice of congruence classes , .
As in the previous notes, we can dispose of the case when share a common factor by using the controlled multiplicity hypothesis, provided we have the hypothesis
It remains to verify (40). Observe that must be coprime to and coprime to , with , to have a non-zero contribution to the sum. We then rearrange the left-hand side as
note that these inverses in the various rings , , are well-defined thanks to the coprimality hypotheses.
for some independent of the and .
Applying completion of sums (Section 2 from the previous post), we reduce to showing that
and we have dropped all hypotheses on other than magnitude, and we abbreviate as .
We now split into two cases, one which works when are not too close to , and one which works when are close to . Here is the Type I estimate:
hold for some fixed , then (43) holds for a sufficiently small fixed .
In practice the condition (47) is dominant.
Now we give the Type II estimate:
holds, then (43) holds for a sufficiently small fixed .
This result improves upon Theorem 14 from this previous post which had the stronger condition
In practice, (50) will not hold with the original value of in Theorem 6; instead, we only use Theorem 9 in the case excluded by Theorem 8, in which case we will be able to lower down to be and verify (50) in that case.
which means that the condition (39) is now weaker than (30) (for small enough) and may be omitted. By (9), (10), (11), we can simultaneously obey (30), (46), (47), (48) by setting sufficiently close to zero, and the claim now follows from Theorem 8.
From this we see that we may replace by in (14) and in all of the above analysis. If we set then the conditions (30), (39) are obeyed (again taking small enough). Theorem 9 will then give us what we want provided that
which is satisfied for small enough thanks to (12).
— 3. The Type I sum —
We now prove Theorem 8. It suffices to show that
for any bounded real coefficients . Performing the manipulations from Section 6 of this previous post, we reduce to showing that
for any .
Now we treat the non-diagonal case . The key estimate here is
Proof: From (45) we may of course assume that
Now for the new input that was not present in the previous Type I analysis. Applying Proposition 5(iii) from this previous post, and noting that are coprime, we can bound the left-hand side of (52) as
Since , , and
the claim follows.
Note from the divisor bound that for each choice of and , there are choices of such that . From this and Lemma 5 of this previous post we see that
so to conclude (51) we need to show that
— 4. The Type II sum —
As in the previous post, the diagonal case is acceptable provided that
We have the following analogue of Lemma 10:
This is an improved version of the estimate (48) from this previous post in which several inefficiencies in the second term on the right-hand side have been removed.
Proof: From (45) we may assume
and by the arguments from the previous post we may rewrite the left-hand side of (55) as
By Proposition 5(ii) of this previous post, we may then bound this quantity by
where , . We may bound
since divides but is coprime to , and the claim follows.
Arguing as in the previous section we have
and so the off-diagonal contribution to (54) is
To conclude (54) we thus need to show that
which can be rearranged as
respectively, and thus both follow from (50).
— 5. The Type III estimate —
for all with and all , and some sufficiently small fixed .
Fix . It suffices to show that
for some that does not depend on . Applying completion of sums, we can express the left-hand side as the main term
plus the error terms
and a tiny error
for any fixed , where
it will suffice to prove the following claim:
for some fixed , and let be smooth coefficient sequences at scale respectively. Then
if is sufficiently small.
for any function of , so that (57) can be written as
which we expand as
In order to apply Proposition 12 we need to modify the , constraints. By Möbius inversion one has
We may discard those values of for which is less than one, as the summation is vacuous in that case. We then apply Proposition 12 with replaced by respectively (but with unchanged), replaced with its restriction to values coprime to , and set equal to , replaced by , and replaced by and . One can check that all the hypotheses of Proposition 12 are obeyed (with (60) coming from (15), (61) coming from (16), and (62), (63) coming from (17)), so we may bound (64) by
which by the divisor bound is , which is acceptable.
and save this condition to be verified later.
By shifting by for and then averaging, we may write the left-hand side of (66) as
Next, we combine the and summations into a single summation over . We first use a Taylor expansion and (67) to write
for any fixed . If is large enough, then the error term will be acceptable, so it suffices to establish (69) with replaced by for any fixed . We can rewrite
where is such that and
Thus we can estimate the left-hand side of (69) by
Here we have bounded by .
We will eliminate the expression via Cauchy-Schwarz. Observe from the smoothness of that
Note that implies . But from (59) we have , so in fact we have for some . Thus
From the divisor bound, we see that for each fixed there are choices for , thus
We square out (71) as
If we shift by , then relabel by , and use the fact that , we can reduce this to
Next we perform another completion of sums, this time in the variables, to bound
(the prime is there to distinguish this quantity from ) and
Making the change of variables and , we see that
Applying the Bombieri-Birch bound (Theorem 4 from this previous post), and recalling that , we reduce to showing that
We may cross multiply and write
By the divisor bound, for each choice of and there are choices for and . Thus it suffices to show that
We now choose to be a factor of , thus
for some coprime to . We compute the sum on the left-hand side:
Lemma 13 If , then we have
Proof: We first consider the contribution of the diagonal case . This term may be estimated by
The term gives , while the contribution of the non-zero is also acceptable by Lemma 5 from this previous post.
For the non-diagonal case , we see from Lemma 5 from this previous post that
since , we obtain a bound of from this case as required.
We combine these constraints as
since will then have a factor in where , which works if (and if we can just take as the factor).
It remains to establish . But this bound can be rewritten as