This post is a continuation of the previous post on sieve theory, which is an ongoing part of the Polymath8 project to improve the various parameters in Zhang’s proof that bounded gaps between primes occur infinitely often. Given that the comments on that page are getting quite lengthy, this is also a good opportunity to “roll over” that thread.
We will continue the notation from the previous post, including the concept of an admissible tuple, the use of an asymptotic parameter $x$ going to infinity, and a quantity $w$ depending on $x$ that goes to infinity sufficiently slowly with $x$, and $W := \prod_{p<w} p$ (the $W$-trick).
The objective of this portion of the Polymath8 project is to make as efficient as possible the connection between two types of results, which we call $DHL[k_0,2]$ and $MPZ[\varpi,\delta]$. Let us first state $DHL[k_0,2]$, which has an integer parameter $k_0 \geq 2$:
Conjecture 1 ($DHL[k_0,2]$) Let $\mathcal{H}$ be a fixed admissible $k_0$-tuple. Then there are infinitely many translates $n + \mathcal{H}$ of $\mathcal{H}$ which contain at least two primes.
Zhang was the first to prove a result of this type, with $k_0 = 3{,}500{,}000$. Since then the value of $k_0$ has been lowered substantially; at the time of writing, the current record is .
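To make the notion of admissibility concrete, here is a minimal Python sketch. It assumes the standard definition from the previous post (a tuple is admissible if, for every prime $p$, it avoids at least one residue class modulo $p$); the function names `primes_up_to` and `is_admissible` are hypothetical and chosen only for this illustration.

```python
def primes_up_to(n):
    """Return all primes <= n using a simple sieve of Eratosthenes."""
    if n < 2:
        return []
    sieve = [True] * (n + 1)
    sieve[0] = sieve[1] = False
    for p in range(2, int(n ** 0.5) + 1):
        if sieve[p]:
            for m in range(p * p, n + 1, p):
                sieve[m] = False
    return [p for p, is_p in enumerate(sieve) if is_p]


def is_admissible(tuple_h):
    """True if tuple_h avoids at least one residue class mod p for every prime p."""
    k = len(set(tuple_h))
    # A set of k integers can only cover all p residue classes when p <= k,
    # so it suffices to test the primes up to k.
    for p in primes_up_to(k):
        if len({h % p for h in tuple_h}) == p:
            return False
    return True


print(is_admissible((0, 2, 6)))   # a prime-triplet pattern
print(is_admissible((0, 2, 4)))   # covers every class mod 3
```

Only primes up to the size of the tuple need to be tested, which is why the check is cheap even for the large values of $k_0$ that occur in this project.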
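There are two basic ways known currently to attain this conjecture. The first is to use the Elliott-Halberstam conjecture $EH[\theta]$ for some $1/2 < \theta < 1$:
Conjecture 2 ($EH[\theta]$) One has
$$\sum_{1 \leq q \leq x^\theta} \sup_{a \in (\mathbf{Z}/q\mathbf{Z})^\times} \left| \sum_{n < x: n = a\ (q)} \Lambda(n) - \frac{1}{\phi(q)} \sum_{n < x} \Lambda(n) \right| = O\left( \frac{x}{\log^A x} \right) \ \ \ \ \ (1)$$
for all fixed $A > 0$. Here we use the abbreviation $n = a\ (q)$ for $n = a \hbox{ mod } q$.
Here of course $\Lambda$ is the von Mangoldt function and $\phi$ the Euler totient function. It is conjectured that $EH[\theta]$ holds for all $0 < \theta < 1$, but this is currently only known for $0 < \theta < 1/2$, an important result known as the Bombieri-Vinogradov theorem.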
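In a breakthrough paper, Goldston, Yildirim, and Pintz established an implication of the form
$$EH[\theta] \implies DHL[k_0,2]$$
for any $1/2 < \theta < 1$, where $k_0 = k_0(\theta)$ depends on $\theta$. This deduction was very recently optimised by Farkas, Pintz, and Revesz and also independently in the comments to the previous blog post, leading to the following implication:
Theorem 3 (EH implies DHL) Let $1/2 < \theta < 1$ be a real number, and let $k_0 \geq 2$ be an integer obeying the inequality
$$2\theta > \frac{j_{k_0-2}^2}{k_0(k_0-1)} \ \ \ \ \ (2)$$
where $j_n$ is the first positive zero of the Bessel function $J_n(x)$. Then $EH[\theta]$ implies $DHL[k_0,2]$.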
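As a rough numerical illustration of how a condition of this Bessel type constrains $k_0$, here is a minimal Python sketch. It assumes that the inequality (2) reads $2\theta > j_{k_0-2}^2/(k_0(k_0-1))$, and it computes $J_n$ from the classical integral representation $J_n(x) = \frac{1}{\pi}\int_0^\pi \cos(nt - x \sin t)\,dt$ rather than from any library routine. The helper names (`bessel_j`, `first_bessel_zero`, `smallest_k0`) and the search cap `max_k0` are hypothetical.

```python
import math


def bessel_j(n, x):
    """J_n(x) for integer n >= 0 via the trapezoid rule on Bessel's integral."""
    steps = 4 * (n + int(abs(x))) + 64  # enough nodes to resolve the oscillation
    h = math.pi / steps
    # Endpoint values of cos(n*t - x*sin(t)) at t = 0 and t = pi.
    total = 0.5 * (1.0 + math.cos(n * math.pi))
    for i in range(1, steps):
        t = i * h
        total += math.cos(n * t - x * math.sin(t))
    return total * h / math.pi


def first_bessel_zero(n, step=0.1, tol=1e-10):
    """First positive zero j_n of J_n, by scanning for a sign change, then bisecting."""
    lo = float(n) if n > 0 else step  # J_n is positive on (0, j_n), and j_n > n
    hi = lo + step
    while bessel_j(n, hi) > 0:
        lo, hi = hi, hi + step
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if bessel_j(n, mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def smallest_k0(theta, max_k0=100):
    """Smallest k0 >= 2 with 2*theta > j_{k0-2}^2 / (k0*(k0-1)), or None if not found."""
    for k0 in range(2, max_k0 + 1):
        j = first_bessel_zero(k0 - 2)
        if 2 * theta > j * j / (k0 * (k0 - 1)):
            return k0
    return None


print(first_bessel_zero(0))
print(smallest_k0(0.99))
```

For $\theta$ close to $1/2$ the admissible $k_0$ grows quickly, so the cap `max_k0` would need to be raised (and a more robust Bessel routine used) to explore that regime.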
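Implications of the form Theorem 3 were modified by Motohashi and Pintz, which in our notation replaces $EH[\theta]$ by an easier conjecture $MPZ[\varpi,\delta]$ for some $0 < \varpi < 1/4$ and $0 < \delta < 1/4 + \varpi$, at the cost of degrading the sufficient condition (2) slightly. In our notation, this conjecture takes the following form for each choice of parameters $\varpi, \delta$:
Conjecture 4 ($MPZ[\varpi,\delta]$) Let $\mathcal{H}$ be a fixed $k_0$-tuple (not necessarily admissible) for some fixed $k_0 \geq 2$, and let $b\ (W)$ be a primitive residue class. Then
$$\sum_{q \in \mathcal{S}_I: q < x^{1/2+2\varpi}} \sum_{a \in C(q)} |\Delta_{b,W}(\Lambda; q, a)| = O( x \log^{-A} x )$$
for any fixed $A > 0$, where $I = (w, x^\delta)$, $\mathcal{S}_I$ is the set of square-free integers whose prime factors lie in $I$, and $\Delta_{b,W}(\Lambda; q, a)$ is the quantity
$$\Delta_{b,W}(\Lambda; q, a) := \left| \sum_{x \leq n \leq 2x: n = b\ (W); n = a\ (q)} \Lambda(n) - \frac{1}{\phi(q)} \sum_{x \leq n \leq 2x: n = b\ (W)} \Lambda(n) \right|,$$
and $C(q)$ is the set of congruence classes
$$C(q) := \{ a \in (\mathbf{Z}/q\mathbf{Z})^\times : P(a) = 0 \}$$
and $P$ is the polynomial
$$P(a) := \prod_{h \in \mathcal{H}} (a + h).$$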
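This is a weakened version of the Elliott-Halberstam conjecture:
Proposition 5 (EH implies MPZ) Let $0 < \varpi < 1/4$ and $0 < \delta < 1/4 + \varpi$. Then $EH[1/2+2\varpi+\varepsilon]$ implies $MPZ[\varpi,\delta]$ for any $\varepsilon > 0$.
In particular, since $EH[\theta]$ is conjecturally true for all $0 < \theta < 1$, we conjecture $MPZ[\varpi,\delta]$ to be true for all $0 < \varpi < 1/4$ and $0 < \delta < 1/4 + \varpi$.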
then the hypothesis (applied to and and then subtracting) tells us that
for any fixed . From the Chinese remainder theorem and the Siegel-Walfisz theorem we have
for any coprime to (and in particular for ). Since , where is the number of prime divisors of , we can thus bound the left-hand side of (3) by
The contribution of the second term is by standard estimates (see Lemma 8 below). Using the very crude bound
and standard estimates we also have
and the claim now follows from the Cauchy-Schwarz inequality.
In practice, the conjecture $MPZ[\varpi,\delta]$ is easier to prove than $EH[1/2+2\varpi+\varepsilon]$ due to the restriction of the residue classes $a$ to $C(q)$, and also the restriction of the modulus $q$ to $x^\delta$-smooth numbers. Zhang proved $MPZ[\varpi,\varpi]$ for any $0 < \varpi < 1/1168$. More recently, our Polymath8 group has analysed Zhang’s argument (using in part a corrected version of the analysis of a recent preprint of Pintz) to obtain $MPZ[\varpi,\delta]$ whenever $\varpi, \delta > 0$ are such that
The work of Motohashi and Pintz, and later Zhang, implicitly describes arguments that allow one to deduce $DHL[k_0,2]$ from $MPZ[\varpi,\delta]$ provided that $k_0$ is sufficiently large depending on $\varpi, \delta$. The best implication of this sort that we have been able to verify thus far is the following result, established in the previous post:
where is the quantity
Then $MPZ[\varpi,\delta]$ implies $DHL[k_0,2]$.
This complicated version of is roughly of size . It is unlikely to be optimal; the work of Motohashi-Pintz and Pintz suggests that it can essentially be improved to , but currently we are unable to verify this claim. One of the aims of this post is to encourage further discussion as to how to improve the term in results such as Theorem 6.
We remark that as (5) is an open condition, it is unaffected by infinitesimal modifications to $\varpi, \delta$, and so we do not ascribe much importance to such modifications (e.g. replacing $\varpi$ by $\varpi - \varepsilon$ for some arbitrarily small $\varepsilon > 0$).
The known deductions of $DHL[k_0,2]$ from claims such as $EH[\theta]$ or $MPZ[\varpi,\delta]$ rely on the following elementary observation of Goldston, Pintz, and Yildirim (essentially a weighted pigeonhole principle), which we have placed in “$W$-tricked form”:
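Lemma 7 (Criterion for DHL) Let $k_0 \geq 2$. Suppose that for each fixed admissible $k_0$-tuple $\mathcal{H}$ and each congruence class $b\ (W)$ such that $b+h$ is coprime to $W$ for all $h \in \mathcal{H}$, one can find a non-negative weight function $\nu: \mathbf{N} \to \mathbf{R}^+$, fixed quantities $\alpha, \beta > 0$, a quantity $B > 0$, and a fixed positive power $R$ of $x$ such that one has the upper bound
$$\sum_{x \leq n \leq 2x: n = b\ (W)} \nu(n) \leq (\alpha + o(1)) B \frac{x}{W}, \ \ \ \ \ (6)$$
the lower bound
$$\sum_{x \leq n \leq 2x: n = b\ (W)} \nu(n) \theta(n + h_i) \geq (\beta - o(1)) B \frac{x}{W} \log R \ \ \ \ \ (7)$$
for all $h_i \in \mathcal{H}$, and the key inequality
$$\frac{\log R}{\log x} > \frac{1}{k_0} \frac{\alpha}{\beta} \ \ \ \ \ (8)$$
holds. Then $DHL[k_0,2]$ holds. Here $\theta(n)$ is defined to equal $\log n$ when $n$ is prime and $0$ otherwise.
Proof: Consider the quantity
$$\sum_{x \leq n \leq 2x: n = b\ (W)} \nu(n) \left( \sum_{h \in \mathcal{H}} \theta(n+h) - \log(3x) \right). \ \ \ \ \ (9)$$
By (6), (7), this quantity is at least
$$k_0 (\beta - o(1)) B \frac{x}{W} \log R - (\alpha + o(1)) B \frac{x}{W} \log(3x).$$
By (8), this expression is positive for all sufficiently large $x$. On the other hand, (9) can only be positive if at least one summand is positive, which can only happen when $n + \mathcal{H}$ contains at least two primes for some $x \leq n \leq 2x$ with $n = b\ (W)$. Letting $x \to \infty$ we obtain $DHL[k_0,2]$ as claimed.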
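The pigeonhole step at the end of this proof can be checked numerically on a toy example. The Python sketch below assumes the summand has the shape $\sum_{h} \theta(n+h) - \log(3x)$, with $\theta(n) = \log n$ for prime $n$ and $0$ otherwise. It ignores the weight and the congruence condition, since a non-negative weight does not change the sign of a summand, and it requires every shift $h$ to be smaller than $x$. The names `is_prime`, `theta` and `positive_summands` are hypothetical.

```python
import math


def is_prime(n):
    """Trial-division primality test, adequate for small n."""
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2
    d = 3
    while d * d <= n:
        if n % d == 0:
            return False
        d += 2
    return True


def theta(n):
    """theta(n) = log n if n is prime, and 0 otherwise."""
    return math.log(n) if is_prime(n) else 0.0


def positive_summands(x, tuple_h):
    """List (n, number of primes in n + H) for each n in [x, 2x] with positive summand."""
    found = []
    for n in range(x, 2 * x + 1):
        summand = sum(theta(n + h) for h in tuple_h) - math.log(3 * x)
        if summand > 0:
            prime_count = sum(1 for h in tuple_h if is_prime(n + h))
            found.append((n, prime_count))
    return found


hits = positive_summands(1000, (0, 2, 6))
# Every positive summand should come from a translate containing >= 2 primes.
print(all(count >= 2 for _, count in hits), len(hits) > 0)
```

The point is simply that a single prime $n+h \leq 2x + h < 3x$ contributes less than $\log(3x)$, so one prime alone can never make a summand positive.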
In practice, the quantity $R$ (referred to as the sieve level) is a power of $x$ such as $x^{\theta/2}$ or $x^{1/4+\varpi}$, and reflects the strength of the distribution hypothesis $EH[\theta]$ or $MPZ[\varpi,\delta]$ that is available; the quantity $R$ will also be a key parameter in the definition of the sieve weight $\nu$. The factor $B$ reflects the order of magnitude of the expected density of $\nu$ in the residue class $b\ (W)$; it could be absorbed into the sieve weight $\nu$ by dividing that weight by $B$, but it is convenient to not enforce such a normalisation so as not to clutter up the formulae. In practice, $B$ will be some combination of $\frac{\phi(W)}{W}$ and $\log R$.
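Once one has decided to rely on Lemma 7, the next main task is to select a good weight $\nu$ for which the ratio $\alpha/\beta$ is as small as possible (and for which the sieve level $R$ is as large as possible). To ensure non-negativity, we use the Selberg sieve
$$\nu = \lambda^2,$$
where $\lambda = \lambda(n)$ takes the form
$$\lambda(n) = \sum_{d \in \mathcal{S}_I: d | P(n)} \lambda_d$$
for some weights $\lambda_d \in \mathbf{R}$ vanishing for $d > R$ that are to be chosen, where $I \subset (w, +\infty)$ is an interval and $P$ is the polynomial $P(n) := \prod_{h \in \mathcal{H}} (n+h)$. If the distribution hypothesis is $EH[\theta]$, one takes $R := x^{\theta/2}$ and $I := (w, +\infty)$; if the distribution hypothesis is instead $MPZ[\varpi,\delta]$, one takes $R := x^{1/4+\varpi}$ and $I := (w, x^\delta)$.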
is used for some additional parameter $l_0 > 0$ to be optimised over. More generally, one can take
$$\lambda_d := \mu(d)\, g\left( \frac{\log d}{\log R} \right)$$
for some suitable (in particular, sufficiently smooth) cutoff function $g$. We will refer to this choice of sieve weights as the “analytic Selberg sieve”; this is the choice used in the analysis in the previous post.
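To see what such a weight looks like on actual integers, here is a minimal Python sketch of an analytic Selberg sieve weight. It assumes the shape $\nu(n) = (\sum_{d | P(n)} \lambda_d)^2$ with $\lambda_d = \mu(d) g(\log d/\log R)$ and $P(n) = \prod_{h}(n+h)$, and for simplicity it ignores the $W$-trick and the restriction of the prime factors of $d$ to an interval. The cutoff `cutoff` (namely $g(t) = (1-t)^2$ on $[0,1]$) and the helper names are hypothetical choices for illustration, not the optimised cutoff discussed in these posts.

```python
import math


def distinct_prime_factors(m):
    """Distinct prime factors of a positive integer m, in increasing order."""
    factors = []
    p = 2
    while p * p <= m:
        if m % p == 0:
            factors.append(p)
            while m % p == 0:
                m //= p
        p += 1
    if m > 1:
        factors.append(m)
    return factors


def cutoff(t):
    """Hypothetical cutoff g(t) = (1 - t)^2 on [0, 1], vanishing for t > 1."""
    return (1.0 - t) ** 2 if 0.0 <= t <= 1.0 else 0.0


def selberg_weight(n, tuple_h, R, g=cutoff):
    """nu(n) = (sum over squarefree d | P(n), d <= R, of mu(d) g(log d / log R))^2."""
    P = 1
    for h in tuple_h:
        P *= n + h
    primes = [p for p in distinct_prime_factors(P) if p <= R]
    # Build the pairs (d, mu(d)) for squarefree divisors d <= R of P(n).
    divisors = [(1, 1)]
    for p in primes:
        divisors += [(d * p, -mu) for d, mu in divisors if d * p <= R]
    lam = sum(mu * g(math.log(d) / math.log(R)) for d, mu in divisors)
    return lam * lam


# n = 11 gives the prime triple (11, 13, 17); no divisor d > 1 is <= R = 10.
print(selberg_weight(11, (0, 2, 6), R=10))
```

The weight is a square and hence automatically non-negative, which is the feature of the Selberg sieve that Lemma 7 relies on.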
for a sufficiently smooth function , where
for is a -variant of the Euler totient function, and
for is a -variant of the function . (The derivative on the cutoff is convenient for computations, as will be made clearer later in this post.) This choice of weights may seem somewhat arbitrary, but it arises naturally when considering how to optimise the quadratic form
(which arises naturally in the estimation of in (6)) subject to a fixed value of (which morally is associated to the estimation of in (7)); this is discussed in any sieve theory text as part of the general theory of the Selberg sieve, e.g. Friedlander-Iwaniec.
The use of the elementary Selberg sieve for the bounded prime gaps problem was studied by Motohashi and Pintz. Their arguments give an alternate derivation of $DHL[k_0,2]$ from $MPZ[\varpi,\delta]$ for $k_0$ sufficiently large, although unfortunately we were not able to confirm some of their calculations regarding the precise dependence of $k_0$ on $\varpi, \delta$, and in particular we have not yet been able to improve upon the specific criterion in Theorem 6 using the elementary sieve. However it is quite plausible that such improvements could become available with additional arguments.
Below the fold we describe how the elementary Selberg sieve can be used to reprove Theorem 3, and discuss how it could potentially be used to improve upon Theorem 6. (But the elementary Selberg sieve and the analytic Selberg sieve are in any event closely related; see the appendix of this paper of mine with Ben Green for some further discussion.) For the purposes of Polymath8, either developing the elementary Selberg sieve or continuing the analysis of the analytic Selberg sieve from the previous post would be a relevant topic of conversation in the comments to this post.
— 1. Sums of multiplicative functions —
In this section we review a standard estimate on a sum of multiplicative functions. We fix an interval . For any positive integer , we say that a multiplicative function has dimension if one has the asymptotic
for all ; in particular (since as ) we see that is non-negative on for large enough. Thus for instance
has dimension one, the divisor function
has dimension two, and the functions
defined in the introduction have dimension . Dimension interacts well with multiplication; the product of a -dimensional multiplicative function and a -dimensional multiplicative function is clearly a -dimensional multiplicative function.
We have the following basic asymptotic in the untruncated case :
Lemma 8 (Untruncated asymptotic) Let be a fixed positive integer, and let be a multiplicative function of dimension . Then for any fixed compactly supported, Riemann-integrable function , and any that goes to infinity as , one has
Proof: By approximating from above and below by smooth compactly supported functions we see that we may assume without loss of generality that is smooth and compactly supported. But then the claim follows from Proposition 10 of the previous post.
We remark that Proposition 10 of the previous post also gives asymptotics for a number of other sums of multiplicative functions, but one (small) advantage of the elementary Selberg sieve is that these (slightly) more complicated asymptotics are not needed. The generalisation in Lemma 8 from smooth to Riemann integrable implies in particular that
and conversely Lemma 8 can be easily deduced from (12) by another approximation argument (using piecewise constant functions instead of smooth functions). We also make the trivial remark that if is non-negative and is any subset of , then we have the upper bound
for any non-negative Riemann integrable .
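As a numerical sanity check of the flavour of these asymptotics, one can look at the simplest dimension-one model, which is an assumption made purely for illustration: take the multiplicative function to be identically $1$, drop the $W$-trick, and sum over all square-free $d \leq R$. In that model it is classical that $\sum_{d \leq R} \mu^2(d)/d = \frac{6}{\pi^2} \log R + O(1)$, with the Euler product $\prod_p (1 - p^{-2}) = 6/\pi^2$ playing the role of the singular-series-type constant. The Python sketch below (with hypothetical helper names) estimates the slope against $\log R$.

```python
import math


def squarefree_flags(limit):
    """flags[d] is True exactly when 1 <= d <= limit and d is square-free."""
    flags = [True] * (limit + 1)
    flags[0] = False
    p = 2
    while p * p <= limit:
        # Marking multiples of p*p for every p >= 2 (prime or not) is harmless.
        for m in range(p * p, limit + 1, p * p):
            flags[m] = False
        p += 1
    return flags


def squarefree_harmonic_sum(R, flags):
    """Sum of 1/d over square-free d <= R."""
    return sum(1.0 / d for d in range(1, R + 1) if flags[d])


flags = squarefree_flags(10 ** 5)
s_small = squarefree_harmonic_sum(10 ** 3, flags)
s_large = squarefree_harmonic_sum(10 ** 5, flags)

# Slope of the sum against log R, to be compared with 6 / pi^2.
slope = (s_large - s_small) / (math.log(10 ** 5) - math.log(10 ** 3))
print(slope, 6 / math.pi ** 2)
```

Taking a difference of two values of the sum removes the bounded constant term, so the slope isolates the logarithmic growth rate.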
Actually, (12) can be derived by purely elementary means (without the need to explicitly work with asymptotics of zeta functions as was done in the previous post) by an induction on the dimension as follows. In the dimension zero case we have the Euler product
which gives (12) in the case.
where is a multiplicative function with
for all and , and for and . Then the left-hand side of (12) can be rearranged as
Elementary sieving gives
and hence by summation by parts
Meanwhile we have
From these estimates one easily obtains (12) for .
Now suppose that and that the claim has been proven inductively for . We again may decompose (14), but now has dimension instead of dimension zero. Arguing as before, we can write the left-hand side of (12) as
The contributions of the two error terms are acceptable by the induction hypothesis, and the main term is also acceptable by the induction hypothesis and summation by parts, giving the claim.
— 2. Untruncated implication —
We first reprove Theorem 3. The key calculations for and are as follows:
Lemma 9 (Untruncated sieve bounds) Assume holds for some and some . Let be a smooth function that is supported on , let be a fixed admissible -tuple for some fixed , let be such that is coprime to for all , and let be the elementary Selberg sieve with weights (11) associated to the function , the sieve level and the untruncated interval . Then (6), (7) hold with
for non-zero can be made arbitrarily close to (the extremiser is not quite smooth at if one extends by zero for , but this can be easily dealt with by a standard regularisation argument), and Theorem 3 then follows from Lemma 7 (using the open nature of (2) to replace $\theta$ by $\theta - \varepsilon$ for some small $\varepsilon > 0$).
We expand the left-hand side of (6) as
The weights are only non-vanishing when . From the Chinese remainder theorem we then have
The contribution of the error term is
which we can upper bound by
and hence by another application of Lemma 8 the previous expression may be upper bounded by , which is negligible by choice of . So we reduce to showing that
for , which can be easily verified by working locally (when for some prime ) and then using multiplicativity. Using this identity we can diagonalise the left-hand side of (18) as
Now we use the form (11) of , which has been optimised specifically for ease of computing this expression. We can expand as
writing and , we can rewrite this as
which by Möbius inversion simplifies to
The left-hand side of (18) has now simplified to
Now we turn to the more difficult lower bound (7) for a fixed (again we will be able to get an asymptotic here rather than just a lower bound). The left-hand side expands as
Again, may be restricted to at most , so that is at most . As before, the inner summand vanishes unless lies in one of the residue classes , where
and is the modified polynomial
The cardinality of is , where
We can thus estimate
where the error term is given by
By a modification of the proof of Proposition 5 we see that the hypothesis implies that
Analogously to (19) we have the decomposition
for , where is the function
We can thus diagonalise the left-hand side of (20) similarly to before as
We can expand as
writing and and noting that , we can rewrite this as
so we can simplify the left-hand side of (20) as
where is the -dimensional multiplicative function
By Lemma 8, the inner sum is equal to
which by the fundamental theorem of calculus simplifies to
We remark that the error term here is uniform in , because the translates are equicontinuous and thus uniformly Riemann integrable. We conclude that (23) is equal to
By the preceding discussion, the term of this sum is
Now we consider the terms, which are error terms. We may bound the total contribution of these terms in magnitude by
Arguing as before we have
and so the expression (25) becomes
where the implied constant in the notation can depend on . The square of this expression is then
The left-hand side of (20) is now expressed as the sum of the main term
and the error terms
— 3. Applying truncation —
Now we experiment with truncating the above argument to the interval $I = (w, x^\delta)$ to obtain results of the shape of Theorem 6. Unfortunately thus far the results do not give very good explicit dependencies of $k_0$ on $\varpi, \delta$, but this may perhaps improve with further effort.
Assume holds for some and some . Let be a smooth function that is supported on , let be a fixed admissible -tuple for some fixed , let be such that is coprime to for all , and let be the elementary Selberg sieve with weights (11) associated to the function , the sieve level and the truncated interval . As before, we set
for here, but it is possible that we could do better than this.
Now we turn to (7). Arguing as in the previous section, we reduce to showing that
Because we now seek a lower bound, we cannot simply pass to the untruncated interval (e.g. using (13)), and must proceed more carefully. A simple way to proceed (as was done by Motohashi and Pintz) is to just discard all less than , only retaining those in the region between and . The reason for doing this is that the parameter is then forced to be at most if one wants the summand to be non-zero, and so for the summation at least one can replace by without incurring any error. As in the previous section we then have
and so one can lower bound (26), up to negligible errors, by
If the truncated interval were replaced by the untruncated interval , then Lemma 8 would estimate this expression as
To deal with the truncated interval , we use a variant of the Buchstab identity, namely the easy inequality
minus the sum
(The term comes from .) If is non-negative and non-increasing on , then we can upper bound
for , and so
On the other hand, from the prime number theorem we have
Putting all this together, we can thus obtain (7) with
Following Pintz, we may upper bound by and rescale to obtain
which we can crudely bound by
But of course we can also calculate and explicitly for any fixed choice of . We conclude the following variant of Theorem 6:
with given by (27), and some smooth supported on which is non-negative and non-increasing on . Then implies .
For $k_0$ large enough depending on $\varpi, \delta$, the hypotheses in Theorem 10 can be verified (e.g. by setting for a reasonably large ), but the dependence is poor due to the localisation of the integral in the denominator to the narrow interval . But perhaps there is a way to avoid such a strict localisation in these arguments.