This post is a continuation of the previous post on sieve theory, which is an ongoing part of the Polymath8 project to improve the various parameters in Zhang’s proof that bounded gaps between primes occur infinitely often. Given that the comments on that page are getting quite lengthy, this is also a good opportunity to “roll over” that thread.

We will continue the notation from the previous post, including the concept of an admissible tuple, the use of an asymptotic parameter {x} going to infinity, and a quantity {w} depending on {x} that goes to infinity sufficiently slowly with {x}, and {W := \prod_{p<w} p} (the {W}-trick).

The objective of this portion of the Polymath8 project is to make as efficient as possible the connection between two types of results, which we call {DHL[k_0,2]} and {MPZ[\varpi,\delta]}. Let us first state {DHL[k_0,2]}, which has an integer parameter {k_0 \geq 2}:

Conjecture 1 ({DHL[k_0,2]}) Let {{\mathcal H}} be a fixed admissible {k_0}-tuple. Then there are infinitely many translates {n+{\mathcal H}} of {{\mathcal H}} which contain at least two primes.

Zhang was the first to prove a result of this type with {k_0 = 3,500,000}. Since then the value of {k_0} has been lowered substantially; at the time of writing, the current record is {k_0 = 26,024}.

There are currently two basic ways known to establish this conjecture. The first is to use the Elliott-Halberstam conjecture {EH[\theta]} for some {\theta>1/2}:

Conjecture 2 ({EH[\theta]}) One has

\displaystyle  \sum_{1 \leq q \leq x^\theta} \sup_{a \in ({\bf Z}/q{\bf Z})^\times} |\sum_{n < x: n = a\ (q)} \Lambda(n) - \frac{1}{\phi(q)} \sum_{n < x} \Lambda(n)|

\displaystyle = O( \frac{x}{\log^A x} )

for all fixed {A>0}. Here we use the abbreviation {n=a\ (q)} for {n=a \hbox{ mod } q}.

Here of course {\Lambda} is the von Mangoldt function and {\phi} the Euler totient function. It is conjectured that {EH[\theta]} holds for all {0 < \theta < 1}, but this is currently only known for {0 < \theta < 1/2}, an important result known as the Bombieri-Vinogradov theorem.

In a breakthrough paper, Goldston, Pintz, and Yildirim established an implication of the form

\displaystyle  EH[\theta] \implies DHL[k_0,2] \ \ \ \ \ (1)

for any {1/2 < \theta < 1}, where {k_0 = k_0(\theta)} depends on {\theta}. This deduction was very recently optimised by Farkas, Pintz, and Revesz and also independently in the comments to the previous blog post, leading to the following implication:

Theorem 3 (EH implies DHL) Let {1/2 < \theta < 1} be a real number, and let {k_0 \geq 2} be an integer obeying the inequality

\displaystyle  2\theta > \frac{j_{k_0-2}^2}{k_0(k_0-1)}, \ \ \ \ \ (2)

where {j_n} is the first positive zero of the Bessel function {J_n(x)}. Then {EH[\theta]} implies {DHL[k_0,2]}.

Note that the right-hand side of (2) is larger than {1}, but tends asymptotically to {1} as {k_0 \rightarrow \infty}. We give an alternate proof of Theorem 3 below the fold.
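To get a sense of the numerology (a quick numerical illustration): when {k_0 = 2}, the right-hand side of (2) is {j_0^2/2 \approx 2.89} (numerically, {j_0 \approx 2.4048}), far beyond even the full Elliott-Halberstam conjecture; but already for {k_0 = 6} it drops to {j_4^2/30 \approx 1.92} (with {j_4 \approx 7.5883}), so that {EH[\theta]} for any fixed {\theta > 0.96} implies {DHL[6,2]}.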

Implications of the form of Theorem 3 were modified by Motohashi and Pintz, who replace {EH[\theta]} by an easier conjecture {MPZ[\varpi,\delta]} for some {0 < \varpi < 1/4} and {0 < \delta < 1/4+\varpi}, at the cost of degrading the sufficient condition (2) slightly. In our notation, this conjecture takes the following form for each choice of parameters {\varpi,\delta}:

Conjecture 4 ({MPZ[\varpi,\delta]}) Let {{\mathcal H}} be a fixed {k_0}-tuple (not necessarily admissible) for some fixed {k_0 \geq 2}, and let {b\ (W)} be a primitive residue class. Then

\displaystyle  \sum_{q \in {\mathcal S}_I: q< x^{1/2+2\varpi}} \sum_{a \in C(q)} |\Delta_{b,W}(\Lambda; q,a)| = O( x \log^{-A} x) \ \ \ \ \ (3)

for any fixed {A>0}, where {I = (w,x^{\delta})}, {{\mathcal S}_I} is the set of square-free integers whose prime factors lie in {I}, and {\Delta_{b,W}(\Lambda;q,a)} is the quantity

\displaystyle  \Delta_{b,W}(\Lambda;q,a) := | \sum_{x \leq n \leq 2x: n=b\ (W); n = a\ (q)} \Lambda(n) \ \ \ \ \ (4)

\displaystyle  - \frac{1}{\phi(q)} \sum_{x \leq n \leq 2x: n = b\ (W)} \Lambda(n)|,

and {C(q)} is the set of congruence classes

\displaystyle  C(q) := \{ a \in ({\bf Z}/q{\bf Z})^\times: P(a) = 0 \}

and {P} is the polynomial

\displaystyle  P(a) := \prod_{h \in {\mathcal H}} (a+h).
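To illustrate the notation with a small worked example: if {{\mathcal H} = (0,2)} (so {k_0 = 2}) and {q = 7}, then {P(a) = a(a+2)} has the two roots {a \equiv 0, 5\ (7)}, of which only {a \equiv 5\ (7)} is invertible, so that {C(7) = \{5\}}.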

This is a weakened version of the Elliott-Halberstam conjecture:

Proposition 5 (EH implies MPZ) Let {0 < \varpi < 1/4} and {0 < \delta < 1/4+\varpi}. Then {EH[1/2+2\varpi+\epsilon]} implies {MPZ[\varpi,\delta]} for any {\epsilon>0}. (In abbreviated form: {EH[1/2+2\varpi+]} implies {MPZ[\varpi,\delta]}.)

In particular, since {EH[\theta]} is conjecturally true for all {0 < \theta < 1}, we conjecture {MPZ[\varpi,\delta]} to be true for all {0 < \varpi < 1/4} and {0<\delta<1/4+\varpi}.

Proof: Define

\displaystyle  E(q) := \sup_{a \in ({\bf Z}/q{\bf Z})^\times} |\sum_{x \leq n \leq 2x: n = a\ (q)} \Lambda(n) - \frac{1}{\phi(q)} \sum_{x \leq n \leq 2x} \Lambda(n)|

then the hypothesis {EH[1/2+2\varpi+\epsilon]} (applied to {x} and {2x} and then subtracting) tells us that

\displaystyle  \sum_{1 \leq q \leq Wx^{1/2+2\varpi}} E(q) \ll x \log^{-A} x

for any fixed {A>0}. From the Chinese remainder theorem and the Siegel-Walfisz theorem we have

\displaystyle  \sup_{a \in ({\bf Z}/q{\bf Z})^\times} \Delta_{b,W}(\Lambda;q,a) \ll E(qW) + \frac{1}{\phi(q)} x \log^{-A} x

for any {q} coprime to {W} (and in particular for {q \in {\mathcal S}_I}). Since {|C(q)| \leq k_0^{\Omega(q)}}, where {\Omega(q)} is the number of prime divisors of {q}, we can thus bound the left-hand side of (3) by

\displaystyle  \ll \sum_{q \in {\mathcal S}_I: q< x^{1/2+2\varpi}} ( k_0^{\Omega(q)} E(qW) + k_0^{\Omega(q)} \frac{1}{\phi(q)} x \log^{-A} x ).

The contribution of the second term is {O(x \log^{-A+O(1)} x)} by standard estimates (see Lemma 8 below). Using the very crude bound

\displaystyle  E(q) \ll \frac{1}{\phi(q)} x \log x

and standard estimates we also have

\displaystyle  \sum_{q \in {\mathcal S}_I: q< x^{1/2+2\varpi}} k_0^{2\Omega(q)} E(qW) \ll x \log^{O(1)} x

and the claim now follows from the Cauchy-Schwarz inequality. \Box
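To spell out the final Cauchy-Schwarz step: writing {k_0^{\Omega(q)} E(qW) = (k_0^{2\Omega(q)} E(qW))^{1/2} (E(qW))^{1/2}} and applying Cauchy-Schwarz, the remaining term is bounded by

\displaystyle  (\sum_{q \in {\mathcal S}_I: q< x^{1/2+2\varpi}} k_0^{2\Omega(q)} E(qW))^{1/2} (\sum_{q \in {\mathcal S}_I: q< x^{1/2+2\varpi}} E(qW))^{1/2} \ll (x \log^{O(1)} x)^{1/2} (x \log^{-A} x)^{1/2},

which is {O( x \log^{-A/2+O(1)} x )}; since {A>0} was arbitrary, this is acceptable.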

In practice, the conjecture {MPZ[\varpi,\delta]} is easier to prove than {EH[1/2+2\varpi+]} due to the restriction of the residue classes {a} to {C(q)}, and also the restriction of the modulus {q} to {x^\delta}-smooth numbers. Zhang proved {MPZ[\varpi,\varpi]} for any {0 < \varpi < 1/1168}. More recently, our Polymath8 group has analysed Zhang’s argument (using in part a corrected version of the analysis of a recent preprint of Pintz) to obtain {MPZ[\varpi,\delta]} whenever {\delta, \varpi > 0} are such that

\displaystyle  207\varpi + 43\delta < \frac{1}{4}.
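For instance, specialising to {\delta = \varpi} as in Zhang's result, this constraint becomes {250\varpi < \frac{1}{4}}, i.e. {\varpi < 1/1000}, which already improves upon Zhang's value of {1/1168}.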

The works of Motohashi and Pintz, and later of Zhang, implicitly describe arguments that allow one to deduce {DHL[k_0,2]} from {MPZ[\varpi,\delta]} provided that {k_0} is sufficiently large depending on {\varpi,\delta}. The best implication of this sort that we have been able to verify thus far is the following result, established in the previous post:

Theorem 6 (MPZ implies DHL) Let {0 < \varpi < 1/4}, {0 < \delta < 1/4+\varpi}, and let {k_0 \geq 2} be an integer obeying the constraint

\displaystyle  1+4\varpi > \frac{j_{k_0-2}^2}{k_0(k_0-1)} (1+\kappa) \ \ \ \ \ (5)

where {\kappa} is the quantity

\displaystyle \kappa := \sum_{1 \leq n < \frac{1+4\varpi}{2\delta}} (1 - \frac{2n \delta}{1 + 4\varpi})^{k_0/2} \prod_{j=1}^{n} (1 + 3k_0 \log(1+\frac{1}{j})).

Then {MPZ[\varpi,\delta]} implies {DHL[k_0,2]}.

This complicated version of {\kappa} is roughly of size {3 \log(2) k_0 \exp( - k_0 \delta)}. It is unlikely to be optimal; the work of Motohashi-Pintz and Pintz suggests that it can essentially be improved to {\frac{1}{\delta} \exp(-k_0 \delta)}, but currently we are unable to verify this claim. One of the aims of this post is to encourage further discussion as to how to improve the {\kappa} term in results such as Theorem 6.

We remark that as (5) is an open condition, it is unaffected by infinitesimal modifications to {\varpi,\delta}, and so we do not ascribe much importance to such modifications (e.g. replacing {\varpi} by {\varpi-\epsilon} for some arbitrarily small {\epsilon>0}).

The known deductions of {DHL[k_0,2]} from claims such as {EH[\theta]} or {MPZ[\varpi,\delta]} rely on the following elementary observation of Goldston, Pintz, and Yildirim (essentially a weighted pigeonhole principle), which we have placed in “{W}-tricked form”:

Lemma 7 (Criterion for DHL) Let {k_0 \geq 2}. Suppose that for each fixed admissible {k_0}-tuple {{\mathcal H}} and each congruence class {b\ (W)} such that {b+h} is coprime to {W} for all {h \in {\mathcal H}}, one can find a non-negative weight function {\nu: {\bf N} \rightarrow {\bf R}^+}, fixed quantities {\alpha,\beta > 0}, a quantity {A>0}, and a fixed positive power {R} of {x} such that one has the upper bound

\displaystyle  \sum_{x \leq n \leq 2x: n = b\ (W)} \nu(n) \leq (\alpha+o(1)) A\frac{x}{W}, \ \ \ \ \ (6)

the lower bound

\displaystyle  \sum_{x \leq n \leq 2x: n = b\ (W)} \nu(n) \theta(n+h_i) \geq (\beta-o(1)) A\frac{x}{W} \log R \ \ \ \ \ (7)

for all {h_i \in {\mathcal H}}, and the key inequality

\displaystyle  \frac{\log R}{\log x} > \frac{1}{k_0} \frac{\alpha}{\beta} \ \ \ \ \ (8)

holds. Then {DHL[k_0,2]} holds. Here {\theta(n)} is defined to equal {\log n} when {n} is prime and {0} otherwise.

Proof: Consider the quantity

\displaystyle  \sum_{x \leq n \leq 2x: n = b\ (W)} \nu(n) (\sum_{h \in {\mathcal H}} \theta(n+h) - \log(3x)). \ \ \ \ \ (9)

By (6), (7), this quantity is at least

\displaystyle  k_0 \beta A\frac{x}{W} \log R - \alpha \log(3x) A\frac{x}{W} - o(A\frac{x}{W} \log x).

By (8), this expression is positive for all sufficiently large {x}. On the other hand, (9) can only be positive if at least one summand is positive; as each individual {\theta(n+h)} is at most {\log(2x+h) < \log(3x)} for {x} large, this can only happen when {n+{\mathcal H}} contains at least two primes for some {x \leq n \leq 2x} with {n=b\ (W)}. Letting {x \rightarrow \infty} we obtain {DHL[k_0,2]} as claimed. \Box

In practice, the quantity {R} (referred to as the sieve level) is a power of {x} such as {x^{\theta/2}} or {x^{1/4+\varpi}}, and reflects the strength of the distribution hypothesis {EH[\theta]} or {MPZ[\varpi,\delta]} that is available; the quantity {R} will also be a key parameter in the definition of the sieve weight {\nu}. The factor {A} reflects the order of magnitude of the expected density of {\nu} in the residue class {b\ (W)}; it could be absorbed into the sieve weight {\nu} by dividing that weight by {A}, but it is convenient not to enforce such a normalisation so as not to clutter up the formulae. In practice, {A} will be some combination of {\frac{\phi(W)}{W}} and {\log R}.

Once one has decided to rely on Lemma 7, the next main task is to select a good weight {\nu} for which the ratio {\alpha/\beta} is as small as possible (and for which the sieve level {R} is as large as possible). To ensure non-negativity, we use the Selberg sieve

\displaystyle  \nu = \lambda^2, \ \ \ \ \ (10)

where {\lambda(n)} takes the form

\displaystyle  \lambda(n) = \sum_{d \in {\mathcal S}_I: d|P(n)} \mu(d) a_d

for some weights {a_d \in {\bf R}} vanishing for {d>R} that are to be chosen, where {I \subset (w,+\infty)} is an interval and {P} is the polynomial {P(n) := \prod_{h \in {\mathcal H}} (n+h)}. If the distribution hypothesis is {EH[\theta]}, one takes {R := x^{\theta/2}} and {I := (w,+\infty)}; if the distribution hypothesis is instead {MPZ[\varpi,\delta]}, one takes {R := x^{1/4+\varpi}} and {I := (w,x^\delta)}.
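Note that in either case the sieve level is dictated by the available level of distribution: the moduli {[d_1,d_2]} appearing below are of size at most {R^2}, which is {x^{\theta}} in the {EH[\theta]} case and {x^{1/2+2\varpi}} in the {MPZ[\varpi,\delta]} case, exactly the ranges covered by the respective distribution hypotheses.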

One has a useful amount of flexibility in selecting the weights {a_d} for the Selberg sieve. In the original work of Goldston, Pintz, and Yildirim, as well as in the subsequent paper of Zhang, the choice

\displaystyle  a_d := (\log \frac{R}{d})_+^{k_0+\ell_0}

is used for some additional parameter {\ell_0 > 0} to be optimised over. More generally, one can take

\displaystyle  a_d := g( \frac{\log d}{\log R} )

for some suitable (in particular, sufficiently smooth) cutoff function {g: {\bf R} \rightarrow {\bf R}}. We will refer to this choice of sieve weights as the “analytic Selberg sieve”; this is the choice used in the analysis in the previous post.
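For instance, after discarding a normalisation factor of {\log^{k_0+\ell_0} R} (which cancels in the ratio {\alpha/\beta} from Lemma 7), the Goldston-Pintz-Yildirim choice above corresponds to the cutoff {g(t) := (1-t)_+^{k_0+\ell_0}}.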

However, there is a slightly different choice of sieve weights that one can use, which I will call the “elementary Selberg sieve”; it takes the form

\displaystyle  a_d := \frac{1}{\Phi(d) \Delta(d)} \sum_{q \in {\mathcal S}_I: (q,d)=1} \frac{1}{\Phi(q)} f'( \frac{\log dq}{\log R}) \ \ \ \ \ (11)

for a sufficiently smooth function {f: {\bf R} \rightarrow {\bf R}}, where

\displaystyle  \Phi(d) := \prod_{p|d} \frac{p-k_0}{k_0}

for {d \in {\mathcal S}_I} is a {k_0}-variant of the Euler totient function, and

\displaystyle  \Delta(d) := \prod_{p|d} \frac{k_0}{p} = \frac{k_0^{\Omega(d)}}{d}

for {d \in {\mathcal S}_I} is a {k_0}-variant of the function {1/d}. (The derivative on the {f} cutoff is convenient for computations, as will be made clearer later in this post.) This choice of weights {a_d} may seem somewhat arbitrary, but it arises naturally when considering how to optimise the quadratic form

\displaystyle  \sum_{d_1,d_2 \in {\mathcal S}_I} \mu(d_1) a_{d_1} \mu(d_2) a_{d_2} \Delta([d_1,d_2])

(which arises naturally in the estimation of {\alpha} in (6)) subject to a fixed value of {a_1} (which morally is associated to the estimation of {\beta} in (7)); this is discussed in any sieve theory text as part of the general theory of the Selberg sieve, e.g. Friedlander-Iwaniec.

The use of the elementary Selberg sieve for the bounded prime gaps problem was studied by Motohashi and Pintz. Their arguments give an alternate derivation of {DHL[k_0,2]} from {MPZ[\varpi,\delta]} for {k_0} sufficiently large, although unfortunately we were not able to confirm some of their calculations regarding the precise dependence of {k_0} on {\varpi,\delta}, and in particular we have not yet been able to improve upon the specific criterion in Theorem 6 using the elementary sieve. However it is quite plausible that such improvements could become available with additional arguments.

Below the fold we describe how the elementary Selberg sieve can be used to reprove Theorem 3, and discuss how it could potentially be used to improve upon Theorem 6. (But the elementary Selberg sieve and the analytic Selberg sieve are in any event closely related; see the appendix of this paper of mine with Ben Green for some further discussion.) For the purposes of Polymath8, either developing the elementary Selberg sieve or continuing the analysis of the analytic Selberg sieve from the previous post would be a relevant topic of conversation in the comments to this post.

— 1. Sums of multiplicative functions —

In this section we review a standard estimate on a sum of multiplicative functions. We fix an interval {I \subset (w,+\infty)}. For any positive integer {k}, we say that a multiplicative function {f: {\bf N} \rightarrow {\bf R}} has dimension {k} if one has the asymptotic

\displaystyle  f(p) = k + O(\frac{1}{p})

for all {p \in I}; in particular (since {w \rightarrow \infty} as {x \rightarrow \infty}) we see that {f} is non-negative on {{\mathcal S}_I} for {x} large enough. Thus for instance

\displaystyle  n \mapsto \frac{\phi(n)}{n}

has dimension one, the divisor function

\displaystyle  n \mapsto \tau(n)

has dimension two, and the functions

\displaystyle  n \mapsto k_0^{\Omega(n)},

\displaystyle  n \mapsto \frac{n}{\Phi(n)},

and

\displaystyle  n \mapsto n \Delta(n)

defined in the introduction have dimension {k_0}. Dimension interacts well with multiplication: the product of a {k}-dimensional multiplicative function and a {k'}-dimensional multiplicative function is clearly a {kk'}-dimensional multiplicative function.
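For example, the product of the one-dimensional function {n \mapsto \frac{\phi(n)}{n}} and the two-dimensional function {n \mapsto \tau(n)} obeys {\frac{\phi(p)}{p} \tau(p) = 2(1-\frac{1}{p}) = 2 + O(\frac{1}{p})} for {p \in I}, and so has dimension {1 \times 2 = 2}, as expected.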

We have the following basic asymptotic in the untruncated case {I = (w,+\infty)}:

Lemma 8 (Untruncated asymptotic) Let {I = (w,+\infty)}. Let {k} be a fixed positive integer, and let {f: {\bf N} \rightarrow {\bf R}} be a multiplicative function of dimension {k}. Then for any fixed compactly supported, Riemann-integrable function {g: {\bf R} \rightarrow {\bf R}}, and any {R>1} that goes to infinity as {x \rightarrow \infty}, one has

\displaystyle  \sum_{d \in {\mathcal S}_I} \frac{f(d)}{d} g(\frac{\log d}{\log R}) = (\frac{\phi(W)}{W} \log R)^k ( \int_0^\infty g(t) \frac{t^{k-1}}{(k-1)!}\ dt + o(1) ).

Proof: By approximating {g} from above and below by smooth compactly supported functions we see that we may assume without loss of generality that {g} is smooth and compactly supported. But then the claim follows from Proposition 10 of the previous post. \Box

We remark that Proposition 10 of the previous post also gives asymptotics for a number of other sums of multiplicative functions, but one (small) advantage of the elementary Selberg sieve is that these (slightly) more complicated asymptotics are not needed. The generalisation in Lemma 8 from smooth {g} to Riemann integrable {g} implies in particular that

\displaystyle  \sum_{d \in {\mathcal S}_I: d \leq R} \frac{f(d)}{d} = (\frac{1}{k!} + o(1)) (\frac{\phi(W)}{W} \log R)^k \ \ \ \ \ (12)

and conversely Lemma 8 can be easily deduced from (12) by another approximation argument (using piecewise constant functions instead of smooth functions). We also make the trivial remark that if {J} is any subset of {I}, then we have the upper bound

\displaystyle  \sum_{d \in {\mathcal S}_J} \frac{f(d)}{d} g(\frac{\log d}{\log R}) \leq (\frac{\phi(W)}{W} \log R)^k ( \int_0^\infty g(t) \frac{t^{k-1}}{(k-1)!}\ dt + o(1) ) \ \ \ \ \ (13)

for any non-negative Riemann integrable {g}.
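For instance, applying (12) to the {k_0}-dimensional function {n \mapsto k_0^{\Omega(n)}} from the introduction gives

\displaystyle  \sum_{d \in {\mathcal S}_I: d \leq R} \frac{k_0^{\Omega(d)}}{d} = (\frac{1}{k_0!} + o(1)) (\frac{\phi(W)}{W} \log R)^{k_0},

which is typical of the sums of multiplicative functions that appear in the error analysis below.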

Actually, (12) can be derived by purely elementary means (without the need to explicitly work with asymptotics of zeta functions as was done in the previous post) by an induction on the dimension {k} as follows. In the dimension zero case we have the Euler product

\displaystyle  \sum_{d \in {\mathcal S}_I} \frac{|f(d)|}{d} = \prod_{p \in I} (1 + \frac{|f(p)|}{p}) = 1+o(1)

and hence

\displaystyle  \sum_{d \in {\mathcal S}_I: d\neq 1} \frac{|f(d)|}{d} = o(1)

which gives (12) in the {k=0} case.

Now suppose that {f} has dimension {1}. In this case we write

\displaystyle  f(d) 1_{{\mathcal S}_I}(d) = \sum_{a|d; d/a \in {\mathcal S}_I} h(a) \ \ \ \ \ (14)

where {h} is a multiplicative function with

\displaystyle  h(p^j) := (-1)^{j-1} (f(p)-1) = O(\frac{1}{p}),

for all {p > w} and {j \geq 1}, and {h(p^j)=0} for {p \leq w} and {j \geq 1}. Then the left-hand side of (12) can be rearranged as

\displaystyle \sum_{a \leq R} \frac{h(a)}{a} \sum_{d \in {\mathcal S}_I: d \leq R/a} \frac{1}{d}.

Elementary sieving gives

\displaystyle  \sum_{d \in {\mathcal S}_I: d \leq y} 1 = (\frac{\phi(W)}{W} + o(1)) y + O( W )

and hence by summation by parts

\displaystyle  \sum_{d \in {\mathcal S}_I: d \leq y} \frac{1}{d} = (\frac{\phi(W)}{W} + o(1)) \log y + O( W ).

Meanwhile we have

\displaystyle  \sum_a \frac{|h(a)|}{a} = \prod_{p > w} (1 + \sum_{j=1}^\infty \frac{|f(p)-1|}{p^j}) = 1+o(1)

and so

\displaystyle  \sum_{a \neq 1} \frac{|h(a)|}{a} = o(1).

From these estimates one easily obtains (12) for {k=1}.

Now suppose that {k \geq 2} and that the claim has been proven inductively for {k-1}. We may again use the decomposition (14), but now {h} has dimension {k-1} instead of dimension zero. Arguing as before, we can write the left-hand side of (12) as

\displaystyle \sum_{a \leq R} \frac{h(a)}{a} ( (\frac{\phi(W)}{W} + o(1)) \log \frac{R}{a} + O( W ) ).

The contributions of the {o(1)} and {O(W)} error terms are acceptable by the induction hypothesis, and the main term is also acceptable by the induction hypothesis and summation by parts, giving the claim.

— 2. Untruncated implication —

We first reprove Theorem 3. The key calculations for {\alpha} and {\beta} are as follows:

Lemma 9 (Untruncated sieve bounds) Assume {EH[\theta+\epsilon]} holds for some {1/2 < \theta < 1} and some {\epsilon>0}. Let {f: {\bf R} \rightarrow {\bf R}} be a smooth function that is supported on {[-1,1]}, let {{\mathcal H}} be a fixed admissible {k_0}-tuple for some fixed {k_0 \geq 2}, let {b\ (W)} be such that {b+h} is coprime to {W} for all {h \in {\mathcal H}}, and let {\nu} be the elementary Selberg sieve with weights (11) associated to the function {f}, the sieve level {R := x^{\theta/2}} and the untruncated interval {I := (w,+\infty)}. Then (6), (7) hold with

\displaystyle  \alpha := \int_0^1 f'(t)^2 \frac{t^{k_0-1}}{(k_0-1)!}\ dt, \ \ \ \ \ (15)

\displaystyle  \beta := \int_0^1 f(t)^2 \frac{t^{k_0-2}}{(k_0-2)!}\ dt, \ \ \ \ \ (16)

and

\displaystyle  A := (\frac{\phi(W)}{W} \log R)^{k_0}.

As computed in Theorem 14 of the previous post (and also in the recent preprint of Farkas, Pintz, and Revesz), the ratio

\displaystyle  \frac{\int_0^1 f'(t)^2 t^{k_0-1}\ dt}{\int_0^1 f(t)^2 t^{k_0-2}\ dt}

for non-zero {f} can be made arbitrarily close to {j_{k_0-2}^2/4} (the extremiser is not quite smooth at {t=1} if one extends by zero for {t>1}, but this can be easily dealt with by a standard regularisation argument), and Theorem 3 then follows from Lemma 7 (using the open nature of (2) to replace {EH[\theta]} by {EH[\theta+\epsilon]} for some small {\epsilon>0}).

It remains to prove Lemma 9. We begin with the proof of (6) (which will in fact be an asymptotic and not just an upper bound).

We expand the left-hand side of (6) as

\displaystyle  \sum_{d_1,d_2 \in {\mathcal S}_I} \mu(d_1) a_{d_1} \mu(d_2) a_{d_2} \sum_{x \leq n \leq 2x: [d_1,d_2] | P(n); n=b\ (W)} 1.

The weights {a_{d_1} a_{d_2}} are only non-vanishing when {d_1,d_2 \leq R}. From the Chinese remainder theorem we then have

\displaystyle  \sum_{x \leq n \leq 2x: [d_1,d_2] | P(n); n=b\ (W)} 1 = \frac{x}{W} \Delta([d_1,d_2]) + O( [d_1,d_2] \Delta([d_1,d_2]) ).

The contribution of the error term is

\displaystyle  \ll \sum_{d_1,d_2 \in {\mathcal S}_I: d_1,d_2 \leq R} |a_{d_1}| |a_{d_2}| k_0^{\Omega([d_1,d_2])}

which we can upper bound by

\displaystyle  \ll (\sum_{d \in {\mathcal S}_I: d \leq R} |a_d| k_0^{\Omega(d)})^2.

Using (11) and Lemma 8 we have the crude upper bound

\displaystyle  |a_d| \ll \frac{1}{\Phi(d) \Delta(d)} (\frac{\phi(W)}{W} \log R)^{k_0} \ \ \ \ \ (17)

and hence by another application of Lemma 8 the previous expression may be upper bounded by {O( (W/\phi(W))^{O(1)} R^2 \log^{O(1)} R )}, which is negligible by choice of {R}. So we reduce to showing that

\displaystyle  \sum_{d_1,d_2 \in {\mathcal S}_I} \mu(d_1) a_{d_1} \mu(d_2) a_{d_2} \Delta([d_1,d_2]) \leq (\alpha+o(1)) (\frac{\phi(W)}{W} \log R)^{k_0}. \ \ \ \ \ (18)

To proceed further we follow Selberg and observe the decomposition

\displaystyle  \Delta([d_1,d_2]) = \sum_{d_0|d_1,d_2} \Phi(d_0) \Delta(d_1) \Delta(d_2) \ \ \ \ \ (19)

for {d_1,d_2 \in {\mathcal S}_I}, which can be easily verified by working locally (when {d_1,d_2 \in \{1,p\}} for some prime {p \in I}) and then using multiplicativity. Using this identity we can diagonalise the left-hand side of (18) as

\displaystyle  \sum_{d_0 \in {\mathcal S}_I} \Phi(d_0) (\sum_{d \in {\mathcal S}_I: d_0|d} \mu(d) a_d \Delta(d))^2.
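(To verify (19) in the local case {d_1 = d_2 = p}, note that the right-hand side of (19) is then {(1+\Phi(p)) \Delta(p)^2 = \frac{p}{k_0} \cdot \frac{k_0^2}{p^2} = \frac{k_0}{p} = \Delta(p)}, as required; the cases where one of {d_1,d_2} is equal to {1} are even easier.)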

Now we use the form (11) of {a_d}, which has been optimised specifically for ease of computing this expression. We can expand {\sum_{d \in {\mathcal S}_I: d_0|d} \mu(d) a_d \Delta(d)} as

\displaystyle  \sum_{d \in {\mathcal S}_I: d_0|d} \frac{\mu(d)}{\Phi(d)} \sum_{q \in {\mathcal S}_I: (q,d) = 1} \frac{1}{\Phi(q)} f'( \frac{\log dq}{\log R});

writing {d = d_0 d_1} and {m = d_1 q}, we can rewrite this as

\displaystyle  \frac{\mu(d_0)}{\Phi(d_0)} \sum_{m \in {\mathcal S}_I: (m,d_0)=1} \frac{f'(\frac{\log d_0 m}{\log R})}{\Phi(m)} \sum_{d_1 | m} \mu(d_1)

which by Möbius inversion simplifies to

\displaystyle \frac{\mu(d_0)}{\Phi(d_0)} f'( \frac{\log d_0}{\log R} ).
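(Here only the {m=1} term survives, since {\sum_{d_1 | m} \mu(d_1)} vanishes for every {m > 1}.)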

The left-hand side of (18) has now simplified to

\displaystyle  \sum_{d_0 \in {\mathcal S}_I} \frac{1}{\Phi(d_0)} f'( \frac{\log d_0}{\log R} )^2.

By Lemma 8 and (15) we obtain (18) and hence (6) as required.

Now we turn to the more difficult lower bound (7) for a fixed {h_i \in {\mathcal H}} (again we will be able to get an asymptotic here rather than just a lower bound). The left-hand side expands as

\displaystyle  \sum_{d_1,d_2 \in {\mathcal S}_I} \mu(d_1) a_{d_1} \mu(d_2) a_{d_2} \sum_{x \leq n \leq 2x: [d_1,d_2] | P(n); n = b\ (W)} \theta(n+h_i).

Again, {d_1,d_2} may be restricted to be at most {R}, so that {[d_1,d_2]} is at most {R^2 = x^{\theta}}. As before, the inner summand vanishes unless the residue class {n+h_i\ ([d_1,d_2])} lies in {C_i([d_1,d_2])}, where

\displaystyle  C_i(q) := \{ a \in ({\bf Z}/q{\bf Z})^\times: P_i(a) = 0 \}

and {P_i} is the modified polynomial

\displaystyle  P_i(a) := \prod_{h \in {\mathcal H} \backslash \{h_i\}} (a+h-h_i).

The cardinality of {C_i(q)} is {\phi(q)\Delta^*(q)}, where

\displaystyle  \Delta^*(q) := \prod_{p|q} \frac{k_0-1}{p-1} = \frac{(k_0-1)^{\Omega(q)}}{\phi(q)}.
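(To illustrate with the earlier example {{\mathcal H} = (0,2)} and {q=7}: taking {h_i := 0}, we have {P_i(a) = a+2}, so that {C_i(7) = \{5\}} has cardinality {1 = (k_0-1)^{\Omega(7)}}.)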

We can thus estimate

\displaystyle  \sum_{x \leq n \leq 2x: [d_1,d_2] | P(n); n = b\ (W)} \theta(n+h_i) = \frac{1}{\phi(W)} x \Delta^*([d_1,d_2]) + O( E^*([d_1,d_2]) )

where the error term {E^*(q)} is given by

\displaystyle  E^*(q) = \sum_{a \in C_i(q)} | \sum_{x \leq n \leq 2x: n=b\ (W); n = a\ (q)} \theta(n) - \frac{x}{\phi(Wq)}|.

By a modification of the proof of Proposition 5 we see that the hypothesis {EH[\theta+\epsilon]} implies that

\displaystyle  \sum_{q \leq R^2} h(q) E^*(q) \ll x \log^{-A} x

for any fixed {A>0} and any multiplicative function {h} of a fixed dimension {k}. Using the bound (17) we can then conclude that the contribution of the error term {E^*([d_1,d_2])} to (7) is negligible. So (7) becomes

\displaystyle  \sum_{d_1,d_2 \in {\mathcal S}_I} \mu(d_1) a_{d_1} \mu(d_2) a_{d_2} \Delta^*([d_1,d_2]) \ \ \ \ \ (20)

\displaystyle  \geq (\beta-o(1)) (\frac{\phi(W)}{W}\log R)^{k_0+1}.

Analogously to (19) we have the decomposition

\displaystyle  \Delta^*([d_1,d_2]) = \sum_{d_0|d_1,d_2} \Phi^*(d_0) \Delta^*(d_1) \Delta^*(d_2) \ \ \ \ \ (21)

for {d_1,d_2 \in {\mathcal S}_I}, where {\Phi^*} is the function

\displaystyle  \Phi^*(d) := \prod_{p|d} \frac{p-k_0}{k_0-1}.

We can thus diagonalise the left-hand side of (20) similarly to before as

\displaystyle  \sum_{d_0 \in {\mathcal S}_I} \Phi^*(d_0) (\sum_{d \in {\mathcal S}_I: d_0|d} \mu(d) a_d \Delta^*(d))^2.

We can expand {\sum_{d \in {\mathcal S}_I: d_0|d} \mu(d) a_d \Delta^*(d)} as

\displaystyle  \sum_{d \in {\mathcal S}_I: d_0|d} \frac{\mu(d)}{\Phi(d)} \frac{\Delta^*(d)}{\Delta(d)} \sum_{q \in {\mathcal S}_I: (q,d) = 1} \frac{1}{\Phi(q)} f'( \frac{\log dq}{\log R});

writing {d = d_0 d_1} and {m = d_1 q} and noting that {\frac{\Delta^*(d)}{\Delta(d)} = \prod_{p|d} \frac{(k_0-1)p}{k_0(p-1)}} is multiplicative, we can rewrite this as

\displaystyle  \frac{\mu(d_0)}{\Phi(d_0)} \frac{\Delta^*(d_0)}{\Delta(d_0)} \sum_{m \in {\mathcal S}_I: (m,d_0)=1} \frac{f'(\frac{\log d_0 m}{\log R})}{\Phi(m)} \sum_{d_1 | m} \mu(d_1) \frac{\Delta^*(d_1)}{\Delta(d_1)}.

Observe that

\displaystyle  \frac{1}{\Phi(m)} \sum_{d_1|m} \mu(d_1) \frac{\Delta^*(d_1)}{\Delta(d_1)} = \frac{1}{\phi(m)}

so we can simplify the left-hand side of (20) as

\displaystyle  \sum_{d_0 \in {\mathcal S}_I} \frac{h(d_0)}{d_0} (\sum_{m \in {\mathcal S}_I: (m,d_0)=1} \frac{f'(\frac{\log d_0 m}{\log R})}{\phi(m)} )^2 \ \ \ \ \ (22)

where {h} is the {k_0-1}-dimensional multiplicative function

\displaystyle  h(d) := d \frac{\Phi^*(d)}{\Phi(d)^2} (\frac{\Delta^*(d)}{\Delta(d)})^2

\displaystyle  = \prod_{p|d} (k_0-1) \frac{p^3}{(p-k_0)(p-1)^2}.
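As a quick consistency check at a single prime {p}: we have {p \frac{\Phi^*(p)}{\Phi(p)^2} = \frac{k_0^2 p}{(k_0-1)(p-k_0)}} and {(\frac{\Delta^*(p)}{\Delta(p)})^2 = \frac{(k_0-1)^2 p^2}{k_0^2 (p-1)^2}}, and the product of these two expressions is {(k_0-1) \frac{p^3}{(p-k_0)(p-1)^2} = k_0 - 1 + O(\frac{1}{p})}, consistent with {h} having dimension {k_0-1}.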

To control this sum, let us first pretend that the {(m,d_0)=1} constraint is not present, and suppose that we had to estimate

\displaystyle  \sum_{d_0 \in {\mathcal S}_I} \frac{h(d_0)}{d_0} (\sum_{m \in {\mathcal S}_I} \frac{f'(\frac{\log d_0 m}{\log R})}{\phi(m)} )^2. \ \ \ \ \ (23)

By Lemma 8, the inner sum {\sum_{m \in {\mathcal S}_I} \frac{f'(\frac{\log d_0 m}{\log R})}{\phi(m)}} is equal to

\displaystyle  = (\frac{\phi(W)}{W} \log R) (\int_0^\infty f'(t + \frac{\log d_0}{\log R})\ dt + o(1))

which by the fundamental theorem of calculus simplifies to

\displaystyle  = - (\frac{\phi(W)}{W} \log R) (f(\frac{\log d_0}{\log R})+ o(1)).

We remark that the error term {o(1)} here is uniform in {d_0}, because the translates {f'(\cdot + \frac{\log d_0}{\log R})} are equicontinuous and thus uniformly Riemann integrable. We conclude that (23) is equal to

\displaystyle  (\frac{\phi(W)}{W} \log R)^2 \sum_{d_0 \in {\mathcal S}_I} \frac{h(d_0)}{d_0} (f(\frac{\log d_0}{\log R})^2+ o(1))

where the error term {o(1)} is again uniform in {d_0}. By Lemma 8 and (16), this expression is equal to

\displaystyle  (\beta-o(1)) (\frac{\phi(W)}{W} \log R)^{k_0+1} \ \ \ \ \ (24)

as required.

Now we reinstate the condition {(m,d_0)=1}, which turns out to be negligible thanks to the {W}-trick. More precisely, we may use Möbius inversion to write

\displaystyle  \sum_{m \in {\mathcal S}_{(w,+\infty)}: (m,d_0)=1} \frac{f'(\frac{\log d_0 m}{\log R})}{\phi(m)} \ \ \ \ \ (25)

\displaystyle  = \sum_{k | d_0} \frac{\mu(k)}{\phi(k)} \sum_{m \in {\mathcal S}_{(w,+\infty)}: (m,k)=1} \frac{f'(\frac{\log d_0 k m}{\log R})}{\phi(m)}.

By the preceding discussion, the {k=1} term of this sum is

\displaystyle - (\frac{\phi(W)}{W} \log R) (f(\frac{\log d_0}{\log R}) + o(1)).

Now we consider the {k \neq 1} terms, which are error terms. We may bound the total contribution of these terms in magnitude by

\displaystyle  O( \sum_{k|d_0: k \neq 1} \frac{1}{\phi(k)} |\sum_{m \in {\mathcal S}_{(w,+\infty)}} \frac{f'(\frac{\log d_0 k m}{\log R})}{\phi(m)}| ).

Arguing as before we have

\displaystyle  \sum_{m \in {\mathcal S}_{(w,+\infty)}} \frac{f'(\frac{\log d_0 k m}{\log R})}{\phi(m)} = O( \frac{\phi(W)}{W} \log R )

and so the expression (25) becomes

\displaystyle  -(\frac{\phi(W)}{W} \log R) (f(\frac{\log d_0}{\log R}) + O( \frac{d_0}{\phi(d_0)}-1 ) + o(1) )

where the implied constant in the {O()} notation can depend on {f}. The square of this expression is then

\displaystyle  (\frac{\phi(W)}{W} \log R)^2 (f(\frac{\log d_0}{\log R})^2 + O( (\frac{d_0}{\phi(d_0)}-1)^2 ) + O( \frac{d_0}{\phi(d_0)}-1 ) + o(1) ).

The left-hand side of (20) is now expressed as the sum of the main term

\displaystyle  (\frac{\phi(W)}{W} \log R)^2 \sum_{d_0 \in {\mathcal S}_I} \frac{h(d_0)}{d_0} f(\frac{\log d_0}{\log R})^2

and the error terms

\displaystyle  O( (\frac{\phi(W)}{W} \log R)^2 \sum_{d_0 \in {\mathcal S}_I: d_0 \leq R} \frac{h(d_0)}{d_0} (\frac{d_0}{\phi(d_0)}-1)^j )

for {j=1,2} and

\displaystyle  o( (\frac{\phi(W)}{W} \log R)^2 \sum_{d_0 \in {\mathcal S}_I: d_0 \leq R} \frac{h(d_0)}{d_0} ).

The main term has already been estimated as (24). From Lemma 8 we have

\displaystyle  \sum_{d_0 \in {\mathcal S}_I: d_0 \leq R} \frac{h(d_0)}{d_0} (\frac{d_0}{\phi(d_0)})^j = (\frac{\phi(W)}{W} \log R)^{k_0-1} (\int_0^1 \frac{t^{k_0-2}}{(k_0-2)!}\ dt + o(1))

for {j=0,1,2}, and so all of the error terms end up being {o( (\frac{\phi(W)}{W} \log R)^{k_0+1} )}, and (7) follows. This concludes the proof of Lemma 9, and hence of Theorem 3.

— 3. Applying truncation —

Now we experiment with truncating the above argument to {I = (w,x^\delta)} to obtain results of the shape of Theorem 6. Unfortunately thus far the results do not give very good explicit dependencies of {k_0} on {\varpi,\delta}, but this may perhaps improve with further effort.

Assume {MPZ[\varpi,\delta]} holds for some {0 < \varpi < 1/4} and some {0 < \delta < 1/4+\varpi}. Let {f: {\bf R} \rightarrow {\bf R}} be a smooth function that is supported on {[-1,1]}, let {{\mathcal H}} be a fixed admissible {k_0}-tuple for some fixed {k_0 \geq 2}, let {b\ (W)} be such that {b+h} is coprime to {W} for all {h \in {\mathcal H}}, and let {\nu} be the elementary Selberg sieve with weights (11) associated to the function {f}, the sieve level {R := x^{1/4 + \varpi}} and the truncated interval {I := (w,x^\delta)}. As before, we set

\displaystyle  A := (\frac{\phi(W)}{W} \log R)^{k_0}

and seek the best values for {\alpha,\beta} for which we can establish the upper bound (6) and the lower bound (7). Arguing as in the previous section (using (13) to control error terms) we can reduce (6) to

\displaystyle  \sum_{d_0 \in {\mathcal S}_I} \frac{1}{\Phi(d_0)} f'( \frac{\log d_0}{\log R} )^2 \leq (\alpha+o(1)) (\frac{\phi(W)}{W} \log R)^{k_0}.

If we crudely replace the truncated interval {(w,x^\delta)} by the untruncated interval {(w,\infty)} and apply Lemma 8 (or (13)) we may reuse the previous value

\displaystyle  \alpha = \int_0^1 f'(t)^2 \frac{t^{k_0-1}}{(k_0-1)!}\ dt

for {\alpha} here, but it is possible that we could do better than this.

Now we turn to (7). Arguing as in the previous section, we reduce to showing that

\displaystyle  \sum_{d_0 \in {\mathcal S}_I} \frac{h(d_0)}{d_0} (\sum_{m \in {\mathcal S}_I: (m,d_0)=1} \frac{f'(\frac{\log d_0 m}{\log R})}{\phi(m)} )^2

\displaystyle  \geq (\beta-o(1)) (\frac{\phi(W)}{W} \log R)^{k_0+1}.

We can, if desired, discard the {(m,d_0)=1} constraint here by arguing as in the previous section, leaving us with

\displaystyle  \sum_{d_0 \in {\mathcal S}_I} \frac{h(d_0)}{d_0} (\sum_{m \in {\mathcal S}_I} \frac{f'(\frac{\log d_0 m}{\log R})}{\phi(m)} )^2 \ \ \ \ \ (26)

\displaystyle  \geq (\beta-o(1)) (\frac{\phi(W)}{W} \log R)^{k_0+1}.

Because we now seek a lower bound, we cannot simply pass to the untruncated interval {(w,\infty)} (e.g. using (13)), and must proceed more carefully. A simple way to proceed (as was done by Motohashi and Pintz) is to just discard all {d_0} less than {x^{-\delta} R}, only retaining those {d_0} in the region between {x^{-\delta} R} and {R}. The reason for doing this is that the {m} parameter is then forced to be at most {x^\delta} if one wants the summand to be non-zero, and so for the {m} summation at least one can replace {I} by {(w,+\infty)} without incurring any error. As in the previous section we then have

\displaystyle  \sum_{m \in {\mathcal S}_I} \frac{f'(\frac{\log d_0 m}{\log R})}{\phi(m)} = - (\frac{\phi(W)}{W} \log R) (f(\frac{\log d_0}{\log R})+ o(1))

and so one can lower bound (26), up to negligible errors, by

\displaystyle (\frac{\phi(W)}{W} \log R)^2 \sum_{d_0 \in {\mathcal S}_I: x^{-\delta} R \leq d_0 \leq R} \frac{h(d_0)}{d_0} f(\frac{\log d_0}{\log R})^2.

If the truncated interval {I} were replaced by the untruncated interval {(w,\infty)}, then Lemma 8 would estimate this expression as

\displaystyle (\frac{\phi(W)}{W} \log R)^{k_0+1} \int_{1-\frac{\delta}{1/4+\varpi}}^1 f(t)^2 \frac{t^{k_0-2}}{(k_0-2)!}\ dt.

To deal with the truncated interval {I}, we use a variant of the Buchstab identity, namely the easy inequality

\displaystyle  \sum_{d \in {\mathcal S}_I: d \leq R} F(d) \geq \sum_{d \in {\mathcal S}_{(w,+\infty)}: d \leq R} F(d) - \sum_{x^\delta \leq p \leq R} \sum_{d \in {\mathcal S}_{(w,+\infty)}: d \leq R/p} F(pd)

for any non-negative function {F}; this holds because every {d \in {\mathcal S}_{(w,+\infty)}} with {d \leq R} that fails to lie in {{\mathcal S}_I} has at least one prime factor {p} with {x^\delta \leq p \leq R}, and is therefore subtracted at least once. Using this inequality and Lemma 8, we find that we may lower bound (26), up to negligible errors, by

\displaystyle  (\frac{\phi(W)}{W} \log R)^{k_0+1} \int_{1-\frac{\delta}{1/4+\varpi}}^1 f(t)^2 \frac{t^{k_0-2}}{(k_0-2)!}\ dt

minus the sum

\displaystyle  (k_0-1) (\frac{\phi(W)}{W} \log R)^{k_0+1} \sum_{x^\delta \leq p \leq R} \frac{1}{p} \int_0^{1-\log p/\log R} f(t+\frac{\log p}{\log R})^2 \frac{t^{k_0-2}}{(k_0-2)!}\ dt.

(The factor {\frac{k_0-1}{p}} comes from {\frac{h(p)}{p}}, up to negligible errors.) If {f} is non-negative and non-increasing on {[0,1]}, then we can upper bound

\displaystyle  f(t+\frac{\log p}{\log R}) \leq f( t / (1 - \frac{\log p}{\log R}) )

for {0 \leq t \leq 1-\log p/\log R}, and so

\displaystyle  \int_0^{1-\log p/\log R} f(t+\frac{\log p}{\log R})^2 \frac{t^{k_0-2}}{(k_0-2)!}\ dt

\displaystyle  \leq (1-\frac{\log p}{\log R})^{k_0-1} \int_0^1 f(t)^2 \frac{t^{k_0-2}}{(k_0-2)!}\ dt.
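(The factor {(1-\frac{\log p}{\log R})^{k_0-1}} arises from the substitution {t = (1-\frac{\log p}{\log R})s}: the Jacobian contributes one power of {1-\frac{\log p}{\log R}}, and the weight {t^{k_0-2}} contributes the remaining {k_0-2} powers.)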

On the other hand, from the prime number theorem we have

\displaystyle  \sum_{x^\delta \leq p \leq R} \frac{1}{p} (1-\frac{\log p}{\log R})^{k_0-1} = \int_{\delta/(1/4+\varpi)}^1 (1-t)^{k_0-1}\ \frac{dt}{t} + o(1).

Putting all this together, we can thus obtain (7) with

\displaystyle  \beta := (1-\kappa) \int_0^1 f(t)^2 \frac{t^{k_0-2}}{(k_0-2)!}\ dt

where

\displaystyle  \kappa := \frac{\int_0^{1-\frac{\delta}{1/4+\varpi}} f(t)^2 \frac{t^{k_0-2}}{(k_0-2)!}\ dt}{\int_0^{1} f(t)^2 \frac{t^{k_0-2}}{(k_0-2)!}\ dt} + \kappa' \ \ \ \ \ (27)

and

\displaystyle  \kappa' := (k_0-1) \int_{\delta/(1/4+\varpi)}^1 (1-t)^{k_0-1}\ \frac{dt}{t}.

Following Pintz, we may upper bound {(1-t)^{k_0-1}} by {\exp(-(k_0-1) t)} and rescale to obtain

\displaystyle  \kappa' \leq \int_{(k_0-1)\delta/(1/4+\varpi)}^\infty \exp(-t) \frac{dt}{t}

which we can crudely bound by

\displaystyle  \kappa' \leq \exp( - (k_0-1)\delta/(1/4+\varpi)).

But of course we can also calculate {\kappa} and {\kappa'} explicitly for any fixed choice of {\delta,\varpi,k_0}. We conclude the following variant of Theorem 6:

Theorem 10 (MPZ implies DHL) Let {0 < \varpi < 1/4}, {0 < \delta < 1/4+\varpi}, and let {k_0 \geq 2} be an integer obeying the constraint

\displaystyle  (1+4\varpi)(1-\kappa) > \frac{4}{k_0(k_0-1)} \frac{\int_0^1 f'(t)^2 t^{k_0-1}\ dt}{\int_{1-\frac{\delta}{1/4+\varpi}}^1 f(t)^2 t^{k_0-2}\ dt},

with {\kappa} given by (27), for some smooth {f: {\bf R} \rightarrow {\bf R}} supported on {[-1,1]} which is non-negative and non-increasing on {[0,1]}. Then {MPZ[\varpi,\delta]} implies {DHL[k_0,2]}.

For {k_0} large enough depending on {\varpi,\delta} the hypotheses in Theorem 10 can be verified (e.g. by setting {f(t) = (1-t)^l} for a reasonably large {l}) but the dependence is poor due to the localisation of the integral in the denominator to the narrow interval {[1-\delta/(1/4+\varpi),1]}. But perhaps there is a way to not have such a strict localisation in these arguments.