For each natural number {m}, let {H_m} denote the quantity

\displaystyle  H_m := \liminf_{n \rightarrow\infty} (p_{n+m} - p_n),

where {p_n} denotes the {n\textsuperscript{th}} prime. In other words, {H_m} is the least quantity such that there are infinitely many intervals of length {H_m} that contain {m+1} or more primes. Thus, for instance, the twin prime conjecture is equivalent to the assertion that {H_1 = 2}, and the prime tuples conjecture would imply that {H_m} is equal to the diameter of the narrowest admissible tuple of cardinality {m+1} (thus we conjecturally have {H_1 = 2}, {H_2 = 6}, {H_3 = 8}, {H_4 = 12}, {H_5 = 16}, and so forth; see this web page for further continuation of this sequence).
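(Admissibility of a tuple, in the sense defined in Section 1 below, is easy to test by machine. The following short Python sketch of mine, not part of Maynard's paper, checks the sample tuples listed above; the helper name `is_admissible` is just illustrative.)

```python
def is_admissible(tup):
    """A tuple is admissible if it avoids at least one residue class
    modulo p for every prime p; only primes p <= len(tup) can possibly
    occupy all residue classes, so it suffices to check those."""
    k = len(tup)
    for p in range(2, k + 1):
        if any(p % q == 0 for q in range(2, p)):
            continue  # p is composite, skip
        if len({h % p for h in tup}) == p:
            return False  # every residue class mod p is occupied
    return True

# The conjecturally narrowest tuples from the sequence above:
print(is_admissible((0, 2)))             # True  (twin primes)
print(is_admissible((0, 2, 6, 8, 12)))   # True  (the 5-tuple used below)
print(is_admissible((0, 2, 4)))          # False (hits 0, 1, 2 mod 3)
```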

In 2004, Goldston, Pintz, and Yildirim established the bound {H_1 \leq 16} conditional on the Elliott-Halberstam conjecture, which remains unproven. However, no unconditional proof of the finiteness of {H_1} was obtained (although they famously obtained the non-trivial bound {p_{n+1}-p_n = o(\log p_n)}), and even on the Elliott-Halberstam conjecture no finiteness result on the higher {H_m} was obtained either (although they were able to show {p_{n+2}-p_n=o(\log p_n)} on this conjecture). In the recent breakthrough of Zhang, the unconditional bound {H_1 \leq 70,000,000} was obtained, by establishing a weak partial version of the Elliott-Halberstam conjecture; by refining these methods, the Polymath8 project (which I suppose we could retroactively call the Polymath8a project) then lowered this bound to {H_1 \leq 4,680}.

With the very recent preprint of James Maynard, we have the following further substantial improvements:

Theorem 1 (Maynard’s theorem) Unconditionally, we have the following bounds:

  • {H_1 \leq 600}.
  • {H_m \leq C m^3 e^{4m}} for an absolute constant {C} and any {m \geq 1}.

If one assumes the Elliott-Halberstam conjecture, we have the following improved bounds:

  • {H_1 \leq 12}.
  • {H_2 \leq 600}.
  • {H_m \leq C m^3 e^{2m}} for an absolute constant {C} and any {m \geq 1}.

The final conclusion {H_m \leq C m^3 e^{2m}} on Elliott-Halberstam is not explicitly stated in Maynard’s paper, but follows easily from his methods, as I will describe below the fold. (At around the same time as Maynard’s work, I had also begun a similar set of calculations concerning {H_m}, but was only able to obtain the slightly weaker bound {H_m \leq C \exp( C m )} unconditionally.) In the converse direction, the prime tuples conjecture implies that {H_m} should be comparable to {m \log m}. Granville has also obtained the slightly weaker explicit bound {H_m \leq e^{8m+5}} for any {m \geq 1} by a slight modification of Maynard’s argument.

The arguments of Maynard avoid using the difficult partial results on (weakened forms of) the Elliott-Halberstam conjecture that were established by Zhang and then refined by Polymath8; instead, the main input is the classical Bombieri-Vinogradov theorem, combined with a sieve that is closer in spirit to an older sieve of Goldston and Yildirim than to the later sieve of Goldston, Pintz, and Yildirim on which almost all subsequent work is based.

The aim of the Polymath8b project is to obtain improved bounds on {H_1, H_2}, and higher values of {H_m}, either conditionally on the Elliott-Halberstam conjecture or unconditionally. The likeliest routes for doing this are by optimising Maynard’s arguments and/or combining them with some of the results from the Polymath8a project. This post is intended to be the first research thread for that purpose. To start the ball rolling, I am going to give below a presentation of Maynard’s results, with some minor technical differences (most significantly, I am using the Goldston-Pintz-Yildirim variant of the Selberg sieve, rather than the traditional “elementary Selberg sieve” that is used by Maynard (and also in the Polymath8 project), although it seems that the numerology obtained by both sieves is essentially the same). An alternate exposition of Maynard’s work has also just been completed by Andrew Granville.

— 1. Overview of argument —

Define an admissible {k_0}-tuple to be an increasing tuple {{\cal H} = (h_1,\ldots,h_{k_0})} of integers which avoids at least one residue class modulo {p} for each prime {p}. For {1 \leq j_0 \leq k_0}, let {DHL[k_0,j_0]} denote the following claim:

Conjecture 2 ({DHL[k_0,j_0]}) If {{\cal H}} is an admissible {k_0}-tuple, then there are infinitely many translates {n + {\cal H}} of {{\cal H}} that contain at least {j_0} primes.

The prime tuples conjecture is then the assertion that {DHL[k_0,j_0]} holds for all {k_0,j_0}. Clearly, if {DHL[k_0,m+1]} is true, then we have {H_m \leq h_{k_0}-h_1} whenever {(h_1,\ldots,h_{k_0})} is an admissible {k_0}-tuple. Theorem 1 then follows from the following claim:

Theorem 3 (Maynard’s theorem, DHL version) Unconditionally, we have the following bounds:

  • {DHL[105,2]}.
  • {DHL[k_0,m+1]} whenever {k_0} is sufficiently large and {4m < \log k_0 - 2 \log\log k_0 - 2}.

If one assumes the Elliott-Halberstam conjecture, we have the following improved bounds:

  • {DHL[5,2]}.
  • {DHL[105,3]}.
  • {DHL[k_0,m+1]} whenever {k_0} is sufficiently large and {2m < \log k_0 - 2 \log\log k_0 - 2}.

Indeed, the {m=1,2} results then follow from using the admissible {5}-tuple {(0,2,6,8,12)} and the admissible {105}-tuple

\displaystyle  (0, 10, 12, 24, 28, 30, 34, 42, 48, 52, 54, 64, 70, 72, 78, 82, 90, 94, 100,

\displaystyle  112, 114, 118, 120, 124, 132, 138, 148, 154, 168, 174, 178, 180, 184, 190,

\displaystyle  192, 202, 204, 208, 220, 222, 232, 234, 250, 252, 258, 262, 264, 268, 280,

\displaystyle  288, 294, 300, 310, 322, 324, 328, 330, 334, 342, 352, 358, 360, 364, 372,

\displaystyle  378, 384, 390, 394, 400, 402, 408, 412, 418, 420, 430, 432, 442, 444, 450,

\displaystyle  454, 462, 468, 472, 478, 484, 490, 492, 498, 504, 510, 528, 532, 534, 538,

\displaystyle  544, 558, 562, 570, 574, 580, 582, 588, 594, 598, 600)

found by Engelsma (and recorded on this site). For the larger {m} results, note that the bound {4m < \log k_0 - 2 \log\log k_0 - 2} is obeyed if {k_0 \geq C m^2 e^{4m}} for a sufficiently large {C} and {m} is large enough, and the claim follows by using the observation that one can create an admissible {k_0}-tuple of diameter {O( k_0 \log k_0)} by using the first {k_0} primes past {k_0}; similarly if one assumes the Elliott-Halberstam conjecture. (Note that as the {H_m} are clearly non-decreasing in {m}, it suffices to work with sufficiently large {m} to obtain bounds such as {H_m \leq C m^3 e^{4m}}.)
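(The construction in the last parenthetical is easy to make concrete. The following sketch of mine, not from the text, builds the tuple from the first {k_0} primes past {k_0}: for a prime {p \leq k_0} the residue class {0} mod {p} is avoided since every entry comes from a prime exceeding {p}, while for {p > k_0} there are too few entries to cover all classes. For {k_0 = 105} the resulting diameter is somewhat worse than Engelsma's optimised {600}.)

```python
import math

def is_prime(n):
    return n > 1 and all(n % d for d in range(2, math.isqrt(n) + 1))

def tuple_from_primes(k0):
    """The first k0 primes exceeding k0, shifted to start at 0; an
    admissible tuple of diameter O(k0 log k0) by the prime number theorem."""
    ps, n = [], k0 + 1
    while len(ps) < k0:
        if is_prime(n):
            ps.append(n)
        n += 1
    return [p - ps[0] for p in ps]

k0 = 105
tup = tuple_from_primes(k0)
print(len(tup), tup[-1])  # 105 entries, diameter 743 - 107 = 636
```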

As in previous work, the {DHL[k_0,m+1]} conclusions are obtained by constructing a sieve weight with good properties. We use the same asymptotic notation as in the Polymath8a project, thus all quantities depend on an asymptotic parameter {x} unless explicitly declared to be fixed, and asymptotic notation such as {O()}, {o()} or {\ll} is relative to this parameter. We let

\displaystyle  w := \lfloor \log\log\log x \rfloor

and {W := \prod_{p < w} p} as before. We let {\theta(n)} be the quantity {\log n} when {n} is prime, and zero otherwise.

Lemma 4 (Criterion for {DHL}) Let {k_0 \geq 2} and {m \geq 1} be fixed integers. Suppose that for each fixed admissible {k_0}-tuple {{\cal H}} and each congruence class {b\ (W)} such that {b+h} is coprime to {W} for all {h \in {\cal H}}, one can find a non-negative weight function {\nu \colon {\bf N} \rightarrow {\bf R}^+}, fixed quantities {\alpha,\beta > 0}, a quantity {B>0}, and a quantity {R>0} such that one has the upper bound

\displaystyle  \sum_{x \leq n \leq 2x: n = b\ (W)} \nu(n) \leq (\alpha+o(1)) B\frac{x}{W}, \ \ \ \ \ (1)

the lower bound

\displaystyle  \sum_{x \leq n \leq 2x: n = b\ (W)} \nu(n) \theta(n+h_i) \geq (\beta-o(1)) B\frac{x}{W} \log R \ \ \ \ \ (2)

for all {h_i \in {\cal H}}, and the key inequality

\displaystyle  \frac{\log R}{\log x} > \frac{m}{k_0} \frac{\alpha}{\beta}. \ \ \ \ \ (3)

Then {DHL[k_0,m+1]} holds.

The {m=1} case of this lemma is Lemma 4.1 of the polymath8a paper. The general {m} case is proven by an essentially identical argument, namely one considers the expression

\displaystyle  \sum_{x \leq n \leq 2x: n = b\ (W)} \nu(n) (\sum_{i=1}^{k_0} \theta(n+h_i) - m \log 3x ),

uses the hypotheses (1), (2), (3) to show that this expression is positive for sufficiently large {x}, and observes that the summand is only positive when {n+h_1,\ldots,n+h_{k_0}} contain at least {m+1} primes.
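(To spell out the final pigeonhole step, which is routine but worth recording:)

```latex
% If the displayed expression is positive, then for some x <= n <= 2x
% with n = b (W) one has
\sum_{i=1}^{k_0} \theta(n+h_i) > m \log 3x.
% Each n + h_i is at most 2x + h_i < 3x for x large, so every non-zero
% term obeys \theta(n+h_i) = \log(n+h_i) < \log 3x.  If at most m of
% the n + h_i were prime, the left-hand side would be at most
% m \log 3x, a contradiction; hence at least m+1 of them are prime.
```

Since the expression is positive for arbitrarily large {x}, one obtains infinitely many translates containing at least {m+1} primes, as required.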

We recall the statement of the Elliott-Halberstam conjecture {EH[\theta]}, for a given choice of fixed parameter {0 < \theta < 1}:

If {Q \lessapprox x^\theta} and {A \geq 1} is fixed, then

\displaystyle  \sum_{q \leq Q} \sup_{a \in ({\bf Z}/q{\bf Z})^\times} |\Delta(\Lambda 1_{[x,2x]}; a\ (q))| \ll x \log^{-A} x \ \ \ \ \ (4)

where
\displaystyle  \Delta(\alpha; a\ (q)) := \sum_{n = a\ (q)} \alpha(n) - \frac{1}{\phi(q)} \sum_{(n,q)=1} \alpha(n).

The Bombieri-Vinogradov theorem asserts that {EH[\theta]} holds for all {0 <\theta < 1/2}, while the Elliott-Halberstam conjecture asserts that {EH[\theta]} holds for all {0 < \theta < 1}.

In Polymath8a, the sieve weight {\nu} was constructed in terms of a smooth compactly supported one-variable function {F: [0,+\infty) \rightarrow {\bf R}}. A key innovation in Maynard’s work is to replace the sieve with one constructed using a smooth compactly supported multi-variable function {f: [0,+\infty)^{k_0} \rightarrow {\bf R}}, which affords significantly greater flexibility. More precisely, we will show

Proposition 5 (Sieve asymptotics) Suppose that {EH[\theta]} holds for some fixed {0 < \theta < 1}, and set {R = x^{c/2}} for some fixed {0 < c < \theta}. Let {f: [0,+\infty)^{k_0} \rightarrow {\bf R}} be a fixed symmetric smooth function supported on the simplex

\displaystyle  \Delta_{k_0} := \{ (t_1,\ldots,t_{k_0}) \in [0,+\infty)^{k_0}: t_1+\ldots+t_{k_0} \leq 1 \}.

Then one can find {\nu} obeying the bounds (1), (2) with

\displaystyle  B := (\frac{W}{\phi(W)})^{k_0} \frac{1}{\log^{k_0} R}

\displaystyle  \alpha := \int_{\Delta_{k_0}} f_{1,\ldots,k_0}(t_1,\ldots,t_{k_0})^2\ dt_1 \ldots dt_{k_0} \ \ \ \ \ (5)

\displaystyle  \beta := \int_{\Delta_{k_0-1}} f_{1,\ldots,k_0-1}(t_1,\ldots,t_{k_0-1},0)^2\ dt_1 \ldots dt_{k_0-1} \ \ \ \ \ (6)

where we use the shorthand

\displaystyle  f_{i_1,\ldots,i_j}(t_1,\ldots,t_n) := \frac{\partial^j}{\partial t_{i_1} \ldots \partial t_{i_j}} f(t_1,\ldots,t_n)

for the mixed partial derivatives of {f}.

(In fact, one can obtain asymptotics for (1), (2), rather than upper and lower bounds.)

(One can work with non-symmetric functions {f}, but this does not improve the numerology; see the remark after (7.1) of Maynard’s paper.)

We prove this proposition in Section 2. We remark that if one restricts attention to functions {f} of the form

\displaystyle  f(t_1,\ldots,t_{k_0}) = F(t_1+\ldots+t_{k_0})

for {F: [0,+\infty) \rightarrow {\bf R}} smooth and supported on {[0,1]}, then

\displaystyle  \alpha = \int_0^1 F^{(k_0)}(t)^2 \frac{t^{k_0-1}}{(k_0-1)!}\ dt

and

\displaystyle  \beta = \int_0^1 F^{(k_0-1)}(t)^2 \frac{t^{k_0-2}}{(k_0-2)!}\ dt

and this claim was already essentially established back in this Polymath8a post (or see Proposition 4.1 and Lemma 4.7 of the Polymath8a paper for essentially these bounds). In that previous post (and also in the paper of Farkas, Pintz, and Revesz), the ratio {\alpha/\beta} was optimised in this one-dimensional context using Bessel functions, and the method was unable to reach {m=1} without an improvement to Bombieri-Vinogradov, or to reach {m=2} even on Elliott-Halberstam. However, the additional flexibility afforded by the use of multi-dimensional cutoffs allows one to do better.

Combining Proposition 5 with Lemma 4, we obtain the following conclusion. For each {k_0}, let {M_{k_0}} be the quantity

\displaystyle  M_{k_0} := \sup_f k_0 \frac{ \int_{\Delta_{k_0-1}} f_{1,\ldots,k_0-1}(t_1,\ldots,t_{k_0-1},0)^2\ dt_1 \ldots dt_{k_0-1} }{ \int_{\Delta_{k_0}} f_{1,\ldots,k_0}(t_1,\ldots,t_{k_0})^2\ dt_1 \ldots dt_{k_0} }

where {f} ranges over all smooth symmetric functions {f: [0,+\infty)^{k_0} \rightarrow {\bf R}} that are supported on the simplex {\Delta_{k_0}}. Equivalently, by substituting {F := f_{1,\ldots,k_0}} and using the fundamental theorem of calculus, followed by an approximation argument to remove the smoothness hypotheses on {F}, we have

\displaystyle  M_{k_0} := \sup_F k_0 \frac{ \int_{\Delta_{k_0-1}} (\int_0^\infty F(t_1,\ldots,t_{k_0-1},t_{k_0})\ dt_{k_0})^2\ dt_1 \ldots dt_{k_0-1} }{ \int_{\Delta_{k_0}} F(t_1,\ldots,t_{k_0})^2\ dt_1 \ldots dt_{k_0} } \ \ \ \ \ (7)

where {F} ranges over all bounded measurable functions supported on {\Delta_{k_0}}. Then we have

Corollary 6 Let {0 < \theta<1} be such that {EH[\theta]} holds, and let {k_0 \geq 2}, {m \geq 1} be integers such that

\displaystyle  M_{k_0} > \frac{2m}{\theta}.

Then {DHL[k_0,m+1]} holds.

To use this corollary, we simply have to locate test functions {f} that give as large a lower bound for {M_{k_0}} as one can manage; this is a purely analytic problem that no longer requires any further number-theoretic input.
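(For illustration, an example of mine rather than one from Maynard's paper: already the crude choice {k_0=2}, {F = 1_{\Delta_2}} in (7) gives {M_2 \geq 4/3}, as the following sympy computation confirms; this is of course well short of the threshold {2m/\theta > 2} required by Corollary 6 even assuming Elliott-Halberstam.)

```python
from sympy import Rational, integrate, symbols

t1, t2 = symbols('t1 t2', nonnegative=True)
k0 = 2

# F = 1 on the simplex t1 + t2 <= 1 (a deliberately crude test function).
inner = integrate(1, (t2, 0, 1 - t1))   # inner integral in (7): 1 - t1
num = integrate(inner**2, (t1, 0, 1))   # = 1/3
den = integrate(1 - t1, (t1, 0, 1))     # volume of the simplex = 1/2
ratio = k0 * num / den
print(ratio)  # 4/3, so M_2 >= 4/3
```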

In particular, Theorem 3 follows from the following lower bounds:

Proposition 7

  • {M_5 > 2}.
  • {M_{105} > 4}.
  • If {k_0} is sufficiently large, then

    \displaystyle  M_{k_0} > \log k_0 - 2 \log\log k_0 - 2. \ \ \ \ \ (8)

The first two cases of this proposition are obtained numerically (see Section 7 of Maynard’s paper), by working with functions {F} that are of the special form

\displaystyle  F = 1_{\Delta_{k_0}} \sum_{i=1}^d a_i (1-P_1)^{b_i} P_2^{c_i}

for various real coefficients {a_i} and non-negative integers {b_i,c_i}, where

\displaystyle P_1(t_1,\ldots,t_{k_0}) := t_1+\ldots+t_{k_0}

and
\displaystyle P_2(t_1,\ldots,t_{k_0}) := t_1^2+\ldots+t_{k_0}^2.

In Maynard’s paper, the ratio

\displaystyle  k_0 \frac{ \int_{\Delta_{k_0-1}} (\int_0^\infty F(t_1,\ldots,t_{k_0-1},t_{k_0})\ dt_{k_0})^2\ dt_1 \ldots dt_{k_0-1} }{ \int_{\Delta_{k_0}} F(t_1,\ldots,t_{k_0})^2\ dt_1 \ldots dt_{k_0} }

in this case is computed to be

\displaystyle  \frac{a^T M_2 a}{a^T M_1 a}

where {a} is the {d \times 1} matrix with entries {a_1,\ldots,a_d}, {M_1} is the {d \times d} matrix with {ij} entry equal to

\displaystyle  \frac{(b_i+b_j)! G_{c_i+c_j,2}(k_0)}{(k_0+b_i+b_j+2c_i+2c_j)!}

where
\displaystyle  G_{b,j}(x) := b! \sum_{r=1}^b \binom{x}{r} \sum_{b_1,\ldots,b_r \geq 1: \sum_{i=1}^r b_i = b} \prod_{i=1}^r \frac{(jb_i)!}{b_i!}

and {M_2} is the {d \times d} matrix with {ij} entry equal to

\displaystyle  k_0 \sum_{c'_1=0}^{c_i} \sum_{c'_2=0}^{c_j} \binom{c_i}{c'_1} \binom{c_j}{c'_2} \frac{\gamma_{b_i,b_j,c_i,c_j,c'_1,c'_2} G_{c'_1+c'_2,2}(k_0-1)}{(k_0+b_i+b_j+2c_i+2c_j+1)!}

where {\gamma_{b_i,b_j,c_i,c_j,c'_1,c'_2}} is the quantity

\displaystyle  \frac{b_i! b_j! (2c_i-2c'_1)! (2c_j-2c'_2)! (b_i+b_j+2c_i+2c_j-2c'_1-2c'_2+2)!}{(b_i+2c_i-2c'_1+1)! (b_j + 2c_j-2c'_2+1)!}.

One then optimises the ratio {\frac{a^T M_2 a}{a^T M_1 a}} over the coefficients {a}; as this is a generalised Rayleigh quotient, its maximum is the largest eigenvalue of the generalised eigenvalue problem {M_2 a = \lambda M_1 a} (a similar optimisation appears in the original paper of Goldston, Pintz, and Yildirim). This yields a lower bound for {M_{k_0}} for {k_0=5} and {k_0=105}.
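(One way to see this optimisation in action: the maximum of {\frac{a^T M_2 a}{a^T M_1 a}} over {a \neq 0} is the largest eigenvalue of the pencil {(M_2, M_1)}. The following sketch of mine uses random stand-in matrices, not Maynard's actual {M_1, M_2}, just to confirm that the top generalised eigenvector attains the supremum.)

```python
import numpy as np
from scipy.linalg import eigh

# Illustrative stand-ins, NOT Maynard's matrices: M1 must be symmetric
# positive definite, M2 symmetric.
rng = np.random.default_rng(0)
d = 6
A = rng.standard_normal((d, d))
M1 = A @ A.T + d * np.eye(d)
B = rng.standard_normal((d, d))
M2 = (B + B.T) / 2

# eigh(M2, M1) solves M2 v = lambda M1 v, eigenvalues in ascending order;
# the largest one is the maximum of the Rayleigh quotient.
vals, vecs = eigh(M2, M1)
a = vecs[:, -1]
best = a @ M2 @ a / (a @ M1 @ a)
print(np.isclose(best, vals[-1]))  # True: the top eigenvector attains the sup
```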

The final case is established in a different manner; we give a proof of the slightly weaker bound

\displaystyle  M_{k_0} > \log k_0 - 4 \log\log k_0 - O(1) \ \ \ \ \ (9)

in Section 3.

— 2. Sieve asymptotics —

We now prove Proposition 5. We use a Fourier-analytic method, similar to that in this previous blog post.

The sieve we will use is of the form

\displaystyle  \nu(n) := (\sum_{d_1,\ldots,d_{k_0} \in {\cal S}: d_i|n+h_i \hbox{ for all } i=1,\ldots,k_0} \ \ \ \ \ (10)

\displaystyle (\prod_{i=1}^{k_0} \mu(d_i)) f( \frac{\log d_1}{\log R}, \ldots, \frac{\log d_{k_0} }{\log R} ))^2,

where {{\cal S}} denotes the square-free integers and {\mu} is the Möbius function; note that only those {d_i} coprime to {W} give a non-zero contribution below, since {n+h_i} is coprime to {W} whenever {n = b\ (W)}. This should be compared with the sieve

\displaystyle  \nu(n) := (\sum_{d \in {\cal S}: d|\prod_{i=1}^{k_0}(n+h_i)} \mu(d) F( \frac{\log d}{\log R}))^2

used in the previous blog post, which basically corresponds to the special case {f(t_1,\ldots,t_{k_0}) =F(t_1+\ldots+t_{k_0})}.

We begin with (1). Using (10), we may rearrange the left-hand side of (1) as

\displaystyle  \sum_{d_1,\ldots,d_{k_0},d'_1,\ldots,d'_{k_0}\in {\cal S}} (\prod_{j=1}^{k_0} \mu(d_j) \mu(d'_j))

\displaystyle  f( \frac{\log d_1}{\log R}, \ldots, \frac{\log d_{k_0} }{\log R} ) f( \frac{\log d'_1}{\log R}, \ldots, \frac{\log d'_{k_0} }{\log R} )

\displaystyle  \sum_{x \leq n \leq 2x: n = b\ (W); [d_j,d'_j] | n+h_j \hbox{ for all } j=1,\ldots,k_0} 1.

Observe that as the numbers {n+h_1,\ldots,n+h_{k_0}} have no common factor when {n = b\ (W)}, the inner sum vanishes unless the quantities {[d_1,d'_1],\ldots,[d_{k_0},d'_{k_0}]} are coprime, in which case this inner sum can be estimated as

\displaystyle  \frac{x}{W [d_1,d'_1] \ldots [d_{k_0},d'_{k_0}]}+O(1).

Also, at least one of the two products involving {f} will vanish unless one has

\displaystyle  d_1 \ldots d_{k_0}, d'_1 \ldots d'_{k_0} \leq R.

Let us first deal with the contribution of the error term {O(1)} to (1). This contribution may be bounded by

\displaystyle O( \sum_{d_1,\ldots,d_{k_0},d'_1,\ldots,d'_{k_0}: d_1 \ldots d_{k_0}, d'_1 \ldots d'_{k_0} \leq R} 1 ),

which sums to {\ll R^2 \log^{O(1)} R}; since {R = x^{c/2}} and {c < \theta < 1}, we conclude that this contribution is negligible.

To conclude the proof of (1), it thus suffices to show that

\displaystyle  \sum_{d_1,\ldots,d_{k_0},d'_1,\ldots,d'_{k_0}\in {\cal S}: [d_1,d'_1],\ldots,[d_{k_0},d'_{k_0}] \hbox{ coprime}} (\prod_{j=1}^{k_0} \mu(d_j) \mu(d'_j)) \ \ \ \ \ (11)

\displaystyle  \frac{f( \frac{\log d_1}{\log R}, \ldots, \frac{\log d_{k_0} }{\log R} ) f( \frac{\log d'_1}{\log R}, \ldots, \frac{\log d'_{k_0} }{\log R} )}{[d_1,d'_1] \ldots [d_{k_0},d'_{k_0}]}

\displaystyle  = (\alpha+o(1)) (\frac{W}{\phi(W)})^{k_0} \frac{1}{\log^{k_0} R}.

Next, we smoothly extend the function {f: [0,+\infty)^{k_0} \rightarrow {\bf R}} to a smooth compactly supported function {f: {\bf R}^{k_0}\rightarrow {\bf R}}, which by abuse of notation we will continue to refer to as {f}. By Fourier inversion, we may then express {f} in the form

\displaystyle  f(t_1,\ldots,t_{k_0}) = \int_{{\bf R}^{k_0}} \eta(\vec s) e^{-\sum_{j=1}^{k_0} (1+is_j)t_j}\ d \vec s \ \ \ \ \ (12)

where {\vec s := (s_1,\ldots,s_{k_0})} and where {\eta: {\bf R}^k \rightarrow {\bf C}} is a smooth function obeying the rapid decay bounds

\displaystyle  |\eta(\vec s)| \ll (1+|\vec s|)^{-A} \ \ \ \ \ (13)

for any fixed {A>0}. The left-hand side of (11) may then be rewritten as

\displaystyle  \int_{{\bf R}^{k_0}} \int_{{\bf R}^{k_0}} \eta(\vec s) \eta(\vec s') H(\vec s, \vec s')\ d\vec s d\vec s' \ \ \ \ \ (14)

where {\vec s' := (s'_1,\ldots,s'_{k_0})} and

\displaystyle  H(\vec s, \vec s') := \sum_{d_1,\ldots,d_{k_0},d'_1,\ldots,d'_{k_0}\in {\cal S}: [d_1,d'_1],\ldots,[d_{k_0},d'_{k_0}] \hbox{ coprime}} (\prod_{j=1}^{k_0} \mu(d_j) \mu(d'_j))

\displaystyle  \frac{\prod_{j=1}^{k_0} d_j^{-(1+is_j)/\log R} (d'_j)^{-(1+is'_j)/\log R}}{[d_1,d'_1] \ldots [d_{k_0},d'_{k_0}]}.

We may factorise {H(\vec s,\vec s')} as an Euler product

\displaystyle  H(\vec s,\vec s') = \prod_{p>w}

\displaystyle  (1 - \sum_{j=1}^{k_0} ( p^{-1-(1+is_j)/\log R} + p^{-1-(1+is'_j)/\log R} - p^{-1-(1+is_j)/\log R - (1+is'_j)/\log R} ) ).

In particular, we have the crude bound

\displaystyle |H(\vec s,\vec s')| \leq \prod_{p>w} (1 + 3 k_0 p^{-1-1/\log R}) \ll \log^{O(1)} R;

combining this with (13) we see that the contribution to (14) in which {|\vec s| \geq \sqrt{\log R}} or {|\vec s'| \geq \sqrt{\log R}} is negligible, so we may restrict the integral in (14) to the region {|\vec s|,|\vec s'| \leq \sqrt{\log R}}. In this region, we have the Euler product approximations

\displaystyle  \prod_{p>w} (1 - p^{-1-(1+is_j)/\log R}) = \zeta(1+\frac{1+is_j}{\log R})^{-1} \prod_{p \leq w} (1 - p^{-1-(1+is_j)/\log R})^{-1}

\displaystyle = (1+o(1)) (\frac{1+is_j}{\log R}) \prod_{p \leq w} (1-p^{-1})^{-1}

\displaystyle = (1+o(1)) \frac{W}{\phi(W)} (\frac{1+is_j}{\log R})

where we have used the bound {W = O(\log\log x)} and the asymptotic {\zeta(s) = (1+o(1)) (s-1)^{-1}} for {s = 1 + O(1/\log R)}. Since also {\sum_{p>w} p^{-2} = o(1)}, we conclude that

\displaystyle  H(\vec s,\vec s') = (1+o(1)) \prod_{j=1}^{k_0} \prod_{p>w} \frac{ (1-p^{-1-(1+is_j)/\log R}) (1-p^{-1-(1+is'_j)/\log R})}{1-p^{-1-(1+is_j)/\log R- (1+is'_j)/\log R}}

\displaystyle  = (1+o(1)) (\frac{W}{\phi(W)})^{k_0} \frac{1}{\log^{k_0} R} \prod_{j=1}^{k_0} \frac{ (1+is_j) (1+is'_j)}{1+is_j+1+is'_j}.

Using (13) again to dispose of the {o(1)} error term, and then using (13) once more to remove the restriction to {|\vec s|,|\vec s'| \leq \sqrt{\log R}}, we thus reduce to verifying the identity

\displaystyle  \int_{{\bf R}^{k_0}} \int_{{\bf R}^{k_0}} \eta(\vec s) \eta(\vec s') \prod_{j=1}^{k_0} \frac{ (1+is_j) (1+is'_j)}{1+is_j+1+is'_j} \ d\vec s d\vec s' = \alpha.

But from repeatedly differentiating (12) under the integral sign, one has

\displaystyle  f_{1,\ldots,k_0}(t_1,\ldots,t_{k_0}) = (-1)^{k_0} \int_{{\bf R}^{k_0}} \eta(\vec s) e^{-\sum_{j=1}^{k_0} (1+is_j)t_j} \prod_{j=1}^{k_0} (1+is_j)\ d \vec s

and thus

\displaystyle  f_{1,\ldots,k_0}(t_1,\ldots,t_{k_0})^2 = \int_{{\bf R}^{k_0}} \int_{{\bf R}^{k_0}} \eta(\vec s) \eta(\vec s') e^{-\sum_{j=1}^{k_0} (1+is_j +1 +is'_j)t_j}

\displaystyle  \prod_{j=1}^{k_0} (1+is_j) (1+is'_j)\ d \vec s d\vec s';

integrating this for {(t_1,\ldots,t_{k_0}) \in [0,+\infty)^{k_0}} using Fubini’s theorem (and (13)), the claim then follows from (5). This concludes the proof of (1).

Now we prove (2). We will just prove the claim for {i=k_0}, as the other cases follow similarly using the symmetry hypothesis on {f}. The left-hand side of (2) may then be expanded as

\displaystyle \sum_{d_1,\ldots,d_{k_0},d'_1,\ldots,d'_{k_0}\in {\cal S}} (\prod_{j=1}^{k_0} \mu(d_j) \mu(d'_j))

\displaystyle f( \frac{\log d_1}{\log R}, \ldots, \frac{\log d_{k_0} }{\log R} ) f( \frac{\log d'_1}{\log R}, \ldots, \frac{\log d'_{k_0} }{\log R} )

\displaystyle  \sum_{x \leq n \leq 2x: n = b\ (W); [d_j,d'_j] | n+h_j \hbox{ for all } j=1,\ldots,k_0} \theta(n+h_{k_0}).

Observe that the summand vanishes unless {d_{k_0}=d'_{k_0}=1}, since {\theta(n+h_{k_0})} is only non-zero when {n+h_{k_0}} is prime, and {n+h_{k_0}} is comparable to {x} and thus exceeds {R}. So we may simplify the above expression to

\displaystyle  \sum_{d_1,\ldots,d_{k_0-1},d'_1,\ldots,d'_{k_0-1}\in {\cal S}} (\prod_{j=1}^{k_0-1} \mu(d_j) \mu(d'_j)) f( \frac{\log d_1}{\log R}, \ldots, \frac{\log d_{k_0-1} }{\log R}, 0 ) f( \frac{\log d'_1}{\log R}, \ldots, \frac{\log d'_{k_0-1} }{\log R}, 0 )

\displaystyle  \sum_{x \leq n \leq 2x: n = b\ (W); [d_j,d'_j] | n+h_j \hbox{ for all } j=1,\ldots,k_0-1} \theta(n+h_{k_0}).

As in the estimation of (1), the summand vanishes unless {[d_1,d'_1],\ldots,[d_{k_0-1},d'_{k_0-1}]} are coprime and

\displaystyle  d_1 \ldots d_{k_0-1}, d'_1 \ldots d'_{k_0-1} \leq R.

Let {d_1,\ldots,d_{k_0-1},d'_1,\ldots,d'_{k_0-1} \in {\cal S}} obey the above constraints. For any modulus {q}, define the discrepancy

\displaystyle  E(q) := \sup_{a \in ({\bf Z}/q{\bf Z})^\times} |\sum_{x \leq n \leq 2x: n+ h_{k_0} = a\ (q)} \theta(n+h_{k_0}) - \frac{x}{\phi(q)}|. \ \ \ \ \ (15)

Since {R = x^{c/2}} and {0 < c < \theta} is fixed, the hypothesis {EH[\theta]} implies that

\displaystyle  \sum_{q \leq WR^2} E(q) \ll x \log^{-A} x \ \ \ \ \ (16)

for any fixed {A>0}. On the other hand, the sum

\displaystyle \sum_{x \leq n \leq 2x: n = b\ (W); [d_j,d'_j] | n+h_j \hbox{ for all } j=1,\ldots,k_0-1} \theta(n+h_{k_0})

can, by the Chinese remainder theorem, be rewritten in the form

\displaystyle  \sum_{x \leq n \leq 2x: n+h_{k_0} = a\ (q)} \theta(n+h_{k_0})

where
\displaystyle  q := W \prod_{j=1}^{k_0-1} [d_j,d'_j] \ \ \ \ \ (17)

and {a\ (q)} is a primitive residue class; note that {q} does not exceed {WR^2}. By (15), this expression can then be written as

\displaystyle  \frac{x}{\phi(W) \prod_{j=1}^{k_0-1} \phi([d_j,d'_j])} + O( E(W \prod_{j=1}^{k_0-1} [d_j,d'_j]) ).

Let us first control the error term, which may be bounded by

\displaystyle  O( \sum_{d_1,\ldots,d_{k_0-1},d'_1,\ldots,d'_{k_0-1}: d_1 \ldots d_{k_0-1}, d'_1 \ldots d'_{k_0-1} \leq R} E(W \prod_{j=1}^{k_0-1} [d_j,d'_j]) ).

Note that for any {q \leq WR^2}, there are {O( \tau(q)^{O(1)})} choices of {d_1,\ldots,d_{k_0-1},d'_1,\ldots,d'_{k_0-1}} for which (17) holds. Thus we may bound the previous expression by

\displaystyle  \ll \sum_{q \leq WR^2} \tau(q)^{O(1)} E(q).

By the Cauchy-Schwarz inequality and (16), this expression may be bounded by

\displaystyle  \ll (x \log^{-A} x)^{1/2} (\sum_{q \leq WR^2} \tau(q)^{O(1)} E(q))^{1/2}

for any fixed {A}. On the other hand, we have the crude bound {E(q) \ll \frac{x}{q} \log^{O(1)} x}, as well as the standard estimate

\displaystyle  \sum_{q \leq y} \frac{\tau(q)^{O(1)}}{q} \ll \log^{O(1)} y.

(see e.g. Corollary 2.15 of Montgomery-Vaughan). Putting all this together, we conclude that the contribution of the error term to (2) is negligible. To conclude the proof of (2), it thus suffices to show that

\displaystyle  \sum_{d_1,\ldots,d_{k_0-1},d'_1,\ldots,d'_{k_0-1}\in {\cal S}} (\prod_{j=1}^{k_0-1} \mu(d_j) \mu(d'_j))

\displaystyle  \frac{f( \frac{\log d_1}{\log R}, \ldots, \frac{\log d_{k_0-1} }{\log R}, 0 ) f( \frac{\log d'_1}{\log R}, \ldots, \frac{\log d'_{k_0-1} }{\log R}, 0 )}{\prod_{j=1}^{k_0-1} \phi([d_j,d'_j])}

\displaystyle  = (\beta-o(1)) (\frac{W}{\phi(W)})^{k_0-1} \frac{1}{\log^{k_0-1} R}.

But this can be proven by repeating the arguments used to prove (11) (with {k_0} replaced by {k_0-1}, and {f(t_1,\ldots,t_{k_0})} replaced by {f(t_1,\ldots,t_{k_0-1},0)}); the presence of the Euler totient function causes some factors of {\frac{1}{p}} in that analysis to be replaced by {\frac{1}{p-1} = \frac{1}{p} + O(\frac{1}{p^2})}, but this turns out to have a negligible impact on the final asymptotics since {\sum_{p >w} \frac{1}{p^2} = o(1)}. This concludes the proof of (2) and hence Proposition 5.

Remark 1 An inspection of the above arguments shows that the simplex {\Delta_{k_0}} can be enlarged slightly to the region

\displaystyle  \Delta'_{k_0} := \{ (t_1,\ldots,t_{k_0}) \in [0,+\infty)^{k_0}: t_1+\ldots+t_{k_0} \leq 1 + t_i

\displaystyle  \hbox{ for all } 1 \leq i \leq k_0 \},

however this only leads to a tiny improvement in the numerology. It is interesting to note however that in the {k_0=2} case, {\Delta'_{k_0}} is the unit square {[0,1]^2}, and by taking {f(t_1,t_2) := (1-t_1)_+ (1-t_2)_+} and taking {c} close to {1}, one can come “within an epsilon” of establishing {DHL[2,2]} (and in particular, the twin prime conjecture) from the full Elliott-Halberstam conjecture; this fact was already essentially observed by Bombieri, using the weight {\nu(n):=\Lambda_2(n) \Lambda_2(n+2)} rather than the Selberg sieve. (Strictly speaking, to establish (1) in this context, one needs the Elliott-Halberstam conjecture not only for {\Lambda}, but also for other arithmetic functions with a suitable Dirichlet convolution {\alpha*\beta} structure; we omit the details.)

Remark 2 It appears that the {MPZ[\varpi,\delta]} conjecture studied in Polymath8a can serve as a substitute for {EH[\frac{1}{2}+2\varpi]} in Corollary 6, except that one also has to impose an additional constraint on the function {F} (or {f}), namely that it is supported in the cube {[0, \frac{\delta}{1/4 +\varpi}]^{k_0}} (in order to keep the moduli involved appropriately smooth). Perhaps as a first approximation, we should ignore the role of {\delta}, and just pretend that {MPZ[\varpi,\delta]} is as good as {EH[\frac{1}{2}+2\varpi]}. In particular, inserting our most optimistic value of {\varpi} obtained by Polymath8, namely {\frac{13}{1080}}, we can in principle take {\theta} as large as {283/540 = 0.524\ldots}, although this only is a {5\%} improvement or so over the Bombieri-Vinogradov inequality.

— 3. Large {k_0} computations —

We now give a proof of (9) for sufficiently large {k_0}. We use the ansatz

\displaystyle  F = 1_{\Delta_{k_0}} F_0

where {F_0} is the tensor product

\displaystyle  F_0(t_1,\ldots,t_{k_0}) = \prod_{i=1}^{k_0} k_0^{1/2} g( k_0 t_i )

and {g: [0,+\infty) \rightarrow {\bf R}} is supported on some interval {[0,T]} and normalised so that

\displaystyle  \int_0^\infty g(t)^2\ dt = 1. \ \ \ \ \ (18)

The function {F} is clearly symmetric and supported on {\Delta_{k_0}}. We now estimate the numerator and denominator of the ratio

\displaystyle  k_0 \frac{ \int_{\Delta_{k_0-1}} (\int_0^\infty F(t_1,\ldots,t_{k_0-1},t_{k_0})\ dt_{k_0})^2\ dt_1 \ldots dt_{k_0-1} }{ \int_{\Delta_{k_0}} F(t_1,\ldots,t_{k_0})^2\ dt_1 \ldots dt_{k_0} }

that lower bounds {M_{k_0}}.

For the denominator, we bound {F} by {F_0} and use Fubini’s theorem and (18) to obtain the upper bound

\displaystyle \int_{\Delta_{k_0}} F(t_1,\ldots,t_{k_0})^2\ dt_1 \ldots dt_{k_0} \leq 1

and thus

\displaystyle  M_{k_0} \geq k_0 \int_{\Delta_{k_0-1}} (\int_0^\infty F(t_1,\ldots,t_{k_0-1},t_{k_0})\ dt_{k_0})^2\ dt_1 \ldots dt_{k_0-1}.

Now we observe that

\displaystyle  \int_0^\infty F(t_1,\ldots,t_{k_0-1},t_{k_0})\ dt_{k_0} = (\prod_{i=1}^{k_0-1} k_0^{1/2} g(k_0 t_i)) k_0^{-1/2} \int_0^\infty g(t)\ dt

whenever {t_1+\ldots+t_{k_0-1} \leq 1 - \frac{T}{k_0}}, and so we have the lower bound

\displaystyle  M_{k_0} \geq (\int_0^\infty g(t)\ dt)^2 \int_{t_1+\ldots+t_{k_0-1} \leq 1-\frac{T}{k_0}} (\prod_{i=1}^{k_0-1} k_0^{1/2} g(k_0 t_i))^2 dt_1 \ldots dt_{k_0-1}.

We interpret this probabilistically. Let {X_1,\ldots,X_{k_0-1}} be independent, identically distributed non-negative real random variables with probability density {g(t)^2\ dt}; this is well-defined thanks to (18). Observe that {(\prod_{i=1}^{k_0-1} k_0^{1/2} g( k_0 t_i ))^2} is the joint probability density of {\frac{1}{k_0}(X_1,\ldots,X_{k_0-1})}, and so

\displaystyle  M_{k_0} \geq (\int_0^\infty g(t)\ dt)^2 {\bf P} (X_1 + \ldots + X_{k_0-1} \leq k_0 - T).

We will lower bound this probability using the Chebyshev inequality. (In my own calculations, I had used the Hoeffding inequality instead, but it seems to only give a slightly better bound in the end for {H_m} (perhaps saving one or two powers of {m}).) In order to exploit the law of large numbers, we would like the mean {(k_0-1) \mu} of {X_1 + \ldots + X_{k_0-1}}, where {\mu := \int_0^T t g(t)^2\ dt}, to be less than {k_0-T}:

\displaystyle  (k_0-1) \mu < k_0 - T.

The variance of {X_1 + \ldots + X_{k_0-1}} is {k_0-1} times the variance of a single {X_i}, which we can bound (somewhat crudely) by

\displaystyle  \hbox{Var}(X_i) \leq {\bf E} X_i^2 \leq T {\bf E} X_i = T \mu.

Thus by Chebyshev’s inequality, we have

\displaystyle  {\bf P} (X_1 + \ldots + X_{k_0-1} \leq k_0 - T) \geq 1 - \frac{(k_0-1) T \mu}{(k_0-T-(k_0-1)\mu)^2}.

To clean things up a bit we bound {k_0-1} by {k_0} to obtain the simpler bound

\displaystyle  {\bf P} (X_1 + \ldots + X_{k_0-1} \leq k_0 - T) \geq 1 - \frac{k_0 T \mu}{(k_0-T-k_0\mu)^2}

assuming now that {k_0 \mu < k_0 - T}. In particular, {\mu \leq 1}, so we have the cleaner bound

\displaystyle  {\bf P} (X_1 + \ldots + X_{k_0-1} \leq k_0 - T) \geq 1 - \frac{T}{k_0 (1-T/k_0-\mu)^2}.
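(As a quick numerical sanity check of this bound, of my own devising rather than from the text: take the simplest choice {g = T^{-1/2} 1_{[0,T]}}, for which the {X_i} are uniform on {[0,T]} and {\mu = T/2}, and compare the Chebyshev bound against an empirical estimate.)

```python
import numpy as np

rng = np.random.default_rng(1)
k0, T = 1000, 1.0
# g = T**-0.5 on [0, T]: g(t)^2 dt is the uniform density, so mu = T/2.
mu = T / 2
cheb = 1 - T / (k0 * (1 - T / k0 - mu) ** 2)  # the lower bound above
X = rng.uniform(0.0, T, size=(20000, k0 - 1))
emp = (X.sum(axis=1) <= k0 - T).mean()
print(cheb <= emp <= 1.0)  # the empirical probability dominates the bound
```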

To summarise, we have shown that

\displaystyle  M_{k_0} \geq (\int_0^T g(t)\ dt)^2 ( 1 - \frac{T}{k_0 (1-T/k_0-\mu)^2} ) \ \ \ \ \ (19)

whenever {g: [0,T] \rightarrow {\bf R}} is such that

\displaystyle  \int_0^T g(t)^2\ dt = 1 \ \ \ \ \ (20)

and
\displaystyle  \mu = \int_0^T t g(t)^2\ dt < 1 - \frac{T}{k_0}.

One can optimise this carefully to give (8) (as is done in Maynard’s paper), but for the purposes of establishing the slightly weaker bound (9), we can use {g} of the form

\displaystyle  g(t) = \frac{c}{1+At}

with {A := \log k_0} and {T := k_0 \log^{-3} k_0}. With the normalisation (20) we have

\displaystyle  c^2 = \frac{1+AT}{T} = \log k_0 + O(1)

and hence
\displaystyle  c = \log^{1/2} k_0 + O(\log^{-1/2} k_0)

and
\displaystyle  \mu = \frac{c^2}{A^2} (\log(1+AT) + \frac{1}{1+AT} - 1)

\displaystyle  = 1 - \frac{2 \log\log k_0}{\log k_0} + O( \frac{1}{\log k_0} )

and thus

\displaystyle  \frac{T}{k_0 (1-T/k_0-\mu)^2} = O( \frac{1}{\log k_0} )

and
\displaystyle  \int_0^T g(t)\ dt = \frac{c}{A} \log(1+AT)

\displaystyle  = \log^{1/2} k_0 - 2 \log^{-1/2} k_0 \log\log k_0 + O( \log^{-1/2} k_0 )

which gives the claim.
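(A final numerical sanity check, mine rather than the text's: evaluating the right-hand side of (19) for this choice of {g}, using the closed forms of the integrals, and comparing against {\log k_0 - 4 \log\log k_0}.)

```python
import math

k0 = 10 ** 8
A = math.log(k0)                  # A = log k0
T = k0 / A ** 3                   # T = k0 log^{-3} k0
c = math.sqrt((1 + A * T) / T)    # normalisation (20): c^2 T/(1+AT) = 1

# Closed forms for the integrals of g(t) = c/(1+At) over [0, T]:
I = (c / A) * math.log(1 + A * T)                                     # int g
mu = (c ** 2 / A ** 2) * (math.log(1 + A * T) + 1 / (1 + A * T) - 1)  # int t g^2

bound = I ** 2 * (1 - T / (k0 * (1 - T / k0 - mu) ** 2))  # right side of (19)
target = math.log(k0) - 4 * math.log(math.log(k0))
print(mu < 1 - T / k0, bound > target)  # both conditions hold for this k0
```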