This is the second thread for the Polymath8b project to obtain new bounds for the quantity

\displaystyle  H_m := \liminf_{n \rightarrow\infty} (p_{n+m} - p_n),

either for small values of {m} (in particular {m=1,2}) or asymptotically as {m \rightarrow \infty}. The previous thread may be found here. The currently best known bounds on {H_m} are:

  • (Maynard) {H_1 \leq 600}.
  • (Polymath8b, tentative) {H_2 \leq 484,276}.
  • (Polymath8b, tentative) {H_m \leq \exp( 3.817 m )} for sufficiently large {m}.
  • (Maynard) Assuming the Elliott-Halberstam conjecture, {H_1 \leq 12}, {H_2 \leq 600}, and {H_m \ll m^3 e^{2m}}.

Following the strategy of Maynard, the bounds on {H_m} proceed by combining four ingredients:

  1. Distribution estimates {EH[\theta]} or {MPZ[\varpi,\delta]} for the primes (or related objects);
  2. Bounds for the minimal diameter {H(k)} of an admissible {k}-tuple;
  3. Lower bounds for the optimal value {M_k} of a certain variational problem;
  4. Sieve-theoretic arguments to convert the previous three ingredients into a bound on {H_m}.

Accordingly, the most natural routes to improve the bounds on {H_m} are to improve one or more of the above four ingredients.

Ingredient 1 was studied intensively in Polymath8a. The following results are known or conjectured (see the Polymath8a paper for notation and proofs):

  • (Bombieri-Vinogradov) {EH[\theta]} is true for all {0 < \theta < 1/2}.
  • (Polymath8a) {MPZ[\varpi,\delta]} is true for {\frac{600}{7} \varpi + \frac{180}{7}\delta < 1}.
  • (Polymath8a, tentative) {MPZ[\varpi,\delta]} is true for {\frac{1080}{13} \varpi + \frac{330}{13} \delta < 1}.
  • (Elliott-Halberstam conjecture) {EH[\theta]} is true for all {0 < \theta < 1}.

Ingredient 2 was also studied intensively in Polymath8a, and is more or less a solved problem for the values of {k} of interest (with exact values of {H(k)} for {k \leq 342}, and quite good upper bounds for {H(k)} for {k < 5000}, available at this page). So the main focus currently is on improving Ingredients 3 and 4.
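
As a concrete (and entirely elementary) illustration of the admissibility condition underlying {H(k)}, here is a minimal Python sketch; the helper names are mine, and it exploits the fact that it suffices to check primes {p \leq k}, since for {p > k} the {k} shifts cannot occupy every residue class mod {p}.

```python
# Minimal sketch (illustrative): a tuple (h_1,...,h_k) is admissible iff for
# every prime p the h_i avoid at least one residue class mod p; it suffices
# to check p <= k, since k residues cannot cover all p classes when p > k.

def primes_up_to(n):
    """Sieve of Eratosthenes."""
    sieve = [True] * (n + 1)
    sieve[:2] = [False, False]
    for p in range(2, int(n ** 0.5) + 1):
        if sieve[p]:
            sieve[p * p :: p] = [False] * len(sieve[p * p :: p])
    return [p for p, is_prime in enumerate(sieve) if is_prime]

def is_admissible(h_tuple):
    k = len(h_tuple)
    for p in primes_up_to(k):
        if len({h % p for h in h_tuple}) == p:  # all classes mod p covered
            return False
    return True

print(is_admissible((0, 2, 6)))  # True: diameter 6, the exact value of H(3)
print(is_admissible((0, 2, 4)))  # False: covers all residue classes mod 3
```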

For Ingredient 3, the basic variational problem is to understand the quantity

\displaystyle  M_k({\cal R}_k) := \sup_F \frac{\sum_{m=1}^k J_k^{(m)}(F)}{I_k(F)}

where {F} ranges over bounded measurable functions {F: {\cal R}_k \rightarrow {\bf R}}, not identically zero, on the simplex

\displaystyle  {\cal R}_k := \{ (t_1,\ldots,t_k) \in [0,+\infty)^k: t_1+\ldots+t_k \leq 1 \}

with {I_k, J_k^{(m)}} being the quadratic forms

\displaystyle  I_k(F) := \int_{{\cal R}_k} F(t_1,\ldots,t_k)^2\ dt_1 \ldots dt_k

and

\displaystyle  J_k^{(m)}(F) := \int_{{\cal R}_{k-1}} (\int_0^{1-\sum_{i \neq m} t_i} F(t_1,\ldots,t_k)\ dt_m)^2\ dt_1 \ldots dt_{m-1} dt_{m+1} \ldots dt_k.
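
For orientation, even the simplest cutoff {F \equiv 1} gives a non-trivial lower bound here: one computes {I_k(1) = 1/k!} (the volume of the simplex) and {J_k^{(m)}(1) = \int_{{\cal R}_{k-1}} (1-t_1-\ldots-t_{k-1})^2\ dt_1 \ldots dt_{k-1} = 2/(k+1)!}, so that

\displaystyle  M_k({\cal R}_k) \geq \frac{k \cdot 2/(k+1)!}{1/k!} = \frac{2k}{k+1} < 2,

which by Theorem 1 below is never enough to conclude {H_m \leq H(k)}, even assuming Elliott-Halberstam; the gains in Maynard's approach come from genuinely multidimensional choices of {F}.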

Equivalently, one has

\displaystyle  M_k({\cal R}_k) := \sup_F \frac{\int_{{\cal R}_k} F {\cal L}_k F}{\int_{{\cal R}_k} F^2}

where {{\cal L}_k: L^2({\cal R}_k) \rightarrow L^2({\cal R}_k)} is the positive semi-definite bounded self-adjoint operator

\displaystyle  {\cal L}_k F(t_1,\ldots,t_k) = \sum_{m=1}^k \int_0^{1-\sum_{i \neq m} t_i} F(t_1,\ldots,t_{m-1},s,t_{m+1},\ldots,t_k)\ ds,

so {M_k({\cal R}_k)} is the operator norm of {{\cal L}_k}. Another interpretation of {M_k({\cal R}_k)} is that the probability that a rook moving randomly in the unit cube {[0,1]^k} (each move resamples a single coordinate uniformly at random) stays in the simplex {{\cal R}_k} for {n} moves is asymptotically {(M_k({\cal R}_k)/k + o(1))^n}.
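
As a quick sanity check on this interpretation (not used in any of the rigorous arguments), one can estimate {M_k({\cal R}_k)/k} by population Monte Carlo: kill the rooks that leave the simplex and resample the survivors back to full strength. A minimal Python sketch, with all parameters illustrative:

```python
# Monte Carlo sketch of the rook interpretation (parameters illustrative).
# Each step, a rook resamples one uniformly chosen coordinate; rooks leaving
# the simplex are killed, and survivors are resampled back to the original
# population size, so the per-step survival rate converges to the top
# eigenvalue M_k / k of the substochastic transition operator.
import random

def survival_rate(k, n_steps=40, n_walkers=20000, burn_in=15, seed=0):
    rng = random.Random(seed)
    walkers = [[1.0 / (2 * k)] * k for _ in range(n_walkers)]  # start inside
    rates = []
    for _ in range(n_steps):
        survivors = []
        for w in walkers:
            w[rng.randrange(k)] = rng.random()   # the rook move
            if sum(w) <= 1.0:                    # still in the simplex?
                survivors.append(w)
        if not survivors:
            break
        rates.append(len(survivors) / len(walkers))
        walkers = [list(rng.choice(survivors)) for _ in range(n_walkers)]
    return sum(rates[burn_in:]) / len(rates[burn_in:])

k = 2
print(k * survival_rate(k))  # crude estimate of M_2(R_2)
```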

We now have a fairly good asymptotic understanding of {M_k({\cal R}_k)}, with the bounds

\displaystyle  \log k - 2 \log\log k -2 \leq M_k({\cal R}_k) \leq \log k + \log\log k + 2

holding for sufficiently large {k}. There is however still room to tighten the bounds on {M_k({\cal R}_k)} for small {k}; I’ll summarise some of the ideas discussed so far below the fold.

For Ingredient 4, the basic tool is this:

Theorem 1 (Maynard) If {EH[\theta]} is true and {M_k({\cal R}_k) > \frac{2m}{\theta}}, then {H_m \leq H(k)}.

Thus, for instance, it is known that {M_{105} > 4} and {H(105)=600}, and this together with the Bombieri-Vinogradov theorem gives {H_1\leq 600}. This result is proven in Maynard’s paper, and an alternate proof is also given in the previous blog post.

We have a number of ways to relax the hypotheses of this result, which we also summarise below the fold.

— 1. Improved sieving —

A direct modification of the proof of Theorem 1 also shows:

Theorem 2 If {MPZ[\varpi,\delta]} is true and {M_k({\cal R}_k \cap [0,\frac{\delta}{1/4+\varpi}]^k) > \frac{m}{1/4+\varpi}}, then {H_m \leq H(k)}.

Here {M_k} is defined for the truncated simplex {{\cal R}_k \cap [0,\frac{\delta}{1/4+\varpi}]^k} in the obvious fashion. This allows us to use the MPZ-type bounds obtained in Polymath8a, at the cost of requiring the test functions {F} to have somewhat truncated support. Fortunately, in the large {k} setting, the functions we were using had such a truncated support anyway. It looks likely that we can replace the cube {[0,\frac{\delta}{1/4+\varpi}]^k} by significantly larger regions by using the (multiple) dense divisibility versions of {MPZ}, but we have not yet looked into this.

It also appears that if one generalises the Elliott-Halberstam conjecture {EH[\theta]} to also encompass more general Dirichlet convolutions {\alpha * \beta} than the von Mangoldt function {\Lambda} (see e.g. Conjecture 1 of Bombieri-Friedlander-Iwaniec), then one can enlarge the simplex {{\cal R}_k} in Theorem 1 (and probably for Theorem 2 also) to the slightly larger region

\displaystyle  {\cal R}'_k := \{ (t_1,\ldots,t_k) \in [0,+\infty)^k: \sum_{i \neq m} t_i \leq 1 \hbox{ for all } m=1,\ldots,k \}.

Basically, the reason for this is that the restriction to the simplex {{\cal R}_k} (as opposed to {{\cal R}'_k}) is only needed to control the sum {\sum_n \nu(n)}, but by splitting {\nu} into products of simpler divisor sums, and using the Elliott-Halberstam hypothesis to control one of the factors, it looks like one can still control error terms in the larger region {{\cal R}'_k} (but this will have to be checked at some point, if we end up using this refinement). This is only likely to give a slight improvement, except when {k} is small; from the inclusions

\displaystyle  {\cal R}_k \subset {\cal R}'_k \subset \frac{k}{k-1} \cdot {\cal R}_k

and a scaling argument (the substitution {t \mapsto \lambda t} scales {M_k} by {\lambda}) we see that

\displaystyle  M_k({\cal R}_k) \leq M_k({\cal R}'_k) \leq \frac{k}{k-1} M_k( {\cal R}_k ).

Assume EH. To improve the bound {H_1 \leq 12} to {H_1 \leq 10}, it suffices to obtain a bound of the form

\displaystyle  P_0 + P_2 + P_6 + P_8 + P_{12} > 1 + P_{0,12}

where

\displaystyle  P_h = \sum_n 1_{n+h \hbox{ prime}} \nu(n) / \sum_n \nu(n)

and

\displaystyle  P_{0,12} = \sum_n 1_{n,n+12 \hbox{ prime}} \nu(n) / \sum_n \nu(n).

With {\nu} given in terms of a cutoff function {F}, the left-hand side {P_0 + P_2 + P_6 + P_8 + P_{12}} can be computed as usual as

\displaystyle  P_0 + P_2 + P_6 + P_8 + P_{12} = \frac{1}{2} \sum_{m=1}^5 J_5^{(m)}(F) / I_5(F) + o(1)

while we have the upper bound

\displaystyle  P_{0,12} \leq \frac{1}{2} \int \frac{(\int_{t_1+t_5 \leq 1-t_2-t_3-t_4} F(t_1,t_2,t_3,t_4,t_5)\ dt_1 dt_5)^2}{1-t_2-t_3-t_4}\ dt_2\ dt_3\ dt_4 / I_5(F)

\displaystyle + o(1)

and other bounds may be possible. (This is discussed in this comment.)

For higher {k}, it appears that similar maneuvers will have a relatively modest impact, perhaps shaving {\sqrt{k}} or so off of the current values of {k}.

— 2. Upper bound on {M_k} —

We have the upper bound

\displaystyle  M_k \leq (1 + \frac{1}{A}) \log(1+Ak)

for any {A>0}. To see this, observe from Cauchy-Schwarz that

\displaystyle  (\int_0^{1-\sum_{i \neq m} t_i} F(t_1,\ldots,t_k)\ dt_m)^2 \leq

\displaystyle  (\int_0^{1-\sum_{i \neq m} t_i} (1+Akt_m) F(t_1,\ldots,t_k)^2\ dt_m)

\displaystyle  \times (\int_0^1 \frac{1}{1+Akt_m}\ dt_m).

The final factor is {\frac{1}{Ak} \log (1+Ak)}, and so

\displaystyle  J_k^{(m)}(F) \leq \frac{1}{Ak} \log (1+Ak) \int_{{\cal R}_k} (1+Akt_m) F(t_1,\ldots,t_k)^2\ dt_1 \ldots dt_k.

Summing in {m} and noting that {t_1+\ldots+t_k \leq 1} on the simplex we have

\displaystyle  \sum_{m=1}^k J_k^{(m)}(F) \leq \frac{1}{Ak} \log (1+Ak) \int_{{\cal R}_k} (k+Ak) F(t_1,\ldots,t_k)^2\ dt_1 \ldots dt_k,

and the claim follows.

Setting {A = \log k}, we conclude that

\displaystyle  M_k \leq \log k + \log\log k + 2

for sufficiently large {k}. There may be some room to improve these bounds a bit further.
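
For concreteness, here is a quick numerical sketch (illustrative only) that evaluates the upper bound {(1 + \frac{1}{A}) \log(1+Ak)}, crudely minimises it over {A} by grid search, and compares the optimum with the convenient choice {A = \log k} made above:

```python
# Illustrative numerical check of the upper bound (1 + 1/A) log(1 + A k):
# crude grid-search minimum over A, compared against the choice A = log k.
import math

def upper_bound(k, A):
    return (1 + 1 / A) * math.log(1 + A * k)

for k in (10, 100, 1000, 10000):
    best = min(upper_bound(k, 0.01 * j) for j in range(1, 3000))
    print(k, round(best, 4), round(upper_bound(k, math.log(k)), 4))
```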

— 3. Lower bounds on {M_k} —

For small {k}, one can optimise the quadratic form

\displaystyle  \frac{\int F {\cal L} F}{\int F^2}

by specialising {F} to a finite-dimensional space and then performing the appropriate linear algebra. It is known that we may restrict without loss of generality to symmetric {F}; one could in principle also restrict to the functions of the form

\displaystyle  F(t_1,\ldots,t_k) = \sum_{m=1}^k G( t_1,\ldots,t_{m-1},t_{m+1},\ldots,t_k)

for some symmetric function {G: {\cal R}_{k-1} \rightarrow {\bf R}} (indeed, morally at least {F} should be an eigenfunction of {{\cal L}}), although we have not been able to take much advantage of this yet.

For large {k}, we can use the bounds

\displaystyle  M_k({\cal R}_k)^m \geq \int_{{\cal R}_k} F {\cal L}^m F

for any {m \geq 1} and any {F} with {\int_{{\cal R}_k} F^2 \leq 1}; we can also start with a given {F} and improve it by replacing it with {{\cal L} F} (normalising in {L^2} if desired), and perhaps even iterating and accelerating this process.
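
Here is a minimal Python sketch of this power iteration in the smallest nontrivial case {k=2}, discretising {{\cal R}_2} on a midpoint grid (grid size and iteration count are illustrative); the Rayleigh quotient climbs from the value {4/3} attained by {F \equiv 1} towards {M_2({\cal R}_2)}:

```python
# Power-iteration sketch for k = 2 (grid size and iteration count
# illustrative).  Discretise R_2 on an n x n midpoint grid and apply
#   (L_2 F)(t_1, t_2) = int_0^{1 - t_2} F(s, t_2) ds
#                     + int_0^{1 - t_1} F(t_1, s) ds,
# tracking the Rayleigh quotient <F, L_2 F> / <F, F>.
import numpy as np

n = 400
t = (np.arange(n) + 0.5) / n                 # midpoints of [0, 1]
h = 1.0 / n
inside = (t[:, None] + t[None, :]) <= 1.0    # indicator of the simplex R_2

def apply_L(F):
    col = np.cumsum(F, axis=0) * h           # running integrals in t_1
    row = np.cumsum(F, axis=1) * h           # running integrals in t_2
    j = np.arange(n)
    # on the midpoint grid, t_a + t_j <= 1 exactly when a <= n - 1 - j
    vert = col[n - 1 - j, j]                 # int_0^{1 - t_j} F(s, t_j) ds
    horiz = row[j, n - 1 - j]                # int_0^{1 - t_i} F(t_i, s) ds
    return (vert[None, :] + horiz[:, None]) * inside

F = inside.astype(float)                     # start from F = 1 on R_2
for _ in range(100):
    LF = apply_L(F)
    rayleigh = (F * LF).sum() / (F * F).sum()
    F = LF / np.sqrt((LF * LF).sum())        # normalise to avoid overflow
print(rayleigh)  # approximates M_2(R_2), up to discretisation error
```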

The basic functions {F} we have been using take the form

\displaystyle  F(t_1,\ldots,t_k) := 1_{t_1+\ldots+t_k \leq 1} \prod_{i=1}^k \frac{k^{1/2}}{m_2^{1/2}} g(k t_i)

where

\displaystyle  g(t) := 1_{[0,T]}(t) \frac{1}{1+At}

and

\displaystyle  m_1 := \int_0^T g(t)\ dt = \frac{1}{A} \log(1+AT)

\displaystyle  m_2 := \int_0^T g(t)^2\ dt = \frac{1}{A} (1 - \frac{1}{1+AT}) = \frac{T}{1+AT}.
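
(As a quick consistency check on these closed forms, here is an illustrative snippet, assuming scipy is available, that compares direct quadrature of {g} against the formulas:)

```python
# Illustrative consistency check of the closed forms for m_1 and m_2,
# comparing numerical quadrature of g(t) = 1/(1+At) on [0,T] with them.
from math import log
from scipy.integrate import quad

A, T = 2.0, 3.0
g = lambda u: 1.0 / (1.0 + A * u)
print(quad(g, 0, T)[0], log(1 + A * T) / A)                  # m_1 two ways
print(quad(lambda u: g(u) ** 2, 0, T)[0], T / (1 + A * T))   # m_2 two ways
```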

Then {\int_{{\cal R}_k} F^2 \leq 1}, and

\displaystyle  M_k \geq \int F {\cal L} F \geq \frac{m_1^2}{m_2} {\bf P}( X_1 + \ldots + X_{k-1} \leq k - T )

where {X_1,\ldots,X_{k-1}} are iid random variables on {[0,T]} with density {\frac{1}{m_2} g(t)^2}. By Chebyshev’s inequality we then have

\displaystyle  M_k \geq \frac{m_1^2}{m_2} ( 1 - \frac{(k-1)\sigma^2}{(k-T-(k-1)\mu)^2} )

if {k-T-(k-1)\mu>0}, where

\displaystyle  \mu := \frac{1}{m_2} \int_0^T t g(t)^2\ dt

\displaystyle  = \frac{1}{m_2} \frac{1}{A^2}( \log(1+AT) - 1 + \frac{1}{1+AT} )

and

\displaystyle  \sigma^2 := \frac{1}{m_2} \int_0^T t^2 g(t)^2\ dt - \mu^2

\displaystyle  = \frac{1}{m_2} (\frac{T}{A^2} + \frac{T}{A^2 (1+AT)} - \frac{2\log(1+AT)}{A^3}) - \mu^2.

A lengthier computation for {\int F {\cal L}^2 F} gives

\displaystyle  M_k^2 \geq (1-\frac{1}{k}) \frac{m_1^4}{m_2^2} (1 - \frac{(k-2)\sigma^2}{(k-2T-(k-2)\mu)^2})

\displaystyle  + \frac{1}{k} \frac{m_1^2}{m_2} (k-T-(k-1)\mu)

assuming {k-2T-(k-2)\mu > 0}.
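
These closed forms are straightforward to evaluate numerically; the following Python sketch (illustrative only, with a crude grid search over {A} and {T} on arbitrary ranges) computes the Chebyshev lower bound and the displayed second-moment refinement:

```python
# Illustrative evaluation of the two lower bounds above, using the closed
# forms for m_1, m_2, mu, sigma^2 with g(t) = 1_{[0,T]}(t) / (1 + A t).
import math

def moments(A, T):
    L = math.log(1 + A * T)
    m1 = L / A
    m2 = T / (1 + A * T)
    mu = (L - 1 + 1 / (1 + A * T)) / (A * A * m2)
    sigma2 = (T / A**2 + T / (A**2 * (1 + A * T)) - 2 * L / A**3) / m2 - mu**2
    return m1, m2, mu, sigma2

def chebyshev_bound(k, A, T):
    m1, m2, mu, sigma2 = moments(A, T)
    slack = k - T - (k - 1) * mu
    if slack <= 0:
        return 0.0
    return (m1 * m1 / m2) * (1 - (k - 1) * sigma2 / slack**2)

def second_moment_bound(k, A, T):
    # evaluates the displayed bound on M_k^2 and returns its square root
    m1, m2, mu, sigma2 = moments(A, T)
    slack2 = k - 2 * T - (k - 2) * mu
    if slack2 <= 0:
        return 0.0
    val = (1 - 1 / k) * (m1**4 / m2**2) * (1 - (k - 2) * sigma2 / slack2**2) \
        + (1 / k) * (m1**2 / m2) * (k - T - (k - 1) * mu)
    return math.sqrt(max(val, 0.0))

k = 5000
grid = [(0.1 * a, 0.5 * b) for a in range(1, 80) for b in range(1, 40)]
print(max(chebyshev_bound(k, A, T) for A, T in grid))
print(max(second_moment_bound(k, A, T) for A, T in grid))
```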