Polymath8b: Bounded intervals with many primes, after Maynard

19 November, 2013 in math.CA, math.NT, polymath | Tags: James Maynard, polymath8 | by Terence Tao

For each natural number ${m}$ , let ${H_m}$ denote the quantity

$\displaystyle H_m := \liminf_{n \rightarrow\infty} (p_{n+m} - p_n),$

where ${p_n}$ denotes the ${n\textsuperscript{th}}$ prime. In other words, ${H_m}$ is the least quantity such that there are infinitely many intervals of length ${H_m}$ that contain ${m+1}$ or more primes. Thus, for instance, the twin prime conjecture is equivalent to the assertion that ${H_1 = 2}$ , and the prime tuples conjecture would imply that ${H_m}$ is equal to the diameter of the narrowest admissible tuple of cardinality ${m+1}$ (thus we conjecturally have ${H_1 = 2}$ , ${H_2 = 6}$ , ${H_3 = 8}$ , ${H_4 = 12}$ , ${H_5 = 16}$ , and so forth; see this web page for further continuation of this sequence).

In 2004, Goldston, Pintz, and Yildirim established the bound ${H_1 \leq 16}$ conditional on the Elliott-Halberstam conjecture, which remains unproven. However, no unconditional finiteness of ${H_1}$ was obtained (although they famously obtained the non-trivial bound ${p_{n+1}-p_n = o(\log p_n)}$ ), and even on the Elliot-Halberstam conjecture no finiteness result on the higher ${H_m}$ was obtained either (although they were able to show ${p_{n+2}-p_n=o(\log p_n)}$ on this conjecture). In the recent breakthrough of Zhang, the unconditional bound ${H_1 \leq 70,000,000}$ was obtained, by establishing a weak partial version of the Elliott-Halberstam conjecture; by refining these methods, the Polymath8 project (which I suppose we could retroactively call the Polymath8a project) then lowered this bound to ${H_1 \leq 4,680}$ .

With the very recent preprint of James Maynard, we have the following further substantial improvements:

Theorem 1 (Maynard’s theorem) Unconditionally, we have the following bounds:

${H_1 \leq 600}$ .

${H_m \leq C m^3 e^{4m}}$ for an absolute constant ${C}$ and any ${m \geq 1}$ .

If one assumes the Elliott-Halberstam conjecture, we have the following improved bounds:

${H_1 \leq 12}$ .

${H_2 \leq 600}$ .

${H_m \leq C m^3 e^{2m}}$ for an absolute constant ${C}$ and any ${m \geq 1}$ .

The final conclusion ${H_m \leq C m^3 e^{2m}}$ on Elliott-Halberstam is not explicitly stated in Maynard’s paper, but follows easily from his methods, as I will describe below the fold. (At around the same time as Maynard’s work, I had also begun a similar set of calculations concerning ${H_m}$ , but was only able to obtain the slightly weaker bound ${H_m \leq C \exp( C m )}$ unconditionally.) In the converse direction, the prime tuples conjecture implies that ${H_m}$ should be comparable to ${m \log m}$ . Granville has also obtained the slightly weaker explicit bound ${H_m \leq e^{8m+5}}$ for any ${m \geq 1}$ by a slight modification of Maynard’s argument.

The arguments of Maynard avoid using the difficult partial results on (weakened forms of) the Elliott-Halberstam conjecture that were established by Zhang and then refined by Polymath8; instead, the main input is the classical Bombieri-Vinogradov theorem, combined with a sieve that is closer in spirit to an older sieve of Goldston and Yildirim, than to the sieve used later by Goldston, Pintz, and Yildirim on which almost all subsequent work is based.

The aim of the Polymath8b project is to obtain improved bounds on ${H_1, H_2}$ , and higher values of ${H_m}$ , either conditional on the Elliott-Halberstam conjecture or unconditional. The likeliest routes for doing this are by optimising Maynard’s arguments and/or combining them with some of the results from the Polymath8a project. This post is intended to be the first research thread for that purpose. To start the ball rolling, I am going to give below a presentation of Maynard’s results, with some minor technical differences (most significantly, I am using the Goldston-Pintz-Yildirim variant of the Selberg sieve, rather than the traditional “elementary Selberg sieve” that is used by Maynard (and also in the Polymath8 project), although it seems that the numerology obtained by both sieves is essentially the same). An alternate exposition of Maynard’s work has just been completed also by Andrew Granville.

— 1. Overview of argument —

Define an admissible ${k_0}$ -tuple to be an increasing tuple ${{\cal H} = (h_1,\ldots,h_{k_0})}$ of integers, which avoids at least one residue class modulo ${p}$ for each ${p}$ . For ${1 \leq j_0 \leq k_0}$ , let ${DHL[k_0,j_0]}$ denote the following claim:

Conjecture 2 ( ${DHL[k_0,j_0]}$ ) If ${{\cal H}}$ is an admissible ${k_0}$ -tuple, then there are infinitely many translates ${n + {\cal H}}$ of ${{\cal H}}$ that contain at least ${j_0}$ primes.

The prime tuples conjecture is then the assertion that ${DHL[k_0,j_0]}$ holds for all ${k_0,j_0}$ . Clearly, if ${DHL[k_0,m+1]}$ is true, then we have ${H_m \leq h_{k_0}-h_1}$ whenever ${(h_1,\ldots,h_{k_0})}$ is an admissible ${k_0}$ -tuple. Theorem 1 then follows from the following claim:

Theorem 3 (Maynard’s theorem, DHL version) Unconditionally, we have the following bounds:

${DHL[105,2]}$ .

${DHL[k_0,m+1]}$ whenever ${k_0}$ is sufficiently large and ${4m < \log k_0 - 2 \log\log k_0 - 2}$ .

If one assumes the Elliott-Halberstam conjecture, we have the following improved bounds:

${DHL[5,2]}$ .

${DHL[105,3]}$ .

${DHL[k_0,m+1]}$ whenever ${k_0}$ is sufficiently large and ${2m < \log k_0 - 2 \log\log k_0 - 2}$ .

Indeed, the ${m=1,2}$ results then follow from using the admissible ${5}$ -tuple ${(0,2,6,8,12)}$ and the admissible ${105}$ -tuple

$\displaystyle (0, 10, 12, 24, 28, 30, 34, 42, 48, 52, 54, 64, 70, 72, 78, 82, 90, 94, 100,$

$\displaystyle 112, 114, 118, 120, 124, 132, 138, 148, 154, 168, 174, 178, 180, 184, 190,$

$\displaystyle 192, 202, 204, 208, 220, 222, 232, 234, 250, 252, 258, 262, 264, 268, 280,$

$\displaystyle 288, 294, 300, 310, 322, 324, 328, 330, 334, 342, 352, 358, 360, 364, 372,$

$\displaystyle 378, 384, 390, 394, 400, 402, 408, 412, 418, 420, 430, 432, 442, 444, 450,$

$\displaystyle 454, 462, 468, 472, 478, 484, 490, 492, 498, 504, 510, 528, 532, 534, 538,$

$\displaystyle 544, 558, 562, 570, 574, 580, 582, 588, 594, 598, 600)$

found by Engelsma (and recorded on this site). For the larger ${m}$ results, note that the bound ${4m < \log k_0 - 2 \log\log k_0 - 2}$ is obeyed if ${k_0 \geq C m^2 e^{4m}}$ for a sufficiently large ${C}$ and ${m}$ is large enough, and the claim follows by using the observation that one can create an admissible ${k_0}$ -tuple of length ${O( k_0 \log k_0)}$ by using the first ${k_0}$ primes past ${k_0}$ ; similarly if one assumes the Elliott-Halberstam conjecture. (Note as the ${H_m}$ are clearly non-decreasing in ${m}$ , it suffices to work with sufficiently large ${m}$ to obtain bounds such as ${H_m \leq C m^3 e^{4m}}$ .)

As in previous work, the ${DHL[k_0,m+1]}$ conclusions are obtained by constructing a sieve weight with good properties. We use the same asymptotic notation as in the Polymath8a project, thus all quantities depend on an asymptotic parameter ${x}$ unless explicitly declared to be fixed, and asymptotic notation such ${O()}$ , ${o()}$ or ${\ll}$ is relative to this parameter. We let

$\displaystyle w := \lfloor \log\log\log x \rfloor$

and ${W := \prod_{p < w} p}$ as before. We let ${\theta(n)}$ be the quantity ${\log n}$ when ${n}$ is prime, and zero otherwise.

Lemma 4 (Criterion for ${DHL}$ ) Let ${k_0 \geq 2}$ and ${m \geq 1}$ be fixed integers. Suppose that for each fixed admissible ${k_0}$ -tuple ${{\cal H}}$ and each congruence class ${b\ (W)}$ such that ${b+h}$ is coprime to ${W}$ for all ${h \in {\cal H}}$ , one can find a non-negative weight function ${\nu \colon {\bf N} \rightarrow {\bf R}^+}$ , fixed quantities ${\alpha,\beta > 0}$ , a quantity ${B>0}$ , and a quantity ${R>0}$ such that one has the upper bound

$\displaystyle \sum_{x \leq n \leq 2x: n = b\ (W)} \nu(n) \leq (\alpha+o(1)) B\frac{x}{W}, \ \ \ \ \ (1)$

the lower bound

$\displaystyle \sum_{x \leq n \leq 2x: n = b\ (W)} \nu(n) \theta(n+h_i) \geq (\beta-o(1)) B\frac{x}{W} \log R \ \ \ \ \ (2)$

for all ${h_i \in {\cal H}}$ , and the key inequality

$\displaystyle \frac{\log R}{\log x} > \frac{m}{k_0} \frac{\alpha}{\beta}. \ \ \ \ \ (3)$

Then ${DHL[k_0,m+1]}$ holds.

The ${m=1}$ case of this lemma is Lemma 4.1 of the polymath8a paper. The general ${m}$ case is proven by an essentially identical argument, namely one considers the expression

$\displaystyle \sum_{x \leq n \leq 2x: n = b\ (W)} \nu(n) (\sum_{i=1}^{k_0} \theta(n+h_i) - m \log 3x ),$

uses the hypotheses (1), (2), (3) to show that this is positive for sufficiently large ${m}$ , and observing that the summand is only positive when ${n+h_1,\ldots,n+h_{k_0}}$ contain at least ${m+1}$ primes.

We recall the statement of the Elliott-Halberstam conjecture, for a given choice of parameter ${0 < \theta < 1}$ :

If ${Q \lessapprox x^\theta}$ and ${A \geq 1}$ is fixed, then

$\displaystyle \sum_{q \leq Q} \sup_{a \in ({\bf Z}/q{\bf Z})^\times} |\Delta(\Lambda 1_{[x,2x]}; a\ (q))| \ll x \log^{-A} x \ \ \ \ \ (4)$

where

$\displaystyle \Delta(\alpha; a\ (q)) := \sum_{n = a\ (q)} \alpha(n) - \frac{1}{\phi(q)} \sum_{(n,q)=1} \alpha(n).$

The Bombieri-Vinogradov theorem asserts that ${EH[\theta]}$ holds for all ${0 <\theta < 1/2}$ , while the Elliott-Halberstam conjecture asserts that ${EH[\theta]}$ holds for all ${0 < \theta < 1}$ .

In Polymath8a, the sieve weight ${\nu}$ was constructed in terms of a smooth compactly supported one-variable function ${F: [0,+\infty) \rightarrow {\bf R}}$ . A key innovation in Maynard’s work is to replace the sieve with one constructed using a smooth compactly supported multi-variable function ${f: [0,+\infty)^{k_0} \rightarrow {\bf R}}$ , which affords significantly greater flexibility. More precisely, we will show

Proposition 5 (Sieve asymptotics) Suppose that ${EH[\theta]}$ holds for some fixed ${0 < \theta < 1}$ , and set ${R = x^{c/2}}$ for some fixed ${0 < c < \theta}$ . Let ${f: [0,+\infty)^{k_0} \rightarrow {\bf R}}$ is a fixed symmetric smooth function supported on the simplex

$\displaystyle \Delta_{k_0} := \{ (t_1,\ldots,t_{k_0}) \in [0,+\infty)^{k_0}: t_1+\ldots+t_{k_0} \leq 1 \}.$

Then one can find ${\nu}$ obeying the bounds (1), (2) with

$\displaystyle B := (\frac{W}{\phi(W)})^{k_0} \frac{1}{\log^{k_0} R}$

$\displaystyle \alpha := \int_{\Delta_{k_0}} f_{1,\ldots,k_0}(t_1,\ldots,t_{k_0})^2\ dt_1 \ldots dt_{k_0} \ \ \ \ \ (5)$

$\displaystyle \beta := \int_{\Delta_{k_0-1}} f_{1,\ldots,k_0-1}(t_1,\ldots,t_{k_0-1},0)^2\ dt_1 \ldots dt_{k_0-1} \ \ \ \ \ (6)$

where we use the shorthand

$\displaystyle f_{i_1,\ldots,i_j}(t_1,\ldots,t_n) := \frac{\partial^j}{\partial t_{i_1} \ldots \partial t_{i_j}} f(t_1,\ldots,t_n)$

for the mixed partial derivatives of ${f}$ .

(In fact, one can obtain asymptotics for (1), (2), rather than upper and lower bounds.)

(One can work with non-symmetric functions ${f}$ , but this does not improve the numerology; see the remark after (7.1) of Maynard’s paper.)

We prove this proposition in Section 2. We remark that if one restricts attention to functions ${f}$ of the form

$\displaystyle f(t_1,\ldots,t_k) = F(t_1+\ldots+t_k)$

for ${F: [0,+\infty) \rightarrow {\bf R}}$ smooth and supported on ${[0,1]}$ , then

$\displaystyle \alpha = \int_0^1 f^{(k_0)}(t)^2 \frac{t^{k_0-1}}{(k_0-1)!}\ dt$

and

$\displaystyle \beta = \int_0^1 f^{(k_0-1)}(t)^2 \frac{t^{k_0-2}}{(k_0-2)!}\ dt$

and this claim was already essentially established back in this Polymath8a post (or see Proposition 4.1 and Lemma 4.7 of the Polymath8a paper for essentially these bounds). In that previous post (and also in the paper of Farkas, Pintz, and Revesz), the ratio ${\alpha/\beta}$ was optimised in this one-dimensional context using Bessel functions, and the method was unable to reach ${m=1}$ without an improvement to Bombieri-Vinogradov, or to reach ${m=2}$ even on Elliott-Halberstam. However, the additional flexibility afforded by the use of multi-dimensional cutoffs allows one to do better.

Combining Proposition 5 with Lemma 4, we obtain the following conclusion. For each ${k_0}$ , let ${M_{k_0}}$ be the quantity

$\displaystyle M_{k_0} := \sup_f k_0 \frac{ \int_{\Delta_{k_0-1}} f_{1,\ldots,k_0-1}(t_1,\ldots,t_{k_0-1},0)^2\ dt_1 \ldots dt_{k_0-1} }{ \int_{\Delta_{k_0}} f_{1,\ldots,k_0}(t_1,\ldots,t_{k_0})^2\ dt_1 \ldots dt_{k_0} }$

where ${f}$ ranges over all smooth symmetric functions ${f: [0,+\infty)^{k_0} \rightarrow {\bf R}}$ that are supported on the simplex ${\Delta_{k_0}}$ . Equivalently, by substituting ${F := f_{1,\ldots,k_0}}$ and using the fundamental theorem of calculus, followed by an approximation argument to remove the smoothness hypotheses on ${F}$ , we have

$\displaystyle M_{k_0} := \sup_F k_0 \frac{ \int_{\Delta_{k_0-1}} (\int_0^\infty F(t_1,\ldots,t_{k_0-1},t_{k_0})\ dt_{k_0})^2\ dt_1 \ldots dt_{k_0-1} }{ \int_{\Delta_{k_0}} F(t_1,\ldots,t_{k_0})^2\ dt_1 \ldots dt_{k_0} } \ \ \ \ \ (7)$

where ${F}$ ranges over all bounded measurable functions supported on ${\Delta_{k_0}}$ . Then we have

Corollary 6 Let ${0 < \theta<1}$ be such that ${EH[\theta]}$ holds, and let ${k_0 \geq 2}$ , ${m \geq 1}$ be integers such that

$\displaystyle M_{k_0} > \frac{2m}{\theta}.$

Then ${DHL[k_0,m+1]}$ holds.

To use this corollary, we simply have to locate test functions ${f}$ that give as large a lower bound for ${M_{k_0}}$ as one can manage; this is a purely analytic problem that no longer requires any further number-theoretic input.

In particular, Theorem 3 follows from the following lower bounds:

Proposition 7

${M_5 > 2}$ .

${M_{105} > 4}$ .

If ${k_0}$ is sufficiently large, then
$\displaystyle M_{k_0} > \log k_0 - 2 \log\log k_0 - 2. \ \ \ \ \ (8)$

The first two cases of this proposition are obtained numerically (see Section 7 of Maynard’s paper), by working with functions ${F}$ that of the special form

$\displaystyle F = 1_{\Delta_{k_0}} \sum_{i=1}^d a_i (1-P_1)^{b_i} P_2^{c_i}$

for various real coefficients ${a_i}$ and non-negative integers ${b_i,c_i}$ , where

$\displaystyle P_1(t_1,\ldots,t_{k_0}) := t_1+\ldots+t_{k_0}$

and

$\displaystyle P_2(t_1,\ldots,t_{k_0}) := t_1^2+\ldots+t_{k_0}^2.$

In Maynard’s paper, the ratio

$\displaystyle k_0 \frac{ \int_{\Delta_{k_0-1}} (\int_0^\infty F(t_1,\ldots,t_{k_0-1},t_{k_0})\ dt_{k_0})^2\ dt_1 \ldots dt_{k_0-1} }{ \int_{\Delta_{k_0}} F(t_1,\ldots,t_{k_0})^2\ dt_1 \ldots dt_{k_0} }$

in this case is computed to be

$\displaystyle \frac{a^T M_2 a}{a^T M_1 a}$

where ${a}$ is the ${d \times 1}$ matrix with entries ${a_1,\ldots,a_d}$ , ${M_1}$ is the ${d \times d}$ matrix with ${ij}$ entry equal to

$\displaystyle \frac{(b_i+b_j)! G_{c_i+c_j,2}(k_0)}{(k_0+b_i+b_j+2c_i+2c_j)!}$

where

$\displaystyle G_{b,j}(x) := b! \sum_{r=1}^b \binom{x}{r} \sum_{b_1,\ldots,b_r \geq 1: \sum_{i=1}^r b_i = b} \prod_{i=1}^r \frac{(jb_i)!}{b_i!}$

and ${M_2}$ is the ${d \times d}$ matrix with ${ij}$ entry equal to

$\displaystyle k_0 \sum_{c'_1=0}^{c_1} \sum_{c'_2=0}^{c_2} \binom{c_1}{c'_1} \binom{c_2}{c'_2} \frac{\gamma_{b_i,b_j,c_i,c_j,c'_1,c'_2} G_{c'_1+c'_2,2}(k_0-1)}{(k_0+b_i+b_j+2c_i+2c_j+1)!}$

where ${\gamma_{b_i,b_j,c_i,c_j,c'_1,c'_2}}$ is the quantity

$\displaystyle \frac{b_i! b_j! (2c_i-2c'_1)! (2c_j-2c'_2)! (b_i+b_j+2c_i+2c_j-2c'_i-2c'_j+2)!}{(b_i+2c_i-2c'_1+1)! (b_j + 2c_j-2c'_2+1)!}.$

One then optimises the ratio ${\frac{a^T M_2 a}{a^T M_1 a}}$ by linear programming methods (a similar idea appears in the original paper of Goldston, Pintz, and Yildirim) to obtain a lower bound for ${M_{k_0}}$ for ${k_0=5}$ and ${k_0=105}$ .

The final case is established in a different manner; we give a proof of the slightly weaker bound

$\displaystyle M_{k_0} > \log k_0 - 4 \log\log k_0 - O(1) \ \ \ \ \ (9)$

in Section 3.

— 2. Sieve asymptotics —

We now prove Proposition 5. We use a Fourier-analytic method, similar to that in this previous blog post.

The sieve we will use is of the form

$\displaystyle \nu(n) := (\sum_{d_1,\ldots,d_{k_0} \in {\cal S}: d_i|n+h_i \hbox{ for all } i=1,\ldots,k_0} \ \ \ \ \ (10)$

$\displaystyle (\prod_{i=1}^{k_0} \mu(d_i)) f( \frac{\log d_1}{\log R}, \ldots, \frac{\log d_{k_0} }{\log R} ))^2,$

where ${{\cal S}}$ denotes the square-free integers and ${\mu}$ is the Möbius function This should be compared with the sieve

$\displaystyle \nu(n) := (\sum_{d \in {\cal S}: d|\prod_{i=1}^{k_0}(n+h_i)} \mu(d) F( \frac{\log d}{\log R}))^2$

used in the previous blog post, which basically corresponds to the special case ${f(t_1,\ldots,t_{k_0}) =F(t_1+\ldots+t_{k_0})}$ .

We begin with (1). Using (10), we may rearrange the left-hand side of (1) as

$\displaystyle \sum_{d_1,\ldots,d_{k_0},d'_1,\ldots,d'_{k_0}\in {\cal S}} (\prod_{j=1}^{k_0} \mu(d_j) \mu(d'_j))$

$\displaystyle f( \frac{\log d_1}{\log R}, \ldots, \frac{\log d_{k_0} }{\log R} ) f( \frac{\log d'_1}{\log R}, \ldots, \frac{\log d'_{k_0} }{\log R} )$

$\displaystyle \sum_{x \leq n \leq 2x: n = b\ (W); [d_j,d'_j] | n+h_j \hbox{ for all } j=1,\ldots,k_0} 1.$

Observe that as the numbers ${n+h_1,\ldots,n+h_{k_0}}$ have no common factor when ${n = b\ (W)}$ , the inner sum vanishes unless the quantities ${[d_1,d'_1],\ldots,[d_{k_0},d'_{k_0}]}$ are coprime, in which case this inner sum can be estimated as

$\displaystyle \frac{x}{W [d_1,d'_1] \ldots [d_{k_0},d'_{k_0}]}+O(1).$

Also, at least one of the two products involving ${f}$ will vanish unless one has

$\displaystyle d_1 \ldots d_{k_0}, d'_1,\ldots,d'_{k_0} \leq R.$

Let us first deal with the contribution of the error term ${O(1)}$ to (1). This contribution may be bounded by

$\displaystyle O( \sum_{d_1,\ldots,d_{k_0},d'_1,\ldots,d'_{k_0}: d_1 \ldots d_{k_0}, d'_1,\ldots,d'_{k_0} \leq R} 1 ),$

which sums to ${\ll R^2 \log^{O(1)} R}$ ; since ${R = x^{c/2}}$ and ${c < 1/2 < 1}$ , we conclude that this contribution is negligible.

To conclude the proof of (1), it thus suffices to show that

$\displaystyle \sum_{d_1,\ldots,d_{k_0},d'_1,\ldots,d'_{k_0}\in {\cal S}: [d_1,d'_1],\ldots,[d_{k_0},d'_{k_0}] \hbox{ coprime}} (\prod_{j=1}^{k_0} \mu(d_j) \mu(d'_j)) \ \ \ \ \ (11)$

$\displaystyle \frac{f( \frac{\log d_1}{\log R}, \ldots, \frac{\log d_{k_0} }{\log R} ) f( \frac{\log d'_1}{\log R}, \ldots, \frac{\log d'_{k_0} }{\log R} )}{[d_1,d'_1] \ldots [d_k,d'_k]}$

$\displaystyle = (\alpha+o(1)) (\frac{W}{\phi(W)})^{k_0} \frac{1}{\log^{k_0} R}.$

Next, we smoothly extend the function ${f: [0,+\infty)^{k_0} \rightarrow {\bf R}}$ to a smooth compactly supported function ${f: {\bf R}^{k_0}\rightarrow {\bf R}}$ , which by abuse of notation we will continue to refer to as ${f}$ . By Fourier inversion, we may then express ${f}$ in the form

$\displaystyle f(t_1,\ldots,t_{k_0}) = \int_{{\bf R}^{k_0}} \eta(\vec s) e^{-\sum_{j=1}^{k_0} (1+is_j)t_j}\ d \vec s \ \ \ \ \ (12)$

where ${\vec s := (s_1,\ldots,s_{k_0})}$ and where ${\eta: {\bf R}^k \rightarrow {\bf C}}$ is a smooth function obeying the rapid decay bounds

$\displaystyle |\eta(\vec s)| \ll (1+|\vec s|)^{-A} \ \ \ \ \ (13)$

for any fixed ${A>0}$ . The left-hand side of (11) may then be rewritten as

$\displaystyle \int_{{\bf R}^{k_0}} \int_{{\bf R}^{k_0}} \eta(\vec s) \eta(\vec s') H(\vec s, \vec s')\ d\vec s d\vec s' \ \ \ \ \ (14)$

where ${\vec s' := (s'_1,\ldots,s'_{k_0})}$ and

$\displaystyle H(\vec s, \vec s') := \sum_{d_1,\ldots,d_{k_0},d'_1,\ldots,d'_{k_0}\in {\cal S}: [d_1,d'_1],\ldots,[d_{k_0},d'_{k_0}] \hbox{ coprime}} (\prod_{j=1}^{k_0} \mu(d_j) \mu(d'_j))$

$\displaystyle \frac{\prod_{j=1}^{k_0} d_j^{-(1+is_j)/\log R} (d'_j)^{-(1+is'_j)/\log R}}{[d_1,d'_1] \ldots [d_k,d'_k]}.$

We may factorise ${H(\vec s,\vec s')}$ as an Euler product

$\displaystyle H(\vec s,\vec s') = \prod_{p>w}$

$\displaystyle (1 - \sum_{j=1}^{k_0} p^{-1-(1+is_j)/\log R} + p^{-1-(1+is'_j)/\log R} - p^{-1-(1+is_j)/\log R - (1+is'_j)/\log R} ).$

In particular, we have the crude bound

$\displaystyle |H(\vec s,\vec s')| \leq \prod_{p>w} (1 +3p^{-1/\log R}) \ll \log^{O(1)} R;$

combining this with (13) we see that the contribution to (14) in which ${|\vec s| \geq \sqrt{\log R}}$ or ${|\vec s'| \geq \sqrt{\log R}}$ is negligible, so we may restrict the integral in (14) to the region ${|\vec s|,|\vec s'| \leq \sqrt{\log R}}$ . In this region, we have the Euler product approximations

$\displaystyle \prod_{p>w} (1 - p^{-1-(1+is_j)/\log R}) = \zeta(1+\frac{1+is_j}{\log R})^{-1} \prod_{p \leq w} (1 - p^{-1-(1+is_j)/\log R})^{-1}$

$\displaystyle = (1+o(1)) (\frac{1+is_j}{\log R}) \prod_{p \leq w} (1-p^{-1})^{-1}$

$\displaystyle = (1+o(1)) \frac{W}{\phi(W)} (\frac{1+is_j}{\log R})$

where we have used the bound ${W = O(\log\log x)}$ and the asymptotic ${\zeta(s) = (1+o(1)) (s-1)^{-1}}$ for ${s = 1 + O(1/\log R)}$ . Since also ${\sum_{p>w} p^{-2} = o(1)}$ , we conclude that

$\displaystyle F(\vec s,\vec s') = (1+o(1)) \prod_{j=1}^{k_0} \prod_{p>w} \frac{ (1-p^{-1-(1+is_j)/\log R}) (1-p^{-1-(1+is'_j)/\log R})}{1-p^{-1-(1+is_j)/\log R- (1+is'_j)/\log R}}$

$\displaystyle = (1+o(1)) (\frac{W}{\phi(W)})^{k_0} \frac{1}{\log^{k_0} R} \prod_{j=1}^{k_0} \frac{ (1+is_j) (1+is'_j)}{1+is_j+1+is'_j}.$

Using (13) again to dispose of the ${o(1)}$ error term, and then using (13) once more to remove the restriction to ${|\vec s|,|\vec s'| \leq \sqrt{\log R}}$ , we thus reduce to verifying the identity

$\displaystyle \int_{{\bf R}^{k_0}} \int_{{\bf R}^{k_0}} \eta(\vec s) \eta(\vec s') \prod_{j=1}^{k_0} \frac{ (1+is_j) (1+is'_j)}{1+is_j+1+is'_j} \ d\vec s d\vec s' = \alpha.$

But from repeatedly differentiating (12) under the integral sign, one has

$\displaystyle f_{1,\ldots,k_0}(t_1,\ldots,t_{k_0}) = (-1)^{k_0} \int_{{\bf R}^{k_0}} \eta(\vec s) e^{-\sum_{j=1}^{k_0} (1+is_j)t_j} \prod_{j=1}^{k_0} (1+is_j)\ d \vec s$

and thus

$\displaystyle f_{1,\ldots,k_0}(t_1,\ldots,t_{k_0})^2 = \int_{{\bf R}^{k_0}} \int_{{\bf R}^{k_0}} \eta(\vec s) \eta(\vec s') e^{-\sum_{j=1}^{k_0} (1+is_j +1 +is'_j)t_j}$

$\displaystyle \prod_{j=1}^{k_0} (1+is_j) (1+is'_j)\ d \vec s d\vec s';$

integrating this for ${(t_1,\ldots,t_{k_0} \in [0,+\infty)}$ using Fubini’s theorem (and (13)), the claim then follows from (5). This concludes the proof of (1).

Now we prove (2). We will just prove the claim for ${i=k_0}$ , as the other cases follow similarly using the symmetry hypothesis on ${f}$ . The left-hand side of (2) may then be expanded as

$\displaystyle \sum_{d_1,\ldots,d_{k_0},d'_1,\ldots,d'_{k_0}\in {\cal S}} (\prod_{j=1}^{k_0} \mu(d_j) \mu(d'_j))$

$\displaystyle f( \frac{\log d_1}{\log R}, \ldots, \frac{\log d_{k_0} }{\log R} ) f( \frac{\log d'_1}{\log R}, \ldots, \frac{\log d'_{k_0} }{\log R} )$

$\displaystyle \sum_{x \leq n \leq 2x: n = b\ (W); [d_j,d'_j] | n+h_j \hbox{ for all } j=1,\ldots,k_0} \theta(n+h_{k_0}).$

Observe that the summand vanishes unless ${d_{k_0}=d'_{k_0}=1}$ (note that ${n+k_0}$ is comparable to ${x}$ and thus exceeds ${R}$ ). So we may simplify the above expression to

$\displaystyle \sum_{d_1,\ldots,d_{k_0-1},d'_1,\ldots,d'_{k_0-1}\in {\cal S}} (\prod_{j=1}^{k_0-1} \mu(d_j) \mu(d'_j)) f( \frac{\log d_1}{\log R}, \ldots, \frac{\log d_{k_0-1} }{\log R}, 0 ) f( \frac{\log d'_1}{\log R}, \ldots, \frac{\log d'_{k_0-1} }{\log R}, 0 )$

$\displaystyle \sum_{x \leq n \leq 2x: n = b\ (W); [d_j,d'_j] | n+h_j \hbox{ for all } j=1,\ldots,k_0-1} \theta(n+h_{k_0}).$

As in the estimation of (1), the summand vanishes unless ${[d_1,d'_1],\ldots,[d_{k_0-1},d'_{k_0-1}]}$ are coprime, and if

$\displaystyle d_1 \ldots d_{k_0-1}, d'_1 \ldots d'_{k_0-1} \leq R.$

Let ${d_1,\ldots,d_{k_0-1},d'_1,\ldots,d'_{k_0-1} \in {\cal S}}$ obey the above constraints. For any modulus ${q}$ , define the discrepancy

$\displaystyle E(q) := \sup_{a \in ({\bf Z}/q{\bf Z})^\times} |\sum_{x \leq n \leq 2x: n+ h_{k_0} = a\ (q)} \theta(n+h_{k_0}) - \frac{x}{\phi(q)}|. \ \ \ \ \ (15)$

Since ${R = x^{c/2}}$ and ${0 < c < \theta}$ is fixed, the hypothesis ${EH[\theta]}$ implies that

$\displaystyle \sum_{q \leq WR^2} E(q) \ll x \log^{-A} x \ \ \ \ \ (16)$

for any fixed ${A>0}$ . On the other hand, the sum

$\displaystyle \sum_{x \leq n \leq 2x: n = b\ (W); [d_j,d'_j] | n+h_j \hbox{ for all } j=1,\ldots,k_0-1} \theta(n+h_{k_0})$

can, by the Chinese remainder theorem, be rewritten in the form

$\displaystyle \sum_{x \leq n \leq 2x: n+h_{k_0} = a\ (q)} \theta(n+h_{k_0})$

where

$\displaystyle q := W \prod_{j=0}^{k_0-1} [d_j,d'_j] \ \ \ \ \ (17)$

and ${a\ (q)}$ is a primitive residue class; note that ${q}$ does not exceed ${WR^2}$ . By (15), this expression can then be written as

$\displaystyle \frac{x}{\phi(W) \prod_{j=0}^{k_0-1} \phi([d_j,d'_j])} + O( E(W \prod_{j=0}^{k_0-1} [d_j,d'_j]) ).$

Let us first control the error term, which may be bounded by

$\displaystyle O( \sum_{d_1,\ldots,d_{k_0-1},d'_1,\ldots,d'_{k_0-1}: d_1 \ldots d_{k_0-1}, d'_1 \ldots d'_{k_0-1} \leq R} E(W \prod_{j=0}^{k_0-1} [d_j,d'_j]) ).$

Note that for any ${q \leq WR^2}$ , there are ${O( \tau(q)^{O(1)})}$ choices of ${d_1 \ldots d_{k_0-1}, d'_1 \ldots d'_{k_0-1}}$ for which (17) holds. Thus we may bound the previous expression by

$\displaystyle \ll \sum_{q \leq WR^2} \tau(q)^{O(1)} E(q).$

By the Cauchy-Schwarz inequality and (16), this expression may be bounded by

$\displaystyle \ll (x \log^{-A} x)^{1/2} (\sum_{q \leq WR^2} \tau(q)^{O(1)} E(q))^{1/2}$

for any fixed ${A}$ . On the other hand, we have the crude bound ${E(q) \ll \frac{x}{q} \log^{O(1)} x}$ , as well as the standard estimate

$\displaystyle \sum_{q \leq y} \frac{\tau(q)^{O(1)}}{q} \ll \log^{O(1)} y.$

(see e.g. Corollary 2.15 of Montgomery-Vaughan). Putting all this together, we conclude that the contribution of the error term to (2) is negligible. To conclude the proof of (2), it thus suffices to show that

$\displaystyle \sum_{d_1,\ldots,d_{k_0-1},d'_1,\ldots,d'_{k_0-1}\in {\cal S}} (\prod_{j=1}^{k_0-1} \mu(d_j) \mu(d'_j))$

$\displaystyle \frac{f( \frac{\log d_1}{\log R}, \ldots, \frac{\log d_{k_0-1} }{\log R}, 0 ) f( \frac{\log d'_1}{\log R}, \ldots, \frac{\log d'_{k_0-1} }{\log R}, 0 )}{\prod_{j=0}^{k_0-1} \phi([d_j,d'_j])}$

$\displaystyle = (\beta-o(1)) (\frac{W}{\phi(W)})^{k_0-1} \frac{1}{\log^{k_0-1} R}.$

But this can be proven by repeating the arguments used to prove (11) (with ${k_0}$ replaced by ${k_0-1}$ , and ${f(t_1,\ldots,t_{k_0})}$ replaced by ${f(t_1,\ldots,t_{k_0-1},0)}$ ); the presence of the Euler totient function causes some factors of ${\frac{1}{p}}$ in that analysis to be replaced by ${\frac{1}{p-1} = \frac{1}{p} + O(\frac{1}{p^2})}$ , but this turns out to have a negligible impact on the final asymptotics since ${\sum_{p >w} \frac{1}{p^2} = o(1)}$ . This concludes the proof of (2) and hence Proposition 5.

Remark 1 An inspection of the above arguments shows that the simplex ${\Delta_{k_0}}$ can be enlarged slightly to the region

$\displaystyle \Delta'_{k_0} := \{ (t_1,\ldots,t_{k_0}) \in [0,+\infty)^{k_0}: t_1+\ldots+t_{k_0} \leq 1 + t_i$

$\displaystyle \hbox{ for all } 1 \leq i \leq k_0 \},$

however this only leads to a tiny improvement in the numerology. It is interesting to note however that in the ${k_0=2}$ case, ${\Delta'_{k_0}}$ is the unit square ${[0,1]^2}$ , and by taking ${f(t_1,t_2) := (1-t_1)_+ (1-t_2)_+}$ and taking ${c}$ close to ${1}$ , one can come “within an epsilon” of establishing ${DHL[2,2]}$ (and in particular, the twin prime conjecture) from the full Elliott-Halberstam conjecture; this fact was already essentially observed by Bombieri, using the weight ${\nu(n):=\Lambda_2(n) \Lambda_2(n+2)}$ rather than the Selberg sieve. (Strictly speaking, to establish (1) in this context, one needs the Elliott-Halberstam conjecture not only for ${\Lambda}$ , but also for other arithmetic functions with a suitable Dirichlet convolution ${\alpha*\beta}$ structure; we omit the details.)

Remark 2 It appears that the ${MPZ[\varpi,\delta]}$ conjecture studied in Polymath8a can serve as a substitute for ${EH[\frac{1}{2}+2\varpi]}$ in Corollary 6, except that one also has to impose an additional constraint on the function ${F}$ (or ${f}$ ), namely that it is supported in the cube ${[0, \frac{\delta}{1/4 +\varpi}]^{k_0}}$ (in order to keep the moduli involved appropriately smooth). Perhaps as a first approximation, we should ignore the role of ${\delta}$ , and just pretend that ${MPZ[\varpi,\delta]}$ is as good as ${EH[\frac{1}{2}+2\varpi]}$ . In particular, inserting our most optimistic value of ${\varpi}$ obtained by Polymath8, namely ${\frac{13}{1080}}$ , we can in principle take ${\theta}$ as large as ${283/540 = 0.524\ldots}$ , although this only is a ${5\%}$ improvement or so over the Bombieri-Vinogradov inequality.

— 3. Large ${k_0}$ computations —

We now give a proof of (9) for sufficiently large ${k_0}$ . We use the ansatz

$\displaystyle F = 1_{\Delta_{k_0}} F_0$

where ${F_0}$ is the tensor product

$\displaystyle F_0(t_1,\ldots,t_k) = \prod_{i=1}^{k_0} k_0^{1/2} g( k_0 t_i )$

and ${g: [0,+\infty) \rightarrow {\bf R}}$ is supported on some interval ${[0,T]}$ and normalised so that

$\displaystyle \int_0^\infty g(t)^2\ dt = 1. \ \ \ \ \ (18)$

The function ${F}$ is clearly symmetric and supported on ${\Delta_{k_0}}$ . We now estimate the numerator and denominator of the ratio

that lower bounds ${M_{k_0}}$ .

For the denominator, we bound ${F}$ by ${F_0}$ and use Fubini’s theorem and (8) to obtain the upper bound

$\displaystyle \int_{\Delta_{k_0}} F(t_1,\ldots,t_{k_0})^2\ dt_1 \ldots dt_{k_0} \leq 1$

and thus

$\displaystyle M_{k_0} \geq k_0 \int_{\Delta_{k_0-1}} (\int_0^\infty F(t_1,\ldots,t_{k_0-1},t_{k_0})\ dt_{k_0})^2\ dt_1 \ldots dt_{k_0-1}.$

Now we observe that

$\displaystyle \int_0^\infty F(t_1,\ldots,t_{k_0-1},t_{k_0})\ dt_{k_0} = (\prod_{i=0}^{k_0-1} k_0^{1/2} g(k_0 t_i)) k_0^{-1/2} \int_0^\infty g(t)\ dt$

whenever ${t_1+\ldots+t_{k_0-1} \leq 1 - \frac{T}{k_0}}$ , and so we have the lower bound

$\displaystyle M_{k_0} \geq (\int_0^\infty g(t)\ dt)^2 \int_{t_1+\ldots+t_{k_0-1} \leq 1-\frac{T}{k_0}} (\prod_{i=0}^{k_0-1} k_0^{1/2} g(k_0 t_i))^2 dt_1 \ldots dt_{k_0-1}.$

We interpret this probabilistically. Let ${X_1,\ldots,X_{k_0-1}}$ be independent, identically distributed non-negative real random variables with probability density ${g(t)^2\ dt}$ ; this is well-defined thanks to (18). Observe that ${(\prod_{i=1}^{k_0-1} k_0^{1/2} g( k_0 t_i ))^2}$ is the joint probability density of ${\frac{1}{k_0}(X_1,\ldots,X_{k_0-1})}$ , and so

$\displaystyle M_{k_0} \geq (\int_0^\infty g(t)\ dt)^2 {\bf P} (X_1 + \ldots + X_{k_0-1} \leq k_0 - T).$

We will lower bound this probability using the Chebyshev inequality. (In my own calculations, I had used the Hoeffding inequality instead, but it seems to only give a slightly better bound in the end for ${H_m}$ (perhaps saving one or two powers of ${m}$ ).) In order to exploit the law of large numbers, we would like the mean ${(k_0-1) \mu}$ of ${X_1 + \ldots + X_{k_0-1}}$ , where ${\mu := \int_0^T t g(t)^2\ dt}$ , to be less than ${k_0-T}$ :

$\displaystyle (k_0-1) \mu < k_0 - T.$

The variance of ${X_1 + \ldots + X_{k_0-1}}$ is ${k_0-1}$ times the variance of a single ${X_i}$ , which we can bound (somewhat crudely) by

$\displaystyle \hbox{Var}(X_i) \leq {\bf E} X_i^2 \leq T {\bf E} X_i = T \mu.$

Thus by Chebyshev’s inequality, we have

$\displaystyle {\bf P} (X_1 + \ldots + X_{k_0-1} \leq k_0 - T) \geq 1 - \frac{(k_0-1) T \mu}{(k_0-T-(k_0-1)\mu)^2}.$

To clean things up a bit we bound ${k_0-1}$ by ${k_0}$ to obtain the simpler bound

$\displaystyle {\bf P} (X_1 + \ldots + X_{k_0-1} \leq k_0 - T) \geq 1 - \frac{k_0 T \mu}{(k_0-T-k_0\mu)^2}$

assuming now that ${k_0 \mu < k_0 - T}$ . In particular, ${\mu \leq 1}$ , so we have the cleaner bound

$\displaystyle {\bf P} (X_1 + \ldots + X_{k_0-1} \leq k_0 - T) \geq 1 - \frac{T}{k_0 (1-T/k_0-\mu)^2}.$

To summarise, we have shown that

$\displaystyle M_{k_0} \geq (\int_0^T g(t)\ dt)^2 ( 1 - \frac{T}{k_0 (1-T/k_0-\mu)^2} ) \ \ \ \ \ (19)$

whenever ${g: [0,T] \rightarrow {\bf R}}$ is such that

$\displaystyle \int_0^T g(t)^2\ dt = 1 \ \ \ \ \ (20)$

and

$\displaystyle \mu = \int_0^T t g(t)^2\ dt < 1 - \frac{T}{k_0}.$

One can optimise this carefully to give (8) (as is done in Maynard’s paper), but for the purposes of establishing the slightly weaker bound (9), we can use ${g}$ of the form

$\displaystyle g(t) = \frac{c}{1+At}$

with ${A := \log k_0}$ and ${T := k_0 \log^{-3} k_0}$ . With the normalisation (20) we have

$\displaystyle c^2 = \frac{1+AT}{T} = \log k_0 + O(1)$

$\displaystyle c = \log^{1/2} k_0 + O(\log^{-1/2} k_0)$

and

$\displaystyle \mu = \frac{c^2}{A^2} (\log(1+AT) -1 - \frac{1}{1+AT})$

$\displaystyle = 1 - \frac{2 \log\log k_0}{\log k_0} + O( \frac{1}{\log k_0} )$

and thus

$\displaystyle \frac{T}{k_0 (1-T/k_0-\mu)^2} = O( \frac{1}{\log k_0} )$

while

$\displaystyle \int_0^T g(t)= \frac{c}{A} \log(1+AT)$

$\displaystyle = \log^{1/2} k_0 - 2 \log^{-1/2} k_0 \log\log k_0 + O( \log^{-1/2} k_0 )$

which gives the claim.

161 comments

Comments feed for this article

19 November, 2013 at 11:04 pm

Terence Tao

There are a couple of initial directions to proceed from here, I think. One is to get better bounds on $M_{k_0}$ ; as pointed out by Eytan in the previous thread; Maynard is working with functions of the first two symmetric polynomials $P_1 = t_1+\ldots+t_{k_0}$ and $P_2 = t_1^2+\ldots+t_{k_0}^2$ , but in principle all the symmetric functions are available. (Though perhaps before doing that we should at least get to the point where we can replicate Maynard’s computations; note that he has supplied a Mathematica notebook in the source code of his arXiv file, although it is slightly nontrivial to extract it from the arXiv interface.) Another is to use MPZ instead of Bombieri-Vinogradov, but this puts an additional constraint on the test function F that has to be accounted for, and is likely to have a modest impact on $k_0$ (improving this quantity by about 5% or so). A third thing to do is to try to enlarge the allowable support of the function F, as per Remark 1; this is likely to be a very modest gain (basically it should shift $k_0$ downward by 1 or so), but this could be significant for bounding $H_1$ on EH, as it may shift $k_0$ down from 5 to 4 (bringing the $H_1$ bound from 12 to 8).

Indeed, it seems that focusing on the $k_0=4$ case (assuming EH) may well be the best place to start, as the numerics should be fast and one can work a certain number of things out by hand in such low-dimensional cases.

20 November, 2013 at 3:51 am

Andrew Sutherland

With Maynard’s permission, I have posted a copy of his Mathematica notebook source as a text file here.

20 November, 2013 at 6:19 am

Anonymous

Any chance I can make you (on anyone else) convert the text document into a notebook? I’m totally new to Mathematica and would like to play around with a fully working notebook. (I’m using v9.0 in case it is important.) [Sorry for interrupting the thread.]

20 November, 2013 at 7:51 am

Andrew Sutherland

You should be able to just save the file with the extension .nb and mathematica will be happy to load it as a notebook.

20 November, 2013 at 8:09 am

Anonymous

Thank you, Andrew.

@Terry: Just remove this post and my previous question if you want.

20 November, 2013 at 10:16 am

James Maynard

I don’t think you can get to $k_0=4$ (under EH) using just Remark 1 and higher symmetric polynomials (this seems to get only about half way numerically). That said, there are various possible small modifications (such as those you mention below) which might give enough of an improvement.

20 November, 2013 at 11:36 am

Vít Tuček

Did you try nonpolynomial symmetric functions? There is some work done on symmetric orthogonal polynomials on the standard simplex. See e.g. the section 3.3 of https://www.sciencedirect.com/science/article/pii/S0021904504002199 and https://www.sciencedirect.com/science/article/pii/S0377042700005045

20 November, 2013 at 1:03 pm

Eytan Paldi

A related idea is to use the transformation $t_j = x_j^2$ to have the weight function over the unit ball (which is perhaps simpler than the simplex).

20 November, 2013 at 3:16 pm

Vít Tuček

That’s precisely what is happening in the articles I’ve linked. The author then “constructs” some invariant orthogonal polynomials on the standard simplex. The problem is to find the matrix of $F \to (\int_0^\infty F(t_1,\ldots,t_{k-1},t_k) dt_k)^2$ with respect to the basis of orthogonal symmetric polynomials.

Of course, we can always take arbitrary polynomials in elementary symmetric functions (which give us all invariant polynomials) and evaluate the integrals numerically.

20 November, 2013 at 3:24 pm

James Maynard

I only looked very briefly at the numerical optimization with non-polynomial functions.

We know (Weierstrass) that polynomial approximations will converge to the optimal function. Moreover, I found that the numerical bounds from polynomial approximations seemed to converge quickly.

For $k_0=4$ , it is easy to use approximations involving higher symmetric polynomials, but there is only a very limited numerical improvement after having taken a low-degree approximation.

For $k_0\approx 105$ things are more complicated, and convergence is a bit slower, so if there is some nicer basis of symmetric functions maybe thats worth looking at. That said, I would have guessed that it is computationally feasible to include higher degree symmetric polynomials and get a bound that is accurate enough to get the globally best value of $k_0$ (if we’re only following the method in my paper – with new ideas, such as MPZ this might no longer be true).

20 November, 2013 at 3:42 pm

Eytan Paldi

There is no need for numerical evaluation since any monomial in $t_1, ..., t_k$ over the simplex is transformed to a monomial in $x_1, ..., x_k$ over the unit ball whose integral is separable (by using spherical coordinates) with explicit expression in terms of beta functions (in this case factorials.)

19 November, 2013 at 11:14 pm

Polymath 8 – a Success! | Combinatorics and more

[…] Tao launched a followup polymath8b to improve the bounds for gaps between primes based on Maynard’s […]

19 November, 2013 at 11:21 pm

Gil Kalai

Naive questions: 1) Do the small gaps results apply to primes in arithmetic progressions? 2) Could there be a way to bootstrap results for k-tuples of prime in small intervals to get back results on 2-tuples. (The k-tuples result feels in some sense also stronger than the twin prime conjecture.)

20 November, 2013 at 8:03 am

Terence Tao

For (1), this should be possible. The original Goldston-Pintz-Yildirim results were extended to arithmetic progressions by Freiberg (http://arxiv.org/abs/1110.6624) and to more general sets of Bohr type by my student Benatar (http://arxiv.org/abs/1305.0348). In Maynard’s paper, it is noted that for the purposes of getting finite values for $H_m$ , the set of primes can be replaced by any other set that obeys a Bombieri-Vinogradov type theorem (at any level of distribution, not necessarily as large as 1/2), which includes the primes in arithmetic progressions (and also primes in fairly short intervals).

There may be some interplay between the k-tuple results and the 2-tuple results, but note that the set of primes we can detect in a k-tuple is rather sparse, we only catch about $\log k$ primes in such a tuple. Basically, Maynard’s construction gives, for each $k$ and each admissible tuple $(h_1,\ldots,h_k)$ , a sieve weight $\nu= \nu_k$ (which is basically concentrated on a set in $[x,2x]$ of density about $\log^{-k} x$ ) with the property that if $n \in [x,2x]$ is chosen randomly with probability density equal to $\frac{\nu(n)}{\sum_m \nu(m)}$ , then

${\bf P}( n + h_i \hbox{ prime}) \gg \frac{\log k}{k}$

(or, in the original notation, $\sum_{n+h_i \hbox{ prime}} \nu(n) \gg \frac{\log k}{k} \sum_n \nu(n)$ ) for $i=1,\ldots,k$ , which by the first moment method shows that with positive probability at least $\log k$ of the $n+h_i$ will be prime. As $k$ increases, it is true that we get a few more primes this way, but they are spread out in a much longer interval, and the number of $n$ with this many primes also gets sparser, so it’s not clear how to perform the tradeoff.

But this does remind me of a possible idea I had to shave 1 off of $k$ . Suppose for instance one was trying to catch two primes close together in the tuple $n,n+2,n+6,n+8$ . Currently, we need estimates of the form

${\bf P}( n \hbox{ prime}) > \frac{1}{4}$

${\bf P}( n+2 \hbox{ prime}) > \frac{1}{4}$

${\bf P}( n+6 \hbox{ prime}) > \frac{1}{4}$

${\bf P}( n+8 \hbox{ prime}) > \frac{1}{4}$

to catch two primes at distance at most 8 apart; one can also replace the four constants $\frac{1}{4}$ by other constants that add up to 1, although this doesn’t seem to help much. To get primes distance at most 6 apart, one would ordinarily need to boost three of these constants up to $1/3$ (and drop the last one down to zero). But suppose we had bounds of the form

${\bf P}( n \hbox{ prime}) > \frac{1}{4} + \varepsilon$

${\bf P}( n+2 \hbox{ prime}) > \frac{1}{4} + \varepsilon$

${\bf P}( n+6 \hbox{ prime}) > \frac{1}{4} + \varepsilon$

${\bf P}( n+8 \hbox{ prime}) > \frac{1}{4} + \varepsilon$

${\bf P}( n, n+8 \hbox{ both prime}) \leq \varepsilon$ .

Then one could get a gap of at most 6, and not just 8, because the first four estimates show that there are two primes in n,n+2,n+6,n+8 with probability greater than $\varepsilon$ , and the last estimate excludes the possibility that this only occurs for the n,n+8 pair. Unfortunately with sieve theory it is difficult to get really good bounds on ${\bf P}( n, n+8 \hbox{ both prime})$ , so I don’t know whether this idea will actually work.

20 November, 2013 at 10:29 am

Terence Tao

Here is a slight refinement of the last idea. Assume EH, and suppose we want to improve the current bound of $H_1 \leq 12$ to $H_1 \leq 10$ . Assume this were false, thus after some point there are no pairs of primes of distance 10 or less apart. Thus, given any large random integer n (chosen using a probability distribution $\nu(n) / \sum_m\nu(m)$ ), the events “n prime”, “n+2 prime”, “n+6 prime”, “n+8 prime”, “n+12 prime” are all disjoint, except for the pair “n prime” and “n+12 prime” which are allowed to overlap. If we then let $P_h$ be the probability that $n+h$ is prime, and $P_{0,12}$ be the probability that n and n+12 are simultaneously prime, we thus have the inequality

$P_0 + P_2 + P_6 + P_8 + P_{12} \leq 1 + P_{0,12}.$

So if we can get a lower bound on

$\sum_{h=0,2,6,8,12} \sum_n 1_{n+h \hbox{ prime}} \nu(n)$ (1)

which is asymptotically better than an upper bound for

$\sum_n \nu(n) + \sum_n 1_{n,n+12 \hbox{ prime}} \nu(n)$

then we can reduce $H_1$ from 12 to 10. The tricky thing though is still how to get a good upper bound on $\sum_n 1_{n,n+12 \hbox{ prime}} \nu(n)$ . But perhaps there is now a fair amount of “room” in the ratio between (1) and $\sum_n \nu(n)$ if we use more symmetric polynomials and enlarge the simplex as per Remark 1; the computations in Maynard’s paper show that this ratio is at least $\frac{1}{2} M_5 > 1.00058$ which gives almost no room to play with, but perhaps with the above improvements one can get a more favorable ratio.

20 November, 2013 at 10:39 am

James Maynard

Another comment in the same vein:

Instead of using $\sum_n \nu(n)$ we can use $\sum_n \nu(n)-\sum_n 1_{\hbox{none of } n,n+2,\dots,n+12 \hbox{ prime}} \nu(n)$ . We can therefore get a small improvement numerically from removing e.g. the contribution when all of $n,n+2,n+6,n+8,n+12$ have a small prime factor.

20 November, 2013 at 3:48 pm

Terence Tao

I thought about this idea a bit more, but encountered a problem in the EH case: if one wants a meaningful lower bound on the second term, then one starts wanting to estimate sums such as $\sum_{d_1|n,d_2|n+2,\ldots,d_5|n+12} \nu(n)$ for various values of $d_1,\ldots,d_5$ , but this is only plausible if the sieve level $R^2$ of $\nu$ is a bit below the threshold of $x$ (by a factor of $d_1\ldots d_5$ ), but then one is probably losing more from having a reduced sieve level $R$ (as it now has to be below the optimal value of $x^{1/2}$ that one wants to pick if one wants to get the maximum benefit of EH) than one is gaining from removing this piece.

However, the situation is better if one is starting from Bombieri-Vinogradov or MPZ as the distribution estimate (so one would now be trying to beat the 600 bound rather than the 12 bound). Here the sieve level $R^2$ is more like $x^{1/2}$ and so there is actually a fair bit of room to insert some divisibility constraints $d_1|n,\ldots,d_{105}|n+600$ and still be able to control $\sum_{d_1|n,\ldots,d_{105}|n+600} \nu(n)$ . The bad news is that the final bound one gets here could be really small, something like $105^{-105}$ of the main term, which looks too tiny to lead to any actual improvement in the H value.

21 November, 2013 at 6:51 pm

James Maynard

Good point. Maybe (I haven’t really thought about this) we can use exponential sum estimates to get us a bit of extra room, but this is already sounding like quite a lot of work for what would be only a very small numerical improvement.

21 November, 2013 at 7:15 am

Terence Tao

I found a way to get an upper bound on the quantity

$\displaystyle \sum_n 1_{n,n+12 prime} \nu(n)$

(and hence $P_{0,12}$ ) but I don’t know how effective it will be yet. Basically, the idea is to observe that if $\nu$ is given in terms of a multidimensional GPY cutoff function f as in equation (10) in the blog post, then when n and n+12 are both prime, we have $\nu(n) = \nu'(n)$ , whenever $\nu'$ is given in terms of another multidimensional GPY cutoff function f’ with the property that $f(0,t_2,t_3,t_4,0) = f'(0,t_2,t_3,t_4,0)$ for all $t_2,t_3,t_4$ . Then

$\displaystyle \sum_n 1_{n,n+12 prime} \nu(n) = \sum_n 1_{n,n+12 prime} \nu'(n)$
$\displaystyle \leq \sum_n 1_{n prime} \nu'(n)$ .

The last step is of course crude, but it crucially replaces a sum over two primality conditions (which is hopeless to estimate with current technology) with a sum over one primality condition (which we do know how to control, assuming a suitable distribution hypothesis). The final expression may be computed by Proposition 5 of the blog post to be

$\displaystyle (1+o(1)) \frac{Bx}{W} \frac{\log R}{\log x} \int f'_{2345}(0,t_2,t_3,t_4,t_5)^2\ dt_2 dt_3 dt_4 dt_5$

and so the probability $P_{0,12}$ above may be upper bounded by the ratio

$\displaystyle \frac{\log R}{\log x} \frac{ \int f'_{2345}(0,t_2,t_3,t_4,t_5)^2\ dt_2 dt_3 dt_4 dt_5 }{ \int f_{12345}^2(t_1,t_2,t_3,t_4,t_5)^2\ dt_1dt_2 dt_3 dt_4 dt_5 } + o(1)$ .

If we set f’=f then this gives the trivial upper bound $P_{0,12}\leq P_0$ , but the point is that we can do better by optimising in f’.

Now the optimisation problem for f’ is much easier than the main optimisation for $M_k$ : we need to optimise the square-integral of $f'_{2345}(0,t_2,t_3,t_4,t_5)$ subject to $f'$ being supported on the simplex and having the same trace as $f$ on the boundary $t_1=t_5=0$ . (No symmetry hypothesis needs to be imposed on f’.) It is not difficult (basically fundamental theorem of calculus + converse to Cauchy-Schwarz) to see that the extremiser occurs with

$\displaystyle f'_{2345}(0,t_2,t_3,t_4,t_5) = \frac{f_{234}(0,t_2,t_3,t_4,0)}{1-t_2-t_3-t_4}$

and so we have

$\displaystyle P_{0,12} \leq \frac{\log R}{\log x} \frac{ \int \frac{f_{234}(0,t_2,t_3,t_4,0)^2}{1-t_2-t_3-t_4}\ dt_2 dt_3 dt_4 }{ \int f_{12345}^2(t_1,t_2,t_3,t_4,t_5)^2\ dt_1dt_2 dt_3 dt_4 dt_5 } + o(1)$ .

But I don’t know yet how strong this bound will be in practice; one needs $f$ to stay a little bit away from the extreme points of the simplex in order for the denominator $\frac{1}{1-t_2-t_3-t-4}$ not to kill us. (On the other hand, as observed earlier, the bound can’t get worse than $P_0$ .)

In the large k regime, the analogue of $P_0,P_{12}$ has size about $\log k/k$ ; I’m hoping that the bound for $P_{0,12}$ is significantly smaller, indeed independence heuristics suggest it should be something like $O( \log^2 k / k^2 )$ . It’s dangerous of course to extrapolate this to the k=5 case but one can hope that it is still the case that $P_{0,12}$ is small compared to $P_0$ in this case.

21 November, 2013 at 6:39 pm

James Maynard

Nice! If we’re only using Bombieri-Vinogradov (or MPZ) then presumably we can use a hybrid bound, where if $d_2d_3d_4$ is small we use your argument as above, but if $d_2d_3d_4$ is larger we instead use $\displaystyle\sum_n 1_{n,n+12 prime}\nu (n)\le \sum_n \nu ''(n)$
(i.e. rather than forcing n to be prime and n+12 an almost-prime with a very small level-of-distribution, we relax this to n and n+12 being almost-primes.)

If we’re using Elliott-Halberstam then we get no use from this, since the primes have (apart from an $\epsilon$ ) the same level-of-distribution results available.

21 November, 2013 at 7:03 pm

Terence Tao

Certainly we have the option of deleting both primality constraints, but I’m not sure how one can separate the large $d_2d_3d_4$ case from the small $d_2d_3d_4$ case while retaining the positivity of all the relevant components of $\nu$ , which is implicitly used in order to safely drop the constraints that n is prime or n+12 is prime.

In the large k regime, if we take the sieve from your Section 8, but restrict it to a smaller simplex, such as $\{t_1+\ldots+t_k \leq 0.9\}$ , then it does appear that the analogue of $P_{0,12}$ scales like $\log^2 k/k^2$ , which it should (morally we should have $P_{0,12} \sim P_0 P_{12}$ , though with current methods we could only hope for an upper bound that is of this order of magnitude). It would be interesting to see what numerics we can get in the k=5 case though.

One nice thing is that all the bounds for $P_0, P_{0,12}$ , etc. are quadratic forms in the function f, so the same linear programming methods in your paper will continue to be useful here, without need for more nonlinear optimisation.

21 November, 2013 at 7:25 pm

James Maynard

I guess I was imagining a Cauchy-Schwarz step:
$\displaystyle 1_{n,n+12 prime}(\sum_{d_2,d_3,d_4}\lambda_{d_2,d_3,d_4})^2$

$\displaystyle \le 1_{n,n+12 prime}(\sum_{d_2,d_3,d_4 `small'}\lambda_{d_2,d_3,d_4})^2$

$\displaystyle +1_{n,n+12 prime}(\sum_{d_2,d_3,d_4 `large'}\lambda_{d_2,d_3,d_4})^2$
and then using the separate bounds in for the different sums.

21 November, 2013 at 8:19 pm

Terence Tao

Ah, I see. That could work, although one loses a factor of 2 from using the inequality $(a+b)^2 \leq 2a^2 + 2b^2$ and so splitting might not be efficient in practice – but I guess we can experiment with all sorts of combinations to see what works numerically and what doesn’t.

21 November, 2013 at 9:00 pm

James Maynard

We can regain the factor of two by being a bit more careful. Let $\lambda_{d_2,d_3,d_4}=\lambda_{small}+\lambda_{large}$ , where $\lambda_{small}$ is $\lambda$ provided $d_2d_3d_4$ is small, and $\lambda_{large}$ is the remainder. Then, expanding the square, we would use your bound for all the terms except the $\lambda_{large}\lambda_{large}$ terms. Of course, it might be that in practice that the benefit of doing this is small.

(PS: Apologies for the formatting errors I’ve had in my posts – I’ll try to check them more carefully)

21 November, 2013 at 9:30 pm

James Maynard

Ignore my last comment – this loses positivity.

21 November, 2013 at 9:30 pm

Terence Tao

Hmm… but in general, the $\lambda_{large}, \lambda_{small}$ do not have a definite sign, and so I don’t see how to discard the cutoffs $1_{n \hbox{ prime}}$ or $1_{n+12 \hbox{ prime}}$ in the cross terms $\sum_n 1_{n,n+12 \hbox{ prime}} \lambda_{small} \lambda_{large}$ unless one used Cauchy-Schwarz (which would be a little bit more efficient than the $(a+b)^2 \leq 2a^2+2b^2$ inequality, though not by too much).

22 November, 2013 at 7:35 am

James Maynard

Actually, I think the Cauchy step I suggested above is rubbish (I was clearly more tired last night than I realized). We would lose the vital filtering out of small primes, or we would be doing worse than just going one way or the other.

I think my concern was that with only a limited amount of room for divisibility constraints we might only be filtering out factors of $n+12$ which are typically less than $N^{c/k}$ for a constant $c$ , and potentially losing a factor of $k$ . (I’m thinking about the case when $k$ is large).I think this is what would be happening with the original GPY weights, but our weights are constructed so that the main contribution is when $d_2\dotsc d_k$ is of size $N^c$ (some suitable constant $c$ ), which gives us enough room that we don’t need to worry here.

Presumably this would give an improvement to $k_0$ of size about $k_0^{1/2}$ (ignoring all log factors).

20 November, 2013 at 3:57 pm

Tristan Freiberg

Re (1), depends what exactly one means by GPY for APs, but GPY extended their original work to primes in APs — wasn’t me. My understanding is that these amazing new results apply equally well to admissible k-tuples of linear forms g_1x + h_1,…,g_kx + h_k, not just monic ones. I’m sure we’ll see lots of neat applications of this that we can get almost for free after Maynard’s spectacular proof!

20 November, 2013 at 6:20 am

timur

In the second sentence, is it mean that H_m is the least quantity such that there are infinitely many intervals of length H_m that contain m+1 (not m) or more primes?

[Corrected, thanks – T.]

20 November, 2013 at 8:22 am

Pace Nielsen

Dear Terry,

Could you expound a little bit on your Remark 1? If we are within epsilon of DHL[2,2] under EH, why don’t we have DHL[3,2] under EH?

20 November, 2013 at 9:06 am

Terence Tao

Good question! There is a funny failure of monotonicity when one tries to use the boost in Remark 1, which I’ll try to explain a bit more here.

Start with the task of trying to prove DHL[2,2], or more precisely the twin prime problem of finding two primes in $n, n+2$ . The way we try to do this is to pick a sieve weight $\nu$ which basically has the form

$\nu(n) = \sum_{d_1 | n, d_2|n+2} \lambda_{d_1,d_2}$

for some sieve coefficients $\lambda_{d_1,d_2}$ whose exact value we will ignore for now. We then want to control sums such as

$\sum_n \nu(n)$
$\sum_n \theta(n) \nu(n)$
$\sum_n \theta(n+2) \nu(n).$

The first sum can be easily estimated in practice, so let us ignore it for now and focus on the second two sums. At first glance, because of the two divisibility constraints $d_1|n, d_2|n+2$ , it looks like we need $d_1 d_2 < x$ to have any real hope of controlling the sum; but note that the weight $\theta(n)$ forces $n$ to be prime and so $d_1$ will be $1$ , so we actually only need $d_2 < x$ to control the second sum, and similarly $d_1 < x$ to control the first sum. The upshot of this is that the function $F$ in the blog post only needs to be supported on the square $\{ (t_1,t_2): t_1,t_2 \leq 1\}$ rather than the triangle $\{(t_1,t_2): t_1+t_2 \leq 1\}$ (and if one then sets F equal to 1 on this square, one gets $M_2 = 2$ , coming within an epsilon of the twin prime conjecture on EH).

[I'm glossing over some details as to how to deal with the allegedly easy first sum. If $d_1 d_2 < x$ then this sum is indeed easy to estimate, but if $d_1 d_2$ can exceed $x$ then one has to proceed more carefully. I think what one does here is try to use a separation of variables trick, breaking up $\nu$ into pieces of the form $\nu_1(n) \nu_2(n)$ , where $\nu_1(n) = \sum_{d_1|n} \lambda_{d_1}$ is a divisor sum of $n$ only, and $\nu_2(n) = \sum_{d_2|n+2} \lambda_{d_2}$ is a divisor sum of $n+2$ only, and then use an Elliott-Halberstam hypothesis for $\nu_1$ or $\nu_2$ (rather than $\theta$ ) to control error terms; this should get us back to the weaker constraints $d_1 < x$ , $d_2 < x$ .]

Now we turn to DHL[3,2], playing with the tuple n,n+2,n+6. Now we use a sieve of the form

$\nu(n) = \sum_{d_1 | n, d_2|n+2, d_3|n+6} \lambda_{d_1,d_2,d_3}$

and want to estimate the sums
$\sum_n \nu(n)$
$\sum_n \theta(n) \nu(n)$
$\sum_n \theta(n+2) \nu(n)$
$\sum_n \theta(n+6) \nu(n)$ .

Again ignoring the first sum as being presumably easy to handle, we now need the constraints $d_2 d_3 < x, d_1 d_3 < x, d_1 d_2 < x$ in order to control the latter three sums respectively (rather than $d_1 d_2 d_3 < x$ ). In terms of the cutoff function F, this means that we may enlarge the support from the simplex $\{(t_1,t_2,t_3): t_1+t_2+t_3 \leq 1 \}$ to $\{ (t_1,t_2,t_3): t_2+t_3, t_1+t_3, t_1+t_2 \leq 1 \}$ . But note now that we have a constraint $d_1 d_2 < x$ present for the 3-tuple problem which was not present in the 2-tuple problem, because we had no need to control $\sum_n \theta(n+6) \nu(n)$ in the 2-tuple problem. So there is a failure of monotonicity; the simple example of $F(t_1,t_2) = 1_{[0,1]}(t_1) 1_{[0,1]}(t_2)$ which gets us within an epsilon of success for 2-tuples on EH doesn't directly translate to something for 3-tuples.

20 November, 2013 at 11:50 am

Terence Tao

Thinking about it a bit more, this lack of monotonicity is probably a sign that the arguments are not as efficient as they could be. In what one might now call the “classical” GPY/Zhang arguments, one relies on the Elliott-Halberstam conjecture just for the von Mangoldt function $\Lambda(n) = \sum_{d|n} \mu(d) \log(n/d) = - \sum_{d|n} \mu(d) \log d$ . To obtain the enlargement of the simplex, one would also need the EH conjecture for other divisor sums $\sum_{d|n} \lambda_d$ , but this type of generalisation of EH is a standard extension (see e.g. Conjecture 1 of Bombieri-Friedlander-Iwaniec) and so this would not be considered that much more “expensive” than EH just for von Mangoldt (although the latter is pretty expensive at present!).

To recover monotonicity, one would have to also assume EH for hybrid functions such as $\Lambda(n+2) \sum_{d|n} \lambda_d$ . One could simply conjecture EH for such beasts, but this now looks considerably more “expensive” than the previous versions of EH (indeed, it may already be stronger than the twin prime conjecture, especially if the level of the divisor sum is allowed to be large). On the other hand, perhaps a Bombieri-Vinogradov type theorem for these hybrid functions is not unreasonable to hope for (though I am not sure exactly how this would help in a situation in which we already had EH for the non-hybrid functions).

20 November, 2013 at 12:03 pm

Pace Nielsen

Thank you for this explanation. It makes a lot of sense.

I wonder if taking different admissible 3-tuples (e.g. {0,2,6} and {0,4,6}) and then weighting them appropriately in the sieve would allow further refinement of the support simplex.

20 November, 2013 at 2:43 pm

Terence Tao

Unfortunately these tuples somehow live in completely different worlds and don’t seem to have much interaction. Specifically, the only chance we have to make $n,n+2,n+6$ prime for large n is if $n = 2\ (3)$ , while the only chance we have to make $n, n+4, n+6$ prime is if $n=1\ (3)$ . So any sieve that “sees” the first tuple in any non-trivial way would necessarily be concentrated on $2\ (3)$ , and any sieve that sees the second tuple would be concentrated instead on $1\ (3)$ . The distribution of primes in $2\ (3)$ and in $1\ (3)$ are more or less independent (one may at best save a multiplicative factor of two in the error terms by treating them together, which is negligible in the grand scheme of things), so I don’t think one gains much from a weighted average of sieves adapted to two different residue classes mod 3.

20 November, 2013 at 5:22 pm

Pace Nielsen

Good point. I should have used {0,2,6} and {2,6,8} together instead. I think the idea in my head was something along the lines of your earlier post, about using probabilities in some ways. This would require our weights to discriminate against both 0 and 8 being prime simultaneously (in this case), and that appears to be a difficult task.

20 November, 2013 at 9:09 am

Sylvain JULIEN

You might be interested in the following heuristics conditional on a rather stronger form of Goldbach’s conjecture (that I call NFPR conjecture) explaining roughly why $H_{k}$ should be $O(k\log k)$: http://mathoverflow.net/questions/132973/would-the-following-conjectures-imply-lim-inf-n-to-inftyp-nk-p-n-ok-lo
It would be interesting to try to establish rigorously the presumably best possible upper bound $r_{0}(n)=O*(\log^{2}n)$ that would settle both NFPR conjecture and Cramer’s conjecture: maybe you could start a future Polymath project to do so?

20 November, 2013 at 12:15 pm

Terence Tao

One thought on the variational problem; for sake of discussion let us take k=3. The quantity $M_3$ is then the maximum value of

$3 \int_{x+y \leq 1} (\int_{z \leq 1-x-y} F(x,y,z)\ dz)^2 dx dy$

for symmetric F(x,y,z) supported on the simplex $\{x+y+z \leq 1\}$ subject to the constraint

$\int_{x+y+z \leq 1} F(x,y,z)^2\ dx dy dz = 1.$

(As discussed earlier, we can hope to enlarge this simplex to the larger region $\{x+y,y+z,z+x \leq 1\}$ , but let us ignore this improvement for now.) In Maynard’s paper, one considers arbitrary polynomial combinations of the first two symmetric polynomials $P_1 = x+y+z$ , $P_2=x^2+y^2+z^2$ as candidates for F.

However, we know from Lagrange multipliers (as remarked after (7.1) in Maynard’s paper) that the optimal F must be a constant multiple of

$\int_0^{1-x-y} F(x,y,t)\ dt + \int_0^{1-x-z} F(x,t,z)\ dt + \int_0^{1-y-z} F(t,y,z)\ dt$

and so in particular takes the form

$F(x,y,z) = G(x,y)+ G(x,z) + G(y,z)$

for a symmetric function G(x,y) of two variables supported on the triangle $\{x+y \leq 1\}$ . So one could collapse the three-dimensional variational problem to a two-dimensional one, which in principle helps avoid the “curse of dimensionality” and would allow one to numerically explore a greater region of the space of symmetric functions (e.g. by writing everything in terms of G and expanding G into functions of symmetric polynomials such as $x+y, x^2+y^2$ ).

It is possible that one could iterate this process and reduce to a one-dimensional variational problem, but it looks messy and I have not yet attempted this.

20 November, 2013 at 3:43 pm

James Maynard

I was unable to iterate the reduction in dimension step more than once. (This is roughly why I was able to solve the eigenfunction equation when $k_0=2$ , since it reduces to solving a single variable PDE, but not for $k_0>2$ ).

The eigenfunction equation for $G(x,y)$ looks like
$(\lambda-1+x+y)G(x,y)=\int_0^{1-x-y}(G(x,t)+G(t,y))dt,$
and I failed to get anything particularly useful out of this (although maybe someone else has some clever ideas – I’m far from an expert at this. Differentiating wrt x and y turns this into a two variable PDE). One can do some sort of iterative substitution, but I just got a mess.

20 November, 2013 at 2:22 pm

Anonymous

Notation:

* Use “ $\liminf_{n\to\infty}$ ” instead of “ $\lim\inf_{n\to\infty}$ ”. (Also, it is easier to use “ $\to$ ” than “ $\rightarrow$ ”.)
* Use “ $m\textsuperscript{th}$ ” instead of “ $m^{th}$ ”.

20 November, 2013 at 2:49 pm

Eytan Paldi

By representing $M_{k_0}$ as the largest eigenvalue of a certain non-negative integral operator, a simple upper bound for $M_{k_0}$ (implying some limitations of the method) is given by the operator trace.

20 November, 2013 at 3:55 pm

Terence Tao

A cheap way to combine Maynard’s paper with the Zhang/Polymath8a stuff: if $MPZ[\varpi,\delta]$ is true, then we have $H_m \ll \exp( \frac{m}{1/4+\varpi} + o(m) )$ as $m \to\infty$ . The reason for this is that the function F (or f) constructed in the large k_0 setting is supported in the cube $[0,T/k_0]^{k_0}$ (as well as in the simplex); by construction, $T/k_0$ decays like $\log^{-3} k_0$ , so in particular f will be supported on $[0, \frac{\delta}{1/4+\varpi}]^{k_0}$ for $k_0$ large enough. This means that the moduli that come out of the sieve asymptotic analysis will be $x^\delta$ -smooth, and so MPZ may be used in place of EH much as in Zhang (or Polymath8a).

So using our best verified value 7/600 of $\varpi$ , we have the bound $H_m \ll \exp( 3.822 m)$ , and using the tentative value 13/1080 this improves to $H_m \ll \exp( 3.817 m)$ . Not all that dramatic of an improvement, but it does show that the Polymath8a stuff is good for something :).

20 November, 2013 at 5:31 pm

Terence Tao

The computations in Section 8 of Maynard’s paper should give explicit values of H_m for small m that are fairly competitive with the bound $H_1 \leq 600$ , I think.

For simplicity let us just work off of Bombieri-Vinogradov (so $\theta$ is basically 1/2); one can surely push things further by using the MPZ estimates, but let’s defer that to later. The criterion (Proposition 4.2 of Maynard, or Corollary 6 here) is then that $H_m \leq H(k)$ holds whenever $M_k > 4m$ , where $H(k)$ is the diameter of the narrowest k-tuple. For instance, $M_{105} > 4$ gives $H_1 \leq 600$ .

In Section 8 of Maynard, the lower bound

$\displaystyle M_k \geq \frac{m_1^2}{m_2} (1 - \frac{T}{k(1-T/k-\mu)^2} )$ (1)

holds for any $A,T > 0$ such that

$\displaystyle 1-T/k-\mu > 0$ (1′)

where

$\displaystyle m_1 = \frac{\log(1+AT)}{A}$

$\displaystyle m_2 = \frac{1}{A} (1 - \frac{1}{1+AT} )$

$\displaystyle \mu = \frac{1}{m_2} \frac{1}{A^2} ( \log(1+AT) - 1 + \frac{1}{1+AT} )$ .

(here $m_i$ are the moments $m_i = \int_0^T g(t)^i dt$ , where $g(t) = \frac{1}{1+At}$ .) It looks like a fairly routine matter to optimise A,T for any given k, and then find the first k with $M_k > 4$ (to get a $H_1$ bound), the first k with $M_k > 8$ (to get a $H_2$ bound), and so forth.

Actually there is some room to improve things a bit; if one inspects Maynard’s argument more carefully, one can improve (1) to

$\displaystyle M_k \geq \frac{m_1^2}{m_2} (1 - \frac{(k-1)\sigma^2}{(k - T - (k-1)\mu)^2} )$ (2)

where $\sigma^2$ is the variance

$\displaystyle\sigma^2 = \frac{1}{m_2} \int_0^T t^2 g(t)^2\ dt - \mu^2$
$\displaystyle = \frac{T}{A^2} + \frac{T}{A^2(1+AT)} - \frac{2 \log(1+AT)}{A^3} - \mu^2$

and the criterion (1′) is replaced with

$k-T-(k-1)\mu > 0$ (2′)

which could lead to some modest improvement in the numerology. (I also have an idea on removing the -T in the denominator in (2) and (2′) (or the -T/k in (1) and (1′), but this is a bit more complicated.)

20 November, 2013 at 9:29 pm

Pace Nielsen

“The computations in Section 8 of Maynard’s paper should give explicit values of H_m for small m that are fairly competitive with the bound H_1 \leq 600, I think.”

Assuming I didn’t make any errors, for $m=1$ you can get $k=145$ rather easily with your improved equations. (Take $T=21,A=5.3$ for example.) I couldn’t get $k=144$ to work out. Also, I didn’t try to do $m=2$ so others might want to give it a try.

I didn’t do anything fancy to find the solution for $k=145$ — just made a lot of graphs. Similarly, for $k=144$ I just graphed enough points so that I was convinced that the maximum for the RHS of (2) is less than 4.

20 November, 2013 at 9:40 pm

Pace Nielsen

P.S. While I was playing around with trying to actually find the optimal solution, I did notice that making the substitution $A=(B-1)/T$ (for some variable $B>1$ , which we may want to replace with an exponential) simplifies things quite a bit. For instance, this allows one to convert (2′) into an inequality of the form $T< f(B,k)$ for a function $f$ (which has a number of nice properties).

20 November, 2013 at 10:05 pm

Pace Nielsen

P.P.S. I just discovered the “FindMaximum” function in Mathematica, which concurs with my findings. When $k=145$ , It says that the RHS of (2) takes a maximum value of $4.00324717$ when (2′) holds, at $T=21.9045,A=5.45969$ . On the other hand, when $k=144$ , then the RHS of (2) is less than 4, when (2′) holds.

If this function is to be believed, when $m=2$ then $k=13986$ is the lowest $k$ value for which the RHS of (2) is bigger than 8 when (2′) also holds. One takes $T=994.001,A=10.0689$ .

21 November, 2013 at 2:32 am

Andrew Sutherland

$k=13986$ for $m=2$ gives $H_2 \le 145212$ .
The admissible $k$ -tuple is here.

21 November, 2013 at 1:51 pm

Terence Tao

In light of James’ calculation below that a naive insertion of $\varpi=7/600$ (or $\varpi=13/1080$ ) gives good results for the small k numerics, it is now tempting to do the same for the large k asymptotics given above.

If one does things optimistically (ignoring $\delta$ ), then all one has to do is replace the condition $M_k > 4m$ with $M_k > \frac{m}{1/4+\varpi}$ for one’s favourite value of $\varpi$ . Given that in the small k case this improved 105 to 68, perhaps we could hope that 145 improves to below 105?

To do things properly (i.e. rigorously), we unfortunately have to also take into account $\delta$ , and specifically there is an annoying new constraint

$\displaystyle \frac{T}{k_0} \leq \frac{\delta}{1/4+\varpi}$

needed to ensure that all the moduli involved are $x^\delta$ -smooth. This (when combined with either $\frac{600}{7} \varpi + \frac{180}{7} \delta < 1$ or $\frac{1080}{13} \varpi + \frac{330}{13} \delta < 1$ ) is likely to cause some degradation in the bounds, but at least these bounds would be rigorous.

On the other hand, we do know that our MPZ estimates only need a certain amount of dense divisibility rather than smoothness. It is likely that one could modify the above arguments (probably by introducing two different deltas, much as we did in Polymath8a) we could reduce the dependence on delta to the point where it is almost negligible. But this looks a little tricky and is probably best deferred until a little later.

p.s. Gah, I just realised that my formula for $\sigma^2$ given previously was incorrect, as I had lost the $m_2$ factor. The true formula for $\sigma^2$ is

$\displaystyle \sigma^2 = \frac{1}{m_2}( \frac{T}{A^2} + \frac{T}{A^2(1+AT)} - \frac{2 \log(1+AT)}{A^3} ) - \mu^2$

which unfortunately is likely to make the previous computations a little bit worse. Sorry about that!

21 November, 2013 at 2:58 pm

Pace Nielsen

With the corrected formula for $\sigma^2$ , the FindMaximum command gives the following results:

For $m=1$ , we have $k=582, T=38.5972,A=6.91488$ .

For $m=2$ , we have $k=59451,T=1574.21,A=10.9443$ .

I’ll give your other improvement some thought when I have more time.

21 November, 2013 at 4:13 pm

Terence Tao

Thanks for the quick recalculation! It’s a shame that we lost a factor of 5 due to the error, although we still beat Polymath8a, for what it’s worth…

22 November, 2013 at 1:33 am

Andrew Sutherland

With $k = 59451$ we get $H_2 \le 698288$ with this tuple.

21 November, 2013 at 3:51 pm

xfxie

Just had an initial try for $\frac{1080}{13} \varpi + \frac{330}{13} \delta < 1$ . Seems $k$ cannot be dropped down too far. Here is a possible solution 508.

21 November, 2013 at 8:12 pm

Pace Nielsen

Using $\frac{1080}{13}\varpi + \frac{330}{13}\delta <1$ , xfxie's $k$ appears optimal for $m=1$ . For $m=2$ , we can get $k=42392$ .

22 November, 2013 at 6:18 am

Andrew Sutherland

This gives the lower bound 484290 on $H_2$ .

22 November, 2013 at 7:04 am

Andrew Sutherland

Of course I meant to say upper bound :)

22 November, 2013 at 7:37 am

Andrew Sutherland

Actually we can get H_2 down to 484276.

22 November, 2013 at 9:18 am

Fan

The link to the 484276 tuple keeps throwing 403

22 November, 2013 at 10:04 am

Andrew Sutherland

Fixed.

23 November, 2013 at 4:13 pm

xfxie

For $k$ =42392, seems $H$ can be further down to 484272 :).

23 November, 2013 at 4:33 pm

Andrew Sutherland

Slight further improvement: 484260.

23 November, 2013 at 6:32 pm

xfxie

Further improved to 484238.

24 November, 2013 at 2:00 am

Andrew Sutherland

Another small improvement: 484234.

24 November, 2013 at 6:35 am

xfxie

Can be further down to 484200.

24 November, 2013 at 6:21 pm

Andrew Sutherland

A little more progress: 484192.

25 November, 2013 at 2:09 am

Andrew Sutherland

484176.

26 November, 2013 at 5:01 am

xfxie

Can be dropped to: 484168 (for k=42392).

27 November, 2013 at 2:08 am

Andrew Sutherland

484162

27 November, 2013 at 12:40 pm

Andrew Sutherland

484142

28 November, 2013 at 4:47 am

Andrew Sutherland

484136

28 November, 2013 at 6:25 am

Andrew Sutherland

484126

20 November, 2013 at 6:36 pm

andre

Please forgive me my ignorance, I even didn’t fully read Maynard’s preprint.
On page 5, in Proposition 4.1, he defines $J_k^{(m)}(F)$ with an outer integration over the (k-1)-dim. cube. On page 19, along the proof of Lemma 7.2, formula (7.12) he uses an outer integration over the (k-1)-dim simplex. So my question is, if the outer integration in the definition of $J_k^{(m)}(F)$ was over the (k-1)-dim simplex, too?

20 November, 2013 at 9:13 pm

Terence Tao

In Lemma 7.2, one is only considering functions F of the form defined in (7.3), which are supported on the simplex, and so integration of such functions in the cube is the same as on the simplex.

20 November, 2013 at 6:48 pm

Eytan Paldi

It seems that the largest eigenvalue of the operator defined in (7.2) of Maynard’s preprint is the square root of $M_k$ (not $M_k$ as remarked there.)

20 November, 2013 at 7:11 pm

Eytan Paldi

In fact, the exact relationship between the above linear operator and $M_k$ is not sufficiently clear to me.

20 November, 2013 at 9:09 pm

Terence Tao

The connection comes from the identity

$\sum_{m=1}^k J_k^{(m)}(F) = \int_{{\mathcal R}_k} F {\cal L} F$

which shows that $M_k$ is the largest eigenvalue of ${\cal L}$ . (Actually, as ${\cal L}$ is not quite compact, although it is bounded, self-adjoint, and positive definite, one has to be a little careful here with the spectral theory to ensure that ${\cal L}$ only has point spectrum at its operator norm; I think it will work out though, I will try to supply some details later. In any case this doesn’t impact the substance of Maynard’s paper, as one never actually analyses eigenfunctions in that paper.)

20 November, 2013 at 9:37 pm

Eytan Paldi

Thanks for the explanation! One may use this to show that for each given polynomial P, the ratio for $L P$ should increase (unless P is optimal). This gives iterative process for improving $F$ .

22 November, 2013 at 7:31 am

Terence Tao

A little expansion of Eytan’s remark (a version of the power method): from Cauchy-Schwarz one has

$\displaystyle (\int_{{\cal R}_k} f {\cal L} f)^2 \leq (\int_{{\cal R}_k} f f) (\int_{{\cal R}_k} {\cal L} f {\cal L} f)$

which upon rearranging means that the Rayleigh quotient $\int_{{\cal R}_k} f {\cal L} f / \int_{{\cal R}_k} f f$ is non-decreasing if one replaces $f$ by ${\cal L}^{1/2} f$ . Iterating this, we may replace $f$ by ${\cal L} f$ (or any higher power ${\cal L}^m f$ and guarantee a better lower bound on $M_k$ unless one is already at an eigenfunction.

This suggests that we can improve the large k analysis (and in particular, the m=2 values) by taking our current candidate for f and applying ${\cal L}$ to it to obtain a new candidate that is guaranteed to be better. This is a bit messy though; a simpler (though slightly less efficient) approach would be to compute $\int_{{\cal R}_k} {\cal L} f {\cal L} f$ instead of $\int_{{\cal R}_k} f {\cal L} f$ and use the inequality

$M_k^2 \geq \int_{{\cal R}_k} {\cal L} f {\cal L} f / \int_{{\cal R}_k} f f$

to obtain what should presumably be a better lower bound for M_k than what we currently have.

22 November, 2013 at 8:20 am

Eytan Paldi

In general this iteration gives linear convergence to $M_k$ , so one may try to accelerate it numerically (e.g. by Aitken’s $\delta ^2$ method.)

21 November, 2013 at 5:44 am

Eytan Paldi

It seems that the integral operator above is compact since its representing integrand is continuous on its domain (i.e. the product of the simplex in t and the simplex in u.)
But I don’t see why its representing integrand is symmetric (as it should for a self-adjoint operator)!

I suggest to see if the integrands representing powers of this operators can be evaluated explicitly (e.g. by the integrand composition operation) – so the traces of (small) powers of the operator may be evaluated – giving upper bounds for corresponding powers of $M_k$ .

21 November, 2013 at 6:57 am

Terence Tao

One can check by hand that $\int F {\cal L} G = \int G {\cal L} F$ for any (say) bounded compactly supported F,G on the simplex.

The operator ${\cal L}$ is not quite an integral operator because each of the $k$ components only integrates in one of the dimensions rather than all $k$ of the dimensions. As such, it doesn’t look compact, but may still have pure point spectrum, at least at the top of the spectrum (I have to check this). In particular, I don’t expect the traces of powers of this operator to be finite. A model case here is the operator $T: L^2([0,1]^2) \to L^2([0,1]^2)$ given by

$T f(x,y) := \int_0^1 f(x,t)\ dt$

which is positive semi-definite and bounded with an operator norm of 1, but has an infinity of eigenfunctions at 1 (any f that is independent of y will be an eigenfunction).

21 November, 2013 at 7:40 am

Eytan Paldi

Thanks! (I see now my error.)

Anyway, I think that it is also important to get also some upper bounds for $M_k$ (to estimate the tightness of its lower bounds and to get some limitations of the method.)

21 November, 2013 at 11:51 am

Terence Tao

I no longer have any clear intuition as to whether the spectrum of ${\cal L}$ is pure point, absolutely continuous, or some combination of both (it could also possibly have singular continuous spectrum); the situation could be as complicated as it is for discrete or continuous Schrodinger operators, which have a very wide range of possible behaviours. Consider for instance the model operator

$T F(x,y) = \int_0^{1-x} F(x,t)\ dt$

on the triangle $\Delta = \{ (x,y): x+y \le 1\}$ (all variables are assumed to be non-negative); this is one of the two components of ${\cal L}$ in the k=2 case. One can split $L^2(\Delta)$ into the functions that are constant in the y direction, and functions that are mean zero in the y direction. The operator T annihilates the latter space, and acts by multiplication by $1-x$ on the former space. So there is absolutely continuous spectrum on $[0,1]$ and a big null space at 0. The operator norm of 1 is not attained by any function, but one can get arbitrarily close to this norm by, e.g. the indicator function of the rectangle $[0,\varepsilon] \times [0,1-\varepsilon]$ for $\varepsilon > 0$ (one can normalise to have unit norm in L^2 if one wishes).

Despite the possible lack of eigenfunctions (which in particular may mean that in some cases the maximum of the ratio $\sum_{m=1}^k J_k^{(m)}(F)/I(F)$ is not attainable), I believe it is still correct that the operator norm of ${\cal L}$ for all functions on the simplex is the same as the operator norm of ${\cal L}$ for symmetric functions on the simplex, which informally means that we can restrict without loss of generality to the case of symmetric functions F. To see this, suppose that we have a sequence $F_n$ of $L^2$ -normalised functions on the simplex which is a extremising sequence for $\int F_n {\cal L} F_n = \sum_{m=1}^k J_k^{(m)}(F_n)$ , thus this sequence approaches the supremum $M_k$ as $n \to\infty$ . By replacing $F_n$ with its absolute value $|F_n|$ (which does not change the L^2 norm, but can only serve to increase the $J_k^{(m)}(F_n)$ ) we may assume that the $F_n$ are non-negative. In particular, the projection $F_n$ to the symmetric functions (i.e. the symmetrisation of $F_n$ ) is bounded away from zero (in fact it is at least $1/k!$ ). An application of the parallelogram law (or the spectral theorem) shows that any average of two or more extremising sequences is again an extremising sequence (if the norms stay away from zero), so the symmetrised $F_n$ are also extremising. (One could also have taken a slightly higher-minded approach and split up ${\cal L}$ according to isotypic $S_k$ components here if one wished.)

21 November, 2013 at 11:56 am

Terence Tao

My tentative understanding of what is going on near the top of the spectrum is that if one has strict monotonicity $M_k > M_{k-1}$ then the supremum should be obtained by an eigenfunction of ${\cal L}$ , but if one has equality $M_k = M_{k-1}$ then the near-extrema may instead be obtained from lower-dimensional functions, e.g. functions of the form $F(t_1,\ldots,t_k) = F_\varepsilon(t_1,\ldots,t_{k-1}) 1_{[0,\varepsilon]}(t_k)$ where $F_\varepsilon$ is a near-extremiser of the $k-1$ -dimensional problem (supported on a slightly shrunken version of the $k-1$ -dimensional simplex).

It is presumably the case that the $M_k$ are strictly increasing (except possibly for very small k), but I don’t see how one would prove this other than numerically (and even numerical verification is a bit of a challenge because we do not yet have good upper bounds on $M_k$ ).

22 November, 2013 at 10:49 am

Eytan Paldi

A “steepest descent” type iteration for $M_k$ :

Consider the ratio

$R(x) = \langle x, L x \rangle / \langle x, x \rangle$

In a given Hilbert space $X$ (in our case $L^2$ over the simplex) where L is a linear, symmetric, bounded operator on $X$ .

Clearly, we need to consider $\log R(x)$ whose first variation is

$g(x) = 2 L x / \langle x, L x \rangle - 2 x / \langle x, x \rangle$

Note that $g(x)$ may be interpreted as the “gradient” of $\log R$ at x. This implies the “line search” iteration

$x_{j+1} = x_j + \alpha_j d(x_j)$

Where $d(x_j) = g(x_j)/ ||g(x_j)||$ is the direction of search and
$\alpha_j \geq 0$ is the search step size (determined to maximize the ratio at $x_{j+1}$ .

Remarks:

1. If $x_0$ is a polynomial then all $x_j$ are polynomials.

2. It is easy to find the closed form formula for the search step size.

3. This search can be improved to a “conjugate gradient” type search (and perhaps even to “quasi Newton” type methods) with quadratic convergence rate!

22 November, 2013 at 2:51 pm

Eytan Paldi

It should be remarked that in all iterative methods of this type (i.e. “gradient based”), $x_j$ is a linear combination of $x_0, L x_0, ..., L^{j-1} x_0$ – for which Maynard’s method (for best linear combination) should be applied on these basis functions.

21 November, 2013 at 4:26 am

Andreas

Shouldn’t that be “where $p_n$ denotes the $n$-th prime” in the beginning?

21 November, 2013 at 5:45 am

Anonymous

I think you are right. Also, I would write “ $(p_{n+m}-p_n)$ ”, i.e., with parentheses.

The improvements suggestioned in https://terrytao.wordpress.com/2013/11/19/polymath8b-bounded-intervals-with-many-primes-after-maynard/#comment-251777 could also be incorporated. :)

[Changes now made – T.]

21 November, 2013 at 12:50 pm

James Maynard

One numerical data point that people might find interesting:

Assume $EH[1/4+\varpi]$ with the provisional value $\varpi=7/600$ . Using computations as in my paper (a polynomial approximation of terms $P_1^b(1-P_2)^c$ with $b+2c\le 13$ , we find for $k_0=68$ that $M_{68}>3.822...$ . We find then that $(1/4+7/600)M_{68}>1$ , and so two primes with $k_0=68$ , which corresponds to gaps of size 356.

21 November, 2013 at 1:08 pm

Terence Tao

That’s smaller than I would have expected, given that $1/4 + 7/600$ is only 5% or so bigger than 1/4. (Incidentally, with our conventions, we would use $EH[2 * (1/4+\varpi) ]$ rather than $EH[1/4+\varpi]$ . But I guess since $M_5$ is already as large as 2, and $M_k$ seems to grow logarithmically, it isn’t so unreasonable.

Do you have some preliminary data on how your lower bounds on $M_k$ behave in k for small k (e.g. up to 105)? This would give a baseline for further numerical improvement, and would also give a better sense as to what kind of growth $M_k$ should have (given that the values should be pretty good for small k, perhaps enough to do some reliable extrapolation).

Of course we have to play with MPZ instead of EH, which causes some difficulties (specifically, we can’t play on the entire simplex, but must truncate to the cube $[0, \frac{\delta}{1/4+\varpi}]^k$ ), but it does show that there is hopefully a fair bit of room for improvement.

Incidentally, the 7/600 value is not so provisional (once Emmanuel signs off on Section 10 of the Polymath8a paper, in particular, it would have been checked at least once). We have a much more tentative value of 13/1080 that is not fully verified (it requires some difficult algebraic geometry that we haven’t got around to yet) but it might be interesting to see how much more of a gain one could expect with this slightly better value.

21 November, 2013 at 4:10 pm

James Maynard

The growth looks roughly logarithmic.

A reasonable approximation for $10 <k_0<100$ is $M_{k_0} \approx 0.6\log{k_0}+1.3$ .Specifically, for $k_0=10,20,...100$ I get the following bounds: 2.53, 3.05, 3.34, 3.52, 3.66, 3.76, 3.84, 3.90, 3.95, 3.99.

21 November, 2013 at 1:27 pm

Stijn Hanson

Sorry for the stupid question but, In Maynard’s Lemma 5.1 (and roughly in here just below (10) ) it states that if $W, [d_1, e_1], \ldots, [d_k, e_k]$ are not pairwise coprime then $n \cong v_0 \pmod{W}$ and $[d_i,e_i]|n+h_i$ cannot be satisfied by any n. It’s utterly trivial if d divides n and one of the lcms but I can’t seem to understand why it works if d divides two of the lcms.

21 November, 2013 at 4:23 pm

Terence Tao

If d divides both $[d_i,e_i]$ and $[d_j,e_j]$ , then it divides $n+h_i$ and $n+h_j$ , and hence also $h_i-h_j$ . On the other hand, since $n+h_i$ is coprime to W, $d$ must be coprime to $W$ . Since $W$ is the product of all small primes, and in particular all primes dividing $h_i-h_j$ (if x is large enough), we obtain a contradiction.

22 November, 2013 at 8:30 am

Terence Tao

I think there is a probabilistic interpretation of M_k, as follows. Consider the following random process $x^{(0)}, x^{(1)}, \ldots \in [0,1]^k$ on the unit cube as follows:

* $x^{(0)}$ is chosen uniformly at random from the unit cube.

* Once $x^{(i)}$ is chosen, $x^{(i+1)}$ is chosen by taking $x^{(i)}$ , selecting a random coordinate of $x^{(i)}$ , and replacing it with a number chosen uniformly from [0,1] independently of all previous choices.

(For the chess-oriented, this process describes a (k-dimensional) rook moving randomly in the unit cube.)

The probability that $x^{(0)},\ldots,x^{(n)}$ all lie in the simplex is $\int_{{\cal R}_k} 1 k^{-n} {\cal L}^n 1$ , which asymptotically is $(M_k/k + o(1))^n$ as $n \to \infty$ . So $M_k$ is measuring the exponential decay of the probability that the randomly moving rook always stays inside the simplex.

Among other things, this suggests that a Monte Carlo method might be used to approximate $M_k$ , although this would not give rigorous deterministic bounds on $M_k$ .

24 January, 2014 at 9:37 am

Eytan Paldi

Is it possible to give to $M_k$ a quantum mechanical interpretation as well? (The problem is to give a quantum mechanical interpretation to the operator $\mathcal L$ – which may make sense in smaller dimensions like 3,4).
Perhaps quantum mechanical bounds (a kind of “uncertainty principle”) bounds may be obtained for $M_k$ ?

22 November, 2013 at 10:11 am

Terence Tao

Here are my (unfortunately somewhat lengthy) computations for the second moment $\int f {\mathcal L}^2 f$ for the function

$\displaystyle f(t_1,\ldots,t_k) = \prod_{i=1}^k k^{-1/2} m_2^{-1/2} g(k t_i) 1_{t_1+\ldots+t_k \leq 1}$

used in the large k analysis, with the quantities $m_1,m_2,\sigma,\mu$ defined in this previous comment. Note that $\int f^2 \leq 1$ , and hence

$\displaystyle M_k \geq \int f {\mathcal L} f$ (1)

and

$\displaystyle M_k^2 \geq \int f {\mathcal L}^2 f$ (2)

so we can get lower bounds on $M_k$ by computing either of the quantities in (1) and (2), and we know from Cauchy-Schwarz that (2) should give a slightly better bound.

As a warmup, let us first compute the RHS of (1). By symmetry, this becomes

$\displaystyle k \int f(t_1,t_2,\ldots,t_k) f(t'_1,t_2,\ldots,t_k)\ dt_1 dt'_1 dt_2 \ldots dt_k$

which if we restrict to the region $t_2+\ldots+t_k \leq 1 - T/k$ , simplifies to

$\displaystyle \frac{m_1^2}{m_2} {\bf P}( X_2+\ldots+X_k \leq k-T )$

where $X_1,\ldots,X_k$ are iid with distribution $\frac{1}{m_2} g(t)^2 dt$ on $[0,T]$ . This, together with the Chebyshev inequality, is what gives the lower bound

$\displaystyle M_k \geq \frac{m_1^2}{m_2} (1 - \frac{(k-1)\sigma^2}{k-T-(k-1)\mu} )$

which is what we had in the previous comment.

Now we compute the RHS of (2). By symmetry, this becomes the sum of

$\displaystyle k (k-1) \int f(t_1,t_2,t_3,\ldots,t_k) f(t'_1,t'_2,t_3,\ldots,t_k)$
$\displaystyle 1_{t'_1+t_2+\ldots+t_k \leq 1} \ dt_1 dt'_1 dt_2 dt'_2 dt_3 \ldots dt_k$

and

$\displaystyle k \int f(t_1,t_2,\ldots,t_k) f(t''_1,t_2,\ldots,t_k)$
$\displaystyle 1_{t'_1+t_2+\ldots+t_k \leq 1}\ dt_1 dt'_1 dt''_1 dt_2 \ldots dt_k$ .

For the first expression, we restrict to the region $t_3+\ldots+t_k \geq 1-2T/k$ and obtain a lower bound of

$\displaystyle \frac{k-1}{k} \frac{m_1^4}{m_2^2} {\bf P}( X_3+\ldots+X_k \leq k-2T)$ .

By Chebyshev, this is at least

$\displaystyle \frac{k-1}{k} \frac{m_1^4}{m_2^2} (1 - \frac{(k-2)\sigma^2}{k-2T-(k-2)\mu} )$ (3)

subject to the constraint

$\displaystyle k - 2T - (k-2) \mu > 0$ . (4)

Meanwhile,for the second expression we restrict to the region $t_2+\ldots+t_k \geq 1-T/k$ and obtain a lower bound of

$\displaystyle \frac{1}{k} (\int_0^T g(t)\ dt)^2 {\bf E} (k - X_2 - \ldots - X_k ) 1_{X_2+\ldots+X_k \leq k-T}$

which we can lower bound by

$\displaystyle \frac{1}{k} \frac{m_1^2}{m_2} {\bf E} (k - T - X_2 - \ldots - X_k )$

which is equal to

$\displaystyle \frac{1}{k} \frac{m_1^2}{m_2} (k - T - (k-1) \mu)$ . (5)

Thus we obtain the bound

$M_k^2 \geq (3) + (5)$

whenever $A,T>0$ obey the constraint (4).

Asymptotically, this does not seem to give too much of an improvement to the lower bound of $M_k$ (a back of the envelope calculation suggests that the gain is only $O(1)$ or less) but it might still give some noticeable improvement for the m=1 and m=2 numerics. (And if it doesn’t, it would provide some indication as to how close our test function F is to being an eigenfunction.)

22 November, 2013 at 1:47 pm

Pace Nielsen

Is your formula for (3) correct? Or should there be a square in the denominator of the rightmost fraction? [I ask, because three offset equations earlier, you have a formula that doesn’t match your earlier bound for $M_k$ as it it missing a square in the denominator of the rightmost fraction as well.]

22 November, 2013 at 1:52 pm

Terence Tao

Yes, you’re right; there should be a square in the denominator in both cases as per Chebyshev. So the first lower bound for M_k should be

$\displaystyle M_k\geq \frac{m_1^2}{m_2} ( 1 - \frac{(k-1)\sigma^2}{(k-T-(k-1)\mu)^2} )$

and the quantity in (3) should instead be

$\displaystyle \frac{k-1}{k} \frac{m_1^4}{m_2^2} (1 - \frac{(k-2)\sigma^2}{(k-2T-(k-2)\mu)^2} ).$

Thanks for spotting the error. But actually this error may end up in our favour, since I think the denominator is in fact greater than 1, so perhaps xfxie’s calculations below will in fact improve with this correction!

22 November, 2013 at 2:03 pm

Pace Nielsen

Unfortunately (or fortunately, depending on your frame of reference) xfxie’s calculation used the correct formula. When I wasn’t able to reproduce his results, I discovered the error.

22 November, 2013 at 3:03 pm

xfxie

Pace is right. For the implementation, I modified the code based on the obvious difference between M_k and (3) — so it might be the reason that the typo (which appeared in both places) was automatically skipped.

22 November, 2013 at 2:01 pm

Pace Nielsen

Assuming that the answer to my question is that the formula for (3) was incorrect, and that there should be a square (which I think is true, if I’m understanding your use of Chebyshev), we get (without using the $\delta,\varpi$ inequalities from Polymath 8a) the following:

For $m=1$ , $k=448,T=26.1665,A=6.15853$ .
For $m=2$ , $k=43134, T=1079.33, A=10.2938$ .
Under EH, $k=28, T=3.5308, A=3.52752$ .

23 November, 2013 at 1:28 am

Andrew Sutherland

This implies $H_2$ is at most 493528.

23 November, 2013 at 10:23 am

Andrew Sutherland

Actually, this can be improved slightly: 493510

23 November, 2013 at 4:40 pm

Andrew Sutherland

Further improved to 493458.

24 November, 2013 at 6:16 pm

Andrew Sutherland

Down to 493442.

25 November, 2013 at 1:53 pm

Andrew Sutherland

493436

26 November, 2013 at 1:53 am

Andrew Sutherland

493426

6 December, 2013 at 7:32 am

Andrew Sutherland

493408

22 November, 2013 at 5:07 pm

Terence Tao

I’m noting a rather technical complication with this second moment computation when one tries to combine it with $MPZ[\varpi,\delta]$ . When using MPZ, one has effectively limited oneself to using functions $f$ supported on the cube $[0, \frac{\delta}{1/4+\varpi}]^k$ to get smoothness of the moduli. However, while the original function f has this form (if $T \leq \frac{\delta}{1/4+\varpi} k$ ), higher iterates ${\mathcal L}^m f$ do not necessarily have this form. So, strictly speaking, one cannot combine the two improvements.

However, this is likely a fixable problem provided one puts in a certain amount of effort. The function ${\mathcal L}^m f$ is still supported on a set in which all but m of the coordinates are known to lie in $[0, \frac{\delta}{1/4+\varpi}]^k$ , which for small m almost guarantees multiple $x^\delta$ -dense divisibility, if not $x^\delta$ -smoothness (there will be an exceptional set in which this fails but it will be exponentially small). So it is likely that this technicality will have a negligible impact on the estimates, and hopefully will not degrade the value of k at all.

It’s all getting rather complicated; I will try to write up a fresh blog post soon to summarise the various things being explored here.

22 November, 2013 at 11:17 am

xfxie

For $\frac{1080}{13} \varpi + \frac{330}{13} \delta < 1$ , seems $k$ can drop significantly. Here is a possible solution 388, if my implementation is correct.

22 November, 2013 at 12:54 pm

Aubrey de Grey

xfxie, can you please clarify – which permutation of other assumptions are you using here? Or to put it another way, to which previously-mentioned k0 assuming varpi = 7/600 (if any) is your 388 the 13/1080 counterpart?

22 November, 2013 at 3:13 pm

xfxie

Aubery, 13/1080 is based on Terence’s calculation here (and the counterpart of k_0 is 603). It is also mentioned in an comment above by Terence. The result is not totally verified yet. But the calculation might provide a basic sense on how far the method can reach.

24 November, 2013 at 12:05 pm

Aubrey de Grey

Thanks, but actually that wasn’t quite my question: I know that 603 is the pre-Maynard k0 for varpi = 13/1080, but I was looking for the post-Maynard k0 for varpi = 7/600. If I’m not mistaken, Pace’s k0 of 448 assumes only Bombieri-Vinogradov, and until 13/1080 is fully confirmed I’m thinking that it might be of interest also to continue tracking the various k0 that arise from 7/600.

Also, incidentally, isn’t k0 = 42392 obsolete now? – isn’t that the m=2 counterpart of 508, which was superseded by Terry’s second-moment computations on Friday? I’m unclear as to why you and Drew are still refining H for 42392 rather than for the m=2 counterpart of 388 (which I don’t think anyone has posted yet).

24 November, 2013 at 6:35 pm

Andrew Sutherland

The situation with k0=42392 isn’t clear to me, but if someone has a better value, please post it; I agree there is no point in optimizing H values for an obsolete k0 (although it has motivated me to spend some time tweaking the algorithm to more efficiently handle k0 in this range, so the time isn’t wasted). I’d also be curious to see a k0 for m=3 if anyone is motivated enough to try it.

As far as I know the best “Deligne-free” k0 for m=2 is still 43134, but someone please correct me if I am wrong.

22 November, 2013 at 2:04 pm

Terence Tao

I think I now have an upper bound of $M_k$ of logarithmic type that matches the lower bound asymptotically (up to the $\log\log k$ correction terms).

First, let me describe the warmup problem which led me to discover the bound. Morally speaking (ignoring the $\sigma^2$ terms), the large k computations boil down to the problem of maximising $(\int_0^T g(t)\ dt)^2$ subject to the constraints $\int_0^T g(t)^2\ dt = 1$ and $\int_0^T t g(t)^2\ dt \leq 1$ . Calculus of variations tells us that the extremiser should be a multiple of $1/(1+At)$ for some suitable $A$ . Motivated by this, I wrote down the Cauchy-Schwarz inequality

$\displaystyle (\int_0^T g(t)\ dt)^2 \leq (\int_0^T (1+At) g(t)^2\ dt) (\int_0^T \frac{1}{1+At}\ dt)$

(knowing that equality is attained here when $g$ is a multiple of $1/(1+At)$ ) and the right-hand side evaluates to be bounded by

$\displaystyle (1+ \frac{1}{A}) \log(1+AT)$

which optimises to essentially $\log T$ (up to errors of $\log\log T$ ) after setting $A = \log T$ (say).

This then led me to an analogous argument in the multidimensional setting (after a rescaling). Namely, Cauchy-Schwarz gives

$\displaystyle (\int_0^{1-t_1-\ldots-t_{k-1}} f(t_1,\ldots,t_k)\ dt_k)^2 \leq$
$\displaystyle (\int_0^{1-t_1-\ldots-t_{k-1}} (1+k A t_k) f(t_1,\ldots,t_k)^2\ dt_k)$
$\displaystyle \times (\int_0^1 (1+k At_k)^{-1}\ dt_k)$ .

The latter integral evaluates to $\frac{1}{kA} \log(1+kA)$ . Integrating in the remaining variables $t_1,\ldots,t_{k-1}$ , we conclude that

$\displaystyle J_k^{(k)}(f) \leq \frac{1}{kA} \log(1+kA) \int_{{\cal R}_k} (1+kAt_k) f(t_1,\ldots,t_k)^2\ dt_1 \ldots t_k.$

Similarly for permutations. Summing and using the fact that $t_1 +\ldots+t_k \leq 1$ on the simplex, we conclude that

$\displaystyle \sum_{m=1}^k J_k^{(m)}(f) \leq \frac{1}{kA} \log(1+kA) \int_{{\cal R}_k} (k+kA) f(t_1,\ldots,t_k)^2\ dt_1 \ldots t_k$

$\displaystyle = (1+\frac{1}{A}) \log(1+Ak) I_k(f)$

and so we have the upper bound

$\displaystyle M_k \leq (1 + \frac{1}{A}) \log(1+Ak)$

for any $A>0$ ; again setting $A = \log k$ , we obtain an upper bound of the form

$\displaystyle M_k \leq \log k + \log \log k + O(1)$

and one can probably optimise this a bit further.

22 November, 2013 at 3:05 pm

Eytan Paldi

For small $k$ , this is quite close to Maynard’s lower bounds logarithmic approximation above.

2 April, 2014 at 7:28 am

Gergely Harcos

Just a very small note that the $O(1)$ term can be taken as $2$ for all $k\geq 2$ .

22 November, 2013 at 2:23 pm

Eytan Paldi

It seems that it should be “…multiple of $1/(1+ A t)$ …”.

[Corrected, thanks – T.]

22 November, 2013 at 2:28 pm

Eytan Paldi

Sorry, at first sight it seems to me $1+A t$ (instead of its inverse.)

22 November, 2013 at 2:28 pm

Pace Nielsen

Minor typo: Eight lines from the bottom, $(1+A)$ should be $(1+Ak)$ .

[Actually, I think it is correct as it stands, although I had skipped a step involving cancelling a factor of $k$ which I have now put back in to try to reduce confusion. -T.]

22 November, 2013 at 5:36 pm

Pace Nielsen

You are right. Sorry about that!

22 November, 2013 at 7:41 pm

Polymath8b, II: Optimising the variational problem and the sieve | What's new

[…] for small values of (in particular ) or asymptotically as . The previous thread may be found here. The currently best known bounds on […]

22 November, 2013 at 7:44 pm

Terence Tao

Rolling over the thread to a fresh one in order to make it easier to catch up on the discussion.

23 November, 2013 at 2:27 am

Nikita Sidorov

Out of curiosity – what does the numerics suggest for $H_m$?

27 November, 2013 at 6:59 am

Primzahlenzwillinge | UGroh's Weblog

[…] Artikel in dem Quanta Magazin findet man mehr über die neusten Entwicklungen dazu (siehe auch den Blog von Terence Tao […]

7 December, 2013 at 4:06 pm

Ultraproducts as a Bridge Between Discrete and Continuous Analysis | What's new

[…] transfer from the corresponding laws for the standard natural numbers . For a more topical example, it is now a theorem that given any standard natural number , there exist standard primes such that ; it is an […]

8 December, 2013 at 4:47 pm

Polymath8b, III: Numerical optimisation of the variational problem, and a search for new sieves | What's new

[…] the notation of this previous post, we have the lower […]

20 December, 2013 at 2:05 pm

Polymath8b, IV: Enlarging the sieve support, more efficient numerics, and explicit asymptotics. | What's new

[…] products supported on small cubes in avoiding (5). For the GPY argument to work (as laid out in this previous post), we need the […]

8 January, 2014 at 11:21 am

Polymath8b, V: Stretching the sieve support further | What's new

[…] baseline bounds for the numerator and denominator in (1) (as established for instance in this previous post) are as follows. If is supported on the […]

28 January, 2014 at 9:19 pm

Polymath8b, VII: Using the generalised Elliott-Halberstam hypothesis to enlarge the sieve support yet further | What's new

[…] wish to understand the correlation of various products of divisor sums on . For instance, in this previous blog post, the […]

2 February, 2014 at 3:44 pm

Коллективный разум в теории чисел » CreativLabs

[…] растерялся и огранизовал новый коллективный проект polymath8b, направленный на улучшение вновь полученной границы. […]

2 February, 2014 at 6:43 pm

Eytan Paldi

On the dependence of the current M-value on $\epsilon$ :

According to some preliminary analysis, it seems that for $\epsilon \geq 1/4$ , we have the linear dependence

$\hat{M}''_{3, \epsilon, 1} = \frac{3}{2} + (1-\epsilon)C$

For some (still undetermined) absolute positive constant C (related to the solution of a certain integral equation). Interestingly, it is easy to determine the dependence on $\epsilon$ without knowing the constant C!

I suggest to verify empirically this linear dependence.

14 March, 2014 at 11:17 am

Urns and the Dirichlet Distribution | Eventually Almost Everywhere

[…] Polymath8b: Bounded intervals with many primes, after Maynard (terrytao.wordpress.com) […]

1 April, 2014 at 10:42 am

Gergely Harcos

I think that (8) in Proposition 7 is valid for all $k_0$ , not just for $k_0$ sufficiently large. First, we can assume that $k_0 > 200$ , since otherwise the right hand side of (8) is negative. Then, in the bound (8.17) of Maynard’s paper, we can choose $\log k_0 > A > 3$ such that $A^2e^A = k_0$ . Clearly, $e^A = k_0/A^2 > k_0/\log^2 k_0$ , i.e. $A > \log k_0-2\log\log k_0$ . In addition, on the right hand side of (8.17), $A/(e^A-1) < 1/6$ and $e^A/k_0 = 1/A^2$ . $1/A^2$ is less than $1/9$ , hence the right hand side of (8.17) exceeds $A-1/(1-1/6-1/9)^2$ , which in turn is greater than $\log k_0-2\log\log k_0-2$ .

31 July, 2014 at 8:33 am

Together and Alone, Closing the Prime Gap | For a better Vietnam

[…] However, Tao predicted that the project’s participants will not be able to resist immediately sinking their teeth into Maynard’s new preprint. “It’s like red meat,” he […]

6 December, 2015 at 2:23 am

Anonymous

Why we need to choose the simplex such? What to change if you take $R_k=\{\dots(x)\in[0,1]^k\}$?

6 December, 2015 at 4:28 pm

Terence Tao

The restriction to the simplex is needed to force the products $d_1 \dots d_{k_0}$ and $d'_1 \dots d'_{k_0}$ to be bounded by $R$ (see second display before (11)), which is in turn needed in order to have good equidistribution estimates available (with error term smaller than the main term).

21 March, 2016 at 9:49 am

Anonymous

If you apply the approach of Maynard to $E_2=\left{x\in \mathbb{N}|\Omega(x)=2\right}=\left{q_{n}}_{n=1}^{\infty}$ , then to assess the difference $\liminf_{n}(q_{n+r_{k}-1}-q_{n})<h_k-h_1$ we must demand $\frac{\log N}{N}\sum_{N<nr_k-1$ . But $\sum_{N<n<2N}\chi_{E_2}(n)~\frac{N \log\log N}{\log N}$ . Choosing N large enough, we achieve the minimum distance for any r, which obviously leads to a contradiction. Where am I wrong?

[Wordpress treats < and > as HTML, causing text with both symbols in it to be corrupted. Use < and > instead – T.]

26 May, 2016 at 8:45 am

Anonymous

Where in the proof of the asymptotics for sums (10),(11) significantly that the set is admissible? Thanks.

26 May, 2016 at 9:07 am

Terence Tao

One needs admissibility in order to have at least one residue class $b\ (W)$ with all of the $b+h_i$ coprime to $W$ . So estimates such as (1) and (2) are vacuously true in the non-admissible case (because there is no $b$ that obeys the requirements).

26 May, 2016 at 10:33 am

Anonymous

Is it possible to formulate the allowed set is defined for arbitrary subsets of natural numbers? For example, if we consider the set of Prime numbers,the distance between which is greater than 1000, then the asymptotic amount and level of distribution will not change. Then, using Your approach, we would have the presence of smaller distances of 600 and would come to a contradiction. This is due to the choice of the set H. We knew in advance that, for example, n and n+2 could not lie simultaneously in our set and therefore had no right to choose so.

26 May, 2016 at 10:39 am

Terence Tao

I’m not fully sure I understand your question, but this method has been extended to find small gaps in other sets of numbers than the primes, see e.g. the work of Benatar, Thorner, Li-Pan, Pollack, Maynard, Chua-Park-Smith, and Baker-Zhao (this may not be a complete list of references).

17 June, 2016 at 5:13 am

Anonymous

Good afternoon!
Take the set of primes p such that p+2 and p-2 is obviously not a prime number. The asymptotic number of integers in this set and the level of distribution are the same as primes. Can we apply to this set the approach of Maynard? Thank you!

18 June, 2016 at 7:22 am

Terence Tao

It depends on what you mean by “obviously”. If you mean that $p +2$ and $p-2$ are each divisible by some small prime (e.g. 3 and 5 respectively), then as you say one has the required level of distribution, and it is not difficult to adapt the arguments (there are already several papers in the literature dealing with bounded gaps between special sets of primes obeying these sorts of level of distribution axioms). But if you only require that $p+2,p-2$ are both composite, then a level of distribution result (with error terms that are better than the main terms by an arbitrary power of $\log x$ ) would basically require the conjectural asymptotics for the twin prime conjecture, which is of course still open.

19 June, 2016 at 2:32 pm

Anonymous

Could You provide a link to the literature?

20 June, 2016 at 7:35 am

Terence Tao

See my previous comment on this blog post at 2016/05/26 at 10:39 am.

20 February, 2017 at 4:19 am

Anonymous

What is known about such sums with weights fourth degree or more? Why we study a second degree? Thank you.

20 February, 2017 at 10:04 am

Anonymous

It seems that fourth degree weights should have narrower peaks (i.e. more selective) which is somewhat analogous to Jackson kernel (fourth degree weights) versus Fejer kernel (second degree weights) in approximation theory.

24 February, 2017 at 11:10 am

Anonymous

Where I could read more about it? I tried to bring the corresponding coefficients of the main parts, and got unexpectedly complicated structure.

25 February, 2017 at 8:55 am

Anonymous

The idea to use such weights already appeared in a comment (10 August, 2014, at 11:38 am) in the 36-th Polymath8 thread (from 21 July 2014).

20 February, 2017 at 3:08 pm

sylvainjulien

Assuming Goldbach’s conjecture, and denoting by $ latex r_{0}(n) $ the smallest positive integer $ latex r $ such that both $ latex n-r $ and $ latex n+r $ are primes for latex n$ a large enough composite integer, and by $ latex l0(n) $ the number of primes in the interval $ latex [n-r_{0}(n),n+r_{0}(n)] $ , what would be the best upper bound for $ latex l_{0}(n) $ in terms of $ latex r_{0}(n) $ ? Can $ latex l_{0}(n)\ll_{h} (r_{0}(n))^{1/2+h} $ hold true under GRH ?

26 February, 2017 at 9:30 am

Anonymous

Dear Terry,

Could you please answer the following question: if we consider the weight of the fourth degree, the $I_k$ values will look like as in the second degree with the exception of the degree of $F$ ? I used the Fourier transform method and received a rather complex structure. So I’m trying to understand if I have a mistake.

Thank you!

11 March, 2017 at 8:50 am

Anonymous

What restrictions need to require the function $f(x_1,...,x_{k0}$ to compute values $\alpha$ and $\beta$ in the original form (5),(6)?
For example, if we take $f(x,y)=e^{x+y}$ then $\frac{\alpha}{\beta}=3.046.$ Where am I making a mistake?

15 March, 2017 at 9:44 pm

Terence Tao

This function is not supported on the simplex $\Delta_2$ (and if one truncates the function to this simplex, it will no longer be smooth).

15 March, 2017 at 11:24 pm

Anonymous

Then which function to choose?

12 March, 2017 at 1:53 pm

sylvainjulien

Dear Terry,

Can an interesting lower bound for the de Polignac constant as introduced in http://math.stackexchange.com/questions/2183872/lower-bound-for-the-de-polignac-constant be obtained as of today (March 12th 2017) ?
Many thanks in advance.

3 October, 2020 at 4:38 pm

Hoa Huynh

Assuming the GEH, the bounded gap is 6.
In Wikipedia of “Polignac’s conjecture”, in section “Conjectured density”, there is a statement:
Twin primes have the same conjectured density as cousin primes, and half that of sexy primes.
Together, don’t they imply that all 3: Twin Prime, Cousin Prime and Sexy Prime have infinite counts?

4 October, 2020 at 5:27 pm

Terence Tao

Conjecturally, yes. But this is not yet proven.

5 October, 2020 at 5:57 am

sylvainjulien

You may be interested in this: https://mathoverflow.net/questions/358027/a-conditional-approach-to-twin-prime-conjecture?r=SearchResults

6 October, 2020 at 6:52 am

Hoa Huynh

For “admissible tuples”, the first element’s position or pattern is fixed for “k equal to or greater than 6.”
Examples:
For k = 5, we have 2 patterns: (p, p + 2, p + 6, p + 8, p + 12)
16061, 16063, 16067, 16069, 16073 (*)
21011, 21013, 21017, 21019, 21023
43781, 43783, 43787, 43789, 43793 (*)
247601, 247603, 247607, 247609, 247613
1063961, 1063963, 1063967, 1063969, 1063973
1091261, 1091263, 1091267, 1091269, 1091273 (*)
1246361, 1246363, 1246367, 1246369, 1246373

and (p, p + 4, p + 6, p + 10, p + 12)
15727, 15731, 15733, 15737, 15739
16057, 16061, 16063, 16067, 16069 (*)
43777, 43781, 43783, 43787, 43789 (*)
79687, 79691, 79693, 79697, 79699
736357, 736361, 736363, 736367, 736369
1091257, 1091261, 1091263, 1091267, 1091269 (*)
1155607, 1155611, 1155613, 1155617, 1155619
——–
For k = 6, we have 1 pattern: (p, p + 4, p + 6, p + 10, p + 12, p + 16)
16057, 16061, 16063, 16067, 16069, 16073 (*)
43777, 43781, 43783, 43787, 43789, 43793 (*)
1091257, 1091261, 1091263, 1091267, 1091269, 1091273 (*)
——–
There are many many more of them, the examples are chosen to illustrate the points: pattern for k = 6 exists in k = 5; and the 4 middle elements are in the same decade, and the 2 outside ones are in adjacent decades.
The unit digits for k = 6 are fixed: 7, 1, 3, 7, 9, 3. The obvious reason is to avoid divisibility by 5.
Likewise, “fixed unit digit patterns” also happen for k > 6, for “optimal sieving.”
——–
Has this information/fact been used in some proofs of bounded gap?

8 October, 2020 at 7:44 am

Hoa Huynh

For the 105-tuple, we can not place its FIRST element at even-numbered locations, because it sieves all even numbers.

It has 26 numbers ended with 0, let’s call it group A. We would NOT place it at locations ended with 5, because we waste 25% of the “resource.”
Group A (26 counts)
0, 10, 30, 70, 90, 100,
120, 180, 190,
220, 250, 280,
300, 310, 330, 360,
390, 400, 420, 430, 450,
490, 510,
570, 580, 600
——————-
Let’s group the rest of 105-tuple in numbers ended with 2, 4, 6 and 8. Note that we don’t have any number ended with 6. So, we have 3 more groups B, C and D, as followed.
Group B (27 counts)
12, 42, 52, 72, 82,
112, 132,
192, 202, 222, 232, 252, 262,
322, 342, 352, 372,
402, 412, 432, 442,
462, 472, 492, 532,
562, 582
—–
Group C (27 counts)
24, 34, 54, 64, 94,
114, 124, 154, 174, 184,
204, 234, 264,
294, 324, 334, 364,
384, 394, 444,
454, 484, 504, 534,
544, 574, 594
——-
Group D (25 counts)
28, 48, 78,
118, 138, 148, 168, 178,
208, 258, 268,
288, 328, 358,
378, 408, 418,
468, 478, 498, 528, 538,
558, 588, 598
—————————
We would not place FIRST element of 105-tuple at locations ended with 3, because group B would waste 25% of the “resource.”
Likewise, for locations ended with 1 and 7, for groups C and D, respectively.
—————————
Therefore, the most strategic locations to place 105-tuple are where they are ended with 9, in order to maximize the result.
Same reason for all k-tuples, with k > 6.

26 January, 2023 at 10:48 am

Infinite partial sumsets in the primes | What's new

[…] that powers the proof of that theorem (though we use a formulation of that sieve closer to that in this blog post of mine). Indeed, the failure of (iii) basically arises from the failure of Maynard’s theorem for […]

1 February, 2023 at 4:07 pm

Infinite partial sumsets in the primes – Terence Tao – A2Z Facts

	Terence Tao on On product representations of…
	domotorp on On product representations of…
	Terence Tao on 275A, Notes 3: The weak and st…
	Terence Tao on A symmetric formulation of the…
	Anonymous on On product representations of…
	Anonymous on 275A, Notes 3: The weak and st…
	Anonymous on 275A, Notes 3: The weak and st…
	Alex Gunning on A symmetric formulation of the…
	Anonymous on Erratum for “An inverse…
	Anonymous on Erratum for “An inverse…
	Anonymous on Erratum for “An inverse…
	Anonymous on 275A, Notes 3: The weak and st…
	Anonymous on It ought to be common knowledg…
	Anonymous on Work hard
	Aleksandar on 245C, Notes 4: Sobolev sp…

Polymath8b: Bounded intervals with many primes, after Maynard

Recent Comments

Articles by others

Diversions

Mathematics

Selected articles

Software

The sciences

Top Posts

Archives

Categories

The Polymath Blog

161 comments

Leave a comment Cancel reply

For commenters

Polymath8b: Bounded intervals with many primes, after Maynard

Share this:

Recent Comments

Articles by others

Diversions

Mathematics

Selected articles

Software

The sciences

Top Posts

Archives

Categories

The Polymath Blog

161 comments

Leave a comment Cancel reply

For commenters