There are multiple purposes to this blog post.

The first purpose is to announce the uploading of the paper “New equidistribution estimates of Zhang type, and bounded gaps between primes” by D.H.J. Polymath, which is the main output of the Polymath8a project on bounded gaps between primes, to the arXiv, and to describe the main results of this paper below the fold.

The second purpose is to roll over the previous thread on all remaining Polymath8a-related matters (e.g. updates on the submission status of the paper) to a fresh thread. (Discussion of the ongoing Polymath8b project is however being kept on a separate thread, to try to reduce confusion.)

The final purpose of this post is to coordinate the writing of a retrospective article on the Polymath8 experience, which has been solicited for the Newsletter of the European Mathematical Society. I suppose that this could encompass both the Polymath8a and Polymath8b projects, even though the second one is still ongoing (but I think we will soon be entering the endgame there). I think there would be two main purposes of such a retrospective article. The first one would be to tell a story about the process of conducting mathematical research, rather than just describe the outcome of such research; this is an important aspect of the subject which is given almost no attention in most mathematical writing, and it would be good to be able to capture some sense of this process while memories are still relatively fresh. The other would be to draw some tentative conclusions with regards to what the strengths and weaknesses of a Polymath project are, and how appropriate such a format would be for other mathematical problems than bounded gaps between primes. In my opinion, the bounded gaps problem had some fairly unique features that made it particularly amenable to a Polymath project, such as (a) a high level of interest amongst the mathematical community in the problem; (b) a very focused objective (“improve {H}!”), which naturally provided an obvious metric to measure progress; (c) the modular nature of the project, which allowed for people to focus on one aspect of the problem only, and still make contributions to the final goal; and (d) a very reasonable level of ambition (for instance, we did not attempt to prove the twin prime conjecture, which in my opinion would make a terrible Polymath project at our current level of mathematical technology). This is not an exhaustive list of helpful features of the problem; I would welcome other diagnoses of the project by other participants.

With these two objectives in mind, I propose a format for the retrospective article consisting of a brief introduction to the polymath concept in general and the polymath8 project in particular, followed by a collection of essentially independent contributions by different participants on their own experiences and thoughts. Finally we could have a conclusion section in which we make some general remarks on the polymath project (such as the remarks above). I’ve started a Dropbox subfolder for this article (currently in a very skeletal outline form only), and will begin writing a section on my own experiences; other participants are of course encouraged to add their own sections (it is probably best to create separate files for these, and then input them into the main file retrospective.tex, to reduce edit conflicts). If there are participants who wish to contribute but do not currently have access to the Dropbox folder, please email me and I will try to have you added (or else you can supply your thoughts by email, or in the comments to this post; we may have a section for shorter miscellaneous comments from more casual participants, for people who don’t wish to write a lengthy essay on the subject).

As for deadlines, the EMS Newsletter would like a submitted article by mid-April in order to make the June issue, but in the worst case, it will just be held over until the issue after that.

Read the rest of this entry »

This is the seventh thread for the Polymath8b project to obtain new bounds for the quantity

\displaystyle  H_m := \liminf_{n \rightarrow\infty} (p_{n+m} - p_n),

either for small values of {m} (in particular {m=1,2}) or asymptotically as {m \rightarrow \infty}. The previous thread may be found here. The currently best known bounds on {H_m} can be found at the wiki page.

The current focus is on improving the upper bound on {H_1} under the assumption of the generalised Elliott-Halberstam conjecture (GEH) from {H_1 \leq 8} to {H_1 \leq 6}. Very recently, we have been able to exploit GEH more fully, leading to a promising new expansion of the sieve support region. The problem now reduces to the following:

Problem 1 Does there exist a (not necessarily convex) polytope {R \subset [0,2]^3} with quantities {0 \leq \varepsilon_1,\varepsilon_2,\varepsilon_3 \leq 1}, and a non-trivial square-integrable function {F: {\bf R}^3 \rightarrow {\bf R}} supported on {R} such that

  • {R + R \subset \{ (x,y,z) \in [0,4]^3: \min(x+y,y+z,z+x) \leq 2 \},}
  • {\int_0^\infty F(x,y,z)\ dx = 0} when {y+z \geq 1+\varepsilon_1};
  • {\int_0^\infty F(x,y,z)\ dy = 0} when {x+z \geq 1+\varepsilon_2};
  • {\int_0^\infty F(x,y,z)\ dz = 0} when {x+y \geq 1+\varepsilon_3};

and such that we have the inequality

\displaystyle  \int_{y+z \leq 1-\varepsilon_1} (\int_{\bf R} F(x,y,z)\ dx)^2\ dy dz

\displaystyle + \int_{z+x \leq 1-\varepsilon_2} (\int_{\bf R} F(x,y,z)\ dy)^2\ dz dx

\displaystyle + \int_{x+y \leq 1-\varepsilon_3} (\int_{\bf R} F(x,y,z)\ dz)^2\ dx dy

\displaystyle  > 2 \int_R F(x,y,z)^2\ dx dy dz?

An affirmative answer to this question will imply {H_1 \leq 6} on GEH. We are “within two percent” of this claim; we cannot quite reach {2} yet, but have got as far as {1.962998}. However, we have not yet fully optimised {F} in the above problem. In particular, the simplex

\displaystyle  R = \{ (x,y,z) \in [0,2]^3: x+y+z \leq 3/2 \}

is now available, and should lead to some noticeable improvement in the numerology.
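As a quick sanity check (a sketch added for illustration, not taken from the project notes), one can verify directly that this simplex obeys the sumset condition in Problem 1. If {(x,y,z)} lies in {R+R}, then each coordinate lies in {[0,3] \subset [0,4]} and {x+y+z \leq 3}, so

\displaystyle  \min(x+y,y+z,z+x) \leq \frac{(x+y)+(y+z)+(z+x)}{3} = \frac{2(x+y+z)}{3} \leq 2,

since the minimum of three numbers is at most their average.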

There is also a very slim chance that the twin prime conjecture is now provable on GEH. It would require an affirmative solution to the following problem:

Problem 2 Does there exist a (not necessarily convex) polytope {R \subset [0,2]^2} with quantities {0 \leq \varepsilon_1,\varepsilon_2 \leq 1}, and a non-trivial square-integrable function {F: {\bf R}^2 \rightarrow {\bf R}} supported on {R} such that

  • {R + R \subset \{ (x,y) \in [0,4]^2: \min(x,y) \leq 2 \}}

    \displaystyle  = [0,2] \times [0,4] \cup [0,4] \times [0,2],

  • {\int_0^\infty F(x,y)\ dx = 0} when {y \geq 1+\varepsilon_1};
  • {\int_0^\infty F(x,y)\ dy = 0} when {x \geq 1+\varepsilon_2};

and such that we have the inequality

\displaystyle  \int_{y \leq 1-\varepsilon_1} (\int_{\bf R} F(x,y)\ dx)^2\ dy

\displaystyle + \int_{x \leq 1-\varepsilon_2} (\int_{\bf R} F(x,y)\ dy)^2\ dx

\displaystyle  > 2 \int_R F(x,y)^2\ dx dy?

We suspect that the answer to this question is negative, but have not formally ruled it out yet.

For the rest of this post, I will justify why positive answers to these sorts of variational problems are sufficient to get bounds on {H_1} (or more generally {H_m}).

Read the rest of this entry »

This is the fourth thread for the Polymath8b project to obtain new bounds for the quantity

\displaystyle  H_m := \liminf_{n \rightarrow\infty} (p_{n+m} - p_n),

either for small values of {m} (in particular {m=1,2}) or asymptotically as {m \rightarrow \infty}. The previous thread may be found here. The currently best known bounds on {H_m} are:

  • (Maynard) Assuming the Elliott-Halberstam conjecture, {H_1 \leq 12}.
  • (Polymath8b, tentative) {H_1 \leq 272}. Assuming Elliott-Halberstam, {H_2 \leq 272}.
  • (Polymath8b, tentative) {H_2 \leq 429{,}822}. Assuming Elliott-Halberstam, {H_4 \leq 493{,}408}.
  • (Polymath8b, tentative) {H_3 \leq 26{,}682{,}014}. (Presumably a comparable bound also holds for {H_6} on Elliott-Halberstam, but this has not been computed.)
  • (Polymath8b) {H_m \leq \exp( 3.817 m )} for sufficiently large {m}. Assuming Elliott-Halberstam, {H_m \ll m e^{2m}} for sufficiently large {m}.

While the {H_1} bound on the Elliott-Halberstam conjecture has not improved since the start of the Polymath8b project, there is reason to hope that it will soon fall, hopefully to {8}. This is because we have begun to exploit more fully the fact that when using “multidimensional Selberg-GPY” sieves of the form

\displaystyle  \nu(n) := \sigma_{f,k}(n)^2

with

\displaystyle  \sigma_{f,k}(n) := \sum_{d_1|n+h_1,\dots,d_k|n+h_k} \mu(d_1) \dots \mu(d_k) f( \frac{\log d_1}{\log R},\dots,\frac{\log d_k}{\log R}),

where {R := x^{\theta/2}}, it is not necessary for the smooth function {f: [0,+\infty)^k \rightarrow {\bf R}} to be supported on the simplex

\displaystyle {\cal R}_k := \{ (t_1,\dots,t_k)\in [0,1]^k: t_1+\dots+t_k \leq 1\},

but can in fact be allowed to range on larger sets. First of all, {f} may instead be supported on the slightly larger polytope

\displaystyle {\cal R}'_k := \{ (t_1,\dots,t_k)\in [0,1]^k: t_1+\dots+t_{j-1}+t_{j+1}+\dots+t_k \leq 1

\displaystyle  \hbox{ for all } j=1,\dots,k\}.

However, it turns out that more is true: given a sufficiently general version of the Elliott-Halberstam conjecture {EH[\theta]} at the given value of {\theta}, one may work with functions {f} supported on more general domains {R}, so long as the sumset {R+R := \{ t+t': t,t'\in R\}} is contained in the non-convex region

\displaystyle  \bigcup_{j=1}^k \{ (t_1,\dots,t_k)\in [0,\frac{2}{\theta}]^k: t_1+\dots+t_{j-1}+t_{j+1}+\dots+t_k \leq 2 \} \cup \frac{2}{\theta} \cdot {\cal R}_k, \ \ \ \ \ (1)

and also provided that the restriction

\displaystyle  (t_1,\dots,t_{j-1},t_{j+1},\dots,t_k) \mapsto f(t_1,\dots,t_{j-1},0,t_{j+1},\dots,t_k) \ \ \ \ \ (2)

is supported on the simplex

\displaystyle {\cal R}_{k-1} := \{ (t_1,\dots,t_{j-1},t_{j+1},\dots,t_k)\in [0,1]^{k-1}:

\displaystyle t_1+\dots+t_{j-1}+t_{j+1}+\dots+t_k \leq 1\}.

More precisely, if {f} is a smooth function, not identically zero, with the above properties for some {R}, and the ratio

\displaystyle  \sum_{j=1}^k \int_{{\cal R}_{k-1}} f_{1,\dots,j-1,j+1,\dots,k}(t_1,\dots,t_{j-1},0,t_{j+1},\dots,t_k)^2 \ \ \ \ \ (3)

\displaystyle dt_1 \dots dt_{j-1} dt_{j+1} \dots dt_k

\displaystyle  / \int_R f_{1,\dots,k}^2(t_1,\dots,t_k)\ dt_1 \dots dt_k

is larger than {\frac{2m}{\theta}}, then the claim {DHL[k,m+1]} holds (assuming {EH[\theta]}), and in particular {H_m \leq H(k)}.

I’ll explain why one can do this below the fold. Taking this for granted, we can rewrite this criterion in terms of the mixed derivative {F := f_{1,\dots,k}}, the upshot being that if one can find a smooth function {F} supported on {R} that obeys the vanishing marginal conditions

\displaystyle  \int F( t_1,\dots,t_k )\ dt_j = 0

whenever {1 \leq j \leq k} and {t_1+\dots+t_{j-1}+t_{j+1}+\dots+t_k > 1}, and the ratio

\displaystyle  \frac{\sum_{j=1}^k J_k^{(j)}(F)}{I_k(F)} \ \ \ \ \ (4)

is larger than {\frac{2m}{\theta}}, where

\displaystyle  I_k(F) := \int_R F(t_1,\dots,t_k)^2\ dt_1 \dots dt_k

and

\displaystyle  J_k^{(j)}(F) := \int_{{\cal R}_{k-1}} (\int_0^{1/\theta} F(t_1,\dots,t_k)\ dt_j)^2 dt_1 \dots dt_{j-1} dt_{j+1} \dots dt_k

then {DHL[k,m+1]} holds. (To equate these two formulations, it is convenient to assume that {R} is a downset, in the sense that whenever {(t_1,\dots,t_k) \in R}, the entire box {[0,t_1] \times \dots \times [0,t_k]} lies in {R}, but one can easily enlarge {R} to be a downset without destroying the containment of {R+R} in the non-convex region (1).) One initially requires {F} to be smooth, but a limiting argument allows one to relax to bounded measurable {F}. (To approximate a rough {F} by a smooth {F} while retaining the required moment conditions, one can first apply a slight dilation and translation so that the marginals of {F} are supported on a slightly smaller version of the simplex {{\cal R}_{k-1}}, and then convolve by a smooth approximation to the identity to make {F} smooth, while keeping the marginals supported on {{\cal R}_{k-1}}.)

We are now exploring various choices of {R} to work with, including the prism

\displaystyle  \{ (t_1,\dots,t_k) \in [0,1/\theta]^k: t_1+\dots+t_{k-1} \leq 1 \}

and the symmetric region

\displaystyle  \{ (t_1,\dots,t_k) \in [0,1/\theta]^k: t_1+\dots+t_k \leq \frac{k}{k-1} \}.

By suitably subdividing these regions into polytopes, and working with piecewise polynomial functions {F} that are polynomial of a specified degree on each subpolytope, one can phrase the problem of optimising (4) as a quadratic program, which we have managed to work with for {k=3}. Extending this program to {k=4}, there is a decent chance that we will be able to obtain {DHL[4,2]} on EH.

We have also been able to numerically optimise {M_k} quite accurately for medium values of {k} (e.g. {k \sim 50}), which has led to improved values of {H_1} without EH. For large {k}, we now also have the asymptotic {M_k=\log k - O(1)} with explicit error terms (details here) which have allowed us to slightly improve the {m=2} numerology, and also to get explicit {m=3} numerology for the first time.

Read the rest of this entry »

Mertens’ theorems are a set of classical estimates concerning the asymptotic distribution of the prime numbers:

Theorem 1 (Mertens’ theorems) In the asymptotic limit {x \rightarrow \infty}, we have

\displaystyle  \sum_{p\leq x} \frac{\log p}{p} = \log x + O(1), \ \ \ \ \ (1)

\displaystyle  \sum_{p\leq x} \frac{1}{p} = \log \log x + O(1), \ \ \ \ \ (2)

and

\displaystyle  \sum_{p\leq x} \log(1-\frac{1}{p}) = -\log \log x - \gamma + o(1) \ \ \ \ \ (3)

where {\gamma} is the Euler-Mascheroni constant, defined by requiring that

\displaystyle  1 + \frac{1}{2} + \ldots + \frac{1}{n} = \log n + \gamma + o(1) \ \ \ \ \ (4)

in the limit {n \rightarrow \infty}.

The third theorem (3) is usually stated in exponentiated form

\displaystyle  \prod_{p \leq x} (1-\frac{1}{p}) = \frac{e^{-\gamma}+o(1)}{\log x},

but in the logarithmic form (3) we see that it is strictly stronger than (2), in view of the asymptotic {\log(1-\frac{1}{p}) = -\frac{1}{p} + O(\frac{1}{p^2})}.
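The exponentiated form is easy to test numerically. The following is a minimal sketch (not part of the original argument): it assumes a hard-coded value of {\gamma} and uses a simple sieve of Eratosthenes, with all function names chosen here purely for illustration. It prints the ratio of {\log x \prod_{p \leq x} (1-\frac{1}{p})} to {e^{-\gamma}}, which should approach {1} as {x} grows.

```python
from math import exp, log

# Euler-Mascheroni constant, hard-coded (an assumption of this sketch).
EULER_GAMMA = 0.5772156649015329


def primes_up_to(limit):
    """Return all primes p <= limit using the sieve of Eratosthenes."""
    is_prime = [True] * (limit + 1)
    is_prime[0:2] = [False, False]
    for n in range(2, int(limit ** 0.5) + 1):
        if is_prime[n]:
            for multiple in range(n * n, limit + 1, n):
                is_prime[multiple] = False
    return [n for n, flag in enumerate(is_prime) if flag]


def mertens_ratio(x):
    """Return log(x) * prod_{p <= x} (1 - 1/p), divided by exp(-gamma)."""
    product = 1.0
    for p in primes_up_to(x):
        product *= 1.0 - 1.0 / p
    return product * log(x) / exp(-EULER_GAMMA)


# The ratio should drift towards 1 as x increases.
for x in (10**3, 10**4, 10**5):
    print(x, mertens_ratio(x))
```

Of course, a numerical experiment of this kind only illustrates the statement; it says nothing about the size of the {o(1)} error.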

Remarkably, these theorems can be proven without the assistance of the prime number theorem

\displaystyle  \sum_{p \leq x} 1 = \frac{x}{\log x} + o( \frac{x}{\log x} ),

which was proven about two decades after Mertens’ work. (But one can certainly use versions of the prime number theorem with good error term, together with summation by parts, to obtain good estimates on the various errors in Mertens’ theorems.) Roughly speaking, the reason for this is that Mertens’ theorems only require control on the Riemann zeta function {\zeta(s) = \sum_{n=1}^\infty \frac{1}{n^s}} in the neighbourhood of the pole at {s=1}, whereas (as discussed in this previous post) the prime number theorem requires control on the zeta function on (a neighbourhood of) the line {\{ 1+it: t \in {\bf R} \}}. Specifically, Mertens’ theorem is ultimately deduced from the Euler product formula

\displaystyle  \zeta(s) = \prod_p (1-\frac{1}{p^s})^{-1}, \ \ \ \ \ (5)

valid in the region {\hbox{Re}(s) > 1} (which is ultimately a Fourier-Dirichlet transform of the fundamental theorem of arithmetic), and the following crude asymptotics:

Proposition 2 (Simple pole) For {s} sufficiently close to {1} with {\hbox{Re}(s) > 1}, we have

\displaystyle  \zeta(s) = \frac{1}{s-1} + O(1) \ \ \ \ \ (6)

and

\displaystyle  \zeta'(s) = \frac{-1}{(s-1)^2} + O(1).

Proof: For {s} as in the proposition, we have {\frac{1}{n^s} = \frac{1}{t^s} + O(\frac{1}{n^2})} for any natural number {n} and {n \leq t \leq n+1}, and hence

\displaystyle  \frac{1}{n^s} = \int_n^{n+1} \frac{1}{t^s}\ dt + O( \frac{1}{n^2} ).

Summing in {n} and using the identity {\int_1^\infty \frac{1}{t^s}\ dt = \frac{1}{s-1}}, we obtain the first claim. Similarly, we have

\displaystyle  \frac{-\log n}{n^s} = \int_n^{n+1} \frac{-\log t}{t^s}\ dt + O( \frac{\log n}{n^2} ),

and by summing in {n} and using the identity {\int_1^\infty \frac{-\log t}{t^s}\ dt = \frac{-1}{(s-1)^2}} (the derivative of the previous identity) we obtain the second claim. \Box
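For real {s} slightly larger than {1}, the bound (6) can be observed numerically. The sketch below is an illustration only: the helper `zeta_near_one` and its `cutoff` parameter are hypothetical names, and the tail of the series is approximated by the integral {\int_N^\infty t^{-s}\ dt} together with a half-term correction (an Euler-Maclaurin style approximation that is an assumption of this sketch, not something used in the proof above).

```python
def zeta_near_one(s, cutoff=10_000):
    """Approximate zeta(s) for real s > 1.

    Sums the first `cutoff` terms exactly and replaces the tail by
    cutoff^(1-s)/(s-1) - cutoff^(-s)/2, with error O(s / cutoff^2).
    """
    partial = sum(n ** -s for n in range(1, cutoff + 1))
    tail = cutoff ** (1 - s) / (s - 1) - 0.5 * cutoff ** -s
    return partial + tail


# zeta(s) - 1/(s-1) should stay bounded as s decreases to 1.
for s in (1.1, 1.01, 1.001):
    print(s, zeta_near_one(s) - 1 / (s - 1))
```

The printed differences stay bounded as {s} approaches {1}, in line with (6).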

The first two of Mertens’ theorems (1), (2) are relatively easy to prove, and imply the third theorem (3) except with {\gamma} replaced by an unspecified absolute constant. To get the specific constant {\gamma} requires a little bit of additional effort. From (4), one might expect that the appearance of {\gamma} arises from the refinement

\displaystyle  \zeta(s) = \frac{1}{s-1} + \gamma + O(|s-1|) \ \ \ \ \ (7)

that one can obtain to (6). However, it turns out that the connection is not so much with the zeta function, but with the Gamma function, and specifically with the identity {\Gamma'(1) = - \gamma} (which is of course related to (7) through the functional equation for zeta, but can be proven without any reference to zeta functions). More specifically, we have the following asymptotic for the exponential integral:

Proposition 3 (Exponential integral asymptotics) For sufficiently small {\epsilon}, one has

\displaystyle  \int_\epsilon^\infty \frac{e^{-t}}{t}\ dt = \log \frac{1}{\epsilon} - \gamma + O(\epsilon).

A routine integration by parts shows that this asymptotic is equivalent to the identity

\displaystyle  \int_0^\infty e^{-t} \log t\ dt = -\gamma

which is the identity {\Gamma'(1)=-\gamma} mentioned previously.
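To spell out the integration by parts (a sketch; the boundary term at infinity vanishes because of the factor {e^{-t}}), one has

\displaystyle  \int_\epsilon^\infty \frac{e^{-t}}{t}\ dt = e^{-\epsilon} \log \frac{1}{\epsilon} + \int_\epsilon^\infty e^{-t} \log t\ dt.

As {\epsilon \rightarrow 0}, one has {e^{-\epsilon} \log \frac{1}{\epsilon} = \log \frac{1}{\epsilon} + o(1)} and {\int_0^\epsilon e^{-t} \log t\ dt = o(1)}, so the right-hand side is {\log \frac{1}{\epsilon} + \int_0^\infty e^{-t} \log t\ dt + o(1)}. Comparing this with Proposition 3 identifies the constant term {\int_0^\infty e^{-t} \log t\ dt} with {-\gamma}.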

Proof: We start by using the identity {\frac{1}{i} = \int_0^1 x^{i-1}\ dx} to express the harmonic numbers {H_n := 1+\frac{1}{2}+\ldots+\frac{1}{n}} as

\displaystyle  H_n = \int_0^1 1 + x + \ldots + x^{n-1}\ dx

or on summing the geometric series

\displaystyle  H_n = \int_0^1 \frac{1-x^n}{1-x}\ dx.

Since {\int_0^{1-1/n} \frac{1}{1-x}\ dx = \log n}, we thus have

\displaystyle  H_n - \log n = \int_0^1 \frac{1_{[1-1/n,1]}(x) - x^n}{1-x}\ dx;

making the change of variables {x = 1-\frac{t}{n}}, this becomes

\displaystyle  H_n - \log n = \int_0^n \frac{1_{[0,1]}(t) - (1-\frac{t}{n})^n}{t}\ dt.

As {n \rightarrow \infty}, {\frac{1_{[0,1]}(t) - (1-\frac{t}{n})^n}{t}} converges pointwise to {\frac{1_{[0,1]}(t) - e^{-t}}{t}} and is pointwise dominated by {O( e^{-t} )}. Taking limits as {n \rightarrow \infty} using dominated convergence, we conclude that

\displaystyle  \gamma = \int_0^\infty \frac{1_{[0,1]}(t) - e^{-t}}{t}\ dt,

or equivalently

\displaystyle  \int_0^\infty \frac{e^{-t} - 1_{[0,\epsilon]}(t)}{t}\ dt = \log \frac{1}{\epsilon} - \gamma.

The claim then follows by bounding the {\int_0^\epsilon} portion of the integral on the left-hand side. \Box

Below the fold I would like to record how Proposition 2 and Proposition 3 imply Theorem 1; the computations are utterly standard, and can be found in most analytic number theory texts, but I wanted to write them down for my own benefit (I always keep forgetting, in particular, how the third of Mertens’ theorems is proven).

Read the rest of this entry »

This is the third thread for the Polymath8b project to obtain new bounds for the quantity

\displaystyle H_m := \liminf_{n \rightarrow\infty} (p_{n+m} - p_n),

either for small values of {m} (in particular {m=1,2}) or asymptotically as {m \rightarrow \infty}. The previous thread may be found here. The currently best known bounds on {H_m} are:

  • (Maynard) Assuming the Elliott-Halberstam conjecture, {H_1 \leq 12}.
  • (Polymath8b, tentative) {H_1 \leq 330}. Assuming Elliott-Halberstam, {H_2 \leq 330}.
  • (Polymath8b, tentative) {H_2 \leq 484{,}126}. Assuming Elliott-Halberstam, {H_4 \leq 493{,}408}.
  • (Polymath8b) {H_m \leq \exp( 3.817 m )} for sufficiently large {m}. Assuming Elliott-Halberstam, {H_m \ll e^{2m} m \log m} for sufficiently large {m}.

Much of the current focus of the Polymath8b project is on the quantity

\displaystyle M_k = M_k({\cal R}_k) := \sup_F \frac{\sum_{m=1}^k J_k^{(m)}(F)}{I_k(F)}

where {F} ranges over square-integrable functions on the simplex

\displaystyle {\cal R}_k := \{ (t_1,\ldots,t_k) \in [0,+\infty)^k: t_1+\ldots+t_k \leq 1 \}

with {I_k, J_k^{(m)}} being the quadratic forms

\displaystyle I_k(F) := \int_{{\cal R}_k} F(t_1,\ldots,t_k)^2\ dt_1 \ldots dt_k

and

\displaystyle J_k^{(m)}(F) := \int_{{\cal R}_{k-1}} (\int_0^{1-\sum_{i \neq m} t_i} F(t_1,\ldots,t_k)\ dt_m)^2

\displaystyle dt_1 \ldots dt_{m-1} dt_{m+1} \ldots dt_k.

It was shown by Maynard that one has {H_m \leq H(k)} whenever {M_k > 4m}, where {H(k)} is the narrowest diameter of an admissible {k}-tuple. As discussed in the previous post, we have slight improvements to this implication, but they are currently difficult to implement, due to the need to perform high-dimensional integration. The quantity {M_k} does seem however to be close to the theoretical limit of what the Selberg sieve method can achieve for implications of this type (at the Bombieri-Vinogradov level of distribution, at least); it seems of interest to explore more general sieves, although we have not yet made much progress in this direction.

The best asymptotic bounds for {M_k} we have are

\displaystyle \log k - \log\log\log k + O(1) \leq M_k \leq \frac{k}{k-1} \log k \ \ \ \ \ (1)

 

which we prove below the fold. The upper bound holds for all {k > 1}; the lower bound is only valid for sufficiently large {k}, and gives the upper bound {H_m \ll e^{2m} m \log m} on Elliott-Halberstam.

For small {k}, the upper bound is quite competitive, for instance it provides the upper bound in the best values

\displaystyle 1.845 \leq M_4 \leq 1.848

and

\displaystyle 2.001162 \leq M_5 \leq 2.011797

we have for {M_4} and {M_5}. The situation is a little less clear for medium values of {k}, for instance we have

\displaystyle 3.95608 \leq M_{59} \leq 4.148

and so it is not yet clear whether {M_{59} > 4} (which would imply {H_1 \leq 300}). See this wiki page for some further upper and lower bounds on {M_k}.
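The upper bounds quoted above are just the right-hand side of (1) evaluated at the relevant {k}. A short sketch for reproducing them (the function name `upper_bound_Mk` is chosen here for illustration):

```python
from math import log


def upper_bound_Mk(k):
    """Right-hand side of (1): the bound M_k <= k/(k-1) * log(k), for k > 1."""
    return k / (k - 1) * log(k)


# Compare with the upper bounds quoted for M_4, M_5 and M_59.
for k in (4, 5, 59):
    print(k, round(upper_bound_Mk(k), 6))
```

In particular, the bound for {k=59} is still above {4}, which is why (1) alone cannot settle whether {M_{59} > 4}.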

The best lower bounds are not obtained through the asymptotic analysis, but rather through quadratic programming (extending the original method of Maynard). This has given significant numerical improvements to our best bounds (in particular lowering the {H_1} bound from {600} to {330}), but we have not yet been able to combine this method with the other potential improvements (enlarging the simplex, using MPZ distributional estimates, and exploiting upper bounds on two-point correlations) due to the computational difficulty involved.
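To give a feel for how such lower bounds arise, here is a toy version for {k=2} only. It is a sketch under stated assumptions and not the project's actual code: {F} is restricted to polynomials of degree at most two in {t_1,t_2}, the quadratic forms {I_2} and {J_2^{(1)}+J_2^{(2)}} are written as matrices in the monomial basis using the standard Beta-function integrals, and the ratio is improved by power iteration rather than by a quadratic programming package. All function names are hypothetical. Any trial {F} gives a valid lower bound for {M_2}; for instance the constant function {F=1} gives {4/3}.

```python
from math import factorial, log


def monomial_exponents(max_degree):
    """Exponent pairs (a, b) of the monomials t1^a * t2^b up to max_degree."""
    return [(a, d - a) for d in range(max_degree + 1) for a in range(d + 1)]


def gram_I(p, q):
    """Integral of (t1^a t2^b)(t1^c t2^d) over the simplex t1 + t2 <= 1."""
    a, b = p[0] + q[0], p[1] + q[1]
    return factorial(a) * factorial(b) / factorial(a + b + 2)


def gram_J(p, q):
    """J^(1) + J^(2) as a bilinear form on two monomials."""
    (a, b), (c, d) = p, q
    denom = factorial(a + b + c + d + 3)
    j1 = factorial(b + d) * factorial(a + c + 2) / (denom * (a + 1) * (c + 1))
    j2 = factorial(a + c) * factorial(b + d + 2) / (denom * (b + 1) * (d + 1))
    return j1 + j2


def solve(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        s = aug[r][n] - sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = s / aug[r][r]
    return x


def quadratic_form(matrix, v):
    """Return v^T * matrix * v."""
    n = len(v)
    return sum(v[i] * matrix[i][j] * v[j] for i in range(n) for j in range(n))


def lower_bound_M2(max_degree=2, iterations=200):
    """Lower bound for M_2 from polynomials of degree <= max_degree."""
    basis = monomial_exponents(max_degree)
    I = [[gram_I(p, q) for q in basis] for p in basis]
    J = [[gram_J(p, q) for q in basis] for p in basis]
    v = [1.0] + [0.0] * (len(basis) - 1)  # start from the constant F = 1
    for _ in range(iterations):
        Jv = [sum(row[j] * v[j] for j in range(len(v))) for row in J]
        v = solve(I, Jv)  # one step of power iteration for I^{-1} J
        scale = max(abs(x) for x in v)
        v = [x / scale for x in v]
    return quadratic_form(J, v) / quadratic_form(I, v)


# The lower bound should not exceed the upper bound 2 log 2 from (1).
print(lower_bound_M2(), "<=", 2 * log(2))
```

The monomial basis becomes badly conditioned as the degree grows, so this sketch should not be pushed far beyond small degrees; it also does not scale to the values of {k} relevant for the {H_1} bounds.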

Read the rest of this entry »

This is the second thread for the Polymath8b project to obtain new bounds for the quantity

\displaystyle  H_m := \liminf_{n \rightarrow\infty} (p_{n+m} - p_n),

either for small values of {m} (in particular {m=1,2}) or asymptotically as {m \rightarrow \infty}. The previous thread may be found here. The currently best known bounds on {H_m} are:

  • (Maynard) {H_1 \leq 600}.
  • (Polymath8b, tentative) {H_2 \leq 484,276}.
  • (Polymath8b, tentative) {H_m \leq \exp( 3.817 m )} for sufficiently large {m}.
  • (Maynard) Assuming the Elliott-Halberstam conjecture, {H_1 \leq 12}, {H_2 \leq 600}, and {H_m \ll m^3 e^{2m}}.

Following the strategy of Maynard, the bounds on {H_m} proceed by combining four ingredients:

  1. Distribution estimates {EH[\theta]} or {MPZ[\varpi,\delta]} for the primes (or related objects);
  2. Bounds for the minimal diameter {H(k)} of an admissible {k}-tuple;
  3. Lower bounds for the optimal value {M_k} to a certain variational problem;
  4. Sieve-theoretic arguments to convert the previous three ingredients into a bound on {H_m}.

Accordingly, the most natural routes to improve the bounds on {H_m} are to improve one or more of the above four ingredients.

Ingredient 1 was studied intensively in Polymath8a. The following results are known or conjectured (see the Polymath8a paper for notation and proofs):

  • (Bombieri-Vinogradov) {EH[\theta]} is true for all {0 < \theta < 1/2}.
  • (Polymath8a) {MPZ[\varpi,\delta]} is true for {\frac{600}{7} \varpi + \frac{180}{7}\delta < 1}.
  • (Polymath8a, tentative) {MPZ[\varpi,\delta]} is true for {\frac{1080}{13} \varpi + \frac{330}{13} \delta < 1}.
  • (Elliott-Halberstam conjecture) {EH[\theta]} is true for all {0 < \theta < 1}.

Ingredient 2 was also studied intensively in Polymath8a, and is more or less a solved problem for the values of {k} of interest (with exact values of {H(k)} for {k \leq 342}, and quite good upper bounds for {H(k)} for {k < 5000}, available at this page). So the main focus currently is on improving Ingredients 3 and 4.

For Ingredient 3, the basic variational problem is to understand the quantity

\displaystyle  M_k({\cal R}_k) := \sup_F \frac{\sum_{m=1}^k J_k^{(m)}(F)}{I_k(F)}

for {F: {\cal R}_k \rightarrow {\bf R}} bounded measurable functions, not identically zero, on the simplex

\displaystyle  {\cal R}_k := \{ (t_1,\ldots,t_k) \in [0,+\infty)^k: t_1+\ldots+t_k \leq 1 \}

with {I_k, J_k^{(m)}} being the quadratic forms

\displaystyle  I_k(F) := \int_{{\cal R}_k} F(t_1,\ldots,t_k)^2\ dt_1 \ldots dt_k

and

\displaystyle  J_k^{(m)}(F) := \int_{{\cal R}_{k-1}} (\int_0^{1-\sum_{i \neq m} t_i} F(t_1,\ldots,t_k)\ dt_m)^2 dt_1 \ldots dt_{m-1} dt_{m+1} \ldots dt_k.

Equivalently, one has

\displaystyle  M_k({\cal R}_k) = \sup_F \frac{\int_{{\cal R}_k} F {\cal L}_k F}{\int_{{\cal R}_k} F^2}

where {{\cal L}_k: L^2({\cal R}_k) \rightarrow L^2({\cal R}_k)} is the positive semi-definite bounded self-adjoint operator

\displaystyle  {\cal L}_k F(t_1,\ldots,t_k) = \sum_{m=1}^k \int_0^{1-\sum_{i \neq m} t_i} F(t_1,\ldots,t_{m-1},s,t_{m+1},\ldots,t_k)\ ds,

so {M_k({\cal R}_k)} is the operator norm of {{\cal L}_k}. Another interpretation of {M_k({\cal R}_k)} is that the probability that a rook moving randomly in the unit cube {[0,1]^k} stays in the simplex {{\cal R}_k} for {n} moves is asymptotically {(M_k({\cal R}_k)/k + o(1))^n}.

We now have a fairly good asymptotic understanding of {M_k({\cal R}_k)}, with the bounds

\displaystyle  \log k - 2 \log\log k -2 \leq M_k({\cal R}_k) \leq \log k + \log\log k + 2

holding for sufficiently large {k}. There is however still room to tighten the bounds on {M_k({\cal R}_k)} for small {k}; I’ll summarise some of the ideas discussed so far below the fold.

For Ingredient 4, the basic tool is this:

Theorem 1 (Maynard) If {EH[\theta]} is true and {M_k({\cal R}_k) > \frac{2m}{\theta}}, then {H_m \leq H(k)}.

Thus, for instance, it is known that {M_{105} > 4} and {H(105)=600}, and this together with the Bombieri-Vinogradov inequality gives {H_1\leq 600}. This result is proven in Maynard’s paper and an alternate proof is also given in the previous blog post.

We have a number of ways to relax the hypotheses of this result, which we also summarise below the fold.

Read the rest of this entry »

For each natural number {m}, let {H_m} denote the quantity

\displaystyle  H_m := \liminf_{n \rightarrow\infty} (p_{n+m} - p_n),

where {p_n} denotes the {n^{\rm th}} prime. In other words, {H_m} is the least quantity such that there are infinitely many intervals of length {H_m} that contain {m+1} or more primes. Thus, for instance, the twin prime conjecture is equivalent to the assertion that {H_1 = 2}, and the prime tuples conjecture would imply that {H_m} is equal to the diameter of the narrowest admissible tuple of cardinality {m+1} (thus we conjecturally have {H_1 = 2}, {H_2 = 6}, {H_3 = 8}, {H_4 = 12}, {H_5 = 16}, and so forth; see this web page for further continuation of this sequence).
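The conjectural values just listed can be reproduced by brute force for small {m}. The following is a minimal sketch, assuming the standard definition of admissibility (a tuple is admissible if, for every prime {p}, it misses at least one residue class modulo {p}); the function names are illustrative only, and the exhaustive search is only practical for very small tuples.

```python
from itertools import combinations


def primes_up_to(n):
    """Return the primes p <= n by trial division (n is tiny here)."""
    return [p for p in range(2, n + 1)
            if all(p % q for q in range(2, int(p ** 0.5) + 1))]


def is_admissible(tup):
    """Check that the tuple misses a residue class mod p for every prime p.

    Only primes p <= len(tup) need checking, since a tuple with fewer
    than p elements cannot cover all p residue classes.
    """
    for p in primes_up_to(len(tup)):
        if len({h % p for h in tup}) == p:
            return False
    return True


def narrowest_diameter(k):
    """Smallest diameter of an admissible k-tuple (k >= 2), by exhaustive search."""
    diameter = k - 1
    while True:
        # Normalise the tuple to start at 0 and end at `diameter`.
        for middle in combinations(range(1, diameter), k - 2):
            if is_admissible((0, *middle, diameter)):
                return diameter
        diameter += 1


# Conjectural values of H_m for m = 1, ..., 5 (tuples of size m + 1).
print([narrowest_diameter(m + 1) for m in range(1, 6)])
```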

In 2004, Goldston, Pintz, and Yildirim established the bound {H_1 \leq 16} conditional on the Elliott-Halberstam conjecture, which remains unproven. However, no unconditional finiteness of {H_1} was obtained (although they famously obtained the non-trivial bound {p_{n+1}-p_n = o(\log p_n)}), and even on the Elliott-Halberstam conjecture no finiteness result on the higher {H_m} was obtained either (although they were able to show {p_{n+2}-p_n=o(\log p_n)} on this conjecture). In the recent breakthrough of Zhang, the unconditional bound {H_1 \leq 70,000,000} was obtained, by establishing a weak partial version of the Elliott-Halberstam conjecture; by refining these methods, the Polymath8 project (which I suppose we could retroactively call the Polymath8a project) then lowered this bound to {H_1 \leq 4,680}.

With the very recent preprint of James Maynard, we have the following further substantial improvements:

Theorem 1 (Maynard’s theorem) Unconditionally, we have the following bounds:

  • {H_1 \leq 600}.
  • {H_m \leq C m^3 e^{4m}} for an absolute constant {C} and any {m \geq 1}.

If one assumes the Elliott-Halberstam conjecture, we have the following improved bounds:

  • {H_1 \leq 12}.
  • {H_2 \leq 600}.
  • {H_m \leq C m^3 e^{2m}} for an absolute constant {C} and any {m \geq 1}.

The final conclusion {H_m \leq C m^3 e^{2m}} on Elliott-Halberstam is not explicitly stated in Maynard’s paper, but follows easily from his methods, as I will describe below the fold. (At around the same time as Maynard’s work, I had also begun a similar set of calculations concerning {H_m}, but was only able to obtain the slightly weaker bound {H_m \leq C \exp( C m )} unconditionally.) In the converse direction, the prime tuples conjecture implies that {H_m} should be comparable to {m \log m}. Granville has also obtained the slightly weaker explicit bound {H_m \leq e^{8m+5}} for any {m \geq 1} by a slight modification of Maynard’s argument.

The arguments of Maynard avoid using the difficult partial results on (weakened forms of) the Elliott-Halberstam conjecture that were established by Zhang and then refined by Polymath8; instead, the main input is the classical Bombieri-Vinogradov theorem, combined with a sieve that is closer in spirit to an older sieve of Goldston and Yildirim, than to the sieve used later by Goldston, Pintz, and Yildirim on which almost all subsequent work is based.

The aim of the Polymath8b project is to obtain improved bounds on {H_1, H_2}, and higher values of {H_m}, either conditional on the Elliott-Halberstam conjecture or unconditional. The likeliest routes for doing this are by optimising Maynard’s arguments and/or combining them with some of the results from the Polymath8a project. This post is intended to be the first research thread for that purpose. To start the ball rolling, I am going to give below a presentation of Maynard’s results, with some minor technical differences (most significantly, I am using the Goldston-Pintz-Yildirim variant of the Selberg sieve, rather than the traditional “elementary Selberg sieve” that is used by Maynard (and also in the Polymath8 project), although it seems that the numerology obtained by both sieves is essentially the same). An alternate exposition of Maynard’s work has just been completed also by Andrew Granville.

Read the rest of this entry »

I’ve just uploaded to the arXiv my article “Algebraic combinatorial geometry: the polynomial method in arithmetic combinatorics, incidence combinatorics, and number theory”, submitted to the new journal “EMS surveys in the mathematical sciences”. This is the first draft of a survey article on the polynomial method – a technique in combinatorics and number theory for controlling a relevant set of points by comparing it with the zero set of a suitably chosen polynomial, and then using tools from algebraic geometry (e.g. Bezout’s theorem) on that zero set. As such, the method combines algebraic geometry with combinatorial geometry, and could be viewed as the philosophy of a combined field which I dub “algebraic combinatorial geometry”. There is also an important extension of this method when one is working over the reals, in which methods from algebraic topology (e.g. the ham sandwich theorem and its generalisation to polynomials), and not just algebraic geometry, also come into play.

The polynomial method has been used independently many times in mathematics; for instance, it plays a key role in the proof of Baker’s theorem in transcendence theory, or Stepanov’s method in giving an elementary proof of the Riemann hypothesis for curves over finite fields; in combinatorics, the nullstellensatz of Alon is also another relatively early use of the polynomial method. More recently, it underlies Dvir’s proof of the Kakeya conjecture over finite fields and Guth and Katz’s near-complete solution to the Erdos distance problem in the plane, and can be used to give a short proof of the Szemeredi-Trotter theorem. One of the aims of this survey is to try to present all of these disparate applications of the polynomial method in a somewhat unified context; my hope is that there will eventually be a systematic foundation for algebraic combinatorial geometry which naturally contains all of these different instances of the polynomial method (and also suggests new instances to explore); but the field is unfortunately not at that stage of maturity yet.

This is something of a first draft, so comments and suggestions are even more welcome than usual.  (For instance, I have already had my attention drawn to some additional uses of the polynomial method in the literature that I was not previously aware of.)

Define a partition of {1} to be a finite or infinite multiset {\Sigma} of real numbers in the interval {I := (0,1]} (that is, an unordered set of real numbers in {I}, possibly with multiplicity) whose total sum is {1}: {\sum_{t \in \Sigma}t = 1}. For instance, {\{1/2,1/4,1/8,1/16,\ldots\}} is a partition of {1}. Such partitions arise naturally when trying to decompose a large object into smaller ones, for instance:

  1. (Prime factorisation) Given a natural number {n}, one can decompose it into prime factors {n = p_1 \ldots p_k} (counting multiplicity), and then the multiset

    \displaystyle  \Sigma_{PF}(n) := \{ \frac{\log p_1}{\log n}, \ldots,\frac{\log p_k}{\log n} \}

    is a partition of {1}.

  2. (Cycle decomposition) Given a permutation {\sigma \in S_n} on {n} labels {\{1,\ldots,n\}}, one can decompose {\sigma} into cycles {C_1,\ldots,C_k}, and then the multiset

    \displaystyle  \Sigma_{CD}(\sigma) := \{ \frac{|C_1|}{n}, \ldots, \frac{|C_k|}{n} \}

    is a partition of {1}.

  3. (Normalisation) Given a multiset {\Gamma} of positive real numbers whose sum {S := \sum_{x\in \Gamma}x} is finite and non-zero, the multiset

    \displaystyle  \Sigma_N( \Gamma) := \frac{1}{S} \cdot \Gamma = \{ \frac{x}{S}: x \in \Gamma \}

    is a partition of {1}.
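
As a concrete illustration of the three constructions above, here is a minimal Python sketch. The function names (`sigma_pf`, `sigma_cd`, `sigma_n`) are hypothetical, permutations are stored on the labels {0,\ldots,n-1} rather than {1,\ldots,n} for convenience, and trial division is used only because it is short, not because it is efficient.

```python
from math import log

def prime_factors(n):
    """Prime factors of n >= 2, with multiplicity, by trial division."""
    factors, p = [], 2
    while p * p <= n:
        while n % p == 0:
            factors.append(p)
            n //= p
        p += 1
    if n > 1:
        factors.append(n)
    return factors

def sigma_pf(n):
    """Partition of 1 coming from the prime factorisation of n >= 2."""
    return [log(p) / log(n) for p in prime_factors(n)]

def sigma_cd(perm):
    """Partition of 1 from the cycles of perm, where perm[i] is the image of i."""
    n = len(perm)
    seen = [False] * n
    parts = []
    for start in range(n):
        if not seen[start]:
            length, i = 0, start
            while not seen[i]:
                seen[i] = True
                i = perm[i]
                length += 1
            parts.append(length / n)
    return parts

def sigma_n(gamma):
    """Normalise a finite list of positive reals to have sum 1."""
    s = sum(gamma)
    return [x / s for x in gamma]
```

For instance, `sigma_cd([1, 0, 2, 4, 5, 3])` has cycles of lengths 2, 1 and 3 on six labels, and each of the three functions returns a list whose entries sum to {1} up to floating point error.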

In the spirit of the universality phenomenon, one can ask what is the natural distribution for what a “typical” partition should look like; thus one seeks a natural probability distribution on the space of all partitions, analogous to (say) the gaussian distributions on the real line, or GUE distributions on point processes on the line, and so forth. It turns out that there is one natural such distribution which is related to all three examples above, known as the Poisson-Dirichlet distribution. To describe this distribution, we first have to deal with the problem that it is not immediately obvious how to cleanly parameterise the space of partitions, given that the cardinality of the partition can be finite or infinite, that multiplicity is allowed, and that we would like to identify two partitions that are permutations of each other.

One way to proceed is to view a random partition {\Sigma} as a type of point process on the interval {I}, with the constraint that {\sum_{x \in \Sigma} x = 1}, in which case one can study statistics such as the counting functions

\displaystyle  N_{[a,b]} := |\Sigma \cap [a,b]| = \sum_{x \in\Sigma} 1_{[a,b]}(x)

(where the cardinality here counts multiplicity). This can certainly be done, although in the case of the Poisson-Dirichlet process, the formulae for the joint distribution of such counting functions are moderately complicated. Another way to proceed is to order the elements of {\Sigma} in decreasing order

\displaystyle  t_1 \geq t_2 \geq t_3 \geq \ldots \geq 0,

with the convention that one pads the sequence {t_n} by an infinite number of zeroes if {\Sigma} is finite; this identifies the space of partitions with an infinite dimensional simplex

\displaystyle  \{ (t_1,t_2,\ldots) \in [0,1]^{\bf N}: t_1 \geq t_2 \geq \ldots; \sum_{n=1}^\infty t_n = 1 \}.

However, it turns out that the process of ordering the elements is not “smooth” (basically because functions such as {(x,y) \mapsto \max(x,y)} and {(x,y) \mapsto \min(x,y)} are not smooth) and the formulae for the joint distribution in the case of the Poisson-Dirichlet process are again complicated.

It turns out that there is a better (or at least “smoother”) way to enumerate the elements {u_1,(1-u_1)u_2,(1-u_1)(1-u_2)u_3,\ldots} of a partition {\Sigma} than the ordered method, although it is random rather than deterministic. This procedure (which I learned from this paper of Donnelly and Grimmett) works as follows.

  1. Given a partition {\Sigma}, let {u_1} be an element of {\Sigma} chosen at random, with each element {t\in \Sigma} having a probability {t} of being chosen as {u_1} (so if {t \in \Sigma} occurs with multiplicity {m}, the net probability that {t} is chosen as {u_1} is actually {mt}). Note that this is well-defined since the elements of {\Sigma} sum to {1}.
  2. Now suppose {u_1} is chosen. If {\Sigma \backslash \{u_1\}} is empty, we set {u_2,u_3,\ldots} all equal to zero and stop. Otherwise, let {u_2} be an element of {\frac{1}{1-u_1} \cdot (\Sigma \backslash \{u_1\})} chosen at random, with each element {t \in \frac{1}{1-u_1} \cdot (\Sigma \backslash \{u_1\})} having a probability {t} of being chosen as {u_2}. (For instance, if {u_1} occurred with some multiplicity {m>1} in {\Sigma}, then {u_2} can equal {\frac{u_1}{1-u_1}} with probability {(m-1)u_1/(1-u_1)}.)
  3. Now suppose {u_1,u_2} are both chosen. If {\Sigma \backslash \{u_1,u_2\}} is empty, we set {u_3, u_4, \ldots} all equal to zero and stop. Otherwise, let {u_3} be an element of {\frac{1}{1-u_1-u_2} \cdot (\Sigma\backslash \{u_1,u_2\})} chosen at random, with each element {t \in \frac{1}{1-u_1-u_2} \cdot (\Sigma\backslash \{u_1,u_2\})} having a probability {t} of being chosen as {u_3}.
  4. We continue this process indefinitely to create elements {u_1,u_2,u_3,\ldots \in [0,1]}.

We denote the random sequence {Enum(\Sigma) := (u_1,u_2,\ldots) \in [0,1]^{\bf N}} formed from a partition {\Sigma} in the above manner as the random normalised enumeration of {\Sigma}; this is a random variable in the infinite unit cube {[0,1]^{\bf N}}, and can be defined recursively by the formula

\displaystyle  Enum(\Sigma) = (u_1, Enum(\frac{1}{1-u_1} \cdot (\Sigma\backslash \{u_1\})))

with {u_1} drawn randomly from {\Sigma}, with each element {t \in \Sigma} chosen with probability {t}, except when {\Sigma =\{1\}} in which case we instead have

\displaystyle  Enum(\{1\}) = (1, 0,0,\ldots).

Note that one can recover {\Sigma} from any of its random normalised enumerations {Enum(\Sigma) := (u_1,u_2,\ldots)} by the formula

\displaystyle  \Sigma = \{ u_1, (1-u_1) u_2,(1-u_1)(1-u_2)u_3,\ldots\} \ \ \ \ \ (1)

with the convention that one discards any zero elements on the right-hand side. Thus {Enum} can be viewed as a (stochastic) parameterisation of the space of partitions by the unit cube {[0,1]^{\bf N}}, which is a simpler domain to work with than the infinite-dimensional simplex mentioned earlier.
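
The enumeration procedure and the recovery formula (1) can be sketched in Python for a finite partition as follows. This is only an illustrative sketch: the names `enum` and `recover` are hypothetical, and instead of rescaling the remaining multiset at each step, the code keeps the remaining elements unnormalised and divides by the remaining mass, which gives the same {u_k}.

```python
import random

def enum(sigma, rng):
    """Random normalised enumeration (u_1, ..., u_k) of a finite partition.

    sigma is a list of positive numbers summing to 1; rng is a random.Random.
    """
    remaining = list(sigma)
    us = []
    while remaining:
        mass = sum(remaining)  # total mass not yet removed
        # pick an element with probability proportional to its size
        idx = rng.choices(range(len(remaining)),
                          weights=[float(t) for t in remaining])[0]
        t = remaining.pop(idx)
        us.append(t / mass)    # size relative to the remaining mass
    return us

def recover(us):
    """Invert the enumeration via formula (1), discarding zero elements."""
    parts, rest = [], 1
    for u in us:
        if u > 0:
            parts.append(rest * u)
        rest = rest * (1 - u)
    return parts
```

Whatever random choices are made, `recover(enum(sigma, rng))` returns the elements of `sigma` in some order, and the last entry of the enumeration of a finite partition is always {1}, since the final element exhausts the remaining mass.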

Note that this random enumeration procedure can also be adapted to the three models described earlier:

  1. Given a natural number {n}, one can randomly enumerate its prime factors {n =p'_1 p'_2 \ldots p'_k} by letting each prime factor {p} of {n} be equal to {p'_1} with probability {\frac{\log p}{\log n}}, then once {p'_1} is chosen, let each remaining prime factor {p} of {n/p'_1} be equal to {p'_2} with probability {\frac{\log p}{\log(n/p'_1)}}, and so forth.
  2. Given a permutation {\sigma\in S_n}, one can randomly enumerate its cycles {C'_1,\ldots,C'_k} by letting each cycle {C} in {\sigma} be equal to {C'_1} with probability {\frac{|C|}{n}}, and once {C'_1} is chosen, let each remaining cycle {C} be equal to {C'_2} with probability {\frac{|C|}{n-|C'_1|}}, and so forth. Alternatively, one can traverse the elements of {\{1,\ldots,n\}} in random order, then let {C'_1} be the first cycle one encounters when performing this traversal, let {C'_2} be the next cycle (not equal to {C'_1}) one encounters when performing this traversal, and so forth.
  3. Given a multiset {\Gamma} of positive real numbers whose sum {S := \sum_{x\in\Gamma} x} is finite, we can randomly enumerate the elements of this multiset as {x'_1,x'_2,\ldots} by letting each {x \in \Gamma} have a {\frac{x}{S}} probability of being set equal to {x'_1}, and then once {x'_1} is chosen, let each remaining {x \in \Gamma\backslash \{x'_1\}} have a {\frac{x}{S-x'_1}} probability of being set equal to {x'_2}, and so forth.

We then have the following result:

Proposition 1 (Existence of the Poisson-Dirichlet process) There exists a random partition {\Sigma} whose random enumeration {Enum(\Sigma) = (u_1,u_2,\ldots)} has the uniform distribution on {[0,1]^{\bf N}}, thus {u_1,u_2,\ldots} are independently and identically distributed copies of the uniform distribution on {[0,1]}.

A random partition {\Sigma} with this property will be called the Poisson-Dirichlet process. This process, first introduced by Kingman, can be described explicitly using (1) as

\displaystyle  \Sigma = \{ u_1, (1-u_1) u_2,(1-u_1)(1-u_2)u_3,\ldots\},

where {u_1,u_2,\ldots} are iid copies of the uniform distribution on {[0,1]}, although it is not immediately obvious from this definition that {Enum(\Sigma)} is indeed uniformly distributed on {[0,1]^{\bf N}}. We prove this proposition below the fold.
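
The explicit description of {\Sigma} in terms of iid uniform variables lends itself to simulation. The sketch below (the function name `poisson_dirichlet_sample` is hypothetical) generates only the first `num_terms` elements and returns the leftover mass separately, so it is a truncation of the process, not an exact sample.

```python
import random

def poisson_dirichlet_sample(num_terms, rng):
    """First num_terms elements u_1, (1-u_1)u_2, ... and the leftover mass."""
    parts, rest = [], 1.0
    for _ in range(num_terms):
        u = rng.random()        # uniform on [0,1)
        parts.append(rest * u)  # break off a fraction u of what is left
        rest *= 1.0 - u
    return parts, rest

parts, rest = poisson_dirichlet_sample(50, random.Random(1))
```

Since each factor {1-u_i} has mean {1/2} and the {u_i} are independent, the leftover mass after {m} steps has mean {2^{-m}}, so a few dozen terms typically capture all but a negligible part of the partition.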

An equivalent definition of a Poisson-Dirichlet process is a random partition {\Sigma} with the property that

\displaystyle  (u_1, \frac{1}{1-u_1} \cdot (\Sigma \backslash \{u_1\})) \equiv (U, \Sigma) \ \ \ \ \ (2)

where {u_1} is a random element of {\Sigma} with each {t \in\Sigma} having a probability {t} of being equal to {u_1}, {U} is a uniform variable on {[0,1]} that is independent of {\Sigma}, and {\equiv} denotes equality of distribution. This can be viewed as a sort of stochastic self-similarity property of {\Sigma}: if one randomly removes one element from {\Sigma} and rescales, one gets a new copy of {\Sigma}.

It turns out that each of the three ways to generate partitions listed above can lead to the Poisson-Dirichlet process, either directly or in a suitable limit. We begin with the third way, namely by normalising a Poisson process to have sum {1}:

Proposition 2 (Poisson-Dirichlet processes via Poisson processes) Let {a>0}, and let {\Gamma_a} be a Poisson process on {(0,+\infty)} with intensity function {t \mapsto \frac{1}{t} e^{-at}}. Then the sum {S :=\sum_{x \in \Gamma_a} x} is almost surely finite, and the normalisation {\Sigma_N(\Gamma_a) = \frac{1}{S} \cdot \Gamma_a} is a Poisson-Dirichlet process.

Again, we prove this proposition below the fold. Now we turn to the second way (a topic, incidentally, that was briefly touched upon in this previous blog post):

Proposition 3 (Large cycles of a typical permutation) For each natural number {n}, let {\sigma} be a permutation drawn uniformly at random from {S_n}. Then the random partition {\Sigma_{CD}(\sigma)} converges in the limit {n \rightarrow\infty} to a Poisson-Dirichlet process {\Sigma} in the following sense: given any fixed sequence of intervals {[a_1,b_1],\ldots,[a_k,b_k] \subset I} (independent of {n}), the joint discrete random variable {(N_{[a_1,b_1]}(\Sigma_{CD}(\sigma)),\ldots,N_{[a_k,b_k]}(\Sigma_{CD}(\sigma)))} converges in distribution to {(N_{[a_1,b_1]}(\Sigma),\ldots,N_{[a_k,b_k]}(\Sigma))}.
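
One simple sanity check on this proposition is the standard fact that, in a uniformly random permutation, the cycle containing a fixed label has length uniformly distributed on {\{1,\ldots,n\}}; after dividing by {n}, this is the analogue of the first coordinate {u_1} of the random enumeration being uniform. The following Monte Carlo sketch (with the hypothetical name `first_cycle_fraction`, and labels {0,\ldots,n-1}) illustrates this; it is an experiment, not a proof.

```python
import random

def first_cycle_fraction(n, rng):
    """|C|/n for the cycle C containing label 0 in a uniform random permutation."""
    perm = list(range(n))
    rng.shuffle(perm)          # perm[i] is the image of i
    length, i = 1, perm[0]
    while i != 0:              # follow the cycle until it returns to 0
        i = perm[i]
        length += 1
    return length / n

rng = random.Random(0)
samples = [first_cycle_fraction(50, rng) for _ in range(2000)]
mean = sum(samples) / len(samples)
```

For {n=50} the exact mean of this quantity is {(n+1)/(2n) = 0.51}, and the empirical mean should be close to that value.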

Finally, we turn to the first way:

Proposition 4 (Large prime factors of a typical number) Let {x > 0}, and let {N_x} be a random natural number chosen according to one of the following three rules:

  1. (Uniform distribution) {N_x} is drawn uniformly at random from the natural numbers in {[1,x]}.
  2. (Shifted uniform distribution) {N_x} is drawn uniformly at random from the natural numbers in {[x,2x]}.
  3. (Zeta distribution) Each natural number {n} has a probability {\frac{1}{\zeta(s)}\frac{1}{n^s}} of being equal to {N_x}, where {s := 1 +\frac{1}{\log x}} and {\zeta(s):=\sum_{n=1}^\infty \frac{1}{n^s}}.

Then {\Sigma_{PF}(N_x)} converges as {x \rightarrow \infty} to a Poisson-Dirichlet process {\Sigma} in the same fashion as in Proposition 3.

The process {\Sigma_{PF}(N_x)} was first studied by Billingsley (and also later by Knuth-Trabb Pardo and by Vershik), but the formulae were initially rather complicated; the proposition above is due to Donnelly and Grimmett, although the third case of the proposition is substantially easier and appears in the earlier work of Lloyd. We prove the proposition below the fold.

The previous two propositions suggest an interesting analogy between large random integers and large random permutations; see this ICM article of Vershik and this non-technical article of Granville (which, incidentally, was once adapted into a play) for further discussion.

As a sample application, consider the problem of estimating the number {\pi(x,x^{1/u})} of integers up to {x} which are not divisible by any prime larger than {x^{1/u}} (i.e. they are {x^{1/u}}-smooth), where {u>0} is a fixed real number. After dividing by {x}, this is essentially (modulo some inessential technicalities concerning the distinction between the intervals {[x,2x]} and {[1,x]}) the probability that {\Sigma_{PF}(N_x)} avoids {[1/u,1]}, which by Proposition 4 converges to the probability {\rho(u)} that {\Sigma} avoids {[1/u,1]}. Below the fold we will show that this function is given by the Dickman function, defined by setting {\rho(u)=1} for {u < 1} and {u\rho'(u) = -\rho(u-1)} for {u \geq 1}, thus recovering the classical result of Dickman that {\pi(x,x^{1/u}) = (\rho(u)+o(1))x}.
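
To get a feel for the numbers, one can count smooth integers directly and compare with the Dickman function, which is given by the closed form {\rho(u) = 1 - \log u} in the range {1 \leq u \leq 2}. The sketch below (the name `smooth_count` is hypothetical) uses a sieve for the largest prime factor; the convergence in Dickman's theorem is slow, so only rough agreement should be expected at small {x}.

```python
from math import log

def smooth_count(x, y):
    """Number of integers 1 <= n <= x all of whose prime factors are <= y."""
    largest = [0] * (x + 1)    # largest[n] = largest prime factor of n (0 for n = 1)
    for p in range(2, x + 1):
        if largest[p] == 0:    # p is prime: no smaller prime has marked it
            for m in range(p, x + 1, p):
                largest[m] = p # later (larger) primes overwrite earlier ones
    return sum(1 for n in range(1, x + 1) if largest[n] <= y)

x, u = 10**5, 2.0
empirical = smooth_count(x, x ** (1 / u)) / x
dickman = 1 - log(u)           # rho(2), roughly 0.307
```

Here `empirical` is the proportion of integers up to {10^5} with no prime factor exceeding {\sqrt{10^5}}, to be compared with `dickman`.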

I thank Andrew Granville and Anatoly Vershik for showing me the nice link between prime factors and the Poisson-Dirichlet process. The material here is standard, and (like many of the other notes on this blog) was primarily written for my own benefit, but it may be of interest to some readers. In preparing this article I found this exposition by Kingman to be helpful.

Note: this article will emphasise the computations rather than rigour, and in particular will rely on informal use of infinitesimals to avoid dealing with stochastic calculus or other technicalities. We adopt the convention that we will neglect higher order terms in infinitesimal calculations, e.g. if {dt} is infinitesimal then we will abbreviate {dt + o(dt)} simply as {dt}.

Read the rest of this entry »

As in all previous posts in this series, we adopt the following asymptotic notation: {x} is a parameter going off to infinity, and all quantities may depend on {x} unless explicitly declared to be “fixed”. The asymptotic notation {O(), o(), \ll} is then defined relative to this parameter. A quantity {q} is said to be of polynomial size if one has {q = O(x^{O(1)})}, and bounded if {q=O(1)}. We also write {X \lessapprox Y} for {X \ll x^{o(1)} Y}, and {X \sim Y} for {X \ll Y \ll X}.

The purpose of this (rather technical) post is both to roll over the polymath8 research thread from this previous post, and also to record the details of the latest improvement to the Type I estimates (based on exploiting additional averaging and using Deligne’s proof of the Weil conjectures), which leads to a slight improvement in the numerology.

In order to obtain this new Type I estimate, we need to strengthen the previously used properties of “dense divisibility” or “double dense divisibility” as follows.

Definition 1 (Multiple dense divisibility) Let {y \geq 1}. For each natural number {k \geq 0}, we define a notion of {k}-tuply {y}-dense divisibility recursively as follows:

  • Every natural number {n} is {0}-tuply {y}-densely divisible.
  • If {k \geq 1} and {n} is a natural number, we say that {n} is {k}-tuply {y}-densely divisible if, whenever {i,j \geq 0} are natural numbers with {i+j=k-1}, and {1 \leq R \leq n}, one can find a factorisation {n = qr} with {y^{-1} R \leq r \leq R} such that {q} is {i}-tuply {y}-densely divisible and {r} is {j}-tuply {y}-densely divisible.

We let {{\mathcal D}^{(k)}_y} denote the set of {k}-tuply {y}-densely divisible numbers. We abbreviate “{1}-tuply densely divisible” as “densely divisible”, “{2}-tuply densely divisible” as “doubly densely divisible”, and so forth; we also abbreviate {{\mathcal D}^{(1)}_y} as {{\mathcal D}_y}.
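
For small {n}, Definition 1 can be tested directly by brute force. The sketch below assumes that {R} ranges over all real numbers in {[1,n]}, so that the condition amounts to the intervals {[r, yr]} (over admissible factors {r}) covering {[1,n]}; the function name `is_densely_divisible` is hypothetical, and the recursion is only practical for small inputs.

```python
from functools import lru_cache
from math import isqrt

def divisors(n):
    """All positive divisors of n in increasing order."""
    small = [d for d in range(1, isqrt(n) + 1) if n % d == 0]
    return sorted(set(small + [n // d for d in small]))

@lru_cache(maxsize=None)
def is_densely_divisible(n, k, y):
    """True if n is k-tuply y-densely divisible (R taken real in [1, n])."""
    if k == 0:
        return True
    for i in range(k):
        j = k - 1 - i
        # admissible r: n = q r with q i-tuply and r j-tuply y-densely divisible
        good = [r for r in divisors(n)
                if is_densely_divisible(n // r, i, y)
                and is_densely_divisible(r, j, y)]
        if not good or good[0] != 1:
            return False       # R = 1 forces r = 1
        covered = y            # [1, covered] is covered by the intervals so far
        for r in good[1:]:
            if r > covered:
                break          # gap between covered and r
            covered = y * r
        if covered < n:
            return False
    return True
```

For example, with {y=2} a prime {p > 2} is not densely divisible (its only factors are {1} and {p}), while {8} is doubly densely divisible and {6} is densely divisible but not doubly so, because its factor {3} fails to be {2}-densely divisible.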

Given any finitely supported sequence {\alpha: {\bf N} \rightarrow {\bf C}} and any primitive residue class {a\ (q)}, we define the discrepancy

\displaystyle \Delta(\alpha; a \ (q)) := \sum_{n: n = a\ (q)} \alpha(n) - \frac{1}{\phi(q)} \sum_{n: (n,q)=1} \alpha(n).
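
To make the definition concrete, here is a small Python sketch computing the discrepancy of a finitely supported sequence. The representation of {\alpha} as a dictionary and the function name `discrepancy` are assumptions made for illustration, and exact rational arithmetic is used so that the example has no rounding error.

```python
from fractions import Fraction
from math import gcd

def discrepancy(alpha, a, q):
    """Delta(alpha; a (q)) for alpha given as a dict {n: alpha(n)}."""
    in_class = sum(v for n, v in alpha.items() if (n - a) % q == 0)
    coprime = sum(v for n, v in alpha.items() if gcd(n, q) == 1)
    phi = sum(1 for m in range(1, q + 1) if gcd(m, q) == 1)  # Euler totient
    return in_class - Fraction(1, phi) * coprime

# indicator function of {1, ..., 10}
alpha = {n: 1 for n in range(1, 11)}
delta = discrepancy(alpha, 1, 3)
```

In this example four of the integers up to {10} are congruent to {1} mod {3}, seven are coprime to {3}, and {\phi(3)=2}, so the discrepancy is {4 - 7/2 = 1/2}.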

We now recall the key concept of a coefficient sequence, with some slight tweaks in the definitions that are technically convenient for this post.

Definition 2 A coefficient sequence is a finitely supported sequence {\alpha: {\bf N} \rightarrow {\bf R}} that obeys the bounds

\displaystyle  |\alpha(n)| \ll \tau^{O(1)}(n) \log^{O(1)}(x) \ \ \ \ \ (1)

for all {n}, where {\tau} is the divisor function.

  • (i) A coefficient sequence {\alpha} is said to be located at scale {N} for some {N \geq 1} if it is supported on an interval of the form {[cN, CN]} for some {1 \ll c < C \ll 1}.
  • (ii) A coefficient sequence {\alpha} located at scale {N} for some {N \geq 1} is said to obey the Siegel-Walfisz theorem if one has

    \displaystyle  | \Delta(\alpha 1_{(\cdot,q)=1}; a\ (r)) | \ll \tau(qr)^{O(1)} N \log^{-A} x \ \ \ \ \ (2)

    for any {q,r \geq 1}, any fixed {A}, and any primitive residue class {a\ (r)}.

  • (iii) A coefficient sequence {\alpha} is said to be smooth at scale {N} for some {N > 0} if it takes the form {\alpha(n) = \psi(n/N)} for some smooth function {\psi: {\bf R} \rightarrow {\bf C}} supported on an interval of size {O(1)} and obeying the derivative bounds

    \displaystyle  |\psi^{(j)}(t)| \ll \log^{O(1)} x \ \ \ \ \ (3)

    for all fixed {j \geq 0} (note that the implied constant in the {O()} notation may depend on {j}).

Note that we allow sequences to be smooth at scale {N} without being located at scale {N}; for instance if one arbitrarily translates a sequence that is both smooth and located at scale {N}, it will remain smooth at this scale but may not necessarily be located at this scale any more. Note also that we allow the smoothness scale {N} of a coefficient sequence to be less than one. This is to allow for the following convenient rescaling property: if {n \mapsto \psi(n)} is smooth at scale {N}, {q \geq 1}, and {a} is an integer, then {n \mapsto \psi(qn+a)} is smooth at scale {N/q}, even if {N/q} is less than one.

Now we adapt the Type I estimate to the {k}-tuply densely divisible setting.

Definition 3 (Type I estimates) Let {0 < \varpi < 1/4}, {0 < \delta < 1/4+\varpi}, and {0 < \sigma < 1/2} be fixed quantities, and let {k \geq 1} be a fixed natural number. We let {I} be an arbitrary bounded subset of {{\bf R}}, let {P_I := \prod_{p \in I} p}, and let {a\ (P_I)} be a primitive congruence class. We say that {Type^{(k)}_I[\varpi,\delta,\sigma]} holds if, whenever {M, N \gg 1} are quantities with

\displaystyle  M N \sim x \ \ \ \ \ (4)

and

\displaystyle  x^{1/2-\sigma} \lessapprox N \lessapprox x^{1/2-2\varpi-c} \ \ \ \ \ (5)

for some fixed {c>0}, and {\alpha,\beta} are coefficient sequences located at scales {M,N} respectively, with {\beta} obeying a Siegel-Walfisz theorem, we have

\displaystyle  \sum_{q \in {\mathcal S}_I \cap {\mathcal D}_{x^\delta}^{(k)}: q \leq x^{1/2+2\varpi}} |\Delta(\alpha * \beta; a\ (q))| \ll x \log^{-A} x \ \ \ \ \ (6)

for any fixed {A>0}. Here, as in previous posts, {{\mathcal S}_I} denotes the square-free natural numbers whose prime factors lie in {I}.

The main theorem of this post is then

Theorem 4 (Improved Type I estimate) We have {Type^{(4)}_I[\varpi,\delta,\sigma]} whenever

\displaystyle  \frac{160}{3} \varpi + 16 \delta + \frac{34}{9} \sigma < 1

and

\displaystyle  64\varpi + 18\delta + 2\sigma < 1.

In practice, the first condition here is dominant. Except for weakening double dense divisibility to quadruple dense divisibility, this improves upon the previous Type I estimate that established {Type^{(2)}_I[\varpi,\delta,\sigma]} under the stricter hypothesis

\displaystyle  56 \varpi + 16 \delta + 4 \sigma < 1.

As in previous posts, Type I estimates (when combined with existing Type II and Type III estimates) lead to distribution results of Motohashi-Pintz-Zhang type. For any fixed {\varpi, \delta > 0} and {k \geq 1}, we let {MPZ^{(k)}[\varpi,\delta]} denote the assertion that

\displaystyle  \sum_{q \in {\mathcal S}_I \cap {\mathcal D}_{x^\delta}^{(k)}: q \leq x^{1/2+2\varpi}} |\Delta(\Lambda 1_{[x,2x]}; a\ (q))| \ll x \log^{-A} x \ \ \ \ \ (7)

for any fixed {A > 0}, any bounded {I}, and any primitive {a\ (P_I)}, where {\Lambda} is the von Mangoldt function.

Corollary 5 We have {MPZ^{(4)}[\varpi,\delta]} whenever

\displaystyle  \frac{600}{7} \varpi + \frac{180}{7} \delta < 1 \ \ \ \ \ (8)

Proof: Setting {\sigma} sufficiently close to {1/10}, we see from the above theorem that {Type^{(4)}_{I}[\varpi,\delta,\sigma]} holds whenever

\displaystyle  \frac{600}{7} \varpi + \frac{180}{7} \delta < 1

and

\displaystyle  80 \varpi + \frac{45}{2} \delta < 1.

The second condition is implied by the first and can be deleted.

From this previous post we know that {Type^{(4)}_{II}[\varpi,\delta]} (which we define analogously to {Type'_{II}[\varpi,\delta], Type''_{II}[\varpi,\delta]} from previous sections) holds whenever

\displaystyle  68 \varpi + 14 \delta < 1

while {Type^{(4)}_{III}[\varpi,\delta,\sigma]} holds with {\sigma} sufficiently close to {1/10} whenever

\displaystyle  70 \varpi + 5 \delta < 1.

Again, these conditions are implied by (8). The claim then follows from the Heath-Brown identity and dyadic decomposition as in this previous post. \Box
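
The numerology in this proof is a matter of routine arithmetic, which can be double-checked with exact rational arithmetic. The sketch below (the helper names `substitute_sigma` and `implied_by` are hypothetical) substitutes {\sigma = 1/10} into the two conditions of Theorem 4 and then checks that every other condition has coefficients no larger than those in (8), which suffices for it to be implied by (8) since {\varpi,\delta > 0}.

```python
from fractions import Fraction as F

def substitute_sigma(c_varpi, c_delta, c_sigma, sigma):
    """Rewrite c_varpi*varpi + c_delta*delta + c_sigma*sigma < 1 at fixed sigma
    in the normalised form A*varpi + B*delta < 1, returning (A, B)."""
    slack = 1 - c_sigma * sigma
    return c_varpi / slack, c_delta / slack

def implied_by(cond, main):
    """A'*varpi + B'*delta < 1 follows from A*varpi + B*delta < 1 if A' <= A, B' <= B."""
    return cond[0] <= main[0] and cond[1] <= main[1]

sigma = F(1, 10)
first = substitute_sigma(F(160, 3), F(16), F(34, 9), sigma)   # condition (8)
second = substitute_sigma(F(64), F(18), F(2), sigma)
type_ii = (F(68), F(14))
type_iii = (F(70), F(5))
checks = [implied_by(c, first) for c in (second, type_ii, type_iii)]
```

Here `first` evaluates to the pair {(600/7, 180/7)} and `second` to {(80, 45/2)}, matching the displayed conditions, and all three entries of `checks` are `True`.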

As before, we let {DHL[k_0,2]} denote the claim that given any admissible {k_0}-tuple {{\mathcal H}}, there are infinitely many translates of {{\mathcal H}} that contain at least two primes.

Corollary 6 We have {DHL[k_0,2]} with {k_0 = 632}.

This follows from the Pintz sieve, as discussed below the fold. Combining this with the best known prime tuples, we obtain that there are infinitely many prime gaps of size at most {4,680}, improving slightly over the previous record of {5,414}.

Read the rest of this entry »
