Kevin Ford, Ben Green, Sergei Konyagin, and myself have just posted to the arXiv our preprint “Large gaps between consecutive prime numbers”. This paper concerns the “opposite” problem to that considered by the recently concluded Polymath8 project, which was concerned with very small values of the prime gap {p_{n+1}-p_n}. Here, we wish to consider the largest prime gap {G(X) = p_{n+1}-p_n} that one can find in the interval {[X] = \{1,\dots,X\}} as {X} goes to infinity.

Finding lower bounds on {G(X)} is more or less equivalent to locating long strings of consecutive composite numbers that are not too large compared to the length of the string. A classic (and quite well known) construction here starts with the observation that for any natural number {n}, the consecutive numbers {n!+2, n!+3,\dots,n!+n} are all composite, because each {n!+i}, {i=2,\dots,n} is divisible by some prime {p \leq n}, while being strictly larger than that prime {p}. From this and Stirling’s formula, it is not difficult to obtain the bound

\displaystyle G(X) \gg \frac{\log X}{\log\log X}. \ \ \ \ \ (1)
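The divisibility property behind this construction is easy to check directly; here is a quick sketch (my own illustration, not from the paper) in Python:

```python
from math import factorial

def factorial_run(n):
    """The n-1 consecutive integers n!+2, ..., n!+n from the classic construction."""
    f = factorial(n)
    return [f + i for i in range(2, n + 1)]

# Each n!+i is divisible by i (hence by some prime p <= n) and exceeds i,
# so the whole run consists of composite numbers.
for n in range(3, 10):
    for i, m in zip(range(2, n + 1), factorial_run(n)):
        assert m % i == 0 and m > i
```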

A more efficient bound comes from the prime number theorem: there are only {(1+o(1)) \frac{X}{\log X}} primes up to {X}, so just from the pigeonhole principle one can locate a string of consecutive composite numbers up to {X} of length at least {(1-o(1)) \log X}, thus

\displaystyle G(X) \gtrsim \log X \ \ \ \ \ (2)

where we use {X \gtrsim Y} or {Y \lesssim X} as shorthand for {X \geq (1-o(1)) Y} or {Y \leq (1+o(1)) X}.
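Both bounds are easy to see numerically; the following sketch (stdlib Python, my own illustration) computes {G(X)} by a direct sieve and compares it with {\log X}:

```python
from math import log

def primes_up_to(X):
    """Sieve of Eratosthenes returning all primes <= X."""
    sieve = bytearray([1]) * (X + 1)
    sieve[0:2] = b"\x00\x00"
    for p in range(2, int(X**0.5) + 1):
        if sieve[p]:
            sieve[p * p :: p] = bytearray(len(range(p * p, X + 1, p)))
    return [n for n in range(X + 1) if sieve[n]]

def G(X):
    """Largest gap between consecutive primes up to X."""
    ps = primes_up_to(X)
    return max(q - p for p, q in zip(ps, ps[1:]))

# the ratio G(X)/log X creeps upward, consistent with the bounds discussed above
for X in (10**3, 10**4, 10**5):
    print(X, G(X), round(G(X) / log(X), 2))
```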

What about upper bounds? The Cramér random model predicts that the primes up to {X} are distributed like a random subset of {\{1,\dots,X\}} of density {1/\log X}. Using this model, Cramér arrived at the conjecture

\displaystyle G(X) \ll \log^2 X.

In fact, if one makes the extremely optimistic assumption that the random model perfectly describes the behaviour of the primes, one would arrive at the even more precise prediction

\displaystyle G(X) \sim \log^2 X.

However, it is no longer widely believed that this optimistic version of the conjecture is true, due to some additional irregularities in the primes coming from the basic fact that large primes cannot be divisible by very small primes. Using the Maier matrix method to capture some of this irregularity, Granville was led to the conjecture that

\displaystyle G(X) \gtrsim 2e^{-\gamma} \log^2 X

(note that {2e^{-\gamma} = 1.1229\dots} is slightly larger than {1}). For comparison, the known upper bounds on {G(X)} are quite weak; unconditionally one has {G(X) \ll X^{0.525}} by the work of Baker, Harman, and Pintz, and even on the Riemann hypothesis one only gets down to {G(X) \ll X^{1/2} \log X}, as shown by Cramér (a slight improvement is also possible if one additionally assumes the pair correlation conjecture; see this article of Heath-Brown and the references therein).

This conjecture remains out of reach of current methods. In 1931, Westzynthius managed to improve the bound (2) slightly to

\displaystyle G(X) \gg \frac{\log\log\log X}{\log\log\log\log X} \log X ,

which Erdös in 1935 improved to

\displaystyle G(X) \gg \frac{\log\log X}{(\log\log\log X)^2} \log X

and Rankin in 1938 improved slightly further to

\displaystyle G(X) \gtrsim c \frac{(\log\log X) \log\log\log\log X}{(\log\log\log X)^2} \log X \ \ \ \ \ (3)

with {c=1/3}. Remarkably, this rather strange bound then proved extremely difficult to improve upon; until recently, the only improvements were to the constant {c}, which was raised to {c=\frac{1}{2} e^\gamma} in 1963 by Schönhage, to {c= e^\gamma} in 1963 by Rankin, to {c = 1.31256 e^\gamma} by Maier and Pomerance, and finally to {c = 2e^\gamma} in 1997 by Pintz.

Erdös listed the problem of making {c} arbitrarily large as one of his favourite open problems, even offering (“somewhat rashly”, in his words) a cash prize for the solution. Our main result answers this question in the affirmative:

Theorem 1 The bound (3) holds for arbitrarily large {c>0}.

In principle, we thus have a bound of the form

\displaystyle G(X) \geq f(X) \frac{(\log\log X) \log\log\log\log X}{(\log\log\log X)^2} \log X

for some {f(X)} that grows to infinity. Unfortunately, due to various sources of ineffectivity in our methods, we cannot provide any explicit rate of growth on {f(X)} at all.

We decided to announce this result the old-fashioned way, as part of a research lecture; more precisely, Ben Green announced the result in his ICM lecture this Tuesday. (The ICM staff have very efficiently put up video of his talk (and most of the other plenary and prize talks) online; Ben’s talk is here, with the announcement beginning at about 0:48. Note a slight typo in his slides, in that the exponent of {\log\log\log X} in the denominator is {3} instead of {2}.) Ben’s lecture slides may be found here.

By coincidence, an independent proof of this theorem has also been obtained very recently by James Maynard.

I discuss our proof method below the fold.

— 1. Sketch of proof —

Our method is a modification of Rankin’s method, combined with some work of myself and Ben Green (and of Tamar Ziegler) on counting various linear patterns in the primes. To explain this, let us first revisit Rankin’s argument, presented in a fashion that allows for comparison with our own methods. Consider first the easy bound (1) that came from using the consecutive string {n!+2,\dots,n!+n} of composite numbers. This bound was inferior to the prime number theorem bound (2); however, this is easily remedied by replacing {n!} with the somewhat smaller primorial {P(n)}, defined as the product of the primes up to and including {n}. It is still easy to see that {P(n)+2,\dots,P(n)+n} are all composite, with each {P(n)+i, i \leq n} divisible by some prime {p \leq n} while being larger than that prime. On the other hand, the prime number theorem tells us that {P(n) = \exp( (1+o(1)) n)}. From this, one can recover an alternate proof of (2) (perhaps not so surprising, since the prime number theorem is a key ingredient in both proofs).
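The primorial variant is just as easy to check numerically; a small sketch (my own illustration, stdlib Python):

```python
from math import log, prod

def primorial(n):
    """P(n): the product of all primes up to and including n."""
    return prod(p for p in range(2, n + 1)
                if all(p % d for d in range(2, int(p**0.5) + 1)))

n = 30
P = primorial(n)
# each P(n)+i with 2 <= i <= n is divisible by some p <= n yet larger than p
for i in range(2, n + 1):
    assert any((P + i) % p == 0 and P + i > p for p in range(2, n + 1))
# log P(n) / n tends to 1 as n grows (prime number theorem); convergence is slow
print(log(P) / n)
```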

This gives hope that further modification of this construction can be used to go beyond (2). If one looks carefully at the above proof, we see that the key fact used here is that the discrete interval of integers {\{2,\dots,n\}} is completely sieved out by the residue classes {0 \hbox{ mod } p} for primes {p \leq n}, in the sense that each element in this interval is contained in at least one of these residue classes. More generally (and shifting the interval by {1} for a more convenient normalisation), suppose we can find an interval {[y] := \{ n \in {\bf N}: n \leq y \}} which is completely sieved out by one residue class {a_p \hbox{ mod } p} for each {p \leq x}, for some {x} and {y}. Then the string of consecutive numbers {m+1,\dots,m+\lfloor y\rfloor} will be composite, whenever {m} is an integer larger than or equal to {x} with {m = - a_p \hbox{ mod } p} for each prime {p \leq x}, since each of the {m+i, i \leq y} will be divisible by some prime {p \leq x} while being larger than that prime. From the Chinese remainder theorem, one can find such an {m} that is of size at most {x+P(x)}. From this and the prime number theorem, one can obtain lower bounds on {G(X)} if one can get lower bounds on {y} in terms of {x}. In particular, if for any large {x} one can completely sieve out {[y]} with a residue class {a_p \hbox{ mod } p} for each {p \leq x}, and

\displaystyle y \sim c \frac{(\log x) \log\log\log x}{(\log\log x)^2} x, \ \ \ \ \ (4)

then one can establish the bound (3). (The largest {y} one can take for a given {x} is known as the Jacobsthal function of the primorial {P(x)}.) So the task is basically to find a smarter set of congruence classes {a_p \hbox{ mod } p} than just the zero congruence classes {0 \hbox{ mod } p}, so as to sieve out an interval {[y]} with {y} significantly larger than {x}. (Unfortunately, this approach by itself is unlikely to reach the Cramér conjecture; it was shown by Iwaniec using the large sieve that {y} is necessarily of size {O(x^2)} (which somewhat coincidentally matches the Cramér bound), but Maier and Pomerance conjecture that in fact one must have {y \leq x \log^{2+o(1)} x}, which would mean that the limit of this method would be to establish a bound of the form {G(X) \geq (\log\log X)^{2+o(1)} \log X}.)
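To make the general recipe concrete, here is a toy example of my own (not from the paper): take {x=13}, choose the classes {a_p = -1 \hbox{ mod } p} (which completely sieve out {[12]}), and solve for {m} by the Chinese remainder theorem.

```python
from math import prod

def is_composite(n):
    return n > 1 and any(n % d == 0 for d in range(2, int(n**0.5) + 1))

x = 13
primes = [2, 3, 5, 7, 11, 13]
y = x - 1
classes = {p: (-1) % p for p in primes}   # a_p = -1 mod p

# the classes completely sieve out [y]: every i <= y lies in some class a_p mod p
# (equivalently, every i+1 in {2,...,13} has a prime factor <= 13)
assert all(any(i % p == classes[p] for p in primes) for i in range(1, y + 1))

# CRT step: we need m >= x with m = -a_p = 1 (mod p) for every p; simplest choice:
M = prod(primes)   # the primorial P(13) = 30030
m = 1 + M
# then m+1, ..., m+y is a run of y consecutive composites
assert all(is_composite(m + i) for i in range(1, y + 1))
```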

So, how can one do better than just using the “Eratosthenes” sieve {0 \hbox{ mod } p}? We will divide the sieving into different stages, depending on the size of {p}. It turns out that a reasonably optimal division of primes {p} up to {x} will be into the following four classes:

  • Stage 1 primes: primes {p} that are either tiny (less than {\log x}) or medium size (between {z} and {x/4}), where {z} is a parameter to be chosen later.
  • Stage 2 primes: primes that are small (between {\log x} and {z}).
  • Stage 3 primes: primes that are very large (between {x/2} and {x}).
  • Stage 4 primes: primes that are fairly large (between {x/4} and {x/2}).

We will take an interval {[y]}, where {y} is given by (4), and sieve out first by Stage 1 primes, then Stage 2 primes, then Stage 3 primes, then Stage 4 primes, until none of the elements of {[y]} are left.

Let’s first discuss the final sieving step, which is rather trivial. Suppose that our sieving by the first three sieving stages is so efficient that the number of surviving elements of {[y]} is less than or equal to the number of Stage 4 primes (by the prime number theorem, this will for instance be the case for sufficiently large {x} if there are fewer than {(\frac{1}{5}+o(1)) \frac{x}{\log x}} survivors). Then one can finish off the remaining survivors simply by using each of the Stage 4 primes {p} to remove one of the surviving integers in {[y]} by an appropriate choice of residue class {a_p \hbox{ mod } p}. So we can recast our problem as an approximate sieving problem rather than a perfect sieving problem; we now only need to eliminate most of the elements of {[y]} rather than all of them, at the cost of only using primes from the Stages 1-3, rather than 1-4. Note though that for {y} given by (4), the Stage 1-3 sieving has to be reasonably efficient, in that the proportion of survivors cannot be too much larger than {1/\log^2 x} (ignoring factors of {\log\log x} etc.).
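The bookkeeping in this final step can be sketched in a few lines (a toy model with made-up data, just to illustrate the assignment of one survivor per prime):

```python
def stage4_classes(survivors, stage4_primes):
    """Toy bookkeeping: give each surviving integer its own Stage 4 prime p
    and choose the class a_p = s mod p, which removes that survivor s."""
    assert len(survivors) <= len(stage4_primes)
    return {p: s % p for s, p in zip(survivors, stage4_primes)}

# made-up toy data: three survivors, four available primes
classes = stage4_classes([101, 205, 319], [251, 257, 263, 269])
assert classes == {251: 101, 257: 205, 263: 56}   # 319 mod 263 = 56
```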

Next, we discuss the Stage 1 sieving process. Here, we will simply copy the classic construction and use the Eratosthenes sieve {0 \hbox{ mod } p} for these primes. The elements of {[y]} that survive this process are those elements that are not divisible by any Stage 1 primes: that is to say, they are either only divisible by small (Stage 2) primes, or else contain at least one prime factor larger than {x/4} and no prime factors less than {\log x}. In the latter case, the survivor has no choice but to be a prime in {(x/4,y]} (since from (4) we have {y = o(x \log x)}, so the cofactor of any prime factor larger than {x/4} is less than {\log x}, and hence must equal {1}). In the former case, the survivor is a {z}-smooth number – a number with no prime factors greater than {z}. How many such survivors are there? Here we can use a somewhat crude upper bound of Rankin:

Lemma 2 Let {1 < z \leq y} be large quantities, and write {u := \frac{\log y}{\log z}}. Suppose that

\displaystyle \log u = o( \log z ).

Then the number of {z}-smooth numbers in {[y]} is at most {e^{-u\log u + O(u)} y \log z}.

Proof: We use a Dirichlet series method commonly known as “Rankin’s trick”. Let {0<\rho<1} be a quantity to be optimised later, and abbreviate “{z}-smooth” as “smooth”. Observe that if {n} is a smooth number less than {y}, then

\displaystyle 1 \leq \frac{y}{y^\rho} \frac{1}{n^{1-\rho}}. \ \ \ \ \ (5)

Thus, the number of smooth numbers in {[y]} is at most

\displaystyle \frac{y}{y^\rho} \sum_{n \hbox{ smooth}} \frac{1}{n^{1-\rho}}

where we have simply discarded the constraint {n \leq y}. The point of doing this is that the above expression factors into a tractable Euler product

\displaystyle \frac{y}{y^\rho} \prod_{p \leq z} (1 + \frac{1}{p^{1-\rho}} + \frac{1}{p^{2(1-\rho)}} + \dots).

We will choose

\displaystyle \rho := \frac{u \log u}{\log y} = \frac{\log u}{\log z} \ \ \ \ \ (6)

so that {y^\rho = e^{u \log u}} and {\rho=o(1)}. Then the above expression simplifies to

\displaystyle \ll \frac{y}{e^{u \log u}} \exp( \sum_{p \leq z} \frac{1}{p^{1-\rho}} ).

To compute the sum here, we first observe from Mertens’ theorem (discussed in this previous blog post) that

\displaystyle \sum_{p \leq z} \frac{1}{p} = \log \log z + O(1),

so we may bound the previous expression by

\displaystyle \ll \frac{y \log z}{e^{u \log u}} \exp( \sum_{p \leq z} \frac{1}{p^{1-\rho}} - \frac{1}{p} )

which we rewrite using (6) as

\displaystyle \ll \frac{y \log z}{e^{u \log u}} \exp( \sum_{p \leq z} \frac{\exp( \frac{\log p}{\log z} \log u ) - 1}{p} ).

Next, we use the convexity inequality

\displaystyle \exp( ct ) - 1 \leq (\exp(c) - 1 ) t

for {c > 0} and {0 \leq t \leq 1}, applied with {c := \log u} and {t := \frac{\log p}{\log z}}, to conclude that

\displaystyle \exp( \frac{\log p}{\log z} \log u ) - 1 \leq u \frac{\log p}{\log z}.

Finally, from Mertens’ theorem we have {\sum_{p \leq z} \frac{\log p}{p \log z} \ll 1}, so the sum inside the previous exponential is {O(u)}. The bound follows. \Box

Remark 1 One can basically eliminate the {\log z} factor here (at the cost of worsening the {O(u)} error slightly to {O(u\log\log(3u))}) by a more refined version of the Rankin trick, based on replacing the crude bound (5) by the more sophisticated inequality

\displaystyle 1 = \frac{\log \frac{y}{n}}{\log y} + \sum_{p^\nu || n; p \leq z} \frac{\nu \log p}{\log y}

\displaystyle \ll \frac{y}{y^\rho \log y} \frac{1}{n^{1-\rho}} + \sum_{p^\nu || n; p \leq z} \frac{\nu \log p}{\log y},

where {p^\nu||n} denotes the assertion that {p} divides {n} exactly {\nu} times. (Thanks to Kevin Ford for pointing out this observation to me.) In fact, the number of {z}-smooth numbers in {[y]} is known to be asymptotically {e^{-u\log u + O( u \log\log(3u) )} y} in the range {\log^3 y \leq z \leq y}, a result of de Bruijn.
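One can get a feel for the strength of Lemma 2 numerically; the sketch below (my own illustration) counts {z}-smooth numbers in {[y]} exactly via a smallest-prime-factor table and prints the main term {e^{-u\log u} y \log z} alongside.

```python
from math import exp, log

def smallest_prime_factor_table(y):
    """spf[n] = smallest prime factor of n (spf[n] == n iff n is prime or n < 2)."""
    spf = list(range(y + 1))
    for p in range(2, int(y**0.5) + 1):
        if spf[p] == p:                      # p is prime
            for n in range(p * p, y + 1, p):
                if spf[n] == n:
                    spf[n] = p
    return spf

def count_smooth(y, z):
    """Exact count of z-smooth integers in {1, ..., y}."""
    spf = smallest_prime_factor_table(y)
    def smooth(n):
        while n > 1:
            p = spf[n]
            if p > z:
                return False
            while n % p == 0:
                n //= p
        return True
    return sum(smooth(n) for n in range(1, y + 1))

y, z = 10**5, 20
u = log(y) / log(z)
print("exact count:", count_smooth(y, z))
print("main term e^{-u log u} y log z:", exp(-u * log(u)) * y * log(z))
```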

In view of the error term permitted by the Stage 4 process, we would like to take {z} as large as possible while still leaving only {o(x/\log x)} smooth numbers in {[y]}. A somewhat efficient choice of {z} here is

\displaystyle z := \exp( \frac{\log\log\log x}{4 \log\log x} \log x ),

so that {u \sim 4 \frac{\log\log x}{\log\log\log x}} and {u\log u \sim 4 \log\log x}, and then one can check that the above lemma does indeed show that there are {o(x/\log x)} smooth numbers in {[y]}. (If we use the sharper bound in the remark, we can reduce the {4} here to a {3}, although this makes little difference to the final bound.) If we let {{\mathcal Q}} denote all the primes in {(x/4,y]}, the remaining task is then to sieve out all but {(\frac{1}{5}+o(1)) \frac{x}{\log x}} of the primes in {{\mathcal Q}} by using one congruence class from each of the Stage 2 and Stage 3 primes.

Note that {{\mathcal Q}} is still quite large compared to the error that the Stage 4 primes can handle – it is of size about {y/\log x}, whereas we need to get down to a bit less than {x/\log x}. Still, this is some progress (the remaining sparsification needed is of the order of {1/\log^{1+o(1)} x} rather than {1/\log^{2+o(1)} x}).

For the Stage 2 sieve, we will just use a random construction, choosing {a_p \hbox{ mod } p} uniformly at random for each Stage 2 prime. This sieve is expected to sparsify the set {{\mathcal Q}} of survivors by a factor

\displaystyle \gamma := \prod_{p \hbox{ Stage 2 prime}} (1-\frac{1}{p}),

which by Mertens’ theorem is of size

\displaystyle \gamma \sim \frac{\log \log x}{\log z} \sim 4 \frac{(\log \log x)^2}{\log x \log\log\log x}.

In particular, if {y} is given by (4), then all the strange logarithmic factors cancel out and

\displaystyle \gamma y \sim 4c x.

In particular, we expect {{\mathcal Q}} to be cut down to a random set (which we have called {{\mathcal Q}({\bf a})} in our paper) of size about {4c \frac{x}{\log x}}. This would already finish the job for very small {c} (e.g. {c \leq 1/20}), and indeed Rankin’s original argument proceeds more or less along these lines. But now we want to take {c} to be large.
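The Mertens computation above is easy to sanity-check numerically (my own illustration, with stand-in endpoints {w} and {z} playing the roles of {\log x} and {z}): the sieving factor for primes in {(w,z]} is asymptotically {\log w/\log z}.

```python
from math import log, prod

def primes_in(lo, hi):
    """Primes p with lo < p <= hi, by trial division (fine at this scale)."""
    return [n for n in range(max(2, lo + 1), hi + 1)
            if all(n % d for d in range(2, int(n**0.5) + 1))]

w, z = 10, 10**4
sieving_factor = prod(1 - 1 / p for p in primes_in(w, z))
prediction = log(w) / log(z)      # ratio form of Mertens' third theorem
print(sieving_factor, prediction)
```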

Fortunately, we still have the Stage 3 primes to play with. But the number of Stage 3 primes is about {\frac{1}{2} \frac{x}{\log x}}, which is a bit smaller than the number of surviving primes {{\mathcal Q}({\bf a})}, which is about {4c \frac{x}{\log x}}. So to make this work, most of the Stage 3 congruence classes {a_p \hbox{ mod } p} need to sieve out many primes from {{\mathcal Q}({\bf a})}, rather than just one or two. (Rankin’s original argument is based on sieving out one prime per congruence class; the subsequent work of Maier-Pomerance and Pintz is basically based on sieving out two primes per congruence class.)

Here, one has to take some care because the set {{\mathcal Q}({\bf a})} is already quite sparse inside {[y]} (its density is about {1/\log^{2+o(1)} x}). So a randomly chosen {a_p \hbox{ mod } p} would in fact most likely catch none of the primes in {{\mathcal Q}({\bf a})} at all. So we need to restrict attention to congruence classes {a_p \hbox{ mod } p} which already catch a large number of primes in {{\mathcal Q}}, so that even after the Stage 2 sieving one can hope to be left with many congruence classes that also catch a large number of primes in {{\mathcal Q}({\bf a})}.

Here’s where my work with Ben came in. Suppose one has an arithmetic progression {a, a+d, \dots, a+(r-1)d} of length {r} consisting entirely of primes in {{\mathcal Q}}, with {d} a multiple of {p}; then the congruence class {a \hbox{ mod } p} is guaranteed to pick up at least {r} primes in {{\mathcal Q}}. My first theorem with Ben shows that no matter how large {r} is, the set {{\mathcal Q}} does indeed contain some arithmetic progressions {a,a+d,\dots,a+(r-1)d} of length {r}. This result is not quite suitable for our applications here, because (a) we need the spacing {d} to also be divisible by a Stage 3 prime {p} (in our paper, we take {d = r! p} for concreteness, although other choices are certainly possible), and (b) for technical reasons, it is insufficient to simply have a large number of arithmetic progressions of primes strewn around {{\mathcal Q}}; they have to be “evenly distributed” in some sense in order to be able to still cover most of {{\mathcal Q}({\bf a})} after throwing out any progression that is partly or completely sieved out by the Stage 2 primes. Fortunately, though, these distributional results for linear equations in primes were established by a subsequent paper of Ben and myself, contingent on two conjectures (the Möbius–nilsequences conjecture and the inverse conjecture for the Gowers norms) which we also proved (the latter with Tamar Ziegler) in some further papers. (Actually, strictly speaking our work does not quite cover the case needed here, because the progressions are a little “narrow”; we need progressions of primes in {[y]} whose spacing {d} is comparable to {x} instead of {y}, whereas our paper only considered the situation in which the spacing was comparable to the elements of the progression. It turns out, though, that the arguments can be modified (somewhat tediously) to extend to this case.)
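A toy example of the mechanism (my own, not from the paper): the progression {5, 11, 17, 23, 29} consists of primes with spacing {d = 6}, a multiple of {p = 3}, so all five terms lie in the single class {2 \hbox{ mod } 3}, and that one congruence class removes them all.

```python
p = 3
prog = [5, 11, 17, 23, 29]   # primes in arithmetic progression, spacing d = 6 = 2p
d = prog[1] - prog[0]
assert d % p == 0
# since p divides d, all r = 5 terms fall in the single residue class a mod p, a = 5
assert {q % p for q in prog} == {5 % p}
```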