You are currently browsing the tag archive for the ‘prime gaps’ tag.

William Banks, Kevin Ford, and I have just uploaded to the arXiv our paper “Large prime gaps and probabilistic models“. In this paper we introduce a random model to help understand the connection between two well known conjectures regarding the primes , the Cramér conjecture and the Hardy-Littlewood conjecture:

Conjecture 1 (Cramér conjecture)If is a large number, then the largest prime gap in is of size . (Granville refines this conjecture to , where . Here we use the asymptotic notation for , for , for , and for .)

Conjecture 2 (Hardy-Littlewood conjecture)If are fixed distinct integers, then the number of numbers with all prime is as , where the singular series is defined by the formula

(One can view these conjectures as modern versions of two of the classical Landau problems, namely Legendre’s conjecture and the twin prime conjecture respectively.)

A well known connection between the Hardy-Littlewood conjecture and prime gaps was made by Gallagher. Among other things, Gallagher showed that if the Hardy-Littlewood conjecture was true, then the prime gaps with were asymptotically distributed according to an exponential distribution of mean , in the sense that

as for any fixed . Roughly speaking, the way this is established is by using the Hardy-Littlewood conjecture to control the mean values of for fixed , where ranges over the primes in . The relevance of these quantities arises from the Bonferroni inequalities (or “Brun pure sieve“), which can be formulated as the assertion that

when is even and

when is odd, for any natural number ; setting and taking means, one then gets upper and lower bounds for the probability that the interval is free of primes. The most difficult step is to control the mean values of the singular series as ranges over -tuples in a fixed interval such as .

Heuristically, if one extrapolates the asymptotic (1) to the regime , one is then led to Cramér’s conjecture, since the right-hand side of (1) falls below when is significantly larger than . However, this is not a rigorous derivation of Cramér’s conjecture from the Hardy-Littlewood conjecture, since Gallagher’s computations only establish (1) for *fixed* choices of , which is only enough to establish the far weaker bound , which was already known (see this previous paper for a discussion of the best known unconditional lower bounds on ). An inspection of the argument shows that if one wished to extend (1) to parameter choices that were allowed to grow with , then one would need as input a stronger version of the Hardy-Littlewood conjecture in which the length of the tuple , as well as the magnitudes of the shifts , were also allowed to grow with . Our initial objective in this project was then to quantify exactly what strengthening of the Hardy-Littlewood conjecture would be needed to rigorously imply Cramer’s conjecture. The precise results are technical, but roughly we show results of the following form:

Theorem 3 (Large gaps from Hardy-Littlewood, rough statement)

- If the Hardy-Littlewood conjecture is uniformly true for -tuples of length , and with shifts of size , with a power savings in the error term, then .
- If the Hardy-Littlewood conjecture is “true on average” for -tuples of length and shifts of size for all , with a power savings in the error term, then .

In particular, we can recover Cramer’s conjecture given a sufficiently powerful version of the Hardy-Littlewood conjecture “on the average”.

Our proof of this theorem proceeds more or less along the same lines as Gallagher’s calculation, but now with allowed to grow slowly with . Again, the main difficulty is to accurately estimate average values of the singular series . Here we found it useful to switch to a probabilistic interpretation of this series. For technical reasons it is convenient to work with a truncated, unnormalised version

of the singular series, for a suitable cutoff ; it turns out that when studying prime tuples of size , the most convenient cutoff is the “Pólya magic cutoff“, defined as the largest prime for which

(this is well defined for ); by Mertens’ theorem, we have . One can interpret probabilistically as

where is the randomly sifted set of integers formed by removing one residue class uniformly at random for each prime . The Hardy-Littlewood conjecture can be viewed as an assertion that the primes behave in some approximate statistical sense like the random sifted set , and one can prove the above theorem by using the Bonferroni inequalities both for the primes and for the random sifted set, and comparing the two (using an even for the sifted set and an odd for the primes in order to be able to combine the two together to get a useful bound).

The proof of Theorem 3 ended up not using any properties of the set of primes other than that this set obeyed some form of the Hardy-Littlewood conjectures; the theorem remains true (with suitable notational changes) if this set were replaced by any other set. In order to convince ourselves that our theorem was not vacuous due to our version of the Hardy-Littlewood conjecture being too strong to be true, we then started exploring the question of coming up with random models of which obeyed various versions of the Hardy-Littlewood and Cramér conjectures.

This line of inquiry was started by Cramér, who introduced what we now call the *Cramér random model* of the primes, in which each natural number is selected for membership in with an independent probability of . This model matches the primes well in some respects; for instance, it almost surely obeys the “Riemann hypothesis”

and Cramér also showed that the largest gap was almost surely . On the other hand, it does not obey the Hardy-Littlewood conjecture; more precisely, it obeys a simplified variant of that conjecture in which the singular series is absent.

Granville proposed a refinement to Cramér’s random model in which one first sieves out (in each dyadic interval ) all residue classes for for a certain threshold , and then places each surviving natural number in with an independent probability . One can verify that this model obeys the Hardy-Littlewood conjectures, and Granville showed that the largest gap in this model was almost surely , leading to his conjecture that this bound also was true for the primes. (Interestingly, this conjecture is not yet borne out by numerics; calculations of prime gaps up to , for instance, have shown that never exceeds in this range. This is not necessarily a conflict, however; Granville’s analysis relies on inspecting gaps in an extremely sparse region of natural numbers that are more devoid of primes than average, and this region is not well explored by existing numerics. See this previous blog post for more discussion of Granville’s argument.)

However, Granville’s model does not produce a power savings in the error term of the Hardy-Littlewood conjectures, mostly due to the need to truncate the singular series at the logarithmic cutoff . After some experimentation, we were able to produce a tractable random model for the primes which obeyed the Hardy-Littlewood conjectures with power savings, and which reproduced Granville’s gap prediction of (we also get an upper bound of for both models, though we expect the lower bound to be closer to the truth); to us, this strengthens the case for Granville’s version of Cramér’s conjecture. The model can be described as follows. We select one residue class uniformly at random for each prime , and as before we let be the sifted set of integers formed by deleting the residue classes with . We then set

with Pólya’s magic cutoff (this is the cutoff that gives a density consistent with the prime number theorem or the Riemann hypothesis). As stated above, we are able to show that almost surely one has

and that the Hardy-Littlewood conjectures hold with power savings for up to for any fixed and for shifts of size . This is unfortunately a tiny bit weaker than what Theorem 3 requires (which more or less corresponds to the endpoint ), although there is a variant of Theorem 3 that can use this input to produce a lower bound on gaps in the model (but it is weaker than the one in (3)). In fact we prove a more precise almost sure asymptotic formula for that involves the optimal bounds for the *linear sieve* (or *interval sieve*), in which one deletes one residue class modulo from an interval for all primes up to a given threshold. The lower bound in (3) relates to the case of deleting the residue classes from ; the upper bound comes from the delicate analysis of the linear sieve by Iwaniec. Improving on either of the two bounds looks to be quite a difficult problem.

The probabilistic analysis of is somewhat more complicated than of or as there is now non-trivial coupling between the events as varies, although moment methods such as the second moment method are still viable and allow one to verify the Hardy-Littlewood conjectures by a lengthy but fairly straightforward calculation. To analyse large gaps, one has to understand the statistical behaviour of a random linear sieve in which one starts with an interval and randomly deletes a residue class for each prime up to a given threshold. For very small this is handled by the deterministic theory of the linear sieve as discussed above. For medium sized , it turns out that there is good concentration of measure thanks to tools such as Bennett’s inequality or Azuma’s inequality, as one can view the sieving process as a martingale or (approximately) as a sum of independent random variables. For larger primes , in which only a small number of survivors are expected to be sieved out by each residue class, a direct combinatorial calculation of all possible outcomes (involving the random graph that connects interval elements to primes if falls in the random residue class ) turns out to give the best results.

Kevin Ford, Sergei Konyagin, James Maynard, Carl Pomerance, and I have uploaded to the arXiv our paper “Long gaps in sieved sets“, submitted to J. Europ. Math. Soc..

This paper originated from the MSRI program in analytic number theory last year, and was centred around variants of the question of finding large gaps between primes. As discussed for instance in this previous post, it is now known that within the set of primes , one can find infinitely many adjacent elements whose gap obeys a lower bound of the form

where denotes the -fold iterated logarithm. This compares with the trivial bound of that one can obtain from the prime number theorem and the pigeonhole principle. Several years ago, Pomerance posed the question of whether analogous improvements to the trivial bound can be obtained for such sets as

Here there is the obvious initial issue that this set is not even known to be infinite (this is the fourth Landau problem), but let us assume for the sake of discussion that this set is indeed infinite, so that we have an infinite number of gaps to speak of. Standard sieve theory techniques give upper bounds for the density of that is comparable (up to an absolute constant) to the prime number theorem bounds for , so again we can obtain a trivial bound of for the gaps of . In this paper we improve this to

for an absolute constant ; this is not as strong as the corresponding bound for , but still improves over the trivial bound. In fact we can handle more general “sifted sets” than just . Recall from the sieve of Eratosthenes that the elements of in, say, the interval can be obtained by removing from one residue class modulo for each prime up to , namely the class mod . In a similar vein, the elements of in can be obtained by removing for each prime up to zero, one, or two residue classes modulo , depending on whether is a quadratic residue modulo . On the average, one residue class will be removed (this is a very basic case of the Chebotarev density theorem), so this sieving system is “one-dimensional on the average”. Roughly speaking, our arguments apply to any other set of numbers arising from a sieving system that is one-dimensional on average. (One can consider other dimensions also, but unfortunately our methods seem to give results that are worse than a trivial bound when the dimension is less than or greater than one.)

The standard “Erdős-Rankin” method for constructing long gaps between primes proceeds by trying to line up some residue classes modulo small primes so that they collectively occupy a long interval. A key tool in doing so are the smooth number estimates of de Bruijn and others, which among other things assert that if one removes from an interval such as all the residue classes mod for between and for some fixed , then the set of survivors has exceptionally small density (roughly of the order of , with the precise density given by the Dickman function), in marked contrast to the situation in which one randomly removes one residue class for each such prime , in which the density is more like . One generally exploits this phenomenon to sieve out almost all the elements of a long interval using some of the primes available, and then using the remaining primes to cover up the remaining elements that have not already been sifted out. In the more recent work on this problem, advanced combinatorial tools such as hypergraph covering lemmas are used for the latter task.

In the case of , there does not appear to be any analogue of smooth numbers, in the sense that there is no obvious way to arrange the residue classes so that they have significantly fewer survivors than a random arrangement. Instead we adopt the following semi-random strategy to cover an interval by residue classes. Firstly, we randomly remove residue classes for primes up to some intermediate threshold (smaller than by a logarithmic factor), leaving behind a preliminary sifted set . Then, for each prime between and another intermediate threshold , we remove a residue class mod that maximises (or nearly maximises) its intersection with . This ends up reducing the number of survivors to be significantly below what one would achieve if one selects residue classes randomly, particularly if one also uses the hypergraph covering lemma from our previous paper. Finally, we cover each the remaining survivors by a residue class from a remaining available prime.

There is a very nice recent paper by Lemke Oliver and Soundararajan (complete with a popular science article about it by the consistently excellent Erica Klarreich for Quanta) about a surprising (but now satisfactorily explained) bias in the distribution of pairs of consecutive primes when reduced to a small modulus .

This phenomenon is superficially similar to the more well known Chebyshev bias concerning the reduction of a single prime to a small modulus , but is in fact a rather different (and much stronger) bias than the Chebyshev bias, and seems to arise from a completely different source. The Chebyshev bias asserts, roughly speaking, that a randomly selected prime of a large magnitude will typically (though not always) be slightly more likely to be a quadratic non-residue modulo than a quadratic residue, but the bias is small (the difference in probabilities is only about for typical choices of ), and certainly consistent with known or conjectured positive results such as Dirichlet’s theorem or the generalised Riemann hypothesis. The reason for the Chebyshev bias can be traced back to the von Mangoldt explicit formula which relates the distribution of the von Mangoldt function modulo with the zeroes of the -functions with period . This formula predicts (assuming some standard conjectures like GRH) that the von Mangoldt function is quite unbiased modulo . The von Mangoldt function is *mostly* concentrated in the primes, but it also has a medium-sized contribution coming from *squares* of primes, which are of course all located in the quadratic residues modulo . (Cubes and higher powers of primes also make a small contribution, but these are quite negligible asymptotically.) To balance everything out, the contribution of the primes must then exhibit a small preference towards quadratic non-residues, and this is the Chebyshev bias. (See this article of Rubinstein and Sarnak for a more technical discussion of the Chebyshev bias, and this survey of Granville and Martin for an accessible introduction. The story of the Chebyshev bias is also related to Skewes’ number, once considered the largest explicit constant to naturally appear in a mathematical argument.)

The paper of Lemke Oliver and Soundararajan considers instead the distribution of the pairs for small and for large consecutive primes , say drawn at random from the primes comparable to some large . For sake of discussion let us just take . Then all primes larger than are either or ; Chebyshev’s bias gives a very slight preference to the latter (of order , as discussed above), but apart from this, we expect the primes to be more or less equally distributed in both classes. For instance, assuming GRH, the probability that lands in would be , and similarly for .

In view of this, one would expect that up to errors of or so, the pair should be equally distributed amongst the four options , , , , thus for instance the probability that this pair is would naively be expected to be , and similarly for the other three tuples. These assertions are not yet proven (although some non-trivial upper and lower bounds for such probabilities can be obtained from recent work of Maynard).

However, Lemke Oliver and Soundararajan argue (backed by both plausible heuristic arguments (based ultimately on the Hardy-Littlewood prime tuples conjecture), as well as substantial numerical evidence) that there is a significant bias away from the tuples and – informally, adjacent primes don’t like being in the same residue class! For instance, they predict that the probability of attaining is in fact

with similar predictions for the other three pairs (in fact they give a somewhat more precise prediction than this). The magnitude of this bias, being comparable to , is significantly stronger than the Chebyshev bias of .

One consequence of this prediction is that the prime gaps are slightly less likely to be divisible by than naive random models of the primes would predict. Indeed, if the four options , , , all occurred with equal probability , then should equal with probability , and and with probability each (as would be the case when taking the difference of two random numbers drawn from those integers not divisible by ); but the Lemke Oliver-Soundararajan bias predicts that the probability of being divisible by three should be slightly lower, being approximately .

Below the fold we will give a somewhat informal justification of (a simplified version of) this phenomenon, based on the Lemke Oliver-Soundararajan calculation using the prime tuples conjecture.

Kevin Ford, James Maynard, and I have uploaded to the arXiv our preprint “Chains of large gaps between primes“. This paper was announced in our previous paper with Konyagin and Green, which was concerned with the largest gap

between consecutive primes up to , in which we improved the Rankin bound of

to

for large (where we use the abbreviations , , and ). Here, we obtain an analogous result for the quantity

which measures how far apart the gaps between chains of consecutive primes can be. Our main result is

whenever is sufficiently large depending on , with the implied constant here absolute (and effective). The factor of is inherent to the method, and related to the basic probabilistic fact that if one selects numbers at random from the unit interval , then one expects the minimum gap between adjacent numbers to be about (i.e. smaller than the mean spacing of by an additional factor of ).

Our arguments combine those from the previous paper with the matrix method of Maier, who (in our notation) showed that

for an infinite sequence of going to infinity. (Maier needed to restrict to an infinite sequence to avoid Siegel zeroes, but we are able to resolve this issue by the now standard technique of simply eliminating a prime factor of an exceptional conductor from the sieve-theoretic portion of the argument. As a byproduct, this also makes all of the estimates in our paper effective.)

As its name suggests, the Maier matrix method is usually presented by imagining a matrix of numbers, and using information about the distribution of primes in the columns of this matrix to deduce information about the primes in at least one of the rows of the matrix. We found it convenient to interpret this method in an equivalent probabilistic form as follows. Suppose one wants to find an interval which contained a block of at least primes, each separated from each other by at least (ultimately, will be something like and something like ). One can do this by the probabilistic method: pick to be a random large natural number (with the precise distribution to be chosen later), and try to lower bound the probability that the interval contains at least primes, no two of which are within of each other.

By carefully choosing the residue class of with respect to small primes, one can eliminate several of the from consideration of being prime immediately. For instance, if is chosen to be large and even, then the with even have no chance of being prime and can thus be eliminated; similarly if is large and odd, then cannot be prime for any odd . Using the methods of our previous paper, we can find a residue class (where is a product of a large number of primes) such that, if one chooses to be a large random element of (that is, for some large random integer ), then the set of shifts for which still has a chance of being prime has size comparable to something like ; furthermore this set is fairly well distributed in in the sense that it does not concentrate too strongly in any short subinterval of . The main new difficulty, not present in the previous paper, is to get *lower* bounds on the size of in addition to upper bounds, but this turns out to be achievable by a suitable modification of the arguments.

Using a version of the prime number theorem in arithmetic progressions due to Gallagher, one can show that for each remaining shift , is going to be prime with probability comparable to , so one expects about primes in the set . An upper bound sieve (e.g. the Selberg sieve) also shows that for any distinct , the probability that and are both prime is . Using this and some routine second moment calculations, one can then show that with large probability, the set will indeed contain about primes, no two of which are closer than to each other; with no other numbers in this interval being prime, this gives a lower bound on .

Kevin Ford, Ben Green, Sergei Konyagin, James Maynard, and I have just uploaded to the arXiv our paper “Long gaps between primes“. This is a followup work to our two previous papers (discussed in this previous post), in which we had simultaneously shown that the maximal gap

between primes up to exhibited a lower bound of the shape

for some function that went to infinity as ; this improved upon previous work of Rankin and other authors, who established the same bound but with replaced by a constant. (Again, see the previous post for a more detailed discussion.)

In our previous papers, we did not specify a particular growth rate for . In my paper with Kevin, Ben, and Sergei, there was a good reason for this: our argument relied (amongst other things) on the inverse conjecture on the Gowers norms, as well as the Siegel-Walfisz theorem, and the known proofs of both results both have ineffective constants, rendering our growth function similarly ineffective. Maynard’s approach ostensibly also relies on the Siegel-Walfisz theorem, but (as shown in another recent paper of his) can be made quite effective, even when tracking -tuples of fairly large size (about for some small ). If one carefully makes all the bounds in Maynard’s argument quantitative, one eventually ends up with a growth rate of shape

on the gaps between primes for large ; this is an unpublished calculation of James’.

In this paper we make a further refinement of this calculation to obtain a growth rate

leading to a bound of the form

for large and some small constant . Furthermore, this appears to be the limit of current technology (in particular, falling short of Cramer’s conjecture that is comparable to ); in the spirit of Erdös’ original prize on this problem, I would like to offer 10,000 USD for anyone who can show (in a refereed publication, of course) that the constant here can be replaced by an arbitrarily large constant .

The reason for the growth rate (3) is as follows. After following the sieving process discussed in the previous post, the problem comes down to something like the following: can one sieve out all (or almost all) of the primes in by removing one residue class modulo for all primes in (say) ? Very roughly speaking, if one can solve this problem with , then one can obtain a growth rate on of the shape . (This is an oversimplification, as one actually has to sieve out a random subset of the primes, rather than all the primes in , but never mind this detail for now.)

Using the quantitative “dense clusters of primes” machinery of Maynard, one can find lots of -tuples in which contain at least primes, for as large as or so (so that is about ). By considering -tuples in arithmetic progression, this means that one can find lots of residue classes modulo a given prime in that capture about primes. In principle, this means that union of all these residue classes can cover about primes, allowing one to take as large as , which corresponds to (3). However, there is a catch: the residue classes for different primes may collide with each other, reducing the efficiency of the covering. In our previous papers on the subject, we selected the residue classes randomly, which meant that we had to insert an additional logarithmic safety margin in expected number of times each prime would be shifted out by one of the residue classes, in order to guarantee that we would (with high probability) sift out most of the primes. This additional safety margin is ultimately responsible for the loss in (2).

The main innovation of this paper, beyond detailing James’ unpublished calculations, is to use ideas from the literature on efficient hypergraph covering, to avoid the need for a logarithmic safety margin. The hypergraph covering problem, roughly speaking, is to try to cover a set of vertices using as few “edges” from a given hypergraph as possible. If each edge has vertices, then one certainly needs at least edges to cover all the vertices, and the question is to see if one can come close to attaining this bound given some reasonable uniform distribution hypotheses on the hypergraph . As before, random methods tend to require something like edges before one expects to cover, say of the vertices.

However, it turns out (under reasonable hypotheses on ) to eliminate this logarithmic loss, by using what is now known as the “semi-random method” or the “Rödl nibble”. The idea is to randomly select a small number of edges (a first “nibble”) – small enough that the edges are unlikely to overlap much with each other, thus obtaining maximal efficiency. Then, one pauses to remove all the edges from that intersect edges from this first nibble, so that all remaining edges will not overlap with the existing edges. One then randomly selects another small number of edges (a second “nibble”), and repeats this process until enough nibbles are taken to cover most of the vertices. Remarkably, it turns out that under some reasonable assumptions on the hypergraph , one can maintain control on the uniform distribution of the edges throughout the nibbling process, and obtain an efficient hypergraph covering. This strategy was carried out in detail in an influential paper of Pippenger and Spencer.

In our setup, the vertices are the primes in , and the edges are the intersection of the primes with various residue classes. (Technically, we have to work with a family of hypergraphs indexed by a prime , rather than a single hypergraph, but let me ignore this minor technical detail.) The semi-random method would *in principle* eliminate the logarithmic loss and recover the bound (3). However, there is a catch: the analysis of Pippenger and Spencer relies heavily on the assumption that the hypergraph is uniform, that is to say all edges have the same size. In our context, this requirement would mean that each residue class captures exactly the same number of primes, which is not the case; we only control the number of primes in an average sense, but we were unable to obtain any concentration of measure to come close to verifying this hypothesis. And indeed, the semi-random method, when applied naively, does not work well with edges of variable size – the problem is that edges of large size are much more likely to be eliminated after each nibble than edges of small size, since they have many more vertices that could overlap with the previous nibbles. Since the large edges are clearly the more useful ones for the covering problem than small ones, this bias towards eliminating large edges significantly reduces the efficiency of the semi-random method (and also greatly complicates the *analysis* of that method).

Our solution to this is to iteratively *reweight* the probability distribution on edges after each nibble to compensate for this bias effect, giving larger edges a greater weight than smaller edges. It turns out that there is a natural way to do this reweighting that allows one to repeat the Pippenger-Spencer analysis in the presence of edges of variable size, and this ultimately allows us to recover the full growth rate (3).

To go beyond (3), one either has to find a lot of residue classes that can capture significantly more than primes of size (which is the limit of the multidimensional Selberg sieve of Maynard and myself), or else one has to find a very different method to produce large gaps between primes than the Erdös-Rankin method, which is the method used in all previous work on the subject.

It turns out that the arguments in this paper can be combined with the Maier matrix method to also produce chains of consecutive large prime gaps whose size is of the order of (4); three of us (Kevin, James, and myself) will detail this in a future paper. (A similar combination was also recently observed in connection with our earlier result (1) by Pintz, but there are some additional technical wrinkles required to recover the full gain of (3) for the chains of large gaps problem.)

Kevin Ford, Ben Green, Sergei Konyagin, and myself have just posted to the arXiv our preprint “Large gaps between consecutive prime numbers“. This paper concerns the “opposite” problem to that considered by the recently concluded Polymath8 project, which was concerned with very small values of the prime gap . Here, we wish to consider the *largest* prime gap that one can find in the interval as goes to infinity.

Finding lower bounds on is more or less equivalent to locating long strings of consecutive composite numbers that are not too large compared to the length of the string. A classic (and quite well known) construction here starts with the observation that for any natural number , the consecutive numbers are all composite, because each , is divisible by some prime , while being strictly larger than that prime . From this and Stirling’s formula, it is not difficult to obtain the bound

A more efficient bound comes from the prime number theorem: there are only primes up to , so just from the pigeonhole principle one can locate a string of consecutive composite numbers up to of length at least , thus

where we use or as shorthand for or .

What about upper bounds? The *Cramér random model* predicts that the primes up to are distributed like a random subset of density . Using this model, Cramér arrived at the conjecture

In fact, if one makes the extremely optimistic assumption that the random model perfectly describes the behaviour of the primes, one would arrive at the even more precise prediction

However, it is no longer widely believed that this optimistic version of the conjecture is true, due to some additional irregularities in the primes coming from the basic fact that large primes cannot be divisible by very small primes. Using the Maier matrix method to capture some of this irregularity, Granville was led to the conjecture that

(note that is slightly larger than ). For comparison, the known upper bounds on are quite weak; unconditionally one has by the work of Baker, Harman, and Pintz, and even on the Riemann hypothesis one only gets down to , as shown by Cramér (a slight improvement is also possible if one additionally assumes the pair correlation conjecture; see this article of Heath-Brown and the references therein).

This conjecture remains out of reach of current methods. In 1931, Westzynthius managed to improve the bound (2) slightly to

which Erdös in 1935 improved to

and Rankin in 1938 improved slightly further to

with . Remarkably, this rather strange bound then proved extremely difficult to advance further on; until recently, the only improvements were to the constant , which was raised to in 1963 by Schönhage, to in 1963 by Rankin, to by Maier and Pomerance, and finally to in 1997 by Pintz.

Erdös listed the problem of making arbitrarily large one of his favourite open problems, even offering (“somewhat rashly”, in his words) a cash prize for the solution. Our main result answers this question in the affirmative:

Theorem 1The bound (3) holds for arbitrarily large .

In principle, we thus have a bound of the form

for some that grows to infinity. Unfortunately, due to various sources of ineffectivity in our methods, we cannot provide any explicit rate of growth on at all.

We decided to announce this result the old-fashioned way, as part of a research lecture; more precisely, Ben Green announced the result in his ICM lecture this Tuesday. (The ICM staff have very efficiently put up video of his talks (and most of the other plenary and prize talks) online; Ben’s talk is here, with the announcement beginning at about 0:48. Note a slight typo in his slides, in that the exponent of in the denominator is instead of .) Ben’s lecture slides may be found here.

By coincidence, an independent proof of this theorem has also been obtained very recently by James Maynard.

I discuss our proof method below the fold.

Suppose one is given a -tuple of distinct integers for some , arranged in increasing order. When is it possible to find infinitely many translates of which consists entirely of primes? The case is just Euclid’s theorem on the infinitude of primes, but the case is already open in general, with the case being the notorious twin prime conjecture.

On the other hand, there are some tuples for which one can easily answer the above question in the negative. For instance, the only translate of that consists entirely of primes is , basically because each translate of must contain an even number, and the only even prime is . More generally, if there is a prime such that meets each of the residue classes , then every translate of contains at least one multiple of ; since is the only multiple of that is prime, this shows that there are only finitely many translates of that consist entirely of primes.

To avoid this obstruction, let us call a -tuple *admissible* if it avoids at least one residue class for each prime . It is easy to check for admissibility in practice, since a -tuple is automatically admissible in every prime larger than , so one only needs to check a finite number of primes in order to decide on the admissibility of a given tuple. For instance, or are admissible, but is not (because it covers all the residue classes modulo ). We then have the famous Hardy-Littlewood prime tuples conjecture:

Conjecture 1 (Prime tuples conjecture, qualitative form)If is an admissible -tuple, then there exists infinitely many translates of that consist entirely of primes.

This conjecture is extremely difficult (containing the twin prime conjecture, for instance, as a special case), and in fact there is *no* explicitly known example of an admissible -tuple with for which we can verify this conjecture (although, thanks to the recent work of Zhang, we know that satisfies the conclusion of the prime tuples conjecture for some , even if we can’t yet say what the precise value of is).

Actually, Hardy and Littlewood conjectured a more precise version of Conjecture 1. Given an admissible -tuple , and for each prime , let denote the number of residue classes modulo that meets; thus we have for all by admissibility, and also for all . We then define the *singular series* associated to by the formula

where is the set of primes; by the previous discussion we see that the infinite product in converges to a finite non-zero number.

We will also need some asymptotic notation (in the spirit of “cheap nonstandard analysis“). We will need a parameter that one should think of going to infinity. Some mathematical objects (such as and ) will be independent of and referred to as *fixed*; but unless otherwise specified we allow all mathematical objects under consideration to depend on . If and are two such quantities, we say that if one has for some fixed , and if one has for some function of (and of any fixed parameters present) that goes to zero as (for each choice of fixed parameters).

Conjecture 2 (Prime tuples conjecture, quantitative form)Let be a fixed natural number, and let be a fixed admissible -tuple. Then the number of natural numbers such that consists entirely of primes is .

Thus, for instance, if Conjecture 2 holds, then the number of twin primes less than should equal , where is the twin prime constant

As this conjecture is stronger than Conjecture 1, it is of course open. However there are a number of partial results on this conjecture. For instance, this conjecture is known to be true if one introduces some additional averaging in ; see for instance this previous post. From the methods of sieve theory, one can obtain an *upper bound* of for the number of with all prime, where depends only on . Sieve theory can also give analogues of Conjecture 2 if the primes are replaced by a suitable notion of almost prime (or more precisely, by a weight function concentrated on almost primes).

Another type of partial result towards Conjectures 1, 2 come from the results of Goldston-Pintz-Yildirim, Motohashi-Pintz, and of Zhang. Following the notation of this recent paper of Pintz, for each , let denote the following assertion (DHL stands for “Dickson-Hardy-Littlewood”):

Conjecture 3() Let be a fixed admissible -tuple. Then there are infinitely many translates of which containat least twoprimes.

This conjecture gets harder as gets smaller. Note for instance that would imply all the cases of Conjecture 1, including the twin prime conjecture. More generally, if one knew for some , then one would immediately conclude that there are an infinite number of pairs of consecutive primes of separation at most , where is the minimal diameter amongst all admissible -tuples . Values of for small can be found at this link (with denoted in that page). For large , the best upper bounds on have been found by using admissible -tuples of the form

where denotes the prime and is a parameter to be optimised over (in practice it is an order of magnitude or two smaller than ); see this blog post for details. The upshot is that one can bound for large by a quantity slightly smaller than (and the large sieve inequality shows that this is sharp up to a factor of two, see e.g. this previous post for more discussion).

In a key breakthrough, Goldston, Pintz, and Yildirim were able to establish the following conditional result a few years ago:

Theorem 4 (Goldston-Pintz-Yildirim)Suppose that the Elliott-Halberstam conjecture is true for some . Then is true for some finite . In particular, this establishes an infinite number of pairs of consecutive primes of separation .

The dependence of constants between and given by the Goldston-Pintz-Yildirim argument is basically of the form . (UPDATE: as recently observed by Farkas, Pintz, and Revesz, this relationship can be improved to .)

Unfortunately, the Elliott-Halberstam conjecture (which we will state properly below) is only known for , an important result known as the Bombieri-Vinogradov theorem. If one uses the Bombieri-Vinogradov theorem instead of the Elliott-Halberstam conjecture, Goldston, Pintz, and Yildirim were still able to show the highly non-trivial result that there were infinitely many pairs of consecutive primes with (actually they showed more than this; see e.g. this survey of Soundararajan for details).

Actually, the full strength of the Elliott-Halberstam conjecture is not needed for these results. There is a technical specialisation of the Elliott-Halberstam conjecture which does not presently have a commonly accepted name; I will call it the *Motohashi-Pintz-Zhang conjecture* in this post, where is a parameter. We will define this conjecture more precisely later, but let us remark for now that is a consequence of .

We then have the following two theorems. Firstly, we have the following strengthening of Theorem 4:

Theorem 5 (Motohashi-Pintz-Zhang)Suppose that is true for some . Then is true for some .

A version of this result (with a slightly different formulation of ) appears in this paper of Motohashi and Pintz, and in the paper of Zhang, Theorem 5 is proven for the concrete values and . We will supply a self-contained proof of Theorem 5 below the fold, the constants upon those in Zhang’s paper (in particular, for , we can take as low as , with further improvements on the way). As with Theorem 4, we have an inverse quadratic relationship .

In his paper, Zhang obtained for the first time an unconditional advance on :

This is a deep result, building upon the work of Fouvry-Iwaniec, Friedlander-Iwaniec and Bombieri–Friedlander–Iwaniec which established results of a similar nature to but simpler in some key respects. We will not discuss this result further here, except to say that they rely on the (higher-dimensional case of the) Weil conjectures, which were famously proven by Deligne using methods from l-adic cohomology. Also, it was believed among at least some experts that the methods of Bombieri, Fouvry, Friedlander, and Iwaniec were not quite strong enough to obtain results of the form , making Theorem 6 a particularly impressive achievement.

Combining Theorem 6 with Theorem 5 we obtain for some finite ; Zhang obtains this for but as detailed below, this can be lowered to . This in turn gives infinitely many pairs of consecutive primes of separation at most . Zhang gives a simple argument that bounds by , giving his famous result that there are infinitely many pairs of primes of separation at most ; by being a bit more careful (as discussed in this post) one can lower the upper bound on to , and if one instead uses the newer value for one can instead use the bound . (Many thanks to Scott Morrison for these numerics.) UPDATE: These values are now obsolete; see this web page for the latest bounds.

In this post we would like to give a self-contained proof of both Theorem 4 and Theorem 5, which are both sieve-theoretic results that are mainly elementary in nature. (But, as stated earlier, we will not discuss the deepest new result in Zhang’s paper, namely Theorem 6.) Our presentation will deviate a little bit from the traditional sieve-theoretic approach in a few places. Firstly, there is a portion of the argument that is traditionally handled using contour integration and properties of the Riemann zeta function; we will present a “cheaper” approach (which Ben Green and I used in our papers, e.g. in this one) using Fourier analysis, with the only property used about the zeta function being the elementary fact that blows up like as one approaches from the right. To deal with the contribution of small primes (which is the source of the singular series ), it will be convenient to use the “-trick” (introduced in this paper of mine with Ben), passing to a single residue class mod (where is the product of all the small primes) to end up in a situation in which all small primes have been “turned off” which leads to better pseudorandomness properties (for instance, once one eliminates all multiples of small primes, almost all pairs of remaining numbers will be coprime).

In the third Marker lecture, I would like to discuss the recent progress, particularly by Goldston, Pintz, and Yıldırım, on finding small gaps between consecutive primes. (See also the surveys by Goldston-Pintz-Yıldırım, by Green, and by Soundararajan on the subject; the material here is based to some extent on these prior surveys.)

## Recent Comments