There is a very nice recent paper by Lemke Oliver and Soundararajan (complete with a popular science article about it by the consistently excellent Erica Klarreich for Quanta) about a surprising (but now satisfactorily explained) bias in the distribution of pairs of consecutive primes ${p_n, p_{n+1}}$ when reduced to a small modulus ${q}$.

This phenomenon is superficially similar to the better-known Chebyshev bias concerning the reduction of a single prime ${p_n}$ to a small modulus ${q}$, but is in fact a rather different (and much stronger) bias than the Chebyshev bias, and seems to arise from a completely different source. The Chebyshev bias asserts, roughly speaking, that a randomly selected prime ${p}$ of large magnitude ${x}$ will typically (though not always) be slightly more likely to be a quadratic non-residue modulo ${q}$ than a quadratic residue, but the bias is small (the difference in probabilities is only about ${O(1/\sqrt{x})}$ for typical choices of ${x}$), and certainly consistent with known or conjectured positive results such as Dirichlet’s theorem or the generalised Riemann hypothesis. The reason for the Chebyshev bias can be traced back to the von Mangoldt explicit formula, which relates the distribution of the von Mangoldt function ${\Lambda}$ modulo ${q}$ to the zeroes of the ${L}$-functions with period ${q}$. This formula predicts (assuming some standard conjectures like GRH) that the von Mangoldt function ${\Lambda}$ is quite unbiased modulo ${q}$. The von Mangoldt function is mostly concentrated in the primes, but it also has a medium-sized contribution coming from squares of primes, which are of course all located in the quadratic residues modulo ${q}$. (Cubes and higher powers of primes also make a small contribution, but these are quite negligible asymptotically.) To balance everything out, the contribution of the primes must then exhibit a small preference towards quadratic non-residues, and this is the Chebyshev bias. (See this article of Rubinstein and Sarnak for a more technical discussion of the Chebyshev bias, and this survey of Granville and Martin for an accessible introduction. The story of the Chebyshev bias is also related to Skewes’ number, once considered the largest explicit constant to naturally appear in a mathematical argument.)
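As a quick numerical illustration of the Chebyshev bias (a throwaway sketch, not code from any of the papers discussed here; the cutoff ${10^6}$ is an arbitrary choice), one can race the two residue classes modulo ${3}$ in Python:

```python
def primes_up_to(n):
    """Sieve of Eratosthenes."""
    sieve = [True] * (n + 1)
    sieve[0] = sieve[1] = False
    for i in range(2, int(n ** 0.5) + 1):
        if sieve[i]:
            sieve[i * i :: i] = [False] * len(sieve[i * i :: i])
    return [i for i, is_p in enumerate(sieve) if is_p]

# Count primes up to 10^6 in each residue class mod 3; the class 1 mod 3
# consists of the quadratic residues, and 2 mod 3 of the non-residues.
residue = nonresidue = 0
for p in primes_up_to(10 ** 6):
    if p % 3 == 1:
        residue += 1
    elif p % 3 == 2:
        nonresidue += 1

print(residue, nonresidue)
```

At this height the non-residue class ${2 \hbox{ mod } 3}$ holds a slight lead, although by a theorem of Littlewood the lead is known to change hands infinitely often (first doing so at an astronomically large height).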

The paper of Lemke Oliver and Soundararajan considers instead the distribution of the pairs ${(p_n \hbox{ mod } q, p_{n+1} \hbox{ mod } q)}$ for small ${q}$ and for large consecutive primes ${p_n, p_{n+1}}$, say drawn at random from the primes comparable to some large ${x}$. For sake of discussion let us just take ${q=3}$. Then all primes ${p_n}$ larger than ${3}$ are either ${1 \hbox{ mod } 3}$ or ${2 \hbox{ mod } 3}$; Chebyshev’s bias gives a very slight preference to the latter (of order ${O(1/\sqrt{x})}$, as discussed above), but apart from this, we expect the primes to be more or less equally distributed in both classes. For instance, assuming GRH, the probability that ${p_n}$ lands in ${1 \hbox{ mod } 3}$ would be ${1/2 + O( x^{-1/2+o(1)} )}$, and similarly for ${2 \hbox{ mod } 3}$.

In view of this, one would expect that up to errors of ${O(x^{-1/2+o(1)})}$ or so, the pair ${(p_n \hbox{ mod } 3, p_{n+1} \hbox{ mod } 3)}$ should be equally distributed amongst the four options ${(1 \hbox{ mod } 3, 1 \hbox{ mod } 3)}$, ${(1 \hbox{ mod } 3, 2 \hbox{ mod } 3)}$, ${(2 \hbox{ mod } 3, 1 \hbox{ mod } 3)}$, ${(2 \hbox{ mod } 3, 2 \hbox{ mod } 3)}$, thus for instance the probability that this pair is ${(1 \hbox{ mod } 3, 1 \hbox{ mod } 3)}$ would naively be expected to be ${1/4 + O(x^{-1/2+o(1)})}$, and similarly for the other three tuples. These assertions are not yet proven (although some non-trivial upper and lower bounds for such probabilities can be obtained from recent work of Maynard).

However, Lemke Oliver and Soundararajan argue (backed both by plausible heuristic arguments, based ultimately on the Hardy-Littlewood prime tuples conjecture, and by substantial numerical evidence) that there is a significant bias away from the tuples ${(1 \hbox{ mod } 3, 1 \hbox{ mod } 3)}$ and ${(2 \hbox{ mod } 3, 2 \hbox{ mod } 3)}$ – informally, adjacent primes don’t like being in the same residue class! For instance, they predict that the probability of attaining ${(1 \hbox{ mod } 3, 1 \hbox{ mod } 3)}$ is in fact

$\displaystyle \frac{1}{4} - \frac{1}{8} \frac{\log\log x}{\log x} + O( \frac{1}{\log x} )$

with similar predictions for the other three pairs (in fact they give a somewhat more precise prediction than this). The magnitude of this bias, being comparable to ${\log\log x / \log x}$, is significantly stronger than the Chebyshev bias of ${O(1/\sqrt{x})}$.

One consequence of this prediction is that the prime gaps ${p_{n+1}-p_n}$ are slightly less likely to be divisible by ${3}$ than naive random models of the primes would predict. Indeed, if the four options ${(1 \hbox{ mod } 3, 1 \hbox{ mod } 3)}$, ${(1 \hbox{ mod } 3, 2 \hbox{ mod } 3)}$, ${(2 \hbox{ mod } 3, 1 \hbox{ mod } 3)}$, ${(2 \hbox{ mod } 3, 2 \hbox{ mod } 3)}$ all occurred with equal probability ${1/4}$, then ${p_{n+1}-p_n}$ should equal ${0 \hbox{ mod } 3}$ with probability ${1/2}$, and ${1 \hbox{ mod } 3}$ and ${2 \hbox{ mod } 3}$ with probability ${1/4}$ each (as would be the case when taking the difference of two random numbers drawn from those integers not divisible by ${3}$); but the Lemke Oliver-Soundararajan bias predicts that the probability of ${p_{n+1}-p_n}$ being divisible by three should be slightly lower, being approximately ${1/2 - \frac{1}{4} \frac{\log\log x}{\log x}}$.
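One can see the predicted repulsion directly in the data. The following Python sketch (again with the arbitrary cutoff ${10^6}$; this is an illustration, not code from the paper) tallies the pairs ${(p_n \hbox{ mod } 3, p_{n+1} \hbox{ mod } 3)}$:

```python
from collections import Counter

def primes_up_to(n):
    """Sieve of Eratosthenes."""
    sieve = [True] * (n + 1)
    sieve[0] = sieve[1] = False
    for i in range(2, int(n ** 0.5) + 1):
        if sieve[i]:
            sieve[i * i :: i] = [False] * len(sieve[i * i :: i])
    return [i for i, is_p in enumerate(sieve) if is_p]

# Work with primes larger than 3, so each is 1 or 2 mod 3.
ps = [p for p in primes_up_to(10 ** 6) if p > 3]
pair_counts = Counter((p % 3, q % 3) for p, q in zip(ps, ps[1:]))

same = pair_counts[(1, 1)] + pair_counts[(2, 2)]
different = pair_counts[(1, 2)] + pair_counts[(2, 1)]
print(pair_counts, same, different)
```

The repeated patterns ${(1 \hbox{ mod } 3, 1 \hbox{ mod } 3)}$ and ${(2 \hbox{ mod } 3, 2 \hbox{ mod } 3)}$ occur noticeably less often than the alternating ones at this range; the bias is quite visible even at ${x = 10^6}$.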

Below the fold we will give a somewhat informal justification of (a simplified version of) this phenomenon, based on the Lemke Oliver-Soundararajan calculation using the prime tuples conjecture.

Kevin Ford, James Maynard, and I have uploaded to the arXiv our preprint “Chains of large gaps between primes“. This paper was announced in our previous paper with Konyagin and Green, which was concerned with the largest gap

$\displaystyle G_1(X) := \max_{p_n, p_{n+1} \leq X} (p_{n+1} - p_n)$

between consecutive primes up to ${X}$, in which we improved the Rankin bound of

$\displaystyle G_1(X) \gg \log X \frac{\log_2 X \log_4 X}{(\log_3 X)^2}$

to

$\displaystyle G_1(X) \gg \log X \frac{\log_2 X \log_4 X}{\log_3 X}$

for large ${X}$ (where we use the abbreviations ${\log_2 X := \log\log X}$, ${\log_3 X := \log\log\log X}$, and ${\log_4 X := \log\log\log\log X}$). Here, we obtain an analogous result for the quantity

$\displaystyle G_k(X) := \max_{p_n, \dots, p_{n+k} \leq X} \min( p_{n+1} - p_n, p_{n+2}-p_{n+1}, \dots, p_{n+k} - p_{n+k-1} )$

which measures how far apart the gaps between chains of ${k}$ consecutive primes can be. Our main result is

$\displaystyle G_k(X) \gg \frac{1}{k^2} \log X \frac{\log_2 X \log_4 X}{\log_3 X}$

whenever ${X}$ is sufficiently large depending on ${k}$, with the implied constant here absolute (and effective). The factor of ${1/k^2}$ is inherent to the method, and related to the basic probabilistic fact that if one selects ${k}$ numbers at random from the unit interval ${[0,1]}$, then one expects the minimum gap between adjacent numbers to be about ${1/k^2}$ (i.e. smaller than the mean spacing of ${1/k}$ by an additional factor of ${1/k}$).
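This probabilistic fact is easy to check by simulation (and a short calculation with uniform spacings gives exactly ${1/(k^2-1)}$ for the expected minimum gap between adjacent points, consistent with the ${1/k^2}$ heuristic). A Python sketch, with arbitrary parameters:

```python
import random

random.seed(0)

def mean_min_gap(k, trials=20000):
    """Estimate E[min adjacent gap] for k uniform random points in [0,1]."""
    total = 0.0
    for _ in range(trials):
        pts = sorted(random.random() for _ in range(k))
        total += min(b - a for a, b in zip(pts, pts[1:]))
    return total / trials

k = 20
est = mean_min_gap(k)
# The estimate is of size comparable to 1/k^2, i.e. smaller than the
# mean spacing 1/k by an additional factor of about 1/k.
print(est, 1 / k ** 2)
```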

Our arguments combine those from the previous paper with the matrix method of Maier, who (in our notation) showed that

$\displaystyle G_k(X) \gg_k \log X \frac{\log_2 X \log_4 X}{(\log_3 X)^2}$

for an infinite sequence of ${X}$ going to infinity. (Maier needed to restrict to an infinite sequence to avoid Siegel zeroes, but we are able to resolve this issue by the now standard technique of simply eliminating a prime factor of an exceptional conductor from the sieve-theoretic portion of the argument. As a byproduct, this also makes all of the estimates in our paper effective.)

As its name suggests, the Maier matrix method is usually presented by imagining a matrix of numbers, and using information about the distribution of primes in the columns of this matrix to deduce information about the primes in at least one of the rows of the matrix. We found it convenient to interpret this method in an equivalent probabilistic form as follows. Suppose one wants to find an interval ${n+1,\dots,n+y}$ that contains a block of at least ${k}$ primes, any two of which are separated by at least ${g}$ (ultimately, ${y}$ will be something like ${\log X \frac{\log_2 X \log_4 X}{\log_3 X}}$ and ${g}$ something like ${y/k^2}$). One can do this by the probabilistic method: pick ${n}$ to be a random large natural number ${{\mathbf n}}$ (with the precise distribution to be chosen later), and try to lower bound the probability that the interval ${{\mathbf n}+1,\dots,{\mathbf n}+y}$ contains at least ${k}$ primes, no two of which are within ${g}$ of each other.

By carefully choosing the residue class of ${{\mathbf n}}$ with respect to small primes, one can eliminate several of the ${{\mathbf n}+j}$ from consideration of being prime immediately. For instance, if ${{\mathbf n}}$ is chosen to be large and even, then the ${{\mathbf n}+j}$ with ${j}$ even have no chance of being prime and can thus be eliminated; similarly if ${{\mathbf n}}$ is large and odd, then ${{\mathbf n}+j}$ cannot be prime for any odd ${j}$. Using the methods of our previous paper, we can find a residue class ${m \hbox{ mod } P}$ (where ${P}$ is a product of a large number of primes) such that, if one chooses ${{\mathbf n}}$ to be a large random element of ${m \hbox{ mod } P}$ (that is, ${{\mathbf n} = {\mathbf z} P + m}$ for some large random integer ${{\mathbf z}}$), then the set ${{\mathcal T}}$ of shifts ${j \in \{1,\dots,y\}}$ for which ${{\mathbf n}+j}$ still has a chance of being prime has size comparable to something like ${k \log X / \log_2 X}$; furthermore this set ${{\mathcal T}}$ is fairly well distributed in ${\{1,\dots,y\}}$ in the sense that it does not concentrate too strongly in any short subinterval of ${\{1,\dots,y\}}$. The main new difficulty, not present in the previous paper, is to get lower bounds on the size of ${{\mathcal T}}$ in addition to upper bounds, but this turns out to be achievable by a suitable modification of the arguments.
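This elimination step can be illustrated by a toy computation (with a tiny modulus, far smaller than the parameters actually used in the paper): taking ${{\mathbf n} \equiv 0 \hbox{ mod } P}$ with ${P}$ the product of the primes up to ${11}$, the shifts ${j}$ that survive are exactly those coprime to ${P}$.

```python
from math import gcd, prod

small_primes = [2, 3, 5, 7, 11]
P = prod(small_primes)  # 2310
y = 2000                # arbitrary window length, for illustration only

# Shifts j in {1,...,y} such that n + j is not forced to be composite by
# a prime dividing P (since n ≡ 0 mod P, we have n + j ≡ j mod P).
T = [j for j in range(1, y + 1) if gcd(j, P) == 1]

density = prod(1 - 1 / p for p in small_primes)
print(len(T), y * density)
```

The surviving set has density close to ${\prod_p (1 - 1/p)}$ over the primes ${p}$ dividing ${P}$, as the sieve heuristic predicts, and one can also inspect how evenly it is spread over the window.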

Using a version of the prime number theorem in arithmetic progressions due to Gallagher, one can show that for each remaining shift ${j \in {\mathcal T}}$, ${{\mathbf n}+j}$ is going to be prime with probability comparable to ${\log_2 X / \log X}$, so one expects about ${k}$ primes in the set ${\{{\mathbf n} + j: j \in {\mathcal T}\}}$. An upper bound sieve (e.g. the Selberg sieve) also shows that for any distinct ${j,j' \in {\mathcal T}}$, the probability that ${{\mathbf n}+j}$ and ${{\mathbf n}+j'}$ are both prime is ${O( (\log_2 X / \log X)^2 )}$. Using this and some routine second moment calculations, one can then show that with large probability, the set ${\{{\mathbf n} + j: j \in {\mathcal T}\}}$ will indeed contain about ${k}$ primes, no two of which are closer than ${g}$ to each other; with no other numbers in this interval being prime, this gives a lower bound on ${G_k(X)}$.

Kevin Ford, Ben Green, Sergei Konyagin, James Maynard, and I have just uploaded to the arXiv our paper “Long gaps between primes“. This is a followup work to our two previous papers (discussed in this previous post), in which we had simultaneously shown that the maximal gap

$\displaystyle G(X) := \sup_{p_n, p_{n+1} \leq X} p_{n+1}-p_n$

between primes up to ${X}$ exhibited a lower bound of the shape

$\displaystyle G(X) \geq f(X) \log X \frac{\log \log X \log\log\log\log X}{(\log\log\log X)^2} \ \ \ \ \ (1)$

for some function ${f(X)}$ that went to infinity as ${X \rightarrow \infty}$; this improved upon previous work of Rankin and other authors, who established the same bound but with ${f(X)}$ replaced by a constant. (Again, see the previous post for a more detailed discussion.)

In our previous papers, we did not specify a particular growth rate for ${f(X)}$. In my paper with Kevin, Ben, and Sergei, there was a good reason for this: our argument relied (amongst other things) on the inverse conjecture on the Gowers norms, as well as the Siegel-Walfisz theorem, and the known proofs of both results have ineffective constants, rendering our growth function ${f(X)}$ similarly ineffective. Maynard’s approach ostensibly also relies on the Siegel-Walfisz theorem, but (as shown in another recent paper of his) can be made quite effective, even when tracking ${k}$-tuples of fairly large size (about ${\log^c x}$ for some small ${c}$). If one carefully makes all the bounds in Maynard’s argument quantitative, one eventually ends up with a growth rate ${f(X)}$ of shape

$\displaystyle f(X) \asymp \frac{\log \log \log X}{\log\log\log\log X}, \ \ \ \ \ (2)$

which, when inserted into (1), gives a bound of the form

$\displaystyle G(X) \gg \log X \frac{\log \log X}{\log\log\log X}$

on the gaps between primes for large ${X}$; this is an unpublished calculation of James’.

In this paper we make a further refinement of this calculation to obtain a growth rate

$\displaystyle f(X) \asymp \log \log \log X \ \ \ \ \ (3)$

leading to a bound of the form

$\displaystyle G(X) \geq c \log X \frac{\log \log X \log\log\log\log X}{\log\log\log X} \ \ \ \ \ (4)$

for large ${X}$ and some small constant ${c}$. Furthermore, this appears to be the limit of current technology (in particular, falling short of Cramer’s conjecture that ${G(X)}$ is comparable to ${\log^2 X}$); in the spirit of Erdös’ original prize on this problem, I would like to offer 10,000 USD for anyone who can show (in a refereed publication, of course) that the constant ${c}$ here can be replaced by an arbitrarily large constant ${C}$.

The reason for the growth rate (3) is as follows. After following the sieving process discussed in the previous post, the problem comes down to something like the following: can one sieve out all (or almost all) of the primes in ${[x,y]}$ by removing one residue class modulo ${p}$ for all primes ${p}$ in (say) ${[x/4,x/2]}$? Very roughly speaking, if one can solve this problem with ${y = g(x) x}$, then one can obtain a growth rate on ${f(X)}$ of the shape ${f(X) \sim g(\log X)}$. (This is an oversimplification, as one actually has to sieve out a random subset of the primes, rather than all the primes in ${[x,y]}$, but never mind this detail for now.)

Using the quantitative “dense clusters of primes” machinery of Maynard, one can find lots of ${k}$-tuples in ${[x,y]}$ which contain at least ${\gg \log k}$ primes, for ${k}$ as large as ${\log^c x}$ or so (so that ${\log k}$ is about ${\log\log x}$). By considering ${k}$-tuples in arithmetic progression, this means that one can find lots of residue classes modulo a given prime ${p}$ in ${[x/4,x/2]}$ that capture about ${\log\log x}$ primes. In principle, this means that the union of all these residue classes can cover about ${\frac{x}{\log x} \log\log x}$ primes, allowing one to take ${g(x)}$ as large as ${\log\log x}$, which corresponds to (3). However, there is a catch: the residue classes for different primes ${p}$ may collide with each other, reducing the efficiency of the covering. In our previous papers on the subject, we selected the residue classes randomly, which meant that we had to insert an additional logarithmic safety margin in the expected number of times each prime would be sifted out by one of the residue classes, in order to guarantee that we would (with high probability) sift out most of the primes. This additional safety margin is ultimately responsible for the ${\log\log\log\log X}$ loss in (2).

The main innovation of this paper, beyond detailing James’ unpublished calculations, is to use ideas from the literature on efficient hypergraph covering, to avoid the need for a logarithmic safety margin. The hypergraph covering problem, roughly speaking, is to try to cover a set of ${n}$ vertices using as few “edges” from a given hypergraph ${H}$ as possible. If each edge has ${m}$ vertices, then one certainly needs at least ${n/m}$ edges to cover all the vertices, and the question is to see if one can come close to attaining this bound given some reasonable uniform distribution hypotheses on the hypergraph ${H}$. As before, random methods tend to require something like ${\frac{n}{m} \log r}$ edges before one expects to cover, say ${1-1/r}$ of the vertices.
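The logarithmic loss of purely random covering is easy to see in a toy simulation (arbitrary parameters, and a plain set-cover toy rather than the actual sieve-theoretic setup):

```python
import random

random.seed(1)

n, m = 1000, 10            # n vertices; each edge has m vertices
vertices = list(range(n))
covered = set()
edges_used = 0
# Draw random m-element edges until 90% of the vertices are covered
# (i.e. r = 10 in the notation above).
while len(covered) < 0.9 * n:
    covered.update(random.sample(vertices, m))
    edges_used += 1

print(edges_used, n // m)
```

One typically needs on the order of ${\frac{n}{m}\log 10 \approx 230}$ edges here, a logarithmic factor more than the trivial lower bound of ${0.9\, n/m = 90}$.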

However, it turns out (under reasonable hypotheses on ${H}$) that one can eliminate this logarithmic loss, by using what is now known as the “semi-random method” or the “Rödl nibble”. The idea is to randomly select a small number of edges (a first “nibble”) – small enough that the edges are unlikely to overlap much with each other, thus obtaining maximal efficiency. Then, one pauses to remove all the edges from ${H}$ that intersect edges from this first nibble, so that all remaining edges will not overlap with the existing edges. One then randomly selects another small number of edges (a second “nibble”), and repeats this process until enough nibbles are taken to cover most of the vertices. Remarkably, it turns out that under some reasonable assumptions on the hypergraph ${H}$, one can maintain control on the uniform distribution of the edges throughout the nibbling process, and obtain an efficient hypergraph covering. This strategy was carried out in detail in an influential paper of Pippenger and Spencer.

In our setup, the vertices are the primes in ${[x,y]}$, and the edges are the intersection of the primes with various residue classes. (Technically, we have to work with a family of hypergraphs indexed by a prime ${p}$, rather than a single hypergraph, but let me ignore this minor technical detail.) The semi-random method would in principle eliminate the logarithmic loss and recover the bound (3). However, there is a catch: the analysis of Pippenger and Spencer relies heavily on the assumption that the hypergraph is uniform, that is to say all edges have the same size. In our context, this requirement would mean that each residue class captures exactly the same number of primes, which is not the case; we only control the number of primes in an average sense, but we were unable to obtain any concentration of measure to come close to verifying this hypothesis. And indeed, the semi-random method, when applied naively, does not work well with edges of variable size – the problem is that edges of large size are much more likely to be eliminated after each nibble than edges of small size, since they have many more vertices that could overlap with the previous nibbles. Since the large edges are clearly more useful for the covering problem than the small ones, this bias towards eliminating large edges significantly reduces the efficiency of the semi-random method (and also greatly complicates the analysis of that method).

Our solution to this is to iteratively reweight the probability distribution on edges after each nibble to compensate for this bias effect, giving larger edges a greater weight than smaller edges. It turns out that there is a natural way to do this reweighting that allows one to repeat the Pippenger-Spencer analysis in the presence of edges of variable size, and this ultimately allows us to recover the full growth rate (3).

To go beyond (3), one either has to find a lot of residue classes that can capture significantly more than ${\log\log x}$ primes of size ${x}$ (which is the limit of the multidimensional Selberg sieve of Maynard and myself), or else one has to find a very different method to produce large gaps between primes than the Erdös-Rankin method, which is the method used in all previous work on the subject.

It turns out that the arguments in this paper can be combined with the Maier matrix method to also produce chains of consecutive large prime gaps whose size is of the order of (4); three of us (Kevin, James, and myself) will detail this in a future paper. (A similar combination was also recently observed in connection with our earlier result (1) by Pintz, but there are some additional technical wrinkles required to recover the full gain of (3) for the chains of large gaps problem.)

Kevin Ford, Ben Green, Sergei Konyagin, and I have just posted to the arXiv our preprint “Large gaps between consecutive prime numbers“. This paper concerns the “opposite” problem to that considered by the recently concluded Polymath8 project, which was concerned with very small values of the prime gap ${p_{n+1}-p_n}$. Here, we wish to consider the largest prime gap ${G(X) = p_{n+1}-p_n}$ that one can find in the interval ${[X] = \{1,\dots,X\}}$ as ${X}$ goes to infinity.

Finding lower bounds on ${G(X)}$ is more or less equivalent to locating long strings of consecutive composite numbers that are not too large compared to the length of the string. A classic (and quite well known) construction here starts with the observation that for any natural number ${n}$, the consecutive numbers ${n!+2, n!+3,\dots,n!+n}$ are all composite, because each ${n!+i}$, ${i=2,\dots,n}$ is divisible by some prime ${p \leq n}$, while being strictly larger than that prime ${p}$. From this and Stirling’s formula, it is not difficult to obtain the bound

$\displaystyle G(X) \gg \frac{\log X}{\log\log X}. \ \ \ \ \ (1)$
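The factorial construction can be checked directly; a quick Python verification (with the arbitrary choice ${n=10}$):

```python
from math import factorial

def is_composite(x):
    """Trial division; adequate for the small numbers used here."""
    d = 2
    while d * d <= x:
        if x % d == 0:
            return True
        d += 1
    return False

n = 10
# Each n! + i for 2 <= i <= n is divisible by i (since i divides n!)
# while being strictly larger than i, hence composite.
run = [factorial(n) + i for i in range(2, n + 1)]
print(len(run), "consecutive composites starting at", run[0])
```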

A more efficient bound comes from the prime number theorem: there are only ${(1+o(1)) \frac{X}{\log X}}$ primes up to ${X}$, so just from the pigeonhole principle one can locate a string of consecutive composite numbers up to ${X}$ of length at least ${(1-o(1)) \log X}$, thus

$\displaystyle G(X) \gtrsim \log X \ \ \ \ \ (2)$

where we use ${X \gtrsim Y}$ or ${Y \lesssim X}$ as shorthand for ${X \geq (1-o(1)) Y}$ or ${Y \leq (1+o(1)) X}$.
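One can compare the pigeonhole bound (2) against actual data; a Python sketch (the cutoff ${10^6}$ is an arbitrary choice for illustration):

```python
from math import log

def primes_up_to(n):
    """Sieve of Eratosthenes."""
    sieve = [True] * (n + 1)
    sieve[0] = sieve[1] = False
    for i in range(2, int(n ** 0.5) + 1):
        if sieve[i]:
            sieve[i * i :: i] = [False] * len(sieve[i * i :: i])
    return [i for i, is_p in enumerate(sieve) if is_p]

X = 10 ** 6
ps = primes_up_to(X)
# Largest gap between consecutive primes up to X.
G = max(q - p for p, q in zip(ps, ps[1:]))
print(G, log(X))
```

(The maximal gap below ${10^6}$ is ${114}$, attained after the prime ${492113}$, comfortably above ${\log 10^6 \approx 13.8}$.)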

What about upper bounds? The Cramér random model predicts that the primes up to ${X}$ are distributed like a random subset of ${\{1,\dots,X\}}$ of density ${1/\log X}$. Using this model, Cramér arrived at the conjecture

$\displaystyle G(X) \ll \log^2 X.$

In fact, if one makes the extremely optimistic assumption that the random model perfectly describes the behaviour of the primes, one would arrive at the even more precise prediction

$\displaystyle G(X) \sim \log^2 X.$

However, it is no longer widely believed that this optimistic version of the conjecture is true, due to some additional irregularities in the primes coming from the basic fact that large primes cannot be divisible by very small primes. Using the Maier matrix method to capture some of this irregularity, Granville was led to the conjecture that

$\displaystyle G(X) \gtrsim 2e^{-\gamma} \log^2 X$

(note that ${2e^{-\gamma} = 1.1229\dots}$ is slightly larger than ${1}$). For comparison, the known upper bounds on ${G(X)}$ are quite weak; unconditionally one has ${G(X) \ll X^{0.525}}$ by the work of Baker, Harman, and Pintz, and even on the Riemann hypothesis one only gets down to ${G(X) \ll X^{1/2} \log X}$, as shown by Cramér (a slight improvement is also possible if one additionally assumes the pair correlation conjecture; see this article of Heath-Brown and the references therein).

This conjecture remains out of reach of current methods. In 1931, Westzynthius managed to improve the bound (2) slightly to

$\displaystyle G(X) \gg \frac{\log\log\log X}{\log\log\log\log X} \log X ,$

which Erdös in 1935 improved to

$\displaystyle G(X) \gg \frac{\log\log X}{(\log\log\log X)^2} \log X$

and Rankin in 1938 improved slightly further to

$\displaystyle G(X) \gtrsim c \frac{\log\log X (\log\log\log\log X)}{(\log\log\log X)^2} \log X \ \ \ \ \ (3)$

with ${c=1/3}$. Remarkably, this rather strange bound then proved extremely difficult to improve; until recently, the only improvements were to the constant ${c}$, which was raised to ${c=\frac{1}{2} e^\gamma}$ in 1963 by Schönhage, to ${c= e^\gamma}$ in 1963 by Rankin, to ${c = 1.31256 e^\gamma}$ by Maier and Pomerance, and finally to ${c = 2e^\gamma}$ in 1997 by Pintz.

Erdös listed the problem of making ${c}$ arbitrarily large as one of his favourite open problems, even offering (“somewhat rashly”, in his words) a cash prize for the solution. Our main result answers this question in the affirmative:

Theorem 1 The bound (3) holds for arbitrarily large ${c>0}$.

In principle, we thus have a bound of the form

$\displaystyle G(X) \geq f(X) \frac{\log\log X (\log\log\log\log X)}{(\log\log\log X)^2} \log X$

for some ${f(X)}$ that grows to infinity. Unfortunately, due to various sources of ineffectivity in our methods, we cannot provide any explicit rate of growth on ${f(X)}$ at all.

We decided to announce this result the old-fashioned way, as part of a research lecture; more precisely, Ben Green announced the result in his ICM lecture this Tuesday. (The ICM staff have very efficiently put up video of his talk (and most of the other plenary and prize talks) online; Ben’s talk is here, with the announcement beginning at about 0:48. Note a slight typo in his slides, in that the exponent of ${\log\log\log X}$ in the denominator is ${3}$ instead of ${2}$.) Ben’s lecture slides may be found here.

By coincidence, an independent proof of this theorem has also been obtained very recently by James Maynard.

I discuss our proof method below the fold.

Suppose one is given a ${k_0}$-tuple ${{\mathcal H} = (h_1,\ldots,h_{k_0})}$ of ${k_0}$ distinct integers for some ${k_0 \geq 1}$, arranged in increasing order. When is it possible to find infinitely many translates ${n + {\mathcal H} =(n+h_1,\ldots,n+h_{k_0})}$ of ${{\mathcal H}}$ which consist entirely of primes? The case ${k_0=1}$ is just Euclid’s theorem on the infinitude of primes, but the case ${k_0=2}$ is already open in general, with the ${{\mathcal H} = (0,2)}$ case being the notorious twin prime conjecture.

On the other hand, there are some tuples ${{\mathcal H}}$ for which one can easily answer the above question in the negative. For instance, the only translate of ${(0,1)}$ that consists entirely of primes is ${(2,3)}$, basically because each translate of ${(0,1)}$ must contain an even number, and the only even prime is ${2}$. More generally, if there is a prime ${p}$ such that ${{\mathcal H}}$ meets each of the ${p}$ residue classes ${0 \hbox{ mod } p, 1 \hbox{ mod } p, \ldots, p-1 \hbox{ mod } p}$, then every translate of ${{\mathcal H}}$ contains at least one multiple of ${p}$; since ${p}$ is the only multiple of ${p}$ that is prime, this shows that there are only finitely many translates of ${{\mathcal H}}$ that consist entirely of primes.

To avoid this obstruction, let us call a ${k_0}$-tuple ${{\mathcal H}}$ admissible if it avoids at least one residue class ${\hbox{ mod } p}$ for each prime ${p}$. It is easy to check for admissibility in practice, since a ${k_0}$-tuple is automatically admissible at every prime ${p}$ larger than ${k_0}$, so one only needs to check a finite number of primes in order to decide on the admissibility of a given tuple. For instance, ${(0,2)}$ or ${(0,2,6)}$ are admissible, but ${(0,2,4)}$ is not (because it covers all the residue classes modulo ${3}$). We then have the famous Hardy-Littlewood prime tuples conjecture:
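Admissibility is easy to test mechanically, since only the primes ${p \leq k_0}$ need checking; a minimal Python sketch (the function name here is mine, not standard):

```python
def is_admissible(H):
    """Check whether the tuple H avoids some residue class mod p for
    every prime p <= len(H); larger primes are automatic."""
    k = len(H)
    for p in range(2, k + 1):
        # Crude primality test, adequate for such small p.
        if any(p % d == 0 for d in range(2, p)):
            continue
        if len({h % p for h in H}) == p:  # H covers every class mod p
            return False
    return True

print(is_admissible((0, 2)), is_admissible((0, 2, 6)), is_admissible((0, 2, 4)))
# → True True False
```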

Conjecture 1 (Prime tuples conjecture, qualitative form) If ${{\mathcal H}}$ is an admissible ${k_0}$-tuple, then there exist infinitely many translates of ${{\mathcal H}}$ that consist entirely of primes.

This conjecture is extremely difficult (containing the twin prime conjecture, for instance, as a special case), and in fact there is no explicitly known example of an admissible ${k_0}$-tuple with ${k_0 \geq 2}$ for which we can verify this conjecture (although, thanks to the recent work of Zhang, we know that ${(0,d)}$ satisfies the conclusion of the prime tuples conjecture for some ${0 < d < 70,000,000}$, even if we can’t yet say what the precise value of ${d}$ is).

Actually, Hardy and Littlewood conjectured a more precise version of Conjecture 1. Given an admissible ${k_0}$-tuple ${{\mathcal H} = (h_1,\ldots,h_{k_0})}$, and for each prime ${p}$, let ${\nu_p = \nu_p({\mathcal H}) := |{\mathcal H} \hbox{ mod } p|}$ denote the number of residue classes modulo ${p}$ that ${{\mathcal H}}$ meets; thus we have ${1 \leq \nu_p \leq p-1}$ for all ${p}$ by admissibility, and also ${\nu_p = k_0}$ for all ${p>h_{k_0}-h_1}$. We then define the singular series ${{\mathfrak G} = {\mathfrak G}({\mathcal H})}$ associated to ${{\mathcal H}}$ by the formula

$\displaystyle {\mathfrak G} := \prod_{p \in {\mathcal P}} \frac{1-\frac{\nu_p}{p}}{(1-\frac{1}{p})^{k_0}}$

where ${{\mathcal P} = \{2,3,5,\ldots\}}$ is the set of primes; by the previous discussion we see that the infinite product in ${{\mathfrak G}}$ converges to a finite non-zero number.

We will also need some asymptotic notation (in the spirit of “cheap nonstandard analysis“). We will need a parameter ${x}$ that one should think of going to infinity. Some mathematical objects (such as ${{\mathcal H}}$ and ${k_0}$) will be independent of ${x}$ and referred to as fixed; but unless otherwise specified we allow all mathematical objects under consideration to depend on ${x}$. If ${X}$ and ${Y}$ are two such quantities, we say that ${X = O(Y)}$ if one has ${|X| \leq CY}$ for some fixed ${C}$, and ${X = o(Y)}$ if one has ${|X| \leq c(x) Y}$ for some function ${c(x)}$ of ${x}$ (and of any fixed parameters present) that goes to zero as ${x \rightarrow \infty}$ (for each choice of fixed parameters).

Conjecture 2 (Prime tuples conjecture, quantitative form) Let ${k_0 \geq 1}$ be a fixed natural number, and let ${{\mathcal H}}$ be a fixed admissible ${k_0}$-tuple. Then the number of natural numbers ${n < x}$ such that ${n+{\mathcal H}}$ consists entirely of primes is ${({\mathfrak G} + o(1)) \frac{x}{\log^{k_0} x}}$.

Thus, for instance, if Conjecture 2 holds, then the number of twin primes less than ${x}$ should equal ${(2 \Pi_2 + o(1)) \frac{x}{\log^2 x}}$, where ${\Pi_2}$ is the twin prime constant

$\displaystyle \Pi_2 := \prod_{p \in {\mathcal P}: p>2} (1 - \frac{1}{(p-1)^2}) = 0.6601618\ldots.$
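Indeed, for the tuple ${(0,2)}$ one has ${\nu_2 = 1}$, giving a local factor of ${2}$ at ${p=2}$, while each ${p>2}$ has ${\nu_p = 2}$ and contributes ${\frac{1-2/p}{(1-1/p)^2} = 1 - \frac{1}{(p-1)^2}}$, so that ${{\mathfrak G}((0,2)) = 2\Pi_2}$ as asserted above. The product converges quickly and can be checked numerically; a Python sketch (the truncation point ${10^5}$ is an arbitrary choice, with the tail contributing only ${O(1/(N \log N))}$):

```python
def primes_up_to(n):
    """Sieve of Eratosthenes."""
    sieve = [True] * (n + 1)
    sieve[0] = sieve[1] = False
    for i in range(2, int(n ** 0.5) + 1):
        if sieve[i]:
            sieve[i * i :: i] = [False] * len(sieve[i * i :: i])
    return [i for i, is_p in enumerate(sieve) if is_p]

# Truncated Euler product for the twin prime constant.
Pi2 = 1.0
for p in primes_up_to(10 ** 5):
    if p > 2:
        Pi2 *= 1 - 1 / (p - 1) ** 2

print(Pi2, 2 * Pi2)  # Pi2 ≈ 0.66016...; 2*Pi2 is the singular series of (0,2)
```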

As this conjecture is stronger than Conjecture 1, it is of course open. However there are a number of partial results on this conjecture. For instance, this conjecture is known to be true if one introduces some additional averaging in ${{\mathcal H}}$; see for instance this previous post. From the methods of sieve theory, one can obtain an upper bound of ${(C_{k_0} {\mathfrak G} + o(1)) \frac{x}{\log^{k_0} x}}$ for the number of ${n < x}$ with ${n + {\mathcal H}}$ all prime, where ${C_{k_0}}$ depends only on ${k_0}$. Sieve theory can also give analogues of Conjecture 2 if the primes are replaced by a suitable notion of almost prime (or more precisely, by a weight function concentrated on almost primes).

Another type of partial result towards Conjectures 1, 2 comes from the results of Goldston-Pintz-Yildirim, Motohashi-Pintz, and of Zhang. Following the notation of this recent paper of Pintz, for each ${k_0>2}$, let ${DHL[k_0,2]}$ denote the following assertion (DHL stands for “Dickson-Hardy-Littlewood”):

Conjecture 3 (${DHL[k_0,2]}$) Let ${{\mathcal H}}$ be a fixed admissible ${k_0}$-tuple. Then there are infinitely many translates ${n+{\mathcal H}}$ of ${{\mathcal H}}$ which contain at least two primes.

This conjecture gets harder as ${k_0}$ gets smaller. Note for instance that ${DHL[2,2]}$ would imply all the ${k_0=2}$ cases of Conjecture 1, including the twin prime conjecture. More generally, if one knew ${DHL[k_0,2]}$ for some ${k_0}$, then one would immediately conclude that there are an infinite number of pairs of consecutive primes of separation at most ${H(k_0)}$, where ${H(k_0)}$ is the minimal diameter ${h_{k_0}-h_1}$ amongst all admissible ${k_0}$-tuples ${{\mathcal H}}$. Values of ${H(k_0)}$ for small ${k_0}$ can be found at this link (with ${H(k_0)}$ denoted ${w}$ in that page). For large ${k_0}$, the best upper bounds on ${H(k_0)}$ have been found by using admissible ${k_0}$-tuples ${{\mathcal H}}$ of the form

$\displaystyle {\mathcal H} = ( - p_{m+\lfloor k_0/2\rfloor - 1}, \ldots, - p_{m+1}, -1, +1, p_{m+1}, \ldots, p_{m+\lfloor (k_0+1)/2\rfloor - 1} )$

where ${p_n}$ denotes the ${n^{th}}$ prime and ${m}$ is a parameter to be optimised over (in practice it is an order of magnitude or two smaller than ${k_0}$); see this blog post for details. The upshot is that one can bound ${H(k_0)}$ for large ${k_0}$ by a quantity slightly smaller than ${k_0 \log k_0}$ (and the large sieve inequality shows that this is sharp up to a factor of two, see e.g. this previous post for more discussion).
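As a toy illustration of the displayed construction, one can scan over ${m}$ and keep the admissible tuple of least diameter. The value ${k_0 = 50}$ below is a hypothetical small choice, far below the regime where the ${k_0 \log k_0}$ asymptotic becomes visible, so the diameter found here is merely indicative:

```python
import math

def primes_up_to(n):
    """Sieve of Eratosthenes: all primes <= n."""
    sieve = bytearray([1]) * (n + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, int(n ** 0.5) + 1):
        if sieve[i]:
            sieve[i * i :: i] = bytearray(len(sieve[i * i :: i]))
    return [i for i, f in enumerate(sieve) if f]

def is_admissible(H):
    """H avoids a residue class mod p for every prime p <= len(H)."""
    return all(len({h % q for h in H}) < q for q in primes_up_to(len(H)))

def zhang_tuple(k0, m, p):
    """The tuple (-p_{m+floor(k0/2)-1}, ..., -p_{m+1}, -1, +1,
    p_{m+1}, ..., p_{m+floor((k0+1)/2)-1}); p is 1-indexed."""
    left = [-p[m + j] for j in range(k0 // 2 - 1, 0, -1)]
    right = [p[m + j] for j in range(1, (k0 + 1) // 2)]
    return left + [-1, 1] + right

k0 = 50                            # hypothetical small example
p = [0] + primes_up_to(2000)       # p[n] = n-th prime, 1-indexed
candidates = [(m, zhang_tuple(k0, m, p)) for m in range(41)]
admissible = [(max(H) - min(H), m, H) for m, H in candidates if is_admissible(H)]
d_best, m_best, H_best = min(admissible)
print(m_best, d_best, "vs k0*log(k0) =", round(k0 * math.log(k0)))
```

Note that admissibility is automatic once ${p_{m+1} > k_0}$ (no element of the tuple can then be divisible by any prime ${p \leq k_0}$, so the residue class ${0}$ is always avoided), but smaller values of ${m}$ must be checked individually.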

In a key breakthrough, Goldston, Pintz, and Yildirim were able to establish the following conditional result a few years ago:

Theorem 4 (Goldston-Pintz-Yildirim) Suppose that the Elliott-Halberstam conjecture ${EH[\theta]}$ is true for some ${1/2 < \theta < 1}$. Then ${DHL[k_0,2]}$ is true for some finite ${k_0}$. In particular, this would establish the existence of infinitely many pairs of consecutive primes of separation ${O(1)}$.

The dependence of constants between ${k_0}$ and ${\theta}$ given by the Goldston-Pintz-Yildirim argument is basically of the form ${k_0 \sim (\theta-1/2)^{-2}}$. (UPDATE: as recently observed by Farkas, Pintz, and Revesz, this relationship can be improved to ${k_0 \sim (\theta-1/2)^{-3/2}}$.)

Unfortunately, the Elliott-Halberstam conjecture (which we will state properly below) is only known for ${\theta<1/2}$, an important result known as the Bombieri-Vinogradov theorem. If one uses the Bombieri-Vinogradov theorem instead of the Elliott-Halberstam conjecture, Goldston, Pintz, and Yildirim were still able to show the highly non-trivial result that there are infinitely many pairs ${p_{n+1},p_n}$ of consecutive primes with ${(p_{n+1}-p_n) / \log p_n \rightarrow 0}$ (actually they showed more than this; see e.g. this survey of Soundararajan for details).

Actually, the full strength of the Elliott-Halberstam conjecture is not needed for these results. There is a technical specialisation of the Elliott-Halberstam conjecture which does not presently have a commonly accepted name; I will call it the Motohashi-Pintz-Zhang conjecture ${MPZ[\varpi]}$ in this post, where ${0 < \varpi < 1/4}$ is a parameter. We will define this conjecture more precisely later, but let us remark for now that ${MPZ[\varpi]}$ is a consequence of ${EH[\frac{1}{2}+2\varpi]}$.

We then have the following two theorems. Firstly, we have the following strengthening of Theorem 4:

Theorem 5 (Motohashi-Pintz-Zhang) Suppose that ${MPZ[\varpi]}$ is true for some ${0 < \varpi < 1/4}$. Then ${DHL[k_0,2]}$ is true for some ${k_0}$.

A version of this result (with a slightly different formulation of ${MPZ[\varpi]}$) appears in this paper of Motohashi and Pintz, while in the paper of Zhang, Theorem 5 is proven for the concrete values ${\varpi = 1/1168}$ and ${k_0 = 3,500,000}$. We will supply a self-contained proof of Theorem 5 below the fold, improving the constants upon those in Zhang’s paper (in particular, for ${\varpi = 1/1168}$, we can take ${k_0}$ as low as ${341,640}$, with further improvements on the way). As with Theorem 4, we have an inverse quadratic relationship ${k_0 \sim \varpi^{-2}}$.

In his paper, Zhang obtained for the first time an unconditional advance on ${MPZ[\varpi]}$:

Theorem 6 (Zhang) ${MPZ[\varpi]}$ is true for all ${0 < \varpi \leq 1/1168}$.

This is a deep result, building upon the work of Fouvry-Iwaniec, Friedlander-Iwaniec and Bombieri-Friedlander-Iwaniec which established results of a similar nature to ${MPZ[\varpi]}$ but simpler in some key respects. We will not discuss this result further here, except to say that the proofs rely on the (higher-dimensional case of the) Weil conjectures, which were famously proven by Deligne using methods from l-adic cohomology. Also, it was believed among at least some experts that the methods of Bombieri, Fouvry, Friedlander, and Iwaniec were not quite strong enough to obtain results of the form ${MPZ[\varpi]}$, making Theorem 6 a particularly impressive achievement.

Combining Theorem 6 with Theorem 5 we obtain ${DHL[k_0,2]}$ for some finite ${k_0}$; Zhang obtains this for ${k_0 = 3,500,000}$ but as detailed below, this can be lowered to ${k_0 = 341,640}$. This in turn gives infinitely many pairs of consecutive primes of separation at most ${H(k_0)}$. Zhang gives a simple argument that bounds ${H(3,500,000)}$ by ${70,000,000}$, giving his famous result that there are infinitely many pairs of primes of separation at most ${70,000,000}$; by being a bit more careful (as discussed in this post) one can lower the upper bound on ${H(3,500,000)}$ to ${57,554,086}$, and if one uses the newer value ${k_0 = 341,640}$ one can instead use the bound ${H(341,640) \leq 4,982,086}$. (Many thanks to Scott Morrison for these numerics.) UPDATE: These values are now obsolete; see this web page for the latest bounds.

In this post we would like to give a self-contained proof of both Theorem 4 and Theorem 5, which are both sieve-theoretic results that are mainly elementary in nature. (But, as stated earlier, we will not discuss the deepest new result in Zhang’s paper, namely Theorem 6.) Our presentation will deviate a little bit from the traditional sieve-theoretic approach in a few places. Firstly, there is a portion of the argument that is traditionally handled using contour integration and properties of the Riemann zeta function; we will present a “cheaper” approach (which Ben Green and I used in our papers, e.g. in this one) using Fourier analysis, with the only property used about the zeta function ${\zeta(s)}$ being the elementary fact that it blows up like ${\frac{1}{s-1}}$ as ${s}$ approaches ${1}$ from the right. To deal with the contribution of small primes (which is the source of the singular series ${{\mathfrak G}}$), it will be convenient to use the “${W}$-trick” (introduced in this paper of mine with Ben), passing to a single residue class mod ${W}$ (where ${W}$ is the product of all the small primes) to end up in a situation in which all small primes have been “turned off”, which leads to better pseudorandomness properties (for instance, once one eliminates all multiples of small primes, almost all pairs of remaining numbers will be coprime).
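The effect the ${W}$-trick exploits can be seen numerically. In the following minimal Python sketch (the choice of small primes and the sample size are arbitrary, not taken from the post), restricting to a single residue class ${b \pmod W}$ with ${\gcd(b,W)=1}$ "turns off" the small primes, and the fraction of coprime pairs jumps well above the generic density ${6/\pi^2 \approx 0.61}$:

```python
from math import gcd, prod

small = [2, 3, 5, 7, 11]   # the small primes being "turned off"
W = prod(small)            # W = 2310
b = 1                      # any residue class coprime to W works
cls = [b + n * W for n in range(1, 201)]

# No element of the residue class is divisible by a small prime:
assert all(x % p != 0 for x in cls for p in small)

def coprime_fraction(xs):
    """Fraction of unordered pairs from xs that are coprime."""
    xs = list(xs)
    pairs = [(x, y) for i, x in enumerate(xs) for y in xs[i + 1:]]
    return sum(gcd(x, y) == 1 for x, y in pairs) / len(pairs)

print(coprime_fraction(range(1, 201)))  # generic integers: near 6/pi^2
print(coprime_fraction(cls))            # W-tricked class: much closer to 1
```

Any remaining common factor of two elements of the class must be a prime larger than ${11}$, which is why almost all pairs become coprime.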

In the third Marker lecture, I would like to discuss the recent progress, particularly by Goldston, Pintz, and Yıldırım, on finding small gaps $p_{n+1}-p_n$ between consecutive primes.  (See also the surveys by Goldston-Pintz-Yıldırım, by Green, and by Soundararajan on the subject; the material here is based to some extent on these prior surveys.)