Kevin Ford, James Maynard, and I have uploaded to the arXiv our preprint “Chains of large gaps between primes”. This paper was announced in our previous paper with Konyagin and Green, which was concerned with the largest gap

$\displaystyle G_1(X) := \max_{p_n, p_{n+1} \leq X} (p_{n+1} - p_n)$

between consecutive primes up to ${X}$, in which we improved the Rankin bound of

$\displaystyle G_1(X) \gg \log X \frac{\log_2 X \log_4 X}{(\log_3 X)^2}$

to

$\displaystyle G_1(X) \gg \log X \frac{\log_2 X \log_4 X}{\log_3 X}$

for large ${X}$ (where we use the abbreviations ${\log_2 X := \log\log X}$, ${\log_3 X := \log\log\log X}$, and ${\log_4 X := \log\log\log\log X}$). Here, we obtain an analogous result for the quantity

$\displaystyle G_k(X) := \max_{p_n, \dots, p_{n+k} \leq X} \min( p_{n+1} - p_n, p_{n+2}-p_{n+1}, \dots, p_{n+k} - p_{n+k-1} )$

which measures how large the ${k}$ gaps between ${k+1}$ consecutive primes up to ${X}$ can simultaneously be. Our main result is

$\displaystyle G_k(X) \gg \frac{1}{k^2} \log X \frac{\log_2 X \log_4 X}{\log_3 X}$

whenever ${X}$ is sufficiently large depending on ${k}$, with the implied constant here absolute (and effective). The factor of ${1/k^2}$ is inherent to the method, and related to the basic probabilistic fact that if one selects ${k}$ numbers at random from the unit interval ${[0,1]}$, then one expects the minimum gap between adjacent numbers to be about ${1/k^2}$ (i.e. smaller than the mean spacing of ${1/k}$ by an additional factor of ${1/k}$).
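This ${1/k^2}$ heuristic is easy to check numerically. The following sketch (an illustration of the probabilistic fact only; the function name and parameter choices are mine) averages the minimum adjacent gap among ${k}$ uniform random points in ${[0,1]}$; doubling ${k}$ shrinks the average by roughly a factor of four, as the ${1/k^2}$ scaling predicts.

```python
import random

def mean_min_gap(k, trials=2000, seed=0):
    """Average the minimum adjacent gap among k uniform points in [0,1]."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        pts = sorted(rng.random() for _ in range(k))
        total += min(b - a for a, b in zip(pts, pts[1:]))
    return total / trials

for k in (5, 10, 20):
    print(k, mean_min_gap(k))  # decays like 1/k^2, not like the mean spacing 1/k
```

(A standard spacings computation gives the exact expectation ${1/(k^2-1)}$, in line with the ${1/k^2}$ heuristic.)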

Our arguments combine those from the previous paper with the matrix method of Maier, who (in our notation) showed that

$\displaystyle G_k(X) \gg_k \log X \frac{\log_2 X \log_4 X}{(\log_3 X)^2}$

for an infinite sequence of ${X}$ going to infinity. (Maier needed to restrict to an infinite sequence to avoid Siegel zeroes, but we are able to resolve this issue by the now standard technique of simply eliminating a prime factor of an exceptional conductor from the sieve-theoretic portion of the argument. As a byproduct, this also makes all of the estimates in our paper effective.)

As its name suggests, the Maier matrix method is usually presented by imagining a matrix of numbers, and using information about the distribution of primes in the columns of this matrix to deduce information about the primes in at least one of the rows of the matrix. We found it convenient to interpret this method in an equivalent probabilistic form as follows. Suppose one wants to find an interval ${n+1,\dots,n+y}$ that contains a block of at least ${k}$ primes, any two of which are separated by at least ${g}$ (ultimately, ${y}$ will be something like ${\log X \frac{\log_2 X \log_4 X}{\log_3 X}}$ and ${g}$ something like ${y/k^2}$). One can do this by the probabilistic method: pick ${n}$ to be a random large natural number ${{\mathbf n}}$ (with the precise distribution to be chosen later), and try to lower bound the probability that the interval ${{\mathbf n}+1,\dots,{\mathbf n}+y}$ contains at least ${k}$ primes, no two of which are within ${g}$ of each other.

By carefully choosing the residue class of ${{\mathbf n}}$ with respect to small primes, one can immediately eliminate many of the ${{\mathbf n}+j}$ from consideration as candidate primes. For instance, if ${{\mathbf n}}$ is chosen to be large and even, then the ${{\mathbf n}+j}$ with ${j}$ even have no chance of being prime and can thus be eliminated; similarly, if ${{\mathbf n}}$ is large and odd, then ${{\mathbf n}+j}$ cannot be prime for any odd ${j}$. Using the methods of our previous paper, we can find a residue class ${m \hbox{ mod } P}$ (where ${P}$ is a product of a large number of primes) such that, if one chooses ${{\mathbf n}}$ to be a large random element of ${m \hbox{ mod } P}$ (that is, ${{\mathbf n} = {\mathbf z} P + m}$ for some large random integer ${{\mathbf z}}$), then the set ${{\mathcal T}}$ of shifts ${j \in \{1,\dots,y\}}$ for which ${{\mathbf n}+j}$ still has a chance of being prime has size comparable to ${k \log X / \log_2 X}$; furthermore, this set ${{\mathcal T}}$ is fairly well distributed in ${\{1,\dots,y\}}$, in the sense that it does not concentrate too strongly in any short subinterval of ${\{1,\dots,y\}}$. The main new difficulty, not present in the previous paper, is to get lower bounds on the size of ${{\mathcal T}}$ in addition to upper bounds, but this turns out to be achievable by a suitable modification of the arguments.
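The sieving step can be illustrated in miniature (the function and the toy residue choice below are mine, not the construction of the paper): if ${{\mathbf n} \equiv -b_p \pmod p}$ for each small sieving prime ${p}$, then ${{\mathbf n}+j}$ is divisible by ${p}$ exactly when ${j \equiv b_p \pmod p}$, so each sieving prime deletes one residue class of shifts from ${\{1,\dots,y\}}$.

```python
def surviving_shifts(y, sieving_primes, forbidden_residues):
    """Shifts j in {1,...,y} avoiding the forbidden residue mod each prime.
    If n ≡ -b_p (mod p) for each sieving prime p, these are exactly the
    j for which n + j escapes divisibility by every sieving prime."""
    return [j for j in range(1, y + 1)
            if all(j % p != b % p
                   for p, b in zip(sieving_primes, forbidden_residues))]

# Toy choice: forbid j ≡ 0 mod each of 2, 3, 5, 7.  The survivors in
# {1,...,100} are then exactly 1 and the primes between 10 and 100
# (any composite of primes > 7 would be at least 11^2 = 121).
T = surviving_shifts(100, [2, 3, 5, 7], [0, 0, 0, 0])
print(len(T), T[:6])
```

The actual construction chooses the residues ${b_p}$ far more carefully (in the Erdős–Rankin tradition), so that the survivor set has size comparable to ${k \log X/\log_2 X}$ and does not clump in short subintervals.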

Using a version of the prime number theorem in arithmetic progressions due to Gallagher, one can show that for each remaining shift ${j \in {\mathcal T}}$, ${{\mathbf n}+j}$ is going to be prime with probability comparable to ${\log_2 X / \log X}$, so one expects about ${k}$ primes in the set ${\{{\mathbf n} + j: j \in {\mathcal T}\}}$. An upper bound sieve (e.g. the Selberg sieve) also shows that for any distinct ${j,j' \in {\mathcal T}}$, the probability that ${{\mathbf n}+j}$ and ${{\mathbf n}+j'}$ are both prime is ${O( (\log_2 X / \log X)^2 )}$. Using this and some routine second moment calculations, one can then show that with large probability, the set ${\{{\mathbf n} + j: j \in {\mathcal T}\}}$ will indeed contain about ${k}$ primes, no two of which are closer than ${g}$ to each other; with no other numbers in this interval being prime, this gives a lower bound on ${G_k(X)}$.