Kevin Ford, Sergei Konyagin, James Maynard, Carl Pomerance, and I have uploaded to the arXiv our paper “Long gaps in sieved sets“, submitted to J. Europ. Math. Soc..

This paper originated from the MSRI program in analytic number theory last year, and was centred around variants of the question of finding large gaps between primes. As discussed for instance in this previous post, it is now known that within the set of primes {{\mathcal P} = \{2,3,5,\dots\}}, one can find infinitely many adjacent elements {a,b} whose gap {b-a} obeys a lower bound of the form

\displaystyle b-a \gg \log a \frac{\log_2 a \log_4 a}{\log_3 a}

where {\log_k} denotes the {k}-fold iterated logarithm. This compares with the trivial bound of {b-a \gg \log a} that one can obtain from the prime number theorem and the pigeonhole principle. Several years ago, Pomerance posed the question of whether analogous improvements to the trivial bound can be obtained for such sets as

\displaystyle {\mathcal P}_2 = \{ n \in {\bf N}: n^2+1 \hbox{ prime} \}.

Here there is the obvious initial issue that this set is not even known to be infinite (this is the fourth Landau problem), but let us assume for the sake of discussion that this set is indeed infinite, so that we have an infinite number of gaps to speak of. Standard sieve theory techniques give upper bounds for the density of {{\mathcal P}_2} that is comparable (up to an absolute constant) to the prime number theorem bounds for {{\mathcal P}}, so again we can obtain a trivial bound of {b-a \gg \log a} for the gaps of {{\mathcal P}_2}. In this paper we improve this to

\displaystyle b-a \gg \log a \log^c_2 a

for an absolute constant {c>0}; this is not as strong as the corresponding bound for {{\mathcal P}}, but still improves over the trivial bound. In fact we can handle more general “sifted sets” than just {{\mathcal P}_2}. Recall from the sieve of Eratosthenes that the elements of {{\mathcal P}} in, say, the interval {[x/2, x]} can be obtained by removing from {[x/2, x]} one residue class modulo {p} for each prime up to {\sqrt{x}}, namely the class {0} mod {p}. In a similar vein, the elements of {{\mathcal P}_2} in {[x/2,x]} can be obtained by removing for each prime {p} up to {x} zero, one, or two residue classes modulo {p}, depending on whether {-1} is a quadratic residue modulo {p}. On the average, one residue class will be removed (this is a very basic case of the Chebotarev density theorem), so this sieving system is “one-dimensional on the average”. Roughly speaking, our arguments apply to any other set of numbers arising from a sieving system that is one-dimensional on average. (One can consider other dimensions also, but unfortunately our methods seem to give results that are worse than a trivial bound when the dimension is less than or greater than one.)

The standard “Erdős-Rankin” method for constructing long gaps between primes proceeds by trying to line up some residue classes modulo small primes {p} so that they collectively occupy a long interval. A key tool in doing so are the smooth number estimates of de Bruijn and others, which among other things assert that if one removes from an interval such as {[1,x]} all the residue classes {0} mod {p} for {p} between {x^{1/u}} and {x} for some fixed {u>1}, then the set of survivors has exceptionally small density (roughly of the order of {u^{-u}}, with the precise density given by the Dickman function), in marked contrast to the situation in which one randomly removes one residue class for each such prime {p}, in which the density is more like {1/u}. One generally exploits this phenomenon to sieve out almost all the elements of a long interval using some of the primes available, and then using the remaining primes to cover up the remaining elements that have not already been sifted out. In the more recent work on this problem, advanced combinatorial tools such as hypergraph covering lemmas are used for the latter task.

In the case of {{\mathcal P}_2}, there does not appear to be any analogue of smooth numbers, in the sense that there is no obvious way to arrange the residue classes so that they have significantly fewer survivors than a random arrangement. Instead we adopt the following semi-random strategy to cover an interval {[1,y]} by residue classes. Firstly, we randomly remove residue classes for primes {p} up to some intermediate threshold {z} (smaller than {y} by a logarithmic factor), leaving behind a preliminary sifted set {S_{[2,z]}}. Then, for each prime {p} between {z} and another intermediate threshold {x/2}, we remove a residue class mod {p} that maximises (or nearly maximises) its intersection with {S_{[2,z]}}. This ends up reducing the number of survivors to be significantly below what one would achieve if one selects residue classes randomly, particularly if one also uses the hypergraph covering lemma from our previous paper. Finally, we cover each the remaining survivors by a residue class from a remaining available prime.