I’ve uploaded to the arXiv the polymath research paper “Deterministic methods to find primes“, which is the outcome of the Polymath4 collaborative mathematics project, and has been submitted to Mathematics of Computation.

The objective of this paper was to find fast deterministic algorithms to solve the following problem:

Given a (large) integer x, find a prime p larger than x.

Thanks to the AKS algorithm, a number of size O(x) can be deterministically tested for primality in time O( \log^{O(1)} x ) = O( x^{o(1)} ).   By Bertrand’s postulate, there is always at least one prime between x and 2x; by testing each one of these integers in turn for primality, one can thus obtain a deterministic algorithm to find primes in time O( x^{1+o(1)}).

But one should be able to do much better.  For comparison, if probabilistic algorithms are allowed, then by randomly selecting integers between x and 2x to test for primality, it is easy to see from the prime number theorem that one will succeed in obtaining a prime with high probability in time O( \log^{O(1)} x ) = O( x^{o(1)} ).  However, after some effort we were not able to “derandomise” this algorithm to create any reasonable deterministic counterpart.  Nevertheless, we conjecture that a deterministic algorithm with run time O( x^{o(1)}) exists.  Such algorithms can be easily obtained if one assumes some standard conjectures regarding the primes (e.g. Cramer’s conjecture on prime gaps), but we do not know of any deterministic algorithms which can be unconditionally proved to run in time O( x^{o(1)}).

Currently, the best known deterministic algorithm is due to Lagarias and Odlyzko, and has a run time of O( x^{1/2+o(1)}).  Roughly speaking, it is based on the ability to compute the prime counting function \pi(x) in time O( x^{1/2+o(1)} ); once one has this function, one can detect which intervals contain primes or not, and then starting from Bertrand’s postulate and performing a binary search one can then locate a prime.  The Lagarias-Odlyzko argument is based on approximating \pi(x) by a certain integral of the Riemann zeta function, which one then approximates in turn by quadrature.

We conjecture that one should be able to compute \pi(x) in faster time, and in particular in time O(x^{1/2-c+o(1)}) for some c>0.  Unfortunately, we were not able to achieve this; however, we do have a non-trivial method to compute the parity \pi(x) \hbox{ mod } 2 of \pi(x) in such a time; a bit more generally (and oversimplifying a little bit), we can compute various projections \sum_{p \leq x} t^p \hbox{ mod } 2, g(t) of the prime polynomial \sum_{p \leq x} t^p modulo some small polynomials g.  This seems close to being able to achieve the goal of detecting whether primes exist in a given interval, but unfortunately some additional work seems to be needed in that area.  Still, the methods used to get the parity seem to be interesting (involving the Dirichlet hyperbola method, a piecewise approximation of the hyperbola, and the Strassen fast multiplication algorithm), and these techniques might be useful for some other problems.

Roughly speaking, the idea to compute the parity of \pi(x) \hbox{ mod } 2 is as follows.  The first observation is that, for square-free n, the number \tau(n) of divisors of n is equal to 2 when n is a prime, and a multiple of 4 otherwise.  So to compute the parity of \pi(x), it suffices to compute \sum_{n \leq x} \tau(n) modulo 4 (or more precisely, the restriction of this sum to squarefree n).

The restriction to squarefree numbers turns out to be relatively easy to handle, so we ignore it.  Since \tau(n) = \sum_{a,b: ab=n} 1, we can rewrite

\sum_{n \leq x} \tau(n) = \sum_{a,b: ab \leq x} 1.

So it suffices to find an efficient way to count the number of lattice points below the hyperbola \{ (a,b): ab = x \}.

The classic way to do this is via the Dirichlet hyperbola method.  This lets one rewrite the above expression as

2 \sum_{a \leq \sqrt{x}} \lfloor \frac{x}{a} \rfloor - \lfloor \sqrt{x}\rfloor^2

(assuming x is not a perfect square, where \lfloor x \rfloor is the greatest integer function).  One can then compute this quantity in time O(x^{1/2+o(1)}) without difficulty.

To improve this to O(x^{1/2-c+o(1)}), we used some ideas of Vinogradov, based on using Diophantine approximation to find moderately long arithmetic progressions on which the map a \mapsto \lfloor \frac{x}{a} \rfloor is linear, and thus easily summable.

The same method extends to the polynomial setting, though now, instead of summing an arithmetic progression, one has to find an efficient way to sum quadratic sums such as \sum_{n=1}^N t^{an^2+bn+c}.  This turns out to be possible by expressing this sum as a matrix product and then using fast matrix multiplication algorithms.

There is still some scope to improve the results and methods further; Ernie Croot is pursuing this with two undergraduate students, David Hollis and David Lowry,