You are currently browsing the monthly archive for February 2018.

Kevin Ford, Sergei Konyagin, James Maynard, Carl Pomerance, and I have uploaded to the arXiv our paper “Long gaps in sieved sets“, submitted to J. Europ. Math. Soc..

This paper originated from the MSRI program in analytic number theory last year, and was centred around variants of the question of finding large gaps between primes. As discussed for instance in this previous post, it is now known that within the set of primes {{\mathcal P} = \{2,3,5,\dots\}}, one can find infinitely many adjacent elements {a,b} whose gap {b-a} obeys a lower bound of the form

\displaystyle b-a \gg \log a \frac{\log_2 a \log_4 a}{\log_3 a}

where {\log_k} denotes the {k}-fold iterated logarithm. This compares with the trivial bound of {b-a \gg \log a} that one can obtain from the prime number theorem and the pigeonhole principle. Several years ago, Pomerance posed the question of whether analogous improvements to the trivial bound can be obtained for such sets as

\displaystyle {\mathcal P}_2 = \{ n \in {\bf N}: n^2+1 \hbox{ prime} \}.

Here there is the obvious initial issue that this set is not even known to be infinite (this is the fourth Landau problem), but let us assume for the sake of discussion that this set is indeed infinite, so that we have an infinite number of gaps to speak of. Standard sieve theory techniques give upper bounds for the density of {{\mathcal P}_2} that is comparable (up to an absolute constant) to the prime number theorem bounds for {{\mathcal P}}, so again we can obtain a trivial bound of {b-a \gg \log a} for the gaps of {{\mathcal P}_2}. In this paper we improve this to

\displaystyle b-a \gg \log a \log^c_2 a

for an absolute constant {c>0}; this is not as strong as the corresponding bound for {{\mathcal P}}, but still improves over the trivial bound. In fact we can handle more general “sifted sets” than just {{\mathcal P}_2}. Recall from the sieve of Eratosthenes that the elements of {{\mathcal P}} in, say, the interval {[x/2, x]} can be obtained by removing from {[x/2, x]} one residue class modulo {p} for each prime up to {\sqrt{x}}, namely the class {0} mod {p}. In a similar vein, the elements of {{\mathcal P}_2} in {[x/2,x]} can be obtained by removing for each prime {p} up to {x} zero, one, or two residue classes modulo {p}, depending on whether {-1} is a quadratic residue modulo {p}. On the average, one residue class will be removed (this is a very basic case of the Chebotarev density theorem), so this sieving system is “one-dimensional on the average”. Roughly speaking, our arguments apply to any other set of numbers arising from a sieving system that is one-dimensional on average. (One can consider other dimensions also, but unfortunately our methods seem to give results that are worse than a trivial bound when the dimension is less than or greater than one.)

The standard “Erd\H{o}s-Rankin” method for constructing long gaps between primes proceeds by trying to line up some residue classes modulo small primes {p} so that they collectively occupy a long interval. A key tool in doing so are the smooth number estimates of de Bruijn and others, which among other things assert that if one removes from an interval such as {[1,x]} all the residue classes {0} mod {p} for {p} between {x^{1/u}} and {x} for some fixed {u>1}, then the set of survivors has exceptionally small density (roughly of the order of {u^{-u}}, with the precise density given by the Dickman function), in marked contrast to the situation in which one randomly removes one residue class for each such prime {p}, in which the density is more like {1/u}. One generally exploits this phenomenon to sieve out almost all the elements of a long interval using some of the primes available, and then using the remaining primes to cover up the remaining elements that have not already been sifted out. In the more recent work on this problem, advanced combinatorial tools such as hypergraph covering lemmas are used for the latter task.

In the case of {{\mathcal P}_2}, there does not appear to be any analogue of smooth numbers, in the sense that there is no obvious way to arrange the residue classes so that they have significantly fewer survivors than a random arrangement. Instead we adopt the following semi-random strategy to cover an interval {[1,y]} by residue classes. Firstly, we randomly remove residue classes for primes {p} up to some intermediate threshold {z} (smaller than {y} by a logarithmic factor), leaving behind a preliminary sifted set {S_{[2,z]}}. Then, for each prime {p} between {z} and another intermediate threshold {x/2}, we remove a residue class mod {p} that maximises (or nearly maximises) its intersection with {S_{[2,z]}}. This ends up reducing the number of survivors to be significantly below what one would achieve if one selects residue classes randomly, particularly if one also uses the hypergraph covering lemma from our previous paper. Finally, we cover each the remaining survivors by a residue class from a remaining available prime.

This is the third “research” thread of the Polymath15 project to upper bound the de Bruijn-Newman constant {\Lambda}, continuing this previous thread. Discussion of the project of a non-research nature can continue for now in the existing proposal thread. Progress will be summarised at this Polymath wiki page.

We are making progress on the following test problem: can one show that {H_t(x+iy) \neq 0} whenever {t = 0.4}, {x \geq 0}, and {y \geq 0.4}? This would imply that

\displaystyle  \Lambda \leq 0.4 + \frac{1}{2} (0.4)^2 = 0.48

which would be the first quantitative improvement over the de Bruijn bound of {\Lambda \leq 1/2} (or the Ki-Kim-Lee refinement of {\Lambda < 1/2}). Of course we can try to lower the two parameters of {0.4} later on in the project, but this seems as good a place to start as any. One could also potentially try to use finer analysis of dynamics of zeroes to improve the bound {\Lambda \leq 0.48} further, but this seems to be a less urgent task.

Probably the hardest case is {y=0.4}, as there is a good chance that one can then recover the {y>0.4} case by a suitable use of the argument principle. Here we appear to have a workable Riemann-Siegel type formula that gives a tractable approximation for {H_t}. To describe this formula, first note that in the {t=0} case we have

\displaystyle  H_0(z) = \frac{1}{8} \xi( \frac{1+iz}{2} )

and the Riemann-Siegel formula gives

\displaystyle  \xi(s) = \frac{s(s-1)}{2} \pi^{-s/2} \Gamma(s/2) \sum_{n=1}^N \frac{1}{n^s}

\displaystyle  + \frac{s(s-1)}{2} \pi^{-(1-s)/2} \Gamma((1-s)/2) \sum_{m=1}^M \frac{1}{m^{1-s}}

\displaystyle  + \frac{s(s-1)}{2} \pi^{-s/2} \Gamma(s/2) \frac{e^{-i\pi s} \Gamma(1-s)}{2\pi i} \int_{C_M} \frac{w^{s-1} e^{-Nw}}{e^w-1}\ dw

for any natural numbers {N,M}, where {C_M} is a contour from {+\infty} to {+\infty} that winds once anticlockwise around the zeroes {e^{2\pi im}, |m| \leq M} of {e^w-1} but does not wind around any other zeroes. A good choice of {N,M} to use here is

\displaystyle  N=M=\lfloor \sqrt{\mathrm{Im}(s)/2\pi}\rfloor = \lfloor \sqrt{\mathrm{Re}(z)/4\pi} \rfloor. \ \ \ \ \ (1)

In this case, a classical steepest descent computation (see wiki) yields the approximation

\displaystyle  \int_{C_M} \frac{w^{s-1} e^{-Nw}}{e^w-1}\ dw \approx - (2\pi i M)^{s-1} \Psi( \frac{s}{2\pi i M} - N )

where

\displaystyle  \Psi(\alpha) := 2\pi \frac{\cos \pi(\frac{1}{2}\alpha^2 - \alpha - \pi/8)}{\cos(\pi \alpha)} \exp( \frac{i\pi}{2} \alpha^2 - \frac{5\pi}{8} ).

Thus we have

\displaystyle  H_0(z) \approx A^{(0)} + B^{(0)} - C^{(0)}

where

\displaystyle  A^{(0)} := \frac{1}{8} \frac{s(s-1)}{2} \pi^{-s/2} \Gamma(s/2) \sum_{n=1}^N \frac{1}{n^s}

\displaystyle  B^{(0)} := \frac{1}{8} \frac{s(s-1)}{2} \pi^{-(1-s)/2} \Gamma((1-s)/2) \sum_{m=1}^M \frac{1}{m^{1-s}}

\displaystyle  C^{(0)} := \frac{s(s-1)}{2} \pi^{-s/2} \Gamma(s/2) \frac{e^{-i\pi s} \Gamma(1-s)}{2\pi i} (2\pi i M)^{s-1} \Psi( \frac{s}{2\pi i M} - N )

with {s := \frac{1+iz}{2}} and {N,M} given by (1).

Heuristically, we have derived (see wiki) the more general approximation

\displaystyle  H_t(z) \approx A + B - C

for {t>0} (and in particular for {t=0.4}), where

\displaystyle  A := \frac{1}{8} \frac{s(s-1)}{2} \pi^{-s/2} \Gamma(s/2) \sum_{n=1}^N \frac{\exp(\frac{t}{16} \log^2 \frac{s+4}{2\pi n^2} )}{n^s}

\displaystyle  B := \frac{1}{8} \frac{s(s-1)}{2} \pi^{-(1-s)/2} \Gamma((1-s)/2) \sum_{m=1}^M \frac{\exp(\frac{t}{16} \log^2 \frac{5-s}{2\pi m^2} )}{m^{1-s}}

\displaystyle  C := \exp(-\frac{t \pi^2}{64}) C^{(0)}.

In practice it seems that the {C} term is negligible once the real part {x} of {z} is moderately large, so one also has the approximation

\displaystyle  H_t(z) \approx A + B.

For large {x}, and for fixed {t,y>0}, e.g. {t=y=0.4}, the sums {A,B} converge fairly quickly (in fact the situation seems to be significantly better here than the much more intensively studied {t=0} case), and we expect the first term

\displaystyle  B_0 := \frac{1}{8} \frac{s(s-1)}{2} \pi^{-(1-s)/2} \Gamma((1-s)/2) \exp( \frac{t}{16} \log^2 \frac{5-s}{2\pi} )

of the {B} series to dominate. Indeed, analytically we know that {\frac{A+B-C}{B_0} \rightarrow 1} (or {\frac{A+B}{B_0} \rightarrow 1}) as {x \rightarrow \infty} (holding {y} fixed), and it should also be provable that {\frac{H_t}{B_0} \rightarrow 1} as well. Numerically with {t=y=0.4}, it seems in fact that {\frac{A+B-C}{B_0}} (or {\frac{A+B}{B_0}}) stay within a distance of about {1/2} of {1} once {x} is moderately large (e.g. {x \geq 2 \times 10^5}). This raises the hope that one can solve the toy problem of showing {H_t(x+iy) \neq 0} for {t=y=0.4} by numerically controlling {H_t(x+iy) / B_0} for small {x} (e.g. {x \leq 2 \times 10^5}), numerically controlling {(A+B)/B_0} and analytically bounding the error {(H_t - A - B)/B_0} for medium {x} (e.g. {2 \times 10^5 \leq x \leq 10^7}), and analytically bounding both {(A+B)/B_0} and {(H_t-A-B)/B_0} for large {x} (e.g. {x \geq 10^7}). (These numbers {2 \times 10^5} and {10^7} are arbitrarily chosen here and may end up being optimised to something else as the computations become clearer.)

Thus, we now have four largely independent tasks (for suitable ranges of “small”, “medium”, and “large” {x}):

  1. Numerically computing {H_t(x+iy) / B_0} for small {x} (with enough accuracy to verify that there are no zeroes)
  2. Numerically computing {(A+B)/B_0} for medium {x} (with enough accuracy to keep it away from zero)
  3. Analytically bounding {(A+B)/B_0} for large {x} (with enough accuracy to keep it away from zero); and
  4. Analytically bounding {(H_t - A - B)/B_0} for medium and large {x} (with a bound that is better than the bound away from zero in the previous two tasks).

Note that tasks 2 and 3 do not directly require any further understanding of the function {H_t}.

Below we will give a progress report on the numeric and analytic sides of these tasks.

— 1. Numerics report (contributed by Sujit Nair) —

There is some progress on the code side but not at the pace I was hoping. Here are a few things which happened (rather, mistakes which were taken care of).

  1. We got rid of code which wasn’t being used. For example, @dhjpolymath computed {H_t} based on an old version but only realized it after the fact.
  2. We implemented tests to catch human/numerical bugs before a computation starts. Again, we lost some numerical cycles but moving forward these can be avoided.
  3. David got set up on GitHub and he is able to compare his output (in C) with the Python code. That is helping a lot.

Two areas which were worked on were

  1. Computing {H_t} and zeroes for {t} around {0.4}
  2. Computing quantities like {(A+B-C)/B_0}, {(A+B)/B_0}, {C/B_0}, etc. with the goal of understanding the zero free regions.

Some observations for {t=0.4}, {y=0.4}, {x \in ( 10^4, 10^7)} include:

  • {(A+B) / B_0} does seem to avoid the negative real axis
  • {|(A+B) / B0| > 0.4} (based on the oscillations and trends in the plots)
  • {|C/B_0|} seems to be settling around {10^{-4}} range.

See the figure below. The top plot is on the complex plane and the bottom plot is the absolute value. The code to play with this is here.

— 2. Analysis report —

The Riemann-Siegel formula and some manipulations (see wiki) give {H_0 = A^{(0)} + B^{(0)} - \tilde C^{(0)}}, where

\displaystyle  A^{(0)} = \frac{2}{8} \sum_{n=1}^N \int_C \exp( \frac{s+4}{2} u - e^u - \frac{s}{2} \log(\pi n^2) )\ du

\displaystyle  - \frac{3}{8} \sum_{n=1}^N \int_C \exp( \frac{s+2}{2} u - e^u - \frac{s}{2} \log(\pi n^2) )\ du

\displaystyle  B^{(0)} = \frac{2}{8} \sum_{m=1}^M \int_{\overline{C}} \exp( \frac{5-s}{2} u - e^u - \frac{1-s}{2} \log(\pi m^2) )\ du

\displaystyle  - \frac{3}{8} \sum_{m=1}^M \int_C \exp( \frac{3-s}{2} u - e^u - \frac{1-s}{2} \log(\pi m^2) )\ du

\displaystyle  \tilde C^{(0)} := -\frac{2}{8} \sum_{n=0}^\infty \frac{e^{-i\pi s/2} e^{i\pi s n}}{2^s \pi^{1/2}} \int_{\overline{C}} \int_{C_M} \frac{w^{s-1} e^{-Nw}}{e^w-1} \exp( \frac{5-s}{2} u - e^u)\ du dw

\displaystyle  +\frac{3}{8} \sum_{n=0}^\infty \frac{e^{-i\pi s/2} e^{i\pi s n}}{2^s \pi^{1/2}} \int_{\overline{C}} \int_{C_M} \frac{w^{s-1} e^{-Nw}}{e^w-1} \exp( \frac{3-s}{2} u - e^u)\ du dw

where {C} is a contour that goes from {+i\infty} to {+\infty} staying a bounded distance away from the upper imaginary and right real axes, and {\overline{C}} is the complex conjugate of {C}. (In each of these sums, it is the first term that should dominate, with the second one being about {O(1/x)} as large.) One can then evolve by the heat flow to obtain {H_t = \tilde A + \tilde B - \tilde C}, where

\displaystyle  \tilde A := \frac{2}{8} \sum_{n=1}^N \int_C \exp( \frac{s+4}{2} u - e^u - \frac{s}{2} \log(\pi n^2) + \frac{t}{16} (u - \log(\pi n^2))^2)\ du

\displaystyle  - \frac{3}{8} \sum_{n=1}^N \int_C \exp( \frac{s+2}{2} u - e^u - \frac{s}{2} \log(\pi n^2) + \frac{t}{16} (u - \log(\pi n^2))^2)\ du

\displaystyle  \tilde B := \frac{2}{8} \sum_{m=1}^M \int_{\overline{C}} \exp( \frac{5-s}{2} u - e^u - \frac{1-s}{2} \log(\pi m^2) + \frac{t}{16} (u - \log(\pi m^2))^2)\ du

\displaystyle  - \frac{3}{8} \sum_{m=1}^M \int_C \exp( \frac{3-s}{2} u - e^u - \frac{1-s}{2} \log(\pi m^2) + \frac{t}{16} (u - \log(\pi m^2))^2)\ du

\displaystyle  \tilde C := -\frac{2}{8} \sum_{n=0}^\infty \frac{e^{-i\pi s/2} e^{i\pi s n}}{2^s \pi^{1/2}} \int_{\overline{C}} \int_{C_M}

\displaystyle \frac{w^{s-1} e^{-Nw}}{e^w-1} \exp( \frac{5-s}{2} u - e^u + \frac{t}{4} (i \pi(n-1/2) + \log \frac{w}{2\sqrt{\pi}} - \frac{u}{2})^2) \ du dw

\displaystyle  +\frac{3}{8} \sum_{n=0}^\infty \frac{e^{-i\pi s/2} e^{i\pi s n}}{2^s \pi^{1/2}} \int_{\overline{C}} \int_{C_M}

\displaystyle \frac{w^{s-1} e^{-Nw}}{e^w-1} \exp( \frac{3-s}{2} u - e^u + \frac{t}{4} (i \pi(n-1/2) + \log \frac{w}{2\sqrt{\pi}} - \frac{u}{2})^2)\ du dw.

Steepest descent heuristics then predict that {\tilde A \approx A}, {\tilde B \approx B}, and {\tilde C \approx C}. For the purposes of this project, we will need effective error estimates here, with explicit error terms.

A start has been made towards this goal at this wiki page. Firstly there is a “effective Laplace method” lemma that gives effective bounds on integrals of the form {\int_I e^{\phi(x)} \psi(x)\ dx} if the real part of {\phi(x)} is either monotone with large derivative, or has a critical point and is decreasing on both sides of that critical point. In principle, all one has to do is manipulate expressions such as {\tilde A - A}, {\tilde B - B}, {\tilde C - C} by change of variables, contour shifting and integration by parts until it is of the form to which the above lemma can be profitably applied. As one may imagine though the computations are messy, particularly for the {\tilde C} term. As a warm up, I have begun by trying to estimate integrals of the form

\displaystyle  \int_C \exp( s (1+u-e^u) + \frac{t}{16} (u+b)^2 )\ du

for smallish complex numbers {b}, as these sorts of integrals appear in the form of {\tilde A, \tilde B, \tilde C}. As of this time of writing, there are effective bounds for the {b=0} case, and I am currently working on extending them to the {b \neq 0} case, which should give enough control to approximate {\tilde A - A} and {\tilde B-B}. The most complicated task will be that of upper bounding {\tilde C}, but it also looks eventually doable.

This is the second “research” thread of the Polymath15 project to upper bound the de Bruijn-Newman constant {\Lambda}, continuing this previous thread. Discussion of the project of a non-research nature can continue for now in the existing proposal thread. Progress will be summarised at this Polymath wiki page.

We now have the following proposition (see this page for a proof sketch) that looks like it can give a numerically feasible approach to bound {\Lambda}:

Proposition 1 Suppose that one has parameters {t_0, T, \varepsilon > 0} obeying the following properties:

  • All the zeroes of {H_0(x+iy)=0} with {0 \leq x \leq T} are real.
  • There are no zeroes {H_t(x+iy)=0} with {0 \leq t \leq t_0} in the region {\{ x+iy: x \geq T; 1-2t \geq y^2 \geq \varepsilon^2 + (T-x)^2 \}}.
  • There are no zeroes {H_{t_0}(x+iy)=0} with {x > T} and {y \geq \varepsilon}.

Then one has {\Lambda \leq t_0 + \frac{1}{2} \varepsilon^2}.

The first hypothesis is already known for {T} up to about {10^{12}} (we should find out exactly what we can reach here). Preliminary calculations suggest that we can obtain the third item provided that {t_0, \varepsilon \gg \frac{1}{\log T}}. The second hypothesis requires good numerical calculation for {H_t}, to which we now turn.

The initial definition of {H_t} is given by the formula

\displaystyle  H_t(z) := \int_0^\infty e^{tu^2} \Phi(u) \cos(zu)\ du

where

\displaystyle  \Phi(u) := \sum_{n=1}^\infty (2\pi^2 n^4 e^{9u} - 3\pi n^2 e^{5u} ) \exp(-\pi n^2 e^{4u}).

This formula has proven numerically computable to acceptable error up until about the first hundred zeroes of {H_t}, but degrades after that, and so other exact or approximate formulae for {H_t} are needed. One possible exact formula that could be useful is

\displaystyle  H_t(z) = \frac{1}{2} (K_{t,\theta}(z) + \overline{K_{t,\theta}(\overline{z})})

where

\displaystyle  K_{t,\theta}(z) := \sum_{n=1}^\infty (2\pi^2 n^4 I_{t,\theta}(z-9i, \pi n^2) - 3\pi n^2I_{t,\theta}(z-5i, \pi n^2))

and

\displaystyle  I_{t,\theta}(b,\beta) := \int_{i\theta}^{i\theta+i\infty} \exp(tu^2 - \beta e^{4u} + ibu)\ du

and {-\pi/8 < \theta < \pi/8} can be chosen arbitrarily. We are still trying to see if this can be implemented numerically to give better accuracy than the previous formula.

It seems particularly promising to develop a generalisation of the Riemann-Siegel approximate functional equation for {H_0}. Preliminary computations suggest in particular that we have the approximation

\displaystyle  H_t(x+iy) \approx \frac{1}{4} (F_t(\frac{1+ix-y}{2}) + \overline{F_t(\frac{1+ix+y}{2})})

where

\displaystyle  F_t(s) := \pi^{-s/2} \Gamma(\frac{s+4}{2}) \sum_{n \leq \sqrt{\mathrm{Im}(s)/2\pi}} \frac{\exp( \frac{t}{16} \log^2 \frac{s+4}{2\pi n^2})}{n^s}.

Some very preliminary numerics suggest that this formula is reasonably accurate even for moderate values of {x}, though further numerical verification is needed. As a proof of concept, one could take this approximation as exact for the purposes of seeing what ranges of {T} one can feasibly compute with (and for extremely large values of {T}, we will presumably have to introduce some version of the Odlyzko-Schönhage algorithm. Of course, to obtain a rigorous result, we will eventually need a rigorous version of this formula with explicit error bounds. It may also be necessary to add more terms to the approximation to reduce the size of the error.

Sujit Nair has kindly summarised for me the current state of affairs with the numerics as follows:

  • We need a real milestone and work backward to set up intermediate goals. This will definitely help bring in focus!
  • So far, we have some utilities to compute zeroes of {H_t} with a nonlinear solver which uses roots of {H_0} as an initial condition. The solver is a wrapper around MINPACK’s implementation of Powell’s method. There is some room for optimization. For example, we aren’t providing the solver with the analytical Jacobian which speeds up the computation and increases accuracy.
  • We have some results in the output folder which contains the first 1000 roots of {H_t} for some small values of {t \in \{0.01, 0.1, 0.22\}}, etc. They need some more organization and visualization.

We have a decent initial start but we have some ways to go. Moving forward, here is my proposition for some areas of focus. We should expand and prioritize after some open discussion.

  1. Short term Optimize the existing framework and target to have the first million zeros of {H_t} (for a reasonable range of {t}) and the corresponding plots. With better engineering practice and discipline, I am confident we can get to a few tens of millions range. Some things which will help include parallelization, iterative approaches (using zeroes of {H_t} to compute zeroes of {H_{t + \delta t}}), etc.
  2. Medium term We need to explore better ways to represent the zeros and compute them. An analogy is the computation of Riemann zeroes up to height {T}. It is computed by computing the sign changes of {Z(t)} (page 119 of Edwards) and by exploiting the {\sqrt T} speed up with the Riemann-Siegel formulation (over Euler-Maclaurin). For larger values of {j}, I am not sure the root solver based approach is going to work to understand the gaps between zeroes.
  3. Long term We also need a better understanding of the errors involved in the computation — truncation, hardware/software, etc.

Archives