
This is the eleventh research thread of the Polymath15 project to upper bound the de Bruijn-Newman constant ${\Lambda}$, continuing this post. Discussion of the project of a non-research nature can continue for now in the existing proposal thread. Progress will be summarised at this Polymath wiki page.

There are currently two strands of activity.  One is writing up the paper describing the combination of theoretical and numerical results needed to obtain the new bound $\Lambda \leq 0.22$.  The latest version of the writeup may be found here, in this directory.  The theoretical side of things has mostly been written up; the main remaining tasks to do right now are

1. giving a more detailed description and illustration of the two major numerical verifications, namely the barrier verification that establishes a zero-free region $H_t(x+iy) \neq 0$ for $0 \leq t \leq 0.2, 0.2 \leq y \leq 1, |x - 6 \times 10^{10} - 83952| \leq 0.5$, and the Dirichlet series bound that establishes a zero-free region for $t = 0.2, 0.2 \leq y \leq 1, x \geq 6 \times 10^{10} + 83952$; and
2. giving more detail on the conditional results assuming more numerical verification of RH.

Meanwhile, several of us have been exploring the behaviour of the zeroes of $H_t$ for negative $t$; this does not directly lead to any new progress on bounding $\Lambda$ (though there is a good chance that it may simplify the proof of $\Lambda \geq 0$), but there have been some interesting numerical phenomena uncovered, as summarised in this set of slides.  One phenomenon is that for large negative $t$, many of the complex zeroes begin to organise themselves near the curves

$\displaystyle y = -\frac{t}{2} \log \frac{x}{4\pi n(n+1)} - 1.$

(An example of the agreement between the zeroes and these curves may be found here.)  We now have a (heuristic) theoretical explanation for this; we should have an approximation

$\displaystyle H_t(x+iy) \approx B_t(x+iy) \sum_{n=1}^\infty \frac{b_n^t}{n^{s_*}}$

in this region (where $B_t, b_n^t, n^{s_*}$ are defined in equations (11), (15), (17) of the writeup), and the above curves arise from (an approximation of) those locations where two adjacent terms $\frac{b_n^t}{n^{s_*}}$, $\frac{b_{n+1}^t}{(n+1)^{s_*}}$ in this series have equal magnitude (with the other terms being of lower order).

However, we only have a partial explanation at present of the interesting behaviour of the real zeroes at negative $t$. For instance, the surviving zeroes at extremely negative values of $t$ appear to lie on the curve where the quantity $N$ is close to a half-integer, where

$\displaystyle \tilde x := x + \frac{\pi t}{4}$

$\displaystyle N := \sqrt{\frac{\tilde x}{4\pi}}$

The remaining zeroes exhibit a pattern in $(N,u)$ coordinates that is approximately 1-periodic in $N$, where

$\displaystyle u := \frac{4\pi |t|}{\tilde x}.$

A plot of the zeroes in these coordinates (somewhat truncated due to the numerical range) may be found here.
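For anyone wishing to reproduce such a plot, the change of coordinates is easy to script. The following is a minimal sketch in plain Python; the function names and the sample point are hypothetical choices made for illustration, and only the formulas for $\tilde x$, $N$, $u$ are taken from the text above.

```python
import math

def to_N_u(x: float, t: float) -> tuple[float, float]:
    """Map a real zero at position x of H_t (with t < 0) to the (N, u) coordinates."""
    x_tilde = x + math.pi * t / 4.0          # shifted position, as defined above
    if x_tilde <= 0:
        raise ValueError("x_tilde must be positive for N and u to be defined")
    N = math.sqrt(x_tilde / (4.0 * math.pi))
    u = 4.0 * math.pi * abs(t) / x_tilde
    return N, u

def half_integer_offset(N: float) -> float:
    """Distance from the fractional part of N to 1/2 (zero when N is a half-integer)."""
    return abs((N % 1.0) - 0.5)

# Hypothetical sample point, chosen so the arithmetic can be checked by hand:
# x_tilde = 26*pi - pi = 25*pi, hence N = 5/2 and u = 16/25.
t = -4.0
x = 26.0 * math.pi
N, u = to_N_u(x, t)
print(N, u, half_integer_offset(N))
```

Note that $u N^2 = |t|$ identically, so for fixed $t$ the two coordinates determine each other; the 1-periodicity in $N$ is therefore only visible when zeroes for many values of $t$ are plotted together.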

We do not yet have a total explanation of the phenomena seen in this picture.  It appears that we have an approximation

$\displaystyle H_t(x) \approx A_t(x) \sum_{n=1}^\infty \exp( -\frac{|t| \log^2(n/N)}{4(1-\frac{iu}{8\pi})} - \frac{1+i\tilde x}{2} \log(n/N) )$

where $A_t(x)$ is the non-zero multiplier

$\displaystyle A_t(x) := e^{\pi^2 t/64} M_0(\frac{1+i\tilde x}{2}) N^{-\frac{1+i\tilde x}{2}} \sqrt{\frac{\pi}{1-\frac{iu}{8\pi}}}$

and

$\displaystyle M_0(s) := \frac{1}{8}\frac{s(s-1)}{2}\pi^{-s/2} \sqrt{2\pi} \exp( (\frac{s}{2}-\frac{1}{2}) \log \frac{s}{2} - \frac{s}{2} )$

The derivation of this formula may be found in this wiki page.  However our initial attempts to simplify the above approximation further have proven to be somewhat inaccurate numerically (in particular giving an incorrect prediction for the location of zeroes, as seen in this picture).  We are in the process of using numerics to try to resolve the discrepancies (see this page for some code and discussion).

Kaisa Matomäki, Maksym Radziwill, and I just uploaded to the arXiv our paper “Fourier uniformity of bounded multiplicative functions in short intervals on average“. This paper is the outcome of our attempts during the MSRI program in analytic number theory last year to attack the local Fourier uniformity conjecture for the Liouville function ${\lambda}$. This conjecture generalises a landmark result of Matomäki and Radziwill, who show (among other things) that one has the asymptotic

$\displaystyle \int_X^{2X} |\sum_{x \leq n \leq x+H} \lambda(n)|\ dx = o(HX) \ \ \ \ \ (1)$

whenever ${X \rightarrow \infty}$ and ${H = H(X)}$ goes to infinity as ${X \rightarrow \infty}$. Informally, this says that the Liouville function has small mean for almost all short intervals ${[x,x+H]}$. The remarkable thing about this theorem is that there is no lower bound on how ${H}$ goes to infinity with ${X}$; one can take for instance ${H = \log\log\log X}$. This lack of lower bound was crucial when I applied this result (or more precisely, a generalisation of this result to arbitrary non-pretentious bounded multiplicative functions) a few years ago to solve the Erdős discrepancy problem, as well as a logarithmically averaged two-point Chowla conjecture; for instance, it implies that

$\displaystyle \sum_{n \leq X} \frac{\lambda(n) \lambda(n+1)}{n} = o(\log X).$
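Both (1) and the two-point sum above are easy to experiment with at small scales. The sketch below (plain Python; the function names and the choices of ${X}$ and ${H}$ are hypothetical) tabulates ${\lambda}$ with a smallest-prime-factor sieve, then evaluates the logarithmically weighted two-point sum and a discretised version of the left-hand side of (1), normalised by ${HX}$. It proves nothing about asymptotics, of course; it only makes the quantities concrete.

```python
import math

def liouville_table(limit: int) -> list[int]:
    """lam[n] = (-1)^Omega(n) for 1 <= n <= limit (lam[0] is unused)."""
    spf = list(range(limit + 1))             # smallest prime factor of each n
    for p in range(2, math.isqrt(limit) + 1):
        if spf[p] == p:                      # p is prime
            for m in range(p * p, limit + 1, p):
                if spf[m] == m:
                    spf[m] = p
    lam = [0] * (limit + 1)
    if limit >= 1:
        lam[1] = 1
    for n in range(2, limit + 1):
        lam[n] = -lam[n // spf[n]]           # removing one prime factor flips the sign
    return lam

def log_two_point(lam: list[int], X: int) -> float:
    """sum_{n <= X} lam(n) lam(n+1) / n."""
    return sum(lam[n] * lam[n + 1] / n for n in range(1, X + 1))

def short_interval_mean(lam: list[int], X: int, H: int) -> float:
    """Average over integers x in [X, 2X) of |sum_{x <= n <= x+H} lam(n)| / H."""
    prefix = [0] * len(lam)
    for n in range(1, len(lam)):
        prefix[n] = prefix[n - 1] + lam[n]
    total = sum(abs(prefix[x + H] - prefix[x - 1]) for x in range(X, 2 * X))
    return total / (H * X)

# Hypothetical small-scale parameters.
X, H = 20000, 100
lam = liouville_table(2 * X + H + 1)
print("two-point sum / log X:", log_two_point(lam, X) / math.log(X))
print("short-interval mean  :", short_interval_mean(lam, X, H))
```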

The local Fourier uniformity conjecture asserts the stronger asymptotic

$\displaystyle \int_X^{2X} \sup_{\alpha \in {\bf R}} |\sum_{x \leq n \leq x+H} \lambda(n) e(-\alpha n)|\ dx = o(HX) \ \ \ \ \ (2)$

under the same hypotheses on ${H}$ and ${X}$. As I worked out in a previous paper, this conjecture would imply a logarithmically averaged three-point Chowla conjecture, implying for instance that

$\displaystyle \sum_{n \leq X} \frac{\lambda(n) \lambda(n+1) \lambda(n+2)}{n} = o(\log X).$

This particular bound also follows from some slightly different arguments of Joni Teräväinen and myself, but the implication would also work for other non-pretentious bounded multiplicative functions, whereas the arguments of Joni and myself rely more heavily on the specific properties of the Liouville function (in particular that ${\lambda(p)=-1}$ for all primes ${p}$).

There is also a higher order version of the local Fourier uniformity conjecture in which the linear phase ${e(-\alpha n)}$ is replaced with a polynomial phase such as ${e(-\alpha_d n^d - \dots - \alpha_1 n - \alpha_0)}$, or more generally a nilsequence ${\overline{F(g(n) \Gamma)}}$; as shown in my previous paper, this conjecture implies (and is in fact equivalent to, after logarithmic averaging) a logarithmically averaged version of the full Chowla conjecture (not just the two-point or three-point versions), as well as a logarithmically averaged version of the Sarnak conjecture.

The main result of the current paper is to obtain some cases of the local Fourier uniformity conjecture:

Theorem 1 The asymptotic (2) is true when ${H = X^\theta}$ for a fixed ${\theta > 0}$.

Previously this was known for ${\theta > 5/8}$ by the work of Zhan (who in fact proved the stronger pointwise assertion ${\sup_{\alpha \in {\bf R}} |\sum_{x \leq n \leq x+H} \lambda(n) e(-\alpha n)|= o(H)}$ for ${X \leq x \leq 2X}$ in this case). In a previous paper with Kaisa and Maksym, we also proved a weak version

$\displaystyle \sup_{\alpha \in {\bf R}} \int_X^{2X} |\sum_{x \leq n \leq x+H} \lambda(n) e(-\alpha n)|\ dx = o(HX) \ \ \ \ \ (3)$

of (2) for any ${H}$ growing arbitrarily slowly with ${X}$; this is stronger than (1) (and is in fact proven by a variant of the method) but significantly weaker than (2), because in the latter the worst-case ${\alpha}$ is permitted to depend on the ${x}$ parameter, whereas in (3) ${\alpha}$ must remain independent of ${x}$.
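The difference between (2) and (3) is just the order of the supremum and the average, which can be made concrete on a finite grid. In the following sketch (plain Python; the grids of ${x}$ and ${\alpha}$ values, the parameters, and the function names are all hypothetical choices), the "average of sups" is a discrete stand-in for the left-hand side of (2) and the "sup of averages" for the left-hand side of (3), both normalised by the number of terms in each window. On any common grid the second quantity can never exceed the first.

```python
import cmath
import math

def liouville_table(limit: int) -> list[int]:
    """lam[n] = (-1)^Omega(n) for 1 <= n <= limit, via a smallest-prime-factor sieve."""
    spf = list(range(limit + 1))
    for p in range(2, math.isqrt(limit) + 1):
        if spf[p] == p:
            for m in range(p * p, limit + 1, p):
                if spf[m] == m:
                    spf[m] = p
    lam = [0] * (limit + 1)
    if limit >= 1:
        lam[1] = 1
    for n in range(2, limit + 1):
        lam[n] = -lam[n // spf[n]]
    return lam

def window_abs(lam: list[int], x: int, H: int, alpha: float) -> float:
    """|sum_{x <= n <= x+H} lam(n) e(-alpha n)|, where e(y) = exp(2 pi i y)."""
    return abs(sum(lam[n] * cmath.exp(-2j * math.pi * alpha * n)
                   for n in range(x, x + H + 1)))

def compare(lam, H, xs, alphas):
    """Return (average over x of sup over alpha, sup over alpha of average over x)."""
    table = [[window_abs(lam, x, H, a) / (H + 1) for a in alphas] for x in xs]
    avg_of_sup = sum(max(row) for row in table) / len(xs)
    sup_of_avg = max(sum(row[i] for row in table) / len(xs)
                     for i in range(len(alphas)))
    return avg_of_sup, sup_of_avg

# Hypothetical small-scale parameters and grids.
X, H = 1000, 16
lam = liouville_table(2 * X + H)
xs = list(range(X, 2 * X, 25))
alphas = [k / 48 for k in range(48)]
avg_of_sup, sup_of_avg = compare(lam, H, xs, alphas)
print("analogue of (2):", avg_of_sup)
print("analogue of (3):", sup_of_avg)
```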

Unfortunately, the restriction ${H = X^\theta}$ is not strong enough to give applications to Chowla-type conjectures (one would need something more like ${H = \log^\theta X}$ for this). However, it can still be used to control some sums that had not previously been manageable. For instance, a quick application of the circle method lets one use the above theorem to derive the asymptotic

$\displaystyle \sum_{h \leq H} \sum_{n \leq X} \lambda(n) \Lambda(n+h) \Lambda(n+2h) = o( H X )$

whenever ${H = X^\theta}$ for a fixed ${\theta > 0}$, where ${\Lambda}$ is the von Mangoldt function. Amusingly, the seemingly simpler question of establishing the expected asymptotic for

$\displaystyle \sum_{h \leq H} \sum_{n \leq X} \Lambda(n+h) \Lambda(n+2h)$

is only known in the range ${\theta \geq 1/6}$ (from the work of Zaccagnini). Thus we have a rare example of a number theory sum that becomes easier to control when one inserts a Liouville function!

We now give an informal description of the strategy of proof of the theorem (though for numerous technical reasons, the actual proof deviates in some respects from the description given here). If (2) failed, then for many values of ${x \in [X,2X]}$ we would have the lower bound

$\displaystyle |\sum_{x \leq n \leq x+H} \lambda(n) e(-\alpha_x n)| \gg H$

for some frequency ${\alpha_x \in{\bf R}}$. We informally describe this correlation between ${\lambda(n)}$ and ${e(\alpha_x n)}$ by writing

$\displaystyle \lambda(n) \approx e(\alpha_x n) \ \ \ \ \ (4)$

for ${n \in [x,x+H]}$ (informally, one should view this as asserting that ${\lambda(n)}$ “behaves like” a constant multiple of ${e(\alpha_x n)}$). For sake of discussion, suppose we have this relationship for all ${x \in [X,2X]}$, not just many.

As mentioned before, the main difficulty here is to understand how ${\alpha_x}$ varies with ${x}$. As it turns out, the multiplicativity properties of the Liouville function place a significant constraint on this dependence. Indeed, if we let ${p}$ be a fairly small prime (e.g. of size ${H^\varepsilon}$ for some ${\varepsilon>0}$), we can use the identity ${\lambda(np) = \lambda(n) \lambda(p) = - \lambda(n)}$ for the Liouville function to conclude (at least heuristically) from (4) that

$\displaystyle \lambda(n) \approx e(\alpha_x n p)$

for ${n \in [x/p, x/p + H/p]}$. (In practice, we will have this sort of claim for many primes ${p}$ rather than all primes ${p}$, after using tools such as the Turán-Kubilius inequality, but we ignore this distinction for this informal argument.)

Now let ${x, y \in [X,2X]}$ and ${p,q \sim P}$ be primes comparable to some fixed range ${P = H^\varepsilon}$ such that

$\displaystyle x/p = y/q + O( H/P). \ \ \ \ \ (5)$

Then we have both

$\displaystyle \lambda(n) \approx e(\alpha_x n p)$

and

$\displaystyle \lambda(n) \approx e(\alpha_y n q)$

on essentially the same range of ${n}$ (two nearby intervals of length ${\sim H/P}$). This suggests that the frequencies ${p \alpha_x}$ and ${q \alpha_y}$ should be close to each other modulo ${1}$, in particular one should expect the relationship

$\displaystyle p \alpha_x = q \alpha_y + O( \frac{P}{H} ) \hbox{ mod } 1. \ \ \ \ \ (6)$

Comparing this with (5) one is led to the expectation that ${\alpha_x}$ should depend inversely on ${x}$ in some sense (for instance one can check that

$\displaystyle \alpha_x = T/x \ \ \ \ \ (7)$

would solve (6) if ${T = O( X^2 / H^2 )}$; by Taylor expansion, this would correspond to a global approximation of the form ${\lambda(n) \approx n^{iT}}$). One now has a problem of an additive combinatorial flavour (or of a “local to global” flavour), namely to leverage the relation (6) to obtain global control on ${\alpha_x}$ that resembles (7).
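For the reader who wants to see where the size condition on ${T}$ comes from, here is a short check (a sketch only, with implied constants suppressed) that (7) is compatible with (6). Write ${u := x/p}$ and ${v := y/q}$, so that ${u, v \sim X/P}$ and ${u = v + O(H/P)}$ by (5). If ${\alpha_x = T/x}$ and ${\alpha_y = T/y}$, then

$\displaystyle p \alpha_x - q \alpha_y = \frac{T}{u} - \frac{T}{v} = \frac{T(v-u)}{uv} = O\left( |T| \cdot \frac{H}{P} \cdot \frac{P^2}{X^2} \right) = O\left( \frac{|T| H P}{X^2} \right),$

and the right-hand side is ${O(P/H)}$ exactly when ${|T| \ll X^2/H^2}$. This is the same size range for ${T}$ that appears later in the argument.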

A key obstacle in solving (6) efficiently is the fact that one only knows that ${p \alpha_x}$ and ${q \alpha_y}$ are close modulo ${1}$, rather than close on the real line. One can start resolving this problem by the Chinese remainder theorem, using the fact that we have the freedom to shift (say) ${\alpha_y}$ by an arbitrary integer. After doing so, one can arrange matters so that one in fact has the relationship

$\displaystyle p \alpha_x = q \alpha_y + O( \frac{P}{H} ) \hbox{ mod } p \ \ \ \ \ (8)$

whenever ${x,y \in [X,2X]}$ and ${p,q \sim P}$ obey (5). (This may force ${\alpha_y}$ to become extremely large, on the order of ${\prod_{p \sim P} p}$, but this will not concern us.)

Now suppose that we have ${y,y' \in [X,2X]}$ and primes ${q,q' \sim P}$ such that

$\displaystyle y/q = y'/q' + O(H/P). \ \ \ \ \ (9)$

For every prime ${p \sim P}$, we can find an ${x}$ such that ${x/p}$ is within ${O(H/P)}$ of both ${y/q}$ and ${y'/q'}$. Applying (8) twice we obtain

$\displaystyle p \alpha_x = q \alpha_y + O( \frac{P}{H} ) \hbox{ mod } p$

and

$\displaystyle p \alpha_x = q' \alpha_{y'} + O( \frac{P}{H} ) \hbox{ mod } p$

and thus by the triangle inequality we have

$\displaystyle q \alpha_y = q' \alpha_{y'} + O( \frac{P}{H} ) \hbox{ mod } p$

for all ${p \sim P}$; hence by the Chinese remainder theorem

$\displaystyle q \alpha_y = q' \alpha_{y'} + O( \frac{P}{H} ) \hbox{ mod } \prod_{p \sim P} p.$

In practice, in the regime ${H = X^\theta}$ that we are considering, the modulus ${\prod_{p \sim P} p}$ is so huge we can effectively ignore it (in the spirit of the Lefschetz principle); so let us pretend that we in fact have

$\displaystyle q \alpha_y = q' \alpha_{y'} + O( \frac{P}{H} ) \ \ \ \ \ (10)$

whenever ${y,y' \in [X,2X]}$ and ${q,q' \sim P}$ obey (9).

Now let ${k}$ be an integer to be chosen later, and suppose we have primes ${p_1,\dots,p_k,q_1,\dots,q_k \sim P}$ such that the difference

$\displaystyle q = |p_1 \dots p_k - q_1 \dots q_k|$

is small but non-zero. If ${k}$ is chosen so that

$\displaystyle P^k \approx \frac{X}{H}$

(where one is somewhat loose about what ${\approx}$ means) then one can find real numbers ${x_1,\dots,x_k \sim X}$ such that

$\displaystyle \frac{x_j}{p_j} = \frac{x_{j+1}}{q_j} + O( \frac{H}{P} )$

for ${j=1,\dots,k}$, with the convention that ${x_{k+1} = x_1}$. We then have

$\displaystyle p_j \alpha_{x_j} = q_j \alpha_{x_{j+1}} + O( \frac{P}{H} )$

which telescopes to

$\displaystyle p_1 \dots p_k \alpha_{x_1} = q_1 \dots q_k \alpha_{x_1} + O( \frac{P^k}{H} )$

and thus

$\displaystyle q \alpha_{x_1} = O( \frac{P^k}{H} )$

and hence

$\displaystyle \alpha_{x_1} = O( \frac{P^k}{H} ) \approx O( \frac{X}{H^2} ).$

In particular, for each ${x \sim X}$, we expect to be able to write

$\displaystyle \alpha_x = \frac{T_x}{x} + O( \frac{1}{H} )$

for some ${T_x = O( \frac{X^2}{H^2} )}$. This quantity ${T_x}$ can vary with ${x}$; but from (10) and a short calculation we see that

$\displaystyle T_y = T_{y'} + O( \frac{X}{H} )$

whenever ${y, y' \in [X,2X]}$ obey (9) for some ${q,q' \sim P}$.

Now imagine a “graph” in which the vertices are elements ${y}$ of ${[X,2X]}$, and two elements ${y,y'}$ are joined by an edge if (9) holds for some ${q,q' \sim P}$. Because of exponential sum estimates on ${\sum_{q \sim P} q^{it}}$, this graph turns out to essentially be an “expander” in the sense that any two vertices ${y,y' \in [X,2X]}$ can be connected (in multiple ways) by fairly short paths in this graph (if one is allowed to modify one of ${y}$ or ${y'}$ by ${O(H)}$). As a consequence, we can assume that this quantity ${T_y}$ is essentially constant in ${y}$ (cf. the application of the ergodic theorem in this previous blog post), thus we now have

$\displaystyle \alpha_x = \frac{T}{x} + O(\frac{1}{H} )$

for most ${x \in [X,2X]}$ and some ${T = O(X^2/H^2)}$. By Taylor expansion, this implies that

$\displaystyle \lambda(n) \approx n^{iT}$

on ${[x,x+H]}$ for most ${x}$, thus

$\displaystyle \int_X^{2X} |\sum_{x \leq n \leq x+H} \lambda(n) n^{-iT}|\ dx \gg HX.$

But this can be shown to contradict the Matomäki-Radziwill theorem (because the multiplicative function ${n \mapsto \lambda(n) n^{-iT}}$ is known to be non-pretentious).

Joni Teräväinen and I have just uploaded to the arXiv our paper “The structure of correlations of multiplicative functions at almost all scales, with applications to the Chowla and Elliott conjectures“. This is a sequel to our previous paper that studied logarithmic correlations of the form

$\displaystyle f(a) := \lim^*_{x \rightarrow \infty} \frac{1}{\log \omega(x)} \sum_{x/\omega(x) \leq n \leq x} \frac{g_1(n+ah_1) \dots g_k(n+ah_k)}{n},$

where ${g_1,\dots,g_k}$ were bounded multiplicative functions, ${h_1,\dots,h_k}$ were fixed shifts, ${1 \leq \omega(x) \leq x}$ was a quantity going off to infinity, and ${\lim^*}$ was a generalised limit functional. Our main technical result asserted that these correlations were necessarily the uniform limit of periodic functions ${f_i}$. Furthermore, if ${g_1 \dots g_k}$ (weakly) pretended to be a Dirichlet character ${\chi}$, then the ${f_i}$ could be chosen to be ${\chi}$-isotypic in the sense that ${f_i(ab) = f_i(a) \chi(b)}$ whenever ${a,b}$ are integers with ${b}$ coprime to the periods of ${\chi}$ and ${f_i}$; otherwise, if ${g_1 \dots g_k}$ did not weakly pretend to be any Dirichlet character ${\chi}$, then ${f}$ vanished completely. This was then used to verify several cases of the logarithmically averaged Elliott and Chowla conjectures.

The purpose of this paper was to investigate the extent to which the methods could be extended to non-logarithmically averaged settings. For our main technical result, we now considered the unweighted averages

$\displaystyle f_d(a) := \lim^*_{x \rightarrow \infty} \frac{1}{x/d} \sum_{n \leq x/d} g_1(n+ah_1) \dots g_k(n+ah_k),$

where ${d>1}$ is an additional parameter. Our main result was now as follows. If ${g_1 \dots g_k}$ did not weakly pretend to be a twisted Dirichlet character ${n \mapsto \chi(n) n^{it}}$, then ${f_d(a)}$ converged to zero on (doubly logarithmic) average as ${d \rightarrow \infty}$. If instead ${g_1 \dots g_k}$ did pretend to be such a twisted Dirichlet character, then ${f_d(a) d^{it}}$ converged on (doubly logarithmic) average to a limit ${f(a)}$ of ${\chi}$-isotypic functions ${f_i}$. Thus, roughly speaking, one has the approximation

$\displaystyle \lim^*_{x \rightarrow \infty} \frac{1}{x/d} \sum_{n \leq x/d} g_1(n+ah_1) \dots g_k(n+ah_k) \approx f(a) d^{-it}$

for most ${d}$.

Informally, this says that at almost all scales ${x}$ (where “almost all” means “outside of a set of logarithmic density zero”), the non-logarithmic averages behave much like their logarithmic counterparts except for a possible additional twisting by an Archimedean character ${d \mapsto d^{it}}$ (which interacts with the Archimedean parameter ${d}$ in much the same way that the Dirichlet character ${\chi}$ interacts with the non-Archimedean parameter ${a}$). One consequence of this is that most of the recent results on the logarithmically averaged Chowla and Elliott conjectures can now be extended to their non-logarithmically averaged counterparts, so long as one excludes a set of exceptional scales ${x}$ of logarithmic density zero. For instance, the Chowla conjecture

$\displaystyle \lim_{x \rightarrow\infty} \frac{1}{x} \sum_{n \leq x} \lambda(n+h_1) \dots \lambda(n+h_k) = 0$

is now established for ${k}$ either odd or equal to ${2}$, so long as one excludes an exceptional set of scales.

In the logarithmically averaged setup, the main idea was to combine two very different pieces of information on ${f(a)}$. The first, coming from recent results in ergodic theory, was to show that ${f(a)}$ was well approximated in some sense by a nilsequence. The second was to use the “entropy decrement argument” to obtain an approximate isotypy property of the form

$\displaystyle f(a) g_1 \dots g_k(p)\approx f(ap)$

for “most” primes ${p}$ and integers ${a}$. Combining the two facts, one eventually finds that only the almost periodic components of the nilsequence are relevant.

In the current situation, each ${a \mapsto f_d(a)}$ is approximated by a nilsequence, but the nilsequence can vary with ${d}$ (although there is some useful “Lipschitz continuity” of this nilsequence with respect to the ${d}$ parameter). Meanwhile, the entropy decrement argument gives an approximation basically of the form

$\displaystyle f_{dp}(a) g_1 \dots g_k(p)\approx f_d(ap)$

for “most” ${d,p,a}$. The arguments then proceed largely as in the logarithmically averaged case. A key lemma to handle the dependence on the new parameter ${d}$ is the following cohomological statement: if one has a map ${\alpha: (0,+\infty) \rightarrow S^1}$ that is a quasimorphism in the sense that ${\alpha(xy) = \alpha(x) \alpha(y) + O(\varepsilon)}$ for all ${x,y \in (0,+\infty)}$ and some small ${\varepsilon}$, then there exists a real number ${t}$ such that ${\alpha(x) = x^{it} + O(\varepsilon)}$ for all ${x \in (0,+\infty)}$. This is achieved by applying a standard “cocycle averaging argument” to the cocycle ${(x,y) \mapsto \alpha(xy) \alpha(x)^{-1} \alpha(y)^{-1}}$.

It would of course be desirable to not have the set of exceptional scales. We only know of one (implausible) scenario in which we can do this, namely when one has far fewer (in particular, subexponentially many) sign patterns for (say) the Liouville function than predicted by the Chowla conjecture. In this scenario (roughly analogous to the “Siegel zero” scenario in multiplicative number theory), the entropy of the Liouville sign patterns is so small that the entropy decrement argument becomes powerful enough to control all scales rather than almost all scales. On the other hand, this scenario seems to be self-defeating, in that it allows one to establish a large number of cases of the Chowla conjecture, and the full Chowla conjecture is inconsistent with having unusually few sign patterns. Still it hints that future work in this direction may need to split into “low entropy” and “high entropy” cases, in analogy to how many arguments in multiplicative number theory have to split into the “Siegel zero” and “no Siegel zero” cases.

This is a sequel to this previous blog post, in which we discussed the effect of the heat flow evolution

$\displaystyle \partial_t P(t,z) = \partial_{zz} P(t,z)$

on the zeroes of a time-dependent family of polynomials ${z \mapsto P(t,z)}$, with a particular focus on the case when the polynomials ${z \mapsto P(t,z)}$ had real zeroes. Here (inspired by some discussions I had during a recent conference on the Riemann hypothesis in Bristol) we record the analogous theory in which the polynomials instead have zeroes on a circle ${\{ z: |z| = \sqrt{q} \}}$, with the heat flow slightly adjusted to compensate for this. As we shall discuss shortly, a key example of this situation arises when ${P}$ is the numerator of the zeta function of a curve.

More precisely, let ${g}$ be a natural number. We will say that a polynomial

$\displaystyle P(z) = \sum_{j=0}^{2g} a_j z^j$

of degree ${2g}$ (so that ${a_{2g} \neq 0}$) obeys the functional equation if the ${a_j}$ are all real and

$\displaystyle a_j = q^{g-j} a_{2g-j}$

for all ${j=0,\dots,2g}$, thus

$\displaystyle P(\overline{z}) = \overline{P(z)}$

and

$\displaystyle P(q/z) = q^g z^{-2g} P(z)$

for all non-zero ${z}$. This means that the ${2g}$ zeroes ${\alpha_1,\dots,\alpha_{2g}}$ of ${P(z)}$ (counting multiplicity) lie in ${{\bf C} \backslash \{0\}}$ and are symmetric with respect to complex conjugation ${z \mapsto \overline{z}}$ and inversion ${z \mapsto q/z}$ across the circle ${\{ |z| = \sqrt{q}\}}$. We say that this polynomial obeys the Riemann hypothesis if all of its zeroes actually lie on the circle ${\{ |z| = \sqrt{q}\}}$. For instance, in the ${g=1}$ case, the polynomial ${z^2 - a_1 z + q}$ obeys the Riemann hypothesis if and only if ${|a_1| \leq 2\sqrt{q}}$.
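These definitions are easy to test numerically. The sketch below (plain Python; the helper names, the value of ${q}$ and the sample phases are hypothetical) builds a polynomial from conjugate pairs of zeroes on the circle ${\{|z| = \sqrt{q}\}}$ and checks that its coefficients are real and satisfy ${a_j = q^{g-j} a_{2g-j}}$. It also illustrates that the functional equation alone does not force the Riemann hypothesis: the pair of real zeroes ${r, q/r}$ is symmetric under conjugation and inversion but lies off the circle.

```python
import cmath
import math

def poly_from_roots(roots, lead=1.0):
    """Ascending coefficients a_0, ..., a_n of lead * prod_j (z - roots[j])."""
    coeffs = [complex(lead)]
    for r in roots:
        new = [0j] * (len(coeffs) + 1)
        for j, c in enumerate(coeffs):
            new[j + 1] += c          # contribution of c * z
            new[j] -= c * r          # contribution of c * (-r)
        coeffs = new
    return coeffs

def functional_equation_defect(coeffs, q):
    """max_j |a_j - q^(g-j) a_(2g-j)| for a polynomial of degree 2g."""
    g = (len(coeffs) - 1) // 2
    return max(abs(coeffs[j] - q ** (g - j) * coeffs[2 * g - j])
               for j in range(2 * g + 1))

def circle_roots(thetas, q):
    """Zeroes sqrt(q) e^{+i theta}, sqrt(q) e^{-i theta}: closed under conjugation."""
    r = math.sqrt(q)
    return [r * cmath.exp(sign * 1j * th) for th in thetas for sign in (1, -1)]

def quadratic_obeys_rh(a1, q):
    """The g = 1 criterion |a_1| <= 2 sqrt(q) for z^2 - a_1 z + q."""
    return abs(a1) <= 2.0 * math.sqrt(q)

# Hypothetical example with g = 3, q = 5.
q = 5.0
coeffs = poly_from_roots(circle_roots([0.3, 1.1, 2.0], q))
print("defect:", functional_equation_defect(coeffs, q))

# Real zeroes 1 and q/1 = 5: functional equation holds, Riemann hypothesis fails.
off_circle = poly_from_roots([1.0, 5.0])
a1 = -off_circle[1].real             # polynomial is z^2 - a1 z + q with a1 = 6
print(functional_equation_defect(off_circle, q), quadratic_obeys_rh(a1, q))
```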

Such polynomials arise in number theory as follows: if ${C}$ is a projective curve of genus ${g}$ over a finite field ${\mathbf{F}_q}$, then, as famously proven by Weil, the associated local zeta function ${\zeta_{C,q}(z)}$ (as defined for instance in this previous blog post) is known to take the form

$\displaystyle \zeta_{C,q}(z) = \frac{P(z)}{(1-z)(1-qz)}$

where ${P}$ is a degree ${2g}$ polynomial obeying both the functional equation and the Riemann hypothesis. In the case that ${C}$ is an elliptic curve, then ${g=1}$ and ${P}$ takes the form ${P(z) = z^2 - a_1 z + q}$, where ${a_1}$ is the number of ${{\bf F}_q}$-points of ${C}$ minus ${q+1}$. The Riemann hypothesis in this case is a famous result of Hasse.

Another key example of such polynomials arises from rescaled characteristic polynomials

$\displaystyle P(z) := \det( z - \sqrt{q} F ) \ \ \ \ \ (1)$

of ${2g \times 2g}$ matrices ${F}$ in the compact symplectic group ${Sp(g)}$. These polynomials obey both the functional equation and the Riemann hypothesis. The Sato-Tate conjecture (in higher genus) asserts, roughly speaking, that “typical” polynomials ${P}$ arising from the number theoretic situation above are distributed like the rescaled characteristic polynomials (1), where ${F}$ is drawn uniformly from ${Sp(g)}$ with Haar measure.

Given a polynomial ${z \mapsto P(0,z)}$ of degree ${2g}$ with coefficients

$\displaystyle P(0,z) = \sum_{j=0}^{2g} a_j(0) z^j,$

we can evolve it in time by the formula

$\displaystyle P(t,z) = \sum_{j=0}^{2g} \exp( t(j-g)^2 ) a_j(0) z^j,$

thus ${a_j(t) = \exp(t(j-g)^2) a_j(0)}$ for ${t \in {\bf R}}$. Informally, as one increases ${t}$, this evolution accentuates the effect of the extreme monomials, particularly ${z^0}$ and ${z^{2g}}$, at the expense of the intermediate monomials such as ${z^g}$, and conversely as one decreases ${t}$. This family of polynomials obeys the heat-type equation

$\displaystyle \partial_t P(t,z) = (z \partial_z - g)^2 P(t,z). \ \ \ \ \ (2)$
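As a sanity check on (2), one can compare a finite-difference time derivative of ${P(t,z)}$ with the operator ${(z \partial_z - g)^2}$ applied coefficientwise, using the fact that each monomial ${z^j}$ is an eigenfunction of this operator with eigenvalue ${(j-g)^2}$. The following is a minimal sketch in plain Python; the sample coefficients, the evaluation point and the step size are hypothetical choices.

```python
import math

def evolve(coeffs0, t):
    """a_j(t) = exp(t (j-g)^2) a_j(0) for ascending coefficients of a degree-2g polynomial."""
    g = (len(coeffs0) - 1) // 2
    return [math.exp(t * (j - g) ** 2) * a for j, a in enumerate(coeffs0)]

def heat_operator(coeffs):
    """Coefficients of (z d/dz - g)^2 P: the j-th coefficient is multiplied by (j-g)^2."""
    g = (len(coeffs) - 1) // 2
    return [(j - g) ** 2 * a for j, a in enumerate(coeffs)]

def evaluate(coeffs, z):
    """Horner evaluation of sum_j coeffs[j] z^j."""
    acc = 0j
    for a in reversed(coeffs):
        acc = acc * z + a
    return acc

def functional_equation_defect(coeffs, q):
    """max_j |a_j - q^(g-j) a_(2g-j)|."""
    g = (len(coeffs) - 1) // 2
    return max(abs(coeffs[j] - q ** (g - j) * coeffs[2 * g - j])
               for j in range(2 * g + 1))

# Hypothetical example with g = 2, q = 3; the coefficients satisfy a_j = q^(g-j) a_(2g-j).
q = 3.0
coeffs0 = [9.0, -6.0, 5.0, -2.0, 1.0]
t, h, z = 0.3, 1e-5, 0.7 + 0.4j

# Left side of (2): central difference in t.  Right side: operator applied at time t.
lhs = (evaluate(evolve(coeffs0, t + h), z) - evaluate(evolve(coeffs0, t - h), z)) / (2 * h)
rhs = evaluate(heat_operator(evolve(coeffs0, t)), z)
print(abs(lhs - rhs))

# The weights exp(t (j-g)^2) are symmetric under j -> 2g - j, so the
# functional equation survives the evolution.
print(functional_equation_defect(evolve(coeffs0, t), q))
```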

In view of the results of Marcus, Spielman, and Srivastava, it is also very likely that one can interpret this flow in terms of expected characteristic polynomials involving conjugation over the compact symplectic group ${Sp(g)}$, and should also be tied to some sort of “${\beta=\infty}$” version of Brownian motion on this group, but we have not attempted to work this connection out in detail.

It is clear that if ${z \mapsto P(0,z)}$ obeys the functional equation, then so does ${z \mapsto P(t,z)}$ for any other time ${t}$. Now we investigate the evolution of the zeroes. Suppose at some time ${t_0}$ that the zeroes ${\alpha_1(t_0),\dots,\alpha_{2g}(t_0)}$ of ${z \mapsto P(t_0,z)}$ are distinct, then

$\displaystyle P(t_0,z) = a_{2g}(0) \exp( t_0g^2 ) \prod_{j=1}^{2g} (z - \alpha_j(t_0) ).$

From the inverse function theorem we see that for times ${t}$ sufficiently close to ${t_0}$, the zeroes ${\alpha_1(t),\dots,\alpha_{2g}(t)}$ of ${z \mapsto P(t,z)}$ continue to be distinct (and vary smoothly in ${t}$), with

$\displaystyle P(t,z) = a_{2g}(0) \exp( t g^2 ) \prod_{j=1}^{2g} (z - \alpha_j(t) ).$

Differentiating this at any ${z}$ not equal to any of the ${\alpha_j(t)}$, we obtain

$\displaystyle \partial_t P(t,z) = P(t,z) ( g^2 - \sum_{j=1}^{2g} \frac{\alpha'_j(t)}{z - \alpha_j(t)})$

and

$\displaystyle \partial_z P(t,z) = P(t,z) ( \sum_{j=1}^{2g} \frac{1}{z - \alpha_j(t)})$

and

$\displaystyle \partial_{zz} P(t,z) = P(t,z) ( \sum_{1 \leq j,k \leq 2g: j \neq k} \frac{1}{(z - \alpha_j(t))(z - \alpha_k(t))}).$

Inserting these formulae into (2) (expanding ${(z \partial_z - g)^2}$ as ${z^2 \partial_{zz} - (2g-1) z \partial_z + g^2}$) and canceling some terms, we conclude that

$\displaystyle - \sum_{j=1}^{2g} \frac{\alpha'_j(t)}{z - \alpha_j(t)} = z^2 \sum_{1 \leq j,k \leq 2g: j \neq k} \frac{1}{(z - \alpha_j(t))(z - \alpha_k(t))}$

$\displaystyle - (2g-1) z \sum_{j=1}^{2g} \frac{1}{z - \alpha_j(t)}$

for ${t}$ sufficiently close to ${t_0}$, and ${z}$ not equal to ${\alpha_1(t),\dots,\alpha_{2g}(t)}$. Extracting the residue at ${z = \alpha_j(t)}$, we conclude that

$\displaystyle - \alpha'_j(t) = 2 \alpha_j(t)^2 \sum_{1 \leq k \leq 2g: k \neq j} \frac{1}{\alpha_j(t) - \alpha_k(t)} - (2g-1) \alpha_j(t)$

which we can rearrange as

$\displaystyle \frac{\alpha'_j(t)}{\alpha_j(t)} = - \sum_{1 \leq k \leq 2g: k \neq j} \frac{\alpha_j(t)+\alpha_k(t)}{\alpha_j(t)-\alpha_k(t)}.$

If we make the change of variables ${\alpha_j(t) = \sqrt{q} e^{i\theta_j(t)}}$ (noting that one can make ${\theta_j}$ depend smoothly on ${t}$ for ${t}$ sufficiently close to ${t_0}$), this becomes

$\displaystyle \partial_t \theta_j(t) = \sum_{1 \leq k \leq 2g: k \neq j} \cot \frac{\theta_j(t) - \theta_k(t)}{2}. \ \ \ \ \ (3)$

Intuitively, this equation asserts that the phases ${\theta_j}$ repel each other if they are real (and attract each other if their difference is imaginary). If ${z \mapsto P(t_0,z)}$ obeys the Riemann hypothesis, then the ${\theta_j}$ are all real at time ${t_0}$, and the Picard uniqueness theorem (applied to ${\theta_j(t)}$ and its complex conjugate) then shows that the ${\theta_j}$ are also real for ${t}$ sufficiently close to ${t_0}$. If we then define the entropy functional

$\displaystyle H(\theta_1,\dots,\theta_{2g}) := \sum_{1 \leq j < k \leq 2g} \log \frac{1}{|\sin \frac{\theta_j-\theta_k}{2}| }$

then the above equation becomes a gradient flow

$\displaystyle \partial_t \theta_j(t) = - 2 \frac{\partial H}{\partial \theta_j}( \theta_1(t),\dots,\theta_{2g}(t) )$

which implies in particular that ${H(\theta_1(t),\dots,\theta_{2g}(t))}$ is non-increasing in time. This shows that as one evolves time forward from ${t_0}$, there is a uniform lower bound on the separation between the phases ${\theta_1(t),\dots,\theta_{2g}(t)}$, and hence the equation can be solved indefinitely; in particular, ${z \mapsto P(t,z)}$ obeys the Riemann hypothesis for all ${t > t_0}$ if it does so at time ${t_0}$. Our argument here assumed that the zeroes of ${z \mapsto P(t_0,z)}$ were simple, but this assumption can be removed by the usual limiting argument.
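The consistency between the phase dynamics (3) and the coefficient evolution can be checked numerically. The sketch below (plain Python, offered as an illustration rather than as part of the argument) integrates (3) with a classical fourth-order Runge-Kutta step, evolves the coefficients of ${P}$ directly via ${a_j(t) = \exp(t(j-g)^2) a_j(0)}$, and then verifies that the points ${\sqrt{q} e^{i\theta_j(t)}}$ are approximate zeroes of ${P(t,z)}$ and that the entropy ${H}$ has not increased. The initial phases, the value of ${q}$, the final time and the step count are hypothetical choices.

```python
import cmath
import math

def velocities(thetas):
    """Right-hand side of (3): sum over k != j of cot((theta_j - theta_k)/2)."""
    return [sum(1.0 / math.tan((tj - tk) / 2.0)
                for k, tk in enumerate(thetas) if k != j)
            for j, tj in enumerate(thetas)]

def rk4_flow(thetas, t_end, steps):
    """Integrate (3) from time 0 to t_end with the classical Runge-Kutta scheme."""
    h = t_end / steps
    th = list(thetas)
    for _ in range(steps):
        k1 = velocities(th)
        k2 = velocities([a + 0.5 * h * b for a, b in zip(th, k1)])
        k3 = velocities([a + 0.5 * h * b for a, b in zip(th, k2)])
        k4 = velocities([a + h * b for a, b in zip(th, k3)])
        th = [a + h * (b + 2 * c + 2 * d + e) / 6.0
              for a, b, c, d, e in zip(th, k1, k2, k3, k4)]
    return th

def entropy(thetas):
    """H = sum over j < k of log(1 / |sin((theta_j - theta_k)/2)|)."""
    n = len(thetas)
    return -sum(math.log(abs(math.sin((thetas[j] - thetas[k]) / 2.0)))
                for j in range(n) for k in range(j + 1, n))

def poly_from_roots(roots):
    """Ascending coefficients of the monic polynomial prod_j (z - roots[j])."""
    coeffs = [1 + 0j]
    for r in roots:
        new = [0j] * (len(coeffs) + 1)
        for j, c in enumerate(coeffs):
            new[j + 1] += c
            new[j] -= c * r
        coeffs = new
    return coeffs

def evolve(coeffs0, t):
    """a_j(t) = exp(t (j-g)^2) a_j(0)."""
    g = (len(coeffs0) - 1) // 2
    return [math.exp(t * (j - g) ** 2) * a for j, a in enumerate(coeffs0)]

def evaluate(coeffs, z):
    """Horner evaluation of sum_j coeffs[j] z^j."""
    acc = 0j
    for a in reversed(coeffs):
        acc = acc * z + a
    return acc

# Hypothetical example: g = 2, q = 2, phases symmetric under theta -> -theta.
q, t1 = 2.0, 0.1
theta0 = [0.5, -0.5, 2.0, -2.0]
root = math.sqrt(q)
coeffs0 = poly_from_roots([root * cmath.exp(1j * th) for th in theta0])

theta1 = rk4_flow(theta0, t1, 2000)
coeffs1 = evolve(coeffs0, t1)
residuals = [abs(evaluate(coeffs1, root * cmath.exp(1j * th))) for th in theta1]
print("max residual:", max(residuals))
print("entropy before/after:", entropy(theta0), entropy(theta1))
```

In the ${g=1}$ case with phases ${\pm\theta}$, equation (3) reduces to ${\partial_t \theta = \cot \theta}$, which matches the relation ${\cos \theta(t) = a_1 e^{-t} / (2\sqrt{q})}$ obtained by reading off the sum of the zeroes of ${e^t z^2 - a_1 z + e^t q}$.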

For any polynomial ${z \mapsto P(0,z)}$ obeying the functional equation, the rescaled polynomials ${z \mapsto e^{-g^2 t} P(t,z)}$ converge locally uniformly to ${a_{2g}(0) (z^{2g} + q^g)}$ as ${t \rightarrow +\infty}$. By Rouché’s theorem, we conclude that the zeroes of ${z \mapsto P(t,z)}$ converge to the equally spaced points ${\{ \sqrt{q} e^{2\pi i(j+1/2)/2g}: j=1,\dots,2g\}}$ on the circle ${\{ |z| = \sqrt{q}\}}$. Together with the symmetry properties of the zeroes, this implies in particular that ${z \mapsto P(t,z)}$ obeys the Riemann hypothesis for all sufficiently large positive ${t}$. In the opposite direction, when ${t \rightarrow -\infty}$, the polynomials ${z \mapsto P(t,z)}$ converge locally uniformly to ${a_g(0) z^g}$, so if ${a_g(0) \neq 0}$, ${g}$ of the zeroes converge to the origin and the other ${g}$ converge to infinity. In particular, ${z \mapsto P(t,z)}$ fails the Riemann hypothesis for sufficiently large negative ${t}$. Thus (if ${a_g(0) \neq 0}$), there must exist a real number ${\Lambda}$, which we call the de Bruijn-Newman constant of the original polynomial ${z \mapsto P(0,z)}$, such that ${z \mapsto P(t,z)}$ obeys the Riemann hypothesis for ${t \geq \Lambda}$ and fails the Riemann hypothesis for ${t < \Lambda}$. The situation is a bit more complicated if ${a_g(0)}$ vanishes; if ${k}$ is the first natural number such that ${a_{g+k}(0)}$ (or equivalently, ${a_{g-k}(0)}$) does not vanish, then by the above arguments one finds in the limit ${t \rightarrow -\infty}$ that ${g-k}$ of the zeroes go to the origin, ${g-k}$ go to infinity, and the remaining ${2k}$ zeroes converge to the equally spaced points ${\{ \sqrt{q} e^{2\pi i(j+1/2)/2k}: j=1,\dots,2k\}}$. In this case the de Bruijn-Newman constant remains finite except in the degenerate case ${k=g}$, in which case ${\Lambda = -\infty}$.

For instance, consider the case when ${g=1}$ and ${P(0,z) = z^2 - a_1 z + q}$ for some real ${a_1}$ with ${|a_1| \leq 2\sqrt{q}}$. Then the quadratic polynomial

$\displaystyle P(t,z) = e^t z^2 - a_1 z + e^t q$

has zeroes

$\displaystyle \frac{a_1 \pm \sqrt{a_1^2 - 4 e^{2t} q}}{2e^t}$

and one easily checks that these zeroes lie on the circle ${\{ |z|=\sqrt{q}\}}$ when ${t \geq \log \frac{|a_1|}{2\sqrt{q}}}$, and are on the real axis otherwise. Thus in this case we have ${\Lambda = \log \frac{|a_1|}{2\sqrt{q}}}$ (with ${\Lambda=-\infty}$ if ${a_1=0}$). Note how as ${t}$ increases to ${+\infty}$, the zeroes repel each other and eventually converge to ${\pm i \sqrt{q}}$, while as ${t}$ decreases to ${-\infty}$, the zeroes collide and then separate on the real axis, with one zero going to the origin and the other to infinity.
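The ${g=1}$ case is simple enough to verify directly from the quadratic formula. The following sketch (plain Python; the function names and the sample values ${a_1 = 3}$, ${q = 4}$ are hypothetical) computes the two zeroes of ${P(t,z)}$ and checks the three regimes described above: on the circle for ${t \geq \Lambda}$, on the real axis for ${t < \Lambda}$, and close to ${\pm i\sqrt{q}}$ for large ${t}$.

```python
import cmath
import math

def zeros_g1(a1: float, q: float, t: float):
    """Zeroes of P(t,z) = e^t z^2 - a1 z + e^t q, from the quadratic formula."""
    et = math.exp(t)
    disc = cmath.sqrt(a1 * a1 - 4.0 * et * et * q)
    return (a1 + disc) / (2.0 * et), (a1 - disc) / (2.0 * et)

def de_bruijn_newman_g1(a1: float, q: float) -> float:
    """Lambda = log(|a1| / (2 sqrt(q))), with Lambda = -infinity when a1 = 0."""
    if a1 == 0:
        return -math.inf
    return math.log(abs(a1) / (2.0 * math.sqrt(q)))

def on_circle(z: complex, q: float, tol: float = 1e-9) -> bool:
    """True if |z| equals sqrt(q) up to the tolerance."""
    return abs(abs(z) - math.sqrt(q)) < tol

# Hypothetical sample values with |a1| <= 2 sqrt(q), so that Lambda <= 0.
a1, q = 3.0, 4.0
lam = de_bruijn_newman_g1(a1, q)          # log(3/4)

above = zeros_g1(a1, q, lam + 0.5)        # expected: both zeroes on |z| = sqrt(q)
below = zeros_g1(a1, q, lam - 0.5)        # expected: two distinct real zeroes
late = zeros_g1(a1, q, 20.0)              # expected: close to +2i and -2i

print(lam, above, below, late)
```

Since the product of the two zeroes is always ${q}$, the real zeroes for ${t < \Lambda}$ are exchanged by the inversion ${z \mapsto q/z}$, consistent with the symmetry coming from the functional equation.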

The arguments in my paper with Brad Rodgers (discussed in this previous post) indicate that for a “typical” polynomial ${P}$ of degree ${g}$ that obeys the Riemann hypothesis, the expected time to relaxation to equilibrium (in which the zeroes are equally spaced) should be comparable to ${1/g}$, basically because the average spacing is ${1/g}$ and hence by (3) the typical velocity of the zeroes should be comparable to ${g}$, and the diameter of the unit circle is comparable to ${1}$, thus requiring time comparable to ${1/g}$ to reach equilibrium. Taking contrapositives, this suggests that the de Bruijn-Newman constant ${\Lambda}$ should typically take on values comparable to ${-1/g}$ (since typically one would not expect the initial configuration of zeroes to be close to evenly spaced). I have not attempted to formalise or prove this claim, but presumably one could do some numerics (perhaps using some of the examples of ${P}$ given previously) to explore this further.

This is the eighth “research” thread of the Polymath15 project to upper bound the de Bruijn-Newman constant ${\Lambda}$, continuing this post. Discussion of the project of a non-research nature can continue for now in the existing proposal thread. Progress will be summarised at this Polymath wiki page.

Significant progress has been made since the last update; by implementing the “barrier” method to establish zero free regions for $H_t$ by leveraging the extensive existing numerical verification of the Riemann hypothesis (which establishes zero free regions for $H_0$), we have been able to improve our upper bound on $\Lambda$ from 0.48 to 0.28. Furthermore, there appears to be some room to improve the bounds further by tweaking the parameters $t_0, y_0, X$ used in the argument (we are currently using $t_0=0.2, y_0 = 0.4, X = 5 \times 10^9$); the most recent idea is to try to use exponential sum estimates to improve the bounds on the derivative of the approximation to $H_t$ that is used in the barrier method, which currently is the most computationally intensive step of the argument.
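For orientation, these numbers are consistent with the criterion of de Bruijn that a zero free region ${y \geq y_0}$ for $H_{t_0}$ gives $\Lambda \leq t_0 + \frac{1}{2} y_0^2$. The criterion is not restated in this post, so the display below should be read as a consistency check rather than as part of the argument:

$\displaystyle \Lambda \leq t_0 + \frac{y_0^2}{2} = 0.2 + \frac{(0.4)^2}{2} = 0.28, \qquad \hbox{compared with} \qquad 0.4 + \frac{(0.4)^2}{2} = 0.48 \hbox{ for } t_0 = y_0 = 0.4.$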

This is the seventh “research” thread of the Polymath15 project to upper bound the de Bruijn-Newman constant ${\Lambda}$, continuing this post. Discussion of the project of a non-research nature can continue for now in the existing proposal thread. Progress will be summarised at this Polymath wiki page.

The most recent news is that we appear to have completed the verification that ${H_t(x+iy)}$ is free of zeroes when ${t=0.4}$ and ${y \geq 0.4}$, which implies that ${\Lambda \leq 0.48}$. For very large ${x}$ (for instance when the quantity ${N := \lfloor \sqrt{\frac{x}{4\pi} + \frac{t}{16}} \rfloor}$ is at least ${300}$) this can be done analytically; for medium values of ${x}$ (say when ${N}$ is between ${11}$ and ${300}$) this can be done by numerically evaluating a fast approximation ${A^{eff} + B^{eff}}$ to ${H_t}$ and using the argument principle in a rectangle; and most recently it appears that we can also handle small values of ${x}$, in part due to some new, and significantly faster, numerical ways to evaluate ${H_t}$ in this range.
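To make the three regimes concrete, here is a small sketch that computes ${N}$ from ${x}$ and ${t}$ and reports which method the paragraph above assigns to it. The names `N_of` and `regime` are hypothetical, and the treatment of the boundary case ${N = 300}$ is an assumption, since the text leaves it open.

```python
import math

def N_of(x, t):
    """N := floor(sqrt(x / (4 pi) + t / 16))."""
    return math.floor(math.sqrt(x / (4.0 * math.pi) + t / 16.0))

def regime(x, t=0.4):
    """Which verification method applies at this x (boundary handling is an assumption)."""
    n = N_of(x, t)
    if n >= 300:
        return "analytic bounds"
    if n >= 11:
        return "argument principle applied to A^eff + B^eff"
    return "direct numerical evaluation of H_t"

if __name__ == "__main__":
    for x in (1e3, 1e6, 1e7):
        print(x, N_of(x, 0.4), regime(x))
```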

One obvious thing to do now is to experiment with lowering the parameters ${t}$ and ${y}$ and see what happens. However there are two other potential ways to bound ${\Lambda}$ which may also be numerically feasible. One approach is based on trying to exclude zeroes of ${H_t(x+iy)}$ in a region of the form ${0 \leq t \leq t_0}$, ${X \leq x \leq X+1}$ and ${y \geq y_0}$ for some moderately large ${X}$ (this acts as a “barrier” to prevent zeroes from flowing into the region ${\{ 0 \leq x \leq X, y \geq y_0 \}}$ at time ${t_0}$, assuming that they were not already there at time ${0}$). This requires significantly less numerical verification in the ${x}$ aspect, but more numerical verification in the ${t}$ aspect, so it is not yet clear whether this is a net win.

Another, rather different approach, is to study the evolution of statistics such as ${S(t) = \sum_{H_t(x+iy)=0: x,y>0} y e^{-x/X}}$ over time. One has fairly good control on such quantities at time zero, and their time derivative looks somewhat manageable, so one may be able to still have good control on this quantity at later times ${t_0>0}$. However for this approach to work, one needs an effective version of the Riemann-von Mangoldt formula for ${H_t}$, which at present is only available asymptotically (or at time ${t=0}$). This approach may be able to avoid almost all numerical computation, except for numerical verification of the Riemann hypothesis, for which we can appeal to existing literature.

Participants are also welcome to add any further summaries of the situation in the comments below.

This is the sixth “research” thread of the Polymath15 project to upper bound the de Bruijn-Newman constant ${\Lambda}$, continuing this post. Discussion of the project of a non-research nature can continue for now in the existing proposal thread. Progress will be summarised at this Polymath wiki page.

The last two threads have been focused primarily on the test problem of showing that ${H_t(x+iy) \neq 0}$ whenever ${t = y = 0.4}$. We have been able to prove this for most regimes of ${x}$, or equivalently for most regimes of the natural number parameter ${N := \lfloor \sqrt{\frac{x}{4\pi} + \frac{t}{16}} \rfloor}$. In many of these regimes, a certain explicit approximation ${A^{eff}+B^{eff}}$ to ${H_t}$ was used, together with a non-zero normalising factor ${B^{eff}_0}$; see the wiki for definitions. The explicit upper bound

$\displaystyle |H_t - A^{eff} - B^{eff}| \leq E_1 + E_2 + E_3$

has been proven for certain explicit expressions ${E_1, E_2, E_3}$ (see here) depending on ${x}$. In particular, if ${x}$ satisfies the inequality

$\displaystyle |\frac{A^{eff}+B^{eff}}{B^{eff}_0}| > \frac{E_1}{|B^{eff}_0|} + \frac{E_2}{|B^{eff}_0|} + \frac{E_3}{|B^{eff}_0|}$

then ${H_t(x+iy)}$ is non-vanishing thanks to the triangle inequality. (In principle we have an even more accurate approximation ${A^{eff}+B^{eff}-C^{eff}}$ available, but it is looking like we will not need it for this test problem at least.)

We have explicit upper bounds on ${\frac{E_1}{|B^{eff}_0|}}$, ${\frac{E_2}{|B^{eff}_0|}}$, ${\frac{E_3}{|B^{eff}_0|}}$; see this wiki page for details. They are tabulated in the range ${3 \leq N \leq 2000}$ here. For ${N \geq 2000}$, the upper bound ${\frac{E_3^*}{|B^{eff}_0|}}$ for ${\frac{E_3}{|B^{eff}_0|}}$ is monotone decreasing, and is in particular bounded by ${1.53 \times 10^{-5}}$, while ${\frac{E_2}{|B^{eff}_0|}}$ and ${\frac{E_1}{|B^{eff}_0|}}$ are known to be bounded by ${2.9 \times 10^{-7}}$ and ${2.8 \times 10^{-8}}$ respectively (see here).

Meanwhile, the quantity ${|\frac{A^{eff}+B^{eff}}{B^{eff}_0}|}$ can be lower bounded by

$\displaystyle |\sum_{n=1}^N \frac{b_n}{n^s}| - |\sum_{n=1}^N \frac{a_n}{n^s}|$

for certain explicit coefficients ${a_n,b_n}$ and an explicit complex number ${s = \sigma + i\tau}$. Using the triangle inequality to lower bound this by

$\displaystyle |b_1| - \sum_{n=2}^N \frac{|b_n|}{n^\sigma} - \sum_{n=1}^N \frac{|a_n|}{n^\sigma}$

we can obtain a lower bound of ${0.18}$ for ${N \geq 2000}$, which settles the test problem in this regime. One can get more efficient lower bounds by multiplying both Dirichlet series by a suitable Euler product mollifier; we have found ${\prod_{p \leq P} (1 - \frac{b_p}{p^s})}$ for ${P=2,3,5,7}$ to be good choices to get a variety of further lower bounds depending only on ${N}$, see this table and this wiki page. Comparing this against our tabulated upper bounds for the error terms we can handle the range ${300 \leq N \leq 2000}$.
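The crude triangle inequality bound can be illustrated numerically. The sketch below is not the project's computation: it uses the approximate coefficient forms quoted in the fifth thread (${b_n = \exp(\frac{t}{4} \log^2 n)}$ and ${a_n \approx (n/N)^y b_n}$), and it assumes the exponent ${\sigma = \frac{1+y}{2} + \frac{t}{2} \log N}$, a toy choice made here for illustration. The exact coefficients and ${\sigma}$ are defined on the wiki, so the value this returns need not match the figure quoted above.

```python
import math

def b(n, t):
    """b_n = exp((t/4) log^2 n)."""
    return math.exp(0.25 * t * math.log(n) ** 2)

def a(n, N, t, y):
    """Toy model a_n = (n/N)^y b_n (only approximately the true coefficient)."""
    return (n / N) ** y * b(n, t)

def triangle_lower_bound(N, t=0.4, y=0.4, sigma=None):
    """|b_1| - sum_{n=2}^N |b_n| / n^sigma - sum_{n=1}^N |a_n| / n^sigma."""
    if sigma is None:
        # ASSUMPTION: toy choice of sigma, not taken from the post.
        sigma = (1.0 + y) / 2.0 + (t / 2.0) * math.log(N)
    tail_b = sum(b(n, t) / n ** sigma for n in range(2, N + 1))
    total_a = sum(a(n, N, t, y) / n ** sigma for n in range(1, N + 1))
    return b(1, t) - tail_b - total_a

if __name__ == "__main__":
    for N in (300, 1000, 2000):
        print(N, triangle_lower_bound(N))
```

Because every term is replaced by its absolute value, this bound discards all cancellation between the terms, which is why the mollified versions discussed above do better.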

In the range ${11 \leq N \leq 300}$, we have been able to obtain a suitable lower bound ${|\frac{A^{eff}+B^{eff}}{B^{eff}_0}| \geq c}$ (where ${c}$ exceeds the upper bound for ${\frac{E_1}{|B^{eff}_0|} + \frac{E_2}{|B^{eff}_0|} + \frac{E_3}{|B^{eff}_0|}}$) by numerically evaluating ${|\frac{A^{eff}+B^{eff}}{B^{eff}_0}|}$ at a mesh of points for each choice of ${N}$, with the mesh spacing being adaptive and determined by ${c}$ and an upper bound for the derivative of ${|\frac{A^{eff}+B^{eff}}{B^{eff}_0}|}$; the data is available here.
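The adaptive mesh idea can be sketched as follows. This is a generic illustration, not the project's code: the names `verify_lower_bound` and `min_step` are hypothetical, and `f` stands in for the normalised quantity being bounded. If ${|f(x)| - c = m > 0}$ and the derivative of ${|f|}$ is at most ${D}$, then ${|f| > c}$ on ${[x, x + m/D)}$, so the next mesh point can be placed at ${x + m/D}$.

```python
import math

def verify_lower_bound(f, x_start, x_end, c, deriv_bound, min_step=1e-9):
    """Try to certify |f| > c on [x_start, x_end].

    deriv_bound must be a positive upper bound for the derivative of |f|.
    Returns True on success, and False if a mesh point violates the bound
    or the mesh becomes too fine to make progress (inconclusive).
    """
    x = x_start
    while True:
        margin = abs(f(x)) - c
        if margin <= 0:
            return False          # bound fails at this mesh point
        if x >= x_end:
            return True           # whole interval is covered
        step = margin / deriv_bound
        if step < min_step:
            return False          # cannot certify with this c and deriv_bound
        x = min(x + step, x_end)  # |f| > c is guaranteed up to the new point

if __name__ == "__main__":
    # Toy stand-ins: 2 + sin(x) stays above 0.5, cos(x) does not on [0, 2].
    print(verify_lower_bound(lambda x: 2.0 + math.sin(x), 0.0, 10.0, 0.5, 1.0))
    print(verify_lower_bound(math.cos, 0.0, 2.0, 0.5, 1.0))
```

The mesh is automatically coarse where ${|f|}$ is comfortably above ${c}$ and fine where it comes close, which is the sense in which the spacing is determined by ${c}$ and the derivative bound.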

This leaves the final range ${N \leq 10}$ (roughly corresponding to ${x \leq 1600}$). Here we can numerically evaluate ${H_t(x+iy)}$ to high accuracy at a fine mesh (see the data here), but to fill in the mesh we need good upper bounds on ${H'_t(x+iy)}$. It seems that we can get reasonable estimates using some contour shifting from the original definition of ${H_t}$ (see here). We are close to finishing off this remaining region and thus solving the toy problem.

Beyond this, we need to figure out how to show that ${H_t(x+iy) \neq 0}$ for ${y > 0.4}$ as well. General theory lets one do this for ${y \geq \sqrt{1-2t} = 0.447\dots}$, leaving the region ${0.4 < y < 0.448}$. The analytic theory that handles ${N \geq 2000}$ and ${300 \leq N \leq 2000}$ should also handle this region; for ${N \leq 300}$ presumably the argument principle will become relevant.

The full argument also needs to be streamlined and organised; right now it sprawls over many wiki pages and github code files. (A very preliminary writeup attempt has begun here). We should also see if there is much hope of extending the methods to push much beyond the bound of ${\Lambda \leq 0.48}$ that we would get from the above calculations. This would also be a good time to start discussing whether to move to the writing phase of the project, or whether there are still fruitful research directions for the project to explore.

Participants are also welcome to add any further summaries of the situation in the comments below.

This is the fifth “research” thread of the Polymath15 project to upper bound the de Bruijn-Newman constant ${\Lambda}$, continuing this post. Discussion of the project of a non-research nature can continue for now in the existing proposal thread. Progress will be summarised at this Polymath wiki page.

We have almost finished off the test problem of showing that ${H_t(x+iy) \neq 0}$ whenever ${t = y = 0.4}$. We have two useful approximations for ${H_t}$, which we have denoted ${A^{eff}+B^{eff}}$ and ${A^{eff}+B^{eff}-C^{eff}}$, and a normalising quantity ${B^{eff}_0}$ that is asymptotically equal to the above expressions; see the wiki page for definitions. In practice, the ${A^{eff}+B^{eff}}$ approximation seems to be accurate within about one or two significant figures, whilst the ${A^{eff}+B^{eff}-C^{eff}}$ approximation is accurate to about three or four. We have an effective upper bound

$\displaystyle |H_t - A^{eff} - B^{eff}| \leq E_1 + E_2 + E_3^*$

where the expressions ${E_1,E_2,E_3^*}$ are quite small in practice (${E_3^*}$ is typically about two orders of magnitude smaller than the main term ${B^{eff}_0}$ once ${x}$ is moderately large, and the error terms ${E_1,E_2}$ are even smaller). See this page for details. In principle we could also obtain an effective upper bound for ${|H_t - (A^{eff} + B^{eff} - C^{eff})|}$ (the ${E_3^*}$ term would be replaced by something smaller).

The ratio ${\frac{A^{eff}+B^{eff}}{B^{eff}_0}}$ takes the form of a difference ${\sum_{n=1}^N \frac{b_n}{n^s} - e^{i\theta} \sum_{n=1}^N \frac{a_n}{n^s}}$ of two Dirichlet series, where ${e^{i\theta}}$ is a phase whose value is explicit but perhaps not terribly important, and the coefficients ${b_n, a_n}$ are explicit and relatively simple (${b_n}$ is ${\exp( \frac{t}{4} \log^2 n)}$, and ${a_n}$ is approximately ${(n/N)^y b_n}$). To bound this away from zero, we have found it advantageous to mollify this difference by multiplying by an Euler product ${\prod_{p \leq P} (1 - \frac{b_p}{p^s})}$ to cancel much of the initial oscillation; also one can take advantage of the fact that the ${b_n}$ are real and the ${a_n}$ are (approximately) real. See this page for details. The upshot is that we seem to be getting good lower bounds for the size of this difference of Dirichlet series starting from about ${x \geq 5 \times 10^5}$ or so. The error terms ${E_1,E_2,E_3^*}$ are already quite small by this stage, so we should soon be able to rigorously keep ${H_t}$ from vanishing at this point. We also have a scheme for lower bounding the difference of Dirichlet series below this range, though it is not clear at present how far we can continue this before the error terms ${E_1,E_2,E_3^*}$ become unmanageable. For very small ${x}$ we may have to explore some faster ways to compute the expression ${H_t}$, which is still difficult to compute directly with high accuracy. One will also need to bound the somewhat unwieldy expressions ${E_1,E_2}$ by something more manageable. For instance, right now these quantities depend on the continuous variable ${x}$; it would be preferable to have a quantity that depends only on the parameter ${N = \lfloor \sqrt{ \frac{x}{4\pi} + \frac{t}{16} }\rfloor}$, as this could be computed numerically for all ${x}$ in the remaining range of interest quite quickly.
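The effect of the Euler product mollifier can be seen in a simplified setting. The sketch below treats only the ${b_n}$ series, not the full difference of two Dirichlet series, and the function names are illustrative. Multiplying ${\sum_n c_n n^{-s}}$ by ${1 - b_p p^{-s}}$ gives new coefficients ${c_m - b_p c_{m/p}}$. Since the mollifier has size at most ${\prod_p (1 + b_p p^{-\sigma})}$, a triangle inequality bound for the mollified series, divided by this product, is a valid lower bound for the original series.

```python
import math

def b(n, t):
    """b_n = exp((t/4) log^2 n)."""
    return math.exp(0.25 * t * math.log(n) ** 2)

def mollify(coeffs, p, t):
    """Coefficients of (1 - b_p p^{-s}) * sum_n coeffs[n] n^{-s}."""
    out = dict(coeffs)
    bp = b(p, t)
    for n, c in coeffs.items():
        out[n * p] = out.get(n * p, 0.0) - bp * c
    return out

def mollified_lower_bound(N, t, sigma, primes=(2,)):
    """Lower bound for |sum_{n<=N} b_n n^{-s}| when Re(s) = sigma."""
    coeffs = {n: b(n, t) for n in range(1, N + 1)}
    mollifier_sup = 1.0
    for p in primes:
        coeffs = mollify(coeffs, p, t)
        mollifier_sup *= 1.0 + b(p, t) / p ** sigma
    head = abs(coeffs[1])
    tail = sum(abs(c) / n ** sigma for n, c in coeffs.items() if n > 1)
    return (head - tail) / mollifier_sup

if __name__ == "__main__":
    # Toy parameters (t = 0, sigma = 1.5), chosen only to show the effect.
    print(mollified_lower_bound(100, 0.0, 1.5, primes=()))
    print(mollified_lower_bound(100, 0.0, 1.5, primes=(2,)))
```

With ${t = 0}$ the mollifier at ${p=2}$ cancels every even term up to ${N}$ exactly, turning a vacuous (negative) bound into a positive one. For ${t > 0}$ the ${b_n}$ are not completely multiplicative and the cancellation is only partial.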

As before, any other mathematical discussion related to the project is also welcome here, for instance any summaries of previous discussion that was not covered in this post.

This is the fourth “research” thread of the Polymath15 project to upper bound the de Bruijn-Newman constant ${\Lambda}$, continuing https://terrytao.wordpress.com/2018/01/24/polymath-proposal-upper-bounding-the-de-bruijn-newman-constant/. Progress will be summarised at this Polymath wiki page.

We are getting closer to finishing off the following test problem: can one show that ${H_t(x+iy) \neq 0}$ whenever ${t = y = 0.4}$, ${x \geq 0}$? This would morally show that ${\Lambda \leq 0.48}$. A wiki page for this problem has now been created here. We have obtained a number of approximations ${A+B, A'+B', A^{eff}+B^{eff}, A^{toy}+B^{toy}}$ to ${H_t}$ (see wiki page), though numeric evidence indicates that the approximations are all very close to each other. (Many of these approximations come with a correction term ${C}$, but thus far it seems that we may be able to avoid having to use this refinement to the approximations.) The effective approximation ${A^{eff} + B^{eff}}$ also comes with an effective error bound

$\displaystyle |H_t - A^{eff} - B^{eff}| \leq E_1 + E_2 + E_3$

for some explicit (but somewhat messy) error terms ${E_1,E_2,E_3}$: see this wiki page for details. The original approximations ${A+B, A'+B'}$ can be considered deprecated at this point in favour of the (slightly more complicated) approximation ${A^{eff}+B^{eff}}$; the approximation ${A^{toy}+B^{toy}}$ is a simplified version of ${A^{eff}+B^{eff}}$ which is not quite as accurate but might be useful for testing purposes.

It is convenient to normalise everything by an explicit non-zero factor ${B^{eff}_0}$. Asymptotically, ${(A^{eff} + B^{eff}) / B^{eff}_0}$ converges to 1; numerically, it appears that its magnitude (and also its real part) stays roughly between 0.4 and 3 in the range ${10^5 \leq x \leq 10^6}$, and we seem to be able to keep it (or at least the toy counterpart ${(A^{toy} + B^{toy}) / B^{toy}_0}$) away from zero starting from about ${x \geq 4 \times 10^6}$ (here it seems that there is a useful trick of multiplying by Euler-type factors like ${1 - \frac{1}{2^{1-s}}}$ to cancel off some of the oscillation). Also, the bounds on the error ${(H_t - A^{eff} - B^{eff}) / B^{eff}_0}$ seem to be of size about 0.1 or better in these ranges. So we seem to be on track to be able to rigorously eliminate zeroes starting from about ${x \geq 10^5}$ or so. We have not discussed too much what to do with the small values of ${x}$; at some point our effective error bounds will become unusable, and we may have to find some faster ways to compute ${H_t}$.

In addition to this main direction of inquiry, there have been additional discussions on the dynamics of zeroes, and some numerical investigations of the behaviour of Lehmer pairs under heat flow. Contributors are welcome to summarise any findings from these discussions from previous threads (or on any other related topic, e.g. improvements in the code) in the comments below.

Kevin Ford, Sergei Konyagin, James Maynard, Carl Pomerance, and I have uploaded to the arXiv our paper “Long gaps in sieved sets”, submitted to J. Europ. Math. Soc.

This paper originated from the MSRI program in analytic number theory last year, and was centred around variants of the question of finding large gaps between primes. As discussed for instance in this previous post, it is now known that within the set of primes ${{\mathcal P} = \{2,3,5,\dots\}}$, one can find infinitely many adjacent elements ${a,b}$ whose gap ${b-a}$ obeys a lower bound of the form

$\displaystyle b-a \gg \log a \frac{\log_2 a \log_4 a}{\log_3 a}$

where ${\log_k}$ denotes the ${k}$-fold iterated logarithm. This compares with the trivial bound of ${b-a \gg \log a}$ that one can obtain from the prime number theorem and the pigeonhole principle. Several years ago, Pomerance posed the question of whether analogous improvements to the trivial bound can be obtained for such sets as

$\displaystyle {\mathcal P}_2 = \{ n \in {\bf N}: n^2+1 \hbox{ prime} \}.$

Here there is the obvious initial issue that this set is not even known to be infinite (this is the fourth Landau problem), but let us assume for the sake of discussion that this set is indeed infinite, so that we have an infinite number of gaps to speak of. Standard sieve theory techniques give upper bounds for the density of ${{\mathcal P}_2}$ that are comparable (up to an absolute constant) to the prime number theorem bounds for ${{\mathcal P}}$, so again we can obtain a trivial bound of ${b-a \gg \log a}$ for the gaps of ${{\mathcal P}_2}$. In this paper we improve this to

$\displaystyle b-a \gg \log a \log^c_2 a$

for an absolute constant ${c>0}$; this is not as strong as the corresponding bound for ${{\mathcal P}}$, but still improves over the trivial bound. In fact we can handle more general “sifted sets” than just ${{\mathcal P}_2}$. Recall from the sieve of Eratosthenes that the elements of ${{\mathcal P}}$ in, say, the interval ${[x/2, x]}$ can be obtained by removing from ${[x/2, x]}$ one residue class modulo ${p}$ for each prime up to ${\sqrt{x}}$, namely the class ${0}$ mod ${p}$. In a similar vein, the elements of ${{\mathcal P}_2}$ in ${[x/2,x]}$ can be obtained by removing for each prime ${p}$ up to ${x}$ zero, one, or two residue classes modulo ${p}$, depending on whether ${-1}$ is a quadratic residue modulo ${p}$. On the average, one residue class will be removed (this is a very basic case of the Chebotarev density theorem), so this sieving system is “one-dimensional on the average”. Roughly speaking, our arguments apply to any other set of numbers arising from a sieving system that is one-dimensional on average. (One can consider other dimensions also, but unfortunately our methods seem to give results that are worse than a trivial bound when the dimension is less than or greater than one.)
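The sieving description of ${{\mathcal P}_2}$ is easy to test on a small interval. The sketch below uses illustrative function names and naive algorithms suitable only for small ${x}$. It lists, for each prime ${p}$, the residue classes ${r}$ with ${r^2 + 1 \equiv 0}$ mod ${p}$, removes them from ${[x/2, x]}$, and compares the survivors with a direct primality test on ${n^2+1}$. It assumes ${x \geq 3}$, so that ${n^2 + 1}$ exceeds every sieving prime.

```python
def primes_up_to(n):
    """All primes <= n by the sieve of Eratosthenes."""
    is_prime = [True] * (n + 1)
    is_prime[0:2] = [False, False]
    for i in range(2, int(n ** 0.5) + 1):
        if is_prime[i]:
            for m in range(i * i, n + 1, i):
                is_prime[m] = False
    return [i for i in range(2, n + 1) if is_prime[i]]

def removed_classes(p):
    """Residues r mod p with r^2 + 1 = 0 mod p: zero, one, or two classes."""
    return [r for r in range(p) if (r * r + 1) % p == 0]

def sift_P2(x):
    """Elements n of [x/2, x] surviving the sieve by all primes p <= x (x >= 3)."""
    survivors = set(range(-(-x // 2), x + 1))  # ceil(x/2), ..., x
    for p in primes_up_to(x):
        bad = set(removed_classes(p))
        survivors = {n for n in survivors if n % p not in bad}
    return sorted(survivors)

def is_prime(m):
    """Trial division, used only as an independent check."""
    if m < 2:
        return False
    return all(m % d for d in range(2, int(m ** 0.5) + 1))

if __name__ == "__main__":
    x = 100
    print(sift_P2(x))
    ps = primes_up_to(x)
    # Average number of classes removed per prime; expected to be close to one.
    print(sum(len(removed_classes(p)) for p in ps) / len(ps))
```

Two classes are removed for primes ${p \equiv 1}$ mod ${4}$, none for ${p \equiv 3}$ mod ${4}$, and one for ${p=2}$, which is the “one-dimensional on the average” behaviour described above.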

The standard “Erdős-Rankin” method for constructing long gaps between primes proceeds by trying to line up some residue classes modulo small primes ${p}$ so that they collectively occupy a long interval. Key tools in doing so are the smooth number estimates of de Bruijn and others, which among other things assert that if one removes from an interval such as ${[1,x]}$ all the residue classes ${0}$ mod ${p}$ for ${p}$ between ${x^{1/u}}$ and ${x}$ for some fixed ${u>1}$, then the set of survivors has exceptionally small density (roughly of the order of ${u^{-u}}$, with the precise density given by the Dickman function), in marked contrast to the situation in which one randomly removes one residue class for each such prime ${p}$, in which the density is more like ${1/u}$. One generally exploits this phenomenon to sieve out almost all the elements of a long interval using some of the primes available, and then uses the remaining primes to cover up the remaining elements that have not already been sifted out. In the more recent work on this problem, advanced combinatorial tools such as hypergraph covering lemmas are used for the latter task.
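The contrast between the two densities can be observed numerically at small scales. The sketch below uses illustrative names. It counts the survivors in ${[1,x]}$ after removing the class ${0}$ mod ${p}$ for every prime ${x^{1/u} < p \leq x}$, and compares their density with the product ${\prod_p (1 - \frac{1}{p})}$ over the same primes, used here as a heuristic stand-in for the expected density under random removal. At such small ${x}$ the match with the asymptotic densities ${u^{-u}}$ and ${1/u}$ is only rough.

```python
def primes_up_to(n):
    """All primes <= n by the sieve of Eratosthenes."""
    is_prime = [True] * (n + 1)
    is_prime[0:2] = [False, False]
    for i in range(2, int(n ** 0.5) + 1):
        if is_prime[i]:
            for m in range(i * i, n + 1, i):
                is_prime[m] = False
    return [i for i in range(2, n + 1) if is_prime[i]]

def count_survivors(x, y):
    """Number of n in [1, x] not divisible by any prime p with y < p <= x."""
    removed = [False] * (x + 1)
    for p in primes_up_to(x):
        if p > y:
            for m in range(p, x + 1, p):
                removed[m] = True
    return sum(1 for n in range(1, x + 1) if not removed[n])

def random_density(x, y):
    """Product of (1 - 1/p) over primes y < p <= x (heuristic random-removal density)."""
    density = 1.0
    for p in primes_up_to(x):
        if p > y:
            density *= 1.0 - 1.0 / p
    return density

if __name__ == "__main__":
    x, u = 10 ** 4, 3.0
    y = x ** (1.0 / u)
    print(count_survivors(x, y) / x, random_density(x, y))
```

The survivors here are exactly the ${x^{1/u}}$-smooth numbers up to ${x}$, which is why the smooth number estimates control their density.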

In the case of ${{\mathcal P}_2}$, there does not appear to be any analogue of smooth numbers, in the sense that there is no obvious way to arrange the residue classes so that they have significantly fewer survivors than a random arrangement. Instead we adopt the following semi-random strategy to cover an interval ${[1,y]}$ by residue classes. Firstly, we randomly remove residue classes for primes ${p}$ up to some intermediate threshold ${z}$ (smaller than ${y}$ by a logarithmic factor), leaving behind a preliminary sifted set ${S_{[2,z]}}$. Then, for each prime ${p}$ between ${z}$ and another intermediate threshold ${x/2}$, we remove a residue class mod ${p}$ that maximises (or nearly maximises) its intersection with ${S_{[2,z]}}$. This ends up reducing the number of survivors to be significantly below what one would achieve if one selects residue classes randomly, particularly if one also uses the hypergraph covering lemma from our previous paper. Finally, we cover each of the remaining survivors by a residue class from a remaining available prime.