
This is the third “research” thread of the Polymath15 project to upper bound the de Bruijn-Newman constant ${\Lambda}$, continuing this previous thread. Discussion of the project of a non-research nature can continue for now in the existing proposal thread. Progress will be summarised at this Polymath wiki page.

We are making progress on the following test problem: can one show that ${H_t(x+iy) \neq 0}$ whenever ${t = 0.4}$, ${x \geq 0}$, and ${y \geq 0.4}$? This would imply that

$\displaystyle \Lambda \leq 0.4 + \frac{1}{2} (0.4)^2 = 0.48$

which would be the first quantitative improvement over the de Bruijn bound of ${\Lambda \leq 1/2}$ (or the Ki-Kim-Lee refinement of ${\Lambda < 1/2}$). Of course we can try to lower the two parameters of ${0.4}$ later on in the project, but this seems as good a place to start as any. One could also potentially try to use finer analysis of dynamics of zeroes to improve the bound ${\Lambda \leq 0.48}$ further, but this seems to be a less urgent task.
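The arithmetic behind this bound is worth keeping at hand as the parameters get optimised later. A trivial helper (the function name is just for illustration):

```python
def lambda_upper_bound(t: float, y: float) -> float:
    """Upper bound Lambda <= t + y^2/2 that follows from showing
    H_t(x+iy') has no zeroes with x >= 0 and y' >= y."""
    return t + 0.5 * y ** 2

# The test problem t = y = 0.4 would give Lambda <= 0.48,
# beating the de Bruijn bound Lambda <= 1/2.
print(lambda_upper_bound(0.4, 0.4))
```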

Probably the hardest case is ${y=0.4}$, as there is a good chance that one can then recover the ${y>0.4}$ case by a suitable use of the argument principle. Here we appear to have a workable Riemann-Siegel type formula that gives a tractable approximation for ${H_t}$. To describe this formula, first note that in the ${t=0}$ case we have

$\displaystyle H_0(z) = \frac{1}{8} \xi( \frac{1+iz}{2} )$

and the Riemann-Siegel formula gives

$\displaystyle \xi(s) = \frac{s(s-1)}{2} \pi^{-s/2} \Gamma(s/2) \sum_{n=1}^N \frac{1}{n^s}$

$\displaystyle + \frac{s(s-1)}{2} \pi^{-(1-s)/2} \Gamma((1-s)/2) \sum_{m=1}^M \frac{1}{m^{1-s}}$

$\displaystyle + \frac{s(s-1)}{2} \pi^{-s/2} \Gamma(s/2) \frac{e^{-i\pi s} \Gamma(1-s)}{2\pi i} \int_{C_M} \frac{w^{s-1} e^{-Nw}}{e^w-1}\ dw$

for any natural numbers ${N,M}$, where ${C_M}$ is a contour from ${+\infty}$ to ${+\infty}$ that winds once anticlockwise around the zeroes ${e^{2\pi im}, |m| \leq M}$ of ${e^w-1}$ but does not wind around any other zeroes. A good choice of ${N,M}$ to use here is

$\displaystyle N=M=\lfloor \sqrt{\mathrm{Im}(s)/2\pi}\rfloor = \lfloor \sqrt{\mathrm{Re}(z)/4\pi} \rfloor. \ \ \ \ \ (1)$
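In code, the choice (1) is a one-liner; the two expressions in (1) agree because ${s = \frac{1+iz}{2}}$ has ${\mathrm{Im}(s) = \mathrm{Re}(z)/2}$. A small sketch (the function name is illustrative):

```python
import math

def rs_truncation(z: complex) -> int:
    """The common truncation N = M from (1).  Since s = (1+iz)/2 has
    Im(s) = Re(z)/2, floor(sqrt(Im(s)/2pi)) = floor(sqrt(Re(z)/4pi))."""
    return math.floor(math.sqrt(z.real / (4 * math.pi)))

# For x near 10^6 the Riemann-Siegel sums have only a few hundred terms.
print(rs_truncation(10**6 + 0.4j))
```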

In this case, a classical steepest descent computation (see wiki) yields the approximation

$\displaystyle \int_{C_M} \frac{w^{s-1} e^{-Nw}}{e^w-1}\ dw \approx - (2\pi i M)^{s-1} \Psi( \frac{s}{2\pi i M} - N )$

where

$\displaystyle \Psi(\alpha) := 2\pi \frac{\cos \pi(\frac{1}{2}\alpha^2 - \alpha - \frac{1}{8})}{\cos(\pi \alpha)} \exp( \frac{i\pi}{2} \alpha^2 - \frac{5i\pi}{8} ).$
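For numerical experiments, ${\Psi}$ can be implemented directly; the sketch below takes one consistent reading of the displayed formula (with the constants ${\frac{1}{8}}$ inside the cosine and ${\frac{5i\pi}{8}}$ in the exponential), and this normalisation should be double-checked against the wiki before being relied upon:

```python
import cmath, math

def Psi(alpha: complex) -> complex:
    """The steepest-descent function Psi, implemented verbatim from the
    displayed formula (normalisation to be checked against the wiki).
    Note the cos(pi*alpha) denominator vanishes at half-integer alpha."""
    num = cmath.cos(cmath.pi * (0.5 * alpha ** 2 - alpha - 0.125))
    den = cmath.cos(cmath.pi * alpha)
    return 2 * cmath.pi * (num / den) * cmath.exp(
        0.5j * cmath.pi * alpha ** 2 - 0.625j * cmath.pi)

# At alpha = 0 the formula collapses to 2*pi*cos(pi/8)*exp(-5*i*pi/8).
print(abs(Psi(0)))
```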

Thus we have

$\displaystyle H_0(z) \approx A^{(0)} + B^{(0)} - C^{(0)}$

where

$\displaystyle A^{(0)} := \frac{1}{8} \frac{s(s-1)}{2} \pi^{-s/2} \Gamma(s/2) \sum_{n=1}^N \frac{1}{n^s}$

$\displaystyle B^{(0)} := \frac{1}{8} \frac{s(s-1)}{2} \pi^{-(1-s)/2} \Gamma((1-s)/2) \sum_{m=1}^M \frac{1}{m^{1-s}}$

$\displaystyle C^{(0)} := \frac{1}{8} \frac{s(s-1)}{2} \pi^{-s/2} \Gamma(s/2) \frac{e^{-i\pi s} \Gamma(1-s)}{2\pi i} (2\pi i M)^{s-1} \Psi( \frac{s}{2\pi i M} - N )$

with ${s := \frac{1+iz}{2}}$ and ${N,M}$ given by (1).

Heuristically, we have derived (see wiki) the more general approximation

$\displaystyle H_t(z) \approx A + B - C$

for ${t>0}$ (and in particular for ${t=0.4}$), where

$\displaystyle A := \frac{1}{8} \frac{s(s-1)}{2} \pi^{-s/2} \Gamma(s/2) \sum_{n=1}^N \frac{\exp(\frac{t}{16} \log^2 \frac{s+4}{2\pi n^2} )}{n^s}$

$\displaystyle B := \frac{1}{8} \frac{s(s-1)}{2} \pi^{-(1-s)/2} \Gamma((1-s)/2) \sum_{m=1}^M \frac{\exp(\frac{t}{16} \log^2 \frac{5-s}{2\pi m^2} )}{m^{1-s}}$

$\displaystyle C := \exp(-\frac{t \pi^2}{64}) C^{(0)}.$

In practice it seems that the ${C}$ term is negligible once the real part ${x}$ of ${z}$ is moderately large, so one also has the approximation

$\displaystyle H_t(z) \approx A + B.$

For large ${x}$, and for fixed ${t,y>0}$, e.g. ${t=y=0.4}$, the sums ${A,B}$ converge fairly quickly (in fact the situation seems to be significantly better here than the much more intensively studied ${t=0}$ case), and we expect the first term

$\displaystyle B_0 := \frac{1}{8} \frac{s(s-1)}{2} \pi^{-(1-s)/2} \Gamma((1-s)/2) \exp( \frac{t}{16} \log^2 \frac{5-s}{2\pi} )$

of the ${B}$ series to dominate. Indeed, analytically we know that ${\frac{A+B-C}{B_0} \rightarrow 1}$ (or ${\frac{A+B}{B_0} \rightarrow 1}$) as ${x \rightarrow \infty}$ (holding ${y}$ fixed), and it should also be provable that ${\frac{H_t}{B_0} \rightarrow 1}$ as well. Numerically with ${t=y=0.4}$, it seems in fact that ${\frac{A+B-C}{B_0}}$ (or ${\frac{A+B}{B_0}}$) stay within a distance of about ${1/2}$ of ${1}$ once ${x}$ is moderately large (e.g. ${x \geq 2 \times 10^5}$). This raises the hope that one can solve the toy problem of showing ${H_t(x+iy) \neq 0}$ for ${t=y=0.4}$ by numerically controlling ${H_t(x+iy) / B_0}$ for small ${x}$ (e.g. ${x \leq 2 \times 10^5}$), numerically controlling ${(A+B)/B_0}$ and analytically bounding the error ${(H_t - A - B)/B_0}$ for medium ${x}$ (e.g. ${2 \times 10^5 \leq x \leq 10^7}$), and analytically bounding both ${(A+B)/B_0}$ and ${(H_t-A-B)/B_0}$ for large ${x}$ (e.g. ${x \geq 10^7}$). (These numbers ${2 \times 10^5}$ and ${10^7}$ are arbitrarily chosen here and may end up being optimised to something else as the computations become clearer.)

Thus, we now have four largely independent tasks (for suitable ranges of “small”, “medium”, and “large” ${x}$):

1. Numerically computing ${H_t(x+iy) / B_0}$ for small ${x}$ (with enough accuracy to verify that there are no zeroes)
2. Numerically computing ${(A+B)/B_0}$ for medium ${x}$ (with enough accuracy to keep it away from zero)
3. Analytically bounding ${(A+B)/B_0}$ for large ${x}$ (with enough accuracy to keep it away from zero); and
4. Analytically bounding ${(H_t - A - B)/B_0}$ for medium and large ${x}$ (with a bound that is better than the bound away from zero in the previous two tasks).

Note that tasks 2 and 3 do not directly require any further understanding of the function ${H_t}$.
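Since ${B_0}$ is just the ${m=1}$ term of the ${B}$ series, the ratio ${B/B_0}$ can be computed without evaluating any Gamma factors at all (the common prefactor cancels). A sketch with the toy-problem parameters ${t=y=0.4}$ (the function name and defaults are illustrative choices):

```python
import cmath, math

def B_over_B0(x, y=0.4, t=0.4, M=None):
    """Ratio of the sum B to its leading term B_0; the prefactor
    (1/8)(s(s-1)/2) pi^{-(1-s)/2} Gamma((1-s)/2) cancels in the ratio."""
    z = x + 1j * y
    s = (1 + 1j * z) / 2
    if M is None:
        M = int(math.sqrt(z.real / (4 * math.pi)))
    def term(m):
        # m-th summand of B without the common prefactor
        return cmath.exp(t / 16 * cmath.log((5 - s) / (2 * cmath.pi * m * m)) ** 2) / m ** (1 - s)
    return sum(term(m) for m in range((1), M + 1)) / term(1)

print(B_over_B0(1e6))
```

The later terms decay roughly like ${m^{-2}}$ here, which is the quick convergence alluded to above.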

Below we will give a progress report on the numeric and analytic sides of these tasks.

— 1. Numerics report (contributed by Sujit Nair) —

There is some progress on the code side, but not at the pace I was hoping for. Here are a few things that happened (or rather, mistakes that were caught and taken care of).

1. We got rid of code which wasn’t being used. For example, @dhjpolymath computed ${H_t}$ based on an old version but only realized it after the fact.
2. We implemented tests to catch human/numerical bugs before a computation starts. Again, we lost some numerical cycles but moving forward these can be avoided.
3. David got set up on GitHub and he is able to compare his output (in C) with the Python code. That is helping a lot.

Two areas which were worked on were

1. Computing ${H_t}$ and zeroes for ${t}$ around ${0.4}$
2. Computing quantities like ${(A+B-C)/B_0}$, ${(A+B)/B_0}$, ${C/B_0}$, etc. with the goal of understanding the zero free regions.

Some observations for ${t=0.4}$, ${y=0.4}$, ${x \in ( 10^4, 10^7)}$ include:

• ${(A+B) / B_0}$ does seem to avoid the negative real axis
• ${|(A+B) / B_0| > 0.4}$ (based on the oscillations and trends in the plots)
• ${|C/B_0|}$ seems to be settling in the ${10^{-4}}$ range.

See the figure below. The top plot is on the complex plane and the bottom plot is the absolute value. The code to play with this is here.

— 2. Analysis report —

The Riemann-Siegel formula and some manipulations (see wiki) give ${H_0 = A^{(0)} + B^{(0)} - \tilde C^{(0)}}$, where

$\displaystyle A^{(0)} = \frac{2}{8} \sum_{n=1}^N \int_C \exp( \frac{s+4}{2} u - e^u - \frac{s}{2} \log(\pi n^2) )\ du$

$\displaystyle - \frac{3}{8} \sum_{n=1}^N \int_C \exp( \frac{s+2}{2} u - e^u - \frac{s}{2} \log(\pi n^2) )\ du$

$\displaystyle B^{(0)} = \frac{2}{8} \sum_{m=1}^M \int_{\overline{C}} \exp( \frac{5-s}{2} u - e^u - \frac{1-s}{2} \log(\pi m^2) )\ du$

$\displaystyle - \frac{3}{8} \sum_{m=1}^M \int_{\overline{C}} \exp( \frac{3-s}{2} u - e^u - \frac{1-s}{2} \log(\pi m^2) )\ du$

$\displaystyle \tilde C^{(0)} := -\frac{2}{8} \sum_{n=0}^\infty \frac{e^{-i\pi s/2} e^{i\pi s n}}{2^s \pi^{1/2}} \int_{\overline{C}} \int_{C_M} \frac{w^{s-1} e^{-Nw}}{e^w-1} \exp( \frac{5-s}{2} u - e^u)\ du dw$

$\displaystyle +\frac{3}{8} \sum_{n=0}^\infty \frac{e^{-i\pi s/2} e^{i\pi s n}}{2^s \pi^{1/2}} \int_{\overline{C}} \int_{C_M} \frac{w^{s-1} e^{-Nw}}{e^w-1} \exp( \frac{3-s}{2} u - e^u)\ du dw$

where ${C}$ is a contour that goes from ${+i\infty}$ to ${+\infty}$ staying a bounded distance away from the upper imaginary and right real axes, and ${\overline{C}}$ is the complex conjugate of ${C}$. (In each of these sums, it is the first term that should dominate, with the second one being about ${O(1/x)}$ as large.) One can then evolve by the heat flow to obtain ${H_t = \tilde A + \tilde B - \tilde C}$, where

$\displaystyle \tilde A := \frac{2}{8} \sum_{n=1}^N \int_C \exp( \frac{s+4}{2} u - e^u - \frac{s}{2} \log(\pi n^2) + \frac{t}{16} (u - \log(\pi n^2))^2)\ du$

$\displaystyle - \frac{3}{8} \sum_{n=1}^N \int_C \exp( \frac{s+2}{2} u - e^u - \frac{s}{2} \log(\pi n^2) + \frac{t}{16} (u - \log(\pi n^2))^2)\ du$

$\displaystyle \tilde B := \frac{2}{8} \sum_{m=1}^M \int_{\overline{C}} \exp( \frac{5-s}{2} u - e^u - \frac{1-s}{2} \log(\pi m^2) + \frac{t}{16} (u - \log(\pi m^2))^2)\ du$

$\displaystyle - \frac{3}{8} \sum_{m=1}^M \int_{\overline{C}} \exp( \frac{3-s}{2} u - e^u - \frac{1-s}{2} \log(\pi m^2) + \frac{t}{16} (u - \log(\pi m^2))^2)\ du$

$\displaystyle \tilde C := -\frac{2}{8} \sum_{n=0}^\infty \frac{e^{-i\pi s/2} e^{i\pi s n}}{2^s \pi^{1/2}} \int_{\overline{C}} \int_{C_M}$

$\displaystyle \frac{w^{s-1} e^{-Nw}}{e^w-1} \exp( \frac{5-s}{2} u - e^u + \frac{t}{4} (i \pi(n-1/2) + \log \frac{w}{2\sqrt{\pi}} - \frac{u}{2})^2) \ du dw$

$\displaystyle +\frac{3}{8} \sum_{n=0}^\infty \frac{e^{-i\pi s/2} e^{i\pi s n}}{2^s \pi^{1/2}} \int_{\overline{C}} \int_{C_M}$

$\displaystyle \frac{w^{s-1} e^{-Nw}}{e^w-1} \exp( \frac{3-s}{2} u - e^u + \frac{t}{4} (i \pi(n-1/2) + \log \frac{w}{2\sqrt{\pi}} - \frac{u}{2})^2)\ du dw.$

Steepest descent heuristics then predict that ${\tilde A \approx A}$, ${\tilde B \approx B}$, and ${\tilde C \approx C}$. For the purposes of this project, we will need effective error estimates here, with explicit error terms.

A start has been made towards this goal at this wiki page. Firstly, there is an “effective Laplace method” lemma that gives effective bounds on integrals of the form ${\int_I e^{\phi(x)} \psi(x)\ dx}$ if the real part of ${\phi(x)}$ is either monotone with large derivative, or has a critical point and is decreasing on both sides of that critical point. In principle, all one has to do is manipulate expressions such as ${\tilde A - A}$, ${\tilde B - B}$, ${\tilde C - C}$ by change of variables, contour shifting and integration by parts until they are in a form to which the above lemma can be profitably applied. As one may imagine, though, the computations are messy, particularly for the ${\tilde C}$ term. As a warm up, I have begun by trying to estimate integrals of the form

$\displaystyle \int_C \exp( s (1+u-e^u) + \frac{t}{16} (u+b)^2 )\ du$

for smallish complex numbers ${b}$, as these sorts of integrals appear in the formulae for ${\tilde A, \tilde B, \tilde C}$. As of this writing, there are effective bounds for the ${b=0}$ case, and I am currently working on extending them to the ${b \neq 0}$ case, which should give enough control to approximate ${\tilde A - A}$ and ${\tilde B-B}$. The most complicated task will be that of upper bounding ${\tilde C}$, but it also looks eventually doable.

This is the first official “research” thread of the Polymath15 project to upper bound the de Bruijn-Newman constant ${\Lambda}$. Discussion of the project of a non-research nature can continue for now in the existing proposal thread. Progress will be summarised at this Polymath wiki page.

The proposal naturally splits into at least three separate (but loosely related) topics:

• Numerical computation of the entire functions ${H_t(z)}$, with the ultimate aim of establishing zero-free regions of the form ${\{ x+iy: 0 \leq x \leq T, y \geq \varepsilon \}}$ for various ${T, \varepsilon > 0}$.
• Improved understanding of the dynamics of the zeroes ${z_j(t)}$ of ${H_t}$.
• Establishing the zero-free nature of ${H_t(x+iy)}$ when ${y \geq \varepsilon > 0}$ and ${x}$ is sufficiently large depending on ${t}$ and ${\varepsilon}$.

Below the fold, I will present each of these topics in turn, to initiate further discussion in each of them. (I thought about splitting this post into three to have three separate discussions, but given the current volume of comments, I think we should be able to manage for now having all the comments in a single post. If this changes then of course we can split up some of the discussion later.)

To begin with, let me present some formulae for computing ${H_t}$ (inspired by similar computations in the Ki-Kim-Lee paper) which may be useful. The initial definition of ${H_t}$ is

$\displaystyle H_t(z) := \int_0^\infty e^{tu^2} \Phi(u) \cos(zu)\ du$

where

$\displaystyle \Phi(u) := \sum_{n=1}^\infty (2\pi^2 n^4 e^{9u} - 3 \pi n^2 e^{5u}) \exp(- \pi n^2 e^{4u} )$

is a variant of the Jacobi theta function. We observe that ${\Phi}$ in fact extends analytically to the strip

$\displaystyle \{ u \in {\bf C}: -\frac{\pi}{8} < \mathrm{Im} u < \frac{\pi}{8} \}, \ \ \ \ \ (1)$

as ${e^{4u}}$ has positive real part on this strip. One can use the Poisson summation formula to verify that ${\Phi}$ is even, ${\Phi(-u) = \Phi(u)}$ (see this previous post for details). This lets us obtain a number of other formulae for ${H_t}$. Most obviously, one can unfold the integral to obtain

$\displaystyle H_t(z) = \frac{1}{2} \int_{-\infty}^\infty e^{tu^2} \Phi(u) e^{izu}\ du.$
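The evenness of ${\Phi}$, which is what licenses this unfolding, is far from obvious termwise, and makes a nice numerical sanity check (here with the first 30 summands, which is ample at these arguments since the terms decay like ${\exp(-\pi n^2 e^{4u})}$):

```python
import math

def Phi(u: float, terms: int = 30) -> float:
    """Partial sum of the series defining Phi."""
    return sum(
        (2 * math.pi**2 * n**4 * math.exp(9 * u)
         - 3 * math.pi * n**2 * math.exp(5 * u))
        * math.exp(-math.pi * n**2 * math.exp(4 * u))
        for n in range(1, terms + 1))

# Each summand is wildly asymmetric in u, yet the sum is even:
print(Phi(0.1), Phi(-0.1))
```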

In my previous paper with Brad, we used this representation, combined with Fubini’s theorem to swap the sum and integral, to obtain a useful series representation for ${H_t}$ in the ${t \leq 0}$ case. The same argument does not quite work in the ${t>0}$ case because expressions such as ${e^{tu^2} e^{9u} \exp( -\pi n^2 e^{4u} ) e^{izu}}$ diverge as ${u}$ approaches ${-\infty}$. Nevertheless we can still perform the following contour integration manipulation. Let ${0 \leq \theta < \frac{\pi}{8}}$ be fixed. The function ${\Phi}$ decays super-exponentially fast (much faster than ${e^{tu^2}}$, in particular) as ${\mathrm{Re} u \rightarrow +\infty}$ with ${|\mathrm{Im}\, u| \leq \theta}$; as ${\Phi}$ is even, we also have this decay as ${\mathrm{Re} u \rightarrow -\infty}$ with ${|\mathrm{Im}\, u| \leq \theta}$ (this is despite each of the summands in ${\Phi}$ having much slower decay in this direction – there is considerable cancellation!). Hence by the Cauchy integral formula we have

$\displaystyle H_t(z) = \frac{1}{2} \int_{i\theta-\infty}^{i\theta+\infty} e^{tu^2} \Phi(u) e^{izu}\ du.$

Splitting the horizontal line from ${i\theta-\infty}$ to ${i\theta+\infty}$ at ${i\theta}$ and using the even nature of ${\Phi(u)}$, we thus have

$\displaystyle H_t(z) = \frac{1}{2} (\int_{i\theta}^{i\theta+\infty} e^{tu^2} \Phi(u) e^{izu}\ du + \int_{-i\theta}^{-i\theta+\infty} e^{tu^2} \Phi(u) e^{-izu}\ du).$

Using the functional equation ${\Phi(\overline{u}) = \overline{\Phi(u)}}$, we thus have the representation

$\displaystyle H_t(z) = \frac{1}{2} ( K_{t,\theta}(z) + \overline{K_{t,\theta}(\overline{z})} ) \ \ \ \ \ (2)$

where

$\displaystyle K_{t,\theta}(z) := \int_{i\theta}^{i \theta+\infty} e^{tu^2} \Phi(u) e^{izu}\ du$

$\displaystyle = \sum_{n=1}^\infty 2 \pi^2 n^4 I_{t, \theta}( z - 9i, \pi n^2 ) - 3 \pi n^2 I_{t,\theta}( z - 5i, \pi n^2 )$

where ${I_{t,\theta}(b,\beta)}$ is the oscillatory integral

$\displaystyle I_{t,\theta}(b,\beta) := \int_{i\theta}^{i\theta+\infty} \exp( tu^2 - \beta e^{4u} + i b u )\ du. \ \ \ \ \ (3)$

The formula (2) is valid for any ${0 \leq \theta < \frac{\pi}{8}}$. Naively one would think that it would be simplest to take ${\theta=0}$; however, when ${z=x+iy}$ and ${x}$ is large (with ${y}$ bounded), it seems asymptotically better to take ${\theta}$ closer to ${\pi/8}$, in particular something like ${\theta = \frac{\pi}{8} - \frac{1}{4x}}$ seems to be a reasonably good choice. This is because the integrand in (3) becomes significantly less oscillatory and also much lower in amplitude; the ${\exp(ibu)}$ term in (3) now generates a factor roughly comparable to ${\exp( - \pi x/8 )}$ (which, as we will see below, is the main term in the decay asymptotics for ${H_t(x+iy)}$), while the ${\exp( - \beta e^{4u} )}$ term still exhibits a reasonable amount of decay as ${u \rightarrow \infty}$. We will use the representation (2) in the asymptotic analysis of ${H_t}$ below, but it may also be a useful representation to use for numerical purposes.
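The ${\theta}$-independence of (2) makes for a good numerical self-test. The following sketch evaluates (2) at two different contour heights by the trapezoid rule and compares the results; the cut-offs and step counts here are ad hoc choices, not tuned values:

```python
import cmath

def Phi(u: complex, terms: int = 25) -> complex:
    # Partial sum of the theta-like series defining Phi; valid for |Im u| < pi/8.
    return sum(
        (2 * cmath.pi**2 * n**4 * cmath.exp(9 * u)
         - 3 * cmath.pi * n**2 * cmath.exp(5 * u))
        * cmath.exp(-cmath.pi * n**2 * cmath.exp(4 * u))
        for n in range(1, terms + 1))

def K(t, theta, z, V=1.5, steps=5000):
    # Integral of e^{t u^2} Phi(u) e^{izu} du from i*theta to i*theta + V,
    # by the trapezoid rule; Phi decays so fast that V = 1.5 suffices here.
    h = V / steps
    def f(v):
        u = 1j * theta + v
        return cmath.exp(t * u * u) * Phi(u) * cmath.exp(1j * z * u)
    vals = [f(k * h) for k in range(steps + 1)]
    return h * (sum(vals) - 0.5 * (vals[0] + vals[-1]))

def H(t, z, theta):
    # Representation (2): H_t(z) = (K_{t,theta}(z) + conj(K_{t,theta}(conj z)))/2.
    return 0.5 * (K(t, theta, z) + K(t, theta, z.conjugate()).conjugate())

# The value should not depend on the contour height theta in [0, pi/8):
a = H(0.4, 5 + 0.4j, 0.0)
b = H(0.4, 5 + 0.4j, 0.3)
print(a, b)
```

As a bonus, (2) makes the reality of ${H_t}$ on the real axis manifest: for real ${z}$ the two terms are exact conjugates.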

Building on the interest expressed in the comments to this previous post, I am now formally proposing to initiate a “Polymath project” on the topic of obtaining new upper bounds on the de Bruijn-Newman constant ${\Lambda}$. The purpose of this post is to describe the proposal and discuss the scope and parameters of the project.

De Bruijn introduced a family ${H_t: {\bf C} \rightarrow {\bf C}}$ of entire functions for each real number ${t}$, defined by the formula

$\displaystyle H_t(z) := \int_0^\infty e^{tu^2} \Phi(u) \cos(zu)\ du$

where ${\Phi}$ is the super-exponentially decaying function

$\displaystyle \Phi(u) := \sum_{n=1}^\infty (2\pi^2 n^4 e^{9u} - 3 \pi n^2 e^{5u}) \exp(-\pi n^2 e^{4u}).$

As discussed in this previous post, the Riemann hypothesis is equivalent to the assertion that all the zeroes of ${H_0}$ are real.
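As a concrete sanity check tying these definitions together: the identity ${H_0(z) = \frac{1}{8} \xi(\frac{1+iz}{2})}$ mentioned earlier in this archive can be tested at ${z = 3i}$, where ${\frac{1+iz}{2} = -1}$, so that by the functional equation ${\xi(s) = \xi(1-s)}$ and ${\zeta(2) = \pi^2/6}$ the right-hand side is ${\frac{1}{8}\xi(2) = \frac{\pi}{48}}$. A quadrature sketch (using ${\cos(3iu) = \cosh(3u)}$; the cut-offs are ad hoc):

```python
import math

def Phi(u: float, terms: int = 30) -> float:
    # Super-exponentially decaying sum defining Phi.
    return sum(
        (2 * math.pi**2 * n**4 * math.exp(9 * u)
         - 3 * math.pi * n**2 * math.exp(5 * u))
        * math.exp(-math.pi * n**2 * math.exp(4 * u))
        for n in range(1, terms + 1))

# H_0(3i) = int_0^infty Phi(u) cosh(3u) du, by the trapezoid rule.
steps, V = 4000, 1.5
h = V / steps
vals = [Phi(k * h) * math.cosh(3 * k * h) for k in range(steps + 1)]
H0_at_3i = h * (sum(vals) - 0.5 * (vals[0] + vals[-1]))

# Expected value: xi(-1)/8 = xi(2)/8 = pi/48.
print(H0_at_3i, math.pi / 48)
```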

De Bruijn and Newman showed that there existed a real constant ${\Lambda}$ – the de Bruijn-Newman constant – such that ${H_t}$ has all zeroes real whenever ${t \geq \Lambda}$, and at least one non-real zero when ${t < \Lambda}$. In particular, the Riemann hypothesis is equivalent to the upper bound ${\Lambda \leq 0}$. In the opposite direction, several lower bounds on ${\Lambda}$ have been obtained over the years, most recently in my paper with Brad Rodgers where we showed that ${\Lambda \geq 0}$, a conjecture of Newman.

As for upper bounds, de Bruijn showed back in 1950 that ${\Lambda \leq 1/2}$. The only progress since then has been the work of Ki, Kim and Lee in 2009, who improved this slightly to ${\Lambda < 1/2}$. The primary proposed aim of this Polymath project is to obtain further explicit improvements to the upper bound of ${\Lambda}$. Of course, if we could lower the upper bound all the way to zero, this would solve the Riemann hypothesis, but I do not view this as a realistic outcome of this project; rather, the upper bounds that one could plausibly obtain by known methods and numerics would be comparable in achievement to the various numerical verifications of the Riemann hypothesis that exist in the literature (e.g., that the first ${N}$ non-trivial zeroes of the zeta function lie on the critical line, for various large explicit values of ${N}$).

In addition to the primary goal, one could envisage some related secondary goals of the project, such as a better understanding (both analytic and numerical) of the functions ${H_t}$ (or of similar functions), and of the dynamics of the zeroes of these functions. Perhaps further potential goals could emerge in the discussion to this post.

I think there is a plausible plan of attack on this project that proceeds as follows. Firstly, there are results going back to the original work of de Bruijn that demonstrate that the zeroes of ${H_t}$ become attracted to the real line as ${t}$ increases; in particular, if one defines ${\sigma_{max}(t)}$ to be the supremum of the imaginary parts of all the zeroes of ${H_t}$, then it is known that this quantity obeys the differential inequality

$\displaystyle \frac{d}{dt} \sigma_{max}(t) \leq - \frac{1}{\sigma_{max}(t)} \ \ \ \ \ (1)$

whenever ${\sigma_{max}(t)}$ is positive; furthermore, once ${\sigma_{max}(t) = 0}$ for some ${t}$, then ${\sigma_{max}(t') = 0}$ for all ${t' > t}$. I hope to explain this in a future post (it is basically due to the attraction that a zero off the real axis has to its complex conjugate). As a corollary of this inequality, we have the upper bound

$\displaystyle \Lambda \leq t + \frac{1}{2} \sigma_{max}(t)^2 \ \ \ \ \ (2)$

for any real number ${t}$. For instance, because all the non-trivial zeroes of the Riemann zeta function lie in the critical strip ${\{ s: 0 \leq \mathrm{Re} s \leq 1 \}}$, one has ${\sigma_{max}(0) \leq 1}$, which when inserted into (2) gives ${\Lambda \leq 1/2}$. The inequality (1) also gives ${\sigma_{max}(t) \leq \sqrt{1-2t}}$ for all ${0 \leq t \leq 1/2}$. If we could find some explicit ${t}$ between ${0}$ and ${1/2}$ where we can improve this upper bound on ${\sigma_{max}(t)}$ by an explicit constant, this would lead to a new upper bound on ${\Lambda}$.
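One way to see why an improvement at a single ${t}$ is essential: feeding the a priori bound ${\sigma_{max}(t) \leq \sqrt{1-2t}}$ from (1) into (2) returns exactly ${1/2}$ at every ${t}$, so (2) yields nothing new unless ${\sigma_{max}(t)}$ is beaten below ${\sqrt{1-2t}}$ somewhere. A quick check of this algebra:

```python
import math

def lam_bound(t: float, sigma: float) -> float:
    # The upper bound (2): Lambda <= t + sigma^2 / 2.
    return t + 0.5 * sigma * sigma

# With sigma = sqrt(1-2t) the bound collapses to 1/2 identically...
for t in [0.0, 0.1, 0.25, 0.4]:
    print(t, lam_bound(t, math.sqrt(1 - 2 * t)))

# ...whereas any strict improvement, e.g. sigma_max(0.4) <= 0.4
# (note 0.4 < sqrt(0.2) ~ 0.447), gives a bound below 1/2:
print(lam_bound(0.4, 0.4))
```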

Secondly, the work of Ki, Kim and Lee (based on an analysis of the various terms appearing in the expression for ${H_t}$) shows that for any positive ${t}$, all but finitely many of the zeroes of ${H_t}$ are real (in contrast with the ${t=0}$ situation, where it is still an open question as to whether the proportion of non-trivial zeroes of the zeta function on the critical line is asymptotically equal to ${1}$). As a key step in this analysis, Ki, Kim, and Lee show that for any ${t>0}$ and ${\varepsilon>0}$, there exists a ${T>0}$ such that all the zeroes of ${H_t}$ with real part at least ${T}$ have imaginary part at most ${\varepsilon}$. Ki, Kim and Lee do not explicitly compute how ${T}$ depends on ${t}$ and ${\varepsilon}$, but it looks like this bound could be made effective.

If so, this suggests a possible strategy to get a new upper bound on ${\Lambda}$:

• Select a good choice of parameters ${t, \varepsilon > 0}$.
• By refining the Ki-Kim-Lee analysis, find an explicit ${T}$ such that all zeroes of ${H_t}$ with real part at least ${T}$ have imaginary part at most ${\varepsilon}$.
• By a numerical computation (e.g. using the argument principle), also verify that zeroes of ${H_t}$ with real part between ${0}$ and ${T}$ have imaginary part at most ${\varepsilon}$.
• Combining these facts, we obtain that ${\sigma_{max}(t) \leq \varepsilon}$; hopefully, one can insert this into (2) and get a new upper bound for ${\Lambda}$.

Of course, there may also be alternate strategies to upper bound ${\Lambda}$, and I would imagine this would also be a legitimate topic of discussion for this project.

One appealing thing about the above strategy for the purposes of a polymath project is that it naturally splits the project into several interacting but reasonably independent parts: an analytic part in which one tries to refine the Ki-Kim-Lee analysis (based on explicitly upper and lower bounding various terms in a certain series expansion for ${H_t}$ – I may detail this later in a subsequent post); a numerical part in which one controls the zeroes of ${H_t}$ in a certain finite range; and perhaps also a dynamical part where one sees if there is any way to improve the inequality (2). For instance, the numerical “team” might, over time, be able to produce zero-free regions for ${H_t}$ with an increasingly large value of ${T}$, while in parallel the analytic “team” might produce increasingly smaller values of ${T}$ beyond which they can control zeroes, and eventually the two bounds would meet up and we obtain a new bound on ${\Lambda}$. This factoring of the problem into smaller parts was also a feature of the successful Polymath8 project on bounded gaps between primes.

The project also resembles Polymath8 in another aspect: that there is an obvious way to numerically measure progress, by seeing how the upper bound for ${\Lambda}$ decreases over time (and presumably there will also be another metric of progress regarding how well we can control ${T}$ in terms of ${t}$ and ${\varepsilon}$). However, in Polymath8 the final measure of progress (the upper bound ${H}$ on gaps between primes) was a natural number, and thus could not decrease indefinitely. Here, the bound will be a real number, and there is a possibility that one may end up having an infinite descent in which progress slows down over time, with refinements to increasingly less significant digits of the bound as the project progresses. Because of this, I think it makes sense to follow recent Polymath projects and place an expiration date for the project, for instance one year after the launch date, in which we will agree to end the project and (if the project was successful enough) write up the results, unless there is consensus at that time to extend the project. (In retrospect, we should probably have imposed similar sunset dates on older Polymath projects, some of which have now been inactive for years, but that is perhaps a discussion for another time.)

Some Polymath projects have been known for a breakneck pace, making it hard for some participants to keep up. It’s hard to control these things, but I am envisaging a relatively leisurely project here, perhaps taking the full year mentioned above. It may well be that as the project matures we will largely be waiting for the results of lengthy numerical calculations to come in, for instance. Of course, as with previous projects, we would maintain some wiki pages (and possibly some other resources, such as a code repository) to keep track of progress and also to summarise what we have learned so far. For instance, as was done with some previous Polymath projects, we could begin with some “online reading seminars” where we go through some relevant piece of literature (most obviously the Ki-Kim-Lee paper, but there may be other resources that become relevant, e.g. one could imagine the literature on numerical verification of RH to be of value).

One could also imagine some incidental outcomes of this project, such as a more efficient way to numerically establish zero free regions for various analytic functions of interest; in particular, the project may well end up focusing on some other aspect of mathematics than the specific questions posed here.

Anyway, I would be interested to hear in the comments below from others who might be interested in participating, or at least observing, this project, particularly if they have suggestions regarding the scope and direction of the project, and on organisational structure (e.g. if one should start with reading seminars, or some initial numerical exploration of the functions ${H_t}$, etc..) One could also begin some preliminary discussion of the actual mathematics of the project itself, though (in line with the leisurely pace I was hoping for), I expect that the main burst of mathematical activity would happen later, once the project is formally launched (with wiki page resources, blog posts dedicated to specific aspects of the project, etc.).

The Polymath14 online collaboration has uploaded to the arXiv its paper “Homogeneous length functions on groups“, submitted to Algebra & Number Theory. The paper completely classifies homogeneous length functions ${\| \|: G \rightarrow {\bf R}^+}$ on an arbitrary group ${G = (G,\cdot,e,()^{-1})}$, that is to say non-negative functions that obey the symmetry condition ${\|x^{-1}\| = \|x\|}$, the non-degeneracy condition ${\|x\|=0 \iff x=e}$, the triangle inequality ${\|xy\| \leq \|x\| + \|y\|}$, and the homogeneity condition ${\|x^2\| = 2\|x\|}$. It turns out that these norms can only arise from pulling back the norm of a Banach space by an isometric embedding of the group. Among other things, this shows that ${G}$ supports a homogeneous length function if and only if it is abelian and torsion free, thus giving a metric description of this property.

The proof is based on repeated use of the homogeneous length function axioms, combined with elementary identities of commutators, to obtain increasingly good bounds on quantities such as ${\|[x,y]\|}$, until one can show that such norms have to vanish. See the previous post for a full proof. The result is robust in that it allows for some loss in the triangle inequality and homogeneity condition, allowing for some new results on “quasinorms” on groups that relate to quasihomomorphisms.

As there are now a large number of comments on the previous post on this project, this post will also serve as the new thread for any final discussion of this project as it winds down.

In the tradition of “Polymath projects“, the problem posed in the previous two blog posts has now been solved, thanks to the cumulative effect of many small contributions by many participants (including, but not limited to, Sean Eberhard, Tobias Fritz, Siddharta Gadgil, Tobias Hartnick, Chris Jerdonek, Apoorva Khare, Antonio Machiavelo, Pace Nielsen, Andy Putman, Will Sawin, Alexander Shamov, Lior Silberman, and David Speyer). In this post I’ll write down a streamlined resolution, eliding a number of important but ultimately removable partial steps and insights made by the above contributors en route to the solution.

Theorem 1 Let ${G = (G,\cdot)}$ be a group. Suppose one has a “seminorm” function ${\| \|: G \rightarrow [0,+\infty)}$ which obeys the triangle inequality

$\displaystyle \|xy \| \leq \|x\| + \|y\|$

for all ${x,y \in G}$, with equality whenever ${x=y}$. Then the seminorm factors through the abelianisation map ${G \rightarrow G/[G,G]}$.

Proof: By the triangle inequality, it suffices to show that ${\| [x,y]\| = 0}$ for all ${x,y \in G}$, where ${[x,y] := xyx^{-1}y^{-1}}$ is the commutator.

We first establish some basic facts. Firstly, by hypothesis we have ${\|x^2\| = 2 \|x\|}$, and hence ${\|x^n \| = n \|x\|}$ whenever ${n}$ is a power of two. On the other hand, by the triangle inequality we have ${\|x^n \| \leq n\|x\|}$ for all positive ${n}$, and hence by the triangle inequality again we also have the matching lower bound, thus

$\displaystyle \|x^n \| = n \|x\|$

for all ${n > 0}$. The claim is also true for ${n=0}$ (apply the preceding bound with ${x=1}$ and ${n=2}$). By replacing ${\|x\|}$ with ${\max(\|x\|, \|x^{-1}\|)}$ if necessary we may now also assume without loss of generality that ${\|x^{-1} \| = \|x\|}$, thus

$\displaystyle \|x^n \| = |n| \|x\| \ \ \ \ \ (1)$

for all integers ${n}$.

Next, for any ${x,y \in G}$, and any natural number ${n}$, we have

$\displaystyle \|yxy^{-1} \| = \frac{1}{n} \| (yxy^{-1})^n \|$

$\displaystyle = \frac{1}{n} \| y x^n y^{-1} \|$

$\displaystyle \leq \frac{1}{n} ( \|y\| + n \|x\| + \|y^{-1}\| )$

so on taking limits as ${n \rightarrow \infty}$ we have ${\|yxy^{-1} \| \leq \|x\|}$. Replacing ${x,y}$ by ${yxy^{-1},y^{-1}}$ gives the matching lower bound, thus we have the conjugation invariance

$\displaystyle \|yxy^{-1} \| = \|x\|. \ \ \ \ \ (2)$

Next, we observe that if ${x,y,z,w}$ are such that ${x}$ is conjugate to both ${wy}$ and ${zw^{-1}}$, then one has the inequality

$\displaystyle \|x\| \leq \frac{1}{2} ( \|y \| + \| z \| ). \ \ \ \ \ (3)$

Indeed, if we write ${x = swys^{-1} = t zw^{-1} t^{-1}}$ for some ${s,t \in G}$, then for any natural number ${n}$ one has

$\displaystyle \|x\| = \frac{1}{2n} \| x^n x^n \|$

$\displaystyle = \frac{1}{2n} \| swy \dots wy s^{-1}t zw^{-1} \dots zw^{-1} t^{-1} \|$

where the ${wy}$ and ${zw^{-1}}$ terms each appear ${n}$ times. From (2) we see that conjugation by ${w}$ does not affect the norm. Using this and the triangle inequality several times, we conclude that

$\displaystyle \|x\| \leq \frac{1}{2n} ( \|s\| + n \|y\| + \| s^{-1} t\| + n \|z\| + \|t^{-1} \| ),$

and the claim (3) follows by sending ${n \rightarrow \infty}$.

The following special case of (3) will be of particular interest. Let ${x,y \in G}$, and for any integers ${m,k}$, define the quantity

$\displaystyle f(m,k) := \| x^m [x,y]^k \|.$

Observe that ${x^m [x,y]^k}$ is conjugate to both ${x (x^{m-1} [x,y]^k)}$ and to ${(y^{-1} x^m [x,y]^{k-1} xy) x^{-1}}$, hence by (3) one has

$\displaystyle \| x^m [x,y]^k \| \leq \frac{1}{2} ( \| x^{m-1} [x,y]^k \| + \| y^{-1} x^{m} [x,y]^{k-1} xy \|)$

which by (2) leads to the recursive inequality

$\displaystyle f(m,k) \leq \frac{1}{2} (f(m-1,k) + f(m+1,k-1)).$

We can write this in probabilistic notation as

$\displaystyle f(m,k) \leq {\bf E} f( (m,k) + X )$

where ${X}$ is a random vector that takes the values ${(-1,0)}$ and ${(1,-1)}$ with probability ${1/2}$ each. Iterating this, we conclude in particular that for any large natural number ${n}$, one has

$\displaystyle f(0,n) \leq {\bf E} f( Z )$

where ${Z := (0,n) + X_1 + \dots + X_{2n}}$ and ${X_1,\dots,X_{2n}}$ are iid copies of ${X}$. We can write ${Z = (1,-1/2) (Y_1 + \dots + Y_{2n})}$ where $Y_1,\dots,Y_{2n} = \pm 1$ are iid signs.  By the triangle inequality, we thus have

$\displaystyle f( Z ) \leq |Y_1+\dots+Y_{2n}| (\|x\| + \frac{1}{2} \| [x,y] \|),$

noting that $Y_1+\dots+Y_{2n}$ is an even integer.  On the other hand, $Y_1+\dots+Y_{2n}$ has mean zero and variance $2n$, hence by Cauchy-Schwarz

$\displaystyle f(0,n) \leq \sqrt{2n}( \|x\| + \frac{1}{2} \| [x,y] \|).$

But by (1), the left-hand side is equal to ${n \| [x,y]\|}$. Dividing by ${n}$ and then sending ${n \rightarrow \infty}$, we obtain the claim. $\Box$
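The Cauchy-Schwarz step ${{\bf E} |Y_1+\dots+Y_{2n}| \leq \sqrt{2n}}$ used above can be sanity-checked numerically. The following sketch (my own illustration, not part of the argument) computes the expectation exactly from the binomial distribution of the number of ${+1}$ signs:

```python
from math import comb, sqrt

def mean_abs_sum(m):
    """Exact E|Y_1 + ... + Y_m| for iid uniform signs Y_i = +-1."""
    # S = 2k - m when exactly k of the m signs equal +1
    return sum(comb(m, k) * abs(2 * k - m) for k in range(m + 1)) / 2 ** m

# E|Y_1 + ... + Y_{2n}| <= sqrt(E S^2) = sqrt(2n) by Cauchy-Schwarz
for n in (1, 5, 50, 500):
    assert mean_abs_sum(2 * n) <= sqrt(2 * n)
```

In fact the exact mean grows like ${\sqrt{4n/\pi}}$, so the Cauchy-Schwarz loss here is only a constant factor.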

The above theorem reduces such seminorms to abelian groups. It is easy to see from (1) that any torsion element of such groups has zero seminorm, so we can in fact restrict to torsion-free groups, which we now write using additive notation ${G = (G,+)}$, thus for instance ${\| nx \| = |n| \|x\|}$ for ${n \in {\bf Z}}$. We think of ${G}$ as a ${{\bf Z}}$-module. One can then extend the seminorm to the associated ${{\bf Q}}$-vector space ${G \otimes_{\bf Z} {\bf Q}}$ by the formula ${\|\frac{a}{b} x\| := \frac{|a|}{b} \|x\|}$, and then to the associated ${{\bf R}}$-vector space ${G \otimes_{\bf Z} {\bf R}}$ by continuity, at which point it becomes a genuine seminorm (provided we have ensured the symmetry condition ${\|x\| = \|x^{-1}\|}$). Conversely, any seminorm on ${G \otimes_{\bf Z} {\bf R}}$ induces a seminorm on ${G}$. (These arguments also appear in this paper of Khare and Rajaratnam.)

Over on the polymath blog, I’ve posted (on behalf of Dinesh Thakur) a new polymath proposal, which is to explain some numerically observed identities involving the irreducible polynomials $P$ in the polynomial ring ${\bf F}_2[t]$ over the finite field of characteristic two, the simplest of which is

$\displaystyle \sum_P \frac{1}{1+P} = 0$

(expanded in terms of Taylor series in $u = 1/t$).  Comments on the problem should be placed in the polymath blog post; if there is enough interest, we can start a formal polymath project on it.
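The identity can be tested numerically to any desired order in ${u}$. The sketch below (my own, not from the proposal) encodes polynomials over ${\bf F}_2$ as integer bitmasks, enumerates the irreducibles up to degree ${D}$ by trial division, and expands each ${1/(1+P)}$ as a power series in ${u = 1/t}$; since the coefficient of ${u^k}$ receives contributions only from irreducibles of degree at most ${k}$, the truncated sum is exact to this order:

```python
# Check that sum over irreducible P in F_2[t] of 1/(1+P) vanishes to order u^D.
# Polynomials over F_2 are stored as ints: bit i is the coefficient of t^i.

def degree(p):
    return p.bit_length() - 1

def polymod(a, m):
    """Remainder of a modulo m in F_2[t]."""
    dm = degree(m)
    while a and degree(a) >= dm:
        a ^= m << (degree(a) - dm)
    return a

def is_irreducible(p):
    d = degree(p)
    if d < 1:
        return False
    # a reducible polynomial of degree d has a factor of degree <= d // 2
    return all(polymod(p, q) != 0 for q in range(2, 1 << (d // 2 + 1)))

def series_of(p, nterms):
    """Coefficients of 1/(1+P) as a power series in u = 1/t, mod u^nterms."""
    d = degree(p)
    q = p ^ 1  # 1 + P (flip the constant coefficient; char 2)
    # Q(u) = u^d * (1+P)(1/u); its j-th coefficient is the t^{d-j} coeff of 1+P
    qc = [(q >> (d - j)) & 1 for j in range(d + 1)]
    b = [0] * nterms
    b[0] = 1  # Q(0) = 1 since P is monic, so Q is invertible in F_2[[u]]
    for k in range(1, nterms):
        b[k] = sum(qc[j] & b[k - j] for j in range(1, min(k, d) + 1)) % 2
    # 1/(1+P) = u^d / Q(u): shift the inverse series by d
    return ([0] * d + b[: nterms - d]) if d < nterms else [0] * nterms

D = 8  # the u^k coefficient only involves irreducibles of degree <= k
total = [0] * (D + 1)
for p in range(2, 1 << (D + 1)):
    if is_irreducible(p):
        for k, c in enumerate(series_of(p, D + 1)):
            total[k] = (total[k] + c) % 2
print(total[1:])  # all zero, as the identity predicts
```

For instance, ${P = t+1}$ contributes ${1/t = u}$ and ${P = t}$ contributes ${u/(1+u) = u + u^2 + \dots}$, so the ${u}$ coefficients already cancel between the two degree-one irreducibles.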

The Chowla conjecture asserts that all non-trivial correlations of the Liouville function are asymptotically negligible; for instance, it asserts that

$\displaystyle \sum_{n \leq X} \lambda(n) \lambda(n+h) = o(X)$

as ${X \rightarrow \infty}$ for any fixed natural number ${h}$. This conjecture remains open, though there are a number of partial results (e.g. these two previous results of Matomaki, Radziwill, and myself).
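For a rough numerical illustration (my own, assuming nothing beyond the definition of ${\lambda(n) = (-1)^{\Omega(n)}}$), one can evaluate the normalised pair correlation at ${h = 1}$ up to ${X = 10^5}$ using a smallest-prime-factor sieve; the ratio is indeed small:

```python
N = 10**5
# smallest-prime-factor sieve
spf = list(range(N + 2))
for i in range(2, int((N + 1) ** 0.5) + 1):
    if spf[i] == i:  # i is prime
        for j in range(i * i, N + 2, i):
            if spf[j] == j:
                spf[j] = i
# Liouville function: lam[n] = (-1)^{Omega(n)}, built by stripping one prime at a time
lam = [0, 1] + [0] * N
for n in range(2, N + 2):
    lam[n] = -lam[n // spf[n]]
corr = sum(lam[n] * lam[n + 1] for n in range(1, N + 1))
print(corr / N)  # small compared to 1, consistent with the conjectured o(X)
```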

A natural generalisation of Chowla’s conjecture was proposed by Elliott. For simplicity we will only consider Elliott’s conjecture for the pair correlations

$\displaystyle \sum_{n \leq X} g(n) \overline{g}(n+h).$

For such correlations, the conjecture was that one had

$\displaystyle \sum_{n \leq X} g(n) \overline{g}(n+h) = o(X) \ \ \ \ \ (1)$

as ${X \rightarrow \infty}$ for any natural number ${h}$, as long as ${g}$ was a completely multiplicative function with magnitude bounded by ${1}$, and such that

$\displaystyle \sum_p \hbox{Re} \frac{1 - g(p) \overline{\chi(p)} p^{-it}}{p} = +\infty \ \ \ \ \ (2)$

for any Dirichlet character ${\chi}$ and any real number ${t}$. In the language of “pretentious number theory”, as developed by Granville and Soundararajan, the hypothesis (2) asserts that the completely multiplicative function ${g}$ does not “pretend” to be like the completely multiplicative function ${n \mapsto \chi(n) n^{it}}$ for any character ${\chi}$ and real number ${t}$. A condition of this form is necessary; for instance, if ${g(n)}$ is precisely equal to ${\chi(n) n^{it}}$ and ${\chi}$ has period ${q}$, then ${g(n) \overline{g}(n+q)}$ is equal to ${1_{(n,q)=1} + o(1)}$ as ${n \rightarrow \infty}$ and (1) clearly fails. The prime number theorem in arithmetic progressions implies that the Liouville function obeys (2), and so the Elliott conjecture contains the Chowla conjecture as a special case.
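This failure of (1) for pretentious ${g}$ is easy to observe numerically. The sketch below (an illustration I am adding, with the concrete choices ${\chi = \chi_3}$ the non-principal character mod ${3}$, ${t = 1}$, and shift ${h = q = 3}$) computes ${\frac{1}{X} |\sum_{n \leq X} g(n) \overline{g}(n+q)|}$ and finds it close to ${2/3}$, the density of ${n}$ coprime to ${3}$, rather than tending to zero:

```python
# g(n) = chi(n) * n^{it}: a "pretentious" completely multiplicative function
# whose pair correlation at shift q = 3 does NOT decay.
def chi(n):
    return [0, 1, -1][n % 3]  # non-principal character mod 3 (real-valued)

t, q, X = 1.0, 3, 10**5
# g(n) * conj(g(n+q)) = chi(n) * chi(n+q) * (n/(n+q))^{it}
s = sum(chi(n) * chi(n + q) * (n / (n + q)) ** (1j * t) for n in range(1, X + 1))
print(abs(s) / X)  # approximately 2/3, far from o(1)
```

The point is that ${\chi(n)\chi(n+3) = \chi(n)^2 = 1_{(n,3)=1}}$ and ${(n/(n+3))^{it} \rightarrow 1}$, so almost all terms are essentially ${1}$.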

As it turns out, Elliott’s conjecture is false as stated, with the counterexample ${g}$ having the property that ${g}$ “pretends” locally to be the function ${n \mapsto n^{it_j}}$ for ${n}$ in various intervals ${[1, X_j]}$, where ${X_j}$ and ${t_j}$ go to infinity in a certain prescribed sense. See this paper of Matomaki, Radziwill, and myself for details. However, we view this as a technicality, and continue to believe that certain “repaired” versions of Elliott’s conjecture still hold. For instance, our counterexample does not apply when ${g}$ is restricted to be real-valued rather than complex, and we believe that Elliott’s conjecture is valid in this setting. Returning to the complex-valued case, we still expect the asymptotic (1) provided that the condition (2) is replaced by the stronger condition

$\displaystyle \inf_{|t| \leq X} |\sum_{p \leq X} \hbox{Re} \frac{1 - g(p) \overline{\chi(p)} p^{-it}}{p}| \rightarrow +\infty$

as ${X \rightarrow +\infty}$ for all fixed Dirichlet characters ${\chi}$. In our paper we supported this claim by establishing a certain “averaged” version of this conjecture; see that paper for further details. (See also this recent paper of Frantzikinakis and Host which establishes a different averaged version of this conjecture.)

One can make a stronger “non-asymptotic” version of this corrected Elliott conjecture, in which the ${X}$ parameter does not go to infinity, or equivalently that the function ${g}$ is permitted to depend on ${X}$:

Conjecture 1 (Non-asymptotic Elliott conjecture) Let ${\varepsilon > 0}$, let ${A \geq 1}$ be sufficiently large depending on ${\varepsilon}$, and let ${X}$ be sufficiently large depending on ${A,\varepsilon}$. Suppose that ${g}$ is a completely multiplicative function with magnitude bounded by ${1}$, such that

$\displaystyle \inf_{|t| \leq AX} |\sum_{p \leq X} \hbox{Re} \frac{1 - g(p) \overline{\chi(p)} p^{-it}}{p}| \geq A$

for all Dirichlet characters ${\chi}$ of period at most ${A}$. Then one has

$\displaystyle |\sum_{n \leq X} g(n) \overline{g(n+h)}| \leq \varepsilon X$

for all natural numbers ${1 \leq h \leq 1/\varepsilon}$.

The ${\varepsilon}$-dependent factor ${A}$ in the constraint ${|t| \leq AX}$ is necessary, as can be seen by considering the completely multiplicative function ${g(n) := n^{2iX}}$ (for instance). Again, the results in my previous paper with Matomaki and Radziwill can be viewed as establishing an averaged version of this conjecture.

Meanwhile, we have the following conjecture that is the focus of the Polymath5 project:

Conjecture 2 (Erdös discrepancy conjecture) For any function ${f: {\bf N} \rightarrow \{-1,+1\}}$, the discrepancy

$\displaystyle \sup_{n,d \in {\bf N}} |\sum_{j=1}^n f(jd)|$

is infinite.

It is instructive to compute some near-counterexamples to Conjecture 2 that illustrate the difficulty of the Erdös discrepancy problem. The first near-counterexample is that of a non-principal Dirichlet character ${f(n) = \chi(n)}$ that takes values in ${\{-1,0,+1\}}$ rather than ${\{-1,+1\}}$. For this function, one has from the complete multiplicativity of ${\chi}$ that

$\displaystyle |\sum_{j=1}^n f(jd)| = |\sum_{j=1}^n \chi(j) \chi(d)|$

$\displaystyle \leq |\sum_{j=1}^n \chi(j)|.$

If ${q}$ denotes the period of ${\chi}$, then ${\chi}$ has mean zero on every interval of length ${q}$, and thus

$\displaystyle |\sum_{j=1}^n f(jd)| \leq |\sum_{j=1}^n \chi(j)| \leq q.$

Thus ${\chi}$ has bounded discrepancy.
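This boundedness is easy to confirm by direct computation (an illustration I am adding, with ${\chi = \chi_3}$ the non-principal character of period ${3}$):

```python
def chi3(n):
    return [0, 1, -1][n % 3]  # non-principal character mod 3

# discrepancy of chi_3 over homogeneous progressions {d, 2d, ..., nd}
disc = max(
    abs(sum(chi3(j * d) for j in range(1, n + 1)))
    for d in range(1, 50)
    for n in range(1, 200)
)
print(disc)  # stays bounded: it never exceeds 1, comfortably below q = 3
```

Indeed ${\sum_{j \leq n} \chi_3(j)}$ cycles through ${1, 0, 0}$, so every homogeneous partial sum has magnitude at most ${1}$.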

Of course, this is not a true counterexample to Conjecture 2 because ${\chi}$ can take the value ${0}$. Let us now consider the following variant example, which is the simplest member of a family of examples studied by Borwein, Choi, and Coons. Let ${\chi = \chi_3}$ be the non-principal Dirichlet character of period ${3}$ (thus ${\chi(n)}$ equals ${+1}$ when ${n=1 \hbox{ mod } 3}$, ${-1}$ when ${n = 2 \hbox{ mod } 3}$, and ${0}$ when ${n = 0 \hbox{ mod } 3}$), and define the completely multiplicative function ${f = \tilde \chi: {\bf N} \rightarrow \{-1,+1\}}$ by setting ${\tilde \chi(p) := \chi(p)}$ when ${p \neq 3}$ and ${\tilde \chi(3) = +1}$. This is about the simplest modification one can make to the previous near-counterexample to eliminate the zeroes. Now consider the sum

$\displaystyle \sum_{j=1}^n \tilde \chi(j)$

with ${n := 1 + 3 + 3^2 + \dots + 3^k}$ for some large ${k}$. Writing ${j = 3^a m}$ with ${m}$ coprime to ${3}$ and ${a}$ at most ${k}$, we can write this sum as

$\displaystyle \sum_{a=0}^k \sum_{1 \leq m \leq n/3^a} \tilde \chi(3^a m).$

Now observe that ${\tilde \chi(3^a m) = \tilde \chi(3)^a \tilde \chi(m) = \chi(m)}$. The function ${\chi}$ has mean zero on every interval of length three, and ${\lfloor n/3^a\rfloor}$ is equal to ${1}$ mod ${3}$, and thus

$\displaystyle \sum_{1 \leq m \leq n/3^a} \tilde \chi(3^a m) = 1$

for every ${a=0,\dots,k}$, and thus

$\displaystyle \sum_{j=1}^n \tilde \chi(j) = k+1 \gg \log n.$

Thus ${\tilde \chi}$ also has unbounded discrepancy, but only barely so (it grows logarithmically in ${n}$). These examples suggest that the main “enemy” to proving Conjecture 2 comes from completely multiplicative functions ${f}$ that somehow “pretend” to be like a Dirichlet character but do not vanish at the zeroes of that character. (Indeed, the special case of Conjecture 2 when ${f}$ is completely multiplicative is already open, and appears to be an important subcase.)
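The logarithmic growth in this example is easy to observe directly. The following sketch (my own check of the computation above) evaluates the partial sums of ${\tilde \chi}$ at ${n = 1 + 3 + \dots + 3^k}$:

```python
def tchi(n):
    # completely multiplicative: tchi(3) = +1 and tchi(p) = chi_3(p) for p != 3,
    # i.e. strip all factors of 3, then apply chi_3
    while n % 3 == 0:
        n //= 3
    return [0, 1, -1][n % 3]

sums = []
for k in range(7):
    n = (3 ** (k + 1) - 1) // 2  # n = 1 + 3 + ... + 3^k
    sums.append(sum(tchi(j) for j in range(1, n + 1)))
print(sums)  # [1, 2, 3, 4, 5, 6, 7]: the partial sum at this n is exactly k + 1
```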

All of these conjectures remain open. However, I would like to record in this blog post the following striking connection, illustrating the power of the Elliott conjecture (particularly in its nonasymptotic formulation):

Theorem 3 (Elliott conjecture implies unbounded discrepancy) Conjecture 1 implies Conjecture 2.

The argument relies heavily on two observations that were previously made in connection with the Polymath5 project. The first is a Fourier-analytic reduction that replaces the Erdos Discrepancy Problem with an averaged version for completely multiplicative functions ${g}$. An application of Cauchy-Schwarz then shows that any counterexample to that version will violate the conclusion of Conjecture 1, so if one assumes that conjecture then ${g}$ must pretend to be like a function of the form ${n \mapsto \chi(n) n^{it}}$. One then uses a generalisation of a second argument from Polymath5 to rule out this case, basically by reducing matters to a more complicated version of the Borwein-Choi-Coons analysis. Details are provided below the fold.

There is some hope that the Chowla and Elliott conjectures can be attacked, as the parity barrier which is so impervious to attack for the twin prime conjecture seems to be more permeable in this setting. (For instance, in my previous post I raised a possible approach, based on establishing expander properties of a certain random graph, which seems to get around the parity problem, in principle at least.)

(Update, Sep 25: fixed some treatment of error terms, following a suggestion of Andrew Granville.)

The (presumably) final article arising from the Polymath8 project has now been uploaded to the arXiv as “The “bounded gaps between primes” Polymath project – a retrospective“.  This article, submitted to the Newsletter of the European Mathematical Society, consists of personal contributions from ten different participants (at varying levels of stage of career, and intensity of participation) on their own experiences with the project, and some thoughts as to what lessons to draw for any subsequent Polymath projects.  (At present, I do not know of any such projects being proposed, but from recent experience I would imagine that some opportunity suitable for a Polymath approach will present itself at some point in the near future.)

This post will also serve as the latest (and probably last) of the Polymath8 threads (rolling over this previous post), to wrap up any remaining discussion about any aspect of this project.

I’ve just uploaded to the arXiv the D.H.J. Polymath paper “Variants of the Selberg sieve, and bounded intervals containing many primes“, which is the second paper to be produced from the Polymath8 project (the first one being discussed here). We’ll refer to this latter paper here as the Polymath8b paper, and the former as the Polymath8a paper. As with Polymath8a, the Polymath8b paper is concerned with the smallest asymptotic prime gap

$\displaystyle H_1 := \liminf_{n \rightarrow \infty}(p_{n+1}-p_n),$

where ${p_n}$ denotes the ${n^{th}}$ prime, as well as the more general quantities

$\displaystyle H_m := \liminf_{n \rightarrow \infty}(p_{n+m}-p_n).$

In the breakthrough paper of Goldston, Pintz, and Yildirim, the bound ${H_1 \leq 16}$ was obtained under the strong hypothesis of the Elliott-Halberstam conjecture. An unconditional bound on ${H_1}$, however, remained elusive until the celebrated work of Zhang last year, who showed that

$\displaystyle H_1 \leq 70{,}000{,}000.$

The Polymath8a paper then improved this to ${H_1 \leq 4{,}680}$. After that, Maynard introduced a new multidimensional Selberg sieve argument that gave the substantial improvement

$\displaystyle H_1 \leq 600$

unconditionally, and ${H_1 \leq 12}$ on the Elliott-Halberstam conjecture; furthermore, bounds on ${H_m}$ for higher ${m}$ were obtained for the first time, and specifically that ${H_m \ll m^3 e^{4m}}$ for all ${m \geq 1}$, with the improvements ${H_2 \leq 600}$ and ${H_m \ll m^3 e^{2m}}$ on the Elliott-Halberstam conjecture. (I had independently discovered the multidimensional sieve idea, although I did not obtain Maynard’s specific numerical results, and my asymptotic bounds were a bit weaker.)

In Polymath8b, we obtain some further improvements. Unconditionally, we have ${H_1 \leq 246}$ and ${H_m \ll m e^{(4 - \frac{28}{157}) m}}$, together with some explicit bounds on ${H_2,H_3,H_4,H_5}$; on the Elliott-Halberstam conjecture we have ${H_m \ll m e^{2m}}$ and some numerical improvements to the ${H_2,H_3,H_4,H_5}$ bounds; and assuming the generalised Elliott-Halberstam conjecture we have the bound ${H_1 \leq 6}$, which is best possible from sieve-theoretic methods thanks to the parity problem obstruction.

There were a variety of methods used to establish these results. Maynard’s paper obtained a criterion for bounding ${H_m}$ which reduced to finding a good solution to a certain multidimensional variational problem. When the dimension parameter ${k}$ was relatively small (e.g. ${k \leq 100}$), we were able to obtain good numerical solutions either by continuing the method of Maynard (using a basis of symmetric polynomials) or by using a Krylov iteration scheme. For large ${k}$, we refined the asymptotics and obtained near-optimal solutions of the variational problem. For the ${H_1}$ bounds, we extended the reach of the multidimensional Selberg sieve (particularly under the assumption of the generalised Elliott-Halberstam conjecture) by allowing the function ${F}$ in the multidimensional variational problem to extend to a larger region of space than was previously admissible, albeit with some tricky new constraints on ${F}$ (and penalties in the variational problem). This required some unusual sieve-theoretic manipulations, notably an “epsilon trick”, ultimately relying on the elementary inequality ${(a+b)^2 \geq a^2 + 2ab}$, that allowed one to get non-trivial lower bounds for sums such as ${\sum_n (a(n)+b(n))^2}$ even if the sum ${\sum_n b(n)^2}$ had no non-trivial estimates available; and a way to estimate divisor sums such as ${\sum_{n\leq x} \sum_{d|n} \lambda_d}$ even if ${d}$ was permitted to be comparable to or even exceed ${x}$, by using the fundamental theorem of arithmetic to factorise ${n}$ (after restricting to the case when ${n}$ is almost prime). I hope that these sieve-theoretic tricks will be useful in future work in the subject.

With this paper, the Polymath8 project is almost complete; there is still a little bit of scope to push our methods further and get some modest improvement for instance to the ${H_1 \leq 246}$ bound, but this would require a substantial amount of effort, and it is probably best to instead wait for some new breakthrough in the subject to come along. One final task we are performing is to write up a retrospective article on both the 8a and 8b experiences, an incomplete writeup of which can be found here. If anyone wishes to contribute some commentary on these projects (whether you were an active contributor, an occasional contributor, or a silent “lurker” in the online discussion), please feel free to do so in the comments to this post.

This should be the final thread (for now, at least) for the Polymath8 project (encompassing the original Polymath8a paper, the nearly finished Polymath8b paper, and the retrospective paper), superseding the previous Polymath8b thread (which was quite full) and the Polymath8a/retrospective thread (which was more or less inactive).

On Polymath8a: I talked briefly with Andrew Granville, who is handling the paper for Algebra & Number Theory, and he said that a referee report should be coming in soon.  Apparently the length of the paper is a bit of an issue (not surprising, as it is 163 pages long) and there will be some suggestions to trim the size down a bit.

In view of the length issue at A&NT, I’m now leaning towards taking up Ken Ono’s offer to submit the Polymath8b paper to the new open access journal “Research in the Mathematical Sciences“.  I think the paper is almost ready to be submitted (after the current participants sign off on it, of course), but it might be worth waiting on the Polymath8a referee report in case the changes suggested impact the 8b paper.

Finally, it is perhaps time to start working on the retrospective article, and collect some impressions from participants.  I wrote up a quick draft of my own experiences, and also pasted in Pace Nielsen’s thoughts, as well as a contribution from an undergraduate following the project (Andrew Gibson).  Hopefully we can collect a few more (either through comments on this blog, through email, or through Dropbox), and then start working on editing them together and finding some suitable concluding points to make about the Polymath8 project, and what lessons we can take from it for future projects of this type.