You are currently browsing the category archive for the ‘question’ category.

This is the fourth “research” thread of the Polymath15 project to upper bound the de Bruijn-Newman constant ${\Lambda}$, continuing https://terrytao.wordpress.com/2018/01/24/polymath-proposal-upper-bounding-the-de-bruijn-newman-constant/. Progress will be summarised at this Polymath wiki page.

We are getting closer to finishing off the following test problem: can one show that ${H_t(x+iy) \neq 0}$ whenever ${t = y = 0.4}$, ${x \geq 0}$? This would morally show that ${\Lambda \leq 0.48}$. A wiki page for this problem has now been created here. We have obtained a number of approximations ${A+B, A'+B', A^{eff}+B^{eff}, A^{toy}+B^{toy}}$ to ${H_t}$ (see wiki page), though numeric evidence indicates that the approximations are all very close to each other. (Many of these approximations come with a correction term ${C}$, but thus far it seems that we may be able to avoid having to use this refinement to the approximations.) The effective approximation ${A^{eff} + B^{eff}}$ also comes with an effective error bound

$\displaystyle |H_t - A^{eff} - B^{eff}| \leq E_1 + E_2 + E_3$

for some explicit (but somewhat messy) error terms ${E_1,E_2,E_3}$: see this wiki page for details. The original approximations ${A+B, A'+B'}$ can be considered deprecated at this point in favour of the (slightly more complicated) approximation ${A^{eff}+B^{eff}}$; the approximation ${A^{toy}+B^{toy}}$ is a simplified version of ${A^{eff}+B^{eff}}$ which is not quite as accurate but might be useful for testing purposes.

It is convenient to normalise everything by an explicit non-zero factor ${B^{eff}_0}$. Asymptotically, ${(A^{eff} + B^{eff}) / B^{eff}_0}$ converges to 1; numerically, it appears that its magnitude (and also its real part) stays roughly between 0.4 and 3 in the range ${10^5 \leq x \leq 10^6}$, and we seem to be able to keep it (or at least the toy counterpart ${(A^{toy} + B^{toy}) / B^{toy}_0}$) away from zero starting from about ${x \geq 4 \times 10^6}$ (here it seems that there is a useful trick of multiplying by Euler-type factors like ${1 - \frac{1}{2^{1-s}}}$ to cancel off some of the oscillation). Also, the bounds on the error ${(H_t - A^{eff} - B^{eff}) / B^{eff}_0}$ seem to be of size about 0.1 or better in these ranges also. So we seem to be on track to be able to rigorously eliminate zeroes starting from about ${x \geq 10^5}$ or so. We have not discussed too much what to do with the small values of ${x}$; at some point our effective error bounds will become unusable, and we may have to find some more faster ways to compute ${H_t}$.

In addition to this main direction of inquiry, there have been additional discussions on the dynamics of zeroes, and some numerical investigations of the behaviour Lehmer pairs under heat flow. Contributors are welcome to summarise any findings from these discussions from previous threads (or on any other related topic, e.g. improvements in the code) in the comments below.

This is the third “research” thread of the Polymath15 project to upper bound the de Bruijn-Newman constant ${\Lambda}$, continuing this previous thread. Discussion of the project of a non-research nature can continue for now in the existing proposal thread. Progress will be summarised at this Polymath wiki page.

We are making progress on the following test problem: can one show that ${H_t(x+iy) \neq 0}$ whenever ${t = 0.4}$, ${x \geq 0}$, and ${y \geq 0.4}$? This would imply that

$\displaystyle \Lambda \leq 0.4 + \frac{1}{2} (0.4)^2 = 0.48$

which would be the first quantitative improvement over the de Bruijn bound of ${\Lambda \leq 1/2}$ (or the Ki-Kim-Lee refinement of ${\Lambda < 1/2}$). Of course we can try to lower the two parameters of ${0.4}$ later on in the project, but this seems as good a place to start as any. One could also potentially try to use finer analysis of dynamics of zeroes to improve the bound ${\Lambda \leq 0.48}$ further, but this seems to be a less urgent task.

Probably the hardest case is ${y=0.4}$, as there is a good chance that one can then recover the ${y>0.4}$ case by a suitable use of the argument principle. Here we appear to have a workable Riemann-Siegel type formula that gives a tractable approximation for ${H_t}$. To describe this formula, first note that in the ${t=0}$ case we have

$\displaystyle H_0(z) = \frac{1}{8} \xi( \frac{1+iz}{2} )$

and the Riemann-Siegel formula gives

$\displaystyle \xi(s) = \frac{s(s-1)}{2} \pi^{-s/2} \Gamma(s/2) \sum_{n=1}^N \frac{1}{n^s}$

$\displaystyle + \frac{s(s-1)}{2} \pi^{-(1-s)/2} \Gamma((1-s)/2) \sum_{m=1}^M \frac{1}{m^{1-s}}$

$\displaystyle + \frac{s(s-1)}{2} \pi^{-s/2} \Gamma(s/2) \frac{e^{-i\pi s} \Gamma(1-s)}{2\pi i} \int_{C_M} \frac{w^{s-1} e^{-Nw}}{e^w-1}\ dw$

for any natural numbers ${N,M}$, where ${C_M}$ is a contour from ${+\infty}$ to ${+\infty}$ that winds once anticlockwise around the zeroes ${e^{2\pi im}, |m| \leq M}$ of ${e^w-1}$ but does not wind around any other zeroes. A good choice of ${N,M}$ to use here is

$\displaystyle N=M=\lfloor \sqrt{\mathrm{Im}(s)/2\pi}\rfloor = \lfloor \sqrt{\mathrm{Re}(z)/4\pi} \rfloor. \ \ \ \ \ (1)$

In this case, a classical steepest descent computation (see wiki) yields the approximation

$\displaystyle \int_{C_M} \frac{w^{s-1} e^{-Nw}}{e^w-1}\ dw \approx - (2\pi i M)^{s-1} \Psi( \frac{s}{2\pi i M} - N )$

where

$\displaystyle \Psi(\alpha) := 2\pi \frac{\cos \pi(\frac{1}{2}\alpha^2 - \alpha - \pi/8)}{\cos(\pi \alpha)} \exp( \frac{i\pi}{2} \alpha^2 - \frac{5\pi i}{8} ).$

Thus we have

$\displaystyle H_0(z) \approx A^{(0)} + B^{(0)} - C^{(0)}$

where

$\displaystyle A^{(0)} := \frac{1}{8} \frac{s(s-1)}{2} \pi^{-s/2} \Gamma(s/2) \sum_{n=1}^N \frac{1}{n^s}$

$\displaystyle B^{(0)} := \frac{1}{8} \frac{s(s-1)}{2} \pi^{-(1-s)/2} \Gamma((1-s)/2) \sum_{m=1}^M \frac{1}{m^{1-s}}$

$\displaystyle C^{(0)} := \frac{s(s-1)}{2} \pi^{-s/2} \Gamma(s/2) \frac{e^{-i\pi s} \Gamma(1-s)}{2\pi i} (2\pi i M)^{s-1} \Psi( \frac{s}{2\pi i M} - N )$

with ${s := \frac{1+iz}{2}}$ and ${N,M}$ given by (1).

Heuristically, we have derived (see wiki) the more general approximation

$\displaystyle H_t(z) \approx A + B - C$

for ${t>0}$ (and in particular for ${t=0.4}$), where

$\displaystyle A := \frac{1}{8} \frac{s(s-1)}{2} \pi^{-s/2} \Gamma(s/2) \sum_{n=1}^N \frac{\exp(\frac{t}{16} \log^2 \frac{s+4}{2\pi n^2} )}{n^s}$

$\displaystyle B := \frac{1}{8} \frac{s(s-1)}{2} \pi^{-(1-s)/2} \Gamma((1-s)/2) \sum_{m=1}^M \frac{\exp(\frac{t}{16} \log^2 \frac{5-s}{2\pi m^2} )}{m^{1-s}}$

$\displaystyle C := \exp(-\frac{t \pi^2}{64}) C^{(0)}.$

In practice it seems that the ${C}$ term is negligible once the real part ${x}$ of ${z}$ is moderately large, so one also has the approximation

$\displaystyle H_t(z) \approx A + B.$

For large ${x}$, and for fixed ${t,y>0}$, e.g. ${t=y=0.4}$, the sums ${A,B}$ converge fairly quickly (in fact the situation seems to be significantly better here than the much more intensively studied ${t=0}$ case), and we expect the first term

$\displaystyle B_0 := \frac{1}{8} \frac{s(s-1)}{2} \pi^{-(1-s)/2} \Gamma((1-s)/2) \exp( \frac{t}{16} \log^2 \frac{5-s}{2\pi} )$

of the ${B}$ series to dominate. Indeed, analytically we know that ${\frac{A+B-C}{B_0} \rightarrow 1}$ (or ${\frac{A+B}{B_0} \rightarrow 1}$) as ${x \rightarrow \infty}$ (holding ${y}$ fixed), and it should also be provable that ${\frac{H_t}{B_0} \rightarrow 1}$ as well. Numerically with ${t=y=0.4}$, it seems in fact that ${\frac{A+B-C}{B_0}}$ (or ${\frac{A+B}{B_0}}$) stay within a distance of about ${1/2}$ of ${1}$ once ${x}$ is moderately large (e.g. ${x \geq 2 \times 10^5}$). This raises the hope that one can solve the toy problem of showing ${H_t(x+iy) \neq 0}$ for ${t=y=0.4}$ by numerically controlling ${H_t(x+iy) / B_0}$ for small ${x}$ (e.g. ${x \leq 2 \times 10^5}$), numerically controlling ${(A+B)/B_0}$ and analytically bounding the error ${(H_t - A - B)/B_0}$ for medium ${x}$ (e.g. ${2 \times 10^5 \leq x \leq 10^7}$), and analytically bounding both ${(A+B)/B_0}$ and ${(H_t-A-B)/B_0}$ for large ${x}$ (e.g. ${x \geq 10^7}$). (These numbers ${2 \times 10^5}$ and ${10^7}$ are arbitrarily chosen here and may end up being optimised to something else as the computations become clearer.)

Thus, we now have four largely independent tasks (for suitable ranges of “small”, “medium”, and “large” ${x}$):

1. Numerically computing ${H_t(x+iy) / B_0}$ for small ${x}$ (with enough accuracy to verify that there are no zeroes)
2. Numerically computing ${(A+B)/B_0}$ for medium ${x}$ (with enough accuracy to keep it away from zero)
3. Analytically bounding ${(A+B)/B_0}$ for large ${x}$ (with enough accuracy to keep it away from zero); and
4. Analytically bounding ${(H_t - A - B)/B_0}$ for medium and large ${x}$ (with a bound that is better than the bound away from zero in the previous two tasks).

Note that tasks 2 and 3 do not directly require any further understanding of the function ${H_t}$.

Below we will give a progress report on the numeric and analytic sides of these tasks.

— 1. Numerics report (contributed by Sujit Nair) —

There is some progress on the code side but not at the pace I was hoping. Here are a few things which happened (rather, mistakes which were taken care of).

1. We got rid of code which wasn’t being used. For example, @dhjpolymath computed ${H_t}$ based on an old version but only realized it after the fact.
2. We implemented tests to catch human/numerical bugs before a computation starts. Again, we lost some numerical cycles but moving forward these can be avoided.
3. David got set up on GitHub and he is able to compare his output (in C) with the Python code. That is helping a lot.

Two areas which were worked on were

1. Computing ${H_t}$ and zeroes for ${t}$ around ${0.4}$
2. Computing quantities like ${(A+B-C)/B_0}$, ${(A+B)/B_0}$, ${C/B_0}$, etc. with the goal of understanding the zero free regions.

Some observations for ${t=0.4}$, ${y=0.4}$, ${x \in ( 10^4, 10^7)}$ include:

• ${(A+B) / B_0}$ does seem to avoid the negative real axis
• ${|(A+B) / B0| > 0.4}$ (based on the oscillations and trends in the plots)
• ${|C/B_0|}$ seems to be settling around ${10^{-4}}$ range.

See the figure below. The top plot is on the complex plane and the bottom plot is the absolute value. The code to play with this is here.

— 2. Analysis report —

The Riemann-Siegel formula and some manipulations (see wiki) give ${H_0 = A^{(0)} + B^{(0)} - \tilde C^{(0)}}$, where

$\displaystyle A^{(0)} = \frac{2}{8} \sum_{n=1}^N \int_C \exp( \frac{s+4}{2} u - e^u - \frac{s}{2} \log(\pi n^2) )\ du$

$\displaystyle - \frac{3}{8} \sum_{n=1}^N \int_C \exp( \frac{s+2}{2} u - e^u - \frac{s}{2} \log(\pi n^2) )\ du$

$\displaystyle B^{(0)} = \frac{2}{8} \sum_{m=1}^M \int_{\overline{C}} \exp( \frac{5-s}{2} u - e^u - \frac{1-s}{2} \log(\pi m^2) )\ du$

$\displaystyle - \frac{3}{8} \sum_{m=1}^M \int_C \exp( \frac{3-s}{2} u - e^u - \frac{1-s}{2} \log(\pi m^2) )\ du$

$\displaystyle \tilde C^{(0)} := -\frac{2}{8} \sum_{n=0}^\infty \frac{e^{-i\pi s/2} e^{i\pi s n}}{2^s \pi^{1/2}} \int_{\overline{C}} \int_{C_M} \frac{w^{s-1} e^{-Nw}}{e^w-1} \exp( \frac{5-s}{2} u - e^u)\ du dw$

$\displaystyle +\frac{3}{8} \sum_{n=0}^\infty \frac{e^{-i\pi s/2} e^{i\pi s n}}{2^s \pi^{1/2}} \int_{\overline{C}} \int_{C_M} \frac{w^{s-1} e^{-Nw}}{e^w-1} \exp( \frac{3-s}{2} u - e^u)\ du dw$

where ${C}$ is a contour that goes from ${+i\infty}$ to ${+\infty}$ staying a bounded distance away from the upper imaginary and right real axes, and ${\overline{C}}$ is the complex conjugate of ${C}$. (In each of these sums, it is the first term that should dominate, with the second one being about ${O(1/x)}$ as large.) One can then evolve by the heat flow to obtain ${H_t = \tilde A + \tilde B - \tilde C}$, where

$\displaystyle \tilde A := \frac{2}{8} \sum_{n=1}^N \int_C \exp( \frac{s+4}{2} u - e^u - \frac{s}{2} \log(\pi n^2) + \frac{t}{16} (u - \log(\pi n^2))^2)\ du$

$\displaystyle - \frac{3}{8} \sum_{n=1}^N \int_C \exp( \frac{s+2}{2} u - e^u - \frac{s}{2} \log(\pi n^2) + \frac{t}{16} (u - \log(\pi n^2))^2)\ du$

$\displaystyle \tilde B := \frac{2}{8} \sum_{m=1}^M \int_{\overline{C}} \exp( \frac{5-s}{2} u - e^u - \frac{1-s}{2} \log(\pi m^2) + \frac{t}{16} (u - \log(\pi m^2))^2)\ du$

$\displaystyle - \frac{3}{8} \sum_{m=1}^M \int_C \exp( \frac{3-s}{2} u - e^u - \frac{1-s}{2} \log(\pi m^2) + \frac{t}{16} (u - \log(\pi m^2))^2)\ du$

$\displaystyle \tilde C := -\frac{2}{8} \sum_{n=0}^\infty \frac{e^{-i\pi s/2} e^{i\pi s n}}{2^s \pi^{1/2}} \int_{\overline{C}} \int_{C_M}$

$\displaystyle \frac{w^{s-1} e^{-Nw}}{e^w-1} \exp( \frac{5-s}{2} u - e^u + \frac{t}{4} (i \pi(n-1/2) + \log \frac{w}{2\sqrt{\pi}} - \frac{u}{2})^2) \ du dw$

$\displaystyle +\frac{3}{8} \sum_{n=0}^\infty \frac{e^{-i\pi s/2} e^{i\pi s n}}{2^s \pi^{1/2}} \int_{\overline{C}} \int_{C_M}$

$\displaystyle \frac{w^{s-1} e^{-Nw}}{e^w-1} \exp( \frac{3-s}{2} u - e^u + \frac{t}{4} (i \pi(n-1/2) + \log \frac{w}{2\sqrt{\pi}} - \frac{u}{2})^2)\ du dw.$

Steepest descent heuristics then predict that ${\tilde A \approx A}$, ${\tilde B \approx B}$, and ${\tilde C \approx C}$. For the purposes of this project, we will need effective error estimates here, with explicit error terms.

A start has been made towards this goal at this wiki page. Firstly there is a “effective Laplace method” lemma that gives effective bounds on integrals of the form ${\int_I e^{\phi(x)} \psi(x)\ dx}$ if the real part of ${\phi(x)}$ is either monotone with large derivative, or has a critical point and is decreasing on both sides of that critical point. In principle, all one has to do is manipulate expressions such as ${\tilde A - A}$, ${\tilde B - B}$, ${\tilde C - C}$ by change of variables, contour shifting and integration by parts until it is of the form to which the above lemma can be profitably applied. As one may imagine though the computations are messy, particularly for the ${\tilde C}$ term. As a warm up, I have begun by trying to estimate integrals of the form

$\displaystyle \int_C \exp( s (1+u-e^u) + \frac{t}{16} (u+b)^2 )\ du$

for smallish complex numbers ${b}$, as these sorts of integrals appear in the form of ${\tilde A, \tilde B, \tilde C}$. As of this time of writing, there are effective bounds for the ${b=0}$ case, and I am currently working on extending them to the ${b \neq 0}$ case, which should give enough control to approximate ${\tilde A - A}$ and ${\tilde B-B}$. The most complicated task will be that of upper bounding ${\tilde C}$, but it also looks eventually doable.

This is the first official “research” thread of the Polymath15 project to upper bound the de Bruijn-Newman constant ${\Lambda}$. Discussion of the project of a non-research nature can continue for now in the existing proposal thread. Progress will be summarised at this Polymath wiki page.

The proposal naturally splits into at least three separate (but loosely related) topics:

• Numerical computation of the entire functions ${H_t(z)}$, with the ultimate aim of establishing zero-free regions of the form ${\{ x+iy: 0 \leq x \leq T, y \geq \varepsilon \}}$ for various ${T, \varepsilon > 0}$.
• Improved understanding of the dynamics of the zeroes ${z_j(t)}$ of ${H_t}$.
• Establishing the zero-free nature of ${H_t(x+iy)}$ when ${y \geq \varepsilon > 0}$ and ${x}$ is sufficiently large depending on ${t}$ and ${\varepsilon}$.

Below the fold, I will present each of these topics in turn, to initiate further discussion in each of them. (I thought about splitting this post into three to have three separate discussions, but given the current volume of comments, I think we should be able to manage for now having all the comments in a single post. If this changes then of course we can split up some of the discussion later.)

To begin with, let me present some formulae for computing ${H_t}$ (inspired by similar computations in the Ki-Kim-Lee paper) which may be useful. The initial definition of ${H_t}$ is

$\displaystyle H_t(z) := \int_0^\infty e^{tu^2} \Phi(u) \cos(zu)\ du$

where

$\displaystyle \Phi(u) := \sum_{n=1}^\infty (2\pi^2 n^4 e^{9u} - 3 \pi n^2 e^{5u}) \exp(- \pi n^2 e^{4u} )$

is a variant of the Jacobi theta function. We observe that ${\Phi}$ in fact extends analytically to the strip

$\displaystyle \{ u \in {\bf C}: -\frac{\pi}{8} < \mathrm{Im} u < \frac{\pi}{8} \}, \ \ \ \ \ (1)$

as ${e^{4u}}$ has positive real part on this strip. One can use the Poisson summation formula to verify that ${\Phi}$ is even, ${\Phi(-u) = \Phi(u)}$ (see this previous post for details). This lets us obtain a number of other formulae for ${H_t}$. Most obviously, one can unfold the integral to obtain

$\displaystyle H_t(z) = \frac{1}{2} \int_{-\infty}^\infty e^{tu^2} \Phi(u) e^{izu}\ du.$

In my previous paper with Brad, we used this representation, combined with Fubini’s theorem to swap the sum and integral, to obtain a useful series representation for ${H_t}$ in the ${t<0}$ case. Unfortunately this is not possible in the ${t>0}$ case because expressions such as ${e^{tu^2} e^{9u} \exp( -\pi n^2 e^{4u} ) e^{izu}}$ diverge as ${u}$ approaches ${-\infty}$. Nevertheless we can still perform the following contour integration manipulation. Let ${0 \leq \theta < \frac{\pi}{8}}$ be fixed. The function ${\Phi}$ decays super-exponentially fast (much faster than ${e^{tu^2}}$, in particular) as ${\mathrm{Re} u \rightarrow +\infty}$ with ${-\infty \leq \mathrm{Im} u \leq \theta}$; as ${\Phi}$ is even, we also have this decay as ${\mathrm{Re} u \rightarrow -\infty}$ with ${-\infty \leq \mathrm{Im} u \leq \theta}$ (this is despite each of the summands in ${\Phi}$ having much slower decay in this direction – there is considerable cancellation!). Hence by the Cauchy integral formula we have

$\displaystyle H_t(z) = \frac{1}{2} \int_{i\theta-\infty}^{i\theta+\infty} e^{tu^2} \Phi(u) e^{izu}\ du.$

Splitting the horizontal line from ${i\theta-\infty}$ to ${i\theta+\infty}$ at ${i\theta}$ and using the even nature of ${\Phi(u)}$, we thus have

$\displaystyle H_t(z) = \frac{1}{2} ( \int_{i\theta}^{i\theta+\infty} e^{tu^2} \Phi(u) e^{izu}\ du + \int_{-i\theta}^{-i\theta+\infty} e^{tu^2} \Phi(u) e^{-izu}\ du.$

Using the functional equation ${\Phi(\overline{u}) = \overline{\Phi(u)}}$, we thus have the representation

$\displaystyle H_t(z) = \frac{1}{2} ( K_{t,\theta}(z) + \overline{K_{t,\theta}(\overline{z})} ) \ \ \ \ \ (2)$

where

$\displaystyle K_{t,\theta}(z) := \int_{i\theta}^{i \theta+\infty} e^{tu^2} \Phi(u) e^{izu}\ du$

$\displaystyle = \sum_{n=1}^\infty 2 \pi^2 n^4 I_{t, \theta}( z - 9i, \pi n^2 ) - 3 \pi n^2 I_{t,\theta}( z - 5i, \pi n^2 )$

where ${I_{t,\theta}(b,\beta)}$ is the oscillatory integral

$\displaystyle I_{t,\theta}(b,\beta) := \int_{i\theta}^{i\theta+\infty} \exp( tu^2 - \beta e^{4u} + i b u )\ du. \ \ \ \ \ (3)$

The formula (2) is valid for any ${0 \leq \theta < \frac{\pi}{8}}$. Naively one would think that it would be simplest to take ${\theta=0}$; however, when ${z=x+iy}$ and ${x}$ is large (with ${y}$ bounded), it seems asymptotically better to take ${\theta}$ closer to ${\pi/8}$, in particular something like ${\theta = \frac{\pi}{8} - \frac{1}{4x}}$ seems to be a reasonably good choice. This is because the integrand in (3) becomes significantly less oscillatory and also much lower in amplitude; the ${\exp(ibu)}$ term in (3) now generates a factor roughly comparable to ${\exp( - \pi x/8 )}$ (which, as we will see below, is the main term in the decay asymptotics for ${H_t(x+iy)}$), while the ${\exp( - \beta e^{4u} )}$ term still exhibits a reasonable amount of decay as ${u \rightarrow \infty}$. We will use the representation (2) in the asymptotic analysis of ${H_t}$ below, but it may also be a useful representation to use for numerical purposes.

Building on the interest expressed in the comments to this previous post, I am now formally proposing to initiate a “Polymath project” on the topic of obtaining new upper bounds on the de Bruijn-Newman constant ${\Lambda}$. The purpose of this post is to describe the proposal and discuss the scope and parameters of the project.

De Bruijn introduced a family ${H_t: {\bf C} \rightarrow {\bf C}}$ of entire functions for each real number ${t}$, defined by the formula

$\displaystyle H_t(z) := \int_0^\infty e^{tu^2} \Phi(u) \cos(zu)\ du$

where ${\Phi}$ is the super-exponentially decaying function

$\displaystyle \Phi(u) := \sum_{n=1}^\infty (2\pi^2 n^4 e^{9u} - 3 \pi n^2 e^{5u}) \exp(-\pi n^2 e^{4u}).$

As discussed in this previous post, the Riemann hypothesis is equivalent to the assertion that all the zeroes of ${H_0}$ are real.

De Bruijn and Newman showed that there existed a real constant ${\Lambda}$ – the de Bruijn-Newman constant – such that ${H_t}$ has all zeroes real whenever ${t \geq \Lambda}$, and at least one non-real zero when ${t < \Lambda}$. In particular, the Riemann hypothesis is equivalent to the upper bound ${\Lambda \leq 0}$. In the opposite direction, several lower bounds on ${\Lambda}$ have been obtained over the years, most recently in my paper with Brad Rodgers where we showed that ${\Lambda \geq 0}$, a conjecture of Newman.

As for upper bounds, de Bruijn showed back in 1950 that ${\Lambda \leq 1/2}$. The only progress since then has been the work of Ki, Kim and Lee in 2009, who improved this slightly to ${\Lambda < 1/2}$. The primary proposed aim of this Polymath project is to obtain further explicit improvements to the upper bound of ${\Lambda}$. Of course, if we could lower the upper bound all the way to zero, this would solve the Riemann hypothesis, but I do not view this as a realistic outcome of this project; rather, the upper bounds that one could plausibly obtain by known methods and numerics would be comparable in achievement to the various numerical verifications of the Riemann hypothesis that exist in the literature (e.g., that the first ${N}$ non-trivial zeroes of the zeta function lie on the critical line, for various large explicit values of ${N}$).

In addition to the primary goal, one could envisage some related secondary goals of the project, such as a better understanding (both analytic and numerical) of the functions ${H_t}$ (or of similar functions), and of the dynamics of the zeroes of these functions. Perhaps further potential goals could emerge in the discussion to this post.

I think there is a plausible plan of attack on this project that proceeds as follows. Firstly, there are results going back to the original work of de Bruijn that demonstrate that the zeroes of ${H_t}$ become attracted to the real line as ${t}$ increases; in particular, if one defines ${\sigma_{max}(t)}$ to be the supremum of the imaginary parts of all the zeroes of ${H_t}$, then it is known that this quantity obeys the differential inequality

$\displaystyle \frac{d}{dt} \sigma_{max}(t) \leq - \frac{1}{\sigma_{max}(t)} \ \ \ \ \ (1)$

whenever ${\sigma_{max}(t)}$ is positive; furthermore, once ${\sigma_{max}(t) = 0}$ for some ${t}$, then ${\sigma_{max}(t') = 0}$ for all ${t' > t}$. I hope to explain this in a future post (it is basically due to the attraction that a zero off the real axis has to its complex conjugate). As a corollary of this inequality, we have the upper bound

$\displaystyle \Lambda \leq t + \frac{1}{2} \sigma_{max}(t)^2 \ \ \ \ \ (2)$

for any real number ${t}$. For instance, because all the non-trivial zeroes of the Riemann zeta function lie in the critical strip ${\{ s: 0 \leq \mathrm{Re} s \leq 1 \}}$, one has ${\sigma_{max}(0) \leq 1}$, which when inserted into (2) gives ${\Lambda \leq 1/2}$. The inequality (1) also gives ${\sigma_{max}(t) \leq \sqrt{1-2t}}$ for all ${0 \leq t \leq 1/2}$. If we could find some explicit ${t}$ between ${0}$ and ${1/2}$ where we can improve this upper bound on ${\sigma_{max}(t)}$ by an explicit constant, this would lead to a new upper bound on ${\Lambda}$.

Secondly, the work of Ki, Kim and Lee (based on an analysis of the various terms appearing in the expression for ${H_t}$) shows that for any positive ${t}$, all but finitely many of the zeroes of ${H_t}$ are real (in contrast with the ${t=0}$ situation, where it is still an open question as to whether the proportion of non-trivial zeroes of the zeta function on the critical line is asymptotically equal to ${1}$). As a key step in this analysis, Ki, Kim, and Lee show that for any ${t>0}$ and ${\varepsilon>0}$, there exists a ${T>0}$ such that all the zeroes of ${H_t}$ with real part at least ${T}$, have imaginary part at most ${\varepsilon}$. Ki, Kim and Lee do not explicitly compute how ${T}$ depends on ${t}$ and ${\varepsilon}$, but it looks like this bound could be made effective.

If so, this suggests a possible strategy to get a new upper bound on ${\Lambda}$:

• Select a good choice of parameters ${t, \varepsilon > 0}$.
• By refining the Ki-Kim-Lee analysis, find an explicit ${T}$ such that all zeroes of ${H_t}$ with real part at least ${T}$ have imaginary part at most ${\varepsilon}$.
• By a numerical computation (e.g. using the argument principle), also verify that zeroes of ${H_t}$ with real part between ${0}$ and ${T}$ have imaginary part at most ${\varepsilon}$.
• Combining these facts, we obtain that ${\sigma_{max}(t) \leq \varepsilon}$; hopefully, one can insert this into (2) and get a new upper bound for ${\Lambda}$.

Of course, there may also be alternate strategies to upper bound ${\Lambda}$, and I would imagine this would also be a legitimate topic of discussion for this project.

One appealing thing about the above strategy for the purposes of a polymath project is that it naturally splits the project into several interacting but reasonably independent parts: an analytic part in which one tries to refine the Ki-Kim-Lee analysis (based on explicitly upper and lower bounding various terms in a certain series expansion for ${H_t}$ – I may detail this later in a subsequent post); a numerical part in which one controls the zeroes of ${H_t}$ in a certain finite range; and perhaps also a dynamical part where one sees if there is any way to improve the inequality (2). For instance, the numerical “team” might, over time, be able to produce zero-free regions for ${H_t}$ with an increasingly large value of ${T}$, while in parallel the analytic “team” might produce increasingly smaller values of ${T}$ beyond which they can control zeroes, and eventually the two bounds would meet up and we obtain a new bound on ${\Lambda}$. This factoring of the problem into smaller parts was also a feature of the successful Polymath8 project on bounded gaps between primes.

The project also resembles Polymath8 in another aspect: that there is an obvious way to numerically measure progress, by seeing how the upper bound for ${\Lambda}$ decreases over time (and presumably there will also be another metric of progress regarding how well we can control ${T}$ in terms of ${t}$ and ${\varepsilon}$). However, in Polymath8 the final measure of progress (the upper bound ${H}$ on gaps between primes) was a natural number, and thus could not decrease indefinitely. Here, the bound will be a real number, and there is a possibility that one may end up having an infinite descent in which progress slows down over time, with refinements to increasingly less significant digits of the bound as the project progresses. Because of this, I think it makes sense to follow recent Polymath projects and place an expiration date for the project, for instance one year after the launch date, in which we will agree to end the project and (if the project was successful enough) write up the results, unless there is consensus at that time to extend the project. (In retrospect, we should probably have imposed similar sunset dates on older Polymath projects, some of which have now been inactive for years, but that is perhaps a discussion for another time.)

Some Polymath projects have been known for a breakneck pace, making it hard for some participants to keep up. It’s hard to control these things, but I am envisaging a relatively leisurely project here, perhaps taking the full year mentioned above. It may well be that as the project matures we will largely be waiting for the results of lengthy numerical calculations to come in, for instance. Of course, as with previous projects, we would maintain some wiki pages (and possibly some other resources, such as a code repository) to keep track of progress and also to summarise what we have learned so far. For instance, as was done with some previous Polymath projects, we could begin with some “online reading seminars” where we go through some relevant piece of literature (most obviously the Ki-Kim-Lee paper, but there may be other resources that become relevant, e.g. one could imagine the literature on numerical verification of RH to be of value).

One could also imagine some incidental outcomes of this project, such as a more efficient way to numerically establish zero free regions for various analytic functions of interest; in particular, the project may well end up focusing on some other aspect of mathematics than the specific questions posed here.

Anyway, I would be interested to hear in the comments below from others who might be interested in participating, or at least observing, this project, particularly if they have suggestions regarding the scope and direction of the project, and on organisational structure (e.g. if one should start with reading seminars, or some initial numerical exploration of the functions ${H_t}$, etc..) One could also begin some preliminary discussion of the actual mathematics of the project itself, though (in line with the leisurely pace I was hoping for), I expect that the main burst of mathematical activity would happen later, once the project is formally launched (with wiki page resources, blog posts dedicated to specific aspects of the project, etc.).

The Polymath14 online collaboration has uploaded to the arXiv its paper “Homogeneous length functions on groups“, submitted to Algebra & Number Theory. The paper completely classifies homogeneous length functions ${\| \|: G \rightarrow {\bf R}^+}$ on an arbitrary group ${G = (G,\cdot,e,()^{-1})}$, that is to say non-negative functions that obey the symmetry condition ${\|x^{-1}\| = \|x\|}$, the non-degeneracy condition ${\|x\|=0 \iff x=e}$, the triangle inequality ${\|xy\| \leq \|x\| + \|y\|}$, and the homogeneity condition ${\|x^2\| = 2\|x\|}$. It turns out that these norms can only arise from pulling back the norm of a Banach space by an isometric embedding of the group. Among other things, this shows that ${G}$ can only support a homogeneous length function if and only if it is abelian and torsion free, thus giving a metric description of this property.

The proof is based on repeated use of the homogeneous length function axioms, combined with elementary identities of commutators, to obtain increasingly good bounds on quantities such as ${\|[x,y]\|}$, until one can show that such norms have to vanish. See the previous post for a full proof. The result is robust in that it allows for some loss in the triangle inequality and homogeneity condition, allowing for some new results on “quasinorms” on groups that relate to quasihomomorphisms.

As there are now a large number of comments on the previous post on this project, this post will also serve as the new thread for any final discussion of this project as it winds down.

In the tradition of “Polymath projects“, the problem posed in the previous two blog posts has now been solved, thanks to the cumulative effect of many small contributions by many participants (including, but not limited to, Sean Eberhard, Tobias Fritz, Siddharta Gadgil, Tobias Hartnick, Chris Jerdonek, Apoorva Khare, Antonio Machiavelo, Pace Nielsen, Andy Putman, Will Sawin, Alexander Shamov, Lior Silberman, and David Speyer). In this post I’ll write down a streamlined resolution, eliding a number of important but ultimately removable partial steps and insights made by the above contributors en route to the solution.

Theorem 1 Let ${G = (G,\cdot)}$ be a group. Suppose one has a “seminorm” function ${\| \|: G \rightarrow [0,+\infty)}$ which obeys the triangle inequality

$\displaystyle \|xy \| \leq \|x\| + \|y\|$

for all ${x,y \in G}$, with equality whenever ${x=y}$. Then the seminorm factors through the abelianisation map ${G \mapsto G/[G,G]}$.

Proof: By the triangle inequality, it suffices to show that ${\| [x,y]\| = 0}$ for all ${x,y \in G}$, where ${[x,y] := xyx^{-1}y^{-1}}$ is the commutator.

We first establish some basic facts. Firstly, by hypothesis we have ${\|x^2\| = 2 \|x\|}$, and hence ${\|x^n \| = n \|x\|}$ whenever ${n}$ is a power of two. On the other hand, by the triangle inequality we have ${\|x^n \| \leq n\|x\|}$ for all positive ${n}$, and hence by the triangle inequality again we also have the matching lower bound, thus

$\displaystyle \|x^n \| = n \|x\|$

for all ${n > 0}$. The claim is also true for ${n=0}$ (apply the preceding bound with ${x=1}$ and ${n=2}$). By replacing ${\|x\|}$ with ${\max(\|x\|, \|x^{-1}\|)}$ if necessary we may now also assume without loss of generality that ${\|x^{-1} \| = \|x\|}$, thus

$\displaystyle \|x^n \| = |n| \|x\| \ \ \ \ \ (1)$

for all integers ${n}$.

Next, for any ${x,y \in G}$, and any natural number ${n}$, we have

$\displaystyle \|yxy^{-1} \| = \frac{1}{n} \| (yxy^{-1})^n \|$

$\displaystyle = \frac{1}{n} \| y x^n y^{-1} \|$

$\displaystyle \leq \frac{1}{n} ( \|y\| + n \|x\| + \|y\|^{-1} )$

so on taking limits as ${n \rightarrow \infty}$ we have ${\|yxy^{-1} \| \leq \|x\|}$. Replacing ${x,y}$ by ${yxy^{-1},y^{-1}}$ gives the matching lower bound, thus we have the conjugation invariance

$\displaystyle \|yxy^{-1} \| = \|x\|. \ \ \ \ \ (2)$

Next, we observe that if ${x,y,z,w}$ are such that ${x}$ is conjugate to both ${wy}$ and ${zw^{-1}}$, then one has the inequality

$\displaystyle \|x\| \leq \frac{1}{2} ( \|y \| + \| z \| ). \ \ \ \ \ (3)$

Indeed, if we write ${x = swys^{-1} = t zw^{-1} t^{-1}}$ for some ${s,t \in G}$, then for any natural number ${n}$ one has

$\displaystyle \|x\| = \frac{1}{2n} \| x^n x^n \|$

$\displaystyle = \frac{1}{2n} \| swy \dots wy s^{-1}t zw^{-1} \dots zw^{-1} t^{-1} \|$

where the ${wy}$ and ${zw^{-1}}$ terms each appear ${n}$ times. From (2) we see that conjugation by ${w}$ does not affect the norm. Using this and the triangle inequality several times, we conclude that

$\displaystyle \|x\| \leq \frac{1}{2n} ( \|s\| + n \|y\| + \| s^{-1} t\| + n \|z\| + \|t^{-1} \| ),$

and the claim (3) follows by sending ${n \rightarrow \infty}$.

The following special case of (3) will be of particular interest. Let ${x,y \in G}$, and for any integers ${m,k}$, define the quantity

$\displaystyle f(m,k) := \| x^m [x,y]^k \|.$

Observe that ${x^m [x,y]^k}$ is conjugate to both ${x (x^{m-1} [x,y]^k)}$ and to ${(y^{-1} x^m [x,y]^{k-1} xy) x^{-1}}$, hence by (3) one has

$\displaystyle \| x^m [x,y]^k \| \leq \frac{1}{2} ( \| x^{m-1} [x,y]^k \| + \| y^{-1} x^{m} [x,y]^{k-1} xy \|)$

which by (2) leads to the recursive inequality

$\displaystyle f(m,k) \leq \frac{1}{2} (f(m-1,k) + f(m+1,k-1)).$

We can write this in probabilistic notation as

$\displaystyle f(m,k) \leq {\bf E} f( (m,k) + X )$

where ${X}$ is a random vector that takes the values ${(-1,0)}$ and ${(1,-1)}$ with probability ${1/2}$ each. Iterating this, we conclude in particular that for any large natural number ${n}$, one has

$\displaystyle f(0,n) \leq {\bf E} f( Z )$

where ${Z := (0,n) + X_1 + \dots + X_{2n}}$ and ${X_1,\dots,X_{2n}}$ are iid copies of ${X}$. We can write ${Z = (1,-1/2) (Y_1 + \dots + Y_{2n})}$ where $Y_1,\dots,Y_{2n} = \pm 1$ are iid signs.  By the triangle inequality, we thus have

$\displaystyle f( Z ) \leq |Y_1+\dots+Y_{2n}| (\|x\| + \frac{1}{2} \| [x,y] \|),$

noting that $Y_1+\dots+Y_{2n}$ is an even integer.  On the other hand, $Y_1+\dots+Y_{2n}$ has mean zero and variance $2n$, hence by Cauchy-Schwarz

$\displaystyle f(0,n) \leq \sqrt{2n}( \|x\| + \frac{1}{2} \| [x,y] \|).$

But by (1), the left-hand side is equal to ${n \| [x,y]\|}$. Dividing by ${n}$ and then sending ${n \rightarrow \infty}$, we obtain the claim. $\Box$

The above theorem reduces such seminorms to abelian groups. It is easy to see from (1) that any torsion element of such groups has zero seminorm, so we can in fact restrict to torsion-free groups, which we now write using additive notation ${G = (G,+)}$, thus for instance ${\| nx \| = |n| \|x\|}$ for ${n \in {\bf Z}}$. We think of ${G}$ as a ${{\bf Z}}$-module. One can then extend the seminorm to the associated ${{\bf Q}}$-vector space ${G \otimes_{\bf Z} {\bf Q}}$ by the formula ${\|\frac{a}{b} x\| := \frac{a}{b} \|x\|}$, and then to the associated ${{\bf R}}$-vector space ${G \otimes_{\bf Z} {\bf R}}$ by continuity, at which point it becomes a genuine seminorm (provided we have ensured the symmetry condition ${\|x\| = \|x^{-1}\|}$). Conversely, any seminorm on ${G \otimes_{\bf Z} {\bf R}}$ induces a seminorm on ${G}$. (These arguments also appear in this paper of Khare and Rajaratnam.)

This post is a continuation of the previous post, which has attracted a large number of comments. I’m recording here some calculations that arose from those comments (particularly those of Pace Nielsen, Lior Silberman, Tobias Fritz, and Apoorva Khare). Please feel free to either continue these calculations or to discuss other approaches to the problem, such as those mentioned in the remaining comments to the previous post.

Let ${F_2}$ be the free group on two generators ${a,b}$, and let ${\| \|: F_2 \rightarrow {\bf R}^+}$ be a quantity obeying the triangle inequality

$\displaystyle \| xy\| \leq \|x \| + \|y\|$

and the linear growth property

$\displaystyle \| x^n \| = |n| \| x\|$

for all ${x,y \in F_2}$ and integers ${n \in {\bf Z}}$; this implies the conjugation invariance

$\displaystyle \| y^{-1} x y \| = \|x\|$

or equivalently

$\displaystyle \| xy \| = \| yx\|$

We consider inequalities of the form

$\displaystyle \| xyx^{-1}y^{-1} \| \leq \alpha \|x\| + \beta \| y\| \ \ \ \ \ (1)$

or

$\displaystyle \| xyx^{-2}y^{-1} \| \leq \gamma \|x\| + \delta \| y\| \ \ \ \ \ (2)$

for various real numbers ${\alpha,\beta,\gamma,\delta}$. For instance, since

$\displaystyle \| xyx^{-1}y^{-1} \| \leq \| xyx^{-1}\| + \|y^{-1} \| = \|y\| + \|y\|$

we have (1) for ${(\alpha,\beta) = (2,0)}$. We also have the following further relations:

Proposition 1

• (i) If (1) holds for ${(\alpha,\beta)}$, then it holds for ${(\beta,\alpha)}$.
• (ii) If (1) holds for ${(\alpha,\beta)}$, then (2) holds for ${(\alpha+1, \frac{\beta}{2})}$.
• (iii) If (2) holds for ${(\gamma,\delta)}$, then (1) holds for ${(\frac{2\gamma}{3}, \frac{2\delta}{3})}$.
• (iv) If (1) holds for ${(\alpha,\beta)}$ and (2) holds for ${(\gamma,\delta)}$, then (1) holds for ${(\frac{2\alpha+1+\gamma}{4}, \frac{\delta+\beta}{4})}$.

Proof: For (i) we simply observe that

$\displaystyle \| xyx^{-1} y^{-1} \| = \| (xyx^{-1} y^{-1})^{-1} \| = \| y^{-1} x^{-1} y x \| = \| y x y^{-1} x^{-1} \|.$

For (ii), we calculate

$\displaystyle \| xyx^{-2}y^{-1} \| = \frac{1}{2}\| (xyx^{-2}y^{-1})^2 \|$

$\displaystyle = \frac{1}{2} \| (xyx^{-2}y^{-1} x) (yx^{-2} y^{-1}) \|$

$\displaystyle \leq \frac{1}{2} (\| xyx^{-2}y^{-1} x\| + \|yx^{-2} y^{-1}\|)$

$\displaystyle \leq \frac{1}{2} ( \| x^2 y x^{-2} y^{-1} \| + 2 \|x\| )$

$\displaystyle \leq \frac{1}{2} ( 2 \alpha \|x\| + \beta \|y\| + 2 \|x\|)$

giving the claim.

For (iii), we calculate

$\displaystyle \| xyx^{-1}y^{-1}\| = \frac{1}{3} \| (xyx^{-1}y^{-1})^3 \|$

$\displaystyle = \frac{1}{3} \| (xyx) (x^{-2} y^{-1} xy) (xyx)^{-1} (x^2 y x^{-1} y^{-1}) \|$

$\displaystyle \leq \frac{1}{3} ( \| x^{-2} y^{-1} xy\| + \| x^2 y x^{-1} y^{-1}\| )$

$\displaystyle = \frac{1}{3} ( \| xy x^{-2} y^{-1} \| + \|x^{-1} y^{-1} x^2 y \| )$

$\displaystyle \leq \frac{1}{3} ( \gamma \|x\| + \delta \|y\| + \gamma \|x\| + \delta \|y\|)$

giving the claim.

For (iv), we calculate

$\displaystyle \| xyx^{-1}y^{-1}\| = \frac{1}{4} \| (xyx^{-1}y^{-1})^4 \|$

$\displaystyle = \frac{1}{4} \| (xy) (x^{-1} y^{-1} x) (y x^{-1} y^{-1}) (xyx^{-1}) (xy)^{-1} (x^2yx^{-1}y^{-1}) \|$

$\displaystyle \leq \frac{1}{4} ( \| (x^{-1} y^{-1} x) (y x^{-1} y^{-1}) (xyx^{-1}) \| + \|x^2yx^{-1}y^{-1}\| )$

$\displaystyle \leq \frac{1}{4} ( \|(y x^{-1} y^{-1}) (xy^{-1}x^{-1})(x^{-1} y x) \| + \gamma \|x\| + \delta \|y\|)$

$\displaystyle \leq \frac{1}{4} ( \|x\| + \|(xy^{-1}x^{-1})(x^{-1} y x) \| + \gamma \|x\| + \delta \|y\|)$

$\displaystyle = \frac{1}{4} ( \|x\| + \|x^{-2} y x^2 y^{-1} \|+ \gamma \|x\| + \delta \|y\|)$

$\displaystyle \leq \frac{1}{4} ( \|x\| + 2\alpha \|x\| + \beta \|y\| + \gamma \|x\| + \delta \|y\|)$

giving the claim. $\Box$

Here is a typical application of the above estimates. If (1) holds for ${(\alpha,\beta)}$, then by part (i) it holds for ${(\beta,\alpha)}$, then by (ii) (2) holds for ${(\beta+1,\frac{\alpha}{2})}$, then by (iv) (1) holds for ${(\frac{3\beta+2}{4}, \frac{3\alpha}{8})}$. The map ${(\alpha,\beta) \mapsto (\frac{3\beta+2}{4}, \frac{3\alpha}{8})}$ has fixed point ${(\alpha,\beta) = (\frac{16}{23}, \frac{6}{23})}$, thus

$\displaystyle \| xyx^{-1}y^{-1} \| \leq \frac{16}{23} \|x\| + \frac{6}{23} \|y\|.$

For instance, if ${\|a\|, \|b\| \leq 1}$, then ${\|aba^{-1}b^{-1} \| \leq 22/23 = 0.95652\dots}$.

Here is a curious question posed to me by Apoorva Khare that I do not know the answer to. Let ${F_2}$ be the free group on two generators ${a,b}$. Does there exist a metric ${d}$ on this group which is

• bi-invariant, thus ${d(xg,yg)=d(gx,gy) = d(x,y)}$ for all ${x,y,g \in F_2}$; and
• linear growth in the sense that ${d(x^n,1) = n d(x,1)}$ for all ${x \in F_2}$ and all natural numbers ${n}$?

By defining the “norm” of an element ${x \in F_2}$ to be ${\| x\| := d(x,1)}$, an equivalent formulation of the problem asks if there exists a non-negative norm function ${\| \|: F_2 \rightarrow {\bf R}^+}$ that obeys the conjugation invariance

$\displaystyle \| gxg^{-1} \| = \|x \| \ \ \ \ \ (1)$

for all ${x,g \in F_2}$, the triangle inequality

$\displaystyle \| xy \| \leq \| x\| + \| y\| \ \ \ \ \ (2)$

for all ${x,y \in F_2}$, and the linear growth

$\displaystyle \| x^n \| = |n| \|x\| \ \ \ \ \ (3)$

for all ${x \in F_2}$ and ${n \in {\bf Z}}$, and such that ${\|x\| > 0}$ for all non-identity ${x \in F_2}$. Indeed, if such a norm exists then one can just take ${d(x,y) := \| x y^{-1} \|}$ to give the desired metric.

One can normalise the norm of the generators to be at most ${1}$, thus

$\displaystyle \| a \|, \| b \| \leq 1.$

This can then be used to upper bound the norm of other words in ${F_2}$. For instance, from (1), (3) one has

$\displaystyle \| aba^{-1} \|, \| b^{-1} a b \|, \| a^{-1} b^{-1} a \|, \| bab^{-1}\| \leq 1.$

A bit less trivially, from (3), (2), (1) one can bound commutators as

$\displaystyle \| aba^{-1} b^{-1} \| = \frac{1}{3} \| (aba^{-1} b^{-1})^3 \|$

$\displaystyle = \frac{1}{3} \| (aba^{-1}) (b^{-1} ab) (a^{-1} b^{-1} a) (b ab^{-1}) \|$

$\displaystyle \leq \frac{4}{3}.$

In a similar spirit one has

$\displaystyle \| aba^{-2} b^{-1} \| = \frac{1}{2} \| (aba^{-2} b^{-1})^2 \|$

$\displaystyle = \frac{1}{2} \| (aba^{-1}) (a^{-1} b^{-1} a) (ba^{-1} b^{-1}) (ba^{-1} b^{-1}) \|$

$\displaystyle \leq 2.$

What is not clear to me is if one can keep arguing like this to continually improve the upper bounds on the norm ${\| g\|}$ of a given non-trivial group element ${g}$ to the point where this norm must in fact vanish, which would demonstrate that no metric with the above properties on ${F_2}$ would exist (and in fact would impose strong constraints on similar metrics existing on other groups as well). It is also tempting to use some ideas from geometric group theory (e.g. asymptotic cones) to try to understand these metrics further, though I wasn’t able to get very far with this approach. Anyway, this feels like a problem that might be somewhat receptive to a more crowdsourced attack, so I am posing it here in case any readers wish to try to make progress on it.

By an odd coincidence, I stumbled upon a second question in as many weeks about power series, and once again the only way I know how to prove the result is by complex methods; once again, I am leaving it here as a challenge to any interested readers, and I would be particularly interested in knowing of a proof that was not based on complex analysis (or thinly disguised versions thereof), or for a reference to previous literature where something like this identity has occured. (I suspect for instance that something like this may have shown up before in free probability, based on the answer to part (ii) of the problem.)

Here is a purely algebraic form of the problem:

Problem 1 Let ${F = F(z)}$ be a formal function of one variable ${z}$. Suppose that ${G = G(z)}$ is the formal function defined by

$\displaystyle G := \sum_{n=1}^\infty \left( \frac{F^n}{n!} \right)^{(n-1)}$

$\displaystyle = F + \left(\frac{F^2}{2}\right)' + \left(\frac{F^3}{6}\right)'' + \dots$

$\displaystyle = F + FF' + (F (F')^2 + \frac{1}{2} F^2 F'') + \dots,$

where we use ${f^{(k)}}$ to denote the ${k}$-fold derivative of ${f}$ with respect to the variable ${z}$.

• (i) Show that ${F}$ can be formally recovered from ${G}$ by the formula

$\displaystyle F = \sum_{n=1}^\infty (-1)^{n-1} \left( \frac{G^n}{n!} \right)^{(n-1)}$

$\displaystyle = G - \left(\frac{G^2}{2}\right)' + \left(\frac{G^3}{6}\right)'' - \dots$

$\displaystyle = G - GG' + (G (G')^2 + \frac{1}{2} G^2 G'') - \dots.$

• (ii) There is a remarkable further formal identity relating ${F(z)}$ with ${G(z)}$ that does not explicitly involve any infinite summation. What is this identity?

To rigorously formulate part (i) of this problem, one could work in the commutative differential ring of formal infinite series generated by polynomial combinations of ${F}$ and its derivatives (with no constant term). Part (ii) is a bit trickier to formulate in this abstract ring; the identity in question is easier to state if ${F, G}$ are formal power series, or (even better) convergent power series, as it involves operations such as composition or inversion that can be more easily defined in those latter settings.

To illustrate Problem 1(i), let us compute up to third order in ${F}$, using ${{\mathcal O}(F^4)}$ to denote any quantity involving four or more factors of ${F}$ and its derivatives, and similarly for other exponents than ${4}$. Then we have

$\displaystyle G = F + FF' + (F (F')^2 + \frac{1}{2} F^2 F'') + {\mathcal O}(F^4)$

and hence

$\displaystyle G' = F' + (F')^2 + FF'' + {\mathcal O}(F^3)$

$\displaystyle G'' = F'' + {\mathcal O}(F^2);$

multiplying, we have

$\displaystyle GG' = FF' + F (F')^2 + F^2 F'' + F (F')^2 + {\mathcal O}(F^4)$

and

$\displaystyle G (G')^2 + \frac{1}{2} G^2 G'' = F (F')^2 + \frac{1}{2} F^2 F'' + {\mathcal O}(F^4)$

and hence after a lot of canceling

$\displaystyle G - GG' + (G (G')^2 + \frac{1}{2} G^2 G'') = F + {\mathcal O}(F^4).$

Thus Problem 1(i) holds up to errors of ${{\mathcal O}(F^4)}$ at least. In principle one can continue verifying Problem 1(i) to increasingly high order in ${F}$, but the computations rapidly become quite lengthy, and I do not know of a direct way to ensure that one always obtains the required cancellation at the end of the computation.

Problem 1(i) can also be posed in formal power series: if

$\displaystyle F(z) = a_1 z + a_2 z^2 + a_3 z^3 + \dots$

is a formal power series with no constant term with complex coefficients ${a_1, a_2, \dots}$ with ${|a_1|<1}$, then one can verify that the series

$\displaystyle G := \sum_{n=1}^\infty \left( \frac{F^n}{n!} \right)^{(n-1)}$

makes sense as a formal power series with no constant term, thus

$\displaystyle G(z) = b_1 z + b_2 z^2 + b_3 z^3 + \dots.$

For instance it is not difficult to show that ${b_1 = \frac{a_1}{1-a_1}}$. If one further has ${|b_1| < 1}$, then it turns out that

$\displaystyle F = \sum_{n=1}^\infty (-1)^{n-1} \left( \frac{G^n}{n!} \right)^{(n-1)}$

as formal power series. Currently the only way I know how to show this is by first proving the claim for power series with a positive radius of convergence using the Cauchy integral formula, but even this is a bit tricky unless one has managed to guess the identity in (ii) first. (In fact, the way I discovered this problem was by first trying to solve (a variant of) the identity in (ii) by Taylor expansion in the course of attacking another problem, and obtaining the transform in Problem 1 as a consequence.)

The transform that takes ${F}$ to ${G}$ resembles both the exponential function

$\displaystyle \exp(F) = \sum_{n=0}^\infty \frac{F^n}{n!}$

and Taylor’s formula

$\displaystyle F(z) = \sum_{n=0}^\infty \frac{F^{(n)}(0)}{n!} z^n$

but does not seem to be directly connected to either (this is more apparent once one knows the identity in (ii)).

My colleague Tom Liggett recently posed to me the following problem about power series in one real variable ${x}$. Observe that the power series

$\displaystyle \sum_{n=0}^\infty (-1)^n\frac{x^n}{n!}$

has very rapidly decaying coefficients (of order ${O(1/n!)}$), leading to an infinite radius of convergence; also, as the series converges to ${e^{-x}}$, the series decays very rapidly as ${x}$ approaches ${+\infty}$. The problem is whether this is essentially the only example of this type. More precisely:

Problem 1 Let ${a_0, a_1, \dots}$ be a bounded sequence of real numbers, and suppose that the power series

$\displaystyle f(x) := \sum_{n=0}^\infty a_n\frac{x^n}{n!}$

(which has an infinite radius of convergence) decays like ${O(e^{-x})}$ as ${x \rightarrow +\infty}$, in the sense that the function ${e^x f(x)}$ remains bounded as ${x \rightarrow +\infty}$. Must the sequence ${a_n}$ be of the form ${a_n = C (-1)^n}$ for some constant ${C}$?

As it turns out, the problem has a very nice solution using complex analysis methods, which by coincidence I happen to be teaching right now. I am therefore posing as a challenge to my complex analysis students and to other readers of this blog to answer the above problem by complex methods; feel free to post solutions in the comments below (and in particular, if you don’t want to be spoiled, you should probably refrain from reading the comments). In fact, the only way I know how to solve this problem currently is by complex methods; I would be interested in seeing a purely real-variable solution that is not simply a thinly disguised version of a complex-variable argument.

(To be fair to my students, the complex variable argument does require one additional tool that is not directly covered in my notes. That tool can be found here.)