You are currently browsing the category archive for the ‘question’ category.

In the tradition of “Polymath projects“, the problem posed in the previous two blog posts has now been solved, thanks to the cumulative effect of many small contributions by many participants (including, but not limited to, Sean Eberhard, Tobias Fritz, Siddharta Gadgil, Tobias Hartnick, Chris Jerdonek, Apoorva Khare, Antonio Machiavelo, Pace Nielsen, Andy Putman, Will Sawin, Alexander Shamov, Lior Silberman, and David Speyer). In this post I’ll write down a streamlined resolution, eliding a number of important but ultimately removable partial steps and insights made by the above contributors en route to the solution.

Theorem 1 Let {G = (G,\cdot)} be a group. Suppose one has a “seminorm” function {\| \|: G \rightarrow [0,+\infty)} which obeys the triangle inequality

\displaystyle \|xy \| \leq \|x\| + \|y\|

for all {x,y \in G}, with equality whenever {x=y}. Then the seminorm factors through the abelianisation map {G \mapsto G/[G,G]}.

Proof: By the triangle inequality, it suffices to show that {\| [x,y]\| = 0} for all {x,y \in G}, where {[x,y] := xyx^{-1}y^{-1}} is the commutator.

We first establish some basic facts. Firstly, by hypothesis we have {\|x^2\| = 2 \|x\|}, and hence {\|x^n \| = n \|x\|} whenever {n} is a power of two. On the other hand, by the triangle inequality we have {\|x^n \| \leq n\|x\|} for all positive {n}, and hence by the triangle inequality again we also have the matching lower bound, thus

\displaystyle \|x^n \| = n \|x\|

for all {n > 0}. The claim is also true for {n=0} (apply the preceding bound with {x=1} and {n=2}). By replacing {\|x\|} with {\max(\|x\|, \|x^{-1}\|)} if necessary we may now also assume without loss of generality that {\|x^{-1} \| = \|x\|}, thus

\displaystyle \|x^n \| = |n| \|x\| \ \ \ \ \ (1)


for all integers {n}.

Next, for any {x,y \in G}, and any natural number {n}, we have

\displaystyle \|yxy^{-1} \| = \frac{1}{n} \| (yxy^{-1})^n \|

\displaystyle = \frac{1}{n} \| y x^n y^{-1} \|

\displaystyle \leq \frac{1}{n} ( \|y\| + n \|x\| + \|y\|^{-1} )

so on taking limits as {n \rightarrow \infty} we have {\|yxy^{-1} \| \leq \|x\|}. Replacing {x,y} by {yxy^{-1},y^{-1}} gives the matching lower bound, thus we have the conjugation invariance

\displaystyle \|yxy^{-1} \| = \|x\|. \ \ \ \ \ (2)


Next, we observe that if {x,y,z,w} are such that {x} is conjugate to both {wy} and {zw^{-1}}, then one has the inequality

\displaystyle \|x\| \leq \frac{1}{2} ( \|y \| + \| z \| ). \ \ \ \ \ (3)


Indeed, if we write {x = swys^{-1} = t zw^{-1} t^{-1}} for some {s,t \in G}, then for any natural number {n} one has

\displaystyle \|x\| = \frac{1}{2n} \| x^n x^n \|

\displaystyle = \frac{1}{2n} \| swy \dots wy s^{-1}t zw^{-1} \dots zw^{-1} t^{-1} \|

where the {wy} and {zw^{-1}} terms each appear {n} times. From (2) we see that conjugation by {w} does not affect the norm. Using this and the triangle inequality several times, we conclude that

\displaystyle \|x\| \leq \frac{1}{2n} ( \|s\| + n \|y\| + \| s^{-1} t\| + n \|z\| + \|t^{-1} \| ),

and the claim (3) follows by sending {n \rightarrow \infty}.

The following special case of (3) will be of particular interest. Let {x,y \in G}, and for any integers {m,k}, define the quantity

\displaystyle f(m,k) := \| x^m [x,y]^k \|.

Observe that {x^m [x,y]^k} is conjugate to both {x (x^{m-1} [x,y]^k)} and to {(y^{-1} x^m [x,y]^{k-1} xy) x^{-1}}, hence by (3) one has

\displaystyle \| x^m [x,y]^k \| \leq \frac{1}{2} ( \| x^{m-1} [x,y]^k \| + \| y^{-1} x^{m} [x,y]^{k-1} xy \|)

which by (2) leads to the recursive inequality

\displaystyle f(m,k) \leq \frac{1}{2} (f(m-1,k) + f(m+1,k-1)).

We can write this in probabilistic notation as

\displaystyle f(m,k) \leq {\bf E} f( (m,k) + X )

where {X} is a random vector that takes the values {(-1,0)} and {(1,-1)} with probability {1/2} each. Iterating this, we conclude in particular that for any large natural number {n}, one has

\displaystyle f(0,n) \leq {\bf E} f( Z )

where {Z := (0,n) + X_1 + \dots + X_{2n}} and {X_1,\dots,X_{2n}} are iid copies of {X}. We can write {Z = (1,-1/2) (Y_1 + \dots + Y_{2n})} where Y_1,\dots,Y_{2n} = \pm 1 are iid signs.  By the triangle inequality, we thus have

\displaystyle f( Z ) \leq |Y_1+\dots+Y_{2n}| (\|x\| + \frac{1}{2} \| [x,y] \|),

noting that Y_1+\dots+Y_{2n} is an even integer.  On the other hand, Y_1+\dots+Y_{2n} has mean zero and variance 2n, hence by Cauchy-Schwarz

\displaystyle f(0,n) \leq \sqrt{2n}( \|x\| + \frac{1}{2} \| [x,y] \|).

But by (1), the left-hand side is equal to {n \| [x,y]\|}. Dividing by {n} and then sending {n \rightarrow \infty}, we obtain the claim. \Box

The above theorem reduces such seminorms to abelian groups. It is easy to see from (1) that any torsion element of such groups has zero seminorm, so we can in fact restrict to torsion-free groups, which we now write using additive notation {G = (G,+)}, thus for instance {\| nx \| = |n| \|x\|} for {n \in {\bf Z}}. We think of {G} as a {{\bf Z}}-module. One can then extend the seminorm to the associated {{\bf Q}}-vector space {G \otimes_{\bf Z} {\bf Q}} by the formula {\|\frac{a}{b} x\| := \frac{a}{b} \|x\|}, and then to the associated {{\bf R}}-vector space {G \otimes_{\bf Z} {\bf R}} by continuity, at which point it becomes a genuine seminorm (provided we have ensured the symmetry condition {\|x\| = \|x^{-1}\|}). Conversely, any seminorm on {G \otimes_{\bf Z} {\bf R}} induces a seminorm on {G}. (These arguments also appear in this paper of Khare and Rajaratnam.)


This post is a continuation of the previous post, which has attracted a large number of comments. I’m recording here some calculations that arose from those comments (particularly those of Pace Nielsen, Lior Silberman, Tobias Fritz, and Apoorva Khare). Please feel free to either continue these calculations or to discuss other approaches to the problem, such as those mentioned in the remaining comments to the previous post.

Let {F_2} be the free group on two generators {a,b}, and let {\| \|: F_2 \rightarrow {\bf R}^+} be a quantity obeying the triangle inequality

\displaystyle \| xy\| \leq \|x \| + \|y\|

and the linear growth property

\displaystyle \| x^n \| = |n| \| x\|

for all {x,y \in F_2} and integers {n \in {\bf Z}}; this implies the conjugation invariance

\displaystyle \| y^{-1} x y \| = \|x\|

or equivalently

\displaystyle \| xy \| = \| yx\|

We consider inequalities of the form

\displaystyle \| xyx^{-1}y^{-1} \| \leq \alpha \|x\| + \beta \| y\| \ \ \ \ \ (1)


\displaystyle \| xyx^{-2}y^{-1} \| \leq \gamma \|x\| + \delta \| y\| \ \ \ \ \ (2)

for various real numbers {\alpha,\beta,\gamma,\delta}. For instance, since

\displaystyle \| xyx^{-1}y^{-1} \| \leq \| xyx^{-1}\| + \|y^{-1} \| = \|y\| + \|y\|

we have (1) for {(\alpha,\beta) = (2,0)}. We also have the following further relations:

Proposition 1

  • (i) If (1) holds for {(\alpha,\beta)}, then it holds for {(\beta,\alpha)}.
  • (ii) If (1) holds for {(\alpha,\beta)}, then (2) holds for {(\alpha+1, \frac{\beta}{2})}.
  • (iii) If (2) holds for {(\gamma,\delta)}, then (1) holds for {(\frac{2\gamma}{3}, \frac{2\delta}{3})}.
  • (iv) If (1) holds for {(\alpha,\beta)} and (2) holds for {(\gamma,\delta)}, then (1) holds for {(\frac{2\alpha+1+\gamma}{4}, \frac{\delta+\beta}{4})}.

Proof: For (i) we simply observe that

\displaystyle \| xyx^{-1} y^{-1} \| = \| (xyx^{-1} y^{-1})^{-1} \| = \| y^{-1} x^{-1} y x \| = \| y x y^{-1} x^{-1} \|.

For (ii), we calculate

\displaystyle \| xyx^{-2}y^{-1} \| = \frac{1}{2}\| (xyx^{-2}y^{-1})^2 \|

\displaystyle = \frac{1}{2} \| (xyx^{-2}y^{-1} x) (yx^{-2} y^{-1}) \|

\displaystyle \leq \frac{1}{2} (\| xyx^{-2}y^{-1} x\| + \|yx^{-2} y^{-1}\|)

\displaystyle \leq \frac{1}{2} ( \| x^2 y x^{-2} y^{-1} \| + 2 \|x\| )

\displaystyle \leq \frac{1}{2} ( 2 \alpha \|x\| + \beta \|y\| + 2 \|x\|)

giving the claim.

For (iii), we calculate

\displaystyle \| xyx^{-1}y^{-1}\| = \frac{1}{3} \| (xyx^{-1}y^{-1})^3 \|

\displaystyle = \frac{1}{3} \| (xyx) (x^{-2} y^{-1} xy) (xyx)^{-1} (x^2 y x^{-1} y^{-1}) \|

\displaystyle \leq \frac{1}{3} ( \| x^{-2} y^{-1} xy\| + \| x^2 y x^{-1} y^{-1}\| )

\displaystyle = \frac{1}{3} ( \| xy x^{-2} y^{-1} \| + \|x^{-1} y^{-1} x^2 y \| )

\displaystyle \leq \frac{1}{3} ( \gamma \|x\| + \delta \|y\| + \gamma \|x\| + \delta \|y\|)

giving the claim.

For (iv), we calculate

\displaystyle \| xyx^{-1}y^{-1}\| = \frac{1}{4} \| (xyx^{-1}y^{-1})^4 \|

\displaystyle = \frac{1}{4} \| (xy) (x^{-1} y^{-1} x) (y x^{-1} y^{-1}) (xyx^{-1}) (xy)^{-1} (x^2yx^{-1}y^{-1}) \|

\displaystyle \leq \frac{1}{4} ( \| (x^{-1} y^{-1} x) (y x^{-1} y^{-1}) (xyx^{-1}) \| + \|x^2yx^{-1}y^{-1}\| )

\displaystyle \leq \frac{1}{4} ( \|(y x^{-1} y^{-1}) (xy^{-1}x^{-1})(x^{-1} y x) \| + \gamma \|x\| + \delta \|y\|)

\displaystyle \leq \frac{1}{4} ( \|x\| + \|(xy^{-1}x^{-1})(x^{-1} y x) \| + \gamma \|x\| + \delta \|y\|)

\displaystyle = \frac{1}{4} ( \|x\| + \|x^{-2} y x^2 y^{-1} \|+ \gamma \|x\| + \delta \|y\|)

\displaystyle \leq \frac{1}{4} ( \|x\| + 2\alpha \|x\| + \beta \|y\| + \gamma \|x\| + \delta \|y\|)

giving the claim. \Box

Here is a typical application of the above estimates. If (1) holds for {(\alpha,\beta)}, then by part (i) it holds for {(\beta,\alpha)}, then by (ii) (2) holds for {(\beta+1,\frac{\alpha}{2})}, then by (iv) (1) holds for {(\frac{3\beta+2}{4}, \frac{3\alpha}{8})}. The map {(\alpha,\beta) \mapsto (\frac{3\beta+2}{4}, \frac{3\alpha}{8})} has fixed point {(\alpha,\beta) = (\frac{16}{23}, \frac{6}{23})}, thus

\displaystyle \| xyx^{-1}y^{-1} \| \leq \frac{16}{23} \|x\| + \frac{6}{23} \|y\|.

For instance, if {\|a\|, \|b\| \leq 1}, then {\|aba^{-1}b^{-1} \| \leq 22/23 = 0.95652\dots}.

Here is a curious question posed to me by Apoorva Khare that I do not know the answer to. Let {F_2} be the free group on two generators {a,b}. Does there exist a metric {d} on this group which is

  • bi-invariant, thus {d(xg,yg)=d(gx,gy) = d(x,y)} for all {x,y,g \in F_2}; and
  • linear growth in the sense that {d(x^n,1) = n d(x,1)} for all {x \in F_2} and all natural numbers {n}?

By defining the “norm” of an element {x \in F_2} to be {\| x\| := d(x,1)}, an equivalent formulation of the problem asks if there exists a non-negative norm function {\| \|: F_2 \rightarrow {\bf R}^+} that obeys the conjugation invariance

\displaystyle  \| gxg^{-1} \| = \|x \| \ \ \ \ \ (1)

for all {x,g \in F_2}, the triangle inequality

\displaystyle  \| xy \| \leq \| x\| + \| y\| \ \ \ \ \ (2)

for all {x,y \in F_2}, and the linear growth

\displaystyle  \| x^n \| = |n| \|x\| \ \ \ \ \ (3)

for all {x \in F_2} and {n \in {\bf Z}}, and such that {\|x\| > 0} for all non-identity {x \in F_2}. Indeed, if such a norm exists then one can just take {d(x,y) := \| x y^{-1} \|} to give the desired metric.

One can normalise the norm of the generators to be at most {1}, thus

\displaystyle  \| a \|, \| b \| \leq 1.

This can then be used to upper bound the norm of other words in {F_2}. For instance, from (1), (3) one has

\displaystyle  \| aba^{-1} \|, \| b^{-1} a b \|, \| a^{-1} b^{-1} a \|, \| bab^{-1}\| \leq 1.

A bit less trivially, from (3), (2), (1) one can bound commutators as

\displaystyle  \| aba^{-1} b^{-1} \| = \frac{1}{3} \| (aba^{-1} b^{-1})^3 \|

\displaystyle  = \frac{1}{3} \| (aba^{-1}) (b^{-1} ab) (a^{-1} b^{-1} a) (b ab^{-1}) \|

\displaystyle  \leq \frac{4}{3}.

In a similar spirit one has

\displaystyle  \| aba^{-2} b^{-1} \| = \frac{1}{2} \| (aba^{-2} b^{-1})^2 \|

\displaystyle  = \frac{1}{2} \| (aba^{-1}) (a^{-1} b^{-1} a) (ba^{-1} b^{-1}) (ba^{-1} b^{-1}) \|

\displaystyle  \leq 2.

What is not clear to me is if one can keep arguing like this to continually improve the upper bounds on the norm {\| g\|} of a given non-trivial group element {g} to the point where this norm must in fact vanish, which would demonstrate that no metric with the above properties on {F_2} would exist (and in fact would impose strong constraints on similar metrics existing on other groups as well). It is also tempting to use some ideas from geometric group theory (e.g. asymptotic cones) to try to understand these metrics further, though I wasn’t able to get very far with this approach. Anyway, this feels like a problem that might be somewhat receptive to a more crowdsourced attack, so I am posing it here in case any readers wish to try to make progress on it.

By an odd coincidence, I stumbled upon a second question in as many weeks about power series, and once again the only way I know how to prove the result is by complex methods; once again, I am leaving it here as a challenge to any interested readers, and I would be particularly interested in knowing of a proof that was not based on complex analysis (or thinly disguised versions thereof), or for a reference to previous literature where something like this identity has occured. (I suspect for instance that something like this may have shown up before in free probability, based on the answer to part (ii) of the problem.)

Here is a purely algebraic form of the problem:

Problem 1 Let {F = F(z)} be a formal function of one variable {z}. Suppose that {G = G(z)} is the formal function defined by

\displaystyle G := \sum_{n=1}^\infty \left( \frac{F^n}{n!} \right)^{(n-1)}

\displaystyle = F + \left(\frac{F^2}{2}\right)' + \left(\frac{F^3}{6}\right)'' + \dots

\displaystyle = F + FF' + (F (F')^2 + \frac{1}{2} F^2 F'') + \dots,

where we use {f^{(k)}} to denote the {k}-fold derivative of {f} with respect to the variable {z}.

  • (i) Show that {F} can be formally recovered from {G} by the formula

    \displaystyle F = \sum_{n=1}^\infty (-1)^{n-1} \left( \frac{G^n}{n!} \right)^{(n-1)}

    \displaystyle = G - \left(\frac{G^2}{2}\right)' + \left(\frac{G^3}{6}\right)'' - \dots

    \displaystyle = G - GG' + (G (G')^2 + \frac{1}{2} G^2 G'') - \dots.

  • (ii) There is a remarkable further formal identity relating {F(z)} with {G(z)} that does not explicitly involve any infinite summation. What is this identity?

To rigorously formulate part (i) of this problem, one could work in the commutative differential ring of formal infinite series generated by polynomial combinations of {F} and its derivatives (with no constant term). Part (ii) is a bit trickier to formulate in this abstract ring; the identity in question is easier to state if {F, G} are formal power series, or (even better) convergent power series, as it involves operations such as composition or inversion that can be more easily defined in those latter settings.

To illustrate Problem 1(i), let us compute up to third order in {F}, using {{\mathcal O}(F^4)} to denote any quantity involving four or more factors of {F} and its derivatives, and similarly for other exponents than {4}. Then we have

\displaystyle G = F + FF' + (F (F')^2 + \frac{1}{2} F^2 F'') + {\mathcal O}(F^4)

and hence

\displaystyle G' = F' + (F')^2 + FF'' + {\mathcal O}(F^3)

\displaystyle G'' = F'' + {\mathcal O}(F^2);

multiplying, we have

\displaystyle GG' = FF' + F (F')^2 + F^2 F'' + F (F')^2 + {\mathcal O}(F^4)


\displaystyle G (G')^2 + \frac{1}{2} G^2 G'' = F (F')^2 + \frac{1}{2} F^2 F'' + {\mathcal O}(F^4)

and hence after a lot of canceling

\displaystyle G - GG' + (G (G')^2 + \frac{1}{2} G^2 G'') = F + {\mathcal O}(F^4).

Thus Problem 1(i) holds up to errors of {{\mathcal O}(F^4)} at least. In principle one can continue verifying Problem 1(i) to increasingly high order in {F}, but the computations rapidly become quite lengthy, and I do not know of a direct way to ensure that one always obtains the required cancellation at the end of the computation.

Problem 1(i) can also be posed in formal power series: if

\displaystyle F(z) = a_1 z + a_2 z^2 + a_3 z^3 + \dots

is a formal power series with no constant term with complex coefficients {a_1, a_2, \dots} with {|a_1|<1}, then one can verify that the series

\displaystyle G := \sum_{n=1}^\infty \left( \frac{F^n}{n!} \right)^{(n-1)}

makes sense as a formal power series with no constant term, thus

\displaystyle G(z) = b_1 z + b_2 z^2 + b_3 z^3 + \dots.

For instance it is not difficult to show that {b_1 = \frac{a_1}{1-a_1}}. If one further has {|b_1| < 1}, then it turns out that

\displaystyle F = \sum_{n=1}^\infty (-1)^{n-1} \left( \frac{G^n}{n!} \right)^{(n-1)}

as formal power series. Currently the only way I know how to show this is by first proving the claim for power series with a positive radius of convergence using the Cauchy integral formula, but even this is a bit tricky unless one has managed to guess the identity in (ii) first. (In fact, the way I discovered this problem was by first trying to solve (a variant of) the identity in (ii) by Taylor expansion in the course of attacking another problem, and obtaining the transform in Problem 1 as a consequence.)

The transform that takes {F} to {G} resembles both the exponential function

\displaystyle \exp(F) = \sum_{n=0}^\infty \frac{F^n}{n!}

and Taylor’s formula

\displaystyle F(z) = \sum_{n=0}^\infty \frac{F^{(n)}(0)}{n!} z^n

but does not seem to be directly connected to either (this is more apparent once one knows the identity in (ii)).

My colleague Tom Liggett recently posed to me the following problem about power series in one real variable {x}. Observe that the power series

\displaystyle  \sum_{n=0}^\infty (-1)^n\frac{x^n}{n!}

has very rapidly decaying coefficients (of order {O(1/n!)}), leading to an infinite radius of convergence; also, as the series converges to {e^{-x}}, the series decays very rapidly as {x} approaches {+\infty}. The problem is whether this is essentially the only example of this type. More precisely:

Problem 1 Let {a_0, a_1, \dots} be a bounded sequence of real numbers, and suppose that the power series

\displaystyle  f(x) := \sum_{n=0}^\infty a_n\frac{x^n}{n!}

(which has an infinite radius of convergence) decays like {O(e^{-x})} as {x \rightarrow +\infty}, in the sense that the function {e^x f(x)} remains bounded as {x \rightarrow +\infty}. Must the sequence {a_n} be of the form {a_n = C (-1)^n} for some constant {C}?

As it turns out, the problem has a very nice solution using complex analysis methods, which by coincidence I happen to be teaching right now. I am therefore posing as a challenge to my complex analysis students and to other readers of this blog to answer the above problem by complex methods; feel free to post solutions in the comments below (and in particular, if you don’t want to be spoiled, you should probably refrain from reading the comments). In fact, the only way I know how to solve this problem currently is by complex methods; I would be interested in seeing a purely real-variable solution that is not simply a thinly disguised version of a complex-variable argument.

(To be fair to my students, the complex variable argument does require one additional tool that is not directly covered in my notes. That tool can be found here.)

Over on the polymath blog, I’ve posted (on behalf of Dinesh Thakur) a new polymath proposal, which is to explain some numerically observed identities involving the irreducible polynomials P in the polynomial ring {\bf F}_2[t] over the finite field of characteristic two, the simplest of which is

\displaystyle \sum_P \frac{1}{1+P} = 0

(expanded in terms of Taylor series in u = 1/t).  Comments on the problem should be placed in the polymath blog post; if there is enough interest, we can start a formal polymath project on it.

Let {\lambda} denote the Liouville function. The prime number theorem is equivalent to the estimate

\displaystyle \sum_{n \leq x} \lambda(n) = o(x)

as {x \rightarrow \infty}, that is to say that {\lambda} exhibits cancellation on large intervals such as {[1,x]}. This result can be improved to give cancellation on shorter intervals. For instance, using the known zero density estimates for the Riemann zeta function, one can establish that

\displaystyle \int_X^{2X} |\sum_{x \leq n \leq x+H} \lambda(n)|\ dx = o( HX ) \ \ \ \ \ (1)


as {X \rightarrow \infty} if {X^{1/6+\varepsilon} \leq H \leq X} for some fixed {\varepsilon>0}; I believe this result is due to Ramachandra (see also Exercise 21 of this previous blog post), and in fact one could obtain a better error term on the right-hand side that for instance gained an arbitrary power of {\log X}. On the Riemann hypothesis (or the weaker density hypothesis), it was known that the {X^{1/6+\varepsilon}} could be lowered to {X^\varepsilon}.

Early this year, there was a major breakthrough by Matomaki and Radziwill, who (among other things) showed that the asymptotic (1) was in fact valid for any {H = H(X)} with {H \leq X} that went to infinity as {X \rightarrow \infty}, thus yielding cancellation on extremely short intervals. This has many further applications; for instance, this estimate, or more precisely its extension to other “non-pretentious” bounded multiplicative functions, was a key ingredient in my recent solution of the Erdös discrepancy problem, as well as in obtaining logarithmically averaged cases of Chowla’s conjecture, such as

\displaystyle \sum_{n \leq x} \frac{\lambda(n) \lambda(n+1)}{n} = o(\log x). \ \ \ \ \ (2)


It is of interest to twist the above estimates by phases such as the linear phase {n \mapsto e(\alpha n) := e^{2\pi i \alpha n}}. In 1937, Davenport showed that

\displaystyle \sup_\alpha |\sum_{n \leq x} \lambda(n) e(\alpha n)| \ll_A x \log^{-A} x

which of course improves the prime number theorem. Recently with Matomaki and Radziwill, we obtained a common generalisation of this estimate with (1), showing that

\displaystyle \sup_\alpha \int_X^{2X} |\sum_{x \leq n \leq x+H} \lambda(n) e(\alpha n)|\ dx = o(HX) \ \ \ \ \ (3)


as {X \rightarrow \infty}, for any {H = H(X) \leq X} that went to infinity as {X \rightarrow \infty}. We were able to use this estimate to obtain an averaged form of Chowla’s conjecture.

In that paper, we asked whether one could improve this estimate further by moving the supremum inside the integral, that is to say to establish the bound

\displaystyle \int_X^{2X} \sup_\alpha |\sum_{x \leq n \leq x+H} \lambda(n) e(\alpha n)|\ dx = o(HX) \ \ \ \ \ (4)


as {X \rightarrow \infty}, for any {H = H(X) \leq X} that went to infinity as {X \rightarrow \infty}. This bound is asserting that {\lambda} is locally Fourier-uniform on most short intervals; it can be written equivalently in terms of the “local Gowers {U^2} norm” as

\displaystyle \int_X^{2X} \sum_{1 \leq a \leq H} |\sum_{x \leq n \leq x+H} \lambda(n) \lambda(n+a)|^2\ dx = o( H^3 X )

from which one can see that this is another averaged form of Chowla’s conjecture (stronger than the one I was able to prove with Matomaki and Radziwill, but a consequence of the unaveraged Chowla conjecture). If one inserted such a bound into the machinery I used to solve the Erdös discrepancy problem, it should lead to further averaged cases of Chowla’s conjecture, such as

\displaystyle \sum_{n \leq x} \frac{\lambda(n) \lambda(n+1) \lambda(n+2)}{n} = o(\log x), \ \ \ \ \ (5)


though I have not fully checked the details of this implication. It should also have a number of new implications for sign patterns of the Liouville function, though we have not explored these in detail yet.

One can write (4) equivalently in the form

\displaystyle \int_X^{2X} \sum_{x \leq n \leq x+H} \lambda(n) e( \alpha(x) n + \beta(x) )\ dx = o(HX) \ \ \ \ \ (6)


uniformly for all {x}-dependent phases {\alpha(x), \beta(x)}. In contrast, (3) is equivalent to the subcase of (6) when the linear phase coefficient {\alpha(x)} is independent of {x}. This dependency of {\alpha(x)} on {x} seems to necessitate some highly nontrivial additive combinatorial analysis of the function {x \mapsto \alpha(x)} in order to establish (4) when {H} is small. To date, this analysis has proven to be elusive, but I would like to record what one can do with more classical methods like Vaughan’s identity, namely:

Proposition 1 The estimate (4) (or equivalently (6)) holds in the range {X^{2/3+\varepsilon} \leq H \leq X} for any fixed {\varepsilon>0}. (In fact one can improve the right-hand side by an arbitrary power of {\log X} in this case.)

The values of {H} in this range are far too large to yield implications such as new cases of the Chowla conjecture, but it appears that the {2/3} exponent is the limit of “classical” methods (at least as far as I was able to apply them), in the sense that one does not do any combinatorial analysis on the function {x \mapsto \alpha(x)}, nor does one use modern equidistribution results on “Type III sums” that require deep estimates on Kloosterman-type sums. The latter may shave a little bit off of the {2/3} exponent, but I don’t see how one would ever hope to go below {1/2} without doing some non-trivial combinatorics on the function {x \mapsto \alpha(x)}. UPDATE: I have come across this paper of Zhan which uses mean-value theorems for L-functions to lower the {2/3} exponent to {5/8}.

Let me now sketch the proof of the proposition, omitting many of the technical details. We first remark that known estimates on sums of the Liouville function (or similar functions such as the von Mangoldt function) in short arithmetic progressions, based on zero-density estimates for Dirichlet {L}-functions, can handle the “major arc” case of (4) (or (6)) where {\alpha} is restricted to be of the form {\alpha = \frac{a}{q} + O( X^{-1/6-\varepsilon} )} for {q = O(\log^{O(1)} X)} (the exponent here being of the same numerology as the {X^{1/6+\varepsilon}} exponent in the classical result of Ramachandra, tied to the best zero density estimates currently available); for instance a modification of the arguments in this recent paper of Koukoulopoulos would suffice. Thus we can restrict attention to “minor arc” values of {\alpha} (or {\alpha(x)}, using the interpretation of (6)).

Next, one breaks up {\lambda} (or the closely related Möbius function) into Dirichlet convolutions using one of the standard identities (e.g. Vaughan’s identity or Heath-Brown’s identity), as discussed for instance in this previous post (which is focused more on the von Mangoldt function, but analogous identities exist for the Liouville and Möbius functions). The exact choice of identity is not terribly important, but the upshot is that {\lambda(n)} can be decomposed into {\log^{O(1)} X} terms, each of which is either of the “Type I” form

\displaystyle \sum_{d \sim D; m \sim M: dm=n} a_d

for some coefficients {a_d} that are roughly of logarithmic size on the average, and scales {D, M} with {D \ll X^{2/3}} and {DM \sim X}, or else of the “Type II” form

\displaystyle \sum_{d \sim D; m \sim M: dm=n} a_d b_m

for some coefficients {a_d, b_m} that are roughly of logarithmic size on the average, and scales {D,M} with {X^{1/3} \ll D,M \ll X^{2/3}} and {DM \sim X}. As discussed in the previous post, the {2/3} exponent is a natural barrier in these identities if one is unwilling to also consider “Type III” type terms which are roughly of the shape of the third divisor function {\tau_3(n) := \sum_{d_1d_2d_3=1} 1}.

A Type I sum makes a contribution to { \sum_{x \leq n \leq x+H} \lambda(n) e( \alpha(x) n + \beta(x) )} that can be bounded (via Cauchy-Schwarz) in terms of an expression such as

\displaystyle \sum_{d \sim D} | \sum_{x/d \leq m \leq x/d+H/d} e(\alpha(x) dm )|^2.

The inner sum exhibits a lot of cancellation unless {\alpha(x) d} is within {O(D/H)} of an integer. (Here, “a lot” should be loosely interpreted as “gaining many powers of {\log X} over the trivial bound”.) Since {H} is significantly larger than {D}, standard Vinogradov-type manipulations (see e.g. Lemma 13 of these previous notes) show that this bad case occurs for many {d} only when {\alpha} is “major arc”, which is the case we have specifically excluded. This lets us dispose of the Type I contributions.

A Type II sum makes a contribution to { \sum_{x \leq n \leq x+H} \lambda(n) e( \alpha(x) n + \beta(x) )} roughly of the form

\displaystyle \sum_{d \sim D} | \sum_{x/d \leq m \leq x/d+H/d} b_m e(\alpha(x) dm)|.

We can break this up into a number of sums roughly of the form

\displaystyle \sum_{d = d_0 + O( H / M )} | \sum_{x/d_0 \leq m \leq x/d_0 + H/D} b_m e(\alpha(x) dm)|

for {d_0 \sim D}; note that the {d} range is non-trivial because {H} is much larger than {M}. Applying the usual bilinear sum Cauchy-Schwarz methods (e.g. Theorem 14 of these notes) we conclude that there is a lot of cancellation unless one has {\alpha(x) = a/q + O( \frac{X \log^{O(1)} X}{H^2} )} for some {q = O(\log^{O(1)} X)}. But with {H \geq X^{2/3+\varepsilon}}, {X \log^{O(1)} X/H^2} is well below the threshold {X^{-1/6-\varepsilon}} for the definition of major arc, so we can exclude this case and obtain the required cancellation.

The Chowla conjecture asserts that all non-trivial correlations of the Liouville function are asymptotically negligible; for instance, it asserts that

\displaystyle  \sum_{n \leq X} \lambda(n) \lambda(n+h) = o(X)

as {X \rightarrow \infty} for any fixed natural number {h}. This conjecture remains open, though there are a number of partial results (e.g. these two previous results of Matomaki, Radziwill, and myself).

A natural generalisation of Chowla’s conjecture was proposed by Elliott. For simplicity we will only consider Elliott’s conjecture for the pair correlations

\displaystyle  \sum_{n \leq X} g(n) \overline{g}(n+h).

For such correlations, the conjecture was that one had

\displaystyle  \sum_{n \leq X} g(n) \overline{g}(n+h) = o(X) \ \ \ \ \ (1)

as {X \rightarrow \infty} for any natural number {h}, as long as {g} was a completely multiplicative function with magnitude bounded by {1}, and such that

\displaystyle  \sum_p \hbox{Re} \frac{1 - g(p) \overline{\chi(p)} p^{-it}}{p} = +\infty \ \ \ \ \ (2)

for any Dirichlet character {\chi} and any real number {t}. In the language of “pretentious number theory”, as developed by Granville and Soundararajan, the hypothesis (2) asserts that the completely multiplicative function {g} does not “pretend” to be like the completely multiplicative function {n \mapsto \chi(n) n^{it}} for any character {\chi} and real number {t}. A condition of this form is necessary; for instance, if {g(n)} is precisely equal to {\chi(n) n^{it}} and {\chi} has period {q}, then {g(n) \overline{g}(n+q)} is equal to {1_{(n,q)=1} + o(1)} as {n \rightarrow \infty} and (1) clearly fails. The prime number theorem in arithmetic progressions implies that the Liouville function obeys (2), and so the Elliott conjecture contains the Chowla conjecture as a special case.

As it turns out, Elliott’s conjecture is false as stated, with the counterexample {g} having the property that {g} “pretends” locally to be the function {n \mapsto n^{it_j}} for {n} in various intervals {[1, X_j]}, where {X_j} and {t_j} go to infinity in a certain prescribed sense. See this paper of Matomaki, Radziwill, and myself for details. However, we view this as a technicality, and continue to believe that certain “repaired” versions of Elliott’s conjecture still hold. For instance, our counterexample does not apply when {g} is restricted to be real-valued rather than complex, and we believe that Elliott’s conjecture is valid in this setting. Returning to the complex-valued case, we still expect the asymptotic (1) provided that the condition (2) is replaced by the stronger condition

\displaystyle  \sup_{|t| \leq X} |\sum_{p \leq X} \hbox{Re} \frac{1 - g(p) \overline{\chi(p)} p^{-it}}{p}| \rightarrow +\infty

as {X \rightarrow +\infty} for all fixed Dirichlet characters {\chi}. In our paper we supported this claim by establishing a certain “averaged” version of this conjecture; see that paper for further details. (See also this recent paper of Frantzikinakis and Host which establishes a different averaged version of this conjecture.)

One can make a stronger “non-asymptotic” version of this corrected Elliott conjecture, in which the {X} parameter does not go to infinity, or equivalently that the function {g} is permitted to depend on {X}:

Conjecture 1 (Non-asymptotic Elliott conjecture) Let {\varepsilon > 0}, let {A \geq 1} be sufficiently large depending on {\varepsilon}, and let {X} be sufficiently large depending on {A,\varepsilon}. Suppose that {g} is a completely multiplicative function with magnitude bounded by {1}, such that

\displaystyle  \inf_{|t| \leq AX} |\sum_{p \leq X} \hbox{Re} \frac{1 - g(p) \overline{\chi(p)} p^{-it}}{p}| \geq A

for all Dirichlet characters {\chi} of period at most {A}. Then one has

\displaystyle  |\sum_{n \leq X} g(n) \overline{g(n+h)}| \leq \varepsilon X

for all natural numbers {1 \leq h \leq 1/\varepsilon}.

The {\varepsilon}-dependent factor {A} in the constraint {|t| \leq AX} is necessary, as can be seen by considering the completely multiplicative function {g(n) := n^{2iX}} (for instance). Again, the results in my previous paper with Matomaki and Radziwill can be viewed as establishing an averaged version of this conjecture.

Meanwhile, we have the following conjecture that is the focus of the Polymath5 project:

Conjecture 2 (Erdös discrepancy conjecture) For any function {f: {\bf N} \rightarrow \{-1,+1\}}, the discrepancy

\displaystyle  \sup_{n,d \in {\bf N}} |\sum_{j=1}^n f(jd)|

is infinite.

It is instructive to compute some near-counterexamples to Conjecture 2 that illustrate the difficulty of the Erdös discrepancy problem. The first near-counterexample is that of a non-principal Dirichlet character {f(n) = \chi(n)} that takes values in {\{-1,0,+1\}} rather than {\{-1,+1\}}. For this function, one has from the complete multiplicativity of {\chi} that

\displaystyle  |\sum_{j=1}^n f(jd)| = |\sum_{j=1}^n \chi(j) \chi(d)|

\displaystyle  \leq |\sum_{j=1}^n \chi(j)|.

If {q} denotes the period of {\chi}, then {\chi} has mean zero on every interval of length {q}, and thus

\displaystyle  |\sum_{j=1}^n f(jd)| \leq |\sum_{j=1}^n \chi(j)| \leq q.

Thus {\chi} has bounded discrepancy.

Of course, this is not a true counterexample to Conjecture 2 because {\chi} can take the value {0}. Let us now consider the following variant example, which is the simplest member of a family of examples studied by Borwein, Choi, and Coons. Let {\chi = \chi_3} be the non-principal Dirichlet character of period {3} (thus {\chi(n)} equals {+1} when {n=1 \hbox{ mod } 3}, {-1} when {n = 2 \hbox{ mod } 3}, and {0} when {n = 0 \hbox{ mod } 3}), and define the completely multiplicative function {f = \tilde \chi: {\bf N} \rightarrow \{-1,+1\}} by setting {\tilde \chi(p) := \chi(p)} when {p \neq 3} and {\tilde \chi(3) = +1}. This is about the simplest modification one can make to the previous near-counterexample to eliminate the zeroes. Now consider the sum

\displaystyle  \sum_{j=1}^n \tilde \chi(j)

with {n := 1 + 3 + 3^2 + \dots + 3^k} for some large {k}. Writing {j = 3^a m} with {m} coprime to {3} and {a} at most {k}, we can write this sum as

\displaystyle  \sum_{a=0}^k \sum_{1 \leq m \leq n/3^j} \tilde \chi(3^a m).

Now observe that {\tilde \chi(3^a m) = \tilde \chi(3)^a \tilde \chi(m) = \chi(m)}. The function {\chi} has mean zero on every interval of length three, and {\lfloor n/3^j\rfloor} is equal to {1} mod {3}, and thus

\displaystyle  \sum_{1 \leq m \leq n/3^j} \tilde \chi(3^a m) = 1

for every {a=0,\dots,k}, and thus

\displaystyle  \sum_{j=1}^n \tilde \chi(j) = k+1 \gg \log n.

Thus {\tilde \chi} also has unbounded discrepancy, but only barely so (it grows logarithmically in {n}). These examples suggest that the main “enemy” to proving Conjecture 2 comes from completely multiplicative functions {f} that somehow “pretend” to be like a Dirichlet character but do not vanish at the zeroes of that character. (Indeed, the special case of Conjecture 2 when {f} is completely multiplicative is already open, appears to be an important subcase.)

All of these conjectures remain open. However, I would like to record in this blog post the following striking connection, illustrating the power of the Elliott conjecture (particularly in its nonasymptotic formulation):

Theorem 3 (Elliott conjecture implies unbounded discrepancy) Conjecture 1 implies Conjecture 2.

The argument relies heavily on two observations that were previously made in connection with the Polymath5 project. The first is a Fourier-analytic reduction that replaces the Erdos Discrepancy Problem with an averaged version for completely multiplicative functions {g}. An application of Cauchy-Schwarz then shows that any counterexample to that version will violate the conclusion of Conjecture 1, so if one assumes that conjecture then {g} must pretend to be like a function of the form {n \mapsto \chi(n) n^{it}}. One then uses (a generalisation) of a second argument from Polymath5 to rule out this case, basically by reducing matters to a more complicated version of the Borwein-Choi-Coons analysis. Details are provided below the fold.

There is some hope that the Chowla and Elliott conjectures can be attacked, as the parity barrier which is so impervious to attack for the twin prime conjecture seems to be more permeable in this setting. (For instance, in my previous post I raised a possible approach, based on establishing expander properties of a certain random graph, which seems to get around the parity problem, in principle at least.)

(Update, Sep 25: fixed some treatment of error terms, following a suggestion of Andrew Granville.)

Read the rest of this entry »

Here’s a cute identity I discovered by accident recently. Observe that

\displaystyle  \frac{d}{dx} (1+x^2)^{0/2} = 0

\displaystyle  \frac{d^2}{dx^2} (1+x^2)^{1/2} = \frac{1}{(1+x^2)^{3/2}}

\displaystyle  \frac{d^3}{dx^3} (1+x^2)^{2/2} = 0

\displaystyle  \frac{d^4}{dx^4} (1+x^2)^{3/2} = \frac{9}{(1+x^2)^{5/2}}

\displaystyle  \frac{d^5}{dx^5} (1+x^2)^{4/2} = 0

\displaystyle  \frac{d^6}{dx^6} (1+x^2)^{5/2} = \frac{225}{(1+x^2)^{7/2}}

and so one can conjecture that one has

\displaystyle  \frac{d^{k+1}}{dx^{k+1}} (1+x^2)^{k/2} = 0

when k is even, and

\displaystyle  \frac{d^{k+1}}{dx^{k+1}} (1+x^2)^{k/2} = \frac{(1 \times 3 \times \dots \times k)^2}{(1+x^2)^{(k+2)/2}}

when k is odd. This is obvious in the even case since (1+x^2)^{k/2} is a polynomial of degree k, but I struggled for a while with the odd case before finding a slick three-line proof. (I was first trying to prove the weaker statement that \frac{d^{k+1}}{dx^{k+1}} (1+x^2)^{k/2} was non-negative, but for some strange reason I was only able to establish this by working out the derivative exactly, rather than by using more analytic methods, such as convexity arguments.) I thought other readers might like the challenge (and also I’d like to see some other proofs), so rather than post my own proof immediately, I’ll see if anyone would like to supply their own proofs or thoughts in the comments. Also I am curious to know if this identity is connected to any other existing piece of mathematics.

The lonely runner conjecture is the following open problem:

Conjecture 1 Suppose one has {n \geq 1} runners on the unit circle {{\bf R}/{\bf Z}}, all starting at the origin and moving at different speeds. Then for each runner, there is at least one time {t} for which that runner is “lonely” in the sense that it is separated by a distance at least {1/n} from all other runners.

One can normalise the speed of the lonely runner to be zero, at which point the conjecture can be reformulated (after replacing {n} by {n+1}) as follows:

Conjecture 2 Let {v_1,\dots,v_n} be non-zero real numbers for some {n \geq 1}. Then there exists a real number {t} such that the numbers {tv_1,\dots,tv_n} are all a distance at least {\frac{1}{n+1}} from the integers, thus {\|tv_1\|_{{\bf R}/{\bf Z}},\dots,\|tv_n\|_{{\bf R}/{\bf Z}} \geq \frac{1}{n+1}} where {\|x\|_{{\bf R}/{\bf Z}}} denotes the distance of {x} to the nearest integer.

This conjecture has been proven for {n \leq 7}, but remains open for larger {n}. The bound {\frac{1}{n+1}} is optimal, as can be seen by looking at the case {v_i=i} and applying the Dirichlet approximation theorem. Note that for each non-zero {v}, the set {\{ t \in {\bf R}: \|vt\|_{{\bf R}/{\bf Z}} \leq r \}} has (Banach) density {2r} for any {0 < r < 1/2}, and from this and the union bound we can easily find {t \in {\bf R}} for which

\displaystyle \|tv_1\|_{{\bf R}/{\bf Z}},\dots,\|tv_n\|_{{\bf R}/{\bf Z}} \geq \frac{1}{2n}-\varepsilon

for any {\varepsilon>0}, but it has proven to be quite challenging to remove the factor of {2} to increase {\frac{1}{2n}} to {\frac{1}{n+1}}. (As far as I know, even improving {\frac{1}{2n}} to {\frac{1+c}{2n}} for some absolute constant {c>0} and sufficiently large {n} remains open.)

The speeds {v_1,\dots,v_n} in the above conjecture are arbitrary non-zero reals, but it has been known for some time that one can reduce without loss of generality to the case when the {v_1,\dots,v_n} are rationals, or equivalently (by scaling) to the case where they are integers; see e.g. Section 4 of this paper of Bohman, Holzman, and Kleitman.

In this post I would like to remark on a slight refinement of this reduction, in which the speeds {v_1,\dots,v_n} are integers of bounded size, where the bound depends on {n}. More precisely:

Proposition 3 In order to prove the lonely runner conjecture, it suffices to do so under the additional assumption that the {v_1,\dots,v_n} are integers of size at most {n^{Cn^2}}, where {C} is an (explicitly computable) absolute constant. (More precisely: if this restricted version of the lonely runner conjecture is true for all {n \leq n_0}, then the original version of the conjecture is also true for all {n \leq n_0}.)

In principle, this proposition allows one to verify the lonely runner conjecture for a given {n} in finite time; however the number of cases to check with this proposition grows faster than exponentially in {n}, and so this is unfortunately not a feasible approach to verifying the lonely runner conjecture for more values of {n} than currently known.

One of the key tools needed to prove this proposition is the following additive combinatorics result. Recall that a generalised arithmetic progression (or {GAP}) in the reals {{\bf R}} is a set of the form

\displaystyle  P = \{ n_1 v_1 + \dots + n_d v_d: n_1,\dots,n_d \in {\bf Z}; |n_1| \leq N_1, \dots, |n_d| \leq N_d \}

for some {v_1,\dots,v_d \in {\bf R}} and {N_1,\dots,N_d > 0}; the quantity {d} is called the rank of the progression. If {t>0}, the progression {P} is said to be {t}-proper if the sums {n_1 v_1 + \dots + n_d v_d} with {|n_i| \leq t N_i} for {i=1,\dots,d} are all distinct. We have

Lemma 4 (Progressions lie inside proper progressions) Let {P} be a GAP of rank {d} in the reals, and let {t \geq 1}. Then {P} is contained in a {t}-proper GAP {Q} of rank at most {d}, with

\displaystyle |Q| \leq (2t)^d d^{6d^2} \prod_{i=1}^d (2N_i+1).

Proof: See Theorem 2.1 of this paper of Bilu. (Very similar results can also be found in Theorem 3.40 of my book with Van Vu, or Theorem 1.10 of this paper of mine with Van Vu.) \Box

Now let {n \geq 1}, and assume inductively that the lonely runner conjecture has been proven for all smaller values of {n}, as well as for the current value of {n} in the case that {v_1,\dots,v_n} are integers of size at most {n^{Cn^2}} for some sufficiently large {C}. We will show that the lonely runner conjecture holds in general for this choice of {n}.

let {v_1,\dots,v_n} be non-zero real numbers. Let {C_0} be a large absolute constant to be chosen later. From the above lemma applied to the GAP {\{ n_1 v_1 + \dots + n_d v_d: n_1,\dots,n_d \in \{-1,0,1\}\}}, one can find a {n^{C_0n}}-proper GAP {Q} of rank at most {n} containing {\{v_1,\dots,v_n\}} such that

\displaystyle  |Q| \leq (6n^{C_0 n})^n n^{6n^2};

in particular {|Q| \leq n^{Cn^2}} if {C} is large enough depending on {C_0}.

We write

\displaystyle  Q = \{ n_1 w_1 + \dots + n_d w_d: n_1,\dots,n_d \in {\bf Z}; |n_1| \leq N_1,\dots,|n_d| \leq N_d \}

for some {d \leq n}, {w_1,\dots,w_d}, and {N_1,\dots,N_d \geq 0}. We thus have {v_i = \phi(a_i)} for {i=1,\dots,n}, where {\phi: {\bf R}^d \rightarrow {\bf R}} is the linear map {\phi(n_1,\dots,n_d) := n_1 w_1 + \dots + n_d w_d} and {a_1,\dots,a_n \in {\bf Z}^d} are non-zero and lie in the box {\{ (n_1,\dots,n_d) \in {\bf R}^d: |n_1| \leq N_1,\dots,|n_d| \leq N_d \}}.

We now need an elementary lemma that allows us to create a “collision” between two of the {a_1,\dots,a_n} via a linear projection, without making any of the {a_i} collide with the origin:

Lemma 5 Let {a_1,\dots,a_n \in {\bf R}^d} be non-zero vectors that are not all collinear with the origin. Then, after replacing one or more of the {a_i} with their negatives {-a_i} if necessary, there exists a pair {a_i,a_j} such that {a_i-a_j \neq 0}, and such that none of the {a_1,\dots,a_n} is a scalar multiple of {a_i-a_j}.

Proof: We may assume that {d \geq 2}, since the {d \leq 1} case is vacuous. Applying a generic linear projection to {{\bf R}^2} (which does not affect collinearity, or the property that a given {a_k} is a scalar multiple of {a_i-a_j}), we may then reduce to the case {d=2}.

By a rotation and relabeling, we may assume that {a_1} lies on the negative {x}-axis; by flipping signs as necessary we may then assume that all of the {a_2,\dots,a_n} lie in the closed right half-plane. As the {a_i} are not all collinear with the origin, one of the {a_i} lies off of the {x}-axis, by relabeling, we may assume that {a_2} lies off of the {x} axis and makes a minimal angle with the {x}-axis. Then the angle of {a_2-a_1} with the {x}-axis is non-zero but smaller than any non-zero angle that any of the {a_i} make with this axis, and so none of the {a_i} are a scalar multiple of {a_2-a_1}, and the claim follows. \Box

We now return to the proof of the proposition. If the {a_1,\dots,a_n} are all collinear with the origin, then {\phi(a_1),\dots,\phi(a_n)} lie in a one-dimensional arithmetic progression {\{ mv: |m| \leq |Q| \}}, and then by rescaling we may take the {v_1,\dots,v_n} to be integers of magnitude at most {|Q| \leq n^{Cn}}, at which point we are done by hypothesis. Thus, we may assume that the {a_1,\dots,a_n} are not all collinear with the origin, and so by the above lemma and relabeling we may assume that {a_n-a_1} is non-zero, and that none of the {a_i} are scalar multiples of {a_n-a_1}.

We write

\displaystyle  a_n-a_1 = (c_1,\dots,c_d) \ \ \ \ \ (1)

with {|c_i| \leq 2 N_i} for {i=1,\dots,d}; by relabeling we may assume without loss of generality that {c_d} is non-zero, and furthermore that

\displaystyle  \frac{|c_i|}{N_i} \leq \frac{|c_d|}{N_d}

for {i=1,\dots,d}. We can also factor

\displaystyle  (c_1,\dots,c_d) = q (c'_1,\dots,c'_d) \ \ \ \ \ (2)

where {q} is a natural number and {c'_1,\dots,c'_d} have no common factor.

We now define a variant {\tilde \phi: {\bf R}^d \rightarrow {\bf R}} of {\phi: {\bf R}^d \rightarrow {\bf R}} by the map

\displaystyle  \tilde \phi(n_1,\dots,n_d) := n_1 \tilde w_1 + \dots + n_{d-1} \tilde w_{d-1} - \frac{n_d}{c_d} (c_1 \tilde w_1 + \dots + c_{d-1} \tilde w_{d-1}),

where the {\tilde w_1,\dots,\tilde w_{d-1}} are real numbers that are linearly independent over {{\bf Q}}, whose precise value will not be of importance in our argument. This is a linear map with the property that {\tilde \phi(a_n-a_1)=0}, so that {\tilde \phi(a_1),\dots,\tilde \phi(a_n)} consists of at most {n-1} distinct real numbers, which are non-zero since none of the {a_i} are scalar multiples of {a_n-a_1}, and the {\tilde w_i} are linearly independent over {{\bf Q}}. As we are assuming inductively that the lonely runner conjecture holds for {n-1}, we conclude (after deleting duplicates) that there exists at least one real number {\tilde t} such that

\displaystyle  \| \tilde t \tilde \phi(a_1) \|_{{\bf R}/{\bf Z}}, \dots, \| \tilde t \tilde \phi(a_n) \|_{{\bf R}/{\bf Z}} \geq \frac{1}{n}.

We would like to “approximate” {\phi} by {\tilde \phi} to then conclude that there is at least one real number {t} such that

\displaystyle  \| t \phi(a_1) \|_{{\bf R}/{\bf Z}}, \dots, \| t \phi(a_n) \|_{{\bf R}/{\bf Z}} \geq \frac{1}{n+1}.

It turns out that we can do this by a Fourier-analytic argument taking advantage of the {n^{C_0 n}}-proper nature of {Q}. Firstly, we see from the Dirichlet approximation theorem that one has

\displaystyle  \| \tilde t \tilde \phi(a_1) \|_{{\bf R}/{\bf Z}}, \dots, \| \tilde t \tilde \phi(a_n) \|_{{\bf R}/{\bf Z}} \leq \frac{1}{10 n^2}

for a set {\tilde t} of reals of (Banach) density {\gg n^{-O(n)}}. Thus, by the triangle inequality, we have

\displaystyle  \| \tilde t \tilde \phi(a_1) \|_{{\bf R}/{\bf Z}}, \dots, \| \tilde t \tilde \phi(a_n) \|_{{\bf R}/{\bf Z}} \geq \frac{1}{n} - \frac{1}{10n^2}

for a set {\tilde t} of reals of density {\gg n^{-O(n)}}.

Applying a smooth Fourier multiplier of Littlewood-Paley type, one can find a trigonometric polynomial

\displaystyle  \eta(x) = \sum_{m: |m| \leq n^{C_0 n/10}} b_m e^{2\pi i mx}

which takes values in {[0,1]}, is {\gg 1} for {\|x\|_{{\bf R}/{\bf Z}} \geq \frac{1}{n} - \frac{1}{10n^2}}, and is no larger than {O( n^{-100 C_0n} )} for {\|x\|_{{\bf R}/{\bf Z}} \leq \frac{1}{n+1}}. We then have

\displaystyle  \mathop{\bf E}_t \prod_{j=1}^n \eta( t \tilde \phi(a_j) ) \gg n^{-O(n)}

where {\mathop{\bf E}_t f(t)} denotes the mean value of a quasiperiodic function {f} on the reals {{\bf R}}. We expand the left-hand side out as

\displaystyle  \sum_{m_1,\dots,m_n: m_1 \tilde \phi(a_1) + \dots + m_n \tilde \phi(a_n) = 0} b_{m_1} \dots b_{m_n}.

From the genericity of {\tilde w_1,\dots,\tilde w_n}, we see that the constraint

\displaystyle  m_1 \tilde \phi(a_1) + \dots + m_n \tilde \phi(a_n) = 0

occurs if and only if {m_1 a_1 + \dots + m_n a_n} is a scalar multiple of {a_n-a_1}, or equivalently (by (1), (2)) an integer multiple of {(c'_1,\dots,c'_d)}. Thus

\displaystyle  \sum_{m_1,\dots,m_n: m_1 a_1 + \dots + m_n a_n \in {\bf Z} (c'_1,\dots,c'_d)} b_{m_1} \dots b_{m_n} \gg n^{-O(n)}. \ \ \ \ \ (3)

Next, we consider the average

\displaystyle  \mathop{\bf E}_t \varphi( t \xi ) \prod_{j=1}^n \eta( t v_j ) \ \ \ \ \ (4)


\displaystyle  \xi := c'_1 w_1 + \dots + c'_d w_d. \ \ \ \ \ (5)

and {\varphi} is the Dirichlet series

\displaystyle  \varphi(x) := \sum_{m: |m| \leq n^{C_0 n/2}} e^{2\pi i mx}.

By Fourier expansion and writing {v_j = \phi(a_j)}, we may write (4) as

\displaystyle  \sum_{m,m_1,\dots,m_n: |m| \leq n^{C_0n/2}; m_1 \phi(a_1) + \dots + m_n \phi(a_n) = m \xi} b_{m_1} \dots b_{m_n}.

The support of the {b_{m_i}} implies that {|m_i| \leq n^{C_0n/10}}. Because of the {n^{C_0 n}}-properness of {Q}, we see (for {n} large enough) that the equation

\displaystyle  m_1 \phi(a_1) + \dots + m_n \phi(a_n) = m \xi \ \ \ \ \ (6)

implies that

\displaystyle  m_1 a_1 + \dots + m_n a_n \in {\bf Z} (c'_1,\dots,c'_d) \ \ \ \ \ (7)

and conversely that (7) implies that (6) holds for some {m} with {|m| \leq n^{C_0 n/2}}. From (3) we thus have

\displaystyle  \mathop{\bf E}_t \varphi( t \xi ) \prod_{j=1}^n \eta( t v_j ) \gg n^{-O(1)}.

In particular, there exists a {t} such that

\displaystyle  \varphi( t \xi ) \prod_{j=1}^n \eta( t v_j ) \gg n^{-O(1)}.

Since {\varphi} is bounded in magnitude by {n^{C_0n/2}}, and {\eta} is bounded by {1}, we thus have

\displaystyle  \eta(t v_j) \gg n^{-C_0 n/2 - O(1)}

for each {1 \leq j \leq n}, which by the size properties of {\eta} implies that {\|tv_j\|_{{\bf R}/{\bf Z}} \geq \frac{1}{n+1}} for all {1 \leq j \leq n}, giving the lonely runner conjecture for {n}.