
I’ve just uploaded two related papers to the arXiv:

This pair of papers is an outgrowth of these two recent blog posts and the ensuing discussion. In the first paper, we establish the following logarithmically averaged version of the Chowla conjecture (in the case ${k=2}$ of two-point correlations (or “pair correlations”)):

Theorem 1 (Logarithmically averaged Chowla conjecture) Let ${a_1,a_2}$ be natural numbers, and let ${b_1,b_2}$ be integers such that ${a_1 b_2 - a_2 b_1 \neq 0}$. Let ${1 \leq \omega(x) \leq x}$ be a quantity depending on ${x}$ that goes to infinity as ${x \rightarrow \infty}$. Let ${\lambda}$ denote the Liouville function. Then one has

$\displaystyle \sum_{x/\omega(x) < n \leq x} \frac{\lambda(a_1 n + b_1) \lambda(a_2 n+b_2)}{n} = o( \log \omega(x) ) \ \ \ \ \ (1)$

as ${x \rightarrow \infty}$.

Thus for instance one has

$\displaystyle \sum_{n \leq x} \frac{\lambda(n) \lambda(n+1)}{n} = o(\log x). \ \ \ \ \ (2)$

For comparison, the non-averaged Chowla conjecture would imply that

$\displaystyle \sum_{n \leq x} \lambda(n) \lambda(n+1) = o(x) \ \ \ \ \ (3)$

which is a strictly stronger estimate than (2), and remains open.
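As a purely numerical illustration of the two normalisations in (2) and (3), one can tabulate both sums for moderate ${x}$. The following is a minimal sketch in Python; the helper names `liouville_sieve`, `log_averaged_sum` and `plain_sum` are illustrative choices and do not come from the papers, and of course no finite computation says anything about the asymptotic statements themselves.

```python
import math

def liouville_sieve(limit):
    """Return lam with lam[n] = lambda(n) for 1 <= n <= limit (lam[0] is unused)."""
    spf = list(range(limit + 1))  # spf[n] will hold the smallest prime factor of n
    for p in range(2, math.isqrt(limit) + 1):
        if spf[p] == p:  # p is prime
            for m in range(p * p, limit + 1, p):
                if spf[m] == m:
                    spf[m] = p
    lam = [0] * (limit + 1)
    lam[1] = 1
    for n in range(2, limit + 1):
        # Removing one prime factor flips the sign of lambda.
        lam[n] = -lam[n // spf[n]]
    return lam

def log_averaged_sum(x, lam):
    """Left-hand side of (2): sum of lambda(n) lambda(n+1) / n over n <= x."""
    return sum(lam[n] * lam[n + 1] / n for n in range(1, x + 1))

def plain_sum(x, lam):
    """Left-hand side of (3): sum of lambda(n) lambda(n+1) over n <= x."""
    return sum(lam[n] * lam[n + 1] for n in range(1, x + 1))

lam = liouville_sieve(10**5 + 1)
for x in (10**3, 10**4, 10**5):
    # Normalise by log x and by x respectively, as in (2) and (3).
    print(x, log_averaged_sum(x, lam) / math.log(x), plain_sum(x, lam) / x)
```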

The arguments also extend to completely multiplicative functions other than the Liouville function. In particular, one obtains a slightly averaged version of the non-asymptotic Elliott conjecture that was shown in the previous blog post to imply a positive solution to the Erdös discrepancy problem. The averaged version of the conjecture established in this paper is slightly weaker than the one assumed in the previous blog post, but it turns out that the arguments there can be modified without much difficulty to accept this averaged Elliott conjecture as input. In particular, we obtain an unconditional solution to the Erdös discrepancy problem as a consequence; this is detailed in the second paper listed above. In fact we can also handle the vector-valued version of the Erdös discrepancy problem, in which the sequence ${f(1), f(2), \dots}$ takes values in the unit sphere of an arbitrary Hilbert space, rather than in ${\{-1,+1\}}$.

Estimates such as (2) or (3) are known to be subject to the “parity problem” (discussed numerous times previously on this blog), which roughly speaking means that they cannot be proven solely using “linear” estimates on functions such as the von Mangoldt function. However, it is known that the parity problem can be circumvented using “bilinear” estimates, and this is basically what is done here.

We now describe in informal terms the proof of Theorem 1, focusing on the model case (2) for simplicity. Suppose for contradiction that the left-hand side of (2) was large and (say) positive. Using the multiplicativity ${\lambda(pn) = -\lambda(n)}$, we conclude that

$\displaystyle \sum_{n \leq x} \frac{\lambda(n) \lambda(n+p) 1_{p|n}}{n}$

is also large and positive for all primes ${p}$ that are not too large; note here how the logarithmic averaging allows us to leave the constraint ${n \leq x}$ unchanged. Summing in ${p}$, we conclude that

$\displaystyle \sum_{n \leq x} \frac{ \sum_{p \in {\mathcal P}} \lambda(n) \lambda(n+p) 1_{p|n}}{n}$

is large and positive for any given set ${{\mathcal P}}$ of medium-sized primes. By a standard averaging argument, this implies that

$\displaystyle \frac{1}{H} \sum_{j=1}^H \sum_{p \in {\mathcal P}} \lambda(n+j) \lambda(n+p+j) 1_{p|n+j} \ \ \ \ \ (4)$

is large for many choices of ${n}$, where ${H}$ is a medium-sized parameter at our disposal to choose, and we take ${{\mathcal P}}$ to be some set of primes that are somewhat smaller than ${H}$. (A similar approach was taken in this recent paper of Matomaki, Radziwill, and myself to study sign patterns of the Möbius function.) To obtain the required contradiction, one thus wants to demonstrate significant cancellation in the expression (4). As in that paper, we view ${n}$ as a random variable, in which case (4) is essentially a bilinear sum of the random sequence ${(\lambda(n+1),\dots,\lambda(n+H))}$ along a random graph ${G_{n,H}}$ on ${\{1,\dots,H\}}$, in which two vertices ${j, j+p}$ are connected if they differ by a prime ${p}$ in ${{\mathcal P}}$ that divides ${n+j}$. A key difficulty in controlling this sum is that for randomly chosen ${n}$, the sequence ${(\lambda(n+1),\dots,\lambda(n+H))}$ and the graph ${G_{n,H}}$ need not be independent. To get around this obstacle we introduce a new argument which we call the “entropy decrement argument” (in analogy with the “density increment argument” and “energy increment argument” that appear in the literature surrounding Szemerédi’s theorem on arithmetic progressions, and also reminiscent of the “entropy compression argument” of Moser and Tardos, discussed in this previous post). This argument, which is a simple consequence of the Shannon entropy inequalities, can be viewed as a quantitative version of the standard subadditivity argument that establishes the existence of Kolmogorov-Sinai entropy in topological dynamical systems; it allows one to select a scale parameter ${H}$ (in some suitable range ${[H_-,H_+]}$) for which the sequence ${(\lambda(n+1),\dots,\lambda(n+H))}$ and the graph ${G_{n,H}}$ exhibit some weak independence properties (or more precisely, the mutual information between the two random variables is small).
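The first step of the sketch above, passing from (2) to a sum restricted to multiples of ${p}$, rests on an exact identity: substituting ${m = pn}$ gives ${\lambda(m)\lambda(m+p) = \lambda(n)\lambda(n+1)}$ and ${1/m = 1/(pn)}$. The following minimal Python check verifies this with exact rational arithmetic. The function names are hypothetical, and the sum here runs over ${m \leq px}$ rather than ${m \leq x}$; with logarithmic weights the omitted range contributes only ${O(\frac{\log p}{p})}$, which is the stability under dilation alluded to in the text.

```python
from fractions import Fraction

def liouville(n):
    """Liouville function lambda(n) = (-1)^(number of prime factors with multiplicity)."""
    sign, d = 1, 2
    while d * d <= n:
        while n % d == 0:
            n //= d
            sign = -sign
        d += 1
    if n > 1:  # one remaining prime factor
        sign = -sign
    return sign

def pair_sum(x):
    """Exact value of sum_{n <= x} lambda(n) lambda(n+1) / n."""
    return sum(Fraction(liouville(n) * liouville(n + 1), n) for n in range(1, x + 1))

def dilated_sum(x, p):
    """Exact value of sum_{m <= p x, p | m} lambda(m) lambda(m+p) / m."""
    return sum(Fraction(liouville(m) * liouville(m + p), m)
               for m in range(p, p * x + 1, p))

x = 200
for p in (2, 3, 5, 7):
    # The dilation n -> pn rescales the logarithmic sum by exactly 1/p.
    print(p, pair_sum(x) == p * dilated_sum(x, p))
```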

Informally, the entropy decrement argument goes like this: if the sequence ${(\lambda(n+1),\dots,\lambda(n+H))}$ has significant mutual information with ${G_{n,H}}$, then the entropy of the sequence ${(\lambda(n+1),\dots,\lambda(n+H'))}$ for ${H' > H}$ will grow a little slower than linearly, due to the fact that the graph ${G_{n,H}}$ has zero entropy (knowledge of ${G_{n,H}}$ more or less completely determines the shifts ${G_{n+kH,H}}$ of the graph); this can be formalised using the classical Shannon inequalities for entropy (and specifically, the non-negativity of conditional mutual information). But the entropy cannot drop below zero, so by increasing ${H}$ as necessary, at some point one must reach a metastable region (cf. the finite convergence principle discussed in this previous blog post), within which very little mutual information can be shared between the sequence ${(\lambda(n+1),\dots,\lambda(n+H))}$ and the graph ${G_{n,H}}$. Curiously, for the application it is not enough to have a purely qualitative version of this argument; one needs a quantitative bound (which gains a factor of a bit more than ${\log H}$ on the trivial bound for mutual information), and this is surprisingly delicate (it ultimately comes down to the fact that the series ${\sum_{j \geq 2} \frac{1}{j \log j \log\log j}}$ diverges, which is only barely true).
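To make the quantities in this discussion concrete, here is a small Python sketch that computes the empirical Shannon entropy and mutual information ${I(X;Y) = H(X) + H(Y) - H(X,Y)}$ of the sign pattern ${(\lambda(n+1),\dots,\lambda(n+H))}$ and a toy version of the graph ${G_{n,H}}$, with ${n}$ drawn uniformly from ${\{1,\dots,N\}}$. Everything here is an assumption made for illustration: the function names, the tiny parameters ${H=4}$, ${{\mathcal P} = \{2,3\}}$, and the encoding of the graph as a set of pairs ${(j,p)}$. These parameters are nowhere near the regime of the actual argument, and empirical entropies are biased for finite samples.

```python
import math
from collections import Counter

def liouville(n):
    """Liouville function by trial division."""
    sign, d = 1, 2
    while d * d <= n:
        while n % d == 0:
            n //= d
            sign = -sign
        d += 1
    return -sign if n > 1 else sign

def entropy(samples):
    """Shannon entropy (in bits) of the empirical distribution of hashable samples."""
    total = len(samples)
    return -sum(c / total * math.log2(c / total) for c in Counter(samples).values())

def mutual_information(xs, ys):
    """I(X;Y) = H(X) + H(Y) - H(X,Y) for the empirical joint distribution."""
    return entropy(xs) + entropy(ys) - entropy(list(zip(xs, ys)))

def sign_pattern(n, H):
    return tuple(liouville(n + j) for j in range(1, H + 1))

def divisibility_graph(n, H, primes):
    """Edges {j, j+p} inside {1..H}, stored as pairs (j, p) with p dividing n + j."""
    return frozenset((j, p) for j in range(1, H + 1) for p in primes
                     if j + p <= H and (n + j) % p == 0)

H, primes, N = 4, (2, 3), 3000
patterns = [sign_pattern(n, H) for n in range(1, N + 1)]
graphs = [divisibility_graph(n, H, primes) for n in range(1, N + 1)]
print("H(pattern) =", entropy(patterns))
print("H(graph)   =", entropy(graphs))
print("I(pattern; graph) =", mutual_information(patterns, graphs))
```

The graph here depends only on ${n}$ modulo the product of the primes in ${{\mathcal P}}$, which is one way to see why its entropy stays small compared with that of a long sign pattern.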

Once one locates a scale ${H}$ with the low mutual information property, one can use standard concentration of measure results such as the Hoeffding inequality to approximate (4) by the significantly simpler expression

$\displaystyle \frac{1}{H} \sum_{j=1}^H \sum_{p \in {\mathcal P}} \frac{\lambda(n+j) \lambda(n+p+j)}{p}. \ \ \ \ \ (5)$

The important thing here is that Hoeffding’s inequality gives exponentially strong bounds on the failure probability, which is needed to counteract the logarithms that are inevitably present whenever trying to use entropy inequalities. The expression (5) can then be controlled in turn by an application of the Hardy-Littlewood circle method and a non-trivial estimate

$\displaystyle \sup_\alpha \frac{1}{X} \int_X^{2X} |\frac{1}{H} \sum_{x \leq n \leq x+H} \lambda(n) e(\alpha n)|\ dx = o(1) \ \ \ \ \ (6)$

for averaged short sums of a modulated Liouville function established in another recent paper by Matomäki, Radziwill and myself.
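The remark about exponentially strong failure probabilities can be illustrated with the textbook two-sided form of Hoeffding's inequality, ${{\bf P}(|S - {\bf E} S| \geq t) \leq 2 \exp(-2t^2 / \sum_i (b_i-a_i)^2)}$ for a sum ${S}$ of independent random variables taking values in ${[a_i,b_i]}$. The Python sketch below only evaluates this generic bound; the function name and the sample parameters are assumptions for illustration, and it does not reproduce the specific way the inequality is applied in the paper.

```python
import math

def hoeffding_tail_bound(ranges, t):
    """Upper bound on P(|S - E[S]| >= t) for a sum S of independent X_i in [a_i, b_i]."""
    denom = sum((b - a) ** 2 for a, b in ranges)
    return min(1.0, 2.0 * math.exp(-2.0 * t * t / denom))

# For m variables in [-1, 1] and a deviation of delta * m, the bound is
# 2 * exp(-delta^2 * m / 2): exponentially small in m, so it can absorb
# losses that are only polynomial (or logarithmic) in m.
delta = 0.3
for m in (100, 400, 1600):
    print(m, hoeffding_tail_bound([(-1.0, 1.0)] * m, delta * m))
```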

When one uses this method to study more general sums such as

$\displaystyle \sum_{n \leq x} \frac{g_1(n) g_2(n+1)}{n},$

one ends up having to consider expressions such as

$\displaystyle \frac{1}{H} \sum_{j=1}^H \sum_{p \in {\mathcal P}} c_p \frac{g_1(n+j) g_2(n+p+j)}{p},$

where ${c_p}$ is the coefficient ${c_p := \overline{g_1}(p) \overline{g_2}(p)}$. When attacking this sum with the circle method, one soon finds oneself in the situation of wanting to locate the large Fourier coefficients of the exponential sum

$\displaystyle S(\alpha) := \sum_{p \in {\mathcal P}} \frac{c_p}{p} e^{2\pi i \alpha p}.$

In many cases (such as in the application to the Erdös discrepancy problem), the coefficient ${c_p}$ is identically ${1}$, and one can understand this sum satisfactorily using the classical results of Vinogradov: basically, ${S(\alpha)}$ is large when ${\alpha}$ lies in a “major arc” and is small when it lies in a “minor arc”. For more general functions ${g_1,g_2}$, the coefficients ${c_p}$ are more or less arbitrary; the large values of ${S(\alpha)}$ are no longer confined to the major arc case. Fortunately, even in this general situation one can use a restriction theorem for the primes established some time ago by Ben Green and myself to show that there are still only a bounded number of possible locations ${\alpha}$ (up to the uncertainty mandated by the Heisenberg uncertainty principle) where ${S(\alpha)}$ is large, and we can still conclude by using (6). (Actually, as recently pointed out to me by Ben, one does not need the full strength of our result; one only needs the ${L^4}$ restriction theorem for the primes, which can be proven fairly directly using Plancherel’s theorem and some sieve theory.)
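For readers who want to experiment with the sum ${S(\alpha)}$ in the model case ${c_p = 1}$, here is a minimal Python sketch that evaluates it at a few frequencies and compares with the trivial bound ${\sum_{p \in {\mathcal P}} \frac{1}{p}}$. The choice of ${{\mathcal P}}$ as the primes in ${[100,1000]}$, the sample frequencies, and the function names are all assumptions made for illustration.

```python
import cmath
import math

def primes_between(lo, hi):
    """All primes p with lo <= p <= hi, via a simple sieve of Eratosthenes."""
    is_prime = [True] * (hi + 1)
    is_prime[0] = False
    if hi >= 1:
        is_prime[1] = False
    for p in range(2, math.isqrt(hi) + 1):
        if is_prime[p]:
            for m in range(p * p, hi + 1, p):
                is_prime[m] = False
    return [p for p in range(max(lo, 2), hi + 1) if is_prime[p]]

def S(alpha, primes, c=lambda p: 1.0):
    """S(alpha) = sum over p of (c_p / p) * e(alpha p), with e(x) = exp(2 pi i x)."""
    return sum(c(p) / p * cmath.exp(2j * math.pi * alpha * p) for p in primes)

primes = primes_between(100, 1000)
trivial = sum(1.0 / p for p in primes)
for alpha in (0.0, 1 / 3, 1 / 4, math.sqrt(2) - 1):
    # A ratio close to 1 means essentially no cancellation at this frequency.
    print(f"alpha = {alpha:.4f}  |S(alpha)| / trivial bound = {abs(S(alpha, primes)) / trivial:.4f}")
```

Passing a different `c` (any function of ${p}$ bounded by ${1}$) gives a quick way to see how the large values of ${|S(\alpha)|}$ can move away from rationals with small denominator.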

It is tempting to also use the method to attack higher order cases of the (logarithmically) averaged Chowla conjecture, for instance one could try to prove the estimate

$\displaystyle \sum_{n \leq x} \frac{\lambda(n) \lambda(n+1) \lambda(n+2)}{n} = o(\log x).$

The above arguments reduce matters to obtaining some non-trivial cancellation for sums of the form

$\displaystyle \frac{1}{H} \sum_{j=1}^H \sum_{p \in {\mathcal P}} \frac{\lambda(n+j) \lambda(n+p+j) \lambda(n+2p+j)}{p}.$

A little bit of “higher order Fourier analysis” (as was done for very similar sums in the ergodic theory context by Frantzikinakis-Host-Kra and Wooley-Ziegler) lets one control this sort of sum if one can establish a bound of the form

$\displaystyle \frac{1}{X} \int_X^{2X} \sup_\alpha |\frac{1}{H} \sum_{x \leq n \leq x+H} \lambda(n) e(\alpha n)|\ dx = o(1) \ \ \ \ \ (7)$

where ${X}$ goes to infinity and ${H}$ is a very slowly growing function of ${X}$. This looks very similar to (6), but the fact that the supremum is now inside the integral makes the problem much more difficult. However it looks worth attacking (7) further, as this estimate looks like it should have many nice applications (beyond just the ${k=3}$ case of the logarithmically averaged Chowla or Elliott conjectures, which is already interesting).
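The difference between (6) and (7) is only the position of the supremum, and one always has (6) ${\leq}$ (7) since a maximum of averages never exceeds the average of the maxima. The Python sketch below computes crude discretised versions of both quantities. The discretisation (a grid of ${64}$ frequencies, every 25th value of ${x}$), the parameters ${X = 2000}$, ${H = 10}$, and the function names are assumptions for illustration only, and say nothing about the asymptotic regime.

```python
import cmath
import math

def liouville_sieve(limit):
    """lam[n] = lambda(n) for 1 <= n <= limit, via smallest prime factors."""
    spf = list(range(limit + 1))
    for p in range(2, math.isqrt(limit) + 1):
        if spf[p] == p:
            for m in range(p * p, limit + 1, p):
                if spf[m] == m:
                    spf[m] = p
    lam = [0] * (limit + 1)
    lam[1] = 1
    for n in range(2, limit + 1):
        lam[n] = -lam[n // spf[n]]
    return lam

def short_sum(lam, x, H, alpha):
    """|(1/H) sum_{x <= n <= x+H} lambda(n) e(alpha n)|."""
    total = sum(lam[n] * cmath.exp(2j * math.pi * alpha * n) for n in range(x, x + H + 1))
    return abs(total) / H

def sup_outside(lam, xs, H, alphas):
    """Discrete analogue of (6): supremum over alpha of the average over x."""
    return max(sum(short_sum(lam, x, H, a) for x in xs) / len(xs) for a in alphas)

def sup_inside(lam, xs, H, alphas):
    """Discrete analogue of (7): average over x of the supremum over alpha."""
    return sum(max(short_sum(lam, x, H, a) for a in alphas) for x in xs) / len(xs)

X, H = 2000, 10
lam = liouville_sieve(2 * X + H)
xs = list(range(X, 2 * X, 25))
alphas = [k / 64 for k in range(64)]
outside = sup_outside(lam, xs, H, alphas)
inside = sup_inside(lam, xs, H, alphas)
print("sup outside:", outside, " sup inside:", inside)
```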

For higher ${k}$ than ${k=3}$, the same line of analysis requires one to replace the linear phase ${e(\alpha n)}$ by more complicated phases, such as quadratic phases ${e(\alpha n^2 + \beta n)}$ or even ${k-2}$-step nilsequences. Given that (7) is already beyond the reach of current literature, these even more complicated expressions are also unavailable at present, but one can imagine that they will eventually become tractable, in which case we would obtain an averaged form of the Chowla conjecture for all ${k}$, which would have a number of consequences (such as a logarithmically averaged version of Sarnak’s conjecture, as per this blog post).

It would of course be very nice to remove the logarithmic averaging, and be able to establish bounds such as (3). I did attempt to do so, but I do not see a way to use the entropy decrement argument in a manner that does not require some sort of averaging of logarithmic type, as it requires one to pick a scale ${H}$ that one cannot specify in advance, which is not a problem for logarithmic averages (which are quite stable with respect to dilations) but is problematic for ordinary averages. But perhaps the problem can be circumvented by some clever modification of the argument. One possible approach would be to start exploiting multiplicativity at products of primes, and not just individual primes, to try to keep the scale fixed, but this makes the concentration of measure part of the argument much more complicated as one loses some independence properties (coming from the Chinese remainder theorem) which allowed one to conclude just from the Hoeffding inequality.

The Chowla conjecture asserts that all non-trivial correlations of the Liouville function are asymptotically negligible; for instance, it asserts that

$\displaystyle \sum_{n \leq X} \lambda(n) \lambda(n+h) = o(X)$

as ${X \rightarrow \infty}$ for any fixed natural number ${h}$. This conjecture remains open, though there are a number of partial results (e.g. these two previous results of Matomaki, Radziwill, and myself).

A natural generalisation of Chowla’s conjecture was proposed by Elliott. For simplicity we will only consider Elliott’s conjecture for the pair correlations

$\displaystyle \sum_{n \leq X} g(n) \overline{g}(n+h).$

For such correlations, the conjecture was that one had

$\displaystyle \sum_{n \leq X} g(n) \overline{g}(n+h) = o(X) \ \ \ \ \ (1)$

as ${X \rightarrow \infty}$ for any natural number ${h}$, as long as ${g}$ was a completely multiplicative function with magnitude bounded by ${1}$, and such that

$\displaystyle \sum_p \hbox{Re} \frac{1 - g(p) \overline{\chi(p)} p^{-it}}{p} = +\infty \ \ \ \ \ (2)$

for any Dirichlet character ${\chi}$ and any real number ${t}$. In the language of “pretentious number theory”, as developed by Granville and Soundararajan, the hypothesis (2) asserts that the completely multiplicative function ${g}$ does not “pretend” to be like the completely multiplicative function ${n \mapsto \chi(n) n^{it}}$ for any character ${\chi}$ and real number ${t}$. A condition of this form is necessary; for instance, if ${g(n)}$ is precisely equal to ${\chi(n) n^{it}}$ and ${\chi}$ has period ${q}$, then ${g(n) \overline{g}(n+q)}$ is equal to ${1_{(n,q)=1} + o(1)}$ as ${n \rightarrow \infty}$ and (1) clearly fails. The prime number theorem in arithmetic progressions implies that the Liouville function obeys (2), and so the Elliott conjecture contains the Chowla conjecture as a special case.
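The truncated version of the sum in (2) is easy to compute, which gives some feel for what "pretending" means. In the Python sketch below, the function names and the cutoff ${X}$ are assumptions for illustration. For ${g = \lambda}$, trivial ${\chi}$ and ${t = 0}$ every prime contributes ${2/p}$, so the truncated sum grows (slowly) with ${X}$; for ${g = \chi = \chi_3}$ and ${t=0}$ only the prime ${3}$ contributes, and the sum stays bounded.

```python
import cmath
import math

def primes_up_to(X):
    """All primes p <= X via a simple sieve."""
    is_prime = [True] * (X + 1)
    is_prime[0:2] = [False] * min(2, X + 1)
    for p in range(2, math.isqrt(X) + 1):
        if is_prime[p]:
            for m in range(p * p, X + 1, p):
                is_prime[m] = False
    return [p for p in range(2, X + 1) if is_prime[p]]

def chi3(n):
    """The non-principal Dirichlet character of period 3."""
    return (0, 1, -1)[n % 3]

def pretentious_sum(g, chi, t, X):
    """Truncation of (2): sum over p <= X of Re(1 - g(p) conj(chi(p)) p^{-it}) / p."""
    total = 0.0
    for p in primes_up_to(X):
        twist = complex(chi(p)).conjugate() * cmath.exp(-1j * t * math.log(p))
        total += (1 - g(p) * twist).real / p
    return total

def liouville_at_prime(p):
    return -1  # lambda(p) = -1 for every prime p

def trivial_character(n):
    return 1

for X in (10**2, 10**3, 10**4):
    print(X,
          pretentious_sum(liouville_at_prime, trivial_character, 0.0, X),
          pretentious_sum(chi3, chi3, 0.0, X))
```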

As it turns out, Elliott’s conjecture is false as stated, with the counterexample ${g}$ having the property that ${g}$ “pretends” locally to be the function ${n \mapsto n^{it_j}}$ for ${n}$ in various intervals ${[1, X_j]}$, where ${X_j}$ and ${t_j}$ go to infinity in a certain prescribed sense. See this paper of Matomaki, Radziwill, and myself for details. However, we view this as a technicality, and continue to believe that certain “repaired” versions of Elliott’s conjecture still hold. For instance, our counterexample does not apply when ${g}$ is restricted to be real-valued rather than complex, and we believe that Elliott’s conjecture is valid in this setting. Returning to the complex-valued case, we still expect the asymptotic (1) provided that the condition (2) is replaced by the stronger condition

$\displaystyle \inf_{|t| \leq X} |\sum_{p \leq X} \hbox{Re} \frac{1 - g(p) \overline{\chi(p)} p^{-it}}{p}| \rightarrow +\infty$

as ${X \rightarrow +\infty}$ for all fixed Dirichlet characters ${\chi}$. In our paper we supported this claim by establishing a certain “averaged” version of this conjecture; see that paper for further details. (See also this recent paper of Frantzikinakis and Host which establishes a different averaged version of this conjecture.)

One can make a stronger “non-asymptotic” version of this corrected Elliott conjecture, in which the ${X}$ parameter does not go to infinity, or equivalently that the function ${g}$ is permitted to depend on ${X}$:

Conjecture 1 (Non-asymptotic Elliott conjecture) Let ${\varepsilon > 0}$, let ${A \geq 1}$ be sufficiently large depending on ${\varepsilon}$, and let ${X}$ be sufficiently large depending on ${A,\varepsilon}$. Suppose that ${g}$ is a completely multiplicative function with magnitude bounded by ${1}$, such that

$\displaystyle \inf_{|t| \leq AX} |\sum_{p \leq X} \hbox{Re} \frac{1 - g(p) \overline{\chi(p)} p^{-it}}{p}| \geq A$

for all Dirichlet characters ${\chi}$ of period at most ${A}$. Then one has

$\displaystyle |\sum_{n \leq X} g(n) \overline{g(n+h)}| \leq \varepsilon X$

for all natural numbers ${1 \leq h \leq 1/\varepsilon}$.

The ${\varepsilon}$-dependent factor ${A}$ in the constraint ${|t| \leq AX}$ is necessary, as can be seen by considering the completely multiplicative function ${g(n) := n^{2iX}}$ (for instance). Again, the results in my previous paper with Matomaki and Radziwill can be viewed as establishing an averaged version of this conjecture.
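One half of this remark can be checked numerically: for ${g(n) = n^{2iX}}$ the correlation ${\sum_{n \leq X} g(n) \overline{g(n+h)}}$ does not exhibit much cancellation, because the phase ${2X(\log n - \log(n+h)) \approx -2Xh/n}$ varies slowly for ${n}$ comparable to ${X}$. The Python sketch below computes the normalised correlation for ${h=1}$; the function name and the value ${X = 2000}$ are assumptions for illustration. The other half of the remark (that such a ${g}$ would not be excluded if the hypothesis only tested ${|t| \leq X}$) is not checked here.

```python
import cmath
import math

def normalized_correlation(X, h, t):
    """|sum_{n <= X} g(n) conj(g(n+h))| / X for g(n) = n^{it}."""
    total = sum(cmath.exp(1j * t * (math.log(n) - math.log(n + h)))
                for n in range(1, X + 1))
    return abs(total) / X

X = 2000
ratio = normalized_correlation(X, 1, 2 * X)
print("normalised correlation for g(n) = n^{2iX}, h = 1:", ratio)
```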

Meanwhile, we have the following conjecture that is the focus of the Polymath5 project:

Conjecture 2 (Erdös discrepancy conjecture) For any function ${f: {\bf N} \rightarrow \{-1,+1\}}$, the discrepancy

$\displaystyle \sup_{n,d \in {\bf N}} |\sum_{j=1}^n f(jd)|$

is infinite.
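For a finite initial segment of ${f}$ the discrepancy is straightforward to compute, which is convenient for experimenting with candidate sequences. The following Python sketch (the function name and the truncation to ${nd \leq N}$ are assumptions for illustration) computes the largest ${|\sum_{j=1}^n f(jd)|}$ over all ${n,d}$ with ${nd \leq N}$.

```python
def discrepancy(f, N):
    """Max over n, d with n*d <= N of |f(d) + f(2d) + ... + f(nd)|."""
    best = 0
    for d in range(1, N + 1):
        partial = 0
        for m in range(d, N + 1, d):  # m runs over d, 2d, 3d, ...
            partial += f(m)
            best = max(best, abs(partial))
    return best

def alternating(j):
    """The sequence -1, +1, -1, +1, ...: bounded sums for d = 1, but constant along even d."""
    return 1 if j % 2 == 0 else -1

for N in (10, 100, 1000):
    print(N, discrepancy(alternating, N))
```

The alternating sequence has partial sums bounded by ${1}$ along ${d=1}$, but is identically ${+1}$ along ${d=2}$, so its truncated discrepancy is ${\lfloor N/2 \rfloor}$; any putative counterexample has to control all common differences ${d}$ simultaneously.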

It is instructive to compute some near-counterexamples to Conjecture 2 that illustrate the difficulty of the Erdös discrepancy problem. The first near-counterexample is that of a non-principal Dirichlet character ${f(n) = \chi(n)}$ that takes values in ${\{-1,0,+1\}}$ rather than ${\{-1,+1\}}$. For this function, one has from the complete multiplicativity of ${\chi}$ that

$\displaystyle |\sum_{j=1}^n f(jd)| = |\sum_{j=1}^n \chi(j) \chi(d)|$

$\displaystyle \leq |\sum_{j=1}^n \chi(j)|.$

If ${q}$ denotes the period of ${\chi}$, then ${\chi}$ has mean zero on every interval of length ${q}$, and thus

$\displaystyle |\sum_{j=1}^n f(jd)| \leq |\sum_{j=1}^n \chi(j)| \leq q.$

Thus ${\chi}$ has bounded discrepancy.
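The two ingredients of this computation, complete multiplicativity and bounded partial sums, can be checked directly for small characters. The Python sketch below does this for the non-principal characters of period ${3}$ and ${4}$; the function names and the ranges tested are assumptions for illustration.

```python
from itertools import accumulate

def chi3(n):
    """Non-principal character of period 3: values 0, +1, -1 on residues 0, 1, 2."""
    return (0, 1, -1)[n % 3]

def chi4(n):
    """Non-principal character of period 4: values 0, +1, 0, -1 on residues 0, 1, 2, 3."""
    return (0, 1, 0, -1)[n % 4]

def max_partial_sum(chi, N):
    """max over n <= N of |chi(1) + ... + chi(n)|."""
    return max(abs(s) for s in accumulate(chi(j) for j in range(1, N + 1)))

def is_completely_multiplicative(chi, N):
    """Check chi(ab) = chi(a) chi(b) for all 1 <= a, b <= N."""
    return all(chi(a * b) == chi(a) * chi(b)
               for a in range(1, N + 1) for b in range(1, N + 1))

for name, chi, q in (("chi3", chi3, 3), ("chi4", chi4, 4)):
    print(name, is_completely_multiplicative(chi, 60), max_partial_sum(chi, 1000), "<=", q)
```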

Of course, this is not a true counterexample to Conjecture 2 because ${\chi}$ can take the value ${0}$. Let us now consider the following variant example, which is the simplest member of a family of examples studied by Borwein, Choi, and Coons. Let ${\chi = \chi_3}$ be the non-principal Dirichlet character of period ${3}$ (thus ${\chi(n)}$ equals ${+1}$ when ${n=1 \hbox{ mod } 3}$, ${-1}$ when ${n = 2 \hbox{ mod } 3}$, and ${0}$ when ${n = 0 \hbox{ mod } 3}$), and define the completely multiplicative function ${f = \tilde \chi: {\bf N} \rightarrow \{-1,+1\}}$ by setting ${\tilde \chi(p) := \chi(p)}$ when ${p \neq 3}$ and ${\tilde \chi(3) = +1}$. This is about the simplest modification one can make to the previous near-counterexample to eliminate the zeroes. Now consider the sum

$\displaystyle \sum_{j=1}^n \tilde \chi(j)$

with ${n := 1 + 3 + 3^2 + \dots + 3^k}$ for some large ${k}$. Writing ${j = 3^a m}$ with ${m}$ coprime to ${3}$ and ${a}$ at most ${k}$, we can write this sum as

$\displaystyle \sum_{a=0}^k \sum_{1 \leq m \leq n/3^a: (m,3)=1} \tilde \chi(3^a m).$

Now observe that ${\tilde \chi(3^a m) = \tilde \chi(3)^a \tilde \chi(m) = \chi(m)}$. The function ${\chi}$ has mean zero on every interval of length three, and ${\lfloor n/3^a\rfloor}$ is equal to ${1}$ mod ${3}$, and thus

$\displaystyle \sum_{1 \leq m \leq n/3^a: (m,3)=1} \tilde \chi(3^a m) = 1$

for every ${a=0,\dots,k}$, and thus

$\displaystyle \sum_{j=1}^n \tilde \chi(j) = k+1 \gg \log n.$

Thus ${\tilde \chi}$ has unbounded discrepancy, but only barely so (it grows logarithmically in ${n}$). These examples suggest that the main “enemy” to proving Conjecture 2 comes from completely multiplicative functions ${f}$ that somehow “pretend” to be like a Dirichlet character but do not vanish at the zeroes of that character. (Indeed, the special case of Conjecture 2 when ${f}$ is completely multiplicative is already open, and appears to be an important subcase.)
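The computation above is easy to confirm by machine. In the Python sketch below (function names are illustrative assumptions), ${\tilde \chi}$ is evaluated by stripping all factors of ${3}$ and applying ${\chi_3}$ to what remains, which is valid because ${\tilde \chi}$ is completely multiplicative with ${\tilde \chi(3) = +1}$; the partial sum up to ${n = 1 + 3 + \dots + 3^k}$ is then compared with ${k+1}$.

```python
def chi3(n):
    """Non-principal character of period 3."""
    return (0, 1, -1)[n % 3]

def chi_tilde(n):
    """Completely multiplicative modification of chi3 with chi_tilde(3) = +1."""
    while n % 3 == 0:
        n //= 3  # each factor of 3 contributes chi_tilde(3) = +1
    return chi3(n)

def partial_sum(n):
    """chi_tilde(1) + ... + chi_tilde(n)."""
    return sum(chi_tilde(j) for j in range(1, n + 1))

for k in range(8):
    n = (3 ** (k + 1) - 1) // 2  # n = 1 + 3 + ... + 3^k
    print(k, n, partial_sum(n))
```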

All of these conjectures remain open. However, I would like to record in this blog post the following striking connection, illustrating the power of the Elliott conjecture (particularly in its non-asymptotic formulation):

Theorem 3 (Elliott conjecture implies unbounded discrepancy) Conjecture 1 implies Conjecture 2.

The argument relies heavily on two observations that were previously made in connection with the Polymath5 project. The first is a Fourier-analytic reduction that replaces the Erdös discrepancy problem with an averaged version for completely multiplicative functions ${g}$. An application of Cauchy-Schwarz then shows that any counterexample to that version will violate the conclusion of Conjecture 1, so if one assumes that conjecture then ${g}$ must pretend to be like a function of the form ${n \mapsto \chi(n) n^{it}}$. One then uses (a generalisation of) a second argument from Polymath5 to rule out this case, basically by reducing matters to a more complicated version of the Borwein-Choi-Coons analysis. Details are provided below the fold.

There is some hope that the Chowla and Elliott conjectures can be attacked, as the parity barrier which is so impervious to attack for the twin prime conjecture seems to be more permeable in this setting. (For instance, in my previous post I raised a possible approach, based on establishing expander properties of a certain random graph, which seems to get around the parity problem, in principle at least.)

(Update, Sep 25: fixed some treatment of error terms, following a suggestion of Andrew Granville.)