You are currently browsing the tag archive for the ‘Joni Teravainen’ tag.

Joni Teräväinen and myself have just uploaded to the arXiv our preprint “Quantitative bounds for Gowers uniformity of the Möbius and von Mangoldt functions“. This paper makes quantitative the Gowers uniformity estimates on the Möbius function ${\mu}$ and the von Mangoldt function ${\Lambda}$.

To discuss the results we first discuss the situation of the Möbius function, which is technically simpler in some (though not all) ways. We assume familiarity with Gowers norms and standard notations around these norms, such as the averaging notation ${\mathop{\bf E}_{n \in [N]}}$ and the exponential notation ${e(\theta) = e^{2\pi i \theta}}$. The prime number theorem in qualitative form asserts that

$\displaystyle \mathop{\bf E}_{n \in [N]} \mu(n) = o(1)$

as ${N \rightarrow \infty}$. With Vinogradov-Korobov error term, the prime number theorem is strengthened to

$\displaystyle \mathop{\bf E}_{n \in [N]} \mu(n) \ll \exp( - c \log^{3/5} N (\log \log N)^{-1/5} );$

we refer to such decay bounds (With ${\exp(-c\log^c N)}$ type factors) as pseudopolynomial decay. Equivalently, we obtain pseudopolynomial decay of Gowers ${U^1}$ seminorm of ${\mu}$:

$\displaystyle \| \mu \|_{U^1([N])} \ll \exp( - c \log^{3/5} N (\log \log N)^{-1/5} ).$

As is well known, the Riemann hypothesis would be equivalent to an upgrade of this estimate to polynomial decay of the form

$\displaystyle \| \mu \|_{U^1([N])} \ll_\varepsilon N^{-1/2+\varepsilon}$

for any ${\varepsilon>0}$.

Once one restricts to arithmetic progressions, the situation gets worse: the Siegel-Walfisz theorem gives the bound

$\displaystyle \| \mu 1_{a \hbox{ mod } q}\|_{U^1([N])} \ll_A \log^{-A} N \ \ \ \ \ (1)$

for any residue class ${a \hbox{ mod } q}$ and any ${A>0}$, but with the catch that the implied constant is ineffective in ${A}$. This ineffectivity cannot be removed without further progress on the notorious Siegel zero problem.

In 1937, Davenport was able to show the discorrelation estimate

$\displaystyle \mathop{\bf E}_{n \in [N]} \mu(n) e(-\alpha n) \ll_A \log^{-A} N$

for any ${A>0}$ uniformly in ${\alpha \in {\bf R}}$, which leads (by standard Fourier arguments) to the Fourier uniformity estimate

$\displaystyle \| \mu \|_{U^2([N])} \ll_A \log^{-A} N.$

Again, the implied constant is ineffective. If one insists on effective constants, the best bound currently available is

$\displaystyle \| \mu \|_{U^2([N])} \ll \log^{-c} N \ \ \ \ \ (2)$

for some small effective constant ${c>0}$.

For the situation with the ${U^3}$ norm the previously known results were much weaker. Ben Green and I showed that

$\displaystyle \mathop{\bf E}_{n \in [N]} \mu(n) \overline{F}(g(n) \Gamma) \ll_{A,F,G/\Gamma} \log^{-A} N \ \ \ \ \ (3)$

uniformly for any ${A>0}$, any degree two (filtered) nilmanifold ${G/\Gamma}$, any polynomial sequence ${g: {\bf Z} \rightarrow G}$, and any Lipschitz function ${F}$; again, the implied constants are ineffective. On the other hand, in a separate paper of Ben Green and myself, we established the following inverse theorem: if for instance we knew that

$\displaystyle \| \mu \|_{U^3([N])} \geq \delta$

for some ${0 < \delta < 1/2}$, then there exists a degree two nilmanifold ${G/\Gamma}$ of dimension ${O( \delta^{-O(1)} )}$, complexity ${O( \delta^{-O(1)} )}$, a polynomial sequence ${g: {\bf Z} \rightarrow G}$, and Lipschitz function ${F}$ of Lipschitz constant ${O(\delta^{-O(1)})}$ such that

$\displaystyle \mathop{\bf E}_{n \in [N]} \mu(n) \overline{F}(g(n) \Gamma) \gg \exp(-\delta^{-O(1)}).$

Putting the two assertions together and comparing all the dependencies on parameters, one can establish the qualitative decay bound

$\displaystyle \| \mu \|_{U^3([N])} = o(1).$

However the decay rate ${o(1)}$ produced by this argument is completely ineffective: obtaining a bound on when this ${o(1)}$ quantity dips below a given threshold ${\delta}$ depends on the implied constant in (3) for some ${G/\Gamma}$ whose dimension depends on ${\delta}$, and the dependence on ${\delta}$ obtained in this fashion is ineffective in the face of a Siegel zero.

For higher norms ${U^k, k \geq 3}$, the situation is even worse, because the quantitative inverse theory for these norms is poorer, and indeed it was only with the recent work of Manners that any such bound is available at all (at least for ${k>4}$). Basically, Manners establishes if

$\displaystyle \| \mu \|_{U^k([N])} \geq \delta$

then there exists a degree ${k-1}$ nilmanifold ${G/\Gamma}$ of dimension ${O( \delta^{-O(1)} )}$, complexity ${O( \exp\exp(\delta^{-O(1)}) )}$, a polynomial sequence ${g: {\bf Z} \rightarrow G}$, and Lipschitz function ${F}$ of Lipschitz constant ${O(\exp\exp(\delta^{-O(1)}))}$ such that

$\displaystyle \mathop{\bf E}_{n \in [N]} \mu(n) \overline{F}(g(n) \Gamma) \gg \exp\exp(-\delta^{-O(1)}).$

(We allow all implied constants to depend on ${k}$.) Meanwhile, the bound (3) was extended to arbitrary nilmanifolds by Ben and myself. Again, the two results when concatenated give the qualitative decay

$\displaystyle \| \mu \|_{U^k([N])} = o(1)$

but the decay rate is completely ineffective.

Our first result gives an effective decay bound:

Theorem 1 For any ${k \geq 2}$, we have ${\| \mu \|_{U^k([N])} \ll (\log\log N)^{-c_k}}$ for some ${c_k>0}$. The implied constants are effective.

This is off by a logarithm from the best effective bound (2) in the ${k=2}$ case. In the ${k=3}$ case there is some hope to remove this logarithm based on the improved quantitative inverse theory currently available in this case, but there is a technical obstruction to doing so which we will discuss later in this post. For ${k>3}$ the above bound is the best one could hope to achieve purely using the quantitative inverse theory of Manners.

We have analogues of all the above results for the von Mangoldt function ${\Lambda}$. Here a complication arises that ${\Lambda}$ does not have mean close to zero, and one has to subtract off some suitable approximant ${\Lambda^\sharp}$ to ${\Lambda}$ before one would expect good Gowers norms bounds. For the prime number theorem one can just use the approximant ${1}$, giving

$\displaystyle \| \Lambda - 1 \|_{U^1([N])} \ll \exp( - c \log^{3/5} N (\log \log N)^{-1/5} )$

but even for the prime number theorem in arithmetic progressions one needs a more accurate approximant. In our paper it is convenient to use the “Cramér approximant”

$\displaystyle \Lambda_{\hbox{Cram\'er}}(n) := \frac{W}{\phi(W)} 1_{(n,W)=1}$

where

$\displaystyle W := \prod_{p

and ${Q}$ is the quasipolynomial quantity

$\displaystyle Q = \exp(\log^{1/10} N). \ \ \ \ \ (4)$

Then one can show from the Siegel-Walfisz theorem and standard bilinear sum methods that

$\displaystyle \mathop{\bf E}_{n \in [N]} (\Lambda - \Lambda_{\hbox{Cram\'er}}(n)) e(-\alpha n) \ll_A \log^{-A} N$

and

$\displaystyle \| \Lambda - \Lambda_{\hbox{Cram\'er}}\|_{U^2([N])} \ll_A \log^{-A} N$

for all ${A>0}$ and ${\alpha \in {\bf R}}$ (with an ineffective dependence on ${A}$), again regaining effectivity if ${A}$ is replaced by a sufficiently small constant ${c>0}$. All the previously stated discorrelation and Gowers uniformity results for ${\mu}$ then have analogues for ${\Lambda}$, and our main result is similarly analogous:

Theorem 2 For any ${k \geq 2}$, we have ${\| \Lambda - \Lambda_{\hbox{Cram\'er}} \|_{U^k([N])} \ll (\log\log N)^{-c_k}}$ for some ${c_k>0}$. The implied constants are effective.

By standard methods, this result also gives quantitative asymptotics for counting solutions to various systems of linear equations in primes, with error terms that gain a factor of ${O((\log\log N)^{-c})}$ with respect to the main term.

We now discuss the methods of proof, focusing first on the case of the Möbius function. Suppose first that there is no “Siegel zero”, by which we mean a quadratic character ${\chi}$ of some conductor ${q \leq Q}$ with a zero ${L(\beta,\chi)}$ with ${1 - \beta \leq \frac{c}{\log Q}}$ for some small absolute constant ${c>0}$. In this case the Siegel-Walfisz bound (1) improves to a quasipolynomial bound

$\displaystyle \| \mu 1_{a \hbox{ mod } q}\|_{U^1([N])} \ll \exp(-\log^c N). \ \ \ \ \ (5)$

To establish Theorem 1 in this case, it suffices by Manners’ inverse theorem to establish the polylogarithmic bound

$\displaystyle \mathop{\bf E}_{n \in [N]} \mu(n) \overline{F}(g(n) \Gamma) \ll \exp(-\log^c N) \ \ \ \ \ (6)$

for all degree ${k-1}$ nilmanifolds ${G/\Gamma}$ of dimension ${O((\log\log N)^c)}$ and complexity ${O( \exp(\log^c N))}$, all polynomial sequences ${g}$, and all Lipschitz functions ${F}$ of norm ${O( \exp(\log^c N))}$. If the nilmanifold ${G/\Gamma}$ had bounded dimension, then one could repeat the arguments of Ben and myself more or less verbatim to establish this claim from (5), which relied on the quantitative equidistribution theory on nilmanifolds developed in a separate paper of Ben and myself. Unfortunately, in the latter paper the dependence of the quantitative bounds on the dimension ${d}$ was not explicitly given. In an appendix to the current paper, we go through that paper to account for this dependence, showing that all exponents depend at most doubly exponentially in the dimension ${d}$, which is barely sufficient to handle the dimension of ${O((\log\log N)^c)}$ that arises here.

Now suppose we have a Siegel zero ${L(\beta,\chi)}$. In this case the bound (5) will not hold in general, and hence also (6) will not hold either. Here, the usual way out (while still maintaining effective estimates) is to approximate ${\mu}$ not by ${0}$, but rather by a more complicated approximant ${\mu_{\hbox{Siegel}}}$ that takes the Siegel zero into account, and in particular is such that one has the (effective) pseudopolynomial bound

$\displaystyle \| (\mu - \mu_{\hbox{Siegel}}) 1_{a \hbox{ mod } q}\|_{U^1([N])} \ll \exp(-\log^c N) \ \ \ \ \ (7)$

for all residue classes ${a \hbox{ mod } q}$. The Siegel approximant to ${\mu}$ is actually a little bit complicated, and to our knowledge the first appearance of this sort of approximant only appears as late as this 2010 paper of Germán and Katai. Our version of this approximant is defined as the multiplicative function such that

$\displaystyle \mu_{\hbox{Siegel}}(p^j) = \mu(p^j)$

when ${p < Q}$, and

$\displaystyle \mu_{\hbox{Siegel}}(n) = \alpha n^{\beta-1} \chi(n)$

when ${n}$ is coprime to all primes ${p, and ${\alpha}$ is a normalising constant given by the formula

$\displaystyle \alpha := \frac{1}{L'(\beta,\chi)} \prod_{p

(this constant ends up being of size ${O(1)}$ and plays only a minor role in the analysis). This is a rather complicated formula, but it seems to be virtually the only choice of approximant that allows for bounds such as (7) to hold. (This is the one aspect of the problem where the von Mangoldt theory is simpler than the Möbius theory, as in the former one only needs to work with very rough numbers for which one does not need to make any special accommodations for the behavior at small primes when introducing the Siegel correction term.) With this starting point it is then possible to repeat the analysis of my previous papers with Ben and obtain the pseudopolynomial discorrelation bound

$\displaystyle \mathop{\bf E}_{n \in [N]} (\mu - \mu_{\hbox{Siegel}})(n) \overline{F}(g(n) \Gamma) \ll \exp(-\log^c N)$

for ${F(g(n)\Gamma)}$ as before, which when combined with Manners’ inverse theorem gives the doubly logarithmic bound

$\displaystyle \| \mu - \mu_{\hbox{Siegel}} \|_{U^k([N])} \ll (\log\log N)^{-c_k}.$

Meanwhile, a direct sieve-theoretic computation ends up giving the singly logarithmic bound

$\displaystyle \| \mu_{\hbox{Siegel}} \|_{U^k([N])} \ll \log^{-c_k} N$

(indeed, there is a good chance that one could improve the bounds even further, though it is not helpful for this current argument to do so). Theorem 1 then follows from the triangle inequality for the Gowers norm. It is interesting that the Siegel approximant ${\mu_{\hbox{Siegel}}}$ seems to play a rather essential component in the proof, even if it is absent in the final statement. We note that this approximant seems to be a useful tool to explore the “illusory world” of the Siegel zero further; see for instance the recent paper of Chinis for some work in this direction.

For the analogous problem with the von Mangoldt function (assuming a Siegel zero for sake of discussion), the approximant ${\Lambda_{\hbox{Siegel}}}$ is simpler; we ended up using

$\displaystyle \Lambda_{\hbox{Siegel}}(n) = \Lambda_{\hbox{Cram\'er}}(n) (1 - n^{\beta-1} \chi(n))$

which allows one to state the standard prime number theorem in arithmetic progressions with classical error term and Siegel zero term compactly as

$\displaystyle \| (\Lambda - \Lambda_{\hbox{Siegel}}) 1_{a \hbox{ mod } q}\|_{U^1([N])} \ll \exp(-\log^c N).$

Routine modifications of previous arguments also give

$\displaystyle \mathop{\bf E}_{n \in [N]} (\Lambda - \Lambda_{\hbox{Siegel}})(n) \overline{F}(g(n) \Gamma) \ll \exp(-\log^c N) \ \ \ \ \ (8)$

and

$\displaystyle \| \Lambda_{\hbox{Siegel}} \|_{U^k([N])} \ll \log^{-c_k} N.$

The one tricky new step is getting from the discorrelation estimate (8) to the Gowers uniformity estimate

$\displaystyle \| \Lambda - \Lambda_{\hbox{Siegel}} \|_{U^k([N])} \ll (\log\log N)^{-c_k}.$

One cannot directly apply Manners’ inverse theorem here because ${\Lambda}$ and ${\Lambda_{\hbox{Siegel}}}$ are unbounded. There is a standard tool for getting around this issue, now known as the dense model theorem, which is the standard engine powering the transference principle from theorems about bounded functions to theorems about certain types of unbounded functions. However the quantitative versions of the dense model theorem in the literature are expensive and would basically weaken the doubly logarithmic gain here to a triply logarithmic one. Instead, we bypass the dense model theorem and directly transfer the inverse theorem for bounded functions to an inverse theorem for unbounded functions by using the densification approach to transference introduced by Conlon, Fox, and Zhao. This technique turns out to be quantitatively quite efficient (the dependencies of the main parameters in the transference are polynomial in nature), and also has the technical advantage of avoiding the somewhat tricky “correlation condition” present in early transference results which are also not beneficial for quantitative bounds.

In principle, the above results can be improved for ${k=3}$ due to the stronger quantitative inverse theorems in the ${U^3}$ setting. However, there is a bottleneck that prevents us from achieving this, namely that the equidistribution theory of two-step nilmanifolds has exponents which are exponential in the dimension rather than polynomial in the dimension, and as a consequence we were unable to improve upon the doubly logarithmic results. Specifically, if one is given a sequence of bracket quadratics such as ${\lfloor \alpha_1 n \rfloor \beta_1 n, \dots, \lfloor \alpha_d n \rfloor \beta_d n}$ that fails to be ${\delta}$-equidistributed, one would need to establish a nontrivial linear relationship modulo 1 between the ${\alpha_1,\beta_1,\dots,\alpha_d,\beta_d}$ (up to errors of ${O(1/N)}$), where the coefficients are of size ${O(\delta^{-d^{O(1)}})}$; current methods only give coefficient bounds of the form ${O(\delta^{-\exp(d^{O(1)})})}$. An old result of Schmidt demonstrates proof of concept that these sorts of polynomial dependencies on exponents is possible in principle, but actually implementing Schmidt’s methods here seems to be a quite non-trivial task. There is also another possible route to removing a logarithm, which is to strengthen the inverse ${U^3}$ theorem to make the dimension of the nilmanifold logarithmic in the uniformity parameter ${\delta}$ rather than polynomial. Again, the Freiman-Bilu theorem (see for instance this paper of Ben and myself) demonstrates proof of concept that such an improvement in dimension is possible, but some work would be needed to implement it.

Kaisa Matomäki, Maksym Radziwill, Joni Teräväinen, Tamar Ziegler and I have uploaded to the arXiv our paper Higher uniformity of bounded multiplicative functions in short intervals on average. This paper (which originated from a working group at an AIM workshop on Sarnak’s conjecture) focuses on the local Fourier uniformity conjecture for bounded multiplicative functions such as the Liouville function ${\lambda}$. One form of this conjecture is the assertion that

$\displaystyle \int_0^X \| \lambda \|_{U^k([x,x+H])}\ dx = o(X) \ \ \ \ \ (1)$

as ${X \rightarrow \infty}$ for any fixed ${k \geq 0}$ and any ${H = H(X) \leq X}$ that goes to infinity as ${X \rightarrow \infty}$, where ${U^k([x,x+H])}$ is the (normalized) Gowers uniformity norm. Among other things this conjecture implies (logarithmically averaged version of) the Chowla and Sarnak conjectures for the Liouville function (or the Möbius function), see this previous blog post.

The conjecture gets more difficult as ${k}$ increases, and also becomes more difficult the more slowly ${H}$ grows with ${X}$. The ${k=0}$ conjecture is equivalent to the assertion

$\displaystyle \int_0^X |\sum_{x \leq n \leq x+H} \lambda(n)| \ dx = o(HX)$

which was proven (for arbitrarily slowly growing ${H}$) in a landmark paper of Matomäki and Radziwill, discussed for instance in this blog post.

For ${k=1}$, the conjecture is equivalent to the assertion

$\displaystyle \int_0^X \sup_\alpha |\sum_{x \leq n \leq x+H} \lambda(n) e(-\alpha n)| \ dx = o(HX). \ \ \ \ \ (2)$

This remains open for sufficiently slowly growing ${H}$ (and it would be a major breakthrough in particular if one could obtain this bound for ${H}$ as small as ${\log^\varepsilon X}$ for any fixed ${\varepsilon>0}$, particularly if applicable to more general bounded multiplicative functions than ${\lambda}$, as this would have new implications for a generalization of the Chowla conjecture known as the Elliott conjecture). Recently, Kaisa, Maks and myself were able to establish this conjecture in the range ${H \geq X^\varepsilon}$ (in fact we have since worked out in the current paper that we can get ${H}$ as small as ${\exp(\log^{5/8+\varepsilon} X)}$). In our current paper we establish Fourier uniformity conjecture for higher ${k}$ for the same range of ${H}$. This in particular implies local orthogonality to polynomial phases,

$\displaystyle \int_0^X \sup_{P \in \mathrm{Poly}_{\leq k-1}({\bf R} \rightarrow {\bf R})} |\sum_{x \leq n \leq x+H} \lambda(n) e(-P(n))| \ dx = o(HX) \ \ \ \ \ (3)$

where ${\mathrm{Poly}_{\leq k-1}({\bf R} \rightarrow {\bf R})}$ denotes the polynomials of degree at most ${k-1}$, but the full conjecture is a bit stronger than this, establishing the more general statement

$\displaystyle \int_0^X \sup_{g \in \mathrm{Poly}({\bf R} \rightarrow G)} |\sum_{x \leq n \leq x+H} \lambda(n) \overline{F}(g(n) \Gamma)| \ dx = o(HX) \ \ \ \ \ (4)$

for any degree ${k}$ filtered nilmanifold ${G/\Gamma}$ and Lipschitz function ${F: G/\Gamma \rightarrow {\bf C}}$, where ${g}$ now ranges over polynomial maps from ${{\bf R}}$ to ${G}$. The method of proof follows the same general strategy as in the previous paper with Kaisa and Maks. (The equivalence of (4) and (1) follows from the inverse conjecture for the Gowers norms, proven in this paper.) We quickly sketch first the proof of (3), using very informal language to avoid many technicalities regarding the precise quantitative form of various estimates. If the estimate (3) fails, then we have the correlation estimate

$\displaystyle |\sum_{x \leq n \leq x+H} \lambda(n) e(-P_x(n))| \gg H$

for many ${x \sim X}$ and some polynomial ${P_x}$ depending on ${x}$. The difficulty here is to understand how ${P_x}$ can depend on ${x}$. We write the above correlation estimate more suggestively as

$\displaystyle \lambda(n) \sim_{[x,x+H]} e(P_x(n)).$

Because of the multiplicativity ${\lambda(np) = -\lambda(p)}$ at small primes ${p}$, one expects to have a relation of the form

$\displaystyle e(P_{x'}(p'n)) \sim_{[x/p,x/p+H/p]} e(P_x(pn)) \ \ \ \ \ (5)$

for many ${x,x'}$ for which ${x/p \approx x'/p'}$ for some small primes ${p,p'}$. (This can be formalised using an inequality of Elliott related to the Turan-Kubilius theorem.) This gives a relationship between ${P_x}$ and ${P_{x'}}$ for “edges” ${x,x'}$ in a rather sparse “graph” connecting the elements of say ${[X/2,X]}$. Using some graph theory one can locate some non-trivial “cycles” in this graph that eventually lead (in conjunction to a certain technical but important “Chinese remainder theorem” step to modify the ${P_x}$ to eliminate a rather serious “aliasing” issue that was already discussed in this previous post) to obtain functional equations of the form

$\displaystyle P_x(a_x \cdot) \approx P_x(b_x \cdot)$

for some large and close (but not identical) integers ${a_x,b_x}$, where ${\approx}$ should be viewed as a first approximation (ignoring a certain “profinite” or “major arc” term for simplicity) as “differing by a slowly varying polynomial” and the polynomials ${P_x}$ should now be viewed as taking values on the reals rather than the integers. This functional equation can be solved to obtain a relation of the form

$\displaystyle P_x(t) \approx T_x \log t$

for some real number ${T_x}$ of polynomial size, and with further analysis of the relation (5) one can make ${T_x}$ basically independent of ${x}$. This simplifies (3) to something like

$\displaystyle \int_0^X \sup_{P \in \mathrm{Poly}_{\leq k-1}({\bf R} \rightarrow {\bf R})} |\sum_{x \leq n \leq x+H} \lambda(n) n^{-iT}| \ dx = o(HX)$

and this is now of a form that can be treated by the theorem of Matomäki and Radziwill (because ${n \mapsto \lambda(n) n^{-iT}}$ is a bounded multiplicative function). (Actually because of the profinite term mentioned previously, one also has to insert a Dirichlet character of bounded conductor into this latter conclusion, but we will ignore this technicality.)

Now we apply the same strategy to (4). For abelian ${G}$ the claim follows easily from (3), so we focus on the non-abelian case. One now has a polynomial sequence ${g_x \in \mathrm{Poly}({\bf R} \rightarrow G)}$ attached to many ${x \sim X}$, and after a somewhat complicated adaptation of the above arguments one again ends up with an approximate functional equation

$\displaystyle g_x(a_x \cdot) \Gamma \approx g_x(b_x \cdot) \Gamma \ \ \ \ \ (6)$

where the relation ${\approx}$ is rather technical and will not be detailed here. A new difficulty arises in that there are some unwanted solutions to this equation, such as

$\displaystyle g_x(t) = \gamma^{\frac{\log(a_x t)}{\log(a_x/b_x)}}$

for some ${\gamma \in \Gamma}$, which do not necessarily lead to multiplicative characters like ${n^{-iT}}$ as in the polynomial case, but instead to some unfriendly looking “generalized multiplicative characters” (think of ${e(\lfloor \alpha \log n \rfloor \beta \log n)}$ as a rough caricature). To avoid this problem, we rework the graph theory portion of the argument to produce not just one functional equation of the form (6)for each ${x}$, but many, leading to dilation invariances

$\displaystyle g_x((1+\theta) t) \Gamma \approx g_x(t) \Gamma$

for a “dense” set of ${\theta}$. From a certain amount of Lie algebra theory (ultimately arising from an understanding of the behaviour of the exponential map on nilpotent matrices, and exploiting the hypothesis that ${G}$ is non-abelian) one can conclude that (after some initial preparations to avoid degenerate cases) ${g_x(t)}$ must behave like ${\gamma_x^{\log t}}$ for some central element ${\gamma_x}$ of ${G}$. This eventually brings one back to the multiplicative characters ${n^{-iT}}$ that arose in the polynomial case, and the arguments now proceed as before.

We give two applications of this higher order Fourier uniformity. One regards the growth of the number

$\displaystyle s(k) := |\{ (\lambda(n+1),\dots,\lambda(n+k)): n \in {\bf N} \}|$

of length ${k}$ sign patterns in the Liouville function. The Chowla conjecture implies that ${s(k) = 2^k}$, but even the weaker conjecture of Sarnak that ${s(k) \gg (1+\varepsilon)^k}$ for some ${\varepsilon>0}$ remains open. Until recently, the best asymptotic lower bound on ${s(k)}$ was ${s(k) \gg k^2}$, due to McNamara; with our result, we can now show ${s(k) \gg_A k^A}$ for any ${A}$ (in fact we can get ${s(k) \gg_\varepsilon \exp(\log^{8/5-\varepsilon} k)}$ for any ${\varepsilon>0}$). The idea is to repeat the now-standard argument to exploit multiplicativity at small primes to deduce Chowla-type conjectures from Fourier uniformity conjectures, noting that the Chowla conjecture would give all the sign patterns one could hope for. The usual argument here uses the “entropy decrement argument” to eliminate a certain error term (involving the large but mean zero factor ${p 1_{p|n}-1}$). However the observation is that if there are extremely few sign patterns of length ${k}$, then the entropy decrement argument is unnecessary (there isn’t much entropy to begin with), and a more low-tech moment method argument (similar to the derivation of Chowla’s conjecture from Sarnak’s conjecture, as discussed for instance in this post) gives enough of Chowla’s conjecture to produce plenty of length ${k}$ sign patterns. If there are not extremely few sign patterns of length ${k}$ then we are done anyway. One quirk of this argument is that the sign patterns it produces may only appear exactly once; in contrast with preceding arguments, we were not able to produce a large number of sign patterns that each occur infinitely often.

The second application is to obtain cancellation for various polynomial averages involving the Liouville function ${\lambda}$ or von Mangoldt function ${\Lambda}$, such as

$\displaystyle {\bf E}_{n \leq X} {\bf E}_{m \leq X^{1/d}} \lambda(n+P_1(m)) \lambda(n+P_2(m)) \dots \lambda(n+P_k(m))$

or

$\displaystyle {\bf E}_{n \leq X} {\bf E}_{m \leq X^{1/d}} \lambda(n+P_1(m)) \Lambda(n+P_2(m)) \dots \Lambda(n+P_k(m))$

where ${P_1,\dots,P_k}$ are polynomials of degree at most ${d}$, no two of which differ by a constant (the latter is essential to avoid having to establish the Chowla or Hardy-Littlewood conjectures, which of course remain open). Results of this type were previously obtained by Tamar Ziegler and myself in the “true complexity zero” case when the polynomials ${P}$ had distinct degrees, in which one could use the ${k=0}$ theory of Matomäki and Radziwill; now that higher ${k}$ is available at the scale ${H=X^{1/d}}$ we can now remove this restriction.

Joni Teräväinen and I have just uploaded to the arXiv our paper “Value patterns of multiplicative functions and related sequences“, submitted to Forum of Mathematics, Sigma. This paper explores how to use recent technology on correlations of multiplicative (or nearly multiplicative functions), such as the “entropy decrement method”, in conjunction with techniques from additive combinatorics, to establish new results on the sign patterns of functions such as the Liouville function ${\lambda}$. For instance, with regards to length 5 sign patterns

$\displaystyle (\lambda(n+1),\dots,\lambda(n+5)) \in \{-1,+1\}^5$

of the Liouville function, we can now show that at least ${24}$ of the ${32}$ possible sign patterns in ${\{-1,+1\}^5}$ occur with positive upper density. (Conjecturally, all of them do so, and this is known for all shorter sign patterns, but unfortunately ${24}$ seems to be the limitation of our methods.)

The Liouville function can be written as ${\lambda(n) = e^{2\pi i \Omega(n)/2}}$, where ${\Omega(n)}$ is the number of prime factors of ${n}$ (counting multiplicity). One can also consider the variant ${\lambda_3(n) = e^{2\pi i \Omega(n)/3}}$, which is a completely multiplicative function taking values in the cube roots of unity ${\{1, \omega, \omega^2\}}$. Here we are able to show that all ${27}$ sign patterns in ${\{1,\omega,\omega^2\}}$ occur with positive lower density as sign patterns ${(\lambda_3(n+1), \lambda_3(n+2), \lambda_3(n+3))}$ of this function. The analogous result for ${\lambda}$ was already known (see this paper of Matomäki, Radziwiłł, and myself), and in that case it is even known that all sign patterns occur with equal logarithmic density ${1/8}$ (from this paper of myself and Teräväinen), but these techniques barely fail to handle the ${\lambda_3}$ case by itself (largely because the “parity” arguments used in the case of the Liouville function no longer control three-point correlations in the ${\lambda_3}$ case) and an additional additive combinatorial tool is needed. After applying existing technology (such as entropy decrement methods), the problem roughly speaking reduces to locating patterns ${a \in A_1, a+r \in A_2, a+2r \in A_3}$ for a certain partition ${G = A_1 \cup A_2 \cup A_3}$ of a compact abelian group ${G}$ (think for instance of the unit circle ${G={\bf R}/{\bf Z}}$, although the general case is a bit more complicated, in particular if ${G}$ is disconnected then there is a certain “coprimality” constraint on ${r}$, also we can allow the ${A_1,A_2,A_3}$ to be replaced by any ${A_{c_1}, A_{c_2}, A_{c_3}}$ with ${c_1+c_2+c_3}$ divisible by ${3}$), with each of the ${A_i}$ having measure ${1/3}$. An inequality of Kneser just barely fails to guarantee the existence of such patterns, but by using an inverse theorem for Kneser’s inequality in this previous paper of mine we are able to identify precisely the obstruction for this method to work, and rule it out by an ad hoc method.

The same techniques turn out to also make progress on some conjectures of Erdös-Pomerance and Hildebrand regarding patterns of the largest prime factor ${P^+(n)}$ of a natural number ${n}$. For instance, we improve results of Erdös-Pomerance and of Balog demonstrating that the inequalities

$\displaystyle P^+(n+1) < P^+(n+2) < P^+(n+3)$

and

$\displaystyle P^+(n+1) > P^+(n+2) > P^+(n+3)$

each hold for infinitely many ${n}$, by demonstrating the stronger claims that the inequalities

$\displaystyle P^+(n+1) < P^+(n+2) < P^+(n+3) > P^+(n+4)$

and

$\displaystyle P^+(n+1) > P^+(n+2) > P^+(n+3) < P^+(n+4)$

each hold for a set of ${n}$ of positive lower density. As a variant, we also show that we can find a positive density set of ${n}$ for which

$\displaystyle P^+(n+1), P^+(n+2), P^+(n+3) > n^\gamma$

for any fixed ${\gamma < e^{-1/3} = 0.7165\dots}$ (this improves on a previous result of Hildebrand with ${e^{-1/3}}$ replaced by ${e^{-1/2} = 0.6065\dots}$. A number of other results of this type are also obtained in this paper.

In order to obtain these sorts of results, one needs to extend the entropy decrement technology from the setting of multiplicative functions to that of what we call “weakly stable sets” – sets ${A}$ which have some multiplicative structure, in the sense that (roughly speaking) there is a set ${B}$ such that for all small primes ${p}$, the statements ${n \in A}$ and ${pn \in B}$ are roughly equivalent to each other. For instance, if ${A}$ is a level set ${A = \{ n: \omega(n) = 0 \hbox{ mod } 3 \}}$, one would take ${B = \{ n: \omega(n) = 1 \hbox{ mod } 3 \}}$; if instead ${A}$ is a set of the form ${\{ n: P^+(n) \geq n^\gamma\}}$, then one can take ${B=A}$. When one has such a situation, then very roughly speaking, the entropy decrement argument then allows one to estimate a one-parameter correlation such as

$\displaystyle {\bf E}_n 1_A(n+1) 1_A(n+2) 1_A(n+3)$

with a two-parameter correlation such as

$\displaystyle {\bf E}_n {\bf E}_p 1_B(n+p) 1_B(n+2p) 1_B(n+3p)$

(where we will be deliberately vague as to how we are averaging over ${n}$ and ${p}$), and then the use of the “linear equations in primes” technology of Ben Green, Tamar Ziegler, and myself then allows one to replace this average in turn by something like

$\displaystyle {\bf E}_n {\bf E}_r 1_B(n+r) 1_B(n+2r) 1_B(n+3r)$

where ${r}$ is constrained to be not divisible by small primes but is otherwise quite arbitrary. This latter average can then be attacked by tools from additive combinatorics, such as translation to a continuous group model (using for instance the Furstenberg correspondence principle) followed by tools such as Kneser’s inequality (or inverse theorems to that inequality).

Joni Teräväinen and I have just uploaded to the arXiv our paper “The structure of correlations of multiplicative functions at almost all scales, with applications to the Chowla and Elliott conjectures“. This is a sequel to our previous paper that studied logarithmic correlations of the form

$\displaystyle f(a) := \lim^*_{x \rightarrow \infty} \frac{1}{\log \omega(x)} \sum_{x/\omega(x) \leq n \leq x} \frac{g_1(n+ah_1) \dots g_k(n+ah_k)}{n},$

where ${g_1,\dots,g_k}$ were bounded multiplicative functions, ${h_1,\dots,h_k \rightarrow \infty}$ were fixed shifts, ${1 \leq \omega(x) \leq x}$ was a quantity going off to infinity, and ${\lim^*}$ was a generalised limit functional. Our main technical result asserted that these correlations were necessarily the uniform limit of periodic functions ${f_i}$. Furthermore, if ${g_1 \dots g_k}$ (weakly) pretended to be a Dirichlet character ${\chi}$, then the ${f_i}$ could be chosen to be ${\chi}$isotypic in the sense that ${f_i(ab) = f_i(a) \chi(b)}$ whenever ${a,b}$ are integers with ${b}$ coprime to the periods of ${\chi}$ and ${f_i}$; otherwise, if ${g_1 \dots g_k}$ did not weakly pretend to be any Dirichlet character ${\chi}$, then ${f}$ vanished completely. This was then used to verify several cases of the logarithmically averaged Elliott and Chowla conjectures.

The purpose of this paper was to investigate the extent to which the methods could be extended to non-logarithmically averaged settings. For our main technical result, we now considered the unweighted averages

$\displaystyle f_d(a) := \lim^*_{x \rightarrow \infty} \frac{1}{x/d} \sum_{n \leq x/d} g_1(n+ah_1) \dots g_k(n+ah_k),$

where ${d>1}$ is an additional parameter. Our main result was now as follows. If ${g_1 \dots g_k}$ did not weakly pretend to be a twisted Dirichlet character ${n \mapsto \chi(n) n^{it}}$, then ${f_d(a)}$ converged to zero on (doubly logarithmic) average as ${d \rightarrow \infty}$. If instead ${g_1 \dots g_k}$ did pretend to be such a twisted Dirichlet character, then ${f_d(a) d^{it}}$ converged on (doubly logarithmic) average to a limit ${f(a)}$ of ${\chi}$-isotypic functions ${f_i}$. Thus, roughly speaking, one has the approximation

$\displaystyle \lim^*_{x \rightarrow \infty} \frac{1}{x/d} \sum_{n \leq x/d} g_1(n+ah_1) \dots g_k(n+ah_k) \approx f(a) d^{-it}$

for most ${d}$.

Informally, this says that at almost all scales ${x}$ (where “almost all” means “outside of a set of logarithmic density zero”), the non-logarithmic averages behave much like their logarithmic counterparts except for a possible additional twisting by an Archimedean character ${d \mapsto d^{it}}$ (which interacts with the Archimedean parameter ${d}$ in much the same way that the Dirichlet character ${\chi}$ interacts with the non-Archimedean parameter ${a}$). One consequence of this is that most of the recent results on the logarithmically averaged Chowla and Elliott conjectures can now be extended to their non-logarithmically averaged counterparts, so long as one excludes a set of exceptional scales ${x}$ of logarithmic density zero. For instance, the Chowla conjecture

$\displaystyle \lim_{x \rightarrow\infty} \frac{1}{x} \sum_{n \leq x} \lambda(n+h_1) \dots \lambda(n+h_k) = 0$

is now established for ${k}$ either odd or equal to ${2}$, so long as one excludes an exceptional set of scales.

In the logarithmically averaged setup, the main idea was to combine two very different pieces of information on ${f(a)}$. The first, coming from recent results in ergodic theory, was to show that ${f(a)}$ was well approximated in some sense by a nilsequence. The second was to use the “entropy decrement argument” to obtain an approximate isotopy property of the form

$\displaystyle f(a) g_1 \dots g_k(p)\approx f(ap)$

for “most” primes ${p}$ and integers ${a}$. Combining the two facts, one eventually finds that only the almost periodic components of the nilsequence are relevant.

In the current situation, each ${a \mapsto f_d(a)}$ is approximated by a nilsequence, but the nilsequence can vary with ${d}$ (although there is some useful “Lipschitz continuity” of this nilsequence with respect to the ${d}$ parameter). Meanwhile, the entropy decrement argument gives an approximation basically of the form

$\displaystyle f_{dp}(a) g_1 \dots g_k(p)\approx f_d(ap)$

for “most” ${d,p,a}$. The arguments then proceed largely as in the logarithmically averaged case. A key lemma to handle the dependence on the new parameter ${d}$ is the following cohomological statement: if one has a map ${\alpha: (0,+\infty) \rightarrow S^1}$ that was a quasimorphism in the sense that ${\alpha(xy) = \alpha(x) \alpha(y) + O(\varepsilon)}$ for all ${x,y \in (0,+\infty)}$ and some small ${\varepsilon}$, then there exists a real number ${t}$ such that ${\alpha(x) = x^{it} + O(\varepsilon)}$ for all small ${\varepsilon}$. This is achieved by applying a standard “cocycle averaging argument” to the cocycle ${(x,y) \mapsto \alpha(xy) \alpha(x)^{-1} \alpha(y)^{-1}}$.

It would of course be desirable to not have the set of exceptional scales. We only know of one (implausible) scenario in which we can do this, namely when one has far fewer (in particular, subexponentially many) sign patterns for (say) the Liouville function than predicted by the Chowla conjecture. In this scenario (roughly analogous to the “Siegel zero” scenario in multiplicative number theory), the entropy of the Liouville sign patterns is so small that the entropy decrement argument becomes powerful enough to control all scales rather than almost all scales. On the other hand, this scenario seems to be self-defeating, in that it allows one to establish a large number of cases of the Chowla conjecture, and the full Chowla conjecture is inconsistent with having unusually few sign patterns. Still it hints that future work in this direction may need to split into “low entropy” and “high entropy” cases, in analogy to how many arguments in multiplicative number theory have to split into the “Siegel zero” and “no Siegel zero” cases.

Joni Teräväinen and I have just uploaded to the arXiv our paper “Odd order cases of the logarithmically averaged Chowla conjecture“, submitted to J. Numb. Thy. Bordeaux. This paper gives an alternate route to one of the main results of our previous paper, and more specifically reproves the asymptotic

$\displaystyle \sum_{n \leq x} \frac{\lambda(n+h_1) \dots \lambda(n+h_k)}{n} = o(\log x) \ \ \ \ \ (1)$

for all odd ${k}$ and all integers ${h_1,\dots,h_k}$ (that is to say, all the odd order cases of the logarithmically averaged Chowla conjecture). Our previous argument relies heavily on some deep ergodic theory results of Bergelson-Host-Kra, Leibman, and Le (and was applicable to more general multiplicative functions than the Liouville function ${\lambda}$); here we give a shorter proof that avoids ergodic theory (but instead requires the Gowers uniformity of the (W-tricked) von Mangoldt function, established in several papers of Ben Green, Tamar Ziegler, and myself). The proof follows the lines sketched in the previous blog post. In principle, due to the avoidance of ergodic theory, the arguments here have a greater chance to be made quantitative; however, at present the known bounds on the Gowers uniformity of the von Mangoldt function are qualitative, except at the ${U^2}$ level, which is unfortunate since the first non-trivial odd case ${k=3}$ requires quantitative control on the ${U^3}$ level. (But it may be possible to make the Gowers uniformity bounds for ${U^3}$ quantitative if one assumes GRH, although when one puts everything together, the actual decay rate obtained in (1) is likely to be poor.)

Joni Teräväinen and I have just uploaded to the arXiv our paper “The structure of logarithmically averaged correlations of multiplicative functions, with applications to the Chowla and Elliott conjectures“, submitted to Duke Mathematical Journal. This paper builds upon my previous paper in which I introduced an “entropy decrement method” to prove the two-point (logarithmically averaged) cases of the Chowla and Elliott conjectures. A bit more specifically, I showed that

$\displaystyle \lim_{m \rightarrow \infty} \frac{1}{\log \omega_m} \sum_{x_m/\omega_m \leq n \leq x_m} \frac{g_0(n+h_0) g_1(n+h_1)}{n} = 0$

whenever ${1 \leq \omega_m \leq x_m}$ were sequences going to infinity, ${h_0,h_1}$ were distinct integers, and ${g_0,g_1: {\bf N} \rightarrow {\bf C}}$ were ${1}$-bounded multiplicative functions which were non-pretentious in the sense that

$\displaystyle \liminf_{X \rightarrow \infty} \inf_{|t_j| \leq X} \sum_{p \leq X} \frac{1-\mathrm{Re}( g_j(p) \overline{\chi_j}(p) p^{it_j})}{p} = \infty \ \ \ \ \ (1)$

for all Dirichlet characters ${\chi_j}$ and for ${j=0,1}$. Thus, for instance, one had the logarithmically averaged two-point Chowla conjecture

$\displaystyle \sum_{n \leq x} \frac{\lambda(n) \lambda(n+h)}{n} = o(\log x)$

for fixed any non-zero ${h}$, where ${\lambda}$ was the Liouville function.

One would certainly like to extend these results to higher order correlations than the two-point correlations. This looks to be difficult (though perhaps not completely impossible if one allows for logarithmic averaging): in a previous paper I showed that achieving this in the context of the Liouville function would be equivalent to resolving the logarithmically averaged Sarnak conjecture, as well as establishing logarithmically averaged local Gowers uniformity of the Liouville function. However, in this paper we are able to avoid having to resolve these difficult conjectures to obtain partial results towards the (logarithmically averaged) Chowla and Elliott conjecture. For the Chowla conjecture, we can obtain all odd order correlations, in that

$\displaystyle \sum_{n \leq x} \frac{\lambda(n+h_1) \dots \lambda(n+h_k)}{n} = o(\log x) \ \ \ \ \ (2)$

for all odd ${k}$ and all integers ${h_1,\dots,h_k}$ (which, in the odd order case, are no longer required to be distinct). (Superficially, this looks like we have resolved “half of the cases” of the logarithmically averaged Chowla conjecture; but it seems the odd order correlations are significantly easier than the even order ones. For instance, because of the Katai-Bourgain-Sarnak-Ziegler criterion, one can basically deduce the odd order cases of (2) from the even order cases (after allowing for some dilations in the argument ${n}$).

For the more general Elliott conjecture, we can show that

$\displaystyle \lim_{m \rightarrow \infty} \frac{1}{\log \omega_m} \sum_{x_m/\omega_m \leq n \leq x_m} \frac{g_1(n+h_1) \dots g_k(n+h_k)}{n} = 0$

for any ${k}$, any integers ${h_1,\dots,h_k}$ and any bounded multiplicative functions ${g_1,\dots,g_k}$, unless the product ${g_1 \dots g_k}$ weakly pretends to be a Dirichlet character ${\chi}$ in the sense that

$\displaystyle \sum_{p \leq X} \frac{1 - \hbox{Re}( g_1 \dots g_k(p) \overline{\chi}(p)}{p} = o(\log\log X).$

This can be seen to imply (2) as a special case. Even when ${g_1,\dots,g_k}$ does pretend to be a Dirichlet character ${\chi}$, we can still say something: if the limits

$\displaystyle f(a) := \lim_{m \rightarrow \infty} \frac{1}{\log \omega_m} \sum_{x_m/\omega_m \leq n \leq x_m} \frac{g_1(n+ah_1) \dots g_k(n+ah_k)}{n}$

exist for each ${a \in {\bf Z}}$ (which can be guaranteed if we pass to a suitable subsequence), then ${f}$ is the uniform limit of periodic functions ${f_i}$, each of which is ${\chi}$isotypic in the sense that ${f_i(ab) = f_i(a) \chi(b)}$ whenever ${a,b}$ are integers with ${b}$ coprime to the periods of ${\chi}$ and ${f_i}$. This does not pin down the value of any single correlation ${f(a)}$, but does put significant constraints on how these correlations may vary with ${a}$.

Among other things, this allows us to show that all ${16}$ possible length four sign patterns ${(\lambda(n+1),\dots,\lambda(n+4)) \in \{-1,+1\}^4}$ of the Liouville function occur with positive density, and all ${65}$ possible length four sign patterns ${(\mu(n+1),\dots,\mu(n+4)) \in \{-1,0,+1\}^4 \backslash \{-1,+1\}^4}$ occur with the conjectured logarithmic density. (In a previous paper with Matomaki and Radziwill, we obtained comparable results for length three patterns of Liouville and length two patterns of Möbius.)

To describe the argument, let us focus for simplicity on the case of the Liouville correlations

$\displaystyle f(a) := \lim_{X \rightarrow \infty} \frac{1}{\log X} \sum_{n \leq X} \frac{\lambda(n) \lambda(n+a) \dots \lambda(n+(k-1)a)}{n}, \ \ \ \ \ (3)$

assuming for sake of discussion that all limits exist. (In the paper, we instead use the device of generalised limits, as discussed in this previous post.) The idea is to combine together two rather different ways to control this function ${f}$. The first proceeds by the entropy decrement method mentioned earlier, which roughly speaking works as follows. Firstly, we pick a prime ${p}$ and observe that ${\lambda(pn)=-\lambda(n)}$ for any ${n}$, which allows us to rewrite (3) as

$\displaystyle (-1)^k f(a) = \lim_{X \rightarrow \infty} \frac{1}{\log X}$

$\displaystyle \sum_{n \leq X} \frac{\lambda(pn) \lambda(pn+ap) \dots \lambda(pn+(k-1)ap)}{n}.$

Making the change of variables ${n' = pn}$, we obtain

$\displaystyle (-1)^k f(a) = \lim_{X \rightarrow \infty} \frac{1}{\log X}$

$\displaystyle \sum_{n' \leq pX} \frac{\lambda(n') \lambda(n'+ap) \dots \lambda(n'+(k-1)ap)}{n'} p 1_{p|n'}.$

The difference between ${n' \leq pX}$ and ${n' \leq X}$ is negligible in the limit (here is where we crucially rely on the log-averaging), hence

$\displaystyle (-1)^k f(a) = \lim_{X \rightarrow \infty} \frac{1}{\log X} \sum_{n \leq X} \frac{\lambda(n) \lambda(n+ap) \dots \lambda(n+(k-1)ap)}{n} p 1_{p|n}$

and thus by (3) we have

$\displaystyle (-1)^k f(a) = f(ap) + \lim_{X \rightarrow \infty} \frac{1}{\log X}$

$\displaystyle \sum_{n \leq X} \frac{\lambda(n) \lambda(n+ap) \dots \lambda(n+(k-1)ap)}{n} (p 1_{p|n}-1).$

The entropy decrement argument can be used to show that the latter limit is small for most ${p}$ (roughly speaking, this is because the factors ${p 1_{p|n}-1}$ behave like independent random variables as ${p}$ varies, so that concentration of measure results such as Hoeffding’s inequality can apply, after using entropy inequalities to decouple somewhat these random variables from the ${\lambda}$ factors). We thus obtain the approximate isotopy property

$\displaystyle (-1)^k f(a) \approx f(ap) \ \ \ \ \ (4)$

for most ${a}$ and ${p}$.

On the other hand, by the Furstenberg correspondence principle (as discussed in these previous posts), it is possible to express ${f(a)}$ as a multiple correlation

$\displaystyle f(a) = \int_X g(x) g(T^a x) \dots g(T^{(k-1)a} x)\ d\mu(x)$

for some probability space ${(X,\mu)}$ equipped with a measure-preserving invertible map ${T: X \rightarrow X}$. Using results of Bergelson-Host-Kra, Leibman, and Le, this allows us to obtain a decomposition of the form

$\displaystyle f(a) = f_1(a) + f_2(a) \ \ \ \ \ (5)$

where ${f_1}$ is a nilsequence, and ${f_2}$ goes to zero in density (even along the primes, or constant multiples of the primes). The original work of Bergelson-Host-Kra required ergodicity on ${X}$, which is very definitely a hypothesis that is not available here; however, the later work of Leibman removed this hypothesis, and the work of Le refined the control on ${f_1}$ so that one still has good control when restricting to primes, or constant multiples of primes.

Ignoring the small error ${f_2(a)}$, we can now combine (5) to conclude that

$\displaystyle f(a) \approx (-1)^k f_1(ap).$

Using the equidistribution theory of nilsequences (as developed in this previous paper of Ben Green and myself), one can break up ${f_1}$ further into a periodic piece ${f_0}$ and an “irrational” or “minor arc” piece ${f_3}$. The contribution of the minor arc piece ${f_3}$ can be shown to mostly cancel itself out after dilating by primes ${p}$ and averaging, thanks to Vinogradov-type bilinear sum estimates (transferred to the primes). So we end up with

$\displaystyle f(a) \approx (-1)^k f_0(ap),$

which already shows (heuristically, at least) the claim that ${f}$ can be approximated by periodic functions ${f_0}$ which are isotopic in the sense that

$\displaystyle f_0(a) \approx (-1)^k f_0(ap).$

But if ${k}$ is odd, one can use Dirichlet’s theorem on primes in arithmetic progressions to restrict to primes ${p}$ that are ${1}$ modulo the period of ${f_0}$, and conclude now that ${f_0}$ vanishes identically, which (heuristically, at least) gives (2).

The same sort of argument works to give the more general bounds on correlations of bounded multiplicative functions. But for the specific task of proving (2), we initially used a slightly different argument that avoids using the ergodic theory machinery of Bergelson-Host-Kra, Leibman, and Le, but replaces it instead with the Gowers uniformity norm theory used to count linear equations in primes. Basically, by averaging (4) in ${p}$ using the “${W}$-trick”, as well as known facts about the Gowers uniformity of the von Mangoldt function, one can obtain an approximation of the form

$\displaystyle (-1)^k f(a) \approx {\bf E}_{b: (b,W)=1} f(ab)$

where ${b}$ ranges over a large range of integers coprime to some primorial ${W = \prod_{p \leq w} p}$. On the other hand, by iterating (4) we have

$\displaystyle f(a) \approx f(apq)$

for most semiprimes ${pq}$, and by again averaging over semiprimes one can obtain an approximation of the form

$\displaystyle f(a) \approx {\bf E}_{b: (b,W)=1} f(ab).$

For ${k}$ odd, one can combine the two approximations to conclude that ${f(a)=0}$. (This argument is not given in the current paper, but we plan to detail it in a subsequent one.)