You are currently browsing the tag archive for the ‘Halasz’s theorem’ tag.

Let us call an arithmetic function ${f: {\bf N} \rightarrow {\bf C}}$ ${1}$-bounded if we have ${|f(n)| \leq 1}$ for all ${n \in {\bf N}}$. In this section we focus on the asymptotic behaviour of ${1}$-bounded multiplicative functions. Some key examples of such functions include:

• The Möbius function ${\mu}$;
• The Liouville function ${\lambda}$;
• Archimedean” characters ${n \mapsto n^{it}}$ (which I call Archimedean because they are pullbacks of a Fourier character ${x \mapsto x^{it}}$ on the multiplicative group ${{\bf R}^+}$, which has the Archimedean property);
• Dirichlet characters (or “non-Archimedean” characters) ${\chi}$ (which are essentially pullbacks of Fourier characters on a multiplicative cyclic group ${({\bf Z}/q{\bf Z})^\times}$ with the discrete (non-Archimedean) metric);
• Hybrid characters ${n \mapsto \chi(n) n^{it}}$.

The space of ${1}$-bounded multiplicative functions is also closed under multiplication and complex conjugation.

Given a multiplicative function ${f}$, we are often interested in the asymptotics of long averages such as

$\displaystyle \frac{1}{X} \sum_{n \leq X} f(n)$

for large values of ${X}$, as well as short sums

$\displaystyle \frac{1}{H} \sum_{x \leq n \leq x+H} f(n)$

where ${H}$ and ${x}$ are both large, but ${H}$ is significantly smaller than ${x}$. (Throughout these notes we will try to normalise most of the sums and integrals appearing here as averages that are trivially bounded by ${O(1)}$; note that other normalisations are preferred in some of the literature cited here.) For instance, as we established in Theorem 58 of Notes 1, the prime number theorem is equivalent to the assertion that

$\displaystyle \frac{1}{X} \sum_{n \leq X} \mu(n) = o(1) \ \ \ \ \ (1)$

as ${X \rightarrow \infty}$. The Liouville function behaves almost identically to the Möbius function, in that estimates for one function almost always imply analogous estimates for the other:

Exercise 1 Without using the prime number theorem, show that (1) is also equivalent to

$\displaystyle \frac{1}{X} \sum_{n \leq X} \lambda(n) = o(1) \ \ \ \ \ (2)$

as ${X \rightarrow \infty}$. (Hint: use the identities ${\lambda(n) = \sum_{d^2|n} \mu(n/d^2)}$ and ${\mu(n) = \sum_{d^2|n} \mu(d) \lambda(n/d^2)}$.)

Henceforth we shall focus our discussion more on the Liouville function, and turn our attention to averages on shorter intervals. From (2) one has

$\displaystyle \frac{1}{H} \sum_{x \leq n \leq x+H} \lambda(n) = o(1) \ \ \ \ \ (3)$

as ${x \rightarrow \infty}$ if ${H = H(x)}$ is such that ${H \geq \varepsilon x}$ for some fixed ${\varepsilon>0}$. However it is significantly more difficult to understand what happens when ${H}$ grows much slower than this. By using the techniques based on zero density estimates discussed in Notes 6, it was shown by Motohashi and that one can also establish \eqref. On the Riemann Hypothesis Maier and Montgomery lowered the threshold to ${H \geq x^{1/2} \log^C x}$ for an absolute constant ${C}$ (the bound ${H \geq x^{1/2+\varepsilon}}$ is more classical, following from Exercise 33 of Notes 2). On the other hand, the randomness heuristics from Supplement 4 suggest that ${H}$ should be able to be taken as small as ${x^\varepsilon}$, and perhaps even ${\log^{1+\varepsilon} x}$ if one is particularly optimistic about the accuracy of these probabilistic models. On the other hand, the Chowla conjecture (mentioned for instance in Supplement 4) predicts that ${H}$ cannot be taken arbitrarily slowly growing in ${x}$, due to the conjectured existence of arbitrarily long strings of consecutive numbers where the Liouville function does not change sign (and in fact one can already show from the known partial results towards the Chowla conjecture that (3) fails for some sequence ${x \rightarrow \infty}$ and some sufficiently slowly growing ${H = H(x)}$, by modifying the arguments in these papers of mine).

The situation is better when one asks to understand the mean value on almost all short intervals, rather than all intervals. There are several equivalent ways to formulate this question:

Exercise 2 Let ${H = H(X)}$ be a function of ${X}$ such that ${H \rightarrow \infty}$ and ${H = o(X)}$ as ${X \rightarrow \infty}$. Let ${f: {\bf N} \rightarrow {\bf C}}$ be a ${1}$-bounded function. Show that the following assertions are equivalent:

• (i) One has

$\displaystyle \frac{1}{H} \sum_{x \leq n \leq x+H} f(n) = o(1)$

as ${X \rightarrow \infty}$, uniformly for all ${x \in [X,2X]}$ outside of a set of measure ${o(X)}$.

• (ii) One has

$\displaystyle \frac{1}{X} \int_X^{2X} |\frac{1}{H} \sum_{x \leq n \leq x+H} f(n)|\ dx = o(1)$

as ${X \rightarrow \infty}$.

• (iii) One has

$\displaystyle \frac{1}{X} \int_X^{2X} |\frac{1}{H} \sum_{x \leq n \leq x+H} f(n)|^2\ dx = o(1) \ \ \ \ \ (4)$

as ${X \rightarrow \infty}$.

As it turns out the second moment formulation in (iii) will be the most convenient for us to work with in this set of notes, as it is well suited to Fourier-analytic techniques (and in particular the Plancherel theorem).

Using zero density methods, for instance, it was shown by Ramachandra that

$\displaystyle \frac{1}{X} \int_X^{2X} |\frac{1}{H} \sum_{x \leq n \leq x+H} \lambda(n)|^2\ dx \ll_{A,\varepsilon} \log^{-A} X$

whenever ${X^{1/6+\varepsilon} \leq H \leq X}$ and ${\varepsilon>0}$. With this quality of bound (saving arbitrary powers of ${\log X}$ over the trivial bound of ${O(1)}$), this is still the lowest value of ${H}$ one can reach unconditionally. However, in a striking recent breakthrough, it was shown by Matomaki and Radziwill that as long as one is willing to settle for weaker bounds (saving a small power of ${\log X}$ or ${\log H}$, or just a qualitative decay of ${o(1)}$), one can obtain non-trivial estimates on far shorter intervals. For instance, they show

Theorem 3 (Matomaki-Radziwill theorem for Liouville) For any ${2 \leq H \leq X}$, one has

$\displaystyle \frac{1}{X} \int_X^{2X} |\frac{1}{H} \sum_{x \leq n \leq x+H} \lambda(n)|^2\ dx \ll \log^{-c} H$

for some absolute constant ${c>0}$.

In fact they prove a slightly more precise result: see Theorem 1 of that paper. In particular, they obtain the asymptotic (4) for any function ${H = H(X)}$ that goes to infinity as ${X \rightarrow \infty}$, no matter how slowly! This ability to let ${H}$ grow slowly with ${X}$ is important for several applications; for instance, in order to combine this type of result with the entropy decrement methods from Notes 9, it is essential that ${H}$ be allowed to grow more slowly than ${\log X}$. See also this survey of Soundararajan for further discussion.

Exercise 4 In this exercise you may use Theorem 3 freely.

• (i) Establish the lower bound

$\displaystyle \frac{1}{X} \sum_{n \leq X} \lambda(n)\lambda(n+1) > -1+c$

for some absolute constant ${c>0}$ and all sufficiently large ${X}$. (Hint: if this bound failed, then ${\lambda(n)=\lambda(n+1)}$ would hold for almost all ${n}$; use this to create many intervals ${[x,x+H]}$ for which ${\frac{1}{H} \sum_{x \leq n \leq x+H} \lambda(n)}$ is extremely large.)

• (ii) Show that Theorem 3 also holds with ${\lambda(n)}$ replaced by ${\chi_2 \lambda(n)}$, where ${\chi_2}$ is the principal character of period ${2}$. (Use the fact that ${\lambda(2n)=-\lambda(n)}$ for all ${n}$.) Use this to establish the corresponding upper bound

$\displaystyle \frac{1}{X} \sum_{n \leq X} \lambda(n)\lambda(n+1) < 1-c$

to (i).

(There is a curious asymmetry to the difficulty level of these bounds; the upper bound in (ii) was established much earlier by Harman, Pintz, and Wolke, but the lower bound in (i) was only established in the Matomaki-Radziwill paper.)

The techniques discussed previously were highly complex-analytic in nature, relying in particular on the fact that functions such as ${\mu}$ or ${\lambda}$ have Dirichlet series ${{\mathcal D} \mu(s) = \frac{1}{\zeta(s)}}$, ${{\mathcal D} \lambda(s) = \frac{\zeta(2s)}{\zeta(s)}}$ that extend meromorphically into the critical strip. In contrast, the Matomaki-Radziwill theorem does not rely on such meromorphic continuations, and in fact holds for more general classes of ${1}$-bounded multiplicative functions ${f}$, for which one typically does not expect any meromorphic continuation into the strip. Instead, one can view the Matomaki-Radziwill theory as following the philosophy of a slightly different approach to multiplicative number theory, namely the pretentious multiplicative number theory of Granville and Soundarajan (as presented for instance in their draft monograph). A basic notion here is the pretentious distance between two ${1}$-bounded multiplicative functions ${f,g}$ (at a given scale ${X}$), which informally measures the extent to which ${f}$ “pretends” to be like ${g}$ (or vice versa). The precise definition is

Definition 5 (Pretentious distance) Given two ${1}$-bounded multiplicative functions ${f,g}$, and a threshold ${X>0}$, the pretentious distance ${\mathbb{D}(f,g;X)}$ between ${f}$ and ${g}$ up to scale ${X}$ is given by the formula

$\displaystyle \mathbb{D}(f,g;X) := \left( \sum_{p \leq X} \frac{1 - \mathrm{Re}(f(p) \overline{g(p)})}{p} \right)^{1/2}$

Note that one can also define an infinite version ${\mathbb{D}(f,g;\infty)}$ of this distance by removing the constraint ${p \leq X}$, though in such cases the pretentious distance may then be infinite. The pretentious distance is not quite a metric (because ${\mathbb{D}(f,f;X)}$ can be non-zero, and furthermore ${\mathbb{D}(f,g;X)}$ can vanish without ${f,g}$ being equal), but it is still quite close to behaving like a metric, in particular it obeys the triangle inequality; see Exercise 16 below. The philosophy of pretentious multiplicative number theory is that two ${1}$-bounded multiplicative functions ${f,g}$ will exhibit similar behaviour at scale ${X}$ if their pretentious distance ${\mathbb{D}(f,g;X)}$ is bounded, but will become uncorrelated from each other if this distance becomes large. A simple example of this philosophy is given by the following “weak Halasz theorem”, proven in Section 2:

Proposition 6 (Logarithmically averaged version of Halasz) Let ${X}$ be sufficiently large. Then for any ${1}$-bounded multiplicative functions ${f,g}$, one has

$\displaystyle \frac{1}{\log X} \sum_{n \leq X} \frac{f(n) \overline{g(n)}}{n} \ll \exp( - c \mathbb{D}(f, g;X)^2 )$

for an absolute constant ${c>0}$.

In particular, if ${f}$ does not pretend to be ${1}$, then the logarithmic average ${\frac{1}{\log X} \sum_{n \leq X} \frac{f(n)}{n}}$ will be small. This condition is basically necessary, since of course ${\frac{1}{\log X} \sum_{n \leq X} \frac{1}{n} = 1 + o(1)}$.

If one works with non-logarithmic averages ${\frac{1}{X} \sum_{n \leq X} f(n)}$, then not pretending to be ${1}$ is insufficient to establish decay, as was already observed in Exercise 11 of Notes 1: if ${f}$ is an Archimedean character ${f(n) = n^{it}}$ for some non-zero real ${t}$, then ${\frac{1}{\log X} \sum_{n \leq X} \frac{f(n)}{n}}$ goes to zero as ${X \rightarrow \infty}$ (which is consistent with Proposition 6), but ${\frac{1}{X} \sum_{n \leq X} f(n)}$ does not go to zero. However, this is in some sense the “only” obstruction to these averages decaying to zero, as quantified by the following basic result:

Theorem 7 (Halasz’s theorem) Let ${X}$ be sufficiently large. Then for any ${1}$-bounded multiplicative function ${f}$, one has

$\displaystyle \frac{1}{X} \sum_{n \leq X} f(n) \ll \exp( - c \min_{|t| \leq T} \mathbb{D}(f, n \mapsto n^{it};X)^2 ) + \frac{1}{T}$

for an absolute constant ${c>0}$ and any ${T > 0}$.

Informally, we refer to a ${1}$-bounded multiplicative function as “pretentious’; if it pretends to be a character such as ${n^{it}}$, and “non-pretentious” otherwise. The precise distinction is rather malleable, as the precise class of characters that one views as “obstructions” varies from situation to situation. For instance, in Proposition 6 it is just the trivial character ${1}$ which needs to be considered, but in Theorem 7 it is the characters ${n \mapsto n^{it}}$ with ${|t| \leq T}$. In other contexts one may also need to add Dirichlet characters ${\chi(n)}$ or hybrid characters such as ${\chi(n) n^{it}}$ to the list of characters that one might pretend to be. The division into pretentious and non-pretentious functions in multiplicative number theory is faintly analogous to the division into major and minor arcs in the circle method applied to additive number theory problems; see Notes 8. The Möbius and Liouville functions are model examples of non-pretentious functions; see Exercise 24.

In the contrapositive, Halasz’ theorem can be formulated as the assertion that if one has a large mean

$\displaystyle |\frac{1}{X} \sum_{n \leq X} f(n)| \geq \eta$

for some ${\eta > 0}$, then one has the pretentious property

$\displaystyle \mathbb{D}( f, n \mapsto n^{it}; X ) \ll \sqrt{\log(1/\eta)}$

for some ${t \ll \eta^{-1}}$. This has the flavour of an “inverse theorem”, of the type often found in arithmetic combinatorics.

Among other things, Halasz’s theorem gives yet another proof of the prime number theorem (1); see Section 2.

We now give a version of the Matomaki-Radziwill theorem for general (non-pretentious) multiplicative functions that is formulated in a similar contrapositive (or “inverse theorem”) fashion, though to simplify the presentation we only state a qualitative version that does not give explicit bounds.

Theorem 8 ((Qualitative) Matomaki-Radziwill theorem) Let ${\eta>0}$, and let ${1 \leq H \leq X}$, with ${H}$ sufficiently large depending on ${\eta}$. Suppose that ${f}$ is a ${1}$-bounded multiplicative function such that

$\displaystyle \frac{1}{X} \int_X^{2X} |\frac{1}{H} \sum_{x \leq n \leq x+H} f(n)|^2\ dx \geq \eta^2.$

Then one has

$\displaystyle \mathbb{D}(f, n \mapsto n^{it};X) \ll_\eta 1$

for some ${t \ll_\eta \frac{X}{H}}$.

The condition ${t \ll_\eta \frac{X}{H}}$ is basically optimal, as the following example shows:

Exercise 9 Let ${\varepsilon>0}$ be a sufficiently small constant, and let ${1 \leq H \leq X}$ be such that ${\frac{1}{\varepsilon} \leq H \leq \varepsilon X}$. Let ${f}$ be the Archimedean character ${f(n) = n^{it}}$ for some ${|t| \leq \varepsilon \frac{X}{H}}$. Show that

$\displaystyle \frac{1}{X} \int_X^{2X} |\frac{1}{H} \sum_{x \leq n \leq x+H} f(n)|^2\ dx \asymp 1.$

Combining Theorem 8 with standard non-pretentiousness facts about the Liouville function (see Exercise 24), we recover Theorem 3 (but with a decay rate of only ${o(1)}$ rather than ${\log^{-c} H}$). We refer the reader to the original paper of Matomaki-Radziwill (as well as this followup paper with myself) for the quantitative version of Theorem 8 that is strong enough to recover the full version of Theorem 3, and which can also handle real-valued pretentious functions.

With our current state of knowledge, the only arguments that can establish the full strength of Halasz and Matomaki-Radziwill theorems are Fourier analytic in nature, relating sums involving an arithmetic function ${f}$ with its Dirichlet series

$\displaystyle {\mathcal D} f(s) := \sum_{n=1}^\infty \frac{f(n)}{n^s}$

which one can view as a discrete Fourier transform of ${f}$ (or more precisely of the measure ${\sum_{n=1}^\infty \frac{f(n)}{n} \delta_{\log n}}$, if one evaluates the Dirichlet series on the right edge ${\{ 1+it: t \in {\bf R} \}}$ of the critical strip). In this aspect, the techniques resemble the complex-analytic methods from Notes 2, but with the key difference that no analytic or meromorphic continuation into the strip is assumed. The key identity that allows us to pass to Dirichlet series is the following variant of Proposition 7 of Notes 2:

Proposition 10 (Parseval type identity) Let ${f,g: {\bf N} \rightarrow {\bf C}}$ be finitely supported arithmetic functions, and let ${\psi: {\bf R} \rightarrow {\bf R}}$ be a Schwartz function. Then

$\displaystyle \sum_{n=1}^\infty \sum_{m=1}^\infty \frac{f(n)}{n} \frac{\overline{g(m)}}{m} \psi(\log n - \log m) = \frac{1}{2\pi} \int_{\bf R} {\mathcal D} f(1+it) \overline{{\mathcal D} g(1+it)} \hat \psi(t)\ dt$

where ${\hat \psi(t) := \int_{\bf R} \psi(u) e^{itu}\ du}$ is the Fourier transform of ${\psi}$. (Note that the finite support of ${f,g}$ and the Schwartz nature of ${\psi,\hat \psi}$ ensure that both sides of the identity are absolutely convergent.)

The restriction that ${f,g}$ be finitely supported will be slightly annoying in places, since most multiplicative functions will fail to be finitely supported, but this technicality can usually be overcome by suitably truncating the multiplicative function, and taking limits if necessary.

Proof: By expanding out the Dirichlet series, it suffices to show that

$\displaystyle \psi(\log n - \log m) = \frac{1}{2\pi} \int_{\bf R} \frac{1}{n^{it}} \frac{1}{m^{-it}} \hat \psi(t)\ dt$

for any natural numbers ${n,m}$. But this follows from the Fourier inversion formula ${\psi(u) = \frac{1}{2\pi} \int_{\bf R} e^{-itu} \hat \psi(t)\ dt}$ applied at ${u = \log n - \log m}$. $\Box$

For applications to Halasz type theorems, one sets ${g(n)}$ equal to the Kronecker delta ${\delta_{n=1}}$, producing weighted integrals of ${{\mathcal D} f(1+it)}$ of “${L^1}$” type. For applications to Matomaki-Radziwill theorems, one instead sets ${f=g}$, and more precisely uses the following corollary of the above proposition, to obtain weighted integrals of ${|{\mathcal D} f(1+it)|^2}$ of “${L^2}$” type:

Exercise 11 (Plancherel type identity) If ${f: {\bf N} \rightarrow {\bf C}}$ is finitely supported, and ${\varphi: {\bf R} \rightarrow {\bf R}}$ is a Schwartz function, establish the identity

$\displaystyle \int_0^\infty |\sum_{n=1}^\infty \frac{f(n)}{n} \varphi(\log n - \log y)|^2 \frac{dy}{y} = \frac{1}{2\pi} \int_{\bf R} |{\mathcal D} f(1+it)|^2 |\hat \varphi(t)|^2\ dt.$

In contrast, information about the non-pretentious nature of a multiplicative function ${f}$ will give “pointwise” or “${L^\infty}$” type control on the Dirichlet series ${{\mathcal D} f(1+it)}$, as is suggested from the Euler product factorisation of ${{\mathcal D} f}$.

It will be convenient to formalise the notion of ${L^1}$, ${L^2}$, and ${L^\infty}$ control of the Dirichlet series ${{\mathcal D} f}$, which as previously mentioned can be viewed as a sort of “Fourier transform” of ${f}$:

Definition 12 (Fourier norms) Let ${f: {\bf N} \rightarrow {\bf C}}$ be finitely supported, and let ${\Omega \subset {\bf R}}$ be a bounded measurable set. We define the Fourier ${L^\infty}$ norm

$\displaystyle \| f\|_{FL^\infty(\Omega)} := \sup_{t \in \Omega} |{\mathcal D} f(1+it)|,$

the Fourier ${L^2}$ norm

$\displaystyle \| f\|_{FL^2(\Omega)} := \left(\int_\Omega |{\mathcal D} f(1+it)|^2\ dt\right)^{1/2},$

and the Fourier ${L^1}$ norm

$\displaystyle \| f\|_{FL^1(\Omega)} := \int_\Omega |{\mathcal D} f(1+it)|\ dt.$

One could more generally define ${FL^p}$ norms for other exponents ${p}$, but we will only need the exponents ${p=1,2,\infty}$ in this current set of notes. It is clear that all the above norms are in fact (semi-)norms on the space of finitely supported arithmetic functions.

As mentioned above, Halasz’s theorem gives good control on the Fourier ${L^\infty}$ norm for restrictions of non-pretentious functions to intervals:

Exercise 13 (Fourier ${L^\infty}$ control via Halasz) Let ${f: {\bf N} \rightarrow {\bf C}}$ be a ${1}$-bounded multiplicative function, let ${I}$ be an interval in ${[C^{-1} X, CX]}$ for some ${X \geq C \geq 1}$, let ${R \geq 1}$, and let ${\Omega \subset {\bf R}}$ be a bounded measurable set. Show that

$\displaystyle \| f 1_I \|_{FL^\infty(\Omega)} \ll_C \exp( - c \min_{t: \mathrm{dist}(t,\Omega) \leq R} \mathbb{D}(f, n \mapsto n^{it};X)^2 ) + \frac{1}{R}.$

(Hint: you will need to use summation by parts (or an equivalent device) to deal with a ${\frac{1}{n}}$ weight.)

Meanwhile, the Plancherel identity in Exercise 11 gives good control on the Fourier ${L^2}$ norm for functions on long intervals (compare with Exercise 2 from Notes 6):

Exercise 14 (${L^2}$ mean value theorem) Let ${T \geq 1}$, and let ${f: {\bf N} \rightarrow {\bf C}}$ be finitely supported. Show that

$\displaystyle \| f \|_{FL^2([-T,T])}^2 \ll \sum_n \frac{1}{n} (\frac{T}{n} \sum_{m: |n-m| \leq n/T} |f(m)|)^2.$

Conclude in particular that if ${f}$ is supported in ${[C^{-1} N, C N]}$ for some ${C \geq 1}$ and ${N \gg T}$, then

$\displaystyle \| f \|_{FL^2([-T,T])}^2 \ll C^{O(1)} \frac{1}{N} \sum_n |f(n)|^2.$

In the simplest case of the logarithmically averaged Halasz theorem (Proposition 6), Fourier ${L^\infty}$ estimates are already sufficient to obtain decent control on the (weighted) Fourier ${L^1}$ type expressions that show up. However, these estimates are not enough by themselves to establish the full Halasz theorem or the Matomaki-Radziwill theorem. To get from Fourier ${L^\infty}$ control to Fourier ${L^1}$ or ${L^2}$ control more efficiently, the key trick is use Hölder’s inequality, which when combined with the basic Dirichlet series identity

$\displaystyle {\mathcal D}(f*g) = ({\mathcal D} f) ({\mathcal D} g)$

gives the inequalities

$\displaystyle \| f*g \|_{FL^1(\Omega)} \leq \|f\|_{FL^2(\Omega)} \|g\|_{FL^2(\Omega)} \ \ \ \ \ (5)$

and

$\displaystyle \| f*g \|_{FL^2(\Omega)} \leq \|f\|_{FL^2(\Omega)} \|g\|_{FL^\infty(\Omega)} \ \ \ \ \ (6)$

The strategy is then to factor (or approximately factor) the original function ${f}$ as a Dirichlet convolution (or average of convolutions) of various components, each of which enjoys reasonably good Fourier ${L^2}$ or ${L^\infty}$ estimates on various regions ${\Omega}$, and then combine them using the Hölder inequalities (5), (6) and the triangle inequality. For instance, to prove Halasz’s theorem, we will split ${f}$ into the Dirichlet convolution of three factors, one of which will be estimated in ${FL^\infty}$ using the non-pretentiousness hypothesis, and the other two being estimated in ${FL^2}$ using Exercise 14. For the Matomaki-Radziwill theorem, one uses a significantly more complicated decomposition of ${f}$ into a variety of Dirichlet convolutions of factors, and also splits up the Fourier domain ${[-T,T]}$ into several subregions depending on whether the Dirichlet series associated to some of these components are large or small. In each region and for each component of these decompositions, all but one of the factors will be estimated in ${FL^\infty}$, and the other in ${FL^2}$; but the precise way in which this is done will vary from component to component. For instance, in some regions a key factor will be small in ${FL^\infty}$ by construction of the region; in other places, the ${FL^\infty}$ control will come from Exercise 13. Similarly, in some regions, satisfactory ${FL^2}$ control is provided by Exercise 14, but in other regions one must instead use “large value” theorems (in the spirit of Proposition 9 from Notes 6), or amplify the power of the standard ${L^2}$ mean value theorems by combining the Dirichlet series with other Dirichlet series that are known to be large in this region.

There are several ways to achieve the desired factorisation. In the case of Halasz’s theorem, we can simply work with a crude version of the Euler product factorisation, dividing the primes into three categories (“small”, “medium”, and “large” primes) and expressing ${f}$ as a triple Dirichlet convolution accordingly. For the Matomaki-Radziwill theorem, one instead exploits the Turan-Kubilius phenomenon (Section 5 of Notes 1, or Lemma 2 of Notes 9)) that for various moderately wide ranges ${[P,Q]}$ of primes, the number of prime divisors of a large number ${n}$ in the range ${[P,Q]}$ is almost always close to ${\log\log Q - \log\log P}$. Thus, if we introduce the arithmetic functions

$\displaystyle w_{[P,Q]}(n) = \frac{1}{\log\log Q - \log\log P} \sum_{P \leq p \leq Q} 1_{n=p} \ \ \ \ \ (7)$

then we have

$\displaystyle 1 \approx 1 * w_{[P,Q]}$

and more generally we have a twisted approximation

$\displaystyle f \approx f * fw_{[P,Q]}$

for multiplicative functions ${f}$. (Actually, for technical reasons it will be convenient to work with a smoothed out version of these functions; see Section 3.) Informally, these formulas suggest that the “${FL^2}$ energy” of a multiplicative function ${f}$ is concentrated in those regions where ${f w_{[P,Q]}}$ is extremely large in a ${FL^\infty}$ sense. Iterations of this formula (or variants of this formula, such as an identity due to Ramaré) will then give the desired (approximate) factorisation of ${{\mathcal D} f}$.

A basic estimate in multiplicative number theory (particularly if one is using the Granville-Soundararajan “pretentious” approach to this subject) is the following inequality of Halasz (formulated here in a quantitative form introduced by Montgomery and Tenenbaum).

Theorem 1 (Halasz inequality) Let ${f: {\bf N} \rightarrow {\bf C}}$ be a multiplicative function bounded in magnitude by ${1}$, and suppose that ${x \geq 3}$, ${T \geq 1}$, and ${ M \geq 0}$ are such that

$\displaystyle \sum_{p \leq x} \frac{1 - \hbox{Re}(f(p) p^{-it})}{p} \geq M \ \ \ \ \ (1)$

for all real numbers ${t}$ with ${|t| \leq T}$. Then

$\displaystyle \frac{1}{x} \sum_{n \leq x} f(n) \ll (1+M) e^{-M} + \frac{1}{\sqrt{T}}.$

As a qualitative corollary, we conclude (by standard compactness arguments) that if

$\displaystyle \sum_{p} \frac{1 - \hbox{Re}(f(p) p^{-it})}{p} = +\infty$

for all real ${t}$, then

$\displaystyle \frac{1}{x} \sum_{n \leq x} f(n) = o(1) \ \ \ \ \ (2)$

as ${x \rightarrow \infty}$. In the more recent work of this paper of Granville and Soundararajan, the sharper bound

$\displaystyle \frac{1}{x} \sum_{n \leq x} f(n) \ll (1+M) e^{-M} + \frac{1}{T} + \frac{\log\log x}{\log x}$

is obtained (with a more precise description of the ${(1+M) e^{-M}}$ term).

The usual proofs of Halasz’s theorem are somewhat lengthy (though there has been a recent simplification, in forthcoming work of Granville, Harper, and Soundarajan). Below the fold I would like to give a relatively short proof of the following “cheap” version of the inequality, which has slightly weaker quantitative bounds, but still suffices to give qualitative conclusions such as (2).

Theorem 2 (Cheap Halasz inequality) Let ${f: {\bf N} \rightarrow {\bf C}}$ be a multiplicative function bounded in magnitude by ${1}$. Let ${T \geq 1}$ and ${M \geq 0}$, and suppose that ${x}$ is sufficiently large depending on ${T,M}$. If (1) holds for all ${|t| \leq T}$, then

$\displaystyle \frac{1}{x} \sum_{n \leq x} f(n) \ll (1+M) e^{-M/2} + \frac{1}{T}.$

The non-optimal exponent ${1/2}$ can probably be improved a bit by being more careful with the exponents, but I did not try to optimise it here. A similar bound appears in the first paper of Halasz on this topic.

The idea of the argument is to split ${f}$ as a Dirichlet convolution ${f = f_1 * f_2 * f_3}$ where ${f_1,f_2,f_3}$ is the portion of ${f}$ coming from “small”, “medium”, and “large” primes respectively (with the dividing line between the three types of primes being given by various powers of ${x}$). Using a Perron-type formula, one can express this convolution in terms of the product of the Dirichlet series of ${f_1,f_2,f_3}$ respectively at various complex numbers ${1+it}$ with ${|t| \leq T}$. One can use ${L^2}$ based estimates to control the Dirichlet series of ${f_2,f_3}$, while using the hypothesis (1) one can get ${L^\infty}$ estimates on the Dirichlet series of ${f_1}$. (This is similar to the Fourier-analytic approach to ternary additive problems, such as Vinogradov’s theorem on representing large odd numbers as the sum of three primes.) This idea was inspired by a similar device used in the work of Granville, Harper, and Soundarajan. A variant of this argument also appears in unpublished work of Adam Harper.

I thank Andrew Granville for helpful comments which led to significant simplifications of the argument.

In analytic number theory, it is a well-known phenomenon that for many arithmetic functions ${f: {\bf N} \rightarrow {\bf C}}$ of interest in number theory, it is significantly easier to estimate logarithmic sums such as

$\displaystyle \sum_{n \leq x} \frac{f(n)}{n}$

than it is to estimate summatory functions such as

$\displaystyle \sum_{n \leq x} f(n).$

(Here we are normalising ${f}$ to be roughly constant in size, e.g. ${f(n) = O( n^{o(1)} )}$ as ${n \rightarrow \infty}$.) For instance, when ${f}$ is the von Mangoldt function ${\Lambda}$, the logarithmic sums ${\sum_{n \leq x} \frac{\Lambda(n)}{n}}$ can be adequately estimated by Mertens’ theorem, which can be easily proven by elementary means (see Notes 1); but a satisfactory estimate on the summatory function ${\sum_{n \leq x} \Lambda(n)}$ requires the prime number theorem, which is substantially harder to prove (see Notes 2). (From a complex-analytic or Fourier-analytic viewpoint, the problem is that the logarithmic sums ${\sum_{n \leq x} \frac{f(n)}{n}}$ can usually be controlled just from knowledge of the Dirichlet series ${\sum_n \frac{f(n)}{n^s}}$ for ${s}$ near ${1}$; but the summatory functions require control of the Dirichlet series ${\sum_n \frac{f(n)}{n^s}}$ for ${s}$ on or near a large portion of the line ${\{ 1+it: t \in {\bf R} \}}$. See Notes 2 for further discussion.)

Viewed conversely, whenever one has a difficult estimate on a summatory function such as ${\sum_{n \leq x} f(n)}$, one can look to see if there is a “cheaper” version of that estimate that only controls the logarithmic sums ${\sum_{n \leq x} \frac{f(n)}{n}}$, which is easier to prove than the original, more “expensive” estimate. In this post, we shall do this for two theorems, a classical theorem of Halasz on mean values of multiplicative functions on long intervals, and a much more recent result of Matomaki and Radziwiłł on mean values of multiplicative functions in short intervals. The two are related; the former theorem is an ingredient in the latter (though in the special case of the Matomaki-Radziwiłł theorem considered here, we will not need Halasz’s theorem directly, instead using a key tool in the proof of that theorem).

We begin with Halasz’s theorem. Here is a version of this theorem, due to Montgomery and to Tenenbaum:

Theorem 1 (Halasz-Montgomery-Tenenbaum) Let ${f: {\bf N} \rightarrow {\bf C}}$ be a multiplicative function with ${|f(n)| \leq 1}$ for all ${n}$. Let ${x \geq 3}$ and ${T \geq 1}$, and set

$\displaystyle M := \min_{|t| \leq T} \sum_{p \leq x} \frac{1 - \hbox{Re}( f(p) p^{-it} )}{p}.$

Then one has

$\displaystyle \frac{1}{x} \sum_{n \leq x} f(n) \ll (1+M) e^{-M} + \frac{1}{\sqrt{T}}.$

Informally, this theorem asserts that ${\sum_{n \leq x} f(n)}$ is small compared with ${x}$, unless ${f}$ “pretends” to be like the character ${p \mapsto p^{it}}$ on primes for some small ${y}$. (This is the starting point of the “pretentious” approach of Granville and Soundararajan to analytic number theory, as developed for instance here.) We now give a “cheap” version of this theorem which is significantly weaker (both because it settles for controlling logarithmic sums rather than summatory functions, it requires ${f}$ to be completely multiplicative instead of multiplicative, it requires a strong bound on the analogue of the quantity ${M}$, and because it only gives qualitative decay rather than quantitative estimates), but easier to prove:

Theorem 2 (Cheap Halasz) Let ${x}$ be an asymptotic parameter goingto infinity. Let ${f: {\bf N} \rightarrow {\bf C}}$ be a completely multiplicative function (possibly depending on ${x}$) such that ${|f(n)| \leq 1}$ for all ${n}$, such that

$\displaystyle \sum_{p \leq x} \frac{1 - \hbox{Re}( f(p) )}{p} \gg \log\log x. \ \ \ \ \ (1)$

Then

$\displaystyle \frac{1}{\log x} \sum_{n \leq x} \frac{f(n)}{n} = o(1). \ \ \ \ \ (2)$

Note that now that we are content with estimating exponential sums, we no longer need to preclude the possibility that ${f(p)}$ pretends to be like ${p^{it}}$; see Exercise 11 of Notes 1 for a related observation.

To prove this theorem, we first need a special case of the Turan-Kubilius inequality.

Lemma 3 (Turan-Kubilius) Let ${x}$ be a parameter going to infinity, and let ${1 < P < x}$ be a quantity depending on ${x}$ such that ${P = x^{o(1)}}$ and ${P \rightarrow \infty}$ as ${x \rightarrow \infty}$. Then

$\displaystyle \sum_{n \leq x} \frac{ | \frac{1}{\log \log P} \sum_{p \leq P: p|n} 1 - 1 |}{n} = o( \log x ).$

Informally, this lemma is asserting that

$\displaystyle \sum_{p \leq P: p|n} 1 \approx \log \log P$

for most large numbers ${n}$. Another way of writing this heuristically is in terms of Dirichlet convolutions:

$\displaystyle 1 \approx 1 * \frac{1}{\log\log P} 1_{{\mathcal P} \cap [1,P]}.$

This type of estimate was previously discussed as a tool to establish a criterion of Katai and Bourgain-Sarnak-Ziegler for Möbius orthogonality estimates in this previous blog post. See also Section 5 of Notes 1 for some similar computations.

Proof: By Cauchy-Schwarz it suffices to show that

$\displaystyle \sum_{n \leq x} \frac{ | \frac{1}{\log \log P} \sum_{p \leq P: p|n} 1 - 1 |^2}{n} = o( \log x ).$

Expanding out the square, it suffices to show that

$\displaystyle \sum_{n \leq x} \frac{ (\frac{1}{\log \log P} \sum_{p \leq P: p|n} 1)^j}{n} = \log x + o( \log x )$

for ${j=0,1,2}$.

We just show the ${j=2}$ case, as the ${j=0,1}$ cases are similar (and easier). We rearrange the left-hand side as

$\displaystyle \frac{1}{(\log\log P)^2} \sum_{p_1, p_2 \leq P} \sum_{n \leq x: p_1,p_2|n} \frac{1}{n}.$

We can estimate the inner sum as ${(1+o(1)) \frac{1}{[p_1,p_2]} \log x}$. But a routine application of Mertens’ theorem (handling the diagonal case when ${p_1=p_2}$ separately) shows that

$\displaystyle \sum_{p_1, p_2 \leq P} \frac{1}{[p_1,p_2]} = (1+o(1)) (\log\log P)^2$

and the claim follows. $\Box$

Remark 4 As an alternative to the Turan-Kubilius inequality, one can use the Ramaré identity

$\displaystyle \sum_{p \leq P: p|n} \frac{1}{\# \{ p' \leq P: p'|n\} + 1} - 1 = 1_{(p,n)=1 \hbox{ for all } p \leq P}$

(see e.g. Section 17.3 of Friedlander-Iwaniec). This identity turns out to give superior quantitative results than the Turan-Kubilius inequality in applications; see the paper of Matomaki and Radziwiłł for an instance of this.

We now prove Theorem 2. Let ${Q}$ denote the left-hand side of (2); by the triangle inequality we have ${Q=O(1)}$. By Lemma 3 (for some ${P = x^{o(1)}}$ to be chosen later) and the triangle inequality we have

$\displaystyle \sum_{n \leq x} \frac{\frac{1}{\log \log P} \sum_{p \leq P: p|n} f(n)}{n} = Q \log x + o( \log x ).$

We rearrange the left-hand side as

$\displaystyle \frac{1}{\log\log P} \sum_{p \leq P} \frac{f(p)}{p} \sum_{m \leq x/p} \frac{f(m)}{m}.$

We now replace the constraint ${m \leq x/p}$ by ${m \leq x}$. The error incurred in doing so is

$\displaystyle O( \frac{1}{\log\log P} \sum_{p \leq P} \frac{1}{p} \sum_{x/P \leq m \leq x} \frac{1}{m} )$

which by Mertens’ theorem is ${O(\log P) = o( \log x )}$. Thus we have

$\displaystyle \frac{1}{\log\log P} \sum_{p \leq P} \frac{f(p)}{p} \sum_{m \leq x} \frac{f(m)}{m} = Q \log x + o( \log x ).$

But by definition of ${Q}$, we have ${\sum_{m \leq x} \frac{f(m)}{m} = Q \log x}$, thus

$\displaystyle [1 - \frac{1}{\log\log P} \sum_{p \leq P} \frac{f(p)}{p}] Q = o(1). \ \ \ \ \ (3)$

From Mertens’ theorem, the expression in brackets can be rewritten as

$\displaystyle \frac{1}{\log\log P} \sum_{p \leq P} \frac{1 - f(p)}{p} + o(1)$

and so the real part of this expression is

$\displaystyle \frac{1}{\log\log P} \sum_{p \leq P} \frac{1 - \hbox{Re} f(p)}{p} + o(1).$

By (1), Mertens’ theorem and the hypothesis on ${f}$ we have

$\displaystyle \sum_{p \leq x^\varepsilon} \frac{(1 - \hbox{Re} f(p)) \log p}{p} \gg \log\log x^\varepsilon - O_\varepsilon(1)$

for any ${\varepsilon > 0}$. This implies that we can find ${P = x^{o(1)}}$ going to infinity such that

$\displaystyle \sum_{p \leq P} \frac{(1 - \hbox{Re} f(p)) \log p}{p} \gg (1-o(1))\log\log P$

and thus the expression in brackets has real part ${\gg 1-o(1)}$. The claim follows.

The Turan-Kubilius argument is certainly not the most efficient way to estimate sums such as ${\frac{1}{n} \sum_{n \leq x} f(n)}$. In the exercise below we give a significantly more accurate estimate that works when ${f}$ is non-negative.

Exercise 5 (Granville-Koukoulopoulos-Matomaki)

• (i) If ${g}$ is a completely multiplicative function with ${g(p) \in \{0,1\}}$ for all primes ${p}$, show that

$\displaystyle (e^{-\gamma}-o(1)) \prod_{p \leq x} (1 - \frac{g(p)}{p})^{-1} \leq \sum_{n \leq x} \frac{g(n)}{n} \leq \prod_{p \leq x} (1 - \frac{g(p)}{p})^{-1}.$

as ${x \rightarrow \infty}$. (Hint: for the upper bound, expand out the Euler product. For the lower bound, show that ${\sum_{n \leq x} \frac{g(n)}{n} \times \sum_{n \leq x} \frac{h(n)}{n} \ge \sum_{n \leq x} \frac{1}{n}}$, where ${h}$ is the completely multiplicative function with ${h(p) = 1-g(p)}$ for all primes ${p}$.)

• (ii) If ${g}$ is multiplicative and takes values in ${[0,1]}$, show that

$\displaystyle \sum_{n \leq x} \frac{g(n)}{n} \asymp \prod_{p \leq x} (1 - \frac{g(p)}{p})^{-1}$

$\displaystyle \asymp \exp( \sum_{p \leq x} \frac{g(p)}{p} )$

for all ${x \geq 1}$.

Now we turn to a very recent result of Matomaki and Radziwiłł on mean values of multiplicative functions in short intervals. For sake of illustration we specialise their results to the simpler case of the Liouville function ${\lambda}$, although their arguments actually work (with some additional effort) for arbitrary multiplicative functions of magnitude at most ${1}$ that are real-valued (or more generally, stay far from complex characters ${p \mapsto p^{it}}$). Furthermore, we give a qualitative form of their estimates rather than a quantitative one:

Theorem 6 (Matomaki-Radziwiłł, special case) Let ${X}$ be a parameter going to infinity, and let ${2 \leq h \leq X}$ be a quantity going to infinity as ${X \rightarrow \infty}$. Then for all but ${o(X)}$ of the integers ${x \in [X,2X]}$, one has

$\displaystyle \sum_{x \leq n \leq x+h} \lambda(n) = o( h ).$

Equivalently, one has

$\displaystyle \sum_{X \leq x \leq 2X} |\sum_{x \leq n \leq x+h} \lambda(n)|^2 = o( h^2 X ). \ \ \ \ \ (4)$

A simple sieving argument (see Exercise 18 of Supplement 4) shows that one can replace ${\lambda}$ by the Möbius function ${\mu}$ and obtain the same conclusion. See this recent note of Matomaki and Radziwiłł for a simple proof of their (quantitative) main theorem in this special case.

Of course, (4) improves upon the trivial bound of ${O( h^2 X )}$. Prior to this paper, such estimates were only known (using arguments similar to those in Section 3 of Notes 6) for ${h \geq X^{1/6+\varepsilon}}$ unconditionally, or for ${h \geq \log^A X}$ for some sufficiently large ${A}$ if one assumed the Riemann hypothesis. This theorem also represents some progress towards Chowla’s conjecture (discussed in Supplement 4) that

$\displaystyle \sum_{n \leq x} \lambda(n+h_1) \dots \lambda(n+h_k) = o( x )$

as ${x \rightarrow \infty}$ for any fixed distinct ${h_1,\dots,h_k}$; indeed, it implies that this conjecture holds if one performs a small amount of averaging in the ${h_1,\dots,h_k}$.

Below the fold, we give a “cheap” version of the Matomaki-Radziwiłł argument. More precisely, we establish

Theorem 7 (Cheap Matomaki-Radziwiłł) Let ${X}$ be a parameter going to infinity, and let ${1 \leq T \leq X}$. Then

$\displaystyle \int_X^{X^A} \left|\sum_{x \leq n \leq e^{1/T} x} \frac{\lambda(n)}{n}\right|^2\frac{dx}{x} = o\left( \frac{\log X}{T^2} \right), \ \ \ \ \ (5)$

for any fixed ${A>1}$.

Note that (5) improves upon the trivial bound of ${O( \frac{\log X}{T^2} )}$. Again, one can replace ${\lambda}$ with ${\mu}$ if desired. Due to the cheapness of Theorem 7, the proof will require few ingredients; the deepest input is the improved zero-free region for the Riemann zeta function due to Vinogradov and Korobov. Other than that, the main tools are the Turan-Kubilius result established above, and some Fourier (or complex) analysis.