
Kaisa Matomaki, Maksym Radziwill, and I have uploaded to the arXiv our paper “Correlations of the von Mangoldt and higher divisor functions I. Long shift ranges”, submitted to Proceedings of the London Mathematical Society. This paper is concerned with the estimation of correlations such as

\displaystyle \sum_{n \leq X} \Lambda(n) \Lambda(n+h) \ \ \ \ \ (1)

for medium-sized {h} and large {X}, where {\Lambda} is the von Mangoldt function; we also consider variants of this sum in which one of the von Mangoldt functions is replaced with a (higher order) divisor function, but for sake of discussion let us focus just on the sum (1). Understanding this sum is very closely related to the problem of finding pairs of primes that differ by {h}; for instance, if one could establish a lower bound

\displaystyle \sum_{n \leq X} \Lambda(n) \Lambda(n+2) \gg X

then this would easily imply the twin prime conjecture.

The (first) Hardy-Littlewood conjecture asserts an asymptotic

\displaystyle \sum_{n \leq X} \Lambda(n) \Lambda(n+h) = {\mathfrak S}(h) X + o(X) \ \ \ \ \ (2)

as {X \rightarrow \infty} for any fixed positive {h}, where the singular series {{\mathfrak S}(h)} is an arithmetic factor arising from the irregularity of distribution of {\Lambda} at small moduli, defined explicitly by

\displaystyle {\mathfrak S}(h) := 2 \Pi_2 \prod_{p|h; p>2} \frac{p-1}{p-2}

when {h} is even, and {{\mathfrak S}(h)=0} when {h} is odd, where

\displaystyle \Pi_2 := \prod_{p>2} (1-\frac{1}{(p-1)^2}) = 0.66016\dots

is (half of) the twin prime constant. See for instance this previous blog post for a heuristic explanation of this conjecture. From the previous discussion we see that (2) for {h=2} would imply the twin prime conjecture. Sieve theoretic methods are only able to provide an upper bound of the form { \sum_{n \leq X} \Lambda(n) \Lambda(n+h) \ll {\mathfrak S}(h) X}.
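As a quick numerical sanity check (not taken from the paper), the following Python sketch compares the correlation sum (1) with the conjectured main term {{\mathfrak S}(h) X}. It uses the standard local factor {\frac{p-1}{p-2}} at odd primes {p|h}. The function names and the cutoff `prime_bound` used to truncate the product for {\Pi_2} are illustrative choices, and the code assumes that {h} has no prime factor above that cutoff.

```python
import math

def primes_up_to(limit):
    """Sieve of Eratosthenes: all primes <= limit (assumes limit >= 2)."""
    sieve = [True] * (limit + 1)
    sieve[0] = sieve[1] = False
    for p in range(2, int(limit ** 0.5) + 1):
        if sieve[p]:
            for m in range(p * p, limit + 1, p):
                sieve[m] = False
    return [n for n in range(limit + 1) if sieve[n]]

def von_mangoldt(limit):
    """lam[n] = Lambda(n): log p if n is a power of a prime p, else 0."""
    lam = [0.0] * (limit + 1)
    for p in primes_up_to(limit):
        pk = p
        while pk <= limit:
            lam[pk] = math.log(p)
            pk *= p
    return lam

def singular_series(h, prime_bound=10**5):
    """Truncated singular series; exactly zero for odd h."""
    if h % 2 == 1:
        return 0.0
    value = 2.0
    for p in primes_up_to(prime_bound):
        if p > 2:
            value *= 1.0 - 1.0 / (p - 1) ** 2   # partial product for Pi_2
            if h % p == 0:
                value *= (p - 1) / (p - 2)       # local correction at p | h
    return value

def correlation(X, h):
    """sum_{n <= X} Lambda(n) Lambda(n+h)."""
    lam = von_mangoldt(X + h)
    return sum(lam[n] * lam[n + h] for n in range(1, X + 1))

if __name__ == "__main__":
    X = 10**4
    for h in (1, 2, 4, 6):
        print(h, correlation(X, h) / X, singular_series(h))
```

At such small {X} the agreement is only rough, which is consistent with (2) being an asymptotic statement.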

Needless to say, apart from the trivial case of odd {h}, there are no values of {h} for which the Hardy-Littlewood conjecture is known. However there are some results that say that this conjecture holds “on the average”: in particular, if {H} is a quantity depending on {X} that is somewhat large, there are results that show that (2) holds for most (i.e. for {1-o(1)}) of the {h} between {0} and {H}. Ideally one would like to get {H} as small as possible; in particular, one can view the full Hardy-Littlewood conjecture as the endpoint case when {H} is bounded.

The first results in this direction were by van der Corput and by Lavrik, who established such a result with {H = X} (with a subsequent refinement by Balog); Wolke lowered {H} to {X^{5/8+\varepsilon}}, and Mikawa lowered {H} further to {X^{1/3+\varepsilon}}. The main result of this paper is a further lowering of {H} to {X^{8/33+\varepsilon}}. In fact (as in the preceding works) we get a better error term than {o(X)}, namely an error of the shape {O_A( X \log^{-A} X)} for any {A}.

Our arguments initially proceed along standard lines. One can use the Hardy-Littlewood circle method to express the correlation in (2) as an integral involving exponential sums {S(\alpha) := \sum_{n \leq X} \Lambda(n) e(\alpha n)}. The contribution of “major arc” {\alpha} is known by a standard computation to recover the main term {{\mathfrak S}(h) X} plus acceptable errors, so it is a matter of controlling the “minor arcs”. After averaging in {h} and using the Plancherel identity, one is basically faced with establishing a bound of the form

\displaystyle \int_{\beta-1/H}^{\beta+1/H} |S(\alpha)|^2\ d\alpha \ll_A X \log^{-A} X

for any “minor arc” {\beta}. If {\beta} is somewhat close to a low height rational {a/q} (specifically, if it is within {X^{-1/6-\varepsilon}} of such a rational with {q = O(\log^{O(1)} X)}), then this type of estimate is roughly of comparable strength (by another application of Plancherel) to the best available prime number theorem in short intervals on the average, namely that the prime number theorem holds for most intervals of the form {[x, x + x^{1/6+\varepsilon}]}, and we can handle this case using standard mean value theorems for Dirichlet series. So we can restrict attention to the “strongly minor arc” case where {\beta} is far from such rationals.
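To get a feel for the major/minor arc dichotomy, here is a small Python sketch (an illustration only, not part of the paper's argument) that evaluates {S(\alpha)} directly. On the standard heuristic, {S(a/q)} has size about {X/\phi(q)} for small squarefree {q}, while {S(\alpha)} is much smaller at a generic irrational such as {\sqrt{2}}. The function names are hypothetical.

```python
import cmath
import math

def von_mangoldt(limit):
    """lam[n] = Lambda(n): log p if n is a power of a prime p, else 0."""
    lam = [0.0] * (limit + 1)
    is_composite = [False] * (limit + 1)
    for p in range(2, limit + 1):
        if not is_composite[p]:
            for m in range(2 * p, limit + 1, p):
                is_composite[m] = True
            pk = p
            while pk <= limit:
                lam[pk] = math.log(p)
                pk *= p
    return lam

def exp_sum(alpha, X):
    """S(alpha) = sum_{n <= X} Lambda(n) e(alpha n), where e(x) = exp(2 pi i x)."""
    lam = von_mangoldt(X)
    return sum(lam[n] * cmath.exp(2j * math.pi * alpha * n)
               for n in range(1, X + 1) if lam[n] > 0.0)

if __name__ == "__main__":
    X = 10**4
    for name, alpha in [("0", 0.0), ("1/2", 0.5), ("1/3", 1 / 3), ("sqrt(2)", math.sqrt(2))]:
        print(name, abs(exp_sum(alpha, X)) / X)
```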

The next step (following some ideas we found in a paper of Zhan) is to rewrite this estimate not in terms of the exponential sums {S(\alpha) := \sum_{n \leq X} \Lambda(n) e(\alpha n)}, but rather in terms of the Dirichlet polynomial {F(s) := \sum_{n \sim X} \frac{\Lambda(n)}{n^s}}. After a certain amount of computation (including some oscillatory integral estimates arising from stationary phase), one is eventually reduced to the task of establishing an estimate of the form

\displaystyle \int_{t \sim \lambda X} (\int_{t-\lambda H}^{t+\lambda H} |F(\frac{1}{2}+it')|\ dt')^2\ dt \ll_A \lambda^2 H^2 X \log^{-A} X

for any {X^{-1/6-\varepsilon} \ll \lambda \ll \log^{-B} X} (with {B} sufficiently large depending on {A}).

The next step, which is again standard, is the use of the Heath-Brown identity (as discussed for instance in this previous blog post) to split up {\Lambda} into a number of components that have a Dirichlet convolution structure. Because the exponent {8/33} we are shooting for is less than {1/4}, we end up with five types of components that arise, which we call “Type {d_1}”, “Type {d_2}”, “Type {d_3}”, “Type {d_4}”, and “Type II”. The “Type II” sums are Dirichlet convolutions involving a factor supported on a range {[X^\varepsilon, X^{-\varepsilon} H]} and are quite easy to deal with; the “Type {d_j}” terms are Dirichlet convolutions that resemble (non-degenerate portions of) the {j^{th}} divisor function, formed from convolving together {j} portions of {1}. The “Type {d_1}” and “Type {d_2}” terms can be estimated satisfactorily by standard moment estimates for Dirichlet polynomials; this already recovers the result of Mikawa (and our argument is in fact slightly more elementary in that no Kloosterman sum estimates are required). It is the treatment of the “Type {d_3}” and “Type {d_4}” sums that requires some new analysis, with the Type {d_3} terms turning out to be the most delicate. After using an existing moment estimate of Jutila for Dirichlet L-functions, matters reduce to obtaining a family of estimates, a typical one of which (relating to the more difficult Type {d_3} sums) is of the form

\displaystyle \int_{t - H}^{t+H} |M( \frac{1}{2} + it')|^2\ dt' \ll X^{\varepsilon^2} H \ \ \ \ \ (3)

for “typical” ordinates {t} of size {X}, where {M} is the Dirichlet polynomial {M(s) := \sum_{n \sim X^{1/3}} \frac{1}{n^s}} (a fragment of the Riemann zeta function). The precise definition of “typical” is a little technical (because of the complicated nature of Jutila’s estimate) and will not be detailed here. Such a claim would follow easily from the Lindelof hypothesis (which would imply that {M(1/2 + it) \ll X^{o(1)}}) but of course we would like to have an unconditional result.

At this point, having exhausted all the Dirichlet polynomial estimates that are usefully available, we return to “physical space”. Using some further Fourier-analytic and oscillatory integral computations, we can estimate the left-hand side of (3) by an expression that is roughly of the shape

\displaystyle \frac{H}{X^{1/3}} \sum_{\ell \sim X^{1/3}/H} |\sum_{m \sim X^{1/3}} e( \frac{t}{2\pi} \log \frac{m+\ell}{m-\ell} )|.

The phase {\frac{t}{2\pi} \log \frac{m+\ell}{m-\ell}} can be Taylor expanded as the sum of {\frac{t \ell}{\pi m}} and a lower order term {\frac{t \ell^3}{3\pi m^3}}, plus negligible errors. If we could discard the lower order term then we would get quite a good bound using the exponential sum estimates of Robert and Sargos, which control averages of exponential sums with purely monomial phases, with the averaging allowing us to exploit the hypothesis that {t} is “typical”. Figuring out how to get rid of this lower order term caused some inefficiency in our arguments; the best we could do (after much experimentation) was to use Fourier analysis to shorten the sums, estimate a one-parameter average exponential sum with a binomial phase by a two-parameter average with a monomial phase, and then use the van der Corput {B} process followed by the estimates of Robert and Sargos. This rather complicated procedure works up to {H = X^{8/33+\varepsilon}}; it may be possible that some alternate way to proceed here could improve the exponent somewhat.
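The Taylor expansion of the phase comes from {\log \frac{m+\ell}{m-\ell} = 2( \frac{\ell}{m} + \frac{\ell^3}{3m^3} + \frac{\ell^5}{5m^5} + \dots )}. The short Python sketch below checks this numerically; the sample values of {t, m, \ell} are arbitrary illustrative choices, not the ranges used in the paper.

```python
import math

def exact_phase(t, m, ell):
    """(t / 2 pi) * log((m + ell) / (m - ell))."""
    return t / (2 * math.pi) * math.log((m + ell) / (m - ell))

def two_term_phase(t, m, ell):
    """Main term t*ell/(pi*m) plus the lower order term t*ell^3/(3*pi*m^3)."""
    return t * ell / (math.pi * m) + t * ell ** 3 / (3 * math.pi * m ** 3)

def next_term(t, m, ell):
    """First neglected term of the expansion, t*ell^5/(5*pi*m^5)."""
    return t * ell ** 5 / (5 * math.pi * m ** 5)

if __name__ == "__main__":
    t, m, ell = 1e6, 100, 3
    print(exact_phase(t, m, ell) - two_term_phase(t, m, ell), next_term(t, m, ell))
```

The residual after two terms matches the fifth-order term closely, which is why the errors beyond the cubic term can be treated as negligible once {\ell} is much smaller than {m}.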

In a sequel to this paper, we will use a somewhat different method to reduce {H} to a much smaller value of {\log^{O(1)} X}, but only if we replace the correlations {\sum_{n \leq X} \Lambda(n) \Lambda(n+h)} by either {\sum_{n \leq X} \Lambda(n) d_k(n+h)} or {\sum_{n \leq X} d_k(n) d_l(n+h)}, and also we now only save a {o(1)} in the error term rather than {O_A(\log^{-A} X)}.

Kaisa Matomäki, Maksym Radziwiłł, and I have just uploaded to the arXiv our paper “Sign patterns of the Liouville and Möbius functions”. This paper is somewhat similar to our previous paper in that it is using the recent breakthrough of Matomäki and Radziwiłł on mean values of multiplicative functions to obtain partial results towards the Chowla conjecture. This conjecture can be phrased, roughly speaking, as follows: if {k} is a fixed natural number and {n} is selected at random from a large interval {[1,x]}, then the sign pattern {(\lambda(n), \lambda(n+1),\dots,\lambda(n+k-1)) \in \{-1,+1\}^k} becomes asymptotically equidistributed in {\{-1,+1\}^k} in the limit {x \rightarrow \infty}. This remains open for {k \geq 2}. In fact even the significantly weaker statement that each of the sign patterns in {\{-1,+1\}^k} is attained infinitely often is open for {k \geq 4}. However, in 1986, Hildebrand showed that for {k \leq 3} all sign patterns are indeed attained infinitely often. Our first result is a strengthening of Hildebrand’s, moving a little bit closer to Chowla’s conjecture:

Theorem 1 Let {k \leq 3}. Then each of the sign patterns in {\{-1,+1\}^k} is attained by the Liouville function for a set of natural numbers {n} of positive lower density.

Thus for instance one has {\lambda(n)=\lambda(n+1)=\lambda(n+2)} for a set of {n} of positive lower density. The {k \leq 2} case of this theorem already appears in the original paper of Matomäki and Radziwiłł (and the significantly simpler case of the sign patterns {++} and {--} was treated previously by Harman, Pintz, and Wolke).
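One can see Theorem 1 (and the conjectured equidistribution) empirically with a short computation. The Python sketch below tabulates the frequency of each sign pattern of length {k} among {n \leq N}; the helper names are hypothetical, and of course a finite computation proves nothing about densities.

```python
from collections import Counter

def liouville(limit):
    """lam[n] = (-1)^Omega(n) for 1 <= n <= limit, via smallest prime factors."""
    spf = list(range(limit + 1))
    for p in range(2, int(limit ** 0.5) + 1):
        if spf[p] == p:                      # p is prime
            for m in range(p * p, limit + 1, p):
                if spf[m] == m:
                    spf[m] = p
    lam = [0] * (limit + 1)
    lam[1] = 1
    for n in range(2, limit + 1):
        lam[n] = -lam[n // spf[n]]           # removing one prime factor flips the sign
    return lam

def pattern_frequencies(k, N):
    """Frequency of each pattern (lam(n), ..., lam(n+k-1)) over 1 <= n <= N."""
    lam = liouville(N + k - 1)
    counts = Counter(tuple(lam[n + j] for j in range(k)) for n in range(1, N + 1))
    return {pattern: count / N for pattern, count in counts.items()}

if __name__ == "__main__":
    for k in (1, 2, 3):
        for pattern, freq in sorted(pattern_frequencies(k, 10**5).items()):
            print(k, pattern, round(freq, 4))
```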

The basic strategy in all of these arguments is to assume for sake of contradiction that a certain sign pattern occurs extremely rarely, and then exploit the complete multiplicativity of {\lambda} (which implies in particular that {\lambda(2n) = -\lambda(n)}, {\lambda(3n) = -\lambda(n)}, and {\lambda(5n) = -\lambda(n)} for all {n}) together with some combinatorial arguments (vaguely analogous to solving a Sudoku puzzle!) to establish more complex sign patterns for the Liouville function, that are either inconsistent with each other, or with results such as the Matomäki-Radziwiłł result. To illustrate this, let us give some {k=2} examples, arguing a little informally to emphasise the combinatorial aspects of the argument. First suppose that the sign pattern {(\lambda(n),\lambda(n+1)) = (+1,+1)} almost never occurs. The prime number theorem tells us that {\lambda(n)} and {\lambda(n+1)} are each equal to {+1} about half of the time, which by inclusion-exclusion implies that the sign pattern {(\lambda(n),\lambda(n+1))=(-1,-1)} almost never occurs. In other words, we have {\lambda(n+1) = -\lambda(n)} for almost all {n}. But from the multiplicativity property {\lambda(2n)=-\lambda(n)} this implies that one should have

\displaystyle \lambda(2n+2) = -\lambda(2n)

\displaystyle \lambda(2n+1) = -\lambda(2n)

and

\displaystyle \lambda(2n+2) = -\lambda(2n+1)

for almost all {n}. But the above three statements are contradictory, and the claim follows.

Similarly, if we assume that the sign pattern {(\lambda(n),\lambda(n+1)) = (+1,-1)} almost never occurs, then a similar argument to the above shows that for any fixed {h}, one has {\lambda(n)=\lambda(n+1)=\dots=\lambda(n+h)} for almost all {n}. But this means that the mean {\frac{1}{h} \sum_{j=1}^h \lambda(n+j)} is abnormally large for most {n}, which (for {h} large enough) contradicts the results of Matomäki and Radziwiłł. Here we see that the “enemy” to defeat is the scenario in which {\lambda} only changes sign very rarely, in which case one rarely sees the pattern {(+1,-1)}.

It turns out that similar (but more combinatorially intricate) arguments work for sign patterns of length three (but are unlikely to work for most sign patterns of length four or greater). We give here one fragment of such an argument (due to Hildebrand) which hopefully conveys the Sudoku-type flavour of the combinatorics. Suppose for instance that the sign pattern {(\lambda(n),\lambda(n+1),\lambda(n+2)) = (+1,+1,+1)} almost never occurs. Now suppose {n} is a typical number with {\lambda(15n-1)=\lambda(15n+1)=+1}. Since we almost never have the sign pattern {(+1,+1,+1)}, we must (almost always) then have {\lambda(15n) = -1}. By multiplicativity this implies that

\displaystyle (\lambda(60n-4), \lambda(60n), \lambda(60n+4)) = (+1,-1,+1).

We claim that this (almost always) forces {\lambda(60n+5)=-1}. For if {\lambda(60n+5)=+1}, then by the lack of the sign pattern {(+1,+1,+1)}, this (almost always) forces {\lambda(60n+3)=\lambda(60n+6)=-1}, which by multiplicativity forces {\lambda(20n+1)=\lambda(20n+2)=+1}, which by lack of {(+1,+1,+1)} (almost always) forces {\lambda(20n)=-1}, which by multiplicativity contradicts {\lambda(60n)=-1}. Thus we have {\lambda(60n+5)=-1}; a similar argument gives {\lambda(60n-5)=-1} almost always, which by multiplicativity gives {\lambda(12n-1)=\lambda(12n)=\lambda(12n+1)=+1}, a contradiction. Thus we almost never have {\lambda(15n-1)=\lambda(15n+1)=+1}, which by the inclusion-exclusion argument mentioned previously shows that {\lambda(15n+1) = - \lambda(15n-1)} for almost all {n}.
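Every “by multiplicativity” step in the argument above is an exact identity for the Liouville function, coming from {\lambda(3)=\lambda(5)=-1} and {\lambda(4)=+1}. As a sanity check, the following Python sketch verifies these identities for a range of {n}; the function names are illustrative.

```python
def liouville(limit):
    """lam[n] = (-1)^Omega(n) for 1 <= n <= limit, via smallest prime factors."""
    spf = list(range(limit + 1))
    for p in range(2, int(limit ** 0.5) + 1):
        if spf[p] == p:
            for m in range(p * p, limit + 1, p):
                if spf[m] == m:
                    spf[m] = p
    lam = [0] * (limit + 1)
    lam[1] = 1
    for n in range(2, limit + 1):
        lam[n] = -lam[n // spf[n]]
    return lam

def check_relations(N):
    """Verify the multiplicativity relations used in the argument for 1 <= n <= N."""
    lam = liouville(60 * N + 6)
    for n in range(1, N + 1):
        ok = (
            lam[60 * n - 4] == lam[15 * n - 1]          # 60n-4 = 4(15n-1)
            and lam[60 * n + 4] == lam[15 * n + 1]      # 60n+4 = 4(15n+1)
            and lam[60 * n] == lam[15 * n] == -lam[20 * n] == -lam[12 * n]
            and lam[60 * n + 3] == -lam[20 * n + 1]     # 60n+3 = 3(20n+1)
            and lam[60 * n + 6] == -lam[20 * n + 2]     # 60n+6 = 3(20n+2)
            and lam[60 * n + 5] == -lam[12 * n + 1]     # 60n+5 = 5(12n+1)
            and lam[60 * n - 5] == -lam[12 * n - 1]     # 60n-5 = 5(12n-1)
        )
        if not ok:
            return False
    return True

if __name__ == "__main__":
    print(check_relations(2000))
```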

One can continue these Sudoku-type arguments and conclude eventually that {\lambda(3n-1)=-\lambda(3n+1)=\lambda(3n+2)} for almost all {n}. To put it another way, if {\chi_3} denotes the non-principal Dirichlet character of modulus {3}, then {\lambda \chi_3} is almost always constant away from the multiples of {3}. (Conversely, if {\lambda \chi_3} changed sign very rarely outside of the multiples of three, then the sign pattern {(+1,+1,+1)} would never occur.) Fortunately, the main result of Matomäki and Radziwiłł shows that this scenario cannot occur, which establishes that the sign pattern {(+1,+1,+1)} must occur rather frequently. The other sign patterns are handled by variants of these arguments.

Excluding a sign pattern of length three leads to useful implications like “if {\lambda(n-1)=\lambda(n)=+1}, then {\lambda(n+1)=-1}” which turn out to be just barely strong enough to quite rigidly constrain the Liouville function using Sudoku-like arguments. In contrast, excluding a sign pattern of length four only gives rise to implications like “if {\lambda(n-2)=\lambda(n-1)=\lambda(n)=+1}, then {\lambda(n+1)=-1}”, and these seem to be much weaker for this purpose (the hypothesis in these implications just isn’t satisfied nearly often enough). So a different idea seems to be needed if one wishes to extend the above theorem to larger values of {k}.

Our second theorem gives an analogous result for the Möbius function {\mu} (which takes values in {\{-1,0,+1\}} rather than {\{-1,1\}}), but the analysis turns out to be remarkably difficult and we are only able to get up to {k=2}:

Theorem 2 Let {k \leq 2}. Then each of the sign patterns in {\{-1,0,+1\}^k} is attained by the Möbius function for a set of {n} of positive lower density.

It turns out that the prime number theorem and elementary sieve theory can be used to handle the {k=1} case and all the {k=2} cases that involve at least one {0}, leaving only the four sign patterns {(\pm 1, \pm 1)} to handle. It is here that the zeroes of the Möbius function cause a significant new obstacle. Suppose for instance that the sign pattern {(+1, -1)} almost never occurs for the Möbius function. The same arguments that were used in the Liouville case then show that {\mu(n)} will be almost always equal to {\mu(n+1)}, provided that {n,n+1} are both square-free. One can try to chain this together as before to create a long string {\mu(n)=\dots=\mu(n+h) \in \{-1,+1\}} where the Möbius function is constant, but this cannot work for any {h} larger than three, because the Möbius function vanishes at every multiple of four.

The constraints we assume on the Möbius function can be depicted using a graph on the squarefree natural numbers, in which any two adjacent squarefree natural numbers are connected by an edge. The main difficulty is then that this graph is highly disconnected due to the multiples of four not being squarefree.
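To see how disconnected this graph is, note that its connected components are exactly the maximal runs of consecutive squarefree numbers, and no such run can contain a multiple of four. A minimal Python sketch (with hypothetical function names) that lists the component sizes:

```python
def squarefree_flags(limit):
    """flags[n] is True exactly when 1 <= n <= limit is squarefree."""
    flags = [True] * (limit + 1)
    flags[0] = False
    d = 2
    while d * d <= limit:
        for m in range(d * d, limit + 1, d * d):
            flags[m] = False
        d += 1
    return flags

def component_sizes(limit):
    """Sizes of the maximal runs of consecutive squarefree integers in [1, limit]."""
    flags = squarefree_flags(limit)
    sizes, run = [], 0
    for n in range(1, limit + 1):
        if flags[n]:
            run += 1
        else:
            if run:
                sizes.append(run)
            run = 0
    if run:
        sizes.append(run)
    return sizes

if __name__ == "__main__":
    sizes = component_sizes(10**4)
    print(max(sizes), len(sizes))
```

Every component has at most three vertices, so no long-range information can propagate along this graph.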

To get around this, we need to enlarge the graph. Note from multiplicativity that if {\mu(n)} is almost always equal to {\mu(n+1)} when {n,n+1} are squarefree, then {\mu(n)} is almost always equal to {\mu(n+p)} when {n,n+p} are squarefree and {n} is divisible by {p}. We can then form a graph on the squarefree natural numbers by connecting {n} to {n+p} whenever {n,n+p} are squarefree and {n} is divisible by {p}. If this graph is “locally connected” in some sense, then {\mu} will be constant on almost all of the squarefree numbers in a large interval, which turns out to be incompatible with the results of Matomäki and Radziwiłł. Because of this, matters are reduced to establishing the connectedness of a certain graph. More precisely, it turns out to be sufficient to establish the following claim:

Theorem 3 For each prime {p}, let {a_p \hbox{ mod } p^2} be a residue class chosen uniformly at random. Let {G} be the random graph whose vertices {V} consist of those integers {n} not equal to {a_p \hbox{ mod } p^2} for any {p}, and whose edges consist of pairs {n,n+p} in {V} with {n = a_p \hbox{ mod } p}. Then with probability {1}, the graph {G} is connected.
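For experimentation, here is a Python sketch of a finite truncation of the random graph {G}. This is an assumption-laden stand-in: it only uses the integers in {[0,N)} and the primes up to a cutoff `prime_bound`, so it says nothing rigorous about Theorem 3, and all names are hypothetical. Connected components are found with a simple union-find structure.

```python
import random

def primes_up_to(limit):
    """Sieve of Eratosthenes: all primes <= limit (assumes limit >= 2)."""
    sieve = [True] * (limit + 1)
    sieve[0] = sieve[1] = False
    for p in range(2, int(limit ** 0.5) + 1):
        if sieve[p]:
            for m in range(p * p, limit + 1, p):
                sieve[m] = False
    return [n for n in range(limit + 1) if sieve[n]]

def truncated_random_graph(N, prime_bound, seed=0):
    """Finite stand-in for G: integers 0..N-1 and primes p <= prime_bound only."""
    rng = random.Random(seed)
    # a_p mod p^2 chosen uniformly at random for each prime p
    residues = {p: rng.randrange(p * p) for p in primes_up_to(prime_bound)}
    # vertices: n not congruent to a_p mod p^2 for any p in the truncation
    in_V = [all(n % (p * p) != a for p, a in residues.items()) for n in range(N)]

    parent = list(range(N))

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]    # path halving
            v = parent[v]
        return v

    # edges: pairs n, n+p in V with n = a_p mod p
    for p, a in residues.items():
        for n in range(a % p, N - p, p):
            if in_V[n] and in_V[n + p]:
                parent[find(n)] = find(n + p)

    components = {}
    for n in range(N):
        if in_V[n]:
            components.setdefault(find(n), []).append(n)
    return in_V, residues, list(components.values())

if __name__ == "__main__":
    in_V, residues, comps = truncated_random_graph(2000, 50, seed=1)
    print(sum(in_V), len(comps), max(len(c) for c in comps))
```

In such a truncation one should expect boundary effects near the ends of the interval, so the number of components is only a rough indicator.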

We were able to show the connectedness of this graph, though it turned out to be remarkably tricky to do so. Roughly speaking (and suppressing a number of technicalities), the main steps in the argument were as follows.

  • (Early stage) Pick a large number {X} (in our paper we take {X} to be odd, but I’ll ignore this technicality here). Using a moment method to explore neighbourhoods of a single point in {V}, one can show that a vertex {v} in {V} is almost always connected to at least {\log^{10} X} numbers in {[v,v+X^{1/100}]}, using relatively short paths of short diameter. (This is the most computationally intensive portion of the argument.)
  • (Middle stage) Let {X'} be a typical number in {[X/40,X/20]}, and let {R} be a scale somewhere between {X^{1/40}} and {X'}. By using paths {n, n+p_1, n+p_1-p_2, n+p_1-p_2+p_3} involving three primes, and using a variant of Vinogradov’s theorem and some routine second moment computations, one can show that with quite high probability, any “good” vertex in {[v+X'-R, v+X'-0.99R]} is connected to a “good” vertex in {[v+X'-0.01R, v+X'-0.0099 R]} by paths of length three, where the definition of “good” is somewhat technical but encompasses almost all of the vertices in {V}.
  • (Late stage) Combining the two previous results together, we can show that most vertices {v} will be connected to a vertex in {[v+X'-X^{1/40}, v+X']} for any {X'} in {[X/40,X/20]}. In particular, {v} will be connected to a set of {\gg X^{9/10}} vertices in {[v,v+X/20]}. By tracking everything carefully, one can control the length and diameter of the paths used to connect {v} to this set, and one can also control the parity of the elements in this set.
  • (Final stage) Now suppose we have two vertices {v, w} at a distance {X} apart. By the previous item, one can connect {v} to a large set {A} of vertices in {[v,v+X/20]}, and one can similarly connect {w} to a large set {B} of vertices in {[w,w+X/20]}. Now, by using a Vinogradov-type theorem and second moment calculations again (and ensuring that the elements of {A} and {B} have opposite parity), one can connect many of the vertices in {A} to many of the vertices in {B} by paths of length three, which then connects {v} to {w}, and gives the claim.

It seems of interest to understand random graphs like {G} further. In particular, the graph {G'} on the integers formed by connecting {n} to {n+p} for all {n} in a randomly selected residue class mod {p} for each prime {p} is particularly interesting (it is to the Liouville function as {G} is to the Möbius function); if one could show some “local expander” properties of this graph {G'}, then one would have a chance of modifying the above methods to attack the first unsolved case of the Chowla conjecture, namely that {\lambda(n)\lambda(n+1)} has asymptotic mean zero (perhaps working with logarithmic density instead of natural density to avoid some technicalities).

Kaisa Matomaki, Maksym Radziwill, and I have just uploaded to the arXiv our paper “An averaged form of Chowla’s conjecture”. This paper concerns a weaker variant of the famous conjecture of Chowla (discussed for instance in this previous post) that

\displaystyle  \sum_{n \leq X} \lambda(n+h_1) \dots \lambda(n+h_k) = o(X)

as {X \rightarrow \infty} for any distinct natural numbers {h_1,\dots,h_k}, where {\lambda} denotes the Liouville function. (One could also replace the Liouville function here by the Möbius function {\mu} and obtain a morally equivalent conjecture.) This conjecture remains open for any {k \geq 2}; for instance the assertion

\displaystyle  \sum_{n \leq X} \lambda(n) \lambda(n+2) = o(X)

is a variant of the twin prime conjecture (though possibly a tiny bit easier to prove), and is subject to the notorious parity barrier (as discussed in this previous post).

Our main result asserts, roughly speaking, that Chowla’s conjecture can be established unconditionally provided one has non-trivial averaging in the {h_1,\dots,h_k} parameters. More precisely, one has

Theorem 1 (Chowla on the average) Suppose {H = H(X) \leq X} is a quantity that goes to infinity as {X \rightarrow \infty} (but it can go to infinity arbitrarily slowly). Then for any fixed {k \geq 1}, we have

\displaystyle  \sum_{h_1,\dots,h_k \leq H} |\sum_{n \leq X} \lambda(n+h_1) \dots \lambda(n+h_k)| = o( H^k X ).

In fact, we can remove one of the averaging parameters and obtain

\displaystyle  \sum_{h_2,\dots,h_k \leq H} |\sum_{n \leq X} \lambda(n) \lambda(n+h_2) \dots \lambda(n+h_k)| = o( H^{k-1} X ).

Actually we can make the decay rate a bit more quantitative, gaining about {\frac{\log\log H}{\log H}} over the trivial bound. The key case is {k=2}; while the unaveraged Chowla conjecture becomes more difficult as {k} increases, the averaged Chowla conjecture does not increase in difficulty due to the increasing amount of averaging for larger {k}, and we end up deducing the higher {k} case of the conjecture from the {k=2} case by an elementary argument.
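As a rough numerical illustration of the {k=2} case of the second estimate, the Python sketch below computes the normalised quantity {\frac{1}{HX} \sum_{h \leq H} |\sum_{n \leq X} f(n) f(n+h)|} for {f=\lambda}, and for comparison for the constant function {f=1}, which attains the trivial bound. The function names and the sample values of {X} and {H} are illustrative assumptions.

```python
def liouville(limit):
    """lam[n] = (-1)^Omega(n) for 1 <= n <= limit, via smallest prime factors."""
    spf = list(range(limit + 1))
    for p in range(2, int(limit ** 0.5) + 1):
        if spf[p] == p:
            for m in range(p * p, limit + 1, p):
                if spf[m] == m:
                    spf[m] = p
    lam = [0] * (limit + 1)
    lam[1] = 1
    for n in range(2, limit + 1):
        lam[n] = -lam[n // spf[n]]
    return lam

def averaged_correlation(f, X, H):
    """(1/(H X)) * sum_{1 <= h <= H} | sum_{1 <= n <= X} f[n] f[n+h] |.

    f must be indexable up to X + H.
    """
    total = 0
    for h in range(1, H + 1):
        total += abs(sum(f[n] * f[n + h] for n in range(1, X + 1)))
    return total / (H * X)

if __name__ == "__main__":
    X, H = 20000, 20
    print(averaged_correlation(liouville(X + H), X, H))
    print(averaged_correlation([1] * (X + H + 1), X, H))
```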

The proof of the theorem proceeds as follows. By exploiting the Fourier-analytic identity

\displaystyle  \int_{{\mathbf T}} (\int_{\mathbf R} |\sum_{x \leq n \leq x+H} f(n) e(\alpha n)|^2 dx)^2\ d\alpha

\displaystyle = \sum_{|h| \leq H} (H-|h|)^2 |\sum_n f(n) \overline{f}(n+h)|^2

(related to a standard Fourier-analytic identity for the Gowers {U^2} norm) it turns out that the {k=2} case of the above theorem can basically be derived from an estimate of the form

\displaystyle  \int_0^X |\sum_{x \leq n \leq x+H} \lambda(n) e(\alpha n)|\ dx = o( H X )

uniformly for all {\alpha \in {\mathbf T}}. For “major arc” {\alpha}, close to a rational {a/q} for small {q}, we can establish this bound from a generalisation of a recent result of Matomaki and Radziwill (discussed in this previous post) on averages of multiplicative functions in short intervals. For “minor arc” {\alpha}, we can proceed instead from an argument of Katai and Bourgain-Sarnak-Ziegler (discussed in this previous post).
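The Fourier-analytic identity above can be checked numerically for finitely supported {f} and integer {H}. In that setting the {x}-integral reduces to a sum over windows of {H} consecutive integers, and the {\alpha}-integral of the resulting trigonometric polynomial is computed exactly by a Riemann sum with more than {2(H-1)} equally spaced points. The Python sketch below makes these assumptions explicit; the function names are hypothetical.

```python
import cmath
import math
import random

def window_energy(f, H, alpha):
    """sum over integer window starts x of | sum_{x <= n < x+H} f(n) e(alpha n) |^2.

    f is a list supported on 0..len(f)-1 and treated as zero elsewhere.
    """
    L = len(f)
    total = 0.0
    for x in range(-(H - 1), L):
        s = sum(f[n] * cmath.exp(2j * math.pi * alpha * n)
                for n in range(max(x, 0), min(x + H, L)))
        total += abs(s) ** 2
    return total

def left_side(f, H):
    """Integral over alpha in T of window_energy^2, via an exact Riemann sum."""
    M = 2 * H + 1                      # exceeds the largest frequency 2(H-1)
    return sum(window_energy(f, H, j / M) ** 2 for j in range(M)) / M

def right_side(f, H):
    """sum_{|h| <= H} (H - |h|)^2 | sum_n f(n) conj(f(n+h)) |^2."""
    L = len(f)
    total = 0.0
    for h in range(-(H - 1), H):       # the terms h = +-H have weight zero
        c = sum(f[n] * f[n + h].conjugate() for n in range(L) if 0 <= n + h < L)
        total += (H - abs(h)) ** 2 * abs(c) ** 2
    return total

if __name__ == "__main__":
    rng = random.Random(1)
    f = [rng.choice([-1, 1]) for _ in range(40)]
    print(left_side(f, 5), right_side(f, 5))
```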

The argument also extends to other bounded multiplicative functions than the Liouville function. Chowla’s conjecture was generalised by Elliott, who roughly speaking conjectured that the {k} copies of {\lambda} in Chowla’s conjecture could be replaced by arbitrary bounded multiplicative functions {g_1,\dots,g_k} as long as these functions were far from a twisted Dirichlet character {n \mapsto \chi(n) n^{it}} in the sense that

\displaystyle  \sum_p \frac{1 - \hbox{Re} g(p) \overline{\chi(p) p^{it}}}{p} = +\infty. \ \ \ \ \ (1)

(This type of distance is incidentally now a fundamental notion in the Granville-Soundararajan “pretentious” approach to multiplicative number theory.) During our work on this project, we found that Elliott’s conjecture is not quite true as stated due to a technicality: one can cook up a bounded multiplicative function {g} which behaves like {n^{it_j}} on scales {n \sim N_j} for some {N_j} going to infinity and some slowly varying {t_j}, and such a function will be far from any fixed Dirichlet character whilst still having many large correlations (e.g. the pair correlations {\sum_{n \leq N_j} g(n+1) \overline{g(n)}} will be large). In our paper we propose a technical “fix” to Elliott’s conjecture (replacing (1) by a truncated variant), and show that this repaired version of Elliott’s conjecture is true on the average in much the same way that Chowla’s conjecture is. (If one restricts attention to real-valued multiplicative functions, then this technical issue does not show up, basically because one can assume without loss of generality that {t=0} in this case; we discuss this fact in an appendix to the paper.)

In analytic number theory, it is a well-known phenomenon that for many arithmetic functions {f: {\bf N} \rightarrow {\bf C}} of interest in number theory, it is significantly easier to estimate logarithmic sums such as

\displaystyle  \sum_{n \leq x} \frac{f(n)}{n}

than it is to estimate summatory functions such as

\displaystyle  \sum_{n \leq x} f(n).

(Here we are normalising {f} to be roughly constant in size, e.g. {f(n) = O( n^{o(1)} )} as {n \rightarrow \infty}.) For instance, when {f} is the von Mangoldt function {\Lambda}, the logarithmic sums {\sum_{n \leq x} \frac{\Lambda(n)}{n}} can be adequately estimated by Mertens’ theorem, which can be easily proven by elementary means (see Notes 1); but a satisfactory estimate on the summatory function {\sum_{n \leq x} \Lambda(n)} requires the prime number theorem, which is substantially harder to prove (see Notes 2). (From a complex-analytic or Fourier-analytic viewpoint, the problem is that the logarithmic sums {\sum_{n \leq x} \frac{f(n)}{n}} can usually be controlled just from knowledge of the Dirichlet series {\sum_n \frac{f(n)}{n^s}} for {s} near {1}; but the summatory functions require control of the Dirichlet series {\sum_n \frac{f(n)}{n^s}} for {s} on or near a large portion of the line {\{ 1+it: t \in {\bf R} \}}. See Notes 2 for further discussion.)
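To make the comparison concrete for {f=\Lambda}, the following Python sketch computes both the logarithmic sum and the summatory function. Mertens’ theorem gives {\sum_{n \leq x} \frac{\Lambda(n)}{n} = \log x + O(1)}, while the prime number theorem gives {\sum_{n \leq x} \Lambda(n) = x + o(x)}. The function names are illustrative, and a computation at one value of {x} only illustrates these statements.

```python
import math

def von_mangoldt(limit):
    """lam[n] = Lambda(n): log p if n is a power of a prime p, else 0."""
    lam = [0.0] * (limit + 1)
    is_composite = [False] * (limit + 1)
    for p in range(2, limit + 1):
        if not is_composite[p]:
            for m in range(2 * p, limit + 1, p):
                is_composite[m] = True
            pk = p
            while pk <= limit:
                lam[pk] = math.log(p)
                pk *= p
    return lam

def logarithmic_sum(x):
    """sum_{n <= x} Lambda(n) / n."""
    lam = von_mangoldt(x)
    return sum(lam[n] / n for n in range(1, x + 1))

def summatory_function(x):
    """sum_{n <= x} Lambda(n)."""
    return sum(von_mangoldt(x))

if __name__ == "__main__":
    x = 10**4
    print(logarithmic_sum(x) - math.log(x))   # stays bounded as x grows
    print(summatory_function(x) / x)          # tends to 1 as x grows
```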

Viewed conversely, whenever one has a difficult estimate on a summatory function such as {\sum_{n \leq x} f(n)}, one can look to see if there is a “cheaper” version of that estimate that only controls the logarithmic sums {\sum_{n \leq x} \frac{f(n)}{n}}, which is easier to prove than the original, more “expensive” estimate. In this post, we shall do this for two theorems, a classical theorem of Halasz on mean values of multiplicative functions on long intervals, and a much more recent result of Matomaki and Radziwiłł on mean values of multiplicative functions in short intervals. The two are related; the former theorem is an ingredient in the latter (though in the special case of the Matomaki-Radziwiłł theorem considered here, we will not need Halasz’s theorem directly, instead using a key tool in the proof of that theorem).

We begin with Halasz’s theorem. Here is a version of this theorem, due to Montgomery and to Tenenbaum:

Theorem 1 (Halasz-Montgomery-Tenenbaum) Let {f: {\bf N} \rightarrow {\bf C}} be a multiplicative function with {|f(n)| \leq 1} for all {n}. Let {x \geq 3} and {T \geq 1}, and set

\displaystyle  M := \min_{|t| \leq T} \sum_{p \leq x} \frac{1 - \hbox{Re}( f(p) p^{-it} )}{p}.

Then one has

\displaystyle  \frac{1}{x} \sum_{n \leq x} f(n) \ll (1+M) e^{-M} + \frac{1}{\sqrt{T}}.

Informally, this theorem asserts that {\sum_{n \leq x} f(n)} is small compared with {x}, unless {f} “pretends” to be like the character {p \mapsto p^{it}} on primes for some small {t}. (This is the starting point of the “pretentious” approach of Granville and Soundararajan to analytic number theory, as developed for instance here.) We now give a “cheap” version of this theorem which is significantly weaker (because it settles for controlling logarithmic sums rather than summatory functions, because it requires {f} to be completely multiplicative instead of multiplicative, because it requires a strong bound on the analogue of the quantity {M}, and because it only gives qualitative decay rather than quantitative estimates), but easier to prove:

Theorem 2 (Cheap Halasz) Let {x} be an asymptotic parameter going to infinity. Let {f: {\bf N} \rightarrow {\bf C}} be a completely multiplicative function (possibly depending on {x}) such that {|f(n)| \leq 1} for all {n}, such that

\displaystyle  \sum_{p \leq x} \frac{1 - \hbox{Re}( f(p) )}{p} \gg \log\log x. \ \ \ \ \ (1)

Then

\displaystyle  \frac{1}{\log x} \sum_{n \leq x} \frac{f(n)}{n} = o(1). \ \ \ \ \ (2)

Note that now that we are content with estimating logarithmic sums, we no longer need to preclude the possibility that {f(p)} pretends to be like {p^{it}}; see Exercise 11 of Notes 1 for a related observation.

To prove this theorem, we first need a special case of the Turan-Kubilius inequality.

Lemma 3 (Turan-Kubilius) Let {x} be a parameter going to infinity, and let {1 < P < x} be a quantity depending on {x} such that {P = x^{o(1)}} and {P \rightarrow \infty} as {x \rightarrow \infty}. Then

\displaystyle  \sum_{n \leq x} \frac{ | \frac{1}{\log \log P} \sum_{p \leq P: p|n} 1 - 1 |}{n} = o( \log x ).

Informally, this lemma is asserting that

\displaystyle  \sum_{p \leq P: p|n} 1 \approx \log \log P

for most large numbers {n}. Another way of writing this heuristically is in terms of Dirichlet convolutions:

\displaystyle  1 \approx 1 * \frac{1}{\log\log P} 1_{{\mathcal P} \cap [1,P]}.

This type of estimate was previously discussed as a tool to establish a criterion of Katai and Bourgain-Sarnak-Ziegler for Möbius orthogonality estimates in this previous blog post. See also Section 5 of Notes 1 for some similar computations.

Proof: By Cauchy-Schwarz it suffices to show that

\displaystyle  \sum_{n \leq x} \frac{ | \frac{1}{\log \log P} \sum_{p \leq P: p|n} 1 - 1 |^2}{n} = o( \log x ).

Expanding out the square, it suffices to show that

\displaystyle  \sum_{n \leq x} \frac{ (\frac{1}{\log \log P} \sum_{p \leq P: p|n} 1)^j}{n} = \log x + o( \log x )

for {j=0,1,2}.

We just show the {j=2} case, as the {j=0,1} cases are similar (and easier). We rearrange the left-hand side as

\displaystyle  \frac{1}{(\log\log P)^2} \sum_{p_1, p_2 \leq P} \sum_{n \leq x: p_1,p_2|n} \frac{1}{n}.

We can estimate the inner sum as {(1+o(1)) \frac{1}{[p_1,p_2]} \log x}. But a routine application of Mertens’ theorem (handling the diagonal case when {p_1=p_2} separately) shows that

\displaystyle  \sum_{p_1, p_2 \leq P} \frac{1}{[p_1,p_2]} = (1+o(1)) (\log\log P)^2

and the claim follows. \Box
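The rearrangement used in this proof is easiest to see in the {j=1} case, where {\sum_{n \leq x} \frac{1}{n} \sum_{p \leq P: p|n} 1 = \sum_{p \leq P} \frac{1}{p} \sum_{m \leq x/p} \frac{1}{m}} exactly. The Python sketch below checks this identity and prints the normalised ratio, which the lemma predicts tends to {1}; the convergence is very slow because of the {\log\log P} normalisation. The function names and the sample values of {x} and {P} are illustrative.

```python
import math

def primes_up_to(limit):
    """Sieve of Eratosthenes: all primes <= limit (assumes limit >= 2)."""
    sieve = [True] * (limit + 1)
    sieve[0] = sieve[1] = False
    for p in range(2, int(limit ** 0.5) + 1):
        if sieve[p]:
            for m in range(p * p, limit + 1, p):
                sieve[m] = False
    return [n for n in range(limit + 1) if sieve[n]]

def direct_sum(x, P):
    """sum_{n <= x} (number of primes p <= P dividing n) / n."""
    primes = primes_up_to(P)
    return sum(sum(1 for p in primes if n % p == 0) / n for n in range(1, x + 1))

def rearranged_sum(x, P):
    """sum_{p <= P} (1/p) * sum_{m <= x/p} 1/m, obtained by writing n = p m."""
    return sum((1.0 / p) * sum(1.0 / m for m in range(1, x // p + 1))
               for p in primes_up_to(P))

if __name__ == "__main__":
    x, P = 5000, 50
    value = direct_sum(x, P)
    print(value, rearranged_sum(x, P))
    print(value / (math.log(math.log(P)) * math.log(x)))
```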

Remark 4 As an alternative to the Turan-Kubilius inequality, one can use the Ramaré identity

\displaystyle  \sum_{p \leq P: p|n} \frac{1}{\# \{ p' \leq P: p'|n\}} - 1 = - 1_{(p,n)=1 \hbox{ for all } p \leq P}

(see e.g. Section 17.3 of Friedlander-Iwaniec). This identity turns out to give quantitative results superior to those of the Turan-Kubilius inequality in applications; see the paper of Matomaki and Radziwiłł for an instance of this.

We now prove Theorem 2. Let {Q} denote the left-hand side of (2); by the triangle inequality we have {Q=O(1)}. By Lemma 3 (for some {P = x^{o(1)}} to be chosen later) and the triangle inequality we have

\displaystyle  \sum_{n \leq x} \frac{\frac{1}{\log \log P} \sum_{p \leq P: p|n} f(n)}{n} = Q \log x + o( \log x ).

We rearrange the left-hand side as

\displaystyle  \frac{1}{\log\log P} \sum_{p \leq P} \frac{f(p)}{p} \sum_{m \leq x/p} \frac{f(m)}{m}.

We now replace the constraint {m \leq x/p} by {m \leq x}. The error incurred in doing so is

\displaystyle  O( \frac{1}{\log\log P} \sum_{p \leq P} \frac{1}{p} \sum_{x/P \leq m \leq x} \frac{1}{m} )

which by Mertens’ theorem is {O(\log P) = o( \log x )}. Thus we have

\displaystyle  \frac{1}{\log\log P} \sum_{p \leq P} \frac{f(p)}{p} \sum_{m \leq x} \frac{f(m)}{m} = Q \log x + o( \log x ).

But by definition of {Q}, we have {\sum_{m \leq x} \frac{f(m)}{m} = Q \log x}, thus

\displaystyle  [1 - \frac{1}{\log\log P} \sum_{p \leq P} \frac{f(p)}{p}] Q = o(1). \ \ \ \ \ (3)

From Mertens’ theorem, the expression in brackets can be rewritten as

\displaystyle  \frac{1}{\log\log P} \sum_{p \leq P} \frac{1 - f(p)}{p} + o(1)

and so the real part of this expression is

\displaystyle  \frac{1}{\log\log P} \sum_{p \leq P} \frac{1 - \hbox{Re} f(p)}{p} + o(1).

By (1), Mertens’ theorem and the hypothesis on {f} we have

\displaystyle  \sum_{p \leq x^\varepsilon} \frac{1 - \hbox{Re} f(p)}{p} \gg \log\log x^\varepsilon - O_\varepsilon(1)

for any {\varepsilon > 0}. This implies that we can find {P = x^{o(1)}} going to infinity such that

\displaystyle  \sum_{p \leq P} \frac{1 - \hbox{Re} f(p)}{p} \gg (1-o(1))\log\log P

and thus the expression in brackets has real part {\gg 1-o(1)}. In particular the bracketed factor in (3) is bounded away from zero, so (3) forces {Q = o(1)}, and the claim follows.
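
The rearrangement step in the above proof relies only on the complete multiplicativity of {f}. As an illustration, here is a small numerical sketch in which we take {f} to be the Liouville function {\lambda} (our choice for this example; any completely multiplicative function would do) and check that {\sum_{n \leq x} \frac{1}{n} \sum_{p \leq P: p|n} f(n)} agrees with {\sum_{p \leq P} \frac{f(p)}{p} \sum_{m \leq x/p} \frac{f(m)}{m}}, before the {\frac{1}{\log\log P}} normalisation. The function names and the sizes of {x} and {P} are hypothetical.

```python
def smallest_prime_factors(x):
    """spf[n] = smallest prime factor of n for 2 <= n <= x (spf[0] = 0, spf[1] = 1)."""
    spf = list(range(x + 1))
    for i in range(2, int(x ** 0.5) + 1):
        if spf[i] == i:  # i is prime
            for j in range(i * i, x + 1, i):
                if spf[j] == j:
                    spf[j] = i
    return spf

def liouville(spf):
    """lam[n] = (-1)^Omega(n), built from lam(n) = -lam(n / spf(n))."""
    lam = [0] * len(spf)
    if len(spf) > 1:
        lam[1] = 1
    for n in range(2, len(spf)):
        lam[n] = -lam[n // spf[n]]
    return lam

def both_sides(x, P):
    """Return (lhs, rhs) of the rearrangement for f = Liouville; requires P <= x."""
    spf = smallest_prime_factors(x)
    lam = liouville(spf)
    primes = [p for p in range(2, P + 1) if spf[p] == p]
    # Left side: each n <= x is weighted by the number of primes p <= P dividing it.
    lhs = sum(lam[n] / n * sum(1 for p in primes if n % p == 0)
              for n in range(1, x + 1))
    # Right side: write n = p m and use f(pm) = f(p) f(m).
    rhs = sum(lam[p] / p * sum(lam[m] / m for m in range(1, x // p + 1))
              for p in primes)
    return lhs, rhs

if __name__ == "__main__":
    lhs, rhs = both_sides(5000, 50)
    print("lhs:", lhs)
    print("rhs:", rhs)
```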

The Turan-Kubilius argument is certainly not the most efficient way to estimate sums such as {\sum_{n \leq x} \frac{f(n)}{n}}. In the exercise below we give a significantly more accurate estimate that works when {f} is non-negative.

Exercise 5 (Granville-Koukoulopoulos-Matomaki)

  • (i) If {g} is a completely multiplicative function with {g(p) \in \{0,1\}} for all primes {p}, show that

    \displaystyle  (e^{-\gamma}-o(1)) \prod_{p \leq x} (1 - \frac{g(p)}{p})^{-1} \leq \sum_{n \leq x} \frac{g(n)}{n} \leq \prod_{p \leq x} (1 - \frac{g(p)}{p})^{-1}

    as {x \rightarrow \infty}. (Hint: for the upper bound, expand out the Euler product. For the lower bound, show that {\sum_{n \leq x} \frac{g(n)}{n} \times \sum_{n \leq x} \frac{h(n)}{n} \ge \sum_{n \leq x} \frac{1}{n}}, where {h} is the completely multiplicative function with {h(p) = 1-g(p)} for all primes {p}.)

  • (ii) If {g} is multiplicative and takes values in {[0,1]}, show that

    \displaystyle  \sum_{n \leq x} \frac{g(n)}{n} \asymp \prod_{p \leq x} (1 - \frac{g(p)}{p})^{-1}

    \displaystyle  \asymp \exp( \sum_{p \leq x} \frac{g(p)}{p} )

    for all {x \geq 1}.
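
Before attempting part (i), it may help to see the two inequalities from the hint numerically. The sketch below is only an illustration under our own assumptions: we take {g(p) = 1} when {p} is {1} mod {4} and {g(p)=0} otherwise (an arbitrary choice), let {h} be the complementary function, and compare {\sum_{n \leq x} \frac{g(n)}{n}} with the Euler product and with the harmonic sum. All function names are hypothetical, and a numerical check at one value of {x} is of course no substitute for the proof.

```python
def smallest_prime_factors(x):
    """spf[n] = smallest prime factor of n for 2 <= n <= x."""
    spf = list(range(x + 1))
    for i in range(2, int(x ** 0.5) + 1):
        if spf[i] == i:
            for j in range(i * i, x + 1, i):
                if spf[j] == j:
                    spf[j] = i
    return spf

def completely_multiplicative(spf, prime_value):
    """Values of the completely multiplicative function with g(p) = prime_value(p)."""
    g = [0] * len(spf)
    g[1] = 1
    for n in range(2, len(spf)):
        g[n] = g[n // spf[n]] * prime_value(spf[n])
    return g

def exercise_quantities(x, prime_value):
    """Return (G, H, harmonic, euler) for x >= 1, where
    G = sum g(n)/n, H = sum h(n)/n with h(p) = 1 - g(p),
    harmonic = sum 1/n, euler = prod_{p <= x} (1 - g(p)/p)^{-1}."""
    spf = smallest_prime_factors(x)
    g = completely_multiplicative(spf, prime_value)
    h = completely_multiplicative(spf, lambda p: 1 - prime_value(p))
    G = sum(g[n] / n for n in range(1, x + 1))
    H = sum(h[n] / n for n in range(1, x + 1))
    harmonic = sum(1.0 / n for n in range(1, x + 1))
    euler = 1.0
    for p in range(2, x + 1):
        if spf[p] == p:
            euler /= 1.0 - prime_value(p) / p
    return G, H, harmonic, euler

if __name__ == "__main__":
    # Assumed example: g(p) = 1 exactly when p = 1 mod 4.
    G, H, harmonic, euler = exercise_quantities(3000, lambda p: 1 if p % 4 == 1 else 0)
    print("sum g(n)/n        :", G)
    print("Euler product     :", euler)
    print("G * H vs harmonic :", G * H, harmonic)
```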

Now we turn to a very recent result of Matomaki and Radziwiłł on mean values of multiplicative functions in short intervals. For the sake of illustration we specialise their results to the simpler case of the Liouville function {\lambda}, although their arguments actually work (with some additional effort) for arbitrary multiplicative functions of magnitude at most {1} that are real-valued (or more generally, stay far from complex characters {p \mapsto p^{it}}). Furthermore, we give a qualitative form of their estimates rather than a quantitative one:

Theorem 6 (Matomaki-Radziwiłł, special case) Let {X} be a parameter going to infinity, and let {2 \leq h \leq X} be a quantity going to infinity as {X \rightarrow \infty}. Then for all but {o(X)} of the integers {x \in [X,2X]}, one has

\displaystyle  \sum_{x \leq n \leq x+h} \lambda(n) = o( h ).

Equivalently, one has

\displaystyle  \sum_{X \leq x \leq 2X} |\sum_{x \leq n \leq x+h} \lambda(n)|^2 = o( h^2 X ). \ \ \ \ \ (4)

A simple sieving argument (see Exercise 18 of Supplement 4) shows that one can replace {\lambda} by the Möbius function {\mu} and obtain the same conclusion. See this recent note of Matomaki and Radziwiłł for a simple proof of their (quantitative) main theorem in this special case.

Of course, (4) improves upon the trivial bound of {O( h^2 X )}. Prior to this paper, such estimates were only known (using arguments similar to those in Section 3 of Notes 6) for {h \geq X^{1/6+\varepsilon}} unconditionally, or for {h \geq \log^A X} for some sufficiently large {A} if one assumed the Riemann hypothesis. This theorem also represents some progress towards Chowla’s conjecture (discussed in Supplement 4) that

\displaystyle  \sum_{n \leq x} \lambda(n+h_1) \dots \lambda(n+h_k) = o( x )

as {x \rightarrow \infty} for any fixed distinct {h_1,\dots,h_k}; indeed, it implies that this conjecture holds if one performs a small amount of averaging in the {h_1,\dots,h_k}.
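
The left-hand side of (4) is easy to experiment with. The following sketch, with hypothetical function names and small illustrative values of {X} and {h}, sieves the Liouville function, forms the short sums {\sum_{x \leq n \leq x+h} \lambda(n)} using prefix sums, and returns the ratio of the left-hand side of (4) to {h^2 X}. It is only a numerical illustration and says nothing about the asymptotic regime of Theorem 6.

```python
def liouville_up_to(N):
    """lam[n] = (-1)^Omega(n) for 1 <= n <= N, via a smallest-prime-factor sieve."""
    spf = list(range(N + 1))
    for i in range(2, int(N ** 0.5) + 1):
        if spf[i] == i:
            for j in range(i * i, N + 1, i):
                if spf[j] == j:
                    spf[j] = i
    lam = [0] * (N + 1)
    if N >= 1:
        lam[1] = 1
    for n in range(2, N + 1):
        lam[n] = -lam[n // spf[n]]
    return lam

def short_interval_sums(X, h):
    """List of sum_{x <= n <= x+h} lam(n) for each integer x in [X, 2X] (X >= 1)."""
    N = 2 * X + h
    lam = liouville_up_to(N)
    prefix = [0] * (N + 1)
    for n in range(1, N + 1):
        prefix[n] = prefix[n - 1] + lam[n]
    return [prefix[x + h] - prefix[x - 1] for x in range(X, 2 * X + 1)]

def mean_square_ratio(X, h):
    """Left-hand side of (4) divided by h^2 X."""
    return sum(s * s for s in short_interval_sums(X, h)) / (h * h * X)

if __name__ == "__main__":
    for X, h in [(10_000, 10), (10_000, 100), (100_000, 100)]:
        print(X, h, mean_square_ratio(X, h))
```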

Below the fold, we give a “cheap” version of the Matomaki-Radziwiłł argument. More precisely, we establish

Theorem 7 (Cheap Matomaki-Radziwiłł) Let {X} be a parameter going to infinity, and let {1 \leq T \leq X}. Then

\displaystyle  \int_X^{X^A} \left|\sum_{x \leq n \leq e^{1/T} x} \frac{\lambda(n)}{n}\right|^2\frac{dx}{x} = o\left( \frac{\log X}{T^2} \right), \ \ \ \ \ (5)

for any fixed {A>1}.

Note that (5) improves upon the trivial bound of {O( \frac{\log X}{T^2} )}. Again, one can replace {\lambda} with {\mu} if desired. Due to the cheapness of Theorem 7, the proof will require few ingredients; the deepest input is the improved zero-free region for the Riemann zeta function due to Vinogradov and Korobov. Other than that, the main tools are the Turan-Kubilius result established above, and some Fourier (or complex) analysis.
