Kaisa Matomäki, Xuancheng Shao, Joni Teräväinen, and myself have just uploaded to the arXiv our preprint “Higher uniformity of arithmetic functions in short intervals I. All intervals”. This paper investigates the higher order (Gowers) uniformity of standard arithmetic functions in analytic number theory (and specifically, the Möbius function $\mu$, the von Mangoldt function $\Lambda$, and the generalised divisor functions $d_k$) in short intervals $(X, X+H]$, where $X$ is large and $H$ lies in the range $X^{\theta+\varepsilon} \leq H \leq X^{1-\varepsilon}$ for a fixed constant $0 \leq \theta < 1$ (that one would like to be as small as possible). If we let $f$ denote one of the functions $\mu, \Lambda, d_k$, then there is extensive literature on the estimation of short sums $\sum_{X < n \leq X+H} f(n)$.
Traditionally, asymptotics for such sums are expressed in terms of a “main term” of some arithmetic nature, plus an error term that is estimated in magnitude. For instance, a sum such as $\sum_{X < n \leq X+H} \Lambda(n) e(-\alpha n)$ (with $e(t) := e^{2\pi i t}$) would be approximated in terms of a main term that vanishes (or is negligible) if $\alpha$ is “minor arc”, but would be expressible in terms of something like a Ramanujan sum if $\alpha$ is “major arc”, together with an error term. We found it convenient to cancel off such main terms by subtracting an approximant $f^\sharp$ from each of the arithmetic functions $f$
and then getting upper bounds on remainder correlations such as
- For the Möbius function $\mu$, we simply set $\mu^\sharp = 0$, as per the Möbius pseudorandomness conjecture. (One could choose a more sophisticated approximant in the presence of a Siegel zero, as I did with Joni in this recent paper, but we do not do so here.)
- For the von Mangoldt function $\Lambda$, we eventually went with the Cramér-Granville approximant $\Lambda^\sharp(n) = \frac{P(R)}{\varphi(P(R))} 1_{(n,P(R))=1}$, where $P(R) := \prod_{p < R} p$ and $R := \exp(\log^{1/10} X)$.
- For the divisor functions $d_k$, we used a somewhat complicated-looking approximant $d_k^\sharp$, built from some explicit polynomials and chosen so that $d_k^\sharp$ and $d_k$ have almost exactly the same sums along arithmetic progressions (see the paper for details).
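To get a feel for what an approximant of Cramér type does, here is a minimal numerical sketch. It assumes the standard form of the model, namely the weight $P/\varphi(P)$ placed on integers coprime to $P = \prod_{p<R} p$; the cutoff `R`, the interval, and all function names are illustrative choices made here, not parameters taken from the paper.

```python
from math import gcd, log

def primes_below(limit):
    """Return all primes p < limit (sieve of Eratosthenes)."""
    if limit < 2:
        return []
    sieve = bytearray([1]) * limit
    sieve[0:2] = b"\x00\x00"
    for p in range(2, int(limit ** 0.5) + 1):
        if sieve[p]:
            sieve[p * p::p] = bytearray(len(range(p * p, limit, p)))
    return [p for p in range(limit) if sieve[p]]

def von_mangoldt_table(n_max):
    """lam[n] = log p if n is a power of a prime p, and 0 otherwise."""
    lam = [0.0] * (n_max + 1)
    for p in primes_below(n_max + 1):
        q = p
        while q <= n_max:
            lam[q] = log(p)
            q *= p
    return lam

def cramer_model(R):
    """Return (P, P/phi(P)) for P = product of the primes p < R."""
    P, ratio = 1, 1.0
    for p in primes_below(R):
        P *= p
        ratio *= p / (p - 1)
    return P, ratio

def short_sums(X, H, R):
    """Sum Lambda and the Cramer-type weight over the interval (X, X+H]."""
    lam = von_mangoldt_table(X + H)
    P, ratio = cramer_model(R)
    interval = range(X + 1, X + H + 1)
    sum_lambda = sum(lam[n] for n in interval)
    sum_model = sum(ratio for n in interval if gcd(n, P) == 1)
    return sum_lambda, sum_model
```

Calling, say, `short_sums(10_000, 2_000, 20)` returns the two sums side by side; both should be of size roughly $H$, which is the sense in which the model captures the "main term" of $\Lambda$ on the interval.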
The objective is then to obtain bounds on sums such as (1) that improve upon the “trivial bound” that one can get with the triangle inequality and standard number theory bounds such as the Brun-Titchmarsh inequality. For $f = \mu$ and $f = \Lambda$, the Siegel-Walfisz theorem suggests that it is reasonable to expect error terms that have “strongly logarithmic savings” in the sense that they gain a factor of $O_A(\log^{-A} X)$ over the trivial bound for any $A > 0$; for $f = d_k$, the Dirichlet hyperbola method suggests instead that one has “power savings” in that one should gain a factor of $O(X^{-c_k})$ over the trivial bound for some $c_k > 0$. In the case of the Möbius function $\mu$, there is an additional trick (introduced by Matomäki and Teräväinen) that allows one to lower the exponent $\theta$ somewhat at the cost of only obtaining “weakly logarithmic savings” of shape $\log^{-c} X$ for some small $c > 0$.
Our main estimates on sums of the form (1) work in the following ranges:
- For
, one can obtain strongly logarithmic savings on (1) for
, and power savings for
.
- For
, one can obtain weakly logarithmic savings for
.
- For
, one can obtain power savings for
.
- For
, one can obtain power savings for
.
Conjecturally, one should be able to obtain power savings in all cases, and lower $\theta$ down to zero, but the ranges of exponents and savings given here seem to be the limit of current methods unless one assumes additional hypotheses, such as GRH. The
result for correlation against Fourier phases
was established previously by Zhan, and the
result for such phases and
was established previously by Matomäki and Teräväinen.
By combining these results with tools from additive combinatorics, one can obtain a number of applications:
- Direct insertion of our bounds in the recent work of Kanigowski, Lemańczyk, and Radziwiłł on the prime number theorem on dynamical systems that are analytic skew products gives some improvements in the exponents there.
- We can obtain a “short interval” version of a multiple ergodic theorem along primes established by Frantzikinakis-Host-Kra and Wooley-Ziegler, in which we average over intervals of the form
rather than
.
- We can obtain a “short interval” version of the “linear equations in primes” asymptotics obtained by Ben Green, Tamar Ziegler, and myself in this sequence of papers, where the variables in these equations lie in short intervals
rather than long intervals such as
.
We now briefly discuss some of the ingredients of proof of our main results. The first step is standard, using combinatorial decompositions (based on the Heath-Brown identity and (for the result) the Ramaré identity) to decompose
into more tractable sums of the following types:
- Type
sums, which are basically of the form
for some weights
of controlled size and some cutoff
that is not too large;
- Type
sums, which are basically of the form
for some weights
,
of controlled size and some cutoffs
that are not too close to
or to
;
- Type
sums, which are basically of the form
for some weights
of controlled size and some cutoff
that is not too large.
The precise ranges of the cutoffs depend on the choice of
; our methods fail once these cutoffs pass a certain threshold, and this is the reason for the exponents
being what they are in our main results.
The Type sums involving nilsequences can be treated by methods similar to those in this previous paper of Ben Green and myself; the main innovations are in the treatment of the Type
and Type
sums.
For the Type sums, one can split into the “abelian” case in which (after some Fourier decomposition) the nilsequence
is basically of the form
, and the “non-abelian” case in which
is non-abelian and
exhibits non-trivial oscillation in a central direction. In the abelian case we can adapt arguments of Matomäki and Shao, which use Cauchy-Schwarz and the equidistribution properties of polynomials to obtain good bounds unless
is “major arc” in the sense that it resembles (or “pretends to be”)
for some Dirichlet character
and some frequency
, but in this case one can use classical multiplicative methods to control the correlation. It turns out that the non-abelian case can be treated similarly. After applying Cauchy-Schwarz, one ends up analyzing the equidistribution of the four-variable polynomial sequence
For the type sum, a model sum to study is
In a sequel to this paper (currently in preparation), we will obtain analogous results for almost all intervals with
in the range
, in which we will be able to lower
all the way to
.
Kaisa Matomäki, Maksym Radziwiłł, Xuancheng Shao, Joni Teräväinen, and myself have just uploaded to the arXiv our preprint “Singmaster’s conjecture in the interior of Pascal’s triangle”. This paper leverages the theory of exponential sums over primes to make progress on a well known conjecture of Singmaster which asserts that any natural number larger than $1$ appears at most a bounded number of times in Pascal’s triangle. That is to say, for any integer $t \geq 2$, there are at most $O(1)$ solutions to the equation
$$\binom{m}{n} = t. \qquad (1)$$
Our main result settles this conjecture in the “interior” region of the triangle:
Theorem 1 (Singmaster’s conjecture in the interior of the triangle) Ifand
is sufficiently large depending on
, there are at most two solutions to (1) in the region
and hence at most four in the region
Also, there is at most one solution in the region
To verify Singmaster’s conjecture in full, it thus suffices in view of this result to verify the conjecture in the boundary region
(or equivalently
).
The upper bound of two here for the number of solutions in the region (2) is best possible, due to the infinite family of solutions to the equation
coming from
The appearance of the quantity in Theorem 1 may be familiar to readers that are acquainted with Vinogradov’s bounds on exponential sums, which ends up being the main new ingredient in our arguments. In principle this threshold could be lowered if we had stronger bounds on exponential sums.
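The infinite family of two-fold collisions can be checked by machine. The sketch below uses the classical Fibonacci parametrisation of solutions to $\binom{m+1}{n+1} = \binom{m}{n+2}$, written here as $n = F_{2j} F_{2j+3} - 1$, $m = F_{2j+2} F_{2j+3} - 1$ with $F_1 = F_2 = 1$. This indexing is an assumption on my part (the formulas did not survive in the text above), and the function names are illustrative.

```python
from math import comb

def fib(k):
    """k-th Fibonacci number, with fib(1) = fib(2) = 1."""
    a, b = 0, 1
    for _ in range(k):
        a, b = b, a + b
    return a

def collision(j):
    """Return (m, n) from the Fibonacci family, for which
    C(m+1, n+1) is expected to equal C(m, n+2)."""
    n = fib(2 * j) * fib(2 * j + 3) - 1
    m = fib(2 * j + 2) * fib(2 * j + 3) - 1
    return m, n

def is_collision(m, n):
    """Check the identity C(m+1, n+1) == C(m, n+2) exactly."""
    return comb(m + 1, n + 1) == comb(m, n + 2)
```

For $j = 1$ this gives $(m, n) = (14, 4)$, that is $\binom{15}{5} = \binom{14}{6} = 3003$. Since `math.comb` works with exact integers, larger members of the family can be verified in the same way.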
To try to control solutions to (1) we use a combination of “Archimedean” and “non-Archimedean” approaches. In the “Archimedean” approach (following earlier work of Kane on this problem) we view primarily as real numbers rather than integers, and express (1) in terms of the Gamma function as
Proposition 2 Let, and suppose
is sufficiently large depending on
. If
is a solution to (1) in the left half
of Pascal’s triangle, then there is at most one other solution
to this equation in the left half with
Again, the example of (4) shows that a cluster of two solutions is certainly possible; the convexity argument only kicks in once one has a cluster of three or more solutions.
To finish the proof of Theorem 1, one has to show that any two solutions to (1) in the region of interest must be close enough for the above proposition to apply. Here we switch to the “non-Archimedean” approach, in which we look at the $p$-adic valuations $\nu_p(\binom{m}{n})$ of the binomial coefficients, defined as the number of times a prime $p$ divides $\binom{m}{n}$. From the fundamental theorem of arithmetic, a collision
A key idea in our approach is to view this condition (6) statistically, for instance by viewing $p$ as a prime drawn randomly from an interval such as
for some suitably chosen scale parameter
, so that the two sides of (6) now become random variables. It then becomes advantageous to compare correlations between these two random variables and some additional test random variable. For instance, if
and
are far apart from each other, then one would expect the left-hand side of (6) to have a higher correlation with the fractional part
, since this term shows up in the summation on the left-hand side but not the right. Similarly if
and
are far apart from each other (although there are some annoying cases one has to treat separately when there is some “unexpected commensurability”, for instance if
is a rational multiple of
where the rational has bounded numerator and denominator). In order to execute this strategy, it turns out (after some standard Fourier expansion) that one needs to get good control on exponential sums such as
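The $p$-adic valuations of binomial coefficients that drive this comparison can be computed from Legendre's formula, $\nu_p(\binom{m}{n}) = \sum_{j \geq 1} \left( \lfloor m/p^j \rfloor - \lfloor n/p^j \rfloor - \lfloor (m-n)/p^j \rfloor \right)$. Each summand equals $\{n/p^j\} + \{(m-n)/p^j\} - \{m/p^j\}$, which is how fractional parts enter the picture. The following is a small sketch with illustrative function names, not code from the paper.

```python
from math import comb

def nu_p_binomial(m, n, p):
    """p-adic valuation of C(m, n) via Legendre's formula (0 <= n <= m)."""
    total, q = 0, p
    while q <= m:
        # Each term is 0 or 1: it records a carry when adding n and m - n in base p.
        total += m // q - n // q - (m - n) // q
        q *= p
    return total

def nu_p(t, p):
    """p-adic valuation of a positive integer t, by repeated division."""
    v = 0
    while t % p == 0:
        t //= p
        v += 1
    return v

def same_valuations(m1, n1, m2, n2, primes):
    """Necessary condition for C(m1, n1) == C(m2, n2): equal valuations at each prime."""
    return all(nu_p_binomial(m1, n1, p) == nu_p_binomial(m2, n2, p) for p in primes)
```

For example, `same_valuations(15, 5, 14, 6, [2, 3, 5, 7, 11, 13])` holds, consistent with the collision $\binom{15}{5} = \binom{14}{6}$.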
A modification of the arguments also gives similar results for the equation
where
Theorem 3 Ifand
is sufficiently large depending on
, there are at most two solutions to (7) in the region
Again the upper bound of two is best possible, thanks to identities such as
Kaisa Matomäki, Maksym Radziwiłł, Joni Teräväinen, Tamar Ziegler and I have uploaded to the arXiv our paper “Higher uniformity of bounded multiplicative functions in short intervals on average”. This paper (which originated from a working group at an AIM workshop on Sarnak’s conjecture) focuses on the local Fourier uniformity conjecture for bounded multiplicative functions such as the Liouville function $\lambda$. One form of this conjecture is the assertion that
The conjecture gets more difficult as increases, and also becomes more difficult the more slowly
grows with
. The
conjecture is equivalent to the assertion
For , the conjecture is equivalent to the assertion
Now we apply the same strategy to (4). For abelian the claim follows easily from (3), so we focus on the non-abelian case. One now has a polynomial sequence
attached to many
, and after a somewhat complicated adaptation of the above arguments one again ends up with an approximate functional equation
We give two applications of this higher order Fourier uniformity. One regards the growth of the number
The second application is to obtain cancellation for various polynomial averages involving the Liouville function $\lambda$ or von Mangoldt function $\Lambda$, such as
Kaisa Matomäki, Maksym Radziwiłł, and I just uploaded to the arXiv our paper “Fourier uniformity of bounded multiplicative functions in short intervals on average”. This paper is the outcome of our attempts during the MSRI program in analytic number theory last year to attack the local Fourier uniformity conjecture for the Liouville function $\lambda$. This conjecture generalises a landmark result of Matomäki and Radziwiłł, who show (among other things) that one has the asymptotic
$$\int_X^{2X} \left|\sum_{x \leq n \leq x+H} \lambda(n)\right|\, dx = o(HX) \qquad (1)$$
whenever $1 \leq H \leq X$ and $H = H(X)$ goes to infinity as $X \to \infty$. Informally, this says that the Liouville function has small mean for almost all short intervals $[x, x+H]$. The remarkable thing about this theorem is that there is no lower bound on how $H$ goes to infinity with $X$; one can take for instance $H = \log\log\log X$. This lack of lower bound was crucial when I applied this result (or more precisely, a generalisation of this result to arbitrary non-pretentious bounded multiplicative functions) a few years ago to solve the Erdős discrepancy problem, as well as a logarithmically averaged two-point Chowla conjecture; for instance, it implies that
$$\sum_{n \leq X} \frac{\lambda(n) \lambda(n+1)}{n} = o(\log X).$$
The local Fourier uniformity conjecture asserts the stronger asymptotic
$$\int_X^{2X} \sup_{\alpha \in \mathbb{R}} \left|\sum_{x \leq n \leq x+H} \lambda(n) e(-\alpha n)\right|\, dx = o(HX) \qquad (2)$$
under the same hypotheses on $H$ and $X$. As I worked out in a previous paper, this conjecture would imply a logarithmically averaged three-point Chowla conjecture, implying for instance that
$$\sum_{n \leq X} \frac{\lambda(n) \lambda(n+1) \lambda(n+2)}{n} = o(\log X).$$
This particular bound also follows from some slightly different arguments of Joni Teräväinen and myself, but the implication would also work for other non-pretentious bounded multiplicative functions, whereas the arguments of Joni and myself rely more heavily on the specific properties of the Liouville function (in particular that $\lambda(p) = -1$ for all primes $p$).
There is also a higher order version of the local Fourier uniformity conjecture in which the linear phase $n \mapsto e(-\alpha n)$ is replaced with a polynomial phase such as $n \mapsto e(-\alpha_d n^d - \dots - \alpha_1 n - \alpha_0)$, or more generally a nilsequence $n \mapsto \overline{F(g(n)\Gamma)}$; as shown in my previous paper, this conjecture implies (and is in fact equivalent to, after logarithmic averaging) a logarithmically averaged version of the full Chowla conjecture (not just the two-point or three-point versions), as well as a logarithmically averaged version of the Sarnak conjecture.
The main result of the current paper is to obtain some cases of the local Fourier uniformity conjecture:
Theorem 1 The asymptotic (2) is true when $H = X^\theta$ for a fixed $\theta > 0$.
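The quantity in the local Fourier uniformity conjecture, namely the average over $x \in [X, 2X]$ of $\sup_\alpha |\sum_{x \leq n < x+H} \lambda(n) e(-\alpha n)|$ normalised by $H$, can be explored numerically on a small scale. The sketch below is only an illustration under stated simplifications: the supremum over $\alpha$ is replaced by a maximum over a finite grid (so it gives a lower bound for the true supremum), the integral in $x$ is replaced by a coarse sample, and all names and parameter values are choices made here.

```python
import cmath

def liouville_table(n_max):
    """lam[n] = (-1)^Omega(n) for 1 <= n <= n_max (lam[0] is unused)."""
    spf = list(range(n_max + 1))  # smallest prime factor of each n
    for p in range(2, int(n_max ** 0.5) + 1):
        if spf[p] == p:
            for q in range(p * p, n_max + 1, p):
                if spf[q] == q:
                    spf[q] = p
    lam = [1] * (n_max + 1)
    for n in range(2, n_max + 1):
        lam[n] = -lam[n // spf[n]]  # complete multiplicativity, lam(p) = -1
    return lam

def local_fourier_sup(lam, x, H, grid=64):
    """max over alpha = k/grid of |sum_{x <= n < x+H} lam(n) e(-alpha n)| / H."""
    best = 0.0
    for k in range(grid):
        alpha = k / grid
        s = sum(lam[n] * cmath.exp(-2j * cmath.pi * alpha * n)
                for n in range(x, x + H))
        best = max(best, abs(s))
    return best / H

def averaged_sup(X, H, samples=20, grid=64):
    """Average of local_fourier_sup over a coarse sample of x in [X, 2X)."""
    lam = liouville_table(2 * X + H)
    xs = range(X, 2 * X, max(1, X // samples))
    return sum(local_fourier_sup(lam, x, H, grid) for x in xs) / len(xs)
```

The conjecture predicts that the (true) averaged quantity tends to zero as $X \to \infty$ once $H \to \infty$; the convergence is slow, so small experiments such as `averaged_sup(10_000, 200)` only give a rough impression.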
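Previously this was known for $\theta > 5/8$ by the work of Zhan (who in fact proved the stronger pointwise assertion
$$\sup_{\alpha \in \mathbb{R}} \left|\sum_{x \leq n \leq x+H} \lambda(n) e(-\alpha n)\right| = o(H)$$
for $X \leq x \leq 2X$ in this case). In a previous paper with Kaisa and Maksym, we also proved a weak version
$$\sup_{\alpha \in \mathbb{R}} \int_X^{2X} \left|\sum_{x \leq n \leq x+H} \lambda(n) e(-\alpha n)\right|\, dx = o(HX) \qquad (3)$$
of (2) for any $H$ growing arbitrarily slowly with $X$; this is stronger than (1) (and is in fact proven by a variant of the method) but significantly weaker than (2), because in the latter the worst-case $\alpha$ is permitted to depend on the $x$ parameter, whereas in (3) $\alpha$ must remain independent of $x$.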
Unfortunately, the restriction is not strong enough to give applications to Chowla-type conjectures (one would need something more like
for this). However, it can still be used to control some sums that had not previously been manageable. For instance, a quick application of the circle method lets one use the above theorem to derive the asymptotic
whenever for a fixed
, where
is the von Mangoldt function. Amusingly, the seemingly simpler question of establishing the expected asymptotic for
is only known in the range (from the work of Zaccagnini). Thus we have a rare example of a number theory sum that becomes easier to control when one inserts a Liouville function!
We now give an informal description of the strategy of proof of the theorem (though for numerous technical reasons, the actual proof deviates in some respects from the description given here). If (2) failed, then for many values of we would have the lower bound
for some frequency . We informally describe this correlation between
and
by writing
for (informally, one should view this as asserting that
“behaves like” a constant multiple of
). For sake of discussion, suppose we have this relationship for all
, not just many.
As mentioned before, the main difficulty here is to understand how varies with
. As it turns out, the multiplicativity properties of the Liouville function place a significant constraint on this dependence. Indeed, if we let
be a fairly small prime (e.g. of size
for some
), and use the identity
for the Liouville function to conclude (at least heuristically) from (4) that
for . (In practice, we will have this sort of claim for many primes
rather than all primes
, after using tools such as the Turán-Kubilius inequality, but we ignore this distinction for this informal argument.)
Now let and
be primes comparable to some fixed range
such that
Then we have both
and
on essentially the same range of (two nearby intervals of length
). This suggests that the frequencies
and
should be close to each other modulo
, in particular one should expect the relationship
Comparing this with (5) one is led to the expectation that should depend inversely on
in some sense (for instance one can check that
would solve (6) if ; by Taylor expansion, this would correspond to a global approximation of the form
). One now has a problem of an additive combinatorial flavour (or of a “local to global” flavour), namely to leverage the relation (6) to obtain global control on
that resembles (7).
A key obstacle in solving (6) efficiently is the fact that one only knows that and
are close modulo
, rather than close on the real line. One can start resolving this problem by the Chinese remainder theorem, using the fact that we have the freedom to shift (say)
by an arbitrary integer. After doing so, one can arrange matters so that one in fact has the relationship
whenever and
obey (5). (This may force
to become extremely large, on the order of
, but this will not concern us.)
Now suppose that we have and primes
such that
For every prime , we can find an
such that
is within
of both
and
. Applying (8) twice we obtain
and
and thus by the triangle inequality we have
for all ; hence by the Chinese remainder theorem
In practice, in the regime that we are considering, the modulus
is so huge we can effectively ignore it (in the spirit of the Lefschetz principle); so let us pretend that we in fact have
whenever and
obey (9).
Now let be an integer to be chosen later, and suppose we have primes
such that the difference
is small but non-zero. If is chosen so that
(where one is somewhat loose about what means) then one can then find real numbers
such that
for , with the convention that
. We then have
which telescopes to
and thus
and hence
In particular, for each , we expect to be able to write
for some . This quantity
can vary with
; but from (10) and a short calculation we see that
whenever obey (9) for some
.
Now imagine a “graph” in which the vertices are elements of
, and two elements
are joined by an edge if (9) holds for some
. Because of exponential sum estimates on
, this graph turns out to essentially be an “expander” in the sense that any two vertices
can be connected (in multiple ways) by fairly short paths in this graph (if one allows one to modify one of
or
by
). As a consequence, we can assume that this quantity
is essentially constant in
(cf. the application of the ergodic theorem in this previous blog post), thus we now have
for most and some
. By Taylor expansion, this implies that
on for most
, thus
But this can be shown to contradict the Matomäki-Radziwill theorem (because the multiplicative function is known to be non-pretentious).
Kaisa Matomäki, Maksym Radziwiłł, and I have uploaded to the arXiv our paper “Correlations of the von Mangoldt and higher divisor functions II. Divisor correlations in short ranges”. This is a sequel of sorts to our previous paper on divisor correlations, though the proof techniques in this paper are rather different. As with the previous paper, our interest is in correlations such as
$$\sum_{n \leq X} d_k(n) d_l(n+h) \qquad (1)$$
for medium-sized $h$ and large $X$, where $k \geq l \geq 2$ are natural numbers and $d_k(n) := \sum_{n = m_1 \cdots m_k} 1$ is the $k^{\mathrm{th}}$ divisor function (actually our methods can also treat a generalisation in which $k$ is non-integer, but for simplicity let us stick with the integer case for this discussion). Our methods also allow for one of the divisor function factors to be replaced with a von Mangoldt function, but (in contrast to the previous paper) we cannot treat the case when both factors are von Mangoldt.
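For concreteness, the divisor correlations $\sum_{n \leq X} d_k(n) d_l(n+h)$ can be tabulated directly for small parameters. The sketch below builds $d_k$ by repeated Dirichlet convolution with the constant function $1$; it is a brute-force illustration with hypothetical function names, useful only for experimenting with small $X$.

```python
def divisor_table(k, n_max):
    """d[n] = number of ways to write n as an ordered product of k positive integers."""
    d = [0] + [1] * n_max  # d_1(n) = 1 for all n >= 1
    for _ in range(k - 1):
        nxt = [0] * (n_max + 1)
        for a in range(1, n_max + 1):
            for b in range(a, n_max + 1, a):
                nxt[b] += d[a]  # d_{j+1}(b) = sum over divisors a of b of d_j(a)
        d = nxt
    return d

def divisor_correlation(k, l, h, X):
    """Compute sum_{n <= X} d_k(n) * d_l(n + h) by direct summation."""
    dk = divisor_table(k, X + h)
    dl = divisor_table(l, X + h)
    return sum(dk[n] * dl[n + h] for n in range(1, X + 1))
```

As a sanity check, `divisor_table(2, 12)[12]` counts the six divisors of $12$, and `divisor_table(3, 12)[12]` counts the ordered factorisations of $12$ into three factors.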
As discussed in this previous post, one heuristically expects an asymptotic of the form
for any fixed , where
is a certain explicit (but rather complicated) polynomial of degree
. Such asymptotics are known when
, but remain open for
. In the previous paper, we were able to obtain a weaker bound of the form
for of the shifts
, whenever the shift range
lies between
and
. But the methods become increasingly hard to use as
gets smaller. In this paper, we use a rather different method to obtain the even weaker bound
for of the shifts
, where
can now be as short as
. The constant
can be improved, but there are serious obstacles to using our method to go below
(as the exceptionally large values of
then begin to dominate). This can be viewed as an analogue to our previous paper on correlations of bounded multiplicative functions on average, in which the functions
are now unbounded, and indeed our proof strategy is based in large part on that paper (but with many significant new technical complications).
We now discuss some of the ingredients of the proof. Unsurprisingly, the first step is the circle method, expressing (1) in terms of exponential sums such as
Actually, it is convenient to first prune slightly by zeroing out this function on “atypical” numbers
that have an unusually small or large number of factors in a certain sense, but let us ignore this technicality for this discussion. The contribution of
for “major arc”
can be treated by standard techniques (and is the source of the main term
); the main difficulty comes from treating the contribution of “minor arc”
.
In our previous paper on bounded multiplicative functions, we used Plancherel’s theorem to estimate the global norm
, and then also used the Katai-Bourgain-Sarnak-Ziegler orthogonality criterion to control local
norms
, where
was a minor arc interval of length about
, and these two estimates together were sufficient to get a good bound on correlations by an application of Hölder’s inequality. For
, it is more convenient to use Dirichlet series methods (and Ramaré-type factorisations of such Dirichlet series) to control local
norms on minor arcs, in the spirit of the proof of the Matomäki-Radziwiłł theorem; a key point is to develop “log-free” mean value theorems for Dirichlet series associated to functions such as
, so as not to wipe out the (rather small) savings one will get over the trivial bound from this method. On the other hand, the global
bound will definitely be unusable, because the
sum
has too many unwanted factors of
. Fortunately, we can substitute this global
bound with a “large values” bound that controls expressions such as
for a moderate number of disjoint intervals , with a bound that is slightly better (for
a medium-sized power of
) than what one would have obtained by bounding each integral
separately. (One needs to save more than
for the argument to work; we end up saving a factor of about
.) This large values estimate is probably the most novel contribution of the paper. After taking the Fourier transform, matters basically reduce to getting a good estimate for
where is the midpoint of
; thus we need some upper bound on the large local Fourier coefficients of
. These coefficients are difficult to calculate directly, but, in the spirit of a paper of Ben Green and myself, we can try to replace
by a more tractable and “pseudorandom” majorant
for which the local Fourier coefficients are computable (on average). After a standard duality argument, one ends up having to control expressions such as
after various averaging in the parameters. These local Fourier coefficients of
turn out to be small on average unless
is “major arc”. One then is left with a mostly combinatorial problem of trying to bound how often this major arc scenario occurs. This is very close to a computation in the previously mentioned paper of Ben and myself; there is a technical wrinkle in that the
are not as well separated as they were in my paper with Ben, but it turns out that one can modify the arguments in that paper to still obtain a satisfactory estimate in this case (after first grouping nearby frequencies
together, and modifying the duality argument accordingly).
Kaisa Matomäki, Maksym Radziwiłł, and I have uploaded to the arXiv our paper “Correlations of the von Mangoldt and higher divisor functions I. Long shift ranges”, submitted to Proceedings of the London Mathematical Society. This paper is concerned with the estimation of correlations such as
$$\sum_{n \leq X} \Lambda(n) \Lambda(n+h) \qquad (1)$$
for medium-sized $h$ and large $X$, where $\Lambda$ is the von Mangoldt function; we also consider variants of this sum in which one of the von Mangoldt functions is replaced with a (higher order) divisor function, but for sake of discussion let us focus just on the sum (1). Understanding this sum is very closely related to the problem of finding pairs of primes that differ by $h$; for instance, if one could establish a lower bound
$$\sum_{n \leq X} \Lambda(n) \Lambda(n+2) \gg X$$
then this would easily imply the twin prime conjecture.
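The (first) Hardy-Littlewood conjecture asserts an asymptotic
$$\sum_{n \leq X} \Lambda(n) \Lambda(n+h) = \mathfrak{S}(h) X + o(X) \qquad (2)$$
as $X \to \infty$ for any fixed positive $h$, where the singular series $\mathfrak{S}(h)$ is an arithmetic factor arising from the irregularity of distribution of $\Lambda$ at small moduli, defined explicitly by
$$\mathfrak{S}(h) := 2 \Pi_2 \prod_{p | h;\ p > 2} \frac{p-1}{p-2}$$
when $h$ is even, and $\mathfrak{S}(h) := 0$ when $h$ is odd, where
$$\Pi_2 := \prod_{p > 2} \left(1 - \frac{1}{(p-1)^2}\right) = 0.66016\dots$$
is (half of) the twin prime constant. See for instance this previous blog post for a heuristic explanation of this conjecture. From the previous discussion we see that (2) for $h = 2$ would imply the twin prime conjecture. Sieve theoretic methods are only able to provide an upper bound of the form $\sum_{n \leq X} \Lambda(n) \Lambda(n+h) \ll \mathfrak{S}(h) X$.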
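The Hardy-Littlewood prediction is easy to test numerically for small shifts. The sketch below assumes the standard form of the singular series, $\mathfrak{S}(h) = 2\Pi_2 \prod_{p | h,\ p > 2} \frac{p-1}{p-2}$ for even $h$ and $0$ for odd $h$, truncates the Euler product at a cutoff (an approximation), and compares $\mathfrak{S}(h) X$ with a directly computed correlation. Function names and default parameters are illustrative.

```python
from math import log

def primes_up_to(n):
    """Return all primes p <= n (sieve of Eratosthenes), for n >= 1."""
    sieve = bytearray([1]) * (n + 1)
    sieve[0:2] = b"\x00\x00"
    for p in range(2, int(n ** 0.5) + 1):
        if sieve[p]:
            sieve[p * p::p] = bytearray(len(range(p * p, n + 1, p)))
    return [p for p in range(2, n + 1) if sieve[p]]

def von_mangoldt_table(n_max):
    """lam[n] = log p if n is a power of a prime p, and 0 otherwise."""
    lam = [0.0] * (n_max + 1)
    for p in primes_up_to(n_max):
        q = p
        while q <= n_max:
            lam[q] = log(p)
            q *= p
    return lam

def singular_series(h, prime_cutoff=10_000):
    """Truncated singular series; assumes every odd prime factor of h is <= prime_cutoff."""
    if h % 2 == 1:
        return 0.0
    s = 2.0
    for p in primes_up_to(prime_cutoff):
        if p == 2:
            continue
        s *= 1 - 1 / (p - 1) ** 2      # factor of the twin prime constant
        if h % p == 0:
            s *= (p - 1) / (p - 2)     # local correction at primes dividing h
    return s

def correlation(h, X):
    """Compute sum_{n <= X} Lambda(n) * Lambda(n + h) directly."""
    lam = von_mangoldt_table(X + h)
    return sum(lam[n] * lam[n + h] for n in range(1, X + 1))
```

Comparing `correlation(h, X)` with `singular_series(h) * X` for a few even $h$ and moderate $X$ shows the kind of agreement the conjecture predicts; this is of course numerical evidence only, not a proof for any $h$.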
Needless to say, apart from the trivial case of odd $h$, there are no values of $h$ for which the Hardy-Littlewood conjecture is known. However there are some results that say that this conjecture holds “on the average”: in particular, if $H$ is a quantity depending on $X$ that is somewhat large, there are results that show that (2) holds for most (i.e. for $(1-o(1))H$) of the $h$ between $0$ and $H$. Ideally one would like to get $H$ as small as possible; in particular one can view the full Hardy-Littlewood conjecture as the endpoint case when $H$ is bounded.
The first results in this direction were by van der Corput and by Lavrik, who established such a result with (with a subsequent refinement by Balog); Wolke lowered
to
, and Mikawa lowered
further to
. The main result of this paper is a further lowering of
to
. In fact (as in the preceding works) we get a better error term than
, namely an error of the shape
for any
.
Our arguments initially proceed along standard lines. One can use the Hardy-Littlewood circle method to express the correlation in (2) as an integral involving exponential sums . The contribution of “major arc”
is known by a standard computation to recover the main term
plus acceptable errors, so it is a matter of controlling the “minor arcs”. After averaging in
and using the Plancherel identity, one is basically faced with establishing a bound of the form
for any “minor arc” . If
is somewhat close to a low height rational
(specifically, if it is within
of such a rational with
), then this type of estimate is roughly of comparable strength (by another application of Plancherel) to the best available prime number theorem in short intervals on the average, namely that the prime number theorem holds for most intervals of the form
, and we can handle this case using standard mean value theorems for Dirichlet series. So we can restrict attention to the “strongly minor arc” case where
is far from such rationals.
The next step (following some ideas we found in a paper of Zhan) is to rewrite this estimate not in terms of the exponential sums , but rather in terms of the Dirichlet polynomial
. After a certain amount of computation (including some oscillatory integral estimates arising from stationary phase), one is eventually reduced to the task of establishing an estimate of the form
for any (with
sufficiently large depending on
).
The next step, which is again standard, is the use of the Heath-Brown identity (as discussed for instance in this previous blog post) to split up into a number of components that have a Dirichlet convolution structure. Because the exponent
we are shooting for is less than
, we end up with five types of components that arise, which we call “Type
“, “Type
“, “Type
“, “Type
“, and “Type II”. The “Type II” sums are Dirichlet convolutions involving a factor supported on a range
and are quite easy to deal with; the “Type
” terms are Dirichlet convolutions that resemble (non-degenerate portions of) the
divisor function, formed from convolving together
portions of
. The “Type
” and “Type
” terms can be estimated satisfactorily by standard moment estimates for Dirichlet polynomials; this already recovers the result of Mikawa (and our argument is in fact slightly more elementary in that no Kloosterman sum estimates are required). It is the treatment of the “Type
” and “Type
” sums that requires some new analysis, with the Type
terms turning out to be the most delicate. After using an existing moment estimate of Jutila for Dirichlet L-functions, matters reduce to obtaining a family of estimates, a typical one of which (relating to the more difficult Type
sums) is of the form
for “typical” ordinates of size
, where
is the Dirichlet polynomial
(a fragment of the Riemann zeta function). The precise definition of “typical” is a little technical (because of the complicated nature of Jutila’s estimate) and will not be detailed here. Such a claim would follow easily from the Lindelöf hypothesis (which would imply that
) but of course we would like to have an unconditional result.
At this point, having exhausted all the Dirichlet polynomial estimates that are usefully available, we return to “physical space”. Using some further Fourier-analytic and oscillatory integral computations, we can estimate the left-hand side of (3) by an expression that is roughly of the shape
The phase can be Taylor expanded as the sum of
and a lower order term
, plus negligible errors. If we could discard the lower order term then we would get quite a good bound using the exponential sum estimates of Robert and Sargos, which control averages of exponential sums with purely monomial phases, with the averaging allowing us to exploit the hypothesis that
is “typical”. Figuring out how to get rid of this lower order term caused some inefficiency in our arguments; the best we could do (after much experimentation) was to use Fourier analysis to shorten the sums, estimate a one-parameter average exponential sum with a binomial phase by a two-parameter average with a monomial phase, and then use the van der Corput
process followed by the estimates of Robert and Sargos. This rather complicated procedure works up to $\theta = 8/33$ (it may be possible that some alternate way to proceed here could improve the exponent somewhat).
In a sequel to this paper, we will use a somewhat different method to reduce to a much smaller value of
, but only if we replace the correlations
by either
or
, and also we now only save a
in the error term rather than
.
Kaisa Matomäki, Maksym Radziwiłł, and I have just uploaded to the arXiv our paper “Sign patterns of the Liouville and Möbius functions”. This paper is somewhat similar to our previous paper in that it is using the recent breakthrough of Matomäki and Radziwiłł on mean values of multiplicative functions to obtain partial results towards the Chowla conjecture. This conjecture can be phrased, roughly speaking, as follows: if $k$ is a fixed natural number and $n$ is selected at random from a large interval $[1,x]$, then the sign pattern $(\lambda(n), \lambda(n+1), \dots, \lambda(n+k-1)) \in \{-1,+1\}^k$ becomes asymptotically equidistributed in $\{-1,+1\}^k$ in the limit $x \to \infty$. This remains open for $k \geq 2$. In fact even the significantly weaker statement that each of the sign patterns in $\{-1,+1\}^k$ is attained infinitely often is open for $k \geq 4$. However, in 1986, Hildebrand showed that for $k \leq 3$ all sign patterns are indeed attained infinitely often. Our first result is a strengthening of Hildebrand’s, moving a little bit closer to Chowla’s conjecture:
Theorem 1 Let $k \leq 3$. Then each of the sign patterns in $\{-1,+1\}^k$ is attained by the Liouville function for a set of natural numbers $n$ of positive lower density.
Thus for instance one has $\lambda(n) = \lambda(n+1) = \lambda(n+2) = +1$ for a set of $n$ of positive lower density. The $k \leq 2$ case of this theorem already appears in the original paper of Matomäki and Radziwiłł (and the significantly simpler case of the sign patterns $++$ and $--$ was treated previously by Harman, Pintz, and Wolke).
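The statement about sign patterns is easy to probe empirically. The following sketch tabulates the Liouville function with a smallest-prime-factor sieve and counts how often each length-$k$ pattern $(\lambda(n), \dots, \lambda(n+k-1))$ occurs for $n \leq N$; the function names are illustrative, and the output is numerical evidence only.

```python
from collections import Counter

def liouville_table(n_max):
    """lam[n] = (-1)^Omega(n) for 1 <= n <= n_max (lam[0] is unused)."""
    spf = list(range(n_max + 1))  # smallest prime factor of each n
    for p in range(2, int(n_max ** 0.5) + 1):
        if spf[p] == p:
            for q in range(p * p, n_max + 1, p):
                if spf[q] == q:
                    spf[q] = p
    lam = [1] * (n_max + 1)
    for n in range(2, n_max + 1):
        lam[n] = -lam[n // spf[n]]  # complete multiplicativity, lam(p) = -1
    return lam

def sign_pattern_counts(N, k):
    """Count occurrences of each pattern (lam(n), ..., lam(n+k-1)) for 1 <= n <= N."""
    lam = liouville_table(N + k - 1)
    return Counter(tuple(lam[n + j] for j in range(k)) for n in range(1, N + 1))
```

Dividing the counts from `sign_pattern_counts(N, 3)` by $N$ gives empirical frequencies for the eight patterns of length three, which Chowla's conjecture predicts should each approach $1/8$.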
The basic strategy in all of these arguments is to assume for sake of contradiction that a certain sign pattern occurs extremely rarely, and then exploit the complete multiplicativity of $\lambda$ (which implies in particular that $\lambda(2n) = -\lambda(n)$, $\lambda(3n) = -\lambda(n)$, and $\lambda(5n) = -\lambda(n)$ for all $n$) together with some combinatorial arguments (vaguely analogous to solving a Sudoku puzzle!) to establish more complex sign patterns for the Liouville function, that are either inconsistent with each other, or with results such as the Matomäki-Radziwiłł result. To illustrate this, let us give some
examples, arguing a little informally to emphasise the combinatorial aspects of the argument. First suppose that the sign pattern
almost never occurs. The prime number theorem tells us that
and
are each equal to
about half of the time, which by inclusion-exclusion implies that the sign pattern
almost never occurs. In other words, we have
for almost all
. But from the multiplicativity property
this implies that one should have
and
for almost all . But the above three statements are contradictory, and the claim follows.
Similarly, if we assume that the sign pattern almost never occurs, then a similar argument to the above shows that for any fixed
, one has
for almost all
. But this means that the mean
is abnormally large for most
, which (for
large enough) contradicts the results of Matomäki and Radziwiłł. Here we see that the “enemy” to defeat is the scenario in which
only changes sign very rarely, in which case one rarely sees the pattern
.
It turns out that similar (but more combinatorially intricate) arguments work for sign patterns of length three (but are unlikely to work for most sign patterns of length four or greater). We give here one fragment of such an argument (due to Hildebrand) which hopefully conveys the Sudoku-type flavour of the combinatorics. Suppose for instance that the sign pattern almost never occurs. Now suppose
is a typical number with
. Since we almost never have the sign pattern
, we must (almost always) then have
. By multiplicativity this implies that
We claim that this (almost always) forces . For if
, then by the lack of the sign pattern
, this (almost always) forces
, which by multiplicativity forces
, which by lack of
(almost always) forces
, which by multiplicativity contradicts
. Thus we have
; a similar argument gives
almost always, which by multiplicativity gives
, a contradiction. Thus we almost never have
, which by the inclusion-exclusion argument mentioned previously shows that
for almost all
.
One can continue these Sudoku-type arguments and conclude eventually that for almost all
. To put it another way, if
denotes the non-principal Dirichlet character of modulus
, then
is almost always constant away from the multiples of
. (Conversely, if
changed sign very rarely outside of the multiples of three, then the sign pattern
would never occur.) Fortunately, the main result of Matomäki and Radziwiłł shows that this scenario cannot occur, which establishes that the sign pattern
must occur rather frequently. The other sign patterns are handled by variants of these arguments.
Excluding a sign pattern of length three leads to useful implications like “if , then
” which turn out to be just barely strong enough to quite rigidly constrain the Liouville function using Sudoku-like arguments. In contrast, excluding a sign pattern of length four only gives rise to implications like “if
, then
”, and these seem to be much weaker for this purpose (the hypothesis in these implications just isn’t satisfied nearly often enough). So a different idea seems to be needed if one wishes to extend the above theorem to larger values of
.
Our second theorem gives an analogous result for the Möbius function (which takes values in
rather than
), but the analysis turns out to be remarkably difficult and we are only able to get up to
:
Theorem 2 Let
. Then each of the sign patterns in
is attained by the Möbius function for a set
of positive lower density.
It turns out that the prime number theorem and elementary sieve theory can be used to handle the case and all the
cases that involve at least one
, leaving only the four sign patterns
to handle. It is here that the zeroes of the Möbius function cause a significant new obstacle. Suppose for instance that the sign pattern
almost never occurs for the Möbius function. The same arguments that were used in the Liouville case then show that
will be almost always equal to
, provided that
are both square-free. One can try to chain this together as before to create a long string
where the Möbius function is constant, but this cannot work for any
larger than three, because the Möbius function vanishes at every multiple of four.
The constraints we assume on the Möbius function can be depicted using a graph on the squarefree natural numbers, in which any two adjacent squarefree natural numbers are connected by an edge. The main difficulty is then that this graph is highly disconnected due to the multiples of four not being squarefree.
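The disconnectedness is easy to see numerically. The sketch below (with hypothetical helper names `squarefree_flags` and `adjacent_components`) lists the connected components of the graph that joins adjacent squarefree numbers up to a cutoff; since every fourth integer is a multiple of four, no component can contain more than three vertices.

```python
def squarefree_flags(limit):
    """Return flags with flags[n] True exactly when n >= 1 is squarefree."""
    flags = [True] * (limit + 1)
    flags[0] = False
    d = 2
    while d * d <= limit:
        # Strike out every multiple of the square d^2.
        for m in range(d * d, limit + 1, d * d):
            flags[m] = False
        d += 1
    return flags


def adjacent_components(limit):
    """Components of the graph on squarefree n <= limit joining n to n+1."""
    flags = squarefree_flags(limit)
    components, current = [], []
    for n in range(1, limit + 1):
        if flags[n]:
            current.append(n)
        elif current:
            # A non-squarefree number ends the current run.
            components.append(current)
            current = []
    if current:
        components.append(current)
    return components


if __name__ == "__main__":
    comps = adjacent_components(1000)
    print("number of components:", len(comps))
    print("largest component size:", max(len(c) for c in comps))
```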
To get around this, we need to enlarge the graph. Note from multiplicativity that if is almost always equal to
when
are squarefree, then
is almost always equal to
when
are squarefree and
is divisible by
. We can then form a graph on the squarefree natural numbers by connecting
to
whenever
are squarefree and
is divisible by
. If this graph is “locally connected” in some sense, then
will be constant on almost all of the squarefree numbers in a large interval, which turns out to be incompatible with the results of Matomäki and Radziwiłł. Because of this, matters are reduced to establishing the connectedness of a certain graph. More precisely, it turns out to be sufficient to establish the following claim:
Theorem 3 For each prime $p$, let $a_p \bmod p^2$ be a residue class chosen uniformly at random. Let $G$ be the random graph whose vertices $V$ consist of those integers $n$ not equal to $a_p \bmod p^2$ for any $p$, and whose edges consist of pairs $n, n+p$ in $V$ with $n = a_p \bmod p$. Then with probability $1$, the graph $G$ is connected.
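One can experiment with a finite truncation of this random graph. The sketch below makes several assumptions that are not part of the theorem: it reads the model as "delete one random residue class mod $p^2$ for each prime $p$, and join $n$ to $n+p$ when $n$ lies in the same class mod $p$", it only uses primes up to a bound `prime_bound`, and it only looks at vertices in a window $[1, \text{window}]$. The function names are illustrative. Because of the truncation, the output is a heuristic picture only and says nothing rigorous about connectedness of the infinite graph.

```python
import random
from collections import Counter


def primes_up_to(limit):
    """Sieve of Eratosthenes; assumes limit >= 2."""
    sieve = [True] * (limit + 1)
    sieve[0:2] = [False, False]
    for p in range(2, int(limit ** 0.5) + 1):
        if sieve[p]:
            for m in range(p * p, limit + 1, p):
                sieve[m] = False
    return [p for p, is_p in enumerate(sieve) if is_p]


def sample_truncated_graph(window, prime_bound, seed=0):
    """Sample the truncated graph; return residues, vertices, edge count, component sizes."""
    rng = random.Random(seed)
    primes = primes_up_to(prime_bound)
    # One uniformly random residue class a_p mod p^2 per prime (assumed model).
    residues = {p: rng.randrange(p * p) for p in primes}
    vertices = [
        n for n in range(1, window + 1)
        if all(n % (p * p) != residues[p] for p in primes)
    ]
    vertex_set = set(vertices)
    parent = {n: n for n in vertices}

    def find(n):
        # Union-find lookup with path halving.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    edges = 0
    for p in primes:
        for n in vertices:
            # Edge {n, n+p} when n = a_p mod p and both endpoints survive.
            if n % p == residues[p] % p and (n + p) in vertex_set:
                parent[find(n)] = find(n + p)
                edges += 1
    sizes = Counter(find(n) for n in vertices)
    return residues, vertices, edges, sizes


if __name__ == "__main__":
    _, vertices, edges, sizes = sample_truncated_graph(5000, 200, seed=1)
    print("vertices:", len(vertices), "edges:", edges)
    print("components:", len(sizes), "largest:", max(sizes.values()))
```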
We were able to show the connectedness of this graph, though it turned out to be remarkably tricky to do so. Roughly speaking (and suppressing a number of technicalities), the main steps in the argument were as follows.
- (Early stage) Pick a large number
(in our paper we take
to be odd, but I’ll ignore this technicality here). Using a moment method to explore neighbourhoods of a single point in
, one can show that a vertex
in
is almost always connected to at least
numbers in
, using relatively short paths of short diameter. (This is the most computationally intensive portion of the argument.)
- (Middle stage) Let
be a typical number in
, and let
be a scale somewhere between
and
. By using paths
involving three primes, and using a variant of Vinogradov’s theorem and some routine second moment computations, one can show that with quite high probability, any “good” vertex in
is connected to a “good” vertex in
by paths of length three, where the definition of “good” is somewhat technical but encompasses almost all of the vertices in
.
- (Late stage) Combining the two previous results together, we can show that most vertices
will be connected to a vertex in
for any
in
. In particular,
will be connected to a set of
vertices in
. By tracking everything carefully, one can control the length and diameter of the paths used to connect
to this set, and one can also control the parity of the elements in this set.
- (Final stage) Now suppose we have two vertices
at a distance
apart. By the previous item, one can connect
to a large set
of vertices in
, and one can similarly connect
to a large set
of vertices in
. Now, by using a Vinogradov-type theorem and second moment calculations again (and ensuring that the elements of
and
have opposite parity), one can connect many of the vertices in
to many of the vertices
by paths of length three, which then connects
to
, and gives the claim.
It seems of interest to understand random graphs like further. In particular, the graph
on the integers formed by connecting
to
for all
in a randomly selected residue class mod
for each prime
is particularly interesting (it is to the Liouville function as
is to the Möbius function); if one could show some “local expander” properties of this graph
, then one would have a chance of modifying the above methods to attack the first unsolved case of the Chowla conjecture, namely that
has asymptotic density zero (perhaps working with logarithmic density instead of natural density to avoid some technicalities).
Kaisa Matomaki, Maksym Radziwill, and I have just uploaded to the arXiv our paper “An averaged form of Chowla’s conjecture”. This paper concerns a weaker variant of the famous conjecture of Chowla (discussed for instance in this previous post) that
$\sum_{n \leq X} \lambda(n+h_1) \dots \lambda(n+h_k) = o(X)$
as $X \to \infty$ for any distinct natural numbers $h_1,\dots,h_k$, where $\lambda$ denotes the Liouville function. (One could also replace the Liouville function here by the Möbius function $\mu$ and obtain a morally equivalent conjecture.) This conjecture remains open for any $k \geq 2$; for instance the assertion
$\sum_{n \leq X} \lambda(n) \lambda(n+2) = o(X)$
is a variant of the twin prime conjecture (though possibly a tiny bit easier to prove), and is subject to the notorious parity barrier (as discussed in this previous post).
Our main result asserts, roughly speaking, that Chowla’s conjecture can be established unconditionally provided one has non-trivial averaging in the parameters. More precisely, one has
Theorem 1 (Chowla on the average) Suppose
is a quantity that goes to infinity as
(but it can go to infinity arbitrarily slowly). Then for any fixed
, we have
In fact, we can remove one of the averaging parameters and obtain
Actually we can make the decay rate a bit more quantitative, gaining about over the trivial bound. The key case is
; while the unaveraged Chowla conjecture becomes more difficult as
increases, the averaged Chowla conjecture does not increase in difficulty due to the increasing amount of averaging for larger
, and we end up deducing the higher
case of the conjecture from the
case by an elementary argument.
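For the key pair-correlation case, the averaged quantity is easy to compute numerically. The sketch below assumes the averaged statement takes the form $\sum_{h \leq H} |\sum_{n \leq X} \lambda(n) \lambda(n+h)| = o(HX)$ (my reading of the result with one averaging parameter removed), and evaluates the left-hand side divided by $HX$. The names `liouville_sieve` and `averaged_pair_correlation` and the parameter values are illustrative.

```python
def liouville_sieve(limit):
    """Return a list lam with lam[n] = (-1)^Omega(n) for 1 <= n <= limit."""
    spf = list(range(limit + 1))  # spf[n] = smallest prime factor of n
    for p in range(2, int(limit ** 0.5) + 1):
        if spf[p] == p:
            for m in range(p * p, limit + 1, p):
                if spf[m] == m:
                    spf[m] = p
    lam = [0] * (limit + 1)
    lam[1] = 1
    for n in range(2, limit + 1):
        lam[n] = -lam[n // spf[n]]
    return lam


def averaged_pair_correlation(x, h_max):
    """Return (1 / (H X)) * sum_{h <= H} |sum_{n <= X} lam(n) lam(n+h)|."""
    lam = liouville_sieve(x + h_max)
    total = 0
    for h in range(1, h_max + 1):
        total += abs(sum(lam[n] * lam[n + h] for n in range(1, x + 1)))
    return total / (h_max * x)


if __name__ == "__main__":
    # The trivial bound is 1; values well below 1 indicate cancellation.
    for x in (1_000, 10_000, 100_000):
        print(x, averaged_pair_correlation(x, 10))
```

Note that the theorem concerns the regime where $H$ grows with $X$, however slowly; fixing `h_max = 10` here is purely for convenience.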
The proof of the theorem proceeds as follows. By exploiting the Fourier-analytic identity
(related to a standard Fourier-analytic identity for the Gowers norm) it turns out that the
case of the above theorem can basically be derived from an estimate of the form
uniformly for all . For “major arc”
, close to a rational
for small
, we can establish this bound from a generalisation of a recent result of Matomaki and Radziwill (discussed in this previous post) on averages of multiplicative functions in short intervals. For “minor arc”
, we can proceed instead from an argument of Katai and Bourgain-Sarnak-Ziegler (discussed in this previous post).
The argument also extends to other bounded multiplicative functions than the Liouville function. Chowla’s conjecture was generalised by Elliott, who roughly speaking conjectured that the copies of
in Chowla’s conjecture could be replaced by arbitrary bounded multiplicative functions
as long as these functions were far from a twisted Dirichlet character
in the sense that
(This type of distance is incidentally now a fundamental notion in the Granville-Soundararajan “pretentious” approach to multiplicative number theory.) During our work on this project, we found that Elliott’s conjecture is not quite true as stated due to a technicality: one can cook up a bounded multiplicative function which behaves like
on scales
for some
going to infinity and some slowly varying
, and such a function will be far from any fixed Dirichlet character whilst still having many large correlations (e.g. the pair correlations
will be large). In our paper we propose a technical “fix” to Elliott’s conjecture (replacing (1) by a truncated variant), and show that this repaired version of Elliott’s conjecture is true on the average in much the same way that Chowla’s conjecture is. (If one restricts attention to real-valued multiplicative functions, then this technical issue does not show up, basically because one can assume without loss of generality that
in this case; we discuss this fact in an appendix to the paper.)
In analytic number theory, it is a well-known phenomenon that for many arithmetic functions $f$ of interest in number theory, it is significantly easier to estimate logarithmic sums such as
$\sum_{n \leq x} \frac{f(n)}{n}$
than it is to estimate summatory functions such as
$\sum_{n \leq x} f(n).$
(Here we are normalising $f$ to be roughly constant in size, e.g.
as
.) For instance, when $f$ is the von Mangoldt function $\Lambda$, the logarithmic sums $\sum_{n \leq x} \frac{\Lambda(n)}{n}$ can be adequately estimated by Mertens’ theorem, which can be easily proven by elementary means (see Notes 1); but a satisfactory estimate on the summatory function $\sum_{n \leq x} \Lambda(n)$ requires the prime number theorem, which is substantially harder to prove (see Notes 2). (From a complex-analytic or Fourier-analytic viewpoint, the problem is that the logarithmic sums $\sum_{n \leq x} \frac{f(n)}{n}$ can usually be controlled just from knowledge of the Dirichlet series $\sum_n \frac{f(n)}{n^s}$ for $s$ near $1$; but the summatory functions require control of the Dirichlet series $\sum_n \frac{f(n)}{n^s}$ for $s$ on or near a large portion of the line $\{ 1+it: t \in {\bf R} \}$. See Notes 2 for further discussion.)
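The two kinds of sum can be compared numerically for the von Mangoldt function. The sketch below computes the logarithmic sum $\sum_{n \leq x} \Lambda(n)/n$ and the summatory function $\sum_{n \leq x} \Lambda(n)$, and reports their deviations from $\log x$ and $x$ respectively. Mertens' theorem gives $\sum_{n \leq x} \Lambda(n)/n = \log x + O(1)$, while the prime number theorem is equivalent to $\sum_{n \leq x} \Lambda(n) = x + o(x)$. The helper names are illustrative, and a numerical table does not of course reflect the relative difficulty of the proofs.

```python
import math


def von_mangoldt_table(limit):
    """Return vm with vm[n] = log p if n is a power of the prime p, else 0."""
    vm = [0.0] * (limit + 1)
    is_prime = [True] * (limit + 1)
    for p in range(2, limit + 1):
        if is_prime[p]:
            for m in range(2 * p, limit + 1, p):
                is_prime[m] = False
            q = p
            while q <= limit:  # mark p, p^2, p^3, ...
                vm[q] = math.log(p)
                q *= p
    return vm


def compare_sums(x):
    """Return (log-sum minus log x, summatory function over x minus 1)."""
    vm = von_mangoldt_table(x)
    log_sum = sum(vm[n] / n for n in range(1, x + 1))
    plain_sum = sum(vm[1:])
    return log_sum - math.log(x), plain_sum / x - 1.0


if __name__ == "__main__":
    for x in (10**3, 10**4, 10**5):
        log_err, rel_err = compare_sums(x)
        print(x, "log-sum error:", log_err, "relative PNT error:", rel_err)
```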
Viewed conversely, whenever one has a difficult estimate on a summatory function such as , one can look to see if there is a “cheaper” version of that estimate that only controls the logarithmic sums
, which is easier to prove than the original, more “expensive” estimate. In this post, we shall do this for two theorems, a classical theorem of Halasz on mean values of multiplicative functions on long intervals, and a much more recent result of Matomaki and Radziwiłł on mean values of multiplicative functions in short intervals. The two are related; the former theorem is an ingredient in the latter (though in the special case of the Matomaki-Radziwiłł theorem considered here, we will not need Halasz’s theorem directly, instead using a key tool in the proof of that theorem).
We begin with Halasz’s theorem. Here is a version of this theorem, due to Montgomery and to Tenenbaum:
Theorem 1 (Halasz-Montgomery-Tenenbaum) Let
be a multiplicative function with
for all
. Let
and
, and set
Then one has
Informally, this theorem asserts that is small compared with
, unless
“pretends” to be like the character
on primes for some small
. (This is the starting point of the “pretentious” approach of Granville and Soundararajan to analytic number theory, as developed for instance here.) We now give a “cheap” version of this theorem which is significantly weaker (because it settles for controlling logarithmic sums rather than summatory functions, it requires
to be completely multiplicative instead of multiplicative, it requires a strong bound on the analogue of the quantity
, and it only gives qualitative decay rather than quantitative estimates), but easier to prove:
Theorem 2 (Cheap Halasz) Let
be an asymptotic parameter going to infinity. Let
be a completely multiplicative function (possibly depending on
) such that
for all
, such that
Note that now that we are content with estimating logarithmic sums, we no longer need to preclude the possibility that $f$ pretends to be like
; see Exercise 11 of Notes 1 for a related observation.
To prove this theorem, we first need a special case of the Turan-Kubilius inequality.
Lemma 3 (Turan-Kubilius) Let
be a parameter going to infinity, and let
be a quantity depending on
such that
and
as
. Then
Informally, this lemma is asserting that
for most large numbers . Another way of writing this heuristically is in terms of Dirichlet convolutions:
This type of estimate was previously discussed as a tool to establish a criterion of Katai and Bourgain-Sarnak-Ziegler for Möbius orthogonality estimates in this previous blog post. See also Section 5 of Notes 1 for some similar computations.
Proof: By Cauchy-Schwarz it suffices to show that
Expanding out the square, it suffices to show that
for .
We just show the case, as the
cases are similar (and easier). We rearrange the left-hand side as
We can estimate the inner sum as . But a routine application of Mertens’ theorem (handling the diagonal case when
separately) shows that
and the claim follows.
Remark 4 As an alternative to the Turan-Kubilius inequality, one can use the Ramaré identity
(see e.g. Section 17.3 of Friedlander-Iwaniec). This identity turns out to give superior quantitative results than the Turan-Kubilius inequality in applications; see the paper of Matomaki and Radziwiłł for an instance of this.
We now prove Theorem 2. Let denote the left-hand side of (2); by the triangle inequality we have
. By Lemma 3 (for some
to be chosen later) and the triangle inequality we have
We rearrange the left-hand side as
We now replace the constraint by
. The error incurred in doing so is
which by Mertens’ theorem is . Thus we have
But by definition of , we have
, thus
From Mertens’ theorem, the expression in brackets can be rewritten as
and so the real part of this expression is
By (1), Mertens’ theorem and the hypothesis on we have
for any . This implies that we can find
going to infinity such that
and thus the expression in brackets has real part . The claim follows.
The Turan-Kubilius argument is certainly not the most efficient way to estimate sums such as . In the exercise below we give a significantly more accurate estimate that works when
is non-negative.
Exercise 5 (Granville-Koukoulopoulos-Matomaki)
- (i) If
is a completely multiplicative function with
for all primes
, show that
as
. (Hint: for the upper bound, expand out the Euler product. For the lower bound, show that
, where
is the completely multiplicative function with
for all primes
.)
- (ii) If
is multiplicative and takes values in
, show that
for all
.
Now we turn to a very recent result of Matomaki and Radziwiłł on mean values of multiplicative functions in short intervals. For sake of illustration we specialise their results to the simpler case of the Liouville function , although their arguments actually work (with some additional effort) for arbitrary multiplicative functions of magnitude at most
that are real-valued (or more generally, stay far from complex characters
). Furthermore, we give a qualitative form of their estimates rather than a quantitative one:
Theorem 6 (Matomaki-Radziwiłł, special case) Let
be a parameter going to infinity, and let
be a quantity going to infinity as
. Then for all but
of the integers
, one has
A simple sieving argument (see Exercise 18 of Supplement 4) shows that one can replace by the Möbius function
and obtain the same conclusion. See this recent note of Matomaki and Radziwiłł for a simple proof of their (quantitative) main theorem in this special case.
Of course, (4) improves upon the trivial bound of . Prior to this paper, such estimates were only known (using arguments similar to those in Section 3 of Notes 6) for
unconditionally, or for
for some sufficiently large
if one assumed the Riemann hypothesis. This theorem also represents some progress towards Chowla’s conjecture (discussed in Supplement 4) that
$\sum_{n \leq x} \lambda(n+h_1) \dots \lambda(n+h_k) = o(x)$
as $x \to \infty$ for any fixed distinct $h_1,\dots,h_k$; indeed, it implies that this conjecture holds if one performs a small amount of averaging in the $h_1,\dots,h_k$.
Below the fold, we give a “cheap” version of the Matomaki-Radziwiłł argument. More precisely, we establish
Theorem 7 (Cheap Matomaki-Radziwiłł) Let
be a parameter going to infinity, and let
. Then
Note that (5) improves upon the trivial bound of . Again, one can replace
with
if desired. Due to the cheapness of Theorem 7, the proof will require few ingredients; the deepest input is the improved zero-free region for the Riemann zeta function due to Vinogradov and Korobov. Other than that, the main tools are the Turan-Kubilius result established above, and some Fourier (or complex) analysis.