Kaisa Matomäki, Xuancheng Shao, Joni Teräväinen, and myself have just uploaded to the arXiv our preprint “Higher uniformity of arithmetic functions in short intervals I. All intervals”. This paper investigates the higher order (Gowers) uniformity of standard arithmetic functions in analytic number theory (and specifically, the Möbius function , the von Mangoldt function , and the generalised divisor functions ) in short intervals , where is large and lies in the range for a fixed constant (that one would like to be as small as possible). If we let denote one of the functions , then there is extensive literature on the estimation of short sums
and some literature also on the estimation of exponential sums such as for a real frequency , where . For applications in the additive combinatorics of such functions , it is also necessary to consider more general correlations, such as polynomial correlations where is a polynomial of some fixed degree, or more generally where is a nilmanifold of fixed degree and dimension (and with some control on structure constants), is a polynomial map, and is a Lipschitz function (with some bound on the Lipschitz constant). Indeed, thanks to the inverse theorem for the Gowers uniformity norm, such correlations let one control the Gowers uniformity norm of (possibly after subtracting off some renormalising factor) on such short intervals , which can in turn be used to control other multilinear correlations involving such functions.
Traditionally, asymptotics for such sums are expressed in terms of a “main term” of some arithmetic nature, plus an error term that is estimated in magnitude. For instance, a sum such as would be approximated in terms of a main term that vanishes (or is negligible) if is “minor arc”, but is expressible in terms of something like a Ramanujan sum if is “major arc”, together with an error term. We found it convenient to cancel off such main terms by subtracting an approximant from each of the arithmetic functions and then getting upper bounds on remainder correlations such as
(actually for technical reasons we also allow the variable to be restricted further to a subprogression of , but let us ignore this minor extension for this discussion). There is some flexibility in how to choose these approximants, but we eventually found it convenient to use the following choices.
- For the Möbius function , we simply set , as per the Möbius pseudorandomness conjecture. (One could choose a more sophisticated approximant in the presence of a Siegel zero, as I did with Joni in this recent paper, but we do not do so here.)
- For the von Mangoldt function , we eventually went with the Cramér-Granville approximant , where and .
- For the divisor functions , we used a somewhat complicated-looking approximant for some explicit polynomials , chosen so that and have almost exactly the same sums along arithmetic progressions (see the paper for details).
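To make the idea of an approximant concrete, here is a minimal sketch of a sieve-weight model of Cramér-Granville type for the von Mangoldt function. It assumes the standard shape (P/φ(P)) · 1_{gcd(n,P)=1} with P the product of the primes below a cutoff; the cutoff `r` below is a small hypothetical value chosen by hand, and the function names are illustrative rather than taken from the paper.

```python
from fractions import Fraction
from math import gcd, log

def primes_below(r):
    """Primes p < r by trial division (fine for tiny r)."""
    return [p for p in range(2, r)
            if all(p % q for q in range(2, int(p ** 0.5) + 1))]

def cramer_model(r):
    """Return (P, weight): P is the product of the primes below r and
    weight = P / phi(P). The model is weight * 1_{gcd(n, P) = 1}."""
    P, phi = 1, 1
    for p in primes_below(r):
        P *= p
        phi *= p - 1
    return P, Fraction(P, phi)

def model_sum(x, h, r):
    """Exact sum of the sieve-weight model over the interval (x, x + h]."""
    P, w = cramer_model(r)
    return w * sum(1 for n in range(x + 1, x + h + 1) if gcd(n, P) == 1)

def von_mangoldt_sum(x, h):
    """Sum of Lambda(n) over (x, x + h], by brute-force factorisation."""
    total = 0.0
    for n in range(max(x + 1, 2), x + h + 1):
        m, p = n, 2
        while p * p <= m and m % p:
            p += 1
        if p * p > m:
            p = m              # no small factor found: n itself is prime
        while m % p == 0:
            m //= p
        if m == 1:             # n is a power of the single prime p
            total += log(p)
    return total
```

By construction the model has mean exactly one over any full period of length P, which is the sense in which it mimics the average size of the von Mangoldt function; comparing `model_sum` with `von_mangoldt_sum` on a few intervals gives a feel for how rough the approximation is at small scales.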
The objective is then to obtain bounds on sums such as (1) that improve upon the “trivial bound” that one can get with the triangle inequality and standard number theory bounds such as the Brun-Titchmarsh inequality. For and , the Siegel-Walfisz theorem suggests that it is reasonable to expect error terms that have “strongly logarithmic savings” in the sense that they gain a factor of over the trivial bound for any ; for , the Dirichlet hyperbola method suggests instead that one has “power savings” in that one should gain a factor of over the trivial bound for some . In the case of the Möbius function , there is an additional trick (introduced by Matomäki and Teräväinen) that allows one to lower the exponent somewhat at the cost of only obtaining “weakly logarithmic savings” of shape for some small .
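The Dirichlet hyperbola method mentioned above can be illustrated in its most classical form, for the ordinary divisor function (the case k = 2). The following is only a sketch of the long-sum identity, with hypothetical function names; it does not attempt the short-interval power savings discussed in the text.

```python
from math import isqrt

def divisor_sum_hyperbola(x):
    """sum_{n <= x} d_2(n) via Dirichlet's hyperbola identity:
    2 * sum_{a <= sqrt(x)} floor(x / a) - floor(sqrt(x))**2."""
    s = isqrt(x)
    return 2 * sum(x // a for a in range(1, s + 1)) - s * s

def divisor_sum_naive(x):
    """Same sum, counting lattice points (a, b) with a * b <= x directly."""
    return sum(x // a for a in range(1, x + 1))

def short_sum(x, h):
    """Sum of d_2(n) over the short interval (x, x + h]."""
    return divisor_sum_hyperbola(x + h) - divisor_sum_hyperbola(x)
```

The hyperbola form needs only about sqrt(x) terms instead of x, which is the elementary source of the power saving in the long-sum problem.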
Our main estimates on sums of the form (1) work in the following ranges:
- For , one can obtain strongly logarithmic savings on (1) for , and power savings for .
- For , one can obtain weakly logarithmic savings for .
- For , one can obtain power savings for .
- For , one can obtain power savings for .
Conjecturally, one should be able to obtain power savings in all cases, and lower down to zero, but the ranges of exponents and savings given here seem to be the limit of current methods unless one assumes additional hypotheses, such as GRH. The result for correlation against Fourier phases was established previously by Zhan, and the result for such phases and was established previously by Matomäki and Teräväinen.
By combining these results with tools from additive combinatorics, one can obtain a number of applications:
- Direct insertion of our bounds in the recent work of Kanigowski, Lemanczyk, and Radziwill on the prime number theorem on dynamical systems that are analytic skew products gives some improvements in the exponents there.
- We can obtain a “short interval” version of a multiple ergodic theorem along primes established by Frantzikinakis-Host-Kra and Wooley-Ziegler, in which we average over intervals of the form rather than .
- We can obtain a “short interval” version of the “linear equations in primes” asymptotics obtained by Ben Green, Tamar Ziegler, and myself in this sequence of papers, where the variables in these equations lie in short intervals rather than long intervals such as .
We now briefly discuss some of the ingredients of the proof of our main results. The first step is standard, using combinatorial decompositions (based on the Heath-Brown identity and (for the result) the Ramaré identity) to decompose into more tractable sums of the following types:
- Type I sums, which are basically of the form for some weights of controlled size and some cutoff that is not too large;
- Type II sums, which are basically of the form for some weights , of controlled size and some cutoffs that are not too close to or to ;
- Type I₂ sums, which are basically of the form for some weights of controlled size and some cutoff that is not too large.
The precise ranges of the cutoffs depend on the choice of ; our methods fail once these cutoffs pass a certain threshold, and this is the reason for the exponents being what they are in our main results.
The Type I sums involving nilsequences can be treated by methods similar to those in this previous paper of Ben Green and myself; the main innovations are in the treatment of the Type II and Type I₂ sums.
For the Type II sums, one can split into the “abelian” case in which (after some Fourier decomposition) the nilsequence is basically of the form , and the “non-abelian” case in which is non-abelian and exhibits non-trivial oscillation in a central direction. In the abelian case we can adapt arguments of Matomäki and Shao, which use Cauchy-Schwarz and the equidistribution properties of polynomials to obtain good bounds unless is “major arc” in the sense that it resembles (or “pretends to be”) for some Dirichlet character and some frequency , but in this case one can use classical multiplicative methods to control the correlation. It turns out that the non-abelian case can be treated similarly. After applying Cauchy-Schwarz, one ends up analyzing the equidistribution of the four-variable polynomial sequence
as range in various dyadic intervals. Using the known multidimensional equidistribution theory of polynomial maps in nilmanifolds, one can eventually show in the non-abelian case that this sequence either has enough equidistribution to give cancellation, or else the nilsequence involved can be replaced with one from a lower dimensional nilmanifold, in which case one can apply an induction hypothesis.
For the Type I₂ sum, a model sum to study is
which one can expand as
We experimented with a number of ways to treat this type of sum (including automorphic form methods, or methods based on the Voronoi formula or van der Corput’s inequality), but somewhat to our surprise, the most efficient approach was an elementary one, in which one uses the Dirichlet approximation theorem to decompose the hyperbolic region into a number of arithmetic progressions, and then uses equidistribution theory to establish cancellation of sequences such as on the majority of these progressions. As it turns out, this strategy works well in the regime unless the nilsequence involved is “major arc”, but the latter case is treatable by existing methods as discussed previously; this is why the exponent for our result can be as low as .
In a sequel to this paper (currently in preparation), we will obtain analogous results for almost all intervals with in the range , in which we will be able to lower all the way to .
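The Dirichlet approximation theorem used in the treatment of the model sum above says that for any real α and any integer Q ≥ 1 there is a fraction a/q with 1 ≤ q ≤ Q and |qα − a| < 1/Q. A minimal brute-force sketch (with a hypothetical function name, and exact rational arithmetic to sidestep floating-point issues) is as follows; it illustrates only the approximation step, not the decomposition of the hyperbolic region into progressions.

```python
from fractions import Fraction

def dirichlet_approx(alpha, Q):
    """Return (a, q) with 1 <= q <= Q and |q * alpha - a| < 1 / Q.

    alpha is converted to an exact Fraction, so the guarantee of
    Dirichlet's theorem applies without rounding error."""
    alpha = Fraction(alpha)
    for q in range(1, Q + 1):
        a = round(q * alpha)            # nearest integer to q * alpha
        if abs(q * alpha - a) * Q < 1:
            return a, q
    raise AssertionError("unreachable: would contradict Dirichlet's theorem")
```

For example, with α = 3.14159 and Q = 10 the search stops at the familiar approximation 22/7.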
Kaisa Matomäki, Maksym Radziwill, Xuancheng Shao, Joni Teräväinen, and myself have just uploaded to the arXiv our preprint “Singmaster’s conjecture in the interior of Pascal’s triangle”. This paper leverages the theory of exponential sums over primes to make progress on a well-known conjecture of Singmaster which asserts that any natural number larger than appears at most a bounded number of times in Pascal’s triangle. That is to say, for any integer , there are at most solutions to the equation
with . Currently, the largest number of solutions that is known to be attainable is eight, with equal to
Because of the symmetry of Pascal’s triangle it is natural to restrict attention to the left half of the triangle.
Our main result settles this conjecture in the “interior” region of the triangle:
Theorem 1 (Singmaster’s conjecture in the interior of the triangle) If and is sufficiently large depending on , there are at most two solutions to (1) in the region and hence at most four in the region Also, there is at most one solution in the region
To verify Singmaster’s conjecture in full, it thus suffices in view of this result to verify the conjecture in the boundary region
(or equivalently ); we have deleted the case as it of course automatically supplies exactly one solution to (1). It is in fact possible that for sufficiently large there are no further collisions for in the region (3), in which case there would never be more than eight solutions to (1) for sufficiently large . This latter claim is known for bounded values of by Beukers, Shorey, and Tijdeman, with the main tool used being Siegel’s theorem on integral points.
The upper bound of two here for the number of solutions in the region (2) is best possible, due to the infinite family of solutions to the equation
coming from , and is the Fibonacci number.
The appearance of the quantity in Theorem 1 may be familiar to readers who are acquainted with Vinogradov’s bounds on exponential sums, which end up being the main new ingredient in our arguments. In principle this threshold could be lowered if we had stronger bounds on exponential sums.
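Both the eight-fold example and the Fibonacci family are easy to check by machine. The sketch below counts occurrences of a number in Pascal’s triangle by brute force and tests a collision family; it assumes the classical parametrisation n = F_{2j+2}F_{2j+3} − 1, m = F_{2j}F_{2j+3} − 1 for the identity binom(n+1, m+1) = binom(n, m+2), which may be indexed differently from the formula in the paper, and it uses 3003 as the well-known number appearing eight times. All function names are illustrative.

```python
from math import comb

def occurrences(t):
    """All (n, m) with binom(n, m) == t, for an integer t >= 2."""
    hits = []
    for n in range(1, t + 1):
        c = 1                                # c == binom(n, m) throughout
        for m in range(0, n // 2 + 1):
            if c == t:
                hits.append((n, m))
                if m != n - m:               # mirror image in the right half
                    hits.append((n, n - m))
            if c > t:                        # left half of a row is increasing
                break
            c = c * (n - m) // (m + 1)       # step to binom(n, m + 1)
    return sorted(hits)

def fib(k):
    """Fibonacci numbers with fib(1) == fib(2) == 1."""
    a, b = 0, 1
    for _ in range(k):
        a, b = b, a + b
    return a

def fibonacci_collision(j):
    """Return (n, m) from the assumed classical family, j >= 1."""
    n = fib(2 * j + 2) * fib(2 * j + 3) - 1
    m = fib(2 * j) * fib(2 * j + 3) - 1
    return n, m
```

For j = 1 the family gives (n, m) = (14, 4), that is, binom(15, 5) = binom(14, 6) = 3003, which accounts for four of the eight appearances of 3003 once symmetry is taken into account.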
To try to control solutions to (1) we use a combination of “Archimedean” and “non-Archimedean” approaches. In the “Archimedean” approach (following earlier work of Kane on this problem) we view primarily as real numbers rather than integers, and express (1) in terms of the Gamma function as
One can use this equation to solve for in terms of as for a certain real analytic function whose asymptotics are easily computable (for instance one has the asymptotic ). One can then view the problem as one of trying to control the number of lattice points on the graph . Here we can take advantage of the fact that in the regime (which corresponds to working in the left half of Pascal’s triangle), the function can be shown to be convex, but not too convex, in the sense that one has both upper and lower bounds on the second derivative of (in fact one can show that ). This can be used to preclude the possibility of having a cluster of three or more nearby lattice points on the graph , basically because the area subtended by the triangle connecting three of these points would lie between and , contradicting Pick’s theorem. Developing these ideas, we were able to show
Proposition 2 Let , and suppose is sufficiently large depending on . If is a solution to (1) in the left half of Pascal’s triangle, then there is at most one other solution to this equation in the left half with
Again, the example of (4) shows that a cluster of two solutions is certainly possible; the convexity argument only kicks in once one has a cluster of three or more solutions.
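The lattice-point obstruction behind the convexity argument can be seen in a toy computation. A non-degenerate triangle with integer vertices has area at least 1/2 (its doubled area is a positive integer), so three nearby points on a gently curving graph cannot all be lattice points. The sketch below uses the shoelace formula with exact arithmetic; the parabola y = x²/1000 is a purely hypothetical stand-in for the real analytic function discussed above.

```python
from fractions import Fraction

def twice_area(p, q, r):
    """Twice the unsigned area of the triangle pqr (shoelace formula)."""
    return abs((q[0] - p[0]) * (r[1] - p[1]) - (r[0] - p[0]) * (q[1] - p[1]))

def passes_area_test(p, q, r):
    """Necessary condition for p, q, r to all be lattice points:
    the triangle is degenerate or has area at least 1/2."""
    t = twice_area(p, q, r)
    return t == 0 or t >= 1

def toy_curve(x):
    """Hypothetical slowly curving convex function, for illustration only."""
    return Fraction(x * x, 1000)

# Three consecutive abscissae on the toy curve span a tiny triangle.
pts = [(x, toy_curve(x)) for x in (0, 1, 2)]
```

Here the triangle has area 1/1000, strictly between 0 and 1/2, so the three points cannot all have integer coordinates; the argument in the text runs the same comparison with the actual second-derivative bounds.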
To finish the proof of Theorem 1, one has to show that any two solutions to (1) in the region of interest must be close enough for the above proposition to apply. Here we switch to the “non-Archimedean” approach, in which we look at the -adic valuations of the binomial coefficients, defined as the number of times a prime divides . From the fundamental theorem of arithmetic, a collision
between binomial coefficients occurs if and only if one has agreement of valuations
From the Legendre formula we can rewrite this latter identity (5) as where denotes the fractional part of . (These sums are not truly infinite, because the summands vanish once is larger than .)
A key idea in our approach is to view this condition (6) statistically, for instance by viewing as a prime drawn randomly from an interval such as for some suitably chosen scale parameter , so that the two sides of (6) now become random variables. It then becomes advantageous to compare correlations between these two random variables and some additional test random variable. For instance, if and are far apart from each other, then one would expect the left-hand side of (6) to have a higher correlation with the fractional part , since this term shows up in the summation on the left-hand side but not the right. Similarly if and are far apart from each other (although there are some annoying cases one has to treat separately when there is some “unexpected commensurability”, for instance if is a rational multiple of where the rational has bounded numerator and denominator). In order to execute this strategy, it turns out (after some standard Fourier expansion) that one needs to get good control on exponential sums such as
for various choices of parameters , where . Fortunately, the methods of Vinogradov (which more generally can handle sums such as and for various analytic functions ) can give useful bounds on such sums as long as and are not too large compared to ; more specifically, Vinogradov’s estimates are non-trivial in the regime , and this ultimately leads to a distance bound between any colliding pair in the left half of Pascal’s triangle, as well as the variant bound under the additional assumption
Comparing these bounds with Proposition 2 and using some basic estimates about the function , we can conclude Theorem 1.
A modification of the arguments also gives similar results for the equation
where is the falling factorial:
Theorem 3 If and is sufficiently large depending on , there are at most two solutions to (7) in the region
Again the upper bound of two is best possible, thanks to identities such as
Kaisa Matomäki, Maksym Radziwill, Joni Teräväinen, Tamar Ziegler and I have uploaded to the arXiv our paper Higher uniformity of bounded multiplicative functions in short intervals on average. This paper (which originated from a working group at an AIM workshop on Sarnak’s conjecture) focuses on the local Fourier uniformity conjecture for bounded multiplicative functions such as the Liouville function . One form of this conjecture is the assertion that
as for any fixed and any that goes to infinity as , where is the (normalized) Gowers uniformity norm. Among other things this conjecture implies (a logarithmically averaged version of) the Chowla and Sarnak conjectures for the Liouville function (or the Möbius function); see this previous blog post.
The conjecture gets more difficult as increases, and also becomes more difficult the more slowly grows with . The conjecture is equivalent to the assertion
which was proven (for arbitrarily slowly growing ) in a landmark paper of Matomäki and Radziwill, discussed for instance in this blog post.
For , the conjecture is equivalent to the assertion
This remains open for sufficiently slowly growing (and it would be a major breakthrough in particular if one could obtain this bound for as small as for any fixed , particularly if applicable to more general bounded multiplicative functions than , as this would have new implications for a generalization of the Chowla conjecture known as the Elliott conjecture). Recently, Kaisa, Maks and myself were able to establish this conjecture in the range (in fact we have since worked out in the current paper that we can get as small as ). In our current paper we establish the Fourier uniformity conjecture for higher for the same range of . This in particular implies local orthogonality to polynomial phases, where denotes the polynomials of degree at most , but the full conjecture is a bit stronger than this, establishing the more general statement for any degree filtered nilmanifold and Lipschitz function , where now ranges over polynomial maps from to . The method of proof follows the same general strategy as in the previous paper with Kaisa and Maks. (The equivalence of (4) and (1) follows from the inverse conjecture for the Gowers norms, proven in this paper.)
We quickly sketch first the proof of (3), using very informal language to avoid many technicalities regarding the precise quantitative form of various estimates. If the estimate (3) fails, then we have the correlation estimate for many and some polynomial depending on . The difficulty here is to understand how can depend on . We write the above correlation estimate more suggestively as Because of the multiplicativity at small primes , one expects to have a relation of the form for many for which for some small primes . (This can be formalised using an inequality of Elliott related to the Turán-Kubilius theorem.) This gives a relationship between and for “edges” in a rather sparse “graph” connecting the elements of say .
Using some graph theory one can locate some non-trivial “cycles” in this graph that eventually lead (in conjunction with a certain technical but important “Chinese remainder theorem” step to modify the to eliminate a rather serious “aliasing” issue that was already discussed in this previous post) to obtain functional equations of the form for some large and close (but not identical) integers , where should be viewed as a first approximation (ignoring a certain “profinite” or “major arc” term for simplicity) as “differing by a slowly varying polynomial” and the polynomials should now be viewed as taking values on the reals rather than the integers. This functional equation can be solved to obtain a relation of the form for some real number of polynomial size, and with further analysis of the relation (5) one can make basically independent of . This simplifies (3) to something like and this is now of a form that can be treated by the theorem of Matomäki and Radziwill (because is a bounded multiplicative function). (Actually because of the profinite term mentioned previously, one also has to insert a Dirichlet character of bounded conductor into this latter conclusion, but we will ignore this technicality.)
Now we apply the same strategy to (4). For abelian the claim follows easily from (3), so we focus on the non-abelian case. One now has a polynomial sequence attached to many , and after a somewhat complicated adaptation of the above arguments one again ends up with an approximate functional equation
where the relation is rather technical and will not be detailed here. A new difficulty arises in that there are some unwanted solutions to this equation, such as for some , which do not necessarily lead to multiplicative characters like as in the polynomial case, but instead to some unfriendly looking “generalized multiplicative characters” (think of as a rough caricature). To avoid this problem, we rework the graph theory portion of the argument to produce not just one functional equation of the form (6) for each , but many, leading to dilation invariances for a “dense” set of . From a certain amount of Lie algebra theory (ultimately arising from an understanding of the behaviour of the exponential map on nilpotent matrices, and exploiting the hypothesis that is non-abelian) one can conclude that (after some initial preparations to avoid degenerate cases) must behave like for some central element of . This eventually brings one back to the multiplicative characters that arose in the polynomial case, and the arguments now proceed as before.
We give two applications of this higher order Fourier uniformity. One regards the growth of the number
of length sign patterns in the Liouville function. The Chowla conjecture implies that , but even the weaker conjecture of Sarnak that for some remains open. Until recently, the best asymptotic lower bound on was , due to McNamara; with our result, we can now show for any (in fact we can get for any ). The idea is to repeat the now-standard argument to exploit multiplicativity at small primes to deduce Chowla-type conjectures from Fourier uniformity conjectures, noting that the Chowla conjecture would give all the sign patterns one could hope for. The usual argument here uses the “entropy decrement argument” to eliminate a certain error term (involving the large but mean zero factor ). However the observation is that if there are extremely few sign patterns of length , then the entropy decrement argument is unnecessary (there isn’t much entropy to begin with), and a more low-tech moment method argument (similar to the derivation of Chowla’s conjecture from Sarnak’s conjecture, as discussed for instance in this post) gives enough of Chowla’s conjecture to produce plenty of length sign patterns. If there are not extremely few sign patterns of length then we are done anyway. One quirk of this argument is that the sign patterns it produces may only appear exactly once; in contrast with preceding arguments, we were not able to produce a large number of sign patterns that each occur infinitely often.
The second application is to obtain cancellation for various polynomial averages involving the Liouville function or von Mangoldt function , such as
or where are polynomials of degree at most , no two of which differ by a constant (the latter is essential to avoid having to establish the Chowla or Hardy-Littlewood conjectures, which of course remain open). Results of this type were previously obtained by Tamar Ziegler and myself in the “true complexity zero” case when the polynomials had distinct degrees, in which one could use the theory of Matomäki and Radziwill; now that higher is available at the scale we can now remove this restriction.
Kaisa Matomäki, Maksym Radziwill, and I just uploaded to the arXiv our paper “Fourier uniformity of bounded multiplicative functions in short intervals on average”. This paper is the outcome of our attempts during the MSRI program in analytic number theory last year to attack the local Fourier uniformity conjecture for the Liouville function . This conjecture generalises a landmark result of Matomäki and Radziwill, who show (among other things) that one has the asymptotic
whenever and goes to infinity as . Informally, this says that the Liouville function has small mean for almost all short intervals . The remarkable thing about this theorem is that there is no lower bound on how goes to infinity with ; one can take for instance . This lack of lower bound was crucial when I applied this result (or more precisely, a generalisation of this result to arbitrary non-pretentious bounded multiplicative functions) a few years ago to solve the Erdős discrepancy problem, as well as a logarithmically averaged two-point Chowla conjecture; for instance, it implies that
The local Fourier uniformity conjecture asserts the stronger asymptotic
under the same hypotheses on and . As I worked out in a previous paper, this conjecture would imply a logarithmically averaged three-point Chowla conjecture, implying for instance that
This particular bound also follows from some slightly different arguments of Joni Teräväinen and myself, but the implication would also work for other non-pretentious bounded multiplicative functions, whereas the arguments of Joni and myself rely more heavily on the specific properties of the Liouville function (in particular that for all primes ).
There is also a higher order version of the local Fourier uniformity conjecture in which the linear phase is replaced with a polynomial phase such as , or more generally a nilsequence ; as shown in my previous paper, this conjecture implies (and is in fact equivalent to, after logarithmic averaging) a logarithmically averaged version of the full Chowla conjecture (not just the two-point or three-point versions), as well as a logarithmically averaged version of the Sarnak conjecture.
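For readers who like to experiment, the quantity in the local Fourier uniformity conjecture can be estimated numerically at small scales. The following is a minimal sketch, assuming the normalisation (1/(XH)) Σ_{X ≤ x < 2X} sup_α |Σ_{x ≤ n < x+H} λ(n) e(−αn)| and replacing the supremum over α by a maximum over a finite grid (so it only gives a lower bound for the true supremum). The function names and the grid size are illustrative choices, and no numerical experiment of this kind can say anything about the asymptotic regime.

```python
import cmath

def liouville_upto(n):
    """Table lam[k] = (-1)^Omega(k) for 1 <= k <= n (lam[0] is unused),
    built from a smallest-prime-factor sieve."""
    spf = list(range(n + 1))
    for p in range(2, int(n ** 0.5) + 1):
        if spf[p] == p:                       # p is prime
            for m in range(p * p, n + 1, p):
                if spf[m] == m:
                    spf[m] = p
    lam = [0] * (n + 1)
    if n >= 1:
        lam[1] = 1
    for k in range(2, n + 1):
        lam[k] = -lam[k // spf[k]]            # complete multiplicativity
    return lam

def local_fourier_average(lam, X, H, grid=64):
    """Average over X <= x < 2X of max_alpha |sum_{x <= n < x+H} lam(n) e(-alpha n)| / H,
    with alpha restricted to {j / grid}. Requires len(lam) >= 2 * X + H."""
    # |sum| is unchanged if e(-alpha n) is replaced by e(-alpha (n - x)).
    phases = [[cmath.exp(-2j * cmath.pi * (j / grid) * r) for r in range(H)]
              for j in range(grid)]
    total = 0.0
    for x in range(X, 2 * X):
        window = lam[x:x + H]
        total += max(abs(sum(l * z for l, z in zip(window, ph))) for ph in phases)
    return total / (X * H)
```

Taking `grid=1` recovers the zero-frequency case (the Matomäki-Radziwill type average), so comparing `grid=1` with a finer grid shows how much is lost by allowing the frequency to depend on the interval.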
The main result of the current paper is to obtain some cases of the local Fourier uniformity conjecture:
Theorem 1 The asymptotic (2) is true when for a fixed .
Previously this was known for by the work of Zhan (who in fact proved the stronger pointwise assertion for in this case). In a previous paper with Kaisa and Maksym, we also proved a weak version
of (2) for any growing arbitrarily slowly with ; this is stronger than (1) (and is in fact proven by a variant of the method) but significantly weaker than (2), because in the latter the worst-case is permitted to depend on the parameter, whereas in (3) must remain independent of .
Unfortunately, the restriction is not strong enough to give applications to Chowla-type conjectures (one would need something more like for this). However, it can still be used to control some sums that had not previously been manageable. For instance, a quick application of the circle method lets one use the above theorem to derive the asymptotic
whenever for a fixed , where is the von Mangoldt function. Amusingly, the seemingly simpler question of establishing the expected asymptotic for
is only known in the range (from the work of Zaccagnini). Thus we have a rare example of a number theory sum that becomes easier to control when one inserts a Liouville function!
We now give an informal description of the strategy of proof of the theorem (though for numerous technical reasons, the actual proof deviates in some respects from the description given here). If (2) failed, then for many values of we would have the lower bound
for some frequency . We informally describe this correlation between and by writing
for (informally, one should view this as asserting that “behaves like” a constant multiple of ). For sake of discussion, suppose we have this relationship for all , not just many.
As mentioned before, the main difficulty here is to understand how varies with . As it turns out, the multiplicativity properties of the Liouville function place a significant constraint on this dependence. Indeed, if we let be a fairly small prime (e.g. of size for some ), we can use the identity for the Liouville function to conclude (at least heuristically) from (4) that
for . (In practice, we will have this sort of claim for many primes rather than all primes , after using tools such as the Turán-Kubilius inequality, but we ignore this distinction for this informal argument.)
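The identity being used here is the complete multiplicativity of the Liouville function at a prime, which gives λ(pn) = −λ(n). A small sketch (with hypothetical helper names) shows how values of λ at the multiples of p in an interval [x, x+H) are determined by values on the shorter interval [x/p, (x+H)/p), which is the mechanism that transfers a correlation on one interval to a dilated one.

```python
def liouville(n):
    """(-1)^Omega(n) by trial division, for n >= 1."""
    sign, p = 1, 2
    while p * p <= n:
        while n % p == 0:
            n //= p
            sign = -sign
        p += 1
    if n > 1:                    # one remaining prime factor
        sign = -sign
    return sign

def transfer(x, H, p):
    """Triples (n, lam(p * n), lam(n)) for all n with x <= p * n < x + H."""
    first = -(-x // p)           # ceil(x / p)
    last = (x + H - 1) // p
    return [(n, liouville(p * n), liouville(n)) for n in range(first, last + 1)]
```

In every triple returned the second entry is the negative of the third, whatever the prime p.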
Now let and be primes comparable to some fixed range such that
and
on essentially the same range of (two nearby intervals of length ). This suggests that the frequencies and should be close to each other modulo , in particular one should expect the relationship
Comparing this with (5) one is led to the expectation that should depend inversely on in some sense (for instance one can check that
would solve (6) if ; by Taylor expansion, this would correspond to a global approximation of the form ). One now has a problem of an additive combinatorial flavour (or of a “local to global” flavour), namely to leverage the relation (6) to obtain global control on that resembles (7).
A key obstacle in solving (6) efficiently is the fact that one only knows that and are close modulo , rather than close on the real line. One can start resolving this problem by the Chinese remainder theorem, using the fact that we have the freedom to shift (say) by an arbitrary integer. After doing so, one can arrange matters so that one in fact has the relationship
whenever and obey (5). (This may force to become extremely large, on the order of , but this will not concern us.)
Now suppose that we have and primes such that
For every prime , we can find an such that is within of both and . Applying (8) twice we obtain
and
and thus by the triangle inequality we have
for all ; hence by the Chinese remainder theorem
In practice, in the regime that we are considering, the modulus is so huge we can effectively ignore it (in the spirit of the Lefschetz principle); so let us pretend that we in fact have
whenever and obey (9).
Now let be an integer to be chosen later, and suppose we have primes such that the difference
is small but non-zero. If is chosen so that
(where one is somewhat loose about what means) then one can then find real numbers such that
for , with the convention that . We then have
which telescopes to
and thus
and hence
In particular, for each , we expect to be able to write
for some . This quantity can vary with ; but from (10) and a short calculation we see that
whenever obey (9) for some .
Now imagine a “graph” in which the vertices are elements of , and two elements are joined by an edge if (9) holds for some . Because of exponential sum estimates on , this graph turns out to essentially be an “expander” in the sense that any two vertices can be connected (in multiple ways) by fairly short paths in this graph (if one is allowed to modify one of or by ). As a consequence, we can assume that this quantity is essentially constant in (cf. the application of the ergodic theorem in this previous blog post), thus we now have
for most and some . By Taylor expansion, this implies that
on for most , thus
But this can be shown to contradict the Matomäki-Radziwill theorem (because the multiplicative function is known to be non-pretentious).
Kaisa Matomäki, Maksym Radziwill, and I have uploaded to the arXiv our paper “Correlations of the von Mangoldt and higher divisor functions II. Divisor correlations in short ranges”. This is a sequel of sorts to our previous paper on divisor correlations, though the proof techniques in this paper are rather different. As with the previous paper, our interest is in correlations such as
for medium-sized and large , where are natural numbers and is the divisor function (actually our methods can also treat a generalisation in which is non-integer, but for simplicity let us stick with the integer case for this discussion). Our methods also allow for one of the divisor function factors to be replaced with a von Mangoldt function, but (in contrast to the previous paper) we cannot treat the case when both factors are von Mangoldt.
As discussed in this previous post, one heuristically expects an asymptotic of the form
for any fixed , where is a certain explicit (but rather complicated) polynomial of degree . Such asymptotics are known when , but remain open for . In the previous paper, we were able to obtain a weaker bound of the form
for of the shifts , whenever the shift range lies between and . But the methods become increasingly hard to use as gets smaller. In this paper, we use a rather different method to obtain the even weaker bound
for of the shifts , where can now be as short as . The constant can be improved, but there are serious obstacles to using our method to go below (as the exceptionally large values of then begin to dominate). This can be viewed as an analogue to our previous paper on correlations of bounded multiplicative functions on average, in which the functions are now unbounded, and indeed our proof strategy is based in large part on that paper (but with many significant new technical complications).
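The divisor correlations under discussion are straightforward to compute for small parameters. The sketch below builds the generalised divisor function d_k by repeated Dirichlet convolution with the constant function 1 and evaluates a shifted correlation; the summation range X < n ≤ 2X is an assumption made for illustration (the exact normalisation of the sum is not reproduced above), and the function names are hypothetical.

```python
def divisor_function(k, n_max):
    """Table d[n] = d_k(n) for 1 <= n <= n_max (d[0] is unused),
    obtained from d_1 = 1 by k - 1 Dirichlet convolutions with 1."""
    d = [0] + [1] * n_max
    for _ in range(k - 1):
        nxt = [0] * (n_max + 1)
        for a in range(1, n_max + 1):
            for m in range(a, n_max + 1, a):   # a divides m
                nxt[m] += d[a]
        d = nxt
    return d

def correlation(k, l, h, X):
    """sum_{X < n <= 2X} d_k(n) * d_l(n + h) for a shift h >= 0."""
    dk = divisor_function(k, 2 * X)
    dl = divisor_function(l, 2 * X + h)
    return sum(dk[n] * dl[n + h] for n in range(X + 1, 2 * X + 1))
```

Tabulating `correlation(k, l, h, X)` over a range of shifts h makes the dependence on the arithmetic of h visible, and also shows how a few exceptionally large values of the divisor functions can dominate the sum.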
We now discuss some of the ingredients of the proof. Unsurprisingly, the first step is the circle method, expressing (1) in terms of exponential sums such as
Actually, it is convenient to first prune slightly by zeroing out this function on “atypical” numbers that have an unusually small or large number of factors in a certain sense, but let us ignore this technicality for this discussion. The contribution of for “major arc” can be treated by standard techniques (and is the source of the main term ); the main difficulty comes from treating the contribution of “minor arc” .
In our previous paper on bounded multiplicative functions, we used Plancherel’s theorem to estimate the global norm , and then also used the Katai-Bourgain-Sarnak-Ziegler orthogonality criterion to control local norms , where was a minor arc interval of length about , and these two estimates together were sufficient to get a good bound on correlations by an application of Hölder’s inequality. For , it is more convenient to use Dirichlet series methods (and Ramaré-type factorisations of such Dirichlet series) to control local norms on minor arcs, in the spirit of the proof of the Matomaki-Radziwill theorem; a key point is to develop “log-free” mean value theorems for Dirichlet series associated to functions such as , so as not to wipe out the (rather small) savings one will get over the trivial bound from this method. On the other hand, the global bound will definitely be unusable, because the sum has too many unwanted factors of . Fortunately, we can substitute this global bound with a “large values” bound that controls expressions such as
for a moderate number of disjoint intervals , with a bound that is slightly better (for a medium-sized power of ) than what one would have obtained by bounding each integral separately. (One needs to save more than for the argument to work; we end up saving a factor of about .) This large values estimate is probably the most novel contribution of the paper. After taking the Fourier transform, matters basically reduce to getting a good estimate for
where is the midpoint of ; thus we need some upper bound on the large local Fourier coefficients of . These coefficients are difficult to calculate directly, but, in the spirit of a paper of Ben Green and myself, we can try to replace by a more tractable and “pseudorandom” majorant for which the local Fourier coefficients are computable (on average). After a standard duality argument, one ends up having to control expressions such as
after various averaging in the parameters. These local Fourier coefficients of turn out to be small on average unless is “major arc”. One then is left with a mostly combinatorial problem of trying to bound how often this major arc scenario occurs. This is very close to a computation in the previously mentioned paper of Ben and myself; there is a technical wrinkle in that the are not as well separated as they were in my paper with Ben, but it turns out that one can modify the arguments in that paper to still obtain a satisfactory estimate in this case (after first grouping nearby frequencies together, and modifying the duality argument accordingly).
Kaisa Matomaki, Maksym Radziwill, and I have uploaded to the arXiv our paper “Correlations of the von Mangoldt and higher divisor functions I. Long shift ranges“, submitted to Proceedings of the London Mathematical Society. This paper is concerned with the estimation of correlations such as
for medium-sized and large , where is the von Mangoldt function; we also consider variants of this sum in which one of the von Mangoldt functions is replaced with a (higher order) divisor function, but for sake of discussion let us focus just on the sum (1). Understanding this sum is very closely related to the problem of finding pairs of primes that differ by ; for instance, if one could establish a lower bound
then this would easily imply the twin prime conjecture.
The (first) Hardy-Littlewood conjecture asserts an asymptotic
as for any fixed positive , where the singular series is an arithmetic factor arising from the irregularity of distribution of at small moduli, defined explicitly by
when is even, and when is odd, where
is (half of) the twin prime constant. See for instance this previous blog post for a heuristic explanation of this conjecture. From the previous discussion we see that (2) for would imply the twin prime conjecture. Sieve theoretic methods are only able to provide an upper bound of the form .
Needless to say, apart from the trivial case of odd , there are no values of for which the Hardy-Littlewood conjecture is known. However there are some results that say that this conjecture holds “on the average”: in particular, if is a quantity depending on that is somewhat large, there are results that show that (2) holds for most (i.e. for ) of the between and . Ideally one would like to get as small as possible; in particular, one can view the full Hardy-Littlewood conjecture as the endpoint case when is bounded.
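The conjecture can be tested numerically for small shifts. The sketch below assumes the standard form of the singular series, namely $\mathfrak{S}(h) = 2 \Pi_2 \prod_{p | h; p > 2} \frac{p-1}{p-2}$ for even $h$ and $\mathfrak{S}(h) = 0$ for odd $h$, with $\Pi_2 = \prod_{p > 2} (1 - \frac{1}{(p-1)^2})$; the infinite product is truncated at an arbitrary cutoff, and all function names are illustrative choices rather than anything from the paper.

```python
import math


def primes_up_to(n):
    """Sieve of Eratosthenes: list of all primes <= n."""
    sieve = bytearray([1]) * (n + 1)
    sieve[0] = 0
    if n >= 1:
        sieve[1] = 0
    for p in range(2, math.isqrt(n) + 1):
        if sieve[p]:
            sieve[p * p :: p] = bytearray(len(range(p * p, n + 1, p)))
    return [i for i in range(2, n + 1) if sieve[i]]


def von_mangoldt_up_to(limit):
    """lam[n] = log p if n is a power of the prime p, and 0 otherwise."""
    lam = [0.0] * (limit + 1)
    for p in primes_up_to(limit):
        q = p
        while q <= limit:
            lam[q] = math.log(p)
            q *= p
    return lam


def hl_correlation(X, h):
    """Brute-force sum of Lambda(n) * Lambda(n + h) over 1 <= n <= X."""
    lam = von_mangoldt_up_to(X + h)
    return sum(lam[n] * lam[n + h] for n in range(1, X + 1))


def singular_series(h, cutoff=10**5):
    """Singular series for a positive shift h, with Pi_2 truncated at `cutoff`."""
    if h % 2 == 1:
        return 0.0
    pi2 = 1.0
    for p in primes_up_to(cutoff):
        if p > 2:
            pi2 *= 1.0 - 1.0 / (p - 1) ** 2
    factor = 1.0
    m = h
    while m % 2 == 0:
        m //= 2
    p = 3
    while p * p <= m:
        if m % p == 0:
            factor *= (p - 1) / (p - 2)
            while m % p == 0:
                m //= p
        p += 2
    if m > 1:  # the remaining cofactor is an odd prime dividing h
        factor *= (m - 1) / (m - 2)
    return 2.0 * pi2 * factor


if __name__ == "__main__":
    X = 10**5
    for h in (2, 4, 6, 30):
        print(h, hl_correlation(X, h) / X, singular_series(h))
```

The two printed columns should be of comparable size for each even shift, which is of course only numerical evidence at a small height and says nothing about the asymptotic statement.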
The first results in this direction were by van der Corput and by Lavrik, who established such a result with (with a subsequent refinement by Balog); Wolke lowered to , and Mikawa lowered further to . The main result of this paper is a further lowering of to . In fact (as in the preceding works) we get a better error term than , namely an error of the shape for any .
Our arguments initially proceed along standard lines. One can use the Hardy-Littlewood circle method to express the correlation in (2) as an integral involving exponential sums . The contribution of “major arc” is known by a standard computation to recover the main term plus acceptable errors, so it is a matter of controlling the “minor arcs”. After averaging in and using the Plancherel identity, one is basically faced with establishing a bound of the form
for any “minor arc” . If is somewhat close to a low height rational (specifically, if it is within of such a rational with ), then this type of estimate is roughly of comparable strength (by another application of Plancherel) to the best available prime number theorem in short intervals on the average, namely that the prime number theorem holds for most intervals of the form , and we can handle this case using standard mean value theorems for Dirichlet series. So we can restrict attention to the “strongly minor arc” case where is far from such rationals.
The next step (following some ideas we found in a paper of Zhan) is to rewrite this estimate not in terms of the exponential sums , but rather in terms of the Dirichlet polynomial . After a certain amount of computation (including some oscillatory integral estimates arising from stationary phase), one is eventually reduced to the task of establishing an estimate of the form
for any (with sufficiently large depending on ).
The next step, which is again standard, is the use of the Heath-Brown identity (as discussed for instance in this previous blog post) to split up into a number of components that have a Dirichlet convolution structure. Because the exponent we are shooting for is less than , we end up with five types of components that arise, which we call “Type ”, “Type ”, “Type ”, “Type ”, and “Type II”. The “Type II” sums are Dirichlet convolutions involving a factor supported on a range and are quite easy to deal with; the “Type ” terms are Dirichlet convolutions that resemble (non-degenerate portions of) the divisor function, formed from convolving together portions of . The “Type ” and “Type ” terms can be estimated satisfactorily by standard moment estimates for Dirichlet polynomials; this already recovers the result of Mikawa (and our argument is in fact slightly more elementary in that no Kloosterman sum estimates are required). It is the treatment of the “Type ” and “Type ” sums that requires some new analysis, with the Type terms turning out to be the most delicate. After using an existing moment estimate of Jutila for Dirichlet L-functions, matters reduce to obtaining a family of estimates, a typical one of which (relating to the more difficult Type sums) is of the form
for “typical” ordinates of size , where is the Dirichlet polynomial (a fragment of the Riemann zeta function). The precise definition of “typical” is a little technical (because of the complicated nature of Jutila’s estimate) and will not be detailed here. Such a claim would follow easily from the Lindelof hypothesis (which would imply that ) but of course we would like to have an unconditional result.
At this point, having exhausted all the Dirichlet polynomial estimates that are usefully available, we return to “physical space”. Using some further Fourier-analytic and oscillatory integral computations, we can estimate the left-hand side of (3) by an expression that is roughly of the shape
The phase can be Taylor expanded as the sum of and a lower order term , plus negligible errors. If we could discard the lower order term then we would get quite a good bound using the exponential sum estimates of Robert and Sargos, which control averages of exponential sums with purely monomial phases, with the averaging allowing us to exploit the hypothesis that is “typical”. Figuring out how to get rid of this lower order term caused some inefficiency in our arguments; the best we could do (after much experimentation) was to use Fourier analysis to shorten the sums, estimate a one-parameter average exponential sum with a binomial phase by a two-parameter average with a monomial phase, and then use the van der Corput process followed by the estimates of Robert and Sargos. This rather complicated procedure works up to ; it may be possible that some alternate way to proceed here could improve the exponent somewhat.
In a sequel to this paper, we will use a somewhat different method to reduce to a much smaller value of , but only if we replace the correlations by either or , and also we now only save a in the error term rather than .
Kaisa Matomäki, Maksym Radziwiłł, and I have just uploaded to the arXiv our paper “Sign patterns of the Liouville and Möbius functions“. This paper is somewhat similar to our previous paper in that it is using the recent breakthrough of Matomäki and Radziwiłł on mean values of multiplicative functions to obtain partial results towards the Chowla conjecture. This conjecture can be phrased, roughly speaking, as follows: if is a fixed natural number and is selected at random from a large interval , then the sign pattern becomes asymptotically equidistributed in in the limit . This remains open for . In fact even the significantly weaker statement that each of the sign patterns in is attained infinitely often is open for . However, in 1986, Hildebrand showed that for all sign patterns are indeed attained infinitely often. Our first result is a strengthening of Hildebrand’s, moving a little bit closer to Chowla’s conjecture:
Theorem 1 Let . Then each of the sign patterns in is attained by the Liouville function for a set of natural numbers of positive lower density.
Thus for instance one has for a set of of positive lower density. The case of this theorem already appears in the original paper of Matomäki and Radziwiłł (and the significantly simpler case of the sign patterns and was treated previously by Harman, Pintz, and Wolke).
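One can get a feel for Theorem 1 by simply tabulating sign patterns. The following sketch (with illustrative function names of my own choosing) computes the Liouville function by a smallest-prime-factor sieve, using the complete multiplicativity $\lambda(n) = -\lambda(n/p)$ for any prime $p$ dividing $n$, and then records how often each pattern of length three occurs up to a given height. A finite computation like this is only a sanity check; it proves nothing about lower density.

```python
import math
from collections import Counter


def liouville_up_to(limit):
    """lam[n] = (-1)^Omega(n) for 1 <= n <= limit (lam[0] is unused)."""
    spf = list(range(limit + 1))  # spf[n] will hold the smallest prime factor of n
    for p in range(2, math.isqrt(limit) + 1):
        if spf[p] == p:
            for m in range(p * p, limit + 1, p):
                if spf[m] == m:
                    spf[m] = p
    lam = [0] * (limit + 1)
    lam[1] = 1
    for n in range(2, limit + 1):
        # Removing one prime factor flips the sign.
        lam[n] = -lam[n // spf[n]]
    return lam


def sign_pattern_densities(N, k=3):
    """Proportion of 1 <= n <= N at which each pattern (lam(n), ..., lam(n+k-1)) occurs."""
    lam = liouville_up_to(N + k - 1)
    counts = Counter(tuple(lam[n : n + k]) for n in range(1, N + 1))
    return {pattern: count / N for pattern, count in counts.items()}


if __name__ == "__main__":
    for pattern, density in sorted(sign_pattern_densities(10**5).items()):
        print(pattern, round(density, 4))
```

Changing `k` to 4 or 5 tabulates longer patterns, for which no analogue of the theorem is currently available.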
The basic strategy in all of these arguments is to assume for sake of contradiction that a certain sign pattern occurs extremely rarely, and then exploit the complete multiplicativity of (which implies in particular that , , and for all ) together with some combinatorial arguments (vaguely analogous to solving a Sudoku puzzle!) to establish more complex sign patterns for the Liouville function, that are either inconsistent with each other, or with results such as the Matomäki-Radziwiłł result. To illustrate this, let us give some examples, arguing a little informally to emphasise the combinatorial aspects of the argument. First suppose that the sign pattern almost never occurs. The prime number theorem tells us that and are each equal to about half of the time, which by inclusion-exclusion implies that the sign pattern almost never occurs. In other words, we have for almost all . But from the multiplicativity property this implies that one should have
and
for almost all . But the above three statements are contradictory, and the claim follows.
Similarly, if we assume that the sign pattern almost never occurs, then a similar argument to the above shows that for any fixed , one has for almost all . But this means that the mean is abnormally large for most , which (for large enough) contradicts the results of Matomäki and Radziwiłł. Here we see that the “enemy” to defeat is the scenario in which only changes sign very rarely, in which case one rarely sees the pattern .
It turns out that similar (but more combinatorially intricate) arguments work for sign patterns of length three (but are unlikely to work for most sign patterns of length four or greater). We give here one fragment of such an argument (due to Hildebrand) which hopefully conveys the Sudoku-type flavour of the combinatorics. Suppose for instance that the sign pattern almost never occurs. Now suppose is a typical number with . Since we almost never have the sign pattern , we must (almost always) then have . By multiplicativity this implies that
We claim that this (almost always) forces . For if , then by the lack of the sign pattern , this (almost always) forces , which by multiplicativity forces , which by lack of (almost always) forces , which by multiplicativity contradicts . Thus we have ; a similar argument gives almost always, which by multiplicativity gives , a contradiction. Thus we almost never have , which by the inclusion-exclusion argument mentioned previously shows that for almost all .
One can continue these Sudoku-type arguments and conclude eventually that for almost all . To put it another way, if denotes the non-principal Dirichlet character of modulus , then is almost always constant away from the multiples of . (Conversely, if changed sign very rarely outside of the multiples of three, then the sign pattern would never occur.) Fortunately, the main result of Matomäki and Radziwiłł shows that this scenario cannot occur, which establishes that the sign pattern must occur rather frequently. The other sign patterns are handled by variants of these arguments.
Excluding a sign pattern of length three leads to useful implications like “if , then ” which turn out to be just barely strong enough to quite rigidly constrain the Liouville function using Sudoku-like arguments. In contrast, excluding a sign pattern of length four only gives rise to implications like “if , then ”, and these seem to be much weaker for this purpose (the hypothesis in these implications just isn’t satisfied nearly often enough). So a different idea seems to be needed if one wishes to extend the above theorem to larger values of .
Our second theorem gives an analogous result for the Möbius function (which takes values in rather than ), but the analysis turns out to be remarkably difficult and we are only able to get up to :
Theorem 2 Let . Then each of the sign patterns in is attained by the Möbius function for a set of positive lower density.
It turns out that the prime number theorem and elementary sieve theory can be used to handle the case and all the cases that involve at least one , leaving only the four sign patterns to handle. It is here that the zeroes of the Möbius function cause a significant new obstacle. Suppose for instance that the sign pattern almost never occurs for the Möbius function. The same arguments that were used in the Liouville case then show that will be almost always equal to , provided that are both square-free. One can try to chain this together as before to create a long string where the Möbius function is constant, but this cannot work for any larger than three, because the Möbius function vanishes at every multiple of four.
The constraints we assume on the Möbius function can be depicted using a graph on the squarefree natural numbers, in which any two adjacent squarefree natural numbers are connected by an edge. The main difficulty is then that this graph is highly disconnected due to the multiples of four not being squarefree.
To get around this, we need to enlarge the graph. Note from multiplicativity that if is almost always equal to when are squarefree, then is almost always equal to when are squarefree and is divisible by . We can then form a graph on the squarefree natural numbers by connecting to whenever are squarefree and is divisible by . If this graph is “locally connected” in some sense, then will be constant on almost all of the squarefree numbers in a large interval, which turns out to be incompatible with the results of Matomäki and Radziwiłł. Because of this, matters are reduced to establishing the connectedness of a certain graph. More precisely, it turns out to be sufficient to establish the following claim:
Theorem 3 For each prime , let be a residue class chosen uniformly at random. Let be the random graph whose vertices consist of those integers not equal to for any , and whose edges consist of pairs in with . Then with probability , the graph is connected.
We were able to show the connectedness of this graph, though it turned out to be remarkably tricky to do so. Roughly speaking (and suppressing a number of technicalities), the main steps in the argument were as follows.
- (Early stage) Pick a large number (in our paper we take to be odd, but I’ll ignore this technicality here). Using a moment method to explore neighbourhoods of a single point in , one can show that a vertex in is almost always connected to at least numbers in , using relatively short paths of short diameter. (This is the most computationally intensive portion of the argument.)
- (Middle stage) Let be a typical number in , and let be a scale somewhere between and . By using paths involving three primes, and using a variant of Vinogradov’s theorem and some routine second moment computations, one can show that with quite high probability, any “good” vertex in is connected to a “good” vertex in by paths of length three, where the definition of “good” is somewhat technical but encompasses almost all of the vertices in .
- (Late stage) Combining the two previous results together, we can show that most vertices will be connected to a vertex in for any in . In particular, will be connected to a set of vertices in . By tracking everything carefully, one can control the length and diameter of the paths used to connect to this set, and one can also control the parity of the elements in this set.
- (Final stage) Now suppose we have two vertices at a distance apart. By the previous item, one can connect to a large set of vertices in , and one can similarly connect to a large set of vertices in . Now, by using a Vinogradov-type theorem and second moment calculations again (and ensuring that the elements of and have opposite parity), one can connect many of the vertices in to many of the vertices by paths of length three, which then connects to , and gives the claim.
It seems of interest to understand random graphs like further. In particular, the graph on the integers formed by connecting to for all in a randomly selected residue class mod for each prime is particularly interesting (it is to the Liouville function as is to the Möbius function); if one could show some “local expander” properties of this graph , then one would have a chance of modifying the above methods to attack the first unsolved case of the Chowla conjecture, namely that has asymptotic density zero (perhaps working with logarithmic density instead of natural density to avoid some technicalities).
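The deterministic graph on squarefree numbers that motivates Theorem 3 is easy to explore by computer. The sketch below rests on my reading of the construction, which should be treated as an assumption since the formulas are not reproduced above: adjacent squarefree numbers $n, n+1$ are joined, and $n$ is joined to $n+p$ whenever $p$ is a prime dividing $n$ and both $n$ and $n+p$ are squarefree. (The second rule is consistent with multiplicativity: if $\mu(m) = \mu(m+1)$, then $\mu(pm) = -\mu(m) = -\mu(m+1) = \mu(pm+p)$ when $pm$ and $pm+p$ are squarefree.) It counts connected components inside $\{1, \dots, N\}$ with a union-find structure; the function names are hypothetical, and this is not the random graph of the theorem itself.

```python
import math


def squarefree_flags(limit):
    """flags[n] is True exactly when 1 <= n <= limit is squarefree."""
    flags = [True] * (limit + 1)
    flags[0] = False
    for d in range(2, math.isqrt(limit) + 1):
        for m in range(d * d, limit + 1, d * d):
            flags[m] = False
    return flags


def primes_up_to(limit):
    """Sieve of Eratosthenes: list of all primes <= limit."""
    is_prime = [True] * (limit + 1)
    is_prime[0] = False
    if limit >= 1:
        is_prime[1] = False
    for p in range(2, math.isqrt(limit) + 1):
        if is_prime[p]:
            for m in range(p * p, limit + 1, p):
                is_prime[m] = False
    return [n for n in range(2, limit + 1) if is_prime[n]]


def count_components(limit, use_prime_edges=True):
    """Number of connected components among the squarefree numbers in [1, limit]."""
    sqf = squarefree_flags(limit)
    parent = list(range(limit + 1))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[ra] = rb

    # Edges between adjacent squarefree numbers.
    for n in range(1, limit):
        if sqf[n] and sqf[n + 1]:
            union(n, n + 1)
    # Assumed enlarged edge set: n ~ n + p when p | n and both ends are squarefree.
    if use_prime_edges:
        for p in primes_up_to(limit):
            for n in range(p, limit - p + 1, p):
                if sqf[n] and sqf[n + p]:
                    union(n, n + p)
    return len({find(n) for n in range(1, limit + 1) if sqf[n]})


if __name__ == "__main__":
    for N in (10, 100, 1000, 10000):
        print(N, count_components(N, False), count_components(N, True))
```

With only the adjacency edges the number of components grows linearly in $N$, since every multiple of four breaks a chain; the extra edges reduce this count, although vertices near the top of the window have few neighbours inside it, so this finite truncation is at best suggestive of the local connectedness discussed above.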
Kaisa Matomaki, Maksym Radziwill, and I have just uploaded to the arXiv our paper “An averaged form of Chowla’s conjecture“. This paper concerns a weaker variant of the famous conjecture of Chowla (discussed for instance in this previous post) that
as for any distinct natural numbers , where denotes the Liouville function. (One could also replace the Liouville function here by the Möbius function and obtain a morally equivalent conjecture.) This conjecture remains open for any ; for instance the assertion
is a variant of the twin prime conjecture (though possibly a tiny bit easier to prove), and is subject to the notorious parity barrier (as discussed in this previous post).
Our main result asserts, roughly speaking, that Chowla’s conjecture can be established unconditionally provided one has non-trivial averaging in the parameters. More precisely, one has
Theorem 1 (Chowla on the average) Suppose is a quantity that goes to infinity as (but it can go to infinity arbitrarily slowly). Then for any fixed , we have
In fact, we can remove one of the averaging parameters and obtain
Actually we can make the decay rate a bit more quantitative, gaining about over the trivial bound. The key case is ; while the unaveraged Chowla conjecture becomes more difficult as increases, the averaged Chowla conjecture does not increase in difficulty due to the increasing amount of averaging for larger , and we end up deducing the higher case of the conjecture from the case by an elementary argument.
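In the key case of two-point correlations, the averaged quantity is simple to compute for small parameters. The sketch below assumes the averaged statement takes the form $\sum_{h \leq H} |\sum_{n \leq X} \lambda(n) \lambda(n+h)| = o(HX)$ (the display is not reproduced above, so this normalisation is my assumption) and evaluates the left-hand side divided by $HX$ by brute force; the function names are illustrative.

```python
import math


def liouville_up_to(limit):
    """lam[n] = (-1)^Omega(n) for 1 <= n <= limit (lam[0] is unused)."""
    spf = list(range(limit + 1))  # smallest prime factor of each n
    for p in range(2, math.isqrt(limit) + 1):
        if spf[p] == p:
            for m in range(p * p, limit + 1, p):
                if spf[m] == m:
                    spf[m] = p
    lam = [0] * (limit + 1)
    lam[1] = 1
    for n in range(2, limit + 1):
        lam[n] = -lam[n // spf[n]]
    return lam


def averaged_pair_correlation(X, H):
    """Average over 1 <= h <= H of |sum_{n <= X} lam(n) lam(n+h)|, divided by X."""
    lam = liouville_up_to(X + H)
    total = 0
    for h in range(1, H + 1):
        total += abs(sum(lam[n] * lam[n + h] for n in range(1, X + 1)))
    return total / (H * X)


if __name__ == "__main__":
    for X in (10**3, 10**4, 10**5):
        print(X, averaged_pair_correlation(X, 10))
```

The trivial bound for the printed quantity is $1$; numerically it is far smaller, in line with square-root cancellation heuristics, though the theorem itself only asserts $o(1)$ decay as $H \rightarrow \infty$.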
The proof of the theorem proceeds as follows. By exploiting the Fourier-analytic identity
(related to a standard Fourier-analytic identity for the Gowers norm) it turns out that the case of the above theorem can basically be derived from an estimate of the form
uniformly for all . For “major arc” , close to a rational for small , we can establish this bound from a generalisation of a recent result of Matomaki and Radziwill (discussed in this previous post) on averages of multiplicative functions in short intervals. For “minor arc” , we can proceed instead from an argument of Katai and Bourgain-Sarnak-Ziegler (discussed in this previous post).
The argument also extends to other bounded multiplicative functions than the Liouville function. Chowla’s conjecture was generalised by Elliott, who roughly speaking conjectured that the copies of in Chowla’s conjecture could be replaced by arbitrary bounded multiplicative functions as long as these functions were far from a twisted Dirichlet character in the sense that
(This type of distance is incidentally now a fundamental notion in the Granville-Soundararajan “pretentious” approach to multiplicative number theory.) During our work on this project, we found that Elliott’s conjecture is not quite true as stated due to a technicality: one can cook up a bounded multiplicative function which behaves like on scales for some going to infinity and some slowly varying , and such a function will be far from any fixed Dirichlet character whilst still having many large correlations (e.g. the pair correlations will be large). In our paper we propose a technical “fix” to Elliott’s conjecture (replacing (1) by a truncated variant), and show that this repaired version of Elliott’s conjecture is true on the average in much the same way that Chowla’s conjecture is. (If one restricts attention to real-valued multiplicative functions, then this technical issue does not show up, basically because one can assume without loss of generality that in this case; we discuss this fact in an appendix to the paper.)
In analytic number theory, it is a well-known phenomenon that for many arithmetic functions of interest in number theory, it is significantly easier to estimate logarithmic sums such as
than it is to estimate summatory functions such as
(Here we are normalising to be roughly constant in size, e.g. as .) For instance, when is the von Mangoldt function , the logarithmic sums can be adequately estimated by Mertens’ theorem, which can be easily proven by elementary means (see Notes 1); but a satisfactory estimate on the summatory function requires the prime number theorem, which is substantially harder to prove (see Notes 2). (From a complex-analytic or Fourier-analytic viewpoint, the problem is that the logarithmic sums can usually be controlled just from knowledge of the Dirichlet series for near ; but the summatory functions require control of the Dirichlet series for on or near a large portion of the line . See Notes 2 for further discussion.)
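To make the contrast concrete for the von Mangoldt function, the sketch below computes both sums directly. It uses two standard facts: Mertens' theorem in the form $\sum_{n \leq x} \frac{\Lambda(n)}{n} = \log x + O(1)$, and the prime number theorem in the form $\sum_{n \leq x} \Lambda(n) = x + o(x)$. The function names are illustrative.

```python
import math


def von_mangoldt_up_to(limit):
    """lam[n] = log p if n is a power of the prime p, and 0 otherwise."""
    is_prime = [True] * (limit + 1)
    lam = [0.0] * (limit + 1)
    for p in range(2, limit + 1):
        if is_prime[p]:
            for m in range(p * p, limit + 1, p):
                is_prime[m] = False
            q = p
            while q <= limit:
                lam[q] = math.log(p)
                q *= p
    return lam


def compare_sums(x):
    """Return (logarithmic sum minus log x, summatory function divided by x)."""
    lam = von_mangoldt_up_to(x)
    log_sum = sum(lam[n] / n for n in range(1, x + 1))
    summatory = sum(lam[1:])
    return log_sum - math.log(x), summatory / x


if __name__ == "__main__":
    for x in (10**3, 10**4, 10**5):
        print(x, compare_sums(x))
```

The first coordinate stays bounded, as Mertens' theorem predicts, while the second approaches $1$; the numerics of course do not reveal how much harder the second statement is to prove.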
Viewed conversely, whenever one has a difficult estimate on a summatory function such as , one can look to see if there is a “cheaper” version of that estimate that only controls the logarithmic sums , which is easier to prove than the original, more “expensive” estimate. In this post, we shall do this for two theorems, a classical theorem of Halasz on mean values of multiplicative functions on long intervals, and a much more recent result of Matomaki and Radziwiłł on mean values of multiplicative functions in short intervals. The two are related; the former theorem is an ingredient in the latter (though in the special case of the Matomaki-Radziwiłł theorem considered here, we will not need Halasz’s theorem directly, instead using a key tool in the proof of that theorem).
We begin with Halasz’s theorem. Here is a version of this theorem, due to Montgomery and to Tenenbaum:
Theorem 1 (Halasz-Montgomery-Tenenbaum) Let be a multiplicative function with for all . Let and , and set
Then one has
Informally, this theorem asserts that is small compared with , unless “pretends” to be like the character on primes for some small . (This is the starting point of the “pretentious” approach of Granville and Soundararajan to analytic number theory, as developed for instance here.) We now give a “cheap” version of this theorem which is significantly weaker (because it settles for controlling logarithmic sums rather than summatory functions, because it requires to be completely multiplicative instead of multiplicative, because it requires a strong bound on the analogue of the quantity , and because it only gives qualitative decay rather than quantitative estimates), but easier to prove:
Theorem 2 (Cheap Halasz) Let be an asymptotic parameter going to infinity. Let be a completely multiplicative function (possibly depending on ) such that for all , such that
Note that now that we are content with estimating logarithmic sums, we no longer need to preclude the possibility that pretends to be like ; see Exercise 11 of Notes 1 for a related observation.
To prove this theorem, we first need a special case of the Turan-Kubilius inequality.
Lemma 3 (Turan-Kubilius) Let be a parameter going to infinity, and let be a quantity depending on such that and as . Then
Informally, this lemma is asserting that
for most large numbers . Another way of writing this heuristically is in terms of Dirichlet convolutions:
This type of estimate was previously discussed as a tool to establish a criterion of Katai and Bourgain-Sarnak-Ziegler for Möbius orthogonality estimates in this previous blog post. See also Section 5 of Notes 1 for some similar computations.
Proof: By Cauchy-Schwarz it suffices to show that
Expanding out the square, it suffices to show that
for .
We just show the case, as the cases are similar (and easier). We rearrange the left-hand side as
We can estimate the inner sum as . But a routine application of Mertens’ theorem (handling the diagonal case when separately) shows that
and the claim follows.
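The phenomenon behind Lemma 3 can be seen numerically. The sketch below checks a classical unweighted variant, which is a simplification and not the logarithmically weighted statement of the lemma: for most $n \leq x$, the number of primes $p \in [P,Q]$ dividing $n$ is close to $\sum_{P \leq p \leq Q} \frac{1}{p}$, in the sense that the mean square deviation is at most comparable to that sum. The function names and the sample parameters are illustrative.

```python
import math


def primes_in_range(P, Q):
    """List of primes p with P <= p <= Q."""
    is_prime = [True] * (Q + 1)
    is_prime[0] = False
    if Q >= 1:
        is_prime[1] = False
    for p in range(2, math.isqrt(Q) + 1):
        if is_prime[p]:
            for m in range(p * p, Q + 1, p):
                is_prime[m] = False
    return [p for p in range(max(P, 2), Q + 1) if is_prime[p]]


def turan_kubilius_stats(x, P, Q):
    """Return (expected, mean, mean_square_deviation) for the number of
    prime factors in [P, Q] of the integers 1 <= n <= x."""
    primes = primes_in_range(P, Q)
    expected = sum(1.0 / p for p in primes)
    counts = [0] * (x + 1)
    for p in primes:
        for m in range(p, x + 1, p):
            counts[m] += 1
    values = counts[1:]
    mean = sum(values) / x
    # Deviation is measured from the heuristic value `expected`, not the empirical mean.
    mean_square_deviation = sum((c - expected) ** 2 for c in values) / x
    return expected, mean, mean_square_deviation


if __name__ == "__main__":
    print(turan_kubilius_stats(10**5, 10, 1000))
```

Since the mean square deviation is of the same order as the expected value $S$ rather than $S^2$, Chebyshev's inequality shows that the count is $(1+o(1)) S$ for most $n$ once $S$ is large, which is the informal content of the lemma.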
Remark 4 As an alternative to the Turan-Kubilius inequality, one can use the Ramaré identity
(see e.g. Section 17.3 of Friedlander-Iwaniec). This identity turns out to give better quantitative results than the Turan-Kubilius inequality in applications; see the paper of Matomaki and Radziwiłł for an instance of this.
We now prove Theorem 2. Let denote the left-hand side of (2); by the triangle inequality we have . By Lemma 3 (for some to be chosen later) and the triangle inequality we have
We rearrange the left-hand side as
We now replace the constraint by . The error incurred in doing so is
which by Mertens’ theorem is . Thus we have
But by definition of , we have , thus
From Mertens’ theorem, the expression in brackets can be rewritten as
and so the real part of this expression is
By (1), Mertens’ theorem and the hypothesis on we have
for any . This implies that we can find going to infinity such that
and thus the expression in brackets has real part . The claim follows.
The Turan-Kubilius argument is certainly not the most efficient way to estimate sums such as . In the exercise below we give a significantly more accurate estimate that works when is non-negative.
Exercise 5 (Granville-Koukoulopoulos-Matomaki)
- (i) If is a completely multiplicative function with for all primes , show that
as . (Hint: for the upper bound, expand out the Euler product. For the lower bound, show that , where is the completely multiplicative function with for all primes .)
- (ii) If is multiplicative and takes values in , show that
for all .
Now we turn to a very recent result of Matomaki and Radziwiłł on mean values of multiplicative functions in short intervals. For sake of illustration we specialise their results to the simpler case of the Liouville function , although their arguments actually work (with some additional effort) for arbitrary multiplicative functions of magnitude at most that are real-valued (or more generally, stay far from complex characters ). Furthermore, we give a qualitative form of their estimates rather than a quantitative one:
Theorem 6 (Matomaki-Radziwiłł, special case) Let be a parameter going to infinity, and let be a quantity going to infinity as . Then for all but of the integers , one has
A simple sieving argument (see Exercise 18 of Supplement 4) shows that one can replace by the Möbius function and obtain the same conclusion. See this recent note of Matomaki and Radziwiłł for a simple proof of their (quantitative) main theorem in this special case.
Of course, (4) improves upon the trivial bound of . Prior to this paper, such estimates were only known (using arguments similar to those in Section 3 of Notes 6) for unconditionally, or for for some sufficiently large if one assumed the Riemann hypothesis. This theorem also represents some progress towards Chowla’s conjecture (discussed in Supplement 4) that
as for any fixed distinct ; indeed, it implies that this conjecture holds if one performs a small amount of averaging in the .
Below the fold, we give a “cheap” version of the Matomaki-Radziwiłł argument. More precisely, we establish
Theorem 7 (Cheap Matomaki-Radziwiłł) Let be a parameter going to infinity, and let . Then
Note that (5) improves upon the trivial bound of . Again, one can replace with if desired. Due to the cheapness of Theorem 7, the proof will require few ingredients; the deepest input is the improved zero-free region for the Riemann zeta function due to Vinogradov and Korobov. Other than that, the main tools are the Turan-Kubilius result established above, and some Fourier (or complex) analysis.