One of the basic general problems in analytic number theory is to understand as much as possible the fluctuations of the Möbius function , defined as
when
is the product of
distinct primes, and zero otherwise. For instance, as
takes values in
, we have the trivial bound
and the seemingly slight improvement
is already equivalent to the prime number theorem, as observed by Landau (see e.g. this previous blog post for a proof), while the much stronger (and still open) improvement
is equivalent to the notorious Riemann hypothesis.
There is a general Möbius pseudorandomness heuristic that suggests that the sign pattern behaves so randomly (or pseudorandomly) that one should expect a substantial amount of cancellation in sums that involve the sign fluctuation of the Möbius function in a nontrivial fashion, with the amount of cancellation present comparable to the amount that an analogous random sum would provide; cf. the probabilistic heuristic discussed in this recent blog post. There are a number of ways to make this heuristic precise. As already mentioned, the Riemann hypothesis can be considered one such manifestation of the heuristic. Another manifestation is the following old conjecture of Chowla:
Conjecture 1 (Chowla’s conjecture) For any fixed integer
and exponents
, with at least one of the
odd (so as not to completely destroy the sign cancellation), we have
Note that as for any
, we can reduce to the case when the
take values in
here. When only one of the
are odd, this is essentially the prime number theorem in arithmetic progressions (after some elementary sieving), but with two or more of the
are odd, the problem becomes completely open. For instance, the estimate
is morally very close to the conjectured asymptotic
for the von Mangoldt function , where
is the twin prime constant; this asymptotic in turn implies the twin prime conjecture. (To formally deduce estimates for von Mangoldt from estimates for Möbius, though, typically requires some better control on the error terms than
, in particular gains of some power of
are usually needed. See this previous blog post for more discussion.)
Remark 2 The Chowla conjecture resembles an assertion that, for
chosen randomly and uniformly from
to
, the random variables
become asymptotically independent of each other (in the probabilistic sense) as
. However, this is not quite accurate, because some moments (namely those with all exponents
even) have the “wrong” asymptotic value, leading to some unwanted correlation between the two variables. For instance, the events
and
have a strong correlation with each other, basically because they are both strongly correlated with the event of
being divisible by
. A more accurate interpretation of the Chowla conjecture is that the random variables
are asymptotically conditionally independent of each other, after conditioning on the zero pattern
; thus, it is the sign of the Möbius function that fluctuates like random noise, rather than the zero pattern. (The situation is a bit cleaner if one works instead with the Liouville function
instead of the Möbius function
, as this function never vanishes, but we will stick to the traditional Möbius function formalism here.)
A more recent formulation of the Möbius randomness heuristic is the following conjecture of Sarnak. Given a bounded sequence , define the topological entropy of the sequence to be the least exponent
with the property that for any fixed
, and for
going to infinity the set
of
can be covered by
balls of radius
(in the
metric). (If
arises from a minimal topological dynamical system
by
and
is generated by
and its shifts, the above notion is equivalent to the usual notion of the topological entropy of a dynamical system.) For instance, if the sequence is a bit sequence (i.e. it takes values in
), then there are only
-bit patterns that can appear as blocks of
consecutive bits in this sequence. As a special case, a Turing machine with bounded memory that had access to a random number generator at the rate of one random bit produced every
units of time, but otherwise evolved deterministically, would have an output sequence that had a topological entropy of at most
. A bounded sequence is said to be deterministic if its topological entropy is zero. A typical example is a polynomial sequence such as
for some fixed
; the
-blocks of such polynomials sequence have covering numbers that only grow polynomially in
, rather than exponentially, thus yielding the zero entropy. Unipotent flows, such as the horocycle flow on a compact hyperbolic surface, are another good source of deterministic sequences.
Conjecture 3 (Sarnak’s conjecture) Let
be a deterministic bounded sequence. Then
This conjecture in general is still quite far from being solved. However, special cases are known:
- For constant sequences, this is essentially the prime number theorem (1).
- For periodic sequences, this is essentially the prime number theorem in arithmetic progressions.
- For quasiperiodic sequences such as
for some continuous
, this follows from the work of Davenport.
- For nilsequences, this is a result of Ben Green and myself.
- For horocycle flows, this is a result of Bourgain, Sarnak, and Ziegler.
- For the Thue-Morse sequence, this is a result of Dartyge-Tenenbaum (with a stronger error term obtained by Maduit-Rivat). A subsequent result of Bourgain handles all bounded rank one sequences (though the Thue-Morse sequence is actually of rank two), and a related result of Green establishes asymptotic orthogonality of the Möbius function to bounded depth circuits, although such functions are not necessarily deterministic in nature.
- For the Rudin-Shapiro sequence, I sketched out an argument at this MathOverflow post.
- The Möbius function is known to itself be non-deterministic, because its square
(i.e. the indicator of the square-free functions) is known to be non-deterministic (indeed, its topological entropy is
). (The corresponding question for the Liouville function
, however, remains open, as the square
has zero entropy.)
- In the converse direction, it is easy to construct sequences of arbitrarily small positive entropy that correlate with the Möbius function (a rather silly example is
for some fixed large (squarefree)
, which has topological entropy at most
but clearly correlates with
).
See this survey of Sarnak for further discussion of this and related topics.
In this post I wanted to give a very nice argument of Sarnak that links the above two conjectures:
Proposition 4 The Chowla conjecture implies the Sarnak conjecture.
The argument does not use any number-theoretic properties of the Möbius function; one could replace in both conjectures by any other function from the natural numbers to
and obtain the same implication. The argument consists of the following ingredients:
- To show that
, it suffices to show that the expectation of the random variable
, where
is drawn uniformly at random from
to
, can be made arbitrary small by making
large (and
even larger).
- By the union bound and the zero topological entropy of
, it suffices to show that for any bounded deterministic coefficients
, the random variable
concentrates with exponentially high probability.
- Finally, this exponentially high concentration can be achieved by the moment method, using a slight variant of the moment method proof of the large deviation estimates such as the Chernoff inequality or Hoeffding inequality (as discussed in this blog post).
As is often the case, though, while the “top-down” order of steps presented above is perhaps the clearest way to think conceptually about the argument, in order to present the argument formally it is more convenient to present the arguments in the reverse (or “bottom-up”) order. This is the approach taken below the fold.
— 1. Proof of proposition —
We first establish step 3 of the above outline.
Proposition 5 Assume the Chowla conjecture. Then for any
, any
, and any coefficients
bounded in magnitude by
, we have
where
is an absolute constant and
goes to zero as
for fixed
(uniformly in the choice of coefficients
), and
is drawn uniformly at random from
to
.
Proof: We use the moment method. Let be a large even integer to be optimised in later. By Chebyshev’s inequality, we can bound the left-hand side of (2) by
expanding out the power and using the triangle inequality, we can bound this by
The summand here is always bounded by . By Chowla’s conjecture, the summand can in fact be bounded by
unless none of the indices
occur an odd number of times, so that the
indices are in fact grouped into at most
classes. So we are now facing the enumerative combinatorics problem of counting how many tuples
are of this form. We can estimate this count as follows. Reading the
from left to right, there are two cases: each
is either “fresh” (in that
is not equal to any of the
), or a “repeat”. In the former case, there are at most
choices for
; in the latter case, there are at most
. Also, at most
of the cases are fresh, so once it is decided which indices
are fresh or repeats, there are at most
remaining choices to be made (assuming that
). Thus the total count is bounded by
. Using this bound, we can thus upper bound (3) by
If we then optimise in by choosing
to be the largest even integer less than
(say), we obtain the claim.
Now let be a deterministic sequence, which we can normalise to be bounded to be real-valued and in magnitude by
. Let
and
, and let
be
rounded to the nearest multiple of
that is also bounded in magnitude by
., then from the zero-entropy hypothesis, there are at most
different values for the tuple
, as
ranges over the natural numbers. Let
be the set of all such tuples. From the previous proposition and the union bound, we have
and hence
The right-hand side can be simplified to
if is sufficiently large depending on
. This in turn leads to the bound
since the expression inside the integrand is bounded in magnitude by (this is Step 2 of the sketch). But by approximate translation invariance we have
for each , hence
This gives (for a suitable choice of )
and thus
letting go to zero sufficiently slowly in
, we obtain the Sarnak conjecture.
Remark 6 The final part of this argument (Step 1 of the sketch) is very lossy: control of short-range correlations (such as
) can be easily averaged out to yield control of long-range correlations (such as those
), but it is very difficult to reverse the process and deduce short-range control from long-range control. Indeed, conjectures such as the Chowla conjecture, which control
-point correlations of an arithmetic function such as
are generally considered to be significantly more difficult than conjectures such as the Sarnak conjecture, which involve only simple correlations of that function. (In particular, one can plan to tackle the Sarnak conjecture by using Cauchy-Schwarz type methods to eliminate the role of arithmetic functions
, as was discussed for instance in this blog post, but there is no obvious way to usefully eliminate arithmetic from the Chowla conjeture.)
20 comments
Comments feed for this article
15 October, 2012 at 1:34 am
Anonymous
The link of “this Mathoverflow post” does not seem to work, I suppose it is supposed to be this one ?
http://mathoverflow.net/questions/97261/mobius-randomness-of-the-rudin-shapiro-sequence
[Yes; the link is now added, thanks – T.]
15 October, 2012 at 8:12 am
Anonymous
Almost there, but you wrote \hre instead of \href
[Oops! Fixed, thanks – T.]
17 October, 2012 at 11:31 pm
Duvall
Dear Terry,
can you explain the relation between the Sarnak conjecture for zero entropy transformations and uniform distribution of points at prime times for these transformations ( under suitable extra assumptions of course, as say, unique ergodicity of all powers).
Thank you.
18 October, 2012 at 7:43 am
Terence Tao
As mentioned in the post, the qualitative formulation of the Sarnak conjecture (with only a
gain over the trivial bound) appears to be insufficient to establish any nontrivial bounds involving sums over primes, but if one has a stronger version of the conjecture in which one gains a few powers of
(and has some uniformity with respect to dilations) then one can start getting results of this form. Ben and I did this in the case of nilsequences in Section 7 of http://arxiv.org/abs/0807.1736 . One can probably write down an abstract version of this implication that applies to more general sequences, though this is not in the literature. (We gave two arguments in that paper, but both took advantage of the special features of nilsequences.)
18 October, 2012 at 6:19 pm
Gil Kalai
A few questions: 1)What is an example of bounded depth function which is not deterministic? (Actually one can wonder also if there is a connection between “detrminism” in the topological entropy sense and notion of psedorandomness from CS) 2).Regarding the question of Duvall: Do the Mobius randomness of the +- 1 sequences f(n) considered above (or their proofs) imply that there are infinitely many primes for which f(p)=1. 3) What is the lowest density of 1’s that Mobius randomness is known for a non-artificial 01 sequence? 4) if you consider a not too complicated sequence of the kind listed above which depends also on random bits. Is it still mobius-random in some sense?
19 October, 2012 at 7:47 am
Terence Tao
Dear Gil,
With regards to (1), one has to be a little careful with how one defines entropy for bounded depth functions, as entropy is an asymptotic notion and bounded depth circuits are generally only defined on {0,…,2^{n-1}} for some fixed large n. But note that if k is a natural number and n is much larger than k (more specifically,
), one can build a bounded depth circuit
with
zero unless
for some fixed residue class a. ANDing such a circuit with the
bit of i, and then ORing
of these circuits together, we can get a bounded depth circuit which can attain any pattern whatsoever in a consecutive block of
inputs, and so the topological entropy at scale
is maximal. This suggests that there is no direct relation between circuit complexity and topological entropy.
For (2), one generally needs to improve the o(1) bounds by several factors of logarithms, and also to get some quantitative control of f on medium arithmetic progressions (the so-called “Type I sums”). See e.g. Proposition 3 of https://terrytao.wordpress.com/2011/11/21/the-bourgain-sarnak-ziegler-orthogonality-criterion/ .
For (3), I’m not sure I understand the question; any sequence of 0s and 1s for which the 1s only appear with density zero, will automatically have o(1) correlation with Mobius just from the triangle inequality.
For (4), once there is a nontrivial amount of randomness, it should become rather easy to establish low correlation with high probability. For instance, an easy application of Cauchy-Schwarz (or what harmonic analysts would call the TT^* method, or number theorists would call the large sieve) shows that if f is a random +-1 sequence, f’ is an independent copy of f, and one has
, then
(and one could replace
here by any deterministic bounded sequence).
19 October, 2012 at 12:32 pm
Gil Kalai
Dear Terry, regarding (3) I would regard a sequence f(n) of with 01-values to be “Möbius random” if
is
. Regarding (1) you are right, but I would expect that there is a way around it, e.g. perhaps every bounded depth function is approximately deterministic where the approximation does not affect Mobius randomness.
A class of functions that are interesting are based on low degree polynomials over Z/2Z. So f(n) is described in terms of the m digits of n
as
where
and the monomials
are of low degree, namely |S| is “small”. (Here
corresponds to the AND function on the variables in S, so
iff
for every
.) The Walsh functions correspond to all S‘s being of size 1 and the Rudin-Shapiro sequence correspond to a special degree 2 polynomial. Some strong form of Mobius randomness will have nice or even spectacular coputational complexity applications.
19 October, 2012 at 12:50 pm
Terence Tao
My guess is that the Friedlander-Iwaniec type results should give some discorrelation of Mobius in such sets as
or
. Some version of the Piatetski-Shapiro prime number theorem should also give discorrelation of Mobius in sparse sets such as
for some
slightly larger than 1, but these probably won’t be quite as sparse as the Friedlander-Iwaniec examples unless one starts assuming some strong number theoretic conjectures (e.g. Density conjecture, Lindehof hypothesis, RH, etc.). Then of course one could take a random sparse set, but that’s cheating…
22 October, 2012 at 12:55 pm
Harald
Hi Terry,
>My guess is that the Friedlander-Iwaniec type results should give >some discorrelation of Mobius in such sets as \{ n^2+m^4: n,m \in {\bf >N} \} or \{ n^3 + 2m^3: n,m \in {\bf N}\}.
This was shown in a part of my PhD thesis that I never got around to publishing. (Perhaps I should when I have some free time.) I seem to remember that adapting the first result (for n^2+m^4) was easy, whereas adapting Heath-Brown (n^3+2 m^3) and Heath-Brown-Moroz (cubic forms in general) was harder. Perhaps you can give an opinion of how much general interest there is in these results.
20 October, 2012 at 3:29 pm
The Quantum Debate is Over! (and other Updates) | Combinatorics and more
[…] some avenues and eventually Tao solved the problem (and earned the 300 reputation points). Here is a very recent post on Möbius randomness on Terry Tao’s […]
22 October, 2012 at 12:53 am
kodlu
@Gil: I think you must mean
27 March, 2013 at 7:34 pm
An informal version of the Furstenberg correspondence principle | What's new
[…] convention that ). The Möbius function is conjectured to behave quite random in many ways (see this blog post for a discussion of some of these), and so one might think that there is no interesting dynamical […]
24 June, 2014 at 8:16 am
Derandomizing restricted isometries via the Legendre symbol | Short, Fat Matrices
[…] a strong version of the Chowla conjecture, […]
4 February, 2015 at 2:30 pm
Dynamique de la suite de Moebius et conjecture de Chowla (El Abdalaoui, Lemanczyk, de la Rue) | Jérôme Buzzi's
[…] d’accumulation de l’orbite (et donc au sous-décalage ) et des mesures empiriques . La conjecture de Chowla (1965) est équivalente à la convergence des mesures empiriques vers une mesure pour laquelle les […]
17 March, 2015 at 10:16 pm
An averaged form of Chowla’s conjecture | What's new
[…] This paper concerns a weaker variant of the famous conjecture of Chowla (discussed for instance in this previous post) […]
15 April, 2015 at 2:44 am
On the speed of ergodicity of horocycle maps | Disquisitiones Mathematicae
[…] following particular case of Sarnak’s conjecture on the randomness of Möbius function: for all non-periodic and , one […]
18 September, 2015 at 4:46 pm
The logarithmically averaged Chowla and Elliott conjectures for two-point correlations; the Erdos discrepancy problem | What's new
[…] for all , which would have a number of consequences (such as a logarithmically averaged version of Sarnak’s conjecture, as per this blog […]
16 May, 2016 at 10:29 pm
Equivalence of the logarithmically averaged Chowla and Sarnak conjectures | What's new
[…] in turn follows from the logarithmically averaged form of Sarnak’s conjecture (discussed in this previous post), namely […]
5 March, 2017 at 10:38 am
Furstenberg limits of the Liouville function | What's new
[…] any zero-entropy sequence (see this previous blog post for more discussion). Morally speaking, this conjecture should be equivalent to the assertion that […]
30 July, 2020 at 7:12 pm
Higher uniformity of bounded multiplicative functions in short intervals on average | What's new
[…] of Chowla’s conjecture from Sarnak’s conjecture, as discussed for instance in this post) gives enough of Chowla’s conjecture to produce plenty of length sign patterns. If there are […]