Given a function {f: {\bf N} \rightarrow \{-1,+1\}} on the natural numbers taking values in {+1, -1}, one can invoke the Furstenberg correspondence principle to locate a measure preserving system {T \circlearrowright (X, \mu)} – a probability space {(X,\mu)} together with a measure-preserving shift {T: X \rightarrow X} (or equivalently, a measure-preserving {{\bf Z}}-action on {(X,\mu)}) – together with a measurable function (or “observable”) {F: X \rightarrow \{-1,+1\}} that has essentially the same statistics as {f} in the sense that

\displaystyle \lim \inf_{N \rightarrow \infty} \frac{1}{N} \sum_{n=1}^N f(n+h_1) \dots f(n+h_k)

\displaystyle \leq \int_X F(T^{h_1} x) \dots F(T^{h_k} x)\ d\mu(x)

\displaystyle \leq \lim \sup_{N \rightarrow \infty} \frac{1}{N} \sum_{n=1}^N f(n+h_1) \dots f(n+h_k)

for any integers {h_1,\dots,h_k}. In particular, one has

\displaystyle \int_X F(T^{h_1} x) \dots F(T^{h_k} x)\ d\mu(x) = \lim_{N \rightarrow \infty} \frac{1}{N} \sum_{n=1}^N f(n+h_1) \dots f(n+h_k) \ \ \ \ \ (1)


whenever the limit on the right-hand side exists. We will refer to the system {T \circlearrowright (X,\mu)} together with the designated function {F} as a Furstenberg limit of the sequence {f}. These Furstenberg limits capture some, but not all, of the asymptotic behaviour of {f}; roughly speaking, they control the typical “local” behaviour of {f}, involving correlations such as {\frac{1}{N} \sum_{n=1}^N f(n+h_1) \dots f(n+h_k)} in the regime where {h_1,\dots,h_k} are much smaller than {N}. However, the control on error terms here is usually only qualitative at best, and one usually does not obtain non-trivial control on correlations in which the {h_1,\dots,h_k} are allowed to grow at some significant rate with {N} (e.g. like some power {N^\theta} of {N}).

The correspondence principle is discussed in these previous blog posts. One way to establish the principle is by introducing a Banach limit {p\!-\!\lim: \ell^\infty({\bf N}) \rightarrow {\bf R}} that extends the usual limit functional on the subspace of {\ell^\infty({\bf N})} consisting of convergent sequences while still having operator norm one. Such functionals cannot be constructed explicitly, but can be proven to exist (non-constructively and non-uniquely) using the Hahn-Banach theorem; one can also use a non-principal ultrafilter here if desired. One can then seek to construct a system {T \circlearrowright (X,\mu)} and a measurable function {F: X \rightarrow \{-1,+1\}} for which one has the statistics

\displaystyle \int_X F(T^{h_1} x) \dots F(T^{h_k} x)\ d\mu(x) = p\!-\!\lim_{N \rightarrow \infty} \frac{1}{N} \sum_{n=1}^N f(n+h_1) \dots f(n+h_k) \ \ \ \ \ (2)


for all {h_1,\dots,h_k}. One can explicitly construct such a system as follows. One can take {X} to be the Cantor space {\{-1,+1\}^{\bf Z}} with the product {\sigma}-algebra and the shift

\displaystyle T ( (x_n)_{n \in {\bf Z}} ) := (x_{n+1})_{n \in {\bf Z}}

with the function {F: X \rightarrow \{-1,+1\}} being the coordinate function at zero:

\displaystyle F( (x_n)_{n \in {\bf Z}} ) := x_0

(so in particular {F( T^h (x_n)_{n \in {\bf Z}} ) = x_h} for any {h \in {\bf Z}}). The only thing remaining is to construct the invariant measure {\mu}. In order to be consistent with (2), one must have

\displaystyle \mu( \{ (x_n)_{n \in {\bf Z}}: x_{h_j} = \epsilon_j \forall 1 \leq j \leq k \} )

\displaystyle = p\!-\!\lim_{N \rightarrow \infty} \frac{1}{N} \sum_{n=1}^N 1_{f(n+h_1)=\epsilon_1} \dots 1_{f(n+h_k)=\epsilon_k}

for any distinct integers {h_1,\dots,h_k} and signs {\epsilon_1,\dots,\epsilon_k}. One can check that this defines a premeasure on the Boolean algebra of {\{-1,+1\}^{\bf Z}} defined by cylinder sets, and the existence of {\mu} then follows from the Hahn-Kolmogorov extension theorem (or the closely related Kolmogorov extension theorem). One can then check that the correspondence (2) holds, and that {\mu} is translation-invariant; the latter comes from the translation invariance of the (Banach-)Césaro averaging operation {f \mapsto p\!-\!\lim_{N \rightarrow \infty} \frac{1}{N} \sum_{n=1}^N f(n)}. A variant of this construction shows that the Furstenberg limit is unique up to equivalence if and only if all the limits appearing in (1) actually exist.
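To make the cylinder-set formula concrete, here is a minimal Python sketch in which the Banach limit is replaced by a plain finite average up to a cutoff {N}. That replacement is an assumption made purely for illustration; the actual construction needs the generalised limit. The helper names `cylinder_frequency` and `correlation` and the period-3 test sequence are hypothetical and do not come from the construction above. Only non-negative shifts are handled, so that {f(n+h)} stays on the natural numbers.

```python
from itertools import product

def cylinder_frequency(f, shifts, signs, N):
    """Fraction of n in [1, N] with f(n + h_j) == eps_j for every j."""
    hits = sum(
        1
        for n in range(1, N + 1)
        if all(f(n + h) == eps for h, eps in zip(shifts, signs))
    )
    return hits / N

def correlation(f, shifts, N):
    """Finite Cesaro average of f(n + h_1) ... f(n + h_k) over n in [1, N]."""
    total = 0
    for n in range(1, N + 1):
        term = 1
        for h in shifts:
            term *= f(n + h)
        total += term
    return total / N

def period3(n):
    """Toy +-1 sequence (an illustrative assumption): +1 on multiples of 3."""
    return 1 if n % 3 == 0 else -1

if __name__ == "__main__":
    N, shifts = 3000, (0, 1)
    freqs = {
        signs: cylinder_frequency(period3, shifts, signs, N)
        for signs in product((-1, 1), repeat=len(shifts))
    }
    # The frequencies form a probability distribution on sign patterns.
    print(sum(freqs.values()))
    # Integrating x_0 * x_1 against that distribution gives the correlation.
    weighted = sum(s[0] * s[1] * p for s, p in freqs.items())
    print(weighted, correlation(period3, shifts, N))
```

For this periodic toy sequence the finite averages already stabilise, which is the situation in which (1) pins down the measure of each cylinder set.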

One can obtain a slightly tighter correspondence by using a smoother average than the Césaro average. For instance, one can use the logarithmic Césaro averages {\lim_{N \rightarrow \infty} \frac{1}{\log N}\sum_{n=1}^N \frac{f(n)}{n}} in place of the Césaro average {\lim_{N \rightarrow \infty} \frac{1}{N} \sum_{n=1}^N f(n)}, thus one replaces (2) by

\displaystyle \int_X F(T^{h_1} x) \dots F(T^{h_k} x)\ d\mu(x)

\displaystyle = p\!-\!\lim_{N \rightarrow \infty} \frac{1}{\log N} \sum_{n=1}^N \frac{f(n+h_1) \dots f(n+h_k)}{n}.

Whenever the Césaro average of a bounded sequence {f: {\bf N} \rightarrow {\bf R}} exists, then the logarithmic Césaro average exists and is equal to the Césaro average. Thus, a Furstenberg limit constructed using logarithmic Banach-Césaro averaging still obeys (1) for all {h_1,\dots,h_k} when the right-hand side limit exists, but also obeys the more general assertion

\displaystyle \int_X F(T^{h_1} x) \dots F(T^{h_k} x)\ d\mu(x)

\displaystyle = \lim_{N \rightarrow \infty} \frac{1}{\log N} \sum_{n=1}^N \frac{f(n+h_1) \dots f(n+h_k)}{n}

whenever the limit of the right-hand side exists.
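The gain from logarithmic averaging can be seen numerically on a toy example. The sketch below is an illustration only: the sequence `dyadic_sign` (equal to {+1} on dyadic blocks {[2^j, 2^{j+1})} with {j} even and {-1} with {j} odd) and the function names are assumptions of this example, not objects from the discussion above. Its Césaro averages oscillate between roughly {-1/3} and {+1/3} along {N = 2^k - 1}, while its logarithmic averages tend to zero, although only at a rate of about {1/\log N}.

```python
import math

def cesaro_average(f, N):
    """Ordinary Cesaro average (1/N) * sum_{n=1}^N f(n)."""
    return sum(f(n) for n in range(1, N + 1)) / N

def log_average(f, N):
    """Logarithmic average (1/log N) * sum_{n=1}^N f(n)/n."""
    return sum(f(n) / n for n in range(1, N + 1)) / math.log(N)

def dyadic_sign(n):
    """+1 if n lies in [2^j, 2^(j+1)) with j even, -1 if j is odd."""
    j = n.bit_length() - 1
    return 1 if j % 2 == 0 else -1

if __name__ == "__main__":
    for k in (14, 15, 16, 17):
        N = 2**k - 1
        print(k, cesaro_average(dyadic_sign, N), log_average(dyadic_sign, N))
```

So the logarithmic limit can exist even when the Césaro limit does not, which is why the logarithmically averaged assertion is the more general one.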

In a recent paper of Frantzikinakis, the Furstenberg limits of the Liouville function {\lambda} (with logarithmic averaging) were studied. Some (but not all) of the known facts and conjectures about the Liouville function can be interpreted in the Furstenberg limit. For instance, in a recent breakthrough result of Matomaki and Radziwill (discussed previously here), it was shown that the Liouville function exhibited cancellation on short intervals in the sense that

\displaystyle \lim_{H \rightarrow \infty} \limsup_{X \rightarrow \infty} \frac{1}{X} \int_X^{2X} \frac{1}{H} |\sum_{x \leq n \leq x+H} \lambda(n)|\ dx = 0.

In terms of Furstenberg limits of the Liouville function, this assertion is equivalent to the assertion that

\displaystyle \lim_{H \rightarrow \infty} \int_X |\frac{1}{H} \sum_{h=1}^H F(T^h x)|\ d\mu(x) = 0

for all Furstenberg limits {T \circlearrowright (X,\mu), F} of Liouville (including those without logarithmic averaging). Invoking the mean ergodic theorem (discussed in this previous post), this assertion is in turn equivalent to the observable {F} that corresponds to the Liouville function being orthogonal to the invariant factor {L^\infty(X,\mu)^{\bf Z} = \{ g \in L^\infty(X,\mu): g \circ T = g \}} of {X}; equivalently, the first Gowers-Host-Kra seminorm {\|F\|_{U^1(X)}} of {F} (as defined for instance in this previous post) vanishes. The Chowla conjecture, which asserts that

\displaystyle \lim_{N \rightarrow \infty} \frac{1}{N} \sum_{n=1}^N \lambda(n+h_1) \dots \lambda(n+h_k) = 0

for all distinct integers {h_1,\dots,h_k}, is equivalent to the assertion that all the Furstenberg limits of Liouville are equivalent to the Bernoulli system ({\{-1,+1\}^{\bf Z}} with the product measure arising from the uniform distribution on {\{-1,+1\}}, with the shift {T} and observable {F} as before). Similarly, the logarithmically averaged Chowla conjecture

\displaystyle \lim_{N \rightarrow \infty} \frac{1}{\log N} \sum_{n=1}^N \frac{\lambda(n+h_1) \dots \lambda(n+h_k)}{n} = 0

is equivalent to the assertion that all the Furstenberg limits of Liouville with logarithmic averaging are equivalent to the Bernoulli system. Recently, I was able to prove the two-point version

\displaystyle \lim_{N \rightarrow \infty} \frac{1}{\log N} \sum_{n=1}^N \frac{\lambda(n) \lambda(n+h)}{n} = 0 \ \ \ \ \ (3)


of the logarithmically averaged Chowla conjecture, for any non-zero integer {h}; this is equivalent to the perfect strong mixing property

\displaystyle \int_X F(x) F(T^h x)\ d\mu(x) = 0

for any Furstenberg limit of Liouville with logarithmic averaging, and any {h \neq 0}.
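For readers who want to experiment with (3), the following sketch tabulates the Liouville function with a smallest-prime-factor sieve and evaluates the finite logarithmic two-point average. The function names are hypothetical, the shift {h} is assumed non-negative, and a finite cutoff can only suggest the limiting behaviour: the convergence in (3) is not effective at any rate visible here, so the printed numbers should be read as a sanity check rather than as evidence.

```python
import math

def liouville_table(limit):
    """Return a list lam with lam[n] = lambda(n) for 1 <= n <= limit."""
    spf = list(range(limit + 1))  # spf[n] becomes the smallest prime factor of n
    i = 2
    while i * i <= limit:
        if spf[i] == i:  # i is prime
            for j in range(i * i, limit + 1, i):
                if spf[j] == j:
                    spf[j] = i
        i += 1
    lam = [0] * (limit + 1)  # lam[0] is an unused placeholder
    if limit >= 1:
        lam[1] = 1
    for n in range(2, limit + 1):
        # Complete multiplicativity: lambda(p * m) = -lambda(m) for p prime.
        lam[n] = -lam[n // spf[n]]
    return lam

def log_two_point(lam, h, N):
    """(1/log N) * sum_{n=1}^N lambda(n) lambda(n+h) / n; needs len(lam) > N + h."""
    return sum(lam[n] * lam[n + h] / n for n in range(1, N + 1)) / math.log(N)

if __name__ == "__main__":
    N = 10**5
    lam = liouville_table(N + 10)
    for h in (1, 2, 3):
        print(h, log_two_point(lam, h, N))
```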

The situation is more delicate with regards to the Sarnak conjecture, which is equivalent to the assertion that

\displaystyle \lim_{N \rightarrow \infty} \frac{1}{N} \sum_{n=1}^N \lambda(n) f(n) = 0

for any zero-entropy sequence {f: {\bf N} \rightarrow {\bf R}} (see this previous blog post for more discussion). Morally speaking, this conjecture should be equivalent to the assertion that any Furstenberg limit of Liouville is disjoint from any zero entropy system, but I was not able to formally establish an implication in either direction due to some technical issues regarding the fact that the Furstenberg limit does not directly control long-range correlations, only short-range ones. (There are however ergodic theoretic interpretations of the Sarnak conjecture that involve the notion of generic points; see this paper of El Abdalaoui, Lemańczyk, and de la Rue.) But the situation is currently better with the logarithmically averaged Sarnak conjecture

\displaystyle \lim_{N \rightarrow \infty} \frac{1}{\log N} \sum_{n=1}^N \frac{\lambda(n) f(n)}{n} = 0,

as I was able to show that this conjecture was equivalent to the logarithmically averaged Chowla conjecture, and hence to all Furstenberg limits of Liouville with logarithmic averaging being Bernoulli; I also showed the conjecture was equivalent to local Gowers uniformity of the Liouville function, which is in turn equivalent to the function {F} having all Gowers-Host-Kra seminorms vanishing in every Furstenberg limit with logarithmic averaging. In this recent paper of Frantzikinakis, this analysis was taken further, showing that the logarithmically averaged Chowla and Sarnak conjectures were in fact equivalent to the much milder seeming assertion that all Furstenberg limits with logarithmic averaging were ergodic.

Actually, the logarithmically averaged Furstenberg limits have more structure than just a {{\bf Z}}-action on a measure preserving system {(X,\mu)} with a single observable {F}. Let {Aff_+({\bf Z})} denote the semigroup of affine maps {n \mapsto an+b} on the integers with {a,b \in {\bf Z}} and {a} positive. Also, let {\hat {\bf Z}} denote the profinite integers (the inverse limit of the cyclic groups {{\bf Z}/q{\bf Z}}). Observe that {Aff_+({\bf Z})} acts on {\hat {\bf Z}} by taking the inverse limit of the obvious actions of {Aff_+({\bf Z})} on {{\bf Z}/q{\bf Z}}.
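As a small illustration of the semigroup structure, here is a Python sketch of {Aff_+({\bf Z})} together with its induced action on residue classes. The class name `Affine` and its methods are hypothetical names chosen for this example. The final check in the demo is the compatibility {\phi(n) \hbox{ mod } q = \phi(n \hbox{ mod } q) \hbox{ mod } q}, which is what allows the actions on the {{\bf Z}/q{\bf Z}} to be assembled into an action on the inverse limit {\hat {\bf Z}}.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Affine:
    """The affine map n -> a*n + b with a > 0, an element of Aff_+(Z)."""
    a: int
    b: int

    def __post_init__(self):
        if self.a <= 0:
            raise ValueError("the dilation factor a must be positive")

    def __call__(self, n):
        return self.a * n + self.b

    def compose(self, other):
        """Return the map n -> self(other(n))."""
        return Affine(self.a * other.a, self.a * other.b + self.b)

    def act_mod(self, r, q):
        """Induced action on the residue class r mod q."""
        return (self.a * r + self.b) % q

if __name__ == "__main__":
    phi, psi = Affine(3, 1), Affine(2, 5)
    print(phi.compose(psi), psi.compose(phi))  # composition is not commutative
    # Reduction mod q intertwines the action on Z with the action on Z/qZ.
    print(all(phi(n) % 12 == phi.act_mod(n % 12, 12) for n in range(-50, 50)))
```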

Proposition 1 (Enriched logarithmically averaged Furstenberg limit of Liouville) Let {p\!-\!\lim} be a Banach limit. Then there exists a probability space {(X,\mu)} with an action {\phi \mapsto T^\phi} of the affine semigroup {Aff_+({\bf Z})}, as well as measurable functions {F: X \rightarrow \{-1,+1\}} and {M: X \rightarrow \hat {\bf Z}}, with the following properties:

  • (i) (Affine Furstenberg limit) For any {\phi_1,\dots,\phi_k \in Aff_+({\bf Z})}, and any congruence class {a\ (q)}, one has

    \displaystyle p\!-\!\lim_{N \rightarrow \infty} \frac{1}{\log N} \sum_{n=1}^N \frac{\lambda(\phi_1(n)) \dots \lambda(\phi_k(n)) 1_{n = a\ (q)}}{n}

    \displaystyle = \int_X F( T^{\phi_1}(x) ) \dots F( T^{\phi_k}(x) ) 1_{M(x) = a\ (q)}\ d\mu(x).

  • (ii) (Equivariance of {M}) For any {\phi \in Aff_+({\bf Z})}, one has

    \displaystyle M( T^\phi(x) ) = \phi( M(x) )

    for {\mu}-almost every {x \in X}.

  • (iii) (Multiplicativity at fixed primes) For any prime {p}, one has

    \displaystyle F( T^{p\cdot} x ) = - F(x)

    for {\mu}-almost every {x \in X}, where {p \cdot \in Aff_+({\bf Z})} is the dilation map {n \mapsto pn}.

  • (iv) (Measure pushforward) If {\phi \in Aff_+({\bf Z})} is of the form {\phi(n) = an+b} and {S_\phi \subset X} is the set {S_\phi = \{ x \in X: M(x) \in \phi(\hat {\bf Z}) \}}, then the pushforward {T^\phi_* \mu} of {\mu} by {T^\phi} is equal to {a \mu\downharpoonright_{S_\phi}}, that is to say one has

    \displaystyle \mu( (T^\phi)^{-1}(E) ) = a \mu( E \cap S_\phi )

    for every measurable {E \subset X}.

Note that {{\bf Z}} can be viewed as the subgroup of {Aff_+({\bf Z})} consisting of the translations {n \mapsto n + b}. If one only keeps the {{\bf Z}}-portion of the {Aff_+({\bf Z})} action and forgets the rest (as well as the function {M}) then the action becomes measure-preserving, and we recover an ordinary Furstenberg limit with logarithmic averaging. However, the additional structure here can be quite useful; for instance, one can transfer the proof of (3) to this setting, which we sketch below the fold, after proving the proposition.

The observable {M}, roughly speaking, means that points {x} in the Furstenberg limit {X} constructed by this proposition are still “virtual integers” in the sense that one can meaningfully compute the residue class of {x} modulo any natural number modulus {q}, by first applying {M} and then reducing mod {q}. The action of {Aff_+({\bf Z})} means that one can also meaningfully multiply {x} by any natural number, and translate it by any integer. As with other applications of the correspondence principle, the main advantage of moving to this more “virtual” setting is that one now acquires a probability measure {\mu}, so that the tools of ergodic theory can be readily applied.

— 1. Proof of proposition —

We adapt the previous construction of the Furstenberg limit. The space {X} will no longer be the Cantor space {\{-1,+1\}^{\bf Z}}, but will instead be taken to be the space

\displaystyle X := \{-1,+1\}^{Aff_+({\bf Z})} \times \hat {\bf Z}.

The action of {Aff_+({\bf Z})} here is given by

\displaystyle T^\phi ( (x_\psi)_{\psi \in Aff_+({\bf Z})}, m ) := ( (x_{\psi \phi})_{\psi \in Aff_+({\bf Z})}, \phi(m) );

this can easily be seen to be a semigroup action. The observables {F: X \rightarrow \{-1,+1\}} and {M: X \rightarrow \hat {\bf Z}} are defined as

\displaystyle F( (x_\psi)_{\psi \in Aff_+({\bf Z})}, m ) := x_{id}


\displaystyle M( (x_\psi)_{\psi \in Aff_+({\bf Z})}, m ) := m

where {id} is the identity element of {Aff_+({\bf Z})}. Property (ii) is now clear. Now we have to construct the measure {\mu}. In order to be consistent with property (i), the measure of the set

\displaystyle \{ ((x_\phi)_{\phi \in Aff_+({\bf Z})}, m): x_{\phi_j} = \epsilon_j \forall 1 \leq j \leq k; m = a\ (q) \} \ \ \ \ \ (4)


for any distinct {\phi_1,\dots,\phi_k \in Aff_+({\bf Z})}, signs {\epsilon_1,\dots,\epsilon_k \in \{-1,+1\}}, and congruence class {a\ (q)}, must be equal to

\displaystyle p\!-\!\lim_{N \rightarrow \infty} \frac{1}{\log N} \sum_{n=1}^N \frac{1_{\lambda(\phi_j(n)) = \epsilon_j \forall 1 \leq j \leq k; n = a\ (q)}}{n}.

One can check that this requirement uniquely defines a premeasure on the Boolean algebra on {X} generated by the sets (4), and {\mu} can then be constructed from the Hahn-Kolmogorov theorem as before. Property (i) follows from construction. Specialising to the case {\phi_1(n) = n}, {\phi_2(n) = pn} for a prime {p} we have

\displaystyle p\!-\!\lim_{N \rightarrow \infty} \frac{1}{\log N} \sum_{n=1}^N \frac{\lambda(n) \lambda(pn)}{n}

\displaystyle = \int_X F( x ) F( T^{p \cdot}(x) ) \ d\mu(x);

the left-hand side is {-1}, which gives (iii).

It remains to establish (iv). It will suffice to do so for sets {E} of the form (4). The claim then follows from the dilation invariance property

\displaystyle p\!-\!\lim_{N \rightarrow \infty} \frac{1}{\log N} \sum_{n=1}^N \frac{f(an+b)}{n} = a p\!-\!\lim_{N \rightarrow \infty} \frac{1}{\log N} \sum_{n=1}^N \frac{f(n)}{n} 1_{n = b\ (a)}

for any bounded function {f}, which is easily verified (here is where it is essential that we are using logarithmically averaged Césaro means rather than ordinary Césaro means).
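The role of logarithmic averaging in this identity can be checked numerically. The sketch below is a hedged illustration with hypothetical names (`dilation_gap`, and the toy sequence `dyadic_sign`, which is {+1} on dyadic blocks {[2^j,2^{j+1})} with {j} even and {-1} otherwise). It computes the difference between the two sides at a finite cutoff {N}, once with weights {1/n} and normalisation {\log N}, and once with the ordinary Césaro weights. For the dilation {n \mapsto 2n} the logarithmic gap is of size about {\log 2/\log N}, whereas along {N = 2^k-1} the Césaro gap stays close to {2/3} in absolute value.

```python
import math

def dyadic_sign(n):
    """+1 if n lies in [2^j, 2^(j+1)) with j even, -1 if j is odd."""
    return 1 if (n.bit_length() - 1) % 2 == 0 else -1

def dilation_gap(f, a, b, N, logarithmic=True):
    """Left side minus right side of the dilation identity at cutoff N."""
    if logarithmic:
        weight, norm = (lambda n: 1.0 / n), math.log(N)
    else:
        weight, norm = (lambda n: 1.0), float(N)
    # Left side: average of f(a*n + b) over n in [1, N].
    left = sum(f(a * n + b) * weight(n) for n in range(1, N + 1)) / norm
    # Right side: a times the average of f(n) restricted to n = b mod a.
    right = a * sum(
        f(n) * weight(n) for n in range(1, N + 1) if (n - b) % a == 0
    ) / norm
    return left - right

if __name__ == "__main__":
    for k in (12, 14, 16):
        N = 2**k - 1
        print(
            k,
            dilation_gap(dyadic_sign, 2, 0, N, logarithmic=True),
            dilation_gap(dyadic_sign, 2, 0, N, logarithmic=False),
        )
```

The underlying reason is that the substitution {m = an+b} changes the weight {1/n} only by the factor {a} up to a summable error, and changes the range of summation only by a stretch of logarithmic length {O(\log a)}, which is negligible against {\log N} but not against {N}.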

Remark 2 One can embed this {Aff_+({\bf Z})}-system {X} as a subsystem of a {Aff_+({\bf Q})}-system {Aff_+({\bf Q}) \otimes_{Aff_+({\bf Z})} X}, however this larger system is only {\sigma}-finite rather than a probability space, and also the observable {M} now takes values in the larger space {{\bf Q} \otimes_{\bf Z} \hat {\bf Z}}. This recovers a group action rather than a semigroup action, but I am not sure if the added complexity of infinite measure is worth it.

— 2. Two-point logarithmic Chowla —

We now sketch how the proof of (3) in this paper can be translated to the ergodic theory setting. For sake of notation let us just prove (3) when {h=1}. We will assume familiarity with ergodic theory concepts in this sketch. By taking a suitable Banach limit, it will suffice to establish that

\displaystyle \int_X F(x) F( T^{\cdot+1} x)\ d\mu(x) = 0

for any Furstenberg limit produced by Proposition 1, where {\cdot+h} denotes the operation of translation by {h}. By property (iii) of that proposition, we can write the left-hand side as

\displaystyle \int_X F(T^{p\cdot} x) F( T^{p\cdot+p} x)\ d\mu(x)

for any prime {p}, and then by property (iv) we can write this in turn as

\displaystyle \int_X F(x) F( T^{\cdot+p} x) p 1_{M(x) = 0\ (p)}\ d\mu(x).

Averaging, we thus have

\displaystyle \int_X F(x) F( T^{\cdot+1} x)\ d\mu(x) = \frac{1}{|{\mathcal P}_P|} \sum_{p \in {\mathcal P}_P} \int_X F(x) F( T^{\cdot+p} x) p 1_{M(x) = 0\ (p)}\ d\mu(x)

for any {P>1}, where {{\mathcal P}_P} denotes the primes between {P/2} and {P}.

On the other hand, the Matomaki-Radziwill theorem (twisted by Dirichlet characters) tells us that for any modulus {q \geq 1}, one has

\displaystyle \lim_{H \rightarrow \infty} \lim\sup_{N \rightarrow \infty} \frac{1}{\log N} \sum_{n \leq N} \frac{1}{n} |\frac{1}{H} \sum_{h=1}^H \lambda(n+qh)| = 0

which on passing to the Furstenberg limit gives

\displaystyle \lim_{H \rightarrow \infty} \int_X |\frac{1}{H} \sum_{h=1}^H F( T^{\cdot+qh} x)|\ d\mu(x) = 0.

Applying the mean ergodic theorem, we conclude that {F} is orthogonal to the profinite factor of the {{\bf Z}}-action, by which we mean the factor generated by the functions that are periodic ({T^{\cdot+q}}-invariant for some {q \geq 1}). One can show from Fourier analysis that the profinite factor is characteristic for averaging along primes, and in particular that

\displaystyle \frac{1}{|{\mathcal P}_P|} \sum_{p \in {\mathcal P}_P} \int_X F(x) F( T^{\cdot + p} x)\ d\mu \rightarrow 0

as {P \rightarrow \infty}. (This is not too difficult given the usual Vinogradov estimates for exponential sums over primes, but I don’t know of a good reference for this fact. This paper of Frantzikinakis, Host, and Kra establishes the analogous claim that the Kronecker factor is characteristic for triple averages {\frac{1}{|{\mathcal P}_P|} \sum_{p \in {\mathcal P}_P} \int_X F(x) F( T^{\cdot+p} x) F( T^{\cdot+2p} x)\ d\mu}, and their argument would also apply here, but this is something of an overkill.) Thus, if we define the quantities

\displaystyle Q_P := \frac{1}{|{\mathcal P}_P|} \sum_{p \in {\mathcal P}_P} \int_X F(x) F( T^{\cdot + p} x) (p 1_{M(x) = 0\ (p)}-1)\ d\mu(x)

it will suffice to show that {\liminf_{P \rightarrow \infty} |Q_P| = 0}.

Suppose for contradiction that there is some {\varepsilon > 0} such that {|Q_P| \geq \varepsilon} for all sufficiently large {P}. We can write {Q_P} as an expectation

\displaystyle Q_P = {\bf E} F_P( X_P, Y_P )

where {X_P} is the {\{-1,+1\}^P}-valued random variable

\displaystyle X_P := ( F( T^{\cdot + k} x ) )_{0 \leq k < P}

with {x} drawn from {X} with law {\mu}, {Y_P} is the {\prod_{p \in {\mathcal P}_P} {\bf Z}/p{\bf Z}}-valued random variable

\displaystyle Y_P := ( M(x)\ (p) )_{p \in {\mathcal P}_P}

with {x} as before, and {F_P} is the function

\displaystyle F_P( (\epsilon_k)_{0 \leq k < P}, (a_p)_{p \in {\mathcal P}_P} ) := \frac{1}{|{\mathcal P}_P|} \sum_{p \in {\mathcal P}_P} \epsilon_0 \epsilon_p (p 1_{a_p = 0\ (p)} - 1).

As {|Q_P| \geq \varepsilon}, we have

\displaystyle |F_P(X_P, Y_P)| \geq \varepsilon/2

with probability at least {\varepsilon/2}. On the other hand, an application of Hoeffding’s inequality and the prime number theorem shows that if {U_P} is drawn uniformly from {\prod_{p \in {\mathcal P}_P} {\bf Z}/p{\bf Z}} and independently of {X_P}, that one has the concentration of measure bound

\displaystyle \mathop{\bf P}( |F_P(X_P, U_P)| > \varepsilon/2 ) \leq 2 \exp( - c_\varepsilon P / \log P )

for some {c_\varepsilon > 0}. Using the Pinsker-type inequality from this previous blog post, we conclude the lower bound

\displaystyle I( X_P : Y_P ) \gg_\varepsilon \frac{P}{\log P}

on the mutual information between {X_P} and {Y_P}. Using Shannon entropy inequalities as in my paper, this implies the entropy decrement

\displaystyle \frac{H(X_{kP})}{kP} \leq \frac{H(X_P)}{P} - \frac{c_\varepsilon}{\log P} + O( \frac{1}{k} )

for any natural number {k}, which on iterating (and using the divergence of {\sum_{j=1}^\infty \frac{1}{j \log j}}) shows that {\frac{H(X_P)}{P}} becomes negative for sufficiently large {P}, which is absurd. (See also this previous blog post for a sketch of a slightly different way to conclude the argument from entropy inequalities.)