Pointwise ergodic theorems for non-conventional bilinear polynomial averages

3 August, 2020 in math.CA, math.DS, paper | Tags: adeles, Ben Krause, Mariusz Mirek, nonconventional ergodic averages, pointwise ergodic theorem | by Terence Tao

Ben Krause, Mariusz Mirek, and I have uploaded to the arXiv our paper Pointwise ergodic theorems for non-conventional bilinear polynomial averages. This paper is a contribution to the decades-long program of extending the classical ergodic theorems to “non-conventional” ergodic averages. Here, the focus is on pointwise convergence theorems, and in particular looking for extensions of the pointwise ergodic theorem of Birkhoff:

Theorem 1 (Birkhoff ergodic theorem) Let ${(X,\mu,T)}$ be a measure-preserving system (by which we mean ${(X,\mu)}$ is a ${\sigma}$ -finite measure space, and ${T: X \rightarrow X}$ is invertible and measure-preserving), and let ${f \in L^p(X)}$ for any ${1 \leq p < \infty}$ . Then the averages ${\frac{1}{N} \sum_{n=1}^N f(T^n x)}$ converge pointwise for ${\mu}$ -almost every ${x \in X}$ .
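As a purely numerical illustration of Theorem 1 (not taken from the paper), the following sketch computes the averages ${\frac{1}{N} \sum_{n=1}^N f(T^n x)}$ for one concrete system. The choices here are assumptions made for illustration only: ${X = [0,1)}$ with Lebesgue measure, ${T}$ the rotation ${x \mapsto x + \alpha \hbox{ mod } 1}$ with ${\alpha = \sqrt{2}-1}$, and ${f(x) = \cos(2\pi x)}$, whose integral is zero. The helper name `birkhoff_average` is hypothetical.

```python
import math

def birkhoff_average(f, T, x, N):
    """Return (1/N) * sum_{n=1}^N f(T^n x), computed by iterating T."""
    total = 0.0
    y = x
    for _ in range(N):
        y = T(y)          # y is now T^n x for n = 1, 2, ..., N
        total += f(y)
    return total / N

# Illustrative assumptions: an irrational rotation of the circle [0, 1)
# and a mean-zero observable.
alpha = math.sqrt(2) - 1
rotate = lambda x: (x + alpha) % 1.0
f = lambda x: math.cos(2 * math.pi * x)

# Averages along increasing N, starting from the (arbitrary) point x = 0.3.
averages = [birkhoff_average(f, rotate, 0.3, N) for N in (10, 100, 1000, 10000)]
for N, avg in zip((10, 100, 1000, 10000), averages):
    print(N, avg)
```

For this particular system the averages should drift towards the space average ${\int_0^1 f = 0}$; of course a finite computation only suggests, and cannot prove, almost everywhere convergence.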

Pointwise ergodic theorems have an inherently harmonic analysis content to them, as they are closely tied to maximal inequalities. For instance, the Birkhoff ergodic theorem is closely tied to the Hardy-Littlewood maximal inequality.

The above theorem was generalized by Bourgain (conceding the endpoint ${p=1}$ , where pointwise almost everywhere convergence is now known to fail) to polynomial averages:

Theorem 2 (Pointwise ergodic theorem for polynomial averages) Let ${(X,\mu,T)}$ be a measure-preserving system, and let ${f \in L^p(X)}$ for any ${1 < p < \infty}$ . Let ${P \in {\bf Z}[{\mathrm n}]}$ be a polynomial with integer coefficients. Then the averages ${\frac{1}{N} \sum_{n=1}^N f(T^{P(n)} x)}$ converge pointwise for ${\mu}$ -almost every ${x \in X}$ .

For bilinear averages, we have a separate 1990 result of Bourgain (for ${L^\infty}$ functions), extended to other ${L^p}$ spaces by Lacey, and with an alternate proof given by Demeter:

Theorem 3 (Pointwise ergodic theorem for two linear polynomials) Let ${(X,\mu,T)}$ be a measure-preserving system with finite measure, and let ${f \in L^{p_1}(X)}$ , ${g \in L^{p_2}(X)}$ for some ${1 < p_1,p_2 \leq \infty}$ with ${\frac{1}{p_1}+\frac{1}{p_2} < \frac{3}{2}}$ . Then for any integers ${a,b}$ , the averages ${\frac{1}{N} \sum_{n=1}^N f(T^{an} x) g(T^{bn} x)}$ converge pointwise almost everywhere.

It has been an open question for some time (see e.g., Problem 11 of this survey of Frantzikinakis) to extend this result to other bilinear ergodic averages. In our paper we are able to achieve this in the partially linear case:

Theorem 4 (Pointwise ergodic theorem for one linear and one nonlinear polynomial) Let ${(X,\mu,T)}$ be a measure-preserving system, and let ${f \in L^{p_1}(X)}$ , ${g \in L^{p_2}(X)}$ for some ${1 < p_1,p_2 < \infty}$ with ${\frac{1}{p_1}+\frac{1}{p_2} \leq 1}$ . Then for any polynomial ${P \in {\bf Z}[{\mathrm n}]}$ of degree ${d \geq 2}$ , the averages ${\frac{1}{N} \sum_{n=1}^N f(T^{n} x) g(T^{P(n)} x)}$ converge pointwise almost everywhere.

We actually prove a bit more than this, namely a maximal function estimate and a variational estimate, together with some additional estimates that “break duality” by applying in certain ranges with ${\frac{1}{p_1}+\frac{1}{p_2}>1}$ , but we will not discuss these extensions here. A good model case to keep in mind is when ${p_1=p_2=2}$ and ${P(n) = n^2}$ (which is the case we started with). We note that norm convergence for these averages was established much earlier by Furstenberg and Weiss (in the ${d=2}$ case at least), and in fact norm convergence for arbitrary polynomial averages is now known thanks to the work of Host-Kra, Leibman, and Walsh.

Our proof of Theorem 4 is much closer in spirit to Theorem 2 than to Theorem 3. The property shared by the averages in Theorems 2 and 4 is that they have “true complexity zero”, in the sense that they can only be large if the functions ${f,g}$ involved are “major arc” or “profinite”, in that they behave periodically over very long intervals (or like a linear combination of such periodic functions). In contrast, the averages in Theorem 3 have “true complexity one”, in the sense that they can also be large if ${f,g}$ are “almost periodic” (a linear combination of eigenfunctions, or plane waves), and as such all proofs of the latter theorem have relied (either explicitly or implicitly) on some form of time-frequency analysis. In principle, the true complexity zero property reduces one to studying the behaviour of averages on major arcs. However, until recently the available estimates to quantify this true complexity zero property were not strong enough to achieve a good reduction of this form, and even once one was in the major arc setting the bilinear averages in Theorem 4 were still quite complicated, exhibiting a mixture of both continuous and arithmetic aspects, both of which are genuinely bilinear in nature.

After applying standard reductions such as the Calderón transference principle, the key task is to establish a suitably “scale-invariant” maximal (or variational) inequality on the integer shift system (in which ${X = {\bf Z}}$ with counting measure, and ${T(n) = n-1}$ ). A model problem is to establish the maximal inequality

$\displaystyle \| \sup_N |A_N(f,g)| \|_{\ell^1({\bf Z})} \lesssim \|f\|_{\ell^2({\bf Z})}\|g\|_{\ell^2({\bf Z})} \ \ \ \ \ (1)$

where ${N}$ ranges over powers of two and ${A_N}$ is the bilinear operator

$\displaystyle A_N(f,g)(x) := \frac{1}{N} \sum_{n=1}^N f(x-n) g(x-n^2).$
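To make the operator ${A_N}$ concrete, here is a minimal sketch that evaluates it on finitely supported sequences. The representation of ${f,g}$ as dictionaries from integers to values, and the helper names `bilinear_average`, `maximal_function` and `lp_norm`, are assumptions of this sketch rather than anything from the paper.

```python
def bilinear_average(f, g, N):
    """A_N(f,g)(x) = (1/N) sum_{n=1}^N f(x-n) g(x-n^2).

    f and g are dicts {integer: value}, understood to vanish elsewhere.
    Returns a dict x -> A_N(f,g)(x) over the x where some term is nonzero.
    """
    out = {}
    for n in range(1, N + 1):
        for y, fy in f.items():
            x = y + n                    # chosen so that f(x - n) = f(y)
            gz = g.get(x - n * n, 0.0)   # g(x - n^2)
            if gz != 0.0:
                out[x] = out.get(x, 0.0) + fy * gz / N
    return out

def maximal_function(f, g, scales):
    """Pointwise supremum of |A_N(f,g)(x)| over the given scales N."""
    out = {}
    for N in scales:
        for x, v in bilinear_average(f, g, N).items():
            out[x] = max(out.get(x, 0.0), abs(v))
    return out

def lp_norm(u, p):
    """l^p norm of a finitely supported sequence stored as a dict."""
    return sum(abs(v) ** p for v in u.values()) ** (1.0 / p)

# Illustrative data: indicator functions of two intervals.
f = {x: 1.0 for x in range(0, 50)}
g = {x: 1.0 for x in range(-30, 60)}
dyadic_scales = [1, 2, 4, 8]
lhs = lp_norm(maximal_function(f, g, dyadic_scales), 1)
rhs = lp_norm(f, 2) * lp_norm(g, 2)
print(lhs, rhs)
```

The printed pair compares the two sides of (1) for one choice of data and finitely many scales; it says nothing about the implied constant in general.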

The single scale estimate

$\displaystyle \| A_N(f,g) \|_{\ell^1({\bf Z})} \lesssim \|f\|_{\ell^2({\bf Z})}\|g\|_{\ell^2({\bf Z})}$

or equivalently (by duality)

$\displaystyle \frac{1}{N} \sum_{n=1}^N \sum_{x \in {\bf Z}} h(x) f(x-n) g(x-n^2) \lesssim \|f\|_{\ell^2({\bf Z})}\|g\|_{\ell^2({\bf Z})} \|h\|_{\ell^\infty({\bf Z})} \ \ \ \ \ (2)$

is immediate from Hölder’s inequality; the difficulty is how to take the supremum over scales ${N}$ .
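To spell out the single scale estimate: for each fixed ${n}$ , the Cauchy-Schwarz inequality (the case of Hölder’s inequality with both exponents equal to ${2}$ ) and the translation invariance of counting measure on ${{\bf Z}}$ give

$\displaystyle \sum_{x \in {\bf Z}} |f(x-n)| |g(x-n^2)| \leq \Big( \sum_{x \in {\bf Z}} |f(x-n)|^2 \Big)^{1/2} \Big( \sum_{x \in {\bf Z}} |g(x-n^2)|^2 \Big)^{1/2} = \|f\|_{\ell^2({\bf Z})}\|g\|_{\ell^2({\bf Z})},$

and averaging over ${1 \leq n \leq N}$ using the triangle inequality yields the bound on ${\| A_N(f,g) \|_{\ell^1({\bf Z})}}$ with constant ${1}$ . The form (2) follows in the same way after bounding ${|h(x)|}$ by ${\|h\|_{\ell^\infty({\bf Z})}}$ .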

The first step is to understand when the single-scale estimate (2) can come close to equality. A key example to keep in mind is when ${f(x) = e(ax/q) F(x)}$ , ${g(x) = e(bx/q) G(x)}$ , ${h(x) = e(cx/q) H(x)}$ where ${q=O(1)}$ is a small modulus, ${a,b,c}$ are such that ${a+b+c=0 \hbox{ mod } q}$ , ${G}$ is a smooth cutoff to an interval ${I}$ of length ${O(N^2)}$ , and ${F=H}$ is also supported on ${I}$ and behaves like a constant on intervals of length ${O(N)}$ . Then one can check that (barring some unusual cancellation) (2) is basically sharp for this example. A remarkable result of Peluse and Prendiville (generalised to arbitrary nonlinear polynomials ${P}$ by Peluse) asserts, roughly speaking, that this example is basically the only way in which (2) can be saturated, at least when ${f,g,h}$ are supported on a common interval ${I}$ of length ${O(N^2)}$ and are normalised in ${\ell^\infty}$ rather than ${\ell^2}$ . (Strictly speaking, the above paper of Peluse and Prendiville only says something like this regarding the ${f,h}$ factors; the corresponding statement for ${g}$ was established in a subsequent paper of Peluse and Prendiville.) The argument requires tools from additive combinatorics such as the Gowers uniformity norms, and hinges in particular on the “degree lowering argument” of Peluse and Prendiville, which I discussed in this previous blog post. Crucially for our application, the estimates are very quantitative, with all bounds being polynomial in the ratio between the left and right hand sides of (2) (or more precisely, the ${\ell^\infty}$ -normalized version of (2)).
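To unpack the key example above: writing ${e(\theta) := e^{2\pi i \theta}}$ , the condition ${a+b+c=0 \hbox{ mod } q}$ makes the ${x}$ -dependence of the phases cancel, so that ${h(x) f(x-n) g(x-n^2) = e(-(an+bn^2)/q) H(x) F(x-n) G(x-n^2)}$ . Since ${F,G,H}$ vary slowly at the relevant scales, the size of (2) is then governed by the average of ${e(-(an+bn^2)/q)}$ over ${1 \leq n \leq N}$ , which for ${N}$ much larger than ${q}$ is close to a normalised complete exponential sum modulo ${q}$ . The sketch below computes both quantities; the helper names `phase_average` and `complete_sum` are hypothetical, and the sample values of ${a,b,q}$ are chosen only for illustration.

```python
import cmath

def e(theta):
    """The standard character e(theta) = exp(2 pi i theta)."""
    return cmath.exp(2j * cmath.pi * theta)

def phase_average(a, b, q, N):
    """(1/N) sum_{n=1}^N e(-(a n + b n^2)/q).

    The residue is reduced mod q before dividing, which does not change
    the value of e(.) and keeps the floating point arguments small.
    """
    return sum(e(-((a * n + b * n * n) % q) / q) for n in range(1, N + 1)) / N

def complete_sum(a, b, q):
    """(1/q) sum_{r mod q} e(-(a r + b r^2)/q)."""
    return sum(e(-((a * r + b * r * r) % q) / q) for r in range(q)) / q

# Sample parameters (assumptions for illustration): q = 5, a = b = 1.
print(abs(phase_average(1, 1, 5, 1000)), abs(complete_sum(1, 1, 5)))

# An instance of cancellation: the complete sum vanishes for q = 2, a = 1, b = 0.
print(abs(complete_sum(1, 0, 2)))
```

When ${q}$ is an odd prime not dividing ${b}$ , the complete sum is a quadratic Gauss sum of magnitude ${q^{-1/2}}$ after normalisation, which is bounded away from zero for ${q = O(1)}$ ; this is one way to see why the example nearly saturates (2) unless the exponential sum happens to cancel.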

For our applications we had to extend the ${\ell^\infty}$ inverse theory of Peluse and Prendiville to an ${\ell^2}$ theory. This turned out to require a certain amount of “sleight of hand”. Firstly, one can dualise the theorem of Peluse and Prendiville to show that the “dual function”

$\displaystyle A^*_N(h,g)(x) = \frac{1}{N} \sum_{n=1}^N h(x+n) g(x+n-n^2)$

can be well approximated in ${\ell^1}$ by a function that has Fourier support on “major arcs” if ${g,h}$ enjoy ${\ell^\infty}$ control. To get the required extension to ${\ell^2}$ in the ${f}$ aspect one has to improve the control on the error from ${\ell^1}$ to ${\ell^2}$ ; this can be done by some interpolation theory combined with the useful Fourier multiplier theory of Ionescu and Wainger on major arcs. Then, by further interpolation using recent ${\ell^p({\bf Z})}$ improving estimates of Han, Kovac, Lacey, Madrid, and Yang for linear averages such as ${x \mapsto \frac{1}{N} \sum_{n=1}^N g(x+n-n^2)}$ , one can relax the ${\ell^\infty}$ hypothesis on ${g}$ to an ${\ell^2}$ hypothesis, and then by undoing the duality one obtains a good inverse theorem for (2) for the function ${f}$ ; a modification of the arguments also gives something similar for ${g}$ .
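The duality being used here is just a change of variables: substituting ${y = x-n}$ in the left-hand side of (2) gives

$\displaystyle \sum_{x \in {\bf Z}} h(x) A_N(f,g)(x) = \frac{1}{N} \sum_{n=1}^N \sum_{y \in {\bf Z}} f(y) h(y+n) g(y+n-n^2) = \sum_{y \in {\bf Z}} f(y) A^*_N(h,g)(y),$

so that an inverse theorem for (2) in the ${f}$ aspect amounts to a structural statement about ${A^*_N(h,g)}$ .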

Using these inverse theorems (and the Ionescu-Wainger multiplier theory) one still has to understand the “major arc” portion of (1); a model case arises when ${f,g}$ are supported near rational numbers ${a/q}$ with ${q \sim 2^l}$ for some moderately large ${l}$ . The inverse theory gives good control (with an exponential decay in ${l}$ ) on individual scales ${N}$ , and one can leverage this with a Rademacher-Menshov type argument (see e.g., this blog post) and some closer analysis of the bilinear Fourier symbol of ${A_N}$ to eventually handle all “small” scales, with ${N}$ ranging up to say ${2^{2^u}}$ where ${u = C 2^{\rho l}}$ for some small constant ${\rho}$ and large constant ${C}$ . For the “large” scales, it becomes feasible to place all the major arcs simultaneously under a single common denominator ${Q}$ , and then a quantitative version of the Shannon sampling theorem allows one to transfer the problem from the integers ${{\bf Z}}$ to the locally compact abelian group ${{\bf R} \times {\bf Z}/Q{\bf Z}}$ . Actually it was conceptually clearer for us to work instead with the adelic integers ${{\mathbf A}_{\bf Z} ={\bf R} \times \hat {\bf Z}}$ , which is the inverse limit of the ${{\bf R} \times {\bf Z}/Q{\bf Z}}$ . Once one transfers to the adelic integers, the bilinear operators involved split up as tensor products of the “continuous” bilinear operator

$\displaystyle A_{N,{\bf R}}(f,g)(x) := \frac{1}{N} \int_0^N f(x-t) g(x-t^2)\ dt$

on ${{\bf R}}$ , and the “arithmetic” bilinear operator

$\displaystyle A_{\hat {\bf Z}}(f,g)(x) := \int_{\hat {\bf Z}} f(x-y) g(x-y^2) d\mu_{\hat {\bf Z}}(y)$

on the profinite integers ${\hat {\bf Z}}$ , equipped with probability Haar measure ${\mu_{\hat {\bf Z}}}$ . After a number of standard manipulations (interpolation, Fubini’s theorem, Hölder’s inequality, variational inequalities, etc.) the task of estimating this tensor product boils down to establishing an ${L^q}$ improving estimate

$\displaystyle \| A_{\hat {\bf Z}}(f,g) \|_{L^q(\hat {\bf Z})} \lesssim \|f\|_{L^2(\hat {\bf Z})} \|g\|_{L^2(\hat {\bf Z})}$

for some ${q>2}$ . Splitting the profinite integers ${\hat {\bf Z}}$ into the product of the ${p}$ -adic integers ${{\bf Z}_p}$ , it suffices to establish this claim for each ${{\bf Z}_p}$ separately (so long as we keep the implied constant equal to ${1}$ for sufficiently large ${p}$ ). This turns out to be possible using an arithmetic version of the Peluse-Prendiville inverse theorem as well as an arithmetic ${L^q}$ improving estimate for linear averaging operators which ultimately arises from some estimates on the distribution of polynomials on the ${p}$ -adic field ${{\bf Q}_p}$ , which are a variant of some estimates of Kowalski and Wright.
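The splitting into ${p}$ -adic factors can be seen at a finite level through the Chinese remainder theorem. As a sketch under stated assumptions (the finite model ${{\bf Z}/Q{\bf Z}}$ in place of ${\hat {\bf Z}}$ , and the hypothetical helper names `arithmetic_average` and `tensor`), the code below checks numerically that for coprime ${Q_1, Q_2}$ the arithmetic averaging operator on ${{\bf Z}/Q_1Q_2{\bf Z}}$ applied to tensor products is the tensor product of the operators on ${{\bf Z}/Q_1{\bf Z}}$ and ${{\bf Z}/Q_2{\bf Z}}$ .

```python
def arithmetic_average(f, g, Q):
    """A_Q(f,g)(x) = (1/Q) sum_{y mod Q} f(x-y) g(x-y^2) on Z/QZ.

    f and g are lists of length Q indexed by residues 0, ..., Q-1.
    """
    return [sum(f[(x - y) % Q] * g[(x - y * y) % Q] for y in range(Q)) / Q
            for x in range(Q)]

def tensor(u1, u2):
    """The function x -> u1[x mod Q1] * u2[x mod Q2] on Z/(Q1*Q2)Z.

    When Q1 and Q2 are coprime, x -> (x mod Q1, x mod Q2) is a bijection
    by the Chinese remainder theorem, so this is a genuine tensor product.
    """
    Q1, Q2 = len(u1), len(u2)
    return [u1[x % Q1] * u2[x % Q2] for x in range(Q1 * Q2)]

# Illustrative data on Z/4Z and Z/9Z (coprime moduli).
f1, g1 = [1.0, 2.0, 0.0, -1.0], [0.5, 0.0, 3.0, 1.0]
f2 = [1.0, 0.0, 2.0, 0.0, -1.0, 1.0, 0.0, 3.0, 1.0]
g2 = [2.0, 1.0, 0.0, 0.0, 1.0, -2.0, 1.0, 0.0, 1.0]

lhs = arithmetic_average(tensor(f1, f2), tensor(g1, g2), 36)
rhs = tensor(arithmetic_average(f1, g1, 4), arithmetic_average(f2, g2, 9))
print(max(abs(u - v) for u, v in zip(lhs, rhs)))
```

The factorisation works because squaring commutes with reduction modulo each ${Q_i}$ ; it is this multiplicative structure that lets an ${L^q}$ improving estimate be assembled prime by prime, provided the constants for large ${p}$ are exactly ${1}$ so that the infinite product does not diverge.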

17 comments

4 August, 2020 at 1:57 am

qwerty

Theorem 2: did you mean p>1?

[Corrected, thanks – T.]

2 September, 2020 at 7:59 am

IQ news

I believe the author of this blog is a person with a high IQ. I read an article about him with the highest IQ in America.

4 August, 2020 at 12:58 pm

Anonymous

Is the upper bound 3/2 in theorem 3 the best possible ?

7 August, 2020 at 9:49 am

Terence Tao

This question (or its counterpart for the closely related bilinear Hilbert transform) is actually one of the key open problems in time frequency analysis; the standard time-frequency techniques break down past this threshold, but this has not yet led to an actual counterexample to boundedness or convergence beyond this point. See for instance https://arxiv.org/abs/1409.3875 for some attempts to create such counterexamples.

6 August, 2020 at 3:32 am

Anonymous

Should $G(x) = H(x)$ rather than $G(x) = F(x)$ to respect the $N^2$ spatial scale?

[Corrected, thanks – T.]

7 August, 2020 at 1:04 pm

Satan

terence tao Let me attempt to make the path integral rigorous.
you can define the path integral as follows.

you discretize the 3d space and you define a path from point (x,y,z) to point (x’,y’,z’) by adjacent cubes.
for every such discrete path you give it a weight which is as follows:
the number of algorithms which produce this path divided by the total number of algorithms – and taking this limit to infinity, in other words the probability of a random algorithm generator to produce an algorithm which produce this path.
then you sum over all paths, each multiplied by its weight, and because you discretized the space you take this infinite sum to the limit of infinity such that in each step the cubes that divide the space are smaller and smaller

7 August, 2020 at 6:38 pm

Asaf

Are there characteristic factors (in the sense of Furstenberg-Weiss) for those types of averages?

Another question – does the convergence result become easier when we assume some mixing condition on one system (or both)?

10 August, 2020 at 7:24 pm

Terence Tao

For these averages the characteristic factor is the profinite (or rational Kronecker) factor, i.e., the factor generated by the periodic functions (or the rational spectrum of the shift). This is basically already in the original paper of Furstenberg and Weiss (at least for the quadratic polynomial $P(n)=n^2$ ).

Mixing conditions certainly make norm convergence easier, but to my knowledge there is no way to use mixing hypotheses to simplify the proof of pointwise ae convergence for these averages. But there is prior work establishing pointwise convergence for multilinear polynomial averages for exact endomorphisms and K-automorphisms by Derrien and Lesigne (reference [20] in the paper).

7 August, 2020 at 7:09 pm

Mike Jones

Where did you submit this paper?

13 August, 2020 at 4:32 pm

The Ionescu-Wainger multiplier theorem and the adeles | What's new

[…] I’ve just uploaded to the arXiv my paper “The Ionescu-Wainger multiplier theorem and the adeles”. This paper revisits a useful multiplier theorem of Ionescu and Wainger on “major arc” Fourier multiplier operators on the integers (or lattices ), and strengthens the bounds while also interpreting it from the viewpoint of the adelic integers (which were also used in my recent paper with Krause and Mirek). […]

16 August, 2020 at 9:59 am

Anonymous

Within the references of the paper uploaded on ArXiv, a minor correction related to the ref #20: Un théorème … Annales de l’Institut Henri Poincaré (IHP)

16 August, 2020 at 10:35 am

Anonymous

Congratulations for your paper.
In order to improve it, within the references of its first version uploaded on the arXiv platform, here are some very minor corrections and clarifications related to:
– the ref #15: … des groupes p-adiques. Bulletin de la Société Mathématique de France (SMF), 89:43-75, Paris, France, 1961.
– the ref #20: Un théorème … Annales de l’Institut Henri Poincaré (IHP) en probabilités et statistiques, 32(6):765-778, Paris, France, 1996.
– the ref #23: for paraproducts. (the final dot is missing)
Kind regards

26 August, 2020 at 10:03 pm

Chen

Just found a typo, in Theorem 2, the second “polynpmial”.

[Corrected, thanks – T.]

2 September, 2020 at 2:03 pm

Huang

Another typo found: “Fubini’s theorem, Holder’s inequality” => “Fubini’s theorem, Hölder’s inequality”

[Corrected, thanks – T.]

25 April, 2024 at 11:35 am

Anonymous

Hello, Professor Tao: I’d like to know more about open question (7) in the paper. Given the similarity in spirit between the proofs of boundedness of the bilinear Hilbert transform and of the bilinear maximal function, could we expect to use the methods in the paper to establish the boundedness of the bilinear Radon transform (or its truncated version): $\sum_{k \neq0}\frac{1}{k}f(T^kx)g(T^{k^2}x)?$

25 April, 2024 at 3:23 pm

Terence Tao

It’s possible, but not guaranteed. The $1/k$ weight potentially generates additional logarithmic divergences in many of the estimates used in our arguments; the question is whether the various gains that we also manage to obtain are sufficiently strong to counteract this divergence. My initial guess is that a naive replication of our arguments will fail to contain these divergences (for instance, the L^p improving inequalities probably fail once such weights are introduced), but it is possible that the argument could somehow be refactored in a way that the logarithmic divergences are only experienced when considering those components of the proof that have really strong additional cancellations. One reason to pursue questions like this is that it may force the person researching these questions to develop a more efficient or simpler proof of our original theorem in order to be able to extend it to the singular setting; I certainly don’t claim that our arguments are the *only* way to proceed for these sorts of questions!

As discussed in the paper, there may be model versions of these questions on other domains than the integers, such as the reals, the p-adics, or the adeles, which may be worth studying first as toy problems that already capture some, but not all, of the main difficulties.

26 April, 2024 at 10:06 am

Anonymous

Thanks for the detailed explanations!
