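The von Neumann ergodic theorem (the Hilbert space version of the mean ergodic theorem) asserts that if $U: H \to H$ is a unitary operator on a Hilbert space $H$, and $v \in H$ is a vector in that Hilbert space, then one has

$$\lim_{N \to \infty} \frac{1}{N} \sum_{n=1}^N U^n v = \pi_{H^U} v$$

in the strong topology, where $H^U := \{ w \in H: Uw = w \}$ is the $U$-invariant subspace of $H$, and $\pi_{H^U}$ is the orthogonal projection to $H^U$. (See e.g. these previous lecture notes for a proof.) The same proof extends to more general amenable groups: if $G$ is a countable amenable group acting on a Hilbert space $H$ by unitary transformations $T^g: H \to H$ for $g \in G$, and $v \in H$ is a vector in that Hilbert space, then one has

$$\lim_{N \to \infty} \mathbf{E}_{g \in \Phi_N} T^g v = \pi_{H^G} v$$

for any Følner sequence $\Phi_N$ of $G$, where $H^G := \{ w \in H: T^g w = w \text{ for all } g \in G \}$ is the $G$-invariant subspace, and $\mathbf{E}_{a \in A} f(a) := \frac{1}{|A|} \sum_{a \in A} f(a)$ is the average of $f$ on $A$. Thus one can interpret $\pi_{H^G} v$ as a certain average of elements of the orbit $Gv := \{ T^g v: g \in G \}$ of $v$.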

In a previous blog post, I noted a variant of this ergodic theorem (due to Alaoglu and Birkhoff) that holds even when the group $G$ is not amenable (or not discrete), using a more abstract notion of averaging:

Theorem 1 (Abstract ergodic theorem) Let $G$ be an arbitrary group acting unitarily on a Hilbert space $H$, and let $v$ be a vector in $H$. Then $\pi_{H^G} v$ is the element in the closed convex hull of $Gv := \{ T^g v: g \in G \}$ of minimal norm, and is also the unique element of $H^G$ in this closed convex hull.

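I recently stumbled upon a different way to think about this theorem, in the additive case $G = (G,+)$ when $G$ is abelian, which has a closer resemblance to the classical mean ergodic theorem. Given an arbitrary additive group $G = (G,+)$ (not necessarily discrete, or countable), let $\mathcal{F}$ denote the collection of finite non-empty multisets in $G$ – that is to say, unordered collections $\{a_1,\dots,a_n\}$ of elements $a_1,\dots,a_n$ of $G$, not necessarily distinct, for some positive integer $n$. Given two multisets $A = \{a_1,\dots,a_n\}$, $B = \{b_1,\dots,b_m\}$ in $\mathcal{F}$, we can form the sum set $A + B := \{ a_i + b_j: 1 \leq i \leq n, 1 \leq j \leq m \}$. Note that the sum set $A+B$ can contain multiplicity even when $A, B$ do not; for instance, $\{1,2\} + \{1,2\} = \{2,3,3,4\}$. Given a multiset $A = \{a_1,\dots,a_n\}$ in $\mathcal{F}$, and a function $f: G \to H$ from $G$ to a vector space $H$, we define the average $\mathbf{E}_{a \in A} f(a)$ as

$$\mathbf{E}_{a \in A} f(a) = \frac{1}{n} \sum_{j=1}^n f(a_j).$$

Note that the multiplicity function of the multiset $A$ affects the average; for instance, we have $\mathbf{E}_{a \in \{1,2\}} a = \frac{3}{2}$, but $\mathbf{E}_{a \in \{1,2,2\}} a = \frac{5}{3}$.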
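The two multiset operations just described are easy to experiment with. The following is a minimal Python sketch, assuming $G = \mathbf{Z}$ and representing a finite multiset as a list with repeats; the helper names `sum_set` and `average` are illustrative choices, not notation from the post.

```python
from fractions import Fraction
from itertools import product

def sum_set(A, B):
    """Multiset sum A + B: one element a + b for every pair (a, b)."""
    return [a + b for a, b in product(A, B)]

def average(A, f=lambda a: a):
    """Average of f over the multiset A; repeated elements are counted
    as many times as they occur."""
    return sum(Fraction(f(a)) for a in A) / len(A)

print(sorted(sum_set([1, 2], [1, 2])))  # [2, 3, 3, 4]
print(average([1, 2]))                  # 3/2
print(average([1, 2, 2]))               # 5/3
```

Exact rational arithmetic is used so that the two averages can be compared with the fractions quoted in the text without rounding error.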

We can define a directed set structure on $\mathcal{F}$ as follows: given two multisets $A, B \in \mathcal{F}$, we write $A \geq B$ if we have $A = B + C$ for some $C \in \mathcal{F}$. Thus for instance we have $\{1,2,2,3\} \geq \{1,2\}$, since $\{1,2,2,3\} = \{1,2\} + \{0,1\}$. It is easy to verify that this relation is transitive and reflexive, and is directed because any two elements $A, B$ of $\mathcal{F}$ have a common upper bound, namely $A+B$. (This is where we need $G$ to be abelian.) The notion of convergence along a net now allows us to define the notion of convergence along $\mathcal{F}$: given a family $x_A$ of points in a topological space $X$ indexed by elements $A$ of $\mathcal{F}$, and a point $x$ in $X$, we say that $x_A$ *converges* to $x$ along $\mathcal{F}$ if, for every open neighbourhood $U$ of $x$ in $X$, one has $x_A \in U$ for sufficiently large $A$, that is to say there exists $B \in \mathcal{F}$ such that $x_A \in U$ for all $A \geq B$. If the topological space $X$ is Hausdorff, then the limit $x$ is unique (if it exists), and we then write

$$x = \lim_{A \to G} x_A.$$

When $x_A$ takes values in the reals, one can also define the limit superior or limit inferior along such nets in the obvious fashion.
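To see concretely why commutativity is needed for $A+B$ to be a common upper bound, here is a small Python sketch. It assumes $G = \mathbf{Z}$ for the abelian case and, purely as a hypothetical contrast, uses permutations of three points (with composition as the group law) for the non-abelian case; the helper names are illustrative.

```python
from collections import Counter
from itertools import product

def sum_set(A, B, op=lambda x, y: x + y):
    """Multiset {op(a, b)}: one element per pair, stored with multiplicity."""
    return Counter(op(a, b) for a, b in product(A, B))

def compose(p, q):
    """Composition of permutations given as tuples: (p o q)(j) = p[q[j]]."""
    return tuple(p[i] for i in q)

# Abelian case: A + B = B + A, so A + B lies above both A and B.
A, B = [1, 2], [0, 1]
upper = sum_set(A, B)
abelian_ok = sum_set(A, B) == sum_set(B, A)

# Non-abelian contrast: the "product sets" {p}{q} and {q}{p} differ.
p, q = (1, 0, 2), (0, 2, 1)
nonabelian_ok = sum_set([p], [q], compose) == sum_set([q], [p], compose)

print(sorted(upper.elements()))   # [1, 2, 2, 3]
print(abelian_ok, nonabelian_ok)  # True False
```

In the non-abelian case $B \cdot A$ is of the form $B + C$ in the obvious sense, but it need not be of the form $A + C$, which is why the ordering is only set up for additive groups.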

We can then give an alternate formulation of the abstract ergodic theorem in the abelian case:

Theorem 2 (Abelian abstract ergodic theorem) Let $G = (G,+)$ be an arbitrary additive group acting unitarily on a Hilbert space $H$, and let $v$ be a vector in $H$. Then we have

$$\pi_{H^G} v = \lim_{A \to G} \mathbf{E}_{a \in A} T^a v$$

in the strong topology of $H$.

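*Proof:* Suppose that $A \geq B$, so that $A = B + C$ for some $C \in \mathcal{F}$, then

$$\mathbf{E}_{a \in A} T^a v = \mathbf{E}_{c \in C} T^c \left( \mathbf{E}_{b \in B} T^b v \right)$$

so by unitarity and the triangle inequality we have

$$\| \mathbf{E}_{a \in A} T^a v \|_H \leq \| \mathbf{E}_{b \in B} T^b v \|_H,$$

thus $\| \mathbf{E}_{a \in A} T^a v \|_H^2$ is monotone non-increasing in $A$. Since this quantity is bounded between $0$ and $\|v\|_H^2$, we conclude that the limit $\lim_{A \to G} \| \mathbf{E}_{a \in A} T^a v \|_H^2$ exists. Thus, for any $\varepsilon > 0$, we have for sufficiently large $A$ that

$$\| \mathbf{E}_{b \in B} T^b v \|_H^2 \geq \| \mathbf{E}_{a \in A} T^a v \|_H^2 - \varepsilon$$

for all $B \geq A$. In particular, for any $g \in G$, we have

$$\| \mathbf{E}_{b \in A + \{0,g\}} T^b v \|_H^2 \geq \| \mathbf{E}_{a \in A} T^a v \|_H^2 - \varepsilon.$$

We can write

$$\mathbf{E}_{b \in A + \{0,g\}} T^b v = \frac{1}{2} \left( \mathbf{E}_{a \in A} T^a v + T^g \mathbf{E}_{a \in A} T^a v \right)$$

and so from the parallelogram law and unitarity we have

$$\| \mathbf{E}_{a \in A} T^a v - T^g \mathbf{E}_{a \in A} T^a v \|_H^2 \leq 4 \varepsilon$$

for all $g \in G$, and hence by the triangle inequality (averaging $g$ over a finite multiset $C$)

$$\| \mathbf{E}_{a \in A} T^a v - \mathbf{E}_{b \in A+C} T^b v \|_H^2 \leq 4 \varepsilon$$

for any $C \in \mathcal{F}$. This shows that $\mathbf{E}_{a \in A} T^a v$ is a Cauchy net in $H$ (in the strong topology), and hence (by the completeness of $H$) tends to a limit. Shifting $A$ by a group element $g$, we have

$$\lim_{A \to G} \mathbf{E}_{a \in A} T^a v = \lim_{A \to G} \mathbf{E}_{a \in A + \{g\}} T^a v = T^g \lim_{A \to G} \mathbf{E}_{a \in A} T^a v$$

and hence $\lim_{A \to G} \mathbf{E}_{a \in A} T^a v$ is invariant under shifts, and thus lies in $H^G$. On the other hand, for any $w \in H^G$ and $A \in \mathcal{F}$, we have

$$\langle \mathbf{E}_{a \in A} T^a v, w \rangle_H = \mathbf{E}_{a \in A} \langle v, T^{-a} w \rangle_H = \langle v, w \rangle_H$$

and thus on taking strong limits

$$\langle \lim_{A \to G} \mathbf{E}_{a \in A} T^a v, w \rangle_H = \langle v, w \rangle_H$$

and so $v - \lim_{A \to G} \mathbf{E}_{a \in A} T^a v$ is orthogonal to $H^G$. Combining these two facts we see that $\lim_{A \to G} \mathbf{E}_{a \in A} T^a v$ is equal to $\pi_{H^G} v$ as claimed.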
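The monotonicity at the heart of this argument can be watched numerically. The sketch below is only an illustration under stated assumptions: it takes $G = \mathbf{Z}$ acting on $H = \mathbf{C}^3$ by the hypothetical diagonal unitaries $T^a = \mathrm{diag}(1, e^{2ia}, e^{2.5ia})$, so that the invariant subspace is the first coordinate axis, and it follows a single increasing chain $A, A + \{0,1\}, A + \{0,1\} + \{0,1\}, \dots$ rather than the whole net. Multisets are stored as `Counter` objects mapping elements to multiplicities.

```python
import cmath
import math
from collections import Counter

ANGLES = (0.0, 2.0, 2.5)  # assumed rotation angles; angle 0 is the invariant direction

def sum_set(A, B):
    """Multiset sum of two Counters (element -> multiplicity)."""
    C = Counter()
    for a, m in A.items():
        for b, n in B.items():
            C[a + b] += m * n
    return C

def orbit_average(A, v):
    """E_{a in A} T^a v for the diagonal action T^a = diag(exp(i*a*theta))."""
    total = sum(A.values())
    return [x * sum((m / total) * cmath.exp(1j * a * th) for a, m in A.items())
            for th, x in zip(ANGLES, v)]

def norm(w):
    return math.sqrt(sum(abs(x) ** 2 for x in w))

v = [1.0, 1.0, 1.0]
A = Counter({0: 1})
norms = []
for _ in range(40):
    norms.append(norm(orbit_average(A, v)))
    A = sum_set(A, Counter({0: 1, 1: 1}))  # A + {0, 1} >= A in the ordering
limit = orbit_average(A, v)

print(norms[0], norms[-1])
print([abs(x) for x in limit])
```

In this example the norms decrease from $\sqrt{3}$ towards $1$ and the averages approach $(1,0,0)$, which is the orthogonal projection of $v$ to the invariant subspace, in agreement with Theorem 2.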

To relate this result to the classical ergodic theorem, we observe

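Lemma 3 Let $G = (G,+)$ be a countable additive group, with a Følner sequence $\Phi_n$, and let $f_g$ be a bounded sequence in a normed vector space indexed by $G$. If $\lim_{A \to G} \mathbf{E}_{a \in A} f_a$ exists, then $\lim_{n \to \infty} \mathbf{E}_{a \in \Phi_n} f_a$ exists, and the two limits are equal.

*Proof:* From the Følner property, we see that for any $A$ and any $\varepsilon > 0$, the averages $\mathbf{E}_{a \in \Phi_n} f_a$ and $\mathbf{E}_{a \in A + \Phi_n} f_a$ differ by at most $\varepsilon$ in norm if $n$ is sufficiently large depending on $A$, $\varepsilon$ (and the $f_a$). On the other hand, by the existence of the limit $\lim_{A \to G} \mathbf{E}_{a \in A} f_a$, the averages $\mathbf{E}_{a \in A} f_a$ and $\mathbf{E}_{a \in A + \Phi_n} f_a$ differ by at most $\varepsilon$ in norm if $A$ is sufficiently large depending on $\varepsilon$ (regardless of how large $n$ is). The claim follows.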
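The first step of this proof can be checked by hand in the simplest case. The sketch below assumes $G = \mathbf{Z}$ with the Følner sequence $\Phi_n = \{1,\dots,n\}$, a fixed multiset $A$, and an arbitrary bounded real sequence (here $f_a = \cos(a^2)$, chosen only for illustration). Shifting $\Phi_n$ by $c$ changes at most $2|c|$ of its $n$ elements, so the two averages differ by at most $2 \max_{c \in A} |c| \sup |f| / n$.

```python
import math

def average(A, f):
    """Average of f over the multiset A (a list with repeats)."""
    return sum(f(a) for a in A) / len(A)

def sum_set(A, B):
    return [a + b for a in A for b in B]

def folner(n):
    """Phi_n = {1, ..., n}, a Folner sequence for G = Z."""
    return list(range(1, n + 1))

f = lambda a: math.cos(a * a)  # an arbitrary bounded sequence, |f| <= 1
A = [0, 3, 3, -2]              # a fixed finite multiset, max |c| = 3

def gap(n):
    """| E_{Phi_n} f - E_{A + Phi_n} f |"""
    return abs(average(folner(n), f) - average(sum_set(A, folner(n)), f))

for n in (10, 100, 1000):
    print(n, gap(n), 6 / n)  # the gap stays below the bound 2 * 3 * 1 / n
```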

It turns out that this approach can also be used as an alternate way to construct the Gowers-Host-Kra seminorms in ergodic theory, which has the feature that it does not explicitly require any amenability on the group $G$ (or separability on the underlying measure space), though, as pointed out to me in comments, even uncountable abelian groups are amenable in the sense of possessing an invariant mean, even if they do not have a Følner sequence.

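Given an arbitrary additive group $G$, define a *$G$-system* $(X,T)$ to be a probability space $X = (X, \mathcal{X}, \mu)$ (not necessarily separable or standard Borel), together with a collection $T = (T^g)_{g \in G}$ of invertible, measure-preserving maps $T^g: X \to X$, such that $T^0$ is the identity and $T^g T^h = T^{g+h}$ (modulo null sets) for all $g, h \in G$. This then gives isomorphisms $T^g: L^p(X) \to L^p(X)$ for $1 \leq p \leq \infty$ by setting $T^g f(x) := f(T^{-g} x)$. From the above abstract ergodic theorem, we see that

$$\mathbf{E}( f | \mathcal{X}^G ) = \lim_{A \to G} \mathbf{E}_{a \in A} T^a f$$

in the strong topology of $L^2(X)$ for any $f \in L^2(X)$, where $\mathcal{X}^G$ is the collection of measurable sets $E$ that are essentially $G$-invariant in the sense that $T^g E = E$ modulo null sets for all $g \in G$, and $\mathbf{E}(f | \mathcal{X}^G)$ is the conditional expectation of $f$ with respect to $\mathcal{X}^G$.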

In a similar spirit, we have

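Theorem 4 (Convergence of Gowers-Host-Kra seminorms) Let $(X,T)$ be a $G$-system for some additive group $G$. Let $d$ be a natural number, and for every $\omega \in \{0,1\}^d$, let $f_\omega \in L^{2^d}(X)$, which for simplicity we take to be real-valued. Then the expression

$$\langle (f_\omega)_{\omega \in \{0,1\}^d} \rangle_{U^d(X)} := \lim_{A_1,\dots,A_d \to G} \mathbf{E}_{h_1 \in A_1-A_1,\dots,h_d \in A_d-A_d} \int_X \prod_{\omega \in \{0,1\}^d} T^{\omega_1 h_1 + \dots + \omega_d h_d} f_\omega\ d\mu$$

converges, where we write $\omega = (\omega_1,\dots,\omega_d)$, where $A - A := \{ a - a': a, a' \in A \}$ denotes the multiset of differences, and where we are using the product directed set on $\mathcal{F}^d$ to define the convergence $A_1,\dots,A_d \to G$. In particular, for $f \in L^{2^d}(X)$, the limit

$$\| f \|_{U^d(X)}^{2^d} = \lim_{A_1,\dots,A_d \to G} \mathbf{E}_{h_1 \in A_1-A_1,\dots,h_d \in A_d-A_d} \int_X \prod_{\omega \in \{0,1\}^d} T^{\omega_1 h_1 + \dots + \omega_d h_d} f\ d\mu$$

converges.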

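We prove this theorem below the fold. It implies a number of other known descriptions of the Gowers-Host-Kra seminorms $\|f\|_{U^d(X)}$, for instance that

$$\| f \|_{U^d(X)}^{2^d} = \lim_{A \to G} \mathbf{E}_{h \in A-A} \| f T^h f \|_{U^{d-1}(X)}^{2^{d-1}}$$

for $d > 1$, while from the ergodic theorem we have

$$\| f \|_{U^1(X)} = \| \mathbf{E}( f | \mathcal{X}^G ) \|_{L^2(X)}.$$

This definition also manifestly demonstrates the cube symmetries of the Host-Kra measures $\mu^{[d]}$ on $X^{\{0,1\}^d}$, defined via duality by requiring that

$$\langle (f_\omega)_{\omega \in \{0,1\}^d} \rangle_{U^d(X)} = \int_{X^{\{0,1\}^d}} \bigotimes_{\omega \in \{0,1\}^d} f_\omega\ d\mu^{[d]}.$$

In a subsequent blog post I hope to present a more detailed study of the $U^2$ norm and its relationship with eigenfunctions and the Kronecker factor, without assuming any amenability on $G$ or any separability or topological structure on $X$.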
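For a finite group the limits over multisets can be evaluated exactly, which gives a quick sanity check of the $U^1$ and $U^2$ formulas. The Python sketch below assumes the hypothetical system $X = G = \mathbf{Z}/N\mathbf{Z}$ with uniform measure and $T^h f(x) = f(x-h)$. In this setting any multiset $A \geq G$ has every group element with equal multiplicity, and the same holds for $A - A$, so the averages over $h \in A - A$ reduce to uniform averages over the group. The function names are illustrative.

```python
import cmath
import math
from itertools import product

N = 6
f = [math.cos(2 * math.pi * x / N) + 0.5 * (x % 2) for x in range(N)]  # any real f

def shift(g, h):
    """T^h g(x) = g(x - h) on Z/N."""
    return [g[(x - h) % N] for x in range(N)]

def integral(g):
    """Integral against the uniform probability measure on Z/N."""
    return sum(g) / N

def u1_sq(g):
    """||g||_{U^1}^2 = E_h int g * T^h g."""
    return sum(integral([a * b for a, b in zip(g, shift(g, h))])
               for h in range(N)) / N

def u2_fourth(g):
    """||g||_{U^2}^4 = E_{h1,h2} int g * T^{h1} g * T^{h2} g * T^{h1+h2} g."""
    total = 0.0
    for h1, h2 in product(range(N), repeat=2):
        g1, g2, g12 = shift(g, h1), shift(g, h2), shift(g, h1 + h2)
        total += integral([a * b * c * d for a, b, c, d in zip(g, g1, g2, g12)])
    return total / N ** 2

# Recursive description: ||f||_{U^2}^4 = E_h ||f T^h f||_{U^1}^2.
recursive = sum(u1_sq([a * b for a, b in zip(f, shift(f, h))])
                for h in range(N)) / N

# Fourier description: ||f||_{U^2}^4 equals the sum of |fhat(xi)|^4.
fhat = [sum(f[x] * cmath.exp(-2j * math.pi * x * xi / N) for x in range(N)) / N
        for xi in range(N)]
fourier = sum(abs(c) ** 4 for c in fhat)

print(u1_sq(f), integral(f) ** 2)      # U^1: square of the mean
print(u2_fourth(f), recursive, fourier)
```

Here the invariant factor is trivial, so the $U^1$ identity reduces to $\|f\|_{U^1} = |\int_X f\ d\mu|$, and the three expressions for $\|f\|_{U^2}^4$ agree up to rounding.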

** — 1. Proof of theorem — **

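If $\vec f := (f_\omega)_{\omega \in \{0,1\}^d}$ is a tuple of functions $f_\omega \in L^{2^d}(X)$ and $0 \leq d' \leq d$, we say that $\vec f$ is *$d'$-symmetric* if we have $f_\omega = f_{\omega'}$ whenever $\omega = (\omega_1,\dots,\omega_d)$ and $\omega' = (\omega'_1,\dots,\omega'_d)$ agree in the first $d-d'$ components (that is, $\omega_i = \omega'_i$ for $i = 1,\dots,d-d'$). Thus every tuple is $0$-symmetric, while a $d$-symmetric tuple has all of its components equal. We will prove Theorem 4 for $d'$-symmetric tuples by downward induction on $d'$, with the $d'=0$ case establishing the full theorem.

Thus, assume that $0 \leq d' \leq d$ and that the claim has already been proven for larger values of $d'$ (this hypothesis is vacuous for $d'=d$). Write

$$F(A_1,\dots,A_d) := \mathbf{E}_{h_1 \in A_1-A_1,\dots,h_d \in A_d-A_d} \int_X \prod_{\omega \in \{0,1\}^d} T^{\omega_1 h_1 + \dots + \omega_d h_d} f_\omega\ d\mu.$$

By Hölder's inequality, this quantity is bounded uniformly in $A_1,\dots,A_d$. We will show that for any $\varepsilon > 0$, and for sufficiently large $A_1,\dots,A_d$ (in the net $\mathcal{F}$), the quantity $F(A_1,\dots,A_d)$ can only increase by at most $\varepsilon$ when one increases any of the $A_i$, $1 \leq i \leq d$, that is to say that

$$F(A_1,\dots,A_{i-1},A'_i,A_{i+1},\dots,A_d) \leq F(A_1,\dots,A_d) + \varepsilon$$

whenever $1 \leq i \leq d$ and $A'_i \geq A_i$. This implies that the limit superior of $F(A_1,\dots,A_d)$ exceeds the limit inferior by at most $d\varepsilon$, and on sending $\varepsilon \to 0$ we will obtain Theorem 4.

There are two cases, depending on whether $i > d-d'$ or $i \leq d-d'$. We begin with the first case $i > d-d'$. By relabeling we may take $i=d$, so that $d' \geq 1$. As $\vec f$ is $d'$-symmetric, we can write

$$F(A_1,\dots,A_d) = \mathbf{E}_{h_1 \in A_1-A_1,\dots,h_{d-1} \in A_{d-1}-A_{d-1}} \left\| \mathbf{E}_{a \in A_d} T^a f_{h_1,\dots,h_{d-1}} \right\|_{L^2(X)}^2$$

where

$$f_{h_1,\dots,h_{d-1}} := \prod_{\omega \in \{0,1\}^{d-1}} T^{\omega_1 h_1 + \dots + \omega_{d-1} h_{d-1}} f_{(\omega,0)}.$$

By the triangle inequality argument used to prove Theorem 2 we thus see that

$$F(A_1,\dots,A_{d-1},A'_d) \leq F(A_1,\dots,A_d)$$

whenever $A'_d \geq A_d$, and so $F(A_1,\dots,A_d)$ certainly cannot increase by $\varepsilon$ by increasing $A_d$.

Now we turn to the case when $i \leq d-d'$. By relabeling we may take $i=1$, so that $d' < d$. We can write

$$F(A_1,\dots,A_d) = \mathbf{E}_{h_2 \in A_2-A_2,\dots,h_d \in A_d-A_d} \left\langle \mathbf{E}_{a \in A_1} T^a f_{0;h_2,\dots,h_d}, \mathbf{E}_{a \in A_1} T^a f_{1;h_2,\dots,h_d} \right\rangle_{L^2(X)}$$

where

$$f_{\omega_1;h_2,\dots,h_d} := \prod_{(\omega_2,\dots,\omega_d) \in \{0,1\}^{d-1}} T^{\omega_2 h_2 + \dots + \omega_d h_d} f_{(\omega_1,\dots,\omega_d)}.$$

To shorten the notation, we abbreviate $\vec h := (h_2,\dots,h_d)$ and write $\mathbf{E}_{\vec h}$ for the average over $h_2 \in A_2-A_2,\dots,h_d \in A_d-A_d$. On the other hand, the quantity

$$\mathbf{E}_{\vec h} \left\| \mathbf{E}_{a \in A_1} T^a f_{0;\vec h} \right\|_{L^2(X)}^2$$

is the same as $F(A_1,\dots,A_d)$, but with $f_{(1,\omega_2,\dots,\omega_d)}$ replaced by $f_{(0,\omega_2,\dots,\omega_d)}$. After rearrangement (moving the first coordinate to the last position), this is a $(d'+1)$-symmetric inner product, and so by induction hypothesis the limit

$$\lim_{A_1,\dots,A_d \to G} \mathbf{E}_{\vec h} \left\| \mathbf{E}_{a \in A_1} T^a f_{0;\vec h} \right\|_{L^2(X)}^2$$

exists. In particular, for $A_1,\dots,A_d$ large enough, we have

$$\mathbf{E}_{\vec h} \left\| \mathbf{E}_{a \in A_1 + \{0,g\}} T^a f_{0;\vec h} \right\|_{L^2(X)}^2 \geq \mathbf{E}_{\vec h} \left\| \mathbf{E}_{a \in A_1} T^a f_{0;\vec h} \right\|_{L^2(X)}^2 - \varepsilon$$

for all $g \in G$, which by the parallelogram law as in the proof of Theorem 2 shows that

$$\mathbf{E}_{\vec h} \left\| \mathbf{E}_{a \in A_1} T^a f_{0;\vec h} - T^g \mathbf{E}_{a \in A_1} T^a f_{0;\vec h} \right\|_{L^2(X)}^2 \leq 4\varepsilon$$

and hence by averaging

$$\mathbf{E}_{\vec h} \left\| \mathbf{E}_{a \in A_1} T^a f_{0;\vec h} - \mathbf{E}_{a \in A'_1} T^a f_{0;\vec h} \right\|_{L^2(X)}^2 \leq 4\varepsilon$$

whenever $A'_1 \geq A_1$. Similarly with $f_{0;\vec h}$ replaced by $f_{1;\vec h}$. From Cauchy-Schwarz we then have

$$| F(A'_1,A_2,\dots,A_d) - F(A_1,\dots,A_d) | \leq C \varepsilon^{1/2}$$

for some $C$ independent of $\varepsilon$ (depending on the norms of the $f_\omega$), and the claim follows after redefining $\varepsilon$.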

## 10 comments



9 April, 2015 at 11:27 pm

Anonymous

In the initial definition of sum set, a + is missing (says AB :=… now)

In the proof of Thm 2, there is a lim_{A\to H} (“we conclude that the limit…”)

I’m having some trouble seeing the Cauchy estimate from the parallelogram law in Thm 2. Taking X = ET^a v we have ||X – T^g X||^2 + ||X+ T^g X||^2 = 4||X||^2 and from ||X + T^gX||^2 \geq ||X||^2 – \epsilon we get ||X – T^g X||^2\leq \epsilon + 3||X||^2 instead of 4\epsilon

In the discussion immediately preceding Thm 3, E(f, X^G) has a \mathcal{E} instead of \mathbf

[Corrected, thanks. The lower bound should be on $\|\frac{1}{2}(X + T^g X)\|^2$, not $\|X + T^g X\|^2$. -T.]

10 April, 2015 at 9:08 am

Anonymous

Ah, yes. There is a 1/2 missing in the RHS of E_{b \in A + {0,g}} = E_a T^a v + T^g E_a T^a v

[Corrected, thanks – T.]

10 April, 2015 at 9:09 am

Craig H

I think you’re still missing the $\frac{1}{2}$.

Specifically, just above the “parallelogram law” link, it should be

$ET^b v = \frac{1}{2} (ET^a v + T^g ET^a v)$,

since the size of the multiset has now doubled.

10 April, 2015 at 12:17 am

Anonymous

In the line below the definition of the average $\mathbf{E}_{a \in A} f(a)$, it seems that “multiplicity of the set $A$” should be “multiplicity function of the multiset $A$”.

[Corrected, thanks – T.]

10 April, 2015 at 1:09 am

pzorin

Isn’t it the case that all commutative groups are amenable as discrete groups?

10 April, 2015 at 5:25 am

Marcel S.

As far as I am aware all abelian groups are amenable (see e.g. Chapter 10 in “The Banach-Tarski paradox” by Stan Wagon). If I am not mistaken a slight modification of the procedure described in Theorem 2 should provide an invariant mean on bounded functions on the group.

10 April, 2015 at 8:24 am

Terence Tao

Hmm, you’re right; uncountable abelian groups don’t have Følner sequences, but they still have invariant means. I guess what I meant to say here is that one can establish the ergodic theorem and construct the Gowers-Host-Kra seminorms without explicit use of Følner sequences, even if the ambient group is still technically amenable.

11 April, 2015 at 8:41 am

Anonymous

I guess in the beginning the link to “previous lecture notes” should be pointing to one lecture earlier (i.e. Lecture 8 not 9.)

[Corrected, thanks – T.]