The primary objective of this lecture and the next few will be to give a proof of the Furstenberg recurrence theorem (Theorem 2 from the previous lecture). Along the way we will develop a structural theory for measure-preserving systems.

The basic strategy of Furstenberg’s proof is to first prove the recurrence theorems for very simple systems – either those with “almost periodic” (or compact) dynamics or with “weakly mixing” dynamics. These cases are quite easy, but don’t manage to cover all the cases. To go further, we need to consider various combinations of these systems. For instance, by viewing a general system as an extension of the maximal compact factor, we will be able to prove Roth’s theorem (which is equivalent to the k=3 form of the Furstenberg recurrence theorem). To handle the general case, we need to consider compact extensions of compact factors, compact extensions of compact extensions of compact factors, etc., as well as weakly mixing extensions of all the previously mentioned factors.

In this lecture, we will consider those measure-preserving systems (X, {\mathcal X}, \mu, T) which are compact or almost periodic. These systems are analogous to the equicontinuous or isometric systems in topological dynamics discussed in Lecture 6, and as with those systems, we will be able to characterise such systems (or more precisely, the ergodic ones) algebraically as Kronecker systems, though this is not strictly necessary for the proof of the recurrence theorem.

— Almost periodic functions —

We begin with a basic definition.

Definition 1. Let (X,{\mathcal X},\mu,T) be a measure-preserving system. A function f \in L^2(X,{\mathcal X},\mu) is almost periodic if the orbit closure \overline{\{ T^n f: n \in {\Bbb Z} \}} is compact in L^2(X, {\mathcal X},\mu).

Example 1. If f is periodic (i.e. T^n f = f for some n > 0) then it is clearly almost periodic. In particular, any shift-invariant function (such as a constant function) is almost periodic. \diamond

Example 2. In the circle shift system ({\Bbb R}/{\Bbb Z}, x \mapsto x+\alpha), every function f \in L^2({\Bbb R}/{\Bbb Z}) is almost periodic, because the orbit closure lies inside the set \{ f(\cdot+\theta): \theta \in {\Bbb R}/{\Bbb Z} \}, which is the continuous image of a circle {\Bbb R}/{\Bbb Z} and therefore compact. \diamond

Exercise 1. Let (X,{\mathcal X},\mu,T) be a measure-preserving system, and let f \in L^2(X,{\mathcal X},\mu). Show that f is almost periodic in the ergodic theory sense (i.e. Definition 1 above) if and only if it is almost periodic in the topological dynamical systems sense (see Lecture 3), i.e. if the sets \{ n \in {\Bbb Z}: \|T^n f - f \|_{L^2(X,{\mathcal X},\mu)} \leq \varepsilon \} are syndetic for every \varepsilon > 0. (Hint: if f is almost periodic in the ergodic theory sense, show that the orbit closure is an isometric system and thus a Kronecker system, at which point Theorem 2 from Lecture 3 can be applied. For the converse implication, use the Heine-Borel theorem and the isometric nature of T on L^2.) \diamond

Exercise 2. Let (X, {\mathcal X},\mu,T) be a measure-preserving system. Show that the space of almost periodic functions in L^2(X,{\mathcal X},\mu) is a closed shift-invariant subspace which is also closed under the pointwise operations f, g \mapsto \max(f,g) and f, g \mapsto \min(f,g). Similarly, show that the space of almost periodic functions in L^\infty(X,{\mathcal X},\mu) is a closed subspace which is also an algebra (closed under products) as well as closed under \max and \min. \diamond

Exercise 3. Show that in any Bernoulli system \Omega^{\Bbb Z}, the only almost periodic functions are the constants. (Hint: first show that if f \in L^2(X,{\mathcal X},\mu) has mean zero, then \lim_{n \to \infty} \int_X f T^n f\ d\mu = 0, by first considering elementary functions. \diamond

Let us now recall the Furstenberg recurrence theorem, which we now phrase in terms of functions rather than sets:

Theorem 1. (Furstenberg recurrence theorem) Let (X, {\mathcal X},\mu,T) be a measure-preserving system, let k \geq 1, and let f \in L^\infty(X,{\mathcal X},\mu) be a non-negative function with \int_X f\ d\mu > 0. Then we have \liminf_{N \to \infty} \frac{1}{N} \sum_{r=0}^{N-1} \int_X f T^r f \ldots T^{(k-1)r} f > 0.

Exercise 4. Show that Theorem 1 is equivalent to Theorem 2 from the previous lecture. \diamond

We can now quickly establish this theorem in the almost periodic case:

Proposition 1. Theorem 1 is true whenever f is almost periodic.

Proof. Without loss of generality we may assume that f is bounded a.e. by 1. Let \varepsilon > 0 be chosen later. Recall from Exercise 1 that T^n f lies within \varepsilon of f in the L^2 topology for a syndetic set of n. For all such n, one also has \| T^{(j+1)n} f - T^{jn} f \|_{L^2(X,{\mathcal X},\mu)} \leq \varepsilon for all j, since T acts isometrically. By the triangle inequality, we conclude that T^{jn} f lies within O_k(\varepsilon) of f in L^2 for 0 \leq j \leq k. On the other hand, from Hölder’s inequality we see that on the unit ball of L^\infty(X, {\mathcal X}, \mu) with the L^2 topology, pointwise multiplication is Lipschitz. Applying this fact repeatedly, we conclude that for n in this syndetic set, f T^n f \ldots T^{(k-1)n} f lies within O_k(\varepsilon) in L^2 of f^k. In particular,

\int_X f T^n f \ldots T^{(k-1)n} f\ d\mu = \int_X f^k\ d\mu + O_k(\varepsilon). (1)

On the other hand, since \int f\ d\mu > 0, we must have \int_X f^k\ d\mu > 0. Choosing \varepsilon sufficiently small, we thus see that the left-hand side of (1) is uniformly bounded away from zero in a syndetic set, and the conclusion of Theorem 1 follows. \diamond

Remark 1. Because f lives in a Kronecker system, one can also obtain the above result using various multiple recurrence theorems from topological dynamics, such as Proposition 3 from Lecture 6 or the Birkhoff multiple recurrence theorem from Lecture 4 (though to get the full strength of the results, one needs to use either syndetic van der Waerden theorem, see Exercise 15.3 from Lecture 5, or the Varnavides averaging trick from the previous lecture). We leave the details to the reader. \diamond

— Kronecker systems and Haar measure —

We have seen how nice almost periodic functions are. Motivated by this, we define

Definition 2. A measure-preserving system (X, {\mathcal X}, \mu, T) is said to be compact if every function in L^2(X, {\mathcal X}, \mu) is almost periodic.

Thus, for instance, by Example 2, the circle shift system is compact, but from Exercise 3, no non-trivial Bernoulli system is compact. From Proposition 1 we know that the Furstenberg recurrence theorem is true for compact systems.

One source of compact systems comes from Kronecker systems, as introduced in Lecture 6. As such systems are topological rather than measure-theoretic, we will need to endow them with a canonical measure – Haar measure – first.

Let G be a compact metrisable topological group (not necessarily abelian). Without an ambient measure, we cannot yet define the convolution f*g of two continuous functions f, g \in C(G). However, we can define the convolution \mu*f of a finite Borel measure \mu on G and a continuous function f \in C(G) to be the function

\mu * f(x) := \int_G f(y^{-1} x)\ d\mu(y), (2)

which (by the uniform continuity of f) is easily seen to be another continuous function. We similarly define

f * \mu(x) := \int_G f(x y^{-1})\ d\mu(y). (2′)

Also, one can define the convolution \mu * \nu of two finite Borel measures to be the finite Borel measure defined as

\mu*\nu(E) := \int_G \nu( y^{-1} \cdot E )\ d\mu(y) (3)

for all Borel sets E. For instance, the convolution \delta_x * \delta_y of two Dirac masses is another Dirac mass \delta_{xy}. Fubini’s theorem tells us that the convolution of two finite measures is another finite measure. Convolution is also bilinear and associative (thus (\mu*\nu)*\rho = \mu*(\nu*\rho), f *(\mu*\nu)=(f*\mu)*\nu, (\mu*f)*\nu = \mu*(f*\nu), and (\mu*\nu)*f = \mu*(\nu*f) for measures \mu,\nu,\rho and continuous f); in particular, left-convolution and right-convolution commute. Also observe that the convolution of two Borel probability measures is again a Borel probability measure. Convolution also has a powerful smoothing effect that can upgrade weak convergence to strong convergence. Specifically, if \mu_n converges in the vague sense to \mu, and f is continuous, then an easy application of compactness of the underlying group G reveals that \mu_n*f converges in the uniform sense to \mu*f.

Let us say that a number c is a left-mean (resp. right-mean) of a continuous function f \in C(G) if there exists a probability measure \mu such that \mu*f (resp. f*\mu_n) is equal to a constant c. For compact metrisable groups G, this mean is well defined:

Lemma 1. (Existence and uniqueness of mean) Let G be a compact metrisable topological group, and let f \in C(G). Then there exists a unique constant c which is both a left-mean and right-mean of f.

Proof. Without loss of generality we can take f to be real-valued. Let us first show that there exists a left-mean. Define the oscillation of a real-valued continuous function to be the difference between its maximum and minimum. By the vague sequential compactness of probability measures (Lemma 1 from Lecture 7), one can find a probability measure \mu which minimises the oscillation of \mu*f. If this oscillation is zero, we are done. If the oscillation is non-zero, then (using the compactness of the group and the transitivity of the group action) it is not hard to find a finite number of left rotations of \mu*f whose average has strictly smaller oscillation than that of \mu*f (basically by rotating the places where \mu*f is near its maximum to cover where it is near its mimum). Thus we have a finitely supported probability \nu with \nu*\mu*f having smaller oscillation than \mu*f, a contradiction. We thus see that a left-mean exists. Similarly, a right-mean exists. But since left-convolution commutes with right-convolution, we see that all left-means are equal to all right-means, and the claim follows. \diamond

The map f \mapsto c from a continuous function to its mean is a bounded non-negative linear functional on C(G) which preserves constants, and thus by the Riesz representation theorem is given by a unique probability measure \mu; since left- and right-convolution commute, we see that this measure is both left- and right- invariant. Conversely, given any such measure \mu we easily see (again using the commutativity of left-and right-convolution) that f*\mu = \mu*f = c. We have thus shown

Corollary 1. (Existence and uniqueness of Haar measure) If G is a compact metrisable topological group, then there exists a unique Borel probability measure \mu on G which is both left and right invariant.

In particular, every topological Kronecker system (K, x \mapsto x+\alpha) can be canonically converted into a measure-preserving system, which is then compact by the same argument used to establish Example 2. (Actually this observation works for non-abelian Kronecker systems as well as abelian ones.)

Remark 2. One can also build left- and right- Haar measures for locally compact groups; these measures are locally finite Radon measures rather than Borel probability measures, and are unique up to constants; however it is no longer the case that such measures are necessarily equal to each other except in special cases, such as when the group is abelian or compact. These measures play an important role in the harmonic analysis and representation theory of such groups, but we will not discuss these topics further here. \diamond

— Classification of compact systems —

We have just seen that every Kronecker system is a compact system. The converse is not quite true; consider for instance the disjoint union of two Kronecker systems from different groups (with the probability measure being split, say, 50-50, between the two components). The situation is similar to that in Lecture 6, in which every Kronecker system was equicontinuous and isometric, but the converse only held under the additional assumption of minimality. There is a similar situation here, but first we need to define the notion of equivalence of two measure-preserving systems.

Define an abstract measure-preserving system ({\mathcal X}, \mu, T) to be an abstract separable \sigma-algebra {\mathcal X} (which, by Stone’s theorem, can be viewed as consisting of subsets of some ambient space X, though for our purposes it is better here to stay in the abstract world), together with an abstract probability measure \mu: {\mathcal X} \to [0,1] and an abstract invertible shift T: {\mathcal X} \to {\mathcal X} which preserves the measure \mu (but does not necessarily come from an invertible map T: X \to X on some ambient space). (An abstract measure space ({\mathcal X},\mu) is sometimes also known as a measure algebra.) There is an obvious notion of a morphism \Phi: ({\mathcal X}, \mu, T) \to ({\mathcal Y},\nu,S) between abstract measure-preserving systems, in which \Phi: {\mathcal Y} \to {\mathcal X} (note the contravariance) is a \sigma-algebra homomorphism with \nu = \mu \circ \Phi and S \circ \Phi = \Phi \circ T. This makes the class of abstract measure-preserving systems into a category. In particular we have a notion of two abstract measure-preserving systems being isomorphic.

Example 3. Let (X,{\mathcal X},\mu,T) be a skew shift (y,z) \mapsto (y+\alpha,z+y) and let (Y,{\mathcal Y},\nu,S) be the underlying circle shift y \mapsto y+\alpha. These systems are of course non-isomorphic, although there is a factor map \pi: X \to Y which is a morphism. If however we consider the \sigma-algebra \pi^\#({\mathcal Y}) \subset {\mathcal X} (which are the Cartesian products of horizontal Borel sets with the vertical circle {\Bbb R}/{\Bbb Z}), we see that \pi induces an isomorphism between the abstract measure-preserving systems (\pi^\#({\mathcal Y}), \mu, T) and ({\mathcal Y}, \nu, S). \diamond

Given a concrete measure-preserving system (X,{\mathcal X},\mu,T), we can define its abstraction ({\mathcal X}/\sim, \mu, T), where \sim is the equivalence relation of almost everywhere equivalence modulo \mu. In category theoretic language, abstraction is a covariant functor from the category of concrete measure-preserving systems to the category of abstract measure-preserving systems. We say that two concrete measure-preserving systems are equivalent if their abstractions are isomorphic. Thus for instance, in Example 3  above, (X, \pi^\#({\mathcal Y}), \mu, T) and (Y,  {\mathcal Y}, \nu, S) are equivalent; there is no concrete isomorphism between these two systems, but once one abstracts away the underlying sets X and Y, we can recover an equivalence. As another example, we see that if we add or remove a null set to a measure-preserving system, we obtain an abstractly equivalent measure-preserving system.

Remark 3. Up to null sets, we can also identify an abstract measure-preserving system ({\mathcal X},\mu,T) with its abelian von Neumann algebra L^\infty({\mathcal X},\mu) (which acts on the Hilbert space L^2({\mathcal X},\mu) by pointwise multiplication), together with an automorphism T of that algebra; conversely, one can recover the algebra {\mathcal X} as the idempotents 1_E of the von Neumann algebra, and the measure \mu(E) of a set being the trace of the idempotent 1_E. A significant portion of ergodic theory can in fact be rephrased in terms of von Neumann algebras (which, in particular, naturally suggests a non-commutative generalisation of the subject), although we will not adopt this perspective here. \diamond

Many results and notions about concrete measure-preserving systems (X,{\mathcal X},\mu,T) can be rephrased to not require knowledge of the underlying space X (and to be stable under modification by null sets), and so can be converted to statements about abstract measure-preserving systems; for instance, the Furstenberg recurrence theorem is of this form once one replaces “non-empty” with “positive measure” (see Exercise 1 from Lecture 10). The notion of ergodicity is also of this form. In particular, such results and notions automatically become preserved under equivalence. In view of this, the following classification result is of interest:

Theorem 2. (Classification of ergodic compact systems) Every ergodic compact system is equivalent to an (abelian) Kronecker system.

To prove this theorem, it is convenient to use a harmonic analysis approach. Define an eigenfunction of a measure-preserving system (X,{\mathcal X},\mu,T) to be a bounded measurable function f, not a.e. zero, such that Tf = \lambda f a.e..

Let {\mathcal Z}_1 \subset {\mathcal X} denote the \sigma-algebra generated by all the eigenfunctions. Note that this contains {\mathcal Z}_0 := {\mathcal X}^T, which is the \sigma-algebra generated by the eigenfunctions with eigenvalue 1. We have the following fundamental result:

Proposition 2. (Description of the almost periodic functions) Let (X, {\mathcal X}, \mu, T) be an ergodic measure-preserving system, and let f \in L^2(X,{\mathcal X},\mu). Then f is almost periodic if and only if it lies in L^2(X,{\mathcal Z}_1,\mu), i.e. if it is {\mathcal Z}_1-measurable (note that {\mathcal Z}_1 contains all null sets of {\mathcal X}).

Remark 4. One can view (X,{\mathcal Z}_1,\mu,T) as the maximal compact factor of (X,{\mathcal X},\mu,T), in much the same way that (X,{\mathcal Z}_0,\mu,T) is the maximal factor on which the system is essentially trivial (every function is essentially invariant). \diamond

Proof. It is clear that every eigenfunction is almost periodic. From repeated application of Exercise 2 we conclude that the indicator of any set in {\mathcal Z}_1 is also almost periodic, and thus (by more applications of Exercise 2) every function in L^2(X,{\mathcal Z}_1,\mu) is almost periodic.

Conversely, suppose f \in L^2(X,{\mathcal X},\mu) is almost periodic. Then the orbit closure Y_f \subset L^2(X,{\mathcal X},\mu) of f is an isometric system; the orbit of f is clearly dense in Y_f, and thus by isometry the orbit of every other point is also dense. Thus Y_f is minimal, and therefore Kronecker by Proposition 1 from Lecture 6; thus we have an isomorphism \phi: K \to Y_f from a group rotation (K, x \mapsto x+\alpha) to Y_f. By rotating if necessary we may assume that \phi(0) =f.

By Corollary 1, K comes with an invariant probability measure \nu. The theory of Fourier analysis on compact abelian groups then says that L^2(K,\nu) is spanned by an (orthonormal) basis of characters \chi. In particular, the Dirac mass at 0 (the group identity of K) can be expressed as the weak limit of finite linear combinations of such characters.

Now we need to move this information back to X. For this we use the operator S: L^2(K,\nu) \to L^2(X,{\mathcal X},\mu) defined by Sh := \int_K \phi(y) h(y)\ d\nu(y); one checks from Minkowski’s integral inequality that this is a bounded linear map. Because \phi is a morphism, and each character is an eigenfunction of the group rotation x \mapsto x+\alpha, one easily checks that the image S \chi of a character \chi is an eigenfunction. Since the image of the Dirac mass is (formally) just f, we thus conclude that f is the weak limit of finite linear combinations of characters. (One can in fact use compactness and continuity to make this a strong limit, but this is not necessary here.) In particular, f is equivalent a.e. to a {\mathcal Z}_1-measurable function, as desired. \Box

Exercise 5. (Spectral description of Kronecker factor) Show that the product of two eigenfunctions is again an eigenfunction. Using this and Proposition 2, conclude that L^2(X,{\mathcal Z}_1,\mu) is in fact equal to {\Bbb H}_{pp}, the closed subspace of the Hilbert space {\Bbb H} := L^2(X,{\mathcal X},\mu) generated by the eigenfunctions of the shift operator T. \diamond

Exercise 6. Let (X,{\mathcal X},\mu,T) be a measure-preserving system, and let f \in L^2(X, {\mathcal X},\mu). We say that f is quasiperiodic if the orbit \{ T^n f: n \in {\Bbb Z} \} lies in a finite-dimensional space. Show that a function is quasiperiodic if and only if it is a finite linear combination of eigenfunctions. Deduce that a function is almost periodic if and only if it is the limit in L^2 of quasiperiodic functions. \diamond

Exercise 7. The purpose of this exercise is to show how abstract measure-preserving systems, and the morphisms between them, can be satisfactorily modeled by concrete systems and morphisms.

  1. Let ({\mathcal X},\mu,T) be an abstract measure-preserving system. Show that there exists a concrete regular measure-preserving system (X', {\mathcal X}', \mu', T') which is equivalent to ({\mathcal X},\mu,T) (thus after omitting X’ and quotienting out both \sigma-algebras by null sets, the two resulting abstract measure-preserving systems are isomorphic); the notion of regularity was defined in Definition 2 of Lecture 9. Hint: take a countable shift-invariant family of sets that generate {\mathcal X} (thus T acts on this space by permutation), and use this to create a \sigma-algebra morphism from {\mathcal X} to {\mathcal X}', the product \sigma-algebra of some boolean space X' := 2^{\Bbb Z}, endowed with a permutation action T’.
  2. Let \phi: ({\mathcal X},\mu,T) \to ({\mathcal Y},\nu,S) be an abstract morphism. Show that there exist regular measure-preserving systems (X',{\mathcal X}',\mu',T') and (Y',{\mathcal Y}',\nu',S') equivalent to ({\mathcal X},\mu,T) and ({\mathcal Y},\nu,S), together with a concrete morphism \phi': X' \to Y', such that obvious commuting square connecting the abstract \sigma-algebras {\mathcal X}, {\mathcal Y}, {\mathcal X}', {\mathcal Y}' quotiented out by null sets does indeed commute. \diamond

Remark 6. Exercise 7 (and various related results) show that the distinction between concrete and abstract measure-preserving systems are very minor in practice. There are however other areas of mathematics in which taking an abstract or “point-less” approach by deleting (or at least downplaying) the underlying space can lead to non-trivial generalisations or refinements of the original concrete concept, for instance when moving from varieties to schemes. \diamond

Proof of Theorem 2. Note that if f is an eigenfunction then T|f|=|f|, and so (if the system is ergodic) |f| is a.e. constant (which implies also that the eigenvalue lies on the unit circle). In particular, any eigenfunction is invertible. The quotient of two eigenfunctions of the same eigenvalue is then T-invariant and thus constant a.e. by ergodicity, which shows that all eigenspaces have geometric multiplicity 1 modulo null sets. As T is unitary, any eigenfunctions of different eigenvalues are orthogonal to each other; as L^2(X, {\mathcal X},\mu) is separable, we conclude that the number of eigenfunctions (up to constants and a.e. equivalence) is at most countable.

Let (\phi_n)_{n \in A} be a collection of representative eigenfunctions for some at most countable index set A with eigenvalues \lambda_n; we can normalise |\phi_n|=1 a.e.. By modifying each eigenfunction on a set of measure zero (cf. Exercise 7 of Lecture 8) we can assume that T \phi_n = \lambda_n \phi_n and |\phi_n|=1 everywhere rather than just almost everywhere. Then the map \Phi: x \mapsto (\log \phi_n(x))_{n \in A} is a morphism from (X,{\mathcal X},\mu,T) to the torus ({\Bbb R}/{\Bbb Z})^A with the product \sigma-algebra {\mathcal B}, the push-forward measure \Phi_\# \mu, and the shift x \mapsto x+\alpha, where \alpha := (\log \lambda_n)_{n \in A}. From Proposition 2 we see that every measurable set in {\mathcal X} differs by a null set from a set in the pullback \sigma-algebra \Phi^\#( {\mathcal B} ). From this it is not hard to see that (X,{\mathcal X},\mu,T) is equivalent to the system ({\Bbb R}/{\Bbb Z})^A, {\mathcal B}, \Phi_\# \mu, x \mapsto x+\alpha).

Now, {\Bbb R}/{\Bbb Z})^A is a compact metrisable space. The orbit closure K of \alpha inside this space is thus also compact metrisable. The support of \Phi_\# \mu is shift-invariant and thus K-invariant; but from the ergodicity of \mu we conclude that the support must in fact be a single translate of K. In particular, \Phi_\# \mu is just a translate of Haar measure on K. From this one easily concludes that ({\Bbb R}/{\Bbb Z})^A, {\mathcal B}, \Phi_\# \mu, x \mapsto x+\alpha) is equivalent to the Kronecker system (K, x \mapsto x+\alpha) with the Borel \sigma-algebra and Haar measure, and the claim follows. \Box

Exercise 8. Let (X,{\mathcal X},\mu,T) be a compact system which is not necessarily ergodic, and let y \mapsto \mu_y be the ergodic decomposition of \mu relative to the projection \pi: (X,{\mathcal X},\mu,T) \to (Y, {\mathcal Y},\nu,S) given by Proposition 4 of Lecture 9. Show that (X,{\mathcal X},\mu_y,T) is a compact ergodic system for \nu-almost every y. From this and Theorem 2, we conclude that every compact system can be disintegrated into ergodic Kronecker systems (cf. the discussion after Proposition 2 of Lecture 6). \diamond

Remark 7. We comment here on finitary versions of the above concepts. Consider the cyclic group system ({\Bbb Z}/N{\Bbb Z},x \mapsto x+1) with the discrete \sigma-algebra and uniform probability measure. Strictly speaking, every function on this system is periodic with period N and thus almost periodic, and so this is a compact system. But suppose we consider N as a large parameter going to infinity (in which case one can view these systems, together with some function f = f_N on these systems, “converging” to some infinite system with some limit function f, as in the derivation of Theorem 3 from Theorem 2 in the previous lecture). Then we would be interested in uniform control on the almost periodicity of the function or the compactness of the system, i.e. quantitative bounds involving expressions such as O(1) which are bounded uniformly in N. With such a perspective, the analogue of a quasiperiodic function (see Exercise 6) is a function f: {\Bbb Z}/N{\Bbb Z} \to {\Bbb C} which is a linear combination of at most O(1) characters (i.e. its Fourier transform is non-zero at only O(1) frequencies), whilst an almost periodic function f is one which is approximable in L^2 by quasiperiodic functions, thus for every \varepsilon > 0 one can find a function with only O_\varepsilon(1) frequencies which lies within \varepsilon of f in L^2 norm. Most functions on {\Bbb Z}/N{\Bbb Z} for large N are not like this, and so the cyclic shift system is not compact in the asymptotic limit N \to \infty; however if one coarsens the underlying \sigma-algebra significantly one can recover compactness, though unfortunately one has to replace exact shift-invariance by approximate shift-invariance when one does so. For instance if one considers a \sigma-algebra {\mathcal B} generated by a bounded (O(1)) number of Bohr sets \{ n \in {\Bbb Z}/N{\Bbb Z}: \frac{\xi n}{N} - a\|_{{\Bbb R}/{\Bbb Z}} \leq \varepsilon \}, then {\mathcal B} is no longer shift-invariant in general, but all the functions which are measurable with respect to this algebra are uniformly almost periodic in the above sense. For some further developments of these sorts of “quantitative ergodic theory” ideas, see these papers of Ben Green and myself. \diamond

[Update, Feb 12: Another exercise and remark added.]

[Update, Feb 13: Examples 1 and 2 corrected.]