You are currently browsing the tag archive for the ‘ergodic theory’ tag.
Vitaly Bergelson, Tamar Ziegler, and I have just uploaded to the arXiv our joint paper “Multiple recurrence and convergence results associated to -actions“. This paper is primarily concerned with limit formulae in the theory of multiple recurrence in ergodic theory. Perhaps the most basic formula of this type is the mean ergodic theorem, which (among other things) asserts that if
is a measure-preserving
-system (which, in this post, means that
is a probability space and
is measure-preserving and invertible, thus giving an action
of the integers), and
are functions, and
is ergodic (which means that
contains no
-invariant functions other than the constants (up to almost everywhere equivalence, of course)), then the average
to the expression
see e.g. this previous blog post. Informally, limit formula as an equidistribution result: if is drawn at random from
(using the probability measure
), and
is drawn at random from
for some large
, then the pair
becomes uniformly distributed in the product space
(using product measure
) in the limit as
.
If we allow to be non-ergodic, then we still have a limit formula, but it is a bit more complicated. Let
be the
-invariant measurable sets in
; the
-system
can then be viewed as a factor of the original system
, which is equivalent (in the sense of measure-preserving systems) to a trivial system
(known as the invariant factor) in which the shift is trivial. There is then a projection map
to the invariant factor which is a factor map, and the average (1) converges in the limit to the expression
is the pushforward map associated to the map
; see e.g. this previous blog post. We can interpret this as an equidistribution result. If
is a pair as before, then we no longer expect complete equidistribution in
in the non-ergodic, because there are now non-trivial constraints relating
with
; indeed, for any
-invariant function
, we have the constraint
; putting all these constraints together we see that
(for almost every
, at least). The limit (2) can be viewed as an assertion that this constraint
are in some sense the “only” constraints between
and
, and that the pair
is uniformly distributed relative to these constraints.
Limit formulae are known for multiple ergodic averages as well, although the statement becomes more complicated. For instance, consider the expression
; this is analogous to the combinatorial task of counting length sets. For simplicity we assume the system
to be ergodic. Naively limit to then converge to
which would roughly speaking correspond to an assertion that the triplet is asymptotically equidistributed in
. However, even in the ergodic case there can be additional constraints on this triplet that cannot be seen at the level of the individual pairs
,
. The key obstruction here is that of eigenfunctions of the shift
, that is to say non-trivial functions
that obey the eigenfunction equation
almost everywhere for some constant (or
-invariant)
. Each such eigenfunction generates a constraint
,
, and
. However, it turns out that these are in some sense the only constraints on
that are relevant for the limit (3). More precisely, if
to be the sub-algebra of
generated by the eigenfunctions of
, then it turns out that the factor
is isomorphic to a shift system
known as the Kronecker factor, for some compact abelian group
and some (irrational) shift
; the factor map
pushes eigenfunctions forward to (affine) characters on
. It is then known that the limit of (3) is
where is the closed subgroup
and is the Haar probability measure on
; see this previous blog post. The equation
defining
corresponds to the constraint (4) mentioned earlier. Among other things, this limit formula implies Roth’s theorem, which in the context of ergodic theory is the assertion that the limit (or at least the limit inferior) of (3) is positive when
is non-negative and not identically vanishing.
If average
, which obey an eigenfunction equation
in which
is no longer constant, but is now a linear eigenfunction. For such functions,
behaves quadratically in
, and existence of a constraint
,
,
, and
that is not detected at the triple average level. As it turns out, this is not the only type of constraint relevant for (5); there is a more general class of constraint involving two-step nilsystems which we will not detail here, but see e.g. this previous blog post for more discussion. Nevertheless there is still a similar limit formula to previous examples, involving a special factor
which turns out to be an inverse limit of two-step nilsystems; this limit theorem can be extracted from the structural theory in this paper of Host and Kra combined with a limit formula for nilsystems obtained by Lesigne, but will not be reproduced here. The pattern continues to higher averages (and higher step nilsystems); this was first dby Ziegler, and can also in principle be extracted from the structural theory of Host-Kra combined with nilsystem equidistribution results of Leibman. These sorts of limit formulae can lead to various recurrence results refining Roth’s theorem in various ways; see this paper of Bergelson, Host, and Kra for some examples of this.
The above discussion was concerned with -systems, but of the theory to measure-preserving
-systems for other discrete countable abelian groups
, in which family
of shifts indexed by
rather than a single shift, obeying the compatibility relation
. The role of the intervals
in this more general setting is replaced by that of Folner sequences. For arbitrary countable abelian
, the theory for double averages (1) and triple limits (3) is essentially identical to the
-system case. But when and higher limits, the situation becomes more complicated (and, for arbitrary
, still not fully understood). However is now well understood is the finite field case when
is an infinite-dimensional vector space over a finite field
(with the finite subspaces
then being a good choice for the Folner sequence). Here, the analogue of the structural theory of Host and Kra was worked out by Vitaly, Tamar, and myself in these previous papers (treating the high characteristic and low characteristic cases respectively). In the finite field setting, it turns out that nilsystems no longer appear, and deal with linear, quadratic, and higher order eigenfunctions (known collectively as phase polynomials). It is then natural to look for a limit formula that asserts, roughly speaking, that if
is drawn at random from a
-system and
drawn randomly from a large subspace of
, then the only constraints between
are those that arise from phase polynomials. The main theorem of this paper is to establish this limit formula (which, again, is a little complicated to state explicitly and will not be d. In particular, we establish for the first time that the limit actually exists (a result which, for
-systems, was results of this paper of Host and Kra).
As a consequence, we can recover finite field analogues of most of the results of Bergelson-Host-Kra, though interestingly some of the counterexamples demonstrating sharpness of their results for -systems (based on Behrend set constructions) do not seem to be present in the finite field setting (cf. this previous blog post on the cap set problem). In particular, we are able to largely settle the question of when type theorem that asserts that for any measurable set
in an ergodic
-system and any
,
for a syndetic set of , where
are distinct residue classes. It turns out that Khintchine-type theorems always hold for
(and for
ergodicity is not required), and for
it holds whenever
form a parallelogram, but not otherwise (though the counterexample here was such a painful computation that we ended up removing it from the paper, and may end up putting it online somewhere instead), and for larger
we could show that the Khintchine property failed for generic choices of
, though the problem of determining exactly the tuples for which the Khintchine property failed looked to be rather messy and we did not completely settle it.
One of the basic objects of study in combinatorics are finite strings or infinite strings
of symbols
from some given alphabet
, which could be either finite or infinite (but which we shall usually take to be compact). For instance, a set
of natural numbers can be identified with the infinite string
of
s and
s formed by the indicator of
, e.g. the even numbers can be identified with the string
from the alphabet
, the multiples of three can be identified with the string
, and so forth. One can also consider doubly infinite strings
, which among other things can be used to describe arbitrary subsets of integers.
On the other hand, the basic object of study in dynamics (and in related fields, such as ergodic theory) is that of a dynamical system , that is to say a space
together with a shift map
(which is often assumed to be invertible, although one can certainly study non-invertible dynamical systems as well). One often adds additional structure to this dynamical system, such as topological structure (giving rise topological dynamics), measure-theoretic structure (giving rise to ergodic theory), complex structure (giving rise to complex dynamics), and so forth. A dynamical system gives rise to an action of the natural numbers
on the space
by using the iterates
of
for
; if
is invertible, we can extend this action to an action of the integers
on the same space. One can certainly also consider dynamical systems whose underlying group (or semi-group) is something other than
or
(e.g. one can consider continuous dynamical systems in which the evolution group is
), but we will restrict attention to the classical situation of
or
actions here.
There is a fundamental correspondence principle connecting the study of strings (or subsets of natural numbers or integers) with the study of dynamical systems. In one direction, given a dynamical system , an observable
taking values in some alphabet
, and some initial datum
, we can first form the forward orbit
of
, and then observe this orbit using
to obtain an infinite string
. If the shift
in this system is invertible, one can extend this infinite string into a doubly infinite string
. Thus we see that every quadruplet
consisting of a dynamical system
, an observable
, and an initial datum
creates an infinite string.
Example 1 If
is the three-element set
with the shift map
,
is the observable that takes the value
at the residue class
and zero at the other two classes, and one starts with the initial datum
, then the observed string
becomes the indicator
of the multiples of three.
In the converse direction, every infinite string in some alphabet
arises (in a decidedly non-unique fashion) from a quadruple
in the above fashion. This can be easily seen by the following “universal” construction: take
to be the set
of infinite strings
in the alphabet
, let
be the shift map
let be the observable
and let be the initial point
Then one easily sees that the observed string is nothing more than the original string
. Note also that this construction can easily be adapted to doubly infinite strings by using
instead of
, at which point the shift map
now becomes invertible. An important variant of this construction also attaches an invariant probability measure to
that is associated to the limiting density of various sets associated to the string
, and leads to the Furstenberg correspondence principle, discussed for instance in these previous blog posts. Such principles allow one to rigorously pass back and forth between the combinatorics of strings and the dynamics of systems; for instance, Furstenberg famously used his correspondence principle to demonstrate the equivalence of Szemerédi’s theorem on arithmetic progressions with what is now known as the Furstenberg multiple recurrence theorem in ergodic theory.
In the case when the alphabet is the binary alphabet
, and (for technical reasons related to the infamous non-injectivity
of the decimal representation system) the string
does not end with an infinite string of
s, then one can reformulate the above universal construction by taking
to be the interval
,
to be the doubling map
,
to be the observable that takes the value
on
and
on
(that is,
is the first binary digit of
), and
is the real number
(that is,
in binary).
The above universal construction is very easy to describe, and is well suited for “generic” strings that have no further obvious structure to them, but it often leads to dynamical systems that are much larger and more complicated than is actually needed to produce the desired string
, and also often obscures some of the key dynamical features associated to that sequence. For instance, to generate the indicator
of the multiples of three that were mentioned previously, the above universal construction requires an uncountable space
and a dynamics which does not obviously reflect the key features of the sequence such as its periodicity. (Using the unit interval model, the dynamics arise from the orbit of
under the doubling map, which is a rather artificial way to describe the indicator function of the multiples of three.)
A related aesthetic objection to the universal construction is that of the four components of the quadruplet
used to generate the sequence
, three of the components
are completely universal (in that they do not depend at all on the sequence
), leaving only the initial datum
to carry all the distinctive features of the original sequence. While there is nothing wrong with this mathematically, from a conceptual point of view it would make sense to make all four components of the quadruplet to be adapted to the sequence, in order to take advantage of the accumulated intuition about various special dynamical systems (and special observables), not just special initial data.
One step in this direction can be made by restricting to the orbit
of the initial datum
(actually for technical reasons it is better to restrict to the topological closure
of this orbit, in order to keep
compact). For instance, starting with the sequence
, the orbit now consists of just three points
,
,
, bringing the system more in line with the example in Example 1. Technically, this is the “optimal” representation of the sequence by a quadruplet
, because any other such representation
is a factor of this representation (in the sense that there is a unique map
with
,
, and
). However, from a conceptual point of view this representation is still somewhat unsatisfactory, given that the elements of the system
are interpreted as infinite strings rather than elements of a more geometrically or algebraically rich object (e.g. points in a circle, torus, or other homogeneous space).
For general sequences , locating relevant geometric or algebraic structure in a dynamical system generating that sequence is an important but very difficult task (see e.g. this paper of Host and Kra, which is more or less devoted to precisely this task in the context of working out what component of a dynamical system controls the multiple recurrence behaviour of that system). However, for specific examples of sequences
, one can use an informal procedure of educated guesswork in order to produce a more natural-looking quadruple
that generates that sequence. This is not a particularly difficult or deep operation, but I found it very helpful in internalising the intuition behind the correspondence principle. Being non-rigorous, this procedure does not seem to be emphasised in most presentations of the correspondence principle, so I thought I would describe it here.
Let be an abelian countable discrete group. A measure-preserving
-system
(or
-system for short) is a probability space
, equipped with a measure-preserving action
of the group
, thus
for all and
, and
for all , with
equal to the identity map. Classically, ergodic theory has focused on the cyclic case
(in which the
are iterates of a single map
, with elements of
being interpreted as a time parameter), but one can certainly consider actions of other groups
also (including continuous or non-abelian groups).
A -system is said to be strongly
-mixing, or strongly mixing for short, if one has
for all , where the convergence is with respect to the one-point compactification of
(thus, for every
, there exists a compact (hence finite) subset
of
such that
for all
).
Similarly, we say that a -system is strongly
-mixing if one has
for all , thus for every
, there exists a finite subset
of
such that
whenever all lie outside
.
It is obvious that a strongly -mixing system is necessarily strong
-mixing. In the case of
-systems, it has been an open problem for some time, due to Rohlin, whether the converse is true:
Problem 1 (Rohlin’s problem) Is every strongly mixing
-system necessarily strongly
-mixing?
This is a surprisingly difficult problem. In the positive direction, a routine application of the Cauchy-Schwarz inequality (via van der Corput’s inequality) shows that every strongly mixing system is weakly -mixing, which roughly speaking means that
converges to
for most
. Indeed, every weakly mixing system is in fact weakly mixing of all orders; see for instance this blog post of Carlos Matheus, or these lecture notes of myself. So the problem is to exclude the possibility of correlation between
,
, and
for a small but non-trivial number of pairs
.
It is also known that the answer to Rohlin’s problem is affirmative for rank one transformations (a result of Kalikow) and for shifts with purely singular continuous spectrum (a result of Host; note that strongly mixing systems cannot have any non-trivial point spectrum). Indeed, any counterexample to the problem, if it exists, is likely to be highly pathological.
In the other direction, Rohlin’s problem is known to have a negative answer for -systems, by a well-known counterexample of Ledrappier which can be described as follows. One can view a
-system as being essentially equivalent to a stationary process
of random variables
in some range space
indexed by
, with
being
with the obvious shift map
In Ledrappier’s example, the take values in the finite field
of two elements, and are selected at uniformly random subject to the “Pascal’s triangle” linear constraints
A routine application of the Kolmogorov extension theorem allows one to build such a process. The point is that due to the properties of Pascal’s triangle modulo (known as Sierpinski’s triangle), one has
for all powers of two . This is enough to destroy strong
-mixing, because it shows a strong correlation between
,
, and
for arbitrarily large
and randomly chosen
. On the other hand, one can still show that
and
are asymptotically uncorrelated for large
, giving strong
-mixing. Unfortunately, there are significant obstructions to converting Ledrappier’s example from a
-system to a
-system, as pointed out by de la Rue.
In this post, I would like to record a “finite field” variant of Ledrappier’s construction, in which is replaced by the function field ring
, which is a “dyadic” (or more precisely, “triadic”) model for the integers (cf. this earlier blog post of mine). In other words:
Theorem 2 There exists a
-system that is strongly
-mixing but not strongly
-mixing.
The idea is much the same as that of Ledrappier; one builds a stationary -process
in which
are chosen uniformly at random subject to the constraints
and all
. Again, this system is manifestly not strongly
-mixing, but can be shown to be strongly
-mixing; I give details below the fold.
As I discussed in this previous post, in many cases the dyadic model serves as a good guide for the non-dyadic model. However, in this case there is a curious rigidity phenomenon that seems to prevent Ledrappier-type examples from being transferable to the one-dimensional non-dyadic setting; once one restores the Archimedean nature of the underlying group, the constraints (1) not only reinforce each other strongly, but also force so much linearity on the system that one loses the strong mixing property.
I have recently finished a draft version of my blog book “Poincaré’s legacies: pages from year two of a mathematical blog“, which covers all the mathematical posts from my blog in 2008, excluding those posts which primarily originated from other authors or speakers.
The draft is much longer – 694 pages – than the analogous draft from 2007 (which was 374 pages using the same style files). This is largely because of the two series of course lecture notes which dominate the book (and inspired its title), namely on ergodic theory and on the Poincaré conjecture. I am talking with the AMS staff about the possibility of splitting the book into two volumes, one focusing on ergodic theory, number theory, and combinatorics, and the other focusing on geometry, topology, and PDE (though there will certainly be miscellaneous sections that will basically be divided arbitrarily amongst the two volumes).
The draft probably also needs an index, which I will attend to at some point before publication.
As in the previous book, those comments and corrections from readers which were of a substantive and mathematical nature have been acknowledged in the text. In many cases, I was only able to refer to commenters by their internet handles; please email me if you wish to be attributed differently (or not to be attributed at all).
Any other suggestions, corrections, etc. are, of course welcome.
I learned some technical tricks for HTML to LaTeX conversion which made the process significantly faster than last year’s, although still rather tedious and time consuming; I thought I might share them below as they may be of use to anyone else contemplating a similar conversion.
This month I am at MSRI, for the programs of Ergodic Theory and Additive Combinatorics, and Analysis on Singular Spaces, that are currently ongoing here. This week I am giving three lectures on the correspondence principle, and on finitary versions of ergodic theory, for the introductory workshop in the former program. The article here is broadly describing the content of these talks (which are slightly different in theme from that announced in the abstract, due to some recent developments). [These lectures were also recorded on video and should be available from the MSRI web site within a few months.]
In this lecture, I define the basic notion of a dynamical system (as well as the more structured notions of a topological dynamical system and a measure-preserving system), and describe the main topics we will cover in this course.
Next quarter, starting on Wednesday January 9, I will be teaching a graduate course entitled “Topics in Ergodic Theory“. As an experiment, I have decided to post my lecture notes on this blog as the course progresses, as it seems to be a good medium to encourage feedback and corrections. (On the other hand, I expect that my frequency of posting on non-ergodic theory topics is going to go down substantially during this quarter.) All of my class posts will be prefaced with the course number, 254A, and will be placed in their own special category.
The topics I plan to cover include
- Topological dynamics;
- Classical ergodic theorems;
- The Furstenberg-Zimmer structure theory of measure preserving systems;
- Multiple recurrence theorems, and the connections with Szemerédi-type theorems;
- Orbits in homogeneous spaces (and in particular, in nilmanifolds);
- (Special cases of) Ratner’s theorem, and applications to number theory (e.g. the Oppenheim conjecture).
If time allows I will cover some other topics in ergodic theory as well (I haven’t decided yet exactly which ones to discuss yet, and might be willing to entertain some suggestions in this regard.)
If this works out well then I plan to also do the same for my spring class, in which I will cover as much of Perelman’s proof of the Poincaré conjecture as I can manage. (Note though that this latter class will build upon a class on Ricci flow given by my colleague William Wylie in the winter quarter, which will thus be a de facto prerequisite for my spring course.)
Ciprian Demeter, Michael Lacey, Christoph Thiele and I have just uploaded our joint paper, “The Walsh model for Carleson” to the arXiv. This paper (which was recently accepted for publication in Revista Iberoamericana) establishes a simplified model for the key estimate (the “
Carleson estimate”) in another (much longer) paper of ours on the return times theorem of Bourgain, in which the Fourier transform is replaced by its dyadic analogue, the Walsh-Fourier transform. This model estimate is established by the now-standard techniques of time-frequency analysis: one decomposes the expression to be estimated into a sum over tiles, and then uses combinatorial stopping time arguments into group the tiles into trees, and the trees into forests. One then uses (phase-space localised, and frequency-modulated) versions of classical Calderòn-Zygmund theory (or in this particular case, a certain maximal Fourier inequality of Bourgain) to control individual trees and forests, and sums up over the trees and forests using orthogonality methods (excluding an exceptional set if necessary).
Rather than discuss time-frequency analysis in detail here, I thought I would dwell instead on the return times theorem, and sketch how it is connected to the Carleson estimate; this is a more complicated version of the “
Carleson estimate”, which is an estimate which is logically equivalent to Carleson’s famous theorem (and its extension by Hunt) on the almost everywhere convergence of Fourier series.
I’ve just uploaded to the ArXiV my paper “Norm convergence of multiple ergodic averages for commuting transformations“, submitted to
Ergodic Theory and Dynamical Systems. This paper settles in full generality the norm convergence problem for several commuting transformations. Specifically, if is a probability space and
are commuting measure-preserving transformations, then for any bounded measurable functions
, the multiple average
(1)
is convergent in the norm topology (and thus also converges in probability). The corresponding question of pointwise almost everywhere convergence remains open (and quite difficult, in my opinion). My argument also does not readily establish a formula as to what this limit actually is (it really establishes that the sequence is Cauchy in
rather than convergent).
The l=1 case of this theorem is the classical mean ergodic theorem (also known as the von Neumann ergodic theorem). The l=2 case was established by Conze and Lesigne. The higher l case was partially resolved by Frantzikinakis and Kra, under the additional hypotheses that all of the transformations , as well as the quotients
, are ergodic. The special case
was established by Host-Kra (with another proof given subsequently by Ziegler). Another relevant result is the Furstenberg-Katznelson theorem, which asserts among other things when
is non-negative and not identically zero, then the inner product of the expression (1) with f has a strictly positive limit inferior as
. This latter result also implies Szemerédi’s theorem.
It is also known that the Furstenberg-Katznelson theorem can be proven by hypergraph methods, and in fact my paper also proceeds by a hypergraph-inspired approach, although the language of hypergraphs is not explicitly used in the body of the argument. (In contrast to the work of Host-Kra and Ziegler, no nilsystems appear in the proof.)

Recent Comments