This post is a continuation of the previous post, which has attracted a large number of comments. I’m recording here some calculations that arose from those comments (particularly those of Pace Nielsen, Lior Silberman, Tobias Fritz, and Apoorva Khare). Please feel free to either continue these calculations or to discuss other approaches to the problem, such as those mentioned in the remaining comments to the previous post.
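Let $F_2$ be the free group on two generators $a, b$, and let $\|\cdot\| : F_2 \to [0,+\infty)$ be a quantity obeying the triangle inequality $\|xy\| \le \|x\| + \|y\|$ and the linear growth property $\|x^n\| = |n| \|x\|$ for all $x, y \in F_2$ and integers $n$; this implies the conjugation invariance $\|y^{-1} x y\| = \|x\|$, or equivalently $\|xy\| = \|yx\|$.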
We consider inequalities of the form
for various real numbers . For instance, since
we have (1) for . We also have the following further relations:
Proposition 1
Proof: For (i) we simply observe that
For (ii), we calculate
giving the claim.
For (iii), we calculate
giving the claim.
For (iv), we calculate
giving the claim.
Here is a typical application of the above estimates. If (1) holds for , then by part (i) it holds for
, then by (ii) (2) holds for
, then by (iv) (1) holds for
. The map
has fixed point
, thus
For instance, if , then
.
Here is a curious question posed to me by Apoorva Khare that I do not know the answer to. Let $F_2$ be the free group on two generators $a, b$. Does there exist a metric $d$ on this group which is
- bi-invariant, thus $d(xg, yg) = d(gx, gy) = d(x, y)$ for all $x, y, g \in F_2$; and
- linear growth in the sense that $d(x^n, 1) = n\, d(x, 1)$ for all $x \in F_2$ and all natural numbers $n$?
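By defining the “norm” of an element $x \in F_2$ to be $\|x\| := d(x, 1)$, an equivalent formulation of the problem asks if there exists a non-negative norm function $\|\cdot\| : F_2 \to [0,+\infty)$ that obeys the conjugation invariance
$\|g x g^{-1}\| = \|x\|$ (1)
for all $x, g \in F_2$, the triangle inequality
$\|xy\| \le \|x\| + \|y\|$ (2)
for all $x, y \in F_2$, and the linear growth
$\|x^n\| = |n| \|x\|$ (3)
for all $x \in F_2$ and $n \in \mathbf{Z}$, and such that $\|x\| > 0$ for all non-identity $x \in F_2$. Indeed, if such a norm exists then one can just take $d(x, y) := \|x^{-1} y\|$ to give the desired metric.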
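To make the three requirements concrete, here is a minimal Python sketch that tests the most obvious candidate, the word length of the reduced word. The encoding of group elements as strings over `a, b, A, B` (with capitals standing for inverses) and all function names are assumptions made for illustration only. The word length obeys the triangle inequality, but the sketch shows that it fails both conjugation invariance and linear growth, so it does not answer the question.

```python
# Elements of F_2 are strings over a, b, A, B, where A = a^{-1} and B = b^{-1}.
INVERSE = {"a": "A", "A": "a", "b": "B", "B": "b"}

def reduce_word(word):
    """Cancel adjacent inverse pairs (stack-based free reduction)."""
    stack = []
    for letter in word:
        if stack and stack[-1] == INVERSE[letter]:
            stack.pop()
        else:
            stack.append(letter)
    return "".join(stack)

def mul(x, y):
    return reduce_word(x + y)

def inv(x):
    return "".join(INVERSE[c] for c in reversed(x))

def power(x, n):
    return reduce_word(x * n)

def word_length(x):
    return len(reduce_word(x))

x, g = "b", "a"
conj = mul(mul(g, x), inv(g))                      # the conjugate a b a^{-1}
print(word_length(conj), word_length(x))           # conjugation changes the length
print(word_length(power(conj, 5)), 5 * word_length(conj))  # linear growth fails
```

The conjugate $a b a^{-1}$ has word length $3$ while $b$ has length $1$, and the fifth power of $a b a^{-1}$ collapses to $a b^5 a^{-1}$, which is much shorter than five times the original length.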
One can normalise the norm of the generators to be at most $1$, thus $\|a\|, \|b\| \le 1$.
This can then be used to upper bound the norm of other words in $F_2$. For instance, from (1), (3) one has
A bit less trivially, from (3), (2), (1) one can bound commutators as
In a similar spirit one has
What is not clear to me is if one can keep arguing like this to continually improve the upper bounds on the norm $\|g\|$ of a given non-trivial group element $g$ to the point where this norm must in fact vanish, which would demonstrate that no metric with the above properties on $F_2$ would exist (and in fact would impose strong constraints on similar metrics existing on other groups as well). It is also tempting to use some ideas from geometric group theory (e.g. asymptotic cones) to try to understand these metrics further, though I wasn’t able to get very far with this approach. Anyway, this feels like a problem that might be somewhat receptive to a more crowdsourced attack, so I am posing it here in case any readers wish to try to make progress on it.
I’ve just uploaded to the arXiv my paper “Failure of the $L^1$ pointwise and maximal ergodic theorems for the free group”, submitted to Forum of Mathematics, Sigma. This paper concerns a variant of the pointwise ergodic theorem of Birkhoff, which asserts that if one has a measure-preserving shift map $T: X \to X$ on a probability space $X = (X, \mu)$, then for any $f \in L^1(X)$, the averages $\frac{1}{N} \sum_{n=1}^N f \circ T^n$ converge pointwise almost everywhere. (In the important case when the shift map $T$ is ergodic, the pointwise limit is simply the mean $\int_X f\, d\mu$ of the original function $f$.)
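As a quick numerical illustration of Birkhoff's theorem in the ergodic case, the following sketch averages a function along the orbit of an irrational rotation of the circle. The choice of rotation angle, test function, starting point and function names are all assumptions made for this example; the averages $\frac{1}{N} \sum_{n=1}^N f(T^n x)$ should approach the mean of $f$, which is $0$ here.

```python
import math

def birkhoff_average(f, T, x, N):
    """Return (1/N) * sum_{n=1}^{N} f(T^n x)."""
    total = 0.0
    for _ in range(N):
        x = T(x)
        total += f(x)
    return total / N

theta = (math.sqrt(5) - 1) / 2            # an irrational rotation angle
T = lambda x: (x + theta) % 1.0           # ergodic rotation of the circle [0, 1)
f = lambda x: math.cos(2 * math.pi * x)   # integrable, with mean 0

for N in (10, 100, 1000, 10000):
    print(N, birkhoff_average(f, T, 0.1, N))
```

This is only a sanity check in the amenable setting of a single transformation; it says nothing about the free group averages discussed in the rest of the post.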
The pointwise ergodic theorem can be extended to measure-preserving actions of other amenable groups, if one uses a suitably “tempered” Følner sequence of averages; see this paper of Lindenstrauss for more details. (I also wrote up some notes on that paper here, back in 2006 before I had started this blog.) But the arguments used to handle the amenable case break down completely for non-amenable groups, and in particular for the free non-abelian group $F_2$ on two generators.
Nevo and Stein studied this problem and obtained a number of pointwise ergodic theorems for $F_2$-actions $(T_g)_{g \in F_2}$ on probability spaces $(X, \mu)$. For instance, for the spherical averaging operators
$\mathcal{A}_n f := \frac{1}{4 \times 3^{n-1}} \sum_{g \in F_2: |g| = n} f \circ T_g^{-1}$
(where $|g|$ denotes the length of the reduced word that forms $g$), they showed that $\mathcal{A}_{2n} f$ converged pointwise almost everywhere provided that $f$ was in $L^p(X)$ for some $p > 1$. (The need to restrict to spheres of even radius can be seen by considering the action of $F_2$ on the two-element set $\{0, 1\}$ in which both generators of $F_2$ act by interchanging the elements, in which case $\mathcal{A}_n f$ is determined by the parity of $n$.) This result was reproven with a different and simpler proof by Bufetov, who also managed to relax the condition $f \in L^p(X)$ to the weaker condition $f \in L \log L(X)$.
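The parity obstruction can be seen directly in a small computation. The sketch below (string encoding with capitals for inverses, and all function names, are illustrative assumptions) enumerates the sphere of radius $n$ in the free group on two generators, which has $4 \times 3^{n-1}$ elements, and computes the spherical average on the two-element set where every generator swaps the two points.

```python
from fractions import Fraction

INVERSE = {"a": "A", "A": "a", "b": "B", "B": "b"}

def sphere(n):
    """All reduced words of length n in F_2 (A = a^{-1}, B = b^{-1})."""
    words = [""]
    for _ in range(n):
        words = [w + s for w in words for s in INVERSE
                 if not w or w[-1] != INVERSE[s]]
    return words

def act(word, point):
    """Each generator (or inverse generator) swaps the two points 0 and 1."""
    return (point + len(word)) % 2

def spherical_average(f, n, point):
    """Average of f over the images of `point` under the sphere of radius n."""
    words = sphere(n)
    return sum(Fraction(f[act(w, point)]) for w in words) / len(words)

f = {0: 1, 1: 0}   # indicator function of the point 0
for n in range(1, 6):
    print(n, len(sphere(n)), spherical_average(f, n, 0))
```

The averages alternate between $0$ (odd radius) and $1$ (even radius), so the full sequence of spherical averages cannot converge for this action, while the even-radius subsequence trivially does.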
The question remained open as to whether the pointwise ergodic theorem for $F_2$-actions held if one only assumed that $f$ was in $L^1(X)$. Nevo and Stein were able to establish this for the Cesàro averages $\frac{1}{N} \sum_{n=1}^N \mathcal{A}_n f$, but not for $\mathcal{A}_n f$ itself. About six years ago, Assaf Naor and I tried our hand at this problem, and were able to show an associated maximal inequality on $\ell^1(F_2)$, but due to the non-amenability of $F_2$, this inequality did not transfer to $L^1(X)$ and did not have any direct impact on this question, despite a fair amount of effort on our part to attack it.
Inspired by some recent conversations with Lewis Bowen, I returned to this problem. This time around, I tried to construct a counterexample to the $L^1$ pointwise ergodic theorem – something Assaf and I had not seriously attempted to do (perhaps due to being a bit too enamoured of our $\ell^1(F_2)$ maximal inequality). I knew of an existing counterexample of Ornstein regarding a failure of an $L^1$ ergodic theorem for iterates $P^n$ of a self-adjoint Markov operator $P$ – in fact, I had written some notes on this example back in 2007. Upon revisiting my notes, I soon discovered that the Ornstein construction was adaptable to the $F_2$ setting, thus settling the problem in the negative:
Theorem 1 (Failure of $L^1$ pointwise ergodic theorem) There exists a measure-preserving $F_2$-action on a probability space $X$ and a non-negative function $f \in L^1(X)$ such that $\sup_n \mathcal{A}_{2n} f(x) = +\infty$ for almost every $x \in X$.
To describe the proof of this theorem, let me first briefly sketch the main ideas of Ornstein’s construction, which gave an example of a self-adjoint Markov operator $P$ on a probability space $X$ and a non-negative $f \in L^1(X)$ such that $\sup_n P^n f(x) = +\infty$ for almost every $x$. By some standard manipulations, it suffices to show that for any given $\alpha > 0$ and $\varepsilon > 0$, there exists a self-adjoint Markov operator $P$ on a probability space $X$ and a non-negative $f \in L^1(X)$ with $\|f\|_{L^1(X)} \le \alpha$, such that $\sup_n P^n f \ge 1 - \varepsilon$ on a set of measure at least $1 - \varepsilon$. Actually, it will be convenient to replace the Markov chain $(P^n f)_{n \ge 0}$ with an ancient Markov chain $(f_n)_{n \in \mathbf{Z}}$ – that is to say, a sequence of non-negative functions $f_n$ for both positive and negative $n$, such that $f_{n+1} = P f_n$ for all $n \in \mathbf{Z}$. The purpose of requiring the Markov chain to be ancient (that is, to extend infinitely far back in time) is to allow for the Markov chain to be shifted arbitrarily in time, which is key to Ornstein’s construction. (Technically, Ornstein’s original argument only uses functions that go back to a large negative time, rather than being infinitely ancient, but I will gloss over this point for sake of discussion, as it turns out that the $F_2$ version of the argument can be run using infinitely ancient chains.)
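For any $\alpha > 0$, let $Q(\alpha)$ denote the claim that for any $\varepsilon > 0$, there exists an ancient Markov chain $(f_n)_{n \in \mathbf{Z}}$ with $\|f_n\|_{L^1(X)} = \alpha$ such that $\sup_{n \in \mathbf{Z}} f_n \ge 1 - \varepsilon$ on a set of measure at least $1 - \varepsilon$. Clearly $Q(1)$ holds since we can just take $f_n = 1$ for all $n$. Our objective is to show that $Q(\alpha)$ holds for arbitrarily small $\alpha$. The heart of Ornstein’s argument is then the implication
$Q(\alpha) \implies Q(\alpha(1 - \frac{\alpha}{4}))$ (1)
for any $0 < \alpha \le 1$, which upon iteration quickly gives the desired claim.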
Let’s see informally how (1) works. By hypothesis, and ignoring epsilons, we can find an ancient Markov chain $(f_n)_{n \in \mathbf{Z}}$ on some probability space $X$ of total mass $\alpha$, such that $\sup_n f_n$ attains the value of $1$ or greater almost everywhere. Assuming that the Markov process is irreducible, the $f_n$ will eventually converge as $n \to +\infty$ to the constant value of $\alpha$, in particular its final state will essentially stay above $\alpha$ (up to small errors).
Now suppose we duplicate the Markov process by replacing $X$ with a double copy $X \times \{1, 2\}$ (giving $X \times \{1, 2\}$ the uniform probability measure), and using the disjoint sum of the Markov operators on $X \times \{1\}$ and $X \times \{2\}$ as the propagator, so that there is no interaction between the two components of this new system. Then the functions $f'_n(x, i) := f_n(x) 1_{i = 1}$ form an ancient Markov chain of mass at most $\frac{\alpha}{2}$ that lives solely in the first half $X \times \{1\}$ of this copy, and $\sup_n f'_n$ attains the value of $1$ or greater on almost all of the first half $X \times \{1\}$, but is zero on the second half. The final state of $f'_n$ will be to stay above $\alpha$ in the first half $X \times \{1\}$, but be zero on the second half.
Now we modify the above example by allowing an infinitesimal amount of interaction between the two halves $X \times \{1\}$, $X \times \{2\}$ of the system (I mentally think of $X \times \{1\}$ and $X \times \{2\}$ as two identical boxes that a particle can bounce around in, and now we wish to connect the boxes by a tiny tube). The precise way in which this interaction is inserted is not terribly important so long as the new Markov process is irreducible. Once one does so, then the ancient Markov chain $f'_n$ in the previous example gets replaced by a slightly different ancient Markov chain $\tilde f_n$ which is more or less identical with $f'_n$ for negative times $n$, or for bounded positive times $n$, but for very large values of $n$ the final state is now constant across the entire state space $X \times \{1, 2\}$, and will stay above $\frac{\alpha}{2}$ on this space.
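Finally, we consider an ancient Markov chain $F_n$ which is basically of the form
$F_n(x, i) \approx \tilde f_n(x, i) + (1 - \frac{\alpha}{2}) \tilde f_{n - M}(x, 3 - i)$
for some large parameter $M$ and for all $n \le M$ (the approximation becomes increasingly inaccurate for $n$ much larger than $M$, but never mind this for now). This is basically two copies of the original Markov process in separate, barely interacting state spaces $X \times \{1\}, X \times \{2\}$, but with the second copy delayed by a large time delay $M$, and also attenuated in amplitude by a factor of $1 - \frac{\alpha}{2}$. The total mass of this process is now $\frac{\alpha}{2} + \frac{\alpha}{2} (1 - \frac{\alpha}{2}) = \alpha (1 - \frac{\alpha}{4})$. Because of the $\tilde f_n(x, i)$ component of $F_n$, we see that $\sup_n F_n$ basically attains the value of $1$ or greater on the first half $X \times \{1\}$. On the second half $X \times \{2\}$, we work with times $n$ close to $M$. If $M$ is large enough, $\tilde f_n$ would have averaged out to about $\frac{\alpha}{2}$ at such times, but the $(1 - \frac{\alpha}{2}) \tilde f_{n - M}(x, 3 - i)$ component can get as large as $1 - \frac{\alpha}{2}$ here. Summing (and continuing to ignore various epsilon losses), we see that $\sup_n F_n$ can get as large as $1$ on almost all of the second half of $X \times \{1, 2\}$. This concludes the rough sketch of how one establishes the implication (1).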
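The mass bookkeeping in this sketch can be checked with a few lines of Python. The code below assumes, as the informal argument suggests, that the first copy carries mass $\alpha/2$ on the doubled space and that the delayed copy is attenuated by the factor $1 - \alpha/2$; under those assumptions one doubling step sends $\alpha$ to $\alpha(1 - \alpha/4)$, and iterating the step drives the mass to zero (roughly like $4/k$ after $k$ steps). The function name is hypothetical.

```python
from fractions import Fraction

def next_mass(alpha):
    """Mass after one doubling step: alpha/2 for the first copy plus
    (1 - alpha/2) * alpha/2 for the delayed, attenuated second copy."""
    half = alpha / 2
    return half + (1 - half) * half

# Exact check of a single step, using rational arithmetic.
print(next_mass(Fraction(1)))          # one step starting from alpha = 1

# Iterate in floating point to see how the mass decays.
alpha = 1.0
history = [alpha]
for _ in range(50):
    alpha = next_mass(alpha)
    history.append(alpha)

print(history[:5])
print(history[-1])
```

The decay is slow but steady, which is all that is needed: any target mass is reached after finitely many applications of the implication.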
It was observed by Bufetov that the spherical averages $\mathcal{A}_n$ for a free group action can be lifted up to become powers $P^n$ of a Markov operator, basically by randomly assigning a “velocity vector” $s \in \{a, b, a^{-1}, b^{-1}\}$ to one’s base point $x$ and then applying the Markov process that moves $x$ along that velocity vector (and then randomly changing the velocity vector at each time step, subject to the “reduced word” condition that the velocity never flips from $s$ to $s^{-1}$). Thus the spherical average problem has a Markov operator interpretation, which opens the door to adapting the Ornstein construction to the setting of $F_2$ systems. This turns out to be doable after a certain amount of technical artifice; the main thing is to work with $F_2$-measure preserving systems that admit ancient Markov chains that are initially supported in a very small region in the “interior” of the state space, so that one can couple such systems to each other “at the boundary” in the fashion needed to establish the analogue of (1) without disrupting the ancient dynamics of such chains. The initial such system (used to establish the base case $Q(1)$) comes from basically considering the action of $F_2$ on a (suitably renormalised) “infinitely large ball” in the Cayley graph, after suitably gluing together the boundary of this ball to complete the action. The ancient Markov chain associated to this system starts at the centre of this infinitely large ball at infinite negative time $n = -\infty$, and only reaches the boundary of this ball at the time $n = 0$.
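The Markov lifting described above can be simulated exactly on the Cayley graph itself. In the sketch below (the state encoding and the function name are assumptions for illustration, not Bufetov's formulation), a state is a pair consisting of the reduced word reached so far and the current velocity; each step moves along the velocity and then picks a new velocity uniformly among the three that do not reverse direction. After $n$ steps the position is uniformly distributed on the sphere of radius $n$, which is exactly the weighting used by the spherical average.

```python
from fractions import Fraction
from collections import defaultdict

INVERSE = {"a": "A", "A": "a", "b": "B", "B": "b"}   # A = a^{-1}, B = b^{-1}

def lifted_walk(n):
    """Exact law of the position after n steps of the velocity chain."""
    # Start at the identity with a uniformly random velocity.
    law = {("", s): Fraction(1, 4) for s in INVERSE}
    for _ in range(n):
        new_law = defaultdict(Fraction)
        for (word, s), p in law.items():
            moved = word + s                  # move along the current velocity
            for t in INVERSE:                 # choose the next velocity,
                if t != INVERSE[s]:           # never flipping from s to s^{-1}
                    new_law[(moved, t)] += p / 3
        law = new_law
    positions = defaultdict(Fraction)
    for (word, _), p in law.items():          # forget the velocity component
        positions[word] += p
    return dict(positions)

law = lifted_walk(3)
print(len(law), set(law.values()))
```

For $n = 3$ this prints $36$ positions, each of probability $1/36$, matching the sphere size $4 \times 3^{n-1}$.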
In a multiplicative group $G$, the commutator of two group elements $g, h$ is defined as $[g, h] := g^{-1} h^{-1} g h$ (other conventions are also in use, though they are largely equivalent for the purposes of this discussion). A group is said to be nilpotent of step $s$ (or more precisely, step $\le s$), if all iterated commutators of order $s + 1$ or higher necessarily vanish. For instance, a group is nilpotent of order $1$ if and only if it is abelian, and it is nilpotent of order $2$ if and only if $[[g_1, g_2], g_3] = \mathrm{id}$ for all $g_1, g_2, g_3$ (i.e. all commutator elements $[g_1, g_2]$ are central), and so forth. A good example of an $s$-step nilpotent group is the group of $(s+1) \times (s+1)$ upper-triangular unipotent matrices (i.e. matrices with $1$s on the diagonal and zero below the diagonal), and taking values in some ring (e.g. reals, integers, complex numbers, etc.).
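The matrix example is easy to test numerically. The following sketch works with integer matrices stored as nested lists, uses the commutator convention $[g, h] = g^{-1} h^{-1} g h$, and checks that an iterated commutator of $s + 1$ random unipotent $(s+1) \times (s+1)$ matrices is the identity; the helper names are illustrative assumptions.

```python
import random

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def identity(n):
    return [[int(i == j) for j in range(n)] for i in range(n)]

def inverse_unipotent(U):
    """Invert U = I + N via the finite series I - N + N^2 - ... (N nilpotent)."""
    n = len(U)
    I = identity(n)
    N = [[U[i][j] - I[i][j] for j in range(n)] for i in range(n)]
    result, term = I, I
    for k in range(1, n):
        term = matmul(term, N)
        sign = -1 if k % 2 else 1
        result = [[result[i][j] + sign * term[i][j] for j in range(n)]
                  for i in range(n)]
    return result

def commutator(g, h):
    """[g, h] = g^{-1} h^{-1} g h."""
    return matmul(matmul(inverse_unipotent(g), inverse_unipotent(h)),
                  matmul(g, h))

def random_unipotent(n, rng):
    return [[1 if i == j else (rng.randint(-3, 3) if j > i else 0)
             for j in range(n)] for i in range(n)]

def iterated_commutator(elements):
    """[[...[g_1, g_2], g_3], ...], g_k]."""
    result = elements[0]
    for g in elements[1:]:
        result = commutator(result, g)
    return result

rng = random.Random(0)
s = 3
gs = [random_unipotent(s + 1, rng) for _ in range(s + 1)]
print(iterated_commutator(gs) == identity(s + 1))   # order s + 1: trivial
```

Each commutator pushes the non-zero entries one diagonal further away from the main diagonal, which is why the process terminates after $s$ steps in the $(s+1) \times (s+1)$ case.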
Another important class of examples of nilpotent groups arises from operations on polynomials. For instance, if $V_s$ is the vector space of real polynomials of one variable of degree at most $s$, then there are two natural affine actions on $V_s$. Firstly, every polynomial $Q$ in $V_s$ gives rise to a “vertical” shift $P \mapsto P + Q$. Secondly, every $h \in \mathbf{R}$ gives rise to a “horizontal” shift $P \mapsto P(\cdot + h)$. The group generated by these two shifts is a nilpotent group of step $\le s + 1$; this reflects the well-known fact that a polynomial of degree $\le s$ vanishes once one differentiates more than $s$ times. Because of this link between nilpotency and polynomials, one can view nilpotent algebra as a generalisation of polynomial algebra.
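Here is a small sketch of the polynomial example with integer coefficients. A group element is encoded (as an assumption of this example) by a pair `(h, Q)` acting as $P \mapsto P(\cdot + h) + Q$, with polynomials stored as coefficient lists. Taking the commutator of a vertical shift by $Q$ with the unit horizontal shift produces a vertical shift by the difference $Q(\cdot + 1) - Q$, whose degree is one lower, so repeated commutators eventually become trivial.

```python
from math import comb

def shift_poly(Q, h):
    """Coefficients of x -> Q(x + h), for Q given as [c_0, c_1, ..., c_s]."""
    out = [0] * len(Q)
    for k, c in enumerate(Q):
        for j in range(k + 1):
            out[j] += c * comb(k, j) * h ** (k - j)
    return out

def compose(g1, g2):
    """The affine map 'apply g1, then g2', where (h, Q): P -> P(. + h) + Q."""
    (h1, Q1), (h2, Q2) = g1, g2
    return (h1 + h2, [p + q for p, q in zip(shift_poly(Q1, h2), Q2)])

def inverse(g):
    h, Q = g
    return (-h, [-c for c in shift_poly(Q, -h)])

def commutator(g1, g2):
    """g1^{-1} g2^{-1} g1 g2, with products read left to right via compose."""
    result = inverse(g1)
    for g in (inverse(g2), g1, g2):
        result = compose(result, g)
    return result

def degree(Q):
    nonzero = [k for k, c in enumerate(Q) if c != 0]
    return max(nonzero) if nonzero else -1

s = 3
vertical = (0, [0, 0, 0, 1])        # vertical shift by Q(x) = x^3
horizontal = (1, [0] * (s + 1))     # horizontal shift by h = 1

g, degrees = vertical, []
for _ in range(s + 1):
    g = commutator(g, horizontal)
    degrees.append(degree(g[1]))
print(degrees, g)
```

The recorded degrees drop by one at each stage, and after $s + 1$ commutators with the horizontal shift the result is the identity map, consistent with the step bound $\le s + 1$.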
Suppose one has a finite number $g_1, \ldots, g_n$ of generators. Using abstract algebra, one can then construct the free nilpotent group $\mathcal{F}_{\le s}(g_1, \ldots, g_n)$ of step $\le s$, defined as the group generated by the $g_1, \ldots, g_n$ subject to the relations that all commutators of order $s + 1$ involving the generators are trivial. This is the universal object in the category of nilpotent groups of step $\le s$ with $n$ marked elements $g_1, \ldots, g_n$. In other words, given any other $\le s$-step nilpotent group $G'$ with $n$ marked elements $g'_1, \ldots, g'_n$, there is a unique homomorphism from the free nilpotent group to $G'$ that maps each $g_j$ to $g'_j$ for $1 \le j \le n$. In particular, the free nilpotent group is well-defined up to isomorphism in this category.
In many applications, one wants to have a more concrete description of the free nilpotent group, so that one can perform computations more easily (and in particular, be able to tell when two words in the group are equal or not). This is easy for small values of $s$. For instance, when $s = 1$, $\mathcal{F}_{\le 1}(g_1, \ldots, g_n)$ is simply the free abelian group generated by $g_1, \ldots, g_n$, and so every element $g$ of $\mathcal{F}_{\le 1}(g_1, \ldots, g_n)$ can be described uniquely as
$g = \prod_{j=1}^n g_j^{m_j} := g_1^{m_1} \ldots g_n^{m_n}$ (1)
for some integers $m_1, \ldots, m_n$, with the obvious group law. Indeed, to obtain existence of this representation, one starts with any representation of $g$ in terms of the generators $g_1, \ldots, g_n$, and then uses the abelian property to push the $g_1$ factors to the far left, followed by the $g_2$ factors, and so forth. To show uniqueness, we observe that the group $G$ of formal abelian products $\{ g_1^{m_1} \ldots g_n^{m_n} : m_1, \ldots, m_n \in \mathbf{Z} \} \equiv \mathbf{Z}^n$ is already a $\le 1$-step nilpotent group with marked elements $g_1, \ldots, g_n$, and so there must be a homomorphism from the free group to $G$. Since $G$ distinguishes all the products $g_1^{m_1} \ldots g_n^{m_n}$ from each other, the free group must also.
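It is only slightly more tricky to describe the free nilpotent group $\mathcal{F}_{\le 2}(g_1, \ldots, g_n)$ of step $\le 2$. Using the identities
$gh = hg [g, h]; \quad gh^{-1} = h^{-1} g ([g, h]^{-1})^{h^{-1}}; \quad g^{-1} h = h g^{-1} ([g, h]^{-1})^{g^{-1}}; \quad g^{-1} h^{-1} = h^{-1} g^{-1} [g, h]^{(gh)^{-1}}$
(where $g^h := h^{-1} g h$ is the conjugate of $g$ by $h$) we see that whenever $1 \le i < j \le n$, one can push a positive or negative power of $g_i$ past a positive or negative power of $g_j$, at the cost of creating a positive or negative power of $[g_i, g_j]$, or one of its conjugates. Meanwhile, in a $\le 2$-step nilpotent group, all the commutators are central, and one can pull all the commutators out of a word and collect them as in the abelian case. Doing all this, we see that every element $g$ of $\mathcal{F}_{\le 2}(g_1, \ldots, g_n)$ has a representation of the form
$g = (\prod_{j=1}^n g_j^{m_j}) (\prod_{1 \le i < j \le n} [g_i, g_j]^{m_{[i,j]}})$ (2)
for some integers $m_j$ for $1 \le j \le n$ and $m_{[i,j]}$ for $1 \le i < j \le n$. Note that we don’t need to consider commutators $[g_i, g_j]$ for $i \ge j$, since $[g_i, g_i] = \mathrm{id}$ and $[g_i, g_j] = [g_j, g_i]^{-1}$. It is possible to show also that this representation is unique, by repeating the previous argument, i.e. by showing that the set $G$ of formal products of the form (2) forms a $\le 2$-step nilpotent group, after using the above rules to define the group operations. This can be done, but verifying the group axioms (particularly the associative law) for $G$ is unpleasantly tedious.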
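The tedious verification can at least be spot-checked by machine. The sketch below encodes a formal product by the exponents of the generators together with the exponents of the commutators $[g_i, g_j]$ for $i < j$, and multiplies two such products by the collection rule. It assumes the convention $[g, h] = g^{-1} h^{-1} g h$, under which moving $g_i^{b}$ to the left of $g_j^{a}$ (with $i < j$) costs a factor $[g_i, g_j]^{-ab}$; the data layout and names are hypothetical.

```python
import random
from itertools import combinations

N_GENERATORS = 3
PAIRS = list(combinations(range(N_GENERATORS), 2))   # index pairs i < j

def multiply(x, y):
    """Product of two normal forms (m, c): m[i] is the exponent of g_i and
    c[(i, j)] is the exponent of the central commutator [g_i, g_j], i < j."""
    (m1, c1), (m2, c2) = x, y
    m = tuple(a + b for a, b in zip(m1, m2))
    # Moving g_i^{m2[i]} left past g_j^{m1[j]} costs [g_i, g_j]^{-m1[j] * m2[i]}.
    c = {(i, j): c1[(i, j)] + c2[(i, j)] - m1[j] * m2[i] for (i, j) in PAIRS}
    return (m, c)

IDENTITY = ((0,) * N_GENERATORS, {p: 0 for p in PAIRS})

def inverse(x):
    m, c = x
    return (tuple(-a for a in m),
            {(i, j): -c[(i, j)] - m[i] * m[j] for (i, j) in PAIRS})

def commutator(x, y):
    """[x, y] = x^{-1} y^{-1} x y."""
    return multiply(multiply(inverse(x), inverse(y)), multiply(x, y))

def random_element(rng):
    return (tuple(rng.randint(-5, 5) for _ in range(N_GENERATORS)),
            {p: rng.randint(-5, 5) for p in PAIRS})

rng = random.Random(1)
x, y, z = (random_element(rng) for _ in range(3))
print(multiply(multiply(x, y), z) == multiply(x, multiply(y, z)))  # associativity
print(commutator(commutator(x, y), z) == IDENTITY)                 # step <= 2
```

Random testing of this kind is of course not a proof of the group axioms, but it is a cheap way to catch sign errors in the multiplication law before attempting one.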
Once one sees this, one rapidly loses an appetite for trying to obtain a similar explicit description for free nilpotent groups for higher step, especially once one starts seeing that higher commutators obey some non-obvious identities such as the Hall-Witt identity
$[[g, h^{-1}], k]^h \cdot [[h, k^{-1}], g]^k \cdot [[k, g^{-1}], h]^g = \mathrm{id}$ (3)
(a nonlinear version of the Jacobi identity in the theory of Lie algebras), which make one less certain as to the existence or uniqueness of various proposed generalisations of the representations (1) or (2). For instance, in the free $\le 3$-step nilpotent group, it turns out that for representations of the form
one has uniqueness but not existence (e.g. even in the simplest case , there is no place in this representation for, say,
or
), but if one tries to insert more triple commutators into the representation to make up for this, one has to be careful not to lose uniqueness due to identities such as (3). One can paste these in by ad hoc means in the $s = 3$ case, but the $s = 4$ case looks more fearsome still, especially now that the quadruple commutators split into several distinct-looking species such as $[[g_i, g_j], [g_k, g_l]]$ and $[[[g_i, g_j], g_k], g_l]$ which are nevertheless still related to each other by identities such as (3). While one can eventually disentangle this mess for any fixed $n$ and $s$ by a finite amount of combinatorial computation, it is not immediately obvious how to give an explicit description of $\mathcal{F}_{\le s}(g_1, \ldots, g_n)$ uniformly in $n$ and $s$.
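Identities of Hall-Witt type are easy to get wrong by a sign or a conjugation, so it is reassuring to check them mechanically. The sketch below verifies the identity $[[x, y^{-1}], z]^y \cdot [[y, z^{-1}], x]^z \cdot [[z, x^{-1}], y]^x = \mathrm{id}$ by free reduction in the free group on three generators, assuming the conventions $[g, h] = g^{-1} h^{-1} g h$ and $g^h = h^{-1} g h$; since the generators satisfy no relations, the identity then holds in every group. The string encoding (capitals for inverses) and function names are illustrative.

```python
INVERSE = {"a": "A", "A": "a", "b": "B", "B": "b", "c": "C", "C": "c"}

def reduce_word(word):
    """Cancel adjacent inverse pairs (stack-based free reduction)."""
    stack = []
    for letter in word:
        if stack and stack[-1] == INVERSE[letter]:
            stack.pop()
        else:
            stack.append(letter)
    return "".join(stack)

def mul(*words):
    return reduce_word("".join(words))

def inv(x):
    return "".join(INVERSE[ch] for ch in reversed(x))

def comm(g, h):
    """[g, h] = g^{-1} h^{-1} g h."""
    return mul(inv(g), inv(h), g, h)

def conj(g, h):
    """g^h = h^{-1} g h."""
    return mul(inv(h), g, h)

def hall_witt(x, y, z):
    """Left-hand side of the Hall-Witt identity; the empty word is the identity."""
    return mul(conj(comm(comm(x, inv(y)), z), y),
               conj(comm(comm(y, inv(z)), x), z),
               conj(comm(comm(z, inv(x)), y), x))

print(hall_witt("a", "b", "c") == "")
print(mul("a", "b") == mul("b", "a", comm("a", "b")))   # gh = hg[g, h]
```

The same word-reduction helpers can be reused to test any proposed collection identity before relying on it in a normal form computation.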
Nevertheless, it turns out that one can give a reasonably tractable description of this group if one takes a polycyclic perspective rather than a nilpotent one – i.e. one views the free nilpotent group as a tower of group extensions of the trivial group by the cyclic group $\mathbf{Z}$. This seems to be a fairly standard observation in group theory – I found it in this book of Magnus, Karrass, and Solitar, via this paper of Leibman – but seems not to be so widely known outside of that field, so I wanted to record it here.
Notational convention: In this post only, I will colour a statement red if it assumes the axiom of choice. (For the rest of the course, the axiom of choice will be implicitly assumed throughout.)
The famous Banach-Tarski paradox asserts that one can take the unit ball in three dimensions, divide it up into finitely many pieces, and then translate and rotate each piece so that their union is now two disjoint unit balls. As a consequence of this paradox, it is not possible to create a finitely additive measure on $\mathbf{R}^3$ that is both translation and rotation invariant, which can measure every subset of $\mathbf{R}^3$, and which gives the unit ball a non-zero measure. This paradox helps explain why Lebesgue measure (which is countably additive and both translation and rotation invariant, and gives the unit ball a non-zero measure) cannot measure every set, instead being restricted to measuring sets that are Lebesgue measurable.
On the other hand, it is not possible to replicate the Banach-Tarski paradox in one or two dimensions; the unit interval in $\mathbf{R}$ or unit disk in $\mathbf{R}^2$ cannot be rearranged into two unit intervals or two unit disks using only finitely many pieces, translations, and rotations, and indeed there do exist non-trivial finitely additive measures on these spaces. However, it is possible to obtain a Banach-Tarski type paradox in one or two dimensions using countably many such pieces; this rules out the possibility of extending Lebesgue measure to a countably additive translation invariant measure on all subsets of $\mathbf{R}$ (or any higher-dimensional space).
In these notes I would like to establish all of the above results, and tie them in with some important concepts and tools in modern group theory, most notably amenability and the ping-pong lemma. This material is not required for the rest of the course, but nevertheless has some independent interest.