We have now seen two ways to construct expander Cayley graphs . The first, discussed in Notes 2, is to use Cayley graphs that are projections of an infinite Cayley graph on a group with Kazhdan’s property (T). The second, discussed in Notes 3, is to combine a quasirandomness property of the group
with a flattening hypothesis for the random walk.
We now pursue the second approach more thoroughly. The main difficulty here is to figure out how to ensure flattening of the random walk, as it is then an easy matter to use quasirandomness to show that the random walk becomes mixing soon after it becomes flat. In the case of Selberg’s theorem, we achieved this through an explicit formula for the heat kernel on the hyperbolic plane (which is a proxy for the random walk). However, in most situations such an explicit formula is not available, and one must develop some other tool for forcing flattening, and specifically an estimate of the form
for some , where
is the uniform probability measure on the generating set
.
In 2006, Bourgain and Gamburd introduced a general method for achieving this goal. The intuition here is that the main obstruction that prevents a random walk from spreading out to become flat over the entire group is if the random walk gets trapped in some proper subgroup
of
(or perhaps in some coset
of such a subgroup), so that
remains large for some moderately large
. Note that
since ,
, and
is symmetric. By iterating this observation, we seethat if
is too large (e.g. of size
for some
comparable to
), then it is not possible for the random walk
to converge to the uniform distribution in time
, and so expansion does not occur.
A potentially more general obstruction of this type would be if the random walk gets trapped in (a coset of) an approximate group . Recall that a
-approximate group is a subset
of a group
which is symmetric, contains the identity, and is such that
can be covered by at most
left-translates (or equivalently, right-translates) of
. Such approximate groups were studied extensively in last quarter’s course. A similar argument to the one given previously shows (roughly speaking) that expansion cannot occur if
is too large for some coset
of an approximate group.
It turns out that this latter observation has a converse: if a measure does not concentrate in cosets of approximate groups, then some flattening occurs. More precisely, one has the following combinatorial lemma:
Lemma 1 (Weighted Balog-Szemerédi-Gowers lemma) Let
be a group, let
be a finitely supported probability measure on
which is symmetric (thus
for all
), and let
. Then one of the following statements hold:
- (i) (Flattening) One has
.
- (ii) (Concentration in an approximate group) There exists an
-approximate group
in
with
and an element
such that
.
This lemma is a variant of the more well-known Balog-Szemerédi-Gowers lemma in additive combinatorics due to Gowers (which roughly speaking corresponds to the case when is the uniform distribution on some set
), which in turn is a polynomially quantitative version of an earlier lemma of Balog and Szemerédi. We will prove it below the fold.
The lemma is particularly useful when the group in question enjoys a product theorem, which roughly speaking says that the only medium-sized approximate subgroups of
are trapped inside genuine proper subgroups of
(or, contrapositively, medium-sized sets that generate the entire group
cannot be approximate groups). The fact that some finite groups (and specifically, the bounded rank finite simple groups of Lie type) enjoy product theorems is a non-trivial fact, and will be discussed in later notes. For now, we simply observe that the presence of the product theorem, together with quasirandomness and a non-concentration hypothesis, can be used to demonstrate expansion:
Theorem 2 (Bourgain-Gamburd expansion machine) Suppose that
is a finite group, that
is a symmetric set of
generators, and that there are constants
with the following properties.
- (Quasirandomness). The smallest dimension of a nontrivial representation
of
is at least
;
- (Product theorem). For all
there is some
such that the following is true. If
is a
-approximate subgroup with
then
generates a proper subgroup of
;
- (Non-concentration estimate). There is some even number
such that
where the supremum is over all proper subgroups
.
Then
is a two-sided
-expander for some
depending only on
, and the function
(and this constant
is in principle computable in terms of these constants).
This criterion for expansion is implicitly contained in this paper of Bourgain and Gamburd, who used it to establish the expansion of various Cayley graphs in for prime
. This criterion has since been applied (or modified) to obtain expansion results in many other groups, as will be discussed in later notes.
— 1. The Balog-Szemerédi-Gowers lemma —
The Balog-Szemerédi-Gowers lemma (Lemma 1) is ostensibly a statement about group structure, but the main tool in its proof is a remarkable graph-theoretic lemma (also known as the Balog-Szemerédi-Gowers lemma) that allows one to upgrade a “statistical” structure (a structure which is only valid a small fraction of the time, say 1% of the time) to a “complete” structure (one which is valid 100% of the time), by shrinking the size of the structure slightly (and in particular, with losses of polynomial type, as opposed to exponential or worse). This is in contrast to other structure-improving results (such as Ramsey’s theorem, Szemerédi’s theorem, or Freiman’s theorem), which are qualitatively similar in spirit, but have much worse quantitative bounds (though there is some hope in the case of Freiman’s theorem to only lose polynomial bounds with some improvement of existing arguments).
As we shall see later, the property of being large is a statistical assertion about
(it asserts that
collides with itself somewhat often), whereas approximate groups
represent a more complete sort of structure (all products of
are trapped in a small set, whereas only many of the products in
are so constrained). The graph-theoretic Balog-Szemerédi lemma is the key to moving from the former type of structure to the latter with only polynomial losses.
We need some notation. Define a bipartite graph to be a graph whose vertex set
is partitioned into two non-empty sets
, and the edge set
consists only of edges between
and
. If a finite bipartite graph
is dense in the sense that its edge density
is large, then for many vertices
and
,
and
are connected by a path of length one (i.e. an edge). It is thus intuitive that many pairs of vertices
and
will be connected by many paths of length two. Perhaps surprisingly, one can upgrade “many pairs” here to “almost all pairs”, provided that one is willing to shrink the set
slightly. More precisely, one has
Lemma 3 (Balog-Szemerédi-Gowers lemma: paths of length two) Let
be a finite bipartite graph with
. Let
. Then there exists a subset
of
with
such that at least
of the pairs
are such that
are connected by at least
paths of length two (i.e. there exists at least
vertices
such that
both lie in
).
Remark 1 It is not possible to remove the
entirely from this lemma; see Exercise 6.4.2 of my book with Van Vu for a counterexample (involving Hamming balls).
Proof: The idea here is to use a probabilistic construction, picking to be a neighbourhood of a randomly selected element
of
. The rationale here is that if a pair
of vertices in
are not connected by many paths of length two, then they are unlikely to lie in the same neighbourhood, and so are unlikely to “wreck” the construction.
We turn to the details. Let be chosen uniformly at random, and let
be the neighbourhood of
. Observe that the expected size of
is
By Cauchy-Schwarz, we conclude in particular that
Now, call a pair bad if it is connected by fewer than
paths of length two, and let
be the number of bad pairs
in
. We consider the quantity
. Observe that if
is a bad pair in
, then there are at most
values of
for which
and
will both lie in
, and so this bad pair contributes at most
to the expectation. Since there are at most
bad pairs, we conclude that
Combining this with (2), we see that
In particular, there exists a choice of for which the expression on the left-hand side is non-negative. This implies that
and
and the claim follows.
Given that almost all pairs in
are joined by many paths of length two, it is then plausible that almost all pairs
,
are joined by many paths of length three, for some large subset
of
. Remarkably, one can now upgrade “almost all” pairs here to all pairs:
Lemma 4 (Balog-Szemerédi-Gowers lemma: paths of length three) Let
be a finite bipartite graph with
. Then there exists subsets
of
respectively with
and
, such that for every
and
,
and
are joined by
paths of length three.
Remark 2 A lemma similar to this was first established by Balog and Szemerédi, as a consequence of the Szemereédi regularity lemma. However, as a consequence of using that lemma, the polynomial bounds
in the above lemma had to be replaced by much worse bounds (of tower-exponential type in
), which turns out to be far too weak for the purposes of establishing expansion.
Proof: The idea is to first prune a few “unpopular” vertices from and
and then apply the preceding lemma.
Let be the vertices in
of degree at least
, and let
be the edges connecting
and
. Note that the vertices in
are connected to a total of at most
edges, and so
. Since
, we conclude in particular that
.
Let be a sufficiently small quantity (depending on
) to be chosen later. Applying Lemma 3, one can find a subset
of
of cardinality
such that at most
of the pairs
are bad in the sense that they are connected by
paths of length two.
Let be those vertices
in
for which there are at most
elements
of
for which
is bad. By Markov’s inequality,
consists of all but at most
elements of
.
Let be the edges connecting
with
. Since each vertex in
has degree at least
, one has
We may thus find a subset of
of cardinality
such that each
is adjacent to
elements of
.
Now let and
. We know that
is adjacent to
elements
of
, and that at most
of these elements are such that
is bad. If we choose
to be a sufficiently small multiple of
, we conclude that there are
elements
which are adjacent to
and for which
is not bad. One thus has
paths of length three connecting
to
, and the claim follows.
The exponents in here can be improved slightly, but we will not attempt to obtain the optimal numerology here.
Remark 3 The above results are analogous to a phenomenon in additive combinatorics, namely that a “1%-structured” set (such as a small density subset of a group) can often be upgraded to a “99%-structured” set (such as the complement of a small density subset of a group) by applying a single “convolution” or “sumset” operation, and then upgraded further to a “100%-structured” set (such as a genuine group) by applying a further convolution or sumset operation. (This is basically why, for instance, it is known that almost all even natural numbers are the sum of two primes, and all but finitely many odd natural numbers are the sum of three primes; but it is not known whether all but finitely many even natural numbers are the sum of two primes.)
Exercise 1 (Weighted Balog-Szemerédi-Gowers theorem) Let
and
be probability spaces, and let
have measure
for some
.
- (i) Show that for any
, there exists a subset
of
of measure
such that
- (ii) Show that there exists subsets
of
of measure
and
such that
for all
and
.
Exercise 2 (99% Balog-Szemerédi theorem) Let
be a finite bipartite graph with
.
- (i) Show that there exists a subset
of
of size
such that for every
,
and
are connected by at least
paths of length
. (Hint: select
to be those vertices in
that are connected to “almost all” the vertices in
.)
- (ii) Show that there also exists a subset
of
of size
such that for every
and
,
and
are connected by at least
paths of length
.
We now apply the graph-theoretic lemma to the group context. The main idea here is to show that various sets (e.g. product sets ) are small by showing that they are in the high-multiplicity region of some convolution (e.g.
), or equivalently that elements
of such sets have many representations as a product
with
. One can then use Markov’s inequality and the trivial identity
to get usable size bounds on such sets.
Corollary 5 (Balog-Szemerédi lemma, product set form) let
be finite non-empty subsets of a group
, and suppose that
for some
. (This hypothesis should be compared with the upper bound
arising from Young’s inequality.) Then there exists subsets
of
respectively with
and
with
and
.
The quantity (or equivalently, the number of solutions to the equation
with
and
) is also known as the multiplicative energy of
and
, and is sometimes denoted
in the literature.
Proof: By hypothesis, we have
Since
we conclude that
Since, by Cauchy-Schwarz (or Young’s inequality), we have , we conclude that there is a set
with
such that
for all .
By slight abuse of notation (arising from the fact that ,
are not necessarily disjoint, and that
is a set of ordered pairs rather than unordered pairs), we can view the triplet
as a bipartite graph. Applying Lemma 4, we can find subsets
of
respectively with
and
such that for all
and
, one can find
elements
such that
. In particular, we see that
Observe that . Using the identity
, we note that triples
for
are precisely those triples
with
. Thus the left-hand side of (3) is equal to
, where
But since
we see from Markov’s inequality that there are at most possible values for
, which gives the bound
.
The second bound can be proven similarly to the first (noting that any
are connected by
paths of length six), but can also from the former bound as follows. Observe that any element
has at least
representations of the form
with
, and hence
, thus
on . On the other hand, the left-hand side has an
norm of
, and the bound
then follows from Markov’s inequality.
Exercise 3 In the converse direction, show that if
are non-empty finite subsets of
with
, then
.
Exercise 4 If
are three non-empty finite subsets of
, establish the Ruzsa triangle inequality
. (Hint: mimic the final part of the proof of Corollary 5.)
We now give a variant of this corollary involving approximate groups.
Lemma 6 (Balog-Szemerédi lemma, approximate group form) Let
be a finite symmetric subset of a group
, and suppose that
for some
. Then there exists a
-approximate group
with
such that
for some
.
Proof: By Corollary 5, we may find a subset with
such that
By Exercise 3, this implies that
Observe that the left-hand side is equal to
We conclude that
On the other hand, we have
As a consequence, we see that if we set
for some sufficiently large absolute constant , then
and thus (for large enough)
Since , we conclude that
Also, is clearly symmetric and contains the origin.
Now let us consider an element of the product
. By construction of
, we can write each
as a product
with
in at least
ways. Doing so for each
gives rise to a factorisation
where ; as the
uniquely determine the
(for fixed
), we conclude that each element
of
has at least
such factorisations. But by (4), there are at most
such tuples
, and so there are at most
possible values for
, thus
In particular,
By the Ruzsa covering lemma (see the exercise below), this implies that is covered by
left-translates of
, and so
is a
-approximate group. Finally, from (5) one has
and thus by Exercise 3
In particular, since the support of has size
, one has
for some , or equivalently that
Increasing to
and taking inverses, we conclude that
, and the claim follows.
Exercise 5 (Ruzsa covering lemma) Let
be finite non-empty subsets of a group
. Show that
can be covered by at most
left-translates of
. (Hint: consider a maximal disjoint collection of translates
of
with
.)
Exercise 6 (Converse to Balog-Szemerédi-Gowers) Let
be a finite symmetric subset of a group
, and suppose there exists a
-approximate group
with
such that
for some
. Show that
Exercise 7 Let
be finite non-empty subsets of a group
, and suppose that
. Show that there exists a
-approximate group
with
and elements
such that
and
.
Finally, we can prove Lemma 1. Fix . We may assume that
and we need to use this to locate an -approximate group
in
with
and an element
such that
.
Let us write . Intuitively,
represents the “width” of the probability meaure
, as can be seen by considering the model example
where
is a symmetric set of cardinality
(i.e.
is the uniform probability measure on
). If we were actually in this model case, we could apply Lemma 6 immediately and be done. Of course, in general,
need not be a uniform measure on a set of size
. However, it turns out that one can use (6) to conclude that the “bulk” of
is basically of this form.
More precisely, let us split , where
Observe that
and so by Young’s inequality
In a similar vein, we have
and thus by Young’s inequality (and the normalisation )
Finally, from (6) one has
Subtracting using the triangle inequality (ignoring some slight double-counting), we conclude that
If we then set , we conclude in particular that
On the other hand, from Markov’s inequality one has . Applying Lemma 6, we conclude the existence of a
-approximate group
with
such that
for some
, which by definition of
implies that
, and the claim follows.
— 2. The Bourgain-Gamburd expansion machine —
We can now prove Theorem 2. We can assume that is sufficiently large depending on the parameters
, since the claim is trivial for bounded
(note that as
generates
, the Cayley graph
will be an
-expander for some
). Henceforth we allow all implied constants in the asymptotic notation to depend on
.
To show expansion, it suffices from the quasirandomness hypothesis (and Proposition 4 from the preceding notes), it will suffice to show that
for some .
From Young’s inequality, is decreasing in
, and is initially equal to
when
. We need to “flatten” the
norm of
as
increases. We first use the non-concentration hypothesis to obtain an initial amount of flattening:
Proposition 7 For any
, one has
for all proper subgroups
of
and all
.
Proof: By the non-concentration hypothesis, we can find such that
for all proper subgroups of
. If we write
as
, we see that
for all . By symmetry,
, and thus
If , then we may write
as the convolution of a probability measure
and
. From this, we see that
for all , giving the claim (9). Specialising this to the case when
is the trivial group, one has
Since we also have
the claim (8) then follows from Hölder’s inequality.
Now we obtain additional flattening using the product theorem hypothesis:
Lemma 8 (Flattening lemma) Suppose
is such that
Then one has
for some
depending only on
and
.
Proof: Suppose the claim fails for some to be chosen later, thus
Applying Lemma 1, we may thus find a -approximate group
with
and such that
Since by (9), we see that
Meanwhile, from (10) one has
Applying the product hypothesis (assuming sufficiently small depending on
and
), we conclude that
generates a proper subgroup
of
, and thus
But this contradicts (9) (again if is sufficiently small).
Iterating the above lemma times we obtain (7) for some
, as desired.
Remark 4 Roughly speaking, the three hypotheses in Theorem 2 govern three separate stages of the life cycle of the random walk and its distributions
. In the early stage
, the non-concentration hypotheses creates some initial spreading of this random walk, in particular ensuring that the walk “escapes” from cosets of proper subgroups. In the middle stage
, the product theorem steadily flattens the distribution of the random walk, until it is very roughly comparable to the uniform distribution. Finally, in the late stage
, the quasirandomness property can smooth out the random walk almost completely to obtain the mixing necessary for expansion.
8 comments
Comments feed for this article
26 January, 2012 at 3:50 am
Emmanuel Kowalski
(Somehow this post was not announced on Google Reader, so I only saw it yesterday…)
One remark that may be worth adding is that, if one is interested in explicit expansion bounds, this argument has a big loss at the stage of Lemma 8 (“flattening”), since it gives bounds for cycles of length exponentially large in terms of the gain it provides. However, with a little adaptation, this loss is in fact not needed (as described in my “Explicit expansion” notes for the case of SL_2.)
[Editor’s note: it turns out that the situation is a bit more complicated than this; see http://blogs.ethz.ch/kowalski/2012/07/30/various-updates/ -T.]
5 February, 2012 at 11:33 am
254B, Notes 5: Product theorems, pivot arguments, and the Larsen-Pink non-concentration inequality « What’s new
[…] the previous set of notes, we saw that one could derive expansion of Cayley graphs from three ingredients: non-concentration, […]
13 February, 2012 at 4:51 pm
254B, Notes 6: Non-concentration in subgroups « What’s new
[…] the last three notes, we discussed the Bourgain-Gamburd expansion machine and two of its three ingredients, namely […]
9 March, 2012 at 11:04 am
Anonymous
I think in Exercise 3, the conclusion should be |A|^{3/4}|B|^{3/4}, rather than 3/2.
[Corrected, thanks – T.]
9 March, 2012 at 11:33 am
Anonymous
Which also appears to affect both the statement and proof of Lemma 6.
[Corrected, thanks – T.]
29 March, 2012 at 6:04 am
On Endre Szemerédi’s Gifts to Computer Science « Windows On Theory
[…] will upper bound by . The proof of this magic lemma is also quite simple (e.g., see Lemma 4 in this blog post). Roughly speaking, if you choose to be a neighborhood of a random point , then we almost get what […]
10 September, 2013 at 8:15 am
Expansion in finite simple groups of Lie type | What's new
[…] The current paper also uses the “Bourgain-Gamburd machine”, as discussed in these lecture notes of mine, to demonstrate expansion. This machine shows how expansion of a Cayley graph follows from three […]
3 March, 2023 at 11:16 am
valuevar
In Lemma 5, one can easily get $|A’|\gg \delta |A|$, $|B’|\gg \delta |B|$ and $\gg \delta^5 |A| |B|$ paths (but then of course you know this).