We have now seen two ways to construct expander Cayley graphs . The first, discussed in Notes 2, is to use Cayley graphs that are projections of an infinite Cayley graph on a group with Kazhdan’s property (T). The second, discussed in Notes 3, is to combine a quasirandomness property of the group with a flattening hypothesis for the random walk.

We now pursue the second approach more thoroughly. The main difficulty here is to figure out how to ensure flattening of the random walk, as it is then an easy matter to use quasirandomness to show that the random walk becomes mixing soon after it becomes flat. In the case of Selberg’s theorem, we achieved this through an explicit formula for the heat kernel on the hyperbolic plane (which is a proxy for the random walk). However, in most situations such an explicit formula is not available, and one must develop some other tool for forcing flattening, and specifically an estimate of the form

for some , where is the uniform probability measure on the generating set .

In 2006, Bourgain and Gamburd introduced a general method for achieving this goal. The intuition here is that the main obstruction that prevents a random walk from spreading out to become flat over the entire group is if the random walk gets *trapped* in some proper subgroup of (or perhaps in some coset of such a subgroup), so that remains large for some moderately large . Note that

since , , and is symmetric. By iterating this observation, we seethat if is too large (e.g. of size for some comparable to ), then it is not possible for the random walk to converge to the uniform distribution in time , and so expansion does not occur.

A potentially more general obstruction of this type would be if the random walk gets trapped in (a coset of) an *approximate* group . Recall that a *-approximate group* is a subset of a group which is symmetric, contains the identity, and is such that can be covered by at most left-translates (or equivalently, right-translates) of . Such approximate groups were studied extensively in last quarter’s course. A similar argument to the one given previously shows (roughly speaking) that expansion cannot occur if is too large for some coset of an approximate group.

It turns out that this latter observation has a converse: if a measure does not concentrate in cosets of approximate groups, then some flattening occurs. More precisely, one has the following combinatorial lemma:

Lemma 1 (Weighted Balog-Szemerédi-Gowers lemma)Let be a group, let be a finitely supported probability measure on which is symmetric (thus for all ), and let . Then one of the following statements hold:

- (i) (Flattening) One has .
- (ii) (Concentration in an approximate group) There exists an -approximate group in with and an element such that .

This lemma is a variant of the more well-known *Balog-Szemerédi-Gowers lemma* in additive combinatorics due to Gowers (which roughly speaking corresponds to the case when is the uniform distribution on some set ), which in turn is a polynomially quantitative version of an earlier lemma of Balog and Szemerédi. We will prove it below the fold.

The lemma is particularly useful when the group in question enjoys a *product theorem*, which roughly speaking says that the only medium-sized approximate subgroups of are trapped inside genuine proper subgroups of (or, contrapositively, medium-sized sets that generate the entire group cannot be approximate groups). The fact that some finite groups (and specifically, the bounded rank finite simple groups of Lie type) enjoy product theorems is a non-trivial fact, and will be discussed in later notes. For now, we simply observe that the presence of the product theorem, together with quasirandomness and a non-concentration hypothesis, can be used to demonstrate expansion:

Theorem 2 (Bourgain-Gamburd expansion machine)Suppose that is a finite group, that is a symmetric set of generators, and that there are constants with the following properties.

- (Quasirandomness). The smallest dimension of a nontrivial representation of is at least ;
- (Product theorem). For all there is some such that the following is true. If is a -approximate subgroup with then generates a proper subgroup of ;
- (Non-concentration estimate). There is some even number such that
where the supremum is over all proper subgroups .

Then is a two-sided -expander for some depending only on , and the function (and this constant is in principle computable in terms of these constants).

This criterion for expansion is implicitly contained in this paper of Bourgain and Gamburd, who used it to establish the expansion of various Cayley graphs in for prime . This criterion has since been applied (or modified) to obtain expansion results in many other groups, as will be discussed in later notes.

** — 1. The Balog-Szemerédi-Gowers lemma — **

The Balog-Szemerédi-Gowers lemma (Lemma 1) is ostensibly a statement about group structure, but the main tool in its proof is a remarkable graph-theoretic lemma (also known as the Balog-Szemerédi-Gowers lemma) that allows one to upgrade a “statistical” structure (a structure which is only valid a small fraction of the time, say 1% of the time) to a “complete” structure (one which is valid 100% of the time), by shrinking the size of the structure slightly (and in particular, with losses of polynomial type, as opposed to exponential or worse). This is in contrast to other structure-improving results (such as Ramsey’s theorem, Szemerédi’s theorem, or Freiman’s theorem), which are qualitatively similar in spirit, but have much worse quantitative bounds (though there is some hope in the case of Freiman’s theorem to only lose polynomial bounds with some improvement of existing arguments).

As we shall see later, the property of being large is a statistical assertion about (it asserts that collides with itself somewhat often), whereas approximate groups represent a more complete sort of structure (*all* products of are trapped in a small set, whereas only *many* of the products in are so constrained). The graph-theoretic Balog-Szemerédi lemma is the key to moving from the former type of structure to the latter with only polynomial losses.

We need some notation. Define a *bipartite graph* to be a graph whose vertex set is partitioned into two non-empty sets , and the edge set consists only of edges between and . If a finite bipartite graph is *dense* in the sense that its edge density is large, then for many vertices and , and are connected by a path of length one (i.e. an edge). It is thus intuitive that many pairs of vertices and will be connected by many paths of length two. Perhaps surprisingly, one can upgrade “many pairs” here to “almost all pairs”, provided that one is willing to shrink the set slightly. More precisely, one has

Lemma 3 (Balog-Szemerédi-Gowers lemma: paths of length two)Let be a finite bipartite graph with . Let . Then there exists a subset of with such that at least of the pairs are such that are connected by at least paths of length two (i.e. there exists at least vertices such that both lie in ).

Remark 1It is not possible to remove the entirely from this lemma; see Exercise 6.4.2 of my book with Van Vu for a counterexample (involving Hamming balls).

*Proof:* The idea here is to use a probabilistic construction, picking to be a neighbourhood of a randomly selected element of . The rationale here is that if a pair of vertices in are not connected by many paths of length two, then they are unlikely to lie in the same neighbourhood, and so are unlikely to “wreck” the construction.

We turn to the details. Let be chosen uniformly at random, and let be the neighbourhood of . Observe that the expected size of is

By Cauchy-Schwarz, we conclude in particular that

Now, call a pair *bad* if it is connected by fewer than paths of length two, and let be the number of bad pairs in . We consider the quantity . Observe that if is a bad pair in , then there are at most values of for which and will both lie in , and so this bad pair contributes at most to the expectation. Since there are at most bad pairs, we conclude that

Combining this with (2), we see that

In particular, there exists a choice of for which the expression on the left-hand side is non-negative. This implies that

and

and the claim follows.

Given that almost all pairs in are joined by many paths of length two, it is then plausible that almost all pairs , are joined by many paths of length three, for some large subset of . Remarkably, one can now upgrade “almost all” pairs here to *all* pairs:

Lemma 4 (Balog-Szemerédi-Gowers lemma: paths of length three)Let be a finite bipartite graph with . Then there exists subsets of respectively with and , such that for every and , and are joined by paths of length three.

Remark 2A lemma similar to this was first established by Balog and Szemerédi, as a consequence of the Szemereédi regularity lemma. However, as a consequence of using that lemma, the polynomial bounds in the above lemma had to be replaced by much worse bounds (of tower-exponential type in ), which turns out to be far too weak for the purposes of establishing expansion.

*Proof:* The idea is to first prune a few “unpopular” vertices from and and then apply the preceding lemma.

Let be the vertices in of degree at least , and let be the edges connecting and . Note that the vertices in are connected to a total of at most edges, and so . Since , we conclude in particular that .

Let be a sufficiently small quantity (depending on ) to be chosen later. Applying Lemma 3, one can find a subset of of cardinality such that at most of the pairs are *bad* in the sense that they are connected by paths of length two.

Let be those vertices in for which there are at most elements of for which is bad. By Markov’s inequality, consists of all but at most elements of .

Let be the edges connecting with . Since each vertex in has degree at least , one has

We may thus find a subset of of cardinality such that each is adjacent to elements of .

Now let and . We know that is adjacent to elements of , and that at most of these elements are such that is bad. If we choose to be a sufficiently small multiple of , we conclude that there are elements which are adjacent to and for which is not bad. One thus has paths of length three connecting to , and the claim follows.

The exponents in here can be improved slightly, but we will not attempt to obtain the optimal numerology here.

Remark 3The above results are analogous to a phenomenon in additive combinatorics, namely that a “1%-structured” set (such as a small density subset of a group) can often be upgraded to a “99%-structured” set (such as the complement of a small density subset of a group) by applying a single “convolution” or “sumset” operation, and then upgraded further to a “100%-structured” set (such as a genuine group) by applying a further convolution or sumset operation. (This is basically why, for instance, it is known that almost all even natural numbers are the sum of two primes, and all but finitely many odd natural numbers are the sum of three primes; but it is not known whether all but finitely many even natural numbers are the sum of two primes.)

Exercise 1 (Weighted Balog-Szemerédi-Gowers theorem)Let and be probability spaces, and let have measure for some .

- (i) Show that for any , there exists a subset of of measure such that
- (ii) Show that there exists subsets of of measure and such that
for all and .

Exercise 2 (99% Balog-Szemerédi theorem)Let be a finite bipartite graph with .

- (i) Show that there exists a subset of of size such that for every , and are connected by at least paths of length . (
Hint:select to be those vertices in that are connected to “almost all” the vertices in .)- (ii) Show that there also exists a subset of of size such that for every and , and are connected by at least paths of length .

We now apply the graph-theoretic lemma to the group context. The main idea here is to show that various sets (e.g. product sets ) are small by showing that they are in the high-multiplicity region of some convolution (e.g. ), or equivalently that elements of such sets have many representations as a product with . One can then use Markov’s inequality and the trivial identity to get usable size bounds on such sets.

Corollary 5 (Balog-Szemerédi lemma, product set form)let be finite non-empty subsets of a group , and suppose thatfor some . (This hypothesis should be compared with the upper bound

arising from Young’s inequality.) Then there exists subsets of respectively with and with and .

The quantity (or equivalently, the number of solutions to the equation with and ) is also known as the *multiplicative energy* of and , and is sometimes denoted in the literature.

*Proof:* By hypothesis, we have

Since

we conclude that

Since, by Cauchy-Schwarz (or Young’s inequality), we have , we conclude that there is a set with such that

for all .

By slight abuse of notation (arising from the fact that , are not necessarily disjoint, and that is a set of ordered pairs rather than unordered pairs), we can view the triplet as a bipartite graph. Applying Lemma 4, we can find subsets of respectively with and such that for all and , one can find elements such that . In particular, we see that

Observe that . Using the identity , we note that triples for are precisely those triples with . Thus the left-hand side of (3) is equal to , where

But since

we see from Markov’s inequality that there are at most possible values for , which gives the bound .

The second bound can be proven similarly to the first (noting that any are connected by paths of length six), but can also from the former bound as follows. Observe that any element has at least representations of the form with , and hence , thus

on . On the other hand, the left-hand side has an norm of , and the bound then follows from Markov’s inequality.

Exercise 3In the converse direction, show that if are non-empty finite subsets of with , then .

Exercise 4If are three non-empty finite subsets of , establish theRuzsa triangle inequality. (Hint:mimic the final part of the proof of Corollary 5.)

We now give a variant of this corollary involving approximate groups.

Lemma 6 (Balog-Szemerédi lemma, approximate group form)Let be a finite symmetric subset of a group , and suppose thatfor some . Then there exists a -approximate group with such that for some .

*Proof:* By Corollary 5, we may find a subset with such that

By Exercise 3, this implies that

Observe that the left-hand side is equal to

We conclude that

On the other hand, we have

As a consequence, we see that if we set

for some sufficiently large absolute constant , then

and thus (for large enough)

Since , we conclude that

Also, is clearly symmetric and contains the origin.

Now let us consider an element of the product . By construction of , we can write each as a product with in at least ways. Doing so for each gives rise to a factorisation

where ; as the uniquely determine the (for fixed ), we conclude that each element of has at least such factorisations. But by (4), there are at most such tuples , and so there are at most possible values for , thus

By the Ruzsa covering lemma (see the exercise below), this implies that is covered by left-translates of , and so is a -approximate group. Finally, from (5) one has

and thus by Exercise 3

In particular, since the support of has size , one has

for some , or equivalently that

Increasing to and taking inverses, we conclude that , and the claim follows.

Exercise 5 (Ruzsa covering lemma)Let be finite non-empty subsets of a group . Show that can be covered by at most left-translates of . (Hint:consider a maximal disjoint collection of translates of with .)

Exercise 6 (Converse to Balog-Szemerédi-Gowers)Let be a finite symmetric subset of a group , and suppose there exists a -approximate group with such that for some . Show that

Exercise 7Let be finite non-empty subsets of a group , and suppose that . Show that there exists a -approximate group with and elements such that and .

Finally, we can prove Lemma 1. Fix . We may assume that

and we need to use this to locate an -approximate group in with and an element such that .

Let us write . Intuitively, represents the “width” of the probability meaure , as can be seen by considering the model example where is a symmetric set of cardinality (i.e. is the uniform probability measure on ). If we were actually in this model case, we could apply Lemma 6 immediately and be done. Of course, in general, need not be a uniform measure on a set of size . However, it turns out that one can use (6) to conclude that the “bulk” of is basically of this form.

More precisely, let us split , where

Observe that

and so by Young’s inequality

In a similar vein, we have

and thus by Young’s inequality (and the normalisation )

Finally, from (6) one has

Subtracting using the triangle inequality (ignoring some slight double-counting), we conclude that

If we then set , we conclude in particular that

On the other hand, from Markov’s inequality one has . Applying Lemma 6, we conclude the existence of a -approximate group with such that for some , which by definition of implies that , and the claim follows.

** — 2. The Bourgain-Gamburd expansion machine — **

We can now prove Theorem 2. We can assume that is sufficiently large depending on the parameters , since the claim is trivial for bounded (note that as generates , the Cayley graph will be an -expander for some ). Henceforth we allow all implied constants in the asymptotic notation to depend on .

To show expansion, it suffices from the quasirandomness hypothesis (and Proposition 4 from the preceding notes), it will suffice to show that

From Young’s inequality, is decreasing in , and is initially equal to when . We need to “flatten” the norm of as increases. We first use the non-concentration hypothesis to obtain an initial amount of flattening:

Proposition 7For any , one has

*Proof:* By the non-concentration hypothesis, we can find such that

for all proper subgroups of . If we write as , we see that

for all . By symmetry, , and thus

If , then we may write as the convolution of a probability measure and . From this, we see that

for all , giving the claim (9). Specialising this to the case when is the trivial group, one has

Since we also have

the claim (8) then follows from Hölder’s inequality.

Now we obtain additional flattening using the product theorem hypothesis:

Lemma 8 (Flattening lemma)Suppose is such thatfor some depending only on and .

*Proof:* Suppose the claim fails for some to be chosen later, thus

Applying Lemma 1, we may thus find a -approximate group with

and such that

Since by (9), we see that

Meanwhile, from (10) one has

Applying the product hypothesis (assuming sufficiently small depending on and ), we conclude that generates a proper subgroup of , and thus

But this contradicts (9) (again if is sufficiently small).

Iterating the above lemma times we obtain (7) for some , as desired.

Remark 4Roughly speaking, the three hypotheses in Theorem 2 govern three separate stages of the life cycle of the random walk and its distributions . In the early stage , the non-concentration hypotheses creates some initial spreading of this random walk, in particular ensuring that the walk “escapes” from cosets of proper subgroups. In the middle stage , the product theorem steadily flattens the distribution of the random walk, until it is very roughly comparable to the uniform distribution. Finally, in the late stage , the quasirandomness property can smooth out the random walk almost completely to obtain the mixing necessary for expansion.

## 8 comments

Comments feed for this article

26 January, 2012 at 3:50 am

Emmanuel Kowalski(Somehow this post was not announced on Google Reader, so I only saw it yesterday…)

One remark that may be worth adding is that, if one is interested in explicit expansion bounds, this argument has a big loss at the stage of Lemma 8 (“flattening”), since it gives bounds for cycles of length exponentially large in terms of the gain it provides. However, with a little adaptation, this loss is in fact not needed (as described in my “Explicit expansion” notes for the case of SL_2.)

[Editor’s note: it turns out that the situation is a bit more complicated than this; see http://blogs.ethz.ch/kowalski/2012/07/30/various-updates/ -T.]5 February, 2012 at 11:33 am

254B, Notes 5: Product theorems, pivot arguments, and the Larsen-Pink non-concentration inequality « What’s new[…] the previous set of notes, we saw that one could derive expansion of Cayley graphs from three ingredients: non-concentration, […]

13 February, 2012 at 4:51 pm

254B, Notes 6: Non-concentration in subgroups « What’s new[…] the last three notes, we discussed the Bourgain-Gamburd expansion machine and two of its three ingredients, namely […]

9 March, 2012 at 11:04 am

AnonymousI think in Exercise 3, the conclusion should be |A|^{3/4}|B|^{3/4}, rather than 3/2.

[Corrected, thanks – T.]9 March, 2012 at 11:33 am

AnonymousWhich also appears to affect both the statement and proof of Lemma 6.

[Corrected, thanks – T.]29 March, 2012 at 6:04 am

On Endre Szemerédi’s Gifts to Computer Science « Windows On Theory[…] will upper bound by . The proof of this magic lemma is also quite simple (e.g., see Lemma 4 in this blog post). Roughly speaking, if you choose to be a neighborhood of a random point , then we almost get what […]

10 September, 2013 at 8:15 am

Expansion in finite simple groups of Lie type | What's new[…] The current paper also uses the “Bourgain-Gamburd machine”, as discussed in these lecture notes of mine, to demonstrate expansion. This machine shows how expansion of a Cayley graph follows from three […]

3 March, 2023 at 11:16 am

valuevarIn Lemma 5, one can easily get $|A’|\gg \delta |A|$, $|B’|\gg \delta |B|$ and $\gg \delta^5 |A| |B|$ paths (but then of course you know this).