We have now seen two ways to construct expander Cayley graphs . The first, discussed in Notes 2, is to use Cayley graphs that are projections of an infinite Cayley graph on a group with Kazhdan’s property (T). The second, discussed in Notes 3, is to combine a quasirandomness property of the group with a flattening hypothesis for the random walk.
We now pursue the second approach more thoroughly. The main difficulty here is to figure out how to ensure flattening of the random walk, as it is then an easy matter to use quasirandomness to show that the random walk becomes mixing soon after it becomes flat. In the case of Selberg’s theorem, we achieved this through an explicit formula for the heat kernel on the hyperbolic plane (which is a proxy for the random walk). However, in most situations such an explicit formula is not available, and one must develop some other tool for forcing flattening, and specifically an estimate of the form
for some , where is the uniform probability measure on the generating set .
In 2006, Bourgain and Gamburd introduced a general method for achieving this goal. The intuition here is that the main obstruction that prevents a random walk from spreading out to become flat over the entire group is if the random walk gets trapped in some proper subgroup of (or perhaps in some coset of such a subgroup), so that remains large for some moderately large . Note that
since , , and is symmetric. By iterating this observation, we seethat if is too large (e.g. of size for some comparable to ), then it is not possible for the random walk to converge to the uniform distribution in time , and so expansion does not occur.
A potentially more general obstruction of this type would be if the random walk gets trapped in (a coset of) an approximate group . Recall that a -approximate group is a subset of a group which is symmetric, contains the identity, and is such that can be covered by at most left-translates (or equivalently, right-translates) of . Such approximate groups were studied extensively in last quarter’s course. A similar argument to the one given previously shows (roughly speaking) that expansion cannot occur if is too large for some coset of an approximate group.
It turns out that this latter observation has a converse: if a measure does not concentrate in cosets of approximate groups, then some flattening occurs. More precisely, one has the following combinatorial lemma:
Lemma 1 (Weighted Balog-Szemerédi-Gowers lemma) Let be a group, let be a finitely supported probability measure on which is symmetric (thus for all ), and let . Then one of the following statements hold:
- (i) (Flattening) One has .
- (ii) (Concentration in an approximate group) There exists an -approximate group in with and an element such that .
This lemma is a variant of the more well-known Balog-Szemerédi-Gowers lemma in additive combinatorics due to Gowers (which roughly speaking corresponds to the case when is the uniform distribution on some set ), which in turn is a polynomially quantitative version of an earlier lemma of Balog and Szemerédi. We will prove it below the fold.
The lemma is particularly useful when the group in question enjoys a product theorem, which roughly speaking says that the only medium-sized approximate subgroups of are trapped inside genuine proper subgroups of (or, contrapositively, medium-sized sets that generate the entire group cannot be approximate groups). The fact that some finite groups (and specifically, the bounded rank finite simple groups of Lie type) enjoy product theorems is a non-trivial fact, and will be discussed in later notes. For now, we simply observe that the presence of the product theorem, together with quasirandomness and a non-concentration hypothesis, can be used to demonstrate expansion:
- (Quasirandomness). The smallest dimension of a nontrivial representation of is at least ;
- (Product theorem). For all there is some such that the following is true. If is a -approximate subgroup with then generates a proper subgroup of ;
- (Non-concentration estimate). There is some even number such that
where the supremum is over all proper subgroups .
Then is a two-sided -expander for some depending only on , and the function (and this constant is in principle computable in terms of these constants).
This criterion for expansion is implicitly contained in this paper of Bourgain and Gamburd, who used it to establish the expansion of various Cayley graphs in for prime . This criterion has since been applied (or modified) to obtain expansion results in many other groups, as will be discussed in later notes.
— 1. The Balog-Szemerédi-Gowers lemma —
The Balog-Szemerédi-Gowers lemma (Lemma 1) is ostensibly a statement about group structure, but the main tool in its proof is a remarkable graph-theoretic lemma (also known as the Balog-Szemerédi-Gowers lemma) that allows one to upgrade a “statistical” structure (a structure which is only valid a small fraction of the time, say 1% of the time) to a “complete” structure (one which is valid 100% of the time), by shrinking the size of the structure slightly (and in particular, with losses of polynomial type, as opposed to exponential or worse). This is in contrast to other structure-improving results (such as Ramsey’s theorem, Szemerédi’s theorem, or Freiman’s theorem), which are qualitatively similar in spirit, but have much worse quantitative bounds (though there is some hope in the case of Freiman’s theorem to only lose polynomial bounds with some improvement of existing arguments).
As we shall see later, the property of being large is a statistical assertion about (it asserts that collides with itself somewhat often), whereas approximate groups represent a more complete sort of structure (all products of are trapped in a small set, whereas only many of the products in are so constrained). The graph-theoretic Balog-Szemerédi lemma is the key to moving from the former type of structure to the latter with only polynomial losses.
We need some notation. Define a bipartite graph to be a graph whose vertex set is partitioned into two non-empty sets , and the edge set consists only of edges between and . If a finite bipartite graph is dense in the sense that its edge density is large, then for many vertices and , and are connected by a path of length one (i.e. an edge). It is thus intuitive that many pairs of vertices and will be connected by many paths of length two. Perhaps surprisingly, one can upgrade “many pairs” here to “almost all pairs”, provided that one is willing to shrink the set slightly. More precisely, one has
Lemma 3 (Balog-Szemerédi-Gowers lemma: paths of length two) Let be a finite bipartite graph with . Let . Then there exists a subset of with such that at least of the pairs are such that are connected by at least paths of length two (i.e. there exists at least vertices such that both lie in ).
Remark 1 It is not possible to remove the entirely from this lemma; see Exercise 6.4.2 of my book with Van Vu for a counterexample (involving Hamming balls).
Proof: The idea here is to use a probabilistic construction, picking to be a neighbourhood of a randomly selected element of . The rationale here is that if a pair of vertices in are not connected by many paths of length two, then they are unlikely to lie in the same neighbourhood, and so are unlikely to “wreck” the construction.
We turn to the details. Let be chosen uniformly at random, and let be the neighbourhood of . Observe that the expected size of is
Now, call a pair bad if it is connected by fewer than paths of length two, and let be the number of bad pairs in . We consider the quantity . Observe that if is a bad pair in , then there are at most values of for which and will both lie in , and so this bad pair contributes at most to the expectation. Since there are at most bad pairs, we conclude that
Combining this with (2), we see that
In particular, there exists a choice of for which the expression on the left-hand side is non-negative. This implies that
and the claim follows.
Given that almost all pairs in are joined by many paths of length two, it is then plausible that almost all pairs , are joined by many paths of length three, for some large subset of . Remarkably, one can now upgrade “almost all” pairs here to all pairs:
Lemma 4 (Balog-Szemerédi-Gowers lemma: paths of length three) Let be a finite bipartite graph with . Then there exists subsets of respectively with and , such that for every and , and are joined by paths of length three.
Remark 2 A lemma similar to this was first established by Balog and Szemerédi, as a consequence of the Szemereédi regularity lemma. However, as a consequence of using that lemma, the polynomial bounds in the above lemma had to be replaced by much worse bounds (of tower-exponential type in ), which turns out to be far too weak for the purposes of establishing expansion.
Proof: The idea is to first prune a few “unpopular” vertices from and and then apply the preceding lemma.
Let be the vertices in of degree at least , and let be the edges connecting and . Note that the vertices in are connected to a total of at most edges, and so . Since , we conclude in particular that .
Let be a sufficiently small quantity (depending on ) to be chosen later. Applying Lemma 3, one can find a subset of of cardinality such that at most of the pairs are bad in the sense that they are connected by paths of length two.
Let be those vertices in for which there are at most elements of for which is bad. By Markov’s inequality, consists of all but at most elements of .
Let be the edges connecting with . Since each vertex in has degree at least , one has
We may thus find a subset of of cardinality such that each is adjacent to elements of .
Now let and . We know that is adjacent to elements of , and that at most of these elements are such that is bad. If we choose to be a sufficiently small multiple of , we conclude that there are elements which are adjacent to and for which is not bad. One thus has paths of length three connecting to , and the claim follows.
The exponents in here can be improved slightly, but we will not attempt to obtain the optimal numerology here.
Remark 3 The above results are analogous to a phenomenon in additive combinatorics, namely that a “1%-structured” set (such as a small density subset of a group) can often be upgraded to a “99%-structured” set (such as the complement of a small density subset of a group) by applying a single “convolution” or “sumset” operation, and then upgraded further to a “100%-structured” set (such as a genuine group) by applying a further convolution or sumset operation. (This is basically why, for instance, it is known that almost all even natural numbers are the sum of two primes, and all but finitely many odd natural numbers are the sum of three primes; but it is not known whether all but finitely many even natural numbers are the sum of two primes.)
Exercise 1 (Weighted Balog-Szemerédi-Gowers theorem) Let and be probability spaces, and let have measure for some .
- (i) Show that for any , there exists a subset of of measure such that
- (ii) Show that there exists subsets of of measure and such that
for all and .
Exercise 2 (99% Balog-Szemerédi theorem) Let be a finite bipartite graph with .
- (i) Show that there exists a subset of of size such that for every , and are connected by at least paths of length . (Hint: select to be those vertices in that are connected to “almost all” the vertices in .)
- (ii) Show that there also exists a subset of of size such that for every and , and are connected by at least paths of length .
We now apply the graph-theoretic lemma to the group context. The main idea here is to show that various sets (e.g. product sets ) are small by showing that they are in the high-multiplicity region of some convolution (e.g. ), or equivalently that elements of such sets have many representations as a product with . One can then use Markov’s inequality and the trivial identity to get usable size bounds on such sets.
for some . (This hypothesis should be compared with the upper bound
arising from Young’s inequality.) Then there exists subsets of respectively with and with and .
The quantity (or equivalently, the number of solutions to the equation with and ) is also known as the multiplicative energy of and , and is sometimes denoted in the literature.
Proof: By hypothesis, we have
we conclude that
Since, by Cauchy-Schwarz (or Young’s inequality), we have , we conclude that there is a set with such that
for all .
By slight abuse of notation (arising from the fact that , are not necessarily disjoint, and that is a set of ordered pairs rather than unordered pairs), we can view the triplet as a bipartite graph. Applying Lemma 4, we can find subsets of respectively with and such that for all and , one can find elements such that . In particular, we see that
Observe that . Using the identity , we note that triples for are precisely those triples with . Thus the left-hand side of (3) is equal to , where
we see from Markov’s inequality that there are at most possible values for , which gives the bound .
The second bound can be proven similarly to the first (noting that any are connected by paths of length six), but can also from the former bound as follows. Observe that any element has at least representations of the form with , and hence , thus
on . On the other hand, the left-hand side has an norm of , and the bound then follows from Markov’s inequality.
Exercise 4 If are three non-empty finite subsets of , establish the Ruzsa triangle inequality . (Hint: mimic the final part of the proof of Corollary 5.)
We now give a variant of this corollary involving approximate groups.
for some . Then there exists a -approximate group with such that for some .
Proof: By Corollary 5, we may find a subset with such that
By Exercise 3, this implies that
Observe that the left-hand side is equal to
We conclude that
On the other hand, we have
As a consequence, we see that if we set
for some sufficiently large absolute constant , then
and thus (for large enough)
Since , we conclude that
Also, is clearly symmetric and contains the origin.
Now let us consider an element of the product . By construction of , we can write each as a product with in at least ways. Doing so for each gives rise to a factorisation
where ; as the uniquely determine the (for fixed ), we conclude that each element of has at least such factorisations. But by (4), there are at most such tuples , and so there are at most possible values for , thus
By the Ruzsa covering lemma (see the exercise below), this implies that is covered by left-translates of , and so is a -approximate group. Finally, from (5) one has
and thus by Exercise 3
In particular, since the support of has size , one has
for some , or equivalently that
Increasing to and taking inverses, we conclude that , and the claim follows.
Exercise 5 (Ruzsa covering lemma) Let be finite non-empty subsets of a group . Show that can be covered by at most left-translates of . (Hint: consider a maximal disjoint collection of translates of with .)
Exercise 6 (Converse to Balog-Szemerédi-Gowers) Let be a finite symmetric subset of a group , and suppose there exists a -approximate group with such that for some . Show that
Exercise 7 Let be finite non-empty subsets of a group , and suppose that . Show that there exists a -approximate group with and elements such that and .
Finally, we can prove Lemma 1. Fix . We may assume that
and we need to use this to locate an -approximate group in with and an element such that .
Let us write . Intuitively, represents the “width” of the probability meaure , as can be seen by considering the model example where is a symmetric set of cardinality (i.e. is the uniform probability measure on ). If we were actually in this model case, we could apply Lemma 6 immediately and be done. Of course, in general, need not be a uniform measure on a set of size . However, it turns out that one can use (6) to conclude that the “bulk” of is basically of this form.
More precisely, let us split , where
and so by Young’s inequality
In a similar vein, we have
and thus by Young’s inequality (and the normalisation )
Finally, from (6) one has
Subtracting using the triangle inequality (ignoring some slight double-counting), we conclude that
If we then set , we conclude in particular that
On the other hand, from Markov’s inequality one has . Applying Lemma 6, we conclude the existence of a -approximate group with such that for some , which by definition of implies that , and the claim follows.
— 2. The Bourgain-Gamburd expansion machine —
We can now prove Theorem 2. We can assume that is sufficiently large depending on the parameters , since the claim is trivial for bounded (note that as generates , the Cayley graph will be an -expander for some ). Henceforth we allow all implied constants in the asymptotic notation to depend on .
To show expansion, it suffices from the quasirandomness hypothesis (and Proposition 4 from the preceding notes), it will suffice to show that
for some .
From Young’s inequality, is decreasing in , and is initially equal to when . We need to “flatten” the norm of as increases. We first use the non-concentration hypothesis to obtain an initial amount of flattening:
for all proper subgroups of and all .
Proof: By the non-concentration hypothesis, we can find such that
for all proper subgroups of . If we write as , we see that
for all . By symmetry, , and thus
If , then we may write as the convolution of a probability measure and . From this, we see that
for all , giving the claim (9). Specialising this to the case when is the trivial group, one has
Since we also have
the claim (8) then follows from Hölder’s inequality.
Now we obtain additional flattening using the product theorem hypothesis:
Then one has
for some depending only on and .
Proof: Suppose the claim fails for some to be chosen later, thus
Applying Lemma 1, we may thus find a -approximate group with
and such that
Since by (9), we see that
Meanwhile, from (10) one has
Applying the product hypothesis (assuming sufficiently small depending on and ), we conclude that generates a proper subgroup of , and thus
But this contradicts (9) (again if is sufficiently small).
Iterating the above lemma times we obtain (7) for some , as desired.
Remark 4 Roughly speaking, the three hypotheses in Theorem 2 govern three separate stages of the life cycle of the random walk and its distributions . In the early stage , the non-concentration hypotheses creates some initial spreading of this random walk, in particular ensuring that the walk “escapes” from cosets of proper subgroups. In the middle stage , the product theorem steadily flattens the distribution of the random walk, until it is very roughly comparable to the uniform distribution. Finally, in the late stage , the quasirandomness property can smooth out the random walk almost completely to obtain the mixing necessary for expansion.