You are currently browsing the tag archive for the ‘expander graphs’ tag.

In this final set of course notes, we discuss how (a generalisation of) the expansion results obtained in the preceding notes can be used for some nnumber-theoretic applications, and in particular to locate almost primes inside orbits of thin groups, following the work of Bourgain, Gamburd, and Sarnak. We will not attempt here to obtain the sharpest or most general results in this direction, but instead focus on the simplest instances of these results which are still illustrative of the ideas involved.

One of the basic general problems in analytic number theory is to locate tuples of primes of a certain form; for instance, the famous (and still unsolved) twin prime conjecture asserts that there are infinitely many pairs ${(n_1,n_2)}$ in the line ${\{ (n_1,n_2) \in {\bf Z}^2: n_2-n_1=2\}}$ in which both entries are prime. In a similar spirit, one of the Landau conjectures (also still unsolved) asserts that there are infinitely many primes in the set ${\{ n^2+1: n \in {\bf Z} \}}$. The Mersenne prime conjecture (also unsolved) asserts that there are infinitely many primes in the set ${\{ 2^n - 1: n \in {\bf Z} \}}$, and so forth.

More generally, given some explicit subset ${V}$ in ${{\bf R}^d}$ (or ${{\bf C}^d}$, if one wishes), such as an algebraic variety, one can ask the question of whether there are infinitely many integer lattice points ${(n_1,\ldots,n_d)}$ in ${V \cap {\bf Z}^d}$ in which all the coefficients ${n_1,\ldots,n_d}$ are simultaneously prime; let us refer to such points as prime points.

At this level of generality, this problem is impossibly difficult. Indeed, even the much simpler problem of deciding whether the set ${V \cap {\bf Z}^d}$ is non-empty (let alone containing prime points) when ${V}$ is a hypersurface ${\{ x \in {\bf R}^d: P(x) = 0 \}}$ cut out by a polynomial ${P}$ is essentially Hilbert’s tenth problem, which is known to be undecidable in general by Matiyasevich’s theorem. So one needs to restrict attention to a more special class of sets ${V}$, in which the question of finding integer points is not so difficult. One model case is to consider orbits ${V = \Gamma b}$, where ${b \in {\bf Z}^d}$ is a fixed lattice vector and ${\Gamma}$ is some discrete group that acts on ${{\bf Z}^d}$ somehow (e.g. ${\Gamma}$ might be embedded as a subgroup of the special linear group ${SL_d({\bf Z})}$, or on the affine group ${SL_d({\bf Z}) \ltimes {\bf Z}^d}$). In such a situation it is then quite easy to show that ${V = V \cap {\bf Z}^d}$ is large; for instance, ${V}$ will be infinite precisely when the stabiliser of ${b}$ in ${\Gamma}$ has infinite index in ${\Gamma}$.

Even in this simpler setting, the question of determining whether an orbit ${V = \Gamma b}$ contains infinitely prime points is still extremely difficult; indeed the three examples given above of the twin prime conjecture, Landau conjecture, and Mersenne prime conjecture are essentially of this form (possibly after some slight modification of the underlying ring ${{\bf Z}}$, see this paper of Bourgain-Gamburd-Sarnak for details), and are all unsolved (and generally considered well out of reach of current technology). Indeed, the list of non-trivial orbits ${V = \Gamma b}$ which are known to contain infinitely many prime points is quite slim; Euclid’s theorem on the infinitude of primes handles the case ${V = {\bf Z}}$, Dirichlet’s theorem handles infinite arithmetic progressions ${V = a{\bf Z} + r}$, and a somewhat complicated result of Green, Tao, and Ziegler handles “non-degenerate” affine lattices in ${{\bf Z}^d}$ of rank two or more (such as the lattice of length ${d}$ arithmetic progressions), but there are few other positive results known that are not based on the above cases (though we will note the remarkable theorem of Friedlander and Iwaniec that there are infinitely many primes of the form ${a^2+b^4}$, and the related result of Heath-Brown that there are infinitely many primes of the form ${a^3+2b^3}$, as being in a kindred spirit to the above results, though they are not explicitly associated to an orbit of a reasonable action as far as I know).

On the other hand, much more is known if one is willing to replace the primes by the larger set of almost primes – integers with a small number of prime factors (counting multiplicity). Specifically, for any ${r \geq 1}$, let us call an ${r}$-almost prime an integer which is the product of at most ${r}$ primes, and possibly by the unit ${-1}$ as well. Many of the above sorts of questions which are open for primes, are known for ${r}$-almost primes for ${r}$ sufficiently large. For instance, with regards to the twin prime conjecture, it is a result of Chen that there are infinitely many pairs ${p,p+2}$ where ${p}$ is a prime and ${p+2}$ is a ${2}$-almost prime; in a similar vein, it is a result of Iwaniec that there are infinitely many ${2}$-almost primes of the form ${n^2+1}$. On the other hand, it is still open for any fixed ${r}$ whether there are infinitely many Mersenne numbers ${2^n-1}$ which are ${r}$-almost primes. (For the superficially similar situation with the numbers ${2^n+1}$, it is in fact believed (but again unproven) that there are only finitely many ${r}$-almost primes for any fixed ${r}$ (the Fermat prime conjecture).)

The main tool that allows one to count almost primes in orbits is sieve theory. The reason for this lies in the simple observation that in order to ensure that an integer ${n}$ of magnitude at most ${x}$ is an ${r}$-almost prime, it suffices to guarantee that ${n}$ is not divisible by any prime less than ${x^{1/(r+1)}}$. Thus, to create ${r}$-almost primes, one can start with the integers up to some large threshold ${x}$ and remove (or “sieve out”) all the integers that are multiples of any prime ${p}$ less than ${x^{1/(r+1)}}$. The difficulty is then to ensure that a sufficiently non-trivial quantity of integers remain after this process, for the purposes of finding points in the given set ${V}$.

The most basic sieve of this form is the sieve of Eratosthenes, which when combined with the inclusion-exclusion principle gives the Legendre sieve (or exact sieve), which gives an exact formula for quantities such as the number ${\pi(x,z)}$ of natural numbers less than or equal to ${x}$ that are not divisible by any prime less than or equal to a given threshold ${z}$. Unfortunately, when one tries to evaluate this formula, one encounters error terms which grow exponentially in ${z}$, rendering this sieve useful only for very small thresholds ${z}$ (of logarithmic size in ${x}$). To improve the sieve level up to a small power of ${x}$ such as ${x^{1/(r+1)}}$, one has to replace the exact sieve by upper bound sieves and lower bound sieves which only seek to obtain upper or lower bounds on quantities such as ${\pi(x,z)}$, but contain a polynomial number of terms rather than an exponential number. There are a variety of such sieves, with the two most common such sieves being combinatorial sieves (such as the beta sieve), based on various combinatorial truncations of the inclusion-exclusion formula, and the Selberg upper bound sieve, based on upper bounds that are the square of a divisor sum. (There is also the large sieve, which is somewhat different in nature and based on ${L^2}$ almost orthogonality considerations, rather than on any actual sieving, to obtain upper bounds.) We will primarily work with a specific sieve in this notes, namely the beta sieve, and we will not attempt to optimise all the parameters of this sieve (which ultimately means that the almost primality parameter ${r}$ in our results will be somewhat large). For a more detailed study of sieve theory, see the classic text of Halberstam and Richert, or the more recent texts of Iwaniec-Kowalski or of Friedlander-Iwaniec.

Very roughly speaking, the end result of sieve theory is that excepting some degenerate and “exponentially thin” settings (such as those associated with the Mersenne primes), all the orbits which are expected to have a large number of primes, can be proven to at least have a large number of ${r}$-almost primes for some finite ${r}$. (Unfortunately, there is a major obstruction, known as the parity problem, which prevents sieve theory from lowering ${r}$ all the way to ${1}$; see this blog post for more discussion.) One formulation of this principle was established by Bourgain, Gamburd, and Sarnak:

Theorem 1 (Bourgain-Gamburd-Sarnak) Let ${\Gamma}$ be a subgroup of ${SL_2({\bf Z})}$ which is not virtually solvable. Let ${f: {\bf Z}^4 \rightarrow {\bf Z}}$ be a polynomial with integer coefficients obeying the following primitivity condition: for any positive integer ${q}$, there exists ${A \in \Gamma \subset {\bf Z}^4}$ such that ${f(A)}$ is coprime to ${q}$. Then there exists an ${r \geq 1}$ such that there are infinitely many ${A \in \Gamma}$ with ${f(A)}$ non-zero and ${r}$-almost prime.

This is not the strongest version of the Bourgain-Gamburd-Sarnak theorem, but it captures the general flavour of their results. Note that the theorem immediately implies an analogous result for orbits ${\Gamma b \subset {\bf Z}^2}$, in which ${f}$ is now a polynomial from ${{\bf Z}^2}$ to ${{\bf Z}}$, and one uses ${f(Ab)}$ instead of ${f(A)}$. It is in fact conjectured that one can set ${r=1}$ here, but this is well beyond current technology. For the purpose of reaching ${r=1}$, it is very natural to impose the primitivity condition, but as long as one is content with larger values of ${r}$, it is possible to relax the primitivity condition somewhat; see the paper of Bourgain, Gamburd, and Sarnak for more discussion.

By specialising to the polynomial ${f: \begin{pmatrix} a & b \\ c & d \end{pmatrix} \rightarrow abcd}$, we conclude as a corollary that as long as ${\Gamma}$ is primitive in the sense that it contains matrices with all coefficients coprime to ${q}$ for any given ${q}$, then ${\Gamma}$ contains infinitely many matrices whose elements are all ${r}$-almost primes for some ${r}$ depending only on ${\Gamma}$. For further applications of these sorts of results, for instance to Appolonian packings, see the paper of Bourgain, Gamburd, and Sarnak.

It turns out that to prove Theorem 1, the Cayley expansion results in ${SL_2(F_p)}$ from the previous set of notes are not quite enough; one needs a more general Cayley expansion result in ${SL_2({\bf Z}/q{\bf Z})}$ where ${q}$ is square-free but not necessarily prime. The proof of this expansion result uses the same basic methods as in the ${SL_2(F_p)}$ case, but is significantly more complicated technically, and we will only discuss it briefly here. As such, we do not give a complete proof of Theorem 1, but hopefully the portion of the argument presented here is still sufficient to give an impression of the ideas involved.

We have now seen two ways to construct expander Cayley graphs ${Cay(G,S)}$. The first, discussed in Notes 2, is to use Cayley graphs that are projections of an infinite Cayley graph on a group with Kazhdan’s property (T). The second, discussed in Notes 3, is to combine a quasirandomness property of the group ${G}$ with a flattening hypothesis for the random walk.

We now pursue the second approach more thoroughly. The main difficulty here is to figure out how to ensure flattening of the random walk, as it is then an easy matter to use quasirandomness to show that the random walk becomes mixing soon after it becomes flat. In the case of Selberg’s theorem, we achieved this through an explicit formula for the heat kernel on the hyperbolic plane (which is a proxy for the random walk). However, in most situations such an explicit formula is not available, and one must develop some other tool for forcing flattening, and specifically an estimate of the form

$\displaystyle \| \mu^{(n)} \|_{\ell^2(G)} \ll |G|^{-1/2+\epsilon} \ \ \ \ \ (1)$

for some ${n = O(\log |G|)}$, where ${\mu}$ is the uniform probability measure on the generating set ${S}$.

In 2006, Bourgain and Gamburd introduced a general method for achieving this goal. The intuition here is that the main obstruction that prevents a random walk from spreading out to become flat over the entire group ${G}$ is if the random walk gets trapped in some proper subgroup ${H}$ of ${G}$ (or perhaps in some coset ${xH}$ of such a subgroup), so that ${\mu^{(n)}(xH)}$ remains large for some moderately large ${n}$. Note that

$\displaystyle \mu^{(2n)}(H) \geq \mu^{(n)}(H x^{-1}) \mu^{(n)}(xH) = \mu^{(n)}(xH)^2,$

since ${\mu^{(2n)} = \mu^{(n)} * \mu^{(n)}}$, ${H = (H x^{-1}) \cdot (xH)}$, and ${\mu^{(n)}}$ is symmetric. By iterating this observation, we seethat if ${\mu^{(n)}(xH)}$ is too large (e.g. of size ${|G|^{-o(1)}}$ for some ${n}$ comparable to ${\log |G|}$), then it is not possible for the random walk ${\mu^{(n)}}$ to converge to the uniform distribution in time ${O(\log |G|)}$, and so expansion does not occur.

A potentially more general obstruction of this type would be if the random walk gets trapped in (a coset of) an approximate group ${H}$. Recall that a ${K}$-approximate group is a subset ${H}$ of a group ${G}$ which is symmetric, contains the identity, and is such that ${H \cdot H}$ can be covered by at most ${K}$ left-translates (or equivalently, right-translates) of ${H}$. Such approximate groups were studied extensively in last quarter’s course. A similar argument to the one given previously shows (roughly speaking) that expansion cannot occur if ${\mu^{(n)}(xH)}$ is too large for some coset ${xH}$ of an approximate group.

It turns out that this latter observation has a converse: if a measure does not concentrate in cosets of approximate groups, then some flattening occurs. More precisely, one has the following combinatorial lemma:

Lemma 1 (Weighted Balog-Szemerédi-Gowers lemma) Let ${G}$ be a group, let ${\nu}$ be a finitely supported probability measure on ${G}$ which is symmetric (thus ${\nu(g)=\nu(g^{-1})}$ for all ${g \in G}$), and let ${K \geq 1}$. Then one of the following statements hold:

• (i) (Flattening) One has ${\| \nu * \nu \|_{\ell^2(G)} \leq \frac{1}{K} \|\nu\|_{\ell^2(G)}}$.
• (ii) (Concentration in an approximate group) There exists an ${O(K^{O(1)})}$-approximate group ${H}$ in ${G}$ with ${|H| \ll K^{O(1)} / \| \nu \|_{\ell^2(G)}^2}$ and an element ${x \in G}$ such that ${\nu(xH) \gg K^{-O(1)}}$.

This lemma is a variant of the more well-known Balog-Szemerédi-Gowers lemma in additive combinatorics due to Gowers (which roughly speaking corresponds to the case when ${\mu}$ is the uniform distribution on some set ${A}$), which in turn is a polynomially quantitative version of an earlier lemma of Balog and Szemerédi. We will prove it below the fold.

The lemma is particularly useful when the group ${G}$ in question enjoys a product theorem, which roughly speaking says that the only medium-sized approximate subgroups of ${G}$ are trapped inside genuine proper subgroups of ${G}$ (or, contrapositively, medium-sized sets that generate the entire group ${G}$ cannot be approximate groups). The fact that some finite groups (and specifically, the bounded rank finite simple groups of Lie type) enjoy product theorems is a non-trivial fact, and will be discussed in later notes. For now, we simply observe that the presence of the product theorem, together with quasirandomness and a non-concentration hypothesis, can be used to demonstrate expansion:

Theorem 2 (Bourgain-Gamburd expansion machine) Suppose that ${G}$ is a finite group, that ${S \subseteq G}$ is a symmetric set of ${k}$ generators, and that there are constants ${0 < \kappa < 1 < \Lambda}$ with the following properties.

1. (Quasirandomness). The smallest dimension of a nontrivial representation ${\rho: G \rightarrow GL_d({\bf C})}$ of ${G}$ is at least ${|G|^{\kappa}}$;
2. (Product theorem). For all ${\delta > 0}$ there is some ${\delta' = \delta'(\delta) > 0}$ such that the following is true. If ${H \subseteq G}$ is a ${|G|^{\delta'}}$-approximate subgroup with ${|G|^{\delta} \leq |H| \leq |G|^{1 - \delta}}$ then ${H}$ generates a proper subgroup of ${G}$;
3. (Non-concentration estimate). There is some even number ${n \leq \Lambda\log |G|}$ such that

$\displaystyle \sup_{H < G}\mu^{(n)}(H) < |G|^{-\kappa},$

where the supremum is over all proper subgroups ${H < G}$.

Then ${Cay(G,S)}$ is a two-sided ${\epsilon}$-expander for some ${\epsilon > 0}$ depending only on ${k,\kappa, \Lambda}$, and the function ${\delta'(\cdot )}$ (and this constant ${\epsilon}$ is in principle computable in terms of these constants).

This criterion for expansion is implicitly contained in this paper of Bourgain and Gamburd, who used it to establish the expansion of various Cayley graphs in ${SL_2(F_p)}$ for prime ${p}$. This criterion has since been applied (or modified) to obtain expansion results in many other groups, as will be discussed in later notes.

Emmanuel Breuillard, Ben Green, and I have just uploaded to the arXiv the paper “Suzuki groups as expanders“, to be submitted. The purpose of this paper is to finish off the last case of the following theorem:

Theorem 1 (All finite simple groups have expanders) For every finite simple non-abelian group ${G}$, there exists a set of generators ${S}$ of cardinality bounded uniformly in ${G}$, such that the Cayley graph ${\hbox{Cay}(G,S)}$ on ${G}$ generated by ${S}$ (i.e. the graph that connects ${g}$ with ${sg}$ for all ${g \in G}$ and ${s \in S}$) has expansion constant bounded away from zero uniformly in ${G}$, or equivalently that ${|A \cdot S| \geq (1+\epsilon) |A|}$ for all ${A \subset G}$ with ${|A| < |G|/2}$ and some ${\epsilon>0}$ independent of ${G}$.

To put in an essentially equivalent way, one can quickly generate a random element of a finite simple group with a near-uniform distribution by multiplying together a few (${O(\log |G|)}$, to be more precise) randomly chosen elements of a fixed set ${S}$. (The most well-known instance of this phenomenon is the famous result of Bayer and Diaconis that one can shuffle a 52-card deck reasonably well after seven riffle shuffles, and almost perfectly after ten.) Note that the abelian simple groups ${{\bf Z}/p{\bf Z}}$ do not support expanders due to the slow mixing time of random walks in the abelian setting.

The first step in proving this theorem is, naturally enough, the classification of finite simple groups. The sporadic groups have bounded cardinality and are a trivial case of this theorem, so one only has to deal with the seventeen infinite families of finite non-abelian simple groups. With one exception, the groups ${G}$ in all of these families contain a copy of ${SL_2({\bf F}_q)}$ for some ${q}$ that goes to infinity as ${|G| \rightarrow \infty}$. Using this and several other non-trivial tools (such as Kazhdan’s property (T) and a deep model-theoretic result of Hrushovski and Pillay), the above theorem was proven for all groups outside of this exceptional family by a series of works culminating in the paper of Kassabov, Lubotzky, and Nikolov.

The exceptional family is the family of Suzuki groups ${Sz(q)}$, where ${q = 2^{2n+1}}$ is an odd power of ${2}$. The Suzuki group ${Sz(q)}$ can be viewed explicitly as a subgroup of the symplectic group ${Sp_4(q)}$ and has cardinality ${q^2 (q^2+1)(q-1) \approx q^5}$. This cardinality is not divisible by ${3}$, whereas all groups of the form ${SL_2(k)}$ have cardinality divisible by ${3}$; thus Suzuki groups do not contain copies of ${SL_2}$ and the Kassabov-Lubotsky-Nikolov argument does not apply.

Our main result is that the Suzuki groups also support expanders, thus completing the last case of the above theorem. In fact we can pick just two random elements ${a, b}$ of the Suzuki group, and with probability ${1-o_{q \rightarrow \infty}(1)}$, the Cayley graph generated by ${S = \{a,b,a^{-1},b^{-1}\}}$ will be an expander uniformly in ${q}$. (As stated in the paper of Kassabov-Lubotsky-Nikolov, the methods in that paper should give an upper bound on ${S}$ which they conservatively estimate to be ${1000}$.)

Our methods are different, instead following closely the arguments of Bourgain and Gamburd, which established the analogue of our result (i.e. that two random elements generate an expander graph) for the family of groups ${SL_2({\bf F}_p)}$ (${p}$ a large prime); the arguments there have since been generalised to several other major families of groups, and our result here can thus be viewed as one further such generalisation. Roughly speaking, the strategy is as follows. Let ${\mu}$ be the uniform probability measure on the randomly chosen set of generators ${S}$, and let ${\mu^{(n)}}$ be the ${n}$-fold convolution. We need ${\mu^{(n)}}$ to converge rapidly to the uniform measure on ${G}$ (with a mixing time of ${O(\log |G|)}$). There are three steps to obtain this mixing:

• (Early period) When ${n \sim c \log |G|}$ for some small ${c > 0}$, one wants ${\mu^{(n)}}$ to spread out a little bit in the sense that no individual element of ${G}$ is assigned a mass of any more than ${|G|^{-\epsilon}}$ for some fixed ${\epsilon > 0}$. More generally, no proper subgroup ${H}$ of ${G}$ should be assigned a mass of more than ${|G|^{-\epsilon}}$.
• (Middle period) Once ${\mu^{(n)}}$ is somewhat spread out, one should be able to convolve this measure with itself a bounded number of times and conclude that the measure ${\mu^{(Cn)}}$ for some suitable ${C}$ is reasonably spread out in the sense that its ${L^2}$ norm is comparable (up to powers of ${|G|^{\epsilon}}$ for some small ${\epsilon > 0}$) to the ${L^2}$ norm of the uniform distribution.
• (Late period) Once ${\mu^{(n)}}$ is reasonably spread out, a few more convolutions should make it extremely close to uniform (e.g. within ${|G|^{-10}}$ in the ${L^\infty}$ norm).

The late period claim is easy to establish from Gowers’ theory of quasirandom groups, the key point being that (like all other finite simple nonabelian groups), the Suzuki groups do not admit any non-trivial low-dimensional irreducible representations (we can for instance use a precise lower bound of ${\gg q^{3/2}}$, due to Landazuri and Seitz). (One can also proceed here using a trace formula argument of Sarnak and Xue; the two approaches are basically equivalent.) The middle period reduces, by a variant of the Balog-Szemerédi-Gowers lemma, to a product estimate in ${Sz(q)}$ which was recently established by Pyber-Szábo and can also be obtained by the methods of proof of the results announced by ourselves. (These arguments are in turn based on an earlier result of Helfgott establishing the analogous claim for ${SL_2({\bf F}_p)}$.) This requires checking that ${Sz(q)}$ is a “sufficiently Zariski dense” subgroup of the finite Lie group ${Sp_4(q)}$, but this can be done using an explicit description of the Suzuki group and the Schwartz-Zippel lemma.

The main difficulty is then to deal with the early period, obtaining some initial non-concentration in the random walk associated to ${S}$ away from subgroups ${H}$ of ${Sz(q)}$. These subgroups have been classified for some time (see e.g. the book of Wilson); they split into two families, the algebraic subgroups, which in the Suzuki case turn out to be solvable of derived length at most three, and the arithmetic subgroups, which are conjugate to ${Sz(q_0)}$, where ${{\bf F}_{q_0}}$ is a subfield of ${{\bf F}_q}$.

In the algebraic case, one can prevent concentration using a lower bound on the girth of random Cayley graphs due to Gamburd, Hoory, Shahshahani, Shalev, and Virág (and we also provide an independent proof of this fact for completeness, which fortunately is able to avoid any really deep technology, such as Lang-Weil estimates); this follows an analogous argument of Bourgain-Gamburd in the ${SL_2}$ case fairly closely, and is ultimately based on the fact that all the algebraic subgroups obey a fixed law (in this case, the law arises from the solvability). In the arithmetic case, the main task is to show that the coefficients of the characteristic polynomial of a typical word in ${S}$ does not fall into a proper subfield of ${{\bf F}_q}$, but this can be accomplished by a variant of the Schwartz-Zippel lemma.

In this final lecture in the Marker lecture series, I discuss the recent work of Bourgain, Gamburd, and Sarnak on how arithmetic combinatorics and expander graphs were used to sieve for almost primes in various algebraic sets.

On Thursday, Avi Wigderson continued his Distinguished Lecture Series here at UCLA on computational complexity with his second lecture “Expander Graphs – Constructions and Applications“. As in the previous lecture, he spent some additional time after the talk on an “encore”, which in this case was how lossless expanders could be used to obtain rapidly decodable error-correcting codes.

The talk was largely based on these slides. Avi also has a recent monograph with Hoory and Linial on these topics. (For a brief introduction to expanders, I can also recommend Peter Sarnak’s Notices article. I also mention expanders to some extent in my third Milliman lecture.)