The objective of this course is to present a number of recent constructions of expander graphs, which are a type of sparse but “pseudorandom” graph of importance in computer science, the theory of random walks, geometric group theory, and in number theory. The subject of expander graphs and their applications is an immense one, and we will not possibly be able to cover it in full in this course. In particular, we will say almost nothing about the important applications of expander graphs to computer science, for instance in constructing good pseudorandom number generators, derandomising a probabilistic algorithm, constructing error correcting codes, or in building probabilistically checkable proofs. For such topics, I recommend the survey of Hoory-Linial-Wigderson. We will also only pass very lightly over the other applications of expander graphs, though if time permits I may discuss at the end of the course the application of expander graphs in finite groups such as {SL_2(F_p)} to certain sieving problems in analytic number theory, following the work of Bourgain, Gamburd, and Sarnak.

Instead of focusing on applications, this course will concern itself much more with the task of constructing expander graphs. This is a surprisingly non-trivial problem. On one hand, an easy application of the probabilistic method shows that a randomly chosen (large, regular, bounded-degree) graph will be an expander graph with very high probability, so expander graphs are extremely abundant. On the other hand, in many applications, one wants an expander graph that is more deterministic in nature (requiring either no or very few random choices to build), and of a more specialised form. For the applications to number theory or geometric group theory, it is of particular interest to determine the expansion properties of a very symmetric type of graph, namely a Cayley graph; we will also occasionally work with the more general concept of a Schreier graph. It turns out that such questions are related to deep properties of various groups {G} of Lie type (such as {SL_2({\bf R})} or {SL_2({\bf Z})}), such as Kazhdan’s property (T), the first nontrivial eigenvalue of a Laplacian on a symmetric space {G/\Gamma} associated to {G}, the quasirandomness of {G} (as measured by the size of irreducible representations), and the product theory of subsets of {G}. These properties are of intrinsic interest to many other fields of mathematics (e.g. ergodic theory, operator algebras, additive combinatorics, representation theory, finite group theory, number theory, etc.), and it is quite remarkable that a single problem – namely the construction of expander graphs – is so deeply connected with such a rich and diverse array of mathematical topics. (Perhaps this is because so many of these fields are all grappling with aspects of a single general problem in mathematics, namely when to determine whether a given mathematical object or process of interest “behaves pseudorandomly” or not, and how this is connected with the symmetry group of that object or process.)

(There are also other important constructions of expander graphs that are not related to Cayley or Schreier graphs, such as those graphs constructed by the zigzag product construction, but we will not discuss those types of graphs in this course, again referring the reader to the survey of Hoory, Linial, and Wigderson.)

— 1. Expander graphs —

We begin by defining the concept of an expander graph formally. As with many fundamentally important concepts in mathematics, there are a number of equivalent definitions of this concept. We will adopt a “spectral” perspective towards expander graphs, defining them in terms of a certain spectral gap, but will relate this formulation of expansion to the more classical notion of edge expansion later in this section.

We begin by recalling the notion of a graph. To avoid some very minor technical issues, we will work with undirected, loop-free, multiplicity-free graphs (though later, when we discuss Cayley graphs, we will allow loops and repetition).

Definition 1 A graph is a pair {G = (V,E)}, where {V} is a set (called the vertex set of {G}), and {E \subset \binom{V}{2}} is a collection of unordered pairs {\{v,w\}} of distinct elements {v,w} of {V}, known as the edge set of {E}. Elements of {V} or {E} are called vertices and edges of {E} A graph is finite if the vertex set (and hence the edge set) is finite. If {k \geq 0} is a natural number, we say that a graph {G = (V,E)} is {k}-regular if each vertex of {V} is contained in exactly {k} edges in {E}; we refer to {k} as the degree of the regular graphs.

Example 2 The complete graph {(V, \binom{V}{2})} on a vertex set {V} has edge set {\binom{V}{2} := \{ \{v,w\}: v,w \in V, v \neq w \}}. If {V} has {n} elements, the complete graph is {n-1}-regular.

In this course, we will mostly be interested in constant-degree large finite regular graphs, in which {k} is fixed (e.g. {k=4}), and the number {n=|V|} of vertices is going off to infinity.

Given a finite graph {G = (V,E)}, one can define the adjacency operator {A: \ell^2(V) \rightarrow \ell^2(V)} on functions {f: G \rightarrow {\bf C}} by the formula

\displaystyle A f(v) := \sum_{w \in V: \{v,w\} \in E} f(w),

thus {Af(v)} is the sum of {f} over all of the neighbours of {v}. If one enumerates the vertices {V} as {v_1,\ldots,v_n} in some fashion, then one can associate {A} with an {n \times n} matrix, known as the adjacency matrix of {G} (with this choice of vertex enumeration).

As our graphs are undirected, the adjacency operator {A} is clearly self-adjoint (and the adjacency matrix is real symmetric). By the spectral theorem, {A} thus has {n} real eigenvalues

\displaystyle \lambda_1 \geq \ldots \geq \lambda_n.

We will write {\lambda_i} as {\lambda_i(G)} whenever we need to emphasise the dependence on the graph {G}.

The largest eigenvalue {\lambda_1} is easily understood for {k}-regular graphs:

Lemma 3 If {G} is a {k}-regular graph, then

\displaystyle k = \lambda_1 \geq \lambda_n \geq -k.

Proof: Clearly {A1 = k1} (where we write {1 \in \ell^2(V)} for the constant function), and so {k} is an eigenvalue of {A} with eigenvector {1}. On the other hand, for any {f, g \in \ell^2(V)} with norm one, one has

\displaystyle |\langle Af, g \rangle_{\ell^2(V)}| = |\sum_{v, w \in V: \{v,w\} \in E} f(w) \overline{g(v)}|

\displaystyle \leq \frac{1}{2} \sum_{v, w \in V: \{v,w\} \in E} |f(w)|^2 + |g(v)|^2

\displaystyle \leq \frac{1}{2} k \sum_{w \in V} |f(w)|^2 + \frac{1}{2} k \sum_{v \in V} |g(v)|^2

\displaystyle = k,

and so {A} has operator norm {k} (or equivalently, all eigenvalues of {A} lie between {-k} and {k}). The claim follows. \Box

Now we turn to the next eigenvalue after {\lambda_1}.

Definition 4 Let {\varepsilon > 0} and {k \geq 1}. A finite {k}-regular graph is said to be a (one-sided) {\varepsilon}-expander if one has

\displaystyle \lambda_2 \leq (1-\varepsilon) k

and a two-sided {\varepsilon}-expander if one also has

\displaystyle \lambda_n \geq -(1-\varepsilon) k.

A sequence {G_i = (V_i,E_i)} of finite {k}-regular graphs is said to be a one-sided (resp. two-sided) expander family if there is an {\varepsilon>0} such that {G_i} is a one-sided (resp. two-sided) {\varepsilon}-expander for all sufficiently large {i}.

Remark 5 The operator {\Delta := 1-\frac{1}{k} A} is sometimes known as the graph Laplacian (though note that it is also common to use the normalisation {k-A} instead of {1-\frac{1}{k} A} in many texts, particularly if one wishes to generalise to graphs that are not perfectly regular). This is a positive semi-definite operator with at least one zero eigenvalue (corresponding to the eigenvector {1}). A graph is an {\varepsilon}-expander if and only if there is a spectral gap of size {\varepsilon} in {\Delta}, in the sense that the first eigenvalue of {\Delta} exceeds the second by at least {\varepsilon}. The graph Laplacian is analogous to the classical Laplacian in Euclidean space (or the Laplace-Beltrami operator on Riemannian manifolds); we will see some formalisations of this analogy later in this course.

The original definition of expander graphs focused on one-sided expanders, but it will be slightly more natural in this course to focus on the two-sided expanders.

Strictly speaking, we have not defined the notion of a (one or two-sided) expander graph in the above definition; we have defined an {\varepsilon}-expander graph for any given parameter {\varepsilon>0}, and have defined the notion of an expander family, which is a sequence of graphs rather than for an individual graph. One could propose defining an expander graph to be a graph that is a one or two-sided {\varepsilon}-expander for some {\varepsilon>0} (or equivalently, a graph {G} such that the constant sequence {G,G,\ldots} is an expander family), but this definition collapses to existing concepts in graph theory:

Exercise 6 (Qualitative expansion) Let {k \geq 1}, and let {G = (V,E)} be a finite {k}-regular graph.

  • Show that {\lambda_2 = k} if and only if {G} is not connected.
  • Show that {\lambda_n = -k} if and only if {G} contains a non-empty bipartite graph as a connected component.

Thus, a graph is a one-sided expander for some {\varepsilon>0} if and only if it is connected, and a two-sided expander for some {\varepsilon>0} if and only if it is connected and not bipartite.

It is therefore necessary to either keep a more quantitative track of the {\varepsilon} parameter, or work with expander families (typically involving vertex sets whose cardinality goes to infinity) rather than with individual graphs. (Alternatively, one could adopt a nonstandard analysis viewpoint and work with ultra expander graphs – i.e. ultraproducts of expander families – but we will postpone using this viewpoint until much later in the course.) Nevertheless, we will often informally drop the {\varepsilon} parameter (or the use of families) and informally refer simply to “expander graphs” in our discussion.

Exercise 7 (Trace formulae) Let {G} be a {k}-regular graph on {n} vertices for some {n > k \geq 1}.

  • (i) Show that {\sum_{i=1}^n \lambda_i = 0}.
  • (ii) Show that {\sum_{i=1} \lambda_i^2 = nk}.
  • (iii) Show that {\max( |\lambda_2|, |\lambda_n| ) \geq \sqrt{k} - o_{n \rightarrow \infty;k}(1)}, where {o_{n \rightarrow \infty;k}(1)} denotes a quantity that goes to zero as {n \rightarrow \infty} for fixed {k}.

Remark 8 The above exercise places an upper bound on how strong of a two-sided expansion one can obtain for a large {k}-regular graph. It is not quite sharp; it turns out that one can obtain the improvement

\displaystyle \max( |\lambda_2|, |\lambda_n| ) \geq 2\sqrt{k-1} - o_{n \rightarrow \infty;k}(1),

a result of Alon and Boppana; see for instance the survey of Hoory-Linial-Wigderson for a proof. Graphs with {\max( |\lambda_2|, |\lambda_n| ) \leq 2\sqrt{k-1}} are known as Ramanujan graphs, and (as the name suggests) have connections to number theory, but we will not discuss this topic here; see for instance this text of Davidoff, Sarnak, and Valette for more discussion.

We will give a probabilistic construction of an expander family later, but let us first give an example of a family of regular graphs that is not an expander family.

Exercise 9 For each {n \geq 3}, let {G_n} be the {2}-regular graph whose vertex set is the cyclic group {{\bf Z}/n{\bf Z}}, and whose edge set is the set of pairs {\{x,x+1\}} for {x \in {\bf Z}/n{\bf Z}}. (This is a basic example of a Cayley graph. Cayley graphs will be discussed in more depth in the next set of notes.)

  • Show that the eigenvalues of the adjacency operator {A_n} associated to {G_n} are {2\cos(2\pi j/n)} for {j=0,\ldots,n-1}. (Hint: you may find the discrete Fourier transform to be helpful.)
  • Show that {G_n} is not a one-sided expander family (and is thus not a two-sided expander family either). This is despite {G_n} always being connected (and non-bipartite for {n} odd).

The next exercise shows that the complete graph is an excellent expander; the whole point, though, of expander graph constructions is to come up with much sparser graphs that still has many of the connectivity and expansion properties of the complete graph.

Exercise 10 Let {G} be the complete graph on {n} vertices, which is of course a {n-1}-regular graph. Show that

\displaystyle \lambda_2 = \ldots = \lambda_n = -1

and so {G} is a one-sided {1+\frac{1}{n-1}}-expander and a two-sided {1-\frac{1}{n-1}}-expander. (This is not a counterexample to Exercise 7(iii), because the error term {o_{n \rightarrow \infty;k}(1)} is only negligible in the regime when {k} is either fixed or is a very slowly growing function of {n}, and this is definitely not the case for the complete graph.)

Exercise 11 Let {G=(V,E)} be a {k}-regular graph on {n} vertices. Let {G^c=(V,\binom{V}{2} \backslash E)} be the complement graph, consisting of all the edges connecting two vertices in {V} that are not in {E}, thus {G^c} is an {n-k-1}-regular graph. Show that

\displaystyle \lambda_i(G^c) = - 1 - \lambda_{n+2-i}(G)

for all {2 \leq i \leq n}.

Exercise 12 Let {n \geq 2} be an even number, and let {G = K_{n/2,n/2}} be the complete bipartite graph between two sets of {n/2} vertices each, thus {G} is an {n/2}-regular graph. Show that {\lambda_n = -n/2} and {\lambda_2 = \ldots = \lambda_{n-1} = 0}. Thus, {G} is a one-sided {1}-expander, but is not a two-sided expander at all.

Exercise 13 (Expansion and Poincaré inequality) If {G = (V,E)} is a {k}-regular graph and {f: V \rightarrow {\bf C}} is a function, define the gradient magnitude {|\nabla f|: V \rightarrow {\bf C}} by the formula

\displaystyle |\nabla f(v)| := (\sum_{w \in V: \{v,w\} \in E} |f(w)-f(v)|^2)^{1/2}.

Show that {G} is a one-sided {\varepsilon}-expander if and only if one has the Poincaré inequality

\displaystyle \| \nabla f \|_{\ell^2(V)}^2 \geq 2 k \varepsilon \| f \|_{\ell^2(V)}^2

whenever {f} has mean zero.

— 2. Connection with edge expansion —

The intuition to explain Exercise 9 should be that while {G_n} is, strictly speaking, connected, it is not very strongly connected; the paths connecting a typical pair of points are quite long (comparable to {n}, the number of vertices) and it is easy to disconnect the graph into two large pieces simply by removing a handful of edges.

We now make this intuition more precise. Given two subsets {F_1,F_2} of the vertex set {V} in a graph {G = (V,E)}, define {E(F_1,F_2) \subset F_1 \times F_2} to be the set of all pairs {(v_1,v_2) \in F_1 \times F_2} such that {\{v_1,v_2\} \in E}. Note that the cardinality of this set can be expressed in terms of the adjacency operator as

\displaystyle |E(F_1,F_2)| = \langle A 1_{F_1}, 1_{F_2} \rangle_{\ell^2(V)}.

Define the boundary {\partial F} of a subset {F} of {V} to be the set {\partial F := E( F, V \backslash F )}, thus {\partial F} is essentially the set of all edges that connect an element of {F} to an element outside of {F}. We define the edge expansion ratio {h(G)} of the graph {G} to be given by the formula

\displaystyle h(G) := \min_{F \subset V: |F| \leq |V|/2} \frac{|\partial F|}{|F|},

where {F} ranges over all subsets of {V} of cardinality at most {|F| \leq |V|/2}. (Some upper bound on {F} is needed to avoid this quantity from degenerating, since {\partial F} becomes empty when {F=V}.) The quantity {h(G)} can be interpreted as a type of isoperimetric constant for {G}, analogous to the Cheeger constant of a compact Riemannian manifold, and so {h(G)} is sometimes known as the Cheeger constant of the graph {G}.

Note that {h(G)} is non-zero precisely when {G} is connected. (If {G} is disconnected, at least one of the components {F} will have cardinality less than {|V|/2}.) We have an analogous statement for one-sided expansion:

Proposition 14 (Weak discrete Cheeger inequality) Let {k \geq 1}, and let {G_n} be a family of finite {k}-regular graphs. Then the following are equivalent:

  • (i) {G_n} form a one-sided expander family.
  • (ii) There exists {c>0} such that {h(G_n) \geq c} for all sufficiently large {n}.

Proof: Let {n} be a large number. We abbreviate {G_n} as {G = (V,E)}.

We first establish the easy direction of this proposition, namely that (i) implies (ii). If {n} is large enough, then from the hypothesis (i) we have {\lambda_2 \leq (1-\varepsilon) k} for some {\varepsilon>0} independent of {n}.

Let {F} be a subset of {V} with {|F| \leq |V|/2}. We consider the quantity

\displaystyle \langle A 1_F, 1_F \rangle_{\ell^2(V)}. \ \ \ \ \ (1)

We split {1_F} into a multiple {\frac{|F|}{|V|} 1} of the first eigenvector {1}, plus the remainder {1_F - \frac{|F|}{|V|} 1}, Using the spectral decomposition of {A}, we can upper bound (1) by

\displaystyle k \|\frac{|F|}{|V|} 1\|_{\ell^2(V)}^2 + (1-\varepsilon) k \|1_F - \frac{|F|}{|V|} 1\|_{\ell^2(V)}^2

which after a brief calculation evaluates to

\displaystyle (1-\varepsilon) k |F| + \varepsilon k \frac{|F|^2}{|V|} \leq (1-\varepsilon/2) k |F|.

On the other hand, (1) is also equal to the number of (ordered) pairs of adjacent vertices {v,w \in F}. Since each {v \in F} is adjacent to exactly {k} vertices, we conclude that there are at least {\varepsilon k|F|/2} pairs {(v,w)} such that {v \in F} and {w \not \in F}. Thus

\displaystyle |\partial F| \geq \varepsilon k|F|/4,

and so {h(G_n) \geq \varepsilon k/4}, and the claim (ii) follows.

Now we establish the harder direction, in which we assume (ii) and prove (i). Thus we may assume that {h(G) \geq c} for some {c>0} independent of {n}.

The difficulty here is basically that the hypothesis (ii) only controls the action {A 1_F} of {A} on indicator functions {1_F}, whereas the conclusion (ii) basically requires us to understand {A f} for arbitrary functions {f \in \ell^2(V)}. Indeed, from the spectral decomposition one has

\displaystyle \lambda_2 = \sup_{f: \|f\|_{\ell^2(V)}=1; \langle f, 1 \rangle_{\ell^2(V)}=0} \langle A f, f \rangle_{\ell^2(V)}

so it suffices to show that

\displaystyle \langle A f, f \rangle_{\ell^2(V)} \leq (1-\varepsilon) k, \ \ \ \ \ (2)

whenever {f \in \ell^2(V)} has norm one and mean zero, and {\varepsilon>0} is independent of {n}. Since {A} has real matrix coefficients, we may assume without loss of generality that {f} is real.

The mean zero hypothesis is needed to keep the function {f} away from {1}, but it forces {f} to change sign. It will be more convenient to first establish a variant of (2), namely that

\displaystyle \langle A f, f \rangle_{\ell^2(V)} \leq (1-c) k \|f\|_{\ell^2(V)}^2 \ \ \ \ \ (3)

whenever {f} is non-negative and supported on a set of cardinality at most {|V|/2}.

Let us assume (3) for now and see why it implies (2). Let {f \in \ell^2(V)} have norm one and mean zero. We split {f = f_+ - f_-} into positive and negative parts, where {f_+, f_-} are non-negative with disjoint supports. Observe that {\langle A f_+, f_- \rangle_{\ell^2(V)} = \langle A f_-, f_+ \rangle_{\ell^2(V)}} is positive, and so

\displaystyle \langle A f, f \rangle_{\ell^2(V)} \leq \langle A f_+, f_+ \rangle_{\ell^2(V)} + \langle A f_-, f_- \rangle_{\ell^2(V)}.

Also we have

\displaystyle 1 = \|f_+\|_{\ell^2(V)}^2 + \|f_-\|_{\ell^2(V)}^2.

At least one of {f_+} and {f_-} is supported on a set of size at most {|V|/2}. By symmetry we may assume that {f_-} has this small support. Let {\sigma>0} be a small quantity (depending on {c}) to be chosen later. If {f_-} has {\ell^2(V)} norm at least {\sigma}, then applying (3) to {f_-} (and the trivial bound {\langle A f, f \rangle_{\ell^2(V)} \leq k \|f\|_{\ell^2(V)}^2} for {f_+}) we have

\displaystyle \langle A f, f \rangle_{\ell^2(V)} \leq (1-c \sigma^2) k

which would suffice. So we may assume that {f_-} has norm at most {\sigma}. By Cauchy-Schwarz, this implies that {\sum_{x \in V} f_-(x) \leq \sigma |V|^{1/2}}, and thus (as {f} has mean zero) {\sum_{x \in V} f_+(x) \leq \sigma |V|^{1/2}}. Ordering the values {f_+(x)} and applying Markov’s inequality, we see that we can split {f_+} as the sum of a function {f'_+} supported on a set of size at most {|V|/2}, plus an error of {\ell^2} norm {O(\sigma)}. Applying (3) to {f'_+} and {f_-} and using the triangle inequality (and Cauchy-Schwarz) to deal with the error term, we see that

\displaystyle \langle A f, f \rangle_{\ell^2(V)} \leq (1-c+O(\sigma)) k

which also suffices (if {\sigma} is sufficiently small).

It remains to prove (3). We use the “wedding cake” decomposition, writing {f} as an integral

\displaystyle f = \int_0^\infty 1_{F_t}\ dt

where {F_t := \{ x \in V: |f(x)| > t \}}. By construction, all the {F_t} have cardinality at most {|V|/2} and are non-increasing in {t}. Also, a computation of the {\ell^2} norm shows that

\displaystyle \|f\|_{\ell^2(V)}^2 = \int_0^\infty 2t |F_t|\ dt. \ \ \ \ \ (4)

Expanding {\langle A f, f \rangle_{\ell^2(V)}} and using symmetry, we obtain

\displaystyle 2 \int_0^\infty \int_0^t \langle A 1_{F_s}, 1_{F_t} \rangle\ ds dt.

We can bound the integrand in two ways. Firstly, since {A 1_{F_s}} is bounded by {k}, one has

\displaystyle \langle A 1_{F_s}, 1_{F_t} \rangle \leq k |F_t|.

Secondly, we may bound {\langle A 1_{F_s}, 1_{F_t} \rangle} by {\langle A 1_{F_s}, 1_{F_s} \rangle}.

On the other hand, from the hypothesis {h(G) \geq c} we see that {|\partial F_s| \geq c |F_s|}, and hence

\displaystyle \langle A 1_{F_s}, 1_{F_t} \rangle \leq (k-c) |F_s|.

We insert the first bound for {s \leq (1-\varepsilon) t} and the second bound for {(1-\varepsilon) t < s \leq t}, for some {\varepsilon>0} to be determined later, and conclude that

\displaystyle \langle A f, f \rangle_{\ell^2(V)} \leq 2 \int_0^\infty k (1-\varepsilon) t |F_t|\ dt + 2 \int_0^\infty \int_{(1-\varepsilon)t}^t (k-c) |F_s|\ ds dt.

Interchanging the integrals in the second integral, we conclude that

\displaystyle \langle A f, f \rangle_{\ell^2(V)} \leq 2 \int_0^\infty k (1-\varepsilon) t |F_t|\ dt + 2 \int_0^\infty (k-c)\frac{\varepsilon}{1-\varepsilon} s |F_s|\ ds.

For {\varepsilon} small enough, one can check that {k(1-\varepsilon) + (k-c) \frac{\varepsilon}{1-\varepsilon} < k (1- \varepsilon')} for some {\varepsilon'>0} depending only on {\varepsilon,k,c}, and the claim (3) then follows from (4). \Box

Example 15 The graphs in Exercise 9 contain large sets with small boundary (e.g. {\{ 1,\ldots,m\} \mod n} for {1 \leq m \leq n/2}), which gives a non-spectral way to establish that they do not form an expander family.

Exercise 16 Show that if {k \leq 2}, then the only expander families of {k}-regular graphs are those families of bounded size (i.e. the vertex sets {V_n} have cardinality bounded in {n}).

Remark 17 There is a more precise relationship between the edge expansion ratio {h(G)} and the best constant {\varepsilon} that makes {G} a one-sided {\varepsilon}-expander, namely the discrete Cheeger inequality

\displaystyle \frac{\varepsilon}{2} k \leq h(G) \leq \sqrt{2\varepsilon} k, \ \ \ \ \ (5)

first proven by Dodzuik, by Alon-Milman, and by Alon (and based on a continuous isoperimetric inequality of Cheeger and of Buser); see for instance this blog post of Luca Trevisan for a proof of the second inequality (the first inequality is already implicit in the proof of the above lemma). However, we will not use this more precise inequality here.

There is an analogous criterion for two-sided expansion, but it is more complicated to state. Here is one formulation that is quite useful:

Exercise 18 (Expander mixing lemma) Let {G(V,E)} be a {k}-regular graph on {n} vertices which is a two-sided {\varepsilon}-expander. Show that for any subsets {F_1, F_2} of {V}, one has

\displaystyle ||E(F_1,F_2)| - \frac{k}{n} |F_1| |F_2|| \leq (1-\varepsilon) k \sqrt{|F_1| |F_2|}.

(Actually, the factors {|F_1|, |F_2|} on the right-hand side can be refined slightly to {|F_1| - \frac{|F_1|^2}{n}} and {|F_2| - \frac{|F_2|^2}{n}} respectively.)

Thus, two-sided expanders behave analogously to the pseudorandom graphs that appear in the Szemerédi regularity lemma (but with the caveat that expanders are usually sparse graphs, whereas pseudorandom graphs are usually dense).

Here is variant of the above lemma that more closely resembles Proposition 14.

Exercise 19 Let {k \geq 1}, and let {G_n} be a family of finite {k}-regular graphs. Show that the following are equivalent:

  • (i) {G_n} form a one-sided expander family.
  • (ii) There exists {c>0} such that whenever {n} is sufficiently large and {F_1, F_2} are subsets of {V_n} of cardinality at most {|V_n|/2}, then {|E(F_1,F_2)| \leq (k-c) \sqrt{|F_1| |F_2|}}.

The exercises below connect expansion to some other graph-theoretic properties. On a connected graph {G}, one can define the graph metric {d: G \times G \rightarrow {\bf R}^+} by defining {d(v,w)} to be the length of the shortest path from {v} to {w} using only edges of {G}. This is easily seen to be a metric on {G}.

Exercise 20 (Expanders have low diameter) Let {G} be a {k}-regular graph on {n} vertices that is a one-sided {\varepsilon}-expander for some {n > k \geq 1} and {\varepsilon>0}. Show that there is a constant {c>0} depending only on {k} and {\varepsilon} such that for every vertex {v \in V} and any radius {r \geq 0}, the ball {B(v,r) := \{ w \in V: d(v,w) \leq r \}} has cardinality

\displaystyle |B(v,r)| \geq \min((1+c)^r,n).

(Hint: first establish the weaker bound {|B(v,r)| \geq \min((1+c)^r,n/2)}.) In particular, {G} has diameter {O(\log n)}, where the implied constant can depend on {k} and {\varepsilon}.

Exercise 21 (Expanders have high connectivity) Let {G} be a {k}-regular graph on {n} vertices that is a one-sided {\varepsilon}-expander for some {n > k \geq 1} and {\varepsilon>0}. Show that if one removes {m} edges from {G} for some {m \geq 0}, then the resulting graph has a connected component of size at least {n - C m}, where {C} depends only on {k} and {\varepsilon}.

Exercise 22 (Expanders have high chromatic number) Let {G} be a {k}-regular graph on {n} vertices that is a two-sided {\varepsilon}-expander for some {n > k \geq 1} and {\varepsilon>0}.

  • (i) Show that any independent set in {G} has cardinality at most {(1-\varepsilon) n}.
  • (ii) Show that the chromatic number of {G} is at least {\frac{1}{1-\varepsilon}}. (Of course, this bound only becomes non-trivial for {\varepsilon} close to {1}; however, it is still useful for constructing bounded-degree graphs of high chromatic number and large girth.)

Exercise 23 (Expansion and concentration of measure) Let {G} be a {k}-regular graph on {n} vertices that is a one-sided {\varepsilon}-expander for some {n > k \geq 1} and {\varepsilon>0}. Let {f: G \rightarrow {\bf R}} be a function which is Lipschitz with some Lipschitz constant {K}, thus {|f(v)-f(w)| \leq K d(v,w)} for all {v,w \in V}. Let {M} be a median value of {f} (thus {f(v) \geq M} for at least half of the vertices {v}, and {f(v) \leq M} for at least half the vertices {v}; note that the median may be non-unique in some cases.) Show that

\displaystyle |\{ v \in V: |f(v)-M| \geq \lambda K \}| \leq C n \exp( - c \lambda )

for all {\lambda > 0} and some constants {C, c>0} depending only on {k, \varepsilon}.

— 3. Random walks on expanders —

We now discuss a connection between expanders and random walks, which will be of particular importance in this course as a tool for demonstrating expansion. Given a {k}-regular graph {G} for some {k \geq 1}, and an initial vertex {v_0 \in G}, we define the random walk on {G} starting at {v_0} to be a random sequence {v_0, v_1, v_2, \ldots} of vertices in {G} defined recursively by setting, once {v_0,\ldots,v_i} have been chosen, {v_{i+1}} to be one of the {k} neighbours of {v_i}, chosen at random. (The existence of such a random process can be easily justified by using the Kolmogorov extension theorem.) For each {i}, let {\mu^{(i)}: G \rightarrow {\bf R}^+} be the probability distribution of {v_i}, thus

\displaystyle \mu^{(i)}(v) = {\bf P}( v_i = v ).

Thus {\mu^{(0)}} is the Dirac mass {\delta_{v_0}} at {v_0}, and we have the recursion

\displaystyle \mu^{(i+1)} = \frac{1}{k} A \mu^{(i)}

and thus

\displaystyle \mu^{(i)} = k^{-i} A^i \delta_{v_0}.

Among other things, this shows that the quantity {\| \mu^{(i)} - \frac{1}{|V_n|} \|_{\ell^2(V_n)}}, which measures the extent to which {v_i} is uniformly distributed, is non-increasing in {i}. The rate of this decrease is tied to the expansion properties of the graph:

Exercise 24 Let {k \geq 1}, and let {G_n = (V_n,E_n)} be a family of finite {k}-regular graphs. Let {\alpha>1/2}. Show that the following are equivalent:

  • (i) The {G_n} are a two-sided expander family.
  • (ii) There is a {C>0} independent of {n}, such that for all sufficiently large {n} one has {\| \mu^{(i)} - \frac{1}{|V_n|} \|_{\ell^2(V_n)} \leq |V_n|^{-\alpha}} for all {i \geq C \log |V_n|}, and all choices of initial vertex {v_0}.

Informally, the above exercise asserts that a two-sided expander on {n} vertices is one for which random walks (from an arbitrary starting point) become very close to uniform in just {O(\log n)} steps. (Compare with, say, the graphs in (9), in which the random walks do not come close to mixing until time well beyond {n^2}, as indicated by the central limit theorem.) This rapid mixing is useful for many applications; for instance, it can be used in computer science to generate almost perfectly uniformly distributed random elements of various interesting sets (e.g. elements of a finite group); see the survey of Hoory-Linial-Wigderson for more discussion. It is also useful in number theory to facilitate certain sieving estimates, as I hope to discuss later in this course.

Remark 25 In the above exercise, we assumed that the initial vertex {v_0} was deterministic rather than random. However, it is easy to see (from Minkowski’s inequality) that the exercise also holds if we permit {v_0} to be drawn from an arbitrary probability distribution on {V_n}, rather than being a single deterministic vertex. Thus one can view the uniform distribution in a two-sided expander to be a very strong attractor for all the other probability distributions on the vertex set.

Remark 26 The {\ell^2} norm is the most convenient norm to use when using the spectral theorem, but one can certainly replace this norm if desired by other norms (e.g. the {\ell^1} norm, which in this finite setting is the same thing as the total variation norm), after adjusting the lower bound of {A} slightly, since all norms on a finite-dimensional space are equivalent (though one typically has to concede some powers of {|V_n|} to attain this equivalency). One can also use some other norm-like quantities here to measure distance to uniformity, such as Shannon entropy, although we will not do so here.

Note that for graphs that are only one-sided expanders instead of two-sided expanders, the random walk is only partially mixing, in that the probability distribution tends to flatten out rapidly, but not converge as rapidly (or at all) to the uniform distribution. For instance, in the case of the complete bipartite graph {K_{n/2,n/2}}, it is clear that the random walk simply alternates between the two vertex sets of size {n/2} in the bipartite graph (although it is uniformly distributed in each set). But this lack of rapid mixing can be dealt with by replacing the random walk with the lazy random walk {w_0, w_1, w_2, \ldots}, which is defined similarly to the random walk {v_0, v_1, v_2, \ldots} except that {w_{i+1}} is only set equal to a randomly chosen neighbour of {w_i} with probability {1/2}, and remains equal to {w_i} with probability {1/2}. (One can also select other probabilities here than {1/2}, as this will not significantly affect the exercise below.) Indeed, we have:

Exercise 27 Let {k \geq 1}, and let {G_n = (V_n,E_n)} be a family of finite {k}-regular graphs. Let {\alpha>1/2}. Show that the following are equivalent:

  • (i) The {G_n} are a one-sided expander family.
  • (ii) There is a {C>0} independent of {n}, such that for all sufficiently large {n} one has {\| \nu^{(i)} - \frac{1}{|V_n|} \|_{\ell^2(V_n)} \leq |V_n|^{-\alpha}} for all {i \geq C \log n}, and all choices of initial vertex {w_0}, where {\nu^{(i)}} is the law of the random walk {w_i}.

— 4. Random graphs as expanders —

We now turn to the task of constructing expander families of {k}-regular graphs. The first result in this direction was by Pinsker in 1973 (with a closely related result also established by Barzdin and Kolmogorov in 1967), who showed that if one chose a {k}-regular graph on {n} vertices randomly, and then sent {n} to infinity along a sufficiently sparse sequence, the resulting sequence would be almost surely be an expander family. (Here, of course, one needs to avoid the parity obstruction that one cannot have a {k}-regular graph on {n} vertices if {k} and {n} are both odd.) We will not quite prove this result here (because it requires some understanding of the probability distribution of {k}-regular graphs, which has some subtleties as the parity obstruction already indicates), but establish a closely related result, in which {k} is restricted to be even (to avoid parity problems) and sufficiently large (for convenience) and the {k}-regular graph is drawn from a slightly non-uniform distribution.

Before we do this, though, let us perform a heuristic computation as to why, when {k} is fixed but large, and {n} goes to infinity, one expects a “random” {k}-regular graph {G=(V,E)} on {n} vertices to be an expander. For simplicity, we work with the one-sided expansion condition. By Proposition 14, we would then like to say that with high probability, one has

\displaystyle |\partial F| \geq c |F|

for all {F \subset V} with cardinality at most {n/2}, and some {c>0} independent of {n}. An equivalent formulation would be to say that the neighbourhood {N(F)} of {F} has to have cardinality at least {(1+c') |F|} for all {F \subset V} with cardinality at most {n/2}, and some {c'>0} independent of {n}. Thus, we wish to exclude the possibility that there are sets {F \subset F' \subset V} with {|F| \leq n/2} and {|F'| \leq (1+c') |F|}, for which all the edges starting from {F} end up in {F'}.

To bound this failure probability, we use the union bound: we try each pair {F, F'} in turn, bound the probability that the claim follows for that particular value of {F,F'}, and then sum in {F, F'}. The goal is to obtain a total probability bound of {o_{n \rightarrow \infty;k}(1)}, that goes to zero as {n \rightarrow \infty} for fixed {k}.

Accordingly, pick {1 \leq r \leq n/2}, and then pick {F \subset V} of cardinality {r}, and then {F \subset F' \subset V} of cardinality {r+r'}, where {r' := \lfloor c' r\rfloor + 1} for some small constant {c'>0} to be chosen later. For each fixed {r}, there are {\frac{n!}{r! r'! (n-r-r')!}} choices of {F} and {F'}. For each {F,F'}, there are {kr} edges emenating from {F}. Intuitively, if we choose the graph randomly, each edge has a probability about {\frac{r+r'}{n}} of landing back in {F'}, so the probability that they all do is about {(\frac{r+r'}{n})^{kr}}. So the failure rate should be about

\displaystyle \sum_{1 \leq r \leq n/2} \frac{n!}{r! r'! (n-r-r')!} (\frac{r+r'}{n})^{kr}.

We can bound {\frac{n!}{r! r'! (n-r-r')!}} somewhat crudely as {\frac{n^{r+r'}}{(r+r')!} O(1)^r}. Applying Stirling’s formula, we obtain a bound of

\displaystyle \sum_{1 \leq r \leq n/2} O(1)^r (\frac{r+r'}{n})^{kr - r - r'}.

For {c} small enough, {(r+r')/n} is less than {0.6} (say), and then (for {k} large enough) we see that this series is bounded by {o_{n \rightarrow \infty;k}(1)} as required.

Now we turn to the rigorous construction of random expander graphs. We will assume that {k} is a large (but fixed) even integer, {k=2l}. To build a {2l}-regular graph on {n} vertices {\{1,\ldots,n\}}, what we will do is pick {l} permutations {\pi_1,\ldots,\pi_l: \{1,\ldots,n\} \rightarrow \{1,\ldots,n\}}, and let {G} be the graph formed by connecting {v} to {\pi_i(v)} for all {v \in \{1,\ldots,n \}} and {i=1,\ldots,l}. This is not always a {2l}-regular graph, but we will be able to show the following two claims (if {k} is large enough):

Proposition 28 ({G} can be {k}-regular) The graph {G} is {2l}-regular with probability at least {c-o_{n \rightarrow \infty;k}(1)}, where {c>0} depends only on {k}.

Proposition 29 ({G} usually expands) There is an {\varepsilon>0} depending only on {k=2l}, such that the probability that {G} is {2l}-regular but not a one-sided {\varepsilon}-expander is {o_{n \rightarrow \infty;k}(1)}.

Putting these two propositions together, we conclude

Corollary 30 With probability at least {c-o_{n \rightarrow \infty;k}(1)}, {G} is a {2l}-regular one-sided {\varepsilon}-expander.

In particular, this allows us to construct one-sided expander families of {k}-regular graphs for any fixed large even {k}.

Remark 31 If one allowed graphs to have multiple edges and loops, then it would be possible to dispense with the need for Proposition 28, and show that {G} (now viewed as a {2l}-regular graph with multiple edges and loops) is a one-sided {\varepsilon}-expander with probability {1-o_{n \rightarrow \infty;k}(1)}. (This requires extending results such as the weak discrete Cheeger inequality to the case when there are multiple edges and loops, but this turns out to be straightforward.) However, we will not do so here as it requires one to introduce a slight amount of additional notation.

Let us prove Proposition 29 first, which will follow the informal sketch at the beginning of this section. By Proposition 14, it suffices to show that there is a {c>0} depending only on {k} such that the probability that {G} is {2l}-regular with {h(G) \leq c} is {o_{n \rightarrow \infty;k}(1)}. As in the sketch, we first bound for each {1 \leq r \leq n/2} and {F \subset F' \subset \{1,\ldots,n\}} with {|F|=r} and {|F'| = r+r'}, where {r' := \lfloor cr\rfloor+1}, the probability that all the edges from {F} end up in {F'}. A necessary condition for this to occur is that {\pi_i(F) \subset F'} for each {i=1,\ldots,k}. For each {i}, and for fixed {r,F,F'}, the probability that the random permutation {\pi_i} does this is

\displaystyle \frac{\binom{r+r'}{r}}{\binom{n}{r}} \leq (\frac{r+r'}{n})^r

so the total failure probability can be bounded by

\displaystyle \sum_{1 \leq r \leq n/2} \frac{n!}{r! r'! (n-r-r')!} (\frac{r+r'}{n})^{kr/2}

which is acceptable as discussed previously.

Now we turn to Proposition 28. We observe that {G} will be {2l}-regular unless there are distinct {i,j} and a vertex {v \in \{1,\ldots,n\}} such that either {\pi_i(v) = \pi_j(v)}, {\pi_i(v) = \pi_j^{-1}(v)}, {\pi_i(v) = v}, or {\pi_i(v) = \pi_i^{-1}(v)} (as such cases lead to repeated edges or loops). Unfortunately, each of these events can occur with a fairly sizeable probability (e.g. for each {i,j}, the probability that {\pi_i(v)=\pi_j(v)} for some {v} is about {1-1/e}, by the classical theory of derangements), so the union bound is not enough here. Instead, we will proceed by an interpolant between the union bound and the inclusion-exclusion formula, known as the Bonferroni inequalities:

Exercise 32 (Bonferroni inequalities)

  • (i) Show that if {n, k \geq 0} are natural numbers, that

    \displaystyle \sum_{j=0}^k (-1)^j \binom{n}{j} \geq 1_{n=0}

    when {k} is even, and

    \displaystyle \sum_{j=0}^k (-1)^j \binom{n}{j} \leq 1_{n=0}

    when {k} is odd, where we adopt the convention that {\binom{n}{j}=0} when {j>n}.

  • (ii) Show that if {N, k \geq 0} and {E_1,\ldots,E_N} are events, then

    \displaystyle \sum_{j=0}^k (-1)^j \sum_{1 \leq i_1 < \ldots < i_j \leq N} {\bf P}( E_{i_1} \cap \ldots \cap E_{i_j} ) \geq {\bf P}( \overline{\bigcup_{i=1}^N E_i} )

    when {k} is even, and

    \displaystyle \sum_{j=0}^k (-1)^j \sum_{1 \leq i_1 < \ldots < i_j \leq N} {\bf P}( E_{i_1} \cap \ldots \cap E_{i_j} ) \leq {\bf P}( \overline{\bigcup_{i=1}^N E_i} )

    when {k} is odd. (Note that the {k=1} case of this inequality is essentially the union bound, and the {k=N} case is the inclusion-exclusion formula.) Here we adopt the convention that the empty intersection occurs with probability {1}.

We return to the proof of Proposition 28. By a conditioning argument, it suffices to show the following:

Proposition 33 Let {1 \leq i \leq k}, and suppose that the permutations {\pi_1,\ldots,\pi_{i-1}} have already been chosen (and are now viewed as fixed deterministic objects). Let {\pi_i: \{1,\ldots,n\} \rightarrow \{1,\ldots,n\}} be a permutation chosen uniformly at random. Then with probability at least {c - o_{n \rightarrow \infty;k}(1)} for some {c>0} depending only on {k}, one has {\pi_i(v) \neq \pi_j(v), v, \pi_i^{-1}(v)} for all {v \in \{1,\ldots,n\}} and {j=1,\ldots,i-1}.

It remains to establish Proposition 33. We modify an argument of Bollobas. Consider the set {\Omega \subset \{1,\ldots,n\} \times \{1,\ldots,n\}} of pairs

\displaystyle \Omega := \{ (v, \pi_j(v)): v = 1,\ldots,n; j=1,\ldots,i-1 \} \cup \{ (v, \pi_j(v)^{-1}): v = 1,\ldots,n; j=1,\ldots,i-1 \} \cup \{ (v,v): v = 1,\ldots, n \}.

Each pair {(v,w)} in {\Omega} gives rise to a bad event {(\pi_i(v)=w)}. Each vertex {v} gives rise to another bad event

\displaystyle (\pi_i(v)=\pi_i^{-1}(v)) \wedge ((v,\pi_i(v)) \not \in \Omega).

In all, this gives {|\Omega|+n =: n \alpha} separate events for some {\alpha \sim_k 1}, which we enumerate as {E_1,\ldots,E_{\alpha n}}. Our task is to show that

\displaystyle {\bf P}( \overline{\bigcup_{s=1}^{\alpha n} E_s} ) \geq c - o_{n \rightarrow \infty;k}(1).

It is easy to check (by direct counting arguments) that

\displaystyle {\bf P}(E_s) = \frac{1}{n} + O(\frac{1}{n^2}) \ \ \ \ \ (6)

for each {i=1,\ldots,\alpha n}. More generally, for any fixed {j}, the same sort of counting arguments (which we leave as an exercise) the approximate independence

\displaystyle {\bf P}(E_{s_1} \cap \ldots \cap E_{s_j}) = \frac{1}{n^j} + O_{j,k}(\frac{1}{n^{j+1}}) \ \ \ \ \ (7)

for all but {O_{j,k}(n^{j-1})} of the tuples {(s_1,\ldots,s_j)} with {1 \leq s_1 < \ldots < s_j \leq \alpha n}, with

\displaystyle {\bf P}(E_{s_1} \cap \ldots \cap E_{s_j}) = O_{j,k}(\frac{1}{n^j}) \ \ \ \ \ (8)

for the exceptional tuples. (The subscripts in the {O()}-notation indicate that the implied constant in that notation can depend on the subscripted parameters.) In particular, we have

\displaystyle \sum_{1 \leq s_1 < \ldots < s_j \leq n \alpha} {\bf P}( E_{s_1} \cap \ldots \cap E_{s_j} ) = \frac{\alpha^j}{j!} + O_{j,k}(\frac{1}{n})

and thus by the Bonferroni inequalities

\displaystyle {\bf P}( \overline{\bigcup_{s=1}^{2in} E_s} ) \geq \sum_{j=0}^m (-1)^j \frac{\alpha^j}{j!} + O_{m,k}(\frac{1}{n})

for any odd {m}. But from Taylor series expansion, the sum on the right-hand side converges to the positive quantity {e^{-\alpha}}, and the claim follows by taking {m} to be a sufficiently large odd number depending on {k}.

Exercise 34 Verify the estimates (6), (7), (8).

Exercise 35 Show that the total number {N := \sum_{s=1}^{n(i+1)} 1_{E_s}} of failure events is asymptotically distributed according to the Poisson distribution of intensity {2i}, or more precisely that

\displaystyle {\bf P}(N = m) = \frac{(2i)^m e^{-2i}}{m!} + o_{n \rightarrow \infty;k,m}(1)

for any natural number {m}.

Remark 36 Another way to establish Proposition 28, via the “swapping method”, was pointed out to me by Brendan McKay here. The key observation is that if {l} permutations {\pi_1,\ldots,\pi_l} have {m} problematic edges (i.e. edges that are either repeated or loops), and then one applies {m} random transpositions to the {\pi_1,\ldots,\pi_l} (as selected at random), then with probability {\gg_m n^{-m}} (if {n} is sufficiently large depending on {m}), all problematic edges are erased and one obtains a random regular graph. Conversely, if one starts with a random regular graph and applies {m} random transpositions, the probability of obtaining {m} problematic edges as a result is {O_m(n^{-m})}. Combining the two facts, we see that {p_m} is the probability of having {m} problematic edges, then {p_0 \gg p_m} for each fixed {m} (and {n} sufficiently large depending on {m}). Since the expected number of problematic edges is bounded ,the desired bound {p_0 \gg 1} then follows from Markov’s inequality and the pigeonhole principle.

Exercise 37 Show that “one-sided” can be replaced with “two-sided” in Proposition 29 and hence in Corollary 30.

Remark 38 It turns out that the random {k}-regular graphs formed by taking {l} permutations as indicated above, and conditioning on the event that there are no “collisions” (so that one genuinely gets a {k}-regular graph) does not quite give a uniform distribution on the {k}-regular graphs. However, it is close enough to one that any property which is true with probability {1-o_{n \rightarrow \infty;k}(1)} for this model of random {k}-regular graph, is also true with probability {1-o_{n \rightarrow \infty;k}(1)} for uniform {k}-regular graphs, and conversely. This fact (known as contiguity of the two random models, and analogous to the concept of mutually absolutely continuous measures in measure theory) is established for instance in this survey of Wormald. As a consequence of this fact (and a more refined version of the above analysis), one can show that {1-o_{n \rightarrow \infty;k}(1)} of all {k}-regular graphs on {n} vertices are {\varepsilon}-expanders for some {\varepsilon = \varepsilon_k > 0} if {k \geq 3} (assuming of course the parity requirement that {nk} be even, otherwise there are no {k}-regular graphs at all).