In the previous set of notes we saw how a representation-theoretic property of groups, namely Kazhdan’s property (T), could be used to demonstrate expansion in Cayley graphs. In this set of notes we discuss a different representation-theoretic property of groups, namely quasirandomness, which is also useful for demonstrating expansion in Cayley graphs, though in a somewhat different way to property (T). For instance, whereas property (T), being qualitative in nature, is only interesting for infinite groups such as or
, and only creates Cayley graphs after passing to a finite quotient, quasirandomness is a quantitative property which is directly applicable to finite groups, and is able to deduce expansion in a Cayley graph, provided that random walks in that graph are known to become sufficiently “flat” in a certain sense.
The definition of quasirandomness is easy enough to state:
Definition 1 (Quasirandom groups) Let
be a finite group, and let
. We say that
is
-quasirandom if all non-trivial unitary representations
of
have dimension at least
. (Recall a representation is trivial if
is the identity for all
.)
Exercise 1 Let
be a finite group, and let
. A unitary representation
is said to be irreducible if
has no
-invariant subspaces other than
and
. Show that
is
-quasirandom if and only if every non-trivial irreducible representation of
has dimension at least
.
Remark 1 The terminology “quasirandom group” was introduced explicitly (though with slightly different notational conventions) by Gowers in 2008 in his detailed study of the concept; the name arises because dense Cayley graphs in quasirandom groups are quasirandom graphs in the sense of Chung, Graham, and Wilson, as we shall see below. This property had already been used implicitly to construct expander graphs by Sarnak and Xue in 1991, and more recently by Gamburd in 2002 and by Bourgain and Gamburd in 2008. One can of course define quasirandomness for more general locally compact groups than the finite ones, but we will only need this concept in the finite case. (A paper of Kunze and Stein from 1960, for instance, exploits the quasirandomness properties of the locally compact group
to obtain mixing estimates in that group.)
Quasirandomness behaves fairly well with respect to quotients and short exact sequences:
Exercise 2 Let
be a short exact sequence of finite groups
.
- (i) If
is
-quasirandom, show that
is
-quasirandom also. (Equivalently: any quotient of a
-quasirandom finite group is again a
-quasirandom finite group.)
- (ii) Conversely, if
and
are both
-quasirandom, show that
is
-quasirandom also. (In particular, the direct or semidirect product of two
-quasirandom finite groups is again a
-quasirandom finite group.)
Informally, we will call quasirandom if it is
-quasirandom for some “large”
, though the precise meaning of “large” will depend on context. For applications to expansion in Cayley graphs, “large” will mean “
for some constant
independent of the size of
“, but other regimes of
are certainly of interest.
The way we have set things up, the trivial group is infinitely quasirandom (i.e. it is
-quasirandom for every
). This is however a degenerate case and will not be discussed further here. In the non-trivial case, a finite group can only be quasirandom if it is large and has no large subgroups:
Exercise 3 Let
, and let
be a finite
-quasirandom group.
- (i) Show that if
is non-trivial, then
. (Hint: use the mean zero component
of the regular representation
.) In particular, non-trivial finite groups cannot be infinitely quasirandom.
- (ii) Show that any proper subgroup
of
has index
. (Hint: use the mean zero component of the quasiregular representation.)
The following exercise shows that quasirandom groups have to be quite non-abelian, and in particular perfect:
Exercise 4 (Quasirandomness, abelianness, and perfection) Let
be a finite group.
- (i) If
is abelian and non-trivial, show that
is not
-quasirandom. (Hint: use Fourier analysis or the classification of finite abelian groups.)
- (ii) Show that
is
-quasirandom if and only if it is perfect, i.e. the commutator group
is equal to
. (Equivalently,
is
-quasirandom if and only if it has no non-trivial abelian quotients.)
Later on we shall see that there is a converse to the above two exercises; any non-trivial perfect finite group with no large subgroups will be quasirandom.
Exercise 5 Let
be a finite
-quasirandom group. Show that for any subgroup
of
,
is
-quasirandom, where
is the index of
in
. (Hint: use induced representations.)
Now we give an example of a more quasirandom group.
Lemma 2 (Frobenius lemma) If
is a field of some prime order
, then
is
-quasirandom.
This should be compared with the cardinality of the special linear group, which is easily computed to be
.
Proof: We may of course take to be odd. Suppose for contradiction that we have a non-trivial representation
on a unitary group of some dimension
with
. Set
to be the group element
and suppose first that is non-trivial. Since
, we have
; thus all the eigenvalues of
are
roots of unity. On the other hand, by conjugating
by diagonal matrices in
, we see that
is conjugate to
(and hence
conjugate to
) whenever
is a quadratic residue mod
. As such, the eigenvalues of
must be permuted by the operation
for any quadratic residue mod
. Since
has at least one non-trivial eigenvalue, and there are
distinct quadratic residues, we conclude that
has at least
distinct eigenvalues. But
is a
matrix with
, a contradiction. Thus
lies in the kernel of
. By conjugation, we then see that this kernel contains all unipotent matrices. But these matrices generate
(see exercise below), and so
is trivial, a contradiction.
Exercise 6 Show that for any prime
, the unipotent matrices
for
ranging over
generate
as a group.
Exercise 7 Let
be a finite group, and let
. If
is generated by a collection
of
-quasirandom subgroups, show that
is itself
-quasirandom.
Exercise 8 Show that
is
-quasirandom for any
and any prime
. (This is not sharp; the optimal bound here is
, which follows from the results of Landazuri and Seitz.)
As a corollary of the above results and Exercise 2, we see that the projective special linear group is also
-quasirandom.
Remark 2 One can ask whether the bound
in Lemma 2 is sharp, assuming of course that
is odd. Noting that
acts linearly on the plane
, we see that it also acts projectively on the projective line
, which has
elements. Thus
acts via the quasiregular representation on the
-dimensional space
, and also on the
-dimensional subspace
; this latter representation (known as the Steinberg representation) is irreducible. This shows that the
bound cannot be improved beyond
. More generally, given any character
,
acts on the
-dimensional space
of functions
that obey the twisted dilation invariance
for all
and
; these are known as the principal series representations. When
is the trivial character, this is the quasiregular representation discussed earlier. For most other characters, this is an irreducible representation, but it turns out that when
is the quadratic representation (thus taking values in
while being non-trivial), the principal series representation splits into the direct sum of two
-dimensional representations, which comes very close to matching the bound in Lemma 2. There is a parallel series of representations to the principal series (known as the discrete series) which is more complicated to describe (roughly speaking, one has to embed
in a quadratic extension
and then use a rotated version of the above construction, to change a split torus into a non-split torus), but can generate irreducible representations of dimension
, showing that the bound in Lemma 2 is in fact exactly sharp. These constructions can be generalised to arbitrary finite groups of Lie type using Deligne-Luzstig theory, but this is beyond the scope of this course (and of my own knowledge in the subject).
Exercise 9 Let
be an odd prime. Show that for any
, the alternating group
is
-quasirandom. (Hint: show that all cycles of order
in
are conjugate to each other in
(and not just in
); in particular, a cycle is conjugate to its
power for all
. Also, as
,
is simple, and so the cycles of order
generate the entire group.)
Remark 3 By using more precise information on the representations of the alternating group (using the theory of Specht modules and Young tableaux), one can show the slightly sharper statement that
is
-quasirandom for
(but is only
-quasirandom for
due to icosahedral symmetry, and
-quasirandom for
due to lack of perfectness). Using Exercise 3 with the index
subgroup
, we see that the bound
cannot be improved. Thus,
(for large
) is not as quasirandom as the special linear groups
(for
large and
bounded), because in the latter case the quasirandomness is as strong as a power of the size of the group, whereas in the former case it is only logarithmic in size.
If one replaces the alternating group
with the slightly larger symmetric group
, then quasirandomness is destroyed (since
, having the abelian quotient
, is not perfect); indeed,
is
-quasirandom and no better.
Remark 4 Thanks to the monumental achievement of the classification of finite simple groups, we know that apart from a finite number (26, to be precise) of sporadic exceptions, all finite simple groups (up to isomorphism) are either a cyclic group
, an alternating group
, or is a finite simple group of Lie type such as
. (We will define the concept of a finite simple group of Lie type more precisely in later notes, but suffice to say for now that such groups are constructed from reductive algebraic groups, for instance
is constructed from
in characteristic
.) In the case of finite simple groups
of Lie type with bounded rank
, it is known from the work of Landazuri and Seitz that such groups are
-quasirandom for some
depending only on the rank. On the other hand, by the previous remark, the large alternating groups do not have this property, and one can show that the finite simple groups of Lie type with large rank also do not have this property. Thus, we see using the classification that if a finite simple group
is
-quasirandom for some
and
is sufficiently large depending on
, then
is a finite simple group of Lie type with rank
. It would be of interest to see if there was an alternate way to establish this fact that did not rely on the classification, as it may lead to an alternate approach to proving the classification (or perhaps a weakened version thereof).
A key reason why quasirandomness is desirable for the purposes of demonstrating expansion is that quasirandom groups happen to be rapidly mixing at large scales, as we shall see below the fold. As such, quasirandomness is an important tool for demonstrating expansion in Cayley graphs, though because expansion is a phenomenon that must hold at all scales, one needs to supplement quasirandomness with some additional input that creates mixing at small or medium scales also before one can deduce expansion. As an example of this technique of combining quasirandomness with mixing at small and medium scales, we present a proof (due to Sarnak-Xue, and simplified by Gamburd) of a weak version of the famous “3/16 theorem” of Selberg on the least non-trivial eigenvalue of the Laplacian on a modular curve, which among other things can be used to construct a family of expander Cayley graphs in (compare this with the property (T)-based methods in the previous notes, which could construct expander Cayley graphs in
for any fixed
).
— 1. Mixing in quasirandom groups —
Let be a finite group. Given two functions
, we can define the convolution
by the formula
This operation is bilinear and associative, but is not commutative unless is abelian. From the Cauchy-Schwarz inequality one has
and hence
This inequality is sharp in the sense that if we set and
to both be constant-valued, then the left-hand side and right-hand side match. For abelian groups, one can also see this example is sharp when
and
are multiples of the same character.
It turns out, though, that if one restricts one of or
(or both) to be of mean zero, and
is quasirandom, then one can improve this inequality, which first appeared explicitly in the work of Babai, Nikolov, and Pyber:
Proposition 3 (Mixing inequality) Let
be a finite
-quasirandom group, and let
. If at least one of
has mean zero, then
Proof: By subtracting a constant from or
, we may assume that
or
both have mean zero.
Observe that (being a superposition of right-translates of
) also has mean zero. Thus, we see that we may define an operator
by setting
. It thus suffices to show that the operator norm of
is at most
.
Fix . We can view
as a
matrix. We apply the singular value decomposition to this matrix to obtain singular values
of , together with associated singular vectors. The operator norm of
is the largest singular value
. The operator
is then a self-adjoint operator (or matrix) with eigenvalues
. In particular, we have
Now, a short computation shows that , where
, and (by embedding
in
, and noting that
annihilates constants) the trace can computed as
Thus, if is the eigenspace of
corresponding to the eigenvalue
(so that the dimension of
is the multiplicity of
), we have
Now observe that if is the left-regular representation (restricted to
) then
for any and
(this is a special case of the associativity of convolution). In particular, we see that
is invariant under
. Since
has no non-trivial invariant vectors in
, we conclude from quasirandomness that
has dimension at least
, and the claim follows.
Remark 5 One can also establish the above inequality using the nonabelian Fourier transform (which is based on the Peter-Weyl theorem combined with Schur’s lemma, and is developed in this blog post); we leave this as an exercise for the interested reader.
Exercise 10 Let
be subsets of a finite
-quasirandom group
. Show that
and
Conclude in particular that
and that
whenever
. (The bounds here are not quite sharp, but are simpler than the optimal bounds, and suffice for most applications.)
Thus, for instance, if is a subset of a finite
-quasirandom group
of density
more than
, then
will be most of
(with fewer than
elements omitted), and
will be all of
; thus large subsets of a quasirandom group rapidly expand to fill out the whole group. In the converse direction, we have
Exercise 11 Let
, and let
be a finite group which is not
-quasirandom. Show that there exists a subset
of
with
for some absolute constant
, such that
. (Hint: by hypothesis, one has a non-trivial unitary representation
of dimension at most
. Show that
contains an element at distance
from the identity in operator norm, and take
to be the preimage of a suitable ball around the identity in the operator norm.)
One can improve this result by using a quantitative form of Jordan’s theorem; see the next section.
Exercise 12 (Mixing inequality for actions) Let
be a finite
-quasirandom group acting (on the left) on a discrete set
. Given functions
and
, one can define the convolution
in much the same way as before:
Show that
whenever
has mean zero, or whenever
has mean zero on every orbit of
.
One can use quasirandomness to show that Cayley graphs of very large degree in a quasirandom group are expanders:
Exercise 13 Let
be a
-regular Cayley graph in a finite
-quasirandom group
on
vertices.
- (i) If
,
are subsets of
, show that
(compare with the expander mixing lemma, Exercise 9 from Notes 1).
- (ii) Show that
is a two-sided
-expander whenever
Unfortunately, the above result is only non-trivial in the regime , whereas for our applications we are interested instead in the regime when
. We record a tool for this purpose.
Proposition 4 (Using quasirandomness to demonstrate expansion) Let
be a
-regular Cayley graph in a finite group
. Assume the following:
- (i) (Quasirandomness)
is
-quasirandom for some
.
- (ii) (Flattening of random walk) One has
for some
with
and
, where
and
is the
-fold convolution of
.
Then
is a two-sided
-expander for some
depending only on
, if
is sufficiently large depending on these quantities. If we replace
by
in the flattening hypothesis, then
is a one-sided
-expander instead.
Proof: We allow implied constants to depend on . We will just prove the first claim, as the second claim is similar. By Exercise 5 from Notes 2, it will suffice to show that
(say) for some . But from Proposition 3 (with
and
) and the hypotheses we have
for any . Iterating this (starting from, say,
, and advancing in steps of
times) we obtain the claim.
Exercise 14 Obtain an alternate proof of the above result that proceeds directly from the spectral decomposition of the adjacency operator
into eigenvalues and eigenvectors and quasirandomness, rather than through Exercise 5 from Notes 2 and Proposition 3. (This alternate approach is closer in spirit to the arguments of Sarnak-Xue and Bourgain-Gamburd, though the two approaches are largely equivalent in the final analysis.)
Informally, the flattening hypothesis in Proposition 4 asserts that by time , the random walk has expanded to the point where it is covering a large portion of the group
(roughly speaking, it is spread out over a set of size at least
). The point is that the scale of this set is large enough for the quasirandomness properties of the group
to then mix the random walk rapidly towards the uniform distribution. However, this proposition provides no tools with which to prove this flattening property; this task will be a focus of subsequent notes.
The following exercise extends some of the above theory from quasirandom groups to “virtually quasirandom” groups, which have a bounded index subgroup that is quasirandom, but need not themselves be quasirandom.
Exercise 15 (Virtually quasirandom groups) Let
be a finite group that contains a normal
-quasirandom subgroup
of index at most
.
- (i) If
, and at least one of
has mean zero on every coset of
, show that
- (ii) If
for some
, and
is a connected
-regular Cayley graph obeying (1) for some
with
and
, show that
is a two-sided
-expander for some
depending only on
.
— 2. An algebraic description of quasirandomness (optional) —
As defined above, quasirandomness is a property of representations. However, one can reformulate this property (at a qualitative level, at least) in a more algebraic fashion, by means of Jordan’s theorem:
Theorem 5 (Jordan’s theorem) Let
be a finite subgroup of
for some
. Then
contains a normal abelian subgroup of index at most
, where
depends only on
.
A proof of this theorem (giving a rather poor value of ) may be found in this previous blog post. The optimal value of
is known for almost all
, thanks to the classification of finite simple groups: for instance, it is a result of Collins that
for
(which is attained with the example of the symmetric group
which acts on the space
of
-dimensional complex vectors whose coefficients sum to zero).
Jordan’s theorem can be used to give a qualitative description of quasirandomness, providing a converse to Exercises 3 and 4:
Exercise 16 Let
be an integer. Let
be a perfect finite group, with the property that all proper normal subgroups of
have index greater than
, where
is the quantity in Jordan’s theorem. Show that
is
-quasirandom.
Conclude in particular that any finite simple nonabelian group of cardinality greater than
is
-quasirandom.
By using the classification of finite simple groups more carefully, Nikolov and Pyber were able to replace here by
. Using related arguments, they also showed that if
was not
-quasirandom, then there was a subset
of
with cardinality
such that
, thus giving a reasonably tight converse to Exercise 10.
— 3. A weak form of Selberg’s 3/16 theorem (optional) —
Remark 6 This section presumes some familiarity with Riemannian geometry, as well as the functional analysis of Sobolev spaces and distributions. See for instance this blog post for a very brief introduction to Riemannian geometry, and these two previous posts for an introduction to distributions and Sobolev spaces.
We now give an application of quasirandomness to establish the following result, first observed explicitly by Lubotsky, Phillips and Sarnak as a corollary of a famous theorem of Selberg:
Theorem 6 (Selberg’s expander construction) If
is a symmetric set of generators of
that does not contain the identity, then the Cayley graphs
form a one-sided expander family, where
is the obvious projection homomorphism, and
ranges over primes.
This is the analogue of Margulis’s expander construction (Corollary 13 from Notes 2), except that the modulus
has been restricted here to be prime. This restriction can be removed with some additional effort, but we will not discuss this issue here. The condition that
generates the entire group
can be substantially relaxed; we will discuss this point in later notes.
In the property (T) approach to expansion, one passed from discrete groups such as to continuous groups such as
, in order to take advantage of tools from analysis (such as limits). Similarly, to prove Theorem 6, we will pass from
to
. Actually, it will be convenient to work with the quotient space
, better known as the hyperbolic plane. We will endow this plane with the structure of a Riemannian manifold, in order to access the Laplace-Beltrami operator on that plane, which is a continuous analogue (after some renormalisation) of the adjacency operator of a Cayley graph, which enjoys some nice exact identities which are difficult to discern in the discrete world.
We now therefore digress from the topic of expansion to recall the geometry of the hyperbolic plane. It will be convenient to switch between a number of different coordinatisations of this plane. Our primary model will be the half-plane model:
Definition 7 (Poincaré half-plane model) The Poincaré half-plane is the upper half-plane
with the Riemannian metric
. The (left) action of
on this half-plane is given by the formula
One easily verifies that this gives an isometric action on
.
Exercise 17 Verify that
acts isometrically and transitively on
, with stabiliser group isomorphic to
; thus
is isomorphic (as an
-homogeneous space) to
.
Note that in some of the literature, a right action is used instead of a left action, leading to some reversals in the notational conventions used below, but this does not lead to any essential changes in the arguments or results.
Exercise 18 Show that the distance
between two points
is given by the formula
We can also use the model of the Poincaré disk with the Riemannian metric
.
Exercise 19 Show that the Cayley transform
is an isometric isomorphism from the half-plane
to the disk
.
Expressing an element of the Poincaré disk in exponential polar coordinates as
, we can also model the Poincaré disk (in slightly singular coordinates) as the half-cylinder
with metric
. (Compare with the Euclidean plane in polar coordinates, which is similar but with the
factor replaced by
, or the sphere in Euler coordinates, which is also similar but with
restricted to
and
replaced by
. This similarity reflects the fact that these three Riemannian surfaces have constant curvature
respectively.)
The action of can of course be described explicitly in the disk or half-plane models, but we will not need these explicit formulae here.
A Riemannian metric on a manifold always generates a measure on that manifold. For the Poincaré half-plane, the measure is
. For the Poincaré disk, it is
. For the half-cylinder, it is
. In all cases, the action of
will preserve the measure, because it preserves the metric, thus one can view
as a Haar measure on the hyperbolic plane.
A Riemannian metric also generates a Laplace-Beltrami operator . In the Poincaré half-plane model, it is
in the Poincaré disk model, it is
and in the half-cylinder model, it is
Again, in all cases, the Laplacian will commute with the action of , because this action preserves the metric.
The discrete group acts on the hyperbolic plane, giving rise to a quotient
, known as the principal modular curve of level
. This quotient can also be viewed by taking the (closure of a) fundamental domain
and then identifying with
on the left and right sides of this domain, and also identifying
with
on the lower boundary of this domain. The quotient
is not compact, but it does have finite measure with respect to
; indeed, outside of a compact set,
behaves like the cusp
for any constant , again identifying the
boundary with the
boundary, and this strip has measure
Thus descends to a finite Haar measure on
.
Remark 7 One can interpret the modular curve geometrically as follows. As in the previous set of notes, one can think of
as the space of all unimodular lattices in
(or equivalently, in
) with two marked generators
with
. We can then map
to the Poincaré half-plane
by sending such a lattice with generators
to the point
; note that the fibers of this map correspond to rotations of the lattice and marked generators, thus identifying
with
. The action of
on
corresponds to moving the generators around while keeping the lattice (or more precisely, the lattice modulo rotations) fixed. The (interior of the) fundamental domain
then corresponds to the selection of generators given by setting
to be the non-zero lattice element of smallest norm, and
to be the generator whose
component lies between
and
.
The quotient is not quite a smooth Riemannian manifold, due to the presence of partially fixed points of the
action at
and
(of order
and order
respectively), and is thus technically an orbifold rather than a manifold. However, this distinction turns out to not significantly affect the analysis and will be glossed over here.
The Laplace-Beltrami operator is defined on smooth compactly supported functions
on
, and then descends to an operator on smooth compactly supported functions
on
. (Here, we use the smooth structure on
inherited from
, thus a function is smooth at a point in
if it lifts to a function smooth at the preimage of that point.) On
, we have the integration by parts formula
where is the gradient with respect to the Riemannian metric, and
the inner product; in the half-plane coordinates, we have
in the disk model, it is
and in the half-cylinder model, it is
In particular, we have the positive definiteness property
for all . This descends to
:
Thus is a symmetric positive-definite densely-defined operator on
. One can in fact show (by solving some PDE, such as the wave equation or the resolvent equation, and exploiting at some point the fact that the Riemannian manifold
is complete) that
is essentially self-adjoint and is thus subject to the spectral theorem, but we will avoid using the full force of spectral theory here.
Since has finite measure and
, we see that
is an eigenfunction of
(or
) with eigenvalue zero. We eliminate this eigenfunction by working in the space
(or
) of functions in
(or
) of mean zero. Let us now define the spectral gap
to be the quantity
Then . Using the spectral theorem, one can interpret the spectral gap as the infimum of the spectrum
of the (negative) Laplacian on
. Note also that one can take
to be either real or complex valued, as this will not affect the value of the spectral gap. Also by a truncation and mollification argument we may allow
to range in
instead of
here if desired.
We have the following bounds:
Proof: We first establish the upper bound . It will suffice to find non-zero functions
whose Rayleigh quotient
is arbitrarily close to .
We will restrict attention to smooth compactly supported functions supported on the cusp (2) for a fixed
(e.g. one can take
). In coordinates, the Raleigh quotient becomes
where we use subscripts to denote partial differentiation, while the mean zero condition becomes
To build such functions, we select a large parameter , and choose a function
that depends only on the
variable, is supported on the region
, and equals
in the region
and is smoothly truncated in the intermediate region (assigning enough negative mass in the region
to obtain the mean zero condition). A brief calculation shows that
and
(where the implied constant can depend on but not on
), and so the claim follows by sending
.
Now we show the lower bound . Suppose this claim failed; then we may find a sequence of functions
with
such that
where denotes a quantity that goes to zero as
. We can take the
to be real valued.
To deal with the non-compact portion of (i.e. the strip (2)) we now use Hardy’s inequality. Observe that if
is smooth, real-valued, and compactly supported on a cusp (2) for some
(we can take
as before), then by integration by parts
and hence by Cauchy-Schwarz
Applying this to a truncated version of
for some
and some smooth cutoff
supported on
that equals one on
, we conclude that
For any , one can use the pigeonhole principle to find an
(depending on
) such that
and thus we see that
for some depending only on
. Thus, the probability measures
form a tight sequence of measures in
. As the
are also locally uniformly bounded in the Sobolev space
, we conclude from the Rellich compactness theorem (or the Poincaré inequality) that after passing to a subsequence, the
converge strongly in
to a limit
, which then has
norm one, mean zero, and
in a distributional sense. But then by the Poincaré inequality,
is constant, which is absurd.
Remark 8 One can in fact establish after some calculation using the theory of modular forms that
is exactly
, but we will not do so here. By modifying the above arguments, one can in fact show that
on
has absolutely continuous spectrum on
.
Now we move back towards the task of establishing expansion for the Cayley graphs . Let
denote the kernel of the projection map
; this is the group of matrices in
that are equal to
mod
, and is known as a principal congruence subgroup of
. It is a finite index normal subgroup of
, and the quotient
can easily be seen to be isomorphic to
. In analogy with what we did for
, we can then define the principal modular curve
, and then define the Laplacian
on this curve and the spectral gap
. At a qualitative level, the geometry of
is similar to that of
, except that instead of having just one cusp (2), there are now multiple cusps (which do not necessarily go to infinity as in (2), but may instead go to some other point on the boundary
of the hyperbolic plane).
Remark 9 One may think of
as being formed by cutting up a finite number of of
‘s and then (pseudo-)randomly sowing them together to create a tangled orbifold that is a continuous analogue of an expander graph; see this Notices article of Sarnak for more discussion of this perspective. Indeed, one can view
as a continuous analogue of the Cayley graph
, where
, and
is the projection onto
, with each vertex of the Cayley graph being replaced by a copy of the fundamental domain
of
, with these domains then being glued together along their edges as prescribed by the edges of the Cayley graph.
By a routine modification of Proposition 8, one can show that
(Note also that as is a finite isometric cover of
, we have the trivial bound
.) However, these arguments do not keep
uniformly bounded away from zero. Much more is conjectured to be true:
Conjecture 9 (Selberg’s conjecture) One has
for all
(not necessarily prime).
This conjecture remains open (though it has been verified numerically for small values of , in particular for all
by Booker and Stromgbergsson). On the other hand, we have the following celebrated result of Selberg:
Theorem 10 (Selberg’s 3/16 theorem) One has
for all
(not necessarily prime).
Selberg’s argument uses a serious amount of number-theoretic machinery (in particular, bounds for Kloosterman sums) and will not be reproduced here. The bound has since been improved; the best bound currently known is
, due to Kim and Sarnak and involving even more number-theoretic machinery (related to the Langlands conjectures); this argument will also not be discussed further here.
In the papers of by Sarnak and Xue and by Gamburd, an argument based primarily on quasirandomness that used only very elementary number theory was introduced, to obtain the following result:
Theorem 11 (Weak Selberg theorem) One has
for all primes
, where
goes to zero as
.
In particular, one has a uniform lower bound for some absolute constant
(and, since one can compute that
, one can in fact take
). Despite giving a weaker result than Theorem 10, the argument is more flexible and can be applied to other arithmetic surfaces than
, for which the method of Selberg does not seem to apply; see the papers of Sarnak-Xue and Gamburd for further discussion.
We will not quite prove Theorem 11 here, but instead establish the following even weaker version which uses the same ideas, but in a slightly less computation-intensive fashion (at the cost of some efficiency in the argument):
Theorem 12 (Even weaker Selberg theorem) One has
for all primes
.
Of course, this result is still strong enough to supply a uniform lower bound on .
Before we prove Theorem 12 (a spectral gap in the continuous world), let us show how it can be transferred to deduce Theorem 6 (a spectral gap in the discrete world). Suppose for contradiction that Theorem 6 failed. Then we can find a finite symmetric generating set for
(not containing the identity) and a sequence of primes
going to infinity such that the one-sided expansion constant of
goes to zero. Write
. Applying the weak discrete Cheeger inequality (Exercise 3 from Notes 2), we conclude that we can find non-empty subsets
of size
which are almost
-invariant in the sense that that
. Since
generates
, we conclude in particular that
The idea now is to pass from this nearly-invariant discrete set to a nearly-invariant continuous analogue
to which the uniform bound on the spectral gap can be applied to obtain a contradiction. (This argument is similar in spirit to Proposition 9 from Notes 2.)
We turn to the details. Let be a large parameter (independent of
) to be chosen later, and let
be a point on
(avoiding fixed points of the
action); for sake of concretness we can take
to be the projection of
to
. Note that
acts on
. We consider the function
defined by the formula
This function equals when
is within
(in the hyperbolic metric) of a point in the orbit
, and equals
when
is further than
of this orbit; in particular, it is compactly supported. The function is also Lipschitz with constant
, so
(using a weak derivative).
The curve is a
-fold cover of
and thus has volume
. Observe that
equals
on the
-neighbourhood of any point in
. As these points are separated from each other by a bounded distance (independent of
and
), we conclude that
where the bound is uniform in . Conversely, if
is not of the form
for some
and some
within distance
from the identity, we have
equal to
on the
-neighbourhood of
. There are only
possible choices for
; since
is independent of
, we conclude from (3) that all but
in
are of the form described above. Since
, we conclude that
if is sufficiently large depending on
. As a consequence, if we let
be the mean-free component of , we have the lower bound
for sufficiently large depending on
.
On the other hand, is non-zero only at points which are at distance between
and
of
. Call the set of such points
. To estimate the volume
, we partition
into
sets of the form
, where
is a fundamental domain of
(projected onto
) and
ranges over
. Because the ball of radius
centred at
is precompact and thus meets only
of the translated domains
, we see that the only
for which
meets
are of the form
, where
lies in
and
lies in a subset of
of size
that is independent of
. From (3) we conclude that all but at most
of these
thus lie in
, and so
Since , we conclude that
for sufficiently large depending on
. But this and (4) contradict the uniform lower bound on the spectral gap
(after regularising
in a standard fashion to make it smooth rather than merely Lipschitz), giving the desired contradiction.
We now turn to the proof of Theorem 12. The first step is to show that the only source of spectrum below is provided by eigenfunctions.
Proposition 13 (Discrete spectrum below
) Suppose that
. Then there exists a non-zero
such that
(in the distributional sense).
Note that while is only initially in
, it is a routine application of elliptic regularity (which we omit here) to show that
is necessarily smooth.
Proof: For notational simplicity, we will just prove the claim in the case, though the general case is similar. Write
, so that
. The argument will be similar in spirit to the proof of the lower bound
. Indeed, by definition of
, we can find a sequence of functions
with
such that
As before, we can take the to be real valued.
Using Hardy’s inequality as in the proof of Proposition 8, we see that
for any . For any
, one can use the pigeonhole principle to find an
(depending on
) such that
and thus we see that
for some depending only on
. If
is small enough,
, and thus
for all sufficiently large . By (5),
is also uniformly bounded in
norm. Thus by the Rellich compactness theorem, we may pass to a subsequence and assume that
converges weakly in
and strongly in
to a limit
, which is then non-zero. Also, from (6) we see that for each
and
there is an
such that
and thus
Taking limits in weak (and strong
), we conclude that for some
larger than
that
Sending using monotone convergence, we conclude that
for any ; by definition of
, we must then have
Perturbing in some test function direction
of mean zero, and using the definition of
, we conclude that
for all such . The mean zero condition on
can be removed since both sides of this equation vanish when
is constant. By duality we thus see that
in the sense of distributions, as required.
Exercise 20 Establish the above proposition for general
.
Exercise 21 Show that for any
, the spectrum of
in
on
is finite (and in particular consists only of eigenvalues), with each eigenvalue having finite multiplicity. (For this exercise you may use without proof the fact that
is essentially self-adjoint.)
We will also need a variant of the above proposition:
Lemma 14 (Eigenfunctions do not concentrate in cusps) Let
. Then there is a compact subset
of
, such that for any
and any eigenfunction
on
with some
, one has
where the implied constant is independent of
,
, and
, and
is the covering map.
The lemma is basically proved by applying Hardy’s inequality to each cusp of ; see the paper of Gamburd for details.
Now we can start using quasirandomness. Let be the space of all eigenfunctions of
of eigenvalue
:
By the above proposition, this is a non-trivial Hilbert space. From Exercise 21, is finite-dimensional (though we do not really need to know this fact yet in the argument that follows, as it will be a consequence of the computations). Since
acts isometrically on
, it also acts on
. If
is a
-invariant vector in
, then it descends to a function on
, which is impossible if
. Applying the Frobenius lemma (Lemma 2), we conclude
Lemma 15 (Quasirandomness) If
, then
has dimension at least
.
To complement this quasirandomness to get expansion, we need a flattening property, as in Proposition 4. In the discrete world, we applied a flattening property to the distribution of a long discrete random walk. The direct analogue of such a distribution would be a heat kernel
of the Laplacian
, and this is what we shall use here. (It turns out that the heat kernel is not quite the most efficient object to analyse here; see Remark 12 below.)
We first recall the formula for the heat kernel on the hyperbolic plane (which can be found in many places, such as this text of Chavel, or this text of Terras):
Exercise 22 Show that the heat operator
on test functions
in
is given by the formula
where
is the kernel
(Hint: There are two main computations. One is to show that
obeys the heat equation, which in half-cylindrical coordinates means that one has to verify that
The other is to show that
resembles the Euclidean heat kernel
for small
. There are several other ways to derive this formula in terms of formulae for other operators (e.g. the wave propagator); see for instance Terras’s book for some discussion.)
For our purposes, we only need a crude upper bound on the heat kernel:
Exercise 23 With the notation of the preceding section, show that
when
and
.
In our applications, the polynomial factors will be negligible; only the exponential factors will be of importance. Note that if one integrates the above estimate against the measure
, one sees that
The right-hand side evaluates to . On the other hand, as the heat kernel is a probability measure, one has
. Thus, up to polynomial factors, the above estimate is quite tight.
Remark 10 From Exercise 23, we see that the probability measure
concentrates around the region
; thus on the hyperbolic plane, Brownian motion moves “ballistically” away from its starting point at a unit speed, in contrast to the situation in Euclidean geometry, where after time
a Brownian motion is only expected to move by a distance
. One can see this phenomenon also from the heat equation (7), which when expressed in terms of the probability density
becomes a Fokker-Planck equation
with unit diffusion and drift speed
. Since
rapidly approaches
when
becomes large, we thus expect
to concentrate in the region
, as is indeed the case.
We let be a parameter to optimise in later. The heat operator
on
descends to a heat operator on the quotient
, defined by the formula
for ; note that the sum
is
-invariant, and so makes sense for
and not just for
. When applied to an eigenfunction
, one has
; in particular,
preserves
, and thus also preserves the orthogonal complement of
. As
is positive semi-definite, it therefore splits as the sum of
and another positive semi-definite operator, where
is the orthogonal projection to
. Note that
is an integral operator with kernel
where is an orthonormal basis of
. (Here we use the fact that
is finite dimensional, but if one does not want to use this fact yet, one can work instead with a finite-dimensional subspace of
in the argument that follows.) Since positive semi-definite integral operators (with continuous kernel) are non-negative on the diagonal, we conclude the pointwise inequality
for all .
This will be our starting point to get a lower bound on . But first we must deal with the other quantities
,
in this expression. A simple way to proceed here is to integrate out in
to exploit the
normalisation of the
:
However, this turns out to be a little unfavourable because the integrand on the right-hand side does not behave well enough at cusps. However, if one uses Lemma 14 first (assuming that ), and integrates over the resulting region
, we can avoid the cusps:
Because the sum here is -invariant, we can descend from
to
:
We now insert the bound in Exercise 23, as well as the bound :
Because is compact, we can get a good bound on
:
Exercise 24
- (i) For any
, show that
, where
is the Frobenius norm of the matrix
.
- (ii) More generally, if
is a compact subset of
, show that
for some constant
depending only on
.
Inserting these bounds, we obtain
Decomposing according to the integer part of
, we thus have
where is the counting function
So one is left with the purely number-theoretic task of estimating . This is basically the number of points of
in the ball of radius
in
. From the half-cylinder model, we see that the measure of this ball is
. On the other hand,
has index
in
, which has bounded covolume in
. We thus heuristically expect
to be
. If this were the truth, then the right-hand side of (9) would be
(cf. the evaluation of (8)), which when combined with quasirandomness (Lemma 15) would give a lower bound of
, that would be particularly strong when
was small.
The key is then the following “flattening lemma”, that shows that is indeed roughly of the order of
when
is large, and is the main number-theoretic input to the argument:
Lemma 16 (Flattening lemma) For any
, one has
Proof: Using the definition of , we are basically counting the number of integer solutions
to the equation
subject to the congruences
and the bounds
Since are both divisible by
, we see also that
. Similarly, as
and
are divisible by
, we have
. Subtracting, we conclude that
Now we proceed as follows. The number of integers with
is
. For each such
, the number of
with
and
is
. For each fixed
and
, the expression
is of size
; by the divisor bound, there are thus
ways to factor
into
. Thus, we obtain a final bound of
giving the claim.
Remark 11 One can obtain improved bounds to
for some ranges of
(particularly when
ranges between
and
) by using more advanced tools, such as bounds on Kloosterman sums. Unfortunately, such improvements do not actually improve the final constants in this argument. (Kloosterman sums do however play a key role in the proof of Theorem 10, which proceeds by a different, and more highly arithmetic, argument.)
A routine calculation then finishes off the proof of Theorem 12:
Exercise 25 Using the above lemma, show that the right-hand side of (9) is
for any
. Optimising this in
and using Lemma 15, establish a contradiction whenever
and
is sufficiently large depending on
, thus giving Theorem 12.
Remark 12 An inspection of the above argument shows that the
term in (10) is the main obstacle to improving the
constant. This term ultimately can be “blamed” for the relatively large value of the heat kernel
at the origin. To improve this, one can observe that the main features of the heat kernel
that were needed for the argument were that it was positive definite, and had an explicit effect on eigenfunctions
. It turns out that there are several other kernels with these properties, and by selecting a kernel with less concentration at the identity, one can obtain a better result. In particular, an efficient choice of kernel turns out to be the convolution of a ball of radius
with itself. By performing some additional calculations in hyperbolic geometry (in particular, using the Selberg/Harish-Chandra theory of spherical functions) one can use this kernel to improve the
bound given here to
; see the paper of Gamburd for details. Unfortunately, the fraction
here appears to be the limit of this particular method.
30 comments
Comments feed for this article
18 December, 2011 at 5:39 pm
254B, Notes 1: Basic theory of expander graphs « Another Word For It
[…] 254B, Notes 3: Quasirandom groups, expansion, and Selberg’s 3/16 theorem […]
20 December, 2011 at 6:31 am
Wolfgang Moens
Exercise 15: I think the proof of the “in particular”-part of the exercise needs the extra assumption of “non-abelianness”. Am I missing something?
[Corrected, thanks – T.]
21 December, 2011 at 12:52 pm
Lior Silberman
The best known bound toward Selberg’s Conjecture is
, due to Kim-Sarnak.
The conjecture has been verified numerically for
and
by Booker-Strömbergsson (the resulting expanders would be the Schreyer graphs on
rather than Cayley graphs but this isn’t much of an issue).
At the moment the conjecture cannot be verified numerically in general. The problem is that when
is an exact eigenvalue, no numerical computation can show that the eigenvalue is not fractionally smaller. The Langlands Conjectures predict when this occurs; in many cases (when the Langlands-Tunnell Theorem applies) Booker-Strömbergsson) then rigorously show that there is an eigenfunction at exactly
. Since the numerical method does count the eigenvalues, Selberg’s Conjecture follows. However, this does not work all the time.
The first case where this difficully arises (an “icosahedral representation”) occurs for
.
[References added, thanks! – T.]
26 December, 2011 at 9:54 pm
A variant of Kemperman’s theorem « What’s new
[…] are arcs of a circle. For quasirandom , one can do much better than these bounds, as discussed in this recent blog post; thus, the abelian case is morally the worst case here, although it seems difficult to convert this […]
8 January, 2012 at 10:43 am
vipulnaik
Your statement of Jordan’s theorem misses the additional fact that the abelian subgroup is normal in the group (i.e., what you’ve written is correct but not the full theorem). Also, to obtain the second para of Exercise 15, the first para should read “proper normal subgroup” instead of “proper subgroups”. We do need to use the stronger assumption of “normal” in order to obtain the result you state in the second para of Exercise 15 involving finite simple non-abelian groups.
[Corrected, thanks – T.]
9 January, 2012 at 3:10 am
Wolfgang Moens
I have another, rather naive, question. In this post, Jordan’s theorem yields an abelian subgroup that is also normal. But, in one of the previous posts (“The Jordan-Schur theorem”), there was no mention of the existence of an abelian subgoup that was, in addition, normal. Is this just a coincidence, or is there something behind this?
E.g.: would it really be harder to generalize Jordan to Jordan-Schur if one wants to include the normality of the subgroup?
9 January, 2012 at 9:48 am
Terence Tao
Well, as long as one doesn’t care about the constants, this is straightforward, because any subgroup H of a group G of some index K will always contain a normal subgroup of G of index at most K! (namely, the kernel of the left action of G on the coset space G/H, or equivalently the intersection of all the conjugates of H).
10 January, 2012 at 2:29 am
Anonymous
Oh, yes. Thanks! =)
9 January, 2012 at 3:57 am
Wolfgang Moens
P.S.: Sorry, I have just noticed that I had not read vipulnaik’s message before posting my own.
13 January, 2012 at 7:49 pm
254B, Notes 4: The Bourgain-Gamburd expansion machine « What’s new
[…] of an infinite Cayley graph on a group with Kazhdan’s property (T). The second, discussed in Notes 3, is to combine a quasirandomness property of the group with a flattening hypothesis for the random […]
5 February, 2012 at 10:23 am
MK
Shouldn’t exercise 4, ii) state “G is 2-quasirandom iff it is perfect” instead of “not perfect”? [Corrected, thanks – T.]
5 February, 2012 at 11:32 am
254B, Notes 5: Product theorems, pivot arguments, and the Larsen-Pink non-concentration inequality « What’s new
[…] non-concentration, product theorems, and quasirandomness. Quasirandomness was discussed in Notes 3. In the current set of notes, we discuss product theorems. Roughly speaking, these theorems assert […]
13 February, 2012 at 4:51 pm
254B, Notes 6: Non-concentration in subgroups « What’s new
[…] the last three notes, we discussed the Bourgain-Gamburd expansion machine and two of its three ingredients, […]
1 March, 2012 at 5:26 pm
254B, Notes 7: Sieving and expanders « What’s new
[…] a consequence of Proposition 4 of Notes 3, the following claim was shown: Proposition 7 Let be a -quasirandom finite group, is a […]
28 November, 2012 at 5:32 pm
Multiple recurrence in quasirandom groups « What’s new
[…] Geom. Func. Anal.. This paper builds upon a paper of Gowers in which he introduced the concept of a quasirandom group, and established some mixing (or recurrence) properties of such groups. A -quasirandom group is a […]
11 December, 2012 at 9:04 pm
Mixing for progressions in non-abelian groups « What’s new
[…] starting motivation for this paper was a question posed in this foundational paper of Tim Gowers on quasirandom groups. In that paper, Gowers showed (among other things) that if was a quasirandom group, patterns such […]
12 December, 2012 at 5:37 am
Fred Lunnon
Remark 3 : for “isocahedral symmetry” read “icosahedral symmetry” . WFL
[Corrected, thanks – T.]
17 December, 2012 at 8:23 am
Uwe Stroinski
In the proof of Proposition 3 (mixing inequality) you write eigenvalue
and mean eigenvalue
.
[Corrected, thanks – T.]
7 February, 2013 at 12:09 am
Uwe Stroinski
Can you or a reader please help me with exercise 10. I can show the first inequality and I understand that one can use this to quantify that the number of places at which the convolution is small cannot be large. How does that help to prove the first inequality after ‘conclude in particular that’?
7 February, 2013 at 8:12 am
Terence Tao
You might find it easier to first establish the second conclusion (i.e. that
whenever
) before establishing the first (which is equivalent to the second).
2 May, 2013 at 4:28 pm
Quasirandom groups and a cheap version of the Brauer-Fowler theorem | What's new
[…] for groups, the modern study of which was initiated by Gowers, and is discussed for instance in this previous post. It gives the following slightly weaker version of Corollary […]
7 November, 2014 at 2:07 pm
Doron Puder
Dear T.,
1) In Proposition 4, shouldn’t the ‘flattening of random walk’ condition be defined by the restriction of \mu to sum-zero functions? (Otherwise, ||\mu^{*n}||=1 for all n).
2) I suspect the 1/n term in Exercise 13 (ii) is unnecessary. I think one can prove the claim for \epsilon \le 1-\sqrt(n/(Dk)) using Proposition 3 on the eigenfunction of the largest non-trivial eigenvalue (in absolute value) and the characteristic function of S.
[(2) corrected, thanks – T.]
7 November, 2014 at 2:16 pm
Doron Puder
Please ignore the first remark. (it’s the l2-norm of a vector, not operator norm of the operator).
15 August, 2015 at 10:51 pm
A wave equation approach to automorphic forms in analytic number theory | What's new
[…] Using the Weil bound on Kloosterman sums to derive Selberg’s 3/16 theorem on the least non-trivial eigenvalue for on (discussed previously here); […]
12 November, 2015 at 12:52 pm
gninrepoli
Applying the improved estimate for the
in Theorem 7 (“J.-M. Deshouillers and H. Iwaniec, Kloosterman sums and Fourier coefficients of cusp forms, Invent. Math. 70 (1982), 219-288”; p.233), you can reduce the degree of an integer
.Will this affect the estimates of Theorem 11, and with the prospect of Theorem 12 (p.236,237)? Thank you.
13 November, 2015 at 2:54 am
gninrepoli
By Theorem 10.11 implies the assertion in Theorem 12
, but I somehow doubt take that on Selberg hypothesis should be the same expression.Let Selberg conjecture is true, then
in Theorem 9. It turns out that in Theorem 12 expression
has no place in this form. Or is it all true? I do not know if you could give your comment, if you do not complicate. Thank you.
17 April, 2020 at 3:46 am
ConstantinK
Using remark 5, do you agree that on can improve Proposition 3 to the following:
for all
. So we also get some sharper bounds in later exercises.
17 April, 2020 at 10:15 am
Terence Tao
I don’t see how to get such a good dependence on D. If one could get such a gain then one could immediately improve the exponents in a number of existing results in the literature, e.g., https://www.cambridge.org/core/journals/mathematical-proceedings-of-the-cambridge-philosophical-society/article/mixing-for-threeterm-progressions-in-finite-simple-groups/3871BEAF10295F669BAFCA2C70BF0745
18 April, 2020 at 5:22 am
ConstantinK
Ahh now I see my mistake – thank you. I just used blindly the equation from Fourier Analysis for general compact groups:
where all the integration takes place with respect to the Haar probability measure and I forgot that the 2-norm in this problem is not with respect to the Haar probability measure of
.
25 February, 2022 at 10:13 pm
pr.likelihood - Show that $mathrm{SL}_2(mathbb{F}_p)$ is quasi-random Answer - Lord Web
[…] Terry Tao offers this indirect definition of quasi-random group in his notes 3 […]