In the previous set of notes we saw how a representation-theoretic property of groups, namely Kazhdan’s property (T), could be used to demonstrate expansion in Cayley graphs. In this set of notes we discuss a different representation-theoretic property of groups, namely quasirandomness, which is also useful for demonstrating expansion in Cayley graphs, though in a somewhat different way to property (T). For instance, whereas property (T), being qualitative in nature, is only interesting for infinite groups such as or , and only creates Cayley graphs after passing to a finite quotient, quasirandomness is a quantitative property which is directly applicable to finite groups, and is able to deduce expansion in a Cayley graph, provided that random walks in that graph are known to become sufficiently “flat” in a certain sense.
The definition of quasirandomness is easy enough to state:
Definition 1 (Quasirandom groups) Let be a finite group, and let . We say that is -quasirandom if all non-trivial unitary representations of have dimension at least . (Recall a representation is trivial if is the identity for all .)
Exercise 1 Let be a finite group, and let . A unitary representation is said to be irreducible if has no -invariant subspaces other than and . Show that is -quasirandom if and only if every non-trivial irreducible representation of has dimension at least .
Remark 1 The terminology “quasirandom group” was introduced explicitly (though with slightly different notational conventions) by Gowers in 2008 in his detailed study of the concept; the name arises because dense Cayley graphs in quasirandom groups are quasirandom graphs in the sense of Chung, Graham, and Wilson, as we shall see below. This property had already been used implicitly to construct expander graphs by Sarnak and Xue in 1991, and more recently by Gamburd in 2002 and by Bourgain and Gamburd in 2008. One can of course define quasirandomness for more general locally compact groups than the finite ones, but we will only need this concept in the finite case. (A paper of Kunze and Stein from 1960, for instance, exploits the quasirandomness properties of the locally compact group to obtain mixing estimates in that group.)
Quasirandomness behaves fairly well with respect to quotients and short exact sequences:
Exercise 2 Let be a short exact sequence of finite groups .
- (i) If is -quasirandom, show that is -quasirandom also. (Equivalently: any quotient of a -quasirandom finite group is again a -quasirandom finite group.)
- (ii) Conversely, if and are both -quasirandom, show that is -quasirandom also. (In particular, the direct or semidirect product of two -quasirandom finite groups is again a -quasirandom finite group.)
Informally, we will call quasirandom if it is -quasirandom for some “large” , though the precise meaning of “large” will depend on context. For applications to expansion in Cayley graphs, “large” will mean “ for some constant independent of the size of “, but other regimes of are certainly of interest.
The way we have set things up, the trivial group is infinitely quasirandom (i.e. it is -quasirandom for every ). This is however a degenerate case and will not be discussed further here. In the non-trivial case, a finite group can only be quasirandom if it is large and has no large subgroups:
Exercise 3 Let , and let be a finite -quasirandom group.
- (i) Show that if is non-trivial, then . (Hint: use the mean zero component of the regular representation .) In particular, non-trivial finite groups cannot be infinitely quasirandom.
- (ii) Show that any proper subgroup of has index . (Hint: use the mean zero component of the quasiregular representation.)
The following exercise shows that quasirandom groups have to be quite non-abelian, and in particular perfect:
Exercise 4 (Quasirandomness, abelianness, and perfection) Let be a finite group.
- (i) If is abelian and non-trivial, show that is not -quasirandom. (Hint: use Fourier analysis or the classification of finite abelian groups.)
- (ii) Show that is -quasirandom if and only if it is perfect, i.e. the commutator group is equal to . (Equivalently, is -quasirandom if and only if it has no non-trivial abelian quotients.)
Later on we shall see that there is a converse to the above two exercises; any non-trivial perfect finite group with no large subgroups will be quasirandom.
Exercise 5 Let be a finite -quasirandom group. Show that for any subgroup of , is -quasirandom, where is the index of in . (Hint: use induced representations.)
Now we give an example of a more quasirandom group.
Lemma 2 (Frobenius lemma) If is a field of some prime order , then is -quasirandom.
This should be compared with the cardinality of the special linear group, which is easily computed to be .
Proof: We may of course take to be odd. Suppose for contradiction that we have a non-trivial representation on a unitary group of some dimension with . Set to be the group element
and suppose first that is non-trivial. Since , we have ; thus all the eigenvalues of are roots of unity. On the other hand, by conjugating by diagonal matrices in , we see that is conjugate to (and hence conjugate to ) whenever is a quadratic residue mod . As such, the eigenvalues of must be permuted by the operation for any quadratic residue mod . Since has at least one non-trivial eigenvalue, and there are distinct quadratic residues, we conclude that has at least distinct eigenvalues. But is a matrix with , a contradiction. Thus lies in the kernel of . By conjugation, we then see that this kernel contains all unipotent matrices. But these matrices generate (see exercise below), and so is trivial, a contradiction.
Exercise 6 Show that for any prime , the unipotent matrices
for ranging over generate as a group.
Exercise 7 Let be a finite group, and let . If is generated by a collection of -quasirandom subgroups, show that is itself -quasirandom.
Exercise 8 Show that is -quasirandom for any and any prime . (This is not sharp; the optimal bound here is , which follows from the results of Landazuri and Seitz.)
As a corollary of the above results and Exercise 2, we see that the projective special linear group is also -quasirandom.
Remark 2 One can ask whether the bound in Lemma 2 is sharp, assuming of course that is odd. Noting that acts linearly on the plane , we see that it also acts projectively on the projective line , which has elements. Thus acts via the quasiregular representation on the -dimensional space , and also on the -dimensional subspace ; this latter representation (known as the Steinberg representation) is irreducible. This shows that the bound cannot be improved beyond . More generally, given any character , acts on the -dimensional space of functions that obey the twisted dilation invariance for all and ; these are known as the principal series representations. When is the trivial character, this is the quasiregular representation discussed earlier. For most other characters, this is an irreducible representation, but it turns out that when is the quadratic representation (thus taking values in while being non-trivial), the principal series representation splits into the direct sum of two -dimensional representations, which comes very close to matching the bound in Lemma 2. There is a parallel series of representations to the principal series (known as the discrete series) which is more complicated to describe (roughly speaking, one has to embed in a quadratic extension and then use a rotated version of the above construction, to change a split torus into a non-split torus), but can generate irreducible representations of dimension , showing that the bound in Lemma 2 is in fact exactly sharp. These constructions can be generalised to arbitrary finite groups of Lie type using Deligne-Luzstig theory, but this is beyond the scope of this course (and of my own knowledge in the subject).
Exercise 9 Let be an odd prime. Show that for any , the alternating group is -quasirandom. (Hint: show that all cycles of order in are conjugate to each other in (and not just in ); in particular, a cycle is conjugate to its power for all . Also, as , is simple, and so the cycles of order generate the entire group.)
Remark 3 By using more precise information on the representations of the alternating group (using the theory of Specht modules and Young tableaux), one can show the slightly sharper statement that is -quasirandom for (but is only -quasirandom for due to icosahedral symmetry, and -quasirandom for due to lack of perfectness). Using Exercise 3 with the index subgroup , we see that the bound cannot be improved. Thus, (for large ) is not as quasirandom as the special linear groups (for large and bounded), because in the latter case the quasirandomness is as strong as a power of the size of the group, whereas in the former case it is only logarithmic in size.
If one replaces the alternating group with the slightly larger symmetric group , then quasirandomness is destroyed (since , having the abelian quotient , is not perfect); indeed, is -quasirandom and no better.
Remark 4 Thanks to the monumental achievement of the classification of finite simple groups, we know that apart from a finite number (26, to be precise) of sporadic exceptions, all finite simple groups (up to isomorphism) are either a cyclic group , an alternating group , or is a finite simple group of Lie type such as . (We will define the concept of a finite simple group of Lie type more precisely in later notes, but suffice to say for now that such groups are constructed from reductive algebraic groups, for instance is constructed from in characteristic .) In the case of finite simple groups of Lie type with bounded rank , it is known from the work of Landazuri and Seitz that such groups are -quasirandom for some depending only on the rank. On the other hand, by the previous remark, the large alternating groups do not have this property, and one can show that the finite simple groups of Lie type with large rank also do not have this property. Thus, we see using the classification that if a finite simple group is -quasirandom for some and is sufficiently large depending on , then is a finite simple group of Lie type with rank . It would be of interest to see if there was an alternate way to establish this fact that did not rely on the classification, as it may lead to an alternate approach to proving the classification (or perhaps a weakened version thereof).
A key reason why quasirandomness is desirable for the purposes of demonstrating expansion is that quasirandom groups happen to be rapidly mixing at large scales, as we shall see below the fold. As such, quasirandomness is an important tool for demonstrating expansion in Cayley graphs, though because expansion is a phenomenon that must hold at all scales, one needs to supplement quasirandomness with some additional input that creates mixing at small or medium scales also before one can deduce expansion. As an example of this technique of combining quasirandomness with mixing at small and medium scales, we present a proof (due to Sarnak-Xue, and simplified by Gamburd) of a weak version of the famous “3/16 theorem” of Selberg on the least non-trivial eigenvalue of the Laplacian on a modular curve, which among other things can be used to construct a family of expander Cayley graphs in (compare this with the property (T)-based methods in the previous notes, which could construct expander Cayley graphs in for any fixed ).
— 1. Mixing in quasirandom groups —
Let be a finite group. Given two functions , we can define the convolution by the formula
This operation is bilinear and associative, but is not commutative unless is abelian. From the Cauchy-Schwarz inequality one has
and hence
This inequality is sharp in the sense that if we set and to both be constant-valued, then the left-hand side and right-hand side match. For abelian groups, one can also see this example is sharp when and are multiples of the same character.
It turns out, though, that if one restricts one of or (or both) to be of mean zero, and is quasirandom, then one can improve this inequality, which first appeared explicitly in the work of Babai, Nikolov, and Pyber:
Proposition 3 (Mixing inequality) Let be a finite -quasirandom group, and let . If at least one of has mean zero, then
Proof: By subtracting a constant from or , we may assume that or both have mean zero.
Observe that (being a superposition of right-translates of ) also has mean zero. Thus, we see that we may define an operator by setting . It thus suffices to show that the operator norm of is at most .
Fix . We can view as a matrix. We apply the singular value decomposition to this matrix to obtain singular values
of , together with associated singular vectors. The operator norm of is the largest singular value . The operator is then a self-adjoint operator (or matrix) with eigenvalues . In particular, we have
Now, a short computation shows that , where , and (by embedding in , and noting that annihilates constants) the trace can computed as
Thus, if is the eigenspace of corresponding to the eigenvalue (so that the dimension of is the multiplicity of ), we have
Now observe that if is the left-regular representation (restricted to ) then
for any and (this is a special case of the associativity of convolution). In particular, we see that is invariant under . Since has no non-trivial invariant vectors in , we conclude from quasirandomness that has dimension at least , and the claim follows.
Remark 5 One can also establish the above inequality using the nonabelian Fourier transform (which is based on the Peter-Weyl theorem combined with Schur’s lemma, and is developed in this blog post); we leave this as an exercise for the interested reader.
Exercise 10 Let be subsets of a finite -quasirandom group . Show that
and
Conclude in particular that
and that whenever . (The bounds here are not quite sharp, but are simpler than the optimal bounds, and suffice for most applications.)
Thus, for instance, if is a subset of a finite -quasirandom group of density more than , then will be most of (with fewer than elements omitted), and will be all of ; thus large subsets of a quasirandom group rapidly expand to fill out the whole group. In the converse direction, we have
Exercise 11 Let , and let be a finite group which is not -quasirandom. Show that there exists a subset of with for some absolute constant , such that . (Hint: by hypothesis, one has a non-trivial unitary representation of dimension at most . Show that contains an element at distance from the identity in operator norm, and take to be the preimage of a suitable ball around the identity in the operator norm.)
One can improve this result by using a quantitative form of Jordan’s theorem; see the next section.
Exercise 12 (Mixing inequality for actions) Let be a finite -quasirandom group acting (on the left) on a discrete set . Given functions and , one can define the convolution in much the same way as before:
Show that
whenever has mean zero, or whenever has mean zero on every orbit of .
One can use quasirandomness to show that Cayley graphs of very large degree in a quasirandom group are expanders:
Exercise 13 Let be a -regular Cayley graph in a finite -quasirandom group on vertices.
- (i) If , are subsets of , show that
(compare with the expander mixing lemma, Exercise 9 from Notes 1).
- (ii) Show that is a two-sided -expander whenever
Unfortunately, the above result is only non-trivial in the regime , whereas for our applications we are interested instead in the regime when . We record a tool for this purpose.
Proposition 4 (Using quasirandomness to demonstrate expansion) Let be a -regular Cayley graph in a finite group . Assume the following:
- (i) (Quasirandomness) is -quasirandom for some .
- (ii) (Flattening of random walk) One has
for some with and , where and is the -fold convolution of .
Then is a two-sided -expander for some depending only on , if is sufficiently large depending on these quantities. If we replace by in the flattening hypothesis, then is a one-sided -expander instead.
Proof: We allow implied constants to depend on . We will just prove the first claim, as the second claim is similar. By Exercise 5 from Notes 2, it will suffice to show that
(say) for some . But from Proposition 3 (with and ) and the hypotheses we have
for any . Iterating this (starting from, say, , and advancing in steps of times) we obtain the claim.
Exercise 14 Obtain an alternate proof of the above result that proceeds directly from the spectral decomposition of the adjacency operator into eigenvalues and eigenvectors and quasirandomness, rather than through Exercise 5 from Notes 2 and Proposition 3. (This alternate approach is closer in spirit to the arguments of Sarnak-Xue and Bourgain-Gamburd, though the two approaches are largely equivalent in the final analysis.)
Informally, the flattening hypothesis in Proposition 4 asserts that by time , the random walk has expanded to the point where it is covering a large portion of the group (roughly speaking, it is spread out over a set of size at least ). The point is that the scale of this set is large enough for the quasirandomness properties of the group to then mix the random walk rapidly towards the uniform distribution. However, this proposition provides no tools with which to prove this flattening property; this task will be a focus of subsequent notes.
The following exercise extends some of the above theory from quasirandom groups to “virtually quasirandom” groups, which have a bounded index subgroup that is quasirandom, but need not themselves be quasirandom.
Exercise 15 (Virtually quasirandom groups) Let be a finite group that contains a normal -quasirandom subgroup of index at most .
- (i) If , and at least one of has mean zero on every coset of , show that
- (ii) If for some , and is a connected -regular Cayley graph obeying (1) for some with and , show that is a two-sided -expander for some depending only on .
— 2. An algebraic description of quasirandomness (optional) —
As defined above, quasirandomness is a property of representations. However, one can reformulate this property (at a qualitative level, at least) in a more algebraic fashion, by means of Jordan’s theorem:
Theorem 5 (Jordan’s theorem) Let be a finite subgroup of for some . Then contains a normal abelian subgroup of index at most , where depends only on .
A proof of this theorem (giving a rather poor value of ) may be found in this previous blog post. The optimal value of is known for almost all , thanks to the classification of finite simple groups: for instance, it is a result of Collins that for (which is attained with the example of the symmetric group which acts on the space of -dimensional complex vectors whose coefficients sum to zero).
Jordan’s theorem can be used to give a qualitative description of quasirandomness, providing a converse to Exercises 3 and 4:
Exercise 16 Let be an integer. Let be a perfect finite group, with the property that all proper normal subgroups of have index greater than , where is the quantity in Jordan’s theorem. Show that is -quasirandom.
Conclude in particular that any finite simple nonabelian group of cardinality greater than is -quasirandom.
By using the classification of finite simple groups more carefully, Nikolov and Pyber were able to replace here by . Using related arguments, they also showed that if was not -quasirandom, then there was a subset of with cardinality such that , thus giving a reasonably tight converse to Exercise 10.
— 3. A weak form of Selberg’s 3/16 theorem (optional) —
Remark 6 This section presumes some familiarity with Riemannian geometry, as well as the functional analysis of Sobolev spaces and distributions. See for instance this blog post for a very brief introduction to Riemannian geometry, and these two previous posts for an introduction to distributions and Sobolev spaces.
We now give an application of quasirandomness to establish the following result, first observed explicitly by Lubotsky, Phillips and Sarnak as a corollary of a famous theorem of Selberg:
Theorem 6 (Selberg’s expander construction) If is a symmetric set of generators of that does not contain the identity, then the Cayley graphs form a one-sided expander family, where is the obvious projection homomorphism, and ranges over primes.
This is the analogue of Margulis’s expander construction (Corollary 13 from Notes 2), except that the modulus has been restricted here to be prime. This restriction can be removed with some additional effort, but we will not discuss this issue here. The condition that generates the entire group can be substantially relaxed; we will discuss this point in later notes.
In the property (T) approach to expansion, one passed from discrete groups such as to continuous groups such as , in order to take advantage of tools from analysis (such as limits). Similarly, to prove Theorem 6, we will pass from to . Actually, it will be convenient to work with the quotient space , better known as the hyperbolic plane. We will endow this plane with the structure of a Riemannian manifold, in order to access the Laplace-Beltrami operator on that plane, which is a continuous analogue (after some renormalisation) of the adjacency operator of a Cayley graph, which enjoys some nice exact identities which are difficult to discern in the discrete world.
We now therefore digress from the topic of expansion to recall the geometry of the hyperbolic plane. It will be convenient to switch between a number of different coordinatisations of this plane. Our primary model will be the half-plane model:
Definition 7 (Poincaré half-plane model) The Poincaré half-plane is the upper half-plane with the Riemannian metric . The (left) action of on this half-plane is given by the formula
One easily verifies that this gives an isometric action on .
Exercise 17 Verify that acts isometrically and transitively on , with stabiliser group isomorphic to ; thus is isomorphic (as an -homogeneous space) to .
Note that in some of the literature, a right action is used instead of a left action, leading to some reversals in the notational conventions used below, but this does not lead to any essential changes in the arguments or results.
Exercise 18 Show that the distance between two points is given by the formula
We can also use the model of the Poincaré disk with the Riemannian metric .
Exercise 19 Show that the Cayley transform is an isometric isomorphism from the half-plane to the disk .
Expressing an element of the Poincaré disk in exponential polar coordinates as , we can also model the Poincaré disk (in slightly singular coordinates) as the half-cylinder with metric . (Compare with the Euclidean plane in polar coordinates, which is similar but with the factor replaced by , or the sphere in Euler coordinates, which is also similar but with restricted to and replaced by . This similarity reflects the fact that these three Riemannian surfaces have constant curvature respectively.)
The action of can of course be described explicitly in the disk or half-plane models, but we will not need these explicit formulae here.
A Riemannian metric on a manifold always generates a measure on that manifold. For the Poincaré half-plane, the measure is . For the Poincaré disk, it is . For the half-cylinder, it is . In all cases, the action of will preserve the measure, because it preserves the metric, thus one can view as a Haar measure on the hyperbolic plane.
A Riemannian metric also generates a Laplace-Beltrami operator . In the Poincaré half-plane model, it is
in the Poincaré disk model, it is
and in the half-cylinder model, it is
Again, in all cases, the Laplacian will commute with the action of , because this action preserves the metric.
The discrete group acts on the hyperbolic plane, giving rise to a quotient , known as the principal modular curve of level . This quotient can also be viewed by taking the (closure of a) fundamental domain
and then identifying with on the left and right sides of this domain, and also identifying with on the lower boundary of this domain. The quotient is not compact, but it does have finite measure with respect to ; indeed, outside of a compact set, behaves like the cusp
for any constant , again identifying the boundary with the boundary, and this strip has measure
Thus descends to a finite Haar measure on .
Remark 7 One can interpret the modular curve geometrically as follows. As in the previous set of notes, one can think of as the space of all unimodular lattices in (or equivalently, in ) with two marked generators with . We can then map to the Poincaré half-plane by sending such a lattice with generators to the point ; note that the fibers of this map correspond to rotations of the lattice and marked generators, thus identifying with . The action of on corresponds to moving the generators around while keeping the lattice (or more precisely, the lattice modulo rotations) fixed. The (interior of the) fundamental domain then corresponds to the selection of generators given by setting to be the non-zero lattice element of smallest norm, and to be the generator whose component lies between and .
The quotient is not quite a smooth Riemannian manifold, due to the presence of partially fixed points of the action at and (of order and order respectively), and is thus technically an orbifold rather than a manifold. However, this distinction turns out to not significantly affect the analysis and will be glossed over here.
The Laplace-Beltrami operator is defined on smooth compactly supported functions on , and then descends to an operator on smooth compactly supported functions on . (Here, we use the smooth structure on inherited from , thus a function is smooth at a point in if it lifts to a function smooth at the preimage of that point.) On , we have the integration by parts formula
where is the gradient with respect to the Riemannian metric, and the inner product; in the half-plane coordinates, we have
in the disk model, it is
and in the half-cylinder model, it is
In particular, we have the positive definiteness property
for all . This descends to :
Thus is a symmetric positive-definite densely-defined operator on . One can in fact show (by solving some PDE, such as the wave equation or the resolvent equation, and exploiting at some point the fact that the Riemannian manifold is complete) that is essentially self-adjoint and is thus subject to the spectral theorem, but we will avoid using the full force of spectral theory here.
Since has finite measure and , we see that is an eigenfunction of (or ) with eigenvalue zero. We eliminate this eigenfunction by working in the space (or ) of functions in (or ) of mean zero. Let us now define the spectral gap to be the quantity
Then . Using the spectral theorem, one can interpret the spectral gap as the infimum of the spectrum of the (negative) Laplacian on . Note also that one can take to be either real or complex valued, as this will not affect the value of the spectral gap. Also by a truncation and mollification argument we may allow to range in instead of here if desired.
We have the following bounds:
Proof: We first establish the upper bound . It will suffice to find non-zero functions whose Rayleigh quotient
is arbitrarily close to .
We will restrict attention to smooth compactly supported functions supported on the cusp (2) for a fixed (e.g. one can take ). In coordinates, the Raleigh quotient becomes
where we use subscripts to denote partial differentiation, while the mean zero condition becomes
To build such functions, we select a large parameter , and choose a function that depends only on the variable, is supported on the region , and equals in the region and is smoothly truncated in the intermediate region (assigning enough negative mass in the region to obtain the mean zero condition). A brief calculation shows that
and
(where the implied constant can depend on but not on ), and so the claim follows by sending .
Now we show the lower bound . Suppose this claim failed; then we may find a sequence of functions with such that
where denotes a quantity that goes to zero as . We can take the to be real valued.
To deal with the non-compact portion of (i.e. the strip (2)) we now use Hardy’s inequality. Observe that if is smooth, real-valued, and compactly supported on a cusp (2) for some (we can take as before), then by integration by parts
and hence by Cauchy-Schwarz
Applying this to a truncated version of for some and some smooth cutoff supported on that equals one on , we conclude that
For any , one can use the pigeonhole principle to find an (depending on ) such that
and thus we see that
for some depending only on . Thus, the probability measures form a tight sequence of measures in . As the are also locally uniformly bounded in the Sobolev space , we conclude from the Rellich compactness theorem (or the Poincaré inequality) that after passing to a subsequence, the converge strongly in to a limit , which then has norm one, mean zero, and in a distributional sense. But then by the Poincaré inequality, is constant, which is absurd.
Remark 8 One can in fact establish after some calculation using the theory of modular forms that is exactly , but we will not do so here. By modifying the above arguments, one can in fact show that on has absolutely continuous spectrum on .
Now we move back towards the task of establishing expansion for the Cayley graphs . Let denote the kernel of the projection map ; this is the group of matrices in that are equal to mod , and is known as a principal congruence subgroup of . It is a finite index normal subgroup of , and the quotient can easily be seen to be isomorphic to . In analogy with what we did for , we can then define the principal modular curve , and then define the Laplacian on this curve and the spectral gap . At a qualitative level, the geometry of is similar to that of , except that instead of having just one cusp (2), there are now multiple cusps (which do not necessarily go to infinity as in (2), but may instead go to some other point on the boundary of the hyperbolic plane).
Remark 9 One may think of as being formed by cutting up a finite number of of ‘s and then (pseudo-)randomly sowing them together to create a tangled orbifold that is a continuous analogue of an expander graph; see this Notices article of Sarnak for more discussion of this perspective. Indeed, one can view as a continuous analogue of the Cayley graph , where , and is the projection onto , with each vertex of the Cayley graph being replaced by a copy of the fundamental domain of , with these domains then being glued together along their edges as prescribed by the edges of the Cayley graph.
By a routine modification of Proposition 8, one can show that
(Note also that as is a finite isometric cover of , we have the trivial bound .) However, these arguments do not keep uniformly bounded away from zero. Much more is conjectured to be true:
Conjecture 9 (Selberg’s conjecture) One has for all (not necessarily prime).
This conjecture remains open (though it has been verified numerically for small values of , in particular for all by Booker and Stromgbergsson). On the other hand, we have the following celebrated result of Selberg:
Theorem 10 (Selberg’s 3/16 theorem) One has for all (not necessarily prime).
Selberg’s argument uses a serious amount of number-theoretic machinery (in particular, bounds for Kloosterman sums) and will not be reproduced here. The bound has since been improved; the best bound currently known is , due to Kim and Sarnak and involving even more number-theoretic machinery (related to the Langlands conjectures); this argument will also not be discussed further here.
In the papers of by Sarnak and Xue and by Gamburd, an argument based primarily on quasirandomness that used only very elementary number theory was introduced, to obtain the following result:
Theorem 11 (Weak Selberg theorem) One has for all primes , where goes to zero as .
In particular, one has a uniform lower bound for some absolute constant (and, since one can compute that , one can in fact take ). Despite giving a weaker result than Theorem 10, the argument is more flexible and can be applied to other arithmetic surfaces than , for which the method of Selberg does not seem to apply; see the papers of Sarnak-Xue and Gamburd for further discussion.
We will not quite prove Theorem 11 here, but instead establish the following even weaker version which uses the same ideas, but in a slightly less computation-intensive fashion (at the cost of some efficiency in the argument):
Theorem 12 (Even weaker Selberg theorem) One has for all primes .
Of course, this result is still strong enough to supply a uniform lower bound on .
Before we prove Theorem 12 (a spectral gap in the continuous world), let us show how it can be transferred to deduce Theorem 6 (a spectral gap in the discrete world). Suppose for contradiction that Theorem 6 failed. Then we can find a finite symmetric generating set for (not containing the identity) and a sequence of primes going to infinity such that the one-sided expansion constant of goes to zero. Write . Applying the weak discrete Cheeger inequality (Exercise 3 from Notes 2), we conclude that we can find non-empty subsets of size which are almost -invariant in the sense that that . Since generates , we conclude in particular that
The idea now is to pass from this nearly-invariant discrete set to a nearly-invariant continuous analogue to which the uniform bound on the spectral gap can be applied to obtain a contradiction. (This argument is similar in spirit to Proposition 9 from Notes 2.)
We turn to the details. Let be a large parameter (independent of ) to be chosen later, and let be a point on (avoiding fixed points of the action); for sake of concretness we can take to be the projection of to . Note that acts on . We consider the function defined by the formula
This function equals when is within (in the hyperbolic metric) of a point in the orbit , and equals when is further than of this orbit; in particular, it is compactly supported. The function is also Lipschitz with constant , so (using a weak derivative).
The curve is a -fold cover of and thus has volume . Observe that equals on the -neighbourhood of any point in . As these points are separated from each other by a bounded distance (independent of and ), we conclude that
where the bound is uniform in . Conversely, if is not of the form for some and some within distance from the identity, we have equal to on the -neighbourhood of . There are only possible choices for ; since is independent of , we conclude from (3) that all but in are of the form described above. Since , we conclude that
if is sufficiently large depending on . As a consequence, if we let
be the mean-free component of , we have the lower bound
for sufficiently large depending on .
On the other hand, is non-zero only at points which are at distance between and of . Call the set of such points . To estimate the volume , we partition into sets of the form , where is a fundamental domain of (projected onto ) and ranges over . Because the ball of radius centred at is precompact and thus meets only of the translated domains , we see that the only for which meets are of the form , where lies in and lies in a subset of of size that is independent of . From (3) we conclude that all but at most of these thus lie in , and so
Since , we conclude that
for sufficiently large depending on . But this and (4) contradict the uniform lower bound on the spectral gap (after regularising in a standard fashion to make it smooth rather than merely Lipschitz), giving the desired contradiction.
We now turn to the proof of Theorem 12. The first step is to show that the only source of spectrum below is provided by eigenfunctions.
Proposition 13 (Discrete spectrum below ) Suppose that . Then there exists a non-zero such that (in the distributional sense).
Note that while is only initially in , it is a routine application of elliptic regularity (which we omit here) to show that is necessarily smooth.
Proof: For notational simplicity, we will just prove the claim in the case, though the general case is similar. Write , so that . The argument will be similar in spirit to the proof of the lower bound . Indeed, by definition of , we can find a sequence of functions with such that
As before, we can take the to be real valued.
Using Hardy’s inequality as in the proof of Proposition 8, we see that
for any . For any , one can use the pigeonhole principle to find an (depending on ) such that
and thus we see that
for some depending only on . If is small enough, , and thus
for all sufficiently large . By (5), is also uniformly bounded in norm. Thus by the Rellich compactness theorem, we may pass to a subsequence and assume that converges weakly in and strongly in to a limit , which is then non-zero. Also, from (6) we see that for each and there is an such that
and thus
Taking limits in weak (and strong ), we conclude that for some larger than that
Sending using monotone convergence, we conclude that
for any ; by definition of , we must then have
Perturbing in some test function direction of mean zero, and using the definition of , we conclude that
for all such . The mean zero condition on can be removed since both sides of this equation vanish when is constant. By duality we thus see that in the sense of distributions, as required.
Exercise 20 Establish the above proposition for general .
Exercise 21 Show that for any , the spectrum of in on is finite (and in particular consists only of eigenvalues), with each eigenvalue having finite multiplicity. (For this exercise you may use without proof the fact that is essentially self-adjoint.)
We will also need a variant of the above proposition:
Lemma 14 (Eigenfunctions do not concentrate in cusps) Let . Then there is a compact subset of , such that for any and any eigenfunction on with some , one has
where the implied constant is independent of , , and , and is the covering map.
The lemma is basically proved by applying Hardy’s inequality to each cusp of ; see the paper of Gamburd for details.
Now we can start using quasirandomness. Let be the space of all eigenfunctions of of eigenvalue :
By the above proposition, this is a non-trivial Hilbert space. From Exercise 21, is finite-dimensional (though we do not really need to know this fact yet in the argument that follows, as it will be a consequence of the computations). Since acts isometrically on , it also acts on . If is a -invariant vector in , then it descends to a function on , which is impossible if . Applying the Frobenius lemma (Lemma 2), we conclude
Lemma 15 (Quasirandomness) If , then has dimension at least .
To complement this quasirandomness to get expansion, we need a flattening property, as in Proposition 4. In the discrete world, we applied a flattening property to the distribution of a long discrete random walk. The direct analogue of such a distribution would be a heat kernel of the Laplacian , and this is what we shall use here. (It turns out that the heat kernel is not quite the most efficient object to analyse here; see Remark 12 below.)
We first recall the formula for the heat kernel on the hyperbolic plane (which can be found in many places, such as this text of Chavel, or this text of Terras):
Exercise 22 Show that the heat operator on test functions in is given by the formula
where is the kernel
(Hint: There are two main computations. One is to show that obeys the heat equation, which in half-cylindrical coordinates means that one has to verify that
The other is to show that resembles the Euclidean heat kernel for small . There are several other ways to derive this formula in terms of formulae for other operators (e.g. the wave propagator); see for instance Terras’s book for some discussion.)
For our purposes, we only need a crude upper bound on the heat kernel:
Exercise 23 With the notation of the preceding section, show that
when and .
In our applications, the polynomial factors will be negligible; only the exponential factors will be of importance. Note that if one integrates the above estimate against the measure , one sees that
The right-hand side evaluates to . On the other hand, as the heat kernel is a probability measure, one has . Thus, up to polynomial factors, the above estimate is quite tight.
Remark 10 From Exercise 23, we see that the probability measure concentrates around the region ; thus on the hyperbolic plane, Brownian motion moves “ballistically” away from its starting point at a unit speed, in contrast to the situation in Euclidean geometry, where after time a Brownian motion is only expected to move by a distance . One can see this phenomenon also from the heat equation (7), which when expressed in terms of the probability density becomes a Fokker-Planck equation
with unit diffusion and drift speed . Since rapidly approaches when becomes large, we thus expect to concentrate in the region , as is indeed the case.
We let be a parameter to optimise in later. The heat operator on descends to a heat operator on the quotient , defined by the formula
for ; note that the sum is -invariant, and so makes sense for and not just for . When applied to an eigenfunction , one has ; in particular, preserves , and thus also preserves the orthogonal complement of . As is positive semi-definite, it therefore splits as the sum of and another positive semi-definite operator, where is the orthogonal projection to . Note that is an integral operator with kernel
where is an orthonormal basis of . (Here we use the fact that is finite dimensional, but if one does not want to use this fact yet, one can work instead with a finite-dimensional subspace of in the argument that follows.) Since positive semi-definite integral operators (with continuous kernel) are non-negative on the diagonal, we conclude the pointwise inequality
for all .
This will be our starting point to get a lower bound on . But first we must deal with the other quantities , in this expression. A simple way to proceed here is to integrate out in to exploit the normalisation of the :
However, this turns out to be a little unfavourable because the integrand on the right-hand side does not behave well enough at cusps. However, if one uses Lemma 14 first (assuming that ), and integrates over the resulting region , we can avoid the cusps:
Because the sum here is -invariant, we can descend from to :
We now insert the bound in Exercise 23, as well as the bound :
Because is compact, we can get a good bound on :
Exercise 24
- (i) For any , show that , where is the Frobenius norm of the matrix .
- (ii) More generally, if is a compact subset of , show that for some constant depending only on .
Inserting these bounds, we obtain
Decomposing according to the integer part of , we thus have
where is the counting function
So one is left with the purely number-theoretic task of estimating . This is basically the number of points of in the ball of radius in . From the half-cylinder model, we see that the measure of this ball is . On the other hand, has index in , which has bounded covolume in . We thus heuristically expect to be . If this were the truth, then the right-hand side of (9) would be (cf. the evaluation of (8)), which when combined with quasirandomness (Lemma 15) would give a lower bound of , that would be particularly strong when was small.
The key is then the following “flattening lemma”, that shows that is indeed roughly of the order of when is large, and is the main number-theoretic input to the argument:
Lemma 16 (Flattening lemma) For any , one has
Proof: Using the definition of , we are basically counting the number of integer solutions to the equation
subject to the congruences
and the bounds
Since are both divisible by , we see also that . Similarly, as and are divisible by , we have . Subtracting, we conclude that
Now we proceed as follows. The number of integers with is . For each such , the number of with and is . For each fixed and , the expression is of size ; by the divisor bound, there are thus ways to factor into . Thus, we obtain a final bound of
giving the claim.
Remark 11 One can obtain improved bounds to for some ranges of (particularly when ranges between and ) by using more advanced tools, such as bounds on Kloosterman sums. Unfortunately, such improvements do not actually improve the final constants in this argument. (Kloosterman sums do however play a key role in the proof of Theorem 10, which proceeds by a different, and more highly arithmetic, argument.)
A routine calculation then finishes off the proof of Theorem 12:
Exercise 25 Using the above lemma, show that the right-hand side of (9) is
for any . Optimising this in and using Lemma 15, establish a contradiction whenever and is sufficiently large depending on , thus giving Theorem 12.
Remark 12 An inspection of the above argument shows that the term in (10) is the main obstacle to improving the constant. This term ultimately can be “blamed” for the relatively large value of the heat kernel at the origin. To improve this, one can observe that the main features of the heat kernel that were needed for the argument were that it was positive definite, and had an explicit effect on eigenfunctions . It turns out that there are several other kernels with these properties, and by selecting a kernel with less concentration at the identity, one can obtain a better result. In particular, an efficient choice of kernel turns out to be the convolution of a ball of radius with itself. By performing some additional calculations in hyperbolic geometry (in particular, using the Selberg/Harish-Chandra theory of spherical functions) one can use this kernel to improve the bound given here to ; see the paper of Gamburd for details. Unfortunately, the fraction here appears to be the limit of this particular method.
30 comments
Comments feed for this article
18 December, 2011 at 5:39 pm
254B, Notes 1: Basic theory of expander graphs « Another Word For It
[…] 254B, Notes 3: Quasirandom groups, expansion, and Selberg’s 3/16 theorem […]
20 December, 2011 at 6:31 am
Wolfgang Moens
Exercise 15: I think the proof of the “in particular”-part of the exercise needs the extra assumption of “non-abelianness”. Am I missing something?
[Corrected, thanks – T.]
21 December, 2011 at 12:52 pm
Lior Silberman
The best known bound toward Selberg’s Conjecture is , due to Kim-Sarnak.
The conjecture has been verified numerically for and by Booker-Strömbergsson (the resulting expanders would be the Schreyer graphs on rather than Cayley graphs but this isn’t much of an issue).
At the moment the conjecture cannot be verified numerically in general. The problem is that when is an exact eigenvalue, no numerical computation can show that the eigenvalue is not fractionally smaller. The Langlands Conjectures predict when this occurs; in many cases (when the Langlands-Tunnell Theorem applies) Booker-Strömbergsson) then rigorously show that there is an eigenfunction at exactly . Since the numerical method does count the eigenvalues, Selberg’s Conjecture follows. However, this does not work all the time.
The first case where this difficully arises (an “icosahedral representation”) occurs for .
[References added, thanks! – T.]
26 December, 2011 at 9:54 pm
A variant of Kemperman’s theorem « What’s new
[…] are arcs of a circle. For quasirandom , one can do much better than these bounds, as discussed in this recent blog post; thus, the abelian case is morally the worst case here, although it seems difficult to convert this […]
8 January, 2012 at 10:43 am
vipulnaik
Your statement of Jordan’s theorem misses the additional fact that the abelian subgroup is normal in the group (i.e., what you’ve written is correct but not the full theorem). Also, to obtain the second para of Exercise 15, the first para should read “proper normal subgroup” instead of “proper subgroups”. We do need to use the stronger assumption of “normal” in order to obtain the result you state in the second para of Exercise 15 involving finite simple non-abelian groups.
[Corrected, thanks – T.]
9 January, 2012 at 3:10 am
Wolfgang Moens
I have another, rather naive, question. In this post, Jordan’s theorem yields an abelian subgroup that is also normal. But, in one of the previous posts (“The Jordan-Schur theorem”), there was no mention of the existence of an abelian subgoup that was, in addition, normal. Is this just a coincidence, or is there something behind this?
E.g.: would it really be harder to generalize Jordan to Jordan-Schur if one wants to include the normality of the subgroup?
9 January, 2012 at 9:48 am
Terence Tao
Well, as long as one doesn’t care about the constants, this is straightforward, because any subgroup H of a group G of some index K will always contain a normal subgroup of G of index at most K! (namely, the kernel of the left action of G on the coset space G/H, or equivalently the intersection of all the conjugates of H).
10 January, 2012 at 2:29 am
Anonymous
Oh, yes. Thanks! =)
9 January, 2012 at 3:57 am
Wolfgang Moens
P.S.: Sorry, I have just noticed that I had not read vipulnaik’s message before posting my own.
13 January, 2012 at 7:49 pm
254B, Notes 4: The Bourgain-Gamburd expansion machine « What’s new
[…] of an infinite Cayley graph on a group with Kazhdan’s property (T). The second, discussed in Notes 3, is to combine a quasirandomness property of the group with a flattening hypothesis for the random […]
5 February, 2012 at 10:23 am
MK
Shouldn’t exercise 4, ii) state “G is 2-quasirandom iff it is perfect” instead of “not perfect”? [Corrected, thanks – T.]
5 February, 2012 at 11:32 am
254B, Notes 5: Product theorems, pivot arguments, and the Larsen-Pink non-concentration inequality « What’s new
[…] non-concentration, product theorems, and quasirandomness. Quasirandomness was discussed in Notes 3. In the current set of notes, we discuss product theorems. Roughly speaking, these theorems assert […]
13 February, 2012 at 4:51 pm
254B, Notes 6: Non-concentration in subgroups « What’s new
[…] the last three notes, we discussed the Bourgain-Gamburd expansion machine and two of its three ingredients, […]
1 March, 2012 at 5:26 pm
254B, Notes 7: Sieving and expanders « What’s new
[…] a consequence of Proposition 4 of Notes 3, the following claim was shown: Proposition 7 Let be a -quasirandom finite group, is a […]
28 November, 2012 at 5:32 pm
Multiple recurrence in quasirandom groups « What’s new
[…] Geom. Func. Anal.. This paper builds upon a paper of Gowers in which he introduced the concept of a quasirandom group, and established some mixing (or recurrence) properties of such groups. A -quasirandom group is a […]
11 December, 2012 at 9:04 pm
Mixing for progressions in non-abelian groups « What’s new
[…] starting motivation for this paper was a question posed in this foundational paper of Tim Gowers on quasirandom groups. In that paper, Gowers showed (among other things) that if was a quasirandom group, patterns such […]
12 December, 2012 at 5:37 am
Fred Lunnon
Remark 3 : for “isocahedral symmetry” read “icosahedral symmetry” . WFL
[Corrected, thanks – T.]
17 December, 2012 at 8:23 am
Uwe Stroinski
In the proof of Proposition 3 (mixing inequality) you write eigenvalue and mean eigenvalue .
[Corrected, thanks – T.]
7 February, 2013 at 12:09 am
Uwe Stroinski
Can you or a reader please help me with exercise 10. I can show the first inequality and I understand that one can use this to quantify that the number of places at which the convolution is small cannot be large. How does that help to prove the first inequality after ‘conclude in particular that’?
7 February, 2013 at 8:12 am
Terence Tao
You might find it easier to first establish the second conclusion (i.e. that whenever ) before establishing the first (which is equivalent to the second).
2 May, 2013 at 4:28 pm
Quasirandom groups and a cheap version of the Brauer-Fowler theorem | What's new
[…] for groups, the modern study of which was initiated by Gowers, and is discussed for instance in this previous post. It gives the following slightly weaker version of Corollary […]
7 November, 2014 at 2:07 pm
Doron Puder
Dear T.,
1) In Proposition 4, shouldn’t the ‘flattening of random walk’ condition be defined by the restriction of \mu to sum-zero functions? (Otherwise, ||\mu^{*n}||=1 for all n).
2) I suspect the 1/n term in Exercise 13 (ii) is unnecessary. I think one can prove the claim for \epsilon \le 1-\sqrt(n/(Dk)) using Proposition 3 on the eigenfunction of the largest non-trivial eigenvalue (in absolute value) and the characteristic function of S.
[(2) corrected, thanks – T.]
7 November, 2014 at 2:16 pm
Doron Puder
Please ignore the first remark. (it’s the l2-norm of a vector, not operator norm of the operator).
15 August, 2015 at 10:51 pm
A wave equation approach to automorphic forms in analytic number theory | What's new
[…] Using the Weil bound on Kloosterman sums to derive Selberg’s 3/16 theorem on the least non-trivial eigenvalue for on (discussed previously here); […]
12 November, 2015 at 12:52 pm
gninrepoli
Applying the improved estimate for the in Theorem 7 (“J.-M. Deshouillers and H. Iwaniec, Kloosterman sums and Fourier coefficients of cusp forms, Invent. Math. 70 (1982), 219-288”; p.233), you can reduce the degree of an integer .Will this affect the estimates of Theorem 11, and with the prospect of Theorem 12 (p.236,237)? Thank you.
13 November, 2015 at 2:54 am
gninrepoli
By Theorem 10.11 implies the assertion in Theorem 12 , but I somehow doubt take that on Selberg hypothesis should be the same expression.Let Selberg conjecture is true, then in Theorem 9. It turns out that in Theorem 12 expression has no place in this form. Or is it all true? I do not know if you could give your comment, if you do not complicate. Thank you.
17 April, 2020 at 3:46 am
ConstantinK
Using remark 5, do you agree that on can improve Proposition 3 to the following: for all . So we also get some sharper bounds in later exercises.
17 April, 2020 at 10:15 am
Terence Tao
I don’t see how to get such a good dependence on D. If one could get such a gain then one could immediately improve the exponents in a number of existing results in the literature, e.g., https://www.cambridge.org/core/journals/mathematical-proceedings-of-the-cambridge-philosophical-society/article/mixing-for-threeterm-progressions-in-finite-simple-groups/3871BEAF10295F669BAFCA2C70BF0745
18 April, 2020 at 5:22 am
ConstantinK
Ahh now I see my mistake – thank you. I just used blindly the equation from Fourier Analysis for general compact groups:
where all the integration takes place with respect to the Haar probability measure and I forgot that the 2-norm in this problem is not with respect to the Haar probability measure of .
25 February, 2022 at 10:13 pm
pr.likelihood - Show that $mathrm{SL}_2(mathbb{F}_p)$ is quasi-random Answer - Lord Web
[…] Terry Tao offers this indirect definition of quasi-random group in his notes 3 […]