In this final set of course notes, we discuss how (a generalisation of) the expansion results obtained in the preceding notes can be used for some number-theoretic applications, and in particular to locate almost primes inside orbits of thin groups, following the work of Bourgain, Gamburd, and Sarnak. We will not attempt here to obtain the sharpest or most general results in this direction, but instead focus on the simplest instances of these results which are still illustrative of the ideas involved.

One of the basic general problems in analytic number theory is to locate tuples of primes of a certain form; for instance, the famous (and still unsolved) twin prime conjecture asserts that there are infinitely many pairs in the line in which both entries are prime. In a similar spirit, one of the Landau conjectures (also still unsolved) asserts that there are infinitely many primes in the set . The Mersenne prime conjecture (also unsolved) asserts that there are infinitely many primes in the set , and so forth.

More generally, given some explicit subset in (or , if one wishes), such as an algebraic variety, one can ask the question of whether there are infinitely many integer lattice points in in which all the coefficients are simultaneously prime; let us refer to such points as *prime points*.

At this level of generality, this problem is impossibly difficult. Indeed, even the much simpler problem of deciding whether the set is non-empty (let alone containing prime points) when is a hypersurface cut out by a polynomial is essentially Hilbert’s tenth problem, which is known to be undecidable in general by Matiyasevich’s theorem. So one needs to restrict attention to a more special class of sets , in which the question of finding integer points is not so difficult. One model case is to consider *orbits* , where is a fixed lattice vector and is some discrete group that acts on somehow (e.g. might be embedded as a subgroup of the special linear group , or on the affine group ). In such a situation it is then quite easy to show that is large; for instance, will be infinite precisely when the stabiliser of in has infinite index in .

Even in this simpler setting, the question of determining whether an orbit contains infinitely prime points is still extremely difficult; indeed the three examples given above of the twin prime conjecture, Landau conjecture, and Mersenne prime conjecture are essentially of this form (possibly after some slight modification of the underlying ring , see this paper of Bourgain-Gamburd-Sarnak for details), and are all unsolved (and generally considered well out of reach of current technology). Indeed, the list of non-trivial orbits which are known to contain infinitely many prime points is quite slim; Euclid’s theorem on the infinitude of primes handles the case , Dirichlet’s theorem handles infinite arithmetic progressions , and a somewhat complicated result of Green, Tao, and Ziegler handles “non-degenerate” affine lattices in of rank two or more (such as the lattice of length arithmetic progressions), but there are few other positive results known that are not based on the above cases (though we will note the remarkable theorem of Friedlander and Iwaniec that there are infinitely many primes of the form , and the related result of Heath-Brown that there are infinitely many primes of the form , as being in a kindred spirit to the above results, though they are not explicitly associated to an orbit of a reasonable action as far as I know).

On the other hand, much more is known if one is willing to replace the primes by the larger set of *almost primes* – integers with a small number of prime factors (counting multiplicity). Specifically, for any , let us call an *-almost prime* an integer which is the product of at most primes, and possibly by the unit as well. Many of the above sorts of questions which are open for primes, are known for -almost primes for sufficiently large. For instance, with regards to the twin prime conjecture, it is a result of Chen that there are infinitely many pairs where is a prime and is a -almost prime; in a similar vein, it is a result of Iwaniec that there are infinitely many -almost primes of the form . On the other hand, it is still open for any fixed whether there are infinitely many Mersenne numbers which are -almost primes. (For the superficially similar situation with the numbers , it is in fact believed (but again unproven) that there are only finitely many -almost primes for any fixed (the Fermat prime conjecture).)

The main tool that allows one to count almost primes in orbits is sieve theory. The reason for this lies in the simple observation that in order to ensure that an integer of magnitude at most is an -almost prime, it suffices to guarantee that is not divisible by any prime less than . Thus, to create -almost primes, one can start with the integers up to some large threshold and remove (or “sieve out”) all the integers that are multiples of any prime less than . The difficulty is then to ensure that a sufficiently non-trivial quantity of integers remain after this process, for the purposes of finding points in the given set .

The most basic sieve of this form is the *sieve of Eratosthenes*, which when combined with the inclusion-exclusion principle gives the *Legendre sieve* (or *exact sieve*), which gives an exact formula for quantities such as the number of natural numbers less than or equal to that are not divisible by any prime less than or equal to a given threshold . Unfortunately, when one tries to evaluate this formula, one encounters error terms which grow exponentially in , rendering this sieve useful only for very small thresholds (of logarithmic size in ). To improve the sieve level up to a small power of such as , one has to replace the exact sieve by *upper bound sieves* and *lower bound sieves* which only seek to obtain upper or lower bounds on quantities such as , but contain a polynomial number of terms rather than an exponential number. There are a variety of such sieves, with the two most common such sieves being *combinatorial sieves* (such as the *beta sieve*), based on various combinatorial truncations of the inclusion-exclusion formula, and the *Selberg upper bound sieve*, based on upper bounds that are the square of a divisor sum. (There is also the *large sieve*, which is somewhat different in nature and based on almost orthogonality considerations, rather than on any actual sieving, to obtain upper bounds.) We will primarily work with a specific sieve in this notes, namely the beta sieve, and we will not attempt to optimise all the parameters of this sieve (which ultimately means that the almost primality parameter in our results will be somewhat large). For a more detailed study of sieve theory, see the classic text of Halberstam and Richert, or the more recent texts of Iwaniec-Kowalski or of Friedlander-Iwaniec.

Very roughly speaking, the end result of sieve theory is that excepting some degenerate and “exponentially thin” settings (such as those associated with the Mersenne primes), all the orbits which are expected to have a large number of primes, can be proven to at least have a large number of -almost primes for some finite . (Unfortunately, there is a major obstruction, known as the *parity problem*, which prevents sieve theory from lowering all the way to ; see this blog post for more discussion.) One formulation of this principle was established by Bourgain, Gamburd, and Sarnak:

Theorem 1 (Bourgain-Gamburd-Sarnak)Let be a subgroup of which is not virtually solvable. Let be a polynomial with integer coefficients obeying the followingprimitivitycondition: for any positive integer , there exists such that is coprime to . Then there exists an such that there are infinitely many with non-zero and -almost prime.

This is not the strongest version of the Bourgain-Gamburd-Sarnak theorem, but it captures the general flavour of their results. Note that the theorem immediately implies an analogous result for orbits , in which is now a polynomial from to , and one uses instead of . It is in fact conjectured that one can set here, but this is well beyond current technology. For the purpose of reaching , it is very natural to impose the primitivity condition, but as long as one is content with larger values of , it is possible to relax the primitivity condition somewhat; see the paper of Bourgain, Gamburd, and Sarnak for more discussion.

By specialising to the polynomial , we conclude as a corollary that as long as is primitive in the sense that it contains matrices with all coefficients coprime to for any given , then contains infinitely many matrices whose elements are all -almost primes for some depending only on . For further applications of these sorts of results, for instance to Appolonian packings, see the paper of Bourgain, Gamburd, and Sarnak.

It turns out that to prove Theorem 1, the Cayley expansion results in from the previous set of notes are not quite enough; one needs a more general Cayley expansion result in where is square-free but not necessarily prime. The proof of this expansion result uses the same basic methods as in the case, but is significantly more complicated technically, and we will only discuss it briefly here. As such, we do not give a complete proof of Theorem 1, but hopefully the portion of the argument presented here is still sufficient to give an impression of the ideas involved.

** — 1. Combinatorial sieving — **

In this section we set up the combinatorial sieve needed to establish Theorem 1. To motivate this sieve, let us focus first on a much simpler model problem, namely the task of estimating the number of natural numbers less than or equal to a given threshold which are not divisible by any prime less than or equal to . Note that for between and , is simply the number of primes in the interval ; but for less than , also counts some almost primes in addition to genuine primes. This quantity can be studied quite precisely by a variety of tools, such as those coming from multiplicative number theory; see for instance this paper of Granville and Soundararajan for some of the most precise results in this direction.

The quantity is easiest to estimate when is small. For instance, is simply the number of natural numbers less than , and so

Similarly, is the number of odd numbers less than , and so

Carrying this further, is the number of numbers less than that are coprime to , and so

(but note that the implied constant in the error is getting increasingly large). Continuing this analysis, it is not hard to see that

for any fixed ; note from Mertens’ theorem that

leading to the heuristic approximation

where is the Euler-Mascheroni constant. Note though that this heuristic should be treated with caution when is large; for instance, from the prime number theorem we see that we have the conflicting asymptotic

when . This is already a strong indication that one needs to pay careful attention to the error terms in this analysis. (Indeed, many false “proofs” of conjectures in analytic number theory, such as the twin prime conjecture, have been based on a cavalier attitude to such error terms, and their asymptotic behaviour under various limiting regimes.)

Let us thus work more carefully to control the error term . Write for the product of all the primes less than or equal to (this quantity is also known as the primorial of ). Then we can write

where the sum ranges over natural numbers less than , and is the greatest common divisor of and . The function is periodic of period , and is equal to on of the residue classes modulo , which leads to the crude bound

However, this error term is too large for most applications: from the prime number theorem, we see that , so the error term grows exponentially in . In particular, this estimate is only non-trivial in the regime .

One can do a little better than this by using the inclusion-exclusion principle, which in this context is also known as the *Legendre sieve*. Consider for instance , which counts the number of natural numbers coprime to . We can compute this quantity by first counting all numbers less than , then subtracting those numbers divisible by and by , and then adding back those numbers divisible by both and . A convenient way to describe this procedure in general is to introduce the Möbius function , defined to equal when is the product of distinct primes for some . The key point is that

for any natural number , where ranges over the divisors of ; indeed, this identity can be viewed as an alternate way to define the Möbius function. In particular, , leading to the Legendre identity

The inner sum can be easily estimated as

since has distinct factors, where is the number of primes less than or equal to , we conclude that

The main term here can be factorised as

leading to the following slight improvement

to (2). Note from the prime number theorem that , so this error term is asymptotically better than the one in (2); the bound here is now non-trivial in the slightly larger regime . But this is still not good enough for the purposes of counting almost primes, which would require as large as a power of .

To do better, we will replace the exact identity (3) by combinatorial truncations

of that identity, where divides and are sets to be specified later, leading to the upper bound sieve

The key point will be that and can be chosen to be only polynomially large in , rather than exponentially large, without causing too much damage to the main terms , which lead to upper and lower bounds on that remain non-trivial for moderately large values of (e.g. for some fixed ).

We now turn to the task of locating reasonably small sets obeying (6). We begin with(3), which we rewrite as

for a divisor of . One can view the divisors of as a -dimensional combinatorial cube, with the right-hand side in (9) being a sum over that cube; the idea is then to hack off various subcubes of that cube in a way that only serves to increase (for the upper bound sieve) or decrease (for the lower bound sieve) that sum, until only a relatively small portion of the cube remains.

We turn to the details. Our starting point will be the identity

whenever are primes, which follows easily from applying (3) to when divide . One can view the left-hand side of (10) as a subsum of the sum in (9), and (10) implies that this subsum is non-negative when is even and non-positive when is odd. In particular, we see that (6) will hold when is formed from the “cube” by removing some disjoint “subcubes” of the form for and odd, and similarly for but with now required to be even instead of odd.

Observe that the subcube consists precisely of those divisors of whose top prime factors are . We now have the following general inequality:

Lemma 2 (Combinatorial sieve)Let . For each natural number , let be a predicate pertaining to decreasing primes (thus is either true or false for each choice of ). Let be the set of all natural numbers which, when factored as for , is such that holds for all odd . Similarly define by requiring to be even instead of odd. Then (6) holds for all .

*Proof:* is formed from by removing those subcubes of the form for , odd, and such that holds for all odd but fails for . These subcubes are all disjoint, and so the claim for follows from the preceding discussion. Similarly for .

This gives us the upper and lower bounds (7), (8) for . To make these bounds useful, we need to choose so that the partial sums are close to

To do this, one must select the predicates carefully. The best choices for these predicates are not immediately obvious; but after much trial and error, it was discovered that one fairly efficient choice is to let be the predicate

for some moderately large parameter (we will eventually take ) and some parameter for some to be optimised in later (we will eventually take it to be almost as large as ). The use of this choice is referred to as the *beta sieve*.

Let us now estimate the errors

For sake of argument let us work with , as the case is almost identical. By the triangle inequality, we can bound this error by

where ranges over positive even integers, and denotes a sum over primes ranges over primes such that

Since and , this in particular gives the bound

From (12) we have

for all (not necessarily even); note that the case follows from the hypothesis . We can rewrite this inequality as

and hence by induction

for all . From (13) we then have

We conclude that the error (11) is bounded by

in the ; a similar argument also gives this bound in the case. The inner sum can be computed as

and thus by Mertens’ theorem (1) and the bound we have

We have thus bounded (11) by

The inner sum can be bounded by

By another of Mertens’ theorems (or by taking logarithms of (1)) one has

and so (11) is bounded by

Using the crude bound (as can be seen by considering the term in the Taylor expansion of ) we conclude the bound

If is large enough ( will suffice) then the expression is less than ; since , this leads to the bound

which after summing the geometric series becomes

(allowing implied constants to depend on ). From this bound on (11) and (5), (1) we have We thus have

Finally, if is an element of , then by (12) and the hypothesis we have

and so we have the crude upper bounds . From (7), (8) and recalling that , we thus have

If , we may optimise in by setting (in order to make the final error term much less than ), leading to the bound

whenever for some sufficiently small absolute constant .

Remark 1The bound (14) implies, among other things, that there exists an absolute constant such that the number of -almost primes less than is , which is a very weak version of the prime number theorem. Note though that the upper bound in (14) doesnotdirectly imply a corresponding upper bound on this count of -almost primes, because -almost primes are allowed to have prime factors that are less than . Indeed, a routine computation using Mertens’ theorem shows that for any fixed , the number of -almost primes less than is comparable to .

We can generalise the above argument as follows:

Exercise 1 (Beta sieve)Let be an absolutely convergent sequence of non-negative reals for . Let , , and . Let be a multiplicative function taking values between and , with for all primes . Assume the following axioms:

- (i) (Control in arithmetic progressions) For any , one has
- (ii) (Mertens type theorem) For all , one has
Conclude that there is an depending only on , and the implied constants in the above axioms, such that

whenever , and the implied constants may depend on , and the implied constants in the above axioms. (Note that (14) corresponds to the case when , , and .)

Exercise 2Suppose we have the notation and hypotheses of the preceding exercise, except that the estimate (15) is replaced by the weaker boundfor all sufficiently large . (For small , note that we still have the bound .) Show that we still have the lower bound

whenever for sufficiently small (which may depend on ), where the implied constant is now allowed to depend on . (

Hint:the main trick here is to extract out a common factor of from the analysis first, and then use the bound (16) to upper bound quantities such as .)

One can weaken the axioms somewhat and still obtain non-trivial results from the beta sieve, but this somewhat crude version of the sieve will suffice for our purposes. Another, more abstract, formalisation of the above argument (involving a construction of sets obeying (6) and a number of other desirable properties) is sometimes referred to as the fundamental lemma of sieve theory.

Exercise 3 (Twin almost primes)Let be the number of integers between and such that and are both coprime to .

- (i) Show that
if , and is a sufficiently small absolute constant.

- (ii) Show that there exists an such that there are infinitely many pairs which are both -almost primes. (Indeed, the argument here allows one to take without much effort, and by working considerably harder to optimise everything, one can lower substantially, although the parity problem mentioned earlier prevents one from taking below .)
- (iii) Establish Brun’s theorem that the sum of reciprocals of the twin primes is convergent.

Exercise 4 (Landau conjecture for almost primes)Let be the number of integers between and such that is coprime to .

- (i) Show that
if , and is a sufficiently small absolute constant. (

Hint:you will need the fact that is a quadratic residue mod if and only if , and Merten’s theorem for arithmetic progressions, which among other things asserts that .)- (ii) Show that there exists an such that there are infinitely natural numbers such that is an -almost primes.

Exercise 5Let be a polynomial with integer coefficients and degree . Assume that isprimitivein the sense that for each natural number , there exists a natural number such that is coprime to . Show that there exists an depending only on such that for all sufficiently large , there are at least natural numbers less than such that is an -almost prime.In many cases (e.g. if is irreducible) one can decrease the power of here (as in Exercise 4), by using tools such as Landau’s prime ideal theorem; see this previous blog post for some related discussion.

Remark 2The combinatorial sieve is not the only type of sieve used in sieve theory. Another popular choice is theSelberg upper bound sieve, in which the starting point is not the combinatorial inequalities (6), but rather the variantwhere the are arbitrary real parameters with , typically supported up to some level . By optimising the choice of weights , the Selberg sieve can lead to upper bounds on quantities such as which are competitive with the beta sieve (particularly when is moderately large), although it is more difficult for this sieve to produce matching lower bounds. A somewhat different type of sieve is the

large sieve, which does not upper bound or lower bound indicator functions such as directly, but rather controls the size of a function that avoids many residue classes by exploiting the properties of these residue classes, such as almost orthogonality phenomena or Fourier uncertainty principles. See this text of Friedlander and Iwaniec for a much more thorough discussion and comparison of these sieves.

** — 2. The strong approximation property — **

For any natural number , let be the obvious projection homomorphism. An easy application of Bezout’s theorem (or the Euclidean algorithm) shows that this map is surjective. From the Chinese remainder theorem, we also have whenever and are coprime.

To set up the sieve needed to establish Theorem 1, we need to understand the images of a non-virtually-solvable subgroup of . Clearly this is a subgroup of . Given that is fairly “large” (in particular, such groups can be easily seen to be Zariski-dense in ), we expect that in most cases is in fact all of . This type belief is formalised in general as the strong approximation property. We will not prove the most general instance of this property, but instead focus on the model case of for square-free, in which one can proceed by *ad hoc* elementary arguments. The general treatment of the strong approximation property was first achieved by Matthews, Vaserstein, and Weisfeiler using the classification of finite simple groups; a subsequent paper of Nori gave an alternate treatment that avoided the use of this classification.

In the previous set of notes (see Remark 2) it was already observed that for all sufficiently large primes . (Indeed, did not need to be free for this to hold; it was enough that not be virtually solvable.) To extend from the prime case to the (square-free) composite case, we will need some basic group theory, and in particular the theory of composition factors.

Define a composition series for a group to be a finite sequence

of subgroups, where each is a normal subgroup of , and the quotients are all simple. (By convention, we do not consider the trivial group to be simple.) The quotients for are referred to as the *composition factors* of this series.

Exercise 6Show that every finite group has at least one composition series.

A key fact about composition factors, known as the Jordan-Holder theorem, asserts that, up to permutation and isomorphism, they are independent of the choice of series:

Theorem 3 (Jordan-Holder theorem)Letand

be two composition series of the same group . Then there is a bijection such that for each , is isomorphic to . (In particular, and must be equal.)

*Proof:* By symmetry we may assume that . Fix . Let be the quotient map, and consider the groups for . These are an increasing family of subgroups of , with and . Since each is a normal subgroup of , we see that is a normal subgroup of . As is simple, this implies that there is a unique element of such that is trivial for and is equal to for .

Now we claim that is a bijection. Suppose this is not the case. Since , there thus exists which is not in the range of . This implies that for all . An induction on then shows that for all , and thus , contradicting the assumption that is simple.

Finally, fix , and let . Then we have for all , while and . From this and induction we see that is trivial for but isomorphic to for . (Here we are basically relying on a special case of the Zassenhaus lemma.) In particular, is isomorphic to , and the claim follows.

In view of this theorem, we can assign to each finite group a set (or more precisely, multiset) of composition factors of simple groups, which are unique up to permutation and isomorphism. This is somewhat analogous to how the fundamental theorem of arithmetic assigns to each positive integer a multiset of prime numbers, which are unique up to permutation. (Indeed, the latter can be viewed as the special case of the former in the case of cyclic groups.)

Exercise 7Show that for a prime, the composition factors of are (up to isomorphism and permutation) the cyclic group and the projective special linear group . What happens instead when or ?Also, show that the only normal subgroup of (other than the trivial group and all of ) is the center of the group. Thus, we see (in contrast with the fundamental theorem of arithmetic) that one cannot permute the composition factors arbitrarily.

Exercise 8Let be a normal subgroup of a finite group . Show that the set of composition factors of is equal to (up to isomorphism, and counting multiplicity) the union of the set of composition factors of , and the set of composition factors of . In particular, the set of composition factors of and of are subsets of the set of composition factors of (again up to isomorphism, and counting multiplicity). As another corollary, we see that the composition factors of a direct product or semidirect product of two finite groups is the union of the set of composition factors of and separately (again up to isomorphism, and counting multiplicity).

Knowing the composition factors of a group can assist in classifying its subgroups; in particular, groups which are “coprime” in the sense of having no composition factors in common are difficult to “join” together. (Interestingly, the phenomenon of “coprimality” implying “disjointness” also shows up in ergodic theory, in the theory of joinings, but we will not discuss this further here.) Here is an example of this which will be of importance in our application:

Lemma 4Let be a prime, and be a finite group which does not have a copy of amongst its composition factors. Let be a subgroup of whose projections to and are surjective. Then is all of .

*Proof:* We apply Goursat’s lemma (see Exercise 9 below). Thus if and , then are normal subgroups of respectively such that is isomorphic to . From Exercise 7 we see that is either trivial, all of , or is the center .

If is trivial, then is isomorphic to a quotient of , and thus by Exercise 8 the composition factors of are a subset of those of . But this is a contradiction, since is a composition factor of but not of . Similarly if is the center, since is then isomorphic to . So the only remaining case is when is all of . But then as surjects onto , we see that is all of and we are done.

Exercise 9 (Goursat’s lemma)Let be groups, and let be a subgroup of whose projections to are surjective. Let and . Show that are normal subgroups of , and that and are isomorphic. (Indeed, after quotienting out by , becomes a graph of such an isomorphism.) Conclude that the set of composition factors of are a subset of the union of the set of composition factors of and the set of composition factors of (up to isomorphism and counting multiplicity, as usual).

As such, we have the following satisfactory description of the images of a free group :

Corollary 5 (Strong approximation)Let be a subgroup of which is not virtually solvable. Let be an integer. Then there exists a multiple of with the following property: whenever is of the form with and distinct and coprime to , one has(after using the Chinese remainder theorem to identify with ). In particular, one has

The parameter will not actually be needed in our application, but is useful in the more general setting in which has rational coefficients instead of integer coefficients.

*Proof:* We already know that for all but finitely many primes . Let be the product of with all the eceptional primes, as well as and , thus and for all coprime to . By repeated application of Lemma 4 this implies that for any distinct primes coprime to (the key point being that the groups for primes are all non-isomorphic to each other and to by cardinality considerations).

The finite group may contain copies of amongst their composition factors for a finite number of primes coprime to ; let be the product of with all these primes. By many applications of Exercise 9, we see that the set of composition factors of are contained in the union of the set of composition factors of , and the set of composition factors of for all dividing but not . As a consequence, we see that is *not* a composition factor of for any coprime to ; by Exercise 8, is also not a composition factor of for any dividing and coprime to . By many applications of Lemma 4, we then obtain the claim.

As a simple application of the above corollary, we observe that we may reduce Theorem 1 to the case when is a free group on two generators. Indeed, if is not virtually solvable, then by the Tits alternative (Theorem 5 from Notes 6), contains a subgroup which is a free group on two generators (and in particular, continues to not be virtually solvable). Now the polynomial need not be primitive on , so we cannot deduce Theorem 1 for from its counterpart for . However, by Corollary 5 we have an integer such that

whenever with and are distinct primes coprime to .

As is primitive with respect to , we may find such that is coprime to . By translating by , we obtain a new polynomial for which is coprime to . In particular, for any , we have coprime to . By (17), this implies that for any square-free (and hence for arbitrary ), we can find with coprime to . Thus is primitive with respect to , and so we may deduce Theorem 1 for from its counterpart for .

** — 3. Sieving in thin groups — **

We can now deduce Theorem 1 from the following expander result:

Theorem 6 (Uniform expansion)Let generate a free group in . Then, as runs through the square-free integers, form a two-sided expander family.

When is restricted to be prime, this result follows from Theorem 3 from Notes 6. The extension of this theorem to non-prime is more difficult, and will be discussed later. For now, let us assume Theorem 6 and see how we can use it, together with the beta sieve, to imply Theorem 1.

As discussed in the preceding section, to show Theorem 1 we may assume without loss of generality that is a free group on two generators . Let be the generator of the associated random walk, and let be a large integer. Then will be supported on elements of whose coefficients have size , where we allow implied constants to depend on . In particular, for in this support, will be an integer of size , where we allow implied constants to depend on also. On the other hand, has an norm that decreases exponentially in (by Exercise 6 of Notes 6). If we then set for a sufficiently small absolute constant , it will then suffice to show that with probability , an element drawn from with distribution is such that is non-zero and coprime to .

It will be convenient to knock out a few exceptional primes. From Corollary 5, we may find an integer with the property that

whenever with and distinct and coprime to . As is primitive, we may find a residue class such that is coprime to . For each integer , let denote the quantity

It will suffice to show that

To do this, we will use the beta sieve. Indeed, by Exercise 2 it suffices to establish a bound of the form

for all square-free , some constant , and some multiplicative function obeying the bounds

and

for all primes .

By choice of , the quantity vanishes whenever is not coprime to . So we will set for the primes dividing , and it will suffice to establish (18) for coprime to . The left-hand side of (18) can then be expressed as

where we descend the polynomial to a polynomial in the obvious fashion. However, in view of Theorem 6 (and the random walk interpretation of expansion), we have

for some independent of . Note that

while from Corollary 5 we have

and thus

for small enough, where is defined for coprime to as

As is primitive, we have for all such ; from Corollary 5 we see that is multiplicative for such . Finally, from the Schwarz-Zippel lemma (see Exercise 23 from Notes 5) we have

and Theorem 1 follows.

Remark 3One can obtain more precise bounds on using the Lang-Weil theorem, but we will not need this result here. (Such results would however be needed if one wanted more quantitative information than Theorem 1; see the paper of Bourgain, Gamburd, and Sarnak for details.)

It remains to establish Theorem 1. In the case when is prime, this was achieved in previous sections using the ingredients of quasirandomness, product theorems, and non-concentration. In the original paper of Bourgain, Gamburd, and Sarnak, these ingredients were extendedto the square-free case by hand, which led to a fairly lengthy argument. In the subsequent paper of Varju, it was shown that each of these ingredients can in fact be more or less automatically bootstrapped from the prime case to the square-free case by using tools such as the Chinese remainder theorem (or strong approximation) to “factor” the latter case into copies of the former, thus simplifying the extension to the square-free case significantly. We will not give the full argument here, but just to convey a taste of these sorts of product arguments, we will discuss the product structure of just one of the three ingredients, namely quasirandomness. (The extension of this ingredient to the square-free setting was already observed in the Bourgain-Gamburd-Sarnak paper.)

As a consequence of Proposition 4 of Notes 3, the following claim was shown:

Proposition 7Let be a -quasirandom finite group, is a symmetric set of generators of not containing the identity of cardinality , and be such that(say) for some with , then is a two-sided -expander for some depending only on and the implied constants in the notation.

It turns out that this fact can be extended to product groups:

Proposition 8Proposition 7 continues to hold if the hypothesis that is -quasirandom is replaced with the hypothesis that for some and some finite groups , with each being -quasirandom.

The key point here is that the expansion constant does not depend on the number of groups in this factorisation.

*Proof:* (Sketch) For technical reasons it is convenient to allow to have multiplicity and to possibly contain the identity; this will require generalising the notion of Cayley graph, and of expansion in such generalised graphs.

Let be a non-constant eigenfunction of the adjacency operator, thus for some real . The objective is to prove that for some sufficiently small independent of .

The claim is trivial, so assume inductively that and that the claim is proven for all smaller values of (with a fixed choice of ). For each , we can partition the eigenspaces of the adjacency operator into those functions which are invariant in the direction, and those functions which have mean zero in each coset of . These partitions are compatible with each other as varies (basically because the operations of averaging in and averaging in commute). Thus, without loss of generality, we may assume that the eigenfunction is such that for each , either is -invariant, or has mean zero on each coset.

Suppose that was -invariant, then the eigenvalue would also persist after projecting down from to (possibly picking up the identity or some multiplicity in the process). The claim then follows from the induction hyopthesis. Similarly if was -invariant for any other value of . Thus we may assume that has mean zero in each coset.

One can show that every irreducible unitary representation of splits as a tensor product of irreducible unitary representations of the . If one lets be the subspace of spanned by and its left translates, we thus see that contains at least one such tensor product; but as every element of will have mean zero in each coset, the factors in this tensor product will all be non-trivial. Using quasirandomness, the factor will have dimension at least , and so must have dimension at least . At this point, one can use a trace formula to relate to to conclude the argument.

Exercise 10Develop the above sketch into a complete proof of the proposition.

## 16 comments

Comments feed for this article

2 March, 2012 at 11:39 am

JonA very interesting topic, thank you for these notes!

An easy typo: in section 1, when introducing you correctly link to the primorial, but instead of using the word “primorial” in your sentence there’s a URL for the PNT (which perhaps you wanted to use in the previous paragraph, which mentions it without a link).

[Corrected, thanks – T.]2 March, 2012 at 11:48 am

AnonymousSmall typo: in the second paragraph, I think you want

\{ (n_1,n_2) \in \mathbf{Z}^2 : n_2 – n_1 = 2\}, not

\{ (n_1,n_2) \in \mathbf{Z} : n_2 – n_1 = 2\}

[Corrected, thanks – T.]4 March, 2012 at 6:04 pm

David RobertsSmall typo – definition of Fermat number is missing a 2 in the stack of exponentials: 2^n instead of 2^2^n.

[Corrected, thanks – T.]5 March, 2012 at 2:04 am

Ben GreenTerry, I think the the work of Varju was a more fundamental advance than you make out. The paper of Bourgain, Gamburd and Sarnak went into the inner workings of Helfgott’s paper and did everything with instead of . That was quite painful, because of the need for a satisfactory sum-product theory for . By contrast Varju took Helfgott’s conclusion for as a black box; his argument works for , for any semisimple . Nice post!

5 March, 2012 at 10:17 am

Terence TaoFair enough; I edited the post accordingly. On the one hand, B-G-S already had the clean “Cartesian” approach to quasirandomness and non-concentration in their original paper, but the Cartesian approach to the product theorem (which is certainly the most difficult of the three ingredients to obtain) is clearly Varju’s contribution.

5 March, 2012 at 12:08 pm

petequinnProfessor Tao, small question to clarify your definition of “almost primes.”

Is 2 x 2 x 2 x 3 x 3 x 5 a 3-almost prime or a 6-almost prime?

Thanks,

Pete

5 March, 2012 at 4:18 pm

Terence TaoAs stated in the main post, prime factors are being counted with multiplicity; thus, for instance, the number you mention is an r-almost prime for any .

5 March, 2012 at 10:24 pm

CEI think there is a typo: you mean: .

(An 11-almost prime has at most 11 prime factors, so a number with 6 prime factors (with multiplicity) is an 11-almost prime).

[Corrected, thanks – T.]7 March, 2012 at 11:20 am

LewThe Landau conjecture mentioned (also attributed to Hardy & Littlewood) is possibly the “simplest” unsolved math problem, where “simplest” means shortest in first order number theory (using a suitable small set of symbols). See:

http://math.andrej.com/2006/11/04/are-small-sentences-of-peano-arithmetic-decidable/comment-page-1/

18 February, 2015 at 1:51 pm

254A, Notes 4: Some sieve theory | What's new[…] theory of expander graphs (the latter arises in the recent theory of the affine sieve, discussed in this previous blog post). However, the sieve-theoretic tools developed in this post are not particularly sensitive to how a […]

8 April, 2017 at 3:58 am

Claudeh5Hello professor Tao,

I think there is a small typo in (6): the D_+ is in fact {\mathcal D}_+

[Corrected, thanks – T.]12 August, 2018 at 4:36 am

AnonymousProfessor Tao, a small typo: in the 2nd there’s an extra n in number-theoretic.

[Corrected, thanks – T.]26 June, 2019 at 6:35 pm

Kevin BroughanHi Prof Tao,

For the definition of r-almost prime should it be x^{1/r} rather than x^{1/(r+1)}: the given sufficient condition is p\mid n\implies p\ge x^{1/r} so \Omega(n)=s\implies x\ge n\ge x^{s/r} so s\le r ? or am I missing something!

27 June, 2019 at 9:40 am

Terence TaoYou should use strict inequalities instead of non-strict inequalities here to gain an extra (the “integrality gap”). If is the product of prime factors, each strictly larger than , then , hence , hence .

24 January, 2023 at 12:53 am

LimithI don’t see how identity (10) holds with condition . In fact, the LHS of (10) becomes . I don’t see how you obtain (10) unless are exactly the bigger primes appearing in .

26 January, 2023 at 6:39 pm

Terence TaoThanks for pointing this out. The condition should be replaced by the more complicated condition that where is coprime to . The main feature of this identity remains, namely that the right-hand side is non-negative for even and non-positive for odd .