In this final set of course notes, we discuss how (a generalisation of) the expansion results obtained in the preceding notes can be used for some number-theoretic applications, and in particular to locate almost primes inside orbits of thin groups, following the work of Bourgain, Gamburd, and Sarnak. We will not attempt here to obtain the sharpest or most general results in this direction, but instead focus on the simplest instances of these results which are still illustrative of the ideas involved.
One of the basic general problems in analytic number theory is to locate tuples of primes of a certain form; for instance, the famous (and still unsolved) twin prime conjecture asserts that there are infinitely many pairs in the line
in which both entries are prime. In a similar spirit, one of the Landau conjectures (also still unsolved) asserts that there are infinitely many primes in the set
. The Mersenne prime conjecture (also unsolved) asserts that there are infinitely many primes in the set
, and so forth.
More generally, given some explicit subset in
(or
, if one wishes), such as an algebraic variety, one can ask the question of whether there are infinitely many integer lattice points
in
in which all the coefficients
are simultaneously prime; let us refer to such points as prime points.
At this level of generality, this problem is impossibly difficult. Indeed, even the much simpler problem of deciding whether the set is non-empty (let alone containing prime points) when
is a hypersurface
cut out by a polynomial
is essentially Hilbert’s tenth problem, which is known to be undecidable in general by Matiyasevich’s theorem. So one needs to restrict attention to a more special class of sets
, in which the question of finding integer points is not so difficult. One model case is to consider orbits
, where
is a fixed lattice vector and
is some discrete group that acts on
somehow (e.g.
might be embedded as a subgroup of the special linear group
, or on the affine group
). In such a situation it is then quite easy to show that
is large; for instance,
will be infinite precisely when the stabiliser of
in
has infinite index in
.
Even in this simpler setting, the question of determining whether an orbit contains infinitely prime points is still extremely difficult; indeed the three examples given above of the twin prime conjecture, Landau conjecture, and Mersenne prime conjecture are essentially of this form (possibly after some slight modification of the underlying ring
, see this paper of Bourgain-Gamburd-Sarnak for details), and are all unsolved (and generally considered well out of reach of current technology). Indeed, the list of non-trivial orbits
which are known to contain infinitely many prime points is quite slim; Euclid’s theorem on the infinitude of primes handles the case
, Dirichlet’s theorem handles infinite arithmetic progressions
, and a somewhat complicated result of Green, Tao, and Ziegler handles “non-degenerate” affine lattices in
of rank two or more (such as the lattice of length
arithmetic progressions), but there are few other positive results known that are not based on the above cases (though we will note the remarkable theorem of Friedlander and Iwaniec that there are infinitely many primes of the form
, and the related result of Heath-Brown that there are infinitely many primes of the form
, as being in a kindred spirit to the above results, though they are not explicitly associated to an orbit of a reasonable action as far as I know).
On the other hand, much more is known if one is willing to replace the primes by the larger set of almost primes – integers with a small number of prime factors (counting multiplicity). Specifically, for any , let us call an
-almost prime an integer which is the product of at most
primes, and possibly by the unit
as well. Many of the above sorts of questions which are open for primes, are known for
-almost primes for
sufficiently large. For instance, with regards to the twin prime conjecture, it is a result of Chen that there are infinitely many pairs
where
is a prime and
is a
-almost prime; in a similar vein, it is a result of Iwaniec that there are infinitely many
-almost primes of the form
. On the other hand, it is still open for any fixed
whether there are infinitely many Mersenne numbers
which are
-almost primes. (For the superficially similar situation with the numbers
, it is in fact believed (but again unproven) that there are only finitely many
-almost primes for any fixed
(the Fermat prime conjecture).)
The main tool that allows one to count almost primes in orbits is sieve theory. The reason for this lies in the simple observation that in order to ensure that an integer of magnitude at most
is an
-almost prime, it suffices to guarantee that
is not divisible by any prime less than
. Thus, to create
-almost primes, one can start with the integers up to some large threshold
and remove (or “sieve out”) all the integers that are multiples of any prime
less than
. The difficulty is then to ensure that a sufficiently non-trivial quantity of integers remain after this process, for the purposes of finding points in the given set
.
The most basic sieve of this form is the sieve of Eratosthenes, which when combined with the inclusion-exclusion principle gives the Legendre sieve (or exact sieve), which gives an exact formula for quantities such as the number of natural numbers less than or equal to
that are not divisible by any prime less than or equal to a given threshold
. Unfortunately, when one tries to evaluate this formula, one encounters error terms which grow exponentially in
, rendering this sieve useful only for very small thresholds
(of logarithmic size in
). To improve the sieve level up to a small power of
such as
, one has to replace the exact sieve by upper bound sieves and lower bound sieves which only seek to obtain upper or lower bounds on quantities such as
, but contain a polynomial number of terms rather than an exponential number. There are a variety of such sieves, with the two most common such sieves being combinatorial sieves (such as the beta sieve), based on various combinatorial truncations of the inclusion-exclusion formula, and the Selberg upper bound sieve, based on upper bounds that are the square of a divisor sum. (There is also the large sieve, which is somewhat different in nature and based on
almost orthogonality considerations, rather than on any actual sieving, to obtain upper bounds.) We will primarily work with a specific sieve in this notes, namely the beta sieve, and we will not attempt to optimise all the parameters of this sieve (which ultimately means that the almost primality parameter
in our results will be somewhat large). For a more detailed study of sieve theory, see the classic text of Halberstam and Richert, or the more recent texts of Iwaniec-Kowalski or of Friedlander-Iwaniec.
Very roughly speaking, the end result of sieve theory is that excepting some degenerate and “exponentially thin” settings (such as those associated with the Mersenne primes), all the orbits which are expected to have a large number of primes, can be proven to at least have a large number of -almost primes for some finite
. (Unfortunately, there is a major obstruction, known as the parity problem, which prevents sieve theory from lowering
all the way to
; see this blog post for more discussion.) One formulation of this principle was established by Bourgain, Gamburd, and Sarnak:
Theorem 1 (Bourgain-Gamburd-Sarnak) Let
be a subgroup of
which is not virtually solvable. Let
be a polynomial with integer coefficients obeying the following primitivity condition: for any positive integer
, there exists
such that
is coprime to
. Then there exists an
such that there are infinitely many
with
non-zero and
-almost prime.
This is not the strongest version of the Bourgain-Gamburd-Sarnak theorem, but it captures the general flavour of their results. Note that the theorem immediately implies an analogous result for orbits , in which
is now a polynomial from
to
, and one uses
instead of
. It is in fact conjectured that one can set
here, but this is well beyond current technology. For the purpose of reaching
, it is very natural to impose the primitivity condition, but as long as one is content with larger values of
, it is possible to relax the primitivity condition somewhat; see the paper of Bourgain, Gamburd, and Sarnak for more discussion.
By specialising to the polynomial , we conclude as a corollary that as long as
is primitive in the sense that it contains matrices with all coefficients coprime to
for any given
, then
contains infinitely many matrices whose elements are all
-almost primes for some
depending only on
. For further applications of these sorts of results, for instance to Appolonian packings, see the paper of Bourgain, Gamburd, and Sarnak.
It turns out that to prove Theorem 1, the Cayley expansion results in from the previous set of notes are not quite enough; one needs a more general Cayley expansion result in
where
is square-free but not necessarily prime. The proof of this expansion result uses the same basic methods as in the
case, but is significantly more complicated technically, and we will only discuss it briefly here. As such, we do not give a complete proof of Theorem 1, but hopefully the portion of the argument presented here is still sufficient to give an impression of the ideas involved.
— 1. Combinatorial sieving —
In this section we set up the combinatorial sieve needed to establish Theorem 1. To motivate this sieve, let us focus first on a much simpler model problem, namely the task of estimating the number of natural numbers less than or equal to a given threshold
which are not divisible by any prime less than or equal to
. Note that for
between
and
,
is simply the number of primes in the interval
; but for
less than
,
also counts some almost primes in addition to genuine primes. This quantity can be studied quite precisely by a variety of tools, such as those coming from multiplicative number theory; see for instance this paper of Granville and Soundararajan for some of the most precise results in this direction.
The quantity is easiest to estimate when
is small. For instance,
is simply the number of natural numbers less than
, and so
Similarly, is the number of odd numbers less than
, and so
Carrying this further, is the number of numbers less than
that are coprime to
, and so
(but note that the implied constant in the error is getting increasingly large). Continuing this analysis, it is not hard to see that
for any fixed ; note from Mertens’ theorem that
leading to the heuristic approximation
where is the Euler-Mascheroni constant. Note though that this heuristic should be treated with caution when
is large; for instance, from the prime number theorem we see that we have the conflicting asymptotic
when . This is already a strong indication that one needs to pay careful attention to the error terms in this analysis. (Indeed, many false “proofs” of conjectures in analytic number theory, such as the twin prime conjecture, have been based on a cavalier attitude to such error terms, and their asymptotic behaviour under various limiting regimes.)
Let us thus work more carefully to control the error term . Write
for the product of all the primes less than or equal to
(this quantity is also known as the primorial of
). Then we can write
where the sum ranges over natural numbers less than
, and
is the greatest common divisor of
and
. The function
is periodic of period
, and is equal to
on
of the residue classes modulo
, which leads to the crude bound
However, this error term is too large for most applications: from the prime number theorem, we see that , so the error term grows exponentially in
. In particular, this estimate is only non-trivial in the regime
.
One can do a little better than this by using the inclusion-exclusion principle, which in this context is also known as the Legendre sieve. Consider for instance , which counts the number of natural numbers
coprime to
. We can compute this quantity by first counting all numbers less than
, then subtracting those numbers divisible by
and by
, and then adding back those numbers divisible by both
and
. A convenient way to describe this procedure in general is to introduce the Möbius function
, defined to equal
when
is the product of
distinct primes for some
. The key point is that
for any natural number , where
ranges over the divisors of
; indeed, this identity can be viewed as an alternate way to define the Möbius function. In particular,
, leading to the Legendre identity
The inner sum can be easily estimated as
since has
distinct factors, where
is the number of primes less than or equal to
, we conclude that
The main term here can be factorised as
leading to the following slight improvement
to (2). Note from the prime number theorem that , so this error term is asymptotically better than the one in (2); the bound here is now non-trivial in the slightly larger regime
. But this is still not good enough for the purposes of counting almost primes, which would require
as large as a power of
.
To do better, we will replace the exact identity (3) by combinatorial truncations
of that identity, where divides
and
are sets to be specified later, leading to the upper bound sieve
The key point will be that and
can be chosen to be only polynomially large in
, rather than exponentially large, without causing too much damage to the main terms
, which lead to upper and lower bounds on
that remain non-trivial for moderately large values of
(e.g.
for some fixed
).
We now turn to the task of locating reasonably small sets obeying (6). We begin with(3), which we rewrite as
for a divisor of
. One can view the divisors of
as a
-dimensional combinatorial cube, with the right-hand side in (9) being a sum over that cube; the idea is then to hack off various subcubes of that cube in a way that only serves to increase (for the upper bound sieve) or decrease (for the lower bound sieve) that sum, until only a relatively small portion of the cube remains.
We turn to the details. Our starting point will be the identity
whenever are primes, which follows easily from applying (3) to
when
divide
. One can view the left-hand side of (10) as a subsum of the sum in (9), and (10) implies that this subsum is non-negative when
is even and non-positive when
is odd. In particular, we see that (6) will hold when
is formed from the “cube”
by removing some disjoint “subcubes” of the form
for
and
odd, and similarly for
but with
now required to be even instead of odd.
Observe that the subcube consists precisely of those divisors
of
whose top
prime factors are
. We now have the following general inequality:
Lemma 2 (Combinatorial sieve) Let
. For each natural number
, let
be a predicate pertaining to
decreasing primes
(thus
is either true or false for each choice of
). Let
be the set of all natural numbers
which, when factored as
for
, is such that
holds for all odd
. Similarly define
by requiring
to be even instead of odd. Then (6) holds for all
.
Proof: is formed from
by removing those subcubes of the form
for
,
odd, and such that
holds for all odd
but fails for
. These subcubes are all disjoint, and so the claim for
follows from the preceding discussion. Similarly for
.
This gives us the upper and lower bounds (7), (8) for . To make these bounds useful, we need to choose
so that the partial sums
are close to
To do this, one must select the predicates carefully. The best choices for these predicates are not immediately obvious; but after much trial and error, it was discovered that one fairly efficient choice is to let
be the predicate
for some moderately large parameter (we will eventually take
) and some parameter
for some
to be optimised in later (we will eventually take it to be almost as large as
). The use of this choice is referred to as the beta sieve.
Let us now estimate the errors
For sake of argument let us work with , as the
case is almost identical. By the triangle inequality, we can bound this error by
where ranges over positive even integers, and
denotes a sum over primes
ranges over primes such that
Since and
, this in particular gives the bound
From (12) we have
for all (not necessarily even); note that the case
follows from the hypothesis
. We can rewrite this inequality as
and hence by induction
for all . From (13) we then have
We conclude that the error (11) is bounded by
in the ; a similar argument also gives this bound in the
case. The inner sum can be computed as
and thus by Mertens’ theorem (1) and the bound we have
We have thus bounded (11) by
The inner sum can be bounded by
By another of Mertens’ theorems (or by taking logarithms of (1)) one has
and so (11) is bounded by
Using the crude bound (as can be seen by considering the
term in the Taylor expansion of
) we conclude the bound
If is large enough (
will suffice) then the expression
is less than
; since
, this leads to the bound
which after summing the geometric series becomes
(allowing implied constants to depend on ). From this bound on (11) and (5), (1) we have We thus have
Finally, if is an element of
, then by (12) and the hypothesis
we have
and so we have the crude upper bounds . From (7), (8) and recalling that
, we thus have
If , we may optimise in
by setting
(in order to make the final error term much less than
), leading to the bound
whenever for some sufficiently small absolute constant
.
Remark 1 The bound (14) implies, among other things, that there exists an absolute constant
such that the number of
-almost primes less than
is
, which is a very weak version of the prime number theorem. Note though that the upper bound in (14) does not directly imply a corresponding upper bound on this count of
-almost primes, because
-almost primes are allowed to have prime factors that are less than
. Indeed, a routine computation using Mertens’ theorem shows that for any fixed
, the number of
-almost primes less than
is comparable to
.
We can generalise the above argument as follows:
Exercise 1 (Beta sieve) Let
be an absolutely convergent sequence of non-negative reals for
. Let
,
, and
. Let
be a multiplicative function taking values between
and
, with
for all primes
. Assume the following axioms:
- (i) (Control in arithmetic progressions) For any
, one has
- (ii) (Mertens type theorem) For all
, one has
Conclude that there is an
depending only on
, and the implied constants in the above axioms, such that
whenever
, and the implied constants may depend on
, and the implied constants in the above axioms. (Note that (14) corresponds to the case when
,
, and
.)
Exercise 2 Suppose we have the notation and hypotheses of the preceding exercise, except that the estimate (15) is replaced by the weaker bound
for all sufficiently large
. (For small
, note that we still have the bound
.) Show that we still have the lower bound
whenever
for sufficiently small
(which may depend on
), where the implied constant is now allowed to depend on
. (Hint: the main trick here is to extract out a common factor of
from the analysis first, and then use the bound (16) to upper bound quantities such as
.)
One can weaken the axioms somewhat and still obtain non-trivial results from the beta sieve, but this somewhat crude version of the sieve will suffice for our purposes. Another, more abstract, formalisation of the above argument (involving a construction of sets obeying (6) and a number of other desirable properties) is sometimes referred to as the fundamental lemma of sieve theory.
Exercise 3 (Twin almost primes) Let
be the number of integers
between
and
such that
and
are both coprime to
.
- (i) Show that
if
, and
is a sufficiently small absolute constant.
- (ii) Show that there exists an
such that there are infinitely many pairs
which are both
-almost primes. (Indeed, the argument here allows one to take
without much effort, and by working considerably harder to optimise everything, one can lower
substantially, although the parity problem mentioned earlier prevents one from taking
below
.)
- (iii) Establish Brun’s theorem that the sum of reciprocals of the twin primes is convergent.
Exercise 4 (Landau conjecture for almost primes) Let
be the number of integers
between
and
such that
is coprime to
.
- (i) Show that
if
, and
is a sufficiently small absolute constant. (Hint: you will need the fact that
is a quadratic residue mod
if and only if
, and Merten’s theorem for arithmetic progressions, which among other things asserts that
.)
- (ii) Show that there exists an
such that there are infinitely natural numbers
such that
is an
-almost primes.
Exercise 5 Let
be a polynomial with integer coefficients and degree
. Assume that
is primitive in the sense that for each natural number
, there exists a natural number
such that
is coprime to
. Show that there exists an
depending only on
such that for all sufficiently large
, there are at least
natural numbers
less than
such that
is an
-almost prime.
In many cases (e.g. if
is irreducible) one can decrease the power of
here (as in Exercise 4), by using tools such as Landau’s prime ideal theorem; see this previous blog post for some related discussion.
Remark 2 The combinatorial sieve is not the only type of sieve used in sieve theory. Another popular choice is the Selberg upper bound sieve, in which the starting point is not the combinatorial inequalities (6), but rather the variant
where the
are arbitrary real parameters with
, typically supported up to some level
. By optimising the choice of weights
, the Selberg sieve can lead to upper bounds on quantities such as
which are competitive with the beta sieve (particularly when
is moderately large), although it is more difficult for this sieve to produce matching lower bounds. A somewhat different type of sieve is the large sieve, which does not upper bound or lower bound indicator functions such as
directly, but rather controls the size of a function that avoids many residue classes by exploiting the
properties of these residue classes, such as almost orthogonality phenomena or Fourier uncertainty principles. See this text of Friedlander and Iwaniec for a much more thorough discussion and comparison of these sieves.
— 2. The strong approximation property —
For any natural number , let
be the obvious projection homomorphism. An easy application of Bezout’s theorem (or the Euclidean algorithm) shows that this map is surjective. From the Chinese remainder theorem, we also have
whenever
and
are coprime.
To set up the sieve needed to establish Theorem 1, we need to understand the images of a non-virtually-solvable subgroup
of
. Clearly this is a subgroup of
. Given that
is fairly “large” (in particular, such groups can be easily seen to be Zariski-dense in
), we expect that in most cases
is in fact all of
. This type belief is formalised in general as the strong approximation property. We will not prove the most general instance of this property, but instead focus on the model case of
for
square-free, in which one can proceed by ad hoc elementary arguments. The general treatment of the strong approximation property was first achieved by Matthews, Vaserstein, and Weisfeiler using the classification of finite simple groups; a subsequent paper of Nori gave an alternate treatment that avoided the use of this classification.
In the previous set of notes (see Remark 2) it was already observed that for all sufficiently large primes
. (Indeed,
did not need to be free for this to hold; it was enough that
not be virtually solvable.) To extend from the prime case to the (square-free) composite case, we will need some basic group theory, and in particular the theory of composition factors.
Define a composition series for a group to be a finite sequence
of subgroups, where each is a normal subgroup of
, and the quotients
are all simple. (By convention, we do not consider the trivial group to be simple.) The quotients
for
are referred to as the composition factors of this series.
Exercise 6 Show that every finite group has at least one composition series.
A key fact about composition factors, known as the Jordan-Holder theorem, asserts that, up to permutation and isomorphism, they are independent of the choice of series:
Theorem 3 (Jordan-Holder theorem) Let
and
be two composition series of the same group
. Then there is a bijection
such that for each
,
is isomorphic to
. (In particular,
and
must be equal.)
Proof: By symmetry we may assume that . Fix
. Let
be the quotient map, and consider the groups
for
. These are an increasing family of subgroups of
, with
and
. Since each
is a normal subgroup of
, we see that
is a normal subgroup of
. As
is simple, this implies that there is a unique element
of
such that
is trivial for
and
is equal to
for
.
Now we claim that is a bijection. Suppose this is not the case. Since
, there thus exists
which is not in the range of
. This implies that
for all
. An induction on
then shows that
for all
, and thus
, contradicting the assumption that
is simple.
Finally, fix , and let
. Then we have
for all
, while
and
. From this and induction we see that
is trivial for
but isomorphic to
for
. (Here we are basically relying on a special case of the Zassenhaus lemma.) In particular,
is isomorphic to
, and the claim follows.
In view of this theorem, we can assign to each finite group a set (or more precisely, multiset) of composition factors of simple groups, which are unique up to permutation and isomorphism. This is somewhat analogous to how the fundamental theorem of arithmetic assigns to each positive integer a multiset of prime numbers, which are unique up to permutation. (Indeed, the latter can be viewed as the special case of the former in the case of cyclic groups.)
Exercise 7 Show that for
a prime, the composition factors of
are (up to isomorphism and permutation) the cyclic group
and the projective special linear group
. What happens instead when
or
?
Also, show that the only normal subgroup of
(other than the trivial group and all of
) is the center
of the group. Thus, we see (in contrast with the fundamental theorem of arithmetic) that one cannot permute the composition factors arbitrarily.
Exercise 8 Let
be a normal subgroup of a finite group
. Show that the set of composition factors of
is equal to (up to isomorphism, and counting multiplicity) the union of the set of composition factors of
, and the set of composition factors of
. In particular, the set of composition factors of
and of
are subsets of the set of composition factors of
(again up to isomorphism, and counting multiplicity). As another corollary, we see that the composition factors of a direct product
or semidirect product
of two finite groups
is the union of the set of composition factors of
and
separately (again up to isomorphism, and counting multiplicity).
Knowing the composition factors of a group can assist in classifying its subgroups; in particular, groups which are “coprime” in the sense of having no composition factors in common are difficult to “join” together. (Interestingly, the phenomenon of “coprimality” implying “disjointness” also shows up in ergodic theory, in the theory of joinings, but we will not discuss this further here.) Here is an example of this which will be of importance in our application:
Lemma 4 Let
be a prime, and
be a finite group which does not have a copy of
amongst its composition factors. Let
be a subgroup of
whose projections to
and
are surjective. Then
is all of
.
Proof: We apply Goursat’s lemma (see Exercise 9 below). Thus if and
, then
are normal subgroups of
respectively such that
is isomorphic to
. From Exercise 7 we see that
is either trivial, all of
, or is the center
.
If is trivial, then
is isomorphic to a quotient of
, and thus by Exercise 8 the composition factors of
are a subset of those of
. But this is a contradiction, since
is a composition factor of
but not of
. Similarly if
is the center, since
is then isomorphic to
. So the only remaining case is when
is all of
. But then as
surjects onto
, we see that
is all of
and we are done.
Exercise 9 (Goursat’s lemma) Let
be groups, and let
be a subgroup of
whose projections to
are surjective. Let
and
. Show that
are normal subgroups of
, and that
and
are isomorphic. (Indeed, after quotienting out by
,
becomes a graph of such an isomorphism.) Conclude that the set of composition factors of
are a subset of the union of the set of composition factors of
and the set of composition factors of
(up to isomorphism and counting multiplicity, as usual).
As such, we have the following satisfactory description of the images of a free group
:
Corollary 5 (Strong approximation) Let
be a subgroup of
which is not virtually solvable. Let
be an integer. Then there exists a multiple
of
with the following property: whenever
is of the form
with
and
distinct and coprime to
, one has
(after using the Chinese remainder theorem to identify
with
). In particular, one has
The parameter will not actually be needed in our application, but is useful in the more general setting in which
has rational coefficients instead of integer coefficients.
Proof: We already know that for all but finitely many primes
. Let
be the product of
with all the eceptional primes, as well as
and
, thus
and
for all
coprime to
. By repeated application of Lemma 4 this implies that
for any distinct primes
coprime to
(the key point being that the groups
for primes
are all non-isomorphic to each other and to
by cardinality considerations).
The finite group may contain copies of
amongst their composition factors for a finite number of primes
coprime to
; let
be the product of
with all these primes. By many applications of Exercise 9, we see that the set of composition factors of
are contained in the union of the set of composition factors of
, and the set of composition factors of
for all
dividing
but not
. As a consequence, we see that
is not a composition factor of
for any
coprime to
; by Exercise 8,
is also not a composition factor of
for any
dividing
and
coprime to
. By many applications of Lemma 4, we then obtain the claim.
As a simple application of the above corollary, we observe that we may reduce Theorem 1 to the case when is a free group on two generators. Indeed, if
is not virtually solvable, then by the Tits alternative (Theorem 5 from Notes 6),
contains a subgroup
which is a free group on two generators (and in particular, continues to not be virtually solvable). Now the polynomial
need not be primitive on
, so we cannot deduce Theorem 1 for
from its counterpart for
. However, by Corollary 5 we have an integer
such that
whenever with
and
are distinct primes coprime to
.
As is primitive with respect to
, we may find
such that
is coprime to
. By translating
by
, we obtain a new polynomial
for which
is coprime to
. In particular, for any
, we have
coprime to
. By (17), this implies that for any square-free
(and hence for arbitrary
), we can find
with
coprime to
. Thus
is primitive with respect to
, and so we may deduce Theorem 1 for
from its counterpart for
.
— 3. Sieving in thin groups —
We can now deduce Theorem 1 from the following expander result:
Theorem 6 (Uniform expansion) Let
generate a free group
in
. Then, as
runs through the square-free integers,
form a two-sided expander family.
When is restricted to be prime, this result follows from Theorem 3 from Notes 6. The extension of this theorem to non-prime
is more difficult, and will be discussed later. For now, let us assume Theorem 6 and see how we can use it, together with the beta sieve, to imply Theorem 1.
As discussed in the preceding section, to show Theorem 1 we may assume without loss of generality that is a free group on two generators
. Let
be the generator of the associated random walk, and let
be a large integer. Then
will be supported on elements of
whose coefficients have size
, where we allow implied constants to depend on
. In particular, for
in this support,
will be an integer of size
, where we allow implied constants to depend on
also. On the other hand,
has an
norm that decreases exponentially in
(by Exercise 6 of Notes 6). If we then set
for a sufficiently small absolute constant
, it will then suffice to show that with probability
, an element
drawn from
with distribution
is such that
is non-zero and coprime to
.
It will be convenient to knock out a few exceptional primes. From Corollary 5, we may find an integer with the property that
whenever with
and
distinct and coprime to
. As
is primitive, we may find a residue class
such that
is coprime to
. For each integer
, let
denote the quantity
It will suffice to show that
To do this, we will use the beta sieve. Indeed, by Exercise 2 it suffices to establish a bound of the form
for all square-free , some constant
, and some multiplicative function
obeying the bounds
and
for all primes .
By choice of , the quantity
vanishes whenever
is not coprime to
. So we will set
for the primes
dividing
, and it will suffice to establish (18) for
coprime to
. The left-hand side of (18) can then be expressed as
where we descend the polynomial to a polynomial
in the obvious fashion. However, in view of Theorem 6 (and the random walk interpretation of expansion), we have
for some independent of
. Note that
while from Corollary 5 we have
and thus
for small enough, where
is defined for
coprime to
as
As is primitive, we have
for all such
; from Corollary 5 we see that
is multiplicative for such
. Finally, from the Schwarz-Zippel lemma (see Exercise 23 from Notes 5) we have
and Theorem 1 follows.
Remark 3 One can obtain more precise bounds on
using the Lang-Weil theorem, but we will not need this result here. (Such results would however be needed if one wanted more quantitative information than Theorem 1; see the paper of Bourgain, Gamburd, and Sarnak for details.)
It remains to establish Theorem 1. In the case when is prime, this was achieved in previous sections using the ingredients of quasirandomness, product theorems, and non-concentration. In the original paper of Bourgain, Gamburd, and Sarnak, these ingredients were extendedto the square-free case by hand, which led to a fairly lengthy argument. In the subsequent paper of Varju, it was shown that each of these ingredients can in fact be more or less automatically bootstrapped from the prime case to the square-free case by using tools such as the Chinese remainder theorem (or strong approximation) to “factor” the latter case into copies of the former, thus simplifying the extension to the square-free case significantly. We will not give the full argument here, but just to convey a taste of these sorts of product arguments, we will discuss the product structure of just one of the three ingredients, namely quasirandomness. (The extension of this ingredient to the square-free setting was already observed in the Bourgain-Gamburd-Sarnak paper.)
As a consequence of Proposition 4 of Notes 3, the following claim was shown:
Proposition 7 Let
be a
-quasirandom finite group,
is a symmetric set of generators of
not containing the identity of cardinality
, and
be such that
(say) for some
with
, then
is a two-sided
-expander for some
depending only on
and the implied constants in the
notation.
It turns out that this fact can be extended to product groups:
Proposition 8 Proposition 7 continues to hold if the hypothesis that
is
-quasirandom is replaced with the hypothesis that
for some
and some finite groups
, with each
being
-quasirandom.
The key point here is that the expansion constant does not depend on the number
of groups in this factorisation.
Proof: (Sketch) For technical reasons it is convenient to allow to have multiplicity and to possibly contain the identity; this will require generalising the notion of Cayley graph, and of expansion in such generalised graphs.
Let be a non-constant eigenfunction of the adjacency operator, thus
for some real
. The objective is to prove that
for some sufficiently small
independent of
.
The claim is trivial, so assume inductively that
and that the claim is proven for all smaller values of
(with a fixed choice of
). For each
, we can partition the eigenspaces of the adjacency operator into those functions which are invariant in the
direction, and those functions which have mean zero in each coset of
. These partitions are compatible with each other as
varies (basically because the operations of averaging in
and averaging in
commute). Thus, without loss of generality, we may assume that the eigenfunction
is such that for each
, either
is
-invariant, or has mean zero on each
coset.
Suppose that was
-invariant, then the eigenvalue
would also persist after projecting
down from
to
(possibly picking up the identity or some multiplicity in the process). The claim then follows from the induction hyopthesis. Similarly if
was
-invariant for any other value of
. Thus we may assume that
has mean zero in each
coset.
One can show that every irreducible unitary representation of splits as a tensor product of irreducible unitary representations of the
. If one lets
be the subspace of
spanned by
and its left translates, we thus see that
contains at least one such tensor product; but as every element of
will have mean zero in each
coset, the factors in this tensor product will all be non-trivial. Using quasirandomness, the
factor will have dimension at least
, and so
must have dimension at least
. At this point, one can use a trace formula to relate
to
to conclude the argument.
Exercise 10 Develop the above sketch into a complete proof of the proposition.
16 comments
Comments feed for this article
2 March, 2012 at 11:39 am
Jon
A very interesting topic, thank you for these notes!
you correctly link to the primorial, but instead of using the word “primorial” in your sentence there’s a URL for the PNT (which perhaps you wanted to use in the previous paragraph, which mentions it without a link).
An easy typo: in section 1, when introducing
[Corrected, thanks – T.]
2 March, 2012 at 11:48 am
Anonymous
Small typo: in the second paragraph, I think you want
\{ (n_1,n_2) \in \mathbf{Z}^2 : n_2 – n_1 = 2\}, not
\{ (n_1,n_2) \in \mathbf{Z} : n_2 – n_1 = 2\}
[Corrected, thanks – T.]
4 March, 2012 at 6:04 pm
David Roberts
Small typo – definition of Fermat number is missing a 2 in the stack of exponentials: 2^n instead of 2^2^n.
[Corrected, thanks – T.]
5 March, 2012 at 2:04 am
Ben Green
Terry, I think the the work of Varju was a more fundamental advance than you make out. The paper of Bourgain, Gamburd and Sarnak went into the inner workings of Helfgott’s paper and did everything with
instead of
. That was quite painful, because of the need for a satisfactory sum-product theory for
. By contrast Varju took Helfgott’s conclusion for
as a black box; his argument works for
, for any semisimple
. Nice post!
5 March, 2012 at 10:17 am
Terence Tao
Fair enough; I edited the post accordingly. On the one hand, B-G-S already had the clean “Cartesian” approach to quasirandomness and non-concentration in their original paper, but the Cartesian approach to the product theorem (which is certainly the most difficult of the three ingredients to obtain) is clearly Varju’s contribution.
5 March, 2012 at 12:08 pm
petequinn
Professor Tao, small question to clarify your definition of “almost primes.”
Is 2 x 2 x 2 x 3 x 3 x 5 a 3-almost prime or a 6-almost prime?
Thanks,
Pete
5 March, 2012 at 4:18 pm
Terence Tao
As stated in the main post, prime factors are being counted with multiplicity; thus, for instance, the number you mention is an r-almost prime for any
.
5 March, 2012 at 10:24 pm
CE
I think there is a typo: you mean:
.
(An 11-almost prime has at most 11 prime factors, so a number with 6 prime factors (with multiplicity) is an 11-almost prime).
[Corrected, thanks – T.]
7 March, 2012 at 11:20 am
Lew
The Landau conjecture mentioned (also attributed to Hardy & Littlewood) is possibly the “simplest” unsolved math problem, where “simplest” means shortest in first order number theory (using a suitable small set of symbols). See:
http://math.andrej.com/2006/11/04/are-small-sentences-of-peano-arithmetic-decidable/comment-page-1/
18 February, 2015 at 1:51 pm
254A, Notes 4: Some sieve theory | What's new
[…] theory of expander graphs (the latter arises in the recent theory of the affine sieve, discussed in this previous blog post). However, the sieve-theoretic tools developed in this post are not particularly sensitive to how a […]
8 April, 2017 at 3:58 am
Claudeh5
Hello professor Tao,
I think there is a small typo in (6): the D_+ is in fact {\mathcal D}_+
[Corrected, thanks – T.]
12 August, 2018 at 4:36 am
Anonymous
Professor Tao, a small typo: in the 2nd there’s an extra n in number-theoretic.
[Corrected, thanks – T.]
26 June, 2019 at 6:35 pm
Kevin Broughan
Hi Prof Tao,
For the definition of r-almost prime should it be x^{1/r} rather than x^{1/(r+1)}: the given sufficient condition is p\mid n\implies p\ge x^{1/r} so \Omega(n)=s\implies x\ge n\ge x^{s/r} so s\le r ? or am I missing something!
27 June, 2019 at 9:40 am
Terence Tao
You should use strict inequalities instead of non-strict inequalities here to gain an extra
(the “integrality gap”). If
is the product of
prime factors, each strictly larger than
, then
, hence
, hence
.
24 January, 2023 at 12:53 am
Limith
I don’t see how identity (10) holds with condition
. In fact, the LHS of (10) becomes
. I don’t see how you obtain (10) unless
are exactly the
bigger primes appearing in
.
26 January, 2023 at 6:39 pm
Terence Tao
Thanks for pointing this out. The condition
should be replaced by the more complicated condition that
where
is coprime to
. The main feature of this identity remains, namely that the right-hand side is non-negative for even
and non-positive for odd
.