Many problems in non-multiplicative prime number theory can be recast as *sieving* problems. Consider for instance the problem of counting the number of pairs of twin primes contained in for some large ; note that the claim that for arbitrarily large is equivalent to the twin prime conjecture. One can obtain this count by any of the following variants of the sieve of Eratosthenes:

- Let be the set of natural numbers in . For each prime , let be the union of the residue classes and . Then is the cardinality of the
*sifted set*. - Let be the set of primes in . For each prime , let be the residue class . Then is the cardinality of the
*sifted set*. - Let be the set of primes in . For each prime , let be the residue class . Then is the cardinality of the
*sifted set*. - Let be the set . For each prime , let be the residue class Then is the cardinality of the
*sifted set*.

Exercise 1Develop similar sifting formulations of the other three Landau problems.

In view of these sieving interpretations of number-theoretic problems, it becomes natural to try to estimate the size of sifted sets for various finite sets of integers, and subsets of integers indexed by primes dividing some squarefree natural number (which, in the above examples, would be the product of all primes up to ). As we see in the above examples, the sets in applications are typically the union of one or more residue classes modulo , but we will work at a more abstract level of generality here by treating as more or less arbitrary sets of integers, without caring too much about the arithmetic structure of such sets.

It turns out to be conceptually more natural to replace sets by functions, and to consider the more general the task of estimating sifted *sums*

for some finitely supported sequence of non-negative numbers; the previous combinatorial sifting problem then corresponds to the indicator function case . (One could also use other index sets here than the integers if desired; for much of sieve theory the index set and its subsets are treated as abstract sets, so the exact arithmetic structure of these sets is not of primary importance.)

Continuing with twin primes as a running example, we thus have the following sample sieving problem:

Problem 2 (Sieving problem for twin primes)Let , and let denote the number of natural numbers which avoid the residue classes for all primes . In other words, we havewhere , is the product of all the primes strictly less than (we omit itself for minor technical reasons), and is the union of the residue classes . Obtain upper and lower bounds on which are as strong as possible in the asymptotic regime where goes to infinity and the

sifting levelgrows with (ideally we would like to grow as fast as ).

From the preceding discussion we know that the number of twin prime pairs in is equal to , if is not a perfect square; one also easily sees that the number of twin prime pairs in is at least , again if is not a perfect square. Thus we see that a sufficiently good answer to Problem 2 would resolve the twin prime conjecture, particularly if we can get the sifting level to be as large as .

We return now to the general problem of estimating (1). We may expand

where (with the convention that ). We thus arrive at the Legendre sieve identity

Specialising to the case of an indicator function , we recover the inclusion-exclusion formula

Such exact sieving formulae are already satisfactory for controlling sifted sets or sifted sums when the amount of sieving is relatively small compared to the size of . For instance, let us return to the running example in Problem 2 for some . Observe that each in this example consists of residue classes modulo , where is defined to equal when and when is odd. By the Chinese remainder theorem, this implies that for each , consists of residue classes modulo . Using the basic bound

for any and any residue class , we conclude that

for any , where is the multiplicative function

Since and there are at most primes dividing , we may crudely bound , thus

Also, the number of divisors of is at most . From the Legendre sieve (3), we thus conclude that

We can factorise the main term to obtain

This is compatible with the heuristic

coming from the equidistribution of residues principle (Section 3 of Supplement 4), bearing in mind (from the modified Cramér model, see Section 1 of Supplement 4) that we expect this heuristic to become inaccurate when becomes very large. We can simplify the right-hand side of (7) by recalling the twin prime constant

(see equation (7) from Supplement 4); note that

so from Mertens’ third theorem (Theorem 42 from Notes 1) one has

as . Bounding crudely by , we conclude in particular that

when with . This is somewhat encouraging for the purposes of getting a sufficiently good answer to Problem 2 to resolve the twin prime conjecture, but note that is currently far too small: one needs to get as large as before one is counting twin primes, and currently can only get as large as .

The problem is that the number of terms in the Legendre sieve (3) basically grows exponentially in , and so the error terms in (4) accumulate to an unacceptable extent once is significantly larger than . An alternative way to phrase this problem is that the estimate (4) is only expected to be truly useful in the regime ; on the other hand, the moduli appearing in (3) can be as large as , which grows exponentially in by the prime number theorem.

To resolve this problem, it is thus natural to try to *truncate* the Legendre sieve, in such a way that one only uses information about the sums for a relatively small number of divisors of , such as those which are below a certain threshold . This leads to the following general sieving problem:

Problem 3 (General sieving problem)Let be a squarefree natural number, and let be a set of divisors of . For each prime dividing , let be a set of integers, and define for all (with the convention that ). Suppose that is an (unknown) finitely supported sequence of non-negative reals, whose sumsare known for all . What are the best upper and lower bounds one can conclude on the quantity (1)?

Here is a simple example of this type of problem (corresponding to the case , , , , and ):

Exercise 4Let be a finitely supported sequence of non-negative reals such that , , and . Show thatand give counterexamples to show that these bounds cannot be improved in general, even when is an indicator function sequence.

Problem 3 is an example of a linear programming problem. By using linear programming duality (as encapsulated by results such as the Hahn-Banach theorem, the separating hyperplane theorem, or the Farkas lemma), we can rephrase the above problem in terms of *upper and lower bound sieves*:

Theorem 5 (Dual sieve problem)Let be as in Problem 3. We assume that Problem 3 isfeasible, in the sense that there exists at least one finitely supported sequence of non-negative reals obeying the constraints in that problem. Define an (normalised)upper bound sieveto be a function of the formfor some coefficients , and obeying the pointwise lower bound

for all (in particular is non-negative). Similarly, define a (normalised)

lower bound sieveto be a function of the formfor some coefficients , and obeying the pointwise upper bound

for all . Thus for instance and are (trivially) upper bound sieves and lower bound sieves respectively.

- (i) The supremal value of the quantity (1), subject to the constraints in Problem 3, is equal to the infimal value of the quantity , as ranges over all upper bound sieves.
- (ii) The infimal value of the quantity (1), subject to the constraints in Problem 3, is equal to the supremal value of the quantity , as ranges over all lower bound sieves.

*Proof:* We prove part (i) only, and leave part (ii) as an exercise. Let be the supremal value of the quantity (1) given the constraints in Problem 3, and let be the infimal value of . We need to show that .

We first establish the easy inequality . If the sequence obeys the constraints in Problem 3, and is an upper bound sieve, then

and hence (by the non-negativity of and )

taking suprema in and infima in we conclude that .

Now suppose for contradiction that , thus for some real number . We will argue using the hyperplane separation theorem; one can also proceed using one of the other duality results mentioned above. (See this previous blog post for some discussion of the connections between these various forms of linear duality.) Consider the affine functional

on the vector space of finitely supported sequences of reals. On the one hand, since , this functional is positive for every sequence obeying the constraints in Problem 3. Next, let be the space of affine functionals of the form

for some real numbers , some non-negative function which is a finite linear combination of the for , and some non-negative . This is a closed convex cone in a finite-dimensional vector space ; note also that lies in . Suppose first that , thus we have a representation of the form

for any finitely supported sequence . Comparing coefficients, we conclude that

for any (i.e., is an upper bound sieve), and also

and thus , a contradiction. Thus lies outside of . But then by the hyperplane separation theorem, we can find an affine functional on that is non-negative on and negative on . By duality, such an affine functional takes the form for some finitely supported sequence and (indeed, can be supported on a finite set consisting of a single representative for each atom of the finite -algebra generated by the ). Since is non-negative on the cone , we see (on testing against multiples of the functionals or ) that the and are non-negative, and that for all ; thus is feasible for Problem 3. Since is negative on , we see that

and thus , giving the desired contradiction.

Exercise 6Prove part (ii) of the above theorem.

Exercise 7Show that the infima and suprema in the above theorem are actually attained (so one can replace “infimal” and “supremal” by “minimal” and “maximal” if desired).

Exercise 8What are the optimal upper and lower bound sieves for Exercise 4?

In the case when consists of *all* the divisors of , we see that the Legendre sieve is both the optimal upper bound sieve and the optimal lower bound sieve, regardless of what the quantities are. However, in most cases of interest, will only be some strict subset of the divisors of , and there will be a gap between the optimal upper and lower bounds.

Observe that a sequence of real numbers will form an upper bound sieve if one has the inequalities

and

for all ; we will refer to such sequences as *upper bound sieve coefficients*. (Conversely, if the sets are in “general position” in the sense that every set of the form for is non-empty, we see that every upper bound sieve arises from a sequence of upper bound sieve coefficients.) Similarly, a sequence of real numbers will form a lower bound sieve if one has the inequalities

and

for all with ; we will refer to such sequences as *lower bound sieve coefficients*.

Exercise 9 (Brun pure sieve)Let be a squarefree number, and a non-negative integer. Show that the sequence defined bywhere (in contrast with the rest of this set of notes) denotes the number of prime factors of , is a sequence of upper bound sieve coefficients for even , and a sequence of lower bound sieve coefficients for odd . Deduce the Bonferroni inequalities

when is odd, whenever one is in the situation of Problem 3 (and contains all with ). The resulting upper and lower bound sieves are sometimes known as

Brun pure sieves. The Legendre sieve can be viewed as the limiting case when .

In many applications the sums in (9) take the form

for some quantity independent of , some multiplicative function with , and some remainder term whose effect is expected to be negligible on average if is restricted to be small, e.g. less than a threshold ; note for instance that (5) is of this form if for some fixed (note from the divisor bound, Lemma 23 of Notes 1, that if ). We are thus led to the following idealisation of the sieving problem, in which the remainder terms are ignored:

Problem 10 (Idealised sieving)Let (we refer to as thesifting leveland as thelevel of distribution), let be a multiplicative function with , and let . How small can one make the quantityfor a sequence of upper bound sieve coefficients, and how large can one make the quantity

Thus, for instance, the trivial upper bound sieve and the trivial lower bound sieve show that (14) can equal and (15) can equal . Of course, one hopes to do better than these trivial bounds in many situations; usually one can improve the upper bound quite substantially, but improving the lower bound is significantly more difficult, particularly when is large compared with .

If the remainder terms in (13) are indeed negligible on average for , then one expects the upper and lower bounds in Problem 3 to essentially be the optimal bounds in (14) and (15) respectively, multiplied by the normalisation factor . Thus Problem 10 serves as a good model problem for Problem 3, in which all the arithmetic content of the original sieving problem has been abstracted into two parameters and a multiplicative function . In many applications, will be approximately on the average for some fixed , known as the *sieve dimension*; for instance, in the twin prime sieving problem discussed above, the sieve dimension is . The larger one makes the level of distribution compared to , the more choices one has for the upper and lower bound sieves; it is thus of interest to obtain equidistribution estimates such as (13) for as large as possible. When the sequence is of arithmetic origin (for instance, if it is the von Mangoldt function ), then estimates such as the Bombieri-Vinogradov theorem, Theorem 17 from Notes 3, turn out to be particularly useful in this regard; in other contexts, the required equidistribution estimates might come from other sources, such as homogeneous dynamics, or the theory of expander graphs (the latter arises in the recent theory of the *affine sieve*, discussed in this previous blog post). However, the sieve-theoretic tools developed in this post are not particularly sensitive to *how* a certain level of distribution is attained, and are generally content to use sieve axioms such as (13) as “black boxes”.

In some applications one needs to modify Problem 10 in various technical ways (e.g. in altering the product , the set , or the definition of an upper or lower sieve coefficient sequence), but to simplify the exposition we will focus on the above problem without such alterations.

As the exercise below (or the heuristic (7)) suggests, the “natural” size of (14) and (15) is given by the quantity (so that the natural size for Problem 3 is ):

Exercise 11Let be as in Problem 10, and set .

- (i) Show that the quantity (14) is always at least when is a sequence of upper bound sieve coefficients. Similarly, show that the quantity (15) is always at most when is a sequence of lower bound sieve coefficients. (
Hint:compute the expected value of when is a random factor of chosen according to a certain probability distribution depending on .)- (ii) Show that (14) and (15) can both attain the value of when . (
Hint:translate the Legendre sieve to this setting.)

The problem of finding good sequences of upper and lower bound sieve coefficients in order to solve problems such as Problem 10 is one of the core objectives of sieve theory, and has been intensively studied. This is more of an optimisation problem rather than a genuinely number theoretic problem; however, the optimisation problem is sufficiently complicated that it has not been solved exactly or even asymptotically, except in a few special cases. (It can be reduced to a optimisation problem involving multilinear integrals of certain unknown functions of several variables, but this problem is rather difficult to analyse further; see these lecture notes of Selberg for further discussion.) But while we do not yet have a definitive solution to this problem in general, we do have a number of good general-purpose upper and lower bound sieve coefficients that give fairly good values for (14), (15), often coming within a constant factor of the idealised value , and which work well for sifting levels as large as a small power of the level of distribution . Unfortunately, we also know of an important limitation to the sieve, known as the *parity problem*, that prevents one from taking as large as while still obtaining non-trivial lower bounds; as a consequence, sieve theory is not able, on its own, to sift out primes for such purposes as establishing the twin prime conjecture. However, it is still possible to use these sieves, in conjunction with additional tools, to produce various types of primes or prime patterns in some cases; examples of this include the theorem of Ben Green and myself in which an upper bound sieve is used to demonstrate the existence of primes in arbitrarily long arithmetic progressions, or the more recent theorem of Zhang in which an upper bound sieve (among other things) was used to demonstrate the existence of infinitely many pairs of primes whose difference was bounded. In such arguments, the upper bound sieve was used not so much to count the primes or prime patterns directly, but to serve instead as a sort of “container” to efficiently envelop such prime patterns; when used in such a manner, the upper bound sieves are sometimes known as *enveloping sieves*. If the original sequence was supported on primes, then the enveloping sieve can be viewed as a “smoothed out indicator function” that is concentrated on *almost primes*, which in this context refers to numbers with no small prime factors.

In a somewhat different direction, it can be possible in some cases to break the parity barrier by assuming additional equidistribution axioms on the sequence than just (13), in particular controlling certain *bilinear* sums involving rather than just *linear* sums of the . This approach was in particular pursued by Friedlander and Iwaniec, leading to their theorem that there are infinitely many primes of the form .

The study of sieves is an immense topic; see for instance the recent 527-page text by Friedlander and Iwaniec. We will limit attention to two sieves which give good general-purpose results, if not necessarily the most optimal ones:

- (i) The
*beta sieve*(or*Rosser-Iwaniec sieve*), which is a modification of the classical combinatorial sieve of Brun. (A collection of sieve coefficients is called*combinatorial*if its coefficients lie in .) The beta sieve is a family of upper and lower bound combinatorial sieves, and are particularly useful for efficiently sieving out all primes up to a parameter from a set of integers of size , in the regime where is moderately large, leading to what is sometimes known as the fundamental lemma of sieve theory. - (ii) The
*Selberg upper bound sieve*, which is a general-purpose sieve that can serve both as an upper bound sieve for classical sieving problems, as well as an enveloping sieve for sets such as the primes. (One can also convert the Selberg upper bound sieve into a lower bound sieve in a number of ways, but we will only touch upon this briefly.) A key advantage of the Selberg sieve is that, due to the “quadratic” nature of the sieve, the difficult optimisation problem in Problem 10 is replaced with a much more tractable quadratic optimisation problem, which can often be solved for exactly.

Remark 12It is possible to compose two sieves together, for instance by using the observation that the product of two upper bound sieves is again an upper bound sieve, or that the product of an upper bound sieve and a lower bound sieve is a lower bound sieve. Such a composition of sieves is useful in some applications, for instance if one wants to apply the fundamental lemma as a “preliminary sieve” to sieve out small primes, but then use a more precise sieve like the Selberg sieve to sieve out medium primes. We will see an example of this in later notes, when we discuss the linear beta-sieve.

We will also briefly present the (arithmetic) *large sieve*, which gives a rather different approach to Problem 3 in the case that each consists of some number (typically a large number) of residue classes modulo , and is powered by the (analytic) large sieve inequality of the preceding section. As an application of these methods, we will utilise the Selberg upper bound sieve as an enveloping sieve to establish Zhang’s theorem on bounded gaps between primes. Finally, we give an informal discussion of the *parity barrier* which gives some heuristic limitations on what sieve theory is able to accomplish with regards to counting prime patters such as twin primes.

These notes are only an introduction to the vast topic of sieve theory; more detailed discussion can be found in the Friedlander-Iwaniec text, in these lecture notes of Selberg, and in many further texts.

** — 1. Combinatorial sieving and the fundamental lemma of sieve theory — **

We begin with a discusion of *combinatorial upper and lower bound sieves*, in which the sieve coefficients take values in . These sieves, first introduced by Brun, can be viewed as truncations of the Legendre sieve (which corresponds to the choice of coefficients for all ). The Legendre sieve, and more generally the Brun pure sieves in Exercise 9, give basic examples of combinatorial sieves.

We loosely follow the discussion of Friedlander and Iwaniec. The combinatorial sieves may be motivated by observing the *Buchstab identity*

for any and any collection of sets for . This identity reflects the basic fact that if does lie in , then there is a unique prime such that lies in , but does not lie in for any .

There is an analogous identity for the function :

Exercise 13For any arithmetic function , show thatHow is this identity related to (16)?

The Buchstab identity (16) allows one to convert upper bound sieves into lower bound sieves and vice versa. Indeed, if is a family of lower bound sieves for each , then (16) tells us that

is an upper bound sieve; similarly, if is a family of upper bound sieves for each , then

is a lower bound sieve. In terms of sieve coefficients, we see that if are lower bound sieve coefficients for each , then

is a sequence of upper bound sieve coefficients (where denotes the largest prime factor of ). Similarly, if are upper bound sieve coefficients for each , then

is a sequence of lower bound sieve coefficients.

One can iterate this procedure to produce an alternating sequence of upper and lower bound sieves. If one iterates out the Buchstab formula completely, one ends up back at the Legendre sieve. However, one can truncate this iteration by using the trivial lower bound sieve of for some of the primes . For instance, suppose one seeks an upper bound for . Applying (16), we only save some of the summands, say those which obey some predicate to be chosen later. For the remaining summands, we use the trivial lower bound sieve of , giving

For the surviving summands, we apply (16) again. With the sign change, the trivial lower bound sieve is not applicable, so we do not discard any further summands and arrive at

For the summands in the second sum on the right, we apply (16) once again. We once again have a favourable sign and can use the trivial lower bound sieve of to discard some of the resulting summands, keeping only those triples that obey some predicate , giving

Since one cannot have an infinite descent of primes, this process will eventually terminate to produce an upper bound sieve. A similar argument (but with the signs reversed) gives a lower bound sieve. We formalise this discussion as follows:

Proposition 14 (General combinatorial sieve)Let . For each natural number , let be a predicate pertaining to a sequence of decreasing primes, thus is either true or false for a given choice of such primes. Let (resp. ) denote the set of which, when factored as for , are such that holds for all odd (resp. even) . Thusand

- (i) For any sets , is an upper bound sieve, and is a lower bound sieve. In fact, we have the more precise identities
and

where denotes the least prime factor of and, for each natural number , is the collection of products with such that holds for all with the same parity as , but fails for . Thus for instance

and

- (ii) The sequences and are upper and lower sieve coefficient sequences respectively.
- (iii) For any multiplicative function , one has the identities

Note that the Legendre sieve is the special case of the combinatorial sieve when all the predicates are true for all choices of inputs (i.e. one does not ever utilise the trivial lower bound sieve); more generally, the Brun pure sieve with parameter corresponds to the case when is always true for and always false for . The predicates for odd are only used for the upper bound sieve and not the lower bound sieve; conversely, the predicates for even are only used for the lower bound sieve and not the upper bound sieve.

Exercise 15Prove Proposition 14.

Exercise 16Interpret the sieves in Exercise 8 as combinatorial sieves. What are the predicates in this case?

We now use the above proposition to attack Problem 10 for a given choice of parameters and ; we will focus on the regime when

for some , thus we are sifting out primes up to a small power of the sieve level . To do this, one needs to select the predicates in such a fashion that the sets only consist of divisors that are less than or equal to . There are a number of ways to achieve this, but experience has shown that a good general-purpose choice in this regard is to define to be the predicate

to be chosen later. The combinatorial sieve with this choice of predicate is known as the *beta sieve* or *Rosser-Iwaniec sieve* at sieve level . Observe that if lies in for some and , then at least one of or will hold, and hence since . If , we have the same conclusion thanks to (19) since . Thus under the hypotheses (21), (19), the beta sieves are indeed upper and lower bound sieves at level .

Now we estimate the quantities (14), (15) for the beta sieve coefficients for some multiplicative function with for all . In order to actually compute asymptotics, we assume the Mertens-like axiom

whenever , where and some fixed , which we refer to as the *sieve dimension*. From Mertens’ theorems, we see that these axioms will in particular be obeyed if

and

for all primes and some fixed depending only on .

Let be as above. From (17), (18), (22) we see that the quantities (14), (15) are both of the form

We now estimate the error term; our estimates will be somewhat crude in order to simplify the calculations, but this will already suffice for many applications. First observe that if then

We conclude that the condition is automatically satisfied when , and so the summand in (23) can be restricted to the range .

Next, note that if for some , then from (20) we have

for all with the same parity as , which implies (somewhat crudely) that

for all , regardless of parity (note that the case follows since ). We rearrange this inequality as

which iterates to give

On the other hand, from the definition of we have

which leads to a lower bound on :

so in particular

We can thus estimate (23) somewhat crudely as

We can estimate

thanks to (22) and the crude bound coming from the Taylor expansion of . If we choose large enough so that

(note the left-hand side decays like as ), we thus can estimate (23) by

which sums to

We conclude

Lemma 17 (Fundamental lemma of sieve theory)Let . If for some , then there exist combinatorial upper and lower sieve coefficients supported in such thatfor any multiplicative function with for all primes , obeying the bounds (22).

Informally, the fundamental lemma shows that one can get exponentially close to the Legendre sieve limit of as long as the sieve level is a large power of the sifting range .

*Proof:* Without loss of generality we may assume that is sufficiently large depending on , since for small we may simply reduce (for the purposes of the upper bound sieve) or use the trivial lower bound sieve . But then we may find obeying (24) and less than , and the claim follows.

Exercise 18Improve the error in (25) to for any . (It is possible to show that this error term is essentially best possible, at least in the linear dimension case , by using asymptotics for rough numbers (Exercise 28 of Supplement 4), but we will not do so here.)

The fundamental lemma is often used as a preliminary sifting device to remove small primes, in preparation for a more sophisticated sieve to deal with medium primes. But one can of course use it directly. Indeed, thanks to Theorem 5, we now have an adequate answer to Problem 3 in the regime where the sieve dimension is fixed and the sifting level is small compared to the level of distribution:

Corollary 19 (Fundamental lemma, applied directly)Let for some , and suppose that is a multiplicative function obeying the bounds (22) for all , where . For each , let be a set of integers, and let is a finitely supported sequence of non-negative reals such thatfor all square-free , where and , and . Then

We now specialise to the twin prime problem in Problem 2. Here, , and where for odd , with . In particular we may take . We set for some fixed , and for some quantity going sufficiently slowly to infinity, so . From (8) we have

and from (5) and the divisor bound we have

for any , as . From the fundamental lemma, we conclude that

If we let the term in decay sufficiently slowly, then the final error term in the above expression can be absorbed into the second error term, and so

Setting , we conclude in particular that

which implies from the sieve of Eratosthenes that there are pairs of twin primes in . This implies a famous theorem of Brun:

Exercise 20 (Brun’s theorem)Show that the sum is convergent.

Alternatively, if one applies (26) with a sufficiently large natural number, we see that

for all sufficiently large . This implies that there are numbers in with the property that are both not divisible by any prime less than or equal to . In particular (if ), and both have at most prime factors. We conclude a version of the twin prime conjecture for almost primes:

Corollary 21There is an absolute constant such that there are infinitely many natural numbers such that are each the product of at most primes.

Exercise 22By going through the beta sieve estimates carefully, show that one can take in the above corollary. (One can do substantially better if one uses more sophisticated sieve estimates, and of course Chen’s theorem tells us that can be taken as low as ; we will review the proof of Chen’s theorem in subsequent notes.)

In the above corollary, both elements of the pair are permitted to be almost prime rather than prime. It is possible to use the fundamental lemma to improve upon this by forcing (for instance) to be prime. To do this, we run the twin prime sieve differently; rather than start with all integers and sieve out two residue classes modulo each prime , we instead start with the primes and sieve out one residue class modulo . More precisely, we consider the quantity

where is the residue class and

(say). Observe that the sum is only as large as at worst for even . For odd , we expect from the prime number theorem in arithmetic progressions that

for some small . Thus it is natural to apply Corollary 19 with and , so that . From the Bombieri-Vinogradov theorem (see Theorem 17 of Notes 3), we see that

for any fixed . We thus take for some small fixed and , and apply Corollary 19 to conclude that

For a sufficiently large fixed constant, we conclude from Mertens’ theorem that

for large enough. The contribution for those for which is a power of a prime, rather than a prime, is easily seen to be negligible, and we conclude that there are primes less than such that has no factors less than , and thus is the product of at most primes. We thus have the following improvement of Corollary 21:

Proposition 23There is an absolute constant such that there are infinitely many primes such that is the product of at most primes.

More generally, the fundamental lemma (combined with equidistribution results such as the Bombieri-Vinogradov theorem) lets one prove analogues of various unsolved conjectures about primes, in which one either (a) is content with upper bounds of the right order of magnitude, or (b) is willing to replace primes with a suitable notion of almost prime. We give some examples in the exercises below.

Exercise 24 (Approximations to the prime -tuples conjecture)Let be an admissible -tuple of integers for some , thus the are all distinct, and for any , the number of residue classes occupied by is never equal to . Let denote the singular series

- (i) Show that for all sufficiently large . (This should be compared with the Hardy-Littlewood prime tuples conjecture as , see Exercise 13 of Supplement 4.) Thus the fundamental lemma tells us that the prime tuples conjecture is only off from the truth by a multiplicative factor of , at least as far as upper bounds are concerned.
- (ii) Show that if is sufficiently large depending on , then there are infinitely many such that are each the product of at most prime factors.
- (iii) Strengthen (ii) to include the requirement that (say) is prime.

Exercise 25 (Approximations to the even Goldbach conjecture)If is an even natural number, define the singular series

- (i) Show that the number of ways to write as the sum of two primes is if is sufficiently large. (This should be compared with the prediction of as , see Exercise 12 of Supplement 4.)
- (ii) Show that if is a sufficiently large natural number, then every sufficiently large even number is expressible as the sum of two natural numbers , each of which are the product of at most prime factors.
- (iii) Strengthen (ii) to include the requirement that is prime.

Exercise 26 (Approximations to Legendre’s conjecture)

- (i) Show that for any , the number of primes in the interval is at most .
- (ii) Show that if is a sufficiently large natural number, then for all sufficiently large , there is at least one natural number between and that is the product of at most primes.

Exercise 27 (Approximations to Landau’s fourth problem)Define the singular serieswhere is the number of residue classes such that .

- (i) Show that the number of primes of the form with is for sufficiently large . (This should be compared with the prediction , see Exercise 15 of Supplement 4.)
- (ii) Show that if is a sufficiently large natural number, then there are an infinite number of numbers of the form that are the product of at most primes.

** — 2. The Selberg upper bound sieve — **

We now turn to the *Selberg upper bound sieve*, which is a simple but remarkably useful general-purpose upper bound sieve. We again follow the discussion of Friedlander and Iwaniec.

The idea of the sieve comes from the following trivial observation: if is a squarefree number, is a set of integers for each , and are arbitrary real numbers with , then the function

is an upper bound sieve, since it is clearly non-negative and equals outside of . Equivalently, the sequence

for is a sequence of upper bound sieve coefficients, where is the least common multiple of . If we set and assume that the are only non-zero for , then this sequence of sieve coefficients will only be non-zero on . We will refer to sieves (and sieve coefficients) constructed by this method as *Selberg upper bound sieves* (or *Selberg upper bound sieve coefficients*); they are also referred to as sieves in the literature.

A key advantage of the Selberg sieve is that the coefficients for are completely unconstrained within the real numbers, and the sieve depends in a quadratic fashion on these coefficients, so the problem of optimising the Selberg sieve for a given application usually reduces to an unconstrained quadratic optimisation problem, which can often be solved exactly in many cases. Of course, the optimal Selberg sieve need not be the globally optimal sieve amongst *all* upper bound sieves; but in practice the Selberg sieve tends to give quite good results even if it might not be the globally optimal choice.

Selberg sieves can be used for many purposes, but we begin with the classical sieve problem posed in Problem 10. To avoid some degeneracies we will assume that

for all . Restricting to upper bound sieve coefficients of the Selberg form, the quantity (14) becomes

This is a quadratic form in the coefficients ; our task is to minimise this amongst all choices of coefficients with , with vanishing for .

Observe that any can be expressed as , with , in which case . By multiplicativity, we may thus write (30) as

which we can rearrange as as

The inner sum almost factorises as a square, but we have the coprimality constraint to deal with. The standard trick for dealing with this constraint is Möbius inversion:

inserting this and writing , , we can now write (30) as

which now does factor as a sum of squares:

Writing , this becomes

and is defined on factors of by the Dirichlet convolution

Thus is the multiplicative function with

for all ; we extend by zero outside of the factors of . In particular is non-negative.

The coefficients are a transform of the original coefficients . For instance, if was replaced by a single prime , then the coefficients would be related to the coefficients by the transform

which is inverted as

More generally, we have the following inversion formula:

Exercise 28 (Möbius-type inversion formula)Verify that

From this formula, we see in particular that is supported on if and only if is supported on . The constraint can now be written in terms of the coefficients as

Our task is now to minimise the quadratic form (31) amongst all coefficients supported on obeying (34). The Cauchy-Schwarz inequality gives the lower bound

where , with equality precisely when for . Inserting this back into (33), we arrive at the optimised Selberg sieve coefficients

and the quantity (14) is equal to .

We have a pleasant size bound on the coefficients :

Lemma 29We have for all .

*Proof:* Writing , we have

but

since one can compute that . The claim follows.

The upper bound sieve coefficients in (28) is then bounded by

where is the third divisor function. Applying Theorem 5, we thus have a general upper bound for Problem 3:

Theorem 30 (Selberg sieve upper bound)Let be a squarefree natural number. For each prime dividing , let be a set of integers. Let be a multiplicative function with for all , and define the multiplicative function by (32), extending by zero outside of the factors of . Let for some . Suppose that is a finitely supported sequence of non-negative reals, such that

Remark 31To compare this bound against the “expected” value of , observe from Euler products that

The quantity can often be computed in practice using standard methods for controlling sums of multiplicative functions, such as Theorem 27 of Notes 1. We illustrate this with the following sample application of the Selberg sieve:

Theorem 32 (Sieving an interval)Let be fixed natural numbers, and let . For each prime number , let be the union of residue classes modulo , where for all , and for all . Then for any , one haswhenever is sufficiently large depending on , where is the singular series

*Proof:* We apply Theorem 30 with , , , and . We set and . For any , is the union of residue classes modulo , and thus by (36) we have

and in particular

From Theorem 30, we thus can upper bound the left-hand side of (37) by

By the divisor bound and the choice of we have

so it will suffice to show (after adjusting ) that

But from Theorem 27(ii) of Notes 1 (with replaced by ), the left-hand side is

where

Since , we see that , and the claim follows (after adjusting appropriately).

The upper bound in (37) is off by a factor of about from what one expects to be the truth. The following exercises give some quick applications of Theorem 32.

Exercise 33 (Brun-Titchmarsh inequality)If , and is sufficiently large depending on , establish the Brun-Titchmarsh inequalityfor all , where is the number of primes less than . More generally, if , is a primitive residue class, and is sufficiently large depending on , establish the Brun-Titchmarsh inequality

for all , where is the number of primes less than that lie in . Comparing this with the prime number theorem in arithmetic progressions, we see that the Brun-Titchmarsh inequality is off from the truth by a factor of two when ; however, the Brun-Titchmarsh inequality is applicable even when is extremely large compared with , whereas the prime number theorem gives no non-trivial bound in this regime. The can in fact be deleted in these inequalities, a result of Montgomery and Vaughan (using a refinement of the large sieve, discussed below).

Exercise 34 (Upper bound for -tuple conjecture)If is an admissible -tuple (as in Exercise 24), and let be the singular series (27). Show thatas . (This should be compared with the prime tuples conjecture, which predicts

as .)

As with the fundamental lemma, one can use the Selberg sieve together with the Bombieri-Vinogradov theorem to sift out a set of primes, rather than a set of intervals:

Exercise 35 (Sieving primes)Let be fixed natural numbers, and let . For each prime number , let be the union of primitive residue classes modulo , where for all , and for all . Then for any , one haswhere is the singular series

(You will need Exercise 23 of Notes 3 to control the terms arising from the remainder term.) If one assumes the Elliott-Halberstam conjecture (see Exercise 22 of Notes 3), show that the factor of here can be lowered to .

Exercise 36Show that the quantity in (39) may be replaced by , which is a superior bound for . (It remains an open question to improve upon the bound without assuming any further unproven conjectures.) Assuming the Elliott-Halberstam conjecture, show that one can replace instead by .

We have approached the analysis of the Selberg sieve as a “discrete” optimisation problem, in which one is trying to optimise the parameters or that are indexed by the discrete set . It is however also instructive to consider a “continuous” version of the optimisation problem, in which the coefficients or are described by a continuous function of or ; this is a slightly more restrictive class of sieves, but turns out to lead to numerical bounds that come very close to the discrete optimum, and allow for the use of methods from calculus of variations to come into play when using the Selberg sieve in more sophisticated ways. (See these lecture notes of Selberg for further discussion of how the continuous problem serves as a good approximation of the discrete one.) We give a basic illustration of this by presenting an alternate proof of (a very slightly weaker form of) Theorem 32 in which we do not use the choice (35) for the sieve weights , but instead choose weights of the form

for some smooth compactly supported function that vanishes on , and which equals at the origin; compare this smoothly truncated version of the Möbius function with the more abrupt truncations of the Möbius function that arise in combinatorial sieving. Such cases of the Selberg sieve were studied by several authors, most notably Goldston, Pintz, and Yildirim (although they made a polynomial on , rather than a smooth function). As in the previous proof of Theorem 32, we take , , , , , and . The quantity (30) (or (14)) is then equal to

We can remove the constraints , since the summands vanish outside of this range. To estimate this sum we use an argument related to that used to prove Proposition 9 from Notes 2. We first use Fourier inversion, applied to the function , to obtain a representation of the form

for some smooth, rapidly decreasing function , thus

and

and so by Fubini’s theorem (using the divisor bound ) we may write (30) as

and then by Euler products and the definition of we may factorise this as

We have the crude upper bound

which by the asymptotic for (see equation (22) of Notes 1) implies that

From this and the rapid decrease of , we see that the contribution of the cases or (say) to (44) is for any . Thus we can write (44) as

Since , we can factor

where is the local Euler factor

The point of this factorisation is that, thanks to a routine but somewhat tedious Taylor expansion, one has the asymptotics

for any complex with imaginary part less than in magnitude, and hence (by the Cauchy integral formula)

for any complex with imaginary part less than in magnitude. By the fundamental theorem of calculus, we thus have

when are real. Also, observe from (38) that

and hence

From this and the standard asymptotic for close to , we have

and thus we can write (44) as

From the rapid decrease of , we may now remove the constraints on and write this as

To handle the weight , we observe from dividing (42) by and then differentiating times that

and hence by Fubini’s theorem

Using the Gamma function identity for any with , we conclude from another application of Fubini’s theorem that

Note from the support of that we can restrict the integral to . Putting this all together, we see that with the choice of weights (41), the expression (14) takes the form

Recall that we are requiring to equal , which after applications of the fundamental theorem of calculus (or integration by parts) is equivalent to the constraint

Since , we conclude from Cauchy-Schwarz that

with equality occurring when . Equality is not quite attainable with smooth, but by a mollification one can select an smoothly supported on (say) with and

for any . The expression (14) now takes the form

Meanwhile, from (41) we have and thus much as before. We can then apply Theorem 5 as before to recover Theorem 32.

Remark 37The Selberg sieve is most naturally used as an upper bound sieve. However, it is possible to modify the Selberg sieve to produce a lower bound sieve in a number of ways. One way is to simply insert the Selberg sieve into the Buchstab identity (16); see for instance this paper of Ankeny and Onishi for an investigation of this approach. Another method is to multiply the Selberg upper bound sieve by a lower bound sieve ; somewhat surprisingly, even a very crude lower bound sieve such as the Bonferroni sieve can give reasonably good results for this purpose. See Friedlander and Iwaniec for further analysis of this lower bound sieve (sometimes known as the sieve).

** — 3. A multidimensional Selberg sieve, and bounded gaps between primes — **

We now discuss a multidimensional variant of the Selberg sieve that establishes a recent result of Maynard (and independently by myself), giving further partial progress towards the Hardy-Littlewood prime tuples conjecture. For sake of exposition we shall omit some details; see this previous blog post for a more careful treatment; see also this survey of Granville.

More precisely, we sketch the proof of

Theorem 38 (Maynard’s theorem)Let be a natural number, and let be sufficiently large depending on . Then for any admissible -tuple , there exist infinitely many natural numbers such that at least of the numbers are prime.

Exercise 39Show that Maynard’s theorem implies that the quantity is finite for each natural number , where is the prime, is finite. (For the most recent bounds on , see this Polymath paper.) In particular, the case of Theorem 38 yields “bounded gaps between primes”, in that there is a finite for which for infinitely many .

Even the case of this theorem was only proven in 2013 by Zhang, building upon earlier work of Goldston, Pintz, and Yildirim. Zhang’s original value of in the “bounded gaps” application was ; this has since been lowered several times, with the current record being , due to the Polymath project. The original arguments of Zhang and Goldston-Pintz-Yildirim used a “one-dimensional” Selberg sieve, with coefficients given by (41), and required quite strong equidistribution hypotheses on the primes (stronger than what the Bombieri-Vinogradov theorem provides). However, the argument of Maynard relies on a more general and flexible “multi-dimensional” Selberg sieve, which has better numerical performance, and as such does not need any equidistribution result beyond the Bombieri-Vinogradov theorem. This sieve has since been used for a number of further applications in analytic number theory; see this web page for a list of papers in this direction.

Let be as in Theorem 38. Suppose that, for sufficiently large , we can find a non-negative function supported on with the property that

Then by the pigeonhole principle, there exists at least one such that , that is to say at least of the numbers are prime. Letting go to infinity (keeping fixed), we will obtain Theorem 38.

It remains to establish the existence of a function with the required properties. Prior to the work of Goldston, Pintz, and Yildirim, it was only known (using tools such as the fundamental lemma of sieve theory, or off-the-shelf Selberg sieves) how to construct sieve weights obeying (45) with replaced by a small constant less than . The earlier arguments of Zhang and Goldston-Pintz-Yildirim chose a weight which was essentially of the Selberg sieve form

(compare with (41)), where for some fixed , and was a smooth fixed compactly supported function. Omitting many calculations, their conclusion was that one could prove the case of Theorem 38 once one had some sort of distribution result on the primes at level (as defined in Exercise 22 of Notes 3) with . Unfortunately, this just falls short of what the Bombieri-Vinogradov theorem provides, which is a level of distribution for any . One of the key contributions of Zhang was to obtain a partial distribution result at a value of slightly above , which when combined with (a modification of) the previous work of Goldston, Pintz, and Yildirim, was able to yield the case of Theorem 38.

The approach of Maynard and myself was a little different, based on multidimensional Selberg sieves such as

for some multidimensional smooth function supported on the simplex ; the one-dimensional sieve considered by Zhang and Goldston-Pintz-Yildirim is then essentially the case when is a function just of on this simplex. Actually, to avoid some (very minor) issues concerning common factors between the , and also to essentially eliminate the role of the singular series, it is convenient to work with a slight modification

of the above sieve, where for some slowly growing to infinity with (e.g. will do), and is a residue class such that is primitive for all (such a exists thanks to the admissibility of and the Chinese remainder theorem). Note that for large enough, the have no common factors, and so the are automatically coprime to each other and to . (In Maynard’s paper, the coefficients were replaced by a sequence which was not directly arising from a smooth function , but instead the analogue of the coefficients from the preceding section were chosen to be of the form for some suitable function . However, the differences between the two approaches are fairly minor.)

It is possible to estimate the right-hand side of (45) using a modification of the arguments of the previous section:

Exercise 40Let be as in the above discussion. Assume that . If goes to infinity at a sufficiently slow rate at , establish the asymptoticas , where

and denotes the mixed partial derivative in each of the variables .

Now we consider a term on the left-hand side, say . When is prime, the quantity in (46) is necessarily , so (46) simplifies to a -dimensional version of itself:

If , one can control the effect of the weight using the Bombieri-Vinogradov theorem, and after some moderately complicated calculations one obtains

Exercise 41Let be as in the above discussion. Assume that . If goes to infinity at a sufficiently slow rate at , establish the asymptoticas , where

and denotes the mixed partial derivative in the variables . Similarly for permutations of the indices.

In view of these two calculations, one can establish Maynard’s theorem as soon as one can find and for which

where is defined similarly to but with the role of the and indices swapped. This is now a problem purely in calculus of variations rather than number theory. It is thus natural to take close to (though, for the qualitative analysis performed here, actually any non-zero choice of independent of will suffice). Our task is now to find with

The optimal choice of for a given is not known for any , but one can generate a “good enough” choice of by the following set of calculations. Firstly, we restrict to the case when is symmetric in the , so that all the are equal and our objective is now to satisfy the inequality

Next, we make the change of variables , thus is smooth on and supported on the simplex but is otherwise arbitrary on this simplex. From the fundamental theorem of calculus, the desired inequality now becomes

We select a function of the truncated product form

where is a smooth non-negative function normalised so that

and our objective (after a rescaling) is now to get

The emergence of the function as a parameter in the multidimensional sieve is a feature that is not present in previous sieves (at least not without sharply curtailing the support of ), and is the main reason why this sieve performs better than previous sieves for this problem.

We interpret the above optimisation problem probabilistically. Let be independent real values in with probability density function . Our task is now to obtain the inequality

Suppose that we can locate a fixed function (independent of ) obeying (47) with

Then from the law of large numbers, goes to infinity almost surely as to , and so from dominated convergence one has

and the claim follows from (49) if is sufficiently large depending on .

We are thus reduced to the easy one-dimensional problem of producing a smooth function obeying the constraints (47), (48), (49). However, one can verify that the choice

(barely) obeys (49) with and both finite, and the claim follows by a routine rescaling.

Exercise 42By working through a more quantitative version of the above argument, establish Theorem 38 with as . (Hint:One can replace the use of the law of large numbers with Chebyshev’s inequality.) It is not currently known how to obtain this theorem with any choice of that grows faster than logarithmically in ; the current record is , due to the Polymath project. It is known, though, that bounds on the order of are the limit of the Selberg sieve method, and one either needs new sieves, or new techniques beyond sieve theory, to increase beyond this rate; see the Polymath paper for further discussion.

One can use the multidimensional sieve to study the classical sieving problem of counting prime -tuples, but unfortunately the numerology it gives is precisely the same as the numerology given by previous sieves:

Exercise 43Let be an admissible tuple, and let go to infinity sufficiently slowly as .

** — 4. The large sieve — **

In the general discussion of sieving problems such as Problem 3 in previous sections, there was no structure imposed on the sets beyond the fact that one could control the sums (9). However, as we have seen in practice, is typically the union of some number of residue classes modulo . It is possible to exploit this structure using Fourier analysis, leading to an approach to sieving known as the *large sieve*, and relying crucially on the analytic large sieve inequality from Notes 3. The large sieve tends to give results similar to that of the Selberg sieve; indeed, the two sieves are in some sense “dual” to each other and are powered by the same sort of -based techniques (such as the Cauchy-Schwarz inequality). However, the large sieve methods can offer sharper control on error terms than the Selberg sieve if carried out carefully. As the name suggests, the large sieve is particularly useful in the regime when the number of residue classes modulo being sieved out is large, for instance as large as a constant multiple of . (When is *very large* – close to , then there is another sieve that gives even sharper results, namely the larger sieve of Gallagher, which we do not discuss further here.)

To apply the analytic large sieve to sieving problems, we need to exploit an “uncertainty principle” of Montgomery (discussed further in this previous blog post), that shows that functions which avoid certain residue classes in the physical domain are necessarily spread out in the frequency domain. More precisely, we have

Theorem 44 (Montgomery uncertainty principle)Let be a finitely supported function, and let be the Fourier transform , where . Suppose that for each prime , there are residue classes modulo on which vanishes identically for some . Then for any and any natural number , one haswhere is the multiplicative function

*Proof:* We may assume that is squarefree, since otherwise we have and the claim (50) is trivial. Observe from the Chinese remainder theorem that if the inequality (50) holds for two coprime values of , then it also holds for (as every fraction in with can be uniquely written as with ). Thus it suffices to verify the claim when is prime. By multiplying by we can also reduce to the case , thus our task is now to show that

or equivalently

By the Plancherel identity in , the left-hand side can be written as

and the right-hand side is

Since the sum is only non-vanishing for at most choices of , the claim follows from the Cauchy-Schwarz inequality.

We then obtain the following variant of Theorem 32 or Theorem 30:

Theorem 45 (Arithmetic large sieve)Let . For each prime number , let be the union of residue classes modulo , where for all . Then for any and natural number , we haveand is given by (51).

*Proof:* Let be the indicator function of the set in (52). From Theorem 44 with , and summed up to , one has

Now observe from the basic identity

that the Farey sequence is -separated in . We may thus invoke the analytic large sieve inequality (see Proposition 6 of Notes 3) to conclude that

But from construction of , the left-hand side of (52) is equal to both and , and the claim follows.

Note that if one uses the sharp form of the large sieve inequality (Remark 7 from Notes 3) then one can sharpen (52) to

Setting , one can then recover sharper versions of Theorem 32 and its consequences, for instance by working along these lines (and using a very sharp version of the analytic large sieve inequality that exploits the variable spacing of the set of fractions ), result of Montgomery and Vaughan were able to completely delete the error term in the Brun-Titchmarsh inequality (Exercise 33).

Remark 46It is a general phenomenon that large sieve methods tend to give the same main term as the Selberg sieve; see Section 9.5 of Friedlander-Iwaniec for further discussion.

** — 5. The parity problem — **

We have seen that sieve theoretic techniques are reasonably good at obtaining upper bounds to various sieving problems, and can also provide lower bounds if the sifting limit is low. However, these methods have proven to be frustratingly ineffective at providing non-trivial lower bounds for the original problems that motivated the development of sieve theory in the first place, namely counting patterns of primes such as twin primes. While we do not fully understand the exact limitations of what sieve theory can and cannot do, we do have a convincing heuristic barrier, known as the parity barrier or *parity problem* and first raised explicitly by Selberg, which explains why sieves have so much difficulty in lower bounding (or efficiently upper bounding) counts involving primes; in particular they explain why sieve-theoretic methods have thus far been unable to fully resolve any of the four Landau problems. The barrier is not completely rigorous, as it relies on the *Möbius pseudorandomness principle* (or *Liouville pseudorandomness principle*) discussed in Section 2 of Supplement 4. As such, our discussion in this section will be somewhat informal in nature; for instance, we will use vague symbols such as without defining their meaning precisely.

Consider the general sieving problem in Problem 3, in which one tries to control the quantity via knowledge of the sums , together with the hypothesis that the are non-negative. In most situations, one expects this problem to be *underdetermined*; in particular, there will probably exist two different finitely supported sequences , of non-negative reals which are essentially indistinguishable with regards to the inputs of this sieve problem, in the sense that

for all , but which have very different outputs for this sieve problem, in that

Then any upper bound for Problem 3 with this choice of must exceed (at least approximately) the greater of and ; similarly, any lower bound for this problem must be (approximately) less than the lesser of and .

As a special case of this observation, we have

Proposition 47 (Abstract parity barrier)(Informal) Let be as in Problem 3. Let be a finitely supported sequence of non-negative reals such thatfor all . Suppose that we have an additional “parity function” with the property that

for all in the support of , but such that

for all . Then any upper bound for Problem 3 cannot be significantly smaller than , and any lower bound for Problem 3 cannot be significantly larger than zero. In particular, one does not expect to be able to show using such sieve methods that the support of contains any appreciable number of elements outside of .

*Proof:* (Non-rigorous!) For the statement about the upper bound, apply the previous discussion to the sequence . For the statement about the lower bound, apply the previous discussion to the sequence .

Informally, the proposition asserts that if the sifted set is sensitive to the parity function , but original pre-sifted sequence and the sets are not, then traditional sieve methods cannot give any non-trivial lower bounds, and any upper bounds must be off from the truth by a factor of at least two.

In practice, one can heuristically obey the hypotheses of this proposition if one takes to be some variant of the Liouville function , which counts the parity of the number of prime factors of , in which case the parity problem asserts (roughly speaking) that the sifting problem must allow both numbers with an even number of prime factors and numbers with an odd number of prime factors to survive if one is to have any hope of good lower bounds. For instance:

Corollary 48 (Parity barrier for twin primes)(Very informal) One cannot use traditional sieve methods to resolve the twin prime conjecture.

*Proof:* (Very non-rigorous! Also assumes Liouville pseudorandomness) As mentioned in the introduction, there are a number of different ways to try to approach the twin prime conjecture by sieve methods. Consider for instance the approach in which one starts with the primes in and removes the residue class for each . The quantity will then be on the sifted set (since will be prime on that set), while the Liouville pseudorandomness heuristic predicts that

is negligible for all (say). The claim then follows from Proposition 47.

Exercise 49(Informal) Develop similar parity obstructions for the other sieve-theoretic approaches to the twin prime conjecture. In the case when one is sieving out the integers by two residue classes for each prime , argue that any sieve-theoretic bound must in fact be off from the truth by a factor of at leastfour. (Hint:experiment with weights such as .)

Exercise 50(Informal) Argue why any sieve-theoretic upper bound on prime -tuples that is based on equidistribution estimates for the natural numbers and primes must be off from the truth by a factor of at least . (This should be compared with the known upper bounds of and discussed in previous sections. It is not clear at present which of these bounds is closest to the true limit of sieve theory.)

The above argument is far from watertight, and there have been successful attempts to circumvent the parity barrier. For instance:

- (i) The parity barrier does not prohibit the possibility of producing a very small number of prime (or near-prime) patterns, even if one cannot hope to get lower bounds that are of the “right” order of magnitude. For instance, it was shown unconditionally by Goldston, Graham, Pintz, and Yildirim that there exist infinitely many pairs of consecutive numbers , each of which are the product of exactly four primes. Cramér type random models (see Supplement 4) predict the number of such pairs up to to be approximately (ignoring some factors of ), and the parity barrier prevents one from getting lower bounds of this magnitude; but the construction of Goldston, Graham, Pintz, and Yildirim only produces a quite sparse set of pairs (about or so) and so evades the parity barrier. Using a very different class of arguments, it is also possible to produce analogues of patterns like twin primes in the function field setting (in which the role of primes are replaced by irreducible polynomials over a finite field) by using very sparse sequences of polynomials for which irreducibility can be tested easily; see for instance this paper of Hall for the case of twin irreducible polynomials.
- (ii) The parity barrier may disappear if one is somehow in a situation in which the Möbius or Liouville pseudorandomness principles break down. An extreme example of this is when one assumes the existence of one (or many)
*Siegel zeroes*, which imply extremely high correlation between the Möbius or Liouville functions and a Dirichlet character, in strong violation of the pseudorandomness principle. A striking example of this case occurs in this paper of Heath-Brown, which demonstrates the truth of the twin prime conjecture if one assumes the existence of an infinite sequence of Siegel zeroes. - (iii) Traditional sieve theoretic methods only assume that one can control “linear” or “Type I” sums of the form . However, in some cases it is also possible to control “bilinear” or “Type II” sums such as for various coefficients of controlled size and support but uncontrolled phase. Note that multiplicative parity functions such as will be distinguishable with such sums, since will not exhibit any cancellation if both oscillate with the sign of respectively; as such, the parity barrier is no longer in effect. In such situations it can be possible to go beyond the limits of traditional sieve theory, for instance by using decompositions similar to the Vaughan identity employed in Notes 3. Of course, one still has to actually establish the required bilinear sum asymptotics, and this typically requires some deep and non-trivial input outside of sieve theory. A model example here is the work of Friedlander and Iwaniec establishing the infinitude of primes of the form . Here, the relevant bilinear input is obtained using the algebraic structure of the Gaussian integers and some non-trivial results on lattice counting problems.
- (iv) Sieve theory is focused on “one-dimensional” sums, such as , or perhaps . The presence of the parity problem relies on the ability to modify weights such as or in such sums in such a fashion that the value of the sum dramatically changes. If one instead is trying to control a
*multi-dimensional*sum, such as (which roughly speaking counts arithmetic progressions of primes of length four), then there is no easy way to modify any of the factors of such a sum in the fashion required for the parity barrier to be in effect. For such multi-dimensional sums one can then hope to use additional estimates to produce non-trivial lower bounds. For instance, with Ben Green, we used Selberg sieves combined with the additional tool of Szemerédi’s theorem to obtain lower bounds for multi-dimensional sums similar to the one above, proving in particular that the primes contained arbitrarily long arithmetic progressions.

It is one of the major problems in analytic prime number theory to continue to find more ways to defeat the parity barrier, so that we can produce prime patterns in many more situations than what is currently possible. However, this has proven to be a surprisingly challenging task, and one which typically requires the injection of new ideas and methods external to sieve theory.

Remark 51As mentioned previously, it is not known in general if the parity problem represents theonlyobstruction to sieve methods. However, under the additional hypothesis of the Elliott-Halberstam conjecture, and when focusing on “binary” problems such as the twin prime conjecture, there is a more detailed analysis of the parity problem due to Bombieri which indicates that, in some sense, the parity problemisindeed the only information not captured by sieve-theoretic methods, in that the asymptotic distribution of primes or almost primes in pairs is completely determined by a single parameter that represents the ratio between the actual number of twin primes and the expected number of twin primes; the parity problem prevents us from obtaining any further information about other than that it lies between and , but once is known, essentially all other asymptotics regarding the prime factorisation of and will also be known. See Chapter 16 of Friedlander-Iwaniec for further discussion.

Remark 52For a slightly different way to formalise the parity problem, see this previous blog post. Among other things, it is shown in that post that the parity problem prevents one from proving Theorem 38 with any choice of with using purely sieve-theoretic methods, in particular ruling out a solution to the twin prime conjecture by sieve methods.

## 41 comments

Comments feed for this article

21 January, 2015 at 7:27 pm

Fred LunnonRemark 46 — missing reference?

[Corrected, thanks – T.]22 January, 2015 at 12:17 am

observerRemark 46: sieve sieve.

[Corrected, thanks – T.]22 January, 2015 at 2:37 am

Bo JacobyThank you. When I read from the email all formulas appear superscripted like exponents. When I read from https://terrytao.wordpress.com/2015/01/21/254a-notes-4-some-sieve-theory/#comments it looks all right. Can I do something to make the email look all right?

22 January, 2015 at 5:09 am

AnonymousIn Problem 2, it seems (in line 3) that “” should be ““.

Also, it seems that should appear as the LHS of the displayed formula.

[Corrected, thanks – T.]22 January, 2015 at 6:40 am

tomcircleReblogged this on Math Online Tom Circle.

22 January, 2015 at 7:38 am

arch1In Problem 2, if we omit z itself in the product defining P(z), how does this avoid the residue classes 0, -2 (p) when p=z?

[Corrected, thanks – T.]22 January, 2015 at 12:28 pm

arch1In (6), you need to incorporate the immediately preceding crude bound.

[Corrected, thanks – T.]22 January, 2015 at 4:50 pm

arch1Proposition 14(i): The superscripts in and should instead be subscripts.

Proposition 23: n+2 -> p+2

[Corrected, thanks – T.]23 January, 2015 at 6:32 pm

arch1In the 1st sentence after (3), should instead be ?

[Corrected, thanks – T.]23 January, 2015 at 9:01 pm

kodlu“… theorem of Zhang which (among other things) used an upper bound sieve was used to demonstrate the existence of infinitely many pairs of primes whose difference was bounded.”

could be rewritten

“…theorem of Zhang in which (among other things) an upper bound sieve was used to demonstrate the existence of infinitely many pairs of primes whose difference was bounded.”

[Corrected, thanks – T.]26 January, 2015 at 6:01 am

arch1@gmail.comAs a beginner it took me a while to see that the terms in the expansion of (2)ās RHS correspond to the subsets of Pās prime factors, which in turn correspond to the divisors d of P on the following line.

[Additional step added to clarify – T.]27 January, 2015 at 2:23 am

AnonymousIn exercise 13 should’t it be in the last productory, instead of ? Very nice notes, by the way, hope I get some time to look at them carefully.

[Corrected, thanks – T.]29 January, 2015 at 5:34 am

arch1Maybe overkill, but it would have helped me if (4) had started with

and (5) with

29 January, 2015 at 11:32 am

254A, Supplement 5: The linear sieve and Chen’s theorem (optional) | What's new[…] continue the discussion of sieve theory from Notes 4, but now specialise to the case of the linear sieve in which the sieve dimension is equal to , […]

22 February, 2015 at 9:02 am

254A, Notes 7: Linnik’s theorem on primes in arithmetic progressions | What's new[…] options here; we will use a variant of the “continuous Selberg sieve” from Section 2 of Notes 4. Fix a smooth function that equals on and is supported on ; we allow implied constants to depend […]

25 February, 2015 at 6:01 am

arch1In the first sentence after Problem 2, should instead be ?

[Corrected, thanks – T.]30 March, 2015 at 12:49 pm

254A, Notes 8: The Hardy-Littlewood circle method and Vinogradov’s theorem | What's new[…] function is somewhat similar to the continuous Selberg sieve weights studied in Notes 4, with the main difference being that we did not square the divisor sum as we will not need to take […]

28 April, 2015 at 4:23 am

KaraskasTheorem 44: The condition about q being squre-free looks unnecessary thanks to the definition of h(q), doesn’t it?

In the definition of Fourier transform shouldn’t it be instead of ? It seems that if we define the transform without minus sign, then we will have to multiply the function f by during the reducing to the case.

[Corrected, thanks – T.]4 August, 2015 at 3:13 am

AnonymousAre these notes available as pdf? I want to take a printout.

18 September, 2015 at 4:46 pm

The logarithmically averaged Chowla and Elliott conjectures for two-point correlations; the Erdos discrepancy problem | What's new[…] such as (2) or (3) are known to be subject to the “parity problem” (discussed numerous times previously on this blog), which roughly speaking means that they cannot be proven solely using […]

23 November, 2015 at 9:24 pm

A cheap version of Halasz’s inequality | What's new[…] This follows for instance from the fundamental lemma of sieve theory (see e.g. Corollary 19 of this blog post). (One can also use the Selberg sieve or the large […]

17 July, 2016 at 8:54 am

Notes on the Bombieri asymptotic sieve | What's new[…] where denotes the derivative of . Note the loss of that had previously been pointed out. In the arguments that follows I will be a little brief with the details, as they are standard (see e.g. this previous post). […]

25 September, 2016 at 8:19 am

manhtuankhtnIn the definition of $G(t_1,t_2)$ after equation (44): should the term $dt_1dt_2$ removed?

[Corrected, thanks – T.]25 September, 2016 at 11:44 am

AnonymousIn the definition of $G(t_1,t_2)$ (after (44)) and $E_p(t_1,t_2)$: $1+p^{\frac{1+it_2}{\log R}}$ should be $p^{1+\frac{1+it_2}{\log R}}$.

[Corrected, thanks – T.]25 September, 2016 at 11:54 am

Anonymous”which by the asymptotic $\sum_p \frac{1}{p^s} = \log(s-1) + O(1)$ for $s > 1$” should be ”…. $-\log(s-1)$…” (missing minus sign).

[Corrected, thanks – T.]28 September, 2016 at 12:44 am

AnonymousI have a question which is somewhat related to the Brun-Titchmarsh inequality. It would be great if you can give me a reference. Does the following statement hold: Almost all natural numbers in the interval has the property that all prime factors of are at most , where grows slowly to infinity with . Please ignore (delete) my question if you think it is inappropriate for this post. Thanks!

28 September, 2016 at 8:02 am

Terence TaoThere has been a lot of work on counting smooth numbers in short intervals, see e.g. this paper of Croot http://people.math.gatech.edu/~ecroot/smooth6.pdf for some results and references.

18 December, 2018 at 6:31 am

AnonymousExercise 26 (Approximations to Legendreās conjecture):

I guess it should be O(y/logx) instead of O(y/logy), shouldn’t it?

18 December, 2018 at 3:56 pm

Terence TaoNo, one can only hope to get here (one can only make good use of the portion of the sieve of Eratosthenes up to primes of size about ). Note for instance that the number of primes in does not go to zero in the limit (indeed, we believe it equal to 2 infinitely often!)

8 November, 2019 at 5:23 pm

254A, Notes 2: Complex-analytic multiplicative number theory | What's new[…] 10 In Notes 4 we will use a similar method to that used to prove Theorem 9 to estimate sums such […]

16 June, 2021 at 1:25 pm

AnonymousDear Prof. Tao,

if you Ctrl+F for “taking s”, you’ll find a stray .

16 June, 2021 at 1:27 pm

WebspinnerAlso, for the proof of Brun’s theorem, one can’t just set , because the -notation does not yield a bound on the entire real line, but only a complement-bounded segment of it. One instead needs the containment of prime numbers within the set of (suitably) rough numbers.

16 June, 2021 at 1:33 pm

WebspinnerFinally, right after eqn. (19), is probably meant to be the level of distribution, and not the sieve level.

16 June, 2021 at 1:35 pm

WebspinnerOh, and regarding social matters: Perhaps my comment about the German saying has been percieved as somewhat grumpy. Yet, I observed that you, Prof. Tao, usually implement suggestions within the day, and I was put off by the fact that for my suggestions, that hasn’t happened yet.

17 June, 2021 at 7:54 am

WebspinnerRegarding lemma 17, I wonder whether the formulation might be optimised. Right now there is some class of functions obeying (22), and for this class one has the dependency on the sieve dimension. However, in the subsequent exercises one does not use the class of functions and the dependency on at all. As it stands, it might be misunderstood to state that the bound depends on and nothing else, which may be true but is not proven here, because (22) is assumed as an axiom.

Also, I’d be happy to get some kind of feedback from you whether my comments are welcome here or whether I’m just wasting my time.

17 June, 2021 at 8:21 am

WebspinnerAlso, in the paragraph after (22), I’m not sure at all whether the inequality is sufficient for the desired bound, which in needs to hold even for the small primes, if I’m not mistaken. I worry that a weird fluctuation in the beginning might throw the bound off. Also, I’m not quite sure whether the inequality is enough, because one bound is from above and the other from below.

17 June, 2021 at 8:51 am

AnonymousJust to be clear: I mean fluctuations in , not (even though it best not be zero for small primes).

17 June, 2021 at 1:21 pm

WebspinnerIn the proof of lemma 17, for the lower bound of , the third equation should probably not be strict, because the case is not yet ruled out. Two equations later, there is a missing in the exponent of .

Dude, I’m not trying to piss anyone off. But it would just be nice to have a freely available reference, and with a bit of collaboration, we can make it happen!

24 June, 2021 at 9:31 am

WebspinnerHello there,

this is a gentle reminder kindly asking you to implement the above suggestions.

30 July, 2021 at 12:30 pm

WebspinnerI give up. I’ve made some good faith suggestions, and all I got was an estranging remark by a certain “Anonymous” that said “Spin somewhere else” (see the comment section to chapter 2 of this lecture). I’m going to take the advice of this “Anonymous” seriously. Farewell.

30 July, 2021 at 12:32 pm

WebspinnerI should add that this comment has not been removed yet, as of today.