The parity problem is a notorious problem in sieve theory: this theory was invented in order to count prime patterns of various types (e.g. twin primes), but despite superb success in obtaining upper bounds on the number of such patterns, it has proven to be somewhat disappointing in obtaining lower bounds. [Sieves can also be used to study many other things than primes, of course, but we shall focus only on primes in this post.] Even the task of reproving Euclid’s theorem – that there are infinitely many primes – seems to be extremely difficult to do by sieve theoretic means, unless one of course injects into the theory an estimate at least as strong as Euclid’s theorem (e.g. the prime number theorem). The main obstruction is the parity problem: even assuming such strong hypotheses as the Elliott-Halberstam conjecture (a sort of “super-generalised Riemann Hypothesis” for sieves), sieve theory is largely (but not completely) unable to distinguish numbers with an odd number of prime factors from numbers with an even number of prime factors. This “parity barrier” has been broken for some select patterns of primes by injecting some powerful non-sieve theory methods into the subject, but remains a formidable obstacle in general.
I’ll discuss the parity problem in more detail later in this post, but I want to first discuss how sieves work [drawing in part on some excellent unpublished lecture notes of Iwaniec]; the basic ideas are elementary and conceptually simple, but there are many details and technicalities involved in actually executing these ideas, which I will try to suppress for the sake of exposition.
Let’s consider a basic question in prime number theory, namely how to count the number of primes in a given range, say between N and 2N for some large integer N. (This problem is more or less equivalent to that of counting primes between 1 and N, thanks to dyadic decomposition, but by keeping the magnitude of all numbers comparable to N we can simplify some (very minor) technicalities.) Of course, we know that this particular question can be settled fairly satisfactorily (the answer is $(1+o(1)) \frac{N}{\log N}$) using known facts about the Riemann zeta function, but let us pretend for now that we do not know about this function. (Once one moves to slightly more complicated additive questions about the primes, such as counting twin primes, the theory of the zeta function and its relatives becomes much less powerful, even assuming such things as the Riemann hypothesis; the problem is that these functions are measuring the multiplicative structure of the primes rather than the additive structure.)
The set of primes does not appear to have enough usable structure in order to perform such counts quickly. However, one can count other sets of numbers between N and 2N with much more ease. For instance, the set of integers between N and 2N can be easily counted with small error:
$\displaystyle |\{ N \leq n \leq 2N \}| = N + O(1)$;
the error term O(1) in this case is in fact just 1. Similarly, we can count, say, the number of odd numbers between N and 2N,
$\displaystyle |\{ N \leq n \leq 2N: n \hbox{ odd} \}| = \frac{N}{2} + O(1)$,
simply because the set of odd numbers has density $\frac{1}{2}$ and is periodic of period 2. The error term O(1) now depends on the parity of N. More generally, we can count any given residue class in [N,2N] to a reasonable accuracy:
$\displaystyle |\{ N \leq n \leq 2N: n = a \hbox{ mod } q \}| = \frac{N}{q} + O(1)$,
where the error term is now more complicated, and depends on what N is doing modulo q. This estimate is quite good as long as q is small compared with N, but once q is very large, the error term O(1) can begin to overwhelm the main term (especially if the main term is going to appear in a delicate summation with lots of cancellation). In general, any summation involving the main term N/q will be relatively easy to manipulate (because it is essentially multiplicative in q, and thus amenable to all the methods of multiplicative number theory, in particular Euler products and zeta functions); it is the error term O(1) which causes all the difficulty.
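As a quick numerical sanity check, here is a minimal Python sketch (the values of N, q and a below are arbitrary choices) comparing such a count against its main term:

# count n in [N,2N] with n = a mod q, and compare with the main term N/q
N, q, a = 10**6, 17, 5
count = sum(1 for n in range(N, 2*N + 1) if n % q == a)
print(count, N / q, count - N / q)   # the discrepancy is the O(1) error term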
Once we have figured out how to count these basic sets, we can also count some combinations of these sets, as long as these combinations are simple enough. For instance, suppose we want to count
$\displaystyle |\{ N \leq n \leq 2N: n \hbox{ coprime to } 2 \hbox{ and } 3 \}| \qquad (1).$
Well, we know that the total number of integers in [N,2N] is N+O(1). Of this set, we know that $\frac{N}{2} + O(1)$ are not coprime to 2 (i.e. divisible by 2), and that
$\frac{N}{3} + O(1)$ are not coprime to 3. So we should subtract those two sets from the original set, leaving
$N - \frac{N}{2} - \frac{N}{3} + O(1)$. But the numbers which are divisible by both 2 and 3 (i.e. divisible by 6) have been subtracted twice, so we have to put them back in; this adds in another
$\frac{N}{6} + O(1)$, giving a final count of
$\frac{N}{3} + O(1)$ for the quantity (1); this is of course a simple instance of the inclusion-exclusion principle in action. An alternative way to estimate (1) is to use the Chinese remainder theorem to rewrite (1) as $|\{ N \leq n \leq 2N: n = 1, 5 \hbox{ mod } 6 \}|$
and use our ability to count residue classes modulo 6 to get the same final count of $\frac{N}{3} + O(1)$ (though the precise bound on the error term will be slightly different). For very small moduli such as 2 and 3, the Chinese remainder theorem is quite efficient, but it is somewhat rigid, and for higher moduli (e.g. for moduli much larger than $\log N$) it turns out that the more flexible inclusion-exclusion principle gives much better results (after applying some tricks to optimise the efficiency of that principle).
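For concreteness, here is a rough Python sketch (with an arbitrary choice of N) comparing the direct count in (1) against the inclusion-exclusion main term $N - \frac{N}{2} - \frac{N}{3} + \frac{N}{6} = \frac{N}{3}$:

from math import gcd

N = 10**6
direct = sum(1 for n in range(N, 2*N + 1) if gcd(n, 6) == 1)
main_term = N - N/2 - N/3 + N/6   # equals N/3
print(direct, main_term, direct - main_term)   # they differ by an O(1) amount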
We can of course continue the example in (1), counting the numbers in [N,2N] which are coprime to 2,3,5,7, etc., which by the sieve of Eratosthenes will eventually give us a count for the primes in [N,2N], but let us pause for a moment to look at the larger picture. We have seen that some sets in [N,2N] are fairly easy to count accurately (e.g. residue classes with small modulus), and others are not (e.g. primes, twin primes). What is the defining characteristic of the former types of sets? One reasonable answer is that the sets that are easy to count are low-complexity, but this is a rather vaguely defined term. I would like to propose instead that sets (or more generally, weight functions – see below) are easy to count (or at least estimate) whenever they are smooth in a certain sense to be made more precise shortly. This terminology comes from harmonic analysis rather than from number theory (though number theory does have the related concept of a smooth number), so I will now digress a little bit to talk about smoothness, as it seems to me that this concept implicitly underlies the basic strategy of sieve theory.
Instead of talking about the problem of (approximately) counting a given set in [N,2N], let us consider instead the analogous problem of (approximately) computing the area of a given region E (e.g. a solid ellipse) in the unit square $[0,1]^2$. As we are taught in high school, one way to do this is to subdivide the square into smaller squares, e.g. squares of length $10^{-n}$ for some n, and count how many of these small squares lie completely or partially in the set E, and multiply by the area of each square; this is of course the prelude to the Riemann integral. It works well as long as the set E is “smooth” in the sense that most of the small squares are either completely inside or completely outside the set E, with few borderline cases; this notion of smoothness can also be viewed as a quantitative version of Lebesgue measurability. Another way of saying this is that if one wants to determine whether a given point (x,y) lies in E, it is usually enough just to compute x and y to the first n significant digits in the decimal expansion.
Now we return to counting sets in [N,2N]. One can also define the notion of a “smooth set” here by again using the most significant digits of the numbers n in the interval [N,2N]; for instance, the set [1.1 N, 1.2 N] would be quite smooth, as one would be fairly confident whether n would lie in this set or not after looking at just the top two or three significant digits. However, with this “Euclidean” or “Archimedean” notion of smoothness, sets such as the primes or the odd numbers are certainly not smooth. However, things look a lot better if we change the metric, or (more informally) if we redefine what “most significant digit” is. For instance, if we view the last digit in the base 10 expansion of a number n (i.e. the value of $n \hbox{ mod } 10$) as the most significant one, rather than the first – or more precisely, if we use the 10-adic metric instead of the Euclidean one, thus embedding the integers into the 10-adic integers $\mathbb{Z}_{10}$
rather than into the real line $\mathbb{R}$
– then the odd numbers become quite smooth (the most significant digit completely determines membership in this set). The primes in [N,2N] are not fully smooth, but they do exhibit some partial smoothness; indeed, if the most significant digit is 0, 2, 4, 5, 6, or 8, this fully determines membership in the set, though if the most significant digit is 1, 3, 7, or 9 then one only has partial information on membership in the set.
Now, the 10-adic metric is not fully satisfactory for characterising the elusive concept of number-theoretic “smoothness”. For instance, the multiples of 3 should be a smooth set, but this is not the case in the 10-adic metric (one really needs all the digits before one can be sure whether a number is a multiple of 3!). Also, we have the problem that the set [N/2,N] itself is now no longer smooth. This can be fixed by working not with just the Euclidean metric or a single n-adic metric, but with the product of all the n-adic metrics and the Euclidean metric at once. Actually, thanks to the Chinese remainder theorem, it is enough to work with the product of the p-adic metrics for primes p and the Euclidean metric, thus embedding the integers in the integer adele ring $\prod_p \mathbb{Z}_p \times \mathbb{R}$. For some strange reason, this adele ring is not explicitly used in most treatments of sieve theory, despite its obvious relevance (and despite the amply demonstrated usefulness of this ring in algebraic number theory or in the theory of L-functions, as exhibited for instance by Tate’s thesis). At any rate, we are only using the notion of “smoothness” in a very informal sense, and so we will not need the full formalism of the adeles here. Suffice to say that a set of integers in [N,2N] is “smooth” if membership in that set can be largely determined by its most significant digits in the Euclidean sense, and also in the p-adic senses for small p; roughly speaking, this means that this set is approximately the pullback of some “low complexity” set in the adele ring – a set which can be efficiently fashioned out of a few basic sets which generate the topology and $\sigma$-algebra of that ring. (Actually, in many applications of sieve theory, we only need to deal with moduli q which are square-free, which means that we can replace the p-adics $\mathbb{Z}_p$ with the cyclic group $\mathbb{Z}/p\mathbb{Z}$, and so it is now just the residues mod p for small p, together with the Euclidean most significant digits, which should control what smooth sets are; thus the adele ring has been replaced by the product $\prod_p \mathbb{Z}/p\mathbb{Z} \times \mathbb{R}$.)
Let us now return to sieve theory, and the task of counting “rough” sets such as the primes in [N,2N]. Since we know how to accurately count “smooth” sets such as the residue classes $\{ N \leq n \leq 2N: n = a \hbox{ mod } q \}$ with q small, one can try to describe the rough set of primes as some sort of combination of smooth sets. The most direct implementation of this idea is the sieve of Eratosthenes; if one then tries to compute the number of primes using the inclusion-exclusion principle, one obtains the Legendre sieve; we implicitly used this idea previously when counting the quantity (1). However, the number of terms in the inclusion-exclusion formula is very large; if one runs the sieve of Eratosthenes for k steps (i.e. sieving out multiples of the first k primes), there are basically $2^k$ terms in the inclusion-exclusion formula, leading to an error term which in the worst case could be of size $O(2^k)$. A related issue is that the modulus q in many of the terms in the Legendre sieve becomes quite large – as large as the product of the first k primes (which turns out to be roughly $e^{(1+o(1)) k \log k}$ in size). Since the set one is trying to count is only of size N, we thus see that the Legendre sieve becomes useless after just $\log N$ or so steps of the Eratosthenes sieve, which is well short of what one needs to accurately count primes (which requires that one uses roughly $\sqrt{N}/\log N$ or so steps). More generally, “exact” sieves such as the Legendre sieve are useful for any situation involving only a logarithmically small number of moduli, but are unsuitable for sieving with much larger numbers of moduli.
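To get a feel for this trade-off, here is a minimal Python sketch (the cutoff N and the list of sieving primes are arbitrary choices) showing how slowly the Legendre sieve main term $N \prod_{p} (1 - \frac{1}{p})$ shrinks while the worst-case error budget of $2^k$ doubles at every step:

N = 10**6
primes = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
main = float(N)
for k, p in enumerate(primes, start=1):
    main *= 1 - 1/p              # main term after sieving out the first k primes
    print(k, round(main), 2**k)  # the 2^k error budget quickly dwarfs the savings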
One can describe the early development of sieve theory as a concerted effort to rectify the drawbacks of the Legendre sieve. The first main idea here is to not try to compute the size of the rough set exactly – as this is too “expensive” in terms of the number of smooth sets required to fully describe the rough set – but instead to just settle for upper or lower bounds on the size of this set, which use fewer smooth sets. There is thus a tradeoff between how well the bounds approximate the original set, and how well one can compute the bounds themselves; by selecting various parameters appropriately one can optimise this tradeoff and obtain a final bound which is non-trivial but not completely exact. For instance, in using the Legendre sieve to try to count primes between N and 2N, one can instead use that sieve to count the much larger set of numbers between N and 2N which are coprime to the first k primes, thus giving an upper bound for the primes between N and 2N. It turns out that the optimal value of k here is roughly $\log N$ or so (after this, the error terms in the Legendre sieve get out of hand), and gives an upper bound of $O( N / \log\log N )$
for the number of primes between N and 2N – somewhat far from the truth (which is $(1+o(1)) N / \log N$), but still non-trivial.
In a similar spirit, one can work with various truncated and approximate versions of the inclusion-exclusion formula which involve fewer terms. For instance, to estimate the cardinality of the union of k sets, one can replace the inclusion-exclusion formula
$\displaystyle |A_1 \cup \ldots \cup A_k| = \sum_{1 \leq i \leq k} |A_i| - \sum_{1 \leq i < j \leq k} |A_i \cap A_j| + \ldots + (-1)^{k+1} |A_1 \cap \ldots \cap A_k| \qquad (2)$
by the obvious upper bound
$\displaystyle |A_1 \cup \ldots \cup A_k| \leq \sum_{1 \leq i \leq k} |A_i|$
(also known as the union bound), or by the slightly less obvious lower bound
$\displaystyle |A_1 \cup \ldots \cup A_k| \geq \sum_{1 \leq i \leq k} |A_i| - \sum_{1 \leq i < j \leq k} |A_i \cap A_j|.$
More generally, if one takes the first n terms on the right-hand side of (2), this will be an upper bound for the left-hand side for odd n and a lower bound for even n. These inequalities, known as the Bonferroni inequalities, are a nice exercise to prove: they are equivalent to the observation that in the binomial identity
$\displaystyle 0 = (1-1)^m = \sum_{j=0}^{m} (-1)^j \binom{m}{j}$
for any $m \geq 1$, the partial sums on the right-hand side alternate in sign between non-negative and non-positive. If one inserts these inequalities into the Legendre sieve and optimises the parameter, one can improve the upper bound for the number of primes in [N,2N] to $O( N \frac{\log\log N}{\log N} )$, which is significantly closer to the truth. Unfortunately, this method does not provide any lower bound other than the trivial bound of 0; either the main term is negative, or the error term swamps the main term. A similar argument was used by Brun to show that the number of twin primes in [N,2N] was $O( N (\frac{\log\log N}{\log N})^2 )$
(again, the truth is conjectured to be comparable to $N / \log^2 N$
), which implied his famous theorem that the sum of reciprocals of the twin primes is convergent.
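The alternating behaviour of the truncated inclusion-exclusion expansion is easy to observe numerically; here is a small Python sketch with randomly chosen sets (nothing here is specific to sieve theory):

from itertools import combinations
import random

random.seed(0)
sets = [set(random.sample(range(50), 20)) for _ in range(5)]
union = len(set().union(*sets))
partial = 0
for j in range(1, len(sets) + 1):
    term = sum(len(set.intersection(*c)) for c in combinations(sets, j))
    partial += (-1) ** (j + 1) * term
    # partial sums with an odd number of terms over-count the union, even ones under-count
    print(j, partial, ">=" if j % 2 else "<=", union)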
The full inclusion-exclusion expansion is a sum over $2^k$ terms, which one can view as binary strings of 0s and 1s of length k. In the Bonferroni inequalities, one only sums over a smaller collection of strings, namely the Hamming ball of strings which only involve n or fewer 1s. There are other collections of strings one can use which lead to upper or lower bounds; one can imagine revealing such a string one digit at a time and then deciding whether to keep or toss out this string once some threshold rule is reached. There are various ways to select these thresholding rules, leading to the family of combinatorial sieves. One particularly efficient such rule is similar to that given by the Bonferroni inequalities, but instead of using the number of 1s in a string to determine membership in the summation, one uses a weighted number of 1s (giving large primes more weight than small primes, because they tend to increase the modulus too quickly and thus should be removed from the sum sooner than the small primes). This leads to the beta sieve, which for instance gives the correct order of magnitude of $O( N / \log N )$
for the number of primes in [N,2N] or $O( N / \log^2 N )$
for the number of twin primes in [N,2N]. This sieve is also powerful enough to give lower bounds, but only if one stops the sieve somewhat early, thus enlarging the set of primes to a set of almost primes (numbers which are coprime to all numbers less than a certain threshold, and thus have a bounded number of prime factors). For instance, this sieve can show that there are an infinite number of twins $n, n+2$
, each of which has at most nine prime factors (the nine is not optimal, but to get better results requires much more work).
There seems however to be a limit as to what can be accomplished by purely combinatorial sieves. The problem stems from the “binary” viewpoint of such sieves: any given term in the inclusion-exclusion expansion is either included or excluded from the sieve upper or lower bound, and there is no middle ground. This leads to the next main idea in modern sieve theory, which is to work not with the cardinalities of sets in [N,2N], but rather with the more flexible notion of sums of weight functions (real-valued functions on [N,2N]). The starting point is the obvious formula
$\displaystyle |A| = \sum_{N \leq n \leq 2N} 1_A(n)$
for the cardinality of a set A in [N,2N], where $1_A$ is the indicator function of the set A. Applying this to smooth sets such as $\{ N \leq n \leq 2N: n = a \hbox{ mod } q \}$, we obtain
$\displaystyle \sum_{N \leq n \leq 2N: n = a \hbox{ mod } q} 1 = \frac{N}{q} + O(1)$;
in particular, specialising to the 0 residue class a = 0 (which is the residue class of importance for counting primes) we have
$\displaystyle \sum_{N \leq n \leq 2N: d | n} 1 = \frac{N}{d} + O(1)$
for any d. Thus if we can obtain a pointwise upper bound on $1_A$ by a divisor sum (which is a number-theoretic analogue of a smooth function), thus
$\displaystyle 1_A(n) \leq \sum_{d | n} c_d \qquad (3)$
for all n and some real constants $c_d$ (which could be positive or negative), then on summing we obtain the upper bound
$\displaystyle |A| \leq \sum_d c_d \frac{N}{d} + O\Big( \sum_d |c_d| \Big). \qquad (4)$
One can also hope to obtain lower bounds on |A| by a similar procedure (though in practice, lower bounds for primes have proven to be much more difficult to obtain, due to the parity problem which we will discuss below). These strategies are suited for the task of bounding the number of primes in [N,2N]; if one wants to do something fancier such as counting twin primes n, n+2, one has to either involve more residue classes (e.g. the class a = -2 will play a role in the twin prime problem) or else insert additional weights in the summation (e.g. weighting all summations in n by an additional factor of $\Lambda(n+2)$, where $\Lambda$
is the von Mangoldt function). To simplify the exposition let us just stick with the plainer problem of counting primes.
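For one concrete instance of (3), one can take the Legendre-type weights $c_d = \mu(d)$ for d dividing the product P of the primes up to some small threshold z, and $c_d = 0$ otherwise; for n in [N,2N] with N > z the divisor sum then equals 1 when n has no prime factor up to z (in particular when n is prime) and 0 otherwise, so it dominates $1_A$ pointwise. A rough Python sketch of this pointwise bound (the choices of z and N below are arbitrary):

def mobius(n):
    # elementary Mobius function via trial division
    result, d = 1, 2
    while d * d <= n:
        if n % d == 0:
            n //= d
            if n % d == 0:
                return 0
            result = -result
        d += 1
    return -result if n > 1 else result

z, P, N = 5, 2 * 3 * 5, 100
for n in range(N, 2 * N + 1):
    weight = sum(mobius(d) for d in range(1, P + 1) if P % d == 0 and n % d == 0)
    is_prime = n > 1 and all(n % p for p in range(2, int(n ** 0.5) + 1))
    assert weight >= (1 if is_prime else 0)   # the pointwise bound (3) holds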
These strategies generalise the combinatorial sieve strategy, which is a special case in which the constants $c_d$ are restricted to be +1, 0, or -1. In practice, the sum $\sum_d c_d \frac{N}{d}$ in (4) is relatively easy to sum by multiplicative number theory techniques (the coefficients $c_d$, in applications, usually involve the Möbius function (not surprising, since they are encoding some sort of inclusion-exclusion principle) and are often related to the coefficients of a Hasse-Weil zeta function, as they basically count solutions modulo d to some set of algebraic equations), and the main task is to ensure that the error term in (4) does not swamp the main term. To do this, one basically needs the weights $c_d$ to be concentrated on those d which are relatively small compared with N, for instance they might be restricted to some range $d \leq R$, where the sieve level R is some small power of N. Thus for instance, starting with the identity
$\displaystyle \Lambda(n) = \sum_{d | n} \mu(d) \log \frac{n}{d}, \qquad (5)$
which corresponds to the zeta-function identity
$\displaystyle -\frac{\zeta'(s)}{\zeta(s)} = \frac{1}{\zeta(s)} \times \left( -\zeta'(s) \right)$,
where $\Lambda$ is the von Mangoldt function and $\mu$ is the Möbius function, we obtain the upper bound
$\displaystyle |A| \leq \frac{1}{\log N} \sum_{N \leq n \leq 2N} \Lambda(n) = \frac{1}{\log N} \sum_{N \leq n \leq 2N} \sum_{d | n} \mu(d) \log \frac{n}{d}$
where A denotes the primes from N to 2N. This is already enough (together with the elementary asymptotic $\sum_{n \leq N} \Lambda(n) \lfloor \frac{N}{n} \rfloor = \log N! = N \log N + O(N)$) to obtain the weak prime number theorem $|A| = O( N / \log N )$, but unfortunately this method does not give a nontrivial lower bound for |A|. However, a variant of the method does give a nice asymptotic for $P_2$
almost primes – products of at most two (large) primes [e.g. primes larger than $N^\varepsilon$
for some fixed $\varepsilon > 0$
]. Indeed, if one introduces the second von Mangoldt function
$\displaystyle \Lambda_2(n) := \sum_{d | n} \mu(d) \log^2 \frac{n}{d} = \Lambda(n) \log n + \sum_{d | n} \Lambda(d) \Lambda\Big(\frac{n}{d}\Big) \qquad (6)$
which is mostly supported on almost primes (indeed, $\Lambda_2(p) = \log^2 p$
and $\Lambda_2(pq) = 2 \log p \log q$
for distinct primes p, q, and $\Lambda_2$
is mostly zero otherwise), and uses the elementary asymptotic $\sum_{n \leq N} \log^2 n = N \log^2 N - 2 N \log N + 2 N + O(\log^2 N)$,
then one obtains the Selberg symmetry formula
$\displaystyle \sum_{N \leq n \leq 2N} \Lambda_2(n) = 2 N \log N + O(N)$.
This formula (together with the weak prime number theorem mentioned earlier) easily implies a “$P_2$ almost prime number theorem”, namely that the number of $P_2$
almost primes less than N is $(2+o(1)) \frac{N}{\log N}$
. [This fact is much easier to prove than the prime number theorem itself. In terms of zeta functions, the reason why the prime number theorem is difficult is that the simple pole of $-\frac{\zeta'(s)}{\zeta(s)}$
at s=1 could conceivably be counteracted by other simple poles on the line $\hbox{Re}(s) = 1$
. On the other hand, the $P_2$
almost prime number theorem is much easier because the effect of the double pole of $\frac{\zeta''(s)}{\zeta(s)}$
at s=1 is not counteracted by the other poles on the line $\hbox{Re}(s) = 1$
, which are at most simple.]
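The symmetry formula is easy to test numerically; the following rough Python sketch (using the second expression in (6) for $\Lambda_2$, and an arbitrary cutoff) checks that the sum over [N,2N] is indeed close to $2 N \log N$:

from math import log

def von_mangoldt(n):
    # Lambda(n) = log p if n is a power of a prime p, and 0 otherwise
    if n < 2:
        return 0.0
    for p in range(2, n + 1):
        if n % p == 0:
            while n % p == 0:
                n //= p
            return log(p) if n == 1 else 0.0
    return 0.0

def lambda_2(n):
    # Lambda_2(n) = Lambda(n) log n + sum over d|n of Lambda(d) Lambda(n/d)
    s = von_mangoldt(n) * log(n)
    for d in range(1, n + 1):
        if n % d == 0:
            s += von_mangoldt(d) * von_mangoldt(n // d)
    return s

N = 3000
S = sum(lambda_2(n) for n in range(N, 2 * N + 1))
print(S / (2 * N * log(N)))   # should be reasonably close to 1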
The $P_2$ almost prime number theorem establishes the prime number theorem “up to a factor of 2”. It is surprisingly difficult to improve upon this factor of 2 by elementary methods, though once one can replace 2 by $2 - \varepsilon$
for some $\varepsilon > 0$
(a fact which is roughly equivalent to the absence of zeroes of $\zeta(s)$
on the line $\hbox{Re}(s) = 1$
), one can iterate the Selberg symmetry formula (together with the tautological fact that a $P_2$
almost prime is either a prime or the product of two primes) to get the prime number theorem; this is essentially the Erdős-Selberg elementary proof of that theorem.
One can obtain other divisor bounds of the form (3) by various tricks, for instance by modifying the weights in the above formulae (5) and (6). A surprisingly useful upper bound for the primes between N and 2N is obtained by the simple observation that
$\displaystyle 1_A(n) \leq \Big( \sum_{d | n: d \leq R} \lambda_d \Big)^2$
whenever the $\lambda_d$ are arbitrary real numbers with $\lambda_1 = 1$
(and the sieve level R is less than N), basically because the square of any real number is non-negative. This leads to the Selberg sieve, which suffices for many applications; for instance, it can prove the Brun-Titchmarsh inequality, which asserts that the number of primes between N and N+M is at most $(2+o(1)) \frac{M}{\log M}$
, which is again off by a factor of 2 from the truth when N and M are reasonably comparable. There are also some useful lower bounds for the indicator function of the almost primes of divisor sum type, which can be used for instance to derive Chen’s theorem that there are infinitely many primes p such that p+2 is a $P_2$
almost prime, or the theorem that there are infinitely many $P_2$
almost primes of the form $n^2 + 1$.
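The non-negativity trick behind the Selberg sieve is simple enough to check directly; here is a minimal Python sketch, with a toy set of sieving primes and randomly chosen weights $\lambda_d$ subject only to the constraint $\lambda_1 = 1$:

import random
from math import gcd

random.seed(1)
P = 2 * 3 * 5 * 7                                   # product of the sieving primes (toy choice)
divisors_P = [d for d in range(1, P + 1) if P % d == 0]
lam = {d: random.uniform(-1, 1) for d in divisors_P}
lam[1] = 1.0                                        # the only constraint Selberg's sieve imposes

for n in range(1000, 2000):
    g = gcd(n, P)
    weight = sum(lam[d] for d in divisors_P if g % d == 0) ** 2
    # the square equals 1 when n is coprime to P (only d = 1 contributes) and is >= 0 otherwise,
    # so it dominates the indicator of "no small prime factor", and hence of primality
    assert weight >= (1 if g == 1 else 0)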
In summary, sieve theory methods can provide good upper bounds, lower bounds, and even asymptotics for almost primes, which lead to upper bounds for primes which tend to be off by a constant factor such as 2. Rather frustratingly, though, sieve methods have proven largely unable to count or even lower bound the primes themselves, thus leaving the twin prime conjecture (or the conjecture about infinitely many primes of the form $n^2+1$) still out of reach. The reason for this – the parity problem – was first clarified by Selberg. Roughly speaking, it asserts:
Parity problem. If A is a set whose elements are all products of an odd number of primes (or are all products of an even number of primes), then (without injecting additional ingredients), sieve theory is unable to provide non-trivial lower bounds on the size of A. Also, any upper bounds must be off from the truth by a factor of 2 or more.
Thus we can hope to count almost primes (because they can have either an odd or an even number of factors), or to count numbers which are the product of 6 or 7 primes (which can for instance be done by a sieve of Bombieri), but we cannot hope to use plain sieve theory to just count primes, or just count semiprimes (the product of exactly two primes).
To explain this problem, we introduce the Liouville function $\lambda(n)$ (a close relative of the Möbius function), which is equal to +1 when n is the product of an even number of primes and -1 otherwise. Thus the parity problem applies whenever $\lambda$
is identically +1 or identically -1 on the set A of interest.
The Liouville function oscillates quite randomly between +1 and -1. Indeed, the prime number theorem turns out to be equivalent to the assertion that $\lambda$ is asymptotically of mean zero,
$\displaystyle \sum_{n \leq N} \lambda(n) = o(N)$
(a fact first observed by Landau), and if the Riemann hypothesis is true then we have a much better estimate
$\displaystyle \sum_{n \leq N} \lambda(n) = O( N^{1/2 + \varepsilon} )$
for all $\varepsilon > 0$.
Assuming the generalised Riemann hypothesis, we have a similar claim for residue classes:
$\displaystyle \sum_{n \leq N: n = a \hbox{ mod } q} \lambda(n) = O( N^{1/2 + \varepsilon} )$
for all $\varepsilon > 0$.
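Even modest numerics make the cancellation in these sums visible; here is a rough Python sketch (the cutoff is arbitrary, and trial division keeps it slow but simple):

def liouville(n):
    # lambda(n) = (-1)^(number of prime factors of n, counted with multiplicity)
    count, d = 0, 2
    while d * d <= n:
        while n % d == 0:
            n //= d
            count += 1
        d += 1
    if n > 1:
        count += 1
    return -1 if count % 2 else 1

N = 10**5
S = sum(liouville(n) for n in range(1, N + 1))
print(S, S / N)   # the partial sum is o(N); under RH it would be O(N^(1/2 + eps))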
What this basically means is that the Liouville function is essentially orthogonal to all smooth sets, or all smooth functions. Since sieve theory attempts to estimate everything in terms of smooth sets and functions, it thus cannot eliminate an inherent ambiguity coming from the Liouville function. More concretely, let A be a set where $\lambda$ is constant (e.g. $\lambda$
is identically -1, which would be the case if A consisted of primes) and suppose we attempt to establish a lower bound for the size of a set A in, say, [N,2N] by setting up a divisor sum lower bound
$\displaystyle 1_A(n) \geq \sum_{d | n} c_d \qquad (7)$
where the divisors d are concentrated in the range $d \leq R$ for some reasonably small sieve level R. If we sum in n we obtain a lower bound of the form
$\displaystyle |A| \geq \sum_d c_d \frac{N}{d} - O\Big( \sum_d |c_d| \Big) \qquad (8)$
and we can hope that the main term will be strictly positive and the error term is of lesser order, thus giving a non-trivial lower bound on |A|. Unfortunately, if we multiply both sides of (7) by the non-negative weight $1 + \lambda(n)$
, we obtain
$\displaystyle 0 \geq (1 + \lambda(n)) \sum_{d | n} c_d$
since we are assuming $\lambda$ to equal -1 on A. If we sum this in n, and use the fact that $\lambda$
is essentially orthogonal to divisor sums, we obtain
$\displaystyle 0 \geq \sum_d c_d \frac{N}{d} - o\Big( \sum_d |c_d| \frac{N}{d} \Big) - O\Big( \sum_d |c_d| \Big)$
which basically means that the bound (8) cannot improve upon the trivial bound $|A| \geq 0$. A similar argument using the weight $1 - \lambda(n)$
also shows that any upper bound on |A| obtained via sieve theory has to essentially be at least as large as 2|A|.
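One way to see this orthogonality numerically is to twist a simple divisor sum by the Liouville function; in the Python sketch below (with an arbitrary cutoff), the twisted sum of the divisor function is a negligible fraction of the untwisted one:

def liouville(n):
    count, d = 0, 2
    while d * d <= n:
        while n % d == 0:
            n //= d
            count += 1
        d += 1
    if n > 1:
        count += 1
    return -1 if count % 2 else 1

def num_divisors(n):
    total, d = 0, 1
    while d * d <= n:
        if n % d == 0:
            total += 1 if d * d == n else 2
        d += 1
    return total

N = 20000
plain = sum(num_divisors(n) for n in range(1, N + 1))
twisted = sum(liouville(n) * num_divisors(n) for n in range(1, N + 1))
print(plain, twisted)   # the twist by lambda essentially destroys the main term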
Despite this parity problem, there are a few results in which sieve theory, in conjunction with other methods, can be used to count primes. The first of these is the elementary proof of the prime number theorem alluded to earlier, using the multiplicative structure of the primes inside the almost primes. This method unfortunately does not seem to generalise well; for instance, the product of twin primes is not a twin almost prime. Other examples arise if one starts counting certain special two-parameter families of primes; for instance, Friedlander and Iwaniec showed that there are infinitely many primes of the form $a^2 + b^4$ by a lengthy argument which started with Vaughan’s identity, which is sort of like an exact sieve, but with a (non-smooth) error term which has the form of a bilinear sum, which captures correlation with the Liouville function. The main difficulty is to control this bilinear error term, which after a number of (non-trivial) arithmetic manipulations (in particular, factorising $a^2 + b^4$
over the Gaussian integers) reduces to understanding some correlations between the Möbius function and the Jacobi symbol, which is then achieved by a variety of number-theoretic tools. The method was then modified by Heath-Brown to also show infinitely many primes of the form $x^3 + 2y^3$
. Related results for other cubic forms using similar methods have since been obtained by Heath-Brown and Moroz and by Helfgott (analogous claims for quadratic forms date back to Iwaniec). These methods all seem to require that the form be representable as a norm over some number field and so it does not seem as yet to yield a general procedure to resolve the parity problem.
The parity problem can also sometimes be overcome when there is an exceptional Siegel zero, which basically means that there is a quadratic character $\chi$ which correlates very strongly with the primes. Morally speaking, this means that the primes can be largely recovered from the $P_2$
almost primes as being those almost primes which are quadratic non-residues modulo the conductor q of $\chi$
, and this additional information seems (in principle, at least) to overcome the parity problem obstacle (related to this is the fact that Siegel zeroes, if they exist, disprove GRH, and so the Liouville function is no longer as uniformly distributed on smooth sets as Selberg’s analysis assumed). For instance, Heath-Brown showed that if a Siegel zero existed, then there are infinitely many prime twins. Of course, assuming GRH then there are no Siegel zeroes, in which case these results would be technically vacuous; however, they do suggest that to break the parity barrier, we may assume without loss of generality that there are no Siegel zeroes.
Another known way to partially get around the parity problem is to combine precise asymptotics on almost primes (or of weight functions concentrated near the almost primes) with a lower bound on the number of primes, and then use combinatorial tools to parlay the lower bound on primes into lower bounds on prime patterns. For instance, suppose one could count the set
$\displaystyle A := \{ N \leq n \leq 2N: n(n+2)(n+6) \in P_k \}$
accurately (where $P_k$ is the set of k-almost primes), and also obtain sufficiently good lower bounds on the sets
$\displaystyle A_1 := \{ n \in A: n \hbox{ prime} \}, \quad A_2 := \{ n \in A: n+2 \hbox{ prime} \}, \quad A_3 := \{ n \in A: n+6 \hbox{ prime} \},$
and more precisely that one obtains
$\displaystyle |A_1| + |A_2| + |A_3| > |A|$.
(For comparison, the parity problem predicts that one cannot hope to do any better than showing that , so the above inequality is not ruled out by the parity problem obstruction.)
Then, just from the pigeonhole principle, one deduces the existence of $N \leq n \leq 2N$ such that at least two of n, n+2, n+6 are prime, thus yielding a pair of primes whose gap is at most 6. This naive approach does not quite work directly, but by carefully optimising the argument (for instance, replacing the condition $n(n+2)(n+6) \in P_k$
with a smoother Selberg-type weight such as $( \sum_{d | n(n+2)(n+6): d \leq R} \lambda_d )^2$
), Goldston, Yildirim, and Pintz were able to show unconditionally that prime gaps in $[N, 2N]$
could be as small as $o(\log N)$
, and could in fact be as small as 16 infinitely often if one assumes the Elliott-Halberstam conjecture.
In a somewhat similar spirit, my result with Ben Green establishing that the primes contain arbitrarily long progressions proceeds by first using sieve theory methods to show that the almost primes (or more precisely, a suitable weight function concentrated near the almost primes) are very pseudorandomly distributed, in the sense that several self-correlations of this weight function
can be computed and agree closely with what one would have predicted if the almost primes were distributed randomly (after accounting for some irregularities caused by small moduli). Because of the parity problem, the primes themselves are not known to be as pseudorandomly distributed as the almost primes; however, the prime number theorem does at least tell us that the primes have a positive relative density in the almost primes. The main task is then to show that any set of positive relative density in a sufficiently pseudorandom set contains arithmetic progressions of any specified length; this combinatorial result (a “relative Szemerédi theorem“) plays roughly the same role that the pigeonhole principle did in the work of Goldston-Yildirim-Pintz. (On the other hand, the relative Szemerédi theorem works even for arbitrarily low density, whereas the pigeonhole principle does not; because of this, our sieve theory analysis is far less delicate than that in Goldston-Yildirim-Pintz.)
It is probably premature, with our current understanding, to try to find a systematic way to get around the parity problem in general, but it seems likely that we will be able to find some further ways to get around the parity problem in special cases, and perhaps once we have assembled enough of these special cases, it will become clearer what to do in general.
[Update, June 6: definition of almost prime modified, to specify that all prime factors are large. Reference to profinite integers deleted due to conflict with established notation.]
5 June, 2007 at 10:25 pm
TK
This blog is the best I’ve seen on the internet for mathematical exposition. Second only to the arxiv as a source of wisdom.
Kudos to the author; I hope he will continue to find time to write these long postings.
6 June, 2007 at 11:44 am
Rob
How could one go about obtaining a copy of the unpublished lecture notes of Iwaniec referred to above?
6 June, 2007 at 1:03 pm
Emmanuel Kowalski
This is indeed a wonderful post!
Here is a remark which may also be useful to understand some issues with sieve.
In the “almost prime” setting, there is often a distinction between integers having few prime factors (with multiplicity) and integers having no small prime factors. Most sieves typically produce the second type of integers, more precisely integers $n \leq x$ such that all primes $p$ dividing $n$ are larger than $x^c$ for some (fixed) positive constant $c$ (often small). So those $n$ have at most $1/c$ prime factors.
The reason the distinction is sometimes important lies in the density of the two sets: for instance, there are about $x \log\log x / \log x$ integers $n \leq x$ divisible by at most two primes, but only a constant (depending on $c$) times $x / \log x$ integers with no prime factor $\leq x^c$. In other words, the density of the second type of “almost primes” is comparable to that of the primes themselves, whereas there are rather more of the first type.
This might explain for instance (where the parity problem might suggest a serious difficulty) why Goldston, Graham, Pintz and Yildirim were able to prove that there are infinitely many integers which are products of two primes exactly and at some bounded distance (i.e., what looks like the analogue of bounded distances between primes; but among a set where the mean spacing is not $\log x$ but somewhat smaller).
To reply (partly) to the last comment: Iwaniec and Friedlander are currently finishing
a book on sieve based in large part on those unpublished notes.
16 December, 2015 at 12:17 pm
Justin
I am very interested in the distribution of integers that are the product of at most two primes. You mention that there are asymptotically $C(c) \cdot x / \log x$ integers with no prime factor $\leq x^c$, where $C(c)$ is a constant depending on $c$. Does this directly appear in the literature somewhere? Or is it derived from a more general result? I have been searching high and low for it, but to no avail.
16 December, 2015 at 12:58 pm
Terence Tao
See https://en.wikipedia.org/wiki/Buchstab_function . Tenenbaum’s text “Introduction to analytic and probabilistic number theory” should have a thorough treatment of this (probably Montgomery-Vaughan’s “Multiplicative Number Theory” also).
16 December, 2015 at 11:38 pm
Justin
This is really helpful; thanks for the quick reply!
6 June, 2007 at 4:02 pm
Top Posts « WordPress.com
[…] Open question: The parity problem in sieve theory The parity problem is a notorious problem in sieve theory: this theory was invented in order to count prime patterns of […] […]
6 June, 2007 at 5:18 pm
Felipe Voloch
Sometimes one can get lower bounds out of upper bounds in prime number estimates using some Galois theory. Two examples of this (actually closely related) occur in Bombieri’s version of Stepanov’s proof of the Riemann hypothesis for function fields and in Chebotarev’s proof of his density theorem. I’ve never seen these arguments in the context of sieve theory, though.
6 June, 2007 at 11:11 pm
Emmanuel Kowalski
In sieve, getting lower bounds from upper bounds is usually done with the Buchstab identity, which is again a form of inclusion exclusion and states that
$\displaystyle S(\mathcal{A}, z) = S(\mathcal{A}, w) - \sum_{w \leq p < z} S(\mathcal{A}_p, p)$
where $S(\mathcal{A}, z)$ is the sum of some sequence $(a_n)$ over integers with no prime factor less than $z$ (and $\mathcal{A}_p$ denotes the part of the sequence supported on multiples of $p$), and $w$ is chosen arbitrarily less than $z$, i.e., integers with no prime factor less than $z$ are those with no prime factors less than $w$ (there are more of them), minus those with smallest prime divisor $p$ between $w$ and $z$, and no prime factor smaller than this $p$.
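A quick numerical check of the identity, as a rough Python sketch with the unit weights $a_n = 1$ and arbitrary cutoffs:

def spf(n):
    # smallest prime factor of n (returns n itself when n is prime)
    d = 2
    while d * d <= n:
        if n % d == 0:
            return d
        d += 1
    return n

def S(A, z):
    # number of elements of A with no prime factor less than z
    return sum(1 for n in A if spf(n) >= z)

A = list(range(2, 10000))
w, z = 5, 50
sieving_primes = [p for p in range(w, z) if spf(p) == p]
lhs = S(A, z)
rhs = S(A, w) - sum(S([n for n in A if n % p == 0], p) for p in sieving_primes)
print(lhs, rhs)   # the two counts agree exactly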
Somewhat related is the nice trick of “switching sieve”, found first (I think) in the paper of Iwaniec on primes represented by quadratic forms, and spectacularly used by Chen in his result on $p + 2 = P_2$: in this second case, roughly, Chen showed first that some weighted counting of primes $p$ with $p + 2$ free of small prime factors is “large”; then he showed that the number of primes $p$ where $p + 2 = p_1 p_2 p_3$ is a product of three primes is smaller than this lower bound, so most of the first lower bound had to come from $p$ where $p + 2 = P_2$… The second step is done by sieving the sequence $(p_1 p_2 p_3 - 2)$ to count how many primes it may represent (there one needs only an upper bound!). It is not obvious a priori that this will work; in Chen’s first version, it depends on getting some numerical inequality right (and this depends of course on the careful choice of weight in the first lower bound).
7 June, 2007 at 9:05 am
Jordan Ellenberg
You ask why the adelic formulation is rarely used in sieve theory; I think it’s probably just because when you work over Z you
can get away with _not_ using it. It’s when you want to prove things
over arbitrary number fields (or for that matter function fields)
that the adelic point of view becomes really indispensable — so I
think one good outcome of using the adeles systematically is that it
will be clear (if you care) which of these theorems in sieve theory
truly have to do with Z and which are theorems about integers in
general number fields. Of course it is not quite obvious even what the correct questions are over more general fields! (And perhaps here, too, the adelic formulation would be of use.)
7 June, 2007 at 9:55 am
Emmanuel Kowalski
With respect to adèles, I think even when it would be a good conceptual tool, it is not necessarily a good fit for the foundational part of sieve methods (before applications), which tends to be very “finitistic” (although O() symbols are sometimes used, they can pretty much be replaced by completely explicit bounds if one is careful enough — Iwaniec did this very successfully for the so-called beta-sieve). So adèles could also be replaced with finite products of completions, and then dissolved in finite quotients…
Recently I’ve been playing with quite general sieve settings, where the integers are replaced with potentially much more complicated-looking things (e.g., discrete groups, algebraic fundamental groups, probability spaces with random walks,…) without seeing infinitary objects like the adèles being obviously necessary (though they can be used to phrase things in a clean way).
7 June, 2007 at 10:11 am
Terence Tao
Dear Emmanuel and Jordan,
Thanks very much for your comments! I can see that adeles are perhaps not the right tool for really getting into the quantitative and finitary details of error estimates, etc., but they should at least provide a nice way to compute the “main term” that comes out of sieve theory (such as the expressions
mentioned in my post). For instance, I have felt that all the standard conjectures (Hardy-Littlewood prime tuples conjecture, Bateman-Horn conjecture, etc.) about the distribution of prime patterns would be very cleanly stated in an adelic setting.
Regarding other number fields, I have a paper establishing infinitely many constellations amongst the Gaussian integer primes (or any dense subset thereof), by modifying the methods of my paper with Ben. For this, I of course had to do some sieving in the Gaussian integers, but this was fairly straightforward to do by hand. I don’t know how to do the same thing for more general integral rings, e.g.
, mainly because I am using a rather “low-tech” approach to sieving that relies in particular on unique factorisation (the fundamental theorem of arithmetic). (There are also some mild problems with units, but as long as there are only finitely many of these, this does not seem to be dangerous.) A model problem would be to figure out how to use sieve theory to count constellations of “almost primes” or “almost irreducibles” in such integral rings, though I am not entirely sure exactly what those concepts even mean once one loses unique factorisation. There is also the question of exactly what type of “boxes” one should count the patterns in, though presumably the Archimedean valuations should tell you what to do there.
7 June, 2007 at 11:22 am
Emmanuel Kowalski
Certainly, as far as the main term is concerned, or for any Euler product more generally, (involving in addition an archimedean term), seeing things adelically is the right thing to do as soon as some generality is desired.
For problems in rings of integers in number fields with no unique factorization, one might look naively for problems involving “irreducibles”, i.e., elements of the ring which generate a prime ideal. Algebraic/analytic number theory proves that there are many of those (a proportion $1/h$ of prime ideals are principal, where $h$ is the class number). If there are infinitely many units, more needs to be done to get nice finite sets in which to sieve, but for imaginary quadratic fields, this makes it possible to write down some nice twin-prime like problems in general.
There might well be more interesting problems remaining to be discovered, on the other hand…
8 June, 2007 at 10:34 pm
Keith Conrad
This post is about Terry’s remark on what type of “boxes” one should count in.
There certainly is no canonical choice unless the number field has a unique archimedean place (thus limiting you to Q and imaginary quadratics). In other cases, you could pick some region in the product of archimedean completions of the number field — so in R x R if you had in mind Q(sqrt(2)), say, and take uniformly expanded versions of that as some appropriate scaling parameter grows. (If you aren’t careful then you can pick regions that are too thin in some direction.) Even if there’s nothing canonical about such a choice of boxes in which you tabulate statistics on prime values of some expression across the region, numerics shows that the number of prime values you find in the box (if you’re looking at values of a polynomial f(x), generating a principal prime ideal in case h > 1) grows with the scaling parameter in the same asymptotic way as the classical Hardy–Littlewood constant you would attach to the polynomial times the sum of 1/log|N(f(x))| where N is the norm down to Z from your ring of integers and x runs over integers in the box. I am not saying there is any hope of a proof in such generality, but the numerics look okay. (This is just an extrapolation of the usual heuristic probability of f(n) being prime as C_f/log|f(n)| where f is in Z[X] and C_f is a Hardy–Littlewood constant which measures how often f’s values on Z are not divisible by the primes.)
One way to get around the choice of ad hoc “boxes” is to forget about any kind of “(un)natural density” for prime values and use the Zariski topology instead. This is done in recent work of Sarnak, Bourgain, Gamburd.
In a somewhat different direction, ordinarily it is said that looking at prime pairs n and n + 1 is silly: one of n and n + 1 is even. But if you work not in Z but in Z[1/2] then the idea makes sense since the problem at 2 goes away. Just as Z is discrete and co-compact in R, Z[1/2] is discrete and co-compact in R x Q_2. If you draw a large region in R x Q_2 which is not thin in some direction then the number of n in the region from Z[1/2] (not just Z) for which n and n+1 are both prime (in Z[1/2], of course) agrees with the sum of 1/log N_2(n(n+1)) over the region times the corresponding Hardy–Littlewood constant with the local factor at 2 dropped out. By N_2(m) for m in Z[1/2] I mean the size of Z[1/2]/(m), just as one measures (nonzero) integers k by |k| = #(Z/k). Here I am not claiming one can actually prove something new, but the numerics look okay. Similarly, for the “triple primes” of the form n, n+2, n+4, which is nonsense in Z because one of the terms is always a multiple of 3, there is meaning to this idea in Z[1/3] and counting triple primes n, n+2, n+4 from Z[1/3] within regions of R x Q_3 returns numerics consistent with analogues of the usual prime-counting heuristics for polynomial values over Z, but here the Hardy–Littlewood constant has no local factor at 3.
Oh, when I say the Hardy–Littlewood constant has its local factor at 2 or 3 taken out depending on n and n + 1 or n, n+2, and n+4 being considered, the effect of 2 and 3 reappears elsewhere: you need to include in the heuristic prime-counting estimate a reciprocal of the residue of a zeta-function at s = 1 with the local Euler factor at 2 or 3 removed, which actually makes something show up from 2 or 3 in the residue itself. (Since
Res_{s=1} zeta(s) = 1, one doesn’t notice zeta-residues in the classical setting over Z.)
Ack, I’ve probably written too much. Good night.
9 June, 2007 at 1:15 am
Ben Green
This reminds me a little bit of a trick Terry and I use in our work on linear equations in primes which we call the “W-trick”. Suppose you want to find long arithmetic progressions in the odd primes
3,5,7,11,13,17,19….
Then you may as well look for progressions in the sequence
1,2,3,5,6,8,9….
which I got from the previous one by subtracting one from each term and dividing by two. The new sequence doesn’t suffer from the same dislike of the even numbers that the primes do – in fact since asymptotically 50 percent of the primes are 1 mod 4, about half of the elements of this sequence are even and half odd.
Now I’ll just look at the elements of the new sequence which are divisible by 3, and I’ll divide each of them by 3 to get a new sequence
1,2,3,5,6,7,10…..
This sequence is now nicely distributed mod 2 and mod 3 (that is, mod 6). To see why, let’s ask when 6n + r lies in the sequence. This is the case if and only if 6(6n + r) + 1 is prime, that is to say iff 36n + a is prime where a = 6r + 1. But there are, asymptotically, equal numbers of primes in each residue class a mod 36 for which a is coprime to 36.
One can continue in this way mod 5, 7 and so on up to w, obtaining a sequence which is a rescaled version of the primes congruent to 1 mod W, where W = 2 x 3 x 5 x 7… x w (this is why we call it the W-trick).
You can actually play the same game for the primes congruent to a mod W, for any a.
These rescaled primes will have negligible bias mod every prime less than or equal to w. This makes computing the Hardy-Littlewood constant a rather trivial matter – if w is large then it tends to be very close to 1 (actually there are some slight issues with this – the Hardy-Littlewood constant appearing in the Goldbach conjecture can be dominated by the contribution from large primes, say if you’re counting the number of ways to represent N = p + q where N has many large prime factors).
If one can prove that all these Hardy-Littlewood constants are about 1 then one can piece together the information over all progressions a (mod W), for different a, to recover the Hardy-Littlewood constant for the original problem. For example one can use this technique to make the classical prediction of the number of twin primes less than N.
The point about these rescaled primes is that (unlike the primes themselves) they look sort of random. Thus in making predictions about them one can look at what is true for a random set of similar size and just guess that the same ought to be true for the primes.
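Here is a small numerical sketch of the W-trick in Python (with W = 2 x 3 x 5 x 7 and an arbitrary cutoff): the rescaled primes are close to equidistributed modulo each of the small primes, unlike the primes themselves.

def is_prime(n):
    return n > 1 and all(n % d for d in range(2, int(n ** 0.5) + 1))

W = 2 * 3 * 5 * 7
rescaled = [(p - 1) // W for p in range(2, 10**6) if p % W == 1 and is_prime(p)]
for q in (2, 3, 5, 7):
    counts = [sum(1 for m in rescaled if m % q == r) for r in range(q)]
    print(q, counts)   # each residue class modulo q receives roughly the same share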
There is another heuristic which is more accurate than the above, which is the so-called Mobius Randomness Principle. This asserts that the Mobius function is highly orthogonal to everything unless there is some obvious reason why it shouldn’t be (for example, it is not orthogonal to itself or to the von Mangoldt function). This principle can also be used to guess these Hardy-Littlewood constants, and much more – for example the Riemann hypothesis is equivalent to the statement that Mobius is highly orthogonal (with square-root cancellation) to the constant function 1.
Proving even a weak instance of the Mobius randomness principle involves overcoming the parity problem, as Terry explained above. There’s more on all this in Terry’s earlier post (Simons Lecture I) and in some of the comments that follow it.
9 June, 2007 at 10:00 am
Jonathan Vos Post
Very fine thread! I understood it better, having seen Ben Green’s recent lectures to the Caltech Math department.
The same methods seem to work for quasi-Eratosthenesian sieves, such as for Ulam’s “lucky numbers.”
http://www.research.att.com/~njas/sequences/A000959
Note the comment at that OEIS link:
COMMENT
An interesting general discussion of the phenomenon of ‘random primes’ (generalizing the lucky numbers) occurs in Hawkins (1958). Heyde (1978) proves that Hawkins’ random primes do not only almost always satisfy the PNT but also the Riemann Hypothesis. – Alf van der Poorten, Jun 27 2002
REFERENCES
M. Gardner, Lucky numbers and 2187, Math. Intellig., 19 (No. 2, 1997), 26-29.
M. Gardner, Gardner’s Workout, Chapter 21 “Lucky Numbers and 2187” pp 149-156 A.K.Peters MA 2002.
V. Gardiner, R. Lazarus, N. Metropolis and S. Ulam, On certain sequences of integers defined by sieves, Math. Mag., 29 (1955), 117-119.
R. K. Guy, Unsolved Problems in Number Theory, C3.
D. Hawkins, The random sieve, Math. Mag. 31 (1958), 1-3.
C. C. Heyde, Ann. Probability, 6 (1978), 850-875.
C. S. Ogilvy, Tomorrow’s Math. 2nd ed., Oxford Univ. Press, 1972, p. 99.
D. Wells, The Penguin Dictionary of Curious and Interesting Numbers. Penguin Books, NY, 1986, 114.
One can examine intersections such as:
A031157 Numbers that are both lucky and prime.
3, 7, 13, 31, 37, 43, 67, 73, 79, 127, 151, 163, 193, 211, 223, 241, 283, 307, 331, 349, 367, 409, 421, 433, 463, 487, 541, 577, 601, 613, 619, 631, 643, 673, 727, 739, 769, 787, 823, 883, 937, 991, 997, 1009, 1021, 1039, 1087, 1093, 1117, 1123, …
Looking mod 10 can be confusing to me, as in:
http://www.research.att.com/~njas/sequences/A129864
A129864 Numbers that are both lucky and emirp.
13, 31, 37, 73, 79, 739, 769, 991, 1009, 1021, …
FORMULA
A000959 INTERSECTION A006567. a(n) is n element of A000959 AND a(n) is an element of A000040 AND R(a(n)) = A004086(a(n)) is an element of A000040.
EXAMPLE
a(9) = 1009 because 1009 is a lucky number A000959(154) and 1009 is an emirp because 1009 is prime and R(1009) = 9001 is prime (but is not lucky).
9 June, 2007 at 12:28 pm
physics_grad_student
Hi Terry,
I was wondering if the techniques you mentioned can be used to solve a problem I’m dealing with but I can’t seem to get it to work after trying for a few days. I would like to prove the following (BTW I’m not a math major)
Suppose we are interested in modular arithmetic mod d. d>1
Let S be the set of numbers coprime to d excluding zero. Let a and b be elements of set S.
Then what is the range of the function x = a+b mod d?
Numerically, for all d less than 2000, the answer is
1) If d is even, x = {0,2,4. . . d-2} (all even numbers less than d)
2) If d is odd, x= {0,1,2,. . .d-1} (all numbers less than d)
I would like to prove that the above result holds for all values of d.
I can prove this for d=odd prime, which can be extended to odd prime powers, which can then be extended to d=product of at most 4 odd prime powers. Any help is greatly appreciated.
BTW is there any connection between this problem and the Goldbach Conjecture?
9 June, 2007 at 11:31 pm
Noah Snyder
physics_grad_student:
Your conjecture is true, and although it is totally unrelated to the Goldbach Conjecture it is still a cute little exercise in elementary number theory. I’ll just do the d is odd case, though the d is even case is basically identical with a small twist.
By the Chinese remainder theorem it is enough to do the case where d is a power of a prime, say d=p^n. Now, modulo p^n it is easy to compute what fraction of the numbers are relatively prime to p^n, namely (p-1)/p. In particular, if p is odd then more than half of the numbers modulo p^n are relatively prime to p^n.
Now, let’s rephrase the problem slightly. Let U be the set of numbers relatively prime to d=p^n. For some number x, let x-U be the set of elements of the form x-u for u in U. Saying that x is of the form a+b as above is equivalent to saying that U and x-U have a nontrivial intersection. (If y is in U and in x-U then x = y + (x-y) exhibits x as a sum of two elements of U.) If p is odd, since more than half of the numbers are in U and more than half of the numbers are in x-U they must overlap!
The interesting thing about this argument is that it only works if you tackle it one prime at a time. In particular, in Z/105 fewer than half of the integers less than 105 are relatively prime to 105. So a direct pigeonhole argument won’t work; you need the Chinese remainder theorem.
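For what it’s worth, a short brute-force Python check confirms the conjecture for all small moduli:

from math import gcd

for d in range(2, 200):
    units = [a for a in range(1, d) if gcd(a, d) == 1]
    sums = {(a + b) % d for a in units for b in units}
    expected = set(range(d)) if d % 2 else set(range(0, d, 2))
    assert sums == expected
print("verified for all d < 200")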
10 June, 2007 at 8:11 am
Terence Tao
Actually, there is some connection to the Goldbach conjecture; physics_grad_student’s conjecture is the “local” version. If there was a residue class mod n which was not expressible as a sum of two classes coprime to n, but was expressible as an even number, then Goldbach’s conjecture would easily be seen to be false. In the language of my Simons I lecture, the fact that the above conjecture is true means that there are no local obstructions to solvability of the Goldbach problem. Unfortunately, we don’t currently have a “local to global” principle for this type of problem to then finish the job.
The type of sieve theory that gives near misses to the Goldbach conjecture (e.g. Chen’s theorem that every large even number is the sum of a prime and a P_2) implicitly uses the fact that the above conjecture is true to ensure that the “main term” or “singular series” in the number of solutions is safely bounded away from zero.
10 June, 2007 at 8:22 am
Jonathan Vos Post
105 doesn’t work as a side-effect of it being the smallest odd sphenic number, i.e. product of 3 distinct odd primes. The next odd sphenic number is 165. Why does the argument break down at these? If we patch the argument, what then obstructs that approach to proof?
http://www.research.att.com/~njas/sequences/A007304
10 June, 2007 at 3:34 pm
Jonathan Vos Post
I think these exceptions come from Sylow’s theorems.
A homework problem of Brian Harbourne, U. Nebraska at Lincoln, gives most of the reason.
http://www.math.unl.edu/~bharbourne1/M417Spr03/M417Hmwk9Sols.html
Problem: Let H and K be subgroups of a group G of order pqr, where p, q and r are distinct primes. If |H| = pq and |K| = qr, prove that the intersection (call it I) of H and K has order q.
Proof: Since |I| divides both |H| and |K|, we see that |I| divides both qr and qp. But the only common factors of qr and qp are 1 and q.
So suppose |I| = 1. Let HK denote the subset
{g in G : g = hk for some h in H and some k in K}. Suppose h_1 and h_2 are in H and that k_1 and k_2 are in K.
If h_1 k_1 = h_2 k_2, then k_1 (k_2)^-1 =
(h_1)^-1 h_2, but k_1 (k_2)^-1 is in K and
(h_1)^-1 h_2 is in H, so k_1 (k_2)^-1 =
(h_1)^-1 h_2 is in I, and by assumption must thus be e (since we are assuming |I| = 1).
Thus k_1 (k_2)^-1 = e so k_1 = k_2, and likewise h_1 = h_2. Thus the map f : HxK -> HK defined by f((h, k)) = hk is injective. It is obviously surjective, so it is bijective, hence |HK| = |H||K| = p q^2 r. But HK is a subset of G, and this means |HK| = p q^2 r is bigger than |G| = pqr, which is impossible. Thus |I| = 1 must be false, so |I| = q.
8 August, 2007 at 8:01 am
(Emmanuel Kowalski) The large sieve inequalities « What’s new
[…] post may be seen as complementary to the post “The parity problem in sieve theory“. In addition to a survey of another important sieve technique, it might be interesting as a […]
12 August, 2007 at 8:13 pm
Atle Selberg « What’s new
[…] In working on the zeta function, Selberg developed two powerful tools which are still used routinely in analytic number theory today. The first is the method of mollifiers to smooth out the magnitude oscillations of the zeta function, leaving making the (more interesting) phase oscillation more visible. The second was the method of the Selberg sieve, which is a particularly elegant choice of sieve which allows one to count patterns in almost primes (and hence to upper bound patterns in primes) quite accurately. Variants of the Selberg sieve were a crucial ingredient in the work of Goldston-Yıldırım-Pintz on prime gaps, and the work of Ben Green and myself on arithmetic progressions in primes. (I discuss the Selberg sieve, as well as the Selberg symmetry formula below, in my post on the parity problem.) […]
6 December, 2007 at 10:11 pm
Milliman Lecture III: Sum-product estimates, expanders, and exponential sums « What’s new
[…] one has to add back in points which are divisible by multiple small factors, and so forth). See my post on sieve theory and the parity problem for further discussion. In order for sieve theory to work well, one needs to be able to accurately […]
1 January, 2008 at 8:41 pm
Amateur
I’m not a mathematician, so please forgive this intrusion, but I’ve been fascinated by sieve theory
and this seems the right place to ask about an aspect of it that I find very puzzling and would very much appreciate if someone could point out an error in my reasoning.
The phenomenon is already apparent in examples of counting problems given by T.Tao at the beginning of his essay on the parity problem at the top of this thread. The simplest question that exhibits the phenomenon explicitly is a variation on the second example given by T.Tao and asks: How many odd numbers are contained in a set of three consecutive integers?
The problem is not with determining the answers, but with their plurality. This lack of uniqueness implies lack of functional dependence. Specifically that the cardinality of the presumed set of solutions is not a function of the cardinality of the original set to which the sieve is applied.
Abstractly, the sieve can be viewed as an instance of the implicit function problem – actually an explicit formula for its solution, if and when applicable. It seems that in this case the underlying functional relationship necessary for the existence of a solution to the implicit function problem is not in place.
In the generic case of inclusion-exclusion the problem does not appear, because this procedure is applied to sets defined explicitly, or at least, as in the case of the various subsets appearing in the formula, capable of being so defined (explicitly computed) by the use of set operations (union, intersection, relative complement) as functions of the original set and relations defined on it. In this setting there is no problem in converting the method to handle the corresponding cardinalities, if those appearing in the formula can be explicitly determined.
Likewise in number theory, if enough information is supplied to nail the set down explicitly the problem disappears.
If by chance I’m right, then the meaning and significance of the “error term” in at least some sieves might require re-examination.
3 January, 2008 at 3:09 pm
Terence Tao
Dear Amateur,
The sieve of Eratosthenes (for instance) does provide an explicit, deterministic formula for the number of primes in any given range. This formula is known as the Legendre identity, see
http://en.wikipedia.org/wiki/Legendre_sieve
Unfortunately, the formula contains the greatest integer part function $\lfloor x \rfloor$, which is hard to compute with. We can express this function as a combination of the “main term” $x$ and the “error term” $\{x\}$ (the fractional part, so that $\lfloor x \rfloor = x - \{x\}$). We understand how to sum up the main term very well, but the error terms are what cause all the trouble. Heuristically, the fractional parts of all the terms which arise in the Legendre identity should be uniformly distributed between 0 and 1 – what reason would they have for doing otherwise? – but we unfortunately cannot prove this; similarly for other sieves. So there’s not much we can do with the error terms other than treat them as being essentially arbitrary numbers between 0 and 1, which is what causes the formulae from sieve theory to become somewhat inexact. But actually, in many cases having a simple but somewhat inexact formula is more useful than an exact but overly complicated formula. For instance, to prove the twin prime conjecture, it is not necessary to count the number of twin primes exactly; having a rough lower bound on this count which goes to infinity would be sufficient, and presumably simpler to establish than an exact count.
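To see concretely how the exact Legendre identity, its main term, and the accumulated fractional-part errors compare, here is a small Python sketch (my own illustration, not from the post; the parameters x and z are arbitrary choices):

```python
from math import prod
from itertools import combinations

# Legendre identity: Phi(x, z) = sum over squarefree d | P(z) of
# mu(d) * floor(x / d) counts the n <= x with no prime factor <= z.
# Writing floor(x/d) = x/d - {x/d} splits each term into the smooth
# "main term" x/d and a fractional-part "error term" of size O(1).

def primes_up_to(z):
    sieve = [True] * (z + 1)
    sieve[:2] = [False, False]
    for n in range(2, int(z ** 0.5) + 1):
        if sieve[n]:
            sieve[n * n::n] = [False] * len(sieve[n * n::n])
    return [n for n, is_prime in enumerate(sieve) if is_prime]

def legendre(x, z):
    ps = primes_up_to(z)
    exact, main = 0, 0.0
    for k in range(len(ps) + 1):
        for combo in combinations(ps, k):
            d = prod(combo)                  # product over the empty tuple is 1
            exact += (-1) ** k * (x // d)    # exact term with the floor
            main += (-1) ** k * (x / d)      # main term only
    return exact, main

x, z = 100_000, 30
small_primes = primes_up_to(z)
exact, main = legendre(x, z)
brute = sum(1 for n in range(1, x + 1) if all(n % p for p in small_primes))
print(exact, brute)                      # identical: the identity is exact
print(round(main), round(exact - main))  # main term vs accumulated errors
```

Even with only ten sieving primes, the exact count and the main term already differ by a sum of 2^10 bounded error terms; it is this accumulation that becomes unmanageable when the number of moduli grows.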
7 January, 2008 at 8:22 pm
AMS lecture: Structure and randomness in the prime numbers « What’s new
[…] well before the mark (sieve levels such as are typical). (The reason for this has to do with the parity problem, which I will not discuss further […]
19 November, 2008 at 2:41 pm
Marker lecture III: Small gaps between primes « What’s new
[…] But there seems to be a significant obstacle to pushing things all the way to 2. Indeed, the parity problem tells us that for any reasonable definition of “almost prime” which is amenable to […]
20 November, 2008 at 1:09 pm
Marker lecture IV: sieving for almost primes and expanders « What’s new
[…] fairly typical). The selection of the sieve weights is now a well-developed science (see also my earlier post, and Kowalski’s guest post, on this topic), and Bourgain, Gamburd and Sarnak basically use […]
1 September, 2009 at 10:47 am
timur
The mathematician Sun Tzu and the military general Sun Tzu seem to be different individuals.
1 September, 2009 at 1:54 pm
Terence Tao
Hmm, I think you’re right. A shame; the story I had was a good one, but it appears to be apocryphal after all, so I’ve deleted it.
2 September, 2009 at 12:38 pm
timur
A long time ago in a galaxy far, far away… :))
24 September, 2009 at 3:19 pm
The prime number theorem in arithmetic progressions, and dueling conspiracies « What’s new
[…] while the genuine primes do not, is a reflection of the parity problem in sieve theory; see this post for further discussion. The symmetry formula is however enough to get “within a factor of […]
29 December, 2009 at 8:44 am
Craig
I was thinking about your theorem with Ben Green that says that the primes contain arbitrarily long progressions. Here is a strange but simple argument that this must be true:
Suppose that it weren’t true. Then there must exist an N such that there are no N-term arithmetic progressions of primes. Then if we exhibited an arithmetic progression a+bk, k=1,2,3,…,N-1 of all primes, we could know immediately that a is not prime, since if so then a would be part of an N-term progression of primes.
Also note that if this were possible, we could know that a is not prime without having any information about the possible divisors of a, since the fact that each a+bk, k=1,2,3,…,N-1, is prime says nothing about the possible divisors of a. But we need to have some information about the divisors of a in order to know that a is not prime, since the truth of the statement (a is not prime) depends on the divisors of a. This is a contradiction. Hence, your and Ben Green’s theorem must be true.
Comments?
29 December, 2009 at 8:52 am
jc
I don’t follow this argument. In particular, what you mean by “information about possible divisors” doesn’t make any sense to me. Consider the fact that one can test a number for primality without factoring, for instance.
29 December, 2009 at 9:10 am
Craig
jc,
Yes, it is true that you can test a number for primality without factoring, but you cannot test a number for primality without having some information about its factors.
For instance, see the Wikipedia entry for the AKS algorithm. When testing whether n is prime, the algorithm checks whether 1 < gcd(a,n) < n for every a ≤ r (for a suitable parameter r < n). The result of this test gives some information about the factors of n.
29 December, 2009 at 11:59 am
Terence Tao
Actually, the primality of a+bk does tell us something about the divisors of a. For instance, if a+b and a+2b are both prime (and b is positive), this forces a to be odd, simply because a+2b is also odd.
More generally, an argumentum ad ignorantiam does not rise to the level of a rigorous proof, though it may have some heuristic value: just because we cannot see any obvious way to connect the primality of a+b, a+2b, …, a+(N-1)b to the divisors of a, it does not necessarily follow that no such connection exists (and could possibly be discovered in the future by a sufficiently clever mathematician).
Incidentally, there are plenty of primality tests that do not require any explicit tests of divisibility by potential factors of the number being tested at all; Wilson’s theorem provides a simple (though impractical) such example. Another example is the Lucas-Lehmer test for the primality of a Mersenne number. (Of course, the certificate of primality provided by such tests does imply a posteriori that there is no divisibility by smaller factors, but this does not imply that an explicit trial division is necessary prior to establishing the primality certificate.)
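Both of the tests mentioned are easy to demonstrate; here is a short Python sketch (my own, purely illustrative) of Wilson’s theorem and the Lucas–Lehmer test, neither of which performs any trial division by candidate factors:

```python
from math import factorial

def wilson_is_prime(n):
    # Wilson's theorem: n > 1 is prime iff (n-1)! == -1 (mod n).
    # Simple but hopelessly impractical for large n.
    return n > 1 and factorial(n - 1) % n == n - 1

def lucas_lehmer(p):
    # Lucas-Lehmer test: for an odd prime p, the Mersenne number M = 2^p - 1
    # is prime iff s_{p-2} == 0 (mod M), where s_0 = 4 and s_{k+1} = s_k^2 - 2.
    m = 2 ** p - 1
    s = 4
    for _ in range(p - 2):
        s = (s * s - 2) % m
    return s == 0

print([n for n in range(2, 30) if wilson_is_prime(n)])
# -> [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
print([p for p in (3, 5, 7, 11, 13, 17, 19) if lucas_lehmer(p)])
# -> [3, 5, 7, 13, 17, 19]; p = 11 fails since 2047 = 23 * 89
```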
29 December, 2009 at 12:23 pm
Craig
Terrence,
You are correct that my argument is refuted, since a+b, a+2b prime implies that a is odd. But what if I formulate the argument as: the fact that a+bk, k=0,1,2,…,N-2, are prime doesn’t say anything about whether a+b(N-1) is prime, since it says nothing about the divisors of a+b(N-1)?
Also, I disagree with you about Wilson’s theorem. Calculating whether (n-1)! mod n is zero or negative one is essentially the same thing as asking whether n is prime, just stated in a different way.
29 December, 2009 at 12:58 pm
Terence Tao
Essentially the same example applies: if a, a+b, a+2b are prime, and b is positive, then a+3b is odd (because a+b is odd).
In any case, even if one could not immediately produce such examples, the basic argument is still an argumentum ad ignorantiam, and could potentially be refuted by an example of this type in the future. For instance, to use Wilson’s theorem as a hypothetical, suppose some clever mathematician found an ingenious formula that connected (a-1)!, (a+b-1)!, (a+2b-1)!, and (a+3b-1)! to each other. Then it is potentially conceivable that one could use Wilson’s theorem and the primality of a+b, a+2b, a+3b to say something about the primality of a. Now, I don’t know of such a formula (and in fact I strongly doubt that one exists); but I cannot rigorously rule out the possibility that something like this (or one of an infinite number of other possibilities, using some other number-theoretic fact than Wilson’s theorem) could actually happen (except by using my theorem with Ben, of course). All I can say is that I do not see any obvious way to pull off anything like this, but to use this ignorance to deduce anything is the logical fallacy of argumentum ad ignorantiam.
29 December, 2009 at 1:09 pm
Craig
OK, that makes sense. Thank you very much for your response, Terence, and sorry for misspelling your name before.
29 December, 2009 at 9:15 am
Antonios Manoussos
Dear Prof. Tao,
In the text of your very nice article it is written “however, the prime number theorem does at least tell us that the primes have a positive relative density in the almost primes”. I am not an expert in this field but I have a question about that: since the number of 2-almost primes (semiprimes) less than a given $n$ is asymptotically $\frac{n}{\log n} \log \log{n}$, it looks like the relative density of the primes in the semiprimes is 0. Is this o.k.?
Congratulations for your very interesting blog.
29 December, 2009 at 12:01 pm
Terence Tao
For sake of this discussion, I define a semiprime to be the product of two large primes – e.g. primes larger than $N^\varepsilon$ for some fixed threshold $\varepsilon > 0$. These are the types of almost primes that are generated by sieves (semiprimes with small factors tend to get eliminated quite easily by such sieves). With this definition, the number of semiprimes less than N is comparable to $\frac{N}{\log N}$ rather than $\frac{N}{\log N} \log \log N$.
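A rough numerical sanity check of this distinction (my own sketch; the threshold ε = 0.2 and the range N = 10^6 are arbitrary choices, and only the orders of magnitude are meant to be compared):

```python
from math import log

# Count semiprimes p*q <= N, with and without the restriction that both
# prime factors exceed N^eps, and compare with N/log N and N*loglog N/log N.

def primes_up_to(n):
    sieve = [True] * (n + 1)
    sieve[:2] = [False, False]
    for k in range(2, int(n ** 0.5) + 1):
        if sieve[k]:
            sieve[k * k::k] = [False] * len(sieve[k * k::k])
    return [k for k, b in enumerate(sieve) if b]

N, eps = 10 ** 6, 0.2
ps = primes_up_to(N // 2)
threshold = N ** eps

all_semi = restricted = 0
for i, p in enumerate(ps):
    if p * p > N:
        break                      # the smaller factor is at most sqrt(N)
    for q in ps[i:]:
        if p * q > N:
            break
        all_semi += 1
        if p > threshold:          # then q >= p > N^eps as well
            restricted += 1

print("all semiprimes <= N:         ", all_semi)
print("N loglog N / log N:          ", round(N * log(log(N)) / log(N)))
print("semiprimes, factors > N^eps: ", restricted)
print("N / log N:                   ", round(N / log(N)))
```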
29 December, 2009 at 1:05 pm
Antonios Manoussos
Thank you for the answer! Now it is clear to me.
Antonios
10 August, 2010 at 3:51 am
Goldbach’s conjecture (to prove or not to prove) | MathFax.com
[…] https://terrytao.wordpress.com/2007/06/0 … ve-theory/ […]
12 August, 2011 at 8:19 am
Primos gemelos: Un vistazo a algunos resultados. « Lo fascinante de la teoría de números
[…] 5. The Elliott-Halberstam conjecture is another much-discussed conjecture in number theory. In the words of Terence Tao, “a sort of super-generalised Riemann Hypothesis”. […]
28 November, 2011 at 12:11 pm
Craig
Yao’s XOR Lemma shows that computing the parity of several independent weakly unpredictable predicates amplifies the unpredictability to the point where it is almost random. Is there any connection between this and the parity problem in sieve theory?
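For the simplest, purely information-theoretic face of that amplification (independent biased bits, i.e. the piling-up lemma, rather than Yao's computational predicates), here is a small Monte Carlo sketch of my own; the bias 0.2 and the trial count are arbitrary:

```python
import random

def biased_bit(eps):
    # Returns 0 with probability 1/2 + eps, else 1.
    return 0 if random.random() < 0.5 + eps else 1

def xor_bias(k, eps, trials=500_000):
    zeros = sum(1 for _ in range(trials)
                if sum(biased_bit(eps) for _ in range(k)) % 2 == 0)
    return zeros / trials - 0.5    # empirical bias of the parity toward 0

for k in (1, 2, 4, 8):
    # Piling-up lemma: the parity of k independent bits has bias 2^(k-1)*eps^k,
    # so it decays exponentially and soon disappears below the sampling noise.
    print(k, round(xor_bias(k, 0.2), 4), round(2 ** (k - 1) * 0.2 ** k, 4))
```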
1 March, 2012 at 5:25 pm
254B, Notes 7: Sieving and expanders « What’s new
[…] known as the parity problem, which prevents sieve theory from lowering all the way to ; see this blog post for more discussion.) One formulation of this principle was established by Bourgain, Gamburd, and […]
25 May, 2012 at 7:44 am
Heuristic limitations of the circle method « What’s new
[…] which are much easier) on prime patterns, namely the parity problem. I discuss this problem in this previous blog post, and do not have much more to add here to what I already wrote in that […]
16 October, 2012 at 8:46 am
Mathematics: What it takes to prove the Goldbach Conjecture? - Quora
[…] know, but finding creative ways around the parity problem in sieve theory is a promising approach: https://terrytao.wordpress.com/20… […]
15 June, 2013 at 3:38 am
强扭的瓜是甜的 » 素数并不孤独
[…] Open question: The parity problem in sieve theory, Terence Tao, https://terrytao.wordpress.com/2007/06/05/open-question-the-parity-problem-in-sieve-theory/ […]
16 June, 2013 at 12:49 am
oldhat
dear professor Tao,
congratulations for your wonderful site! but it is ‘argumentum ad ignorantiam’, not ‘argumentam ad ignorantium’ (sorry for the nitpickery…).
[Unfortunately I am not able to locate this typo. -T.]
23 June, 2013 at 1:21 am
素数并不孤独 | fwjmath的相空间
[…] Open question: The parity problem in sieve theory, Terence Tao, https://terrytao.wordpress.com/2007/06/05/open-question-the-parity-problem-in-sieve-theory/ […]
23 June, 2013 at 2:17 am
oldhat
Dear professor Tao
yes, in fact it is not in your post, but it appears in two remarks above, which you wrote on december 29 2009, at 11:59 and at 12:58….forgive the grammatical obsession of an old latinist…and keep up your marvelous work (I can’t begin to tell you how envious I am, as a non mathematician, of the way of doing science you show in your blog)!
[Ah, found it, thanks – T.]
1 July, 2013 at 10:33 pm
Avi Levy
Regarding the remark immediately following (5):
I am having trouble seeing how (5) corresponds to the zeta-function identity you have posted (in the display following (5)).
Instead, (5) appears more closely related to the “identity”
, which I will call (5*).
Here’s my reasoning.
(in the notation of recent Polymath8 blog posts)
First, (5) is just a verbose way of writing
Then, since
corresponds to
and
corresponds to
, (5) is a restatement of (5*).
1 July, 2013 at 10:48 pm
Terence Tao
The first part of (5) (equating
with
) does correspond to the identity you state, but the second part of (5) (equating
with
) instead corresponds to the identity in the text. Of course both identities are very easy to prove (and easily seen to be equivalent).
2 July, 2013 at 4:43 pm
Avi Levy
Ah, it seems that the point of (5) is to justify the second equality. I was assuming that (5) was already asserting this statement, which boils down to
for all
.
Instead, if I understand correctly, you are using the display after (5) in order to justify the second equality of (5)?
2 July, 2013 at 9:28 pm
Avi Levy
Yes, one line using Leibniz:

2 July, 2013 at 10:05 pm
Avi Levy
In (4),
should be
.
[Corrected, thanks – T.]
9 July, 2013 at 3:30 pm
科学 – 六号教父 » 素数并不孤独
[…] Open question: The parity problem in sieve theory, Terence Tao, https://terrytao.wordpress.com/2007/06/05/open-question-the-parity-problem-in-sieve-theory/ […]
16 August, 2013 at 6:00 am
Euclid Strikes Back | Gödel's Lost Letter and P=NP
[…] This idea can be extended to triples of boxes. Now Bob must make them pairwise relatively prime. I had an application to a complexity theory problem, in which it was needed to avoid randomness. In my application, the solution is being done by a deterministic machine. Perhaps there are other uses of this simple trick. I note that it is quite close to the “W-trick” described by Terence Tao in relation to Yitang Zhang’s advance on the twin-prime conjecture, and sketched by Ben Green in a comment here. […]
26 September, 2013 at 10:13 am
Kolbjorn Tunstrom
Regarding a sieve theoretic proof of Euclid’s theorem: If we sieve the interval
by the
first primes, then
is divisible by all denominators in the Legendre identity; no terms need to be rounded, the error term vanishes, and the number of integers in
not divisible by any
is given exactly by
Since this holds for any
, it follows that there must be infinitely many primes. Shouldn’t this work?
26 September, 2013 at 12:46 pm
Terence Tao
Yes, this is a valid proof of Euclid’s theorem, albeit one which is very close to Euclid’s original proof. But note that the sifted set that
is counting here is not the set of primes in A, so one is not directly establishing the infinitude of primes by the usual sieve-theoretic method of viewing the primes as a sifted set, but instead using the primes as the set of sieving moduli, which is a different type of argument (more multiplicative number theory than sieve theory).
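If I read the exchange correctly, the exactness observation is easy to verify numerically; here is a minimal Python sketch (the specific choice of interval [1, p_1···p_k] is my own assumption for illustration):

```python
from itertools import combinations
from math import prod

# When the interval is [1, P] with P = p_1 * ... * p_k, every term P // d in
# the Legendre identity equals P / d exactly, so the sifted count is exactly
# P * prod(1 - 1/p_i) = prod(p_i - 1), with no error term at all.

def sifted_count(primes):
    P = prod(primes)
    count = 0
    for k in range(len(primes) + 1):
        for combo in combinations(primes, k):
            count += (-1) ** k * (P // prod(combo))   # prod(()) == 1
    return P, count

primes = [2, 3, 5, 7, 11]
P, count = sifted_count(primes)
print(P, count, prod(p - 1 for p in primes))   # 2310 480 480

# Since the count exceeds 1, some n with 1 < n <= P is coprime to all of
# p_1, ..., p_k, so its prime factors all lie outside that list.
```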
26 September, 2013 at 6:23 pm
Kolbjorn Tunstrom
Thanks for the reply. You are right it is close to Euler’s proof. In fact, an almost replica of his proof appears by sieving the interval
and exploiting the
-periodicity of
to show that
.
I agree completely with your point that these proofs lean more or less towards multiplicative number theory. However, as a basis for building up intuition about what structures develop throughout sieving, it seems quite useful to think in these terms (and perhaps even more so pedagogically). From this perspective, proofs of landmark theorems, like Dirichlet’s theorem on arithmetic progressions, your own Green-Tao theorem on arbitrarily long arithmetic progressions of primes, or Zhang’s very recent result on bounded gaps between primes, all boil down to simple variations of the proofs above. Naturally, these proofs hold only for coprimes and not primes. But it is interesting to observe the immense gap between this simplicity and the sieve theoretic proofs.
Another aspect is that the periodicity of
introduces certain symmetries and constraints. For example, it is easy to show that for an interval
of length
, the bounds of
and
are anti-symmetric, in the sense that
It is merely a playful speculation, but do you think exploring the constraints and symmetries imposed by the periodicity of
could be a useful parallel or addition to sieve theory?
Or has this already been done and doomed?
19 October, 2013 at 7:24 am
素数并不孤独 | ZHANG RONG
[…] Open question: The parity problem in sieve theory, Terence Tao,https://terrytao.wordpress.com/2007/06/05/open-question-the-parity-problem-in-sieve-theory/ […]
18 November, 2013 at 4:06 am
Number Theory: Lecture 17 | Theorem of the week
[…] I mentioned that there is a field of study called sieve theory. In addition to the Wikipedia page, you might also like to look at this blog post by Terry Tao. […]
27 November, 2013 at 10:13 pm
Polymath 8 – a Success! | Combinatorics and more
[…] 6) (Added Nov 27) There is a difficult “parity problem” which seems to be a difficult obstacle for getting the gap to two. (And for various related goals). Terry Tao wrote about it in 2007 in this post. […]
9 July, 2014 at 9:49 pm
The parity problem obstruction for the binary Goldbach problem with bounded error | What's new
[…] in both problems, preventing a solution to either conjecture by almost all known methods (see this previous blog post for more […]
26 August, 2015 at 4:24 pm
Heath-Brown’s theorem on prime twins and Siegel zeroes | What's new
[…] way that can distinguish them from other twins of almost primes. The parity problem is discussed in these previous blog posts; this obstruction is ultimately powered by the Möbius pseudorandomness […]
18 September, 2015 at 4:46 pm
The logarithmically averaged Chowla and Elliott conjectures for two-point correlations; the Erdos discrepancy problem | What's new
[…] such as (2) or (3) are known to be subject to the “parity problem” (discussed numerous times previously on this blog), which roughly speaking means that they cannot be proven solely […]
15 June, 2016 at 3:00 pm
Primos gemelos: Un vistazo a algunos resultados. – Lo fascinante de la teoría de números
[…] another much-discussed conjecture in number theory. In the words of Terence Tao, “a sort of super-generalised Riemann Hypothesis for […]
16 June, 2016 at 11:58 am
Anonymous
What is the best upper estimate for the number
? In the book “Sieve Methods” by Halberstam and Richert there are only lower bounds. Could you recommend literature about upper bounds? Thanks.
16 June, 2016 at 12:15 pm
Anonymous
Sorry,
17 July, 2016 at 8:53 am
Notes on the Bombieri asymptotic sieve | What's new
[…] scalar thus embodies the “parity problem” for the twin prime conjecture (discussed in these previous blog […]
20 August, 2016 at 10:28 pm
Notes on the Bombieri asymptotic sieve - TAREA LISTA
[…] scalar thus embodies the “parity problem” for the twin prime conjecture (discussed in these previous blog […]
12 February, 2017 at 4:16 pm
Anonymous
Are there any known sieves for twin pairs that do not break down with an either/or at one of the numbers? And if not, my main question is, of all known sieves:
>> What is the farthest a sieve has gone in producing twin pairs before it breaks? <<
Lastly, if a sieve were found that never broke down, how could it be used to give a proof of the infinitude of twin pairs? (I am led to believe that even a perfect sieve is not a proof. Am I wrong?)
29 July, 2017 at 12:31 am
Mats Granvik
Is the use of eigenvalues allowed in approaching the parity problem?
https://oeis.org/A275345
20 September, 2017 at 9:37 am
Kyle Pratt
I apologize for posting a comment on an old post.
In your discussion of Selberg’s elementary proof of the PNT you mention that
grows like
. However, for every
we have
for some constant
.
[Corrected, thanks – T.]
22 January, 2018 at 5:13 pm
Hammer and Nails | Radimentary
[…] Usually even the failure of certain techniques sheds light on the shape of the difficulty. One classic example of an enlightening failure is the consistent overcounting (by exactly a factor of two!) of primes by sieve methods. This failure is so serious and unfixable that it has its own name: the Parity Problem. […]
25 February, 2018 at 7:48 am
DZ
The line “For comparison, the parity problem predicts that one cannot hope to do any better than showing that $|A_1|, |A_2|, |A_3| \geq |A|/2$ …” probably should be read $|A_1| + |A_2| +|A_3| \geq |A|/2$ ?
25 February, 2018 at 8:27 am
Terence Tao
Actually the statement is as I intended: it is possible to obtain a lower bound of the form
through lower bounds on
that are no stronger than
,
,
. In contrast, one cannot prove a lower bound of the form
in this fashion.
29 April, 2018 at 8:45 pm
Finding Coprime Pairs | Gödel's Lost Letter and P=NP
[…] comment by Ben Green neatly expresses the analytical motivation, as does section 4 of this paper. The […]
27 December, 2019 at 2:44 am
Fred Daniel Kline
I’m an amateur and in my attempts to find symmetry in Binary Goldbach, I had to make my odd number function use positive indexes. This eliminated an off-by-one issue for me. I don’t know how this would affect the Twin Prime parity.