You are currently browsing the tag archive for the ‘parity problem’ tag.
Two of the most famous open problems in additive prime number theory are the twin prime conjecture and the binary Goldbach conjecture. They have quite similar forms:
- Twin prime conjecture The equation has infinitely many solutions with prime.
- Binary Goldbach conjecture The equation has at least one solution with prime for any given even .
In view of this similarity, it is not surprising that the partial progress on these two conjectures have tracked each other fairly closely; the twin prime conjecture is generally considered slightly easier than the binary Goldbach conjecture, but broadly speaking any progress made on one of the conjectures has also led to a comparable amount of progress on the other. (For instance, Chen’s theorem has a version for the twin prime conjecture, and a version for the binary Goldbach conjecture.) Also, the notorious parity obstruction is present in both problems, preventing a solution to either conjecture by almost all known methods (see this previous blog post for more discussion).
In this post, I would like to note a divergence from this general principle, with regards to bounded error versions of these two conjectures:
- Twin prime with bounded error The inequalities has infinitely many solutions with prime for some absolute constant .
- Binary Goldbach with bounded error The inequalities has at least one solution with prime for any sufficiently large and some absolute constant .
The first of these statements is now a well-known theorem of Zhang, and the Polymath8b project hosted on this blog has managed to lower to unconditionally, and to assuming the generalised Elliott-Halberstam conjecture. However, the second statement remains open; the best result that the Polymath8b project could manage in this direction is that (assuming GEH) at least one of the binary Goldbach conjecture with bounded error, or the twin prime conjecture with no error, had to be true.
All the known proofs of Zhang’s theorem proceed through sieve-theoretic means. Basically, they take as input equidistribution results that control the size of discrepancies such as
for various congruence classes and various arithmetic functions , e.g. (or more generaly for various ). After taking some carefully chosen linear combinations of these discrepancies, and using the trivial positivity lower bound
one eventually obtains (for suitable ) a non-trivial lower bound of the form
where is some weight function, and is the set of such that there are at least two primes in the interval . This implies at least one solution to the inequalities with , and Zhang’s theorem follows.
In a similar vein, one could hope to use bounds on discrepancies such as (1) (for comparable to ), together with the trivial lower bound (2), to obtain (for sufficiently large , and suitable ) a non-trivial lower bound of the form
for some weight function , where is the set of such that there is at least one prime in each of the intervals and . This would imply the binary Goldbach conjecture with bounded error.
However, the parity obstruction blocks such a strategy from working (for much the same reason that it blocks any bound of the form in Zhang’s theorem, as discussed in the Polymath8b paper.) The reason is as follows. The sieve-theoretic arguments are linear with respect to the summation, and as such, any such sieve-theoretic argument would automatically also work in a weighted setting in which the summation is weighted by some non-negative weight . More precisely, if one could control the weighted discrepancies
to essentially the same accuracy as the unweighted discrepancies (1), then thanks to the trivial weighted version
of (2), any sieve-theoretic argument that was capable of proving (3) would also be capable of proving the weighted estimate
However, (4) may be defeated by a suitable choice of weight , namely
where is the Liouville function, which counts the parity of the number of prime factors of a given number . Since , one can expand out as the sum of and a finite number of other terms, each of which consists of the product of two or more translates (or reflections) of . But from the Möbius randomness principle (or its analogue for the Liouville function), such products of are widely expected to be essentially orthogonal to any arithmetic function that is arising from a single multiplicative function such as , even on very short arithmetic progressions. As such, replacing by in (1) should have a negligible effect on the discrepancy. On the other hand, in order for to be non-zero, has to have the same sign as and hence the opposite sign to cannot simultaneously be prime for any , and so vanishes identically, contradicting (4). This indirectly rules out any modification of the Goldston-Pintz-Yildirim/Zhang method for establishing the binary Goldbach conjecture with bounded error.
The above argument is not watertight, and one could envisage some ways around this problem. One of them is that the Möbius randomness principle could simply be false, in which case the parity obstruction vanishes. A good example of this is the result of Heath-Brown that shows that if there are infinitely many Siegel zeroes (which is a strong violation of the Möbius randomness principle), then the twin prime conjecture holds. Another way around the obstruction is to start controlling the discrepancy (1) for functions that are combinations of more than one multiplicative function, e.g. . However, controlling such functions looks to be at least as difficult as the twin prime conjecture (which is morally equivalent to obtaining non-trivial lower-bounds for ). A third option is not to use a sieve-theoretic argument, but to try a different method (e.g. the circle method). However, most other known methods also exhibit linearity in the “” variable and I would suspect they would be vulnerable to a similar obstruction. (In any case, the circle method specifically has some other difficulties in tackling binary problems, as discussed in this previous post.)
One of the most basic methods in additive number theory is the Hardy-Littlewood circle method. This method is based on expressing a quantity of interest to additive number theory, such as the number of representations of an integer as the sum of three primes , as a Fourier-analytic integral over the unit circle involving exponential sums such as
where the sum here ranges over all primes up to , and . For instance, the expression mentioned earlier can be written as
The strategy is then to obtain sufficiently accurate bounds on exponential sums such as in order to obtain non-trivial bounds on quantities such as . For instance, if one can show that for all odd integers greater than some given threshold , this implies that all odd integers greater than are expressible as the sum of three primes, thus establishing all but finitely many instances of the odd Goldbach conjecture.
Remark 1 In practice, it can be more efficient to work with smoother sums than the partial sum (1), for instance by replacing the cutoff with a smoother cutoff for a suitable chocie of cutoff function , or by replacing the restriction of the summation to primes by a more analytically tractable weight, such as the von Mangoldt function . However, these improvements to the circle method are primarily technical in nature and do not have much impact on the heuristic discussion in this post, so we will not emphasise them here. One can also certainly use the circle method to study additive combinations of numbers from other sets than the set of primes, but we will restrict attention to additive combinations of primes for sake of discussion, as it is historically one of the most studied sets in additive number theory.
In many cases, it turns out that one can get fairly precise evaluations on sums such as in the major arc case, when is close to a rational number with small denominator , by using tools such as the prime number theorem in arithmetic progressions. For instance, the prime number theorem itself tells us that
and the prime number theorem in residue classes modulo suggests more generally that
when is small and is close to , basically thanks to the elementary calculation that the phase has an average value of when is uniformly distributed amongst the residue classes modulo that are coprime to . Quantifying the precise error in these approximations can be quite challenging, though, unless one assumes powerful hypotheses such as the Generalised Riemann Hypothesis.
In the minor arc case when is not close to a rational with small denominator, one no longer expects to have such precise control on the value of , due to the “pseudorandom” fluctuations of the quantity . Using the standard probabilistic heuristic (supported by results such as the central limit theorem or Chernoff’s inequality) that the sum of “pseudorandom” phases should fluctuate randomly and be of typical magnitude , one expects upper bounds of the shape
for “typical” minor arc . Indeed, a simple application of the Plancherel identity, followed by the prime number theorem, reveals that
which is consistent with (though weaker than) the above heuristic. In practice, though, we are unable to rigorously establish bounds anywhere near as strong as (3); upper bounds such as are far more typical.
Because one only expects to have upper bounds on , rather than asymptotics, in the minor arc case, one cannot realistically hope to make much use of phases such as for the minor arc contribution to integrals such as (2) (at least if one is working with a single, deterministic, value of , so that averaging in is unavailable). In particular, from upper bound information alone, it is difficult to avoid the “conspiracy” that the magnitude oscillates in sympathetic resonance with the phase , thus essentially eliminating almost all of the possible gain in the bounds that could arise from exploiting cancellation from that phase. Thus, one basically has little option except to use the triangle inequality to control the portion of the integral on the minor arc region :
Despite this handicap, though, it is still possible to get enough bounds on both the major and minor arc contributions of integrals such as (2) to obtain non-trivial lower bounds on quantities such as , at least when is large. In particular, this sort of method can be developed to give a proof of Vinogradov’s famous theorem that every sufficiently large odd integer is the sum of three primes; my own result that all odd numbers greater than can be expressed as the sum of at most five primes is also proven by essentially the same method (modulo a number of minor refinements, and taking advantage of some numerical work on both the Goldbach problems and on the Riemann hypothesis ). It is certainly conceivable that some further variant of the circle method (again combined with a suitable amount of numerical work, such as that of numerically establishing zero-free regions for the Generalised Riemann Hypothesis) can be used to settle the full odd Goldbach conjecture; indeed, under the assumption of the Generalised Riemann Hypothesis, this was already achieved by Deshouillers, Effinger, te Riele, and Zinoviev back in 1997. I am optimistic that an unconditional version of this result will be possible within a few years or so, though I should say that there are still significant technical challenges to doing so, and some clever new ideas will probably be needed to get either the Vinogradov-style argument or numerical verification to work unconditionally for the three-primes problem at medium-sized ranges of , such as . (But the intermediate problem of representing all even natural numbers as the sum of at most four primes looks somewhat closer to being feasible, though even this would require some substantially new and non-trivial ideas beyond what is in my five-primes paper.)
However, I (and many other analytic number theorists) are considerably more skeptical that the circle method can be applied to the even Goldbach problem of representing a large even number as the sum of two primes, or the similar (and marginally simpler) twin prime conjecture of finding infinitely many pairs of twin primes, i.e. finding infinitely many representations of as the difference of two primes. At first glance, the situation looks tantalisingly similar to that of the Vinogradov theorem: to settle the even Goldbach problem for large , one has to find a non-trivial lower bound for the quantity
for sufficiently large , as this quantity is also the number of ways to represent as the sum of two primes . Similarly, to settle the twin prime problem, it would suffice to obtain a lower bound for the quantity
that goes to infinity as , as this quantity is also the number of ways to represent as the difference of two primes less than or equal to .
In principle, one can achieve either of these two objectives by a sufficiently fine level of control on the exponential sums . Indeed, there is a trivial (and uninteresting) way to take any (hypothetical) solution of either the asymptotic even Goldbach problem or the twin prime problem and (artificially) convert it to a proof that “uses the circle method”; one simply begins with the quantity or , expresses it in terms of using (5) or (6), and then uses (5) or (6) again to convert these integrals back into a the combinatorial expression of counting solutions to or , and then uses the hypothetical solution to the given problem to obtain the required lower bounds on or .
Of course, this would not qualify as a genuine application of the circle method by any reasonable measure. One can then ask the more refined question of whether one could hope to get non-trivial lower bounds on or (or similar quantities) purely from the upper and lower bounds on or similar quantities (and of various type norms on such quantities, such as the bound (4)). Of course, we do not yet know what the strongest possible upper and lower bounds in are yet (otherwise we would already have made progress on major conjectures such as the Riemann hypothesis); but we can make plausible heuristic conjectures on such bounds. And this is enough to make the following heuristic conclusions:
- (i) For “binary” problems such as computing (5), (6), the contribution of the minor arcs potentially dominates that of the major arcs (if all one is given about the minor arc sums is magnitude information), in contrast to “ternary” problems such as computing (2), in which it is the major arc contribution which is absolutely dominant.
- (ii) Upper and lower bounds on the magnitude of are not sufficient, by themselves, to obtain non-trivial bounds on (5), (6) unless these bounds are extremely tight (within a relative error of or better); but
- (iii) obtaining such tight bounds is a problem of comparable difficulty to the original binary problems.
I will provide some justification for these conclusions below the fold; they are reasonably well known “folklore” to many researchers in the field, but it seems that they are rarely made explicit in the literature (in part because these arguments are, by their nature, heuristic instead of rigorous) and I have been asked about them from time to time, so I decided to try to write them down here.
In view of the above conclusions, it seems that the best one can hope to do by using the circle method for the twin prime or even Goldbach problems is to reformulate such problems into a statement of roughly comparable difficulty to the original problem, even if one assumes powerful conjectures such as the Generalised Riemann Hypothesis (which lets one make very precise control on major arc exponential sums, but not on minor arc ones). These are not rigorous conclusions – after all, we have already seen that one can always artifically insert the circle method into any viable approach on these problems – but they do strongly suggest that one needs a method other than the circle method in order to fully solve either of these two problems. I do not know what such a method would be, though I can give some heuristic objections to some of the other popular methods used in additive number theory (such as sieve methods, or more recently the use of inverse theorems); this will be done at the end of this post.
The parity problem is a notorious problem in sieve theory: this theory was invented in order to count prime patterns of various types (e.g. twin primes), but despite superb success in obtaining upper bounds on the number of such patterns, it has proven to be somewhat disappointing in obtaining lower bounds. [Sieves can also be used to study many other things than primes, of course, but we shall focus only on primes in this post.] Even the task of reproving Euclid’s theorem – that there are infinitely many primes – seems to be extremely difficult to do by sieve theoretic means, unless one of course injects into the theory an estimate at least as strong as Euclid’s theorem (e.g. the prime number theorem). The main obstruction is the parity problem: even assuming such strong hypotheses as the Elliott-Halberstam conjecture (a sort of “super-generalised Riemann Hypothesis” for sieves), sieve theory is largely (but not completely) unable to distinguish numbers with an odd number of prime factors from numbers with an even number of prime factors. This “parity barrier” has been broken for some select patterns of primes by injecting some powerful non-sieve theory methods into the subject, but remains a formidable obstacle in general.
I’ll discuss the parity problem in more detail later in this post, but I want to first discuss how sieves work [drawing in part on some excellent unpublished lecture notes of Iwaniec]; the basic ideas are elementary and conceptually simple, but there are many details and technicalities involved in actually executing these ideas, and which I will try to suppress for sake of exposition.
Recent Comments