One of the most basic methods in additive number theory is the Hardy-Littlewood circle method. This method is based on expressing a quantity of interest to additive number theory, such as the number of representations of an integer as the sum of three primes , as a Fourier-analytic integral over the unit circle involving exponential sums such as
The strategy is then to obtain sufficiently accurate bounds on exponential sums such as in order to obtain non-trivial bounds on quantities such as . For instance, if one can show that for all odd integers greater than some given threshold , this implies that all odd integers greater than are expressible as the sum of three primes, thus establishing all but finitely many instances of the odd Goldbach conjecture.
Remark 1 In practice, it can be more efficient to work with smoother sums than the partial sum (1), for instance by replacing the cutoff with a smoother cutoff for a suitable choice of cutoff function , or by replacing the restriction of the summation to primes by a more analytically tractable weight, such as the von Mangoldt function . However, these improvements to the circle method are primarily technical in nature and do not have much impact on the heuristic discussion in this post, so we will not emphasise them here. One can also certainly use the circle method to study additive combinations of numbers from other sets than the set of primes, but we will restrict attention to additive combinations of primes for sake of discussion, as it is historically one of the most studied sets in additive number theory.
In many cases, it turns out that one can get fairly precise evaluations on sums such as in the major arc case, when is close to a rational number with small denominator , by using tools such as the prime number theorem in arithmetic progressions. For instance, the prime number theorem itself tells us that
and the prime number theorem in residue classes modulo suggests more generally that
when is small and is close to , basically thanks to the elementary calculation that the phase has an average value of when is uniformly distributed amongst the residue classes modulo that are coprime to . Quantifying the precise error in these approximations can be quite challenging, though, unless one assumes powerful hypotheses such as the Generalised Riemann Hypothesis.
In the minor arc case when is not close to a rational with small denominator, one no longer expects to have such precise control on the value of , due to the “pseudorandom” fluctuations of the quantity . Using the standard probabilistic heuristic (supported by results such as the central limit theorem or Chernoff’s inequality) that the sum of “pseudorandom” phases should fluctuate randomly and be of typical magnitude , one expects upper bounds of the shape
which is consistent with (though weaker than) the above heuristic. In practice, though, we are unable to rigorously establish bounds anywhere near as strong as (3); upper bounds such as are far more typical.
Because one only expects to have upper bounds on , rather than asymptotics, in the minor arc case, one cannot realistically hope to make much use of phases such as for the minor arc contribution to integrals such as (2) (at least if one is working with a single, deterministic, value of , so that averaging in is unavailable). In particular, from upper bound information alone, it is difficult to avoid the “conspiracy” that the magnitude oscillates in sympathetic resonance with the phase , thus essentially eliminating almost all of the possible gain in the bounds that could arise from exploiting cancellation from that phase. Thus, one basically has little option except to use the triangle inequality to control the portion of the integral on the minor arc region :
Despite this handicap, though, it is still possible to get enough bounds on both the major and minor arc contributions of integrals such as (2) to obtain non-trivial lower bounds on quantities such as , at least when is large. In particular, this sort of method can be developed to give a proof of Vinogradov’s famous theorem that every sufficiently large odd integer is the sum of three primes; my own result that all odd numbers greater than can be expressed as the sum of at most five primes is also proven by essentially the same method (modulo a number of minor refinements, and taking advantage of some numerical work on both the Goldbach problems and on the Riemann hypothesis ). It is certainly conceivable that some further variant of the circle method (again combined with a suitable amount of numerical work, such as that of numerically establishing zero-free regions for the Generalised Riemann Hypothesis) can be used to settle the full odd Goldbach conjecture; indeed, under the assumption of the Generalised Riemann Hypothesis, this was already achieved by Deshouillers, Effinger, te Riele, and Zinoviev back in 1997. I am optimistic that an unconditional version of this result will be possible within a few years or so, though I should say that there are still significant technical challenges to doing so, and some clever new ideas will probably be needed to get either the Vinogradov-style argument or numerical verification to work unconditionally for the three-primes problem at medium-sized ranges of , such as . (But the intermediate problem of representing all even natural numbers as the sum of at most four primes looks somewhat closer to being feasible, though even this would require some substantially new and non-trivial ideas beyond what is in my five-primes paper.)
However, I (and many other analytic number theorists) are considerably more skeptical that the circle method can be applied to the even Goldbach problem of representing a large even number as the sum of two primes, or the similar (and marginally simpler) twin prime conjecture of finding infinitely many pairs of twin primes, i.e. finding infinitely many representations of as the difference of two primes. At first glance, the situation looks tantalisingly similar to that of the Vinogradov theorem: to settle the even Goldbach problem for large , one has to find a non-trivial lower bound for the quantity
for sufficiently large , as this quantity is also the number of ways to represent as the sum of two primes . Similarly, to settle the twin prime problem, it would suffice to obtain a lower bound for the quantity
that goes to infinity as , as this quantity is also the number of ways to represent as the difference of two primes less than or equal to .
In principle, one can achieve either of these two objectives by a sufficiently fine level of control on the exponential sums . Indeed, there is a trivial (and uninteresting) way to take any (hypothetical) solution of either the asymptotic even Goldbach problem or the twin prime problem and (artificially) convert it to a proof that “uses the circle method”; one simply begins with the quantity or , expresses it in terms of using (5) or (6), and then uses (5) or (6) again to convert these integrals back into a the combinatorial expression of counting solutions to or , and then uses the hypothetical solution to the given problem to obtain the required lower bounds on or .
Of course, this would not qualify as a genuine application of the circle method by any reasonable measure. One can then ask the more refined question of whether one could hope to get non-trivial lower bounds on or (or similar quantities) purely from the upper and lower bounds on or similar quantities (and of various type norms on such quantities, such as the bound (4)). Of course, we do not yet know what the strongest possible upper and lower bounds in are yet (otherwise we would already have made progress on major conjectures such as the Riemann hypothesis); but we can make plausible heuristic conjectures on such bounds. And this is enough to make the following heuristic conclusions:
- (i) For “binary” problems such as computing (5), (6), the contribution of the minor arcs potentially dominates that of the major arcs (if all one is given about the minor arc sums is magnitude information), in contrast to “ternary” problems such as computing (2), in which it is the major arc contribution which is absolutely dominant.
- (ii) Upper and lower bounds on the magnitude of are not sufficient, by themselves, to obtain non-trivial bounds on (5), (6) unless these bounds are extremely tight (within a relative error of or better); but
- (iii) obtaining such tight bounds is a problem of comparable difficulty to the original binary problems.
I will provide some justification for these conclusions below the fold; they are reasonably well known “folklore” to many researchers in the field, but it seems that they are rarely made explicit in the literature (in part because these arguments are, by their nature, heuristic instead of rigorous) and I have been asked about them from time to time, so I decided to try to write them down here.
In view of the above conclusions, it seems that the best one can hope to do by using the circle method for the twin prime or even Goldbach problems is to reformulate such problems into a statement of roughly comparable difficulty to the original problem, even if one assumes powerful conjectures such as the Generalised Riemann Hypothesis (which lets one make very precise control on major arc exponential sums, but not on minor arc ones). These are not rigorous conclusions – after all, we have already seen that one can always artifically insert the circle method into any viable approach on these problems – but they do strongly suggest that one needs a method other than the circle method in order to fully solve either of these two problems. I do not know what such a method would be, though I can give some heuristic objections to some of the other popular methods used in additive number theory (such as sieve methods, or more recently the use of inverse theorems); this will be done at the end of this post.
— 1. Minor arc dominance —
Let us first explain why minor arc contributions to (5) or (6) are expected to dominate. For sake of discussion let us just work with the twin prime integral (6), as the situation with the even Goldbach integral (5) is similar.
First, let us get a crude heuristic prediction as to the size of the quantity , which is the number of twin primes that are both less than . Note that from the prime number theorem, an integer chosen uniformly at random between and has a probability about of being prime, and similarly has a probability about of being prime. Making the heuristic (but very nonrigorous) hypothesis that these two events are approximately independent, we thus expect a random number between and to have a probability about of being a twin prime, leading to the prediction
As it turns out, this prediction is almost certainly inaccurate, due to “local” correlations between the primality of and primality of caused by residue classes with respect to small moduli (and in particular due to the parity of , which clearly has an extremely large influence on whether and will be prime). However, the procedure for correcting for these local correlations is well known, and only ends up modifying the prediction (7) by a multiplicative constant known as the twin prime constant; see for instance this previous blog post for more discussion. Our analysis here is focused on orders of growth in rather than on multiplicative constants, and so we will ignore the effect of the twin prime constant in this discussion.
Now let us predict the heuristic contribution of the major and minor arcs to (5). We begin with the primary major arc when is close to zero. As already noted, the prime number theorem gives
From the uncertainty principle, we then expect
when , because the phase does not go any significant oscillation in this case as ranges over the primes from to . On the other hand, as begins to exceed , we expect the exponential sum to start decaying, because the prime number theorem suggests the heuristic
for slightly larger than , and the latter sum does decay as begins to exceed . (Admittedly, the decay is rather slow, of the order of , but one can speed it up by using smooth cutoffs in the exponential sum, and in any event the decay is already sufficient for analysing type expressions such as (5).) In view of this, we expect the contribution to (5) of the major arc when is close to to be roughly
which, encouragingly, agrees with the heuristic prediction (7) (and indeed, if one unpacks all the Fourier analysis, one sees that these two predictions are ultimately coming from the same source). Furthermore, it is not difficult to make these sorts of heuristics rigorous using summation by parts and some version of the prime number theorem with an explicit error term; but this will not be our concern here.
Next, we look at the major arcs when is close to for some fixed (and fairly small) , and when is coprime to . We begin with the study of
The prime number theorem in arithmetic progressions suggests that the primes are equally distributed in the residue classes coprime to , which heuristically suggests that
(this can be seen first by working in the case when is a prime or a power of a prime, and then handling the general case via the Chinese remainder theorem). This leads to the heuristic
when , with some decay as begins to exceed . On the other hand, the set of all in these arcs has measure . Thus, if one were to ignore the effects of the term in (5), and just estimate things crudely by the triangle inequality, one would heuristically end up with a net bound of
for any given . This does decay in , but too slowly to ensure absolute convergence; given that a positive proportion of integers are squarefree, and that is more or less comparable to for typical , we expect the logarithmic divergence
as (and indeed one can make this heuristic rigorous by standard methods for estimating sums of multiplicative functions, e.g. via Dirichlet series methods). This should be compared with the analogous situation for (2), in which the major arc contribution of a given denominator , when estimated in absolute value by the same method, would be of size
and is convergent.
Now, the apparent divergence of the major arc contributions is not actually a problem, because we have an asymptotic for rather than merely an upper bound, and can therefore exploit the cancellation coming from the term. Indeed, if we apply the approximation (9) without discarding the phase, we see (heuristically, at least) that the contribution of the major arcs at denominator are more like
Using (8), we thus expect the total contribution at denominator to actually be of order , gaining an additional factor of over the preceding bound, leading to a summable contribution of the major arcs. (And if one pursues this calculation more carefully, one even arrives at the more refined prediction of for the net major arc contribution.)
So far, so good. But difficulties begin to arise when one turns attention to the minor arcs, and particularly when is only close to rationals with very large denominator , such as , which is the typical scenario if one defines “close to” as “within of”. Here, we do not expect to have the asymptotic (9), because the effect of random fluctuations in the primes on , which as discussed previously is expected to be of size and can thus dominate the main term in (9) if is close to . (In any case, even on the Generalised Riemann Hypothesis, we are well short of being able to maintain the asymptotic (9) for anywhere near that large a value of , instead only reaching levels such as or so before the error terms begin to dominate the main term.) As such, we can no longer easily exploit the cancellation of the phase, and are more or less forced to estimate the minor arc contribution by taking absolute values:
But now the problem is that, in view of heuristics such as (3) or (4), and keeping in mind that the set of minor arcs have measure close to , this absolute value integral is expected to be of the order of , which exceeds the major arc contribution of by a logarithmic factor. (One can also use Montgomery’s uncertainty principle to show that the major arc contributions are a fraction of the minor arc ones, for any reasonable choice of division between major and minor arc.) Thus we see that the minor arc contribution overwhelms the major arc one if we cannot control the oscillation of on the minor arcs. (It is instructive to see why the situation is reversed with the ternary sum (2), in which case the major arc terms dominate.)
— 2. Loose bounds do not suffice —
A slightly different perspective on the difficulties in applying the circle method to binary problems can be seen by noting how “fragile” the truth of such problems is with respect to “edits” made to the set of primes. To describe what I mean by this, let us focus now on the even Goldbach problem of representing a large but fixed number as the sum of two primes (but the same discussion also applies, with minor modifications, to the twin prime problem). As discussed previously, we expect about sums of this form (actually one expects slightly more sums than this when has a lot of prime factors, but never mind this for now). One can even establish this upper bound rigorously by sieve theory if one wishes, but this is not particularly relevant as will only be concerned with a heuristic discussion here.
The point is that the number of sums here is still significantly smaller than the total number of primes less than , which is approximately by the prime number theorem. To exploit this discrepancy, let us define the set of redacted primes to be those primes less than which do not arise in one of the representations of as the sum of two primes; in other words, a redacted prime is a prime less than such that is not a prime. Thus, from a relative viewpoint, the redacted primes comprise almost all of the actual primes less than ; only a proportion of or so of the primes less than are not redacted. As such, the redacted primes look very similar to the actual primes less than , but with one key difference: there is no way to express as the sum of two redacted primes. So the even Goldbach property has been destroyed by passing from the primes to the redacted primes.
On the other hand, given that the redacted primes have density about in the set of all primes less than , and are expected to be distributed more or less uniformly within that set (other than some local irregularities coming from small residue classes which turn out not to be terribly relevant), we thus expect heuristically that the redacted exponential sum
(This is not quite a good heuristic when happens to vanish, but writing a more accurate heuristic here would be messier and not add much illumination to the discussion.)
This has the following consequence. Suppose one is planning to prove the even Goldbach conjecture by using some approximations on , say of the form
for some explicit quantities . (One can (and should) also consider integral bounds rather than pointwise bounds, but let us discuss pointwise bounds here for sake of illustration.) If these bounds are too “loose” in the sense that , then the preceding heuristics suggest that the redacted exponential sums are going to obey a similar estimate:
However, as mentioned previously, the redacted version of (5) vanishes. Thus, one cannot hope to prove the even Goldbach conjecture purely by loose bounds, as these bounds are incapable of distinguishing between the genuine primes and the redacted primes. In particular, upper bounds of the type one expects to have available for the minor arc exponential sums are far too loose to be useful in this regard (as in such cases is simply zero).
Remark 2 One might object that the type of techniques used to prove minor arc exponential sum estimates (e.g. the Vinogradov-Vaughan method based on bilinear sums) rely on the special multiplicative structure of the primes, which is absent for the redacted primes. This is true, but is irrelevant for the purposes of determining the feasibility of the circle method, because the circle method only cares about possessing good bounds on , and not about where those bounds came from; it cannot distinguish a bound that came from a rigorous argument such as the Vinogradov-Vaughan method, from a bound that came from a heuristic argument such as (10). If one wants to truly use multiplicative structure to distinguish the primes from the redacted primes, one has to use that structure directly to establish the desired Goldbach or twin prime problem, bypassing the circle method, which only sees the consequences of that structure on exponential sum estimates, rather than seeing the structure itself.
Admittedly, for major arc values of , a sufficiently strong version of the prime number in arithmetic progressions will give tight bounds (in the sense that the error term improves upon the main term by at least a logarithmic factor). However, as pointed out in the preceding section, the major arcs are not the dominant contribution to (5). Furthermore, if one normalises the redacted exponential sum by dividing out by the relative density of the redacted primes in the actual primes, one expects the major arc value of to become much closer to that of , so that the two expressions become indistinguishable even to quite tight bounds.
It is also instructive to see how this situation is different in the ternary case, when one is trying to find triples of primes that sum to a given odd prime . There, it is no longer possible to easily redact the primes to remove all such triples, because one expects the number of such triples to be on the order of , compared with the total number of primes. Indeed, it is known that Vinogradov’s theorem continues to hold even if one deletes a moderately large positive fraction of the primes (this is a result of Li and Pan, adapting an earlier argument of Green).
— 3. Tight bounds are morally equivalent to binary problems —
In the preceding two sections, we argued that the minor arc estimates are the most important, and that the bounds here must be quite tight in order to have any hope of solving the conjecture this way. On the other hand, if the bounds are tight enough, then indeed one has enough control to start proving some binary conjectures. As far as I am aware, the first results in the literature formalising this observation are due to Srinivasan (in the context of the even Goldbach problem). For the twin prime problem, a more precise formalisation was worked out recently by Maier and Sankaranarayanan. Roughly speaking, these latter authors show that if one can obtain a tight asymptotic for on minor arcs, roughly of the form
for some absolute constant and all intervals of size , then the twin prime conjecture holds, basically because one can then control the minor arc contribution to (6) by an integration by parts. (Actually the constant is necessarily equal to in this conjecture, basically because of (4) and the fact that the major arcs only occupy a small fraction of the total norm.) Furthermore, the standard random fluctuation heuristics suggest that the conjecture (11) is likely to be true (and Maier and Sankaranarayanan verify the conjecture when the primes are replaced with a certain set of almost primes). In fact, the bound (11) is strong enough to similarly count the number of pairs for any fixed , with a very good error term (smaller than the main term by several powers of ); in other words, this conjecture would imply the binary case of the Hardy-Littlewood prime tuples conjecture with the expected asymptotics.
On the other hand, it is not difficult to reverse this procedure and deduce the conjecture (11) from the prime tuples conjecture (with the conjectured asymptotics and a sufficiently strong error term). This is because is the Fourier transform of the prime tuples counting function :
As such, strong bounds on would be expected to yield strong bounds on , and in particular the expected bounds on the prime tuples conjecture leads to the expected bound (11). So we see that tight bounds on minor arc exponential sums are basically just a Fourier reformulation of the underlying binary problems being considered, and any argument that established such bounds could likely be converted into a bound on the binary problem directly, without direct use of the circle method. As such, I believe that one has to look outside of the circle method in order to make progress on these binary problems.
— 4. Other methods —
Needless to say, I do not actually have a viable strategy in hand for solving these binary problems, but I can comment briefly on some of the other existing methods in additive number theory, which also unfortunately have well-known limitations in this regard.
The first obvious candidate for solving these problems is sieve theory, which was in fact invented specifically with problems such as the twin prime conjecture in mind. However, there is a major obstruction to using sieve theory to obtain non-trivial lower bounds (as opposed to upper bounds, which are much easier) on prime patterns, namely the parity problem. I discuss this problem in this previous blog post, and do not have much more to add here to what I already wrote in that post.
Another potential approach, following the methods of additive combinatorics (and particularly the sub-branch of additive combinatorics focused on arithmetic progressions), is to try to develop some sort of inverse theorem, in analogy with the Gowers-type inverse theorems that relate the lack of arithmetic progressions with some sort of correlation with a structured function (such as a Fourier character). Unfortunately, when it comes to binary patterns such as twins or pairs of numbers summing to a fixed sum , one cannot have any simple class of structured functions that capture the absence of these patterns. We already saw a glimpse of this with the redacted primes in the even Goldbach problem, which are likely to have almost identical correlations with structured functions (such as characters) as the unredacted primes. One can also construct quite pseudorandom-looking sets that lack a binary pattern, for instance by randomly selecting one member of each pair that sums to , and standard probabilistic computations then show that such sets will typically have low correlation with any set of structured functions of low entropy, which is the only type of functions for which we expect randomness heuristics (such as the Mobius randomness conjecture, discussed for instance in this lecture of Sarnak) to hold.
In a similar vein, one can try to hope for some sort of transference principle argument, taking advantage of the fact that the primes lie inside the almost primes in order to model the primes by a much denser set of natural numbers. The lack of a good inverse theorem is going to be a major obstacle to this strategy; but actually, even if that problem was somehow evaded, the parity problem serves as a separate obstruction. Indeed, the parity problem suggests that the maximum density of the primes inside the almost primes that one can realistically hope to achieve (while still getting good control on the almost primes) is . As such, the densest model one could hope for in the natural numbers would also have density . But this is just the critical density for avoiding patterns such as twins or Goldbach-type pairs; for instance, the natural numbers which equal or mod have density but no twins. So even just a small loss of density in the model could potentially kill off all the twins, and in the absence of an inverse theorem there would be no computable statistic to prevent this from happening. (On the other hand, this obstruction does not prevent one from finding pairs of primes which differ by at most , say; this is consistent with the Goldston-Yildirim-Pintz results on small prime gaps, which does not exactly use a dense model, but works relative to the almost primes, treating the primes “as if” they were dense in some sense.)
One can view the “enemy” in these binary problems as a “conspiracy” amongst the primes to behave in a certain pathological way – avoiding twins, for instance, or refusing to sum together to some exceptional integer . Conspiracies are typically very difficult to eliminate rigorously, about the only thing which is strong enough to rule out a conspiracy is a conflicting conspiracy (see my previous blog post on this topic). A good example of this is Heath-Brown’s result that the existence of a Siegel zero (which is a conspiracy amongst primes that, among other things, would completely ruin the Generalised Riemann Hypothesis) implies the truth of the twin prime conjecture (basically by upending all the previous heuristics, and causing the major arc sums to dominate the minor arcs instead of vice versa). But since we do not expect any conspiracies to occur amongst the primes, one cannot directly use these sorts of “dueling conspiracies” methods to unilaterally rule out any given conspiracy.
However, there is perhaps the outside chance that a binary conspiracy (such as a conspiracy between primes to stop having twins) could somehow be used to conspire against itself (in the spirit of “self-defeating object” arguments, discussed previously several times on this blog). This is actually a fairly common technique in analytic number theory (though not usually described in such terms); for instance, Linnik’s dispersion method can be viewed as a way to eliminate a potential conspiracy by transforming it to compete against itself; bilinear sum methods, such as those used to control the minor arc sum , can also be viewed in this way. However, these methods generally require some sort of underlying group action (such as multiplicative dilation) in order to transform the initial conspiracy into a competing conspiracy, and no obvious such action is present for these binary problems.
Finally, one can hope to solve the problem by producing explicit sequences of twin primes, or explicit primes that add up to a given integer , which is perhaps the most obvious method to try for someone who has seen this sort of problem for the first time. However, it is notoriously difficult to come up with any explicit and tractable formula for producing primes; this type of problem (with “tractable” replaced by “deterministically computable in a reasonable time”) was considered in the Polymath4 project, with only modest advances over the obvious brute-force algorithms located. On the other hand, the situation is significantly more promising if one works in the function field model in which the integers are replaced by the ring of polynomials of one variable over a finite field, and the notion of prime is replaced by that of an monic irreducible polynomial. The main reason for this is in addition to having the arithmetic operations of addition and multiplication, polynomials also have the operation of composition available to them. Furthermore, there is a relationship between the irreducibility of a polynomial and that of a composition ; indeed, except in degenerate cases, reducibility of the former implies reducibility of the latter, and conversely irreducibility of the former often ensures irreducibility of the latter in a manner that can be controlled (particularly after averaging in ). By using this method (known as the substitution method) one can prove some versions of the twin prime conjecture; for instance, it was shown by Pollack (building upon a previous result of this type by Hall) that for if is an odd prime and , then there exist infinitely many monic irreducible such that is also irreducible. It is perhaps conceivable that other binary problems, such as the even Goldbach problem, can also be tackled over function fields by this method. However, this argument appears specific to the function field setting, as the substitution operation has no obvious counterpart in the rational integers.