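We have seen in previous notes that the operation of forming a Dirichlet series
$\displaystyle \mathcal{D} f(s) := \sum_n \frac{f(n)}{n^s}$
or twisted Dirichlet series
$\displaystyle \mathcal{D} \chi f(s) := \sum_n \frac{\chi(n) f(n)}{n^s}$
is an incredibly useful tool for questions in multiplicative number theory. Such series can be viewed as a multiplicative Fourier transform, since the functions $n \mapsto \frac{1}{n^s}$ and $n \mapsto \frac{\chi(n)}{n^s}$ are multiplicative characters.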
Similarly, it turns out that the operation of forming an additive Fourier series
$\displaystyle \hat f(\alpha) := \sum_n f(n) e(-\alpha n),$
where $\alpha$ lies on the (additive) unit circle $\mathbb{R}/\mathbb{Z}$ and $e(x) := e^{2\pi i x}$ is the standard additive character, is an incredibly useful tool for additive number theory, particularly when studying additive problems involving three or more variables taking values in sets such as the primes; the deployment of this tool is generally known as the Hardy-Littlewood circle method. (In the analytic number theory literature, the minus sign in the phase $e(-\alpha n)$ is traditionally omitted, and what is denoted by $\hat f(\alpha)$ here would be referred to instead by $S_f(-\alpha)$, or just $S(-\alpha)$.) We list some of the most classical problems in this area:
- (Even Goldbach conjecture) Is it true that every even natural number greater than two can be expressed as the sum of two primes?
- (Odd Goldbach conjecture) Is it true that every odd natural number greater than five can be expressed as the sum of three primes?
- (Waring problem) For each natural number $k$, what is the least natural number $g(k)$ such that every natural number can be expressed as the sum of $g(k)$ or fewer $k^{\mathrm{th}}$ powers?
- (Asymptotic Waring problem) For each natural number $k$, what is the least natural number $G(k)$ such that every sufficiently large natural number can be expressed as the sum of $G(k)$ or fewer $k^{\mathrm{th}}$ powers?
- (Partition function problem) For any natural number $N$, let $p(N)$ denote the number of representations of $N$ of the form $N = n_1 + \dots + n_k$ where $k$ and $n_1 \geq \dots \geq n_k$ are natural numbers. What is the asymptotic behaviour of $p(N)$ as $N \to \infty$?
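As a small sanity check on the two Goldbach problems in the list above, here is a minimal Python sketch that verifies both statements by brute force for small numbers. The cutoff `LIMIT` and the helper names are illustrative choices, not anything taken from these notes, and a finite check of this kind of course says nothing about large numbers.

```python
def primes_up_to(limit):
    """Return the list of primes <= limit using the sieve of Eratosthenes."""
    sieve = [True] * (limit + 1)
    sieve[0] = sieve[1] = False
    for p in range(2, int(limit ** 0.5) + 1):
        if sieve[p]:
            for multiple in range(p * p, limit + 1, p):
                sieve[multiple] = False
    return [n for n in range(limit + 1) if sieve[n]]


def is_sum_of_primes(n, k, primes, prime_set):
    """True if n is a sum of exactly k primes (repetitions allowed)."""
    if k == 1:
        return n in prime_set
    # The remaining k - 1 primes are each at least 2, so p <= n - 2 * (k - 1).
    return any(is_sum_of_primes(n - p, k - 1, primes, prime_set)
               for p in primes if p <= n - 2 * (k - 1))


LIMIT = 2000  # illustrative cutoff (an assumption, not from the text)
PRIMES = primes_up_to(LIMIT)
PRIME_SET = set(PRIMES)

# Even Goldbach: every even number greater than two is a sum of two primes.
even_ok = all(is_sum_of_primes(n, 2, PRIMES, PRIME_SET)
              for n in range(4, LIMIT + 1, 2))
# Odd Goldbach: every odd number greater than five is a sum of three primes.
odd_ok = all(is_sum_of_primes(n, 3, PRIMES, PRIME_SET)
             for n in range(7, LIMIT + 1, 2))
print(even_ok, odd_ok)
```

The lower thresholds matter: `is_sum_of_primes(5, 3, PRIMES, PRIME_SET)` is false, since three primes sum to at least six.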
The Waring problem and its asymptotic version will not be discussed further here, save to note that the Vinogradov mean value theorem (Theorem 13 from Notes 5) and its variants are particularly useful for getting good bounds on $G(k)$; see for instance the ICM article of Wooley for recent progress on these problems. Similarly, the partition function problem was the original motivation of Hardy and Littlewood in introducing the circle method, but we will not discuss it further here; see e.g. Chapter 20 of Iwaniec-Kowalski for a treatment.
Instead, we will focus our attention on the odd Goldbach conjecture as our model problem. (The even Goldbach conjecture, which involves only two variables instead of three, is unfortunately not amenable to a circle method approach for a variety of reasons, unless the statement is replaced with something weaker, such as an averaged statement; see this previous blog post for further discussion. On the other hand, the methods here can obtain weaker versions of the even Goldbach conjecture, such as showing that “almost all” even numbers are the sum of two primes; see Exercise 34 below.) In particular, we will establish the following celebrated theorem of Vinogradov:
Recently, the restriction that $N$ be sufficiently large was replaced by Helfgott with $N > 5$, thus establishing the odd Goldbach conjecture in full. This argument followed the same basic approach as Vinogradov (based on the circle method), but with various estimates replaced by “log-free” versions (analogous to the log-free zero-density theorems in Notes 7), combined with careful numerical optimisation of constants and also some numerical work on the even Goldbach problem and on the generalised Riemann hypothesis. We refer the reader to Helfgott’s text for details.
We will in fact show the more precise statement:
The implied constants are ineffective.
We dropped the hypothesis that is odd in Theorem 2, but note that vanishes when is even. For odd , we have
Unfortunately, due to the ineffectivity of the constants in Theorem 2 (a consequence of the reliance on the Siegel-Walfisz theorem in the proof of that theorem), one cannot quantify explicitly what “sufficiently large” means in Theorem 1 directly from Theorem 2. However, there is a modification of this theorem which gives effective bounds; see Exercise 32 below.
Exercise 4 Obtain a heuristic derivation of the main term using the modified Cramér model (Section 1 of Supplement 4).
To prove Theorem 2, we consider the more general problem of estimating sums of the form
for various integers and functions , which we will take to be finitely supported to avoid issues of convergence.
Suppose that are supported on ; for simplicity, let us first assume the pointwise bound for all . (This simple case will not cover the case in Theorem 2, when are truncated versions of the von Mangoldt function , but will serve as a warmup to that case.) Then we have the trivial upper bound
A basic observation is that this upper bound is attainable if all “pretend” to behave like the same additive character for some . For instance, if , then we have when , and then it is not difficult to show that
The key to the success of the circle method lies in the converse of the above statement: the only way that the trivial upper bound (2) comes close to being sharp is when all correlate with the same character , or in other words are simultaneously large. This converse is largely captured by the following two identities:
The traditional approach to using the circle method to compute sums such as proceeds by invoking (3) to express this sum as an integral over the unit circle, then dividing the unit circle into “major arcs” where are large but computable with high precision, and “minor arcs” where one has estimates to ensure that are small in both and senses. For functions of number-theoretic significance, such as truncated von Mangoldt functions, the “major arcs” typically consist of those that are close to a rational number with not too large, and the “minor arcs” consist of the remaining portions of the circle. One then obtains lower bounds on the contributions of the major arcs, and upper bounds on the contribution of the minor arcs, in order to get good lower bounds on .
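The passage from a counting problem to an integral over the unit circle can be seen concretely in a finite model. The sketch below is an illustration under stated assumptions rather than the formulation used in these notes: it replaces the integral over the circle by an average over the frequencies $k/M$, with $M$ larger than any sum that can occur; it uses the indicator function of the primes instead of a von Mangoldt weight; and it follows the analytic number theory sign convention $S(\alpha) = \sum_p e(\alpha p)$. All function names are hypothetical.

```python
import cmath


def primes_up_to(limit):
    """Return the list of primes <= limit using the sieve of Eratosthenes."""
    sieve = [True] * (limit + 1)
    sieve[0] = sieve[1] = False
    for p in range(2, int(limit ** 0.5) + 1):
        if sieve[p]:
            for multiple in range(p * p, limit + 1, p):
                sieve[multiple] = False
    return [n for n in range(limit + 1) if sieve[n]]


def count_direct(limit, target):
    """Number of ordered triples of primes <= limit that sum to target."""
    primes = primes_up_to(limit)
    prime_set = set(primes)
    return sum(1 for p1 in primes for p2 in primes
               if target - p1 - p2 in prime_set)


def count_fourier(limit, target):
    """Same count, via a discrete analogue of the circle method identity.

    Assumes 0 <= target <= 3 * limit, so that nothing wraps around
    modulo the (assumed) modulus M = 3 * limit + 1.
    """
    primes = primes_up_to(limit)
    modulus = 3 * limit + 1
    total = 0j
    for k in range(modulus):
        alpha = k / modulus
        # Exponential sum S(alpha) over the primes up to limit.
        s = sum(cmath.exp(2j * cmath.pi * alpha * p) for p in primes)
        total += s ** 3 * cmath.exp(-2j * cmath.pi * alpha * target)
    return round(total.real / modulus)


print(count_direct(30, 21), count_fourier(30, 21))
```

The two functions agree because averaging $e(k(p_1+p_2+p_3-n)/M)$ over $k$ picks out exactly the triples with $p_1+p_2+p_3 = n$. The major and minor arc analysis is about estimating the Fourier side when the direct count is out of reach.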
This traditional approach is covered in many places, such as this text of Vaughan. We will emphasise in this set of notes a slightly different perspective on the circle method, coming from recent developments in additive combinatorics; this approach does not quite give the sharpest quantitative estimates, but it allows for easier generalisation to more combinatorial contexts, for instance when replacing the primes by dense subsets of the primes, or replacing the equation with some other equation or system of equations.
From Exercise 5 and Hölder’s inequality, we immediately obtain
Similarly for permutations of the .
In the case when are supported on and bounded by , this corollary tells us that we have is whenever one has uniformly in , and similarly for permutations of . From this and the triangle inequality, we obtain the following conclusion: if is supported on and bounded by , and is Fourier-approximated by another function supported on and bounded by in the sense that
Thus, one possible strategy for estimating the sum is to replace (or “model”) by a simpler function which Fourier-approximates in the sense that the exponential sums agree up to error . For instance:
Exercise 7 Let be a natural number, and let be a random subset of , chosen so that each has an independent probability of of lying in .
- (i) If and , show that with probability as , one has uniformly in . (Hint: for any fixed , this can be accomplished with quite a good probability (e.g. ) using a concentration of measure inequality, such as Hoeffding’s inequality. To obtain the uniformity in , round to the nearest multiple of (say) and apply the union bound).
- (ii) Show that with probability , one has representations of the form with (with treated as an ordered triple, rather than an unordered one).
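The phenomenon behind part (i) of this exercise can be observed numerically. The following sketch is only an experiment, not a solution. It assumes the inclusion probability is $1/2$ (the value is not legible in the text above), compares the indicator of the random set against the constant model $1/2$, and samples the frequency on a finite grid as in the hint. The function name, seed, and grid density are all hypothetical choices.

```python
import cmath
import random


def max_fourier_deviation(n, seed=0, grid_factor=4):
    """Largest |sum_m (1_A(m) - 1/2) e(alpha m)| over a grid of alpha.

    A is a random subset of {1, ..., n}; each element is included
    independently with probability 1/2 (an assumption of this sketch).
    """
    rng = random.Random(seed)
    indicator = [1 if rng.random() < 0.5 else 0 for _ in range(n)]
    grid = grid_factor * n
    worst = 0.0
    for j in range(grid):
        alpha = j / grid
        deviation = sum(
            (indicator[m] - 0.5) * cmath.exp(2j * cmath.pi * alpha * (m + 1))
            for m in range(n)
        )
        worst = max(worst, abs(deviation))
    return worst


n = 200
print(max_fourier_deviation(n), n ** 0.5, n)
```

Concentration of measure suggests the printed deviation should be of the order of $\sqrt{n \log n}$, which is far smaller than the trivial bound $n$.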
In the case when is something like the truncated von Mangoldt function , the quantity is of size rather than . This costs us a logarithmic factor in the above analysis; however, we can still conclude that we have the approximation (4) whenever is another sequence with such that one has the improved Fourier approximation
uniformly in . (Later on we will obtain a “log-free” version of this implication in which one does not need to gain a factor of in the error term.)
This suggests a strategy for proving Vinogradov’s theorem: find an approximant to some suitable truncation of the von Mangoldt function (e.g. or ) which obeys the Fourier approximation property (5), and such that the expression is easily computable. It turns out that there are a number of good options for such an approximant . One of the quickest ways to obtain such an approximation (which is used in Chapter 19 of Iwaniec and Kowalski) is to start with the standard identity , that is to say
Thus, for instance, if , the approximant would be taken to be
in which case we would take
The function is somewhat similar to the continuous Selberg sieve weights studied in Notes 4, with the main difference being that we did not square the divisor sum as we will not need to take to be non-negative. As long as is not too large, one can use some sieve-like computations to compute expressions like quite accurately. The approximation (5) can be justified by using a nice estimate of Davenport that exemplifies the Möbius pseudorandomness heuristic from Supplement 4:
uniformly for all . The implied constants are ineffective.
This estimate will be proven by splitting into two cases. In the “major arc” case when is close to a rational with small (of size or so), this estimate will be a consequence of the Siegel-Walfisz theorem ( from Notes 2); it is the application of this theorem that is responsible for the ineffective constants. In the remaining “minor arc” case, one proceeds by using a combinatorial identity (such as Vaughan’s identity) to express the sum in terms of bilinear sums of the form , and use the Cauchy-Schwarz inequality and the minor arc nature of to obtain a gain in this case. This will all be done below the fold. We will also use (a rigorous version of) the approximation (6) (or (7)) to establish Vinogradov’s theorem.
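The approximants discussed above start from a divisor-sum identity for the von Mangoldt function. In its classical form this is $\Lambda(n) = -\sum_{d|n} \mu(d) \log d$, which is assumed here to be the identity the text alludes to. The sketch below checks this identity numerically for small $n$. The helper names are hypothetical, and the trial-division routines are meant for illustration only.

```python
from math import isclose, log


def mobius(n):
    """Moebius function mu(n), computed by trial division."""
    result = 1
    p = 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0  # n has a square factor
            result = -result
        p += 1
    if n > 1:
        result = -result  # one remaining prime factor
    return result


def von_mangoldt(n):
    """Lambda(n): log p if n is a power of a prime p, and 0 otherwise."""
    if n < 2:
        return 0.0
    p = 2
    while p * p <= n:
        if n % p == 0:
            while n % p == 0:
                n //= p
            return log(p) if n == 1 else 0.0
        p += 1
    return log(n)  # n itself is prime


def divisor_sum_side(n):
    """Right-hand side -sum_{d | n} mu(d) log d of the identity."""
    return -sum(mobius(d) * log(d) for d in range(1, n + 1) if n % d == 0)


identity_holds = all(
    isclose(von_mangoldt(n), divisor_sum_side(n), abs_tol=1e-9)
    for n in range(1, 301)
)
print(identity_holds)
```

Truncating the divisor sum to small $d$ is what produces a sieve-like approximant. Vaughan's identity is a more refined way of splitting the same sum into pieces.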
for some that is not too large compared to . The methods used to establish Theorem 8 can also establish a Fourier approximation that makes (8) precise, and which can yield an alternate proof of Vinogradov’s theorem; this will be done below the fold.
Exercise 9 Show that the right-hand side of (8) can be rewritten as
Then, show the inequalities
and conclude that
(Hint: for the latter estimate, use Theorem 27 of Notes 1.)
The coefficients in the above exercise are quite similar to optimised Selberg sieve coefficients (see Section 2 of Notes 4).
Another approximation to , related to the modified Cramér random model (see Model 10 of Supplement 4) is
for as above and coprime to . These approximations (closely related to a device known as the “-trick”) are not as quantitatively accurate as the previous approximations, but can still suffice to establish Vinogradov’s theorem, and also to count many other linear patterns in the primes or subsets of the primes (particularly if one injects some additional tools from additive combinatorics, and specifically the inverse conjecture for the Gowers uniformity norms); see this paper of Ben Green and myself for more discussion (and this more recent paper of Shao for an analysis of this approach in the context of Vinogradov-type theorems). The following exercise expresses the approximation (9) in a form similar to the previous approximation (8):
Exercise 10 With as above, show that
for all natural numbers .
One of the most basic methods in additive number theory is the Hardy-Littlewood circle method. This method is based on expressing a quantity of interest to additive number theory, such as the number of representations of an integer as the sum of three primes , as a Fourier-analytic integral over the unit circle involving exponential sums such as
The strategy is then to obtain sufficiently accurate bounds on exponential sums such as in order to obtain non-trivial bounds on quantities such as . For instance, if one can show that for all odd integers greater than some given threshold , this implies that all odd integers greater than are expressible as the sum of three primes, thus establishing all but finitely many instances of the odd Goldbach conjecture.
Remark 1 In practice, it can be more efficient to work with smoother sums than the partial sum (1), for instance by replacing the cutoff with a smoother cutoff for a suitable choice of cutoff function , or by replacing the restriction of the summation to primes by a more analytically tractable weight, such as the von Mangoldt function . However, these improvements to the circle method are primarily technical in nature and do not have much impact on the heuristic discussion in this post, so we will not emphasise them here. One can also certainly use the circle method to study additive combinations of numbers from other sets than the set of primes, but we will restrict attention to additive combinations of primes for sake of discussion, as it is historically one of the most studied sets in additive number theory.
In many cases, it turns out that one can get fairly precise evaluations on sums such as in the major arc case, when is close to a rational number with small denominator , by using tools such as the prime number theorem in arithmetic progressions. For instance, the prime number theorem itself tells us that
and the prime number theorem in residue classes modulo suggests more generally that
when is small and is close to , basically thanks to the elementary calculation that the phase has an average value of when is uniformly distributed amongst the residue classes modulo that are coprime to . Quantifying the precise error in these approximations can be quite challenging, though, unless one assumes powerful hypotheses such as the Generalised Riemann Hypothesis.
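The elementary calculation mentioned here is the classical evaluation of a Ramanujan sum: when $a$ is coprime to $q$, the average of $e(an/q)$ over the residues $n$ coprime to $q$ equals $\mu(q)/\phi(q)$. The short check below assumes this is the average the text has in mind. The function names are hypothetical.

```python
import cmath
from math import gcd


def mobius(n):
    """Moebius function mu(n), computed by trial division."""
    result = 1
    p = 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0
            result = -result
        p += 1
    if n > 1:
        result = -result
    return result


def mean_phase(a, q):
    """Average of e(a n / q) over residues n mod q coprime to q."""
    units = [n for n in range(1, q + 1) if gcd(n, q) == 1]
    total = sum(cmath.exp(2j * cmath.pi * a * n / q) for n in units)
    return total / len(units)


def predicted_mean(q):
    """The value mu(q) / phi(q) predicted by the Ramanujan sum formula."""
    phi = sum(1 for n in range(1, q + 1) if gcd(n, q) == 1)
    return mobius(q) / phi


max_error = max(
    abs(mean_phase(a, q) - predicted_mean(q))
    for q in range(1, 61)
    for a in range(1, q + 1)
    if gcd(a, q) == 1
)
print(max_error)
```

In particular the average vanishes whenever $q$ is not squarefree, and decays like $1/\phi(q)$ otherwise. This is why only rationals with small denominator contribute large values of the exponential sum.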
In the minor arc case when is not close to a rational with small denominator, one no longer expects to have such precise control on the value of , due to the “pseudorandom” fluctuations of the quantity . Using the standard probabilistic heuristic (supported by results such as the central limit theorem or Chernoff’s inequality) that the sum of “pseudorandom” phases should fluctuate randomly and be of typical magnitude , one expects upper bounds of the shape
which is consistent with (though weaker than) the above heuristic. In practice, though, we are unable to rigorously establish bounds anywhere near as strong as (3); upper bounds such as are far more typical.
Because one only expects to have upper bounds on , rather than asymptotics, in the minor arc case, one cannot realistically hope to make much use of phases such as for the minor arc contribution to integrals such as (2) (at least if one is working with a single, deterministic, value of , so that averaging in is unavailable). In particular, from upper bound information alone, it is difficult to avoid the “conspiracy” that the magnitude oscillates in sympathetic resonance with the phase , thus essentially eliminating almost all of the possible gain in the bounds that could arise from exploiting cancellation from that phase. Thus, one basically has little option except to use the triangle inequality to control the portion of the integral on the minor arc region :
Despite this handicap, though, it is still possible to get enough bounds on both the major and minor arc contributions of integrals such as (2) to obtain non-trivial lower bounds on quantities such as , at least when is large. In particular, this sort of method can be developed to give a proof of Vinogradov’s famous theorem that every sufficiently large odd integer is the sum of three primes; my own result that all odd numbers greater than $1$ can be expressed as the sum of at most five primes is also proven by essentially the same method (modulo a number of minor refinements, and taking advantage of some numerical work on both the Goldbach problems and on the Riemann hypothesis). It is certainly conceivable that some further variant of the circle method (again combined with a suitable amount of numerical work, such as that of numerically establishing zero-free regions for the Generalised Riemann Hypothesis) can be used to settle the full odd Goldbach conjecture; indeed, under the assumption of the Generalised Riemann Hypothesis, this was already achieved by Deshouillers, Effinger, te Riele, and Zinoviev back in 1997. I am optimistic that an unconditional version of this result will be possible within a few years or so, though I should say that there are still significant technical challenges to doing so, and some clever new ideas will probably be needed to get either the Vinogradov-style argument or numerical verification to work unconditionally for the three-primes problem at medium-sized ranges of , such as . (But the intermediate problem of representing all even natural numbers as the sum of at most four primes looks somewhat closer to being feasible, though even this would require some substantially new and non-trivial ideas beyond what is in my five-primes paper.)
However, I (and many other analytic number theorists) are considerably more skeptical that the circle method can be applied to the even Goldbach problem of representing a large even number as the sum of two primes, or the similar (and marginally simpler) twin prime conjecture of finding infinitely many pairs of twin primes, i.e. finding infinitely many representations of as the difference of two primes. At first glance, the situation looks tantalisingly similar to that of the Vinogradov theorem: to settle the even Goldbach problem for large , one has to find a non-trivial lower bound for the quantity
for sufficiently large , as this quantity is also the number of ways to represent as the sum of two primes . Similarly, to settle the twin prime problem, it would suffice to obtain a lower bound for the quantity
that goes to infinity as , as this quantity is also the number of ways to represent as the difference of two primes less than or equal to .
In principle, one can achieve either of these two objectives by a sufficiently fine level of control on the exponential sums . Indeed, there is a trivial (and uninteresting) way to take any (hypothetical) solution of either the asymptotic even Goldbach problem or the twin prime problem and (artificially) convert it to a proof that “uses the circle method”; one simply begins with the quantity or , expresses it in terms of using (5) or (6), and then uses (5) or (6) again to convert these integrals back into the combinatorial expression of counting solutions to or , and then uses the hypothetical solution to the given problem to obtain the required lower bounds on or .
Of course, this would not qualify as a genuine application of the circle method by any reasonable measure. One can then ask the more refined question of whether one could hope to get non-trivial lower bounds on or (or similar quantities) purely from the upper and lower bounds on or similar quantities (and of various type norms on such quantities, such as the bound (4)). We do not yet know what the strongest possible upper and lower bounds in are (otherwise we would already have made progress on major conjectures such as the Riemann hypothesis); but we can make plausible heuristic conjectures on such bounds. And this is enough to make the following heuristic conclusions:
- (i) For “binary” problems such as computing (5), (6), the contribution of the minor arcs potentially dominates that of the major arcs (if all one is given about the minor arc sums is magnitude information), in contrast to “ternary” problems such as computing (2), in which it is the major arc contribution which is absolutely dominant.
- (ii) Upper and lower bounds on the magnitude of are not sufficient, by themselves, to obtain non-trivial bounds on (5), (6) unless these bounds are extremely tight (within a relative error of or better); but
- (iii) obtaining such tight bounds is a problem of comparable difficulty to the original binary problems.
I will provide some justification for these conclusions below the fold; they are reasonably well known “folklore” to many researchers in the field, but it seems that they are rarely made explicit in the literature (in part because these arguments are, by their nature, heuristic instead of rigorous) and I have been asked about them from time to time, so I decided to try to write them down here.
In view of the above conclusions, it seems that the best one can hope to do by using the circle method for the twin prime or even Goldbach problems is to reformulate such problems into a statement of roughly comparable difficulty to the original problem, even if one assumes powerful conjectures such as the Generalised Riemann Hypothesis (which lets one obtain very precise control on major arc exponential sums, but not on minor arc ones). These are not rigorous conclusions – after all, we have already seen that one can always artificially insert the circle method into any viable approach on these problems – but they do strongly suggest that one needs a method other than the circle method in order to fully solve either of these two problems. I do not know what such a method would be, though I can give some heuristic objections to some of the other popular methods used in additive number theory (such as sieve methods, or more recently the use of inverse theorems); this will be done at the end of this post.
I’ve just uploaded to the arXiv my paper “Every odd number greater than 1 is the sum of at most five primes”, submitted to Mathematics of Computation. The main result of the paper is as stated in the title, and is in the spirit of (though significantly weaker than) the even Goldbach conjecture (every even natural number is the sum of at most two primes) and odd Goldbach conjecture (every odd natural number greater than 1 is the sum of at most three primes). It also improves on a result of Ramaré that every even natural number is the sum of at most six primes. This result had previously also been established by Kaniecki under the additional assumption of the Riemann hypothesis, so one can view the main result here as an unconditional version of Kaniecki’s result.
The method used is the Hardy-Littlewood circle method, which was for instance also used to prove Vinogradov’s theorem that every sufficiently large odd number is the sum of three primes. Let’s quickly recall how this argument works. It is convenient to use a proxy for the primes, such as the von Mangoldt function , which is mostly supported on the primes. To represent a large number as the sum of three primes, it suffices to obtain a good lower bound for the sum
By Fourier analysis, one can rewrite this sum as an integral
and . To control this integral, one then needs good bounds on for various values of . To do this, one first approximates by a rational with controlled denominator (using a tool such as the Dirichlet approximation theorem). The analysis then broadly bifurcates into the major arc case when is small, and the minor arc case when is large. In the major arc case, the problem more or less boils down to understanding sums such as
which in turn is almost equivalent to understanding the prime number theorem in arithmetic progressions modulo . In the minor arc case, the prime number theorem is not strong enough to give good bounds (unless one is using some extremely strong hypotheses, such as the generalised Riemann hypothesis), so instead one uses a rather different method, using truncated versions of divisor sum identities such as to split into a collection of linear and bilinear sums that are more tractable to bound, typical examples of which (after using a particularly simple truncated divisor sum identity known as Vaughan’s identity) include the “Type I sum”
and the “Type II sum”
After using tools such as the triangle inequality or Cauchy-Schwarz inequality to eliminate arithmetic functions such as or , one ends up controlling plain exponential sums such as , which can be efficiently controlled in the minor arc case.
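The first step of the argument recalled above, approximating the frequency by a rational with controlled denominator, can be made concrete. The sketch below is a brute-force illustration of the Dirichlet approximation theorem, not an efficient algorithm (continued fractions would be the practical choice). The function name and the parameter name `Q` are hypothetical, and the input is converted to an exact fraction so that the comparison is exact for the number actually stored.

```python
from fractions import Fraction


def dirichlet_approximation(alpha, Q):
    """Return (a, q) with 1 <= q <= Q and |q * alpha - a| <= 1 / Q.

    Dirichlet's theorem guarantees that such a pair exists; this sketch
    returns the one with the smallest denominator q.
    """
    alpha = Fraction(alpha)
    for q in range(1, Q + 1):
        a = round(q * alpha)  # nearest integer to q * alpha
        if abs(q * alpha - a) * Q <= 1:
            return a, q
    raise AssertionError("unreachable: contradicts Dirichlet's theorem")


# A frequency close to 1/3 is approximated with a small denominator
# (the "major arc" situation) ...
print(dirichlet_approximation(Fraction(1, 3) + Fraction(1, 1000), 20))
# ... while sqrt(2) - 1 needs a larger denominator for the same Q.
print(dirichlet_approximation(2 ** 0.5 - 1, 10))
```

Whether the resulting denominator counts as "small" or "large" is then a matter of comparing it with a threshold chosen for the problem at hand. The value of that threshold is not specified in this sketch.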
This argument works well when is extremely large, but starts running into problems for moderate sized , e.g. . The first issue is that of logarithmic losses in the minor arc estimates. A typical minor arc estimate takes the shape
when is close to for some . This only improves upon the trivial estimate from the prime number theorem when . As a consequence, it becomes necessary to obtain an accurate prime number theorem in arithmetic progressions with modulus as large as . However, with current technology, the error terms in such theorems are quite poor (terms such as for some small are typical, and there is also a notorious “Siegel zero” problem), and as a consequence, the method is generally only applicable for very large . For instance, the best explicit result of Vinogradov type known currently is due to Liu and Wang, who established that all odd numbers larger than are the sum of three odd primes. (However, on the assumption of the GRH, the full odd Goldbach conjecture is known to be true; this is a result of Deshouillers, Effinger, te Riele, and Zinoviev.)
In this paper, we make a number of refinements to the general scheme, each one of which is individually rather modest and not all that novel, but which when added together turn out to be enough to resolve the five primes problem (though many more ideas would still be needed to tackle the three primes problem, and as is well known the circle method is very unlikely to be the route to make progress on the two primes problem). The first refinement, which is only available in the five primes case, is to take advantage of the numerical verification of the even Goldbach conjecture up to some large (we take , using a verification of Richstein, although there are now much larger values of – as high as – for which the conjecture has been verified). As such, instead of trying to represent an odd number as the sum of five primes, we can represent it as the sum of three odd primes and a natural number between and . This effectively brings us back to the three primes problem, but with the significant additional boost that one can essentially restrict the frequency variable to be of size . In practice, this eliminates all of the major arcs except for the principal arc around . This is a significant simplification, in particular avoiding the need to deal with the prime number theorem in arithmetic progressions (and all the attendant theory of L-functions, Siegel zeroes, etc.).
In a similar spirit, by taking advantage of the numerical verification of the Riemann hypothesis up to some height , and using the explicit formula relating the von Mangoldt function with the zeroes of the zeta function, one can safely deal with the principal major arc . For our specific application, we use the value , arising from the verification of the Riemann hypothesis for the first zeroes by van de Lune (unpublished) and Wedeniwski. (Such verifications have since been extended further, the latest being that the first zeroes lie on the line.)
To make the contribution of the major arc as efficient as possible, we borrow an idea from a paper of Bourgain, and restrict one of the three primes in the three-primes problem to a somewhat shorter range than the other two (of size instead of , where we take to be something like ), as this largely eliminates the “Archimedean” losses coming from trying to use Fourier methods to control convolutions on . In our paper, we set the scale parameter to be (basically, anything that is much larger than but much less than will work), but we found that an additional gain (which we ended up not using) could be obtained by averaging over a range of scales, say between and . This sort of averaging could be a useful trick in future work on Goldbach-type problems.
It remains to treat the contribution of the “minor arc” . To do this, one needs good and type estimates on the exponential sum . Plancherel’s theorem gives an estimate which loses a logarithmic factor, but it turns out that on this particular minor arc one can use tools from the theory of the large sieve (such as Montgomery’s uncertainty principle) to eliminate this logarithmic loss almost completely; it turns out that the most efficient way to do this is to use an effective upper bound of Siebert on the number of prime pairs less than to obtain an bound that only loses a factor of (or of , once one cuts out the major arc).
For estimates, it turns out that existing effective versions of (1) (in particular, the bound given by Chen and Wang) are insufficient, due to the three logarithmic factors of in the bound. By using a smoothed out version of the sum , for some suitable cutoff function , one can save one factor of a logarithm, obtaining a bound of the form
with effective constants. One can improve the constants further by restricting all summations to odd integers (which barely affects , since was mostly supported on odd numbers anyway), which in practice reduces the effective constants by a factor of two or so. One can also make further improvements in the constants by using the very sharp large sieve inequality to control the “Type II” sums that arise from Vaughan’s identity, and by using integration by parts to improve the bounds on the “Type I” sums. A final gain can then be extracted by optimising the cutoff parameters appearing in Vaughan’s identity to minimise the contribution of the Type II sums (which, in practice, are the dominant term). Combining all these improvements, one ends up with bounds of the shape
when is small (say ) and
when is large (say ). (See the paper for more explicit versions of these estimates.) The point here is that the factors have been partially replaced by smaller logarithmic factors such as or . Putting together all of these improvements, one can finally obtain a satisfactory bound on the minor arc. (There are still some terms with a factor in them, but we use the effective Vinogradov theorem of Liu and Wang to upper bound by , which ends up making the remaining terms involving manageable.)