Suppose one is given a -tuple of distinct integers for some , arranged in increasing order. When is it possible to find infinitely many translates of which consists entirely of primes? The case is just Euclid’s theorem on the infinitude of primes, but the case is already open in general, with the case being the notorious twin prime conjecture.
On the other hand, there are some tuples for which one can easily answer the above question in the negative. For instance, the only translate of that consists entirely of primes is , basically because each translate of must contain an even number, and the only even prime is . More generally, if there is a prime such that meets each of the residue classes , then every translate of contains at least one multiple of ; since is the only multiple of that is prime, this shows that there are only finitely many translates of that consist entirely of primes.
To avoid this obstruction, let us call a -tuple admissible if it avoids at least one residue class for each prime . It is easy to check for admissibility in practice, since a -tuple is automatically admissible in every prime larger than , so one only needs to check a finite number of primes in order to decide on the admissibility of a given tuple. For instance, or are admissible, but is not (because it covers all the residue classes modulo ). We then have the famous Hardy-Littlewood prime tuples conjecture:
This conjecture is extremely difficult (containing the twin prime conjecture, for instance, as a special case), and in fact there is no explicitly known example of an admissible -tuple with for which we can verify this conjecture (although, thanks to the recent work of Zhang, we know that satisfies the conclusion of the prime tuples conjecture for some , even if we can’t yet say what the precise value of is).
Actually, Hardy and Littlewood conjectured a more precise version of Conjecture 1. Given an admissible -tuple , and for each prime , let denote the number of residue classes modulo that meets; thus we have for all by admissibility, and also for all . We then define the singular series associated to by the formula
where is the set of primes; by the previous discussion we see that the infinite product in converges to a finite non-zero number.
We will also need some asymptotic notation (in the spirit of “cheap nonstandard analysis“). We will need a parameter that one should think of going to infinity. Some mathematical objects (such as and ) will be independent of and referred to as fixed; but unless otherwise specified we allow all mathematical objects under consideration to depend on . If and are two such quantities, we say that if one has for some fixed , and if one has for some function of (and of any fixed parameters present) that goes to zero as (for each choice of fixed parameters).
Conjecture 2 (Prime tuples conjecture, quantitative form) Let be a fixed natural number, and let be a fixed admissible -tuple. Then the number of natural numbers such that consists entirely of primes is .
As this conjecture is stronger than Conjecture 1, it is of course open. However there are a number of partial results on this conjecture. For instance, this conjecture is known to be true if one introduces some additional averaging in ; see for instance this previous post. From the methods of sieve theory, one can obtain an upper bound of for the number of with all prime, where depends only on . Sieve theory can also give analogues of Conjecture 2 if the primes are replaced by a suitable notion of almost prime (or more precisely, by a weight function concentrated on almost primes).
Another type of partial result towards Conjectures 1, 2 come from the results of Goldston-Pintz-Yildirim, Motohashi-Pintz, and of Zhang. Following the notation of this recent paper of Pintz, for each , let denote the following assertion (DHL stands for “Dickson-Hardy-Littlewood”):
Conjecture 3 () Let be a fixed admissible -tuple. Then there are infinitely many translates of which contain at least two primes.
This conjecture gets harder as gets smaller. Note for instance that would imply all the cases of Conjecture 1, including the twin prime conjecture. More generally, if one knew for some , then one would immediately conclude that there are an infinite number of pairs of consecutive primes of separation at most , where is the minimal diameter amongst all admissible -tuples . Values of for small can be found at this link (with denoted in that page). For large , the best upper bounds on have been found by using admissible -tuples of the form
where denotes the prime and is a parameter to be optimised over (in practice it is an order of magnitude or two smaller than ); see this blog post for details. The upshot is that one can bound for large by a quantity slightly smaller than (and the large sieve inequality shows that this is sharp up to a factor of two, see e.g. this previous post for more discussion).
In a key breakthrough, Goldston, Pintz, and Yildirim were able to establish the following conditional result a few years ago:
Theorem 4 (Goldston-Pintz-Yildirim) Suppose that the Elliott-Halberstam conjecture is true for some . Then is true for some finite . In particular, this establishes an infinite number of pairs of consecutive primes of separation .
The dependence of constants between and given by the Goldston-Pintz-Yildirim argument is basically of the form . (UPDATE: as recently observed by Farkas, Pintz, and Revesz, this relationship can be improved to .)
Unfortunately, the Elliott-Halberstam conjecture (which we will state properly below) is only known for , an important result known as the Bombieri-Vinogradov theorem. If one uses the Bombieri-Vinogradov theorem instead of the Elliott-Halberstam conjecture, Goldston, Pintz, and Yildirim were still able to show the highly non-trivial result that there were infinitely many pairs of consecutive primes with (actually they showed more than this; see e.g. this survey of Soundararajan for details).
Actually, the full strength of the Elliott-Halberstam conjecture is not needed for these results. There is a technical specialisation of the Elliott-Halberstam conjecture which does not presently have a commonly accepted name; I will call it the Motohashi-Pintz-Zhang conjecture in this post, where is a parameter. We will define this conjecture more precisely later, but let us remark for now that is a consequence of .
We then have the following two theorems. Firstly, we have the following strengthening of Theorem 4:
A version of this result (with a slightly different formulation of ) appears in this paper of Motohashi and Pintz, and in the paper of Zhang, Theorem 5 is proven for the concrete values and . We will supply a self-contained proof of Theorem 5 below the fold, the constants upon those in Zhang’s paper (in particular, for , we can take as low as , with further improvements on the way). As with Theorem 4, we have an inverse quadratic relationship .
In his paper, Zhang obtained for the first time an unconditional advance on :
This is a deep result, building upon the work of Fouvry-Iwaniec, Friedlander-Iwaniec and Bombieri-Friedlander-Iwaniec which established results of a similar nature to but simpler in some key respects. We will not discuss this result further here, except to say that they rely on the (higher-dimensional case of the) Weil conjectures, which were famously proven by Deligne using methods from l-adic cohomology. Also, it was believed among at least some experts that the methods of Bombieri, Fouvry, Friedlander, and Iwaniec were not quite strong enough to obtain results of the form , making Theorem 6 a particularly impressive achievement.
Combining Theorem 6 with Theorem 5 we obtain for some finite ; Zhang obtains this for but as detailed below, this can be lowered to . This in turn gives infinitely many pairs of consecutive primes of separation at most . Zhang gives a simple argument that bounds by , giving his famous result that there are infinitely many pairs of primes of separation at most ; by being a bit more careful (as discussed in this post) one can lower the upper bound on to , and if one instead uses the newer value for one can instead use the bound . (Many thanks to Scott Morrison for these numerics.) UPDATE: These values are now obsolete; see this web page for the latest bounds.
In this post we would like to give a self-contained proof of both Theorem 4 and Theorem 5, which are both sieve-theoretic results that are mainly elementary in nature. (But, as stated earlier, we will not discuss the deepest new result in Zhang’s paper, namely Theorem 6.) Our presentation will deviate a little bit from the traditional sieve-theoretic approach in a few places. Firstly, there is a portion of the argument that is traditionally handled using contour integration and properties of the Riemann zeta function; we will present a “cheaper” approach (which Ben Green and I used in our papers, e.g. in this one) using Fourier analysis, with the only property used about the zeta function being the elementary fact that blows up like as one approaches from the right. To deal with the contribution of small primes (which is the source of the singular series ), it will be convenient to use the “-trick” (introduced in this paper of mine with Ben), passing to a single residue class mod (where is the product of all the small primes) to end up in a situation in which all small primes have been “turned off” which leads to better pseudorandomness properties (for instance, once one eliminates all multiples of small primes, almost all pairs of remaining numbers will be coprime).
— 1. The -trick —
In this section we introduce the “-trick”, which is a simple but useful device that automatically takes care of local factors arising from small primes, such as the singular series . The price one pays for this trick is that the explicit decay rates in various terms can be rather poor, but for the applications here, we will not need to know any information on these decay rates and so the -trick may be freely applied.
Let be a natural number, which should be thought of as either fixed and large, or as a very slowly growing function of . Actually, the two viewpoints are basically equivalent for the purposes of asymptotic analysis (at least at the qualitative level of decay rates), thanks to the following basic principle:
- (i) For every fixed there exists a fixed such that
for all fixed .
- (ii) We have
whenever is a function of going to infinity that is sufficiently slowly growing. (In other words, there exists a function going to infinity with the property that whenever is a natural number-valued function of is such that as and for all sufficiently large .)
This principle is closely related to the overspill principle from nonstandard analysis, though we will not explicitly adopt a nonstandard perspective here. It is also similar in spirit to the diagonalisation trick used to prove the Arzela-Ascoli theorem.
Proof: We first show that (i) implies (ii). By (i), we see that for every natural number , we can find a real number with the property that
whenever , , and are such that . By increasing the as necessary we may assume that they are increasing and go to infinity as . If we then define to equal the largest natural number for which , or equal to if no such number exists, then one easily verifies that whenever goes to infinity and is bounded by for sufficiently large .
Now we show that (ii) implies (i). Suppose for contradiction that (i) failed, then we can find a fixed with the property that for any natural number , there exist such that for arbitrarily large . We can select the to be increasing to infinity, and then we can find a sequence increasing to infinity such that for all ; by increasing as necessary, we can also ensure that for all and . If we then define to be when , and for , we see that whenever , contradicting (ii).
Henceforth we will usually think of as a sufficiently slowly growing function of , although we will on occasion take advantage of Lemma 7 to switch to thinking of as a large fixed quantity instead. In either case, we should think of as exceeding the size of fixed quantities such as or , at least in the limit where is large; in particular, for a fixed -tuple , we will have
as this follows from the absolutely convergent nature of the sum and hence also . As a consequence of this, once we “turn off” all the primes less than , any errors in our sieve-theoretic analysis which are quadratic or higher in can be essentially ignored, which will be very convenient for us. In a similar vein, for any fixed -tuple , one has
In order to “turn off” all the small primes, we introduce the quantity , defined as the product of all the primes up to (i.e. the primorial of ):
As is going to infinity, is going to infinity also (but as slowly as we please). The idea of the -trick is to search for prime patterns in a single residue class , which as mentioned earlier will “turn off” all the primes less than in the sieve-theoretic analysis.
Using (4) and the Chinese remainder theorem, we may thus approximate the singular series as
where is the Euler totient function of , and is the set of residue classes such that all of the shifts are coprime to . Note that if consists purely of primes and is sufficiently large, then must lie in one of the residue classes in . Thus we can count tuples with all prime by working in each residue class in separately. We conclude that Conjecture 2 is equivalent to the following “-tricked version” in which the singular series is no longer present (or, more precisely, has been replaced by some natural normalisation factors depending on , such as ):
Conjecture 8 (Prime tuples conjecture, W-tricked quantitative form) Let be a fixed natural number, and let be a fixed admissible -tuple. Assume is a sufficiently slowly growing function of . Then for any residue class in , the number of natural numbers with such that consists entirely of primes is .
We will work with similarly -tricked asymptotics in the analysis below.
— 2. Sums of multiplicative functions —
As a result of the sieve-theoretic computations to follow, we will frequently need to estimate sums of the form
where is a multiplicative function, the sieve level (also denoted in some literature) is a fixed power of (such as or ), is the Möbius function, is a fixed smooth compactly supported function, is a (possibly half-infinite) interval in , and is the set of square-free numbers that are products of distinct primes in . (Actually, in applications won’t quite be smooth, but instead have some high order of differentiability (e.g. times continuously differentiable for some ), but we can extend the analysis of smooth to sufficiently differentiable by various standard limiting or approximation arguments which we will not dwell on here.) We will also need to control the more complicated variant
where are also smooth compactly supported functions. In practice, the interval will be something like , , . In particular, thanks to the -trick we will be able to turn off all the primes up to , so that only contains primes larger than , allowing us to take advantage of bounds such as (2).
Once is restricted to , the quantity is determined entirely by the values of the multiplicative function at primes in :
for all and some fixed positive (note that we allow the implied constants in the notation to depend on quantities such as ); we refer to as the dimension of the multiplicative function . Henceforth we assume that has a fixed dimension . We remark that we could unify the treatment of and in what follows by allowing multiplicative functions of negative dimension, but we will avoid doing so here. In our applications will be an integer; one could also generalise much of the discussion below to the fractional dimension case, but we will not need to do so here.
Traditionally the above expressions are handled by complex analysis, starting with Perron’s formula. We will instead take a slightly different Fourier-analytic approach. We perform a Fourier expansion of the smooth compactly supported function to obtain a representation
for some Schwartz function ; in particular, is rapidly decreasing. (Strictly speaking, is the Fourier transform of shifted in the complex domain by , rather than the true Fourier transform of , but we will ignore this distinction for the purposes of this discussion.) In particular we have
for any . By Fubini’s theorem, we can thus write as
which factorises as
Similarly one has
In order to use asymptotics of the Riemann zeta function near the pole , it is convenient to temporarily truncate the above integrals to the region or :
Lemma 9 For any fixed , we have
Also we have the crude bound
Proof: We begin with the bounds on . From (6) we have
for (which forces , so there is no issue with the singularity of the logarithm) and thus
we see on taking logarithms that
The bounds on then follow from the rapid decrease of . The bounds for and are proven similarly.
From (6) and the restriction of to quantities larger than , we see that
where is the restricted Euler product
which is well-defined for at least (and this is the only region of for which we will need ).
We now specialise to the model case , in which case
where is the Riemann zeta function. Using the basic (and easily proven) asymptotic for near
for , if is sufficiently slowly growing (this can be seen by first working with a fixed large and then using Lemma 7). Note that because of the above truncation, we do not need any deeper bounds on than what one can obtain from the simple pole at ; in particular no zero-free regions near the line are needed here. (This is ultimately because of the smooth nature of , which is sufficient for the applications in this post; if one wanted rougher cutoff functions here then the situation is closer to that of the prime number theorem, and non-trivial zero-free regions would be required.)
We conclude in the case that
using the rapid decrease of , we thus have
We can rewrite these expressions in terms of instead of . Using the Gamma function identity
and (7) we see that
whilst from differentiating (7) times at the origin (after first dividing by ) we see that
Combining these two methods, we also see that
We have thus obtained the following asymptotics:
These asymptotics will suffice for the treatment of the Goldston-Pintz-Yildirim theorem (Theorem 4). For the Motohashi-Pintz-Zhang theorem (Theorem 5) we will also need to deal with truncated intervals , such as ; we will discuss how to deal with these truncations later.
— 3. The Goldston-Yildirim-Pintz theorem —
We are now ready to state and prove the Goldston-Yildirim-Pintz theorem. We first need to state the Elliott-Halberstam conjecture properly.
Let be the von Mangoldt function, thus equals when is equal to a prime or a power of that prime, and equal to zero otherwise. The prime number theorem in arithmetic progressions tells us that
for any fixed arithmetic progression with coprime to . In particular,
where are the residue classes mod that are coprime to . By invoking the Siegel-Walfisz theorem one can obtain the improvement
for any fixed (though, annoyingly, the implied constant here is only ineffectively bounded with current methods; see this previous post for further discussion).
The above error term is only useful when is fixed (or is of logarithmic size in ). For larger values of , it is very difficult to get good error terms for each separately, unless one assumes powerful hypotheses such as the generalised Riemann hypothesis. However, it is possible to obtain good control on the error term if one averages in . More precisely, for any , let denote the following assertion:
Conjecture 11 () One has
for all fixed .
This should be compared with the asymptotic for some absolute constant , as can be deduced for instance from Proposition 10. The Elliott-Halberstam conjecture is the assertion that holds for all . This remains open, but the important Bombieri-Vinogradov theorem establishes for all . Remarkably, the threshold is also the limit of what one can establish if one directly invokes the generalised Riemann hypothesis, so the Bombieri-Vinogradov theorem is often referred to as an assertion that the generalised Riemann hypothesis (or at least the Siegel-Walfisz theorem) holds “on the average”, which is often good enough for sieve-theoretic purposes.
We may replace the von Mangoldt function with the slight variant , defined to equal when is a prime and zero otherwise. Using this replacement, as well as the prime number theorem (with error term), it is not difficult to show that is equivalent to the estimate
Now we establish Theorem 4. Suppose that holds for some fixed , let be sufficiently large depending on but otherwise fixed, and let be a fixed admissible -tuple. We would like to show that there are infinitely many such that contains at least two primes. We will begin with the -trick, restricting to a residue class with (note that is non-empty because is admissible).
for all and some fixed , where is a fixed power of (we will eventually take ). (The factors of , , and on the right-hand side are natural normalisations coming from sieve theory and the reader should not pay too much attention to them.) Informally, (9) says that has some normalised density at most , and then (10) roughly speaking asserts that relative to the weight , has a probability of at least of being prime. If we sum (10) for all and then subtract off copies of (9), we conclude that
and so is positive for at least one value of between and . This can only occur if contains two or more primes. Thus we must have containing at least two primes for some between and ; sending off to infinity then gives as desired.
It thus suffices to find a weight function obeying the required properties (9), (10) with parameters obeying the key inequality (11). It is thus of interest to make as large a power of as possible, and to minimise the ratio between and . It is in the former task that the Elliott-Halberstam hypothesis will be crucial.
The key is to find a good choice of , and the selection of this weight is arguably the main contribution of Goldston, Pintz, and Yildirim, who use a carefully modified version of the Selberg sieve. Following (a slight modification of) the Goldston-Pintz-Yildirim argument, we will take a weight of the form , where
The intuition here is that is a truncated approximation to a function of the form
for some natural number , which one can check is only non-vanishing when has at most distinct prime factors in . So is concentrated on those numbers for which already has few prime factors for , which will assist in making the ratio as small as possible.
The constraint is equivalent to requiring that for each prime dividing , lies in one of the residue classes for . By choice of , , so all the are distinct, and so we are constraining to lie in one of residue classes modulo for each ; together with the constraint and the Chinese remainder theorem, we are thus constraining to residue classes modulo , where is the number of prime factors of . We thus have
Note from the support of that may be constrained to be at most , so that is at most . We can thus express the left-hand side of (9) as the main term
plus an error
By Proposition 10, the error term is . So if we set
then the error term will certainly give a negligible contribution to (9) with plenty of room to spare. (But when we come to the more difficult sum (10), we will have much less room – only a superlogarithmic amount of room, in fact.) To show (9), it thus suffices to show that
But by Proposition 10 (applied to the -dimensional multiplicative function ) and the support of , this bound holds with equal to the quantity
Now we consider the inner sum
As discussed earlier, the conditions and split into residue classes . However, if for one of the primes dividing , then must vanish (since is much less than ). So there are actually only residue classes for which is coprime to . We thus have
Remark 1 There is an inefficiency here; the supremum in (13) is over all primitive residue classes , but actually one only needs to take the supremum over the residue classes for which , where . This inefficiency is not exploitable if we insist on using the Elliott-Halberstam conjecture as the starting hypothesis, but will be used in the arguments of the next section in which a more lightweight hypothesis is utilised.
The left-hand side of (10) is thus equal to the main term
plus an error term
We first deal with the error term. Since is in and is bounded by on the support of this function, and each has representations of the form , we can bound this expression by
Note that we are assuming to be a fixed smooth compactly supported function and so it has magnitude . On the other hand, from Proposition 10 and the trivial bound we have
while from (8) (and here we crucially use the choice of ) one easily verifies that
for any fixed . By the Cauchy-Schwarz inequality we see that the error term to (10) is negligible (assuming sufficiently slowly growing of course). Meanwhile, the main term can be rewritten as
where is the -dimensional multiplicative function
To obtain the crucial inequality (11), we thus need to locate a fixed smooth non-negative function supported on obeying the inequality
In principle one can use calculus of variations to optimise the choice of here (it will be the ground state of a certain one-dimensional Schrödinger operator), but one can already get a fairly good result here by a specific choice of that is amenable for computations, namely a polynomial of the form for and some integer , with vanishing for and smoothly truncated to somehow at negative values of . Strictly speaking, this is not admissible here because it is not infinitely smooth at , being only times continuously differentiable instead, but one can regularise this function to be smooth without significantly affecting either side of (14), so we will go ahead and test (14) with this function and leave the regularisation details to the reader. The inequality then becomes (after cancelling some factors)
Using the Beta function identity
which simplifies to
Actually, the same inequality is also applicable when is real instead of integer, using Gamma functions in place of factorials; we leave the details to the interested reader. We can then optimise in by setting , arriving at the inequality
But as long as , this inequality is satisfiable for any larger than . This concludes the proof of Theorem 4.
Remark 2 One can obtain slightly better dependencies of in terms of by using more general functions for than the monomials , for instance one can take linear combinations of such functions. See the paper of Goldston, Pintz, and Yildirim for details. Unfortunately, as noted in this survey of Soundararajan, one has the general inequality
The right-hand side simplifies after some integration by parts to
Subtracting off from both sides, one is left with
From the fundamental theorem of calculus and Cauchy-Schwarz, one has the bound
Using this bound for close to and dominating by for far from , we obtain the claim (at least if is large enough). There is some slack in this argument; it would be of interest to calculate exactly what the best constants are in (17), so that one can obtain the optimal relationship between and .
To get around this obstruction (17) in the unconditional setting when one only has for , Goldston, Pintz, and Yildirim also considered sums of the form in which was now outside (but close to) . While the bounds here were significantly inferior to those in (10), they were still sufficient to prove a variant of the inequality (11) to get reasonably small gaps between primes.
— 4. The Motohashi-Pintz-Zhang theorem —
We now modify the above argument to give Theorem 5. Our treatment here is different from that of Zhang in that it employs the method of Buchstab iteration; a related argument also appears in the paper of Motohashi and Pintz. This arrangement of the argument leads to a more efficient dependence of on than in the paper of Zhang. (The argument of Motohashi and Pintz is a bit more complicated and uses a slightly different formulation of the base conjecture , but the final bounds are similar to those given here, albeit with non-explicit constants in the notation.)
The main idea here is to truncate the interval of relevant primes from to for some small . Somewhat remarkably, it turns out that this apparently severe truncation does not affect the sums (9), (10) here as long as is large (which is going to be the case in practice, with being comparable to ). The intuition is that was already concentrated on those for which had about factors, and it is too “expensive” for one of these factors to as large as or more, as it forces many of the other factors to be smaller than they “want” to be. The advantage of truncating the set of primes this way is that the version of the Elliott-Halberstam conjecture needed also acquires the same truncation, which gives that version a certain “well-factored” form (in the spirit of the work of Bombieri, Fouvry, Friedlander, and Iwaniec) which is essential in being able to establish that conjecture unconditionally for some suitably small .
To make this more precise, we first formalise the conjecture for mentioned earlier.
Conjecture 12 () Let be a fixed -tuple (not necessarily admissible) for some fixed , and let be a primitive residue class. Then
This is the -tricked formulation of the conjecture as (implicitly) stated in Zhang’s paper, which did not have the restriction present (and with the interval enlarged from to , and was required to be admissible). However the two formulations are morally equivalent (and Zhang’s arguments establish Theorem 6 with as stated). From the prime number theorem in arithmetic progressions (with error term) together with Proposition 10 we observe that we may replace (18) here by the slight variant
without affecting the truth of .
It is also not difficult to deduce from after using a Cauchy-Schwarz argument to dispose of the residue classes in the above sum (cf. the treatment of the error term in (10) in the previous section); we leave the details to the interested reader. Note however that whilst demands control over all primitive residue classes in a given modulus , the conjecture only requires control of a much smaller number of residue classes (roughly polylogarithmic in number, on average). Thus is simpler than , though it is still far from trivial.
We now begin the proof of Theorem 5. Let be such that holds, and let be a sufficiently large quantity depending on but which is otherwise fixed. As before, it suffices to locate a non-negative sieve weight that obeys the estimates (9), (10) for parameters that obey the key inequality (11), and with smooth and supported on . The choice of weight is almost the same as before; it is also given as a square with given by (12), but now the interval is truncated to instead of . Also, in this argument we take
plus an error term which continues to be acceptable (indeed, the error term is slightly smaller than in the previous case due to the truncated nature of ). At this point in the previous section we applied Proposition 10, but that proposition was only available for the untruncated interval instead of the truncated interval . One could try to adapt the proof of that proposition to the truncated case, but then one is faced with the problem of controlling the truncated zeta function . While one can eventually get some reasonable asymptotics for this function, it seems to be more efficient to eschew Fourier analysis and work entirely in “physical space” by the following partial Möbius inversion argument. Write , thus . Observe that for any , the quantity equals when lies in and vanishes otherwise. Hence, for any function of and supported on squarefree numbers we have the partial Mobius inversion formula
and so the main term (20) can be expressed as
We first dispose of the contribution to (21) when share a common prime factor for some . For any fixed , we can bound this contribution by
Factorising the inner two sums as an Euler product, this becomes
[UPDATE: The above argument is not quite correct; a corrected (and improved) version is given at this newer post.] The product is by e.g. Mertens’ theorem, while . So the contribution of this case is negligible.
If do not share a common factor for any , then we can factor as . Rearranging this portion of (21) and then reinserting the case when have a common factor for some , we may write (21) up to negligible errors as
Note that we can restrict to be at most as otherwise the factors vanish. The inner sum
is now of the form that can be treated by Proposition 10, and takes the form
Here we make the technical remark that the translates of by shifts between and are uniformly controlled in smooth norms, which means that the error here is uniform in the choices of .
Let us first deal with the contribution of the error term. This is bounded by
The inner sum factorises as
Note that the factor increases very rapidly with when is large, which basically means that any non-trivial shift of the factors to the left by or will cause the integral in (22) to decrease dramatically. Since all the in are either equal to or bounded below by , this will cause the term to dominate in the regime when is large (or more precisely ), which is the case in applications.
At this point, in order to perform the computations cleanly, we will mimic the arguments from the previous section and take the explicit choice
for some integer and (and some smooth continuation to for negative , and so
for positive . (Again, this function is not quite smooth at , but this issue can be dealt with by an infinitesimal regularisation argument which we omit here.) The left-hand side of (22) now becomes
The integral here is a little bit more complicated than a beta integral. To estimate it, we use the beta function identity to observe that
and hence by Cauchy-Schwarz
This Cauchy-Schwarz step is a bit wasteful when are far apart, but this does seems to only lead to a minor loss of efficiency in the estimates. We have thus bounded the left-hand side of (22) by
It is now convenient to collapse the double summation to a single summation. We may bound
(since is less than the greater of and ) and observe that each has representations of the form , so we may now bound the left-hand side of (22) by
Note that an element of is either equal to , or lies in the interval for some natural number . In the latter case, we have
In particular, this expression vanishes if . We can thus bound the left-hand side of (22) by
then we have thus bounded the left-hand side of (22) by
as can be seen by isolating the smallest prime in all the terms in (23) with . (This inequality is very close to being an identity, the only loss coming from the possibility of the prime factor being repeated in a term associated to .) We can iterate this identity to obtain the following conclusion:
whenever and for some fixed , with the error term being uniform in the choice of .
Proof: Write . We prove the bound by strong induction on . The case follows from (24). Now suppose that and that the claim has already been proven for smaller . Let and . Note that whenever . We thus have from (25) and the induction hypothesis that
applying Mertens’ theorem (or the prime number theorem) we have
and the claim follows from the telescoping identity
Applying this inequality, we have established (22) with
so in the regime , is roughly , which will be negligible for the parameter ranges of of interest. Thus the in this argument is quite close to the from (15) in practice.
plus an error term
where is the quantity
is the polynomial , and was defined in (19). Using the hypothesis and Cauchy-Schwarz as in the previous section we see that the error term is negligible for the purposes of establishing (10). As for the main term, the same argument used to reduce (9) to (22) shows that (10) reduces to
Here, we can do something a bit crude; with our choice of , the integrand is non-negative, so we can simply discard all but the term and reduce to
(The intuition here is that by refusing to sieve using primes larger than , we have enlarged the sieve , which makes the upper bound (9) more difficult but the lower bound (10) actually becomes easier.) So we can take the same choice (16) of as in the previous section:
As before, can be taken to be non-integer if desired. Setting to be slightly larger than we obtain Theorem 5.
— 5. Using optimal values of (NEW, June 5, 2013) —
We can do better than given above by using an optimal value of . The following result was obtained recently by Farkas, Pintz, and Revesz, and independently worked out by commenters on this blog:
where is a smooth function with that is not identically vanishing, has a minimal value of
where is the first zero of the Bessel function . Furthermore, this minimum is attained if (and only if) is a scalar multiple of the function
Proof: The function , by definition, obeys the Bessel differential equation
for smooth functions with . A calculation using (29) and integration by parts shows that
and so , giving the first claim; the second claim follows by noting that vanishes if and only if is a scalar multiple of . (Note that the integration by parts is a little subtle, because vanishes to first order at and so blows up to first order; but still vanishes to first order at , allowing one to justify the integration by parts by a standard limiting argument.)
If we now test (14) with a function which is smooth, vanishes to order at , and has a derivative equal to , we see that we can deduce from whenever
Using the known asymptotic
for and large (see e.g. Abramowitz and Stegun), this is asymptotically of the form
thus giving a relationship of the form that is superior to the previous relationship .
A similar argument can be given for Theorem 5, using of the form above rather than a monomial (and extended by zero to ). For future optimisation we consider a generalisation of in which the interval is of the form rather than , so that is now rather than . As before, the key point is the estimation of . The arguments leading to (22) go through for any test function , so we have to show
We need some sign information on :
Lemma 15 On , is positive, is negative and is positive.
Proof: From (28) we have
From construction we already know that is positive on . The above equation then shows that is negative at , and that cannot have any local minimum in , so is negative throughout. To obtain the final claim we use an argument provided by Gergely Harcos in the comments: from the recursive relations for Bessel functions we can check that is a positive multiple of , and the claim then follows from the interlacing properties of the zeroes of Bessel functions.
Write , so is positive and by Theorem 14 we have
If , then as is negative and increasing we have
for , and thus by change of variable
for , and thus
for all . Similarly
for all . By Cauchy-Schwarz we can thus bound the integral in (30) by
and so (30) reduces to
Repeating the arguments of the previous section, we can reduce this to
and by further continuing the arguments of the previous section we end up being able to take
Also, the previous arguments allow us to take
The key inequality (11) now becomes