For any , let denote the assertion that there are infinitely many pairs of consecutive primes whose difference is at most , or equivalently that

thus for instance is the notorious twin prime conjecture. While this conjecture remains unsolved, we have the following recent breakthrough result of Zhang, building upon earlier work of Goldston-Pintz-Yildirim, Bombieri, Fouvry, Friedlander, and Iwaniec, and others:

Theorem 1 (Zhang’s theorem) is true for some finite .

In fact, Zhang’s paper shows that is true with .

About a month ago, the Polymath8 project was launched with the objective of reading through Zhang’s paper, clarifying the arguments, and then making them more efficient, in order to improve the value of . This project is still ongoing, but we have made significant progress; currently, we have confirmed that holds for as low as , and provisionally for as low as subject to certain lengthy arguments being checked. For several reasons, our methods (which are largely based on Zhang’s original argument structure, though with numerous refinements and improvements) will not be able to attain the twin prime conjecture , but there is still scope to lower the value of a bit further than what we have currently.

The precise arguments here are quite technical, and are discussed at length on other posts on this blog. In this post, I would like to give a “high level” summary of how Zhang’s argument works, and give some impressions of the improvements we have made so far; these would already be familiar to the active participants of the Polymath8 project, but perhaps may be of value to people who are following this project on a more casual basis.

While Zhang’s arguments (and our refinements of it) are quite lengthy, they are fortunately also very modular, that is to say they can be broken up into several independent components that can be understood and optimised more or less separately from each other (although we have on occasion needed to modify the formulation of one component in order to better suit the needs of another). At the top level, Zhang’s argument looks like this:

- Statements of the form are deduced from weakened versions of the Hardy-Littlewood prime tuples conjecture, which we have denoted (the stands for “Dickson-Hardy-Littlewood”), by locating suitable narrow admissible tuples (see below). Zhang’s paper establishes for the first time an unconditional proof of for some finite ; in his initial paper, was , but we have lowered this value to (and provisionally to ). Any reduction in the value of leads directly to reductions in the value of ; a web site to collect the best known values of in terms of has recently been set up here (and is accepting submissions for anyone who finds narrower admissible tuples than are currently known).
- Next, by adapting sieve-theoretic arguments of Goldston, Pintz, and Yildirim, the Dickson-Hardy-Littlewood type assertions are deduced in turn from weakened versions of the Elliott-Halberstam conjecture that we have denoted (the stands for “Motohashi-Pintz-Zhang”). More recently, we have replaced the conjecture by a slightly stronger conjecture to significantly improve the efficiency of this step (using some recent ideas of Pintz). Roughly speaking, these statements assert that the primes are more or less evenly distributed along many arithmetic progressions, including those that have relatively large spacing. A crucial technical fact here is that in contrast to the older Elliott-Halberstam conjecture, the Motohashi-Pintz-Zhang estimates only require one to control progressions whose spacings have a lot of small prime factors (the original conjecture requires the spacing to be smooth, but the newer variant has relaxed this to “densely divisible” as this turns out to be more efficient). The parameter is more important than the technical parameter ; we would like to be as large as possible, as any increase in this parameter should lead to a reduced value of . In Zhang’s original paper, was taken to be ; we have now increased this to be almost as large as (and provisionally ).
- By a certain amount of combinatorial manipulation (combined with a useful decomposition of the von Mangoldt function due to Heath-Brown), estimates such as can be deduced from three subestimates, the “Type I” estimate , the “Type II” estimate , and the “Type III” estimate , which all involve the distribution of certain Dirichlet convolutions in arithmetic progressions. Here is an adjustable parameter that demarcates the border between the Type I and Type III estimates; raising makes it easier to prove Type III estimates but harder to prove Type I estimates, and lowering of course has the opposite effect. There is a combinatorial lemma that asserts that as long as one can find some between and for which all three estimates , , hold, one can prove . (The condition arises from the combinatorics, and appears to be rather essential; in fact, it is currently a major obstacle to further improvement of and hence and .)
- The Type I estimates are asserting good distribution properties of convolutions of the form , where are moderately long sequences which have controlled magnitude and length but are otherwise arbitrary. Estimates that are roughly of this type first appeared in a series of papers by Bombieri, Fouvry, Friedlander, Iwaniec, and other authors, and Zhang’s arguments here broadly follow those of previous authors, but with several new twists that take advantage of the many factors of the spacing . In particular, the
*dispersion method* of Linnik is used (which one can think of as a clever application of the Cauchy-Schwarz inequality) to ultimately reduce matters (after more Cauchy-Schwarz, as well as treatment of several error terms) to estimation of incomplete Kloosterman-type sums. Zhang’s argument uses classical estimates on these Kloosterman sums (dating back to the work of Weil), but we have improved this using the “q-van der Corput A-process” introduced by Heath-Brown and Ringrose.

- The Type II estimates are similar to the Type I estimates, but cover a small hole in the coverage of the Type I estimates which comes up when the two sequences are almost equal in length. It turns out that one can modify the Type I argument to cover this case also. In practice, these estimates give less stringent conditions on than the other two estimates, and so as a first approximation one can ignore the need to treat these estimates, although recently our Type I and Type III estimates have become so strong that it has become necessary to tighten the Type II estimates as well.
- The Type III estimates are an averaged variant of the classical problem of understanding the distribution of the ternary divisor function in arithmetic progressions. There are various ways to attack this problem, but most of them ultimately boil down (after the use of standard devices such as Cauchy-Schwarz and completion of sums) to the task of controlling certain higher-dimensional Kloosterman-type sums such as
In principle, any such sum can be controlled by invoking Deligne’s proof of the Weil conjectures in arbitrary dimension (which, roughly speaking, establishes the analogue of the Riemann hypothesis for arbitrary varieties over finite fields), although in the higher dimensional setting some algebraic geometry is needed to ensure that one gets the full “square root cancellation” for these exponential sums. (For the particular sum above, the necessary details were worked out by Birch and Bombieri.) As such, this part of the argument is by far the least elementary component of the whole. Zhang’s original argument cleverly exploited some additional cancellation in the above exponential sums that goes beyond the naive square root cancellation heuristic; more recently, an alternate argument of Fouvry, Kowalski, Michel, and Nelson uses bounds on a slightly different higher-dimensional Kloosterman-type sum to obtain results that give better values of . We have also been able to improve upon these estimates by exploiting some additional averaging that was left unused by the previous arguments.

As of this time of writing, our understanding of the first three stages of Zhang’s argument (getting from to , getting from or to , and getting to or from Type I, Type II, and Type III estimates) is quite satisfactory, with the implications here being about as efficient as one could hope for with current methods, although one could still hope to get some small improvements in parameters by wringing out some of the last few inefficiencies. The remaining major sources of improvements to the parameters are then coming from gains in the Type I, II, and III estimates; we are currently in the process of making such improvements, but it will still take some time before they are fully optimised.

Below the fold I will discuss (mostly at an informal, non-rigorous level) the six steps above in a little more detail (full details can of course be found in the other polymath8 posts on this blog). This post will also serve as a new research thread, as the previous threads were getting quite lengthy.

** — 1. Finding narrow prime tuples — **

For any , we define an *admissible -tuple* to be a tuple of distinct integers , such that avoids at least one residue class modulo for each prime . Thus for instance is an admissible -tuple, and is an admissible -tuple, but is not admissible (it does not avoid any residue classes modulo ).
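
Admissibility is easy to test by machine, since only the primes up to the length of the tuple need to be checked (for larger primes a residue class is avoided automatically). Here is a minimal sketch in Python, using standard small examples rather than the actual Polymath8 tuples:

```python
from sympy import primerange

def is_admissible(tup):
    """A tuple of distinct integers is admissible if, for every prime p up to
    its length, it avoids at least one residue class modulo p.  Primes larger
    than the length need not be checked: the tuple has too few elements to
    occupy every class."""
    k = len(tup)
    for p in primerange(2, k + 1):
        if len({h % p for h in tup}) == p:   # every class mod p is occupied
            return False
    return True

print(is_admissible((0, 2)))       # True:  an admissible 2-tuple
print(is_admissible((0, 2, 6)))    # True:  an admissible 3-tuple
print(is_admissible((0, 2, 4)))    # False: occupies every class mod 3
```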

The prime tuples conjecture asserts that whenever is an admissible -tuple, then there are infinitely many translates that consist entirely of primes. This conjecture is still out of reach of current methods, but we can hope to establish results of the following form for a given :

Conjecture 2() If is an admissible -tuple, then there are infinitely many translates of that contain at least two primes.

Let be the minimal diameter of an admissible -tuple: thus for instance , , and . It is clear that if holds for some , then is true for , since we have an admissible -tuple of diameter , and any translate of that tuple that contains two primes will generate a pair of consecutive primes of distance at most apart. So we can use statements of the form to deduce estimates of the form , provided that we can find admissible -tuples that are as narrow as possible, in order to obtain as good a value of as one can hope to get.

Perhaps the easiest way to obtain an admissible -tuple of reasonably narrow width is to take “the first primes past “, i.e. the tuple , where is the first prime larger than . It is not difficult to show that this is an admissible tuple. Zhang used this construction with to obtain ; as observed later by Trudgian, this construction actually has diameter , giving the bound .
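
A toy run of this construction (with a small parameter standing in for the actual value used by Zhang, which is not reproduced here):

```python
from sympy import nextprime

def first_primes_past(k0):
    """The tuple of the first k0 primes exceeding k0.  Every element is a prime
    larger than k0, so the residue class 0 mod p is avoided for each prime
    p <= k0, which makes the tuple admissible.  Returns the tuple and its
    diameter (the quantity called H above)."""
    tup, p = [], k0
    for _ in range(k0):
        p = nextprime(p)
        tup.append(p)
    return tup, tup[-1] - tup[0]

tup, diameter = first_primes_past(100)   # illustrative small value of k0
print(len(tup), diameter)
```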

For large , this construction gives an asymptotic of the form

as can be seen by a careful application of the prime number theorem.

One can do better than this in a number of ways. Firstly, it turns out that one can reduce a bit and still have admissible; for , some computation shows that this trick lowers the diameter a little bit to ; asymptotically this gives

(because can be taken to be ). A larger improvement comes from the work of Hensley and Richards, namely to shift the tuple to cover positive and negative primes around the origin, in order to exploit the slightly increased density of the primes in this interval. Specifically, if one uses a tuple of the form

and optimises in , one can bound . Asymptotically, this gives the slight improvement

One can get a little more improvement by deleting a few primes from one end of this tuple and adding them to another, although the gains here are modest. Incidentally, Hensley and Richards used this improved asymptotic to demonstrate that the prime tuples conjecture was inconsistent with the second Hardy-Littlewood conjecture, which is now widely believed to be false.

One can view the Hensley-Richards construction as a “sieve” similar to the sieve of Eratosthenes, formed by starting with the integers in an interval and striking out the residue class for all primes up to . (In principle, this sieve could leave a few extra survivors, namely powers of primes larger than , together with their negations, though in practice such powers fall outside the interval of interest.) This sieve already does most of the “work” of establishing admissibility, as it handles all the small moduli, leaving only the larger moduli to be checked computationally. It was observed by Schinzel that one could do very slightly better than this by sieving out the odd integers rather than the even integers , leaving behind numbers of the form for various primes ; this is conjecturally believed to give the bound

By similarly sieving out for other small primes , it is in fact conjectured that

(see this paper of Gordon and Rodemich for a heuristic justification) although for the values of that we have worked with, these asymptotically superior constructions are actually numerically inferior to what one can get just from the first version of the Schinzel sieve. As before, some slight further gains can be obtained by shifting the interval that one is sieving over.

More recently, we have worked with tuples constructed by initially performing part of the sieving from a Schinzel construction, but at some point switching to a “greedy” algorithm in which one deletes, for each prime, the residue class that removes the fewest elements from the tuple. The next technical innovation was a “merging” approach in which one took two reasonably good tuples already constructed, merged them together by taking their union, and then greedily pruned away residue classes to obtain a merged tuple that is hopefully superior to either of its two “parents”. This procedure can be iterated, and also combined with local optimisations in which the sieving interval is shifted or a small number of elements are otherwise added and deleted. This has produced many of the best constructions so far, typically about 5% narrower than what previous constructions such as the Schinzel sieve construction gave. (For small values of , say , the older work of Engelsma on finding narrow tuples, which used slightly different methods, is often still the record-holder.)
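
The greedy stage is simple to sketch; the following is a toy illustration of the idea only (the actual Polymath8 constructions start from a Schinzel-type sieve, then merge tuples and apply local optimisations):

```python
def small_primes(limit):
    """Primes up to limit by trial division (adequate for a toy example)."""
    ps = []
    for n in range(2, limit + 1):
        if all(n % p for p in ps):
            ps.append(n)
    return ps

def greedy_tuple(k, width):
    """Sieve {0, ..., width-1}: for each prime p <= k in turn, delete the
    residue class mod p that removes the fewest surviving elements.  The
    survivors avoid a class mod every prime p <= k, so any k of them form an
    admissible k-tuple (primes p > k are handled automatically)."""
    survivors = list(range(width))
    for p in small_primes(k):
        counts = [sum(1 for n in survivors if n % p == r) for r in range(p)]
        r_min = counts.index(min(counts))
        survivors = [n for n in survivors if n % p != r_min]
    return survivors[:k] if len(survivors) >= k else None

tup = greedy_tuple(10, 120)
print(tup, tup[-1] - tup[0])   # an admissible 10-tuple and its diameter
```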

Very recently, a web page has been set up to record the best known tuples of cardinality up to (including the older data of Engelsma), and to accept submissions for new “record holders”. This will help with further improvements to , but the database of narrow tuples is likely to have other uses in effective prime number theory as well.

Numerical regression suggests that is roughly for small or medium values of . There is no theoretical justification for this bound; the best lower bounds we have on asymptotically come from the Brun-Titchmarsh theorem and its refinements (and in particular various versions of the large sieve), and take the form

It looks very difficult to improve upon this loss of (this problem may possibly be related to the parity problem in sieve theory, although the precise connection is unclear).

Further information on this part of the Polymath8 project may be found at this page.

** — 2. Sieving and Dickson-Hardy-Littlewood type results — **

Now we informally describe how to obtain results of the form . Zhang’s approach (and thus ours) follows the basic strategy given by the breakthrough paper of Goldston, Pintz, and Yildirim establishing small (but not bounded) gaps between primes.

Let be an admissible -tuple. The basic difficulty in trying to locate translates that contain two or more primes is that the primes get sparser and sparser when one goes out to infinity, so if one chooses to be an integer at random from (say) the interval , one would expect most such tuples to in fact contain *no* primes.

The trick of Goldston, Pintz, and Yildirim is not to pick *uniformly* from , but instead to pick in a *weighted* fashion that significantly increases the probability of each member of the translated tuple being prime. To oversimplify, suppose we could construct a moderately large subset of with the property that for each , we have that is prime for more than of the elements of . Then just from the pigeonhole principle, one could conclude the existence of at least one such that is prime for at least two choices of , which give .
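
The double counting behind this pigeonhole argument fits in one line; writing the admissible tuple as $(h_1,\dots,h_{k_0})$ and $A$ for the set of integers $n$ under consideration (symbols introduced here just for the display):

```latex
\sum_{n \in A} \#\{ 1 \le i \le k_0 : n+h_i \text{ prime} \}
= \sum_{i=1}^{k_0} \#\{ n \in A : n+h_i \text{ prime} \}
> k_0 \cdot \frac{|A|}{k_0} = |A| ,
```

so some $n \in A$ must contribute at least two to the left-hand side, i.e. at least two of the $n+h_i$ are simultaneously prime.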

Of course, one has to choose a set for which one can prove the desired claim, that each is prime for more than of the elements of . Intuitively, the strategy of Goldston, Pintz, and Yildirim is to let consist of those for which the quantity is *almost prime* – that is, it only has a relatively small number of prime factors (not much more than ).

Actually, it turns out that it is best not to try to use a set for the above strategy, but rather a non-negative weight function on which is concentrated on those for which is almost prime. One then seeks to maximise the ratio

for each ; in particular, if one can get these ratios all above , one has proven .

One then needs to construct a non-negative weight for which the ratio (1) is (a) computable, and (b) as large as possible. For this one uses a version of the *Selberg sieve*, choosing of the form

for some weights to be optimised later. In order to be able to compute both the numerator and denominator of (1), one needs to limit the sieve weights to be supported on the range for some *sieve level* , which one would like to be as large as possible to give maximum flexibility to the sieve. The denominator of (1) is relatively easy to compute (as long as does not exceed ), but the numerator is more difficult. Expanding out the formula for , the numerator becomes

As is standard in analytic number theory, it is slightly more convenient to replace this sum by the closely related sum

where is the von Mangoldt function, which serves as a useful proxy for the prime numbers.
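
For orientation, the weight in the Goldston-Pintz-Yildirim method is typically taken to be of the following Selberg-sieve shape (this is the standard GPY form, recorded here for reference; the truncated variant used by Zhang and in Polymath8 modifies the cutoff in technical ways):

```latex
\nu(n) \;=\; \Bigl( \sum_{\substack{d \mid P(n)\\ d \le R}} \lambda_d \Bigr)^{2},
\qquad
P(n) \;=\; \prod_{i=1}^{k_0} (n+h_i),
```

with real weights $\lambda_d$ supported on $d \le R$ (the sieve level) that are chosen to maximise the ratio (1); non-negativity of the weight is immediate from the square.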

The summation is restricted to a fairly small number of residue classes modulo , which is a modulus of size at most . To control the numerator of (1), one thus needs some sort of *distribution result* for the primes (or the von Mangoldt function) in arithmetic progressions of modulus at least as large as . If one had to get good distribution for *all* moduli up to this level, this would be a task comparable to the generalised Riemann hypothesis, and thus hopeless; but fortunately we only need such distribution results *on average* for such moduli . For this, standard tools such as the Bombieri-Vinogradov theorem are available, at least for up to about or so. This was already strong enough for Goldston, Pintz, and Yildirim to obtain reasonably narrow prime gaps (in which , for instance), but to get bounded prime gaps it turns out to be necessary to obtain distribution results up to a slightly higher exponent for some . The standard distribution conjecture in this direction is the Elliott-Halberstam conjecture

for any fixed , where is a primitive residue class modulo for each , and is the Euler totient function. This estimate is conjectured to hold for arbitrarily close to , and Goldston, Pintz and Yildirim showed that the assertion (2) for even a small was sufficient to imply for some (the condition they obtained was asymptotically of the form , and if one could get sufficiently close to , one could in fact obtain ). Unfortunately, the Elliott-Halberstam conjecture (2) remains open for any choice of .
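
For reference, the standard way of writing an Elliott-Halberstam type estimate such as (2) is (with $\Lambda$ the von Mangoldt function and $\phi$ the Euler totient function; the application above only needs a single, specially chosen residue class for each modulus rather than the maximum taken here):

```latex
\sum_{q \le x^{\theta}} \; \max_{(a,q)=1} \;
\Bigl| \sum_{\substack{n \le x \\ n \equiv a \ (\mathrm{mod}\ q)}} \Lambda(n)
 \;-\; \frac{x}{\phi(q)} \Bigr|
\;\ll\; \frac{x}{\log^{A} x}
\qquad \text{for every fixed } A > 0 .
```

The Elliott-Halberstam conjecture asserts this for every fixed $\theta < 1$, while the Bombieri-Vinogradov theorem establishes it unconditionally for every $\theta < 1/2$.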

Zhang made a number of observations to get around this obstacle. First, he observed that the residue classes needed for the application to were not completely arbitrary, but were linked to each other, basically obeying a Chinese remainder theorem

for any coprime . (These congruences basically come from the roots of the polynomial modulo .) Secondly, he observed that by suitably truncating the Selberg sieve, one did not need to establish (2) for *all* moduli , but only those moduli which were squarefree and *smooth*, in the sense that all prime divisors of were at most for some small . (This observation had also been made previously by Motohashi and Pintz.) One had the freedom to select as one wished, although if one set to be too small, the value of one obtained would increase.

One can then formulate a weakened version of (2) which we have called (the MPZ standing for Motohashi, Pintz, and Zhang), in which the are constrained to obey the Chinese remainder theorem and the are constrained to be square-free and smooth. Zhang was able to establish for , which he could then use to establish for (note that is roughly of the order of as mentioned previously).

We have improved Zhang’s argument in a number of ways. Firstly, we have made the sieving argument that deduces from results of the form (2) more efficient by optimising a certain cutoff function involved in the construction of the Selberg sieve; the optimal cutoff turns out to be given in terms of a certain Bessel function, and improves the relationship to ; see this post for details. Also, a modification of the truncated Selberg sieve due to Pintz allows one to take to be extremely small with almost no impact on (but at a certain technical cost, namely that the property of being smooth has to be weakened to another property which we have called “dense divisibility”, replacing the conjecture by a variant that we have called ); see this post for details. Finally, by optimising the remaining steps in Zhang’s argument (see below), we have managed to increase significantly, with the currently largest value of that we can use being close to (and provisionally to ). The combination of these improvements has allowed us to take as low as (and provisionally to ).

At this stage we believe our arguments converting (or variants thereof) to are about as efficient as can be hoped for given the methods we have, and that the most significant further improvements will be coming from raising beyond the currently available range.

** — 3. The Heath-Brown identity and combinatorics — **

Now we discuss how to establish estimates such as (2) in the case of smooth moduli, and in the case of congruences obeying a Chinese remainder theorem. Prior to Zhang’s paper, there were a number of results of this type established in a series of papers by (various combinations of) Bombieri, Fouvry, Friedlander, and Iwaniec, as well as some followup work by later authors, although the prior results were not directly applicable for the current problem (either because the function was replaced by a simpler arithmetic function, or the absolute values in (2) were replaced instead with a “well-factorable” weight, or the congruence classes involved were more specialised, e.g. of the form where was bounded and independent of ). However, Zhang was able to adapt and improve the analysis in the previous literature in a number of ways, to the point where he could in fact obtain an estimate of the required strength.

We won’t describe the entire proof here, but give some of the important ideas in these arguments. The first observation is that one can decompose the von Mangoldt function as a combination of simpler functions, such as the order divisor functions

the modified order divisor functions

or more generally certain combinations

of “nice” arithmetic functions (we will be vague about what “nice” means here), where is the usual Dirichlet convolution. The point is that functions with this convolution structure are easier to sum and estimate. For instance, the basic estimate

is of course the prime number theorem, which is already non-trivial, and improving the error term here to for some fixed is a major open problem (and would be a huge breakthrough towards the Riemann hypothesis). In contrast, the divisor function

is much easier to sum, and one can easily obtain the estimate

by the elementary Dirichlet hyperbola method, which ultimately relies on the Dirichlet convolution structure of . Similarly for the higher order divisor functions and their variants , although the error terms become harder to estimate well when increases.
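
Recall that the Dirichlet convolution of two arithmetic functions is $(f * g)(n) = \sum_{d \mid n} f(d)\, g(n/d)$, so that $\tau = 1 * 1$. The hyperbola method yields the classical estimate $\sum_{n \le x} \tau(n) = x \log x + (2\gamma - 1) x + O(\sqrt{x})$, which is easy to check numerically; the following toy computation (not part of the argument) illustrates the square-root error term:

```python
import math

def divisor_sum(x):
    """Sum of tau(n) over n <= x via the Dirichlet hyperbola method:
    counting lattice points (a, b) with a*b <= x gives
    2 * sum_{a <= sqrt(x)} floor(x/a) - floor(sqrt(x))^2."""
    s = math.isqrt(x)
    return 2 * sum(x // a for a in range(1, s + 1)) - s * s

x = 10**6
gamma = 0.5772156649015329                    # Euler-Mascheroni constant
main_term = x * math.log(x) + (2 * gamma - 1) * x
print(divisor_sum(x), round(main_term), divisor_sum(x) - round(main_term))
```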

Why would we expect to be expressible in terms of functions such as or ? To illustrate this sort of decomposition, let us make the major cheat that all numbers are square-free and have at most (say) four prime factors. (This is clearly quite a cheat; but note that a positive fraction of integers are square-free, and if we restrict attention to *large* prime factors, say factors larger than , then we do indeed have at most four such prime factors, so the cheat actually is not so terrible in some ways.) Then the quantity is equal to when has one prime factor and zero when has two, three, or four prime factors. Meanwhile, is equal to when has one, two, three, or four prime factors respectively. Similarly is equal to and is equal to . Thus we see in all cases that

Note that while a little computation is required to verify this identity, it is immediate from linear algebra that *some* identity of this type must exist, at least for square-free numbers with a bounded number of factors.

More generally, we have *Linnik’s identity*

which is valid for all natural numbers ; this identity can be proven by starting with the formal identity

for the Riemann zeta function , differentiating this identity, then comparing Dirichlet series coefficients. (It is also a fun exercise to find a purely combinatorial proof of this identity.)
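
One common way of writing Linnik's identity (the normalisation used here may differ slightly) is, with $\tau_j'(n)$ the number of ways of writing $n$ as an ordered product of $j$ factors each at least $2$:

```latex
\frac{\Lambda(n)}{\log n}
\;=\; \sum_{j \ge 1} \frac{(-1)^{j-1}}{j} \, \tau_j'(n)
\qquad (n \ge 2).
```

This follows by expanding $\log \zeta(s) = \log\bigl(1 + \sum_{n \ge 2} n^{-s}\bigr)$ as a power series in $\sum_{n \ge 2} n^{-s}$ and comparing Dirichlet coefficients with $\log \zeta(s) = \sum_{n \ge 2} \frac{\Lambda(n)}{\log n} n^{-s}$; for each fixed $n$ only the terms with $j \le \log_2 n$ are nonzero, which is why the identity is usable despite the infinite sum.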

Morally speaking, Linnik’s identity reduces the task of proving (2) (or variants thereof) to the task of proving

for . However, this implication is not quite rigorous, because the parameter on the right-hand side is unbounded. There is however a clever truncation of Linnik’s identity due to Heath-Brown, which only has finitely many terms, each one of which resembles a divisor function , albeit one which is averaged by an additional Dirichlet convolution. The precise identity is a little complicated (although not difficult to prove); see this post for details. Using this identity, we can indeed reduce (2) to slightly more complicated versions of the identities (5). By decomposing the versions of (5) obtained in this fashion into dyadic pieces (actually, one uses a decomposition that is slightly finer than the dyadic decomposition, but never mind this technical detail), it turns out that one can organise the resulting estimates to be proven into three classes, which Zhang calls “Type I estimates”, “Type II estimates”, and “Type III estimates”. The precise formulation of such estimates depends on the two parameters alluded to earlier, as well as an additional parameter demarcating the boundary between the Type I and Type III estimates. (The lower bound is needed to effect the combinatorial partition into Type I, Type II, Type III sums; it would be nice to remove this bound, but this seems to necessitate the introduction of further types of estimates (“Type IV” and “Type V”) which we have so far been unable to deal with effectively.) A precise definition of these estimates may be found at this page, but let us for now give an oversimplified caricature of what these estimates look like. A model case of a Type I estimate takes the form

where are two sequences localised to scales and respectively, where one should think of and as being like and respectively; the functions also obey certain bounds similar to those enjoyed by the divisor functions (e.g. their average value should be ) which we do not give in detail here. The modulus should be understood to be squarefree and smooth as in the statement of (or if one is trying to prove instead, “smooth” needs to be replaced by “densely divisible”). The Type II sums are similar, except that one should now think of and as both being like . A model case of a Type III estimate takes the form

where , and again should be understood to be square-free and smooth. (This is an oversimplification; the true Type III estimate involves a smoothly truncated, averaged version of , and it is to compensate for this oversimplification that the upper bound for has changed.) The Type I estimates are harder to prove when gets larger, while the Type III estimates become easier. It is thus necessary to work out what estimates one can prove on both the Type I and Type III sides before one can optimise in . The Type II estimates are less important (and in practice, in all the regimes in which we can prove Type I and Type III estimates so far, we have also been able to prove Type II estimates), but still need to be dealt with, of course. Thus far we have not been able to improve upon this portion of Zhang’s arguments much; we rely on the same Heath-Brown identity and combinatorial decomposition that Zhang does, as they seem to be more or less optimally efficient for decompositions of this type, at least for the application at hand. The task of proving (or ) and hence and for various choices of parameters is thus reduced to that of verifying Type I, Type II, and Type III estimates.

** — 4. Type I estimates — **

Now we give a very informal sketch of Zhang’s argument for establishing Type I estimates (details may be found at these two posts). It is based on previous arguments of Bombieri, Friedlander, and Iwaniec, relying on various analytic manipulations such as the use of the Cauchy-Schwarz inequality (and in particular on the *dispersion method* of Linnik that is based on this inequality) and Fourier identities (such as the Pólya-Vinogradov method of completion of sums), in order to eliminate the poorly understood sequences , until one is only left with the task of estimating a certain incomplete Kloosterman-type sum, to which one uses standard tools from analytic number theory (such as the Weil bound for completed Kloosterman sums). We have continued to follow the same general strategy, but have found some improvements based on better bounds for incomplete Kloosterman sums.
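
At its simplest, the completion of sums device is just orthogonality of additive characters: an incomplete sum over a residue class is expanded into complete exponential sums over all frequencies. A standard formulation (recorded here for orientation; the actual argument uses smoothed versions) is

```latex
\sum_{\substack{N < n \le 2N \\ n \equiv a \ (\mathrm{mod}\ q)}} f(n)
\;=\; \frac{1}{q} \sum_{h=0}^{q-1} e\!\Bigl(\frac{-ha}{q}\Bigr)
      \sum_{N < n \le 2N} f(n)\, e\!\Bigl(\frac{hn}{q}\Bigr),
\qquad e(t) := e^{2\pi i t}.
```

When $f$ is smooth at scale $N$, only roughly $1 + q/N$ of the frequencies $h$ contribute significantly (the zero frequency giving the main term), which is what makes the method efficient when $N$ is comparable to or larger than $q$.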

We now give a little more detail. We will ignore the role of the technical parameter , and assume simply that all moduli are as smooth as needed for the argument. We change notation a little bit, replacing and by and , and consider a dyadic component of (8), namely

One now takes advantage of the square-free smooth nature of to factor , where are coprime, and is a little bit smaller than , so that is a little bit larger than ; the reason for this choice of factorisation is that it will turn out that we want to be as large as possible while still staying a bit less than . There is also a technical step in which we shift all the tiny prime factors of to and not to (where “tiny” means something like “less than “); this is needed to ensure that most pairs of ‘s that we encounter become coprime at a crucial juncture later in the argument. See this post for more details of this step. The congruence similarly factors into and by (3), so we now have to prove something like

where for some small and .

Roughly speaking, this estimate is asserting that we can estimate a quantity such as

to accuracy whenever are arbitrary bounded coefficients (where we will be a bit vague on what it means to estimate something to a given accuracy). Splitting as and ignoring the constraint , we may rearrange this expression as

We are unable to take much advantage of the averaging, and will thus seek to estimate

to accuracy for a fixed .

Now we start eliminating some of the poorly controlled expressions . Applying a Cauchy-Schwarz, it (morally) suffices to estimate

to accuracy , i.e. a little bit better than . Note that because is a little bit less than , we have a non-trivial number of obeying the congruence condition for any given . This prevents the summation inside the absolute values from being so trivial that Cauchy-Schwarz could not gain anything.

The above expression can be rearranged as

The constraint forces . Since is so close to , this basically forces to equal (or to equal a shift of for some small ). To simplify the exposition we will just look at the diagonal case (the off-diagonal terms are treated similarly). Our task is now to control

with accuracy better than . Using averaging notation, this is basically the same as controlling

with accuracy better than .

Recall that we had the foresight to remove all the tiny primes from . This has the effect of making typically coprime. We will discard the non-coprime contributions (it turns out that they can be controlled with relatively easy estimates) and insert the condition in the above summation. We can also make coprime to since the summation vanishes otherwise.

The summation is over an interval of length over a single congruence class of modulus . This sum can then be inverted via the Fourier transform to a dual sum over a range ; morally speaking we may replace

where for , and the reciprocal here is taken inside the ring . The zero frequency can be controlled directly; we ignore it and work with non-zero . We are now trying to control

We can eliminate the next by performing a Cauchy-Schwarz in , but it turns out to be a bit more efficient to Cauchy-Schwarz in and , not just in . This places us in the position of having to control

to accuracy better than , where we have suppressed the various coprimality conditions on the for brevity. So it suffices to get bounds on the exponential sum

which beat the trivial bound of by about on average, for “typical” values of . This is basically an incomplete Kloosterman sum of the shape

for some modulus of typical size and a coefficient which turns out to typically be more or less coprime to except in a certain *diagonal case* , but it turns out in the Type I case that this diagonal contribution is manageable. A standard “Pólya-Vinogradov” type bound for such sums, ultimately based on Weil’s famous bound

for completed Kloosterman sums, allows one to bound (11) by (ignoring some logarithmic type errors and lower order terms). This gives us an acceptable estimate if

which may be rearranged as

(Recall that we are treating as negligible, otherwise we have to insert some factors of in this constraint as well.) Crucially, this constraint allows to exceed , and thus gives a good range of Type I estimates if is small enough. Further improvements have been obtained by using more refined estimates on incomplete Kloosterman sums (first discussed here); to date we have managed to (provisionally) replace the above constraint to

which turns out to give superior numerical results. This reveals a tradeoff, in that more advanced exponential sum estimates can improve at the expense of ; we have not yet located what the optimal balance for this tradeoff is.
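
As a sanity check on the Weil input used in this section, here is a brute-force evaluation of a complete Kloosterman sum modulo a prime, compared against the square-root cancellation bound $|S(a,b;p)| \le 2\sqrt{p}$; this is only a toy illustration, since the sums in the actual argument are incomplete and have composite, densely divisible moduli:

```python
import cmath, math

def kloosterman(a, b, p):
    """Complete Kloosterman sum S(a,b;p) = sum over x in (Z/pZ)^* of
    e((a*x + b*x^{-1})/p), where p is prime and e(t) = exp(2*pi*i*t)."""
    return sum(cmath.exp(2j * math.pi * ((a * x + b * pow(x, -1, p)) % p) / p)
               for x in range(1, p))

p = 1009                             # a prime modulus for the test
S = kloosterman(3, 7, p)
print(abs(S), 2 * math.sqrt(p))      # Weil bound: |S(a,b;p)| <= 2*sqrt(p)
```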

** — 5. Type II estimates — **

The Type II estimates are treated quite similarly to the Type I estimates and will only be discussed briefly here. If one repeats the Type I estimates verbatim, most of the same arguments go through (in fact they become better because the terms are not present); however it turns out that the diagonal contribution, which was manageable in the Type I setting, is no longer manageable in the Type II setting. To get around this one has to reduce the influence of the diagonal by taking a narrower Cauchy-Schwarz when eliminating from (10), namely one only performs a Cauchy-Schwarz in rather than . The price one pays for this is that the off-diagonal terms become slightly worse (conceding a few more powers of ); the best estimate we currently have here (again ignoring issues) is

** — 6. Type III estimates — **

The Type III estimates (7) were originally treated by Zhang using the method of Weyl differencing. More recently, a simpler method which gives better numerology was introduced by Fouvry, Kowalski, Michel, and Nelson, and may be sketched as follows. Our goal here is to control something like

to accuracy better than , where are bounded coefficients, and is a quantity larger than (which we would like to take to be as large as possible). Expanding out the function and using dyadic decomposition, this basically amounts to controlling

to accuracy for some that multiply to (e.g. one can take for sake of discussion, although the precise choice of turns out to not be so important). Applying Fourier decomposition as in (9), one can write this expression roughly speaking as

where are the dual scales . It turns out that the contributions when vanish, or more generally share a common factor with , can be dealt with, so we ignore them and assume that are coprime with . In this case, we can write the inner sum after a change of variables as

where is the *hyper-Kloosterman sum*

The normalisation in (12) reflects the expected square root cancellation in this sum, which is a sum of about phases that one expects to behave more or less independently of each other. Indeed, using the deep results of Deligne on the Weil conjectures, one can show that

for coprime to (ignoring some log factors).
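
For a prime modulus $p$, the normalised hyper-Kloosterman sum and Deligne's bound take the following shape (stated here for reference; the version needed in the argument is for squarefree, densely divisible moduli, assembled from the prime case by twisted multiplicativity, which is roughly where the logarithmic-type factors mentioned above come from):

```latex
\mathrm{Kl}_k(a;p)
\;:=\; \frac{1}{p^{(k-1)/2}}
  \sum_{\substack{x_1,\dots,x_k \in \mathbb{F}_p^{\times} \\ x_1 x_2 \cdots x_k = a}}
  e\!\Bigl(\frac{x_1 + x_2 + \cdots + x_k}{p}\Bigr),
\qquad
\bigl|\mathrm{Kl}_k(a;p)\bigr| \;\le\; k
\quad \text{when } p \nmid a .
```

The case $k = 2$ recovers the classical Kloosterman sum and Weil's bound.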

Replacing by (which can be done by adjusting the a little bit), our task is now to control

to accuracy better than . If we apply Deligne’s bound (13) right away, we get a bound of , which only works for , which recovers the Bombieri-Vinogradov theorem for but no better. To improve upon this we perform a Cauchy-Schwarz. We first perform the useful substitution , which rewrites the above sum as something like

where is a truncated version of (counting factorisations with ). Note that is given by , so that . We interchange the order of summation, so we are now asking to bound

to accuracy better than .

At this point it turns out to be optimal to use the smoothness of the modulus to split where , so we wish to control something like

to accuracy better than . If we Cauchy-Schwarz in and , we end up having to bound

to accuracy better than . The diagonal contribution is thanks to (13). Now we turn to the opposite case when are coprime. The goal is then to establish the estimate

for an incomplete Kloosterman correlation with net modulus . It turns out that there is a Pólya-Vinogradov type bound of for this sum (though this requires a non-trivial amount of Deligne’s Weil conjecture machinery to actually prove). So we are done if

which simplifies to , or equivalently . So this argument allows one to obtain (7) for up to , which corresponds to the constraint

which, happily, overlaps enough with the Type I bounds and the combinatorial constraint to give (or ) for relatively good values of (and , which we have ignored in this discussion). We have managed to (provisionally) improve upon this constraint to

by exploiting an additional averaging that is present in the true Type III estimate (and which was suppressed in (7)). Combining this with the best Type I estimate we currently have (and with the combinatorial constraint ) gives us the best value of currently available, namely (note that this is well below the Type II constraint, which may thus be ignored). Actually, the Type III estimates are now so strong that they are no longer critical for further improvements in the numerology; it is either the Type I estimate or the combinatorial constraint that needs improving instead. (Alternatively, if one is willing to use a worse value of , namely , one can use the Type I estimates to raise up to , which closes off the Type III case completely and allows for a slightly more elementary proof of Zhang’s theorem in that the full strength of Deligne’s proof of the Weil conjectures is no longer needed (although Weil’s classical bound on Kloosterman sums is still required).)

## 85 comments



30 June, 2013 at 1:34 pm

bengreen: I haven’t been following every detail of this so far, but if it’s true that the Type I/II estimates cover everything, and hence you avoid the deeper Deligne stuff, then to my tastes that’s the most significant development of the polymath project so far. It means one could in principle teach the whole proof in a graduate course. Furthermore (I haven’t checked this) if you can do Type I for then can’t you in fact use Vaughan’s identity instead of Heath-Brown?

30 June, 2013 at 11:47 pm

v08ltu: I agree this would be quite significant. I think the gain is described here: http://terrytao.wordpress.com/2013/06/23/the-distribution-of-primes-in-densely-divisible-moduli/#comment-236387

Namely we achieve rather than the (dominant) of Lemma 10. My calculations agree that this allows one to take as large as 1/5. Here (up to parts), and the above expressions need to be less than .

1 July, 2013 at 7:28 am

Terence Tao: Just recording the results of a discussion I had with Ben on this. It does look like Vaughan’s identity with , together with a Zhang-Type I estimate that works for , is enough to establish Zhang’s theorem; the Vaughan-Type II sums can be handled by Zhang’s Type I/II analysis, and the Vaughan-Type I sums can be handled by either Zhang Type I/II or Zhang Type 0, depending on the exact scales involved. So one does not need either the Heath-Brown identity or Deligne’s theory to prove Zhang’s theorem, just Weil’s bound on completed exponential sums. Actually Ben pointed out that the more elementary bound of Kloosterman for nonlinear rational functions , when combined with sufficiently many iterations of the q-van der Corput method, gives non-trivial bounds on even very short incomplete Kloosterman sums on smooth moduli that suffice to give Type I/II estimates for arbitrarily close to 1/2, and in particular at least as large as 1/6, which suffices for the purposes of establishing Zhang’s theorem (albeit with somewhat worse values of ).
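
For reference, the form of Vaughan's identity alluded to in this discussion is the standard one (stated with generic cutoff parameters $U, V$, since the specific choices being considered are not reproduced here): writing $f_{\le U}$ for the restriction of an arithmetic function $f$ to integers at most $U$, and $*$ for Dirichlet convolution,

```latex
\Lambda(n) \;=\; (\mu_{\le U} * \log)(n)
\;-\; (\mu_{\le U} * \Lambda_{\le V} * 1)(n)
\;+\; (\mu_{> U} * \Lambda_{> V} * 1)(n)
\qquad \text{for } n > V .
```

The first two terms lead to sums with one short, explicit factor (Type I style), and the third to bilinear (Type II style) sums, which is how the identity feeds into the estimates discussed above.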

1 July, 2013 at 9:39 am

DP: Deligne’s proof of the Weil conjectures, technically, is absolutely “needed” for Dr. Zhang to have his paper accepted by the Annals of Math. quickly. If such an amazingly big result had been announced with only an elementary proof by someone whom basically no one had heard of, the editors wouldn’t even have taken a look at his manuscript.

30 June, 2013 at 2:24 pm

David Roberts: In point 1 near the top the link to the k-tuples page is missing – did you mean for it to be on the word ‘here’?

[Corrected, thanks – T.]

30 June, 2013 at 2:46 pm

Greg: Small typo: two displayed equations before (4), the coefficient of should be instead of .

[Corrected, thanks – T.]

30 June, 2013 at 3:57 pm

Sniffnoy: Also, something seems to have gotten lost here:

“This has produced many of the best constructions so far, typically 5 Very recently,”

[Corrected, thanks – T.]

30 June, 2013 at 4:04 pm

Gergely Harcos: I just posted an improvement (hopefully) to Theorem 8 in the earlier thread (http://terrytao.wordpress.com/2013/06/23/the-distribution-of-primes-in-densely-divisible-moduli/). The point is that by properly averaging over , the constraint (47) can be weakened so that it follows from (45). I have not checked if this leads to any improvement in Theorem 4 of the earlier thread, but eliminating a condition is always a good thing.

30 June, 2013 at 10:50 pm

Terence Tao: Thanks for this! Actually it turns out that this condition could already be eliminated by a later refinement of the incomplete Kloosterman sum estimate in which the factor was improved to (basically by using the older completion of sums method instead of the van der Corput method when N is large; see http://terrytao.wordpress.com/2013/06/22/bounding-short-exponential-sums-on-smooth-moduli-via-weyl-differencing/#comment-236033 ). But this additional refinement of yours pushes this contribution down even further, although it is currently not the dominant constraint, so unfortunately it does not (yet) give any improvement to the numerology. (But as you say, it doesn’t hurt. :-)

1 July, 2013 at 11:31 am

Gergely Harcos: Thank you! I will try to catch up with the latest developments. It is not easy (for me): the blog entries and comments that constitute polymath8 would occupy now about 450 printed pages!

30 June, 2013 at 4:22 pm

Andrew Sutherland: Regarding the second Hardy-Littlewood conjecture, this graph (http://math.mit.edu/~drew/hw2data.gif) may be of interest. It shows lower bounds for k-pi(w) versus w, with w=H(k)+1, using the most current upper bounds on H(k) from the new tuples database (which in just 4 days has processed more than 1000 submissions, improving some 800 H(k) values). The dips in the chart are good targets for future improvement.

30 June, 2013 at 10:46 pm

Terence Tao: Dear Andrew,

It looks like the link is 403’ed currently (permissions are not set properly). EDIT: Never mind, it seems to work now.

30 June, 2013 at 11:16 pm

Andrew Sutherland: The 403 error is an intermittent problem that our IT folks are working on; just hit refresh if it happens again.

30 June, 2013 at 6:17 pm

Qiaochu Yuan: Thanks for the update! This was helpful.

Some formatting seems to have gone wrong around “but only those moduli which were squarefree and http://www.ams.org/mathscinet-getitem?mr=2414788.)…”

[Corrected, thanks – T.]

30 June, 2013 at 6:40 pm

xfxie: I let my computer do a systematic scan to obtain feasible values for different settings. The table can be found here:

http://www.cs.cmu.edu/~xfxie/project/admissible/k0table.html

A basic observation is that the coefficient for seems to have quite a minor effect on determining values. So I assume that testing a single coefficient (or a few more) might be sufficient to gain some knowledge of the possible values. I did test some rows to confirm this.

All of this data could be generated by an old MacBook (which would otherwise be idle forever) in less than one day, so there is no difficulty in generating a more complete table if it would be useful to anyone. I assume many of the values can still be improved. I am also not quite sure about the correctness of the settings of the parameter ranges.

1 July, 2013 at 6:44 pm

xfxie: Just cleaned up the code and put a download link on the webpage above.

1 July, 2013 at 12:19 am

Shizhuo Looi: Thank you for this summary and update; it was helpful. Just a very minor point: when you summarize Zhang’s argument, there seems to be a missing parenthesis at the end of point 1.

[Corrected, thanks – T.]

1 July, 2013 at 1:28 am

Grégoire Nicollier: In Section 3, a prime is missing in the definition of the modified order divisor functions.

[Corrected, thanks – T.]

1 July, 2013 at 3:31 am

Michael: The direction of the sign in the third-to-last display is the opposite of that in the following line.

[Corrected, thanks – T.]

1 July, 2013 at 4:55 am

pigh3: In point 3 of the introduction, TypeIII[\varpi, \delta] (the last mention of TypeIII) should probably be TypeIII[\varpi, \delta, \sigma].

[Corrected, thanks – T.]

1 July, 2013 at 9:20 am

Pace Nielsen: General questions for the polymath participants:

While I’m comfortable with convolutions, I’m not as familiar with the methods which require smoothness of one of the convolved functions. In particular, I’ve enjoyed learning about “completion of sums,” although it is still somewhat of a black box for me in some of the discussions.

I noticed that in the (more detailed) Type I analysis, we replace (after some Cauchy-Schwarzing) by a smooth sequence, in order to take advantage of completion of sums. On the other hand, in the Type III analysis, smoothness is used (as far as I can tell) to do completion of sums twice, and to give a Taylor series once. I was wondering: Can a similar trick as in the Type I case be used to allow non-smoothness in one or more of the completion of sums in the Type III case?

Also, if are smooth, how “far away” is from being smooth?

1 July, 2013 at 10:45 pm

Terence Tao: Dear Pace,

I can answer your second question first: because Dirichlet convolution is based on the multiplicative structure of the integers rather than the additive structure, the convolution of two smooth functions is unfortunately not very smooth in general. For instance the divisor function is quite rough, jumping down to 2 at the primes but typically much larger (the average value of is about ).

If one replaced a smooth sequence by a rough sequence in the Type III sums, one could eliminate it by Cauchy-Schwarz, but each use of Cauchy-Schwarz tends to halve any gains one gets later on in the argument (e.g. from the Weil conjectures), or equivalently doubles the amount of gain one has to generate in order to make up for losses (e.g. from completion of sums). Conversely, one might be able to improve the Type I sum numerology if was smooth as we could avoid one application of Cauchy-Schwarz; this may potentially deal with some “Type IV” sums (of the form where the are smooth and supported near respectively) but not unfortunately for “Type V” sums (of the form where the are all smooth and supported near ) where looks a bit like and thus not very smooth. (There are some fancy techniques coming from Eisenstein series that I don’t understand that could potentially start controlling exponential sums such as when looks like a divisor function , but the gains are very small and also rather complicated, and may end up being inferior to the current Cauchy-Schwarz methods.)

2 July, 2013 at 6:14 am

Pace Nielsen: Dear Terry,

Thanks for explaining this so clearly. The reason I asked is that, at present, it isn’t the Type III sums which are the roadblock, and so we can be a little more lossy if we are also able to deal with the combinatorial problems. And the Type IV, V,… sums are incorporated into Type III if among , both and one of the other two sequences can be assumed non-smooth.

My current project is to understand these sums well enough to work out how this change affects the numerology, and whether the losses incurred by two Cauchy-Schwarzes are offset by the ability to lower to 0.


1 July, 2013 at 5:25 pm

Avi Levy: The first two paragraphs below the fold (right before Conjecture 2) have a couple typos regarding and being flipped at times. Perhaps just stick to $k_0$ to avoid confusion?

[Corrected, thanks – T.]

1 July, 2013 at 9:59 pm

Terence Tao: I’m recording the outcome of the idea (inspired by a comment of Philippe Michel) to readjust the Cauchy-Schwarz in the Type I sums (something that I’ve called “Level 6″ on the wiki, skipping Level 5 for now which was more difficult to implement than I had hoped). This time around it does seem to improve the final value for , in contrast to the Level 4 estimates which actually made things worse. (Although Level 4 (if iterated) does have the effect of showing that Type I estimates hold for arbitrarily close to 1/2 if one is willing to set sufficiently close to zero. This observation might be useful for some other application than bounded gaps between primes, but doesn’t seem to be directly useful for our current project.)

Anyway, to Level 6. We will need some sort of “double dense divisibility”, but I will punt on this as before by first working in the case when the modulus is -smooth, which gives us all the dense divisibility we need (at the cost of making the numerology to get from to a bit worse, by forcing and to be equal).

If we go through the Type I argument from http://terrytao.wordpress.com/2013/06/12/estimation-of-the-type-i-and-type-ii-sums/ , we reach a point (see equation (39) from that post) where we have to prove the estimate

(*)

where the are bounded coefficients, and

Previously we have dealt with (*) in the Type I case by performing Cauchy-Schwarz in the variables. Now we rebalance this by factoring (which is -smooth by construction) as where , for some , and we can force

for some to be chosen later. We now need to establish

.

If we perform Cauchy-Schwarz now in , we reduce to

.

We first control the diagonal case , which is a little stronger than it was in previous arguments. By the divisor bound, there are possible values of that contribute to this diagonal case, and for each such contribution we use the trivial bound of for the summation. (It just occurred to me on writing this that one could perhaps do a bit better than the trivial bound here, but then I realised that the subcomponent of the diagonal has no such cancellation and is already a significant fraction of the diagonal, so it seems there is no gain available). The diagonal contribution is thus acceptable if

thus

We thus set . (This requires us to have , but this turns out to follow from the Type I condition .)

Now for the off-diagonal terms. On using the rebalanced van der Corput estimate (see http://terrytao.wordpress.com/2013/06/23/the-distribution-of-primes-in-densely-divisible-moduli/#comment-236387 ) we have

so the total off-diagonal contribution is

and we win if we have both

and

We can bound

so the conditions become

and

The worst case arises when

in which case we have

and the conditions become

and

which simplify to

and

The second condition is implied by the first, so we have obtained a Type I estimate whenever

.

Playing this against (the Type II and Type III constraints can be checked to be non-dominant here) gives

which is a little bit better than before in the aspect, but worse in the aspect, particularly since we had to use -smoothness rather than dense divisibility and so will have to take when converting to . However this latter defect is presumably temporary once we figure out what “double dense divisibility” means. (Though the complexity of this argument is beginning to raise the question of whether the final paper should focus on the messiest but strongest results, or instead focus on a slightly less strong but cleaner result. I guess it’s still a bit premature to discuss these things though.)

1 July, 2013 at 10:32 pm

Terence Tao: Incidentally, I’m coming around to the opinion that absent a radically different way to do the Type I estimates, we’re almost at the end of the road for easy improvements here (and thus probably for the Polymath8 project as a whole). In order to get to the point where we are estimating an exponential sum such as

, (*)

one has to perform one completion of sums (to deal with the congruence conditions) and two Cauchy-Schwarz’s (to eliminate the coefficients). The completion of sums costs a factor of , which is about , and this is amplified to by the Cauchy-Schwarz. So one has to gain a factor of about over the trivial bound in (*). Even in the best-case scenario (Hooley conjecture) that one has square root cancellation in (*), we still need ; since can be as low as and has to be at least 1/10, this forces . Now we are a long way from actually reaching the Hooley conjecture, even for smooth moduli, which is why our values of are more like 1/108 or 3/280; it is conceivable that some increasingly heroic efforts could shave a bit more off of the exponential sum estimates (possibly by exploiting the additional averaging in h, r parameters that are available, although my initial efforts in this direction have been rather fruitless) but I think the era of dramatic improvements in the numerology is basically over, I’m afraid, absent a radical new way to perform these estimates (or else a new way to perform Type IV and Type V estimates, which would let us get past the annoying barrier).

2 July, 2013 at 2:48 am

Eytan Paldi: 1. Perhaps the combinatorial lemma should be somewhat modified (as well as the definitions of the types) to improve the estimate of the “modified Type I”?

2 July, 2013 at 4:52 pm

Brian Rothbach: I think you can gain a little bit by replacing the in the combinatorial lemma with two parameters: instead of using Type II estimates on the interval , use them on the interval . The constraint on sigma becomes (together with the \sigma’s being positive), by the exact same argument as before, but now you can optimize the two parameters separately to improve the parameters for Type I estimates.

2 July, 2013 at 5:36 pm

Pace Nielsen: Suppose we are looking at the situation where we have two numbers whose sum is 1 (which is what happens both in the Type I and Type II cases). Thus, one number is and the other number is . Hence, the extra generality of an uneven interval is no help in Type I/II situations. If you optimize on one end, you must have the same on the other side.

You could possibly introduce another type of summation where the boundaries are uneven. If you want to do so, consider the following main case when there are five equal numbers . There are only so many ways to combine them all. Combining them into two numbers either gives or or . The last doesn’t work, for a number of reasons (lack of a Siegel-Walfisz theorem, lack of picking divisors in appropriate intervals, etc…). The first one is better than the second, and so is the best option, combinatorially, when combining into two numbers.

Now you can work out what happens when trying to combine them into three numbers, etc…

2 July, 2013 at 7:27 pm

Brian Rothbach
Ah, I see. I was misinterpreting which of the Type estimates correspond to cases in the combinatorial lemma, and the estimates that are causing difficulties are exactly the ones where the unequal intervals don’t help.

2 July, 2013 at 5:58 pm

Pace Nielsen
Brian,

By the way, I proposed something like that: a new Type consisting of a convolution of three sequences, the biggest one nonsmooth and the smaller ones both smooth. In terms of the case I gave above, this would correspond to the decomposition . The analysis would probably follow the same path as for the Type I sums, except we would replace everywhere by a convolution of two smooth sequences (coming from the Heath-Brown decomposition). The trouble would then be to decouple that convolution in a reasonable manner (which unfortunately seems unlikely).

2 July, 2013 at 7:54 am

xfxie
Initial calculation got k0 = 722 for (280/3,80/3), for = 3/280 – 2.9850449068186893*10^{-5}, = 0.03373456067881111, and = 618.7045821541182.

Hopefully it does not violate some constraints.

2 July, 2013 at 8:00 am

Terence Tao
This will probably be close to the truth once we figure out how to relax the condition of -smoothness we are currently using (probably by developing some sort of notion of “double dense divisibility”), but right now we are forced to take in order to get the smoothness (which makes kappa_3 vanish and renders A irrelevant), which will probably make k_0 increase a little bit (but it should still be better than what we had before).

2 July, 2013 at 9:08 am

Hannes
For I think works. This is a for which Engelsma’s bound has been improved.

Tao: In a previous post you write instead of .

[Corrected, thanks – T.]

2 July, 2013 at 10:40 pm

Terence Tao
Thanks for this! This illustrates the fair amount of gain we could extract here: if we could relax back from -smoothness to something more like dense divisibility, we would get something much closer to xfxie’s computed value of 720. I’ll try to see if the arguments for the Type I sums can be recast in terms of something like dense divisibility.

2 July, 2013 at 10:38 am

xfxie
The Maple file is here: http://www.cs.cmu.edu/~xfxie/sol_872.mpl

The script is based on the Maple script by Terence — just added a new constraint .

This file gives k0 = 872 for (280/3,80/3), as , = 927.8564427404251.

Hopefully this one will be slightly more correct.

2 July, 2013 at 12:10 pm

Hannes
Unfortunately I think is not satisfied in this case.

2 July, 2013 at 12:43 pm

xfxie
@Hannes, thanks for pointing out the error. Now I have a better sense of how to set the parameters.

2 July, 2013 at 8:17 am

xfxie
A refined result is k0 = 720 for (280/3,80/3), for = 3/280 – 1.1591660703532822*10^{-5}, = 0.006805137872659862, and = 369.2931543045393.

Seems A is quite flexible, A=369 also works.

2 July, 2013 at 7:30 pm

The quest for narrow admissible tuples | Secret Blogging Seminar
[…] guest post summarizing the current state of affairs. This task is made easier by Tao’s recent progress report on the polymath project to sharpen Zhang’s result on bounded gaps between primes. If you […]

3 July, 2013 at 7:55 am

Terence Tao
I’m recording here some notes on double dense divisibility that almost allow us to recover the near-optimal numerology of the Pintz sieve (and in particular to not have to force ).

Recall that a number $n$ is said to be $y$-densely divisible if, for every $1 \le R \le n$, one can find a factor of $n$ in the range $[R/y, R]$. Let us say that $n$ is doubly $y$-densely divisible if, for every $1 \le R \le n$, one can find a factor of $n$ in $[R/y, R]$ which is itself $y$-densely divisible. Clearly double $y$-dense divisibility implies $y$-dense divisibility; also, being $y$-smooth implies double $y$-dense divisibility, because $y$-smooth numbers are $y$-densely divisible, and any factor of a $y$-smooth number is again $y$-densely divisible.

Note though that double dense divisibility is stronger than single dense divisibility. For instance, if are primes of size about respectively, then the product is -densely divisible, but not doubly -densely divisible.
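As a quick sanity check on these definitions, here is a small illustrative Maple sketch (not part of the original notes; the procedure names ddiv and ddiv2 are made up). It uses the observation that $n$ is $y$-densely divisible precisely when consecutive divisors of $n$ are within a factor of $y$ of each other:

ddiv := proc(n, y)
  local ds, i;
  ds := sort(convert(numtheory[divisors](n), list));
  for i from 1 to nops(ds)-1 do
    if ds[i+1] > y*ds[i] then return false; end if;
  end do;
  return true;
end proc:

ddiv2 := proc(n, y)
  local good, i;
  # keep only the divisors of n that are themselves y-densely divisible;
  # every R in [1,n] must then be covered by an interval [m, y*m] with m in this list
  good := select(d -> ddiv(d, y), sort(convert(numtheory[divisors](n), list)));
  for i from 1 to nops(good)-1 do
    if good[i+1] > y*good[i] then return false; end if;
  end do;
  return evalb(y*good[nops(good)] >= n);
end proc:

ddiv(7*59, 10), ddiv2(7*59, 10);   # true, false: densely divisible but not doubly so
ddiv(210, 7), ddiv2(210, 7);       # true, true: 210 is 7-smooth, hence doubly 7-densely divisible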

In http://terrytao.wordpress.com/2013/06/18/a-truncated-elementary-selberg-sieve-of-pintz/ it was shown that if and was a squarefree -smooth number with

then was -densely divisible. Indeed, to get a factor of in the interval , one multiplies together all the big () factors of until just before one exceeds (or one runs out of big factors), and then multiplies in small factors until one enters the interval .
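The same greedy idea, in the simplest setting of a $y$-smooth modulus $n$, can be carried out in a few lines; here is a minimal Maple sketch (again only an illustration, with the made-up name smooth_factor_in) that, given a $y$-smooth $n$ and $1 \le R \le n$, produces a factor of $n$ in $(R/y, R]$:

smooth_factor_in := proc(n, R)
  local m, pe, i;
  m := 1;
  for pe in ifactors(n)[2] do        # pe = [prime, exponent]
    for i from 1 to pe[2] do
      # multiply in another copy of the prime as long as we stay at most R;
      # whenever a prime p <= y has to be skipped, the current product already exceeds R/y
      if m*pe[1] <= R then m := m*pe[1]; end if;
    end do;
  end do;
  return m;
end proc:

smooth_factor_in(210, 100);   # returns 30, a divisor of 210 in (100/7, 100]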

An iteration of this argument shows that if we instead assume the stronger condition

then is doubly -densely divisible. Indeed, if one runs the previous argument to find a factor of in , one will find that there are still enough small prime factors left over in to ensure that is -densely divisible by the previous argument (indeed, will either be -smooth, or else be such that ). Replacing with and with , we then obtain the claim.

Let be the same conjecture as defined in Conjecture 2 of http://terrytao.wordpress.com/2013/06/18/a-truncated-elementary-selberg-sieve-of-pintz/ , but with dense divisibility replaced by double dense divisibility. We can then repeat the proof of Theorem 5 of that post with only minor changes and obtain that implies , assuming the hypotheses stated in that theorem are obeyed, except that the quantity in the definition of is replaced instead with . (Amusingly, this cancels out an improvement of Gergely Harcos’s made in the comments to that post, in which he had inserted the factor of in the theorem.) Thus the Maple verification code for double dense divisibility now takes the following form:

# parameters to be verified (fill in candidate values)
k0 := [INSERT VALUE HERE];
varpi := [INSERT VALUE HERE];
delta := [INSERT VALUE HERE];
deltap := [INSERT VALUE HERE];
A := [INSERT VALUE HERE];
# normalise the parameters relative to the sieve level 1/4 + varpi
theta := deltap / (1/4 + varpi);
thetat := ((deltap - delta) + varpi) / (1/4 + varpi);
deltat := delta / (1/4 + varpi);
# j is the first positive zero of the Bessel function J_{k0-2}
j := BesselJZeros(k0-2,1);
# eps is the gain that the total error eps2 = 2*(kappa1+kappa2+kappa3) has to stay below
eps := 1 - j^2 / (k0 * (k0-1) * (1+4*varpi));
# kappa1, kappa2, kappa3 are the error terms coming from the truncated sieve
kappa1 := int( (1-t)^((k0-1)/2)/t, t = theta..1, numeric);
kappa2 := (k0-1) * int( (1-t)^(k0-1)/t, t=theta..1, numeric);
alpha := j^2 / (4 * (k0-1));
e := exp( A + (k0-1) * int( exp(-(A+2*alpha)*t)/t, t=deltat..theta, numeric ) );
gd := (j^2/2) * BesselJ(k0-3,j)^2;
tn := sqrt(thetat)*j;
gn := (tn^2/2) * (BesselJ(k0-2,tn)^2 - BesselJ(k0-3,tn)*BesselJ(k0-1,tn));
kappa3 := (gn/gd) * e;
eps2 := 2*(kappa1+kappa2+kappa3);
# we win if eps2 < eps

Next, I claim that the MPZ estimate established in the previous comment http://terrytao.wordpress.com/2013/06/30/bounded-gaps-between-primes-polymath8-a-progress-report/#comment-237087 for under the assumption of -smoothness of the original moduli, also works under the weaker hypothesis of double -dense divisibility. This was already known for the Type II and Type III estimates (which only need single dense divisibility) so it's just a matter of checking that things are OK for the Type I estimate. But by construction, the quantities in the Type I argument are doubly -densely divisible, so one can arrange matters so that are singly -densely divisible while still staying in the same intervals as before. (There is also the point that one has to remove all the tiny prime factors of , which can affect dense divisibility slightly, but the net effect here is and can be neglected.) Thus we can obtain the factorisation needed in the above comment without difficulty.

So, one just has to optimise for in the above code subject to the condition , which should bring us very close to xfxie's calculation of 720 or so.

3 July, 2013 at 9:07 am

Gergely Harcos
Very interesting! For clarity I would change “make -densely divisible” to “ensure that is -densely divisible, namely either is -smooth or its -smooth part exceeds “. I know you are very busy, but it would be nice and useful to have an updated version of the earlier treatment (http://terrytao.wordpress.com/2013/06/23/the-distribution-of-primes-in-densely-divisible-moduli/), because all the improvements are now scattered in the comments. Perhaps it could be a post entitled “The distribution of primes in doubly densely divisible moduli”?

[Corrected, thanks. I’ll probably write up a post of this sort this weekend. -T]

3 July, 2013 at 9:14 am

xfxie
Great. It seems possible to get k0 = 722 under the new condition. Here is the Maple code for validation:

http://www.cs.cmu.edu/~xfxie/sol_722.mpl

Hopefully this time the solution is correct :).

3 July, 2013 at 9:28 am

xfxie
It seems I was too rushed again; k0 can be 721:

http://www.cs.cmu.edu/~xfxie/sol_721.mpl

Actually, I am using TeamViewer on my Android phone to connect to my MacBook. Somebody might try to further improve the result using the code:

http://www.cs.cmu.edu/~xfxie/project/admissible/k0table.html

Just note there is a slight change in the Maple code ((deltap – delta) instead of (deltap – delta)/2 for thetat), as mentioned by Terence above.

3 July, 2013 at 10:07 am

xfxie
Further search returned the same 720:

http://www.cs.cmu.edu/~xfxie/sol_720.mpl

3 July, 2013 at 10:42 am

xfxie
Before the termination condition was satisfied, a new solution k0=719 was found.

http://www.cs.cmu.edu/~xfxie/sol_719.mpl

3 July, 2013 at 10:49 am

xfxie
Sorry, the result 719 is wrong; I found this after I managed to validate it remotely using Maple.

3 July, 2013 at 11:30 am

xfxie
Just found the reason. As I tried to verify the correctness of the solution using Maple, its output (a negative value for eps2 – eps) overwrote the file that the search algorithm was using, and thus fooled the algorithm.

4 July, 2013 at 11:40 pm

Gergely Harcos
I confirmed this value below (with slightly cleaner and more optimal parameters). So it seems that we have prime gaps of size at most 5414 infinitely often.

3 July, 2013 at 11:58 am

Eytan Paldi
Is the coefficient in the exponent the minimal one required to get doubly -densely divisible?

3 July, 2013 at 10:07 am

Richard Elwes – Carnival of Mathematics 100
[…] • Let’s start with the biggest mathematical news story at the moment. Back in May, Yitang Zhang announced a proof of the following great breakthrough: there are infinitely many pairs of primes which are at most 70,000,000 (= H) apart. Terry Tao reports on subsequent huge success: […]

3 July, 2013 at 7:11 pm

Daniel Reem
Following the “high level” spirit of this post, I wonder whether some geometric arguments can be used in order to improve the analysis (and in particular, some parameters). Let me give a more concrete example. It can be seen that the Cauchy-Schwarz inequality appears over and over again in the analysis. However, it is used as it is, in an “analytic manner”, without any attempt to use its full power by estimating the angle between the vectors involved. Doing so may indeed be a non-trivial task, but if one can somehow show (at least in certain cases) that the vectors are close to being orthogonal, or at least that the angle between them is bounded away from 0 and pi, then this is expected to improve the outcome.

P.S. It is very interesting to see how much progress has been made in such a short period.

4 July, 2013 at 11:36 pm

Gergely Harcos
Just a confirmation that xfxie’s k0 = 720 seems correct, and this yields prime gaps of size at most 5414 (cf. http://math.mit.edu/~primegaps/tuples/admissible_720_5414.txt). Using , , , , I am getting , , , .
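For readers who want to check such tuples themselves, here is a minimal Maple sketch of the admissibility test (not from the comment; is_admissible is a made-up name). A k-tuple is admissible precisely when, for every prime p at most k, it omits at least one residue class mod p; primes larger than k cannot be covered by only k residues, so they need not be checked:

is_admissible := proc(H::list)
  local k, p, i;
  k := nops(H);
  p := 2;
  while p <= k do
    # if the residues of H mod p hit every class, the tuple is inadmissible
    if nops({seq(H[i] mod p, i = 1 .. k)}) = p then return false; end if;
    p := nextprime(p);
  end do;
  return true;
end proc:

is_admissible([0, 2, 6]), is_admissible([0, 2, 4]);   # true, false

Applied to the 720-tuple in the linked file (read in as a list of integers), together with a check that its diameter is 5414, this would confirm the claimed bound.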

5 July, 2013 at 4:51 pm

Pace Nielsen
Just a comment about the combinatorics part of the program.

The Type III sums take advantage of the case where there are three smooth functions of somewhat large scale, at the cost of working with bounds on . We can instead break things into cases as follows:

Let be the largest smooth function in the Heath-Brown decomposition, with scale . (Perhaps this can be chosen in the Linnik decomposition, if done correctly.) Write . If we have we are in Type 0. If we are in Type II, but with one of the components smooth (which we won’t use). If , then we are in Type I, but with the smallest component smooth. Finally, if we are in Type I/II.

On the positive side, in this new breakdown can be arbitrarily small. On the negative side, we are ignoring some extra information (which allows for the creation of the Type III sums). On the positive side again, these new sums are much easier to compute.

Thus, we still do the Type 0, I, II computations, but also do a Type I computation with smooth and (let’s call this new sum Type IV for now). If I computed correctly, the improved bound from the (first) Type I computations, modified to Type IV, yields the bound . This seems like a much better border than the one between Type I and Type III.

(Note: This inequality was worked out on scratch paper, fairly quickly. Others may want to rework it, to double-check that the Type I/IV border is better than the Type I/III border.)

5 July, 2013 at 5:34 pm

Pace Nielsen
If the work above holds, we get the new inequality (which barely misses the Type II bound):

.

The work above followed the Type I level 1 computations, but without the last use of Cauchy-Schwarz, and thus needing only to bound .

5 July, 2013 at 6:39 pm

Gergely Harcos
You say: if , then we are in Type I, but with the smallest component smooth. I don’t see this. For example, there can be several tiny ‘s which correspond to Möbius convolution factors restricted to small dyadic intervals. Also, Zhang’s argument (and the one presented here) is based on grouping together certain convolution factors in order to create arithmetic functions with fairly large supports and possibly with the requirement of smoothness as well. I don’t see how to achieve that with your classification based on a single only. If you have a new way to treat the whole problem (new types of sums, and new estimates for them), then provide all the details so that we can verify them.

5 July, 2013 at 6:48 pm

Pace Nielsen
To clarify, if , we are in Type I only in the sense that we are looking at a sum with two convolved sequences (where is the single smooth component corresponding to , and is the convolution of the rest of the components). But we are *not* in Type I in the sense that the sequences have scale . Rather, the scales are (at worst) . That’s why I called it Type IV.

So, we are still grouping things together. We just leave the biggest smooth piece alone. The tiny ‘s, which are not smooth, all get sucked into the . I hope this clarifies things.

5 July, 2013 at 7:03 pm

Gergely Harcos
Let us focus on the ranges and , and let be smooth. How do you prove (42) from the earlier thread (http://terrytao.wordpress.com/2013/06/23/the-distribution-of-primes-in-densely-divisible-moduli/)?

5 July, 2013 at 7:44 pm

Pace Nielsen
I was following the ideas from the even earlier thread (http://terrytao.wordpress.com/2013/06/12/estimation-of-the-type-i-and-type-ii-sums/). First, we replace by (a smooth function). We combine the two exponential terms and into a single exponential. Then we use Corollary 10 from that thread, and bound things just as is done there (but since we never had to do the squaring from the Cauchy-Schwarz, things work out a bit better).

So, if I did things right, I get

by using Corollary 10. Here, we have and . So . We again bound by , and by . However, here can be approximated by rather than .

———————-

To obtain the needed bound, we now need to establish two inequalities. The first is

.

Recalling that , and ignoring s, we plug in and find that we need (contrary to my earlier computation, by the way; I accidentally dropped one power of ):

.

This leads to

which simplifies to

.

So, unfortunately, we have $\sigma \geq 1/6$ here. However, there is some hope that this inequality can be improved as we apply the improvements obtained in the Type I case.

Sorry for dropping that extra , and thanks for making me double check everything.

5 July, 2013 at 8:05 pm

Gergely Harcos
I think this is an interesting direction that merits attention.

5 July, 2013 at 7:11 pm

Gergely Harcos
You also say: if we are in Type I/II. Why is that?

5 July, 2013 at 7:31 pm

Pace Nielsen
I’m typing up my answer to your question above, but I can answer this one easily.

If all ‘s are , then a greedy algorithm allows us to get into the Type I/II situation. (It is the same argument used in the original combinatorics.) We don’t leave the biggest alone in this case, but force to have scales as close as possible (and neither are smooth anymore).

5 July, 2013 at 8:01 pm

Gergely Harcos
The assumption is initially made for the smooth convolution factors only, so I guess you are implicitly assuming that the rest of the ‘s are automatically at most . This can be achieved, but it requires (e.g. when working with the Heath-Brown decomposition) that remains above . Here is as in http://terrytao.wordpress.com/2013/06/10/a-combinatorial-subset-sum-problem-associated-with-bounded-prime-gaps/. I need to leave the computer now, but it would be nice to get an improved and simplified argument along these lines.

5 July, 2013 at 8:07 pm

Pace Nielsen
You can choose as large as you like, so this issue causes no problems. But yes, you understood correctly.

5 July, 2013 at 8:18 pm

Pace Nielsen
I guess to improve and simplify the argument, we would simply say: Let be the sequence in the Heath-Brown decomposition, where if we write the scale as , then is the largest real number that occurs. Letting be big enough, we only assume smoothness for when (for [and hence ] to be optimized later, using inequalities involving only fixed objects).

6 July, 2013 at 8:53 am

Improved bounds on Prime number gaps | On the learning curve…
[…] and his team steadily managed to improve the bound. Tao has already knitted a nice and detailed summary on the progress. As of last week, the proven gap was , but now the updated gap could be as small as (which is […]

6 July, 2013 at 9:41 am

Pace Nielsen
Just for fun, I plugged in what would happen in the Type I-Level 6 computations if we used the worst-case bounds

which are what arise when trying to do the new Type IV computations with scales nearly .

In this case, we obtain the inequality

.

Note that this is obtained by following the Type I-Level 6 ideas in the new Type IV situation, *without* taking advantage of the fact that is smooth. So there has to be some room for improvement here, but it would have to be in the constant to improve the current record.

7 July, 2013 at 11:59 am

Parents of Invention | DeepWeave
[…] find shorter and more elegant ways to achieve it. As of June 30, a collaborative project hosted by Terence Tao had pushed H down to 12,006 and provisionally as low as 6,996. They already know that the path […]

7 July, 2013 at 11:17 pm

The distribution of primes in doubly densely divisible moduli | What’s new
[…] with respect to the parameter (which, using this constraint, gives a value of as verified in comments, which then implies a value of […]

7 July, 2013 at 11:25 pm

Terence Tao
New thread is up: http://terrytao.wordpress.com/2013/07/07/the-distribution-of-primes-in-doubly-densely-divisible-moduli/

As requested, the latest Type I, II, and III arguments are presented. The Type III arguments turned out to have a few technical twists that were not anticipated in the previous writeups, but fortunately this only affected some lower order terms and the numerology is still basically unchanged. In any case the Type III estimates are currently quite far from being the critical obstacle towards further improvements of the estimates. (I would still recommend the sketchier arguments given in this current post over the more detailed arguments in the most recent post if one just wants to get a sense of how the arguments work, and to play with the numerology.)

8 July, 2013 at 4:31 am

Gergely Harcos
Thank you! I will study it in the coming days.

9 July, 2013 at 11:23 am

The p-group fixed point theorem | Annoying Precision
[…] theorem on bounded gaps between primes (although my understanding is that this step is unnecessary just to establish the […]

10 July, 2013 at 6:28 pm

Gergely Harcos
Some typos:

1. “an interval ” should be “an interval “.

2. “optimised in later” should be “optimised later”.

3. In the two displays following “the numerator becomes”, the weight is missing, but perhaps intentionally for the sake of exposition?

4. “Polyá-Vinogradov” should be “Pólya-Vinogradov”.

5. In the display following “expression can be rearranged as”, the condition should be .

[Corrected, thanks – T.]

11 July, 2013 at 4:35 am

Batch Auctions, US/EU Swap Reg Deal, and Primal Madness « Pink Iguana
[…] Internet race in a project called Polymath8. A recent e-mail from Tao announced the latest result: a gap of no more than 12,006 numbers between primes, and a provisional gap of only 6,966—conditi…. The details are in the link […]

11 July, 2013 at 11:59 am

Martin Maulhardt
Dear Mr Tao,

I’ve followed and admired your work for a long time, and I’m pleased that you have posted on this subject of Zhang’s theorem.

In connection with this theorem on the difference between two consecutive prime numbers, I wanted to share with you a result I proved recently: “There are infinitely many triples of consecutive primes such that the difference between the second two is smaller than or equal to the difference between the first two.” For example (7,11,13).

I’d be honored if you would like to read more in my paper, which I posted on my website (http://martinmaulhardt.com/?page_id=6).

11 July, 2013 at 1:59 pm

Gergely Harcos
Your result follows from Zhang’s theorem: if is the difference between the -th prime and the next one, then for any the assumption that is increasing for would imply that tends to infinity (because they tend to infinity on average), contrary to Zhang’s theorem. In fact, Erdős and Turán already proved in 1948 that changes sign infinitely often (see (7) here: http://www.ams.org/journals/bull/1948-54-04/S0002-9904-1948-09010-3/S0002-9904-1948-09010-3.pdf).
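To spell out the parenthetical step: writing $d_n = p_{n+1} - p_n$, if $d_n$ were nondecreasing for all $n \ge n_0$, then for $N > n_0$ one would have
$$d_N \ \ge\ \frac{1}{N-n_0}\sum_{n_0 < n \le N} d_n \ =\ \frac{p_{N+1}-p_{n_0+1}}{N-n_0} \ \sim\ \log N \ \to\ \infty$$
by the prime number theorem, so $d_N$ would tend to infinity, whereas Zhang’s theorem gives $d_n \le H$ for infinitely many $n$.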

11 July, 2013 at 9:31 pm

Martin Maulhardt
Dear Gergely,

Thanks for your valuable post; I have to admit that I was not aware of the result of Erdős and Turán, so my result is not new as I first thought. But reviewing the proof that these great mathematicians constructed, I have to say that mine is completely different and simpler.

Of course I do not have the slightest pretension of comparing myself with either of these great mathematicians by saying that my proof is simpler and different.

12 July, 2013 at 7:58 am

Terence Tao
A survey of the known results about the difference between consecutive primes can be found in this recent article of Pintz http://arxiv.org/abs/1305.6289 (see in particular Section 2.4).

12 July, 2013 at 8:31 am

Martin Maulhardt
Thanks very much!

2 September, 2013 at 2:32 pm

Polymath8: Writing the paper, II | What’s new
[…] (For a discussion of the terminology, and for a general overview of the proof strategy, see this previous progress report on the Polymath8 project.) This post can also contain any other discussion pertinent to any […]

20 February, 2014 at 6:54 am

zeggari
I am not an expert mathematician, but I have some ideas:

Let S(n) denote the repeated digit sum of a number n. For example, S(99)=S(18)=S(9)=9.

S(i) must lie in [2;9] for i>1.

To verify that two numbers are not twin primes:

“If S(p)*S(p+2) is different from 8, then p and p+2 are not twin prime numbers.”

Note that

1) S(p+i) = S(S(p)+i)

2) S(i)*S(j) = S(i*j)

Example:

S(111413131313131343534231331663534342442442442355343434351141879678685757575756464545353422211109)=6 and S(111413131313131343534231331663534342442442442355343434351141879678685757575756464545353422211111)=6+2

S(6*8)=S(48)=S(12)=S(3)=3, different from 8, so these two numbers are not twin prime numbers (verified by hand in less than 3 minutes).

20 February, 2014 at 7:04 am

zeggari
There was a problem with the number overflowing the bounds of the web page.

This is a new example because I lost the first number.

Example:

S(111413131313131343534231331663534342442442442355343434 351141)=7 and S(111413131313131343534231331663534342442442442355343434 351143)=7+2

S(7*9)=S(63)=S(9)=9, different from 8, so these two numbers are not twin prime numbers.

20 February, 2014 at 12:09 pm

zeggari
Sorry, I think this simply means that if we want to test whether p1 and p2 are not twin prime numbers, we must first divide p1 by 3, then p2 by 3, and so on.
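A small Maple sketch of the digital-root test described in these last comments (the names S and not_twin are made up). Here S(n) is computed via the usual formula for the repeated digit sum, and, as the final remark indicates, the test amounts to checking whether 3 divides p(p+2):

S := n -> ((n - 1) mod 9) + 1:                 # repeated digit sum (digital root) of n >= 1
not_twin := p -> evalb(S(S(p)*S(p+2)) <> 8):   # true means p and p+2 cannot both be prime (for p > 3)
S(99), S(48);                                  # 9, 3
not_twin(11), not_twin(13);                    # false, true: (11,13) is a twin pair, (13,15) is not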