This is the third thread for the Polymath8b project to obtain new bounds for the quantity

either for small values of (in particular ) or asymptotically as . The previous thread may be found here. The currently best known bounds on are:

- (Maynard) Assuming the Elliott-Halberstam conjecture, .
- (Polymath8b, tentative) . Assuming Elliott-Halberstam, .
- (Polymath8b, tentative) . Assuming Elliott-Halberstam, .
- (Polymath8b) for sufficiently large . Assuming Elliott-Halberstam, for sufficiently large .

Much of the current focus of the Polymath8b project is on the quantity

where ranges over square-integrable functions on the simplex

with being the quadratic forms

and

It was shown by Maynard that one has whenever , where is the narrowest diameter of an admissible -tuple. As discussed in the previous post, we have slight improvements to this implication, but they are currently difficult to implement, due to the need to perform high-dimensional integration. The quantity does seem however to be close to the theoretical limit of what the Selberg sieve method can achieve for implications of this type (at the Bombieri-Vinogradov level of distribution, at least); it seems of interest to explore more general sieves, although we have not yet made much progress in this direction.

The best asymptotic bounds for we have are

which we prove below the fold. The upper bound holds for all ; the lower bound is only valid for sufficiently large , and gives the upper bound on Elliott-Halberstam.

For small , the upper bound is quite competitive, for instance it provides the upper bound in the best values

and

we have for and . The situation is a little less clear for medium values of , for instance we have

and so it is not yet clear whether (which would imply ). See this wiki page for some further upper and lower bounds on .

The best lower bounds are not obtained through the asymptotic analysis, but rather through quadratic programming (extending the original method of Maynard). This has given significant numerical improvements to our best bounds (in particular lowering the bound from to ), but we have not yet been able to combine this method with the other potential improvements (enlarging the simplex, using MPZ distributional estimates, and exploiting upper bounds on two-point correlations) due to the computational difficulty involved.

** — 1. Upper bound — **

We now prove the upper bound in (1). The key estimate is

Assuming this estimate, we may integrate in to conclude that

which symmetrises to

giving the upper bound in (1).

It remains to prove (2). By Cauchy-Schwarz, it suffices to show that

But writing , the left-hand side evaluates to

as required.

This also suggests that extremal behave like for some function , and similarly for permutations. However, it is not possible to have exact equality in all variables simultaneously, which indicates that the upper bound (1) is not optimal, although in practice it does remarkably well for small .

** — 2. Lower bound — **

Using the notation of this previous post, we have the lower bound

whenever is supported on , , and are independent random variables on with density . We select the function

with , and for some to be chosen later. We have

and

and so

Observe that the random variables have mean

The variance may be bounded crudely by

Thus the random variable has mean at most and variance , with each variable bounded in magnitude by . By Hoeffding’s inequality, this implies that is at least with probability at most for some absolute constant . If we set for a sufficiently large absolute constant , we thus have

and thus

giving the lower bound in (1).

Hoeffding’s bound is proven by the exponential moment method, which more generally gives the bound

for any . However, this bound is inferior to the linear algebra method for small ; for instance, we can only obtain for by this method (compared to from quadratic programming), even if one uses the best available MPZ estimate.

Using , one can modify this lower bound, obtaining whenever we can find obeying

where

and

with

Details can be found at this comment.

** — 3. Quadratic programming — **

The expressions , are quadratic forms in , so one can (in principle, at least) obtain lower bounds for by restricting to a finite-dimensional space of and performing quadratic programming on the resulting finite-dimensional quadratic forms.

One can take to be symmetric without loss of generality. One can then look at functions that are linear combinations of symmetric polynomials

for various signatures ; actually it turns out numerically that one can get more efficient results by working with combinations of for up to a fixed degree, and for at that degree afterwords, where is the first symmetric polynomial. (Note that the GPY type sieves come from the case when is purely a function of .)

It may be that other choices of bases here are more efficient still (perhaps choosing bases that reflect the belief that should behave something like for some smallish ), but one numerical obstacle is that we are only able to accurately compute the required coefficients for in the case of polynomials.

## 154 comments

Comments feed for this article

8 December, 2013 at 5:08 pm

Andrew SutherlandA couple of minor typos in the bulleted list of bounds: I think should be and should be .

8 December, 2013 at 5:19 pm

Terence TaoActually, the Elliott-Halberstam conjecture essentially allows us to double the value of for any bound previously obtained using only the Bombieri-Vinogradov theorem (such as the bound ).

8 December, 2013 at 5:11 pm

Terence TaoThis doesn’t directly help the numerical optimisation, but here is an alternate way to compute the integrals of polynomials on the simplex , giving a slightly different way to establish Lemma 7.1 from Maynard’s paper. The starting point is the identity

which is a simple application of Fubini’s theorem. If F is homogeneous of degree d, then

from the Gamma function identity , we thus see that

Now, from the gamma function identity again we have

and thus

Replacing k with , and then performing the integral, one also has the generalisation

which is equation (7.4) in Maynard’s paper.

This calculation suggests that Laplace transforms in the variables may possibly be useful.

8 December, 2013 at 5:51 pm

Terence TaoI’ve been experimenting with the problem of optimising a sieve, to get some sense as to how sharp the Selberg sieve is. A model problem is the following: suppose one has a non-negative sequence for which one has the exact bounds

for all and some multiplicative function taking values in [0,1] and some quantity (in practice there are error terms also, but I will ignore them here). Let be the product of some finite number of primes. What is the best upper bound one can then place on the quantity

?

(This isn’t quite the sieve problem that is of interest when trying to go beyond the Selberg sieve to prove results of the form DHL[k,m+1], but it is a much more well-studied problem and is perhaps a place to start first.)

By the Hahn-Banach theorem, is also the minimal value of the quotient

subject to the constraint that for all (and that is non-zero, of course).

Another equivalent definition of is that it is the maximal value of , if are events in a probability space such that

for all . For instance, by taking the to be independent events of probability , we obtain the lower bound

. (1)

The Selberg sieve, in contrast, gives the upper bound

where . For comparison, (1) can be rewritten as

.

But the Selberg sieve is not always optimal; for instance, when is a product of many small primes then often the beta sieve is superior (and in the model one-dimensional case when , the beta sieve in fact gives asymptotically optimal results).

I’ve started playing around with a model case when the primes dividing P are very close to each other, and is basically constant, for some fixed , where the question simplifies to a low-dimensional linear programming problem, with the intent of understanding the precise strength of the Selberg sieve in this particular setting. The situation is then as follows: if we have events for some large with the bounds

for all and , what is the best upper bound on

in the asymptotic limit ? There are some interesting combinatorics emerging here and I might detail them in a future post.

8 December, 2013 at 9:44 pm

Terence TaoSome preliminary computations for the problem mentioned previously, where is real and is a natural number:

The lower bound (1) becomes . The Selberg sieve upper bound is

.

The Bonferroni inequalities give

whenever is even. When and is the largest even number less than or equal to , I can show that this is in fact an equality, by explicitly constructing events .

One can also define to be the infimal value of subject to the constraints for all ; it is also the supremal value of whenever are non-negative numbers obeying the constraints for all . Unlike the general sieve problem, there is so much symmetry here that the linear programming problem can actually be solved exactly for small values of .

The first non-trivial value of is : here we have and , and want an asymptotic upper bound for . It turns out that the exact value can be computed here as

whenever is a natural number and . Thus for instance the Bonferroni bound

is sharp for , then we have

for , and so forth. The Selberg sieve bound is

which happens to be sharp exactly when is a natural number – so in this case the Selberg sieve is fairly efficient. This latter bound can also be seen by observing that the counting random variable has mean and second moment , so that it has to be non-zero on an event of probability at least by the Cauchy-Schwarz inequality.

The situation appears to be almost identical with the situation, though I haven’t nailed this down completely. The case may be more interesting; I suspect here that the Selberg sieve bound is now inefficient, although I don’t know for sure yet. This is quite an oversimplified problem compared to our true sieve problem, but may help shed some light on that problem.

9 December, 2013 at 2:18 pm

Terence TaoI confirmed that the numerology is identical to the numerology. For the case, the Selberg sieve bound can be derived as follows. For a random variable of mean zero and variance 1, one has the bound relating the third and fourth moments, which arises from the Selberg-sieve type manipulation of starting with the trivial inequality and then optimising in a and b. In our situation, the counting function has falling moments for , or equivalently where the are Stirling numbers of the second kind. Conditioning onto the event where is non-zero, normalising to have mean zero and unit variance, and using the previous inequality, one obtains (as I verified in Maple) the bound .

The inequality is sharp if and only if X takes at most two values; thus the Selberg sieve is sharp exactly when X takes at most two non-zero values. But I believe I can show by hand that this cannot actually occur for the situation at hand, and so the Selberg bound is never sharp in this case. However, I don’t yet know of an easy way to improve the bound (although in the case, brute force computation should eventually give the optimal bounds).

9 December, 2013 at 5:22 pm

Terence TaoActually, it turns out that in some cases (specifically, when for some natural number ) it is possible for to take precisely two non-zero values, in which case the Selberg sieve bound is indeed sharp. (For instance, when , one can concoct a scenario in which takes the value 0 with probability 1/5, 2 with probability 2/3, and 5 with probability 2/15, and the Selberg sieve bound is sharp in this case.)

Similar scenarios are possible for larger ; in principle, the Selberg sieve bound is sharp if takes precisely non-zero values, although it becomes difficult to ensure that these values are all natural numbers. But the Selberg sieve nevertheless performs substantially better than I had expected on this model problem, which is perhaps a hint that one cannot hope to do much better than that sieve for the bounded gaps problem.

12 December, 2013 at 9:52 am

Terence TaoI found a way to convert the general sieve problem into a Selberg-type problem, by taking advantage of the positivstellensatz of Putinar. Among other things, this theorem asserts that a polynomial of n variables is non-negative on the cube if and only if it is the sum of polynomials of the form and for various polynomials (but with the caveat that R can have degree larger than that of Q, due to cancellation).

Anyway, the sieve problem is to minimise for supported on , given that and . Writing and introducing the n-variate polynomial

we see that we are trying to minimise the linear functional

(*)

subject to the constraints that Q is non-negative on the corners of the cube with , and only involves monomials with . Since on the corners of the cube, one can extend these monomials to for arbitrary . By adding large positive multiples of to Q, we may then assume that Q is non-negative on the entire unit cube , not just on the corners.

One can then use the positivstellensatz to replace the non-negativity of Q with a sum-of-squares representation

and one is now trying to minimise (*) subject to the linear constraints that Q(0)=1 and that all coefficients of Q with vanish. In principle this is a Selberg-type optimisation problem – the Selberg case corresponds to the case when Q is a single square, . Unfortunately the degrees of the can be large, as can the number of involved (I think it can be something like ). But it suggests a way to go a little bit beyond the Selberg case, for instance replacing Selberg sieves

(where has to be restricted to be at most ) with slightly more general sieves such as

for coefficients which are arbitrary except for the linear constraint that the net coefficient of in the final sum vanishes for . But it’s not yet clear to me whether this leads to a substantial improvement in the Selberg sieve numerology (from the toy cases in previous comments, there are already a lot of cases when Selberg is basically sharp already).

12 December, 2013 at 9:53 pm

Terence TaoHere is a slight modification of the above scheme, adapted for our current sieve problem. For sake of concreteness let us take k=3 (with the tuple (0,2,6)) and assume EH. Right now, we are working with sieves of the form

for some weights which we are free to choose, as long as (actually we can stretch this a bit to the region , which corresponds to the enlarged region , but let us ignore this enlargement for now). One possible generalisation of the above sieve is to

where is a parameter one can optimise in, and obeys the same support conditions as , and also agrees with for . Note that the expression is non-negative for , so remains non-negative, and note that when one expands out as a divisor sum, one only obtains terms with (all larger values of end up cancelling each other out). So one can still use EH to control error terms here.

In the case when , this new sieve collapses to the old Selberg sieve. But hopefully the extra flexibility here allows us to improve the quadratic optimisation problem. One can also play with other variants of the above scheme of course.

14 December, 2013 at 10:42 am

Terence TaoI’ve been trying to see if this new sieve actually improves the values of , etc. over the existing theoretical optimal values, but the results are frustratingly inconclusive: for model problems, no gain occurs, and for the original problem, the Selberg sieve is a stationary point of the optimisation problem, but it is still not clear whether it is a global extremum.

To focus the discussion, let us look at the problem of bounding the number of such that are all prime. Hardy-Littlewood predicts an asymptotic for this quantity, but the best upper bounds we can get are only within a large constant of this. For instance, by using a Selberg sieve

for smooth and supported on the interior of the simplex , we get an upper bound of where

Writing and using Cauchy-Schwarz, we see that the best value of C is 48 (the reciprocal of the volume of the simplex ), attained when is the indicator function of the simplex (in practice one has to infinitesimally smooth this a bit). (One can do a bit better than this by using distribution theorems on the primes; using Bombieri-Vinogradov I think we can lower C to 32, and on Elliott-Halberstam I think we can get 8, but let me ignore these directions here for simplicity.)

Now let us try a modified sieve

where F is as before, and for each , has the same support properties as F with for . This sieve is still non-negative for , and one can still calculate : indeed the numerator of C is adjusted from to

where . The summand vanishes when , and one can make vanish for , and then by using Mertens' theorem, C gets reduced to

. (*)

At first glance it looks like this is a strict improvement, as we have subtracted a new non-negative term from C. However, this new term vanishes when F is the previous extremiser (when is the indicator of ); furthermore, using the identity

and some Cauchy-Schwarz, one can lower bound (*) by

and by some further Cauchy-Schwarz, we get exactly the same extremum C=48 for C as before. So unfortunately in this model problem at least, the new sieve doesn't move the needle as far as bounds is concerned. However the analysis for the prime gaps problem is substantially more complicated, and I can't tell whether the extra flexibility here actually improves the global extremum; all I can say so far is that the previous extremiser remains a critical point.

7 May, 2014 at 5:29 am

AnonymousIt seems that this preprint is related to the problem.

8 December, 2013 at 9:49 pm

Terence TaoIncidentally, the upper bound suggests that could occur for k as low as 2973, which is a significant improvement over the value of 43,134 we currently have. Given how tight the upper bound is for small k (and for k=5 it is ridiculously tight – I would like to understand how that is so better) this suggests that there is quite a bit of room to improve our m=2 analysis. One idea I had was to replace the Cauchy-Schwarz inequality with its defect version (the Lagrange identity) and then try to obtain upper bounds on the defect , which may be more efficient than just crudely obtaining lower bounds on .

9 December, 2013 at 8:50 am

Eytan PaldiThis difference seems to be (or even ).

9 December, 2013 at 12:24 am

AnonymousTypographical comment: If you wrap the comma, used as ‘1000-separator’ in a large number, in curly braces, the space after the comma disappears, as it should; “” –> “”.

[Corrected, thanks – T.]9 December, 2013 at 4:41 am

Eytan PaldiThere is a typo in the upper bound for above (it should be ).

[Corrected, thanks! This helps explains why the M_5 bounds were suspiciously tight; they’re still quite good, but not unbelievably so any more… -T.]9 December, 2013 at 5:00 am

Eytan PaldiIn order to give a precise generalized definition for for a domain (not necessarily ), it is not sufficiently clear how to define the inner integral in the expression for .

9 December, 2013 at 8:50 pm

Pace NielsenTrue, but this can be done for the domain defined in Maynard’s preprint. I wonder if there are some nice functions (which can take the place of our monomial symmetric polynomials) for which is easily computed.

9 December, 2013 at 9:14 pm

Terence TaoWell, is the union (up to measure zero sets) of pieces, all of which are reflections of

or equivalently (after setting and for )

and so for symmetric F we have

.

In principle, this means that we may integrate on using the existing integration formulae on (symmetrising the integrand on the RHS if desired). For instance

and

(I think). The formulae get a bit more complicated beyond this, but look computable.

[EDIT: On the other hand, the functionals are a little bit tricky to compute here; will have to think about this a bit more.]

10 December, 2013 at 6:57 am

Pace NielsenFor the functionals , one may want to break up the simplex in a different way. Let’s focus on for a moment (the others being similar). One option (following something James sent me) is to break into the sets

for . I think one can then follow the analysis you gave above. It seems that the formulas are a bit complicated; there may be a cleaner method here.

10 December, 2013 at 7:13 am

Terence TaoActually, I realised just now that we can use symmetry to write

where is the Volterra operator in the k^th direction:

In particular if F is a polynomial then is a polynomial of one higher degree (without having to involve tropical expressions such as ). So any formula that allows one to integrate polynomials on will also be able to calculate for polynomials F, and similarly for the other .

9 December, 2013 at 10:40 am

SmetalkaPossible inconsistency: on the wiki page the lower bound for k=59 (3.96508) differs from the one you mentioned in your original post (3.898)

[Corrected; the 3.898 referred to a previous lower bound that has since been superseded. -T.]9 December, 2013 at 11:49 am

Eytan PaldiThere is still a typo (also in the wiki page): The value (as given by Pace) is

[Corrected, thanks – T.]10 December, 2013 at 11:49 am

Eytan PaldiAnother possibility to approximate is by trying to express the coefficients of and than to see if by the power method may be estimated.

10 December, 2013 at 12:55 pm

Eytan PaldiThree lines above lemma 4.9, I suggest to replace “the first non-zero derivative is positive” by .

[Corrected, thanks – although these sorts of comments should go in the Polymath8a thread. -T.]10 December, 2013 at 1:28 pm

FanThe link to this newest thread is not on the polymath wiki.

[Corrected, thanks – T.]10 December, 2013 at 1:50 pm

Terence TaoI was looking again at the Selberg sieve to see if there was any other way to move beyond our current range. It occurred to me that we could exploit more fully the difficulty gap between the task of computing the sum (which is really easy, since we understand how integers distribute in arithmetic progressions) with the task of computing the sum (which is difficult, since we need to understand how primes distribute in arithmetic progression).

Let us use the multidimensional GPY sieve as discussed in this blog post (rather than the elementary Selberg sieve from Maynard’s paper). Assuming all error terms are under control, and the underlying cutoff f is symmetric we obtain (assuming ) whenever the ratio

exceeds .

Now what constraints are there on the cutoff function f, other than it be smooth and compactly supported? Right now, we are requiring that f is supported on the simplex , or on the enlarged simplex . But actually, if one looks at the proof, in order to control the numerator it is only necessary that the slice be supported on the simplex , while to control the denominator, f can be supported in the larger simplex to reflect the fact that the integers trivially obey the full Elliott-Halberstam conjecture. (Here let us work in the “unconditional” setting in which is equal to or only slightly larger than 1/2.) It may be that this gives some extra flexibility to the variational problem. In terms of the differentiated cutoff function that appears in Maynard’s paper, the constraint is then essentially that F is allowed to be supported in the larger simplex , so long as the marginal distribution

is still supported on the simplex . For non-negative F, this restricts F to the region , but there is some possibility (perhaps unlikely) that one gains something numerically by considering some oscillatory F that pokes a bit outside of . I’ll think about this a bit more…

10 December, 2013 at 2:26 pm

Terence TaoIndeed, it appears that this relaxation of constraints does allow one to increase the ratio one is trying to maximise. Among other things, the optimum in the new variational problem is only achieved if the function is a sum of functions that depend only on k-1 of the k variables, thus (by symmetry)

.

Whereas in the previous optimisation problem this decomposition held only inside or , in the new problem it must hold throughout ; equivalently, the alternating sum of on the vertices of any box in with sides parallel to the axes needs to vanish. This is an additional constraint that is unlikely to have been satisfied by the original extremiser, suggesting that an improvement in the analogue for in this setting is available.

I’m not yet sure though how to convert this to an effectively computable optimisation problem; one needs a good basis for the space of symmetric functions of k variables on whose marginal distribution on k-1 variables is supported on the simplex , and I can’t see an obvious candidate for such a basis offhand…

10 December, 2013 at 3:17 pm

Pace NielsenCan you clarify what it means for the marginal distribution on variables to be supported on the simplex , perhaps with a simple example of such function? [Specifically, I thought we could choose our function to be piece-wise smooth, so doesn’t that let essentially let us choose the support? I’m probably missing something simple here.]

10 December, 2013 at 4:25 pm

Terence TaoSaying that a symmetric function has k-1-marginals supported on means that the k-1-dimensional function vanishes whenever . An example would be the tensor product for any function of mean zero; this symmetric oscillating function in fact has all marginals vanishing everywhere.

For sake of notation let us take k=3 and (in practice of course we need to take k to be more like 59 with this value of , but I don’t want to write so many variables); then the new variational problem (replacing ) is that of maximising the ratio

for piecewise smooth symmetric supported in the simplex , and such that vanishes whenever . The latter condition is automatic if F is supported on , but also allows for some additional oscillatory functions F.

10 December, 2013 at 3:26 pm

Eytan PaldiFor it seems that such oscillatory (having the above symmetric form) is not possible.

10 December, 2013 at 4:30 pm

Terence TaoIf , then one can find oscillating symmetric functions supported on the triangle whose marginals are supported on , e.g. where are supported on respectively with g having mean zero. (Actually, the case looks like a good one to analyse even if it won’t give us any nontrivial value of m, as this is one of the few places where we can hope for an exact solution to the optimisation problem.)

10 December, 2013 at 5:16 pm

Eytan PaldiI meant that it can’t be done if is of the form (symmetric sum of two functions of one variable) – which (as previously remarked) is the only possibility for an optimum.

10 December, 2013 at 7:38 pm

Terence TaoTo be precise, F will be of the form . It is possible for functions of this form to have vanishing marginals outside of [0,1], although if F is an extremiser then by a Lagrange multiplier argument one can show that is a function of y only when , and similarly when x and y are reversed, which shows after a little computation that is a scalar multiple of . This shows that in the k=2 case, we don’t gain anything over the bound that we already have from the example.

For k=3, though, I believe a similar argument shows that the extremiser is not supported on , suggesting that we can improve upon with this expanded variational problem.

11 December, 2013 at 8:55 am

Terence TaoJust to expand upon the final paragraph: Lagrange multipliers tell us that a symmetric extremiser F to the extended variational problem (in which F is supported on but has marginals supported on ) solves the modified eigenfunction equation on , where as the usual operator (but now defined on ), is the optimal ratio (larger than or equal to the optimal ratio on , which is in turn larger than or equal to ), and is symmetric supported on but vanishes on .

Note also from the vanishing marginal condition that where is symmetric and supported on .

In the k=3 case, we have

Suppose that F is supported on and . Then in the regime , we then have

which (setting y=1) shows that for and some function . In the regime , we have

the left-hand side is while the RHS is independent of y, which forces h to be constant, which then forces f to be constant, and then F is a scalar multiple of the indicator function on , which is not the extremiser. I think a similar argument works for all (for , though, the indicator on happens to be the extremiser).

In the Elliott-Halberstam case , the restriction of F to collapses the problem back to the M_k problem, but I think one can replace by the larger region

(by using the generalised EH to control the distribution of the summations) which is still larger than . There is now hope of getting greater than 2 (it seems that is about 1.92, we only need a little bit of additional boost to get over the top).

10 December, 2013 at 5:31 pm

Eytan PaldiBTW, perhaps Kolmogorov’s superposition theorem may be used here?

11 December, 2013 at 11:08 am

Pace NielsenJust to clarify, is that new range (in the case) equal to

?

If so, I can do the computation to see if now works.

11 December, 2013 at 11:47 am

Terence TaoA bit more precisely, for the region is

(in particular I believe this region is now non-convex for k>2), and one has to optimise for symmetric and supported in , subject to the constraint that the marginals are supported in . (In the general theta case, one has to restrict to otherwise some of the divisors will exceed x.)

I’m not sure how to implement the vanishing marginal constraint computationally. One crude way would be to add a large penalty function based on the L^2 mass of the marginals outside of the simplex , but this is likely to significantly degrade convergence of the algorithm.

11 December, 2013 at 11:15 am

Aubrey de GreyIs it yet possible to derive the minimal k for which M’_k passes 4? (The mention of a value for M’_4 arouses irresistible curiosity that it might be!)

11 December, 2013 at 11:39 am

Aubrey de GreyAh, I apologise – I now realise that M’_4 = 1.92 is simply twice the value of 0.96 that James provided two weeks ago (http://terrytao.wordpress.com/2013/11/22/polymath8b-ii-optimising-the-variational-problem-and-the-sieve/#comment-253116), using R’ but before the “M'” notation had been introduced. (Notation is evolving quickly at present!) But it would still be of interest to learn how feasible/close a calculation of M’_k for k in the 50 range is felt to be.

12 December, 2013 at 9:58 pm

Terence TaoI need to correct my previous comment here; actually, I don’t think we can take the support of F in the case (taking for sake of concreteness) to be as large as

as I had previously claimed.

The problem comes from lack of convexity; we have to deal with the square of the divisor sum, and if one factor in the square comes from a tuple with (say) , and if another comes from a tuple with , then we don’t get any good control on the combined factor, and in particular cannot use EH in any of the three resulting moduli. The best I can see right now is to break the symmetry and work with a region such as

While this in principle is still an improvement over the previous situation in which we restricted ourselves to the region , the loss of symmetry is going to make the numerical optimisation messier (especially since we still have the condition on the (y,z) and (z,x) marginals being restricted to and respectively).

But in the case, we can still work with the simplex as before, because it is convex and so this issue does not occur.

11 December, 2013 at 2:43 am

Eytan PaldiSuch a basis can’t be real analytic (in particular, polynomial) on (otherwise, the marginals are also real analytic on – and can’t be supported on without vanishing identically!)

11 December, 2013 at 2:26 am

AndrewVSutherlandThis paper of Friedlander and Iwaniec posted to arXiv this morning may be of interest (and should probably be added to the bibliography)..

[Reference added, thanks – T.]11 December, 2013 at 8:38 am

Terence TaoThanks for the link! From a quick read through, the paper has a few innovations that aren’t in Polymath8a; for instance, in the MPZ estimates they are able to let the residue class vary with (in our notation, it would be ). By doing so, they give up some of the more advanced estimates of Polymath8a that come from averaging in modulus parameters such as q and r, but enough of the original arguments of Zhang (including Zhang’s original Type III argument, based on the Birch-Bombieri bound from Deligne’s theorem) can be adapted to still obtain a nontrivial estimate of this type.

There is also a passing remark that the Bessel function optimisation was actually performed first by Brian Conrey in an unpublished 2005 computation, thus predating the work of Farkas, Pintz, and Revesz; I’m guessing this was at an AIM workshop. I suppose we should note this also in Polymath8a, I’ll see if I can get some confirmation of this remark.

[UPDATE: Brian has confirmed the calculation and showed me some of his emails and handwritten notes on the topic. I’ll put in a remark in the paper to this effect. -T.]11 December, 2013 at 11:09 am

Emmanuel KowalskiI had also heard that Brian had done this optimization around the time of the first GPY paper, but I didn’t get around to asking him for details. I actually thought that Soundararajan had a remark to this effect in his Bulletin of the AMS survey, but I couldn’t find it there, so that was probably a mis-remembrace. I agree that it is good to add a mention of this in the paper.

11 December, 2013 at 2:44 pm

Terence TaoI’ve added a remark to this effect in Section 4.3. I also found a reference to Conrey’s computation in p. 129 of “Opera del Cribro”, and now that I look again at the Farkas-Pintz-Revesz paper, this is also mentioned in the introduction; I remember looking at this paper before, so I’m surprised I didn’t catch this previously.

On looking through “Opera del Cribro” I also found a conjecture on page 408 which is close to what we have been calling the Motohashi-Pintz-Zhang estimate (but the authors do cite Motohashi-Pintz for further details). This should probably also be mentioned in our paper.

14 December, 2013 at 5:44 am

Donal Lyons“Opera de Cribro” :-)

11 December, 2013 at 11:08 am

Eytan PaldiIt would be helpful to update the wiki page about the variational problem by adding details on the associated problem for (and perhaps for larger domains as discussed above.)

[Added some text for this – T.]12 December, 2013 at 8:22 am

Eytan PaldiOn the possibility of a “piecewise polynomial” basis for the variational problem:

The idea is to represent as a disjoint (up to boundaries) union of and finitely many convex polytopes such that each basis function is a polynomial over the interior of and each . Note that (because of convexity), each marginal is a sum of integrals over intervals, with at most one interval for each .

It follows that each marginal of each monomial is “piecewise polynomial” outside (over similar partition of polytopes outside .)

We need the condition: For each FIXED , the marginal (as an integral with respect to ) of each monomial is a polynomial over each above (where the polytopes are “monomial independent” but may depend on ).

It seems that this condition is satisfied (for sufficiently refined partition into polytopes .)

Under the above condition, for each basis function, the vanishing of each marginal outside is described by a system of linear constraints on the coefficients of the basis function.

Therefore (for appropriate partition ), we return (for each maximal degree ) to a similar optimization problem (with the additional system of linear constraints on the coefficients describing the “piecewise polynomial” ).

14 December, 2013 at 11:13 am

Terence TaoI like this idea. It may well be that by applying this with k=4 on EH, we may finally be able to get above 2 and thus get on EH.

But it may make sense to warm up on k=3 first, which is already somewhat complicated combinatorially (and k=4 is much worse). We have, in increasing order, the quantities , where is the maximal value of

(*)

for F supported on the simplex , with

and similarly for cyclic permutations, is the maximum value of the same ratio but with F supported on the larger region , and is the maximum value of the same ratio, but F now supported on the prism and with marginal distributions and supported on and respectively. (The support of the third marginal is already supported on the correct simplex.) Note that while the first two problems are completely symmetric in x,y,z, this third optimisation problem is only symmetric in x,y, so we have to work with functions F that are symmetric in x,y only.

I’m not sure if the numerically computed values of M_3 or M’_3 have already been posted here but presumably they can be calculated easily using existing algorithms, to serve as benchmarks for the potential gain M”_3 offers over M’_3.

As Eytan suggests, one can split the prism into convex polytopes on which the marginal conditions become polynomial. It seems that the following partition into seven regions works:

We take to be piecewise polynomial on each of these seven pieces, e.g. on , on , etc. By symmetry we may assume that for (with ), so we have four polynomials to independently optimise over. The M’_3 problem corresponds to the case .

By symmetry, the only marginal condition is that when . When , this condition is just

but when , the condition becomes

and when , the condition becomes

.

In principle, these are finite-dimensional linear constraints on that allow for a non-trivial solution when the degrees D of are large enough (one has about degrees of freedom and constraints). The quadratic program is moderately messy though.

14 December, 2013 at 11:22 am

Pace NielsenIn your definition of did you get the correct powers on ? I thought in we only used the square, and in only the square on the outside of the first integral.

[Oops, that was weird; fixed now. -T.]14 December, 2013 at 4:16 pm

Eytan PaldiSome remarks:

1. To find the coefficients of the linear constraints and quadratic forms for the optimization, it is perhaps easier to use (iterated) symbolic integration.

2. If the linear constraints are represented by where is the constraint matrix and is a vector representing the coefficients of , then we have $a = D b$ for some matrix and vector with $dim b < dim a$, so we return to the original optimization problem (without linear constraints) for .

14 December, 2013 at 7:36 pm

Aubrey de GreyIs there any prospect of adapting Terry’s proof from three weeks ago of a sharp upper bound on M_k (http://terrytao.wordpress.com/2013/11/22/polymath8b-ii-optimising-the-variational-problem-and-the-sieve/#comment-253731) to give a corresponding upper bound on M’_k or M”_k ? I don’t think there has been any discussion of that (other than James’s computed upper bound for M’_4), and I’m thinking that in view of the daunting combinatorial obstacle which may now be emerging in relation to extending Eytan’s idea to k > 3 (and especially to k for which M”_k could get anywhere near 4), there may be a case for attempting to determine in advance how much progress beyond M_k could be achieved for a given k in the best-case scenario.

15 December, 2013 at 8:39 am

Terence TaoFor M’_k, the best bounds I know of in general are

(improving slightly over the previously observed upper bound ). The lower bound is trivial. For the upper bound, note for any F supported on the polytope that one has

by looking at each fixed- slice of F (which is a k-1-dimensional function supported on ) and applying the definition of , and then integrating in . Averaging over permutations we then obtain

giving the claim.

Note that we do not necessarily have monotonicity in (for instance, ). So it may well be that M’_3 exceeds 2 even if M’_4 does not. The above bound, combined with the existing bound , gives an upper bound of 2.0794 for , so there is hope, though admittedly a slim one since the upper bound was somewhat crude.

For , the best bounds I can obtain easily are

.

The lower bound is trivial, and the upper bound comes from observing that the function F for is supported in the prism , and ignoring all the vanishing marginal conditions on F. From the slicing argument we again have

and from Cauchy-Schwarz one has

giving the claim. This is a very poor bound for small k, but it shows asymptotically that we do not break the logarithmic barrier through any of these tricks, unfortunately.

15 December, 2013 at 9:11 am

Aubrey de GreyThank you Terry! The absence of an (easily-obtainable, anyway) upper bound on that is even close to that for seems most encouraging. Out of interest, is also not necessarily monotonic (for a given )?

16 December, 2013 at 11:23 pm

James MaynardI had a look at this this afternoon.

Provisionally, my calculations are getting the bound

This is from taking to be degree 4 polynomials in and following the approach outlined above.

This should be compared with the bounds

I’ll give a couple details tomorrow.

Frustratingly, if the above calculations hold up then it looks like we might come just short of improving our current bound to something above 2.

17 December, 2013 at 9:27 am

Terence TaoThanks for the computations! At least we now do know (at least numerically) that can be strictly greater than , so the idea of going outside does have some potential. I’ve started putting up the data on the wiki at http://michaelnielsen.org/polymath1/index.php?title=Selberg_sieve_variational_problem#World_records

I’m still missing a few cells (notably the lower bounds for M_2 and M’_5 – presumably you have these available or easily computable from your code?) but it already shows that the crude upper bounds and are nowhere near as sharp as the bound , although this is not terribly surprising.

It would be somewhat amusing (though suspenseful) if was just at the edge of provability, so that a massive computer effort would be required to get high enough degree to verify it. If that doesn’t work, another option is to turn to and shave off as discussed a couple weeks ago, to get to 10 rather than to 8.

One further option is the following (let’s take k=3 for sake of discussion). Right now, on EH, the function F is assumed to lie in the prism

with appropriate vanishing marginal conditions. This is not quite the only choice available: in principle, we can have F supported on any set with the property that the sumset lies in the non-convex region

as this represents all the ranges of moduli occurring in the Selberg sieve (which is the square of a divisor sum over a region associated to R) for which we can use Elliott-Halberstam in one of the three moduli (to make this work we have to first approximate F to be a finite linear combination of tensor products of one-dimensional functions, but this should be easily achievable from a Stone-Weierstrass argument, though there is a minor technicality that one has to make sure that the vanishing boundary conditions are still obeyed). Anyway, it is conceivable that there is some other choice of R which is superior to the prism , although I don’t obviously see one (my 3D geometry visualisation is not that great!).

17 December, 2013 at 10:17 am

James MaynardI get that and . (Actually for we can solve the Eigenfunction equation, and the answer is the solution of the equation

)

17 December, 2013 at 11:25 am

Terence TaoThanks for this! Just for the record, I’m writing down the calculations for the eigenfunction equation on the triangle . Assuming symmetry and writing , where , we see that

which in particular implies that f(1)=0. Also, differentiating we see that

and thus

(1)

where ; reflecting, we obtain

and hence

which can be solved as

We may normalise C=1. Substituting back into (1), we conclude after some calculation that

but then since we have C’=0. Using the boundary condition f(1)=0 we then have

which rearranges to as required. Up to scalar multiples, the eigenfunction is then

Wolfram alpha tells me (and one can easily confirm) that where is the Lambert W function.

17 December, 2013 at 12:17 pm

James MaynardI thought I’d record for completeness the solution to the Eigenfunction equation (even though it seems we can’t solve it when and for we have simpler and stronger results from using the enlarged region). The comment above was mine – I hadn’t intended it to be anonymised.

[Name restored to previous comment – T.]We wish to solve the eigenfunction equation (for smooth supported on )

We see that from this equation (and by the symmetry of ) that is of the form . This gives us an eigenfunction equation for (having simplified slightly):

Differentiating with respect to gives

and since the right hand side is unchanged by the substitution we see that

Differentiating (*) again and substituting our above expression for gives

This is a standard differential equation which can be solved by separation of variables. The solution is

for some constants . We see from substituting this expression into (*) that we must have . Since any scalar multiple of an eigenfunction is also an eigenfunction, WLOG we may take . To calculate we notice that the eigenfunction equation for requires that . This gives the equation

mentioned above. For such a , we have then found the unique eigienfunction (it is easy to verify that this is an eigenfunction) given by with

17 December, 2013 at 8:12 pm

Pace NielsenAlong the lines of finding a set for which lies in the non-convex region Terry described above, the following is a choice which is symmetric in all three variables. (It’s not clear to me whether this domain is better than the one already being used, but it may be worth a shot.)

The region is simply

.

I like to visualize the increasing sequence of domains (in the case) as follows. The domain used for computing is obtained by slicing the cube along the plane (containing three corners of the cube), leaving behind a tetrahedron.

The domain used for computing is found by taking the tetrahedron we just described, and gluing another tetrahedron along the same plane , with the new tetrahedron having a pinnacle at the point .

The new region described above also contains the point , and indeed we slice the cube along the plane , which contains .

17 December, 2013 at 8:25 pm

Pace NielsenAlternatively, one can view this new region as cutting a cube exactly in half (with the cutting plane perpendicular to opposite corners). So the area of this new region matches that of the other one currently being used.

17 December, 2013 at 9:39 pm

Aubrey de GreyI’m not sure whether this was Pace’s motivation, but it’s worth noting that even if Pace’s new suggestion for R is no larger than R”, the fact that it is symmetric in 1..k may very well make it much easier to identify an algorithm for dissecting it appropriately when k>3.

17 December, 2013 at 11:17 pm

Terence TaoIt took a while to figure out how to dissect Pace’s half-cube – in fact I eventually had to use my son’s construction toy set to visualise it – but I think I have a dissection into 13 pieces so that all the constraints become polynomial. First there is the base polytope

.

Then there are three simplices , where

and similarly for cyclic permutations of x,y,z; there are also three simplices , where

and similarly for cyclic permutations; and finally there are six simplices , where

and similarly for arbitrary permutations.

A symmetric function F can be specified by its values on R’_3 (which are arbitrary), together with its values on respectively, with being symmetric wrt y and z interchange. The vanishing marginal constraints, by symmetry, just assert that for , and , which become

when , and

when . (But this should probably be checked.)

With this new polytope, I have an upper bound , just because . This is larger than the bound for the prism , which is perhaps a hopeful sign, although the upper bound here is cruder because the role of the corners of poking out of [0,1]^3 are ignored.

I don’t see yet though how to dissect either of the two four-dimensional candidate polytopes, namely the prism and the truncated simplex . (Certainly my son’s construction set is no longer of use here!)

18 December, 2013 at 8:50 am

Aubrey de GreyIs it possible to state in purely geometrical terms what it means for a polyhedron (or polytope, but let’s start with k=3) to have polynomial moment conditions? I would like to play with possibilities for doing the chopping-up of candidate R’s systematically for k > 3, and I’m thinking that there may be alternative dissections that are superficially inferior (involve more pieces, in particular) but that turn out to be easier to generalise to larger k, but this is hard to get one’s head around without a geometrical sense of what makes a dissection admissible.

18 December, 2013 at 9:38 am

Terence TaoThe precise problem is this: given a polytope , subdivide it into subpolytopes , such that for each polytope and each , at least one of the following is true:

(1) The projection of to under the operation of deleting the coordinate maps to a subset of (or equivalently, lies in the prism (which is some permutation of ); or

(2) The polytope can be viewed as a region of the form

for some affine-linear functionals . In other words, all but two of the faces of are parallel to the coordinate direction. (This makes the vanishing marginal condition in the e_j direction a linear one in the coefficients of F, if F is piecewise polynomial on each of the P_i.)

In order to ensure that we recover at least in our optimisation problem, we also need to require that one of the polytopes in the partition is equal to . (In particular, the trivial partition in which m=1 is useless, as it only gives the trivial lower bound of 0 for M”_k.)

At present it is not obvious to me that an arbitrary polytope R actually admits such a finite subdivision; for any fixed j one can apply various slicing operations parallel to to make the subdivision good for that particular value of j, but this may introduce new problematic faces that ruin other values of j. The subdivisions I found so far were obtained by ad hoc means, and I don’t immediately see how to generalise them.

18 December, 2013 at 9:49 am

Terence TaoActually, it just occurred to me that we don’t, strictly speaking, need the polytopes to obey condition (2); if there are more faces that are transverse to , then we get more vanishing conditions, which reduces the number of degrees of freedom for a given degree of piecewise polynomial, but still in principle allows us to approach the optimal M”_k by using a sufficiently high degree. So one may be able to get by with a coarser partition that has more transverse faces, if the degrees of freedom tradeoff is favorable.

James: perhaps it may be a good idea to organise your code in such a way that one can easily swap in different partitions, polytopes, degree parameters, etc., to give some comparative numerical study of performance of different partitions in the k=3 case, which may give us some useful guidance before plunging into the all-important k=4 case.

17 December, 2013 at 3:15 pm

Aubrey de GreyMany thanks James – and, in advance, for the details. In particular, can you elaborate on the basis of your remark concerning ? – I was presuming that there is no simple algorithm for dissecting with according to Eytan’s idea, which is surely a prerequisite for making any such estimate, but maybe there is one?

17 December, 2013 at 4:53 pm

James MaynardAubrey: I agree there is no simple algorithm at present. The same sort of decompositions should work for , although the decomposition will probably be in many more pieces. I think it should be tedious but probably relatively straightforward to do it by hand.

My comment regarding this improvement potentially not getting all the way to was because we need an improvement of about 0.063 on the current bound for . When we appear to get an improvement of 0.072 in going from to . My guess was that the improvement in moving from to would decrease quite quickly in (as it does moving from to ), and so the current improvement of 0.072 would probably decrease to below the required 0.063. This is just my guess though (I have no proof of this) – it would be nice if I was wrong on this front.

17 December, 2013 at 5:09 pm

Terence TaoWell, from a volume perspective at least, has much bigger volume (, compared with for and for ), which gives one hope. Of course, much of this extra volume is somehow “cancelled out” by the additional vanishing marginal conditions, so things are inconclusive.

This is a degenerate data point, but the improvement from M’_2 to M”_2 is precisely zero, so the improvement certainly got better going from k=2 to k=3. Perhaps going from k=3 to k=4 we still have a little bit of upward monotonicity – well, we can hope, anyway :)

18 December, 2013 at 12:19 pm

James MaynardI’ll try to give some more details of my calculations (and hopefully if I’ve made a mistake someone will spot it). I’m not sure that I’m explaining this in the clearest language, but I hope it makes sense.

As described in Terry’s post, we are looking at polynomials supported on respectively.

Let’s concentrate on the calculation of , the calculations of being similar.

I considered the projections of onto the plane. For example, the projection of is given by . Based on these projections, I split the plane into 6 disjoint polytopes given by

These were formed by just splitting the projections so that each region had a fixed set of the regions which projected to it (and those regions projected onto the whole of ). Since the regions were given by a set of linear inequalities, so too were the regions (and in particular they are convex).

The point of these regions is that for in any region the integration over can then be written as integration over an interval. We see that the integration over for must be zero from our marginal conditions (since ).

Over each region, we can write down the integration :

The regions of integration for over correspond exactly to the vanishing conditions terry wrote above.

Thus we can perform the above integrations for with respect to , square, and then integrate over the regions . This gives our expression for as a quadratic form in the coefficients of .

To enforce the vanishing marginal conditions, we simply calculate the polynomials given by Terry above (i.e. the 2-variable polynomials obtained by integrating with respect to ). Each coefficient of a monomial gives a (homogeneous) linear constraint on our coefficients. By simply using each linear constraint to eliminate one variable at a time from the quadratic form , we arrive at a reduced quadratic form which satisifies all the vanishing marginal conditions. Doing the same for then gives us the ratio of two quadratic forms, which is our quadratic programming problem from before.

It should be relatively straightforward to modify my code so that given a decomposition into polytopes given by inequalities for linear functions it can calculate the quadratic form corresponding to the integration over the polytope.

18 December, 2013 at 12:43 pm

James MaynardAlso, unless I’m missing something, I think in principle there should be a fairly simple algorithm for decomposing any region given by linear inequalities to polytopes of the type we desire.

Assume we are given some region which is given by a set of linear inequalities in the variables . We can split this into subregions depending on which of the forms are greater than 1, and which are less than 1. Each such subregion will also just be given by a set of linear inequalities. We let be a fixed polynomial on each such region . Since we know exactly which of the functions are greater than 1 on , each region comes with a set of vanishing marginal conditions, where is the number of the which are greater than 1.

Each region can be further decomposed into regions which are of the form for linear functions . (Proof: induction on . Since is given by a set of linear inequalities, the lower and upper bounds for are given by the maximum and minimum of a set of linear functions in . For a each ordering of these functions, this gives a upper bound and lower bound for which is linear. Then we use the induction hypothesis, since the remaining variables are constrained only by linear inequalities).

We then decompose the projections in the coordinate of the regions as above (so we have regions where each region is projected to from a fixed set of the ). This means on each of these subregions (which are also just given by linear inequalities) the integral of with respect to is given by a finite sum of integrals of over intervals. Thus we can calculate all these integrals explcitly and calculate the quadratic form . Similarly for the other regions.

We still need to apply the linear constraints, but by following exactly the same method above, we can calculate the various degree polynomials corresponding to the vanishing marginals, which gives a set of homogeneous linear constraints on the coefficients of the . Eliminating variables from the quadratic forms then gives reduced quadratic forms satisfying the vanishing conditions.

Since everything is just given by linear constraints at each stage, everything remains convex so we avoid issues to do with a lack of convexity.

Obviously the above would give some huge number of decompositions if done directly. We can save a lot for small by using symmetry and sometimes switching the orderings of the variables to mean we only have to consider fewer regions. Thus it is probably the case that it is still better to do things by hand.

18 December, 2013 at 1:18 pm

Aubrey de GreySplendid! Hm, but if an economical decomposition is hard to find even for k=4, maybe it is worth seeking an algorithm to convert your decomposition into something with a more manageable number of components. The first approach that springs to mind is to do the decomposition as you describe and then iteratively examine each pair of component polytopes that share a (k-1)-dimensional polytope, to see whether their union satisfies the necessary conditions, and merge them if it does. Is there any reason why that would be expected to run into the sand unhelpfully quickly?

18 December, 2013 at 1:35 pm

James MaynardIn case my first post isn’t clear, I thought I’d try to write out explicitly what I get for some of the integrals. (Also it might make any mistakes I made clearer – there are enough terms that I can’t rule out I’ve made a small mistake somewhere).

Not assuming symmetry, I find that

where

And this is subject to the three vanishing marginal conditions

The expression for is essentially the same, swapping all instances of with in the integration variables and limits, and swapping the functions with everywhere.

For I obtained

where

There are no vanishing conditions for since we are restricting to anyway.

18 December, 2013 at 4:32 pm

Pace NielsenJames, did you get a chance to try the computation for on the new domain I described above (which Terry was kind enough to work out the linear constraints for)?

18 December, 2013 at 4:43 pm

James MaynardI think my comments above can give a relatively straightforward decomposition of the region

We split into 8 subregions which I’ll label . To ease notation let and similarly define which are the forms which are the sum of all variables except for or respectively. Then

and similarly for (the subscripts are indicating which of the are greater than 1).

We choose our function to be a polynomial of each region . (For the time being I won’t assume any symmetry, although this can be used to simplify things later).

To calculate :

We split the space into 6 subregions given by and all permutations of the subscripts. (For the regions the subsripts are intended to be ordered, whist the regions the subscripts are unordered). The region is given by

and similarly for all other regions.

If we restrict to , then the inner integration is a sum of integrals of the functions over intervals. For example, for we find is given by

It is easy to write explicit limits for the integration with respect to for any region . Thus we can calculate the quadratic form . There are no vanishing marginal restrictions here.

To calculate we follow a similar approach. We again split the space into regions etc giving the orderings of the sizes of . However, we further split into and depending on whether or . For example, we obtain for

whilst for we obtain the vanishing condition

By symmetry, in this case we only need to consider functions for which for any permutation of . This should reduce the search space significantly. For example, and will be symmetric in , whilst will be symmetric in but equal to one of the other functions under transpositions of or .

I haven’t had a chance to actually try to compute anything with this yet, though. I also haven’t had a chance to compute with Pace’s alternative region. (Work to do, although I’ll be busy with other things for most of tomorrow.)

16 December, 2013 at 2:32 pm

Terence TaoI can get rid of the final error in the asymptotic lower bound for , thus

.

I’m recording a sketch of the argument below; I may give a more detailed proof for the next blog post.

The basic idea is to do a defect analysis of the upper bound . We first observe that the Cauchy-Schwarz upper bound

(1)

valid for reasonable non-negative functions , can be matched with the lower bound

(2)

for any scalar a (independent of the variable of integration). Indeed, the right-hand side can be calculated to be .

Now, recall that the upper bound comes from the inequality

for any , where and is short for ; this comes from applying (1) with . Integrating this in the variables and then symmetrising gives the upper bound.

Now we make the usual choice of testing

where and , where are parameters at our disposal; a good choice here is and for a small absolute constant . Applying (2) with , we obtain that

.

If we symmetrise, and let be iid random variables with law , where , then

where , and hence

where is now interpreted as , and the integral is interpreted as vanishing if is negative. One can compute that , and from the second moment if is small enough, so

The integral may be bounded by the sum of

(which by convention vanishes if ) and

.

The first term may be bounded by

For the second term, we use the identity to write this integral as

when , and

when .

When , we crudely bound , to bound this integral by . When , we bound and to bound this integral by . Putting all this together, we see that

where we are implicitly reducing to the event that is positive.

The second moment method shows that has mean and variance ; this makes all the error terms except possibly for the term, but this can also be made to be O(1) by an averaging argument (dilating the g’s by and using the pigeonhole principle).

In principle this analysis might give better numerics for the m=2 bound, but it’s a bit messy.

18 December, 2013 at 12:45 am

Andrew GranvilleTerry: please can you give more details when you write “the \log \frac{c}{\theta} term … can also be made to be O(1) by an averaging argument (dilating the g’s by 1 + O(1/\log x) and using the pigeonhole principle).” ?

In my own failed efforts at proving something like this, I got stuck precisely when the range for t_k is tiny (ie reducing this term). Thanks!

18 December, 2013 at 7:42 am

Terence TaoI wrote up some more details of the argument (with explicit error terms, which unfortunately makes things messy) at http://michaelnielsen.org/polymath1/index.php?title=Selberg_sieve_variational_problem#Lower_bounds

The averaging argument ends up being easier to run as follows: if we let be the truncation of a tensor product to the dilated simplex , then we have

We manipulate this inequality as discussed previously, converting it into an upper bound for the defect in terms of various integrals involving g and r. When one arrives at the problematic logarithmic singularity, one can then average in r (noting that the defect is independent of r).

19 December, 2013 at 12:41 pm

Eytan PaldiCan the estimate be made effectively computable ? (or even be optimized for each given ?)

19 December, 2013 at 2:06 pm

Terence TaoIt is certainly effective; I don’t know yet whether it is competitive with our existing bounds for m=2, but it should certainly give reasonable bounds for higher m for the first time. I’ll try to work out some more details later today..

19 December, 2013 at 6:07 pm

Terence TaoAfter playing around with Maple, I was only able to get the O(1) loss in down to about 4 or 5 for medium-sized values of k, which unfortunately does not beat the current estimates for m=1,2, and gives a pretty lousy estimate for m=3; I can get but not much better than this. (I was hoping to recover Zhang’s original value of k=3,500,000, so that we could now stuff four primes into this tuple instead of two, but fell short of this even with our best MPZ estimate.)

I’m recording the Maple code that gave in the m=3 case below. There are three free parameters that one can optimise over, although I found in practice that a (somewhat artificial) constraint that I had to impose at some point was dominant in the sense that it set the optimal value of for a given c, T.

k := 10000000;

m := 3;

# These three parameters may be chosen arbitrarily

c := 1.08/log(k);

T := 0.41/log(k);

# tau := 1.14/log(k);

M0 := k*log(k)/(k-1);

g := 1/(c+(k-1)*t);

m2 := int(g*g, t=0..T );

mu := int(t*g*g, t=0..T )/m2;

sigma2 := int(t*t*g*g, t=0..T)/m2 - mu*mu;

tau := 1 - k*mu;

denominator := (1+tau/2)*(1 - (k*sigma2)/((1+tau-k*mu)^2));

a := (r - k*mu) / T;

S := r * (log((r-k*mu)/T) +(k*sigma2)/(4*a*a*T*T*log(a)) ) + r*r/(4*k*T);

Sav := int(S, r=1..1+tau, numeric)/tau;

Z3 := int( g*g*k*t*log(1+t/T), t=0..T, numeric) / m2;

W := int( g*g*log(1+tau/(k*t)), t=0..T, numeric) / m2;

X:= (log(k)/tau) * ( (1-(k-1)*mu+max(tau-c,0))^2 +(k-1)*sigma2+ c^2 );

numerator := Sav + Z3 + W*X;

Delta := (k/(k-1)) * numerator/ denominator;

# these three quantities need to be positive

evalf( 1 - k*mu - T );

evalf( 1 - k*mu - tau );

evalf(denominator);

# this is the lower bound for M_k

evalf(M0-Delta);

varpi := m/(M0-Delta) - 1/4;

delta := T * ((1/4) + varpi);

`# if this is less than 1, we win`

evalf(1080*varpi/13+ 330*delta/13);

19 December, 2013 at 10:47 pm

Terence TaoI found a more efficient way to control one of the error terms (which I called Z_4 in the wiki page http://michaelnielsen.org/polymath1/index.php?title=Selberg_sieve_variational_problem#Lower_bounds ), which gets the O(1) error down to about 2 or 3, which gets the m=3 value of k down to 1,700,000. The code below also gives a value of k=38,000 for m=2 (by setting c = 1.07/log(k) and T = 0.48/log(k)); presumably a little bit more optimisation is possible.

k := 1700000;

m := 3;

# These three parameters may be chosen arbitrarily

c := 1.06/log(k);

T := 0.46/log(k);

# tau := 1.14/log(k);

M0 := k*log(k)/(k-1);

g := 1/(c+(k-1)*t);

m2 := int(g*g, t=0..T );

mu := int(t*g*g, t=0..T )/m2;

sigma2 := int(t*t*g*g, t=0..T)/m2 - mu*mu;

tau := 1 - k*mu;

denominator := (1+tau/2)*(1 - (k*sigma2)/((1+tau-k*mu)^2));

a := (r - k*mu) / T;

S := r * (log((r-k*mu)/T) +(k*sigma2)/(4*a*a*T*T*log(a)) ) + r*r/(4*k*T);

Sav := int(S, r=1..1+tau, numeric)/tau;

Z3 := int( g*g*k*t*log(1+t/T), t=0..T, numeric) / m2;

W := int( g*g*log(1+tau/(k*t)), t=0..T, numeric) / m2;

X:= (log(k)/tau) * c^2;

V := (c/m2) * int( g^2 / (2*c+(k-1)*t) , t=0..T, numeric);

U := (log(k)/c) * int( (1+u*tau - (k-1)*mu - c)^2 + sigma2, u=0..1, numeric);

numerator := Sav + Z3 + W*X + V*U;

Delta := (k/(k-1)) * numerator/ denominator;

# this needs to be positive

evalf( 1 - k*mu - T );

evalf( 1 - k*mu - tau );

# this needs to be positive too

evalf(denominator);

# this is the lower bound for M_k

evalf(M0-Delta);

varpi := m/(M0-Delta) - 1/4;

delta := T * ((1/4) + varpi);

`# if this is less than 1, we win`

evalf(1080*varpi/13+ 330*delta/13);

19 December, 2013 at 6:25 pm

Andrew SutherlandTaking the first primes greater than gives for m=3.

19 December, 2013 at 6:42 pm

Andrew SutherlandTaking 10,000,000 consecutive primes starting with 1,040,407 gives an admissible tuple of diameter 179,933,380 for m=3.

(and 1,040,407 is the smallest prime for which this works).

20 December, 2013 at 2:32 am

Andrew SutherlandUsing the Hensley-Richards sieve for k=10,000,000 gives an admissible sequence of diameter 175,225,874 for m=3.

20 December, 2013 at 2:59 am

Andrew SutherlandFor k=1,700,000, taking the first k primes greater than k gives the bound for m=3.

Taking 1,700,000 consecutive primes starting with 205,297 gives an admissible tuple with diameter 27,398,976 for m=3.

20 December, 2013 at 3:07 am

Andrew SutherlandUsing the Hensley-Richards sieve for k=1,700,000 gives and admissible tuple of diameter 26,682,014 for m=3.

20 December, 2013 at 3:16 am

Andrew SutherlandFor k=38,000 a greedy sieve yields an admissible tuple of diameter 431,682 for k=2.

20 December, 2013 at 3:38 am

Andrew SutherlandFor k=38,000 the bound on H can be improved to 430,448.

20 December, 2013 at 8:44 am

Andrew SutherlandFor m=2, k=38000 we can further improve the bound on H to 429,822.

20 December, 2013 at 8:49 am

Wouter CastryckTerence’s maple code with the (hopefully legal?) input m := 2, c := 0.975/log(k), T := 0.97/log(k) seems to work for k = 25819.

[This seems to check out – T.]20 December, 2013 at 11:31 am

Terence TaoUnfortunately I made a typo when entering in the formula for U; the sigma2 term should in fact be (k-1)sigma2, which makes things worse and invalidates Wouter’s optimisations – sorry about that! I was able at least to recover the k=38000 value using Wouter’s values (incidentally I have realised that it is slightly better to view c as a perturbation of 1/log(k), so I now wrote c = 1/log(k) – 0.25 / log(k)^2); similarly I can recover k=1700000 in the m=3 case by taking c = 1/log(k) – 0.07/log(k)^2 and T = 0.877/log(k)^2.

k := 38000;

m := 2;

# These three parameters may be chosen arbitrarily

c := 1/log(k) - 0.25/log(k)^2;

T := 0.97/log(k);

# tau := 1.14/log(k);

M0 := k*log(k)/(k-1);

g := 1/(c+(k-1)*t);

m2 := int(g*g, t=0..T );

mu := int(t*g*g, t=0..T )/m2;

sigma2 := int(t*t*g*g, t=0..T)/m2 - mu*mu;

tau := 1 - k*mu;

denominator := (1+tau/2)*(1 - (k*sigma2)/((1+tau-k*mu)^2));

a := (r - k*mu) / T;

S := r * (log((r-k*mu)/T) +(k*sigma2)/(4*a*a*T*T*log(a)) ) + r*r/(4*k*T);

Sav := int(S, r=1..1+tau, numeric)/tau;

Z3 := int( g*g*k*t*log(1+t/T), t=0..T, numeric) / m2;

W := int( g*g*log(1+tau/(k*t)), t=0..T, numeric) / m2;

X:= (log(k)/tau) * c^2;

V := (c/m2) * int( g^2 / (2*c+(k-1)*t) , t=0..T, numeric);

U := (log(k)/c) * int( (1+u*tau - (k-1)*mu - c)^2 + (k-1)*sigma2, u=0..1, numeric);

numerator := Sav + Z3 + W*X + V*U;

Delta := (k/(k-1)) * numerator/ denominator;

# this needs to be positive

evalf( 1 - k*mu - T );

evalf( 1 - k*mu - tau );

# this needs to be positive too

evalf(denominator);

# this is the lower bound for M_k

evalf(M0-Delta);

varpi := m/(M0-Delta) - 1/4;

delta := T * ((1/4) + varpi);

`# if this is less than 1, we win`

evalf(1080*varpi/13+ 330*delta/13);

20 December, 2013 at 9:02 am

Andrew SutherlandFor k=25819 we can bound H by 283,242.

20 December, 2013 at 10:11 am

Andrew SutherlandThe bound for k=25819 can be improved to 282,792.

20 December, 2013 at 10:46 am

Wouter CastryckFor m = 3, the maple code seems to prove k = 1214999 on input c = 0.995/log(k) and T = 0.877/log(k). This is probably not yet optimal.

20 December, 2013 at 10:56 am

Andrew SutherlandFor k=1700000, m=3, a greedy sieve yields an admissible tuple of diameter 25,602,510.

I can take a look at k=1214999 shortly.

20 December, 2013 at 11:03 am

Andrew SutherlandFor k=1,214,999, taking k consecutive primes starting with 157,831 gives the bound 19,152,478. Using Hensley-Richards gives 18,641,812.

20 December, 2013 at 8:17 pm

xfxieFor m=2, seems k = 36000 is valid at:

c := 0.9785232473863417/log(k);

T := 0.9205181897139338/log(k);

[Seems to check out – T.]21 December, 2013 at 1:51 am

Andrew SutherlandSome further improvements:

For m=2, k=38000 we can get 429,798.

For m=3,k=1,700,000 we can get 25,602,438.

21 December, 2013 at 4:32 am

Andrew SutherlandFor k=36,000 we can bound H by 405,528.

20 December, 2013 at 10:14 pm

xfxieHere is a solution 35146, which is based on the code here (The only differences are at the four lines for the inputs k0, m, c, T).

21 December, 2013 at 4:37 am

Andrew SutherlandFor k=35146 we can bound H by 395,542.

Have you tried optimizing for m=3 (or even m=4)?

21 December, 2013 at 9:03 am

Andrew SutherlandFor m=2, k=35,146 the bound on H can be improved to 395,264.

21 December, 2013 at 10:04 am

xfxieFor k=35146 [m=2], H can be down to 395234.

21 December, 2013 at 10:34 am

Andrew Sutherland395,178 [m=2].

21 December, 2013 at 10:20 am

xfxieJust tried for m=3, it can be down to 1630680.

21 December, 2013 at 10:49 am

Andrew SutherlandThanks! Hensley-Richards then gives 25,527,718 as a bound on H.

21 December, 2013 at 6:34 pm

Andrew SutherlandA greedy sieve brings the m=3 bound with k=1,630,680 down to 24,490,758.

21 December, 2013 at 10:31 am

xfxieFor m=4, k can be at least down to 36000000.

21 December, 2013 at 11:00 am

Andrew SutherlandI was able to get k down to 35,127,242 for m=4, but you may be able to improve this further.

Taking the first k primes greater than k gets the bound on H down to 685,833,596 for m=4.

21 December, 2013 at 11:04 am

Andrew SutherlandHmm, it doesn’t seem to want to post the link. In any case, the parameters I used for m=4 to get k=35,127,242 are

c := evalf(0.9529804688/log(k));

T := evalf(0.9999999999/log(k));

21 December, 2013 at 4:23 pm

xfxieFor m=4, seems k can be decreased to 25589558.

21 December, 2013 at 6:25 pm

Andrew SutherlandNice improvement. Taking the first k primes greater than k=25589558 then gives the bound 491149914 on H for m=4.

Out of curiosity, have you tried m=5? It might be interesting to get a sense of how close to the asymptotic prediction we can get with explicit bounds.

21 December, 2013 at 4:51 am

Andrew SutherlandSo if I just change m to 3 in xfxie’s worksheet and then naively optimize k without changing anything else it appears that we can get 1,640,042 for m=3.

21 December, 2013 at 8:53 am

Andrew SutherlandAfter fiddling with the parameters a bit, it appears that for m=3 we can get k down to 1,631,027.

Hensley-Richards then gives 25,533,684 as an upper bound on H.

21 December, 2013 at 5:04 am

Andrew SutherlandDoing the same thing with m=4 gives k=41,862,295. Taking the first k primes greater than k gives the bound 825,018,354 on H.

18 December, 2013 at 3:46 am

Eytan PaldiThe above computation of (the operator norm of ) relies on the (yet unproved) assumption that it is attained as the corresponding eigenvalue for some smooth eigenfunction of .

18 December, 2013 at 8:21 am

Terence TaoThat’s a good point. Smoothness is unlikely to be an issue, because everything is at least L^2 in regularity, and all the operations involved are linear in the unknown F, so that the theory of distributions will probably be able to make all the ODE manipulations rigorous. Attainment of the minimum at an eigenfunction should be possible through calculus of variations arguments due to the strict gap (which we can prove rigorously, since and the explicit eigenfunction we have tells us that ) prevents escape to high frequencies. More precisely, let be a extremising sequence for the ratio ; we can symmetrise using the parallelogram law (as mentioned in http://terrytao.wordpress.com/2013/11/19/polymath8b-bounded-intervals-with-many-primes-after-maynard/#comment-251870 ) to ensure wlog that the are symmetric; we can also normalise them in L^2 and make them non-negative.

By passing to a subsequence, we can make the converge weakly in L^2. We want to upgrade this to strong convergence in L^2, so that the limit will be a true eigenfunction at the extremal eigenvalue . Because the have uniformly compact support and uniformly bounded in L^2, the only way in which strong convergence fails is escape of mass to high frequencies (a sort of failure of L^2 equicontinuity, in the spirit of Arzela-Ascoli). However, in the high frequency limit, at least one of the two components in goes to zero, and the contribution of the high frequency component is controlled by M_1, which is strictly less than M_2. This can be used to show that escape to high frequences does not occur. (There are some technical details here due to the fact that the high and low frequency components of F_n may leak outside the simplex , but I believe that these issues are manageable by sending the cutoff between high and low frequency slowly to infinity.)

18 December, 2013 at 10:51 am

Eytan PaldiConcerning Aubrey’s question (on systematic dissection of ), a sufficient condition is that for each and each , the interval

for each polytope is either empty or has endpoints which are FIXED linear combinations of – (i.e. each endpoint belong to exactly one bounding plane of .)

18 December, 2013 at 1:27 pm

Eytan PaldiA possible algorithm for systematic dissection of is to check for each and polytope if every in the projection of on the plane orthogonal to axis, satisfies the above condition (i.e. each endpoint of the interval above belongs to a FIXED bounding plane of . Otherwise, should be dissected (by appropriate planes parallel to axis to satisfy the above condition). I think that by repeating this dissection refinement process we get a desired dissection after finitely many steps.

18 December, 2013 at 8:22 pm

Pace NielsenI thought I’d give a quick update on the computational work I’ve been doing, to improve the bound we have.

I’ve tried a number of different things. One was to limit the lengths of signatures being used. It turns out that this still quickly leads to memory limitations, and also does not give good numerics in the ranges where RAM issues are gone.

Second, Frederik Ziebell sent me a very nice program to compute permutations, which is a bit slower than Mathematica, but not nearly as RAM intensive. Unfortunately, that program still eventually runs into memory issues as well!

Third, I tried pushing the degree 10 computations further, but it is clear that is the best result using the methods I’ve already tried.

Fourth, I found a computer nearby which has 64GB of RAM, and ran the program again. Surprisingly, I kept running into problems. It turns out that there is still trouble when we use signatures . In the next day or two, I will throw out this case, and maybe two other problem cases, and get a partial degree 11 result (which should push a bit lower).

It might be the case that I’m just not approaching the problem very well. Here is the computational hang-up:

Let be signatures (non-decreasing sequences of positive integers). Define the monomial symmetric polynomials as before.

We wish to compute . Once the coefficients are found, I put them into a look-up table and the rest of the computation does not present much difficulty.

So, the difficulty is in computing the ‘s. For anyone who wants to try to write a better algorithm than mine, a good test case is when . [I’m a little surprised there is not already a large look-up table of these coefficients out there (at least I can’t find one). I would think this sort of thing would be extremely useful in other contexts.]

18 December, 2013 at 9:26 pm

James MaynardIn an earlier post I think you mentioned looking at polynomials

where and is a polynomial of signature .

Is it not possible to use these where has no 1s in its signature? This should form a basis for all symmetric polynomials, and (if I understand correctly) should avoid the RAM issues (for a bit) since you only have to add signatures which have no 1s in them, and so they should be about half the length.

This still is a bit of a hack; I’m sure there is probably a neater solution.

18 December, 2013 at 10:02 pm

Terence TaoPerhaps one could convert between the basis of symmetrised monomials and the basis of monomials of the elementary symmetric polynomials (in your notation, where there are j 1’s)? The point is that multiplication is trivial in the latter basis (particularly from a memory standpoint, which seems to be the bottleneck). Also the operation of integrating out one of the k variables, say t_k, is not too bad in this basis, because of the recursive identities

and because t_k is being integrated from 0 to .

The price one pays for this of course is that integrating on the entire simplex is unpleasant in this basis, but one could now convert back to the symmetrised monomial basis in which integration is easy. It seems to be that computing the structure constants for is relatively easy if is elementary, so the conversion from the monomials-of-elementary-functions basis to the symmetrised-monomials basis should be fairly straightforward. (The reverse conversion may not actually be necessary, if one stores F in the former basis rather than the latter.) In any case, these conversions should occupy significantly less memory (if one has an N-dimensional space of polynomials, any conversion matrices will take up memory, as opposed to the memory required for the multiplication structure constants .)

For your specific example m(221111111), it looks like this polynomial is an explicit linear combination of , so is a combination of , and each of these should be easily expandable back into ‘s.

19 December, 2013 at 1:46 pm

Pace NielsenJames and Terry, these are both great ideas. I went with James’ idea first since it is incredibly easy to implement inside the code I have already written, and it essentially doubles the degree without much extra work. [The degree 10 case takes only about 1 minute now.]

I first accidentally ran the computation on polynomials of the form , where is a signature of degree with only *even* entries. So this set of polynomials doesn’t quite span the space of all symmetric polynomials of degree 20, but it still seems to give very good numberics (perhaps better time-wise than using the full basis–I’ll experiment with this). I obtained the lower bound .

I’ll continue running the code, lift the restriction on signatures with only even entries, and report more results tomorrow.

19 December, 2013 at 2:01 pm

Terence TaoAh, so we have hit , excellent! It’s remarkable how close the upper bound is holding up (and also reassuring that we are not contradicting that bound).

Incidentally, I changed the attribution for H(k) for small k (up to 171) on the wiki to Clark and Jarvis rather than Engelsma, as their work was earlier. (Engelsma gets optimal values up to k=342, and not necessarily optimal values up to k=800 or so.)

19 December, 2013 at 2:20 pm

Andrew SutherlandThe change in attribution makes sense. I’ll add a note to the entries in the admissible tuples database for k < 172.

If we manage to get k down to 26 we can go back further to Smith's 1957 paper.

19 December, 2013 at 2:37 pm

Eytan PaldiThis agrees with my estimate (based on linear convergence assumption) above that should be about , according to that assumption (with convergence rate of about ), the minimal for which $m_k > 4$ was estimated to be or .

20 December, 2013 at 9:46 am

Pace NielsenDoing as above (considering where is a signature with only even entries, and the total degree is at most 20) yields . [This method is definitely faster than using all monomial symmetric polynomials of a given degree, but that can still be done.] If Eytan’s predictions hold true, this will be the smallest value for which on .

But just to give more data, here are the lower bounds on using all monomials of a given degree, for degrees from 1 to 14.

If any further computation is necessary, let me know.

20 December, 2013 at 9:59 am

Eytan PaldiSince the difference should be very close to (i.e. very nearly ), we expect to be very close to – i.e. about .

20 December, 2013 at 10:42 am

Aubrey de Grey3.987 is indeed very close to what our previous method (extrapolating differences with a 0.8 convergence rate) predicts for d=20. However, allowing d to rise to 26 appears to let M_54 creep over 4 – I trust Pace will be able to test this prediction – so it looks as though 0.8 is slightly over-pessimistic. It seems pretty certain that the corresponding series for k=53 will converge below 4, however (though of course that must also be checked).

20 December, 2013 at 11:04 am

Aubrey de GreyActually I retract my pessimism for k=53. Using a convergence rate of 0.8 as above, the limit of M_54 as d tends to infinity is about 4.004, but the ratio of differences for d > 10 is actually more like 0.83, which translates to a limit extremely close to 4 + 1/54, i.e. to a prediction that M_53 may creep over 4 but only for terrifyingly large d.

20 December, 2013 at 11:19 am

Aubrey de GreyPace – how much difference are you typically seeing for a given k and (double-digit) d between M calculated with all monomials versus only even entries? If the difference (or ratio, or some other simple function) is relatively constant, it may allow the even-entry version to act as a highly accurate estimator of what d (if any) would be needed to get M_53 > 4 using all monomials.

20 December, 2013 at 12:16 pm

Pace NielsenFirst, I should mention that using only even entries in the signature was a mistake on my part (which fortunately sped up the computation quite a bit). My intuition is that it should be better to use signatures with entries limited to for some small , which will keep the speed increase but not be quite so ad hoc.

Here are two data points about using limited signatures: For degree 4, , using all monomials we get while with restricted signatures we get . Similarly, in degree 10 we get the bounds and . So it doesn’t appear that there is a significant difference in the outcome.

Regarding your pessimism, it should be kept in mind that these computations do not continue on at the same convergence rate once the degree gets to be about , because the monomials involved start overlapping more often. (There are fewer monomial symmetric polynomials in due to the lack of available variables.) So I would guess that is not within the realm of possibility–but I may be wrong.

I’ll see if I can’t put together some code to try degree 26 for . (Unfortunately, I do have finals to grade, and a family vacation to take this next week. So it may be a little while before it gets done.)

20 December, 2013 at 12:24 pm

Eytan PaldiI agree with Aubrey’s conclusions. I give below more details

The last three ratios among consecutive differences are:

– showing good stability near Aubrey’s value (0.83) for the estimated convergence rate. Using this rate we have the prediction that should be about (which is safely above the threshold – concerning the small fluctuations of the ratios around ). It seems however that is “too close” to to reliably decide if is above or below (because of the small fluctuations in the above ratios.)

20 December, 2013 at 2:12 pm

Aubrey de GreyThank you Pace. The point about d > k/2 is well taken. The difference between the two versions of M_k for a given d is not so small, though – in fact I think it is enough, especially since it is bigger for d=10 than for d=4, that M_54 will probably not reach 4 for any d (certainly not d=26), if the even-entries version of your code is used. (That assumes that the difference is relatively independent of k, though.) Your new idea of a hybrid that includes only small odd numbers sounds ideal.

20 December, 2013 at 3:44 pm

Aubrey de GreyPace – before you write new code I should mention that if the convergence rate of 0.83 (rather than 0.8) is accurate, M_54 should actually reach 4 with d as small as 19.

20 December, 2013 at 3:50 pm

Aubrey de GreyDamn, apologies – actually not until d=22.

19 December, 2013 at 4:33 am

Aubrey de GreyThe search for an ideal R has thus far proceeded by identifying extra volume that can be added to an R that is already known to be usable. I’m wondering whether there might be promise in the reciprocal option of first identifying a region that provably contains every possible R and then removing polytopes to arrive at a region that works. Would a suitable starting-point for such an approach be the union of k prisms of the form currently being investigated, i.e. a half-sized version of the region Terry describes in http://terrytao.wordpress.com/2013/12/08/polymath8b-iii-numerical-optimisation-of-the-variational-problem-and-a-search-for-new-sieves/#comment-257354 as the (Selberg-specific) minimal constraint for the sumset R+R ? (I’m actually having trouble seeing what prevents the use of that entire region itself as the maximal R, but I’m sure I’m missing something.)

19 December, 2013 at 7:58 am

Terence TaoI had initially used this very region before realising that R+R would be too big, due to the contribution of sums r+r’ where r,r’ lie in different prisms. Pace’s symmetrised region is the largest symmetric convex region available, because if lies in a symmetric R then so does , and so R+R contains , forcing . There is a possibility of using a non-convex R, though one should proceed with caution in this case and recheck all the other steps of the argument.

19 December, 2013 at 8:42 am

Eytan PaldiIt seems that the generalization of Pace’s region for is

with

19 December, 2013 at 9:18 am

Eytan PaldiFor the above region we (obviously) have the upper bound

19 December, 2013 at 11:20 am

Eytan PaldiThe second inequality above is implied (of course) only for – i.e. .

19 December, 2013 at 10:56 am

Aubrey de GreyAh yes, of course – indeed, for k=3 it’s immediate that no R, convex or otherwise, can be bigger than half the unit cube, simply because no pair of points whose midpoint is epsilon away from (1/2,1/2,1/2) can both be in R. I’m not sure whether this generalises to larger k, though, because I’m not clear what the equivalent sumset condition is – I see various possibilities. Could you please state it explicitly?

19 December, 2013 at 11:36 am

Terence TaoFor general values of , the constraint is that R+R must lie in the set

which is the union of k prisms. For the current focus of improving the bounds on H_1 on EH, we are taking , but one could imagine that the case could be useful for improving the Bombieri-Vinogradov level of k (currently at 64), e.g. by using the longer prism , which looks like a promisingly large amount of extra room over .

20 December, 2013 at 2:46 am

AnonymousDear Pro.Tao,

Until now,proving H=2 is possible?Why does my intuition and mental telepathy keep feeling that 95% you can do.I’m really happy to see your answer very much.

Thank you very much,

19 December, 2013 at 2:59 pm

Eytan PaldiThe lower bound for in the wiki page should be (as given by James.)

[Corrected, thanks- T.]20 December, 2013 at 8:18 am

Eytan PaldiCurrently there are several candidates for (and perhaps more future ones) for which this notation (as well as ) is somewhat vague – so this notation should be made clearer.

20 December, 2013 at 1:20 pm

Eytan PaldiIn the asymptotic analysis for the lower bound for in the wiki page, the integral in the numerator of the last upper bound expression for is not sufficiently clear.

[Clarification added -T.]20 December, 2013 at 2:21 pm

Eytan PaldiIt seems clearer to replace by in the integrand

[OK, -T.].20 December, 2013 at 2:05 pm

Polymath8b, IV: Enlarging the sieve support, more efficient numerics, and explicit asymptotics. | What's new[…] for small values of (in particular ) or asymptotically as . The previous thread may be found here. The currently best known bounds on […]

21 December, 2013 at 9:07 am

Eytan PaldiThere are typos in the values of the upper bounds for appearing in the table in the wiki page. There is also a typo in the approximate value of in the last line of that page.

[Corrected, thanks – T.]31 December, 2013 at 11:38 am

John NicholsonCan someone explain the link of 272 to here, from the table with m=1 at: http://michaelnielsen.org/polymath1/index.php?title=Bounded_gaps_between_primes

31 December, 2013 at 12:06 pm

Andrew SutherlandThe link is to Pace Nielsen’s comment in which he notes that M_55 > 4 = 4m. By Maynard’s result (summarized above), this implies that every admissible 55-tuple has infinitely many translates that contain at least 2 primes. The bound 272 comes from the fact that there exists an admissible 55-tuple of diameter 272, e.g. this one.

2 April, 2014 at 12:30 pm

Gergely HarcosIn Section 2 of this post, should be (3 occurrences). Correspondingly,

in the text, “at most ” should be “at least “.

[Corrected, thanks – T.]