The main objectives of the polymath8 project, initiated back in June, were to understand the recent breakthrough paper of Zhang establishing an infinite number of prime gaps bounded by a fixed constant , and then to lower that value of as much as possible. After a large number of refinements, optimisations, and other modifications to Zhang’s method, we have now lowered the value of from the initial value of down to (provisionally) , as well as to the slightly worse value of if one wishes to avoid any reliance on the deep theorems of Deligne on the Weil conjectures.

As has often been the case with other polymath projects, the pace has settled down subtantially after the initial frenzy of activity; in particular, the values of (and other key parameters, such as , , and ) have stabilised over the last few weeks. While there may still be a few small improvements in these parameters that can be wrung out of our methods, I think it is safe to say that we have cleared out most of the “low-hanging fruit” (and even some of the “medium-hanging fruit”), which means that it is time to transition to the next phase of the polymath project, namely the writing phase.

After some discussion at the previous post, we have tentatively decided on writing a single research paper, which contains (in a reasonably self-contained fashion) the details of the strongest result we have (i.e. bounded gaps with ), together with some variants, such as the bound that one can obtain without invoking Deligne’s theorems. We can of course also include some discussion as to where further improvements could conceivably arise from these methods, although even if one assumes the most optimistic estimates regarding distribution of the primes, we still do not have any way to get past the barrier of identified as the limit of this method by Goldston, Pintz, and Yildirim. This research paper does not necessarily represent the only output of the polymath8 project; for instance, as part of the polymath8 project the admissible tuples page was created, which is a repository of narrow prime tuples which can automatically accept (and verify) new submissions. (At an early stage of the project, it was suggested that we set up a computing challenge for mathematically inclined programmers to try to find the narrowest prime tuples of a given width; it might be worth revisiting this idea now that our value of has stabilised and the prime tuples page is up and running.) Other potential outputs include additional expository articles, lecture notes, or perhaps the details of a “minimal proof” of bounded gaps between primes that gives a lousy value of but with as short and conceptual a proof as possible. But it seems to me that these projects do not need to proceed via the traditional research paper route (perhaps ending up on the blog, on the wiki, or on the admissible tuples page instead). Also, these projects might also benefit from the passage of time to lend a bit of perspective and depth, especially given that there are likely to be further advances in this field from outside of the polymath project.

I have taken the liberty of setting up a Dropbox folder containing a skeletal outline of a possible research paper, and anyone who is interested in making significant contributions to the writeup of the paper can contact me to be given write access to that folder. However, I am not firmly wedded to the organisational structure of that paper, and at this stage it is quite easy to move sections around if this would lead to a more readable or more logically organised paper.

I have tried to structure the paper so that the deepest arguments – the ones which rely on Deligne’s theorems – are placed at the end of the paper, so that a reader who wishes to read and understand a proof of bounded gaps that does not rely on Deligne’s theorems can stop reading about halfway through the paper. I have also moved the top-level structure of the argument (deducing bounded gaps from a Dickson-Hardy-Littlewood claim , which in turn is established from a Motohashi-Pintz-Zhang distribution estimate , which is in turn deduced from Type I, Type II, and Type III estimates) to the front of the paper.

Of course, any feedback on the draft paper is encouraged, even from (or especially from!) readers who have been following this project on a casual basis, as this would be valuable in making sure that the paper is written in as accessible as fashion as possible. (Sometimes it is possible to be so close to a project that one loses some sense of perspective, and does not realise that what one is writing might not necessarily be as clear to other mathematicians as it is to the author.)

## 146 comments

Comments feed for this article

17 August, 2013 at 4:01 am

E.L. WistyReblogged this on Pink Iguana.

17 August, 2013 at 5:27 am

mgflaxHave you considered “git” (the collaboration tool initially developed by Linus Torvalds for Linux but now in widespread use) instead of dropbox?

A sample intro is at http://www.sharelatex.com/blog/2012/10/16/collaborating-with-latex-and-git.html and some best-practices are at stackoverflow.com/questions/6188780/git-latex-workflow. Also, dropbox.com is free as long as all documents are publically-readable.

It works very well for text-based files (e.g. .tex files). Each participant would create their own (free) repository on github, and any participant could “pull” updates from any other repository. But I’d assume you’d have a “hub-and-spokes” model where you would pull contributions from other participants and then everyone would pull updates from your repository.

github even has built-in mechanisms so participants could notify you that there are changes in their repository that you should pull, and lots of similar workflow tools.

17 August, 2013 at 7:13 am

Eytan PaldiThe conclusion of theorem 2.6 is missing.

[Corrected, thanks - T.]17 August, 2013 at 7:32 am

Aubrey de Grey[I just posted this in reply to Eytan's comment https://terrytao.wordpress.com/2013/07/27/an-improved-type-i-estimate/#comment-242193 on achieving k0=631 but then noticed that the page has now been marked inactive, so I'm reposting here; my apologies if that is contrary to convention.]

Thank you Eytan. I’m also wondering whether the current Deligne-avoiding k0 of 1788 can be nudged down, though of course this is of more marginal interest to most observers than the headline 632 value.

17 August, 2013 at 9:16 am

xfxieFor the Deligne-avoiding case, seems might be pushed down to 1783 (http://www.cs.cmu.edu/~xfxie/project/admissible/k0/sol_1783__.mpl), if there is no error in my implementation.

[Paper updated to reflect this improved numerology - T.]18 August, 2013 at 3:49 am

AnonymousAf the very end of page 1, “” should be “”.

[Corrected, thanks - T.]17 August, 2013 at 9:23 am

Aubrey de GreyThank you xfxie. I can’t resist noting the entirely gratuitous but possibly amusing fact that if that value holds up, the product of the headline and Deligne-avoiding values of H then creeps below Zhang’s original value! Even 1784 would fail to achieve this.

19 August, 2013 at 10:43 am

Aubrey de GreyActually, on perusing the current draft of the paper, I’ve noticed that even though a whole section is proposed that focuses on computational methods for finding optimal or near-optimal values of H for a given k0, there is not obviously any equivalent intention to include explicit descriptions of methods to find optimal or near-optimal values of k0 for a given (varpi,delta), such as those currently implemented by xfxie. It seems to me that these methods are sufficiently subtle that at least a brief discussion of them in the paper is merited, rather than merely a note that “one may almost pretend that delta vanishes entirely”; apologies if I have misunderstood and this is indeed already intended. In particular, xfxie, I’m interested to understand whether the ideas recently described by Eytan for attempting to reach a k0 of 631 or 630 are in fact encompassed by what your algorithm already does, or whether there is scope for further refinement of these optimisation methods.

[Good idea; I've added a subsection for this. -T.]19 August, 2013 at 11:23 am

Eytan PaldiI’ve just discovered the real obstruction for further reducing to or : In order to make positive,

must be sufficiently reduced (which cause the current upper bound on to grow “too much” – as I found by analyzing its asymptotic dependence on ) – so to compensate this growth one needs to reduce the G-ratio by decreasing – but this would cause the upper bound on to increase “too much” !

It seems that the only way out of this is to improve the upper bound on

(and, in particular, its asymptotic dependence on small

– I’m working on it now!

21 August, 2013 at 8:36 pm

xfxieIt would be my pleasure to help some with the subsection “Optimizing the parameters”.

18 August, 2013 at 8:57 am

Eytan PaldiIn the second column of table 2, should be .

[Corrected, thanks - T.]18 August, 2013 at 12:23 pm

James HilfertyYour last paragraph is all too true; as sometimes you can’t see the wood for the trees; so could you give a hint as to what you are all trying to achieve? I am really only interested in financial maths and am not much good at anything else. However in my domain the fact that exponential (e) is to the wrong power no longer matters since real interest rates (for the financial institutions) is zero; (e) times zero = 1 so the real world is just as mad as the formula. Is this true?

18 August, 2013 at 2:38 pm

Cancer_doctorWhat these brilliant mathematicians are trying to do is solve a 3000 year-old problem which is to prove that no matter how high up in the numbers you look you will always find some twin primes. Primes are whole numbers that are exactly divisible only by themselves and one; an example is “13.” Twin primes are primes separated by 2 as in 11 and 13, or 17 and 19.

The “twin prime conjecture” is so difficult that no real progress was made in solving it until about 6 months ago when an unknown mathematician made a huge start by using a great deal of abstract mathematics developed in the 20th century. These gentlemen, some of the finest minds in the world by the way, are trying to understand that paper on a deep level and improve on it, which they certainly have done. My opinion is that understanding why such a simply stated problem is so enormously hard to prove is the real work that is being done here. Perhaps we, the human race, are ignorant of some terribly important branch of mathematics, or it may be that some seemingly “simple” problems require immense complexity to explain. Perhaps these gentlemen will find out.

With regard to your question about (e) times zero, any (finite) number times zero is zero, not 1. What you probably meant was (e) to the zeroth power, an entirely different thing, which is, indeed, 1. Even simple business math like computing compound interest would not work correctly without knowing this. Fortunately for us, mathematicians hundreds of years ago did all that hard work for us, just as these mathematicians are doing for future generations.

18 August, 2013 at 9:01 pm

Terence TaoI’ll be travelling for the next few days, but hope to finish up a draft version of Section 2 (which lists the key subclaims needed to get bounded gaps between primes, and the connections between them) soon. I don’t plan to touch other sections of the paper yet (not that there is much content to any of them at this stage).

One organisational issue regards how to deal with the van der Corput estimates. There are two places where we use a van der Corput bound: firstly for the “Deligne-free” Type I estimates (and also the Type II estimates), where we have to bound sums of the form

(1)

for some smooth modulus , some rational function and some smooth cutoff . These sums are bounded by a combination of the q-van der Corput A-process of Heath-Brown and Graham-Ringrose, the completion of sums method, and the Weil conjectures for curves. Details can be found in Section 1 of https://terrytao.wordpress.com/2013/07/07/the-distribution-of-primes-in-doubly-densely-divisible-moduli/ .

But then for the improved Type I estimates we use the van der Corput bounds again for sums of the form

(2)

where is a Kloosterman-type sum, for some rational function of two variables. These sums are estimated by the same methods as the previous sums, but with the Weil conjectures for curves replaced by Deligne’s “Weil II” theorem on trace weights. Details can be found in Sections 2 and 3 of https://terrytao.wordpress.com/2013/07/27/an-improved-type-i-estimate/ .

Strictly speaking, the estimation of the sums (1) are a special case of the estimation of the sums (2), so we could prove the stronger statement about estimation of (2) first and then obtain (1) as a special case. But this would make it more difficult to disentangle which results relied on Deligne’s theorems and which ones did not. Another approach is an axiomatic one (which I tried to set up in the previous post), in which one posits a class of “structured functions” obeying some axioms, proves an abstract van der Corput result for such structured functions, and then specialises to the class of rational phases to prove (1) and then to the class of trace weights to prove (2). A third approach would be to prove (1) directly, and then repeat the arguments to prove (2); this would lead to a slightly longer text but it may be clearer to follow, particularly if we encapsulate all the properties about trace weights that we need for (2) into a single proposition, and if we make the proof of (2) resemble the proof of (1) as much as possible. I’m now inclined towards the final method (even though I was initially disposed to the axiomatic approach), but perhaps others may have opinions on this point as well.

19 August, 2013 at 5:15 pm

AnonymousRegarding the paper: If you are interested, I can typeset the tables properly for you; I really don’t like how they look now. (On a second note, the in the web address is missing.)

[Tilde added. If you would like to contribute, let me know and I can give edit access to the Dropbox. -T]21 August, 2013 at 9:03 pm

Anonymous@Terry: I’m not a mathematician (just interested in math) so I don’t understand the details in the paper at all; I use LaTeX and just saw that the tables are not very well typeset.

With that said, I have now typeset the two tables in a seperate document so I can either send it to you by mail or you can give me edit access so that I can put it in the Dropbox folder myself.

21 August, 2013 at 9:17 pm

Anonymous@Terry: Files “tables.tex” and “tables.pdf” added to folder.

[New version of tables implemented, thanks - T.]20 August, 2013 at 7:16 am

Wouter CastryckA couple of comments (some of them may be too editorial for this early stage, so please ignore accordingly):

* in the abstract: –>

* statement of Theorem 1.3: –>

* section 2.1: (and hence –> brackets don’t close

* section 2.1, definition of discrepancy, shouldn’t there be a premise of the kind “whenever the sums converge.”?

* statement of Theorem 2.6: there’s an “11″ at the end

* statement of Claim 2.8: what’s here? should it be recalled that ?

* discussion following Theorem 2.10: “maybe work out exactly what this gives” –> I found the value (a value which we encountered before; the contribution of $\kappa_1$ and $\kappa_2$ is negligible in this region), but maybe this should be double-checked

* same discussion: –>

* same discussion: I didn’t fully grasp the last phrase about the tradeoff

[Corrections implemented, thanks - T]20 August, 2013 at 10:56 am

Andrew SutherlandI would vote to include 34429 and one other representative value of between 1783 and 34429, say 5453, in Table 1. In section 3 (which I am working on now) I plan to include a small portion of the benchmark table http://michaelnielsen.org/polymath1/index.php?title=Finding_narrow_admissible_tuples#Benchmarks, and it would be helpful to have a few values between 1783 and 181000.

[I've added 34429 to the table; feel free to make further additions, as I will not be making many edits in the next two or three days. -T]20 August, 2013 at 8:18 am

Welcome Back! | OU Math Club[…] This has been a polymath project and they are now in the stage of writing up their calculations. You can read all about it on Dr. Tao’s blog. […]

23 August, 2013 at 1:29 pm

Eytan PaldiIn the third column of table 2, the exponent of should be .

[Corrected, thanks - T.]24 August, 2013 at 10:50 am

Eytan PaldiThe two lines below table 1 should belong to theorem 2.3.

24 August, 2013 at 12:26 pm

Terence TaoHmm, actually they are already part of that theorem; LaTeX for some reason automatically placed the table within the statement of the theorem. It may be that this problem will sort itself out as the paper evolves. (I think right now, tables are placed at the top of a page, which limits the ability to move them around.)

Speaking of this table: I have added some new values of to this table, arising from the theoretical best possible values of for selected values of arising both from the Bessel-function cutoff and the older GPY polynomial cutoff, namely (for the best possible ), (for , our best value), (for , the best Deligne-free value), and (for ; these values were in fact encountered early in the polymath project). This seems to give reasonably good coverage of the various orders of magnitude, although I don’t actually have tuples data for and yet.

24 August, 2013 at 12:49 pm

xfxieFor 7140, I can get 69280.

http://www.cs.cmu.edu/~xfxie/project/admissible/admissible_7140_69280_4678.txt

For 341640, seems the memory usage need to be optimized in my current code.

[Added, thanks - T.]24 August, 2013 at 1:36 pm

Terence TaoBy the way, thanks for starting the optimization section; I linked your new file optimize.tex to the current file gpy.tex so that they can both go into the PDF while being edited separately, as I am starting to work on the latter file (having almost finished the work on subtheorems.tex).

One other thing: if your optimization routines are available, would it be possible to see what the best value of one can deduce from the best known constraint if one had to use the older implication in which , i.e. Theorem 6 of https://terrytao.wordpress.com/2013/06/18/a-truncated-elementary-selberg-sieve-of-pintz/ ? I think it would be good to illustrate how much of a gain the more complicated implication (with , etc.) gives.

24 August, 2013 at 3:23 pm

xfxieIf still use , can be obtained.

http://www.cs.cmu.edu/~xfxie/project/admissible/k0/sol_776.mpl

Please check if the setting is OK.

24 August, 2013 at 3:45 pm

Terence TaoThanks! Actually in the case, the quantity (and hence ) should vanish and should be irrelevant; there seems to be some little glitch in the maple code that doesn’t see this, but I redid the calculations without and they are of course fine.

24 August, 2013 at 4:52 pm

xfxieActually, I was also confused about the realization of . Here I copy the latex description in Theorem 2.13.

If , then , which means the integration term is 0. Then it seems , which leads to .

This is quite strange. So I checked the original blog:

https://terrytao.wordpress.com/2013/06/18/a-truncated-elementary-selberg-sieve-of-pintz/. In Theorem 5, is still 0 if . But afterward, has been replaced by a new upper bound , which is no longer 0.

24 August, 2013 at 6:21 pm

Terence TaoAh, that explains what was going on. I guess this is confirming Eytan’s suspicions that our estimation of is not entirely efficient…

24 August, 2013 at 12:58 pm

Andrew SutherlandFor 341,640 the records page lists 4,656,298 (but I’m sure this can be improved).

[Added, thanks - T.]24 August, 2013 at 12:20 pm

Eytan PaldiTable 1 should be located just before section 2.1 and table 3 just before section 2.2.

24 August, 2013 at 1:20 pm

Eytan PaldiNow Table 1 should be just before sec. 2.2 and table 3 just before sec. 2.3.

24 August, 2013 at 5:03 pm

Eytan PaldiSome remarks on the optimization problem:

The optimization problem (for each choice of ) can be formulated in terms of alone (as the “free” parameters), since optimal and are explicitly determined by given as follows:

1. In order to maximize (to check if it can be made positive), we see that its maximization with respect to gives the supremum of with respect to its constraints (the supremum is allowed since the criterion is based on strict inequality!) which gives optimal for the maximization of .

2. Order to optimize A, we need to minimize or to minimize the exponent in the bound for for .Denote this exponent by . It follows (by differentiating under the integral sign) that

and on (hence is strictly increasing on and tending to as . Consider the two cases:

(i) : In this case (since is increasing) the optimal A is 0.

(ii) : In this case the optimal A is the (unique) positive A for which psi'(A) = 0.

Since on , as can be verified by differentiating under the integral sign, it follows from the concavity of that Newton iteration for the positive root of should converge (very fast) to the optimal when starting with initial $A = 0$.

Remark: It seems that the simpler iteration

with initialization is converging (linearly) to the optimal value of .

24 August, 2013 at 10:04 pm

Eytan PaldiOn simple and very good approximation of :

Let for .

( which can be any positive number plays the role of )

Since , it follows that

as . Moreover, g is strictly decreasing (the proof, to be posted later, depends on a differential equation satisfied by ) – implying that (defined as the minimal positive integer for which ) is uniquely determined as the smallest integer greater than where is defined for each as the unique positive satisfying

.

Using the asymptotic expansion

Where is the first positive zero of

and .

It follows that

Where is a third degree polynomial

, where

It follows that a very good approximation for is given by

Examples: gives

gives

This surprisingly good approximating accuracy can be explained by explicit error bounds (to be posted later). It should be remarked that the accuracy should becomes better for larger (the approximation error is typically 0 or 1.)

[I added a footnote about this to the paper, though I only stated the leading order term explicitly - T.]25 August, 2013 at 8:56 am

Gergely HarcosThe last two lines in Table 3 do not agree with my SAGE calculations. For I get and . For I get and .

The discrepancy probably comes from different values for . I am using 100-bit precision in SAGE and the precise value for , not an approximation. I get and . Can you please double check these values? I copy some minimal SAGE code below.

R = RealField(100); R

k=632

print R(find_root(Bessel(k-2, typ=’J’),k,k+2*k^(1/3)))

k=1783

print R(find_root(Bessel(k-2, typ=’J’),k,k+2*k^(1/3)))

25 August, 2013 at 9:24 am

Gergely HarcosI also think that in the second column we should have , because yields .

25 August, 2013 at 11:10 am

xfxieIn Maple 16, k0*=1781

BTW: j_1779 = 1801.571309

25 August, 2013 at 1:02 pm

Gergely Harcosxfxie is right, I take this last comment (about ) back. For some reason I did not work with but with the value decreased by the final value of . So as in the paper. My original comments of 8:56am remain unchanged.

25 August, 2013 at 10:42 am

xfxieSimilar results in Maple 16:

j_630=646.0292074, j_1781=1803.579701

For the last line: 1.6812E-6, 3.6654E-6

25 August, 2013 at 1:37 pm

Terence TaoGood catch! I think I sometimes confuse Maple when cut-and-pasting code repeatedly, but I am now getting essentially the same results as you and xfxie here, and have updated the table accordingly.

25 August, 2013 at 3:28 pm

Gergely HarcosIndeed my mistake at 9:24am was due to copy+paste in SAGE…

25 August, 2013 at 10:57 am

Eytan PaldiIn the proof of theorem 2.14, the upper bound for should be updated to the current bound (with the exponent in the integrand) – by using the current bound for .

(which is based on the sub-multiplicative property of f and its log-concavity.)

[Oops, I had forgotten about this improvement! I think it is fixed now. - T.]25 August, 2013 at 3:10 pm

Eytan PaldiIn the proof of lemma 4.6, it should be (instead of ) -two places.

[Fixed, thanks. I also realised that was used for two different things in this section, so I changed here to . -T]25 August, 2013 at 3:44 pm

Eytan PaldiIn the proof of theorem 2.14, I think that the definition of should be dependent on i.

[Fixed, thanks - T.]25 August, 2013 at 4:06 pm

Gergely HarcosThis is not really important, but a slightly cleaner set of parameters for the second column of Table 3 would be , , . Then , , , , . Finally, , , , and .

[Replaced, thanks - T.]26 August, 2013 at 3:11 am

Gergely HarcosThanks for updating the table! Can you please double check the value of in the first column? SAGE gives me which differs from the given value in the third decimal. The Bessel zero is , while with the SAGE command

print R(numerical_integral(exp(-(A+2*alpha)*t)/t, deltat, theta, eps_abs=1e-20, eps_rel=1e-10)[0])

[I didn't verify this yet, but I put in your corrected version. I was using the off-the-shelf Maple numerical integration which may have had some trouble with precision here. -T.]26 August, 2013 at 4:57 am

Eytan PaldiIt is possible to check the accuracy of the numerical integration for this integral by expressing it in terms of the exponential integral function E_1.

26 August, 2013 at 5:20 am

Gergely HarcosSAGE and Mathematica agree (after increased precision) on 16 decimal digits: .

26 August, 2013 at 5:57 am

Gergely HarcosThanks for suggesting the exponential integral function . With it we can express the integral as , and now SAGE and Mathematica agree on 27 digits (I could not get more digits from Mathematica): .

26 August, 2013 at 11:23 am

commentNot sure if this helps but http://mpmath.googlecode.com/svn/trunk/doc/build/functions/expintegrals.html#e1 is available from mpmath on python.

25 August, 2013 at 5:34 pm

AnonymousI have updated the file subtheorems.tex in the Dropbox folder to get the correct alignment in the tables. (I haven’t compiled so nothing is changed in the PDF file yet.)

[Thanks! There was an issue with the \frac{}{}'s not compiling, but I put dollar signs around them and it works OK now. - T.]26 August, 2013 at 1:07 am

AnonymousRegarding table 2: I think the readability will improve if one used “” instead of “” (as done for the parameter in table 3).

[Changed, thanks - T.]26 August, 2013 at 2:07 am

AnonymousAnother comment: If the package hyperref is loaded (as the last package!), one can use “\url{http://math.mit.edu/~primegaps/tuples/admissible_632_4680.txt}” to get a proper tilde. Furthermore, with hyperref, all the references in the document become interactive.

[Ah, nice! I had previously had problems with hyperref but I found out that this was because some of the .aux files I had were built without hyperref leading to some weird conflict. -T.]26 August, 2013 at 4:37 am

Eytan PaldiRemarks for the optimization of parameters:

1. In the reduced model (4.55), the parameter still appears as a “free” parameter, while it can easily be made dependent on (as described in my previous comment above – with a very simple and fast iteration to find it) – reducing the search space to only two parameters.

2. In the last line of table 4, the values of are seemingly their current values (before optimization!)

26 August, 2013 at 7:29 am

xfxieThanks for the remarks.

1. will add some text about the dependence property.

BTW: I will try to implement this improvement if we want a full description. (Assuming the policy that the methods that we described should at least have been actually implemented and tested before the paper is finished).

2. Changed.

26 August, 2013 at 11:20 am

Eytan PaldiMore comments:

1. The integral in the upper bound for is almost independent on . More precisely, this integral is expressible in terms of the exponential integral function (see Gergely’s comment above), so in the cost function this integral may be approximated (from above) by

(i.e. the upper bound for is increased very slightly – as well as the resulting cost function.)

For this modified cost function, the optimal value of is determined as the (unique) root in of the equation

(which follows from case (ii) of my comment above.)

It seems (still without proof) that this root can be found by iterating this equation starting with (probably, the proof has to show that this iteration is a contraction map from into itself.)

2. It follows that the optimal is close (from below) to

(which initialize the iteration above.)

3. Since (in practice) we can use (5.1.11) in A&S to get the following upper bound for

Where and is Euler's constant.

Note that this bound shows that for sufficiently small , the upper bound on is roughly proportional to – so any reduction of must be compensated by reducing the G-ratio factor (of Bessel functions) in the bound (by appropriate reduction of ) – which unfortunately should increase the bound! (this shows the trade-off during the optimization of and !)

5. In order to get a truly unconstrained optimization, I suggest to represent in an interval as – where is the new parameter (this has the added advantage that the order of magnitude of remains near 1) and similarly for

.

26 August, 2013 at 2:38 pm

Gergely HarcosIt is a good idea to use the you indicate when performing an exhaustive search. Note that symbolic software packages can quickly and reliably solve the equation , just as they can find the relevant Bessel root, so in practice there is no need to analyze the equation and its (unique) solution further.

For example, in SAGE one can use the command

A=find_root(x==(k-1)*exp(-deltat*x),0,k-1)-2*eta

This would tell that that in Table 3, keeping the indicated values and , the optimal would be (resp. ), yielding (resp. ).

26 August, 2013 at 4:45 pm

Eytan PaldiI just did this with Mathematica (but with the maximal as determined by ) which gives – which gives almost the same as for and about reduction in the value of (relative to with the original ).

26 August, 2013 at 4:49 pm

Eytan PaldiI meant % reduction in the value of (latex eliminates the % symbol.)

26 August, 2013 at 7:24 pm

Gergely Harcos@Eytan: For Table 3 I recommended round values of for expository purposes. They were not meant to be optimal in any sense except that they were good enough to provide the smallest admissible value of (as far as we know it).

26 August, 2013 at 7:51 pm

Eytan PaldiI agree that in table 3, may have “rounded” values, (provided that they are compatible with ), but in table 4 they should be nearly optimal (which is still not the case for the last line – which has the same parameter values as in table 3.)

26 August, 2013 at 7:58 pm

Aubrey de GreyIt seems to me that the main thing to be illustrated in this part of the paper is how close the best values of A, the various kappas, etc., are to achieving an improved k0 – in other words, how robust is the conclusion that the current k0 cannot be improved. Hence, maybe what is needed in the tables is to give the smallest value of eps2-eps (or some other critical quantity?) yet obtained for k0=631, k0=1782 and so on, along with the precise values of A, the various kappas, etc., which produce that value. Perhaps, in order to give a sense of scale, it also makes sense to give the best (negative) value of eps2-eps for the current k0 (i.e for 632, 1783 etc.), but maybe that’s overkill.

26 August, 2013 at 8:12 pm

Eytan PaldiI meant table 5 (it was table 4). I agree that more information on the size of the cost function for the optimal parameters is needed for (in particular, because I know of some variants of the bounds for the ‘s which may lead (with small improvements of the bounds) to or even .

28 August, 2013 at 8:22 pm

xfxieI have put most of the improvements above into the “optimization” part.

About Table 5, the current parameters are mainly from previous results listed in the record page (unless for solutions that are not optimal). Since this problem is essentially a satisfaction problem, I am not sure how important to put “optimal” parameters for these optimal k_0 values.

I do agree that it might be useful to generate a “failure” table for optimal parameters for , and reports all errors, which might provide some hints for possible further improvements.

29 August, 2013 at 11:43 am

Eytan PaldiSince for each given , its corresponding parameters are termed as “optimal” in table 5, they should be optimized (for that $k_0$ value.)

Another reason for the importance of this optimization is that the optimized cost function for might be “sufficiently small” to enable further reduction of by slightly reducing the dominant (i.e. ) by slightly improving its (probably crude) current bound. So it is very important to know the size of the optimized cost function (for and .)

BTW, I have some ideas how to improve (perhaps slightly) the values of the ‘s – using some simple new bounds for them (which perhaps may lead to further reduction of the values – provided that the size of the corresponding cost function is “sufficiently small”!)

30 August, 2013 at 8:12 pm

xfxie@Eytan: I have generated such a table for , (k0opttable.pdf in the dropbox). Both normal and A-free model were tried, the results were similar. Among the three error terms, was always the biggest.

31 August, 2013 at 6:42 am

Eytan PaldiThank you! Can you please do it also for (I think we still have a chance to reach it), and in addition the resulting values of (for optimized ?) thanks!

BTW, should (by its definition) for sufficiently large be small compared to .

I think also that (after optimization) and should have comparable sizes.

31 August, 2013 at 7:31 am

xfxieJust updated the table.

For 632: , , are 3.01E-06, 3.40E-08, 9.89E-08, respectively.

For 631: , , are 3.01E-06, 3.40E-08, 1.03E-07, respectively.

BTW: Seems $\latex k_0=630$ is feasible if is ignored from the model.

31 August, 2013 at 8:07 am

Eytan PaldiInteresting! but I’m not sure that (for ) these are the optimal values (in particular seems to be “too large” to enable negative cost function – even with zero ‘s!) – Please check if the upper bound on is correct and satisfied!

Based on some preliminary analysis, I suggest to optimize in the interval .

31 August, 2013 at 8:11 am

Eytan PaldiI meant the interval .

31 August, 2013 at 8:25 am

Eytan PaldiPlease note that

1. In the second table appears as .

2. The values of in the last two lines

(i.e. ) are the same.

31 August, 2013 at 8:59 am

Gergely Harcos@Eytan: I think xfxie’s optimized values for k=632 and k=631 are correct. For k=631 the optimum is about

objective = 0.000040614908058444562410709227127

delta = 1/10892 = 0.000091810503121557106132941608520

deltap = 1/119 = 0.0084033613445378151260504201681

A = 194.01650795568561574098481763

By comparison, for delta=1/17000, 1/18000, 1/19000, 1/20000 the optimal choice of deltap=1/n (n an integer) gives

objective = 0.015653695596778259119692041036

delta = 1/17000 = 0.000058823529411764705882352941176

deltap = 1/262 = 0.0038167938931297709923664122137

A = 227.54976810616371534477281350

objective = 0.050726720125176960294660949986

delta = 1/18000 = 0.000055555555555555555555555555556

deltap = 1/299 = 0.0033444816053511705685618729097

A = 229.23953230877909430771893788

objective = 0.15427791851124266114834442323

delta = 1/19000 = 0.000052631578947368421052631578947

deltap = 1/343 = 0.0029154518950437317784256559767

A = 232.39141344513906955169574324

objective = 0.43652800188855623135923348342

delta = 1/20000 = 0.000050000000000000000000000000000

deltap = 1/396 = 0.0025252525252525252525252525253

A = 235.27252623541871383847962100

31 August, 2013 at 9:18 am

Eytan PaldiThe problem is that the current (optimized?) value of for is too large to enable feasibility (i.e. negative cost function) = even for zero ‘s!. It is easy to calculate the upper bound on to enable feasibility with zero ‘s (it is about 1/15000 -smaller than the current value of ) – this is the reason for my suggestion to upper bound by 1/17000.

Since the possible feasibility region (if exists) seems to be quite small, the optimization of the cost function should be done very carefully (the examples you gave do not indicate that the current value is optimal.)

31 August, 2013 at 9:30 am

Eytan PaldiClearly, by starting to slowly reduce below its current value (and for each value optimizing and as in your example, you can see the cost function starting to decrease (up to a certain “optimal” – perhaps less than 1/17000) – the values of and corresponding to the minimal cost function should be of comparable sizes.

31 August, 2013 at 10:29 am

Gergely HarcosEytan, it is true that needs to be at least about in order to make positive, when . Yet, the difference of this expression and is maximized for and , and it is still negative there. In the range you suggest, the mentioned difference (as maximized in ) is a decreasing function of :

delta = 1/17000 = 0.000058823529411764705882352941176

deltap = 10/2616 = 0.0038226299694189602446483180428

difference = -0.015625686235754007815067521685

delta = 1/17001 = 0.000058820069407681901064643256279

deltap = 10/2617 = 0.0038211692777990064959877722583

difference = -0.015644105635903322436233767115

and so on. That’s life. Perhaps you want to study another extremum problem.

31 August, 2013 at 10:33 am

Eytan PaldiIt may be possible for , that the current values of as optimized by xfxie are indeed optimal. If this is the case than the only chance for improving to is by improving the current upper bounds on the ‘s – I’ll try to check this possibility (I found some examples of such new bounds.)

31 August, 2013 at 11:02 am

Gergely HarcosI should have posted my previous comment of 11:00 am at this level, not one level up. Sorry about that.

31 August, 2013 at 11:00 am

Gergely HarcosHere is more data showing that for the extremum is at and :

delta = 1/10895 = 0.000091785222579164754474529600734

deltap = 1000/119043 = 0.0084003259326461866720428752636

A = 193.97915262578728328433413956

difference = -0.000040614772581912220284280584210

delta = 1/10896 = 0.000091776798825256975036710719530

deltap = 1000/119059 = 0.0083991970367632854299129003267

A = 194.09113855858381197092882324

difference = -0.000040614764639407839935052223726

delta = 1/10897 = 0.000091768376617417637881985867670

deltap = 1000/119075 = 0.0083980684442578207012387150955

A = 193.99409796303702518114647794

difference = -0.000040614771964712124471218772054

31 August, 2013 at 1:35 pm

Eytan PaldiI agree!

26 August, 2013 at 8:32 am

Gergely HarcosIn the sixth line below Theorem 2.20, should be .

[Corrected, thanks - T.]26 August, 2013 at 11:58 am

Eytan PaldiIn the proof of lemma 4.6, a (right) parenthesis is missing at the end of the third line.

[Corrected, thanks - T.]26 August, 2013 at 12:19 pm

Eytan PaldiIn the proof of theorem 2.14, it seems that in the cases (i) – (iii), should be replaced by .

[Fixed this for (iii); I think it is OK as is for (i), (ii). -T]27 August, 2013 at 11:00 am

Eytan PaldiIn theorem 2.14, the condition can be relaxed to . (Indeed, in case (i) of my comment above, the optimal is .

[Corrected, thanks - T.]28 August, 2013 at 2:39 am

Anonymous@Authors: I have left the Dropbox folder. (If some LaTeX’ing is needed, I’ll try to do my best after being re-invited.)

28 August, 2013 at 8:27 am

Eytan PaldiIn the third line below (4.43), the statement that “both sides vanish” should be corrected.

[Corrected, thanks - T.]28 August, 2013 at 5:43 pm

Eytan PaldiA typo: it should be “side”.

[I was unable to locate this correction, sorry - T.]29 August, 2013 at 3:20 am

Eytan PaldiI meant that In the third line below (4.43), it should be “side” (instead of “sides”).

[OK, corrected now, thanks - T.]28 August, 2013 at 3:20 pm

Aubrey de GreyI’m wondering whether there is a case for concluding the paper with a short section summarising the current understanding of what avenues are available – and, critically, of how promising they seem – for further progress on H. It seems to me (though I concede that this may be because my own field, biology, may simply have different conventions for the structure of papers) that the paper as it stands not only lacks such a synthesis but also is overall rather patchy in terms of describing which avenues are and are not promising.

Most conspicuously, while the reader already comes away with a nicely clear understanding that there is negligible (if any) scope for further reducing H for relevant values of k0, s/he cannot draw comparable conclusions in respect of the best k0 for a given varpi. To clarify: I am not referring to how best to minimise delta by various trickery involving error terms, about which there is a vibrant ongoing discussion here. Rather, what seems to me to be missing is discussion of the potentially much more powerful option of improving the relationship between varpi and “k0*”, most especially the critical exponent -3/2 (originally -2) in that relationship.

I feel that similar patchiness currently exists in respect of varpi itself: there is proposed to be a nice discussion of why varpi cannot readily be improved by reducing sigma (the “Type V” obstacle), but it is striking that nothing equivalent is proposed to be said about obstacles to further relaxing the current Type 1 boundary, given that that is so overwhelmingly the quantitatively best single option for increasing varpi (in the sense that big improvements of varpi without progress on Type 1 would result only from big progress on both Type III and sigma).

Finally, maybe there should be a brief mention – here perhaps all that is needed is pointers to the key literature – of what makes the upper bound of 1/4 on varpi (leading to the lower bound of 16 on H) so impregnable.

28 August, 2013 at 4:23 pm

Terence TaoHmm, that sounds like a good idea; I’ve added a stub of a section on possible improvements to the end of the paper. Of course we would also be inserting remarks at various junctures of the paper when some further optimisation looks likely or unlikely, but it does make sense to summarise all that in one concluding section.

Regarding the varpi/k_0^* relationship, this is more or less the focus of an existing paper of Farkas, Pintz, and Revesz: http://www.renyi.hu/~revesz/ThreeCorr0grey.pdf . We don’t really have much new to add to what they say (which is that the relationship we have is the limit of what one can do just by optimising the cutoff function in the Goldston-Pintz-Yildirim sieve), but we can perhaps recall more prominently some of their conclusions here.

The Elliott-Halberstam conjecture is known to fail at the endpoint varpi = 1/4; I don’t know offhand the reference where this is first established (maybe in the original Elliott-Halberstam paper?) but I agree this is worth pointing out in the paper.

29 August, 2013 at 1:33 pm

Terence Taop.s. I’ve just learned that the failure of the endpoint Elliott-Halberstam conjecture was established in a paper of Friedlander-Granville (http://www.ams.org/mathscinet-getitem?mr=1220459), with a stronger version of that failure established in a followup paper of Friedlander-Granville-Hildebrand-Maier (http://www.ams.org/mathscinet-getitem?mr=1080647).

28 August, 2013 at 4:52 pm

Eytan PaldiA very minor correction: In the line below (2.5), the constant is in fact

– so I suggest to display it as

[Corrected, thanks - T.]28 August, 2013 at 5:20 pm

Derek OCongratulations to everyone who contributed to this project. As an undergraduate college student (with research in number theory), I was following this since the day it began. This is truly spectacular. Well done, you guys.

28 August, 2013 at 5:41 pm

Eytan PaldiA typo: in the tenth line of the abstract, it should be “application”.

[Corrected, thanks - T.]28 August, 2013 at 5:56 pm

Andrew SutherlandI have posted an updated version of narrow.tex to Dropbox that includes essentially everything I wanted to put in to section 3, except for the lower bounds subsection 3.8, which still needs to be written. I am heading off to a conference tomorrow and won’t have time to work on it for a while, so I am happy to turn over the file at this point if someone else wants to take a crack at drafting 3.8 (Wouter?).

I added a new subsection 3.1 that describes an asymptotically faster admissibility testing algorithm that turned out to be critical for handling the larger values of k (especially 3500000). The upper bounds on H for k=34429, 181000, 341640, and 3500000 have now been improved to 386344, 2323344, 4601036, and 55233504, respectively (see Table 4).

These values should also be updated in Table 1.

One thing I thought about putting in, but didn’t, is an updated version of the graphic http://math.mit.edu/~drew/hw2data.gif, which is related to the discrepancy between the prime tuples conjecture and the second Hardy-Littlewood conjecture (which is mentioned in the article but isn’t directly relevant to bounding prime gaps). But if people think it should be included, I would be happy to generate an updated version.

28 August, 2013 at 8:55 pm

Terence TaoJust a small update on my end; I have more or less finished editing subtheorems.tex, gpy.tex, and heath-brown.tex for now, and plan to start working on exponential.tex in the next week or so, although I will be traveling and having some other commitments for the next two or three days, so this may take a while to get going. I hadn’t realised that the net length of the paper is likely to exceed the 100 page mark; somehow it seemed smaller than this when split up into individual blog posts! This may complicate the question of where to submit this paper (or whether to convert this paper into a monograph), but perhaps it will be clearer how to proceed on that front as the paper continues to be fleshed out.

29 August, 2013 at 3:46 am

Andrew SutherlandFYI, I just updated Table 1 in subtheorems.tex to incorporate the improved bounds on H noted above.

29 August, 2013 at 9:03 am

Gergely HarcosIn fact the blog posts and the comments make up more than 600 pages when printed and joined to a pdf file!

29 August, 2013 at 1:31 am

Wouter CastryckSure, I can help with writing Section 3.8.

29 August, 2013 at 3:39 am

Andrew SutherlandThanks. I just fixed a typo (3451 changed to 3461), but I’ll hand the file over to you know.

29 August, 2013 at 9:40 pm

AnonymousTypograhpical comment: For maps, one should use “\colon” instead of “:”, i.e., instead of ; this gives the correct spacing.

29 August, 2013 at 11:49 pm

Terence TaoI’m reporting here some simplifications found by Philippe Michel (and confirmed by Emmanuel Kowalski and myself) to one of the main conclusions of the Deligne machinery needed for our project, namely the fact that the trace weight

for any with c,l non-zero has bounded conductor and is almost orthogonal to all quadratic phases in the sense that

(*)

for all . Currently we are using some rather general ell-adic machinery to establish the bounded conductor property, and Hooley’s criterion on two-dimensional exponential sums to establish the almost orthogonality. However, it turns out that one can manipulate K by some Fourier analysis to avoid both of these steps, instead using the more standard theory of Deligne, Laumon and Katz on “Fourier transforms” of sheaves. The first observation is that the properties required on K (namely, bounded conductor and almost orthognality to quadratic phases) are unaffected by the following “metaplectic” operations:

* dilation of K (replacing by for any non-zero )

* translation of K (replacing by for any )

* frequency modulation of K (replacing by for any )

* Fourier transform of K (replacing by ) – this requires Deligne-Laumon-Katz to keep the conductor bounded.

Using the first three symmetries, one can reduce to the case and , thus

Then taking a Fourier transform we can replace with

Making the change of variables this becomes

which we can write in terms of the Kloosterman weight as

or after the change of variables

thus is the Fourier transform of the function evaluated at the quadratic function of . The results of Deligne, Laumon and Katz then ensure that this is a trace weight of bounded conductor, and one can also show directly that the associated sheaf does not contain quadratic Artin-Schrier components, without the need for the Hooley argument (and generic irreducibility, etc.).

This argument is easier than the one currently being written into the draft paper, but is not as general (it relies more heavily on the low-degree nature of the rational phases we are using). We are therefore thinking of rewriting this portion of the paper to focus on this easier argument, and only sketch the more complicated and general argument (perhaps details would be given in a future paper, outside of the polymath8 project).

30 August, 2013 at 3:34 am

Gaston RachlouPlease excuse me if the following suggestion has already been made. I understand and appreciate that various versions are being written, with or without this or that technique.

I think it would be nice to have available somewhere the simplest possible proof of the qualitative result “ does not tend to infinity”. The implicit bound could be worse than Zhang’s 70 000 000, the focus being on using the minimal machinery. With the collective expertise acquired during this remarkable endeavour of yours, it could be an interesting task for someone.

30 August, 2013 at 3:55 am

Gaston RachlouOf course this “good idea of mine” was already in your post… Sorry for the wasted room, but it would definitely be a welcome text!

30 August, 2013 at 11:23 am

Eytan PaldiIn ref. [13], there is a typo in the name of Turan.

30 August, 2013 at 12:28 pm

Gergely HarcosIt is probably due to the accent which requires an input font package or must be typeset differently. At any rate, the correct spelling is “Turán”, but it seems that the cover of that volume will have “Turan” on it. See http://www.degruyter.com/doc/flyer/9783110282375.pdf

30 August, 2013 at 12:48 pm

Eytan PaldiProbably so (because the letter ‘a’ is missing.)

30 August, 2013 at 1:24 pm

Terence TaoActually, thanks to your comment I found that there were quite a lot of special characters in the bibliography (coming from accents and also from some en-dashes and em-dashes that had been cut-and-pasted into biblio.tex) that were not being LaTeX’ed properly; I think the problem is fixed now.

Also, reminded by Gaston Rachlou’s comment, I have now added a remark about the “minimal” proof we currently have to the paper at Remark 2.21. After this project is finished, I might write a blog post in which I try to extract this minimal proof (similar to something I did for Gromov’s theorem at https://terrytao.wordpress.com/2010/02/18/a-proof-of-gromovs-theorem/ ), but that won’t happen for a while.

Best,

Terry

1 September, 2013 at 4:46 am

Aubrey de GreyThank you Terry! I think Remark 2.21 is particularly valuable to non-specialists. Three things occur to me as a result of it.

1) There is one aspect that I think may confuse the less expert reader, though (at least it confuses me!): it concerns the assertion that “one can extend the Type II analysis to also cover the Type I case” (which was already noted in the placeholder paragraph for Section 7). Since Type II constraints do not depend on sigma, how exactly does a purely Type II argument dovetail with the need to keep sigma above 1/6? Is it more accurate to say that the Type I argument can be replaced not by precisely a Type II argument but actually by a variant of Type II that does give rise to a dependence on sigma (and thus, presumably, a varpi considerably smaller than 1/68)?

2) Concerning the extraction of a minimal proof: while I see that that is not a priority, and notwithstanding the closing sentence of Remark 2.21, it seems to me that since the paper is going to include quite extensive discussion of sub-optimal k0 values obtainable without using certain sophisticated machinery, it would be a shame not to include good values of k0 (and H) arising from purely Type II arguments if they could be found fairly easily. If I fully understand Remark 2.21, it seems that the only thing missing from the existing discussion that is needed for this is to identify the relevant linear constraint on (varpi, delta) presuming sigma = 1/6, and then xfxie and Drew could provide good k0 and H in half no time. Is such a constraint already “as good as known” (e.g., is it already available by combining suitable “levels” of the Type I and II constraints on the wiki)? I apologise if I have overlooked some previous discussion of this, or if I am underestimating the difficulty of deriving this constraint.

3) Finally: I asked some time ago about the possibility of bypassing Weil entirely, and you mentioned the mathoverflow question you’d asked and the lack of a resolution (see http://mathoverflow.net/questions/138193/is-there-a-cheap-proof-of-power-savings-for-exponential-sums-over-finite-fields). Would it be appropriate to include a sentence about that issue here?

1 September, 2013 at 2:57 pm

Terence TaoRegarding (3), I’ve now added an additional remark (Remark 6.2) to address this. As for extending the Type II analysis, I haven’t yet gotten to that part of the writeup (still finishing off Section 6), but will add some comment about this when I reach that point.

30 August, 2013 at 12:12 pm

SteveLayman question: What is required to confirm H = 4,686? Is it review of a proof, or just some exhaustive calculation? If it’s the latter, I’d be happy to set my GPU about the task of crunching numbers.

31 August, 2013 at 7:19 am

Eytan PaldiIn the third line of claim 1.1, it seems clearer to have inside parentheses.

[Edited, thanks - T.]31 August, 2013 at 9:02 am

anthonyquasAny chance of an idiot’s guide to any of this? I’d really love to understand, but don’t know where to start.

[ You might try https://terrytao.wordpress.com/2013/06/30/bounded-gaps-between-primes-polymath8-a-progress-report/ -T.]31 August, 2013 at 10:20 am

neymetmsnewgap (Terence Tao’s conflicted copy 2013-08-30).pdf (p.20)

3.2 g_n = p_{n + 1} – p_n,

for n <250150 g_n <174

for n < 3 750150 g_n < 256

31 August, 2013 at 10:47 am

Eytan PaldiIn (4.32), it should be (instead of ).

[Corrected, thanks - T.]31 August, 2013 at 11:20 am

Eytan PaldiIn the tenth line above the proof of theorem 2.14, it should be “non-increasing” (instead of “non-decreasing”.)

[Corrected, thanks - T.]31 August, 2013 at 5:33 pm

Andrew SutherlandA new admissible tuple for with diameter 4,597,926

has been posted at

http://math.mit.edu/~primegaps/tuples/admissible_341640_4597926.txt.

This was obtained by letting the adjustment process described in 3.6.1 run until it could not find any beneficial adjustments using residue classes at up to 3 primes (I did this earlier for ). I’ve posted the new bound on the wiki, but it should also be updated in Table 1 of subtheorems.tex and Table 4 of narrow.tex.

[Updated, thanks - T.]31 August, 2013 at 7:44 pm

Gergely HarcosTwo lines below (5.2), should be . Five lines below (5.2), (resp. ) should be (resp. ). On page 53, line 4, "exceed" should be "be at least".

[Corrected, thanks - T.]1 September, 2013 at 7:40 am

Gergely Harcos1. In part (i) of Lemma 2.12, “ is” should be “ is -tuply”.

2. At the end of the second paragraph of the proof of Lemma 2.12, should be . Three lines below this, in the subscript should be .

3. I suggest that in Lemma 2.12 and its proof we rename (resp. ) to (resp. ) in order to be in harmony with Definition 2.11 and its neighborhood.

[Changed, thanks - T.]1 September, 2013 at 2:36 pm

Gergely HarcosThese corrections do not appear in the current pdf files (newgap.pdf and your conflicted copy of 2013-09-02). I have not checked the tex files.

[Oops, there seems to have been an edit conflict about this; I think it is resolved now. -T.]1 September, 2013 at 3:59 pm

Gergely HarcosItem #3 is implemented now, but it seems items #1 and #2 are not. Thank you in advance!

[Gah, fixed now, I hope... -T.]1 September, 2013 at 4:13 pm

Gergely HarcosAlmost :-) In the proof of Lemma 2.12, in the definition of , should be .

[OK, hopefully this time it is fixed - T.]1 September, 2013 at 12:29 pm

Eytan PaldiOn the inefficiency of the current upper bound for :

1. The integral in this bound has a good approximation (from above) in terms of the exponential integral function by , where

– which (by the expansion 5.1.11 in A$S) is

Where is Euler’s constant.

Note that the optimal is (closely) upper bounded by – implying that (in practice) .

2. Observe that the exponential factor of the current bound is an upper bound for a finite sum (for the original bound) of J-dimensional integrals where

. It seems (because of the above logarithmic approximation of ) that a major part of these J-dim. integrals is due to integration over small domains of sizes of order O(latex \tilde{\delta}) – implying that by the “A – trick” the original integrands over these small “major” domains are multiplied by approximately . Therefore it seems that the true should be much smaller than its current bound (perhaps by a factor of about .)

1 September, 2013 at 12:35 pm

Eytan PaldiIt should be in my last comment above.

1 September, 2013 at 1:29 pm

AnonymousAre you saying that is much smaller than ? Or are you scrolling further back in the argument?

1 September, 2013 at 2:25 pm

Eytan PaldiNot exactly. I’m saying that the “A trick” (which is using the fact that the

J-dim. integrand is zero whenever can be very

inefficient (multiplying the integrand by about exactly on the “major domain” with size – where the integrand is most important – as reflected by its logarithmic nature – which depends only on domains of size .

The purpose of the “A-trick” is to upper bound the J-dim. integrals over the domain – while still having a multiplicative integrand.

Unfortunately, this advantage of the "A-trick" leads to a very large magnification (about ) exactly on the above "major domain"!

(I'm trying to reduce this inefficiency by generalizing somewhat the "A-trick".)

1 September, 2013 at 3:02 pm

Gergely HarcosIf I understand correctly, you are trying to improve on the bound

1 September, 2013 at 3:18 pm

Eytan PaldiYes!

1 September, 2013 at 4:15 pm

Gergely HarcosI experimented a little bit for , , . The right hand side for the optimal is about . On the left hand side, the -term is clearly at least

which for is about . So indeed there is hope to save or more, but of course my benchmark lower bound is very crude.

1 September, 2013 at 6:54 pm

Eytan PaldiIt is easy to express the J-th term as an iterated integral (I tried it for and it can’t be expressed exactly in terms of the classical special functions, but at least it has tight lower and upper simple bounds using the expansions for for small and large arguments) – I think that these bounds can be extended for general J-th term.

(This is of course a direct approximation approach, but I intend to try also a generalized “A-trick” bounding approach.)

2 September, 2013 at 7:15 am

Eytan PaldiThe general idea of this “A-trick” is to find a weight function (possibly dependent on ) on , satisfying

(i) on

(ii) for some constant C whenever each .

Note that property (ii) is implied by

(ii)’ is log-concave on .

One may try to optimize or try to optimize it in a parametric form

Parametric examples:

1. where

(this is the current one-parametric example with ).

2.

where and with

.

Note that in the second example, the additional parameter helps in reducing the factor

2 September, 2013 at 7:29 am

Eytan PaldiIn my last comment above, in the two examples, the expressions given for the constant are in fact for its inverse .

2 September, 2013 at 1:20 pm

Eytan PaldiSome remarks:

1. The second example is the simplest parametric weight function generalizing the current weight (i.e. example 1) – this could be only a beginning for optimizing a more general weight function (constrained to have the properties (i) and (ii) above). As an example for further generalization, one may use where each

(this strategy may be called “generalize and optimize”)

2. It is possible that by fixing and optimizing and than fixing and optimizing and so on, the convergence may be very slow (depending on the condition number of the Hessian of the cost function) – so perhaps an efficient optimization algorithm for combined optimization of the parameters is needed (e.g. a quasi-Newton optimization method.)

3. As I understand (at least intuitively) the role of A, it gives a trade-off between the desire to have low magnification for the important domain of size while significantly reduce the contribution to the integral of the domain with .

Note that the first requirement demands “small A”, while the second one demands “large A” – showing the trade-off in A optimization.

Now B helps by assisting in the second requirement (steep cut-off for large t) -thereby helping to reduce A to cope with the first requirement (small t).

Therefore I think that what happened numerically is that B (governing the steepness of the weight function for large t is not sufficiently large) so perhaps a further generalization is needed, i.e.

With parameters to be optimized (such that A and B should “share” their respective requirements in a better way.)

2 September, 2013 at 11:37 am

Gergely HarcosI tried to implement your second parametric example. It gives, for any and ,

In practice is less than , so we can drop the subscript. For , , , , SAGE tells me that the right hand side decreases in very slightly in the beginning, then it increases. So it seems we cannot get a significant savings this way. Here are some sample values:

B = 0.0000 -> 2.4714682685342455648079262199e438

B = 0.0005 -> 2.4714677808195620527909981040e438

B = 0.0010 -> 2.4714679115895903063390145381e438

B = 0.0100 -> 2.4715767393414678026360526257e438

B = 0.1000 -> 2.4845919836872781662553066857e438

B = 0.9990 -> 9.0952599437455008465256980149e440

B = 0.9999 -> 9.0870726191879883371508337709e441

4 September, 2013 at 5:28 pm

Eytan PaldiI’m trying also to prove that is unattainable (this is also not easy – because in this case one needs good lower bounds for the true ‘s!)

1 September, 2013 at 1:42 pm

xfxieI tried to implement the original e-term before, but seems there were no big difference at that time.

Based on Eytan’s comment, I just tried a re-implementation of the e-term.

The direct implementation of e-term seems to be too slow for some parameters:

e := exp(A)*evalf(Sum((k0-1)^J/J!*int(exp(-(A*t)/t), t=deltat..theta, numeric ), J=0..floor(1/deltat)));

So I changed to the following form:

ia := int(exp(-(A*t)/t), t=deltat..theta, numeric)*(k0-1);

e := exp(A)*evalf(Sum(ia^J/J!, J=0..floor(1/deltat)));

Then I can get k0=631: http://www.cs.cmu.edu/~xfxie/project/admissible/k0/sol_varpi600d7_631_o.mpl

I still need to check if there are any errors.

1 September, 2013 at 1:56 pm

Gergely HarcosI think you have a typo in your notebook: exp(-(A*t)/t) is no good, it should be exp(-A*t)/t or, rather, exp(-(A+2*eta)*t)/t.

1 September, 2013 at 2:09 pm

xfxieYou are right. Seems the hope is almost lost as going back to here. Basically, if the term J=0 is allowed, e^A is there, since 0^0 =1.

1 September, 2013 at 2:11 pm

Gergely HarcosSAGE tells me that there is virtually no difference between taking the exponential of the integral (multiplied by ) and using the original finite Taylor sum. For , , , the two methods give

kappa3 = 9.8625509966256926663337917961e-8

kappa3 = 9.8625509966256926663337917960e-8

1 September, 2013 at 1:55 pm

xfxieSeems 630 also survives.

http://www.cs.cmu.edu/~xfxie/project/admissible/k0/sol_varpi600d7_630_o.mpl

1 September, 2013 at 2:21 pm

xfxieIncorrect. Based on the typo found by Gergely.

1 September, 2013 at 2:55 pm

Eytan PaldiIt is easy to see that the J-th Taylor term for is maximal for and decreasing afterwards where the remainder term can easily be estimated analytically.

2 September, 2013 at 9:51 am

Terence TaoI have created a subpage of the polymath wiki at

http://michaelnielsen.org/polymath1/index.php?title=Polymath8_grant_acknowledgments

to hold the contact information and grant acknowledgments of the Polymath8 project participants.

Of course it is not well defined exactly what a “participant” of the project is, since the level of contribution can vary from being responsible for a significant chunk of the research and writeup to leaving a single blog comment. In the past (and specifically for Polymath1 and Polymath4), what we have done is relied on self-reporting: people who feel that they have made a significant mathematical contribution to the project (of the level commensurate with that of a co-author in a traditional mathematical research paper) can add their own name, contact information, and grant information to the first section of the wiki page. Those who feel that they have made an auxiliary contribution to the project (e.g. stylistic suggestions, locating references, etc. – commensurate with a mention in the Acknowledgments section of a mathematical research paper) – can add their name to the second section of the wiki page. The dividing line between the two categories is necessarily a bit blurry, but in practice it seems that most participants are able to categorise their contributions appropriately.

Because of spam reasons, new user accounts are not currently enabled on the polymath wiki except through emailing the administrator; if you are unable to add your name to the wiki because of this, email me and I can add your name manually.

p.s. I plan to roll over this thread to a new one soon, as the number of comments here is getting quite large. We are soon reaching the point where we have draft text for each of the sections of the paper, so we can shortly begin focusing on the proofreading and polishing phases of the writeup.

2 September, 2013 at 10:52 am

Eytan PaldiIn the line below (4.53) there is a (plausible – but still unjustified!) claim that a certain sequence is decreasing.

2 September, 2013 at 2:32 pm

Polymath8: Writing the paper, II | What's new[…] main purpose of this post is to roll over the discussion from the previous Polymath8 thread, which has become rather full with comments. As with the previous thread, the main focus on […]

2 September, 2013 at 2:38 pm

Terence TaoI’m rolling over the discussion to a new thread. Of course, everyone is welcome to continue any active discussions here if it would not make sense to break up the comment thread, but any fresh discussion should probably move over to the new thread where it would be easier to follow.

21 October, 2013 at 9:05 am

cainiaozrReblogged this on ZHANG RONG.