
It’s time to (somewhat belatedly) roll over the previous thread on writing the first paper from the Polymath8 project, as that thread is overflowing with comments.  We are getting near the end of writing this large (173 pages!) paper, establishing a bound of 4,680 on the gap between primes, with only a few sections left to thoroughly proofread (and the last section should probably be removed, with appropriate changes elsewhere, in view of the more recent progress by Maynard).  As before, one can access the working copy of the paper at this subdirectory, as well as the rest of the directory, and the plan is to submit the paper to Algebra and Number Theory (and the arXiv) once there is consensus to do so.  Even before submission, the paper has already had some impact; Andrew Granville’s exposition of the bounded gaps between primes story for the Bulletin of the AMS follows several of the Polymath8 arguments in deriving the result.

After this paper is done, there is interest in continuing onwards with other Polymath8-related topics, and perhaps it is time to start planning for them.  First of all, we have an invitation from the Newsletter of the European Mathematical Society to discuss our experiences and impressions of the project.  I think it would be interesting to collect some impressions or thoughts (both positive and negative) from people who were highly active in the research and/or writing aspects of the project, as well as from more casual participants who were following the progress more quietly.  This project seemed to attract a bit more attention than most other polymath projects (with the possible exception of the very first project, Polymath1).  I think there are several reasons for this; the project builds upon a recent breakthrough (Zhang’s paper) that attracted an impressive amount of attention and publicity; the objective is quite easy to describe, when compared against other mathematical research objectives; and one could summarise the current state of progress by a single natural number H, which implied by infinite descent that the project was guaranteed to terminate at some point, but also made it possible to set up a “scoreboard” that could be quickly and easily updated.  From the research side, another appealing feature of the project was that, in the early stages at least, it was quite easy to grab a new world record by means of making a small observation, which made it fit very well with the polymath spirit (in which the emphasis is on lots of small contributions by many people, rather than a few big contributions by a small number of people).
Indeed, when the project first arose spontaneously as a blog post of Scott Morrison over at the Secret Blogging Seminar, I was initially hesitant to get involved, but soon found the “game” of shaving a few thousand or so off of $H$ to be rather fun and addictive, and with a much greater sense of instant gratification than traditional research projects, which often take months before a satisfactory conclusion is reached.  Anyway, I would welcome other thoughts or impressions on the project in the comments below.  (I think that the pace of comments regarding proofreading of the paper has slowed down enough that this post can accommodate both types of comments comfortably.)

Then of course there is the “Polymath 8b” project in which we build upon the recent breakthroughs of James Maynard, which have simplified the route to bounded gaps between primes considerably, bypassing the need for any Elliott-Halberstam type distribution results beyond the Bombieri-Vinogradov theorem.  James has kindly shown me an advance copy of the preprint, which should be available on the arXiv in a matter of days; it looks like he has made a modest improvement to the previously announced results, improving $k_0$ a bit to 105 (which then improves H to the nice round number of 600).  He also has a companion result on bounding gaps $p_{n+m}-p_n$ between non-consecutive primes for any $m$ (not just $m=1$), with a bound of the shape $H_m := \liminf_{n \to \infty} p_{n+m}-p_n \ll m^3 e^{4m}$, which is in fact the first time that the finiteness of this limit inferior has been demonstrated.  I plan to discuss these results (from a slightly different perspective than Maynard) in a subsequent blog post kicking off the Polymath8b project, once Maynard’s paper has been uploaded.  It should be possible to shave the value of $H = H_1$ down further (or to get better bounds for $H_m$ for larger $m$), both unconditionally and under assumptions such as the Elliott-Halberstam conjecture, both by performing more numerical or theoretical optimisation on the variational problem Maynard is faced with, and by using the improved distributional estimates provided by our existing paper; again, I plan to discuss these issues in a subsequent post. (James, by the way, has expressed interest in participating in this project, which should be very helpful.)

Once again it is time to roll over the previous discussion thread, which has become rather full with comments.  The paper is nearly finished (see also the working copy at this subdirectory, as well as the rest of the directory), but several people are carefully proofreading various sections of the paper.  Once all the people doing so have signed off on it, I think we will be ready to submit (there appears to be no objection to the plan to submit to Algebra and Number Theory).

Another thing to discuss is an invitation to Polymath8 to write a feature article (up to 8000 words or 15 pages) for the Newsletter of the European Mathematical Society on our experiences with this project.  It is perhaps premature to actually start writing this article before the main research paper is finalised, but we can at least plan how to write such an article.  One suggestion, proposed by Emmanuel, is to have individual participants each contribute a brief account of their interaction with the project, which we would compile together with some additional text summarising the project as a whole (and maybe some speculation on any lessons we can apply to future polymath projects).  Certainly I plan to have a separate blog post collecting feedback on this project once the main writing is done.

The main purpose of this post is to roll over the discussion from the previous Polymath8 thread, which has become rather full with comments.  We are still writing the paper, but it appears to have stabilised in a near-final form (source files available here); the main remaining tasks are proofreading, checking the mathematics, and polishing the exposition.  We also have a tentative consensus to submit the paper to Algebra and Number Theory when the proofreading is all complete.

The paper is quite large now (164 pages!) but it is fortunately rather modular, and thus hopefully somewhat readable (particularly regarding the first half of the paper, which does not  need any of the advanced exponential sum estimates).  The size should not be a major issue for the journal, so I would not seek to artificially shorten the paper at the expense of readability or content.

The main purpose of this post is to roll over the discussion from the previous Polymath8 thread, which has become rather full with comments.  As with the previous thread, the main focus of the comments to this thread is on writing up the results of the Polymath8 “bounded gaps between primes” project; the latest files on this writeup may be found at this directory, with the most recently compiled PDF file (clocking in at about 90 pages so far, with a few sections still to be written!) being found here.  There is also still some active discussion on improving the numerical results, with a particular focus on improving the sieving step that converts distribution estimates such as $MPZ^{(i)}[\varpi,\delta]$ into weak prime tuples results $DHL[k_0,2]$.  (For a discussion of the terminology, and for a general overview of the proof strategy, see this previous progress report on the Polymath8 project.)  This post can also contain any other discussion pertinent to any aspect of the polymath8 project, of course.

There are a few sections that still need to be written for the draft, mostly concerned with the Type I, Type II, and Type III estimates.  However, the proofs of these estimates exist already on this blog, so I hope to transcribe them to the paper fairly shortly (say by the end of this week).  Barring any surprises, or major reorganisation of the paper, it seems that the main remaining task in the writing process would be the proofreading and polishing, and turning from the technical mathematical details to expository issues.  As always, feedback from casual participants, as well as those who have been closely involved with the project, would be very valuable in this regard.  (One small comment, by the way, regarding corrections: as the draft keeps changing with time, referring to a specific line of the paper using page numbers and line numbers can become inaccurate, so if one could try to use section numbers, theorem numbers, or equation numbers as reference instead (e.g. “the third line after (5.35)” instead of “the twelfth line of page 54”) that would make it easier to track down specific portions of the paper.)

Also, we have set up a wiki page for listing the participants of the polymath8 project, their contact information, and grant information (if applicable).  We have two lists of participants; one for those who have been making significant contributions to the project (comparable to that of a co-author of a traditional mathematical research paper), and another list for those who have made auxiliary contributions (e.g. typos, stylistic suggestions, or supplying references) that would typically merit inclusion in the Acknowledgments section of a traditional paper.  It’s difficult to exactly draw the line between the two types of contributions, but we have relied in the past on self-reporting, which has worked pretty well so far.  (By the time this project concludes, I may go through the comments to previous posts and see if any further names should be added to these lists that have not already been self-reported.)

The main objectives of the polymath8 project, initiated back in June, were to understand the recent breakthrough paper of Zhang establishing an infinite number of prime gaps bounded by a fixed constant ${H}$, and then to lower that value of ${H}$ as much as possible. After a large number of refinements, optimisations, and other modifications to Zhang’s method, we have now lowered the value of ${H}$ from the initial value of ${70,000,000}$ down to (provisionally) ${4,680}$, as well as to the slightly worse value of ${14,994}$ if one wishes to avoid any reliance on the deep theorems of Deligne on the Weil conjectures.

As has often been the case with other polymath projects, the pace has settled down substantially after the initial frenzy of activity; in particular, the values of ${H}$ (and other key parameters, such as ${k_0}$, ${\varpi}$, and ${\delta}$) have stabilised over the last few weeks. While there may still be a few small improvements in these parameters that can be wrung out of our methods, I think it is safe to say that we have cleared out most of the “low-hanging fruit” (and even some of the “medium-hanging fruit”), which means that it is time to transition to the next phase of the polymath project, namely the writing phase.

After some discussion at the previous post, we have tentatively decided on writing a single research paper, which contains (in a reasonably self-contained fashion) the details of the strongest result we have (i.e. bounded gaps with ${H = 4,680}$), together with some variants, such as the bound ${H=14,994}$ that one can obtain without invoking Deligne’s theorems. We can of course also include some discussion as to where further improvements could conceivably arise from these methods, although even if one assumes the most optimistic estimates regarding distribution of the primes, we still do not have any way to get past the barrier of ${H=16}$ identified as the limit of this method by Goldston, Pintz, and Yildirim. This research paper does not necessarily represent the only output of the polymath8 project; for instance, as part of the polymath8 project the admissible tuples page was created, which is a repository of narrow prime tuples that can automatically accept (and verify) new submissions. (At an early stage of the project, it was suggested that we set up a computing challenge for mathematically inclined programmers to try to find the narrowest prime tuples of a given width; it might be worth revisiting this idea now that our value of ${k_0}$ has stabilised and the prime tuples page is up and running.) Other potential outputs include additional expository articles, lecture notes, or perhaps the details of a “minimal proof” of bounded gaps between primes that gives a lousy value of ${H}$ but with as short and conceptual a proof as possible. But it seems to me that these projects do not need to proceed via the traditional research paper route (perhaps ending up on the blog, on the wiki, or on the admissible tuples page instead). These projects might also benefit from the passage of time to lend a bit of perspective and depth, especially given that there are likely to be further advances in this field from outside of the polymath project.

I have taken the liberty of setting up a Dropbox folder containing a skeletal outline of a possible research paper, and anyone who is interested in making significant contributions to the writeup of the paper can contact me to be given write access to that folder. However, I am not firmly wedded to the organisational structure of that paper, and at this stage it is quite easy to move sections around if this would lead to a more readable or more logically organised paper.

I have tried to structure the paper so that the deepest arguments – the ones which rely on Deligne’s theorems – are placed at the end of the paper, so that a reader who wishes to read and understand a proof of bounded gaps that does not rely on Deligne’s theorems can stop reading about halfway through the paper. I have also moved the top-level structure of the argument (deducing bounded gaps from a Dickson-Hardy-Littlewood claim ${DHL[k_0,2]}$, which in turn is established from a Motohashi-Pintz-Zhang distribution estimate ${MPZ^{(i)}[\varpi,\delta]}$, which is in turn deduced from Type I, Type II, and Type III estimates) to the front of the paper.

Of course, any feedback on the draft paper is encouraged, even from (or especially from!) readers who have been following this project on a casual basis, as this would be valuable in making sure that the paper is written in as accessible a fashion as possible. (Sometimes it is possible to be so close to a project that one loses some sense of perspective, and does not realise that what one is writing might not necessarily be as clear to other mathematicians as it is to the author.)

As in all previous posts in this series, we adopt the following asymptotic notation: ${x}$ is a parameter going off to infinity, and all quantities may depend on ${x}$ unless explicitly declared to be “fixed”. The asymptotic notation ${O(), o(), \ll}$ is then defined relative to this parameter. A quantity ${q}$ is said to be of polynomial size if one has ${q = O(x^{O(1)})}$, and bounded if ${q=O(1)}$. We also write ${X \lessapprox Y}$ for ${X \ll x^{o(1)} Y}$, and ${X \sim Y}$ for ${X \ll Y \ll X}$.

The purpose of this (rather technical) post is both to roll over the polymath8 research thread from this previous post, and also to record the details of the latest improvement to the Type I estimates (based on exploiting additional averaging and using Deligne’s proof of the Weil conjectures) which lead to a slight improvement in the numerology.

In order to obtain this new Type I estimate, we need to strengthen the previously used properties of “dense divisibility” or “double dense divisibility” as follows.

Definition 1 (Multiple dense divisibility) Let ${y \geq 1}$. For each natural number ${k \geq 0}$, we define a notion of ${k}$-tuply ${y}$-dense divisibility recursively as follows:

• Every natural number ${n}$ is ${0}$-tuply ${y}$-densely divisible.
• If ${k \geq 1}$ and ${n}$ is a natural number, we say that ${n}$ is ${k}$-tuply ${y}$-densely divisible if, whenever ${i,j \geq 0}$ are natural numbers with ${i+j=k-1}$, and ${1 \leq R \leq n}$, one can find a factorisation ${n = qr}$ with ${y^{-1} R \leq r \leq R}$ such that ${q}$ is ${i}$-tuply ${y}$-densely divisible and ${r}$ is ${j}$-tuply ${y}$-densely divisible.

We let ${{\mathcal D}^{(k)}_y}$ denote the set of ${k}$-tuply ${y}$-densely divisible numbers. We abbreviate “${1}$-tuply densely divisible” as “densely divisible”, “${2}$-tuply densely divisible” as “doubly densely divisible”, and so forth; we also abbreviate ${{\mathcal D}^{(1)}_y}$ as ${{\mathcal D}_y}$.
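
For small inputs, Definition 1 can be tested by brute force. The following Python sketch is purely illustrative and plays no role in the arguments; the names `divisors` and `is_dense` are hypothetical, and ${y \geq 1}$ is assumed to be an integer or a `Fraction` so that all comparisons are exact. It uses the observation that a divisor ${r}$ serves exactly those ${R}$ lying in ${[r, yr]}$, so the condition for a given pair ${(i,j)}$ holds precisely when these intervals, taken over the admissible divisors ${r}$, cover ${[1,n]}$.

```python
from functools import lru_cache
from math import isqrt


def divisors(n):
    """Return the positive divisors of n in increasing order."""
    small = [d for d in range(1, isqrt(n) + 1) if n % d == 0]
    large = [n // d for d in reversed(small) if d * d != n]
    return small + large


@lru_cache(maxsize=None)
def is_dense(n, k, y):
    """Decide whether n is k-tuply y-densely divisible (Definition 1)."""
    if k == 0:
        return True  # every natural number is 0-tuply densely divisible
    for i in range(k):
        j = k - 1 - i
        # Divisors r for which q = n/r is i-tuply and r is j-tuply dense.
        good = [r for r in divisors(n)
                if is_dense(n // r, i, y) and is_dense(r, j, y)]
        # R = 1 forces r = 1, so 1 must be among the admissible divisors.
        if not good or good[0] != 1:
            return False
        # Consecutive intervals [r, y*r] must overlap or touch.
        if any(b > y * a for a, b in zip(good, good[1:])):
            return False
        # The last interval must reach R = n.
        if y * good[-1] < n:
            return False
    return True
```

For instance, with ${y=2}$ this reports that ${6}$ is densely divisible but not doubly densely divisible (the divisor ${3}$ is not itself densely divisible, which leaves a gap between ${2}$ and ${6}$), whereas ${12}$ is doubly densely divisible.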

Given any finitely supported sequence ${\alpha: {\bf N} \rightarrow {\bf C}}$ and any primitive residue class ${a\ (q)}$, we define the discrepancy

$\displaystyle \Delta(\alpha; a \ (q)) := \sum_{n: n = a\ (q)} \alpha(n) - \frac{1}{\phi(q)} \sum_{n: (n,q)=1} \alpha(n).$
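
To make the definition concrete, here is a small Python sketch that evaluates the discrepancy directly from the formula. The representation of ${\alpha}$ as a dictionary and the names `euler_phi` and `discrepancy` are assumptions made for this illustration only.

```python
from fractions import Fraction
from math import gcd


def euler_phi(q):
    """Euler's totient function by direct count (adequate for small q)."""
    return sum(1 for m in range(1, q + 1) if gcd(m, q) == 1)


def discrepancy(alpha, a, q):
    """Delta(alpha; a (q)) for a finitely supported alpha given as {n: alpha(n)}."""
    if gcd(a, q) != 1:
        raise ValueError("a (q) must be a primitive residue class")
    # Sum of alpha over the residue class n = a (q).
    in_class = sum(v for n, v in alpha.items() if (n - a) % q == 0)
    # Sum of alpha over all n coprime to q, to be averaged over phi(q) classes.
    coprime = sum(v for n, v in alpha.items() if gcd(n, q) == 1)
    return in_class - coprime * Fraction(1, euler_phi(q))


# alpha = indicator function of {1, ..., 10}
alpha = {n: 1 for n in range(1, 11)}
print(discrepancy(alpha, 1, 3))  # 1/2
```

Here four of the seven integers in ${\{1,\dots,10\}}$ coprime to ${3}$ lie in the class ${1\ (3)}$, against an expected share of ${7/2}$, so the discrepancy is ${1/2}$.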

We now recall the key concept of a coefficient sequence, with some slight tweaks in the definitions that are technically convenient for this post.

Definition 2 A coefficient sequence is a finitely supported sequence ${\alpha: {\bf N} \rightarrow {\bf R}}$ that obeys the bounds

$\displaystyle |\alpha(n)| \ll \tau^{O(1)}(n) \log^{O(1)}(x) \ \ \ \ \ (1)$

for all ${n}$, where ${\tau}$ is the divisor function.

• (i) A coefficient sequence ${\alpha}$ is said to be located at scale ${N}$ for some ${N \geq 1}$ if it is supported on an interval of the form ${[cN, CN]}$ for some ${1 \ll c < C \ll 1}$.
• (ii) A coefficient sequence ${\alpha}$ located at scale ${N}$ for some ${N \geq 1}$ is said to obey the Siegel-Walfisz theorem if one has

$\displaystyle | \Delta(\alpha 1_{(\cdot,q)=1}; a\ (r)) | \ll \tau(qr)^{O(1)} N \log^{-A} x \ \ \ \ \ (2)$

for any ${q,r \geq 1}$, any fixed ${A}$, and any primitive residue class ${a\ (r)}$.

• (iii) A coefficient sequence ${\alpha}$ is said to be smooth at scale ${N}$ for some ${N > 0}$ if it takes the form ${\alpha(n) = \psi(n/N)}$ for some smooth function ${\psi: {\bf R} \rightarrow {\bf C}}$ supported on an interval of size ${O(1)}$ and obeying the derivative bounds

$\displaystyle |\psi^{(j)}(t)| \lesssim \log^{O(1)} x \ \ \ \ \ (3)$

for all fixed ${j \geq 0}$ (note that the implied constant in the ${O()}$ notation may depend on ${j}$).

Note that we allow sequences to be smooth at scale ${N}$ without being located at scale ${N}$; for instance if one arbitrarily translates a sequence that is both smooth and located at scale ${N}$, it will remain smooth at this scale but may not necessarily be located at this scale any more. Note also that we allow the smoothness scale ${N}$ of a coefficient sequence to be less than one. This is to allow for the following convenient rescaling property: if ${n \mapsto \psi(n)}$ is smooth at scale ${N}$, ${q \geq 1}$, and ${a}$ is an integer, then ${n \mapsto \psi(qn+a)}$ is smooth at scale ${N/q}$, even if ${N/q}$ is less than one.

Now we adapt the Type I estimate to the ${k}$-tuply densely divisible setting.

Definition 3 (Type I estimates) Let ${0 < \varpi < 1/4}$, ${0 < \delta < 1/4+\varpi}$, and ${0 < \sigma < 1/2}$ be fixed quantities, and let ${k \geq 1}$ be a fixed natural number. We let ${I}$ be an arbitrary bounded subset of ${{\bf R}}$, let ${P_I := \prod_{p \in I} p}$, and let ${a\ (P_I)}$ be a primitive congruence class. We say that ${Type^{(k)}_I[\varpi,\delta,\sigma]}$ holds if, whenever ${M, N \gg 1}$ are quantities with

$\displaystyle M N \sim x \ \ \ \ \ (4)$

and

$\displaystyle x^{1/2-\sigma} \lessapprox N \lessapprox x^{1/2-2\varpi-c} \ \ \ \ \ (5)$

for some fixed ${c>0}$, and ${\alpha,\beta}$ are coefficient sequences located at scales ${M,N}$ respectively, with ${\beta}$ obeying a Siegel-Walfisz theorem, we have

$\displaystyle \sum_{q \in {\mathcal S}_I \cap {\mathcal D}_{x^\delta}^{(k)}: q \leq x^{1/2+2\varpi}} |\Delta(\alpha * \beta; a\ (q))| \ll x \log^{-A} x \ \ \ \ \ (6)$

for any fixed ${A>0}$. Here, as in previous posts, ${{\mathcal S}_I}$ denotes the square-free natural numbers whose prime factors lie in ${I}$.

The main theorem of this post is then

Theorem 4 (Improved Type I estimate) We have ${Type^{(4)}_I[\varpi,\delta,\sigma]}$ whenever

$\displaystyle \frac{160}{3} \varpi + 16 \delta + \frac{34}{9} \sigma < 1$

and

$\displaystyle 64\varpi + 18\delta + 2\sigma < 1.$

In practice, the first condition here is dominant. Except for weakening double dense divisibility to quadruple dense divisibility, this improves upon the previous Type I estimate that established ${Type^{(2)}_I[\varpi,\delta,\sigma]}$ under the stricter hypothesis

$\displaystyle 56 \varpi + 16 \delta + 4 \sigma < 1.$

As in previous posts, Type I estimates (when combined with existing Type II and Type III estimates) lead to distribution results of Motohashi-Pintz-Zhang type. For any fixed ${\varpi, \delta > 0}$ and ${k \geq 1}$, we let ${MPZ^{(k)}[\varpi,\delta]}$ denote the assertion that

$\displaystyle \sum_{q \in {\mathcal S}_I \cap {\mathcal D}_{x^\delta}^{(k)}: q \leq x^{1/2+2\varpi}} |\Delta(\Lambda 1_{[x,2x]}; a\ (q))| \ll x \log^{-A} x \ \ \ \ \ (7)$

for any fixed ${A > 0}$, any bounded ${I}$, and any primitive ${a\ (P_I)}$, where ${\Lambda}$ is the von Mangoldt function.

Corollary 5 We have ${MPZ^{(4)}[\varpi,\delta]}$ whenever

$\displaystyle \frac{600}{7} \varpi + \frac{180}{7} \delta < 1 \ \ \ \ \ (8)$

Proof: Setting ${\sigma}$ sufficiently close to ${1/10}$, we see from the above theorem that ${Type^{(4)}_{I}[\varpi,\delta,\sigma]}$ holds whenever

$\displaystyle \frac{600}{7} \varpi + \frac{180}{7} \delta < 1$

and

$\displaystyle 80 \varpi + \frac{45}{2} \delta < 1.$

The second condition is implied by the first and can be deleted.

From this previous post we know that ${Type^{(4)}_{II}[\varpi,\delta]}$ (which we define analogously to ${Type'_{II}[\varpi,\delta], Type''_{II}[\varpi,\delta]}$ from previous sections) holds whenever

$\displaystyle 68 \varpi + 14 \delta < 1$

while ${Type^{(4)}_{III}[\varpi,\delta,\sigma]}$ holds with ${\sigma}$ sufficiently close to ${1/10}$ whenever

$\displaystyle 70 \varpi + 5 \delta < 1.$

Again, these conditions are implied by (8). The claim then follows from the Heath-Brown identity and dyadic decomposition as in this previous post. $\Box$
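
The numerology in this proof is easy to check mechanically. The sketch below (with hypothetical helper names `eliminate_sigma` and `dominated`) substitutes the limiting value ${\sigma = 1/10}$ into the two hypotheses of Theorem 4 and normalises the right-hand side to ${1}$; since the inequalities are strict, values of ${\sigma}$ sufficiently close to ${1/10}$ are then also admissible. It also confirms that each of the other conditions has coefficients no larger than those of (8), which suffices for the implication because ${\varpi,\delta > 0}$.

```python
from fractions import Fraction as F


def eliminate_sigma(c_varpi, c_delta, c_sigma, sigma):
    """Rewrite c_varpi*varpi + c_delta*delta + c_sigma*sigma < 1
    in the normalised form A*varpi + B*delta < 1 and return (A, B)."""
    room = 1 - c_sigma * sigma
    return c_varpi / room, c_delta / room


def dominated(weaker, stronger):
    """True if A'*varpi + B'*delta < 1 follows from A*varpi + B*delta < 1
    for all positive varpi, delta by comparing coefficients."""
    return all(w <= s for w, s in zip(weaker, stronger))


sigma = F(1, 10)
first = eliminate_sigma(F(160, 3), F(16), F(34, 9), sigma)
second = eliminate_sigma(F(64), F(18), F(2), sigma)
print(first)   # (Fraction(600, 7), Fraction(180, 7))
print(second)  # (Fraction(80, 1), Fraction(45, 2))

# Second Type I condition, Type II condition, Type III condition versus (8).
for other in (second, (F(68), F(14)), (F(70), F(5))):
    print(dominated(other, first))  # True each time
```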

As before, we let ${DHL[k_0,2]}$ denote the claim that given any admissible ${k_0}$-tuple ${{\mathcal H}}$, there are infinitely many translates of ${{\mathcal H}}$ that contain at least two primes.

Corollary 6 We have ${DHL[k_0,2]}$ with ${k_0 = 632}$.

This follows from the Pintz sieve, as discussed below the fold. Combining this with the best known prime tuples, we obtain that there are infinitely many prime gaps of size at most ${4,680}$, improving slightly over the previous record of ${5,414}$.
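
The notion of admissibility is not restated in this post; the sketch below assumes the standard definition, namely that a tuple is admissible if for every prime ${p}$ it avoids at least one residue class modulo ${p}$. Only primes ${p \leq k_0}$ need to be tested, since a tuple with ${k_0}$ elements cannot occupy all ${p}$ classes when ${p > k_0}$. The function names are hypothetical, and this naive check is only meant to illustrate the definition, not to reproduce the verification performed for the record tuples.

```python
from math import isqrt


def primes_up_to(n):
    """Primes p <= n by trial division (adequate for small n)."""
    return [p for p in range(2, n + 1)
            if all(p % d for d in range(2, isqrt(p) + 1))]


def is_admissible(h):
    """True if, for every prime p, the tuple h misses some residue class mod p."""
    for p in primes_up_to(len(h)):
        if len({x % p for x in h}) == p:
            return False  # h covers every residue class mod p
    return True


def diameter(h):
    """Difference between the largest and smallest element of the tuple."""
    return max(h) - min(h)


print(is_admissible((0, 2, 6)), diameter((0, 2, 6)))  # True 6
print(is_admissible((0, 2, 4)))                       # False: covers all classes mod 3
```

Given ${DHL[k_0,2]}$, any admissible ${k_0}$-tuple of diameter ${H}$ yields infinitely many prime gaps of size at most ${H}$, which is how the value of ${k_0}$ is converted into a bound on gaps.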

As in previous posts, we use the following asymptotic notation: ${x}$ is a parameter going off to infinity, and all quantities may depend on ${x}$ unless explicitly declared to be “fixed”. The asymptotic notation ${O(), o(), \ll}$ is then defined relative to this parameter. A quantity ${q}$ is said to be of polynomial size if one has ${q = O(x^{O(1)})}$, and bounded if ${q=O(1)}$. We also write ${X \lessapprox Y}$ for ${X \ll x^{o(1)} Y}$, and ${X \sim Y}$ for ${X \ll Y \ll X}$.

The purpose of this post is to collect together all the various refinements to the second half of Zhang’s paper that have been obtained as part of the polymath8 project and present them as a coherent argument. In order to state the main result, we need to recall some definitions. If ${I}$ is a bounded subset of ${{\bf R}}$, let ${{\mathcal S}_I}$ denote the square-free numbers whose prime factors lie in ${I}$, and let ${P_I := \prod_{p \in I} p}$ denote the product of the primes ${p}$ in ${I}$. Note by the Chinese remainder theorem that the set ${({\bf Z}/P_I{\bf Z})^\times}$ of primitive congruence classes ${a\ (P_I)}$ modulo ${P_I}$ can be identified with the tuples ${(a_q\ (q))_{q \in {\mathcal S}_I}}$ of primitive congruence classes ${a_q\ (q)}$ modulo ${q}$ for each ${q \in {\mathcal S}_I}$ which obey the Chinese remainder theorem

$\displaystyle (a_{qr}\ (qr)) = (a_q\ (q)) \cap (a_r\ (r))$

for all coprime ${q,r \in {\mathcal S}_I}$, since one can identify ${a\ (P_I)}$ with the tuple ${(a\ (q))_{q \in {\mathcal S}_I}}$ for each ${a \in ({\bf Z}/P_I{\bf Z})^\times}$.

If ${y > 1}$ and ${n}$ is a natural number, we say that ${n}$ is ${y}$-densely divisible if, for every ${1 \leq R \leq n}$, one can find a factor of ${n}$ in the interval ${[y^{-1} R, R]}$. We say that ${n}$ is doubly ${y}$-densely divisible if, for every ${1 \leq R \leq n}$, one can find a factor ${m}$ of ${n}$ in the interval ${[y^{-1} R, R]}$ such that ${m}$ is itself ${y}$-densely divisible. We let ${{\mathcal D}_y^2}$ denote the set of doubly ${y}$-densely divisible natural numbers, and ${{\mathcal D}_y}$ the set of ${y}$-densely divisible numbers.

Given any finitely supported sequence ${\alpha: {\bf N} \rightarrow {\bf C}}$ and any primitive residue class ${a\ (q)}$, we define the discrepancy

$\displaystyle \Delta(\alpha; a \ (q)) := \sum_{n: n = a\ (q)} \alpha(n) - \frac{1}{\phi(q)} \sum_{n: (n,q)=1} \alpha(n).$

For any fixed ${\varpi, \delta > 0}$, we let ${MPZ''[\varpi,\delta]}$ denote the assertion that

$\displaystyle \sum_{q \in {\mathcal S}_I \cap {\mathcal D}_{x^\delta}^2: q \leq x^{1/2+2\varpi}} |\Delta(\Lambda 1_{[x,2x]}; a\ (q))| \ll x \log^{-A} x \ \ \ \ \ (1)$

for any fixed ${A > 0}$, any bounded ${I}$, and any primitive ${a\ (P_I)}$, where ${\Lambda}$ is the von Mangoldt function. Importantly, we do not require ${I}$ or ${a}$ to be fixed; in particular ${I}$ could grow polynomially in ${x}$, and ${a}$ could grow exponentially in ${x}$, but the implied constant in (1) would still need to be fixed (so it has to be uniform in ${I}$ and ${a}$). (In previous formulations of these estimates, the system of congruences ${a\ (q)}$ was also required to obey a controlled multiplicity hypothesis, but we no longer need this hypothesis in our arguments.) In this post we will record the proof of the following result, which is currently the best distribution result produced by the ongoing polymath8 project to optimise Zhang’s theorem on bounded gaps between primes:

Theorem 1 We have ${MPZ''[\varpi,\delta]}$ whenever ${\frac{280}{3} \varpi + \frac{80}{3} \delta < 1}$.

This improves upon the previous constraint of ${148 \varpi + 33 \delta < 1}$ (see this previous post), although that latter statement was stronger in that it only required single dense divisibility rather than double dense divisibility. However, thanks to the efficiency of the sieving step of our argument, the upgrade of the single dense divisibility hypothesis to double dense divisibility costs almost nothing with respect to the ${k_0}$ parameter (which, using this constraint, gives a value of ${k_0=720}$ as verified in these comments, which then implies a value of ${H = 5,414}$).

This estimate is deduced from three sub-estimates, which require a bit more notation to state. We need a fixed quantity ${A_0>0}$.

Definition 2 A coefficient sequence is a finitely supported sequence ${\alpha: {\bf N} \rightarrow {\bf R}}$ that obeys the bounds

$\displaystyle |\alpha(n)| \ll \tau^{O(1)}(n) \log^{O(1)}(x) \ \ \ \ \ (2)$

for all ${n}$, where ${\tau}$ is the divisor function.

• (i) A coefficient sequence ${\alpha}$ is said to be at scale ${N}$ for some ${N \geq 1}$ if it is supported on an interval of the form ${[(1-O(\log^{-A_0} x)) N, (1+O(\log^{-A_0} x)) N]}$.
• (ii) A coefficient sequence ${\alpha}$ at scale ${N}$ is said to obey the Siegel-Walfisz theorem if one has

$\displaystyle | \Delta(\alpha 1_{(\cdot,q)=1}; a\ (r)) | \ll \tau(qr)^{O(1)} N \log^{-A} x \ \ \ \ \ (3)$

for any ${q,r \geq 1}$, any fixed ${A}$, and any primitive residue class ${a\ (r)}$.

• (iii) A coefficient sequence ${\alpha}$ at scale ${N}$ (relative to this choice of ${A_0}$) is said to be smooth if it takes the form ${\alpha(n) = \psi(n/N)}$ for some smooth function ${\psi: {\bf R} \rightarrow {\bf C}}$ supported on ${[1-O(\log^{-A_0} x), 1+O(\log^{-A_0} x)]}$ obeying the derivative bounds

$\displaystyle \psi^{(j)}(t) = O( \log^{j A_0} x ) \ \ \ \ \ (4)$

for all fixed ${j \geq 0}$ (note that the implied constant in the ${O()}$ notation may depend on ${j}$).

Definition 3 (Type I, Type II, Type III estimates) Let ${0 < \varpi < 1/4}$, ${0 < \delta < 1/4+\varpi}$, and ${0 < \sigma < 1/2}$ be fixed quantities. We let ${I}$ be an arbitrary bounded subset of ${{\bf R}}$, and ${a\ (P_I)}$ a primitive congruence class.

Theorem 1 is then a consequence of the following four statements.

Theorem 4 (Type I estimate) ${Type''_I[\varpi,\delta,\sigma]}$ holds whenever ${\varpi,\delta,\sigma > 0}$ are fixed quantities such that

$\displaystyle 56 \varpi + 16 \delta + 4\sigma < 1.$

Theorem 5 (Type II estimate) ${Type''_{II}[\varpi,\delta]}$ holds whenever ${\varpi,\delta > 0}$ are fixed quantities such that

$\displaystyle 68 \varpi + 14 \delta < 1.$

Theorem 6 (Type III estimate) ${Type''_{III}[\varpi,\delta,\sigma]}$ holds whenever ${0 < \varpi < 1/4}$, ${0 < \delta < 1/4+\varpi}$, and ${\sigma > 0}$ are fixed quantities such that

$\displaystyle \sigma > \frac{1}{18} + \frac{28}{9} \varpi + \frac{2}{9} \delta \ \ \ \ \ (12)$

and

$\displaystyle \varpi< \frac{1}{12}. \ \ \ \ \ (13)$

In particular, if

$\displaystyle 70 \varpi + 5 \delta < 1.$

then all values of ${\sigma}$ that are sufficiently close to ${1/10}$ are admissible.

Lemma 7 (Combinatorial lemma) Let ${0 < \varpi < 1/4}$, ${0 < \delta < 1/4+\varpi}$, and ${1/10 < \sigma < 1/2}$ be such that ${Type''_I[\varpi,\delta,\sigma]}$, ${Type''_{II}[\varpi,\delta]}$, and ${Type''_{III}[\varpi,\delta,\sigma]}$ simultaneously hold. Then ${MPZ''[\varpi,\delta]}$ holds.

Indeed, if ${\frac{280}{3} \varpi + \frac{80}{3} \delta < 1}$, one checks that the hypotheses for Theorems 4, 5, 6 are obeyed for ${\sigma}$ sufficiently close to ${1/10}$, at which point the claim follows from Lemma 7.

The proofs of Theorems 4, 5, 6 will be given below the fold, while the proof of Lemma 7 follows from the arguments in this previous post. We remark that in our current arguments, the double dense divisibility is only fully used in the Type I estimates; the Type II and Type III estimates are also valid just with single dense divisibility.

Remark 1 Theorem 6 is vacuously true for ${\sigma > 1/6}$, as the condition (10) cannot be satisfied in this case. If we use this trivial case of Theorem 6, while keeping the full strength of Theorems 4 and 5, we obtain Theorem 1 in the regime

$\displaystyle 168 \varpi + 48 \delta < 1.$
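
The constants appearing in Theorem 1 and Remark 1 come from substituting a value of ${\sigma}$ into the Type I and Type III hypotheses. As a quick sanity check, the following sketch (with hypothetical function names) rearranges the hypothesis of Theorem 4 and condition (12) of Theorem 6 into the form ${A\varpi + B\delta < 1}$ for a given ${\sigma}$, and evaluates them at the two values ${\sigma = 1/10}$ and ${\sigma = 1/6}$ used above.

```python
from fractions import Fraction as F


def type_i_constraint(sigma):
    """56*varpi + 16*delta + 4*sigma < 1, rewritten as A*varpi + B*delta < 1."""
    room = 1 - 4 * sigma
    return F(56) / room, F(16) / room


def type_iii_constraint(sigma):
    """sigma > 1/18 + (28/9)*varpi + (2/9)*delta, rewritten as A*varpi + B*delta < 1."""
    room = sigma - F(1, 18)
    return F(28, 9) / room, F(2, 9) / room


print(type_i_constraint(F(1, 10)))    # (Fraction(280, 3), Fraction(80, 3))
print(type_iii_constraint(F(1, 10)))  # (Fraction(70, 1), Fraction(5, 1))
print(type_i_constraint(F(1, 6)))     # (Fraction(168, 1), Fraction(48, 1))
```

The first line recovers the constraint of Theorem 1, the second the condition ${70\varpi + 5\delta < 1}$ stated after Theorem 6, and the third the regime of Remark 1.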

For any ${H \geq 2}$, let ${B[H]}$ denote the assertion that there are infinitely many pairs of consecutive primes ${p_n, p_{n+1}}$ whose difference ${p_{n+1}-p_n}$ is at most ${H}$, or equivalently that

$\displaystyle \liminf_{n \rightarrow \infty} p_{n+1} - p_n \leq H;$

thus for instance ${B[2]}$ is the notorious twin prime conjecture. While this conjecture remains unsolved, we have the following recent breakthrough result of Zhang, building upon earlier work of Goldston-Pintz-Yildirim, Bombieri, Fouvry, Friedlander, and Iwaniec, and others:

Theorem 1 (Zhang’s theorem) ${B[H]}$ is true for some finite ${H}$.

In fact, Zhang’s paper shows that ${B[H]}$ is true with ${H = 70,000,000}$.
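
To get a feel for the quantity involved, one can count small gaps between consecutive primes in an initial segment. The sketch below (function names hypothetical) does this with a simple sieve of Eratosthenes. Of course, a finite computation of this kind says nothing about whether there are infinitely many such gaps, which is the content of ${B[H]}$; it only illustrates the definition.

```python
from math import isqrt


def primes_up_to(n):
    """Sieve of Eratosthenes: all primes p <= n in increasing order."""
    if n < 2:
        return []
    sieve = bytearray([1]) * (n + 1)
    sieve[0] = sieve[1] = 0
    for p in range(2, isqrt(n) + 1):
        if sieve[p]:
            # Cross out the multiples of p starting from p*p.
            sieve[p * p::p] = bytearray(len(range(p * p, n + 1, p)))
    return [i for i, flag in enumerate(sieve) if flag]


def count_small_gaps(limit, H):
    """Number of pairs of consecutive primes p_n < p_{n+1} <= limit with gap at most H."""
    ps = primes_up_to(limit)
    return sum(1 for p, q in zip(ps, ps[1:]) if q - p <= H)


print(count_small_gaps(100, 2))  # 9
```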

About a month ago, the Polymath8 project was launched with the objective of reading through Zhang’s paper, clarifying the arguments, and then making them more efficient, in order to improve the value of ${H}$. This project is still ongoing, but we have made significant progress; currently, we have confirmed that ${B[H]}$ holds for ${H}$ as low as ${12,006}$, and provisionally for ${H}$ as low as ${6,966}$ subject to certain lengthy arguments being checked. For several reasons, our methods (which are largely based on Zhang’s original argument structure, though with numerous refinements and improvements) will not be able to attain the twin prime conjecture ${B[2]}$, but there is still scope to lower the value of ${H}$ a bit further than what we have currently.

The precise arguments here are quite technical, and are discussed at length on other posts on this blog. In this post, I would like to give a “high level” summary of how Zhang’s argument works, and give some impressions of the improvements we have made so far; these would already be familiar to the active participants of the Polymath8 project, but perhaps may be of value to people who are following this project on a more casual basis.

While Zhang’s arguments (and our refinements of them) are quite lengthy, they are fortunately also very modular, that is to say they can be broken up into several independent components that can be understood and optimised more or less separately from each other (although we have on occasion needed to modify the formulation of one component in order to better suit the needs of another). At the top level, Zhang’s argument looks like this:

1. Statements of the form ${B[H]}$ are deduced from weakened versions of the Hardy-Littlewood prime tuples conjecture, which we have denoted ${DHL[k_0,2]}$ (the ${DHL}$ stands for “Dickson-Hardy-Littlewood”), by locating suitable narrow admissible tuples (see below). Zhang’s paper establishes for the first time an unconditional proof of ${DHL[k_0,2]}$ for some finite ${k_0}$; in his initial paper, ${k_0}$ was ${3,500,000}$, but we have lowered this value to ${1,466}$ (and provisionally to ${902}$). Any reduction in the value of ${k_0}$ leads directly to reductions in the value of ${H}$; a web site to collect the best known values of ${H}$ in terms of ${k_0}$ has recently been set up here (and is accepting submissions from anyone who finds narrower admissible tuples than are currently known).
2. Next, by adapting sieve-theoretic arguments of Goldston, Pintz, and Yildirim, the Dickson-Hardy-Littlewood type assertions ${DHL[k_0,2]}$ are deduced in turn from weakened versions of the Elliott-Halberstam conjecture that we have denoted ${MPZ[\varpi,\delta]}$ (the ${MPZ}$ stands for “Motohashi-Pintz-Zhang”). More recently, we have replaced the conjecture ${MPZ[\varpi,\delta]}$ by a slightly stronger conjecture ${MPZ'[\varpi,\delta]}$ to significantly improve the efficiency of this step (using some recent ideas of Pintz). Roughly speaking, these statements assert that the primes are more or less evenly distributed along many arithmetic progressions, including those that have relatively large spacing. A crucial technical fact here is that in contrast to the older Elliott-Halberstam conjecture, the Motohashi-Pintz-Zhang estimates only require one to control progressions whose spacings ${q}$ have a lot of small prime factors (the original ${MPZ[\varpi,\delta]}$ conjecture requires the spacing ${q}$ to be smooth, but the newer variant ${MPZ'[\varpi,\delta]}$ has relaxed this to “densely divisible” as this turns out to be more efficient). The ${\varpi}$ parameter is more important than the technical parameter ${\delta}$; we would like ${\varpi}$ to be as large as possible, as any increase in this parameter should lead to a reduced value of ${k_0}$. In Zhang’s original paper, ${\varpi}$ was taken to be ${1/1168}$; we have now increased this to be almost as large as ${1/148}$ (and provisionally ${1/108}$).
3. By a certain amount of combinatorial manipulation (combined with a useful decomposition of the von Mangoldt function due to Heath-Brown), estimates such as ${MPZ[\varpi,\delta]}$ can be deduced from three subestimates, the “Type I” estimate ${Type_I[\varpi,\delta,\sigma]}$, the “Type II” estimate ${Type_{II}[\varpi,\delta]}$, and the “Type III” estimate ${Type_{III}[\varpi,\delta,\sigma]}$, which all involve the distribution of certain Dirichlet convolutions in arithmetic progressions. Here ${1/10 < \sigma < 1/2}$ is an adjustable parameter that demarcates the border between the Type I and Type III estimates; raising ${\sigma}$ makes it easier to prove Type III estimates but harder to prove Type I estimates, and lowering ${\sigma}$ of course has the opposite effect. There is a combinatorial lemma that asserts that as long as one can find some ${\sigma}$ between ${1/10}$ and ${1/2}$ for which all three estimates ${Type_I[\varpi,\delta,\sigma]}$, ${Type_{II}[\varpi,\delta]}$, ${Type_{III}[\varpi,\delta,\sigma]}$ hold, one can prove ${MPZ[\varpi,\delta]}$. (The condition ${\sigma > 1/10}$ arises from the combinatorics, and appears to be rather essential; in fact, it is currently a major obstacle to further improvement of ${\varpi}$ and hence ${k_0}$ and ${H}$.)
4. The Type I estimates ${Type_I[\varpi,\delta,\sigma]}$ are asserting good distribution properties of convolutions of the form ${\alpha * \beta}$, where ${\alpha,\beta}$ are moderately long sequences which have controlled magnitude and length but are otherwise arbitrary. Estimates that are roughly of this type first appeared in a series of papers by Bombieri, Fouvry, Friedlander, Iwaniec, and other authors, and Zhang’s arguments here broadly follow those of previous authors, but with several new twists that take advantage of the many factors of the spacing ${q}$. In particular, the dispersion method of Linnik is used (which one can think of as a clever application of the Cauchy-Schwarz inequality) to ultimately reduce matters (after more Cauchy-Schwarz, as well as treatment of several error terms) to estimation of incomplete Kloosterman-type sums such as

$\displaystyle \sum_{n \leq N} e_d( \frac{c}{n} ).$

Zhang’s argument uses classical estimates on this Kloosterman sum (dating back to the work of Weil), but we have improved this using the “${q}$-van der Corput ${A}$-process” introduced by Heath-Brown and Ringrose.

5. The Type II estimates ${Type_{II}[\varpi,\delta]}$ are similar to the Type I estimates, but cover a small hole in the coverage of the Type I estimates which comes up when the two sequences ${\alpha,\beta}$ are almost equal in length. It turns out that one can modify the Type I argument to cover this case also. In practice, these estimates give less stringent conditions on ${\varpi,\delta}$ than the other two estimates, and so as a first approximation one can ignore the need to treat these estimates, although recently our Type I and Type III estimates have become so strong that it has become necessary to tighten the Type II estimates as well.
6. The Type III estimates ${Type_{III}[\varpi,\delta,\sigma]}$ are an averaged variant of the classical problem of understanding the distribution of the ternary divisor function ${\tau_3(n) := \sum_{abc=n} 1}$ in arithmetic progressions. There are various ways to attack this problem, but most of them ultimately boil down (after the use of standard devices such as Cauchy-Schwarz and completion of sums) to the task of controlling certain higher-dimensional Kloosterman-type sums such as

$\displaystyle \sum_{t,t' \in ({\bf Z}/d{\bf Z})^\times} \sum_{l \in {\bf Z}/d{\bf Z}: (l,d)=(l+k,d)=1} e_d( \frac{t}{l} - \frac{t'}{l+k} + \frac{m}{t} - \frac{m'}{t'} ).$

In principle, any such sum can be controlled by invoking Deligne’s proof of the Weil conjectures in arbitrary dimension (which, roughly speaking, establishes the analogue of the Riemann hypothesis for arbitrary varieties over finite fields), although in the higher dimensional setting some algebraic geometry is needed to ensure that one gets the full “square root cancellation” for these exponential sums. (For the particular sum above, the necessary details were worked out by Birch and Bombieri.) As such, this part of the argument is by far the least elementary component of the whole. Zhang’s original argument cleverly exploited some additional cancellation in the above exponential sums that goes beyond the naive square root cancellation heuristic; more recently, an alternate argument of Fouvry, Kowalski, Michel, and Nelson uses bounds on a slightly different higher-dimensional Kloosterman-type sum to obtain results that give better values of ${\varpi,\delta,\sigma}$. We have also been able to improve upon these estimates by exploiting some additional averaging that was left unused by the previous arguments.
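To make the first step of the outline a little more concrete, here is a minimal Python sketch of what it means for a tuple to be admissible. It uses the standard definition (a tuple is admissible if, for every prime ${p}$, it avoids at least one residue class modulo ${p}$), which is not spelled out in this excerpt; the function names `small_primes`, `is_admissible` and `diameter` are hypothetical helpers chosen for illustration, and this is in no way the code used for the actual tables of narrow tuples.

```python
def small_primes(limit):
    """Return all primes <= limit by trial division (fine for small limits)."""
    return [p for p in range(2, limit + 1)
            if all(p % d for d in range(2, int(p ** 0.5) + 1))]

def is_admissible(tuple_h):
    """Check the standard admissibility condition (assumed definition).

    A k-tuple can only occupy every residue class mod p when p <= k,
    so it suffices to test the primes up to k.
    """
    k = len(tuple_h)
    for p in small_primes(k):
        if len({h % p for h in tuple_h}) == p:
            return False  # every residue class mod p is hit
    return True

def diameter(tuple_h):
    """Width of the tuple; this plays the role of H in B[H]."""
    return max(tuple_h) - min(tuple_h)

# Small illustrative examples (not taken from the project tables).
print(is_admissible((0, 2, 6)), diameter((0, 2, 6)))
print(is_admissible((0, 2, 4)))
```

In this sketch, an admissible ${k_0}$-tuple of diameter ${H}$ is exactly the input that step 1 needs in order to pass from ${DHL[k_0,2]}$ to ${B[H]}$; finding such tuples with small diameter for large ${k_0}$ is a separate computational problem that the brute-force check above does not address.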

As of this time of writing, our understanding of the first three stages of Zhang’s argument (getting from ${DHL[k_0,2]}$ to ${B[H]}$, getting from ${MPZ[\varpi,\delta]}$ or ${MPZ'[\varpi,\delta]}$ to ${DHL[k_0,2]}$, and getting to ${MPZ[\varpi,\delta]}$ or ${MPZ'[\varpi,\delta]}$ from Type I, Type II, and Type III estimates) is quite satisfactory, with the implications here being about as efficient as one could hope for with current methods, although one could still hope to get some small improvements in parameters by wringing out some of the last few inefficiencies. The remaining major sources of improvements to the parameters are then coming from gains in the Type I, II, and III estimates; we are currently in the process of making such improvements, but it will still take some time before they are fully optimised.

Below the fold I will discuss (mostly at an informal, non-rigorous level) the six steps above in a little more detail (full details can of course be found in the other polymath8 posts on this blog). This post will also serve as a new research thread, as the previous threads were getting quite lengthy.

As in previous posts, we use the following asymptotic notation: ${x}$ is a parameter going off to infinity, and all quantities may depend on ${x}$ unless explicitly declared to be “fixed”. The asymptotic notation ${O(), o(), \ll}$ is then defined relative to this parameter. A quantity ${q}$ is said to be of polynomial size if one has ${q = O(x^{O(1)})}$, and bounded if ${q=O(1)}$. We also write ${X \lessapprox Y}$ for ${X \ll x^{o(1)} Y}$, and ${X \sim Y}$ for ${X \ll Y \ll X}$.

The purpose of this post is to collect together all the various refinements to the second half of Zhang’s paper that have been obtained as part of the polymath8 project and present them as a coherent argument (though not fully self-contained, as we will need some lemmas from previous posts).

In order to state the main result, we need to recall some definitions.

Definition 1 (Singleton congruence class system) Let ${I \subset {\bf R}}$, and let ${{\mathcal S}_I}$ denote the square-free numbers whose prime factors lie in ${I}$. A singleton congruence class system on ${I}$ is a collection ${{\mathcal C} = (\{a_q\})_{q \in {\mathcal S}_I}}$ of primitive residue classes ${a_q \in ({\bf Z}/q{\bf Z})^\times}$ for each ${q \in {\mathcal S}_I}$, obeying the Chinese remainder theorem property

$\displaystyle a_{qr}\ (qr) = (a_q\ (q)) \cap (a_r\ (r)) \ \ \ \ \ (1)$

whenever ${q,r \in {\mathcal S}_I}$ are coprime. We say that such a system ${{\mathcal C}}$ has controlled multiplicity if the quantity

$\displaystyle \tau_{\mathcal C}(n) := |\{ q \in {\mathcal S}_I: n = a_q\ (q) \}|$

obeys the estimate

$\displaystyle \sum_{C^{-1} x \leq n \leq Cx: n = a\ (r)} \tau_{\mathcal C}(n)^2 \ll \frac{x}{r} \tau(r)^{O(1)} \log^{O(1)} x + x^{o(1)} \ \ \ \ \ (2)$

for any fixed ${C>1}$ and any congruence class ${a\ (r)}$ with ${r \in {\mathcal S}_I}$. Here ${\tau}$ is the divisor function.

Next we need a relaxation of the concept of ${y}$-smoothness.

Definition 2 (Dense divisibility) Let ${y \geq 1}$. A positive integer ${q}$ is said to be ${y}$-densely divisible if, for every ${1 \leq R \leq q}$, there exists a factor of ${q}$ in the interval ${[y^{-1} R, R]}$. We let ${{\mathcal D}_y}$ denote the set of ${y}$-densely divisible positive integers.
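As an illustrative aside, Definition 2 can be tested by brute force for small ${q}$. The sketch below assumes that ${R}$ ranges over real numbers in ${[1,q]}$; under that reading, the condition is equivalent to requiring that consecutive divisors ${d < d'}$ of ${q}$ satisfy ${d' \leq y d}$. The names `divisors` and `is_densely_divisible` are hypothetical.

```python
def divisors(q):
    """Return the sorted list of positive divisors of q (trial division)."""
    small, large = [], []
    d = 1
    while d * d <= q:
        if q % d == 0:
            small.append(d)
            if d != q // d:
                large.append(q // d)
        d += 1
    return small + large[::-1]

def is_densely_divisible(q, y):
    """Check y-dense divisibility of q, assuming R ranges over reals in [1, q].

    For R between consecutive divisors d <= R < d', the largest divisor
    not exceeding R is d, so the condition d >= R / y for all such R
    amounts to d' <= y * d.
    """
    divs = divisors(q)
    return all(b <= y * a for a, b in zip(divs, divs[1:]))

# 30 = 2 * 3 * 5 is 5-smooth, hence 5-densely divisible;
# 202 = 2 * 101 has a large gap between the divisors 2 and 101.
print(is_densely_divisible(30, 5), is_densely_divisible(202, 5))
```

The second example shows why dense divisibility is weaker than smoothness but still restrictive: a single large prime factor is permitted only if the remaining divisors fill in the scales below it.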

Now we present a strengthened version ${MPZ'[\varpi,\delta]}$ of the Motohashi-Pintz-Zhang conjecture ${MPZ[\varpi,\delta]}$, which depends on parameters ${0 < \varpi < 1/4}$ and ${0 < \delta < 1/4}$.

Conjecture 3 (${MPZ'[\varpi,\delta]}$) Let ${I \subset {\bf R}}$, and let ${(\{a_q\})_{q \in {\mathcal S}_I}}$ be a congruence class system with controlled multiplicity. Then

$\displaystyle \sum_{q \in {\mathcal S}_I \cap {\mathcal D}_{x^\delta}: q< x^{1/2+2\varpi}} |\Delta(\Lambda 1_{[x,2x]}; a_q)| \ll x \log^{-A} x \ \ \ \ \ (3)$

for any fixed ${A>0}$, where ${\Lambda}$ is the von Mangoldt function.

The difference between this conjecture and the weaker conjecture ${MPZ[\varpi,\delta]}$ is that the modulus ${q}$ is constrained to be ${x^\delta}$-densely divisible rather than ${x^\delta}$-smooth (note that ${I}$ is no longer constrained to lie in ${[1,x^\delta]}$). This relaxation of the smoothness condition improves the Goldston-Pintz-Yildirim type sieving needed to deduce ${DHL[k_0,2]}$ from ${MPZ'[\varpi,\delta]}$; see this previous post.

The main result we will establish is

Theorem 4 ${MPZ'[\varpi,\delta]}$ holds for any ${\varpi,\delta>0}$ with

$\displaystyle 148\varpi+33\delta < 1. \ \ \ \ \ (4)$

This improves upon previous constraints of ${87\varpi + 17 \delta < \frac{1}{4}}$ (see this blog comment) and ${207 \varpi + 43 \delta < \frac{1}{4}}$ (see Theorem 13 of this previous post), which were also only established for ${MPZ[\varpi,\delta]}$ instead of ${MPZ'[\varpi,\delta]}$. Inserting Theorem 4 into the Pintz sieve from this previous post gives ${DHL[k_0,2]}$ for ${k_0 = 1467}$ (see this blog comment), which when inserted in turn into newly set up tables of narrow prime tuples gives infinitely many prime gaps of separation at most ${H = 12,012}$.
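To compare these constraints at a glance, one can compute the supremum of the admissible ${\varpi}$ in each case as ${\delta \rightarrow 0}$. The following is a small arithmetic sketch only; the helper name `max_varpi` is hypothetical, and the coefficients are simply read off from (4) and from the two older constraints quoted above.

```python
from fractions import Fraction

def max_varpi(a, b, c, delta=Fraction(0)):
    """Supremum of varpi allowed by a*varpi + b*delta < c for a given delta."""
    return (Fraction(c) - b * Fraction(delta)) / a

# (a, b, c) for each constraint a*varpi + b*delta < c
constraints = {
    "Theorem 4":        (148, 33, Fraction(1)),
    "87, 17 version":   (87, 17, Fraction(1, 4)),
    "207, 43 version":  (207, 43, Fraction(1, 4)),
}

for name, (a, b, c) in constraints.items():
    print(name, max_varpi(a, b, c))
```

In the limit ${\delta \rightarrow 0}$ this gives ${\varpi < 1/148}$ for Theorem 4, against ${\varpi < 1/348}$ and ${\varpi < 1/828}$ for the two earlier constraints.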

As in previous posts, we use the following asymptotic notation: ${x}$ is a parameter going off to infinity, and all quantities may depend on ${x}$ unless explicitly declared to be “fixed”. The asymptotic notation ${O(), o(), \ll}$ is then defined relative to this parameter. A quantity ${q}$ is said to be of polynomial size if one has ${q = O(x^{O(1)})}$, and said to be bounded if ${q=O(1)}$. Another convenient notation: we write ${X \lessapprox Y}$ for ${X \ll x^{o(1)} Y}$. Thus for instance the divisor bound asserts that if ${q}$ has polynomial size, then the number of divisors of ${q}$ is ${\lessapprox 1}$.

This post is intended to highlight a phenomenon unearthed in the ongoing polymath8 project (and is in fact a key component of Zhang’s proof that there are bounded gaps between primes infinitely often), namely that one can get quite good bounds on relatively short exponential sums when the modulus ${q}$ is smooth, through the basic technique of Weyl differencing (ultimately based on the Cauchy-Schwarz inequality, and also related to the van der Corput lemma in equidistribution theory). Improvements in the case of smooth moduli have appeared before in the literature (e.g. in this paper of Heath-Brown, this paper of Graham and Ringrose, this later paper of Heath-Brown, this paper of Chang, or this paper of Goldmakher); the arguments here are particularly close to those of the first paper of Heath-Brown. It now also appears that further optimisation of this Weyl differencing trick could lead to noticeable improvements in the numerology for the polymath8 project, so I am devoting this post to explaining this trick further.

To illustrate the method, let us begin with the classical problem in analytic number theory of estimating an incomplete character sum

$\displaystyle \sum_{M+1 \leq n \leq M+N} \chi(n)$

where ${\chi}$ is a primitive Dirichlet character of some conductor ${q}$, ${M}$ is an integer, and ${N}$ is some quantity between ${1}$ and ${q}$. Clearly we have the trivial bound

$\displaystyle |\sum_{M+1 \leq n \leq M+N} \chi(n)| \leq N; \ \ \ \ \ (1)$

we also have the classical Pólya-Vinogradov inequality

$\displaystyle |\sum_{M+1 \leq n \leq M+N} \chi(n)| \ll q^{1/2} \log q. \ \ \ \ \ (2)$

This latter inequality gives improvements over the trivial bound when ${N}$ is much larger than ${q^{1/2}}$, but not for ${N}$ much smaller than ${q^{1/2}}$. The Pólya-Vinogradov inequality can be deduced via a little Fourier analysis from the completed exponential sum bound

$\displaystyle | \sum_{n \in {\bf Z}/q{\bf Z}} \chi(n) e_q( an )| \ll q^{1/2}$

for any ${a \in {\bf Z}/q{\bf Z}}$, where ${e_q(n) :=e^{2\pi i n/q}}$. (In fact, from the classical theory of Gauss sums, this exponential sum is equal to ${\tau(\chi) \overline{\chi(a)}}$ for some complex number ${\tau(\chi)}$ of norm ${\sqrt{q}}$.)
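For readers who like to see such identities numerically, here is a minimal Python sketch for the simplest case, in which ${q}$ is an odd prime and ${\chi}$ is the Legendre symbol (an assumption made purely for illustration; the bound above applies to any primitive character). The names `legendre` and `completed_sum` are hypothetical. Since this ${\chi}$ is real-valued, the Gauss sum identity predicts that the completed sum at ${a}$ equals ${\chi(a)}$ times the completed sum at ${1}$, with absolute value ${\sqrt{q}}$ when ${a \neq 0}$.

```python
import cmath

def legendre(n, p):
    """Legendre symbol (n/p) for an odd prime p, via Euler's criterion."""
    n %= p
    if n == 0:
        return 0
    return 1 if pow(n, (p - 1) // 2, p) == 1 else -1

def completed_sum(a, p):
    """Sum over n mod p of chi(n) * e_p(a n), with chi the Legendre symbol."""
    return sum(legendre(n, p) * cmath.exp(2j * cmath.pi * a * n / p)
               for n in range(p))

p = 13  # hypothetical small prime modulus
for a in range(p):
    # Print |sum| next to sqrt(p) for comparison.
    print(a, round(abs(completed_sum(a, p)), 6), round(p ** 0.5, 6))
```

The row ${a = 0}$ is the degenerate case where the sum vanishes, consistent with ${\overline{\chi(0)} = 0}$.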

In the case when ${q}$ is a prime, improving upon the above two inequalities is an important but difficult problem, with only partially satisfactory results so far. To give just one indication of the difficulty, the seemingly modest improvement

$\displaystyle |\sum_{M+1 \leq n \leq M+N} \chi(n)| \ll p^{1/2} \log \log p$

to the Pólya-Vinogradov inequality when ${q=p}$ is a prime required a 14-page paper in Inventiones by Montgomery and Vaughan to prove, and even then it was only conditional on the generalised Riemann hypothesis! See also this more recent paper of Granville and Soundararajan for an unconditional variant of this result in the case that ${\chi}$ has odd order.

Another important improvement is the Burgess bound, which in our notation asserts that

$\displaystyle |\sum_{M+1 \leq n \leq M+N} \chi(n)| \lessapprox N^{1-1/r} q^{\frac{r+1}{4r^2}} \ \ \ \ \ (3)$

for any fixed integer ${r \geq 2}$, assuming that ${q}$ is square-free (for simplicity) and of polynomial size; see this previous post for a discussion of the Burgess argument. This is non-trivial for ${N}$ as small as ${q^{1/4+o(1)}}$.

In the case when ${q}$ is prime, there has been very little improvement to the Burgess bound (or its Fourier dual, which can give bounds for ${N}$ as large as ${q^{3/4-o(1)}}$) in the last fifty years; an improvement to the exponents in (3) in this case (particularly anything that gave a power saving for ${N}$ below ${q^{1/4}}$) would in fact be rather significant news in analytic number theory.

However, in the opposite case when ${q}$ is smooth – that is to say, all of its prime factors are much smaller than ${q}$ – then one can do better than the Burgess bound in some regimes. This fact has been observed in several places in the literature (in particular, in the papers of Heath-Brown, Graham-Ringrose, Chang, and Goldmakher mentioned previously), but also turns out to (implicitly) be a key insight in Zhang’s paper on bounded prime gaps. In the case of character sums, one such improved estimate (closely related to Theorem 2 of the Heath-Brown paper) is as follows:

Proposition 1 Let ${q}$ be square-free with a factorisation ${q = q_1 q_2}$ and of polynomial size, and let ${M,N}$ be integers with ${1 \leq N \leq q}$. Then for any primitive character ${\chi}$ with conductor ${q}$, one has

$\displaystyle | \sum_{M+1 \leq n \leq M+N} \chi(n) | \lessapprox N^{1/2} q_1^{1/2} + N^{1/2} q_2^{1/4}.$

This proposition is particularly powerful when ${q}$ is smooth, as this gives many factorisations ${q = q_1 q_2}$ with the ability to specify ${q_1,q_2}$ with a fair amount of accuracy. For instance, if ${q}$ is ${y}$-smooth (i.e. all prime factors are at most ${y}$), then by the greedy algorithm one can find a divisor ${q_1}$ of ${q}$ with ${y^{-2/3} q^{1/3} \leq q_1 \leq y^{1/3} q^{1/3}}$; if we set ${q_2 := q/q_1}$, then ${y^{-1/3} q^{2/3} \leq q_2 \leq y^{2/3} q^{2/3}}$, and the above proposition then gives

$\displaystyle | \sum_{M+1 \leq n \leq M+N} \chi(n) | \lessapprox y^{1/6} N^{1/2} q^{1/6}$

which can improve upon the Burgess bound when ${y}$ is small. For instance, if ${N = q^{1/2}}$, then this bound becomes ${\lessapprox y^{1/6} q^{5/12}}$; in contrast the Burgess bound only gives ${\lessapprox q^{7/16}}$ for this value of ${N}$ (using the optimal choice ${r=2}$ for ${r}$), which is inferior for ${y < q^{1/8}}$.
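The numerology in the last two paragraphs can be checked mechanically. The following Python sketch is only an illustration under stated assumptions: it writes ${N = q^\theta}$ and ${y = q^\eta}$ and compares exponents of ${q}$ (ignoring the ${x^{o(1)}}$ losses hidden in ${\lessapprox}$), and it implements one possible reading of "the greedy algorithm" for a squarefree ${y}$-smooth ${q}$, namely multiplying in prime factors until the lower bound ${y^{-2/3} q^{1/3}}$ is reached. The function names are hypothetical.

```python
from fractions import Fraction

def burgess_exponent(theta, r):
    """Exponent of q in the Burgess bound (3) when N = q^theta."""
    return Fraction(theta) * (1 - Fraction(1, r)) + Fraction(r + 1, 4 * r * r)

def smooth_exponent(theta, eta):
    """Exponent of q in y^{1/6} N^{1/2} q^{1/6} when N = q^theta, y = q^eta."""
    return Fraction(eta) / 6 + Fraction(theta) / 2 + Fraction(1, 6)

def greedy_split(primes, y):
    """Split squarefree q = prod(primes), all primes <= y, as q = q1 * q2.

    Multiply in primes until q1 >= y^{-2/3} q^{1/3}; the last prime used
    is at most y, so q1 < y * y^{-2/3} q^{1/3} = y^{1/3} q^{1/3}.
    """
    q = 1
    for p in primes:
        q *= p
    lower = y ** (-2 / 3) * q ** (1 / 3)
    q1 = 1
    for p in primes:
        if q1 >= lower:
            break
        q1 *= p
    return q1, q // q1

half = Fraction(1, 2)
print(burgess_exponent(half, 2), smooth_exponent(half, 0))
print(greedy_split([2, 3, 5, 7, 11, 13], 13))
```

With ${\theta = 1/2}$ the two exponents coincide exactly at ${\eta = 1/8}$, which is the threshold ${y < q^{1/8}}$ mentioned above.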

The hypothesis that ${q}$ be squarefree may be relaxed, but for applications to the Polymath8 project, it is only the squarefree moduli that are relevant.

Proof: If ${N \ll q_1}$ then the claim follows from the trivial bound (1), while for ${N \gg q_2}$ the claim follows from (2). Hence we may assume that

$\displaystyle q_1 < N < q_2.$

We use the method of Weyl differencing, the key point being to difference in multiples of ${q_1}$.

Let ${K := \lfloor N/q_1 \rfloor}$, thus ${K \geq 1}$. For any ${1 \leq k \leq K}$, we have

$\displaystyle \sum_{M+1 \leq n \leq M+N} \chi(n) = \sum_n 1_{[M+1,M+N]}(n+kq_1) \chi(n+kq_1)$

and thus on averaging

$\displaystyle \sum_{M+1 \leq n \leq M+N} \chi(n) = \frac{1}{K} \sum_n \sum_{k=1}^K 1_{[M+1,M+N]}(n+kq_1) \chi(n+kq_1). \ \ \ \ \ (4)$

By the Chinese remainder theorem, we may factor

$\displaystyle \chi(n) = \chi_1(n) \chi_2(n)$

where ${\chi_1,\chi_2}$ are primitive characters of conductor ${q_1,q_2}$ respectively. As ${\chi_1}$ is periodic of period ${q_1}$, we thus have

$\displaystyle \chi(n+kq_1) = \chi_1(n) \chi_2(n+kq_1)$

and so we can take ${\chi_1}$ out of the inner summation of the right-hand side of (4) to obtain

$\displaystyle \sum_{M+1 \leq n \leq M+N} \chi(n) = \frac{1}{K} \sum_n \chi_1(n) \sum_{k=1}^K 1_{[M+1,M+N]}(n+kq_1) \chi_2(n+kq_1)$

and hence by the triangle inequality

$\displaystyle |\sum_{M+1 \leq n \leq M+N} \chi(n)| \leq \frac{1}{K} \sum_n |\sum_{k=1}^K 1_{[M+1,M+N]}(n+kq_1) \chi_2(n+kq_1)|.$

Note how the characters on the right-hand side only have period ${q_2}$ rather than ${q=q_1 q_2}$. This reduction in the period is ultimately the source of the saving over the Pólya-Vinogradov inequality.

Note that the inner sum vanishes unless ${n \in [M+1-Kq_1,M+N]}$, which is an interval of length ${O(N)}$ by choice of ${K}$. Thus by Cauchy-Schwarz one has

$\displaystyle | \sum_{M+1 \leq n \leq M+N} \chi(n) | \ll$

$\displaystyle \frac{N^{1/2}}{K} (\sum_n |\sum_{k=1}^K 1_{[M+1,M+N]}(n+kq_1) \chi_2(n+kq_1)|^2)^{1/2}.$

We expand the right-hand side as

$\displaystyle \frac{N^{1/2}}{K} |\sum_{1 \leq k,k' \leq K} \sum_n$

$\displaystyle 1_{[M+1,M+N]}(n+kq_1) 1_{[M+1,M+N]}(n+k'q_1) \chi_2(n+kq_1) \overline{\chi_2(n+k'q_1)}|^{1/2}.$

We first consider the diagonal contribution ${k=k'}$. In this case we use the trivial bound ${O(N)}$ for the inner summation, and we soon see that the total contribution here is ${O( K^{-1/2} N ) = O( N^{1/2}q_1^{1/2} )}$.

Now we consider the off-diagonal case; by symmetry we can take ${k < k'}$. Then the indicator functions ${1_{[M+1,M+N]}(n+kq_1) 1_{[M+1,M+N]}(n+k'q_1)}$ restrict ${n}$ to the interval ${[M+1-kq_1, M+N-k'q_1]}$. On the other hand, as a consequence of the Weil conjectures for curves one can show that

$\displaystyle |\sum_{n \in {\bf Z}/q_2{\bf Z}} \chi_2(n+kq_1) \overline{\chi_2(n+k'q_1)} e_{q_2}(an)| \lessapprox q_2^{1/2} (k-k',q_2)^{1/2}$

for any ${a \in {\bf Z}/q_2{\bf Z}}$; indeed one can use the Chinese remainder theorem and the square-free nature of ${q_2}$ to reduce to the case when ${q_2}$ is prime, in which case one can apply (for instance) the original paper of Weil to establish this bound, noting also that ${q_1}$ and ${q_2}$ are coprime since ${q}$ is squarefree. Applying the method of completion of sums (or the Parseval formula), this shows that

$\displaystyle |\sum_n 1_{[M+1,M+N]}(n+kq_1) 1_{[M+1,M+N]}(n+k'q_1) \chi_2(n+kq_1) \overline{\chi_2(n+k'q_1)}|$

$\displaystyle \lessapprox q_2^{1/2} (k-k',q_2)^{1/2}.$

Summing in ${k,k'}$ (using Lemma 5 from this previous post) we see that the total contribution to the off-diagonal case is

$\displaystyle \lessapprox \frac{N^{1/2}}{K} ( K^2 q_2^{1/2} )^{1/2}$

which simplifies to ${\lessapprox N^{1/2} q_2^{1/4}}$. The claim follows. $\Box$
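The algebraic heart of the proof, namely the averaged identity (4) together with the factorisation ${\chi(n+kq_1) = \chi_1(n) \chi_2(n+kq_1)}$, can be sanity-checked numerically. The sketch below is an illustration only: it assumes ${q_1, q_2}$ are distinct odd primes and takes ${\chi_1, \chi_2}$ to be the corresponding Legendre symbols, so that ${\chi}$ is a primitive quadratic character of conductor ${q_1 q_2}$; the function names and the sample values of ${M, N, q_1, q_2}$ are hypothetical. No attempt is made to test the final bound numerically, since ${\lessapprox}$ hides ${x^{o(1)}}$ factors.

```python
def legendre(n, p):
    """Legendre symbol (n/p) for an odd prime p, via Euler's criterion."""
    n %= p
    if n == 0:
        return 0
    return 1 if pow(n, (p - 1) // 2, p) == 1 else -1

def direct_sum(M, N, q1, q2):
    """Left-hand side of (4): sum of chi(n) = chi_1(n) chi_2(n) over [M+1, M+N]."""
    return sum(legendre(n, q1) * legendre(n, q2)
               for n in range(M + 1, M + N + 1))

def differenced_sum(M, N, q1, q2):
    """K times the right-hand side of (4), with chi_1(n) pulled out of the k-sum."""
    K = N // q1
    total = 0
    # The inner sum vanishes unless n lies in [M + 1 - K*q1, M + N].
    for n in range(M + 1 - K * q1, M + N + 1):
        inner = sum(legendre(n + k * q1, q2)
                    for k in range(1, K + 1)
                    if M + 1 <= n + k * q1 <= M + N)
        total += legendre(n, q1) * inner
    return total, K

M, N, q1, q2 = 5, 20, 7, 31  # hypothetical values with q1 < N < q2
total, K = differenced_sum(M, N, q1, q2)
print(total, K * direct_sum(M, N, q1, q2))
```

The two printed integers should agree exactly, since each shift ${k}$ merely relabels the summation variable; all of the analytic content of the proposition lies in the subsequent Cauchy-Schwarz step and the Weil bound for the off-diagonal terms.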

A modification of the above argument (using more complicated versions of the Weil conjectures) allows one to replace the summand ${\chi(n)}$ by more complicated summands such as ${\chi(f(n)) e_q(g(n))}$ for some polynomials or rational functions ${f,g}$ of bounded degree and obeying a suitable non-degeneracy condition (after restricting of course to those ${n}$ for which the arguments ${f(n),g(n)}$ are well-defined). We will not detail this here, but instead turn to the question of estimating slightly longer exponential sums, such as

$\displaystyle \sum_{1 \leq n \leq N} e_{d_1}( \frac{c_1}{n} ) e_{d_2}( \frac{c_2}{n+l} )$

where ${N}$ should be thought of as a little bit larger than ${(d_1d_2)^{1/2}}$.