As in previous posts, we use the following asymptotic notation: {x} is a parameter going off to infinity, and all quantities may depend on {x} unless explicitly declared to be “fixed”. The asymptotic notation {O(), o(), \ll} is then defined relative to this parameter. A quantity {q} is said to be of polynomial size if one has {q = O(x^{O(1)})}, and said to be bounded if {q=O(1)}. Another convenient notation: we write {X \lessapprox Y} for {X \ll x^{o(1)} Y}. Thus for instance the divisor bound asserts that if {q} has polynomial size, then the number of divisors of {q} is {\lessapprox 1}.

This post is intended to highlight a phenomenon unearthed in the ongoing polymath8 project (and is in fact a key component of Zhang’s proof that there are bounded gaps between primes infinitely often), namely that one can get quite good bounds on relatively short exponential sums when the modulus {q} is smooth, through the basic technique of Weyl differencing (ultimately based on the Cauchy-Schwarz inequality, and also related to the van der Corput lemma in equidistribution theory). Improvements in the case of smooth moduli have appeared before in the literature (e.g. in this paper of Heath-Brown, paper of Graham and Ringrose, this later paper of Heath-Brown, this paper of Chang, or this paper of Goldmakher); the arguments here are particularly close to that of the first paper of Heath-Brown. It now also appears that further optimisation of this Weyl differencing trick could lead to noticeable improvements in the numerology for the polymath8 project, so I am devoting this post to explaining this trick further.

To illustrate the method, let us begin with the classical problem in analytic number theory of estimating an incomplete character sum

\displaystyle  \sum_{M+1 \leq n \leq M+N} \chi(n)

where {\chi} is a primitive Dirichlet character of some conductor {q}, {M} is an integer, and {N} is some quantity between {1} and {q}. Clearly we have the trivial bound

\displaystyle  |\sum_{M+1 \leq n \leq M+N} \chi(n)| \leq N; \ \ \ \ \ (1)

we also have the classical Pólya-Vinogradov inequality

\displaystyle  |\sum_{M+1 \leq n \leq M+N} \chi(n)| \ll q^{1/2} \log q. \ \ \ \ \ (2)

This latter inequality gives improvements over the trivial bound when {N} is much larger than {q^{1/2}}, but not for {N} much smaller than {q^{1/2}}. The Pólya-Vinogradov inequality can be deduced via a little Fourier analysis from the completed exponential sum bound

\displaystyle  | \sum_{n \in {\bf Z}/q{\bf Z}} \chi(n) e_q( an )| \ll q^{1/2}

for any {a \in {\bf Z}/q{\bf Z}}, where {e_q(n) :=e^{2\pi i n/q}}. (In fact, from the classical theory of Gauss sums, this exponential sum is equal to {\tau(\chi) \overline{\chi(a)}} for some complex number {\tau(\chi)} of norm {\sqrt{q}}.)

In the case when {q} is a prime, improving upon the above two inequalities is an important but difficult problem, with only partially satisfactory results so far. To give just one indication of the difficulty, the seemingly modest improvement

\displaystyle |\sum_{M+1 \leq n \leq M+N} \chi(n)| \ll p^{1/2} \log \log p

to the Pólya-Vinogradov inequality when {q=p} was a prime required a 14-page paper in Inventiones by Montgomery and Vaughan to prove, and even then it was only conditional on the generalised Riemann hypothesis! See also this more recent paper of Granville and Soundararajan for an unconditional variant of this result in the case that {\chi} has odd order.

Another important improvement is the Burgess bound, which in our notation asserts that

\displaystyle  |\sum_{M+1 \leq n \leq M+N} \chi(n)| \lessapprox N^{1-1/r} q^{\frac{r+1}{4r^2}} \ \ \ \ \ (3)

for any fixed integer {r \geq 2}, assuming that {q} is square-free (for simplicity) and of polynomial size; see this previous post for a discussion of the Burgess argument. This is non-trivial for {N} as small as {q^{1/4+o(1)}}.

In the case when {q} is prime, there has been very little improvement to the Burgess bound (or its Fourier dual, which can give bounds for {N} as large as {q^{3/4-o(1)}}) in the last fifty years; an improvement to the exponents in (3) in this case (particularly anything that gave a power saving for {N} below {q^{1/4}}) would in fact be rather significant news in analytic number theory.

However, in the opposite case when {q} is smooth – that is to say, all of its factors are much smaller than {q} – then one can do better than the Burgess bound in some regimes. This fact has been observed in several places in the literature (in particular, in the papers of Heath-Brown, Graham-Ringrose, Chang, and Goldmakher mentioned previously), but also turns out to (implicitly) be a key insight in Zhang’s paper on bounded prime gaps. In the case of character sums, one such improved estimate (closely related to Theorem 2 of the Heath-Brown paper) is as follows:

Proposition 1 Let {q} be square-free with a factorisation {q = q_1 q_2} and of polynomial size, and let {M,N} be integers with {1 \leq N \leq q}. Then for any primitive character {\chi} with conductor {q}, one has

\displaystyle  | \sum_{M+1 \leq n \leq M+N} \chi(n) | \lessapprox N^{1/2} q_1^{1/2} + N^{1/2} q_2^{1/4}.

This proposition is particularly powerful when {q} is smooth, as this gives many factorisations {q = q_1 q_2} with the ability to specify {q_1,q_2} with a fair amount of accuracy. For instance, if {q} is {y}-smooth (i.e. all prime factors are at most {y}), then by the greedy algorithm one can find a divisor {q_1} of {q} with {y^{-2/3} q^{1/3} \leq q_1 \leq y^{1/3} q^{1/3}}; if we set {q_2 := q/q_1}, then {y^{-1/3} q^{2/3} \leq q_2 \leq y^{2/3} q^{2/3}}, and the above proposition then gives

\displaystyle  | \sum_{M+1 \leq n \leq M+N} \chi(n) | \lessapprox y^{1/6} N^{1/2} q^{1/6}

which can improve upon the Burgess bound when {y} is small. For instance, if {N = q^{1/2}}, then this bound becomes {\lessapprox y^{1/6} q^{5/12}}; in contrast the Burgess bound only gives {\lessapprox q^{7/16}} for this value of {N} (using the optimal choice {r=2} for {r}), which is inferior for {y < q^{1/8}}.

The hypothesis that {q} be squarefree may be relaxed, but for applications to the Polymath8 project, it is only the squarefree moduli that are relevant.

Proof: If {N \ll q_1} then the claim follows from the trivial bound (1), while for {N \gg q_2} the claim follows from (2). Hence we may assume that

\displaystyle  q_1 < N < q_2.

We use the method of Weyl differencing, the key point being to difference in multiples of {q_1}.

Let {K := \lfloor N/q_1 \rfloor}, thus {K \geq 1}. For any {1 \leq k \leq K}, we have

\displaystyle  \sum_{M+1 \leq n \leq M+N} \chi(n) = \sum_n 1_{[M+1,M+N]}(n+kq_1) \chi(n+kq_1)

and thus on averaging

\displaystyle  \sum_{M+1 \leq n \leq M+N} \chi(n) = \frac{1}{K} \sum_n \sum_{k=1}^K 1_{[M+1,M+N]}(n+kq_1) \chi(n+kq_1). \ \ \ \ \ (4)

By the Chinese remainder theorem, we may factor

\displaystyle  \chi(n) = \chi_1(n) \chi_2(n)

where {\chi_1,\chi_2} are primitive characters of conductor {q_1,q_2} respectively. As {\chi_1} is periodic of period {q_1}, we thus have

\displaystyle  \chi(n+kq_1) = \chi_1(n) \chi_2(n+kq_2)

and so we can take {\chi_1} out of the inner summation of the right-hand side of (4) to obtain

\displaystyle  \sum_{M+1 \leq n \leq M+N} \chi(n) = \frac{1}{K} \sum_n \chi_1(n) \sum_{k=1}^K 1_{[M+1,M+N]}(n+kq_1) \chi_2(n+kq_1)

and hence by the triangle inequality

\displaystyle  |\sum_{M+1 \leq n \leq M+N} \chi(n)| \leq \frac{1}{K} \sum_n |\sum_{k=1}^K 1_{[M+1,M+N]}(n+kq_1) \chi_2(n+kq_1)|.

Note how the characters on the right-hand side only have period {q_2} rather than {q=q_1 q_2}. This reduction in the period is ultimately the source of the saving over the Pólya-Vinogradov inequality.

Note that the inner sum vanishes unless {n \in [M+1-Kq_1,M+N]}, which is an interval of length {O(N)} by choice of {K}. Thus by Cauchy-Schwarz one has

\displaystyle  | \sum_{M+1 \leq n \leq M+N} \chi(n) | \ll

\displaystyle  \frac{N^{1/2}}{K} (\sum_n |\sum_{k=1}^K 1_{[M+1,M+N]}(n+kq_1) \chi_2(n+kq_1)|^2)^{1/2}.

We expand the right-hand side as

\displaystyle  \frac{N^{1/2}}{K} |\sum_{1 \leq k,k' \leq K} \sum_n

\displaystyle  1_{[M+1,M+N]}(n+kq_1) 1_{[M+1,M+N]}(n+k'q_1) \chi_2(n+kq_1) \overline{\chi_2(n+k'q_1)}|^{1/2}.

We first consider the diagonal contribution {k=k'}. In this case we use the trivial bound {O(N)} for the inner summation, and we soon see that the total contribution here is {O( K^{-1/2} N ) = O( N^{1/2}q_1^{1/2} )}.

Now we consider the off-diagonal case; by symmetry we can take {k < k'}. Then the indicator functions {1_{[M+1,M+N]}(n+kq_1) 1_{[M+1,M+N]}(n+k'q_1)} restrict {n} to the interval {[M+1-kq_1, M+N-k'q_1]}. On the other hand, as a consequence of the Weil conjectures for curves one can show that

\displaystyle  |\sum_{n \in {\bf Z}/q_2{\bf Z}} \chi_2(n+kq_1) \overline{\chi_2(n+k'q_1)} e_{q_2}(an)| \lessapprox q_2^{1/2} (k-k',q_2)^{1/2}

for any {a \in {\bf Z}/q_2{\bf Z}}; indeed one can use the Chinese remainder theorem and the square-free nature of {q_2} to reduce to the case when {q_2} is prime, in which case one can apply (for instance) the original paper of Weil to establish this bound, noting also that {q_1} and {q_2} are coprime since {q} is squarefree. Applying the method of completion of sums (or the Parseval formula), this shows that

\displaystyle  |\sum_n 1_{[M+1,M+N]}(n+kq_1) 1_{[M+1,M+N]}(n+k'q_1) \chi_2(n+kq_1) \overline{\chi_2(n+k'q_1)}|

\displaystyle  \lessapprox q_2^{1/2} (k-k',q_2)^{1/2}.

Summing in {k,k'} (using Lemma 5 from this previous post) we see that the total contribution to the off-diagonal case is

\displaystyle  \lessapprox \frac{N^{1/2}}{K} ( K^2 q_2^{1/2} )^{1/2}

which simplifies to {\lessapprox N^{1/2} q_2^{1/4}}. The claim follows. \Box

A modification of the above argument (using more complicated versions of the Weil conjectures) allows one to replace the summand {\chi(n)} by more complicated summands such as {\chi(f(n)) e_q(g(n))} for some polynomials or rational functions {f,g} of bounded degree and obeying a suitable non-degeneracy condition (after restricting of course to those {n} for which the arguments {f(n),g(n)} are well-defined). We will not detail this here, but instead turn to the question of estimating slightly longer exponential sums, such as

\displaystyle  \sum_{1 \leq n \leq N} e_{d_1}( \frac{c_1}{n} ) e_{d_2}( \frac{c_2}{n+l} )

where {N} should be thought of as a little bit larger than {(d_1d_2)^{1/2}}.

It will be convenient to introduce a little bit of formal algebraic notation. For the narrow task of establishing a new estimate on an incomplete Kloosterman-type sum of the type used in Zhang’s paper, one does not really need all this formalism, but it may be useful to have this formalism around for future generalisations and strengthenings of this estimate (e.g. when trying to set up a Burgess-style argument).

Define an integral rational function to be a formal rational function {f(t) = \frac{P(t)}{Q(t)}} in a formal indeterminate {t} where {P,Q \in {\bf Z}[t]} are polynomials with integer coefficients, and {Q} is monic; in particular any polynomial {P(t) \in {\bf Z}[t]} can be identified with the integral rational function {P(t) \equiv \frac{P(t)}{1}}. For minor technical reasons we do not equate integral rational functions under cancelling, thus for instance we consider {\frac{P(t)R(t)}{Q(t)R(t)}} to be distinct from {\frac{P(t)}{Q(t)}}; we need to do this because the domain of definition of these two functions is a little different (the former is not defined when {R(t)=0}, but the latter can still be defined here). Because we refuse to cancel, we have to be a little careful how we define algebraic operations: specifically, we define

\displaystyle  \frac{P_1(t)}{Q_1(t)} + \frac{P_2(t)}{Q_2(t)} := \frac{P_1(t) Q_2(t) + P_2(t) Q_1(t)}{Q_1(t) Q_2(t)}

\displaystyle  \frac{P_1(t)}{Q_1(t)} - \frac{P_2(t)}{Q_2(t)} := \frac{P_1(t) Q_2(t) - P_2(t) Q_1(t)}{Q_1(t) Q_2(t)}

\displaystyle  \frac{P_1(t)}{Q_1(t)} \cdot \frac{P_2(t)}{Q_2(t)} := \frac{P_1(t) P_2(t)}{Q_1(t) Q_2(t)}

\displaystyle  (\frac{P(t)}{Q(t)})' := \frac{P'(t) Q(t) - P(t) Q'(t)}{Q(t)^2}.

Note that the denominator always remains monic with respect to these operations. This is not quite a ring with a derivation (the subtraction operation does not quite cancel the addition operation due to the inability to cancel) but this will not bother us in practice. (On the other hand, addition and multiplication remain associative, and the latter continues to distribute over the former, and differentiation obeys the usual sum and product rules.)

In applications to bounded gaps between primes, we will deal with integral rational functions of the explicit form

\displaystyle  f(t) := \frac{a}{t} + \frac{b}{t+l} + ct

for various integers {a,b,c,l}, but much of the theory below extends without much difficulty to more general integral rational functions, so we shall state it in that level of generality.

Note that if {f(t) = \frac{P(t)}{Q(t)}} is an integral rational function, we can localise it modulo {q} for any modulus {q} to obtain a rational function {f(t)\ (q) := \frac{P(t)\ (q)}{Q(t)\ (q)}} that is the ratio of two polynomials {P\ (q), Q\ (q)} in {({\bf Z}/q{\bf Z})[t]}, with the denominator monic and hence non-vanishing. We can define the algebraic operations of addition, subtraction, multiplication, and differentiation on integral rational functions modulo {q} by the same formulae as above, and we observe that these operations are completely compatible with their counterparts over {{\bf Z}} (even without the ability to cancel), thus for instance {(f+g)\ (q) = (f\ (q)) + (g\ (q))}. We say that {f} is divisible by {q}, and write {q|f}, if the numerator {P(t)} of {f(t)} has all coefficients divisible by {q}.

If {f} is an integral rational function and {n \in {\bf Z}/q{\bf Z}}, then {f(n)} is well defined as an element of {{\bf Z}/q{\bf Z}} except when {Q(n)} is a zero divisor in {{\bf Z}/q{\bf Z}}. We adopt the convention that {e_q(f(n)) = 0} when {Q(n)} is a zero divisor in {{\bf Z}/q{\bf Z}}, thus {e_q(f(n))} is really shorthand for {1_{(Q(n),q)=1} e_q(f(n))}; by abuse of notation we view {n \mapsto e_q(f(n))} both as a function on {{\bf Z}/q{\bf Z}} and as a {q}-periodic function on {{\bf Z}}. Thus for instance

\displaystyle  e_q( \frac{0}{n} ) = 1_{(n,q)=1}.

Note that if {q|f}, then {f(n)=0\ (q)} for all {n \in {\bf Z}/q{\bf Z}} for which {f(n)} is well defined. We define {(f,q)} to be the largest factor {q'} of {q} for which {q'|f}; in particular, if {q} is square-free, we have

\displaystyle  (f,q) = \prod_{p|q: p|f} p.

We recall the following Chinese remainder theorem:

Lemma 2 (Chinese remainder theorem) Let {q = q_1 q_2} with {q_1,q_2} coprime positive integers. If {f} is a rational function, then

\displaystyle  e_q( f(n) ) = e_{q_1}( \bar{q_2} f(n) ) e_{q_2} ( \bar{q_1} f(n) )

for all integers {n}, where {\bar{q_2}} is the inverse of {q_2} in {{\bf Z}/q_1{\bf Z}} and similarly for {\bar{q_1}}.

When there is no chance of confusion we will write {e_{q_1} ( \frac{1}{q_2} f(n) )} for {e_{q_1}( \bar{q_2} f(n) )} (though note that {\frac{1}{q_2}} does not qualify as an integral rational function since the constant {q_2} is not monic).

Proof: See Lemma 7 of this previous post. \Box

We need the following algebraic version of the fundamental theorem of calculus:

Lemma 3 (Fundamental theorem of calculus) Let {f(t) = \frac{P(t)}{Q(t)}} be a rational function, and let {q} be a positive integer. If {q|f'} and all prime factors of {q} are sufficiently large depending on the degrees of {P,Q}, then there exists {c \in {\bf Z}/q{\bf Z}} such that {q|f-c}.

Proof: By definition, the constraint {q|f'} may be rewritten as

\displaystyle  P'(t) Q(t) - P(t) Q'(t) = 0\ (q).

By abuse of notation, we view {P,Q} as polynomials over {{\bf Z}/q{\bf Z}} (i.e. all occurrences of {P,Q} in what follows should actually be {P\ (q), Q\ (q)} if we did not abuse notation). If {P} has degree larger than {Q}, then by inspecting the {t^{\hbox{deg}(P)+\hbox{deg}(Q)-1}} coefficient of this equation we obtain a contradiction if all prime factors of {q} are large enough. Thus we may assume that {\hbox{deg}(P) \leq \hbox{deg}(Q)}; since {Q} is monic, this implies that {P = cQ + R} for some {R \in {\bf Z}/q{\bf Z}[t]} with {\hbox{deg}(R) < \hbox{deg}(Q)}. We can then rewrite the previous constraint as

\displaystyle  R'(t) Q(t) - R(t) Q'(t) = 0\ (q).

If {R} is non-zero, then inspecting the {t^{\hbox{deg}(R)+\hbox{deg}(Q)-1}} coefficient of this equation leads to a contradiction if all prime factors of {q} are large enough. Thus {R} is zero, and so {q|f-c} as required. \Box

Now we give an estimate for complete exponential sums, which combines both Ramanujan sum bounds with Weil conjecture bounds.

Proposition 4 (Ramanujan-Weil bounds) Let {q} be a positive square-free integer of polynomial size, and let {f(t) = \frac{P(t)}{Q(t)}} be an integral rational function with {P,Q} of bounded degree. Then we have

\displaystyle  |\sum_{n \in {\bf Z}/q{\bf Z}} e_q( f(n) )| \lessapprox q^{1/2} \frac{(f',q)}{(f'',q)^{1/2}}.

For instance, if we set {f(t) := \frac{0}{t} + bt} for some integer {b}, we obtain the familiar Ramanujan sum bound

\displaystyle  |\sum_{n \in {\bf Z}/q{\bf Z}} e_q( b n) 1_{(n,q)=1}| \lessapprox (b,q). \ \ \ \ \ (5)

More generally, if we set {f(t) := \frac{a}{t} + bt} for some integers {a,b}, we obtain Weil’s Kloosterman sum bound

\displaystyle  |\sum_{n \in {\bf Z}/q{\bf Z}} e_q( \frac{a}{n} + bn ) 1_{(n,q)=1}| \lessapprox q^{1/2} \frac{(a,b,q)}{(a,q)^{1/2}},

and so forth.

The bound in Proposition 4 is quite sharp, with one exception of relevance to the polymath8 project, namely the estimation of the degenerate Kloosterman-type sum

\displaystyle  \sum_{n \in {\bf Z}/q{\bf Z}} e_q( \frac{a}{n+l} )

with {a, l} integers. Proposition 4 gives a bound of {\ll q^{1/2} (a,q)^{1/2}} in this case, but one can do better by performing the change of variables {m = \frac{1}{n+l}} that transforms this sum into a Ramanujan sum, yielding the superior bound

\displaystyle  |\sum_{n \in {\bf Z}/q{\bf Z}} e_q( \frac{a}{n+l} )| \ll (a,q). \ \ \ \ \ (6)

One could try to modify the argument below to incorporate this improvement, but it leads to a significantly more complicated right-hand side for the estimate, so we will not do so here.

Proof: By the Chinese remainder theorem and the divisor bound, it suffices to verify this when {q=p} is a single prime and with {\lessapprox} replaced by {\ll}, thus we wish to show that

\displaystyle  |\sum_{n \in {\bf Z}/p{\bf Z}} e_p(f(n))| \ll p

when {p|f'} (which implies {p|f''}),

\displaystyle  |\sum_{n \in {\bf Z}/p{\bf Z}} e_p(f(n))| \ll 1 \ \ \ \ \ (7)

when {p|f''} but {p \not | f'}, and

\displaystyle  |\sum_{n \in {\bf Z}/p{\bf Z}} e_p(f(n))| \ll \sqrt{p} \ \ \ \ \ (8)

otherwise. We may assume that {p} is sufficiently large depending on the degrees of {P,Q} as the claim is trivial otherwise.

The first bound trivially follows from the triangle inequality, so we turn to the second bound. Since {p|f''}, we conclude from Lemma 3 that there is {c \in {\bf Z}/p{\bf Z}} such that {p|f'-c}; since {p \not | f'}, {c} must be non-zero. Since {f'-c = (f-ct)'}, another application of Lemma 3 shows that there is {d \in {\bf Z}/p{\bf Z}} such that {p|f-ct-d}. Thus {f(n) = cn+d\ (p)} whenever {f(n)} is well-defined. Since the denominator {Q(n)} of {f(n)} has only bounded number of zeroes, we conclude that {e_p(f(n)) = e_p(cn+d)} for all but a bounded number of {n \in {\bf Z}/p{\bf Z}}. The claim then follows (7) from the Fourier identity

\displaystyle  \sum_{n \in{\bf Z}/p{\bf Z}} e_p( cn+d) = 0.

Now we prove (8). By abuse of notation, we view {P} and {Q} as polynomials over the field {{\bf F}_p[t]} rather than over the integers. Let {R := (P,Q)} be the (monic) greatest common divisor, thus {P = R \tilde P} and {Q = R \tilde Q} with {\tilde P, \tilde Q \in {\bf F}_p[t]} coprime and of bounded degree with {\tilde Q} monic. As in the previous paragraph, we have {e_p(f(n)) = e_p( \frac{\tilde P(n)}{\tilde Q(n)} )} for all but a bounded number of {n}. So it suffices to show that

\displaystyle  |\sum_{n \in {\bf Z}/p{\bf Z}} 1_{\tilde Q(n) \neq 0} e_p(\frac{\tilde P(n)}{\tilde Q(n)})| \ll \sqrt{p}.

We now use the Weil bound for exponentials of rational functions over a field of prime order, as stated for instance in this paper of Perelmuter or this paper of Cochrane and Pinner (in the case when {f} is polynomial, one can also use the original paper of Weil, who also treats the important case of Kloosterman sums when {f} is a fractional linear transformation). We remark that the Cochrane-Pinner paper uses Stepanov’s method and is thus essentially elementary in nature, whereas the other papers use more sophisticated tools from arithmetic geometry (such as {\ell}-adic cohomology). In any event, the results in these papers give the required bound except when the rational function {\tilde P / \tilde Q} has the form

\displaystyle  \frac{\tilde P}{\tilde Q} = g^p - g + c \ \ \ \ \ (9)

in the field {{\bf F}_p(t)} of rational functions over {{\bf F}_p} for some {g \in {\bf F}_p(t)} and {c \in {\bf F}_p}. (The presence of the {g^p-g} factor arises because of elements of {{\bf F}_p} are all roots of {x^p-x}.)

Suppose that (9) holds. Writing {g = R/S} for some coprime polynomials {R,S \in {\bf F}_p[t]} with {S} monic, we conclude on multiplying everything out that

\displaystyle  \tilde P S^p = \tilde Q R^p - \tilde Q R S^{p-1} + c \tilde Q S^p

in the polynomial ring {{\bf F}_p[t]}. By unique factorisation, this forces {S^{p-1}} to divide {\tilde Q}, which for {p} large enough forces {S} to have degree zero, thus {S=1} as {S} is monic. We thus have

\displaystyle  \tilde P = \tilde Q R^p - \tilde Q R + c \tilde Q.

Inspecting the {\hbox{deg}(\tilde Q) + p \hbox{deg}(R)} coefficient of this equation, we obtain (for {p} large enough) a contradiction unless {R} has degree zero, thus {R} is constant. In particular {R^p - R = 0} and so {\tilde P} is a scalar multiple of {\tilde Q}, hence {P} is a scalar multiple of {Q} and so {p | f'}, contradicting the hypothesis. This covers all cases and we are done. \Box

Now we obtain a bound for incomplete exponential sums to smooth moduli. We will focus attention here on the specific sum we need to bound for the Polymath8 project, namely a sum of the form

\displaystyle  S := \sum_n \psi_N(n) e_{d_1}( \frac{c_1}{n} ) e_{d_2}( \frac{c_2}{n+l} )

where {d_1,d_2} are square-free numbers (not necessarily coprime), {c_1,c_2,l} are integers, {N \gg 1}, and {\psi_N: {\bf R} \rightarrow {\bf C}} is a function supported on an interval {I} of length {O(N)} that obeys the smoothness bounds

\displaystyle  |\psi^{(j)}_N(t)| \lessapprox N^{-j}

for any fixed {j \geq 0}. The arguments below also apply to more general exponential sums as these, but the specific sum {S} benefits from the additional cancellation (6) that is not available for more general sums.

Proposition 5 Let {S} be as above. Write {d'_1 := d_1/(d_1,d_2)} and {d'_2 := d_2/(d_1,d_2)}.

  • (i) (Trivial bound) We have

    \displaystyle  |S| \lessapprox N. \ \ \ \ \ (10)

  • (ii) (Standard bound) We have

    \displaystyle  |S| \lessapprox [d_1,d_2]^{1/2} + \frac{N}{d_1 d_2} (d_1,d_2)^2 (c_1,d'_1) (c_2,d'_2). \ \ \ \ \ (11)

  • (iii) (Improved bound) We have

    \displaystyle  |S| \lessapprox N^{1/2} (d'_1)^{1/4} + N^{1/2} d_2^{1/2} + \frac{(c_1,d'_1)^{1/4}}{(d'_1)^{1/4}} N. \ \ \ \ \ (12)

The estimate (ii) is a slightly improved version of Lemma 11 of Zhang’s paper (which is a key estimate to treat the Type I/II sums); one can also write the second term {\frac{N}{d_1 d_2} (d_1,d_2)^2 (c_1,d'_1) (c_2,d'_2)} on the right-hand side more suggestively as {\frac{(c_1,d'_1)}{d'_1} \frac{(c_2,d'_2)}{d'_2} N}. But it is the estimate (iii) which is a more substantial improvement, as for the medium-sized values of {N} that come up in Zhang’s argument, the first two terms {N^{1/2} (d'_1)^{1/4}, N^{1/2} d_2^{1/2}} that appear in (12) are superior to the main term {[d_1,d_2]^{1/2}} in (11). (Admittedly, the final term in (12) is worse than the final term in (11), but in practice these terms are not dominant.) The improved bound (12) leads to significantly better values of {\varpi} and {\delta} for the assertion {MPZ'[\varpi,\delta]} that we need for bounded gaps between primes; this is sketched in this previous comment and will be detailed in a subsequent post.

Proof: The trivial bound (10) is immediate from the triangle inequality and the bounds and support of {\psi_N}, so we turn now to the standard bound (11). We can write

\displaystyle  e_{d_1}( \frac{c_1}{n} ) e_{d_2}( \frac{c_2}{n+l} ) = e_q( f(n) )

where {q := [d_1,d_2] = d_1 d'_2 = d'_1 d_2} and {f} is the integral rational function

\displaystyle  f(t) := \frac{c_1 d'_2}{t} + \frac{c_2 d'_1}{t+l}. \ \ \ \ \ (13)

By Fourier inversion and the {q}-periodicity of {e_q(f)}, one can rewrite {S} as

\displaystyle  S = \frac{1}{q} \sum_{h \in {\bf Z}/q{\bf Z}} \hat \psi_N(\frac{h}{q}) \sum_{n \in {\bf Z}/q{\bf Z}} e_q(f(n) + hn) \ \ \ \ \ (14)

where {\hat \psi_N: {\bf R}/{\bf Z} \rightarrow {\bf C}} is the function

\displaystyle  \hat \psi_N(\theta) := \sum_{n \in {\bf Z}} \psi_N(n) e^{-2\pi i n \theta}.

From the Poisson summation formula and integration by parts, we have the standard bounds

\displaystyle  |\hat \psi_N(\theta)| \lessapprox N (N |\theta|)^{-j} \ \ \ \ \ (15)

for all {\theta \in [-1/2,1/2)} (which, by abuse of notation, we identify with {{\bf R}/{\bf Z}}) and any fixed {j \geq 0}.

Consider first the {h=0} term in (14). By the Chinese remainder theorem, we may factor {\sum_{n \in {\bf Z}/q{\bf Z}} e_q(f(n))} as

\displaystyle  (\sum_{n \in {\bf Z}/d'_1 {\bf Z}} e_{d'_1}( \frac{c_1}{n (d_1,d_2)} )) (\sum_{n \in {\bf Z}/d'_2 {\bf Z}} e_{d'_2}( \frac{c_2}{n (d_1,d_2)} ))

\displaystyle  (\sum_{n \in {\bf Z}/(d_1,d_2){\bf Z}} e_{(d_1,d_2)}( \frac{c_1}{nd'_1} + \frac{c_2}{(n+l)d'_2} )).

Applying (6) to the first two factors and the trivial bound of {(d_1,d_2)} to the second factor, we conclude that

\displaystyle  |\sum_{n \in {\bf Z}/q{\bf Z}} e_q(f(n))| \lessapprox (c_1,d'_1) (c_2,d'_2) (d_1,d_2)

and so the contribution of the {h=0} term is acceptable. It remains to establish that

\displaystyle  \frac{1}{q}\sum_{h \in {\bf Z}/q{\bf Z} \backslash \{0\}} |\hat \psi_N(\frac{h}{q})| |\sum_{n \in {\bf Z}/q{\bf Z}} e_q(f(n) + hn)| \lessapprox q^{1/2}.

Let {\epsilon > 0} be fixed. For {|h| \geq x^\epsilon q / N} the bound is trivial from (15) (for {j} large enough), so by overspill it will suffice to show that

\displaystyle  \frac{1}{q} \sum_{1 \leq |h| \leq x^\epsilon q/N} |\hat \psi_N(\frac{h}{q})| |\sum_{n \in {\bf Z}/q{\bf Z}} e_q(f(n) + hn)| \lessapprox x^{\epsilon} q^{1/2}.

By Proposition 4 and (15), the left-hand side is

\displaystyle  \lessapprox \frac{1}{q} \sum_{1 \leq |h| \leq x^\epsilon q/N} N q^{1/2} \frac{(f'+h,q)}{(f'',q)^{1/2}}.

Writing {(f'+h,q) \leq \sum_{d|q} d 1_{d|f'+h}}, we may bound the previous expression by

\displaystyle  \lessapprox \frac{N}{q^{1/2} (f'',q)^{1/2}} \sum_{d|q} d \sum_{1 \leq |h| \leq x^\epsilon q/N: d|f'+h} 1.

Since

\displaystyle  f'(t) = - \frac{c_1 d'_2}{t^2} - \frac{c_2 d'_1}{(t+l)^2}

vanishes at infinity, we see that {d|f'+h} implies {d|h}. Thus

\displaystyle  \sum_{1 \leq |h| \leq x^\epsilon q/N: d|f'+h} 1 \ll \frac{x^\epsilon q/N}{d}

and hence by the divisor bound

\displaystyle  \sum_{d|q} d \sum_{1 \leq |h| \leq x^\epsilon q/N: d|f'+h} 1 \lessapprox x^\epsilon q/N

and the claim follows.

Now we prove (12). We first observe that if {N < d_2}, then {N < N^{1/2} d_2^{1/2}} and the claim (12) follows already from the trivial bound (10). Similarly, if {N > d'_1}, then {N^{1/2} d_2^{1/2} > [d_1,d_2]^{1/2}} and also

\displaystyle  \frac{(c_1,d'_1)^{1/4}}{(d'_1)^{1/4}} N \geq N \frac{(c_1,d'_1)}{d'_1} \geq \frac{N}{d_1 d_2} (d_1,d_2)^2 (c_1,d'_1) (c_2,d'_2)

so (12) follows from (11) in this case. Thus we may restrict attention to the regime

\displaystyle  d_2 \leq N \leq d'_1.

As before, we can rewrite {\sum_n \psi_N(n) e_q(f(n))} as (14). Now set

\displaystyle  K := \lfloor \frac{d'_1}{N} \rfloor, \ \ \ \ \ (16)

so {K \geq 1}. Shifting {h} by {d_2k} for {k=1,\ldots,K} and averaging, we may then rewrite (14) as

\displaystyle  \frac{1}{qK} \sum_{h \in {\bf Z}/q{\bf Z}} \sum_{k=1}^K \hat \psi_N(\frac{h+d_2k}{q}) \sum_{n \in {\bf Z}/q{\bf Z}} e_q(f(n) + hn + knd_2).

From Lemma 2, we have

\displaystyle  \sum_{n \in {\bf Z}/q{\bf Z}} e_q(f(n) + hn + knd_2) = (\sum_{n_1 \in {\bf Z}/d_2{\bf Z}} e_{d_2}(f(n_1)/d'_1 + hn_1/d'_1))

\displaystyle  (\sum_{n_2 \in {\bf Z}/d'_1{\bf Z}} e_{d'_1}(f(n_2)/d_2 + hn_2/d_2 + kn_2))

and so we may bound {|\sum_n \psi_N(n) e_q(f(n))|} by

\displaystyle  \frac{1}{qK} \sum_{h \in {\bf Z}/q{\bf Z}} |\sum_{n_1 \in {\bf Z}/d_2{\bf Z}} e_{d_2}(f(n_1)/d'_1 + hn_1/d'_1)|

\displaystyle  |\sum_{k=1}^K \hat \psi_N(\frac{h+d_2k}{q}) \sum_{n_2 \in {\bf Z}/d'_1{\bf Z}} e_{d'_1}(f(n_2)/d_2 + hn_2/d_2 + kn_2)|.

By Proposition 4 (and the coprimality of {d_2,d'_1}), we have

\displaystyle  |\sum_{n_1 \in {\bf Z}/d_2{\bf Z}} e_{d_2}(f(n_1)/d'_1 + hn_1/d'_1)| \lessapprox d_2^{1/2} (f'+h,d_2) (f'',d_2)^{-1/2}.

Thus we may bound the previous expression by

\displaystyle  \lessapprox \frac{1}{d_2^{1/2} (f'',d_2)^{1/2} d'_1 K} \sum_{h \in{\bf Z}/q{\bf Z}} (f'+h,d_2)

\displaystyle  |\sum_{k=1}^K \hat \psi_N(\frac{h+d_2k}{q}) \sum_{n_2 \in {\bf Z}/d'_1{\bf Z}} e_{d'_1}(f(n_2)/d_2 + hn_2/d_2 + kn_2)|.

Next, let {\epsilon>0} be a small fixed quantity. By (15) as before, the contribution of those {h} outside of the interval {[-x^\epsilon q/N, x^\epsilon q/N]} is negligible, so it will suffice to show that

\displaystyle  \frac{1}{d_2^{1/2} (f'',d_2)^{1/2} d'_1 K} \sum_{|h| \leq x^\epsilon q/N} (f'+h,d_2) \ \ \ \ \ (17)

\displaystyle  |\sum_{k=1}^K \hat \psi_N(\frac{h+d_2k}{q}) \sum_{n_2 \in {\bf Z}/d'_1{\bf Z}} e_{d'_1}(f(n_2)/d_2 + hn_2/d_2 + kn_2)| \lessapprox

\displaystyle  x^{\epsilon} (N^{1/2} (d'_1)^{1/4} + N^{1/2} d_2^{1/2} + \frac{(c_1,d'_1)^{1/4}}{(d'_1)^{1/4}} N).

Now let us consider the sum

\displaystyle  \sum_{|h| \leq x^\epsilon q/N} (f'+h,d_2)^2. \ \ \ \ \ (18)

Bounding {(f'+h,d_2)^2 \leq \sum_{d|d_2} d^2 1_{d|f'+h}}, we may bound this sum by

\displaystyle  \ll \sum_{d|d_2} d^2 \sum_{|h| \leq x^\epsilon q/N: d|f'+h} 1.

The inner sum is only non-vanishing when {d | (f'',d_2)}, and in that case we have a bound of {O( \frac{x^\epsilon q/N}{d} + 1 )} as in the proof of (11); note that

\displaystyle \frac{x^\epsilon q/N}{d} \geq \frac{q}{N d_2} \geq \frac{[d_1,d_2]}{d'_1 d_2} = 1,

so we may replace {O( \frac{x^\epsilon q/N}{d} + 1 )} by {O( \frac{x^\epsilon q/N}{d} )}. By the divisor bound we may thus bound (18) by

\displaystyle  \lessapprox x^\epsilon (f'',d_2) \frac{q}{N}.

By Cauchy-Schwarz, we may thus bound the left-hand side of (17) by

\displaystyle  \lessapprox x^\epsilon \frac{1}{(d'_1)^{1/2} N^{1/2} K} |X|^{1/2} \ \ \ \ \ (19)

where {X} is the quantity

\displaystyle  X := \sum_{h \in {\bf Z}/q{\bf Z}} |\sum_{k=1}^K \hat \psi_N(\frac{h+d_2k}{q}) \sum_{n_2 \in {\bf Z}/d'_1{\bf Z}} e_{d'_1}(f(n_2)/d_2 + hn_2/d_2 + kn_2)|^2.

We may square {X} out as

\displaystyle  X = \sum_{1 \leq k, k' \leq K} \sum_{n,n'} \psi_N(n) \overline{\psi_N(n')} \sum_{n_2,n'_2 \in {\bf Z}/d'_1{\bf Z}} \sum_{h \in{\bf Z}/q{\bf Z}}

\displaystyle  e_{d'_1}(f(n_2)/d_2 + hn_2/d_2 + kn_2 - f(n'_2)/d_2 - hn'_2/d_2 - k'n'_2)

\displaystyle  e_q( h n' - h n ) e_{d'_1}(k'n' - kn).

Now observe from Fourier analysis that the sum

\displaystyle  \sum_{h \in {\bf Z}/q{\bf Z}} e_{d'_1}( hn_2/d_2 - hn'_2/d_2 ) e_q(hn' - hn)

equals {q} when {n=n'\ (d_2)} and {n-n' = n_2-n'_2\ (d'_1)}, and vanishes otherwise. Thus we may simplify the above expression as

\displaystyle  X = q\sum_{1 \leq k. k' \leq K} \sum_{n,n': n=n'\ (d_2)} \psi_N(n) \overline{\psi_N(n')} e_{d'_1}(k'n' - kn)

\displaystyle  \sum_{n_2,n'_2 \in {\bf Z}/d'_1{\bf Z}: n'_2-n_2 = n'-n\ (d'_1)} e_{d'_1}(f(n_2)/d_2 + kn_2 - f(n'_2)/d_2 - k'n'_2).

Substituting {n'_2 = n_2 + n'-n}, this becomes

\displaystyle X = q\sum_{1 \leq k, k' \leq K} \sum_{n,n': n=n'\ (d_2)} \psi_N(n) \overline{\psi_N(n')} e_{d'_1}((k'-k)n)

\displaystyle \sum_{n_2 \in {\bf Z}/d'_1{\bf Z}} e_{d'_1}(f(n_2)/d_2 + (k-k')n_2 - f(n_2+n'-n)/d_2 ).

As {\psi_N} is supported in an interval {I} of length {O(N)}, we obtain the bound

\displaystyle  X \lessapprox q\sum_{1 \leq k, k' \leq K} \sum_{n,n' \in I: n=n'\ (d_2)} |\sum_{n_2 \in {\bf Z}/d'_1{\bf Z}} e_{d'_1}(f(n_2)/d_2 + (k-k')n_2 - f(n_2+n'-n)/d_2)|.

We first isolate the diagonal terms {n'=n}. In this case the inner sum is a Ramanujan sum, so by (5) the diagonal contribution is bounded by

\displaystyle  \lessapprox q \sum_{1 \leq k,k' \leq K} N (k-k',d'_1).

Summing in {k,k'} using Lemma 5 from this previous post, this is bounded by

\displaystyle  \lessapprox q N (K^2 + K d'_1) \lessapprox qN K d'_1.

Now we handle the non-diagonal terms {n \neq n'}. By Proposition 4, this contribution to {X} is

\displaystyle  \lessapprox q\sum_{1 \leq k, k' \leq K} \sum_{n,n' \in I: n=n'\ (d_2); n \neq n'} (d'_1)^{1/2} (f' - f'(\cdot+n'-n) + (k-k')d_2, d'_1) (f'' - f''(\cdot+n'-n),d'_1)^{-1/2}.

Bounding

\displaystyle  (f' - f'(\cdot+n'-n) + (k-k')d_2, d'_1) \leq \sum_{d|d'_1} d 1_{d| f' - f'(\cdot+n'-n) + (k-k')d_2}

we can bound the preceding expression by

\displaystyle  \lessapprox q(d'_1)^{1/2} \sum_{n,n' \in I: n=n'\ (d_2); n \neq n'} (f'' - f''(\cdot+n'-n),d'_1)^{-1/2} \sum_{d|d'_1} d \sum_{1 \leq k,k' \leq K} 1_{d|f'-f'(\cdot+n'-n)+(k-k')d_2}.

The inner sum vanishes unless {d | (f'' - f''(\cdot+n'-n),d'_1)}. In that case, {k-k'} is restricted to a single residue class modulo {d}, and so the inner sum is {O( K^2/d + K )}. From the divisor bound we conclude that

\displaystyle  \sum_{d|d'_1} d \sum_{1 \leq k,k' \leq K} 1_{d|f'-f'(\cdot+n'-n)+(k-k')d_2} \lessapprox K^2 + K (f'' - f''(\cdot+n'-n),d'_1)

and thus this contribution to {X} is bounded by

\displaystyle \lessapprox q(d'_1)^{1/2} \sum_{n,n' \in I: n=n'\ (d_2); n \neq n'} (K^2 (f'' - f''(\cdot+n'-n),d'_1)^{-1/2}

\displaystyle  + K (f'' - f''(\cdot+n'-n),d'_1)^{1/2} ).

From the explicit formula (13) we see that if {p | d'_1}, then {p | f'' - f''(\cdot + n'-n)} if and only if

\displaystyle  p | \frac{2c_1}{t^3} - \frac{2c_1}{(t+n'-n)^3}.

For {p} large enough, this only happens if {p | c_1} or {p | n'-n}. We conclude that

\displaystyle  (f'' - f''(\cdot+n'-n),d'_1) \ll (c_1,d'_1) (n'-n, d'_1)

and thus this contribution to {X} is

\displaystyle  \lessapprox q(d'_1)^{1/2} (K^2 N^2 / d_2 + K (c_1,d'_1)^{1/2} \sum_{n,n' \in I: n=n'\ (d_2); n \neq n'} (n'-n, d'_1)^{1/2} ).

If we estimate

\displaystyle  (n'-n, d'_1)^{1/2} \leq \sum_{d|d'_1} d^{1/2} 1_{d|n'-n}

we have

\displaystyle  \sum_{n,n' \in I: n=n'\ (d_2)} (n'-n, d'_1)^{1/2} \leq \sum_{d|d'_1} d^{1/2} \sum_{n,n' \in I: n-n'\ (dd_2); n \neq n'} 1

\displaystyle  \ll \sum_{d|d'_1} d^{1/2} \frac{N^2}{dd_2}\

\displaystyle  \lessapprox N^2 d_2^{-1}.

This contribution to {X} is thus

\displaystyle  \lessapprox q(d'_1)^{1/2} K^2 N^2 d_2^{-1} + qK (d'_1)^{1/2} (c_1,d'_1)^{1/2} N^2 d_2^{-1}

and thus

\displaystyle  X \lessapprox qNKd'_1 + q(d'_1)^{1/2} K^2 N^2 d_2^{-1} + qK (d'_1)^{1/2} (c_1,d'_1)^{1/2} N^2 d_2^{-1}.

Inserting this back into (19) gives the desired bound (17). This completes the proof of (12). \Box