As in previous posts, we use the following asymptotic notation: {x} is a parameter going off to infinity, and all quantities may depend on {x} unless explicitly declared to be “fixed”. The asymptotic notation {O(), o(), \ll} is then defined relative to this parameter. A quantity {q} is said to be of polynomial size if one has {q = O(x^{O(1)})}, and bounded if {q=O(1)}. We also write {X \lessapprox Y} for {X \ll x^{o(1)} Y}, and {X \sim Y} for {X \ll Y \ll X}.

The purpose of this post is to collect the various refinements to the second half of Zhang’s paper that have been obtained as part of the polymath8 project and present them as a coherent argument. In order to state the main result, we need to recall some definitions. If {I} is a bounded subset of {{\bf R}}, let {{\mathcal S}_I} denote the square-free numbers whose prime factors lie in {I}, and let {P_I := \prod_{p \in I} p} denote the product of the primes {p} in {I}. Note by the Chinese remainder theorem that the set {({\bf Z}/P_I{\bf Z})^\times} of primitive congruence classes {a\ (P_I)} modulo {P_I} can be identified with the tuples {(a_q\ (q))_{q \in {\mathcal S}_I}} of primitive congruence classes {a_q\ (q)} modulo {q}, one for each {q \in {\mathcal S}_I}, which obey the compatibility condition

\displaystyle  (a_{qr}\ (qr)) = (a_q\ (q)) \cap (a_r\ (r))

for all coprime {q,r \in {\mathcal S}_I}, since one can identify {a\ (P_I)} with the tuple {(a\ (q))_{q \in {\mathcal S}_I}} for each {a \in ({\bf Z}/P_I{\bf Z})^\times}.

If {y > 1} and {n} is a natural number, we say that {n} is {y}-densely divisible if, for every {1 \leq R \leq n}, one can find a factor of {n} in the interval {[y^{-1} R, R]}. We say that {n} is doubly {y}-densely divisible if, for every {1 \leq R \leq n}, one can find a factor {m} of {n} in the interval {[y^{-1} R, R]} such that {m} is itself {y}-densely divisible. We let {{\mathcal D}_y^2} denote the set of doubly {y}-densely divisible natural numbers, and {{\mathcal D}_y} the set of {y}-densely divisible numbers.
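The two divisibility notions just defined can be tested by brute force for small {n}. The following is a minimal sketch, assuming that {R} ranges over all real numbers in {[1,n]}; under that assumption the condition reduces to a bound on the ratio of consecutive (admissible) divisors. The function names are illustrative and do not come from the polymath8 project.

```python
def divisors(n):
    """Return the sorted list of positive divisors of n."""
    small, large = [], []
    d = 1
    while d * d <= n:
        if n % d == 0:
            small.append(d)
            if d != n // d:
                large.append(n // d)
        d += 1
    return small + large[::-1]


def is_densely_divisible(n, y):
    """True if every real 1 <= R <= n has a divisor of n in [R / y, R].

    For R between consecutive divisors a < b, the largest divisor <= R is a,
    so the condition holds for all such R exactly when b <= y * a.
    """
    divs = divisors(n)
    return all(b <= y * a for a, b in zip(divs, divs[1:]))


def is_doubly_densely_divisible(n, y):
    """True if every real 1 <= R <= n has a y-densely divisible divisor in [R / y, R]."""
    good = [m for m in divisors(n) if is_densely_divisible(m, y)]
    # good[0] == 1 always; check the gaps between admissible divisors,
    # and that the largest admissible divisor still covers R = n.
    gaps_ok = all(b <= y * a for a, b in zip(good, good[1:]))
    return gaps_ok and n <= y * good[-1]
```

For instance, {30} is {2}-densely divisible (consecutive divisors never differ by more than a factor of {2}), but it is not doubly {2}-densely divisible, since its {2}-densely divisible factors are {1, 2, 6, 30} and the gap from {2} to {6} is too large.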

Given any finitely supported sequence {\alpha: {\bf N} \rightarrow {\bf C}} and any primitive residue class {a\ (q)}, we define the discrepancy

\displaystyle \Delta(\alpha; a \ (q)) := \sum_{n: n = a\ (q)} \alpha(n) - \frac{1}{\phi(q)} \sum_{n: (n,q)=1} \alpha(n).
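To make the definition concrete, here is a small sketch that evaluates the discrepancy exactly for a finitely supported sequence stored as a dictionary. The representation of {\alpha} as a Python `dict` and the helper names are assumptions made for illustration only.

```python
from fractions import Fraction
from math import gcd


def euler_phi(q):
    """Euler totient of q, by direct count (fine for small q)."""
    return sum(1 for k in range(1, q + 1) if gcd(k, q) == 1)


def discrepancy(alpha, a, q):
    """Delta(alpha; a (q)) for alpha given as a dict {n: alpha(n)}.

    First sum: n in the residue class a mod q.
    Second sum: n coprime to q, weighted by 1 / phi(q).
    """
    in_class = sum((Fraction(v) for n, v in alpha.items() if (n - a) % q == 0),
                   Fraction(0))
    coprime = sum((Fraction(v) for n, v in alpha.items() if gcd(n, q) == 1),
                  Fraction(0))
    return in_class - coprime / euler_phi(q)


# Indicator function of the primes in [10, 30].
primes_10_30 = {p: 1 for p in (11, 13, 17, 19, 23, 29)}
print(discrepancy(primes_10_30, 1, 4))  # three primes are 1 mod 4, out of six
print(discrepancy(primes_10_30, 2, 3))  # four primes are 2 mod 3, out of six
```

Summing the discrepancy over all primitive classes {a\ (q)} always gives zero, which is a convenient sanity check on any implementation.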

For any fixed {\varpi, \delta > 0}, we let {MPZ''[\varpi,\delta]} denote the assertion that

\displaystyle  \sum_{q \in {\mathcal S}_I \cap {\mathcal D}_{x^\delta}^2: q \leq x^{1/2+2\varpi}} |\Delta(\Lambda 1_{[x,2x]}; a\ (q))| \ll x \log^{-A} x \ \ \ \ \ (1)

for any fixed {A > 0}, any bounded {I}, and any primitive {a\ (P_I)}, where {\Lambda} is the von Mangoldt function. Importantly, we do not require {I} or {a} to be fixed; in particular, {I} could grow polynomially in {x}, and {a} could grow exponentially in {x}, but the implied constant in (1) would still need to be fixed (so it has to be uniform in {I} and {a}). (In previous formulations of these estimates, the system of congruences {a\ (q)} was also required to obey a controlled multiplicity hypothesis, but we no longer need this hypothesis in our arguments.) In this post we will record the proof of the following result, which is currently the best distribution result produced by the ongoing polymath8 project to optimise Zhang’s theorem on bounded gaps between primes:

Theorem 1 We have {MPZ''[\varpi,\delta]} whenever {\frac{280}{3} \varpi + \frac{80}{3} \delta < 1}.

This improves upon the previous constraint of {148 \varpi + 33 \delta < 1} (see this previous post), although that latter statement was stronger in that it only required single dense divisibility rather than double dense divisibility. However, thanks to the efficiency of the sieving step of our argument, the upgrade of the single dense divisibility hypothesis to double dense divisibility costs almost nothing with respect to the {k_0} parameter (which, using this constraint, gives a value of {k_0=720} as verified in these comments, which then implies a value of {H = 5,414}).

This estimate is deduced from three sub-estimates, which require a bit more notation to state. We need a fixed quantity {A_0>0}.

Definition 2 A coefficient sequence is a finitely supported sequence {\alpha: {\bf N} \rightarrow {\bf R}} that obeys the bounds

\displaystyle  |\alpha(n)| \ll \tau^{O(1)}(n) \log^{O(1)}(x) \ \ \ \ \ (2)

for all {n}, where {\tau} is the divisor function.

  • (i) A coefficient sequence {\alpha} is said to be at scale {N} for some {N \geq 1} if it is supported on an interval of the form {[(1-O(\log^{-A_0} x)) N, (1+O(\log^{-A_0} x)) N]}.
  • (ii) A coefficient sequence {\alpha} at scale {N} is said to obey the Siegel-Walfisz theorem if one has

    \displaystyle  | \Delta(\alpha 1_{(\cdot,q)=1}; a\ (r)) | \ll \tau(qr)^{O(1)} N \log^{-A} x \ \ \ \ \ (3)

    for any {q,r \geq 1}, any fixed {A}, and any primitive residue class {a\ (r)}.

  • (iii) A coefficient sequence {\alpha} at scale {N} (relative to this choice of {A_0}) is said to be smooth if it takes the form {\alpha(n) = \psi(n/N)} for some smooth function {\psi: {\bf R} \rightarrow {\bf C}} supported on {[1-O(\log^{-A_0} x), 1+O(\log^{-A_0} x)]} obeying the derivative bounds

    \displaystyle  \psi^{(j)}(t) = O( \log^{j A_0} x ) \ \ \ \ \ (4)

    for all fixed {j \geq 0} (note that the implied constant in the {O()} notation may depend on {j}).

Definition 3 (Type I, Type II, Type III estimates) Let {0 < \varpi < 1/4}, {0 < \delta < 1/4+\varpi}, and {0 < \sigma < 1/2} be fixed quantities. We let {I} be an arbitrary bounded subset of {{\bf R}}, and {a\ (P_I)} a primitive congruence class.

  • (i) We say that {Type''_I[\varpi,\delta,\sigma]} holds if, whenever {M, N \gg 1} are quantities with

    \displaystyle  M N \sim x \ \ \ \ \ (5)

    and

    \displaystyle  x^{1/2-\sigma} \lessapprox N \lessapprox x^{1/2-2\varpi-c} \ \ \ \ \ (6)

    for some fixed {c>0}, and {\alpha,\beta} are coefficient sequences at scales {M,N} respectively, with {\beta} obeying a Siegel-Walfisz theorem, we have

    \displaystyle  \sum_{q \in {\mathcal S}_I \cap {\mathcal D}_{x^\delta}^2: q \leq x^{1/2+2\varpi}} |\Delta(\alpha * \beta; a\ (q))| \ll x \log^{-A} x. \ \ \ \ \ (7)

  • (ii) We say that {Type''_{II}[\varpi,\delta]} holds if the conclusion (7) of {Type''_I[\varpi,\delta,\sigma]} holds under the same hypotheses as before, except that (6) is replaced with

    \displaystyle  x^{1/2-2\varpi-c} \lessapprox N \lessapprox x^{1/2} \ \ \ \ \ (8)

    for some sufficiently small fixed {c>0}.

  • (iii) We say that {Type''_{III}[\varpi,\delta,\sigma]} holds if, whenever {M, N_1,N_2,N_3 \gg 1} are quantities with

    \displaystyle  M N_1 N_2 N_3 \sim x

    and

    \displaystyle  N_1 N_2, N_1 N_3, N_2 N_3 \gtrapprox x^{1/2+\sigma} \ \ \ \ \ (9)

    and

    \displaystyle  x^{2\sigma} \lessapprox N_1,N_2,N_3 \lessapprox x^{1/2-\sigma} \ \ \ \ \ (10)

    and {\alpha,\psi_1,\psi_2,\psi_3} are coefficient sequences at scales {M,N_1,N_2,N_3} respectively, with {\psi_1,\psi_2,\psi_3} smooth, we have

    \displaystyle  \sum_{q \in {\mathcal S}_I \cap {\mathcal D}_{x^\delta}^2: q \leq x^{1/2+2\varpi}} \ \ \ \ \ (11)

    \displaystyle  |\Delta(\alpha * \psi_1 * \psi_2 * \psi_3; a\ (q))| \ll x \log^{-A} x.

Theorem 1 is then a consequence of the following four statements.

Theorem 4 (Type I estimate) {Type''_I[\varpi,\delta,\sigma]} holds whenever {\varpi,\delta,\sigma > 0} are fixed quantities such that

\displaystyle  56 \varpi + 16 \delta + 4\sigma < 1.

Theorem 5 (Type II estimate) {Type''_{II}[\varpi,\delta]} holds whenever {\varpi,\delta > 0} are fixed quantities such that

\displaystyle  68 \varpi + 14 \delta < 1.

Theorem 6 (Type III estimate) {Type''_{III}[\varpi,\delta,\sigma]} holds whenever {0 < \varpi < 1/4}, {0 < \delta < 1/4+\varpi}, and {\sigma > 0} are fixed quantities such that

\displaystyle  \sigma > \frac{1}{18} + \frac{28}{9} \varpi + \frac{2}{9} \delta \ \ \ \ \ (12)

and

\displaystyle  \varpi< \frac{1}{12}. \ \ \ \ \ (13)

In particular, if

\displaystyle  70 \varpi + 5 \delta < 1,

then all values of {\sigma} that are sufficiently close to {1/10} are admissible.

Lemma 7 (Combinatorial lemma) Let {0 < \varpi < 1/4}, {0 < \delta < 1/4+\varpi}, and {1/10 < \sigma < 1/2} be such that {Type''_I[\varpi,\delta,\sigma]}, {Type''_{II}[\varpi,\delta]}, and {Type''_{III}[\varpi,\delta,\sigma]} simultaneously hold. Then {MPZ''[\varpi,\delta]} holds.

Indeed, if {\frac{280}{3} \varpi + \frac{80}{3} \delta < 1}, one checks that the hypotheses for Theorems 4, 5, 6 are obeyed for {\sigma} sufficiently close to {1/10}, at which point the claim follows from Lemma 7.
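The check "one checks that the hypotheses ... are obeyed" can be carried out in exact rational arithmetic. The sketch below encodes the numerical conditions of Theorems 4, 5, 6 and Lemma 7 as stated above, and picks {\sigma} slightly above {1/10} by an amount proportional to the slack in the constraint of Theorem 1. The specific choice {\sigma = 1/10 + \hbox{margin}/10} is our own illustrative one; any sufficiently small positive excess works.

```python
from fractions import Fraction


def theorem1_margin(varpi, delta):
    """Slack 1 - (280/3) varpi - (80/3) delta in the hypothesis of Theorem 1."""
    return 1 - (Fraction(280, 3) * varpi + Fraction(80, 3) * delta)


def sigma_choice(varpi, delta):
    """A value of sigma slightly above 1/10 (illustrative choice)."""
    return Fraction(1, 10) + theorem1_margin(varpi, delta) / 10


def hypotheses_hold(varpi, delta, sigma):
    """Numerical hypotheses of Theorems 4, 5, 6 and Lemma 7."""
    type_i = 56 * varpi + 16 * delta + 4 * sigma < 1
    type_ii = 68 * varpi + 14 * delta < 1
    type_iii = (sigma > Fraction(1, 18) + Fraction(28, 9) * varpi
                + Fraction(2, 9) * delta) and varpi < Fraction(1, 12)
    lemma_7 = Fraction(1, 10) < sigma < Fraction(1, 2)
    return type_i and type_ii and type_iii and lemma_7


varpi, delta = Fraction(1, 200), Fraction(1, 100)
print(theorem1_margin(varpi, delta) > 0)                          # hypothesis of Theorem 1
print(hypotheses_hold(varpi, delta, sigma_choice(varpi, delta)))  # all sub-estimates apply
```

The Type I condition is the binding one: at {\sigma = 1/10} it reads {56\varpi + 16\delta < 3/5}, which is exactly the constraint of Theorem 1 after multiplying by {5/3}.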

The proofs of Theorems 4, 5, 6 will be given below the fold, while the proof of Lemma 7 follows from the arguments in this previous post. We remark that in our current arguments, the double dense divisibility is only fully used in the Type I estimates; the Type II and Type III estimates are also valid just with single dense divisibility.

Remark 1 Theorem 6 is vacuously true for {\sigma > 1/6}, as the condition (10) cannot be satisfied in this case. If we use this trivial case of Theorem 6, while keeping the full strength of Theorems 4 and 5, we obtain Theorem 1 in the regime

\displaystyle  168 \varpi + 48 \delta < 1.
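The arithmetic behind this regime can be spelled out as follows. This is a worked check under the assumption that one takes {\sigma} just above {1/6} in Theorem 4; the symbol {\eta} for the small excess is our own notation.

```latex
% Take \sigma = 1/6 + \eta with \eta > 0 small (so 1/10 < \sigma < 1/2 as in Lemma 7).
% Type I (Theorem 4):
56\varpi + 16\delta + 4\sigma < 1
  \iff 56\varpi + 16\delta < \tfrac{1}{3} - 4\eta
  \iff 168\varpi + 48\delta < 1 - 12\eta .
% Type II (Theorem 5) is then automatic:
68\varpi + 14\delta \le 168\varpi + 48\delta < 1 .
```

Letting {\eta \rightarrow 0} recovers the stated constraint.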

— 1. Exponential sum estimates —

It will be convenient to introduce a little bit of formal algebraic notation. Define an integral rational function to be a formal rational function {f(t) = \frac{P(t)}{Q(t)}} in a formal indeterminate {t} where {P,Q \in {\bf Z}[t]} are polynomials with integer coefficients, and {Q} is monic; in particular any polynomial {P(t) \in {\bf Z}[t]} can be identified with the integral rational function {P(t) \equiv \frac{P(t)}{1}}. For minor technical reasons we do not equate integral rational functions under cancelling, thus for instance we consider {\frac{P(t)R(t)}{Q(t)R(t)}} to be distinct from {\frac{P(t)}{Q(t)}}; we need to do this because the domain of definition of these two functions is a little different (the former is not defined when {R(t)=0}, but the latter can still be defined here). Because we refuse to cancel, we have to be a little careful how we define algebraic operations: specifically, we define

\displaystyle  \frac{P_1(t)}{Q_1(t)} + \frac{P_2(t)}{Q_2(t)} := \frac{P_1(t) Q_2(t) + P_2(t) Q_1(t)}{Q_1(t) Q_2(t)}

\displaystyle  \frac{P_1(t)}{Q_1(t)} - \frac{P_2(t)}{Q_2(t)} := \frac{P_1(t) Q_2(t) - P_2(t) Q_1(t)}{Q_1(t) Q_2(t)}

\displaystyle  \frac{P_1(t)}{Q_1(t)} \cdot \frac{P_2(t)}{Q_2(t)} := \frac{P_1(t) P_2(t)}{Q_1(t) Q_2(t)}

\displaystyle  (\frac{P(t)}{Q(t)})' := \frac{P'(t) Q(t) - P(t) Q'(t)}{Q(t)^2}.

Note that the denominator always remains monic with respect to these operations. This is not quite a ring with a derivation (the subtraction operation does not quite cancel the addition operation due to the inability to cancel) but this will not bother us in practice. (On the other hand, addition and multiplication remain associative, and the latter continues to distribute over the former, and differentiation obeys the usual sum and product rules.) Note that if {f(t) = \frac{P(t)}{Q(t)}} is an integral rational function, we can localise it modulo {q} for any modulus {q} to obtain a rational function {f(t)\ (q) := \frac{P(t)\ (q)}{Q(t)\ (q)}} that is the ratio of two polynomials {P\ (q), Q\ (q)} in {({\bf Z}/q{\bf Z})[t]}, with the denominator monic and hence non-vanishing. We can define the algebraic operations of addition, subtraction, multiplication, and differentiation on integral rational functions modulo {q} by the same formulae as above, and we observe that these operations are completely compatible with their counterparts over {{\bf Z}} (even without the ability to cancel), thus for instance {(f+g)\ (q) = (f\ (q)) + (g\ (q))}. We say that {f} is divisible by {q}, and write {q|f}, if the numerator {P(t)} of {f(t)} has all coefficients divisible by {q}.

If {f} is an integral rational function and {n \in {\bf Z}/q{\bf Z}}, then {f(n)} is well defined as an element of {{\bf Z}/q{\bf Z}} except when {Q(n)} is a zero divisor in {{\bf Z}/q{\bf Z}}. We adopt the convention that {e_q(f(n)) = 0} when {Q(n)} is a zero divisor in {{\bf Z}/q{\bf Z}}, thus {e_q(f(n))} is really shorthand for {1_{(Q(n),q)=1} e_q(f(n))}; by abuse of notation we view {n \mapsto e_q(f(n))} both as a function on {{\bf Z}/q{\bf Z}} and as a {q}-periodic function on {{\bf Z}}. Thus for instance

\displaystyle  e_q( \frac{0}{n} ) = 1_{(n,q)=1}.

Note that if {q|f}, then {f(n)=0\ (q)} for all {n \in {\bf Z}/q{\bf Z}} for which {f(n)} is well defined. We define {(f,q)} to be the largest factor {q'} of {q} for which {q'|f}; in particular, if {q} is square-free, we have

\displaystyle  (f,q) = \prod_{p|q: p|f} p.

Note with these conventions that {(0,q)=q}.

We recall the following Chinese remainder theorem:

Lemma 8 (Chinese remainder theorem) Let {q = q_1 q_2} with {q_1,q_2} coprime positive integers. If {f} is an integral rational function, then

\displaystyle  e_q( f(n) ) = e_{q_1}( \overline{q_2} f(n) ) e_{q_2} ( \overline{q_1} f(n) )

for all integers {n}, where {\overline{q_2}} is the inverse of {q_2} in {{\bf Z}/q_1{\bf Z}} and similarly for {\overline{q_1}}.

When there is no chance of confusion we will write {e_{q_1} ( \frac{1}{q_2} f(n) )} for {e_{q_1}( \overline{q_2} f(n) )} (though note that {\frac{1}{q_2}} does not qualify as an integral rational function since the constant {q_2} is not monic).

Proof: See Lemma 7 of this previous post. \Box
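Lemma 8, together with the convention that {e_q(f(n))} vanishes when the denominator is not invertible, is easy to test numerically. The sketch below represents an integral rational function by two coefficient lists (highest degree first); this representation and the helper names are assumptions for illustration. It relies on Python's built-in `pow(x, -1, q)` for modular inverses (Python 3.8 or later).

```python
import cmath
from math import gcd


def poly_eval(coeffs, n, q):
    """Evaluate a polynomial (coefficients from highest degree down) at n modulo q."""
    value = 0
    for c in coeffs:
        value = (value * n + c) % q
    return value


def e_q(P, Q, n, q, scale=1):
    """e_q(scale * P(n) / Q(n)), set to 0 when Q(n) is not invertible modulo q."""
    denom = poly_eval(Q, n, q)
    if gcd(denom, q) != 1:
        return 0j
    residue = scale * poly_eval(P, n, q) * pow(denom, -1, q) % q
    return cmath.exp(2j * cmath.pi * residue / q)


def crt_sides(P, Q, n, q1, q2):
    """Return both sides of the identity in Lemma 8 for q = q1 * q2."""
    lhs = e_q(P, Q, n, q1 * q2)
    rhs = (e_q(P, Q, n, q1, scale=pow(q2, -1, q1))
           * e_q(P, Q, n, q2, scale=pow(q1, -1, q2)))
    return lhs, rhs


# f(t) = 1 / (t + 3) with q = 5 * 7: the two sides agree for every n.
print(max(abs(l - r) for l, r in (crt_sides([1], [1, 3], n, 5, 7) for n in range(35))))
```

The same helper reproduces the example {e_q(\frac{0}{n}) = 1_{(n,q)=1}}: calling `e_q([0], [1, 0], n, q)` returns 1 when {n} is coprime to {q} and 0 otherwise.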

Now we give an estimate for complete exponential sums, which combines Ramanujan sum bounds with Weil conjecture bounds.

Proposition 9 (Ramanujan-Weil bounds) Let {q} be a positive square-free integer of polynomial size, and let {f(t) = \frac{P(t)}{Q(t)}} be an integral rational function with {P,Q} of bounded degree. Then we have

\displaystyle  |\sum_{n \in {\bf Z}/q{\bf Z}} e_q( f(n) )| \lessapprox q^{1/2} \frac{(f',q)}{(f'',q)^{1/2}}.

Proof: See Proposition 4 of this previous post. \Box
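As a sanity check on the square-root cancellation in Proposition 9, one can compute a classical Kloosterman sum, which corresponds to {f(t) = \frac{t^2+1}{t}} and prime modulus {q=p}. Proposition 9 is stated with {\lessapprox} and no explicit constant; the comparison against {2\sqrt{p}} below uses the classical Weil bound for Kloosterman sums, which is an outside fact and not a statement from this post. The function name is illustrative.

```python
import cmath
from math import sqrt


def kloosterman_sum(p):
    """Sum of e_p(n + 1/n) over invertible n modulo a prime p."""
    total = 0j
    for n in range(1, p):
        residue = (n + pow(n, -1, p)) % p  # modular inverse needs Python 3.8+
        total += cmath.exp(2j * cmath.pi * residue / p)
    return total


for p in (5, 7, 11, 13, 101):
    # Compare the size of the sum with the Weil bound 2 * sqrt(p).
    print(p, abs(kloosterman_sum(p)), 2 * sqrt(p))
```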

Proposition 10 (Incomplete sums) Let {q} be a positive square-free integer of polynomial size, and let {f(t) = \frac{P(t)}{Q(t)}} be an integral rational function with {P,Q} of bounded degree with {\hbox{deg}(P) < \hbox{deg}(Q)}. Let {l \geq 1} be a fixed integer, and suppose that we have a factorisation {q = q_1 \ldots q_l}. Then for any {N \gg 1}, and any coefficient sequence {\psi_N} at scale {N} of polynomial size, one has

\displaystyle  |\sum_n \psi_N(n) e_q(f(n))| \lessapprox \sum_{i=1}^{l-1} N^{1-1/2^i} (q'_i)^{1/2^i} + N^{1-1/2^{l-1}} (q'_l)^{1/2^l}

\displaystyle  + \frac{N}{q'} 1_{N \geq q'} |\sum_{n \in {\bf Z}/q'{\bf Z}} e_{q'}(f(n) / (f,q) )|

where {q' := q / (f,q)} and {q'_i := (q_i,q')}.

Proof: Let {C} be a sufficiently large fixed quantity depending on the degrees of {P} and {Q}. We first make the technical reduction that it suffices to establish the claim in the case when {q} has no prime factors less than {C}, for otherwise one can factor {q = q_0 q'} where {q_0} is the product of all the prime factors of {q} less than {C}, and by splitting the {n} summation into residue classes {b\ (q_0)} and performing the substitution {n = q_0 n' + b} and applying the proposition with {q} replaced by {q'} (and adjusting {q_1,\ldots,q_l,P,Q,f} accordingly) we obtain the claim.

By dividing {f} through by {(f,q)} (and replacing {q_i} with {q'_i}) we may assume without loss of generality that {(f,q)=1} and {q_i = q'_i}. As {f} vanishes at infinity, this implies that {(f',q)=1} and {(f'',q)=1} (see Lemma 3 from this previous post).

We induct on {l}. We begin with the base case {l=1}, where the task is to show that

\displaystyle  |\sum_n \psi_N(n) e_q(f(n))| \lessapprox q^{1/2} + \frac{N}{q} 1_{N \geq q} |\sum_{n \in {\bf Z}/q{\bf Z}} e_{q}(f(n))|.

By Proposition 9, we have

\displaystyle  |\sum_{n \in {\bf Z}/q{\bf Z}} e_{q}(f(n))| \lessapprox q^{1/2}

so we may delete the condition {1_{N \geq q}} on the right-hand side without penalty.

By completion of sums (see Lemma 6 of this previous post), the left-hand side is

\displaystyle  \lessapprox 1 + \frac{N}{q} \sum_{|h| \lessapprox q/N} |\sum_{n \in {\bf Z}/q{\bf Z}} e_q(f(n) + hn)|

so it will suffice to show that

\displaystyle  \frac{N}{q} \sum_{0 < |h| \lessapprox q/N} |\sum_{n \in {\bf Z}/q{\bf Z}} e_q(f(n) + hn)| \lessapprox q^{1/2}.

By Proposition 9 again, the left-hand side is

\displaystyle  \lessapprox \frac{N}{q^{1/2}} \sum_{0 < |h| \lessapprox q/N} (f'+h,q).

Since {(f'',q)=1}, we see that {(f'+h,q)=1}, and the claim follows.

Now suppose that {l > 1}, and that the claim has already been proven for {l-1}. We use the {q}-van der Corput {A}-process of Heath-Brown and Graham-Ringrose. If we have {N \geq q_l}, then

\displaystyle  N^{1-1/2^{l-1}} q_{l-1}^{1/2^{l-1}} + N^{1-1/2^{l-1}} q_l^{1/2^l} \geq N^{1-1/2^{l-2}} (q_{l-1} q_l)^{1/2^{l-1}}

and the claim then follows by the induction hypothesis (concatenating {q_l} and {q_{l-1}}). Similarly, if {N \leq q_1}, then {N^{1/2} q_1^{1/2} \geq N}, and the claim follows from the triangle inequality. Thus we may assume that

\displaystyle  q_1 < N < q_l.

Let {K := \lfloor N/q_1\rfloor}. We can rewrite {\sum_n \psi_N(n) e_q(f(n))} as

\displaystyle  \frac{1}{K} \sum_n \sum_{k=1}^K \psi_N(n+kq_1) e_q(f(n+kq_1)).

By Lemma 8 we have {e_q(f(n+kq_1)) = e_{q_1}(\overline{q_2 \ldots q_l} f(n)) e_{q_2 \ldots q_l}( \overline{q_1} f(n+kq_1) )}, and so by the triangle inequality and the Cauchy-Schwarz inequality

\displaystyle  |\sum_n \psi_N(n) e_q(f(n))| \leq \frac{1}{K} \sum_n |\sum_{k=1}^K \psi_N(n+kq_1) e_{q_2 \ldots q_l}( \overline{q_1} f(n+kq_1) )|

\displaystyle  \ll \frac{N^{1/2}}{K} (\sum_n |\sum_{k=1}^K \psi_N(n+kq_1) e_{q_2 \ldots q_l}( \overline{q_1} f(n+kq_1) )|^2)^{1/2}

since the summand is only non-zero when {n} lies in an interval of length {O(N)}. This last expression may be rearranged as

\displaystyle  \frac{N^{1/2}}{K} |\sum_{1 \leq k,k' \leq K} \sum_n \psi_N(n+kq_1) \overline{\psi_N(n+k'q_1)}

\displaystyle  e_{q_2 \ldots q_l}( \overline{q_1} (f(n+kq_1) - f(n+k'q_1) ) )|^{1/2}.

The diagonal contribution {k=k'} can be estimated by {O( \frac{N^{1/2}}{K} ( K N )^{1/2} ) = O( N^{1/2} q_1^{1/2} )}, which is acceptable, so it suffices to show that

\displaystyle  |\sum_{1 \leq k,k' \leq K: k \neq k'} \sum_n \psi_N(n+kq_1) \overline{\psi_N(n+k'q_1)} \ \ \ \ \ (14)

\displaystyle  e_{q_2 \ldots q_l}( \overline{q_1} (f(n+kq_1) - f(n+k'q_1) ) )|

\displaystyle  \lessapprox K^2 ( \sum_{i=2}^{l-1} N^{1-1/2^{i-1}} q_i^{1/2^{i-1}} + N^{1-1/2^{l-2}} q_l^{1/2^{l-1}} ).

We observe that {n \mapsto \overline{q_1} (f(n+kq_1) - f(n+k'q_1) )} is an integral rational function whose numerator has lower degree than the denominator. If a prime {p} dividing {q_2 \ldots q_l} also divides this rational function, then {f(n+(k-k')q_1)-f(n)} is divisible by {p}; if {k-k'} is not divisible by {p}, this implies by telescoping series that {p|f(n+a)-f(n)} for all {a}. This implies that {f(n)\ (p)} is constant where it is defined; as {f} vanishes at infinity and is defined outside of {O(1)} elements, this implies that {p|f} (here we use the fact that {p} must exceed {C}, since it divides {q}). We conclude that

\displaystyle  (q_2 \ldots q_l, \overline{q_1} (f(\cdot+kq_1) - f(\cdot+k'q_1) )) \leq (q_2 \ldots q_l, k-k').

Applying the induction hypothesis and Proposition 9, we may thus bound

\displaystyle  |\sum_n \psi_N(n+kq_1) \overline{\psi_N(n+k'q_1)} e_{q_2 \ldots q_l}( \overline{q_1} (f(n+kq_1) - f(n+k'q_1) ) )|

by

\displaystyle  \lessapprox \sum_{i=2}^{l-1} N^{1-1/2^{i-1}} q_i^{1/2^{i-1}} + N^{1-1/2^{l-2}} q_l^{1/2^{l-1}}

\displaystyle  + N (q_2 \ldots q_l)^{-1/2} (q_2 \ldots q_l, k-k')^{1/2} 1_{N \geq q_2 \ldots q_l / (q_2 \ldots q_l, k-k')}.

The contribution of the first two terms to (14) is acceptable, so the only contribution remaining to control is

\displaystyle  N (q_2 \ldots q_l)^{-1/2} \sum_{1 \leq k,k' \leq K: k \neq k'} (q_2 \ldots q_l, k-k')^{1/2} 1_{N \geq q_2 \ldots q_l / (q_2 \ldots q_l, k-k')}.

If we bound {1_{N \geq q_2 \ldots q_l / (q_2 \ldots q_l, k-k')}} by {(q_2 \ldots q_l, k-k')^{1/2} (q_2 \ldots q_l)^{-1/2} N^{1/2}}, we can bound this expression by

\displaystyle  N^{3/2} (q_2 \ldots q_l)^{-1} \sum_{1 \leq k,k' \leq K: k \neq k'} (q_2 \ldots q_l, k-k')

which by Lemma 5 of this previous post and the bound {N < q_l} is bounded by

\displaystyle  \lessapprox K^2 N^{1-1/2^{l-2}} q_l^{1/2^{l-1}}

which is acceptable. \Box

We record a special case of the above proposition:

Corollary 11 Let {d_1,d_2} be square-free numbers (not necessarily coprime) of polynomial size, let {c_1,c_2,l,l',m} be integers, let {N \gg 1}, and let {\psi_N} be a coefficient sequence at scale {N}. Suppose that {[d_1,d_2]} is {y}-densely divisible. Let {a\ (q_0)} be a residue class with {q_0 | (d_1,d_2)}. Then

\displaystyle  |\sum_{n : n = a\ (q_0)} \psi_N(n+m) e_{d_1}( \frac{c_1}{n+l} ) e_{d_2}( \frac{c_2}{n+l'} )| \lessapprox

\displaystyle  q_0^{-1/2} N^{1/2} [d_1,d_2]^{1/6} y^{1/6} + q_0^{-1} \frac{(c_1,d'_1)}{d'_1} \frac{(c_2,d'_2)}{d'_2} N

where {d'_i := d_i/(d_1,d_2)} for {i=1,2}. We also have the variant bound

\displaystyle  |\sum_{n : n = a\ (q_0)} \psi_N(n+m) e_{d_1}( \frac{c_1}{n + l} ) e_{d_2}( \frac{c_2}{n+l'} )| \lessapprox

\displaystyle  q_0^{-1/2} [d_1,d_2]^{1/2} + q_0^{-1} \frac{(c_1,d'_1)}{d'_1} \frac{(c_2,d'_2)}{d'_2} N.

Proof: We first consider the case {q_0=1}, so that the congruence condition {n = a\ (q_0)} can be deleted. By the dense divisibility hypothesis we may factor {[d_1,d_2] = q_1 q_2} for some

\displaystyle  y^{-2/3}[d_1,d_2]^{1/3} \leq q_1 \leq y^{1/3} [d_1,d_2]^{1/3}

and

\displaystyle  y^{-1/3}[d_1,d_2]^{2/3} \leq q_2 \leq y^{2/3} [d_1,d_2]^{2/3}.

The first bound then follows from the {l=2} case of Proposition 10, combined with the bound

\displaystyle  |\sum_{n \in {\bf Z}/[d_1,d_2]{\bf Z}} e_{d_1}( \frac{c_1}{n+l} ) e_{d_2}( \frac{c_2}{n+l'} )| \lessapprox (c_1,d'_1) (c_2,d'_2) (d_1,d_2)

that is proven as part of Proposition 5 of this previous post. The second bound similarly follows from the {l=1} case of Proposition 10.

Now we consider the case when {q_0 > 1}. Writing {n = n' q_0 + a}, {d_1 = q_0 \tilde d_1}, and {d_2 =q_0 \tilde d_2}, we have from Lemma 8 that

\displaystyle  e_{d_1}( \frac{c_1}{n+l} ) = e_{\tilde d_1}( \frac{c_1 \overline{q_0}^2}{n' + (a+l) \overline{q_0} } ) e_{q_0}( \frac{c_1 \overline{\tilde d_1}}{a+l} )

and similarly

\displaystyle  e_{d_2}( \frac{c_2}{n+l'} ) = e_{\tilde d_2}( \frac{c_2 \overline{q_0}^2}{n' + (a+l') \overline{q_0} } ) e_{q_0}( \frac{c_2 \overline{\tilde d_2}}{a+l'} ).

If we then apply the previous results with {d_1,d_2} replaced by {\tilde d_1,\tilde d_2} (with {[\tilde d_1,\tilde d_2] = [d_1,d_2]/q_0} being {q_0y}-densely divisible) and {N} replaced by {N/q_0} (and with suitable alterations to {c_1,c_2,l,l'}), we obtain the required claims. \Box

For the Type III estimate we will also need a deeper exponential sum estimate, involving the hyper-Kloosterman sums

\displaystyle  K_3(a;q) := \frac{1}{q} \sum_{x,y,z \in ({\bf Z}/q{\bf Z})^\times: xyz=a} e_q(x+y+z) \ \ \ \ \ (15)

for square-free {q} and {a \in ({\bf Z}/q{\bf Z})^\times}.
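For small prime moduli the sum (15) can be evaluated directly by eliminating {z = a/(xy)}. The sketch below does this and compares the result with the bound {|K_3(a;p)| \leq 3}, which is the explicit form of Deligne's bound for this normalisation; that explicit constant is an outside fact and not taken from this post, and the function name is illustrative.

```python
import cmath


def hyper_kloosterman_3(a, p):
    """K_3(a; p) = (1/p) * sum over units x, y of e_p(x + y + a / (x y)), p prime."""
    total = 0j
    for x in range(1, p):
        for y in range(1, p):
            z = a * pow(x * y, -1, p) % p  # the unique z with x * y * z = a mod p
            total += cmath.exp(2j * cmath.pi * ((x + y + z) % p) / p)
    return total / p


for p in (5, 7, 11, 13):
    # Largest value of |K_3(a; p)| over all units a modulo p.
    print(p, max(abs(hyper_kloosterman_3(a, p)) for a in range(1, p)))
```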

Lemma 12 (Correlation of hyper-Kloosterman sums) Let {s,r_1,r_2} be square-free numbers of polynomial size with {(s,r_1)=(s,r_2)=1}. Let {a_1 \in ({\bf Z}/r_1 s{\bf Z})^\times}, {a_2 \in ({\bf Z}/r_2s{\bf Z})^\times}, and {n \in {\bf Z}/([r_1,r_2] s){\bf Z}}. Then

\displaystyle  |\sum_{h \in ({\bf Z}/[r_1, r_2]s {\bf Z})^\times} K_3(a_1 h; r_1 s) \overline{K_3(a_2 h; r_2 s)} e_{[r_1,r_2] s}( nh )|

\displaystyle  \lessapprox d^{1/2} s^{1/2} [r_1,r_2]^{1/2} (r_1,r_2,a_2-a_1,n)^{1/2}

where {d := (a_2 r_1^3 - a_1 r_2^3, n, s)}.

Proof: From Lemma 8 we have

\displaystyle  K_3( a_i h; r_i s ) = K_3( a_i \bar{s}^3 h; r_i) K_3( a_i \overline{r_i}^3 h; s )

and so it suffices to prove the estimates

\displaystyle  |\sum_{h \in ({\bf Z}/[r_1, r_2] {\bf Z})^\times} K_3(a_1 \bar{s}^3 h; r_1) \overline{K_3(a_2 \bar{s}^3 h; r_2)} e_{[r_1,r_2]}( \bar{s} nh )|

\displaystyle  \lessapprox [r_1,r_2]^{1/2} (r_1,r_2,a_2-a_1,n)^{1/2}

and

\displaystyle  |\sum_{h \in ({\bf Z}/s {\bf Z})^\times} K_3(a_1 \overline{r_1}^3 h; s) \overline{K_3(a_2 \overline{r_2}^3 h; s)} e_{s}( \overline{[r_1,r_2]} nh )| \lessapprox d^{1/2} s^{1/2}.

By further application of Lemma 8, together with the divisor bound, it suffices to show that

\displaystyle  |\sum_{h \in ({\bf Z}/p {\bf Z})^\times} K_3(b_1 h; d_1) \overline{K_3(b_2 h; d_2)} e_{p}( mh )| \ll p^{1/2} (b_1-b_2,m,d_1,d_2)^{1/2}

whenever {[d_1,d_2] = p}, {m \in {\bf Z}/p{\bf Z}}, and {b_1,b_2 \in ({\bf Z}/p{\bf Z})^\times}.

Suppose first that {d_2=1}, so that {d_1=p}. Then {\overline{K_3(b_2 h; d_2)} = 1}, and the left-hand side simplifies to

\displaystyle  |\sum_{h \in ({\bf Z}/p {\bf Z})^\times} K_3(b_1 h; p) e_{p}( mh )|

which can be expanded as

\displaystyle  \frac{1}{p} | \sum_{x,y,z \in ({{\bf Z}/p{\bf Z}})^\times} e_p( m \overline{b_1} xyz + x + y + z ) |.

Performing the Fourier summation in {z}, this can be bounded by the sum of

\displaystyle  | \sum_{x,y \in ({{\bf Z}/p{\bf Z}})^\times} e_p( x + y ) 1_{m \overline{b_1} xy = -1}|

and

\displaystyle  \frac{1}{p} | \sum_{x,y \in ({{\bf Z}/p{\bf Z}})^\times} e_p( x + y ) |.

The first term either vanishes or is a Kloosterman sum, and is {O(\sqrt{p})} in either case, while the second term can be calculated by Fourier series to be {O(1)}, and the claim follows. Similarly if {d_1=1}.

Now suppose that {d_1=d_2=p} and {b_1-b_2=m=0\ (p)}. Then we use the Deligne bound {|K_3(h; p)| \ll 1} to obtain the desired claim. The only remaining case is when {d_1=d_2=p} and {b_1-b_2 \neq 0\ (p)} or {m \neq 0\ (p)}, so our task is to show that

\displaystyle  |\sum_{h \in ({\bf Z}/p {\bf Z})^\times} K_3(b_1 h; p) \overline{K_3(b_2 h; p)} e_{p}( mh )| \ll p^{1/2}

in this case. A proof of this claim (which uses the full strength of Deligne’s work) can be found in this paper of Michel (see also Proposition 6 of this recent expository note of Fouvry, Kowalski, and Michel). \Box
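The twisted multiplicativity used at the start of this proof, namely {K_3(a; q_1 q_2) = K_3(a \overline{q_2}^3; q_1) K_3(a \overline{q_1}^3; q_2)} for coprime square-free {q_1,q_2}, can be confirmed numerically for small moduli. The following is a brute-force sketch with illustrative function names; it is only meant for moduli small enough that a double loop over the units is cheap.

```python
import cmath
from math import gcd


def k3(a, q):
    """K_3(a; q) from (15), summing over units x, y with z = a / (x y) modulo q."""
    units = [u for u in range(1, q + 1) if gcd(u, q) == 1]
    total = 0j
    for x in units:
        for y in units:
            z = a * pow(x * y, -1, q) % q
            total += cmath.exp(2j * cmath.pi * ((x + y + z) % q) / q)
    return total / q


def twisted_product(a, q1, q2):
    """K_3(a * inv(q2)^3; q1) * K_3(a * inv(q1)^3; q2) for coprime q1, q2."""
    inv2 = pow(q2, -1, q1)  # inverse of q2 modulo q1
    inv1 = pow(q1, -1, q2)  # inverse of q1 modulo q2
    return k3(a * inv2 ** 3 % q1, q1) * k3(a * inv1 ** 3 % q2, q2)


# Compare both sides for q = 3 * 5 and every unit a modulo 15.
for a in (1, 2, 4, 7, 8, 11, 13, 14):
    print(a, abs(k3(a, 15) - twisted_product(a, 3, 5)))
```

The cube on the inverses comes from rescaling each of the three variables {x,y,z} by the same factor, which multiplies the constraint {xyz=a} by the cube of that factor.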

From this and completion of sums we have

Lemma 13 (Correlation of hyper-Kloosterman sums, II) Let {s,r_1,r_2} be square-free numbers of polynomial size with {(s,r_1)=(s,r_2)=1}. Let {a_1 \in ({\bf Z}/r_1 s{\bf Z})^\times}, {a_2 \in ({\bf Z}/r_2s{\bf Z})^\times}. Let {\Psi: {\bf Z} \rightarrow {\bf C}} be a smooth function adapted to the interval {[-2H,2H]} which equals one on {[-H,H]}. Then

\displaystyle  |\sum_h \Psi(h) K_3(a_1 h; r_1 s) \overline{K_3(a_2 h; r_2 s)} e_{[r_1,r_2] s}( nh )|

\displaystyle  \lessapprox (\frac{H}{[r_1,r_2]s} + 1) d^{1/2} s^{1/2} [r_1,r_2]^{1/2} (a_2-a_1,r_1,r_2)^{1/2}

where {d := (a_2 r_1^3 - a_1 r_2^3, s)}, and the sum runs over those {h} coprime to {[r_1,r_2]s}.

— 2. Type I estimate —

We begin the proof of Theorem 4, closely following the arguments from Section 5 of this previous post. Let {I, a, N, M, \alpha, \beta} be as in the hypotheses of {Type''_I[\varpi,\delta,\sigma]}. We can restrict {q} to the range

\displaystyle  q \gtrapprox x^{1/2}

for some sufficiently slowly decaying {o(1)}, since otherwise we may use the Bombieri-Vinogradov theorem (Theorem 4 from this previous post). Thus, by dyadic decomposition, we need to show that

\displaystyle  \sum_{d \in {\mathcal S}_I \cap {\mathcal D}_{x^\delta}^2: D \leq d < 2D} |\Delta(\alpha \ast \beta; a\ (d))| \ll NM \log^{-A} x \ \ \ \ \ (16)

for any fixed {A} and for any {D} in the range

\displaystyle  x^{1/2} \lessapprox D \lessapprox x^{1/2+2\varpi}.

Let

\displaystyle  \epsilon > 0 \ \ \ \ \ (17)

be a sufficiently small fixed exponent.

By Lemma 11 of this previous post, we know that for all {d} in {[D,2D]} outside of a small number of exceptions, we have

\displaystyle  \prod_{p|d: p \leq D_0} p \lessapprox 1 \ \ \ \ \ (18)

where

\displaystyle  D_0 := \exp(\log^{1/3} x). \ \ \ \ \ (19)

Specifically, the number of exceptions in the interval {[D,2D]} is {O(D \log^{-A} x)} for any fixed {A>0}. The contribution of the exceptional {d} can be shown to be acceptable by Cauchy-Schwarz and trivial estimates (see Section 5 of this previous post), so we restrict attention to those {d} for which (18) holds. In particular, as {d} is restricted to be doubly {x^\delta}-densely divisible we may factor

\displaystyle  d=qr

with {q,r} coprime and square-free, with {q \in {\mathcal S}_{I'}} {x^\delta}-densely divisible with {I' := [D_0,\infty) \cap I}, and

\displaystyle  x^{-\epsilon-\delta} N \lessapprox r \lessapprox x^{-\epsilon} N

and

\displaystyle  x^{1/2} \lessapprox qr \lessapprox x^{1/2+2\varpi}.

Here we use the easily verified fact that {N \gtrapprox x^\epsilon}. Since {d} is {x^\delta}-densely divisible, we also have {qr \in {\mathcal D}_{x^\delta}}.

By dyadic decomposition, it thus suffices to show that

\displaystyle  \sum_{q \in {\mathcal S}_{I'} \cap {\mathcal D}_{x^\delta}: q \sim Q} \sum_{r \in {\mathcal S}_I: r \sim R; qr \in {\mathcal S}_I \cap {\mathcal D}_{x^\delta}} |\Delta(\alpha \ast \beta; a\ (qr))| \ll NM \log^{-A} x

for any fixed {A>0}, where {Q, R \geq 1} obey the size conditions

\displaystyle  x^{-\epsilon-\delta} N \lessapprox R \lessapprox x^{-\epsilon} N \ \ \ \ \ (20)

and

\displaystyle  x^{1/2} \lessapprox QR \lessapprox x^{1/2 + 2\varpi}. \ \ \ \ \ (21)

Fix {Q,R}. We abbreviate {\sum_{q \in {\mathcal S}_{I'} \cap {\mathcal D}_{x^\delta}: q \sim Q}} and {\sum_{r \in {\mathcal S}_I: r \sim R}} by {\sum_q} and {\sum_r} respectively, thus our task is to show that

\displaystyle  \sum_q \sum_{r: qr \in {\mathcal S}_I \cap {\mathcal D}_{x^\delta}} |\Delta(\alpha \ast \beta; a\ (qr))| \ll NM \log^{-A} x.

We now split the discrepancy

\displaystyle  \Delta(\alpha \ast \beta; a\ (qr)) = \sum_{n = a\ (qr)} \alpha \ast \beta(n) - \frac{1}{\phi(qr)} \sum_{n: (n,qr)=1} \alpha \ast \beta(n)

as the sum of the subdiscrepancies

\displaystyle  \sum_{n: n = a\ (qr)} \alpha \ast \beta(n) - \frac{1}{\phi(q)} \sum_{n: (n,q)=1; n = a\ (r)} \alpha \ast \beta(n)

and

\displaystyle  \frac{1}{\phi(q)} \sum_{n: (n,q)=1; n = a\ (r)} \alpha \ast \beta(n) - \frac{1}{\phi(qr)} \sum_{n: (n,qr)=1} \alpha \ast \beta(n).

In Section 5 of this previous post, it was established (using the Bombieri-Vinogradov theorem) that

\displaystyle  \sum_{q} \sum_{r; (q,r)=1} |\frac{1}{\phi(q)} \sum_{n: (n,q)=1; n = a\ (r)} \alpha \ast \beta(n) - \frac{1}{\phi(qr)} \sum_{n: (n,qr)=1} \alpha \ast \beta(n)| \ll

\displaystyle  NM \log^{-A} x

so it suffices to show that

\displaystyle  \sum_{q} \sum_{r; qr \in {\mathcal S}_I \cap {\mathcal D}_{x^\delta}} |\sum_{n: n = a\ (qr)} \alpha \ast \beta(n) - \frac{1}{\phi(q)} \sum_{n: (n,q)=1; n = a\ (r)} \alpha \ast \beta(n)| \ \ \ \ \ (22)

\displaystyle  \ll NM \log^{-A} x.

As in the previous notes, we will not take advantage of the {r} summation, and use crude estimates to reduce to showing that

\displaystyle  \sum_{q: qr \in {\mathcal S}_I \cap {\mathcal D}_{x^\delta}} |\sum_{n: n = a\ (qr)} \alpha \ast \beta(n) - \frac{1}{\phi(q)} \sum_{n: (n,q)=1; n = a\ (r)} \alpha \ast \beta(n)| \ \ \ \ \ (23)

\displaystyle \ll NM R^{-1} \tau(r)^{O(1)} \log^{-A} x

for each individual {r \in {\mathcal S}_I} with {r \sim R}, which we now select. It will suffice to prove the slightly stronger statement

\displaystyle  \sum_{q: qr \in {\mathcal S}_I \cap {\mathcal D}_{x^\delta}} |\sum_{n: n = a\ (r); n= b\ (q)} \alpha \ast \beta(n) - \sum_{n: (n,q)=1; n = a\ (r); n = b'\ (q)} \alpha \ast \beta(n)| \ \ \ \ \ (24)

\displaystyle \ll NM R^{-1} \tau(r)^{O(1)} \log^{-A} x

for all {a,b,b'} coprime to {P_I}, since if one then specialises to the case when {b=a} and averages over all primitive {b'\ (P_I)} we obtain (23) from the triangle inequality.

We use the dispersion method. We write the left-hand side of (24) as

\displaystyle \sum_{q: qr \in {\mathcal S}_I \cap {\mathcal D}_{x^\delta}} c_q (\sum_{n: n = a\ (r); n= b\ (q)} \alpha \ast \beta(n) - \sum_{n: n = a\ (r); n = b'\ (q)} \alpha \ast \beta(n))

for some bounded sequence {c_q} (which may also depend on {r}, but we suppress this dependence). This expression may be rearranged as

\displaystyle  \sum_m \alpha(m) (\sum_{q,n: mn = a\ (r); qr \in {\mathcal S}_I \cap {\mathcal D}_{x^\delta}} c_{q} \beta(n) (1_{mn = b\ (q)} - 1_{mn = b'\ (q)})),

so from the Cauchy-Schwarz inequality and crude estimates it suffices to show that

\displaystyle  \sum_{m} \psi_M(m) |\sum_{q,n: mn = a\ (r); qr \in {\mathcal S}_I \cap {\mathcal D}_{x^\delta}} c_{q} \beta(n) (1_{mn = b\ (q)} - 1_{mn = b'\ (q)})|^2 \ \ \ \ \ (25)

\displaystyle  \ll N^2 M R^{-2} \tau(r)^{O(1)} \log^{-A} x

for any fixed {A>0}, where {\psi_M} is a smooth coefficient sequence at scale {M}. Expanding out the square, it suffices to show that

\displaystyle  \sum_{m} \psi_M(m) \sum_{q_1,q_2,n_1,n_2: mn_1=mn_2 = a\ (r); q_1r,q_2r \in {\mathcal S}_I \cap {\mathcal D}_{x^\delta}} \ \ \ \ \ (26)

\displaystyle  c_{q_1} \overline{c_{q_2}} \beta(n_1) \overline{\beta(n_2)} 1_{mn_1 = b\ (q_1)} 1_{mn_2 = b'\ (q_2)}

\displaystyle  = X + O( N^2 M R^{-2} \tau(r)^{O(1)} \log^{-A} x )

where {q_1,q_2} are subject to the same constraints as {q} (thus {q_i \in {\mathcal S}_{I'} \cap {\mathcal D}_{x^\delta}} and {q_i \sim Q} for {i=1,2}), and {X} is some quantity that is independent of {b,b'}.

Observe that {n_1} must be coprime to {q_1r} and {n_2} coprime to {q_2r}, with {n_1 = n_2\ (r)}, to have a non-zero contribution to (26). We then rearrange the left-hand side as

\displaystyle  \sum_{q_1,q_2: q_1r,q_2r \in {\mathcal S}_I \cap {\mathcal D}_{x^\delta}} \sum_{m} \psi_M(m) \sum_{n_1,n_2: n_1=n_2\ (r); (n_1,q_1r)=(n_2,q_2)=1}

\displaystyle c_{q_1} \overline{c_{q_2}} \beta(n_1) \overline{\beta(n_2)} 1_{m = a/n_1\ (r); m = b/n_1\ (q_1); m = b'/n_2\ (q_2)};

note that these inverses in the various rings {{\bf Z}/r{\bf Z}}, {{\bf Z}/q_1{\bf Z}}, {{\bf Z}/q_2{\bf Z}} are well-defined thanks to the coprimality hypotheses.

We may write {n_2 = n_1+kr} for some {k = O(N/R)}. By the triangle inequality, and relabeling {n_1} as {n}, it thus suffices to show that for any particular

\displaystyle  k = O(N/R), \ \ \ \ \ (27)

one has

\displaystyle  \sum_{k = O(N/R)} \sum_{q_1,q_2: q_1r,q_2r \in {\mathcal S}_I \cap {\mathcal D}_{x^\delta}} |\sum_{n; (n,q_1r)=(n+kr,q_2)=1} \ \ \ \ \ (28)

\displaystyle  c_{q_1} \overline{c_{q_2}} \beta(n) \overline{\beta(n+kr)} \sum_{m} \psi_M(m) 1_{m = a/n\ (r); m = b/n\ (q_1); m = b'/(n+kr) (q_2)}|

\displaystyle  = X + O( N^2 M R^{-2} \log^{-A} x )

for some {X} independent of {b}, {b'}.

At this stage in previous posts we isolated the coprime case {(q_1,q_2)=1} as the dominant case, using a controlled multiplicity hypothesis to deal with the non-coprime case. Here, we will carry the non-coprime case with us for a little longer so as not to rely on a controlled multiplicity hypothesis; this introduces some additional factors of {q_0 := (q_1,q_2)} into the analysis but they should be ignored on a first reading.

Applying completion of sums (Section 2 from this previous post), we can express the left-hand side of (28) as a main term

\displaystyle  \sum_{k = O(N/R)} \sum_{q_1,q_2: q_1r,q_2r \in {\mathcal S}_I \cap {\mathcal D}_{x^\delta}} |\sum_{n; (n,q_1r)=(n+kr,q_2)=1} \ \ \ \ \ (29)

\displaystyle  c_{q_1} \overline{c_{q_2}} \beta(n) \overline{\beta(n+kr)} (\sum_{m} \psi_M(m)) \frac{1}{r[q_1,q_2]} 1_{b/n = b'/(n+kr)\ ((q_1,q_2))}

plus an error term

\displaystyle  O( \frac{1}{H} \sum_{k=O(N/R)} \sum_{1 \leq h \leq H} \sum_{q_1,q_2} |\sum_{n} \beta(n) \overline{\beta(n+kr)} \Phi_{k,r}(h,q_1,q_2; n)| ) \ \ \ \ \ (30)

\displaystyle  + O( x^{-A} ),

where

\displaystyle  H := x^\epsilon Q^2 R/M \ \ \ \ \ (31)

and {\Phi = \Phi_{k,r}} is the phase

\displaystyle  \Phi(h,q_1,q_2;n) := 1_{(n,r)=(n,q_1)=(n+kr,q_2)=1} 1_{q_1,q_2,q_1r,q_2r \in {\mathcal S}_I \cap {\mathcal D}_{x^\delta}} \ \ \ \ \ (32)

\displaystyle  1_{b/n=b'/(n+kr)\ ((q_1,q_2))}

\displaystyle  e_r( \frac{ah}{nq_1 q'_2} ) e_{q_1}( \frac{bh}{n r q'_2} ) e_{q'_2}( \frac{b' h}{(n+kr) r q_1} ),

where {q'_2 := q_2/(q_1,q_2)}.

Let us first deal with the main term (29). The contribution of the coprime case {(q_1,q_2)=1} does not depend on {b,b'} and can thus be absorbed into the {X} term. Now we consider the contribution of the non-coprime case when {q_0 = (q_1,q_2) > 1}. We may estimate the contribution of this case by

\displaystyle  O( \sum_{k = O(N/R)} \sum_{q_0 \in {\mathcal S}_{I'}: 1 < q_0 \ll Q, (q_0,r)=1} \sum_{q'_1,q'_2 \sim Q/q_0} \sum_{n: b/n = b'/(n+kr)\ (q_0)}

\displaystyle  |\beta(n)| |\beta(n+kr)| M \frac{1}{rq_0 q'_1 q'_2} ).

We may estimate {|\beta(n)| |\beta(n+kr)|} by {|\beta(n)|^2 + |\beta(n+kr)|^2}. We just estimate the contribution of {|\beta(n)|^2}, as the other case is treated similarly (after shifting {n} by {kr}). We rearrange this contribution as

\displaystyle  O( \sum_{q_0 \in {\mathcal S}_{I'}: 1 < q_0 \ll Q, (q_0,r)=1} \sum_{q'_1,q'_2 \sim Q/q_0} \sum_{n}

\displaystyle  |\beta(n)|^2 M \frac{1}{Rq_0 q'_1 q'_2} \sum_{k = O(N/R)} 1_{b/n = b'/(n+kr)\ (q_0)} ).

The {k} summation is {O( 1 + \frac{N}{Rq_0} )}. Evaluating the {n} and {q'_1,q'_2} summations, we obtain a bound of

\displaystyle  O( \frac{MN}{R} \log^{O(1)} x \sum_{q_0 \in {\mathcal S}_{I'}: 1 < q_0 \ll Q} \frac{1}{q_0} ( 1 + \frac{N}{Rq_0} ) ).

Since {q_0 > 1} and {q_0 \in {\mathcal S}_{I'}}, we have {q_0 \geq D_0}, and so we may evaluate the {q_0} summation as

\displaystyle  O( \frac{MN}{R} \log^{O(1)} x (1 + \frac{N}{RD_0} ) ).

By (20) and (19), this is {O( N^2 M R^{-2} \log^{-A} x )} as required.

It remains to control (30). We may assume that {H \geq 1}, as the claim is trivial otherwise. It will suffice to obtain the bound

\displaystyle  \frac{1}{H} \sum_{k=O(N/R)} \sum_{1 \leq h \leq H} \sum_{q_1,q_2 \sim Q} |\sum_{n} \beta(n) \overline{\beta(n+kr)} \Phi_{k,r}(h,q_1,q_2; n)|

\displaystyle  \lessapprox x^{-\epsilon} N^2 M R^{-2}.

Using (31), it will suffice to show that

\displaystyle  \sum_{1 \leq h \leq H} \sum_{q_1,q_2 \sim Q} |\sum_{n} \beta(n) \overline{\beta(n+kr)} \Phi_{k,r}(h,q_1,q_2; n)|

\displaystyle  \lessapprox Q^2 N

for each {k = O(N/R)}.

We now work with a single {k}, and abbreviate {\Phi_{k,r}} as {\Phi}. To proceed further, we write {q_0 := (q_1,q_2)} and {q_1 = q_0 q'_1}, {q_2 = q_0 q'_2}; it then suffices to show that

\displaystyle  \sum_{1 \leq h \leq H} \sum_{q'_1,q'_2 \sim Q/q_0: (q'_1,q'_2) = 1} \ \ \ \ \ (33)

\displaystyle  |\sum_{n} \beta(n) \overline{\beta(n+kr)} \Phi(h,q_0 q'_1,q_0 q'_2; n)|

\displaystyle  \lessapprox Q^2 N / q_0

for each {q_0 \geq 1}.

Henceforth we work with a single choice of {q_0}. We pause to verify the relationship

\displaystyle  H \lessapprox Q.

From (31) and (21), this follows from the assertion that

\displaystyle  x^{1/2+2\varpi+\epsilon} \lessapprox M,

but this follows from (5), (6) if {\epsilon} is sufficiently small depending on {c}.

As {q_1} is {x^\delta}-densely divisible, we may now factor {q_1 = s_1 t_1} where

\displaystyle  x^{-\delta} Q/H \lessapprox s_1 \lessapprox Q/H

and thus

\displaystyle  H \lessapprox t_1 \lessapprox x^\delta H.
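The factorisation step can be tried out on small numbers. The following Python sketch uses helper names of our own choosing (`divisors`, `is_densely_divisible`, `split_near`), with a toy modulus standing in for {q_1} and a toy target standing in for {Q/H}. It tests {y}-dense divisibility through the ratios of consecutive divisors, and then picks the largest divisor not exceeding the target.

```python
from math import isqrt

def divisors(n):
    """Sorted positive divisors of n, by trial division (small n only)."""
    small = [d for d in range(1, isqrt(n) + 1) if n % d == 0]
    return sorted(set(small + [n // d for d in small]))

def is_densely_divisible(n, y):
    """True iff every 1 <= R <= n has a divisor of n in [R/y, R].

    Equivalent test: consecutive divisors d < d' of n satisfy d' <= y * d.
    """
    ds = divisors(n)
    return all(hi <= y * lo for lo, hi in zip(ds, ds[1:]))

def split_near(n, y, R):
    """Factor n = s * t with s the largest divisor of n not exceeding R.

    If n is y-densely divisible and 1 <= R <= n, then R / y <= s <= R.
    """
    s = max(d for d in divisors(n) if d <= R)
    return s, n // s

# Toy stand-in for q_1: a squarefree number that is 2-densely divisible.
q1 = 2 * 3 * 5 * 7 * 11
s1, t1 = split_near(q1, 2, 50)   # toy target R = 50 in place of Q/H
```

Taking the largest admissible divisor is only one possible choice; any divisor in {[R/y, R]} would serve equally well in the argument.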

Factoring out {q_0}, we may then write {q'_1 = s'_1 t'_1} where

\displaystyle  q_0^{-1} x^{-\delta} Q/H \lessapprox s'_1 \lessapprox Q/H

and

\displaystyle  q_0^{-1} H \lessapprox t'_1 \lessapprox x^\delta H.

By dyadic decomposition, it thus suffices to show that

\displaystyle  \sum_{1 \leq h \leq H} \sum_{s'_1 \sim S; t'_1 \sim T; q'_2 \sim Q/q_0: (s'_1 t'_1,q'_2) = 1}

\displaystyle |\sum_{n} \beta(n) \overline{\beta(n+kr)} \Phi(h,q_0 s'_1 t'_1,q_0 q'_2; n)|

\displaystyle  \lessapprox Q^2 N / q_0

whenever {S,T} are such that

\displaystyle  q_0^{-1} x^{-\delta} Q/H \lessapprox S \lessapprox Q/H

and

\displaystyle  q_0^{-1} H \lessapprox T \lessapprox x^\delta H

and

\displaystyle  ST \sim Q/q_0.

We rearrange this estimate as

\displaystyle  |\sum_{n; s'_1 \sim S; q'_2 \sim Q/q_0} \beta(n) \overline{\beta(n+kr)} \sum_{1 \leq h \leq H; t'_1 \sim T} c_{h,s'_1,t'_1,q'_2} \Phi(h,q_0 s'_1 t'_1,q_0 q'_2; n)|

\displaystyle  \lessapprox QSTN

for some bounded sequence {c_{h,s'_1,t'_1,q'_2}} which is only non-zero when

\displaystyle  (s'_1 t'_1,q'_2) = (q_0,s'_1t'_1) = (q_0,q'_2) = 1.

By Cauchy-Schwarz and crude estimates, it then suffices to show that

\displaystyle  \sum_{n; s'_1 \sim S; q'_2 \sim Q/q_0} \psi_N(n) |\sum_{1 \leq h \leq H; t'_1 \sim T} c_{h,s'_1,t'_1,q'_2} \Phi(h,q_0 s'_1 t'_1,q_0 q'_2; n)|^2

\displaystyle  \lessapprox QST^2 N q_0

where {\psi_N} is a smooth coefficient sequence at scale {N}. The left-hand side may be bounded by

\displaystyle  \sum_{1 \leq h,\tilde h \leq H; t'_1,\tilde t'_1 \sim T; s'_1 \sim S; q'_2 \sim Q/q_0; (s'_1t'_1\tilde t'_1,q'_2)=1} \ \ \ \ \ (34)

\displaystyle  |\sum_n \psi_N(n) \Phi(h, q_0 s'_1 t'_1,q_0 q'_2; n) \overline{ \Phi(\tilde h,q_0 s'_1 \tilde t'_1,q_0 q'_2; n) } |.

The contribution of the diagonal case {h \tilde t'_1 = \tilde h t'_1} is {\lessapprox HTSQ N/q_0} by the divisor bound, which is acceptable since {q_0 T \gtrapprox H}. Thus it suffices to control the off-diagonal case {h\tilde t'_1 \neq \tilde ht'_1}. Observe that for a given choice of {h,\tilde h,s'_1,t'_1,\tilde t'_1,q'_2}, the phase {\Phi(h,q_0 s'_1 t'_1,q_0 q'_2; n) \overline{ \Phi(\tilde h,q_0 s'_1 \tilde t'_1,q_0 q'_2; n) }} either vanishes identically, or is equal to

\displaystyle  1_{b/n=b'/(n+kr)\ (q_0)} e_{r q_0 s'_1 [t'_1, \tilde t'_1]}(\frac{c_1}{n} ) e_{q_0 q'_2}(\frac{c_2}{n+kr} )

for some quantities {c_1,c_2} with

\displaystyle  (c_1,r) = (h\tilde t'_1-\tilde ht'_1, r).

Also, by construction, {rq_0 s'_1 t'_1}, {rq_0 s'_1 \tilde t'_1} and {q_0 q'_2} are {x^\delta}-densely divisible, so {[rq_0 s'_1 t'_1 \tilde t'_1,q_0 q'_2]} is as well. (Here we use the fact that the least common multiple of two {x^\delta}-densely divisible numbers is again {x^\delta}-densely divisible, which follows from the more general fact that if {q=q_1 q_2}, {q_2 \leq \sqrt{q}}, and {q_1} is {x^\delta}-densely divisible, then {q} is also.) The condition {1_{b/n=b'/(n+kr)\ (q_0)}} is either not satisfiable, or restricts {n} to a congruence class {a\ (q)} for some {q} dividing {q_0}. We can thus apply Corollary 11 and bound

\displaystyle  |\sum_n \psi_N(n) \Phi(h, q_0 s'_1 t'_1,q_0 q'_2; n) \overline{ \Phi(\tilde h,q_0 s'_1 \tilde t'_1,q_0 q'_2; n) } |

by

\displaystyle  \lessapprox N^{1/2} (Rq_0 ST^2 Q/q_0)^{1/6} x^{\delta/6} + \frac{(h\tilde t'_1-\tilde ht'_1, r)}{R} N.
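The general fact about dense divisibility quoted in the parenthetical remark above (if {q = q_1 q_2} with {q_2 \leq \sqrt{q}}, i.e. {q_2 \leq q_1}, and {q_1} is {y}-densely divisible, then so is {q}) can be checked by brute force on small numbers. The Python sketch below is only a sanity check, not a proof; the function names and the search range are our own choices.

```python
from math import isqrt

def divisors(n):
    """Sorted positive divisors of n, by trial division (small n only)."""
    small = [d for d in range(1, isqrt(n) + 1) if n % d == 0]
    return sorted(set(small + [n // d for d in small]))

def is_densely_divisible(n, y):
    """y-dense divisibility, tested via ratios of consecutive divisors."""
    ds = divisors(n)
    return all(hi <= y * lo for lo, hi in zip(ds, ds[1:]))

def closure_counterexamples(y, bound):
    """Pairs (q1, q2) with q2 <= q1 <= bound, q1 y-densely divisible,
    but q1 * q2 not y-densely divisible. Expected to be empty."""
    return [(q1, q2)
            for q1 in range(1, bound + 1)
            if is_densely_divisible(q1, y)
            for q2 in range(1, q1 + 1)
            if not is_densely_divisible(q1 * q2, y)]
```

For instance, `closure_counterexamples(2, 60)` searches all {q_1 \leq 60} and returns an empty list if the fact holds in that range.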

Bounding {Rq_0 ST^2 Q/q_0} by {x^\delta RQ^2 H}, we can thus bound the off-diagonal contribution to (34) by

\displaystyle  \lessapprox \sum_{1 \leq h,\tilde h \leq H; t'_1,\tilde t'_1 \sim T: h \tilde t'_1 \neq \tilde h t'_1}

\displaystyle  \sum_{s'_1 \sim S; q'_2 \sim Q/q_0} N^{1/2} x^{\delta/3} (RQ^2 H)^{1/6} + \frac{(h\tilde t'_1-\tilde ht'_1, r)}{R} N

which sums (using Lemma 5 of this previous post and the divisor bound) to

\displaystyle  \lessapprox (H^2 T^2 S Q / q_0 ) (N^{1/2} x^{\delta/3} (RQ^2 H)^{1/6} + \frac{1}{R} N ).

Discarding some factors of {q_0}, we reduce to showing that

\displaystyle  N^{1/2} x^{\delta/3} (RQ^2 H)^{1/6} + \frac{1}{R} N \lessapprox H^{-2} N.

From (31), (21), (5) we have {Q \lessapprox x^{1/2+2\varpi} R^{-1}} and {H \lessapprox x^{4\varpi+\epsilon} N R^{-1}}, so the previous estimate will be implied by

\displaystyle  N^{1/2} x^{\delta/3} (x^{1+8\varpi+\epsilon} N R^{-2})^{1/6} + \frac{1}{R} N \lessapprox x^{-8\varpi-2\epsilon} R^2 N^{-1}.

From (20), this will be implied by

\displaystyle  N^{1/2} x^{\delta/3} (x^{1+8\varpi+2\delta+3\epsilon} N^{-1})^{1/6} + x^{\delta+\epsilon} \lessapprox x^{-8\varpi-2\delta-4\epsilon} N

or equivalently that

\displaystyle  x^{\frac{1}{4} + 14 \varpi + 4 \delta + \frac{27}{4} \epsilon} \lessapprox N

and

\displaystyle  x^{8 \varpi + 3 \delta + 5 \epsilon} \lessapprox N

which by (6) is obeyed whenever

\displaystyle  56 \varpi + 16 \delta + 4\sigma < 1

and

\displaystyle  16 \varpi + 6 \delta + 2 \sigma < 1.

The second condition is implied by the first and may be deleted. The proof of Theorem 4 is now complete.
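The exponent bookkeeping in the last few displays is easy to get wrong, so here is a minimal Python sketch that redoes it in exact rational arithmetic. The encoding of monomials and the helper names (`mono`, `mul`, `power`, `lower_bound_on_N`, `constraint`) are our own; the final step assumes, consistently with the conditions stated above, that (6) supplies {N \gg x^{1/2-\sigma}}, and it drops {\epsilon} as negligible.

```python
from fractions import Fraction as Fr

def mono(c0=0, varpi=0, delta=0, eps=0, N=0):
    """Encode x^(c0 + c1*varpi + c2*delta + c3*eps) * N^e as ((c0, c1, c2, c3), e)."""
    return (tuple(Fr(v) for v in (c0, varpi, delta, eps)), Fr(N))

def mul(a, b):
    """Product of two monomials: exponents add."""
    return (tuple(u + v for u, v in zip(a[0], b[0])), a[1] + b[1])

def power(a, k):
    """k-th power of a monomial: exponents scale by k."""
    k = Fr(k)
    return (tuple(k * u for u in a[0]), k * a[1])

def lower_bound_on_N(lhs, rhs):
    """Rewrite lhs <= rhs as x^E <= N and return E (N-exponent must be larger on the right)."""
    gap = rhs[1] - lhs[1]
    if gap <= 0:
        raise ValueError("N must carry the larger exponent on the right-hand side")
    return tuple((u - v) / gap for u, v in zip(lhs[0], rhs[0]))

def constraint(E):
    """Coefficients (a, b, c) of 'a*varpi + b*delta + c*sigma < 1', obtained from
    E < 1/2 - sigma with the eps term dropped (assumption: N >= x^(1/2 - sigma))."""
    room = Fr(1, 2) - E[0]
    return (E[1] / room, E[2] / room, 1 / room)

# Right-hand side x^{-8 varpi - 2 delta - 4 eps} N of the Type I inequality.
rhs = mono(0, -8, -2, -4, N=1)
# First term: N^{1/2} x^{delta/3} (x^{1 + 8 varpi + 2 delta + 3 eps} N^{-1})^{1/6}.
lhs_first = mul(mono(delta=Fr(1, 3), N=Fr(1, 2)),
                power(mono(1, 8, 2, 3, N=-1), Fr(1, 6)))
# Second term: x^{delta + eps}.
lhs_second = mono(delta=1, eps=1)

first = lower_bound_on_N(lhs_first, rhs)    # exponents (1, varpi, delta, eps)
second = lower_bound_on_N(lhs_second, rhs)
```

Here `first` and `second` hold the coefficients of {1, \varpi, \delta, \epsilon} in the two lower bounds on {N}, and `constraint(first)`, `constraint(second)` hold the coefficients of {\varpi, \delta, \sigma} in the two final conditions.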

— 3. Type II estimate —

Now we prove Theorem 5. We repeat the Type I arguments through to (33) (noting that the hypothesis (6) is never used until that point, other than to ensure that {N \gtrapprox x^\epsilon}), thus we are again faced with the task of proving

\displaystyle  \sum_{1 \leq h \leq H} \sum_{q'_1,q'_2 \sim Q/q_0: (q'_1,q'_2) = 1} |\sum_{n} \beta(n) \overline{\beta(n+kr)} \Phi(h,q_0 q'_1,q_0 q'_2; n)|

\displaystyle  \lessapprox Q^2 N / q_0.

This time we do not have {H \lessapprox Q}; however, we claim the weaker bound

\displaystyle  H \lessapprox Q^2. \ \ \ \ \ (35)

Indeed by (31) this is equivalent to

\displaystyle  x^\epsilon R \lessapprox M

and this follows from (20) and (5), (8).

With this weaker bound (35) we have to perform Cauchy-Schwarz differently. We rearrange the left-hand side as

\displaystyle  \sum_n \beta(n) \overline{\beta(n+kr)} ( \sum_{1 \leq h \leq H}

\displaystyle  \sum_{q'_1,q'_2 \sim Q/q_0: (q'_1,q'_2) = 1} c_{h,q'_1,q'_2} \Phi(h,q_0 q'_1,q_0 q'_2; n) )

for some bounded coefficients {c_{h,q'_1,q'_2}}. Applying Cauchy-Schwarz, it then suffices to show that

\displaystyle  \sum_n \psi_N(n) |\sum_{1 \leq h \leq H}

\displaystyle  \sum_{q'_1,q'_2 \sim Q/q_0: (q'_1,q'_2) = 1} c_{h,q'_1,q'_2} \Phi(h,q_0 q'_1,q_0 q'_2; n)|^2

\displaystyle  \lessapprox Q^4 N / q_0^2.

The left-hand side may be bounded by

\displaystyle  \sum_{1 \leq h, \tilde h \leq H} \sum_{q'_1,q'_2,\tilde q'_1,\tilde q'_2 \sim Q/q_0: (q'_1,q'_2) = (\tilde q'_1,\tilde q'_2) = 1}

\displaystyle  | \sum_n \psi_N(n) \Phi(h,q_0 q'_1,q_0 q'_2; n) \overline{\Phi(\tilde h,q_0 \tilde q'_1,q_0 \tilde q'_2; n)} |.

We isolate the diagonal case {h \tilde q'_1 \tilde q'_2 = \tilde h q'_1 q'_2}. By the divisor bound, the contribution of this case is {\lessapprox HN (Q/q_0)^2}, which is acceptable by (35). So we now restrict attention to the off-diagonal case {h \tilde q'_1 \tilde q'_2 \neq \tilde h q'_1 q'_2}. The phase {\Phi(h,q_0 q'_1,q_0 q'_2; n) \overline{\Phi(\tilde h,q_0 \tilde q'_1,q_0 \tilde q'_2; n)}} either vanishes identically, or takes the form

\displaystyle  1_{b/n=b'/(n+kr)\ (q_0)} e_{r q_0 [q'_1, \tilde q'_1]}(\frac{c_1}{n} ) e_{q_0 [q'_2, \tilde q'_2]}(\frac{c_2}{n+kr} )

for some {c_1,c_2} with {(c_1,r) = (h \tilde q'_1 \tilde q'_2 - \tilde h q'_1 q'_2, r)}. By the second part of Corollary 11 we may thus bound the previous expression by

\displaystyle  \sum_{1 \leq h, \tilde h \leq H} \sum_{q'_1,q'_2,\tilde q'_1,\tilde q'_2 \sim Q/q_0: (q'_1,q'_2) = (\tilde q'_1,\tilde q'_2) = 1}

\displaystyle (r q_0 q'_1 q'_2 \tilde q'_1 \tilde q'_2)^{1/2} + (h \tilde q'_1 \tilde q'_2 - \tilde h q'_1 q'_2, r) R^{-1} N.

By the divisor bound and Lemma 5 of this previous post, this sums to

\displaystyle  \lessapprox H^2 (Q/q_0)^4 ( ( R q_0 (Q/q_0)^4 )^{1/2} + R^{-1} N ).

Discarding some factors of {q_0}, it suffices to show that

\displaystyle  (RQ^4)^{1/2} + R^{-1} N \lessapprox H^{-2} N.

From (31), (21), (5) we have {Q \lessapprox x^{1/2+2\varpi} R^{-1}} and {H \lessapprox x^{4\varpi+\epsilon} N R^{-1}}, so the previous estimate will be implied by

\displaystyle  (x^{2+8\varpi} R^{-3})^{1/2} + \frac{1}{R} N \lessapprox x^{-8\varpi-2\epsilon} R^2 N^{-1}.

From (20), this will be implied by

\displaystyle  (x^{2+8\varpi+3\delta+3\epsilon})^{1/2} N^{-3/2} + x^{\delta+\epsilon} \lessapprox x^{-8\varpi-2\delta-4\epsilon} N

or equivalently that

\displaystyle  x^{\frac{2}{5} + \frac{24}{5} \varpi + \frac{7}{5} \delta + \frac{11}{5} \epsilon} \lessapprox N

and

\displaystyle  x^{8 \varpi + 3 \delta + 5 \epsilon} \lessapprox N

which by (8) is obeyed whenever

\displaystyle  68 \varpi + 14 \delta < 1

and

\displaystyle  20 \varpi + 6 \delta < 1.

The second condition is implied by the first and may be deleted. The proof of Theorem 5 is now complete.
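As a numerical sanity check of the Type II arithmetic, one can work with exponents of {x} directly. In the Python sketch below we write {N = x^\nu} and compute the margin, in the exponent, by which the right-hand side of the inequality following "From (20)" exceeds the larger of the two left-hand terms. The function names are ours, and we assume (as the derivation of the two conditions suggests) that (8) supplies {N \gtrapprox x^{1/2-2\varpi-c}}.

```python
from fractions import Fraction as Fr

def type2_margin(varpi, delta, eps, nu):
    """Exponent of x by which the right side exceeds the left side when N = x^nu.

    Left side:  (x^{2 + 8 varpi + 3 delta + 3 eps})^{1/2} N^{-3/2} + x^{delta + eps}
    Right side: x^{-8 varpi - 2 delta - 4 eps} N
    A positive return value means the inequality holds with room to spare.
    """
    lhs_first = (2 + 8 * varpi + 3 * delta + 3 * eps) / 2 - Fr(3, 2) * nu
    lhs_second = delta + eps
    rhs = -8 * varpi - 2 * delta - 4 * eps + nu
    return rhs - max(lhs_first, lhs_second)

def smallest_nu(varpi, c):
    """Assumed lower end of the Type II range: N >= x^(1/2 - 2 varpi - c)."""
    return Fr(1, 2) - 2 * varpi - c

def first_implies_second(steps=200):
    """Grid check that 68 varpi + 14 delta < 1 forces 20 varpi + 6 delta < 1."""
    grid = [Fr(i, steps) for i in range(steps)]
    return all(20 * w + 6 * d < 1
               for w in grid for d in grid
               if 68 * w + 14 * d < 1)

# A sample point inside the region 68 varpi + 14 delta < 1 ...
inside = type2_margin(Fr(1, 100), Fr(1, 100), Fr(1, 10000),
                      smallest_nu(Fr(1, 100), Fr(1, 1000)))
# ... and one outside it.
outside = type2_margin(Fr(1, 50), Fr(1, 50), Fr(1, 10000),
                       smallest_nu(Fr(1, 50), Fr(1, 1000)))
```

The sample values of {\varpi, \delta, \epsilon, c} are arbitrary; the margin is positive at the first point and negative at the second, in line with the condition {68\varpi + 14\delta < 1}.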

— 4. Type III estimate —

We now prove Theorem 6. Let {M,N_1,N_2,N_3,\alpha,\psi_1,\psi_2,\psi_3,I,a} be as in the definition of {Type''_{III}[\varpi,\delta,\sigma]}. We will not need the full strength of double dense divisibility here, and work instead with single dense divisibility. By a finer-than-dyadic decomposition (and using the Bombieri-Vinogradov theorem to handle small moduli), it suffices to show that

\displaystyle  \sum_{q \in {\mathcal S}_I \cap {\mathcal D}_{x^\delta}: q = (1+O(x^{-\epsilon})) Q} |\Delta(\alpha * \psi_1 * \psi_2 * \psi_3; a\ (q))| \lessapprox x^{-2\epsilon} M N

for some sufficiently small fixed {\epsilon>0} and all

\displaystyle  x^{1/2} \lessapprox Q \ll x^{1/2+2\varpi}, \ \ \ \ \ (36)

where {N := N_1 N_2 N_3}.

Henceforth we work with a single choice of {Q}, and abbreviate the {q} summation as {\sum_q}. The left-hand side may then be written as

\displaystyle  \sum_q c_q \Delta(\alpha * \psi_1 * \psi_2 * \psi_3; a\ (q))

for some bounded sequence {c_q}. So it suffices to show that

\displaystyle  \sum_q c_q \sum_{n = a\ (q)} \alpha * \psi_1 * \psi_2 * \psi_3(n) = X + O( x^{-2\epsilon+o(1)} M N )

for some {X} that is independent of {a}, as the claim then follows by averaging in {a}.

The left-hand side may be rewritten as

\displaystyle  \sum_q c_q \sum_{m: (m,q)=1} \alpha(m) \sum_{n_1,n_2,n_3} \psi_1(n_1) \psi_2(n_2) \psi_3(n_3) 1_{mn_1n_2n_3 = a\ (q)}. \ \ \ \ \ (37)

Note that for {i=1,2,3} one has

\displaystyle  N_i \lessapprox x^{1/2-\sigma} \lessapprox x^{-\sigma} x^{1/2} \lessapprox x^{-\sigma} Q.

By Fourier inversion we have

\displaystyle  \psi_i(n_i) = \frac{1}{q} \sum_{-q/2 < h_i \leq q/2} \hat \psi_i(h_i/q) e_{q}( h_i n_i )

for all {-q/2 < n_i \leq q/2}, where {\hat \psi_i} is the Fourier transform

\displaystyle  \hat \psi_i(\theta) := \sum_{n_i} \psi_i(n_i) e(-\theta n_i).
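The inversion formula is just the finite Fourier transform on {{\bf Z}/q{\bf Z}}, and it relies on the support of {\psi_i} fitting inside a single period. A small Python sketch (with a toy sequence `psi` and helper names of our own) illustrates both the exact recovery for large {q} and the aliasing that occurs when {q} is smaller than the support.

```python
import cmath

def e(theta):
    """e(theta) = exp(2 pi i theta)."""
    return cmath.exp(2j * cmath.pi * theta)

def psi_hat(psi, theta):
    """Fourier transform sum_n psi(n) e(-theta n) of a finitely supported dict psi."""
    return sum(v * e(-theta * n) for n, v in psi.items())

def invert(psi, q, n):
    """(1/q) * sum over -q/2 < h <= q/2 of psi_hat(h/q) e_q(h n)."""
    hs = range(-((q - 1) // 2), q // 2 + 1)
    return sum(psi_hat(psi, h / q) * e(h * n / q) for h in hs) / q

# Toy coefficient sequence: a triangular bump supported on -3 <= n <= 3.
psi = {n: (4 - abs(n)) / 4 for n in range(-3, 4)}

recovered = [invert(psi, 11, n) for n in range(-5, 6)]   # q = 11 exceeds the support width
aliased = invert(psi, 5, 2)                              # q = 5: psi(2) and psi(-3) collide
```

In general the right-hand side of the inversion formula returns the periodisation {\sum_{n' = n\ (q)} \psi(n')}, which is why `aliased` equals {\psi(2) + \psi(-3)} rather than {\psi(2)}.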

From the smoothness of {\psi_i}, Poisson summation, and integration by parts we have the decay estimates

\displaystyle  |\hat \psi_i(\theta)| \lessapprox N_i (1 + N_i |\theta|)^{-C}

for any fixed {C \geq 0} and any {-1/2 \leq \theta \leq 1/2}. More generally, we also have the derivative estimates

\displaystyle  |\frac{d^j}{d\theta^j} \hat \psi_i(\theta)| \lessapprox N_i^{1+j} (1 + N_i |\theta|)^{-C}

for any fixed {C,j \geq 0} and any {-1/2 \leq \theta \leq 1/2}. We thus have

\displaystyle  \hat \psi_i(h_i/q) = O(x^{-100})

(say) when {|h_i| > x^\epsilon H_i}, where

\displaystyle  H_i := Q / N_i.

Furthermore, for {|h_i| \leq x^\epsilon H_i}, we can perform a Taylor expansion around {q=Q} and conclude that

\displaystyle  \frac{1}{q} \hat \psi_i(h_i/q) = \frac{1}{H_i} \sum_{j=0}^J c_{i,j,h_i} (\frac{q-Q}{Q})^j + O( x^{-100} )

for some fixed {J > 0} (depending on {\epsilon}), any {q = (1+O(x^{-\epsilon})) Q}, and some coefficients {c_{i,j,h_i} = O(1)} whose exact value will not be of importance to us. We may thus express (37), up to negligible errors, as the sum of a bounded number of expressions of the form

\displaystyle  \frac{1}{H} \sum_q c'_q \sum_{m: (m,q)=1} \alpha(m) \sum_{h_1,h_2,h_3} c''_{h_1,h_2,h_3}

\displaystyle  \sum_{n_1,n_2,n_3 \in {\bf Z}/q{\bf Z}} e_{q}( h_1 n_1 + h_2 n_2 + h_3 n_3 ) 1_{mn_1n_2n_3 = a\ (q)}

for some bounded sequences {c'_q, c''_{h_1,h_2,h_3}} whose exact value will not be of importance to us other than their support, which is contained in the sets

\displaystyle  \{ q = (1 + O(x^{-\epsilon})) Q: q \in {\mathcal S}_I \cap {\mathcal D}_{x^\delta} \}

and

\displaystyle  \{ (h_1,h_2,h_3): |h_1| \leq x^\epsilon H_1, |h_2| \leq x^\epsilon H_2, |h_3| \leq x^\epsilon H_3 \}

respectively, and where

\displaystyle  H := H_1 H_2 H_3 = Q^3 / N.

If we then introduce the modified hyper-Kloosterman sum

\displaystyle  F( h_1,h_2,h_3, a; q) :=

\displaystyle  \frac{1}{q} \sum_{n_1,n_2,n_3 \in ({\bf Z}/q{\bf Z})^\times} e_{q}( h_1 n_1 + h_2 n_2 + h_3 n_3 ) 1_{n_1n_2n_3 = a\ (q)}

defined for {h_1,h_2,h_3 \in {\bf Z}/q{\bf Z}} and {a \in ({\bf Z}/q{\bf Z})^\times}, then our objective is now to show that

\displaystyle  \sum_q \tilde c'_q \sum_{m: (m,q)=1} \alpha(m) \sum_{h_1,h_2,h_3} c''_{h_1,h_2,h_3} F( h_1,h_2,h_3, a \overline{m}; q)

\displaystyle = X' + O( x^{-2\epsilon+o(1)} H MN / Q)

for some {X'} that does not depend on {a}, where {\overline{m}} is the reciprocal of {m} in {{\bf Z}/q{\bf Z}} and {\tilde c'_q := \frac{q}{Q} c'_q}.

We may rewrite {HMN/Q} as {M Q^2}. Observe that {F(h_1,h_2,h_3, a \overline{m}; q)} is independent of {a} if {h_1} vanishes (as can be seen by dilating {n_1}), and similarly if {h_2} or {h_3} vanishes. Thus we may delete the case {h_1h_2h_3=0} from the above sum, and reduce to showing that

\displaystyle  |\sum_q \tilde c'_q \sum_{m: (m,q)=1} \alpha(m) \sum_{h_1,h_2,h_3: h_1h_2h_3 \neq 0} c''_{h_1,h_2,h_3} F( h_1,h_2,h_3, a \overline{m}; q)| \ \ \ \ \ (38)

\displaystyle  \lessapprox x^{-2\epsilon} M Q^2.

At this point we need to account for a technical problem that the {h_1,h_2,h_3} may still share a common factor with {q} even after being restricted to be non-zero. For {i=1,2,3}, let {b_i := \prod_{p|q; p^j || h_i} p^j} be the product of all the primes in {h_i} (counting multiplicity) that also divide {q}; thus {h_i = b_i h'_i} where {h'_i} is coprime to {q}. As we shall see, the case {b_1=b_2=b_3=1} is dominant, and on a first reading one may wish to focus exclusively on this case in what follows to simplify the discussion. We then write {b := \prod_{p|b_1b_2b_3} p = (h_1h_2h_3,q)}; this divides {q}, so we may write {q = bq'}. Note that as {q} is {x^\delta}-densely divisible, {q'} is {bx^\delta}-densely divisible, thus {q' \in {\mathcal S}_I \cap {\mathcal D}_{bx^\delta}}.

Now we factor {F}. From Lemma 8 we see that

\displaystyle  F( h_1,h_2,h_3, a \overline{m}; q) = F( h_1 \overline{q'},h_2 \overline{q'},h_3 \overline{q'}, a \overline{m}; b) F( h_1 \overline{b},h_2 \overline{b},h_3 \overline{b}, a \overline{m}; q').

For the second term, we observe that {h_i \overline{b}} is coprime to {q'} for {i=1,2,3}, and so by dilating the variables {n_1,n_2,n_3} we have

\displaystyle  F( h_1 \overline{b},h_2 \overline{b},h_3 \overline{b}, a \overline{m}; q') = K_3( a h_1 h_2 h_3 \overline{b}^3 \overline{m}; q' )

\displaystyle  = K_3( a_{b_1,b_2,b_3,q'} h'_1 h'_2 h'_3 \overline{m}; q')

where {a_{b_1,b_2,b_3,q'}} is the residue class

\displaystyle  a_{b_1,b_2,b_3,q'} := \frac{a b_1 b_2 b_3}{b^3}\ (q')

and we recall that {K_3} is the normalised hyper-Kloosterman sum

\displaystyle  K_3( a;q) := \frac{1}{q} \sum_{x,y,z \in ({\bf Z}/q{\bf Z})^\times: xyz = a\ (q)} e_q(x+y+z).
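Both the factorisation of {F} and the dilation identity reducing {F} to {K_3} can be confirmed numerically for a small squarefree modulus. The Python sketch below evaluates the sums by brute force; the function names and the sample values ({q = 15}, {b = 3}, {q' = 5}, and the particular {h_i} and {a}) are our own illustrative choices.

```python
import cmath
from math import gcd

def e_q(t, q):
    """e_q(t) = exp(2 pi i t / q), with t reduced mod q first."""
    return cmath.exp(2j * cmath.pi * (t % q) / q)

def units(q):
    """Representatives of the primitive residue classes mod q."""
    return [n for n in range(1, q + 1) if gcd(n, q) == 1]

def F(h1, h2, h3, a, q):
    """(1/q) * sum over units n1, n2, n3 mod q with n1 n2 n3 = a of e_q(h1 n1 + h2 n2 + h3 n3)."""
    total = 0
    for n1 in units(q):
        for n2 in units(q):
            n3 = a * pow(n1 * n2, -1, q) % q   # forced by the constraint n1 n2 n3 = a
            total += e_q(h1 * n1 + h2 * n2 + h3 * n3, q)
    return total / q

def K3(a, q):
    """Normalised hyper-Kloosterman sum, i.e. F with h1 = h2 = h3 = 1."""
    return F(1, 1, 1, a, q)

# Dilation: for h_i coprime to q, F(h1, h2, h3, a; q) = K3(a h1 h2 h3; q).
dilation_gap = abs(F(2, 7, 11, 4, 15) - K3(4 * 2 * 7 * 11, 15))

# Factorisation over q = b q' with b = 3, q' = 5 (here h1 = 3 shares a factor with q).
inv_qp_mod_b = pow(5, -1, 3)
inv_b_mod_qp = pow(3, -1, 5)
product = (F(3 * inv_qp_mod_b, 7 * inv_qp_mod_b, 11 * inv_qp_mod_b, 4, 3)
           * F(3 * inv_b_mod_qp, 7 * inv_b_mod_qp, 11 * inv_b_mod_qp, 4, 5))
factorisation_gap = abs(F(3, 7, 11, 4, 15) - product)
```

Both gaps vanish up to floating-point rounding. The three-argument `pow` with exponent `-1` computes a modular inverse and requires Python 3.8 or later.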

As for the first term, we have the following estimate:

Lemma 14 We have

\displaystyle  |F( h_1 \overline{q'},h_2 \overline{q'},h_3 \overline{q'}, a \overline{m}; b)| \leq \frac{\hbox{rad}(b_1) \hbox{rad}(b_2) \hbox{rad}(b_3)}{\hbox{rad}(b_1 b_2 b_3)^2}.

where {\hbox{rad}(a) := \prod_{p|a} p} (thus for instance {b = \hbox{rad}(b_1 b_2 b_3)}).

Proof: By further applications of Lemma 8 it suffices to show that

\displaystyle  |F( c_1, c_2, c_3, a; p)| \leq \frac{(c_1,p) (c_2,p) (c_3,p)}{p^2}

whenever {p} is prime, {c_1,c_2,c_3 \in {\bf Z}/p{\bf Z}} with {c_1c_2c_3 = 0\ (p)}, and {a \in ({\bf Z}/p{\bf Z})^\times}.

Without loss of generality we may assume that {c_3 = 0\ (p)}; we may then rewrite {F(c_1,c_2,c_3,a;p)} as

\displaystyle  \frac{1}{p} \sum_{n_1,n_2 \in ({\bf Z}/p{\bf Z})^\times} e_p( c_1 n_1 + c_2 n_2 ).

But this factors as the product of two Ramanujan sums divided by {p}, and the claim then follows by direct computation. \Box
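The "direct computation" in the proof can be replayed by machine for small primes. The following Python sketch (function names ours) evaluates {F(c_1,c_2,c_3,a;p)} by brute force, compares it with the product of two Ramanujan sums divided by {p} when {c_3 = 0\ (p)}, and checks the claimed bound for every admissible choice of {c_1,c_2,c_3,a}.

```python
import cmath
from math import gcd

def F_prime(c1, c2, c3, a, p):
    """(1/p) * sum over units n1, n2, n3 mod p with n1 n2 n3 = a of e_p(c1 n1 + c2 n2 + c3 n3)."""
    total = 0
    for n1 in range(1, p):
        for n2 in range(1, p):
            n3 = a * pow(n1 * n2, -1, p) % p
            total += cmath.exp(2j * cmath.pi * ((c1 * n1 + c2 * n2 + c3 * n3) % p) / p)
    return total / p

def ramanujan(c, p):
    """Ramanujan sum over the units mod a prime p: p - 1 if p | c, and -1 otherwise."""
    return p - 1 if c % p == 0 else -1

def lemma14_prime_case_holds(p, tol=1e-9):
    """Check |F(c1,c2,c3,a;p)| <= (c1,p)(c2,p)(c3,p)/p^2 whenever c1 c2 c3 = 0 (p)."""
    for a in range(1, p):
        for c1 in range(p):
            for c2 in range(p):
                for c3 in range(p):
                    if (c1 * c2 * c3) % p != 0:
                        continue
                    bound = gcd(c1, p) * gcd(c2, p) * gcd(c3, p) / p**2
                    if abs(F_prime(c1, c2, c3, a, p)) > bound + tol:
                        return False
    return True

def ramanujan_factorisation_holds(p, tol=1e-9):
    """Check F(c1, c2, 0, a; p) = ramanujan(c1) * ramanujan(c2) / p."""
    return all(abs(F_prime(c1, c2, 0, a, p) - ramanujan(c1, p) * ramanujan(c2, p) / p) < tol
               for a in range(1, p) for c1 in range(p) for c2 in range(p))
```

Note that `gcd(0, p)` returns `p`, matching the convention {(0,p) = p} implicit in the bound. The bound is attained with equality in some cases (for instance when {p = 2}), hence the small tolerance.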

For brevity we write {\vec b} for {b_1,b_2,b_3}. We may thus bound the left-hand side of (38) by

\displaystyle  \sum_{\vec b} \frac{\hbox{rad}(b_1) \hbox{rad}(b_2) \hbox{rad}(b_3)}{\hbox{rad}(b_1 b_2 b_3)^2} \sum_{h'_1,h'_2,h'_3}

\displaystyle  |\sum_{q'\in {\mathcal S}_I \cap {\mathcal D}_{bx^\delta}: (b h'_1 h'_2 h'_3,q')=1} \tilde c'_{bq'}

\displaystyle  \sum_{m: (bq',m)=1} \alpha(m) K_3( a_{\vec b,q'} h'_1 h'_2 h'_3 \overline{m}; q' )|

where the {h'_i} summations are over the ranges

\displaystyle  0 < |h'_i| \leq \frac{x^\epsilon H_i}{b_i}.

Writing {h := h'_1 h'_2 h'_3}, so that

\displaystyle  0 < |h| \leq \frac{x^{3\epsilon} H}{b_1 b_2 b_3},

and recalling that {b = \hbox{rad}(b_1b_2b_3)}, we may thus estimate the previous expression by

\displaystyle  \sum_{\vec b} \frac{\hbox{rad}(b_1) \hbox{rad}(b_2) \hbox{rad}(b_3)}{b^2} S_{\vec b}

where {S_{\vec b}} is the quantity

\displaystyle  S_{\vec b} := \sum_{0 < |h| \leq H_{\vec b}} \tau_3(h) |\sum_{q'\in {\mathcal S}_I \cap {\mathcal D}_{bx^\delta}: (b h,q')=1} \tilde c'_{bq'}

\displaystyle  \sum_{m: (bq',m)=1} \alpha(m) K_3( a_{\vec b,q'} h \overline{m}; q' )|

where {\tau_3(h) := \sum_{h_1,h_2,h_3: h_1h_2h_3=h} 1} is the third divisor function and

\displaystyle  H_{\vec b} := \frac{x^{3\epsilon} H}{b_1 b_2 b_3}.

Our task is now to show that

\displaystyle  \sum_{\vec b} \frac{\hbox{rad}(b_1) \hbox{rad}(b_2) \hbox{rad}(b_3)}{b^2} S_{\vec b} \lessapprox x^{-2\epsilon} M Q^2. \ \ \ \ \ (39)

We now focus on estimating {S_{\vec b}}.

We let

\displaystyle  1 \leq S_0 \leq y Q_{\vec b} \ \ \ \ \ (40)

be a quantity to be optimised later, where

\displaystyle  Q_{\vec b} := \frac{Q}{b}

and

\displaystyle y := x^\delta b.

We may assume that

\displaystyle  Q_{\vec b}, H_{\vec b} \gg 1 \ \ \ \ \ (41)

since otherwise the sum defining {S_{\vec b}} is empty.

Observe that every {q'} that appears in the expression for {S_{\vec b}} is {y}-densely divisible and may thus be factored as {q'=rs} for some coprime {r,s} with

\displaystyle  y^{-1} S_0 \ll s \ll S_0

with {rs \sim Q_{\vec b}}. Thus we may write

\displaystyle  S_{\vec b} \ll \sum_{y^{-1} S_0 \ll s \ll S_0} \sum_{0 < |h| \leq H_{\vec b}} \tau_3(h) |\sum_{r \in {\mathcal S}_I: (b h,rs)=(r,s)=1} \tilde c'_{brs} \sum_{m: (brs,m)=1} \alpha(m)

\displaystyle  K_3( a_{\vec b,rs} h \overline{m}; rs )|

where, as before,

\displaystyle  H_{\vec b} := \frac{x^{3\epsilon} H}{b_1 b_2 b_3}.

From crude estimates we have

\displaystyle  \sum_{y^{-1} S_0 \ll s \ll S_0} \sum_{0 < |h| \leq H_{\vec b}} \frac{\tau_3(h)^2}{s} \lessapprox H_{\vec b}

so from the Cauchy-Schwarz inequality we have

\displaystyle  S_{\vec b} \lessapprox H_{\vec b}^{1/2} (S'_{\vec b})^{1/2} \ \ \ \ \ (42)

where

\displaystyle  S'_{\vec b} = \sum_{y^{-1} S_0 \ll s \ll S_0} \sum_h \Psi(h) s |\sum_{r\in {\mathcal S}_I: (b h,rs)=(r,s)=1} \tilde c'_{brs}

\displaystyle  \sum_{m: (brs,m)=1} \alpha(m) K_3( a_{\vec b,rs} h \overline{m}; rs )|^2

and {\Psi = \Psi_{\vec b}} is a smooth cutoff function supported on the interval {[-2H_{\vec b},2H_{\vec b}]} which equals one on {[-H_{\vec b},H_{\vec b}]}.

Now we estimate {S'_{\vec b}}. We can expand this expression as

\displaystyle  \sum_{r_1,r_2 \in {\mathcal S}_I: Q_{\vec b}/S_0 \ll r_1 \sim r_2 \ll yQ_{\vec b}/S_0} \sum_{s \sim Q_{\vec b}/r_1: (r_1r_2,s)=(r_1r_2 s, b)=1}

\displaystyle  \sum_{m_1,m_2: (br_1s,m_1)=(br_2s,m_2)=1} \tilde c'_{br_1s} \overline{\tilde c'_{br_2 s}}

\displaystyle  s \alpha(m_1) \overline{\alpha(m_2)}

\displaystyle  \sum_{h: (r_1r_2 s,h)=1} \Psi(h) K_3( a_{\vec b,r_1s} h \overline{m_1}; r_1s ) \overline{K_3( a_{\vec b,r_2s} h \overline{m_2}; r_2s )}.

We first dispose of the diagonal case {m_1 r_1^3 - m_2 r_2^3 = 0}. Here we use the Deligne bound {|K_3| \lessapprox 1} to bound this case in magnitude by

\displaystyle  \lessapprox \sum_{r_1,r_2 \in {\mathcal S}_I: Q_{\vec b}/S_0 \ll r_1 \sim r_2 \ll yQ_{\vec b}/S_0} \sum_{s \sim Q_{\vec b}/r_1} \sum_{m_1,m_2: m_1 r_1^3 = m_2 r_2^3} s H_{\vec b}.

By the divisor bound, for each {r_1,m_1} there are {\lessapprox 1} choices for {m_2,r_2}, so this expression is

\displaystyle  \lessapprox \sum_{r_1 \in {\mathcal S}_I: Q_{\vec b}/S_0 \ll r_1 \ll yQ_{\vec b}/S_0} M (Q_{\vec b}/r_1)^2 H_{\vec b}

which sums to {\lessapprox M S_0 Q_{\vec b} H_{\vec b}}.

Applying Lemma 13 for the off-diagonal case {m_1 r_1^3 - m_2 r_2^3 \neq 0}, we thus have

\displaystyle  S'_{\vec b}\lessapprox M S_0 Q_{\vec b} H_{\vec b} + \sum_{r_1,r_2 \in {\mathcal S}_I: Q_{\vec b}/S_0 \ll r_1 \sim r_2 \ll yQ_{\vec b}/S_0} \sum_{s \sim Q_{\vec b}/r_1: (r_1r_2,s)=(r_1r_2s,b)=1} \sum_{m_1,m_2 \sim M: m_1 r_1^3 - m_2 r_2^3 \neq 0} s

\displaystyle  (\frac{ H_{\vec b}}{[r_1,r_2] s} + 1) d^{1/2} s^{1/2} [r_1,r_2]^{1/2} (r_1,r_2,m_2-m_1)^{1/2}

where

\displaystyle  d := (a_{\vec b,r_2s} \overline{m_2} r_1^3 - a_{\vec b,r_1s} \overline{m_1} r_2^3, s )

\displaystyle  = (m_1 r_1^3 - m_2 r_2^3, s ).

Using the bound {s \sim Q_{\vec b}/r_1}, this becomes

\displaystyle  S'_{\vec b}\lessapprox M S_0 Q_{\vec b} H_{\vec b} + \sum_{r_1,r_2 \in {\mathcal S}_I: Q_{\vec b}/S_0 \ll r_1 \sim r_2 \ll yQ_{\vec b}/S_0; m_1r_1^3-m_2r_2^3 \neq 0} \sum_{s \sim Q_{\vec b}/r_1} \sum_{m_1,m_2 \sim M}

\displaystyle  (Q_{\vec b}/r_1) (\frac{ H_{\vec b}}{[r_1,r_2]^{1/2} (Q_{\vec b}/r_1)^{1/2}} + [r_1,r_2]^{1/2} (Q_{\vec b}/r_1)^{1/2})

\displaystyle  (m_1r_1^3 -m_2r_2^3, s)^{1/2} (r_1,r_2,m_2-m_1)^{1/2}.

By Lemma 5 from this previous post we have

\displaystyle  \sum_{s \sim Q_{\vec b}/r_1} (m_1r_1^3 -m_2r_2^3, s) \lessapprox Q_{\vec b}/r_1

and hence also

\displaystyle  \sum_{s \sim Q_{\vec b}/r_1} (m_1r_1^3 -m_2r_2^3, s)^{1/2} \lessapprox Q_{\vec b}/r_1

and thus

\displaystyle  S'_{\vec b}\lessapprox M S_0 Q_{\vec b} H_{\vec b} + \sum_{r_1,r_2 \in {\mathcal S}_I: Q_{\vec b}/S_0 \ll r_1 \sim r_2 \ll yQ_{\vec b}/S_0} \sum_{m_1,m_2 \sim M} (Q_{\vec b}/r_1)^2

\displaystyle  (\frac{ H_{\vec b}}{[r_1,r_2]^{1/2} (Q_{\vec b}/r_1)^{1/2}} + [r_1,r_2]^{1/2} (Q_{\vec b}/r_1)^{1/2}) (r_1,r_2,m_2-m_1)^{1/2}.

Similarly, we have

\displaystyle  \sum_{m_1,m_2 \sim M: d | m_2-m_1} d^{1/2} \lessapprox \frac{M^2}{d^{1/2}} + M d^{1/2}

for all {d}, so on summing over all {d|(r_1,r_2)} we have

\displaystyle  \sum_{m_1,m_2 \sim M} (r_1,r_2,m_2-m_1)^{1/2} \lessapprox M^2 + M (r_1,r_2)^{1/2}.
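The gcd sum bound can be tested numerically. In the Python sketch below we make two simplifying assumptions of our own: the range {m \sim M} is taken to be {M \leq m < 2M}, and the {x^{o(1)}} loss hidden in {\lessapprox} is replaced by the explicit divisor count {\tau(g)}, where {g} plays the role of {(r_1,r_2)}. With these conventions the inequality {\sum_{m_1,m_2} (g, m_2-m_1)^{1/2} \leq \tau(g) (M^2 + M g^{1/2})} follows from the argument in the text.

```python
from math import gcd

def tau(n):
    """Number of positive divisors of n (small n only)."""
    return sum(1 for d in range(1, n + 1) if n % d == 0)

def gcd_sum(M, g):
    """Sum over M <= m1, m2 < 2M of gcd(g, m2 - m1)^(1/2)."""
    return sum(gcd(g, m2 - m1) ** 0.5
               for m1 in range(M, 2 * M)
               for m2 in range(M, 2 * M))

def bound(M, g):
    """Explicit form of the right-hand side: tau(g) * (M^2 + M * g^(1/2))."""
    return tau(g) * (M ** 2 + M * g ** 0.5)

# Ratio of the two sides for a few toy values of g = (r_1, r_2).
ratios = {g: gcd_sum(30, g) / bound(30, g) for g in (1, 6, 30, 210, 2310)}
```

Each entry of `ratios` is at most one. The diagonal {m_1 = m_2} contributes {M g^{1/2}} on its own, which is the source of the second term in the bound.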

We thus have

\displaystyle  S'_{\vec b}\lessapprox M S_0 Q_{\vec b} H_{\vec b} + \sum_{r_1,r_2 \in {\mathcal S}_I: Q_{\vec b}/S_0 \ll r_1 \sim r_2 \ll yQ_{\vec b}/S_0} (Q_{\vec b}/r_1)^2

\displaystyle  (\frac{ H_{\vec b}}{[r_1,r_2]^{1/2} (Q_{\vec b}/r_1)^{1/2}} + [r_1,r_2]^{1/2} (Q_{\vec b}/r_1)^{1/2}) (M^2 + M (r_1,r_2)^{1/2}).

Writing {r_0 := (r_1,r_2)}, {r_1 = r_0 r'_1} and {r_2 = r_0 r'_2}, we thus have

\displaystyle  S'_{\vec b}\lessapprox M S_0Q_{\vec b} H_{\vec b} + \sum_{r_0 \ll yQ_{\vec b}/S_0} \sum_{r'_1,r'_2 \in {\mathcal S}_I: \frac{Q_{\vec b}}{S_0 r_0} \ll r'_1 \sim r'_2 \ll \frac{yQ_{\vec b}}{S_0r_0}} (\frac{Q_{\vec b}}{r_0r'_1})^2

\displaystyle  (\frac{ H_{\vec b}}{(Q_{\vec b} r'_2)^{1/2}} + (Q_{\vec b} r'_2)^{1/2}) (M^2 + M r_0^{1/2}).

Performing the {r'_2} summation, this becomes

\displaystyle  S'_{\vec b}\lessapprox M S_0 Q_{\vec b} H_{\vec b} + \sum_{r_0 \ll yQ_{\vec b}/S_0} \sum_{r'_1 \in {\mathcal S}_I: \frac{Q_{\vec b}}{S_0 r_0} \ll r'_1 \ll \frac{yQ_{\vec b}}{S_0r_0}} (\frac{Q_{\vec b}}{r_0})^2

\displaystyle  (\frac{ H_{\vec b}}{Q_{\vec b}^{1/2}} (r'_1)^{-3/2} + Q_{\vec b}^{1/2} (r'_1)^{-1/2}) (M^2 + M r_0^{1/2})

and then performing the {r'_1} summation we obtain

\displaystyle  S'_{\vec b}\lessapprox M S_0 Q_{\vec b} H_{\vec b} + \sum_{r_0 \ll yQ_{\vec b}/S_0} (\frac{Q_{\vec b}}{r_0})^2

\displaystyle  (\frac{ H_{\vec b}}{Q_{\vec b}^{1/2}} (\frac{Q_{\vec b}}{S_0 r_0})^{-1/2} + Q_{\vec b}^{1/2} (\frac{yQ_{\vec b}}{S_0 r_0})^{1/2}) (M^2 + M r_0^{1/2}).

The net power of {r_0} here is always at most {-1}, so the {r_0=1} term in the summation dominates:

\displaystyle  S'_{\vec b}\lessapprox M S_0 Q_{\vec b} H_{\vec b} + Q_{\vec b}^2

\displaystyle  (\frac{ H_{\vec b}}{Q_{\vec b}^{1/2}} (\frac{Q_{\vec b}}{S_0})^{-1/2} + Q_{\vec b}^{1/2} (\frac{yQ_{\vec b}}{S_0})^{1/2}) M^2.

We simplify this as

\displaystyle  S'_{\vec b} \lessapprox M S_0 Q_{\vec b} H_{\vec b} + M^2 H_{\vec b} Q_{\vec b} S_0^{1/2} + y^{1/2} M^2 Q_{\vec b}^3 S_0^{-1/2}. \ \ \ \ \ (43)

To optimise this in {S_0}, we select

\displaystyle  S_0 := \min( Q_{\vec b}^{4/3} M^{2/3} H_{\vec b}^{-2/3} y^{1/3}, yQ_{\vec b} ).

(The quantity {Q_{\vec b}^{4/3} M^{2/3} H_{\vec b}^{-2/3} y^{1/3}} comes from equating {MS_0 Q_{\vec b} H_{\vec b}} and {y^{1/2} M^2 Q_{\vec b}^3 S_0^{-1/2}}.) By construction, we have the second inequality in (40). We also claim the first inequality, since this is equivalent to

\displaystyle  H_{\vec b} \leq Q_{\vec b}^2 M y^{1/2}

which would follow if

\displaystyle  b^{1/2} Q \frac{b}{b_1 b_2 b_3} \leq x^{1+\delta/2-4\epsilon}.

But from (41) one has {b^{1/2} \ll Q^{1/2}} and {\frac{b}{b_1 b_2 b_3} \leq 1}, and the claim now follows from (36) and (13).
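The choice of {S_0} can be double-checked by exponent bookkeeping. The sketch below uses an ad hoc representation of our own, in which a monomial in {M, Q_{\vec b}, H_{\vec b}, y} is a dictionary of rational exponents (with `Q`, `H` standing for {Q_{\vec b}}, {H_{\vec b}}). It solves {M S_0 Q_{\vec b} H_{\vec b} = y^{1/2} M^2 Q_{\vec b}^3 S_0^{-1/2}} for {S_0}, and confirms that the balanced value is at least {1} exactly when {H_{\vec b} \leq Q_{\vec b}^2 M y^{1/2}}.

```python
from fractions import Fraction as F

VARS = ("M", "Q", "H", "y")

# Each term of (43) is (exponents of M, Q, H, y; exponent of S_0).
term1 = ({"M": F(1), "Q": F(1), "H": F(1), "y": F(0)}, F(1))          # M S_0 Q H
term3 = ({"M": F(2), "Q": F(3), "H": F(0), "y": F(1, 2)}, F(-1, 2))   # y^{1/2} M^2 Q^3 S_0^{-1/2}

def balance(t_a, t_b):
    """Solve t_a = t_b for S_0, returned as a monomial in VARS."""
    (ea, sa), (eb, sb) = t_a, t_b
    # S_0^(sa - sb) = product over v of v^(eb[v] - ea[v])
    return {v: (eb[v] - ea[v]) / (sa - sb) for v in VARS}

S0 = balance(term1, term3)

# S_0 >= 1 iff S_0^{3/2} >= 1; the latter monomial is M Q^2 H^{-1} y^{1/2}.
S0_to_three_halves = {v: e * F(3, 2) for v, e in S0.items()}

assert S0 == {"M": F(2, 3), "Q": F(4, 3), "H": F(-2, 3), "y": F(1, 3)}
assert S0_to_three_halves == {"M": F(1), "Q": F(2), "H": F(-1), "y": F(1, 2)}
```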

Inserting this value of {S_0} (using {S_0 \leq Q_{\vec b}^{4/3} M^{2/3} H_{\vec b}^{-2/3} y^{1/3}} for the first two terms in (43)), we conclude that

\displaystyle  S'_{\vec b} \lessapprox M^{5/3} Q_{\vec b}^{7/3} H_{\vec b}^{1/3} y^{1/3} + M^{7/3} H_{\vec b}^{2/3} Q_{\vec b}^{5/3} y^{1/6} + M^2 Q_{\vec b}^{5/2}.
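To see where the three terms come from, one can substitute each branch of the minimum into (43) and track exponents. The following sketch does this with the same dictionary-of-exponents convention (an illustrative device of ours, with `Q`, `H` standing for {Q_{\vec b}}, {H_{\vec b}}); the helper names `mono`, `mul`, `power`, `terms_43` are hypothetical.

```python
from fractions import Fraction as F

VARS = ("M", "Q", "H", "y")

def mono(M=0, Q=0, H=0, y=0):
    """A monomial M^a Q^b H^c y^d stored as a dict of rational exponents."""
    return {"M": F(M), "Q": F(Q), "H": F(H), "y": F(y)}

def mul(a, b):
    return {v: a[v] + b[v] for v in VARS}

def power(a, k):
    return {v: a[v] * F(k) for v in VARS}

S0_balanced = mono(M=F(2, 3), Q=F(4, 3), H=F(-2, 3), y=F(1, 3))
S0_capped = mono(Q=1, y=1)   # the other branch of the minimum, y Q_b

def terms_43(S0):
    """The three terms on the right-hand side of (43) for a given S_0."""
    t1 = mul(mono(M=1, Q=1, H=1), S0)
    t2 = mul(mono(M=2, Q=1, H=1), power(S0, F(1, 2)))
    t3 = mul(mono(M=2, Q=3, y=F(1, 2)), power(S0, F(-1, 2)))
    return t1, t2, t3

b1, b2, b3 = terms_43(S0_balanced)
_, _, c3 = terms_43(S0_capped)

assert b1 == mono(M=F(5, 3), Q=F(7, 3), H=F(1, 3), y=F(1, 3))
assert b2 == mono(M=F(7, 3), Q=F(5, 3), H=F(2, 3), y=F(1, 6))
assert b3 == b1                       # balanced case: third term equals the first
assert c3 == mono(M=2, Q=F(5, 2))     # capped case: third term is M^2 Q^{5/2}
```

The first two terms of (43) are increasing in {S_0}, which is why the upper bound {S_0 \leq Q_{\vec b}^{4/3} M^{2/3} H_{\vec b}^{-2/3} y^{1/3}} suffices for them, while the third term has to be treated separately in each branch.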

One should view the first term here as the main term. By (42), we conclude that

\displaystyle  S_{\vec b} \lessapprox M^{5/6} Q_{\vec b}^{7/6} H_{\vec b}^{2/3} y^{1/6} + M^{7/6} H_{\vec b}^{5/6} Q_{\vec b}^{5/6} y^{1/12} + M H_{\vec b}^{1/2} Q_{\vec b}^{5/4}.

Since {H_{\vec b} = x^{3\epsilon} H (b_1 b_2 b_3)^{-1} \leq x^{3\epsilon} H b^{-1}}, {Q_{\vec b} = Q b^{-1}}, and {y = x^\delta b}, we thus have

\displaystyle  S_{\vec b} \lessapprox x^{5\epsilon/2} b^{-1} (b_1 b_2 b_3)^{-1/2} ( M^{5/6} Q^{7/6} H^{2/3} x^{\delta/6} + M^{7/6} H^{5/6} Q^{5/6} x^{\delta/12} + M H^{1/2} Q^{5/4} ).
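The powers of {b}, {b_1 b_2 b_3} and {x^\epsilon} in this last display can be verified term by term. The sketch below is our own check, writing {B := b_1 b_2 b_3} and assuming (as the text indicates) that {B \geq b \geq 1}. It computes what is left of each term after dividing out {x^{5\epsilon/2} b^{-1} B^{-1/2}} and confirms that the leftover factor is at most {1}.

```python
from fractions import Fraction as F

# After substituting H_b = x^{3 eps} H / B, Q_b = Q / b, y = x^{delta} b,
# each term is determined by its (power of Q_b, power of H_b, power of y).
term_powers = [
    (F(7, 6), F(2, 3), F(1, 6)),
    (F(5, 6), F(5, 6), F(1, 12)),
    (F(5, 4), F(1, 2), F(0)),
]

def residual(q, h, yy):
    """Exponents of (x^eps, b, B) left after dividing by x^{5 eps/2} b^{-1} B^{-1/2}."""
    eps = 3 * h - F(5, 2)     # x^{3 eps h} from H_b^h
    beta = (-q + yy) + 1      # b^{-q} from Q_b^q, b^{yy} from y^{yy}
    gamma = -h + F(1, 2)      # B^{-h} from H_b^h
    return eps, beta, gamma

residuals = [residual(*t) for t in term_powers]

# With B >= b >= 1: b^beta * B^gamma <= 1 whenever gamma <= 0 and beta + gamma <= 0.
ok = all(e <= 0 and g <= 0 and beta + g <= 0 for e, beta, g in residuals)
assert ok
```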

From Euler products we see that

\displaystyle  \sum_{\vec b} \frac{\hbox{rad}(b_1) \hbox{rad}(b_2) \hbox{rad}(b_3)}{b^2} b^{-1} (b_1 b_2 b_3)^{-1/2} \ll 1

and so to prove (39) it will suffice to show that

\displaystyle  M^{5/6} Q^{7/6} H^{2/3} x^{\delta/6} + M^{7/6} H^{5/6} Q^{5/6} x^{\delta/12} + M H^{1/2} Q^{5/4} \lessapprox x^{-5\epsilon} M Q^2.

We can rewrite these conditions as upper bounds on {H}:

\displaystyle  H \lessapprox x^{-\delta/4 - 15 \epsilon/2} M^{1/4} Q^{5/4}

\displaystyle  H \lessapprox x^{-\delta/10 - 6 \epsilon} M^{-1/5} Q^{7/5}

\displaystyle  H \lessapprox x^{-10\epsilon} Q^{3/2}.

As {H = Q^3 / N} and {MN \sim x}, we can rewrite these conditions as upper bounds on {Q}:

\displaystyle  Q^{7/4} \lessapprox x^{1/4-\delta/4 - 15 \epsilon/2} N^{3/4}

\displaystyle  Q^{8/5} \lessapprox x^{-1/5-\delta/10 - 6 \epsilon} N^{6/5}

\displaystyle  Q^{3/2} \lessapprox x^{-10\epsilon} N.
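The passage from the three-term inequality to bounds on {H}, and then on {Q}, is pure exponent arithmetic, which the following sketch reproduces. It is an illustrative check only: monomials are stored as dictionaries whose keys `d`, `e`, `c` are the coefficients of {\delta}, {\epsilon} and {1} in the exponent of {x}, and the relation {MN \sim x} is treated as {M = x/N}, ignoring constants.

```python
from fractions import Fraction as F

KEYS = ("d", "e", "c", "M", "Q", "N")

def mono(**kw):
    """Monomial x^{d*delta + e*eps + c} M^M Q^Q N^N as a dict of exponents."""
    return {k: F(kw.get(k, 0)) for k in KEYS}

# The three left-hand terms, each paired with the power of H it carries.
terms = [
    (mono(d=F(1, 6), M=F(5, 6), Q=F(7, 6)), F(2, 3)),
    (mono(d=F(1, 12), M=F(7, 6), Q=F(5, 6)), F(5, 6)),
    (mono(M=1, Q=F(5, 4)), F(1, 2)),
]
target = mono(e=-5, M=1, Q=2)   # x^{-5 eps} M Q^2

def h_bound(term, h):
    """Monomial hb such that term * H^h <= target is equivalent to H <= hb."""
    return {k: (target[k] - term[k]) / h for k in KEYS}

def q_bound(hb):
    """Substitute H = Q^3/N and M = x/N into H <= hb.
    Returns (p, rhs) with the condition rewritten as Q^p <= rhs."""
    rhs = dict(hb)
    m = rhs["M"]
    rhs["c"] += m            # M^m contributes x^m ...
    rhs["N"] += -m + 1       # ... and N^{-m}; the 1/N from H moves to the right
    rhs["M"] = F(0)
    p = 3 - rhs["Q"]         # Q^3 on the left, divided by the power of Q on the right
    rhs["Q"] = F(0)
    return p, rhs

H_bounds = [h_bound(t, h) for t, h in terms]
Q_bounds = [q_bound(hb) for hb in H_bounds]

assert [p for p, _ in Q_bounds] == [F(7, 4), F(8, 5), F(3, 2)]
assert [r["N"] for _, r in Q_bounds] == [F(3, 4), F(6, 5), F(1)]
```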

Since {N \gtrapprox x^{\frac{3}{2} (\frac{1}{2}+\sigma)}} and {Q \lessapprox x^{1/2+2\varpi}}, these conditions will hold (taking {\epsilon} sufficiently small) provided that

\displaystyle  \frac{7}{4} (\frac{1}{2}+2\varpi) < \frac{1}{4} - \frac{\delta}{4} + \frac{3}{4} \frac{3}{2} (\frac{1}{2} + \sigma)

\displaystyle  \frac{8}{5} (\frac{1}{2}+2\varpi) < -\frac{1}{5} - \frac{\delta}{10} + \frac{6}{5} \frac{3}{2} (\frac{1}{2} + \sigma)

\displaystyle  \frac{3}{2} (\frac{1}{2}+2\varpi) < \frac{3}{2} (\frac{1}{2} + \sigma)

which we may rearrange as

\displaystyle  \sigma > \frac{1}{18} + \frac{28}{9} \varpi + \frac{2}{9} \delta

\displaystyle  \sigma > \frac{1}{18} + \frac{16}{9} \varpi + \frac{1}{18} \delta

\displaystyle  \sigma > 2 \varpi

but these follow from (12). The proof of Theorem 6 is complete.
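As a sanity check on the final rearrangement, the three lower bounds on {\sigma} can be recomputed from the exponent conditions with exact rational arithmetic. The sketch below is ours and only verifies the arithmetic; it drops the {\epsilon} terms exactly as the text does, and the function name `sigma_threshold` is illustrative.

```python
from fractions import Fraction as F

# Each condition has the shape
#   qp * (1/2 + 2*varpi) < c + d*delta + n * (3/2) * (1/2 + sigma).
conditions = [
    # (qp,      c,         d,          n)
    (F(7, 4), F(1, 4), F(-1, 4), F(3, 4)),
    (F(8, 5), F(-1, 5), F(-1, 10), F(6, 5)),
    (F(3, 2), F(0), F(0), F(1)),
]

def sigma_threshold(qp, c, d, n):
    """Return (a0, a_varpi, a_delta) such that the condition is equivalent to
    sigma > a0 + a_varpi * varpi + a_delta * delta."""
    s = n * F(3, 2)                  # coefficient of sigma on the right-hand side
    a0 = (qp / 2 - c - s / 2) / s    # constant terms moved to the left
    a_varpi = 2 * qp / s
    a_delta = -d / s
    return a0, a_varpi, a_delta

thresholds = [sigma_threshold(*cond) for cond in conditions]

assert thresholds[0] == (F(1, 18), F(28, 9), F(2, 9))
assert thresholds[1] == (F(1, 18), F(16, 9), F(1, 18))
assert thresholds[2] == (F(0), F(2), F(0))
```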