Estimation of the Type I and Type II sums

12 June, 2013 in math.NT, polymath | Tags: Bombieri-Vinogradov theorem, completion of sums, dispersion method, Kloosterman sums, polymath8, Yitang Zhang | by Terence Tao

This is one of the continuations of the online reading seminar of Zhang’s paper for the polymath8 project. (There are two other continuations: this previous post, which deals with the combinatorial aspects of the second part of Zhang’s paper, and a post to come that covers the Type III sums.) The main purpose of this post is to present (and hopefully, to improve upon) the treatment of two of the three key estimates in Zhang’s paper, namely the Type I and Type II estimates.

The main estimate was already stated as Theorem 16 in the previous post, but we quickly recall the relevant definitions here. As in other posts, we always take ${x}$ to be a parameter going off to infinity, with the usual asymptotic notation ${O(), o(), \ll}$ associated to this parameter.

Definition 1 (Coefficient sequences) A coefficient sequence is a finitely supported sequence ${\alpha: {\bf N} \rightarrow {\bf R}}$ that obeys the bounds

$\displaystyle |\alpha(n)| \ll \tau^{O(1)}(n) \log^{O(1)}(x) \ \ \ \ \ (1)$

for all ${n}$ , where ${\tau}$ is the divisor function.

(i) If ${\alpha}$ is a coefficient sequence and ${a\ (q) = a \hbox{ mod } q}$ is a primitive residue class, the (signed) discrepancy ${\Delta(\alpha; a\ (q))}$ of ${\alpha}$ in the sequence is defined to be the quantity
$\displaystyle \Delta(\alpha; a \ (q)) := \sum_{n: n = a\ (q)} \alpha(n) - \frac{1}{\phi(q)} \sum_{n: (n,q)=1} \alpha(n). \ \ \ \ \ (2)$

(ii) A coefficient sequence ${\alpha}$ is said to be at scale ${N}$ for some ${N \geq 1}$ if it is supported on an interval of the form ${[(1-O(\log^{-A_0} x)) N, (1+O(\log^{-A_0} x)) N]}$ .

(iii) A coefficient sequence ${\alpha}$ at scale ${N}$ is said to obey the Siegel-Walfisz theorem if one has
$\displaystyle | \Delta(\alpha 1_{(\cdot,q)=1}; a\ (r)) | \ll \tau(qr)^{O(1)} N \log^{-A} x \ \ \ \ \ (3)$

for any ${q,r \geq 1}$ , any fixed ${A}$ , and any primitive residue class ${a\ (r)}$ .

(iv) A coefficient sequence ${\alpha}$ at scale ${N}$ is said to be smooth if it takes the form ${\alpha(n) = \psi(n/N)}$ for some smooth function ${\psi: {\bf R} \rightarrow {\bf C}}$ supported on ${[1-O(\log^{-A_0} x), 1+O(\log^{-A_0} x)]}$ obeying the derivative bounds
$\displaystyle \psi^{(j)}(t) = O( \log^{j A_0} x ) \ \ \ \ \ (4)$

for all fixed ${j \geq 0}$ (note that the implied constant in the ${O()}$ notation may depend on ${j}$ ).
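To make the discrepancy (2) concrete, here is a minimal Python sketch that evaluates it by brute force for a finitely supported sequence stored as a dictionary. The names `euler_phi` and `discrepancy` and the toy sequence are illustrative assumptions, not part of the argument.

```python
from math import gcd

def euler_phi(q):
    """Euler's totient function, by direct count (fine for small q)."""
    return sum(1 for n in range(1, q + 1) if gcd(n, q) == 1)

def discrepancy(alpha, a, q):
    """Signed discrepancy (2) of a finitely supported sequence.

    alpha: dict mapping natural numbers n to the values alpha(n).
    a, q:  a primitive residue class a (q), i.e. gcd(a, q) == 1.
    """
    if gcd(a, q) != 1:
        raise ValueError("a (q) must be a primitive residue class")
    in_class = sum(v for n, v in alpha.items() if (n - a) % q == 0)
    coprime = sum(v for n, v in alpha.items() if gcd(n, q) == 1)
    return in_class - coprime / euler_phi(q)

# Toy sequence (an assumption of this sketch): the indicator of [1000, 1100].
alpha = {n: 1.0 for n in range(1000, 1101)}

# Summing the discrepancy over all primitive classes mod q gives zero,
# directly from the definition.
total = sum(discrepancy(alpha, a, 7) for a in range(1, 7))
```

The vanishing of `total` reflects the fact that the second term in (2) is exactly the average of the first term over the primitive residue classes.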

In Lemma 8 of this previous post we established a collection of “crude estimates” which assert, roughly speaking, that for the purposes of averaged estimates one may ignore the ${\tau^{O(1)}(n)}$ factor in (1) and pretend that ${\alpha(n)}$ was in fact ${O( \log^{O(1)} n)}$ . We shall rely frequently on these “crude estimates” without further citation to that precise lemma.

For any ${I \subset {\bf R}}$ , let ${{\mathcal S}_I}$ denote the square-free numbers whose prime factors lie in ${I}$ .

Definition 2 (Singleton congruence class system) Let ${I \subset {\bf R}}$ . A singleton congruence class system on ${I}$ is a collection ${{\mathcal C} = (\{a_q\})_{q \in {\mathcal S}_I}}$ of primitive residue classes ${a_q \in ({\bf Z}/q{\bf Z})^\times}$ for each ${q \in {\mathcal S}_I}$ , obeying the Chinese remainder theorem property

$\displaystyle a_{qr}\ (qr) = (a_q\ (q)) \cap (a_r\ (r)) \ \ \ \ \ (5)$

whenever ${q,r \in {\mathcal S}_I}$ are coprime. We say that such a system ${{\mathcal C}}$ has controlled multiplicity if the quantity

$\displaystyle \tau_{\mathcal C}(n) := |\{ q \in {\mathcal S}_I: n = a_q\ (q) \}|$

obeys the estimate

$\displaystyle \sum_{C^{-1} x \leq n \leq Cx: n = a\ (r)} \tau_{\mathcal C}(n)^2 \ll \frac{x}{r} \tau(r)^{O(1)} \log^{O(1)} x + x^{o(1)}. \ \ \ \ \ (6)$

for any fixed ${C>1}$ and any congruence class ${a\ (r)}$ with ${r \in {\mathcal S}_I}$ .

The main result of this post is then the following:

Theorem 3 (Type I/II estimate) Let ${\varpi, \delta, \sigma > 0}$ be fixed quantities such that

$\displaystyle 11 \varpi + 3\delta + 2 \sigma < \frac{1}{4} \ \ \ \ \ (7)$

and

$\displaystyle 29\varpi + 5 \delta < \frac{1}{4} \ \ \ \ \ (8)$

and let ${\alpha,\beta}$ be coefficient sequences at scales ${M,N}$ respectively with

$\displaystyle x \ll MN \ll x \ \ \ \ \ (9)$

and

$\displaystyle x^{\frac{1}{2}-\sigma} \ll N \ll M \ll x^{\frac{1}{2}+\sigma} \ \ \ \ \ (10)$

with ${\beta}$ obeying a Siegel-Walfisz theorem. Then for any ${I \subset [1,x^\delta]}$ and any singleton congruence class system ${(\{a_q\})_{q \in {\mathcal S}_I}}$ with controlled multiplicity we have

$\displaystyle \sum_{q \in {\mathcal S}_I: q< x^{1/2+2\varpi}} |\Delta(\alpha \ast \beta; a_q)| \ll x \log^{-A} x.$
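For readers who want to experiment with the numerology, the following small Python helper (the name `type_I_II_conditions` is hypothetical) tests whether a given triple ${(\varpi,\delta,\sigma)}$ satisfies the hypotheses (7) and (8), using exact rational arithmetic to avoid rounding issues at the boundary.

```python
from fractions import Fraction

def type_I_II_conditions(varpi, delta, sigma):
    """Return True if varpi, delta, sigma > 0 satisfy both (7) and (8)."""
    varpi, delta, sigma = (Fraction(t) for t in (varpi, delta, sigma))
    quarter = Fraction(1, 4)
    positive = varpi > 0 and delta > 0 and sigma > 0
    cond7 = 11 * varpi + 3 * delta + 2 * sigma < quarter   # condition (7)
    cond8 = 29 * varpi + 5 * delta < quarter               # condition (8)
    return positive and cond7 and cond8

ok = type_I_II_conditions(Fraction(1, 200), Fraction(1, 1000), Fraction(1, 20))
too_big = type_I_II_conditions(Fraction(1, 100), Fraction(1, 1000), Fraction(1, 100))
```

In this sketch the first triple is admissible, while the second fails (8), since ${29 \varpi}$ alone already exceeds ${1/4}$ there.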

The proof of this theorem relies on five basic tools:

(i) the Bombieri-Vinogradov theorem;
(ii) completion of sums;
(iii) the Weil conjectures;
(iv) factorisation of smooth moduli ${q \in {\mathcal S}_I}$ ; and
(v) the Cauchy-Schwarz and triangle inequalities (Weyl differencing and the dispersion method).

For the purposes of numerics, it is the interplay between (ii), (iii), and (v) that drives the final conditions (7), (8). The Weil conjectures are the primary source of power savings ( ${x^{-c}}$ for some fixed ${c>0}$ ) in the argument, but they need to overcome power losses coming from completion of sums, and also each use of Cauchy-Schwarz tends to halve any power savings present in one’s estimates. Naively, one could thus expect to get better estimates by relying more on the Weil conjectures, and less on completion of sums and on Cauchy-Schwarz.

— 1. The Bombieri-Vinogradov theorem —

One of the basic distribution results in this area of analytic number theory is the Bombieri-Vinogradov theorem. As first observed by Motohashi, this theorem in fact controls the distribution of a general class of Dirichlet convolutions in congruence classes. We will use the following formulation of this theorem, essentially Theorem 0 of Bombieri-Friedlander-Iwaniec:

Theorem 4 (Bombieri-Vinogradov theorem) Let ${M,N}$ be such that

$\displaystyle x \ll MN \ll x$

and

$\displaystyle x^\epsilon \ll M,N \ll x^{1-\epsilon}$

for some fixed ${0 < \epsilon < 1}$ . Let ${\alpha,\beta}$ be coefficient sequences at scale ${M,N}$ respectively. Suppose also that ${\beta}$ obeys a Siegel-Walfisz theorem.

(i) (First Bombieri-Vinogradov inequality) We have
$\displaystyle \sum_{q \leq x^{-o(1)} N} \sum_{a \in ({\bf Z}/q{\bf Z})^\times} |\Delta(\beta; a)|^2 \ll N^2 \log^{-A} x$

for some sufficiently slowly decaying ${o(1)}$ and any fixed ${A>0}$ .

(ii) (Second Bombieri-Vinogradov inequality) We have
$\displaystyle \sum_{q \leq x^{1/2-o(1)}} \sup_{a \in ({\bf Z}/q{\bf Z})^\times} |\Delta(\alpha*\beta; a)| \ll x \log^{-A} x$

for some sufficiently slowly decaying ${o(1)}$ and any fixed ${A>0}$ .

For the sake of completeness we now recall the proof of this theorem, following the presentation in Bombieri-Friedlander-Iwaniec. (This is standard material, and experts may immediately skip to the next section.) We first need the large sieve inequality for Dirichlet characters:

Lemma 5 (Large sieve inequality) For any sequence ${\alpha: {\bf N} \rightarrow {\bf C}}$ supported on ${\{1,\ldots,N\}}$ and any ${Q > 1}$ one has

$\displaystyle \sum_{q \leq Q} \sum_{\chi\ (q)}^* \frac{q}{\phi(q)} |\sum_n \alpha(n) \chi(n)|^2 \ll (Q^2+N) \sum_n |\alpha(n)|^2$

where the ${\chi}$ summation is over primitive Dirichlet characters of conductor ${q}$ .

Proof: By enlarging ${N}$ we may assume that ${N \geq Q^2}$ .

We use the ${TT^*}$ method. By duality, the desired estimate is equivalent to

$\displaystyle |\sum_{q \leq Q} \sum_{\chi\ (q)}^* \frac{q}{\phi(q)} c_\chi \sum_{n \leq N} \alpha(n) \chi(n)|$

$\displaystyle \ll N^{1/2} (\sum_{n \leq N} |\alpha(n)|^2)^{1/2} (\sum_{q \leq Q} \sum_{\chi\ (q)}^* \frac{q}{\phi(q)} |c_\chi|^2)^{1/2}$

which is in turn equivalent to

$\displaystyle \sum_n 1_{n \leq N} |\sum_{q \leq Q} \sum_{\chi\ (q)}^* \frac{q}{\phi(q)} c_\chi \chi(n)|^2 \ll N \sum_{q \leq Q} \sum_{\chi\ (q)}^* \frac{q}{\phi(q)} |c_\chi|^2.$

We bound ${1_{n \leq N}}$ by a Schwartz function ${\psi(n/N)}$ whose Fourier transform is supported on ${[-1/2Q^2,1/2Q^2]}$ ; this is possible for a fixed ${\psi}$ when ${N \geq Q^2}$ . The left-hand side then expands as

$\displaystyle \sum_{q,q' \leq Q} \sum_{\chi\ (q)}^* \sum_{\chi'\ (q')}^* \frac{q}{\phi(q)} c_\chi \frac{q'}{\phi(q')} \overline{c_{\chi'}} (\sum_n \psi(n/N) \chi\overline{\chi'}(n)).$

The inner sum is ${O(N \frac{\phi(q)}{q})}$ when ${\chi=\chi'}$ , and zero otherwise thanks to Fourier analysis (specifically, the Poisson summation formula combined with the support of the Fourier transform of ${\psi}$ and the fact that ${\chi\overline{\chi'}}$ is periodic of period ${qq'}$ with mean zero). The claim follows. $\Box$

Now we prove part (i) of Theorem 4. By an overspill argument (cf. Lemma 7 of this previous post) it suffices to show that

$\displaystyle \sum_{q \leq x^{-\epsilon} N} \sum_{a \in ({\bf Z}/q{\bf Z})^\times} |\Delta(\beta; a)|^2 \ll N^2 \log^{-A} x \ \ \ \ \ (11)$

for any fixed ${A,\epsilon > 0}$ .

Let ${Q := x^{-\epsilon} N}$ . By multiplicative Fourier analysis we may write the left-hand side of (11) as

$\displaystyle \sum_{q \leq Q} \frac{1}{\phi(q)} \sum_{\chi \neq \chi_0\ (q)} |\sum_n \beta(n) \chi(n)|^2$

where the inner sum ranges over non-principal Dirichlet characters ${\chi}$ of modulus ${q}$ , not necessarily primitive. Any such character can be written as ${\chi(n) = \psi(n) 1_{(n,e)=1}}$ where ${q = de}$ with ${d > 1}$ , and ${\psi}$ is a primitive Dirichlet character of conductor ${d}$ . The above sum can then be rewritten as

$\displaystyle \sum_{e \leq Q} \sum_{1 < d < Q/e} \frac{1}{\phi(de)} \sum_{\psi\ (d)}^* |\sum_n \beta(n) \psi(n) 1_{(n,e)=1}|^2$

where the ${\psi}$ summation ranges over primitive characters modulo ${d}$ . Since ${\frac{1}{\phi(de)} \leq \frac{1}{\phi(d)} \frac{1}{\phi(e)}}$ , we may bound this by

$\displaystyle \sum_{e \leq Q} \frac{1}{\phi(e)} \sum_{1 < d < Q/e} \frac{1}{\phi(d)} \sum_{\psi\ (d)}^* |\sum_n \beta(n) \psi(n) 1_{(n,e)=1}|^2,$

which on performing the ${e}$ summation is bounded by

$\displaystyle \ll (\log x)^{O(1)} \sup_{e \leq Q} \sum_{1 < d < Q/e} \frac{1}{\phi(d)} \sum_{\psi\ (d)}^* |\sum_n \beta(n) \psi(n) 1_{(n,e)=1}|^2,$

and then by dyadic decomposition this is bounded by

$\displaystyle \ll (\log x)^{O(1)} \sup_{e, D \leq Q} \sum_{D < d < 2D} \frac{1}{\phi(d)} \sum_{\psi\ (d)}^* |\sum_n \beta(n) \psi(n) 1_{(n,e)=1}|^2.$

Let us first consider the contribution of the small moduli, specifically when ${D \leq \log^{C} x}$ for some fixed ${C>0}$ . From the Siegel-Walfisz theorem (3) we have

$\displaystyle \sum_n \beta(n) \psi(n) 1_{(n,e)=1} \ll \tau(de)^{O(1)} N \log^{-A' + O(C)} x$

for any fixed ${A'>0}$ , and then by crude estimates (see Lemma 8 of this previous post) we see that this contribution is acceptable for (11). Thus we may restrict attention to those moduli with ${D > \log^{C} x}$ for any fixed ${C}$ . On the other hand, from Lemma 5 we have

$\displaystyle \sum_{D \leq d \leq 2D} \frac{1}{\phi(d)} \sum_{\psi\ (d)}^* |\sum_n \beta(n) \psi(n) 1_{(n,e)=1}|^2 \ll \frac{N+D^2}{D} \sum_n |\beta(n) 1_{(n,e)=1}|^2$

for any ${\log^C x \ll D \ll Q}$ . From crude estimates we have

$\displaystyle \sum_n |\beta(n) 1_{(n,e)=1}|^2 \ll N \log^{O(1)} x.$

Since

$\displaystyle \frac{N+D^2}{D} = N/D + D \ll N \log^{-C} x + N x^{-\epsilon}$

the claim follows.

Now we prove (ii). We now set ${Q := x^{1/2-\epsilon}}$ for some fixed ${\epsilon>0}$ . By overspill as before, it suffices to show that

$\displaystyle \sum_{q \leq Q} \sup_{a \in ({\bf Z}/q{\bf Z})^\times} |\Delta(\alpha \ast \beta; a)| \ll x \log^{-A} x. \ \ \ \ \ (12)$

By multiplicative Fourier analysis we have

$\displaystyle \Delta(\alpha \ast \beta; a) = \frac{1}{\phi(q)} \sum_{\chi \neq \chi_0\ (q)} \overline{\chi(a)} \sum_n \alpha \ast \beta(n) \chi(n)$

$\displaystyle = \frac{1}{\phi(q)} \sum_{\chi \neq \chi_0\ (q)} \overline{\chi(a)} (\sum_m \alpha \chi(m)) (\sum_n \beta \chi(n))$

so by splitting into primitive characters as before we may bound the left-hand side of (12) by

$\displaystyle \ll (\log x)^{O(1)} \sup_{e,D \leq Q} \sum_{D < d < 2D} \frac{1}{\phi(d)} \sum_{\psi\ (d)}^*$

$\displaystyle |\sum_m \alpha \psi(m) 1_{(m,e)=1}| |\sum_n \beta \psi(n) 1_{(n,e)=1}|$

Arguing as before, the case ${D \leq \log^C x}$ is acceptable, so we assume ${\log^C x \leq D \leq x^{1/2-\epsilon}}$ . From crude estimates we have

$\displaystyle \sum_m |\alpha(m) 1_{(m,e)=1}|^2 \ll M \log^{O(1)} x.$

and

$\displaystyle \sum_n |\beta(n) 1_{(n,e)=1}|^2 \ll N \log^{O(1)} x.$

From Lemma 5 and Cauchy-Schwarz we can thus bound

$\displaystyle \sum_{D < d < 2D} \frac{1}{\phi(d)} \sum_{\psi\ (d)}^* |\sum_m \alpha \psi(m) 1_{(m,e)=1}| |\sum_n \beta \psi(n) 1_{(n,e)=1}|$

$\displaystyle \ll \frac{\log^{O(1)} x}{D} N^{1/2} M^{1/2} (N+D^2)^{1/2} (M+D^2)^{1/2};$

as ${NM}$ is comparable to ${x}$ , this simplifies to

$\displaystyle \ll x \log^{O(1)} x ( \frac{1}{D} + N^{-1/2} + M^{-1/2} + x^{-1/2} D ),$

which is acceptable from the hypotheses on ${D, Q, N, M}$ if ${C}$ is chosen large enough depending on ${A}$ . This concludes the proof of Theorem 4.

We remark that an inspection of the proof reveals that the ${x^{-o(1)}}$ or ${x^{-\epsilon}}$ factors in the threshold ${Q}$ for ${q}$ may be replaced by ${\log^{-A'} x}$ provided that ${A'}$ is sufficiently large depending on ${A}$ . However, this refinement of the Bombieri-Vinogradov inequality does not lead to any improvement in the final numerology for our purposes.

— 2. Completion of sums —

At several stages in the argument we will need to consider sums of the form

$\displaystyle \sum_m \psi_M(m) \sum_{i \in I: m = a_i\ (d)} c_i$

for some smooth coefficient sequence ${\psi_M}$ , some congruence classes ${a_i\ (d)}$ depending on a further parameter ${i}$ , and further coefficients ${c_i}$ . The completion of sums technique replaces the ${\psi_M(m)}$ term here with an exponential phase, leaving one with consideration of exponential sums such as

$\displaystyle \sum_{i \in I} c_i e_d(-a_i h)$

for various ${h}$ , where ${e_d(n) := e^{2\pi i n/d}}$ for ${n \in {\bf Z}/d{\bf Z}}$ (or ${n \in {\bf Z}}$ ) is the standard character on ${{\bf Z}/d{\bf Z}}$ . More precisely, we have

Lemma 6 (Completion of sums) Let ${\psi_M}$ be a smooth coefficient sequence at some scale ${1 \ll M \ll x}$ . Let ${I}$ be a finite set of indices, and for each ${i \in I}$ let ${c_i}$ be a complex number and ${a_i\ (d)}$ be a congruence class for some ${d \ll x}$ . Then we have

$\displaystyle \sum_m \psi_M(m) \sum_{i \in I: m = a_i\ (d)} c_i = \frac{1}{d} (\sum_m \psi_M(m)) (\sum_i c_i) \ \ \ \ \ (13)$

$\displaystyle + O( (\log^{O(1)} x) \frac{M}{d} \sum_{1 \leq h \leq x^\epsilon d M^{-1}} |\sum_i c_i e_d( a_i h )| )$

$\displaystyle + O( x^{-A} \sum_i |c_i| )$

for any fixed ${\epsilon,A>0}$ .

Specialising to the case ${I = {\bf Z}/d{\bf Z}}$ and ${a_i = i}$ , we conclude in particular that

$\displaystyle \sum_m \psi_M(m) c(m) = \frac{1}{d} (\sum_m \psi_M(m)) (\sum_{n \in {\bf Z}/d{\bf Z}} c(n)) \ \ \ \ \ (14)$

$\displaystyle + O( (\log^{O(1)} x) \frac{M}{d} \sum_{1 \leq h \leq x^\epsilon d M^{-1}} |\sum_{n \in {\bf Z}/d{\bf Z}} c(n) e_d( n h )| )$

$\displaystyle + O( x^{-A} \sum_{n \in {\bf Z}/d{\bf Z}} |c(n)| )$

whenever ${c: {\bf Z} \rightarrow {\bf C}}$ is periodic with period ${d}$ .

Proof: We rearrange the left-hand side as

$\displaystyle \sum_{i \in I} c_i \sum_{m} \psi_M(m) 1_{m=a_i\ (d)}.$

Using the Fourier expansion

$\displaystyle 1_{m = a_i\ (d)} = \frac{1}{d} \sum_{h \in {\bf Z}/d{\bf Z}} e_d( a_i h) e_d(-m h)$

and rearranging, the left-hand side then becomes

$\displaystyle \frac{1}{d} \sum_{h \in {\bf Z}/d{\bf Z}} [\sum_m \psi_M(m) e_d(-mh)] \times [\sum_{i \in I} c_i e_d(a_i h)].$

The ${h=0}$ term of this is the first term on the right-hand side of (13). The terms coming from integers ${h}$ with ${1 \leq |h| \leq x^\epsilon d M^{-1}}$ can be bounded by the second term in (13), bounding ${\sum_m \psi_M(m) e_d(mh)}$ crudely by ${O(M \log^{O(1)} x)}$ by crude estimates and also using conjugation symmetry when ${h}$ is negative. So it will suffice to show that

$\displaystyle \sum_m \psi_M(m) e_d(-mh) = O( x^{-A-2} )$

(say) when ${x^\epsilon d M^{-1} \leq |h| \leq d/2}$ . Writing ${\psi_M(m) = \psi(m/M)}$ as in the definition of a smooth coefficient sequence, and applying Poisson summation, the left-hand side is

$\displaystyle M \sum_n \hat \psi( M n + \frac{Mh}{d} )$

where ${\hat \psi(t) := \int_{\bf R} e^{-2\pi i st} \psi(s)\ ds}$ is the Fourier transform of ${\psi}$ . But from the smoothness (4) of ${\psi}$ and integration by parts one has the bounds

$\displaystyle \hat \psi(t) \ll |t|^{-A'} \log^{O(1)} x$

for any fixed ${A'}$ , and from the hypothesis ${x^\epsilon d M^{-1} \leq |h| \leq d/2}$ we obtain the claim by taking ${A'}$ large enough depending on ${\epsilon,A}$ . $\Box$
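The exact Fourier identity underlying this proof (before any frequencies are discarded) can be checked numerically. The Python sketch below compares the original sum with its completed form, summing over all ${h \in {\bf Z}/d{\bf Z}}$ . The cutoff `bump`, the modulus, the classes and the coefficients are all toy choices made for illustration; in particular `bump` is supported on ${(1/2,3/2)}$ rather than on the much narrower window required of a smooth coefficient sequence.

```python
import cmath
import math

def e(d, n):
    """Additive character e_d(n) = exp(2 pi i n / d) for an integer n."""
    return cmath.exp(2j * cmath.pi * (n % d) / d)

def bump(t):
    """Toy smooth cutoff supported on (1/2, 3/2) (an assumption of this sketch)."""
    u = 2.0 * (t - 1.0)
    return math.exp(-1.0 / (1.0 - u * u)) if abs(u) < 1.0 else 0.0

def direct_sum(psi, M, d, classes, coeffs, m_range):
    """sum_m psi(m/M) sum_{i: m = a_i (d)} c_i, evaluated directly."""
    return sum(psi(m / M) * c
               for m in m_range
               for a, c in zip(classes, coeffs)
               if (m - a) % d == 0)

def completed_sum(psi, M, d, classes, coeffs, m_range):
    """(1/d) sum_h [sum_m psi(m/M) e_d(-mh)] [sum_i c_i e_d(a_i h)]."""
    total = 0
    for h in range(d):
        smooth_part = sum(psi(m / M) * e(d, -m * h) for m in m_range)
        arithmetic_part = sum(c * e(d, a * h) for a, c in zip(classes, coeffs))
        total += smooth_part * arithmetic_part
    return total / d

M, d = 40, 15
classes, coeffs = [2, 7, 11], [1.0, -2.0, 0.5]
m_range = range(1, 2 * M)          # covers the support of m -> bump(m / M)

direct = direct_sum(bump, M, d, classes, coeffs, m_range)
completed = completed_sum(bump, M, d, classes, coeffs, m_range)
```

The two quantities agree up to rounding error. The content of Lemma 6 is that, thanks to the rapid decay of ${\hat \psi}$ , only the frequencies ${|h| \leq x^\epsilon d M^{-1}}$ contribute non-negligibly to `completed_sum`.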

We remark that in the absence of cancellation in the exponential sum ${\sum_i c_i e_d(-a_i h)}$ , the first error term in (13) could be as large as

$\displaystyle O( x^{\epsilon+o(1)} \sum_i |c_i| )$

which is about ${x^{\epsilon+o(1)} \frac{d}{M}}$ times as large as the main term in (13). In practice we will apply this lemma with ${d > x^c M}$ for some fixed ${c>0}$ , in which case completion of sums will cost a factor of ${x^c}$ or so in the bounds. However, it is still often desirable to pay this cost in order to exploit cancellation bounds for exponential sums, in particular those coming from the Weil conjectures as described below.

In our applications, the modulus ${d}$ will split into a product of two factors ${q_1 q_2}$ or three factors ${q_1 q_2 q_3}$ . The following simple lemma lets us then split exponential phases of the form ${e_d(-ah)}$ :

Lemma 7 Suppose that ${d=q_1 q_2}$ for some coprime ${q_1, q_2}$ , and let ${a\ (d)}$ be the intersection of the congruence classes ${b_1\ (q_1)}$ and ${b_2\ (q_2)}$ . Then for any integer ${h}$ ,

$\displaystyle e_d( a h ) = e_{q_1}( \frac{b_1 h}{q_2} ) e_{q_2}( \frac{b_2 h}{q_1} ).$

Similarly, if ${d=q_1q_2q_3}$ for coprime ${q_1,q_2,q_3}$ and ${a\ (d)}$ is the intersection of ${b_1\ (q_1)}$ , ${b_2\ (q_2)}$ , and ${b_3\ (q_3)}$ , then

$\displaystyle e_d( a h ) = e_{q_1}( \frac{b_1 h}{q_2 q_3} ) e_{q_2}( \frac{b_2 h}{q_1 q_3} ) e_{q_3}( \frac{b_3 h}{q_1 q_2} ).$

Here and in the sequel we are using the convention that ${e_q( \frac{a}{b} )}$ means ${e_q(a \overline{b} )}$ for ${b}$ coprime to ${q}$ , where ${\overline{b}}$ is the reciprocal of ${b}$ modulo ${q}$ .

Proof: We just prove the first identity, as the second is similar. Let ${\bar{q_1}}$ be an integer such that ${q_1 \bar{q_1} = 1 \ (q_2)}$ , and similarly let ${\bar{q_2}}$ be an integer such that ${q_2 \bar{q_2} = 1\ (q_1)}$ . Then ${b_1 q_2 \bar{q_2} + b_2 q_1 \bar{q_1}}$ is equal to ${b_1}$ mod ${q_1}$ and ${b_2}$ mod ${q_2}$ , and thus equal to ${a}$ mod ${q_1 q_2}$ . Thus

$\displaystyle e_d( ah) = e_d( b_1 hq_2 \bar{q_2} + b_2 h q_1 \bar{q_1} )$

and the claim follows by factoring the exponential. $\Box$
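As a sanity check, the first identity of Lemma 7 can be verified numerically. The sketch below (with hypothetical helper names `crt` and `split_phase`, and toy moduli) builds the class ${a\ (d)}$ exactly as in the proof and compares both sides. It relies on Python's three-argument `pow` with exponent `-1` to compute modular inverses, which requires Python 3.8 or later.

```python
import cmath

def e(q, n):
    """Additive character e_q(n) = exp(2 pi i n / q) for an integer n."""
    return cmath.exp(2j * cmath.pi * (n % q) / q)

def crt(b1, q1, b2, q2):
    """The class a (q1 q2) equal to b1 mod q1 and b2 mod q2 (q1, q2 coprime)."""
    q2_bar = pow(q2, -1, q1)       # inverse of q2 modulo q1
    q1_bar = pow(q1, -1, q2)       # inverse of q1 modulo q2
    return (b1 * q2 * q2_bar + b2 * q1 * q1_bar) % (q1 * q2)

def split_phase(b1, q1, b2, q2, h):
    """Right-hand side of Lemma 7: e_{q1}(b1 h / q2) e_{q2}(b2 h / q1)."""
    return (e(q1, b1 * h * pow(q2, -1, q1))
            * e(q2, b2 * h * pow(q1, -1, q2)))

q1, q2 = 35, 33                    # toy coprime moduli
max_error = max(
    abs(e(q1 * q2, crt(b1, q1, b2, q2) * h) - split_phase(b1, q1, b2, q2, h))
    for b1 in range(q1)
    for b2 in range(q2)
    for h in (-3, 1, 2, 17)
)
```

Here `max_error` is at the level of floating point rounding, as the lemma predicts.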

— 3. The Weil conjectures —

For the purposes of this post, the Weil conjectures (as proven in full generality by Deligne) can be viewed as a black box device to obtain “square root cancellation” for various exponential sums of the form

$\displaystyle \sum_{n \in G} \xi( P(n) )$

where ${G}$ and ${H}$ are finite abelian groups (i.e. finite products of cyclic groups), ${\xi: H \rightarrow S^1}$ is an additive character, and ${P: G \rightarrow H}$ is a rational function, basically by obtaining the analogue of the Riemann Hypothesis for a certain zeta function associated to an algebraic variety related to the function ${P}$ . (This is by no means the full strength of the Weil conjectures; amongst other things, one can also twist such sums by multiplicative characters, and also work with more complicated schemes than classical algebraic varieties, though the exponential sum estimates are more difficult to state succinctly as a consequence.) A basic instance of this is Weil’s classical bound on the Kloosterman sum

$\displaystyle S(a,b;m) := \sum_{n \in ({\bf Z}/m{\bf Z})^\times} e_m( an + \frac{b}{n} ), \ \ \ \ \ (15)$

defined whenever ${a,b,m}$ are integers with ${m}$ positive.

Theorem 8 (Weil bound) For any ${a,b,m}$ one has

$\displaystyle |S(a,b;m)| \leq \tau(m) (a,b,m)^{1/2} m^{1/2}$

where ${(a,b,m)}$ is the greatest common divisor of ${a,b,m}$ .

Proof: (Sketch only; see e.g. Iwaniec-Kowalski for a full proof.) For simplicity we only address the case of square-free ${m}$ , which is the case needed in our application. By the Chinese remainder theorem we may reduce to the case when ${m}$ is a prime ${q}$ , and then we may also reduce to the case when ${a,b}$ are coprime to ${q}$ . It then suffices to show that

$\displaystyle |\sum_{(x,y) \in C({\bf F}_q)} \psi(ax+by)| \leq 2 q^{1/2} \ \ \ \ \ (16)$

whenever ${\psi: {\bf F}_q \rightarrow S^1}$ is an additive character and ${C({\bf F}_q)}$ is the hyperbola

$\displaystyle C({\bf F}_q) := \{ (x,y) \in {\bf F}_q \times {\bf F}_q: xy = 1 \}.$

An important trick is then to generalise from ${{\bf F}_q}$ to finite dimensional extensions ${{\bf F}_{q^n}}$ of ${{\bf F}_q}$ , and then consider the Kloosterman sums

$\displaystyle S_n(\psi) := \sum_{(x,y) \in C({\bf F}_{q^n})} \psi( \hbox{Tr}(ax+by) )$

associated to these extensions, where ${\hbox{Tr}: {\bf F}_{q^n} \rightarrow {\bf F}_q}$ is the field trace. For non-principal ${\psi}$ , it is possible to show the explicit formula

$\displaystyle S_n(\psi) = - \alpha_\psi^n - \beta_\psi^n \ \ \ \ \ (17)$

for some complex numbers ${\alpha_\psi,\beta_\psi}$ (depending of course on ${a,b}$ ); this is part of the general theory of ${L}$ -functions associated to algebraic varieties, but can also be established by elementary means (e.g. by establishing a linear recurrence for the ${S_n}$ ). For the principal character ${\psi = \psi_0}$ , of course, one has

$\displaystyle S_n(\psi_0) = q^n - 1.$

Next, one observes the basic identity

$\displaystyle \sum_\psi \psi(\hbox{Tr}(x)) = | \{ z \in {\bf F}_{q^n}: z^q-z = x \}|$

as can be seen by computing the kernel and range of the linear map ${z \mapsto z^q-z}$ on ${{\bf F}_{q^n}}$ (this identity is also related to Hilbert’s Theorem 90). From this we have a combinatorial interpretation of the quantity

$\displaystyle \sum_\psi S_n(\psi),$

namely as the number of points on the curve

$\displaystyle \{ (x,y,z) \in {\bf F}_{q^n}^3: z^q-z = ax+by; xy=1 \}.$

One can show (e.g. using Stepanov’s method, cf. this previous post) that the number of points on this curve is equal to ${q^n + O(q^{n/2+ O(1)})}$ , leading to the identity

$\displaystyle q^n - 1 - \sum_{\psi \neq \psi_0} (\alpha_\psi^n +\beta_\psi^n) = q^n + O(q^{n/2+ O(1)}).$

Studying the asymptotics as ${n \rightarrow \infty}$ , one is led to the conclusion that ${|\alpha_\psi|, |\beta_\psi| \leq \sqrt{q}}$ (this trick to “magically” delete the ${O(q^{O(1)})}$ error is a canonical example of the tensor power trick), and the bound (16) then follows from the ${n=1}$ case of (17). $\Box$

In practice, we shall estimate ${\tau(m)}$ crudely by the divisor bound ${\tau(m) = m^{o(1)}}$ (where ${o(1)}$ denotes a quantity that goes to zero as ${m \rightarrow \infty}$ ), and the factor ${(a,b,m)}$ will also be small in applications, so that we do indeed see the square root savings over the trivial bound ${|S(a,b;m)| \leq m}$ . For the Type I/II sums, the classical Weil bound is sufficient; but for the Type III sums that we will cover in a subsequent post, the full force of Deligne’s results is needed.
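The Weil bound is easy to test by brute force for small moduli. The following Python sketch computes the Kloosterman sum (15) directly and compares it with the right-hand side of Theorem 8 for a handful of square-free ${m}$ . The function names and the choice of test moduli are assumptions of this illustration.

```python
import cmath
from math import gcd, sqrt

def kloosterman(a, b, m):
    """S(a,b;m) = sum over n in (Z/mZ)^x of e_m(a n + b / n)."""
    total = 0
    for n in range(1, m + 1):
        if gcd(n, m) == 1:
            n_bar = pow(n, -1, m)              # reciprocal of n modulo m
            total += cmath.exp(2j * cmath.pi * ((a * n + b * n_bar) % m) / m)
    return total

def num_divisors(m):
    """The divisor function tau(m), by trial division."""
    return sum(1 for k in range(1, m + 1) if m % k == 0)

def weil_bound(a, b, m):
    """Right-hand side of Theorem 8: tau(m) (a,b,m)^{1/2} m^{1/2}."""
    return num_divisors(m) * sqrt(gcd(gcd(a, b), m)) * sqrt(m)

# Largest ratio |S(a,b;m)| / bound over a small box of parameters.
worst = max(
    abs(kloosterman(a, b, m)) / weil_bound(a, b, m)
    for m in (2, 3, 5, 6, 7, 10, 15, 21, 35)
    for a in range(6)
    for b in range(6)
)

example = kloosterman(1, 1, 5)     # equals (3 - sqrt(5)) / 2
```

In this range `worst` does not exceed ${1}$ , in agreement with the theorem. Note also that the sums are real, as can be seen from the change of variables ${n \mapsto -n}$ .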

An important remark is that when ${a=0}$ , we can apply the change of variables ${n \mapsto 1/n}$ and convert the Kloosterman sum ${S(0,b;m)}$ into a Ramanujan sum

$\displaystyle S(0,b;m) = \sum_{n \in ({\bf Z}/m{\bf Z})^\times} e_m(bn)$

which enjoys even better cancellation than square root cancellation; in particular it is not difficult to establish the bound

$\displaystyle |S(0,b;m)| \ll m^{o(1)} (b,m) \ \ \ \ \ (18)$

using the Chinese remainder theorem to reduce to the case when ${m}$ is a power of a prime, and then using the divisor bound.
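One can again check this numerically. The sketch below (with the hypothetical name `ramanujan_sum`) evaluates the Ramanujan sum by brute force and compares its size with ${(b,m)}$ . In the small range tested the sum is in fact bounded by ${(b,m)}$ itself, which is consistent with (18) and with the standard closed formula for Ramanujan sums.

```python
import cmath
from math import gcd

def ramanujan_sum(b, m):
    """S(0,b;m) = sum over n in (Z/mZ)^x of e_m(b n), by brute force."""
    return sum(
        cmath.exp(2j * cmath.pi * ((b * n) % m) / m)
        for n in range(1, m + 1)
        if gcd(n, m) == 1
    )

# Largest value of |S(0,b;m)| / (b,m) over a small range of parameters.
worst = max(
    abs(ramanujan_sum(b, m)) / gcd(b, m)
    for m in range(1, 61)
    for b in range(0, 2 * m)
)
```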

For the Type I and Type II sums we need a more complicated variant of this bound (Lemma 11 of Zhang’s paper):

Lemma 9 Let ${d_1,d_2}$ be square-free natural numbers, let

$\displaystyle d := [d_1,d_2]; \quad d_0 := (d_1,d_2); \quad t_1 := d_1/d_0, \quad t_2 := d_2/d_0,$

and let ${c_1, c_2, l, m}$ be integers. Then the double Kloosterman sum

$\displaystyle K(d_1,c_1; d_2,c_2; l,m) := \sum_{n \in {\bf Z}/d{\bf Z}: (n,d_1) = (n+l,d_2)=1}$

$\displaystyle e_{d_1}( \frac{c_1}{n} ) e_{d_2}( \frac{c_2}{n+l} ) e_d( mn )$

obeys the bound

$\displaystyle |K(d_1,c_1; d_2,c_2; l,m)| \leq d_0 |S(m_1,b_1; t_1)| |S(m_2,b_2; t_2)|$

for some ${m_1,m_2,b_1,b_2}$ with

$\displaystyle (m_i,b_i,t_i) | (m,c_i,d_i)$

for ${i=1,2}$ . In particular, from Theorem 8, we have

$\displaystyle |K(d_1,c_1; d_2,c_2; l,m)| \ll (d_1 d_2)^{o(1)}$

$\displaystyle (m,c_1,d_1)^{1/2} (m,c_2,d_2)^{1/2} d_1^{1/2} d_2^{1/2}$

while in the ${m=0}$ case we have from (18) that

$\displaystyle |K(d_1,c_1; d_2,c_2; l,0)| \ll (d_1 d_2)^{o(1)} d_0 (c_1,d_1) (c_2,d_2).$

Here ${o(1)}$ denotes a quantity that goes to zero as ${d_1,d_2 \rightarrow \infty}$ .

Proof: As ${d_0,t_1,t_2}$ are coprime, we may refactor ${e_{d_1}( \frac{c_1}{n} ) e_{d_2}( \frac{c_2}{n+l} ) e_d( mn )}$ as

$\displaystyle e_{d_0}( \frac{a_1}{n} + \frac{a_2}{n+l} + m_0 n ) e_{t_1}( \frac{b_1}{n} + m_1 n ) e_{t_2}( \frac{b_2}{n+l} + m_2 n )$

for some ${a_1,a_2,m_0 \in {\bf Z}/d_0{\bf Z}}$ , ${b_1,m_1 \in {\bf Z}/t_1{\bf Z}}$ and ${b_2,m_2 \in {\bf Z}/t_2{\bf Z}}$ , with

$\displaystyle m = m_0 t_1 t_2 + m_1 d_0 t_2 + m_2 d_0 t_1\ (d_0 t_1 t_2)$

and

$\displaystyle c_i = a_i t_i + b_i d_0\ (d_i)$

for ${i=1,2}$ , which in particular implies that ${(m_i,b_i,t_i) | (m,c_i,d_i)}$ as claimed. Using the Chinese remainder theorem, we may now factor ${K(d_1,c_1; d_2,c_2; l,m)}$ as the product of the sums

$\displaystyle \sum_{n \in {\bf Z}/d_0{\bf Z}: (n,d_0)=(n+l,d_0)=1} e_{d_0}( \frac{a_1}{n} + \frac{a_2}{n+l} + m_0 n ),$

$\displaystyle \sum_{n \in {\bf Z}/t_1{\bf Z}: (n,t_1) = 1} e_{t_1}( \frac{b_1}{n} + m_1 n )$

and

$\displaystyle \sum_{n \in {\bf Z}/t_2{\bf Z}: (n+l,t_2) = 1} e_{t_2}( \frac{b_2}{n+l} + m_2 n ).$

Bounding the first sum trivially by ${d_0}$ and shifting the third sum by ${l}$ we obtain the claim (note that ${d_0 t_1^{1/2} t_2^{1/2} = d_1^{1/2} d_2^{1/2}}$ ). $\Box$

The treatment of the ${d_0}$ terms in the above analysis is crude, but in applications ${d_0}$ is often trivial anyway, so it is not essential to obtain the sharpest estimates here.
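Lemma 9 can also be tested by brute force. Combining the first bound of the lemma with Theorem 8 and the divisibility ${(m_i,b_i,t_i) | (m,c_i,d_i)}$ gives the explicit inequality ${|K| \leq d_0 \tau(t_1) \tau(t_2) (m,c_1,d_1)^{1/2} (m,c_2,d_2)^{1/2} t_1^{1/2} t_2^{1/2}}$ , which is what the Python sketch below checks for one toy pair of moduli. All function names and parameter choices are assumptions of this illustration.

```python
import cmath
from math import gcd, sqrt

def e(q, n):
    """Additive character e_q(n) = exp(2 pi i n / q) for an integer n."""
    return cmath.exp(2j * cmath.pi * (n % q) / q)

def num_divisors(m):
    """The divisor function tau(m), by trial division."""
    return sum(1 for k in range(1, m + 1) if m % k == 0)

def double_kloosterman(d1, c1, d2, c2, l, m):
    """K(d1,c1; d2,c2; l,m) from Lemma 9, for square-free d1, d2 > 1."""
    d = d1 * d2 // gcd(d1, d2)                 # d = [d1, d2]
    total = 0
    for n in range(d):
        if gcd(n, d1) == 1 and gcd(n + l, d2) == 1:
            total += (e(d1, c1 * pow(n, -1, d1))
                      * e(d2, c2 * pow(n + l, -1, d2))
                      * e(d, m * n))
    return total

def lemma9_bound(d1, c1, d2, c2, m):
    """d0 tau(t1) tau(t2) (m,c1,d1)^{1/2} (m,c2,d2)^{1/2} (t1 t2)^{1/2}."""
    d0 = gcd(d1, d2)
    t1, t2 = d1 // d0, d2 // d0
    g1 = gcd(gcd(m, c1), d1)
    g2 = gcd(gcd(m, c2), d2)
    return d0 * num_divisors(t1) * num_divisors(t2) * sqrt(g1 * t1 * g2 * t2)

d1, d2 = 15, 21                                # d0 = 3, t1 = 5, t2 = 7
worst_ratio = max(
    abs(double_kloosterman(d1, c1, d2, c2, l, m))
    / lemma9_bound(d1, c1, d2, c2, m)
    for c1 in (1, 2, 5)
    for c2 in (1, 3, 7)
    for l in (0, 1, 4)
    for m in (0, 1, 5, 7, 35)
)
```

In this example `worst_ratio` stays below ${1}$ , as the lemma requires.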

We can combine this with the method of completion of sums:

Corollary 10 Let ${d_1,d_2}$ be square-free natural numbers with ${d_1,d_2 \ll x}$ , let ${c_1,c_2,l}$ be integers, and let ${\psi_N}$ be a smooth coefficient sequence at a scale ${1 \ll N \ll x}$ . Then

$\displaystyle | \sum_{(n,d_1)=(n+l,d_2)=1} e_{d_1}( \frac{c_1}{n} ) e_{d_2}( \frac{c_2}{n+l} ) \psi_N(n) |$

$\displaystyle \ll x^{o(1)} ( (d_1 d_2)^{1/2} + \frac{N (d_1,d_2)^2}{d_1 d_2} (c_1,d_1) (c_2,d_2) ).$

Proof: Write ${d := [d_1,d_2]}$ . By (14) (and overspill) we may bound the left-hand side by

$\displaystyle \ll \frac{1}{d} (\sum_m \psi_N(m)) (\sum_{n \in {\bf Z}/d{\bf Z}} e_{d_1}( \frac{c_1}{n} ) e_{d_2}( \frac{c_2}{n+l} ) )$

$\displaystyle + x^{o(1)} \frac{N}{d} \sum_{1 \leq m \leq x^{o(1)} d N^{-1}} |\sum_{n \in {\bf Z}/d{\bf Z}} e_{d_1}( \frac{c_1}{n} ) e_{d_2}( \frac{c_2}{n+l} ) e_d(mn)|$

$\displaystyle + x^{-A} d,$

where we suppress the conditions ${(n,d_1)=(n+l,d_2)=1}$ from the ${n}$ summation for brevity. The first term is ${O( x^{o(1)} \frac{N}{d} (d_1,d_2) (c_1,d_1) (c_2,d_2))}$ by Lemma 9, which is acceptable. Another application of Lemma 9 gives

$\displaystyle |\sum_{n \in {\bf Z}/d{\bf Z}} e_{d_1}( \frac{c_1}{n} ) e_{d_2}( \frac{c_2}{n+l} ) e_d(mn)|$

$\displaystyle \ll x^{o(1)} (m,c_1,d_1)^{1/2} (m,c_2,d_2)^{1/2} d_1^{1/2} d_2^{1/2}.$

From the divisor bound one has

$\displaystyle \sum_{1 \leq m \leq M} (m,q) \ll q^{o(1)} M$

for any ${M,q \geq 1}$ , so from this and Cauchy-Schwarz the net contribution of the second term is

$\displaystyle \ll x^{o(1)} \frac{N}{d} (x^{o(1)} d N^{-1}) d_1^{1/2} d_2^{1/2}$

which is acceptable. $\Box$

— 4. Factoring a smooth number —

We will need to take advantage of the smooth nature of the variable ${q}$ to factor it into two smaller pieces. We need an elementary lemma:

Lemma 11 Let ${D}$ be a quantity of size ${1 \ll D \ll x}$ , and set

$\displaystyle D_0 := \exp( \log^{1/3} x )$

(say). Let ${A > 0}$ be fixed. Then, for all but ${O(D \log^{-A} x)}$ exceptions, all integers ${d \in [D,2D]}$ have the property that

$\displaystyle \prod_{p|d: p \leq D_0} p \leq \exp( \log^{2/3} x ); \ \ \ \ \ (19)$

in particular,

$\displaystyle \prod_{p|d: p \leq D_0} p = x^{o(1)}.$

Proof: Suppose that (19) failed, thus

$\displaystyle \prod_{p|d: p \leq D_0} p > \exp( \log^{2/3} x ) = D_0^{\log^{1/3} x}.$

In particular, we see that ${d}$ has at least ${\log^{1/3} x}$ prime factors, which implies in particular that

$\displaystyle \tau(d) \geq 2^{\log^{1/3} x}.$

On the other hand, we have the standard bound

$\displaystyle \sum_{D \leq d \leq 2D} \tau(d) \ll D \log x$

and the claim now follows from Markov’s inequality. $\Box$
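The first step of this proof can be illustrated numerically, although at computationally accessible values of ${x}$ the quantity ${D_0}$ is tiny and the asymptotic savings are not yet visible. The Python sketch below (toy values ${x = 10^6}$ , ${D = 10^4}$ ; all names are hypothetical) lists the integers in ${[D,2D]}$ violating (19) and records how many primes ${p \leq D_0}$ divide each of them.

```python
from math import exp, log

def small_prime_part(d, bound):
    """Return (product, count) of the distinct primes p <= bound dividing d."""
    product, count, rest, p = 1, 0, d, 2
    while p <= bound:
        if rest % p == 0:
            # p is prime here: all smaller prime factors were already removed.
            product *= p
            count += 1
            while rest % p == 0:
                rest //= p
        p += 1
    return product, count

x = 10 ** 6
L = log(x) ** (1 / 3)              # log^{1/3} x
D0 = exp(L)                        # D_0 = exp(log^{1/3} x), about 11 here
threshold = exp(log(x) ** (2 / 3)) # right-hand side of (19)

D = 10 ** 4
exceptions = []
for d in range(D, 2 * D + 1):
    product, count = small_prime_part(d, D0)
    if product > threshold:        # (19) fails for this d
        exceptions.append((d, count))

fraction_exceptional = len(exceptions) / (D + 1)
```

Every exceptional ${d}$ found this way has more than ${\log^{1/3} x}$ prime factors below ${D_0}$ , which is exactly the property used to force ${\tau(d)}$ to be large.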

Corollary 12 (Good factorisation) Let ${I = [1,x^\delta]}$ , and let ${R, D}$ be quantities such that

$\displaystyle 1 \ll R \ll D \ll x.$

Let ${A>0}$ be fixed. Then for all but ${O(D\log^{-A} x)}$ exceptions, all ${d \in {\mathcal S}_I \cap [D,2D]}$ have a factorisation

$\displaystyle d = qr$

where ${q,r \in {\mathcal S}_I}$ are coprime with

$\displaystyle x^{-o(1)} R \ll r \ll x^{\delta+o(1)} R.$

Furthermore ${q}$ has no prime factors less than ${D_0 := \exp(\log^{1/3} x)}$ , thus

$\displaystyle q \in {\mathcal S}_{I'}$

where ${I' := [D_0,x^\delta]}$ .

The fact that ${q}$ has no tiny (i.e. less than ${D_0}$ ) prime factors will imply that any two such ${q}$ will typically be coprime to each other with high probability (at least ${1-O(\log^{-A} x)}$ for any fixed ${A}$ ), which is a key technical fact which we will need to exploit later. (The ${W}$ -trick achieves a qualitatively similar effect, but would only give such a claim with probability ${1-o(1)}$ or maybe ${1-O(\log^{-c} x)}$ for some small ${c>0}$ if one really optimised it, which is insufficient for the applications at hand.)

Proof: By Lemma 11 we may restrict attention to those ${d \in {\mathcal S}_I \cap [D,2D]}$ for which

$\displaystyle \prod_{p|d: p \leq D_0} p = x^{o(1)}.$

Now ${d}$ is the product of distinct primes of size at most ${x^\delta}$ , with ${d \gg R}$ . Applying the greedy algorithm, one can then find a factor ${r'}$ of ${d}$ with

$\displaystyle R \ll r' \ll x^{\delta} R.$

If one then multiplies ${r'}$ by all primes of size less than ${D_0}$ that divide ${d/r'}$ to create ${r}$ , then sets ${q := d/r}$ , the claim follows. $\Box$
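The greedy construction in this proof is simple enough to write out. The following Python sketch takes the list of distinct prime factors of a square-free ${d}$ and returns the factors ${q}$ and ${r}$ together with the intermediate factor ${r'}$ . The function name `good_factorisation`, the choice to run the greedy algorithm over the primes in increasing order, and the toy parameters are assumptions of this sketch; the proof itself does not prescribe an order.

```python
from math import gcd, prod

def good_factorisation(primes, R, D0):
    """Factor the square-free d = prod(primes) as d = q r.

    Step 1 (greedy): multiply primes together until the product r' reaches R,
    so that R <= r' < R * max(primes).
    Step 2: move every remaining prime below D0 into r, so that q = d / r
    has no prime factor less than D0.
    """
    primes = sorted(primes)
    d = prod(primes)
    if d < R:
        raise ValueError("need d >= R")
    used, r_prime = [], 1
    for p in primes:
        if r_prime >= R:
            break
        r_prime *= p
        used.append(p)
    r = r_prime
    for p in primes:
        if p < D0 and p not in used:
            r *= p
    return d // r, r, r_prime

primes = [2, 3, 5, 11, 13, 17, 19, 23]         # toy square-free modulus
q, r, r_prime = good_factorisation(primes, R=100, D0=6)
```

With these toy inputs the greedy step stops at ${r' = 2 \cdot 3 \cdot 5 \cdot 11 = 330}$ , no further tiny primes remain, and ${q = 13 \cdot 17 \cdot 19 \cdot 23}$ .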

— 5. The dispersion method —

We begin the proof of Theorem 3. The reader may wish to track the exponents involved in the model regime

$\displaystyle \varpi, \delta \approx 0; \quad 0 < \sigma < 1/8; \quad M \approx x^{1/2+\sigma}; \quad N \approx x^{1/2-\sigma}. \ \ \ \ \ (20)$

We can restrict ${q}$ to the range

$\displaystyle q \geq x^{1/2-o(1)}$

for some sufficiently slowly decaying ${o(1)}$ , since otherwise we may use the Bombieri-Vinogradov theorem (Theorem 4). Thus we need to show that

$\displaystyle \sum_{q \in {\mathcal S}_I: x^{1/2-o(1)} \leq q< x^{1/2+2\varpi}} |\Delta(\alpha \ast \beta; a_q)| \ll NM \log^{-A} x. \ \ \ \ \ (21)$

Let

$\displaystyle \mu > 0 \ \ \ \ \ (22)$

be an exponent to be optimised later (in many cases, such as (20), it can be set very close to zero). By Corollary 12, outside of a small number of exceptions, we may factor ${q=q'r}$ where ${q' \in {\mathcal S}_{I'}}$ with ${I' := I \cap [D_0,x^\delta]}$ , ${r \in {\mathcal S}_I}$ is coprime to ${q'}$ , and

$\displaystyle x^{-\mu-\delta-o(1)} N \ll r \ll x^{-\mu+o(1)} N$

and

$\displaystyle x^{1/2-o(1)} \ll q'r \ll x^{1/2+2\varpi+o(1)}.$

Let us first dispose of the set ${{\mathcal E}}$ of exceptional values of ${q}$ for which the above factorisation is unavailable. From Corollary 12 we have

$\displaystyle \sum_{q \in {\mathcal S}_I \cap {\mathcal E}: x^{1/2-o(1)} \leq q< x^{1/2+2\varpi}} \frac{x}{q} \ll x \log^{-A+O(1)} x.$

On the other hand, we have the crude estimate

$\displaystyle |\Delta(\alpha \ast \beta; a_q)| \ll \frac{x}{q} \tau(q)^{O(1)} \log^{O(1)} x$

which, combined with standard estimates on sums of the divisor function, leads to the crude upper bound

$\displaystyle \sum_{q: x^{1/2-o(1)} \leq q< x^{1/2+2\varpi}} \frac{q}{x} |\Delta(\alpha \ast \beta; a_q)|^2 \ll x \log^{O(1)} x.$

Applying Cauchy-Schwarz we conclude that

$\displaystyle \sum_{q \in {\mathcal S}_I \cap {\mathcal E}: x^{1/2-o(1)} \leq q< x^{1/2+2\varpi}} |\Delta(\alpha \ast \beta; a_q)| \ll x \log^{-A/2+O(1)} x.$

and so ${{\mathcal E}}$ gives a negligible contribution to (21) (increasing ${A}$ as necessary).
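In more detail, the Cauchy-Schwarz step can be written out as follows (a sketch, with all sums over ${q}$ restricted to ${x^{1/2-o(1)} \leq q < x^{1/2+2\varpi}}$ , and with the second factor enlarged from ${{\mathcal S}_I \cap {\mathcal E}}$ to all ${q}$ in this range):

$\displaystyle \sum_{q \in {\mathcal S}_I \cap {\mathcal E}} |\Delta(\alpha \ast \beta; a_q)| \leq (\sum_{q \in {\mathcal S}_I \cap {\mathcal E}} \frac{x}{q})^{1/2} (\sum_{q} \frac{q}{x} |\Delta(\alpha \ast \beta; a_q)|^2)^{1/2}$

$\displaystyle \ll (x \log^{-A+O(1)} x)^{1/2} (x \log^{O(1)} x)^{1/2} = x \log^{-A/2+O(1)} x.$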

For the non-exceptional ${q \not \in {\mathcal E}}$ , we arbitrarily select a factorisation ${q = q'r}$ of the above form, and apply a dyadic decomposition. We conclude that it suffices to show that

$\displaystyle \sum_{q' \in {\mathcal S}_{I'}: Q \ll q' \ll Q} \sum_{r \in {\mathcal S}_I: R \ll r \ll R; (q',r)=1} |\Delta(\alpha \ast \beta; a_{q'r})| \ll NM \log^{-A} x.$

for any fixed ${A > 0}$ , where ${Q, R \geq 1}$ obey the size conditions

$\displaystyle x^{-\mu-\delta-o(1)} N \ll R \ll x^{-\mu+o(1)} N \ \ \ \ \ (23)$

and

$\displaystyle x^{1/2-o(1)} \ll QR \ll x^{1/2 + 2\varpi}. \ \ \ \ \ (24)$

Fix ${Q,R}$ . We replace ${q'}$ by ${q}$ , and abbreviate ${\sum_{q \in {\mathcal S}_{I'}: Q \ll q \ll Q}}$ and ${\sum_{r \in {\mathcal S}_I: R \ll r \ll R}}$ by ${\sum_q}$ and ${\sum_r}$ respectively, thus our task is to show that

$\displaystyle \sum_q \sum_{r: (q,r)=1} |\Delta(\alpha \ast \beta; a_{qr})| \ll NM \log^{-A} x.$

We now split the discrepancy

$\displaystyle \Delta(\alpha \ast \beta; a_{qr}) = \sum_{n = a_{qr}\ (qr)} \alpha \ast \beta(n) - \frac{1}{\phi(qr)} \sum_{n: (n,qr)=1} \alpha \ast \beta(n)$

as the sum of the subdiscrepancies

$\displaystyle \sum_{n: n = a_{qr}\ (qr)} \alpha \ast \beta(n) - \frac{1}{\phi(q)} \sum_{n: (n,q)=1; n = a_r\ (r)} \alpha \ast \beta(n)$

and

$\displaystyle \frac{1}{\phi(q)} \sum_{n: (n,q)=1; n = a_r\ (r)} \alpha \ast \beta(n) - \frac{1}{\phi(qr)} \sum_{n: (n,qr)=1} \alpha \ast \beta(n).$

Of the two, the first discrepancy is significantly more difficult to handle. By the triangle inequality, it will suffice to show that

$\displaystyle \sum_{q} \sum_{r; (q,r)=1} |\sum_{n: n = a_{qr}\ (qr)} \alpha \ast \beta(n) - \frac{1}{\phi(q)} \sum_{n: (n,q)=1; n = a_r\ (r)} \alpha \ast \beta(n)| \ \ \ \ \ (25)$

$\displaystyle \ll NM \log^{-A} x$

and

$\displaystyle \sum_{q} \sum_{r; (q,r)=1} |\frac{1}{\phi(q)} \sum_{n: (n,q)=1; n = a_r\ (r)} \alpha \ast \beta(n) - \frac{1}{\phi(qr)} \sum_{n: (n,qr)=1} \alpha \ast \beta(n)| \ll \ \ \ \ \ (26)$

$\displaystyle NM \log^{-A} x.$

We begin with (26), which is easier. For each fixed ${q}$ , it will suffice to show that

$\displaystyle \sum_{r; (q,r)=1} |\sum_{n: (n,q)=1; n = a_r\ (r)} \alpha \ast \beta(n) - \frac{1}{\phi(r)} \sum_{n: (n,qr)=1} \alpha \ast \beta(n)| \ll NM \log^{-A} x,$

as the claim then follows by dividing by ${\phi(q)}$ and summing using standard estimates (see Lemma 8 of this previous post). But this claim follows from the Bombieri-Vinogradov theorem (Theorem 4), after restricting ${\alpha,\beta}$ to the integers coprime to ${q}$ (which does not affect the property of being a coefficient sequence supported at a certain scale, nor does it affect the Siegel-Walfisz theorem).

Now we establish (25). Here we will not take advantage of the ${r}$ summation, and use crude estimates to reduce to showing that

$\displaystyle \sum_{q; (q,r)=1} |\sum_{n: n = a_q\ (q); n = a_r\ (r)} \alpha \ast \beta(n) - \frac{1}{\phi(q)} \sum_{n: (n,q)=1; n = a_r\ (r)} \alpha \ast \beta(n)| \ \ \ \ \ (27)$

$\displaystyle \ll NM R^{-1} \tau(r)^{O(1)} \log^{-A} x$

for each individual ${r \in {\mathcal S}_I}$ with ${R \ll r \ll R}$ , which we now fix. Actually, we will prove the more general statement

$\displaystyle \sum_{q; (q,r)=1} |\sum_{n: n = b_q\ (q); n = a_r\ (r)} \alpha \ast \beta(n) - \sum_{n: n = b'_q\ (q); n = a_r\ (r)} \alpha \ast \beta(n)| \ \ \ \ \ (28)$

$\displaystyle \ll NM R^{-1} \tau(r)^{O(1)} \log^{-A} x$

whenever ${(b_q)_{q \in {\mathcal S}_{I'}}, (b'_q)_{q \in {\mathcal S}_{I'}}}$ are good singleton congruence class systems. Let us see how (28) implies (27). Observe that if ${t = o(x)}$ is any integer, then ${(\{a_q\})_{q \in {\mathcal S}_{I'_t}}}$ and ${(\{a_q+t\})_{q \in {\mathcal S}_{I'_t}}}$ are also good singleton congruence class systems, where ${I'_t}$ consists of the primes ${p \in I'}$ with ${a_p+t}$ not divisible by ${p}$ (thus ${q \in {\mathcal S}_{I'}}$ lies in ${{\mathcal S}_{I'_t}}$ when ${a_q+t}$ is coprime to ${q}$ ). By (28) we thus have

$\displaystyle \sum_{q; (q,r)=1} 1_{(a_q+t,q)=1} |\sum_{n: n = a_q\ (q); n = a_r\ (r)} \alpha \ast \beta(n)$

$\displaystyle - \sum_{n: n = a_q + t\ (q); n = a_r\ (r)} \alpha \ast \beta(n)|$

$\displaystyle \ll NM R^{-1} \tau(r)^{O(1)} \log^{-A} x.$

If we average ${t}$ over the interval ${T := {\bf Z} \cap [-x^{1-\epsilon},x^{1-\epsilon}]}$ for some sufficiently small fixed ${\epsilon>0}$ , we observe that

$\displaystyle \frac{1}{|T|} \sum_{t \in T} 1_{(a_q+t,q)=1} = \frac{\phi(q)}{q} + O( Q |T|^{-1} )$

and similarly (using the crude divisor bound)

$\displaystyle \frac{1}{|T|} \sum_{t \in T} 1_{(a_q+t,q)=1} \sum_{n: (n,q)=1; n = a_q + t\ (q); n = a_r\ (r)} \alpha \ast \beta(n)$

$\displaystyle = \frac{1}{q} \sum_{n: (n,q)=1; n = a_r\ (r)} \alpha \ast \beta(n) + O( NM |T|^{-1} x^{o(1)} ).$

(The condition ${(n,q)=1}$ is redundant, but we include it for emphasis.) We conclude from the triangle inequality and the bounds on ${Q,R}$ (if ${\epsilon}$ is small enough) that

$\displaystyle \sum_{q; (q,r)=1} \frac{\phi(q)}{q} |\sum_{n: n = a_q\ (q); n = a_r\ (r)} \alpha \ast \beta(n) - \frac{1}{\phi(q)} \sum_{n: (n,q)=1; n = a_r\ (r)} \alpha \ast \beta(n)|$

$\displaystyle \ll NM R^{-1} \tau(r)^{O(1)} \log^{-A} x.$
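The first averaging identity used above, namely that ${1_{(a_q+t,q)=1}}$ averages to ${\phi(q)/q}$ up to an error of size ${O(Q|T|^{-1})}$ , can be checked numerically. The following is a minimal sketch; the helper names `coprime_density` and `phi` and the sample values are hypothetical and chosen only for illustration.

```python
from math import gcd

def phi(q):
    """Euler totient, computed naively (fine for small q)."""
    return sum(1 for k in range(1, q + 1) if gcd(k, q) == 1)

def coprime_density(a, q, T):
    """Average of the indicator 1_{(a+t, q) = 1} over integers t in [-T, T]."""
    ts = range(-T, T + 1)
    hits = sum(1 for t in ts if gcd(a + t, q) == 1)
    return hits / len(ts)

def density_error(a, q, T):
    """Absolute difference between the average and phi(q)/q."""
    return abs(coprime_density(a, q, T) - phi(q) / q)
```

Since the indicator is periodic in ${t}$ with period ${q}$ , the discrepancy comes only from one incomplete period, which is why `density_error(a, q, T)` is at most ${q/(2T+1)}$ in this sketch.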

On the other hand, from crude estimates one has

$\displaystyle \sum_{q; (q,r)=1} \frac{q}{\phi(q)} |\sum_{n: n = a_q\ (q); n = a_r\ (r)} \alpha \ast \beta(n) - \frac{1}{\phi(q)} \sum_{n: (n,q)=1; n = a_r\ (r)} \alpha \ast \beta(n)|$

$\displaystyle \ll NM R^{-1} \log^{O(1)} x$

and the claim (27) then follows from Cauchy-Schwarz (noting from the Chinese remainder theorem that the two constraints ${n = a_q\ (q), n = a_r\ (r)}$ are equivalent to the single constraint ${n = a_{qr}\ (qr)}$ ).
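To make this final Cauchy-Schwarz step explicit, write ${D_q}$ (a shorthand introduced only for this remark) for the absolute value appearing in (27). Splitting ${D_q = (\frac{\phi(q)}{q} D_q)^{1/2} (\frac{q}{\phi(q)} D_q)^{1/2}}$ , the two preceding bounds give

$\displaystyle \sum_{q; (q,r)=1} D_q \leq (\sum_{q; (q,r)=1} \frac{\phi(q)}{q} D_q)^{1/2} (\sum_{q; (q,r)=1} \frac{q}{\phi(q)} D_q)^{1/2} \ll NM R^{-1} \tau(r)^{O(1)} \log^{-A/2+O(1)} x,$

which gives (27) after enlarging ${A}$ , since ${A}$ was arbitrary.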

It remains to prove (28). We will use the dispersion method (or Cauchy-Schwarz), playing the two congruence conditions ${n = b_q\ (q)}$ and ${n = b'_q\ (q)}$ against each other. We first get rid of the absolute values in (28) by introducing an additional bounded coefficient. More precisely, to prove (28) it suffices to show that

$\displaystyle |\sum_{q; (q,r)=1} c_q (\sum_{n: n = b_q\ (q); n = a_r\ (r)} \alpha \ast \beta(n) - \sum_{n: n = b'_q\ (q); n = a_r\ (r)} \alpha \ast \beta(n))|$

$\displaystyle \ll NM R^{-1} \tau(r)^{O(1)} \log^{-A} x$

for any bounded real coefficients ${c_q = O(1)}$ . We expand out the Dirichlet convolution

$\displaystyle \alpha \ast \beta(n) = \sum_{m,n': mn' = n} \alpha(m) \beta(n')$

then relabel ${n'}$ as ${n}$ to rearrange the left-hand side as

$\displaystyle |\sum_{m} \alpha(m) \sum_{q,n: mn = a_r\ (r); (q,r)=1} c_{q} \beta(n) (1_{mn = b_{q}\ (q)} - 1_{mn = b'_{q}\ (q)})|.$

We can write ${\alpha = \alpha \psi_M}$ for some smooth non-negative coefficient sequence ${\psi_M}$ at scale ${M}$ . From crude bounds one has

$\displaystyle \sum_{m} \alpha^2(m) \psi_M(m)\ll M \log^{O(1)} x$

so by the Cauchy-Schwarz inequality it suffices to show that

$\displaystyle \sum_{m} \psi_M(m) |\sum_{q,n: mn = a_r\ (r); (q,r)=1} c_{q} \beta(n) (1_{mn = b_{q}\ (q)} - 1_{mn = b'_{q}\ (q)})|^2 \ \ \ \ \ (29)$

$\displaystyle \ll N^2 M R^{-2} \tau(r)^{O(1)} \log^{-A} x$

for any fixed ${A>0}$ . (As a sanity check, note that we are still only asking for a ${\log^{-A} x}$ savings over the trivial bound.) Expanding out the square, it suffices to show that

$\displaystyle \sum_{m} \psi_M(m) \sum_{q_1,q_2,n_1,n_2: mn_1=mn_2 = a_r\ (r); (q_1,r)=(q_2,r)=1} \ \ \ \ \ (30)$

$\displaystyle c_{q_1} c_{q_2} \beta(n_1) \beta(n_2) 1_{mn_1 = b_{q_1}\ (q_1)} 1_{mn_2 = b'_{q_2}\ (q_2)}$

$\displaystyle = X + O( N^2 M R^{-2} \tau(r)^{O(1)} \log^{-A} x )$

where ${q_1,q_2}$ are subject to the same constraints as ${q}$ (thus ${q_i \in {\mathcal S}_{I'}}$ and ${Q \ll q_i \ll Q}$ for ${i=1,2}$ ), and ${X}$ is some quantity that is independent of the choice of congruence classes ${(b_q)_{q \in {\mathcal S}_{I'}}}$ , ${(b'_q)_{q \in {\mathcal S}_{I'}}}$ . Indeed, by replacing the ${b_q}$ with ${b'_q}$ or vice versa as necessary, one can express the left-hand side of (29) as a linear combination ${S_1-2S_2+S_3}$ of terms ${S_1,S_2,S_3}$ of the form in (30), in such a way that all the ${X}$ terms cancel out ( ${X - 2X + X = 0}$ ).
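To spell out the cancellation, write ${S(b,b')}$ (a notation used only in this remark) for the left-hand side of (30). Expanding the square in (29), and using the symmetry ${(q_1,n_1) \leftrightarrow (q_2,n_2)}$ to identify the two cross terms, one obtains

$\displaystyle \sum_{m} \psi_M(m) |\sum_{q,n: mn = a_r\ (r); (q,r)=1} c_{q} \beta(n) (1_{mn = b_{q}\ (q)} - 1_{mn = b'_{q}\ (q)})|^2 = S(b,b) - 2 S(b,b') + S(b',b').$

Applying (30) to each of the three pairs of systems ${(b,b)}$ , ${(b,b')}$ , ${(b',b')}$ then gives

$\displaystyle S(b,b) - 2 S(b,b') + S(b',b') = (X - 2X + X) + O( N^2 M R^{-2} \tau(r)^{O(1)} \log^{-A} x ),$

which is the bound required in (29).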

At this stage we need to deal with a technical problem that ${q_1,q_2}$ may share a common factor; fortunately, this event turns out to be negligible (but only thanks to the controlled multiplicity hypothesis (6)). More precisely, we split (30) into the coprime case

$\displaystyle \sum_{m} \psi_M(m) \sum_{q_1,q_2,n_1,n_2: mn_1=mn_2 = a_r\ (r); (q_1,r)=(q_2,r)=(q_1,q_2)=1} \ \ \ \ \ (31)$

$\displaystyle c_{q_1} c_{q_2} \beta(n_1) \beta(n_2) 1_{mn_1 = b_{q_1}\ (q_1)} 1_{mn_2 = b'_{q_2}\ (q_2)}$

$\displaystyle = X + O( N^2 M R^{-2} \log^{-A} x )$

and the non-coprime case

$\displaystyle |\sum_{q_0>1} \sum_{m} \psi_M(m) \sum_{q_1,q_2,n_1,n_2: mn_1=mn_2 = a_r\ (r); (q_1,r)=(q_2,r)=1; (q_1,q_2)=q_0} \ \ \ \ \ (32)$

$\displaystyle c_{q_1} c_{q_2} \beta(n_1) \beta(n_2) 1_{mn_1 = b_{q_1}\ (q_1)} 1_{mn_2 = b'_{q_2}\ (q_2)}|$

$\displaystyle \ll N^2 M R^{-2} \tau(r)^{O(1)} \log^{-A} x$

We first show (32). The basic point here is that because we have previously restricted ${q_1,q_2}$ to have no prime factor smaller than ${D_0}$ , we can gain a factor of ${D_0^{-1}}$ in (32), which is strong enough to overcome logarithmic losses ${\log^{O(1)} x}$ but not losses ${x^{o(1)}}$ coming from the divisor bound. To avoid using the divisor bound we will need the controlled multiplicity hypothesis (6) (and this is the only place in the argument where this hypothesis is actually used).

We turn to the details. The quantities ${q_0,q_1,q_2}$ must all lie in ${{\mathcal S}_{I'}}$ . We may split ${q_1 = q_0 q'_1}$ and ${q_2 = q_0 q'_2}$ where ${q'_1,q'_2}$ are coprime. By the Chinese remainder theorem, the constraints ${mn_1 = b_{q_1}\ (q_1)}$ and ${mn_2 = b'_{q_2}\ (q_2)}$ then imply that

$\displaystyle mn_1 = b_{q_0}\ (q_0); \quad mn_2 = b'_{q_0}\ (q_0); \quad mn_1 = b_{q'_1}\ (q'_1); \quad mn_2 = b'_{q'_2}\ (q'_2)$

so by the triangle inequality and interchange of summation we can bound the left-hand side by

$\displaystyle \ll \sum_{q_0>1} \sum_{m} \psi_M(m) \sum_{n_1,n_2: mn_1=mn_2 =a_r\ (r); mn_1 = b_{q_0}\ (q_0); mn_2 = b'_{q_0}\ (q_0)}$

$\displaystyle |\beta(n_1)| |\beta(n_2)| \sum_{q'_1,q'_2} 1_{mn_1 = b_{q'_1}\ (q'_1); mn_2 = b'_{q'_2}\ (q'_2)}$

(with ${q_0}$ understood to lie in ${{\mathcal S}_{I'}}$ and be at most ${Q}$ ) which we can write as

$\displaystyle \ll \sum_{q_0>1} \sum_{m} \psi_M(m) \sum_{n_1,n_2: mn_1=mn_2 =a_r\ (r); mn_1 = b_{q_0}\ (q_0); mn_2 = b'_{q_0}\ (q_0)}$

$\displaystyle |\beta(n_1)| |\beta(n_2)| \tau_b(mn_1) \tau_{b'}(mn_2)$

where

$\displaystyle \tau_b(n) := |\{ q \in {\mathcal S}_{I'}: n = b_q\ (q) \}|$

and

$\displaystyle \tau_{b'}(n) := |\{ q \in {\mathcal S}_{I'}: n = b'_q\ (q) \}|$

are the multiplicity functions associated to the congruence class systems ${(\{b_q\})_{q \in {\mathcal S}_{I'}}}$ and ${(\{b'_q\})_{q \in {\mathcal S}_{I'}}}$ .

By the elementary bound ${\tau_b(mn_1) \tau_{b'}(mn_2) \leq \tau_b(mn_1)^2 + \tau_{b'}(mn_2)^2}$ and symmetry we may thus bound the left-hand side of (32) by

$\displaystyle \ll \sum_{q_0>1} \sum_{m} \psi_M(m) \sum_{n_1,n_2: mn_1=mn_2 =a_r\ (r); mn_1 = b_{q_0}\ (q_0); mn_2 = b'_{q_0}\ (q_0)}$

$\displaystyle |\beta(n_1)| |\beta(n_2)| \tau_b(mn_1)^2$

plus a symmetric term which is treated similarly and will be ignored.

We rearrange the constraints

$\displaystyle mn_1=mn_2=a_r\ (r); \quad mn_1 = b_{q_0}\ (q_0); \quad mn_2 = b'_{q_0}\ (q_0)$

as a combination of the constraints

$\displaystyle n_1 = n_2\ (r); \quad b'_{q_0} n_1 = b_{q_0} n_2\ (q_0)$

and

$\displaystyle mn_1 = a_r\ (r); \quad mn_1 = b_{q_0}\ (q_0).$

For a given choice of ${n_1,n_2}$ obeying the former set of constraints, the ${mn_1}$ obeying the second set of constraints lie in a single congruence class mod ${q_0rn_1}$ by the Chinese remainder theorem. From the controlled multiplicity hypothesis (6) one thus has

$\displaystyle \sum_{m: mn_1 = a_r\ (r); mn_1 = b_{q_0}\ (q_0)} \psi_M(m) \tau_b(mn_1)^2 \ll \frac{M}{q_0 R} \tau(q_0 r n_1)^{O(1)} \log^{O(1)} x + x^{o(1)}$

and so the left-hand side of (32) has been bounded by

$\displaystyle \ll \sum_{q_0>1} \sum_{n_1,n_2: n_1=n_2\ (r); b'_{q_0} n_1 = b_{q_0} n_2\ (q_0)} |\beta(n_1)| |\beta(n_2)|$

$\displaystyle (\frac{M}{q_0 R} \tau(q_0 r n_1)^{O(1)} \log^{O(1)} x + x^{o(1)}).$

We first dispose of the ${x^{o(1)}}$ error term. From the divisor bound we have ${|\beta(n_1)| |\beta(n_2)| = O(x^{o(1)})}$ . For a given choice of ${n_1 \sim N}$ , the ${n_2}$ sum then can be bounded by ${x^{o(1)} (\frac{N}{q_0 R} + 1)}$ , leading to a total contribution of

$\displaystyle \ll x^{o(1)} N (\frac{N}{R} + Q).$

We would like to bound this by ${N^2 M R^{-2} \log^{-A} x}$ . This is possible if we have

$\displaystyle R \ll x^{-c+o(1)} M \ \ \ \ \ (33)$

and

$\displaystyle QR \times R \ll x^{-c+o(1)} MN$

for some fixed ${c>0}$ . But the former bound is immediate from (23), (10), (22), while from (9), (24), (23) we see that the latter bound will follow if we have

$\displaystyle x^\mu \gg x^{-1/2+2\varpi+c} N. \ \ \ \ \ (34)$

for some fixed ${c>0}$ . We file away this necessary condition for now and move on, though we note that this condition is weaker than (22) except in the “Type II” case when ${M,N}$ are close to ${\sqrt{x}}$ .

Having disposed of the ${x^{o(1)}}$ error term, the remaining contribution to the left-hand side of (32) that we need to control is

$\displaystyle \sum_{q_0>1} \sum_{n_1,n_2: n_1=n_2\ (r); b'_{q_0} n_1 = b_{q_0} n_2\ (q_0)} |\beta(n_1)| |\beta(n_2)| \frac{M}{q_0 R} \tau(q_0 r n_1)^{O(1)} \log^{O(1)} x.$

Summing in ${n_2}$ using crude bounds, this is bounded by

$\displaystyle \ll \sum_{q_0>1} \sum_{n_1} |\beta(n_1)| \frac{M}{q_0 R} \tau(q_0 r n_1)^{O(1)} \log^{O(1)} x ( \frac{N}{q_0 R} + x^{o(1)}),$

and then by summing in ${n_1}$ this is in turn bounded by

$\displaystyle \ll \sum_{q_0>1} \frac{NM}{q_0 R} \tau(q_0 r)^{O(1)} \log^{O(1)} x ( \frac{N}{q_0 R} + x^{o(1)}).$

The ${x^{o(1)}}$ term sums to ${O( x^{o(1)} NM / R )}$ , which is ${O( N^2 M R^{-2} \log^{-A} x)}$ thanks to (23), (22). The main term is then

$\displaystyle \ll ( \tau(r)^{O(1)} \log^{O(1)} x) \frac{N^2 M}{R^2} \sum_{q_0>1} \frac{\tau(q_0)^{O(1)}}{q_0^2}.$

Now we finally use the fact that ${q_0}$ has no small factors less than ${D_0}$ to bound the summation here by

$\displaystyle \prod_{p \geq D_0} (1 + O(\frac{1}{p^2})) - 1 = O( \frac{1}{D_0} )$

and since ${D_0}$ grows faster than any power of ${\log x}$ we see that this error term is also acceptable for (32). This concludes the proof of (32) (contingent of course on the lower bounds (22), (34) on ${\mu}$ that we will deal with later).
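The gain of ${1/D_0}$ from the absence of small prime factors can be seen numerically. The sketch below makes two simplifying assumptions that are not in the argument above: the weight ${\tau(q_0)^{O(1)}}$ is replaced by ${1}$ , and the product over primes is truncated at a finite `cutoff`, so the computed value is only a lower approximation to the full product. The function names are hypothetical.

```python
def primes_in(lo, hi):
    """Primes p with lo <= p < hi, via a simple sieve of Eratosthenes."""
    sieve = bytearray([1]) * hi
    sieve[0:2] = b"\x00\x00"
    for i in range(2, int(hi ** 0.5) + 1):
        if sieve[i]:
            # Cross out the multiples of i, starting from i*i.
            sieve[i * i::i] = bytearray(len(range(i * i, hi, i)))
    return [p for p in range(lo, hi) if sieve[p]]

def rough_tail(D0, cutoff):
    """prod_{D0 <= p < cutoff} (1 + 1/p^2) - 1.

    This is the sum of 1/q0^2 over squarefree q0 > 1 all of whose prime
    factors lie in [D0, cutoff).
    """
    product = 1.0
    for p in primes_in(D0, cutoff):
        product *= 1.0 + 1.0 / (p * p)
    return product - 1.0
```

With `cutoff` equal to ${10^5}$ , the values of `rough_tail(100, 10**5)` and `rough_tail(1000, 10**5)` fall below ${1/100}$ and ${1/1000}$ respectively, consistent with the ${O(1/D_0)}$ bound.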

It remains to verify (31). Observe that ${n_1}$ must be coprime to ${q_1r}$ and ${n_2}$ coprime to ${q_2r}$ , with ${n_1 = n_2\ (r)}$ , to have a non-zero contribution to the sum. We then rearrange the left-hand side as

$\displaystyle \sum_{q_1,q_2: (q_1,r)=(q_2,r)=(q_1,q_2)=1} \sum_{m} \psi_M(m) \sum_{n_1,n_2: n_1=n_2\ (r); (n_1,q_1r)=(n_2,q_2)=1}$

$\displaystyle c_{q_1} c_{q_2} \beta(n_1) \beta(n_2) 1_{m = a_r/n_1\ (r); m = b_{q_1}/n_1\ (q_1); m = b'_{q_2}/n_2 (q_2)};$

note that these inverses in the various rings ${{\bf Z}/r{\bf Z}}$ , ${{\bf Z}/q_1{\bf Z}}$ , ${{\bf Z}/q_2{\bf Z}}$ are well-defined thanks to the coprimality hypotheses.

We may write ${n_2 = n_1+kr}$ for some ${k = O(N/R)}$ . By the triangle inequality, and relabeling ${n_1}$ as ${n}$ , it thus suffices to show that for any particular

$\displaystyle k = O(N/R), \ \ \ \ \ (35)$

one has

$\displaystyle \sum_{q_1,q_2: (q_1,r)=(q_2,r)=(q_1,q_2)=1} \sum_{m} \psi_M(m) \sum_{n; (n,q_1r)=(n+kr,q_2)=1} \ \ \ \ \ (36)$

$\displaystyle c_{q_1} c_{q_2} \beta(n) \beta(n+kr) 1_{m = a_r/n\ (r); m = b_{q_1}/n\ (q_1); m = b'_{q_2}/(n+kr) (q_2)}$

$\displaystyle = X_k + O( N M R^{-1} \log^{-A} x )$

for some ${X_k}$ independent of the ${b_q}$ and ${b'_q}$ .

We remark that at this stage we are only needing to gain a factor of ${\log^{-A} x}$ over the trivial bound. However, we will now perform the expensive step of completion of sums (Lemma 6), which replaces the ${\psi_M}$ factor by an exponential phase at the cost of requiring now a significantly larger gain over the trivial bound. Applying Lemma 6 and Lemma 7, we can write the left-hand side of (36) as the sum of the main term

$\displaystyle X_k := \sum_{q_1,q_2: (q_1,r)=(q_2,r)=(q_1,q_2)=1} \sum_{n: (n,q_1r)=(n+kr,q_2)=1}$

$\displaystyle \frac{c_{q_1} c_{q_2} \beta(n) \beta(n+kr)}{r q_1 q_2} (\sum_m \psi_M(m));$

an error term

$\displaystyle O( (\log^{O(1)} x) \frac{M}{Q^2R} \sum_{1 \leq h \leq H} \sum_{q_1,q_2} |\sum_{n} \beta(n) \beta(n+kr) \Phi(h,q_1,q_2; n) |)$

where

$\displaystyle H := x^\epsilon Q^2 R/M, \ \ \ \ \ (37)$

${\epsilon > 0}$ is an arbitrary small fixed quantity, and ${\Phi = \Phi_{k,r}}$ is the phase

$\displaystyle \Phi(h,q_1,q_2;n) := 1_{(q_1,r)=(q_2,r)=(q_1,q_2)=(n,r)=(n,q_1)=(n+kr,q_2)=1} \ \ \ \ \ (38)$

$\displaystyle e_r( \frac{a_r h}{nq_1 q_2} ) e_{q_1}( \frac{b_{q_1}h}{n r q_2} ) e_{q_2}( \frac{b'_{q_2} h}{(n+kr) r q_1} )$

(here we use the bounded nature of the ${c_{q_1}, c_{q_2}}$ ); and another error term that can easily be shown to be ${O(x^{-A+O(1)})}$ for any fixed ${A}$ . The term ${X_k}$ is independent of the ${b_q}$ and ${b'_q}$ , so it will suffice to show that

$\displaystyle \sum_h \sum_{Q \ll q_1,q_2 \ll Q} |\sum_{n} \beta(n) \beta(n+kr) \Phi(h,q_1,q_2; n)| \ll x^{-\epsilon+o(1)} Q^2 N \ \ \ \ \ (39)$

for a sufficiently small fixed ${\epsilon>0}$ , where we have dropped all hypotheses on ${q_1,q_2}$ other than magnitude, and abbreviated ${\sum_{1 \leq h \leq H}}$ as ${\sum_h}$ . As noted after Lemma 6, we are no longer asking for just a ${\log^{-A} x}$ savings over the trivial bound; we must instead gain a factor of ${x^{\epsilon} H}$ to overcome the summation in ${h}$ .

Although this is not strictly necessary for our analysis, let us confirm that ${H}$ is actually non-trivial in the sense that

$\displaystyle H \gg 1. \ \ \ \ \ (40)$

Indeed, from (23) and (9) one has

$\displaystyle RM \ll x^{1-\mu+o(1)}$

and hence from (24)

$\displaystyle RM \ll x^{-\mu+o(1)} Q^2 R^2$

and (40) then follows from (37). Note though in the model case (20) with ${\mu \approx 0}$ that ${H}$ is close to ${1}$ (for any ${\sigma}$ between ${0}$ and ${1/8}$ ).
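The size of ${H}$ in the model case can be tracked with exact rational exponents. The sketch below assumes, purely for illustration, that ${QR = x^{1/2}}$ and ${R = x^{-\mu} N}$ exactly (these sit at the edges of the ranges (24), (23) with ${\varpi = \delta = 0}$ ); the function name `model_exponents` is hypothetical.

```python
from fractions import Fraction as F

def model_exponents(sigma, mu, eps):
    """Exponents (as powers of x) of M, N, R, Q, H in the model regime (20).

    Assumptions made for this sketch: QR = x^{1/2} and R = x^{-mu} N exactly.
    """
    m = F(1, 2) + sigma          # M = x^{1/2 + sigma}
    n = F(1, 2) - sigma          # N = x^{1/2 - sigma}
    r = n - mu                   # R = x^{-mu} N
    q = F(1, 2) - r              # Q = x^{1/2} / R
    h = eps + 2 * q + r - m      # H = x^eps Q^2 R / M, as in (37)
    return {"M": m, "N": n, "R": r, "Q": q, "H": h}
```

Under these assumptions the exponent of ${H}$ simplifies to ${\epsilon + \mu}$ for every ${\sigma}$ , which matches the remark that ${H}$ is close to ${1}$ when ${\mu \approx 0}$ .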

We now split into two cases, one which works when ${M, N}$ are not too close to ${x^{1/2}}$ , and one which works when ${M,N}$ are close to ${x^{1/2}}$ .

Theorem 13 (Type I case) If the inequalities

$\displaystyle 11\varpi + 3\mu + 3\delta + 2 \sigma < \frac{1}{4} \ \ \ \ \ (41)$

and

$\displaystyle M \gg x^{1/2+2\varpi+c} \ \ \ \ \ (42)$

hold for some fixed ${c>0}$ , then (39) holds for a sufficiently small fixed ${\epsilon>0}$ .

The condition (42) represents a border between this case and the Type II case.

Theorem 14 (Type II case) If the inequality

$\displaystyle 24\varpi + 10 \mu + 10 \delta + 7 \sigma < \frac{1}{2} \ \ \ \ \ (43)$

holds, then (39) holds for a sufficiently small fixed ${\epsilon>0}$ .

Both of these theorems are established by using Cauchy-Schwarz to eliminate the absolute values and ${\beta}$ factors in (39) until one is left with an expression only involving the phase ${\Phi(h,q_1,q_2;n)}$ , which can then be estimated using the Weil conjectures with a power saving to counteract the loss of ${H}$ ; however, the use of Cauchy-Schwarz is slightly different in the two cases. In practice, the condition (43) is too strong to be satisfied by the value of ${\sigma}$ given in Theorem 3, so we will only use Theorem 14 in the case that (42) fails, since in that case we may make ${\sigma}$ small enough for (43) to hold.

Assuming these theorems, let us now conclude the proof of Theorem 3. First suppose we are in the “Type I” regime when (42) holds for some fixed ${c>0}$ . Then by (9) we have

$\displaystyle N \ll x^{1/2-2\varpi-c}$

which means that the condition (34) is now weaker than (22) and may be omitted. By (7) we can simultaneously obey (22), (34), (41) by setting ${\mu}$ sufficiently close to zero, and the claim now follows from Theorem 13.

Now suppose instead that we are in the “Type II” regime where (42) fails for some small ${c>0}$ , so that by (9) we have

$\displaystyle x^{1/2-2\varpi-c} \ll N \ll M \ll x^{1/2+2\varpi+c}.$

From this we see that we may replace ${\sigma}$ by ${2\varpi+c}$ in (10) and in all of the above analysis. If we set ${\mu := 2\varpi + c}$ then the conditions (22), (34) are obeyed. Theorem 14 will then give us what we want provided that

$\displaystyle 24\varpi + 10 (2\varpi+c) + 10 \delta + 7 (2\varpi+c) < \frac{1}{2}$

which is satisfied for ${c}$ small enough thanks to (8).
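For the record, collecting terms in the displayed inequality gives the equivalent form

$\displaystyle 58\varpi + 10\delta + 17c < \frac{1}{2},$

so ${c}$ only needs to fit inside the room left by a strict inequality between ${58\varpi + 10\delta}$ and ${\frac{1}{2}}$ .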

In the next two sections we establish Theorem 13 and Theorem 14.

— 6. The Type I sum —

We now prove Theorem 13. It suffices to show that

$\displaystyle |\sum_h \sum_{Q \ll q_1,q_2 \ll Q} c_{h,q_1,q_2} \sum_{n} \beta(n) \beta(n+kr) \Phi(h,q_1,q_2; n)| \ll x^{-\epsilon+o(1)} Q^2 N$

for any bounded real coefficients ${c_{h,q_1,q_2} = O(1)}$ (which are vaguely related to the previous coefficients ${c_q}$ , but this is not important for us here). We rearrange the left-hand side as

$\displaystyle |\sum_{Q\ll q_1 \ll Q} \sum_n \beta(n) \beta(n+kr) \sum_h \sum_{Q \ll q_2 \ll Q} c_{h,q_1,q_2} \Phi(h,q_1,q_2; n)|.$

From the divisor bound we have

$\displaystyle \sum_{Q \ll q_1 \ll Q} \sum_n |\beta(n) \beta(n+kr)|^2 \ll x^{o(1)} Q N$

and we may write ${\beta = \beta \psi_N}$ for some smooth coefficient sequence ${\psi_N}$ at scale ${N}$ , so by Cauchy-Schwarz it suffices to show that

$\displaystyle \sum_{Q \ll q_1 \ll Q} \sum_{n} \psi_N(n) |\sum_h \sum_{Q \ll q_2 \ll Q} c_{h,q_1,q_2} \Phi(h,q_1,q_2; n)|^2 \ll x^{-2\epsilon+o(1)} Q^3 N$

(note now we have to gain more than ${H^2}$ over the trivial bound, rather than just ${H}$ ). We rearrange this as

$\displaystyle |\sum_{h,h'} \sum_{Q \ll q_1, q_2,q'_2 \ll Q} c_{h,q_1,q_2} \overline{c_{h',q_1,q'_2}} \sum_{n} \psi_N(n)\Phi(h,q_1,q_2;n) \overline{\Phi(h',q_1,q'_2;n)}|$

so by the triangle inequality it suffices to show that

$\displaystyle \sum_{h,h'} \sum_{Q\ll q_1, q_2,q'_2 \ll Q} |\sum_{n} \psi_N(n) \Phi(h,q_1,q_2;n) \overline{\Phi(h',q_1,q'_2;n)}| \ll x^{-2\epsilon+o(1)} Q^3 N$

for some fixed ${\epsilon>0}$ . We discard the ${q_1}$ summation and reduce to showing that

$\displaystyle \sum_{h,h'} \sum_{Q\ll q_2,q'_2 \ll Q} |\sum_{n} \psi_N(n) \Phi(h,q_1,q_2;n) \overline{\Phi(h',q_1,q'_2;n)}| \ll x^{-2\epsilon+o(1)} Q^2 N \ \ \ \ \ (44)$

for any ${Q \ll q_1 \ll Q}$ .

To prove (44), we isolate the diagonal case ${h'q_2 = hq'_2}$ and the non-diagonal case ${h'q_2 \neq h q'_2}$ . For the diagonal case, we make the crude bound

$\displaystyle |\sum_{n} \psi_N(n) \Phi(h,q_1,q_2;n) \overline{\Phi(h',q_1,q'_2;n)}| \ll x^{o(1)} N$

The contribution of the diagonal case can now be bounded by

$\displaystyle \ll N | \{ (h,h',q_2,q'_2): h,h' = O(H); q_2,q'_2 = O(Q); hq'_2=h'q_2 \}|.$

Writing ${m := hq'_2 = h'q_2}$ we have ${m=O(HQ)}$ , and one can estimate this by

$\displaystyle \ll N \sum_{m = O(HQ)} \tau(m)^2$

which by the divisor bound is of the form

$\displaystyle \ll x^{o(1)} N H Q.$

For this to be acceptable we need a bound of the form

$\displaystyle H \ll x^{-2\epsilon+o(1)} Q$

which, by (37) is equivalent to

$\displaystyle QR \ll x^{-3\epsilon+o(1)} M,$

but this follows from (24), (42) for ${\epsilon}$ small enough.
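The counting step in the diagonal case can be checked by brute force for small parameters. In this sketch (function names hypothetical) the variables range over ${1 \leq h,h' \leq H}$ and ${1 \leq q_2,q'_2 \leq Q}$ , a slightly larger range than in the text; the number of diagonal quadruples is compared with ${\sum_{m \leq HQ} \tau(m)^2}$ .

```python
def tau(m):
    """Number of positive divisors of m (naive, for small m)."""
    return sum(1 for d in range(1, m + 1) if m % d == 0)

def diagonal_count(H, Q):
    """Number of (h, h', q2, q2') in [1,H]^2 x [1,Q]^2 with h*q2' == h'*q2."""
    return sum(
        1
        for h in range(1, H + 1)
        for h_prime in range(1, H + 1)
        for q2 in range(1, Q + 1)
        for q2_prime in range(1, Q + 1)
        if h * q2_prime == h_prime * q2
    )

def divisor_bound(H, Q):
    """Sum of tau(m)^2 over 1 <= m <= H*Q."""
    return sum(tau(m) ** 2 for m in range(1, H * Q + 1))
```

Each common value ${m = hq'_2 = h'q_2}$ arises from at most ${\tau(m)}$ pairs ${(h,q'_2)}$ and at most ${\tau(m)}$ pairs ${(h',q_2)}$ , which is why `diagonal_count(H, Q)` never exceeds `divisor_bound(H, Q)`.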

Now we treat the non-diagonal case ${h'q_2 \neq hq'_2}$ . The key estimate here is

$\displaystyle |\sum_{n} \psi_N(n) \Phi(h,q_1,q_2;n) \overline{\Phi(h',q_1,q'_2;n)}| \ \ \ \ \ (45)$

$\displaystyle \ll x^{o(1)} (Q^{3/2} R^{1/2} + H Q R^{-1} N)$

for all non-diagonal ${h,q_2,h',q'_2}$ . In the model case (20) with ${\mu \approx 0}$ , the two terms on the right-hand side are approximately ${x^{1/4+\sigma+o(1)}}$ and ${x^{\sigma+o(1)}}$ , which give the desired power saving compared to the trivial bound of ${x^{1/2-\sigma+o(1)}}$ since ${\sigma < 1/8}$ (and in (20), ${H}$ is small, so just about any power saving suffices). As the model case indicates, the first term in (45) is the dominant one in practice.

Assume for the moment that the estimate (45) holds; then the non-diagonal contribution to (44) is

$\displaystyle \ll x^{o(1)} ( H^2 Q^{7/2} R^{1/2} + H^3 Q^3 R^{-1} N )$

so to conclude (44) we need to show that

$\displaystyle H^2 Q^{7/2} R^{1/2} \ll x^{-2\epsilon+o(1)} Q^2 N$

and

$\displaystyle H^3 Q^3 R^{-1} N \ll x^{-2\epsilon+o(1)} Q^2 N.$

Using (37), (9) we can rewrite these criteria as

$\displaystyle (QR)^{11/2} \ll x^{2-4\epsilon+o(1)} N^2 (R/N)^3$

and

$\displaystyle (QR)^7 \ll x^{3-5\epsilon+o(1)} N^2 (R/N)^5$

respectively. Applying (24), (23), it suffices to verify that

$\displaystyle x^{\frac{11}{4} + 11 \varpi} \ll x^{2-4\epsilon-3\mu-3\delta+o(1)} N^2$

and

$\displaystyle x^{\frac{7}{2} + 14 \varpi} \ll x^{3-5\epsilon-5\mu-5\delta+o(1)} N^2$

but these follow from (10) and (41) (the latter inequality holds with considerable room to spare).
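The exponent arithmetic behind this last step can be checked mechanically. The sketch below compares exponents of ${x}$ on the two sides of the last two displays, ignoring the ${\epsilon}$ and ${o(1)}$ terms and taking ${N = x^{1/2-\sigma}}$ (which we assume is the extreme case permitted by (10), as in the model regime (20)); the function name is hypothetical.

```python
from fractions import Fraction as F

def type_i_margins(varpi, mu, delta, sigma):
    """Room to spare (right exponent minus left exponent) in the two
    Type I criteria, with N = x^{1/2 - sigma} and eps, o(1) ignored."""
    n_squared = 2 * (F(1, 2) - sigma)   # exponent of N^2
    first = (2 - 3 * mu - 3 * delta + n_squared) - (F(11, 4) + 11 * varpi)
    second = (3 - 5 * mu - 5 * delta + n_squared) - (F(7, 2) + 14 * varpi)
    return first, second
```

The first margin equals ${\frac{1}{4} - (11\varpi + 3\mu + 3\delta + 2\sigma)}$ , so it is positive exactly when (41) holds, while the second equals ${\frac{1}{2} - (14\varpi + 5\mu + 5\delta + 2\sigma)}$ and is at least twice the first, which is the sense in which the latter inequality has room to spare.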

It remains to show (45) in the non-diagonal case ${h'q_2 \neq hq'_2}$ . From (38) we may of course assume that

$\displaystyle (q_1,r) = (q_2,r) = (q_1,q_2) = (q'_2,r) = (q_1,q'_2) = 1$

and the left-hand side of (45) expands as

$\displaystyle | \sum_{n} \psi_N(n) 1_{(n,q_1r) = (n+kr,q_2q'_2)=1} e_r( \frac{a_r (hq'_2-h'q_2)}{nq_1q_2q'_2} ) e_{q_1}( \frac{b_{q_1} (h q'_2 - h' q_2)}{nrq_2q'_2})$

$\displaystyle e_{q_2}( \frac{b'_{q_2} h}{(n+kr) r q_1} ) e_{q'_2}( -\frac{b'_{q'_2} h'}{(n+kr) r q_1} )|.$

The first two phases

$\displaystyle e_r( \frac{a_r (hq'_2-h'q_2)}{nq_1q_2q'_2} ) e_{q_1}( \frac{b_{q_1} (h q'_2 - h' q_2)}{nrq_2q'_2})$

can be combined as

$\displaystyle e_{d_1}( \frac{c_1}{n} )$

where ${d_1 := q_1 r}$ and ${c_1 \in {\bf Z}/d_1{\bf Z}}$ is the residue class

$\displaystyle c_1 := (hq'_2 - h'q_2) (a_r \bar{q_1} \bar{q_2} \bar{q'_2} q_1 + b_{q_1} \bar{r} \bar{q_2} \bar{q'_2} r)\ (d_1)$

and the inverses ${\bar{x}}$ are with respect to modulus ${r}$ in the first summand and ${q_1}$ in the second summand. For future reference we note that

$\displaystyle (c_1, r) = (hq'_2-h'q_2, r). \ \ \ \ \ (46)$
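The way two phases with coprime moduli merge into a single phase can be tested with exact arithmetic. The sketch below checks a simplified form of the identity, ${e_r( \frac{a}{nq} ) e_q( \frac{b}{nr} ) = e_{rq}( \frac{c}{n} )}$ with ${c = a \bar{q} q + b \bar{r} r}$ , where the common factor ${hq'_2-h'q_2}$ and the inverses of ${q_2, q'_2}$ are absorbed into ${a, b}$ ; all names are hypothetical, and `pow(x, -1, m)` requires Python 3.8 or later.

```python
from fractions import Fraction

def e_mod(numerator, denominator, modulus):
    """Exponent (as a fraction mod 1) of e_modulus(numerator / denominator),
    with 1/denominator interpreted as the inverse modulo `modulus`."""
    a = numerator * pow(denominator, -1, modulus) % modulus
    return Fraction(a, modulus)

def combined_residue(a, b, r, q):
    """Residue c mod r*q with e_r(a/(n q)) e_q(b/(n r)) = e_{rq}(c/n),
    for coprime r, q; inverses are mod r in the first summand, mod q in the second."""
    return (a * pow(q, -1, r) * q + b * pow(r, -1, q) * r) % (r * q)

def phases_agree(a, b, r, q, n):
    """Compare the product of the two phases with the single combined phase."""
    lhs = (e_mod(a, n * q, r) + e_mod(b, n * r, q)) % 1
    rhs = e_mod(combined_residue(a, b, r, q), n, r * q)
    return lhs == rhs
```

For example, with ${r=7}$ , ${q=11}$ , ${a=3}$ , ${b=5}$ , ${n=2}$ both sides come out as ${e(19/77)}$ .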

Similarly, the two phases

$\displaystyle e_{q_2}( \frac{b'_{q_2} h}{(n+kr) r q_1} ) e_{q'_2}( -\frac{b'_{q'_2} h'}{(n+kr) r q_1} )$

can combine as

$\displaystyle e_{d_2}( \frac{c_2}{n+kr} )$

where ${d_2 := [q_2,q'_2]}$ and ${c_2 \in {\bf Z}/d_2{\bf Z}}$ is the residue class

$\displaystyle c_2 := b'_{q_2} h \bar{r} \bar{q_1} \frac{d_2}{q_2} - b'_{q'_2} h' \bar{r} \bar{q_1} \frac{d_2}{q'_2}\ (d_2),$

although the precise value of ${c_2}$ will not be important for us. The left-hand side of (45) is thus

$\displaystyle |\sum_{n} \psi_N(n) 1_{(n,d_1) = (n+kr,d_2)=1} e_{d_1}( \frac{c_1}{n} ) e_{d_2}( \frac{c_2}{n+kr} )|$

and hence, by Corollary 10 and the coprimality of ${d_1,d_2}$ , is bounded by

$\displaystyle \ll x^{o(1)} ( (d_1 d_2)^{1/2} + \frac{N}{d_1 d_2} (c_1,d_1) (c_2,d_2) ).$

Note that

$\displaystyle d_1 d_2 \leq q_1 r q_2 q'_2 \ll Q^3 R$

so that ${(d_1 d_2)^{1/2}}$ is controlled by the first term on the right-hand side of (45). So it remains to show that

$\displaystyle \frac{1}{d_1 d_2} (c_1,d_1) (c_2,d_2) \ll H Q R^{-1}.$

We crudely bound ${(c_2,d_2)}$ by ${d_2}$ and ${(c_1,d_1)}$ by ${(c_1,r) q_1}$ . (This is inefficient, but this term is not dominant in the analysis in any case.) By (46) we may bound ${(c_1,r)}$ by ${HQ}$ , and the claim follows.
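Written out as a single chain, and using that ${hq'_2 - h'q_2}$ is a non-zero integer of size ${O(HQ)}$ in the non-diagonal case, the computation is

$\displaystyle \frac{1}{d_1 d_2} (c_1,d_1) (c_2,d_2) \leq \frac{(c_1,r) q_1 d_2}{q_1 r d_2} = \frac{(hq'_2-h'q_2, r)}{r} \leq \frac{|hq'_2-h'q_2|}{r} \ll H Q R^{-1}.$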

— 7. The Type II sum —

We now prove Theorem 14. As before, it suffices to show that

$\displaystyle |\sum_h \sum_{Q \ll q_1,q_2 \ll Q} c_{h,q_1,q_2} \sum_{n} \beta(n) \beta(n+kr) \Phi(h,q_1,q_2; n)| \ll x^{-\epsilon+o(1)} Q^2 N$

for any bounded real coefficients ${c_{h,q_1,q_2} = O(1)}$ . We rearrange the left-hand side slightly differently from the previous section:

$\displaystyle |\sum_n \beta(n) \beta(n+kr) \sum_h \sum_{Q \ll q_1, q_2 \ll Q} c_{h,q_1,q_2} \Phi(h,q_1,q_2; n)|.$

From crude bounds we have

$\displaystyle \sum_n |\beta(n) \beta(n+kr)|^2 \ll x^{o(1)} N,$

so by Cauchy-Schwarz it suffices to show that

$\displaystyle \sum_{n} \psi_N(n) |\sum_h \sum_{Q \ll q_1,q_2 \ll Q} c_{h,q_1,q_2} \Phi(h,q_1,q_2; n)|^2 \ll x^{-2\epsilon+o(1)} Q^4 N.$

Expanding and using the triangle inequality as in the previous section, we reduce to showing that

$\displaystyle \sum_{h,h'} \sum_{Q\ll q_1, q'_1, q_2,q'_2 \ll Q} |\sum_{n} \psi_N(n) \Phi(h,q_1,q_2;n) \overline{\Phi(h',q'_1,q'_2;n)}| \ll x^{-2\epsilon+o(1)} Q^4 N. \ \ \ \ \ (47)$

This is basically the same situation as in the previous section except that we have decoupled the ${q_1,q'_1}$ variables from each other.

As before, we isolate a diagonal case ${h'q_1 q_2 = h q'_1 q'_2}$ , and now consider the contribution of this case. Arguing as in the previous section, the contribution of this case to (47) can be bounded by

$\displaystyle \ll N \sum_{m = O(HQ^2)} \tau(m)^{O(1)}$

which by the divisor bound is of the form

$\displaystyle \ll x^{o(1)} N H Q^2$

which will be acceptable if

$\displaystyle H \ll x^{-2\epsilon+o(1)} Q^2.$

By (37) this is equivalent to

$\displaystyle R \ll x^{-3\epsilon+o(1)} M,$

but this is automatic from (23) and (22).

For the off-diagonal case, we use the following variant

$\displaystyle |\sum_{n} \psi_N(n) \Phi(h,q_1,q_2;n) \overline{\Phi(h',q'_1,q'_2;n)}| \ll \ \ \ \ \ (48)$

$\displaystyle x^{o(1)} (Q^{2} R^{1/2} + H Q^6 R^{-1} N)$

of (45), valid for all non-diagonal ${h,q_1,q_2,h',q'_1,q'_2}$ . The bound here is weaker than in (45), but in the model case (20) the right-hand side terms are approximately ${x^{\frac{1}{4} + \frac{3\sigma}{2}+o(1)}}$ and ${x^{6\sigma+o(1)}}$ respectively, which still represents a power saving over the trivial bound of approximately ${x^{1/2-\sigma}}$ as long as ${\sigma < 1/14}$ . While this does not cover all the range ${\sigma<1/8}$ that the Type I analysis does, it crucially is able to cover the case when ${\sigma}$ is very close to zero, which the Type I analysis does not cover due to the condition (42). The Type I/II border is not the critical border for optimising the exponents, so it is not a priority for us to improve bounds in the Type II analysis such as (48).

We assume (48) for now and finish the proof of Theorem 14 by numerical computations similar to those in the previous section. The non-diagonal contribution to (47) is now estimated by

$\displaystyle \ll x^{o(1)} ( H^2 Q^6 R^{1/2} + H^3 Q^{10} R^{-1} N )$

so to conclude (47) we need to show that

$\displaystyle H^2 Q^6 R^{1/2} \ll x^{-2\epsilon+o(1)} Q^4 N$

and

$\displaystyle H^3 Q^{10} R^{-1} N \ll x^{-2\epsilon+o(1)} Q^4 N.$

Using (37), (9) we can rewrite these criteria as

$\displaystyle (QR)^6 \ll x^{2-4\epsilon+o(1)} N^{5/2} (R/N)^{7/2}$

and

$\displaystyle (QR)^{12} \ll x^{3-5\epsilon+o(1)} N^7 (R/N)^{10}$

respectively. Applying (24), (23), it suffices to verify that

$\displaystyle x^{3 + 12 \varpi} \ll x^{2-4\epsilon-\frac{7}{2}\mu-\frac{7}{2}\delta+o(1)} N^{5/2}$

and

$\displaystyle x^{6 + 24 \varpi} \ll x^{3-5\epsilon-10\mu-10\delta+o(1)} N^7$

but these follow from (10) and (43).
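The rewriting of the two criteria is pure exponent arithmetic and can be checked mechanically. The sketch below works with exponents of ${x}$ and assumes that (37) has the shape ${H = x^{\epsilon+o(1)} Q^2 R/M}$ and that (9) has the shape ${MN = x^{1+o(1)}}$ ; both shapes are inferred from the surrounding computations, and the helper name is hypothetical.

```python
from fractions import Fraction as F

def criterion_slacks(q, r, n, eps):
    """Return (before_1, after_1, before_2, after_2).

    Each entry is (exponent of the right-hand side) minus (exponent of the
    left-hand side), with Q = x^q, R = x^r, N = x^n and o(1) terms ignored.
    """
    m = 1 - n                       # assumed form of (9): M N = x
    h = eps + 2 * q + r - m         # assumed form of (37): H = x^eps Q^2 R / M
    target = -2 * eps + 4 * q + n   # exponent of x^(-2 eps) Q^4 N
    before_1 = target - (2 * h + 6 * q + r / 2)    # H^2 Q^6 R^(1/2)
    before_2 = target - (3 * h + 10 * q - r + n)   # H^3 Q^10 R^(-1) N
    # The same two criteria after the rewriting in terms of QR, N and R/N.
    after_1 = 2 - 4 * eps + F(5, 2) * n + F(7, 2) * (r - n) - 6 * (q + r)
    after_2 = 3 - 5 * eps + 7 * n + 10 * (r - n) - 12 * (q + r)
    return before_1, after_1, before_2, after_2
```

If the two forms of each criterion agree, the "before" and "after" slacks coincide for every choice of exponents, which is what one finds.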

It remains to prove (48). This is very similar to the treatment of (45). From (38) we may assume

$\displaystyle (q_1,r) = (q'_1,r) = (q_2,r) = (q'_2,r) = (q_1,q_2) = (q'_1,q'_2) = 1$

and the left-hand side of (48) expands as

$\displaystyle | \sum_{n} \psi_N(n) 1_{(n,q_1q'_1r) = (n+kr,q_2q'_2)=1} e_r( \frac{a_r (hq'_1q'_2-h'q_1q_2)}{nq_1q'_1q_2q'_2} )$

$\displaystyle e_{q_1}( \frac{b_{q_1} h}{nrq_2}) e_{q'_1}( -\frac{b'_{q'_1} h'}{nrq'_2})$

$\displaystyle e_{q_2}( \frac{b_{q_2} h}{(n+kr) r q_1} ) e_{q'_2}( -\frac{b'_{q'_2} h'}{(n+kr) r q'_1} )|.$

If we set

$\displaystyle d_1 := [q_1,q'_1]r; \quad d_2 := [q_2,q'_2]$

then as before we can rewrite this sum as

$\displaystyle | \sum_{n} \psi_N(n) 1_{(n,d_1) = (n+kr,d_2)=1} e_{d_1}( \frac{c_1}{n} ) e_{d_2}( \frac{c_2}{n+kr} ) |$

where

$\displaystyle c_1 = a_r \bar{q_1} \bar{q'_1} \bar{q_2} \bar{q'_2} (hq'_1q'_2-h'q_1q_2) \frac{d_1}{r}$

$\displaystyle + b_{q_1} \bar{r} \bar{q_2} h \frac{d_1}{q_1} - b'_{q'_1} \bar{r} \bar{q'_2} h' \frac{d_1}{q'_1}\ (d_1)$

(with the inverses with respect to the moduli ${r, q_1, q'_1}$ in the first, second, and third summands respectively), and the value of ${c_2}$ is unimportant for us. We have an analogue of (46), namely

$\displaystyle (c_1, r) = (hq'_1q'_2-h'q_1q_2, r). \ \ \ \ \ (49)$

We apply Corollary 10 as before, although things are not quite as favorable because ${d_1,d_2}$ need not be coprime in this case. This bounds the left-hand side of (48) by

$\displaystyle \ll x^{o(1)} ( (d_1 d_2)^{1/2} + \frac{N (d_1,d_2)^2}{d_1 d_2} (c_1,d_1) (c_2,d_2) ).$

We have

$\displaystyle d_1 d_2 \leq q_1 q'_1 r q_2 q'_2 \ll Q^4 R$

so the ${(d_1 d_2)^{1/2}}$ term is controlled by the first term on the right-hand side of (48). So it remains to show that

$\displaystyle \frac{(d_1,d_2)^2}{d_1 d_2} (c_1,d_1) (c_2,d_2) \ll H Q^6 R^{-1}.$

We crudely bound ${(c_2,d_2)}$ by ${d_2}$ and ${(c_1,d_1)}$ by ${(c_1,r) \frac{d_1}{r}}$ ; also

$\displaystyle (d_1,d_2) \leq (q_1q'_1, q_2 q'_2) \ll Q^2$

and from (49) one has ${(c_1,r) \ll H Q^2}$ . The claim follows.

75 comments

12 June, 2013 at 1:06 pm

mttpd

Shouldn’t that be “n” instead of “x” in the RHS of (1)?
// Argument of the log function.

12 June, 2013 at 1:10 pm

mttpd

Ah, no, just looked at the previous post (“As in previous posts, we let {x} be an asymptotic parameter tending to infinity”). NVM :-)

12 June, 2013 at 4:04 pm

Terence Tao

Some recording of parameters in the critical case: assuming

$\varpi = 1/899 \approx 0.00111$

$\delta = 71/154628 \approx 0.000460$

(following https://terrytao.wordpress.com/2013/06/10/a-combinatorial-subset-sum-problem-associated-with-bounded-prime-gaps/#comment-234015 ) we have

$\sigma = 4569/38657 \approx 0.118193$

and the most critical case of the Type I/II analysis occurs when

$\mu \approx 0$

$N \approx x^{1/2-\sigma} \approx x^{0.381807}$

$M \approx x^{1/2+\sigma} \approx x^{0.618193}$

$R \approx x^{-\delta} N \approx x^{0.381347}$

$Q \approx x^{1/2+2\varpi}/R \approx x^{0.120877}$

$H \approx Q^2 R/M \approx x^{0.00491}$

Here one uses the Type I analysis and the non-diagonal case dominates, with the first term on the RHS of (43) giving the main contribution.

Due to the use of completion of sums, (45) has to gain a factor of $H^2 \approx x^{0.00981}$ over the trivial bound, which it barely does; one is summing a Kloosterman-type sum of period $\approx Q^3 R \approx x^{0.743979}$ over an interval of scale $N \approx x^{0.381807}$ , and the square root cancellation basically estimates this sum by $\approx (Q^3 R)^{1/2} \approx x^{0.371990}$ , giving the desired savings of $x^{0.00981}$ . In principle short Kloosterman sums might improve this (ideally we should get $\sqrt{N}$ bounds and not $\sqrt{Q^3 R}$ ), but $N$ is already quite close to the square root of $Q^3 R$ so it doesn’t look so promising (unless the Chinese remainder theorem can somehow save us…). We also have all the averaging in $r, q_1, q_2, q'_2, h, h', k$ that is just being discarded at present, and possibly one could take more advantage of these parameters, but it’s not clear to me how to do so (incidentally $k$ is of size $\approx N/R \approx x^{0.000460}$ ).

Random thought: is there a Burgess type squaring trick for short Kloosterman sums?
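The exponents recorded in this comment can be re-derived with exact rational arithmetic. This is only a sketch: it takes $\mu = 0$ and reads each "$\approx$" as an equality of exponents of $x$, and the variable names are illustrative.

```python
from fractions import Fraction as F

varpi = F(1, 899)
delta = F(71, 154628)
sigma = F(4569, 38657)

# Exponents of x, taking mu = 0 and treating each "~" as an equality.
n = F(1, 2) - sigma            # N = x^(1/2 - sigma)
m = F(1, 2) + sigma            # M = x^(1/2 + sigma)
r = n - delta                  # R = x^(-delta) N
q = F(1, 2) + 2 * varpi - r    # Q = x^(1/2 + 2 varpi) / R
h = 2 * q + r - m              # H = Q^2 R / M
period = 3 * q + r             # period Q^3 R of the Kloosterman-type sum
saving = n - period / 2        # trivial bound N versus (Q^3 R)^(1/2)
```

For these particular values `saving` comes out exactly equal to `2 * h`, which is one way to see that the gain of $H^2$ is only just achieved.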

12 June, 2013 at 4:22 pm

Terence Tao

Another observation is that all the moduli involved (e.g. $r,q_1,q_2,q'_2$ ) are very smooth. For instance, since $Q \approx x^{262\delta}$ , we know that $q_1,q_2,q'_2$ all have at least 262 prime factors, all of which are quite small. So even though the progression $\{1,\ldots,N\}$ is quite short relative to the big modulus $Q^3 R$ , it is quite long compared to each of the prime divisors of this modulus. But I don’t know of a way to exploit this; Bourgain and Chang have some clever tricks to stretch the Burgess bound for character sums to product spaces in a way that squeezes out additional gains (see http://math.ucr.edu/~mcc/paper/138%20BurgessJB.pdf ) but I have no idea if these ideas also work for Kloosterman sums.

12 June, 2013 at 4:23 pm

v08ltu

Maybe something like Theorem 3 of FI, note that they use Holder with 4 in the exponent, but you can take this larger and gain more in some interval lengths (they remark this after Theorem 4, and do so in their Estimates on characters sums, for a related matter).

But the problem I had with such ideas, is that the sums are not all that “short” when I tried to apply these.

12 June, 2013 at 5:14 pm

v08ltu

Luo has a paper on incomplete Burgess estimates for Hyper-Kloosterman sums: http://www.sciencedirect.com/science/article/pii/S0022314X9892340X

12 June, 2013 at 5:32 pm

v08ltu

Shparlinski improves upon Luo, and mentions some related work. http://www.sciencedirect.com/science/article/pii/S0022314X06003027

12 June, 2013 at 5:38 pm

v08ltu

Also a 2010 paper of Karatsuba (posthumously): http://link.springer.com/article/10.1134%2FS0001434610090075

12 June, 2013 at 5:53 pm

v08ltu

It seems that Karatsuba (and Korolev) only save logs, and for quite short sums.

The most relevant I can find to our situation is Theorem 1 of this paper by Ping Xi: http://arxiv.org/pdf/1111.5459

The interval has to be at least $p^{3/8}$ in length for the result to be nontrivial (he demands the modulus be prime, but that might be technical).

His bound is $L^{1/r}p^{1/2-(3r-1)/4r^2}\log p$ for any $r$ where the interval is length $L$ .

13 June, 2013 at 5:31 pm

v08ltu

This preprint of Ping Xi is not easy to figure out, but it seems that at (6) on pg5 he uses Proposition 1 (from a different preprint) rather than Lemmas 2 and 3 as stated. So the main tool is actually a mean-value result of http://arxiv.org/abs/1111.5455
The Theorem 8 there is stated on pg18, but any proof exposition is deemed superfluous (“follow the arguments in this paper”). It also might be nontrivial to pass from prime moduli to composite, as FI note that much of the mess of their Section 1 is to extend as thus. However, if it all works out, it should beat $\sqrt{d_1d_2}$ with $(d_1d_2)^{3/16}\sqrt N$ (r=2 case in above formula), which is a win in our target region I think.

12 June, 2013 at 5:17 pm

v08ltu

A recent preprint of Browning and Haynes does not mention anything other than the Weil bound, and the Hooley conjecture, though the point is slightly different. http://arxiv.org/pdf/1204.6374

12 June, 2013 at 10:16 pm

Gaston Rachlou

A clash in terminology is always a bad thing in mathematics. This post contains two different uses of the adjective “smooth” (for coefficient sequences and for numbers). I think this is a good opportunity to definitively choose the adjective “friable” for numbers with small prime factors. See : https://blogs.ethz.ch/kowalski/2008/12/08/more-mathematical-terminology-friable/

12 June, 2013 at 10:49 pm

Terence Tao

I certainly see where you’re coming from, but in this particular case I think that the two notions of smooth are not in conflict, and in fact if one adopts an “adelic” perspective then there is actually a conceptual overlap between the two notions. A coefficient sequence is smooth if its spectrum avoids large frequencies (as measured in the Archimedean sense); and a number is smooth if its spectrum avoids large primes.

12 June, 2013 at 11:14 pm

Gaston Rachlou

If you vindicate an ambiguous use of “smooth” by an ambiguous use of “spectrum”, you’re likely to win any such debate:-)

By the way, congratulations for your blog in general and this “bounded gaps” series in particular!

13 June, 2013 at 9:48 am

Terence Tao

A fair point, but it’s worth mentioning that (as observed by Gelfand), these two notions of spectrum also have a lot of conceptual overlap: each frequency in the spectrum of a function $f$ corresponds to a prime ideal in the associated convolution algebra (of functions formed by convolving $f$ with other functions), whereas each prime in the spectrum of an integer $n$ corresponds to a prime ideal in the associated cyclic group ${\bf Z}/n{\bf Z}$ . Thus in both cases the spectrum is the spectrum of a commutative ring that is naturally associated to the object.

13 June, 2013 at 12:23 pm

Gaston Rachlou

Interesting! I am just still a bit puzzled as I tend to think about the set of prime divisors of $n$ more as the support of $n$ than as its spectrum. But, after all, this could be consistent with your identification of $n$ with the self-dual group $\mathbb{Z}/n\mathbb{Z}$ .

So, you almost convinced me:-( Nevertheless, I still prefer “friable”, which has a more direct, almost visual, explanation.

13 June, 2013 at 11:58 am

Eytan Paldi

This post should be added to the list of the polymath posts.

[Done, thanks – T.]

14 June, 2013 at 6:28 am

CraigH

Another followup question: MPZ[ $\varpi, \delta$ ] is a statement about the absolute deviation of $\theta(n)$ . What other number-theoretic functions can we make a similar statement about?
For example, could we take $\theta'(n) = \log n$ for $n$ prime 1 mod 4 (and 0 elsewhere)? If so, we could try to tailor admissible sets with $k_0$ elements such that the first $k_0/2$ of them are 1 mod 4, and the rest are 3 mod 4, and that would effectively divide the width by two.

14 June, 2013 at 8:47 am

Estimation of the Type III sums | What's new

[…] post, which deals with the combinatorial aspects of the second part of Zhang’s paper, and this previous post, that covers the Type I and Type II sums.) The main purpose of this post is to present (and […]

18 June, 2013 at 6:19 pm

A truncated elementary Selberg sieve of Pintz | What's new

[…] in the proof of . Indeed, if one inspects the proof of this proposition (described in these three previous posts), one sees that the key property of needed is not so much the smoothness, but a weaker […]

19 June, 2013 at 7:06 pm

Terence Tao

I am thinking that if we exploit the smoothness of the moduli, we can get (in some cases) better bounds from the incomplete Kloosterman-type sums by using Weyl differencing rather than by using completion of sums, in the spirit of what Zhang did for the Type III sums.

Let me illustrate this with the model problem of getting a non-trivial bound

$\sum_{1 \leq n \leq N} e_q( a n + b \overline{n} ) 1_{(n,q)=1} = o(N)$

for an incomplete Kloosterman sum over some squarefree modulus $q$ , with $a,b$ coprime to $q$ . The usual completion of sums method, combined with the Weil bound on Kloosterman sums, gives a non-trivial estimate of $q^{1/2+o(1)}$ for $N \ggg \sqrt{q}$ , but does not give any non-trivial bound for $N \ll \sqrt{q}$ . But if we have a good factorisation $q = q_1 q_2$ with $q_1,q_2$ in the right places, we can do better through Weyl differencing. Namely, suppose $q_1 \lll N \lll q_2$ ; then by shifting $n$ to $n + kq_1$ where $1 \leq k \leq K$ and $K = o( N / q_1 )$ , we can basically write the sum we want as

$\frac{1}{K} \sum_{1 \leq n \leq N} \sum_{1 \leq k \leq K} e_q( a (n+kq_1) + b \overline{n+kq_1} ) 1_{(n+kq_1,q)=1}$

so after Cauchy-Schwarz and throwing away the diagonal term (acceptable as long as $K \ggg 1$ ) it suffices to show that

$\sum_{1 \leq k \leq K} | \sum_{1 \leq n \leq N} e_q( a(n+kq_1) + b \overline{n+kq_1} - an - b \overline{n} )$

$1_{(n,q)=(n+kq_1,q)=1}| = o(KN).$

We can simplify the phase as

$e_q( akq_1 - \frac{bkq_1}{n(n+kq_1)} )$

which collapses to a $q_2$ phase:

$e_{q_2}( ak - \frac{bk}{n(n+kq_1)} )$ .

NOW if we do completion of sums, the inner sum

$\sum_{1 \leq n \leq N} e_{q_2}( ak - \frac{bk}{n(n+kq_1)} ) 1_{(n,q)=(n+kq_1,q)=1}$

should be bounded by $O(q_2^{1/2+o(1)})$ (possibly times a factor of $(k,q_2)$ , but this is negligible after averaging), and so we get a non-trivial bound whenever we have a factorisation $q=q_1 q_2$ with

$q_2^{1/2}, q_1 \lll N \lll q_2$

which (if $q$ is smooth) lets one get a non-trivial bound for $N$ almost as small as $q^{1/3}$ rather than $q^{1/2}$ .
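The collapse of the differenced phase to a $q_2$ phase can be sanity-checked on small moduli. Since $\overline{n+kq_1} - \overline{n} = -kq_1 \overline{n(n+kq_1)}$ mod $q$, the difference $a(n+kq_1) + b\overline{n+kq_1} - an - b\overline{n}$ should equal $q_1 ( ak - bk \overline{n(n+kq_1)} )$ mod $q$. The sketch below compares the two residues; the function names are illustrative and it needs Python 3.8 or later for modular inverses via `pow`.

```python
from math import gcd

def differenced_phase(a, b, n, k, q1, q2):
    """Residue mod q = q1*q2 of a(n+k q1) + b/(n+k q1) - a n - b/n (inverses mod q)."""
    q = q1 * q2
    shifted = n + k * q1
    return (a * shifted + b * pow(shifted, -1, q) - a * n - b * pow(n, -1, q)) % q

def collapsed_phase(a, b, n, k, q1, q2):
    """Residue mod q of q1 * (a k - b k / (n (n + k q1))), with the inverse taken mod q2.

    This is the phase e_{q2}(a k - b k / (n (n + k q1))) rewritten as an e_q phase.
    """
    q = q1 * q2
    inverse = pow(n * (n + k * q1), -1, q2)
    return (q1 * ((a * k - b * k * inverse) % q2)) % q
```

Both functions require $n$ and $n + kq_1$ to be coprime to $q$, matching the indicator function in the sum.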

Now, this sort of gain isn’t quite the type of gain we need to improve the Type I sums at the critical numerology; instead of trying to improve the trivial bound for very short Kloosterman sums (length less than $q^{1/2}$ ), our problem is closer to that of improving the completion of sums bound for moderate length Kloosterman sums (length greater than $q^{1/2}$ ). But it may be that one can flip one to the other via a Fourier transform. (Admittedly Kloosterman sums don’t Fourier transform as nicely as character sums, which I am more used to, so I may be a bit naive here, but still I think there is hope.)

More generally, I think one of the lessons of the Zhang analysis is that we really should be exploiting the smooth nature of our moduli whenever we can. Because of this the apparently “simpler” case of prime moduli may actually be a bit misleading; problems (such as estimating short Kloosterman sums) which are difficult for prime moduli may actually be a lot easier for smooth moduli due to the very flexible Weyl differencing technique that is available in this case.

20 June, 2013 at 11:16 am

Terence Tao

OK, the Fourier analysis wasn’t actually so bad. The new model problem is to obtain a bound on moderately long Kloosterman sums of the form

$\sum_n \psi_N(n) e_q( a \bar{n} ) 1_{(n,q)=1} = O( q^{1/2-\varepsilon+o(1)} )$ (*)

for some $\varepsilon > 0$ , where $q$ is squarefree and smooth, $a$ is coprime to $q$ , $\psi_N$ is a smooth cutoff to an interval of length $N$ , and $N$ is a bit larger than $q^{1/2}$ (the method described below gives a non-trivial bound basically for $N$ between $q^{1/2}$ and $q^{2/3}$ ). Note that completion of sums, together with the Weil bound on completed Kloosterman sums, only gives the bound of $O(q^{1/2+o(1)})$ , so we are beating the methods used by Zhang, in a regime of interest. (But, as per Corollary 10 above, we will eventually also need to stick in an additional phase $e_{q'}( a' \overline{n+l} )$ into the analysis.)

Anyway, by completion of sums or Poisson summation the LHS can be written as

$\frac{1}{q} \sum_{h \in {\bf Z}/q{\bf Z}} \hat \psi_N(\frac{h}{q}) S(h,a; q)$

where $S(h,a;q)$ is the completed Kloosterman sum

$S(h,a;q) := \sum_{n \in ({\bf Z}/q{\bf Z})^\times} e_q( a \bar{n} + h n )$

and $\hat \psi_N$ is the Fourier transform

$\hat \psi_N(\theta) := \sum_n \psi_N(n) e(-n\theta).$

Our objective is thus to show

$|\sum_{h \in {\bf Z}/q{\bf Z}} \hat \psi_N(\frac{h}{q}) S(h,a; q)| \ll q^{3/2-\varepsilon+o(1)}$ (**)

which is a power saving over the Weil bound and standard bounds on $\hat \psi_N$ (which has amplitude about $N$ and is concentrated on about $q/N$ frequencies).
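The completion-of-sums identity behind (**) is exact for any finitely supported weight, so it can be tested numerically on a small modulus. In this sketch the weight $\psi_N$ is represented by a dictionary mapping $n$ to $\psi_N(n)$; that representation and the function names are assumptions made for illustration only.

```python
import cmath
from math import gcd

def e(theta):
    """The additive character e(theta) = exp(2 pi i theta)."""
    return cmath.exp(2j * cmath.pi * theta)

def kloosterman(h, a, q):
    """Completed sum S(h,a;q): sum over units n mod q of e_q(a*nbar + h*n)."""
    return sum(e((a * pow(n, -1, q) + h * n) % q / q)
               for n in range(1, q + 1) if gcd(n, q) == 1)

def incomplete_sum(psi, a, q):
    """Direct evaluation of sum_n psi(n) e_q(a*nbar) 1_{(n,q)=1}."""
    return sum(w * e(a * pow(n, -1, q) % q / q)
               for n, w in psi.items() if gcd(n, q) == 1)

def completed_form(psi, a, q):
    """(1/q) sum_h psi_hat(h/q) S(h,a;q), with psi_hat(theta) = sum_n psi(n) e(-n theta)."""
    total = 0
    for h in range(q):
        psi_hat = sum(w * e(-(n * h % q) / q) for n, w in psi.items())
        total += psi_hat * kloosterman(h, a, q)
    return total / q
```

For instance, with a triangular weight on $1 \leq n \leq 20$ and $q = 35$, the two evaluations agree up to floating-point error.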

We factor $q=q_1q_2$ where $q_1,q_2$ are to be determined later.

We now perform Weyl differencing, replacing $h$ by $h + k q_1$ for $1 \leq k \leq K$ , where $K := \lfloor q^{-\varepsilon} q_2 / N \rfloor$ . We assume that $K \geq q^{2\varepsilon}$ , so that

$N \ll q^{-3\varepsilon} q_2$ .

Up to acceptable errors (exploiting the smooth nature of $\hat \psi_N$ at scale $q/N$ ), the LHS of (**) can then be rewritten as

$\frac{1}{K} |\sum_{h \in {\bf Z}/q{\bf Z}} \sum_{1 \leq k \leq K} \hat \psi_N(\frac{h+kq_1}{q}) S(h+kq_1,a; q)|.$

From Lemma 7 we may factorise

$S(h+kq_1,a;q) = S(h/q_2,a/q_2; q_1) S(h/q_1 + k,a/q_1; q_2)$ .

By the Weil bound, $S(h/q_2,a/q_2; q_1) = O( q_1^{1/2+o(1)} )$ , so we may bound the previous expression by

$\frac{q_1^{1/2+o(1)}}{K} \sum_{h \in {\bf Z}/q{\bf Z}} | \sum_{1 \leq k \leq K} \hat \psi_N(\frac{h+kq_1}{q}) S(h/q_1+k,a/q_1; q_2)|.$
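The factorisation of $S(h+kq_1,a;q)$ used here is the standard twisted multiplicativity of Kloosterman sums, and it is easy to test on small coprime moduli. The sketch below uses the convention $S(h,a;q) = \sum_n e_q(a\bar{n} + hn)$ from this comment; the helper names are illustrative.

```python
import cmath
from math import gcd

def kloosterman(h, a, q):
    """Completed sum S(h,a;q): sum over units n mod q of e_q(a*nbar + h*n)."""
    return sum(cmath.exp(2j * cmath.pi * ((a * pow(n, -1, q) + h * n) % q) / q)
               for n in range(1, q + 1) if gcd(n, q) == 1)

def twisted_product(h, a, q1, q2):
    """S(h/q2, a/q2; q1) * S(h/q1, a/q1; q2), dividing by q2 mod q1 and by q1 mod q2."""
    inv_q2 = pow(q2, -1, q1)
    inv_q1 = pow(q1, -1, q2)
    return (kloosterman(h * inv_q2, a * inv_q2, q1)
            * kloosterman(h * inv_q1, a * inv_q1, q2))
```

Replacing $h$ by $h + kq_1$ leaves the $q_1$ factor unchanged and shifts the first argument of the $q_2$ factor by $k$, which is the form used in the display above.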

We can restrict $h$ to the range $h=O(q^{1+o(1)}/N)$ since $\hat \psi_N$ is negligible outside of this range. By Cauchy-Schwarz, it then suffices to show

$\sum_{h \in {\bf Z}/q{\bf Z}} | \sum_{1 \leq k \leq K} \hat \psi_N(\frac{h+kq_1}{q}) S(h/q_1+k,a/q_1; q_2)|^2$

$\ll q^{-2\varepsilon+o(1)} N q q_2 K^2.$

Note that this goal represents a saving of $q^{-2\varepsilon}$ over the Weil bound and standard bounds on $\hat \psi_N$ . Expanding out the square and removing the diagonal $k=k'$ (which is acceptable since $K \gg q^{2\varepsilon}$ ), we reduce to showing that

$|\sum_{h \in {\bf Z}/q{\bf Z}} \hat \psi_N(\frac{h+kq_1}{q}) \overline{\hat \psi_N}(\frac{h}{q}) S(h/q_1+k,a/q_1; q_2) \overline{S(h/q_1,a/q_1;q_2)}|$

$\ll q^{-2\varepsilon+o(1)} N q q_2$

on the average for $1\leq k \leq K$ . Writing $h = h_1 q_2 + h_2 q_1$ for $h_i \in {\bf Z}/q_i {\bf Z}$ , the left-hand side becomes

$|\sum_{h_1 \in {\bf Z}/q_1{\bf Z}} \sum_{h_2 \in {\bf Z}/q_2{\bf Z}} \hat \psi_N(\frac{h_2+k}{q_2} + \frac{h_1}{q_1}) \overline{\hat \psi_N}(\frac{h_2}{q_2} + \frac{h_1}{q_1})$

$S(h_2+k,a/q_1; q_2) \overline{S(h_2,a/q_1;q_2)}|.$

After some Fourier manipulation to evaluate the $h_1,h_2$ sums, we may write this as

$q_1 q_2 |\sum_{n,m: n=m\ (q_1)} \psi_N(n) \overline{\psi_N(m)} \sum_{n_2,m_2 \in ({\bf Z}/q_2{\bf Z})^\times: n_2-m_2 = n-m\ (q_2)}$

$e_{q_2} ( (n_2-n) k + \bar{n_2} a/q_1 - \bar{m_2} a/q_1 )|.$

From Weil, the inner sum is (I think) bounded by $O( q_2^{1/2+o(1)} )$ (possibly times a factor of $(k,q_2)$ which is negligible after averaging in $k$ ), and if we assume $N \gg q_1$ , then we get a net bound of $O( N^2 q_2^{3/2+o(1)} )$ , which gives the desired power saving in the regime

$q_1 \ll N \ll q^{-3\varepsilon} q_2, q^{-2\varepsilon} q_1 q_2^{1/2}$

which in the smooth case allows one to take $N$ nearly as large as $q^{2/3-3\varepsilon}$ by choosing $q_1,q_2$ close to $q^{1/3}, q^{2/3}$ respectively.

In principle the same sort of argument should work to get an improvement to Corollary 10 above, which in turn should lead to improvements in the Type I analysis. I haven’t checked the numerology and details yet though.

20 June, 2013 at 12:43 pm

Terence Tao

Just to get started on the numerology in the critical case: it’s looking like $\sigma=1/10$ is going to be the key threshold, which upon setting $\delta \approx 0$ gives $\varpi = 1/220$ . We then have $N \approx x^{0.4}$ and $M \approx x^{0.6}$ , and $d = qr \approx x^{1/2+2\varpi}$ . $r \sim R$ is taken close to $N$ , so $q \sim Q \approx x^{0.1 +2 \varpi}$ , then $H \approx Q^2 R / M \approx x^{4\varpi}$ . In the Type I analysis, $d_1 \approx QR \approx x^{1/2+2\varpi}$ and $d_2 \approx Q^2 \approx x^{0.2 + 4 \varpi}$ . The key sum that needs estimating is $\sum_n \psi_N(n) 1_{(n,d_1)=(n+kr,d_2)=1} e_{d_1}(\frac{c_1}{n}) e_{d_2}(\frac{c_2}{n+kr})$ , where $d_1,d_2$ are coprime. The trivial bound is $N \approx x^{0.4}$ and the Weil + completion of sums bound is $(d_1 d_2)^{1/2+o(1)} \approx x^{0.35 + 3 \varpi}$ (plus an extra term which is lower order), a saving of $x^{1/20 - 3 \varpi}$ , which matches the loss $H^2 \approx x^{8\varpi}$ coming from completion of sums compounded by Cauchy-Schwarz when $\varpi=1/220$ . So any improvement of the Weil + completion of sums bound here will lead to an improvement in the Type I analysis.

We have $N \approx (d_1 d_2)^{\frac{0.4}{0.7 + 6\varpi}} = (d_1 d_2)^{0.55}$ . $0.55$ is less than $2/3$ , so I am optimistic that the methods from the previous comment will lead to some gain here. Still have to check the gory details, though…
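The numerology in this comment can be reproduced with exact fractions. The sketch below takes $\delta = 0$ and $R = N$ exactly, and reads every "$\approx$" as an equality of exponents of $x$; the function and key names are illustrative.

```python
from fractions import Fraction as F

def type1_numerology(varpi, sigma=F(1, 10)):
    """Exponents of x in the critical Type I case, assuming delta = 0 and R = N."""
    n = F(1, 2) - sigma             # N
    m = F(1, 2) + sigma             # M
    r = n                           # R taken equal to N
    q = F(1, 2) + 2 * varpi - r     # Q = x^(1/2 + 2 varpi) / R
    h = 2 * q + r - m               # H = Q^2 R / M
    d1, d2 = q + r, 2 * q           # d_1 ~ QR, d_2 ~ Q^2
    weil = (d1 + d2) / 2            # (d_1 d_2)^(1/2)
    return {
        "h": h,
        "saving": n - weil,         # trivial bound N versus the Weil bound
        "loss": 2 * h,              # H^2 lost to completion of sums and Cauchy-Schwarz
        "ratio": n / (d1 + d2),     # N = (d_1 d_2)^ratio
    }
```

With `varpi = F(1, 220)` the saving and the loss balance exactly, and the ratio is $11/20 = 0.55 < 2/3$.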

20 June, 2013 at 2:22 pm

Terence Tao

I realised that with regards to the exponential sum

$S_N(c_1,c_2,l;d_1,d_2) := \sum_n \psi_N(n) 1_{(n,d_1)=(n+l,d_2)=1} e_{d_1}( \frac{c_1}{n} ) e_{d_2}( \frac{c_2}{n+l} )$

we already have a convenient factorisation $d_1 d_2$ of the modulus available so we don’t even need any smoothness hypothesis to improve upon Corollary 10 of the post (= Lemma 11 of Zhang) in the regime of interest – Weyl differencing (conjugated by a Fourier transform) already basically improves the $(d_1 d_2)^{1/2}$ main term in Corollary 10 to $N^{1/2} d_1^{1/4} + N^{1/2} d_2^{1/2}$ in the range $d_2 \ll N \ll d_1$ , which is the range of interest here. This should propagate to a new value of $\varpi$ , but I’ll do that in the next comment. For now, the details of the above claimed bound.

I’ll need first a Weil-type bound on completed exponential sums, namely that

$| \sum_{n \in ({\bf Z}/p{\bf Z}): n \neq 0,-l} e_p( \frac{c_1}{n} + \frac{c_2}{n+l} + c_3 n)| \ll \sqrt{p}$

except in the degenerate case $c_3 = l = c_1 - c_2 = 0\ (p)$ , in which case one only gets the trivial bound of $p$ . I’m pretty sure this claim follows from the Riemann hypothesis for curves (plus some ad hoc computations in partially degenerate cases), so I’ll just take it for granted. By the Chinese remainder theorem it implies that

$| \sum_{n \in ({\bf Z}/q{\bf Z}): (n,q)=(n+l,q)=1} e_q( \frac{c_1}{n} + \frac{c_2}{n+l} + c_3 n)|$

$\ll q^{1/2+o(1)} ( c_3,l,c_1-c_2,q)^{1/2}.$ (1)

Now we return to estimation of the sum $S_N(c_1,c_2,l;d_1,d_2)$ , assuming that $d_1,d_2$ are squarefree and coprime, and that $d_2 \ll N \ll d_1$ . We follow the previous strategy of Weyl differencing conjugated by the Fourier transform. First, the Fourier transform lets us write $S_N(c_1,c_2,l;d_1,d_2)$ as

$\frac{1}{d_1 d_2} \sum_{h \in {\bf Z}/d_1 d_2{\bf Z}} \hat \psi_N(\frac{h}{d_1 d_2}) S(c_1,c_2,l,h; d_1,d_2)$

where $S(c_1,c_2,l,h;d_1,d_2)$ is the complete Kloosterman-type sum

$S(c_1,c_2,l,h;d_1,d_2) := \sum_{n \in {\bf Z}/d_1 d_2 {\bf Z}} e_{d_1}(\frac{c_1}{n}) e_{d_2}(\frac{c_2}{n+l}) e_{d_1 d_2}(hn)$

where for brevity we adopt the convention that $e_{d_1}(\frac{c_1}{n})$ vanishes when the denominator $n$ is not coprime to the modulus $d_1$ , and similarly for $e_{d_2}(\frac{c_2}{n+l})$ .

Next, we perform Weyl shifting, replacing $h$ by $h+kd_2$ for $1 \leq k \leq K$ and $K := \lfloor d_1 /N \rfloor$ . The previous expression is then equal to

$\frac{1}{K d_1 d_2} \sum_{h \in {\bf Z}/d_1 d_2{\bf Z}} \sum_{k=1}^K$

$\hat \psi_N(\frac{h+kd_2}{d_1 d_2}) S(c_1,c_2,l,h+kd_2;d_1,d_2)$ .

The $h=0$ term contributes $O( \frac{N}{d_1 d_2} (c_1,d_1) (c_2,d_2) )$ exactly as in the main post, and we now delete this term.

Now from the Chinese remainder theorem we have the twisted multiplicativity law

$S(c_1,c_2,l,h+kd_2;d_1,d_2) = e_{d_2}( - h l / d_1 ) S( h/d_2 + k, c_1; d_1)$

$S( h/d_1, c_2; d_2)$ .

Here $S(b,c;d)$ are the usual completed Kloosterman sums. Using the Weil bound on the $S( h/d_1, c_2; d_2)$ factor, we can bound the previous expression in magnitude by

$\frac{x^{o(1)}}{K d_1 d_2^{1/2}} \sum_{h \in {\bf Z}/d_1 d_2{\bf Z}: h \neq 0} (h,c_2,d_2)^{1/2} |\sum_{k=1}^K$

$\hat \psi_N(\frac{h+kd_2}{d_1 d_2}) S(h/d_2+k,c_1;d_1)|.$

From the rapid decay of $\hat \psi_N$ we may localise $h = O( x^{o(1)} d_1 d_2 / N )$ . We have

$\sum_{h = O( x^{o(1)} d_1 d_2 / N ): h \neq 0} (h,c_2,d_2) \ll x^{o(1)} d_1 d_2 / N$

so by Cauchy-Schwarz we may bound the previous expression by

$\frac{x^{o(1)}}{K d_1^{1/2} N^{1/2}} ( \sum_{h \in {\bf Z}/d_1 d_2{\bf Z}} |\sum_{k=1}^K$

$\hat \psi_N(\frac{h+kd_2}{d_1 d_2}) S(h/d_2+k,c_1;d_1)|^2)^{1/2}.$

Expanding out, we encounter a diagonal term $k=k'$ which, if one applies the Weil bound on Kloosterman sums together with the decay bounds on $\hat \psi_N$ , eventually gives a bound of $x^{o(1)} K^{-1/2} (d_1 d_2)^{1/2} = x^{o(1)} N^{1/2} d_2^{1/2}$ (gaining $K^{-1/2}$ over the “trivial” bound, which is typical for a diagonal contribution in Weyl differencing). Now we restrict to the off-diagonal terms $k \neq k'$ . Performing the $h$ summation first using Fourier analysis, one eventually arrives at

$\frac{x^{o(1)} d_2^{1/2}}{K N^{1/2}} | \sum_{n,n': n = n'\ (d_2)} \psi_N(n) \overline{\psi_N}(n')$

$\sum_{1 \leq k,k' \leq K: k \neq k'} \sum_{n_1,n'_1 \in {\bf Z}/d_1{\bf Z}: n-n' = n_1-n'_1\ (d_1)}$

$e_{d_1}( \frac{c_1}{n_1} + k n_1 - \frac{c_1}{n'_1} - k' n'_1 ) |^{1/2}.$

By (1), the innermost summation (over $n_1,n'_1$ ) is $O( x^{o(1)} d_1^{1/2} (k-k',d_1)^{1/2})$ . Inserting this bound and performing all the summations, one eventually arrives at $O( x^{o(1)} N^{1/2} d_1^{1/4} )$ . Thus one has shown the following improvement to Corollary 10:

$|S_N(c_1,c_2,l; d_1,d_2)| \ll x^{o(1)} ( N^{1/2} d_1^{1/4} + N^{1/2} d_2^{1/2}$

$+ \frac{N}{d_1 d_2} (c_1,d_1) (c_2,d_2))$

when $d_1,d_2$ are squarefree coprime and $d_2 \ll N \ll d_1$ .

It is possible that one could improve upon this bound by using smoothness and working with a different factorisation than the factorisation $d_1 d_2$ that is naturally provided in Corollary 10, but this would be a bit messier and I will avoid doing that for now.

20 June, 2013 at 4:13 pm

Terence Tao

Incidentally, I now have a reference for the Weil type bound for exponentials of rational functions that I needed: Perelmuter (1969), http://www.ams.org/mathscinet-getitem?mr=241424 . (This type of estimate can also be proven by Stepanov’s method: see Cochrane-Pinner (2006), http://www.ams.org/mathscinet-getitem?mr=2195926 .)

22 June, 2013 at 12:20 am

v08ltu

I am confused when you say “Next, we perform Weyl shifting, replacing h by h+kd_2” as the next display seems (twice) to shift h by $h+kd_1$ . One of these is corrected(?) a few lines later, but not the other?

22 June, 2013 at 1:00 am

v08ltu

Just thinking aloud, if one possessed say higher-dim Deligne at hand, would Holder win more than Cauchy with this shift? This is often feasible in Burgess analysis. (However, Ping Xi seemed to max out at $r=2$ .)

22 June, 2013 at 3:22 am

v08ltu

I am not so sure that this divisibility is something like a red herring, for now. If you plain had $d_2=1$ you would still win via the Weyl shift in this $(N,d)$ range, according in your notes as somewhat like to the method of FI (they only fail to achieve much enough gain in Type III methinks? ). Though of course the divisibility and Zhang shift should be induced for greater gains if possible. But really here, you “just” worked a short Kloosterman bound, if I am correct, with the $d_2$ divisor effect largely negligent.

22 June, 2013 at 6:49 am

Terence Tao

Sorry about that, the shift should be by kd_2 throughout, I edited the comment accordingly.

I’m almost done with a post detailing these bounds a bit more carefully (in particular the refinement to Zhang Lemma 11 in the comments below did not quite treat some lower order terms properly, though fortunately this did not affect the dominant terms and ultimately did not affect $\varpi,\delta$ numerology). The case of short character sums is slightly simpler than that of short Kloosterman sums so I am giving that example instead in the post.

Certainly there is scope to play with these methods further and get even better bounds, e.g. by combining carefully chosen Weyl differencing with a Burgess argument.

20 June, 2013 at 3:26 pm

Terence Tao

Now it’s time to trace this improvement of Corollary 10 back through the rest of the argument. Let me first note that the new bound of $x^{o(1)} ( N^{1/2} d_1^{1/4} + N^{1/2} d_2^{1/2} + \frac{N}{d_1 d_2} (c_1,d_1) (c_2,d_2) )$ was only proven in the range $d_2 \ll N \ll d_1$ , but trivially holds outside this range also: when $N \gg d_1$ then $N^{1/2} d_2^{1/2} \gg d_1^{1/2} d_2^{1/2}$ and one can use the existing Corollary 10, while for $N \ll d_2$ one has $N^{1/2} d_2^{1/2} \gg N$ and one can use the trivial bound of $O(N)$ . So the new bound in fact holds universally in $N$ .

In the post, if we estimate the left-hand side of (45) using the new version of Corollary 10, we obtain an upper bound of

$\ll x^{o(1)} ( N^{1/2} d_1^{1/4} + N^{1/2} d_2^{1/2} + \frac{N}{d_1 d_2} (c_1,d_1) (c_2,d_2) )$

which by using the bounds given near the end of Section 7 come out to

$\ll x^{o(1)} ( N^{1/2} Q^{1/4} R^{1/4} + N^{1/2} Q + H Q R^{-1} N )$

giving an improvement to (45). The off-diagonal contribution to (44) is then

$\ll x^{o(1)} ( H^2 N^{1/2} Q^{9/4} R^{1/4} + H^2 N^{1/2} Q^3 + H^3 Q^3 R^{-1} N )$

so we win when

$H^2 N^{1/2} Q^{9/4} R^{1/4} \ll x^{-2\varepsilon+o(1)} Q^2 N$

$H^2 N^{1/2} Q^3 \ll x^{-2\varepsilon+o(1)} Q^2 N$

and

$H^3 Q^3 R^{-1} N \ll x^{-2\varepsilon+o(1)} Q^2 N$ .

Using (37), (9) these become

$(QR)^{17/4} \ll x^{2-4\varepsilon+o(1)} N^{1/2} (R/N)^2$

$(QR)^5 \ll x^{2-4\varepsilon+o(1)} N^{3/2} (R/N)^3$

$(QR)^7 \ll x^{3-5\varepsilon+o(1)} N^2 (R/N)^5$

which from (24), (23), (10) reduces to

$\frac{17}{8} + \frac{17}{2} \varpi < \frac{9}{4} - \frac{1}{2} \sigma - 2\mu - 2 \delta$

$\frac{5}{2} + 10 \varpi < \frac{11}{4} - \frac{3}{2} \sigma - 3 \mu - 3 \delta$

$\frac{7}{2} + 14 \varpi < 4 - 2 \sigma - 5 \mu - 5 \delta$

which we rearrange as

$17 \varpi + \sigma + 4 \mu + 4 \delta < \frac{1}{4}$

$20 \varpi + 3 \sigma + 6 \mu + 6 \delta < \frac{1}{2}$

$14 \varpi + 2 \sigma + 5 \mu + 5 \delta < \frac{1}{2}.$

The third condition is implied by the second and may be dropped. These conditions replace (41) in Theorem 13 and are superior basically because they allow $\sigma$ to be as large as 1/6 now instead of 1/8.
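The passage from the three power-of-$x$ criteria to the linear conditions on $\varpi, \sigma, \mu, \delta$ can be checked mechanically. The sketch below assumes the extremal shapes $QR = x^{1/2+2\varpi}$ , $N = x^{1/2-\sigma}$ and $R/N = x^{-\mu-\delta}$ for (24), (10) and (23), and ignores $\varepsilon$ and $o(1)$ terms; these shapes are inferred from the displayed exponents, and the helper name is hypothetical.

```python
from fractions import Fraction as F

def linear_condition(e, c, a, b, scale=1):
    """Reduce (QR)^e << x^c N^a (R/N)^b to a linear condition.

    Returns (varpi, sigma, mu, delta, rhs) meaning
        varpi_coeff*varpi + sigma_coeff*sigma + mu_coeff*mu + delta_coeff*delta < rhs,
    obtained from e*(1/2 + 2 varpi) < c + a*(1/2 - sigma) - b*(mu + delta)
    and then multiplied through by `scale`.
    """
    e, c, a, b = F(e), F(c), F(a), F(b)
    rhs = c + a / 2 - e / 2
    return tuple(scale * v for v in (2 * e, a, b, b, rhs))

first = linear_condition(F(17, 4), 2, F(1, 2), 2, scale=2)
second = linear_condition(5, 2, F(3, 2), 3, scale=2)
third = linear_condition(7, 3, 2, 5)
```

Under these assumptions the three tuples reproduce the coefficients of the three rearranged conditions above, and each coefficient of the third is at most the corresponding coefficient of the second, which is why the third can be dropped.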

Following the arguments back to Theorem 3, we now see that we may replace (7) by the two essentially weaker conditions

$17 \varpi + 4\delta + \sigma < \frac{1}{4}$ (1′)

and

$20 \varpi + 6\delta + 3\sigma < \frac{1}{2}$ . (2′)

From the Type II analysis we have an additional constraint

$37 \varpi + 5 \delta < \frac{1}{4}$ (3′)

which could probably be improved with the new version of Corollary 10, but we won't do so here.

On the other hand, the Type III analysis currently works when

$\frac{13}{2} (\frac{1}{2} + \sigma) > 4 + 16 \varpi + \delta$ (4′)

and we also need the combinatorial constraint

$\sigma > 1/10$ . (5′)

Playing (5′) off of (1′) and (2′) gives the constraints

$17 \varpi + 4 \delta < \frac{1}{5}$ (6′)

$20 \varpi + 6 \delta < \frac{1}{5};$ (7′)

(6′) is implied by (7′) and so may be dropped. Playing (4′) off of (1′), (2′) gives the additional constraints

$253 \varpi + 54 \delta < \frac{7}{4}$ (8′)

and

$178 \varpi + 52 \delta < 1$ . (9′)

(3′), (7′) and (8′) are implied by (9′) and may be dropped. So I claim that $MPZ'[\varpi,\delta]$ holds whenever

$178 \varpi + 52 \delta < 1$ .

But of course I should double-check all the computations here, there is plenty of scope for numerical error.
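In the spirit of that double-check, here is a small sketch that eliminates $\sigma$ between the upper bounds (1′), (2′) and the lower bounds (4′), (5′), taking those four conditions exactly as stated above; the helper names are hypothetical.

```python
from fractions import Fraction as F

def sigma_lower(k, c, w, d):
    """From k*(1/2 + sigma) > c + w*varpi + d*delta, return (c0, cw, cd)
    with sigma > c0 + cw*varpi + cd*delta."""
    k = F(k)
    return (F(c) / k - F(1, 2), F(w) / k, F(d) / k)

def eliminate_sigma(upper, lower, rhs):
    """upper = (w, d, s, t) encodes w*varpi + d*delta + s*sigma < t.
    Returns (A, B, rhs) with A*varpi + B*delta < rhs after inserting the lower bound."""
    w, d, s, t = (F(v) for v in upper)
    c0, cw, cd = lower
    scale = F(rhs) / (t - s * c0)
    return ((w + s * cw) * scale, (d + s * cd) * scale, F(rhs))

def implied_by(cond, main):
    """Sufficient test, for varpi, delta >= 0, that main = (A0, B0, T0) implies cond."""
    (A, B, T), (A0, B0, T0) = cond, main
    return max(F(A) / F(A0), F(B) / F(B0)) * F(T0) <= F(T)

TYPE_I_A = (17, 4, 1, F(1, 4))                  # (1')
TYPE_I_B = (20, 6, 3, F(1, 2))                  # (2')
TYPE_III = sigma_lower(F(13, 2), 4, 16, 1)      # (4')
COMBINATORIAL = (F(1, 10), F(0), F(0))          # (5')
```

Taken at face value, this elimination reproduces (7′) and (8′), but returns $3/20$ rather than $1/5$ on the right of (6′), and $42$ rather than $52$ as the $\delta$ coefficient in (9′). Neither difference affects the conclusion, since $178 \varpi + 52 \delta < 1$ still implies every constraint computed this way, as well as (3′).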

20 June, 2013 at 3:31 pm

Terence Tao

Actually, one should be able to do better than this, because I was using the older Type III estimate, not the newer Type III estimate that also takes advantage of $\alpha$ averaging. Recalculating…

20 June, 2013 at 3:47 pm

Terence Tao

Recomputing the previous calculation with the improved Type III analysis. As stated in

Estimation of the Type III sums

this improvement replaces the condition

$\frac{13}{2} (\frac{1}{2}+\sigma) > 4 + 16 \varpi + \delta$ (4′)

by the combination of

$5 (\frac{1}{2}+\sigma) > 3 + 16 \varpi + \delta$ (1”)

and

$\frac{9}{2} (\frac{1}{2} + \sigma) > \frac{5}{2} + 10 \varpi + \delta$ (2”).

So now we need to play (1”), (2”) off of (1′), (2′) to replace (8′), (9′) with the new constraints

$101\varpi + 21 \delta < \frac{3}{4}$ (3'')

$148 \varpi + 33 \delta < 1$ (4'')

$173 \varpi + 38 \delta < \frac{7}{4}$ (5'')

$80 \varpi + 20 \delta < 1$ . (6'')

The constraint (4'') implies (3''), (5''), (6''), (3') (barely!), and (7'), so I now claim (somewhat nervously) that $MPZ'[\varpi,\delta]$ holds whenever

$148 \varpi + 33 \delta < 1$ .

This is extremely tentative though; it is quite likely that there is at least one typo in the above (but hopefully the typos only affect secondary constraints and not the primary one). I will have to think about how to write things up so that it is easier to chase through the numerology. (I guess I will have to write up another blog post to try to describe the latest version of the argument.)
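The implications between the linear constraints claimed in this comment and the previous one can be checked mechanically. The following is a minimal sketch, not part of the original argument: it assumes $\varpi, \delta \geq 0$ and uses the observation that if $a\varpi + b\delta < c$ then $a'\varpi + b'\delta \leq \max(a'/a, b'/b)(a\varpi + b\delta)$, so it suffices to test the conclusion at the two vertices $(c/a, 0)$ and $(0, c/b)$. The helper name `implies` and the variable names are made up for illustration.

```python
from fractions import Fraction as F

def implies(hyp, concl):
    """True if a*w + b*d < c (with w, d >= 0) forces a2*w + b2*d < c2.

    Constraints are triples (a, b, c) with a, b, c > 0.  Because the
    hypothesis is strict, a non-strict inequality at the two vertices
    (c/a, 0) and (0, c/b) is enough.
    """
    a, b, c = hyp
    a2, b2, c2 = concl
    return a2 * c / a <= c2 and b2 * c / b <= c2

# Constraints as (coefficient of varpi, coefficient of delta, right-hand side).
c3p = (37, 5, F(1, 4))      # (3')
c6p = (17, 4, F(1, 5))      # (6')
c7p = (20, 6, F(1, 5))      # (7')
c8p = (253, 54, F(7, 4))    # (8')
c9p = (178, 52, F(1))       # (9')
c3pp = (101, 21, F(3, 4))   # (3'')
c4pp = (148, 33, F(1))      # (4'')
c5pp = (173, 38, F(7, 4))   # (5'')
c6pp = (80, 20, F(1))       # (6'')

print("(7') implies (6'):", implies(c7p, c6p))
for name, con in {"(3')": c3p, "(7')": c7p, "(8')": c8p}.items():
    print(f"(9') implies {name}:", implies(c9p, con))
for name, con in {"(3'')": c3pp, "(5'')": c5pp, "(6'')": c6pp,
                  "(3')": c3p, "(7')": c7p}.items():
    print(f"(4'') implies {name}:", implies(c4pp, con))

# The "barely!" case: at varpi = 1/148, delta = 0 the form 37*varpi equals 1/4 exactly.
print("slack of (3') at the vertex:", c3p[2] - F(37) * c4pp[2] / c4pp[0])
```

This only verifies the final bookkeeping among the listed constraints; it says nothing about whether the constraints themselves were derived correctly from (1′), (2′), (1”), (2”).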

20 June, 2013 at 3:52 pm

David Roberts

So what $k_0$ do we get from these two new bounds on $\varpi$?

20 June, 2013 at 4:30 pm

Terence Tao

Have to run, but I can do a quick projection: $\varpi$ was previously close to 1/348, but with the latest bound (if it holds up) it will become close to 1/148 (since we now know that $\delta$ is essentially negligible), a multiplicative gain of 2.35. Ever since the Bessel function technology was introduced, $k_0$ scales like $\varpi^{-3/2}$ , so it should improve by about $(2.35)^{3/2}$ , so from 5,452 to about 1,512. For this range of $k_0$ we now have tables in place for the best known $H$ for such a $k_0$; specifically, $k_0 =1512$ gives $H=12432$ , see http://www.opertech.com/primes/webdata/k1000-1999/k1500-1599/ . But this is all just a rough calculation…
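The back-of-the-envelope projection above can be reproduced in a few lines. This is only a sketch of the stated heuristic ($k_0 \propto \varpi^{-3/2}$, with $\delta$ ignored); the variable names are illustrative and the scaling law is taken from the comment, not derived here.

```python
# Heuristic from the comment: k_0 scales like varpi^(-3/2).
old_varpi = 1 / 348   # "previously close to 1/348"
new_varpi = 1 / 148   # projected new value, treating delta as negligible
old_k0 = 5452

gain = new_varpi / old_varpi            # multiplicative gain in varpi
projected_k0 = old_k0 / gain ** 1.5     # k_0 shrinks by gain^(3/2)

print(f"gain in varpi: {gain:.2f}")
print(f"projected k_0: {projected_k0:.0f}")
```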

21 June, 2013 at 5:31 am

v08ltu

I have rederived (mostly independently) 1′ and 3”, and 5” using 2” (which I never checked) from your “new bound” (which I have yet to check). I haven’t digested the post enough to figure out the $\mu$ -numerology.

21 June, 2013 at 6:25 am

v08ltu

I’ve lost the net thread of the arguments, but I do think that $148\omega+33\delta\le 1$ is correct arithmetic. What exactly is the purpose of this $\mu$ (besides to set it equal to 0)?

21 June, 2013 at 6:43 am

v08ltu

If it all works, $k_0=1471$ with $\omega=1/148-7/400000, \delta'=1/170,A=2000$ .

21 June, 2013 at 6:45 am

v08ltu

$k_0=1470$ with $\omega=1/148-5.9/400000, \delta'=1/245,A=2100$ .

21 June, 2013 at 7:24 am

Terence Tao

Combining this with the newly set up systematic tables at http://www.opertech.com/primes/webdata/ and http://math.mit.edu/~drew/records2.txt , this gives H = 12,042.

Double-checked your computation using Maple (taking advantage of Gergely and Eytan’s exact formulae for Bessel integrals). It checks out: $2\kappa_1+2\kappa_2+2\kappa_3$ has to be bounded by $0.132 \times 10^{-5}$ , but is instead about $0.123 \times 10^{-5}$ . As usual, $\kappa_1 \approx 0.603 \times 10^{-6}$ dominates, with $\kappa_2 \approx 0.355 \times 10^{-8}$ well behind and $\kappa_3 \approx 0.931 \times 10^{-392}$ again negligible.

varpi := 1/148 - 5.9/400000;
k0 := 1470;
deltap := 1/245;
A := 2100;

delta := (1 - 148 * varpi) / 33;
theta := deltap / (1/4 + varpi);
thetat := (deltap - delta + varpi) / (1/4 + varpi);
deltat := delta / (1/4 + varpi);
j := BesselJZeros(k0-2,1);
eps := 1 - j^2 / (k0 * (k0-1) * (1+4*varpi));
kappa1 := int( (1-t)^((k0-1)/2)/t, t = theta..1, numeric);
kappa2 := (k0-1) * int( (1-t)^(k0-1)/t, t=theta..1, numeric);
e := exp( A + (k0-1) * int( exp(-A*t)/t, t=deltat..theta, numeric ) );
# using Gergely's exact expression for denominator
gd := (j^2/2) * BesselJ(k0-3,j)^2;
# using Eytan's exact expression for numerator
tn := sqrt(thetat)*j;
gn := (tn^2/2) * (BesselJ(k0-2,tn)^2 - BesselJ(k0-3,tn)*BesselJ(k0-1,tn));
kappa3 := (gn/gd) * e;
eps2 := 2*(kappa1+kappa2+kappa3);
# we win if eps2 < eps
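For readers without Maple, the two dominant quantities $\kappa_1$ and $\kappa_2$ can be cross-checked with the Python standard library alone. The sketch below is an assumption-laden partial port: it uses a hand-written composite Simpson rule instead of Maple's numeric integrator, truncates the integrals at $t = 0.25$ (where the integrands are already far below double precision), and omits `eps` and `kappa3` because they need Bessel functions that the standard library does not provide. The names `simpson` and `t_max` are illustrative.

```python
def simpson(f, a, b, n=50_000):
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

# Parameters copied from the Maple script above.
varpi = 1 / 148 - 5.9 / 400000
k0 = 1470
deltap = 1 / 245
theta = deltap / (1 / 4 + varpi)

# Truncation point (an assumption of this sketch): beyond t = 0.25 the
# integrands are smaller than 0.75**734, which is negligible here.
t_max = 0.25

kappa1 = simpson(lambda t: (1 - t) ** ((k0 - 1) / 2) / t, theta, t_max)
kappa2 = (k0 - 1) * simpson(lambda t: (1 - t) ** (k0 - 1) / t, theta, t_max)

print(f"kappa1 ~ {kappa1:.3e}")   # the comment above quotes about 0.603e-6
print(f"kappa2 ~ {kappa2:.3e}")   # the comment above quotes about 0.355e-8
```

Since $\kappa_3$ is the sensitive term in this thread, a check of this kind only confirms the easy part of the computation.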

21 June, 2013 at 7:58 am

v08ltu

If I use these better bounds on $\kappa_3$ , I think I get $k_0=1465$ for $\omega=1/148-1/10^7,\delta'=1/200,A=2400$ . This is best possible, even if the $\kappa,\delta$ did not exist.

21 June, 2013 at 8:12 am

Terence Tao

Unfortunately Maple is not confirming these numbers. $2\kappa_1+2\kappa_2+2\kappa_3$ has to not exceed $0.301 \times 10^{-6}$ . We have $\kappa_1 = 0.361 \times 10^{-7}$ and $\kappa_2 = 0.153 \times 10^{-10}$ which is fine, but I’m getting $\kappa_3 = 0.159 \times 10^{2589}$ which is not fine. The exponential factor $\exp( A + (k_0-1) \int_{\tilde \delta}^\theta e^{-At}/t\ dt )$ is coming out as $0.529 \times 10^{4159}$ , while the Bessel numerator is $0.242 \times 10^{-1568}$ and the denominator is $80.3$ . Could there be a typo in your choice of parameters? For instance A does not appear to be optimal here (I accidentally entered in A=2400 and got a better bound on the exponential factor).

21 June, 2013 at 8:16 am

v08ltu

It seems to me that the numer/denom of the Bessel-integral quotient is so small that the $A$ parameter ends up being pointless. You can take $A=0$ if you want, I think. Although a diversion, $A\approx3.35$ or thereabouts seems to minimize $\kappa_3$ .

21 June, 2013 at 8:19 am

v08ltu

OK, I found my typo, e(-A*t/t) instead of e(-A*t)/t… Sorry

21 June, 2013 at 7:01 am

Terence Tao

Great, thanks!

$\mu$ is the quantity such that the factor $r$ of $d$ lies in the range

$x^{-\delta} R \leq r \leq R$

where $R := x^{-\mu} N$ . In (7.1), (7.2) of Zhang, $\mu$ is set to be $\varepsilon$ (i.e. infinitesimal) in the Type I case and equal to $3\varpi$ in the Type II case (actually $2\varpi+\varepsilon$ would have worked here). So, yes, $\mu$ can be ignored for the Type I analysis (although it plays a more non-trivial, but still minor, role in the Type II analysis).

21 June, 2013 at 8:22 am

v08ltu

OK, try this again… $\omega=1/148-24/4000000,\delta'=1/200,k_0=1467,A=1300$

21 June, 2013 at 8:28 am

Terence Tao

This one I can confirm :-). $2\kappa_1+2\kappa_2+2\kappa_3$ has to be at most $0.5922 \times 10^{-6}$ , but is $0.707 \times 10^{-7}$ , with the overly sensitive $\kappa_3$ now being $0.453 \times 10^{-24}$ . From tables, $k_0=1467$ gives $H=12012$ .

22 June, 2013 at 12:36 am

Anonymous

I am a physicist, but I am interested in your project. Since $\kappa_i$ ($i=1,2,3$) is very small, can you rescale these quantities by some characteristic quantity, so that the rescaled $\kappa_i$ is not small and the corresponding round-off error is much reduced?

22 June, 2013 at 8:03 am

Terence Tao

Unfortunately, the $\kappa$ quantities are dimensionless and cannot be rescaled. (Actually for number theory there is not much dimensional analysis available in general, because the fundamental domain of study is the integers (or natural numbers), which do not have any natural scaling (or, if you wish, it has a natural unit, namely 1).)

23 June, 2013 at 3:27 am

Tom

Gentlemen, H = 12012 and still falling. How low can you go with these methods?

23 June, 2013 at 9:47 am

Aubrey de Grey

Let’s at least hope they get to 11832, which in an almost completely meaningless sense would be half way from Zhang to the actual twin primes conjecture!

23 June, 2013 at 12:25 pm

Tom

2^26 = 67108864, 2^13 = 8192, 2^1 = 2.

23 June, 2013 at 3:07 pm

Aubrey de Grey

Not sure why there is no “Reply” link at the bottom of Tom’s reply to mine, but: Tom, if your comment is meant to imply that my mention of 11832 as the symbolic half-way mark was incorrect, please note that 13+13-26 is 0, not 1. 11832 is simply sqrt(2*70000000).

22 June, 2013 at 7:39 am

Bounding short exponential sums on smooth moduli via Weyl differencing | What's new

[…] Proof: See Lemma 7 of this previous post. […]

23 June, 2013 at 9:14 pm

The distribution of primes in densely divisible moduli | What's new

[…] for instead of . Inserting Theorem 4 into the Pintz sieve from this previous post gives for (see this blog comment), which when inserted in turn into newly set up tables of narrow prime tuples gives infinitely many […]

23 June, 2013 at 10:20 pm

Terence Tao

In order to try to stop the discussion from fragmenting too much, I’m rolling both this Type I/II thread and the Type III thread into a single new thread at

The distribution of primes in densely divisible moduli

which has the latest version of both arguments that incorporate all of the recent improvements to Zhang’s argument, in particular the use of the q-analogue of the van der Corput lemma to improve the exponential sums in the Type I/II analysis, and the use of alpha-averaging to improve the Type III analysis.

25 June, 2013 at 10:42 am

Pace Nielsen

General question to anyone interested.

In Lemma 6, how much can one weaken the assumption that $\psi_M$ is smooth? In particular, for what types of convolutions $\alpha_1\ast \alpha_2 \ast \cdots \ast \alpha_n$ of functions $\alpha_i$ coming from the Heath-Brown identity does Lemma 6 not apply?

25 June, 2013 at 12:20 pm

Terence Tao

This is a good question! I think there are partial analogues of Lemma 6 in some cases but the bounds are weaker.

Roughly speaking, when dealing with a sum such as $\sum_n \psi_M(n) f(n)$ where $f$ is periodic of some period $d > N$ , one should think of $\psi_M$ as having a Fourier expansion roughly of the form

$\psi_M(n) \approx \frac{M}{d} \sum_{h = O(d/M)} e_d( hn ).$ (*)

Here I am deliberately being vague as to what summation over $h=O(d/M)$ means. If one uses a sharp cutoff $-d/M \leq h \leq d/M$ then one gets a Dirichlet kernel at scale $M$ rather than a smooth compactly supported cutoff $\psi_M$ ; if one inserts a Fejér weight one gets a Fejér kernel, and similarly for de la Vallée-Poussin kernels, etc. To get a compactly supported kernel $\psi_M$ one needs a weight which is not completely localised in the range $h=O(d/M)$ , but is instead rapidly decreasing outside of this range.

If one pretends that (*) is a rigorous formula then one easily gets Lemma 6 as a consequence (and the real proof of Lemma 6 basically proceeds through a rigorous analogue of (*)).
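The last point of the previous paragraph (a smooth compactly supported $\psi_M$ has frequency weights that decay rapidly beyond $h = O(d/M)$) can be seen numerically. The sketch below is a toy illustration under stated assumptions: the bump $\exp(-1/(1-x^2))$ is one arbitrary choice of smooth cutoff, the values $d = 1009$, $M = 50$ are hypothetical, and the sharp physical-space cutoff is included only for contrast (it is the dual of the Dirichlet-kernel remark, not something taken from the text).

```python
import cmath
import math

def bump(x):
    """A smooth cutoff supported on (-1, 1); one standard choice, assumed here."""
    return math.exp(-1.0 / (1.0 - x * x)) if abs(x) < 1 else 0.0

def coeff(weights, h, d):
    """c_h = sum_n w(n) e_d(-h n) for weights given as a dict n -> w(n)."""
    return sum(w * cmath.exp(-2j * math.pi * h * n / d) for n, w in weights.items())

d, M = 1009, 50   # hypothetical modulus and scale, so d/M is about 20

smooth = {n: bump(n / M) for n in range(-M, M + 1)}   # plays the role of psi_M
sharp = {n: 1.0 for n in range(-M, M + 1)}            # indicator of [-M, M]

def tail_ratio(weights, factor=10):
    """Largest |c_h| / |c_0| over frequencies factor*d/M < h <= d/2."""
    c0 = abs(coeff(weights, 0, d))
    start = int(factor * d / M) + 1
    return max(abs(coeff(weights, h, d)) for h in range(start, d // 2 + 1)) / c0

smooth_tail = tail_ratio(smooth)
sharp_tail = tail_ratio(sharp)
print(f"smooth cutoff, relative tail: {smooth_tail:.2e}")
print(f"sharp cutoff,  relative tail: {sharp_tail:.2e}")
```

In this toy setup the smooth cutoff has much smaller coefficients at frequencies beyond $10\,d/M$ than the sharp one, which is the qualitative content of the heuristic (*).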

Anyway, now suppose we replace $\psi_M$ by $\alpha * \psi_{N_2}$ where $M = N_1 N_2$ and $\alpha$ is some coefficient sequence at scale $N_1$ . The same heuristic that gives (*) also gives

$\psi_{N_2}(n) \approx \frac{N_2}{d} \sum_{h = O(d/N_2)} e_d( hn ).$

and hence

$\alpha * \psi_{N_2}(n) \approx \frac{N_2}{d} \sum_{m = O(N_1)} \alpha(m) \sum_{h = O(d/N_2)} e_d( hn / m )$

at least if we restrict $\alpha$ to those integers coprime to $d$ for simplicity. If $N_1 \ll N_2 \ll d$ , then the set of fractions $h/m\ (d)$ in ${\bf Z}/d{\bf Z}$ with $m = O(N_1)$ and $h = O(d/N_2)$ has multiplicity $d^{o(1)}$ (from the divisor bound) and is thus spread out over a set of cardinality $\approx N_1 d / N_2 = N_1^2 d / M$ . So one can get a partial analogue of Lemma 6 in this case, in which one sums over $N_1^2$ times as many frequencies $h$ , but divides by an additional $N_1$ , for a net “loss” of $N_1$ over Lemma 6 applied to $\psi_M$ (or one can think of this assertion as an averaged version of Lemma 6 applied to $\psi_{N_2}$ , in which case one is no longer losing anything and is in fact gaining (in principle, at least) because of the additional frequency averaging).

Incidentally this sort of “averaged completion of sums” shows up in the Type III analysis, and is the main source of our current improvement over Zhang’s Type III bound.
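The count of fractions $h/m\ (d)$ used above can be tested on a small example. The following sketch makes several simplifying assumptions that are not in the original: $d$ is a hypothetical prime, only positive $h$ and $m$ are used, and `H` stands in for the range $d/N_2$. Here $h/m$ means $h \bar{m}$ modulo $d$, computed with the three-argument `pow` (Python 3.8 or later).

```python
from fractions import Fraction

d = 10007          # hypothetical prime modulus
N1, H = 20, 50     # m in [1, N1], h in [1, H]; H plays the role of d / N_2

# Residues h * m^{-1} mod d, i.e. the "fractions h/m (d)" of the text.
residues = {h * pow(m, -1, d) % d
            for m in range(1, N1 + 1) for h in range(1, H + 1)}

# Honest rational numbers h/m, for comparison.
rationals = {Fraction(h, m)
             for m in range(1, N1 + 1) for h in range(1, H + 1)}

print("pairs (h, m):           ", N1 * H)
print("distinct residues mod d:", len(residues))
print("distinct rationals h/m: ", len(rationals))
```

Because $2 N_1 H < d$ here, a congruence $h m' = h' m\ (d)$ forces equality of integers, so distinct residues correspond exactly to distinct rationals, and a positive proportion of the $N_1 H$ pairs survive. This matches the heuristic that the fractions occupy a set of size comparable to $N_1 d / N_2$, up to the multiplicity factor.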

28 June, 2013 at 11:53 pm

சொல்வனம் » பகா எண் இடைவெளிகளின் எல்லைகள் – யீடாங் சாங், இருளைப் பிளந்த மின்னல் கீற்று

[…] https://terrytao.wordpress.com/2013/06/12/estimation-of-the-type-i-and-type-ii-sums/#comment-235545 […]

29 June, 2013 at 5:22 am

Gergely Harcos

A comment and some typos:

1. Before Lemma 7 it should be emphasized that an expression $e_q(\frac{a}{b})$ always means $e_q(a\bar{b})$ . This is because it is customary to use the notation $e_q(x)$ for any $x\in\mathbb{R}$ , in which case division is meant in the ordinary sense. For example, without this warning, $e_{q_1}( \frac{b_1 h}{q_2} ) e_{q_2}( \frac{b_2 h}{q_1} )$ in Lemma 7 could be mistaken for $e_{q_1q_2}( b_1 h ) e_{q_1q_2}( b_2 h )$ .

2. In the proof of Lemma 7, “ $a$ mod $h$ ” should be “ $a$ mod $q_1q_2$ ”, and $e_d( b_1 q_2 \bar{q_2} + b_2 q_1 \bar{q_1})$ should be $e_d(b_1 h q_2 \bar{q_2} + b_2 h q_1 \bar{q_1})$ .

3. In the second display below (14), $\sum_{m = a_i\ (d)}$ should be $1_{m=a_i\ (d)}$ . In the next display, the factor $\frac{1}{d}$ is missing.

4. In the fifth display below (14), $\hat\psi( M + \frac{Mh}{d} )$ should be $\hat\psi( Mn + \frac{Mh}{d} )$ .

5. In (29), $n=a_r$ should be $mn=a_r$ .

6. In the fourth display below (34), $q_0^2$ should be $q_0^{-2}$ .

[Corrected, thanks – T.]

30 June, 2013 at 12:39 pm

Bounded gaps between primes (Polymath8) – a progress report | What's new

[…] sketch of Zhang’s argument for establishing Type I estimates (details may be found at these two posts). It is based on previous arguments of Bombieri, Friedlander, and Iwaniec, relying on […]

30 June, 2013 at 12:52 pm

Gergely Harcos

Your new blog entry tells me you arrived safely. Welcome! Two small comments and some typos:

1. In the proof of Lemma 5, I would emphasize for the sake of the reader that the vanishing of $\sum_n \psi(n/N) \chi\overline{\chi'}(n)$ follows by representing the nontrivial character $\chi\overline{\chi'}(n)$ as a linear combination of $e_{qq'}(rn)$ with $r\not\equiv 0\ (qq')$ , and then applying Poisson summation for each subsum $\sum_n \psi(n/N) e_{qq'}(rn)$ .

2. The inequality (16) and the preceding text might give the impression that it covers Theorem 8 for $m=q=p^j$ and $(ab,p)=1$ . This is only true for $j=1$ , since otherwise summing over $({\bf Z}/q{\bf Z})^\times$ is different from summing over ${\bf F}_q^\times$ . So I would add that for $j\geq 2$ there is an elementary treatment that goes back to Salié.

3. In Theorem 4, $x \ll M,N \ll x$ should be $x \ll MN \ll x$ .

4. In the last display of the proof of Lemma 5, $\sum_{\chi'\ (q)}^*$ should be $\sum_{\chi'\ (q')}^*$ , and $\overline{c'_{\chi}}$ should be $\overline{c_{\chi'}}$ .

5. In (12) and the next display, $\alpha\beta$ should be $\alpha\ast\beta$ .

6. In the second display below (12), $\chi(m)$ and $\chi(n)$ should be $\psi(m)$ and $\psi(n)$ .

7. In the proof of Theorem 8, ${\bf F}_q^n$ should be ${\bf F}_{q^n}$ : 3 occurrences.

8. In the last display of the proof of Theorem 8, $\sum_\psi \alpha_\psi^n +\beta_\psi^n$ should be $\sum_{\psi\neq\psi_0} (\alpha_\psi^n +\beta_\psi^n)$ .

9. In the first display of the proof of Lemma 9, $\frac{b_2}{n}$ should be $\frac{b_2}{n+l}$ .

10. In the sixth display of the proof of Lemma 9, $e_{t_1}( \frac{b_2}{n+l} + m_2 n )$ should be $e_{t_2}( \frac{b_2}{n+l} + m_2 n )$ .

11. In the proof of Corollary 10, the conditions $(n,d_1)=(n+l,d_2)=1$ are missing under the three sums $\sum_{n \in {\bf Z}/d{\bf Z}}$ .

12. In the second display of the proof of Corollary 10, the factor $(d_1,d_2)$ should be omitted.

[Corrected, thanks – T.]

1 July, 2013 at 11:59 am

Gergely Harcos

It seems that #6 has not been implemented. Note that “second display below (12)” really means “7 lines below (12)”. Also, in the last display of the proof of Theorem 8, I would put $\alpha_\psi^n +\beta_\psi^n$ into parentheses for clarity.

[Corrected, thanks – T.]

1 July, 2013 at 8:14 pm

Pace Nielsen

Just a few more typos:

1. The $n$ on the RHS of (6) should be $x$ .

2. On the line below (10), the $\varpi$ should be $\delta$ .

3. In Lemma 5, “whee” should be “where”.

4. Gergely’s point #6 above.

5. Three displays later, both of the $=1$ should be in the subscripts. (One already is.)

6. In (14), both in the first and third lines, $c(d)$ should be $c(n)$ .

7. In the second line of the proof of Theorem 8, “needed our application” is missing the word “in”.

8. I believe that the second to last display of Lemma 9 is missing a factor of $d_0$ . This factor thus also seems to be missing from the first term in the bound of Corollary 10 (since it is missing in the third to last, and then last, displayed equations of the proof). I don’t know how much further this missing factor propagates.

9. It might be helpful to define $d$ in the statement of Corollary 10 (rather than in the first line of the proof), so that in the display the summation can be over $n\in {\bf Z}/d{\bf Z}:(n,d_1)=(n+l,d_2)=1$ .

10. In the sentence before Section 5, the last $r'$ should just be $r$ .

————–

I’ll try to finish reading the second half of the post tomorrow.

My impression is that the third display after (22) is established by first breaking the sum into a dyadic decomposition and then using Corollary 12; the $O(1)$ comes (mostly) from the number of terms in the dyadic breakdown.

[Thanks for the corrections! Regarding point 8, the $d_0$ is buried in the $d_1^{1/2} d_2^{1/2} = d_0 t_1^{1/2} t_2^{1/2}$ factor. Regarding point 9, the initial sum is over integers $n$ , but completion of sums converts this to sums over ${\bf Z}/d{\bf Z}$ . And yes, this is how the third display after (22) is established. -T.]

2 July, 2013 at 7:38 pm

Pace Nielsen

Only made it through another quarter of the post, but here are some comments.

1. In the third display after (28), on the LHS the condition $(n,q)=1$ is redundant. It was actually helpful for me to have the condition there, but it also might be helpful to mention the redundancy.

2. A few sentences before (33), it says “For fixed $n_1, n_2$ obeying…” and later it says “For fixed $n_1\sim N$ ,…”. I would recommend changing the word “fixed” to “any given” (to avoid confusion with being fixed with respect to $x$ ).

3. In the third display above (33), it took me a while to figure out how the controlled multiplicity hypothesis was being applied. I think one transforms $\tau_{b}(mn_1)$ to $\tau_{n_1^{-1}b}(m)$ (essentially, replacing the old singleton congruence class system with a new one where we multiply by the inverse of $n$ modulo $q$ whenever possible). If that is correct, it might be good to say a word about why this new system still has controlled multiplicity.

4. In (34), I recommend changing $\geq$ to $\gg$ (to avoid trivial technicalities later when plugging in $\mu=2\varpi +c$ ).

5. In the sentence following (39), change “As noted after Lemma 13” to “As noted after Lemma 6” (the link is to equation (13), contained in Lemma 6).

6. In the sentence after (40) which reads “Indeed, from (33) and (9) one has…”, change (33) to (23). Then in the next two displays, replace $c$ by $\mu$ .

7. Directly before Theorem 13, change $M,M$ to $M,N$ .

8. In the second paragraph after Theorem 14, there is a sentence which begins “By (7), (8) we can simultaneously…” The condition (8) is superfluous here.

9. I recommend either changing the next sentence to read: “Note that (43) is weaker than (41), so we can now suppose instead that we are in the “Type II” regime…” OR change the statement of Theorem 14 to include the assumption that (42) fails.

10. The computation following Theorem 14 shows that the $39$ in equation (6) should be $29$ . (This new constant matches what I saw on the wiki.)

[Corrected, thanks – T.]

3 July, 2013 at 8:41 am

Pace Nielsen

I can follow your argument involving controlled multiplicity now. Thanks for clarifying that. Here are the remainder of the comments/questions.

1. For the display following “For the diagonal case, we make the crude bound”, I believe the RHS should have an extra factor of $\log^{O(1)}x$ (or just $x^{o(1)}$ ).

2. In the display which says $x^{\frac{7}{2}+7\varpi} \ll x^{3-5\epsilon-5\mu-3\delta+o(1)}N^2$ , change $7\varpi$ to $14\varpi$ , and also change $3\delta$ to $5\delta$ . (Fortunately, these changes do not affect any later computations.)

3. Two displays later, on the second line of that display where it has $b'_{q_2'}h$ , the $h$ should be $h'$ . A similar change should be made to the display following (46).

4. Three displays after (46), the $h,h'$ are missing in the definition of $c_2$ .

5. In Section 7, when it says “From we have” you can add “crude bounds” (or, “the divisor bound”) in between.

6. On the RHS of (48), there are parentheses missing ( $x^{o(1)}$ needs to multiply both factors).

7. In the fourth display above (49), on the second line the second $b_{q_1}$ should be $b_{q_1}'$ . On the third line, the second $h$ should be $h'$ .

8. On the last line of the display above (49) (where $c_1$ is being defined) there is a prime missing in the subscript of the $b$ , and another prime missing from $h$ .

———

When dealing with the Type I sum, when performing the Cauchy-Schwarz the choice is made to pull out the sum on $q_1$ . What is the reason we don’t pull out the sum on $q_2$ as well?

3 July, 2013 at 9:56 pm

Terence Tao

Thanks for the corrections!

Regarding Cauchy-Schwarz, there is a lot of flexibility in what we pull outside of the Cauchy-Schwarz, and what stays inside (and gets “doubled”, e.g. from $\sum_{q_2}$ to $\sum_{q_2,q'_2}$ ). But one has to balance the “diagonal” terms and the “off-diagonal” terms, both to be small in order to win. If one pulls too much stuff outside the Cauchy-Schwarz then the diagonal terms get too big, but if one puts too much stuff inside then the off-diagonal terms get too big. (Some of the more recent refinements to the Type I sums have come by striking a better balance in this regard.)

If we pull both the $q_1,q_2$ sums outside then only the $h$ summation gets doubled. But then the diagonal contribution (which becomes $h=h'$ instead of $hq'_2 = h'q_2$ ) only saves us a factor of $H^{-1}$ (instead of $(HQ)^{-1}$ ) over the trivial bound, when we actually need to save $H^{-2}$ . (The off-diagonal terms become a lot better, of course, but I don’t know how to exploit that when the diagonal terms are so bad.)

4 July, 2013 at 11:48 pm

Gergely Harcos

Some typos:

1. Four displays before (49), $e_{q'_1}( -\frac{b_{q_1} h')}{nrq'_2})$ should be $e_{q'_1}( -\frac{b_{q'_1} h'}{nrq'_2})$ , and $e_{q'_2}( -\frac{b'_{q'_2} h}{(n+kr) r q_1} )$ should be $e_{q'_2}( -\frac{b'_{q'_2} h'}{(n+kr) r q'_1} )$ ; one typo in the first expression, two typos in the second one.

2. In the display before (49), $- b_{q_1} \bar{r} \bar{q'_2} h \frac{d_1}{q'_1}$ should be $- b_{q'_1} \bar{r} \bar{q'_2} h' \frac{d_1}{q'_1}$ ; two typos here. Also, this display would look better in two lines instead of three.

[Corrected, thanks – T.]

4 July, 2013 at 11:52 pm

Gergely Harcos

Forgot one:

3. In Line 6 of Section 7, “From we have” should be “From the divisor bound we have” (with link to the divisor bound).

[Corrected, thanks – T.]

5 July, 2013 at 9:41 am

Pace Nielsen

Is there a natural multidimensional analog of the “Completion of Sums” lemma? In particular, I’m looking for simplifications for a sum of the form

$\sum_{m,m'}\psi_{M}(m)\psi_{M'}(m')\sum_{i\in I: mm'=a_i\ (d)}c_i$ .

5 July, 2013 at 1:27 pm

Pace Nielsen

From reading a few more comments on different threads, it appears that there is such an analog, but we run into the problem of higher dimensional Kloosterman sums.

7 July, 2013 at 11:17 pm

The distribution of primes in doubly densely divisible moduli | What's new

[…] Proof: See Lemma 7 of this previous post. […]

21 March, 2014 at 4:14 pm

Stijn Stefan Campbell Hanson

Sorry, I’ve been looking this over again and could you maybe explain a bit more why the case $D < \mathcal{L}^C$ is acceptable in the proof of Bombieri-Vinogradov (ii)?

21 March, 2014 at 7:01 pm

Terence Tao

As I said in the post, the argument here is similar to the treatment of the $D \leq {\mathcal L}^C$ case of (i). If you could describe your own efforts to prove the estimate and where precisely you got stuck, I would be better able to diagnose the issue.

22 March, 2014 at 3:29 am

Stijn Stefan Campbell Hanson

So we have the bound on the $\beta$ sum using Siegel-Walfisz (I think, I’m not too clear on the precise steps) but how do we combine this with the $\alpha$ sum? Also would we still need to use the crude estimates, in which case which ones, as it wouldn’t be the same one, surely?

22 March, 2014 at 7:33 am

Terence Tao

Use Siegel-Walfisz to bound the term $|\sum_n \beta \psi(n)1_{(n,e)=1}|$ , and crude estimates to bound the sum $|\sum_m \alpha \psi(m) 1_{(m,e)=1}|$ , and then continue using crude estimates to bound all of the terms that appear afterwards. Since the Siegel-Walfisz factor is going to save arbitrarily many powers of $\log x$ , it is perfectly safe to lose bounded powers of $\log x$ in all other estimates; thus for instance, once each term in the $\psi$ summation is bounded by some uniform bound $X$ , the $\psi$ sum may then be crudely bounded by $dX \leq X \log^C x$ , since there are at most $d$ characters of conductor $d$ . Similarly, it is safe to discard the $\frac{1}{\phi(d)}$ factor in this regime. I recommend actually inserting some estimates and seeing what you get; as I said, it doesn’t really matter if you are somewhat inefficient in your estimation because the gain of an arbitrary number of logarithms in Siegel-Walfisz will always rescue you. (See also Strategy 16 and Strategy 15 from https://terrytao.wordpress.com/2010/10/21/245a-problem-solving-strategies/ , and also my discussion of the value of attempting to partially implement a solution even when you do not yet have the full solution at https://plus.google.com/u/1/114134834346472219368/posts/Xdm8eiPLWZp .)
