In the previous set of notes, we saw how zero-density theorems for the Riemann zeta function, when combined with the zero-free region of Vinogradov and Korobov, could be used to obtain prime number theorems in short intervals. It turns out that a more sophisticated version of this type of argument also works to obtain prime number theorems in arithmetic progressions, in particular establishing the celebrated theorem of Linnik:

Theorem 1 (Linnik’s theorem) Let {a\ (q)} be a primitive residue class. Then {a\ (q)} contains a prime {p} with {p \ll q^{O(1)}}.

In fact it is known that one can find a prime {p} with {p \ll q^{5}}, a result of Xylouris. For the sake of comparison, recall from Exercise 65 of Notes 2 that the Siegel-Walfisz theorem gives this theorem with a bound of {p \ll \exp( q^{o(1)} )}, and from Exercise 48 of Notes 2 one can obtain a bound of the form {p \ll \phi(q)^2 \log^2 q} if one assumes the generalised Riemann hypothesis. The probabilistic random models from Supplement 4 suggest that one should in fact be able to take {p \ll q^{1+o(1)}}.

We will not aim to obtain the optimal exponents for Linnik’s theorem here, and follow the treatment in Chapter 18 of Iwaniec and Kowalski. We will in fact establish the following more quantitative result (a special case of a more powerful theorem of Gallagher), which splits into two cases, depending on whether there is an exceptional zero or not:

Theorem 2 (Quantitative Linnik theorem) Let {a\ (q)} be a primitive residue class for some {q \geq 2}. For any {x > 1}, let {\psi(x;q,a)} denote the quantity

\displaystyle  \psi(x;q,a) := \sum_{n \leq x: n=a\ (q)} \Lambda(n).

Assume that {x \geq q^C} for some sufficiently large {C}.

  • (i) (No exceptional zero) If all the real zeroes {\beta} of {L}-functions {L(\cdot,\chi)} of real characters {\chi} of modulus {q} are such that {1-\beta \gg \frac{1}{\log q}}, then

    \displaystyle  \psi(x;q,a) = \frac{x}{\phi(q)} ( 1 + O( \exp( - c \frac{\log x}{\log q} ) ) + O( \frac{\log^2 q}{q} ) )

    for all {x \geq 1} and some absolute constant {c>0}.

  • (ii) (Exceptional zero) If there is a zero {\beta} of an {L}-function {L(\cdot,\chi_1)} of a real character {\chi_1} of modulus {q} with {\beta = 1 - \frac{\varepsilon}{\log q}} for some sufficiently small {\varepsilon>0}, then

    \displaystyle  \psi(x;q,a) = \frac{x}{\phi(q)} ( 1 - \chi_1(a) \frac{x^{\beta-1}}{\beta} \ \ \ \ \ (1)

    \displaystyle + O( \exp( - c \frac{\log x}{\log q} \log \frac{1}{\varepsilon} ) )

    \displaystyle  + O( \frac{\log^2 q}{q} ) )

    for all {x \geq 1} and some absolute constant {c>0}.

The implied constants here are effective.
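As a rough numerical illustration of the quantity {\psi(x;q,a)} appearing in Theorem 2 (this is not part of the argument), the following Python sketch tabulates the von Mangoldt function with a sieve and compares {\psi(x;q,a)} with the main term {x/\phi(q)}. The function names are hypothetical and chosen only for this illustration, and the range of {x} that can be reached this way is far too small to detect any exceptional zero effects.

```python
from math import gcd, log

def von_mangoldt_table(x):
    """Return lam with lam[n] = Lambda(n) for 0 <= n <= x (log p at prime powers, 0 elsewhere)."""
    lam = [0.0] * (x + 1)
    composite = [False] * (x + 1)
    for p in range(2, x + 1):
        if composite[p]:
            continue
        for m in range(p * p, x + 1, p):
            composite[m] = True
        pk = p
        while pk <= x:  # record log p at p, p^2, p^3, ...
            lam[pk] = log(p)
            pk *= p
    return lam

def psi(x, q, a, lam=None):
    """psi(x; q, a): sum of Lambda(n) over n <= x with n = a (mod q)."""
    if lam is None:
        lam = von_mangoldt_table(x)
    return sum(lam[n] for n in range(1, x + 1) if (n - a) % q == 0)

def euler_phi(q):
    return sum(1 for k in range(1, q + 1) if gcd(k, q) == 1)

# Ratio psi(x; q, a) / (x / phi(q)) for each primitive class mod 7.
x, q = 10**5, 7
lam = von_mangoldt_table(x)
for a in range(1, q):
    print(a, round(psi(x, q, a, lam) / (x / euler_phi(q)), 4))
```

The printed ratios should all be close to {1}, in line with the main term of Theorem 2.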

Note from the Landau-Page theorem (Exercise 54 from Notes 2) that at most one exceptional zero exists (if {\varepsilon} is small enough). A key point here is that the error term {O( \exp( - c \frac{\log x}{\log q} \log \frac{1}{\varepsilon} ) )} in the exceptional zero case is an improvement over the error term when no exceptional zero is present; this compensates for the potential reduction in the main term coming from the {\chi_1(a) \frac{x^{\beta-1}}{\beta}} term. The splitting into cases depending on whether an exceptional zero exists or not turns out to be an essential technique in many advanced results in analytic number theory (though presumably such a splitting will one day become unnecessary, once the possibility of exceptional zeroes is finally eliminated for good).

Exercise 3 Assuming Theorem 2, and assuming {x \geq q^C} for some sufficiently large absolute constant {C}, establish the lower bound

\displaystyle  \psi(x;q,a) \gg \frac{x}{\phi(q)}

when there is no exceptional zero, and

\displaystyle  \psi(x;q,a) \gg \varepsilon \frac{x}{\phi(q)}

when there is an exceptional zero {\beta = 1 - \frac{\varepsilon}{\log q}}. Conclude that Theorem 2 implies Theorem 1, regardless of whether an exceptional zero exists or not.
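To get a feeling for the size of the main term in the exceptional case of Exercise 3, one can evaluate the factor {1 - \chi_1(a) \frac{x^{\beta-1}}{\beta}} from (1) numerically. The sketch below is only an illustration under stated assumptions: the function name is hypothetical, we parameterise {x = q^C} so that {x^{\beta-1} = \exp(-\varepsilon C)}, and the values of {q}, {\varepsilon}, {C} are arbitrary sample values rather than data about any actual character.

```python
from math import exp, log

def exceptional_main_factor(C, q, eps, chi1_a=1):
    """Return 1 - chi_1(a) * x^(beta-1) / beta, where x = q^C and beta = 1 - eps / log q."""
    beta = 1.0 - eps / log(q)
    # x^(beta - 1) = exp(-(eps / log q) * C * log q) = exp(-eps * C)
    return 1.0 - chi1_a * exp(-eps * C) / beta

q, eps = 1000, 0.01
for C in (10, 20, 50):
    worst = exceptional_main_factor(C, q, eps, chi1_a=1)    # chi_1(a) = +1: main term nearly cancels
    best = exceptional_main_factor(C, q, eps, chi1_a=-1)    # chi_1(a) = -1: main term nearly doubles
    print(C, round(worst, 5), round(eps * C, 5), round(best, 5))
```

In the unfavourable case {\chi_1(a)=+1} the factor is slightly smaller than {\varepsilon C} when {\varepsilon C} is small, which is why the lower bound in the exceptional case carries a factor of {\varepsilon}.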

Remark 4 The Brun-Titchmarsh theorem (Exercise 33 from Notes 4), in the sharp form of Montgomery and Vaughan, gives that

\displaystyle  \pi(x; q, a) \leq 2 \frac{x}{\phi(q) \log (x/q)}

for any primitive residue class {a\ (q)} and any {x \geq q}. This is (barely) consistent with the estimate (1). Any lowering of the coefficient {2} in the Brun-Titchmarsh inequality (with reasonable error terms), in the regime when {x} is a large power of {q}, would then lead to at least some elimination of the exceptional zero case. However, this has not led to any progress on the Landau-Siegel zero problem (and may well be just a reformulation of that problem). (When {x} is a relatively small power of {q}, some improvements to Brun-Titchmarsh are possible that are not in contradiction with the presence of an exceptional zero; see this paper of Maynard for more discussion.)
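The Brun-Titchmarsh inequality of Remark 4 is easy to test numerically for small moduli. The following sketch (with hypothetical helper names) counts primes in each primitive residue class and compares the count with the bound {2 \frac{x}{\phi(q) \log(x/q)}}; it is only a sanity check in a tiny range and says nothing about the regime relevant to exceptional zeroes.

```python
from math import gcd, log

def primes_up_to(x):
    """Sieve of Eratosthenes: list of primes <= x (x >= 1)."""
    sieve = bytearray([1]) * (x + 1)
    sieve[0:2] = b"\x00\x00"
    for p in range(2, int(x ** 0.5) + 1):
        if sieve[p]:
            sieve[p * p :: p] = bytearray(len(range(p * p, x + 1, p)))
    return [n for n in range(x + 1) if sieve[n]]

def pi_in_class(x, q, a, primes):
    """pi(x; q, a): number of primes p <= x with p = a (mod q)."""
    return sum(1 for p in primes if p <= x and (p - a) % q == 0)

def brun_titchmarsh_bound(x, q):
    """The Montgomery-Vaughan bound 2x / (phi(q) log(x/q)), valid for x > q."""
    phi = sum(1 for k in range(1, q + 1) if gcd(k, q) == 1)
    return 2 * x / (phi * log(x / q))

x = 10**4
primes = primes_up_to(x)
for q in (3, 7, 12, 100):
    worst = max(pi_in_class(x, q, a, primes) for a in range(1, q + 1) if gcd(a, q) == 1)
    print(q, worst, round(brun_titchmarsh_bound(x, q), 1))
```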

Theorem 2 is deduced in turn from facts about the distribution of zeroes of {L}-functions. Recall from the truncated explicit formula (Exercise 45(iv) of Notes 2) with (say) {T := q^2} that

\displaystyle  \sum_{n \leq x} \Lambda(n) \chi(n) = - \sum_{\hbox{Re}(\rho) > 3/4; |\hbox{Im}(\rho)| \leq q^2; L(\rho,\chi)=0} \frac{x^\rho}{\rho} + O( \frac{x}{q^2} \log^2 q)

for any non-principal character {\chi} of modulus {q}, where we assume {x \geq q^C} for some large {C}; for the principal character one has the same formula with an additional term of {x} on the right-hand side (as is easily deduced from Theorem 21 of Notes 2). Using the Fourier inversion formula

\displaystyle  1_{n = a\ (q)} = \frac{1}{\phi(q)} \sum_{\chi\ (q)} \overline{\chi(a)} \chi(n)

(see Theorem 69 of Notes 1), we thus have

\displaystyle  \psi(x;q,a) = \frac{x}{\phi(q)} ( 1 - \sum_{\chi\ (q)} \overline{\chi(a)} \sum_{\hbox{Re}(\rho) > 3/4; |\hbox{Im}(\rho)| \leq q^2; L(\rho,\chi)=0} \frac{x^{\rho-1}}{\rho}

\displaystyle  + O( \frac{\log^2 q}{q} ) )

and so it suffices by the triangle inequality (bounding {1/\rho} very crudely by {O(1)}, as the contribution of the low-lying zeroes already turns out to be quite dominant) to show that

\displaystyle  \sum_{\chi\ (q)} \sum_{\sigma > 3/4; |t| \leq q^2; L(\sigma+it,\chi)=0} x^{\sigma-1} \ll \exp( - c \frac{\log x}{\log q} ) \ \ \ \ \ (2)

when no exceptional zero is present, and

\displaystyle  \sum_{\chi\ (q)} \sum_{\sigma > 3/4; |t| \leq q^2; L(\sigma+it,\chi)=0; \sigma+it \neq \beta} x^{\sigma-1} \ll \exp( - c \frac{\log x}{\log q} \log \frac{1}{\varepsilon} ) \ \ \ \ \ (3)

when an exceptional zero is present.
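The Fourier inversion formula {1_{n = a\ (q)} = \frac{1}{\phi(q)} \sum_{\chi\ (q)} \overline{\chi(a)} \chi(n)} used in the above reduction can be checked directly in small cases. The sketch below makes the simplifying assumption that {q=p} is an odd prime, so that all characters can be built from a primitive root; the helper names are hypothetical.

```python
import cmath

def primitive_root(p):
    """Smallest primitive root modulo an odd prime p (brute force)."""
    for g in range(2, p):
        seen, x = set(), 1
        for _ in range(p - 1):
            x = x * g % p
            seen.add(x)
        if len(seen) == p - 1:
            return g
    raise ValueError("no primitive root found; is p an odd prime?")

def characters_mod_prime(p):
    """All p-1 Dirichlet characters mod p, via chi_k(g^j) = exp(2 pi i j k / (p-1))."""
    g = primitive_root(p)
    index, x = {}, 1
    for j in range(p - 1):  # discrete logarithm table: index[g^j mod p] = j
        index[x] = j
        x = x * g % p
    chars = []
    for k in range(p - 1):
        def chi(n, k=k):
            n %= p
            if n == 0:
                return 0j
            return cmath.exp(2j * cmath.pi * k * index[n] / (p - 1))
        chars.append(chi)
    return chars

def indicator_via_characters(n, a, p):
    """(1 / phi(p)) * sum over chi of conj(chi(a)) * chi(n)."""
    chars = characters_mod_prime(p)
    return sum(chi(a).conjugate() * chi(n) for chi in chars) / (p - 1)

# Should print (approximately) 1 exactly when n = 3 (mod 7), and 0 otherwise.
for n in range(1, 15):
    print(n, round(indicator_via_characters(n, 3, 7).real, 6))
```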

To handle the former case (2), one uses two facts about zeroes. The first is the classical zero-free region (Proposition 51 from Notes 2), which we reproduce in our context here:

Proposition 5 (Classical zero-free region) Let {q, T \geq 2}. Apart from a potential exceptional zero {\beta}, all zeroes {\sigma+it} of {L}-functions {L(\cdot,\chi)} with {\chi} of modulus {q} and {|t| \leq T} are such that

\displaystyle  \sigma \leq 1 - \frac{c}{\log qT}

for some absolute constant {c>0}.

Using this zero-free region, we have

\displaystyle  x^{\sigma-1} \ll \log x \int_{1/2}^{1-c/\log q} 1_{\alpha < \sigma} x^{\alpha-1}\ d\alpha

whenever {\sigma} contributes to the sum in (2), and so the left-hand side of (2) is bounded by

\displaystyle  \ll \log x \int_{1/2}^{1 - c/\log q} N( \alpha, q, q^2 ) x^{\alpha-1}\ d\alpha

where we recall that {N(\alpha,q,T)} is the number of zeroes {\sigma+it} of any {L}-function of a character {\chi} of modulus {q} with {\sigma \geq \alpha} and {0 \leq t \leq T} (here we use conjugation symmetry to make {t} non-negative, accepting a multiplicative factor of two).

In Exercise 25 of Notes 6, the grand density estimate

\displaystyle  N(\alpha,q,T) \ll (qT)^{4(1-\alpha)} \log^{O(1)}(qT) \ \ \ \ \ (4)

is proven. If one inserts this bound into the above expression, one obtains a bound for (2) which is of the form

\displaystyle  \ll (\log^{O(1)} q) \exp( - c \frac{\log x}{\log q} ).

Unfortunately this is off from what we need by a factor of {\log^{O(1)} q} (and would lead to a weak form of Linnik’s theorem in which {p} was bounded by {O( \exp( \log^{O(1)} q ) )} rather than by {q^{O(1)}}). In the analogous problem for prime number theorems in short intervals, we could use the Vinogradov-Korobov zero-free region to compensate for this loss, but that region does not help here for the contribution of the low-lying zeroes with {t = O(1)}, which as mentioned before give the dominant contribution. Fortunately, it is possible to remove this logarithmic loss from the zero-density side of things:

Theorem 6 (Log-free grand density estimate) For any {q, T > 1} and {1/2 \leq \alpha \leq 1}, one has

\displaystyle  N(\alpha,q,T) \ll (qT)^{O(1-\alpha)}.

The implied constants are effective.

We prove this estimate below the fold. The proof follows the methods of the previous section, but one inserts various sieve weights to restrict sums over natural numbers to essentially become sums over “almost primes”, as this turns out to remove the logarithmic losses. (More generally, the trick of restricting to almost primes by inserting suitable sieve weights is quite useful for avoiding any unnecessary losses of logarithmic factors in analytic number theory estimates.)

Exercise 7 Use Theorem 6 to complete the proof of (2).

Now we turn to the case when there is an exceptional zero (3). The argument used to prove (2) applies here also, but does not gain the factor of {\log \frac{1}{\varepsilon}} in the exponent. To achieve this, we need an additional tool, a version of the Deuring-Heilbronn repulsion phenomenon due to Linnik:

Theorem 8 (Deuring-Heilbronn repulsion phenomenon) Suppose {q \geq 2} is such that there is an exceptional zero {\beta = 1 - \frac{\varepsilon}{\log q}} with {\varepsilon} small. Then all other zeroes {\sigma+it} of {L}-functions of modulus {q} are such that

\displaystyle  \sigma \leq 1 - c \frac{\log \frac{1}{\varepsilon}}{\log(q(2+|t|))}.

In other words, the exceptional zero enlarges the classical zero-free region by a factor of {\log \frac{1}{\varepsilon}}. The implied constants are effective.

Exercise 9 Use Theorem 6 and Theorem 8 to complete the proof of (3), and thus Linnik’s theorem.

Exercise 10 Use Theorem 8 to give an alternate proof of (Tatuzawa’s version of) Siegel’s theorem (Theorem 62 of Notes 2). (Hint: if two characters have different moduli, then they can be made to have the same modulus by multiplying by suitable principal characters.)

Theorem 8 is proven by methods similar to those of Theorem 6, the basic idea being to insert a further weight of {1 * \chi_1} (in addition to the sieve weights), the point being that the exceptional zero causes this weight to be quite small on the average. There is a strengthening of Theorem 8 due to Bombieri that is along the lines of Theorem 6, obtaining the improvement

\displaystyle  N'(\alpha,q,T) \ll \varepsilon (1 + \frac{\log T}{\log q}) (qT)^{O(1-\alpha)} \ \ \ \ \ (5)

with effective implied constants for any {1/2 \leq \alpha \leq 1} and {T \geq 1} in the presence of an exceptional zero, where the prime in {N'(\alpha,q,T)} means that the exceptional zero {\beta} is omitted (thus {N'(\alpha,q,T) = N(\alpha,q,T)-1} if {\alpha \leq \beta}). Note that the upper bound on {N'(\alpha,q,T)} falls below one when {\alpha > 1 - c \frac{\log \frac{1}{\varepsilon}}{\log(qT)}} for a sufficiently small {c>0}, thus recovering Theorem 8. Bombieri’s theorem can be established by the methods in this set of notes, and will be given as an exercise to the reader.

Remark 11 There are a number of alternate ways to derive the results in this set of notes, for instance using the Turán power sums method, which is based on studying derivatives such as

\displaystyle \left(-\frac{L'}{L}\right)^{(k)}(s,\chi) = (-1)^k \sum_n \frac{\Lambda(n) \chi(n) \log^k n}{n^s}

\displaystyle  \approx (-1)^{k+1} k! \sum_\rho \frac{1}{(s-\rho)^{k+1}}

for {\hbox{Re}(s)>1} and large {k}, and performing various sorts of averaging in {k} to attenuate the contribution of many of the zeroes {\rho}. We will not develop this method here, but see for instance Chapter 9 of Montgomery’s book. See the text of Friedlander and Iwaniec for yet another approach based primarily on sieve-theoretic ideas.

Remark 12 When one optimises all the exponents, it turns out that the exponent in Linnik’s theorem is extremely good in the presence of an exceptional zero – indeed Friedlander and Iwaniec showed that one can even get a bound of the form {p \ll q^{2-c}} for some {c>0}, which is even stronger than one can obtain from GRH! There are other places in which exceptional zeroes can be used to obtain results stronger than what one can obtain even on the Riemann hypothesis; for instance, Heath-Brown used the hypothesis of an infinite sequence of Siegel zeroes to obtain the twin prime conjecture.

— 1. Log-free density estimate —

We now prove Theorem 6. We will make no attempt here to optimise the exponents in this theorem, and so will be quite wasteful in the choices of numerical exponents in the argument that follows in order to simplify the presentation.

By increasing {T} if necessary we may assume that

\displaystyle  T \geq q^{10} \ \ \ \ \ (6)

(say); we may also assume that {T} is larger than any specified absolute constant. We may then replace {qT} by {T} in the estimate, thus we wish to show that

\displaystyle  N(\alpha,q,T) \ll T^{O(1-\alpha)}.

Observe that in the regime

\displaystyle  1-\alpha \gg \frac{\log\log T}{\log T}

the claim already follows from the non-log-free density estimate (4). Thus we may assume that

\displaystyle  \alpha = 1 - \frac{A}{\log T}

for some {A = O( \log \log T )}, and the claim is now to show that there are at most {O( \exp( O(A) ) )} zeroes of {L}-functions {L(\sigma+it,\chi)} with {|t| \leq T}, {\sigma \geq 1 - \frac{A}{\log T}}, and {\chi} a character of modulus {q}. We may assume that {A \geq 1}, since the {A < 1} case follows from the {A \geq 1} case (and also essentially follows from the classical zero-free region, in any event).

For minor technical reasons it is convenient to first dispose of the contribution of the principal character. In this case, the zeroes are the same as those of the Riemann zeta function. From the Vinogradov-Korobov zero-free region we conclude there are no zeroes {\sigma+it} with {\sigma \geq 1 - \frac{A}{\log T}} and {|t| \leq T}. Thus we may restrict attention to non-principal characters {\chi}.

Suppose we have a zero {L(\sigma+it,\chi)=0} of a non-principal character {\chi} of modulus {q} with {\sigma \geq 1 - \frac{A}{\log T}} and {|t| \leq T}. From equation (48) of Notes 2 we then have

\displaystyle  \sum_{n \leq x} \frac{\chi(n)}{n^{\sigma+it}} \ll q T x^{-\sigma}

for any {x \geq 1}; in particular

\displaystyle  \sum_{n \leq x} \frac{\chi(n)}{n^{\sigma+it}} \ll T^{-400} \ \ \ \ \ (7)

(say) for all {T^{500} \leq x \leq T^{1000}}. One can of course obtain more efficient truncations than this, but as mentioned previously we are not trying to optimise the exponents. If one subtracts the {n=1} term from the left-hand side, this already gives a zero-detecting polynomial, but it is not tractable to work with because it contains too many terms with small {n} (and is also not concentrated on those {n} that are almost prime). To fix this, we weight the previous Dirichlet polynomial by {\rho*1}, where {\rho} is an arithmetic function supported on {[1,T^{500}]} to be chosen later obeying the bound {\rho(m) \ll \tau(m)^{O(1)}}. We expand

\displaystyle  \sum_{n \leq T^{1000}} \rho*1(n) \frac{\chi(n)}{n^{\sigma+it}} = \sum_{m \leq T^{500}} \frac{\rho(m) \chi(m)}{m^{\sigma+it}} \sum_{n \leq T^{1000}/m} \frac{\chi(n)}{n^{\sigma+it}}

and hence by (7) and the upper bound on {\rho(m)}

\displaystyle  \sum_{n \leq T^{1000}} \rho*1(n) \frac{\chi(n)}{n^{\sigma+it}} \ll T^{-400} \sum_{m \leq T^{500}} \frac{\tau(m)^{O(1)}}{m^\sigma}.

Since {\sigma \geq 1 - O( \frac{\log\log T}{\log T} )}, one sees from the divisor bound and the hypothesis that {T} is large that

\displaystyle  \left|\sum_{n \leq T^{1000}} \rho*1(n) \frac{\chi(n)}{n^{\sigma+it}}\right| \leq \frac{1}{2}.

If we have {\rho(1)=1}, then we can extract the {n=1} term and obtain a zero-detecting polynomial:

\displaystyle  \left|\sum_{1 < n \leq T^{1000}} \rho*1(n) \frac{\chi(n)}{n^{\sigma+it}}\right| \geq \frac{1}{2}.

We now select the weights {\rho}. There are a number of options here; we will use a variant of the “continuous Selberg sieve” from Section 2 of Notes 4. Fix a smooth function {f: {\bf R} \rightarrow {\bf R}} that equals {1} on {[-1,1]} and is supported on {[-2,2]}; we allow implied constants to depend on {f}. For any {R > 1}, define

\displaystyle  \beta_R(n) := \sum_{d|n} \mu(d) f( \frac{\log d}{\log R} ).

Observe from Möbius inversion that {\beta_R(n) = 1_{n=1}} for all {n \leq R}. The weight {\beta_R^2} was used as an upper bound Selberg sieve in Notes 4.
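The identity {\beta_R(n) = 1_{n=1}} for {n \leq R} can be observed numerically. The sketch below makes one concrete (and arbitrary) choice of the cutoff {f}, built from the standard bump {e^{-1/t}}; this particular {f}, as well as the helper names, are assumptions made for illustration only.

```python
from math import exp, log

def smooth_cutoff(u):
    """A smooth even function equal to 1 on [-1,1] and vanishing outside (-2,2)."""
    u = abs(u)
    if u <= 1:
        return 1.0
    if u >= 2:
        return 0.0
    t = 2.0 - u  # t lies strictly between 0 and 1
    a, b = exp(-1.0 / t), exp(-1.0 / (1.0 - t))
    return a / (a + b)

def mobius(n):
    """Moebius function by trial division."""
    result, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0  # p^2 divides the original n
            result = -result
        p += 1
    if n > 1:
        result = -result
    return result

def beta(R, n):
    """beta_R(n) = sum over d | n of mu(d) f(log d / log R)."""
    return sum(mobius(d) * smooth_cutoff(log(d) / log(R))
               for d in range(1, n + 1) if n % d == 0)

R = 10
for n in range(1, 31):
    print(n, round(beta(R, n), 4))
```

For {n \leq R} every divisor {d} of {n} has {f( \frac{\log d}{\log R} ) = 1}, so the output is {1} at {n=1} and {0} for {1 < n \leq R}; non-trivial values only appear beyond {R}.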

We observe that

\displaystyle  \beta_{R_1} \beta_{R_2} = 1 * \rho_{R_1,R_2} \ \ \ \ \ (8)

for any {R_1,R_2 > 1}, where

\displaystyle  \rho_{R_1,R_2}(d) := \sum_{d_1,d_2: [d_1,d_2] = d} \mu(d_1) f( \frac{\log d_1}{\log R_1} ) \mu(d_2) f( \frac{\log d_2}{\log R_2} ). \ \ \ \ \ (9)
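The identity (8) is a purely combinatorial rearrangement (a pair {d_1,d_2} divides {n} precisely when {[d_1,d_2]} divides {n}), and can be confirmed by brute force. The following sketch again uses an arbitrary concrete choice of {f} and hypothetical helper names; the scales {R_1=3}, {R_2=5} are small sample values chosen so that the cutoff is non-trivial in the tested range.

```python
from math import exp, gcd, log

def smooth_cutoff(u):
    """A smooth even function equal to 1 on [-1,1] and vanishing outside (-2,2)."""
    u = abs(u)
    if u <= 1:
        return 1.0
    if u >= 2:
        return 0.0
    t = 2.0 - u
    a, b = exp(-1.0 / t), exp(-1.0 / (1.0 - t))
    return a / (a + b)

def mobius(n):
    result, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0
            result = -result
        p += 1
    return -result if n > 1 else result

def divisors(n):
    return [d for d in range(1, n + 1) if n % d == 0]

def weight(d, R):
    """mu(d) f(log d / log R)."""
    return mobius(d) * smooth_cutoff(log(d) / log(R))

def beta(R, n):
    return sum(weight(d, R) for d in divisors(n))

def rho(R1, R2, d):
    """rho_{R1,R2}(d): sum over pairs (d1, d2) with lcm [d1, d2] = d, as in (9)."""
    divs = divisors(d)
    return sum(weight(d1, R1) * weight(d2, R2)
               for d1 in divs for d2 in divs
               if d1 * d2 // gcd(d1, d2) == d)

# Compare beta_{R1}(n) * beta_{R2}(n) with (1 * rho_{R1,R2})(n).
R1, R2 = 3, 5
for n in (1, 2, 6, 30, 60):
    lhs = beta(R1, n) * beta(R2, n)
    rhs = sum(rho(R1, R2, d) for d in divisors(n))
    print(n, round(lhs, 6), round(rhs, 6))
```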

We will need the following general bound:

Lemma 13 (Sieve upper bound) Let {1 \leq R_1 \leq R_2 \ll R_1^{O(1)}}, and let {g} be a completely multiplicative function such that {g(p) = O(1)} for all primes {p}. Then

\displaystyle  \sum_d \frac{g(d)}{d} \rho_{R_1,R_2}(d) \ll \exp( - \sum_{p \leq R_1} \frac{g(p)}{p} ). \ \ \ \ \ (10)

Proof: Clearly, we can restrict {d} to those numbers whose prime factors do not exceed {CR_1^C}, for some large absolute constant {C}.

By a Fourier expansion we can write

\displaystyle  f(u) = \int_{\bf R} e^{-itu} F(t)\ dt \ \ \ \ \ (11)

for some rapidly decreasing function {F: {\bf R} \rightarrow {\bf C}}, and thus the left-hand side of (10) may be written as

\displaystyle  \int_{\bf R} \int_{\bf R} \sum_{d_1,d_2} \mu(d_1) \mu(d_2) \frac{g([d_1,d_2])}{[d_1,d_2] d_1^{it_1/\log R_1} d_2^{it_2/\log R_2}}\ F(t_1) F(t_2) dt_1 dt_2

where we implicitly restrict {d_1,d_2} to numbers whose prime factors do not exceed {CR_1^C} (note that this makes the integrand absolutely summable and integrable, so that Fubini’s theorem applies). We may factor this as

\displaystyle  \int_{\bf R} \int_{\bf R} \prod_{p \leq CR_1^C} (1 - \frac{g(p)}{p} (p^{-it_1/\log R_1} + p^{-it_2/\log R_2} - p^{-it_1/\log R_1 - it_2/\log R_2}))

\displaystyle  F(t_1) F(t_2) dt_1 dt_2.

By the rapid decrease of {F}, it thus suffices to show that

\displaystyle  \prod_{p \leq CR_1^C} (1 - \frac{g(p)}{p} (p^{-it_1/\log R_1} + p^{-it_2/\log R_2} - p^{-it_1/\log R_1 - it_2/\log R_2}))

\displaystyle  \ll (2+|t_1|+|t_2|)^{O(1)} \exp( - \sum_{p \leq R_1} \frac{g(p)}{p} ).

By Taylor expansion we can bound the left-hand side by

\displaystyle  \ll |\exp( - \sum_{p \leq CR_1^C} \frac{g(p)}{p} (p^{-it_1/\log R_1} + p^{-it_2/\log R_2} - p^{-it_1/\log R_1 - it_2/\log R_2}) )|.

By Mertens’ theorem we can replace the constraint {p \leq CR_1^C} with {p \leq R_1}. Since {g(p)=O(1)}, it thus suffices to show that

\displaystyle  \sum_{p \leq R_1} \frac{1}{p} |p^{-it_1/\log R_1} + p^{-it_2/\log R_2} - p^{-it_1/\log R_1 - it_2/\log R_2} - 1|

\displaystyle  \ll \log( 2 + |t_1|+|t_2| ).

But we can factor

\displaystyle  |p^{-it_1/\log R_1} + p^{-it_2/\log R_2} - p^{-it_1/\log R_1 - it_2/\log R_2} - 1|

\displaystyle  = |p^{-it_1/\log R_1}-1| |p^{-it_2/\log R_2}-1|

\displaystyle  \ll |p^{-it_1/\log R_1}-1|

\displaystyle  \ll \min( |t_1| \frac{\log p}{\log R_1}, 1 )

and the claim follows from Mertens’ theorem. \Box
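The Euler product factorisation used in the above proof can be verified numerically on a finite set of primes. In the sketch below, {\tau_j} stands for {t_j/\log R_j}, the sample values of {g(p)} and {\tau_j} are arbitrary, and the helper names are hypothetical; the check only concerns the algebraic identity, not the analytic bounds.

```python
from itertools import combinations
from math import prod

def subsets(primes):
    """All subsets of the given primes; each subset encodes a squarefree number."""
    return [s for r in range(len(primes) + 1) for s in combinations(primes, r)]

def double_sum(primes, g, tau1, tau2):
    """Sum over squarefree d1, d2 of mu(d1) mu(d2) g([d1,d2]) / [d1,d2] * d1^(-i tau1) * d2^(-i tau2)."""
    total = 0j
    for s1 in subsets(primes):
        for s2 in subsets(primes):
            union = set(s1) | set(s2)            # primes dividing the lcm [d1, d2]
            sign = (-1) ** (len(s1) + len(s2))   # mu(d1) * mu(d2)
            g_lcm = prod(g[p] for p in union)    # g is completely multiplicative
            total += (sign * g_lcm / prod(union)
                      * prod(s1) ** (-1j * tau1) * prod(s2) ** (-1j * tau2))
    return total

def euler_product(primes, g, tau1, tau2):
    """Product over p of 1 - (g(p)/p) (p^(-i tau1) + p^(-i tau2) - p^(-i tau1 - i tau2))."""
    result = 1 + 0j
    for p in primes:
        a, b = p ** (-1j * tau1), p ** (-1j * tau2)
        result *= 1 - g[p] / p * (a + b - a * b)
    return result

primes = [2, 3, 5, 7]
g = {2: 0.5, 3: 2.0, 5: 1.0, 7: -1.5}  # arbitrary bounded values g(p)
print(double_sum(primes, g, 0.7, -1.3))
print(euler_product(primes, g, 0.7, -1.3))
```

Each prime contributes independently according to whether it divides {d_1} only, {d_2} only, both, or neither, which is exactly the four-term local factor in the product.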

We record a basic corollary of this estimate:

Corollary 14 For any {R \geq 1}, we have

\displaystyle  \sum_{n \leq x} \beta_R(n)^2 \ll \frac{x}{\log R} \ \ \ \ \ (12)

for any {x \geq R^5}, and

\displaystyle  \sum_{n \leq x} \frac{\beta_R(n)^2}{n} \ll 1 \ \ \ \ \ (13)

for any {x \ll R^{O(1)}}.

Proof: Writing {\beta_R^2 = \rho_{R,R}*1}, we can write the left-hand side of (12) as

\displaystyle  \sum_d \rho_{R,R}(d) (\frac{x}{d} + O(1)).

Since {\rho_{R,R}(d)} is supported on {[1,R^4]} and is bounded above by {\tau(d)^{O(1)}}, the contribution of the {O(1)} error is {O( R^4 \log^{O(1)} R )} which is acceptable. By Lemma 13 with {g=1}, the contribution of the main term is {O( x \exp( - \sum_{p \leq R} \frac{1}{p} ) )}, and the claim then follows from Mertens’ theorem.

Now we prove (13). Using Rankin’s trick, it suffices to show that

\displaystyle  \sum_n \frac{\beta_R(n)^2}{n^{1+1/\log R}} \ll 1.

The left-hand side factorises as

\displaystyle  \zeta(1 + \frac{1}{\log R}) \sum_n \frac{\rho_{R,R}(n)}{n^{1+1/\log R}}.

From Lemma 13 with {g(n) := n^{-1/\log R}}, we see that

\displaystyle  \sum_n \frac{\rho_{R,R}(n)}{n^{1+1/\log R}} \ll \exp( - \sum_{p \leq R} \frac{1}{p^{1+1/\log R}} )

\displaystyle  \ll \exp( - \log \zeta(1 + \frac{1}{\log R}) )

(using Mertens’ theorem to control the error between {\log \zeta(1 + \frac{1}{\log R}) = \sum_p ( \frac{1}{p^{1+1/\log R}} + O(\frac{1}{p^2}) )} and {\sum_{p \leq R} \frac{1}{p^{1+1/\log R}}}) and the claim follows. \Box
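The appeal to Mertens’ theorem in the proof of (12) amounts to the statement that {\exp( - \sum_{p \leq R} \frac{1}{p} )} is comparable to {\frac{1}{\log R}}. The sketch below illustrates this numerically, using the standard numerical value of the Meissel-Mertens constant; the helper name is hypothetical.

```python
from math import exp, log

MEISSEL_MERTENS = 0.2614972128  # constant M in: sum_{p <= R} 1/p = log log R + M + o(1)

def prime_reciprocal_sum(R):
    """Sum of 1/p over primes p <= R, computed with a sieve of Eratosthenes."""
    sieve = bytearray([1]) * (R + 1)
    sieve[0:2] = b"\x00\x00"
    total = 0.0
    for p in range(2, R + 1):
        if sieve[p]:
            total += 1.0 / p
            sieve[p * p :: p] = bytearray(len(range(p * p, R + 1, p)))
    return total

for R in (10**3, 10**4, 10**5):
    s = prime_reciprocal_sum(R)
    # The last column should stabilise near exp(-M), reflecting exp(-s) ~ 1 / log R up to a constant.
    print(R, round(s, 5), round(log(log(R)) + MEISSEL_MERTENS, 5), round(exp(-s) * log(R), 5))
```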

We will work primarily with the cutoff

\displaystyle  \beta_{T^{10}} \beta_{T^{100}} = \rho_{T^{10}, T^{100}} * 1;

the reason for the separate scales {T^{10}} and {T^{100}} will become clearer later. The function {\rho} is supported on {[1,T^{500}]}, equals {1} at {1}, and is bounded by {O( \tau^{O(1)})}, so from the previous discussion we thus have the zero-detector inequality

\displaystyle  |\sum_{T^{100} \leq n \leq T^{1000}} \beta_{T^{10}} \beta_{T^{100}}(n) \frac{\chi(n)}{n^{\sigma+it}}| \gg 1 \ \ \ \ \ (14)

whenever {L(\sigma+it,\chi)=0} with {\chi} of modulus {q}, {|t| \leq T}, and {\sigma \geq 1 - O( \frac{A}{\log T} )}. Our objective is to show that the number of such zeroes is {O( \exp(O(A)))}.

We first control the number of zeroes that are very close together. From equation (48) of Notes 2 with {x = T^2} (say), we see that

\displaystyle  L(s,\chi) \ll \exp( O( A) ) \log( q T )

whenever {\hbox{Re}(s) \geq 1 - O( \frac{A}{\log T} )}, {\hbox{Im}(s) \ll T}, and {\chi} is non-principal of modulus {q}; also from equation (45) of Notes 2 we have

\displaystyle  1 / L( 1 + \frac{1}{\log T}, \chi ) \ll \zeta( 1 + \frac{1}{\log T} ) \ll \log T.

From Jensen’s theorem (Theorem 16 of Supplement 2), we conclude that for any given non-principal {\chi} and any {t_0 \in [-T,T]}, there are at most {O(A)} zeroes of {L(\sigma+it,\chi)} (counting multiplicity, of course) with {\sigma \geq 1 - \frac{A}{\log T}} and {|t-t_0| \leq \frac{A}{\log T}}. To prove Theorem 6, it thus suffices by the usual covering argument to establish the bound

\displaystyle  J \ll \exp( O( A) ) \ \ \ \ \ (15)

whenever one has a sequence {((\chi_j, \sigma_j+ i t_j))_{j=1}^J} of zeroes {L( \sigma_j + it_j, \chi_j )=0} with {\chi_j} a non-principal character of modulus {q}, {|t_j| \leq T}, and {\sigma_j \geq 1 - \frac{A}{\log T}}, obeying the separation condition

\displaystyle  |t_j - t_{j'}| \geq \frac{A}{\log T} \hbox{ whenever } \chi_j = \chi_{j'}, j \neq j'. \ \ \ \ \ (16)

Note from the existing grand zero-density estimate in (4) that

\displaystyle  J \ll \log^{O(1)} T. \ \ \ \ \ (17)

We write (14) for the zeroes {L(\sigma_j + it_j, \chi_j)=0} as

\displaystyle  |\sum_n f(n) \overline{g_j}(n)| \gg 1 \ \ \ \ \ (18)

for all {j=1,\dots,J}, where

\displaystyle  f(n) := \frac{1}{n} \beta_{T^{10}} \beta_{T^{100}}(n) 1_{T^{100} \leq n \leq T^{1000}}

and

\displaystyle  g_j(n) := n^{1-\sigma_j + it_j} \overline{\chi}_j(n) \psi( \frac{\log n}{\log T} )

and {\psi: {\bf R} \rightarrow {\bf R}} is a smooth function supported on {[50, 2000]} which equals {1} on {[100, 1000]}. Note that the {n^{1-\sigma_j}} term in {g_j(n)} is {O( \exp( O(A) ) )}.

We use the generalised Bessel inequality (Proposition 2 from Notes 3) with {\nu(n) := \frac{1}{n} \beta_{T^{10}}(n)^2} to conclude that

\displaystyle  \sum_{j=1}^J |\sum_n f(n) \overline{g_j(n)}|^2 \leq (\sum_{T^{100} \leq n \leq T^{1000}} \frac{\beta_{T^{100}}(n)^2}{n} )

\displaystyle  \times (\sum_{1 \leq j,j' \leq J} c_j \overline{c_{j'}} \sum_n \frac{\beta_{T^{10}}(n)^2}{n} g_j(n) \overline{g_{j'}}(n) )

where {c_1,\dots,c_J} are complex numbers with {\sum_{j=1}^J |c_j|^2 = 1}. (Strictly speaking, one needs to deal with the issue that the {g_j, g_{j'}, \nu} are not finitely supported, but there is enough absolute convergence here that this is a routine matter.) From Corollary 14 we have

\displaystyle  \sum_{T^{100} \leq n \leq T^{1000}} \frac{\beta_{T^{100}}(n)^2}{n} \ll 1

(note how the logarithmic factors cancel, which is crucial to obtaining our “log-free” estimate) and so from (18), the inequality {c_j \overline{c_{j'}} \ll |c_j|^2 + |c_{j'}|^2} and symmetry it suffices to show that

\displaystyle  \sum_{j'=1}^J |\sum_n \frac{\beta_{T^{10}}(n)^2}{n} g_j(n) \overline{g_{j'}}(n)| \ll \exp(O(A)) \ \ \ \ \ (19)

for all {1 \leq j \leq J}.

We now estimate the expression

\displaystyle  \sum_n \frac{\beta_{T^{10}}(n)^2}{n} g_j(n) \overline{g_{j'}}(n). \ \ \ \ \ (20)

We expand (20) using (8) as

\displaystyle  \sum_d \frac{\rho_{T^{10}, T^{10}}(d)}{d} \sum_m \frac{1}{m} g_j(dm) \overline{g_{j'}(dm)}. \ \ \ \ \ (21)

From (9), the factor {\rho_{T^{10},T^{10}}(d)} vanishes unless {d \leq T^{40}}, and from the support of {g_j} we see that the inner sum vanishes unless {T^{50}/d \leq m \ll T^{O(1)}}. From Exercise 44 of Notes 2, we then have

\displaystyle  \sum_m \frac{1}{m} g_j(dm) \overline{g_{j'}(dm)} = 1_{\chi_j=\chi_{j'}} \frac{\phi(q)}{q} \int x^{1-\sigma_j+1-\sigma_{j'}} x^{i(t_j-t_{j'})} \psi( \frac{\log x}{\log T} )^2\ \frac{dx}{x} \ \ \ \ \ (22)

\displaystyle  + O( \exp( O(A) ) q \frac{T}{T^{50}/d} ).

From the upper bound {\rho_{T^{10}, T^{10}}(d) \ll \tau(d)^{O(1)} 1_{d \leq T^{40}}} we have

\displaystyle  \sum_d \frac{|\rho_{T^{10}, T^{10}}(d)|}{d} \ll \log^{O(1)} T, \ \ \ \ \ (23)

so we see from (17) and (6) that the contribution of the error term {O( \exp( O(A) ) q \frac{T}{T^{50}/d} )} to (19) is acceptable. For the main term in (22), we see from Lemma 13 (with {g=1}) and Mertens’ theorem that

\displaystyle  \sum_d \frac{\rho_{T^{10},T^{10}}(d)}{d} \ll \frac{1}{\log T}

so (as the main term in (22) is independent of {d}) the remaining contribution to (19) is bounded by

\displaystyle  \ll \frac{1}{\log T} \sum_{j': \chi_j = \chi_{j'}} |\int x^{1-\sigma_j+1-\sigma_{j'}} x^{i(t_j-t_{j'})} \psi( \frac{\log x}{\log T} )^2\ \frac{dx}{x}|.

Making the change of variables {x=\exp(y \log T)}, this becomes

\displaystyle  \sum_{j': \chi_j = \chi_{j'}} |\int e^{y(1-\sigma_j+1-\sigma_{j'}) \log T} e^{iy (t_j-t_{j'}) \log T} \psi( y )^2\ \ dy|.

The integral is bounded by {\exp(O(A))}, and from two integrations by parts it is also bounded by

\displaystyle \exp(O(A)) \frac{1}{(|t_j-t_{j'}| \log T)^2}.

On the other hand, for {\chi_j = \chi_{j'}}, the {t_{j'}} are {\frac{A}{\log T}}-separated by hypothesis, and so

\displaystyle  \sum_{j': \chi_j = \chi_{j'}} \min( 1, \frac{1}{(|t_j-t_{j'}| \log T)^2} ) \ll 1,

and the claim follows.

— 2. Consequences of an exceptional zero —

In preparation for proving Theorem 8, we investigate in this section the consequences of a Landau-Siegel zero, that is to say a real character {\chi_1} of some modulus {q \geq 2} with a zero

\displaystyle  L(\beta,\chi_1) = 0

for some {\beta = 1 - \frac{\varepsilon}{\log q}} with {\varepsilon>0} small. For minor technical reasons we will assume that {q} is a multiple of {6}, so that {\chi_1(2) = \chi_1(3) = 0}; this condition can be easily established by multiplying {\chi_1} by a principal character of modulus dividing {6}. (We will not need to assume here that {\chi_1} is primitive.)

In Notes 2, we already observed that the presence of an exceptional zero was associated with a small (but positive) value of {L(1,\chi_1)}; indeed, from Lemmas 57 and 59 of Notes 2 we see that

\displaystyle  \frac{\varepsilon}{\log q} \ll L(1,\chi_1) \ll \varepsilon \log q. \ \ \ \ \ (24)

Also, from the class number formula (equation (56) from Notes 2) we have

\displaystyle  L(1,\chi_1) \gg q^{-1/2} \log^{-O(1)} q \ \ \ \ \ (25)

and thus

\displaystyle  \varepsilon \gg q^{-1/2} \log^{-O(1)} q.

For the arguments below, one could also use the slightly weaker estimates in Exercise 67 of Notes 2 or Exercise 57 of Notes 3 and still obtain comparable results. We will however not rely on Siegel’s theorem (Theorem 62 of Notes 2) in order to keep all bounds effective.

We now refine this analysis. We begin with a complexified version of Exercise 58 from Notes 2:

Exercise 15 Let {\chi_1} be a non-principal character of modulus {q \geq 2}. Let {s = \sigma+it} with {0 \leq \sigma \leq 2} and {|t| \leq T} for some {T \geq 2}. If {\sigma \neq 1}, show that

\displaystyle  \sum_{n \leq x} \frac{1*\chi_1(n)}{n^s} = \zeta(s) L(s,\chi_1) + \frac{x^{1-s}}{1-s} L(1,\chi_1) \ \ \ \ \ (26)

\displaystyle  + O( (qT)^2 x^{-1/2} \frac{x^{1-\sigma}}{1-\sigma} )

for any {x \geq 1}. (Hint: use the Dirichlet hyperbola method and Exercise 44 from Notes 2.)

If {\chi} is a non-principal character of modulus {q} with {\chi \neq \chi_1}, show that

\displaystyle  \sum_{n \leq x} \frac{1*\chi_1(n) \chi(n)}{n^s} = L(s,\chi) L(s,\chi \chi_1) + O( (qT)^2 x^{-1/2} \frac{x^{1-\sigma}}{1-\sigma} ).

For technical reasons it will be convenient to work with a completely multiplicative variant of the function {1*\chi_1}. Define the arithmetic function {g: {\bf N} \rightarrow {\bf R}} to be the completely multiplicative function such that {g(p) = 1 + \chi_1(p)} for all {p}; this is equal to {1*\chi_1} at square-free numbers, but is a bit larger at other numbers. Observe that {g} is non-negative, and has the factorisation

\displaystyle  g = 1 * \chi_1 * h

where {h = g * \mu * \chi_1 \mu} is a multiplicative function that vanishes on primes and obeys the bounds

\displaystyle 0 \leq h(p^j) = (1+\chi_1(p))^{j-2} \chi_1(p) \leq 2^{j-2}

for all {j \geq 2} and primes {p}. In particular {h} is non-negative and {h(p^j)=0} for {p=2,3}, since we assumed {\chi_1(2)=\chi_1(3)=0}. From Euler products we see that

\displaystyle  \sum_n \frac{h(n)}{n^{\frac{1}{2}+\varepsilon}} = \prod_p \sum_{j=0}^\infty \frac{h(p^j)}{p^{j (\frac{1}{2}+\varepsilon)}}

\displaystyle  \leq \prod_{p \geq 5} \left( 1 + \sum_{j=2}^\infty \frac{2^{j-2}}{p^{j (\frac{1}{2}+\varepsilon)}} \right)

\displaystyle  \ll_\varepsilon 1

for any {\varepsilon>0}, so in particular the Dirichlet series {{\mathcal D} h(s) := \sum_n \frac{h(n)}{n^s}} is analytic and uniformly bounded on {\hbox{Re}(s)>1/2+\varepsilon}. (The constraint {p \geq 5} is needed to ensure convergence of the geometric series.) This implies that

\displaystyle  1 \leq {\mathcal D} h(1) \ll 1 \ \ \ \ \ (27)

and also that

\displaystyle  \sum_{n \leq x} \frac{h(n)}{n^s} = {\mathcal D} h(s) + O_\varepsilon( x^{1/2-\sigma+\varepsilon} )

whenever {s = \sigma+it} with {\sigma \geq 1/2+\varepsilon}.

Taking Dirichlet series, we see that

\displaystyle  \sum_n \frac{g(n)}{n^s} = \zeta(s) L(s,\chi_1) {\mathcal D} h(s)

whenever {\hbox{Re}(s) > 1}; more generally, we have

\displaystyle  \sum_n \frac{g(n)\chi(n)}{n^s} = L(s,\chi) L(s,\chi\chi_1) {\mathcal D}(\chi h)(s)

for any character {\chi}. Now we look at what happens inside the critical strip:

Exercise 16 Let {\chi_1} be a real character of modulus {q} a multiple of {6}, and let {g} be as above. Let {s = \sigma+it} with {3/4 \leq \sigma \leq 2} and {|t| \leq T} for some {T \geq 2}. If {\sigma \neq 1}, show that

\displaystyle  \sum_{n \leq x} \frac{g(n)}{n^s} = \zeta(s) L(s,\chi_1) {\mathcal D} h(s) + \frac{x^{1-s}}{1-s} L(1,\chi_1) {\mathcal D} h(1) \ \ \ \ \ (28)

\displaystyle  + O_\varepsilon( (qT)^2 x^{-1/4+\varepsilon} \frac{x^{1-\sigma}}{1-\sigma} )

for any {x \geq 1} and {\varepsilon>0}.

If {\chi} is a non-principal character of modulus {q} with {\chi \neq \chi_1}, show that

\displaystyle  \sum_{n \leq x} \frac{g(n) \chi(n)}{n^s} = L(s,\chi) L(s,\chi \chi_1) {\mathcal D}(\chi h)(s) \ \ \ \ \ (29)

\displaystyle  + O_\varepsilon( (qT)^2 x^{-1/4+\varepsilon} \frac{x^{1-\sigma}}{1-\sigma} )

for any {x \geq 1} and {\varepsilon>0}.

We record a nice corollary of these estimates due to Bombieri, which asserts that the exceptional zero forces {g} to vanish (or equivalently, {\chi_1(p)} to become {-1}) on most large primes:

Lemma 17 (Bombieri’s lemma) Let {\chi_1} be a real character of modulus {q} with an exceptional zero at {\beta = 1 - \frac{\varepsilon}{\log q}} for some sufficiently small {\varepsilon>0}. Then for any {x \geq q^{20}}, we have

\displaystyle  \sum_{x < p \leq x^2} \frac{1+\chi_1(p)}{p} \ll \varepsilon \frac{\log x}{\log q}.

Informally, Bombieri’s lemma asserts that {\chi_1(p)=-1} for most primes between {q^{20}} and {q^{1/\varepsilon}}. The exponent of {20} here can be lowered substantially with a more careful analysis, but we will not do so here. For primes much larger than {q^{1/\varepsilon}}, {\chi_1} becomes equidistributed; see Exercise 23 below.

Proof: Without loss of generality we may take {q} to be a multiple of {6}. We may assume that {\varepsilon \frac{\log x}{\log q} \leq 1}, as the claim follows from Mertens’ theorem otherwise; in particular {x^{1-\beta} \ll 1}.

By (28) for {s=\beta} we have

\displaystyle  \sum_{n \leq x} \frac{g(n)}{n^\beta} = \frac{x^{1-\beta}}{1-\beta} ( L(1,\chi_1) {\mathcal D} h(1) + O( q^2 x^{-1/4} ) ) \ \ \ \ \ (30)

for any {x \geq 1}. Since {{\mathcal D} h(1) \geq 1} and {x \geq q^{20}}, we see from (25), (27) that the error term is dominated by the main term, thus

\displaystyle  \sum_{n \leq x} \frac{g(n)}{n^\beta} \gg \frac{x^{1-\beta}}{1-\beta} L(1,\chi_1) {\mathcal D} h(1).

Next, applying (30) with {x} replaced by {x^3} and subtracting, we have

\displaystyle  \sum_{x < n \leq x^3} \frac{g(n)}{n^\beta} = \frac{x^{1-\beta}}{1-\beta} ( (x^{2(1-\beta)}-1) L(1,\chi_1) {\mathcal D} h(1) + O( q^2 x^{-1/4} ) ).

As {x^{1-\beta} \ll 1}, we have {x^{2(1-\beta)}-1 \ll \varepsilon \frac{\log x}{\log q}} by Taylor expansion. As before, the error term can be bounded by the main term and so

\displaystyle  \sum_{x < n \leq x^3} \frac{g(n)}{n^\beta} \ll \varepsilon \frac{\log x}{\log q} \frac{x^{1-\beta}}{1-\beta} L(1,\chi_1) {\mathcal D} h(1).

Since {g} is non-negative and completely multiplicative, one has

\displaystyle  (\sum_{n \leq x} \frac{g(n)}{n^\beta}) \times (\sum_{x < p \leq x^2} \frac{g(p)}{p^\beta}) \leq \sum_{x < n \leq x^3} \frac{g(n)}{n^\beta}

and thus (since {g(p)=1+\chi_1(p)})

\displaystyle  \sum_{x < p \leq x^2} \frac{1+\chi_1(p)}{p^\beta} \ll \varepsilon \frac{\log x}{\log q}.

Since {x^{1-\beta} \ll 1}, we have {\frac{1}{p} \ll \frac{1}{p^\beta}}, and the claim follows. \Box
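The multiplicativity step in the above proof can be tested numerically. The following Python sketch is only an illustration under stated assumptions: it uses a small stand-in real character (the Legendre symbol modulo {7}, set to zero at multiples of {2} and {3}, hence of modulus {42}), which has no exceptional zero, and the sample values of {x} and {\beta} are arbitrary. It evaluates both sides of the inequality {(\sum_{n \leq x} \frac{g(n)}{n^\beta}) (\sum_{x < p \leq x^2} \frac{g(p)}{p^\beta}) \leq \sum_{x < n \leq x^3} \frac{g(n)}{n^\beta}} for the completely multiplicative function {g} with {g(p) = 1+\chi_1(p)}.

```python
def chi(n: int) -> int:
    """Stand-in real character of modulus 42 (an assumption for illustration):
    the Legendre symbol mod 7, set to zero on multiples of 2 and 3."""
    if n % 2 == 0 or n % 3 == 0:
        return 0
    r = pow(n % 7, 3, 7)  # Euler's criterion: n^((7-1)/2) mod 7
    return -1 if r == 6 else r


def g(n: int) -> int:
    """Completely multiplicative function with g(p) = 1 + chi(p)."""
    value, p = 1, 2
    while p * p <= n:
        while n % p == 0:
            value *= 1 + chi(p)
            n //= p
        p += 1
    if n > 1:  # leftover prime factor
        value *= 1 + chi(n)
    return value


def is_prime(n: int) -> bool:
    """Primality test by trial division."""
    return n >= 2 and all(n % d for d in range(2, int(n ** 0.5) + 1))


def product_inequality_sides(x: int, beta: float) -> tuple[float, float]:
    """Return (lhs, rhs), where lhs is the product of the sum over n <= x and
    the sum over primes x < p <= x^2, and rhs is the sum over x < n <= x^3."""
    small = sum(g(n) / n ** beta for n in range(1, x + 1))
    primes = sum(g(p) / p ** beta for p in range(x + 1, x * x + 1) if is_prime(p))
    large = sum(g(n) / n ** beta for n in range(x + 1, x ** 3 + 1))
    return small * primes, large


if __name__ == "__main__":
    print(product_inequality_sides(6, 0.9))
```

The inequality holds because each product {np} with {n \leq x < p \leq x^2} lies in {(x,x^3]}, and determines {p} as the unique prime factor of {np} exceeding {x}; the remaining terms on the right-hand side are non-negative.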

Exercise 18 With the hypotheses of Bombieri’s lemma, show that

\displaystyle  \sum_{q^{20/k} < p \leq q^{20}} \frac{1+\chi_1(p)}{p} \ll_k \varepsilon^{1/k}

for any natural number {k}.

Now we can give a more precise version of (24):

Proposition 19 Let {\chi_1} be a real character of modulus {q} with an exceptional zero at {\beta = 1 - \frac{\varepsilon}{\log q}} for some sufficiently small {\varepsilon>0}. Then

\displaystyle  L(1,\chi_1) \asymp \frac{\varepsilon}{\log q} \exp( \sum_{p \leq T} \frac{1+\chi_1(p)}{p} )

for any {T \geq 1} with {\log q \ll \log T \ll \frac{\log q}{\varepsilon}}.

Observe that (24) is a corollary of the {T=q} case of this proposition thanks to Mertens’ theorem and the trivial bounds {0 \leq 1+\chi_1(p) \leq 2}. We thus see from this proposition and Bombieri’s lemma that the exceptional zero {\beta} controls {\chi_1} at primes larger than {q^{20}}, but that {L(1,\chi_1)} is additionally sensitive to the values of {\chi_1} at primes below this range. For an even more precise formula for {L(1,\chi_1)}, see this paper of Goldfeld and Schinzel, or Exercise 24 below.

Proof: By Bombieri’s lemma and Mertens’ theorem, it suffices to prove the asymptotic for {T=q}.

We begin with the upper bound

\displaystyle  L(1,\chi_1) \ll \frac{\varepsilon}{\log q} \exp( \sum_{p \leq q} \frac{1+\chi_1(p)}{p} ).

Applying (26) with {s = 1 + \frac{\varepsilon}{\log q}} and {x=q^{10}} we have

\displaystyle  \sum_{n \leq q^{10}} \frac{1*\chi_1(n)}{n^s} = \zeta(s) L(s,\chi_1) - \frac{\exp(-10\varepsilon)}{s-1} ( L(1,\chi_1) + O( q^{-3} ) ). \ \ \ \ \ (31)

The left-hand side is non-negative and {\zeta(s) \ll \frac{1}{s-1}}, so we conclude (using (25)) that

\displaystyle  L(1,\chi_1) \ll L(s,\chi_1).

From Euler products and Mertens’ theorem we have

\displaystyle  L(s,\chi_1) \asymp \exp( \sum_p \frac{\chi_1(p)}{p^{1+\varepsilon/\log q}} )

\displaystyle  \asymp \frac{\varepsilon}{\log q} \exp( \sum_p \frac{1+\chi_1(p)}{p^{1+\varepsilon/\log q}} ).

But from Lemma 17 and Mertens’ theorem we see that

\displaystyle  \sum_p \frac{1+\chi_1(p)}{p^{1+\varepsilon/\log q}} = \sum_{p \leq q} \frac{g(p)}{p} + O(1)

and the claim follows.

Now we establish the matching lower bound

\displaystyle  L(1,\chi_1) \gg \frac{\varepsilon}{\log q} \exp( \sum_{p \leq q} \frac{1+\chi_1(p)}{p} ).

Applying (26) with {s} replaced by {\beta = 1 - \frac{\varepsilon}{\log q}} and {x=q^{10}}, and using {L(\beta,\chi_1)=0}, we have

\displaystyle  \sum_{n \leq q^{10}} \frac{1*\chi_1(n)}{n^\beta} = \frac{\exp(10\varepsilon)}{1-\beta} ( L(1,\chi_1) + O( q^{-3} ) ).

Since {s > \beta}, we have {\frac{1}{n^s} \leq \frac{1}{n^\beta}} for all {n \geq 1}, and thus by (25)

\displaystyle  \sum_{n \leq q^{10}} \frac{1*\chi_1(n)}{n^s} \ll \frac{1}{s-1} L(1,\chi_1).

Inserting this into (31) and using (25) and {\zeta(s) \gg \frac{1}{s-1}} we conclude that

\displaystyle  L(s,\chi_1) \ll L(1,\chi_1)

and the claim then follows from the preceding calculations. \Box

Remark 20 One particularly striking consequence of an exceptional zero {\beta} of {L(s,\chi_1)} is that the spacing of zeroes of other {L}-functions becomes extremely regular; roughly speaking, for most other characters {\chi} whose conductor {q} is somewhat (but not too much) larger than the conductor {q_1} of {\chi_1}, the zeroes of {L(s,\chi) L(s,\chi\chi_1)} (at moderate height) mostly lie on the critical line and are spaced in approximate arithmetic progression; this “alternative hypothesis” is in contradiction to the pair correlation conjecture discussed in Section 4 of Supplement 4. This phenomenon was first discovered by Montgomery and Weinberger and can roughly be explained as follows. By an approximate functional equation similar to Exercise 54 of Supplement 3, one can approximately write {L(s,\chi) L(s,\chi\chi_1)} as the sum of {\sum_{n \lessapprox \sqrt{qq_1}} \frac{\chi (1*\chi_1)(n)}{n^s}} plus {\sum_{n \lessapprox \sqrt{qq_1}} \frac{\bar{\chi} (1*\chi_1)(n)}{n^{1-s}}} times a gamma factor which oscillates like {(q|t|)^{it}} when {s = 1/2+it}. The smallness of {1*\chi_1(n)} on average for medium-sized {n} (as suggested for instance by Bombieri’s lemma) indicates that these sums should be well approximated by much shorter sums, which oscillate quite slowly in {t}. This gives an approximation to {L(1/2+it,\chi) L(1/2+it,\chi\chi_1)} that is of the form {F(t) + (q|t|)^{it} G(t)} for slowly varying {F,G}, which can then be used to place the zeroes of this function in approximate arithmetic progression on the real line.

— 3. The Deuring-Heilbronn repulsion phenomenon —

We now prove Theorem 8. Let {q \geq 2} be such that there is an exceptional zero {\beta = 1 - \frac{\varepsilon}{\log q}} with {\varepsilon} small, associated to some quadratic character {\chi_1} of modulus {q}:

\displaystyle  L( \beta, \chi_1 ) = 0.

From the class number bound (equation (56) of Notes 2; one could also use Exercise 67 of Notes 2 for a similar bound) we have

\displaystyle  L(1,\chi_1) \gg q^{-1/2} \log^{-O(1)} q \ \ \ \ \ (32)

and

\displaystyle  \frac{1}{\varepsilon} \ll q^{1/2} \log^{O(1)} q. \ \ \ \ \ (33)

Let {T \geq 2}, let {\chi} be a character of modulus {q} (possibly equal to {\chi_1} or the principal character), and suppose we have

\displaystyle  L( \sigma+it, \chi ) = 0 \ \ \ \ \ (34)

for some {|t| \leq T}, with {\sigma+it \neq \beta}. Our task is to show that

\displaystyle  1 - \sigma \gg \log\frac{1}{\varepsilon} \frac{1}{\log(qT)}. \ \ \ \ \ (35)

We may assume that

\displaystyle  \sigma > 0.99 \ \ \ \ \ (36)

(say), since the claim is trivial otherwise. By multiplying by the principal character of modulus {6} if necessary, we may assume as before that {q} is a multiple of {6}, so that we can utilise the multiplicative function {g} from the previous section. By enlarging {T}, we may assume as in Section 1 that

\displaystyle  T \geq q^{10}; \ \ \ \ \ (37)

we may also assume that {T} is larger than any specified absolute constant. From the classical zero-free region and the Landau-Page theorem we have

\displaystyle  1 - \sigma \gg \frac{1}{\log T}. \ \ \ \ \ (38)

The task (35) is then equivalent to showing that

\displaystyle  T^{1-\sigma} \gg \varepsilon^{-c} \ \ \ \ \ (39)

for some absolute constant {c>0}.

We recall the sieve cutoffs

\displaystyle  \beta_R(n) = \sum_{d|n} \mu(d) f( \frac{\log d}{\log R} )

from Section 1, which were used in the zero detector. The main difference is that we will “twist” the polynomial by the completely multiplicative function {g}:

Proposition 21 (Zero-detecting polynomial) Let the notation and assumptions be as above.

  • (i) If {\chi} is not equal to {\chi_1} or the principal character, then

    \displaystyle  \left|\sum_{T^{100} \leq n \leq T^{1000}} \beta_{T^{10}}(n) \beta_{T^{100}}(n) \frac{g(n) \chi(n)}{n^s}\right| \gg 1. \ \ \ \ \ (40)

  • (ii) If {\chi} is equal to {\chi_1} or the principal character, then

    \displaystyle  \left|\sum_{T^{100} \leq n \leq T^{1000}} \beta_{T^{10}}(n) \beta_{T^{100}}(n) \frac{g(n)}{n^s}\right| \gg 1 - O( \varepsilon T^{O(1-\sigma)} ). \ \ \ \ \ (41)

Proof: First suppose that {\chi} is not equal to {\chi_1} or the principal character. Since {L(s,\chi)=0}, we see from (29), (37), (36), (38) that

\displaystyle  \sum_{n \leq x} \frac{g(n) \chi(n)}{n^s} \ll T^{-10}

(say) for any {T^{500} \leq x \leq T^{1000}}. In particular, as {\rho} is supported on {[1,T^{500}]} and one has {\rho(n) g(n) \ll n^{o(1)}} from the divisor bound, one has

\displaystyle  \sum_{n \leq T^{1000}} (\rho*1)(n) \frac{g(n) \chi(n)}{n^s} \ll \sum_d \frac{|\rho(d)| g(d)}{d^\sigma} T^{-10}

\displaystyle  \ll T^{-1}

(say), thanks to (36). Since {\rho*1(n) = \beta_{T^{10}} \beta_{T^{100}}(n)} equals {1_{n=1}} for {n < T^{100}}, we thus conclude (40) since {T} is assumed to be large.

Now suppose that {\chi} is {\chi_1} or the principal character, so that {\zeta(s) L(s,\chi_1)=0}. From (28), (37), (36), (38) we then have

\displaystyle  \sum_{n \leq x} \frac{g(n)}{n^s} = \frac{x^{1-s}}{1-s} L(1,\chi_1) {\mathcal D} h(1) + O( T^{-10} )

for {T^{500} \leq x \leq T^{1000}}. By a similar calculation to before, we have

\displaystyle  \sum_{n \leq T^{1000}} (\rho*1)(n) \frac{g(n)}{n^s}

\displaystyle  = \sum_d \frac{\rho(d) g(d)}{d^s} \frac{(T^{1000}/d)^{1-s}}{1-s} L(1,\chi_1) {\mathcal D} h(1) + O( T^{-1} )

\displaystyle  = \frac{T^{1000(1-s)}}{1-s} {\mathcal D} h(1) L(1,\chi_1) \sum_d \frac{\rho(d) g(d)}{d} + O( T^{-1} )

\displaystyle  \ll T^{O(1-\sigma)} \varepsilon \exp\left( \sum_{p \leq T} \frac{g(p)}{p} \right) \left|\sum_d \frac{\rho(d) g(d)}{d}\right| + T^{-1}

where we have used (38) and Proposition 19 in the last line. The claim (41) then follows from Lemma 13. \Box
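The proof above uses the fact that the product of the sieve weights agrees with {1_{n=1}} for small {n}. As an illustration only, here is a Python sketch of {\beta_R(n)} with a hypothetical piecewise-linear cutoff (equal to {1} on {[0,1]} and vanishing on {[2,\infty)}); this is a stand-in chosen for simplicity and is not the cutoff {f} from Section 1, which is not reproduced here. The function names are likewise illustrative.

```python
import math


def mobius(n: int) -> int:
    """Moebius function via trial division."""
    result, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:  # n had a square factor
                return 0
            result = -result
        p += 1
    if n > 1:  # leftover prime factor
        result = -result
    return result


def cutoff(u: float) -> float:
    """Hypothetical stand-in for f: 1 on [0,1], linear on [1,2], 0 beyond."""
    if u <= 1.0:
        return 1.0
    return max(0.0, 2.0 - u)


def sieve_weight(n: int, R: float) -> float:
    """beta_R(n) = sum_{d | n} mu(d) f(log d / log R)."""
    return sum(
        mobius(d) * cutoff(math.log(d) / math.log(R))
        for d in range(1, n + 1)
        if n % d == 0
    )


if __name__ == "__main__":
    print([sieve_weight(n, 10.0) for n in range(1, 11)])
```

With this choice of cutoff, every divisor {d} of a number {n \leq R} receives weight {1}, so {\beta_R(n)} collapses to {\sum_{d|n} \mu(d) = 1_{n=1}} in that range.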

Using the estimates from the previous section, we can establish the following bound:

Proposition 22 We have

\displaystyle  \sum_{T^{100} \leq n \leq T^{1000}} |\beta_{T^{10}}(n)| |\beta_{T^{100}}(n)| \frac{g(n)}{n} \ll \varepsilon^{1/2}.

The point here is that the sieve weights {\beta_{T^{10}}} and {\beta_{T^{100}}} are morally restricting to almost primes, and that {g} should be small on such numbers by Bombieri’s lemma. Assuming this proposition, we conclude that the left-hand sides of (40) or (41) are {O( \varepsilon^{1/2} T^{O(1-\sigma)})}, and (39) follows.

Proof: By the Cauchy-Schwarz inequality it suffices to show that

\displaystyle  \sum_{T^{100} \leq n \leq T^{1000}} \beta_{T^{10}}(n)^2 \frac{g(n)}{n} \ll \varepsilon \ \ \ \ \ (42)

and

\displaystyle  \sum_{T^{100} \leq n \leq T^{1000}} \beta_{T^{100}}(n)^2 \frac{g(n)}{n} \ll 1. \ \ \ \ \ (43)

We begin with the second bound (43), which we establish by quite crude estimates. By a Fourier expansion we can write

\displaystyle  f(u) = \int_{\bf R} e^{-itu} F(t)\ dt \ \ \ \ \ (44)

for some rapidly decreasing function {F: {\bf R} \rightarrow {\bf C}}, and thus

\displaystyle  \beta_R(n) = \int_{\bf R} \sum_{d|n} \mu(d) d^{-it/\log R} F(t)\ dt \ \ \ \ \ (45)

which factorises as

\displaystyle  \beta_R(n) = \int_{\bf R} \prod_{p|n} (1 - p^{-it/\log R}) F(t)\ dt.

Bounding {1 - p^{-it/\log R} \ll \min( |t| \frac{\log p}{\log R}, 1 )}, we thus have

\displaystyle  \beta_R(n) \ll_A \int_{\bf R} \prod_{p|n} O( \min( |t| \frac{\log p}{\log R}, 1 ) ) (1+|t|)^{-A}\ dt

for any {A}. Squaring and using Cauchy-Schwarz, we conclude that

\displaystyle  \beta_R(n)^2 \ll_A \int_{\bf R} \prod_{p|n} O( \min( |t| \frac{\log p}{\log R}, 1 )^2 ) (1+|t|)^{-A}\ dt

for any {A}. In particular, for {n \leq T^{1000}}, we have

\displaystyle  \beta_{T^{100}}(n)^2 \frac{g(n)}{n} \ll_A \int_{\bf R} \prod_{p \leq T^{1000}: p|n} \frac{2^{\hbox{ord}_p(n)}}{p^{\hbox{ord}_p(n)}} O( \min( |t| \frac{\log p}{\log T}, 1 )^2 ) (1+|t|)^{-A}\ dt,

and so we can bound the left-hand side of (43) by

\displaystyle  \ll_A \int_{\bf R} \prod_{p \leq T^{1000}} (1 + O( \frac{\min( |t| \frac{\log p}{\log T}, 1 )}{p} )) (1+|t|)^{-A}\ dt,

which we bound by

\displaystyle  \ll_A \int_{\bf R} \exp( \sum_{p \leq T^{1000}} O( \frac{\min( |t| \frac{\log p}{\log T}, 1 )}{p} ) ) (1+|t|)^{-A}\ dt.

By Mertens’ theorem we have

\displaystyle  \sum_{p \leq T^{1000}} \frac{\min( |t| \frac{\log p}{\log T}, 1 )}{p} \ll \log(2+|t|),

and the claim follows by taking {A} large enough.
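The factorisation of the divisor sum in (45) into a product over primes, and the elementary bound on each factor, can be checked numerically. The Python sketch below is illustrative only: the parameter `tau` stands for {t/\log R}, the function names are hypothetical, and the sample values are arbitrary.

```python
import cmath
import math


def mobius(n: int) -> int:
    """Moebius function via trial division."""
    result, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:  # n had a square factor
                return 0
            result = -result
        p += 1
    if n > 1:
        result = -result
    return result


def prime_divisors(n: int) -> list[int]:
    """Distinct prime divisors of n, in increasing order."""
    primes, p = [], 2
    while p * p <= n:
        if n % p == 0:
            primes.append(p)
            while n % p == 0:
                n //= p
        p += 1
    if n > 1:
        primes.append(n)
    return primes


def divisor_sum(n: int, tau: float) -> complex:
    """sum_{d | n} mu(d) d^{-i tau}."""
    return sum(
        mobius(d) * cmath.exp(-1j * tau * math.log(d))
        for d in range(1, n + 1)
        if n % d == 0
    )


def euler_product(n: int, tau: float) -> complex:
    """prod_{p | n} (1 - p^{-i tau})."""
    result = 1 + 0j
    for p in prime_divisors(n):
        result *= 1 - cmath.exp(-1j * tau * math.log(p))
    return result


def factor_bound_holds(p: int, tau: float) -> bool:
    """Check |1 - p^{-i tau}| <= min(|tau| log p, 2), up to rounding."""
    size = abs(1 - cmath.exp(-1j * tau * math.log(p)))
    return size <= min(abs(tau) * math.log(p), 2.0) + 1e-12
```

The bound in `factor_bound_holds` is the identity {|1 - e^{-i\theta}| = 2|\sin(\theta/2)|} combined with {|\sin u| \leq \min(|u|,1)}.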

Now we establish (42). By dyadic decomposition it suffices to show that

\displaystyle  \sum_{n \leq x} \beta_{T^{10}}(n)^2 g(n) \ll \frac{\varepsilon}{\log T} x \ \ \ \ \ (46)

for all {T^{100} \leq x \leq T^{1000}}. The left-hand side may be written as

\displaystyle  \sum_d \rho_{T^{10},T^{10}}(d) g(d) \sum_{n \leq x/d} g(n).

From the hyperbola method we see that

\displaystyle  \sum_{n \leq y} 1*\chi_1(n) = L(1,\chi_1) y + O( q y^{1/2} )

for any {y \geq 1}, and thus

\displaystyle  \sum_{n \leq x/d} g(n) = L(1,\chi_1) \frac{x}{d} {\mathcal D} h(1) + O( q x^{1/2} d^{-1/2} ).

Since {\rho_{T^{10},T^{10}}(d)} is supported on {[0,T^{40}]} and is bounded by {\tau(d)^{O(1)}}, the contribution of the {O( q x^{1/2} d^{-1/2} )} is easily seen to be acceptable (using (33), (37)). The contribution of the main term is

\displaystyle  L(1,\chi_1) {\mathcal D} h(1) \sum_d \rho_{T^{10},T^{10}}(d) \frac{g(d)}{d}

but this is acceptable by Lemma 13, (27), and Proposition 19. \Box
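The hyperbola method estimate used in this proof can be illustrated on a classical example. The Python sketch below is only a sanity check under an assumption of convenience: in place of {\chi_1} it takes the non-principal character {\chi} of modulus {4} (which has {L(1,\chi) = \pi/4} and no exceptional zero), and it evaluates the sum through the identity {\sum_{n \leq y} 1*\chi(n) = \sum_{d \leq y} \chi(d) \lfloor y/d \rfloor}. The function names are illustrative.

```python
import math


def chi4(n: int) -> int:
    """Non-principal Dirichlet character of modulus 4."""
    return (0, 1, 0, -1)[n % 4]


def convolution_sum(y: int) -> int:
    """sum_{n <= y} (1 * chi4)(n), computed as sum_{d <= y} chi4(d) * floor(y / d)."""
    return sum(chi4(d) * (y // d) for d in range(1, y + 1))


def hyperbola_error(y: int) -> float:
    """Difference between the convolution sum and the main term L(1, chi4) * y."""
    return convolution_sum(y) - (math.pi / 4) * y


if __name__ == "__main__":
    for y in (10, 100, 10_000):
        # Compare the error against q * sqrt(y) with q = 4.
        print(y, hyperbola_error(y), 4 * math.sqrt(y))
```

In this example the sum counts one quarter of the lattice points (other than the origin) in the disk of radius {\sqrt{y}}, so the estimate reduces to the elementary bound in the Gauss circle problem.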

The proof of Theorem 8 is now complete.

Exercise 23 Let {\chi_1} be a real quadratic character of modulus {q} with a zero at {\beta = 1 - \frac{\varepsilon}{\log q}} for some small {\varepsilon > 0}. Show that

\displaystyle  \sum_{p \leq x} \chi_1(p) \log p = - \frac{x^\beta}{\beta} + O( \exp( - c \log \frac{1}{\varepsilon} \frac{\log x}{\log q} ) x )

and hence

\displaystyle  \sum_{p \leq x} \chi_1(p) \log p \ll \exp( - \varepsilon \frac{\log x}{\log q} ) x

for all {x \geq 1} and some absolute constant {c>0}. (Hint: use (3) and the explicit formula.) Roughly speaking, this exercise asserts that {\chi_1(p)} is equidistributed for primes {p} with {\log p} much larger than {\frac{1}{\varepsilon} \log q}.

Exercise 24 Let {\chi_1} be a real quadratic character of modulus {q} with a zero at {1 - \frac{\varepsilon}{\log q}} for some small {\varepsilon > 0}. Show that

\displaystyle  L(1,\chi_1) = (1 + O(\varepsilon)) \frac{\varepsilon}{\log q} \prod_{p \leq q^C} (1-\frac{1}{p})^{-1}(1-\frac{\chi_1(p)}{p})^{-1}

if {C} is a sufficiently large absolute constant. (Hint: use Exercise 81, Lemma 40, and Theorem 41 of Notes 1, as well as Exercise 23.)

Exercise 25 (Bombieri’s zero density estimate) Under the hypotheses of Theorem 8, establish the estimate (5). (Hint: repeat the arguments in Section 1, but now “twisted” by {g}.)