In the previous set of notes, we studied upper bounds on sums such as {|\sum_{N \leq n \leq N+M} n^{-it}|} for {1 \leq M \leq N} that were valid for all {t} in a given range, such as {[T,2T]}; this led in turn to upper bounds on the Riemann zeta function {\zeta(\sigma+it)} for {t} in the same range, and for various choices of {\sigma}. While some improvement over the trivial bound of {O(N)} was obtained by these methods, we did not get close to the conjectural bound of {O( N^{1/2+o(1)})} that one expects from pseudorandomness heuristics (assuming that {T} is not too large compared with {N}, e.g. {T = O(N^{O(1)})}).

However, it turns out that one can get much better bounds if one settles for estimating sums such as {|\sum_{N \leq n \leq N+M} n^{-it}|}, or more generally finite Dirichlet series (also known as Dirichlet polynomials) such as {|\sum_n a_n n^{-it}|}, for most values of {t} in a given range such as {[T,2T]}. Equivalently, we will be able to get some control on the large values of such Dirichlet polynomials, in the sense that we can control the set of {t} for which {|\sum_n a_n n^{-it}|} exceeds a certain threshold, even if we cannot show that this set is empty. These large value theorems are often closely tied with estimates for mean values such as {\frac{1}{T}\int_T^{2T} |\sum_n a_n n^{-it}|^{2k}\ dt} of a Dirichlet series; these latter estimates are thus known as mean value theorems for Dirichlet series. Our approach to these theorems will follow the same sort of methods used in Notes 3, in particular relying on the generalised Bessel inequality from those notes.

Our main application of the large value theorems for Dirichlet polynomials will be to control the number of zeroes of the Riemann zeta function {\zeta(s)} (or the Dirichlet {L}-functions {L(s,\chi)}) in various rectangles of the form {\{ \sigma+it: \sigma \geq \alpha, |t| \leq T \}} for various {T > 1} and {1/2 < \alpha < 1}. These rectangles will be larger than the zero-free regions for which we can exclude zeroes completely, but we will often be able to limit the number of zeroes in such rectangles to be quite small. For instance, we will be able to show the following weak form of the Riemann hypothesis: as {T \rightarrow \infty}, a proportion {1-o(1)} of zeroes of the Riemann zeta function in the critical strip with {|\hbox{Im}(s)| \leq T} will have real part {1/2+o(1)}. Related to this, the number of zeroes with {|\hbox{Im}(s)| \leq T} and {\hbox{Re}(s) \geq \alpha} can be shown to be bounded by {O( T^{O(1-\alpha)+o(1)} )} as {T \rightarrow \infty} for any {1/2 < \alpha < 1}.

In the next set of notes we will use refined versions of these theorems to establish Linnik’s theorem on the least prime in an arithmetic progression.

Our presentation here is broadly based on Chapters 9 and 10 in Iwaniec and Kowalski, who give a number of more sophisticated large value theorems than the ones discussed here.

— 1. Large values of Dirichlet polynomials —

Our basic estimate on large values is the following {L^2} estimate, due to Montgomery and Vaughan.

Theorem 1 ({L^2} estimate on large values) Let {N \geq 1} and {T > 0}, and let {(a_n)_{N/2 \leq n \leq N}} be a sequence of complex numbers. Let {t_1,\dots,t_J} be a {1}-separated set of real numbers in an interval of length at most {T}, thus {|t_j - t_{j'}| \geq 1} for all {1 \leq j < j' \leq J}. Let {\sigma_1,\dots,\sigma_J = O(1)} be real numbers. Then

\displaystyle  \sum_{j=1}^J |N^{\sigma_j} \sum_{N/2 \leq n \leq N} a_n n^{-\sigma_j-it_j}|^2 \ll (T + N) \sum_{N/2 \leq n \leq N} |a_n|^2. \ \ \ \ \ (1)

This estimate is closely analogous to the analytic large sieve inequality (Proposition 6 from Notes 3). The factor {N^{\sigma_j} n^{-\sigma_j}} is needed for technical reasons, but should be ignored at a first reading since it is comparable to one on the range {N/2 \leq n \leq N}. The bound (1) can be compared against the Cauchy-Schwarz bound

\displaystyle  |N^{\sigma_j} \sum_{N/2 \leq n \leq N} a_n n^{-\sigma_j-it_j}|^2 \ll N \sum_{N/2 \leq n \leq N} |a_n|^2

and against the pseudorandomness heuristic that {|\sum_{N/2 \leq n \leq N} a_n n^{-it_j}|^2} should be roughly of size {\sum_{N/2 \leq n \leq N} |a_n|^2} for “typical” {t_j}.
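As a purely numerical illustration of (1) (not part of any proof; the parameters, coefficients, and sample points below are arbitrary choices), one can compare the two sides for unimodular coefficients and a {1}-separated family of frequencies:

```python
import cmath
import math
import random

random.seed(0)

N, T = 64, 200.0
ns = list(range(N // 2, N + 1))                 # the range N/2 <= n <= N
a = [cmath.exp(2j * math.pi * random.random()) for _ in ns]   # unimodular a_n
ts = [2.5 * j for j in range(80)]               # 1-separated points in [0, T]

def D(t):
    # the Dirichlet polynomial sum_n a_n n^{-it} (taking all sigma_j = 0)
    return sum(c * cmath.exp(-1j * t * math.log(n)) for c, n in zip(a, ns))

lhs = sum(abs(D(t)) ** 2 for t in ts)
sum_sq = sum(abs(c) ** 2 for c in a)            # here just the number of terms
rhs = (T + N) * sum_sq
print(lhs / rhs)    # an empirical implied constant for (1)
```

For pseudorandom data of this type, the printed ratio tends to be well below {1}, in line with the heuristic just discussed; the trivial Cauchy-Schwarz bound would only give a ratio of size comparable to {J N / (T+N)}.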

Proof: We first observe that to prove the theorem, it suffices to do so in the case {T = N/100}. Indeed, if {T < N/100}, one can simply increase {T} until it reaches {N/100}, which does not significantly affect the factor {T+N}; conversely, if {T > N/100}, one can partition {t_1,\dots,t_J} into {O( T/N )} subsets, each of diameter at most {N/100}, and the claim then follows from the triangle inequality.

Without loss of generality we may assume the {t_j} are in increasing order, so in particular {|t_j - t_{j'}| \geq |j-j'|} for any {j,j'}.

Next, we apply the generalised Bessel inequality (Proposition 2 of Notes 3), using the weight {\nu(n)} defined as {\nu(n) := \psi(n/N)}, where {\psi: {\bf R} \rightarrow {\bf R}} is a smooth function supported on {[1/4,4]} which equals one on {[1/2,1]}. This lets us bound the left-hand side of (1) by

\displaystyle  (\sum_{N/2 \leq n \leq N} \frac{|a_n|^2}{\nu(n)}) (\sum_{j=1}^J \sum_{j'=1}^J c_j \overline{c_{j'}} \sum_n \nu(n) (n/N)^{-\sigma_j-\sigma_{j'}} n^{i(t_j-t_{j'})})

for some coefficients {c_1,\dots,c_J} with {\sum_{j=1}^J |c_j|^2 = 1}.

By choice of {\nu}, we have

\displaystyle  \sum_{N/2 \leq n \leq N} \frac{|a_n|^2}{\nu(n)} = \sum_{N/2 \leq n \leq N} |a_n|^2,

so it will suffice to show that

\displaystyle  \sum_{j=1}^J \sum_{j'=1}^J c_j \overline{c_{j'}} \sum_n \nu(n) (n/N)^{-\sigma_j-\sigma_{j'}} n^{i(t_j-t_{j'})} \ll N.

We now focus on the expression

\displaystyle  \sum_n \nu(n) (n/N)^{-\sigma_j-\sigma_{j'}} n^{i(t_j-t_{j'})} \ \ \ \ \ (2)

\displaystyle  = \sum_n \psi(n/N) (n/N)^{-\sigma_j-\sigma_{j'}} e( \frac{t_j - t_{j'}}{2\pi} \log n ).

When {j=j'}, we can bound this by {O(N)}, which gives an acceptable contribution; now we consider the case {j \neq j'}. One could estimate (2) in this case using Proposition 6 of Notes 5 and summation by parts to get a bound of {O( \frac{N}{|t_j-t_{j'}|} ) = O( \frac{N}{|j-j'|})} here, but one can do better (saving a logarithmic factor in the final estimate) by exploiting the smooth nature of {\psi}. Namely, by using the Poisson summation formula (Theorem 34 of Supplement 2), we can rewrite (2) as

\displaystyle  \sum_m \int_{\bf R} \psi( x/N) (x/N)^{-\sigma_j-\sigma_{j'}} e( \frac{t_j - t_{j'}}{2\pi} \log x - m x )\ dx

which we rescale as

\displaystyle  N e( \frac{t_j - t_{j'}}{2\pi} \log N) \sum_m \int_{\bf R} \frac{\psi(x)}{x^{\sigma_j+\sigma_{j'}}} e( \frac{t_j - t_{j'}}{2\pi} \log x - m N x )\ dx.

Since {|t_j - t_{j'}| \leq T = \frac{N}{100}}, one can check that the derivative of the phase {x \mapsto \frac{t_j - t_{j'}}{2\pi} \log x - m N x} is {\gg |t_j-t_{j'}|} on the support of {\psi} when {m=0}, and {\gg |m|N} when {m} is non-zero, while the second derivative is {O( |t_j-t_{j'}| )}. Meanwhile, the function {\frac{\psi(x)}{x^{\sigma_j+\sigma_{j'}}}} is compactly supported and has the first two derivatives bounded by {O(1)}. From this and two integrations by parts we obtain the bounds

\displaystyle  \int_{\bf R} \frac{\psi(x)}{x^{\sigma_j+\sigma_{j'}}} e( \frac{t_j - t_{j'}}{2\pi} \log x - m N x )\ dx \ll \frac{1}{|t_j-t_{j'}|^2}

when {m=0} and

\displaystyle  \int_{\bf R} \frac{\psi(x)}{x^{\sigma_j+\sigma_{j'}}} e( \frac{t_j - t_{j'}}{2\pi} \log x - m N x )\ dx \ll \frac{1}{m^2 N^2}

when {m \neq 0}, thus on summing

\displaystyle  \sum_n \nu(n) (n/N)^{-\sigma_j-\sigma_{j'}} n^{i(t_j-t_{j'})} \ll \frac{N}{|t_j-t_{j'}|^2} \ll \frac{N}{|j-j'|^2}.

We thus need to show that

\displaystyle  \sum_{1 \leq j,j' \leq J: j \neq j'} \frac{c_j \overline{c_{j'}}}{|j-j'|^2} \ll 1.

But if one bounds {|c_j \overline{c_{j'}}| \leq |c_j|^2 + |c_{j'}|^2} we obtain the claim. \Box
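This last step uses only the absolute convergence of {\sum_m \frac{1}{m^2}}: bounding {|c_j \overline{c_{j'}}|} by {\frac{1}{2}(|c_j|^2+|c_{j'}|^2)} gives the explicit uniform bound {2\zeta(2) = \pi^2/3} over all unit vectors {c}. A quick numerical experiment (illustration only, with randomly generated unit vectors) is consistent with this:

```python
import math
import random

random.seed(1)

def offdiag(c):
    # sum over j != j' of |c_j| |c_{j'}| / |j - j'|^2 for a unit vector c
    J = len(c)
    return sum(abs(c[j]) * abs(c[k]) / (j - k) ** 2
               for j in range(J) for k in range(J) if j != k)

worst = 0.0
for _ in range(20):
    c = [random.gauss(0.0, 1.0) for _ in range(50)]
    norm = math.sqrt(sum(x * x for x in c))
    worst = max(worst, offdiag([x / norm for x in c]))

# bounding |c_j c_{j'}| by (|c_j|^2 + |c_{j'}|^2)/2 and summing 1/m^2
# gives the uniform bound 2*zeta(2) = pi^2/3
assert worst <= math.pi ** 2 / 3
```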

There is an integral variant of this large values estimate, although we will not use it much here:

Exercise 2 ({L^2} mean value theorem) Let {N \geq 1} and let {(a_n)_{1 \leq n \leq N}} be a sequence of complex numbers.

  • (i) Show that

    \displaystyle  \lim_{T \rightarrow \infty} \frac{1}{T} \int_0^T |\sum_{1 \leq n \leq N} a_n n^{-it}|^2\ dt = \sum_{1 \leq n \leq N} |a_n|^2.

  • (ii) Show the more precise estimate

    \displaystyle  \int_{T_0}^{T_0+T} |\sum_{1 \leq n \leq N} a_n n^{-it}|^2\ dt = (T + O(N)) \sum_{1 \leq n \leq N} |a_n|^2. \ \ \ \ \ (3)

    for any {T > 0} and {T_0 \in {\bf R}}. (Hint: Reduce to the case when {T_0=0}. An adaptation of Theorem 1 gives an upper bound of roughly the right order of magnitude; to get the asymptotic, apply Plancherel’s theorem to an expression of the form {\int_{\bf R} \eta(t) |\sum_{1 \leq n \leq N} a_n n^{-it}|^2\ dt} where {\eta(t)} is the indicator function {1_{[0,T]}} convolved with some rapidly decreasing function whose Fourier transform is supported on (say) {[-\frac{1}{10N}, \frac{1}{10N}]}, and use the previous upper bound to control the error.)
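As a numerical sanity check of part (i) (purely illustrative, with an arbitrary choice of coefficients and a finite {T}), one can approximate the mean value by a Riemann sum and compare it with {\sum_n |a_n|^2}:

```python
import cmath
import math

a = [1.0, -0.5, 2.0, 0.25, -1.5]     # arbitrary a_n for n = 1..5
ns = range(1, len(a) + 1)
sum_sq = sum(x * x for x in a)

T, dt = 2000.0, 0.05
mean = 0.0
for i in range(int(T / dt)):
    t = (i + 0.5) * dt                # midpoint rule for (1/T) int_0^T ... dt
    D = sum(c * cmath.exp(-1j * t * math.log(n)) for c, n in zip(a, ns))
    mean += abs(D) ** 2 * dt / T
print(mean, sum_sq)   # the two agree up to a relative error of size O(N/T)
```

The off-diagonal terms {a_m \overline{a_n} (m/n)^{-it}} with {m \neq n} average out to {O(\frac{1}{T |\log(m/n)|})}, which is what produces the {O(N)} correction in (3).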

We will use the large values estimate in the following way:

Corollary 3 Let {N, T \geq 2}, and let {(a_n)_{N/2 \leq n \leq N}} be a sequence of complex numbers obeying a bound of the form {a_n \ll \tau(n)^{O(1)} \log^{O(1)} n } for all {N/2 \leq n \leq N}. Let {0 \leq \beta \leq 1}. Then, after deleting at most {O( T^\beta \log^{O(1)}(N) )} unit intervals from {[0,T]}, we have

\displaystyle  |\sum_{N/2 \leq n \leq N} \frac{a_n}{n^{\sigma+it}}| \leq N^{1/2-\sigma} \max(N, T)^{1/2} T^{-\beta/2}

for all {-1 \leq \sigma \leq 2} and all {t} in the remaining portion of {[0,T]}.

One can view this corollary as improving upon the trivial bound

\displaystyle  |\sum_{N/2 \leq n \leq N} \frac{a_n}{n^{\sigma+it}}| \leq N^{1/2-\sigma} N^{1/2} \log^{O(1)} N \ \ \ \ \ (4)

coming from mean value theorems on multiplicative functions (see Proposition 21 of Notes 1), if one is allowed to delete some unit intervals from the range {[0,T]} of {t}, with the bound improving as one deletes more and more intervals.

Proof: Let {S \subset [0,T]} be the set of all {t \in [0,T]} for which

\displaystyle  |\sum_{N/2 \leq n \leq N} \frac{a_n}{n^{\sigma+it}}| > N^{1/2-\sigma} \max(N, T)^{1/2} T^{-\beta/2}

for at least one choice of {\sigma \in [-1,2]}. By the greedy algorithm, we can cover {S} by the union of {J} intervals of the form {[t_j-1, t_j+1]}, where {t_1,\dots,t_J} are {1}-separated points in {[0,T]}. By hypothesis, we can find {\sigma_1,\dots,\sigma_J \in [-1,2]} such that

\displaystyle  |\sum_{N/2 \leq n \leq N} \frac{a_n}{n^{\sigma_j+it_j}}| > N^{1/2-\sigma_j} \max(N, T)^{1/2} T^{-\beta/2}

for all {j=1,\dots,J}, and hence

\displaystyle  \sum_{j=1}^J |N^{\sigma_j} \sum_{N/2 \leq n \leq N} a_n n^{-\sigma_j-it_j}|^2 \geq J N \max(N,T) T^{-\beta}.

On the other hand, from mean value theorems on multiplicative functions (see Proposition 21 of Notes 1) we have

\displaystyle  \sum_{N/2 \leq n\leq N} |a_n|^2 \ll N \log^{O(1)} N

and thus by Theorem 1 we have

\displaystyle  \sum_{j=1}^J |N^{\sigma_j} \sum_{N/2 \leq n \leq N} a_n n^{-\sigma_j-it_j}|^2 \ll (T+N) N \log^{O(1)} N.

Comparing the two bounds we see that {J = O( T^\beta \log^{O(1)} N)}, and the claim follows. \Box
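The greedy covering step in the above proof (which recurs in several later arguments) is simple enough to sketch in code; this is just an illustration of the combinatorial step, with a made-up sample set:

```python
def greedy_cover(points):
    """Given a finite set of reals, return 1-separated centres t_j such that
    the intervals [t_j - 1, t_j + 1] cover the set."""
    centres = []
    for s in sorted(points):
        # take a point as a new centre only if it is at distance >= 1
        # from the last centre; skipped points lie within 1 of that centre
        if not centres or s >= centres[-1] + 1:
            centres.append(s)
    return centres

S = [0.0, 0.3, 0.9, 1.05, 3.7, 3.9, 4.95, 10.2]
ts = greedy_cover(S)
# centres are 1-separated...
assert all(b - a >= 1 for a, b in zip(ts, ts[1:]))
# ...and every point of S lies in some [t_j - 1, t_j + 1]
assert all(any(abs(s - t) <= 1 for t in ts) for s in S)
```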

The above estimate turns out to be rather inefficient if {N} is very small compared with {T}, because the {\max(N,T)} factor becomes large compared with {N} and so it is not even clear that one improves upon the trivial bound (4). However, by multiplying Dirichlet polynomials together one can get a good bound in this regime:

Corollary 4 Let {N, T \geq 2}, and let {(a_n)_{N/2 \leq n \leq N}} be a sequence of complex numbers obeying a bound of the form {a_n \ll \tau(n)^{O(1)} \log^{O(1)} n } for all {N/2 \leq n \leq N}. Let {0 \leq \beta \leq 1}, and let {k \geq 1} be a natural number. Then, after deleting at most {O_k( T^\beta \log^{O_k(1)}(N) )} unit intervals from {[0,T]}, we have

\displaystyle  |\sum_{N/2 \leq n \leq N} \frac{a_n}{n^{\sigma+it}}| \ll_k N^{1/2-\sigma} \max(N, T^{1/k})^{1/2} T^{-\frac{\beta}{2k}} \ \ \ \ \ (5)

for all {-1 \leq \sigma \leq 2} and all {t} in the remaining portion of {[0,T]}.

Proof: We raise the original Dirichlet polynomial {\sum_{N/2 \leq n \leq N} \frac{a_n}{n^{\sigma+it}}} to the {k^{th}} power to obtain

\displaystyle  (\sum_{N/2 \leq n \leq N} \frac{a_n}{n^{\sigma+it}})^k = \sum_{N^k/2^k \leq n \leq N^k} \frac{b_n}{n^{\sigma+it}} \ \ \ \ \ (6)

where {b_n} is the Dirichlet convolution

\displaystyle  b_n := \sum_{N/2 \leq n_1,\dots,n_k \leq N: n = n_1 \dots n_k} a_{n_1} \dots a_{n_k}.

From the bounds on {a_n} we have {b_n\ll_k \tau(n)^{O_k(1)} \log^{O_k(1)} n}. Subdividing {[N^k/2^k,N^k]} into {k} dyadic intervals and applying Corollary 3 (with {N} replaced by {N^k}) and the triangle inequality, we conclude that upon deleting {O_k( T^\beta \log^{O_k(1)}(N) )} unit intervals from {[0,T]}, we have

\displaystyle  |\sum_{N^k/2^k \leq n \leq N^k} \frac{b_n}{n^{\sigma+it}}| \ll_k (N^k)^{1/2-\sigma} \max(N^k, T)^{1/2} T^{-\beta/2}

for all {-1 \leq \sigma \leq 2} and all {t \in [0,T]} outside these intervals. Applying (6) and taking {k^{th}} roots, we obtain the claim. \Box
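The identity (6) is simply multiplication of finite Dirichlet series. A quick numerical verification of the {k=2} case (with toy parameters and arbitrary sample coefficients; illustration only):

```python
N, k = 8, 2
a = {n: (-1.0) ** n / n for n in range(N // 2, N + 1)}   # sample coefficients

# b_n = sum over factorisations n = n_1 * n_2 of a_{n_1} a_{n_2}
b = {}
for n1, c1 in a.items():
    for n2, c2 in a.items():
        b[n1 * n2] = b.get(n1 * n2, 0.0) + c1 * c2

s = 0.7 + 3.0j                                           # arbitrary test point
lhs = sum(c * n ** (-s) for n, c in a.items()) ** k
rhs = sum(c * n ** (-s) for n, c in b.items())
assert abs(lhs - rhs) < 1e-9
```

Note that {b_n} is supported on {[N^k/2^k, N^k]} (here {[16,64]}), which is why the subdivision into {k} dyadic blocks in the proof suffices.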

Exercise 5 Show that if {T \geq 2} and {0 \leq \beta \leq 1}, then after deleting at most {O( T^\beta \log^{O(1)} T )} unit intervals from {[0,T]}, one has {\zeta(\frac{1}{2}+it) \ll T^{\frac{1-\beta}{4}}} for all {t} in the remaining portion of {[0,T]}. Conclude the fourth moment bound

\displaystyle  \int_0^T |\zeta(\frac{1}{2}+it)|^4\ dt \ll T \log^{O(1)} T.

(Hint: use the approximate functional equation, Exercise 39 from Supplement 3.)

Remark 6 In 1926, Ingham showed the more precise asymptotic

\displaystyle  \int_0^T |\zeta(\frac{1}{2}+it)|^4\ dt = \frac{1+o(1)}{2\pi^2} T \log^4 T

as {T \rightarrow \infty}. The higher moments {\int_0^T |\zeta(\frac{1}{2}+it)|^{2k}\ dt} for {k=3,4,\dots} have been intensively studied, and are conjectured by Conrey and Gonek (using the random matrix model, see Section 4 of Supplement 4) to be asymptotic to {C_k T \log^{k^2} T} for certain explicit constants {C_k}, but this remains unproven. It can be shown that the Lindelof hypothesis is equivalent to the assertion that {\int_0^T |\zeta(\frac{1}{2}+it)|^{2k}\ dt \ll_{k,\varepsilon} T^{1+\varepsilon}} for all {k \geq 1} and {\varepsilon>0}. In a recent paper of Harper, it was shown assuming the Riemann hypothesis that {\int_0^T |\zeta(\frac{1}{2}+it)|^{2k}\ dt \ll_k T \log^{k^2} T} for all {k \geq 1} and {T \geq 2}.

It is conjectured by Montgomery that Corollary 4 also holds for real {k \geq 1}, after replacing the {O_k(\log^{O_k(1)}(N))} factor by {O_{k,\varepsilon}(N^\varepsilon)} for any fixed {\varepsilon>0}; see this paper of Bourgain for a counterexample showing that such a factor is necessary. (Amusingly, this counterexample relies on the existence of Besicovitch sets of measure zero!) If one had this, then one could optimise (5) in the range {T^\varepsilon \ll N \ll T} for any fixed {\varepsilon>0} by choosing {k} so that {N = T^{1/k}}, arriving (morally, at least) at a bound of the form

\displaystyle  |\sum_{N/2 \leq n \leq N} \frac{a_n}{n^{\sigma+it}}| \ll_\varepsilon N^{1-\sigma-\frac{\beta}{2}}, \ \ \ \ \ (7)

thus beating the trivial bound by about {N^{\beta/2}}. This bound would have many consequences, most notably the density hypothesis discussed below. Unfortunately, Montgomery’s conjecture remains open. Nevertheless, one can obtain a weaker version of this bound by choosing natural numbers {k} so that {T^{1/k}} is close to {N}, rather than exactly equal to {N}:

Corollary 7 Let {N, T \geq 2} be such that {T^\varepsilon \ll N \ll T} for some {\varepsilon>0}, and let {(a_n)_{N/2 \leq n \leq N}} be a sequence of complex numbers obeying a bound of the form {a_n \ll \tau(n)^{O(1)} \log^{O(1)} n } for all {N/2 \leq n \leq N}. Let {0 \leq \beta \leq 1}. Then, after deleting at most {O_\varepsilon( T^\beta \log^{O_\varepsilon(1)}(N) )} unit intervals from {[0,T]}, we have

\displaystyle  |\sum_{N/2 \leq n \leq N} \frac{a_n}{n^{\sigma+it}}| \ll_\varepsilon N^{1-\sigma-\max(\frac{\beta}{4},\beta-\frac{1}{2})} \ \ \ \ \ (8)

for all {-1 \leq \sigma \leq 2} and all {t} in the remaining portion of {[0,T]}.

Note that the bound (8) is not too much worse than (7) in the important regimes when {\beta} is close to {0} and when {\beta} is close to {1}.
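Indeed, one always has {\max(\frac{\beta}{4}, \beta-\frac{1}{2}) \leq \frac{\beta}{2}} on {[0,1]}, with equality exactly at the endpoints {\beta=0} and {\beta=1}, so the exponent in (8) is never better than the conjectural exponent in (7). This elementary inequality can be confirmed over a grid:

```python
# check that the exponent loss max(beta/4, beta - 1/2) in (8) never exceeds
# the conjectural gain beta/2 in (7), over a fine grid of beta in [0, 1]
betas = [j / 1000 for j in range(1001)]
for b in betas:
    assert max(b / 4, b - 0.5) <= b / 2 + 1e-12

# equality holds exactly at the endpoints beta = 0 and beta = 1
assert max(0 / 4, 0 - 0.5) == 0.0 and max(1 / 4, 1 - 0.5) == 0.5
```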

Proof: We can choose a natural number {k = O_\varepsilon(1)} such that {T^{1/2} \ll N^k \ll T}, so that {T^{1/2k} \ll N}. Applying Corollary 4 with {k} replaced by {2k}, we conclude (after deleting at most {O_\varepsilon( T^\beta \log^{O_\varepsilon(1)}(N) )} unit intervals from {[0,T]}) that

\displaystyle  |\sum_{N/2 \leq n \leq N} \frac{a_n}{n^{\sigma+it}}| \ll_\varepsilon N^{1/2-\sigma} N^{1/2} T^{-\frac{\beta}{4k}};

since {T^{-\frac{\beta}{4k}} \ll N^{-\frac{\beta}{4}}}, one obtains (8) with {\frac{\beta}{4}} in place of {\max(\frac{\beta}{4},\beta-\frac{1}{2})}. If instead one uses Corollary 4 with {k} rather than {2k}, one obtains (after deleting a similar number of unit intervals) that

\displaystyle  |\sum_{N/2 \leq n \leq N} \frac{a_n}{n^{\sigma+it}}| \ll_\varepsilon N^{1/2-\sigma} T^{\frac{1-\beta}{2k}};

since {T^{\frac{1-\beta}{2k}} \ll N^{1-\beta}}, one obtains the remaining case of (8). \Box

Exercise 8 If one has the additional hypothesis {N \ll T^{2\varepsilon}}, show that one can replace the {\max(\frac{\beta}{4},\beta-\frac{1}{2})} factor in (8) by {(1-O(\varepsilon)) \frac{\beta}{2}}. Thus we can approach the conjectured estimate (7) in the regime where {N} is very small compared with {T}.

In the regime when {\beta} is small, one can obtain better bounds by exploiting further estimates on exponential sums such as (2). We will give just one example (due to Montgomery) of such an improvement, referring the reader to Chapter 9 of Iwaniec-Kowalski or Chapter 7 of Montgomery for further examples.

Proposition 9 Let {N, T \geq 2}, and let {(a_n)_{N/2 \leq n \leq N}} be a sequence of complex numbers obeying a bound of the form {a_n \ll \tau(n)^{O(1)} \log^{O(1)} n } for all {N/2 \leq n \leq N}. Let {0 \leq \beta \leq 1}. Then, after deleting at most {O( T^\beta )} unit intervals from {[0,T]}, we have

\displaystyle  |\sum_{N/2 \leq n \leq N} \frac{a_n}{n^{\sigma+it}}| \ll N^{1-\sigma} (T^{-\beta/2} + T^{1/4} N^{-1/2}) \log^{O(1)}(NT) \ \ \ \ \ (9)

for all {-1 \leq \sigma \leq 2} and all {t} in the remaining portion of {[0,T]}.

Note that this bound can in fact be superior to (7) if {N} is a little bit less than {T} and {\beta} is less than {1/2}.

Proof: Let {S \subset [0,T]} be the set of all {t \in [0,T]} for which

\displaystyle  |\sum_{N/2 \leq n \leq N} \frac{a_n}{n^{\sigma+it}}| > N^{1-\sigma} (T^{-\beta/2} + T^{1/4} N^{-1/2}) \log^C(NT)

for at least one choice of {\sigma \in [-1,2]}, where {C} is a large constant to be chosen later. By the greedy algorithm as before, we can cover {S} by the union of {J} intervals of the form {[t_j-1, t_j+1]}, where {t_1,\dots,t_J} are {1}-separated points in {[0,T]}; we arrange the {t_j} in increasing order, so that {|t_j - t_{j'}| \geq |j-j'|}. By hypothesis, we can find {\sigma_1,\dots,\sigma_J \in [-1,2]} such that

\displaystyle  |\sum_{N/2 \leq n \leq N} \frac{a_n}{n^{\sigma_j+it_j}}| > N^{1-\sigma_j} (T^{-\beta/2} + T^{1/4} N^{-1/2}) \log^C(NT)

for all {j=1,\dots,J}, and hence

\displaystyle  \sum_{j=1}^J |N^{\sigma_j} \sum_{N/2 \leq n \leq N} a_n n^{-\sigma_j-it_j}|^2 \gg J (N^2 T^{-\beta} + N T^{1/2}) \log^{2C}(NT). \ \ \ \ \ (10)

As before, we have

\displaystyle  \sum_{N/2 \leq n\leq N} |a_n|^2 \ll N \log^{O(1)} N

and so by the generalised Bessel inequality (with {\nu := 1_{[N/2,N]}}) we may bound the left-hand side of (10) by

\displaystyle  \ll N \log^{O(1)} N \sum_{1 \leq j,j' \leq J} |c_j| |c_{j'}| |\sum_{N/2 \leq n \leq N} (n/N)^{-\sigma_j-\sigma_{j'}} n^{i(t_j-t_{j'})}|

for some {c_1,\dots,c_J} with {\sum_{j=1}^J |c_j|^2 = 1}. The diagonal contribution {j=j'} is {O( N^2 \log^{O(1)} N )}. For the off-diagonal contributions, we see from Propositions 6, 9 of Notes 5 (dealing with the cases {|t_j-t_{j'}| \leq N/100} and {|t_j-t_{j'}| > N/100} respectively) and summation by parts that

\displaystyle  |\sum_{N/2 \leq n \leq N} (n/N)^{-\sigma_j-\sigma_{j'}} n^{i(t_j-t_{j'})}| \ll \frac{N}{|t_j-t_{j'}|} + |t_j-t_{j'}|^{1/2} \log T

\displaystyle  \ll \frac{N}{|j-j'|} + T^{1/2} \log T,

and so from the bound {|c_j| |c_{j'}| \ll |c_j|^2 + |c_{j'}|^2} we can bound the left-hand side of (10) by

\displaystyle  \ll N \log^{O(1)}(NTJ) ( N + J T^{1/2} ).

Since {J \leq T}, we thus have

\displaystyle  J (N^2 T^{-\beta} + N T^{1/2}) \log^{2C}(NT) \ll N \log^{O(1)}(NT) ( N + J T^{1/2} ).

The second term on the right-hand side may be absorbed into the left-hand side if {C} is large enough, and we conclude that

\displaystyle  J (N^2 T^{-\beta} + N T^{1/2}) \log^{2C}(NT) \ll N \log^{O(1)}(NT) N

which gives {J \ll T^{\beta}}, and the claim follows. \Box

— 2. Zero density estimates —

We now use the large value theorems for Dirichlet polynomials to control zeroes {\sigma+it} of the Riemann zeta function {\zeta} with large real part {\sigma}. The key connection is that if {\sigma+it} is a zero of {\zeta}, then this will force some Dirichlet polynomial {\sum_n \frac{a_n}{n^{\sigma+it}}} to be unusually large at that zero, which can be ruled out by the theorems of the previous section (after excluding some unit intervals from the range of {t}). Dirichlet polynomials with this sort of property are sometimes known as zero-detecting polynomials, for this reason.

Naively, one might expect, in view of the identities {\sum_{n=1}^\infty \frac{\mu(n)}{n^s} = \frac{1}{\zeta(s)}} and {\sum_{n=1}^\infty \frac{\Lambda(n)}{n^s} = -\frac{\zeta'(s)}{\zeta(s)}} for {\hbox{Re}(s) > 1}, that Dirichlet polynomials such as {\sum_{1 \leq n \leq N} \frac{\mu(n)}{n^s}} or {\sum_{1 \leq n \leq N} \frac{\Lambda(n)}{n^s}} might serve as zero-detecting polynomials. Unfortunately, these identities are only valid for {\hbox{Re}(s) > 1}, and in the regime {\hbox{Re}(s) \leq 1} it is difficult to control the tail behaviour of these series. A more efficient choice comes from the following observation. Suppose that {\zeta(\sigma+it)=0} for some {0 < \sigma < 1} and {t \in [T,2T]}, where {T \geq 2} is large. From Exercise 33 of Supplement 3, we have

\displaystyle  \zeta(\sigma+it) = \sum_n \frac{\eta(\log n - \log x)}{n^{\sigma+it}} + O_A( T^{-A} )

for any {A>0}, where {\eta: {\bf R} \rightarrow {\bf R}} is a smooth function equaling {1} on {(-\infty,-1)} and zero on {(1,+\infty)}, and {x} is a sufficiently large multiple of {T}. Thus

\displaystyle  \sum_n \frac{\eta(\log n - \log x)}{n^{\sigma+it}} = O_A( T^{-A} ).

To eliminate the terms coming from small {n}, we multiply both sides by the Dirichlet polynomial {\sum_{n \leq T^\varepsilon} \frac{\mu(n)}{n^{\sigma+it}}} for some fixed {0 < \varepsilon < 1}. This series can be very crudely bounded by {O(T^\varepsilon)}, so that (after adjusting {A}) we have

\displaystyle  (\sum_{n \leq T^\varepsilon} \frac{\mu(n)}{n^{\sigma+it}}) (\sum_n \frac{\eta(\log n - \log x)}{n^{\sigma+it}}) = O_A( T^{-A} ).

In particular, we have

\displaystyle  \sum_n \frac{a_n}{n^{\sigma+it}} = -1 + O_A( T^{-A} )

where {a_n} is the sequence

\displaystyle  a_n := \sum_{d|n: d \leq T^\varepsilon} \mu(d) \eta( \log \frac{n}{d} - \log x ) - 1_{n=1}.

In particular, for large {T} we have

\displaystyle  |\sum_n \frac{a_n}{n^{\sigma+it}}| > \frac{1}{2} \ \ \ \ \ (11)

and so {\sum_n \frac{a_n}{n^{\sigma+it}}} serves as a zero-detecting polynomial.
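One can confirm numerically that the coefficients {a_n} vanish for small {n} (the support property used in the next paragraph): for {n} up to the cutoff, every divisor {d} of {n} satisfies {d \leq T^\varepsilon} and {\eta = 1}, so the divisor sum collapses to {\sum_{d|n} \mu(d) = 1_{n=1}}. In the sketch below the cutoff {\eta} and all parameters are hypothetical stand-ins (any cutoff with the stated properties would do):

```python
import math

def mobius(n):
    # Moebius function by trial factorisation
    result, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0        # n is divisible by p^2
            result = -result
        p += 1
    if n > 1:
        result = -result
    return result

def eta(u):
    # any cutoff equal to 1 on (-inf,-1] and 0 on [1,inf) works for this check
    if u <= -1:
        return 1.0
    if u >= 1:
        return 0.0
    return 0.5 * (1 - math.sin(math.pi * u / 2))

D, x = 20, 10 ** 6              # D plays the role of T^eps, x the multiple of T

def a(n):
    main = sum(mobius(d) * eta(math.log(n / d) - math.log(x))
               for d in range(1, D + 1) if n % d == 0)
    return main - (1 if n == 1 else 0)

# a_n vanishes for n <= D, since then eta(...) = 1 on every divisor
assert all(abs(a(n)) < 1e-12 for n in range(1, D + 1))
# a_n also vanishes well beyond e * x * D, where eta(...) = 0 throughout
assert abs(a(3 * 10 ** 8)) < 1e-12
```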

From Möbius inversion, we see that {a_n} is supported on the range {T^\varepsilon \ll n \ll T^{1+\varepsilon}}, and is bounded by {O( \tau(n) )}. We can thus decompose the Dirichlet polynomial {\sum_n \frac{a_n}{n^{\sigma+it}}} as the sum of {O(\log T)} expressions of the form {\sum_{N/2 \leq n \leq N} \frac{a_n}{n^{\sigma+it}}} for {T^\varepsilon \ll N \ll T^{1+\varepsilon}}. Applying Corollary 7 (or Corollary 3 to deal with the cases when {N \gg T}), we see that for any {0 < \beta < 1}, and after deleting at most {O_\varepsilon( T^{\beta} \log^{O_\varepsilon(1)} T )} unit intervals from {[T,2T]}, we have

\displaystyle  |\sum_{N/2 \leq n \leq N} \frac{a_n}{n^{\sigma+it}}| \ll_\varepsilon N^{1-\sigma-\beta/4}

for all {O(\log T)} choices of {T^\varepsilon \ll N \ll T^{1+\varepsilon}} and for all {-1 \leq \sigma \leq 2} and all {t} in the remaining portion of {[T,2T]}. Adding a large multiple of {\frac{\log\log T}{\log T}} to {\beta} (which does not significantly affect the error {O_\varepsilon( T^{\beta} \log^{O_\varepsilon(1)} T )}), we can improve this to

\displaystyle  |\sum_{N/2 \leq n \leq N} \frac{a_n}{n^{\sigma+it}}| \ll_\varepsilon N^{1-\sigma-\beta/4} \frac{1}{\log^2 N}

(say). Summing this in {N}, we see that

\displaystyle  |\sum_n \frac{a_n}{n^{\sigma+it}}| < \frac{1}{2}

if

\displaystyle  \sigma \geq 1 - \beta/4

and {T} is sufficiently large depending on {\varepsilon}. Comparing this with (11), we conclude

Proposition 10 Let {0 < \beta < 1} and {\varepsilon > 0}, and let {T} be sufficiently large depending on {\varepsilon}. Then, after deleting at most {O_\varepsilon( T^{\beta} \log^{O_\varepsilon(1)} T )} unit intervals from {[T,2T]}, there are no zeroes of {\zeta} of the form {\sigma+it} with {\sigma \geq 1-\beta/4} and {t} in the remaining portion of {[T,2T]}.

For any {\alpha > 0} and {T > 0}, let {N(\alpha,T)} denote the number of zeroes {\sigma+it} of the Riemann zeta function (counting multiplicity) with {\alpha \leq \sigma \leq 1} and {0 \leq t \leq T}. Recall from Proposition 16 of Notes 2 that there are at most {O(\log T)} zeroes in any rectangle of the form {\{ \sigma+it: \sigma \geq 1/2; t_0 \leq t \leq t_0+1 \}} with {t_0 = O(T)}. Combining this with the previous proposition, we conclude

Corollary 11 (Zero-density estimate) Let {3/4 < \alpha < 1} and {T \geq 2}. Then one has

\displaystyle  N(\alpha,T) \ll T^{4(1-\alpha)} \log^{O(1)} T.

Proof: By dyadic decomposition (and reflection symmetry) we may replace the constraint {0 \leq t \leq T} in the definition of {N(\alpha,T)} by {T \leq t \leq 2T}. The claim then follows from the previous proposition with {\beta := 4(1-\alpha)} and a suitably small choice of {\varepsilon}. \Box

Exercise 12 (Weak Riemann hypothesis) For any {1/2 < \alpha < 1} and {\varepsilon > 0}, show that

\displaystyle  N(\alpha,T) \ll_\varepsilon T^{3/2-\alpha+\varepsilon}

for all {T > 2}. Conclude in particular that for any {\varepsilon > 0}, only {o(T)} of the zeroes of the Riemann zeta function in the rectangle {\{\sigma+it: 0 < \sigma < 1; 0 \leq t \leq T \}} have real part greater than {1/2+\varepsilon} or less than {1/2-\varepsilon}.

Remark 13 The sharpest result known in the direction of the weak Riemann hypothesis is by Selberg, who showed that {N(\alpha,T) \ll \frac{T}{\alpha-1/2}} for any {1/2 < \alpha \leq 1} and {T \geq 1}. Thus, “most” zeroes {\sigma+it} with {|t| \leq T} lie within {O(1/\log T)} of the critical line. Improving upon this result seems to be closely related to making progress on the pair correlation conjecture (see Section 4 of Supplement 4).

Corollary 11 is far from the sharpest bound on {N(\alpha,T)} known, but it will suffice for our applications; see Chapter 10 of Iwaniec and Kowalski for various stronger bounds on {N(\alpha,T)}, formed by using more advanced large values estimates as well as more complicated zero detecting polynomials. The density hypothesis asserts that

\displaystyle  N(\alpha,T) \ll_\varepsilon T^{2(1-\alpha)+\varepsilon} \ \ \ \ \ (12)

for any {1/2 \leq \alpha \leq 1}, {T > 1}, and {\varepsilon > 0}, thus replacing the exponent {4} in Corollary 11 with {2}. This hypothesis is known to hold for {\alpha > \frac{5}{6}} (a result of Huxley, in fact his estimates are even stronger than what the density hypothesis predicts), but is open in general; it is known that the exponent {4} in Corollary 11 can be replaced by {12/5} (by the work of Ingham and of Huxley), but at the critical value {\alpha=3/4} no further improvement is currently known. Of course, the Riemann hypothesis is equivalent to the assertion that {N(\alpha,T)=0} for all {1/2 < \alpha \leq 1}, which is far stronger than the density hypothesis; in Exercise 15 below we will see that even the Lindelof hypothesis is sufficient to imply the density hypothesis. However, the density hypothesis can be a reasonable substitute for the Riemann hypothesis in some settings (e.g. in establishing prime number theorems in short intervals, as discussed below), and looks more amenable to a purely analytic attack (e.g. through resolution of the exponent pair conjecture, discussed briefly in Notes 5) than the Riemann hypothesis.

Exercise 14 Show that (7) implies (12).

In Exercise 32 of Notes 2, it was observed that if there were no zeroes of the Riemann zeta function with real part larger than {\alpha}, then one had the upper bound {\zeta(\sigma+it) = O(t^{o(1)})} as {|t| \rightarrow \infty} for fixed {\sigma >\alpha}. The following exercise gives a sort of converse to this claim, “modulo the density hypothesis”:

Exercise 15 (Riemann zero usually implies large value of {\zeta}) Let {1/2 < \alpha < 1} and {\varepsilon > 0}, and let {T > 2}. Show that after deleting at most {O_\varepsilon( T^{2(1-\alpha)+O(\varepsilon)} )} unit intervals from {[T,2T]}, the following holds: whenever {\sigma+it} is a zero of the Riemann zeta function with {\alpha \leq \sigma \leq 1} and {t} in the remaining portion of {[T,2T]}, we have

\displaystyle  |\zeta( \sigma' + it' )| \geq T^{\varepsilon^2}

for some {\sigma',t'} with {|t'-t| \leq T^\varepsilon} and {\sigma' = \sigma + O(\varepsilon)}. (Hint: use Exercise 8 to dispose of the portion of the zero-detecting polynomial {\sum_n \frac{a_n}{n^{\sigma+it}}} with {n = O(T^{O(\varepsilon)})}. Then divide out by {\sum_{n \leq T^\varepsilon} \frac{\mu(n)}{n^{\sigma+it}}} and conclude that a sum such as {\sum_n \frac{\eta(\log n - \log N)}{n^{\sigma+it}}} is large for some {T^\varepsilon \ll N \ll T}. Then use Lemma 34 from Supplement 3.) Conclude in particular that the Lindelof hypothesis implies the density hypothesis.

Remark 16 One can refine the above arguments to show that the density hypothesis is equivalent to the assertion that for any {1/2 < \alpha < 1}, one has the Lindelof-type bounds {\zeta(\sigma+it) = O( T^{o(1)} )} for all {\alpha \leq \sigma \leq 1} and all {t} in the set {[0,T]} with at most {O( T^{2(1-\alpha)+o(1)} )} unit intervals removed, as {T \rightarrow \infty}. An application of Jensen’s formula shows that this claim is equivalent in turn to the assertion that for any {1/2 < \alpha < 1} and {\varepsilon > 0}, the set {\{ \sigma+it: \sigma > \alpha+\varepsilon; |t-t_0| \leq 1\}} contains {o( \log T )} zeroes of {\zeta} for all {t_0 \in [0,T]} outside of at most {O( T^{2(1-\alpha)+o(1)} )} unit intervals. Informally, what this means is that if the density hypothesis fails, it does so not just through the existence of a sufficient number of “exceptional” zeroes {\sigma+it} with {\sigma > \alpha}, but in fact through a sufficient number of clumps of exceptional zeroes: {\gg \log T} such zeroes within a unit distance of each other.

— 3. Primes in short intervals —

As an application of the zero density estimates we obtain a prime number theorem in short intervals:

Theorem 17 (Prime number theorem in short intervals) For any fixed {\varepsilon > 0}, we have

\displaystyle  \sum_{x \leq n < x+y} \Lambda(n) = y + o(y)

whenever {x \rightarrow \infty} and {x^{3/4+\varepsilon} \leq y \leq x}. In particular, there exists a prime between {x} and {x+x^{3/4+\varepsilon}} whenever {x} is sufficiently large depending on {\varepsilon}.

Theorems of this type were first obtained by Hoheisel in 1930. If the {3/4} exponent could be lowered below {1/2}, this would establish Legendre’s conjecture that there always exists a prime between consecutive squares {n^2, (n+1)^2}, at least when {n} is sufficiently large. Unfortunately, we do not know how to do this, even under the assumption of the Riemann hypothesis; the best unconditional result is by Baker, Harman, and Pintz, who established the existence of a prime between {x} and {x+x^{0.525}} for every sufficiently large {x}. Among other things, this establishes the existence of a prime between adjacent cubes {n^3, (n+1)^3} if {n} is large enough.

Proof: From the truncated explicit formula (Theorem 21 from Notes 2) we have

\displaystyle  \sum_{n < x} \Lambda(n) = x - \sum_{\rho: \hbox{Re}(\rho) > 3/4 + \varepsilon/2, |\hbox{Im}(\rho)| \leq x^{1/4-\varepsilon/2}} \frac{x^\rho}{\rho} + o(y)

and similarly with {x} replaced by {x+y}. Subtracting, we conclude that

\displaystyle  \sum_{x \leq n < x+y} \Lambda(n) = y - \sum_{\rho: \hbox{Re}(\rho) > 3/4 + \varepsilon/2, |\hbox{Im}(\rho)| \leq x^{1/4-\varepsilon/2}} \frac{(x+y)^\rho - x^\rho}{\rho} + o(y). \ \ \ \ \ (13)

From the fundamental theorem of calculus we have

\displaystyle  (x+y)^\rho - x^\rho \ll |\rho| y x^{\hbox{Re}(\rho) - 1}

and so

\displaystyle  \sum_{x \leq n < x+y} \Lambda(n) = y + y \sum_{\rho: \hbox{Re}(\rho) > 3/4 + \varepsilon/2, |\hbox{Im}(\rho)| \leq x^{1/4-\varepsilon/2}} O(x^{\hbox{Re}(\rho) - 1}) + o(y).

By symmetry of the zeroes, it thus suffices to show that

\displaystyle  \sum_{\rho: \hbox{Re}(\rho) > 3/4 + \varepsilon/2, 0 \leq \hbox{Im}(\rho) \leq x^{1/4-\varepsilon/2}} x^{\hbox{Re}(\rho) - 1} = o(1). \ \ \ \ \ (14)

Writing

\displaystyle  x^{\hbox{Re}(\rho) - 1} \ll_\varepsilon \log x \int_{3/4+\varepsilon/4}^1 1_{\sigma < \hbox{Re}(\rho)} x^{\sigma-1}\ d\sigma,

we can bound the left-hand side of (14) by

\displaystyle  \ll_\varepsilon \log x \int_{3/4+\varepsilon/4}^1 N( \sigma, x^{1/4-\varepsilon/2} ) x^{\sigma-1}\ d\sigma.
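To justify this smoothing step, note that for zeroes with {\hbox{Re}(\rho) > 3/4+\varepsilon/2} one has

\displaystyle  \log x \int_{3/4+\varepsilon/4}^{\hbox{Re}(\rho)} x^{\sigma-1}\ d\sigma = x^{\hbox{Re}(\rho)-1} - x^{-1/4+\varepsilon/4} \gg_\varepsilon x^{\hbox{Re}(\rho)-1},

since the first term dominates the second by a factor of at least {x^{\varepsilon/4}}; interchanging the sum over zeroes with the integral in {\sigma} then produces the zero-counting function {N(\sigma, x^{1/4-\varepsilon/2})}.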

From the Vinogradov-Korobov zero-free region (Exercise 5 of Notes 5), we see that the integrand vanishes when {\sigma \geq 1 - A \frac{\log\log x}{\log x}} for any {A>0}, if {x} is sufficiently large depending on {A}. (Indeed, the Vinogradov-Korobov region gives more vanishing than this, but this is all that we shall need.) Using this and Corollary 11, we may bound the left-hand side of (14) by

\displaystyle  \ll_\varepsilon \log^{O(1)} x \int_{3/4+\varepsilon/4}^{1 - A \log \log x/ \log x} x^{-2\varepsilon(1-\sigma)}\ d\sigma

\displaystyle  \ll_\varepsilon \log^{O(1) - 2 \varepsilon A} x

and the claim follows by choosing {A} large enough depending on {\varepsilon}, since one has {x^{-2\varepsilon(1-\sigma)} \leq x^{-2\varepsilon A \log\log x/\log x} = \log^{-2\varepsilon A} x} on the domain of integration. \Box

Exercise 18 Assuming the density hypothesis, show that the exponent {3/4} in Theorem 17 can be replaced by {1/2}; thus the density hypothesis, though weaker than the Riemann or Lindelof hypotheses, is still strong enough to get a near-miss to the Legendre conjecture.
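As a sanity check on the exponents: if a density bound of the shape {N(\sigma,T) \ll T^{c(1-\sigma)+o(1)}} is available and one takes {y = x^{\theta_0+\varepsilon}}, the truncation parameter becomes {T = x^{1-\theta_0-\varepsilon/2}}, and the analogue of the integrand in the above proof is of size

\displaystyle  N(\sigma, T) x^{\sigma-1} \ll x^{(c(1-\theta_0)-1)(1-\sigma)+o(1)},

which is favourable precisely when {c(1-\theta_0) < 1}; for {c=4} this recovers the requirement {\theta_0 > 3/4}, while the density hypothesis {c=2} permits any {\theta_0 > 1/2}.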

Exercise 19 Using the Littlewood zero-free region (see Exercise 4 of Notes 5 and subsequent remarks) in place of the Vinogradov-Korobov zero-free region, obtain Theorem 17 with {3/4} replaced by some absolute constant {c<1}. (This is close to the original argument of Hoheisel.)

As we have seen several times in previous notes, one can obtain better results on prime number theorems in short intervals if one works on average in {x}, rather than demanding a result which is true for all {x}. For instance, we have

Theorem 20 (Prime number theorem on the average) For any fixed {\varepsilon > 0}, we have

\displaystyle  \int_X^{2X} | \sum_{x \leq n < x+y} \Lambda(n) - y|^2\ dx = o( y^2 X )

whenever {X \rightarrow \infty} and {X^{1/2+\varepsilon} \leq y \leq X}. In particular, there is a prime between {x} and {x+x^{1/2+\varepsilon}} for almost all {x} in {[X,2X]}, in the sense that the set of exceptions has measure {o(X)}.
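The second conclusion follows from the first by Chebyshev’s inequality: the measure of the set of {x \in [X,2X]} on which the interval {[x,x+y)} fails to deliver the expected count obeys

\displaystyle  | \{ x \in [X,2X]: |\sum_{x \leq n < x+y} \Lambda(n) - y| \geq y/2 \} | \leq \frac{4}{y^2} \int_X^{2X} | \sum_{x \leq n < x+y} \Lambda(n) - y|^2\ dx = o(X),

and outside of this exceptional set the sum is {\geq y/2 > 0}, which (after removing the negligible contribution of proper prime powers, as in Theorem 17) forces a prime in {[x,x+y)}.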

Proof: By arguing as in (13) we have

\displaystyle  \sum_{x \leq n < x+y} \Lambda(n) - y = - \sum_{\rho: \hbox{Re}(\rho) > 1/2 + \varepsilon/2, |\hbox{Im}(\rho)| \leq T} \frac{(x+y)^\rho - x^\rho}{\rho} + o(y)

where {T := X^{1/2-\varepsilon/2}}, so it suffices to show that

\displaystyle  \int_X^{2X} | \sum_{\rho} \frac{(x+y)^\rho - x^\rho}{\rho} |^2\ dx = o( y^2 X )

where the sum is understood to be over zeroes {\rho} of the zeta function with {\hbox{Re}(\rho) > 1/2 + \varepsilon/2}, {|\hbox{Im}(\rho)| \leq T}. Since

\displaystyle  \frac{(x+y)^\rho - x^\rho}{\rho} = y\int_0^1 (x+\theta y)^{\rho-1}\ d\theta

it suffices to show that

\displaystyle  \int_X^{2X} | \sum_{\rho} (x+\theta y)^{\rho-1} |^2\ dx = o( X )

for each {0 < \theta < 1}; shifting {x} by {\theta y}, it suffices to show that

\displaystyle  \int_X^{3X} | \sum_{\rho} x^{\rho-1} |^2\ dx = o( X ).

The left-hand side can be expanded as

\displaystyle  \sum_{\rho, \rho'} \int_X^{3X} x^{\rho+\overline{\rho'}-2}\ dx.

By symmetry we may bound this by

\displaystyle  \sum_{\rho, \rho': \hbox{Re}(\rho) \geq \hbox{Re}(\rho')} |\int_X^{3X} x^{\rho+\overline{\rho'}-2}\ dx|.

For {\rho' = \rho+O(1)}, we may bound

\displaystyle  \int_X^{3X} x^{\rho+\overline{\rho'}-2}\ dx \ll X^{2 \hbox{Re}(\rho) - 1}

while for {|\rho'-\rho| > 1} we have

\displaystyle  \int_X^{3X} x^{\rho+\overline{\rho'}-2}\ dx \ll \frac{X^{2 \hbox{Re}(\rho) - 1}}{|\rho-\rho'|}

so on summing using Proposition 16 of Notes 2 we have

\displaystyle  \sum_{\rho': \hbox{Re}(\rho) \geq \hbox{Re}(\rho')} |\int_X^{3X} x^{\rho+\overline{\rho'}-2}\ dx| \ll X^{2 \hbox{Re}(\rho) - 1} \log^2 X

and so it will suffice to show that

\displaystyle  \sum_\rho X^{2 \hbox{Re}(\rho) - 2} = o( \frac{1}{\log^2 X} ).

But this can be done by a repetition of the arguments used to establish (14); indeed, the left-hand side is

\displaystyle  \ll_\varepsilon \log X \int_{1/2+\varepsilon/4}^1 N( \sigma, X^{1/2-\varepsilon/2} ) X^{2(\sigma-1)}\ d\sigma

and by using the Vinogradov-Korobov zero-free region and Corollary 11 as before, we obtain the claim. \Box
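For the record, the reduction to a fixed {\theta} in the above proof uses Cauchy-Schwarz and Fubini:

\displaystyle  \int_X^{2X} | y \int_0^1 \sum_{\rho} (x+\theta y)^{\rho-1}\ d\theta |^2\ dx \leq y^2 \int_0^1 \int_X^{2X} | \sum_{\rho} (x+\theta y)^{\rho-1} |^2\ dx\ d\theta,

and the inner integral is controlled uniformly in {\theta} after shifting {x} by {\theta y}, which keeps the domain of integration inside {[X,3X]} since {0 \leq \theta y \leq X}.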

Exercise 21

  • (i) Assuming without proof that the exponent in Corollary 11 can be lowered from {4} to {12/5}, show that the exponent {1/2} in Theorem 20 may be lowered to {1/6}; this implies in particular that Legendre’s conjecture is true for “almost all” {n}.
  • (ii) Assuming the density hypothesis, show that the exponent {1/2} in Theorem 20 can be lowered all the way to zero.
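To see where the {1/6} in (i) comes from: with {y = X^{\theta_0+\varepsilon}} and hence {T = X^{1-\theta_0-\varepsilon/2}}, a density bound of the shape {N(\sigma,T) \ll T^{c(1-\sigma)+o(1)}} makes the integrand in the proof of Theorem 20 of size

\displaystyle  N(\sigma,T) X^{2(\sigma-1)} \ll X^{(c(1-\theta_0)-2)(1-\sigma)+o(1)},

which decays once {c(1-\theta_0) < 2}; taking {c=12/5} gives {\theta_0 > 1/6}, and the density hypothesis {c=2} allows any {\theta_0 > 0}.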

— 4. {L}-function variants —

We have already seen in previous notes that many results about the Riemann zeta function {\zeta(\sigma+it)} extend to {L}-functions {L(\sigma+it,\chi)}, with the variable {\chi} playing a role closely analogous to that of the imaginary ordinate {t}. For instance, one can prove zero-density estimates for a single {L}-function {L(\cdot,\chi)} for a single Dirichlet character of modulus {q} by much the same methods as given previously, after the usual modification of replacing logarithmic factors such as {\log(2+T)} with {\log(q(2+T))} instead.

However, one can also prove “grand density theorems” in which one counts zeroes not just of a single {L}-function {\sigma+it \mapsto L(\sigma+it,\chi)}, but of a whole family of {L}-functions {(\sigma+it,\chi) \mapsto L(\sigma+it,\chi)}, in which {\chi} ranges in some given family (e.g. all Dirichlet characters of a given modulus {q}). Such theorems are particularly useful when trying to control primes in an arithmetic progression {a\ (q)}, since Fourier expansion then requires one to consider all characters of modulus {q} simultaneously. (It is also of interest to obtain grand density theorems for all Dirichlet characters of modulus up to some threshold {Q}, but we will not need to discuss this variant here.) In order to get good results in this regard, one needs a version of the large values theorems in which one averages over characters {\chi} as well as imaginary ordinates {t}. Here is a typical such result:

Theorem 22 ({L^2} estimate on large values with characters) Let {N \geq 1} and {T > 1}, let {q} be a natural number, and let {(a_n)_{N/2 \leq n \leq N}} be a sequence of complex numbers. Let {t_1,\dots,t_J} be real numbers in an interval of length {T}, and let {\chi_1,\dots,\chi_J} be Dirichlet characters of modulus {q} (possibly non-primitive or principal). Assume the following separation condition on the pairs {(\chi_j,t_j)} for {j=1,\dots,J}:

\displaystyle  |t_j-t_{j'}| \geq 1 \hbox{ whenever } 1 \leq j < j' \leq J \hbox{ and } \chi_j = \chi_{j'}. \ \ \ \ \ (15)

Then for any {\sigma_1,\dots,\sigma_J \in [-1,2]}, we have

\displaystyle  \sum_{j=1}^J |N^{\sigma_j} \sum_{N/2 \leq n \leq N} a_n \chi_j(n) n^{-\sigma_j-it_j}|^2 \ \ \ \ \ (16)

\displaystyle  \ll (qT + N) \sum_{N/2 \leq n \leq N} |a_n|^2.

Note that this generalises Theorem 1, which is the {q=1} case of Theorem 22.

Proof: We mimic the proof of Theorem 1. By increasing {N}, we may reduce without loss of generality to the case {qT \leq N/100}. As before, we apply the generalised Bessel inequality with the same weight {\nu(n) := \psi(n/N)} used in Theorem 1, to reduce to showing that

\displaystyle  \sum_{j=1}^J \sum_{j'=1}^J c_j \overline{c_{j'}} \sum_n \nu(n) (n/N)^{-\sigma_j-\sigma_{j'}} n^{i(t_j-t_{j'})} \overline{\chi_j} \chi_{j'}(n) \ll N.

Bounding {c_j \overline{c_{j'}} \ll |c_j|^2 + |c_{j'}|^2} and using symmetry, it thus suffices to show that

\displaystyle  \sum_{j'=1}^J |\sum_n \nu(n) (n/N)^{-\sigma_j-\sigma_{j'}} n^{i(t_j-t_{j'})} \overline{\chi_j} \chi_{j'}(n)| \ll N

for each {j}.

We now focus on the expression

\displaystyle  \sum_n \nu(n) (n/N)^{-\sigma_j-\sigma_{j'}} n^{i(t_j-t_{j'})} \overline{\chi_j} \chi_{j'}(n) \ \ \ \ \ (17)

\displaystyle  = \sum_n \psi(n/N) (n/N)^{-\sigma_j-\sigma_{j'}} e( \frac{t_j - t_{j'}}{2\pi} \log n ) \overline{\chi_j} \chi_{j'}(n).

As before, this expression is bounded by {O(N)} in the diagonal case {j=j'}, which gives an acceptable contribution, so we turn to the case {j \neq j'}. We split into two subcases, depending on whether {\chi_j = \chi_{j'}} or not.

First suppose that {\chi_j=\chi_{j'}}. We then split the sum in (17) into {\phi(q)} sums, each arising from a primitive residue class {a\ (q)}, on which {\overline{\chi_j} \chi_{j'}(n) = 1}. Applying a change of variables {n = m q + a} and then using Poisson summation as in the proof of Theorem 1, one can show that each individual sum is {O( \frac{N}{q |t_j-t_{j'}|^2} )}, so the total sum (17) is {O( \frac{N}{|t_j-t_{j'}|^2} )}. Summing over all the {j'} with {\chi_{j'} = \chi_j}, the {t_{j'}} are {1}-separated by hypothesis, and so we obtain an acceptable contribution in this case.

Now suppose that {\chi_j \neq \chi_{j'}}. Again, we can split the sum in (17) into {\phi(q)} sums, each of which is equal to {\overline{\chi_j} \chi_{j'}(a)} times

\displaystyle  \sum_{n = a\ (q)} \psi(n/N) (n/N)^{-\sigma_j-\sigma_{j'}} e( \frac{t_j - t_{j'}}{2\pi} \log n )

for a primitive residue class {a\ (q)}. Repeating the analysis of Theorem 1, but now noting that {t_j-t_{j'}} is not required to be bounded below, we can write this latter sum as

\displaystyle  \frac{N}{q} (\int_{\bf R} \frac{\psi(x)}{x^{\sigma_j+\sigma_{j'}}} e( \frac{t_j - t_{j'}}{2\pi} \log x )\ dx + O( \frac{1}{(N/q)^2} )).

Summing in {a}, and noting that {\overline{\chi_j} \chi_{j'}} has mean zero, we see that the main term here cancels out, and we can bound (17) by {O( \frac{q}{N} )}. The total number of possible {j'} can be bounded by {qT = O(N)}, and so the contribution of this case is acceptable (with a bit of room to spare). \Box
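The mean-zero property used in the last step is the standard orthogonality of characters: for {\chi_j \neq \chi_{j'}} of the same modulus {q},

\displaystyle  \sum_{a \in ({\bf Z}/q{\bf Z})^\times} \overline{\chi_j}(a) \chi_{j'}(a) = 0,

so the {\phi(q)} main terms, which are all equal up to the factor {\overline{\chi_j} \chi_{j'}(a)}, sum to zero.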

Remark 23 If one is willing to lose a factor of {\log N}, the above proof can be simplified slightly by performing only one integration by parts rather than two in the integrals arising from Poisson summation.

We can now obtain a large values theorem in which the role of the interval {[0,T]} is now replaced by the set {\{ (\chi,t): \chi\ (q), t \in [0,T] \}} of pairs {(\chi,t)}, with {\chi} a Dirichlet character of modulus {q}, and {t} an element of {[0,T]}. Inside this set, we consider unit intervals of the form {\{ (\chi, t): t \in [t_0,t_0+1] \}} for some Dirichlet character {\chi} and {t_0 \in {\bf R}}.

Exercise 24 Let {N, T \geq 2}, let {q} be a natural number, and let {(a_n)_{N/2 \leq n \leq N}} be a sequence of complex numbers obeying a bound of the form {a_n \ll \tau(n)^{O(1)} \log^{O(1)} n } for all {1 \leq n \leq N}. Let {0 \leq \beta \leq 1}.

  • (i) If {k} is a natural number, show that after deleting at most {O_k( (qT)^\beta \log^{O_k(1)}(N) )} unit intervals from {\{ (\chi,t): \chi\ (q), t \in [0,T] \}}, one has

    \displaystyle  |\sum_{N/2 \leq n \leq N} \frac{a_n \chi(n)}{n^{\sigma+it}}| \ \ \ \ \ (18)

    \displaystyle  \ll_k N^{1/2-\sigma} \max(N, (qT)^{1/k})^{1/2} (qT)^{-\frac{\beta}{2k}}

    for all {-1 \leq \sigma \leq 2} and all {(\chi,t)} in the remaining portion of {\{ (\chi,t): \chi\ (q), t \in [0,T] \}}.

  • (ii) If {N \geq (qT)^\varepsilon} for some {\varepsilon>0}, show that after deleting at most {O_\varepsilon( (qT)^\beta \log^{O_\varepsilon(1)}(N) )} unit intervals from {\{ (\chi,t): \chi\ (q), t \in [0,T] \}}, we have

    \displaystyle  |\sum_{N/2 \leq n \leq N} \frac{a_n \chi(n)}{n^{\sigma+it}}| \ll_\varepsilon N^{1-\sigma-\max(\frac{\beta}{4},\beta-\frac{1}{2})} \ \ \ \ \ (19)

    for all {-1 \leq \sigma \leq 2} and all {(\chi,t)} in the remaining portion of {\{ (\chi,t): \chi\ (q), t \in [0,T] \}}.

For any {0 \leq \alpha \leq 1}, {T \geq 1}, and natural number {q}, let {N(\alpha,q,T)} denote the combined number of zeroes {\sigma+it} of all of the {L}-functions {L(\cdot,\chi)} with {\chi} of modulus {q}, {\alpha \leq \sigma \leq 1} and {0 \leq t \leq T}, counting multiplicity of course. We can then repeat the proof of Corollary 11 to give

Exercise 25 (Grand zero-density estimate) Let {3/4 < \alpha < 1}, {T \geq 2}, and let {q} be a natural number. Show that

\displaystyle  N(\alpha,q,T) \ll (qT)^{4(1-\alpha)} \log^{O(1)}(qT).

Exercise 26 For any {0 \leq \alpha \leq 1} and {Q,T \geq 1}, let {\tilde N(\alpha,Q,T)} denote the combined number of zeroes {\sigma+it} of all of the {L}-functions {L(\cdot,\chi)} with {\chi} primitive and of conductor at most {Q}, {\alpha \leq \sigma \leq 1} and {0 \leq t \leq T}, counting multiplicity of course. Show that for any {3/4 < \alpha < 1}, one has

\displaystyle  \tilde N(\alpha,Q,T) \ll (Q^2T)^{4(1-\alpha)} \log^{O(1)}(Q^2T).

(Hint: one will have to develop an analogue of Theorem 22 and Exercise 24 in which one works with primitive characters of conductor at most {Q}, rather than all characters of a fixed modulus {q}, and with all references to {qT} replaced by {Q^2 T}.)

As with the density theorems for the zeta function, the exponents here may be improved somewhat, with the {4} conjecturally being reducible to {2} for any {1/2 < \alpha < 1}; see Chapter 10 of Iwaniec and Kowalski. (Of course, the generalised Riemann hypothesis asserts that {N(\alpha,q,T)} in fact vanishes whenever {\alpha > 1/2}.)

Given that the density estimates for the Riemann zeta function yield prime number theorems in short intervals, one expects the grand density estimates to yield prime number theorems in sparse (and short) arithmetic progressions. However, there are two technical issues with this. The first is that the analogue of the Vinogradov-Korobov zero-free region is not necessarily wide enough (in some ranges of parameters) for the argument to work. The second is that one may encounter an exceptional (Landau-Siegel) zero. These difficulties can be overcome, leading to Linnik’s theorem (as well as the quantitative refinement of this theorem by Gallagher); this will be the focus of the next set of notes.