In Notes 1, we approached multiplicative number theory (the study of multiplicative functions {f: {\bf N} \rightarrow {\bf C}} and their relatives) via elementary methods, in which attention was primarily focused on obtaining asymptotic control on summatory functions {\sum_{n \leq x} f(n)} and logarithmic sums {\sum_{n \leq x} \frac{f(n)}{n}}. Now we turn to the complex approach to multiplicative number theory, in which the focus is instead on obtaining various types of control on the Dirichlet series {{\mathcal D} f}, defined (at least for {s} of sufficiently large real part) by the formula

\displaystyle  {\mathcal D} f(s) := \sum_n \frac{f(n)}{n^s}.

These series also made an appearance in the elementary approach to the subject, but only for real {s} that were larger than {1}. But now we will exploit the freedom to extend the variable {s} to the complex domain; this gives enough freedom (in principle, at least) to recover control of elementary sums such as {\sum_{n\leq x} f(n)} or {\sum_{n\leq x} \frac{f(n)}{n}} from control on the Dirichlet series. Crucially, for many key functions {f} of number-theoretic interest, the Dirichlet series {{\mathcal D} f} can be analytically (or at least meromorphically) continued to the left of the line {\{ s: \hbox{Re}(s) = 1 \}}. The zeroes and poles of the resulting meromorphic continuations of {{\mathcal D} f} (and of related functions) then turn out to control the asymptotic behaviour of the elementary sums of {f}; the more one knows about the former, the more one knows about the latter. In particular, knowledge of where the zeroes of the Riemann zeta function {\zeta} are located can give very precise information about the distribution of the primes, by means of a fundamental relationship known as the explicit formula. There are many ways of phrasing this explicit formula (both in exact and in approximate forms), but they are all trying to formalise an approximation to the von Mangoldt function {\Lambda} (and hence to the primes) of the form

\displaystyle  \Lambda(n) \approx 1 - \sum_\rho n^{\rho-1} \ \ \ \ \ (1)

where the sum is over zeroes {\rho} (counting multiplicity) of the Riemann zeta function {\zeta = {\mathcal D} 1} (with the sum often restricted so that {\rho} has large real part and bounded imaginary part), and the approximation is in a suitable weak sense, so that

\displaystyle  \sum_n \Lambda(n) g(n) \approx \int_0^\infty g(y)\ dy - \sum_\rho \int_0^\infty g(y) y^{\rho-1}\ dy \ \ \ \ \ (2)

for suitable “test functions” {g} (which in practice are restricted to be fairly smooth and slowly varying, with the precise amount of restriction dependent on the amount of truncation in the sum over zeroes one wishes to take). Among other things, such approximations can be used to rigorously establish the prime number theorem

\displaystyle  \sum_{n \leq x} \Lambda(n) = x + o(x) \ \ \ \ \ (3)

as {x \rightarrow \infty}, with the size of the error term {o(x)} closely tied to the location of the zeroes {\rho} of the Riemann zeta function.

The explicit formula (1) (or any of its more rigorous forms) is closely tied to the counterpart approximation

\displaystyle  -\frac{\zeta'}{\zeta}(s) \approx \frac{1}{s-1} - \sum_\rho \frac{1}{s-\rho} \ \ \ \ \ (4)

for the Dirichlet series {{\mathcal D} \Lambda = -\frac{\zeta'}{\zeta}} of the von Mangoldt function; note that (4) is formally the special case of (2) when {g(n) = n^{-s}}. Such approximations come from the general theory of local factorisations of meromorphic functions, as discussed in Supplement 2; the passage from (4) to (2) is accomplished by such tools as the residue theorem and the Fourier inversion formula, which were also covered in Supplement 2. The relative ease of uncovering the Fourier-like duality between primes and zeroes (sometimes referred to poetically as the “music of the primes”) is one of the major advantages of the complex-analytic approach to multiplicative number theory; this important duality tends to be rather obscured in the other approaches to the subject, although it can still in principle be discernible with sufficient effort.

More generally, one has an explicit formula

\displaystyle  \Lambda(n) \chi(n) \approx - \sum_\rho n^{\rho-1} \ \ \ \ \ (5)

for any (non-principal) Dirichlet character {\chi}, where {\rho} now ranges over the zeroes of the associated Dirichlet {L}-function {L(s,\chi) := {\mathcal D} \chi(s)}; we view this formula as a “twist” of (1) by the Dirichlet character {\chi}. The explicit formula (5), proven similarly (in any of its rigorous forms) to (1), is important in establishing the prime number theorem in arithmetic progressions, which asserts that

\displaystyle  \sum_{n \leq x: n = a\ (q)} \Lambda(n) = \frac{x}{\phi(q)} + o(x) \ \ \ \ \ (6)

as {x \rightarrow \infty}, whenever {a\ (q)} is a fixed primitive residue class. Again, the size of the error term {o(x)} here is closely tied to the location of the zeroes of the Dirichlet {L}-function, with particular importance given to whether there is a zero very close to {s=1} (such a zero is known as an exceptional zero or Siegel zero).

While any information on the behaviour of zeta functions or {L}-functions is in principle welcome for the purposes of analytic number theory, some regions of the complex plane are more important than others in this regard, due to the differing weights assigned to each zero in the explicit formula. Roughly speaking, in descending order of importance, the most crucial regions on which knowledge of these functions is useful are

  1. The region on or near the point {s=1}.
  2. The region on or near the right edge {\{ 1+it: t \in {\bf R} \}} of the critical strip {\{ s: 0 \leq \hbox{Re}(s) \leq 1 \}}.
  3. The right half {\{ s: \frac{1}{2} < \hbox{Re}(s) < 1 \}} of the critical strip.
  4. The region on or near the critical line {\{ \frac{1}{2} + it: t \in {\bf R} \}} that bisects the critical strip.
  5. Everywhere else.

For instance:

  1. We will shortly show that the Riemann zeta function {\zeta} has a simple pole at {s=1} with residue {1}, which is already sufficient to recover many of the classical theorems of Mertens discussed in the previous set of notes, as well as results on mean values of multiplicative functions such as the divisor function {\tau}. For Dirichlet {L}-functions, the behaviour is instead controlled by the quantity {L(1,\chi)} discussed in Notes 1, which is in turn closely tied to the existence and location of a Siegel zero.
  2. The zeta function is also known to have no zeroes on the right edge {\{1+it: t \in {\bf R}\}} of the critical strip, which is sufficient to prove (and is in fact equivalent to) the prime number theorem. Any enlargement of the zero-free region for {\zeta} into the critical strip leads to improved error terms in that theorem, with larger zero-free regions leading to stronger error estimates. Similarly for {L}-functions and the prime number theorem in arithmetic progressions.
  3. The (as yet unproven) Riemann hypothesis prohibits {\zeta} from having any zeroes within the right half {\{ s: \frac{1}{2} < \hbox{Re}(s) < 1 \}} of the critical strip, and gives very good control on the number of primes in intervals, even when the intervals are relatively short compared to the size of the entries. Even without assuming the Riemann hypothesis, zero density estimates in this region are available that give some partial control of this form. Similarly for {L}-functions, primes in short arithmetic progressions, and the generalised Riemann hypothesis.
  4. Assuming the Riemann hypothesis, further distributional information about the zeroes on the critical line (such as Montgomery’s pair correlation conjecture, or the more general GUE hypothesis) can give finer information about the error terms in the prime number theorem in short intervals, as well as other arithmetic information. Again, one has analogues for {L}-functions and primes in short arithmetic progressions.
  5. The functional equation of the zeta function describes the behaviour of {\zeta} to the left of the critical line, in terms of the behaviour to the right of the critical line. This is useful for building a “global” picture of the structure of the zeta function, and for improving a number of estimates about that function, but (in the absence of unproven conjectures such as the Riemann hypothesis or the pair correlation conjecture) it turns out that many of the basic analytic number theory results using the zeta function can be established without relying on this equation. Similarly for {L}-functions.

Remark 1 If one takes an “adelic” viewpoint, one can unite the Riemann zeta function {\zeta(\sigma+it) = \sum_n n^{-\sigma-it}} and all of the {L}-functions {L(\sigma+it,\chi) = \sum_n \chi(n) n^{-\sigma-it}} for various Dirichlet characters {\chi} into a single object, viewing {n \mapsto \chi(n) n^{-it}} as a general multiplicative character on the adeles; thus the imaginary coordinate {t} and the Dirichlet character {\chi} are really the Archimedean and non-Archimedean components respectively of a single adelic frequency parameter. This viewpoint was famously developed in Tate’s thesis, which among other things helps to clarify the nature of the functional equation, as discussed in this previous post. We will not pursue the adelic viewpoint further in these notes, but it does supply a “high-level” explanation for why so much of the theory of the Riemann zeta function extends to the Dirichlet {L}-functions. (The non-Archimedean character {\chi(n)} and the Archimedean character {n^{it}} behave similarly from an algebraic point of view, but not so much from an analytic point of view; as such, the adelic viewpoint is well suited for algebraic tasks (such as establishing the functional equation), but not for analytic tasks (such as establishing a zero-free region).)

Roughly speaking, the elementary multiplicative number theory from Notes 1 corresponds to the information one can extract from the complex-analytic method in region 1 of the above hierarchy, while the more advanced elementary number theory used to prove the prime number theorem (and which we will not cover in full detail in these notes) corresponds to what one can extract from regions 1 and 2.

As a consequence of this hierarchy of importance, information about the {\zeta} function away from the critical strip, such as Euler’s identity

\displaystyle  \zeta(2) = \frac{\pi^2}{6}

or equivalently

\displaystyle  1 + \frac{1}{2^2} + \frac{1}{3^2} + \dots = \frac{\pi^2}{6}

or the infamous identity

\displaystyle  \zeta(-1) = -\frac{1}{12},

which is often presented (slightly misleadingly, if one’s conventions for divergent summation are not made explicit) as

\displaystyle  1 + 2 + 3 + \dots = -\frac{1}{12},

is of relatively little direct importance in analytic prime number theory, although it is still of interest for some other, non-number-theoretic, applications. (The quantity {\zeta(2)} does play a minor role as a normalising factor in some asymptotics, see e.g. Exercise 28 from Notes 1, but its precise value is usually not of major importance.) In contrast, the value {L(1,\chi)} of an {L}-function at {s=1} turns out to be extremely important in analytic number theory, with many results in this subject relying ultimately on a non-trivial lower-bound on this quantity coming from Siegel’s theorem, discussed below the fold.

For a more in-depth treatment of the topics in this set of notes, see Davenport’s “Multiplicative number theory”.

— 1. Dirichlet series to the right of the critical strip —

We begin with the (easy) theory of Dirichlet series to the right of the critical strip {\{ s: 0 \leq \hbox{Re}(s) \leq 1 \}}, which generalises the theory of Dirichlet series for real {s>1} that was used in the previous set of notes.

Given any arithmetic function {f: {\bf N} \rightarrow {\bf C}} obeying the crude size bound

\displaystyle  f(n) = O(n^{o(1)}), \ \ \ \ \ (7)

the Dirichlet series

\displaystyle  {\mathcal D} f(s) := \sum_n \frac{f(n)}{n^s} \ \ \ \ \ (8)

is absolutely convergent in the region {\{ s: \hbox{Re}(s) > 1 \}} to the right of the critical strip. Indeed, if {s = \sigma+it} for some {\sigma>1}, then we have

\displaystyle  \frac{f(n)}{n^s} = O_\varepsilon( \frac{1}{n^{\sigma-\varepsilon}})

for any {\varepsilon>0}, and the claim follows by choosing {\varepsilon} so that {\sigma-\varepsilon > 1}. Note that this argument also shows that {{\mathcal D}f} is bounded in any region of the form {\{ s: \hbox{Re}(s) \ge 1+\varepsilon \}} for any {\varepsilon > 0}.

The partial sums {\sum_{n=1}^N \frac{f(n)}{n^s} = \sum_{n=1}^N f(n) \exp( - s \log n )} are clearly holomorphic functions in {s} on the entire complex plane {{\bf C}}, and they converge locally uniformly to {{\mathcal D} f(s)} on the region {\{ s: \hbox{Re}(s) > 1 \}}. Since locally uniform limits of holomorphic functions are holomorphic (Corollary 11 of Supplement 2), we conclude that {{\mathcal D} f} is holomorphic on {\{ s: \hbox{Re}(s) > 1 \}}.

If {f, g: {\bf N} \rightarrow {\bf C}} obey the crude size bound (7), then so does the Dirichlet convolution {f*g} (see Exercise 24 of Notes 1). A simple application of Fubini’s theorem then gives the fundamental identity

\displaystyle  {\mathcal D}(f*g)(s) = {\mathcal D} f(s) {\mathcal D} g(s) \ \ \ \ \ (9)

in the region {\{s: \hbox{Re}(s) > 1 \}}. Also, by carefully differentiating (8) in {s} we obtain the additional identity

\displaystyle  {\mathcal D}(Lf)(s) = -\frac{d}{ds} {\mathcal D} f(s) \ \ \ \ \ (10)

in the same region, where {L: {\bf N} \rightarrow {\bf C}} is the logarithm function {L(n) = \log n}.

Exercise 2 Rigorously establish (9) and (10).

From (9), (10) we can express the Dirichlet series of many basic arithmetic functions of number-theoretic interest in terms of the Riemann zeta function, at least in the region {\{ s: \hbox{Re}(s) > 1\}}:

  • By definition, {{\mathcal D}(1) = \zeta}. Since {1*1 = \tau}, we conclude from (9) that {{\mathcal D}(\tau) = \zeta^2}.
  • Clearly {{\mathcal D}(\delta)=1}. By (9) and the Möbius inversion formula {1 * \mu = \delta}, we conclude that {{\mathcal D}(\mu) = \frac{1}{\zeta}}. (In particular, {\zeta} has no zeroes in the region {\{ s: \hbox{Re}(s) > 1 \}}.)
  • From (10), we have {{\mathcal D}(L) = -\zeta'}. By (9) and the basic identity {1 * \Lambda = L}, we conclude that {{\mathcal D}(\Lambda) = -\frac{\zeta'}{\zeta}}.
  • From (10), we see that {{\mathcal D}(\frac{\Lambda}{L})} has a derivative of {\frac{\zeta'}{\zeta}}. For real {s>1}, we already saw (see equation (21) of Notes 1) that this expression was equal to {\log \zeta}. Thus we see that {{\mathcal D}(\frac{\Lambda}{L})} is a branch of the complex logarithm of {\zeta} to the right of the strip, so we write (by slight abuse of notation) {{\mathcal D}(\frac{\Lambda}{L}) = \log \zeta} in this region.
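
The identities in the list above can be sanity-checked numerically. The following is a minimal sketch, assuming a truncation of all arithmetic functions to {n \leq N} with {N = 20000} and the sample point {s=2} (both arbitrary choices of ours); the helper names `dirichlet_convolution` and `dirichlet_series` are hypothetical and introduced only for illustration. Since the series are truncated, the identities hold only up to a small tail error, not exactly.

```python
import math

N = 20000  # truncation point (an arbitrary choice for this sketch)

def dirichlet_convolution(f, g):
    """Return f*g on 1..N, where f, g are lists indexed by n (index 0 unused)."""
    h = [0.0] * (N + 1)
    for d in range(1, N + 1):
        if f[d] == 0:
            continue
        for m in range(1, N // d + 1):
            h[d * m] += f[d] * g[m]
    return h

def dirichlet_series(f, s):
    """Truncated Dirichlet series sum_{n <= N} f(n) / n^s."""
    return sum(f[n] / n ** s for n in range(1, N + 1))

one = [0.0] + [1.0] * N
log_fn = [0.0] + [math.log(n) for n in range(1, N + 1)]

# Moebius function from 1 * mu = delta: mu(n) = -sum_{d | n, d < n} mu(d).
mu = [0.0] * (N + 1)
mu[1] = 1.0
for d in range(1, N + 1):
    for m in range(2 * d, N + 1, d):
        mu[m] -= mu[d]

tau = dirichlet_convolution(one, one)             # tau = 1 * 1
von_mangoldt = dirichlet_convolution(mu, log_fn)  # Lambda = mu * L

s = 2.0
zeta_s = dirichlet_series(one, s)
# Each printed pair should agree up to the truncation error.
print(dirichlet_series(tau, s), zeta_s ** 2)
print(dirichlet_series(mu, s), 1 / zeta_s)
print(dirichlet_series(von_mangoldt, s) * zeta_s, dirichlet_series(log_fn, s))
```

The last line compares {{\mathcal D}(\Lambda) \zeta} with {{\mathcal D}(L) = -\zeta'}, which is the identity {1 * \Lambda = L} read through (9).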

Exercise 3

  • (i) Show that {{\mathcal D}(\mu^2)(s) = \frac{\zeta(s)}{\zeta(2s)}} in the region {\{ s: \hbox{Re}(s) > 1 \}}.
  • (ii) Define the Liouville function {\lambda: {\bf N} \rightarrow {\bf C}} by setting {\lambda(n) := (-1)^k} whenever {n} is the product of {k} (not necessarily distinct) primes for some {k \geq 0}. Show that {{\mathcal D}(\lambda) = \frac{\zeta(2s)}{\zeta(s)}} in the region {\{ s: \hbox{Re}(s) > 1 \}}.

Exercise 4 Let {\Lambda_k} be the higher order von Mangoldt functions (equation (65) from Notes 1). Show that {{\mathcal D}(\Lambda_k) = (-1)^k \frac{\zeta^{(k)}}{\zeta}} in the region {\{ s: \hbox{Re}(s) > 1 \}}, where {\zeta^{(k)}} is the {k}-fold derivative of {\zeta}.

Exercise 5 (Uniqueness for Dirichlet series)

  • (i) Let {f, g} be two arithmetic functions obeying (7), such that {{\mathcal D} f = {\mathcal D} g} on the region {\{ s: \hbox{Re}(s) > 1 \}}. Show that {f=g}. (Hint: if {f \neq g}, obtain an asymptotic for {{\mathcal D} f(s) - {\mathcal D} g(s)} as {s \rightarrow +\infty} along the reals.)
  • (ii) Use this uniqueness to give an alternate proof of the identity {\Lambda_{k+1} = L \Lambda_k + \Lambda_k * \Lambda} for {k \geq 1} (equation (66) from Notes 1).
  • (iii) Use this uniqueness to give an alternate proof of the Diamond-Steinig identities (Exercise 64 from Notes 1).

Now we establish some crude bounds on Dirichlet series in the region {\{ s: \hbox{Re}(s) > 1 \}}. We will use the following simple application of the triangle inequality: if {f: {\bf N} \rightarrow {\bf C}} and {g: {\bf N} \rightarrow {\bf R}^+} are arithmetic functions obeying (7) with {|f(n)| \leq g(n)} for all {n}, and {s = \sigma+it} for some {\sigma>1} and {t \in {\bf R}}, then

\displaystyle  | \frac{f(n)}{n^s}| \leq \frac{g(n)}{n^\sigma}

for all {n}, and hence

\displaystyle  |{\mathcal D} f(s)| \leq {\mathcal D} g(\sigma).

In practice, one can use the estimates from Notes 1 to bound {{\mathcal D} g(\sigma)}. For instance, from Exercise 10 of those notes we have

\displaystyle  \zeta(\sigma), -\frac{\zeta'(\sigma)}{\zeta(\sigma)} = \frac{1}{\sigma-1} + O(1)

and thus

\displaystyle  |\zeta(s)|, |\frac{1}{\zeta(s)}|, |\frac{\zeta'(s)}{\zeta(s)}| \leq \frac{1}{\sigma-1} + O(1) \ \ \ \ \ (11)

for all {s=\sigma+it} with {\sigma>1} and {t \in {\bf R}} (the second bound arising from the trivial upper bound {|\mu(n)| \leq 1}).

We can often obtain matching lower bounds to these upper bounds when {s} is close to {1} by a number of means. For the Riemann zeta function, one can use the bound

\displaystyle  \sum_{n \in {\bf Z}: y \leq n \leq x} f(n) = \int_y^x f(t)\ dt + O( \int_y^x |f'(t)|\ dt + |f(y)| ) \ \ \ \ \ (12)

for continuously differentiable {f: [y,x] \rightarrow {\bf C}} (see Exercise 11 from Notes 1), which after a brief calculation gives

\displaystyle  \zeta(s) = \frac{1}{s-1} + O(1)

for {\hbox{Re}(s) > 1} and {s = O(1)}. A similar calculation also gives

\displaystyle  \zeta'(s) = -\frac{1}{(s-1)^2} + O(1)

in the same region, and hence

\displaystyle  -\frac{\zeta'(s)}{\zeta(s)} = \frac{1}{s-1} + O(1) \ \ \ \ \ (13)

for {\hbox{Re}(s)>1} and {s} sufficiently close to {1}; note that these three estimates were already established in Notes 1 under the additional hypothesis that {s} was real. One can view (13) as a crude version of the heuristic (4), in which the role of the zeroes {\rho} is neglected. When controlling {{\mathcal D} f(s)} for a multiplicative function {f} obeying (7), one can also exploit the Euler product formula

\displaystyle  {\mathcal D} f(s) = \prod_p (1 + \frac{f(p)}{p^s} + \frac{f(p^2)}{p^{2s}} + \dots)

which remains valid in the domain {\{ s: \hbox{Re}(s) > 1 \}}. For instance, under the hypotheses of Theorem 27 of Notes 1, we have

\displaystyle {\mathcal D} f(s) = \frac{\mathfrak S}{(s-1)^k} + O_k(\frac{1}{(s-1)^{k-1}})

for {\hbox{Re}(s)>1} and {s=O(1)}, as can be seen by an inspection of the proof of Theorem 27(i).
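
As a quick illustration of the Euler product formula in the simplest case {f=1}, one can compare a truncated product over primes with {\zeta(2) = \frac{\pi^2}{6}} and with a truncated Dirichlet series at a complex point. This is only a sketch: the cutoffs ({10^4} for both the primes and the series) and the function names are our own assumptions.

```python
import math

def primes_up_to(limit):
    """Sieve of Eratosthenes returning all primes <= limit."""
    sieve = bytearray([1]) * (limit + 1)
    sieve[0:2] = b"\x00\x00"
    for p in range(2, math.isqrt(limit) + 1):
        if sieve[p]:
            sieve[p * p :: p] = bytearray(len(range(p * p, limit + 1, p)))
    return [p for p in range(2, limit + 1) if sieve[p]]

def euler_product_zeta(s, prime_cutoff=10_000):
    """Truncated Euler product prod_{p <= cutoff} (1 - p^{-s})^{-1}, Re(s) > 1."""
    product = 1.0
    for p in primes_up_to(prime_cutoff):
        product /= 1 - p ** (-s)
    return product

def truncated_zeta(s, terms=10_000):
    """Truncated Dirichlet series sum_{n <= terms} n^{-s}, Re(s) > 1."""
    return sum(n ** (-s) for n in range(1, terms + 1))

print(euler_product_zeta(2.0), math.pi ** 2 / 6)
print(euler_product_zeta(2 + 3j), truncated_zeta(2 + 3j))
```

Both the product and the series are truncated, so agreement is only approximate; the tails shrink as {\hbox{Re}(s)} increases.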

Exercise 6 By using the Selberg symmetry formula (equation (67) from Notes 1), show that

\displaystyle  \frac{\zeta''(s)}{\zeta(s)} = \frac{2}{(s-1)^2} + O(1 + \frac{1}{\sigma-1})

whenever {s = \sigma+it} with {\sigma>1} and {t \in {\bf R}}.

We can obtain better estimates for the zeta function and its relatives once we have some analytic continuation of these functions to the critical strip. However, even before we do so, we can still control various weighted sums of arithmetic functions {f(n)} in terms of integral combinations of the Dirichlet series {{\mathcal D} f(s)}. This can be achieved by the following formula:

Proposition 7 (Parseval-type formula) Let {f: {\bf N} \rightarrow {\bf C}} obey (7), and let {g: {\bf R} \rightarrow {\bf C}} be a twice continuously differentiable, compactly supported function. Then for any {\sigma>1}, one has

\displaystyle  \sum_n \frac{f(n)}{n} g(\log n) = \frac{1}{2\pi} \int_{\bf R} {\mathcal D} f(\sigma+it) \hat g(t-i(\sigma-1))\ dt \ \ \ \ \ (14)

where {\hat g:{\bf C} \rightarrow {\bf C}} is the Fourier transform of {g} extended to the complex plane, defined by the formula

\displaystyle  \hat g(t) := \int_{\bf R} g(u) e^{itu}\ du \ \ \ \ \ (15)

for {t \in {\bf C}}. The integral on the right-hand side of (14) is absolutely integrable.

The formula (14) is similar to the Parseval-type identity

\displaystyle  \int_{\bf R} f(u) g(u)\ du = \frac{1}{2\pi} \int_{\bf R} \hat f(-t) \hat g(t)\ dt

for suitably “nice” functions {f, g: {\bf R} \rightarrow {\bf C}} (see Corollary 32 from Supplement 2). Indeed, one could view {{\mathcal D} f(1-it)} as the Fourier transform of the Radon measure {\sum_n \frac{f(n)}{n} \delta_{\log n}}, which (formally, at least) yields (14) from the Parseval identity. However we will not adopt this measure-theoretic viewpoint explicitly here.

Proof: From Exercise 28 of Supplement 2, we have the bounds

\displaystyle  |\hat g(t-i(\sigma-1))| \ll_{g,\sigma} \min( 1, \frac{1}{|t|^2} )

which, together with the boundedness of {{\mathcal D} f} on {\{ s: \hbox{Re}(s) \geq \sigma\}}, makes the integral in (14) absolutely integrable as claimed. The same bounds allow one to invoke Fubini’s theorem and rewrite the right-hand side of (14) as

\displaystyle  \frac{1}{2\pi} \sum_n \int_{\bf R} \frac{f(n)}{n^{\sigma+it}} \hat g(t-i(\sigma-1))\ dt

which we rearrange as

\displaystyle  \sum_n \frac{f(n)}{n^\sigma} \frac{1}{2\pi} \int_{\bf R} \hat g(t-i(\sigma-1)) e^{-it \log n}\ dt.

By the Fourier inversion formula (Theorem 30 from Supplement 2), this simplifies to

\displaystyle  \sum_n \frac{f(n)}{n^\sigma} e^{(\sigma-1)\log n} g(\log n),

and the claim follows. \Box
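
One can test (14) numerically in a toy case. The sketch below makes several assumptions that are not in the text: {f} is the indicator of {\{1,\dots,8\}} (so that {{\mathcal D} f} is a finite sum), {g(u) = (1-(u-1)^2)^3} on {[0,2]} (a twice continuously differentiable bump), {\sigma = 1.5}, and both integrals are approximated by the trapezoid rule, with the {t} integral cut off at {|t| \leq 60}. All function names are hypothetical.

```python
import cmath
import math

SIGMA = 1.5  # any sigma > 1 is allowed in (14); this value is an arbitrary choice

def g(u):
    """C^2 bump supported on [0, 2] (a hypothetical test function)."""
    v = u - 1.0
    return (1.0 - v * v) ** 3 if abs(v) < 1.0 else 0.0

def f(n):
    """Finitely supported arithmetic function: f(n) = 1 for n <= 8."""
    return 1.0 if n <= 8 else 0.0

# Quadrature nodes in u; g vanishes at both endpoints, so a plain Riemann
# sum over these nodes coincides with the trapezoid rule.
M = 400
DU = 2.0 / M
NODES = [(j * DU, g(j * DU) * math.exp((SIGMA - 1.0) * j * DU)) for j in range(M + 1)]

def g_hat_shifted(t):
    """Approximate hat g(t - i(sigma-1)) = int g(u) e^{(sigma-1)u} e^{itu} du."""
    return DU * sum(w * cmath.exp(1j * t * u) for u, w in NODES)

def dirichlet_series(s):
    return sum(f(n) * n ** (-s) for n in range(1, 9))

def left_side():
    return sum(f(n) / n * g(math.log(n)) for n in range(1, 9))

def right_side(T=60.0, steps=2400):
    """Trapezoid rule for (1/2pi) int_{-T}^{T} Df(sigma+it) hat g(t-i(sigma-1)) dt."""
    dt = 2 * T / steps
    total = 0.0
    for k in range(steps + 1):
        t = -T + k * dt
        weight = 0.5 if k in (0, steps) else 1.0
        # The imaginary parts at t and -t cancel, so only the real part is kept.
        total += weight * (dirichlet_series(SIGMA + 1j * t) * g_hat_shifted(t)).real
    return total * dt / (2 * math.pi)

lhs_value, rhs_value = left_side(), right_side()
print(lhs_value, rhs_value)
```

The two printed numbers should agree up to quadrature and truncation error; the rapid decay of {\hat g} recorded in the proof is what makes the cutoff in {t} harmless.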

Remark 8 The uncertainty principle in Fourier analysis tells us (heuristically, at least) that if we want the function {g} to exhibit non-trivial oscillation at the scale {1/T}, then the Fourier transform {\hat g} has to spread out over an interval of size {\gg T}. In particular, if we want to use (14) to investigate fine scale structure of {f(n)} on intervals such as {\{ n: \log n = \log x + O(1/T) \}}, then one expects to need control on the associated Dirichlet series {{\mathcal D} f(s)} in which the imaginary part of {s} can be as large as {T} in magnitude. Thus, Fourier analysis gives us the insight that the extent to which one can extend control of the Dirichlet series away from the real axis determines the finest scale of {\log n} that one can hope to control. For instance, the prime number theorem allows one to count primes in regions such as {\{n: \log n =\log x + o(1)\}}, and so should need control of {\zeta} on the entire right edge of the critical strip. Conversely, numerical verification of the Riemann hypothesis that establishes zero free regions for {\zeta} for imaginary parts up to some finite threshold {T} should yield effective substitutes for the prime number theorem that are able to count primes in intervals such as {\{ n: \log n = \log x + O(1/T) \}}.

The most powerful applications of the Parseval-type formula (14) occur when {{\mathcal D} f} has a meromorphic continuation into the critical strip (or beyond), allowing one to shift {\sigma} in the right-hand side of (14) to the left of {1} (picking up various terms from residue calculus along the way). But one can still obtain some useful estimates on various summatory functions involving {f} even without such meromorphic continuations; in particular, just by using asymptotics near (and to the right of) {s=1} such as (13), we can recover estimates of strength comparable to Mertens’ theorems. Here is a basic example, using only the asymptotic (13):

Proposition 9 (Crude Mertens-type theorem) Let {f: {\bf R} \rightarrow {\bf C}} be a continuously twice differentiable compactly supported function. Then for {x>10}, one has

\displaystyle  \sum_n \frac{\Lambda(n)}{n} f( \frac{\log n}{\log x} ) = (\int_0^\infty f(t)\ dt) \log x + O_{f}(1). \ \ \ \ \ (16)

This estimate is weaker than Mertens’ theorems from Notes 1. However, later in these notes we will be able to improve the error term in this proposition if {f} is smoother, by using a more accurate asymptotic expansion than (13). One should also compare (16) to the heuristic (2) (again neglecting the role of the zeroes).

Proof: Since {{\mathcal D} \Lambda = -\frac{\zeta'}{\zeta}} to the right of the critical strip, we can apply Proposition 7 and rewrite the left-hand side of (16) as

\displaystyle  -\frac{\log x}{2\pi} \int_{\bf R} \frac{\zeta'(\sigma+it)}{\zeta(\sigma+it)} \hat f( (t-i(\sigma-1)) \log x )\ dt

where {\sigma > 1} is arbitrary.

A convenient choice of {\sigma} here is {\sigma = 1 + \frac{1}{\log x}}; this is about as far as one can push {\sigma} to the right (in order to get the best use out of the estimate (11)) before the {i(\sigma-1) \log x} shift to {\hat f} becomes problematic. (Compare with Rankin’s trick, discussed in Notes 1.) In view of (13), it is natural to consider the expression

\displaystyle  \frac{\log x}{2\pi} \int_{\bf R} \frac{1}{\sigma+it - 1} \hat f( (t-i(\sigma-1)) \log x )\ dt. \ \ \ \ \ (17)

To compute this expression, we write

\displaystyle \frac{1}{\sigma+it - 1} = \int_0^\infty e^{-(\sigma+it-1)u}\ du

and interchange integrals by Fubini’s theorem. From the Fourier inversion formula (Theorem 30 from Supplement 2) and a change of variables one has

\displaystyle  \frac{1}{2\pi} \int_{\bf R} e^{-(\sigma+it-1)u} \hat f( (t-i(\sigma-1)) \log x )\ dt

\displaystyle = \frac{1}{\log x} f( \frac{u}{\log x} )

so (17) simplifies to the expected main term {(\int_0^\infty f(t)\ dt) \log x} in (16). It thus suffices to show that

\displaystyle  \int_{\bf R} [\frac{\zeta'(\sigma+it)}{\zeta(\sigma+it)} + \frac{1}{\sigma+it-1}] \hat f( t \log x - i )\ dt = O_{f}(\frac{1}{\log x}).

By (13), we can bound {\frac{\zeta'(\sigma+it)}{\zeta(\sigma+it)} + \frac{1}{\sigma+it-1}} in magnitude by {O(1)} when {|t|} is smaller than some absolute constant {c>0}. For {|t| > c}, we instead use (11) to bound this quantity by {O( \frac{1}{\sigma-1} ) = O( \log x )}. Meanwhile, from Exercise 28 of Supplement 2 we have the bounds

\displaystyle  \hat f(t-i) = O_{f}( \min(1, \frac{1}{|t|^2} ) ).

Putting these bounds together, we obtain the claim. \Box
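
To see (16) in action, one can compute the left-hand side directly for a specific test function. The sketch below assumes {f(t) = (1-t^2)^3} for {|t| \leq 1} (and zero elsewhere), which is twice continuously differentiable with {\int_0^\infty f(t)\ dt = \frac{16}{35}}, and the sample values {x = 10^4, 10^5}; these choices and the function names are ours. It only illustrates that the difference between the two sides stays bounded, and says nothing about the size of the implied constant.

```python
import math

def primes_up_to(limit):
    """Sieve of Eratosthenes returning all primes <= limit."""
    sieve = bytearray([1]) * (limit + 1)
    sieve[0:2] = b"\x00\x00"
    for p in range(2, math.isqrt(limit) + 1):
        if sieve[p]:
            sieve[p * p :: p] = bytearray(len(range(p * p, limit + 1, p)))
    return [p for p in range(2, limit + 1) if sieve[p]]

def bump(t):
    """C^2 test function (1 - t^2)^3 on [-1, 1], zero elsewhere (our assumption)."""
    return (1.0 - t * t) ** 3 if abs(t) < 1.0 else 0.0

BUMP_INTEGRAL = 16.0 / 35.0  # exact value of int_0^1 (1 - t^2)^3 dt

def smoothed_mertens_sum(x):
    """sum_n Lambda(n)/n * bump(log n / log x); only prime powers q <= x contribute."""
    log_x = math.log(x)
    total = 0.0
    for p in primes_up_to(x):
        log_p = math.log(p)
        q = p
        while q <= x:
            total += log_p / q * bump(math.log(q) / log_x)
            q *= p
    return total

differences = []
for x in (10 ** 4, 10 ** 5):
    diff = smoothed_mertens_sum(x) - BUMP_INTEGRAL * math.log(x)
    differences.append(diff)
    print(x, diff)
```

The printed differences are the {O_f(1)} error in (16) for this particular {f}.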

Remark 10 In later notes we will use a similar method to that used to prove Proposition 9 to estimate sums such as

\displaystyle  \sum_{d_1,d_2} \frac{f(\frac{\log d_1}{\log x}, \frac{\log d_2}{\log x})}{[d_1,d_2]},

where {[d_1,d_2]} is the least common multiple of {d_1,d_2}, and {f: {\bf R}^2 \rightarrow {\bf C}} is a smooth compactly supported function; such expressions will arise naturally when we turn to the topic of sieve theory.

A classical limiting case of Proposition 7 is Perron’s formula:

Exercise 11 (Perron’s formula) Let {f: {\bf N} \rightarrow {\bf C}} obey (7). For any non-integer {x>0} and any {\sigma>1}, show that

\displaystyle  \sum_{n \leq x} f(n) = \frac{1}{2\pi i} \lim_{T \rightarrow \infty} \int_{\sigma-iT}^{\sigma+iT} {\mathcal D}f(s) \frac{x^s}{s}\ ds, \ \ \ \ \ (18)

where the integral is a contour integral along the line segment {\{ \sigma+it: -T \leq t \leq T \}}. What happens when {x} is an integer?

In practice, the presence of the limit in (18) is inconvenient, and one usually works with smoothed or truncated versions of this formula. Proposition 7 can be viewed as a smoothed version of Perron’s formula. Now we establish a truncated version:

Proposition 12 (Truncated Perron’s formula) Let {f: {\bf N} \rightarrow {\bf C}} be such that {f(n) = O(\log(2+n))} for all {n}, and let {x \geq T \geq 2}. Then

\displaystyle  \sum_{n \leq x} f(n) = \frac{1}{2\pi i} \int_{\sigma-iT}^{\sigma+iT} {\mathcal D}f(s) \frac{x^s}{s}\ ds + O( \frac{x}{T} \log^2(xT) ) \ \ \ \ \ (19)

where {\sigma := 1 + \frac{1}{\log x}}.

One can sharpen the {\log^2(xT)} factor here slightly, but we will not need such improvements here. The condition {f(n) = O(\log(2+n))} can also be relaxed (at the cost of worsening the error term accordingly); we leave this as an exercise to the reader.

Proof: By perturbing {x} we may assume that {x} is not an integer. In view of (18), it suffices by dyadic decomposition to show that

\displaystyle  |\int_{\sigma \pm iT}^{\sigma \pm 2iT} {\mathcal D}f(s) \frac{x^s}{s}\ ds| \ll \frac{x}{T} \log^2(xT).

By Fubini’s theorem, the left-hand side may be written as

\displaystyle  | \sum_n f(n) a_n| \ \ \ \ \ (20)

where

\displaystyle  a_n := \int_{\sigma \pm iT}^{\sigma \pm 2iT} \frac{(x/n)^s}{s}\ ds.

On taking absolute values, we see that

\displaystyle  a_n \ll (x/n)^\sigma,

and in particular {a_n \ll 1} for {n \in [x/2,2x]}. On the other hand, from integration by parts we have

\displaystyle  a_n = \frac{(x/n)^s}{s \log(x/n)}|^{\sigma \pm 2iT}_{\sigma \pm iT} + \int_{\sigma \pm iT}^{\sigma \pm 2iT} \frac{(x/n)^s}{s^2 \log(x/n)}\ ds

and thus on taking absolute values

\displaystyle  a_n \ll \frac{(x/n)^\sigma}{T |\log(x/n)|}.

In particular we have

\displaystyle  a_n \ll \frac{x}{T} \frac{1}{n^\sigma}

for {n \not \in [x/2,2x]}, and

\displaystyle  a_n \ll \min( 1, \frac{x}{T |x-n|} )

for {n \in [x/2,2x]}. We may thus upper bound (20) by

\displaystyle  \ll \frac{x}{T} \sum_n \frac{\log(2+n)}{n^\sigma} + \sum_{x/2 \leq n \leq 2x} \log x \min( 1, \frac{x}{T|x-n|} )

and the claim follows from Lemma 2 of Notes 1. \Box
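
A truncated Perron integral can be evaluated numerically when {{\mathcal D} f} is a finite sum. The following sketch assumes {f} is the indicator of {\{1,\dots,10\}}, and (departing from the choice {\sigma = 1 + \frac{1}{\log x}} in Proposition 12) uses the fixed values {\sigma = 2} and {T = 200} with a trapezoid rule in {t}; the function names and parameters are hypothetical. With {s = \sigma + it} one has {ds = i\ dt}, so the contour integral becomes {\frac{1}{2\pi} \int_{-T}^T {\mathcal D} f(\sigma+it) \frac{x^{\sigma+it}}{\sigma+it}\ dt}.

```python
import math

def dirichlet_polynomial(s, support=10):
    """D f(s) for f the indicator of {1, ..., support} (a finite sum)."""
    return sum(n ** (-s) for n in range(1, support + 1))

def truncated_perron(x, sigma=2.0, T=200.0, steps=40000, support=10):
    """Trapezoid approximation to (1/2 pi i) int_{sigma-iT}^{sigma+iT} D f(s) x^s / s ds."""
    dt = 2 * T / steps
    total = 0.0
    for k in range(steps + 1):
        s = complex(sigma, -T + k * dt)
        weight = 0.5 if k in (0, steps) else 1.0
        # Contributions at t and -t are complex conjugates, so the real part suffices.
        total += weight * (dirichlet_polynomial(s, support) * x ** s / s).real
    return total * dt / (2 * math.pi)

approx = truncated_perron(5.5)
print(approx)  # should be close to #{n <= 5.5} = 5, up to an error decaying like 1/T
```

Taking {x} close to an integer makes the factor {\frac{1}{|\log(x/n)|}} in the bound for {a_n} large, and the approximation degrades accordingly.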

— 2. Meromorphic continuation into the critical strip, and the (truncated) explicit formula —

To get the most use out of Perron-type formulae, we have to extend Dirichlet series such as the Riemann zeta function meromorphically into the critical strip {\{ s: 0 \leq \hbox{Re}(s) \leq 1 \}}. Not every Dirichlet series with coefficients {f} obeying (7) has such a meromorphic extension; roughly speaking, the existence of such an extension is morally equivalent to having asymptotic formulae for sums such as {\sum_{n \leq x} f(n)} or {\sum_{n \leq x} \frac{f(n)}{n^s}} whose error term is better than what one can obtain just from (7).

To extend the zeta function {\zeta} into the critical strip, we will use (12), which gives the bound

\displaystyle  \sum_{y \leq n \leq x} \frac{1}{n^s} = \frac{y^{1-s} - x^{1-s}}{s-1} + O( \frac{|s|}{\sigma} y^{-\sigma} )

whenever {1 \leq y \leq x} and {s = \sigma+it} for some {\sigma > 0}, {t \in {\bf R}} with {s \neq 1}. (The error term is a bit crude, particularly when {s} has large imaginary part; we will obtain better estimates in later notes.) From Lemma 5 of Notes 1, this implies that we can find a (unique) complex number {\zeta(s)} such that

\displaystyle  \sum_{n \leq x} \frac{1}{n^s} = \zeta(s) - \frac{x^{1-s}}{s-1} + O( \frac{|s|}{\sigma} x^{-\sigma} ) \ \ \ \ \ (21)

for all {x \geq 1} and {s = \sigma+it} for some {\sigma>0} and {t \in {\bf R}} with {s \neq 1}. For {\sigma > 1}, this definition of {\zeta(s)} agrees with the prior definition of the Riemann zeta function in this range; it also is consistent with the quantity {\zeta(s)} defined for {0 < s < 1} in Section 1 of Notes 1.

From (21) we also observe the conjugation symmetry

\displaystyle  \zeta( \overline{s} ) = \overline{\zeta(s)} \ \ \ \ \ (22)

for any {s} in {\{ s: \hbox{Re}(s) > 0 \}}. (Indeed, from the unique continuation property for meromorphic functions, this property is automatic for any meromorphic function on a connected domain symmetric around the real axis, which is real on the real axis.)
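
The formula (21) also gives a practical (if crude) way to evaluate {\zeta(s)} inside the critical strip: solve (21) for {\zeta(s)} and discard the error term. A minimal sketch, assuming the cutoff {x = 10^4} and the hypothetical function name `zeta_approx`:

```python
import math

def zeta_approx(s, x=10_000):
    """Approximate zeta(s) for Re(s) > 0, s != 1, by dropping the error term in (21).

    The neglected error is O(|s| x^{-sigma} / sigma), so larger x gives better accuracy.
    """
    partial_sum = sum(n ** (-s) for n in range(1, x + 1))
    return partial_sum + x ** (1 - s) / (s - 1)

# To the right of the strip this should match the classical value zeta(2) = pi^2/6.
print(zeta_approx(2.0), math.pi ** 2 / 6)

# Near s = 1 the difference zeta(s) - 1/(s-1) stays bounded.
s_near_one = 1.001
print(zeta_approx(s_near_one) - 1 / (s_near_one - 1))

# Inside the critical strip: conjugation symmetry (22), and a point near the
# first nontrivial zero (imaginary part approximately 14.134725).
s_strip = 0.5 + 10j
print(zeta_approx(s_strip.conjugate()), zeta_approx(s_strip).conjugate())
print(abs(zeta_approx(0.5 + 14.134725j)))
```

The last printed value is small but not zero, reflecting both the neglected error term in (21) and the rounding of the zero's ordinate.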

Observe that the function {\frac{x^{1-s}-1}{s-1}} has a removable singularity at {s=1} (it approaches {-\log x} at that value of {s}). From (21), we see that on the region {\{ s: \hbox{Re}(s) > 0 \}}, the function {\zeta(s) - \frac{1}{s-1}} is the locally uniform limit of the functions {\sum_{n \leq x} \frac{1}{n^s} + \frac{x^{1-s}-1}{s-1}}, which are holomorphic on this region once the removable singularity at {s=1} is removed. We conclude that {\zeta(s)-\frac{1}{s-1}} is holomorphic in this region (after removing the singularity at {s=1}), and hence {\zeta} is meromorphic in this region, with a simple pole at {s=1} and no other poles. In particular, we have the Laurent expansion

\displaystyle  \zeta(s) = \frac{1}{s-1} + c_0 + c_1 (s-1) + \dots + c_k(s-1)^k + O_k( |s-1|^{k+1} )

for any natural number {k} and all {s} sufficiently close to {1}, where {c_0, c_1, \dots} are complex coefficients. Differentiating, we also see that

\displaystyle  -\frac{\zeta'}{\zeta}(s) = \frac{1}{s-1} + d_0 + d_1 (s-1) + \dots + d_k(s-1)^k \ \ \ \ \ (23)

\displaystyle + O_k( |s-1|^{k+1} )

for any natural number {k} and all {s} sufficiently close to {1}, where {d_0, d_1, \dots} are further complex coefficients. This refines the bound (13).

Exercise 13 Show that {c_0= \gamma} and {d_0=-\gamma}, where {\gamma} is Euler’s constant. (This appearance of {\gamma} in analytic number theory is largely unrelated to the appearance of {e^\gamma} type factors in Notes 1.)
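As a numerical sanity check on the value {c_0 = \gamma} (not a proof, and not a substitute for the exercise), one can evaluate the regularised partial sums {\sum_{n \leq x} \frac{1}{n^s} + \frac{x^{1-s}-1}{s-1}}, which converge to {\zeta(s) - \frac{1}{s-1}}, at and near {s=1}. In the sketch below the function name is hypothetical, the value of the removable singularity at {s=1} is taken to be {-\log x}, and the decimal value of Euler's constant is a hard-coded standard value.

```python
import math

# Standard decimal value of Euler's constant (hard-coded, not derived here).
EULER_GAMMA = 0.5772156649015329


def regularised_partial_sum(s, x):
    """Return sum_{n <= x} n^{-s} + (x^{1-s} - 1)/(s - 1).

    At s = 1 the second term is interpreted through its removable
    singularity, where it equals -log(x); the result is then H_x - log(x).
    """
    s = complex(s)
    partial_sum = sum(n ** (-s) for n in range(1, x + 1))
    if s == 1:
        correction = -math.log(x)
    else:
        correction = (x ** (1 - s) - 1) / (s - 1)
    return partial_sum + correction


if __name__ == "__main__":
    for s in (1, 1.001, 1.01):
        print(s, regularised_partial_sum(s, 100000))
```

The printed values should be close to {\gamma} at {s=1} and drift slowly as {s} moves away from {1}, in line with the Laurent expansion.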

Exercise 14 Prove the following generalisation of Proposition 9: if {f: {\bf R} \rightarrow {\bf C}} is a continuously {k}-times differentiable compactly supported function for some {k \geq 2}, then for {x>10}, one has

\displaystyle  \sum_n \frac{\Lambda(n)}{n} f( \frac{\log n}{\log x} ) = (\int_0^\infty f(t)\ dt) \log x + a_{0,f} + a_{-1,f} \log^{-1} x + \dots

\displaystyle  + a_{3-k,f} \log^{3-k} x + O_{f}(\log^{2-k} x)

for some complex coefficients {a_{0,f}, a_{-1,f}, \dots, a_{3-k,f}} depending on {f}. (Hint: repeat the proof of Proposition 9, but use (23) in place of (13).)

Remark 15 Note that the error term in the above exercise improves as {f} gets smoother, and in fact becomes significantly better than the type of error terms appearing in the elementary approach when {f} is smooth enough (a special case of the “smoothed sums” philosophy in analysis). Thus we see a contrast between the elementary and complex-analytic methods; the latter approach can provide superior error terms, but also has a preference for smoother sums than the roughly truncated sums that are the main focus of the elementary methods.

From (21) with {x=1}, we also have the crude bound

\displaystyle  |\zeta(s) - \frac{1}{s-1}| \ll \frac{|s|}{\sigma} \ \ \ \ \ (24)

in the region {\{ s: \hbox{Re}(s) > 0 \}}. While this is quite a crude bound, it implies a decent upper bound on the log-magnitude of {\zeta}, namely that

\displaystyle  \log |\zeta(s)| \leq O_\varepsilon( \log(2+|s|) ) \ \ \ \ \ (25)

whenever {\hbox{Re}(s) \geq \varepsilon} and {|s-1| \geq \varepsilon}. This, combined with Jensen’s formula, gives a useful upper bound on the density of zeroes of {\zeta}:

Proposition 16 (Crude upper bound on zeta zeroes) For any {\varepsilon > 0} and {t_0 \in {\bf R}}, there are at most {O_\varepsilon( \log(2+|t_0|) )} zeroes of {\zeta} (counting multiplicity) in the region {\{ \sigma+it: \varepsilon \leq \sigma \leq 1; |t-t_0| \leq 1 \}}.

Proof: We may assume that {|t_0| \geq 10} (say), since for {|t_0| < 10} the claim follows from the discrete nature of the zeroes of a meromorphic function. We may also take {\varepsilon} small, say {\varepsilon < 1/2}.

Consider the disk of radius {2-\varepsilon/2} centred at {2 + it_1} for some {t_1 \in [t_0-2, t_0+2]}. By (11), we have {\log |\zeta(s)| = O(1)} at the centre {2+it_1} of the disk, while from (25) one has {\log |\zeta(s)| \leq O_\varepsilon( \log |t_0| )} on the boundary of the disk. By Jensen’s formula (Theorem 16 from previous notes), this implies that there are at most {O_\varepsilon( \log |t_0| )} zeroes in the disk of radius {2 - 3\varepsilon/4} (say) centred at {2+it_1}. Since the region {\{ \sigma+it: \varepsilon \leq \sigma \leq 1; |t-t_0| \leq 1 \}} can be covered by {O_\varepsilon(1)} such disks, the claim follows. \Box

Remark 17 It will be convenient in the following discussion to adopt the convention that all sums over zeroes of {\zeta} (or of other {L}-functions) are counted with multiplicity; thus for instance a double zero would contribute twice to such a sum. Indeed, one can think of a zero of order {k} as being a limiting case of {k} simple zeroes that are extremely close together (cf. Rouché’s theorem or Hurwitz’s theorem in complex analysis), which helps explain why such zeroes are always counted with multiplicity. It is conjectured that the zeroes of the Riemann zeta function (or of any {L}-function) are all simple, but this claim looks hopeless to prove using current methods; the problem is that it is nearly impossible for analytic methods to distinguish between a repeated zero, and a pair of simple zeroes that are extremely close together, and we currently do not have good methods to exclude the latter from occurring at least once.

Remark 18 As a corollary of Proposition 16, we see that the number of zeroes of {\zeta} in the region {\{ \sigma+it: \varepsilon \leq \sigma \leq 1; |t| \leq T \}} is {O_\varepsilon( T \log(2+T) )} for any {T > 0}. Once we establish the functional equation for {\zeta}, we will be able to match this upper bound with a comparable lower bound, and also set {\varepsilon} to zero; see later notes.

This gives an approximate formula for the log-derivative of zeta in terms of the nearby zeroes:

Proposition 19 (Approximate formula for log-derivative of zeta) For any {C,\varepsilon > 0}, we have

\displaystyle  -\frac{\zeta'}{\zeta}(s) = -\sum_{\rho: |s-\rho| \leq \varepsilon/2} \frac{1}{s-\rho} + \frac{1}{s-1} + O_{C,\varepsilon}( \log(2+|t|) )

whenever {s = \sigma+it} and {\varepsilon \leq \sigma \leq C}.

This proposition should be viewed as a local version of the heuristic (4).

Proof: To eliminate the pole at {s=1}, it is convenient to work with the modified function {f(s) := (s-1) \zeta(s)}, which is holomorphic in {\{\hbox{Re}(s) > 0\}} after removing the singularity at {s=1}, and our task is now to show that

\displaystyle  \frac{f'}{f}(s) = \sum_{\rho: |s-\rho| \leq \varepsilon/2} \frac{1}{s-\rho} + O_{C,\varepsilon}( \log(2+|t|) )

whenever {s = \sigma+it} and {\varepsilon \leq \sigma \leq C}.

As in the previous proposition, consider the disk of radius {2-\varepsilon/4} centred at {2 + it}. From (11) we have {\log |f| = O( \log(2+|t| ))} in the centre of this disk, and from (25) one has {\log |f| \leq O( \log(2+|t| ) )} on the boundary of the disk. The claim now follows from Theorem 21 of Supplement 2 (using Proposition 16 to remove the contribution of zeroes further than {\varepsilon/2} from {s}). \Box

Among other things, this proposition gives good control on the size of the log-derivative on average, which will be useful in figuring out how to shift a contour without encountering the large values of {\zeta'/\zeta} too often:

Corollary 20 (Local integrability of log-derivative) For any {C > \varepsilon > 0} and {t_0 \in {\bf R}}, we have

\displaystyle  \int_\varepsilon^C \int_{t_0-1}^{t_0+1} |\frac{\zeta'}{\zeta}(\sigma+it)|\ dt d\sigma \ll_{\varepsilon,C} \log(2+|t_0|).

Proof: We apply Proposition 19. The contribution of the error term {O_{C,\varepsilon}(\log(2+|t|))} is clearly acceptable. Because {\frac{1}{s}} is a locally integrable function on the complex plane, we see that the {\frac{1}{s-1}} term contributes a factor of {O_{\varepsilon,C}(1)} to the required integral, and every zero within {O_{\varepsilon,C}(1)} of {it_0} also contributes {O_{\varepsilon,C}(1)}. The claim now follows from Proposition 16. \Box

Now we can apply the truncated Perron’s formula with contour shifting to obtain a truncated explicit formula for the von Mangoldt summatory function:

Theorem 21 (Truncated von Mangoldt explicit formula) For any {0 < \varepsilon < 1} and {x, T \geq 2}, we have

\displaystyle  \sum_{n \leq x} \Lambda(n) = x -\sum_{\rho: \hbox{Re}(\rho) \geq \varepsilon; |\hbox{Im}(\rho)| \leq T} \frac{x^\rho}{\rho}

\displaystyle  + O_\varepsilon( x^\varepsilon \log^2 T + \frac{x}{T} \log^2(xT) ).

The error terms here can be improved a little, particularly once one uses the functional equation for {\zeta}; see later notes. However, the current form of the formula already suffices for many applications. This theorem should be compared with (2).

Proof: From Corollary 20 and the pigeonhole principle, we may find {T' \in [T,T+1]} such that

\displaystyle  \int_{\varepsilon/4}^2 |\frac{\zeta'}{\zeta}(\sigma \pm iT')|\ d\sigma \ll_\varepsilon \log T \ \ \ \ \ (26)

for either choice of sign {\pm} (note that this implies that the horizontal lines {\{ s: \hbox{Im} s = \pm T'\}} avoid all the poles of {\zeta'/\zeta}). (One could use (22) here to eliminate the need to consider a sign {\pm}, but it is not necessary to do so here.) On the other hand, from Proposition 12 we have

\displaystyle  \sum_{n \leq x} \Lambda(n) = -\frac{1}{2\pi i} \int_{1+\frac{1}{\log x}-iT'}^{1+\frac{1}{\log x}+iT'} \frac{\zeta'(s)}{\zeta(s)} \frac{x^s}{s}\ ds + O( \frac{x}{T} \log^2(xT) ).

Observe that on the half-space {\{ \hbox{Re}(s) > 0\}}, the meromorphic function {-\frac{\zeta'(s)}{\zeta(s)} \frac{x^s}{s}} has a pole at {s=1} with residue {x}, and poles at every zero {s = \rho} of {\zeta} with residue {-\frac{x^\rho}{\rho}} (multiplied by the multiplicity of the zero), with no other poles. By the residue theorem (Exercise 13 of Supplement 2) applied to the boundary of the rectangle {\{ s: \varepsilon' \leq \hbox{Re}(s) \leq 1+\frac{1}{\log x}; -T' \leq \hbox{Im}(s) \leq T' \}} for some {\varepsilon' \in [\varepsilon/2,\varepsilon]} to be chosen shortly, with {\{ s: \hbox{Re}(s) =\varepsilon'\}} avoiding the poles of {\zeta'/\zeta}, and using (26) to bound the contributions of the upper and lower edges of the rectangle, we thus have

\displaystyle  \sum_{n \leq x} \Lambda(n) = x - \sum_{\rho: \hbox{Re}(\rho) \geq \varepsilon'; |\hbox{Im}(\rho)| \leq T'} \frac{x^\rho}{\rho}

\displaystyle  -\frac{1}{2\pi i} \int_{\varepsilon'-iT'}^{\varepsilon'+iT'} \frac{\zeta'(s)}{\zeta(s)} \frac{x^s}{s}\ ds + O_\varepsilon( \frac{x}{T} \log^2(xT) ). \ \ \ \ \ (27)

Now, from Corollary 20, and integrating {t_0} from {-T'} to {T'}, we have

\displaystyle  \int_{\varepsilon/2}^{\varepsilon} \int_{-T'}^{T'} |\frac{\zeta'(s)}{\zeta(s)} \frac{x^s}{s}|\ dt d\sigma \ll_\varepsilon x^{\varepsilon} \log^2 T

where {s = \sigma+it}. Thus by the pigeonhole principle we may find {\varepsilon' \in [\varepsilon/2,\varepsilon]} where

\displaystyle  \int_{-T'}^{T'} |\frac{\zeta'(s)}{\zeta(s)} \frac{x^s}{s}|\ dt \ll_\varepsilon x^{\varepsilon} \log^2 T

where {s = \varepsilon' + it}, and so the contour integral in (27) is {O_\varepsilon(x^{\varepsilon} \log^2 T)}. Using Proposition 16, the contribution to {\sum_{\hbox{Re}(\rho) \geq \varepsilon'; |\hbox{Im}(\rho)| \leq T'} \frac{x^\rho}{\rho}} of those zeroes {\rho} with {|\hbox{Im}(\rho)| \geq T} is {O( x \log T / T)}, and of those zeroes {\rho} with {\hbox{Re}(\rho) < \varepsilon} is {O( x^\varepsilon \log^2 T )}. The claim follows. \Box
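To get a feel for Theorem 21, one can compare {\sum_{n \leq x} \Lambda(n)} with the main term {x} corrected by the first few zeroes. The following is only a rough sketch under stated assumptions: the ten ordinates below are standard tabulated values of the imaginary parts of the first zeroes (taken as given, with real part {1/2}, and not computed in these notes), each zero {\rho} is paired with its conjugate using (22), and the function names are hypothetical. With so few zeroes the truncation error is not small, so only qualitative agreement should be expected.

```python
import math

# Tabulated ordinates gamma of the first ten zeroes 1/2 + i*gamma
# (standard numerical values, assumed here rather than derived).
ZERO_ORDINATES = [14.134725, 21.022040, 25.010858, 30.424876, 32.935062,
                  37.586178, 40.918719, 43.327073, 48.005151, 49.773832]


def von_mangoldt_table(limit):
    """Return a list lam with lam[n] = Lambda(n) for 0 <= n <= limit."""
    lam = [0.0] * (limit + 1)
    is_composite = [False] * (limit + 1)
    for p in range(2, limit + 1):
        if is_composite[p]:
            continue
        for m in range(2 * p, limit + 1, p):
            is_composite[m] = True
        prime_power = p
        while prime_power <= limit:
            lam[prime_power] = math.log(p)  # Lambda(p^j) = log p
            prime_power *= p
    return lam


def chebyshev_psi(x):
    """The von Mangoldt summatory function sum_{n <= x} Lambda(n)."""
    return sum(von_mangoldt_table(int(x)))


def truncated_explicit_formula(x, ordinates=ZERO_ORDINATES):
    """x minus the sum of x^rho / rho over the listed zeroes and their conjugates."""
    total = float(x)
    for gamma in ordinates:
        rho = complex(0.5, gamma)
        # rho and its conjugate together contribute 2 * Re(x^rho / rho).
        total -= 2 * (x ** rho / rho).real
    return total


if __name__ == "__main__":
    for x in (50.5, 100.5, 500.5):
        print(x, chebyshev_psi(x), truncated_explicit_formula(x))
```

Evaluating at half-integers avoids the jump discontinuities of the summatory function at prime powers.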

Exercise 22 (Smoothed explicit formula) Let {g: {\bf R} \rightarrow {\bf C}} be a smooth, compactly supported function. Then for any {\varepsilon>0} and {x \geq 2}, show that

\displaystyle  \sum_n \Lambda(n) g( \log n - \log x) = x \hat g(-i) - \sum_{\rho: \hbox{Re}(\rho) > \varepsilon} x^\rho \hat g(-i\rho)

\displaystyle + O_{g,\varepsilon}( x^\varepsilon ),

with the sum on the right-hand side being absolutely convergent. (Hint: use Proposition 7 and contour shifting.) This exercise should be compared with (14).

Theorem 21 allows one to use zero-free regions of the Riemann zeta function to improve the error term in the prime number theorem:

Corollary 23 (Zero-free region controls von Mangoldt summatory function) Let {T \geq 2} and {0 < \delta \leq 1/2}, and suppose that there are no zeroes {\rho} of {\zeta} in the rectangle {\{ s: 1-\delta < \hbox{Re}(s) \leq 1; |\hbox{Im}(s)| \leq T \}}. Then one has

\displaystyle  \sum_{n \leq x} \Lambda(n) = x + O( x^{1-\delta} \log^2 T ) + O( \frac{x}{T} \log^2(xT) )

for all {x \geq 2}.

Proof: Apply Theorem 21 with {\varepsilon = 1-\delta}. (As {\varepsilon} is bounded away from zero, the implied constant is uniformly bounded in {\varepsilon}.) \Box

In fact, we have a fairly tight relationship between zero-free regions and error terms. Here is one example of this:

Proposition 24 Let {0 < \delta \leq 1/2}. Then the following assertions are equivalent:

  • (i) One has {\sum_{n \leq x} \Lambda(n) = x + O(x^{1-\delta+o(1)})} as {x \rightarrow \infty}.
  • (ii) One has {\sum_{n \leq x} \Lambda(n) = x + O(x^{1-\delta} \log^2 x )} for all {x \geq 2}.
  • (iii) All the zeroes {\rho} of {\zeta} have real part at most {1-\delta}.

Proof: Clearly (ii) implies (i). If (iii) holds, then by applying Corollary 23 with {T := x^\delta}, we obtain (ii).

Finally, suppose that (i) holds. For {\hbox{Re}(s) > 1}, we see from Fubini’s theorem that

\displaystyle  \frac{\zeta'(s)}{\zeta(s)} = -\sum_n \frac{\Lambda(n)}{n^s} = -\int_1^\infty \frac{s}{x^{s+1}} \sum_{n \leq x} \Lambda(n)\ dx,

and thus

\displaystyle  \frac{\zeta'(s)}{\zeta(s)} = -\frac{s}{s-1} - \int_1^\infty \frac{s}{x^{s+1}} (\sum_{n \leq x} \Lambda(n)-x)\ dx.

By (i) and Morera’s theorem, the integral on the right-hand side extends holomorphically to the region {\{s:\hbox{Re}(s) > 1-\delta\}}, and so by unique continuation {\zeta'/\zeta} cannot have any poles in this region other than at {s=1}. This implies (iii). \Box

Remark 25 Proposition 24 illustrates a remarkable “self-improving” property of estimates on the von Mangoldt summatory function: a weak bound of the form {\sum_{n \leq x} \Lambda(n) = x + O(x^{1-\delta+o(1)})}, if true in the asymptotic limit {x \rightarrow \infty}, automatically implies the stronger bound {\sum_{n \leq x} \Lambda(n) = x + O(x^{1-\delta} \log^2 x)} for any given {x \geq 2} (and in fact the implied constant in the conclusion depends only on {\delta}, and not on the decay rate in the hypothesis). This is due to the special structure of this summatory function {x \mapsto \sum_{n \leq x} \Lambda(n)}, as revealed by the explicit formula, which limits the range of possible asymptotic behaviours of this function, and in particular gives some control on a given value of this function at some choice of {x} in terms of its values at much larger choices of {x}. (Compare with the following easy example of a self-improving property: if {k} is a natural number and {P: {\bf R} \rightarrow {\bf C}} is a polynomial with {P(x) = o(x^{k+1})} as {x \rightarrow \infty}, then {P(x) = O_P(x^k)} for all {x \geq 1}.)

Exercise 26 Give an alternate proof that (i) implies (iii) in Proposition 24 that uses Theorem 21 (with {T} set to a large power of {x}), as well as an inspection of the asymptotics of the expression

\displaystyle  \int_0^\infty \varphi( y / x ) y^{-\rho_0} \sum_{n \leq y} \Lambda(n)\ dy

as {x \rightarrow \infty}, where {\rho_0} is a zero of {\zeta} and {\varphi: [0,+\infty) \rightarrow {\bf R}} is a smooth compactly supported bump function. (The point is that this expression isolates the effect of the single zero {\rho_0} in the von Mangoldt explicit formula.) Give a similar derivation that uses Exercise 22 instead of Theorem 21.

Exercise 27 (Truncated Landau explicit formula) Let {0 < \varepsilon < c \leq \sigma_0 \leq C} and {x, T \geq 2}, and let {s_0 = \sigma_0+it_0} for some {t_0 \in {\bf R}} be such that {s_0} is not a zero or pole of {\zeta}. Show that

\displaystyle  \sum_{n \leq x} \frac{\Lambda(n)}{n^{s_0}} = -\frac{\zeta'(s_0)}{\zeta(s_0)} + \frac{x^{1-{s_0}}}{1-s_0} -\sum_{\rho: \hbox{Re}(\rho) \geq \varepsilon; |\hbox{Im}(\rho)-t_0| \leq T} \frac{x^{\rho-{s_0}}}{\rho-{s_0}}

\displaystyle  + O_{\varepsilon,c,C}( x^{\varepsilon-\sigma_0} \log^2 T + \frac{x}{T} \log^2(xT) ).

Unfortunately, none of the statements in Proposition 24 are known to hold for any positive {\delta}. The infamous Riemann hypothesis asserts that the statements in Proposition 24 hold for {\delta} as large as {1/2}:

Conjecture 28 (Riemann hypothesis) All the zeroes of {\zeta} have real part at most {1/2}.

Remark 29 This is not quite the traditional formulation of the Riemann hypothesis, which asserts instead that all the zeroes of {\zeta} on the critical strip lie on the critical line {\{ 1/2+it: t \in {\bf R}\}}. However, the two formulations are logically equivalent, once one possesses the functional equation; see later notes.

From the above proposition, we see that the Riemann hypothesis is equivalent to the quite strong estimate

\displaystyle  \sum_{n \leq x} \Lambda(n) = x + O( x^{1/2} \log^2 x )

on the von Mangoldt summatory function for all {x \geq 2}. This already gives a “near miss” to Legendre’s conjecture that there exists a prime between {n^2} and {(n+1)^2} for any {n \geq 1}:

Exercise 30 (Conditional near-miss to Legendre’s conjecture) Assume the Riemann hypothesis. Show that there exists a constant {C>0} such that there exists a prime between {n^2} and {(n + C \log^2 n)^2} for any {n \geq 2}.

We remark that Cramér reduced the {\log^2 n} term here to {\log n}; however, no further improvement is known if one “only” assumes the Riemann hypothesis. (But one can shave the {\log n} further to {o(\sqrt{\log n})} if one additionally assumes a form of the Montgomery pair correlation conjecture, a result of Goldston and Heath-Brown.) In later notes we will discuss some weaker near-misses to Legendre’s conjecture that are not conditional on unproven statements such as the Riemann hypothesis, by replacing the notion of zero-free region with the weaker, but somewhat comparable in power, notion of a zero-density theorem.

There is a limiting case {\delta=0} of Proposition 24, due to Wiener:

Proposition 31 (Equivalent forms of prime number theorem) The following assertions are equivalent:

  • (i) One has {\sum_{n \leq x} \Lambda(n) = x + o(x)} as {x \rightarrow \infty}.
  • (ii) All the zeroes {\rho} of {\zeta} have real part strictly less than one.

Proof: First suppose that (ii) fails but (i) holds, so {\zeta} has a zero at {1+it} for some non-zero {t \in {\bf R}}. Then {\frac{-\zeta'}{\zeta}} has a simple pole at {1+it} with residue at most {-1}, and so

\displaystyle  -\hbox{Re}(\frac{\zeta'}{\zeta}(\sigma+it)) \leq \frac{-1+o(1)}{\sigma-1}

as {\sigma \rightarrow 1^+}, or in other words

\displaystyle  \hbox{Re}(\sum_n \frac{\Lambda(n)}{n^{\sigma+it}}) \leq \frac{-1+o(1)}{\sigma-1}.

However, from Fubini’s theorem we have

\displaystyle  \sum_n \frac{\Lambda(n)}{n^{\sigma+it}} = (\sigma+it) \int_1^\infty x^{-\sigma-it-1} \sum_{n \leq x} \Lambda(n)\ dx.

Applying (i), we soon conclude that

\displaystyle  \sum_n \frac{\Lambda(n)}{n^{\sigma+it}} = o( \frac{1}{\sigma-1} )

as {\sigma \rightarrow 1^+}, giving the required contradiction.

Now suppose that (ii) holds. We apply Exercise 22 with {\varepsilon=1/2} (say) to obtain

\displaystyle  \sum_n \Lambda(n) g( \log n - \log x) = x \hat g(-i) - \sum_{\rho: \hbox{Re}(\rho) > 1/2} x^\rho \hat g(-i\rho) + o(x),

as {x \rightarrow \infty}, for any smooth compactly supported {g} independent of {x}. By (ii), each individual term {x^\rho \hat g(-i\rho)} is {o(x)}. Since the zeroes are discrete, we thus have

\displaystyle  \sum_n \Lambda(n) g( \log n - \log x) = x \hat g(-i)

\displaystyle  - \sum_{\rho: \hbox{Re}(\rho) > 1/2; |\hbox{Im}(\rho)| \geq T} x^\rho \hat g(-i\rho) + o(x),

for any {T \geq 2} independent of {x}. To control the remaining portion of the sum, we crudely bound {x^\rho = O(x)} and {\hat g(-i\rho) = O_g( 1/|\rho|^2 )} (using Exercise 28 of Supplement 2) and use Proposition 16 to conclude that

\displaystyle  \sum_n \Lambda(n) g( \log n - \log x) = x \hat g(-i) + O_g( \frac{x \log T}{T} ) + o(x),

and thus on sending {T} to infinity and expanding out {\hat g(-i)},

\displaystyle  \sum_n \Lambda(n) g( \log n - \log x) = x \int_{\bf R} g(u) e^u\ du + o(x).

Letting {g} be an upper or lower approximant to {1_{[-\log 2,0]}}, we conclude that

\displaystyle  \sum_{x/2 \leq n \leq x} \Lambda(n) = \frac{x}{2} + o(x),

and (i) follows by a telescoping argument. \Box

Some of the above discussion involving the von Mangoldt function {\Lambda} has an analogue involving the Möbius function, although it is more difficult to use the residue theorem to obtain a useful explicit formula because the residues of {\frac{1}{\zeta}} are significantly less well understood than those of {-\frac{\zeta'}{\zeta}}. Nevertheless, one can still use other complex analytic tools, such as Taylor expansion, to get some weaker statements. We give some examples of this in the exercises below.

Exercise 32 Suppose that the conclusions of Proposition 24 hold for some {0 < \delta \leq 1/2}.

  • (i) For any {\varepsilon > 0}, show the bounds

    \displaystyle  \log|\zeta(s)| \ll_\varepsilon \log |t|

    if {s=\sigma+it} with {\sigma > 1-\delta+\varepsilon} and {|t| \geq 2}. If {\sigma \geq 3/2} (say), improve this to

    \displaystyle  \log|\zeta(s)| = O(1).

  • (ii) Show that there is a branch {\log \zeta} of the logarithm of {\zeta} that is holomorphic in the region {\{ s: \hbox{Re}(s) > 1-\delta, \hbox{Im}(s) > 0 \}}, and obeying the bounds

    \displaystyle  (\log \zeta)^{(k)}( 2+it ) \ll_\varepsilon k! (1+\delta-\varepsilon)^{-k} \log |t|

    and

    \displaystyle  (\log \zeta)^{(k)}( 2+it ) \ll k! 2^k

    for all {\varepsilon>0}, {k \geq 0} and {|t| \geq 10}, where {(\log \zeta)^{(k)}} denotes the {k}-fold derivative of {\log \zeta}. (Hint: use the generalised Cauchy integral formulae, see Exercise 9 of Supplement 2.)

  • (iii) Show that for any {\varepsilon > 0}, we have

    \displaystyle  \log|\zeta(s)| \leq \varepsilon \log |t|

    if {s=\sigma+it} with {\sigma > 1-\delta+\varepsilon} and {|t|} sufficiently large depending on {\varepsilon}. (Hint: Taylor expand {\log \zeta} around {2+it} using the bounds from (ii), possibly with a different choice of {\varepsilon}.)

Exercise 33 Let {0 < \delta \leq 1/2}. Show that the conclusions (i)-(iii) of Proposition 24 are equivalent to the assertion

  • (iv) {\sum_{n \leq x} \mu(n) = O( x^{1-\delta+o(1)})} as {x \rightarrow \infty}.

(Hint: apply the truncated Perron formula to {f(n)=\mu(n)} and shift the contour, using the preceding exercise to control error terms.) In particular, we see that the Riemann hypothesis is equivalent to the assertion that

\displaystyle  \sum_{n \leq x} \mu(n) = O( x^{1/2+o(1)} )

as {x \rightarrow \infty}.
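For orientation, the summatory function {\sum_{n \leq x} \mu(n)} is easy to tabulate for small {x}. The sketch below (with hypothetical function names, and a simple sieve chosen for clarity rather than speed) prints it alongside {x^{1/2}}. Such small-scale numerics are of course no evidence for or against the asymptotic statement (iv); they only illustrate the quantities involved.

```python
import math


def mobius_table(limit):
    """Return a list mu with mu[n] equal to the Moebius function for 1 <= n <= limit."""
    mu = [1] * (limit + 1)
    mu[0] = 0
    is_composite = [False] * (limit + 1)
    for p in range(2, limit + 1):
        if is_composite[p]:
            continue
        # p is prime: flip the sign of every multiple of p ...
        for m in range(p, limit + 1, p):
            if m > p:
                is_composite[m] = True
            mu[m] = -mu[m]
        # ... and kill every multiple of p^2 (not squarefree).
        for m in range(p * p, limit + 1, p * p):
            mu[m] = 0
    return mu


def mertens_values(limit):
    """Return a list M with M[x] = sum_{n <= x} mu(n) for 0 <= x <= limit."""
    mu = mobius_table(limit)
    partial_sums = [0] * (limit + 1)
    for n in range(1, limit + 1):
        partial_sums[n] = partial_sums[n - 1] + mu[n]
    return partial_sums


if __name__ == "__main__":
    M = mertens_values(10000)
    for x in (10, 100, 1000, 10000):
        print(x, M[x], math.sqrt(x))
```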

Exercise 34 Show that the Riemann hypothesis implies the Lindelöf hypothesis that {\zeta(1/2+it) = O( t^{o(1)} )} as {t \rightarrow +\infty}.

Exercise 35 Let {k \geq 1} be a natural number, and let {f: {\bf N} \rightarrow {\bf C}} be a multiplicative function obeying the bounds {f(p) = k + O_k(\frac{1}{p})} for all primes {p}, and such that {f(p^j) = O_k(j^{O_k(1)})} for all primes {p} and {j \geq 2}.

  • (i) Show that {{\mathcal D} f} has a meromorphic continuation to the half-space {\{ s: \hbox{Re}(s) > 1/2 \}}, which has a pole of at most order {k} at {s=1} but no other poles. (Hint: use Euler products to factor {{\mathcal D} f} as the product of {\zeta^k} and a function holomorphic in this half-space.) Also show that {{\mathcal D} f(s) = O_{\varepsilon,k}( |t|^k )} when {s = \sigma+it} with {\sigma \geq 1/2+\varepsilon} and {|t| \geq 2}.
  • (ii) Show that

    \displaystyle  \sum_{n \leq x} f(n) = x P( \log x ) + O_k( x^{1-c} )

    for all {x \geq 10} and some {c>0} depending only on {k}, where {P(t)} is a polynomial with leading term {\frac{1}{(k-1)!} {\mathfrak S} t^{k-1}}, where the singular series {{\mathfrak S}} was defined in Theorem 27 of Notes 1. (Hint: modify Proposition 12 to deal with the fact that {f(n)} is only bounded by {n^{o(1)}} rather than by {O(\log n)}, apply it with {T} a small power of {x}, then shift the contour.) Note that this refines Theorem 27(iii) from Notes 1, and also generalises Exercise 32 from those notes.

— 3. The prime number theorem —

We are now finally ready to prove the prime number theorem (3), first established by Hadamard and de la Vallée Poussin. In view of Proposition 31, the task comes down to excluding the possibility that a zero

\displaystyle  \zeta(1+it) = 0

occurs on the line {\{ 1+it: t \in {\bf R}\}} for some {t \in {\bf R}}. Note that {t} cannot be zero, as {\zeta} has a pole at {s=1}.

The basic point here is that such a zero implies a “conspiracy” between the von Mangoldt function {\Lambda(n)} and the multiplicative function {n^{it}}, in that the two functions correlate or “pretend” to be like each other in a certain sense. Indeed, if {\zeta} has a zero of some positive order {k} at {1+it}, then the log-derivative {-\frac{\zeta'}{\zeta}} has a simple pole with residue {-k} at {1+it}, so in particular

\displaystyle  -\frac{\zeta'}{\zeta}(\sigma+it) = -\frac{k}{\sigma-1} + O_t(1)

for {\sigma > 1} sufficiently close to {1}. We rewrite this as

\displaystyle  \sum_n \frac{\Lambda(n)}{n^\sigma} n^{-it} = -\frac{k}{\sigma-1} + O_t(1). \ \ \ \ \ (28)

On the other hand, from (13) (and (11), when {\sigma} is large) one has

\displaystyle  \sum_n \frac{\Lambda(n)}{n^\sigma} = \frac{1}{\sigma-1} + O(1). \ \ \ \ \ (29)

From the triangle inequality

\displaystyle  |\sum_n \frac{\Lambda(n)}{n^\sigma} n^{-it}| \leq \sum_n \frac{\Lambda(n)}{n^\sigma} \ \ \ \ \ (30)

and sending {\sigma \rightarrow 1^+}, we already obtain a contradiction if {k \geq 2}; thus we have shown that there are no zeroes of multiplicity two or higher on the line {\{ 1+it: t \in {\bf R}\}}. In the case of a simple zero {k=1}, we have not yet obtained a contradiction; but observe that in this case, the triangle inequality (30) is close to being attained with equality. Intuitively, this implies that {n^{-it} \approx -1} on most of the support of {\Lambda}, that is to say that {p^{-it} \approx -1} for “most” primes {p}. To make this precise, we add (28) to (29) and then take real parts to conclude that

\displaystyle  \sum_n \frac{\Lambda(n)}{n^\sigma} (1 + \hbox{Re}(n^{-it})) = O_t(1) \ \ \ \ \ (31)

so in particular

\displaystyle  \sum_n \frac{\Lambda(n)}{n^\sigma} (1 + \hbox{Re}(n^{-it})) = o( \sum_n \frac{\Lambda(n)}{n^\sigma} )

as {\sigma \rightarrow 1^+}. In probabilistic terms, if one selects a natural number {n} at random using the probability density {\frac{\Lambda(n)}{n^\sigma}}, divided by the quantity {\sum_n \frac{\Lambda(n)}{n^\sigma}} to normalise the total probability to be one, then the random variable {1 + \hbox{Re}(n^{-it})} converges in {L^1} to zero. (Note that we are implicitly using the non-negative nature of {\Lambda} in order to access this probabilistic interpretation.)

Following Hadamard, we exploit the following basic observation: if {n^{it} \approx -1}, then {n^{2it} \approx +1}. To use this observation quantitatively, it is convenient (following Mertens, who simplified the original argument of Hadamard) to exploit the trigonometric inequality

\displaystyle  0 \leq 1 - \cos(2\theta) \leq 4 (1 + \cos \theta ) \ \ \ \ \ (32)

for any {\theta \in {\bf R}} (which follows from the identity {1 - \cos(2\theta) = 4 (1 + \cos \theta) - 2(1+\cos \theta)^2}), which implies that

\displaystyle  0 \leq 1 - \hbox{Re}(n^{-2it}) \leq 4 (1 + \hbox{Re}(n^{-it}) ). \ \ \ \ \ (33)

Inserting this inequality into (31), we conclude that

\displaystyle  \sum_n \frac{\Lambda(n)}{n^\sigma} (1 - \hbox{Re}(n^{-2it})) = O_t(1)

and hence by (29)

\displaystyle  \hbox{Re} \sum_n \frac{\Lambda(n)}{n^\sigma} n^{-2it} = \frac{1}{\sigma-1} + O_t(1).

This implies that {-\frac{\zeta'}{\zeta}} has a pole at {1+2it} with residue {1}, and so {\zeta} must have a simple pole at {1+2it}. But the only pole of {\zeta} is at {1}, and {t} is non-zero, giving a contradiction. Thus there are no zeroes of {\zeta} on the line {\{1+it: t \in {\bf R}\}}, and the prime number theorem follows thanks to Proposition 31.

The key inequality (32) is often written as {3 + 4 \cos \theta + \cos(2\theta)\geq 0}, or {e^{-2i\theta} + 4e^{-i\theta} + 6 + 4e^{i\theta} + e^{2i\theta} \geq 0}. In particular, we have

\displaystyle  3 + 4 \hbox{Re} n^{-it} + \hbox{Re} n^{-2it} \geq 0, \ \ \ \ \ (34)

which on multiplying with {\frac{\Lambda(n)}{n^\sigma}} and summing gives the useful inequality

\displaystyle  - 3 \frac{\zeta'}{\zeta}(\sigma) - 4 \hbox{Re} \frac{\zeta'}{\zeta}(\sigma+it) - \hbox{Re} \frac{\zeta'}{\zeta}(\sigma+2it) \geq 0 \ \ \ \ \ (35)

for any {\sigma > 1} and {t \in {\bf R}}. Integrating this in {\sigma} from {\sigma=+\infty} (noting that {\zeta(\sigma+it) \rightarrow 1} as {\sigma \rightarrow +\infty} for any fixed {t}) gives the variant

\displaystyle  |\zeta(\sigma)|^3 |\zeta(\sigma+it)|^4 |\zeta(\sigma+2it)| \geq 1. \ \ \ \ \ (36)

(One can also obtain this inequality directly from (34) by multiplying by {\frac{\Lambda(n)}{n^\sigma \log n}}, summing, then exponentiating; we leave the details to the interested reader.) This variant gives a slightly different way to interpret the above proof of the prime number theorem: {\zeta} has a simple pole at {s=1}, and no pole at {1+2it}, so from (36) the maximum order of zero it can have at {1+it} is {\frac{3}{4}}. But the order must be an integer, and so one cannot have a zero of any positive order.
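The inequalities (34) and (36) are easy to test numerically. The sketch below is an illustration only: it checks the pointwise inequality {3 + 4 \cos \theta + \cos(2\theta) = 2(1+\cos \theta)^2 \geq 0} on a grid, and evaluates the left-hand side of (36) at a few sample points, approximating {\zeta} by rearranging (21). The function names, the cutoff {x=2000} and the sample values of {\sigma} and {t} are arbitrary choices made for this example.

```python
import math


def zeta_approx(s, x=2000):
    """Approximate zeta(s) for Re(s) > 0, s != 1, by rearranging (21)."""
    s = complex(s)
    return sum(n ** (-s) for n in range(1, x + 1)) + x ** (1 - s) / (s - 1)


def trig_341(theta):
    """The trigonometric polynomial 3 + 4 cos(theta) + cos(2 theta)."""
    return 3 + 4 * math.cos(theta) + math.cos(2 * theta)


def three_four_one_product(sigma, t):
    """Left-hand side of (36): |zeta(sigma)|^3 |zeta(sigma+it)|^4 |zeta(sigma+2it)|."""
    return (abs(zeta_approx(sigma)) ** 3
            * abs(zeta_approx(complex(sigma, t))) ** 4
            * abs(zeta_approx(complex(sigma, 2 * t))))


if __name__ == "__main__":
    # Smallest value of the trigonometric polynomial on a grid of [0, 2*pi].
    print(min(trig_341(0.01 * k) for k in range(629)))
    for sigma in (1.1, 1.5, 2.0):
        for t in (3.0, 14.134725, 21.02204, 50.0):
            print(sigma, t, three_four_one_product(sigma, t))
```

Taking {t} near the ordinate of a zero makes {|\zeta(\sigma+it)|} small for {\sigma} close to {1}, and it is the large factor {|\zeta(\sigma)|^3} that keeps the product at least {1}.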

Exercise 36 Use the Selberg symmetry formula (equation (67) from Notes 1) to obtain the asymptotics

\displaystyle  \sum_n \frac{\Lambda_2(n)}{n^\sigma \log n} = (1+o(1)) \frac{2}{\sigma-1}

and

\displaystyle  \sum_n \frac{\Lambda_2(n)}{n^\sigma \log n} \hbox{Re}(n^{-it}) = o(\frac{2}{\sigma-1})

and

\displaystyle  \sum_n \frac{\Lambda_2(n)}{n^\sigma \log n} |\hbox{Re}(n^{-it})| = (\frac{2}{\pi}+o(1)) \frac{2}{\sigma-1}

as {\sigma \rightarrow 1^+}, for any fixed {t \neq 0}. By using the bound {|\Lambda(n) - \frac{1}{2 \log n} \Lambda_2(n)| \leq \frac{1}{2 \log n} \Lambda_2(n)}, conclude that

\displaystyle  |\sum_n \frac{\Lambda(n)}{n^\sigma} \hbox{Re}(n^{-it})| \leq (\frac{2}{\pi}+o(1)) \frac{1}{\sigma-1},

and use this to give an alternate proof of the prime number theorem. (This argument is related, though not completely identical, to the Erdős-Selberg elementary proof of the prime number theorem, which we will not give here.)

Remark 37 Another heuristic way to see the lack of zeroes on the line {\{1+it: t \in {\bf R}\}} is to return to the explicit formula (1). If there was a zero at {1+it}, there would also be a zero at {1-it} thanks to the conjugation symmetry (22), and hence

\displaystyle  \Lambda \approx 1 - n^{-it} - n^{it} - \dots .

In particular, {\Lambda} should behave like {-1} or less on the average in the region where {n^{it} \approx 1} (which would imply that other powers {n^{it'}} are also comparable to {1} if {t'} is an integer multiple of {t}, or else oscillate “orthogonally” to {n^{it}}). But {\Lambda} is non-negative, which heuristically suggests a contradiction. One can interpret the arguments based on (34) above as a rigorous implementation of this heuristic argument.

We have now established that the Riemann zeta function has no zeroes on the line {\{1+it: t \in {\bf R} \}}. Since the zeroes of {\zeta} are discrete, this implies a qualitative zero-free region to the left of this line, in the sense that there is an open neighbourhood of this line that is free of zeroes of {\zeta}. However, for applications (such as Corollary 23), we need a more quantitative zero-free region. To do this, we return to the bound in Proposition 19 as a quantitative substitute for the bound (28). We specialise to the case where {s = \sigma+it} with {1 < \sigma < 2} and {|t| \gg 1}, and set {\varepsilon=1/2} (say). In this case, the {\frac{1}{s-1}} term is {O(1)}, and we conclude that

\displaystyle  -\frac{\zeta'}{\zeta}(\sigma+it) = -\sum_{\rho: |\sigma+it-\rho| \leq 1/4} \frac{1}{\sigma+it-\rho} + O( \log(2+|t|) ) \ \ \ \ \ (37)

Observe that as all the zeroes {\rho} have real part at most {1}, the quantity {\frac{1}{\sigma+it-\rho}} has non-negative real part. Thus we have

\displaystyle  - \hbox{Re} \frac{\zeta'}{\zeta}(\sigma+it) \leq O( \log(2+|t|) ) \ \ \ \ \ (38)

whenever {1 < \sigma < 2} and {|t| \gg 1}. If there is a zero {\sigma_0 + it} of {\zeta} with the same imaginary part as {\sigma+it}, with {\sigma_0 > 0}, then we have the improvement

\displaystyle  - \hbox{Re} \frac{\zeta'}{\zeta}(\sigma+it) \leq - \frac{1}{\sigma-\sigma_0} + O( \log(2+|t|) ) \ \ \ \ \ (39)

(note that the term {\frac{1}{\sigma-\sigma_0}} is only of size {O(1)} and thus negligible if {|\sigma-\sigma_0| > 1/4}). This now gives

Proposition 38 (Classical zero-free region) There exists an absolute constant {c>0} such that there are no zeroes of {\zeta} in the region

\displaystyle  \{ \beta+it: \beta > 1 - \frac{c}{\log(2+|t|)}\}.

Proof: Let {c>0} be a small constant to be chosen later. Suppose for contradiction that one has

\displaystyle  \zeta(\beta + it ) = 0

for some {t \in {\bf R}} and {\beta > 1 - \frac{c}{\log(2+|t|)}}. As {\zeta} has a simple pole at {1}, there are no zeroes in a neighbourhood of {1}, and so one has {|t| \gg 1} if {c} is small enough. For any {1 < \sigma < 2}, we conclude from (39) that

\displaystyle  - \hbox{Re} \frac{\zeta'}{\zeta}(\sigma+it) \leq - \frac{1}{\sigma-\beta} + O( \log(2+|t|) )

while from (38) one has

\displaystyle  - \hbox{Re} \frac{\zeta'}{\zeta}(\sigma+2it) \leq O( \log(2+|t|) )

and from (13) one has

\displaystyle  - \frac{\zeta'}{\zeta}(\sigma) = \frac{1}{\sigma - 1} + O(1).

Inserting these bounds into (35), we conclude that

\displaystyle  \frac{3}{\sigma-1} - \frac{4}{\sigma-\beta} \geq - O( \log(2+|t|) )

for any {1 < \sigma < 2}. Setting {\sigma := 1 + C (1-\beta)} for a sufficiently large absolute constant {C} (actually {C=4} suffices), we still have {\sigma < 2} if {c} is small enough, and the left-hand side is equal to

\displaystyle  (\frac{3}{C} - \frac{4}{C+1}) \frac{1}{1-\beta}.

For {C} large enough, {\frac{3}{C} - \frac{4}{C+1}} is negative, so the above inequality forces {1-\beta \gg \frac{1}{\log(2+|t|)}}; this contradicts the hypothesis {\beta > 1 - \frac{c}{\log(2+|t|)}} if {c} is small enough. \Box
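The arithmetic behind the choice of {C} can be checked mechanically. The following is a small sanity check in exact rational arithmetic, not part of the original argument; the helper name `lhs_coefficient` is introduced here purely for illustration.

```python
from fractions import Fraction

def lhs_coefficient(C):
    """Coefficient of 1/(1-beta) in 3/(sigma-1) - 4/(sigma-beta)
    after substituting sigma = 1 + C(1-beta), i.e. 3/C - 4/(C+1)."""
    C = Fraction(C)
    return 3 / C - 4 / (C + 1)

# The coefficient is negative exactly when 3(C+1) < 4C, i.e. when C > 3.
for C in (2, 3, 4, 10):
    print(C, lhs_coefficient(C))
```

In particular the value at {C=4} is {-\frac{1}{20}}, which is the constant that reappears in the proof of Proposition 51.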

We can insert this zero-free region into Corollary 23, optimising the choice of parameters, to obtain a quantitative form of the prime number theorem, first obtained by de Vallée Poussin:

Corollary 39 (Prime number theorem with classical error term) We have

\displaystyle  \sum_{n \leq x} \Lambda(n) = x + O( x \exp( - c \sqrt{\log x} ) )

for all {x \geq 2} and some absolute constant {c>0}. In particular, one has

\displaystyle  \sum_{n \leq x} \Lambda(n) = x + O_A( x \log^{-A} x )

for any {x \geq 2} and {A \geq 0}.

Proof: Apply Corollary 23 with {T := \exp( c_1 \sqrt{\log x} )} and {\delta := \frac{c_1}{\sqrt{\log x}}} for some small absolute constant {c_1>0}; this choice of parameters is designed to roughly balance the size of two error terms in that corollary, which is usually a near-optimal way to choose parameters. The required zero-free region follows from Proposition 38 if {c_1} is small enough, and the claim then follows (noting that logarithmic factors {\log^{O(1)} x} can be absorbed into the decay factors {O( \exp( - c \sqrt{\log x} ) )} by shrinking {c} slightly). \Box
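As a purely numerical illustration (it proves nothing about the error term), one can compute {\sum_{n \leq x} \Lambda(n)} directly with a sieve and compare it with {x}. The function name `chebyshev_psi` and the sample values of {x} are choices made here for the sketch.

```python
import math

def chebyshev_psi(x):
    """Return the sum of Lambda(n) over n <= x, using a sieve of Eratosthenes."""
    x = int(x)
    is_prime = bytearray([1]) * (x + 1)
    is_prime[0] = is_prime[1] = 0
    for p in range(2, math.isqrt(x) + 1):
        if is_prime[p]:
            # strike out the multiples p^2, p^2 + p, ... of p
            is_prime[p * p :: p] = bytearray(len(range(p * p, x + 1, p)))
    total = 0.0
    for p in range(2, x + 1):
        if is_prime[p]:
            # every prime power p^k <= x contributes log p
            pk = p
            while pk <= x:
                total += math.log(p)
                pk *= p
    return total

for x in (10**3, 10**4, 10**5):
    psi = chebyshev_psi(x)
    print(x, psi, psi / x)
```

The ratio printed in the last column should drift towards {1}, although at this scale one cannot distinguish the classical error term from much stronger or weaker ones.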

Exercise 40 (Alternate form of prime number theorem) Show that for any {x \geq 2}, the number {\pi(x)} of primes less than or equal to {x} obeys the estimate

\displaystyle  \pi(x) = \hbox{li}(x) + O( x \exp( - c \sqrt{\log x} ) )

for some absolute constant {c>0}, where the logarithmic integral {\hbox{li}(x)} is defined by the formula

\displaystyle  \hbox{li}(x) := \int_2^x \frac{dt}{\log t}.

Conclude in particular that

\displaystyle  \pi(x) = \frac{x}{\log x} + \frac{x}{\log^2 x} + O( \frac{x}{\log^3 x} )

for all {x \geq 2}; in particular, the simple form {\pi(x) \sim x/\log x} of the prime number theorem is not particularly accurate, and one should use the refined version {\pi(x) \sim \hbox{li}(x)} instead (or better yet, work with the von Mangoldt function).
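The difference in accuracy between {x/\log x} and the logarithmic integral is already visible numerically. The sketch below is illustrative only: it takes the logarithmic integral with lower limit {2} (an assumption of the sketch; moving the lower limit changes the integral by {O(1)}), evaluates it by Simpson's rule, and the helper names `prime_count` and `li_from_2` are introduced here.

```python
import math

def prime_count(x):
    """Return pi(x), the number of primes <= x, via the sieve of Eratosthenes."""
    is_prime = bytearray([1]) * (x + 1)
    is_prime[0] = is_prime[1] = 0
    for p in range(2, math.isqrt(x) + 1):
        if is_prime[p]:
            is_prime[p * p :: p] = bytearray(len(range(p * p, x + 1, p)))
    return sum(is_prime)

def li_from_2(x, steps=20000):
    """Approximate the integral of dt/log t from 2 to x.

    Substituting t = e^u turns it into the integral of e^u/u over
    [log 2, log x], evaluated by composite Simpson's rule (steps even).
    """
    a, b = math.log(2), math.log(x)
    h = (b - a) / steps
    f = lambda u: math.exp(u) / u
    total = f(a) + f(b)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3

x = 10**6
pi_x = prime_count(x)
log_x = math.log(x)
print("pi(x)                    =", pi_x)
print("x/log x                  =", x / log_x)
print("x/log x + x/log^2 x      =", x / log_x + x / log_x**2)
print("integral of dt/log t     =", li_from_2(x))
```

At {x = 10^6} the logarithmic integral is off by roughly a hundred, while {x/\log x} is off by several thousand, in line with the {\frac{x}{\log^2 x}} correction term.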

Exercise 41 (Prime number theorem for Möbius) Show that there is an absolute constant {c_1>0} such that one has the bounds

\displaystyle  -\frac{\zeta'}{\zeta}(\sigma+it) = O( \log(2+|t|) )

and

\displaystyle  \log|\zeta(\sigma+it)| = O( \log \log(100+|t|) )

whenever {|t| \gg 1} and {\sigma \geq 1 - \frac{c_1}{\log(2+|t|)}}. Conclude the alternate form

\displaystyle  \sum_{n \leq x} \mu(n) = O( x \exp( - c_2 \sqrt{\log x} ) )

of the prime number theorem with classical error term for all {x \geq 2} and some {c_2>0}.

Exercise 42 (Landau-Beurling prime number theorem) Let

\displaystyle  1 < p_1 \leq p_2 \leq \dots

be a set of real numbers, which we refer to as Beurling primes. Define a Beurling integer to be a real number of the form

\displaystyle  p_{i_1}^{a_1} \dots p_{i_r}^{a_r}

for some {i_1 < \dots < i_r} and {a_1,\dots,a_r \geq 0}; note that due to potential collisions between different products of Beurling primes, it is possible for a real number to be a Beurling integer in multiple ways. Let {{\mathcal P}_B} and {{\bf N}_B} denote the sets of Beurling primes and Beurling integers respectively. If we have the asymptotic bound

\displaystyle  \sum_{n \in {\bf N}_B: n \leq x} 1 = x + O( x^{1-c_0} ) \ \ \ \ \ (40)

for all {x \geq 1} and some absolute constant {c_0>0}, establish the Landau-Beurling prime number theorem

\displaystyle  \sum_{p \in {\mathcal P}_B: p \leq x} 1 = \hbox{li}(x) + O( x \exp( - c_1 \sqrt{\log x} ) ) \ \ \ \ \ (41)

for all {x \geq 2} and some absolute constant {c_1>0}; this generalises Exercise 40. (Hint: form the Beurling zeta function {\zeta_B(s) := \sum_{n\in {\bf N}_B} \frac{1}{n^s}} and show that it has a meromorphic continuation to the region {\{ s: \hbox{Re}(s) > 1-c_0\}}, and obeys the bounds {\zeta_B(\sigma+it) = O_{c_0,\varepsilon}( |t|^{O_{c_0,\varepsilon}(1)} )} for {\varepsilon>0}, {|t| \gg 1}, and {\sigma > 1 - c_0 + \varepsilon}. Then repeat the proof of the prime number theorem, all the way down to Exercise 40.) This result is essentially due to Landau; Beurling was able to obtain a variant in which the hypothesis (40) and conclusion (41) were both weakened. On the other hand, it was shown by Diamond, Montgomery, and Vorhauer that without any further axioms on Beurling integers beyond (40), it is not possible to improve upon the estimate (41) (other than by sharpening the constant {c_1}). Thus, to go beyond the prime number theorem with classical error term, one needs to know more about the natural numbers than just that they are roughly uniformly distributed on the positive real axis in the sense of (40).

In later notes, we will obtain better upper bounds on {|\zeta(s)|} in the critical strip (and particularly near the line {\{1+it: t \in {\bf R}\}}) that improve upon (24). This will allow us to obtain variants of Proposition 19 near the line {\{1+it: t \in {\bf R}\}} in which the error term {O( \log(2+|t|) )} is replaced with a smaller quantity. The argument based on (35) will then allow us to enlarge the classical zero-free region in Proposition 38, which in turn leads to an improved error term in the prime number theorem. The asymptotically strongest such result is due to Vinogradov and Korobov, who use new upper bounds on {|\zeta|} to obtain a zero-free region of the form

\displaystyle  \{ \beta+it: \beta > 1 - \frac{c}{\log(100+|t|)^{2/3} \log\log(100+|t|)^{1/3}} \} \ \ \ \ \ (42)

for some {c>0}, which leads to the prime number theorem

\displaystyle  \sum_{n \leq x} \Lambda(n) = x + O( x \exp( - c \log^{3/5} x / (\log\log x)^{1/5} ) ) \ \ \ \ \ (43)

for some {c>0} and all {x\geq 100}; see the exercise below. This still falls short of the claims in Proposition 24 for any fixed {\delta>0}, however it is important for some applications (e.g. finding primes in short intervals) to get some improvement over the classical zero-free region in Proposition 38.

Exercise 43

  • (i) Establish the upper bound

    \displaystyle  |\zeta(\sigma+it)| \ll \log^{O(1)} |t| \ \ \ \ \ (44)

    whenever {|t| \geq 100} and {\sigma> 1 - c \frac{\log\log |t|}{\log |t|}} for an absolute constant {c>0}. (Hint: apply (21) for a suitable choice of {x}.)

  • (ii) Assume that the upper bound (44) in fact holds in the larger region where {|t| \geq 100} and {\sigma> 1 - c \frac{(\log\log |t|)^{2/3}}{\log^{2/3} |t|}} for some absolute constant {c>0}. (This bound, essentially due to Vinogradov and Korobov, will be rigorously established in later notes.) Conclude the variant

    \displaystyle  -\frac{\zeta'}{\zeta}(s) = -\sum_{\rho: |\rho - s| \leq c' \frac{(\log \log |t|)^{2/3}}{\log^{2/3} |t|}} \frac{1}{s-\rho}

    \displaystyle  + O( \log^{2/3} |t| (\log\log |t|)^{1/3})

    of (37), whenever {s=\sigma+it} with {|t| \geq 200} and {\sigma > 1}, and {c'>0} is an absolute constant.

  • (iii) With the assumption in (ii), establish a zero-free region of the form (42).
  • (iv) Assuming a zero-free region of the form (42), deduce (43).
  • (v) What happens if one starts only with the bound in (i), rather than in (ii)?

— 4. Dirichlet {L}-functions, Siegel’s theorem, and the prime number theorem in arithmetic progressions —

We now extend the above theory of the Riemann zeta function to Dirichlet {L}-functions {L(s,\chi)}, where {\chi: {\bf Z} \rightarrow {\bf C}} is a Dirichlet character of some period {q}. As already remarked in Remark 1, the theory of such functions is very similar to that of the zeta function, with the character {\chi(n)} being like a “non-Archimedean” counterpart of the “Archimedean” character {n^{it}}. However, there is one key new feature, which is that the behaviour near {s=1} is not completely understood when {\chi} is a real character.

For {\hbox{Re} s > 1}, the Dirichlet {L}-functions are defined as {L(s,\chi) = {\mathcal D} \chi(s)}, thus

\displaystyle  L(s,\chi) := \sum_n \frac{\chi(n)}{n^s}.

By the general theory of Dirichlet series, this is an analytic function on the half-space {\{ s: \hbox{Re}(s) > 1 \}}. Since {\delta = \chi * (\mu \chi)} and {L \chi = \chi * (\Lambda \chi)}, we then have

\displaystyle  \frac{1}{L(s,\chi)} = \sum_n \frac{\mu(n) \chi(n)}{n^s}

and

\displaystyle  -\frac{L'(s,\chi)}{L(s,\chi)} = \sum_n \frac{\Lambda(n) \chi(n)}{n^s},

where the derivative is always understood to be with respect to the {s} variable. In particular, we have the analogue of (11):

\displaystyle  |L(s,\chi)|, |\frac{1}{L(s,\chi)}|, |\frac{L'(s,\chi)}{L(s,\chi)}| \leq \frac{1}{\sigma-1} + O(1) \ \ \ \ \ (45)

whenever {s=\sigma+it} with {\sigma > 1} and {t \in {\bf R}}. In particular, {L(s,\chi)} has no zeroes in the region {\{ s: \hbox{Re}(s) > 1 \}}.

We also have the Euler product

\displaystyle  L(s,\chi) = \prod_p (1 -\frac{\chi(p)}{p^s})^{-1}.

If {\chi} is a principal character, thus {\chi(n) = 1_{(n,q)=1}}, then we may compare this Euler product with the corresponding Euler product {\zeta(s) = \prod_p (1 - \frac{1}{p^s})^{-1}} of the Riemann zeta function, and conclude that

\displaystyle  L(s,\chi) = \zeta(s) \prod_{p|q} (1 -\frac{1}{p^s}). \ \ \ \ \ (46)

The product {\prod_{p|q} (1 -\frac{1}{p^s})} extends analytically to the entire complex plane, and has no zeroes in the region {\{\hbox{Re}(s) > 0\}}, so this {L}-function has a meromorphic extension to {\{\hbox{Re}(s) > 0 \}} with exactly the same zeroes and poles as {\zeta}.
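The factorisation (46) is easy to test numerically at a point where {\zeta} is known in closed form. The following sketch is an illustration only: it takes {s=2} and {q=6} as sample values, uses the classical evaluation {\zeta(2) = \pi^2/6}, and the function names are introduced here.

```python
import math

def principal_L_partial(s, q, N):
    """Partial sum over n <= N of chi_0(n)/n^s, with chi_0 the principal
    character of period q (so only n coprime to q contribute)."""
    return sum(n ** (-s) for n in range(1, N + 1) if math.gcd(n, q) == 1)

def euler_correction(s, q):
    """Product of (1 - p^{-s}) over the primes p dividing q (trial division)."""
    prod, p, m = 1.0, 2, q
    while m > 1:
        if m % p == 0:          # p is prime here: smaller factors were removed
            prod *= 1 - p ** (-s)
            while m % p == 0:
                m //= p
        p += 1
    return prod

s, q = 2, 6
lhs = principal_L_partial(s, q, 10**5)              # truncation of L(2, chi_0)
rhs = (math.pi ** 2 / 6) * euler_correction(s, q)   # zeta(2) times the local factors
print(lhs, rhs)
```

The two printed numbers should agree up to the truncation error of the partial sum, which is at most {1/N} here.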

The more interesting situation occurs when {\chi} is non-principal. In this case, {\chi} has mean zero on every interval of length {q}. This gives a bound on slowly varying sums of {\chi} (cf. Lemma 71 of Notes 1):

Exercise 44 Let {\chi} be a non-principal Dirichlet character of period {q}, let {1 \leq x < y}, and let {f: [x,y] \rightarrow {\bf C}} be a continuously differentiable function. Show that

\displaystyle  \sum_{x \leq n < y} \chi(n) f(n) \ll q \int_x^y |f'(t)|\ dt + q |f(y)|.

If {\chi} is principal, show instead that

\displaystyle  \sum_{x \leq n < y} \chi(n) f(n) - \frac{\phi(q)}{q} \int_x^y f(t)\ dt \ll q \int_x^y |f'(t)|\ dt + q |f(y)|.

Using this exercise, we see that

\displaystyle  \sum_{x \leq n < y} \frac{\chi(n)}{n^s} \ll q \frac{|s|}{\sigma} x^{-\sigma} \ \ \ \ \ (47)

whenever {s = \sigma+it} with {\sigma > 0}. By Lemma 5 of Notes 1, we conclude that for any such {s}, there is a unique complex number {L(s,\chi)} such that

\displaystyle  \sum_{n < x} \frac{\chi(n)}{n^s} = L(s,\chi) + O( q \frac{|s|}{\sigma} x^{-\sigma} ) \ \ \ \ \ (48)

for any {x > 0}. In particular, the partial sums {\sum_{n \leq x} \frac{\chi(n)}{n^s}} converge locally uniformly to {L(s,\chi)} on the half-space {\{ s: \hbox{Re}(s) > 0 \}}, and so {L(s,\chi)} is holomorphic on this region. This is similar to {\zeta}, but with the key difference that there is no longer any pole at {s=1}.
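To see the convergence in (48) concretely, one can take the non-principal character of period {4} (a sample choice made here, as are the function names `chi4` and `partial_L`). For this character the classical Leibniz series gives {L(1,\chi) = \pi/4}, and the partial sums also settle down at {s = 1/2}, well inside the critical strip, where the series for {\zeta} would diverge.

```python
import math

def chi4(n):
    """The non-principal character of period 4:
    +1 on 1 mod 4, -1 on 3 mod 4, and 0 on even numbers."""
    return (0, 1, 0, -1)[n % 4]

def partial_L(s, x):
    """Sum of chi4(n)/n^s over 1 <= n < x, as on the left-hand side of (48)."""
    return sum(chi4(n) / n ** s for n in range(1, x))

print("s = 1  :", partial_L(1, 10**4), "vs pi/4 =", math.pi / 4)
print("s = 1/2:", partial_L(0.5, 10**4), partial_L(0.5, 4 * 10**4))
```

The slow stabilisation at {s=1/2} reflects the factor {x^{-\sigma}} in the error term of (48).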

Setting {x=1} in (48), we obtain the crude bound

\displaystyle  L(s,\chi) \ll_\varepsilon q (2+|t|) \ \ \ \ \ (49)

when {s=\sigma+it} with {\sigma > \varepsilon > 0}. Taking logarithms, we have

\displaystyle  \log |L(s,\chi)| \leq O_\varepsilon( \log(q (2+|t|)) ).

One can then repeat many of the arguments in Section 2 with few changes (other than replacing logarithmic factors such as {\log(2+|t|)} with {\log(q(2+|t|))} instead, and removing the effect of a pole at {s=1}):

Exercise 45 Let {\chi} be a non-principal character of period {q}.

  • (i) (Crude upper bound on {L}-function zeroes) For any {\varepsilon > 0} and {t_0 \in {\bf R}}, show there are at most {O_\varepsilon( \log(q(2+|t_0|)) )} zeroes of {L(\cdot,\chi)} in the region {\{ \sigma+it: \varepsilon \leq \sigma \leq 1; |t-t_0| \leq 1 \}}. (As with {\zeta}, zeroes of {L(\cdot,\chi)} are always understood here to be counted with multiplicity.)
  • (ii) (Approximate formula for log-derivative of {L}-function) For any {C,\varepsilon > 0}, show that

    \displaystyle  -\frac{L'(s,\chi)}{L(s,\chi)} = -\sum_{\rho: |s-\rho| \leq \varepsilon/2} \frac{1}{s-\rho} + O_{C,\varepsilon}( \log(q(2+|t|)) ),

    whenever {s = \sigma+it} and {\varepsilon \leq \sigma \leq C}. Here and in the rest of this exercise, the sum is over the zeroes {\rho} of {L(s,\chi)}.

  • (iii) (Local integrability of log-derivative) For any {C > \varepsilon > 0} and {t_0 \in {\bf R}}, show that

    \displaystyle  \int_\varepsilon^C \int_{t_0-1}^{t_0+1} |\frac{L'(\sigma+it,\chi)}{L(\sigma+it,\chi)}|\ dt d\sigma \ll_{\varepsilon,C} \log(q(2+|t_0|)).

  • (iv) (Truncated twisted von Mangoldt explicit formula) For any {0 < \varepsilon < 1} and {x, T \geq 2}, show that

    \displaystyle  \sum_{n \leq x} \Lambda(n) \chi(n) = -\sum_{\rho: \hbox{Re}(\rho) \geq \varepsilon; |\hbox{Im}(\rho)| \leq T} \frac{x^\rho}{\rho}

    \displaystyle  + O_\varepsilon( x^\varepsilon \log^2(qT) + \frac{x}{T} \log^2(xqT) ).

    (Compare with (5).)

  • (v) (Smoothed twisted explicit formula) Let {g: {\bf R} \rightarrow {\bf C}} be a smooth, compactly supported function. Then for any {\varepsilon>0} and {x \geq 2}, show that

    \displaystyle  \sum_n \Lambda(n) \chi(n) g( \log n - \log x) = - \sum_{\rho: \hbox{Re}(\rho) > \varepsilon} x^\rho \hat g(-i\rho)

    \displaystyle  + O_{g,\varepsilon}( x^\varepsilon \log(2q) ),

    with the sum on the right-hand side being absolutely convergent. (Again, compare with (5).)

  • (vi) (Zero-free region controls twisted von Mangoldt summatory function) Let {T \geq 2} and {0 < \delta \leq 1/2}, and suppose that there are no zeroes {\rho} of {L(s,\chi)} in the rectangle {\{ s: 1-\delta < \hbox{Re}(s) \leq 1; |\hbox{Im}(s)| \leq T \}}. Then show that

    \displaystyle  \sum_{n \leq x} \Lambda(n) \chi(n) = O( x^{1-\delta} \log^2(qT) ) + O( \frac{x}{T} \log^2(xqT) )

    for all {x \geq 2}.

Exercise 46 Let {\chi} be a non-principal character of period {q}, and let {0 < \delta \leq 1/2}. Show that the following assertions are equivalent:

  • (i) One has {\sum_{n \leq x} \Lambda(n) \chi(n) = O(x^{1-\delta+o(1)})} as {x \rightarrow \infty}.
  • (ii) One has {\sum_{n \leq x} \Lambda(n) \chi(n) = O(x^{1-\delta} \log^2(qx) )} for all {x \geq 2}.
  • (iii) All the zeroes {\rho} of {L(\cdot,\chi)} have real part at most {1-\delta}.

Based on this exercise, it is now natural to generalise the Riemann hypothesis:

Conjecture 47 (Generalised Riemann hypothesis) Let {\chi} be a Dirichlet character. Then all the zeroes of {L(\cdot,\chi)} have real part at most {1/2}.

Given that the Riemann hypothesis (RH) remains unsolved, the stronger assertion of the generalised Riemann hypothesis (GRH) is also unsolved. (But later on we will establish the Bombieri-Vinogradov theorem, of major importance in sieve theory, which can be viewed as a kind of assertion that the generalised Riemann hypothesis holds “on average” in a certain technical sense.)

Exercise 48 Assume the Generalised Riemann hypothesis. Show that

\displaystyle  \sum_{n \leq x: n = a\ (q)} \Lambda(n) = \frac{x}{\phi(q)} + O( x^{1/2} \log^2(qx) )

for all primitive congruence classes {a\ (q)} and all {x \geq 2}.

Exercise 49 (Equivalent forms of twisted prime number theorem) Let {\chi} be a non-principal Dirichlet character. Show that the following assertions are equivalent:

  • (i) One has {\sum_{n \leq x} \Lambda(n) \chi(n) = o(x)} as {x \rightarrow \infty}.
  • (ii) All the zeroes {\rho} of {L(\cdot,\chi)} have real part strictly less than one.

This exercise should be compared with the derivation of Dirichlet’s theorem (Theorem 70 from Notes 1) from the non-vanishing of {L(1,\chi)} (Theorem 73 from Notes 1).

Now we obtain zero-free regions for {L(s,\chi)} for a Dirichlet character {\chi} of period {q}. From the Mertens trigonometric inequality (32) we have

\displaystyle  \chi_0(n) - \hbox{Re}(\chi^2(n) n^{-2it}) \leq 4 (\chi_0(n) + \hbox{Re}(\chi(n) n^{-it}) ),

where {\chi_0 := |\chi|} is the principal character of period {q}; equivalently, we have

\displaystyle  3 \chi_0(n) + 4 \hbox{Re}(\chi(n) n^{-it}) + \hbox{Re}(\chi^2(n) n^{-2it}) \geq 0. \ \ \ \ \ (50)

Multiplying by {\frac{\Lambda(n)}{n^\sigma}} for some {\sigma>1} and summing, we obtain a twisted version of (35),

\displaystyle  - 3 \frac{L'(\sigma,\chi_0)}{L(\sigma,\chi_0)} - 4 \hbox{Re} \frac{L'(\sigma+it,\chi)}{L(\sigma+it,\chi)} - \hbox{Re} \frac{L'(\sigma+2it,\chi^2)}{L(\sigma+2it,\chi^2)} \geq 0 \ \ \ \ \ (51)

for any {\sigma > 1} and {t \in {\bf R}}. Integrating this in {\sigma} from {\sigma=+\infty} gives a twisted version of (36):

\displaystyle  |L(\sigma,\chi_0)|^3 |L(\sigma+it,\chi)|^4 |L(\sigma+2it,\chi^2)| \geq 1. \ \ \ \ \ (52)
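The pointwise inequality (50) that drives (51) and (52) can be sanity-checked by brute force. The sketch below is illustrative only: it uses the complex character of period {5} determined by {\chi(2) = i} as a sample, scans a finite grid of {n} and {t}, and the names `chi5` and `mertens_combination` are introduced here.

```python
import cmath
import math

def chi5(n):
    """A complex Dirichlet character of period 5, determined by chi5(2) = i
    (2 generates the multiplicative group mod 5: 2, 4, 3, 1)."""
    return {0: 0, 1: 1, 2: 1j, 3: -1j, 4: -1}[n % 5]

def mertens_combination(n, t):
    """Left-hand side of (50):
    3 chi_0(n) + 4 Re(chi(n) n^{-it}) + Re(chi(n)^2 n^{-2it})."""
    chi = chi5(n)
    chi0 = abs(chi)                              # principal character |chi|
    twist = cmath.exp(-1j * t * math.log(n))     # n^{-it}
    return 3 * chi0 + 4 * (chi * twist).real + (chi ** 2 * twist ** 2).real

# Scan 1 <= n < 200 and t in {-10.0, -9.9, ..., 10.0}.
worst = min(mertens_combination(n, 0.1 * k)
            for n in range(1, 200) for k in range(-100, 101))
print("smallest value found:", worst)
```

When {\chi(n) \neq 0}, writing {\chi(n) n^{-it} = e^{i\theta}} turns the combination into {3 + 4\cos\theta + \cos 2\theta = 2(1+\cos\theta)^2}, so the smallest value found should be non-negative up to rounding error.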

We can now strengthen Dirichlet’s theorem (Theorem 70 from Notes 1):

Exercise 50 (Prime number theorem in arithmetic progressions)

  • (i) For any Dirichlet character {\chi}, show that {L(\cdot,\chi)} has no zeroes on the line {\{1+it: t \in {\bf R}\}}. (You will need Theorem 73 from Notes 1 to deal with the {t=0} case.)
  • (ii) For any primitive residue class {a\ (q)}, show that

    \displaystyle  \sum_{n \leq x: n = a\ (q)} \Lambda(n) = \frac{x}{\phi(q)} + o(x)

    as {x \rightarrow \infty} (keeping {a} and {q} fixed). (The decay rate in the {o()} notation may depend on {a} and {q}.)

  • (iii) For any primitive residue class {a\ (q)}, show that the number of primes in {a\ (q)} less than {x} is {\frac{x}{\phi(q) \log x} + o(\frac{x}{\log x})} as {x \rightarrow \infty} (keeping {a} and {q} fixed).

Next, we obtain the analogue of the classical zero-free region (Proposition 38), though with an important exception due to the lack of control near {s=1}:

Proposition 51 (Classical zero-free region for {L}-functions) There exists an absolute constant {c>0} such that, for any Dirichlet character {\chi} of period {q}, there are no zeroes of {L(\cdot,\chi)} in the region

\displaystyle  \{ \beta+it: \beta > 1 - \frac{c}{\log(q(2+|t|))} \}

with the possible exception of a single real zero {1 - \frac{c}{\log(2q)} < \beta < 1} (which we refer to as an exceptional zero or Siegel zero). The exceptional zero can only occur if {\chi} is a non-principal real character.

Proof: We may assume that {\chi} is non-principal, since otherwise the claim follows from Proposition 38. In particular, {q \geq 2}.

Let {c_1>0} be a small constant to be chosen later, and let {c>0} be sufficiently small depending on {c_1}.

First suppose that {\chi} is a complex character, so that {\chi^2} is non-principal. Suppose for contradiction that we have {L(\beta+it,\chi)=0} for some {t \in {\bf R}} and {\beta > 1 - \frac{c}{\log(q(2+|t|))}}. From Exercise 45(ii) and taking real parts, we have

\displaystyle  - \hbox{Re} \frac{L'(\sigma+it,\chi)}{L(\sigma+it,\chi)} \leq - \frac{1}{\sigma-\beta} + O( \log(q(2+|t|)) )

for any {1 < \sigma < 2}. Similarly, because {\chi^2} is non-principal, we have

\displaystyle  - \hbox{Re} \frac{L'(\sigma+2it,\chi^2)}{L(\sigma+2it,\chi^2)} \leq O( \log(q(2+|t|)) )

while from (45) we have

\displaystyle  - \hbox{Re} \frac{L'(\sigma,\chi_0)}{L(\sigma,\chi_0)} \leq \frac{1}{\sigma-1} + O( 1 ).

Applying (51), we conclude that

\displaystyle  \frac{3}{\sigma-1} - \frac{4}{\sigma-\beta} \geq -O( \log(q(2+|t|)) ).

Setting {\sigma = 1 + C(1-\beta)} for (say) {C=4}, we obtain a contradiction with {c} small enough. This completes the proof of the proposition when {\chi} is complex.

Now suppose that {\chi} is a real character, so that {\chi^2 = \chi_0}. We can adapt the previous argument, but need a new tool to estimate {- \hbox{Re} \frac{L'(\sigma+2it,\chi^2)}{L(\sigma+2it,\chi^2)}}. By (46) we have

\displaystyle  - \hbox{Re} \frac{L'(\sigma+2it,\chi^2)}{L(\sigma+2it,\chi^2)} = -\hbox{Re} \frac{\zeta'}{\zeta}(\sigma+2it) - \hbox{Re} \sum_{p|q} \frac{\log p}{p^{\sigma+2it}-1}.

We crudely bound

\displaystyle  \sum_{p|q} \frac{\log p}{p^{\sigma+2it}-1} = O( \sum_{p|q} \log p ) = O( \log q )

and then apply Proposition 19 and take real parts to conclude that

\displaystyle  - \hbox{Re} \frac{L'(\sigma+2it,\chi^2)}{L(\sigma+2it,\chi^2)} \leq \hbox{Re} \frac{1}{\sigma+2it-1} + O( \log(q(2+|t|)) ).

Applying (51) as before, we conclude that

\displaystyle  \frac{3}{\sigma-1} - \frac{4}{\sigma-\beta} + \hbox{Re} \frac{1}{\sigma+2it-1} \geq -O( \log(q(2+|t|)) ).

As before, we set {\sigma := 1+4(1-\beta)} and conclude that

\displaystyle  -\frac{1}{20(1-\beta)} + \hbox{Re} \frac{1}{4(1-\beta)+2it} \geq -O( \log(q(2+|t|)) ).

If {|t| \geq \frac{c_1}{\log q}}, then {|t| \geq 100(1-\beta)} (say) if {c} is small enough, leading again to a contradiction. Thus the only remaining case is when {\chi} is real and {|t| < \frac{c_1}{\log q}}.

We now show that there is at most one zero of {L(s,\chi)} in the region {\{ \beta+it: |t| \leq \frac{c_1}{\log q}; \beta > 1 - \frac{c}{\log(q(2+|t|))} \}}. If there are two such zeroes {\beta_1+it_1, \beta_2+it_2}, then from Exercise 45(ii) and taking real parts we have

\displaystyle  - \hbox{Re} \frac{L'(\sigma,\chi)}{L(\sigma,\chi)} \leq - \hbox{Re} \frac{1}{\sigma-\beta_1-it_1} - \hbox{Re} \frac{1}{\sigma-\beta_2-it_2} + O( \log q );

comparing this with (45), we conclude that

\displaystyle  \hbox{Re} \frac{1}{\sigma-\beta_1-it_1} + \hbox{Re} \frac{1}{\sigma-\beta_2-it_2} \leq \frac{1}{\sigma-1} + O(\log q).

If we set {\sigma = 1 + 100 \frac{c_1}{\log q}} (say), we obtain a contradiction if {c_1} is small enough.

As {\chi} is real, we have the conjugation symmetry

\displaystyle  L(\overline{s},\chi) = \overline{L(s,\chi)},

and so if {\beta+it} is a zero of {L(s,\chi)}, then {\beta-it} is one also. Thus there can be no strictly complex zeroes in the region {\{ \beta+it: |t| \leq \frac{c_1}{\log q}; \beta > 1 - \frac{c}{\log(q(2+|t|))} \}}, and at most one real zero; and the claim follows. (From Theorem 73 from Notes 1, {\beta} cannot equal {1}, and from (45) there are no zeroes to the right of {1}.) \Box

The exceptional zero {\beta} in the above proposition is quite a nuisance; if one believes in the generalised Riemann hypothesis, it should not exist, but frustratingly, we have not been able to completely exclude this zero from occurring. However, there is an important repulsion phenomenon (known as the Deuring-Heilbronn repulsion phenomenon), that asserts (roughly speaking) that the existence of one exceptional zero tends to repel away other exceptional zeroes. We already saw one instance of this phenomenon when proving Proposition 51, when we showed that a single character {\chi} could not have two or more exceptional zeroes. Another instance appeared in Proposition 76 of Notes 1.

To state the repulsion phenomenon more precisely, we have to exclude a degenerate case, coming from the fact that if one multiplies a Dirichlet character {\chi} (of some modulus {q}) by a principal character {\chi'_0} (of some modulus {q'}), then the resulting Dirichlet character {\chi \chi'_0} (which has modulus {[q,q']}) has essentially the same {L}-function as {\chi}, as {L(s,\chi\chi'_0)} and {L(s,\chi)} differ by a finite number of Euler factors (as in (46)), and so the two {L}-functions have an identical set of zeroes in the region {\{ s: \hbox{Re}(s) > 0\}}. To avoid this problem, let us call a Dirichlet character {\chi} of modulus {q} primitive if it cannot be factored as {\chi = \chi' \chi''_0}, where {\chi''_0} is a principal character and {\chi'} is a character of modulus strictly less than {q}.

Exercise 52

  • (i) Show that every Dirichlet character {\chi} of modulus {q} can be uniquely factored as {\chi = \chi' \chi''_0}, where {\chi'} is a primitive character of some modulus {q'} (known as the conductor of {\chi}) and {\chi''_0} is a principal character whose modulus {q''} is coprime to {q'}. Furthermore, {q'q''} divides {q}, and {\chi} is real if and only if {\chi'} is real. Thus we see that to understand the zeroes of Dirichlet {L}-functions {L(\cdot,\chi)}, it suffices to do so for the primitive characters.
  • (ii) Let {\chi, \chi'} be primitive Dirichlet characters. Show that {\chi \overline{\chi'}} is a principal character if and only if {\chi=\chi'}.

Here is one standard manifestation of the repulsion phenomenon:

Theorem 53 (Landau’s theorem) There is an absolute constant {c>0} with the following property: whenever {\chi,\chi'} are two distinct real primitive characters of conductor {q,q'} respectively, there is at most one real zero {\beta} of {L(s,\chi)} or {L(s,\chi')} with {\beta > 1 - \frac{c}{\log(qq')}}.

Proof: Let {c>0} be sufficiently small. If the claim failed, then (since each {L}-function has at most one exceptional zero) we can find {\beta, \beta' > 1 - \frac{c}{\log(qq')}} such that {L(\beta,\chi)=L(\beta',\chi')=0}.

In previous arguments, one used the inequality (50). Here, we will instead use the inequality

\displaystyle  (1+\chi(n)) (1+\chi'(n)) \geq 0

which we expand as

\displaystyle  1 + \chi(n) + \chi'(n) + \chi(n) \chi'(n) \geq 0.

Multiplying by {\frac{\Lambda(n)}{n^\sigma}} for some {1 < \sigma < 2} and summing, we conclude that

\displaystyle  - \frac{\zeta'}{\zeta}(\sigma) - \frac{L'(\sigma,\chi)}{L(\sigma,\chi)} - \frac{L'(\sigma,\chi')}{L(\sigma,\chi')} - \frac{L'(\sigma,\chi\chi')}{L(\sigma,\chi\chi')} \geq 0.

(We do not need to take real parts here, as everything in sight is already real.) From (13) we have

\displaystyle  - \frac{\zeta'}{\zeta}(\sigma) = \frac{1}{\sigma-1} + O(1),

and from Exercise 45(ii) we have

\displaystyle  - \frac{L'(\sigma,\chi)}{L(\sigma,\chi)} \leq - \frac{1}{\sigma-\beta} + O( \log q )

and

\displaystyle  - \frac{L'(\sigma,\chi')}{L(\sigma,\chi')} \leq - \frac{1}{\sigma-\beta'} + O( \log q' ).

By Exercise 52(ii), {\chi\chi'} is a non-principal character of modulus at most {qq'}, and so from Exercise 45(ii) again we have

\displaystyle  - \frac{L'(\sigma,\chi\chi')}{L(\sigma,\chi\chi')} \leq O( \log(qq') ).

Putting all this together, we see that

\displaystyle  \frac{1}{\sigma-1} - \frac{1}{\sigma-\beta} - \frac{1}{\sigma-\beta'} \geq -O( \log(qq') ).

Setting {\sigma := 1 + \frac{100 c}{\log(qq')}}, we obtain a contradiction if {c} is small enough. \Box

This gives a variant of Proposition 51, in which the zero-free region is reduced slightly, but there is only one primitive character that has an exceptional zero:

Exercise 54 (Page’s theorem) Let {Q \geq 2}. Show that for each primitive character {\chi} of conductor at most {Q}, the {L}-function {L(s,\chi)} has a zero-free region of the form {\{ \sigma+it: t \in {\bf R}, \sigma \geq 1 - \frac{c}{\log Q} \}} for some absolute constant {c>0}, with the possible exception of a single real zero {\beta} of the {L}-function of a single primitive real character {\chi_*} of conductor at most {Q}.

We will refer to Landau’s theorem and Page’s theorem collectively as the Landau-Page theorem.

Exercise 55 (Prime number theorem in arithmetic progressions with classical error term) Let {c > 0} be a sufficiently small quantity, and let {q} be a natural number.

  • (i) If {\chi_0} is the principal Dirichlet character modulo {q}, show that

    \displaystyle  \sum_{n \leq x} \Lambda(n) \chi_0(n) = x + O( x \exp( - c \sqrt{\log x} ) )

    if {x \geq 2} and {q \leq \exp( c \sqrt{\log x} )}.

  • (ii) If {\chi} is a non-principal Dirichlet character modulo {q}, show that

    \displaystyle  \sum_{n \leq x} \Lambda(n) \chi(n) = -\frac{x^\beta}{\beta} + O( x \exp( - c \sqrt{\log x} ) )

    if {x \geq 2}, {q \leq \exp(c \sqrt{\log x})} and {\chi} has an exceptional zero {\beta} (which, for this current exercise, means a zero of {L(s,\chi)} with {\beta \geq 1 - \frac{c}{\log q}}). If {\chi} has no exceptional zero, then the {\frac{x^\beta}{\beta}} term should be deleted; this is for instance the case when {\chi} is complex.

  • (iii) If {a\ (q)} is a primitive residue class modulo {q}, show that

    \displaystyle  \sum_{n \leq x: n = a\ (q)} \Lambda(n) = \frac{x}{\phi(q)} - \frac{\chi(a)}{\phi(q)} \frac{x^\beta}{\beta} + O( x \exp( - c \sqrt{\log x} ) )

    if {x \geq 2}, {q \leq \exp(c \sqrt{\log x})}, where {\chi} is a real non-principal character of modulus {q} with an exceptional zero {\beta}. (Note from Page’s theorem that there is at most one such character for a given {q}, if {c} is small enough.) This character {\chi} will be called the exceptional character. If there is no exceptional character, the term {\frac{\chi(a)}{\phi(q)} \frac{x^\beta}{\beta}} should be deleted.

(Note: the constant {c} may need to be smaller in (iii) than it needs to be for (i) or (ii).)

Remark 56 Informally, the prime number theorem in arithmetic progressions asserts that the primes {p} are equidistributed in the primitive residue classes modulo {q} for {\log p \gg \log^2 q}, unless there is an exceptional character {\chi} with exceptional zero {\beta}, in which case the primes {p} are more or less equidistributed in the primitive residue classes {a\ (q)} with {\chi(a)=-1} if {\log^2 q \ll \log p \ll \frac{1}{1-\beta}}, and then become equidistributed in all the primitive residue classes modulo {q} for {\log p \gg \max( \log^2 q, \frac{1}{1-\beta} )}.

The Landau-Page theorem is good at eliminating exceptional zeroes in a power range such as {[Q,Q^C]} for any fixed {C}, as it prevents more than a single primitive real character {\chi} of conductor {q} in this range having an exceptional zero {\beta} with {\beta \geq 1 - \frac{c}{C \log q}} for an absolute constant {c>0}. However, it loses control of exceptional zeroes in wider ranges than this. For instance, the Landau-Page theorem does not prevent the existence of an infinite sequence of exceptional real primitive characters {\chi_n} whose conductor {q_n} grows very rapidly in {n} (e.g. {q_n \sim \exp(\exp(100^n))}), and with each {\chi_n} having an exceptional zero {\beta_n} that converges very quickly to {1}, e.g. {1- \beta_n \sim q_n^{-10}}.

Fortunately we have another way to exploit the repulsion phenomenon even for characters of widely separated modulus. To develop this aspect of the repulsion phenomenon, we first need to establish a link between exceptional zeroes, and exceptionally small values of {L(1,\chi)}. We first give one direction of this link:

Lemma 57 (Exceptional zero implies small {L(1,\chi)}) Suppose that {\chi} is a real non-principal character of modulus {q} whose {L}-function {L(s,\chi)} has a real zero {\beta} with {\beta \geq 1 - O(\frac{1}{\log q})}. Then {L(1,\chi) \ll (\log^2 q) (1-\beta)}.

Proof: From (48) with {x=q} and {s = 1 + O(\frac{1}{\log q})} we have

\displaystyle  L(s,\chi) = \sum_{n < q} \frac{\chi(n)}{n^s} + O( 1 )

which on bounding {|\chi(n)| \leq 1} and {\frac{1}{n^s} = O( \frac{1}{n} )} gives

\displaystyle  L(s,\chi) = O( \log q ) \ \ \ \ \ (53)

for {s = 1 + O( \frac{1}{\log q} )}. From the generalised Cauchy integral formula (Exercise 9 of Supplement 2), we thus have

\displaystyle  L'(s,\chi) = O( \log^2 q )

for {s = 1 + O( \frac{1}{\log q} )}. Since {L(\beta,\chi)=0}, the claim now follows from the fundamental theorem of calculus. \Box

To go in the opposite direction, we will borrow a trick from the proof of the non-vanishing of {L(1,\chi)} (Theorem 73 from Notes 1) and exploit the positivity

\displaystyle  1 * \chi(n) \geq 1_{n=1} \ \ \ \ \ (54)

to understand the sums

\displaystyle  \sum_{n \leq x} \frac{1 * \chi(n)}{n^s}

for various choices of {s} and {x}. The key estimate is the following (compare with (21) or (48)):

Exercise 58 Let {\chi} be a real non-principal character of modulus {q}, and let {1/2 \leq s < 1} be a real number. Establish the bound

\displaystyle  \sum_{n \leq x} \frac{1*\chi(n)}{n^s} = \zeta(s) L(s,\chi) + \frac{x^{1-s}}{1-s} L(1,\chi) + O( \frac{q}{1-s} x^{1/2-s} )

for any {x \geq 1}. (Hint: use the Dirichlet hyperbola method and (21), (48), (24).) For an additional challenge, see if you can establish this estimate (possibly with a slightly weaker error term) by using the truncated Perron formula and contour shifting (bearing in mind that {1 * \chi(n)} is only bounded by {O(n^{o(1)})} rather than by {O( \log(2+n))}).
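As a quick numerical sanity check on the positivity (54), one can tabulate {1 * \chi(n)} for a concrete real non-principal character. The sketch below is only an illustration and not part of the argument: it assumes {\chi} is the Legendre symbol modulo an odd prime {q} (one convenient choice of real non-principal character), and the helper names `legendre`, `one_star_chi` and `partial_sum` are hypothetical names introduced here.

```python
def legendre(n, q):
    """Legendre symbol (n|q) for an odd prime q (assumed test character)."""
    r = pow(n % q, (q - 1) // 2, q)   # Euler's criterion: r is 0, 1 or q-1
    return -1 if r == q - 1 else r


def one_star_chi(n, q):
    """(1 * chi)(n): the sum of chi(d) over the divisors d of n."""
    return sum(legendre(d, q) for d in range(1, n + 1) if n % d == 0)


def partial_sum(x, s, q):
    """The truncated sum of (1*chi)(n) / n^s over n <= x."""
    return sum(one_star_chi(n, q) / n ** s for n in range(1, int(x) + 1))


if __name__ == "__main__":
    for q in (3, 5, 7, 11, 13):
        values = [one_star_chi(n, q) for n in range(1, 301)]
        # (54): the value at n = 1 is 1, and no value is negative.
        assert values[0] == 1 and min(values) >= 0
        print(q, values[:12], round(partial_sum(300, 0.75, q), 4))
```

In particular every partial sum {\sum_{n \leq x} \frac{1 * \chi(n)}{n^s}} computed this way is at least {1}, which is the only input from (54) needed in the proof of Lemma 59 below.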

Lemma 59 (Small {L(1,\chi)} implies exceptional zero) Suppose that {\chi} is a real non-principal character of modulus {q} such that {L(1,\chi) \leq \frac{c}{\log q}} for some sufficiently small absolute constant {c}. Then there is a real zero {\beta} of {L(s,\chi)} with {\beta \geq 1 - O( L(1,\chi) )}.

Proof: Let {s := 1 - C L(1,\chi)} for some large absolute constant {C}, and let {c} be sufficiently small depending on {C}, thus {s = 1 - O(c/\log q)} (recall that {L(1,\chi)} is positive), and so {1/2 \leq s < 1} if {c} is small enough. From (54) we have

\displaystyle  \sum_{n \leq x} \frac{1 * \chi(n)}{n^s} \geq 1,

for any {x \geq 1} and so from Exercise 58 we have

\displaystyle  1 \leq \zeta(s) L(s,\chi) + \frac{x^{C L(1,\chi)}}{C} + O( x^{-1/4} q / L(1,\chi) )

(say). Setting {x = q^{10} / L(1,\chi)^{10}} (say), we conclude that

\displaystyle  \zeta(s) L(s,\chi) \geq \frac{1}{2}

if {C} is large enough and {c} is small enough. By (24), {\zeta(s)} is negative. Thus, {L(s,\chi) < 0}, and so by the intermediate value theorem there must be a zero {\beta} of {L(\cdot,\chi)} between {s} and {1}, and the claim follows. \Box

Now suppose we have two distinct real primitive characters {\chi, \chi'} of modulus {q, q'} respectively, so that {\chi\chi'} is also a real non-principal character of modulus at most {qq'}. As in the proof of the Landau-Page theorem, we have the non-negativity

\displaystyle  1 + \chi(n) + \chi'(n) + \chi(n) \chi'(n) \geq 0.

We will instead exploit the multiplicative version of this non-negativity:

\displaystyle  1 * \chi * \chi' * \chi\chi'(n) \geq 1_{n=1}.

The latter bound can be deduced from the former after using the formal identity

\displaystyle  \sum_n \frac{1 * \chi * \chi' * \chi\chi'(n)}{n^s} = \exp( \sum_n \frac{\Lambda(n)}{\log n} \frac{1 + \chi(n) + \chi'(n) + \chi(n) \chi'(n)}{n^s} )

that comes from the identity

\displaystyle  \sum_n \frac{\chi(n)}{n^s} = \exp( \sum_n \frac{\Lambda(n) \chi(n)}{n^s \log n} )

valid for all characters {\chi} (cf. (28) from Notes 1); it can also be verified directly. In particular, we have

\displaystyle  \sum_{n \leq x} \frac{1*\chi*\chi'*\chi\chi'(n)}{n^s} \geq 1 \ \ \ \ \ (55)

for any {x \geq 1} and {0 < s < 1}. Meanwhile, one has the following variant of Exercise 58:

Exercise 60 Let {\chi,\chi'} be distinct real primitive characters of modulus {q}, {q'} respectively, and let {1-c < s < 1} for some sufficiently small absolute constant {c>0}. Establish the bound

\displaystyle \sum_{n \leq x} \frac{1*\chi*\chi'*\chi\chi'(n)}{n^s} = \zeta(s) L(s,\chi) L(s,\chi') L(s,\chi\chi') +

\displaystyle  \frac{x^{1-s}}{1-s} L(1,\chi) L(1,\chi') L(1,\chi\chi') + O( \frac{(q q')^{O(1)}}{1-s} x^{1-c-s} )

for any {x \geq 1}. (Hint: either use a higher-dimensional version of the Dirichlet hyperbola method, or the truncated Perron formula and contour shifting.)

This gives a repulsion phenomenon:

Proposition 61 (Repulsion phenomenon) Let {\chi} be a real primitive character of modulus {q}. Suppose that {L(\beta,\chi)=0} for some {1-c \leq \beta < 1}, where {c} is a sufficiently small absolute constant. Then one has

\displaystyle  L(1,\chi) L(1,\chi') \gg \frac{1-\beta}{(qq')^{O(1-\beta)} \log(qq')}

for all real primitive characters {\chi'} distinct from {\chi}, with {q'} denoting the modulus of {\chi'}.

Proof: From Exercise 60 with {s=\beta} and (55) (and shrinking {c} as needed) one has

\displaystyle  \frac{x^{1-\beta}}{1-\beta} L(1,\chi) L(1,\chi') L(1,\chi\chi') + O( \frac{(q q')^{O(1)}}{1-\beta} x^{1-2c-\beta} ) \geq 1

for any {x \geq 1}. If we then set {x := (\frac{qq'}{1-\beta})^C} for a sufficiently large absolute constant {C}, the error term is less than {1/2}, and so

\displaystyle  \frac{x^{1-\beta}}{1-\beta} L(1,\chi) L(1,\chi') L(1,\chi\chi') \gg 1.

From (53) one has {L(1,\chi\chi') \ll \log(qq')}, and from choice of {x} we have {x^{1-\beta} \ll (qq')^{O(1-\beta)}} with implied constants depending on {C}. The claim follows. \Box
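The pointwise non-negativity {1 * \chi * \chi' * \chi\chi'(n) \geq 1_{n=1}} that drives (55) can also be checked numerically for small conductors. The following sketch assumes, purely for illustration, that {\chi} and {\chi'} are the Legendre symbols modulo two distinct odd primes (these are real primitive characters); the function names `dirichlet_convolve` and `fourfold` are hypothetical helpers and do not come from the notes.

```python
def legendre(n, q):
    """Legendre symbol (n|q) for an odd prime q (assumed test character)."""
    r = pow(n % q, (q - 1) // 2, q)   # Euler's criterion: r is 0, 1 or q-1
    return -1 if r == q - 1 else r


def dirichlet_convolve(f, g):
    """Dirichlet convolution of two lists indexed 0..N (index 0 is unused)."""
    N = len(f) - 1
    h = [0] * (N + 1)
    for d in range(1, N + 1):
        if f[d] == 0:
            continue
        for m in range(1, N // d + 1):
            h[d * m] += f[d] * g[m]
    return h


def fourfold(q1, q2, N):
    """Values of 1 * chi * chi' * (chi chi') at n = 0..N (index 0 unused)."""
    one = [0] + [1] * N
    chi1 = [0] + [legendre(n, q1) for n in range(1, N + 1)]
    chi2 = [0] + [legendre(n, q2) for n in range(1, N + 1)]
    chi12 = [a * b for a, b in zip(chi1, chi2)]
    h = one
    for g in (chi1, chi2, chi12):
        h = dirichlet_convolve(h, g)
    return h


if __name__ == "__main__":
    h = fourfold(3, 5, 500)
    assert h[1] == 1 and min(h[1:]) >= 0
    print(h[1:31])
```

At a prime {p} the computed value is just {1 + \chi(p) + \chi'(p) + \chi(p)\chi'(p)}, so the additive non-negativity used in the Landau-Page theorem is visible directly in the output at prime indices.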

We can now give Siegel’s theorem on exceptional zeroes (or on exceptionally small values of {L(1,\chi)}), which will be the first theorem in this set of notes to feature ineffective implied constants – constants which cannot be explicitly computed in terms of the given data, but are merely known to be finite and positive.

Theorem 62 (Siegel’s theorem)

  • (i) For any {\varepsilon > 0}, one has the bound

    \displaystyle  L(1,\chi) \geq c_\varepsilon q^{-\varepsilon}

    for all but at most one real primitive character {\chi} of conductor {q}, and some constant {c_\varepsilon > 0}.

  • (ii) For any {\varepsilon > 0}, there are no zeroes of {L(s,\chi)} in the interval {[1-c_\varepsilon q^{-\varepsilon}, 1]} for all but at most one real primitive character {\chi} of conductor {q}, and some constant {c_\varepsilon > 0}.

In both (i) and (ii), the constant {c_\varepsilon} is effective: it can be computed explicitly in terms of {\varepsilon}. However, if one wishes to replace “all but at most one” with “all” in either (i) or (ii), one can do this at the cost of rendering {c_\varepsilon} ineffective: this constant is still known to be positive, but we no longer know of a way to compute {c_\varepsilon} explicitly in terms of {\varepsilon}.

Remark 63 The observation that Siegel’s theorem may be made effective if one exceptional character is removed is due to Tatuzawa. Combined with the class number formula, this can be used to show that with at most one exception, all but an explicitly computable finite list of quadratic fields of negative discriminant do not have unique factorisation. Indeed, using related methods, Heilbronn and Linfoot had previously shown that, apart from the nine discriminants {D = -3,-4,-7,-8,-11,-19,-43,-67,-163} (which all give quadratic fields of unique factorisation), there is at most one further negative discriminant giving a quadratic field of unique factorisation. This elusive “tenth discriminant” was finally ruled out by Heegner by some difficult arguments, which have since been clarified by subsequent work of Stark and many further authors, giving what is now known as the Stark-Heegner theorem.

Proof: From Lemmas 57, 59 we see that (i) and (ii) are equivalent, so we will just prove (i). It suffices to prove the claim with one exceptional character {\chi} deleted and with effective choices of {c_\varepsilon}, since one can reinstate the exceptional character (at the cost of making {c_\varepsilon} ineffective) just by using the positivity {L(1,\chi) > 0} for the exceptional character.

Let {c_\varepsilon} be a small (effective) constant, depending only on {\varepsilon}, to be chosen later. We divide into two cases:

  1. There are no zeroes of {L(s,\chi)} in the interval {[1-c_\varepsilon q^{-\varepsilon},1]} for any real primitive character {\chi} with a conductor {q}.
  2. There exists a real primitive character {\chi} of some conductor {q} with a zero {\beta} in {[1-c_\varepsilon q^{-\varepsilon},1]}.

In Case 1, Lemma 59 in the contrapositive gives {L(1,\chi) \gg c_\varepsilon q^{-\varepsilon}} for all real primitive characters {\chi} with a conductor {q}, giving the claim (after adjusting {c_\varepsilon} slightly).

Now suppose we are in Case 2, so {L(\beta,\chi)=0} for some real primitive character {\chi} of conductor {q} and some {1-c_\varepsilon q^{-\varepsilon} \leq \beta < 1} (recall that {L(1,\chi)} is non-vanishing). We may take {q} to be minimal among all such characters. Note that while {q} is obviously finite, we do not have any effective bound on {q}, so we have to proceed a little carefully if we are to prevent the final implied constants from depending on {\chi} or {q}.

Let {\chi'} be a real primitive character of conductor {q'} that is distinct from the exceptional character {\chi}. If {q' < q} then by the minimality of {q}, {L(s,\chi')} has no zeroes in {[1-c_\varepsilon (q')^{-\varepsilon},1]}, and the required bound follows again from Lemma 59 in the contrapositive. Now suppose that {q' \geq q}. From Proposition 61, we have

\displaystyle  L(1,\chi) L(1,\chi') \gg \frac{1-\beta}{(qq')^{O(1-\beta)} \log(qq')}.

By Lemma 57, we have {L(1,\chi) \ll (1-\beta) \log^2 q}. Bounding {q} by {q'}, we conclude that

\displaystyle  L(1,\chi') \gg \frac{1}{(q')^{O(1-\beta)} \log^3(q')},

and using the bound {1-\beta \leq c_\varepsilon q^{-\varepsilon}}, we obtain the required estimate if {c_\varepsilon} is small enough. \Box

The best known effective lower bounds on {L(1,\chi)} for all real primitive characters {\chi} (not excluding an exceptional character) go through the class number formula (as briefly discussed in Supplement 1), and take the shape

\displaystyle  L(1,\chi) \gg q^{-1/2} \log^{C} q \ \ \ \ \ (56)

for some explicit constant {C}; this corresponds to a zero-free region of {L(s,\chi)} of size {[1-c q^{-1/2} \log^{C-2} q, 1]} for some effective constants {c, C>0}. The bound (56) is trivial for {C=0} since the class number is always at least one; it turns out that one can raise {C} to be arbitrarily close to {1}, but this is a difficult result (at least when {\chi} is associated to a quadratic field of negative discriminant), due to Goldfeld and Gross-Zagier; see this survey of Goldfeld for further discussion. It is of great interest to improve these effective bounds further, but this has not yet been achieved; despite the conjectural non-existence of Siegel zeroes, they seem to live in a stubbornly self-consistent (though somewhat strange) universe that has defied all efforts to eradicate them to date.

In summary, we can give the following bounds on exceptional zeroes {\beta} of {L}-functions of real primitive characters {\chi} of conductor {q}:

  • (i) (Class number methods) One has {\beta \leq 1 - c q^{-1/2} \log^{C-2} q} for some effective {c > 0} and {C}.
  • (ii) (Siegel) For any {\varepsilon>0}, one has {\beta \leq 1 - c_\varepsilon q^{-\varepsilon}} for some ineffective {c_\varepsilon > 0}.
  • (iii) (Tatuzawa) For any {\varepsilon>0}, one has {\beta \leq 1 - c_\varepsilon q^{-\varepsilon}} for some effective {c_\varepsilon > 0}, except possibly for a single exceptional character {\chi_\varepsilon}.
  • (iv) (Page) One has {\beta \leq 1-\frac{c}{\log Q}} for some effective {c>0} and all real primitive characters {\chi} of conductor at most {Q}, except possibly for a single exceptional character {\chi'_Q}.

One also obtains analogous lower bounds on {L(1,\chi)} through Lemma 59, and lower bounds on class numbers (at least in the case of negative discriminant) using the class number formula.

All four of the bounds (i), (ii), (iii), (iv) have their advantages and disadvantages, and are all useful in various applications; the choice of which of (i)-(iv) to use depends on whether one has some argument to deal with a potential exceptional character, whether one can tolerate ineffective values of the implied constant, and whether one has a reasonable bound {Q} on the conductor of the characters one wishes to use.

A basic application of Siegel’s theorem is the Siegel-Walfisz theorem.

Exercise 64 (Siegel-Walfisz theorem) For any {A > 0}, show that there exists an (ineffective) constant {c_A>0} such that

\displaystyle  \sum_{n \leq x:n=a\ (q)} \Lambda(n) = \frac{x}{\phi(q)} + O( x \exp( - c_A \sqrt{\log x} ) )

for all primitive residue classes {a\ (q)} and all {x \geq 2} with

\displaystyle  q \leq \log^A x. \ \ \ \ \ (57)

(Hint: use Exercise 55(iii) together with Siegel’s theorem to handle the exceptional zero.) Conclude in particular that

\displaystyle  \sum_{n \leq x:n=a\ (q)} \Lambda(n) = \frac{x}{\phi(q)} + O_A( x \log^{-A} x )

for all primitive residue classes {a\ (q)} and all {x \geq 2} (without assuming the size restriction (57)), and with an ineffective constant in the {O_A()} notation.

Of course, the error term in the Siegel-Walfisz theorem can be substantially improved if one assumes the generalised Riemann hypothesis: see Exercise 48. In later notes we will use the Siegel-Walfisz theorem to prove the Bombieri-Vinogradov theorem, which is a theorem of basic importance in sieve theory.

Exercise 65 (Least prime in an arithmetic progression) If {a\ (q)} is a primitive residue class, show that {a\ (q)} contains a prime {p} with {p \ll \exp( q^{o(1)} )}, with the implied constants ineffective; by using (56), obtain the alternate bound {p \ll \exp(q^{1/2+o(1)})} with effective constants, and obtain the improvement {p \ll \exp( O(\log^2 q) )} with effective constants if there is no exceptional character of modulus {q}. In later notes we will be able to improve all of these bounds to {p \ll q^{O(1)}} with effective constants, a result known as Linnik’s theorem.

Exercise 66 (Siegel-Walfisz for the Möbius function) Show that for any {A>0}, one has the bound

\displaystyle  \sum_{n \leq x:n=a\ (q)} \mu(n) \ll_A x \log^{-A} x

for all residue classes {a\ (q)} (not necessarily primitive) and all {x \geq 2}, with an ineffective implied constant in the {\ll_A} notation. (Hint: reduce to the case in which {q \leq \log^A x} and {a\ (q)} is primitive. One can either use a truncated Perron’s formula argument using some lower bounds on {L(s,\chi)} for {s} slightly to the left of {1}, or else modify the elementary method from Theorem 58 of Notes 1, using an induction on {A}.)

Exercise 67 (Elementary lower bound for {L(1,\chi)}) The purpose of this exercise is to give a somewhat reasonable effective lower bound on {L(1,\chi)} by an elementary (but somewhat ad hoc) device. Let {\chi} be a real non-principal character of modulus {q}.

  • (i) Establish the identity

    \displaystyle  \sum_n (1*\chi)(n) e^{-n/x} = x L(1,\chi) + \sum_n \chi(n) f(\frac{n}{x})

    for any {x>0}, where {f: (0,+\infty) \rightarrow {\bf R}} is the function {f(t) := \frac{1}{e^t-1}-\frac{1}{t}}, and the sum is in the conditionally convergent sense.

  • (ii) Obtain the bounds

    \displaystyle  \sum_n (1*\chi)(n) e^{-n/x} \gg \sqrt{x}

    and

    \displaystyle  \sum_n \chi(n) f(\frac{n}{x}) \ll q \ \ \ \ \ (58)

    and conclude that

    \displaystyle  L(1,\chi) \gg q^{-1}. \ \ \ \ \ (59)

In later notes, we will develop Fourier-analytic tools that, among other things, improve the upper bound on (58) to {O(\sqrt{q} \log q)}, which almost recovers the bound {L(1,\chi) \gg q^{-1/2}} coming from the class number formula.
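To get a feel for Exercise 67, one can test the identity in (i) and tabulate {L(1,\chi)} against {q^{-1}} numerically. The sketch below is illustrative only and makes several assumptions that are not in the notes: {\chi} is taken to be the Legendre symbol modulo an odd prime {q}, the conditionally convergent sum for {L(1,\chi)} is approximated by summing over `periods` full periods of {\chi}, and both sides of (i) are truncated at the same cutoff {n \leq 40 x}, so that the slowly decaying tails of {x \sum_n \frac{\chi(n)}{n}} and {\sum_n \chi(n) f(\frac{n}{x})} cancel against each other. All function and parameter names are hypothetical. For {q=3} the output can be compared with the classical value {L(1,\chi) = \frac{\pi}{3\sqrt{3}}}.

```python
import math


def legendre(n, q):
    """Legendre symbol (n|q) for an odd prime q (assumed test character)."""
    r = pow(n % q, (q - 1) // 2, q)   # Euler's criterion: r is 0, 1 or q-1
    return -1 if r == q - 1 else r


def L1(q, periods=2000):
    """Approximate L(1, chi) by summing chi(n)/n over full periods of chi."""
    return sum(legendre(n, q) / n for n in range(1, periods * q + 1))


def smoothed_divisor_sum(q, x, cutoff=40):
    """Left side of (i): sum of (1*chi)(n) e^{-n/x} over n <= cutoff * x."""
    N = int(cutoff * x)
    conv = [0] * (N + 1)              # conv[n] will hold (1*chi)(n)
    for d in range(1, N + 1):
        c = legendre(d, q)
        if c:
            for n in range(d, N + 1, d):
                conv[n] += c
    return sum(conv[n] * math.exp(-n / x) for n in range(1, N + 1))


def f(t):
    """The function f(t) = 1/(e^t - 1) - 1/t from Exercise 67(i)."""
    return 1.0 / math.expm1(t) - 1.0 / t


def truncated_right_side(q, x, cutoff=40):
    """Right side of (i), with both sums truncated at n <= cutoff * x."""
    N = int(cutoff * x)
    main = x * sum(legendre(n, q) / n for n in range(1, N + 1))
    correction = sum(legendre(n, q) * f(n / x) for n in range(1, N + 1))
    return main + correction


if __name__ == "__main__":
    print(smoothed_divisor_sum(5, 25.0), truncated_right_side(5, 25.0))
    for q in (3, 5, 7, 11, 13, 17, 19, 23):
        value = L1(q)
        print(q, round(value, 4), round(q * value, 4))
```

Such a table cannot of course establish (59), whose implied constant is not specified, but it does show the two sides of (i) agreeing to within the truncation error, and shows {q L(1,\chi)} staying bounded away from zero for the small conductors tested.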

Exercise 68 Use (59) and Exercise 58 for a suitable choice of {x, s} to give an alternate proof of Lemma 57.