You are currently browsing the tag archive for the ‘Riemann zeta function’ tag.

In Notes 2, the Riemann zeta function ${\zeta}$ (and more generally, the Dirichlet ${L}$-functions ${L(\cdot,\chi)}$) were extended meromorphically into the region ${\{ s: \hbox{Re}(s) > 0 \}}$ in and to the right of the critical strip. This is a sufficient amount of meromorphic continuation for many applications in analytic number theory, such as establishing the prime number theorem and its variants. The zeroes of the zeta function in the critical strip ${\{ s: 0 < \hbox{Re}(s) < 1 \}}$ are known as the non-trivial zeroes of ${\zeta}$, and thanks to the truncated explicit formulae developed in Notes 2, they control the asymptotic distribution of the primes (up to small errors).

The ${\zeta}$ function obeys the trivial functional equation

$\displaystyle \zeta(\overline{s}) = \overline{\zeta(s)} \ \ \ \ \ (1)$

for all ${s}$ in its domain of definition. Indeed, as ${\zeta(s)}$ is real-valued when ${s}$ is real, the function ${\zeta(s) - \overline{\zeta(\overline{s})}}$ vanishes on the real line and is also meromorphic, and hence vanishes everywhere. Similarly one has the functional equation

$\displaystyle \overline{L(s, \chi)} = L(\overline{s}, \overline{\chi}). \ \ \ \ \ (2)$

From these equations we see that the zeroes of the zeta function are symmetric across the real axis, and the zeroes of ${L(\cdot,\chi)}$ are the reflection of the zeroes of ${L(\cdot,\overline{\chi})}$ across this axis.

It is a remarkable fact that these functions obey an additional, and more non-trivial, functional equation, this time establishing a symmetry across the critical line ${\{ s: \hbox{Re}(s) = \frac{1}{2} \}}$ rather than the real axis. One consequence of this symmetry is that the zeta function and ${L}$-functions may be extended meromorphically to the entire complex plane. For the zeta function, the functional equation was discovered by Riemann, and reads as follows:

Theorem 1 (Functional equation for the Riemann zeta function) The Riemann zeta function ${\zeta}$ extends meromorphically to the entire complex plane, with a simple pole at ${s=1}$ and no other poles. Furthermore, one has the functional equation

$\displaystyle \zeta(s) = \alpha(s) \zeta(1-s) \ \ \ \ \ (3)$

or equivalently

$\displaystyle \zeta(1-s) = \alpha(1-s) \zeta(s) \ \ \ \ \ (4)$

for all complex ${s}$ other than ${s=0,1}$, where ${\alpha}$ is the function

$\displaystyle \alpha(s) := 2^s \pi^{s-1} \sin( \frac{\pi s}{2}) \Gamma(1-s). \ \ \ \ \ (5)$

Here ${\cos(z) := \frac{e^z + e^{-z}}{2}}$, ${\sin(z) := \frac{e^{-z}-e^{-z}}{2i}}$ are the complex-analytic extensions of the classical trigionometric functions ${\cos(x), \sin(x)}$, and ${\Gamma}$ is the Gamma function, whose definition and properties we review below the fold.

The functional equation can be placed in a more symmetric form as follows:

Corollary 2 (Functional equation for the Riemann xi function) The Riemann xi function

$\displaystyle \xi(s) := \frac{1}{2} s(s-1) \pi^{-s/2} \Gamma(\frac{s}{2}) \zeta(s) \ \ \ \ \ (6)$

is analytic on the entire complex plane ${{\bf C}}$ (after removing all removable singularities), and obeys the functional equations

$\displaystyle \xi(\overline{s}) = \overline{\xi(s)}$

and

$\displaystyle \xi(s) = \xi(1-s). \ \ \ \ \ (7)$

In particular, the zeroes of ${\xi}$ consist precisely of the non-trivial zeroes of ${\zeta}$, and are symmetric about both the real axis and the critical line. Also, ${\xi}$ is real-valued on the critical line and on the real axis.

Corollary 2 is an easy consequence of Theorem 1 together with the duplication theorem for the Gamma function, and the fact that ${\zeta}$ has no zeroes to the right of the critical strip, and is left as an exercise to the reader (Exercise 19). The functional equation in Theorem 1 has many proofs, but most of them are related in on way or another to the Poisson summation formula

$\displaystyle \sum_n f(n) = \sum_m \hat f(2\pi m) \ \ \ \ \ (8)$

(Theorem 34 from Supplement 2, at least in the case when ${f}$ is twice continuously differentiable and compactly supported), which can be viewed as a Fourier-analytic link between the coarse-scale distribution of the integers and the fine-scale distribution of the integers. Indeed, there is a quick heuristic proof of the functional equation that comes from formally applying the Poisson summation formula to the function ${1_{x>0} \frac{1}{x^s}}$, and noting that the functions ${x \mapsto \frac{1}{x^s}}$ and ${\xi \mapsto \frac{1}{\xi^{1-s}}}$ are formally Fourier transforms of each other, up to some Gamma function factors, as well as some trigonometric factors arising from the distinction between the real line and the half-line. Such a heuristic proof can indeed be made rigorous, and we do so below the fold, while also providing Riemann’s two classical proofs of the functional equation.

From the functional equation (and the poles of the Gamma function), one can see that ${\zeta}$ has trivial zeroes at the negative even integers ${-2,-4,-6,\dots}$, in addition to the non-trivial zeroes in the critical strip. More generally, the following table summarises the zeroes and poles of the various special functions appearing in the functional equation, after they have been meromorphically extended to the entire complex plane, and with zeroes classified as “non-trivial” or “trivial” depending on whether they lie in the critical strip or not. (Exponential functions such as ${2^{s-1}}$ or ${\pi^{-s}}$ have no zeroes or poles, and will be ignored in this table; the zeroes and poles of rational functions such as ${s(s-1)}$ are self-evident and will also not be displayed here.)

 Function Non-trivial zeroes Trivial zeroes Poles ${\zeta(s)}$ Yes ${-2,-4,-6,\dots}$ ${1}$ ${\zeta(1-s)}$ Yes ${1,3,5,\dots}$ ${0}$ ${\sin(\pi s/2)}$ No Even integers No ${\cos(\pi s/2)}$ No Odd integers No ${\sin(\pi s)}$ No Integers No ${\Gamma(s)}$ No No ${0,-1,-2,\dots}$ ${\Gamma(s/2)}$ No No ${0,-2,-4,\dots}$ ${\Gamma(1-s)}$ No No ${1,2,3,\dots}$ ${\Gamma((1-s)/2)}$ No No ${2,4,6,\dots}$ ${\xi(s)}$ Yes No No

Among other things, this table indicates that the Gamma and trigonometric factors in the functional equation are tied to the trivial zeroes and poles of zeta, but have no direct bearing on the distribution of the non-trivial zeroes, which is the most important feature of the zeta function for the purposes of analytic number theory, beyond the fact that they are symmetric about the real axis and critical line. In particular, the Riemann hypothesis is not going to be resolved just from further analysis of the Gamma function!

The zeta function computes the “global” sum ${\sum_n \frac{1}{n^s}}$, with ${n}$ ranging all the way from ${1}$ to infinity. However, by some Fourier-analytic (or complex-analytic) manipulation, it is possible to use the zeta function to also control more “localised” sums, such as ${\sum_n \frac{1}{n^s} \psi(\log n - \log N)}$ for some ${N \gg 1}$ and some smooth compactly supported function ${\psi: {\bf R} \rightarrow {\bf C}}$. It turns out that the functional equation (3) for the zeta function localises to this context, giving an approximate functional equation which roughly speaking takes the form

$\displaystyle \sum_n \frac{1}{n^s} \psi( \log n - \log N ) \approx \alpha(s) \sum_m \frac{1}{m^{1-s}} \psi( \log M - \log m )$

whenever ${s=\sigma+it}$ and ${NM = \frac{|t|}{2\pi}}$; see Theorem 38 below for a precise formulation of this equation. Unsurprisingly, this form of the functional equation is also very closely related to the Poisson summation formula (8), indeed it is essentially a special case of that formula (or more precisely, of the van der Corput ${B}$-process). This useful identity relates long smoothed sums of ${\frac{1}{n^s}}$ to short smoothed sums of ${\frac{1}{m^{1-s}}}$ (or vice versa), and can thus be used to shorten exponential sums involving terms such as ${\frac{1}{n^s}}$, which is useful when obtaining some of the more advanced estimates on the Riemann zeta function.

We will give two other basic uses of the functional equation. The first is to get a good count (as opposed to merely an upper bound) on the density of zeroes in the critical strip, establishing the Riemann-von Mangoldt formula that the number ${N(T)}$ of zeroes of imaginary part between ${0}$ and ${T}$ is ${\frac{T}{2\pi} \log \frac{T}{2\pi} - \frac{T}{2\pi} + O(\log T)}$ for large ${T}$. The other is to obtain untruncated versions of the explicit formula from Notes 2, giving a remarkable exact formula for sums involving the von Mangoldt function in terms of zeroes of the Riemann zeta function. These results are not strictly necessary for most of the material in the rest of the course, but certainly help to clarify the nature of the Riemann zeta function and its relation to the primes.

In view of the material in previous notes, it should not be surprising that there are analogues of all of the above theory for Dirichlet ${L}$-functions ${L(\cdot,\chi)}$. We will restrict attention to primitive characters ${\chi}$, since the ${L}$-function for imprimitive characters merely differs from the ${L}$-function of the associated primitive factor by a finite Euler product; indeed, if ${\chi = \chi' \chi_0}$ for some principal ${\chi_0}$ whose modulus ${q_0}$ is coprime to that of ${\chi'}$, then

$\displaystyle L(s,\chi) = L(s,\chi') \prod_{p|q_0} (1 - \frac{1}{p^s}) \ \ \ \ \ (9)$

(cf. equation (45) of Notes 2).

The main new feature is that the Poisson summation formula needs to be “twisted” by a Dirichlet character ${\chi}$, and this boils down to the problem of understanding the finite (additive) Fourier transform of a Dirichlet character. This is achieved by the classical theory of Gauss sums, which we review below the fold. There is one new wrinkle; the value of ${\chi(-1) \in \{-1,+1\}}$ plays a role in the functional equation. More precisely, we have

Theorem 3 (Functional equation for ${L}$-functions) Let ${\chi}$ be a primitive character of modulus ${q}$ with ${q>1}$. Then ${L(s,\chi)}$ extends to an entire function on the complex plane, with

$\displaystyle L(s,\chi) = \varepsilon(\chi) 2^s \pi^{s-1} q^{1/2-s} \sin(\frac{\pi}{2}(s+\kappa)) \Gamma(1-s) L(1-s,\overline{\chi})$

or equivalently

$\displaystyle L(1-s,\overline{\chi}) = \varepsilon(\overline{\chi}) 2^{1-s} \pi^{-s} q^{s-1/2} \sin(\frac{\pi}{2}(1-s+\kappa)) \Gamma(s) L(s,\chi)$

for all ${s}$, where ${\kappa}$ is equal to ${0}$ in the even case ${\chi(-1)=+1}$ and ${1}$ in the odd case ${\chi(-1)=-1}$, and

$\displaystyle \varepsilon(\chi) := \frac{\tau(\chi)}{i^\kappa \sqrt{q}} \ \ \ \ \ (10)$

where ${\tau(\chi)}$ is the Gauss sum

$\displaystyle \tau(\chi) := \sum_{n \in {\bf Z}/q{\bf Z}} \chi(n) e(n/q). \ \ \ \ \ (11)$

and ${e(x) := e^{2\pi ix}}$, with the convention that the ${q}$-periodic function ${n \mapsto e(n/q)}$ is also (by abuse of notation) applied to ${n}$ in the cyclic group ${{\bf Z}/q{\bf Z}}$.

From this functional equation and (2) we see that, as with the Riemann zeta function, the non-trivial zeroes of ${L(s,\chi)}$ (defined as the zeroes within the critical strip ${\{ s: 0 < \hbox{Re}(s) < 1 \}}$ are symmetric around the critical line (and, if ${\chi}$ is real, are also symmetric around the real axis). In addition, ${L(s,\chi)}$ acquires trivial zeroes at the negative even integers and at zero if ${\chi(-1)=1}$, and at the negative odd integers if ${\chi(-1)=-1}$. For imprimitive ${\chi}$, we see from (9) that ${L(s,\chi)}$ also acquires some additional trivial zeroes on the left edge of the critical strip.

There is also a symmetric version of this equation, analogous to Corollary 2:

Corollary 4 Let ${\chi,q,\varepsilon(\chi)}$ be as above, and set

$\displaystyle \xi(s,\chi) := (q/\pi)^{(s+\kappa)/2} \Gamma((s+\kappa)/2) L(s,\chi),$

then ${\xi(\cdot,\chi)}$ is entire with ${\xi(1-s,\chi) = \varepsilon(\chi) \xi(s,\chi)}$.

For further detail on the functional equation and its implications, I recommend the classic text of Titchmarsh or the text of Davenport.

In Notes 1, we approached multiplicative number theory (the study of multiplicative functions ${f: {\bf N} \rightarrow {\bf C}}$ and their relatives) via elementary methods, in which attention was primarily focused on obtaining asymptotic control on summatory functions ${\sum_{n \leq x} f(n)}$ and logarithmic sums ${\sum_{n \leq x} \frac{f(n)}{n}}$. Now we turn to the complex approach to multiplicative number theory, in which the focus is instead on obtaining various types of control on the Dirichlet series ${{\mathcal D} f}$, defined (at least for ${s}$ of sufficiently large real part) by the formula

$\displaystyle {\mathcal D} f(s) := \sum_n \frac{f(n)}{n^s}.$

These series also made an appearance in the elementary approach to the subject, but only for real ${s}$ that were larger than ${1}$. But now we will exploit the freedom to extend the variable ${s}$ to the complex domain; this gives enough freedom (in principle, at least) to recover control of elementary sums such as ${\sum_{n\leq x} f(n)}$ or ${\sum_{n\leq x} \frac{f(n)}{n}}$ from control on the Dirichlet series. Crucially, for many key functions ${f}$ of number-theoretic interest, the Dirichlet series ${{\mathcal D} f}$ can be analytically (or at least meromorphically) continued to the left of the line ${\{ s: \hbox{Re}(s) = 1 \}}$. The zeroes and poles of the resulting meromorphic continuations of ${{\mathcal D} f}$ (and of related functions) then turn out to control the asymptotic behaviour of the elementary sums of ${f}$; the more one knows about the former, the more one knows about the latter. In particular, knowledge of where the zeroes of the Riemann zeta function ${\zeta}$ are located can give very precise information about the distribution of the primes, by means of a fundamental relationship known as the explicit formula. There are many ways of phrasing this explicit formula (both in exact and in approximate forms), but they are all trying to formalise an approximation to the von Mangoldt function ${\Lambda}$ (and hence to the primes) of the form

$\displaystyle \Lambda(n) \approx 1 - \sum_\rho n^{\rho-1} \ \ \ \ \ (1)$

where the sum is over zeroes ${\rho}$ (counting multiplicity) of the Riemann zeta function ${\zeta = {\mathcal D} 1}$ (with the sum often restricted so that ${\rho}$ has large real part and bounded imaginary part), and the approximation is in a suitable weak sense, so that

$\displaystyle \sum_n \Lambda(n) g(n) \approx \int_0^\infty g(y)\ dy - \sum_\rho \int_0^\infty g(y) y^{\rho-1}\ dy \ \ \ \ \ (2)$

for suitable “test functions” ${g}$ (which in practice are restricted to be fairly smooth and slowly varying, with the precise amount of restriction dependent on the amount of truncation in the sum over zeroes one wishes to take). Among other things, such approximations can be used to rigorously establish the prime number theorem

$\displaystyle \sum_{n \leq x} \Lambda(n) = x + o(x) \ \ \ \ \ (3)$

as ${x \rightarrow \infty}$, with the size of the error term ${o(x)}$ closely tied to the location of the zeroes ${\rho}$ of the Riemann zeta function.

The explicit formula (1) (or any of its more rigorous forms) is closely tied to the counterpart approximation

$\displaystyle -\frac{\zeta'}{\zeta}(s) \approx \frac{1}{s-1} - \sum_\rho \frac{1}{s-\rho} \ \ \ \ \ (4)$

for the Dirichlet series ${{\mathcal D} \Lambda = -\frac{\zeta'}{\zeta}}$ of the von Mangoldt function; note that (4) is formally the special case of (2) when ${g(n) = n^{-s}}$. Such approximations come from the general theory of local factorisations of meromorphic functions, as discussed in Supplement 2; the passage from (4) to (2) is accomplished by such tools as the residue theorem and the Fourier inversion formula, which were also covered in Supplement 2. The relative ease of uncovering the Fourier-like duality between primes and zeroes (sometimes referred to poetically as the “music of the primes”) is one of the major advantages of the complex-analytic approach to multiplicative number theory; this important duality tends to be rather obscured in the other approaches to the subject, although it can still in principle be discernible with sufficient effort.

More generally, one has an explicit formula

$\displaystyle \Lambda(n) \chi(n) \approx - \sum_\rho n^{\rho-1} \ \ \ \ \ (5)$

for any Dirichlet character ${\chi}$, where ${\rho}$ now ranges over the zeroes of the associated Dirichlet ${L}$-function ${L(s,\chi) := {\mathcal D} \chi(s)}$; we view this formula as a “twist” of (1) by the Dirichlet character ${\chi}$. The explicit formula (5), proven similarly (in any of its rigorous forms) to (1), is important in establishing the prime number theorem in arithmetic progressions, which asserts that

$\displaystyle \sum_{n \leq x: n = a\ (q)} \Lambda(n) = \frac{x}{\phi(q)} + o(x) \ \ \ \ \ (6)$

as ${x \rightarrow \infty}$, whenever ${a\ (q)}$ is a fixed primitive residue class. Again, the size of the error term ${o(x)}$ here is closely tied to the location of the zeroes of the Dirichlet ${L}$-function, with particular importance given to whether there is a zero very close to ${s=1}$ (such a zero is known as an exceptional zero or Siegel zero).

While any information on the behaviour of zeta functions or ${L}$-functions is in principle welcome for the purposes of analytic number theory, some regions of the complex plane are more important than others in this regard, due to the differing weights assigned to each zero in the explicit formula. Roughly speaking, in descending order of importance, the most crucial regions on which knowledge of these functions is useful are

1. The region on or near the point ${s=1}$.
2. The region on or near the right edge ${\{ 1+it: t \in {\bf R} \}}$ of the critical strip ${\{ s: 0 \leq \hbox{Re}(s) \leq 1 \}}$.
3. The right half ${\{ s: \frac{1}{2} < \hbox{Re}(s) < 1 \}}$ of the critical strip.
4. The region on or near the critical line ${\{ \frac{1}{2} + it: t \in {\bf R} \}}$ that bisects the critical strip.
5. Everywhere else.

For instance:

1. We will shortly show that the Riemann zeta function ${\zeta}$ has a simple pole at ${s=1}$ with residue ${1}$, which is already sufficient to recover much of the classical theorems of Mertens discussed in the previous set of notes, as well as results on mean values of multiplicative functions such as the divisor function ${\tau}$. For Dirichlet ${L}$-functions, the behaviour is instead controlled by the quantity ${L(1,\chi)}$ discussed in Notes 1, which is in turn closely tied to the existence and location of a Siegel zero.
2. The zeta function is also known to have no zeroes on the right edge ${\{1+it: t \in {\bf R}\}}$ of the critical strip, which is sufficient to prove (and is in fact equivalent to) the prime number theorem. Any enlargement of the zero-free region for ${\zeta}$ into the critical strip leads to improved error terms in that theorem, with larger zero-free regions leading to stronger error estimates. Similarly for ${L}$-functions and the prime number theorem in arithmetic progressions.
3. The (as yet unproven) Riemann hypothesis prohibits ${\zeta}$ from having any zeroes within the right half ${\{ s: \frac{1}{2} < \hbox{Re}(s) < 1 \}}$ of the critical strip, and gives very good control on the number of primes in intervals, even when the intervals are relatively short compared to the size of the entries. Even without assuming the Riemann hypothesis, zero density estimates in this region are available that give some partial control of this form. Similarly for ${L}$-functions, primes in short arithmetic progressions, and the generalised Riemann hypothesis.
4. Assuming the Riemann hypothesis, further distributional information about the zeroes on the critical line (such as Montgomery’s pair correlation conjecture, or the more general GUE hypothesis) can give finer information about the error terms in the prime number theorem in short intervals, as well as other arithmetic information. Again, one has analogues for ${L}$-functions and primes in short arithmetic progressions.
5. The functional equation of the zeta function describes the behaviour of ${\zeta}$ to the left of the critical line, in terms of the behaviour to the right of the critical line. This is useful for building a “global” picture of the structure of the zeta function, and for improving a number of estimates about that function, but (in the absence of unproven conjectures such as the Riemann hypothesis or the pair correlation conjecture) it turns out that many of the basic analytic number theory results using the zeta function can be established without relying on this equation. Similarly for ${L}$-functions.

Remark 1 If one takes an “adelic” viewpoint, one can unite the Riemann zeta function ${\zeta(\sigma+it) = \sum_n n^{-\sigma-it}}$ and all of the ${L}$-functions ${L(\sigma+it,\chi) = \sum_n \chi(n) n^{-\sigma-it}}$ for various Dirichlet characters ${\chi}$ into a single object, viewing ${n \mapsto \chi(n) n^{-it}}$ as a general multiplicative character on the adeles; thus the imaginary coordinate ${t}$ and the Dirichlet character ${\chi}$ are really the Archimedean and non-Archimedean components respectively of a single adelic frequency parameter. This viewpoint was famously developed in Tate’s thesis, which among other things helps to clarify the nature of the functional equation, as discussed in this previous post. We will not pursue the adelic viewpoint further in these notes, but it does supply a “high-level” explanation for why so much of the theory of the Riemann zeta function extends to the Dirichlet ${L}$-functions. (The non-Archimedean character ${\chi(n)}$ and the Archimedean character ${n^{it}}$ behave similarly from an algebraic point of view, but not so much from an analytic point of view; as such, the adelic viewpoint is well suited for algebraic tasks (such as establishing the functional equation), but not for analytic tasks (such as establishing a zero-free region).)

Roughly speaking, the elementary multiplicative number theory from Notes 1 corresponds to the information one can extract from the complex-analytic method in region 1 of the above hierarchy, while the more advanced elementary number theory used to prove the prime number theorem (and which we will not cover in full detail in these notes) corresponds to what one can extract from regions 1 and 2.

As a consequence of this hierarchy of importance, information about the ${\zeta}$ function away from the critical strip, such as Euler’s identity

$\displaystyle \zeta(2) = \frac{\pi^2}{6}$

or equivalently

$\displaystyle 1 + \frac{1}{2^2} + \frac{1}{3^2} + \dots = \frac{\pi^2}{6}$

or the infamous identity

$\displaystyle \zeta(-1) = -\frac{1}{12},$

which is often presented (slightly misleadingly, if one’s conventions for divergent summation are not made explicit) as

$\displaystyle 1 + 2 + 3 + \dots = -\frac{1}{12},$

are of relatively little direct importance in analytic prime number theory, although they are still of interest for some other, non-number-theoretic, applications. (The quantity ${\zeta(2)}$ does play a minor role as a normalising factor in some asymptotics, see e.g. Exercise 28 from Notes 1, but its precise value is usually not of major importance.) In contrast, the value ${L(1,\chi)}$ of an ${L}$-function at ${s=1}$ turns out to be extremely important in analytic number theory, with many results in this subject relying ultimately on a non-trivial lower-bound on this quantity coming from Siegel’s theorem, discussed below the fold.

For a more in-depth treatment of the topics in this set of notes, see Davenport’s “Multiplicative number theory“.

Mertens’ theorems are a set of classical estimates concerning the asymptotic distribution of the prime numbers:

Theorem 1 (Mertens’ theorems) In the asymptotic limit ${x \rightarrow \infty}$, we have

$\displaystyle \sum_{p\leq x} \frac{\log p}{p} = \log x + O(1), \ \ \ \ \ (1)$

$\displaystyle \sum_{p\leq x} \frac{1}{p} = \log \log x + O(1), \ \ \ \ \ (2)$

and

$\displaystyle \sum_{p\leq x} \log(1-\frac{1}{p}) = -\log \log x - \gamma + o(1) \ \ \ \ \ (3)$

where ${\gamma}$ is the Euler-Mascheroni constant, defined by requiring that

$\displaystyle 1 + \frac{1}{2} + \ldots + \frac{1}{n} = \log n + \gamma + o(1) \ \ \ \ \ (4)$

in the limit ${n \rightarrow \infty}$.

The third theorem (3) is usually stated in exponentiated form

$\displaystyle \prod_{p \leq x} (1-\frac{1}{p}) = \frac{e^{-\gamma}+o(1)}{\log x},$

but in the logarithmic form (3) we see that it is strictly stronger than (2), in view of the asymptotic ${\log(1-\frac{1}{p}) = -\frac{1}{p} + O(\frac{1}{p^2})}$.

Remarkably, these theorems can be proven without the assistance of the prime number theorem

$\displaystyle \sum_{p \leq x} 1 = \frac{x}{\log x} + o( \frac{x}{\log x} ),$

which was proven about two decades after Mertens’ work. (But one can certainly use versions of the prime number theorem with good error term, together with summation by parts, to obtain good estimates on the various errors in Mertens’ theorems.) Roughly speaking, the reason for this is that Mertens’ theorems only require control on the Riemann zeta function ${\zeta(s) = \sum_{n=1}^\infty \frac{1}{n^s}}$ in the neighbourhood of the pole at ${s=1}$, whereas (as discussed in this previous post) the prime number theorem requires control on the zeta function on (a neighbourhood of) the line ${\{ 1+it: t \in {\bf R} \}}$. Specifically, Mertens’ theorem is ultimately deduced from the Euler product formula

$\displaystyle \zeta(s) = \prod_p (1-\frac{1}{p^s})^{-1}, \ \ \ \ \ (5)$

valid in the region ${\hbox{Re}(s) > 1}$ (which is ultimately a Fourier-Dirichlet transform of the fundamental theorem of arithmetic), and following crude asymptotics:

Proposition 2 (Simple pole) For ${s}$ sufficiently close to ${1}$ with ${\hbox{Re}(s) > 1}$, we have

$\displaystyle \zeta(s) = \frac{1}{s-1} + O(1) \ \ \ \ \ (6)$

and

$\displaystyle \zeta'(s) = \frac{-1}{(s-1)^2} + O(1).$

Proof: For ${s}$ as in the proposition, we have ${\frac{1}{n^s} = \frac{1}{t^s} + O(\frac{1}{n^2})}$ for any natural number ${n}$ and ${n \leq t \leq n+1}$, and hence

$\displaystyle \frac{1}{n^s} = \int_n^{n+1} \frac{1}{t^s}\ dt + O( \frac{1}{n^2} ).$

Summing in ${n}$ and using the identity ${\int_1^\infty \frac{1}{t^s}\ dt = \frac{1}{s-1}}$, we obtain the first claim. Similarly, we have

$\displaystyle \frac{-\log n}{n^s} = \int_n^{n+1} \frac{-\log t}{t^s}\ dt + O( \frac{\log n}{n^2} ),$

and by summing in ${n}$ and using the identity ${\int_1^\infty \frac{-\log t}{t^s}\ dt = \frac{-1}{(s-1)^2}}$ (the derivative of the previous identity) we obtain the claim. $\Box$

The first two of Mertens’ theorems (1), (2) are relatively easy to prove, and imply the third theorem (3) except with ${\gamma}$ replaced by an unspecified absolute constant. To get the specific constant ${\gamma}$ requires a little bit of additional effort. From (4), one might expect that the appearance of ${\gamma}$ arises from the refinement

$\displaystyle \zeta(s) = \frac{1}{s-1} + \gamma + O(|s-1|) \ \ \ \ \ (7)$

that one can obtain to (6). However, it turns out that the connection is not so much with the zeta function, but with the Gamma function, and specifically with the identity ${\Gamma'(1) = - \gamma}$ (which is of course related to (7) through the functional equation for zeta, but can be proven without any reference to zeta functions). More specifically, we have the following asymptotic for the exponential integral:

Proposition 3 (Exponential integral asymptotics) For sufficiently small ${\epsilon}$, one has

$\displaystyle \int_\epsilon^\infty \frac{e^{-t}}{t}\ dt = \log \frac{1}{\epsilon} - \gamma + O(\epsilon).$

A routine integration by parts shows that this asymptotic is equivalent to the identity

$\displaystyle \int_0^\infty e^{-t} \log t\ dt = -\gamma$

which is the identity ${\Gamma'(1)=-\gamma}$ mentioned previously.

Proof: We start by using the identity ${\frac{1}{i} = \int_0^1 x^{i-1}\ dx}$ to express the harmonic series ${H_n := 1+\frac{1}{2}+\ldots+\frac{1}{n}}$ as

$\displaystyle H_n = \int_0^1 1 + x + \ldots + x^{n-1}\ dx$

or on summing the geometric series

$\displaystyle H_n = \int_0^1 \frac{1-x^n}{1-x}\ dx.$

Since ${\int_0^{1-1/n} \frac{1}{1-x} = \log n}$, we thus have

$\displaystyle H_n - \log n = \int_0^1 \frac{1_{[1-1/n,1]}(x) - x^n}{1-x}\ dx;$

making the change of variables ${x = 1-\frac{t}{n}}$, this becomes

$\displaystyle H_n - \log n = \int_0^n \frac{1_{[0,1]}(t) - (1-\frac{t}{n})^n}{t}\ dt.$

As ${n \rightarrow \infty}$, ${\frac{1_{[0,1]}(t) - (1-\frac{t}{n})^n}{t}}$ converges pointwise to ${\frac{1_{[0,1]}(t) - e^{-t}}{t}}$ and is pointwise dominated by ${O( e^{-t} )}$. Taking limits as ${n \rightarrow \infty}$ using dominated convergence, we conclude that

$\displaystyle \gamma = \int_0^\infty \frac{1_{[0,1]}(t) - e^{-t}}{t}\ dt.$

or equivalently

$\displaystyle \int_0^\infty \frac{e^{-t} - 1_{[0,\epsilon]}(t)}{t}\ dt = \log \frac{1}{\epsilon} - \gamma.$

The claim then follows by bounding the ${\int_0^\epsilon}$ portion of the integral on the left-hand side. $\Box$

Below the fold I would like to record how Proposition 2 and Proposition 3 imply Theorem 1; the computations are utterly standard, and can be found in most analytic number theory texts, but I wanted to write them down for my own benefit (I always keep forgetting, in particular, how the third of Mertens’ theorems is proven).

The Riemann zeta function ${\zeta(s)}$ is defined in the region ${\hbox{Re}(s)>1}$ by the absolutely convergent series

$\displaystyle \zeta(s) = \sum_{n=1}^\infty \frac{1}{n^s} = 1 + \frac{1}{2^s} + \frac{1}{3^s} + \ldots. \ \ \ \ \ (1)$

Thus, for instance, it is known that ${\zeta(2)=\pi^2/6}$, and thus

$\displaystyle \sum_{n=1}^\infty \frac{1}{n^2} = 1 + \frac{1}{4} + \frac{1}{9} + \ldots = \frac{\pi^2}{6}. \ \ \ \ \ (2)$

For ${\hbox{Re}(s) \leq 1}$, the series on the right-hand side of (1) is no longer absolutely convergent, or even conditionally convergent. Nevertheless, the ${\zeta}$ function can be extended to this region (with a pole at ${s=1}$) by analytic continuation. For instance, it can be shown that after analytic continuation, one has ${\zeta(0) = -1/2}$, ${\zeta(-1) = -1/12}$, and ${\zeta(-2)=0}$, and more generally

$\displaystyle \zeta(-s) = - \frac{B_{s+1}}{s+1} \ \ \ \ \ (3)$

for ${s=1,2,\ldots}$, where ${B_n}$ are the Bernoulli numbers. If one formally applies (1) at these values of ${s}$, one obtains the somewhat bizarre formulae

$\displaystyle \sum_{n=1}^\infty 1 = 1 + 1 + 1 + \ldots = -1/2 \ \ \ \ \ (4)$

$\displaystyle \sum_{n=1}^\infty n = 1 + 2 + 3 + \ldots = -1/12 \ \ \ \ \ (5)$

$\displaystyle \sum_{n=1}^\infty n^2 = 1 + 4 + 9 + \ldots = 0 \ \ \ \ \ (6)$

and

$\displaystyle \sum_{n=1}^\infty n^s = 1 + 2^s + 3^s + \ldots = -\frac{B_{s+1}}{s+1}. \ \ \ \ \ (7)$

Clearly, these formulae do not make sense if one stays within the traditional way to evaluate infinite series, and so it seems that one is forced to use the somewhat unintuitive analytic continuation interpretation of such sums to make these formulae rigorous. But as it stands, the formulae look “wrong” for several reasons. Most obviously, the summands on the left are all positive, but the right-hand sides can be zero or negative. A little more subtly, the identities do not appear to be consistent with each other. For instance, if one adds (4) to (5), one obtains

$\displaystyle \sum_{n=1}^\infty (n+1) = 2 + 3 + 4 + \ldots = -7/12 \ \ \ \ \ (8)$

whereas if one subtracts ${1}$ from (5) one obtains instead

$\displaystyle \sum_{n=2}^\infty n = 0 + 2 + 3 + 4 + \ldots = -13/12 \ \ \ \ \ (9)$

and the two equations seem inconsistent with each other.

However, it is possible to interpret (4), (5), (6) by purely real-variable methods, without recourse to complex analysis methods such as analytic continuation, thus giving an “elementary” interpretation of these sums that only requires undergraduate calculus; we will later also explain how this interpretation deals with the apparent inconsistencies pointed out above.

To see this, let us first consider a convergent sum such as (2). The classical interpretation of this formula is the assertion that the partial sums

$\displaystyle \sum_{n=1}^N \frac{1}{n^2} = 1 + \frac{1}{4} + \frac{1}{9} + \ldots + \frac{1}{N^2}$

converge to ${\pi^2/6}$ as ${N \rightarrow \infty}$, or in other words that

$\displaystyle \sum_{n=1}^N \frac{1}{n^2} = \frac{\pi^2}{6} + o(1)$

where ${o(1)}$ denotes a quantity that goes to zero as ${N \rightarrow \infty}$. Actually, by using the integral test estimate

$\displaystyle \sum_{n=N+1}^\infty \frac{1}{n^2} \leq \int_N^\infty \frac{dx}{x^2} = \frac{1}{N}$

we have the sharper result

$\displaystyle \sum_{n=1}^N \frac{1}{n^2} = \frac{\pi^2}{6} + O(\frac{1}{N}).$

Thus we can view ${\frac{\pi^2}{6}}$ as the leading coefficient of the asymptotic expansion of the partial sums of ${\sum_{n=1}^\infty 1/n^2}$.

One can then try to inspect the partial sums of the expressions in (4), (5), (6), but the coefficients bear no obvious relationship to the right-hand sides:

$\displaystyle \sum_{n=1}^N 1 = N$

$\displaystyle \sum_{n=1}^N n = \frac{1}{2} N^2 + \frac{1}{2} N$

$\displaystyle \sum_{n=1}^N n^2 = \frac{1}{3} N^3 + \frac{1}{2} N^2 + \frac{1}{6} N.$

For (7), the classical Faulhaber formula (or Bernoulli formula) gives

$\displaystyle \sum_{n=1}^N n^s = \frac{1}{s+1} \sum_{j=0}^s \binom{s+1}{j} B_j N^{s+1-j} \ \ \ \ \ (10)$

$\displaystyle = \frac{1}{s+1} N^{s+1} + \frac{1}{2} N^s + \frac{s}{12} N^{s-1} + \ldots + B_s N$

for ${s \geq 2}$, which has a vague resemblance to (7), but again the connection is not particularly clear.

The problem here is the discrete nature of the partial sum

$\displaystyle \sum_{n=1}^N n^s = \sum_{n \leq N} n^s,$

which (if ${N}$ is viewed as a real number) has jump discontinuities at each positive integer value of ${N}$. These discontinuities yield various artefacts when trying to approximate this sum by a polynomial in ${N}$. (These artefacts also occur in (2), but happen in that case to be obscured in the error term ${O(1/N)}$; but for the divergent sums (4), (5), (6), (7), they are large enough to cause real trouble.)

However, these issues can be resolved by replacing the abruptly truncated partial sums ${\sum_{n=1}^N n^s}$ with smoothed sums ${\sum_{n=1}^\infty \eta(n/N) n^s}$, where ${\eta: {\bf R}^+ \rightarrow {\bf R}}$ is a cutoff function, or more precisely a compactly supported bounded function that equals ${1}$ at ${0}$. The case when ${\eta}$ is the indicator function ${1_{[0,1]}}$ then corresponds to the traditional partial sums, with all the attendant discretisation artefacts; but if one chooses a smoother cutoff, then these artefacts begin to disappear (or at least become lower order), and the true asymptotic expansion becomes more manifest.

Note that smoothing does not affect the asymptotic value of sums that were already absolutely convergent, thanks to the dominated convergence theorem. For instance, we have

$\displaystyle \sum_{n=1}^\infty \eta(n/N) \frac{1}{n^2} = \frac{\pi^2}{6} + o(1)$

whenever ${\eta}$ is a cutoff function (since ${\eta(n/N) \rightarrow 1}$ pointwise as ${N \rightarrow \infty}$ and is uniformly bounded). If ${\eta}$ is equal to ${1}$ on a neighbourhood of the origin, then the integral test argument then recovers the ${O(1/N)}$ decay rate:

$\displaystyle \sum_{n=1}^\infty \eta(n/N) \frac{1}{n^2} = \frac{\pi^2}{6} + O(\frac{1}{N}).$

However, smoothing can greatly improve the convergence properties of a divergent sum. The simplest example is Grandi’s series

$\displaystyle \sum_{n=1}^\infty (-1)^{n-1} = 1 - 1 + 1 - \ldots.$

The partial sums

$\displaystyle \sum_{n=1}^N (-1)^{n-1} = \frac{1}{2} + \frac{1}{2} (-1)^{N-1}$

oscillate between ${1}$ and ${0}$, and so this series is not conditionally convergent (and certainly not absolutely convergent). However, if one performs analytic continuation on the series

$\displaystyle \sum_{n=1}^\infty \frac{(-1)^{n-1}}{n^s} = 1 - \frac{1}{2^s} + \frac{1}{3^s} - \ldots$

and sets ${s = 0}$, one obtains a formal value of ${1/2}$ for this series. This value can also be obtained by smooth summation. Indeed, for any cutoff function ${\eta}$, we can regroup

$\displaystyle \sum_{n=1}^\infty (-1)^{n-1} \eta(n/N) =$

$\displaystyle \frac{\eta(1/N)}{2} + \sum_{m=1}^\infty \frac{\eta((2m-1)/N) - 2\eta(2m/N) + \eta((2m+1)/N)}{2}.$

If ${\eta}$ is twice continuously differentiable (i.e. ${\eta \in C^2}$), then from Taylor expansion we see that the summand has size ${O(1/N^2)}$, and also (from the compact support of ${\eta}$) is only non-zero when ${m=O(N)}$. This leads to the asymptotic

$\displaystyle \sum_{n=1}^\infty (-1)^{n-1} \eta(n/N) = \frac{1}{2} + O( \frac{1}{N} )$

and so we recover the value of ${1/2}$ as the leading term of the asymptotic expansion.

Exercise 1 Show that if ${\eta}$ is merely once continuously differentiable (i.e. ${\eta \in C^1}$), then we have a similar asymptotic, but with an error term of ${o(1)}$ instead of ${O(1/N)}$. This is an instance of a more general principle that smoother cutoffs lead to better error terms, though the improvement sometimes stops after some degree of regularity.

Remark 1 The most famous instance of smoothed summation is Cesáro summation, which corresponds to the cutoff function ${\eta(x) := (1-x)_+}$. Unsurprisingly, when Cesáro summation is applied to Grandi’s series, one again recovers the value of ${1/2}$.

If we now revisit the divergent series (4), (5), (6), (7) with smooth summation in mind, we finally begin to see the origin of the right-hand sides. Indeed, for any fixed smooth cutoff function ${\eta}$, we will shortly show that

$\displaystyle \sum_{n=1}^\infty \eta(n/N) = -\frac{1}{2} + C_{\eta,0} N + O(\frac{1}{N}) \ \ \ \ \ (11)$

$\displaystyle \sum_{n=1}^\infty n \eta(n/N) = -\frac{1}{12} + C_{\eta,1} N^2 + O(\frac{1}{N}) \ \ \ \ \ (12)$

$\displaystyle \sum_{n=1}^\infty n^2 \eta(n/N) = C_{\eta,2} N^3 + O(\frac{1}{N}) \ \ \ \ \ (13)$

and more generally

$\displaystyle \sum_{n=1}^\infty n^s \eta(n/N) = -\frac{B_{s+1}}{s+1} + C_{\eta,s} N^{s+1} + O(\frac{1}{N}) \ \ \ \ \ (14)$

for any fixed ${s=1,2,3,\ldots}$ where ${C_{\eta,s}}$ is the Archimedean factor

$\displaystyle C_{\eta,s} := \int_0^\infty x^s \eta(x)\ dx \ \ \ \ \ (15)$

(which is also essentially the Mellin transform of ${\eta}$). Thus we see that the values (4), (5), (6), (7) obtained by analytic continuation are nothing more than the constant terms of the asymptotic expansion of the smoothed partial sums. This is not a coincidence; we will explain the equivalence of these two interpretations of such sums (in the model case when the analytic continuation has only finitely many poles and does not grow too fast at infinity) below the fold.

This interpretation clears up the apparent inconsistencies alluded to earlier. For instance, the sum ${\sum_{n=1}^\infty n = 1 + 2 + 3 + \ldots}$ consists only of non-negative terms, as does its smoothed partial sums ${\sum_{n=1}^\infty n \eta(n/N)}$ (if ${\eta}$ is non-negative). Comparing this with (13), we see that this forces the highest-order term ${C_{\eta,1} N^2}$ to be non-negative (as indeed it is), but does not prohibit the lower-order constant term ${-\frac{1}{12}}$ from being negative (which of course it is).

Similarly, if we add together (12) and (11) we obtain

$\displaystyle \sum_{n=1}^\infty (n+1) \eta(n/N) = -\frac{7}{12} + C_{\eta,1} N^2 + C_{\eta,0} N + O(\frac{1}{N}) \ \ \ \ \ (16)$

while if we subtract ${1}$ from (12) we obtain

$\displaystyle \sum_{n=2}^\infty n \eta(n/N) = -\frac{13}{12} + C_{\eta,1} N^2 + O(\frac{1}{N}). \ \ \ \ \ (17)$

These two asymptotics are not inconsistent with each other; indeed, if we shift the index of summation in (17), we can write

$\displaystyle \sum_{n=2}^\infty n \eta(n/N) = \sum_{n=1}^\infty (n+1) \eta((n+1)/N) \ \ \ \ \ (18)$

and so we now see that the discrepancy between the two sums in (8), (9) come from the shifting of the cutoff ${\eta(n/N)}$, which is invisible in the formal expressions in (8), (9) but become manifestly present in the smoothed sum formulation.

Exercise 2 By Taylor expanding ${\eta(n+1/N)}$ and using (11), (18) show that (16) and (17) are indeed consistent with each other, and in particular one can deduce the latter from the former.

The Riemann zeta function ${\zeta(s)}$, defined for ${\hbox{Re}(s)>1}$ by

$\displaystyle \zeta(s) := \sum_{n=1}^\infty \frac{1}{n^s} \ \ \ \ \ (1)$

and then continued meromorphically to other values of ${s}$ by analytic continuation, is a fundamentally important function in analytic number theory, as it is connected to the primes ${p=2,3,5,\ldots}$ via the Euler product formula

$\displaystyle \zeta(s) = \prod_p (1 - \frac{1}{p^s})^{-1} \ \ \ \ \ (2)$

(for ${\hbox{Re}(s) > 1}$, at least), where ${p}$ ranges over primes. (The equivalence between (1) and (2) is essentially the generating function version of the fundamental theorem of arithmetic.) The function ${\zeta}$ has a pole at ${1}$ and a number of zeroes ${\rho}$. A formal application of the factor theorem gives

$\displaystyle \zeta(s) = \frac{1}{s-1} \prod_\rho (s-\rho) \times \ldots \ \ \ \ \ (3)$

where ${\rho}$ ranges over zeroes of ${\zeta}$, and we will be vague about what the ${\ldots}$ factor is, how to make sense of the infinite product, and exactly which zeroes of ${\zeta}$ are involved in the product. Equating (2) and (3) and taking logarithms gives the formal identity

$\displaystyle - \log \zeta(s) = \sum_p \log(1 - \frac{1}{p^s}) = \log(s-1) - \sum_\rho \log(s-\rho) + \ldots; \ \ \ \ \ (4)$

using the Taylor expansion

$\displaystyle \log(1 - \frac{1}{p^s}) = - \frac{1}{p^s} - \frac{1}{2 p^{2s}} - \frac{1}{3p^{3s}} - \ldots \ \ \ \ \ (5)$

and differentiating the above identity in ${s}$ yields the formal identity

$\displaystyle - \frac{\zeta'(s)}{\zeta(s)} = \sum_n \frac{\Lambda(n)}{n^s} = \frac{1}{s-1} - \sum_\rho \frac{1}{s-\rho} + \ldots \ \ \ \ \ (6)$

where ${\Lambda(n)}$ is the von Mangoldt function, defined to be ${\log p}$ when ${n}$ is a power of a prime ${p}$, and zero otherwise. Thus we see that the behaviour of the primes (as encoded by the von Mangoldt function) is intimately tied to the distribution of the zeroes ${\rho}$. For instance, if we knew that the zeroes were far away from the axis ${\hbox{Re}(s)=1}$, then we would heuristically have

$\displaystyle \sum_n \frac{\Lambda(n)}{n^{1+it}} \approx \frac{1}{it}$

for real ${t}$. On the other hand, the integral test suggests that

$\displaystyle \sum_n \frac{1}{n^{1+it}} \approx \frac{1}{it}$

and thus we see that ${\frac{\Lambda(n)}{n}}$ and ${\frac{1}{n}}$ have essentially the same (multiplicative) Fourier transform:

$\displaystyle \sum_n \frac{\Lambda(n)}{n^{1+it}} \approx \sum_n \frac{1}{n^{1+it}}.$

Inverting the Fourier transform (or performing a contour integral closely related to the inverse Fourier transform), one is led to the prime number theorem

$\displaystyle \sum_{n \leq x} \Lambda(n) \approx \sum_{n \leq x} 1.$

In fact, the standard proof of the prime number theorem basically proceeds by making all of the above formal arguments precise and rigorous.

Unfortunately, we don’t know as much about the zeroes ${\rho}$ of the zeta function (and hence, about the ${\zeta}$ function itself) as we would like. The Riemann hypothesis (RH) asserts that all the zeroes (except for the “trivial” zeroes at the negative even numbers) lie on the critical line ${\hbox{Re}(s)=1/2}$; this hypothesis would make the error terms in the above proof of the prime number theorem significantly more accurate. Furthermore, the stronger GUE hypothesis asserts in addition to RH that the local distribution of these zeroes on the critical line should behave like the local distribution of the eigenvalues of a random matrix drawn from the gaussian unitary ensemble (GUE). I will not give a precise formulation of this hypothesis here, except to say that the adjective “local” in the context of distribution of zeroes ${\rho}$ means something like “at scale ${O(1/\log T)}$ when ${\hbox{Im}(s) = O(T)}$“.

Nevertheless, we do know some reasonably non-trivial facts about the zeroes ${\rho}$ and the zeta function ${\zeta}$, either unconditionally, or assuming RH (or GUE). Firstly, there are no zeroes for ${\hbox{Re}(s)>1}$ (as one can already see from the convergence of the Euler product (2) in this case) or for ${\hbox{Re}(s)=1}$ (this is trickier, relying on (6) and the elementary observation that

$\displaystyle \hbox{Re}( 3\frac{\Lambda(n)}{n^{\sigma}} + 4\frac{\Lambda(n)}{n^{\sigma+it}} + \frac{\Lambda(n)}{n^{\sigma+2it}} ) = 2\frac{\Lambda(n)}{n^\sigma} (1+\cos(t \log n))^2$

is non-negative for ${\sigma > 1}$ and ${t \in {\mathbb R}}$); from the functional equation

$\displaystyle \pi^{-s/2} \Gamma(s/2) \zeta(s) = \pi^{-(1-s)/2} \Gamma((1-s)/2) \zeta(1-s)$

(which can be viewed as a consequence of the Poisson summation formula, see e.g. my blog post on this topic) we know that there are no zeroes for ${\hbox{Re}(s) \leq 0}$ either (except for the trivial zeroes at negative even integers, corresponding to the poles of the Gamma function). Thus all the non-trivial zeroes lie in the critical strip ${0 < \hbox{Re}(s) < 1}$.

We also know that there are infinitely many non-trivial zeroes, and can approximately count how many zeroes there are in any large bounded region of the critical strip. For instance, for large ${T}$, the number of zeroes ${\rho}$ in this strip with ${\hbox{Im}(\rho) = T+O(1)}$ is ${O(\log T)}$. This can be seen by applying (6) to ${s = 2+iT}$ (say); the trivial zeroes at the negative integers end up giving a contribution of ${O(\log T)}$ to this sum (this is a heavily disguised variant of Stirling’s formula, as one can view the trivial zeroes as essentially being poles of the Gamma function), while the ${\frac{1}{s-1}}$ and ${\ldots}$ terms end up being negligible (of size ${O(1)}$), while each non-trivial zero ${\rho}$ contributes a term which has a non-negative real part, and furthermore has size comparable to ${1}$ if ${\hbox{Im}(\rho) = T+O(1)}$. (Here I am glossing over a technical renormalisation needed to make the infinite series in (6) converge properly.) Meanwhile, the left-hand side of (6) is absolutely convergent for ${s=2+iT}$ and of size ${O(1)}$, and the claim follows. A more refined version of this argument shows that the number of non-trivial zeroes with ${0 \leq \hbox{Im}(\rho) \leq T}$ is ${\frac{T}{2\pi} \log \frac{T}{2\pi} - \frac{T}{2\pi} + O(\log T)}$, but we will not need this more precise formula here. (A fair fraction – at least 40%, in fact – of these zeroes are known to lie on the critical line; see this earlier blog post of mine for more discussion.)

Another thing that we happen to know is how the magnitude ${|\zeta(1/2+it)|}$ of the zeta function is distributed as ${t \rightarrow \infty}$; it turns out to be log-normally distributed with log-variance about ${\frac{1}{2} \log \log t}$. More precisely, we have the following result of Selberg:

Theorem 1 Let ${T}$ be a large number, and let ${t}$ be chosen uniformly at random from between ${T}$ and ${2T}$ (say). Then the distribution of ${\frac{1}{\sqrt{\frac{1}{2} \log \log T}} \log |\zeta(1/2+it)|}$ converges (in distribution) to the normal distribution ${N(0,1)}$.

To put it more informally, ${\log |\zeta(1/2+it)|}$ behaves like ${\sqrt{\frac{1}{2} \log \log t} \times N(0,1)}$ plus lower order terms for “typical” large values of ${t}$. (Zeroes ${\rho}$ of ${\zeta}$ are, of course, certainly not typical, but one can show that one can usually stay away from these zeroes.) In fact, Selberg showed a slightly more precise result, namely that for any fixed ${k \geq 1}$, the ${k^{th}}$ moment of ${\frac{1}{\sqrt{\frac{1}{2} \log \log T}} \log |\zeta(1/2+it)|}$ converges to the ${k^{th}}$ moment of ${N(0,1)}$.

Remarkably, Selberg’s result does not need RH or GUE, though it is certainly consistent with such hypotheses. (For instance, the determinant of a GUE matrix asymptotically obeys a remarkably similar log-normal law to that given by Selberg’s theorem.) Indeed, the net effect of these hypotheses only affects some error terms in ${\log |\zeta(1/2+it)|}$ of magnitude ${O(1)}$, and are thus asymptotically negligible compared to the main term, which has magnitude about ${O(\sqrt{\log \log T})}$. So Selberg’s result, while very pretty, manages to finesse the question of what the zeroes ${\rho}$ of ${\zeta}$ are actually doing – he makes the primes do most of the work, rather than the zeroes.

Selberg never actually published the above result, but it is reproduced in a number of places (e.g. in this book by Joyner, or this book by Laurincikas). As with many other results in analytic number theory, the actual details of the proof can get somewhat technical; but I would like to record here (partly for my own benefit) an informal sketch of some of the main ideas in the argument.

The Riemann zeta function $\zeta(s)$, defined for $\hbox{Re}(s) > 1$ by the formula

$\displaystyle \zeta(s) := \sum_{n \in {\Bbb N}} \frac{1}{n^s}$ (1)

where ${\Bbb N} = \{1,2,\ldots\}$ are the natural numbers, and extended meromorphically to other values of s by analytic continuation, obeys the remarkable functional equation

$\displaystyle \Xi(s) = \Xi(1-s)$ (2)

where

$\displaystyle \Xi(s) := \Gamma_\infty(s) \zeta(s)$ (3)

is the Riemann Xi function,

$\displaystyle \Gamma_\infty(s) := \pi^{-s/2} \Gamma(s/2)$ (4)

is the Gamma factor at infinity, and the Gamma function $\Gamma(s)$ is defined for $\hbox{Re}(s) > 1$ by

$\displaystyle \Gamma(s) := \int_0^\infty e^{-t} t^s\ \frac{dt}{t}$ (5)

and extended meromorphically to other values of s by analytic continuation.

There are many proofs known of the functional equation (2).  One of them (dating back to Riemann himself) relies on the Poisson summation formula

$\displaystyle \sum_{a \in {\Bbb Z}} f_\infty(a t_\infty) = \frac{1}{|t|_\infty} \sum_{a \in {\Bbb Z}} \hat f_\infty(a/t_\infty)$ (6)

for the reals $k_\infty := {\Bbb R}$ and $t \in k_\infty^*$, where $f$ is a Schwartz function, $|t|_\infty := |t|$ is the usual Archimedean absolute value on $k_\infty$, and

$\displaystyle \hat f_\infty(\xi_\infty) := \int_{k_\infty} e_\infty(-x_\infty \xi_\infty) f_\infty(x_\infty)\ dx_\infty$ (7)

is the Fourier transform on $k_\infty$, with $e_\infty(x_\infty) := e^{2\pi i x_\infty}$ being the standard character $e_\infty: k_\infty \to S^1$ on $k_\infty$.  (The reason for this rather strange notation for the real line and its associated structures will be made clearer shortly.)  Applying this formula to the (Archimedean) Gaussian function

$\displaystyle g_\infty(x_\infty) := e^{-\pi |x_\infty|^2}$, (8)

which is its own (additive) Fourier transform, and then applying the multiplicative Fourier transform (i.e. the Mellin transform), one soon obtains (2).  (Riemann also had another proof of the functional equation relying primarily on contour integration, which I will not discuss here.)  One can “clean up” this proof a bit by replacing the Gaussian by a Dirac delta function, although one now has to work formally and “renormalise” by throwing away some infinite terms.  (One can use the theory of distributions to make this latter approach rigorous, but I will not discuss this here.)  Note how this proof combines the additive Fourier transform with the multiplicative Fourier transform.  [Continuing with this theme, the Gamma function (5) is an inner product between an additive character $e^{-t}$ and a multiplicative character $t^s$, and the zeta function (1) can be viewed both additively, as a sum over n, or multiplicatively, as an Euler product.]

In the famous thesis of Tate, the above argument was reinterpreted using the language of the adele ring ${\Bbb A}$, with the Poisson summation formula (4) on $k_\infty$ replaced by the Poisson summation formula

$\displaystyle \sum_{a \in k} f(a t) = \sum_{a \in k} \hat f(t/a)$ (9)

on ${\Bbb A}$, where $k = {\Bbb Q}$ is the rationals, $t \in {\Bbb A}$, and f is now a Schwartz-Bruhat function on ${\Bbb A}$.  Applying this formula to the adelic (or global) Gaussian function $g(x) := g_\infty(x_\infty) \prod_p 1_{{\mathbb Z}_p}(x_p)$, which is its own Fourier transform, and then using the adelic Mellin transform, one again obtains (2).  Again, the proof can be cleaned up by replacing the Gaussian with a Dirac mass, at the cost of making the computations formal (or requiring the theory of distributions).

In this post I will write down both Riemann’s proof and Tate’s proof together (but omitting some technical details), to emphasise the fact that they are, in some sense, the same proof.  However, Tate’s proof gives a high-level clarity to the situation (in particular, explaining more adequately why the Gamma factor at infinity (4) fits seamlessly with the Riemann zeta function (1) to form the Xi function (2)), and allows one to generalise the functional equation relatively painlessly to other zeta-functions and L-functions, such as Dedekind zeta functions and Hecke L-functions.

[Note: the material here is very standard in modern algebraic number theory; the post here is partially for my own benefit, as most treatments of this topic in the literature tend to operate in far higher levels of generality than I would prefer.]