
Mertens’ theorems are a set of classical estimates concerning the asymptotic distribution of the prime numbers:

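Theorem 1 (Mertens’ theorems) In the asymptotic limit $x \to \infty$, we have

$$\sum_{p \le x} \frac{\log p}{p} = \log x + O(1), \tag{1}$$

$$\sum_{p \le x} \frac{1}{p} = \log\log x + O(1), \tag{2}$$

and

$$\sum_{p \le x} \log\left(1 - \frac{1}{p}\right) = -\log\log x - \gamma + o(1), \tag{3}$$

where $\gamma$ is the Euler-Mascheroni constant, defined by requiring that

$$1 + \frac{1}{2} + \dots + \frac{1}{n} = \log n + \gamma + o(1) \tag{4}$$

in the limit $n \to \infty$.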

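The third theorem (3) is usually stated in exponentiated form

$$\prod_{p \le x} \left(1 - \frac{1}{p}\right) = \frac{e^{-\gamma} + o(1)}{\log x},$$

but in the logarithmic form (3) we see that it is strictly stronger than (2), in view of the asymptotic $\log(1 - \frac{1}{p}) = -\frac{1}{p} + O(\frac{1}{p^2})$.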
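As a quick numerical sanity check (not part of any proof), the three sums over primes can be compared with their predicted main terms. The sketch below is only illustrative: the function names `primes_up_to` and `mertens_sums` are hypothetical, the value of $\gamma$ is hard-coded, and a plain sieve of Eratosthenes stands in for any cleverer prime enumeration.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant, hard-coded


def primes_up_to(x):
    """Sieve of Eratosthenes: return all primes p <= x (x >= 1)."""
    sieve = bytearray([1]) * (x + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, math.isqrt(x) + 1):
        if sieve[i]:
            # cross out i*i, i*i + i, ... in one slice assignment
            sieve[i * i :: i] = bytearray(len(range(i * i, x + 1, i)))
    return [i for i in range(2, x + 1) if sieve[i]]


def mertens_sums(x):
    """Return the three sums over primes p <= x appearing in Mertens' theorems."""
    ps = primes_up_to(x)
    first = sum(math.log(p) / p for p in ps)       # compare with log x
    second = sum(1.0 / p for p in ps)              # compare with log log x
    third = sum(math.log1p(-1.0 / p) for p in ps)  # compare with -log log x - gamma
    return first, second, third


if __name__ == "__main__":
    x = 10**5
    first, second, third = mertens_sums(x)
    loglog = math.log(math.log(x))
    print("first  - log x        :", first - math.log(x))
    print("second - log log x    :", second - loglog)
    print("third + log log x + g :", third + loglog + EULER_GAMMA)
```

The first two differences should stay bounded as `x` grows, while the third should tend to zero, which is exactly the sense in which the logarithmic form of the third theorem pins down the constant.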

Remarkably, these theorems can be proven without the assistance of the prime number theorem

$$\sum_{p \le x} 1 = (1 + o(1)) \frac{x}{\log x},$$

which was proven about two decades after Mertens’ work. (But one can certainly use versions of the prime number theorem with good error term, together with summation by parts, to obtain good estimates on the various errors in Mertens’ theorems.) Roughly speaking, the reason for this is that Mertens’ theorems only require control on the Riemann zeta function $\zeta(s)$ in the neighbourhood of the pole at $s = 1$, whereas (as discussed in this previous post) the prime number theorem requires control on the zeta function on (a neighbourhood of) the line $\{1 + it: t \in \mathbb{R}\}$. Specifically, Mertens’ theorems are ultimately deduced from the Euler product formula

$$\zeta(s) = \prod_p \left(1 - \frac{1}{p^s}\right)^{-1}, \tag{5}$$

valid in the region $\mathrm{Re}(s) > 1$ (which is ultimately a Fourier-Dirichlet transform of the fundamental theorem of arithmetic), and the following crude asymptotics:

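Proposition 2 (Simple pole) For $s$ sufficiently close to $1$ with $\mathrm{Re}(s) > 1$, we have

$$\zeta(s) = \frac{1}{s-1} + O(1) \tag{6}$$

and

$$\zeta'(s) = \frac{-1}{(s-1)^2} + O(1).$$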

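*Proof:* For $s$ as in the proposition, we have $\frac{1}{n^s} = \frac{1}{t^s} + O(\frac{1}{n^2})$ for any natural number $n$ and $n \le t \le n+1$, and hence

$$\frac{1}{n^s} = \int_n^{n+1} \frac{1}{t^s}\,dt + O\left(\frac{1}{n^2}\right).$$

Summing in $n$ and using the identity $\int_1^\infty \frac{1}{t^s}\,dt = \frac{1}{s-1}$, we obtain the first claim. Similarly, we have

$$\frac{-\log n}{n^s} = \int_n^{n+1} \frac{-\log t}{t^s}\,dt + O\left(\frac{\log n}{n^2}\right),$$

and by summing in $n$ and using the identity $\int_1^\infty \frac{-\log t}{t^s}\,dt = \frac{-1}{(s-1)^2}$ (the derivative of the previous identity) we obtain the claim.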
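The simple-pole behaviour can be observed numerically for real $s$ slightly larger than $1$. The sketch below is an illustration under stated assumptions rather than the argument used in the proof: it approximates $\zeta(s)$ by a finite sum plus the first few Euler-Maclaurin correction terms for the tail (a standard device swapped in here for convenience), and the names `zeta_real` and `pole_remainder` are hypothetical.

```python
import math


def zeta_real(s, M=1000):
    """Approximate zeta(s) for real s > 1.

    Uses sum_{n <= M} n^{-s} plus the leading Euler-Maclaurin terms for the
    tail: M^{1-s}/(s-1) - M^{-s}/2 + s*M^{-s-1}/12. The neglected terms are
    of size O(M^{-s-3}).
    """
    head = sum(n ** -s for n in range(1, M + 1))
    tail = M ** (1 - s) / (s - 1) - 0.5 * M ** -s + s * M ** (-s - 1) / 12
    return head + tail


def pole_remainder(s):
    """zeta(s) - 1/(s-1), which the simple-pole estimate says is O(1) near s = 1."""
    return zeta_real(s) - 1.0 / (s - 1)


if __name__ == "__main__":
    for s in (1.5, 1.1, 1.01, 1.001):
        print(s, pole_remainder(s))
```

As `s` decreases to $1$ the printed remainder should stay bounded, consistent with the first estimate in the proposition.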

The first two of Mertens’ theorems (1), (2) are relatively easy to prove, and imply the third theorem (3) except with $\gamma$ replaced by an unspecified absolute constant. To get the specific constant $\gamma$ requires a little bit of additional effort. From (4), one might expect that the appearance of $\gamma$ arises from the refinement

$$\zeta(s) = \frac{1}{s-1} + \gamma + O(|s-1|) \tag{7}$$

that one can obtain to (6). However, it turns out that the connection is not so much with the zeta function, but with the Gamma function, and specifically with the identity $\Gamma'(1) = -\gamma$ (which is of course related to (7) through the functional equation for zeta, but can be proven without any reference to zeta functions). More specifically, we have the following asymptotic for the exponential integral:

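Proposition 3 (Exponential integral asymptotics) For sufficiently small $\epsilon > 0$, one has

$$\int_\epsilon^\infty \frac{e^{-t}}{t}\,dt = \log\frac{1}{\epsilon} - \gamma + O(\epsilon).$$

A routine integration by parts shows that this asymptotic is equivalent to the identity

$$\int_0^\infty e^{-t} \log t\,dt = -\gamma,$$

which is the identity $\Gamma'(1) = -\gamma$ mentioned previously.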

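*Proof:* We start by using the identity $\frac{1}{i} = \int_0^1 x^{i-1}\,dx$ to express the harmonic series $H_n := 1 + \frac{1}{2} + \dots + \frac{1}{n}$ as

$$H_n = \int_0^1 1 + x + \dots + x^{n-1}\,dx$$

or on summing the geometric series

$$H_n = \int_0^1 \frac{1 - x^n}{1 - x}\,dx.$$

Since $\int_0^{1-1/n} \frac{dx}{1-x} = \log n$, we thus have

$$H_n - \log n = \int_0^1 \frac{1_{[1-1/n,1]}(x) - x^n}{1-x}\,dx;$$

making the change of variables $x = 1 - \frac{t}{n}$, this becomes

$$H_n - \log n = \int_0^n \frac{1_{[0,1]}(t) - (1 - \frac{t}{n})^n}{t}\,dt.$$

As $n \to \infty$, $\frac{1_{[0,1]}(t) - (1 - \frac{t}{n})^n}{t}$ converges pointwise to $\frac{1_{[0,1]}(t) - e^{-t}}{t}$ and is pointwise dominated by $O(e^{-t})$. Taking limits as $n \to \infty$ using dominated convergence, we conclude that

$$\gamma = \int_0^\infty \frac{1_{[0,1]}(t) - e^{-t}}{t}\,dt$$

or equivalently

$$\int_0^\infty \frac{e^{-t} - 1_{[0,\epsilon]}(t)}{t}\,dt = \log\frac{1}{\epsilon} - \gamma.$$

The claim then follows by bounding the $\int_0^\epsilon$ portion of the integral on the left-hand side.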
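Both the integral representation of $\gamma$ and the exponential integral asymptotic can be checked with elementary quadrature. The following is a minimal sketch, assuming a composite midpoint rule is accurate enough for illustration; the helper names are hypothetical, the upper limit is truncated at a finite `cutoff`, and the substitution $t = e^u$ is my own choice to tame the $1/t$ factor near $t = \epsilon$.

```python
import math

EULER_GAMMA = 0.5772156649015329  # hard-coded reference value


def midpoint(f, a, b, m):
    """Composite midpoint rule for the integral of f over [a, b] with m cells."""
    h = (b - a) / m
    return h * sum(f(a + (i + 0.5) * h) for i in range(m))


def gamma_from_integral(m=20000, cutoff=40.0):
    """gamma = int_0^1 (1 - e^{-t})/t dt - int_1^inf e^{-t}/t dt (tail truncated)."""
    near = midpoint(lambda t: -math.expm1(-t) / t, 0.0, 1.0, m)
    far = midpoint(lambda t: math.exp(-t) / t, 1.0, cutoff, m)
    return near - far


def exp_integral(eps, m=20000, cutoff=40.0):
    """int_eps^inf e^{-t}/t dt, computed as int exp(-e^u) du after t = e^u."""
    return midpoint(lambda u: math.exp(-math.exp(u)),
                    math.log(eps), math.log(cutoff), m)


if __name__ == "__main__":
    print("gamma from integral:", gamma_from_integral())
    for eps in (0.1, 0.01, 0.001):
        gap = exp_integral(eps) - (math.log(1.0 / eps) - EULER_GAMMA)
        print(eps, gap)  # should shrink roughly in proportion to eps
```

The printed gap is the $O(\epsilon)$ error term of the proposition, so it should decrease roughly linearly as `eps` does.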

Below the fold I would like to record how Proposition 2 and Proposition 3 imply Theorem 1; the computations are utterly standard, and can be found in most analytic number theory texts, but I wanted to write them down for my own benefit (I always keep forgetting, in particular, how the third of Mertens’ theorems is proven).

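The Riemann zeta function $\zeta(s)$ is defined in the region $\mathrm{Re}(s) > 1$ by the absolutely convergent series

$$\zeta(s) = \sum_{n=1}^\infty \frac{1}{n^s} = 1 + \frac{1}{2^s} + \frac{1}{3^s} + \dots. \tag{1}$$

Thus, for instance, it is known that $\zeta(2) = \pi^2/6$, and thus

$$\sum_{n=1}^\infty \frac{1}{n^2} = 1 + \frac{1}{4} + \frac{1}{9} + \dots = \frac{\pi^2}{6}. \tag{2}$$

For $\mathrm{Re}(s) \le 1$, the series on the right-hand side of (1) is no longer absolutely convergent, or even conditionally convergent. Nevertheless, the $\zeta$ function can be extended to this region (with a pole at $s = 1$) by analytic continuation. For instance, it can be shown that after analytic continuation, one has $\zeta(0) = -1/2$, $\zeta(-1) = -1/12$, and $\zeta(-2) = 0$, and more generally

$$\zeta(-s) = -\frac{B_{s+1}}{s+1} \tag{3}$$

for $s = 1, 2, \dots$, where $B_n$ are the Bernoulli numbers. If one *formally* applies (1) at these values of $s$, one obtains the somewhat bizarre formulae

$$\sum_{n=1}^\infty 1 = 1 + 1 + 1 + \dots = -1/2, \tag{4}$$

$$\sum_{n=1}^\infty n = 1 + 2 + 3 + \dots = -1/12, \tag{5}$$

$$\sum_{n=1}^\infty n^2 = 1 + 4 + 9 + \dots = 0, \tag{6}$$

and

$$\sum_{n=1}^\infty n^s = 1 + 2^s + 3^s + \dots = -\frac{B_{s+1}}{s+1}. \tag{7}$$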

Clearly, these formulae do not make sense if one stays within the traditional way to evaluate infinite series, and so it seems that one is forced to use the somewhat unintuitive analytic continuation interpretation of such sums to make these formulae rigorous. But as it stands, the formulae look “wrong” for several reasons. Most obviously, the summands on the left are all positive, but the right-hand sides can be zero or negative. A little more subtly, the identities do not appear to be consistent with each other. For instance, if one adds (4) to (5), one obtains

$$\sum_{n=1}^\infty (n+1) = 2 + 3 + 4 + \dots = -7/12, \tag{8}$$

whereas if one subtracts $1$ from (5) one obtains instead

$$\sum_{n=2}^\infty n = 0 + 2 + 3 + 4 + \dots = -13/12, \tag{9}$$

and the two equations seem inconsistent with each other.

However, it is possible to interpret (4), (5), (6) by purely real-variable methods, without recourse to complex analysis methods such as analytic continuation, thus giving an “elementary” interpretation of these sums that only requires undergraduate calculus; we will later also explain how this interpretation deals with the apparent inconsistencies pointed out above.

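To see this, let us first consider a convergent sum such as (2). The classical interpretation of this formula is the assertion that the partial sums

$$\sum_{n=1}^N \frac{1}{n^2} = 1 + \frac{1}{4} + \frac{1}{9} + \dots + \frac{1}{N^2}$$

converge to $\pi^2/6$ as $N \to \infty$, or in other words that

$$\sum_{n=1}^N \frac{1}{n^2} = \frac{\pi^2}{6} + o(1),$$

where $o(1)$ denotes a quantity that goes to zero as $N \to \infty$. Actually, by using the integral test estimate

$$\sum_{n=N+1}^\infty \frac{1}{n^2} \le \int_N^\infty \frac{dx}{x^2} = \frac{1}{N}$$

we have the sharper result

$$\sum_{n=1}^N \frac{1}{n^2} = \frac{\pi^2}{6} + O\left(\frac{1}{N}\right).$$

Thus we can view $\pi^2/6$ as the leading coefficient of the asymptotic expansion of the partial sums of $\sum_{n=1}^\infty 1/n^2$.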

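One can then try to inspect the partial sums of the expressions in (4), (5), (6), but the coefficients bear no obvious relationship to the right-hand sides:

$$\sum_{n=1}^N 1 = N,$$

$$\sum_{n=1}^N n = \frac{1}{2} N^2 + \frac{1}{2} N,$$

$$\sum_{n=1}^N n^2 = \frac{1}{3} N^3 + \frac{1}{2} N^2 + \frac{1}{6} N.$$

For (7), the classical Faulhaber formula (or *Bernoulli formula*) gives

$$\sum_{n=1}^N n^s = \frac{1}{s+1} \sum_{j=0}^s \binom{s+1}{j} B_j N^{s+1-j} \tag{10}$$

for $s = 1, 2, \dots$ (with the convention $B_1 = +1/2$), which has a vague resemblance to (7), but again the connection is not particularly clear.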
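The Faulhaber formula and the zeta values at negative integers are easy to confirm in exact rational arithmetic. Here is a minimal sketch; the function names are hypothetical, and the code assumes the sign convention $B_1 = +1/2$, which is the one that makes the formula correct for sums starting at $n = 1$.

```python
from fractions import Fraction
from math import comb


def bernoulli_plus(m):
    """Bernoulli numbers B_0, ..., B_m as exact fractions, with B_1 = +1/2."""
    B = [Fraction(1)]
    for k in range(1, m + 1):
        # standard recurrence sum_{j=0}^{k} C(k+1, j) B_j = 0, which yields B_1 = -1/2
        B.append(-sum(comb(k + 1, j) * B[j] for j in range(k)) / (k + 1))
    if m >= 1:
        B[1] = Fraction(1, 2)  # switch to the B_1 = +1/2 convention
    return B


def faulhaber(s, N):
    """Right-hand side of the Faulhaber formula for 1^s + 2^s + ... + N^s."""
    B = bernoulli_plus(s)
    total = sum(comb(s + 1, j) * B[j] * N ** (s + 1 - j) for j in range(s + 1))
    return total / (s + 1)


def zeta_at_negative(s):
    """The value -B_{s+1}/(s+1) assigned to zeta(-s) for s = 1, 2, ..."""
    return -bernoulli_plus(s + 1)[s + 1] / (s + 1)


if __name__ == "__main__":
    print(faulhaber(2, 10), sum(n**2 for n in range(1, 11)))
    print([zeta_at_negative(s) for s in (1, 2, 3)])
```

Comparing `faulhaber(s, N)` with the polynomial coefficients makes the puzzle concrete: none of the coefficients of the partial-sum polynomial equals the value returned by `zeta_at_negative(s)`.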

The problem here is the discrete nature of the partial sum

$$\sum_{n=1}^N n^s = \sum_{n \le N} n^s,$$

which (if $N$ is viewed as a real number) has jump discontinuities at each positive integer value of $N$. These discontinuities yield various artefacts when trying to approximate this sum by a polynomial in $N$. (These artefacts also occur in (2), but happen in that case to be obscured in the error term $O(1/N)$; but for the divergent sums (4), (5), (6), (7), they are large enough to cause real trouble.)

However, these issues can be resolved by replacing the abruptly truncated partial sums $\sum_{n=1}^N n^s$ with smoothed sums $\sum_{n=1}^\infty \eta(n/N) n^s$, where $\eta: \mathbb{R}^+ \to \mathbb{R}$ is a *cutoff function*, or more precisely a compactly supported bounded function that equals $1$ at $0$. The case when $\eta$ is the indicator function $1_{[0,1]}$ then corresponds to the traditional partial sums, with all the attendant discretisation artefacts; but if one chooses a smoother cutoff, then these artefacts begin to disappear (or at least become lower order), and the true asymptotic expansion becomes more manifest.

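Note that smoothing does not affect the asymptotic value of sums that were already absolutely convergent, thanks to the dominated convergence theorem. For instance, we have

$$\sum_{n=1}^\infty \frac{1}{n^2} \eta(n/N) = \frac{\pi^2}{6} + o(1)$$

whenever $\eta$ is a cutoff function (since $\eta(n/N) \to 1$ pointwise as $N \to \infty$ and is uniformly bounded). If $\eta$ is equal to $1$ on a neighbourhood of the origin, then the integral test argument recovers the $O(1/N)$ decay rate:

$$\sum_{n=1}^\infty \frac{1}{n^2} \eta(n/N) = \frac{\pi^2}{6} + O\left(\frac{1}{N}\right).$$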

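However, smoothing can greatly improve the convergence properties of a divergent sum. The simplest example is Grandi’s series

$$\sum_{n=1}^\infty (-1)^{n-1} = 1 - 1 + 1 - \dots.$$

The partial sums

$$\sum_{n=1}^N (-1)^{n-1} = \frac{1}{2} + \frac{1}{2} (-1)^{N-1}$$

oscillate between $1$ and $0$, and so this series is not conditionally convergent (and certainly not absolutely convergent). However, if one performs analytic continuation on the series

$$\sum_{n=1}^\infty \frac{(-1)^{n-1}}{n^s} = 1 - \frac{1}{2^s} + \frac{1}{3^s} - \dots$$

and sets $s = 0$, one obtains a formal value of $1/2$ for this series. This value can also be obtained by smooth summation. Indeed, for any cutoff function $\eta$, we can regroup

$$\sum_{n=1}^\infty (-1)^{n-1} \eta(n/N) = \frac{\eta(1/N)}{2} + \sum_{m=1}^\infty \frac{\eta((2m-1)/N) - 2\eta(2m/N) + \eta((2m+1)/N)}{2}.$$

If $\eta$ is twice continuously differentiable (i.e. $\eta \in C^2$), then from Taylor expansion we see that the summand has size $O(1/N^2)$, and also (from the compact support of $\eta$) is only non-zero when $m = O(N)$. This leads to the asymptotic

$$\sum_{n=1}^\infty (-1)^{n-1} \eta(n/N) = \frac{1}{2} + O\left(\frac{1}{N}\right)$$

and so we recover the value of $1/2$ as the leading term of the asymptotic expansion.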

Exercise 1 Show that if $\eta$ is merely once continuously differentiable (i.e. $\eta \in C^1$), then we have a similar asymptotic, but with an error term of $o(1)$ instead of $O(1/N)$. This is an instance of a more general principle that smoother cutoffs lead to better error terms, though the improvement sometimes stops after some degree of regularity.

Remark 1 The most famous instance of smoothed summation is Cesàro summation, which corresponds to the cutoff function $\eta(x) := (1-x)_+$. Unsurprisingly, when Cesàro summation is applied to Grandi’s series, one again recovers the value of $1/2$.
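The smoothed summation of Grandi’s series is easy to try numerically. The sketch below is illustrative only: `smooth_cutoff` is one arbitrary choice of $C^\infty$ cutoff (equal to $1$ on $[0,1]$ and $0$ beyond $2$) that I have made up for the experiment, `cesaro_cutoff` is the Cesàro cutoff $(1-x)_+$, and `smoothed_grandi` is a hypothetical helper name.

```python
import math


def smooth_cutoff(x):
    """A C-infinity cutoff: 1 for x <= 1, 0 for x >= 2, smooth in between."""
    if x <= 1.0:
        return 1.0
    if x >= 2.0:
        return 0.0
    a = math.exp(-1.0 / (2.0 - x))
    b = math.exp(-1.0 / (x - 1.0))
    return a / (a + b)


def cesaro_cutoff(x):
    """The Cesaro cutoff (1 - x)_+, supported on [0, 1]."""
    return max(0.0, 1.0 - x)


def smoothed_grandi(eta, N, support=2.0):
    """sum_{n >= 1} (-1)^(n-1) eta(n/N), assuming eta vanishes beyond `support`."""
    top = int(math.ceil(support * N))
    return sum((-1) ** (n - 1) * eta(n / N) for n in range(1, top + 1))


if __name__ == "__main__":
    for N in (10, 11, 100, 101):
        print(N,
              smoothed_grandi(smooth_cutoff, N),
              smoothed_grandi(cesaro_cutoff, N, support=1.0))
```

Unlike the raw partial sums, which alternate between $1$ and $0$, both smoothed sums should settle near $1/2$ regardless of the parity of `N`, with the smoother cutoff converging faster.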

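If we now revisit the divergent series (4), (5), (6), (7) with smooth summation in mind, we finally begin to see the origin of the right-hand sides. Indeed, for any fixed smooth cutoff function $\eta$, we will shortly show that

$$\sum_{n=1}^\infty \eta(n/N) = -\frac{1}{2} + C_{\eta,0} N + O\left(\frac{1}{N}\right), \tag{11}$$

$$\sum_{n=1}^\infty n \eta(n/N) = -\frac{1}{12} + C_{\eta,1} N^2 + O\left(\frac{1}{N}\right), \tag{12}$$

$$\sum_{n=1}^\infty n^2 \eta(n/N) = C_{\eta,2} N^3 + O\left(\frac{1}{N}\right), \tag{13}$$

and more generally

$$\sum_{n=1}^\infty n^s \eta(n/N) = -\frac{B_{s+1}}{s+1} + C_{\eta,s} N^{s+1} + O\left(\frac{1}{N}\right) \tag{14}$$

for any fixed $s = 1, 2, 3, \dots$, where $C_{\eta,s}$ is the Archimedean factor

$$C_{\eta,s} := \int_0^\infty x^s \eta(x)\,dx \tag{15}$$

(which is also essentially the Mellin transform of $\eta$). Thus we see that the values (4), (5), (6), (7) obtained by analytic continuation are nothing more than the constant terms of the asymptotic expansion of the *smoothed* partial sums. This is not a coincidence; we will explain the equivalence of these two interpretations of such sums (in the model case when the analytic continuation has only finitely many poles and does not grow too fast at infinity) below the fold.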
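One can watch the constant terms $-1/2$, $-1/12$, $0$ emerge numerically by subtracting the divergent main term $C_{\eta,s} N^{s+1}$ from a smoothed sum. This is a rough sketch under my own assumptions: the particular cutoff `eta` (equal to $1$ on $[0,1]$, vanishing beyond $2$) is an arbitrary choice, the Archimedean factor is computed with a composite Simpson rule, and all function names are hypothetical.

```python
import math


def eta(x):
    """A C-infinity cutoff: 1 for x <= 1, 0 for x >= 2, smooth in between."""
    if x <= 1.0:
        return 1.0
    if x >= 2.0:
        return 0.0
    a = math.exp(-1.0 / (2.0 - x))
    b = math.exp(-1.0 / (x - 1.0))
    return a / (a + b)


def simpson(f, a, b, m):
    """Composite Simpson rule on [a, b] with m subintervals (m even)."""
    h = (b - a) / m
    total = f(a) + f(b)
    for i in range(1, m):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3


def archimedean_factor(s):
    """C_{eta,s} = int_0^inf x^s eta(x) dx; the piece over [0, 1] is exactly 1/(s+1)."""
    return 1.0 / (s + 1) + simpson(lambda x: x**s * eta(x), 1.0, 2.0, 2000)


def smoothed_sum(s, N):
    """sum_{n >= 1} n^s eta(n/N); terms with n >= 2N vanish for this cutoff."""
    return sum(n**s * eta(n / N) for n in range(1, 2 * N + 1))


def constant_term(s, N):
    """Smoothed sum minus the divergent main term C_{eta,s} N^{s+1}."""
    return smoothed_sum(s, N) - archimedean_factor(s) * N ** (s + 1)


if __name__ == "__main__":
    for s in (0, 1, 2):
        print(s, constant_term(s, 100))
```

For this cutoff the three printed values should be close to $-1/2$, $-1/12$ and $0$ respectively; a different smooth cutoff changes $C_{\eta,s}$ but should leave these constant terms unchanged.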

This interpretation clears up the apparent inconsistencies alluded to earlier. For instance, the sum $\sum_{n=1}^\infty n = 1 + 2 + 3 + \dots$ consists only of non-negative terms, as do its smoothed partial sums $\sum_{n=1}^\infty n \eta(n/N)$ (if $\eta$ is non-negative). Comparing this with (12), we see that this forces the highest-order term $C_{\eta,1} N^2$ to be non-negative (as indeed it is), but does not prohibit the *lower-order* constant term $-\frac{1}{12}$ from being negative (which of course it is).

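Similarly, if we add together (12) and (11) we obtain

$$\sum_{n=1}^\infty (n+1) \eta(n/N) = -\frac{7}{12} + C_{\eta,1} N^2 + C_{\eta,0} N + O\left(\frac{1}{N}\right), \tag{16}$$

while if we subtract $1$ from (12) we obtain

$$\sum_{n=2}^\infty n \eta(n/N) = -\frac{13}{12} + C_{\eta,1} N^2 + O\left(\frac{1}{N}\right). \tag{17}$$

These two asymptotics are not inconsistent with each other; indeed, if we shift the index of summation in (17), we can write

$$\sum_{n=2}^\infty n \eta(n/N) = \sum_{n=1}^\infty (n+1) \eta((n+1)/N), \tag{18}$$

and so we now see that the discrepancy between the two sums in (8), (9) comes from the shifting of the cutoff $\eta(n/N)$, which is invisible in the formal expressions in (8), (9) but becomes manifestly present in the smoothed sum formulation.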

Exercise 2 By Taylor expanding $\eta((n+1)/N)$ and using (11), (18) show that (16) and (17) are indeed consistent with each other, and in particular one can deduce the latter from the former.

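The Riemann zeta function $\zeta(s)$, defined for $\mathrm{Re}(s) > 1$ by

$$\zeta(s) := \sum_{n=1}^\infty \frac{1}{n^s} \tag{1}$$

and then continued meromorphically to other values of $s$ by analytic continuation, is a fundamentally important function in analytic number theory, as it is connected to the primes $p = 2, 3, 5, \dots$ via the Euler product formula

$$\zeta(s) = \prod_p \left(1 - \frac{1}{p^s}\right)^{-1} \tag{2}$$

(for $\mathrm{Re}(s) > 1$, at least), where $p$ ranges over primes. (The equivalence between (1) and (2) is essentially the generating function version of the fundamental theorem of arithmetic.) The function $\zeta$ has a pole at $1$ and a number of zeroes $\rho$. A formal application of the factor theorem gives

$$\zeta(s) = \frac{1}{s-1} \prod_\rho (s - \rho) \times \ldots \tag{3}$$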

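where $\rho$ ranges over zeroes of $\zeta$, and we will be vague about what the $\ldots$ factor is, how to make sense of the infinite product, and exactly which zeroes of $\zeta$ are involved in the product. Equating (2) and (3) and taking logarithms gives the formal identity

$$-\log \zeta(s) = \sum_p \log\left(1 - \frac{1}{p^s}\right) = \log(s-1) - \sum_\rho \log(s - \rho) + \ldots,$$

and differentiating the above identity in $s$ yields the formal identity

$$-\frac{\zeta'(s)}{\zeta(s)} = \sum_n \frac{\Lambda(n)}{n^s} = \frac{1}{s-1} - \sum_\rho \frac{1}{s-\rho} + \ldots \tag{6}$$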

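where $\Lambda(n)$ is the von Mangoldt function, defined to be $\log p$ when $n$ is a power of a prime $p$, and zero otherwise. Thus we see that the behaviour of the primes (as encoded by the von Mangoldt function) is intimately tied to the distribution of the zeroes $\rho$. For instance, if we knew that the zeroes were far away from the axis $\mathrm{Re}(s) = 1$, then we would heuristically have

$$\sum_n \frac{\Lambda(n)}{n^{1+it}} \approx \frac{1}{it}$$

for real $t$. On the other hand, the integral test suggests that

$$\sum_n \frac{1}{n^{1+it}} \approx \frac{1}{it}$$

and thus we see that $\frac{\Lambda(n)}{n}$ and $\frac{1}{n}$ have essentially the same (multiplicative) Fourier transform:

$$\sum_n \frac{\Lambda(n)}{n^{1+it}} \approx \sum_n \frac{1}{n^{1+it}}.$$

Inverting the Fourier transform (or performing a contour integral closely related to the inverse Fourier transform), one is led to the prime number theorem

$$\sum_{n \le x} \Lambda(n) \approx \sum_{n \le x} 1.$$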

In fact, the standard proof of the prime number theorem basically proceeds by making all of the above formal arguments precise and rigorous.
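The von Mangoldt form of the prime number theorem says that the summatory function $\sum_{n \le x} \Lambda(n)$ is approximately $x$, which is simple to test numerically. Here is a minimal sketch, assuming a basic sieve is adequate for the range of interest; the name `chebyshev_psi` is a hypothetical label for this summatory function (it is usually denoted $\psi(x)$).

```python
import math


def chebyshev_psi(x):
    """Return sum_{n <= x} Lambda(n) for an integer x >= 1.

    Lambda(n) = log p when n is a power of a prime p and 0 otherwise, so each
    prime p contributes log p once for every power p^k <= x.
    """
    sieve = bytearray([1]) * (x + 1)
    sieve[0:2] = b"\x00\x00"
    total = 0.0
    for p in range(2, x + 1):
        if sieve[p]:
            # p is prime: cross out its multiples starting from p*p
            sieve[p * p :: p] = bytearray(len(range(p * p, x + 1, p)))
            pk = p
            while pk <= x:
                total += math.log(p)
                pk *= p
    return total


if __name__ == "__main__":
    for x in (10**2, 10**3, 10**4, 10**5):
        print(x, chebyshev_psi(x) / x)
```

The printed ratios should approach $1$ as `x` grows, which is the statement that $\Lambda$ behaves on average like the constant function $1$.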

Unfortunately, we don’t know as much about the zeroes $\rho$ of the zeta function (and hence, about the function $\zeta$ itself) as we would like. The Riemann hypothesis (RH) asserts that all the zeroes (except for the “trivial” zeroes at the negative even numbers) lie on the *critical line* $\mathrm{Re}(s) = 1/2$; this hypothesis would make the error terms in the above proof of the prime number theorem significantly more accurate. Furthermore, the stronger *GUE hypothesis* asserts in addition to RH that the local distribution of these zeroes on the critical line should behave like the local distribution of the eigenvalues of a random matrix drawn from the gaussian unitary ensemble (GUE). I will not give a precise formulation of this hypothesis here, except to say that the adjective “local” in the context of distribution of zeroes $\rho$ means something like “at scale $O(1/\log T)$ when $\mathrm{Im}(s) = O(T)$”.

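Nevertheless, we do know some reasonably non-trivial facts about the zeroes $\rho$ and the zeta function $\zeta$, either unconditionally, or assuming RH (or GUE). Firstly, there are no zeroes for $\mathrm{Re}(s) > 1$ (as one can already see from the convergence of the Euler product (2) in this case) or for $\mathrm{Re}(s) = 1$ (this is trickier, relying on (6) and the elementary observation that

$$\mathrm{Re}\left( 3 \frac{\Lambda(n)}{n^{\sigma}} + 4 \frac{\Lambda(n)}{n^{\sigma+it}} + \frac{\Lambda(n)}{n^{\sigma+2it}} \right) = 2 \frac{\Lambda(n)}{n^\sigma} (1 + \cos(t \log n))^2$$

is non-negative for $\sigma > 1$ and $t \in \mathbb{R}$); from the functional equation

$$\pi^{-s/2} \Gamma(s/2) \zeta(s) = \pi^{-(1-s)/2} \Gamma((1-s)/2) \zeta(1-s)$$

(which can be viewed as a consequence of the Poisson summation formula, see e.g. my blog post on this topic) we know that there are no zeroes for $\mathrm{Re}(s) \le 0$ either (except for the trivial zeroes at negative even integers, corresponding to the poles of the Gamma function). Thus all the non-trivial zeroes lie in the *critical strip* $0 < \mathrm{Re}(s) < 1$.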

We also know that there are infinitely many non-trivial zeroes, and can approximately count how many zeroes there are in any large bounded region of the critical strip. For instance, for large $T$, the number of zeroes $\rho$ in this strip with $\mathrm{Im}(\rho) = T + O(1)$ is $O(\log T)$. This can be seen by applying (6) to $s = 2 + iT$ (say); the trivial zeroes at the negative even integers end up giving a contribution of $O(\log T)$ to this sum (this is a heavily disguised variant of Stirling’s formula, as one can view the trivial zeroes as essentially being poles of the Gamma function), while the $\frac{1}{s-1}$ and $\ldots$ terms end up being negligible (of size $O(1)$), while each non-trivial zero $\rho$ contributes a term which has a non-negative real part, and furthermore has size comparable to $1$ if $\mathrm{Im}(\rho) = T + O(1)$. (Here I am glossing over a technical renormalisation needed to make the infinite series in (6) converge properly.) Meanwhile, the left-hand side of (6) is absolutely convergent for $s = 2 + iT$ and of size $O(1)$, and the claim follows. A more refined version of this argument shows that the number of non-trivial zeroes with $0 \le \mathrm{Im}(\rho) \le T$ is $\frac{T}{2\pi} \log \frac{T}{2\pi} - \frac{T}{2\pi} + O(\log T)$, but we will not need this more precise formula here. (A fair fraction of these zeroes, at least 40% in fact, are known to lie on the critical line; see this earlier blog post of mine for more discussion.)

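Another thing that we happen to know is how the *magnitude* $|\zeta(\frac{1}{2} + it)|$ of the zeta function is distributed as $t \to \infty$; it turns out to be log-normally distributed with log-variance about $\frac{1}{2} \log\log t$. More precisely, we have the following result of Selberg:

Theorem 1 Let $T$ be a large number, and let $t$ be chosen uniformly at random from between $T$ and $2T$ (say). Then the distribution of $\frac{1}{\sqrt{\frac{1}{2} \log\log T}} \log |\zeta(\frac{1}{2} + it)|$ converges (in distribution) to the normal distribution $N(0,1)$.

To put it more informally, $\log |\zeta(\frac{1}{2} + it)|$ behaves like $\sqrt{\frac{1}{2} \log\log t} \times N(0,1)$ plus lower order terms for “typical” large values of $t$. (Zeroes $\rho$ of $\zeta$ are, of course, certainly not typical, but one can show that one can usually stay away from these zeroes.) In fact, Selberg showed a slightly more precise result, namely that for any fixed $k \ge 1$, the $k^{\mathrm{th}}$ moment of $\frac{1}{\sqrt{\frac{1}{2} \log\log T}} \log |\zeta(\frac{1}{2} + it)|$ converges to the $k^{\mathrm{th}}$ moment of $N(0,1)$.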

Remarkably, Selberg’s result does not need RH or GUE, though it is certainly consistent with such hypotheses. (For instance, the determinant of a GUE matrix asymptotically obeys a remarkably similar log-normal law to that given by Selberg’s theorem.) Indeed, the net effect of these hypotheses only affects some error terms in $\log |\zeta(\frac{1}{2} + it)|$ of magnitude $O(1)$, and is thus asymptotically negligible compared to the main term, which has magnitude about $O(\sqrt{\log\log T})$. So Selberg’s result, while very pretty, manages to finesse the question of what the zeroes $\rho$ of $\zeta$ are actually doing: he makes the primes do most of the work, rather than the zeroes.

Selberg never actually published the above result, but it is reproduced in a number of places (e.g. in this book by Joyner, or this book by Laurincikas). As with many other results in analytic number theory, the actual details of the proof can get somewhat technical; but I would like to record here (partly for my own benefit) an informal sketch of some of the main ideas in the argument.

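The Riemann zeta function $\zeta(s)$, defined for $\mathrm{Re}(s) > 1$ by the formula

$$\zeta(s) := \sum_{n \in \mathbb{N}} \frac{1}{n^s} \tag{1}$$

where $\mathbb{N} = \{1, 2, \dots\}$ are the natural numbers, and extended meromorphically to other values of $s$ by analytic continuation, obeys the remarkable functional equation

$$\Xi(s) = \Xi(1-s) \tag{2}$$

where

$$\Xi(s) := \Gamma_\infty(s) \zeta(s) \tag{3}$$

is the Riemann Xi function,

$$\Gamma_\infty(s) := \pi^{-s/2} \Gamma(s/2) \tag{4}$$

is the *Gamma factor at infinity*, and the Gamma function $\Gamma(s)$ is defined for $\mathrm{Re}(s) > 0$ by

$$\Gamma(s) := \int_0^\infty e^{-t} t^s \frac{dt}{t} \tag{5}$$

and extended meromorphically to other values of $s$ by analytic continuation.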

There are many proofs known of the functional equation (2). One of them (dating back to Riemann himself) relies on the Poisson summation formula

$$\sum_{a \in \mathbb{Z}} f_\infty(a t_\infty) = \frac{1}{|t_\infty|_\infty} \sum_{a \in \mathbb{Z}} \hat f_\infty(a / t_\infty) \tag{6}$$

for the reals $k_\infty := \mathbb{R}$ and $t_\infty \in k_\infty^*$, where $f_\infty$ is a Schwartz function, $|t_\infty|_\infty := |t_\infty|$ is the usual Archimedean absolute value on $k_\infty$, and

$$\hat f_\infty(\xi_\infty) := \int_{k_\infty} e_\infty(-x_\infty \xi_\infty) f_\infty(x_\infty)\,dx_\infty \tag{7}$$

is the Fourier transform on $k_\infty$, with $e_\infty(x_\infty) := e^{2\pi i x_\infty}$ being the standard character on $k_\infty$. (The reason for this rather strange notation for the real line and its associated structures will be made clearer shortly.) Applying this formula to the (Archimedean) Gaussian function

$$g_\infty(x_\infty) := e^{-\pi |x_\infty|^2}, \tag{8}$$

which is its own (additive) Fourier transform, and then applying the *multiplicative* Fourier transform (i.e. the Mellin transform), one soon obtains (2). (Riemann also had another proof of the functional equation relying primarily on contour integration, which I will not discuss here.) One can “clean up” this proof a bit by replacing the Gaussian by a Dirac delta function, although one now has to work formally and “renormalise” by throwing away some infinite terms. (One can use the theory of distributions to make this latter approach rigorous, but I will not discuss this here.) Note how this proof combines the additive Fourier transform with the multiplicative Fourier transform. [Continuing with this theme, the Gamma function (5) is an inner product between an additive character $e^{-t}$ and a multiplicative character $t^s$, and the zeta function (1) can be viewed both additively, as a sum over $n$, or multiplicatively, as an Euler product.]
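For the Gaussian $e^{-\pi x^2}$, which is its own Fourier transform, Poisson summation reduces to the concrete identity $\sum_{a \in \mathbb{Z}} e^{-\pi a^2 t^2} = \frac{1}{|t|} \sum_{a \in \mathbb{Z}} e^{-\pi a^2 / t^2}$, and this can be checked directly. A minimal sketch follows; the function names and the truncation parameter `K` are my own, and the truncation is only safe when neither `t` nor `1/t` is very small.

```python
import math


def gaussian_sum(t, K=50):
    """sum over integers a of exp(-pi a^2 t^2), truncated to |a| <= K."""
    return sum(math.exp(-math.pi * (a * t) ** 2) for a in range(-K, K + 1))


def poisson_gap(t, K=50):
    """Left side minus right side of Poisson summation for the Gaussian at scale t."""
    return gaussian_sum(t, K) - gaussian_sum(1.0 / t, K) / abs(t)


if __name__ == "__main__":
    for t in (0.3, 0.7, 1.0, 2.5):
        print(t, gaussian_sum(t), poisson_gap(t))
```

The gap should vanish up to rounding error for every `t` in the safe range, even though the two sides converge at very different speeds: for small `t` the left side needs many terms while the right side is dominated by its `a = 0` term.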

In the famous thesis of Tate, the above argument was reinterpreted using the language of the adele ring $\mathbb{A}$, with the Poisson summation formula (6) on $k_\infty$ replaced by the Poisson summation formula

$$\sum_{a \in k} f(at) = \frac{1}{|t|} \sum_{a \in k} \hat f(a/t) \tag{9}$$

on $\mathbb{A}$, where $k = \mathbb{Q}$ is the rationals, $t \in \mathbb{A}^*$, and $f$ is now a Schwartz-Bruhat function on $\mathbb{A}$. Applying this formula to the adelic (or global) Gaussian function $g(x) := g_\infty(x_\infty) \prod_p 1_{\mathbb{Z}_p}(x_p)$, which is its own Fourier transform, and then using the adelic Mellin transform, one again obtains (2). Again, the proof can be cleaned up by replacing the Gaussian with a Dirac mass, at the cost of making the computations formal (or requiring the theory of distributions).

In this post I will write down both Riemann’s proof and Tate’s proof together (but omitting some technical details), to emphasise the fact that they are, in some sense, the same proof. However, Tate’s proof gives a high-level clarity to the situation (in particular, explaining more adequately why the Gamma factor at infinity (4) fits seamlessly with the Riemann zeta function (1) to form the Xi function (3)), and allows one to generalise the functional equation relatively painlessly to other zeta-functions and L-functions, such as Dedekind zeta functions and Hecke L-functions.

[Note: the material here is very standard in modern algebraic number theory; the post here is partially for my own benefit, as most treatments of this topic in the literature tend to operate in far higher levels of generality than I would prefer.]
