The Riemann zeta function , defined for by

and then continued meromorphically to other values of by analytic continuation, is a fundamentally important function in analytic number theory, as it is connected to the primes via the Euler product formula

(for , at least), where ranges over primes. (The equivalence between (1) and (2) is essentially the generating function version of the fundamental theorem of arithmetic.) The function has a pole at and a number of zeroes . A formal application of the factor theorem gives

where ranges over zeroes of , and we will be vague about what the factor is, how to make sense of the infinite product, and exactly which zeroes of are involved in the product. Equating (2) and (3) and taking logarithms gives the formal identity

and differentiating the above identity in yields the formal identity

where is the von Mangoldt function, defined to be when is a power of a prime , and zero otherwise. Thus we see that the behaviour of the primes (as encoded by the von Mangoldt function) is intimately tied to the distribution of the zeroes . For instance, if we knew that the zeroes were far away from the axis , then we would heuristically have

for real . On the other hand, the integral test suggests that

and thus we see that and have essentially the same (multiplicative) Fourier transform:

Inverting the Fourier transform (or performing a contour integral closely related to the inverse Fourier transform), one is led to the prime number theorem

In fact, the standard proof of the prime number theorem basically proceeds by making all of the above formal arguments precise and rigorous.

Unfortunately, we don’t know as much about the zeroes of the zeta function (and hence, about the function itself) as we would like. The Riemann hypothesis (RH) asserts that all the zeroes (except for the “trivial” zeroes at the negative even numbers) lie on the *critical line* ; this hypothesis would make the error terms in the above proof of the prime number theorem significantly more accurate. Furthermore, the stronger *GUE hypothesis* asserts in addition to RH that the local distribution of these zeroes on the critical line should behave like the local distribution of the eigenvalues of a random matrix drawn from the gaussian unitary ensemble (GUE). I will not give a precise formulation of this hypothesis here, except to say that the adjective “local” in the context of distribution of zeroes means something like “at scale when “.

Nevertheless, we do know some reasonably non-trivial facts about the zeroes and the zeta function , either unconditionally, or assuming RH (or GUE). Firstly, there are no zeroes for (as one can already see from the convergence of the Euler product (2) in this case) or for (this is trickier, relying on (6) and the elementary observation that

is non-negative for and ); from the functional equation

(which can be viewed as a consequence of the Poisson summation formula, see e.g. my blog post on this topic) we know that there are no zeroes for either (except for the trivial zeroes at negative even integers, corresponding to the poles of the Gamma function). Thus all the non-trivial zeroes lie in the *critical strip* .

We also know that there are infinitely many non-trivial zeroes, and can approximately count how many zeroes there are in any large bounded region of the critical strip. For instance, for large , the number of zeroes in this strip with is . This can be seen by applying (6) to (say); the trivial zeroes at the negative integers end up giving a contribution of to this sum (this is a heavily disguised variant of Stirling’s formula, as one can view the trivial zeroes as essentially being poles of the Gamma function), while the and terms end up being negligible (of size ), while each non-trivial zero contributes a term which has a non-negative real part, and furthermore has size comparable to if . (Here I am glossing over a technical renormalisation needed to make the infinite series in (6) converge properly.) Meanwhile, the left-hand side of (6) is absolutely convergent for and of size , and the claim follows. A more refined version of this argument shows that the number of non-trivial zeroes with is , but we will not need this more precise formula here. (A fair fraction – at least 40%, in fact – of these zeroes are known to lie on the critical line; see this earlier blog post of mine for more discussion.)

Another thing that we happen to know is how the *magnitude* of the zeta function is distributed as ; it turns out to be log-normally distributed with log-variance about . More precisely, we have the following result of Selberg:

Theorem 1Let be a large number, and let be chosen uniformly at random from between and (say). Then the distribution of converges (in distribution) to the normal distribution .

To put it more informally, behaves like plus lower order terms for “typical” large values of . (Zeroes of are, of course, certainly not typical, but one can show that one can usually stay away from these zeroes.) In fact, Selberg showed a slightly more precise result, namely that for any fixed , the moment of converges to the moment of .

Remarkably, Selberg’s result does not need RH or GUE, though it is certainly consistent with such hypotheses. (For instance, the determinant of a GUE matrix asymptotically obeys a remarkably similar log-normal law to that given by Selberg’s theorem.) Indeed, the net effect of these hypotheses only affects some error terms in of magnitude , and are thus asymptotically negligible compared to the main term, which has magnitude about . So Selberg’s result, while very pretty, manages to finesse the question of what the zeroes of are actually doing – he makes the primes do most of the work, rather than the zeroes.

Selberg never actually published the above result, but it is reproduced in a number of places (e.g. in this book by Joyner, or this book by Laurincikas). As with many other results in analytic number theory, the actual details of the proof can get somewhat technical; but I would like to record here (partly for my own benefit) an informal sketch of some of the main ideas in the argument.

** — 1. Informal overview of argument — **

The first step is to get a usable (approximate) formula for . On the one hand, from the second part of (4) one has

This formula turns out not to be directly useful because it requires one to know much more about the distribution of the zeroes than we currently possess. On the other hand, from the first part of (4) and (5) one also has the formula

This formula also turns out not to be directly useful, because it requires one to know much more about the distribution of the primes than we currently possess.

However, it turns out that we can “split the difference” between (7), (8), and get a formula for which involves some zeroes and some primes , in a manner that one can control them both. Roughly speaking, the formula looks like this:

for and , where is a small parameter that we can choose (e.g. ); thus we have localised the prime sum to the primes of size , and the zero sum to those zeroes at a distance from . (This is an oversimplification; there is a “tail” coming from those zeroes that are more distant from than , and also one has to smooth out the sum in a little bit, and allow the implied constants in the notation to depend on , but let us ignore these technical issues here, as well as the issue of what exactly is hiding in the error.)

It turns out that all of these expressions can be controlled. The error term coming from the zeroes (as well as the error term) turn out to be of size for most values of , so are a lower order term. (As mentioned before, it is this error term that would be better controlled if one had RH or GUE, but this is not necessary to establish Selberg’s result.) The main term is the one coming from the primes.

We can heuristically argue as follows. The expression , for ranging between and , is a random variable of mean zero and variance approximately (if and is small). Making the heuristic assumption that the behave as if they were independent, the central limit theorem then suggests that the sum should behave like a normal distribution of mean zero and variance . But the claim now follows from the classical estimate

(which follows from the prime number theorem, but can also be deduced from the formula (8) for , using the fact that has a simple pole at ).

To summarise, there are three main tasks to establish Selberg’s theorem:

- Establish a formula along the lines of (9);
- Show that the error terms arising from zeroes are on the average;
- Justify the central limit calculation for .

I’ll now briefly talk (informally) about each of the three steps in turn.

** — 2. The explicit formula — **

To get a formula such as (9), the basic strategy is to take a suitable average of the formula (7) and the formula (8). Traditionally, this is done by contour integration; however, I prefer (perhaps idiosyncratically) to take a more Fourier-analytic perspective, using convolutions rather than contour integrals. (The two approaches are largely equivalent, though.) The basic point is that the imaginary part of the zeroes inhabits the same space as the imaginary part of the variable, which in turn is the Fourier analytic dual of the variable that the logarithm of the primes live in; this can be seen by writing (7), (8) in a more Fourier-like manner as

(These sorts of Fourier-analytic connections are often summarised by the slogan “the zeroes of the zeta function are the music of the primes”.) The uncertainty principle then predicts that localising to the scale should result in blurring out the zeroes at scale , which is where (9) is going to come from.

Let’s see how this idea works in practice. We consider a convolution of the form

where is some bump function with total mass ; informally, this is averaged out in the vertical direction at scale (we allow implied constants to depend on ).

We can express (10) in two different ways, one using (7), and one using (8). Let’s look at (7) first. If one modifies by , then the quantity doesn’t fluctuate very much, unless is within of , in which case it can move by about . As a consequence, we see that

when , and

The quantity also doesn’t move very much by this shift (we are assuming the imaginary part of to be large). Inserting these facts into (7), we thus see that (10) is (heuristically) equal to

Now let’s compute (10) using (8) instead. Writing , we express (10) as

Introducing the Fourier transform of , one can write this as

Now we took to be a bump function, so its Fourier transform should also be like a bump function (or perhaps a Schwartz function). As a first approximation, one can thus think of as a smoothed truncation to the region , thus the weight is morally restricting to the region . Thus we (morally) can express (10) as

Comparing this with the other formula (11) we have for (10), we obtain (9) as required (formally, at least).

** — 3. Controlling the zeroes — **

Next, we want to show that the quantity

is on the average, when and is chosen uniformly at random from to .

For this, we can use the *first moment method*. For each zero , let be the random variable which equals when and zero otherwise, thus we are trying to control the expectation of . The only zeroes which are relevant are those which are of size , and we know that there are of these (indeed, we have an even more precise formula, as remarked earlier). On the other hand, a randomly chosen has a probability of of falling within of , and so we expect each to have an expected value of . (The logarithmic factor in the definition of turns out not to be an issue, basically because is locally integrable.) By linearity of expectation, we conclude that has expectation , and the claim follows.

Remark 1One can actually do a bit better than this, showing that higher order moments of are also , by using a variant of (9) together with the moment bounds in the next section; but we will not need that refinement here.

** — 4. The central limit theorem — **

Finally, we have to show that behaves like a normal distribution, as predicted by the central limit theorem heuristic. The key is to show that the behave “as if” they were jointly independent. In particular, as the all have mean zero, one would like to show that products such as

have a negligible expectation as long as at least one of the primes in occurs at most once. Once one has this (as well as a similar formula for the case when all primes appear at least twice), one can then do a standard moment computation of the moment and verify that this moment then matches the answer predicted by the central limit theorem, which by standard arguments (involving the Weierstrass approximation theorem) is enough to establish the distributional law. Note that to get close to the normal distribution by a fixed amount of accuracy, it suffices to control a bounded number of moments, which ultimately means that we can treat as being bounded, .

If we expand out the product (12), we get

Using the product formula for cosines (or Euler’s formula), the product of cosines here can be expressed as a linear combination of cosines , where the frequency takes the form

Thus, is the logarithm of a rational number, whose numerator and denominator are the product of some of the . Since all the are at most , we see that the numerator and denominator here are at most .

Now for the punchline. If there is a prime in that appears only once, then the numerator and denominator cannot fully cancel, by the fundamental theorem of arithmetic. Thus cannot be . Furthermore, since the denominator is at most , we see that must stay away from by a distance of about or more, and so has a wavelength of at most . On the other hand, ranges between and . If is fixed and is small enough (much smaller than ), we thus see that the average value of between and is close to zero, and so (12) does indeed have negligible expectation as claimed. (A similar argument lets one compute the expectation of (12) when all primes appear at least twice.)

Remark 2A famous theorem of Erdös and Kac gives a normal distribution for the number of prime factors of a large number , with mean and variance . One can view Selberg’s theorem as a sort of Fourier-analytic variant of the Erdös-Kac theorem.

Remark 3The Fourier-like correspondence between zeroes of the zeta function and primes can be used to convert statements about zeroes, such as the Riemann hypothesis and the GUE hypothesis, into equivalent statements about primes. For instance, the Riemann hypothesis is equivalent to having the square root error termin the prime number theorem holding asymptotically as for all and all intervals which are

largein the sense that is comparable to . Meanwhile, the pair correlation conjecture (the simplest component of the GUE hypothesis) is equivalent (on RH) to the square root error term holding (with the expected variance) for all andalmostall intervals which areshortin the sense that for some small (fixed) . (This is a rough statement; a more precise formulation can be found in this paper of Goldston and Montgomery.) It seems to me that reformulation of the full GUE hypothesis in terms of primes should be similar, but would assert that the error term in the prime number theorem (as well as variants of this theorem for almost primes) in short intervals enjoys the expected normal distribution; I don’t know of a precise formulation of this assertion, but calculations in this direction lie in this paper of Bogomolny and Keating.)

## 21 comments

Comments feed for this article

12 July, 2009 at 9:41 pm

AnonymousDear Prof. Tao,

Thank you very much for the post.

Could you please recommend an introductory level analytic number theory book?

thanks

12 July, 2009 at 11:20 pm

Emmanuel KowalskiThe connection between Selberg’s central limit theorem for zeta(s) and the Erdös-Kác theorem is quite interesting; as I’ve discussed recently (based on joint work with A. Nikgehbali and J. Jacod), they can both be interpreted as renormalized consequences of un-normalized limiting behavior of the relevant characteristic functions/Fourier transforms (although for zeta(s), this is only conditional, depending on a suitable form of the Random Matrix correspondance).

One difference is that the case of zeta(s) involves unitary matrices and gaussian variables, while the Erdös-Kác theorem involves permutations and Poisson variables. (See this preprint).

13 July, 2009 at 5:44 pm

AnonymousThere’s a small typo in ” that does appears only once”.

[Corrected, thanks -T.]13 July, 2009 at 7:32 pm

AnonymousI have a question about the truncation in (9). Isn’t it sub-optimal? We are truncating in physical space at while in frequency space at . But is, at least in most results regarding the zeta function, far away from . By the uncertainty principle shouldn’t we be able to truncate at without much (any?) impact on the tail terms in frequency space? Of course, you could argue that we aren’t going to improve the conclusion of the theorem so why does it matter, but it struck me as odd.

13 July, 2009 at 7:54 pm

AnonymousOh, i think i see, we are realy localizing at .

14 July, 2009 at 4:51 am

AnonymousThis question my be sightly off comment but out of curiosity…

A question regarding the Zeros of the Riemann’s Zeta.

I believe that Hardy proved the Riemann’s conjecture for an countable number of zeros.

I have been told that this has been improved so that the number of zeros is commensurate with the continuum!

Is this correct!??? If so please can you give any references!

Thanks in advance

6 August, 2009 at 12:33 am

hezhigangto infinite set, same cardinality can not imply equivalent.

card{n^2|n∈N}=card{n|n∈N}, but their elements are different in much detail.

14 July, 2009 at 8:03 am

Alex KontorovichDear Terry,

Thanks for the interesting post. Since you also recently posted about Benford’s law, I wanted to mention a paper with Steve (J.) Miller:

http://arxiv.org/abs/math.NT/0412003

in which we show that the values of $L$-functions near the critical line exhibit Benford behavior, using generalizations of Selberg’s theorem. In the same paper we show that iterates in the 3x+1 problem are also Benford; there is also a subsequent paper of Lagarias and Sound on the matter:

http://arxiv.org/abs/math/0509175

Best, Alex

14 July, 2009 at 3:18 pm

Matthew EmertonDear Anonymous (at 4:51 am, July 14),

Any non-zero meromorphic function has only countably many zeroes, since the zeroes are a discrete set. In particular, the zeta function has only countably

many zeroes on the critical line.

Hardy proved that an infinite number of zeroes lie on the critical line, and this

result was later improved (by Selberg) to show that a positive proportion of zeroes lies on the line.

15 July, 2009 at 7:03 am

AnonymousHi everyone!

Slightly off topic…

There appears to be a revised version of some papers on the Archive

A Proof of the Density Hypothesis

http://arxiv.org/abs/0810.2103

Yuan-You Fu-Rui Cheng

Anything to this

26 February, 2010 at 6:25 am

AnonymousThe revised version has been posted on Feb. 26, 2010.

13 December, 2012 at 7:59 pm

Anonymoushttp://arxiv.org/abs/0810.2103

Yuanyou Cheng and Sergio Albeverio

17 July, 2009 at 1:57 am

zhiganghesome naive question:

in formula (3), is {\rho} contains both trivial zeroes and complex zeroes?

why 1/(s-1)?

I understand it as s=1 is pole of zeta.

Lars V. Ahlfors said in Complex Analysis’ preface that ues of words ‘clearly’, ‘obviously’, ‘evidently’ to test the reader’s understanding.

as a layman to math, I can understand few lines of such professional blog

18 July, 2009 at 5:59 pm

AnonymousThank you for presenting this.

Minor detail:

Are there a few missing minus signs? For example, it’s minus the logarithmic derivative of zeta which gives the von Mangoldt function. I think this applies to the left hand sides of equations (4), (5) and (7) — maybe more?

[Corrected, thanks - T.]21 July, 2009 at 1:44 am

SikaDear professor, I believe that you have error/typo in remark 2.

sqrt(log log (n)) is standard deviation, not variance, for the number of prime factors of a large number n.

[Corrected, thanks - T.]Regards, sika

24 July, 2009 at 2:57 am

Michael Nielsen » Biweekly links for 07/24/2009[...] Selberg’s limit theorem for the Riemann zeta function on the critical line « What’s new [...]

7 August, 2009 at 5:59 am

ValentinaDear Prof. Tao,

Thank you very much for the notes!

31 January, 2010 at 7:40 am

Andrea AluTerry: congratulations on the King Faisal award. Will you be picking it up?

7 February, 2010 at 12:06 pm

AnonymousYou wrote: ‘(this is a heavily disguised variant of Stirling’s formula, as one can view the trivial zeroes as essentially being zeroes of the Gamma function)’

Did you mean ‘poles of the Gamma function’? (since the Gamma function doesn’t have zeros)

[Corrected, thanks - T.]25 June, 2010 at 6:44 pm

The uncertainty principle « What’s new[...] e.g. this blog post of mine for more discussion). Using such formulae, one can relate the zeroes of the zeta function in the [...]

21 December, 2010 at 6:15 pm

DuyDear Prof. Tao,

I tried to find the proof of Selberg’s CLT and I found your interesting discussion here. Thank you very much. Actually, I have the book of Laurincikas but it is not so clear for me. Could you tell me where I can find the complete argument as you post here?