
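The prime number theorem can be expressed as the assertion

$$\sum_{n \leq x} \Lambda(n) = x + o(x) \ \ \ \ \ (1)$$

as $x \to \infty$, where $\Lambda$ is the von Mangoldt function. It is a basic result in analytic number theory, but requires a bit of effort to prove. One “elementary” proof of this theorem proceeds through the Selberg symmetry formula

$$\sum_{n \leq x} \Lambda_2(n) = 2 x \log x + O(x) \ \ \ \ \ (2)$$

where the second von Mangoldt function $\Lambda_2$ is defined by the formula

$$\Lambda_2(n) := \Lambda(n) \log n + \sum_{d|n} \Lambda(d) \Lambda(\frac{n}{d}). \ \ \ \ \ (3)$$

(We are avoiding the use of the $*$ symbol here to denote Dirichlet convolution, as we will need this symbol to denote ordinary convolution shortly.) For the convenience of the reader, we give a proof of the Selberg symmetry formula below the fold. Actually, for the purposes of proving the prime number theorem, the weaker estimate

$$\sum_{n \leq x} \Lambda_2(n) = 2 x \log x + o(x \log x) \ \ \ \ \ (4)$$

suffices.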
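
As a numerical sanity check of the Selberg symmetry formula, here is a minimal Python sketch. It assumes the standard definition $\Lambda_2(n) = \Lambda(n) \log n + \sum_{d|n} \Lambda(d) \Lambda(n/d)$; the function names are illustrative only. It tabulates $\Lambda_2$ with a sieve and compares $\sum_{n \leq x} \Lambda_2(n)$ with $2 x \log x$. Because the error term is only $O(x)$, the ratio should approach $1$ slowly, at a rate of roughly $1/\log x$.

```python
import math


def von_mangoldt_table(limit):
    """Return lam with lam[n] = log p if n is a power of the prime p, else 0."""
    lam = [0.0] * (limit + 1)
    composite = [False] * (limit + 1)
    for p in range(2, limit + 1):
        if composite[p]:
            continue
        for m in range(p * p, limit + 1, p):
            composite[m] = True
        power = p
        while power <= limit:
            lam[power] = math.log(p)
            power *= p
    return lam


def second_von_mangoldt_table(limit):
    """Return lam2 with lam2[n] = Lambda(n) log n + sum_{d | n} Lambda(d) Lambda(n/d)."""
    lam = von_mangoldt_table(limit)
    lam2 = [0.0] * (limit + 1)
    for n in range(2, limit + 1):
        lam2[n] = lam[n] * math.log(n)
    # Only prime powers have non-zero Lambda, so only pairs of prime powers contribute.
    prime_powers = [n for n in range(2, limit + 1) if lam[n] > 0.0]
    for d in prime_powers:
        for k in prime_powers:
            if d * k > limit:
                break
            lam2[d * k] += lam[d] * lam[k]
    return lam2


def selberg_ratio(x):
    """Return (sum_{n <= x} Lambda_2(n)) / (2 x log x)."""
    return sum(second_von_mangoldt_table(x)) / (2 * x * math.log(x))


if __name__ == "__main__":
    for x in (1000, 10000, 100000):
        print(x, round(selberg_ratio(x), 4))
```

The ratios stay visibly below $1$ at these heights, which is consistent with a negative lower-order term of size comparable to $x$.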

In this post I would like to record a somewhat “soft analysis” reformulation of the elementary proof of the prime number theorem in terms of Banach algebras, and specifically in Banach algebra structures on (completions of) the space $C_c({\bf R})$ of compactly supported continuous functions $f: {\bf R} \to {\bf C}$ equipped with the convolution operation

$$f*g(t) := \int_{\bf R} f(u) g(t-u)\ du. \ \ \ \ \ (5)$$

This soft argument does not easily give any quantitative decay rate in the prime number theorem, but by the same token it avoids many of the quantitative calculations in the traditional proofs of this theorem. Ultimately, the key “soft analysis” fact used is the spectral radius formula

$$\lim_{n \to \infty} \|f^n\|^{1/n} = \sup_{\lambda \in \hat B} |\lambda(f)| \ \ \ \ \ (6)$$

for any element $f$ of a unital commutative Banach algebra $B$, where $\hat B$ is the space of characters (i.e., continuous unital algebra homomorphisms from $B$ to ${\bf C}$) of $B$. This formula is due to Gelfand and may be found in any text on Banach algebras; for the sake of completeness we prove it below the fold.
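
To make the spectral radius formula concrete, here is a small Python sketch in a toy setting that is not the algebra used in this post: the finite-dimensional algebra $\ell^1({\bf Z}/N{\bf Z})$ under cyclic convolution, whose characters are the $N$ discrete Fourier coefficients. The test vector and all function names are assumptions made for illustration. In this algebra one has $\max_\xi |\hat f(\xi)| \leq \|f^n\|^{1/n} \leq N^{1/n} \max_\xi |\hat f(\xi)|$, so the convergence can be seen directly.

```python
import cmath
import math


def cyclic_convolve(f, g):
    """Convolution on Z/N: (f*g)(k) = sum_j f(j) g(k - j mod N)."""
    n = len(f)
    return [sum(f[j] * g[(k - j) % n] for j in range(n)) for k in range(n)]


def l1_norm(f):
    return sum(abs(v) for v in f)


def character_values(f):
    """Values of the N characters of l^1(Z/N) on f (the discrete Fourier transform)."""
    n = len(f)
    return [
        sum(f[k] * cmath.exp(-2j * cmath.pi * k * xi / n) for k in range(n))
        for xi in range(n)
    ]


def gelfand_estimate(f, power):
    """Return ||f^{*power}||_1^{1/power}, rescaling at each step to avoid overflow."""
    g = list(f)
    log_norm = 0.0
    for _ in range(power - 1):
        scale = l1_norm(g)
        log_norm += math.log(scale)
        g = cyclic_convolve([v / scale for v in g], f)
    log_norm += math.log(l1_norm(g))
    return math.exp(log_norm / power)


if __name__ == "__main__":
    f = [1.0, 0.5, 0.0, 0.0, -0.25, 0.0, 0.0, 0.3]  # arbitrary test element
    radius = max(abs(v) for v in character_values(f))
    for power in (1, 4, 16, 64):
        print(power, gelfand_estimate(f, power), radius)
```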

The connection between prime numbers and Banach algebras is given by the following consequence of the Selberg symmetry formula.

Theorem 1 (Construction of a Banach algebra norm) For any , let denote the quantity Then is a seminorm on with the bound

for all . Furthermore, we have the Banach algebra bound

We prove this theorem below the fold. The prime number theorem then follows from Theorem 1 and the following two assertions. The first is an application of the spectral radius formula (6) and some basic Fourier analysis (in particular, the observation that contains a plentiful supply of local units):

Theorem 2 (Non-trivial Banach algebras with many local units have non-trivial spectrum) Let be a seminorm on obeying (7), (8). Suppose that is not identically zero. Then there exists such that for all . In particular, by (7), one has

whenever is a non-negative function.

The second is a consequence of the Selberg symmetry formula and the fact that is real (as well as Mertens’ theorem, in the case), and is closely related to the non-vanishing of the Riemann zeta function on the line :

Theorem 3 (Breaking the parity barrier) Let . Then there exists such that is non-negative, and

Assuming Theorems 1, 2, 3, we may now quickly establish the prime number theorem as follows. Theorem 2 and Theorem 3 imply that the seminorm constructed in Theorem 1 is trivial, and thus

as for any Schwartz function (the decay rate in may depend on ). Specialising to functions of the form for some smooth compactly supported on , we conclude that

as ; by the smooth Urysohn lemma this implies that

as for any fixed , and the prime number theorem then follows by a telescoping series argument.

The same argument also yields the prime number theorem in arithmetic progressions, or equivalently that

$$\sum_{n \leq x} \Lambda(n) \chi(n) = o(x)$$

for any fixed non-principal Dirichlet character $\chi$; the one difference is that the use of Mertens’ theorem is replaced by the basic fact that the quantity $L(1,\chi)$ is non-vanishing.
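
For a concrete feel for this character-twisted statement, the following Python sketch takes the non-principal character modulo $4$ as an illustrative choice (it is not singled out in the post) and computes $\sum_{n \leq x} \Lambda(n) \chi(n)$ relative to $x$. It also assumes that the non-vanishing quantity referred to above is $L(1,\chi) = \sum_n \chi(n)/n$, which for this character is the Leibniz series with value $\pi/4$.

```python
import math


def von_mangoldt_table(limit):
    """Return lam with lam[n] = log p if n is a power of the prime p, else 0."""
    lam = [0.0] * (limit + 1)
    composite = [False] * (limit + 1)
    for p in range(2, limit + 1):
        if composite[p]:
            continue
        for m in range(p * p, limit + 1, p):
            composite[m] = True
        power = p
        while power <= limit:
            lam[power] = math.log(p)
            power *= p
    return lam


def chi4(n):
    """Non-principal Dirichlet character modulo 4 (illustrative choice)."""
    if n % 2 == 0:
        return 0
    return 1 if n % 4 == 1 else -1


def twisted_chebyshev_sum(x):
    """Return sum_{n <= x} Lambda(n) chi4(n)."""
    lam = von_mangoldt_table(x)
    return sum(lam[n] * chi4(n) for n in range(1, x + 1))


def l_value_partial_sum(terms):
    """Partial sum of L(1, chi4) = 1 - 1/3 + 1/5 - ..."""
    return sum(chi4(n) / n for n in range(1, terms + 1))


if __name__ == "__main__":
    for x in (1000, 10000, 100000):
        print(x, twisted_chebyshev_sum(x) / x)
    print(l_value_partial_sum(10001), math.pi / 4)
```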

One of the most basic methods in additive number theory is the Hardy-Littlewood circle method. This method is based on expressing a quantity of interest to additive number theory, such as the number $f_3(x)$ of representations of an integer $x$ as the sum of three primes $x = p_1+p_2+p_3$, as a Fourier-analytic integral over the unit circle ${\bf R}/{\bf Z}$ involving exponential sums such as

$$S(x,\alpha) := \sum_{p \leq x} e(\alpha p) \ \ \ \ \ (1)$$

where the sum here ranges over all primes up to $x$, and $e(x) := e^{2\pi i x}$. For instance, the expression $f_3(x)$ mentioned earlier can be written as

$$f_3(x) = \int_{{\bf R}/{\bf Z}} S(x,\alpha)^3 e(-x\alpha)\ d\alpha. \ \ \ \ \ (2)$$

The strategy is then to obtain sufficiently accurate bounds on exponential sums such as $S(x,\alpha)$ in order to obtain non-trivial bounds on quantities such as $f_3(x)$. For instance, if one can show that $f_3(x) > 0$ for all odd integers $x$ greater than some given threshold $x_0$, this implies that all odd integers greater than $x_0$ are expressible as the sum of three primes, thus establishing all but finitely many instances of the odd Goldbach conjecture.
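
The Fourier-analytic identity behind the circle method can be tested on small inputs. The sketch below is a discretised stand-in for the integral over the unit circle: it averages over the points $a/M$ with $M = 3n+1$, which is large enough that no wrap-around occurs. It counts ordered triples of primes summing to $n$ and compares the result with a direct count. The helper names are hypothetical, and the brute-force Fourier sum is only practical for small $n$.

```python
import cmath


def primes_up_to(x):
    """Sieve of Eratosthenes: list of primes p <= x."""
    sieve = [True] * (x + 1)
    sieve[0:2] = [False, False]
    for p in range(2, int(x ** 0.5) + 1):
        if sieve[p]:
            for m in range(p * p, x + 1, p):
                sieve[m] = False
    return [n for n, flag in enumerate(sieve) if flag]


def exponential_sum(primes, alpha):
    """S(alpha) = sum over the given primes of e(alpha p), with e(t) = exp(2 pi i t)."""
    return sum(cmath.exp(2j * cmath.pi * alpha * p) for p in primes)


def three_prime_count_fourier(n):
    """Ordered triples (p1, p2, p3) of primes with p1 + p2 + p3 = n, via Fourier inversion."""
    primes = primes_up_to(n)
    modulus = 3 * n + 1  # exceeds every possible sum p1 + p2 + p3
    total = 0j
    for a in range(modulus):
        alpha = a / modulus
        total += exponential_sum(primes, alpha) ** 3 * cmath.exp(-2j * cmath.pi * alpha * n)
    return round((total / modulus).real)


def three_prime_count_direct(n):
    """The same count by brute force."""
    primes = primes_up_to(n)
    prime_set = set(primes)
    return sum(1 for p in primes for q in primes if n - p - q in prime_set)


if __name__ == "__main__":
    for n in (7, 21, 31):
        print(n, three_prime_count_fourier(n), three_prime_count_direct(n))
```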

Remark 1 In practice, it can be more efficient to work with smoother sums than the partial sum (1), for instance by replacing the cutoff with a smoother cutoff for a suitable choice of cutoff function, or by replacing the restriction of the summation to primes by a more analytically tractable weight, such as the von Mangoldt function $\Lambda$. However, these improvements to the circle method are primarily technical in nature and do not have much impact on the heuristic discussion in this post, so we will not emphasise them here. One can also certainly use the circle method to study additive combinations of numbers from other sets than the set of primes, but we will restrict attention to additive combinations of primes for the sake of discussion, as it is historically one of the most studied sets in additive number theory.

In many cases, it turns out that one can get fairly precise evaluations on sums such as $S(x,\alpha)$ in the *major arc* case, when $\alpha$ is close to a rational number $a/q$ with small denominator $q$, by using tools such as the prime number theorem in arithmetic progressions. For instance, the prime number theorem itself tells us that

$$S(x,0) = \sum_{p \leq x} 1 \approx \frac{x}{\log x}$$

and the prime number theorem in residue classes modulo $q$ suggests more generally that

$$S(x,\alpha) \approx \frac{\mu(q)}{\phi(q)} \frac{x}{\log x}$$

when $q$ is small and $\alpha$ is close to $a/q$ (with $a$ coprime to $q$), basically thanks to the elementary calculation that the phase $e(an/q)$ has an average value of $\mu(q)/\phi(q)$ when $n$ is uniformly distributed amongst the residue classes modulo $q$ that are coprime to $q$. Quantifying the precise error in these approximations can be quite challenging, though, unless one assumes powerful hypotheses such as the Generalised Riemann Hypothesis.
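
The elementary calculation mentioned here is the evaluation of a Ramanujan sum: for $a$ coprime to $q$, the average of $e(an/q)$ over the residue classes $n$ coprime to $q$ equals $\mu(q)/\phi(q)$. A short Python sketch (function names are assumptions for illustration) checks this identity and also compares it with the normalised prime sum $S(x,a/q)/\pi(x)$ at a modest height.

```python
import cmath
from math import gcd


def mobius(q):
    """Moebius function by trial division."""
    result = 1
    p = 2
    while p * p <= q:
        if q % p == 0:
            q //= p
            if q % p == 0:
                return 0  # q had a square factor
            result = -result
        p += 1
    if q > 1:
        result = -result
    return result


def totient(q):
    """Euler totient: number of residues 1 <= n <= q coprime to q."""
    return sum(1 for n in range(1, q + 1) if gcd(n, q) == 1)


def average_phase(a, q):
    """Average of e(a n / q) over the residue classes n mod q coprime to q."""
    units = [n for n in range(1, q + 1) if gcd(n, q) == 1]
    return sum(cmath.exp(2j * cmath.pi * a * n / q) for n in units) / len(units)


def primes_up_to(x):
    sieve = [True] * (x + 1)
    sieve[0:2] = [False, False]
    for p in range(2, int(x ** 0.5) + 1):
        if sieve[p]:
            for m in range(p * p, x + 1, p):
                sieve[m] = False
    return [n for n, flag in enumerate(sieve) if flag]


def normalised_prime_sum(x, a, q):
    """S(x, a/q) divided by the number of primes up to x."""
    primes = primes_up_to(x)
    return sum(cmath.exp(2j * cmath.pi * a * p / q) for p in primes) / len(primes)


if __name__ == "__main__":
    for a, q in ((1, 3), (1, 4), (2, 5), (1, 6)):
        print(q, normalised_prime_sum(50000, a, q), mobius(q) / totient(q))
```

Agreement is only approximate at finite height, since the primes dividing $q$ and the fluctuations between residue classes both contribute small errors.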

In the *minor arc* case when $\alpha$ is not close to a rational $a/q$ with small denominator, one no longer expects to have such precise control on the value of $S(x,\alpha)$, due to the “pseudorandom” fluctuations of the quantity $e(\alpha p)$. Using the standard probabilistic heuristic (supported by results such as the central limit theorem or Chernoff’s inequality) that the sum of $k$ “pseudorandom” phases should fluctuate randomly and be of typical magnitude $\sim \sqrt{k}$, one expects upper bounds of the shape

$$|S(x,\alpha)| \lessapprox \sqrt{\frac{x}{\log x}} \ \ \ \ \ (3)$$

for “typical” minor arc $\alpha$. Indeed, a simple application of the Plancherel identity, followed by the prime number theorem, reveals that

$$\int_{{\bf R}/{\bf Z}} |S(x,\alpha)|^2\ d\alpha = \sum_{p \leq x} 1 \sim \frac{x}{\log x} \ \ \ \ \ (4)$$

which is consistent with (though weaker than) the above heuristic. In practice, though, we are unable to rigorously establish bounds anywhere near as strong as (3); upper bounds such as are far more typical.
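
The Plancherel computation can be checked exactly in a discretised form. In the sketch below (helper names are illustrative assumptions), the integral over the unit circle is replaced by an average over the points $a/M$ with $M = x+1$, for which the mean square of the prime exponential sum equals the number of primes up to $x$ exactly. The last line evaluates the sum at one irrational frequency for comparison with the square root of that count; the choice $\alpha = \sqrt{2}$ is arbitrary.

```python
import cmath
import math


def primes_up_to(x):
    sieve = [True] * (x + 1)
    sieve[0:2] = [False, False]
    for p in range(2, int(x ** 0.5) + 1):
        if sieve[p]:
            for m in range(p * p, x + 1, p):
                sieve[m] = False
    return [n for n, flag in enumerate(sieve) if flag]


def exponential_sum(primes, alpha):
    """Sum over the given primes of e(alpha p), with e(t) = exp(2 pi i t)."""
    return sum(cmath.exp(2j * cmath.pi * alpha * p) for p in primes)


def mean_square_of_prime_sum(x):
    """Return (average of |S(x, a/M)|^2 over a mod M, number of primes up to x), M = x + 1."""
    primes = primes_up_to(x)
    modulus = x + 1  # distinct primes up to x stay distinct modulo M
    total = sum(abs(exponential_sum(primes, a / modulus)) ** 2 for a in range(modulus))
    return total / modulus, len(primes)


if __name__ == "__main__":
    mean_square, prime_count = mean_square_of_prime_sum(500)
    print(mean_square, prime_count)
    primes = primes_up_to(500)
    print(abs(exponential_sum(primes, math.sqrt(2))), math.sqrt(prime_count))
```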

Because one only expects to have upper bounds on , rather than asymptotics, in the minor arc case, one cannot realistically hope to make much use of phases such as for the minor arc contribution to integrals such as (2) (at least if one is working with a single, deterministic, value of , so that averaging in is unavailable). In particular, from upper bound information alone, it is difficult to avoid the “conspiracy” that the magnitude oscillates in sympathetic resonance with the phase , thus essentially eliminating almost all of the possible gain in the bounds that could arise from exploiting cancellation from that phase. Thus, one basically has little option except to use the triangle inequality to control the portion of the integral on the minor arc region :

Despite this handicap, though, it is still possible to get enough bounds on both the major and minor arc contributions of integrals such as (2) to obtain non-trivial lower bounds on quantities such as , at least when is large. In particular, this sort of method can be developed to give a proof of Vinogradov’s famous theorem that every sufficiently large odd integer is the sum of three primes; my own result that all odd numbers greater than can be expressed as the sum of at most five primes is also proven by essentially the same method (modulo a number of minor refinements, and taking advantage of some numerical work on both the Goldbach problems and on the Riemann hypothesis ). It is certainly conceivable that some further variant of the circle method (again combined with a suitable amount of numerical work, such as that of numerically establishing zero-free regions for the Generalised Riemann Hypothesis) can be used to settle the full odd Goldbach conjecture; indeed, under the assumption of the Generalised Riemann Hypothesis, this was already achieved by Deshouillers, Effinger, te Riele, and Zinoviev back in 1997. I am optimistic that an unconditional version of this result will be possible within a few years or so, though I should say that there are still significant technical challenges to doing so, and some clever new ideas will probably be needed to get either the Vinogradov-style argument or numerical verification to work unconditionally for the three-primes problem at medium-sized ranges of , such as . (But the intermediate problem of representing all even natural numbers as the sum of at most four primes looks somewhat closer to being feasible, though even this would require some substantially new and non-trivial ideas beyond what is in my five-primes paper.)

However, I (and many other analytic number theorists) am considerably more skeptical that the circle method can be applied to the even Goldbach problem of representing a large even number $x$ as the sum of two primes, or the similar (and marginally simpler) twin prime conjecture of finding infinitely many pairs of twin primes, i.e. finding infinitely many representations of $2$ as the *difference* of two primes. At first glance, the situation looks tantalisingly similar to that of the Vinogradov theorem: to settle the even Goldbach problem for large $x$, one has to find a non-trivial lower bound for the quantity

$$f_2(x) = \int_{{\bf R}/{\bf Z}} S(x,\alpha)^2 e(-x\alpha)\ d\alpha \ \ \ \ \ (5)$$

for sufficiently large $x$, as this quantity is also the number of ways to represent $x$ as the sum of two primes $x = p_1 + p_2$. Similarly, to settle the twin prime problem, it would suffice to obtain a lower bound for the quantity

$$\tilde f_2(x) = \int_{{\bf R}/{\bf Z}} |S(x,\alpha)|^2 e(-2\alpha)\ d\alpha \ \ \ \ \ (6)$$

that goes to infinity as $x \to \infty$, as this quantity is also the number of ways to represent $2$ as the difference $2 = p_1 - p_2$ of two primes less than or equal to $x$.
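
Both binary counting identities can be verified numerically for small parameters. The following Python sketch again uses a discrete average over $a/M$ with $M$ large enough to avoid wrap-around in place of the integral, and compares the Fourier expressions for the Goldbach count and the twin prime count with brute-force counts. The function names are hypothetical, and the Goldbach count is of ordered pairs.

```python
import cmath


def primes_up_to(x):
    sieve = [True] * (x + 1)
    sieve[0:2] = [False, False]
    for p in range(2, int(x ** 0.5) + 1):
        if sieve[p]:
            for m in range(p * p, x + 1, p):
                sieve[m] = False
    return [n for n, flag in enumerate(sieve) if flag]


def exponential_sum(primes, alpha):
    return sum(cmath.exp(2j * cmath.pi * alpha * p) for p in primes)


def goldbach_count_fourier(n):
    """Ordered pairs of primes (p1, p2) with p1 + p2 = n."""
    primes = primes_up_to(n)
    modulus = 2 * n + 1
    total = 0j
    for a in range(modulus):
        alpha = a / modulus
        total += exponential_sum(primes, alpha) ** 2 * cmath.exp(-2j * cmath.pi * alpha * n)
    return round((total / modulus).real)


def twin_prime_count_fourier(x):
    """Pairs of primes p1, p2 <= x with p1 - p2 = 2."""
    primes = primes_up_to(x)
    modulus = 2 * x + 1
    total = 0j
    for a in range(modulus):
        alpha = a / modulus
        total += abs(exponential_sum(primes, alpha)) ** 2 * cmath.exp(-2j * cmath.pi * alpha * 2)
    return round((total / modulus).real)


def goldbach_count_direct(n):
    prime_set = set(primes_up_to(n))
    return sum(1 for p in prime_set if n - p in prime_set)


def twin_prime_count_direct(x):
    prime_set = set(primes_up_to(x))
    return sum(1 for p in prime_set if p - 2 in prime_set)


if __name__ == "__main__":
    print(goldbach_count_fourier(20), goldbach_count_direct(20))
    print(twin_prime_count_fourier(100), twin_prime_count_direct(100))
```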

In principle, one can achieve either of these two objectives by a sufficiently fine level of control on the exponential sums . Indeed, there is a trivial (and uninteresting) way to take any (hypothetical) solution of either the asymptotic even Goldbach problem or the twin prime problem and (artificially) convert it to a proof that “uses the circle method”; one simply begins with the quantity or , expresses it in terms of using (5) or (6), and then uses (5) or (6) again to convert these integrals back into the combinatorial expression of counting solutions to or , and then uses the hypothetical solution to the given problem to obtain the required lower bounds on or .

Of course, this would not qualify as a genuine application of the circle method by any reasonable measure. One can then ask the more refined question of whether one could hope to get non-trivial lower bounds on or (or similar quantities) purely from the upper and lower bounds on or similar quantities (and of various type norms on such quantities, such as the bound (4)). Of course, we do not yet know what the strongest possible upper and lower bounds in are yet (otherwise we would already have made progress on major conjectures such as the Riemann hypothesis); but we can make plausible heuristic conjectures on such bounds. And this is enough to make the following heuristic conclusions:

- (i) For “binary” problems such as computing (5), (6), the contribution of the minor arcs potentially dominates that of the major arcs (if all one is given about the minor arc sums is magnitude information), in contrast to “ternary” problems such as computing (2), in which it is the major arc contribution which is absolutely dominant.
- (ii) Upper and lower bounds on the magnitude of are not sufficient, by themselves, to obtain non-trivial bounds on (5), (6) unless these bounds are *extremely* tight (within a relative error of or better); but
- (iii) obtaining such tight bounds is a problem of comparable difficulty to the original binary problems.

I will provide some justification for these conclusions below the fold; they are reasonably well known “folklore” to many researchers in the field, but it seems that they are rarely made explicit in the literature (in part because these arguments are, by their nature, heuristic instead of rigorous) and I have been asked about them from time to time, so I decided to try to write them down here.

In view of the above conclusions, it seems that the best one can hope to do by using the circle method for the twin prime or even Goldbach problems is to reformulate such problems into a statement of roughly comparable difficulty to the original problem, even if one assumes powerful conjectures such as the Generalised Riemann Hypothesis (which gives very precise control on major arc exponential sums, but not on minor arc ones). These are not rigorous conclusions – after all, we have already seen that one can always artificially insert the circle method into any viable approach on these problems – but they do strongly suggest that one needs a method other than the circle method in order to fully solve either of these two problems. I do not know what such a method would be, though I can give some heuristic objections to some of the other popular methods used in additive number theory (such as sieve methods, or more recently the use of inverse theorems); this will be done at the end of this post.

A fundamental problem in analytic number theory is to understand the distribution of the prime numbers $\{2,3,5,\ldots\}$. For technical reasons, it is convenient not to study the primes directly, but a proxy for the primes known as the von Mangoldt function $\Lambda: {\bf N} \to {\bf R}$, defined by setting $\Lambda(n)$ to equal $\log p$ when $n$ is a prime $p$ (or a power of that prime) and zero otherwise. The basic reason why the von Mangoldt function is useful is that it encodes the fundamental theorem of arithmetic (which in turn can be viewed as the defining property of the primes) very neatly via the identity

$$\log n = \sum_{d|n} \Lambda(d).$$

The most important result in this subject is the prime number theorem, which asserts that the number of prime numbers less than a large number $x$ is equal to $(1+o(1)) \frac{x}{\log x}$:

$$\sum_{p \leq x} 1 = (1+o(1)) \frac{x}{\log x}.$$

Here, of course, $o(1)$ denotes a quantity that goes to zero as $x \to \infty$.

It is not hard to see (e.g. by summation by parts) that this is equivalent to the asymptotic

$$\sum_{n \leq x} \Lambda(n) = (1+o(1)) x$$

for the von Mangoldt function (the key point being that the squares, cubes, etc. of primes give a negligible contribution, so $\sum_{n \leq x} \Lambda(n)$ is essentially the same quantity as $\sum_{p \leq x} \log p$). Understanding the nature of the $o(1)$ term is a very important problem, with the conjectured optimal decay rate of $O(x^{-1/2} \log^2 x)$ being equivalent to the Riemann hypothesis, but this will not be our concern here.
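
The divisor-sum identity for the von Mangoldt function and the two forms of the prime number theorem are easy to test numerically. The Python sketch below (function names are illustrative assumptions) builds $\Lambda$ with a sieve, checks that $\sum_{d|n} \Lambda(d) = \log n$ for small $n$, and prints the ratios $\frac{1}{x} \sum_{n \leq x} \Lambda(n)$ and $\frac{\log x}{x} \sum_{p \leq x} 1$. The first ratio should be close to $1$ already at modest heights, while the second converges more slowly.

```python
import math


def primes_and_von_mangoldt(limit):
    """Return (list of primes <= limit, table lam with lam[n] = Lambda(n))."""
    lam = [0.0] * (limit + 1)
    composite = [False] * (limit + 1)
    primes = []
    for p in range(2, limit + 1):
        if composite[p]:
            continue
        primes.append(p)
        for m in range(p * p, limit + 1, p):
            composite[m] = True
        power = p
        while power <= limit:
            lam[power] = math.log(p)
            power *= p
    return primes, lam


def log_identity_holds(limit):
    """Check log n = sum_{d | n} Lambda(d) for every 1 <= n <= limit."""
    _, lam = primes_and_von_mangoldt(limit)
    for n in range(1, limit + 1):
        divisor_sum = sum(lam[d] for d in range(1, n + 1) if n % d == 0)
        if abs(divisor_sum - math.log(n)) > 1e-9:
            return False
    return True


def pnt_ratios(x):
    """Return (psi(x) / x, pi(x) log(x) / x)."""
    primes, lam = primes_and_von_mangoldt(x)
    return sum(lam) / x, len(primes) * math.log(x) / x


if __name__ == "__main__":
    print(log_identity_holds(300))
    for x in (1000, 10000, 100000):
        print(x, pnt_ratios(x))
```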

The prime number theorem has several important generalisations (for instance, there are analogues for other number fields such as the Chebotarev density theorem). One of the more elementary such generalisations is the prime number theorem in arithmetic progressions, which asserts that for fixed $a$ and $q$ with $a$ coprime to $q$ (thus $(a,q)=1$), the number of primes less than $x$ equal to $a$ mod $q$ is equal to $(1+o_q(1)) \frac{1}{\phi(q)} \frac{x}{\log x}$, where $\phi(q) := \# \{ 1 \leq a \leq q: (a,q)=1 \}$ is the Euler totient function:

$$\sum_{p \leq x: p = a \hbox{ mod } q} 1 = (1+o_q(1)) \frac{1}{\phi(q)} \frac{x}{\log x}.$$

(Of course, if $a$ is not coprime to $q$, the number of primes less than $x$ equal to $a$ mod $q$ is $O(1)$. The subscript $q$ in the $o()$ and $O()$ notation denotes that the implied constants in that notation are allowed to depend on $q$.) This is a more quantitative version of Dirichlet’s theorem, which asserts the weaker statement that the number of primes equal to $a$ mod $q$ is infinite. This theorem is important in many applications in analytic number theory, for instance in Vinogradov’s theorem that every sufficiently large odd number is the sum of three odd primes. (Imagine for instance if almost all of the primes were clustered in the residue class $2$ mod $3$, rather than $1$ mod $3$. Then almost all sums of three odd primes would be divisible by $3$, leaving dangerously few sums left to cover the remaining two residue classes. Similarly for other moduli than $3$. This does not fully rule out the possibility that Vinogradov’s theorem could still be true, but it does indicate why the prime number theorem in arithmetic progressions is a relevant tool in the proof of that theorem.)
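
The equidistribution of primes among the residue classes coprime to the modulus can be observed directly. The following Python sketch (with hypothetical function names and the modulus $10$ chosen purely for illustration) counts primes up to $x$ in each coprime residue class and compares the counts with the prediction of an equal share for each of the $\phi(q)$ classes.

```python
from math import gcd


def primes_up_to(x):
    sieve = [True] * (x + 1)
    sieve[0:2] = [False, False]
    for p in range(2, int(x ** 0.5) + 1):
        if sieve[p]:
            for m in range(p * p, x + 1, p):
                sieve[m] = False
    return [n for n, flag in enumerate(sieve) if flag]


def primes_by_residue(x, q):
    """Count primes p <= x in each residue class a mod q with gcd(a, q) = 1."""
    counts = {a: 0 for a in range(q) if gcd(a, q) == 1}
    for p in primes_up_to(x):
        if p % q in counts:
            counts[p % q] += 1
    return counts


if __name__ == "__main__":
    x, q = 100000, 10
    counts = primes_by_residue(x, q)
    expected = sum(counts.values()) / len(counts)
    for a in sorted(counts):
        print(a, counts[a], round(counts[a] / expected, 4))
```

The finitely many primes dividing $q$ are excluded from the counts, in line with the remark that non-coprime classes contain only $O(1)$ primes.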

As before, one can rewrite the prime number theorem in arithmetic progressions in terms of the von Mangoldt function as the equivalent form

$$\sum_{n \leq x: n = a \hbox{ mod } q} \Lambda(n) = (1+o_q(1)) \frac{1}{\phi(q)} x.$$

Philosophically, one of the main reasons why it is so hard to control the distribution of the primes is that we do not currently have too many tools with which one can rule out “conspiracies” between the primes, in which the primes (or the von Mangoldt function) decide to correlate with some structured object (and in particular, with a totally multiplicative function) which then visibly distorts the distribution of the primes. For instance, one could imagine a scenario in which the probability that a randomly chosen large integer $n$ is prime is not asymptotic to $\frac{1}{\log n}$ (as is given by the prime number theorem), but instead fluctuates depending on the phase of the complex number $n^{it} = e^{it \log n}$ for some fixed real number $t$, thus for instance the probability might be significantly less than $\frac{1}{\log n}$ when $\frac{t \log n}{2\pi}$ is close to an integer, and significantly more than $\frac{1}{\log n}$ when $\frac{t \log n}{2\pi}$ is close to a half-integer. This would contradict the prime number theorem, and so this scenario would have to be somehow eradicated in the course of proving that theorem. In the language of Dirichlet series, this conspiracy is more commonly known as a zero of the Riemann zeta function at $1+it$.

In the above scenario, the primality of a large integer $n$ was somehow sensitive to asymptotic or “Archimedean” information about $n$, namely the approximate value of its logarithm. In modern terminology, this information reflects the local behaviour of $n$ at the infinite place $\infty$. There are also potential conspiracies in which the primality of $n$ is sensitive to the local behaviour of $n$ at finite places, and in particular to the residue class of $n$ mod $q$ for some fixed modulus $q$. For instance, given a Dirichlet character $\chi$ of modulus $q$, i.e. a completely multiplicative function on the integers which is periodic of period $q$ (and vanishes on those integers not coprime to $q$), one could imagine a scenario in which the probability that a randomly chosen large integer $n$ is prime is large when $\chi(n)$ is close to $+1$, and small when $\chi(n)$ is close to $-1$, which would contradict the prime number theorem in arithmetic progressions. (Note the similarity between this scenario at $q$ and the previous scenario at $\infty$; in particular, observe that the functions $n \mapsto \chi(n)$ and $n \mapsto n^{it}$ are both totally multiplicative.) In the language of Dirichlet series, this conspiracy is more commonly known as a zero of the $L$-function of $\chi$ at $s=1$.

An especially difficult scenario to eliminate is that of *real characters*, such as the Kronecker symbol , in which numbers which are quadratic nonresidues mod are very likely to be prime, and quadratic residues mod are unlikely to be prime. Indeed, there is a scenario of this form – the Siegel zero scenario – which we are still not able to eradicate (without assuming powerful conjectures such as GRH), though fortunately Siegel zeroes are not quite strong enough to destroy the prime number theorem in arithmetic progressions.

It is difficult to prove that no conspiracy between the primes exists. However, it is not entirely impossible, because we have been able to exploit two important phenomena. The first is that there is often an “all or nothing dichotomy” (somewhat resembling the *zero-one laws* in probability) regarding conspiracies: in the asymptotic limit, the primes can either conspire totally (or more precisely, anti-conspire totally) with a multiplicative function, or fail to conspire at all, but there is no middle ground. (In the language of Dirichlet series, this is reflected in the fact that zeroes of a meromorphic function can have order $1$, or order $0$ (i.e. are not zeroes after all), but cannot have an intermediate order between $0$ and $1$.) As a corollary of this fact, the prime numbers cannot conspire with two distinct multiplicative functions at once (by having a partial correlation with one and another partial correlation with another); thus one can use the existence of one conspiracy to exclude all the others. In other words, there is at most one conspiracy that can significantly distort the distribution of the primes. Unfortunately, this argument is *ineffective*, because it doesn’t give any control at all on what that conspiracy is, or even if it exists in the first place!

But now one can use the second important phenomenon, which is that because of symmetries, one type of conspiracy can lead to another. For instance, because the von Mangoldt function is real-valued rather than complex-valued, we have conjugation symmetry; if the primes correlate with, say, $n^{it}$, then they must also correlate with $n^{-it}$. (In the language of Dirichlet series, this reflects the fact that the zeta function and $L$-functions enjoy symmetries with respect to reflection across the real axis (i.e. complex conjugation).) Combining this observation with the all-or-nothing dichotomy, we conclude that the primes cannot correlate with $n^{it}$ for any non-zero $t$, which in fact leads directly to the prime number theorem (2), as we shall discuss below. Similarly, if the primes correlated with a Dirichlet character $\chi$, then they would also correlate with the conjugate $\overline{\chi}$, which also is inconsistent with the all-or-nothing dichotomy, except in the exceptional case when $\chi$ is real – which essentially means that $\chi$ is a quadratic character. In this one case (which is the only scenario which comes close to threatening the truth of the prime number theorem in arithmetic progressions), the above tricks fail and one has to instead exploit the algebraic number theory properties of these characters, which has so far led to weaker results than in the non-real case.

As mentioned previously in passing, these phenomena are usually presented using the language of Dirichlet series and complex analysis. This is a very slick and powerful way to do things, but I would like here to present the elementary approach to the same topics, which is slightly weaker but which I find to also be very instructive. (However, I will not be *too* dogmatic about keeping things elementary, if this comes at the expense of obscuring the key ideas; in particular, I will rely on multiplicative Fourier analysis (both at and at finite places) as a substitute for complex analysis in order to expedite various parts of the argument. Also, the emphasis here will be more on heuristics and intuition than on rigour.)

The material here is closely related to the theory of *pretentious characters* developed by Granville and Soundararajan, as well as an earlier paper of Granville on elementary proofs of the prime number theorem in arithmetic progressions.

Atle Selberg, who made immense and fundamental contributions to analytic number theory and related areas of mathematics, died last Monday, aged 90.

Selberg’s early work was focused on the study of the Riemann zeta function $\zeta(s)$. In 1942, Selberg showed that a positive fraction of the zeroes of this function lie on the critical line $\hbox{Re}(s) = 1/2$. Apart from improvements in the fraction (the best value currently being a little over 40%, a result of Conrey), this is still one of the strongest partial results we have towards the Riemann hypothesis. (I discuss Selberg’s result, and the method of mollifiers he introduced there, in a little more detail after the jump.)

In working on the zeta function, Selberg developed two powerful tools which are still used routinely in analytic number theory today. The first is the method of mollifiers to smooth out the magnitude oscillations of the zeta function, making the (more interesting) phase oscillation more visible. The second was the method of the Selberg sieve, which is a particularly elegant choice of sieve which allows one to count patterns in almost primes (and hence to upper bound patterns in primes) quite accurately. Variants of the Selberg sieve were a crucial ingredient in, for instance, the recent work of Goldston-Yıldırım-Pintz on prime gaps, as well as the work of Ben Green and myself on arithmetic progressions in primes. (I discuss the Selberg sieve, as well as the Selberg symmetry formula below, in my post on the parity problem. Incidentally, Selberg was the first to formalise this problem as a significant obstruction in sieve theory.)

For all of these achievements, Selberg was awarded the Fields Medal in 1950. Around that time, Selberg and Erdős also produced the first elementary proof of the prime number theorem. A key ingredient here was the Selberg symmetry formula, which is an elementary analogue of the prime number theorem for almost primes.

But perhaps Selberg’s greatest contribution to mathematics was his discovery of the Selberg trace formula, which is a non-abelian generalisation of the Poisson summation formula, and which led to many further deep connections between representation theory and number theory, and in particular being one of the main inspirations for the Langlands program, which in turn has had an impact on many different parts of mathematics (for instance, it plays a role in Wiles’ proof of Fermat’s last theorem). For an introduction to the trace formula, its history, and its impact, I recommend the survey article of Arthur.

Other major contributions of Selberg include the Rankin-Selberg theory connecting Artin L-functions from representation theory to the integrals of automorphic forms (very much in the spirit of the Langlands program), and the Chowla-Selberg formula relating the Gamma function at rational values to the periods of elliptic curves with complex multiplication. He also made an influential conjecture on the spectral gap of the Laplacian on quotients of by congruence groups, which is still open today (Selberg had the first non-trivial partial result). As an example of this conjecture’s impact, Selberg’s eigenvalue conjecture has inspired some recent work of Sarnak-Xue, Gamburd, and Bourgain-Gamburd on new constructions of expander graphs, and has revealed some further connections between number theory and arithmetic combinatorics (such as sum-product theorems); see this announcement of Bourgain-Gamburd-Sarnak for the most recent developments (this work, incidentally, also employs the Selberg sieve). As observed by Satake, Selberg’s eigenvalue conjecture and the more classical Ramanujan-Petersson conjecture can be unified into a single conjecture, now known as the *Ramanujan-Selberg conjecture*; the eigenvalue conjecture is then essentially an archimedean (or “non-dyadic”) special case of the general Ramanujan-Selberg conjecture. (The original (dyadic) Ramanujan-Petersson conjecture was finally proved by Deligne-Serre, after many important contributions by other authors, but the non-dyadic version remains open.)
