Problem 1 (Erdös-Ulam problem) Let $S \subset {\bf R}^2$ be a set such that the distance between any two points in $S$ is rational. Is it true that $S$ cannot be (topologically) dense in ${\bf R}^2$?

The paper of Anning and Erdös answered this question in the affirmative in the case that all the distances between two points in $S$ are integers rather than rationals.

The Erdös-Ulam problem remains open; it was discussed recently over at Gödel’s lost letter. It is in fact likely (as we shall see below) that the set $S$ in the above problem is not only forbidden to be topologically dense, but cannot be Zariski dense either. If so, then the structure of $S$ is quite restricted; it was shown by Solymosi and de Zeeuw that if $S$ fails to be Zariski dense, then all but finitely many of the points of $S$ must lie on a single line, or a single circle. (Conversely, it is easy to construct examples of dense subsets of a line or circle in which all distances are rational.)

The main tool of the Solymosi-de Zeeuw analysis was Faltings’ celebrated theorem that every algebraic curve of genus at least two contains only finitely many rational points. The purpose of this post is to observe that an affirmative answer to the full Erdös-Ulam problem similarly follows from the conjectured analogue of Faltings’ theorem for surfaces, namely the following conjecture of Bombieri and Lang:

Conjecture 2 (Bombieri-Lang conjecture) Let $X$ be a smooth projective irreducible algebraic surface defined over the rationals ${\bf Q}$ which is of general type. Then the set $X({\bf Q})$ of rational points of $X$ is not Zariski dense in $X$.

In fact, the Bombieri-Lang conjecture has been made for varieties of arbitrary dimension, and for more general number fields than the rationals, but the above special case of the conjecture is the only one needed for this application. We will review what “general type” means (for smooth projective complex varieties, at least) below the fold.

The Bombieri-Lang conjecture is considered to be extremely difficult, in particular being substantially harder than Faltings’ theorem, which is itself a highly non-trivial result. So this implication should not be viewed as a practical route to resolving the Erdös-Ulam problem unconditionally; rather, it is a demonstration of the power of the Bombieri-Lang conjecture. Still, it was an instructive algebraic geometry exercise for me to carry out the details of this implication, which quickly boils down to verifying that a certain quite explicit algebraic surface is of general type (Theorem 3 below). As I am not an expert in algebraic geometry, my computations here will be rather tedious and pedestrian; it is likely that they could be made much slicker by exploiting more of the machinery of modern algebraic geometry, and I would welcome any such streamlining by actual experts in this area. (For similar reasons, there may be more typos and errors than usual in this post; corrections are welcome as always.) My calculations here are based on a similar calculation of van Luijk, who used analogous arguments to show (assuming Bombieri-Lang) that the set of perfect cuboids is not Zariski-dense in its projective parameter space.

We also remark that in a recent paper of Makhul and Shaffaf, the Bombieri-Lang conjecture (or more precisely, a weaker consequence of that conjecture) was used to show that if $S$ is a subset of ${\bf R}^2$ with rational distances which intersects any line in only finitely many points, then there is a uniform bound on the cardinality of the intersection of $S$ with any line.

Let us now give the elementary reductions to the claim that a certain variety is of general type. For sake of contradiction, let $S$ be a dense set such that the distance between any two points is rational. Then $S$ certainly contains two points that are a rational distance apart. By applying a translation, rotation, and a (rational) dilation, we may assume that these two points are $(0,0)$ and $(1,0)$. As $S$ is dense, there is a third point of $S$ not on the $x$ axis, which after a reflection we can place in the upper half-plane; we will write it as $(a,\sqrt{b})$ with $b > 0$.

Given any two points $P, Q$ in $S$, the quantities $|P|^2, |Q|^2, |P-Q|^2$ are rational, and so by the cosine rule the dot product $P \cdot Q$ is rational as well. Since $(1,0) \in S$, this implies that the $x$-component of every point $P$ in $S$ is rational; this in turn implies that the product of the $y$-coordinates of any two points $P, Q$ in $S$ is rational as well (since this differs from $P \cdot Q$ by a rational number). In particular, $a$ and $b$ are rational, and all of the points in $S$ now lie in the lattice $\{ (x, y\sqrt{b}): x, y \in {\bf Q} \}$. (This fact appears to have first been observed in the 1988 habilitationschrift of Kemnitz.)
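The cosine-rule step can be checked on a small example. The sketch below is illustrative only: the helper names `sq_dist` and `dot_from_distances`, the value $b = 36/49$, and the sample point $(5/14, \sqrt{b})$ (a 13-14-15 triangle rescaled so that its base is the segment from $(0,0)$ to $(1,0)$) are assumptions chosen for the demonstration. Points of the form $(x, y\sqrt{b})$ are stored as rational pairs $(x, y)$.

```python
from fractions import Fraction as F

def sq_dist(p, q, b):
    """Squared distance between (x, y*sqrt(b)) points stored as rational pairs (x, y)."""
    return (p[0] - q[0]) ** 2 + b * (p[1] - q[1]) ** 2

def dot_from_distances(p, q, b):
    """Recover the dot product P.Q from |P|^2, |Q|^2, |P-Q|^2 (the cosine rule)."""
    origin = (F(0), F(0))
    return (sq_dist(p, origin, b) + sq_dist(q, origin, b) - sq_dist(p, q, b)) / 2

# Assumption: illustrative data, not taken from the post.
b = F(36, 49)
P = (F(5, 14), F(1))   # the point (5/14, sqrt(b)); distances 13/14 and 15/14 to (0,0), (1,0)
Q = (F(1), F(0))       # the normalised point (1, 0)

dot = dot_from_distances(P, Q, b)
print(dot)  # 5/14, the x-coordinate of P
```

Because $Q = (1,0)$, the dot product is exactly the $x$-coordinate of $P$, which is how rationality of the three squared lengths forces rationality of that coordinate.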

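Now take four points $(x_j, y_j \sqrt{b})$, $j=1,\dots,4$ in $S$ in general position (so that the octuplet $(x_1,y_1\sqrt{b},\dots,x_4,y_4\sqrt{b})$ avoids any pre-specified hypersurface in ${\bf C}^8$); this can be done if $S$ is dense. (If one wished, one could re-use the three previous points $(0,0), (1,0), (a,\sqrt{b})$ to be three of these four points, although this ultimately makes little difference to the analysis.) If $(x, y\sqrt{b})$ is any point in $S$, then the distances $r_j$ from $(x,y\sqrt{b})$ to $(x_j,y_j\sqrt{b})$ are rationals that obey the equations

$(x - x_j)^2 + b (y - y_j)^2 = r_j^2$

for $j=1,\dots,4$, and thus determine a rational point in the affine complex variety $V = V_{b,x_1,y_1,x_2,y_2,x_3,y_3,x_4,y_4} \subset {\bf C}^6$ defined as

$V := \{ (x,y,r_1,r_2,r_3,r_4) \in {\bf C}^6: (x - x_j)^2 + b (y - y_j)^2 = r_j^2 \hbox{ for } j=1,\dots,4 \}.$

By inspecting the projection $(x,y,r_1,r_2,r_3,r_4) \mapsto (x,y)$ from $V$ to ${\bf C}^2$, we see that $V$ is a branched cover of ${\bf C}^2$, with the generic cover having $2^4 = 16$ points (coming from the different ways to form the square roots $r_j = \pm \sqrt{(x-x_j)^2 + b(y-y_j)^2}$); in particular, $V$ is a complex affine algebraic surface, defined over the rationals. By inspecting the monodromy around the four singular base points $(x,y) = (x_i,y_i)$ (which switch the sign of one of the roots $r_i$, while keeping the other three roots unchanged), we see that the variety $V$ is connected away from its singular set, and thus irreducible. As $S$ is topologically dense in ${\bf R}^2$, it is Zariski-dense in ${\bf C}^2$, and so $S$ generates a Zariski-dense set of rational points in $V$. To solve the Erdös-Ulam problem, it thus suffices to show that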
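The sheet count of the branched cover can be illustrated numerically. The sketch below assumes the defining equations take the form $(x-x_j)^2 + b(y-y_j)^2 = r_j^2$ for $j=1,\dots,4$; the function name `fibre`, the value of `b` and the four centres are made-up sample data.

```python
import cmath
from itertools import product

def fibre(x, y, b, centres):
    """All points (r_1, ..., r_4) lying over the base point (x, y).

    Each r_j is one of the two square roots of (x - x_j)^2 + b (y - y_j)^2,
    so a generic base point has 2^4 = 16 preimages.
    """
    roots = [cmath.sqrt((x - xj) ** 2 + b * (y - yj) ** 2) for xj, yj in centres]
    return [tuple(sign * root for sign, root in zip(signs, roots))
            for signs in product((1, -1), repeat=len(centres))]

# Assumption: arbitrary sample data standing in for points in general position.
b = 2.0
centres = [(0.0, 0.0), (1.0, 0.0), (0.3, 0.7), (-0.4, 1.1)]

generic = set(fibre(0.25, 0.6, b, centres))
branch = set(fibre(0.3, 0.7, b, centres))  # base point equal to the third centre
print(len(generic), len(branch))
```

Over a base point that coincides with one of the four centres, one of the roots vanishes, its two sign choices merge, and the fibre drops from sixteen points to eight; these are the branch points around which the monodromy flips a single sign.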

Claim 1 For any non-zero rational $b$ and for rationals $x_1,y_1,x_2,y_2,x_3,y_3,x_4,y_4$ in general position, the rational points of the affine surface $V = V_{b,x_1,y_1,x_2,y_2,x_3,y_3,x_4,y_4}$ are not Zariski dense in $V$.

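This is already very close to a claim that can be directly resolved by the Bombieri-Lang conjecture, but $V$ is affine rather than projective, and also contains some singularities. The first issue is easy to deal with, by working with the projectivisation

$\overline{V} := \{ [X,Y,Z,R_1,R_2,R_3,R_4] \in {\bf CP}^6: Q(X,Y,Z,R_1,R_2,R_3,R_4) = 0 \} \ \ \ \ \ (1)$

of $V$, where $Q: {\bf C}^7 \to {\bf C}^4$ is the homogeneous quadratic polynomial

$Q(X,Y,Z,R_1,R_2,R_3,R_4) := (Q_j(X,Y,Z,R_1,R_2,R_3,R_4))_{j=1}^4$

with

$Q_j(X,Y,Z,R_1,R_2,R_3,R_4) := (X - x_j Z)^2 + b (Y - y_j Z)^2 - R_j^2$

and the projective complex space ${\bf CP}^6$ is the space of all equivalence classes $[X,Y,Z,R_1,R_2,R_3,R_4]$ of tuples $(X,Y,Z,R_1,R_2,R_3,R_4) \in {\bf C}^7 \backslash \{0\}$ up to projective equivalence $(\lambda X, \lambda Y, \lambda Z, \lambda R_1, \lambda R_2, \lambda R_3, \lambda R_4) \sim (X,Y,Z,R_1,R_2,R_3,R_4)$. By identifying the affine point $(x,y,r_1,r_2,r_3,r_4)$ with the projective point $[x,y,1,r_1,r_2,r_3,r_4]$, we see that $\overline{V}$ consists of the affine variety $V$ together with the set $\{ [X,Y,0,R_1,R_2,R_3,R_4]: X^2 + b Y^2 = R_j^2 \hbox{ for } j=1,\dots,4 \}$, which is the union of eight curves, each of which lies in the closure of $V$. Thus $\overline{V}$ is the projective closure of $V$, and is thus a complex irreducible projective surface, defined over the rationals. As $\overline{V}$ is cut out by four quadric equations in ${\bf CP}^6$ and has degree sixteen (as can be seen for instance by inspecting the intersection of $\overline{V}$ with a generic perturbation of a fibre over the generically defined projection $[X,Y,Z,R_1,R_2,R_3,R_4] \mapsto [X,Y,Z]$), it is also a complete intersection. To show Claim 1, it then suffices to show that the rational points in $\overline{V}$ are not Zariski dense in $\overline{V}$.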

Heuristically, the reason why we expect few rational points in $\overline{V}$ is as follows. First observe from the projective nature of (1) that every rational point is equivalent to an integer point. But for a septuple $(X,Y,Z,R_1,R_2,R_3,R_4)$ of integers of size $O(N)$, the quantity $Q(X,Y,Z,R_1,R_2,R_3,R_4)$ is an integer point of ${\bf Z}^4$ of size $O(N^2)$, and so should only vanish about $O(N^{-8})$ of the time. Hence the number of integer points of height comparable to $N$ should be about

$O(N)^7 \times O(N^{-8}) = O(N^{-1});$

this is a convergent sum if $N$ ranges over (say) powers of two, and so from standard probabilistic heuristics (see this previous post) we in fact expect only finitely many solutions, in the absence of any special algebraic structure (e.g. the structure of an abelian variety, or a birational reduction to a simpler variety) that could produce an unusually large number of solutions.

The Bombieri-Lang conjecture, Conjecture 2, can be viewed as a formalisation of the above heuristics (roughly speaking, it is one of the most optimistic natural conjectures one could make that is compatible with these heuristics while also being invariant under birational equivalence).

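Unfortunately, $\overline{V}$ contains some singular points. Being a complete intersection, this occurs when the Jacobian matrix of the map $Q: {\bf C}^7 \to {\bf C}^4$ has less than full rank, or equivalently that the gradient vectors

$\nabla Q_j = (2(X - x_j Z), 2b(Y - y_j Z), -2x_j (X - x_j Z) - 2 b y_j (Y - y_j Z), 0, \dots, 0, -2R_j, 0, \dots, 0) \ \ \ \ \ (2)$

for $j=1,\dots,4$ are linearly dependent, where the $-2R_j$ is in the coordinate position associated to $R_j$. One way in which this can occur is if one of the gradient vectors $\nabla Q_j$ vanishes identically. This occurs at precisely $4 \times 2^3 = 32$ points, when $[X,Y,Z]$ is equal to $[x_j,y_j,1]$ for some $j=1,\dots,4$, and one has $R_k = \pm ( (x_j - x_k)^2 + b (y_j - y_k)^2 )^{1/2}$ for all $k=1,\dots,4$ (so in particular $R_j = 0$). Let us refer to these as the *obvious* singularities; they arise from the geometrically evident fact that the distance function $(x, y\sqrt{b}) \mapsto \sqrt{(x-x_j)^2 + b(y-y_j)^2}$ is singular at $(x_j, y_j\sqrt{b})$.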
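The rank drop at an obvious singularity can be observed numerically. The sketch below assumes the quadrics have the form $Q_j = (X - x_j Z)^2 + b(Y - y_j Z)^2 - R_j^2$, so that the gradient is as in (2); the helper names, the value of `b` and the centres are hypothetical sample data, and only real points with non-negative $R_j$ are sampled.

```python
import math

def point_over(x, y, b, centres):
    """Projective point [x, y, 1, r_1, ..., r_4] with every r_j >= 0 (real case only)."""
    radii = [math.sqrt((x - xj) ** 2 + b * (y - yj) ** 2) for xj, yj in centres]
    return [x, y, 1.0] + radii

def gradient(j, point, b, centres):
    """Gradient of Q_j = (X - x_j Z)^2 + b (Y - y_j Z)^2 - R_j^2 at the given point."""
    X, Y, Z, *R = point
    xj, yj = centres[j]
    u, v = X - xj * Z, Y - yj * Z
    g = [2 * u, 2 * b * v, -2 * xj * u - 2 * b * yj * v] + [0.0] * len(centres)
    g[3 + j] = -2 * R[j]
    return g

def rank(rows, tol=1e-9):
    """Numerical rank via Gaussian elimination with partial pivoting."""
    rows = [row[:] for row in rows]
    r = 0
    for c in range(len(rows[0])):
        if r == len(rows):
            break
        pivot = max(range(r, len(rows)), key=lambda i: abs(rows[i][c]))
        if abs(rows[pivot][c]) < tol:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][c] / rows[r][c]
            rows[i] = [a - factor * p for a, p in zip(rows[i], rows[r])]
        r += 1
    return r

def jacobian_rank(point, b, centres):
    return rank([gradient(j, point, b, centres) for j in range(len(centres))])

# Assumption: arbitrary sample data standing in for points in general position.
b = 2.0
centres = [(0.0, 0.0), (1.0, 0.0), (0.3, 0.7), (-0.4, 1.1)]

smooth_rank = jacobian_rank(point_over(0.25, 0.6, b, centres), b, centres)
singular_rank = jacobian_rank(point_over(0.0, 0.0, b, centres), b, centres)
print(smooth_rank, singular_rank)
```

At the point lying over the first centre the first gradient vanishes identically, so the Jacobian has rank three rather than four there, while a nearby generic point has full rank.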

The other way in which this could occur is if a non-trivial linear combination of at least two of the gradient vectors vanishes. From (2), this can only occur if $R_i = R_j = 0$ for some distinct $i,j$, which from (1) implies that

$X - x_i Z = \pm \sqrt{-b} (Y - y_i Z) \ \ \ \ \ (3)$

and

$X - x_j Z = \pm \sqrt{-b} (Y - y_j Z) \ \ \ \ \ (4)$

for two choices of sign $\pm$. If the signs are equal, then (as $x_i, y_i, x_j, y_j$ are in general position) this implies that $Z = 0$, and then we have the singular point

$[X,Y,Z,R_1,R_2,R_3,R_4] = [\pm \sqrt{-b}, 1, 0, 0, 0, 0, 0]. \ \ \ \ \ (5)$

If the non-trivial linear combination involved three or more gradient vectors, then by the pigeonhole principle at least two of the signs involved must be equal, and so the only singular points are (5). So the only remaining possibility is when we have two gradient vectors $\nabla Q_i, \nabla Q_j$ that are parallel but non-zero, with the signs in (3), (4) opposing. But then (as $x_i, y_i, x_j, y_j$ are in general position) the vectors $(X - x_i Z, b(Y - y_i Z))$ and $(X - x_j Z, b(Y - y_j Z))$ are non-zero and non-parallel to each other, a contradiction. Thus, outside of the $32$ obvious singular points mentioned earlier, the only other singular points are the two points (5).

We will shortly show that the obvious singularities are *ordinary double points*; the surface $\overline{V}$ near any of these points is analytically equivalent to an ordinary cone $\{ (x,y,z) \in {\bf C}^3: z^2 = x^2 + y^2 \}$ near the origin, which is a cone over a smooth conic curve $\{ [x,y,z] \in {\bf CP}^2: z^2 = x^2 + y^2 \}$. The two non-obvious singularities (5) are slightly more complicated than ordinary double points; they are *elliptic singularities*, which approximately resemble a cone over an elliptic curve. (As far as I can tell, this resemblance is exact in the category of real smooth manifolds, but not in the category of algebraic varieties.) If one blows up each of the point singularities of $\overline{V}$ separately, no further singularities are created, and one obtains a smooth projective surface $\tilde V$ (using the Segre embedding as necessary to embed back into projective space, rather than in a product of projective spaces). Away from the singularities, the rational points of $\overline{V}$ lift up to rational points of $\tilde V$. Assuming the Bombieri-Lang conjecture, we thus are able to answer the Erdös-Ulam problem in the affirmative once we establish

Theorem 3 The blowup $\tilde V$ of $\overline{V}$ is of general type.

This will be done below the fold, by the pedestrian device of explicitly constructing global differential forms on $\tilde V$; I will also be working from a complex analysis viewpoint rather than an algebraic geometry viewpoint as I am more comfortable with the former approach. (As mentioned above, though, there may well be a quicker way to establish this result by using more sophisticated machinery.)

I thank Mark Green and David Gieseker for helpful conversations (and a crash course in varieties of general type!).

Remark 4 The above argument shows in fact (assuming Bombieri-Lang) that sets $S \subset {\bf R}^2$ with all distances rational cannot be Zariski-dense, and thus (by Solymosi-de Zeeuw) must lie on a single line or circle with only finitely many exceptions. Assuming a stronger version of Bombieri-Lang involving a general number field $K$, we obtain a similar conclusion with “rational” replaced by “lying in $K$” (one has to extend the Solymosi-de Zeeuw analysis to more general number fields, but this should be routine, using the analogue of Faltings’ theorem for such number fields).

** — 1. Singularities — **

Let us inspect the local behaviour of near an obvious singularity, when is close to for some . We may normalise so that and , and then we may use the affine chart , so that we are looking at the affine variety

for near . Note that for , stays away from zero and so is a smooth branch of the square root of near an obvious singularity. Thus, up to an invertible analytic map, the local behaviour of near the obvious singularity is that of a cone

Such a cone is blown up to the surface

where

is the closure of the equivalence class in ; the origin blows up to the conic curve

and the blowup locally looks (on the level of real smooth manifolds, at least) like the cylinder . In particular, the blown up variety is smooth near the blowup of the original singularity.

Now we look at the behaviour of near a non-obvious singularity (5); for sake of discussion we work with the sign . Here the calculations are messier; unfortunately, I do not know of a slick way to avoid excessive computation. We use the affine chart and write , then we are looking at the affine variety

for near . We can rewrite this as

Using the equations, we can solve for and (using the general position of the ) to obtain equations of the form

for some homogeneous quadratics and complex coefficients determined by the . Substituting this into the equations we obtain equations of the form

for some homogeneous quadratics and complex coefficients determined by the . One can solve the first two equations by power series (or the inverse function theorem) for near zero to obtain an analytic representation

for some functions analytic near the origin with . The second two equations then become

for some functions analytic near the origin that vanish to second order at . Thus, near this non-obvious singularity, is analytically equivalent to the complex surface

near .

It may be possible to simplify the surface further into an even better normal form, though I was unable to do so. Nevertheless, the current form is simple enough that one can understand the blowup (in the category of real smooth manifolds, at least), which in this case is

or equivalently the three-dimensional manifold

quotiented by the equivalence , where is the function

which is locally analytic near the origin. One can cover this manifold by the four affine charts . For instance, the chart becomes the affine manifold

The point blows up to the curve

that is to say the intersection of two quadrics in . For generic, the vectors and are linearly independent, and one easily verifies that this is a smooth curve in , which by Bezout’s theorem is of degree four. If one projects from a point of this curve to a generic hyperplane, one obtains a smooth planar curve of degree three, that is to say an elliptic curve . Thus, on the level of real smooth manifolds at least, the blowup of is equivalent to , by the inverse function theorem; in particular, the blowup is smooth near the fibre of the original singularity.

To summarise, the blown up surface is smooth, with the obvious singularities blowing up to a conic curve with neighbourhood structure , and the non-obvious singularities blowing up to an elliptic curve with neighbourhood structure (at the level of real smooth manifolds; I was not able to understand the complex or algebraic geometry structure properly).

** — 2. General type — **

Let $X$ be a smooth complex projective variety of some dimension $n$. The canonical bundle $K_X$ (also known as the *determinant bundle*) is then the top-dimensional exterior product $\bigwedge^n T^* X$ of the cotangent bundle $T^* X$, thus sections of this bundle are holomorphic $n$-forms on $X$. Since the space of possible top-dimensional forms at a point is one-dimensional, this is a line bundle. One can also take higher powers $K_X^{\otimes m}$ of this bundle for any natural number $m \geq 1$ to create further line bundles, known as *pluricanonical bundles*; sections of the pluricanonical bundle of order $m$ can then be locally represented as formal $m$-fold products of holomorphic $n$-forms.

For each pluricanonical bundle $K_X^{\otimes m}$, define $H^0(X, K_X^{\otimes m})$ to be the space of global sections of this bundle; in particular, $H^0(X, K_X)$ is the space of globally holomorphic $n$-forms. It turns out (as can be proven for instance using compactness theorems such as Montel’s theorem) that this space is always a finite-dimensional complex vector space; the global sections are also always algebraic (this follows from Serre’s GAGA paper, but presumably can be derived from earlier results also). If the space has some positive dimension $d$ (that is, at least one non-trivial global section exists), we say that the bundle is *effective*, and then we can define an (almost everywhere defined) *pluricanonical map* $\phi_m: X \to {\bf CP}^{d-1}$ by the formula

$\phi_m(x) := [s_1(x), \dots, s_d(x)]$

where $s_1,\dots,s_d$ is a complex basis for the space $H^0(X, K_X^{\otimes m})$; note that this map can be undefined if the $s_1(x),\dots,s_d(x)$ all simultaneously vanish, but this is a measure zero set. The pluricanonical map is only defined up to choice of basis $s_1,\dots,s_d$, but changing the basis only amounts to applying a projective linear transformation to ${\bf CP}^{d-1}$. In particular, the image $\phi_m(X)$ of $\phi_m$ is well defined up to projective transformations. We refer to the $m=1$ case of the pluricanonical map (when it exists) as the *canonical map*.

The Kodaira dimension of $X$ is then defined to be the maximum of the dimensions of $\phi_m(X)$ for all natural numbers $m$ with effective pluricanonical bundles, or $-\infty$ if no such $m$ exists (some authors use $-1$ instead of $-\infty$). From the definition it is clear that the Kodaira dimension of $X$ is at most the dimension $n$ of $X$ as an algebraic variety (or as a complex manifold). The variety $X$ is said to be of general type if the two dimensions are equal, that is to say that there is a pluricanonical map whose image has the same dimension as $X$.

Example 1Suppose is a projective space. On the one hand, an -form on is an object that assigns to each point an alternating -form from tangent vectors to complex numbers. But is the quotient of by dilations. Thus, one can also view the -form as a lifted object that assigns to each point an alternating -form from complex vectors to complex numbers, such that one has the vanishing propertyif any of the are a scalar multiple of , as well as the scale invariance property

for any , or equivalently

(This homogeneity of order is essentially asserting that the canonical bundle is isomorphic to the line bundle , the power of the tautological line bundle .) In particular, if is a globally holomorphic -form, are arbitrary vectors, and is an arbitrary homogeneous polynomial of degree , then is a holomorphic map from to which is dilation invariant, and thus descends to a holomorphic function on , which by Liouville’s theorem is then necessarily constant. By varying the polynomial , this shows that must vanish identically. Thus the canonical bundle here is not effective. A similar scaling argument shows that the pluricanonical bundles are not effective either, thus has a Kodaira dimension of (or ).

Example 2 In the case of smooth projective algebraic curves, it turns out that genus zero curves like ${\bf CP}^1$ have Kodaira dimension $-\infty$ (or $-1$), genus one curves (such as elliptic curves) have Kodaira dimension zero, and higher genus curves have Kodaira dimension one and are thus of general type. This is basically a corollary of the Riemann-Roch theorem. Intuitively, the more “holes” or other topologically interesting structure a variety has, the more opportunity there is for the canonical and pluricanonical bundles to have interesting global sections, and the more likely it becomes that the variety is of general type.
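For curves the relevant dimensions are given explicitly by Riemann-Roch, and a short table makes the trichotomy visible. In the sketch below the function names `plurigenus` and `kodaira_dimension_of_curve` are hypothetical; the formulas are the standard ones for a smooth projective curve of genus $g$, where the $m$-th pluricanonical bundle has degree $m(2g-2)$.

```python
import math

def plurigenus(g, m):
    """Dimension of the space of global sections of the m-th pluricanonical bundle
    on a smooth projective curve of genus g (standard Riemann-Roch values)."""
    if g < 0 or m < 1:
        raise ValueError("need g >= 0 and m >= 1")
    if g == 0:
        return 0                      # negative degree: no non-zero sections
    if g == 1:
        return 1                      # trivial canonical bundle
    return g if m == 1 else (2 * m - 1) * (g - 1)

def kodaira_dimension_of_curve(g):
    """Kodaira dimension read off from the growth of the plurigenera in m."""
    if g == 0:
        return -math.inf
    return 0 if g == 1 else 1

for g in range(4):
    print(g, [plurigenus(g, m) for m in range(1, 5)], kodaira_dimension_of_curve(g))
```

The plurigenera vanish for genus zero, stay bounded for genus one, and grow linearly in $m$ for genus at least two, matching Kodaira dimensions $-\infty$, $0$ and $1$ respectively.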

Remark 5As the name suggests, “most” varieties are expected to be of general type, with only a few “special” varieties being not of general type. In the case of curves, the special varieties are the genus zero and genus one curves. In the case of surfaces, the situation is analogous, but significantly more complicated, and is described by the Enriques-Kodaira classification. I don’t know the current level of understanding for higher dimensional varieties, but would imagine that the story becomes even more complicated than in the surface case.

** — 3. Constructing differential forms — **

To show that a smooth projective variety is of general type, it clearly suffices to locate enough global holomorphic -forms that the canonical map has full dimension in its image. It is thus of interest to find ways to construct global holomorphic -forms on such varieties.

Suppose first that we have a (possibly singular) complete intersection

of codimension in , where are homogeneous polynomials of degree respectively. This variety can be viewed as the quotient of the quasiprojective variety

by dilations. Similarly to Example 1, an -form on can then be identified with an -form on obeying the vanishing condition

whenever one of the is parallel to , as well as the scaling relationship

To try to create such a form, we can start with the standard -form

on , and “divide” by the one-forms to create a -form on by requiring that

for , , and , where

Away from the singularities of , the determinant form is non-degenerate (after quotienting out the by the tangent plane , on which this form clearly vanishes). As such, is well-defined as a form on the smooth points of . It also obeys the vanishing condition (7) when one of the is parallel to . However, it doesn’t necessarily obey the scaling relationship (8); instead, one has

for any , where is the quantity

Repeating the heuristic analysis in the introduction, we expect the number of integer points in of height to be , so should morally correspond to “general type” in some sense. This is reflected here by the ability, when , to multiply by an arbitrary homogeneous polynomial of degree , leading to a form which does obey both (7) and (8), and thus descends to an -form defined on the smooth points of . This already shows that for *smooth* complete intersections , one has general type whenever (because one can use of the form , for a fixed homogeneous polynomial of degree and being the basis functions, to create a portion of the canonical map that is basically the tautological embedding of to ). (Indeed, this argument shows in this case that the canonical bundle on is isomorphic to the pullback of to .)
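The numerical criterion for smooth complete intersections is easy to tabulate. The sketch below uses the standard adjunction count: a smooth complete intersection of hypersurfaces of degrees $d_1,\dots,d_m$ in ${\bf CP}^n$ has canonical bundle given by the twist $d_1 + \dots + d_m - n - 1$ of the hyperplane bundle. The function names and the parameter names `ambient_dim` and `degrees` are hypothetical, and the letters may differ from those used in the post.

```python
def canonical_twist(ambient_dim, degrees):
    """Twist d with K = O(d) for a smooth complete intersection in CP^ambient_dim
    cut out by hypersurfaces of the given degrees (adjunction formula)."""
    return sum(degrees) - ambient_dim - 1

def is_general_type_if_smooth(ambient_dim, degrees):
    """Applies only to smooth complete intersections of positive dimension."""
    if len(degrees) >= ambient_dim:
        raise ValueError("need a positive-dimensional complete intersection")
    return canonical_twist(ambient_dim, degrees) > 0

examples = {
    "four quadrics in CP^6": (6, [2, 2, 2, 2]),
    "plane cubic": (2, [3]),
    "plane quartic": (2, [4]),
    "quadric surface in CP^3": (3, [2]),
    "quintic threefold in CP^4": (4, [5]),
}
for name, (n, degs) in examples.items():
    print(name, canonical_twist(n, degs), is_general_type_if_smooth(n, degs))
```

For four quadrics in ${\bf CP}^6$ the twist is $8 - 7 = 1$, which is positive. The surface considered in this post is singular, however, so this count is only suggestive there, and the blowup has to be analysed separately.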

If has singularities, then the situation is a bit more complicated; we have to pass to the blowup of , and the form may develop some singularities as one approaches the blown up fibres. For the specific blowup considered in Theorem 3, though, it turns out that these singularities are removable, if we make vanish at the exceptional singular points (5); as there are only two such singular points, this will still give enough freedom in to make the canonical map have full dimension in the image.

We turn to the details. Starting with the complete intersection given by (1), we let be the -form on the smooth portion of the deprojectivised variety

defined by the above construction, thus

for a smooth point of , , and . In this case, , , and , so , and so for any linear function , the form descends to a -form on the smooth portion of , which then lifts to a -form on the blown up variety except possibly at the exceptional fibres above each of the singular points of .

We now claim that this -form on can be smoothly continued to each of the fibres if the linear function vanishes at the two exceptional points (5), or equivalently that is a linear combination of the coefficients . Let us first verify this for an obvious singular point. As before, we normalise and , and take the affine chart , then is locally described by the affine variety

where as usual we identify with , and the form restricted to this affine variety takes the form

for , , and . In particular, taking for , where is the standard basis, we have

whenever the denominator is non-zero, where is the one-form on that sends a tangent vector to its component , and similarly for , , etc. Using the notation of wedge product , we can write this as

Using (2), we compute that

The quantities are bounded away from zero near the singularity, and is also bounded. Thus we have

near the singularity, where we write for -forms to denote the assertion that for some scalar function that is bounded near the singularity.

At first glance, it looks like this form could blow up whenever vanishes, but this does not actually happen as the numerator will also vanish in that case. One can see this by using a different choice of for in (9). For instance, if one takes to be rather than , then we get

and now

so we obtain an alternate form

One can also see the equivalence of (10), (11), (12) by noting that vanishes on , which on differentiating yields the linear dependence

between the -forms , which can then easily be used to relate (10), (11), (12) to each other.

Thus the only actual singularity that could occur here is when all vanish. However, it turns out that even here the singularity is removable after blowing up the surface. Recall that the blowup surface takes the form

We pick an affine chart of this surface; for sake of argument we take the chart, as the other charts are treated similarly. Then the surface can be expressed as

with . In particular,

and thus

The key point is that there is a common factor of here. Using (11), we thus have

and so does not blow up in the coordinates of as approaches . By Riemann’s theorem, (the pullback of) can thus be extended holomorphically to the exceptional fibre .

Now we look at the behaviour near a non-obvious singularity; for sake of argument we look at the neighbourhood of

As before, we use the affine chart and write , then we are looking at the affine variety

near , where is the system of quadratic polynomials

Applying a change of variables, we have

for , , and , and some harmless constant (depending on ). As before, if we set for , we arrive at

when the denominator is non-zero. Similarly

when the denominators are non-zero. Similarly for permutations of the indices.

Now, we lift to the blowup manifold

For sake of discussion we work in the affine chart (say) and write , in which case the manifold becomes

with the convention . To convert back into the original coordinates , we have

and

Thus

and

and thus

Also, . From (14) we can thus cancel all the factors of and conclude that

and so stays bounded up to the exceptional fibre in the blowup coordinates as long as stay bounded away from zero.

It remains to deal with the case when one of the are small. From the form of the curve (6), and the fact that the are in general position, none of the vectors are scalar multiples of each other, which means that only one of the can vanish at a time. Suppose is close to vanishing, so that are comparable to one (the other cases will be similar). In this case, we use (15). As before, and ; also

and so

So once again the factors of cancel, and

and so stays bounded up to the exceptional fibre in this case. Similarly when one of the other are small. Thus stays bounded as one approaches the exceptional fibre in any direction, and so by Riemann’s theorem it can be continued holomorphically to this fibre.

In conclusion, the form constructed above lifts to a smooth $2$-form on the blowup surface $\tilde V$. By choosing the linear function to be each of the coordinate functions $Z, R_1, R_2, R_3, R_4$, we conclude that the image of the canonical map on $\tilde V$ contains the image of $\overline{V}$ under the (generically defined) projection $[X,Y,Z,R_1,R_2,R_3,R_4] \mapsto [Z,R_1,R_2,R_3,R_4]$. But it is easy to see that this map is generically injective on $\overline{V}$ (indeed, one can solve for $X, Y$ as rational functions of $Z, R_1, R_2, R_3, R_4$ by subtracting the various defining equations from each other), and so the image has full dimension. This establishes that $\tilde V$ has general type as required.
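The injectivity step amounts to a small linear solve, which can be sketched in the affine chart $Z = 1$. The code assumes the defining equations have the form $(x-x_j)^2 + b(y-y_j)^2 = r_j^2$; subtracting the first equation from two of the others cancels the quadratic terms and leaves two linear equations in $(x,y)$. The function name `recover_xy` and the sample data are hypothetical.

```python
import math

def recover_xy(radii, b, centres):
    """Solve for (x, y) from the r_j by subtracting the first equation from two others.

    Subtracting equation 0 from equation j gives the linear relation
        2 (x_j - x_0) x + 2 b (y_j - y_0) y
            = x_j^2 - x_0^2 + b (y_j^2 - y_0^2) - (r_j^2 - r_0^2).
    """
    (x0, y0), r0 = centres[0], radii[0]
    rows = []
    for (xj, yj), rj in list(zip(centres, radii))[1:3]:
        rows.append((2 * (xj - x0),
                     2 * b * (yj - y0),
                     xj ** 2 - x0 ** 2 + b * (yj ** 2 - y0 ** 2) - (rj ** 2 - r0 ** 2)))
    (p1, q1, s1), (p2, q2, s2) = rows
    det = p1 * q2 - p2 * q1
    if det == 0:
        raise ValueError("centres are not in general position")
    # Cramer's rule for the 2x2 system
    return (s1 * q2 - s2 * q1) / det, (p1 * s2 - p2 * s1) / det

# Assumption: arbitrary sample data standing in for points in general position.
b = 2.0
centres = [(0.0, 0.0), (1.0, 0.0), (0.3, 0.7), (-0.4, 1.1)]
x, y = 0.25, 0.6
radii = [math.sqrt((x - xj) ** 2 + b * (y - yj) ** 2) for xj, yj in centres]

print(recover_xy(radii, b, centres))
```

Only the squares $r_j^2$ enter, so the recovered point does not depend on the sign choices of the roots; the projection remembers those signs separately through the coordinates $r_j$ themselves.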


between primes up to $X$ exhibited a lower bound of the shape

$G(X) \geq f(X) \log X \frac{\log \log X \log\log\log\log X}{(\log\log\log X)^2} \ \ \ \ \ (1)$

for some function $f(X)$ that went to infinity as $X \to \infty$; this improved upon previous work of Rankin and other authors, who established the same bound but with $f(X)$ replaced by a constant. (Again, see the previous post for a more detailed discussion.)

In our previous papers, we did not specify a particular growth rate for . In my paper with Kevin, Ben, and Sergei, there was a good reason for this: our argument relied (amongst other things) on the inverse conjecture on the Gowers norms, as well as the Siegel-Walfisz theorem, and the known proofs of both results both have ineffective constants, rendering our growth function similarly ineffective. Maynard’s approach ostensibly also relies on the Siegel-Walfisz theorem, but (as shown in another recent paper of his) can be made quite effective, even when tracking -tuples of fairly large size (about for some small ). If one carefully makes all the bounds in Maynard’s argument quantitative, one eventually ends up with a growth rate of shape

on the gaps between primes for large ; this is an unpublished calculation of James’.

In this paper we make a further refinement of this calculation to obtain a growth rate

$f(X) \asymp \log\log\log X \ \ \ \ \ (3)$

leading to a bound of the form

$G(X) \geq c \log X \frac{\log \log X \log\log\log\log X}{\log\log\log X}$

for large $X$ and some small constant $c$. Furthermore, this appears to be the limit of current technology (in particular, falling short of Cramér’s conjecture that $G(X)$ is comparable to $\log^2 X$); in the spirit of Erdös’ original prize on this problem, I would like to offer 10,000 USD for anyone who can show (in a refereed publication, of course) that the constant $c$ here can be replaced by an arbitrarily large constant $C$.

The reason for the growth rate (3) is as follows. After following the sieving process discussed in the previous post, the problem comes down to something like the following: can one sieve out all (or almost all) of the primes in by removing one residue class modulo for all primes in (say) ? Very roughly speaking, if one can solve this problem with , then one can obtain a growth rate on of the shape . (This is an oversimplification, as one actually has to sieve out a random subset of the primes, rather than all the primes in , but never mind this detail for now.)
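A toy version of this sieving problem can be run directly. The sketch below is not the method of the paper: it picks each residue class greedily (the class capturing the most surviving primes), and the intervals $(50, 200]$ for the targets and $(12, 50]$ for the sieving primes are arbitrary illustrative choices. The function names are hypothetical.

```python
from collections import Counter

def primes_between(lo, hi):
    """Primes p with lo < p <= hi, via the sieve of Eratosthenes (assumes hi >= 2)."""
    is_prime = [True] * (hi + 1)
    is_prime[0:2] = [False, False]
    for n in range(2, int(hi ** 0.5) + 1):
        if is_prime[n]:
            for multiple in range(n * n, hi + 1, n):
                is_prime[multiple] = False
    return [p for p in range(lo + 1, hi + 1) if is_prime[p]]

def greedy_sieve(targets, moduli):
    """Remove one residue class a_p mod p for each sieving prime p.

    Greedy choice (an assumption of this sketch): a_p is the class containing
    the largest number of surviving targets.
    """
    survivors = set(targets)
    classes = {}
    for p in moduli:
        if not survivors:
            break
        a, _ = Counter(q % p for q in survivors).most_common(1)[0]
        classes[p] = a
        survivors = {q for q in survivors if q % p != a}
    return sorted(survivors), classes

targets = primes_between(50, 200)
moduli = primes_between(12, 50)
survivors, classes = greedy_sieve(targets, moduli)
print(len(targets), len(survivors), classes)
```

Each sieving prime can only remove the targets lying in a single class, so the efficiency of the whole procedure depends on how many primes a well-chosen class captures and on how often classes for different moduli hit the same target.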

Using the quantitative “dense clusters of primes” machinery of Maynard, one can find lots of -tuples in which contain at least primes, for as large as or so (so that is about ). By considering -tuples in arithmetic progression, this means that one can find lots of residue classes modulo a given prime in that capture about primes. In principle, this means that union of all these residue classes can cover about primes, allowing one to take as large as , which corresponds to (3). However, there is a catch: the residue classes for different primes may collide with each other, reducing the efficiency of the covering. In our previous papers on the subject, we selected the residue classes randomly, which meant that we had to insert an additional logarithmic safety margin in expected number of times each prime would be shifted out by one of the residue classes, in order to guarantee that we would (with high probability) sift out most of the primes. This additional safety margin is ultimately responsible for the loss in (2).

The main innovation of this paper, beyond detailing James’ unpublished calculations, is to use ideas from the literature on efficient hypergraph covering, to avoid the need for a logarithmic safety margin. The hypergraph covering problem, roughly speaking, is to try to cover a set of vertices using as few “edges” from a given hypergraph as possible. If each edge has vertices, then one certainly needs at least edges to cover all the vertices, and the question is to see if one can come close to attaining this bound given some reasonable uniform distribution hypotheses on the hypergraph . As before, random methods tend to require something like edges before one expects to cover, say of the vertices.

However, it turns out to be possible (under reasonable hypotheses on the hypergraph) to eliminate this logarithmic loss, by using what is now known as the “semi-random method” or the “Rödl nibble”. The idea is to randomly select a small number of edges (a first “nibble”), small enough that the edges are unlikely to overlap much with each other, thus obtaining maximal efficiency. Then, one pauses to remove all the edges from the hypergraph that intersect edges from this first nibble, so that all remaining edges will not overlap with the existing edges. One then randomly selects another small number of edges (a second “nibble”), and repeats this process until enough nibbles are taken to cover most of the vertices. Remarkably, it turns out that under some reasonable assumptions on the hypergraph, one can maintain control on the uniform distribution of the edges throughout the nibbling process, and obtain an efficient hypergraph covering. This strategy was carried out in detail in an influential paper of Pippenger and Spencer.
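The nibbling loop itself is short to write down. The following is a toy sketch under simplifying assumptions, not the Pippenger-Spencer procedure: each nibble is a fixed-size uniform sample of the surviving edges, and the hypergraph is the complete 3-uniform hypergraph on 30 vertices. The function name `nibble_cover` and the parameters `batch` and `seed` are hypothetical.

```python
import random
from itertools import combinations

def nibble_cover(vertices, edges, batch=2, seed=0):
    """Cover vertices by repeated small random 'nibbles' of edges.

    After each nibble, every surviving edge that meets an already covered
    vertex is discarded, so later nibbles never overlap earlier ones.
    Returns the chosen edges and the set of vertices left uncovered.
    """
    rng = random.Random(seed)
    covered, chosen = set(), []
    alive = [frozenset(e) for e in edges]
    while alive:
        nibble = rng.sample(alive, min(batch, len(alive)))
        chosen.extend(nibble)
        for edge in nibble:
            covered |= edge
        alive = [e for e in alive if not (e & covered)]
    return chosen, set(vertices) - covered

vertices = range(30)
chosen, uncovered = nibble_cover(vertices, combinations(vertices, 3))
covered_count = 30 - len(uncovered)
print(len(chosen), covered_count, covered_count / (3 * len(chosen)))
```

The last printed number is the covering efficiency: the number of covered vertices divided by the total size of the chosen edges. Overlaps can occur only between edges of the same nibble, which is why each nibble is kept small.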

In our setup, the vertices are the primes in , and the edges are the intersection of the primes with various residue classes. (Technically, we have to work with a family of hypergraphs indexed by a prime , rather than a single hypergraph, but let me ignore this minor technical detail.) The semi-random method would *in principle* eliminate the logarithmic loss and recover the bound (3). However, there is a catch: the analysis of Pippenger and Spencer relies heavily on the assumption that the hypergraph is uniform, that is to say all edges have the same size. In our context, this requirement would mean that each residue class captures exactly the same number of primes, which is not the case; we only control the number of primes in an average sense, but we were unable to obtain any concentration of measure to come close to verifying this hypothesis. And indeed, the semi-random method, when applied naively, does not work well with edges of variable size – the problem is that edges of large size are much more likely to be eliminated after each nibble than edges of small size, since they have many more vertices that could overlap with the previous nibbles. Since the large edges are clearly the more useful ones for the covering problem than small ones, this bias towards eliminating large edges significantly reduces the efficiency of the semi-random method (and also greatly complicates the *analysis* of that method).

Our solution to this is to iteratively *reweight* the probability distribution on edges after each nibble to compensate for this bias effect, giving larger edges a greater weight than smaller edges. It turns out that there is a natural way to do this reweighting that allows one to repeat the Pippenger-Spencer analysis in the presence of edges of variable size, and this ultimately allows us to recover the full growth rate (3).

To go beyond (3), one either has to find a lot of residue classes that can capture significantly more than primes of size (which is the limit of the multidimensional Selberg sieve of Maynard and myself), or else one has to find a very different method to produce large gaps between primes than the Erdös-Rankin method, which is the method used in all previous work on the subject.

It turns out that the arguments in this paper can be combined with the Maier matrix method to also produce chains of consecutive large prime gaps whose size is of the order of (4); three of us (Kevin, James, and myself) will detail this in a future paper. (A similar combination was also recently observed in connection with our earlier result (1) by Pintz, but there are some additional technical wrinkles required to recover the full gain of (3) for the chains of large gaps problem.)

Filed under: math.NT, paper Tagged: Ben Green, hypergraph covering, James Maynard, Kevin Ford, prime gaps, Sergey Konyagin

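The function $\zeta$ obeys the trivial functional equation

$$\zeta(\overline{s}) = \overline{\zeta(s)} \qquad (1)$$

for all $s$ in its domain of definition. Indeed, as $\zeta(s)$ is real-valued when $s$ is real, the function $\zeta(s) - \overline{\zeta(\overline{s})}$ vanishes on the real line and is also meromorphic, and hence vanishes everywhere. Similarly one has the functional equation

$$\overline{L(s,\chi)} = L(\overline{s}, \overline{\chi}). \qquad (2)$$

From these equations we see that the zeroes of the zeta function are symmetric across the real axis, and the zeroes of $L(\cdot,\chi)$ are the reflection of the zeroes of $L(\cdot,\overline{\chi})$ across this axis.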

It is a remarkable fact that these functions obey an additional, and more non-trivial, functional equation, this time establishing a symmetry across the *critical line* rather than the real axis. One consequence of this symmetry is that the zeta function and -functions may be extended meromorphically to the entire complex plane. For the zeta function, the functional equation was discovered by Riemann, and reads as follows:

Theorem 1 (Functional equation for the Riemann zeta function)The Riemann zeta function extends meromorphically to the entire complex plane, with a simple pole at and no other poles. Furthermore, one has the functional equationfor all complex other than , where is the function

Here , are the complex-analytic extensions of the classical trigonometric functions , and is the Gamma function, whose definition and properties we review below the fold.

The functional equation can be placed in a more symmetric form as follows:

Corollary 2 (Functional equation for the Riemann xi function)The Riemann xi functionis analytic on the entire complex plane (after removing all removable singularities), and obeys the functional equations

In particular, the zeroes of consist precisely of the non-trivial zeroes of , and are symmetric about both the real axis and the critical line. Also, is real-valued on the critical line and on the real axis.

Corollary 2 is an easy consequence of Theorem 1 together with the duplication theorem for the Gamma function, and the fact that has no zeroes to the right of the critical strip, and is left as an exercise to the reader (Exercise 19). The functional equation in Theorem 1 has many proofs, but most of them are related in one way or another to the Poisson summation formula

(Theorem 34 from Supplement 2, at least in the case when is twice continuously differentiable and compactly supported), which can be viewed as a Fourier-analytic link between the coarse-scale distribution of the integers and the fine-scale distribution of the integers. Indeed, there is a quick heuristic proof of the functional equation that comes from formally applying the Poisson summation formula to the function , and noting that the functions and are formally Fourier transforms of each other, up to some Gamma function factors, as well as some trigonometric factors arising from the distinction between the real line and the half-line. Such a heuristic proof can indeed be made rigorous, and we do so below the fold, while also providing Riemann’s two classical proofs of the functional equation.

From the functional equation (and the poles of the Gamma function), one can see that has *trivial zeroes* at the negative even integers , in addition to the non-trivial zeroes in the critical strip. More generally, the following table summarises the zeroes and poles of the various special functions appearing in the functional equation, after they have been meromorphically extended to the entire complex plane, and with zeroes classified as “non-trivial” or “trivial” depending on whether they lie in the critical strip or not. (Exponential functions such as or have no zeroes or poles, and will be ignored in this table; the zeroes and poles of rational functions such as are self-evident and will also not be displayed here.)

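| Function | Non-trivial zeroes | Trivial zeroes | Poles |
| --- | --- | --- | --- |
| $\zeta(s)$ | Yes | $-2, -4, -6, \dots$ | $1$ |
| $\zeta(1-s)$ | Yes | $3, 5, 7, \dots$ | $0$ |
| $\sin(\pi s/2)$ | No | Even integers | No |
| $\cos(\pi s/2)$ | No | Odd integers | No |
| $\sin(\pi s)$ | No | Integers | No |
| $\Gamma(s)$ | No | No | $0, -1, -2, \dots$ |
| $\Gamma(s/2)$ | No | No | $0, -2, -4, \dots$ |
| $\Gamma(1-s)$ | No | No | $1, 2, 3, \dots$ |
| $\Gamma((1-s)/2)$ | No | No | $1, 3, 5, \dots$ |
| $\xi(s)$ | Yes | No | No |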

Among other things, this table indicates that the Gamma and trigonometric factors in the functional equation are tied to the trivial zeroes and poles of zeta, but have no direct bearing on the distribution of the non-trivial zeroes, which is the most important feature of the zeta function for the purposes of analytic number theory, beyond the fact that they are symmetric about the real axis and critical line. In particular, the Riemann hypothesis is not going to be resolved just from further analysis of the Gamma function!

The zeta function computes the “global” sum , with ranging all the way from to infinity. However, by some Fourier-analytic (or complex-analytic) manipulation, it is possible to use the zeta function to also control more “localised” sums, such as for some and some smooth compactly supported function . It turns out that the functional equation (3) for the zeta function localises to this context, giving an *approximate functional equation* which roughly speaking takes the form

whenever and ; see Theorem 38 below for a precise formulation of this equation. Unsurprisingly, this form of the functional equation is also very closely related to the Poisson summation formula (8), indeed it is essentially a special case of that formula (or more precisely, of the van der Corput -process). This useful identity relates long smoothed sums of to short smoothed sums of (or vice versa), and can thus be used to shorten exponential sums involving terms such as , which is useful when obtaining some of the more advanced estimates on the Riemann zeta function.

We will give two other basic uses of the functional equation. The first is to get a good count (as opposed to merely an upper bound) on the density of zeroes in the critical strip, establishing the Riemann-von Mangoldt formula that the number of zeroes of imaginary part between and is for large . The other is to obtain untruncated versions of the explicit formula from Notes 2, giving a remarkable exact formula for sums involving the von Mangoldt function in terms of zeroes of the Riemann zeta function. These results are not strictly necessary for most of the material in the rest of the course, but certainly help to clarify the nature of the Riemann zeta function and its relation to the primes.

In view of the material in previous notes, it should not be surprising that there are analogues of all of the above theory for Dirichlet -functions . We will restrict attention to primitive characters , since the -function for imprimitive characters merely differs from the -function of the associated primitive factor by a finite Euler product; indeed, if for some principal whose modulus is coprime to that of , then

(cf. equation (45) of Notes 2).

The main new feature is that the Poisson summation formula needs to be “twisted” by a Dirichlet character , and this boils down to the problem of understanding the finite (additive) Fourier transform of a Dirichlet character. This is achieved by the classical theory of Gauss sums, which we review below the fold. There is one new wrinkle; the value of plays a role in the functional equation. More precisely, we have

Theorem 3 (Functional equation for -functions)Let be a primitive character of modulus with . Then extends to an entire function on the complex plane, withor equivalently

for all , where is equal to in the even case and in the odd case , and

where is the Gauss sum

and , with the convention that the -periodic function is also (by abuse of notation) applied to in the cyclic group .
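As a numerical illustration of the Gauss sum, the following Python sketch evaluates $\tau(\chi) = \sum_{n \in {\bf Z}/q{\bf Z}} \chi(n) e(n/q)$ with $e(x) = e^{2\pi i x}$ for the quadratic (Legendre) character modulo an odd prime, which is a primitive character. The choice of character and the helper names are assumptions made for this example only; the point is that $|\tau(\chi)|^2 = q$, so that the Gauss sum has magnitude exactly $\sqrt{q}$.

```python
import cmath
import math

def legendre_character(p):
    """Quadratic character modulo an odd prime p, returned as a function."""
    def chi(n):
        n %= p
        if n == 0:
            return 0
        # Euler's criterion: n^((p-1)/2) is 1 mod p exactly for quadratic residues.
        return 1 if pow(n, (p - 1) // 2, p) == 1 else -1
    return chi

def gauss_sum(chi, q):
    """tau(chi) = sum over n mod q of chi(n) * exp(2 pi i n / q)."""
    return sum(chi(n) * cmath.exp(2j * math.pi * n / q) for n in range(q))

for p in (5, 7, 13, 19):
    tau = gauss_sum(legendre_character(p), p)
    print(p, tau, abs(tau) ** 2)
```

For these characters the sum is real when $p = 1 \hbox{ mod } 4$ (the even case) and purely imaginary when $p = 3 \hbox{ mod } 4$ (the odd case), which matches the dichotomy between the two cases in Theorem 3.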

From this functional equation and (2) we see that, as with the Riemann zeta function, the non-trivial zeroes of (defined as the zeroes within the critical strip) are symmetric around the critical line (and, if is real, are also symmetric around the real axis). In addition, acquires trivial zeroes at the negative even integers and at zero if , and at the negative odd integers if . For imprimitive , we see from (9) that also acquires some additional trivial zeroes on the left edge of the critical strip.

There is also a symmetric version of this equation, analogous to Corollary 2:

Corollary 4Let be as above, and setthen is entire with .

For further detail on the functional equation and its implications, I recommend the classic text of Titchmarsh or the text of Davenport.

** — 1. The Gamma function — **

There are many ways to define the Gamma function, but we will use the following classical definition:

Definition 5 (Gamma function)For any complex number with , the Gamma function is defined as

It is easy to see that the integrals here are absolutely convergent. One can view as the inner product between the multiplicative character and the additive character with respect to multiplicative Haar measure . As such, the Gamma function often appears as a normalisation factor in integrals that involve both additive and multiplicative characters. For instance, by a simple change of variables we see that

whenever and ; indeed, from a contour shift we see that the above identity also holds for complex with , if we use the standard interpretation of the complex exponential with positive real base. Making the further substitution and performing some additional manipulations, we see that the Gamma function is also related to integrals involving Gaussian functions, in that

for . Later on we will also need the variant

which follows from (14) by replacing with .

From Cauchy’s theorem and Fubini’s theorem one easily verifies that has vanishing contour integral on any closed contour in the half-space , and thus by Morera’s theorem is holomorphic on this half-space.

From (12) and an integration by parts we see that

for any with . Among other things, this allows us to extend meromorphically to the entire complex plane, by repeatedly using the form

of (16) as a *definition* to meromorphically extend the domain of definition of leftwards by one unit.

Exercise 6Show that for any natural number (thus , , etc.), and that has simple poles at and no further singularities. Thus one can view the Gamma function as a (shifted) generalisation of the factorial function.
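The integral definition and the leftward extension can both be tested numerically. The sketch below is an illustration only: it assumes the classical normalisation $\Gamma(s) = \int_0^\infty t^{s-1} e^{-t}\ dt$ and restricts to real $s$; the function names, the truncation range and the step count are choices made for this example. The substitution $t = e^u$ turns $dt/t$ into $du$, after which the plain trapezoid rule is already very accurate.

```python
import math

def gamma_integral(s, u_min=-40.0, u_max=6.0, steps=4600):
    """Trapezoid approximation to Gamma(s) for real s > 0.

    With t = exp(u) the integral of t^(s-1) e^(-t) dt over (0, infinity)
    becomes the integral of exp(s*u - exp(u)) du over the real line.
    """
    h = (u_max - u_min) / steps
    total = 0.0
    for k in range(steps + 1):
        u = u_min + k * h
        weight = 0.5 if k in (0, steps) else 1.0
        total += weight * math.exp(s * u - math.exp(u))
    return total * h

def gamma_extended(s):
    """Extend leftwards one unit at a time via Gamma(s) = Gamma(s+1) / s.

    Raises ZeroDivisionError at the poles s = 0, -1, -2, ...
    """
    if s > 0:
        return gamma_integral(s)
    return gamma_extended(s + 1) / s

for n in range(1, 7):
    print(n, gamma_integral(n + 1), math.factorial(n))
print(gamma_extended(-0.5), math.gamma(-0.5))
```

The first loop compares the integral at $s = n+1$ with $n!$, and the last line compares the recursively extended value at $s = -1/2$ with the library value.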

By repeating the proof of (1), we obtain the conjugation symmetry

for all outside of the poles of . Translating this to the -function (5), we see that is meromorphic, with a pole at , and that

The Gamma function is also closely connected to the beta function:

Lemma 7 (Beta function identity)One haswhenever . (Note that this hypothesis makes the integral on the left-hand side absolutely integrable.)

*Proof:* From (12) and Fubini’s theorem one has

Making the change of variables for and (and using absolute integrability to justify this change of variables), the right-hand side becomes

and the claim follows from another appeal to (12) and Fubini’s theorem.

This gives an important reflection formula:

Lemma 8 (Euler reflection formula)One hasas meromorphic functions (that is to say, the identity holds outside of the poles of the left or right-hand sides, which occur at the integers). In particular, has no zeroes in the complex plane.

Note that the reflection formula, when written in terms of the -function (5), is simply

for . In particular, has a zero at and no other zeroes. Note that (19) is consistent with the functional equations (4), (3).
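The reflection formula $\Gamma(s) \Gamma(1-s) = \frac{\pi}{\sin(\pi s)}$ is easy to spot-check at real non-integer points with the standard library. This is only a sanity check at a handful of sample points (chosen arbitrarily here), not a substitute for the proof.

```python
import math

def reflection_gap(s):
    """Gamma(s) * Gamma(1 - s) - pi / sin(pi * s) for real non-integer s."""
    return math.gamma(s) * math.gamma(1 - s) - math.pi / math.sin(math.pi * s)

for s in (0.1, 0.25, 0.5, 0.9, -1.3, 2.7):
    print(s, reflection_gap(s))

# At s = 1/2 the formula reads Gamma(1/2)^2 = pi.
gamma_half_squared = math.gamma(0.5) ** 2
print(gamma_half_squared, math.pi)
```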

*Proof:* By unique continuation of meromorphic functions, it suffices to verify this identity in the critical strip . By the beta function identity (and the value ), it thus suffices to show that

for in the critical strip.

If we make the substitution , so that , we have

We extend the function to the complex plane (excluding the origin) by the formula , where is the branch of the complex logarithm whose imaginary part lies in the half-open interval . This agrees with the usual power function at (or infinitesimally above) the positive real axis, but instead converges to infinitesimally below this axis. Thus, if one lets be a contour that loops clockwise around the positive real axis, and stays sufficiently close to this axis, we see (using Cauchy’s theorem to justify the passage from infinitesimal neighbourhoods of the real axis to non-infinitesimal ones, and using the hypothesis to handle the contributions near the origin and infinity) that

On the other hand, outside of the non-negative real axis, is meromorphic, with a simple pole at of residue , and decays faster than at infinity. From the residue theorem we then have

and the claim then follows by putting the above identities together.

As a quick application of Lemma 8, if we set $s = 1/2$ and observe that $\Gamma(1/2)$ is clearly positive, we have

$$\Gamma(1/2) = \sqrt{\pi}$$

and thus (by (14)) we recover the classical Gaussian identity

Next, we give an alternate definition of the Gamma function:

Lemma 9 (Euler form of Gamma function)If is not a pole of (i.e., ), then

*Proof:* It is easy to verify the second identity, and that the product and limit are convergent. One also easily verifies that the expression obeys (16), so it will suffice to establish the claim when .

We use a trick previously employed to prove Lemma 40 of Notes 1. By (12) and the dominated convergence theorem, we have

But by Lemma 7 and a change of variables we have

From (16) one has and , and the claim follows (recall that is never zero).
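To see the convergence in Lemma 9 in practice, one can evaluate partial products. The sketch below assumes the product form $\Gamma(s) = \frac{1}{s} \prod_{n=1}^\infty \frac{(1+1/n)^s}{1+s/n}$ of Euler's formula and works with real $s$ only; the function name and the truncation level are choices made for this illustration. The convergence is slow (the relative error decays only like the reciprocal of the number of factors), so this is a consistency check rather than a practical way to compute $\Gamma$.

```python
import math

def gamma_euler_product(s, terms=100000):
    """Partial product (1/s) * prod_{n <= terms} (1 + 1/n)^s / (1 + s/n), real s."""
    value = 1.0 / s
    for n in range(1, terms + 1):
        value *= (1.0 + 1.0 / n) ** s / (1.0 + s / n)
    return value

for s in (0.5, 2.5, -1.5):
    print(s, gamma_euler_product(s), math.gamma(s))
```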

Exercise 10 (Weierstrass form of Gamma function) If is not a pole of , show thatwhere is the Euler constant, with the product being absolutely convergent. (Hint: you may need Lemma 40 from Notes 1.)

Exercise 11 (Digamma function)Define the digamma function to be the logarithmic derivative of the Gamma function. Show that the digamma function is a meromorphic function, with simple poles of residue at the non-positive integers and no other poles, and thatfor outside of the poles of , with the sum being absolutely convergent. Establish the reflection formula

for non-integer .

Exercise 12Show that .

Exercise 13 (Legendre duplication formula) Show thatwhenever is not a pole of . (Hint: using the digamma function, show that the logarithmic derivatives of both sides differ by a constant. Then test the formula at two values of to verify that the normalising factor of is correct.)

Exercise 14 (Gauss multiplication theorem)For any natural number , establish the multiplication theoremwhenever is not a pole of .
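Before attempting Exercises 13 and 14, it can be reassuring to test the identities numerically. The sketch below uses one common normalisation of the multiplication theorem, namely $\prod_{j=0}^{k-1} \Gamma(s + \frac{j}{k}) = (2\pi)^{(k-1)/2} k^{1/2 - ks} \Gamma(ks)$; this may differ from the form intended in the exercise by the substitution $s \mapsto s/k$, so treat the exact shape as an assumption. The case $k=2$ is the duplication formula.

```python
import math

def multiplication_sides(s, k):
    """Both sides of prod_j Gamma(s + j/k) = (2 pi)^((k-1)/2) k^(1/2 - k s) Gamma(k s)."""
    lhs = math.prod(math.gamma(s + j / k) for j in range(k))
    rhs = (2 * math.pi) ** ((k - 1) / 2) * k ** (0.5 - k * s) * math.gamma(k * s)
    return lhs, rhs

for k in (2, 3, 5):
    for s in (0.3, 0.7, 1.9):
        lhs, rhs = multiplication_sides(s, k)
        print(k, s, lhs, rhs)
```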

Exercise 15 (Bohr-Mollerup theorem)Establish the Bohr-Mollerup theorem: the function , which is the Gamma function restricted to the positive reals, is the unique log-convex function on the positive reals with and for all .

Now we turn to the question of asymptotics for . We begin with the corresponding asymptotics for the digamma function . Recall (see Exercise 11 from Notes 1) that one has

for any real and any continuously differentiable functions . This gives

for in a sector of the form for some fixed (that is, makes at least a fixed angle with the negative real axis), where and are the standard branches of the argument and logarithm respectively. From Exercise 11, we obtain the asymptotic

in this regime. (For the other values of , one can use the reflection formula (21) to obtain an analogous asymptotic.) Actually, it will be convenient to sharpen this approximation a bit, using the following version of the trapezoid rule:

Exercise 16 (Trapezoid rule) Let be distinct integers, and let be a continuously twice differentiable function. Show that(Hint: first establish the case when .)

From this exercise, we obtain a sharper estimate

in the region where . Integrating this, we obtain a branch of the logarithm of with

for some absolute constant . To find this constant , we apply the reflection formula (Lemma 8) and and conclude that

for . Since (up to multiples of )

and

we conclude that is equal to up to multiples of ; but as is positive on the positive reals, we can normalise so that , thus we obtain the Stirling approximation

In particular, we have the approximation

in this region. For the sake of comparison, note that

in this region (note this is consistent with the reflection formula, Lemma 8, as well as the duplication formula, Exercise 13).
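The Stirling approximation can be checked against the library log-Gamma function on the positive real axis. The sketch below assumes the standard form $\log \Gamma(s) = (s - \frac{1}{2}) \log s - s + \frac{1}{2} \log 2\pi + O(1/|s|)$, and also prints the error multiplied by $s$, which by the classical Stirling series should approach $1/12$.

```python
import math

def stirling_log_gamma(s):
    """Main terms (s - 1/2) log s - s + (1/2) log(2 pi) of log Gamma(s), real s > 0."""
    return (s - 0.5) * math.log(s) - s + 0.5 * math.log(2 * math.pi)

for s in (5.0, 50.0, 500.0):
    error = math.lgamma(s) - stirling_log_gamma(s)
    # The O(1/|s|) error is visible directly; error * s tends to 1/12.
    print(s, error, error * s)
```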

Exercise 17When with and , show thatand

for a suitable choice of branch of ; equivalently, using the notation , one has

Also show that the error in (25) is real-valued when , so that

Exercise 20Using the trapezoid rule, show that for any in the region with , there exists a unique complex number for which one has the asymptoticfor any natural number , where . Use this to extend the Riemann zeta function meromorphically to the region . Conclude in particular that .

Exercise 21Obtain the refinementto the trapezoid rule when are integers and is continuously three times differentiable. Then show that for any in the region with , there exists a unique complex number for which one has the asymptotic

for any natural number , where . Use this to extend the Riemann zeta function meromorphically to the region . Conclude in particular that ; this is a rigorous interpretation of the infamous formula

Remark 22One can continue this procedure to extend meromorphically to the entire complex plane by using the Euler-Maclaurin formula; see this previous blog post. However, we will not pursue this approach to the meromorphic continuation of zeta further here.
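As an illustration of the Euler-Maclaurin approach to continuation, here is a small Python sketch. It assumes the standard truncated Euler-Maclaurin expansion of $\sum_{n \geq N} n^{-s}$ with the Bernoulli numbers $B_2, \dots, B_{10}$ hard-coded, so it is only reliable for moderately sized $s$; the function names and the cutoff $N = 20$ are choices made for this example. It reproduces $\zeta(0) = -\frac{1}{2}$, $\zeta(-1) = -\frac{1}{12}$ and the trivial zero at $-2$, and spot-checks the functional equation in the classical form $\zeta(s) = 2^s \pi^{s-1} \sin(\frac{\pi s}{2}) \Gamma(1-s) \zeta(1-s)$ at real $s$ (the normalisation used in Theorem 1 may be arranged differently).

```python
import math

# B_2, B_4, B_6, B_8, B_10
BERNOULLI_EVEN = (1 / 6, -1 / 30, 1 / 42, -1 / 30, 5 / 66)

def zeta_euler_maclaurin(s, N=20):
    """Approximate zeta(s), s != 1, by a truncated Euler-Maclaurin expansion."""
    s = complex(s)
    total = sum(n ** (-s) for n in range(1, N))
    total += N ** (1 - s) / (s - 1) + 0.5 * N ** (-s)
    rising = s        # s (s+1) ... (s+2k-2), starting with k = 1
    factorial = 2.0   # (2k)!, starting with k = 1
    for k, bernoulli in enumerate(BERNOULLI_EVEN, start=1):
        total += bernoulli / factorial * rising * N ** (-s - 2 * k + 1)
        rising *= (s + 2 * k - 1) * (s + 2 * k)
        factorial *= (2 * k + 1) * (2 * k + 2)
    return total

def functional_equation_gap(s):
    """zeta(s) - 2^s pi^(s-1) sin(pi s / 2) Gamma(1-s) zeta(1-s), for real s."""
    factor = (2 ** s * math.pi ** (s - 1)
              * math.sin(math.pi * s / 2) * math.gamma(1 - s))
    return zeta_euler_maclaurin(s) - factor * zeta_euler_maclaurin(1 - s)

print(zeta_euler_maclaurin(0))    # close to -1/2
print(zeta_euler_maclaurin(-1))   # close to -1/12
print(zeta_euler_maclaurin(-2))   # close to 0 (a trivial zero)
print(abs(functional_equation_gap(0.3)), abs(functional_equation_gap(-1.5)))
```

At $s = 0, -1, -2$ the expansion terminates, so the values there are exact up to rounding; elsewhere the truncation error is controlled by the first omitted Bernoulli term.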

** — 2. The functional equation — **

We now give three different (although not wholly unrelated) proofs of the functional equation, Theorem 1.

The first proof (due to Riemann) relies on a relationship between the Dirichlet series

of an arithmetic function , and the Taylor series

Given that both of the transforms and are linear and (formally, at least) injective, it is not surprising that there should be some linear relationship between the two. It turns out that we can use the Gamma function to mediate such a relationship:

Lemma 23 (Dirichlet series from power series)Let be an arithmetic function such that as . Then for any complex number with , we havewhere is the Taylor series (26), which is absolutely convergent in the unit disk . The integral on the right-hand side is absolutely integrable.

*Proof:* From (13) we have

for any natural number . Multiplying by , summing, and using Fubini’s theorem, we conclude that

and the claim follows. (By restricting to the case when is real and is non-negative, we can see that all integrals here are absolutely integrable.)
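In the simplest case $f(n) = 1$, the power series is the geometric series $\sum_{n \geq 1} e^{-nt} = \frac{1}{e^t - 1}$, and Lemma 23 becomes the classical identity $\Gamma(s) \zeta(s) = \int_0^\infty \frac{t^{s-1}}{e^t - 1}\ dt$ for $\hbox{Re}(s) > 1$. The following sketch checks this numerically at $s = 2$ and $s = 4$, using the known values $\zeta(2) = \pi^2/6$ and $\zeta(4) = \pi^4/90$; the quadrature (trapezoid rule after the substitution $t = e^u$) and its parameters are assumptions of this example rather than anything used in the proof.

```python
import math

def mellin_of_geometric_series(s, u_min=-40.0, u_max=5.0, steps=4500):
    """Trapezoid approximation to the integral of t^(s-1) / (e^t - 1) over t > 0.

    Real s > 1 only.  With t = exp(u), the integrand becomes
    exp(s * u) / expm1(exp(u)) with respect to du.
    """
    h = (u_max - u_min) / steps
    total = 0.0
    for k in range(steps + 1):
        u = u_min + k * h
        weight = 0.5 if k in (0, steps) else 1.0
        total += weight * math.exp(s * u) / math.expm1(math.exp(u))
    return total * h

lhs_2 = math.gamma(2) * math.pi ** 2 / 6    # Gamma(2) * zeta(2)
lhs_4 = math.gamma(4) * math.pi ** 4 / 90   # Gamma(4) * zeta(4)
print(lhs_2, mellin_of_geometric_series(2.0))
print(lhs_4, mellin_of_geometric_series(4.0))
```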

Specialising to the case , so that , we obtain the identity

for , which can be compared with (12). Now we recall the contour introduced in the proof of Lemma 8, which goes around the positive real axis in the clockwise direction. As in the proof of that lemma, we see that

for sufficiently close to the real axis (specifically, it has to not wind around any of the zeroes of other than ), where we use the branch as in the proof of Lemma 8. Thus we have

for with non-integer (to avoid the zeroes of ).

The contour integral is in fact absolutely convergent for *any* , and from the usual argument involving the Cauchy, Fubini, and Morera theorems we see that this integral depends holomorphically on . Thus, we can use (28) as a *definition* for the Riemann zeta function that extends it meromorphically to the entire complex plane with no further poles (note that has no zeroes to the left of the critical strip, after removing all singularities).

Now suppose that we are in the region , with not an integer. For any natural number , we shift the contour to the rectangular contour , which starts at , goes leftwards to , then upwards to , then rightwards to . As has simple poles at for each non-zero integer with residue , we see from the residue theorem (and the exponential decay of as goes to infinity to the right) that

If , then one can compute that the integral goes to zero as , and thus

From the choice of branch for , one sees that

Inserting these identities into (28), we obtain (4) after a brief calculation, at least in the region when and is not an integer; the remaining cases then follow from unique continuation of meromorphic functions.

Remark 24The Poisson summation formula was not explicitly used in the above proof of the functional equation. However, if one inspects the contour integration proof of the Poisson summation formula in Supplement 2, one sees an application of the residue theorem which is quite similar to that in the above argument, and so that formula is still present behind the scenes.

Now we give Riemann’s second proof of the functional equation. We again start in the region . If we repeat the derivation of (27), but use (14) in place of (13), we obtain the variant identity

Introducing the theta function

in the half-plane and using symmetry, we thus see that

Making the change of variables , this becomes

Recall from the Poisson summation formula that

for in the upper half-plane, using the principal branch of the square root; see Exercise 36 of . In particular, blows up like as . We use this formula to transform the previous integral to an integral just on rather than . First observe that

Next, from (30) (using the hypothesis to ensure absolute convergence) and the change of variables we have

Finally

Putting all this together, we see that

Note from the definition of the theta function that decays exponentially fast as . As such, the integral in the right-hand side is absolutely convergent for any , and by the usual Morera theorem argument is in fact holomorphic in . Thus (31) may be used to give a meromorphic extension of to the entire complex plane. The right-hand side of (31) is also manifestly symmetric with respect to the reflection , giving the functional equation in the form (7).
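The modularity relation for the theta function that drives this proof is easy to test on the positive imaginary axis. The sketch below assumes the normalisation $\theta(z) = \sum_{n \in {\bf Z}} e^{\pi i n^2 z}$, so that on the imaginary axis the Poisson summation identity reads $\theta(i/y) = \sqrt{y}\, \theta(iy)$ for $y > 0$; the function name and the truncation of the series are choices made for this illustration.

```python
import math

def theta_on_imaginary_axis(y, terms=50):
    """theta(i y) = sum over all integers n of exp(-pi n^2 y), for y > 0."""
    tail = sum(math.exp(-math.pi * n * n * y) for n in range(1, terms + 1))
    return 1.0 + 2.0 * tail

for y in (0.1, 0.5, 2.0):
    left = theta_on_imaginary_axis(1.0 / y)
    right = math.sqrt(y) * theta_on_imaginary_axis(y)
    print(y, left, right)

# Blow-up near the origin: sqrt(y) * theta(i y) tends to 1 as y -> 0.
print(math.sqrt(0.01) * theta_on_imaginary_axis(0.01))
```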

Next, we give a short heuristic proof of the functional equation arising from formally applying the Poisson summation formula (8) to the function

ignoring all infinite divergences. Formally, the Fourier transform is then given by

thanks to (13). The Poisson summation formula (8) then *formally* yields

and the functional equation (3) formally follows after some routine calculation if we discard the divergent term on the right-hand side, convert the sum over negative to a sum over positive by the change of variables , and formally identify and with respectively.

The above heuristic argument may be made rigorous by using suitable regularisations. This is the purpose of the exercise below.

Exercise 25 (Rigorous justification of functional equation)Let be an element of the critical strip .

- (i) For any , show that the function defined by is continuous, absolutely integrable, and has Fourier transform
for , using the standard branch of the logarithm to define .

- (ii) Rigorously justify the Poisson summation formula
for any . (In Supplement 2, the Poisson summation formula was only established for continuously twice differentiable, compactly supported functions; is neither of these, but one can still recover the formula in this instance by an approximation argument.)

- (iii) Show that as .
- (iv) Show that
and

as , for either choice of sign .

- (v) Prove (3) for in the critical strip, and then prove the rest of Theorem 1.

Remark 26 There are many further proofs of the functional equation beyond the three given above; see for instance the text of Titchmarsh for several further proofs. Most of the proofs can be connected in one form or another to the Poisson summation formula. One important proof worth mentioning is Tate’s adelic proof, discussed in this previous post, which is well suited for generalising the functional equation to many other zeta functions and -functions, but will not be discussed further in this post.

Exercise 27Use the formula from Exercise 21, together with the functional equation, to show that .

Exercise 28 (Relation between zeta function and Bernoulli numbers)In this exercise we give the classical connection between the zeta function and Bernoulli numbers; this connection is not so relevant for analytic number theory, as it only involves values of the zeta function that are far from the critical strip, but is of interest for some other applications.

- (i) For any complex number with , use the Poisson summation formula (8) to establish the identity
- (ii) For as above and sufficiently small, show that
Conclude that

for any natural number , where the Bernoulli numbers are defined through the Taylor expansion

Thus for instance , , and so forth.

- (iii) Show that
for any odd natural number . (This identity can also be deduced from the Euler-Maclaurin formula, which generalises the approach in Exercise 21; see this previous post.)

- (iv) Use (28) and the residue theorem (now working inside the contour , rather than outside) to give an alternate proof of (32).
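The relation between even zeta values and Bernoulli numbers can be checked with exact rational arithmetic. The sketch below assumes the convention $\frac{x}{e^x - 1} = \sum_{n=0}^\infty B_n \frac{x^n}{n!}$ (so that $B_1 = -\frac{1}{2}$) and the classical formula $\zeta(2k) = \frac{(-1)^{k+1} (2\pi)^{2k} B_{2k}}{2 (2k)!}$, and compares against a partial sum of the Dirichlet series with a two-term tail correction; the function names and the cutoff are choices made for this example.

```python
from fractions import Fraction
from math import comb, factorial, pi

def bernoulli_numbers(count):
    """B_0, ..., B_{count-1}, from sum_{j <= m} C(m+1, j) B_j = 0 for m >= 1."""
    B = [Fraction(1)]
    for m in range(1, count):
        B.append(-sum(comb(m + 1, j) * B[j] for j in range(m)) / (m + 1))
    return B

def zeta_even_from_bernoulli(k, B):
    """zeta(2k) via (-1)^(k+1) (2 pi)^(2k) B_{2k} / (2 (2k)!)."""
    return (-1) ** (k + 1) * float(B[2 * k]) * (2 * pi) ** (2 * k) / (2 * factorial(2 * k))

def zeta_partial_sum(s, N=1000):
    """Partial sum of n^(-s) for n < N plus the leading tail terms, integer s >= 2."""
    head = sum(n ** -s for n in range(1, N))
    return head + N ** (1 - s) / (s - 1) + 0.5 * N ** -s

B = bernoulli_numbers(12)
print(B[:7])
for k in range(1, 5):
    print(2 * k, zeta_even_from_bernoulli(k, B), zeta_partial_sum(2 * k))

# The value at -1 predicted by zeta(-n) = -B_{n+1} / (n+1) with n = 1.
print(-B[2] / 2)
```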

Exercise 29Show that .

Remark 30The functional equation is almost certainly not sufficient, by itself, to establish the Riemann hypothesis. For instance, there is a classical example of Davenport and Heilbronn of a finite linear combination of Dirichlet L-functions which obeys a functional equation very similar (though not quite identical) to (4), but which possesses zeroes off of the critical line; see e.g. this article for a recent analysis of the counterexample. Eisenstein series can also be used to construct a “natural” variant of a zeta function that has a Dirichlet series and a functional equation, but has zeroes off the critical line. For a “cheaper” counterexample, take two nearby non-trivial zeroes of on the critical line, and “replace” them with two other nearby complex numbers symmetric around the critical line, but not on the line, by introducing the modified zeta functionThis function also obeys the functional equation, and behaves very similarly (though not identically) to the Riemann zeta function in all the regions in which we have a good understanding of this function (in particular, it has similar behaviour to around or around the edges of the critical strip), but clearly has zeroes off of the critical line. Such constructions would be particularly hard to exclude by analytic methods if there happened to be a repeated zero of on the critical line, as one could then make extremely close to both of ; it is conjectured that such a repeated zero does not occur, but we cannot exclude this possibility with current technology, which creates a family of “infinitesimal counterexamples” to the Riemann hypothesis which rules out a large number of potential approaches to this hypothesis.

On the other hand, unlike the Davenport-Heilbronn counterexample, does not arise from a Dirichlet series, and certainly does not have an Euler product. One can show (see Section 2.13 of Titchmarsh) that if one insists on the functional equation (4) on the nose (as opposed to, say, the modified functional equation that the Davenport-Heilbronn example obeys, or the functional equation obeyed by a Dirichlet -function) as well as a Dirichlet series representation, then the only possible functions available are scalar multiples of the Riemann zeta function. It could well be that the analogue of the Riemann hypothesis is in fact obeyed by any function which obeys a suitable functional equation, *together with* a Dirichlet series representation (with appropriate size bounds on the coefficients) *and* an Euler product factorisation; a precise form of this statement is the Riemann hypothesis for the Selberg class. But one would somehow need to make essential use of *all three* of the above axioms to try to prove the Riemann hypothesis, as we have numerous counterexamples that show that zeroes can be produced off the critical line if one drops one or more of these axioms.

** — 3. Approximate and localised forms of the functional equation — **

In our construction of the Riemann zeta function in Notes 2, we had the asymptotic

for and in the region . Thus, is the limit of the functions as , locally uniformly for in this region. We have an analogous limit for smoothed sums:

Exercise 31 (Smoothed sums)Let be a smooth function such that vanishes for and equals for for some constant . Show that the functionsconverge locally uniformly to on the region .

It is of interest to understand the rate of convergence of these approximations to the zeta function. We restrict attention to in the critical strip . The first observation is that the smoothed sums have negligible contribution once is much larger than :

Lemma 32Let be a smooth, compactly supported function. Let lie in the critical strip. If for some sufficiently large (depending on the support of ), then we havefor any .

*Proof:* By the Poisson summation formula (8), we have

where

If we write and , then after a change of variables we have

where and . In particular we have

so it suffices by the triangle inequality to show that

for any and any non-zero integer . But by hypothesis on , we see that we have the derivative bounds on the support of the smooth compactly supported function . If one repeatedly writes and integrates by parts to move the derivative off of the phase, one obtains the claim.

This gives us a good approximation to in the critical strip, involving a smoothed sum consisting of terms:

Exercise 33 Let be a smooth function such that vanishes for and equals for for some constant . Let in the critical strip. Show thatfor any , if one has for some sufficiently large depending on . (Hint: use Lemma 5 of Notes 1, Exercise 31, and dyadic decomposition.) Conclude in particular that

We remark that the asymptotic (34) is also valid for the ordinary partial sums (a classical result of Hardy and Littlewood); see Theorem 4.11 of Titchmarsh. However, it will be slightly more convenient for us here to work exclusively with smoothed sums.

From (34) and the triangle inequality, we have the crude bound

in the interior of the critical strip. One can do better through the functional equation. Indeed, from (4), (23), (24) we see that

One can then use the Hadamard three lines theorem to interpolate between (35) and (37) to obtain the *convexity bound*

for any and ; we leave the details to the interested reader (and we will reprove the convexity bound shortly). Further improvements to (38) for the zeta function and other -functions are known as *subconvexity bounds* and have many applications in analytic number theory, though we will only discuss the simplest subconvexity bounds in this course.

Exercise 33 describes the zeta function in terms of smoothed sums of . In the converse direction, one can use Fourier inversion to express smoothed sums of in terms of the zeta function:

Lemma 34 (Fourier inversion) Let be a smooth, compactly supported function, and let lie in the critical strip. Then for any , we have for all , where

is the Fourier transform of .

*Proof:* We can write the left-hand side of (39) as , where . By Proposition 7 of Notes 2, this can be rewritten as

Noting that

we thus rewrite the left-hand side of (39) as the contour integral

The function has a pole at with residue , which by Exercise 28 of Supplement 2 is of size for any . By another appeal to that exercise, together with (35), we see that goes to zero as uniformly when is bounded. By the residue theorem, we can thus shift the contour integral to

and the claim follows by performing the substitution .

Exercise 35Establish (39) directly from the Fourier inversion formula, without invoking contour integration methods.

Among other things, this lemma shows that growth bounds in the Riemann zeta function are equivalent to growth bounds on smooth exponential sums of :

Exercise 36 Let and . Show that the following claims are equivalent:

- (i) One has as .
- (ii) One has the bound
whenever , , and is a compactly supported smooth function.

Exercise 37 For any , let denote the least exponent for which one has the asymptotic as .

- (i) Show that is convex and obeys the functional equation for .
- (ii) Show that for all , and that for or . (In particular, this reproves (38).)
- (iii) Show that the Lindelöf hypothesis (Exercise 34 from Notes 2) is equivalent to the assertion that for all .

Lemma 34 and Theorem 1 suggest that there should be some approximate functional equation for the smoothed sums . This is indeed the case:

Theorem 38 (Approximate functional equation for smoothed sums) Let with and . Let be such that . Let be a smooth compactly supported function. Then for any , where is as in (5).

This approximate functional equation can also be established directly from the Poisson summation formula using the method of stationary phase; see Chapter 4 of Titchmarsh. The error term of can be improved further by using better growth bounds on (or by further Taylor expansion of ), but the error term given here is adequate for applications. Note that the true functional equation (3) is formally the case of (40) if one ignores the error term.

*Proof:* By (39), the left-hand side is

up to negligible errors. Using the rapid decrease of (Exercise 28 of Supplement 2) and (35), we may restrict to the range , up to negligible error. Applying the functional equation (3), we rewrite this as

For , we see from Exercise 17 that

and thus from the fundamental theorem of calculus we have

or equivalently (using )

We can thus write the left-hand side of (40) up to acceptable errors as

From Exercise 17 we have . From (38) and the rapid decrease of , the contribution of the error term can then be controlled by . Thus, up to acceptable errors, (40) is equal to

By another appeal to (38) and the rapid decrease of (and the growth bound on ), we may remove the constraint . The claim then follows by changing to and using (39) again.

Let with and , and let be as in Exercise 33. From (34) we have

for any , and from the functional equation we have

Using Theorem 38, we may split the difference:

Exercise 39 (Approximate functional equation) Let be a smooth function such that vanishes for and equals for for some constant , and let . Let with and . Let be such that . Show that for all , where .

One can also obtain a version of this equation using partial sums instead of smoothed sums, but with slightly worse error terms, known as the Riemann-Siegel formula; see e.g. Theorem 4.13 of Titchmarsh. Setting , we see that we may now approximate by smoothed sums consisting of about terms, improving upon the sum with terms appearing in Exercise 33. Using the triangle inequality, this gives a slight improvement to (38), namely that

whenever and . The equation (41) is particularly useful for getting reasonably good bounds on ; we will see an example of this in subsequent notes.

** — 4. Further applications of the functional equation — **

One basic application of the functional equation is to improve the control on zeroes of the Riemann zeta function, beyond what was obtained in Notes 2.

From Exercise 37, we now have the crude bounds

and in particular

whenever and . The Jensen formula argument from Proposition 16 of Notes 2 is no longer restricted to the region , and shows that there are zeroes of the Riemann zeta function in any disk of the form . Similarly, Proposition 19 of Notes 2 extends to give the formula

whenever with . Corollary 20 of Notes 2 also extends to show that

We can say more about the zeroes. For any , let denote the number of zeroes of in the rectangle . (If there were zeroes of on the interval , they should each count for towards , but it turns out (as can be computationally verified) that there are no such zeroes.) Equivalently, is the number of zeroes in . We have the following asymptotic for , conjectured by Riemann and established by von Mangoldt:

Theorem 40 (Riemann-von Mangoldt formula) For (say), we have .

*Proof:* We use the Riemann function, whose zeroes are the non-trivial zeroes of . From (6), (43), (22) one has

and so by the pigeonhole principle we can find in and respectively such that

in particular, the line segments do not meet any of the zeroes of or . We will show for either choice of sign that the rectangle contains zeroes, which gives the claim since and only differ by .

By the residue theorem (or the argument principle), the number of zeroes in this rectangle is equal to times the contour integral of anticlockwise around the boundary of the rectangle. By (44), the contribution of the upper and lower edges of this contour are ; from the functional equation (Corollary 2) we see that the contribution of the left and right edges of the contour are the same, and from conjugation symmetry we see that the contribution of the upper half of the right edge is the complex conjugate of that of the lower half. Putting all this together, we see that it suffices to show that

or equivalently (after removing the integral from to , which is ), that

for . Since is given by a Dirichlet series that is uniformly bounded on , we have

and the claim follows.
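To get a feel for the Riemann-von Mangoldt formula, one can compare the actual zero count with the main term at small heights. The sketch below assumes the theorem takes its classical form $N(T) = \frac{T}{2\pi} \log \frac{T}{2\pi} - \frac{T}{2\pi} + O(\log T)$, and hard-codes the imaginary parts of the first ten non-trivial zeroes from standard tables (rounded to six decimals). The names `ZERO_HEIGHTS`, `count_zeros` and `main_term` are illustrative only, and the count is only meaningful for $T \leq 50$, since the list stops there.

```python
import math

# Imaginary parts of the first ten non-trivial zeroes in the upper half-plane,
# rounded to six decimals (standard tabulated values; the next zero is above 50).
ZERO_HEIGHTS = [
    14.134725, 21.022040, 25.010858, 30.424876, 32.935062,
    37.586178, 40.918719, 43.327073, 48.005151, 49.773832,
]


def count_zeros(T):
    """N(T) for 0 < T <= 50, read off from the hard-coded list."""
    if T > 50:
        raise ValueError("the hard-coded list only covers heights up to 50")
    return sum(1 for gamma in ZERO_HEIGHTS if gamma <= T)


def main_term(T):
    """Assumed main term (T / 2 pi) log(T / 2 pi) - T / 2 pi."""
    x = T / (2 * math.pi)
    return x * math.log(x) - x


if __name__ == "__main__":
    for T in (15, 20, 30, 40, 50):
        print(T, count_zeros(T), round(main_term(T), 3))
```

At these heights the discrepancy stays well below $\log T$, consistent with (though of course not a proof of) the error term in the theorem.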

Exercise 41 Establish the more precise formula whenever and the line avoids all zeroes of , where , and the logarithm is extended leftwards from the region , thus

Theorem 40 then asserts that for all . It is in fact conjectured that as , but this problem has resisted solution for over a century (although it is known that this bound would follow from powerful hypotheses such as the Lindelöf hypothesis).

Remark 42 In principle, can be numerically computed exactly for any , as long as the line has no zeroes of , by evaluating the contour integral of to sufficiently high accuracy. Similarly, one can also obtain a numerical lower bound for the number of zeroes of on the critical line by finding sign changes for the function , which is real-valued on the critical line. If the zeroes are all simple and on the critical line, then this (in principle) allows one to numerically verify the Riemann hypothesis up to height . In practice, faster methods for numerically verifying the Riemann hypothesis (e.g. based on the Riemann-Siegel formula) are available.

We can now obtain global explicit formulae for the log-derivatives of and . Since has zeroes only at the non-trivial zeroes of , and no poles, one heuristically expects a relationship of the form

Unfortunately, the right-hand side is divergent; but we can normalise it by considering the sum

Exercise 43 Show that the sum converges locally uniformly to a meromorphic function away from the non-trivial zeroes of , with an entire function (after removing all singularities). Also establish the bounds for all in the complex plane. (Hint: you will need to use the Riemann-von Mangoldt formula.)

From (6), (42), (22) one also has

for all in the strip ; from (22) and the boundedness of in the region one sees that this bound also holds with , and from the functional equation we see that it also holds for . In particular, with as in the preceding exercise, we see that the entire function is bounded by on the entire complex plane. By the generalised Cauchy integral formula (Exercise 9 of Supplement 2) applied to a disk of radius we conclude that has derivative for any , and by sending we conclude that this function is constant; thus we have the representation

for some absolute constant , and all away from the non-trivial zeroes of . From (6) and Exercise 11, we conclude the representation

The exact values of are not terribly important for applications, but can be computed explicitly:

Exercise 44 By inspecting both sides of the above equations as , show that , and hence .

By inserting (45) into Perron’s formula (Exercise 11 of Notes 2), we obtain the Riemann-von Mangoldt explicit formula for the von Mangoldt summatory function:

Exercise 45 (Riemann-von Mangoldt explicit formula) For any non-integer , show that

Conclude that

This is an exact counterpart of the truncated explicit formula (Theorem 21 of Notes 2), although in many applications the truncated formula is a little bit more convenient to use; the untruncated formula supplies all of the “lowest order terms”, but these terms are destined to be absorbed into error terms in most applications anyway.

We similarly have a global smoothed explicit formula, refining Exercise 22 from Notes 2:

Exercise 46 (Smoothed explicit formula) Let be a smooth function, supported on the positive real axis. Show that

with the sums being absolutely convergent. Conclude that

whenever is a smooth function, supported in .

** — 5. The functional equation for Dirichlet -functions — **

Now we turn to the functional equation for Dirichlet -functions , Theorem 3. Henceforth is a primitive character of modulus . To obtain the functional equation for , we will need a twisted version of the Poisson summation formula (8). The key to performing the twist is the following expression for the additive Fourier coefficients of in the group :

Lemma 47 Let , and let be a primitive character of modulus . Then for any , we have where we abuse notation by viewing the -periodic functions and as functions on , and the Gauss sum is defined by the formula (11).

*Proof:* The formula (46) is trivial when , and by making the substitution we see that it is also true when is coprime to . Now suppose that shares a common factor with ; then the right-hand side of (46) vanishes. Writing , we see that is periodic with period , and in particular is invariant with respect to multiplication by any invertible element of whose projection to is one; thus the left-hand side of (46) is invariant under multiplication by . We are thus done unless for all invertible that project down to one on . But then factors as the product of a character of modulus and a principal character, contradicting the hypothesis that is primitive.
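The identity in Lemma 47 is easy to test numerically on a small example. The sketch below assumes that (46) has the standard form $\sum_{n \in {\bf Z}/q{\bf Z}} \chi(n) e(an/q) = \overline{\chi(a)} \tau(\chi)$ with $e(x) = e^{2\pi i x}$ and $\tau(\chi) = \sum_n \chi(n) e(n/q)$ (the displays are not reproduced above), and uses the order four character modulo $5$ that sends the generator $2$ to $i$; this character is primitive because $5$ is prime and the character is non-principal. All names in the code are illustrative.

```python
import cmath
import math

Q = 5
# Discrete logarithms to base 2 modulo 5: 2^0 = 1, 2^1 = 2, 2^2 = 4, 2^3 = 3.
DLOG = {1: 0, 2: 1, 4: 2, 3: 3}


def chi(n):
    """Order-4 Dirichlet character modulo 5 with chi(2) = i."""
    n %= Q
    return 0j if n == 0 else 1j ** DLOG[n]


def e(x):
    """The additive character e(x) = exp(2 pi i x)."""
    return cmath.exp(2j * math.pi * x)


def gauss_sum():
    return sum(chi(n) * e(n / Q) for n in range(Q))


def twisted_sum(a):
    """Left-hand side of the assumed form of (46)."""
    return sum(chi(n) * e(a * n / Q) for n in range(Q))


if __name__ == "__main__":
    tau = gauss_sum()
    for a in range(Q):
        print(a, twisted_sum(a), chi(a).conjugate() * tau)
    print(abs(tau), math.sqrt(Q))  # magnitude of the Gauss sum vs sqrt(q)
```

The case $a = 0$ is included on purpose: both sides vanish there, which is the "trivial" case mentioned at the start of the proof.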

The Gauss sum , being an inner product between a multiplicative character and an additive character , is analogous to the Gamma function , which is also an inner product between a multiplicative character and an additive character . This analogy can be deepened by working in Tate’s adelic formalism, but we will not do so here. But we will give one further demonstration of the analogy between Gauss sums and Gamma functions:

Exercise 48 (Jacobi sum identity) Let be Dirichlet characters modulo , such that are all non-principal. By computing the sum in two different ways, establish the Jacobi sum identity

This should be compared with the beta function identity, Lemma 7.
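The Jacobi sum identity can also be sanity-checked by machine. The sketch below assumes the identity takes the classical form $\sum_{n} \chi_1(n) \chi_2(1-n) = \frac{\tau(\chi_1) \tau(\chi_2)}{\tau(\chi_1 \chi_2)}$ and, as a further simplifying assumption, works only with the prime modulus $5$, whose characters are $\chi_k(n) = i^{k \cdot \mathrm{dlog}(n)}$ for $k = 0, 1, 2, 3$ (discrete logarithm to base $2$). The function names are hypothetical.

```python
import cmath
import math

Q = 5  # a prime modulus, chosen purely for illustration
DLOG = {1: 0, 2: 1, 4: 2, 3: 3}  # discrete logarithms to base 2 modulo 5


def chi(k, n):
    """The character chi_k(n) = i^(k * dlog(n)) modulo 5; chi_0 is principal."""
    n %= Q
    return 0j if n == 0 else 1j ** ((k * DLOG[n]) % 4)


def gauss_sum(k):
    return sum(chi(k, n) * cmath.exp(2j * math.pi * n / Q) for n in range(Q))


def jacobi_sum(k1, k2):
    """sum over n mod 5 of chi_{k1}(n) * chi_{k2}(1 - n)."""
    return sum(chi(k1, n) * chi(k2, 1 - n) for n in range(Q))


if __name__ == "__main__":
    # Only pairs for which chi_{k1}, chi_{k2} and their product are non-principal.
    for k1, k2 in ((1, 1), (1, 2), (2, 3), (3, 3)):
        lhs = jacobi_sum(k1, k2)
        rhs = gauss_sum(k1) * gauss_sum(k2) / gauss_sum((k1 + k2) % 4)
        print((k1, k2), lhs, rhs)
```

For instance, the pair $(1,1)$ gives the Gaussian integer $-1-2i$ on both sides, of norm $5$.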

From (46) and the Fourier inversion formula on (Theorem 69 from Notes 1), we have

setting , we conclude in particular that

which is somewhat analogous to the reflection formula for the Gamma function (Lemma 8). On the other hand, from (46) with we have

This determines the magnitude of ; in particular, the quantity defined in (10) has magnitude one, and

The phase of (or ) is harder to compute, except when is a real primitive character, where we have the remarkable discovery of Gauss that ; see the appendix below.

Now suppose that is a twice continuously differentiable, compactly supported function. We can expand the sum using (47) (and identifying with ) as

applying the Poisson summation formula (8) to the modulated function , we conclude that

which on making the change of variables , and then relabeling as , becomes the *twisted Poisson summation formula*

Remark 49 One can view both the ordinary Poisson summation formula (8) and its twisted analogue (51) as special cases of the adelic Poisson summation formula; see this previous blog post. However, we will not explicitly adopt the adelic viewpoint here.

We now adapt the proofs of the functional equation for to prove Theorem 3, using (51) as a replacement for (8). One can use any of the three proofs for this purpose, but I found it easiest to work with the third proof. We first work heuristically. As before, we formally apply (51) with , ignoring all infinite divergences. Again, the Fourier transform is given by

and so (51) formally yields

The term is formally absent since . If one converts the sum over negative to a sum over positive by the change of variables , and formally identifies and with and , one formally obtains Theorem 3 after some routine calculation.

Exercise 50 Make the above argument rigorous by adapting the argument in Exercise 25.

The following exercise develops the analogue of Riemann’s second proof of the functional equation, which is the proof of the functional equation for Dirichlet -functions that is found in most textbooks.

Exercise 51 Let be a primitive character of conductor , and let be such that . Define the theta-type function in the half-plane . The purpose of the factor is to make the summand even in , rather than odd (as the theta-function would be trivial if the summand were odd).

One can also adapt the first proof of Riemann to the -function setting, as was done in this paper of Berndt:

Exercise 52 Let be a primitive character of conductor .

The next exercise extends the approximate functional equations for the zeta function to Dirichlet -functions for primitive characters of some conductor . As may be expected from Notes 2, the role of in the error terms will be replaced with .

Exercise 53 (Approximate functional equation) Let be a primitive Dirichlet character of conductor . Let in the critical strip.

- (i) Show that
for any , if is a smooth function such that vanishes for and equals for for some constant , and for some sufficiently large depending on .

- (ii) Show that
- (iii) Suppose that , and let be such that . Let be a smooth compactly supported function. Then
for any and any smooth, compactly supported , where

- (iv) With as in (iii), and as in (i), show that
where .

There are useful variants of the approximate functional equation for -functions that are valid in the low-lying regime , but we will not detail them here.

Exercise 54 Let be a primitive character of modulus , and let . Let denote the number of zeroes of in the rectangle (note that we are now including the lower half-plane as well as the upper half-plane, as the zeroes of need not be symmetric around the real axis when is complex). Show that

Exercise 55 (Riemann-von Mangoldt explicit formula for -functions) Let be a non-integer, and let be a primitive Dirichlet character of conductor .

- (i) If , show that
- (ii) If , show that
for some quantity depending only on .

** — 6. Appendix: Gauss sum for real primitive characters — **

The material here is not needed elsewhere in this course, or even in this set of notes, but I am including it because it is a very pretty piece of mathematics; it is also the very first hint of a much deeper connection between automorphic forms and Galois theory known as the Langlands program, which I will not be able to discuss further here.

Suppose that is a real primitive character. Then from (50), must be either or . Applying Corollary 4 with , we see that

which strongly suggests that , but one has to prevent vanishing of (or equivalently, ). (A similar argument using (52) would also give if one could somehow prevent vanishing of .) While it has been conjectured (by Chowla) that is never zero (and should in fact be positive), this is still not proven unconditionally; see this preprint of Fiorilli for recent work in this direction. (Indeed, understanding the vanishing of -functions at the central point is a very deep problem, as attested to by the difficulty of the Birch and Swinnerton-Dyer conjecture.) Nevertheless we still have the following result:

Theorem 56 (Gauss sum for real primitive characters) One has for any real primitive character .

This result is surprisingly tricky to prove. The exercises below give one such proof, essentially due to Dirichlet. First we reduce to quadratic characters modulo a prime:

Exercise 57 (Classification of real primitive characters)

- (i) Let be coprime natural numbers. Show that if are real primitive characters of conductors respectively, then is a real primitive character of conductor , and that .
- (ii) Conversely, if are coprime natural numbers and is a real primitive character of conductor , show that there exist unique real primitive characters of conductors respectively such that . (Hint: use the Chinese remainder theorem to identify with and .)
- (iii) If is an odd prime and is a natural number, show that an element of is a quadratic residue if and only if it is a quadratic residue after reduction to . Conclude that there are no real primitive characters of conductor if , and the only real primitive character of conductor is the quadratic character .
- (iv) If , show that an element of is a quadratic residue if and only if it is a quadratic residue after reduction to . Conclude that there are no real primitive characters of conductor if , and show that there is one such character of conductor , two characters of conductor , and no characters of conductor . Verify that for each of these characters.

In view of this exercise and the fundamental theorem of arithmetic, we see that to verify Theorem 56, it suffices to do so when for some odd prime . This can be achieved using the functional equation (30) for the theta function defined in (29):

Exercise 58 (Landsberg-Schaar relation and its consequences)Let be an odd prime, and let be the quadratic character to modulus .

- (i) Show that . (Hint: rewrite both sides in terms of the sum of over quadratic residues or non-residues .)
- (ii) For any positive integers , establish the Landsberg-Schaar relation
by using the functional equation (30) with and sending .

- (iii) By using the case of the Landsberg-Schaar relation, show that is equal to when and when , and that .
- (iv) By applying the Landsberg-Schaar relation with an odd prime distinct from , establish the law of quadratic reciprocity
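Parts (iii) and (iv) of the preceding exercise lend themselves to a quick numerical check. The sketch below assumes the Gauss sum is normalised as $\tau(\chi) = \sum_{n \in {\bf Z}/p{\bf Z}} \chi(n) e^{2\pi i n/p}$, and that the conclusion of (iii) is the classical one: $\tau(\chi) = \sqrt{p}$ when $p \equiv 1 \pmod 4$ and $\tau(\chi) = i \sqrt{p}$ when $p \equiv 3 \pmod 4$. Quadratic reciprocity is tested in the form $\left(\frac{p}{q}\right)\left(\frac{q}{p}\right) = (-1)^{\frac{p-1}{2} \frac{q-1}{2}}$. The helper names are illustrative.

```python
import cmath
import math


def legendre(n, p):
    """Quadratic character modulo an odd prime p, via Euler's criterion."""
    n %= p
    if n == 0:
        return 0
    return 1 if pow(n, (p - 1) // 2, p) == 1 else -1


def gauss_sum(p):
    """tau(chi) for the quadratic character chi modulo the odd prime p."""
    return sum(legendre(n, p) * cmath.exp(2j * math.pi * n / p) for n in range(p))


def predicted_gauss_sum(p):
    """sqrt(p) if p = 1 mod 4, and i * sqrt(p) if p = 3 mod 4."""
    return math.sqrt(p) if p % 4 == 1 else 1j * math.sqrt(p)


def reciprocity_holds(p, q):
    """Check quadratic reciprocity for distinct odd primes p, q."""
    sign = -1 if (p % 4 == 3 and q % 4 == 3) else 1
    return legendre(p, q) * legendre(q, p) == sign


if __name__ == "__main__":
    odd_primes = [3, 5, 7, 11, 13, 17, 19, 23]
    for p in odd_primes:
        print(p, gauss_sum(p), predicted_gauss_sum(p))
    print(all(reciprocity_holds(p, q)
              for p in odd_primes for q in odd_primes if p != q))
```

Such a check determines the sign experimentally for a handful of primes only; the point of the exercise is that the sign is always the predicted one.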

Exercise 59 Define a *fundamental discriminant* to be an integer that is the discriminant of a quadratic number field; recall from Supplement 1 that such take the form if is a squarefree integer with , or if is a squarefree integer with . (Supplement 1 focused primarily on the negative discriminant case when , but the above statement also holds for positive discriminant.) Show that if is a fundamental discriminant, then is a primitive real character, where is the Kronecker symbol. Conversely, show that all primitive real characters arise in this fashion.


These series also made an appearance in the elementary approach to the subject, but only for real that were larger than . But now we will exploit the freedom to extend the variable to the complex domain; this gives enough freedom (in principle, at least) to recover control of elementary sums such as or from control on the Dirichlet series. Crucially, for many key functions of number-theoretic interest, the Dirichlet series can be analytically (or at least meromorphically) continued to the left of the line . The zeroes and poles of the resulting meromorphic continuations of (and of related functions) then turn out to control the asymptotic behaviour of the elementary sums of ; the more one knows about the former, the more one knows about the latter. In particular, knowledge of where the zeroes of the Riemann zeta function are located can give very precise information about the distribution of the primes, by means of a fundamental relationship known as the explicit formula. There are many ways of phrasing this explicit formula (both in exact and in approximate forms), but they are all trying to formalise an approximation to the von Mangoldt function (and hence to the primes) of the form

where the sum is over zeroes (counting multiplicity) of the Riemann zeta function (with the sum often restricted so that has large real part and bounded imaginary part), and the approximation is in a suitable weak sense, so that

for suitable “test functions” (which in practice are restricted to be fairly smooth and slowly varying, with the precise amount of restriction dependent on the amount of truncation in the sum over zeroes one wishes to take). Among other things, such approximations can be used to rigorously establish the prime number theorem

as , with the size of the error term closely tied to the location of the zeroes of the Riemann zeta function.
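The prime number theorem in the von Mangoldt form is easy to observe numerically. The following sketch assumes the statement is the usual one, $\sum_{n \leq x} \Lambda(n) = x + o(x)$, and computes the left-hand side with a simple sieve; `von_mangoldt_table` and `psi` are illustrative names, not notation from these notes.

```python
import math


def von_mangoldt_table(N):
    """Return a list lam with lam[n] = Lambda(n) for 0 <= n <= N."""
    lam = [0.0] * (N + 1)
    is_composite = bytearray(N + 1)
    for p in range(2, N + 1):
        if is_composite[p]:
            continue
        # p is prime: mark its multiples and record log p at each prime power.
        for m in range(p * p, N + 1, p):
            is_composite[m] = 1
        log_p = math.log(p)
        prime_power = p
        while prime_power <= N:
            lam[prime_power] = log_p
            prime_power *= p
    return lam


def psi(x):
    """Summatory von Mangoldt function: sum of Lambda(n) over n <= x (x an integer)."""
    return sum(von_mangoldt_table(x))


if __name__ == "__main__":
    for x in (10, 100, 1000, 10000, 100000):
        print(x, psi(x), psi(x) / x)
```

The ratio `psi(x) / x` approaches $1$ fairly quickly; the interesting question, as discussed in the text, is the size of the error term.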

The explicit formula (1) (or any of its more rigorous forms) is closely tied to the counterpart approximation

for the Dirichlet series of the von Mangoldt function; note that (4) is formally the special case of (2) when . Such approximations come from the general theory of local factorisations of meromorphic functions, as discussed in Supplement 2; the passage from (4) to (2) is accomplished by such tools as the residue theorem and the Fourier inversion formula, which were also covered in Supplement 2. The relative ease of uncovering the Fourier-like duality between primes and zeroes (sometimes referred to poetically as the “music of the primes”) is one of the major advantages of the complex-analytic approach to multiplicative number theory; this important duality tends to be rather obscured in the other approaches to the subject, although it can still in principle be discernible with sufficient effort.

More generally, one has an explicit formula

for any Dirichlet character , where now ranges over the zeroes of the associated Dirichlet -function ; we view this formula as a “twist” of (1) by the Dirichlet character . The explicit formula (5), proven similarly (in any of its rigorous forms) to (1), is important in establishing the prime number theorem in arithmetic progressions, which asserts that

as , whenever is a fixed primitive residue class. Again, the size of the error term here is closely tied to the location of the zeroes of the Dirichlet -function, with particular importance given to whether there is a zero very close to (such a zero is known as an *exceptional zero* or Siegel zero).
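The same experiment can be run in arithmetic progressions. The sketch below assumes the prime number theorem in arithmetic progressions is stated in the usual form $\sum_{n \leq x: n = a\ (q)} \Lambda(n) = \frac{x}{\phi(q)} + o(x)$ for $a$ coprime to $q$; the function names are illustrative.

```python
import math


def von_mangoldt_table(N):
    """Return a list lam with lam[n] = Lambda(n) for 0 <= n <= N."""
    lam = [0.0] * (N + 1)
    is_composite = bytearray(N + 1)
    for p in range(2, N + 1):
        if is_composite[p]:
            continue
        for m in range(p * p, N + 1, p):
            is_composite[m] = 1
        log_p = math.log(p)
        prime_power = p
        while prime_power <= N:
            lam[prime_power] = log_p
            prime_power *= p
    return lam


def euler_phi(q):
    """Euler totient function, by direct counting (fine for small q)."""
    return sum(1 for a in range(1, q + 1) if math.gcd(a, q) == 1)


def psi_progression(x, q, a):
    """Sum of Lambda(n) over n <= x with n congruent to a modulo q."""
    lam = von_mangoldt_table(x)
    return sum(lam[n] for n in range(a % q, x + 1, q))


if __name__ == "__main__":
    x, q = 100000, 4
    for a in (1, 3):
        print(a, psi_progression(x, q, a), x / euler_phi(q))
```

For a small fixed modulus such as $q = 4$ the two primitive residue classes each receive close to half of the total mass; the difficulties with exceptional zeroes only become visible when $q$ is allowed to grow with $x$.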

While any information on the behaviour of zeta functions or -functions is in principle welcome for the purposes of analytic number theory, some regions of the complex plane are more important than others in this regard, due to the differing weights assigned to each zero in the explicit formula. Roughly speaking, in descending order of importance, the most crucial regions on which knowledge of these functions is useful are

- The region on or near the point .
- The region on or near the right edge of the *critical strip*.
- The right half of the critical strip.
- The region on or near the *critical line* that bisects the critical strip.
- Everywhere else.

For instance:

- We will shortly show that the Riemann zeta function has a simple pole at with residue , which is already sufficient to recover many of the classical theorems of Mertens discussed in the previous set of notes, as well as results on mean values of multiplicative functions such as the divisor function . For Dirichlet -functions, the behaviour is instead controlled by the quantity discussed in Notes 1, which is in turn closely tied to the existence and location of a Siegel zero.
- The zeta function is also known to have no zeroes on the right edge of the critical strip, which is sufficient to prove (and is in fact equivalent to) the prime number theorem. Any enlargement of the zero-free region for into the critical strip leads to improved error terms in that theorem, with larger zero-free regions leading to stronger error estimates. Similarly for -functions and the prime number theorem in arithmetic progressions.
- The (as yet unproven) Riemann hypothesis prohibits from having any zeroes within the right half of the critical strip, and gives very good control on the number of primes in intervals, even when the intervals are relatively short compared to the size of the entries. Even without assuming the Riemann hypothesis, *zero density estimates* in this region are available that give some partial control of this form. Similarly for -functions, primes in short arithmetic progressions, and the generalised Riemann hypothesis.
- Assuming the Riemann hypothesis, further distributional information about the zeroes on the critical line (such as Montgomery’s pair correlation conjecture, or the more general *GUE hypothesis*) can give finer information about the error terms in the prime number theorem in short intervals, as well as other arithmetic information. Again, one has analogues for -functions and primes in short arithmetic progressions.
- The functional equation of the zeta function describes the behaviour of to the left of the critical line, in terms of the behaviour to the right of the critical line. This is useful for building a “global” picture of the structure of the zeta function, and for improving a number of estimates about that function, but (in the absence of unproven conjectures such as the Riemann hypothesis or the pair correlation conjecture) it turns out that many of the basic analytic number theory results using the zeta function can be established without relying on this equation. Similarly for -functions.

Remark 1 If one takes an “adelic” viewpoint, one can unite the Riemann zeta function and all of the -functions for various Dirichlet characters into a single object, viewing as a general multiplicative character on the adeles; thus the imaginary coordinate and the Dirichlet character are really the Archimedean and non-Archimedean components respectively of a single adelic frequency parameter. This viewpoint was famously developed in Tate’s thesis, which among other things helps to clarify the nature of the functional equation, as discussed in this previous post. We will not pursue the adelic viewpoint further in these notes, but it does supply a “high-level” explanation for why so much of the theory of the Riemann zeta function extends to the Dirichlet -functions. (The non-Archimedean character and the Archimedean character behave similarly from an algebraic point of view, but not so much from an analytic point of view; as such, the adelic viewpoint is well suited for algebraic tasks (such as establishing the functional equation), but not for analytic tasks (such as establishing a zero-free region).)

Roughly speaking, the elementary multiplicative number theory from Notes 1 corresponds to the information one can extract from the complex-analytic method in region 1 of the above hierarchy, while the more advanced elementary number theory used to prove the prime number theorem (and which we will not cover in full detail in these notes) corresponds to what one can extract from regions 1 and 2.

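As a consequence of this hierarchy of importance, information about the zeta function away from the critical strip, such as Euler’s identity

$$\zeta(2) = \frac{\pi^2}{6}$$

or equivalently

$$1 + \frac{1}{2^2} + \frac{1}{3^2} + \dots = \frac{\pi^2}{6}$$

or the infamous identity

$$\zeta(-1) = -\frac{1}{12}$$

which is often presented (slightly misleadingly, if one’s conventions for divergent summation are not made explicit) as

$$1 + 2 + 3 + \dots = -\frac{1}{12}$$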

are of relatively little direct importance in analytic prime number theory, although they are still of interest for some other, non-number-theoretic, applications. (The quantity does play a minor role as a normalising factor in some asymptotics, see e.g. Exercise 28 from Notes 1, but its precise value is usually not of major importance.) In contrast, the value of an -function at turns out to be extremely important in analytic number theory, with many results in this subject relying ultimately on a non-trivial lower-bound on this quantity coming from Siegel’s theorem, discussed below the fold.

For a more in-depth treatment of the topics in this set of notes, see Davenport’s “Multiplicative number theory”.

** — 1. Dirichlet series to the right of the critical strip — **

We begin with the (easy) theory of Dirichlet series to the right of the critical strip , which generalises the theory of Dirichlet series for real that was used in the previous set of notes.

Given any arithmetic function obeying the crude size bound

is absolutely convergent in the region to the right of the critical strip. Indeed, if for some , then we have

for any , and the claim follows by choosing so that . Note that this argument also shows that is bounded in any region of the form for any .

The partial sums are clearly holomorphic functions in on the entire complex plane , and they converge locally uniformly to on the region . Since locally uniform limits of holomorphic functions are holomorphic (Corollary 11 of Supplement 2), we conclude that is holomorphic on .

If obey the crude size bound (7), then so does the Dirichlet convolution (see Exercise 24 of Notes 1). A simple application of Fubini’s theorem then gives the fundamental identity

in the region . Also, by carefully differentiating (8) in we obtain the additional identity

in the same region, where is the logarithm function .

From (9), (10) we can express the Dirichlet series of many basic arithmetic functions of number-theoretic interest in terms of the Riemann zeta function, at least in the region :

- By definition, . Since , we conclude from (9) that .
- Clearly . By (9) and the Möbius inversion formula , we conclude that . (In particular, has no zeroes in the region .)
- From (10), we have . By (9) and the basic identity , we conclude that .
- From (10), we see that has a derivative of . For real , we already saw (see equation (21) of Notes 1) that this expression was equal to . Thus we see that is a branch of the complex logarithm of to the right of the strip, so we write (by slight abuse of notation) in this region.
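The identities in the list above can be tested numerically at a real point such as $s = 2$, where $\zeta(2) = \pi^2/6$ is known in closed form. The sketch below assumes that (9) is the multiplicativity identity ${\mathcal D}(f * g)(s) = {\mathcal D} f(s) {\mathcal D} g(s)$ for Dirichlet convolutions, and works with truncated series, so agreement is only approximate. The helper names (`mobius_table`, `dirichlet_convolve`, `dirichlet_series`) are illustrative.

```python
import math


def mobius_table(N):
    """Return a list mu with mu[n] the Moebius function for 1 <= n <= N."""
    mu = [1] * (N + 1)
    mu[0] = 0
    is_composite = bytearray(N + 1)
    for p in range(2, N + 1):
        if is_composite[p]:
            continue
        for m in range(p, N + 1, p):
            if m > p:
                is_composite[m] = 1
            mu[m] = -mu[m]          # one more distinct prime factor
        for m in range(p * p, N + 1, p * p):
            mu[m] = 0               # not squarefree
    return mu


def dirichlet_convolve(f, g, N):
    """Dirichlet convolution (f * g)(n) = sum over ab = n of f(a) g(b), for n <= N."""
    h = [0] * (N + 1)
    for a in range(1, N + 1):
        if f[a] == 0:
            continue
        for b in range(1, N // a + 1):
            h[a * b] += f[a] * g[b]
    return h


def dirichlet_series(f, s, N):
    """Truncated Dirichlet series: sum of f(n) / n^s over 1 <= n <= N."""
    return sum(f[n] / n ** s for n in range(1, N + 1))


if __name__ == "__main__":
    N = 20000
    zeta2 = math.pi ** 2 / 6
    one = [0] + [1] * N
    mu = mobius_table(N)
    divisor = dirichlet_convolve(one, one, N)        # d = 1 * 1
    print(dirichlet_series(mu, 2, N), 1 / zeta2)     # D mu = 1 / zeta
    print(dirichlet_series(divisor, 2, N), zeta2 ** 2)  # D d = zeta^2
```

The convolution identity $\mu * 1 = \delta$ (Möbius inversion) holds exactly at the level of the tables, whereas the series comparisons carry a truncation error of size roughly $\frac{\log N}{N}$.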

Exercise 3

- (i) Show that in the region .
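- (ii) Define the Liouville function $\lambda$ by setting $\lambda(n) := (-1)^k$ whenever $n$ is the product of $k$ (not necessarily distinct) primes for some $k \geq 0$. Show that ${\mathcal D} \lambda(s) = \frac{\zeta(2s)}{\zeta(s)}$ in the region $\{ s: \mathrm{Re}(s) > 1 \}$.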

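Exercise 4 Let $\Lambda_k$ be the higher order von Mangoldt functions (equation (65) from Notes 1). Show that ${\mathcal D} \Lambda_k(s) = (-1)^k \frac{\zeta^{(k)}(s)}{\zeta(s)}$ in the region $\{ s: \mathrm{Re}(s) > 1 \}$, where $\zeta^{(k)}$ is the $k$-fold derivative of $\zeta$.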

Exercise 5 (Uniqueness for Dirichlet series)

- (i) Let be two arithmetic functions obeying (7), such that on the region . Show that . (Hint: if , obtain an asymptotic for as along the reals.)
- (ii) Use this uniqueness to give an alternate proof of the identity for (equation (66) from Notes 1).
- (iii) Use this uniqueness to give an alternate proof of the Diamond-Steinig identities (Exercise 64 from Notes 1).

Now we establish some crude bounds on Dirichlet series in the region . We will use the following simple application of the triangle inequality: if and are arithmetic functions obeying (7) with for all , and for some and , then

for all , and hence

In practice, one can use the estimates from Notes 1 to bound . For instance, from Exercise 10 of those notes we have

for all with and (the second bound arising from the trivial upper bound ).

We can often obtain matching lower bounds to these upper bounds when is close to by a number of means. For the Riemann zeta function, one can use the bound

for continuously differentiable (see Exercise 11 from Notes 1), which after a brief calculation gives

for and . A similar calculation also gives

for and sufficiently close to ; note that these three estimates were already established in Notes 1 under the additional hypothesis that was real. One can view (13) as a crude version of the heuristic (4), in which the role of the zeroes is neglected. When controlling for a multiplicative function obeying (7), one can also exploit the Euler product formula

which remains valid in the domain . For instance, under the hypotheses of Theorem 27 of Notes 1, we have

for and , as can be seen by an inspection of the proof of Theorem 27(i).
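As a rough numerical illustration of the Euler product formula, here is a sketch for the simplest multiplicative function $f = 1$, for which the product is assumed to take the familiar form $\zeta(s) = \prod_p (1 - p^{-s})^{-1}$ in the region $\mathrm{Re}(s) > 1$. The function names and the cutoffs are hypothetical choices made for this example only.

```python
def primes_up_to(N):
    """All primes <= N via the sieve of Eratosthenes."""
    sieve = bytearray([1]) * (N + 1)
    sieve[0:2] = b"\x00\x00"
    for p in range(2, int(N ** 0.5) + 1):
        if sieve[p]:
            # zero out the multiples p*p, p*p + p, ... in one slice assignment
            sieve[p * p :: p] = bytearray(len(range(p * p, N + 1, p)))
    return [n for n in range(2, N + 1) if sieve[n]]


def zeta_by_sum(s, N):
    """Truncated Dirichlet series sum_{n <= N} n^{-s}."""
    return sum(n ** (-s) for n in range(1, N + 1))


def zeta_by_euler_product(s, P):
    """Truncated Euler product over primes p <= P of 1 / (1 - p^{-s})."""
    product = 1 + 0j
    for p in primes_up_to(P):
        product *= 1 / (1 - p ** (-s))
    return product


s = 2 + 1j                      # a point to the right of the critical strip
via_sum = zeta_by_sum(s, 100000)
via_product = zeta_by_euler_product(s, 100000)
print(via_sum, via_product, abs(via_sum - via_product))
```

The two truncations approach the same limit from different directions, so their difference gives a crude estimate of the combined truncation error.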

Exercise 6 By using the Selberg symmetry formula (equation (67) from Notes 1), show that

whenever with and .

We can obtain better estimates for the zeta function and its relatives once we have some analytic continuation of these functions to the critical strip. However, even before we do so, we can still control various weighted sums of arithmetic functions in terms of integral combinations of the Dirichlet series . This can be achieved by the following formula:

Proposition 7 (Parseval-type formula) Let obey (7), and let be a twice continuously differentiable, compactly supported function. Then for any , one has

where is the Fourier transform of extended to the complex plane, defined by the formula

for . The integral on the right-hand side of (14) is absolutely integrable.

The formula (14) is similar to the Parseval-type identity

for suitably “nice” functions (see Corollary 32 from Supplement 2). Indeed, one could view as the Fourier transform of the Radon measure , which (formally, at least) yields (14) from the Parseval identity. However we will not adopt this measure-theoretic viewpoint explicitly here.

*Proof:* From Exercise 28 of Supplement 2, we have the bounds

which, together with the boundedness of on , makes the integral in (14) absolutely integrable as claimed. The same bounds allow one to invoke Fubini’s theorem and rewrite the right-hand side of (14) as

which we rearrange as

By the Fourier inversion formula (Theorem 30 from Supplement 2), this simplifies to

and the claim follows.

Remark 8 The uncertainty principle in Fourier analysis tells us (heuristically, at least) that if we want the function to exhibit non-trivial oscillation at the scale , then the Fourier transform has to spread out over an interval of size . In particular, if we want to use (14) to investigate fine scale structure of on intervals such as , then one expects to need control on the associated Dirichlet series in which the imaginary part of can be as large as in magnitude. Thus, Fourier analysis gives us the insight that the extent to which one can extend control of the Dirichlet series away from the real axis determines the finest scale of that one can hope to control. For instance, the prime number theorem allows one to count primes in regions such as , and so should need control of on the entire right edge of the critical strip. Conversely, numerical verification of the Riemann hypothesis that establishes zero-free regions for for imaginary parts up to some finite threshold should yield effective substitutes for the prime number theorem that are able to count primes in intervals such as .

The most powerful applications of the Parseval-type formula (14) occur when has a meromorphic continuation into the critical strip (or beyond), allowing one to shift in the right-hand side of (14) to the left of (picking up various terms from residue calculus along the way). But one can still obtain some useful estimates on various summatory functions involving even without such meromorphic continuations; in particular, just by using asymptotics near (and to the right of) such as (13), we can recover estimates of strength comparable to Mertens’ theorems. Here is a basic example, using only the asymptotic (13):

Proposition 9 (Crude Mertens-type theorem) Let be a twice continuously differentiable, compactly supported function. Then for , one has

This estimate is weaker than the Mertens’ theorems from Notes 1. However, later in these notes we will be able to improve the error term in this proposition if is smoother, by using a more accurate asymptotic expansion than (13). One should also compare (16) to the heuristic (2) (again neglecting the role of the zeroes).

*Proof:* Since to the right of the critical strip, we can apply Proposition 7 and rewrite the left-hand side of (16) as

where is arbitrary.

A convenient choice of here is ; this is about as far as one can push to the right (in order to get the best use out of the estimate (11)) before the shift to becomes problematic. (Compare with *Rankin’s trick*, discussed in Notes 1.) In view of (13), it is natural to consider the expression

To compute this expression, we write

and interchange integrals by Fubini’s theorem. From the Fourier inversion formula (Theorem 30 from Supplement 2) and a change of variables one has

so (17) simplifies to the expected main term in (16). It thus suffices to show that

By (13), we can bound in magnitude by when is smaller than some absolute constant . For , we instead use (11) to bound this quantity by . Meanwhile, from Exercise 28 of Supplement 2 we have the bounds

Putting these bounds together, we obtain the claim.

Remark 10 In later notes we will use a method similar to that used to prove Proposition 9 to estimate sums such as

where is the least common multiple of , and is a smooth compactly supported function; such expressions will arise naturally when we turn to the topic of sieve theory.

A classical limiting case of Proposition 7 is Perron’s formula:

Exercise 11 (Perron’s formula) Let obey (7). For any non-integer and any , show that

where the integral is a contour integral along the line segment . What happens when is an integer?

In practice, the presence of the limit in (18) is inconvenient, and one usually works with smoothed or truncated versions of this formula. Proposition 7 can be viewed as a smoothed version of Perron’s formula. Now we establish a truncated version:

Proposition 12 (Truncated Perron’s formula) Let be such that for all , and let . Then

One can sharpen the factor here slightly, but we will not need such improvements here. The condition can also be relaxed (at the cost of worsening the error term accordingly); we leave this as an exercise to the reader.

*Proof:* By perturbing we may assume that is not an integer. In view of (18), it suffices by dyadic decomposition to show that

By Fubini’s theorem, the left-hand side may be written as

On taking absolute values, we see that

and in particular for . On the other hand, from integration by parts we have

and thus on taking absolute values

In particular we have

for , and

for . We may thus upper bound (20) by

and the claim follows from Lemma 2 of Notes 1.
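The mechanism behind Perron-type formulae is easiest to see on the kernel integral that arises once the sum and integral are interchanged. The sketch below assumes the textbook form of that kernel, namely that $\frac{1}{2\pi i} \int_{\sigma - iT}^{\sigma+iT} \frac{y^s}{s}\, ds$ is close to $1$ when $y > 1$ and close to $0$ when $0 < y < 1$, with an error of size roughly $\frac{y^\sigma}{T |\log y|}$. The quadrature routine and all parameter values are illustrative choices, not part of the notes.

```python
import cmath
import math


def perron_kernel(y, sigma, T, steps=200000):
    """Trapezoid-rule approximation of (1 / (2 pi i)) * integral of y^s / s ds
    along the vertical segment from sigma - iT to sigma + iT."""
    h = 2 * T / steps
    log_y = math.log(y)
    total = 0j
    for k in range(steps + 1):
        t = -T + k * h
        s = complex(sigma, t)
        weight = 0.5 if k in (0, steps) else 1.0
        total += weight * cmath.exp(s * log_y) / s
    # ds = i dt, so the prefactor 1 / (2 pi i) becomes 1 / (2 pi)
    return total * h / (2 * math.pi)


sigma, T = 1.0, 1000.0
above = perron_kernel(2.0, sigma, T)   # y > 1: expect a value near 1
below = perron_kernel(0.5, sigma, T)   # y < 1: expect a value near 0
print(above, below)
```

Increasing `T` sharpens the cutoff, at the cost of a longer contour; this is the trade-off that the truncated formula quantifies.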

** — 2. Meromorphic continuation into the critical strip, and the (truncated) explicit formula — **

To get the most use out of Perron-type formulae, we have to extend Dirichlet series such as the Riemann zeta function meromorphically into the critical strip . Not every Dirichlet series with coefficients obeying (7) has such a meromorphic extension; roughly speaking, the existence of such an extension is morally equivalent to having asymptotic formulae for sums such as or whose error term is better than what one can obtain just from (7).

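To extend the zeta function into the critical strip, we will use (12), which gives the bound

$$ \sum_{y \leq n \leq x} \frac{1}{n^s} = \frac{x^{1-s}}{1-s} - \frac{y^{1-s}}{1-s} + O\left( \frac{|s|}{\sigma} y^{-\sigma} \right)$$

whenever $1 \leq y \leq x$ and $s = \sigma+it$ for some $\sigma > 0$, with $s \neq 1$. (The error term is a bit crude, particularly when $s$ has large imaginary part; we will obtain better estimates in later notes.) From Lemma 5 of Notes 1, this implies that we can find a (unique) complex number $\zeta(s)$ such that

$$ \sum_{n \leq x} \frac{1}{n^s} = \zeta(s) + \frac{x^{1-s}}{1-s} + O\left( \frac{|s|}{\sigma} x^{-\sigma} \right) \ \ \ \ \ (21)$$

for all $x \geq 1$ and $s = \sigma+it$ for some $\sigma > 0$ and $t \in {\bf R}$ with $s \neq 1$. For $\mathrm{Re}(s) > 1$, this definition of $\zeta(s)$ agrees with the prior definition of the Riemann zeta function in this range; it also is consistent with the quantity $\zeta(s)$ defined for $0 < s < 1$ in Section 1 of Notes 1.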
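To see this definition in action, one can evaluate the zeta function inside the critical strip directly from the partial sums. The sketch below assumes that (21) has the form $\sum_{n \leq x} n^{-s} = \zeta(s) + \frac{x^{1-s}}{1-s} + (\hbox{error decaying like } x^{-\sigma})$, and simply drops the error term. The function name `zeta_strip` and the cutoff `x` are hypothetical; the reference value $\zeta(1/2) \approx -1.4603545$ and the ordinate $14.134725\ldots$ of the first zero are standard numerical facts.

```python
def zeta_strip(s, x=10**6):
    """Approximate zeta(s) for Re(s) > 0, s != 1, by
    sum_{n <= x} n^{-s}  -  x^{1-s} / (1 - s),
    ignoring an error term of size roughly x^{-Re(s)}."""
    partial_sum = sum(n ** (-s) for n in range(1, x + 1))
    return partial_sum - x ** (1 - s) / (1 - s)


# A real point inside the critical strip.
zeta_half = zeta_strip(0.5)

# A point close to the first zero on the critical line.
zeta_near_zero = zeta_strip(complex(0.5, 14.134725142))

print(zeta_half, abs(zeta_near_zero))
```

With `x` equal to a million the neglected error is of size about $10^{-3}$ on the critical line, which is why the value near the zero is small but not exactly zero. This is not an efficient way to compute zeta; it only illustrates that the limit defining it makes sense to the left of the line $\mathrm{Re}(s) = 1$.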

From (21) we also observe the conjugation symmetry

for any in . (Indeed, from the unique continuation property for meromorphic functions, this property is automatic for any meromorphic function on a connected domain symmetric around the real axis, which is real on the real axis.)

Observe that the function has a removable singularity at (it approaches at that value of ). From (21), we see that on the region , the function is the locally uniform limit of the functions , which are holomorphic on this region once the removable singularity at is removed. We conclude that is holomorphic in this region (after removing the singularity at ), and hence is meromorphic in this region, with a simple pole at and no other poles. In particular, we have the Laurent expansion

for any natural number and all sufficiently close to , where are complex coefficients. Differentiating, we also see that

for any natural number and all sufficiently close to , where are further complex coefficients. This refines the bound (13).

Exercise 13 Show that and , where is Euler’s constant. (This appearance of in analytic number theory is largely unrelated to the appearance of type factors in Notes 1.)

Exercise 14 Prove the following generalisation of Proposition 9: if is a continuously -times differentiable compactly supported function for some , then for , one has

for some complex coefficients depending on . (Hint: repeat the proof of Proposition 9, but use (23) in place of (13).)

Remark 15 Note that the error term in the above exercise improves as gets smoother, and in fact becomes significantly better than the type of error terms appearing in the elementary approach when is smooth enough (a special case of the “smoothed sums” philosophy in analysis). Thus we see a contrast between the elementary and complex-analytic methods; the latter approach can provide superior error terms, but also has a preference for smoother sums than the roughly truncated sums that are the main focus of the elementary methods.

From (21) with , we also have the crude bound

in the region . While this is quite a crude bound, it implies a decent upper bound on the log-magnitude of , namely that

whenever and . This, combined with Jensen’s formula, gives a useful upper bound on the density of zeroes of :

Proposition 16 (Crude upper bound on zeta zeroes) For any and , there are at most zeroes of (counting multiplicity) in the region .

*Proof:* We may assume that (say), since for the claim follows from the discrete nature of the zeroes of a meromorphic function. We may also take small, say .

Consider the disk of radius centred at for some . By (11), we have at the centre of the circle, while from (25) one has on the boundary of the circle. By Jensen’s formula (Theorem 16 from previous notes), this implies that there are at most zeroes in the disk of radius (say) centred at . Since the region can be covered by such disks, the claim follows.

Remark 17 It will be convenient in the following discussion to adopt the convention that all sums over zeroes of $\zeta$ (or of other $L$-functions) are counted with multiplicity; thus for instance a double zero would contribute twice to such a sum. Indeed, one can think of a zero of order $k$ as being a limiting case of $k$ simple zeroes that are extremely close together (cf. Rouché’s theorem or Hurwitz’s theorem in complex analysis), which helps explain why such zeroes are always counted with multiplicity. It is conjectured that the zeroes of the Riemann zeta function (or of any $L$-function) are all simple, but this claim looks hopeless to prove using current methods; the problem is that it is nearly impossible for analytic methods to distinguish between a repeated zero and a pair of simple zeroes that are extremely close together, and we currently do not have good methods to exclude the latter from occurring at least once.

Remark 18 As a corollary of Proposition 16, we see that the number of zeroes of in the region is for any . Once we establish the functional equation for , we will be able to match this upper bound with a comparable lower bound, and also set to zero; see later notes.

This gives an approximate formula for the log-derivative of zeta in terms of the nearby zeroes:

Proposition 19 (Approximate formula for log-derivative of zeta) For any , we have

whenever and .

This proposition should be viewed as a local version of the heuristic (4).

*Proof:* To eliminate the pole at , it is convenient to work with the modified function , which is holomorphic in after removing the singularity at , and our task is now to show that

whenever and .

As in the previous proposition, consider the disk of radius centred at . From (11) we have in the centre of this disk, and from (25) one has on the boundary of the disk. The claim now follows from Theorem 21 of Supplement 2 (using Proposition 16 to remove the contribution of zeroes further than from ).

Among other things, this proposition gives good control on the size of the log-derivative on average, which will be useful in figuring out how to shift a contour without encountering the large values of too often:

Corollary 20 (Local integrability of log-derivative) For any and , we have

*Proof:* We apply Proposition 19. The contribution of the error term is clearly acceptable. Because is a locally integrable function on the complex plane, we see that the term contributes a factor of to the required integral, and every zero within of also contributes . The claim now follows from Proposition 16.

Now we can apply the truncated Perron’s formula with contour shifting to obtain a truncated explicit formula for the von Mangoldt summatory function:

Theorem 21 (Truncated von Mangoldt explicit formula) For any and , we have

The error terms here can be improved a little, particularly once one uses the functional equation for ; see later notes. However, the current form of the formula already suffices for many applications. This theorem should be compared with (2).

*Proof:* From (20) and the pigeonhole principle, we may find such that

for either choice of sign (note that this implies that the horizontal lines avoid all the poles of ). (One could use (22) here to eliminate the need to consider a sign , but it is not necessary to do so here.) On the other hand, from Proposition 12 we have

Observe that on the half-space , the meromorphic function has a pole at with residue , and poles at every zero of with residue (multiplied by the multiplicity of the zero), with no other poles. By the residue theorem (Exercise 13 of Supplement 2) applied to the boundary of the rectangle for some to be chosen shortly, with avoiding the poles of , and using (26) to bound the upper and lower limits of integration, we thus have

Now, from (20), and integrating from to , we have

where . Thus by the pigeonhole principle we may find where

where , and so the contour integral in (27) is . Using Proposition 16, the contribution to of those zeroes with is , and of those zeroes with is . The claim follows.
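For a concrete feel for the explicit formula, one can compare the exact von Mangoldt summatory function $\psi(x) = \sum_{n \leq x} \Lambda(n)$ with the truncated expression $x - \sum_\rho \frac{x^\rho}{\rho}$, summed over the first few zeroes. The sketch below is only an illustration under stated assumptions: the ordinates in `ZERO_ORDINATES` are rounded values of the first ten zeroes $\rho = \frac{1}{2} \pm i\gamma$ (which are known numerically to lie on the critical line), the function names are hypothetical, and no attempt is made to track the error term of Theorem 21.

```python
import math

# Rounded imaginary parts of the first ten zeta zeroes in the upper half-plane.
ZERO_ORDINATES = [14.134725, 21.022040, 25.010858, 30.424876, 32.935062,
                  37.586178, 40.918719, 43.327073, 48.005151, 49.773832]


def psi(x):
    """Exact value of sum_{n <= x} Lambda(n), summing log p over prime powers."""
    N = int(x)
    is_prime = [True] * (N + 1)
    total = 0.0
    for p in range(2, N + 1):
        if is_prime[p]:
            for m in range(2 * p, N + 1, p):
                is_prime[m] = False
            power = p
            while power <= x:          # each prime power p^k <= x adds log p
                total += math.log(p)
                power *= p
    return total


def psi_from_zeros(x, ordinates):
    """x minus the sum of x^rho / rho over the zeroes 1/2 +- i*gamma."""
    zero_sum = 0.0
    for gamma in ordinates:
        rho = complex(0.5, gamma)
        # rho and its complex conjugate together contribute 2 Re(x^rho / rho)
        zero_sum += 2 * (x ** rho / rho).real
    return x - zero_sum


x = 100.5
psi_exact = psi(x)
psi_approx = psi_from_zeros(x, ZERO_ORDINATES)
print(psi_exact, psi_approx, x)
```

Adding more zeroes (that is, increasing the truncation height) lets the approximation resolve the jumps of $\psi$ at prime powers more sharply, in line with the uncertainty-principle heuristic discussed earlier in these notes.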

Exercise 22 (Smoothed explicit formula) Let be a smooth, compactly supported function. Then for any and , show that

with the sum on the right-hand side being absolutely convergent. (Hint: use Proposition 7 and contour shifting.) This exercise should be compared with (14).

Theorem 21 allows one to use zero-free regions of the Riemann zeta function to improve the error term in the prime number theorem:

Corollary 23 (Zero-free region controls von Mangoldt summatory function) Let and , and suppose that there are no zeroes of in the rectangle . Then one has

for all .

*Proof:* We apply Theorem 21 with (say). It then suffices to show that

But by hypothesis, we have , and the claim then readily follows from Proposition 16.

In fact, we have a fairly tight relationship between zero-free regions and error terms. Here is one example of this:

Proposition 24 Let . Then the following assertions are equivalent:

- (i) One has as .
- (ii) One has for all .
- (iii) All the zeroes of have real part at most .

*Proof:* Clearly (ii) implies (i). If (iii) holds, then by applying Corollary 23 with , we obtain (ii).

Finally, suppose that (i) holds. For , we see from Fubini’s theorem that

and thus

By (i) and Morera’s theorem, the integral on the right-hand side extends holomorphically to the region , and so by unique continuation cannot have any poles in this region other than at . This implies (iii).

Remark 25 Proposition 24 illustrates a remarkable “self-improving” property of estimates on the von Mangoldt summatory function: a weak bound of the form , if true in the asymptotic limit , automatically implies the stronger bound for any given (and in fact the implied constant in the conclusion depends only on , and not on the decay rate in the hypothesis). This is due to the special structure of this summatory function , as revealed by the explicit formula, which limits the range of possible asymptotic behaviours of this function, and in particular gives some control on a given value of this function at some choice of in terms of its values at much larger choices of . (Compare with the following easy example of a self-improving property: if is a natural number and is a polynomial with as , then for all .)

Exercise 26 Give an alternate proof that (i) implies (iii) in Proposition 24 that uses Theorem 21 (with set to a large power of ), as well as an inspection of the asymptotics of the expression

as , where is a zero of and is a smooth compactly supported bump function. (The point is that this expression isolates the effect of the single zero in the von Mangoldt explicit formula.) Give a similar derivation that uses Exercise 22 instead of Theorem 21.

Exercise 27 (Truncated Landau explicit formula) Let and , and let for some be such that is not a zero or pole of . Show that

Unfortunately, none of the statements in Proposition 24 are known to hold for any positive . The infamous Riemann hypothesis asserts that the statements in Proposition 24 hold for as large as :

Conjecture 28 (Riemann hypothesis) All the zeroes of $\zeta$ have real part at most $\frac{1}{2}$.

Remark 29 This is not quite the traditional formulation of the Riemann hypothesis, which asserts instead that all the zeroes of $\zeta$ on the critical strip lie on the critical line $\{ s: \mathrm{Re}(s) = \frac{1}{2} \}$. However, the two formulations are logically equivalent, once one possesses the functional equation; see later notes.

From the above proposition, we see that the Riemann hypothesis is equivalent to the quite strong estimate

on the von Mangoldt summatory function for all . This already gives a “near miss” to Legendre’s conjecture that there exists a prime between and for any :

Exercise 30 (Conditional near-miss to Legendre’s conjecture) Assume the Riemann hypothesis. Show that there exists a constant $C$ such that there exists a prime between $x$ and $x + C \sqrt{x} \log^2 x$ for any $x \geq 2$.

We remark that Cramér reduced the $\log^2 x$ term here to a $\log x$; however, no further improvement is known if one “only” assumes the Riemann hypothesis. (But one can shave the $\log x$ further to $\sqrt{\log x}$ if one additionally assumes a form of the Montgomery pair correlation conjecture, a result of Goldston and Heath-Brown.) In later notes we will discuss some weaker near-misses to Legendre’s conjecture that are not conditional on unproven statements such as the Riemann hypothesis, by replacing the notion of zero-free region with the weaker, but somewhat comparable in power, notion of a *zero-density theorem*.

There is a limiting case of Proposition 24, due to Wiener:

Proposition 31 (Equivalent forms of prime number theorem) The following assertions are equivalent:

- (i) One has as .
- (ii) All the zeroes of have real part strictly less than one.

*Proof:* First suppose that (ii) fails but (i) holds, so has a zero at for some non-zero . Then has a simple pole at with residue at most , and so

as , or in other words

However, from Fubini’s theorem we have

Applying (i), we soon conclude that

as , giving the required contradiction.

Now suppose that (ii) holds. We apply Exercise 22 with (say) to obtain

as , for any smooth compactly supported independent of . By (ii), each individual term is . Since the zeroes are discrete, we thus have

for any independent of . To control the remaining portion of the sum, we crudely bound and (using Exercise 28 of Supplement 2) and use Proposition 16 to conclude that

and thus on sending to infinity and expanding out ,

Letting be an upper or lower approximant to , we conclude that

and (i) follows by a telescoping argument.

Some of the above discussion involving the von Mangoldt function has an analogue involving the Möbius function, although it is more difficult to use the residue theorem to obtain a useful explicit formula because the residues of are significantly less well understood than that of . Nevertheless, one can still use other complex analytic tools, such as Taylor expansion, to get some weaker statements. We give some examples of this in the exercises below.

Exercise 32 Suppose that the conclusions of Proposition 24 hold for some .

- (i) For any , show the bounds
if with and . If (say), improve this to

- (ii) Show that there is a branch of the logarithm of that is holomorphic in the region , and obeying the bounds
and

for all , and , where denotes the -fold derivative of . (Hint: use the generalised Cauchy integral formulae, see Exercise 9 of Supplement 2.)
- (iii) Show that for any , we have
if with and sufficiently large depending on . (Hint: Taylor expand around using the bounds from (ii), possibly with a different choice of .)

Exercise 33 Let . Show that the conclusions (i)-(iii) of Proposition 24 are equivalent to the assertion

- (iv) as .

(Hint: apply the truncated Perron formula to and shift the contour, using the preceding exercise to control error terms.) In particular, we see that the Riemann hypothesis is equivalent to the assertion that

as .

Exercise 34 Show that the Riemann hypothesis implies the Lindelöf hypothesis that $\zeta(\frac{1}{2}+it) = O( |t|^{o(1)} )$ as $|t| \to \infty$.

Exercise 35 Let be a natural number, and let be a multiplicative function obeying the bounds for all primes , and such that for all primes and .

- (i) Show that has a meromorphic continuation to the half-space , which has a pole of order at most at but no other poles. (Hint: use Euler products to factor as the product of and a function holomorphic in this half-space.) Also show that when with and .
- (ii) Show that
for all and some depending only on , where is a polynomial with leading term , where the singular series was defined in Theorem 27 of Notes 1. (Hint: modify Proposition 12 to deal with the fact that is only bounded by rather than by , apply it with a small power of , then shift the contour.) Note that this refines Theorem 27(iii) from Notes 1, and also generalises Exercise 32 from those notes.

** — 3. The prime number theorem — **

We are now finally ready to prove the prime number theorem (3), first established by Hadamard and de la Vallée Poussin. In view of Proposition 31, the task comes down to excluding the possibility that a zero

occurs on the line for some . Note that cannot be zero, as has a pole at .

The basic point here is that such a zero implies a “conspiracy” between the von Mangoldt function and the multiplicative function , in that the two functions correlate or “pretend” to be like each other in a certain sense. Indeed, if has a zero of some positive order at , then the log-derivative has a simple pole with residue at , so in particular

for sufficiently close to . We rewrite this as

On the other hand, from (13) one has

and sending , we already obtain a contradiction if ; thus we have shown that there are no zeroes of multiplicity two or higher on the line . In the case of a simple zero , we have not yet obtained a contradiction; but observe that in this case, the triangle inequality (30) is close to being attained with equality. Intuitively, this implies that on most of the support of , that is to say that for “most” primes . To make this precise, we add (28) to (29) and then take real parts to conclude that

as . In probabilistic terms, if one selects a natural number at random using the probability density , divided by the quantity to normalise the total probability to be one, then the random variable converges in to zero. (Note that we are implicitly using the non-negative nature of in order to access this probabilistic interpretation.)

Following Hadamard, we exploit the following basic observation: if , then . To use this observation quantitatively, it is convenient (following Mertens, who simplified the original argument of Hadamard) to exploit the trigonometric inequality

for any (which follows from the identity ), which implies that

Inserting this inequality into (31), we conclude that

and hence by (29)

This implies that has a pole at with residue , and so must have a simple pole at . But the only pole of is at , and is non-zero, giving a contradiction. Thus there are no zeroes of on the line , and the prime number theorem follows thanks to Proposition 31.

The key inequality (32) is often written as , or . In particular, we have

which on multiplying with and summing gives the useful inequality

for any and . Integrating this in from (noting that as for any fixed ) gives the variant

(One can also obtain this inequality directly from (34) by multiplying by , summing, then exponentiating; we leave the details to the interested reader.) This variant gives a slightly different way to interpret the above proof of the prime number theorem: has a simple pole at , and no pole at , so from (36) the maximum order of zero it can have at is . But the order must be an integer, and so one cannot have a zero of any positive order.
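Both the trigonometric inequality and its consequence for the zeta function are easy to test numerically. The sketch below assumes the classical forms $3 + 4\cos\theta + \cos 2\theta = 2(1+\cos\theta)^2 \geq 0$ and $|\zeta(\sigma)|^3 |\zeta(\sigma+it)|^4 |\zeta(\sigma+2it)| \geq 1$ for $\sigma > 1$, which may be written slightly differently from (32) and (36). The truncation of the Dirichlet series, the grid of values of $t$, and all function names are illustrative choices.

```python
import cmath
import math

# Precompute log n for the truncated Dirichlet series (2000 terms, illustrative).
LOG_N = [math.log(n) for n in range(1, 2001)]


def zeta_partial(s):
    """Truncated Dirichlet series for zeta(s); reasonable only for Re(s) > 1."""
    return sum(cmath.exp(-s * log_n) for log_n in LOG_N)


def mertens_product(sigma, t):
    """|zeta(sigma)|^3 * |zeta(sigma + it)|^4 * |zeta(sigma + 2it)|."""
    return (abs(zeta_partial(sigma)) ** 3
            * abs(zeta_partial(complex(sigma, t))) ** 4
            * abs(zeta_partial(complex(sigma, 2 * t))))


# The trigonometric inequality on a grid of angles.
trig_min = min(3 + 4 * math.cos(0.001 * k) + math.cos(0.002 * k)
               for k in range(6284))

# The resulting inequality for zeta at sigma = 2, t = 0.5, 1.0, ..., 50.0.
product_min = min(mertens_product(2.0, 0.5 * k) for k in range(1, 101))

print(trig_min, product_min)
```

The minimum of the trigonometric expression is attained near $\theta = \pi$, which is exactly the scenario $n^{it} \approx -1$ that the argument above is designed to exclude.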

Exercise 36 Use the Selberg symmetry formula (equation (67) from Notes 1) to obtain the asymptotics

and

and

as , for any fixed . By using the bound , conclude that

and use this to give an alternate proof of the prime number theorem. (This argument is related, though not completely identical, to the Erdös-Selberg elementary proof of the prime number theorem, which we will not give here.)

Remark 37 Another heuristic way to see the lack of zeroes on the line is to return to the explicit formula (1). If there was a zero at , there would also be a zero at thanks to the conjugation symmetry (22), and hence

In particular, should behave like or less on the average in the region where (which would imply that other powers are also comparable to if is an integer multiple of , or else oscillate “orthogonally” to ). But is non-negative, which heuristically suggests a contradiction. One can interpret the arguments based on (34) above as a rigorous implementation of this heuristic argument.

We have now established that the Riemann zeta function has no zeroes on the line . Since the zeroes of are discrete, this implies a *qualitative* zero-free region to the left of this line, in the sense that there is an open neighbourhood of this line that is free of zeroes of . However, for applications (such as Corollary 23), we need a more quantitative zero-free region. To do this, we return to the bound in Proposition 19 as a quantitative substitute for the bound (28). We specialise to the case where with and , and set (say). In this case, the term is , and we conclude that

Observe that as all the zeroes have real part at most , the quantity has non-negative real part. Thus we have

whenever and . If there is a zero of with the same imaginary part as , with , then we have the improvement

(note that the term is only of size and thus negligible if ). This now gives

Proposition 38 (Classical zero-free region) There exists an absolute constant such that there are no zeroes of in the region

*Proof:* Let be a small constant to be chosen later. Suppose for contradiction that one has

for some and . As has a simple pole at , there are no zeroes in a neighbourhood of , and so one has if is small enough. For any , we conclude from (38) that

while from (37) one has

and from (13) one has

Inserting these bounds into (35), we conclude that

for any . Setting for a sufficiently large absolute constant (actually suffices), we still have if is small enough, and the left-hand side is equal to

For large enough, is negative, and we contradict the hypothesis if is small enough.

We can insert this zero-free region into Corollary 23, optimising the choice of parameters, to obtain a quantitative form of the prime number theorem, first obtained by de la Vallée Poussin:

Corollary 39 (Prime number theorem with classical error term) We have

for all and some absolute constant . In particular, one has

for any and .

*Proof:* Apply Corollary 23 with and for some small absolute constant ; this choice of parameters is designed to roughly balance the size of two error terms in that corollary, which is usually a near-optimal way to choose parameters. The required zero-free region follows from Proposition 38 if is small enough, and the claim then follows (noting that logarithmic factors can be absorbed into the decay factors by shrinking slightly).

Exercise 40 (Alternate form of prime number theorem) Show that for any , the number of primes less than or equal to obeys the estimate

for some absolute constant , where the logarithmic integral is defined by the formula

Conclude in particular that

for all ; in particular, the simple form of the prime number theorem is not particularly accurate, and one should use the refined version instead (or better yet, work with the von Mangoldt function).
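The gap in accuracy between the two forms of the prime number theorem is already visible at modest heights. The sketch below compares the prime counting function $\pi(x)$ at $x = 10^6$ with $\frac{x}{\log x}$ and with the logarithmic integral, here assumed to be normalised as $\int_2^x \frac{dt}{\log t}$ (the notes may use a slightly different lower limit, which changes the value only by a bounded amount). The function names and the quadrature parameters are illustrative.

```python
import math


def prime_count(x):
    """Number of primes <= x, via the sieve of Eratosthenes."""
    sieve = bytearray([1]) * (x + 1)
    sieve[0:2] = b"\x00\x00"
    for p in range(2, math.isqrt(x) + 1):
        if sieve[p]:
            sieve[p * p :: p] = bytearray(len(range(p * p, x + 1, p)))
    return sum(sieve)


def log_integral(x, steps=100000):
    """Simpson's rule for the integral of dt / log t from 2 to x,
    after the substitution t = e^u (so the integrand becomes e^u / u).
    `steps` must be even."""
    a, b = math.log(2), math.log(x)
    h = (b - a) / steps
    integrand = lambda u: math.exp(u) / u
    total = integrand(a) + integrand(b)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * integrand(a + k * h)
    return total * h / 3


x = 10**6
pi_x = prime_count(x)
li_x = log_integral(x)
naive = x / math.log(x)
print(pi_x, li_x, naive)
```

At this height the logarithmic integral is off by a little over a hundred, while $\frac{x}{\log x}$ is off by several thousand.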

Exercise 41 (Prime number theorem for Möbius) Show that there is an absolute constant such that one has the bounds

and

whenever and . Conclude the alternate form

of the prime number theorem with classical error term for all and some .

Exercise 42 (Landau-Beurling prime number theorem) Let

be a set of real numbers, which we refer to as Beurling primes. Define a Beurling integer to be a real number of the form

for some and ; note that due to potential collisions between different products of Beurling primes, it is possible for a real number to be a Beurling integer in multiple ways. Let and denote the sets of Beurling primes and Beurling integers respectively. If we have the asymptotic bound

for all and some absolute constant , establish the Landau-Beurling prime number theorem

for all and some absolute constant ; this generalises Exercise 40. (Hint: form the Beurling zeta function and show that it has a meromorphic continuation to the region , and obeys the bounds for , , and . Then repeat the proof of the prime number theorem, all the way down to Exercise 40.) This result is essentially due to Landau; Beurling was able to obtain a variant in which the hypothesis (39) and conclusion (40) were both weakened. On the other hand, it was shown by Diamond, Montgomery, and Vorhauer that without any further axioms on Beurling integers beyond (39), it is not possible to improve upon the estimate (40) (other than by sharpening the constant ). Thus, to go beyond the prime number theorem with classical error term, one needs to know more about the natural numbers than just that they are roughly uniformly distributed on the positive real axis in the sense of (39).

In later notes, we will obtain better upper bounds on in the critical strip (and particularly near the line ) that improve upon (24). This will allow us to obtain variants of Proposition 19 near the line in which the error term is replaced with a smaller quantity. The argument based on (35) will then allow us to enlarge the classical zero-free region in Proposition 38, which in turn leads to an improved error term in the prime number theorem. The asymptotically strongest such result is due to Vinogradov and Korobov, who use new upper bounds on to obtain a zero-free region of the form

for some , which leads to the prime number theorem

for some and all ; see the exercise below. This still falls short of the claims in Proposition 24 for any fixed , however it is important for some applications (e.g. finding primes in short intervals) to get some improvement over the classical zero-free region in Proposition 38.

Exercise 43

- (i) Establish the upper bound
whenever and for some absolute constant . (Hint: apply (21) for a suitable choice of .)

- (ii) Assume that the upper bound (43) in fact holds in the larger region where and for some absolute constant . (This bound, essentially due to Vinogradov and Korobov, will be rigorously established in later notes.) Conclude the variant
when with and , and is an absolute constant.

- (iii) With the assumption in (ii), establish a zero-free region of the form (41).
- (iv) Assuming a zero-free region of the form (41), deduce (42).
- (v) What happens if one starts only with the bound in (i), rather than in (ii)?

** — 4. Dirichlet -functions, Siegel’s theorem, and the prime number theorem in arithmetic progressions — **

We now extend the above theory of the Riemann zeta function to Dirichlet -functions , where is a Dirichlet character of some period . As already remarked in Remark 1, the theory of such functions is very similar to that of the zeta function, with the character being like a “non-Archimedean” counterpart of the “Archimedean” character . However, there is one key new feature, which is that the behaviour near is not completely understood when is a real character.

For , the Dirichlet -functions are defined as , thus

By the general theory of Dirichlet series, this is an analytic function on the half-space . Since and , we then have

and

where the derivative is always understood to be with respect to the variable. In particular, we have the analogue of (11):

whenever with and . In particular, has no zeroes in the region .

We also have the Euler product

If is a principal character, thus , then we may compare this Euler product with the corresponding Euler product of the Riemann zeta function, and conclude that

The product extends analytically to the entire complex plane, and has no zeroes in the region , so this -function has a meromorphic extension to with exactly the same zeroes and poles as .

The more interesting situation occurs when is non-principal. In particular, it has mean zero on every interval of length . This gives a bound on slowly varying sums of (cf. Lemma 71 of Notes 1):

Exercise 44 Let be a non-principal Dirichlet character of period , let , and let be a continuously differentiable function. Show that

Using this exercise, we see that

whenever with . By Lemma 5 of Notes 1, we conclude that for any such , there is a unique complex number such that

for any . In particular, the partial sums converge locally uniformly to on the half-space , and so is holomorphic on this region. This is similar to , but with the key difference that there is no longer any pole at .
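The mean zero property and the resulting convergence can be seen concretely for the simplest non-principal character. The sketch below uses the character of period 4 (our choice of example, not one singled out in the text) and compares partial sums of its Dirichlet series at s = 1 with the classical Leibniz value π/4.

```python
import math

def chi4(n):
    """The non-principal Dirichlet character of period 4."""
    return (0, 1, 0, -1)[n % 4]

def block_sum(chi, start, q):
    """Sum of chi over the q consecutive integers start, ..., start + q - 1."""
    return sum(chi(n) for n in range(start, start + q))

def partial_l(chi, s, n_max):
    """Partial sum of the Dirichlet series sum_n chi(n) / n^s up to n_max."""
    return sum(chi(n) / n**s for n in range(1, n_max + 1))

# Mean zero on every interval of length equal to the period:
print([block_sum(chi4, a, 4) for a in range(1, 9)])

# Partial sums at s = 1 settle down to a finite limit (no pole):
for n_max in (10, 1000, 100000):
    print(n_max, partial_l(chi4, 1.0, n_max), math.pi / 4)
```

For the principal character the analogous partial sums at s = 1 grow like a multiple of log n, which is the pole that is absent here.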

Setting in (47), we obtain the crude bound

when with . Taking logarithms, we have

One can then repeat much of the arguments in Section 2 with few changes (other than replacing logarithmic factors such as with instead, and removing the effect of a pole at ):

Exercise 45 Let be a non-principal character of period .

- (i) (Crude upper bound on -function zeroes) For any and , show there are at most zeroes of in the region . (As with , zeroes of are always understood here to be counted with multiplicity.)
- (ii) (Approximate formula for log-derivative of -function) For any , show that
whenever and . Here and in the rest of this exercise, the sum is over the zeroes of .

- (iii) (Local integrability of log-derivative) For any and , show that
- (iv) (Truncated twisted von Mangoldt explicit formula) For any and , show that
(Compare with (5).)

- (v) (Smoothed twisted explicit formula) Let be a smooth, compactly supported function. Then for any and , show that
with the sum on the right-hand side being absolutely convergent. (Again, compare with (5).)

- (vi) (Zero-free region controls twisted von Mangoldt summatory function) Let and , and suppose that there are no zeroes of in the rectangle . Then show that
for all .

Exercise 46 Let be a non-principal character of period , and let . Show that the following assertions are equivalent:

- (i) One has as .
- (ii) One has for all .
- (iii) All the zeroes of have real part at most .

Based on this exercise, it is now natural to generalise the Riemann hypothesis:

Conjecture 47 (Generalised Riemann hypothesis) Let be a Dirichlet character. Then all the zeroes of have real part at most .

Given that the Riemann hypothesis (RH) remains unsolved, the stronger assertion of the generalised Riemann hypothesis (GRH) is also unsolved. (But later on we will establish the Bombieri-Vinogradov theorem, of major importance in sieve theory, which can be viewed as a kind of assertion that the generalised Riemann hypothesis holds “on average” in a certain technical sense.)

Exercise 48 Assume the generalised Riemann hypothesis. Show that for all primitive congruence classes and all .

Exercise 49 (Equivalent forms of twisted prime number theorem) Let be a non-principal Dirichlet character. Show that the following assertions are equivalent:

- (i) One has as .
- (ii) All the zeroes of have real part strictly less than one.

This exercise should be compared with the derivation of Dirichlet’s theorem (Theorem 70 from Notes 1) from the non-vanishing of (Theorem 73 from Notes 1).

Now we obtain zero-free regions for for a Dirichlet character of period . From the Mertens trigonometric inequality (32) we have

where is the principal character of period ; equivalently, we have

Multiplying by for some and summing, we obtain a twisted version of (35),

for any and . Integrating this in from gives a twisted version of (36):

We can now strengthen Dirichlet’s theorem (Theorem 65 from Notes 1):

Exercise 50 (Prime number theorem in arithmetic progressions)

- (i) For any Dirichlet character , show that has no zeroes on the line . (You will need Theorem 73 from Notes 1 to deal with the case.)
- (ii) For any primitive residue class , show that
as (keeping and fixed). (The decay rate in the notation may depend on and .)

- (iii) For any primitive residue class , show that the number of primes in less than is as (keeping and fixed).
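The equidistribution asserted in this exercise is easy to observe numerically. The following sketch counts the primes up to x in each primitive residue class modulo q and compares each count with the total divided by the Euler totient of q; the function names are ours, and a finite computation of this kind illustrates the statement without proving anything about it.

```python
import math
from collections import Counter

def primes_up_to(x):
    """List the primes <= x with a sieve of Eratosthenes."""
    sieve = bytearray([1]) * (x + 1)
    sieve[0:2] = b"\x00\x00"
    for p in range(2, math.isqrt(x) + 1):
        if sieve[p]:
            sieve[p * p :: p] = bytearray(len(range(p * p, x + 1, p)))
    return [n for n in range(2, x + 1) if sieve[n]]

def euler_phi(q):
    """Number of primitive residue classes modulo q."""
    return sum(1 for a in range(1, q + 1) if math.gcd(a, q) == 1)

def primes_by_class(x, q):
    """Map each primitive residue class a mod q to the number of primes <= x in it."""
    counts = Counter(p % q for p in primes_up_to(x))
    return {a: counts[a] for a in range(q) if math.gcd(a, q) == 1}

x, q = 10**6, 4
expected = len(primes_up_to(x)) / euler_phi(q)
for a, count in primes_by_class(x, q).items():
    print(f"{a} mod {q}: {count} primes, ratio to expected {count / expected:.4f}")
```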

Next, we obtain the analogue of the classical zero-free region (Proposition 38), though with an important exception due to the lack of control near :

Proposition 51 (Classical zero-free region for -functions) There exists an absolute constant such that, for any Dirichlet character of period , there are no zeroes of in the region

with the possible exception of a single real zero (which we refer to as an exceptional zero or Siegel zero). The exceptional zero can only occur if is a non-principal real character.

*Proof:* We may assume that is non-principal, since otherwise the claim follows from Proposition 38. In particular, .

Let be a small constant to be chosen later, and let be sufficiently small depending on .

First suppose that is a complex character, so that is non-principal. Suppose first that we have for some and . From Exercise 45(ii) and taking real parts, we have

for any . Similarly, because is non-principal, we have

while from (44) we have

Applying (50), we conclude that

Setting for (say) , we obtain a contradiction with small enough. This completes the proof of the proposition when is complex.

Now suppose that is a real character, so that . We can adapt the previous argument, but need a new tool to estimate . By (45) we have

We crudely bound

and then apply Proposition 19 and take real parts to conclude that

Applying (50) as before, we conclude that

As before, we set and conclude that

If , then (say) if is small enough, leading again to a contradiction. Thus the only remaining case is when is real and .

We now show that there is at most one zero of in the region . If there are two such zeroes , then from Exercise 45(ii) and taking real parts we have

comparing this with (44), we conclude that

If we set (say), we obtain a contradiction if is small enough.

As is real, we have the conjugation symmetry

and so if is a zero of , then is one also. Thus there can be no strictly complex zeroes in the region , and at most one real zero; and the claim follows. (From Theorem 73 from Notes 1, cannot equal , and from (44) there are no zeroes to the right of .)

The exceptional zero in the above theorem is quite a nuisance; if one believes in the generalised Riemann hypothesis, it should not exist, but frustratingly, we have not been able to completely exclude this zero from occurring. However, there is an important *repulsion phenomenon* (known as the Deuring-Heilbronn repulsion phenomenon), that asserts (roughly speaking) that the existence of one exceptional zero tends to repel away other exceptional zeroes. We already saw one instance of this phenomenon when proving Proposition 51, when we showed that a single character could not have two or more exceptional zeroes. Another instance appeared in Proposition 76 of Notes 1.

To state the repulsion phenomenon more precisely, we have to exclude a degenerate case, coming from the fact that if one multiplies a Dirichlet character (of some modulus ) by a principal character (of some modulus ), then the resulting Dirichlet character (which has modulus ) has essentially the same -function as , as and differ by a finite number of Euler factors (as in (45)), and so the two -functions have an identical set of zeroes in the region . To avoid this problem, let us call a Dirichlet character of modulus *primitive* if it cannot be factored as , where is a principal character and is a character of modulus strictly less than .

Exercise 52 Show that every Dirichlet character of modulus can be uniquely factored as , where is a primitive character of some modulus (known as the conductor of ) and is a principal character whose modulus is coprime to . Furthermore, divides , and is real if and only if is real. Thus we see that to understand the zeroes of Dirichlet -functions , it suffices to do so for the primitive characters.
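For small moduli the conductor can be found by brute force. The sketch below relies on the standard criterion that a character of modulus q is induced from a modulus d dividing q exactly when it equals 1 at every unit congruent to 1 modulo d; the function `conductor` and the sample characters are our own illustrations, and the code assumes the character values are returned exactly (for example as integers, as for real characters).

```python
import math

def conductor(chi, q):
    """Smallest divisor d of q such that chi (of modulus q) is induced from modulus d.

    Uses the criterion chi(n) = 1 for every unit n mod q with n = 1 mod d.
    Assumes chi returns exact values, so that comparison with 1 is reliable.
    """
    units = [n for n in range(1, q + 1) if math.gcd(n, q) == 1]
    for d in range(1, q + 1):
        if q % d == 0 and all(chi(n) == 1 for n in units if n % d == 1 % d):
            return d

def chi4(n):
    """Non-principal character of period 4 (also a character of modulus 8)."""
    return (0, 1, 0, -1)[n % 4]

def chi8(n):
    """A real character of modulus 8 that is not induced from modulus 4."""
    return (0, 1, 0, -1, 0, -1, 0, 1)[n % 8]

def principal12(n):
    """Principal character of modulus 12."""
    return 1 if math.gcd(n, 12) == 1 else 0

print(conductor(chi4, 4), conductor(chi4, 8), conductor(chi8, 8), conductor(principal12, 12))
```

The second value shows a character of modulus 8 that is really the period 4 character in disguise, which is the degenerate situation the notion of primitivity is designed to exclude.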

Here is one standard manifestation of the repulsion phenomenon:

Theorem 53 (Landau’s theorem) There is an absolute constant with the following property: whenever are two distinct real primitive characters of conductor respectively, there is at most one real zero of or with .

*Proof:* Let be sufficiently small. If the claim failed, then (since each -function has at most one exceptional zero) we can find such that .

In previous arguments, one used the inequality (49). Here, we will instead use the inequality

which we expand as

Multiplying by for some and summing, we conclude that

(We do not need to take real parts here, as everything in sight is already real.) From (13) we have

and from Exercise 45(ii) we have

and

Finally, as are distinct primitive characters of modulus respectively, is a non-principal character of modulus at most , and so from Exercise 45(ii) again we have

Putting all this together, we see that

Setting , we obtain a contradiction if is large enough.

This gives a variant of Proposition 51, in which the zero-free region is reduced slightly, but there is only one primitive character that has an exceptional zero:

Exercise 54 (Page’s theorem) Let . Show that for each primitive character of conductor at most , the -function has a zero-free region of the form for some absolute constant , with the possible exception of a single real zero for a single primitive real character of modulus at most .

We will refer to Landau’s theorem and Page’s theorem collectively as the *Landau-Page theorem*.

Exercise 55 (Prime number theorem in arithmetic progressions with classical error term) Let be a sufficiently small quantity, and let be a natural number.

- (i) If is the principal Dirichlet character modulo , show that
if and .

- (ii) If is a non-principal Dirichlet character modulo , show that
if , and has an exceptional zero (which, for this current exercise, means a zero of with ). If has no exceptional zero, then the term should be deleted; this is for instance the case when is complex.

- (iii) If is a primitive residue class modulo , show that
if , , where is a real non-principal character of modulus with an exceptional zero . (Note from Page’s theorem that there is at most one such character, if is small enough.) This character will be called the exceptional character. If there is no exceptional character, the term should be deleted.

(Note: the constant may need to be smaller in (iii) than it needs to be for (i) or (ii).)

Remark 56 Informally, the prime number theorem in arithmetic progressions asserts that the primes are equidistributed in the primitive residue classes modulo for , unless there is an exceptional character with exceptional zero , in which case the primes are more or less equidistributed in the primitive residue classes with if , and then become equidistributed in all the primitive residue classes modulo for .

The Landau-Page theorem is good at eliminating exceptional zeroes in a power range such as for any fixed , as it prevents more than a single primitive real character of conductor in this range having an exceptional zero with for an absolute constant . However, it loses control of exceptional zeroes in wider ranges than this. For instance, the Landau-Page theorem does not prevent the existence of an infinite sequence of exceptional real primitive characters whose conductor grows very rapidly in (e.g. ), and with each having an exceptional zero that converges very quickly to , e.g. .

Fortunately we have another way to exploit the repulsion phenomenon even for characters of widely separated modulus. To develop this aspect of the repulsion phenomenon, we first need to establish a link between exceptional zeroes, and exceptionally small values of . We first give one direction of this link:

Lemma 57 (Exceptional zero implies small ) Suppose that is a real non-principal character of modulus whose -function has a real zero with . Then .

*Proof:* From (47) with and we have

for . From the generalised Cauchy integral formula (Exercise 9 of Supplement 2), we thus have

for . Since , the claim now follows from the fundamental theorem of calculus.

To go in the opposite direction, we will borrow a trick from the proof of the non-vanishing of (Theorem 73 from Notes 1) and exploit the positivity

for various choices of and . The key estimate is the following (compare with (21) or (47)):

Exercise 58 Let be a real non-principal character of modulus , and let be a real number. Establish the bound for any . (Hint: use the Dirichlet hyperbola method and (21), (47), (24).) For an additional challenge, see if you can establish this estimate (possibly with a slightly weaker error term) by using the truncated Perron formula and contour shifting (bearing in mind that is only bounded by rather than by ).

Lemma 59 (Small implies exceptional zero) Suppose that is a real non-principal character of modulus such that for some sufficiently small absolute constant . Then there is a real zero of with .

*Proof:* Let for some large absolute constant , and let be sufficiently small depending on , thus (recall that is positive), and so if is small enough. From (53) we have

for any and so from Exercise 58 we have

(say). Setting (say), we conclude that

if is large enough and is small enough. By (24), is negative. Thus, , and so by the intermediate value theorem there must be a zero of between and , and the claim follows.

Now suppose we have two distinct real primitive characters of modulus respectively, so that is also a real non-principal character of modulus at most . As in the proof of the Landau-Page theorem, we have the non-negativity

We will instead exploit the multiplicative version of this non-negativity:

The latter bound can be deduced from the former after using the formal identity

that comes from the identity

valid for all characters (cf. (28) from Notes 1); it can also be verified directly. In particular, we have

for any and . Meanwhile, one has the following variant of Exercise 58:

Exercise 60 Let be distinct real primitive characters of modulus , respectively, and let for some sufficiently small absolute constant . Establish the bound for any . (Hint: either use a higher-dimensional version of the Dirichlet hyperbola method, or the truncated Perron formula and contour shifting.)

This gives a repulsion phenomenon:

Proposition 61 (Repulsion phenomenon) Let be a real primitive character of modulus . Suppose that for some , where is a sufficiently small absolute constant. Then one has for all real primitive characters distinct from , with denoting the modulus of .

*Proof:* From Exercise 60 with and (54) (and shrinking as needed) one has

for any . If we then set for a sufficiently large absolute constant , the error term is less than , and so

From (52) one has , and from choice of we have with implied constants depending on . The claim follows.

We can now give Siegel’s theorem on exceptional zeroes (or on exceptionally small values of ), which will be the first theorem in this set of notes to feature *ineffective* implied constants – constants which cannot be explicitly computed in terms of the given data, but are merely known to be finite and positive.

Theorem 62 (Siegel’s theorem)

- (i) For any , one has the bound
for all but at most one real primitive character of conductor , and some constant .

- (ii) For any , there are no zeroes of in the interval for all but at most one real primitive character of conductor , and some constant .

In both (i) and (ii), the constant is effective: it can be computed explicitly in terms of . However, if one wishes to replace “all but at most one” with “all” in either (i) or (ii), one can do this at the cost of rendering ineffective: this constant is still known to be positive, but we no longer know of a way to compute explicitly in terms of .

Remark 63 The observation that Siegel’s theorem may be made effective if one exceptional character is removed is due to Tatuzawa. Combined with the class number formula, this can be used to show that with at most one exception, all but an explicitly computable finite list of quadratic fields of negative discriminant do not have unique factorisation. Indeed, using related methods, Heilbronn and Linfoot had previously shown that, apart from the nine discriminants (which all give quadratic fields of unique factorisation), there is at most one further negative discriminant giving a quadratic field of unique factorisation. This elusive “tenth discriminant” was finally ruled out by Heegner by some difficult arguments, which have since been clarified by subsequent work of Stark and many further authors, giving what is now known as the Stark-Heegner theorem.

*Proof:* From Lemmas 57 and 59 we see that (i) and (ii) are equivalent, so we will just prove (i). It suffices to prove the claim with one exceptional character deleted and with effective choices of , since one can reinstate the exceptional character (at the cost of making ineffective) just by using the positivity for the exceptional character.

Let be a small (effective) constant, depending only on , to be chosen later. We divide into two cases:

- There are no zeroes of in the interval for *any* real primitive character with a conductor .
- There exists a real primitive character of some conductor with a zero in .

In Case 1, Lemma 59 in the contrapositive gives for all real primitive characters with a conductor , giving the claim (after adjusting slightly).

Now suppose we are in Case 2, so for some real primitive character of conductor and some (recall that is non-vanishing). We may take to be minimal among all such characters. Note that while is obviously finite, we do not have any *effective* bound on , so we have to proceed a little carefully if one is to avoid the final implied constants from depending on or .

Let be a real primitive character of conductor that is distinct from the exceptional character . If then by construction, has no zeroes in , and the required bound follows again from Lemma 59 in the contrapositive. Now suppose that . From Proposition 61, we have

By Lemma 57, we have . Bounding by , we conclude that

and using the bound , we obtain the required estimate if is small enough.

The best known *effective* lower bounds on for *all* real primitive characters (not excluding an exceptional character) go through the class number formula (as briefly discussed in Supplement 1), and take the shape

for some explicit constant ; this corresponds to a zero-free region of of size for some effective constants . The bound (55) is trivial for since the class number is always at least one; it turns out that one can raise to be arbitrarily close to , but this is a difficult result (at least when is associated to a quadratic field of negative discriminant), due to Goldfeld and Gross-Zagier; see this survey of Goldfeld for further discussion. It is of great interest to improve these effective bounds further, but this has not yet been achieved; despite the conjectural non-existence of Siegel zeroes, they seem to live in a stubbornly self-consistent (though somewhat strange) universe that has defied all efforts to eradicate them to date.
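To make the link between the class number and the size of the L-function at 1 concrete, here is a small sketch for the real character given by the Legendre symbol modulo a prime q congruent to 3 mod 4 with q > 3. It assumes the standard form of the class number formula in this case, in which the value at 1 equals π times the class number divided by the square root of q, together with Dirichlet's finite expression for the class number; the function names are ours.

```python
import math

def legendre(n, q):
    """Legendre symbol (n/q) for an odd prime q, via Euler's criterion."""
    r = pow(n, (q - 1) // 2, q)
    return -1 if r == q - 1 else r

def l_one(q, periods=2000):
    """Approximate L(1, chi) by summing chi(n)/n over complete periods of length q."""
    chi = [legendre(n, q) for n in range(q)]
    return sum(chi[n % q] / n for n in range(1, periods * q + 1))

def class_number(q):
    """Dirichlet's finite formula -(1/q) * sum_{n<q} n * chi(n).

    Assumed to give the class number of the imaginary quadratic field of
    discriminant -q, for primes q = 3 mod 4 with q > 3.
    """
    return -sum(n * legendre(n, q) for n in range(1, q)) // q

for q in (7, 23, 163):
    h = class_number(q)
    print(q, h, l_one(q), math.pi * h / math.sqrt(q))
```

For q = 163 the class number is one, so the value at 1 is as small as the trivial class number bound allows; this is the regime in which the lower bound (55) is essentially sharp.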

In summary, we can give the following bounds on exceptional zeroes of -functions of real primitive characters of conductor :

- (i) (Class number methods) One has for some effective and .
- (ii) (Siegel) For any , one has for some *ineffective* .
- (iii) (Tatuzawa) For any , one has for some *effective* , except possibly for a single exceptional character .
- (iv) (Page) One has for some effective and all real primitive characters of conductor at most , except possibly for a single exceptional character .

One also obtains analogous lower bounds on through Lemma 59, and lower bounds on class numbers (at least in the case of negative discriminant) using the class number formula.

All four of the bounds (i), (ii), (iii), (iv) have their advantages and disadvantages, and are all useful in various applications; the choice of which of (i)-(iv) to use depends on whether one has some argument to deal with a potential exceptional character, whether one can tolerate ineffective values of the implied constant, and whether one has a reasonable bound on the conductor of the characters one wishes to use.

A basic application of Siegel’s theorem is the Siegel-Walfisz theorem.

Exercise 64 (Siegel-Walfisz theorem) For any , show that there exists an (ineffective) constant such that for all primitive residue classes and all with

(Hint: use Exercise 55(iii) together with Siegel’s theorem to handle the exceptional zero.) Conclude in particular that for all primitive residue classes and all (without assuming the size restriction (56)), and with an ineffective constant in the notation.

Of course, the error term in the Siegel-Walfisz theorem can be substantially improved if one assumes the generalised Riemann hypothesis: see Exercise 48. In later notes we will use the Siegel-Walfisz theorem to prove the Bombieri-Vinogradov theorem, which is a theorem of basic importance in sieve theory.

Exercise 65 (Least prime in an arithmetic progression) If is a primitive residue class, show that contains a prime with , with the implied constants ineffective; by using (55), obtain the alternate bound with effective constants, and obtain the improvement with effective constants if there is no exceptional character of modulus . In later notes we will be able to improve all of these bounds to with effective constants, a result known as Linnik’s theorem.

Exercise 66 (Siegel-Walfisz for the Möbius function) Show that for any , one has the bound for all residue classes (not necessarily primitive) and all , with an ineffective constant in the notation. (Hint: reduce to the case in which and is primitive. One can either use a truncated Perron’s formula argument using some lower bounds on for slightly to the left of , or else modify the elementary method from Theorem 58 of Notes 1, using an induction on .)

Exercise 67 (Elementary lower bound for ) The purpose of this exercise is to give a somewhat reasonable effective lower bound on by an elementary (but somewhat ad hoc) device. Let be a real non-principal character of modulus .

- (i) Establish the identity
for any , where is the function , and the sum is in the conditionally convergent sense.

- (ii) Obtain the bounds

In later notes, we will develop Fourier-analytic tools that, among other things, improve the upper bound on (57) to , which basically recovers the bound coming from the class number formula.


We begin by recalling the notion of a holomorphic function, which will later be shown to be synonymous with that of a complex analytic function.

Definition 1 (Holomorphic function) Let be an open subset of , and let be a function. If , we say that is complex differentiable at if the limit exists, in which case we refer to as the (complex) derivative of at . If is differentiable at every point of , and the derivative is continuous, we say that is holomorphic on .

Exercise 2 Show that a function is holomorphic if and only if the two-variable function is continuously differentiable on and obeys the Cauchy-Riemann equation

Basic examples of holomorphic functions include complex polynomials

as well as the complex exponential function

which are holomorphic on the entire complex plane (i.e., they are entire functions). The sum or product of two holomorphic functions is again holomorphic; the quotient of two holomorphic functions is holomorphic so long as the denominator is non-zero. Finally, the composition of two holomorphic functions is holomorphic wherever the composition is defined.

Exercise 3

- (i) Establish Euler’s formula
for all . (Hint: it is a bit tricky to do this starting from the trigonometric definitions of sine and cosine; I recommend either using the Taylor series formulations of these functions instead, or alternatively relying on the ordinary differential equations obeyed by sine and cosine.)

- (ii) Show that every non-zero complex number has a complex logarithm such that , and that this logarithm is unique up to integer multiples of .
- (iii) Show that there exists a unique principal branch of the complex logarithm in the region , defined by requiring to be a logarithm of with imaginary part between and . Show that this principal branch is holomorphic with derivative .
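These three facts can be checked numerically with the standard library. The sketch below assumes that `cmath.log` returns the principal branch, with imaginary part between -π and π; the sample values of θ and z are arbitrary choices of ours.

```python
import cmath
import math

# (i) Euler's formula at a sample angle.
theta = 0.75
lhs = cmath.exp(1j * theta)
rhs = complex(math.cos(theta), math.sin(theta))
print(abs(lhs - rhs))

# (ii) A logarithm of z, and a second one differing by an integer multiple of 2*pi*i.
z = -1 + 1j
w = cmath.log(z)
shifted = w + 3 * 2j * math.pi
print(cmath.exp(w), cmath.exp(shifted))

# (iii) The principal branch has imaginary part in (-pi, pi].
print(w.imag, -math.pi < w.imag <= math.pi)
```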

In real analysis, we have the fundamental theorem of calculus, which asserts that

whenever is a real interval and is a continuously differentiable function. The complex analogue of this fact is that

whenever is a holomorphic function, and is a contour in , by which we mean a piecewise continuously differentiable function, and the contour integral for a continuous function is defined via change of variables as

The complex fundamental theorem of calculus (2) follows easily from the real fundamental theorem and the chain rule.
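As an illustration of the change of variables definition, the following sketch approximates a contour integral by applying Simpson's rule to the parametrised integrand, and compares the result with the difference of endpoint values predicted by (2). The test function (the exponential, which is its own derivative) and the parabolic contour are arbitrary choices of ours.

```python
import cmath

def contour_integral(f, gamma, dgamma, a, b, steps=2000):
    """Approximate int_a^b f(gamma(t)) * gamma'(t) dt by Simpson's rule (steps even)."""
    h = (b - a) / steps
    g = lambda t: f(gamma(t)) * dgamma(t)
    total = g(a) + g(b)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * g(a + k * h)
    return total * h / 3

# Contour gamma(t) = t + i t^2 from 0 to 1 + i, with derivative 1 + 2 i t.
gamma = lambda t: complex(t, t * t)
dgamma = lambda t: complex(1, 2 * t)

value = contour_integral(cmath.exp, gamma, dgamma, 0.0, 1.0)
print(value, cmath.exp(gamma(1.0)) - cmath.exp(gamma(0.0)))
```

Replacing the parabola by any other contour with the same endpoints should give the same value up to numerical error, in line with contour shifting.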

In real analysis, we have the rather trivial fact that the integral of a continuous function on a closed contour is always zero:

In complex analysis, the analogous fact is significantly more powerful, and is known as Cauchy’s theorem:

Theorem 4 (Cauchy’s theorem) Let be a holomorphic function in a simply connected open set , and let be a closed contour in (thus ). Then .

Exercise 5 Use Stokes’ theorem to give a proof of Cauchy’s theorem.

A useful reformulation of Cauchy’s theorem is that of contour shifting: if is a holomorphic function on an open set , and are two contours in an open set with and , such that can be continuously deformed into , then . A basic application of contour shifting is the Cauchy integral formula:

Theorem 6 (Cauchy integral formula) Let be a holomorphic function in a simply connected open set , and let be a closed contour which is simple (thus does not traverse any point more than once, with the exception of the endpoint that is traversed twice), and which encloses a bounded region in the anticlockwise direction. Then for any , one has

*Proof:* Let be a sufficiently small quantity. By contour shifting, one can replace the contour by the sum (concatenation) of three contours: a contour from to , a contour traversing the circle once anticlockwise, and the reversal of the contour that goes from to . The contributions of the contours cancel each other, thus

By a change of variables, the right-hand side can be expanded as

Sending , we obtain the claim.
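Both Cauchy's theorem and the Cauchy integral formula are easy to test numerically on a circle. The sketch below discretises the circle with the trapezoid rule, which converges very rapidly for integrands that are analytic near the contour; the helper names and the sample point are ours.

```python
import cmath
import math

def circle_integral(g, center=0j, radius=1.0, steps=256):
    """Approximate the anticlockwise integral of g over |z - center| = radius.

    Uses gamma(t) = center + radius * e^{it} on [0, 2*pi] and the trapezoid rule.
    """
    total = 0j
    for k in range(steps):
        w = cmath.exp(2j * math.pi * k / steps)
        z = center + radius * w
        total += g(z) * (1j * radius * w)  # g(gamma(t)) * gamma'(t)
    return total * (2 * math.pi / steps)

def cauchy_value(f, z0, center=0j, radius=1.0):
    """Recover f(z0) from the values of f on the circle via the Cauchy integral formula."""
    return circle_integral(lambda z: f(z) / (z - z0), center, radius) / (2j * math.pi)

z0 = 0.3 + 0.2j
print(circle_integral(cmath.exp))            # Cauchy's theorem: close to 0
print(cauchy_value(cmath.exp, z0), cmath.exp(z0))
```

Taking the point to be the centre of the circle turns the same computation into an average of the function over the circle, which is the mean value property.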

The Cauchy integral formula has many consequences. Specialising to the case when traverses a circle around , we conclude the mean value property

whenever is holomorphic in a neighbourhood of the disk . In a similar spirit, we have the maximum principle for holomorphic functions:

Lemma 7 (Maximum principle) Let be a simply connected open set, and let be a simple closed contour in enclosing a bounded region anti-clockwise. Let be a holomorphic function. If we have the bound for all on the contour , then we also have the bound for all .

*Proof:* We use an argument of Landau. Fix . From the Cauchy integral formula and the triangle inequality we have the bound

for some constant depending on and . This ostensibly looks like a weaker bound than what we want, but we can miraculously make the constant disappear by the “tensor power trick”. Namely, observe that if is a holomorphic function bounded in magnitude by on , and is a natural number, then is a holomorphic function bounded in magnitude by on . Applying the preceding argument with replaced by we conclude that

and hence

Sending , we obtain the claim.

Another basic application of the integral formula is

Corollary 8 Every holomorphic function is complex analytic, thus it has a convergent Taylor series around every point in the domain. In particular, holomorphic functions are smooth, and the derivative of a holomorphic function is again holomorphic.

Conversely, it is easy to see that complex analytic functions are holomorphic. Thus, the terms “complex analytic” and “holomorphic” are synonymous, at least when working on open domains. (On a non-open set , saying that is analytic on is equivalent to asserting that extends to a holomorphic function of an open neighbourhood of .) This is in marked contrast to real analysis, in which a function can be continuously differentiable, or even smooth, without being real analytic.

*Proof:* By translation, we may suppose that . Let be a contour traversing the circle that is contained in the domain ; then by the Cauchy integral formula one has

for all in the disk . As is continuously differentiable (and hence continuous) on , it is bounded. From the geometric series formula

and dominated convergence, we conclude that

with the right-hand side an absolutely convergent series for , and the claim follows.

Exercise 9 Establish the generalised Cauchy integral formulae for any non-negative integer , where is the -fold complex derivative of .
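The generalised formulae give a practical way to compute Taylor coefficients from values on a circle. The sketch below assumes the standard form of the formula, in which the k-th derivative at the centre divided by k! equals the contour integral of the function divided by the (k+1)-th power of the displacement from the centre, normalised by 2πi; after parametrising the circle this becomes a plain average, which we approximate with equally spaced sample points. The function name is ours.

```python
import cmath
import math

def taylor_coefficient(f, k, z0=0j, radius=1.0, steps=256):
    """Approximate f^(k)(z0) / k! as the average of f(z0 + r e^{it}) e^{-ikt} / r^k."""
    total = 0j
    for j in range(steps):
        w = cmath.exp(2j * math.pi * j / steps)
        total += f(z0 + radius * w) / w**k
    return total / (steps * radius**k)

# For the exponential function at the origin the coefficients should be 1/k!.
for k in range(6):
    print(k, taylor_coefficient(cmath.exp, k), 1 / math.factorial(k))
```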

This in turn leads to a converse to Cauchy’s theorem, known as Morera’s theorem:

Corollary 10 (Morera’s theorem) Let be a continuous function on an open set with the property that for all closed contours . Then is holomorphic.

*Proof:* We can of course assume to be non-empty and connected (hence path-connected). Fix a point , and define a “primitive” of by defining , with being any contour from to (this is well defined by hypothesis). By mimicking the proof of the real fundamental theorem of calculus, we see that is holomorphic with , and the claim now follows from Corollary 8.

An important consequence of Morera’s theorem for us is

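Corollary 11 (Locally uniform limit of holomorphic functions is holomorphic) Let $f_n: \Omega \to \mathbf{C}$ be holomorphic functions on an open set $\Omega$ which converge locally uniformly to a function $f: \Omega \to \mathbf{C}$. Then $f$ is also holomorphic on $\Omega$.

*Proof:* By working locally we may assume that $\Omega$ is a ball, and in particular simply connected. As a locally uniform limit of continuous functions, $f$ is continuous. By Cauchy’s theorem, $\int_\gamma f_n(z)\, dz = 0$ for all closed contours $\gamma$ in $\Omega$. By local uniform convergence, this implies that $\int_\gamma f(z)\, dz = 0$ for all such contours, and the claim then follows from Morera’s theorem.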

Now we study the zeroes of complex analytic functions. If a complex analytic function $f$ vanishes at a point $z_0$, but is not identically zero in a neighbourhood of that point, then by Taylor expansion we see that $f$ factors in a sufficiently small neighbourhood of $z_0$ as

$$ f(z) = (z-z_0)^n g(z) \ \ \ \ \ (4) $$

for some natural number $n$ (which we call the *order* of the zero at $z_0$) and some function $g$ that is complex analytic and non-zero near $z_0$; this generalises the factor theorem for polynomials. In particular, the zero $z_0$ is isolated if $f$ does not vanish identically near $z_0$. We conclude that if $\Omega$ is connected and $f$ vanishes on a neighbourhood of some point $z_0$ in $\Omega$, then it must vanish on all of $\Omega$ (since the maximal connected neighbourhood of $z_0$ in $\Omega$ on which $f$ vanishes cannot have any boundary point in $\Omega$). This implies unique continuation of analytic functions: if two complex analytic functions on a connected open set $\Omega$ agree on a non-empty open subset, then they agree everywhere on $\Omega$. In particular, if a complex analytic function on a connected open set $\Omega$ is not identically zero, then all of its zeroes are isolated, so it has only finitely many zeroes on any given compact subset of $\Omega$.
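As a rough numerical illustration of the order of a zero, one can look for the first non-vanishing Taylor coefficient at $z_0$, since a factorisation $f(z) = (z-z_0)^n g(z)$ with $g(z_0) \neq 0$ means exactly that the coefficients of $(z-z_0)^m$ vanish for $m < n$ and not for $m = n$. The sketch below is a heuristic under stated assumptions: the names `taylor_coefficient_at` and `order_of_zero`, the radius, and the tolerance are all hypothetical choices, and a floating-point threshold cannot distinguish a very small coefficient from an exactly zero one.

```python
import cmath
import math


def taylor_coefficient_at(f, n, z0=0j, r=0.5, samples=512):
    """Approximate the coefficient of (z - z0)**n in the Taylor series of f,
    by averaging f(z0 + h) / h**n over the circle |h| = r."""
    total = 0j
    for k in range(samples):
        h = r * cmath.exp(2j * math.pi * k / samples)
        total += f(z0 + h) / h**n
    return total / samples


def order_of_zero(f, z0=0j, max_order=20, tol=1e-9):
    """Return the index of the first Taylor coefficient at z0 whose magnitude
    exceeds tol, or None if none is found up to max_order."""
    for n in range(max_order + 1):
        if abs(taylor_coefficient_at(f, n, z0)) > tol:
            return n
    return None


# sin(z) - z = -z**3/6 + ..., so the zero at the origin has order 3.
print(order_of_zero(lambda z: cmath.sin(z) - z))
```

A return value of `0` indicates that the function does not vanish at $z_0$ at all (up to the tolerance).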

Recall that a rational function is a function $f = P/Q$ which is a quotient of two polynomials (at least outside of the set where $Q$ vanishes). Analogously, let us define a meromorphic function on an open set $\Omega$ to be a function $f: \Omega \backslash S \to \mathbf{C}$ defined outside of a discrete subset $S$ of $\Omega$ (the *singularities* of $f$), which is locally the quotient of holomorphic functions, in the sense that for every $z_0 \in \Omega$, one has $f = g/h$ in a neighbourhood of $z_0$ excluding $z_0$, with $g, h$ holomorphic near $z_0$ and with $h$ non-vanishing outside of $z_0$. If $z_0 \in S$ and $g$ has a zero of equal or higher order than $h$ at $z_0$, then the singularity is removable and one can extend the meromorphic function holomorphically across $z_0$ (by the holomorphic factor theorem (4)); otherwise, the singularity is non-removable and is known as a *pole*, whose order is equal to the difference between the order of $h$ and the order of $g$ at $z_0$. (If one wished, one could extend meromorphic functions to the poles by embedding $\mathbf{C}$ in the Riemann sphere $\mathbf{C} \cup \{\infty\}$ and mapping each pole to $\infty$, but we will not do so here. One could also consider non-meromorphic functions with essential singularities at various points, but we will have no need to analyse such singularities in this course.)

Exercise 12 Show that the space of meromorphic functions on a non-empty connected open set $\Omega$, quotiented by almost everywhere equivalence, forms a field.

By quotienting two Taylor series, we see that if a meromorphic function $f$ has a pole of order $n$ at some point $z_0$, then it has a Laurent expansion

$$ f(z) = \sum_{m=-n}^\infty a_m (z-z_0)^m $$

absolutely convergent in a neighbourhood of $z_0$ excluding $z_0$ itself, and with $a_{-n}$ non-zero. The Laurent coefficient $a_{-1}$ has a special significance, and is called the residue of the meromorphic function $f$ at $z_0$, which we will denote as $\mathrm{Res}(f; z_0)$. The importance of this coefficient comes from the following significant generalisation of the Cauchy integral formula, known as the residue theorem:

Exercise 13 (Residue theorem) Let $f$ be a meromorphic function on a simply connected domain $\Omega$, and let $\gamma$ be a closed contour in $\Omega$ enclosing a bounded region