Let us first discuss the algebraic geometry application. Given a smooth complex -dimensional projective variety, there is a standard line bundle attached to it, known as the canonical line bundle; -forms on the variety become sections of this bundle. The bundle may not actually admit global sections; that is to say, the dimension of the space of global sections may vanish. But as one raises the canonical line bundle to higher and higher powers to form further line bundles , the number of global sections tends to increase; in particular, the dimension of the space of global sections (known as the plurigenus) always obeys an asymptotic of the form

as for some non-negative number , which is called the volume.

It follows from a deep result obtained independently by Hacon–McKernan, Takayama and Tsuji that there is a uniform lower bound for the volume of all -dimensional projective varieties of general type. However, the precise lower bound is not known, and the current paper is a contribution towards probing this bound by constructing varieties of particularly small volume in the high-dimensional limit . Prior to this paper, the best such constructions of -dimensional varieties had exponentially small volume: a construction of volume at most was given by Ballico–Pignatelli–Tasin, and an improved construction with a volume bound of was given by Totaro and Wang. In this paper, we obtain a variant construction with the somewhat smaller volume bound of ; the method also gives comparable bounds for some other related algebraic geometry statistics, such as the largest for which the pluricanonical map associated to the linear system fails to be a birational embedding into projective space.

The space is constructed by taking a general hypersurface of a certain degree in a weighted projective space and resolving the singularities. These varieties are relatively tractable to work with, as one can use standard algebraic geometry tools (such as the Reid–Tai inequality) to provide sufficient conditions to guarantee that the hypersurface has only canonical singularities and that the canonical bundle is a reflexive sheaf, which allows one to calculate the volume exactly in terms of the degree and weights . The problem then reduces to optimizing the resulting volume given the constraints needed for the above-mentioned sufficient conditions to hold. After working with a particular choice of weights (which consist of products of mostly consecutive primes, with each product occurring with suitable multiplicities ), the problem eventually boils down to trying to minimize the total multiplicity , subject to certain congruence conditions and other bounds on the . Using crude bounds on the eventually leads to a construction with volume at most , but by taking advantage of the ability to “dilate” the congruence conditions and optimizing over all dilations, we are able to improve the constant to .

Now it is time to turn to the analytic side of the paper by describing the optimization problem that we solve. We consider the *sawtooth* function , with defined as the unique real number in that is equal to mod . We consider a (Borel) probability measure on the real line, and then compute the average value of this sawtooth function

If one considers the deterministic case in which is a Dirac mass supported at some real number , then the Dirichlet approximation theorem tells us that there is such that is within of an integer, so we have

in this case, and this bound is sharp for deterministic measures , giving a corresponding bound on the average. However, both of these bounds turn out to be far from the truth, and the optimal value is comparable to . In fact we were able to compute this quantity precisely:

Theorem 1 (Optimal bound for sawtooth inequality) Let . In particular, we have as .

- (i) If for some natural number , then .
- (ii) If for some natural number , then .
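The Dirichlet approximation step invoked above is easy to test numerically: for any real x and any n, some multiple qx with 1 ≤ q ≤ n lies within 1/(n+1) of an integer. Below is a minimal brute-force sketch in Python (the helper names `saw` and `dirichlet_min` are mine, not from the paper):

```python
import math
import random

def saw(x):
    # sawtooth: the unique representative of x mod 1 in [-1/2, 1/2)
    return (x + 0.5) % 1.0 - 0.5

def dirichlet_min(x, n):
    # min_{1 <= q <= n} |{q x}|; Dirichlet's theorem guarantees <= 1/(n+1)
    return min(abs(saw(q * x)) for q in range(1, n + 1))

n = 50
phi = (1 + math.sqrt(5)) / 2  # golden ratio, the classical worst case
random.seed(0)
xs = [phi] + [random.uniform(0, 1) for _ in range(100)]
assert all(dirichlet_min(x, n) <= 1 / (n + 1) + 1e-12 for x in xs)
```

The random sample is of course no substitute for the pigeonhole proof, but it makes the sharpness discussion above easy to experiment with.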

We establish this bound through duality. Indeed, suppose we could find non-negative coefficients such that one had the pointwise bound

for all real numbers . Integrating this against an arbitrary probability measure , we would conclude a corresponding bound on the average, and hence an upper bound on the quantity of interest. Conversely, one can find lower bounds by selecting suitable candidate measures and computing the means . The theory of linear programming duality tells us that this method must give the optimal bound, but one has to locate the optimal measure and the optimal weights . This we were able to do by first performing some extensive numerics to discover these weights and measures for small values of , then doing some educated guesswork to extrapolate these examples to the general case, and finally verifying the required inequalities. In case (i) the situation is particularly simple, as one can take to be the discrete measure that assigns a probability to the numbers and the remaining probability of to , while the optimal weighted inequality (1) turns out to be one that is easily proven by a telescoping argument. However, the general case turned out to be significantly trickier to work out, and the verification of the optimal inequality required a delicate case analysis (reflecting the fact that equality is attained in this inequality in a large number of places).

After solving the sawtooth problem, we became interested in the analogous question for the sine function, that is to say the best bound in the inequality

The left-hand side is the smallest imaginary part of the first Fourier coefficients of . To our knowledge this quantity has not previously been studied in the Fourier analysis literature. By adopting a similar approach to the sawtooth problem, we were able to compute this quantity exactly as well:

Theorem 2 For any , one has . In particular,

Interestingly, a closely related cotangent sum recently appeared in this MathOverflow post. Verifying the lower bound on boils down to choosing the right test measure ; it turns out that one should pick the probability measure supported on the points with odd, with probability proportional to , and the lower bound verification eventually follows from a classical identity

for , first posed by Eisenstein in 1844 and proved by Stern in 1861. The upper bound arises from establishing the trigonometric inequality for all real numbers , which to our knowledge is new; the left-hand side has a Fourier-analytic interpretation as the convolution of the Fejér kernel with a certain discretized square wave function, and this interpretation is used heavily in our proof of the inequality.

The significance of the Gowers norms is that they control other multilinear forms that show up in additive combinatorics. Given any polynomials and functions , we define the multilinear form

(assuming that the denominator is finite and non-zero). Thus for instance , where we view as formal (indeterminate) variables, and are understood to be extended by zero to all of . These forms are used to count patterns in various sets; for instance, the quantity is closely related to the number of length three arithmetic progressions contained in . Let us informally say that a form has a certain *true complexity*, namely the least degree of Gowers norm needed to control it. For instance:

- and have true complexity ;
- has true complexity ;
- has true complexity ;
- The form (which among other things could be used to count twin primes) has infinite true complexity (which is quite unfortunate for applications).
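To make the counting interpretation concrete, here is a brute-force sketch in the finite model Z/N (the function name and normalization are mine): it counts the pairs (x, r) entering the three-term-progression average.

```python
def three_aps(A, N):
    # number of pairs (x, r) in (Z/N)^2 with x, x+r, x+2r all in A
    # (trivial progressions with r = 0 are included)
    S = {a % N for a in A}
    return sum(1 for x in range(N) for r in range(N)
               if x in S and (x + r) % N in S and (x + 2 * r) % N in S)

N = 13
A = [0, 1, 3, 7, 8]
count = three_aps(A, N)   # 5 trivial progressions (r = 0) plus 6 genuine ones
assert count == 11
# the normalized quantity count / N^2 is the average discussed in the text
```

Dividing the count by N² recovers the averaged form, which is why control of that average translates into counts of patterns.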

Gowers and Wolf formulated a conjecture on what this complexity should be, at least for linear polynomials ; Ben Green and I thought we had resolved this conjecture back in 2010, though it turned out there was a subtle gap in our arguments and we were only able to resolve the conjecture in a partial range of cases. However, the full conjecture was recently resolved by Daniel Altman.

The (semi-)norm is so weak that it barely controls any averages at all. For instance the average

is not controlled by the semi-norm: it is perfectly possible for a -bounded function to have vanishing norm but a large value of (consider for instance the parity function ). Because of this, I propose inserting an additional norm in the Gowers uniformity norm hierarchy between the and norms, which I will call the (or “profinite ”) norm:

where ranges over all arithmetic progressions in . This can easily be seen to be a norm on functions that controls the norm. It is also basically controlled by the norm for -bounded functions ; indeed, if is an arithmetic progression in of some spacing , then we can write as the intersection of an interval with a residue class modulo , and apply Fourier expansion. If we let be a standard bump function supported on with total mass and is a parameter, then (extending by zero outside of ) one obtains a good approximation, as can be seen by using the triangle inequality. After some Fourier expansion of , writing as a linear combination of and using the Gowers–Cauchy–Schwarz inequality, we conclude a bound which, on optimising in , gives the desired control. Forms which are controlled by the norm (but not the norm) would then have their true complexity adjusted to with this insertion.

The norm recently appeared implicitly in work of Peluse and Prendiville, who showed that the form had true complexity in this notation (with polynomially strong bounds). [Actually, strictly speaking this control was only shown for the third function ; for the first two functions one needs to localize the norm to intervals of length . But I will ignore this technical point to keep the exposition simple.] The weaker claim that has true complexity is substantially easier to prove (one can apply the circle method together with Gauss sum estimates).
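As a sanity check on the parity example, one can brute-force a finite model of this profinite norm, taking the supremum of averages over all arithmetic progressions in {1,…,N}. This sketch uses my own normalization conventions, not necessarily the exact ones intended above:

```python
def u1_profinite(f, N):
    # brute-force model of the "profinite" norm: the largest value of
    # |E_{n in [N]} f(n) 1_P(n)| over arithmetic progressions P in {1,...,N}
    best = 0.0
    for a in range(1, N + 1):          # first term of the progression
        for d in range(1, N + 1):      # spacing
            s = 0.0
            n = a
            while n <= N:
                s += f(n)
                n += d
                best = max(best, abs(s) / N)   # every prefix is itself an AP
    return best

N = 64
parity = lambda n: (-1) ** n
# the ordinary mean of the parity function vanishes on [N] for even N ...
assert sum(parity(n) for n in range(1, N + 1)) == 0
# ... but the profinite norm is large: restrict to the even numbers
assert u1_profinite(parity, N) >= 0.4
```

This is exponential-time and only for illustration; it does, however, exhibit exactly the separation between the two norms that motivates the insertion.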

The well known inverse theorem for the norm tells us that if a -bounded function has norm at least for some , then there is a Fourier phase such that

this follows easily from (1) and Plancherel’s theorem. Conversely, from the Gowers–Cauchy–Schwarz inequality one has a converse bound.

For one has a trivial inverse theorem: by definition, the norm of is at least if and only if

Thus the frequency appearing in the inverse theorem can be taken to be zero when working instead with the norm.

For one has the intermediate situation in which the frequency is not taken to be zero, but is instead major arc. Indeed, suppose that is -bounded with , thus

for some progression . This forces the spacing of this progression to be . We write the above inequality as for some residue class and some interval . By Fourier expansion and the triangle inequality we then have for some integer . Convolving by for a small multiple of and a Schwartz function of unit mass with Fourier transform supported on , we obtain a corresponding bound. The Fourier transform of is bounded by and supported on , thus by Fourier expansion and the triangle inequality we have for some , so in particular . Thus we have for some of the major arc form with . Conversely, for of this form, some routine summation by parts gives the bound , so if (2) holds for a -bounded then one must have .

Here is a diagram showing some of the control relationships between various Gowers norms, multilinear forms, and duals of classes of functions (where each class of functions induces a dual norm ):

Here I have included the three classes of functions that one can choose from for the inverse theorem, namely degree two nilsequences, bracket quadratic phases, and local quadratic phases, as well as the more narrow class of globally quadratic phases.

The Gowers norms have counterparts for measure-preserving systems , known as *Host–Kra seminorms*. The norm can be defined for as

To discuss the results, we first consider the situation of the Möbius function, which is technically simpler in some (though not all) ways. We assume familiarity with Gowers norms and standard notations around these norms, such as the averaging notation and the exponential notation . The prime number theorem in qualitative form asserts that

as . With the Vinogradov–Korobov error term, the prime number theorem can be strengthened, and we refer to decay bounds with gains of this type accordingly. Once one restricts to arithmetic progressions, the situation gets worse: the Siegel–Walfisz theorem gives the bound

for any residue class and any , but with the catch that the implied constant is ineffective in . This ineffectivity cannot be removed without further progress on the notorious Siegel zero problem.

In 1937, Davenport was able to show the discorrelation estimate

for any , uniformly in , which leads (by standard Fourier arguments) to the Fourier uniformity estimate . Again, the implied constant is ineffective. If one insists on effective constants, the best bound currently available is for some small effective constant .

For the norm, the previously known results were much weaker. Ben Green and I showed that

uniformly for any , any degree two (filtered) nilmanifold , any polynomial sequence , and any Lipschitz function ; again, the implied constants are ineffective. On the other hand, in a separate paper of Ben Green and myself, we established the following inverse theorem: if for instance we knew that for some , then there exists a degree two nilmanifold of dimension , complexity , a polynomial sequence , and a Lipschitz function of Lipschitz constant such that . Putting the two assertions together and comparing all the dependencies on parameters, one can establish the qualitative decay bound ; however, the decay rate produced by this argument is very weak.

For higher norms , the situation is even worse, because the quantitative inverse theory for these norms is poorer, and indeed it was only with the recent work of Manners that any such bound is available at all (at least for ). Basically, Manners establishes that if

then there exists a degree nilmanifold of dimension , complexity , a polynomial sequence , and a Lipschitz function of Lipschitz constant such that (we allow all implied constants to depend on ). Meanwhile, the bound (3) was extended to arbitrary nilmanifolds by Ben and myself. Again, the two results when concatenated give the qualitative decay , but the decay rate is completely ineffective.

Our first result gives an effective decay bound:

Theorem 1 For any , we have for some . The implied constants are effective.

This is off by a logarithm from the best effective bound (2) in the case. In the case there is some hope to remove this logarithm based on the improved quantitative inverse theory currently available in this case, but there is a technical obstruction to doing so which we will discuss later in this post. For the above bound is the best one could hope to achieve purely using the quantitative inverse theory of Manners.
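For orientation, even the weakest qualitative statement in this circle of ideas, that the Möbius function has small mean, is easy to probe numerically. Below is an illustrative sieve sketch (not from the paper; the function name is mine):

```python
def mobius_sieve(N):
    # mu(n) for 0 <= n <= N (mu[0] is unused) via a simple sieve
    mu = [1] * (N + 1)
    is_prime = [True] * (N + 1)
    for p in range(2, N + 1):
        if is_prime[p]:
            for m in range(2 * p, N + 1, p):
                is_prime[m] = False
            for m in range(p, N + 1, p):
                mu[m] *= -1          # one factor of -1 per prime divisor
            for m in range(p * p, N + 1, p * p):
                mu[m] = 0            # n divisible by p^2 is not squarefree
    return mu

N = 100_000
mu = mobius_sieve(N)
mean = sum(mu[1:]) / N
assert abs(mean) < 0.01  # qualitative prime number theorem: the mean is tiny
```

Of course, such numerics say nothing about the effective decay rates that are the actual point of the theorem.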

We have analogues of all the above results for the von Mangoldt function . Here a complication arises, namely that does not have mean close to zero, and one has to subtract off a suitable approximant to before one can expect good Gowers norm bounds. For the prime number theorem one can just use the approximant , giving

but even for the prime number theorem in arithmetic progressions one needs a more accurate approximant. In our paper it is convenient to use the “Cramér approximant” , where and is the quasipolynomial quantity . Then one can show from the Siegel–Walfisz theorem and standard bilinear sum methods that and hold for all and (with an ineffective dependence on ), again regaining effectivity if is replaced by a sufficiently small constant . All the previously stated discorrelation and Gowers uniformity results for then have analogues for , and our main result is similarly analogous:

Theorem 2 For any , we have for some . The implied constants are effective.

By standard methods, this result also gives quantitative asymptotics for counting solutions to various systems of linear equations in primes, with error terms that gain a factor of with respect to the main term.
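As a quick numerical companion to the von Mangoldt discussion (a sketch with a hypothetical helper name; the content of the theorem is of course far deeper), one can check that the von Mangoldt function has mean close to 1, consistent with using the constant function as the approximant for the plain prime number theorem:

```python
import math

def von_mangoldt_sum(N):
    # psi(N) = sum_{n <= N} Lambda(n): each prime power p^k <= N adds log p
    is_prime = [True] * (N + 1)
    psi = 0.0
    for p in range(2, N + 1):
        if is_prime[p]:
            for m in range(p * p, N + 1, p):
                is_prime[m] = False
            pk = p
            while pk <= N:
                psi += math.log(p)
                pk *= p
    return psi

N = 100_000
# Lambda has average close to 1, matching the approximant for the plain PNT
assert abs(von_mangoldt_sum(N) / N - 1) < 0.02
```

The discrepancy from 1 at this range is far smaller than the tolerance used, so the assertion is comfortably satisfied.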

We now discuss the methods of proof, focusing first on the case of the Möbius function. Suppose first that there is no “Siegel zero”, by which we mean a quadratic character of some conductor with a zero satisfying for some small absolute constant . In this case the Siegel–Walfisz bound (1) improves to a quasipolynomial bound

To establish Theorem 1 in this case, it suffices by Manners’ inverse theorem to establish the polylogarithmic bound for all degree nilmanifolds of dimension and complexity , all polynomial sequences , and all Lipschitz functions of norm . If the nilmanifold had bounded dimension, then one could repeat the arguments of Ben and myself more or less verbatim to establish this claim from (5), which relied on the quantitative equidistribution theory on nilmanifolds developed in a separate paper of Ben and myself. Unfortunately, in the latter paper the dependence of the quantitative bounds on the dimension was not explicitly given. In an appendix to the current paper, we go through that paper to account for this dependence, showing that all exponents depend at most doubly exponentially in the dimension , which is barely sufficient to handle the dimension of that arises here.
Now suppose we have a Siegel zero . In this case the bound (5) will *not* hold in general, and hence also (6) will not hold either. Here, the usual way out (while still maintaining effective estimates) is to approximate not by , but rather by a more complicated approximant that takes the Siegel zero into account, and in particular is such that one has the (effective) pseudopolynomial bound

For the analogous problem with the von Mangoldt function (assuming a Siegel zero for sake of discussion), the approximant is simpler; we ended up using

which allows one to state the standard prime number theorem in arithmetic progressions with classical error term and Siegel zero term compactly. Routine modifications of previous arguments also give the corresponding discorrelation estimates. The one tricky new step is getting from the discorrelation estimate (8) to the Gowers uniformity estimate: one cannot directly apply Manners’ inverse theorem here because and are unbounded. There is a standard tool for getting around this issue, now known as the dense model theorem.

In principle, the above results can be improved for due to the stronger quantitative inverse theorems in the setting. However, there is a bottleneck that prevents us from achieving this, namely that the equidistribution theory of two-step nilmanifolds has exponents which are exponential in the dimension rather than polynomial in the dimension, and as a consequence we were unable to improve upon the doubly logarithmic results. Specifically, if one is given a sequence of bracket quadratics such as that fails to be -equidistributed, one would need to establish a nontrivial linear relationship modulo 1 between the (up to errors of ), where the coefficients are of size ; current methods only give coefficient bounds of the form . An old result of Schmidt demonstrates proof of concept that these sorts of polynomial dependencies on exponents are possible in principle, but actually implementing Schmidt’s methods here seems to be a quite non-trivial task. There is also another possible route to removing a logarithm, which is to strengthen the inverse theorem so that the dimension of the nilmanifold is logarithmic in the uniformity parameter rather than polynomial. Again, the Freiman–Bilu theorem (see for instance this paper of Ben and myself) demonstrates proof of concept that such an improvement in dimension is possible, but some work would be needed to implement it.

Our main result settles this conjecture in the “interior” region of the triangle:

Theorem 1 (Singmaster’s conjecture in the interior of the triangle) If and is sufficiently large depending on , there are at most two solutions to (1) in the region , and hence at most four in the region . Also, there is at most one solution in the region

To verify Singmaster’s conjecture in full, it thus suffices in view of this result to verify the conjecture in the boundary region

(or equivalently ); we have deleted the case , as it of course automatically supplies exactly one solution to (1). It is in fact possible that for sufficiently large there are no further collisions for in the region (3), in which case there would never be more than eight solutions to (1) for sufficiently large . This latter claim is known for bounded values of by Beukers, Shorey, and Tijdeman, with the main tool used being Siegel’s theorem on integral points.

The upper bound of two here for the number of solutions in the region (2) is best possible, due to the infinite family of solutions to the equation

coming from , where is the Fibonacci number.

The appearance of the quantity in Theorem 1 may be familiar to readers who are acquainted with Vinogradov’s bounds on exponential sums, which end up being the main new ingredient in our arguments. In principle this threshold could be lowered if we had stronger bounds on exponential sums.
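As a concrete instance of a nontrivial collision (1) (shown only for illustration; I make no claim about where it sits in the infinite family just mentioned), the value 3003 appears in several distinct positions of Pascal's triangle, which is easy to confirm directly:

```python
from math import comb

# a nontrivial collision between two interior binomial coefficients:
assert comb(15, 5) == comb(14, 6) == 3003
# the same value also occurs in column 2 further down the triangle:
assert comb(78, 2) == 3003
```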

To try to control solutions to (1) we use a combination of “Archimedean” and “non-Archimedean” approaches. In the “Archimedean” approach (following earlier work of Kane on this problem) we view primarily as real numbers rather than integers, and express (1) in terms of the Gamma function as

One can use this equation to solve for in terms of as for a certain real analytic function whose asymptotics are easily computable (for instance one has the asymptotic ). One can then view the problem as one of trying to control the number of lattice points on the graph . Here we can take advantage of the fact that in the regime (which corresponds to working in the left half of Pascal’s triangle), the function can be shown to be convex, but not too convex, in the sense that one has both upper and lower bounds on the second derivative of (in fact one can show that ). This can be used to preclude the possibility of having a cluster of three or more nearby lattice points on the graph , basically because the area subtended by the triangle connecting three of these points would lie between and , contradicting Pick’s theorem. Developing these ideas, we were able to show

Proposition 2 Let , and suppose is sufficiently large depending on . If is a solution to (1) in the left half of Pascal’s triangle, then there is at most one other solution to this equation in the left half with

Again, the example of (4) shows that a cluster of two solutions is certainly possible; the convexity argument only kicks in once one has a cluster of three or more solutions.

To finish the proof of Theorem 1, one has to show that any two solutions to (1) in the region of interest must be close enough for the above proposition to apply. Here we switch to the “non-Archimedean” approach, in which we look at the -adic valuations of the binomial coefficients, defined as the number of times a prime divides . From the fundamental theorem of arithmetic, a collision

between binomial coefficients occurs if and only if one has agreement of the valuations . From the Legendre formula we can rewrite this latter identity (5) as , where denotes the fractional part of . (These sums are not truly infinite, because the summands vanish once is larger than .)
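Taking the collision C(15, 5) = C(14, 6) = 3003 as a test case, the agreement-of-valuations criterion can be checked with Legendre's formula (a sketch; the helper names are mine):

```python
def vp_factorial(n, p):
    # Legendre's formula: v_p(n!) = floor(n/p) + floor(n/p^2) + ...
    s, q = 0, p
    while q <= n:
        s += n // q
        q *= p
    return s

def vp_binom(n, m, p):
    # p-adic valuation of the binomial coefficient C(n, m)
    return vp_factorial(n, p) - vp_factorial(m, p) - vp_factorial(n - m, p)

# a collision of binomial coefficients forces agreement of every p-adic valuation
for p in [2, 3, 5, 7, 11, 13]:
    assert vp_binom(15, 5, p) == vp_binom(14, 6, p)
```

Of course, the equality of the two integers makes this agreement automatic; the point of the paper's argument is to run this logic in reverse, statistically in p.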
A key idea in our approach is to view this condition (6) *statistically*, for instance by viewing as a prime drawn randomly from an interval such as for some suitably chosen scale parameter , so that the two sides of (6) now become random variables. It then becomes advantageous to compare correlations between these two random variables and some additional test random variable. For instance, if and are far apart from each other, then one would expect the left-hand side of (6) to have a higher correlation with the fractional part , since this term shows up in the summation on the left-hand side but not the right. Similarly if and are far apart from each other (although there are some annoying cases one has to treat separately when there is some “unexpected commensurability”, for instance if is a rational multiple of where the rational has bounded numerator and denominator). In order to execute this strategy, it turns out (after some standard Fourier expansion) that one needs to get good control on exponential sums such as

A modification of the arguments also gives similar results for the equation

where is the falling factorial:

Theorem 3 If and is sufficiently large depending on , there are at most two solutions to (7) in the region

Again the upper bound of two is best possible, thanks to identities such as


- ( too small) is contained in some proper subgroup of , or the elements of are constrained to some sort of equation that the full group does not satisfy.
- ( too large) contains some non-trivial normal subgroup of , and as such actually arises by pullback from some subgroup of the quotient group .
- (Structure) There is some useful structural relationship between and the groups .

It is perhaps easiest to explain the flavour of these lemmas with some simple examples, starting with the case where we are just considering subgroups of a single group .

Lemma 1 Let be a subgroup of a group . Then exactly one of the following holds:

- (i) ( too small) There exists a non-trivial group homomorphism into a group such that for all .
- (ii) ( normally generates ) is generated as a group by the conjugates of .

*Proof:* Let be the group normally generated by , that is to say the group generated by the conjugates of . This is a normal subgroup of containing (indeed it is the smallest such normal subgroup). If is all of we are in option (ii); otherwise we can take to be the quotient group and to be the quotient map. Finally, if (i) holds, then all of the conjugates of lie in the kernel of , and so (ii) cannot hold.
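Lemma 1 can be illustrated computationally in a tiny case: in the symmetric group S_3, a single transposition normally generates the whole group, so option (ii) holds. A brute-force sketch (all names are mine):

```python
from itertools import permutations

def compose(f, g):
    # composition of permutations, (f o g)(i) = f[g[i]]
    return tuple(f[g[i]] for i in range(len(g)))

def inverse(f):
    inv = [0] * len(f)
    for i, fi in enumerate(f):
        inv[fi] = i
    return tuple(inv)

def closure(gens, n):
    # subgroup of S_n generated by gens, by brute-force closure under products
    # (a finite set containing the generators and closed under products is a group)
    H = {tuple(range(n))}
    frontier = set(gens)
    while frontier:
        H |= frontier
        frontier = {compose(a, b) for a in H for b in H} - H
    return H

n = 3
G = set(permutations(range(n)))  # the symmetric group S_3
t = (1, 0, 2)                    # a transposition
conjugates = {compose(compose(g, t), inverse(g)) for g in G}
assert closure(conjugates, n) == G  # the conjugates of t generate all of S_3
```

By contrast, the 3-cycles in S_3 normally generate only the alternating group, which lands in option (i) via the sign homomorphism.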

Here is a “dual” to the above lemma:

Lemma 2 Let be a subgroup of a group . Then exactly one of the following holds:

- (i) ( too large) is the pullback of some subgroup of for some non-trivial normal subgroup of , where is the quotient map.
- (ii) ( is core-free) does not contain any non-trivial conjugacy class .

*Proof:* Let be the normal core of , that is to say the intersection of all the conjugates of . This is the largest normal subgroup of that is contained in . If is non-trivial, we can quotient it out and end up with option (i). If instead is trivial, then there is no non-trivial element that lies in the core, hence no non-trivial conjugacy class lies in and we are in option (ii). Finally, if (i) holds, then every conjugacy class of an element of is contained in and hence in , so (ii) cannot hold.

For subgroups of nilpotent groups, we have a nice dichotomy that detects properness of a subgroup through abelian representations:

Lemma 3 Let be a subgroup of a nilpotent group . Then exactly one of the following holds:

- (i) ( too small) There exists a non-trivial group homomorphism into an abelian group such that for all .
- (ii) .

Informally: if is a variable ranging in a subgroup of a nilpotent group , then either is unconstrained (in the sense that it really ranges in all of ), or it obeys some abelian constraint .

*Proof:* By definition of nilpotency, the lower central series

Since is a normal subgroup of , is also a subgroup of . Suppose first that is a proper subgroup of ; then the quotient map is a non-trivial homomorphism to an abelian group that annihilates , and we are in option (i). Thus we may assume that , and thus

Note that modulo the normal group , commutes with ; hence , and thus . We conclude that . One can continue this argument by induction to show that for every ; taking large enough, we end up in option (ii). Finally, it is clear that (i) and (ii) cannot both hold.
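The inductive argument can be illustrated in the mod-p Heisenberg group (upper unitriangular 3×3 matrices over Z/p), which is nilpotent of class two: two elements whose images generate the abelianization already generate the whole group, matching option (ii). A sketch with hypothetical names:

```python
def mat_mul(A, B, p):
    # product of 3x3 matrices with entries reduced mod p
    return tuple(tuple(sum(A[i][k] * B[k][j] for k in range(3)) % p
                       for j in range(3)) for i in range(3))

def closure(gens, p):
    # finite group generated by gens, by brute-force closure under products
    e = tuple(tuple(int(i == j) for j in range(3)) for i in range(3))
    H = {e}
    frontier = set(gens)
    while frontier:
        H |= frontier
        frontier = {mat_mul(a, b, p) for a in H for b in H} - H
    return H

p = 3
x = ((1, 1, 0), (0, 1, 0), (0, 0, 1))  # image spans one factor of the abelianization
y = ((1, 0, 0), (0, 1, 1), (0, 0, 1))  # image spans the other factor
H = closure([x, y], p)
assert len(H) == p ** 3  # x, y generate the whole Heisenberg group mod p
```

The commutator of x and y supplies the missing central direction, exactly as the lower central series step in the proof predicts.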

Remark 4 When the group is locally compact and is closed, one can take the homomorphism in Lemma 3 to be continuous, and by using Pontryagin duality one can also take the target group to be the unit circle . Thus is now a character of . Similar considerations hold for some of the later lemmas in this post. Discrete versions of the above lemma, in which the group is replaced by some orbit of a polynomial map on a nilmanifold, were obtained by Leibman and are important in the equidistribution theory of nilmanifolds; see this paper of Ben Green and myself for further discussion.

Here is an analogue of Lemma 3 for special linear groups, due to Serre (IV-23):

Lemma 5 Let be a prime, and let be a closed subgroup of , where is the ring of -adic integers. Then exactly one of the following holds:

- (i) ( too small) There exists a proper subgroup of such that for all .
- (ii) .

*Proof:* It is a standard fact that the reduction of mod is , hence (i) and (ii) cannot both hold.

Suppose that (i) fails; then for every there exists such that , which we write as

We now claim inductively that for any and , there exists with ; taking limits as , using the closed nature of , will then place us in option (ii).

The case is already handled, so now suppose . If , we see from the case that we can write , where and . Thus to establish the claim it suffices to do so under the additional hypothesis that .

First suppose that for some with . By the case, we can find of the form for some . Raising to the power and using and , we note that

giving the claim in this case.

Any matrix of trace zero with coefficients in is a linear combination of , , and is thus a sum of matrices that square to zero. Hence, if is of the form , then for some matrix of trace zero, and thus one can write (up to errors) as a finite product of matrices of the form with . By the previous arguments, such a matrix lies in up to errors, and hence does also. This completes the proof of the case.

Now suppose and the claim has already been proven for . Arguing as before, it suffices to close the induction under the additional hypothesis that , thus we may write . By induction hypothesis, we may find with . But then , and we are done.

We note a generalisation of Lemma 3 that involves two groups rather than just one:

Lemma 6 Let be a subgroup of a product of two nilpotent groups . Then exactly one of the following holds:

- (i) ( too small) There exist group homomorphisms , into an abelian group , with non-trivial, such that for all , where is the projection of to .
- (ii) for some subgroup of .

*Proof:* Consider the group . This is a subgroup of . If it is all of , then must be a Cartesian product and option (ii) holds. So suppose that this group is a proper subgroup of . Applying Lemma 3, we obtain a non-trivial group homomorphism into an abelian group such that whenever . For any in the projection of to , there is thus a unique quantity such that whenever . One easily checks that is a homomorphism, so we are in option (i).

Finally, it is clear that (i) and (ii) cannot both hold, since (i) places a non-trivial constraint on the second component of an element of for any fixed choice of .

We also note a similar variant of Lemma 5, which is Lemme 10 of this paper of Serre:

Lemma 7 Let be a prime, and let be a closed subgroup of . Then exactly one of the following holds:

- (i) ( too small) There exists a proper subgroup of such that for all .
- (ii) .

*Proof:* As in the proof of Lemma 5, (i) and (ii) cannot both hold. Suppose that (i) does not hold, then for any there exists such that . Similarly, there exists with . Taking commutators of and , we can find with . Continuing to take commutators with and extracting a limit (using compactness and the closed nature of ), we can find with . Thus, the closed subgroup of does not obey conclusion (i) of Lemma 5, and must therefore obey conclusion (ii); that is to say, contains . Similarly contains ; multiplying, we end up in conclusion (ii).

The most famous result of this type is of course the Goursat lemma, which we phrase here in a somewhat idiosyncratic manner to conform to the pattern of the other lemmas in this post:

Lemma 8 (Goursat lemma) Let be a subgroup of a product of two groups . Then one of the following holds:

- (i) ( too small) is contained in for some subgroups , of respectively, with either or (or both).
- (ii) ( too large) There exist normal subgroups of respectively, not both trivial, such that arises from a subgroup of , where is the quotient map.
- (iii) (Isomorphism) There is a group isomorphism such that is the graph of . In particular, and are isomorphic.

Here we almost have a trichotomy, because option (iii) is incompatible with both option (i) and option (ii). However, it is possible for options (i) and (ii) to simultaneously hold.

*Proof:* If either of the projections from to the two factor groups fails to be surjective, then we are in option (i). Thus we may assume that both projections are surjective.

Next, if either of these projections fails to be injective, then at least one of the kernels is non-trivial. We can then descend to the quotient and end up in option (ii).

The only remaining case is when the group homomorphisms are both bijections, hence are group isomorphisms. If we set we end up in case (iii).
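The Goursat trichotomy can be checked exhaustively on a small example. The following sketch (our own illustration, not from the text) brute-forces all subgroups of $({\bf Z}/3) \times ({\bf Z}/3)$ and verifies that every subgroup is either small (a non-surjective projection), the whole product, or the graph of an isomorphism:

```python
from itertools import product

# Brute-force illustration of the Goursat lemma for G1 = G2 = Z/3.
# Every subgroup H of G1 x G2 either has a non-surjective projection
# (option (i)), equals all of G1 x G2 (covered by option (ii)), or is
# the graph of an isomorphism Z/3 -> Z/3 (option (iii)).
p = 3
G = list(product(range(p), range(p)))

def is_subgroup(S):
    # a non-empty subset of a finite group closed under the operation
    # is automatically a subgroup
    return all(((a + c) % p, (b + d) % p) in S for (a, b) in S for (c, d) in S)

subgroups = []
for mask in range(1 << len(G)):
    S = {G[i] for i in range(len(G)) if mask >> i & 1}
    if (0, 0) in S and is_subgroup(S):
        subgroups.append(S)

graphs = 0
for H in subgroups:
    proj1 = {a for (a, b) in H}
    proj2 = {b for (a, b) in H}
    if len(proj1) < p or len(proj2) < p:
        continue  # option (i): a projection is not surjective
    if len(H) == p * p:
        continue  # option (ii): H is all of G1 x G2
    # remaining case: H must be the graph of a bijection Z/3 -> Z/3
    assert len(H) == p
    graphs += 1

print(len(subgroups), graphs)  # → 6 2
```

The six subgroups are the trivial group, the two factor copies of ${\bf Z}/3$, the two "diagonal" graphs, and the full product; only the two graphs realize option (iii).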

We can combine the Goursat lemma with Lemma 3 to obtain a variant:

Corollary 9 (Nilpotent Goursat lemma) Let be a subgroup of a product of two nilpotent groups . Then one of the following holds:

- (i) ( too small) There exists and a non-trivial group homomorphism such that for all .
- (ii) ( too large) There exist normal subgroups of respectively, not both trivial, such that arises from a subgroup of .
- (iii) (Isomorphism) There is a group isomorphism such that is the graph of . In particular, and are isomorphic.

*Proof:* If Lemma 8(i) holds, then by applying Lemma 3 we arrive at our current option (i). The other options are unchanged from Lemma 8, giving the claim.

Now we present a lemma involving three groups that is known in ergodic theory contexts as the “Furstenberg-Weiss argument”, as an argument of this type arose in this paper of Furstenberg and Weiss, though perhaps it also implicitly appears in other contexts. It has the remarkable feature of being able to enforce the abelian nature of one of the groups once the other options of the lemma are excluded.

Lemma 10 (Furstenberg-Weiss lemma) Let be a subgroup of a product of three groups . Then one of the following holds:

- (i) ( too small) There is some proper subgroup of and some such that whenever and .
- (ii) ( too large) There exists a non-trivial normal subgroup of with abelian, such that arises from a subgroup of , where is the quotient map.
- (iii) is abelian.

*Proof:* If the group is a proper subgroup of , then we are in option (i) (with ), so we may assume that

As before, we can combine this with previous lemmas to obtain a variant in the nilpotent case:

Lemma 11 (Nilpotent Furstenberg-Weiss lemma) Let be a subgroup of a product of three nilpotent groups . Then one of the following holds:

- (i) ( too small) There exists and group homomorphisms , for some abelian group , with non-trivial, such that whenever , where is the projection of to .
- (ii) ( too large) There exists a non-trivial normal subgroup of , such that arises from a subgroup of .
- (iii) is abelian.

Informally, this lemma asserts that if is a variable ranging in some subgroup , then either (i) there is a non-trivial abelian equation that constrains in terms of either or ; (ii) is not fully determined by and ; or (iii) is abelian.

*Proof:* Applying Lemma 10, we are already done if conclusions (ii) or (iii) of that lemma hold, so suppose instead that conclusion (i) holds for say . Then the group is not of the form , since it only contains those with . Applying Lemma 6, we obtain group homomorphisms , into an abelian group , with non-trivial, such that whenever , placing us in option (i).

The Furstenberg-Weiss argument is often used (though not precisely in this form) to establish that certain key structure groups arising in ergodic theory are abelian; see for instance Proposition 6.3(1) of this paper of Host and Kra for an example.

One can get more structural control on in the Furstenberg-Weiss lemma in option (iii) if one also broadens options (i) and (ii):

Lemma 12 (Variant of Furstenberg-Weiss lemma) Let be a subgroup of a product of three groups . Then one of the following holds:

- (i) ( too small) There is some proper subgroup of for some such that whenever . (In other words, the projection of to is not surjective.)
- (ii) ( too large) There exist normal subgroups of respectively, not all trivial, such that arises from a subgroup of , where is the quotient map.
- (iii) are abelian and isomorphic. Furthermore, there exist isomorphisms , , to an abelian group such that

The ability to encode an abelian additive relation in terms of group-theoretic properties is vaguely reminiscent of the group configuration theorem.

*Proof:* We apply Lemma 10. Option (i) of that lemma implies option (i) of the current lemma, and similarly for option (ii), so we may assume without loss of generality that is abelian. By permuting we may also assume that are abelian, and will use additive notation for these groups.

We may assume that the projections of to and are surjective, else we are in option (i). The group is then a normal subgroup of ; we may assume it is trivial, otherwise we can quotient it out and be in option (ii). Thus can be expressed as a graph for some map . As is a group, must be a homomorphism, and we can write it as for some homomorphisms , . Thus elements of obey the constraint .

If or fails to be injective, then we can quotient out by their kernels and end up in option (ii). If fails to be surjective, then the projection of to also fails to be surjective (since for , is now constrained to lie in the range of ) and we are in option (i). Similarly if fails to be surjective. Thus we may assume that the homomorphisms are bijective and thus group isomorphisms. Setting to the identity, we arrive at option (iii).

Combining this lemma with Lemma 3, we obtain a nilpotent version:

Corollary 13 (Variant of nilpotent Furstenberg-Weiss lemma) Let be a subgroup of a product of three nilpotent groups . Then one of the following holds:

- (i) ( too small) There are homomorphisms , to some abelian group for some , with not both trivial, such that whenever .
- (ii) ( too large) There exist normal subgroups of respectively, not all trivial, such that arises from a subgroup of , where is the quotient map.
- (iii) are abelian and isomorphic. Furthermore, there exist isomorphisms , , to an abelian group such that

Here is another variant of the Furstenberg-Weiss lemma, attributed to Serre by Ribet (see Lemma 3.3):

Lemma 14 (Serre’s lemma) Let be a subgroup of a finite product of groups with . Then one of the following holds:

- (i) ( too small) There is some proper subgroup of for some such that whenever .
- (ii) ( too large) One has .
- (iii) One of the has a non-trivial abelian quotient .

*Proof:* The claim is trivial for (and we don’t need (iii) in this case), so suppose that . We may assume that each is a perfect group, for otherwise we can quotient out by the commutator subgroup and arrive in option (iii). Similarly, we may assume that all the projections of to , are surjective, otherwise we are in option (i).

We now claim that for any and any , one can find with for and . For this follows from the surjectivity of the projection of to . Now suppose inductively that and the claim has already been proven for . Since is perfect, it suffices to establish this claim for of the form for some . By induction hypothesis, we can find with for and . By surjectivity of the projection of to , one can find with and . Taking commutators of these two elements, we obtain the claim.

Setting , we conclude that contains . Similarly for permutations. Multiplying these together we see that contains all of , and we are in option (ii).


Question 1 Does there exist a smooth function $f: {\bf R} \rightarrow {\bf R}$ which is not real analytic, but such that all the differences $x \mapsto f(x+h) - f(x)$ are real analytic for every $h \in {\bf R}$?

The hypothesis implies that the Newton quotients $\frac{f(x+h)-f(x)}{h}$ are real analytic for every $h \neq 0$. If analyticity were preserved by smooth limits, this would imply that $f'$ is real analytic, which would make $f$ real analytic. However, we are not assuming any uniformity in the analyticity of the Newton quotients, so this simple argument does not seem to resolve the question immediately.

In the case that is periodic, say periodic with period , one can answer the question in the negative by Fourier series. Perform a Fourier expansion . If is not real analytic, then there is a sequence going to infinity such that as . From the Borel-Cantelli lemma one can then find a real number such that (say) for infinitely many , hence for infinitely many . Thus the Fourier coefficients of do not decay exponentially and hence this function is not analytic, a contradiction.

I was not able to quickly resolve the non-periodic case, but I thought perhaps this might be a good problem to crowdsource, so I invite readers to contribute their thoughts on this problem here. In the spirit of the polymath projects, I would encourage comments that contain thoughts that fall short of a complete solution, in the event that some other reader may be able to take the thought further.


Lemma 1 (Van der Corput inequality) Let $v, u_1, \dots, u_n$ be unit vectors in a Hilbert space $H$. Then $\displaystyle \left( \sum_{i=1}^n |\langle v, u_i \rangle_H| \right)^2 \leq \sum_{1 \leq i, j \leq n} |\langle u_i, u_j \rangle_H|.$

*Proof:* The left-hand side may be written as $|\langle v, \sum_{i=1}^n \epsilon_i u_i \rangle_H|^2$ for some unit complex numbers $\epsilon_i$. By Cauchy-Schwarz we have $\displaystyle \Big|\Big\langle v, \sum_{i=1}^n \epsilon_i u_i \Big\rangle_H\Big|^2 \leq \Big\| \sum_{i=1}^n \epsilon_i u_i \Big\|_H^2 = \sum_{1 \leq i, j \leq n} \epsilon_i \overline{\epsilon_j} \langle u_i, u_j \rangle_H \leq \sum_{1 \leq i, j \leq n} |\langle u_i, u_j \rangle_H|.$

As a corollary, correlation becomes transitive in a statistical sense (even though it is not transitive in an absolute sense):

Corollary 2 (Statistical transitivity of correlation) Let be unit vectors in a Hilbert space such that for all and some . Then we have for at least of the pairs .

*Proof:* From the lemma, we have

One drawback with this corollary is that it does not tell us *which* pairs correlate. In particular, if the vector also correlates with a separate collection of unit vectors, the pairs for which correlate may have no intersection whatsoever with the pairs in which correlate (except of course on the diagonal where they must correlate).
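As a quick numerical sanity check of Lemma 1 (in the standard form $(\sum_i |\langle v, u_i\rangle|)^2 \le \sum_{i,j} |\langle u_i, u_j\rangle|$), one can test it on random unit vectors in a real Hilbert space:

```python
import random, math

# Numerical sanity check of the van der Corput inequality
#   (sum_i |<v, u_i>|)^2  <=  sum_{i,j} |<u_i, u_j>|
# for random unit vectors in R^d (a real Hilbert space).
random.seed(0)

def unit_vector(d):
    x = [random.gauss(0, 1) for _ in range(d)]
    norm = math.sqrt(sum(t * t for t in x))
    return [t / norm for t in x]

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

d, n = 5, 8
v = unit_vector(d)
us = [unit_vector(d) for _ in range(n)]
lhs = sum(abs(dot(v, u)) for u in us) ** 2
rhs = sum(abs(dot(ui, uj)) for ui in us for uj in us)
print(lhs <= rhs + 1e-9)  # → True
```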

While working on an ongoing research project, I recently found that there is a very simple way to get around the latter problem by exploiting the tensor power trick:

Corollary 3 (Simultaneous statistical transitivity of correlation) Let be unit vectors in a Hilbert space for and such that for all , and some . Then there are at least pairs such that . In particular (by Cauchy-Schwarz) we have for all .

*Proof:* Apply Corollary 2 to the unit vectors and , in the tensor power Hilbert space .

It is surprisingly difficult to obtain even a qualitative version of the above conclusion (namely, if correlates with all of the , then there are many pairs for which correlates with for all simultaneously) without some version of the tensor power trick. For instance, even the powerful Szemerédi regularity lemma, when applied to the set of pairs for which one has correlation of , for a single , does not seem to be sufficient. However, there is a reformulation of the argument using the Schur product theorem as a substitute for (or really, a disguised version of) the tensor power trick. For simplicity of notation let us just work with real Hilbert spaces to illustrate the argument. We start with the identity

where is the orthogonal projection to the complement of . This implies a Gram matrix inequality for each where denotes the claim that is positive semi-definite. By the Schur product theorem, we conclude that and hence for a suitable choice of signs . One now argues as in the proof of Corollary 2.

A separate application of tensor powers to amplify correlations was also noted in this previous blog post giving a cheap version of the Kabatjanskii-Levenstein bound, but this seems not to be directly related to the current application.
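The Schur product theorem step can be tested numerically; the following sketch (illustrative, not from the post) builds two Gram matrices, which are automatically positive semi-definite, and checks that their entrywise product still has a non-negative quadratic form:

```python
import random

# The Schur product theorem asserts that the entrywise (Hadamard) product
# of two positive semi-definite matrices is again positive semi-definite.
# We check this on Gram matrices A, B (always PSD) by sampling the
# quadratic form x^T (A o B) x at many random vectors x.
random.seed(1)
n, d = 5, 3

def gram(vectors):
    return [[sum(u[k] * v[k] for k in range(d)) for v in vectors] for u in vectors]

U = [[random.gauss(0, 1) for _ in range(d)] for _ in range(n)]
V = [[random.gauss(0, 1) for _ in range(d)] for _ in range(n)]
A, B = gram(U), gram(V)
C = [[A[i][j] * B[i][j] for j in range(n)] for i in range(n)]  # Hadamard product

ok = True
for _ in range(1000):
    x = [random.gauss(0, 1) for _ in range(n)]
    q = sum(C[i][j] * x[i] * x[j] for i in range(n) for j in range(n))
    ok = ok and q >= -1e-9
print(ok)  # → True
```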


Proposition 1 (Classical Möbius inversion) Let $f, g: {\bf N} \rightarrow A$ be functions from the natural numbers to an additive group $A$. Then the following two claims are equivalent:

- (i) $f(n) = \sum_{d|n} g(d)$ for all $n \in {\bf N}$.
- (ii) $g(n) = \sum_{d|n} \mu(n/d) f(d)$ for all $n \in {\bf N}$, where $\mu$ is the Möbius function.
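Classical Möbius inversion is easy to verify by direct computation; the sketch below (with an arbitrary test function of our own choosing, and the standard inversion formula $g(n) = \sum_{d|n} \mu(n/d) f(d)$) implements both directions:

```python
# Direct check of classical Moebius inversion: starting from an arbitrary
# g, define f(n) = sum_{d|n} g(d), then recover g via
# g(n) = sum_{d|n} mu(n/d) f(d).

def mobius(n):
    # mu(n) = (-1)^k if n is a product of k distinct primes, 0 otherwise
    result, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0  # n has a square factor
            result = -result
        p += 1
    if n > 1:
        result = -result  # one remaining prime factor
    return result

def divisors(n):
    return [d for d in range(1, n + 1) if n % d == 0]

g = {n: n * n - 3 * n + 7 for n in range(1, 50)}                 # arbitrary data
f = {n: sum(g[d] for d in divisors(n)) for n in g}               # claim (i)
g_back = {n: sum(mobius(n // d) * f[d] for d in divisors(n)) for n in g}  # claim (ii)
print(g_back == g)  # → True
```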

There is a generalisation of this formula to (finite) posets, due to Hall, in which one sums over chains in the poset:

Proposition 2 (Poset Möbius inversion) Let be a finite poset, and let be functions from that poset to an additive group . Then the following two claims are equivalent: (Note from the finite nature of that the inner sum in (ii) is vacuous for all but finitely many .)

- (i) for all , where is understood to range in .
- (ii) for all , where in the inner sum are understood to range in with the indicated ordering.

Comparing Proposition 2 with Proposition 1, it is natural to refer to the function as the Möbius function of the poset; the condition (ii) can then be written as

In fact it is not completely necessary that the poset be finite; an inspection of the proof shows that it suffices that every element of the poset has only finitely many predecessors .
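The equivalence between the recursive definition of the poset Möbius function and Hall's signed count of chains can be verified on a small example; the sketch below (our own test case) uses the divisor poset of 12:

```python
# Moebius function of a finite poset, computed two ways: by the standard
# recursion mu(x,x) = 1, mu(x,y) = -sum_{x <= z < y} mu(x,z), and by
# Hall's chain formula
#   mu(x,y) = sum_k (-1)^k #{strict chains x = x_0 < ... < x_k = y}.
# Test poset: the divisors of 12 ordered by divisibility.

elements = [1, 2, 3, 4, 6, 12]
leq = lambda a, b: b % a == 0  # partial order: a divides b

def mu(x, y, cache={}):
    if x == y:
        return 1
    if (x, y) not in cache:
        cache[(x, y)] = -sum(mu(x, z) for z in elements
                             if leq(x, z) and leq(z, y) and z != y)
    return cache[(x, y)]

def signed_chain_count(x, y):
    # sum over strict chains from x to y, weighted by (-1)^(length)
    if x == y:
        return 1  # the length-0 chain
    total = 0
    def extend(last, sign):
        nonlocal total
        for z in elements:
            if z != last and leq(last, z) and leq(z, y):
                if z == y:
                    total += sign
                else:
                    extend(z, -sign)
    extend(x, -1)
    return total

ok = all(mu(x, y) == signed_chain_count(x, y)
         for x in elements for y in elements if leq(x, y))
print(ok)  # → True
```

For instance, $\mu(2, 12) = 1$ here: the chain $2 < 12$ contributes $-1$, while the two chains $2 < 4 < 12$ and $2 < 6 < 12$ each contribute $+1$.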

It is not difficult to see that Proposition 2 includes Proposition 1 as a special case, after verifying the combinatorial fact that the quantity

is equal to when divides , and vanishes otherwise.

I recently discovered that Proposition 2 can also lead to a useful variant of the inclusion-exclusion principle. The classical version of this principle can be phrased in terms of indicator functions: if are subsets of some set , then

In particular, if there is a finite measure on for which are all measurable, we have

One drawback of this formula is that there are exponentially many terms on the right-hand side: of them, in fact. However, in many cases of interest there are “collisions” between the intersections (for instance, perhaps many of the pairwise intersections agree), in which case there is an opportunity to collect terms and hopefully achieve some cancellation. It turns out that it is possible to use Proposition 2 to do this, in which one only needs to sum over chains in the resulting poset of intersections:

Proposition 3 (Hall-type inclusion-exclusion principle) Let be subsets of some set , and let be the finite poset formed by intersections of some of the (with the convention that is the empty intersection), ordered by set inclusion. Then for any , one has where are understood to range in . In particular (setting to be the empty intersection) if the are all proper subsets of then we have In particular, if there is a finite measure on for which are all measurable, we have

Using the Möbius function on the poset , one can write these formulae as

and
*Proof:* It suffices to establish (2) (to derive (3) from (2) observe that all the are contained in one of the , so the effect of may be absorbed into ). Applying Proposition 2, this is equivalent to the assertion that

Example 4 If with , and are all distinct, then we have for any finite measure on that makes measurable that due to the four chains , , , of length one, and the three chains , , of length two. Note that this expansion just has six terms in it, as opposed to the given by the usual inclusion-exclusion formula, though of course one can reduce the number of terms by combining the factors. This may not seem particularly impressive, especially if one views the term as really being three terms instead of one, but if we add a fourth set with for all , the formula now becomes and we begin to see more cancellation as we now have just seven terms (or ten if we count as four terms) instead of terms.
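The chain-based formula can be tested numerically. The sketch below uses illustrative sets of our own choosing (with all pairwise and triple intersections coinciding, to force collisions), the counting measure, and the measure form $m(X) - m(E_1 \cup \dots \cup E_n) = \sum_{A} \mu(A, X)\, m(A)$ over the intersection poset:

```python
from itertools import combinations

# Check of the Hall-type inclusion-exclusion principle: summing the
# poset Moebius function mu(A, X) against |A| over the poset P of
# intersections recovers |X| - |E_1 ∪ ... ∪ E_n|.
X = frozenset(range(12))
E = [frozenset({0, 1, 2, 3}), frozenset({0, 1, 4, 5}), frozenset({0, 1, 6, 7})]

# build the poset of intersections (X is the empty intersection)
P = {X}
for r in range(1, len(E) + 1):
    for combo in combinations(E, r):
        P.add(frozenset.intersection(*combo))
P = list(P)

def mu(a, b, cache={}):
    # Moebius function of (P, subset inclusion)
    if a == b:
        return 1
    if (a, b) not in cache:
        cache[(a, b)] = -sum(mu(a, c) for c in P if a <= c and c < b)
    return cache[(a, b)]

lhs = len(X) - len(frozenset.union(*E))
rhs = sum(mu(A, X) * len(A) for A in P)
print(lhs, rhs)  # both sides agree
```

Here the poset has only five elements ($X$, the three $E_i$, and the common intersection $\{0,1\}$), so the sum has five terms rather than the $2^3$ of classical inclusion-exclusion.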

Example 5 (Variant of Legendre sieve) If are natural numbers, and is some sequence of complex numbers with only finitely many terms non-zero, then by applying the above proposition to the sets and with equal to counting measure weighted by the , we obtain a variant of the Legendre sieve where range over the set formed by taking least common multiples of the (with the understanding that the empty least common multiple is ), and denotes the assertion that divides but is strictly less than . I am curious to know whether this version of the Legendre sieve already appears in the literature (and similarly for the other applications of Proposition 2 given here).

If the poset has bounded depth then the number of terms in Proposition 3 can end up being just polynomially large in rather than exponentially large. Indeed, if all chains in have length at most then the number of terms here is at most . (The examples (4), (5) are ones in which the depth is equal to two.) I hope to report in a later post on how this version of inclusion-exclusion with polynomially many terms can be useful in an application.

Actually in our application we need an abstraction of the above formula, in which the indicator functions are replaced by more abstract idempotents:

Proposition 6 (Hall-type inclusion-exclusion principle for idempotents) Let be pairwise commuting elements of some ring with identity, which are all idempotent (thus for ). Let be the finite poset formed by products of the (with the convention that is the empty product), ordered by declaring when (note that all the elements of are idempotent so this is a partial ordering). Then for any , one has where are understood to range in . In particular (setting ) if all the are not equal to then we have

Morally speaking this proposition is equivalent to the previous one after applying a “spectral theorem” to simultaneously diagonalise all of the , but it is quicker to just adapt the previous proof to establish this proposition directly. Using the Möbius function for , we can rewrite these formulae as

and
*Proof:* Again it suffices to verify (6). Using Proposition 2 as before, it suffices to show that


The Department of Mathematics at the University of California, Los Angeles, is inviting applications for the position of an Academic Administrator who will serve as the Director of the UCLA Endowed Olga Radko Math Circle (ORMC). The Academic Administrator will have the broad responsibility for administration of the ORMC, an outreach program with weekly activities for mathematically inclined students in grades K-12. Currently, over 300 children take part in the program each weekend. Instruction is delivered by a team of over 50 docents, the majority of whom are UCLA undergraduate and graduate students.

The Academic Administrator is required to teach three mathematics courses in the undergraduate curriculum per academic year as assigned by the Department. This is also intended to help with the recruitment of UCLA students as docents and instructors for the ORMC.

As the director of ORMC, the Academic Administrator will have primary responsibility for all aspects of ORMC operations:

- Determining the structure of ORMC, including the number and levels of groups
- Recruiting, training and supervising instructors, docents, and postdoctoral fellows associated with the ORMC
- Developing curricular materials and providing leadership in development of innovative ways of explaining mathematical ideas to school children
- Working with the Mathematics Department finance office to ensure timely payment of stipends and wages to ORMC instructors and docents, as appropriate
- Maintaining ORMC budget and budgetary projections, ensuring that the funds are used appropriately and efficiently for ORMC activities, and applying for grants as appropriate to fund the operations of ORMC
- Working with the Steering Committee and UCLA Development to raise funds for ORMC, both from families whose children participate in ORMC and other sources
- Admitting students to ORMC, ensuring appropriate placement, and working to maintain a collegial and inclusive atmosphere conducive to learning for all ORMC attendees
- Reporting to and working with the ORMC Steering Committee throughout the year

A competitive candidate should have leadership potential and experience with developing mathematical teaching materials for the use of gifted school children, as well as experience with teaching undergraduate mathematics courses. Candidates must have a Ph.D. degree (or equivalent) or expect to complete their Ph.D. by June 30, 2021.

Applications should be received by March 15, 2021. Further details on the position and the application process can be found at the application page.

One of the great classical triumphs of complex analysis was in providing the first complete proof (by Hadamard and de la Vallée Poussin in 1896) of arguably the most important theorem in analytic number theory, the prime number theorem:

Theorem 1 (Prime number theorem) Let $\pi(x)$ denote the number of primes less than a given real number $x$. Then $\lim_{x \rightarrow \infty} \frac{\pi(x)}{x / \log x} = 1$ (or in asymptotic notation, $\pi(x) = (1 + o(1)) \frac{x}{\log x}$ as $x \rightarrow \infty$).

(Actually, it turns out to be slightly more natural to replace the approximation $\frac{x}{\log x}$ in the prime number theorem by the logarithmic integral $\int_2^x \frac{dt}{\log t}$, which happens to be a more precise approximation, but we will not stress this point here.)

The complex-analytic proof of this theorem hinges on the study of a key meromorphic function related to the prime numbers, the Riemann zeta function $\zeta$. Initially, it is only defined on the half-plane $\{ s \in {\bf C}: \mathrm{Re}(s) > 1 \}$:

Definition 2 (Riemann zeta function, preliminary definition) Let $s \in {\bf C}$ be such that $\mathrm{Re}(s) > 1$. Then we define $\displaystyle \zeta(s) := \sum_{n=1}^\infty \frac{1}{n^s}. \ \ \ \ \ (1)$

Note that the series is locally uniformly convergent in the half-plane $\{ s: \mathrm{Re}(s) > 1 \}$, so in particular $\zeta$ is holomorphic on this region. In previous notes we have already evaluated some special values of this function:

However, it turns out that the zeroes (and pole) of this function are of far greater importance to analytic number theory, particularly with regards to the study of the prime numbers.

The Riemann zeta function has several remarkable properties, some of which we summarise here:

Theorem 3 (Basic properties of the Riemann zeta function)

- (i) (Euler product formula) For any $s$ with $\mathrm{Re}(s) > 1$, we have $\displaystyle \zeta(s) = \prod_p \left( 1 - \frac{1}{p^s} \right)^{-1}, \ \ \ \ \ (3)$ where the product is absolutely convergent (and locally uniform in $s$) and is over the prime numbers $p = 2, 3, 5, \dots$.
- (ii) (Trivial zero-free region) $\zeta$ has no zeroes in the region $\{ s: \mathrm{Re}(s) > 1 \}$.
- (iii) (Meromorphic continuation) has a unique meromorphic continuation to the complex plane (which by abuse of notation we also call ), with a simple pole at and no other poles. Furthermore, the Riemann xi function is an entire function of order (after removing all singularities). The function is an entire function of order one after removing the singularity at .
- (iv) (Functional equation) After applying the meromorphic continuation from (iii), we have for all (excluding poles). Equivalently, we have for all . (The equivalence between the (5) and (6) is a routine consequence of the Euler reflection formula and the Legendre duplication formula, see Exercises 26 and 31 of Notes 1.)

*Proof:* We just prove (i) and (ii) for now, leaving (iii) and (iv) for later sections.

The claim (i) is an encoding of the fundamental theorem of arithmetic, which asserts that every natural number is uniquely representable as a product over primes, where the are natural numbers, all but finitely many of which are zero. Writing this representation as , we see that

whenever , , and consists of all the natural numbers of the form for some . Sending and to infinity, we conclude from monotone convergence and the geometric series formula that whenever is real, and then from dominated convergence we see that the same formula holds for complex with as well. Local uniform convergence then follows from the product form of the Weierstrass $M$-test (Exercise 19 of Notes 1).

The claim (ii) is immediate from (i), since the Euler product is absolutely convergent and all terms are non-zero.
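The Euler product can be illustrated numerically at $s = 2$, where both the truncated series and the truncated product approach $\zeta(2) = \pi^2/6$ (a sketch, not part of the original notes):

```python
import math

# Numerical illustration of the Euler product: for Re(s) > 1,
#   zeta(s) = prod_p (1 - p^{-s})^{-1},
# the product ranging over primes. We compare truncations of the
# Dirichlet series and of the Euler product at s = 2.

def primes_up_to(n):
    sieve = [True] * (n + 1)
    sieve[0:2] = [False, False]
    for p in range(2, int(n ** 0.5) + 1):
        if sieve[p]:
            sieve[p * p::p] = [False] * len(sieve[p * p::p])
    return [p for p in range(n + 1) if sieve[p]]

s = 2.0
N = 100000
zeta_sum = sum(n ** -s for n in range(1, N + 1))
euler_prod = 1.0
for p in primes_up_to(N):
    euler_prod /= (1 - p ** -s)

target = math.pi ** 2 / 6  # zeta(2)
print(abs(zeta_sum - target) < 1e-4, abs(euler_prod - target) < 1e-4)  # → True True
```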

We remark that by sending to in Theorem 3(i) we conclude that

and from the divergence of the harmonic series we then conclude Euler’s theorem $\sum_p \frac{1}{p} = \infty$. This can be viewed as a weak version of the prime number theorem, and already illustrates the potential applicability of the Riemann zeta function to control the distribution of the prime numbers.

The meromorphic continuation (iii) of the zeta function is initially surprising, but can be interpreted either as a manifestation of the extremely regular spacing of the natural numbers occurring in the sum (1), or as a consequence of various integral representations of (or slight modifications thereof). We will focus in this set of notes on a particular representation of as essentially the Mellin transform of the theta function that briefly appeared in previous notes, and the functional equation (iv) can then be viewed as a consequence of the modularity of that theta function. This in turn was established using the Poisson summation formula, so one can view the functional equation as ultimately being a manifestation of Poisson summation. (For a direct proof of the functional equation via Poisson summation, see these notes.)

Henceforth we work with the meromorphic continuation of . The functional equation (iv), when combined with special values of such as (2), gives some additional values of outside of its initial domain , most famously

If one
From Theorem 3 and the non-vanishing nature of , we see that $\zeta$ has simple zeroes (known as *trivial zeroes*) at the negative even integers $-2, -4, -6, \dots$, and all other zeroes (the *non-trivial zeroes*) inside the *critical strip* $\{ s: 0 \leq \mathrm{Re}(s) \leq 1 \}$. (The non-trivial zeroes are conjectured to all be simple, but this is hopelessly far from being proven at present.) As we shall see shortly, these latter zeroes turn out to be closely related to the distribution of the primes. The functional equation tells us that if $\rho$ is a non-trivial zero then so is $1 - \rho$; also, we have the identity

Conjecture 4 (Riemann hypothesis) All the non-trivial zeroes of $\zeta$ lie on the critical line $\{ s: \mathrm{Re}(s) = \frac{1}{2} \}$.

This conjecture would have many implications in analytic number theory, particularly with regard to the distribution of the primes. Of course, it is far from proven at present, but the partial results we have towards this conjecture are still sufficient to establish results such as the prime number theorem.

Return now to the original region $\{ s: \mathrm{Re}(s) > 1 \}$. To take more advantage of the Euler product formula (3), we take complex logarithms to conclude that

for suitable branches of the complex logarithm, and then on taking derivatives (using for instance the generalised Cauchy integral formula and Fubini’s theorem to justify the interchange of summation and derivative) we see that From the geometric series formula we have and so (by another application of Fubini’s theorem) we have the identity for , where the von Mangoldt function $\Lambda(n)$ is defined to equal $\log p$ whenever $n = p^j$ is a power of a prime $p$ for some $j \geq 1$, and $0$ otherwise. The contribution of the higher prime powers is negligible in practice, and as a first approximation one can think of the von Mangoldt function as the indicator function of the primes, weighted by the logarithm function.

The series and that show up in the above formulae are examples of Dirichlet series, which are a convenient device to transform various sequences of arithmetic interest into holomorphic or meromorphic functions. Here are some more examples:

Exercise 5 (Standard Dirichlet series) Let $s$ be a complex number with $\mathrm{Re}(s) > 1$.

- (i) Show that $\zeta'(s) = -\sum_{n=1}^\infty \frac{\log n}{n^s}$.
- (ii) Show that $\zeta(s)^2 = \sum_{n=1}^\infty \frac{d(n)}{n^s}$, where $d(n)$ is the divisor function of $n$ (the number of divisors of $n$).
- (iii) Show that $\frac{1}{\zeta(s)} = \sum_{n=1}^\infty \frac{\mu(n)}{n^s}$, where $\mu(n)$ is the Möbius function, defined to equal $(-1)^k$ when $n$ is the product of $k$ distinct primes for some $k \geq 0$, and $0$ otherwise.
- (iv) Show that $\frac{\zeta(2s)}{\zeta(s)} = \sum_{n=1}^\infty \frac{\lambda(n)}{n^s}$, where $\lambda(n)$ is the Liouville function, defined to equal $(-1)^k$ when $n$ is the product of $k$ (not necessarily distinct) primes for some $k \geq 0$.
- (v) Show that $\log \zeta(s) = \sum_{n=1}^\infty \frac{\Lambda(n)}{\log n} \frac{1}{n^s}$, where $\log \zeta$ is the holomorphic branch of the logarithm that is real for $s > 1$, and with the convention that $\frac{\Lambda(n)}{\log n}$ vanishes for $n = 1$.
- (vi) Use the fundamental theorem of arithmetic to show that the von Mangoldt function is the unique function $\Lambda: {\bf N} \rightarrow {\bf R}$ such that $\sum_{d|n} \Lambda(d) = \log n$ for every positive integer $n$. Use this and (i) to provide an alternate proof of the identity (8). Thus we see that (8) is really just another encoding of the fundamental theorem of arithmetic.

Given the appearance of the von Mangoldt function , it is natural to reformulate the prime number theorem in terms of this function:

Theorem 6 (Prime number theorem, von Mangoldt form) One has $\lim_{x \rightarrow \infty} \frac{1}{x} \sum_{n \leq x} \Lambda(n) = 1$ (or in asymptotic notation, $\sum_{n \leq x} \Lambda(n) = x + o(x)$ as $x \rightarrow \infty$).

Let us see how Theorem 6 implies Theorem 1. Firstly, for any , we can write

The sum is non-zero for only values of , and is of size , thus Since , we conclude from Theorem 6 that as . Next, observe from the fundamental theorem of calculus that Multiplying by and summing over all primes , we conclude that From Theorem 6 we certainly have , thus By splitting the integral into the ranges and we see that the right-hand side is , and Theorem 1 follows.
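Both forms of the prime number theorem can be sanity-checked by direct computation at modest heights (a rough illustration only, since the convergence is slow):

```python
import math

# The von Mangoldt form of the prime number theorem says that
# psi(x) = sum_{n <= x} Lambda(n) is asymptotically x, while the
# classical form says pi(x) log(x) / x tends to 1. We tabulate both
# ratios at x = 200000; only rough agreement is expected at this height.
N = 200000
sieve = [True] * (N + 1)
sieve[0:2] = [False, False]
for p in range(2, int(N ** 0.5) + 1):
    if sieve[p]:
        sieve[p * p::p] = [False] * len(sieve[p * p::p])
primes = [p for p in range(N + 1) if sieve[p]]

psi = 0.0
for p in primes:
    pk = p
    while pk <= N:            # Lambda(p^k) = log p for each prime power
        psi += math.log(p)
        pk *= p

pi_ratio = len(primes) * math.log(N) / N
print(round(psi / N, 3), round(pi_ratio, 3))
```

As the split-integral argument above suggests, the von Mangoldt ratio $\psi(x)/x$ is already very close to $1$, while $\pi(x) \log x / x$ converges much more slowly (it is still near $1.1$ here, reflecting the superiority of the logarithmic integral approximation).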

Exercise 7Show that Theorem 1 conversely implies Theorem 6.

The alternate form (8) of the Euler product identity connects the primes (represented here via proxy by the von Mangoldt function) with the logarithmic derivative of the zeta function, and can be used as a starting point for describing further relationships between and the primes. Most famously, we shall see later in these notes that it leads to the remarkably precise Riemann-von Mangoldt explicit formula:

Theorem 8 (Riemann-von Mangoldt explicit formula) For any non-integer $x > 1$, we have $\displaystyle \sum_{n \leq x} \Lambda(n) = x - \lim_{T \rightarrow \infty} \sum_{|\mathrm{Im}(\rho)| \leq T} \frac{x^\rho}{\rho} - \log(2\pi) - \frac{1}{2} \log(1 - x^{-2}),$ where $\rho$ ranges over the non-trivial zeroes of $\zeta$ with imaginary part in $[-T, T]$. Furthermore, the convergence of the limit is locally uniform in $x$.

Actually, it turns out that this formula is in some sense *too* precise; in applications it is often more convenient to work with smoothed variants of this formula in which the sum on the left-hand side is smoothed out, but the contribution of zeroes with large imaginary part is damped; see Exercise 22. Nevertheless, this formula clearly illustrates how the non-trivial zeroes of the zeta function influence the primes. Indeed, if one formally differentiates the above formula in , one is led to the (quite nonrigorous) approximation

Comparing Theorem 8 with Theorem 6, it is natural to suspect that the key step in the proof of the latter is to establish the following slight but important extension of Theorem 3(ii), which can be viewed as a very small step towards the Riemann hypothesis:

Theorem 9 (Slight enlargement of zero-free region) There are no zeroes of $\zeta(s)$ on the line $\mathrm{Re}(s) = 1$.

It is not quite immediate to see how Theorem 6 follows from Theorem 8 and Theorem 9, but we will demonstrate it below the fold.

Although Theorem 9 only seems like a slight improvement of Theorem 3(ii), proving it is surprisingly non-trivial. The basic idea is the following: if there were a zero at $1 + it$ for some non-zero real $t$, then there would also be a different zero at $1 - it$ (note that $t$ cannot vanish, due to the pole at $s = 1$), and then the approximation (9) becomes

But the expression can be negative for large regions of the variable , whereas is always non-negative. This conflict eventually leads to a contradiction, but it is not immediately obvious how to make this argument rigorous. We will present here the classical approach to doing so using a trigonometric identity of Mertens.

In fact, Theorem 9 is basically equivalent to the prime number theorem:

Exercise 10 For the purposes of this exercise, assume Theorem 6, but do not assume Theorem 9. For any non-zero real , show that as , where denotes a quantity that goes to zero as after being multiplied by . Use this to derive Theorem 9.

This equivalence can help explain why the prime number theorem is remarkably non-trivial to prove, and why the Riemann zeta function has to be either explicitly or implicitly involved in the proof.

This post is only intended as the briefest of introduction to complex-analytic methods in analytic number theory; also, we have not chosen the shortest route to the prime number theorem, electing instead to travel in directions that particularly showcase the complex-analytic results introduced in this course. For some further discussion see this previous set of lecture notes, particularly Notes 2 and Supplement 3 (with much of the material in this post drawn from the latter).

** — 1. Meromorphic continuation and functional equation — **

We now focus on understanding the meromorphic continuation of , as well as the functional equation satisfied by that continuation. The arguments here date back to Riemann’s original paper on the zeta function. The general strategy is to relate the zeta function for to some sort of integral involving the parameter , which is manipulated in such a way that the integral makes sense for values of outside of the halfplane , and can thus be used to define the zeta function meromorphically in such a region. Often the Gamma function is involved in the relationship between the zeta function and integral. There are many such ways to connect to an integral; we present some of the more classical ones here.

One way to motivate the meromorphic continuation is to look at the continuous analogue

$\displaystyle \int_1^\infty \frac{dt}{t^s} = \frac{1}{s-1}$

of (1). This clearly extends meromorphically to the whole complex plane. So one now just has to understand the analytic continuation properties of the residual $\zeta(s) - \frac{1}{s-1}$. For instance, using the Riemann sum type quadrature $\frac{1}{s-1} = \sum_{n=1}^\infty \int_n^{n+1} \frac{dt}{t^s}$, one can write this residual as $\sum_{n=1}^\infty \int_n^{n+1} \left( \frac{1}{n^s} - \frac{1}{t^s} \right)\, dt$; since $\frac{1}{n^s} - \frac{1}{t^s} = O\left( \frac{|s|}{n^{\mathrm{Re}(s)+1}} \right)$ for $n \leq t \leq n+1$, it is a routine application of the Fubini and Morera theorems to establish analytic continuation of the residual to the half-plane $\{ s: \mathrm{Re}(s) > 0 \}$, thus giving a meromorphic extension of $\zeta$ to this region. Among other things, this shows that (the meromorphic continuation of) $\zeta$ has a simple pole at $s = 1$ with residue $1$.

Exercise 11 Using the trapezoid rule, show that for any in the region with , there exists a unique complex number for which one has the asymptotic for any natural number , where . Use this to extend the Riemann zeta function meromorphically to the region . Conclude in particular that and .

Exercise 12 Obtain the refinement to the trapezoid rule when are integers and is three times continuously differentiable. Then show that for any in the region with , there exists a unique complex number for which one has the asymptotic for any natural number , where . Use this to extend the Riemann zeta function meromorphically to the region . Conclude in particular that .
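To illustrate these quadrature-based continuations concretely, here is a short sketch of ours (the function name is hypothetical, and the correction terms are the first ones of Euler-Maclaurin type) that evaluates the continued zeta function and recovers the classical values ζ(0) = -1/2 and ζ(-1) = -1/12; at these two points the correction terms happen to terminate, so the formula is exact.

```python
def zeta_em(s: complex, N: int = 50) -> complex:
    """Continuation of zeta via quadrature corrections (Euler-Maclaurin
    through the first Bernoulli correction); usable for Re s > -2, s != 1."""
    partial = sum(n ** (-s) for n in range(1, N))
    tail = N ** (1 - s) / (s - 1)        # continuous analogue of the tail
    endpoint = N ** (-s) / 2             # trapezoid endpoint correction
    bernoulli = s * N ** (-s - 1) / 12   # B_2/2! correction
    return partial + tail + endpoint + bernoulli

print(zeta_em(0))    # -0.5 exactly: the corrections terminate here
print(zeta_em(-1))   # -1/12 = -0.0833...
```

For s in the original half-plane of convergence the same formula simply accelerates the Dirichlet series, so one can cross-check it against, say, ζ(2) = π²/6.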

One can keep going in this fashion using the Euler-Maclaurin formula (see this previous blog post) to extend the range of meromorphic continuation to the rest of the complex plane. However, we will now proceed in a different fashion, using the theta function

that made an appearance in previous notes, and try to transform this function into the zeta function. We will only need this function for imaginary values of the argument in the upper half-plane (so ); from Exercise 7 of Notes 2 we have the modularity relation . In particular, since decays exponentially to as , blows up like as . We will attempt to apply the Mellin transform (Exercise 11 from Notes 2) to this function; formally, we have

There is however a problem: as goes to infinity, converges to one, and the integral here is unlikely to be convergent. So we will instead compute the Mellin transform of . The function decays exponentially as , and blows up like as , so this integral will be absolutely integrable when . Since , we can write . By the Fubini–Tonelli theorem, the integrand here is absolutely integrable, and hence . From the Bernoulli definition of the Gamma function (Exercise 29(ii) of Notes 1) and a change of variables we have , and hence by (1) we obtain the identity whenever . Replacing by , we can rearrange this as a formula for the function (4), namely whenever . Now we exploit the modular identity (12) to improve the convergence of this formula. The convergence of is much better near than near , so we use (13) to split
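The modularity relation used here can also be checked numerically. Assuming the normalisation θ(t) = Σ_{n∈ℤ} exp(-π n² t) for t > 0 (our assumption for this sketch), the identity reads θ(1/t) = √t · θ(t):

```python
import math

def theta(t: float, N: int = 50) -> float:
    """Jacobi theta function at a purely imaginary argument:
    theta(t) = sum_{n in Z} exp(-pi n^2 t), for t > 0."""
    return 1.0 + 2.0 * sum(math.exp(-math.pi * n * n * t) for n in range(1, N + 1))

# Modularity: theta(1/t) = sqrt(t) * theta(t)
t = 2.0
print(theta(1 / t), math.sqrt(t) * theta(t))
```

The two printed values agree to machine precision, reflecting the exact identity rather than an asymptotic.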

and then transform the first integral using the change of variables to obtain . Using (12) we can write this as . Direct computation shows that , and thus whenever . However, the integrand here is holomorphic in and exponentially decaying in , so from the Fubini and Morera theorems we easily see that the right-hand side is an entire function of ; also from inspection we see that it is symmetric with respect to the symmetry . Thus we can define as an entire function, and hence as a meromorphic function, and one verifies the functional equation (6). It remains to establish that is of order . From (11) we have , so from the triangle inequality

From the Stirling approximation (Exercise 30(v) from Notes 1) we conclude that for (say), and hence is of order at most as required. (One can show that has order exactly one by inspecting what happens to as , using that in this regime.) This completes the proof of Theorem 3.
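As a quick sanity check of the functional equation, one can evaluate the completed zeta function at a symmetric pair of real points. Assuming the completed function is Λ(s) = π^{-s/2} Γ(s/2) ζ(s) with Λ(s) = Λ(1-s) (our notation for this sketch), the pair s = 2 and 1 - s = -1, together with the classical values ζ(2) = π²/6 and ζ(-1) = -1/12, gives Λ(2) = Λ(-1) = π/6:

```python
import math

# Completed zeta Lambda(s) = pi^{-s/2} Gamma(s/2) zeta(s); the functional
# equation asserts Lambda(s) = Lambda(1 - s).  We check the pair s = 2 and
# 1 - s = -1, feeding in the classical zeta values by hand.
def completed(s: float, zeta_s: float) -> float:
    return math.pi ** (-s / 2) * math.gamma(s / 2) * zeta_s

lhs = completed(2.0, math.pi ** 2 / 6)   # Lambda(2)
rhs = completed(-1.0, -1.0 / 12)         # Lambda(-1), using Gamma(-1/2) = -2 sqrt(pi)
print(lhs, rhs)                          # both equal pi/6 = 0.5235...
```

Note that `math.gamma` handles the negative non-integer argument -1/2 directly, so no reflection formula is needed here.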

Exercise 13 (Alternate derivation of meromorphic continuation and functional equation)

- (i) Establish the identity whenever .
- (ii) Establish the identity whenever , is not an integer, , where is the branch of the logarithm with real part in , and is the contour consisting of the line segment , the semicircle , and the line segment .
- (iii) Use (ii) to meromorphically continue to the entire complex plane .
- (iv) By shifting the contour to the contour for a large natural number and applying the residue theorem, show that again using the branch of the logarithm to define .
- (v) Establish the functional equation (5).

Exercise 14 Use the formula from Exercise 12, together with the functional equation, to give yet another proof of the identity .

Exercise 15 (Relation between zeta function and Bernoulli numbers)

- (i) For any complex number with , use the Poisson summation formula (Proposition 3(v) from Notes 2) to establish the identity
- (ii) For as above and sufficiently small, show that . Conclude that for any natural number , where the Bernoulli numbers are defined through the Taylor expansion . Thus for instance , , and so forth.
- (iii) Show that for any odd natural number . (This identity can also be deduced from the Euler-Maclaurin formula, which generalises the approach in Exercise 12; see this previous post.)
- (iv) Use (14) and the residue theorem (now working inside the contour , rather than outside) to give an alternate proof of (15).
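The relation between zeta values and Bernoulli numbers can be tested directly. The sketch below (our own; the helper names are hypothetical) computes the Bernoulli numbers exactly from the standard recurrence and then evaluates the classical formula ζ(2n) = (-1)^{n+1} B_{2n} (2π)^{2n} / (2·(2n)!), recovering ζ(2) = π²/6 and ζ(4) = π⁴/90:

```python
import math
from fractions import Fraction

def bernoulli(m: int):
    """Bernoulli numbers B_0..B_m (with the B_1 = -1/2 convention),
    computed exactly from the recurrence sum_{j=0}^{k} C(k+1, j) B_j = 0."""
    B = [Fraction(1)]
    for k in range(1, m + 1):
        B.append(-sum(Fraction(math.comb(k + 1, j)) * B[j] for j in range(k)) / (k + 1))
    return B

def zeta_even(n: int) -> float:
    """zeta(2n) via the classical Bernoulli-number formula."""
    B = bernoulli(2 * n)
    return ((-1) ** (n + 1) * float(B[2 * n]) * (2 * math.pi) ** (2 * n)
            / (2 * math.factorial(2 * n)))

print(bernoulli(4))                     # B_0..B_4 = 1, -1/2, 1/6, 0, -1/30
print(zeta_even(1), math.pi ** 2 / 6)   # both pi^2 / 6
```

Using `Fraction` keeps the Bernoulli recurrence exact, which matters because the B_{2n} alternate in sign and suffer heavy cancellation in floating point.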

Exercise 16 (Convexity bounds)

- (i) Establish the bounds for any and with .
- (ii) Establish the bounds for any and with . (Hint: use the functional equation.)
- (iii) Establish the bounds for any and with . (Hint: use the Phragmén-Lindelöf principle, Exercise 19 from Notes 2, after dealing somehow with the pole at .)

It is possible to improve the bound in (iii) in the region ; such improvements are known as subconvexity estimates. For instance, it is currently known that for any and , a result of Bourgain; the Lindelöf hypothesis asserts that this bound in fact holds for all , although this remains unproven (it is however a consequence of the Riemann hypothesis).

Exercise 17 (Riemann-von Mangoldt formula) Show that for any , the number of zeroes of in the rectangle is equal to . (Hint: apply the argument principle to evaluated at a rectangle for some that is chosen so that the horizontal edges of the rectangle do not come too close to any of the zeroes (cf. the selection of the radii in the proof of the Hadamard factorisation theorem in Notes 1), and use the functional equation and Stirling’s formula to control the asymptotics for the horizontal edges.) We remark that the error term , due to von Mangoldt in 1905, has not been significantly improved despite over a century of effort. Even assuming the Riemann hypothesis, the error has only been reduced very slightly to (a result of Littlewood from 1924).
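Assuming the main term in the Riemann-von Mangoldt formula takes its usual shape (T/2π) log(T/2π) - T/2π + 7/8 with an O(log T) error, one can compare it against the first few zeros; the ordinates below are standard published values of the first ten non-trivial zeros, hardcoded for this sketch.

```python
import math

# Ordinates of the first ten non-trivial zeros (standard published values).
zeros = [14.134725, 21.022040, 25.010858, 30.424876, 32.935062,
         37.586178, 40.918719, 43.327073, 48.005151, 49.773832]

def main_term(T: float) -> float:
    """Riemann-von Mangoldt main term (T/2pi) log(T/2pi) - T/2pi + 7/8."""
    x = T / (2 * math.pi)
    return x * math.log(x) - x + 7.0 / 8.0

T = 50.0
count = sum(1 for g in zeros if g <= T)
print(count, main_term(T))  # 10 versus roughly 9.4; discrepancy is O(log T)
```

The count and the main term already differ by less than one at this modest height, comfortably within the O(log T) error.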

Remark 18Thanks to the functional equation and Rouche’s theorem, it is possible to numerically verify the Riemann hypothesis in any finite portion of the critical strip, so long as the zeroes in that strip are all simple. Indeed, if there was a zero off of the critical line , then an application of the argument principle (and Rouche’s theorem) in some small contour around but avoiding the critical line would be capable of numerically determining that there was a zero off of the line. Similarly, for each simple zero on the critical line, applying the argument principle for some small contour around that zero and symmetric around the critical line would numerically verify that there was exactly one zero within that contour, which by the functional equation would then have to lie exactly on that line. (In practice, more efficient methods are used to numerically verify the Riemann hypothesis over large finite portions of the strip, but we will not detail them here.)

** — 2. The explicit formula — **

We now prove the Riemann-von Mangoldt explicit formula. Since is a non-trivial entire function of order , with zeroes at the non-trivial zeroes of (the trivial zeroes having been cancelled out by the Gamma function), we see from the Hadamard factorisation theorem (in the form of Exercise 35 from Notes 1) that

away from the zeroes of , where ranges over the non-trivial zeroes of (note from Exercise 11 that there is no zero at the origin), and is some constant. From (4) we can calculate , while from Exercise 27 of Notes 1 we have , and thus (after some rearranging) , where . One can compute the values of explicitly:

Exercise 19 By inspecting both sides of (16) as , show that , and hence .

Jensen’s formula tells us that the number of non-trivial zeroes of in a disk is at most for any and . One can obtain a local version:

Exercise 20 (Local bound on zeroes)

- (i) Establish the upper bound whenever and with . (Hint: use (10). More precise bounds are available with more effort, but will not be needed here.)
- (ii) Establish the bounds uniformly in . (Hint: use the Euler product.)
- (iii) Show that for any , the number of non-trivial zeroes with imaginary part in is . (Hint: use Jensen’s formula and the functional equation.)
- (iv) For , , and , with not a zero of , show that (Hint: use Exercise 9 of Notes 1.)

Meanwhile, from Perron’s formula (Exercise 12 of Notes 2) and (8) we see that for any non-integer , we have

We can compute individual terms here and then conclude the Riemann-von Mangoldt explicit formula:

Exercise 21 (Riemann-von Mangoldt explicit formula) Let and . Establish the following bounds:

- (i) .
- (ii) .
- (iii) For any positive integer , we have
- (iv) For any non-trivial zero , we have
- (v) We have .
- (vi) We have .

(Hint: for (i)-(iii), shift the contour to for an that gets sent to infinity, and use the residue theorem. The same argument works for (iv) except when is really close to , in which case a detour to the contour may be called for. For (vi), use Exercise 20 and partition the zeroes depending on what unit interval falls into.)

- (vii) Using the above estimates, conclude Theorem 8.

The explicit formula in Theorem 8 is completely exact, but turns out to be a little bit inconvenient for applications because it involves all the zeroes , and the series involving them converges very slowly (indeed the convergence is not even absolute). In practice it is preferable to work with a smoothed version of the formula. Here is one such smoothing:

Exercise 22 (Smoothed explicit formula)

- (i) Let be a smooth function compactly supported on . Show that is entire and obeys the bound (say) for some , all , and all .
- (ii) With as in (i), establish the identity with the summations being absolutely convergent by applying the Fourier inversion formula to , shifting the contour to frequencies for some , applying (8), and then shifting the contour again (using Exercise 20 and (i) to justify the contour shifting).
- (iii) Show that whenever is a smooth function, compactly supported in , with the summation being absolutely convergent.
- (iv) Explain why (iii) is formally consistent with Theorem 8 when applied to the non-smooth function .

** — 3. Extending the zero free region, and the prime number theorem — **

We now show how Theorem 9 implies Theorem 6. Let be parameters to be chosen later. We will apply Exercise 22 to a function which equals one on , is supported on , and obeys the derivative estimates

for all and , and for all and . Such a function can be constructed by gluing together various rescaled versions of (antiderivatives of) standard bump functions. For such a function, we have . On the other hand, we have and , and hence . We split into the two cases and , where is a parameter to be chosen later. For , there are only zeroes, and all of them have real part strictly less than by Theorem 9. Hence there exists such that for all such zeroes. For each such zero, we have from the triangle inequality , and so the total contribution of these zeroes to (17) is . For each zero with , we integrate by parts twice to get some decay in : , and from the triangle inequality and the fact that , we conclude . Since is convergent (this follows from Exercise 20), we conclude (for large enough depending on ) that the total contribution here is . Thus, after choosing suitably, we obtain the bound , and thus whenever is sufficiently large depending on (since depends only on , which depends only on ). A similar argument (replacing by in the construction of ) gives the matching lower bound whenever is sufficiently large depending on . Sending , we obtain Theorem 6.
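Assuming Theorem 6 takes the usual form ψ(x) = (1 + o(1)) x for the Chebyshev function ψ(x) = Σ_{p^k ≤ x} log p (the precise statement did not survive above, so this is our reading), its content can be sanity-checked numerically with a small sieve; the function name here is ours:

```python
import math

def chebyshev_psi(x: int) -> float:
    """Chebyshev function psi(x) = sum over prime powers p^k <= x of log p."""
    # Sieve of Eratosthenes up to x.
    is_prime = [True] * (x + 1)
    is_prime[0:2] = [False, False]
    for p in range(2, math.isqrt(x) + 1):
        if is_prime[p]:
            for q in range(p * p, x + 1, p):
                is_prime[q] = False
    total = 0.0
    for p in range(2, x + 1):
        if is_prime[p]:
            pk = p
            while pk <= x:       # each prime power p^k contributes log p
                total += math.log(p)
                pk *= p
    return total

x = 10000
print(chebyshev_psi(x) / x)  # close to 1, consistent with the prime number theorem
```

Already at x = 10^4 the ratio is within a fraction of a percent of 1, although of course no finite computation distinguishes the prime number theorem from a failure of it at larger scales.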

Exercise 23 Assuming the Riemann hypothesis, show that for any and , and that for any and . Conversely, show that either of these two estimates is equivalent to the Riemann hypothesis. (Hint: find a holomorphic continuation of to the region in a manner similar to how was first holomorphically continued to the region .)

It remains to prove Theorem 9. The claim is clear for thanks to the simple pole of at , so we may assume . Suppose for contradiction that there was a zero of at , thus

for sufficiently close to . Taking logarithms, we see in particular that . Using Lemma 5(v), we conclude that . Note that the summands here are oscillatory due to the cosine term. To manage the oscillation, we use the simple pole at , which gives for sufficiently close to one, and on taking logarithms as before we get . These two estimates come close to being contradictory, but not quite (because we could have close to for most numbers that are weighted by ). To get the contradiction, we use the analytic continuation of to to conclude that , and hence . Now we take advantage of the following inequality:

Exercise 24 Establish the inequality for any and .
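Assuming the inequality in question is the classical bound 3 + 4 cos t + cos 2t ≥ 0 traditionally used at this point of the argument (the display itself is not shown above), it follows from the identity 3 + 4 cos t + cos 2t = 2(1 + cos t)², which one can also confirm numerically:

```python
import math

# The classical inequality behind the non-vanishing of zeta on the line Re s = 1:
#   3 + 4 cos(t) + cos(2t) = 2 (1 + cos(t))^2 >= 0.
def mertens_expr(t: float) -> float:
    return 3.0 + 4.0 * math.cos(t) + math.cos(2.0 * t)

# Check the identity and the non-negativity on a fine grid over one period.
for k in range(1000):
    t = 2 * math.pi * k / 1000
    assert abs(mertens_expr(t) - 2.0 * (1.0 + math.cos(t)) ** 2) < 1e-12
    assert mertens_expr(t) >= -1e-12
print("ok")
```

The identity itself is immediate from the double angle formula cos 2t = 2 cos² t - 1, so the grid check is purely a floating-point sanity test.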

Remark 25 There are a number of ways to improve Theorem 9 that move a little closer in the direction of the Riemann hypothesis. Firstly, there are a number of zero-free regions for the Riemann zeta function known that give lower bounds for (and in particular preclude the existence of zeroes) a small amount inside the critical strip, and can be used to improve the error term in the prime number theorem; for instance, the classical zero-free region shows that there are no zeroes in the region for some sufficiently small absolute constant , and lets one improve the error term in Theorem 6 to (with a corresponding improvement in Theorem 1, provided that one replaces with the logarithmic integral ). A further improvement in the zero-free region and in the prime number theorem error term was subsequently given by Vinogradov. We also mention a number of important zero density estimates which provide non-trivial upper bounds for the number of zeroes in other, somewhat larger regions of the critical strip; these bounds are not strong enough to completely exclude zeroes, as is the case with zero-free regions, but can at least limit the collective influence of such zeroes. For more discussion of these topics, see the various lecture notes of this previous course.
