
This is another sequel to a recent post in which I showed the Riemann zeta function ${\zeta}$ can be locally approximated by a polynomial, in the sense that for randomly chosen ${t \in [T,2T]}$ one has an approximation

$\displaystyle \zeta(\frac{1}{2} + it - \frac{2\pi i z}{\log T}) \approx P_t( e^{2\pi i z/N} ) \ \ \ \ \ (1)$

where ${N}$ grows slowly with ${T}$, and ${P_t}$ is a polynomial of degree ${N}$. It turns out that in the function field setting there is an exact version of this approximation which captures many of the known features of the Riemann zeta function, namely Dirichlet ${L}$-functions for a random character of given modulus over a function field. This model was (essentially) studied in a fairly recent paper by Andrade, Miller, Pratt, and Trinh; I am not sure if there is any further literature on this model beyond this paper (though the number field analogue of low-lying zeroes of Dirichlet ${L}$-functions is certainly well studied). In this model it is possible to set ${N}$ fixed and let ${T}$ go to infinity, thus providing a simple finite-dimensional model problem for problems involving the statistics of zeroes of the zeta function.

In this post I would like to record this analogue precisely. We will need a finite field ${{\mathbb F}}$ of some order ${q}$ and a natural number ${N}$, and set

$\displaystyle T := q^{N+1}.$

We will primarily think of ${q}$ as being large and ${N}$ as being either fixed or growing very slowly with ${q}$, though it is possible to also consider other asymptotic regimes (such as holding ${q}$ fixed and letting ${N}$ go to infinity). Let ${{\mathbb F}[X]}$ be the ring of polynomials of one variable ${X}$ with coefficients in ${{\mathbb F}}$, and let ${{\mathbb F}[X]'}$ be the multiplicative semigroup of monic polynomials in ${{\mathbb F}[X]}$; one should view ${{\mathbb F}[X]}$ and ${{\mathbb F}[X]'}$ as the function field analogue of the integers and natural numbers respectively. We use the valuation ${|n| := q^{\mathrm{deg}(n)}}$ for polynomials ${n \in {\mathbb F}[X]}$ (with ${|0|=0}$); this is the analogue of the usual absolute value on the integers. We select an irreducible polynomial ${Q \in {\mathbb F}[X]}$ of size ${|Q|=T}$ (i.e., ${Q}$ has degree ${N+1}$). The multiplicative group ${({\mathbb F}[X]/Q{\mathbb F}[X])^\times}$ can be shown to be cyclic of order ${|Q|-1=T-1}$. A Dirichlet character of modulus ${Q}$ is a completely multiplicative function ${\chi: {\mathbb F}[X] \rightarrow {\bf C}}$ of modulus ${Q}$, that is periodic of period ${Q}$ and vanishes on those ${n \in {\mathbb F}[X]}$ not coprime to ${Q}$. From Fourier analysis we see that there are exactly ${\phi(Q) := |Q|-1}$ Dirichlet characters of modulus ${Q}$. A Dirichlet character is said to be odd if it is not identically one on the group ${{\mathbb F}^\times}$ of non-zero constants; there are only ${\frac{1}{q-1} \phi(Q)}$ non-odd characters (including the principal character), so in the limit ${q \rightarrow \infty}$ most Dirichlet characters are odd. We will work primarily with odd characters in order to be able to ignore the effect of the place at infinity.

Let ${\chi}$ be an odd Dirichlet character of modulus ${Q}$. The Dirichlet ${L}$-function ${L(s, \chi)}$ is then defined (for ${s \in {\bf C}}$ of sufficiently large real part, at least) as

$\displaystyle L(s,\chi) := \sum_{n \in {\mathbb F}[X]'} \frac{\chi(n)}{|n|^s}$

$\displaystyle = \sum_{m=0}^\infty q^{-sm} \sum_{n \in {\mathbb F}[X]': |n| = q^m} \chi(n).$

Note that for ${m \geq N+1}$, the set ${n \in {\mathbb F}[X]': |n| = q^m}$ is invariant under shifts ${h}$ whenever ${|h| < T}$; since this covers a full set of residue classes of ${{\mathbb F}[X]/Q{\mathbb F}[X]}$, and the odd character ${\chi}$ has mean zero on this set of residue classes, we conclude that the sum ${\sum_{n \in {\mathbb F}[X]': |n| = q^m} \chi(n)}$ vanishes for ${m \geq N+1}$. In particular, the ${L}$-function is entire, and for any real number ${t}$ and complex number ${z}$, we can write the ${L}$-function as a polynomial

$\displaystyle L(\frac{1}{2} + it - \frac{2\pi i z}{\log T},\chi) = P(Z) = P_{t,\chi}(Z) := \sum_{m=0}^N c^1_m(t,\chi) Z^m$

where ${Z := e(z/N) = e^{2\pi i z/N}}$ and the coefficients ${c^1_m = c^1_m(t,\chi)}$ are given by the formula

$\displaystyle c^1_m(t,\chi) := q^{-m/2-imt} \sum_{n \in {\mathbb F}[X]': |n| = q^m} \chi(n).$

Note that ${t}$ can easily be normalised to zero by the relation

$\displaystyle P_{t,\chi}(Z) = P_{0,\chi}( q^{-it} Z ). \ \ \ \ \ (2)$

In particular, the dependence on ${t}$ is periodic with period ${\frac{2\pi}{\log q}}$ (so by abuse of notation one could also take ${t}$ to be an element of ${{\bf R}/\frac{2\pi}{\log q}{\bf Z}}$).

Fourier inversion yields a functional equation for the polynomial ${P}$:

Proposition 1 (Functional equation) Let ${\chi}$ be an odd Dirichlet character of modulus ${Q}$, and ${t \in {\bf R}}$. There exists a phase ${e(\theta)}$ (depending on ${t,\chi}$) such that

$\displaystyle c_{N-m}^1 = e(\theta) \overline{c^1_m}$

for all ${0 \leq m \leq N}$, or equivalently that

$\displaystyle P(1/Z) = e(\theta) Z^{-N} \overline{P}(Z)$

where ${\overline{P}(Z) := \overline{P(\overline{Z})}}$.

Proof: We can normalise ${t=0}$, and abbreviate ${a_m := c^1_m(0,\chi)}$. Let ${G}$ be the finite field ${{\mathbb F}[X] / Q {\mathbb F}[X]}$. We can write

$\displaystyle a_{N-m} = q^{-(N-m)/2} \sum_{n \in X^{N-m} + H_{N-m}} \chi(n)$

where ${H_j}$ denotes the subgroup of ${G}$ consisting of (residue classes of) polynomials of degree less than ${j}$. Let ${e_G: G \rightarrow S^1}$ be a non-trivial additive character of ${G}$ whose kernel contains the space ${H_N}$ (this is easily achieved by pulling back a non-trivial character from the quotient ${G/H_N \cong {\mathbb F}}$). We can use the Fourier inversion formula to write

$\displaystyle a_{N-m} = q^{(m-N)/2} \sum_{\xi \in G} \hat \chi(\xi) \sum_{n \in X^{N-m} + H_{N-m}} e_G( n\xi )$

where

$\displaystyle \hat \chi(\xi) := q^{-N-1} \sum_{n \in G} \chi(n) e_G(-n\xi).$

From change of variables we see that ${\hat \chi}$ is a scalar multiple of ${\overline{\chi}}$; from Plancherel we conclude that

$\displaystyle \hat \chi = e(\theta_0) q^{-(N+1)/2} \overline{\chi} \ \ \ \ \ (3)$

for some phase ${e(\theta_0)}$. We conclude that

$\displaystyle a_{N-m} = e(\theta_0) q^{-(2N-m+1)/2} \sum_{\xi \in G} \overline{\chi}(\xi) e_G( X^{N-m} \xi) \sum_{n \in H_{N-m}} e_G( n\xi ). \ \ \ \ \ (4)$

The inner sum ${\sum_{n \in H_{N-m}} e_G( n\xi )}$ equals ${q^{N-m}}$ if ${\xi \in H_{m+1}}$, and vanishes otherwise, thus

$\displaystyle a_{N-m} = e(\theta_0) q^{-(m+1)/2} \sum_{\xi \in H_{m+1}} \overline{\chi}(\xi) e_G( X^{N-m} \xi).$

For ${\xi}$ in ${H_m}$, ${e_G(X^{N-m} \xi)=1}$ and the contribution of the sum vanishes as ${\chi}$ is odd. Thus we may restrict ${\xi}$ to ${H_{m+1} \backslash H_m}$, so that

$\displaystyle a_{N-m} = e(\theta_0) q^{-(m+1)/2} \sum_{h \in {\mathbb F}^\times} e_G( X^{N} h) \sum_{\xi \in h X^m + H_{m}} \overline{\chi}(\xi).$

By the multiplicativity of ${\chi}$, this factorises as

$\displaystyle a_{N-m} = e(\theta_0) q^{-(m+1)/2} (\sum_{h \in {\mathbb F}^\times} \overline{\chi}(h) e_G( X^{N} h)) (\sum_{\xi \in X^m + H_{m}} \overline{\chi}(\xi)).$

From the one-dimensional version of (3) (and the fact that ${\chi}$ is odd) we have

$\displaystyle \sum_{h \in {\mathbb F}^\times} \overline{\chi}(h) e_G( X^{N} h) = e(\theta_1) q^{1/2}$

for some phase ${e(\theta_1)}$. The claim follows, with ${e(\theta) := e(\theta_0+\theta_1)}$. $\Box$

As one corollary of the functional equation, ${c^1_N}$ is a phase rotation of ${\overline{c^1_0} = 1}$ and thus is non-zero, so ${P}$ has degree exactly ${N}$. The functional equation is then equivalent to the ${N}$ zeroes of ${P}$ being symmetric across the unit circle. In fact we have the stronger

Theorem 2 (Riemann hypothesis for Dirichlet ${L}$-functions over function fields) Let ${\chi}$ be an odd Dirichlet character of modulus ${Q}$, and ${t \in {\bf R}}$. Then all the zeroes of ${P}$ lie on the unit circle.

We derive this result from the Riemann hypothesis for curves over finite fields below the fold.
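As a sanity check on Proposition 1 and Theorem 2, one can compute ${P_{0,\chi}}$ by brute force in a tiny case. The following Python sketch is purely illustrative: the choices ${q=3}$, ${N=2}$ and ${Q = X^3+2X+1}$, as well as all helper names (`mulmod`, `dlog_table`, `coefficients`, `quadratic_roots`), are assumptions made for this example. It builds each character from a discrete logarithm table with respect to a generator of the cyclic group ${({\mathbb F}[X]/Q{\mathbb F}[X])^\times}$, and uses the quadratic formula for the zeroes since ${N=2}$.

```python
import cmath
from itertools import product

p = 3                      # field size q (assumed prime, so F is just Z/pZ)
Q = (1, 2, 0, 1)           # hypothetical modulus Q = 1 + 2X + X^3, low degree first
d = len(Q) - 1             # deg Q = N + 1
N = d - 1
ORDER = p ** d - 1         # |Q| - 1 = T - 1

def mulmod(a, b):
    """Multiply two residues mod Q (length-d coefficient tuples, low degree first)."""
    prod = [0] * (2 * d - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            prod[i + j] = (prod[i + j] + ai * bj) % p
    for k in range(2 * d - 2, d - 1, -1):      # eliminate X^k for k >= d (Q is monic)
        c = prod[k]
        for j in range(d + 1):
            prod[k - d + j] = (prod[k - d + j] - c * Q[j]) % p
    return tuple(prod[:d])

def dlog_table():
    """Return {g^j: j} for the first generator g found by exhaustive search."""
    one = (1,) + (0,) * (d - 1)
    for g in product(range(p), repeat=d):
        table, x = {}, one
        for j in range(ORDER):
            if x in table:
                break
            table[x] = j
            x = mulmod(x, g)
        if len(table) == ORDER:
            return table
    raise ValueError("no generator found; is Q irreducible?")

def coefficients(k, table):
    """c^1_m(0, chi_k) for m = 0..N, where chi_k(g^j) = exp(2 pi i k j / ORDER)."""
    coeffs = []
    for m in range(N + 1):
        total = 0
        for low in product(range(p), repeat=m):
            n = low + (1,) + (0,) * (d - m - 1)      # a monic polynomial of degree m
            total += cmath.exp(2j * cmath.pi * k * table[n] / ORDER)
        coeffs.append(p ** (-m / 2) * total)
    return coeffs

def quadratic_roots(c):
    """Roots of c[0] + c[1] Z + c[2] Z^2 (enough here because N = 2)."""
    disc = cmath.sqrt(c[1] ** 2 - 4 * c[2] * c[0])
    return [(-c[1] + disc) / (2 * c[2]), (-c[1] - disc) / (2 * c[2])]

table = dlog_table()
odd = [k for k in range(1, ORDER) if k % (p - 1) != 0]   # chi_k non-trivial on constants
for k in odd[:3]:
    c = coefficients(k, table)
    print(k, [round(abs(x), 6) for x in c], [round(abs(z), 6) for z in quadratic_roots(c)])
```

If the statements above are right, each printed list of coefficient moduli should be symmetric under ${m \mapsto N-m}$ with first entry ${1}$, and both zeroes should have modulus ${1}$ up to rounding.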

In view of this theorem (and the fact that ${c^1_0=1}$), we may write

$\displaystyle P(Z) = \mathrm{det}(1 - ZU)$

for some unitary ${N \times N}$ matrix ${U = U_{t,\chi}}$. It is possible to interpret ${U}$ as the action of the geometric Frobenius map on a certain cohomology group, but we will not do so here. The situation here is simpler than in the number field case because the factor ${\exp(A)}$ arising from very small primes is now absent (in the function field setting there are no primes of size between ${1}$ and ${q}$).

We now let ${\chi}$ vary uniformly at random over all odd characters of modulus ${Q}$, and ${t}$ uniformly over ${{\bf R}/\frac{2\pi}{\log q}{\bf Z}}$, independently of ${\chi}$; we also make the distribution of the random variable ${U}$ conjugation invariant in ${U(N)}$. We use ${{\mathbf E}_Q}$ to denote the expectation with respect to this randomness. One can then ask what the limiting distribution of ${U}$ is in various regimes; we will focus in this post on the regime where ${N}$ is fixed and ${q}$ is being sent to infinity. In the spirit of the Sato-Tate conjecture, one should expect ${U}$ to converge in distribution to the circular unitary ensemble (CUE), that is to say Haar probability measure on ${U(N)}$. This may well be provable from Deligne’s “Weil II” machinery (in the spirit of this monograph of Katz and Sarnak), though I do not know how feasible this is or whether it has already been done in the literature; here we shall avoid using this machinery and study what partial results towards this CUE hypothesis one can make without it.

If one lets ${\lambda_1,\dots,\lambda_N}$ be the eigenvalues of ${U}$ (ordered arbitrarily), then we now have

$\displaystyle \sum_{m=0}^N c^1_m Z^m = P(Z) = \prod_{j=1}^N (1 - \lambda_j Z)$

and hence the ${c^1_m}$ are essentially elementary symmetric polynomials of the eigenvalues:

$\displaystyle c^1_m = (-1)^m e_m( \lambda_1,\dots,\lambda_N). \ \ \ \ \ (5)$

One can take log derivatives to conclude

$\displaystyle \frac{P'(Z)}{P(Z)} = -\sum_{j=1}^N \frac{\lambda_j}{1-\lambda_j Z}.$

On the other hand, as in the number field case one has the Dirichlet series expansion

$\displaystyle Z \frac{P'(Z)}{P(Z)} = \sum_{n \in {\mathbb F}[X]'} \frac{\Lambda_q(n) \chi(n)}{|n|^s}$

where ${s = \frac{1}{2} + it - \frac{2\pi i z}{\log T}}$ has sufficiently large real part, ${Z = e(z/N)}$, and the von Mangoldt function ${\Lambda_q(n)}$ is defined as ${\log_q |p| = \mathrm{deg} p}$ when ${n}$ is the power of an irreducible ${p}$ and ${0}$ otherwise. We conclude the “explicit formula”

$\displaystyle c^{\Lambda_q}_m = -\sum_{j=1}^N \lambda_j^m = -\mathrm{tr}(U^m) \ \ \ \ \ (6)$

for ${m \geq 1}$, where

$\displaystyle c^{\Lambda_q}_m := q^{-m/2-imt} \sum_{n \in {\mathbb F}[X]': |n| = q^m} \Lambda_q(n) \chi(n).$

Similarly on inverting ${P(Z)}$ we have

$\displaystyle P(Z)^{-1} = \prod_{j=1}^N (1 - \lambda_j Z)^{-1}.$

Since we also have

$\displaystyle P(Z)^{-1} = \sum_{n \in {\mathbb F}[X]'} \frac{\mu(n) \chi(n)}{|n|^s}$

for ${s}$ of sufficiently large real part, where the Möbius function ${\mu(n)}$ is equal to ${(-1)^k}$ when ${n}$ is the product of ${k}$ distinct irreducibles, and ${0}$ otherwise, we conclude that the Möbius coefficients

$\displaystyle c^\mu_m := q^{-m/2-imt} \sum_{n \in {\mathbb F}[X]': |n| = q^m} \mu(n) \chi(n)$

are just the complete homogeneous symmetric polynomials of the eigenvalues:

$\displaystyle c^\mu_m = h_m( \lambda_1,\dots,\lambda_N). \ \ \ \ \ (7)$

One can then derive various algebraic relationships between the coefficients ${c^1_m, c^{\Lambda_q}_m, c^\mu_m}$ from various identities involving symmetric polynomials, but we will not do so here.
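For readers who want to experiment with such relationships, here is a small numerical sketch in Python. It is only a sanity check under stated assumptions: the eigenvalues are random unit phases rather than coming from an actual character, and the helper names (`char_poly_coeffs`, `inverse_series`, `max_identity_error`) are hypothetical. It compares the coefficients of ${\prod_j (1-\lambda_j Z)}$ and of its reciprocal against brute-force elementary and complete homogeneous symmetric polynomials, and also tests Newton's identity ${m c_m + \sum_{i=1}^m \mathrm{tr}(U^i) c_{m-i} = 0}$ for the coefficients ${c_m}$ of ${\prod_j (1-\lambda_j Z)}$.

```python
import cmath
import random
from itertools import combinations, combinations_with_replacement
from math import prod

def char_poly_coeffs(lam):
    """Coefficients c_0..c_N of prod_j (1 - lam_j Z), lowest degree first."""
    c = [1]
    for x in lam:
        c = [a - x * b for a, b in zip(c + [0], [0] + c)]
    return c

def inverse_series(c, terms):
    """First `terms` coefficients of 1 / (c_0 + c_1 Z + ...), assuming c_0 = 1."""
    h = [1]
    for m in range(1, terms):
        h.append(-sum(c[i] * h[m - i] for i in range(1, min(m, len(c) - 1) + 1)))
    return h

def max_identity_error(lam):
    """Largest discrepancy found among the three families of identities."""
    N = len(lam)
    c = char_poly_coeffs(lam)
    h = inverse_series(c, N + 3)
    power = lambda m: sum(x ** m for x in lam)          # tr(U^m)
    errs = []
    for m in range(N + 1):                              # elementary symmetric polynomials
        e_m = sum(prod(s) for s in combinations(lam, m))
        errs.append(abs(c[m] - (-1) ** m * e_m))
    for m in range(N + 3):                              # complete homogeneous polynomials
        h_m = sum(prod(s) for s in combinations_with_replacement(lam, m))
        errs.append(abs(h[m] - h_m))
    for m in range(1, N + 1):                           # Newton's identities
        errs.append(abs(m * c[m] + sum(power(i) * c[m - i] for i in range(1, m + 1))))
    return max(errs)

random.seed(0)
lam = [cmath.exp(2j * cmath.pi * random.random()) for _ in range(4)]
print(max_identity_error(lam))   # should be at the level of floating point rounding
```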

What do we know about the distribution of ${U}$? By construction, it is conjugation-invariant; from (2) it is also invariant with respect to the rotations ${U \rightarrow e^{i\theta} U}$ for any phase ${\theta \in{\bf R}}$. We also have the function field analogue of the Rudnick-Sarnak asymptotics:

Proposition 3 (Rudnick-Sarnak asymptotics) Let ${a_1,\dots,a_k,b_1,\dots,b_k}$ be nonnegative integers. If

$\displaystyle \sum_{j=1}^k j a_j \leq N, \ \ \ \ \ (8)$

then the moment

$\displaystyle {\bf E}_{Q} \prod_{j=1}^k (\mathrm{tr} U^j)^{a_j} (\overline{\mathrm{tr} U^j})^{b_j} \ \ \ \ \ (9)$

is equal to ${o(1)}$ in the limit ${q \rightarrow \infty}$ (holding ${N,a_1,\dots,a_k,b_1,\dots,b_k}$ fixed) unless ${a_j=b_j}$ for all ${j}$, in which case it is equal to

$\displaystyle \prod_{j=1}^k j^{a_j} a_j! + o(1). \ \ \ \ \ (10)$

Comparing this with Proposition 1 from this previous post, we thus see that all the low moments of ${U}$ are consistent with the CUE hypothesis (and also with the ACUE hypothesis, again by the previous post). The case ${\sum_{j=1}^k a_j + \sum_{j=1}^k b_j \leq 2}$ of this proposition was essentially established by Andrade, Miller, Pratt, and Trinh.

Proof: We may assume the homogeneity relationship

$\displaystyle \sum_{j=1}^k j a_j = \sum_{j=1}^k j b_j \ \ \ \ \ (11)$

since otherwise the claim follows from the invariance under phase rotation ${U \mapsto e^{i\theta} U}$. By (6), the expression (9) is equal to

$\displaystyle q^{-D} {\bf E}_Q \sum_{n_1,\dots,n_l,n'_1,\dots,n'_{l'} \in {\mathbb F}[X]': |n_i| = q^{s_i}, |n'_i| = q^{s'_i}} (\prod_{i=1}^l \Lambda_q(n_i) \chi(n_i)) \prod_{i=1}^{l'} \Lambda_q(n'_i) \overline{\chi(n'_i)}$

where

$\displaystyle D := \sum_{j=1}^k j a_j = \sum_{j=1}^k j b_j$

$\displaystyle l := \sum_{j=1}^k a_j$

$\displaystyle l' := \sum_{j=1}^k b_j$

and ${s_1 \leq \dots \leq s_l}$ consists of ${a_j}$ copies of ${j}$ for each ${j=1,\dots,k}$, and similarly ${s'_1 \leq \dots \leq s'_{l'}}$ consists of ${b_j}$ copies of ${j}$ for each ${j=1,\dots,k}$.

The polynomials ${n_1 \dots n_l}$ and ${n'_1 \dots n'_{l'}}$ are monic of degree ${D}$, which by hypothesis is less than the degree of ${Q}$, and thus they can only be scalar multiples of each other in ${{\mathbb F}[X] / Q {\mathbb F}[X]}$ if they are identical (in ${{\mathbb F}[X]}$). As such, we see that the average

$\displaystyle {\bf E}_Q \chi(n_1) \dots \chi(n_l) \overline{\chi(n'_1)} \dots \overline{\chi(n'_{l'})}$

vanishes unless ${n_1 \dots n_l = n'_1 \dots n'_{l'}}$, in which case this average is equal to ${1}$. Thus the expression (9) simplifies to

$\displaystyle q^{-D} \sum_{n_1,\dots,n_l,n'_1,\dots,n'_{l'}: |n_i| = q^{s_i}, |n'_i| = q^{s'_i}; n_1 \dots n_l = n'_1 \dots n'_{l'}} (\prod_{i=1}^l \Lambda_q(n_i)) \prod_{i=1}^{l'} \Lambda_q(n'_i).$

There are at most ${q^D}$ choices for the product ${n_1 \dots n_l}$, and each one contributes ${O_D(1)}$ to the above sum. All but ${o(q^D)}$ of these choices are square-free, so by accepting an error of ${o(1)}$, we may restrict attention to square-free ${n_1 \dots n_l}$. This forces ${n_1,\dots,n_l,n'_1,\dots,n'_{l'}}$ to all be irreducible (as opposed to powers of irreducibles); as ${{\mathbb F}[X]}$ is a unique factorisation domain, this forces ${l=l'}$ and ${n_1,\dots,n_l}$ to be a permutation of ${n'_1,\dots,n'_{l'}}$. By the size restrictions, this then forces ${a_j = b_j}$ for all ${j}$ (if the above expression is to be anything other than ${o(1)}$), and each ${n_1,\dots,n_l}$ is associated to ${\prod_{j=1}^k a_j!}$ possible choices of ${n'_1,\dots,n'_{l'}}$. Writing ${\Lambda_q(n'_i) = s'_i}$ and then reinstating the non-squarefree possibilities for ${n_1 \dots n_l}$, we can thus write the above expression as

$\displaystyle q^{-D} \prod_{j=1}^k j^{a_j} a_j! \sum_{n_1,\dots,n_l \in {\mathbb F}[X]': |n_i| = q^{s_i}} \prod_{i=1}^l \Lambda_q(n_i) + o(1).$

Using the prime number theorem ${\sum_{n \in {\mathbb F}[X]': |n| = q^s} \Lambda_q(n) = q^s}$, we obtain the claim. $\Box$
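The exact prime number theorem used in the last step is easy to test by brute force for small fields. The sketch below is an illustration under assumptions: it only handles prime ${q=p}$, and the function names (`monic`, `mul`, `irreducible_counts`, `von_mangoldt_sum`) are made up for this example. It counts monic irreducibles of each degree by sieving out products, then uses the fact that ${\Lambda_q(n) = d}$ exactly when ${n}$ is a power of an irreducible of degree ${d}$, so that the sum over monic ${n}$ of degree ${s}$ is ${\sum_{d | s} d \cdot \#\{\hbox{irreducibles of degree } d\}}$.

```python
from itertools import product

def monic(p, deg):
    """All monic polynomials of the given degree over Z/pZ, low degree first."""
    return [low + (1,) for low in product(range(p), repeat=deg)]

def mul(a, b, p):
    """Product of two coefficient tuples modulo p."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] = (out[i + j] + ai * bj) % p
    return tuple(out)

def irreducible_counts(p, max_deg):
    """Number of monic irreducibles of each degree 1..max_deg, by sieving products."""
    counts = {}
    for s in range(1, max_deg + 1):
        reducible = set()
        for a in range(1, s // 2 + 1):
            for f in monic(p, a):
                for g in monic(p, s - a):
                    reducible.add(mul(f, g, p))
        counts[s] = p ** s - len(reducible)
    return counts

def von_mangoldt_sum(s, counts):
    """Sum of Lambda_q(n) over monic n of degree s, grouped by the underlying irreducible."""
    return sum(d * counts[d] for d in range(1, s + 1) if s % d == 0)

counts = irreducible_counts(3, 4)
for s in range(1, 5):
    print(s, counts[s], von_mangoldt_sum(s, counts), 3 ** s)
```

The last two columns printed should agree for every ${s}$.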

Comparing this with Proposition 1 from this previous post, we thus see that all the low moments of ${U}$ are consistent with the CUE and ACUE hypotheses:

Corollary 4 (CUE statistics at low frequencies) Let ${\lambda_1,\dots,\lambda_N}$ be the eigenvalues of ${U}$, permuted uniformly at random. Let ${R(\lambda)}$ be a linear combination of monomials ${\lambda_1^{a_1} \dots \lambda_N^{a_N}}$ where ${a_1,\dots,a_N}$ are integers with either ${\sum_{j=1}^N a_j \neq 0}$ or ${\sum_{j=1}^N |a_j| \leq 2N}$. Then

$\displaystyle {\bf E}_Q R(\lambda) = {\bf E}_{CUE} R(\lambda) + o(1).$

The analogue of the GUE hypothesis in this setting would be the CUE hypothesis, which asserts that the threshold ${2N}$ here can be replaced by an arbitrarily large quantity. As far as I know this is not known even for ${2N+2}$ (though, as mentioned previously, in principle one may be able to resolve such cases using Deligne’s proof of the Riemann hypothesis for function fields). Among other things, this would allow one to distinguish CUE from ACUE, since as discussed in the previous post, these two distributions agree when tested against monomials up to threshold ${2N}$, though not to ${2N+2}$.

Proof: By permutation symmetry we can take ${R}$ to be symmetric, and by linearity we may then take ${R}$ to be the symmetrisation of a single monomial ${\lambda_1^{a_1} \dots \lambda_N^{a_N}}$. If ${\sum_{j=1}^N a_j \neq 0}$ then both expectations vanish due to the phase rotation symmetry, so we may assume that ${\sum_{j=1}^N a_j = 0}$ and ${\sum_{j=1}^N |a_j| \leq 2N}$. We can write this symmetric polynomial as a constant multiple of ${\mathrm{tr}(U^{a_1}) \dots \mathrm{tr}(U^{a_N})}$ plus other monomials with a smaller value of ${\sum_{j=1}^N |a_j|}$. Since ${\mathrm{tr}(U^{-a}) = \overline{\mathrm{tr}(U^a)}}$, the claim now follows by induction from Proposition 3 and Proposition 1 from the previous post. $\Box$

Thus, for instance, for ${k=1,2}$, the ${2k^{th}}$ moment

$\displaystyle {\bf E}_Q |\det(1-U)|^{2k} = {\bf E}_Q |P(1)|^{2k} = {\bf E}_Q |L(\frac{1}{2} + it, \chi)|^{2k}$

is equal to

$\displaystyle {\bf E}_{CUE} |\det(1-U)|^{2k} + o(1)$

because all the monomials in ${\prod_{j=1}^N (1-\lambda_j)^k (1-\lambda_j^{-1})^k}$ are of the required form when ${k \leq 2}$. The latter expectation can be computed exactly (for any natural number ${k}$) using a formula

$\displaystyle {\bf E}_{CUE} |\det(1-U)|^{2k} = \prod_{j=1}^N \frac{\Gamma(j) \Gamma(j+2k)}{\Gamma(j+k)^2}$

of Baker-Forrester and Keating-Snaith, thus for instance

$\displaystyle {\bf E}_{CUE} |\det(1-U)|^2 = N+1$

$\displaystyle {\bf E}_{CUE} |\det(1-U)|^4 = \frac{(N+1)(N+2)^2(N+3)}{12}$

and more generally

$\displaystyle {\bf E}_{CUE}|\det(1-U)|^{2k} = \frac{g_k+o(1)}{(k^2)!} N^{k^2}$

when ${N \rightarrow \infty}$, where ${g_k}$ are the integers

$\displaystyle g_1 = 1, g_2 = 2, g_3 = 42, g_4 = 24024, \dots$

and more generally

$\displaystyle g_k := \frac{(k^2)!}{\prod_{i=1}^{2k-1} i^{k-|k-i|}}$

(OEIS A039622). Thus we have

$\displaystyle {\bf E}_Q |\det(1-U)|^{2k} = \frac{g_k+o(1)}{(k^2)!} N^{k^2}$

for ${k=1,2}$ if ${q \rightarrow \infty}$ and ${N}$ is sufficiently slowly growing depending on ${q}$. The CUE hypothesis would imply that this formula also holds for higher ${k}$. (The situation here is cleaner than in the number field case, in which the GUE hypothesis only suggests the correct lower bound for the moments rather than an asymptotic; the improvement is due to the absence of the wildly fluctuating additional factor ${\exp(A)}$ that is present in the Riemann zeta function model.)
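The CUE moment formula and the constants ${g_k}$ can be evaluated exactly for integer ${k}$, since then every Gamma value is a factorial. The following Python sketch (function names `cue_moment` and `g` are hypothetical) does this with rational arithmetic, and also prints the normalised moment for ${k=3}$ at a moderately large ${N}$, which one expects to approach ${g_3}$ only slowly because the relative error is of size ${O(1/N)}$.

```python
from fractions import Fraction
from math import factorial

def cue_moment(N, k):
    """prod_{j=1}^N Gamma(j) Gamma(j+2k) / Gamma(j+k)^2 for integer k >= 0, exactly."""
    value = Fraction(1)
    for j in range(1, N + 1):
        value *= Fraction(factorial(j - 1) * factorial(j + 2 * k - 1),
                          factorial(j + k - 1) ** 2)
    return value

def g(k):
    """(k^2)! / prod_{i=1}^{2k-1} i^(k - |k - i|)."""
    denom = 1
    for i in range(1, 2 * k):
        denom *= i ** (k - abs(k - i))
    return Fraction(factorial(k * k), denom)

for N in (1, 2, 5):
    print(N, cue_moment(N, 1), cue_moment(N, 2))
print([g(k) for k in range(1, 5)])

N = 400
print(float(cue_moment(N, 3) * factorial(9) / N ** 9))   # compare with g(3)
```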

Now we can recover the analogue of Montgomery’s work on the pair correlation conjecture. Consider the statistic

$\displaystyle {\bf E}_Q \sum_{1 \leq i,j \leq N} R( \lambda_i / \lambda_j )$

where

$\displaystyle R(z) = \sum_m \hat R(m) z^m$

is some finite linear combination of monomials ${z^m}$ independent of ${q}$. We can expand the above sum as

$\displaystyle \sum_m \hat R(m) {\bf E}_Q \mathrm{tr}(U^m) \mathrm{tr}(U^{-m}).$
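This expansion is a purely algebraic identity for each fixed ${U}$ (before taking expectations), which the following Python sketch checks numerically. The eigenvalues here are random unit phases chosen only for illustration, the test function is a Fejér-type kernel, and the names `pair_statistic` and `trace_form` are hypothetical.

```python
import cmath
import random

def pair_statistic(lam, R_hat):
    """sum_{i,j} R(lam_i / lam_j) with R(z) = sum_m R_hat[m] z^m."""
    return sum(sum(c * (a / b) ** m for m, c in R_hat.items())
               for a in lam for b in lam)

def trace_form(lam, R_hat):
    """sum_m R_hat[m] tr(U^m) tr(U^{-m}) for U with eigenvalues lam."""
    tr = lambda m: sum(x ** m for x in lam)
    return sum(c * tr(m) * tr(-m) for m, c in R_hat.items())

random.seed(0)
N = 4
lam = [cmath.exp(2j * cmath.pi * random.random()) for _ in range(N)]
R_hat = {m: N + 1 - abs(m) for m in range(-N, N + 1)}   # Fejer-type coefficients
print(abs(pair_statistic(lam, R_hat) - trace_form(lam, R_hat)))
```

Since the eigenvalues lie on the unit circle, ${\mathrm{tr}(U^{-m})}$ is the complex conjugate of ${\mathrm{tr}(U^m)}$, so each term in `trace_form` is ${\hat R(m)}$ times a non-negative number.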

Assuming the CUE hypothesis, then by Example 3 of the previous post, we would conclude that

$\displaystyle {\bf E}_Q \sum_{1 \leq i,j \leq N} R( \lambda_i / \lambda_j ) = N^2 \hat R(0) + \sum_m \min(|m|,N) \hat R(m) + o(1). \ \ \ \ \ (12)$

This is the analogue of Montgomery’s pair correlation conjecture. Proposition 3 implies that this claim is true whenever ${\hat R}$ is supported on ${[-N,N]}$. If instead we assume the ACUE hypothesis (or the weaker Alternative Hypothesis that the phase gaps are non-zero multiples of ${1/2N}$), one should instead have

$\displaystyle {\bf E}_Q \sum_{1 \leq i,j \leq N} R( \lambda_i / \lambda_j ) = \sum_{k \in {\bf Z}} N^2 \hat R(2Nk) + \sum_{1 \leq |m| \leq N} |m| \hat R(m+2Nk) + o(1)$

for arbitrary ${R}$; this is the function field analogue of a recent result of Baluyot. In any event, since ${\mathrm{tr}(U^m) \mathrm{tr}(U^{-m})}$ is non-negative, we unconditionally have the lower bound

$\displaystyle {\bf E}_Q \sum_{1 \leq i,j \leq N} R( \lambda_i / \lambda_j ) \geq N^2 \hat R(0) + \sum_{1 \leq |m| \leq N} |m| \hat R(m) + o(1) \ \ \ \ \ (13)$

if ${\hat R(m)}$ is non-negative for ${|m| > N}$.

By applying (12) for various choices of test functions ${R}$ we can obtain various bounds on the behaviour of eigenvalues. For instance suppose we take the Fejér kernel

$\displaystyle R(z) = |1 + z + \dots + z^N|^2 = \sum_{m=-N}^N (N+1-|m|) z^m.$

Then (12) applies unconditionally and we conclude that

$\displaystyle {\bf E}_Q \sum_{1 \leq i,j \leq N} R( \lambda_i / \lambda_j ) = N^2 (N+1) + \sum_{1 \leq |m| \leq N} (N+1-|m|) |m| + o(1).$

The right-hand side evaluates to ${\frac{2}{3} N(N+1)(2N+1)+o(1)}$. On the other hand, ${R(\lambda_i/\lambda_j)}$ is non-negative, and equal to ${(N+1)^2}$ when ${\lambda_i = \lambda_j}$. Thus

$\displaystyle {\bf E}_Q \sum_{1 \leq i,j \leq N} 1_{\lambda_i = \lambda_j} \leq \frac{2}{3} \frac{N(2N+1)}{N+1} + o(1).$

The sum ${\sum_{1 \leq j \leq N} 1_{\lambda_i = \lambda_j}}$ is at least ${1}$, and is at least ${2}$ if ${\lambda_i}$ is not a simple eigenvalue. Thus

$\displaystyle {\bf E}_Q \sum_{1 \leq i \leq N} 1_{\lambda_i \hbox{ not simple}} \leq \frac{1}{3} \frac{N(N-1)}{N+1} + o(1),$

and thus the expected number of simple eigenvalues is at least ${\frac{2N}{3} \frac{N+2}{N+1} + o(1)}$; in particular, at least two thirds of the eigenvalues are simple asymptotically on average. If we had (12) without any restriction on the support of ${\hat R}$, the same arguments allow one to show that the expected proportion of simple eigenvalues is ${1-o(1)}$.
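The arithmetic in this Fejér kernel argument can be reproduced exactly with rational numbers. The sketch below (the function name `fejer_bounds` is hypothetical, and all ${o(1)}$ errors are ignored) evaluates the main term, divides by ${(N+1)^2}$ to bound the expected number of coincidences ${\lambda_i = \lambda_j}$, subtracts the ${N}$ diagonal pairs to bound the non-simple eigenvalues, and subtracts that from ${N}$ to bound the simple ones from below.

```python
from fractions import Fraction

def fejer_bounds(N):
    """Main terms of the Fejer kernel argument for a given N (o(1) errors dropped)."""
    main = N ** 2 * (N + 1) + 2 * sum((N + 1 - m) * m for m in range(1, N + 1))
    coincidences = Fraction(main, (N + 1) ** 2)   # bound on E #{(i, j): lam_i = lam_j}
    non_simple = coincidences - N                 # bound on E #{i: lam_i not simple}
    simple = N - non_simple                       # lower bound on E #{i: lam_i simple}
    return main, coincidences, non_simple, simple

for N in (1, 2, 5, 50):
    main, coincidences, non_simple, simple = fejer_bounds(N)
    print(N, main, coincidences, non_simple, simple, simple / N)
```

The final column is the guaranteed proportion of simple eigenvalues, which should stay above two thirds for every ${N}$.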

Suppose that the phase gaps in ${U}$ are all greater than ${c/N}$ almost surely. Let ${R}$ be such that ${\hat R}$ is non-negative and ${R(e^{i\theta})}$ is non-positive for ${\theta}$ outside of the arc ${[-c/N,c/N]}$. Then the off-diagonal terms ${i \neq j}$ on the left-hand side of (13) are non-positive, while each of the ${N}$ diagonal terms equals ${R(1)}$, so from (13) one has

$\displaystyle R(1) N \geq N^2 \hat R(0) + \sum_{1 \leq |m| \leq N} |m| \hat R(m) + o(1),$

so by taking contrapositives one can force the existence of a gap less than ${c/N}$ asymptotically if one can find ${R}$ with ${\hat R}$ non-negative, ${R}$ non-positive for ${\theta}$ outside of the arc ${[-c/N,c/N]}$, and for which one has the inequality

$\displaystyle R(1) N < N^2 \hat R(0) + \sum_{1 \leq |m| \leq N} |m| \hat R(m).$

By a suitable choice of ${R}$ (based on a minorant of Selberg) one can ensure this for ${c \approx 0.6072}$ for ${N}$ large; see Section 5 of these notes of Goldston. This is not the smallest value of ${c}$ currently obtainable in the literature for the number field case (which is currently ${0.50412}$, due to Goldston and Turnage-Butterbaugh, by a somewhat different method), but is still significantly less than the trivial value of ${1}$. On the other hand, due to the compatibility of the ACUE distribution with Proposition 3, it is not possible to lower ${c}$ below ${0.5}$ purely through the use of Proposition 3.

In some cases it is possible to go beyond Proposition 3. Consider the mollified moment

$\displaystyle {\bf E}_Q |M(U) P(1)|^2$

where

$\displaystyle M(U) = \sum_{m=0}^d a_m h_m(\lambda_1,\dots,\lambda_N)$

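for some coefficients ${a_0,\dots,a_d}$ with ${a_d = 0}$ (without this normalisation one has to add a boundary term ${N |a_d|^2}$ to the right-hand sides below, as the example ${d=0}$ shows). We can compute this moment in the CUE case: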

Proposition 5 We have

$\displaystyle {\bf E}_{CUE} |M(U) P(1)|^2 = |a_0|^2 + N \sum_{m=1}^d |a_m - a_{m-1}|^2.$

Proof: From (5) one has

$\displaystyle P(1) = \sum_{i=0}^N (-1)^i e_i(\lambda_1,\dots,\lambda_N)$

hence

$\displaystyle M(U) P(1) = \sum_{i=0}^N \sum_{m=0}^d (-1)^i a_m e_i h_m$

where we suppress the dependence on the eigenvalues ${\lambda}$. Now observe the Pieri formula

$\displaystyle e_i h_m = s_{m 1^i} + s_{(m+1) 1^{i-1}}$

where ${s_{m 1^i}}$ are the hook Schur polynomials

$\displaystyle s_{m 1^i} = \sum_{a_1 \leq \dots \leq a_m; a_1 < b_1 < \dots < b_i} \lambda_{a_1} \dots \lambda_{a_m} \lambda_{b_1} \dots \lambda_{b_i}$

and we adopt the convention that ${s_{m 1^i}}$ vanishes for ${i = -1}$, or when ${m = 0}$ and ${i > 0}$. Then ${s_{m1^i}}$ also vanishes for ${i\geq N}$. We conclude that

$\displaystyle M(U) P(1) = a_0 s_{0 1^0} + \sum_{0 \leq i \leq N-1} \sum_{m \geq 1} (-1)^i (a_m - a_{m-1}) s_{m 1^i}.$

As the Schur polynomials are orthonormal on the unitary group, the claim follows. $\Box$
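The Pieri formula used in this proof, together with the stated conventions for the degenerate hooks, can be tested directly from the combinatorial formula for ${s_{m 1^i}}$. The Python sketch below is an illustration only: it evaluates both sides at an arbitrary integer point (so that the comparison is exact), and the function names `e`, `h` and `hook` are hypothetical.

```python
from itertools import combinations, combinations_with_replacement
from math import prod

def e(i, lam):
    """Elementary symmetric polynomial e_i."""
    return sum(prod(c) for c in combinations(lam, i))

def h(m, lam):
    """Complete homogeneous symmetric polynomial h_m."""
    return sum(prod(c) for c in combinations_with_replacement(lam, m))

def hook(m, i, lam):
    """Hook Schur polynomial s_{m 1^i}, with the conventions stated in the text."""
    if i < 0 or (m == 0 and i > 0):
        return 0
    if m == 0:
        return 1
    n = len(lam)
    total = 0
    for row in combinations_with_replacement(range(n), m):   # a_1 <= ... <= a_m
        for col in combinations(range(row[0] + 1, n), i):    # a_1 < b_1 < ... < b_i
            total += prod(lam[a] for a in row) * prod(lam[b] for b in col)
    return total

lam = (2, 3, 5, 7)     # arbitrary integer test point with N = 4 variables
ok = all(e(i, lam) * h(m, lam) == hook(m, i, lam) + hook(m + 1, i - 1, lam)
         for i in range(len(lam) + 1) for m in range(4))
print(ok)
```

Because both sides are polynomials, agreement at a single point is of course only evidence rather than a proof; trying several test points makes an accidental coincidence less plausible.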

The CUE hypothesis would then imply the corresponding mollified moment conjecture

$\displaystyle {\bf E}_{Q} |M(U) P(1)|^2 = |a_0|^2 + N \sum_{m=1}^d |a_m - a_{m-1}|^2 + o(1). \ \ \ \ \ (14)$

(See this paper of Conrey, and this paper of Radziwill, for some discussion of the analogous conjecture for the zeta function, which is essentially due to Farmer.)

From Proposition 3 one sees that this conjecture holds in the range ${d \leq \frac{1}{2} N}$. It is likely that the function field analogue of the calculations of Conrey (based ultimately on deep exponential sum estimates of Deshouillers and Iwaniec) can extend this range to ${d < \theta N}$ for any ${\theta < \frac{4}{7}}$, if ${N}$ is sufficiently large depending on ${\theta}$; these bounds thus go beyond what is available from Proposition 3. On the other hand, as discussed in Remark 7 of the previous post, ACUE would also predict (14) for ${d}$ as large as ${N-2}$, so the available mollified moment estimates are not strong enough to rule out ACUE. It would be interesting to see if there is some other estimate in the function field setting that can be used to exclude the ACUE hypothesis (possibly one that exploits the fact that GRH is available in the function field case?).

In Notes 2, the Riemann zeta function ${\zeta}$ (and more generally, the Dirichlet ${L}$-functions ${L(\cdot,\chi)}$) were extended meromorphically into the region ${\{ s: \hbox{Re}(s) > 0 \}}$ in and to the right of the critical strip. This is a sufficient amount of meromorphic continuation for many applications in analytic number theory, such as establishing the prime number theorem and its variants. The zeroes of the zeta function in the critical strip ${\{ s: 0 < \hbox{Re}(s) < 1 \}}$ are known as the non-trivial zeroes of ${\zeta}$, and thanks to the truncated explicit formulae developed in Notes 2, they control the asymptotic distribution of the primes (up to small errors).

The ${\zeta}$ function obeys the trivial functional equation

$\displaystyle \zeta(\overline{s}) = \overline{\zeta(s)} \ \ \ \ \ (1)$

for all ${s}$ in its domain of definition. Indeed, as ${\zeta(s)}$ is real-valued when ${s}$ is real, the function ${\zeta(s) - \overline{\zeta(\overline{s})}}$ vanishes on the real line and is also meromorphic, and hence vanishes everywhere. Similarly one has the functional equation

$\displaystyle \overline{L(s, \chi)} = L(\overline{s}, \overline{\chi}). \ \ \ \ \ (2)$

From these equations we see that the zeroes of the zeta function are symmetric across the real axis, and the zeroes of ${L(\cdot,\chi)}$ are the reflection of the zeroes of ${L(\cdot,\overline{\chi})}$ across this axis.

It is a remarkable fact that these functions obey an additional, and more non-trivial, functional equation, this time establishing a symmetry across the critical line ${\{ s: \hbox{Re}(s) = \frac{1}{2} \}}$ rather than the real axis. One consequence of this symmetry is that the zeta function and ${L}$-functions may be extended meromorphically to the entire complex plane. For the zeta function, the functional equation was discovered by Riemann, and reads as follows:

Theorem 1 (Functional equation for the Riemann zeta function) The Riemann zeta function ${\zeta}$ extends meromorphically to the entire complex plane, with a simple pole at ${s=1}$ and no other poles. Furthermore, one has the functional equation

$\displaystyle \zeta(s) = \alpha(s) \zeta(1-s) \ \ \ \ \ (3)$

or equivalently

$\displaystyle \zeta(1-s) = \alpha(1-s) \zeta(s) \ \ \ \ \ (4)$

for all complex ${s}$ other than ${s=0,1}$, where ${\alpha}$ is the function

$\displaystyle \alpha(s) := 2^s \pi^{s-1} \sin( \frac{\pi s}{2}) \Gamma(1-s). \ \ \ \ \ (5)$

Here ${\cos(z) := \frac{e^{iz} + e^{-iz}}{2}}$, ${\sin(z) := \frac{e^{iz}-e^{-iz}}{2i}}$ are the complex-analytic extensions of the classical trigonometric functions ${\cos(x), \sin(x)}$, and ${\Gamma}$ is the Gamma function, whose definition and properties we review below the fold.
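As a quick plausibility check of (3) and (5) at real arguments, one can use the functional equation to evaluate ${\zeta}$ at negative reals from its values to the right of ${1}$. The Python sketch below makes some assumptions of its own: `zeta_real` approximates ${\zeta(s)}$ for real ${s>1}$ by a partial sum with a simple Euler-Maclaurin tail correction, and all function names are hypothetical. The results can be compared with the classical values ${\zeta(-1) = -\frac{1}{12}}$ and ${\zeta(-3) = \frac{1}{120}}$, and with the trivial zero at ${s=-2}$.

```python
import math

def zeta_real(s, M=2000):
    """zeta(s) for real s > 1: partial sum plus the leading Euler-Maclaurin tail terms."""
    head = sum(n ** -s for n in range(1, M))
    return head + M ** (1 - s) / (s - 1) + 0.5 * M ** -s

def alpha(s):
    """alpha(s) = 2^s pi^(s-1) sin(pi s / 2) Gamma(1 - s), for real s < 1."""
    return 2 ** s * math.pi ** (s - 1) * math.sin(math.pi * s / 2) * math.gamma(1 - s)

def zeta_left(s):
    """zeta(s) for real s < 0, obtained from zeta(s) = alpha(s) zeta(1 - s)."""
    return alpha(s) * zeta_real(1 - s)

for s in (-1, -2, -3):
    print(s, zeta_left(s))
```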

The functional equation can be placed in a more symmetric form as follows:

Corollary 2 (Functional equation for the Riemann xi function) The Riemann xi function

$\displaystyle \xi(s) := \frac{1}{2} s(s-1) \pi^{-s/2} \Gamma(\frac{s}{2}) \zeta(s) \ \ \ \ \ (6)$

is analytic on the entire complex plane ${{\bf C}}$ (after removing all removable singularities), and obeys the functional equations

$\displaystyle \xi(\overline{s}) = \overline{\xi(s)}$

and

$\displaystyle \xi(s) = \xi(1-s). \ \ \ \ \ (7)$

In particular, the zeroes of ${\xi}$ consist precisely of the non-trivial zeroes of ${\zeta}$, and are symmetric about both the real axis and the critical line. Also, ${\xi}$ is real-valued on the critical line and on the real axis.

Corollary 2 is an easy consequence of Theorem 1 together with the duplication theorem for the Gamma function, and the fact that ${\zeta}$ has no zeroes to the right of the critical strip, and is left as an exercise to the reader (Exercise 19). The functional equation in Theorem 1 has many proofs, but most of them are related in one way or another to the Poisson summation formula

$\displaystyle \sum_n f(n) = \sum_m \hat f(2\pi m) \ \ \ \ \ (8)$

(Theorem 34 from Supplement 2, at least in the case when ${f}$ is twice continuously differentiable and compactly supported), which can be viewed as a Fourier-analytic link between the coarse-scale distribution of the integers and the fine-scale distribution of the integers. Indeed, there is a quick heuristic proof of the functional equation that comes from formally applying the Poisson summation formula to the function ${1_{x>0} \frac{1}{x^s}}$, and noting that the functions ${x \mapsto \frac{1}{x^s}}$ and ${\xi \mapsto \frac{1}{\xi^{1-s}}}$ are formally Fourier transforms of each other, up to some Gamma function factors, as well as some trigonometric factors arising from the distinction between the real line and the half-line. Such a heuristic proof can indeed be made rigorous, and we do so below the fold, while also providing Riemann’s two classical proofs of the functional equation.
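To see the Poisson summation formula (8) in action numerically, one can test it on Gaussians. This is a sketch under assumptions: the normalisation ${\hat f(\xi) = \int_{\bf R} f(x) e^{-ix\xi}\ dx}$ is assumed (it is the one consistent with the factor ${2\pi m}$ in (8)), and a Gaussian is not compactly supported, so it falls outside the hypotheses quoted above, although the formula is classical for such rapidly decaying functions. For ${f(x) = e^{-a x^2}}$ one then has ${\hat f(\xi) = \sqrt{\pi/a}\, e^{-\xi^2/4a}}$. The function names are hypothetical.

```python
import math

def lattice_sum(a, terms=50):
    """sum_n f(n) for f(x) = exp(-a x^2), truncated to |n| <= terms."""
    return sum(math.exp(-a * n * n) for n in range(-terms, terms + 1))

def dual_sum(a, terms=50):
    """sum_m hat f(2 pi m) with hat f(xi) = sqrt(pi / a) exp(-xi^2 / (4 a))."""
    return sum(math.sqrt(math.pi / a) * math.exp(-(2 * math.pi * m) ** 2 / (4 * a))
               for m in range(-terms, terms + 1))

for a in (0.5, 1.0, 3.0):
    print(a, lattice_sum(a), dual_sum(a))
```

For small ${a}$ the left-hand sum has many significant terms while the dual sum is dominated by ${m=0}$, which illustrates the link between coarse and fine scales mentioned above.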

From the functional equation (and the poles of the Gamma function), one can see that ${\zeta}$ has trivial zeroes at the negative even integers ${-2,-4,-6,\dots}$, in addition to the non-trivial zeroes in the critical strip. More generally, the following table summarises the zeroes and poles of the various special functions appearing in the functional equation, after they have been meromorphically extended to the entire complex plane, and with zeroes classified as “non-trivial” or “trivial” depending on whether they lie in the critical strip or not. (Exponential functions such as ${2^{s-1}}$ or ${\pi^{-s}}$ have no zeroes or poles, and will be ignored in this table; the zeroes and poles of rational functions such as ${s(s-1)}$ are self-evident and will also not be displayed here.)

| Function | Non-trivial zeroes | Trivial zeroes | Poles |
| --- | --- | --- | --- |
| ${\zeta(s)}$ | Yes | ${-2,-4,-6,\dots}$ | ${1}$ |
| ${\zeta(1-s)}$ | Yes | ${3,5,\dots}$ | ${0}$ |
| ${\sin(\pi s/2)}$ | No | Even integers | No |
| ${\cos(\pi s/2)}$ | No | Odd integers | No |
| ${\sin(\pi s)}$ | No | Integers | No |
| ${\Gamma(s)}$ | No | No | ${0,-1,-2,\dots}$ |
| ${\Gamma(s/2)}$ | No | No | ${0,-2,-4,\dots}$ |
| ${\Gamma(1-s)}$ | No | No | ${1,2,3,\dots}$ |
| ${\Gamma((1-s)/2)}$ | No | No | ${1,3,5,\dots}$ |
| ${\xi(s)}$ | Yes | No | No |

Among other things, this table indicates that the Gamma and trigonometric factors in the functional equation are tied to the trivial zeroes and poles of zeta, but have no direct bearing on the distribution of the non-trivial zeroes, which is the most important feature of the zeta function for the purposes of analytic number theory, beyond the fact that they are symmetric about the real axis and critical line. In particular, the Riemann hypothesis is not going to be resolved just from further analysis of the Gamma function!

The zeta function computes the “global” sum ${\sum_n \frac{1}{n^s}}$, with ${n}$ ranging all the way from ${1}$ to infinity. However, by some Fourier-analytic (or complex-analytic) manipulation, it is possible to use the zeta function to also control more “localised” sums, such as ${\sum_n \frac{1}{n^s} \psi(\log n - \log N)}$ for some ${N \gg 1}$ and some smooth compactly supported function ${\psi: {\bf R} \rightarrow {\bf C}}$. It turns out that the functional equation (3) for the zeta function localises to this context, giving an approximate functional equation which roughly speaking takes the form

$\displaystyle \sum_n \frac{1}{n^s} \psi( \log n - \log N ) \approx \alpha(s) \sum_m \frac{1}{m^{1-s}} \psi( \log M - \log m )$

whenever ${s=\sigma+it}$ and ${NM = \frac{|t|}{2\pi}}$; see Theorem 38 below for a precise formulation of this equation. Unsurprisingly, this form of the functional equation is also very closely related to the Poisson summation formula (8), indeed it is essentially a special case of that formula (or more precisely, of the van der Corput ${B}$-process). This useful identity relates long smoothed sums of ${\frac{1}{n^s}}$ to short smoothed sums of ${\frac{1}{m^{1-s}}}$ (or vice versa), and can thus be used to shorten exponential sums involving terms such as ${\frac{1}{n^s}}$, which is useful when obtaining some of the more advanced estimates on the Riemann zeta function.

We will give two other basic uses of the functional equation. The first is to get a good count (as opposed to merely an upper bound) on the density of zeroes in the critical strip, establishing the Riemann-von Mangoldt formula that the number ${N(T)}$ of zeroes of imaginary part between ${0}$ and ${T}$ is ${\frac{T}{2\pi} \log \frac{T}{2\pi} - \frac{T}{2\pi} + O(\log T)}$ for large ${T}$. The other is to obtain untruncated versions of the explicit formula from Notes 2, giving a remarkable exact formula for sums involving the von Mangoldt function in terms of zeroes of the Riemann zeta function. These results are not strictly necessary for most of the material in the rest of the course, but certainly help to clarify the nature of the Riemann zeta function and its relation to the primes.
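To get a feel for the Riemann-von Mangoldt formula, here is a minimal Python sketch that evaluates only the main term ${\frac{T}{2\pi} \log \frac{T}{2\pi} - \frac{T}{2\pi}}$; the function name and the sample heights are illustrative choices, not part of the notes. Standard tables list ${29}$ zeroes with imaginary part between ${0}$ and ${100}$, so at ${T=100}$ the main term should land within ${O(\log T)}$ of that count.

```python
import math

def riemann_von_mangoldt_main_term(T: float) -> float:
    """Main term (T/2pi) log(T/2pi) - T/2pi of the zero-counting formula."""
    u = T / (2 * math.pi)
    return u * math.log(u) - u

# Sample heights, chosen only for illustration.
for T in (100.0, 1000.0, 10000.0):
    print(T, riemann_von_mangoldt_main_term(T))
```

The sketch says nothing about the ${O(\log T)}$ error term itself; it only illustrates how quickly the main term grows.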

In view of the material in previous notes, it should not be surprising that there are analogues of all of the above theory for Dirichlet ${L}$-functions ${L(\cdot,\chi)}$. We will restrict attention to primitive characters ${\chi}$, since the ${L}$-function for imprimitive characters merely differs from the ${L}$-function of the associated primitive factor by a finite Euler product; indeed, if ${\chi = \chi' \chi_0}$ for some principal ${\chi_0}$ whose modulus ${q_0}$ is coprime to that of ${\chi'}$, then

$\displaystyle L(s,\chi) = L(s,\chi') \prod_{p|q_0} (1 - \frac{\chi'(p)}{p^s}) \ \ \ \ \ (9)$

(cf. equation (45) of Notes 2).
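As a numerical sanity check of a relation of this type, here is a minimal Python sketch in which each prime ${p}$ dividing ${q_0}$ contributes the Euler factor ${1 - \frac{\chi'(p)}{p^s}}$ that is removed from the Euler product of ${L(s,\chi')}$. The specific choices (the non-principal character modulo ${3}$ for ${\chi'}$, the principal character modulo ${5}$ for ${\chi_0}$, the point ${s=2}$, and the truncation length) are assumptions made purely for illustration.

```python
def chi_mod3(n: int) -> int:
    """The non-principal (primitive) character modulo 3."""
    return (0, 1, -1)[n % 3]

def chi_mod15(n: int) -> int:
    """chi_mod3 times the principal character modulo 5 (imprimitive, modulus 15)."""
    return 0 if n % 5 == 0 else chi_mod3(n)

def partial_L(chi, s: float, terms: int) -> float:
    """Truncated Dirichlet series sum_{n <= terms} chi(n) / n^s."""
    return sum(chi(n) / n**s for n in range(1, terms + 1))

s = 2.0
terms = 200_000
lhs = partial_L(chi_mod15, s, terms)
# Only the prime p = 5 divides q_0 = 5, so there is a single Euler factor.
rhs = partial_L(chi_mod3, s, terms) * (1 - chi_mod3(5) / 5**s)
print(lhs, rhs)
```

Both sides are truncated series, so agreement is only expected up to the size of the discarded tails.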

The main new feature is that the Poisson summation formula needs to be “twisted” by a Dirichlet character ${\chi}$, and this boils down to the problem of understanding the finite (additive) Fourier transform of a Dirichlet character. This is achieved by the classical theory of Gauss sums, which we review below the fold. There is one new wrinkle; the value of ${\chi(-1) \in \{-1,+1\}}$ plays a role in the functional equation. More precisely, we have

Theorem 3 (Functional equation for ${L}$-functions) Let ${\chi}$ be a primitive character of modulus ${q}$ with ${q>1}$. Then ${L(s,\chi)}$ extends to an entire function on the complex plane, with

$\displaystyle L(s,\chi) = \varepsilon(\chi) 2^s \pi^{s-1} q^{1/2-s} \sin(\frac{\pi}{2}(s+\kappa)) \Gamma(1-s) L(1-s,\overline{\chi})$

or equivalently

$\displaystyle L(1-s,\overline{\chi}) = \varepsilon(\overline{\chi}) 2^{1-s} \pi^{-s} q^{s-1/2} \sin(\frac{\pi}{2}(1-s+\kappa)) \Gamma(s) L(s,\chi)$

for all ${s}$, where ${\kappa}$ is equal to ${0}$ in the even case ${\chi(-1)=+1}$ and ${1}$ in the odd case ${\chi(-1)=-1}$, and

$\displaystyle \varepsilon(\chi) := \frac{\tau(\chi)}{i^\kappa \sqrt{q}} \ \ \ \ \ (10)$

where ${\tau(\chi)}$ is the Gauss sum

$\displaystyle \tau(\chi) := \sum_{n \in {\bf Z}/q{\bf Z}} \chi(n) e(n/q). \ \ \ \ \ (11)$

and ${e(x) := e^{2\pi ix}}$, with the convention that the ${q}$-periodic function ${n \mapsto e(n/q)}$ is also (by abuse of notation) applied to ${n}$ in the cyclic group ${{\bf Z}/q{\bf Z}}$.
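The definitions (10) and (11) are easy to experiment with numerically. The following Python sketch computes the Gauss sum and the quantity ${\varepsilon(\chi)}$ for the quadratic character modulo a small odd prime; the helper names (`legendre`, `gauss_sum`, `root_number`) and the choice of moduli are hypothetical and made only for illustration. One expects ${|\tau(\chi)| = \sqrt{q}}$ for primitive ${\chi}$, and for these real characters the classical evaluation of the quadratic Gauss sum gives ${\varepsilon(\chi) = 1}$.

```python
import cmath
import math

def e(x: float) -> complex:
    """The additive character e(x) = exp(2 pi i x)."""
    return cmath.exp(2j * math.pi * x)

def legendre(n: int, p: int) -> int:
    """Quadratic character modulo an odd prime p, via Euler's criterion."""
    r = pow(n % p, (p - 1) // 2, p)   # r is 0, 1 or p - 1
    return -1 if r == p - 1 else r

def gauss_sum(chi, q: int) -> complex:
    """tau(chi) = sum over n mod q of chi(n) e(n/q), as in (11)."""
    return sum(chi(n) * e(n / q) for n in range(q))

def root_number(chi, q: int) -> complex:
    """epsilon(chi) = tau(chi) / (i^kappa sqrt(q)), as in (10)."""
    kappa = 0 if chi(q - 1) == 1 else 1   # chi(-1) = +1 is even, -1 is odd
    return gauss_sum(chi, q) / (1j**kappa * math.sqrt(q))

for p in (5, 7):
    chi = lambda n, p=p: legendre(n, p)
    print(p, abs(gauss_sum(chi, p)), root_number(chi, p))
```

For complex characters one should still see ${|\varepsilon(\chi)| = 1}$, but the argument of ${\varepsilon(\chi)}$ is then genuinely harder to predict.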

From this functional equation and (2) we see that, as with the Riemann zeta function, the non-trivial zeroes of ${L(s,\chi)}$ (defined as the zeroes within the critical strip ${\{ s: 0 < \hbox{Re}(s) < 1 \}}$) are symmetric around the critical line (and, if ${\chi}$ is real, are also symmetric around the real axis). In addition, ${L(s,\chi)}$ acquires trivial zeroes at the negative even integers and at zero if ${\chi(-1)=1}$, and at the negative odd integers if ${\chi(-1)=-1}$. For imprimitive ${\chi}$, we see from (9) that ${L(s,\chi)}$ also acquires some additional trivial zeroes on the left edge of the critical strip.

There is also a symmetric version of this equation, analogous to Corollary 2:

Corollary 4 Let ${\chi,q,\varepsilon(\chi)}$ be as above, and set

$\displaystyle \xi(s,\chi) := (q/\pi)^{(s+\kappa)/2} \Gamma((s+\kappa)/2) L(s,\chi),$

then ${\xi(\cdot,\chi)}$ is entire with ${\xi(s,\chi) = \varepsilon(\chi) \xi(1-s,\overline{\chi})}$.

For further detail on the functional equation and its implications, I recommend the classic text of Titchmarsh or the text of Davenport.

In Notes 1, we approached multiplicative number theory (the study of multiplicative functions ${f: {\bf N} \rightarrow {\bf C}}$ and their relatives) via elementary methods, in which attention was primarily focused on obtaining asymptotic control on summatory functions ${\sum_{n \leq x} f(n)}$ and logarithmic sums ${\sum_{n \leq x} \frac{f(n)}{n}}$. Now we turn to the complex approach to multiplicative number theory, in which the focus is instead on obtaining various types of control on the Dirichlet series ${{\mathcal D} f}$, defined (at least for ${s}$ of sufficiently large real part) by the formula

$\displaystyle {\mathcal D} f(s) := \sum_n \frac{f(n)}{n^s}.$

These series also made an appearance in the elementary approach to the subject, but only for real ${s}$ that were larger than ${1}$. But now we will exploit the freedom to extend the variable ${s}$ to the complex domain; this gives enough freedom (in principle, at least) to recover control of elementary sums such as ${\sum_{n\leq x} f(n)}$ or ${\sum_{n\leq x} \frac{f(n)}{n}}$ from control on the Dirichlet series. Crucially, for many key functions ${f}$ of number-theoretic interest, the Dirichlet series ${{\mathcal D} f}$ can be analytically (or at least meromorphically) continued to the left of the line ${\{ s: \hbox{Re}(s) = 1 \}}$. The zeroes and poles of the resulting meromorphic continuations of ${{\mathcal D} f}$ (and of related functions) then turn out to control the asymptotic behaviour of the elementary sums of ${f}$; the more one knows about the former, the more one knows about the latter. In particular, knowledge of where the zeroes of the Riemann zeta function ${\zeta}$ are located can give very precise information about the distribution of the primes, by means of a fundamental relationship known as the explicit formula. There are many ways of phrasing this explicit formula (both in exact and in approximate forms), but they are all trying to formalise an approximation to the von Mangoldt function ${\Lambda}$ (and hence to the primes) of the form

$\displaystyle \Lambda(n) \approx 1 - \sum_\rho n^{\rho-1} \ \ \ \ \ (1)$

where the sum is over zeroes ${\rho}$ (counting multiplicity) of the Riemann zeta function ${\zeta = {\mathcal D} 1}$ (with the sum often restricted so that ${\rho}$ has large real part and bounded imaginary part), and the approximation is in a suitable weak sense, so that

$\displaystyle \sum_n \Lambda(n) g(n) \approx \int_0^\infty g(y)\ dy - \sum_\rho \int_0^\infty g(y) y^{\rho-1}\ dy \ \ \ \ \ (2)$

for suitable “test functions” ${g}$ (which in practice are restricted to be fairly smooth and slowly varying, with the precise amount of restriction dependent on the amount of truncation in the sum over zeroes one wishes to take). Among other things, such approximations can be used to rigorously establish the prime number theorem

$\displaystyle \sum_{n \leq x} \Lambda(n) = x + o(x) \ \ \ \ \ (3)$

as ${x \rightarrow \infty}$, with the size of the error term ${o(x)}$ closely tied to the location of the zeroes ${\rho}$ of the Riemann zeta function.
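The prime number theorem in the form (3) is easy to observe numerically. Here is a minimal Python sketch that tabulates the von Mangoldt function with a simple sieve and compares ${\sum_{n \leq x} \Lambda(n)}$ with ${x}$; the function names and the sample values of ${x}$ are illustrative assumptions, and the sketch is of course no substitute for a proof.

```python
import math

def von_mangoldt_table(limit: int) -> list[float]:
    """Lambda(n) for 0 <= n <= limit: log p if n is a power of a prime p, else 0."""
    lam = [0.0] * (limit + 1)
    is_composite = [False] * (limit + 1)
    for p in range(2, limit + 1):
        if is_composite[p]:
            continue
        # p is prime: mark its multiples, then record log p at each power of p.
        for m in range(p * p, limit + 1, p):
            is_composite[m] = True
        pk = p
        while pk <= limit:
            lam[pk] = math.log(p)
            pk *= p
    return lam

def chebyshev_psi(x: int) -> float:
    """The summatory function sum_{n <= x} Lambda(n)."""
    return sum(von_mangoldt_table(x))

for x in (10**3, 10**4, 10**5):
    print(x, chebyshev_psi(x) / x)
```

The printed ratios should drift towards ${1}$, with the fluctuations reflecting the oscillatory terms ${n^{\rho-1}}$ in (1).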

The explicit formula (1) (or any of its more rigorous forms) is closely tied to the counterpart approximation

$\displaystyle -\frac{\zeta'}{\zeta}(s) \approx \frac{1}{s-1} - \sum_\rho \frac{1}{s-\rho} \ \ \ \ \ (4)$

for the Dirichlet series ${{\mathcal D} \Lambda = -\frac{\zeta'}{\zeta}}$ of the von Mangoldt function; note that (4) is formally the special case of (2) when ${g(n) = n^{-s}}$. Such approximations come from the general theory of local factorisations of meromorphic functions, as discussed in Supplement 2; the passage from (4) to (2) is accomplished by such tools as the residue theorem and the Fourier inversion formula, which were also covered in Supplement 2. The relative ease of uncovering the Fourier-like duality between primes and zeroes (sometimes referred to poetically as the “music of the primes”) is one of the major advantages of the complex-analytic approach to multiplicative number theory; this important duality tends to be rather obscured in the other approaches to the subject, although it can still in principle be discernible with sufficient effort.

More generally, one has an explicit formula

$\displaystyle \Lambda(n) \chi(n) \approx - \sum_\rho n^{\rho-1} \ \ \ \ \ (5)$

for any (non-principal) Dirichlet character ${\chi}$, where ${\rho}$ now ranges over the zeroes of the associated Dirichlet ${L}$-function ${L(s,\chi) := {\mathcal D} \chi(s)}$; we view this formula as a “twist” of (1) by the Dirichlet character ${\chi}$. The explicit formula (5), proven similarly (in any of its rigorous forms) to (1), is important in establishing the prime number theorem in arithmetic progressions, which asserts that

$\displaystyle \sum_{n \leq x: n = a\ (q)} \Lambda(n) = \frac{x}{\phi(q)} + o(x) \ \ \ \ \ (6)$

as ${x \rightarrow \infty}$, whenever ${a\ (q)}$ is a fixed primitive residue class. Again, the size of the error term ${o(x)}$ here is closely tied to the location of the zeroes of the Dirichlet ${L}$-function, with particular importance given to whether there is a zero very close to ${s=1}$ (such a zero is known as an exceptional zero or Siegel zero).
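The equidistribution asserted in (6) can likewise be tested numerically. The sketch below, with the modulus ${q=4}$ and the cutoff ${x=10^5}$ chosen purely for illustration (and a hypothetical helper name), sums ${\Lambda(n)}$ over each primitive residue class and compares the result with ${\frac{x}{\phi(q)}}$.

```python
import math

def psi_in_progression(x: int, a: int, q: int) -> float:
    """Sum of Lambda(n) over n <= x with n = a (mod q), using a simple sieve."""
    is_composite = [False] * (x + 1)
    total = 0.0
    for p in range(2, x + 1):
        if is_composite[p]:
            continue
        for m in range(p * p, x + 1, p):
            is_composite[m] = True
        # Each prime power p^k <= x contributes log p to its residue class.
        pk = p
        while pk <= x:
            if pk % q == a % q:
                total += math.log(p)
            pk *= p
    return total

x, q = 10**5, 4
for a in (1, 3):   # the primitive residue classes mod 4, so phi(4) = 2
    print(a, psi_in_progression(x, a, q) / (x / 2))
```

Both ratios should be close to ${1}$; the small discrepancy between the two classes is governed by the zeroes of the ${L}$-function of the non-principal character modulo ${4}$.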

While any information on the behaviour of zeta functions or ${L}$-functions is in principle welcome for the purposes of analytic number theory, some regions of the complex plane are more important than others in this regard, due to the differing weights assigned to each zero in the explicit formula. Roughly speaking, in descending order of importance, the most crucial regions on which knowledge of these functions is useful are

1. The region on or near the point ${s=1}$.
2. The region on or near the right edge ${\{ 1+it: t \in {\bf R} \}}$ of the critical strip ${\{ s: 0 \leq \hbox{Re}(s) \leq 1 \}}$.
3. The right half ${\{ s: \frac{1}{2} < \hbox{Re}(s) < 1 \}}$ of the critical strip.
4. The region on or near the critical line ${\{ \frac{1}{2} + it: t \in {\bf R} \}}$ that bisects the critical strip.
5. Everywhere else.

For instance:

1. We will shortly show that the Riemann zeta function ${\zeta}$ has a simple pole at ${s=1}$ with residue ${1}$, which is already sufficient to recover many of the classical theorems of Mertens discussed in the previous set of notes, as well as results on mean values of multiplicative functions such as the divisor function ${\tau}$. For Dirichlet ${L}$-functions, the behaviour is instead controlled by the quantity ${L(1,\chi)}$ discussed in Notes 1, which is in turn closely tied to the existence and location of a Siegel zero.
2. The zeta function is also known to have no zeroes on the right edge ${\{1+it: t \in {\bf R}\}}$ of the critical strip, which is sufficient to prove (and is in fact equivalent to) the prime number theorem. Any enlargement of the zero-free region for ${\zeta}$ into the critical strip leads to improved error terms in that theorem, with larger zero-free regions leading to stronger error estimates. Similarly for ${L}$-functions and the prime number theorem in arithmetic progressions.
3. The (as yet unproven) Riemann hypothesis prohibits ${\zeta}$ from having any zeroes within the right half ${\{ s: \frac{1}{2} < \hbox{Re}(s) < 1 \}}$ of the critical strip, and gives very good control on the number of primes in intervals, even when the intervals are relatively short compared to the size of the entries. Even without assuming the Riemann hypothesis, zero density estimates in this region are available that give some partial control of this form. Similarly for ${L}$-functions, primes in short arithmetic progressions, and the generalised Riemann hypothesis.
4. Assuming the Riemann hypothesis, further distributional information about the zeroes on the critical line (such as Montgomery’s pair correlation conjecture, or the more general GUE hypothesis) can give finer information about the error terms in the prime number theorem in short intervals, as well as other arithmetic information. Again, one has analogues for ${L}$-functions and primes in short arithmetic progressions.
5. The functional equation of the zeta function describes the behaviour of ${\zeta}$ to the left of the critical line, in terms of the behaviour to the right of the critical line. This is useful for building a “global” picture of the structure of the zeta function, and for improving a number of estimates about that function, but (in the absence of unproven conjectures such as the Riemann hypothesis or the pair correlation conjecture) it turns out that many of the basic analytic number theory results using the zeta function can be established without relying on this equation. Similarly for ${L}$-functions.

Remark 1 If one takes an “adelic” viewpoint, one can unite the Riemann zeta function ${\zeta(\sigma+it) = \sum_n n^{-\sigma-it}}$ and all of the ${L}$-functions ${L(\sigma+it,\chi) = \sum_n \chi(n) n^{-\sigma-it}}$ for various Dirichlet characters ${\chi}$ into a single object, viewing ${n \mapsto \chi(n) n^{-it}}$ as a general multiplicative character on the adeles; thus the imaginary coordinate ${t}$ and the Dirichlet character ${\chi}$ are really the Archimedean and non-Archimedean components respectively of a single adelic frequency parameter. This viewpoint was famously developed in Tate’s thesis, which among other things helps to clarify the nature of the functional equation, as discussed in this previous post. We will not pursue the adelic viewpoint further in these notes, but it does supply a “high-level” explanation for why so much of the theory of the Riemann zeta function extends to the Dirichlet ${L}$-functions. (The non-Archimedean character ${\chi(n)}$ and the Archimedean character ${n^{it}}$ behave similarly from an algebraic point of view, but not so much from an analytic point of view; as such, the adelic viewpoint is well suited for algebraic tasks (such as establishing the functional equation), but not for analytic tasks (such as establishing a zero-free region).)

Roughly speaking, the elementary multiplicative number theory from Notes 1 corresponds to the information one can extract from the complex-analytic method in region 1 of the above hierarchy, while the more advanced elementary number theory used to prove the prime number theorem (and which we will not cover in full detail in these notes) corresponds to what one can extract from regions 1 and 2.

As a consequence of this hierarchy of importance, information about the ${\zeta}$ function away from the critical strip, such as Euler’s identity

$\displaystyle \zeta(2) = \frac{\pi^2}{6}$

or equivalently

$\displaystyle 1 + \frac{1}{2^2} + \frac{1}{3^2} + \dots = \frac{\pi^2}{6}$

or the infamous identity

$\displaystyle \zeta(-1) = -\frac{1}{12},$

which is often presented (slightly misleadingly, if one’s conventions for divergent summation are not made explicit) as

$\displaystyle 1 + 2 + 3 + \dots = -\frac{1}{12},$

is of relatively little direct importance in analytic prime number theory, although it is still of interest for some other, non-number-theoretic, applications. (The quantity ${\zeta(2)}$ does play a minor role as a normalising factor in some asymptotics, see e.g. Exercise 28 from Notes 1, but its precise value is usually not of major importance.) In contrast, the value ${L(1,\chi)}$ of an ${L}$-function at ${s=1}$ turns out to be extremely important in analytic number theory, with many results in this subject relying ultimately on a non-trivial lower-bound on this quantity coming from Siegel’s theorem, discussed below the fold.
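The two special values above are linked by the functional equation. Assuming the standard asymmetric form ${\zeta(s) = 2^s \pi^{s-1} \sin(\frac{\pi s}{2}) \Gamma(1-s) \zeta(1-s)}$ (which is not restated in this excerpt), the following Python sketch recovers ${\zeta(-1)}$ from a numerical value of ${\zeta(2)}$. The function names and the truncation length are illustrative assumptions.

```python
import math

def zeta_real(s: float, terms: int = 100_000) -> float:
    """zeta(s) for real s > 1: partial sum plus an integral estimate of the tail."""
    partial = sum(n**-s for n in range(1, terms + 1))
    return partial + terms**(1 - s) / (s - 1)

def zeta_via_functional_equation(s: float) -> float:
    """zeta(s) for real s < 0, computed from zeta(1 - s) where 1 - s > 1."""
    return (2**s * math.pi**(s - 1) * math.sin(math.pi * s / 2)
            * math.gamma(1 - s) * zeta_real(1 - s))

print(zeta_real(2.0), math.pi**2 / 6)
print(zeta_via_functional_equation(-1.0), -1 / 12)
```

Evaluating `zeta_via_functional_equation` at a negative even integer returns a value that is zero up to rounding error, since the factor ${\sin(\frac{\pi s}{2})}$ vanishes there; this is the numerical shadow of the trivial zeroes.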

For a more in-depth treatment of the topics in this set of notes, see Davenport’s “Multiplicative number theory”.