A useful rule of thumb in complex analysis is that holomorphic functions {f(z)} behave like large degree polynomials {P(z)}. This can be evidenced for instance at a “local” level by the Taylor series expansion for a complex analytic function in the disk, or at a “global” level by factorisation theorems such as the Weierstrass factorisation theorem (or the closely related Hadamard factorisation theorem). One can truncate these theorems in a variety of ways (e.g., Taylor’s theorem with remainder) to be able to approximate a holomorphic function by a polynomial on various domains.

In some cases it can be convenient instead to work with polynomials {P(Z)} of another variable {Z} such as {Z = e^{2\pi i z}} (or more generally {Z=e^{2\pi i z/N}} for a scaling parameter {N}). In the case of the Riemann zeta function, defined by meromorphic continuation of the formula

\displaystyle  \zeta(s) = \sum_{n=1}^\infty \frac{1}{n^s} \ \ \ \ \ (1)

one ends up having the following heuristic approximation in the neighbourhood of a point {\frac{1}{2}+it} on the critical line:

Heuristic 1 (Polynomial approximation) Let {T \ggg 1} be a height, let {t} be a “typical” element of {[T,2T]}, and let {1 \lll N \ll \log T} be an integer. Let {\phi_t = \phi_{t,T}: {\bf C} \rightarrow {\bf C}} be the linear change of variables

\displaystyle  \phi_t(z) := \frac{1}{2} + it - \frac{2\pi i z}{\log T}.

Then one has an approximation

\displaystyle  \zeta( \phi_t(z) ) \approx P_t( e^{2\pi i z/N} ) \ \ \ \ \ (2)

for {z = o(N)} and some polynomial {P_t = P_{t,T}} of degree {N}.

The requirement {z=o(N)} is necessary since the right-hand side is periodic with period {N} in the {z} variable (or period {\frac{2\pi i N}{\log T}} in the {s = \phi_t(z)} variable), whereas the zeta function is not expected to have any such periodicity, even approximately.

Let us give two non-rigorous justifications of this heuristic. Firstly, it is standard that inside the critical strip (with {\mathrm{Im}(s) = O(T)}) we have an approximate form

\displaystyle  \zeta(s) \approx \sum_{n \leq T} \frac{1}{n^s}

of (1). If we group the integers {n} from {1} to {T} into {N} bins depending on what powers of {T^{1/N}} they lie between, we thus have

\displaystyle  \zeta(s) \approx \sum_{j=0}^N \sum_{T^{j/N} \leq n < T^{(j+1)/N}} \frac{1}{n^s}

For {s = \phi_t(z)} with {z = o(N)} and {T^{j/N} \leq n < T^{(j+1)/N}} we heuristically have

\displaystyle  \frac{1}{n^s} \approx \frac{1}{n^{\frac{1}{2}+it}} e^{2\pi i j z / N}

and so

\displaystyle  \zeta(s) \approx \sum_{j=0}^N a_j(t) (e^{2\pi i z/N})^j

where {a_j(t)} are the partial Dirichlet series

\displaystyle  a_j(t) \approx \sum_{T^{j/N} \leq n < T^{(j+1)/N}} \frac{1}{n^{\frac{1}{2}+it}}. \ \ \ \ \ (3)

This gives the desired polynomial approximation.
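The key step in this first justification can be checked numerically. The following is a minimal sketch, in which the values of {T}, {N}, {t}, {z} and the sample of integers {n} are illustrative assumptions rather than anything taken from the argument above. It confirms that the factorisation {n^{-s} = n^{-\frac{1}{2}-it} e^{2\pi i z \log n/\log T}} is exact for {s = \phi_t(z)}, and that replacing {\frac{\log n}{\log T}} by {j/N} changes the exponent by at most {2\pi |z|/N}, which is small precisely when {z = o(N)}.

```python
import cmath
import math

# Illustrative (assumed) parameters; the heuristic itself wants T large and 1 << N << log T.
T = 10**6
N = 10
t = 1.5e6        # a height in [T, 2T]
z = 0.3 + 0.1j   # a point with |z| small compared to N
LOG_T = math.log(T)

def phi(z):
    """The change of variables phi_t(z) = 1/2 + it - 2*pi*i*z/log T."""
    return 0.5 + 1j * t - 2j * math.pi * z / LOG_T

def bin_index(n):
    """The index j with T^(j/N) <= n < T^((j+1)/N)."""
    return math.floor(N * math.log(n) / LOG_T)

def exact_term(n):
    """n^{-s} with s = phi_t(z)."""
    return cmath.exp(-phi(z) * math.log(n))

def factored_term(n):
    """n^{-1/2-it} * exp(2*pi*i*z*log n/log T); this factorisation is exact."""
    return (cmath.exp(-(0.5 + 1j * t) * math.log(n))
            * cmath.exp(2j * math.pi * z * math.log(n) / LOG_T))

def exponent_error(n):
    """Size of the change in the exponent when log n/log T is replaced by j/N."""
    return abs(2j * math.pi * z * (math.log(n) / LOG_T - bin_index(n) / N))

sample = [2, 17, 999, 1000, 54321, 999999]
max_identity_gap = max(abs(exact_term(n) - factored_term(n)) for n in sample)
max_exponent_error = max(exponent_error(n) for n in sample)
bound = 2 * math.pi * abs(z) / N
```

Here `max_identity_gap` is at the level of rounding error, while `max_exponent_error` stays below `bound`; the latter is the only place where the heuristic loses information.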

A second non-rigorous justification is as follows. From factorisation theorems such as the Hadamard factorisation theorem we expect to have

\displaystyle  \zeta(s) \propto \prod_\rho (s-\rho) \times \dots

where {\rho} runs over the non-trivial zeroes of {\zeta}, and there are some additional factors arising from the trivial zeroes and poles of {\zeta} which we will ignore here; we will also completely ignore the issue of how to renormalise the product to make it converge properly. In the region {s = \frac{1}{2} + it + o( N / \log T) = \phi_t( \{ z: z = o(N) \})}, the dominant contribution to this product (besides multiplicative constants) should arise from zeroes {\rho} that are also in this region. The Riemann-von Mangoldt formula suggests that for “typical” {t} one should have about {N} such zeroes. If one lets {\rho_1,\dots,\rho_N} be any enumeration of {N} zeroes closest to {\frac{1}{2}+it}, and then repeats this set of zeroes periodically by period {\frac{2\pi i N}{\log T}}, one then expects to have an approximation of the form

\displaystyle  \zeta(s) \propto \prod_{j=1}^N \prod_{k \in {\bf Z}} (s-(\rho_j+\frac{2\pi i kN}{\log T}) )

again ignoring all issues of convergence. If one writes {s = \phi_t(z)} and {\rho_j = \phi_t(\lambda_j)}, then Euler’s famous product formula for sine basically gives

\displaystyle  \prod_{k \in {\bf Z}} (s-(\rho_j+\frac{2\pi i kN}{\log T}) ) \propto \prod_{k \in {\bf Z}} (z - (\lambda_j+k N) )

\displaystyle  \propto (e^{2\pi i z/N} - e^{2\pi i \lambda_j/N})

(here we are glossing over some technical issues regarding renormalisation of the infinite products, which can be dealt with by studying the asymptotics as {\mathrm{Im}(z) \rightarrow \infty}) and hence we expect

\displaystyle  \zeta(s) \propto \prod_{j=1}^N (e^{2\pi i z/N} - e^{2\pi i \lambda_j/N}).

This again gives the desired polynomial approximation.
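The use of Euler's product formula here can be sanity-checked numerically. The sketch below is an illustration under stated assumptions: the values of {N}, the rescaled zero `lam`, the point `z` and the truncation level `K` are all hypothetical, and the infinite product is renormalised by the symmetric truncation {w \prod_{0 < |k| \leq K} (1 - \frac{w}{kN})} with {w = z - \lambda}, which is just one possible choice. With this choice the product converges to {\frac{N}{\pi} \sin(\pi w/N)}, which differs from {e^{2\pi i z/N} - e^{2\pi i \lambda/N}} by the explicit non-vanishing factor {2i e^{\pi i (z+\lambda)/N}} (up to the constant {N/\pi}); that exponential factor is one instance of the renormalisation issues being glossed over.

```python
import cmath
import math

def periodised_product(w, N, K):
    """w * prod over 0 < |k| <= K of (1 - w/(kN)).

    Pairing k with -k gives w * prod_{k=1}^K (1 - w^2/(k^2 N^2)),
    which tends to (N/pi) * sin(pi*w/N) as K grows (Euler's sine product).
    """
    prod = w
    for k in range(1, K + 1):
        prod *= 1 - (w * w) / (k * k * N * N)
    return prod

def exponential_form(z, lam, N):
    """e^{2 pi i z/N} - e^{2 pi i lam/N}."""
    return cmath.exp(2j * math.pi * z / N) - cmath.exp(2j * math.pi * lam / N)

N = 5
lam = 0.7 + 0.2j      # hypothetical rescaled zero
z = 1.9 - 0.4j        # hypothetical evaluation point
K = 200000            # truncation level of the product

lhs = periodised_product(z - lam, N, K)
# e^{ia} - e^{ib} = 2i e^{i(a+b)/2} sin((a-b)/2), with a = 2 pi z/N and b = 2 pi lam/N.
factor = (N / math.pi) / (2j * cmath.exp(1j * math.pi * (z + lam) / N))
rhs = factor * exponential_form(z, lam, N)
relative_gap = abs(lhs - rhs) / abs(rhs)
```

The quantity `relative_gap` decays like {|w|^2/(N^2 K)}, reflecting the slow (conditional) convergence of the product.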

Below the fold we give a rigorous version of the second argument suitable for “microscale” analysis. More precisely, we will show

Theorem 2 Let {N = N(T)} be an integer going sufficiently slowly to infinity. Let {W_0 \ll N} go to infinity sufficiently slowly depending on {N}. Let {t} be drawn uniformly at random from {[T,2T]}. Then with probability {1-o(1)} (in the limit {T \rightarrow \infty}), and possibly after adjusting {N} by {1}, there exists a polynomial {P_t(Z)} of degree {N} and obeying the functional equation (9) below, such that

\displaystyle  \zeta( \phi_t(z) ) = (1+o(1)) P_t( e^{2\pi i z/N} ) \ \ \ \ \ (4)

whenever {|z| \leq W_0}.

It should be possible to refine the arguments to extend this theorem to the mesoscale setting by letting {N} be anything growing like {o(\log T)}, and {W_0} anything growing like {o(N)}; also we should be able to delete the need to adjust {N} by {1}. We have not attempted these optimisations here.

Many conjectures and arguments involving the Riemann zeta function can be heuristically translated into arguments involving the polynomials {P_t(Z)}, which one can view as random degree {N} polynomials if {t} is interpreted as a random variable drawn uniformly at random from {[T,2T]}. These can be viewed as providing a “toy model” for the theory of the Riemann zeta function, in which the complex analysis is simplified to the study of the zeroes and coefficients of this random polynomial (for instance, the role of the gamma function is now played by a monomial in {Z}). This model also makes the zeta function theory more closely resemble the function field analogues of this theory (in which the analogue of the zeta function is also a polynomial (or a rational function) in some variable {Z}, as per the Weil conjectures). The parameter {N} is at our disposal to choose, and reflects the scale {\approx N/\log T} at which one wishes to study the zeta function. For “macroscopic” questions, at which one wishes to understand the zeta function at unit scales, it is natural to take {N \approx \log T} (or very slightly larger), while for “microscopic” questions one would take {N} close to {1} and only growing very slowly with {T}. For the intermediate “mesoscopic” scales one would take {N} somewhere between {1} and {\log T}. Unfortunately, the statistical properties of {P_t} are only understood well at a conjectural level at present; even if one assumes the Riemann hypothesis, our understanding of {P_t} is largely restricted to the computation of low moments (e.g., the second or fourth moments) of various linear statistics of {P_t} and related functions (e.g., {1/P_t}, {P'_t/P_t}, or {\log P_t}).

Let’s now heuristically explore the polynomial analogues of this theory in a bit more detail. The Riemann hypothesis basically corresponds to the assertion that all the {N} zeroes of the polynomial {P_t(Z)} lie on the unit circle {|Z|=1} (which, after the change of variables {Z = e^{2\pi i z/N}}, corresponds to {z} being real); in a similar vein, the GUE hypothesis corresponds to {P_t(Z)} having the asymptotic law of a random scalar {a_N(t)} times the characteristic polynomial of a random unitary {N \times N} matrix. Next, we consider what happens to the functional equation

\displaystyle  \zeta(s) = \chi(s) \zeta(1-s) \ \ \ \ \ (5)

where

\displaystyle  \chi(s) := 2^s \pi^{s-1} \sin(\frac{\pi s}{2}) \Gamma(1-s).

A routine calculation involving Stirling’s formula reveals that

\displaystyle  \chi(\frac{1}{2}+it) = (1+o(1)) e^{-2\pi i L(t)} \ \ \ \ \ (6)

with {L(t) := \frac{t}{2\pi} \log \frac{t}{2\pi} - \frac{t}{2\pi} + \frac{7}{8}}; one also has the closely related approximation

\displaystyle  \frac{\chi'}{\chi}(s) = -\log T + O(1) \ \ \ \ \ (7)

and hence

\displaystyle  \chi(\phi_t(z)) = (1+o(1)) e^{-2\pi i L(t)} e^{2\pi i z} \ \ \ \ \ (8)

when {z = o(\log T)}. Since {\zeta(1-s) = \overline{\zeta(\overline{1-s})}}, applying (5) with {s = \phi_t(z)} and using the approximation (2) suggests a functional equation for {P_t}:

\displaystyle  P_t(e^{2\pi i z/N}) = e^{-2\pi i L(t)} e^{2\pi i z} \overline{P_t(e^{2\pi i \overline{z}/N})}

or in terms of {Z := e^{2\pi i z/N}},

\displaystyle  P_t(Z) = e^{-2\pi i L(t)} Z^N \overline{P_t}(1/Z) \ \ \ \ \ (9)

where {\overline{P_t}(Z) := \overline{P_t(\overline{Z})}} is the polynomial {P_t} with all the coefficients replaced by their complex conjugate. Thus if we write

\displaystyle  P_t(Z) = \sum_{j=0}^N a_j(t) Z^j

then the functional equation can be written as

\displaystyle  a_j(t) = e^{-2\pi i L(t)} \overline{a_{N-j}(t)}.
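To illustrate this coefficient form of the functional equation, here is a minimal sketch in which the zero set is a hypothetical example (two zeroes on the unit circle and one pair related by the inversion {Z \mapsto 1/\overline{Z}}) and the polynomial is normalised to be monic; neither choice comes from the discussion above. For such a polynomial the unit phase {a_0/\overline{a_N}} plays the role of {e^{-2\pi i L(t)}}.

```python
import cmath

def poly_from_roots(roots):
    """Coefficients [a_0, ..., a_N] of the monic polynomial prod (Z - root)."""
    coeffs = [1 + 0j]
    for r in roots:
        shifted = [0j] + coeffs                     # Z * (current polynomial)
        scaled = [-r * c for c in coeffs] + [0j]    # -r * (current polynomial)
        coeffs = [s + m for s, m in zip(shifted, scaled)]
    return coeffs

# Hypothetical zero set, closed under the inversion Z -> 1/conj(Z).
pair = 1.3 * cmath.exp(0.9j)
roots = [cmath.exp(0.4j), cmath.exp(2.5j), pair, 1 / pair.conjugate()]

a = poly_from_roots(roots)
N = len(a) - 1
phase = a[0] / a[N].conjugate()     # plays the role of e^{-2 pi i L(t)}

# Defect in the relation a_j = phase * conj(a_{N-j}); it should vanish up to rounding.
max_defect = max(abs(a[j] - phase * a[N - j].conjugate()) for j in range(N + 1))
```

The converse direction is the one used in the text: a polynomial whose coefficients obey this symmetry has a zero set that is closed under the same inversion.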

We remark that if we use the heuristic (3) (interpreting the cutoffs in the {n} summation in a suitably vague fashion) then this equation can be viewed as an instance of the Poisson summation formula.

Another consequence of the functional equation is that the zeroes of {P_t} are symmetric with respect to inversion {Z \mapsto 1/\overline{Z}} across the unit circle. This is of course consistent with the Riemann hypothesis, but does not obviously imply it. The phase {L(t)} is of little consequence in this functional equation; one could easily conceal it by working with the phase rotation {e^{\pi i L(t)} P_t} of {P_t} instead.

A further consequence of the functional equation is that {e^{\pi i L(t)} e^{-i N \theta/2} P_t(e^{i\theta})} is real for any {\theta \in {\bf R}}; the same is then true for the derivative {e^{\pi i L(t)} e^{-i N \theta/2} (i e^{i\theta} P'_t(e^{i\theta}) - i \frac{N}{2} P_t(e^{i\theta}))}. Among other things, this implies that {P'_t(e^{i\theta})} cannot vanish unless {P_t(e^{i\theta})} does also; thus the zeroes of {P'_t} will not lie on the unit circle except where {P_t} has repeated zeroes. The analogous statement is true for {\zeta}; the zeroes of {\zeta'} will not lie on the critical line except where {\zeta} has repeated zeroes.
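These reality statements are easy to test on a concrete example. The sketch below assumes the hypothetical polynomial {P(Z) = (Z-2)(Z-\frac{1}{2})(Z-1)(Z+1) = Z^4 - 2.5 Z^3 + 2.5 Z - 1}, which obeys the coefficient symmetry with phase {-1}, so that one may take {L = 1/2}. It evaluates {e^{\pi i L} e^{-iN\theta/2} P(e^{i\theta})} and its {\theta}-derivative {e^{\pi i L} e^{-iN\theta/2} (i e^{i\theta} P'(e^{i\theta}) - i\frac{N}{2} P(e^{i\theta}))} at sample points of the circle. For this particular example the first quantity works out by hand to {5 \sin \theta - 2 \sin 2\theta}.

```python
import cmath
import math

# Hypothetical example: P(Z) = Z^4 - 2.5 Z^3 + 2.5 Z - 1, with a_j = -conj(a_{N-j}).
a = [-1.0, 2.5, 0.0, -2.5, 1.0]     # a[j] multiplies Z^j
N = len(a) - 1
L = 0.5                             # chosen so that e^{-2 pi i L} = -1

def P(Z):
    return sum(c * Z**j for j, c in enumerate(a))

def dP(Z):
    return sum(j * c * Z**(j - 1) for j, c in enumerate(a) if j > 0)

def rotated(theta):
    """e^{pi i L} e^{-i N theta/2} P(e^{i theta})."""
    Z = cmath.exp(1j * theta)
    return cmath.exp(1j * math.pi * L) * cmath.exp(-0.5j * N * theta) * P(Z)

def rotated_derivative(theta):
    """The theta-derivative of rotated(theta), via the product and chain rules."""
    Z = cmath.exp(1j * theta)
    return (cmath.exp(1j * math.pi * L) * cmath.exp(-0.5j * N * theta)
            * (1j * Z * dP(Z) - 0.5j * N * P(Z)))

thetas = [2 * math.pi * k / 97 for k in range(97)]
max_imag = max(abs(rotated(th).imag) for th in thetas)
max_imag_derivative = max(abs(rotated_derivative(th).imag) for th in thetas)
```

Both imaginary parts vanish up to rounding error, as expected.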

Relating to this fact, it is a classical result of Speiser that the Riemann hypothesis is true if and only if all the zeroes of the derivative {\zeta'} of the zeta function in the critical strip lie on or to the right of the critical line. The analogous result for polynomials is

Proposition 3 We have

\displaystyle  \# \{ |Z| = 1: P_t(Z) = 0 \} = N - 2 \# \{ |Z| > 1: P'_t(Z) = 0 \}

(where all zeroes are counted with multiplicity.) In particular, the zeroes of {P_t(Z)} all lie on the unit circle if and only if the zeroes of {P'_t(Z)} lie in the closed unit disk.

Proof: From the functional equation we have

\displaystyle  \# \{ |Z| = 1: P_t(Z) = 0 \} = N - 2 \# \{ |Z| > 1: P_t(Z) = 0 \}.

Thus it will suffice to show that {P_t} and {P'_t} have the same number of zeroes outside the closed unit disk.

Set {f(z) := z \frac{P'_t(z)}{P_t(z)}}; then {f} is a rational function that does not have a zero or pole at infinity. For {e^{i\theta}} not a zero of {P_t}, we have already seen that {e^{\pi i L(t)} e^{-i N \theta/2} P_t(e^{i\theta})} and {e^{\pi i L(t)} e^{-i N \theta/2} (i e^{i\theta} P'_t(e^{i\theta}) - i \frac{N}{2} P_t(e^{i\theta}))} are real, so on dividing we see that {i f(e^{i\theta}) - \frac{iN}{2}} is always real, that is to say

\displaystyle  \mathrm{Re} f(e^{i\theta}) = \frac{N}{2}.

(This can also be seen by writing {f(e^{i\theta}) = \sum_\lambda \frac{1}{1-e^{-i\theta} \lambda}}, where {\lambda} runs over the zeroes of {P_t}, and using the fact that these zeroes are symmetric with respect to reflection across the unit circle.) When {e^{i\theta}} is a zero of {P_t}, {f(z)} has a simple pole at {e^{i\theta}} with residue a positive multiple of {e^{i\theta}}, and so {f(z)} stays on the right half-plane if one traverses a semicircular arc around {e^{i\theta}} outside the unit disk. From this and continuity we see that {f} stays on the right-half plane in a circle slightly larger than the unit circle, and hence by the argument principle it has the same number of zeroes and poles outside of this circle, giving the claim. \Box
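Proposition 3 and the identity {\mathrm{Re} f(e^{i\theta}) = \frac{N}{2}} can both be checked on a small example. The following sketch again assumes the hypothetical polynomial {Z^4 - 2.5 Z^3 + 2.5 Z - 1}, with zeroes {2, \frac{1}{2}, 1, -1}. In the spirit of the proof, zeroes are counted via the argument principle (a numerically computed winding number) rather than by root-finding. The radii {0.9} and {1.1} used to isolate the zeroes on the unit circle are an ad hoc numerical tolerance suited to this example only.

```python
import cmath
import math

def poly_eval(coeffs, z):
    """Horner evaluation; coeffs[j] multiplies z**j."""
    acc = 0j
    for c in reversed(coeffs):
        acc = acc * z + c
    return acc

def derivative(coeffs):
    """Coefficients of the derivative polynomial."""
    return [j * c for j, c in enumerate(coeffs)][1:]

def zeros_inside(coeffs, radius, samples=4096):
    """Number of zeroes in |Z| < radius, as the winding number of the image of the circle."""
    total = 0.0
    prev = poly_eval(coeffs, radius)
    for k in range(1, samples + 1):
        cur = poly_eval(coeffs, radius * cmath.exp(2j * math.pi * k / samples))
        total += cmath.phase(cur / prev)    # phase increment, assumed smaller than pi
        prev = cur
    return round(total / (2 * math.pi))

a = [-1.0, 2.5, 0.0, -2.5, 1.0]     # hypothetical P, zeroes at 2, 1/2, 1, -1
N = len(a) - 1
da = derivative(a)

on_circle = zeros_inside(a, 1.1) - zeros_inside(a, 0.9)
critical_outside = (N - 1) - zeros_inside(da, 1.0)

def f(Z):
    """f(Z) = Z P'(Z)/P(Z)."""
    return Z * poly_eval(da, Z) / poly_eval(a, Z)

# Sample points on the unit circle avoiding the zeroes at theta = 0 and theta = pi.
real_parts = [f(cmath.exp(2j * math.pi * k / 97)).real for k in range(1, 97)]
```

In this example two zeroes of {P} lie on the unit circle and exactly one zero of {P'} lies outside the closed unit disk, in agreement with {2 = 4 - 2 \times 1}.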

From the functional equation and the chain rule, {Z} is a zero of {P'_t} if and only if {1/\overline{Z}} is a zero of {N P_t - P'_t}. We can thus write the above proposition in the equivalent form

\displaystyle  \# \{ |Z| = 1: P_t(Z) = 0 \} = N - 2 \# \{ |Z| < 1: NP_t(Z) - P'_t(Z) = 0 \}.

One can use this identity to get a lower bound on the number of zeroes of {P_t} by the method of mollifiers. Namely, for any other polynomial {M_t}, we clearly have

\displaystyle  \# \{ |Z| = 1: P_t(Z) = 0 \}

\displaystyle \geq N - 2 \# \{ |Z| < 1: M_t(Z)(NP_t(Z) - P'_t(Z)) = 0 \}.

By Jensen’s formula, we have for any {r>1} that

\displaystyle  \log |M_t(0)| |NP_t(0)-P'_t(0)|

\displaystyle \leq -(\log r) \# \{ |Z| < 1: M_t(Z)(NP_t(Z) - P'_t(Z)) = 0 \}

\displaystyle + \frac{1}{2\pi} \int_0^{2\pi} \log |M_t(re^{i\theta})(NP_t(re^{i\theta}) - P'_t(re^{i\theta}))|\ d\theta.
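For readers who want to see this form of Jensen's formula in action, here is a minimal numerical sketch. The quadratic `Q` below is a hypothetical stand-in for {M_t(Z)(NP_t(Z) - P'_t(Z))}, and the radius {r = 2} is an arbitrary choice. Jensen's formula states that the circular mean of {\log |Q|} equals {\log |Q(0)| + \sum_{|w| < r} \log \frac{r}{|w|}}, where {w} runs over the zeroes of {Q}; the inequality above follows because each zero in the unit disk contributes at least {\log r} to this sum, and the remaining zeroes contribute non-negatively.

```python
import cmath
import math

def mean_log_modulus(Q, r, samples=4096):
    """(1/2pi) * integral of log|Q(r e^{i theta})| d theta, by the periodic trapezoid rule."""
    total = 0.0
    for k in range(samples):
        total += math.log(abs(Q(r * cmath.exp(2j * math.pi * k / samples))))
    return total / samples

# Hypothetical test polynomial with one zero inside the unit disk and one outside |Z| = r.
zeros = [0.5, 3.0]

def Q(Z):
    return (Z - zeros[0]) * (Z - zeros[1])

r = 2.0
lhs = math.log(abs(Q(0)))                          # log|Q(0)|
mean = mean_log_modulus(Q, r)                      # circular mean of log|Q|
jensen_exact = lhs + sum(math.log(r / abs(w)) for w in zeros if abs(w) < r)
inside_unit_disk = sum(1 for w in zeros if abs(w) < 1)
upper = -math.log(r) * inside_unit_disk + mean     # right-hand side of the inequality
```

Here `mean` agrees with `jensen_exact` (both equal {\log 6}), and `lhs` {= \log 1.5} is indeed at most `upper` {= \log 3}.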

We therefore have

\displaystyle  \# \{ |Z| = 1: P_t(Z) = 0 \} \geq N + \frac{2}{\log r} \log |M_t(0)| |NP_t(0)-P'_t(0)|

\displaystyle - \frac{1}{\log r} \frac{1}{2\pi} \int_0^{2\pi} \log |M_t(re^{i\theta})(NP_t(re^{i\theta}) - P'_t(re^{i\theta}))|^2\ d\theta.

As the logarithm function is concave, we can apply Jensen’s inequality to conclude

\displaystyle  {\bf E} \# \{ |Z| = 1: P_t(Z) = 0 \} \geq N

\displaystyle + {\bf E} \frac{2}{\log r} \log |M_t(0)| |NP_t(0)-P'_t(0)|

\displaystyle - \frac{1}{\log r} \log \left( \frac{1}{2\pi} \int_0^{2\pi} {\bf E} |M_t(re^{i\theta})(NP_t(re^{i\theta}) - P'_t(re^{i\theta}))|^2\ d\theta\right).

where the expectation is over the {t} parameter. It turns out that by choosing the mollifier {M_t} carefully in order to make {M_t P_t} behave like the function {1} (while keeping the degree of {M_t} small enough that one can compute the second moment here), and then optimising in {r}, one can use this inequality to get a positive fraction of zeroes of {P_t} on the unit circle on average. This is the polynomial analogue of a classical argument of Levinson, who used this to show that at least one third of the zeroes of the Riemann zeta function are on the critical line; all later improvements on this fraction have been based on some version of Levinson’s method, mainly focusing on more advanced choices for the mollifier {M_t} and of the differential operator {N - \partial_z} that implicitly appears in the above approach. (The most recent lower bound I know of is {0.4191637}, due to Pratt and Robles. In principle (as observed by Farmer) this bound can get arbitrarily close to {1} if one is allowed to use arbitrarily long mollifiers, but establishing this seems of comparable difficulty to unsolved problems such as the pair correlation conjecture; see this paper of Radziwill for more discussion.) A variant of these techniques can also establish “zero density estimates” of the following form: for any {W \geq 1}, the number of zeroes of {P_t} that lie further than {\frac{W}{N}} from the unit circle is of order {O( e^{-cW} N )} on average for some absolute constant {c>0}. Thus, roughly speaking, most zeroes of {P_t} lie within {O(1/N)} of the unit circle. (Analogues of these results for the Riemann zeta function were worked out by Selberg, by Jutila, and by Conrey, with increasingly strong values of {c}.)

The zeroes of {P'_t} tend to live somewhat closer to the origin than the zeroes of {P_t}. Suppose for instance that we write

\displaystyle  P_t(Z) = \sum_{j=0}^N a_j(t) Z^j = a_N(t) \prod_{j=1}^N (Z - \lambda_j)

where {\lambda_1,\dots,\lambda_N} are the zeroes of {P_t(Z)}, then by evaluating at zero we see that

\displaystyle  \lambda_1 \dots \lambda_N = (-1)^N a_0(t) / a_N(t)

and the right-hand side is of unit magnitude by the functional equation. However, if we differentiate

\displaystyle  P'_t(Z) = \sum_{j=1}^N a_j(t) j Z^{j-1} = N a_N(t) \prod_{j=1}^{N-1} (Z - \lambda'_j)

where {\lambda'_1,\dots,\lambda'_{N-1}} are the zeroes of {P'_t}, then by evaluating at zero we now see that

\displaystyle  \lambda'_1 \dots \lambda'_{N-1} = (-1)^{N-1} a_1(t) / N a_N(t).

The right-hand side would now be typically expected to be of size {O(1/N) \approx \exp(- \log N)}, and so on average we expect the {\lambda'_j} to have magnitude like {\exp( - \frac{\log N}{N} )}, that is to say pushed inwards from the unit circle by a distance roughly {\frac{\log N}{N}}. The analogous result for the Riemann zeta function is that the zeroes of {\zeta'(s)} at height {\sim T} lie at a distance roughly {\frac{\log\log T}{\log T}} to the right of the critical line on the average; see this paper of Levinson and Montgomery for a precise statement.
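The arithmetic behind this heuristic is short enough to tabulate. The sketch below assumes the normalisation {|a_1(t)| = |a_N(t)| = 1}, which is only a stand-in for the "typical" size discussed above. It compares the geometric mean {(\frac{|a_1|}{N |a_N|})^{1/(N-1)}} of the magnitudes {|\lambda'_j|} with the approximation {\exp(-\frac{\log N}{N})}.

```python
import math

def geometric_mean_modulus(a1, aN, N):
    """Geometric mean of |lambda'_j|, using |prod lambda'_j| = |a_1| / (N |a_N|)."""
    return (abs(a1) / (N * abs(aN))) ** (1.0 / (N - 1))

def heuristic_modulus(N):
    """The approximation exp(-log N / N)."""
    return math.exp(-math.log(N) / N)

# Assumed normalisation |a_1| = |a_N| = 1 (a hypothetical "typical" case).
table = {N: (geometric_mean_modulus(1.0, 1.0, N), heuristic_modulus(N))
         for N in (10, 100, 1000)}
```

The two columns of `table` agree to within about {\frac{\log N}{N^2}}, and both approach {1} from below as {N} grows.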

— 1. An exact factorisation of {\zeta} —

In this section we give an exact factorisation of {\zeta} into a “mesoscopic” part involving a finite Dirichlet series relating to the von Mangoldt function, and a “microscopic” part involving nearby zeroes; this can be viewed as an interpolant between the classical formula

\displaystyle  \zeta(s) = \exp( \sum_n \frac{\Lambda(n)}{n^s \log n} )

for {\mathrm{Re}(s)>1} on one hand, and the Weierstrass or Hadamard factorisations on the other. This factorisation will be useful in the eventual proof of Theorem 2. (UPDATE: I have since learned that this factorisation was previously introduced by Gonek, Hughes, and Keating (see also this later paper of Gonek), for essentially the same purposes as in this post.)

The starting point will be the explicit formula, which we use in the form

\displaystyle  \sum_n \Lambda(n) g( \log n ) = - \sum_\rho^* \hat g(-i\rho) \ \ \ \ \ (10)

for any test function {g} supported in {(0,\infty)}, where {\Lambda} is the von Mangoldt function, {\hat g} is the Fourier transform

\displaystyle  \hat g(t) := \int_0^\infty g(u) e^{itu}\ du,

and {\sum_\rho^*} sums over all zeroes of the Riemann zeta function (both trivial and non-trivial), together with the pole at {s=1} counted with multiplicity {-1}. See for instance Exercise 46 of this previous blog post. If {\varphi: {\bf R} \rightarrow {\bf C}} is a test function that equals {1} near the origin, and {s \in {\bf C}} has sufficiently large real part, then this formula, together with a limiting argument, implies that

\displaystyle  \sum_n \frac{\Lambda(n)}{n^s} (1 - \varphi(\log n)) = - \sum_\rho^* G_\varphi(s-\rho)

where

\displaystyle  G_\varphi(s) := \int_0^\infty (1-\varphi(u)) e^{-su}\ du

is the Laplace transform of {1-\varphi}. Since

\displaystyle  \sum_n \frac{\Lambda(n)}{n^s} = -\frac{\zeta'(s)}{\zeta(s)}

we conclude that

\displaystyle  \frac{\zeta'(s)}{\zeta(s)} = - \sum_n \frac{\Lambda(n)}{n^s} \varphi(\log n) + \sum_\rho^* G_\varphi(s-\rho) \ \ \ \ \ (11)

when the real part of {s} is sufficiently large. We can integrate by parts to write

\displaystyle  G_\varphi(s) = -\frac{1}{s} \int_0^\infty \varphi'(u) e^{-su}\ du,

and now it is clear that {G_\varphi(s)} extends meromorphically to the entire complex plane with a simple pole at {s=0} with residue {-\int_0^\infty \varphi'(u)\ du = 1}; also, since {\varphi'} is smooth and supported in some compact subinterval {[c_1,c_2]} of {(0,+\infty)}, we see that we have estimates of the form

\displaystyle  |G_\varphi(s)| \ll_{A,\varphi} \frac{1}{|s|} (1 + |\mathrm{Im}(s)|)^{-A} (e^{-c_1 \mathrm{Re}(s)} + e^{-c_2 \mathrm{Re}(s)}) \ \ \ \ \ (12)

for any {A \geq 0}, thus {G_\varphi} decays exponentially as {\mathrm{Re}(s) \rightarrow +\infty} and is rapidly decreasing as {\mathrm{Im}(s) \rightarrow \pm \infty}. (If {\varphi} is not merely smooth, but is in fact in a Gevrey class, one can improve the {(1 + |\mathrm{Im}(s)|)^{-A}} factor to {e^{-c |\mathrm{Im}(s)|^{1/\alpha}}} for some {c>0} and {\alpha>1}. This is basically the maximum decay one can hope for here thanks to the Paley-Wiener theorem (or the more advanced Beurling-Malliavin theorem). However, we will not need such strong decay here.) Both sides of (11) now extend meromorphically to the entire complex plane and so the identity (11) holds for all {s} other than the zeroes and poles of {\zeta}.
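The integration by parts and the residue computation for {G_\varphi} can be verified numerically for real {s > 0}. The sketch below makes two assumptions that are not in the text: the cutoff {\varphi} is taken to equal {1} on {[0,1]}, {0} on {[2,\infty)}, and a cubic "smoothstep" in between (this is only {C^1} rather than smooth, which suffices for the integration by parts but not for the rapid decay in (12)), and the integrals are computed with a basic Simpson rule.

```python
import math

# Assumed cutoff: phi = 1 on [0,1], phi = 0 on [2,oo), cubic smoothstep in between.
def one_minus_phi(u):
    if u <= 1.0:
        return 0.0
    if u >= 2.0:
        return 1.0
    v = u - 1.0
    return 3 * v * v - 2 * v ** 3

def phi_prime(u):
    if u <= 1.0 or u >= 2.0:
        return 0.0
    v = u - 1.0
    return -(6 * v - 6 * v * v)

def simpson(func, a, b, n=2000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    total = func(a) + func(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * func(a + k * h)
    return total * h / 3

def G_direct(s):
    """Laplace transform of 1 - phi: the piece on [1,2] plus the exact tail e^{-2s}/s."""
    body = simpson(lambda u: one_minus_phi(u) * math.exp(-s * u), 1.0, 2.0)
    return body + math.exp(-2 * s) / s

def G_by_parts(s):
    """-(1/s) * integral of phi'(u) e^{-su} du."""
    return -simpson(lambda u: phi_prime(u) * math.exp(-s * u), 1.0, 2.0) / s

gaps = [abs(G_direct(s) - G_by_parts(s)) for s in (0.5, 1.0, 3.0)]
residue_estimate = 1e-3 * G_by_parts(1e-3)     # s * G(s) for small s, close to 1
```

The two expressions agree to high accuracy, and {s G_\varphi(s)} approaches the residue {1} as {s \rightarrow 0}.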

By taking an antiderivative of {G_\varphi} and then exponentiating, we may write

\displaystyle  G_\varphi(s) = \frac{H'_\varphi(s)}{H_\varphi(s)}

for some entire function {H_\varphi(s)} that has a simple zero at {s=0} and no other zeroes, and converges to {1} at {+\infty}; one can express {H_\varphi} explicitly in terms of the exponential integral {E_1(z) := \int_z^\infty \frac{e^{-t}}{t}\ dt} as

\displaystyle  H_\varphi(s) = \exp( \int_0^\infty \varphi'(u) E_1(su)\ du )

for {|\mathrm{Arg}(s)| < \pi}, and then extended continuously to the negative real axis. From (12) one has the bounds

\displaystyle  |H_\varphi(s)| \ll_{r,\varphi} |s|

for {|s| \leq r}, and

\displaystyle  H_\varphi(s) = \exp( O_{A,r,\varphi}( (1 + |\mathrm{Im}(s)|)^{-A} (e^{-c_1 \mathrm{Re}(s)} + e^{-c_2 \mathrm{Re}(s)}) ) ) \ \ \ \ \ (13)

for {|s| \geq r} and any {A \geq 0}, for any {r>0}.

From (11) we have

\displaystyle  \frac{\zeta'(s)}{\zeta(s)} = - \sum_n \frac{\Lambda(n)}{n^s} \varphi(\log n) + \sum_\rho^* \frac{H'_\varphi}{H_\varphi}(s-\rho)

and we can integrate this to obtain

\displaystyle  \log \zeta(s) = \sum_n \frac{\Lambda(n)}{n^s \log n} \varphi(\log n) + \sum_\rho^* \log H_\varphi(s-\rho)

at least when the real part of {s} is large enough (and we choose branches of {\log \zeta} and {\log H} to vanish at {+\infty}); one can justify the interchange of summation and integration using (13) (and the fact that {\zeta} is of order {1}). We can then exponentiate to conclude the formula

\displaystyle  \zeta(s) = \exp( \sum_n \frac{\Lambda(n)}{n^s \log n} \varphi(\log n) ) \prod_\rho^* H_\varphi(s-\rho). \ \ \ \ \ (14)

for {\mathrm{Re} s} sufficiently large (where the factor at the pole {\rho=1} is {H_\varphi(s-1)^{-1}} due to the negative multiplicity); the right-hand side extends meromorphically to the entire complex plane, so (14) in fact holds for all {s \neq 1}.

Rescaling {\varphi(u)} to {\varphi(u/\log R)} (which rescales {H_\varphi(s)} to {H_\varphi(s \log R)}) for any {R > 1}, we obtain the generalisation

\displaystyle  \zeta(s) = \exp( \sum_n \frac{\Lambda(n)}{n^s \log n} \varphi(\frac{\log n}{\log R}) ) \prod_\rho^* H_\varphi((s-\rho) \log R) \ \ \ \ \ (15)

for all {s \neq 1}. While this formula is valid in the entire complex plane (other than the pole {s=1}), it is most useful to the right of the critical line, where most of the {H_\varphi} factors become close to {1} thanks to (13).

— 2. A microscale zero-free region —

Let {\Sigma} denote the non-trivial zeroes {\rho} of zeta (counting multiplicity), and let {N(T)} denote the number of elements {\rho} of {\Sigma} with {0 \leq \mathrm{Im}(\rho) \leq T}. The Riemann-von Mangoldt formula then gives the asymptotic

\displaystyle  N(T) = \frac{T}{2\pi} \log \frac{T}{2\pi} - \frac{T}{2\pi} + O(\log T). \ \ \ \ \ (16)
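As a quick sanity check of (16) at small heights, one can compare the main term with an actual count of zeroes. The sketch below hard-codes the ordinates of the first ten non-trivial zeroes (standard tabulated values, rounded to six decimals), so the count it produces is only meaningful for {T \leq 50}; the choice of test heights is arbitrary.

```python
import math

# Ordinates of the first ten non-trivial zeroes (standard tabulated values, rounded).
ZERO_ORDINATES = [14.134725, 21.022040, 25.010858, 30.424876, 32.935062,
                  37.586178, 40.918719, 43.327073, 48.005151, 49.773832]

def main_term(T):
    """(T/2pi) log(T/2pi) - T/2pi, the main term in (16)."""
    x = T / (2 * math.pi)
    return x * math.log(x) - x

def count_zeroes(T):
    """N(T) for T <= 50, counted from the hard-coded list above."""
    return sum(1 for gamma in ZERO_ORDINATES if 0 <= gamma <= T)

comparison = {T: (count_zeroes(T), main_term(T)) for T in (30, 40, 50)}
```

At these heights the count exceeds the main term by between {0.3} and {1.5}, comfortably within the {O(\log T)} error term.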

For any {1/2 < \sigma < 1}, let {N(\sigma,T)} denote the number of elements {\rho} of {\Sigma} with {\mathrm{Re}(\rho)>\sigma} and {0 \leq \mathrm{Im}(\rho) \leq T}. Clearly {N(\sigma,T) \leq N(T) \ll T \log T}. The Riemann hypothesis asserts that {N(\sigma,T)=0}. The Density hypothesis (in the log-free form) asserts the weaker bound that

\displaystyle  N(\sigma,T) \ll T^{-2(\sigma-1/2)} T \log T.

This remains open; however bounds of the form

\displaystyle  N(\sigma,T) \ll T^{-c(\sigma-1/2)} T \log T \ \ \ \ \ (17)

are known unconditionally for some {c>0}. This was first achieved by Selberg for {c=1/4}, by Jutila for any {c<1/2}, and by Conrey for any {c < 4/7}. However, for our analysis any positive value of {c} will suffice.

As a corollary of (17) (and (16)), we see that for any {W \geq 0}, there are {O( e^{-cW} T \log T )} zeroes {\rho \in \Sigma} with {|\mathrm{Im}(\rho)| \leq 3T} and {\mathrm{Re}(\rho) > \frac{1}{2} + \frac{W}{\log T}}. If we denote this set of zeroes by {\Sigma_W}, then by the Hardy-Littlewood maximal inequality (applied to the sum of Dirac masses at the imaginary parts of these zeroes), for any {\lambda > 0}, the event

\displaystyle  \sup_{H>0} \frac{1}{H} \# \{ \rho \in \Sigma_W: |\mathrm{Im} \rho - t| \leq H\} \geq \lambda

holds with probability {O( \frac{1}{\lambda} e^{-cW} \log T )}. Setting {\lambda = e^{-cW/2} \log T} and then taking the union bound over {W \geq W_0/2} that are powers of two, we conclude that with probability {1-o(1)}, one has

\displaystyle  \# \{ \rho \in \Sigma_W: |\mathrm{Im} \rho - t| \leq H\} \leq H e^{-cW/2} \log T

for all {H>0} and all {W \geq W_0/2} that are a power of two. From this, (16), and renormalising {H = h / \log T}, we thus have with probability {1-o(1)} that

\displaystyle  \# \{ \rho \in \Sigma: |\mathrm{Im} \rho - t| \leq \frac{h}{\log T}; \mathrm{Re}(\rho) \geq \frac{1}{2} + \frac{W}{\log T} \} \ll h e^{-cW/2} \ \ \ \ \ (18)

for all {h > 0} and {W \geq W_0}.

A similar argument (using the Riemann-von Mangoldt formula) shows that with probability {1-o(1)}, one has

\displaystyle  \# \{ \rho \in \Sigma: |\mathrm{Im} \rho - t| \leq \frac{h}{\log T} \} \ll h \log W_0 \ \ \ \ \ (19)

for all {h > 0} (in fact one could replace {\log W_0} here by any other quantity that goes to infinity). We will improve this bound later (after discarding another exceptional event of {t}’s).

Henceforth we restrict to the event that (18), (19) both hold. Since the left-hand side of (18) cannot go below one without vanishing entirely, we now have a “microscale” zero-free region

\displaystyle  \{ \phi_t(x+iy): y \geq \frac{W_0}{2\pi}, |x| \leq e^{-cy} \}

(say) for the Riemann zeta function. If we define the rescaled zero set

\displaystyle  \tilde \Sigma := \phi_t^{-1}(\Sigma) \ \ \ \ \ (20)

then we can rescale (18), (19) to be

\displaystyle  \# \{ \lambda \in \tilde \Sigma: |\mathrm{Re} \lambda| \leq h; \mathrm{Im} \lambda \geq W \} \ll h e^{-cW/2} \ \ \ \ \ (21)

for all {h > 0} and {W \geq W_0}, and

\displaystyle  \# \{ \lambda \in \tilde \Sigma: |\mathrm{Re} \lambda| \leq h \} \ll h \log W_0 \ \ \ \ \ (22)

for all {h>0}.

After shrinking the region a little bit, we have a quite precise formula for {\zeta} in this region:

Proposition 4 If {s = \phi_t(x+iy)} lies in the region

\displaystyle  \{ \phi_t(x+iy): y \geq \frac{W_0}{\pi}, |x| \leq e^{-cy}/2 \}

and {0 < \varepsilon < c}, then

\displaystyle  \log \zeta(s) = \sum_n \frac{\Lambda(n)}{n^s \log n} \varphi(\frac{\log n}{\log T^\varepsilon}) + O_{\varepsilon}(e^{-c' \varepsilon y} \log W_0) \ \ \ \ \ (23)

for some absolute constant {c'>0}. (We allow implied constants to depend on {\varphi}.)

Proof: We apply (15) with {R := T^\varepsilon}. We have

\displaystyle  H_\varphi( (s-1) \log R ) = \exp( O( T^{-100} ) )

and

\displaystyle  H_\varphi( (s+2n) \log R ) = \exp( O( T^{-100} n^{-100} ) )

for all {n=1,2,\dots}, so

\displaystyle  \prod_\rho^* H_\varphi((s-\rho) \log R) = (1+o(1)) \prod_{\rho \in \Sigma} H_\varphi( (s-\rho) \varepsilon \log T ).

To evaluate the product, we write {\rho = \phi_t(\lambda)}, so that

\displaystyle \prod_{\rho \in \Sigma} H_\varphi( (s-\rho) \varepsilon \log T ) = \prod_{\lambda \in \tilde \Sigma} H_\varphi( 2\pi \varepsilon (x+iy-\lambda) ).

We first consider those rescaled zeroes {\lambda} for which {\mathrm{Im} \lambda \leq y / 2}. Here we see from (13) and the triangle inequality that

\displaystyle  H_\varphi( 2\pi \varepsilon(x+iy-\lambda) ) = \exp( O_\varepsilon( (1 + |\mathrm{Im}(\lambda)|)^{-10} e^{-c_1 \varepsilon y / 2} ) )

which we can then multiply using (22) and dyadic decomposition to conclude that

\displaystyle  \prod_{\lambda \in \tilde \Sigma: \mathrm{Im} \lambda \leq y / 2} H_\varphi( 2\pi \varepsilon (x+iy-\lambda) ) = \exp( O_{\varepsilon}( e^{-c_1 \varepsilon y / 2} \log W_0 ) ).

Now consider those {\lambda} for which {2^{j-1} y \leq \mathrm{Im} \lambda \leq 2^j y} for some {j \geq 0}. Here we see from (13) that

\displaystyle  H_\varphi( 2\pi \varepsilon(x+iy-\lambda) ) = \exp( O_{\varepsilon,A}( |\mathrm{Re} \lambda|^{-A} e^{2^j c_2 \varepsilon y} ) )

for any {A > 0}. Multiplying this using (21) and dyadic decomposition, we conclude for {A} large enough that

\displaystyle  \prod_{\lambda \in \tilde \Sigma: 2^{j-1} y \leq \mathrm{Im} \lambda \leq 2^j y} H_\varphi( 2\pi \varepsilon (x+iy-\lambda) ) = \exp( O_{\varepsilon}( e^{- 2^j c_1 \varepsilon y / 2} ) ).

Putting all this together, we conclude (23). \Box

From the Cauchy integral formula one then has the bound

\displaystyle  -\frac{\zeta'}{\zeta}(s) = \sum_n \frac{\Lambda(n)}{n^s} \varphi(\frac{\log n}{\log T^\varepsilon}) + O_{\varepsilon}(e^{-c' \varepsilon y} \log T) \ \ \ \ \ (24)

for all {s = \phi_t(x+iy)} in the region

\displaystyle  \{ \phi_t(x+iy): y \geq W_0, |x| \leq e^{-cy}/4 \}.

We can use the moment method to control the right-hand side of (24).

Proposition 5 Let {\delta>0} be fixed. With probability {1-o(1)}, one has the bound

\displaystyle  -\frac{\zeta'}{\zeta}(s) \ll (|x| + y)^{\delta} \frac{\log T}{y}

for all {s = \phi_t(x+iy)} in the region

\displaystyle  \{ \phi_t(x+iy): y \geq 2W_0, |x| \leq e^{-cy}/8 \}. \ \ \ \ \ (25)

One could improve the {(|x|+y)^\delta} loss here somewhat, in the spirit of the law of the iterated logarithm, but we will not attempt to do so here.

Proof: We can tile the region (25) by squares of the form

\displaystyle  Q = \{ \phi_t(x+iy): W \leq y \leq 2W; x_0 \leq x \leq x_0+W \}

for various {W \geq W_0} and {x_0}. By the union bound, it will suffice to show the bound

\displaystyle  \sup_{s \in Q} |\frac{\zeta'}{\zeta}(s)| \ll (1 + |x_0| + W)^\delta \frac{\log T}{W}

with probability {1 - O_\delta( (|x_0|+W)^{-10} )} (say) for each such square {Q}, after restricting to the event on which Proposition 4 applies. Taking {\varepsilon = 1/W}, it then suffices by that proposition to show that

\displaystyle  \sup_{s \in Q} |F(s)| \ll (|x_0|+W)^\delta

with probability {1 - O_\delta( (|x_0|+W)^{-10} )}, where

\displaystyle  F(s) := \frac{W}{\log T} \sum_n \frac{\Lambda(n)}{n^s} \varphi(\frac{\log n}{\log T^{1/W}}).

Let {k} be a large fixed even integer depending on {\delta}. From the Sobolev inequality one has

\displaystyle  (\sup_{s \in Q} |F(s)|)^{k} \ll_k \frac{1}{|Q|} \int_Q |F(s)|^k\ ds + \frac{1}{|Q|} \int_Q |\frac{W}{\log T} F'(s)|^k\ ds

where {ds} is Lebesgue measure on {Q}. By linearity of expectation and Markov’s inequality, it then suffices to establish the bounds

\displaystyle  {\bf E} |F(s)|^k \ll_k 1

and

\displaystyle  {\bf E} |\frac{W}{\log T} F'(s)|^k \ll_k 1

uniformly for {s \in Q}. But this is a routine moment calculation (after first restoring the exceptional events in which (19), (22) fail in order to more easily compute the moment). \Box

— 3. Microscale Riemann-von Mangoldt formulae —

Henceforth we restrict attention to the probability {1-o(1)} event where (19), (22), and Proposition 5 all hold. Then we have a microscale zero-free region slightly to the right of the critical line with good bounds on the log-derivative; by the functional equation, we also have good bounds slightly to the left of the critical line. Meanwhile, (19), (22) gives some preliminary control between these two lines. One can then put all this information together by standard techniques to obtain a microscale version of the Riemann-von Mangoldt formula, which we can then use to establish Theorem 2.

We turn to the details. We start from the well known identity

\displaystyle  -\frac{\zeta'}{\zeta}(s) = C - \sum_\rho^* (\frac{1}{s-\rho} - \frac{1}{\rho}) \ \ \ \ \ (26)

(see e.g., equation (45) of this previous blog post) for an absolute constant {C} and all {s} that are not zeroes or poles of {\zeta}. In particular we have

\displaystyle  -\frac{\zeta'}{\zeta}(s) + \frac{\zeta'}{\zeta}(\overline{1-s}) = - \sum_\rho^* (\frac{1}{s-\rho} - \frac{1}{\overline{1-s}-\rho}).

On the other hand, from the functional equation (5) one has

\displaystyle  \frac{\zeta'}{\zeta}(s) = \frac{\chi'}{\chi}(s) - \frac{\zeta'}{\zeta}(1-s)

and hence

\displaystyle  \frac{\zeta'}{\zeta}(\overline{1-s}) = \overline{\frac{\chi'}{\chi}(s)} - \overline{\frac{\zeta'}{\zeta}(s) }

so that

\displaystyle  - 2 \mathrm{Re} \frac{\zeta'}{\zeta}(s) + \overline{\frac{\chi'}{\chi}(s)} = - \sum_\rho^* (\frac{1}{s-\rho} - \frac{1}{\overline{1-s}-\rho}).

Writing {s = \phi_t(z)} and using (7), (20) we conclude

\displaystyle  - 2 \mathrm{Re} \frac{\zeta'}{\zeta}(s) + \log T= - \frac{\log T}{2\pi} \sum_{\lambda \in \tilde \Sigma} (\frac{1}{z-\lambda} - \frac{1}{\overline{z}-\lambda}) + O(1).

Using the functional equation we can replace {\frac{1}{\overline{z}-\lambda}} here by {\frac{1}{\overline{z-\lambda}}}, and conclude that

\displaystyle  \sum_{\lambda \in \tilde \Sigma} \frac{\mathrm{Im}(z-\lambda)}{|z-\lambda|^2} = \pi - \frac{2\pi}{\log T} \mathrm{Re} \frac{\zeta'}{\zeta}(s) + O( \frac{1}{\log T} ).

In particular from Proposition 5 and writing {z=x+iy} one has

\displaystyle  \sum_{\lambda \in \tilde \Sigma} \frac{\mathrm{Im}(x+iy-\lambda)}{|x+iy-\lambda|^2} = \pi + O( \frac{(|x|+y)^{0.1}}{y} ) \ \ \ \ \ (27)

when one is in the region (25) with {y = o(\log T)}. The zeroes {\lambda} of imaginary part less than {y/2} give a positive contribution to the LHS (which is comparable to {1/y} when {\mathrm{Re}\lambda = x + O(y)}), while the contribution of the zeroes of imaginary part greater than or equal to {y/2} is {O(e^{-c''y})} for some {c''>0} thanks to (18). We conclude in particular the crude estimate

\displaystyle  \# \{ \lambda \in \tilde \Sigma: |\mathrm{Re} \lambda - x| \leq y \} \ll y \ \ \ \ \ (28)

whenever one is in the region (25) with {y = o(\log T)}. If we then go back to (27) and integrate it for {x} in an interval {I} in {[-e^{cy/16}, e^{cy/16}]} and use (28) to control errors, we conclude that

\displaystyle  \# \{ \lambda \in \tilde \Sigma: \mathrm{Re} \lambda \in I \} = |I| + O( \frac{(\mathrm{dist}(I,0)+y)^{0.1} |I|}{y} ) + O(y).

In particular, if we sort the rescaled zeroes {\lambda_n \in \tilde \Sigma} in nondecreasing order of real part, with {\mathrm{Re} \lambda_n \geq 0} for {n \geq 0} and {\mathrm{Re} \lambda_n < 0} for {n < 0}, and such that any conjugate pairs of zeroes are consecutive, then we have the microscale Riemann-von Mangoldt formula

\displaystyle  \lambda_n = n + O( W_0^2 + |n|^{0.6} ) \ \ \ \ \ (29)

for {n = o(T \log T)} (as can be seen by applying the above formula with {I = [0,x]} or {I = [-x,0]} for {|x|} near {|n|} and {y = 100 W_0 + |n|^{1/2}}). Likely the error term can be improved with further effort. For {n \gg T \log T} one can also get very good control on {\lambda_n} from the classical Riemann-von Mangoldt formula.
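As a sanity check on the shape of (27), consider the idealised model in which the rescaled zeroes are exactly the integers. This is an assumption made purely for illustration; the actual zeroes only obey (29) up to an error term. In that model the left-hand side of (27) has the closed form {\pi \sinh(2\pi y) / (\cosh(2\pi y) - \cos(2\pi x))}, which is {\pi + O(e^{-2\pi y})}. The following Python sketch (the function names are hypothetical) compares a truncated version of the sum with this closed form.

```python
import math


def model_zero_sum(x, y, n_max=200_000):
    """Truncated sum over integers n of Im(z - n) / |z - n|^2, with z = x + iy.

    Assumption for illustration only: the rescaled zeroes are lambda_n = n.
    The truncation error is roughly 2*y/n_max.
    """
    total = 0.0
    for n in range(-n_max, n_max + 1):
        total += y / ((x - n) ** 2 + y ** 2)
    return total


def model_closed_form(x, y):
    """Closed form of the full sum, obtained from Im(pi * cot(pi * z))."""
    return math.pi * math.sinh(2 * math.pi * y) / (
        math.cosh(2 * math.pi * y) - math.cos(2 * math.pi * x)
    )


if __name__ == "__main__":
    for x, y in [(0.3, 0.5), (0.3, 2.0), (7.25, 2.0)]:
        print(x, y, model_zero_sum(x, y), model_closed_form(x, y), math.pi)
```

In this model the sum is already very close to {\pi} once {y} is a moderately large constant, which is the behaviour that (27) asserts (with a weaker error term) for the true rescaled zeroes.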

From (26) we have

\displaystyle  -\frac{\zeta'}{\zeta}(s) = C_t - i \frac{\log T}{2\pi} \mathrm{p.v.} \sum_n \frac{1}{z-\lambda_n}

for some constant {C_t} depending on {t} but not on {z}, where we use (29) (and the classical Riemann-von Mangoldt formula) to ensure convergence of the principal value summation. If we set {z = iW} for some {W_0^3 \leq W = o(\log T)} then from (29) we have

\displaystyle  \mathrm{p.v.} \sum_n \frac{1}{z-\lambda_n} = -i\pi + O(W^{-0.4})

while from Proposition 5 one has

\displaystyle  -\frac{\zeta'}{\zeta}(s) \ll W^{-0.9}

and hence {C_t = \frac{1}{2} \log T + O(W^{-0.4})}. Optimising in {W} we thus have {C_t = \frac{1}{2} \log T + O( \log^{-0.3} T )} (say), hence

\displaystyle  -\frac{\zeta'}{\zeta}(s) = \frac{1}{2} \log T - i \frac{\log T}{2\pi} \mathrm{p.v.} \sum_n \frac{1}{z-\lambda_n} + O( \log^{-0.3} T ).
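The main term {-i\pi} in the principal value sum can be seen in the model case {\lambda_n = n} (again an illustrative assumption, not something established above), where the symmetric sum equals {\pi \cot(\pi z)}, and this is {-i\pi \coth(\pi W)} at {z = iW}. A minimal Python sketch of this computation, with hypothetical function names:

```python
import cmath
import math


def pv_sum_model(z, n_max=100_000):
    """Symmetric (principal value) partial sum of 1/(z - n) over |n| <= n_max.

    Assumption for illustration only: the rescaled zeroes are lambda_n = n.
    Pairing n with -n gives 2z/(z^2 - n^2), so the partial sums converge.
    """
    total = 1 / z
    for n in range(1, n_max + 1):
        total += 2 * z / (z * z - n * n)
    return total


def pi_cot(z):
    """pi * cot(pi * z), the exact value of the principal value sum in the model."""
    return math.pi * cmath.cos(math.pi * z) / cmath.sin(math.pi * z)


if __name__ == "__main__":
    for W in [1.0, 2.0, 3.0]:
        z = 1j * W
        print(W, pv_sum_model(z), pi_cot(z), -1j * math.pi)
```

Multiplying {-i\pi} by the prefactor {-i \frac{\log T}{2\pi}} gives {-\frac{1}{2} \log T}, which is the term that the constant {C_t} has to cancel.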

Next, let {\lambda_{-N_-}, \dots, \lambda_{N_+}} be a consecutive string of rescaled zeroes in {\tilde \Sigma} with {N_-, N_+ = N/2+O(1)}. By adjusting {N} by one if necessary, we can assume that there are exactly {N} zeroes in this string and that whenever one complex zero is in this set, its complex conjugate is also. Then we can define a degree {N} polynomial

\displaystyle  P_t(Z) := A \prod_{j=-N_-}^{N_+} (Z - e^{2\pi i \lambda_j / N})

for some non-zero constant {A} to be chosen later. This polynomial obeys a functional equation

\displaystyle  P_t(Z) = e^{-2\pi i \theta} Z^N \overline{P_t}(1/Z) \ \ \ \ \ (30)

for some phase {\theta} (which at present need not be equal to {L(t)}, but we will address this issue later). If we set

\displaystyle  G_t(s) := P_t( e^{2\pi i \phi_t^{-1}(s)/N} )

then {G_t} is an entire function, and from the Euler product formula for sine we have

\displaystyle  -\frac{G'_t}{G_t}(s) = \frac{1}{2} \log T -i \frac{\log T}{2\pi} \sum_{j=-N_-}^{N_+} \mathrm{p.v.} \sum_n \frac{1}{z - \lambda_j - N n}

whenever {s = \phi_t(z)}. Thus if we factor {\zeta_t = G_t K_t}, then by using (29) one can compute that

\displaystyle  -\frac{K'_t}{K_t}(s) = O(W_0 N^{-0.4} \log T)

if {s = \frac{1}{2} + it - O( \frac{W_0}{\log T})}. Thus, {\log K_t} only fluctuates by {O( W_0^2 N^{-0.4}) = o(1)} in this region. By choosing the normalising constant {A} appropriately, one may thus ensure that {\log K_t = o(1)} in this region, thus giving the approximation (4) when {|z| \leq W_0}. From (5) one thus has

\displaystyle  P_t(e^{2\pi i z/N}) = (1+o(1)) \chi(s) \overline{P_t}(e^{-2\pi i z/N})

when {|z| \leq W_0}. From (8) one has

\displaystyle  \chi(s) = (1+o(1)) e^{-2\pi i L(t)} Z^N

and thus

\displaystyle  P_t(Z) = (1+o(1)) e^{-2\pi i L(t)} Z^N \overline{P_t}(1/Z)

for at least one choice of {Z}. Thus the phase {\theta} in (30) differs from {L(t)} by {o(1)} (after shifting by an integer). Thus by adjusting the normalising constant {A} by a multiplicative factor of {1+o(1)}, we obtain (9) as required.
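The functional equation (30) is easy to test numerically. The sketch below uses a small, made-up set of rescaled zeroes that is closed under complex conjugation (the specific values and function names are hypothetical), takes {\overline{P_t}} to mean the polynomial with conjugated coefficients, and checks that {P_t(Z) / (Z^N \overline{P_t}(1/Z))} is a constant of modulus one, that is to say a phase {e^{-2\pi i \theta}}.

```python
import cmath
import math


def make_roots(lams):
    """Roots e^{2 pi i lambda_j / N} of the polynomial, with N = len(lams)."""
    n = len(lams)
    return [cmath.exp(2j * math.pi * lam / n) for lam in lams]


def poly(Z, roots, A=1.0):
    """Evaluate A * prod_j (Z - root_j)."""
    out = complex(A)
    for w in roots:
        out *= Z - w
    return out


def poly_bar(Z, roots, A=1.0):
    """Evaluate the polynomial with conjugated coefficients: conj(P(conj(Z)))."""
    return poly(complex(Z).conjugate(), roots, A).conjugate()


def phase_ratio(Z, roots, A=1.0):
    """P(Z) / (Z^N * Pbar(1/Z)); constant in Z when (30) holds."""
    Z = complex(Z)
    return poly(Z, roots, A) / (Z ** len(roots) * poly_bar(1 / Z, roots, A))


# Hypothetical rescaled zeroes: four real ones and one conjugate pair.
EXAMPLE_LAMS = [-1.9, -1.1 + 0.2j, -1.1 - 0.2j, 0.1, 0.9, 2.2]

if __name__ == "__main__":
    roots = make_roots(EXAMPLE_LAMS)
    for Z in [0.7 + 0.4j, -1.3 + 0.2j, 2.0]:
        r = phase_ratio(Z, roots)
        print(Z, r, abs(r))
```

The ratio is independent of {Z} because the set of roots is invariant under {w \mapsto 1/\overline{w}} when the {\lambda_j} are closed under conjugation; without that symmetry the ratio would in general vary with {Z}.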