
The Riemann zeta function {\zeta(s)}, defined for {\hbox{Re}(s)>1} by

\displaystyle  \zeta(s) := \sum_{n=1}^\infty \frac{1}{n^s} \ \ \ \ \ (1)

and then continued meromorphically to other values of {s} by analytic continuation, is a fundamentally important function in analytic number theory, as it is connected to the primes {p=2,3,5,\ldots} via the Euler product formula

\displaystyle  \zeta(s) = \prod_p (1 - \frac{1}{p^s})^{-1} \ \ \ \ \ (2)

(for {\hbox{Re}(s) > 1}, at least), where {p} ranges over primes. (The equivalence between (1) and (2) is essentially the generating function version of the fundamental theorem of arithmetic; a quick numerical sanity check of this equivalence is sketched just below.)
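As an aside, the agreement between (1) and (2) is easy to see numerically. Here is a minimal, self-contained Python sketch (the truncation point {10^5} is an arbitrary choice) comparing the partial sum of (1) with the partial Euler product of (2) at {s=2}, where both converge absolutely to {\zeta(2) = \pi^2/6}:

```python
import math

def primes_up_to(limit):
    """Simple sieve of Eratosthenes."""
    is_prime = [True] * (limit + 1)
    is_prime[0] = is_prime[1] = False
    for p in range(2, int(limit ** 0.5) + 1):
        if is_prime[p]:
            for multiple in range(p * p, limit + 1, p):
                is_prime[multiple] = False
    return [n for n, flag in enumerate(is_prime) if flag]

s = 2.0
N = 10 ** 5  # truncation point (arbitrary choice)

partial_sum = sum(1.0 / n ** s for n in range(1, N + 1))  # truncation of (1)
partial_product = 1.0
for p in primes_up_to(N):                                 # truncation of (2)
    partial_product *= 1.0 / (1.0 - p ** (-s))

# both should be close to zeta(2) = pi^2/6 = 1.6449...
print(partial_sum, partial_product, math.pi ** 2 / 6)
```

The function {\zeta} has a pole at {1} and a number of zeroes {\rho}. A formal application of the factor theorem gives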

\displaystyle  \zeta(s) = \frac{1}{s-1} \prod_\rho (s-\rho) \times \ldots \ \ \ \ \ (3)

where {\rho} ranges over zeroes of {\zeta}, and we will be vague about what the {\ldots} factor is, how to make sense of the infinite product, and exactly which zeroes of {\zeta} are involved in the product. Equating (2) and (3) and taking logarithms gives the formal identity

\displaystyle  - \log \zeta(s) = \sum_p \log(1 - \frac{1}{p^s}) = \log(s-1) - \sum_\rho \log(s-\rho) + \ldots; \ \ \ \ \ (4)

using the Taylor expansion

\displaystyle  \log(1 - \frac{1}{p^s}) = - \frac{1}{p^s} - \frac{1}{2 p^{2s}} - \frac{1}{3p^{3s}} - \ldots \ \ \ \ \ (5)

and differentiating the above identity in {s} yields the formal identity

\displaystyle  - \frac{\zeta'(s)}{\zeta(s)} = \sum_n \frac{\Lambda(n)}{n^s} = \frac{1}{s-1} - \sum_\rho \frac{1}{s-\rho} + \ldots \ \ \ \ \ (6)

where {\Lambda(n)} is the von Mangoldt function, defined to be {\log p} when {n} is a power of a prime {p}, and zero otherwise. Thus we see that the behaviour of the primes (as encoded by the von Mangoldt function) is intimately tied to the distribution of the zeroes {\rho}. For instance, if we knew that the zeroes were far away from the axis {\hbox{Re}(s)=1}, then we would heuristically have

\displaystyle  \sum_n \frac{\Lambda(n)}{n^{1+it}} \approx \frac{1}{it}

for real {t}. On the other hand, the integral test suggests that

\displaystyle  \sum_n \frac{1}{n^{1+it}} \approx \frac{1}{it}

and thus we see that {\frac{\Lambda(n)}{n}} and {\frac{1}{n}} have essentially the same (multiplicative) Fourier transform:

\displaystyle  \sum_n \frac{\Lambda(n)}{n^{1+it}} \approx \sum_n \frac{1}{n^{1+it}}.

Inverting the Fourier transform (or performing a contour integral closely related to the inverse Fourier transform), one is led to the prime number theorem

\displaystyle  \sum_{n \leq x} \Lambda(n) \approx \sum_{n \leq x} 1.

In fact, the standard proof of the prime number theorem basically proceeds by making all of the above formal arguments precise and rigorous.
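One can see this heuristic in action numerically. Here is a minimal Python sketch (the cutoff {x = 10^6} is an arbitrary choice) that computes {\sum_{n \leq x} \Lambda(n)} by summing {\log p} over all prime powers {p^k \leq x}, and compares it with {\sum_{n \leq x} 1 = x}:

```python
import math

def chebyshev_psi(x):
    """Compute sum_{n <= x} Lambda(n): the sum of log p over prime powers p^k <= x."""
    limit = int(x)
    is_prime = [True] * (limit + 1)
    is_prime[0] = is_prime[1] = False
    for p in range(2, int(limit ** 0.5) + 1):
        if is_prime[p]:
            for multiple in range(p * p, limit + 1, p):
                is_prime[multiple] = False
    total = 0.0
    for p in range(2, limit + 1):
        if is_prime[p]:
            pk = p
            while pk <= limit:   # each prime power p, p^2, p^3, ... contributes log p
                total += math.log(p)
                pk *= p
    return total

x = 10 ** 6  # arbitrary cutoff
print(chebyshev_psi(x) / x)  # should be close to 1, per the prime number theorem
```

(Under RH, the discrepancy {\sum_{n \leq x} \Lambda(n) - x} is known to be as small as {O(x^{1/2} \log^2 x)}.)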

Unfortunately, we don’t know as much about the zeroes {\rho} of the zeta function (and hence, about the {\zeta} function itself) as we would like. The Riemann hypothesis (RH) asserts that all the zeroes (except for the “trivial” zeroes at the negative even numbers) lie on the critical line {\hbox{Re}(s)=1/2}; this hypothesis would significantly sharpen the error terms in the above proof of the prime number theorem. Furthermore, the stronger GUE hypothesis asserts in addition to RH that the local distribution of these zeroes on the critical line should behave like the local distribution of the eigenvalues of a random matrix drawn from the gaussian unitary ensemble (GUE). I will not give a precise formulation of this hypothesis here, except to say that the adjective “local” in the context of distribution of zeroes {\rho} means something like “at scale {O(1/\log T)} when {\hbox{Im}(s) = O(T)}“.

Nevertheless, we do know some reasonably non-trivial facts about the zeroes {\rho} and the zeta function {\zeta}, either unconditionally, or assuming RH (or GUE). Firstly, there are no zeroes for {\hbox{Re}(s)>1} (as one can already see from the convergence of the Euler product (2) in this case) or for {\hbox{Re}(s)=1} (this is trickier, relying on (6) and the elementary observation that

\displaystyle  \hbox{Re}( 3\frac{\Lambda(n)}{n^{\sigma}} + 4\frac{\Lambda(n)}{n^{\sigma+it}} + \frac{\Lambda(n)}{n^{\sigma+2it}} ) = 2\frac{\Lambda(n)}{n^\sigma} (1+\cos(t \log n))^2

is non-negative for {\sigma > 1} and {t \in {\mathbb R}}); from the functional equation

\displaystyle  \pi^{-s/2} \Gamma(s/2) \zeta(s) = \pi^{-(1-s)/2} \Gamma((1-s)/2) \zeta(1-s)

(which can be viewed as a consequence of the Poisson summation formula, see e.g. my blog post on this topic) we know that there are no zeroes for {\hbox{Re}(s) \leq 0} either (except for the trivial zeroes at negative even integers, corresponding to the poles of the Gamma function). Thus all the non-trivial zeroes lie in the critical strip {0 < \hbox{Re}(s) < 1}.
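Both the elementary positivity trick and the functional equation are easy to test numerically. The following Python sketch uses the mpmath library (the test point {s = 0.3+0.4i}, the working precision, and the spot-check value {x=1.7} are arbitrary choices of mine):

```python
import mpmath

mpmath.mp.dps = 30  # working precision in digits (arbitrary choice)

def xi(s):
    """The completed zeta function pi^(-s/2) * Gamma(s/2) * zeta(s)."""
    return mpmath.pi ** (-s / 2) * mpmath.gamma(s / 2) * mpmath.zeta(s)

s = mpmath.mpc(0.3, 0.4)  # arbitrary test point inside the critical strip
print(xi(s))              # the two printed values should agree,
print(xi(1 - s))          # confirming the symmetry xi(s) = xi(1-s)

# the elementary observation behind non-vanishing on Re(s)=1 reduces to
# 3 + 4*cos(x) + cos(2*x) = 2*(1 + cos(x))**2 >= 0; spot-check it:
x = 1.7  # arbitrary
print(3 + 4 * mpmath.cos(x) + mpmath.cos(2 * x), 2 * (1 + mpmath.cos(x)) ** 2)
```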

We also know that there are infinitely many non-trivial zeroes, and can approximately count how many zeroes there are in any large bounded region of the critical strip. For instance, for large {T}, the number of zeroes {\rho} in this strip with {\hbox{Im}(\rho) = T+O(1)} is {O(\log T)}. This can be seen by applying (6) to {s = 2+iT} (say); the trivial zeroes at the negative even integers end up giving a contribution of {O(\log T)} to this sum (this is a heavily disguised variant of Stirling’s formula, as one can view the trivial zeroes as essentially being poles of the Gamma function), the {\frac{1}{s-1}} and {\ldots} terms end up being negligible (of size {O(1)}), and each non-trivial zero {\rho} contributes a term which has a non-negative real part, and furthermore has size comparable to {1} if {\hbox{Im}(\rho) = T+O(1)}. (Here I am glossing over a technical renormalisation needed to make the infinite series in (6) converge properly.) Meanwhile, the left-hand side of (6) is absolutely convergent for {s=2+iT} and of size {O(1)}, and the claim follows. A more refined version of this argument shows that the number of non-trivial zeroes with {0 \leq \hbox{Im}(\rho) \leq T} is {\frac{T}{2\pi} \log \frac{T}{2\pi} - \frac{T}{2\pi} + O(\log T)}, but we will not need this more precise formula here. (A fair fraction – at least 40%, in fact – of these zeroes are known to lie on the critical line; see this earlier blog post of mine for more discussion.)
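One can watch this refined count emerge numerically. The sketch below uses mpmath’s zetazero routine to count zeroes up to height {T=100} (an arbitrary choice) and compares against the main terms of the formula just quoted, adding the constant {7/8} from the full Riemann-von Mangoldt formula, which is hidden inside the {O(\log T)} error term above:

```python
import mpmath

T = 100  # arbitrary height

# count non-trivial zeroes rho with 0 <= Im(rho) <= T using mpmath's zero finder
count = 0
n = 1
while mpmath.im(mpmath.zetazero(n)) <= T:
    count += 1
    n += 1

# main terms of the Riemann-von Mangoldt formula, plus its 7/8 constant term
main = T / (2 * mpmath.pi) * mpmath.log(T / (2 * mpmath.pi)) - T / (2 * mpmath.pi)
print(count, mpmath.nstr(main + mpmath.mpf(7) / 8))  # with T = 100: 29 vs roughly 29.0
```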

Another thing that we happen to know is how the magnitude {|\zeta(1/2+it)|} of the zeta function is distributed as {t \rightarrow \infty}; it turns out to be log-normally distributed with log-variance about {\frac{1}{2} \log \log t}. More precisely, we have the following result of Selberg:

Theorem 1 Let {T} be a large number, and let {t} be chosen uniformly at random from the interval {[T,2T]} (say). Then the distribution of {\frac{1}{\sqrt{\frac{1}{2} \log \log T}} \log |\zeta(1/2+it)|} converges (in distribution) to the normal distribution {N(0,1)}.

To put it more informally, {\log |\zeta(1/2+it)|} behaves like {\sqrt{\frac{1}{2} \log \log t} \times N(0,1)} plus lower order terms for “typical” large values of {t}. (Zeroes {\rho} of {\zeta} are, of course, certainly not typical, but one can show that one can usually stay away from these zeroes.) In fact, Selberg showed a slightly more precise result, namely that for any fixed {k \geq 1}, the {k^{th}} moment of {\frac{1}{\sqrt{\frac{1}{2} \log \log T}} \log |\zeta(1/2+it)|} converges to the {k^{th}} moment of {N(0,1)}.
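Selberg’s theorem can be glimpsed, very slowly, in numerical experiments. The sketch below uses the mpmath library to sample {\log |\zeta(1/2+it)|} at random heights (the base height {T=1000} and sample size are arbitrary choices, far too small for serious convergence since {\log \log T} grows glacially; the point is only that the normalised samples already have roughly the right mean and variance):

```python
import math
import random
import mpmath

T = 1000       # base height (arbitrary, and far too small for good convergence)
samples = 200  # sample size (arbitrary)
norm = math.sqrt(0.5 * math.log(math.log(T)))

random.seed(0)
values = []
for _ in range(samples):
    t = random.uniform(T, 2 * T)                     # t uniform in [T, 2T]
    z = mpmath.zeta(mpmath.mpc(0.5, t))              # zeta on the critical line
    values.append(float(mpmath.log(abs(z))) / norm)  # normalised log-magnitude

mean = sum(values) / samples
variance = sum((v - mean) ** 2 for v in values) / samples
print(mean, variance)  # should drift towards 0 and 1 respectively as T grows
```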

Remarkably, Selberg’s result does not need RH or GUE, though it is certainly consistent with such hypotheses. (For instance, the determinant of a GUE matrix asymptotically obeys a remarkably similar log-normal law to that given by Selberg’s theorem.) Indeed, the net effect of these hypotheses is only to modify some error terms in {\log |\zeta(1/2+it)|} of magnitude {O(1)}, which are thus asymptotically negligible compared to the main term, which has magnitude about {\sqrt{\log \log T}}. So Selberg’s result, while very pretty, manages to finesse the question of what the zeroes {\rho} of {\zeta} are actually doing: he makes the primes do most of the work, rather than the zeroes.

Selberg never actually published the above result, but it is reproduced in a number of places (e.g. in this book by Joyner, or this book by Laurincikas). As with many other results in analytic number theory, the actual details of the proof can get somewhat technical; but I would like to record here (partly for my own benefit) an informal sketch of some of the main ideas in the argument.


One of the oldest and most fundamental concepts in mathematics is the line. Depending on exactly what mathematical structures we want to study (algebraic, geometric, topological, order-theoretic, etc.), we model lines nowadays by a variety of standard mathematical objects, such as the real line {\Bbb R}, the complex line {\Bbb C}, the projective line {{\Bbb RP}^1}, the extended real line {[-\infty,+\infty]}, the affine line {{\Bbb A}^1}, the continuum {c}, the long line {L}, etc. We also have discrete versions of the line, such as the natural numbers {\Bbb N}, the integers {\Bbb Z}, and the ordinal {\omega}, as well as compact versions of the line, such as the unit interval {[0,1]} or the unit circle {{\Bbb T} := {\Bbb R}/{\Bbb Z}}. Finally, we have versions of the line that are both discrete and compact, such as the cyclic groups {{\Bbb Z}/N{\Bbb Z}} and the discrete intervals {\{1,\ldots,N\}} and {\{0,\ldots,N-1\}}. By taking Cartesian products we then obtain higher-dimensional objects such as Euclidean space {{\Bbb R}^n}, the standard lattice {{\Bbb Z}^n}, the standard torus {{\Bbb T}^n = {\Bbb R}^n/{\Bbb Z}^n}, and so forth. These objects of course form the background on which a very large fraction of modern mathematics is set.

Broadly speaking, the line has three major families of structures on it:

  1. Geometric structures, such as a metric or a measure, completeness, scales (coarse and fine), rigid motions (translations and reflections), similarities (dilation, affine maps), and differential structures (tangent bundle, etc.);
  2. Algebraic structures, such as group, ring, or field structures, and everything else that comes from those categories (e.g. subgroups, homomorphisms, involutions, etc.); and
  3. One-dimensional structures, such as order, a length space structure (in particular, path-connectedness structure), a singleton generator, the Archimedean property, the ability to use mathematical induction (i.e. well-ordering), convexity, or the ability to disconnect the line by removing a single point.

Of course, these structures are inter-related, and it is an important phenomenon that a mathematical concept which appears to be native to one structure, can often be equivalently defined in terms of other structures. For instance, the absolute value |n| of an integer n can be defined geometrically as the distance from 0 to n, algebraically as the index of the subgroup \langle n \rangle = n \cdot \Bbb Z of the integers {\Bbb Z} generated by n, or one-dimensionally as the number of integers between 0 and n (including 0, but excluding n). This equivalence of definitions becomes important when one wants to work in more general contexts in which one or more of the above structures is missing or otherwise weakened.
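As a toy illustration, here is a Python sketch computing {|n|} in each of the three ways just described (the sampling window in the algebraic version is an arbitrary implementation shortcut of mine; a wide enough window is guaranteed to find every coset):

```python
def abs_geometric(n):
    """Distance from 0 to n on the number line."""
    return max(n, -n)

def abs_algebraic(n):
    """Index of the subgroup <n> = n*Z in Z, i.e. the number of distinct cosets,
    found here by sampling residues over a window much wider than |n|."""
    assert n != 0  # the subgroup {0} has infinite index
    window = 10 * max(n, -n)
    return len({k % n for k in range(-window, window)})

def abs_one_dimensional(n):
    """Number of integers between 0 and n, including 0 but excluding n."""
    lo, hi = min(0, n), max(0, n)
    return sum(1 for k in range(lo, hi + 1) if k != n)

for n in (7, -7, 1, -12):
    # all three definitions agree
    print(n, abs_geometric(n), abs_algebraic(n), abs_one_dimensional(n))
```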

What I want to talk about today is an important toy model for the line (in any of its incarnations), in which the geometric and algebraic structures are enhanced (and become neatly nested and recursive), at the expense of the one-dimensional structure (which is largely destroyed). This model has many different names, depending on what field of mathematics one is working in and which structures one is interested in. In harmonic analysis it is called the dyadic model, the Walsh model, or the Cantor group model; in number theory and arithmetic geometry it is known as the function field model; in topology it is the Cantor space model; in probability it is the martingale model; in metric geometry it is the ultrametric, tree, or non-Archimedean model; in algebraic geometry it is the Puiseux series model; in additive combinatorics it is the bounded torsion or finite field model; in computer science and information theory it is the Hamming cube model; in representation theory it is the Kashiwara crystal model. Let me arbitrarily select one of these terms, and refer to all of these models as dyadic models for the line (or of objects derived from the line). While there is often no direct link between a dyadic model and a non-dyadic model, dyadic models serve as incredibly useful laboratories in which to gain insight and intuition for the “real-world” non-dyadic model, since one has much more powerful and elegant algebraic and geometric structure to play with in this setting (though the loss of one-dimensional structure can be a significant concern). Perhaps the most striking example of this is the three-line proof of the Riemann hypothesis in the function field model of the integers, which I will discuss a little later.
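To give a small concrete taste of this enhanced structure, here is a Python sketch of one incarnation of the dyadic model (the Cantor group model on 16-bit integers; the bit width and sample count are arbitrary choices of mine). Addition becomes bitwise XOR, so there are no carries, and the distance between two numbers is governed by the highest bit in which they differ; the experiment checks that this distance is an ultrametric and that “translations” are isometries:

```python
import random

def dyadic_dist(x, y):
    """Dyadic 'distance' between x and y: 2^(position of the highest bit in
    which they differ), or 0 if x == y."""
    d = x ^ y
    return 0 if d == 0 else 1 << (d.bit_length() - 1)

random.seed(0)
for _ in range(10000):
    x, y, z, a = (random.getrandbits(16) for _ in range(4))
    # the ultrametric inequality, much stronger than the triangle inequality:
    assert dyadic_dist(x, z) <= max(dyadic_dist(x, y), dyadic_dist(y, z))
    # 'translations' x -> x XOR a are isometries, since (x^a) ^ (y^a) = x ^ y:
    assert dyadic_dist(x ^ a, y ^ a) == dyadic_dist(x, y)

print("ultrametric and translation-invariance checks passed")
```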

