You are currently browsing the monthly archive for May 2019.
I was recently asked to contribute a short comment to Nature Reviews Physics, as part of a series of articles on fluid dynamics on the occasion of the 200th anniversary (this August) of the birthday of George Stokes. My contribution is now online as “Searching for singularities in the Navier–Stokes equations”, where I discuss the global regularity problem for Navier-Stokes and my thoughts on how one could try to construct a solution that blows up in finite time via an approximately discretely self-similar “fluid computer”. (The rest of the series does not currently seem to be available online, but I expect they will become so shortly.)
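Given three points $A, B, C$ in the plane, the distances $|AB|, |BC|, |AC|$ between them have to be non-negative and obey the triangle inequalities
$\displaystyle |AB| \leq |BC| + |AC|, \quad |BC| \leq |AC| + |AB|, \quad |AC| \leq |AB| + |BC|$
but are otherwise unconstrained. But if one has four points $A,B,C,D$ in the plane, then there is an additional constraint connecting the six distances $|AB|, |AC|, |AD|, |BC|, |BD|, |CD|$ between them, coming from the Cayley-Menger determinant:
Proposition 1 (Cayley-Menger determinant) If $A,B,C,D$ are four points in the plane, then the Cayley-Menger determinant
$\displaystyle \det \begin{pmatrix} 0 & 1 & 1 & 1 & 1 \\ 1 & 0 & |AB|^2 & |AC|^2 & |AD|^2 \\ 1 & |AB|^2 & 0 & |BC|^2 & |BD|^2 \\ 1 & |AC|^2 & |BC|^2 & 0 & |CD|^2 \\ 1 & |AD|^2 & |BD|^2 & |CD|^2 & 0 \end{pmatrix} \ \ \ \ \ (1)$
vanishes.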
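Proof: If we view $A,B,C,D$ as vectors in ${\bf R}^2$, then we have the usual cosine rule $|AB|^2 = |A|^2 + |B|^2 - 2 A \cdot B$, and similarly for all the other distances. The $5 \times 5$ matrix appearing in (1) can then be written as $M + M^T - 2 \tilde G$, where $M$ is the matrix
$\displaystyle M := \begin{pmatrix} 0 & 0 & 0 & 0 & 0 \\ 1 & |A|^2 & |B|^2 & |C|^2 & |D|^2 \\ 1 & |A|^2 & |B|^2 & |C|^2 & |D|^2 \\ 1 & |A|^2 & |B|^2 & |C|^2 & |D|^2 \\ 1 & |A|^2 & |B|^2 & |C|^2 & |D|^2 \end{pmatrix}$
and $\tilde G$ is the (augmented) Gram matrix
$\displaystyle \tilde G := \begin{pmatrix} 0 & 0 & 0 & 0 & 0 \\ 0 & A \cdot A & A \cdot B & A \cdot C & A \cdot D \\ 0 & B \cdot A & B \cdot B & B \cdot C & B \cdot D \\ 0 & C \cdot A & C \cdot B & C \cdot C & C \cdot D \\ 0 & D \cdot A & D \cdot B & D \cdot C & D \cdot D \end{pmatrix}.$
The matrix $M$ is a rank one matrix, and so $M^T$ is also. The Gram matrix $\tilde G$ factorises as $\tilde G = \tilde \Sigma \tilde \Sigma^T$, where $\tilde \Sigma$ is the $5 \times 2$ matrix with rows $0, A, B, C, D$, and thus has rank at most $2$. Therefore the matrix $M + M^T - 2 \tilde G$ in (1) has rank at most $1 + 1 + 2 = 4$, and hence has determinant zero as claimed.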
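For instance, if we know that $|AB| = |AC| = |DB| = |DC| = 1$ and $|BC| = \sqrt{2}$, then in order for $A,B,C,D$ to be coplanar, the remaining distance $|AD|$ has to obey the equation
$\displaystyle \det \begin{pmatrix} 0 & 1 & 1 & 1 & 1 \\ 1 & 0 & 1 & 1 & |AD|^2 \\ 1 & 1 & 0 & 2 & 1 \\ 1 & 1 & 2 & 0 & 1 \\ 1 & |AD|^2 & 1 & 1 & 0 \end{pmatrix} = 0.$
After some calculation the left-hand side simplifies to $-4|AD|^4 + 8|AD|^2$, so the non-negative quantity $|AD|$ is constrained to equal either $0$ or $\sqrt{2}$. The former happens when $A,B,C$ form a unit right-angled triangle with right angle at $A$ and $D = A$; the latter happens when $A,B,D,C$ form the vertices of a unit square traversed in that order. Any other value for $|AD|$ is not compatible with the hypothesis of $A,B,C,D$ lying on a plane; hence the Cayley-Menger determinant can be used as a test for planarity.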
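The planarity test can be tried out directly. Below is a minimal sketch in Python using exact rational arithmetic; the helper names `det`, `cayley_menger` and `example` are illustrative choices of mine, and the function takes the six squared distances as input so that $|BC|^2 = 2$ can be entered exactly.

```python
from fractions import Fraction

def det(matrix):
    """Determinant by Gaussian elimination; exact because entries are Fractions."""
    a = [[Fraction(x) for x in row] for row in matrix]
    n = len(a)
    result = Fraction(1)
    for col in range(n):
        # Look for a non-zero pivot at or below the diagonal.
        pivot = next((r for r in range(col, n) if a[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            result = -result  # a row swap flips the sign
        result *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return result

def cayley_menger(ab2, ac2, ad2, bc2, bd2, cd2):
    """The planar Cayley-Menger determinant, from SQUARED distances."""
    return det([
        [0, 1,   1,   1,   1],
        [1, 0,   ab2, ac2, ad2],
        [1, ab2, 0,   bc2, bd2],
        [1, ac2, bc2, 0,   cd2],
        [1, ad2, bd2, cd2, 0],
    ])

def example(ad2):
    # |AB| = |AC| = |DB| = |DC| = 1 and |BC|^2 = 2, with |AD|^2 left free.
    return cayley_menger(1, 1, ad2, 2, 1, 1)

for ad2 in (0, 1, 2):
    print("|AD|^2 =", ad2, " determinant =", example(ad2))
```

The determinant vanishes only for $|AD|^2 = 0$ and $|AD|^2 = 2$, matching the two planar configurations described above; any other input gives a non-zero value and so fails the planarity test.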
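Now suppose that we have four points $A,B,C,D$ on a sphere $S_R$ of radius $R$, with six distances $|AB|_R, |AC|_R, |AD|_R, |BC|_R, |BD|_R, |CD|_R$ now measured as lengths of arcs on the sphere. There is a spherical analogue of the Cayley-Menger determinant:
Proposition 2 (Spherical Cayley-Menger determinant) If $A,B,C,D$ are four points on a sphere $S_R$ of radius $R$ in ${\bf R}^3$, then the spherical Cayley-Menger determinant
$\displaystyle \det \begin{pmatrix} 1 & \cos \frac{|AB|_R}{R} & \cos \frac{|AC|_R}{R} & \cos \frac{|AD|_R}{R} \\ \cos \frac{|AB|_R}{R} & 1 & \cos \frac{|BC|_R}{R} & \cos \frac{|BD|_R}{R} \\ \cos \frac{|AC|_R}{R} & \cos \frac{|BC|_R}{R} & 1 & \cos \frac{|CD|_R}{R} \\ \cos \frac{|AD|_R}{R} & \cos \frac{|BD|_R}{R} & \cos \frac{|CD|_R}{R} & 1 \end{pmatrix} \ \ \ \ \ (2)$
vanishes.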
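Proof: We can assume that the sphere $S_R$ is centred at the origin of ${\bf R}^3$, and view $A,B,C,D$ as vectors in ${\bf R}^3$ of magnitude $R$. The angle subtended by $AB$ from the origin is $|AB|_R/R$, so by the cosine rule we have
$\displaystyle A \cdot B = R^2 \cos \frac{|AB|_R}{R}.$
Similarly for all the other inner products. Thus the matrix in (2) can be written as $R^{-2} G$, where $G$ is the Gram matrix
$\displaystyle G := \begin{pmatrix} A \cdot A & A \cdot B & A \cdot C & A \cdot D \\ B \cdot A & B \cdot B & B \cdot C & B \cdot D \\ C \cdot A & C \cdot B & C \cdot C & C \cdot D \\ D \cdot A & D \cdot B & D \cdot C & D \cdot D \end{pmatrix}.$
We can factor $G = \Sigma \Sigma^T$ where $\Sigma$ is the $4 \times 3$ matrix with rows $A, B, C, D$. Thus $R^{-2} G$ has rank at most $3$ and thus the determinant vanishes as required.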
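Just as the Cayley-Menger determinant can be used to test for coplanarity, the spherical Cayley-Menger determinant can be used to test for lying on a sphere of radius $R$. For instance, if we know that $A,B,C,D$ lie on $S_R$ and $|AB|_R, |AC|_R, |BC|_R, |BD|_R, |CD|_R$ are all equal to $\pi R/2$, then the above proposition gives
$\displaystyle \det \begin{pmatrix} 1 & 0 & 0 & \cos \frac{|AD|_R}{R} \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ \cos \frac{|AD|_R}{R} & 0 & 0 & 1 \end{pmatrix} = 0.$
The left-hand side evaluates to $1 - \cos^2 \frac{|AD|_R}{R}$; as $|AD|_R$ lies between $0$ and $\pi R$, the only choices for this distance are then $0$ and $\pi R$. The former happens for instance when $A$ lies on the north pole, $B, C$ are points on the equator with longitudes differing by 90 degrees, and $D$ is also equal to the north pole; the latter occurs when $D$ is instead placed on the south pole.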
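The pole-and-equator example can be checked numerically. The following is a small sketch in floating-point Python (the function names are my own, and the unit radius is an arbitrary choice): it builds the $4 \times 4$ matrix of cosines from Proposition 2 with five of the six arc lengths set to a quarter of a great circle, and evaluates the determinant for a few trial values of the remaining arc length.

```python
import math

def det(matrix):
    """Determinant by Gaussian elimination with partial pivoting (floats)."""
    a = [[float(x) for x in row] for row in matrix]
    n = len(a)
    result = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if a[pivot][col] == 0.0:
            return 0.0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            result = -result
        result *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return result

def spherical_cayley_menger(dist, radius):
    """Determinant of the matrix of cos(arc length / radius)."""
    return det([[math.cos(x / radius) for x in row] for row in dist])

def pole_equator_example(ad, radius=1.0):
    quarter = math.pi * radius / 2  # a quarter of a great circle
    # Points in the order A, B, C, D; only the arc length |AD| is left free.
    dist = [
        [0.0,     quarter, quarter, ad],
        [quarter, 0.0,     quarter, quarter],
        [quarter, quarter, 0.0,     quarter],
        [ad,      quarter, quarter, 0.0],
    ]
    return spherical_cayley_menger(dist, radius)

for ad in (0.0, math.pi / 3, math.pi / 2, math.pi):
    print("|AD| =", round(ad, 4), " determinant =", pole_equator_example(ad))
```

Up to rounding error the output follows $1 - \cos^2(|AD|_R/R)$, which is zero only at the two endpoints $|AD|_R = 0$ and $|AD|_R = \pi R$.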
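The Cayley-Menger and spherical Cayley-Menger determinants look slightly different from each other, but one can transform the latter into something resembling the former by row and column operations. Indeed, the determinant (2) can be rewritten as
$\displaystyle \det \begin{pmatrix} 1 & 0 & 0 & 0 & 0 \\ 1 & 1 & \cos \frac{|AB|_R}{R} & \cos \frac{|AC|_R}{R} & \cos \frac{|AD|_R}{R} \\ 1 & \cos \frac{|AB|_R}{R} & 1 & \cos \frac{|BC|_R}{R} & \cos \frac{|BD|_R}{R} \\ 1 & \cos \frac{|AC|_R}{R} & \cos \frac{|BC|_R}{R} & 1 & \cos \frac{|CD|_R}{R} \\ 1 & \cos \frac{|AD|_R}{R} & \cos \frac{|BD|_R}{R} & \cos \frac{|CD|_R}{R} & 1 \end{pmatrix}$
and by further row and column operations, this determinant vanishes if and only if the determinant
$\displaystyle \det \begin{pmatrix} \frac{1}{2R^2} & 1 & 1 & 1 & 1 \\ 1 & 0 & f_R(|AB|_R) & f_R(|AC|_R) & f_R(|AD|_R) \\ 1 & f_R(|AB|_R) & 0 & f_R(|BC|_R) & f_R(|BD|_R) \\ 1 & f_R(|AC|_R) & f_R(|BC|_R) & 0 & f_R(|CD|_R) \\ 1 & f_R(|AD|_R) & f_R(|BD|_R) & f_R(|CD|_R) & 0 \end{pmatrix} \ \ \ \ \ (3)$
vanishes, where $f_R(x) := 2R^2 (1 - \cos \frac{x}{R}) = 4R^2 \sin^2 \frac{x}{2R}$. In the limit $R \rightarrow \infty$ (so that the curvature of the sphere $S_R$ tends to zero), $\frac{1}{2R^2}$ tends to $0$, and by Taylor expansion $f_R(|AB|_R)$ tends to $|AB|^2$; similarly for the other distances. Now we see that the planar Cayley-Menger determinant emerges as the limit of (3) as $R \rightarrow \infty$, as would be expected from the intuition that a plane is essentially a sphere of infinite radius.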
In principle, one can now estimate the radius $R$ of the Earth (assuming that it is either a sphere $S_R$ or a flat plane $S_\infty$) if one is given the six distances $|AB|_R, |AC|_R, |AD|_R, |BC|_R, |BD|_R, |CD|_R$ between four points $A,B,C,D$ on the Earth. Of course, if one wishes to do so, one should have $A,B,C,D$ rather far apart from each other, since otherwise it would be difficult, for instance, to distinguish the round Earth from a flat one. As an experiment, and just for fun, I wanted to see how accurate this would be with some real world data. I decided to take $A$, $B$, $C$, $D$ to be the cities of London, Los Angeles, Tokyo, and Dubai respectively. As an initial test, I used distances from this online flight calculator, measured in kilometers:
Given that the true radius of the Earth is about $R_0 := 6371$ kilometers, I chose the change of variables $R = R_0/k$ (so that $k=1$ corresponds to the round Earth model with the commonly accepted value for the Earth’s radius, and $k=0$ corresponds to the flat Earth), and obtained the following plot for (3):
In particular, the determinant does indeed come very close to vanishing when $k=1$, which is unsurprising since, as explained on the web site, the online flight calculator uses a model in which the Earth is an ellipsoid of radii close to $6371$
km. There is another radius that would also be compatible with this data at $k \approx 1.3$ (corresponding to an Earth of radius about $4900$ km), but presumably one could rule this out as a spurious coincidence by experimenting with other quadruples of cities than the ones I selected. On the other hand, these distances are highly incompatible with the flat Earth model $k=0$; one could also see this with a piece of paper and a ruler by trying to lay down four points $A,B,C,D$ on the paper with (an appropriately rescaled) version of the above distances (e.g., with
,
, etc.).
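To illustrate the radius estimate without relying on the calculator's figures, here is a rough Python sketch on synthetic data. Everything specific in it is an assumption of mine rather than data from the post: the city coordinates are approximate, the six arc lengths are generated from a perfect sphere of radius 6371 km (so, like the calculator data, they already presuppose the answer), and the scan simply evaluates the absolute value of the spherical determinant on a grid of $k$ values with $R = R_0/k$.

```python
import math

R0 = 6371.0  # km; commonly quoted mean radius of the Earth

# Approximate (name, latitude, longitude) in degrees. Rough illustrative
# values only, not the data used in the post.
CITIES = [
    ("London", 51.5, -0.1),
    ("Los Angeles", 34.1, -118.2),
    ("Tokyo", 35.7, 139.7),
    ("Dubai", 25.2, 55.3),
]

def arc_length(p, q, radius):
    """Great-circle distance between two (name, lat, lon) entries."""
    lat1, lon1, lat2, lon2 = map(math.radians, (p[1], p[2], q[1], q[2]))
    cos_angle = (math.sin(lat1) * math.sin(lat2)
                 + math.cos(lat1) * math.cos(lat2) * math.cos(lon1 - lon2))
    return radius * math.acos(max(-1.0, min(1.0, cos_angle)))

def det(matrix):
    """Determinant by Gaussian elimination with partial pivoting (floats)."""
    a = [[float(x) for x in row] for row in matrix]
    n = len(a)
    result = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if a[pivot][col] == 0.0:
            return 0.0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            result = -result
        result *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return result

def spherical_det(dist, radius):
    """Spherical Cayley-Menger determinant for a trial radius."""
    return det([[math.cos(x / radius) for x in row] for row in dist])

def planar_det(dist):
    """Planar Cayley-Menger determinant (bordered matrix of squared distances)."""
    n = len(dist)
    return det([[0.0] + [1.0] * n] + [[1.0] + [x * x for x in row] for row in dist])

# Synthetic arc lengths on a sphere of radius R0.
dist = [[0.0 if p is q else arc_length(p, q, R0) for q in CITIES] for p in CITIES]

# Scan k = R0 / R on a grid and keep the k with the smallest |determinant|.
scan = {i / 100: abs(spherical_det(dist, R0 * 100 / i)) for i in range(50, 121)}
best_k = min(scan, key=scan.get)
print("best k on the grid:", best_k, " radius estimate (km):", R0 / best_k)

# Flat-Earth check, with distances rescaled to units of 1000 km.
flat = planar_det([[x / 1000 for x in row] for row in dist])
print("planar Cayley-Menger determinant:", flat)
```

The grid stops at $k = 1.2$ on purpose, so that any second zero at larger $k$ stays outside the scan. With real, noisy distances one would look for sign changes of the determinant rather than an exact minimum.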
If instead one goes to the flight time calculator and uses flight travel times instead of distances, one now gets the following data (measured in hours):
Assuming that planes travel at about $800$ kilometers per hour, the true radius of the Earth should be about $T_0 := 8$ hours of flight time. If one then uses the normalisation $R = T_0/k$, one obtains the following plot:
Not too surprisingly, this is basically a rescaled version of the previous plot, with vanishing near $k=1$ and at $k=1.3$. (The website for the flight calculator does say it calculates short and long haul flight times slightly differently, which may be the cause of the slight discrepancies between this figure and the previous one.)
Of course, these two data sets are “cheating” since they come from a model which already presupposes what the radius of the Earth is. But one can input real world flight times between these four cities instead of the above idealised data. Here one runs into the issue that the flight time from $A$ to $B$ is not necessarily the same as that from $B$ to $A$
due to such factors as windspeed. For instance, I looked up the online flight time from Tokyo to Dubai to be 11 hours and 10 minutes, whereas the online flight time from Dubai to Tokyo was 9 hours and 50 minutes. The simplest thing to do here is take an arithmetic mean of the two times as a preliminary estimate for the flight time without windspeed factors, thus for instance the Tokyo-Dubai flight time would now be 10 hours and 30 minutes, and more generally
This data is not too far off from the online calculator data, but it does distort the graph slightly (taking as before):
Now one gets estimates for the radius of the Earth that are off by about a factor of from the truth, although the
$k=1$ round Earth model is still twice as accurate as the flat Earth model $k=0$.
Given that windspeed should additively affect flight velocity rather than flight time, and the two are inversely proportional to each other, it is more natural to take a harmonic mean rather than an arithmetic mean. This gives the slightly different values
but one still gets essentially the same plot:
So the inaccuracies are presumably coming from some other source. (Note for instance that the true flight time from Tokyo to Dubai is about greater than the calculator predicts, while the flight time from LA to Dubai is about
less; these sorts of errors seem to pile up in this calculation.) Nevertheless, it does seem that flight time data is (barely) enough to establish the roundness of the Earth and obtain a somewhat ballpark estimate for its radius. (I assume that the fit would be better if one could include some Southern Hemisphere cities such as Sydney or Santiago, but I was not able to find a good quadruple of widely spaced cities on both hemispheres for which there were direct flights between all six pairs.)
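The difference between the two ways of averaging can be made concrete. If a route of length $d$ is flown at airspeed $v$ with a wind of speed $w$ along the route, the two flight times are $d/(v+w)$ and $d/(v-w)$; their harmonic mean is exactly $d/v$, whereas their arithmetic mean is $dv/(v^2-w^2)$, which is slightly too large. The short Python sketch below checks this and applies both means to the Tokyo-Dubai times quoted above; the values of `d`, `v`, `w` are hypothetical numbers chosen only for illustration.

```python
from fractions import Fraction

def arithmetic_mean(t1, t2):
    return (t1 + t2) / 2

def harmonic_mean(t1, t2):
    return 2 * t1 * t2 / (t1 + t2)

def hours_minutes(minutes):
    """Round to the nearest minute and split into (hours, minutes)."""
    return divmod(round(minutes), 60)

tokyo_to_dubai = Fraction(11 * 60 + 10)  # 11 h 10 min, as quoted in the text
dubai_to_tokyo = Fraction(9 * 60 + 50)   # 9 h 50 min, as quoted in the text

print("arithmetic:", hours_minutes(arithmetic_mean(tokyo_to_dubai, dubai_to_tokyo)))
print("harmonic:  ", hours_minutes(harmonic_mean(tokyo_to_dubai, dubai_to_tokyo)))

# Toy wind model with hypothetical numbers: distance (km), airspeed and wind (km/h).
d, v, w = Fraction(8000), Fraction(800), Fraction(60)
with_wind, against_wind = d / (v + w), d / (v - w)
print("harmonic mean recovers d/v:", harmonic_mean(with_wind, against_wind) == d / v)
print("arithmetic mean overshoots:", arithmetic_mean(with_wind, against_wind) > d / v)
```

For the Tokyo-Dubai pair the two means differ by only a few minutes, which is consistent with the observation that switching to the harmonic mean barely changes the plot.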
This is another sequel to a recent post in which I showed the Riemann zeta function $\zeta$ can be locally approximated by a polynomial, in the sense that for randomly chosen $t \in [T,2T]$ one has an approximation
$\displaystyle \zeta(\frac{1}{2} + it - \frac{2\pi i z}{\log T}) \approx P_t( e^{2\pi i z/N} ) \ \ \ \ \ (1)$
where $N$ grows slowly with $T$, and $P_t$ is a polynomial of degree $N$. It turns out that in the function field setting there is an exact version of this approximation which captures many of the known features of the Riemann zeta function, namely Dirichlet $L$-functions for a random character of given modulus over a function field. This model was (essentially) studied in a fairly recent paper by Andrade, Miller, Pratt, and Trinh; I am not sure if there is any further literature on this model beyond this paper (though the number field analogue of low-lying zeroes of Dirichlet $L$-functions is certainly well studied). In this model it is possible to set $N$ fixed and let $q$ go to infinity, thus providing a simple finite-dimensional model problem for problems involving the statistics of zeroes of the zeta function.
In this post I would like to record this analogue precisely. We will need a finite field ${\mathbb F}$ of some order $q$ and a natural number $N$, and set
$\displaystyle T := q^{N+1}.$
We will primarily think of $q$ as being large and $N$ as being either fixed or growing very slowly with $q$, though it is possible to also consider other asymptotic regimes (such as holding $q$ fixed and letting $N$ go to infinity). Let ${\mathbb F}[X]$ be the ring of polynomials of one variable $X$ with coefficients in ${\mathbb F}$, and let ${\mathbb F}[X]'$ be the multiplicative semigroup of monic polynomials in ${\mathbb F}[X]$; one should view ${\mathbb F}[X]$ and ${\mathbb F}[X]'$ as the function field analogue of the integers and natural numbers respectively. We use the valuation $|n| := q^{\mathrm{deg}(n)}$ for polynomials $n \in {\mathbb F}[X]$ (with $|0| = 0$); this is the analogue of the usual absolute value on the integers. We select an irreducible polynomial $Q \in {\mathbb F}[X]$ of size $|Q| = q^{N+1}$ (i.e., $Q$ has degree $N+1$). The multiplicative group $({\mathbb F}[X]/Q{\mathbb F}[X])^\times$ can be shown to be cyclic of order $|Q|-1$. A Dirichlet character of modulus $Q$ is a completely multiplicative function $\chi: {\mathbb F}[X] \rightarrow {\bf C}$ of modulus $Q$, that is periodic of period $Q$ and vanishes on those $n \in {\mathbb F}[X]$ not coprime to $Q$. From Fourier analysis we see that there are exactly $\phi(Q) := |Q|-1$ Dirichlet characters of modulus $Q$. A Dirichlet character is said to be odd if it is not identically one on the group ${\mathbb F}^\times$ of non-zero constants; there are only $\frac{|Q|-1}{q-1}$ non-odd characters (including the principal character), so in the limit $q \rightarrow \infty$ most Dirichlet characters are odd. We will work primarily with odd characters in order to be able to ignore the effect of the place at infinity.
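Let $\chi$ be an odd Dirichlet character of modulus $Q$. The Dirichlet $L$-function $L(s,\chi)$ is then defined (for $s \in {\bf C}$ of sufficiently large real part, at least) as
$\displaystyle L(s,\chi) := \sum_{n \hbox{ monic}} \frac{\chi(n)}{|n|^s} = \sum_{m=0}^\infty q^{-sm} \sum_{n \hbox{ monic}: |n| = q^m} \chi(n).$
Note that for $m \geq N+1$, the set of monic $n$ with $|n| = q^m$ is invariant under shifts $n \mapsto n+h$ whenever $|h| < |Q|$; since this covers a full set of residue classes of ${\mathbb F}[X]/Q{\mathbb F}[X]$, and the odd character $\chi$ has mean zero on this set of residue classes, we conclude that the sum $\sum_{n \hbox{ monic}: |n| = q^m} \chi(n)$ vanishes for $m \geq N+1$. In particular, the $L$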
-function is entire, and for any real number
and complex number
, we can write the
-function as a polynomial
where and the coefficients
are given by the formula
Note that can easily be normalised to zero by the relation
In particular, the dependence on is periodic with period
(so by abuse of notation one could also take
to be an element of
).
Fourier inversion yields a functional equation for the polynomial :
Proposition 1 (Functional equation) Let
be an odd Dirichlet character of modulus
, and
. There exists a phase
(depending on
) such that
for all
, or equivalently that
where
.
Proof: We can normalise . Let
be the finite field
. We can write
where denotes the subgroup of
consisting of (residue classes of) polynomials of degree less than
. Let
be a non-trivial character of
whose kernel lies in the space
(this is easily achieved by pulling back a non-trivial character from the quotient
). We can use the Fourier inversion formula to write
where
From change of variables we see that is a scalar multiple of
; from Plancherel we conclude that
for some phase . We conclude that
The inner sum equals
if
, and vanishes otherwise, thus
For in
,
and the contribution of the sum vanishes as
is odd. Thus we may restrict
to
, so that
By the multiplicativity of , this factorises as
From the one-dimensional version of (3) (and the fact that is odd) we have
for some phase . The claim follows.
As one corollary of the functional equation, is a phase rotation of
and thus is non-zero, so
has degree exactly
. The functional equation is then equivalent to the
zeroes of
being symmetric across the unit circle. In fact we have the stronger
Theorem 2 (Riemann hypothesis for Dirichlet $L$-functions over function fields) Let $\chi$ be an odd Dirichlet character of modulus $Q$, and $t \in {\bf R}$. Then all the zeroes of $P$ lie on the unit circle.
We derive this result from the Riemann hypothesis for curves over function fields below the fold.
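For very small parameters one can check the theorem by brute force. The sketch below is only an illustration under assumptions of my own choosing: it takes $q = 3$, the modulus $Q = X^3 + 2X + 1$ (which has no roots in ${\mathbb F}_3$ and is therefore irreducible), so that $N = 2$, and it normalises $t = 0$, taking the coefficient of $Z^m$ to be $q^{-m/2}$ times the sum of $\chi(n)$ over monic $n$ of degree $m$. The character is built from a discrete logarithm table as $\chi(g^k) = e^{2\pi i r k/(q^3-1)}$ with $r = 1$; since the constant $2$ equals $g^{13}$, this character takes the value $-1$ there and is odd.

```python
import cmath
import math

q = 3                 # order of the field F_3
Q = (1, 2, 0, 1)      # hypothetical modulus X^3 + 2X + 1, coefficients low to high
d = len(Q) - 1        # degree of Q; the polynomial P then has degree N = d - 1
ORDER = q ** d - 1    # size of the multiplicative group of F_3[X]/Q

def reduce_mod_Q(coeffs):
    """Reduce a coefficient list (low to high) modulo Q and modulo q."""
    c = [x % q for x in coeffs]
    while len(c) > d:
        lead = c.pop()          # coefficient of the current top power of X
        shift = len(c) - d
        for i in range(d):
            # Q is monic, so X^d = -(Q[0] + Q[1] X + ... + Q[d-1] X^(d-1)).
            c[shift + i] = (c[shift + i] - lead * Q[i]) % q
    return tuple(c + [0] * (d - len(c)))

def multiply(a, b):
    prod = [0] * (2 * d - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            prod[i + j] += x * y
    return reduce_mod_Q(prod)

def discrete_log_table():
    """Search for a generator g and return the table {g^k: k}."""
    for code in range(1, q ** d):
        g = tuple((code // q ** i) % q for i in range(d))
        table, power = {}, reduce_mod_Q([1])
        for k in range(ORDER):
            if power in table:
                break
            table[power] = k
            power = multiply(power, g)
        if len(table) == ORDER:
            return table
    raise ValueError("no generator found; Q is presumably not irreducible")

def coefficients(r=1):
    """Coefficients of P at t = 0 for the character chi(g^k) = e(r k / ORDER)."""
    table = discrete_log_table()
    out = []
    for m in range(d):
        total = 0
        for code in range(q ** m):
            lower = [(code // q ** i) % q for i in range(m)]
            n = reduce_mod_Q(lower + [1])   # a monic polynomial of degree m
            total += cmath.exp(2j * math.pi * r * table[n] / ORDER)
        out.append(total / q ** (m / 2))
    return out

c0, c1, c2 = coefficients()
disc = cmath.sqrt(c1 * c1 - 4 * c2 * c0)
roots = [(-c1 + disc) / (2 * c2), (-c1 - disc) / (2 * c2)]
print("|c_2| =", abs(c2))
print("moduli of the zeroes:", [abs(z) for z in roots])
```

Other odd values of `r` give the other odd characters for this modulus. The search is exponential in the degree, so this is only practical as a sanity check for tiny $q$ and $N$.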
In view of this theorem (and the fact that ), we may write
for some unitary matrix
. It is possible to interpret
as the action of the geometric Frobenius map on a certain cohomology group, but we will not do so here. The situation here is simpler than in the number field case because the factor
arising from very small primes is now absent (in the function field setting there are no primes of size between
and
).
We now let vary uniformly at random over all odd characters of modulus
, and
uniformly over
, independently of
; we also make the distribution of the random variable
conjugation invariant in
. We use
to denote the expectation with respect to this randomness. One can then ask what the limiting distribution of
is in various regimes; we will focus in this post on the regime where
is fixed and
is being sent to infinity. In the spirit of the Sato-Tate conjecture, one should expect
to converge in distribution to the circular unitary ensemble (CUE), that is to say Haar probability measure on
. This may well be provable from Deligne’s “Weil II” machinery (in the spirit of this monograph of Katz and Sarnak), though I do not know how feasible this is or whether it has already been done in the literature; here we shall avoid using this machinery and study what partial results towards this CUE hypothesis one can make without it.
If one lets be the eigenvalues of
(ordered arbitrarily), then we now have
and hence the are essentially elementary symmetric polynomials of the eigenvalues:
One can take log derivatives to conclude
On the other hand, as in the number field case one has the Dirichlet series expansion
where has sufficiently large real part,
, and the von Mangoldt function
is defined as
when
is the power of an irreducible
and
otherwise. We conclude the “explicit formula”
for , where
Similarly on inverting we have
Since we also have
for sufficiently large real part, where the Möbius function
is equal to
when
is the product of
distinct irreducibles, and
otherwise, we conclude that the Möbius coefficients
are just the complete homogeneous symmetric polynomials of the eigenvalues:
One can then derive various algebraic relationships between the coefficients from various identities involving symmetric polynomials, but we will not do so here.
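As a small numerical illustration of the two families of symmetric polynomials, the sketch below assumes the normalisation $P(Z) = \det(1 - ZU) = \prod_j (1 - Z \lambda_j)$ (the sign conventions are my assumption). Under it, the coefficient of $Z^m$ in $P(Z)$ is $(-1)^m e_m(\lambda)$ and the coefficient of $Z^m$ in $1/P(Z)$ is $h_m(\lambda)$. The eigenvalues are arbitrary points on the unit circle chosen for the example.

```python
import cmath
from itertools import combinations, combinations_with_replacement
from math import prod

def char_poly_coeffs(eigs):
    """Coefficients of prod_j (1 - Z * lam_j), lowest degree first."""
    coeffs = [1 + 0j]
    for lam in eigs:
        # Multiply the current polynomial by (1 - lam * Z).
        coeffs = [a - lam * b for a, b in zip(coeffs + [0], [0] + coeffs)]
    return coeffs

def inverse_series(coeffs, terms):
    """First `terms` power series coefficients of 1/P(Z), assuming P(0) = 1."""
    inv = []
    for m in range(terms):
        acc = 1 if m == 0 else 0
        for i in range(1, min(m, len(coeffs) - 1) + 1):
            acc -= coeffs[i] * inv[m - i]
        inv.append(acc)
    return inv

def elementary(eigs, m):
    """e_m by brute force: sum over products of m distinct eigenvalues."""
    return sum(prod(c) for c in combinations(eigs, m))

def complete_homogeneous(eigs, m):
    """h_m by brute force: sum over all degree-m monomials."""
    return sum(prod(c) for c in combinations_with_replacement(eigs, m))

eigs = [cmath.exp(1j * theta) for theta in (0.3, 1.1, 2.0)]  # arbitrary phases
coeffs = char_poly_coeffs(eigs)
inv = inverse_series(coeffs, 6)

for m in range(4):
    print(m, abs(coeffs[m] - (-1) ** m * elementary(eigs, m)))
for m in range(6):
    print(m, abs(inv[m] - complete_homogeneous(eigs, m)))
```

All printed differences are at the level of rounding error. The recursion in `inverse_series` is the identity $\sum_{i=0}^m (-1)^i e_i h_{m-i} = 0$ for $m \geq 1$, one of the relationships between the two families alluded to above.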
What do we know about the distribution of ? By construction, it is conjugation-invariant; from (2) it is also invariant with respect to the rotations
for any phase
. We also have the function field analogue of the Rudnick-Sarnak asymptotics:
Proposition 3 (Rudnick-Sarnak asymptotics) Let
be nonnegative integers. If
is equal to
in the limit
(holding
fixed) unless
for all
, in which case it is equal to
Comparing this with Proposition 1 from this previous post, we thus see that all the low moments of are consistent with the CUE hypothesis (and also with the ACUE hypothesis, again by the previous post). The case
of this proposition was essentially established by Andrade, Miller, Pratt, and Trinh.
Proof: We may assume the homogeneity relationship
since otherwise the claim follows from the invariance under phase rotation . By (6), the expression (9) is equal to
where
and consists of
copies of
for each
, and similarly
consists of
copies of
for each
.
The polynomials and
are monic of degree
, which by hypothesis is less than the degree of
, and thus they can only be scalar multiples of each other in
if they are identical (in
). As such, we see that the average
vanishes unless , in which case this average is equal to
. Thus the expression (9) simplifies to
There are at most choices for the product
, and each one contributes
to the above sum. All but
of these choices are square-free, so by accepting an error of
, we may restrict attention to square-free
. This forces
to all be irreducible (as opposed to powers of irreducibles); as
is a unique factorisation domain, this forces
and
to be a permutation of
. By the size restrictions, this then forces
for all
(if the above expression is to be anything other than
), and each
is associated to
possible choices of
. Writing
and then reinstating the non-squarefree possibilities for
, we can thus write the above expression as
Using the prime number theorem , we obtain the claim.
Comparing this with Proposition 1 from this previous post, we thus see that all the low moments of are consistent with the CUE and ACUE hypotheses:
Corollary 4 (CUE statistics at low frequencies) Let
be the eigenvalues of
, permuted uniformly at random. Let
be a linear combination of monomials
where
are integers with either
or
. Then
The analogue of the GUE hypothesis in this setting would be the CUE hypothesis, which asserts that the threshold here can be replaced by an arbitrarily large quantity. As far as I know this is not known even for
(though, as mentioned previously, in principle one may be able to resolve such cases using Deligne’s proof of the Riemann hypothesis for function fields). Among other things, this would allow one to distinguish CUE from ACUE, since as discussed in the previous post, these two distributions agree when tested against monomials up to threshold
, though not to
.
Proof: By permutation symmetry we can take to be symmetric, and by linearity we may then take
to be the symmetrisation of a single monomial
. If
then both expectations vanish due to the phase rotation symmetry, so we may assume that
and
. We can write this symmetric polynomial as a constant multiple of
plus other monomials with a smaller value of
. Since
, the claim now follows by induction from Proposition 3 and Proposition 1 from the previous post.
Thus, for instance, for , the
moment
is equal to
because all the monomials in are of the required form when
. The latter expectation can be computed exactly (for any natural number
) using a formula
of Baker-Forrester and Keating-Snaith, thus for instance
and more generally
when , where
are the integers
and more generally
(OEIS A039622). Thus we have
for if
and
is sufficiently slowly growing depending on
. The CUE hypothesis would imply that this formula also holds for higher
. (The situation here is cleaner than in the number field case, in which the GUE hypothesis only suggests the correct lower bound for the moments rather than an asymptotic, due to the absence of the wildly fluctuating additional factor
that is present in the Riemann zeta function model.)
Now we can recover the analogue of Montgomery’s work on the pair correlation conjecture. Consider the statistic
where
is some finite linear combination of monomials independent of
. We can expand the above sum as
Assuming the CUE hypothesis, then by Example 3 of the previous post, we would conclude that
This is the analogue of Montgomery’s pair correlation conjecture. Proposition 3 implies that this claim is true whenever is supported on
. If instead we assume the ACUE hypothesis (or the weaker Alternative Hypothesis that the phase gaps are non-zero multiples of
), one should instead have
for arbitrary ; this is the function field analogue of a recent result of Baluyot. In any event, since
is non-negative, we unconditionally have the lower bound
if is non-negative for
.
By applying (12) for various choices of test functions we can obtain various bounds on the behaviour of eigenvalues. For instance suppose we take the Fejér kernel
Then (12) applies unconditionally and we conclude that
The right-hand side evaluates to . On the other hand,
is non-negative, and equal to
when
. Thus
The sum is at least
, and is at least
if
is not a simple eigenvalue. Thus
and thus the expected number of simple eigenvalues is at least ; in particular, at least two thirds of the eigenvalues are simple asymptotically on average. If we had (12) without any restriction on the support of
, the same arguments allow one to show that the expected proportion of simple eigenvalues is
.
Suppose that the phase gaps in are all greater than
almost surely. Let
is non-negative and
non-positive for
outside of the arc
. Then from (13) one has
so by taking contrapositives one can force the existence of a gap less than asymptotically if one can find
with
non-negative,
non-positive for
outside of the arc
, and for which one has the inequality
By a suitable choice of (based on a minorant of Selberg) one can ensure this for
for
large; see Section 5 of these notes of Goldston. This is not the smallest value of
currently obtainable in the literature for the number field case (which is currently
, due to Goldston and Turnage-Butterbaugh, by a somewhat different method), but is still significantly less than the trivial value of
. On the other hand, due to the compatibility of the ACUE distribution with Proposition 3, it is not possible to lower
below
purely through the use of Proposition 3.
In some cases it is possible to go beyond Proposition 3. Consider the mollified moment
where
for some coefficients . We can compute this moment in the CUE case:
Proposition 5 We have
Proof: From (5) one has
hence
where we suppress the dependence on the eigenvalues . Now observe the Pieri formula
where are the hook Schur polynomials
and we adopt the convention that vanishes for
, or when
and
. Then
also vanishes for
. We conclude that
As the Schur polynomials are orthonormal on the unitary group, the claim follows.
The CUE hypothesis would then imply the corresponding mollified moment conjecture
(See this paper of Conrey, and this paper of Radziwill, for some discussion of the analogous conjecture for the zeta function, which is essentially due to Farmer.)
From Proposition 3 one sees that this conjecture holds in the range . It is likely that the function field analogue of the calculations of Conrey (based ultimately on deep exponential sum estimates of Deshouillers and Iwaniec) can extend this range to
for any
, if
is sufficiently large depending on
; these bounds thus go beyond what is available from Proposition 3. On the other hand, as discussed in Remark 7 of the previous post, ACUE would also predict (14) for
as large as
, so the available mollified moment estimates are not strong enough to rule out ACUE. It would be interesting to see if there is some other estimate in the function field setting that can be used to exclude the ACUE hypothesis (possibly one that exploits the fact that GRH is available in the function field case?).
In a recent post I discussed how the Riemann zeta function $\zeta$ can be locally approximated by a polynomial, in the sense that for randomly chosen $t \in [T,2T]$ one has an approximation
$\displaystyle \zeta(\frac{1}{2} + it - \frac{2\pi i z}{\log T}) \approx P_t( e^{2\pi i z/N} ) \ \ \ \ \ (1)$
where $N$ grows slowly with $T$, and $P_t$ is a polynomial of degree $N$. Assuming the Riemann hypothesis (as we will throughout this post), the zeroes of $P_t$ should all lie on the unit circle, and one should then be able to write $P_t$ as a scalar multiple of the characteristic polynomial of (the inverse of) a unitary matrix $U = U_t \in U(N)$, which we normalise as
$\displaystyle P_t(Z) = \exp(A_t) \det(1 - ZU). \ \ \ \ \ (2)$
Here $A_t$ is some quantity depending on $t$. We view $U$ as a random element of $U(N)$; in the limit $T \rightarrow \infty$, the GUE hypothesis is equivalent to $U$ becoming equidistributed with respect to Haar measure on $U(N)$ (also known as the Circular Unitary Ensemble, CUE; it is to the unit circle what the Gaussian Unitary Ensemble (GUE) is on the real line). One can also view $U$ as analogous to the “geometric Frobenius” operator in the function field setting, though unfortunately it is difficult at present to make this analogy any more precise (due, among other things, to the lack of a sufficiently satisfactory theory of the “field of one element”).
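Taking logarithmic derivatives of (2), we have
$\displaystyle -\frac{P'_t(Z)}{P_t(Z)} = \mathrm{tr}( U (1-ZU)^{-1} ) = \sum_{j=1}^\infty Z^{j-1} \mathrm{tr} U^j \ \ \ \ \ (3)$
and hence on taking logarithmic derivatives of (1) in the $z$ variable we (heuristically) have
$\displaystyle -\frac{2\pi i}{\log T} \frac{\zeta'}{\zeta}(\frac{1}{2} + it - \frac{2\pi i z}{\log T}) \approx -\frac{2\pi i}{N} \sum_{j=1}^\infty e^{2\pi i j z/N} \mathrm{tr} U^j.$
Morally speaking, we have
$\displaystyle -\frac{\zeta'}{\zeta}(\frac{1}{2} + it - \frac{2\pi i z}{\log T}) = \sum_{n=1}^\infty \frac{\Lambda(n)}{n^{1/2+it}} e^{2\pi i z (\log n/\log T)}$
so on comparing coefficients we expect to interpret the moments $\mathrm{tr} U^j$ of $U$ as a finite Dirichlet series:
$\displaystyle \mathrm{tr} U^j \approx \frac{N}{\log T} \sum_{T^{(j-1)/N} < n \leq T^{j/N}} \frac{\Lambda(n)}{n^{1/2+it}}. \ \ \ \ \ (4)$
To understand the distribution of $U$ in the unitary group $U(N)$, it suffices to understand the distribution of the moments
$\displaystyle {\bf E}_t \prod_{j=1}^k (\mathrm{tr} U^j)^{a_j} (\overline{\mathrm{tr} U^j})^{b_j} \ \ \ \ \ (5)$
where ${\bf E}_t$ denotes averaging over $t \in [T,2T]$, and $k, a_1,\dots,a_k, b_1,\dots,b_k \geq 0$. The GUE hypothesis asserts that in the limit $T \rightarrow \infty$, these moments converge to their CUE counterparts
$\displaystyle {\bf E}_{CUE} \prod_{j=1}^k (\mathrm{tr} U^j)^{a_j} (\overline{\mathrm{tr} U^j})^{b_j} \ \ \ \ \ (6)$
where $U$ is now drawn uniformly in $U(N)$ with respect to the CUE ensemble, and ${\bf E}_{CUE}$ denotes expectation with respect to that measure.
The moment (6) vanishes unless one has the homogeneity condition
$\displaystyle \sum_{j=1}^k j a_j = \sum_{j=1}^k j b_j. \ \ \ \ \ (7)$
This follows from the fact that for any phase $\theta \in {\bf R}$, $e(\theta) U$ has the same distribution as $U$, where we use the number theory notation $e(\theta) := e^{2\pi i \theta}$.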
In the case when the degree $\sum_{j=1}^k j a_j$ is low, we can use representation theory to establish the following simple formula for the moment (6), as evaluated by Diaconis and Shahshahani:
Proposition 1 (Low moments in CUE model) If
$\displaystyle \sum_{j=1}^k j a_j \leq N, \ \ \ \ \ (8)$
then the moment (6) vanishes unless $a_j = b_j$ for all $j$, in which case it is equal to
$\displaystyle \prod_{j=1}^k j^{a_j} a_j!. \ \ \ \ \ (9)$
Another way of viewing this proposition is that for $U$ distributed according to CUE, the random variables $\mathrm{tr} U^j$ are distributed like independent complex random variables of mean zero and variance $j$, as long as one only considers moments obeying (8). This identity definitely breaks down for larger values of $a_j$, so one only obtains central limit theorems in certain limiting regimes, notably when one only considers a fixed number of $j$’s and lets $N$
go to infinity. (The paper of Diaconis and Shahshahani writes
in place of
, but I believe this to be a typo.)
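Proof: Let $D$ be the left-hand side of (8). We may assume that (7) holds since we are done otherwise, hence
$\displaystyle D = \sum_{j=1}^k j a_j = \sum_{j=1}^k j b_j \leq N.$
Our starting point is Schur-Weyl duality. Namely, we consider the $N^D$-dimensional complex vector space
$\displaystyle ({\bf C}^N)^{\otimes D} = {\bf C}^N \otimes \dots \otimes {\bf C}^N.$
This space has an action of the product group $S_D \times GL_N({\bf C})$: the symmetric group $S_D$ acts by permutation on the $D$ tensor factors, while the general linear group $GL_N({\bf C})$ acts diagonally on the ${\bf C}^N$ factors, and the two actions commute with each other. Schur-Weyl duality gives a decomposition
$\displaystyle ({\bf C}^N)^{\otimes D} \equiv \bigoplus_\lambda V^\lambda_{S_D} \otimes V^\lambda_{GL_N({\bf C})} \ \ \ \ \ (10)$
where $\lambda$ ranges over Young tableaux of size $D$ with at most $N$ rows, $V^\lambda_{S_D}$ is the $S_D$-irreducible unitary representation corresponding to $\lambda$ (which can be constructed for instance using Specht modules), and $V^\lambda_{GL_N({\bf C})}$ is the $GL_N({\bf C})$-irreducible polynomial representation with highest weight $\lambda$.
Let $\pi \in S_D$ be a permutation consisting of $a_j$ cycles of length $j$ for each $j = 1,\dots,k$ (this is uniquely determined up to conjugation), and let $g \in GL_N({\bf C})$. The pair $(\pi,g)$ then acts on $({\bf C}^N)^{\otimes D}$, with the action on basis elements $e_{i_1} \otimes \dots \otimes e_{i_D}$ given by
$\displaystyle g e_{i_{\pi(1)}} \otimes \dots \otimes g e_{i_{\pi(D)}}.$
The trace of this action can then be computed as
$\displaystyle \sum_{i_1,\dots,i_D \in \{1,\dots,N\}} g_{i_1, i_{\pi(1)}} \dots g_{i_D, i_{\pi(D)}}$
where $g_{i,j}$ is the $ij$ matrix coefficient of $g$. Breaking up into cycles and summing, this is just
$\displaystyle \prod_{j=1}^k \mathrm{tr}(g^j)^{a_j}.$
But we can also compute this trace using the Schur-Weyl decomposition (10), yielding the identity
$\displaystyle \prod_{j=1}^k \mathrm{tr}(g^j)^{a_j} = \sum_\lambda \chi_\lambda(\pi) s_\lambda(g) \ \ \ \ \ (11)$
where $\chi_\lambda: S_D \rightarrow {\bf R}$ is the character on $S_D$ associated to $V^\lambda_{S_D}$, and $s_\lambda: GL_N({\bf C}) \rightarrow {\bf C}$ is the character on $GL_N({\bf C})$ associated to $V^\lambda_{GL_N({\bf C})}$. As is well known, $s_\lambda(g)$ is just the Schur polynomial of weight $\lambda$ applied to the (algebraic, generalised) eigenvalues of $g$. We can specialise to unitary matrices to conclude that
$\displaystyle \prod_{j=1}^k \mathrm{tr}(U^j)^{a_j} = \sum_\lambda \chi_\lambda(\pi) s_\lambda(U)$
and similarly
$\displaystyle \prod_{j=1}^k \mathrm{tr}(U^j)^{b_j} = \sum_\lambda \chi_\lambda(\pi') s_\lambda(U)$
where $\pi' \in S_D$ consists of $b_j$ cycles of length $j$ for each $j = 1,\dots,k$. On the other hand, the characters $s_\lambda$ are an orthonormal system on $L^2(U(N))$ with the CUE measure. Thus we can write the expectation (6) as
$\displaystyle \sum_\lambda \chi_\lambda(\pi) \overline{\chi_\lambda(\pi')}. \ \ \ \ \ (12)$
Now recall that $\lambda$ ranges over all the Young tableaux of size $D$ with at most $N$ rows. But by (8) we have $D \leq N$, and so the condition of having at most $N$ rows is redundant. Hence $\lambda$ now ranges over all Young tableaux of size $D$, which as is well known enumerates all the irreducible representations of $S_D$. One can then use the standard orthogonality properties of characters to show that the sum (12) vanishes if $\pi$, $\pi'$ are not conjugate, and is equal to $D!$ divided by the size of the conjugacy class of $\pi$ (or equivalently, to the size of the centraliser of $\pi$) otherwise. But the latter expression is easily computed to be $\prod_{j=1}^k j^{a_j} a_j!$, giving the claim.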
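Example 2 We illustrate the identity (11) when $D = 3$, $N \geq 3$. The Schur polynomials are given as
$\displaystyle s_{3}(g) = \sum_i \lambda_i^3 + \sum_{i<j} (\lambda_i^2 \lambda_j + \lambda_i \lambda_j^2) + \sum_{i<j<k} \lambda_i \lambda_j \lambda_k$
$\displaystyle s_{2,1}(g) = \sum_{i<j} (\lambda_i^2 \lambda_j + \lambda_i \lambda_j^2) + 2 \sum_{i<j<k} \lambda_i \lambda_j \lambda_k$
$\displaystyle s_{1,1,1}(g) = \sum_{i<j<k} \lambda_i \lambda_j \lambda_k$
where $\lambda_1,\dots,\lambda_N$ are the (generalised) eigenvalues of $g$, and the formula (11) in this case becomes
$\displaystyle \mathrm{tr}(g^3) = s_{3}(g) - s_{2,1}(g) + s_{1,1,1}(g)$
$\displaystyle \mathrm{tr}(g^2) \mathrm{tr}(g) = s_{3}(g) - s_{1,1,1}(g)$
$\displaystyle \mathrm{tr}(g)^3 = s_{3}(g) + 2 s_{2,1}(g) + s_{1,1,1}(g).$
The functions $s_{1,1,1}, s_{2,1}, s_3$ are orthonormal on $U(N)$, so the three functions $\mathrm{tr}(g^3), \mathrm{tr}(g^2) \mathrm{tr}(g), \mathrm{tr}(g)^3$ are orthogonal to each other, and their $L^2$ norms are $\sqrt{3}$, $\sqrt{2}$, and $\sqrt{6}$ respectively, reflecting the size in $S_3$ of the centralisers of the permutations $(123)$, $(12)$, and $\mathrm{id}$ respectively. If $N$ is instead set to say $2$, then the $s_{1,1,1}$ terms now disappear (the Young tableau here has too many rows), and the three quantities here now have some non-trivial covariance.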
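The three $D = 3$ instances of (11) are polynomial identities in the eigenvalues, so they are easy to confirm numerically. The sketch below (with function names of my own, and an arbitrary choice of four eigenvalues on the unit circle) expands the Schur polynomials $s_3, s_{2,1}, s_{1,1,1}$ in monomial symmetric sums and compares them with the power-sum products $\mathrm{tr}(g^3)$, $\mathrm{tr}(g^2)\mathrm{tr}(g)$, $\mathrm{tr}(g)^3$.

```python
import cmath
from itertools import combinations, permutations

def power_sum(eigs, k):
    """tr(g^k) in terms of the eigenvalues of g."""
    return sum(x ** k for x in eigs)

def schur_degree_three(eigs):
    """Return (s_3, s_21, s_111) via their monomial expansions."""
    m3 = sum(x ** 3 for x in eigs)
    # Ordered pairs of distinct positions give every monomial lam_i^2 lam_j, i != j.
    m21 = sum(x * x * y for x, y in permutations(eigs, 2))
    m111 = sum(x * y * z for x, y, z in combinations(eigs, 3))
    return m3 + m21 + m111, m21 + 2 * m111, m111

eigs = [cmath.exp(1j * theta) for theta in (0.2, 0.9, 1.7, 2.8)]  # arbitrary, N = 4
s3, s21, s111 = schur_degree_three(eigs)
p1, p2, p3 = (power_sum(eigs, k) for k in (1, 2, 3))

# The coefficients are the character values of S_3 on a 3-cycle,
# a transposition and the identity respectively.
residuals = [
    abs(p3 - (s3 - s21 + s111)),
    abs(p2 * p1 - (s3 - s111)),
    abs(p1 ** 3 - (s3 + 2 * s21 + s111)),
]
print(residuals)
```

The residuals are at rounding-error level. The squared coefficient vectors $(1,-1,1)$, $(1,0,-1)$, $(1,2,1)$ have sums $3$, $2$, $6$, which is where the centraliser sizes come from once orthonormality of the Schur polynomials is used.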
Example 3 Consider the moment ${\bf E}_{CUE} |\mathrm{tr} U^j|^2$. For $j \leq N$, the above proposition shows us that this moment is equal to $j$. What happens for $j > N$? The formula (12) computes this moment as
$\displaystyle \sum_\lambda |\chi_\lambda(\pi)|^2$
where $\pi$ is a cycle of length $j$ in $S_j$, and $\lambda$ ranges over all Young tableaux with size $j$ and at most $N$ rows. The Murnaghan-Nakayama rule tells us that $\chi_\lambda(\pi)$ vanishes unless $\lambda$ is a hook (all but one of the non-zero rows consisting of just a single box; this also can be interpreted as an exterior power representation on the space of vectors in ${\bf C}^j$ whose coordinates sum to zero), in which case it is equal to $\pm 1$ (depending on the parity of the number of non-zero rows). As such we see that this moment is equal to $N$. Thus in general we have
$\displaystyle {\bf E}_{CUE} |\mathrm{tr} U^j|^2 = \min(j, N). \ \ \ \ \ (13)$
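The formula ${\bf E}_{CUE} |\mathrm{tr} U^j|^2 = \min(j,N)$ can be sanity-checked by simulation. The following is a rough Monte Carlo sketch in pure Python, not an efficient sampler: it assumes the standard fact that applying Gram-Schmidt to a matrix of independent complex Gaussians yields a Haar-distributed unitary matrix, and the sample size, seed and choice $N = 2$ are arbitrary.

```python
import random

def haar_unitary(n, rng):
    """Sample an n x n Haar unitary via Gram-Schmidt on complex Gaussian vectors."""
    vectors = []
    for _ in range(n):
        v = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(n)]
        for u in vectors:
            inner = sum(a.conjugate() * b for a, b in zip(u, v))
            v = [b - inner * a for a, b in zip(u, v)]
        norm = sum(abs(b) ** 2 for b in v) ** 0.5
        vectors.append([b / norm for b in v])
    # The orthonormal vectors are used as the rows of the matrix; this is
    # the transpose of the usual convention, which is also Haar distributed.
    return vectors

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mean_abs_trace_sq(n, j, samples, seed=0):
    """Monte Carlo estimate of E |tr U^j|^2 over the CUE on U(n)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(samples):
        u = haar_unitary(n, rng)
        power = u
        for _ in range(j - 1):
            power = matmul(power, u)
        total += abs(sum(power[i][i] for i in range(n))) ** 2
    return total / samples

for j in (1, 2, 3):
    print("N = 2, j =", j, " estimate:", mean_abs_trace_sq(2, j, 2000),
          " min(j, N) =", min(j, 2))
```

With a few thousand samples the estimates should sit within a few hundredths of $1, 2, 2$, showing the saturation at $N$ once $j$ exceeds $N$.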
Now we discuss what is known for the analogous moments (5). Here we shall be rather non-rigorous, in particular ignoring an annoying “Archimedean” issue that the product of the ranges and
is not quite the range
but instead leaks into the adjacent range
. This issue can be addressed by working in a “weak” sense in which parameters such as
are averaged over fairly long scales, or by passing to a function field analogue of these questions, but we shall simply ignore the issue completely and work at a heuristic level only. For similar reasons we will ignore some technical issues arising from the sharp cutoff of
to the range
(it would be slightly better technically to use a smooth cutoff).
One can morally expand out (5) using (4) as
where ,
, and the integers
are in the ranges
for and
, and
for and
. Morally, the expectation here is negligible unless
in which case the expectation oscillates with magnitude one. In particular, if (7) fails (with some room to spare) then the moment (5) should be negligible, which is consistent with the analogous behaviour for the moments (6). Now suppose that (8) holds (with some room to spare). Then is significantly less than
, so the
multiplicative error in (15) becomes an additive error of
. On the other hand, because of the fundamental integrality gap – that the integers are always separated from each other by a distance of at least
– this forces the integers
,
to in fact be equal:
The von Mangoldt factors effectively restrict
to be prime (the effect of prime powers is negligible). By the fundamental theorem of arithmetic, the constraint (16) then forces
, and
to be a permutation of
, which then forces
for all
. For a given
, the number of possible
is then
, and the expectation in (14) is equal to
. Thus this expectation is morally
and using Mertens’ theorem this soon simplifies asymptotically to the same quantity in Proposition 1. Thus we see that (morally at least) the moments (5) associated to the zeta function asymptotically match the moments (6) coming from the CUE model in the low degree case (8), thus lending support to the GUE hypothesis. (These observations are basically due to Rudnick and Sarnak, with the degree case of pair correlations due to Montgomery, and the degree
case due to Hejhal.)
With some rare exceptions (such as those estimates coming from “Kloostermania”), the moment estimates of Rudnick and Sarnak basically represent the state of the art for what is known for the moments (5). For instance, Montgomery’s pair correlation conjecture, in our language, is basically the analogue of (13) for , thus
for all . Montgomery showed this for (essentially) the range
(as remarked above, this is a special case of the Rudnick-Sarnak result), but no further cases of this conjecture are known.
These estimates can be used to give some non-trivial information on the largest and smallest spacings between zeroes of the zeta function, which in our notation corresponds to spacing between eigenvalues of . One such method used today for this is due to Montgomery and Odlyzko and was greatly simplified by Conrey, Ghosh, and Gonek. The basic idea, translated to our random matrix notation, is as follows. Suppose
is some random polynomial depending on
of degree at most
. Let
denote the eigenvalues of
, and let
be a parameter. Observe from the pigeonhole principle that if the quantity
then the arcs cannot all be disjoint, and hence there exists a pair of eigenvalues making an angle of less than
(
times the mean angle separation). Similarly, if the quantity (18) falls below that of (19), then these arcs cannot cover the unit circle, and hence there exists a pair of eigenvalues making an angle of greater than
times the mean angle separation. By judiciously choosing the coefficients of
as functions of the moments
, one can ensure that both quantities (18), (19) can be computed by the Rudnick-Sarnak estimates (or estimates of equivalent strength); indeed, from the residue theorem one can write (18) as
for sufficiently small , and this can be computed (in principle, at least) using (3) if the coefficients of
are in an appropriate form. Using this sort of technology (translated back to the Riemann zeta function setting), one can show that gaps between consecutive zeroes of zeta are less than
times the mean spacing and greater than
times the mean spacing infinitely often for certain
; the current records are
(due to Goldston and Turnage-Butterbaugh) and
(due to Bui and Milinovich, who input some additional estimates beyond the Rudnick-Sarnak set, namely the twisted fourth moment estimates of Bettin, Bui, Li, and Radziwill, and using a technique based on Hall’s method rather than the Montgomery-Odlyzko method).
It would be of great interest if one could push the upper bound for the smallest gap below
. The reason for this is that this would then exclude the Alternative Hypothesis that the spacings between zeroes are asymptotically always (or almost always) non-zero half-integer multiples of the mean spacing, or in our language that the gaps between the phases
of the eigenvalues
of
are asymptotically always non-zero integer multiples of
. The significance of this hypothesis is that it is implied by the existence of a Siegel zero (of conductor a small power of
); see this paper of Conrey and Iwaniec. (In our language, what is going on is that if there is a Siegel zero in which
is very close to zero, then
behaves like the Kronecker delta, and hence (by the Riemann-Siegel formula) the combined
-function
will have a polynomial approximation which in our language looks like a scalar multiple of
, where
and
is a phase. The zeroes of this approximation lie on a coset of the
roots of unity; the polynomial
is a factor of this approximation and hence will also lie in this coset, implying in particular that all eigenvalue spacings are multiples of
. Taking
then gives the claim.)
Unfortunately, the known methods do not seem to break this barrier without some significant new input; already the original paper of Montgomery and Odlyzko observed this limitation for their particular technique (and in fact it falls very slightly short, as observed in unpublished work of Goldston and of Milinovich). In this post I would like to record another way to see this, by providing an “alternative” probability distribution to the CUE distribution (which one might dub the Alternative Circular Unitary Ensemble (ACUE)) which is indistinguishable in low moments in the sense that the expectation for this model also obeys Proposition 1, but for which the phase spacings are always a multiple of
. This shows that if one is to rule out the Alternative Hypothesis (and thus in particular rule out Siegel zeroes), one needs to input some additional moment information beyond Proposition 1. It would be interesting to see if any of the other known moment estimates that go beyond this proposition are consistent with this alternative distribution. (UPDATE: it looks like they are, see Remark 7 below.)
To describe this alternative distribution, let us first recall the Weyl description of the CUE measure on the unitary group in terms of the distribution of the phases
of the eigenvalues, randomly permuted in any order. This distribution is given by the probability measure
where
is the Vandermonde determinant; see for instance this previous blog post for the derivation of a very similar formula for the GUE distribution, which can be adapted to CUE without much difficulty. To see that this is a probability measure, first observe the Vandermonde determinant identity
where ,
denotes the dot product, and
is the “long word”, which implies that (20) is a trigonometric series with constant term
; it is also clearly non-negative, so it is a probability measure. One can thus generate a random CUE matrix by first drawing
using the probability measure (20), and then generating
to be a random unitary matrix with eigenvalues
.
For the alternative distribution, we first draw on the discrete torus
(thus each
is a
root of unity) with probability density function
shift by a phase drawn uniformly at random, and then select
to be a random unitary matrix with eigenvalues
. Let us first verify that (21) is a probability density function. Clearly it is non-negative. It is the linear combination of exponentials of the form
for
. The diagonal contribution
gives the constant function
, which has total mass one. All of the other exponentials have a frequency
that is not a multiple of
, and hence will have mean zero on
. The claim follows.
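The normalisation just verified can also be checked numerically by brute force for small matrix sizes. The following is a minimal Python sketch, not a definitive implementation. It assumes that the density (21) has the form |V(θ)|²/((2N)^N N!) with respect to counting measure on N-tuples of 2N-th roots of unity (this reading is inferred from the normalisation argument above); the function name `acue_total_mass` is hypothetical.

```python
import itertools
import math


def acue_total_mass(n):
    """Sum the assumed density |V|^2 / ((2n)^n n!) over all n-tuples
    of 2n-th roots of unity (indexed by integers mod 2n)."""
    m = 2 * n
    norm = m ** n * math.factorial(n)
    total = 0.0
    for idx in itertools.product(range(m), repeat=n):
        v2 = 1.0
        for a, b in itertools.combinations(idx, 2):
            # |e(a/m) - e(b/m)|^2 = 4 sin^2(pi (a - b) / m)
            v2 *= 4.0 * math.sin(math.pi * (a - b) / m) ** 2
        total += v2 / norm
    return total


# Each total should equal 1 up to floating point rounding.
for n in range(1, 5):
    print(n, round(acue_total_mass(n), 12))
```

The enumeration has (2N)^N terms, so this is only practical for very small N; it is meant as a sanity check of the argument rather than a way to sample from the distribution.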
From construction it is clear that the matrix drawn from this alternative distribution will have all eigenvalue phase spacings be a non-zero multiple of
. Now we verify that the alternative distribution also obeys Proposition 1. The alternative distribution remains invariant under rotation by phases, so the claim is again clear when (8) fails. Inspecting the proof of that proposition, we see that it suffices to show that the Schur polynomials
with
of size at most
and of equal size remain orthonormal with respect to the alternative measure. That is to say,
when have size equal to each other and at most
. In this case the phase
in the definition of
is irrelevant. In terms of eigenvalue measures, we are then reduced to showing that
By Fourier decomposition, it then suffices to show that the trigonometric polynomial does not contain any components of the form
for some non-zero lattice vector
. But we have already observed that
is a linear combination of plane waves of the form
for
. Also, as is well known,
is a linear combination of plane waves
where
is majorised by
, and similarly
is a linear combination of plane waves
where
is majorised by
. So the product
is a linear combination of plane waves of the form
. But every coefficient of the vector
lies between
and
, and so cannot be of the form
for any non-zero lattice vector
, giving the claim.
Example 4 If
, then the distribution (21) assigns a probability of
to any pair
that is a permuted rotation of
, and a probability of
to any pair that is a permuted rotation of
. Thus, a matrix
drawn from the alternative distribution will be conjugate to a phase rotation of
with probability
, and to
with probability
.
A similar computation when
gives
conjugate to a phase rotation of
with probability
, to a phase rotation of
or its adjoint with probability of
each, and a phase rotation of
with probability
.
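Computations of this type can be reproduced by enumerating the discrete torus and grouping the tuples into "permuted rotation" classes. The sketch below does this under the same assumption on the form of (21) as in the normalisation argument, namely density |V(θ)|²/((2N)^N N!) on N-tuples of 2N-th roots of unity. Classes are labelled by the cyclic pattern of gaps between consecutive phases, in units of 1/2N; the helper names are hypothetical.

```python
import itertools
import math
from collections import defaultdict


def gap_pattern(indices, m):
    """Cyclic gaps between the sorted indices mod m, rotated to the
    lexicographically smallest form so that rotated tuples agree."""
    s = sorted(indices)
    gaps = [(s[(i + 1) % len(s)] - s[i]) % m for i in range(len(s))]
    return min(tuple(gaps[i:] + gaps[:i]) for i in range(len(gaps)))


def acue_class_probabilities(n):
    """Total probability of each permuted-rotation class (for n >= 2)."""
    m = 2 * n
    norm = m ** n * math.factorial(n)
    totals = defaultdict(float)
    for idx in itertools.product(range(m), repeat=n):
        if len(set(idx)) < n:
            continue  # the Vandermonde factor vanishes on repeated phases
        v2 = 1.0
        for a, b in itertools.combinations(idx, 2):
            v2 *= 4.0 * math.sin(math.pi * (a - b) / m) ** 2
        totals[gap_pattern(idx, m)] += v2 / norm
    return dict(totals)


for n in (2, 3):
    print(n, acue_class_probabilities(n))
```

Under the stated assumption, for N=2 the gap patterns (1,3) and (2,2) each receive total probability 1/2, and for N=3 the patterns (1,1,4), (1,2,3), (1,3,2), (2,2,2) receive 1/12, 1/3, 1/3, 1/4 respectively (up to rounding). The patterns (1,2,3) and (1,3,2) are mirror images of each other, which matches the "or its adjoint" case.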
Remark 5 For large
it does not seem that this specific alternative distribution is the only distribution consistent with Proposition 1 and which has all phase spacings a non-zero multiple of
; in particular, it may not be the only distribution consistent with a Siegel zero. Still, it is a very explicit distribution that might serve as a test case for the limitations of various arguments for controlling quantities such as the largest or smallest spacing between zeroes of zeta. The ACUE is in some sense the distribution that maximally resembles CUE (in the sense that it has the greatest number of Fourier coefficients agreeing) while still also being consistent with the Alternative Hypothesis, and so should be the most difficult enemy to eliminate if one wishes to disprove that hypothesis.
In some cases, even just a tiny improvement in known results would be able to exclude the alternative hypothesis. For instance, if the alternative hypothesis held, then is periodic in
with period
, so from Proposition 1 for the alternative distribution one has
which differs from (13) for any . (This fact was implicitly observed recently by Baluyot, in the original context of the zeta function.) Thus a verification of the pair correlation conjecture (17) for even a single
with
would rule out the alternative hypothesis. Unfortunately, such a verification appears to be of comparable difficulty with (an averaged version of) the Hardy-Littlewood conjecture, with power saving error term. (This is consistent with the fact that Siegel zeroes can cause distortions in the Hardy-Littlewood conjecture, as (implicitly) discussed in this previous blog post.)
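The periodicity phenomenon can be seen concretely in small cases by computing the second moment of the trace of a power of the matrix under both models. The sketch below again assumes the density |V(θ)|²/((2N)^N N!) on N-tuples of 2N-th roots of unity for the alternative distribution, and uses the standard fact that for CUE the second moment of the trace of the j-th power equals min(j, N) for j ≥ 1. The function names are hypothetical, and the uniform phase rotation is omitted since it does not change the absolute value of the trace.

```python
import cmath
import itertools
import math


def acue_trace_moment(n, power):
    """E |tr U^power|^2 under the assumed ACUE eigenvalue density."""
    m = 2 * n
    norm = m ** n * math.factorial(n)
    roots = [cmath.exp(2j * cmath.pi * k / m) for k in range(m)]
    total = 0.0
    for idx in itertools.product(range(m), repeat=n):
        v2 = 1.0
        for a, b in itertools.combinations(idx, 2):
            v2 *= abs(roots[a] - roots[b]) ** 2
        # The power-th power of the root with index k has index k * power mod m.
        trace = sum(roots[(k * power) % m] for k in idx)
        total += (v2 / norm) * abs(trace) ** 2
    return total


def cue_trace_moment(n, power):
    """Known CUE value E |tr U^power|^2 = min(power, n), for power >= 1."""
    return min(power, n)


n = 3
for power in range(1, 2 * n + 1):
    print(power, round(acue_trace_moment(n, power), 9), cue_trace_moment(n, power))
```

For N=3 the two columns agree for powers 1, 2, 3, while for powers 4 and 5 the alternative model gives 2 and 1 instead of 3, reflecting the periodicity. At power 6 = 2N the matrix becomes a scalar, so the moment is N² = 9.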
Remark 6 One can view the CUE as normalised Lebesgue measure on
(viewed as a smooth submanifold of
). One can similarly view ACUE as normalised Lebesgue measure on the (disconnected) smooth submanifold of
consisting of those unitary matrices whose phase spacings are non-zero integer multiples of
; informally, ACUE is CUE restricted to this lower dimensional submanifold. As is well known, the phases of CUE eigenvalues form a determinantal point process with kernel
(or one can equivalently take
); in a similar spirit, the phases of ACUE eigenvalues, once they are rotated to be
roots of unity, become a discrete determinantal point process on those roots of unity with exactly the same kernel (except for a normalising factor of
). In particular, the
-point correlation functions of ACUE (after this rotation) are precisely the restriction of the
-point correlation functions of CUE after normalisation, that is to say they are proportional to
.
Remark 7 One family of estimates that go beyond the Rudnick-Sarnak family of estimates are twisted moment estimates for the zeta function, such as ones that give asymptotics for
for some small even exponent
(almost always
or
) and some short Dirichlet polynomial
; see for instance this paper of Bettin, Bui, Li, and Radziwill for some examples of such estimates. The analogous unitary matrix average would be something like
where
is now some random medium degree polynomial that depends on the unitary matrix
associated to
(and in applications will typically also contain some negative power of
to cancel the corresponding powers of
in
). Unfortunately such averages generally are unable to distinguish the CUE from the ACUE. For instance, if all the coefficients of
involve products of traces
of total order less than
, then in terms of the eigenvalue phases
,
is a linear combination of plane waves
where the frequencies
have coefficients of magnitude less than
. On the other hand, as each coefficient of
is an elementary symmetric function of the eigenvalues,
is a linear combination of plane waves
where the frequencies
have coefficients of magnitude at most
. Thus
is a linear combination of plane waves where the frequencies
have coefficients of magnitude less than
, and thus is orthogonal to the difference between the CUE and ACUE measures on the phase torus
by the previous arguments. In other words,
has the same expectation with respect to ACUE as it does with respect to CUE. Thus one can only start distinguishing CUE from ACUE if the mollifier
has degree close to or exceeding
, which corresponds to Dirichlet polynomials
of length close to or exceeding
, which is far beyond current technology for such moment estimates.
Remark 8 The GUE hypothesis for the zeta function asserts that the average
for any
and any test function
, where
is the Dyson sine kernel and
are the ordinates of zeroes of the zeta function. This corresponds to the CUE distribution for
. The ACUE distribution then corresponds to an “alternative gaussian unitary ensemble (AGUE)” hypothesis, in which the average (22) is instead predicted to equal a Riemann sum version of the integral (23):
This is a stronger version of the alternative hypothesis that the spacing between adjacent zeroes is almost always approximately a half-integer multiple of the mean spacing. I do not know of any known moment estimates for Dirichlet series that are able to eliminate this AGUE hypothesis (even assuming GRH). (UPDATE: These facts have also been independently observed in forthcoming work of Lagarias and Rodgers.)
Just a short note to point out that submissions to the 2019 Breakthrough Junior Challenge are now open until June 15. Students ages 13 to 18 from countries across the globe are invited to create and submit original videos (3:00 minutes in length maximum) that bring to life a concept or theory in the life sciences, physics or mathematics. The submissions are judged on the student’s ability to communicate complex scientific ideas in engaging, illuminating, and imaginative ways. The Challenge is organized by the Breakthrough Prize Foundation, in partnership with Khan Academy, National Geographic, and Cold Spring Harbor Laboratory. The winner of the challenge receives a $250K college scholarship, with an additional $50K prize to the winner’s maths or science teacher, and a $100K lab for the student’s school. (This year I will be on the selection committee for this challenge.)
A useful rule of thumb in complex analysis is that holomorphic functions behave like large degree polynomials
. This can be evidenced for instance at a “local” level by the Taylor series expansion for a complex analytic function in the disk, or at a “global” level by factorisation theorems such as the Weierstrass factorisation theorem (or the closely related Hadamard factorisation theorem). One can truncate these theorems in a variety of ways (e.g., Taylor’s theorem with remainder) to be able to approximate a holomorphic function by a polynomial on various domains.
In some cases it can be convenient instead to work with polynomials of another variable
such as
(or more generally
for a scaling parameter
). In the case of the Riemann zeta function, defined by meromorphic continuation of the formula
one ends up having the following heuristic approximation in the neighbourhood of a point on the critical line:
Heuristic 1 (Polynomial approximation) Let
be a height, let
be a “typical” element of
, and let
be an integer. Let
be the linear change of variables
for
and some polynomial
of degree
.
The requirement is necessary since the right-hand side is periodic with period
in the
variable (or period
in the
variable), whereas the zeta function is not expected to have any such periodicity, even approximately.
Let us give two non-rigorous justifications of this heuristic. Firstly, it is standard that inside the critical strip (with ) we have an approximate form
of (11). If we group the integers from
to
into
bins depending on what powers of
they lie between, we thus have
For with
and
we heuristically have
and so
where are the partial Dirichlet series
This gives the desired polynomial approximation.
A second non-rigorous justification is as follows. From factorisation theorems such as the Hadamard factorisation theorem we expect to have
where runs over the non-trivial zeroes of
, and there are some additional factors arising from the trivial zeroes and poles of
which we will ignore here; we will also completely ignore the issue of how to renormalise the product to make it converge properly. In the region
, the dominant contribution to this product (besides multiplicative constants) should arise from zeroes
that are also in this region. The Riemann-von Mangoldt formula suggests that for “typical”
one should have about
such zeroes. If one lets
be any enumeration of
zeroes closest to
, and then repeats this set of zeroes periodically by period
, one then expects to have an approximation of the form
again ignoring all issues of convergence. If one writes and
, then Euler’s famous product formula for sine basically gives
(here we are glossing over some technical issues regarding renormalisation of the infinite products, which can be dealt with by studying the asymptotics as ) and hence we expect
This again gives the desired polynomial approximation.
Below the fold we give a rigorous version of the second argument suitable for “microscale” analysis. More precisely, we will show
Theorem 2 Let
be an integer going sufficiently slowly to infinity. Let
go to zero sufficiently slowly depending on
. Let
be drawn uniformly at random from
. Then with probability
(in the limit
), and possibly after adjusting
by
, there exists a polynomial
of degree
and obeying the functional equation (9) below, such that
whenever
.
It should be possible to refine the arguments to extend this theorem to the mesoscale setting by letting be anything growing like
, and
anything growing like
; also we should be able to delete the need to adjust
by
. We have not attempted these optimisations here.
Many conjectures and arguments involving the Riemann zeta function can be heuristically translated into arguments involving the polynomials , which one can view as random degree
polynomials if
is interpreted as a random variable drawn uniformly at random from
. These can be viewed as providing a “toy model” for the theory of the Riemann zeta function, in which the complex analysis is simplified to the study of the zeroes and coefficients of this random polynomial (for instance, the role of the gamma function is now played by a monomial in
). This model also makes the zeta function theory more closely resemble the function field analogues of this theory (in which the analogue of the zeta function is also a polynomial (or a rational function) in some variable
, as per the Weil conjectures). The parameter
is at our disposal to choose, and reflects the scale
at which one wishes to study the zeta function. For “macroscopic” questions, at which one wishes to understand the zeta function at unit scales, it is natural to take
(or very slightly larger), while for “microscopic” questions one would take
close to
and only growing very slowly with
. For the intermediate “mesoscopic” scales one would take
somewhere between
and
. Unfortunately, the statistical properties of
are only understood well at a conjectural level at present; even if one assumes the Riemann hypothesis, our understanding of
is largely restricted to the computation of low moments (e.g., the second or fourth moments) of various linear statistics of
and related functions (e.g.,
,
, or
).
Let’s now heuristically explore the polynomial analogues of this theory in a bit more detail. The Riemann hypothesis basically corresponds to the assertion that all the zeroes of the polynomial
lie on the unit circle
(which, after the change of variables
, corresponds to
being real); in a similar vein, the GUE hypothesis corresponds to
having the asymptotic law of a random scalar
times the characteristic polynomial of a random unitary
matrix. Next, we consider what happens to the functional equation
where
A routine calculation involving Stirling’s formula reveals that
with ; one also has the closely related approximation
when . Since
, applying (5) with
and using the approximation (2) suggests a functional equation for
:
where is the polynomial
with all the coefficients replaced by their complex conjugate. Thus if we write
then the functional equation can be written as
We remark that if we use the heuristic (3) (interpreting the cutoffs in the summation in a suitably vague fashion) then this equation can be viewed as an instance of the Poisson summation formula.
Another consequence of the functional equation is that the zeroes of are symmetric with respect to inversion
across the unit circle. This is of course consistent with the Riemann hypothesis, but does not obviously imply it. The phase
is of little consequence in this functional equation; one could easily conceal it by working with the phase rotation
of
instead.
One consequence of the functional equation is that is real for any
; the same is then true for the derivative
. Among other things, this implies that
cannot vanish unless
does also; thus the zeroes of
will not lie on the unit circle except where
has repeated zeroes. The analogous statement is true for
; the zeroes of
will not lie on the critical line except where
has repeated zeroes.
Relating to this fact, it is a classical result of Speiser that the Riemann hypothesis is true if and only if all the zeroes of the derivative of the zeta function in the critical strip lie on or to the right of the critical line. The analogous result for polynomials is
Proposition 3 We have
(where all zeroes are counted with multiplicity). In particular, the zeroes of
all lie on the unit circle if and only if the zeroes of
lie in the closed unit disk.
Proof: From the functional equation we have
Thus it will suffice to show that and
have the same number of zeroes outside the closed unit disk.
Set , then
is a rational function that does not have a zero or pole at infinity. For
not a zero of
, we have already seen that
and
are real, so on dividing we see that
is always real, that is to say
(This can also be seen by writing , where
runs over the zeroes of
, and using the fact that these zeroes are symmetric with respect to reflection across the unit circle.) When
is a zero of
,
has a simple pole at
with residue a positive multiple of
, and so
stays on the right half-plane if one traverses a semicircular arc around
outside the unit disk. From this and continuity we see that
stays on the right half-plane on a circle slightly larger than the unit circle, and hence by the argument principle it has the same number of zeroes and poles outside of this circle, giving the claim.
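The key step of the proof, that the polynomial and its derivative have the same number of zeroes outside the closed unit disk, can be illustrated numerically in the simplest non-trivial case. The sketch below is only an illustration under stated assumptions: it takes a monic cubic whose zeroes are symmetric under inversion across the unit circle (one inverse pair plus one zero on the circle, or all three zeroes on the circle), which I am using as a stand-in for a polynomial obeying the functional equation, and finds the zeroes of the derivative with the quadratic formula. All function names are hypothetical.

```python
import cmath


def derivative_zeros_of_cubic(zeros):
    """Zeros of P'(z) for the monic cubic P with the given three zeros."""
    a, b, c = zeros
    e1 = a + b + c
    e2 = a * b + a * c + b * c
    # P(z) = z^3 - e1 z^2 + e2 z - e3, so P'(z) = 3 z^2 - 2 e1 z + e2.
    disc = cmath.sqrt(e1 * e1 - 3 * e2)
    return [(e1 + disc) / 3, (e1 - disc) / 3]


def count_outside_closed_disk(points, tol=1e-9):
    """Number of points with modulus greater than 1 (up to a tolerance)."""
    return sum(1 for z in points if abs(z) > 1 + tol)


def inversion_symmetric_zeros(r, alpha, beta):
    """Zeros r e^{i alpha} and e^{i alpha} / r (an inverse pair across the
    unit circle), together with e^{i beta} on the circle itself."""
    u = cmath.exp(1j * alpha)
    return [r * u, u / r, cmath.exp(1j * beta)]


examples = [
    inversion_symmetric_zeros(0.5, 0.0, cmath.pi),  # zeros 1/2, 2, -1
    inversion_symmetric_zeros(0.8, 0.3, 2.0),
    [1, -1, 1j],                                    # all zeros on the circle
]
for zs in examples:
    outside_p = count_outside_closed_disk(zs)
    outside_dp = count_outside_closed_disk(derivative_zeros_of_cubic(zs))
    print(outside_p, outside_dp)
```

In each of these examples the two counts agree (1 and 1 for the first two, 0 and 0 for the last), consistent with the claim; the last case also illustrates that when all zeroes lie on the unit circle, the zeroes of the derivative lie in the closed unit disk.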
From the functional equation and the chain rule, is a zero of
if and only if
is a zero of
. We can thus write the above proposition in the equivalent form
One can use this identity to get a lower bound on the number of zeroes of by the method of mollifiers. Namely, for any other polynomial
, we clearly have
By Jensen’s formula, we have for any that
We therefore have
As the logarithm function is concave, we can apply Jensen’s inequality to conclude
where the expectation is over the parameter. It turns out that by choosing the mollifier
carefully in order to make
behave like the function
(while keeping the degree
small enough that one can compute the second moment here), and then optimising in
, one can use this inequality to get a positive fraction of zeroes of
on the unit circle on average. This is the polynomial analogue of a classical argument of Levinson, who used this to show that at least one third of the zeroes of the Riemann zeta function are on the critical line; all later improvements on this fraction have been based on some version of Levinson’s method, mainly focusing on more advanced choices for the mollifier
and of the differential operator
that implicitly appears in the above approach. (The most recent lower bound I know of is
, due to Pratt and Robles. In principle (as observed by Farmer) this bound can get arbitrarily close to
if one is allowed to use arbitrarily long mollifiers, but establishing this seems of comparable difficulty to unsolved problems such as the pair correlation conjecture; see this paper of Radziwill for more discussion.) A variant of these techniques can also establish “zero density estimates” of the following form: for any
, the number of zeroes of
that lie further than
from the unit circle is of order
on average for some absolute constant
. Thus, roughly speaking, most zeroes of
lie within
of the unit circle. (Analogues of these results for the Riemann zeta function were worked out by Selberg, by Jutila, and by Conrey, with increasingly strong values of
.)
The zeroes of tend to live somewhat closer to the origin than the zeroes of
. Suppose for instance that we write
where are the zeroes of
, then by evaluating at zero we see that
and the right-hand side is of unit magnitude by the functional equation. However, if we differentiate
where are the zeroes of
, then by evaluating at zero we now see that
The right-hand side would now be typically expected to be of size , and so on average we expect the
to have magnitude like
, that is to say pushed inwards from the unit circle by a distance roughly
. The analogous result for the Riemann zeta function is that the zeroes of
at height
lie at a distance roughly
to the right of the critical line on the average; see this paper of Levinson and Montgomery for a precise statement.