You are currently browsing the tag archive for the ‘universality’ tag.

I’ve just uploaded to the arXiv my paper “On the universality of the incompressible Euler equation on compact manifolds“, submitted to Discrete and Continuous Dynamical Systems. This is a variant of my recent paper on the universality of potential well dynamics, but instead of trying to embed dynamical systems into a potential well {\partial_{tt} u = -\nabla V(u)}, here we try to embed dynamical systems into the incompressible Euler equations

\displaystyle  \partial_t u + \nabla_u u = - \mathrm{grad}_g p \ \ \ \ \ (1)

\displaystyle  \mathrm{div}_g u = 0

on a Riemannian manifold {(M,g)}. (One is particularly interested in the case of flat manifolds {M}, particularly {{\bf R}^3} or {({\bf R}/{\bf Z})^3}, but for the main result of this paper it is essential that one is permitted to consider curved manifolds.) This system, first studied by Ebin and Marsden, is the natural generalisation of the usual incompressible Euler equations to curved space; it can be viewed as the formal geodesic flow equation on the infinite-dimensional manifold of volume-preserving diffeomorphisms on {M} (see this previous post for a discussion of this in the flat space case).

The Euler equations can be viewed as a nonlinear equation in which the nonlinearity is a quadratic function of the velocity field {u}. It is thus natural to compare the Euler equations with quadratic ODE of the form

\displaystyle  \partial_t y = B(y,y) \ \ \ \ \ (2)

where {y: {\bf R} \rightarrow {\bf R}^n} is the unknown solution, and {B: {\bf R}^n \times {\bf R}^n \rightarrow {\bf R}^n} is a bilinear map, which we may assume without loss of generality to be symmetric. One can ask whether such an ODE may be linearly embedded into the Euler equations on some Riemannian manifold {(M,g)}, which means that there is an injective linear map {U: {\bf R}^n \rightarrow \Gamma(TM)} from {{\bf R}^n} to smooth vector fields on {M}, as well as a bilinear map {P: {\bf R}^n \times {\bf R}^n \rightarrow C^\infty(M)} to smooth scalar fields on {M}, such that the map {y \mapsto (U(y), P(y,y))} takes solutions to (2) to solutions to (1), or equivalently that

\displaystyle  U(B(y,y)) + \nabla_{U(y)} U(y) = - \mathrm{grad}_g P(y,y)

\displaystyle  \mathrm{div}_g U(y) = 0

for all {y \in {\bf R}^n}.

For simplicity let us restrict {M} to be compact. There is an obvious necessary condition for this embeddability to occur, which comes from energy conservation law for the Euler equations; unpacking everything, this implies that the bilinear form {B} in (2) has to obey a cancellation condition

\displaystyle  \langle B(y,y), y \rangle = 0 \ \ \ \ \ (3)

for some positive definite inner product {\langle, \rangle: {\bf R}^n \times {\bf R}^n \rightarrow {\bf R}} on {{\bf R}^n}. The main result of the paper is the converse to this statement: if {B} is a symmetric bilinear form obeying a cancellation condition (3), then it is possible to embed the equations (2) into the Euler equations (1) on some Riemannian manifold {(M,g)}; the catch is that this manifold will depend on the form {B} and on the dimension {n} (in fact in the construction I have, {M} is given explicitly as {SO(n) \times ({\bf R}/{\bf Z})^{n+1}}, with a funny metric on it that depends on {B}).

As a consequence, any finite dimensional portion of the usual “dyadic shell models” used as simplified toy models of the Euler equation, can actually be embedded into a genuine Euler equation, albeit on a high-dimensional and curved manifold. This includes portions of the self-similar “machine” I used in a previous paper to establish finite time blowup for an averaged version of the Navier-Stokes (or Euler) equations. Unfortunately, the result in this paper does not apply to infinite-dimensional ODE, so I cannot yet establish finite time blowup for the Euler equations on a (well-chosen) manifold. It does not seem so far beyond the realm of possibility, though, that this could be done in the relatively near future. In particular, the result here suggests that one could construct something resembling a universal Turing machine within an Euler flow on a manifold, which was one ingredient I would need to engineer such a finite time blowup.

The proof of the main theorem proceeds by an “elimination of variables” strategy that was used in some of my previous papers in this area, though in this particular case the Nash embedding theorem (or variants thereof) are not required. The first step is to lessen the dependence on the metric {g} by partially reformulating the Euler equations (1) in terms of the covelocity {g \cdot u} (which is a {1}-form) instead of the velocity {u}. Using the freedom to modify the dimension of the underlying manifold {M}, one can also decouple the metric {g} from the volume form that is used to obtain the divergence-free condition. At this point the metric can be eliminated, with a certain positive definiteness condition between the velocity and covelocity taking its place. After a substantial amount of trial and error (motivated by some “two-and-a-half-dimensional” reductions of the three-dimensional Euler equations, and also by playing around with a number of variants of the classic “separation of variables” strategy), I eventually found an ansatz for the velocity and covelocity that automatically solved most of the components of the Euler equations (as well as most of the positive definiteness requirements), as long as one could find a number of scalar fields that obeyed a certain nonlinear system of transport equations, and also obeyed a positive definiteness condition. Here I was stuck for a bit because the system I ended up with was overdetermined – more equations than unknowns. After trying a number of special cases I eventually found a solution to the transport system on the sphere, except that the scalar functions sometimes degenerated and so the positive definiteness property I wanted was only obeyed with positive semi-definiteness. I tried for some time to perturb this example into a strictly positive definite solution before eventually working out that this was not possible. Finally I had the brainwave to lift the solution from the sphere to an even more symmetric space, and this quickly led to the final solution of the problem, using the special orthogonal group rather than the sphere as the underlying domain. The solution ended up being rather simple in form, but it is still somewhat miraculous to me that it exists at all; in retrospect, given the overdetermined nature of the problem, relying on a large amount of symmetry to cut down the number of equations was basically the only hope.

Van Vu and I have just uploaded to the arXiv our paper “Random matrices: Universality of local spectral statistics of non-Hermitian matrices“. The main result of this paper is a “Four Moment Theorem” that establishes universality for local spectral statistics of non-Hermitian matrices with independent entries, under the additional hypotheses that the entries of the matrix decay exponentially, and match moments with either the real or complex gaussian ensemble to fourth order. This is the non-Hermitian analogue of a long string of recent results establishing universality of local statistics in the Hermitian case (as discussed for instance in this recent survey of Van and myself, and also in several other places).

The complex case is somewhat easier to describe. Given a (non-Hermitian) random matrix ensemble {M_n} of {n \times n} matrices, one can arbitrarily enumerate the (geometric) eigenvalues as {\lambda_1(M_n),\ldots,\lambda_n(M_n) \in {\bf C}}, and one can then define the {k}-point correlation functions {\rho^{(k)}_n: {\bf C}^k \rightarrow {\bf R}^+} to be the symmetric functions such that

\displaystyle  \int_{{\bf C}^k} F(z_1,\ldots,z_k) \rho^{(k)}_n(z_1,\ldots,z_k)\ dz_1 \ldots dz_k

\displaystyle  = {\bf E} \sum_{1 \leq i_1 < \ldots < i_k \leq n} F(\lambda_1(M_n),\ldots,\lambda_k(M_n)).

In the case when {M_n} is drawn from the complex gaussian ensemble, so that all the entries are independent complex gaussians of mean zero and variance one, it is a classical result of Ginibre that the asymptotics of {\rho^{(k)}_n} near some point {z \sqrt{n}} as {n \rightarrow \infty} and {z \in {\bf C}} is fixed are given by the determinantal rule

\displaystyle  \rho^{(k)}_n(z\sqrt{n} + w_1,\ldots,z\sqrt{n}+w_k) \rightarrow \hbox{det}( K(w_i,w_j) )_{1 \leq i,j \leq k} \ \ \ \ \ (1)

for {|z| < 1} and

\displaystyle  \rho^{(k)}_n(z\sqrt{n} + w_1,\ldots,z\sqrt{n}+w_k) \rightarrow 0

for {|z| > 1}, where {K} is the reproducing kernel

\displaystyle  K(z,w) := \frac{1}{\pi} e^{-|z|^2/2 - |w|^2/2 + z \overline{w}}.

(There is also an asymptotic for the boundary case {|z|=1}, but it is more complicated to state.) In particular, we see that {\rho^{(k)}_n(z \sqrt{n}) \rightarrow \frac{1}{\pi} 1_{|z| \leq 1}} for almost every {z}, which is a manifestation of the well-known circular law for these matrices; but the circular law only captures the macroscopic structure of the spectrum, whereas the asymptotic (1) describes the microscopic structure.

Our first main result is that the asymptotic (1) for {|z|<1} also holds (in the sense of vague convergence) when {M_n} is a matrix whose entries are independent with mean zero, variance one, exponentially decaying tails, and which all match moments with the complex gaussian to fourth order. (Actually we prove a stronger result than this which is valid for all bounded {z} and has more uniform bounds, but is a bit more technical to state.) An analogous result is also established for real gaussians (but now one has to separate the correlation function into components depending on how many eigenvalues are real and how many are strictly complex; also, the limiting distribution is more complicated, being described by Pfaffians rather than determinants). Among other things, this allows us to partially extend some known results on complex or real gaussian ensembles to more general ensembles. For instance, there is a central limit theorem of Rider which establishes a central limit theorem for the number of eigenvalues of a complex gaussian matrix in a mesoscopic disk; from our results, we can extend this central limit theorem to matrices that match the complex gaussian ensemble to fourth order, provided that the disk is small enough (for technical reasons, our error bounds are not strong enough to handle large disks). Similarly, extending some results of Edelman-Kostlan-Shub and of Forrester-Nagao, we can show that for a matrix matching the real gaussian ensemble to fourth order, the number of real eigenvalues is {\sqrt{\frac{2n}{\pi}} + O(n^{1/2-c})} with probability {1-O(n^{-c})} for some absolute constant {c>0}.

There are several steps involved in the proof. The first step is to apply the Girko Hermitisation trick to replace the problem of understanding the spectrum of a non-Hermitian matrix, with that of understanding the spectrum of various Hermitian matrices. The two identities that realise this trick are, firstly, Jensen’s formula

\displaystyle  \log |\det(M_n-z_0)| = - \sum_{1 \leq i \leq n: \lambda_i(M_n) \in B(z_0,r)} \log \frac{r}{|\lambda_i(M_n)-z_0|}

\displaystyle + \frac{1}{2\pi} \int_0^{2\pi} \log |\det(M_n-z_0-re^{i\theta})|\ d\theta

that relates the local distribution of eigenvalues to the log-determinants {\log |\det(M_n-z_0)|}, and secondly the elementary identity

\displaystyle  \log |\det(M_n - z)| = \frac{1}{2} \log|\det W_{n,z}| + \frac{1}{2} n \log n

that relates the log-determinants of {M_n-z} to the log-determinants of the Hermitian matrices

\displaystyle  W_{n,z} := \frac{1}{\sqrt{n}} \begin{pmatrix} 0 & M_n -z \\ (M_n-z)^* & 0 \end{pmatrix}.

The main difficulty is then to obtain concentration and universality results for the Hermitian log-determinants {\log|\det W_{n,z}|}. This turns out to be a task that is analogous to the task of obtaining concentration for Wigner matrices (as we did in this recent paper), as well as central limit theorems for log-determinants of Wigner matrices (as we did in this other recent paper). In both of these papers, the main idea was to use the Four Moment Theorem for Wigner matrices (which can now be proven relatively easily by a combination of the local semi-circular law and resolvent swapping methods), combined with (in the latter paper) a central limit theorem for the gaussian unitary ensemble (GUE). This latter task was achieved by using the convenient Trotter normal form to tridiagonalise a GUE matrix, which has the effect of revealing the determinant of that matrix as the solution to a certain linear stochastic difference equation, and one can analyse the distribution of that solution via such tools as the martingale central limit theorem.

The matrices {W_{n,z}} are somewhat more complicated than Wigner matrices (for instance, the semi-circular law must be replaced by a distorted Marchenko-Pastur law), but the same general strategy works to obtain concentration and universality for their log-determinants. The main new difficulty that arises is that the analogue of the Trotter norm for gaussian random matrices is not tridiagonal, but rather Hessenberg (i.e. upper-triangular except for the lower diagonal). This ultimately has the effect of expressing the relevant determinant as the solution to a nonlinear stochastic difference equation, which is a bit trickier to solve for. Fortunately, it turns out that one only needs good lower bounds on the solution, as one can use the second moment method to upper bound the determinant and hence the log-determinant (following a classical computation of Turan). This simplifies the analysis on the equation somewhat.

While this result is the first local universality result in the category of random matrices with independent entries, there are still two limitations to the result which one would like to remove. The first is the moment matching hypotheses on the matrix. Very recently, one of the ingredients of our paper, namely the local circular law, was proved without moment matching hypotheses by Bourgade, Yau, and Yin (provided one stays away from the edge of the spectrum); however, as of this time of writing the other main ingredient – the universality of the log-determinant – still requires moment matching. (The standard tool for obtaining universality without moment matching hypotheses is the heat flow method (and more specifically, the local relaxation flow method), but the analogue of Dyson Brownian motion in the non-Hermitian setting appears to be somewhat intractible, being a coupled flow on both the eigenvalues and eigenvectors rather than just on the eigenvalues alone.)

Van Vu and I have just uploaded to the arXiv our paper “The Wigner-Dyson-Mehta bulk universality conjecture for Wigner matrices“, submitted to the Proceedings of the National Academy of Sciences. This short note concerns the convergence of the {k}-point correlation functions of Wigner matrices in the bulk to the Dyson {k}-point functions, a statement conjectured by Wigner, Dyson, and Mehta. Thanks to the results of Erdös, Peche, Ramirez, Schlein, Vu, Yau, and myself, this conjecture has now been established for all Wigner matrices (assuming a finite moment condition on the entries), but only if one uses a quite weak notion of convergence, namely averaged vague convergence in which one averages in the energy parameter {u}. The main purpose of this note is to observe that by combining together existing results in the literature, one can improve the convergence to vague convergence (which is the natural notion of convergence in the discrete setting); and furthermore, if one assumes some regularity and decay conditions on the coefficient distribution, one can improve the convergence further to local {L^1} convergence.

More precisely, let {M_n} be an {n \times n} Wigner matrix – a random Hermitian matrix whose off-diagonal elements {\frac{1}{\sqrt{n}} \zeta_{ij}} for {1 \leq i < j \leq n} are iid with mean zero and variance {1/n} (and whose diagonal elements also obey similar hypotheses, which we omit here). For simplicity, we also assume that the real and imaginary parts of {\zeta_{ij}} are also iid (as is the case for instance for the Gaussian Unitary Ensemble (GUE)). The eigenvalues {\lambda_1(M_n) \leq \ldots \leq \lambda_n(M_n)} of such a matrix are known to be asymptotically distributed accordingly to the Wigner semicircular distribution {\rho_{sc}(u)\ du}, where

\displaystyle  \rho_{sc}(u) := \frac{1}{2\pi} (4-u^2)_+^{1/2}.

In particular, this suggests that at any energy level {u} in the bulk {(-2,2)} of the spectrum, the average eigenvalue spacing should be about {\frac{1}{n \rho_{sc}(u)}}. It is then natural to introduce the normalised {k}-point correlation function

\displaystyle  \rho_{n,u}^{(k)}(t_1,\ldots,t_k) := \lim_{\epsilon \rightarrow 0} \frac{1}{\epsilon^k} {\bf P} E_\epsilon

for any distinct reals {t_1,\ldots,t_k} and {k \geq 1}, where {E_\epsilon} is the event that there is an eigenvalue in each of the intervals {[u + \frac{t_i}{n \rho_{sc}(u)}, u + \frac{t_i+\epsilon}{n \rho_{sc}(u)}]} for each {1 \leq i \leq k}. (This definition is valid when the Wigner ensemble is continuous; for discrete ensembles, one can define {\rho_{n,u}^{(k)}} instead in a distributional sense.)

The Wigner-Dyson-Mehta conjecture asserts that {\rho_{n,u}^{(k)}} converges (in various senses) as {n \rightarrow \infty} to the Dyson {k}-point function

\displaystyle  \rho_{Dyson}^{(k)}(t_1,\ldots,t_k) := \hbox{det}( K( t_i,t_j) )_{1 \leq i,j \leq k}

where {K(t,t'):=\frac{\sin \pi(t-t')}{\pi(t-t')}} is the Dyson sine kernel. This conjecture was verified first for the GUE (with a quite strong notion of convergence, namely local uniform convergence) by Dyson, using an explicit formula for {\rho_{n,u}^{(k)}} in the GUE case due to Gaudin and Mehta. Later results of Johansson, Erdos-Ramirez-Schlein-Yau, Erdos-Peche-Ramirez-Schlein-Yau, and Vu and myself, extended these results to increasingly wider ranges of Wigner matrices, but in the context of either weak convergence (which means that

\displaystyle  \int_{{\bf R}^k} \rho_{n,u}^{(k)}(t) F(t)\ dt \rightarrow \int_{{\bf R}^k} \rho_{Dyson}^{(k)}(t) F(t)\ dt \ \ \ \ \ (1)

for any {L^\infty}, compactly supported function {F}), or the slightly weaker notion of vague convergence (which is the same as weak convergence, except that the function {F} is also required to be continuous).

In a joint paper of Erdos, Ramirez, Schlein, Vu, Yau, and myself, we established the Wigner-Dyson-Mehta conjecture for all Wigner matrices (assuming only an exponential decay condition on the entries), but using a quite weak notion of convergence, namely averaged vague convergence, which allows for averaging in the energy parameter. Specifically, we showed that

\displaystyle  \lim_{b \rightarrow 0} \lim_{n \rightarrow \infty} \frac{1}{2b} \int_{u-b}^{u+b} \int_{{\bf R}^k} \rho_{n,u'}^{(k)}(t) F(t)\ dt = \int_{{\bf R}^k} \rho_{Dyson}^{(k)}(t) F(t)\ dt.

Subsequently, Erdos, Schlein, and Yau introduced the powerful local relaxation flow method, which achieved a simpler proof of the same result which also generalised to other ensembles beyond the Wigner case. However, for technical reasons, this method was restricted to establishing averaged vague convergence only.

In the current paper, we show that by combining the argument of Erdos, Ramirez, Schlein, Vu, Yau, and myself with some more recent technical results, namely the relaxation of the exponential decay condition in the four moment theorem to a finite moment condition (established by Vu and myself) and a strong eigenvalue localisation bound of Erdos, Yau, and Yin, one can upgrade the averaged vague convergence to vague convergence, and handle all Wigner matrices that assume a finite moment condition. Vague convergence is the most natural notion of convergence for discrete random matrix ensembles; for such ensembles, the correlation function is a discrete measure, and so one does not expect convergence to a continuous limit in any stronger sense than the vague sense. Also, by carefully inspecting the earlier argument of Erdos, Peche, Ramirez, Schlein, and Yau, we were able to establish convergence in the stronger local {L^1} sense once one assumed some regularity and positivity condition on the underlying coefficient distribution. These are somewhat modest and technical improvements over previous work on the Wigner-Dyson-Mehta conjecture, but they help to clarify and organise the profusion of results in this area, which are now reaching a fairly definitive form.

It may well be possible to go beyond local {L^1} convergence in the case of smooth ensembles, for instance establishing local uniform convergence; this was recently accomplished in the {k=1} case by Maltsev and Schlein. Indeed one may optimistically expect to even have convergence in the local smooth topology, which would basically be the strongest convergence one could hope for.

Last week I gave a talk at the Trinity Mathematical Society at Trinity College, Cambridge UK.  As the audience was primarily undergraduate, I gave a fairly non-technical talk on the universality phenomenon, based on this blog article of mine on the same topic.  It was a quite light and informal affair, and this is reflected in the talk slides (which, in particular, play up quite strongly the role of former students and Fellows of Trinity College in this story).   There was some interest in making these slides available publicly, so I have placed them on this site here.  (Note: copyright for the images in these slides has not been secured.)

I’ve spent the last week or so reworking the first draft of my universality article for Mathematics Awareness Month, in view of the useful comments and feedback received on that draft here on this blog, as well as elsewhere.  In fact, I ended up rewriting the article from scratch, and expanding it substantially, in order to focus on a more engaging and less technical narrative.  I found that I had to use a substantially different mindset than the one I am used to having for technical expository writing; indeed, the exercise reminded me more of my high school English assignments than of my professional work.  (This is perhaps a bad sign: English was not exactly my strongest subject as a student.)

The piece now has title: “E pluribus unum: from complexity, universality”.  This is a somewhat US-centric piece of wordplay, but Mathematics Awareness Month is, after all, a US-based initiative, even though awareness of mathematics certainly transcends national boundaries.   Still, it is a trivial matter to modify the title later if a better proposal arises, and I am sure that if I send this text to be published, that the editors may have some suggestions in this regard.

By coincidence, I moved up and expanded the other US-centric item – the discussion of the 2008 US presidential elections – to the front of the paper to play the role of the hook.  I’ll try to keep the Commonwealth spelling conventions, though. :-)

I decided to cut out the discussion of the N-body problem for various values of N, in part due to the confusion over the notion of a “solution”; there is a nice mathematical story there, but perhaps one that gets in the way of the main story of universality.

I have added a fair number of relevant images, though some of them will have to be changed in the final version for copyright reasons.  The narrow column format of this blog means that the image placement is not optimal, but I am sure that this can be rectified if this article is published professionally.

Read the rest of this entry »

The month of April has been designated as Mathematics Awareness Month by the major American mathematics organisations (the AMS, ASA, MAA, and SIAM).  I was approached to write a popular mathematics article for April 2011 (the theme for that month is “Mathematics and Complexity”).  While I have written a fair number of expository articles (including several on this blog) aimed at a mathematical audience, I actually have not had much experience writing articles at the popular mathematics level, and so I found this task to be remarkably difficult.  At this level of exposition, one not only needs to explain the facts, but also to tell a story; I have experience in the former but not in the latter.

I decided to write on the topic of universality – the phenomenon that the macroscopic behaviour of a dynamical system can be largely independent of the precise microscopic structure.   Below the fold is a first draft of the article; I would definitely welcome feedback and corrections.  It does not yet have any pictures, but I plan to rectify that in the final draft.  It also does not have a title, but this will be easy to address later.   But perhaps the biggest thing lacking right now is a narrative “hook”; I don’t yet have any good ideas as to how to make the story of universality compelling to a lay audience.  Any suggestions in this regard would be particularly appreciated.

I have not yet decided where I would try to publish this article; in fact, I might just publish it here on this blog (and eventually, in one of the blog book compilations).

Read the rest of this entry »

Van Vu and I have just uploaded to the arXiv our paper “Random covariance matrices: Universality of local statistics of eigenvalues“, to be submitted shortly. This paper draws heavily on the technology of our previous paper, in which we established a Four Moment Theorem for the local spacing statistics of eigenvalues of Wigner matrices. This theorem says, roughly speaking, that these statistics are completely determined by the first four moments of the coefficients of such matrices, at least in the bulk of the spectrum. (In a subsequent paper we extended the Four Moment Theorem to the edge of the spectrum.)

In this paper, we establish the analogous result for the singular values of rectangular iid matrices {M = M_{n,p}}, or (equivalently) the eigenvalues of the associated covariance matrix {\frac{1}{n} M M^*}. As is well-known, there is a parallel theory between the spectral theory of random Wigner matrices and those of covariance matrices; for instance, just as the former has asymptotic spectral distribution governed by the semi-circular law, the latter has asymptotic spectral distribution governed by the Marcenko-Pastur law. One reason for the connection can be seen by noting that the singular values of a rectangular matrix {M} are essentially the same thing as the eigenvalues of the augmented matrix

\displaystyle  \begin{pmatrix} 0 & M \\ M^* & 0\end{pmatrix}

after eliminating sign ambiguities and degeneracies. So one can view singular values of a rectangular iid matrix as the eigenvalues of a matrix which resembles a Wigner matrix, except that two diagonal blocks of that matrix have been zeroed out.

The zeroing out of these elements prevents one from applying the entire Wigner universality theory directly to the covariance matrix setting (in particular, the crucial Talagrand concentration inequality for the magnitude of a projection of a random vector to a subspace does not work perfectly once there are many zero coefficients). Nevertheless, a large part of the theory (particularly the deterministic components of the theory, such as eigenvalue variation formulae) carry through without much difficulty. The one place where one has to spend a bit of time to check details is to ensure that the Erdos-Schlein-Yau delocalisation result (that asserts, roughly speaking, that the eigenvectors of a Wigner matrix are about as small in {\ell^\infty} norm as one could hope to get) is also true for in the covariance matrix setting, but this is a straightforward (though somewhat tedious) adaptation of the method (which is based on the Stieltjes transform).

As an application, we extend the sine kernel distribution of local covariance matrix statistics, first established in the case of Wishart ensembles (when the underlying variables are gaussian) by Nagao and Wadati, and later extended to gaussian-divisible matrices by Ben Arous and Peche, to any distributions which matches one of these distributions to up to four moments, which covers virtually all complex distributions with independent iid real and imaginary parts, with basically the lone exception of the complex Bernoulli ensemble.

Recently, Erdos, Schlein, Yau, and Yin generalised their local relaxation flow method to also obtain similar universality results for distributions which have a large amount of smoothness, but without any matching moment conditions. By combining their techniques with ours as in our joint paper, one should probably be able to remove both smoothness and moment conditions, in particular now covering the complex Bernoulli ensemble.

In this paper we also record a new observation that the exponential decay hypothesis in our earlier paper can be relaxed to a finite moment condition, for a sufficiently high (but fixed) moment. This is done by rearranging the order of steps of the original argument carefully.

Van Vu and I have just uploaded to the arXiv our paper “Random matrices: Universality of local eigenvalue statistics up to the edge“, submitted to Comm. Math. Phys..  This is a sequel to our previous paper, in which we studied universality of local eigenvalue statistics (such as normalised eigenvalue spacings \sqrt{n} ( \lambda_{i+1}(M_n) - \lambda_i(M_n) )) for random matrices M_n of Wigner type, i.e. Hermitian (or symmetric) random matrices in which the upper-triangular entries are independent with mean zero and variance one (for technical reasons we also have to assume an exponential decay condition on the distribution of the entries).   The results in the previous paper were almost entirely focused on the bulk region, in which the index i of the eigenvalues involved was in the range \varepsilon n \leq i \leq (1-\varepsilon) n.  The main purpose of this paper is to extend the main results of the previous paper all the way up to the edge, thus allowing one to control all indices 1 \leq i \leq n.  As an application, we obtain a variant of Soshnikov’s well-known result that the largest eigenvalue of Wigner matrices is distributed (after suitable normalisation) according to the Tracy-Widom law when the coefficients are symmetric, by assuming instead that the coefficients have vanishing third moment.

As one transitions from the bulk to the edge, the density of the eigenvalues decreases to zero (in accordance to the Wigner semicircular law), and so the average spacing between eigenvalues increases.  (For instance, the spacing between eigenvalues in the bulk is of size n^{-1/2}, but at the extreme edge it increases to n^{-1/6}.)  On the one hand, the increase in average spacing should make life easier, because one does not have to work at such a fine spatial scale in order to see the eigenvalue distribution.  On the other hand, a certain key technical step in the previous paper (in which we adapted an argument of Erdos, Schlein, and Yau to show that eigenvectors of Wigner matrices were delocalised) seemed to require eigenvalue spacings to be of size O(n^{-1/2}), which was the main technical obstacle to extending the preceding results from the bulk to the edge.

The main new observation in the paper is that it was not the eigenvalue spacings \lambda_{i+1}(M_n) - \lambda_i(M_n) which were of importance to eigenvalue delocalisation, but rather the somewhat smaller interlaced eigenvalue spacings \lambda_{i+1}(M_n) - \lambda_i(M_{n-1}), where M_{n-1} is a n-1 \times n-1 minor of M_n.  The Cauchy interlacing law asserts that the latter is smaller than the former.  But the interesting thing is that at the edge (when i is close to n), the interlaced spacings are much smaller than the former, and in particular remain of size about O(n^{-1/2}) (up to log factors) even though the non-interlaced spacings increase to be as large as O(n^{-1/6}).  This is ultimately due to a sort of “attractive force” on eigenvalues that draws them towards the origin, and counteracts the usual “eigenvalue repulsion effect”, that pushes eigenvalues away from each other.  This induces “bias” for eigenvalues to move in towards the bulk rescues the delocalization result, and the remainder of the arguments in our previous paper then continue with few changes.

Below the fold I wish to give some heuristic justification of the interlacing bias phenomenon, sketch why this is relevant for eigenvector delocalisation, and finally to recall why eigenvalue delocalisation in turn is relevant for universality.

[Update, Aug 16: sign error corrected.]

Read the rest of this entry »

The Riemann zeta function {\zeta(s)}, defined for {\hbox{Re}(s)>1} by

\displaystyle  \zeta(s) := \sum_{n=1}^\infty \frac{1}{n^s} \ \ \ \ \ (1)

and then continued meromorphically to other values of {s} by analytic continuation, is a fundamentally important function in analytic number theory, as it is connected to the primes {p=2,3,5,\ldots} via the Euler product formula

\displaystyle  \zeta(s) = \prod_p (1 - \frac{1}{p^s})^{-1} \ \ \ \ \ (2)

(for {\hbox{Re}(s) > 1}, at least), where {p} ranges over primes. (The equivalence between (1) and (2) is essentially the generating function version of the fundamental theorem of arithmetic.) The function {\zeta} has a pole at {1} and a number of zeroes {\rho}. A formal application of the factor theorem gives

\displaystyle  \zeta(s) = \frac{1}{s-1} \prod_\rho (s-\rho) \times \ldots \ \ \ \ \ (3)

where {\rho} ranges over zeroes of {\zeta}, and we will be vague about what the {\ldots} factor is, how to make sense of the infinite product, and exactly which zeroes of {\zeta} are involved in the product. Equating (2) and (3) and taking logarithms gives the formal identity

\displaystyle  - \log \zeta(s) = \sum_p \log(1 - \frac{1}{p^s}) = \log(s-1) - \sum_\rho \log(s-\rho) + \ldots; \ \ \ \ \ (4)

using the Taylor expansion

\displaystyle  \log(1 - \frac{1}{p^s}) = - \frac{1}{p^s} - \frac{1}{2 p^{2s}} - \frac{1}{3p^{3s}} - \ldots \ \ \ \ \ (5)

and differentiating the above identity in {s} yields the formal identity

\displaystyle  - \frac{\zeta'(s)}{\zeta(s)} = \sum_n \frac{\Lambda(n)}{n^s} = \frac{1}{s-1} - \sum_\rho \frac{1}{s-\rho} + \ldots \ \ \ \ \ (6)

where {\Lambda(n)} is the von Mangoldt function, defined to be {\log p} when {n} is a power of a prime {p}, and zero otherwise. Thus we see that the behaviour of the primes (as encoded by the von Mangoldt function) is intimately tied to the distribution of the zeroes {\rho}. For instance, if we knew that the zeroes were far away from the axis {\hbox{Re}(s)=1}, then we would heuristically have

\displaystyle  \sum_n \frac{\Lambda(n)}{n^{1+it}} \approx \frac{1}{it}

for real {t}. On the other hand, the integral test suggests that

\displaystyle  \sum_n \frac{1}{n^{1+it}} \approx \frac{1}{it}

and thus we see that {\frac{\Lambda(n)}{n}} and {\frac{1}{n}} have essentially the same (multiplicative) Fourier transform:

\displaystyle  \sum_n \frac{\Lambda(n)}{n^{1+it}} \approx \sum_n \frac{1}{n^{1+it}}.

Inverting the Fourier transform (or performing a contour integral closely related to the inverse Fourier transform), one is led to the prime number theorem

\displaystyle  \sum_{n \leq x} \Lambda(n) \approx \sum_{n \leq x} 1.

In fact, the standard proof of the prime number theorem basically proceeds by making all of the above formal arguments precise and rigorous.

Unfortunately, we don’t know as much about the zeroes {\rho} of the zeta function (and hence, about the {\zeta} function itself) as we would like. The Riemann hypothesis (RH) asserts that all the zeroes (except for the “trivial” zeroes at the negative even numbers) lie on the critical line {\hbox{Re}(s)=1/2}; this hypothesis would make the error terms in the above proof of the prime number theorem significantly more accurate. Furthermore, the stronger GUE hypothesis asserts in addition to RH that the local distribution of these zeroes on the critical line should behave like the local distribution of the eigenvalues of a random matrix drawn from the gaussian unitary ensemble (GUE). I will not give a precise formulation of this hypothesis here, except to say that the adjective “local” in the context of distribution of zeroes {\rho} means something like “at scale {O(1/\log T)} when {\hbox{Im}(s) = O(T)}“.

Nevertheless, we do know some reasonably non-trivial facts about the zeroes {\rho} and the zeta function {\zeta}, either unconditionally, or assuming RH (or GUE). Firstly, there are no zeroes for {\hbox{Re}(s)>1} (as one can already see from the convergence of the Euler product (2) in this case) or for {\hbox{Re}(s)=1} (this is trickier, relying on (6) and the elementary observation that

\displaystyle  \hbox{Re}( 3\frac{\Lambda(n)}{n^{\sigma}} + 4\frac{\Lambda(n)}{n^{\sigma+it}} + \frac{\Lambda(n)}{n^{\sigma+2it}} ) = 2\frac{\Lambda(n)}{n^\sigma} (1+\cos(t \log n))^2

is non-negative for {\sigma > 1} and {t \in {\mathbb R}}); from the functional equation

\displaystyle  \pi^{-s/2} \Gamma(s/2) \zeta(s) = \pi^{-(1-s)/2} \Gamma((1-s)/2) \zeta(1-s)

(which can be viewed as a consequence of the Poisson summation formula, see e.g. my blog post on this topic) we know that there are no zeroes for {\hbox{Re}(s) \leq 0} either (except for the trivial zeroes at negative even integers, corresponding to the poles of the Gamma function). Thus all the non-trivial zeroes lie in the critical strip {0 < \hbox{Re}(s) < 1}.

We also know that there are infinitely many non-trivial zeroes, and can approximately count how many zeroes there are in any large bounded region of the critical strip. For instance, for large {T}, the number of zeroes {\rho} in this strip with {\hbox{Im}(\rho) = T+O(1)} is {O(\log T)}. This can be seen by applying (6) to {s = 2+iT} (say); the trivial zeroes at the negative integers end up giving a contribution of {O(\log T)} to this sum (this is a heavily disguised variant of Stirling’s formula, as one can view the trivial zeroes as essentially being poles of the Gamma function), while the {\frac{1}{s-1}} and {\ldots} terms end up being negligible (of size {O(1)}), while each non-trivial zero {\rho} contributes a term which has a non-negative real part, and furthermore has size comparable to {1} if {\hbox{Im}(\rho) = T+O(1)}. (Here I am glossing over a technical renormalisation needed to make the infinite series in (6) converge properly.) Meanwhile, the left-hand side of (6) is absolutely convergent for {s=2+iT} and of size {O(1)}, and the claim follows. A more refined version of this argument shows that the number of non-trivial zeroes with {0 \leq \hbox{Im}(\rho) \leq T} is {\frac{T}{2\pi} \log \frac{T}{2\pi} - \frac{T}{2\pi} + O(\log T)}, but we will not need this more precise formula here. (A fair fraction – at least 40%, in fact – of these zeroes are known to lie on the critical line; see this earlier blog post of mine for more discussion.)

Another thing that we happen to know is how the magnitude {|\zeta(1/2+it)|} of the zeta function is distributed as {t \rightarrow \infty}; it turns out to be log-normally distributed with log-variance about {\frac{1}{2} \log \log t}. More precisely, we have the following result of Selberg:

Theorem 1 Let {T} be a large number, and let {t} be chosen uniformly at random from between {T} and {2T} (say). Then the distribution of {\frac{1}{\sqrt{\frac{1}{2} \log \log T}} \log |\zeta(1/2+it)|} converges (in distribution) to the normal distribution {N(0,1)}.

To put it more informally, {\log |\zeta(1/2+it)|} behaves like {\sqrt{\frac{1}{2} \log \log t} \times N(0,1)} plus lower order terms for “typical” large values of {t}. (Zeroes {\rho} of {\zeta} are, of course, certainly not typical, but one can show that one can usually stay away from these zeroes.) In fact, Selberg showed a slightly more precise result, namely that for any fixed {k \geq 1}, the {k^{th}} moment of {\frac{1}{\sqrt{\frac{1}{2} \log \log T}} \log |\zeta(1/2+it)|} converges to the {k^{th}} moment of {N(0,1)}.

Remarkably, Selberg’s result does not need RH or GUE, though it is certainly consistent with such hypotheses. (For instance, the determinant of a GUE matrix asymptotically obeys a remarkably similar log-normal law to that given by Selberg’s theorem.) Indeed, the net effect of these hypotheses only affects some error terms in {\log |\zeta(1/2+it)|} of magnitude {O(1)}, and are thus asymptotically negligible compared to the main term, which has magnitude about {O(\sqrt{\log \log T})}. So Selberg’s result, while very pretty, manages to finesse the question of what the zeroes {\rho} of {\zeta} are actually doing – he makes the primes do most of the work, rather than the zeroes.

Selberg never actually published the above result, but it is reproduced in a number of places (e.g. in this book by Joyner, or this book by Laurincikas). As with many other results in analytic number theory, the actual details of the proof can get somewhat technical; but I would like to record here (partly for my own benefit) an informal sketch of some of the main ideas in the argument.

Read the rest of this entry »

One further paper in this stream: László Erdős, José Ramírez, Benjamin Schlein, Van Vu, Horng-Tzer Yau, and myself have just uploaded to the arXiv the paper “Bulk universality for Wigner hermitian matrices with subexponential decay“, submitted to Mathematical Research Letters.  (Incidentally, this is my first six-author paper I have been involved in, not counting the polymath projects of course, though I have had a number of five-author papers.)

This short paper (9 pages) combines the machinery from two recent papers on the universality conjecture for the eigenvalue spacings in the bulk for Wigner random matrices (see my earlier blog post for more discussion).  On the one hand, the paper of Erdős-Ramírez-Schlein-Yau established this conjecture under the additional hypothesis that the distribution of the individual entries obeyed some smoothness and exponential decay conditions.  Meanwhile, the paper of Van Vu and myself (which I discussed in my earlier blog post) established the conjecture under a somewhat different set of hypotheses, namely that the distribution of the individual entries obeyed some moment conditions (in particular, the third moment had to vanish), a support condition (the entries had to have real part supported in at least three points), and an exponential decay condition.

After comparing our results, the six of us realised that our methods could in fact be combined rather easily to obtain a stronger result, establishing the universality conjecture assuming only a exponential decay (or more precisely, sub-exponential decay) bound {\Bbb P}(|x_{\ell k}| > t ) \ll \exp( - t^c ) on the coefficients; thus all regularity, moment, and support conditions have been eliminated.  (There is one catch, namely that we can no longer control a single spacing \lambda_{i+1}-\lambda_i for a single fixed i, but must now average over all 1 \leq i \leq n before recovering the universality.  This is an annoying technical issue but it may be resolvable in the future with further refinements to the method.)

I can describe the main idea behind the unified approach here.  One can arrange the Wigner matrices in a hierarchy, from most structured to least structured:

  • The most structured (or special) ensemble is the Gaussian Unitary Ensemble (GUE), in which the coefficients are gaussian. Here, one has very explicit and tractable formulae for the eigenvalue distributions, gap spacing, etc.
  • The next most structured ensemble of Wigner matrices are the Gaussian-divisible or Johansson matrices, which are matrices H of the form H = e^{-t/2} \hat H + (1-e^{-t})^{1/2} V, where \hat H is another Wigner matrix, V is a GUE matrix independent of \hat H, and 0 < t < 1 is a fixed parameter independent of n.  Here, one still has quite explicit (though not quite as tractable) formulae for the joint eigenvalue distribution and related statistics.  Note that the limiting case t=1 is GUE.
  • After this, one has the Ornstein-Uhlenbeck-evolved matrices, which are also of the form H = e^{-t/2} \hat H + (1-e^{-t})^{1/2} V, but now t = n^{-1+\delta} decays at a power rate with n, rather than being comparable to 1.  Explicit formulae still exist for these matrices, but extracting universality out of this is hard work (and occupies the bulk of the paper of Erdős-Ramírez-Schlein-Yau).
  • Finally, one has arbitrary Wigner matrices, which can be viewed as the t=0 limit of the above Ornstein-Uhlenbeck process.

The arguments in the paper of Erdős-Ramírez-Schlein-Yau can be summarised as follows (I assume subexponential decay throughout this discussion):

  1. (Structured case) The universality conjecture is true for Ornstein-Uhlenbeck-evolved matrices with t = n^{-1+\delta} for any 0 < \delta \leq 1.  (The case 1/4 < \delta \leq 1 was treated in an earlier paper of Erdős-Ramírez-Schlein-Yau, while the case where t is comparable to 1 was treated by Johansson.)
  2. (Matching) Every Wigner matrix with suitable smoothness conditions can be “matched” with an Ornstein-Uhlenbeck-evolved matrix, in the sense that the eigenvalue statistics for the two matrices are asymptotically identical.  (This is relatively easy due to the fact that \delta can be taken arbitrarily close to zero.)
  3. Combining 1. and 2. one obtains universality for all Wigner matrices obeying suitable smoothness conditions.

The arguments in the paper of Van and myself can be summarised as follows:

  1. (Structured case) The universality conjecture is true for Johansson matrices, by the paper of Johansson.
  2. (Matching) Every Wigner matrix with some moment and support conditions can be “matched” with a Johansson matrix, in the sense that the first four moments of the entries agree, and hence (by the Lindeberg strategy in our paper) have asymptotically identical statistics.
  3. Combining 1. and 2. one obtains universality for all Wigner matrices obtaining suitable moment and support conditions.

What we realised is by combining the hard part 1. of the paper of Erdős-Ramírez-Schlein-Yau with the hard part 2. of the paper of Van and myself, we can remove all regularity, moment, and support conditions.  Roughly speaking, the unified argument proceeds as follows:

  1. (Structured case) By the arguments of Erdős-Ramírez-Schlein-Yau, the universality conjecture is true for Ornstein-Uhlenbeck-evolved matrices with t = n^{-1+\delta} for any 0 < \delta \leq 1.
  2. (Matching) Every Wigner matrix H can be “matched” with an Ornstein-Uhlenbeck-evolved matrix e^{-t/2} H + (1-e^{-t})^{1/2} V for t= n^{-1+0.01} (say), in the sense that the first four moments of the entries almost agree, which is enough (by the arguments of Van and myself) to show that these two matrices have asymptotically identical statistics on the average.
  3. Combining 1. and 2. one obtains universality for the averaged statistics for all Wigner matrices.

The averaging should be removable, but this would require better convergence results to the semicircular law than are currently known (except with additional hypotheses, such as vanishing third moment).  The subexponential decay should also be relaxed to a condition of finiteness for some fixed moment {\Bbb E} |x|^C, but we did not pursue this direction in order to keep the paper short.