Van Vu and I have just uploaded to the arXiv our paper “Random matrices: universality of local eigenvalue statistics”, submitted to Acta Math. This paper concerns the eigenvalues $\lambda_1(M_n) \leq \ldots \leq \lambda_n(M_n)$ of a *Wigner matrix* $M_n$, which we define to be a random Hermitian $n \times n$ matrix whose upper-triangular entries are independent (and whose strictly upper-triangular entries are also identically distributed). [The lower-triangular entries are of course determined from the upper-triangular ones by the Hermitian property.] We normalise the matrices so that all the entries have mean zero and variance 1. Basic examples of Wigner matrices include

- The Gaussian Unitary Ensemble (GUE), in which the strictly upper-triangular entries are complex gaussian, and the diagonal entries are real gaussians;
- The Gaussian Orthogonal Ensemble (GOE), in which all entries are real gaussian;
- The *Bernoulli Ensemble*, in which all entries take values $\pm 1$ (with equal probability of each).

We will make a further distinction into *Wigner real symmetric matrices* (which are Wigner matrices with real coefficients, such as GOE and the Bernoulli ensemble) and Wigner Hermitian matrices (which are Wigner matrices whose upper-triangular coefficients have real and imaginary parts iid, such as GUE).
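The three example ensembles can be sampled in a few lines. The following is a minimal sketch in plain Python, not code from the paper: the function names `sample_wigner` and `is_hermitian` and the string labels are hypothetical, and the variance-1 gaussian diagonal is this sketch's reading of the normalisation stated above.

```python
import math
import random

def sample_wigner(n, ensemble, rng):
    """Sample an n x n Hermitian matrix as a list of lists of complex numbers.

    ensemble is "GUE", "GOE" or "Bernoulli" (labels chosen for this sketch).
    """
    if ensemble not in ("GUE", "GOE", "Bernoulli"):
        raise ValueError("unknown ensemble: " + ensemble)

    def off_diagonal():
        if ensemble == "GUE":
            # Complex gaussian with mean 0 and E|z|^2 = 1:
            # independent real and imaginary parts, each N(0, 1/2).
            s = math.sqrt(0.5)
            return complex(rng.gauss(0.0, s), rng.gauss(0.0, s))
        if ensemble == "GOE":
            return complex(rng.gauss(0.0, 1.0), 0.0)
        return complex(rng.choice((-1.0, 1.0)), 0.0)

    def diagonal():
        if ensemble == "Bernoulli":
            return complex(rng.choice((-1.0, 1.0)), 0.0)
        # Real gaussian diagonal for GUE and GOE (variance 1 is an assumption
        # of this sketch, following the normalisation in the text).
        return complex(rng.gauss(0.0, 1.0), 0.0)

    m = [[0j] * n for _ in range(n)]
    for i in range(n):
        m[i][i] = diagonal()
        for j in range(i + 1, n):
            m[i][j] = off_diagonal()
            m[j][i] = m[i][j].conjugate()  # Hermitian property fixes the lower triangle
    return m

def is_hermitian(m):
    """Check that m equals its own conjugate transpose."""
    n = len(m)
    return all(m[i][j] == m[j][i].conjugate() for i in range(n) for j in range(n))
```

For instance, `sample_wigner(100, "GUE", random.Random(0))` returns one sample with complex off-diagonal entries, while the `"GOE"` and `"Bernoulli"` samples are real symmetric.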

The GUE and GOE ensembles have a rich algebraic structure (for instance, the GUE distribution is invariant under conjugation by unitary matrices, while the GOE distribution is similarly invariant under conjugation by orthogonal matrices, hence the terminology), and as a consequence their eigenvalue distribution can be computed explicitly. For instance, the joint distribution of the eigenvalues for GUE is given by the explicit formula

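$\displaystyle C_n \prod_{1 \leq i < j \leq n} |\lambda_i - \lambda_j|^2 \exp\Big( -\sum_{1 \leq i \leq n} \lambda_i^2/2 \Big)\ d\lambda_1 \ldots d\lambda_n$ (0)

for some explicitly computable constant $C_n$ on the orthant $\{\lambda_1 \leq \ldots \leq \lambda_n\}$ (a result first established by Ginibre). (A similar formula exists for GOE, but for simplicity we will just discuss GUE here.) Using this explicit formula one can compute a wide variety of asymptotic eigenvalue statistics. For instance, the (bulk) *empirical spectral distribution (ESD) measure* $\frac{1}{n} \sum_{i=1}^n \delta_{\lambda_i(M_n)/\sqrt{n}}$ for GUE (and indeed for all Wigner matrices, see below) is known to converge (in the vague sense) to the Wigner semicircular law

$\displaystyle \frac{1}{2\pi} (4-x^2)_+^{1/2}\ dx =: \rho_{sc}(x)\ dx$ (1)

as $n \to \infty$. Actually, more precise statements are known for GUE; for instance, for $\epsilon n \leq i \leq (1-\epsilon) n$, the $i^{th}$ eigenvalue $\lambda_i(M_n)$ is known to equal

$\displaystyle \lambda_i(M_n) = \sqrt{n}\, t\Big(\frac{i}{n}\Big) + O\Big( \frac{\log n}{\sqrt{n}} \Big)$ (2)

with probability $1-o(1)$, where $t(a) \in [-2,2]$ is the inverse cumulative distribution function of the semicircular law, thus

$\displaystyle a = \int_{-2}^{t(a)} \rho_{sc}(x)\ dx$.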
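The inverse cumulative distribution function of the semicircular law has no elementary closed form, but it is easy to evaluate numerically. The sketch below is an illustration only (the names `semicircle_cdf`, `t` and `predicted_eigenvalue` are hypothetical): it uses the closed form of the cumulative distribution function of the density $\frac{1}{2\pi}(4-x^2)^{1/2}$ on $[-2,2]$ and inverts it by bisection, which gives the predicted location $\sqrt{n}\, t(i/n)$ of the $i^{th}$ eigenvalue.

```python
import math

def semicircle_cdf(x):
    """CDF of the density (1 / (2 pi)) * sqrt(4 - x^2) supported on [-2, 2]."""
    x = max(-2.0, min(2.0, x))
    return 0.5 + x * math.sqrt(4.0 - x * x) / (4.0 * math.pi) + math.asin(x / 2.0) / math.pi

def t(a, tol=1e-12):
    """Inverse CDF by bisection: the point t(a) in [-2, 2] with semicircle_cdf(t(a)) = a."""
    lo, hi = -2.0, 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if semicircle_cdf(mid) < a:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def predicted_eigenvalue(i, n):
    """Main term sqrt(n) * t(i / n) for the i-th eigenvalue of an n x n Wigner matrix."""
    return math.sqrt(n) * t(i / n)
```

By symmetry of the semicircle, `t(0.5)` is zero and `t(a) = -t(1 - a)`, so the middle eigenvalue is predicted to sit near the origin.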

Furthermore, the distribution of the normalised eigenvalue spacing $\sqrt{n}\, \rho_{sc}(t(i/n))\, (\lambda_{i+1}(M_n) - \lambda_i(M_n))$ (where $\rho_{sc}$ is the density of the semicircular law (1)) is known; in the bulk region $\epsilon n \leq i \leq (1-\epsilon) n$ for fixed $\epsilon > 0$, it converges as $n \to \infty$ to the *Gaudin distribution*, which can be described explicitly in terms of determinants of the Dyson sine kernel $K(x,y) := \frac{\sin \pi(x-y)}{\pi(x-y)}$. Many further local statistics of the eigenvalues of GUE are in fact governed by this sine kernel, a result usually proven using the asymptotics of orthogonal polynomials (and specifically, the Hermite polynomials). (At the edge of the spectrum, say $i = n - O(1)$, the asymptotic distribution is a bit different, being governed instead by the *Tracy-Widom law*.)
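To unpack the phrase "determinants of the sine kernel": in the limiting point process the $k$-point correlation function at $x_1,\ldots,x_k$ is the determinant of the $k \times k$ matrix with entries $\frac{\sin \pi(x_i-x_j)}{\pi(x_i-x_j)}$. The following is a minimal sketch for $k=2$ (function names are hypothetical). It computes the two-point correlation, not the Gaudin spacing distribution itself, which requires a Fredholm determinant.

```python
import math

def sine_kernel(x, y):
    """Dyson sine kernel K(x, y) = sin(pi (x - y)) / (pi (x - y)), with K(x, x) = 1."""
    d = x - y
    if d == 0:
        return 1.0
    return math.sin(math.pi * d) / (math.pi * d)

def two_point_correlation(x, y):
    """Determinant of the 2 x 2 matrix (K(x_i, x_j)) for the two points x and y."""
    return sine_kernel(x, x) * sine_kernel(y, y) - sine_kernel(x, y) * sine_kernel(y, x)
```

The determinant vanishes as `y` approaches `x`, which is the eigenvalue repulsion visible in the local statistics: two normalised eigenvalues are unlikely to be very close together.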

It has been widely believed that these GUE facts enjoy a universality property, in the sense that they should also hold for wide classes of other matrix models. In particular, Wigner matrices should enjoy the same bulk distribution (1), the same asymptotic law (2) for individual eigenvalues, and the same sine kernel statistics as GUE. (The statistics for Wigner symmetric matrices are slightly different, and should obey GOE statistics rather than GUE ones.)

There has been a fair body of evidence to support this belief. The bulk distribution (1) is in fact valid for all Wigner matrices (a result of Pastur, building on the original work of Wigner of course). The Tracy-Widom statistics on the edge were established for all Wigner Hermitian matrices (assuming that the coefficients had a distribution which was symmetric and decayed exponentially) by Soshnikov (with some further refinements by Soshnikov and Peche). Soshnikov’s arguments were based on an advanced version of the moment method.

The sine kernel statistics were established by Johansson for Wigner Hermitian matrices which were *gaussian divisible*, which means that they could be expressed as a non-trivial linear combination of another Wigner Hermitian matrix and an independent GUE. (Basically, this means that the distribution of the coefficients is a convolution of some other distribution with a gaussian. There were some additional technical decay conditions in Johansson’s work which were removed in subsequent work of Ben Arous and Peche.) Johansson’s work was based on an explicit formula for the joint distribution for gaussian divisible matrices that generalises (0) (but is significantly more complicated).

Just last week, Erdos, Ramirez, Schlein, and Yau established sine kernel statistics for Wigner Hermitian matrices with exponential decay and a high degree of smoothness (roughly speaking, they require control of up to six derivatives of the Radon-Nikodym derivative of the distribution). Their method is based on an analysis of the dynamics of the eigenvalues under a smooth transition from a general Wigner Hermitian matrix to GUE, essentially a matrix version of the Ornstein-Uhlenbeck process, whose eigenvalue dynamics are governed by *Dyson Brownian motion*.

In my paper with Van, we establish similar results to those of Erdos et al. under slightly different hypotheses, and by a somewhat different method. Informally, our main result is as follows:

Theorem 1. (Informal version) Suppose $M_n$ is a Wigner Hermitian matrix whose coefficients have an exponentially decaying distribution, and whose real and imaginary parts are supported on at least three points (basically, this excludes Bernoulli-type distributions only) and have vanishing third moment (which is for instance the case for symmetric distributions). Then one has the local statistics (2) (but with an error term of $O(n^{-1/2+\delta})$ for any $\delta > 0$ rather than $O(\log n/\sqrt{n})$) and the sine kernel statistics for individual eigenvalue spacings (as well as higher order correlations) in the bulk. If one removes the vanishing third moment hypothesis, one still has the sine kernel statistics provided one averages over all $i$.

There are analogous results for Wigner real symmetric matrices (see paper for details). There are also some related results, such as a universal distribution for the least singular value of matrices of the form in Theorem 1, and a crude asymptotic for the determinant (in particular, $\log |\det M_n| = (1+o(1)) \log \sqrt{n!}$ with probability $1-o(1)$).

The arguments are based primarily on the *Lindeberg replacement strategy,* which Van and I also used to obtain universality for the circular law for iid matrices, and for the least singular value for iid matrices, but also rely on other tools, such as some recent arguments of Erdos, Schlein, and Yau, as well as a very useful concentration inequality of Talagrand which lets us tackle both discrete and continuous matrix ensembles in a unified manner. (I plan to talk about Talagrand’s inequality in my next blog post.)

— The replacement strategy —

Suppose one has a Wigner Hermitian random matrix $M_n$. To compute the eigenvalue statistics of $M_n$, e.g. the expectation ${\bf E} F(\lambda_i(M_n))$ of some suitably normalised function $F$ of the $i^{th}$ eigenvalue, the Lindeberg strategy is not to perform this computation directly, but to first compare the relevant statistic of $M_n$ with the corresponding statistic of a much nicer random matrix ensemble $M'_n$, such as GUE. Since the statistics of the latter have already been computed, one will be done as soon as one establishes an invariance principle for the statistic being studied.

After trying a number of implementations of this strategy, Van and I eventually settled on that of replacing the entries of $M_n$ with the corresponding gaussian entries of $M'_n$ one by one (or more precisely, two by two, since we need to keep the matrix Hermitian) and seeing how the eigenvalues evolve. A key tool here is the *Hadamard variation formula*

$\displaystyle \frac{d}{dt} \lambda_i(A) = u_i(A)^* \Big( \frac{d}{dt} A \Big) u_i(A)$

that describes the rate of change of an eigenvalue $\lambda_i(A)$ in terms of the rate of change of the matrix $A$, and the unit eigenvector $u_i(A)$ of the matrix associated to that eigenvalue. (See this earlier blog post for further discussion of this and related formulae). If one assumes that the eigenvectors are delocalised in the sense that their coefficients have magnitude $O(n^{-1/2})$ (which is essentially the best possible, given that these magnitudes must square-sum to 1 by Pythagoras’ theorem), this formula suggests that the replacement of one or two entries in a random matrix should only move the eigenvalues by about $O(1/n)$. (This type of delocalisation result was recently established by Erdos, Schlein, and Yau, for a slightly more restrictive class of matrices, but we were able to extend their method to our class.) Since the mean eigenvalue spacing is about $n^{-1/2}$, this is encouraging for the purposes of establishing the invariance principle. However, we have to perform about $n^2$ separate replacement operations. A random walk of $n^2$ steps of size $O(1/n)$ would be expected to move a distance of about $O(1)$, which is much larger than the mean eigenvalue spacing $n^{-1/2}$, so it looks like this strategy is not going to work.
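The Hadamard variation formula says that the derivative of an eigenvalue of a Hermitian matrix along a Hermitian direction $E$ equals $u^* E u$, where $u$ is the corresponding unit eigenvector. As a sanity check (a toy sketch only, not part of the paper's argument), one can compare this with a finite difference for a $2 \times 2$ Hermitian matrix, where the top eigenvalue and eigenvector have closed forms. All function names are hypothetical, and the sketch assumes the off-diagonal entry `b` is non-zero.

```python
import math

def top_eigenpair(a, b, d):
    """Largest eigenvalue and a unit eigenvector of [[a, b], [conj(b), d]].

    a and d are real, b is complex and assumed non-zero.
    """
    half_gap = math.hypot((a - d) / 2.0, abs(b))
    lam = (a + d) / 2.0 + half_gap
    # (b, lam - a) solves the first row of (A - lam I) v = 0.
    v0, v1 = b, lam - a
    norm = math.sqrt(abs(v0) ** 2 + abs(v1) ** 2)
    return lam, (v0 / norm, v1 / norm)

def hadamard_derivative(a, b, d, ea, eb, ed):
    """u^* E u for the top unit eigenvector u, with E = [[ea, eb], [conj(eb), ed]]."""
    _, (u0, u1) = top_eigenpair(a, b, d)
    w0 = ea * u0 + eb * u1
    w1 = eb.conjugate() * u0 + ed * u1
    return (u0.conjugate() * w0 + u1.conjugate() * w1).real

def finite_difference(a, b, d, ea, eb, ed, h=1e-6):
    """Central difference of the top eigenvalue of A + t E at t = 0."""
    lam_plus, _ = top_eigenpair(a + h * ea, b + h * eb, d + h * ed)
    lam_minus, _ = top_eigenpair(a - h * ea, b - h * eb, d - h * ed)
    return (lam_plus - lam_minus) / (2.0 * h)
```

For example, with `a, b, d = 1.0, 0.5 + 0.25j, -0.3` and direction `ea, eb, ed = 0.2, 1.0 - 0.5j, 0.7`, the two functions agree to within the finite-difference error.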

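Fortunately, one can salvage things by performing a Taylor expansion. Suppose for instance that one is swapping one entry $\zeta_{pq}$ (together with its adjoint $\zeta_{qp} = \overline{\zeta_{pq}}$) of a random matrix M into another entry $\zeta'_{pq}$ to create another matrix $M'$. Thus one can write

$\displaystyle M = M_0 + \zeta_{pq} e_{pq} + \overline{\zeta_{pq}} e_{qp}; \quad M' = M_0 + \zeta'_{pq} e_{pq} + \overline{\zeta'_{pq}} e_{qp}$

where $M_0$ is the matrix M with the pq and qp entries zeroed out, and $e_{pq}, e_{qp}$ are the basis matrices corresponding to the positions pq, qp. If we write $M(z) := M_0 + z e_{pq} + \overline{z} e_{qp}$, we are trying to compare $F(\lambda_i(M(\zeta_{pq})))$ with $F(\lambda_i(M(\zeta'_{pq})))$, where $F$ is the function of the $i^{th}$ eigenvalue being studied. To do this, we can perform a Taylor expansion with remainder:

$\displaystyle F(\lambda_i(M(\zeta_{pq}))) = \sum_{j=0}^r \frac{\zeta_{pq}^j}{j!} \frac{d^j}{dz^j} F(\lambda_i(M(z)))\Big|_{z=0} + O\Big( |\zeta_{pq}|^{r+1} \sup_z \Big| \frac{d^{r+1}}{dz^{r+1}} F(\lambda_i(M(z))) \Big| \Big)$ (3)

where r is a parameter that one can choose (it is ultimately going to be something like 4 in practice). Technically, the Taylor expansion is a bit messier than this because $\zeta_{pq}$ is complex rather than real (and $F(\lambda_i(M(z)))$ is not holomorphic), but let us ignore this technicality for now.

If the moments of $\zeta_{pq}$ and $\zeta'_{pq}$ agree up to $r^{th}$ order, then when one takes expectations, all the terms in (3) except for the error term remain unchanged when swapping $\zeta_{pq}$ into $\zeta'_{pq}$. A more advanced version of the Hadamard variation formula can be used to establish (at least heuristically) that $\frac{d^j}{dz^j} F(\lambda_i(M(z)))$ is “typically” of size $O(n^{-j/2})$. Each swap then costs an error of about $O(n^{-(r+1)/2})$, and the roughly $n^2$ swaps cost about $O(n^{(3-r)/2})$ in total, which suggests that one should be able to swap all entries and still get fine-scale distributional results as long as r is at least 4.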

It turns out that (with a certain amount of technical effort, and some tweaks to the above strategy) one can make this plan work, with the upshot that one can replace all the entries in a Wigner matrix without significantly affecting the local statistics, as long as one preserves all moments up to fourth order of the entries. This would give Theorem 1 in the case where the entries agreed with the GUE entries up to fourth order, but as stated they only agree up to third order. To drop the moment matching condition by one order, we use the result of Johansson on gaussian divisible matrices, combined with an elementary observation that any random variable that is supported on three or more points can be matched with a gaussian-divisible random variable up to fourth order. (The Bernoulli random variables, which are supported on only two points, are unfortunately an extremal case for the moment matching problem and cannot be matched to fourth order with any other distribution.)
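To make "matching moments up to fourth order" concrete, here is a small illustration for real random variables. It is a sketch under stated assumptions, not the construction used in the paper (which matches against gaussian divisible variables and handles real and imaginary parts). The three-point law below is chosen by hand for this example, and the names `raw_moments` and `matches_to_order` are hypothetical.

```python
import math

def raw_moments(atoms, max_order=4):
    """Moments E X^k for k = 0..max_order of a finitely supported real random variable.

    atoms is a list of (value, probability) pairs.
    """
    return [sum(p * x ** k for x, p in atoms) for k in range(max_order + 1)]

# Moments of orders 0..4 of a real gaussian N(0, 1).
GAUSSIAN_MOMENTS = [1.0, 0.0, 1.0, 0.0, 3.0]

# Two-point (Bernoulli) law: +1 or -1 with probability 1/2 each.
bernoulli = [(-1.0, 0.5), (1.0, 0.5)]

# Three-point law chosen for this illustration: 0 with probability 2/3,
# and +sqrt(3) or -sqrt(3) with probability 1/6 each.
s = math.sqrt(3.0)
three_point = [(-s, 1.0 / 6.0), (0.0, 2.0 / 3.0), (s, 1.0 / 6.0)]

def matches_to_order(atoms, target, order, tol=1e-9):
    """True if the moments of orders 0..order agree with target up to tol."""
    m = raw_moments(atoms, order)
    return all(abs(m[k] - target[k]) <= tol for k in range(order + 1))
```

The three-point law agrees with the gaussian through fourth order, whereas the Bernoulli law agrees only through third order: its fourth moment is 1 rather than 3. Since the fourth moment of any variable is at least the square of its second moment, 1 is the smallest value a mean zero, variance 1 variable can have.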

## 33 comments

3 June, 2009 at 6:24 pm

Paul Leopardi

I recently gave a presentation on random Gram matrices obtained from polynomial interpolation on the unit sphere.

See

“Polynomial interpolation on the sphere, reproducing kernels and random matrices”.

My investigation so far has been entirely empirical. It looks like your paper could give me some ideas to use in a more theoretical investigation. In my case, the diagonal entries of the random real symmetric Gram matrix $G_t$ are and the strict upper triangular entries are where X and Y are independently, uniformly distributed on the unit sphere.

5 June, 2009 at 8:55 am

edmaths

Dear Professor Tao,

Here are some typos I have noticed in your paper about universality of local eigenvalue statistics. The page numbers are those of 0906.0510v2.pdf.

– Page 1, footnote: the word “is” is repeated twice,

– Page 8, in 1.5, line 5: “Winger” should be Wigner,

– Page 12, Remark 23: I think the last sentence could be improved (problem with “remove”).

I am sorry I cannot help on the mathematical content :-)

Feel free to delete this comment at any time.

Best regards,

edmaths

[Corrected, thanks – T.]

7 June, 2009 at 9:15 pm

masterchef

Dear Professor Tao,

I hope one day you will come back to Australia. You’re an Australian Treasure.

From Masterchef (in Australia)

7 June, 2009 at 11:46 pm

Yet Another Graduate Student

Dear Prof. Tao,

Are there any other central questions left open in random matrix theory after you and Prof. Vu tackled this universality conjecture? I really thought it would take another five to ten years before some special cases of it would yield to human prowess.

Sincerely,

John

8 June, 2009 at 7:10 am

Mark Meckes

John,

What counts as a “central” question is quite subjective, but here are a couple candidates, one rather specific and one quite general:

1. What are necessary and sufficient conditions on the entries of a Wigner matrix for Tracy-Widom asymptotics of the largest eigenvalue? (The conjecture is that fourth moments exist, which is known to be necessary and sufficient for the largest eigenvalue to satisfy a strong law of large numbers.)

2. Can the various universality results for Wigner matrices be extended to random matrices with weakly dependent entries? For the so-called unitary and orthogonal ensembles, which share the invariance properties Terry discussed for the GUE/GOE, the answer is known in many cases to be yes, but I know of only a few results which lack those invariance properties and simply replace independence with a weaker hypothesis. In the context of Wishart-type matrices, there is also some recent work on matrices with independent columns.

10 December, 2009 at 1:34 pm

Wynand

Dear Prof

In your paper, on page 6, you say that “the distribution of the eigenvalues is given by the restriction to the region .”

I am a bit confused by what “restriction” means here (is it just restriction of a function?) and by how to interpret the infinitesimals . Could you point me at something that will help me understand?

Thank you!

11 December, 2009 at 1:17 pm

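Terence Tao

I am using $\rho(\lambda_1,\ldots,\lambda_n)\ d\lambda_1 \ldots d\lambda_n$ to denote the absolutely continuous measure $\mu$ formed by multiplying Lebesgue measure $d\lambda_1 \ldots d\lambda_n$ with the density function $\rho$, thus $\mu(E) = \int_E \rho(\lambda_1,\ldots,\lambda_n)\ d\lambda_1 \ldots d\lambda_n$. The restriction $\mu|_A$ of a measure $\mu$ to any subset $A$ is defined by $\mu|_A(E) := \mu(E \cap A)$.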

16 February, 2010 at 1:56 am

Florent Benaych-Georges

Dear Professor Tao,

I am interested in equation (2) of your post, which is not far away from equation (2) of the article. However, I did not find any good reference for it. For example, I did not find it in the references you give in your article (two articles by Bai and Yin of 1988).

In fact, I am only interested in the GUE and GOE cases. Would you please tell me where, in the articles of Bai and Yin (or somewhere else), this is proved ?

Thank you !

With best regards,

Florent

16 February, 2010 at 8:04 am

Terence Tao

The strongest results so far in this direction are in

J. Gustavsson, Gaussian fluctuations of eigenvalues in the GUE, Ann. Inst. H. Poincare Probab. Statist. 41 (2005), no. 2, 151–178.

In our paper we use the four moment theorem (actually, we use the three moment version in this case) to extend this result to more general ensembles.

16 February, 2010 at 8:58 am

Florent Benaych-Georges

OK, thanks a lot !

Florent

16 February, 2011 at 7:41 pm

Mads

Professor Tao,

A typo: In your joint paper with Van Vu, six lines after (44) on page 33, “whence the claim” should be “hence the claim”.

[Thanks, this will be corrected in the next revision of the ms. -T.]

18 February, 2011 at 6:35 am

Mads

Dear Professor Tao,

Since English isn’t my main language, I’m not sure whether the following suggestions are correct or not. If not, I apologise for wasting your time.

*) P. 7, 3 lines before theorem 8: “(i.e. independent” –> “(i.e., independent”

*) P. 10, line 5: “(i.e. $A_n$ has” –> “(i.e., $A_n$ has”

*) P. 40, line 9: “(i.e. to spans” –> “(i.e., to spans”

*) P. 44, 8 lines into corollary 58: “(i.e. to spans” –> “(i.e., to spans”

*) P. 65, line 5: “(i.e. all” –> “(i.e., all”

20 June, 2014 at 6:33 am

deep

Dear Professor Tao,

I am a little confused by the terminology between Wigner real symmetric matrices and the Gaussian Orthogonal Ensemble (GOE). Generally for GOE they assume iid diagonal entries N(0; 1) and the off-diagonal entries are iid (subject to being symmetric) as N(0; 1/2).

But I am interested in a real symmetric matrix with all entries i.i.d. N(0,1), and I wanted to study the spacings to detect whether there is any interesting eigenvalue (via eigen-gap) present in the spectrum. Does your paper http://arxiv.org/abs/0906.0510 answer this question, or do you know any result in this direction?

My goal is to find a threshold c, such that whenever the maximum eigen-gap exceeds that value I will declare there is a potential `signal’ present.

Best

Deep

20 June, 2014 at 7:33 am

Terence Tao

That’s a good question. The universality results in my paper with Van (and in subsequent work) tell us that if one replaces the N(0,1) distributions in your real symmetric matrix with any other distribution with mean zero and variance 1 (and exponential tails), the gap distribution is essentially unchanged (in our paper, one needs also four matching moments; a subsequent paper of Erdos, Ramirez, Schlein, Vu, Yau, and myself removes this condition, at the cost of averaging over all gaps, not just over a specific gap). However, this does not directly tell us what the gap distribution is, only that it is universal. The “local semicircle law” of Erdos, Schlein, and Yau (Theorem 60 in the current paper) tells us that these gaps cannot be much bigger than the mean spacing, and Theorem 19 in the current paper gives a nontrivial lower bound on the gap on average. It is likely that the gap distribution for your model is the same as that in the GOE case, but I don’t think this has been established in the literature yet.

20 June, 2014 at 8:23 am

deep

Thank you so much Terence. So if I understood you correctly, there is no literature currently on the distribution of the spacings for a real symmetric random matrix with all entries N(0,1).

So at this point the other possibility is via the maximum eigenvalue route. Do you know any result which discusses the spectral properties (especially a Tracy–Widom type result for the maximum eigenvalue) of such a matrix? I would be grateful for that.

My goal is to detect the `signal’ either using the spacing results or the maximum eigenvalue type thresholding.

Thank you again for all your valuable insights and comments.

20 June, 2014 at 4:53 pm

Terence Tao

The known results for edge universality are a little better than those for bulk universality, in that fewer matching moment conditions are required. In particular, (a small modification of) the paper of Knowles and Yin at http://arxiv.org/abs/1102.0057 should imply that the distribution of the largest eigenvalues is essentially unchanged if one changes the variance of the diagonal entries of GOE from 1/2 to 1, so your model should have the same Tracy-Widom statistics that GOE does.

25 June, 2014 at 12:14 pm

deeptemple

Just to clarify one last thing about the eigenvalue spacing. The Wigner Surmise type distribution only holds for GOE, which has diagonal elements N(0,2) not N(0,1).

If I understood you correctly there is no known result of this type valid for a real symmetric matrix with all entries N(0,1). Am I correct, Terence?

Thanks a lot !

25 June, 2014 at 1:47 pm

Terence Tao

I am not aware of such a result in the literature, but it ought to be true, and looks almost within the reach of known methods, particularly if one is willing to work with the average gap spacing rather than with a specific gap.

14 March, 2017 at 6:06 am

K. Ahn

Dear Professor Tao,

I read your paper, and I came up with several technical questions about the proof. I would be grateful for any help.

1. At the very last part of section 3.5 (High-level proof of Thm 19), it says “if E_0^c holds, then by (i) we have … ” However, I don’t see why E_0^c implies (i) as the definition of E_0 is “at least one of the exceptional events (i),(iii)-(vi), or E_n hold”. (More precisely, I interpreted that E_0^c would be “none of the events (i),(iii)-(vi) hold”)

2. I don’t see how you relate the event (i) with the definition of g_{i,l,n} in (37). At first glance, the denominator shown in (37) does not look consistent with the right hand side of the inequality presented in the event (i).

I would appreciate your help.

15 March, 2017 at 10:08 pm

Terence Tao

1. Sorry, this was a typo: it should be “failure of (i)” instead of (i); the next two inequalities should be reversed (with instead of ); and the value of should be rather than .

2. To lower bound , we upper bound the denominator in (37) by .

16 March, 2017 at 6:26 pm

K. Ahn

I have another question. Any help would be appreciated.

In section 5.3, I don’t see why you abruptly introduce right after (94). To my understanding, I think it suffices to show for . (Lemma 64 ensures that we only need to show that is close to uniformly for all )

Again, thanks for any help!

16 March, 2017 at 10:33 pm

K. Ahn

I just understood your intention. I apologize for a silly question. I have another question:

at the very last part of the proof of proposition 65 (section 5.3), we have an upper bound . However, I don’t see how this comes even though there is and we only know .

Thanks for your help!

22 March, 2017 at 2:55 am

Terence Tao

This is a typo: the upper bound on should be (note that the interval to which Proposition 66 is applied has length about ).

Incidentally, you may find this post of mine on how to deal with these sorts of typos when reading mathematical papers to be helpful: https://plus.google.com/u/0/+TerenceTao27/posts/TGjjJPUdJjk