
I’ve just uploaded to the arXiv my paper The asymptotic distribution of a single eigenvalue gap of a Wigner matrix, submitted to Probability Theory and Related Fields. This paper (like several of my previous papers) is concerned with the asymptotic distribution of the eigenvalues {\lambda_1(M_n) \leq \ldots \leq \lambda_n(M_n)} of a random Wigner matrix {M_n} in the limit {n \rightarrow \infty}, with a particular focus on matrices drawn from the Gaussian Unitary Ensemble (GUE). This paper is focused on the bulk of the spectrum, i.e. on eigenvalues {\lambda_i(M_n)} with {\delta n \leq i \leq (1-\delta) n} for some fixed {\delta>0}.

The location of an individual eigenvalue {\lambda_i(M_n)} is by now quite well understood. If we normalise the entries of the matrix {M_n} to have mean zero and variance {1}, then in the asymptotic limit {n \rightarrow \infty}, the Wigner semicircle law tells us that with probability {1-o(1)} one has

\displaystyle  \lambda_i(M_n) =\sqrt{n} u + o(\sqrt{n})

where the classical location {u = u_{i/n} \in [-2,2]} of the eigenvalue is given by the formula

\displaystyle  \int_{-2}^{u} \rho_{sc}(x)\ dx = \frac{i}{n}

and the semicircular distribution {\rho_{sc}(x)\ dx} is given by the formula

\displaystyle  \rho_{sc}(x) := \frac{1}{2\pi} (4-x^2)_+^{1/2}.

Actually, one can improve the error term here from {o(\sqrt{n})} to {O( \log^{1/2+\epsilon} n)} for any {\epsilon>0} (see this previous recent paper of Van and myself for more discussion of these sorts of estimates, sometimes known as eigenvalue rigidity estimates).
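To make the classical location concrete, here is a minimal sketch (not taken from the paper) that solves {\int_{-2}^{u} \rho_{sc}(x)\ dx = \frac{i}{n}} for {u} by bisection, using the closed form of the semicircle distribution function. The function names `semicircle_cdf` and `classical_location` are my own choices for illustration.

```python
import math


def semicircle_cdf(u: float) -> float:
    """Integral of rho_sc(x) = sqrt(4 - x^2) / (2 pi) over [-2, u]."""
    u = max(-2.0, min(2.0, u))  # the density vanishes outside [-2, 2]
    return (0.5
            + u * math.sqrt(4.0 - u * u) / (4.0 * math.pi)
            + math.asin(u / 2.0) / math.pi)


def classical_location(i: int, n: int, tol: float = 1e-12) -> float:
    """Solve semicircle_cdf(u) = i / n for u in [-2, 2] by bisection."""
    target = i / n
    lo, hi = -2.0, 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if semicircle_cdf(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    n = 1000
    for i in (100, 250, 500, 750, 900):
        u = classical_location(i, n)
        # u is the classical location; sqrt(n) * u is the predicted eigenvalue
        print(i, round(u, 4), round(math.sqrt(n) * u, 3))
```

The predicted position of {\lambda_i(M_n)} is then {\sqrt{n} u}, up to the error terms discussed above.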

From the semicircle law (and the fundamental theorem of calculus), one expects the {i^{th}} eigenvalue spacing {\lambda_{i+1}(M_n)-\lambda_i(M_n)} to have an average size of {\frac{1}{\sqrt{n} \rho_{sc}(u)}}. It is thus natural to introduce the normalised eigenvalue spacing

\displaystyle  X_i := \frac{\lambda_{i+1}(M_n) - \lambda_i(M_n)}{1/(\sqrt{n} \rho_{sc}(u))}

and ask what the distribution of {X_i} is.
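As a rough numerical illustration of this normalisation (a sketch under my own assumptions, not a computation from the paper), one can sample a single GUE matrix with NumPy, form the normalised gaps {X_i} in the bulk, and observe that their empirical mean is close to {1}. The helper names, the grid-based inversion of the semicircle distribution function, and the choices {n=200}, {\delta=0.1} are all illustrative.

```python
import numpy as np


def sample_gue(n, rng):
    """n x n GUE matrix: Hermitian, entries of mean zero and variance 1."""
    a = (rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))) / np.sqrt(2)
    return (a + a.conj().T) / np.sqrt(2)


def semicircle_density(u):
    return np.sqrt(np.maximum(4.0 - u**2, 0.0)) / (2.0 * np.pi)


def classical_locations(n):
    """u_{i/n} for i = 1..n, by inverting the semicircle CDF on a fine grid."""
    grid = np.linspace(-2.0, 2.0, 20001)
    root = np.sqrt(np.maximum(4.0 - grid**2, 0.0))
    cdf = 0.5 + grid * root / (4.0 * np.pi) + np.arcsin(grid / 2.0) / np.pi
    return np.interp(np.arange(1, n + 1) / n, cdf, grid)


def normalised_bulk_gaps(n, delta, rng):
    """X_i for delta*n <= i <= (1-delta)*n, from one GUE sample."""
    lam = np.linalg.eigvalsh(sample_gue(n, rng))  # eigenvalues in ascending order
    u = classical_locations(n)
    i = np.arange(1, n)                           # 1-based index of the left eigenvalue
    gaps = (lam[1:] - lam[:-1]) * np.sqrt(n) * semicircle_density(u[:-1])
    bulk = (i >= delta * n) & (i <= (1 - delta) * n)
    return gaps[bulk]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = normalised_bulk_gaps(200, 0.1, rng)
    print("number of bulk gaps:", x.size, " empirical mean:", x.mean())
```

A single sample like this only probes the averaged behaviour of the gaps, of course; it says nothing about the distribution of an individual {X_i}, which is the subject of the paper.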

As mentioned previously, we will focus on the bulk case {\delta n \leq i\leq (1-\delta)n}, and begin with the model case when {M_n} is drawn from GUE. (In the edge case when {i} is close to {1} or to {n}, the distribution is given by the famous Tracy-Widom law.) Here, the distribution was almost (but as we shall see, not quite) worked out by Gaudin and Mehta. By using the theory of determinantal processes, they were able to compute a quantity closely related to {X_i}, namely the probability

\displaystyle  {\bf P}( N_{[\sqrt{n} u + \frac{x}{\sqrt{n} \rho_{sc}(u)}, \sqrt{n} u + \frac{y}{\sqrt{n} \rho_{sc}(u)}]} = 0) \ \ \ \ \ (1)

that an interval {[\sqrt{n} u + \frac{x}{\sqrt{n} \rho_{sc}(u)}, \sqrt{n} u + \frac{y}{\sqrt{n} \rho_{sc}(u)}]} near {\sqrt{n} u} of length comparable to the expected eigenvalue spacing {1/\sqrt{n} \rho_{sc}(u)} is devoid of eigenvalues. For {u} in the bulk and fixed {x,y}, they showed that this probability is equal to

\displaystyle  \det( 1 - 1_{[x,y]} P 1_{[x,y]} ) + o(1),

where {P} is the Dyson projection

\displaystyle  P f(x) = \int_{\bf R} \frac{\sin(\pi(x-y))}{\pi(x-y)} f(y)\ dy

to Fourier modes in {[-1/2,1/2]}, and {\det} is the Fredholm determinant. As shown by Jimbo, Miwa, Mori, and Sato, this determinant can also be expressed in terms of a solution to a Painlevé V ODE, though we will not need this fact here. In view of this asymptotic and some standard integration by parts manipulations, it becomes plausible to propose that {X_i} will be asymptotically distributed according to the Gaudin-Mehta distribution {p(x)\ dx}, where

\displaystyle  p(x) := \frac{d^2}{dx^2} \det( 1 - 1_{[0,x]} P 1_{[0,x]} ).

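A reasonably accurate approximation for {p} is given by the Wigner surmise, which for GUE takes the form {p(x) \approx \frac{32}{\pi^2} x^2 e^{-4x^2/\pi}}; this is the unitary analogue of the formula {\frac{1}{2} \pi x e^{-\pi x^2/4}} that Wigner presciently proposed for real symmetric matrices as early as 1957. Each formula is exact for {n=2} in its own symmetry class, but not in the asymptotic limit {n \rightarrow \infty}.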
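The Fredholm determinant {\det( 1 - 1_{[0,x]} P 1_{[0,x]} )} is easy to approximate numerically. The sketch below uses a standard Gauss-Legendre quadrature discretisation of the sine kernel (this is a generic numerical device, not the method of Gaudin and Mehta, and not something used in the paper), and then takes a central second difference to approximate {p}. The function names, the number of quadrature nodes `m`, and the step size `h` are my own illustrative choices.

```python
import numpy as np


def gap_probability(s, m=40):
    """Approximate det(1 - 1_{[0,s]} P 1_{[0,s]}) for the sine kernel, s >= 0."""
    nodes, weights = np.polynomial.legendre.leggauss(m)  # quadrature rule on [-1, 1]
    x = 0.5 * s * (nodes + 1.0)                          # nodes mapped to [0, s]
    w = 0.5 * s * weights                                # weights rescaled to [0, s]
    kernel = np.sinc(x[:, None] - x[None, :])            # np.sinc(t) = sin(pi t) / (pi t)
    sqrt_w = np.sqrt(w)
    # Symmetrised discretisation of the integral operator on [0, s]
    return np.linalg.det(np.eye(m) - sqrt_w[:, None] * kernel * sqrt_w[None, :])


def gaudin_mehta_density(x, h=1e-2):
    """Central second difference of the gap probability; requires x >= h."""
    return (gap_probability(x + h)
            - 2.0 * gap_probability(x)
            + gap_probability(x - h)) / h**2


if __name__ == "__main__":
    for x in (0.5, 1.0, 1.5, 2.0):
        print(x, gap_probability(x), gaudin_mehta_density(x))
```

For small {s} the determinant is close to {1-s}, reflecting the fact that the rescaled eigenvalues have unit mean density.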

Unfortunately, when one tries to make this argument rigorous, one finds that the asymptotic for (1) does not control a single gap {X_i}, but rather an ensemble of gaps {X_i}, where {i} is drawn from an interval {[i_0 - L, i_0 + L]} of some moderate size {L} (e.g. {L = \log n}); see for instance this paper of Deift, Kriecherbauer, McLaughlin, Venakides, and Zhou for a more precise formalisation of this statement (which is phrased slightly differently, in which one samples all gaps inside a fixed window of spectrum, rather than inside a fixed range of eigenvalue indices {i}). (This result is stated for GUE, but can be extended to other Wigner ensembles by the Four Moment Theorem, at least if one assumes a moment matching condition; see this previous paper with Van Vu for details. The moment condition can in fact be removed, as was done in this subsequent paper with Erdős, Ramirez, Schlein, Vu, and Yau.)

The problem is that when one specifies a given window of spectrum such as {[\sqrt{n} u + \frac{x}{\sqrt{n} \rho_{sc}(u)}, \sqrt{n} u + \frac{y}{\sqrt{n} \rho_{sc}(u)}]}, one cannot quite pin down in advance which eigenvalues {\lambda_i(M_n)} are going to lie to the left or right of this window; even with the strongest eigenvalue rigidity results available, there is a natural uncertainty of {\sqrt{\log n}} or so in the {i} index (as can be quantified quite precisely by this central limit theorem of Gustavsson).

The main difficulty here is that there could potentially be some strange coupling between the event (1) of an interval being devoid of eigenvalues, and the number {N_{(-\infty,\sqrt{n} u + \frac{x}{\sqrt{n} \rho_{sc}(u)})}(M_n)} of eigenvalues to the left of that interval. For instance, one could conceive of a possible scenario in which the interval in (1) tends to have many eigenvalues when {N_{(-\infty,\sqrt{n} u + \frac{x}{\sqrt{n} \rho_{sc}(u)})}(M_n)} is even, but very few when {N_{(-\infty,\sqrt{n} u + \frac{x}{\sqrt{n} \rho_{sc}(u)})}(M_n)} is odd. In this sort of situation, the gaps {X_i} may have different behaviour for even {i} than for odd {i}, and such anomalies would not be picked up in the averaged statistics in which {i} is allowed to range over some moderately large interval.

The main result of the current paper is that these anomalies do not actually occur, and that all of the eigenvalue gaps {X_i} in the bulk are asymptotically governed by the Gaudin-Mehta law without the need for averaging in the {i} parameter. Again, this is shown first for GUE, and then extended to other Wigner matrices obeying a matching moment condition using the Four Moment Theorem. (It is likely that the moment matching condition can be removed here, but I was unable to achieve this, despite all the recent advances in establishing universality of local spectral statistics for Wigner matrices, mainly because the universality results in the literature are more focused on specific energy levels {u} than on specific eigenvalue indices {i}. To make matters worse, in some cases universality is currently known only after an additional averaging in the energy parameter.)

The main task in the proof is to show that the random variable {N_{(-\infty,\sqrt{n} u + \frac{x}{\sqrt{n} \rho_{sc}(u)})}(M_n)} is largely decoupled from the event in (1) when {M_n} is drawn from GUE. To do this we use some of the theory of determinantal processes, and in particular the nice fact that when one conditions a determinantal process to the event that a certain spatial region (such as an interval) contains no points of the process, then one obtains a new determinantal process (with a kernel that is closely related to the original kernel). The main task is then to obtain a sufficiently good control on the distance between the new determinantal kernel and the old one, which we do by some functional-analytic considerations involving the manipulation of norms of operators (and specifically, the operator norm, Hilbert-Schmidt norm, and nuclear norm). Amusingly, the Fredholm alternative makes a key appearance, as I end up having to invert a compact perturbation of the identity at one point (specifically, I need to invert {1 - 1_{[x,y]}P1_{[x,y]}}, where {P} is the Dyson projection and {[x,y]} is an interval). As such, the bounds in my paper become ineffective, though I am sure that with more work one can invert this particular perturbation of the identity by hand, without the need to invoke the Fredholm alternative.

We can now turn attention to one of the centerpiece universality results in random matrix theory, namely the Wigner semi-circular law for Wigner matrices. Recall from previous notes that a Wigner Hermitian matrix ensemble is a random matrix ensemble {M_n = (\xi_{ij})_{1 \leq i,j \leq n}} of Hermitian matrices (thus {\xi_{ij} = \overline{\xi_{ji}}}; this includes real symmetric matrices as an important special case), in which the upper-triangular entries {\xi_{ij}}, {j>i} are iid complex random variables with mean zero and unit variance, and the diagonal entries {\xi_{ii}} are iid real variables, independent of the upper-triangular entries, with bounded mean and variance. Particular special cases of interest include the Gaussian Orthogonal Ensemble (GOE), the symmetric random sign matrices (aka symmetric Bernoulli ensemble), and the Gaussian Unitary Ensemble (GUE).

In previous notes we saw that the operator norm of {M_n} was typically of size {O(\sqrt{n})}, so it is natural to work with the normalised matrix {\frac{1}{\sqrt{n}} M_n}. Accordingly, given any {n \times n} Hermitian matrix {M_n}, we can form the (normalised) empirical spectral distribution (or ESD for short)

\displaystyle  \mu_{\frac{1}{\sqrt{n}} M_n} := \frac{1}{n} \sum_{j=1}^n \delta_{\lambda_j(M_n) / \sqrt{n}},

of {M_n}, where {\lambda_1(M_n) \leq \ldots \leq \lambda_n(M_n)} are the (necessarily real) eigenvalues of {M_n}, counting multiplicity. The ESD is a probability measure, which can be viewed as a distribution of the normalised eigenvalues of {M_n}.

When {M_n} is a random matrix ensemble, then the ESD {\mu_{\frac{1}{\sqrt{n}} M_n}} is now a random measure – i.e. a random variable taking values in the space {\hbox{Pr}({\mathbb R})} of probability measures on the real line. (Thus, the distribution of {\mu_{\frac{1}{\sqrt{n}} M_n}} is a probability measure on probability measures!)

Now we consider the behaviour of the ESD of a sequence of Hermitian matrix ensembles {M_n} as {n \rightarrow \infty}. Recall from Notes 0 that for any sequence of random variables in a {\sigma}-compact metrisable space, one can define notions of convergence in probability and convergence almost surely. Specialising these definitions to the case of random probability measures on {{\mathbb R}}, and to deterministic limits, we see that a sequence of random ESDs {\mu_{\frac{1}{\sqrt{n}} M_n}} converge in probability (resp. converge almost surely) to a deterministic limit {\mu \in \hbox{Pr}({\mathbb R})} (which, confusingly enough, is a deterministic probability measure!) if, for every test function {\varphi \in C_c({\mathbb R})}, the quantities {\int_{\mathbb R} \varphi\ d\mu_{\frac{1}{\sqrt{n}} M_n}} converge in probability (resp. converge almost surely) to {\int_{\mathbb R} \varphi\ d\mu}.

Remark 1 As usual, convergence almost surely implies convergence in probability, but not vice versa. In the special case of random probability measures, there is an even weaker notion of convergence, namely convergence in expectation, defined as follows. Given a random ESD {\mu_{\frac{1}{\sqrt{n}} M_n}}, one can form its expectation {{\bf E} \mu_{\frac{1}{\sqrt{n}} M_n} \in \hbox{Pr}({\mathbb R})}, defined via duality (the Riesz representation theorem) as

\displaystyle  \int_{\mathbb R} \varphi\ d{\bf E} \mu_{\frac{1}{\sqrt{n}} M_n} := {\bf E} \int_{\mathbb R} \varphi\ d\mu_{\frac{1}{\sqrt{n}} M_n};

this probability measure can be viewed as the law of a random eigenvalue {\frac{1}{\sqrt{n}}\lambda_i(M_n)} drawn from a random matrix {M_n} from the ensemble. We then say that the ESDs converge in expectation to a limit {\mu \in \hbox{Pr}({\mathbb R})} if {{\bf E} \mu_{\frac{1}{\sqrt{n}} M_n}} converges in the vague topology to {\mu}, thus

\displaystyle  {\bf E} \int_{\mathbb R} \varphi\ d\mu_{\frac{1}{\sqrt{n}} M_n} \rightarrow \int_{\mathbb R} \varphi\ d\mu

for all {\varphi \in C_c({\mathbb R})}.

In general, these notions of convergence are distinct from each other; but in practice, one often finds in random matrix theory that these notions are effectively equivalent to each other, thanks to the concentration of measure phenomenon.

Exercise 1 Let {M_n} be a sequence of {n \times n} Hermitian matrix ensembles, and let {\mu} be a continuous probability measure on {{\mathbb R}}.

  • Show that {\mu_{\frac{1}{\sqrt{n}} M_n}} converges almost surely to {\mu} if and only if {\mu_{\frac{1}{\sqrt{n}} M_n}(-\infty,\lambda)} converges almost surely to {\mu(-\infty,\lambda)} for all {\lambda \in {\mathbb R}}.
  • Show that {\mu_{\frac{1}{\sqrt{n}} M_n}} converges in probability to {\mu} if and only if {\mu_{\frac{1}{\sqrt{n}} M_n}(-\infty,\lambda)} converges in probability to {\mu(-\infty,\lambda)} for all {\lambda \in {\mathbb R}}.
  • Show that {\mu_{\frac{1}{\sqrt{n}} M_n}} converges in expectation to {\mu} if and only if {{\bf E} \mu_{\frac{1}{\sqrt{n}} M_n}(-\infty,\lambda)} converges to {\mu(-\infty,\lambda)} for all {\lambda \in {\mathbb R}}.

We can now state the Wigner semi-circular law.

Theorem 1 (Semicircular law) Let {M_n} be the top left {n \times n} minors of an infinite Wigner matrix {(\xi_{ij})_{i,j \geq 1}}. Then the ESDs {\mu_{\frac{1}{\sqrt{n}} M_n}} converge almost surely (and hence also in probability and in expectation) to the Wigner semi-circular distribution

\displaystyle  \mu_{sc} := \frac{1}{2\pi} (4-|x|^2)^{1/2}_+\ dx. \ \ \ \ \ (1)

A numerical example of this theorem in action can be seen at the MathWorld entry for this law.
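One can also check the theorem numerically oneself. The following is a minimal sketch, assuming NumPy, that samples one symmetric random sign matrix, forms the ESD of {\frac{1}{\sqrt{n}} M_n}, and compares the mass it assigns to an interval {[a,b] \subset [-2,2]} with the mass assigned by the semi-circular distribution. The function names and the choices {n=400}, {[a,b]=[-1,1]} are illustrative only.

```python
import numpy as np


def sample_symmetric_sign_matrix(n, rng):
    """Real symmetric matrix with iid +-1 entries on and above the diagonal."""
    signs = rng.choice([-1.0, 1.0], size=(n, n))
    # Keep the upper triangle (with diagonal) and mirror the strict upper part
    return np.triu(signs) + np.triu(signs, k=1).T


def esd_mass(m, a, b):
    """Mass that the ESD of M / sqrt(n) assigns to the interval [a, b]."""
    n = m.shape[0]
    lam = np.linalg.eigvalsh(m) / np.sqrt(n)  # normalised eigenvalues
    return np.mean((lam >= a) & (lam <= b))


def semicircle_mass(a, b):
    """Mass of the semi-circular distribution on [a, b], for -2 <= a <= b <= 2."""
    def cdf(u):
        return 0.5 + u * np.sqrt(4.0 - u * u) / (4.0 * np.pi) + np.arcsin(u / 2.0) / np.pi
    return cdf(b) - cdf(a)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    m = sample_symmetric_sign_matrix(400, rng)
    print("empirical mass on [-1, 1]:  ", esd_mass(m, -1.0, 1.0))
    print("semicircle mass on [-1, 1]: ", semicircle_mass(-1.0, 1.0))
```

Even for moderate {n} the two numbers are typically close, which is consistent with the strong concentration of measure enjoyed by ESDs of Wigner matrices.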

The semi-circular law nicely complements the upper Bai-Yin theorem from Notes 3, which asserts that (in the case when the entries have finite fourth moment, at least), the matrices {\frac{1}{\sqrt{n}} M_n} almost surely have operator norm at most {2+o(1)}. Note that the operator norm is the same thing as the largest magnitude of the eigenvalues. Because the semi-circular distribution (1) is supported on the interval {[-2,2]} with positive density on the interior of this interval, Theorem 1 easily supplies the lower Bai-Yin theorem, that the operator norm of {\frac{1}{\sqrt{n}} M_n} is almost surely at least {2-o(1)}, and thus (in the finite fourth moment case) the norm is in fact equal to {2+o(1)}. Indeed, we have just shown that the semi-circular law provides an alternate proof of the lower Bai-Yin bound (Proposition 11 of Notes 3).

As will hopefully become clearer in the next set of notes, the semi-circular law is the noncommutative (or free probability) analogue of the central limit theorem, with the semi-circular distribution (1) taking on the role of the normal distribution. Of course, there is a striking difference between the two distributions, in that the former is compactly supported while the latter is merely subgaussian. One reason for this is that the concentration of measure phenomenon is more powerful in the case of ESDs of Wigner matrices than it is for averages of iid variables; compare the concentration of measure results in Notes 3 with those in Notes 1.

There are several ways to prove (or at least to heuristically justify) the semi-circular law. In this set of notes we shall focus on the two most popular methods, the moment method and the Stieltjes transform method, together with a third (heuristic) method based on Dyson Brownian motion (Notes 3b). In the next set of notes we shall also study the free probability method, and in the set of notes after that we use the determinantal processes method (although this method is initially only restricted to highly symmetric ensembles, such as GUE).
