Outliers in the spectrum of iid matrices with bounded rank perturbations

22 December, 2010 in math.PR, math.RA, paper | Tags: circular law, iid matrices, outliers, random matrices | by Terence Tao

I’ve just uploaded to the arXiv my paper “Outliers in the spectrum of iid matrices with bounded rank perturbations“, submitted to Probability Theory and Related Fields. This paper is concerned with outliers to the circular law for iid random matrices. Recall that if ${X_n}$ is an ${n \times n}$ matrix whose entries are iid complex random variables with mean zero and variance one, then the ${n}$ complex eigenvalues of the normalised matrix ${\frac{1}{\sqrt{n}} X_n}$ will almost surely be distributed according to the circular law distribution ${\frac{1}{\pi} 1_{|z| \leq 1} d^2 z}$ in the limit ${n \rightarrow \infty}$ . (See these lecture notes for further discussion of this law.)

The circular law is also stable under bounded rank perturbations: if ${C_n}$ is a deterministic rank ${O(1)}$ matrix of polynomial size (i.e. of operator norm ${O(n^{O(1)})}$ ), then the circular law also holds for ${\frac{1}{\sqrt{n}} X_n + C_n}$ (this is proven in a paper of myself, Van Vu, and Manjunath Krisnhapur). In particular, the bulk of the eigenvalues (i.e. ${(1-o(1)) n}$ of the ${n}$ eigenvalues) will lie inside the unit disk ${\{ z \in {\bf C}: |z| \leq 1 \}}$ .

However, this leaves open the possibility for one or more outlier eigenvalues that lie significantly outside the unit disk; the arguments in the paper cited above give some upper bound on the number of such eigenvalues (of the form ${O(n^{1-c})}$ for some absolute constant ${c>0}$ ) but does not exclude them entirely. And indeed, numerical data shows that such outliers can exist for certain bounded rank perturbations.

In this paper, some results are given as to when outliers exist, and how they are distributed. The easiest case is of course when there is no bounded rank perturbation: ${C_n=0}$ . In that case, an old result of Bai and Yin and of Geman shows that the spectral radius of ${\frac{1}{\sqrt{n}} X_n}$ is almost surely ${1+o(1)}$ , thus all eigenvalues will be contained in a ${o(1)}$ neighbourhood of the unit disk, and so there are no significant outliers. The proof is based on the moment method.

Now we consider a bounded rank perturbation ${C_n}$ which is nonzero, but which has a bounded operator norm: ${\|C_n\|_{op} = O(1)}$ . In this case, it turns out that the matrix ${\frac{1}{\sqrt{n}} X_n + C_n}$ will have outliers if the deterministic component ${C_n}$ has outliers. More specifically (and under the technical hypothesis that the entries of ${X_n}$ have bounded fourth moment), if ${\lambda}$ is an eigenvalue of ${C_n}$ with ${|\lambda| > 1}$ , then (for ${n}$ large enough), ${\frac{1}{\sqrt{n}} X_n + C_n}$ will almost surely have an eigenvalue at ${\lambda+o(1)}$ , and furthermore these will be the only outlier eigenvalues of ${\frac{1}{\sqrt{n}} X_n + C_n}$ .

Thus, for instance, adding a bounded nilpotent low rank matrix to ${\frac{1}{\sqrt{n}} X_n}$ will not create any outliers, because the nilpotent matrix only has eigenvalues at zero. On the other hand, adding a bounded Hermitian low rank matrix will create outliers as soon as this matrix has an operator norm greater than ${1}$ .

When I first thought about this problem (which was communicated to me by Larry Abbott), I believed that it was quite difficult, because I knew that the eigenvalues of non-Hermitian matrices were quite unstable with respect to general perturbations (as discussed in this previous blog post), and that there were no interlacing inequalities in this case to control bounded rank perturbations (as discussed in this post). However, as it turns out I had arrived at the wrong conclusion, especially in the exterior of the unit disk in which the resolvent is actually well controlled and so there is no pseudospectrum present to cause instability. This was pointed out to me by Alice Guionnet at an AIM workshop last week, after I had posed the above question during an open problems session. Furthermore, at the same workshop, Percy Deift emphasised the point that the basic determinantal identity

$\displaystyle \det(1 + AB) = \det(1 + BA) \ \ \ \ \ (1)$

for ${n \times k}$ matrices ${A}$ and ${k \times n}$ matrices ${B}$ was a particularly useful identity in random matrix theory, as it converted problems about large ( ${n \times n}$ ) matrices into problems about small ( ${k \times k}$ ) matrices, which was particularly convenient in the regime when ${n \rightarrow \infty}$ and ${k}$ was fixed. (Percy was speaking in the context of invariant ensembles, but the point is in fact more general than this.)

From this, it turned out to be a relatively simple manner to transform what appeared to be an intractable ${n \times n}$ matrix problem into quite a well-behaved ${k \times k}$ matrix problem for bounded ${k}$ . Specifically, suppose that ${C_n}$ had rank ${k}$ , so that one can factor ${C_n = A_n B_n}$ for some (deterministic) ${n \times k}$ matrix ${A_n}$ and ${k \times n}$ matrix ${B_n}$ . To find an eigenvalue ${z}$ of ${\frac{1}{\sqrt{n}} X_n + C_n}$ , one has to solve the characteristic polynomial equation

$\displaystyle \det( \frac{1}{\sqrt{n}} X_n + A_n B_n - z ) = 0.$

This is an ${n \times n}$ determinantal equation, which looks difficult to control analytically. But we can manipulate it using (1). If we make the assumption that ${z}$ is outside the spectrum of ${\frac{1}{\sqrt{n}} X_n}$ (which we can do as long as ${z}$ is well away from the unit disk, as the unperturbed matrix ${\frac{1}{\sqrt{n}} X_n}$ has no outliers), we can divide by ${\frac{1}{\sqrt{n}} X_n - z}$ to arrive at

$\displaystyle \det( 1 + (\frac{1}{\sqrt{n}} X_n-z)^{-1} A_n B_n ) = 0.$

Now we apply the crucial identity (1) to rearrange this as

$\displaystyle \det( 1 + B_n (\frac{1}{\sqrt{n}} X_n-z)^{-1} A_n ) = 0.$

The crucial point is that this is now an equation involving only a ${k \times k}$ determinant, rather than an ${n \times n}$ one, and is thus much easier to solve. The situation is particularly simple for rank one perturbations

$\displaystyle \frac{1}{\sqrt{n}} X_n + u_n v_n^*$

in which case the eigenvalue equation is now just a scalar equation

$\displaystyle 1 + \langle (\frac{1}{\sqrt{n}} X_n-z)^{-1} u_n, v_n \rangle = 0$

that involves what is basically a single coefficient of the resolvent ${(\frac{1}{\sqrt{n}} X_n-z)^{-1}}$ . (It is also an instructive exercise to derive this eigenvalue equation directly, rather than through (1).) There is by now a very well-developed theory for how to control such coefficients (particularly for ${z}$ in the exterior of the unit disk, in which case such basic tools as Neumann series work just fine); in particular, one has precise enough control on these coefficients to obtain the result on outliers mentioned above.

The same method can handle some other bounded rank perturbations. One basic example comes from looking at iid matrices with a non-zero mean ${\mu}$ and variance ${1}$ ; this can be modeled by ${\frac{1}{\sqrt{n}} X_n + \mu \sqrt{n} \phi_n \phi_n^*}$ where ${\phi_n}$ is the unit vector ${\phi_n := \frac{1}{\sqrt{n}} (1,\ldots,1)^*}$ . Here, the bounded rank perturbation ${\mu \sqrt{n} \phi_n \phi_n^*}$ has a large operator norm (equal to ${|\mu| \sqrt{n}}$ ), so the previous result does not directly apply. Nevertheless, the self-adjoint nature of the perturbation has a stabilising effect, and I was able to show that there is still only one outlier, and that it is at the expected location of ${\mu \sqrt{n}+o(1)}$ .

If one moves away from the case of self-adjoint perturbations, though, the situation changes. Let us now consider a matrix of the form ${\frac{1}{\sqrt{n}} X_n + \mu \sqrt{n} \phi_n \psi_n^*}$ , where ${\psi_n}$ is a randomised version of ${\phi_n}$ , e.g. ${\psi_n := \frac{1}{\sqrt{n}} (\pm 1, \ldots, \pm 1)^*}$ , where the ${\pm 1}$ are iid Bernoulli signs; such models were proposed recently by Rajan and Abbott as a model for neural networks in which some nodes are excitatory (and give columns with positive mean) and some are inhibitory (leading to columns with negative mean). Despite the superficial similarity with the previous example, the outlier behaviour is now quite different. Instead of having one extremely large outlier (of size ${\sim\sqrt{n}}$ ) at an essentially deterministic location, we now have a number of eigenvalues of size ${O(1)}$ , scattered according to a random process. Indeed, (in the case when the entries of ${X_n}$ were real and bounded) I was able to show that the outlier point process converged (in the sense of converging ${k}$ -point correlation functions) to the zeroes of a random Laurent series

$\displaystyle g(z) = 1 - \mu \sum_{j=0}^\infty \frac{g_j}{z^{j+1}}$

where ${g_0,g_1,g_2,\ldots \equiv N(0,1)}$ are iid real Gaussians. This is basically because the coefficients of the resolvent ${(\frac{1}{\sqrt{n}} X_n - zI)^{-1}}$ have a Neumann series whose coefficients enjoy a central limit theorem.

On the other hand, as already observed numerically (and rigorously, in the gaussian case) by Rajan and Abbott, if one projects such matrices to have row sum zero, then the outliers all disappear. This can be explained by another appeal to (1); this projection amounts to right-multiplying ${\frac{1}{\sqrt{n}} X_n + \mu \sqrt{n} \phi_n \psi_n^*}$ by the projection matrix ${P}$ to the zero-sum vectors. But by (1), the non-zero eigenvalues of the resulting matrix ${(\frac{1}{\sqrt{n}} X_n + \mu \sqrt{n} \phi_n \psi_n^*)P}$ are the same as those for ${P (\frac{1}{\sqrt{n}} X_n + \mu \sqrt{n} \phi_n \psi_n^*)}$ . Since ${P}$ annihilates ${\phi_n}$ , we thus see that in this case the bounded rank perturbation plays no role, and the question reduces to obtaining a circular law with no outliers for ${P \frac{1}{\sqrt{n}} X_n}$ . As it turns out, this can be done by invoking the machinery of Van Vu and myself that we used to prove the circular law for various random matrix models.

18 comments

Comments feed for this article

23 December, 2010 at 2:16 pm

Djalil

Hi!

This reminds me this paper by Silverstein: http://www.ams.org/mathscinet-getitem?mr=1284550

About these matrices with iid entries, I wonder if one can prove the universality of the Gumbel fluctuation of the spectral radius (the result is known for the complex Ginibre ensemble).

23 December, 2010 at 5:30 pm

Terence Tao

Thanks for the reference, which I was not aware of! Silverstein’s paper does indeed much give the same control on the outlier eigenvalue of $\frac{1}{\sqrt{n}} X_n + \mu \phi_n \phi_n^*$ as my paper does (and indeed the methods are basically the same).

As for your second question, we don’t yet have enough fine-scale control on the circular law to come close to answering this sort of question yet – the behaviour in the boundary region $|z|=1+o(1)$ , in particular, is still too delicate to resolve. (The results here avoid this because they only need one to understand the resolvent etc. in the region $|z| \geq 1+\varepsilon$ for some fixed $\varepsilon > 0$ independent of $n$ .)

24 December, 2010 at 12:12 am

Djalil

There is also this paper by Andrew:
http://www.ams.org/mathscinet-getitem?mr=1062321

25 December, 2010 at 6:25 am

Timothy

Should be “This paper is concerned with”

[Corrected, thanks – T.]

26 December, 2010 at 12:52 pm

Terence Tao

I’ve deleted a number of comments on this post that involved (or were in response to) attempts at spoofing the identity of an author. Needless to say, such attempts are not welcome at this blog.

26 December, 2010 at 9:26 pm

Timothy

Shouldn’t it be “uploaded my paper.”

[Corrected, thanks. It’s been a while since I wrote a singly authored paper – T.]

29 December, 2010 at 5:33 am

Benaych-Georges

Shouldn’t it be “perturbation” instead of “permutation” in the title ?

By the way, the same technics were used to locate the outliers in the spectrums of either additive or multiplicative perturbations of rather general matrices in http://hal.archives-ouvertes.fr/hal-00423593/fr/

29 December, 2010 at 9:36 am

Terence Tao

Thanks for the correction and for the reference!

Best,

Terry

3 January, 2011 at 9:28 am

Raj Rao

In the paper:
Restricted rank modification of the symmetric eigenvalue problem: Theoretical considerations by Arbenz, Gander, Golub

the formula in your Remark 2.2 is referred to as the “modified Weinstein determinant” following Weinstein and Stenger’s “Methods for Intermediate Problems for Eigenvalues”.

Just FYI :-)

4 January, 2011 at 9:46 pm

Terence Tao

Thanks for the reference!

3 January, 2011 at 9:30 am

Raj Rao

(Or at least its analogue in the symmetric case)

16 January, 2011 at 10:47 pm

andrescaicedo

Terry, didn’t you have a post discussing identity (1)? Seems to have disappeared. (Maybe I’m confused?)

17 January, 2011 at 5:00 am

Terence Tao

I discussed this identity at

The mesoscopic structure of GUE eigenvalues

17 January, 2011 at 8:17 am

andrescaicedo

Thanks!

19 April, 2011 at 4:24 pm

Yashar

Hi Terry,

I have a question about the non-self-adjoint case that was studied by Rajan and Abbott. Simulations suggest that the total number of outlier eigenvalues (at least in the “balanced” case where the components of \psi_n sum up to 0) scales like \sqrt(n) (up to n ~ 3000). But you say above that the outlier point process converges to the zeros of that random Laurent series which is independent of n, suggesting that the number of outliers is O(1), and not O(\sqrt(n)), and my simulations are not actually showing the very large n behavior.
Am I right (in concluding the number is O(1) based on what you say), or am I missing something?
I would really appreciate your answer.
Thanks.

19 April, 2011 at 4:27 pm

Yashar

I guess I found my own mistake: the Laurent series has infinite zeros actually…

8 September, 2012 at 5:17 pm

Char

Hi Terry, I have a question about the distribution of the outliers. According to Silverstein’s paper, outliers are N~(mu*n,sigma^2) given that the outliers are not normalised. I ran a simulation with R. I tried different values of n, mu, sigma and roughly 1000 matrices. I got the mean of the outliers of mu*n but I got the standard deviation approximately sigma*sqrt(2) instead of sigma. Is it a correct result?

13 January, 2013 at 12:36 pm

Matrix identities as derivatives of determinant identities « What’s new

[…] for some . (This type of situation is also common in random matrix theory, for instance it arose in this previous paper of mine on outliers to the circular law.) If is invertible, then from (1) and (2) one […]

	Anonymous on Erratum for “An inverse…
	Anonymous on Erratum for “An inverse…
	Anonymous on 275A, Notes 3: The weak and st…
	Anonymous on It ought to be common knowledg…
	Ring Theory Intervie… on Reading seminar: “Stable…
	Anonymous on Work hard
	Anonymous on Infinite partial sumsets in th…
	Anonymous on Erratum for “An inverse…
	Anonymous on Erratum for “An inverse…
	Anonymous on Infinite partial sumsets in th…
	Anonymous on Infinite partial sumsets in th…
	Anonymous on Infinite partial sumsets in th…
	Anonymous on Infinite partial sumsets in th…
	Anonymous on Infinite partial sumsets in th…
	Anonymous on Infinite partial sumsets in th…

Outliers in the spectrum of iid matrices with bounded rank perturbations

Recent Comments

Articles by others

Diversions

Mathematics

Selected articles

Software

The sciences

Top Posts

Archives

Categories

The Polymath Blog

18 comments

Leave a comment Cancel reply

For commenters

Outliers in the spectrum of iid matrices with bounded rank perturbations

Share this:

Recent Comments

Articles by others

Diversions

Mathematics

Selected articles

Software

The sciences

Top Posts

Archives

Categories

The Polymath Blog

18 comments

Leave a comment Cancel reply

For commenters