I was asked recently (in relation to my recent work with Van Vu on the spectral theory of random matrices) to explain some standard heuristics regarding how the eigenvalues of an
matrix A behave under small perturbations. These heuristics can be summarised as follows:
- For normal matrices (and in particular, unitary or self-adjoint matrices), eigenvalues are very stable under small perturbations. For more general matrices, eigenvalues can become unstable if there is pseudospectrum present.
- For self-adjoint (Hermitian) matrices, eigenvalues that are too close together tend to repel quickly from each other under such perturbations. For more general matrices, eigenvalues can either be repelled or be attracted to each other, depending on the type of perturbation.
In this post I would like to briefly explain why these heuristics are plausible.
– Pseudospectrum –
As any student of linear algebra knows, the spectrum of a
matrix A consists of all the complex numbers
such that
fails to be invertible, thus there exists a non-zero vector v such that
is zero. This is a finite set, consisting of at most n points. But there is also an important set containing the spectrum (and which, at times, can be much larger than that spectrum), which is the pseudospectrum of the matrix A. Unlike the spectrum, which is canonically defined, there is more than one definition of the pseudospectrum; also, this concept depends on an additional parameter, a small number
. We will define the pseudospectrum (or more precisely, the
-pseudospectrum)
to be the set of all the complex numbers
such that
has least singular value at most
, or equivalently that there exists a unit vector v such that
. (Another equivalent definition is that the
-pseudospectrum consists of the spectrum, together with those complex numbers
for which the resolvent
has operator norm at least
.)
The significance of the pseudospectrum is that it describes where the spectrum
can go to under small perturbations. Indeed, if
lies in the pseudospectrum
, so that there exists a unit vector v whose image
has magnitude at most
, then we see that
and so lies in the spectrum
of the perturbation
of A. Note that the operator norm of
is at most
.
Conversely, if does not lie in the pseudospectrum
, and
is a small perturbation of A (with E having operator norm at most
), then for any unit vector v, one has
by the triangle inequality, and so cannot lie in the spectrum of
.
Thus, if the pseudospectrum is tightly clustered around the spectrum, the spectrum is stable under small perturbations; but if the pseudospectrum is widely dispersed, then the spectrum becomes unstable.
No matter what A is, the pseudospectrum always contains the
-neighbourhood of the spectrum
. Indeed, if v is a unit eigenvector with eigenvalue
, then
, which implies that
for any
in the
-neighbourhood of
, and the claim follows.
Conversely, when A is normal, consists only of the
-neighbourhood of
. This is easiest to see by using the spectral theorem to diagonalise A and then computing everything explicitly. In particular, we conclude that if we perturb a normal matrix by a (possibly non-normal) perturbation of operator norm at most
, then the spectrum moves by at most
.
In the non-normal case, things can be quite different. A good example is provided by the shift matrix
This matrix is nilpotent: . As such, the only eigenvalue is zero. But observe that for any complex number
,
.
From this and a little computation, we see that if , then
will lie in the
-pseudospectrum of U. For fixed
, we thus see that
fills up the unit disk in the high dimensional limit
. (The pseudospectrum will not venture far beyond the unit disk, as the operator norm of U is 1.) And indeed, it is easy to perturb U so that its spectrum moves far away from the origin. For instance, observe that the perturbation
of U has a characteristic polynomial of (or note that
, which is basically the same fact thanks to the Cayley-Hamilton theorem) and so has eigenvalues equal to the
roots of
; for fixed
and n tending to infinity, this spectrum becomes asymptotically uniformly distributed on the unit circle, rather than at the origin.
[Update, Nov 8: Much more on the theory of pseudospectra can be found at this web site; thanks to Nick Trefethen for the link.]
– Spectral dynamics –
The pseudospectrum tells us, roughly speaking, how far the spectrum of a matrix A can move with respect to a small perturbation, but does not tell us the direction in which the spectrum moves. For this, it is convenient to use the language of calculus: we suppose that A=A(t) varies smoothly with respect to some time parameter t, and would like to “differentiate” the spectrum
with respect to t.
Since it is a little unclear what it means to differentiate a set, let us work instead with the eigenvalues of
. Note that generically (e.g. for almost all A), the eigenvalues will be distinct. (Proof: the eigenvalues are distinct when the characteristic polynomial has no repeated roots, or equivalently when the resultant of the characteristic polynomial with its derivative is non-zero. This is clearly a Zariski-open condition; since the condition is obeyed at least once, it is thus Zariski-dense.) So it is not unreasonable to assume that for all t in some open interval, the
are distinct; an application of the implicit function theorem then allows one to make the
smooth in t. Similarly, we can make the eigenvectors
vary smoothly in t. (There is some freedom to multiply each eigenvector by a scalar, but this freedom will cancel itself out in the end, as we are ultimately interested only in the eigenvalues rather than the eigenvectors.)
The eigenvectors form a basis of
. Let
be the dual basis, thus
for all
, and so we have the reproducing formula
(1)
for all vectors u. Combining this with the eigenvector equations
(2)
we obtain the adjoint eigenvector equations
. (3)
Next, we differentiate (2) using the product rule to obtain
. (4)
Taking the inner product of this with the dual vector , and using (3) to cancel some terms, we obtain the first variation formula for eigenvalues:
. (5)
Note that if A is normal, then we can take the eigenbasis to be orthonormal, in which case the dual basis
is identical to
. In particular we see that
; the infinitesimal change of each eigenvalue does not exceed the infinitesimal size of the perturbation. This is consistent with the stability of the spectrum for normal operators mentioned in the previous section.
[Side remark: if A evolves by the Lax pair equation , then
, and so from (5) we see that the spectrum of A is invariant in time. This fact underpins the inverse scattering method for solving integrable systems.]
Now we look at how the eigenvectors vary. Taking the inner product instead with a dual vector for
, we obtain
;
applying (1) we conclude a first variation formula for the eigenvectors , namely that
(6)
for some scalar (the presence of this term reflects the freedom to multiply
by a scalar). Similar considerations for the adjoint give
(6′)
(here we use the derivative of the identity to get the correct multiple of
on the right.)
We can use (5), (6), (6′) to obtain a second variation formula for the eigenvalues. Indeed, by differentiating (5) we obtain
;
applying (6), (6′) we conclude the second variation formula
(7)
Now suppose that A is self-adjoint, so as before we can take to be orthonormal. The above formula then becomes
One can view the terms on the right-hand side here as various “forces” acting on the eigenvalue ; the acceleration of the original matrix A provides one such force, while all the other eigenvalues
provide a repulsive force (and note, as with Newton’s third law, that the force that
exerts on
is equal and opposite to the force that
exerts on
; this is consistent with the trace formula
). The closer two eigenvalues are to each other, the stronger the repulsive force becomes.
When the matrix A is not self-adjoint, then the interaction between and
can be either repulsive or attractive. Consider for instance the matrices
The eigenvalues of this matrix are for
– so we see that they are attracted to each other as t evolves, until the matrix becomes degenerate at
. In contrast, the self-adjoint matrices
have eigenvalues , which repel each other as t evolves.
The repulsion effect of eigenvalues is also consistent with the smallness of the set of matrices with repeated eigenvalues. Consider for instance the space of Hermitian matrices, which has real dimension
. The subset of Hermitian matrices with distinct eigenvalues can be described by a collection of n orthogonal (complex) one-dimensional eigenspaces (which can be computed to have
degrees of freedom) plus n real eigenvalues (for an additional n degrees of freedom), thus the set of matrices with distinct eigenvalues has full dimension
.
Now consider the space of matrices with one repeated eigenvalue. This can be described by n-2 orthogonal complex one-dimensional eigenspaces, plus a complex two-dimensional orthogonal complement (which has degrees of freedom) plus n-1 real eigenvalues, thus the set of matrices with repeated eigenvalues only has dimension
. Thus it is in fact very rare for eigenvalues to actually collide, which helps explain why there must be a repulsion effect in the first place.
An example can help illustrate this phenomenon. Consider the one-parameter family of Hermitian matrices
The eigenvalues of this matrix at time t are of course t and -t, which cross over each other when t changes sign. Now consider instead the Hermitian perturbation
for some small . The eigenvalues are now
and
; they come close to each other as t approaches 0, but then “bounce” off of each other due to the repulsion effect.

15 comments
Comments feed for this article
28 October, 2008 at 10:05 am
Anonymous
Dear Terry,
thanks for the nice post! By the way, let me make a suggestion (if you permit): after reading your post, I thought that you could complement it with a few comments about the stability of the spectrum in the infinite dimensional case, e.g., the results of Feldmann, Trubowitz and Knorrer about perturbations of the form
of Laplacian
on the d-dimensional torus (if you are able to find the time to write it down of course).
Bests,
Matheus
28 October, 2008 at 2:24 pm
Terence Tao
Dear Matheus,
I was not familiar with the work of Feldmann et al., but in any event one can understand the spectral theory of self-adjoint unbounded operators such as
by viewing them as the limit of the finite-dimensional self-adjoint truncations
, where
is a Fourier projection to all modes of size less than N; the spectral theorem for
shows that any finite number of eigenvalues and eigenvectors of the latter converge to the former in the limit
(one can also use the minimax formulae for these eigenvalues and eigenvectors here). So any stability statement concerning a bounded number of finite-dimensional eigenvalues or eigenvectors which is uniform in N can be recovered in this infinite-dimensional limit (in principle, at least).
A version of the first variation formula for eigenvalues of Laplacians was used recently by Hassell in his proof of the scarring conjecture for generic Bunimovich stadia, see
http://terrytao.wordpress.com/2008/07/07/hassells-proof-of-scarring-for-the-bunimovich-stadium/
29 October, 2008 at 8:57 pm
orr
Dear Prof. Tao,
Is it possible that you meant to write, in the definition of the pseudospectrum, that it is the set of all numbers for which the resolvent has operator norm AT LEAST (and not AT MOST) one over epsilon?
29 October, 2008 at 9:33 pm
Terence Tao
Dear orr: thanks for the correction!
30 October, 2008 at 4:13 pm
Top Posts « WordPress.com
[...] When are eigenvalues stable? I was asked recently (in relation to my recent work with Van Vu on the spectral theory of random matrices) to explain [...] [...]
31 October, 2008 at 8:49 am
Carlos Matheus
Dear Terry,
this approach (ie, either approximation by finite dimensional self-adjoint operators or minimax principle) works well for the issue of stability of a bounded part of the spectrum in general. Also, we can push this argument to make it work for the whole spectrum in the 1d ‘torus’ case (this is particularly important when dealing with the KAM theory of NLS). In fact, the case of the circle is a bit special because the eigenvalues are simple and they have a ‘good spacing’ (ie the distance between two consecutive eigenvalues grows linearly).
Of course, these two facts are not true for higher dimensional tori. However, Feldman, Trubowitz and Knorrer manage to show in dimensions 2 and 3 (for the usual ’square, flat’ torus and also for a ‘generic’ torus obtained from a typical lattice) that ‘most’ of eigenvalues are stable (in the sense that the set of stable eigenvalues form a subset of density 1 of the set of eigenvalues). Moreover, they constructed examples where the set of unstable eigenvalues is ‘non-trivial’ (although it has zero density). In particular, this is a technical problem if one would like to show KAM results for the NLS in higher dimensions.
By the way, the references for the papers of Feldman et al. are the following:
Stability results: ”The perturbatively stable spectrum of a periodic Schrödinger operator”. Invent. Math. 100 (1990), no. 2, 259–300.
Instability results: ”Perturbatively unstable eigenvalues of a periodic Schrödinger operator”. Comment. Math. Helv. 66 (1991), no. 4, 557–579.
Bests,
Matheus
3 November, 2008 at 11:21 pm
Jonathan Vos Post
Is there a connection between this approach and the new submissions for Tue, 4 Nov 08
arXiv:0811.0007
Title: Large gaps between random eigenvalues
Authors: Benedek Valkó, Bálint Virág
Comments: 18 pages, 1 figure
Subjects: Probability (math.PR)
We show that in the point process limit of the bulk eigenvalues of $\beta$-ensembles of random matrices, the probability of having no eigenvalue in a fixed interval of size $\lambda$ is given by $$ (\kappa_\beta+o(1))\lambda^{\gamma_\beta} \exp(- \frac{\beta}{64}\lambda^2+(\frac{\beta}{4}-\frac18)\lambda) $$ as $\lambda\to\infty$, where $$ \gamma_\beta={1/4}(\frac\beta{2}+\frac{2}{\beta}- 3). $$ This is a slightly corrected version of a prediction by Dyson. Our proof uses the new Brownian carousel representation of the limit process, as well as the Cameron-Martin-Girsanov transformation in stochastic calculus.
7 November, 2008 at 10:36 am
Carnival of Mathematics #43 « The Number Warrior
[...] of Terence Tao, I recommend his recent post on when eigenvalues are stable. If you’re intimidated by some of his other posts (say, his 19-part series on Ricci flow and [...]
8 November, 2008 at 8:59 am
E Brian Davies
Terry’s articles about the uses of pseudospectra touches on what is now a substantial research field. The book by Trefethen and Embree: Spectra and Pseudospectra, Princeton University Press, 2005, provides a wide range of theorems and real-world examples of their application. My own book, Linear Operators and Their Spectra, CUP 2007, is written at a more pure-mathematical level and has a chapter devoted to pseudospectra and what are often called nonlinear generalizations. Terry’s example of a small perturbation of a large Jordan matrix has been explored in detail by M Hager and myself — several relevant papers are in the arXiv.
It is generally considered that theorems about pseudospectra have nicer formulations if one defines the epsilon pseudospectrum using a strict inequality rather than a non-strict one.
8 November, 2008 at 10:42 am
E Brian Davies
With reference to the communications between Terry and Matheus above.
One has to be very careful about computing the spectrum of a self-adjoint operator H by looking at the limit of the spectra of
as
.
This can be justified by a minimax argument if one restricts attention to the part of the spectrum below the essential spectrum. It almost always gives a completely wrong answer if the operator H has a gap between two parts of the essential spectrum. The truncations seem to suggest that the gap is not present. This applies in particular to Schrodinger operators with periodic potentials acting in
, which generally have a band-gap structure to the spectrum. If
is an increasing sequence of finite rank projections constructed by standard finite element methods and converging strongly to the identity operator, then the spectra of the truncations fill up the gaps more and more densely as
.
Once again there is a substantial numerical and analytic literature explaining why this method does not work and what one should do to get the right answer.
9 November, 2008 at 12:50 am
E Brian Davies
More dangers of truncation. I should have mentioned the many papers, indeed books, of Albrecht Boettcher and his co-workers on infinite non-self-adjoint bounded Toeplitz matrices. The spectra of the obvious finite truncations do converge as N\to\infty, but the limits are completely different from the spectra of the infinite matrices. This is a large and fascinating subject, but the message is the same – do not trust truncation unless you actually know that it works in your particular context.
17 November, 2008 at 5:26 pm
me
Also, you may want to check out the papers of Burke, Lewis, and Overton. Together (and apart) they have many papers on Matrix stability, pseudospectral subdifferentials, etc. All very interesting and related work.
9 April, 2009 at 3:37 pm
Steven Heilman
Hi-
I think I found a few errors (most likely due to transcription?):
1. Eq. (3) is missing a * on the right hand side
2. after Eq. (6), should read “… freedom to multiply v_{k}…”
Best-
-Steve
[Corrected, thanks - T.]
28 April, 2009 at 5:53 pm
Tusarepides
Hi, there unfortunately seems to be plenty of formulae that do not parse on this page – I have no trouble with other pages on your blog. Any hints on how I could fix this?
3 June, 2009 at 11:19 am
Random matrices: universality of local eigenvalue statistics « What’s new
[...] of the matrix, and the unit eigenvector of the matrix associated to that eigenvalue. (See this earlier blog post for further discussion of this and related formulae). If one assumes that the eigenvectors are [...]