Let $A$ be a Hermitian $n \times n$ matrix. By the spectral theorem for Hermitian matrices (which, for the sake of completeness, we prove below), one can diagonalise $A$ using a sequence

$\lambda_1(A) \geq \ldots \geq \lambda_n(A)$

of $n$ real eigenvalues, together with an orthonormal basis of eigenvectors $u_1(A),\ldots,u_n(A) \in {\bf C}^n$. (The eigenvalues are uniquely determined by $A$, but the eigenvectors have a little ambiguity to them, particularly if there are repeated eigenvalues; for instance, one could multiply each eigenvector $u_i(A)$ by a complex phase $e^{i\theta_i}$. In these notes we are arranging eigenvalues in descending order; of course, one can also arrange eigenvalues in increasing order, which causes some slight notational changes in the results below.) The set $\{\lambda_1(A),\ldots,\lambda_n(A)\}$ is known as the spectrum of $A$.
A basic question in linear algebra asks the extent to which the eigenvalues $\lambda_1(A),\ldots,\lambda_n(A)$ and $\lambda_1(B),\ldots,\lambda_n(B)$ of two Hermitian matrices $A, B$ constrain the eigenvalues $\lambda_1(A+B),\ldots,\lambda_n(A+B)$ of the sum. For instance, the linearity of trace

$\hbox{tr}(A+B) = \hbox{tr}(A) + \hbox{tr}(B),$

when expressed in terms of eigenvalues, gives the trace constraint

$\lambda_1(A+B) + \ldots + \lambda_n(A+B) = \lambda_1(A) + \ldots + \lambda_n(A) + \lambda_1(B) + \ldots + \lambda_n(B); \ \ \ \ \ (1)$

the identity

$\lambda_1(A) = \sup_{|v|=1} v^* A v \ \ \ \ \ (2)$

(together with the counterparts for $B$ and $A+B$) gives the inequality

$\lambda_1(A+B) \leq \lambda_1(A) + \lambda_1(B); \ \ \ \ \ (3)$

and so forth.
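(Illustrative aside, not part of the original notes: these identities are easy to test numerically. The following minimal numpy sketch, with an arbitrary seed and a hypothetical helper `rand_hermitian`, checks (1) and (3) on random matrices.)

```python
import numpy as np

rng = np.random.default_rng(0)

def rand_hermitian(n):
    # Symmetrising a complex Gaussian matrix gives a Hermitian matrix.
    M = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    return (M + M.conj().T) / 2

n = 5
A, B = rand_hermitian(n), rand_hermitian(n)

# eigvalsh returns eigenvalues in ascending order; reverse to match
# the descending convention of these notes.
eigs = lambda M: np.linalg.eigvalsh(M)[::-1]
lam_A, lam_B, lam_AB = eigs(A), eigs(B), eigs(A + B)

# Trace constraint (1): eigenvalue sums add.
assert np.isclose(lam_AB.sum(), lam_A.sum() + lam_B.sum())

# Inequality (3): lambda_1(A+B) <= lambda_1(A) + lambda_1(B).
assert lam_AB[0] <= lam_A[0] + lam_B[0] + 1e-10
```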
The complete answer to this problem is a fascinating one, requiring a strangely recursive description (once known as Horn’s conjecture, which is now solved), and connected to a large number of other fields of mathematics, such as geometric invariant theory, intersection theory, and the combinatorics of a certain gadget known as a “honeycomb”. See for instance my survey with Allen Knutson on this topic some years ago.
In typical applications to random matrices, one of the matrices (say, $B$) is “small” in some sense, so that $A+B$ is a perturbation of $A$. In this case, one does not need the full strength of the above theory, and can instead rely on a simple aspect of it pointed out by Helmke and Rosenthal and by Totaro, which generates several of the eigenvalue inequalities relating $A$, $B$, and $A+B$, of which (1) and (3) are examples. (Actually, this method eventually generates all of the eigenvalue inequalities, but this is a non-trivial fact to prove.) These eigenvalue inequalities can mostly be deduced from a number of minimax characterisations of eigenvalues (of which (2) is a typical example), together with some basic facts about intersections of subspaces. Examples include the Weyl inequalities

$\lambda_{i+j-1}(A+B) \leq \lambda_i(A) + \lambda_j(B), \ \ \ \ \ (4)$

valid whenever $i, j \geq 1$ and $i+j-1 \leq n$, and the Ky Fan inequality

$\lambda_1(A+B) + \ldots + \lambda_k(A+B) \leq \lambda_1(A) + \ldots + \lambda_k(A) + \lambda_1(B) + \ldots + \lambda_k(B). \ \ \ \ \ (5)$
One consequence of these inequalities is that the spectrum of a Hermitian matrix is stable with respect to small perturbations.
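(Continuing the illustrative sketch above, with the same `A`, `B`, `n`, and the eigenvalue arrays already computed, one can check the Weyl inequalities (4) over all admissible pairs $(i,j)$ and the Ky Fan inequalities (5) over all $k$.)

```python
# Weyl (4): lambda_{i+j-1}(A+B) <= lambda_i(A) + lambda_j(B) for i+j-1 <= n.
for i in range(1, n + 1):
    for j in range(1, n - i + 2):
        assert lam_AB[i + j - 2] <= lam_A[i - 1] + lam_B[j - 1] + 1e-10

# Ky Fan (5): partial sums of the top eigenvalues are subadditive.
for k in range(1, n + 1):
    assert lam_AB[:k].sum() <= lam_A[:k].sum() + lam_B[:k].sum() + 1e-10
```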
We will also establish some closely related inequalities concerning the relationships between the eigenvalues of a matrix, and the eigenvalues of its minors.
Many of the inequalities here have analogues for the singular values of non-Hermitian matrices (which is consistent with the discussion near Exercise 16 of Notes 3). However, the situation is markedly different when dealing with eigenvalues of non-Hermitian matrices; here, the spectrum can be far more unstable, if pseudospectrum is present. Because of this, the theory of the eigenvalues of a random non-Hermitian matrix requires an additional ingredient, namely upper bounds on the prevalence of pseudospectrum, which after recentering the matrix is basically equivalent to establishing lower bounds on least singular values. We will discuss this point in more detail in later notes.
We will work primarily here with Hermitian matrices, which can be viewed as self-adjoint transformations on complex vector spaces such as ${\bf C}^n$. One can of course specialise the discussion to real symmetric matrices, in which case one can restrict these complex vector spaces to their real counterparts ${\bf R}^n$. The specialisation of the complex theory below to the real case is straightforward and is left to the interested reader.
— 1. Proof of spectral theorem —
To prove the spectral theorem, it is convenient to work more abstractly, in the context of self-adjoint operators on finite-dimensional Hilbert spaces:
Theorem 4 (Spectral theorem) Let $V$ be a finite-dimensional complex Hilbert space of some dimension $n$, and let $T: V \rightarrow V$ be a self-adjoint operator. Then there exists an orthonormal basis $v_1,\ldots,v_n$ of $V$ and eigenvalues $\lambda_1,\ldots,\lambda_n \in {\bf R}$ such that $T v_i = \lambda_i v_i$ for all $1 \leq i \leq n$.
The spectral theorem as stated in the introduction then follows by specialising to the case $V = {\bf C}^n$ and ordering the eigenvalues.
Proof: We induct on the dimension $n$. The claim is vacuous for $n = 0$, so suppose that $n \geq 1$ and that the claim has already been proven for $n-1$.

Let $v$ be a unit vector in $V$ (thus $\langle v, v \rangle = 1$) that maximises the form $\langle Tv, v \rangle$ (which is real-valued, by self-adjointness); this maximum exists by compactness. By the method of Lagrange multipliers, $v$ is a critical point of $\langle Tv, v \rangle - \lambda \langle v, v \rangle$ for some $\lambda \in {\bf R}$. Differentiating in an arbitrary direction $w \in V$, we conclude that

$\langle Tv, w \rangle + \langle Tw, v \rangle - \lambda \langle v, w \rangle - \lambda \langle w, v \rangle = 0;$

this simplifies using self-adjointness to

$\hbox{Re} \langle Tv - \lambda v, w \rangle = 0.$

Since $w \in V$ was arbitrary (in particular, one may replace $w$ by $iw$), we conclude that $Tv = \lambda v$, thus $v$ is a unit eigenvector of $T$. By self-adjointness, this implies that the orthogonal complement $v^\perp := \{ w \in V: \langle v, w \rangle = 0 \}$ of $v$ is preserved by $T$. Restricting $T$ to this lower-dimensional subspace and applying the induction hypothesis, we can find an orthonormal basis of eigenvectors of $T$ on $v^\perp$. Adjoining the new unit vector $v$ to this orthonormal basis, we obtain the claim. $\Box$
Suppose we have a self-adjoint transformation $A: {\bf C}^n \rightarrow {\bf C}^n$, which of course can be identified with a Hermitian matrix. Using the orthonormal eigenbasis provided by the spectral theorem, we can perform an orthonormal change of variables to set that eigenbasis to be the standard basis $e_1,\ldots,e_n$, so that the matrix of $A$ becomes diagonal. This is very useful when dealing with just a single matrix $A$ – for instance, it makes the task of computing functions of $A$, such as $A^k$ or $e^{tA}$, much easier. However, when one has several Hermitian matrices in play (e.g. $A$, $B$, $A+B$), then it is usually not possible to standardise all the eigenbases simultaneously (i.e. to simultaneously diagonalise all the matrices), except when the matrices all commute. Nevertheless one can still normalise one of the eigenbases to be the standard basis, and this is still useful for several applications, as we shall soon see.
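(As a quick illustrative sketch of the diagonalisation trick, here is how one computes a function of a Hermitian matrix through its eigendecomposition; the helper name `hermitian_function` is just for this example, which is not part of the original notes.)

```python
import numpy as np

def hermitian_function(A, f):
    # Spectral theorem: A = U diag(lam) U*, so f(A) = U diag(f(lam)) U*.
    lam, U = np.linalg.eigh(A)
    return (U * f(lam)) @ U.conj().T

A = np.array([[2.0, 1.0], [1.0, 3.0]])
expA = hermitian_function(A, np.exp)     # matrix exponential e^A
sqrtA = hermitian_function(A, np.sqrt)   # valid here since A is positive definite
assert np.allclose(sqrtA @ sqrtA, A)
```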
Exercise 1 Suppose that the eigenvalues $\lambda_1(A) > \ldots > \lambda_n(A)$ of an $n \times n$ Hermitian matrix $A$ are distinct. Show that the associated eigenbasis $u_1(A),\ldots,u_n(A)$ is unique up to rotating each individual eigenvector $u_j(A)$ by a complex phase $e^{i\theta_j}$. In particular, the spectral projections $P_j(A) := u_j(A) u_j(A)^*$ are unique. What happens when there is eigenvalue multiplicity?
— 2. Minimax formulae —
The eigenvalue functional $A \mapsto \lambda_i(A)$ is not a linear functional (except in dimension one). It is not even a convex functional (except when $i = 1$) or a concave functional (except when $i = n$). However, it is the next best thing, namely it is a minimax expression of linear functionals. (Note that a convex functional is the same thing as a max of linear functionals, while a concave functional is the same thing as a min of linear functionals.) More precisely, we have
Theorem 2 (Courant-Fischer min-max theorem) Let $A$ be an $n \times n$ Hermitian matrix. Then we have

$\lambda_i(A) = \sup_{\dim(V) = i} \inf_{v \in V: |v|=1} v^* A v \ \ \ \ \ (6)$

and

$\lambda_i(A) = \inf_{\dim(V) = n-i+1} \sup_{v \in V: |v|=1} v^* A v \ \ \ \ \ (7)$

for all $1 \leq i \leq n$, where $V$ ranges over all subspaces of ${\bf C}^n$ with the indicated dimension.
Proof: It suffices to prove (6), as (7) follows by replacing $A$ by $-A$ (noting that $\lambda_i(-A) = -\lambda_{n-i+1}(A)$).

We first verify the $i=1$ case, i.e. (2). By the spectral theorem, we can assume that $A$ has the standard eigenbasis $e_1,\ldots,e_n$, in which case we have

$v^* A v = \sum_{j=1}^n \lambda_j(A) |v_j|^2 \ \ \ \ \ (8)$

whenever $v = (v_1,\ldots,v_n)$. The claim (2) is then easily verified.

To prove the general case, we may again assume $A$ has the standard eigenbasis. By considering the space $V$ spanned by $e_1,\ldots,e_i$, we easily see the inequality

$\lambda_i(A) \leq \sup_{\dim(V)=i} \inf_{v \in V: |v|=1} v^* A v,$

so we only need to prove the reverse inequality. In other words, for every $i$-dimensional subspace $V$ of ${\bf C}^n$, we have to show that $V$ contains a unit vector $v$ such that

$v^* A v \leq \lambda_i(A).$

Let $W$ be the space spanned by $e_i,\ldots,e_n$. This space has codimension $i-1$, so it must have non-trivial intersection with $V$. If we let $v$ be a unit vector in $V \cap W$, the claim then follows from (8). $\Box$
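(To make the minimax formula concrete, here is a small numerical sketch of mine, not from the notes: on the extremal subspace $V$ spanned by the top $i$ eigenvectors, the infimum in (6) is the smallest eigenvalue of the compression of $A$ to $V$, which is exactly $\lambda_i(A)$.)

```python
import numpy as np

rng = np.random.default_rng(1)
n, i = 6, 3
M = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
A = (M + M.conj().T) / 2

lam, U = np.linalg.eigh(A)
lam, U = lam[::-1], U[:, ::-1]        # descending order, matching the notes

V = U[:, :i]                          # orthonormal basis of the extremal subspace
compression = V.conj().T @ A @ V      # i x i matrix representing A on V
inf_on_V = np.linalg.eigvalsh(compression)[0]
assert np.isclose(inf_on_V, lam[i - 1])   # equals lambda_i(A)
```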
Remark 1 By homogeneity, one can replace the restriction $|v| = 1$ with $v \neq 0$ provided that one replaces the quadratic form $v^* A v$ with the Rayleigh quotient $v^* A v / v^* v$.
A closely related formula is as follows. Given an $n \times n$ Hermitian matrix $A$ and an $m$-dimensional subspace $V$ of ${\bf C}^n$, we define the partial trace $\hbox{tr}(A\downharpoonright_V)$ to be the expression

$\hbox{tr}(A\downharpoonright_V) := \sum_{i=1}^m v_i^* A v_i$

where $v_1,\ldots,v_m$ is any orthonormal basis of $V$. It is easy to see that this expression is independent of the choice of orthonormal basis, and so the partial trace is well-defined.
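(A short sketch, assuming numpy, of the partial trace and its basis independence; the function name `partial_trace` is just illustrative.)

```python
import numpy as np

def partial_trace(A, V):
    # V has orthonormal columns spanning the subspace: sum_i v_i^* A v_i.
    return np.trace(V.conj().T @ A @ V).real

rng = np.random.default_rng(2)
n, m = 6, 3
M = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
A = (M + M.conj().T) / 2

# Orthonormal basis of a random m-dimensional subspace, and a rotation of it.
V, _ = np.linalg.qr(rng.standard_normal((n, m)) + 1j * rng.standard_normal((n, m)))
Q, _ = np.linalg.qr(rng.standard_normal((m, m)) + 1j * rng.standard_normal((m, m)))
assert np.isclose(partial_trace(A, V), partial_trace(A, V @ Q))
```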
Proposition 3 (Extremal partial trace) Let $A$ be an $n \times n$ Hermitian matrix. Then for any $1 \leq k \leq n$, one has

$\lambda_1(A) + \ldots + \lambda_k(A) = \sup_{\dim(V) = k} \hbox{tr}(A\downharpoonright_V)$

and

$\lambda_{n-k+1}(A) + \ldots + \lambda_n(A) = \inf_{\dim(V) = k} \hbox{tr}(A\downharpoonright_V).$

As a corollary, we see that $A \mapsto \lambda_1(A) + \ldots + \lambda_k(A)$ is a convex function, and $A \mapsto \lambda_{n-k+1}(A) + \ldots + \lambda_n(A)$ is a concave function.
Proof: Again, by symmetry it suffices to prove the first formula. As before, we may assume without loss of generality that $A$ has the standard eigenbasis. By selecting $V$ to be the span of $e_1,\ldots,e_k$ we have the inequality

$\lambda_1(A) + \ldots + \lambda_k(A) \leq \sup_{\dim(V) = k} \hbox{tr}(A\downharpoonright_V),$

so it suffices to prove the reverse inequality. For this we induct on the dimension $n$. If $V$ has dimension $k$, then it has a $k-1$-dimensional subspace $V'$ that is contained in the span of $e_2,\ldots,e_n$. By the induction hypothesis (applied to the restriction of $A$ to this span, whose eigenvalues are $\lambda_2(A),\ldots,\lambda_n(A)$), we have

$\hbox{tr}(A\downharpoonright_{V'}) \leq \lambda_2(A) + \ldots + \lambda_k(A).$

On the other hand, if $v$ is a unit vector in the orthogonal complement of $V'$ in $V$, we see from (2) that

$v^* A v \leq \lambda_1(A).$

Adding the two inequalities we obtain the claim. $\Box$
Specialising Proposition 3 to the case when $V$ is a coordinate subspace (i.e. the span of $k$ of the basis vectors $e_1,\ldots,e_n$), we conclude the Schur-Horn inequalities

$\lambda_{n-k+1}(A) + \ldots + \lambda_n(A) \leq a_{i_1 i_1} + \ldots + a_{i_k i_k} \leq \lambda_1(A) + \ldots + \lambda_k(A) \ \ \ \ \ (9)$

for any $1 \leq i_1 < \ldots < i_k \leq n$, where $a_{11},\ldots,a_{nn}$ are the diagonal entries of $A$.
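(A numerical check of (9), a sketch of mine assuming numpy: the sorted diagonal entries are majorised by the eigenvalues.)

```python
import numpy as np

rng = np.random.default_rng(3)
n = 6
M = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
A = (M + M.conj().T) / 2

lam = np.linalg.eigvalsh(A)[::-1]         # descending eigenvalues
diag = np.sort(np.diag(A).real)[::-1]     # descending diagonal entries
for k in range(1, n + 1):
    # Any k diagonal entries lie between the bottom-k and top-k eigenvalue sums.
    assert lam[n - k:].sum() - 1e-10 <= diag[:k].sum() <= lam[:k].sum() + 1e-10
assert np.isclose(diag.sum(), lam.sum())  # equality at k = n (the trace)
```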
Exercise 2 Show that the inequalities (9) are equivalent to the assertion that the diagonal entries $(a_{11},\ldots,a_{nn})$ lie in the permutahedron of $(\lambda_1(A),\ldots,\lambda_n(A))$, defined as the convex hull of the $n!$ permutations of $(\lambda_1(A),\ldots,\lambda_n(A))$ in ${\bf R}^n$.
Remark 2 It is a theorem of Schur and Horn that these are the complete set of inequalities connecting the diagonal entries $a_{11},\ldots,a_{nn}$ of a Hermitian matrix to its spectrum. To put it another way, the image of any coadjoint orbit $\{ UAU^*: U \in U(n) \}$ of a matrix $A$ with a given spectrum $\lambda_1 \geq \ldots \geq \lambda_n$ under the diagonal map $M \mapsto (m_{11},\ldots,m_{nn})$ is the permutahedron of $(\lambda_1,\ldots,\lambda_n)$. Note that the vertices of this permutahedron can be attained by considering the diagonal matrices inside this coadjoint orbit, whose entries are then a permutation of the eigenvalues. One can interpret this diagonal map as the moment map associated with the conjugation action of the standard maximal torus of $U(n)$ (i.e. the diagonal unitary matrices) on the coadjoint orbit. When viewed in this fashion, the Schur-Horn theorem can be viewed as the special case of the more general Atiyah convexity theorem (also proven independently by Guillemin and Sternberg) in symplectic geometry. Indeed, the topic of eigenvalues of Hermitian matrices turns out to be quite profitably viewed as a question in symplectic geometry (and also in algebraic geometry, particularly when viewed through the machinery of geometric invariant theory).
There is a simultaneous generalisation of Theorem 2 and Proposition 3:
Exercise 3 (Wielandt minimax formula) Let $1 \leq i_1 < \ldots < i_k \leq n$ be integers. Define a partial flag to be a nested collection $V_1 \subset \ldots \subset V_k$ of subspaces of ${\bf C}^n$ such that $\dim(V_j) = i_j$ for all $1 \leq j \leq k$. Define the associated Schubert variety $X(V_1,\ldots,V_k)$ to be the collection of all $k$-dimensional subspaces $W$ such that $\dim(W \cap V_j) \geq j$ for all $1 \leq j \leq k$. Show that for any $n \times n$ Hermitian matrix $A$,

$\lambda_{i_1}(A) + \ldots + \lambda_{i_k}(A) = \sup_{V_1,\ldots,V_k} \inf_{W \in X(V_1,\ldots,V_k)} \hbox{tr}(A\downharpoonright_W).$
— 3. Eigenvalue inequalities —
Using the above minimax formulae, we can now quickly prove a variety of eigenvalue inequalities. The basic idea is to exploit the linearity relationship

$v^* (A+B) v = v^* A v + v^* B v \ \ \ \ \ (10)$

for any unit vector $v$, and more generally

$\hbox{tr}((A+B)\downharpoonright_V) = \hbox{tr}(A\downharpoonright_V) + \hbox{tr}(B\downharpoonright_V) \ \ \ \ \ (11)$

for any subspace $V$.

For instance, as mentioned before, the inequality (3) follows immediately from (2) and (10). Similarly, for the Ky Fan inequality (5), one observes from (11) and Proposition 3 that

$\hbox{tr}((A+B)\downharpoonright_V) \leq \hbox{tr}(A\downharpoonright_V) + \lambda_1(B) + \ldots + \lambda_k(B)$

for any $k$-dimensional subspace $V$. Substituting this into Proposition 3 gives the claim. If one uses Exercise 3 instead of Proposition 3, one obtains the more general Lidskii inequality

$\lambda_{i_1}(A+B) + \ldots + \lambda_{i_k}(A+B) \leq \lambda_{i_1}(A) + \ldots + \lambda_{i_k}(A) + \lambda_1(B) + \ldots + \lambda_k(B) \ \ \ \ \ (12)$

for any $1 \leq i_1 < \ldots < i_k \leq n$.
In a similar spirit, using the inequality

$|v^* B v| \leq \|B\|_{op}$

for unit vectors $v$, combined with (10) and (6), we obtain the eigenvalue stability inequality

$|\lambda_i(A+B) - \lambda_i(A)| \leq \|B\|_{op}, \ \ \ \ \ (13)$

thus the spectrum of $A+B$ is close to that of $A$ if $B$ is small in operator norm. In particular, we see that the map $A \mapsto \lambda_i(A)$ is Lipschitz continuous on the space of Hermitian matrices, for fixed $1 \leq i \leq n$.
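(Here is a quick sketch checking (13) numerically, again assuming numpy, with a small random perturbation.)

```python
import numpy as np

rng = np.random.default_rng(4)
n = 8
def herm(n):
    M = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    return (M + M.conj().T) / 2

A = herm(n)
B = 1e-3 * herm(n)                    # a small perturbation
lam = lambda M: np.linalg.eigvalsh(M)[::-1]
op_norm_B = np.linalg.norm(B, 2)      # operator norm = largest singular value
assert np.all(np.abs(lam(A + B) - lam(A)) <= op_norm_B + 1e-12)
```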
More generally, suppose one wants to establish the Weyl inequality (4). From (6), it suffices to show that every $(i+j-1)$-dimensional subspace $V$ contains a unit vector $v$ such that

$v^* (A+B) v \leq \lambda_i(A) + \lambda_j(B).$

But from (7), one can find a subspace $U$ of codimension $i-1$ such that $v^* A v \leq \lambda_i(A)$ for all unit vectors $v$ in $U$, and a subspace $W$ of codimension $j-1$ such that $v^* B v \leq \lambda_j(B)$ for all unit vectors $v$ in $W$. The intersection $U \cap W$ has codimension at most $i+j-2$ and so has a nontrivial intersection with $V$; and the claim follows.
Remark 3 More generally, one can generate an eigenvalue inequality whenever the intersection number of three Schubert varieties of compatible dimensions is non-zero; see the paper of Helmke and Rosenthal. In fact, this generates a complete set of inequalities; this is a result of Klyachko. One can in fact restrict attention to those varieties whose intersection number is exactly one; this is a result of Knutson, Woodward, and myself. Finally, in those cases, the fact that the intersection number is one can be proven by entirely elementary means (based on the standard inequalities relating the dimension of two subspaces $V, W$ to their intersection $V \cap W$ and sum $V + W$); this is a result of Bercovici, Collins, Dykema, Li, and Timotin. As a consequence, the methods in this section can, in principle, be used to derive all possible eigenvalue inequalities for sums of Hermitian matrices.
Exercise 4 Verify the inequalities (12) and (4) by hand in the case when $A$ and $B$ commute (and are thus simultaneously diagonalisable), without the use of minimax formulae.
Exercise 5 Establish the dual Lidskii inequality

$\lambda_{i_1}(A+B) + \ldots + \lambda_{i_k}(A+B) \geq \lambda_{i_1}(A) + \ldots + \lambda_{i_k}(A) + \lambda_{n-k+1}(B) + \ldots + \lambda_n(B)$

for any $1 \leq i_1 < \ldots < i_k \leq n$ and the dual Weyl inequality

$\lambda_{i+j-n}(A+B) \geq \lambda_i(A) + \lambda_j(B)$

whenever $1 \leq i, j, i+j-n \leq n$.
Exercise 6 Use the Lidskii inequality to establish the more general inequality

$\sum_{i=1}^n c_i \lambda_i(A+B) \leq \sum_{i=1}^n c_i \lambda_i(A) + \sum_{i=1}^n c_i^* \lambda_i(B)$

whenever $c_1,\ldots,c_n \geq 0$, and $c_1^* \geq \ldots \geq c_n^* \geq 0$ is the decreasing rearrangement of $c_1,\ldots,c_n$. (Hint: express $c_i$ as the integral of ${\bf I}(c_i \geq \lambda)$ as $\lambda$ runs from $0$ to infinity. For each fixed $\lambda$, apply (12).) Combine this with Hölder’s inequality to conclude the $p$-Wielandt-Hoffman inequality

$\| (\lambda_i(A+B) - \lambda_i(A))_{i=1}^n \|_{\ell^p_n} \leq \| B \|_{S^p} \ \ \ \ \ (14)$

for any $1 \leq p \leq \infty$, where $\| \cdot \|_{\ell^p_n}$ is the usual $\ell^p$ norm on ${\bf R}^n$ (with the usual convention that $\| \cdot \|_{\ell^\infty_n}$ is the supremum norm), and

$\| B \|_{S^p} := \| (\lambda_i(B))_{i=1}^n \|_{\ell^p_n} \ \ \ \ \ (15)$

is the $p$-Schatten norm of $B$.
Exercise 7 Show that the $p$-Schatten norms are indeed a norm on the space of Hermitian matrices for every $1 \leq p \leq \infty$.
Exercise 8 Show that for any $1 \leq p \leq \infty$ and any Hermitian matrix $A$, one has

$\| (a_{ii})_{i=1}^n \|_{\ell^p_n} \leq \| A \|_{S^p}.$
Exercise 9 Establish the Hölder inequality

$|\hbox{tr}(AB)| \leq \| A \|_{S^p} \| B \|_{S^{p'}}$

whenever $1 \leq p, p' \leq \infty$ with $1/p + 1/p' = 1$, and $A$, $B$ are $n \times n$ Hermitian matrices. (Hint: Diagonalise one of the matrices and use the preceding exercise.)
The most important $p$-Schatten norms are the $\infty$-Schatten norm $\| A \|_{S^\infty} = \| A \|_{op}$, which is just the operator norm, and the $2$-Schatten norm $\| A \|_{S^2}$, which is also the Frobenius norm (or Hilbert-Schmidt norm)

$\| A \|_{S^2} = \| A \|_F := \left( \sum_{i=1}^n \sum_{j=1}^n |a_{ij}|^2 \right)^{1/2} \ \ \ \ \ (16)$

where $a_{ij}$ are the coefficients of $A$. (The $1$-Schatten norm $\| A \|_{S^1}$, also known as the nuclear norm or trace class norm, is important in a number of applications, such as matrix completion, but will not be used often in this course.) Thus we see that the $p=2$ case of the Wielandt-Hoffman inequality can be written as

$\sum_{i=1}^n |\lambda_i(A+B) - \lambda_i(A)|^2 \leq \| B \|_F^2. \ \ \ \ \ (17)$

We will give an alternate proof of this inequality, based on eigenvalue deformation, in the next section.
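(An illustrative sketch, assuming numpy, verifying the Frobenius-norm case (17) on random matrices.)

```python
import numpy as np

rng = np.random.default_rng(5)
n = 7
def herm(n):
    M = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    return (M + M.conj().T) / 2

A, B = herm(n), herm(n)
lam = lambda M: np.linalg.eigvalsh(M)   # same (ascending) ordering on both spectra
lhs = np.sum((lam(A + B) - lam(A)) ** 2)
assert lhs <= np.linalg.norm(B, 'fro') ** 2 + 1e-10
```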
— 4. Eigenvalue deformation —
From the Weyl inequality (13), we know that the eigenvalue maps $A \mapsto \lambda_i(A)$ are Lipschitz continuous on Hermitian matrices (and thus also on real symmetric matrices). It turns out that we can obtain better regularity, provided that we avoid repeated eigenvalues. Fortunately, repeated eigenvalues are rare:
Exercise 10 (Dimension count) Suppose that $n \geq 2$. Show that the space of Hermitian matrices with at least one repeated eigenvalue has codimension $3$ in the space of all Hermitian matrices, and the space of real symmetric matrices with at least one repeated eigenvalue has codimension $2$ in the space of all real symmetric matrices. (When $n = 1$, repeated eigenvalues of course do not occur.)
Let us say that a Hermitian matrix has simple spectrum if it has no repeated eigenvalues. We thus see from the above exercise and (13) that the set of Hermitian matrices with simple spectrum forms an open dense set in the space of all Hermitian matrices, and similarly for real symmetric matrices; thus simple spectrum is the generic behaviour of such matrices. Indeed, the unexpectedly high codimension of the non-simple matrices (naively, one would expect a codimension $1$ set for a collision between, say, $\lambda_i(A)$ and $\lambda_{i+1}(A)$) suggests a repulsion phenomenon: because it is unexpectedly rare for eigenvalues to be equal, there must be some “force” that “repels” eigenvalues of Hermitian (and to a lesser extent, real symmetric) matrices from getting too close to each other. We now develop some machinery to make this more precise.
We first observe that when $A$ has simple spectrum, the zeroes of the characteristic polynomial $\lambda \mapsto \det(A - \lambda I)$ are simple (i.e. the polynomial has nonzero derivative at those zeroes). From this and the inverse function theorem, we see that each of the eigenvalue maps $A \mapsto \lambda_i(A)$ is smooth on the region where $A$ has simple spectrum. Because the eigenvectors $u_i(A)$ are determined (up to phase) by the equations $(A - \lambda_i(A)) u_i(A) = 0$ and $u_i(A)^* u_i(A) = 1$, another application of the inverse function theorem tells us that we can (locally) select the maps $A \mapsto u_i(A)$ to also be smooth. (There may be topological obstructions to smoothly selecting these vectors globally, but this will not concern us here as we will be performing a local analysis only. In later notes, we will in fact not work with the $u_i(A)$ at all due to their phase ambiguity, and work instead with the spectral projections $P_i(A) := u_i(A) u_i(A)^*$, which do not have this ambiguity.)
Now suppose that $A = A(t)$ depends smoothly on a time variable $t$, so that (when $A$ has simple spectrum) the eigenvalues $\lambda_i(t) = \lambda_i(A(t))$ and eigenvectors $u_i(t) = u_i(A(t))$ also depend smoothly on $t$. We can then differentiate the equations

$A u_i = \lambda_i u_i \ \ \ \ \ (18)$

and

$u_i^* u_i = 1 \ \ \ \ \ (19)$

to obtain various equations of motion for $\lambda_i$ and $u_i$ in terms of the derivatives of $A$.
Let’s see how this works. Taking first derivatives of (18), (19) using the product rule, we obtain

$\dot A u_i + A \dot u_i = \dot \lambda_i u_i + \lambda_i \dot u_i \ \ \ \ \ (20)$

and

$\dot u_i^* u_i + u_i^* \dot u_i = 0. \ \ \ \ \ (21)$

The equation (21) simplifies to $\hbox{Re}(u_i^* \dot u_i) = 0$, thus $\dot u_i$ is orthogonal to $u_i$. Taking inner products of (20) with $u_i$ (and using self-adjointness to cancel the terms involving $\dot u_i$), we conclude the Hadamard first variation formula

$\dot \lambda_i = u_i^* \dot A u_i. \ \ \ \ \ (22)$
This can already be used to give alternate proofs of various eigenvalue identities. For instance, if we apply this to $A(t) = A + tB$, we see that

$\frac{d}{dt} \lambda_i(A+tB) = u_i(A+tB)^* B u_i(A+tB)$

whenever $A + tB$ has simple spectrum. The right-hand side can be bounded in magnitude by $\| B \|_{op}$, and so we see that the map $t \mapsto \lambda_i(A+tB)$ is Lipschitz with constant $\| B \|_{op}$ whenever $A + tB$ has simple spectrum, which happens for generic $A, B$ (and all $t$) by Exercise 10. By the fundamental theorem of calculus, we thus conclude (13).
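(The first variation formula (22) is easy to test by finite differences. The following sketch of mine, assuming numpy and a generic matrix with simple spectrum, compares $(\lambda_i(A+hB) - \lambda_i(A))/h$ against $u_i^* B u_i$.)

```python
import numpy as np

rng = np.random.default_rng(6)
n = 5
def herm(n):
    M = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    return (M + M.conj().T) / 2

A, B = herm(n), herm(n)              # path A(t) = A + tB, so dA/dt = B
lam, U = np.linalg.eigh(A)           # simple spectrum almost surely
h = 1e-6
numeric = (np.linalg.eigvalsh(A + h * B) - lam) / h
analytic = np.array([(U[:, i].conj() @ B @ U[:, i]).real for i in range(n)])
assert np.allclose(numeric, analytic, atol=1e-4)
```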
Exercise 11 Use a similar argument to the one above to establish (17) without using minimax formulae or Lidskii’s inequality.
Exercise 12 Use a similar argument to the one above to deduce Lidskii’s inequality (12) from Proposition 3 rather than Exercise 3.
One can also compute the second derivative of eigenvalues:
Exercise 13 Suppose that $A = A(t)$ depends smoothly on $t$. By differentiating (20), (21) twice, establish the Hadamard second variation formula

$\frac{d^2}{dt^2} \lambda_k = u_k^* \ddot A u_k + 2 \sum_{j \neq k} \frac{|u_j^* \dot A u_k|^2}{\lambda_k - \lambda_j} \ \ \ \ \ (23)$

whenever $A$ has simple spectrum and $1 \leq k \leq n$.
If one interprets the second derivative of the eigenvalues as being proportional to a “force” on those eigenvalues (in analogy with Newton’s second law), (23) is asserting that each eigenvalue $\lambda_j$ “repels” the other eigenvalues $\lambda_k$ by exerting a force that is inversely proportional to their separation (and also proportional to the square of the matrix coefficient of $\dot A$ in the eigenbasis). See this earlier blog post for more discussion.
Remark 4 In the proof of the four moment theorem of Van Vu and myself, which we will discuss in a subsequent lecture, we will also need the variation formulae for the third, fourth, and fifth derivatives of the eigenvalues (the first four derivatives match up with the four moments mentioned in the theorem, and the fifth derivative is needed to control error terms). Fortunately, we do not need the precise formulae for these derivatives (which, as one can imagine, are quite complicated), but only their general form, and in particular an upper bound for these derivatives in terms of more easily computable quantities.
— 5. Minors —
In the previous sections, we perturbed $n \times n$ Hermitian matrices $A$ by adding a (small) $n \times n$ Hermitian correction matrix $B$ to them to form a new $n \times n$ Hermitian matrix $A + B$. Another important way to perturb a matrix is to pass to a principal minor, for instance to the top left $(n-1) \times (n-1)$ minor $A_{n-1}$ of an $n \times n$ matrix $A_n$. There is an important relationship between the eigenvalues of the two matrices:
Exercise 14 (Cauchy interlacing inequalities) For any $n \times n$ Hermitian matrix $A_n$ with top left $(n-1) \times (n-1)$ minor $A_{n-1}$, show that

$\lambda_{i+1}(A_n) \leq \lambda_i(A_{n-1}) \leq \lambda_i(A_n) \ \ \ \ \ (24)$

for all $1 \leq i < n$. (Hint: use the Courant-Fischer min-max theorem, Theorem 2.) Show furthermore that the space of $A_n$ for which equality holds in one of the inequalities in (24) has codimension $2$ (for Hermitian matrices) or $1$ (for real symmetric matrices).
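(A quick numerical illustration of the interlacing inequalities (24), a sketch assuming numpy.)

```python
import numpy as np

rng = np.random.default_rng(7)
n = 6
M = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
A_n = (M + M.conj().T) / 2
A_n1 = A_n[:-1, :-1]                    # top-left (n-1) x (n-1) minor

lam_n = np.linalg.eigvalsh(A_n)[::-1]   # descending
lam_n1 = np.linalg.eigvalsh(A_n1)[::-1]
for i in range(n - 1):
    assert lam_n[i + 1] - 1e-10 <= lam_n1[i] <= lam_n[i] + 1e-10
```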
Remark 5 If one takes successive minors $A_{n-1}, A_{n-2}, \ldots, A_1$ of an $n \times n$ Hermitian matrix $A_n$, and computes their spectra, then (24) shows that this triangular array of numbers forms a pattern known as a Gelfand-Tsetlin pattern. These patterns are discussed a little more in this blog post.
One can obtain a more precise formula for the eigenvalues of $A_n$ in terms of those for $A_{n-1}$:
Exercise 15 (Eigenvalue equation) Let $A_n$ be an $n \times n$ Hermitian matrix with top left $(n-1) \times (n-1)$ minor $A_{n-1}$. Suppose that $\lambda$ is an eigenvalue of $A_n$ distinct from all the eigenvalues of $A_{n-1}$ (and thus simple, by (24)). Show that

$\sum_{j=1}^{n-1} \frac{|u_j(A_{n-1})^* X|^2}{\lambda_j(A_{n-1}) - \lambda} = a_{nn} - \lambda \ \ \ \ \ (25)$

where $a_{nn}$ is the bottom right entry of $A_n$, and $X \in {\bf C}^{n-1}$ is the right column of $A_n$ (minus the bottom entry $a_{nn}$). (Hint: Expand out the eigenvalue equation $A_n u = \lambda u$ into the ${\bf C}^{n-1}$ and ${\bf C}$ components.) Note the similarities between (25) and (23).
Observe that the function $\lambda \mapsto \sum_{j=1}^{n-1} \frac{|u_j(A_{n-1})^* X|^2}{\lambda_j(A_{n-1}) - \lambda}$ is a rational function of $\lambda$ which is increasing away from the eigenvalues of $A_{n-1}$, where it has a pole (except in the rare case when the inner product $u_j(A_{n-1})^* X$ vanishes, in which case it can have a removable singularity). By graphing this function one can see that the interlacing formula (24) can also be interpreted as a manifestation of the intermediate value theorem.
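(One can also check the identity (25) directly. In the sketch below, which is mine and assumes numpy, each eigenvalue $\lambda$ of $A_n$ satisfies the stated equation with the eigendata of the minor; generically $\lambda$ avoids the minor's eigenvalues, so no pole is hit.)

```python
import numpy as np

rng = np.random.default_rng(8)
n = 5
M = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
A_n = (M + M.conj().T) / 2
A_n1 = A_n[:-1, :-1]                    # top-left minor
X = A_n[:-1, -1]                        # right column, bottom entry removed
a_nn = A_n[-1, -1].real

mu, W = np.linalg.eigh(A_n1)            # eigenvalues/eigenvectors of the minor
weights = np.abs(W.conj().T @ X) ** 2   # |u_j(A_{n-1})^* X|^2
for lam in np.linalg.eigvalsh(A_n):
    assert np.isclose(np.sum(weights / (mu - lam)), a_nn - lam)
```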
The identity (25) suggests that under typical circumstances, an eigenvalue $\lambda$ of $A_n$ can only get close to an eigenvalue $\lambda_j(A_{n-1})$ if the associated inner product $u_j(A_{n-1})^* X$ is small. This type of observation is useful to achieve eigenvalue repulsion – to show that it is unlikely that the gap between two adjacent eigenvalues is small. We shall see examples of this in later notes.
— 6. Singular values —
The theory of eigenvalues of $n \times n$ Hermitian matrices has an analogue in the theory of singular values of $p \times n$ non-Hermitian matrices. We first begin with the counterpart to the spectral theorem, namely the singular value decomposition.
Theorem 4 (Singular value decomposition) Let $0 \leq p \leq n$, and let $A$ be a linear transformation from an $n$-dimensional complex Hilbert space $U$ to a $p$-dimensional complex Hilbert space $V$. (In particular, $A$ could be a $p \times n$ matrix with complex entries, viewed as a linear transformation from ${\bf C}^n$ to ${\bf C}^p$.) Then there exist non-negative real numbers

$\sigma_1(A) \geq \ldots \geq \sigma_p(A) \geq 0$

(known as the singular values of $A$) and orthonormal sets $u_1(A),\ldots,u_p(A) \in U$ and $v_1(A),\ldots,v_p(A) \in V$ (known as singular vectors of $A$), such that

$A u_j = \sigma_j v_j; \quad A^* v_j = \sigma_j u_j$

for all $1 \leq j \leq p$, where we abbreviate $u_j = u_j(A)$, etc.

Furthermore, $Au = 0$ whenever $u$ is orthogonal to all of the $u_1(A),\ldots,u_p(A)$.
We adopt the convention that $\sigma_i(A) = 0$ for $i > p$. The above theorem only applies to matrices with at least as many columns as rows, but one can also extend the definition to matrices with more rows than columns by adopting the convention $\sigma_i(A) := \sigma_i(A^*)$ (it is easy to check that this extension is consistent on square matrices). All of the results below extend (with minor modifications) to the case when there are more rows than columns, but we have not displayed those extensions here in order to simplify the notation.
Proof: We induct on $p$. The claim is vacuous for $p = 0$, so suppose that $p \geq 1$ and that the claim has already been proven for $p - 1$.

We follow a similar strategy to the proof of Theorem 1. We may assume that $A$ is not identically zero, as the claim is obvious otherwise. The function $u \mapsto \| Au \|^2$ is continuous on the unit sphere of $U$, so there exists a unit vector $u_1$ which maximises this quantity. If we set $\sigma_1 := \| Au_1 \| > 0$, one easily verifies that $u_1$ is a critical point of the map $u \mapsto \| Au \|^2 - \sigma_1^2 \| u \|^2$, which then implies that $A^* A u_1 = \sigma_1^2 u_1$. Thus, if we set $v_1 := A u_1 / \sigma_1$, then $A u_1 = \sigma_1 v_1$ and $A^* v_1 = \sigma_1 u_1$. This implies that $A$ maps the orthogonal complement $u_1^\perp$ of $u_1$ in $U$ to the orthogonal complement $v_1^\perp$ of $v_1$ in $V$. By the induction hypothesis, the restriction of $A$ to $u_1^\perp$ (and $v_1^\perp$) then admits a singular value decomposition with singular values $\sigma_2 \geq \ldots \geq \sigma_p \geq 0$ and singular vectors $u_2,\ldots,u_p \in u_1^\perp$, $v_2,\ldots,v_p \in v_1^\perp$ with the stated properties. By construction we see that $\sigma_2,\ldots,\sigma_p$ are less than or equal to $\sigma_1$. If we now adjoin $\sigma_1, u_1, v_1$ to the other singular values and vectors we obtain the claim. $\Box$
Exercise 16 Show that the singular values $\sigma_1(A) \geq \ldots \geq \sigma_p(A) \geq 0$ of a $p \times n$ matrix $A$ are unique. If we have $\sigma_1(A) > \ldots > \sigma_p(A) > 0$, show that the singular vectors are unique up to rotation by a complex phase.
By construction (and the above uniqueness claim) we see that $\sigma_i(UAV) = \sigma_i(A)$ whenever $A$ is a $p \times n$ matrix, $U$ is a unitary $p \times p$ matrix, and $V$ is a unitary $n \times n$ matrix. Thus the singular spectrum of a matrix is invariant under left and right unitary transformations.
Exercise 17 If $A$ is a $p \times n$ complex matrix for some $1 \leq p \leq n$, show that the augmented matrix

$\tilde A := \begin{pmatrix} 0 & A \\ A^* & 0 \end{pmatrix}$

is a $(p+n) \times (p+n)$ Hermitian matrix whose eigenvalues consist of $\pm \sigma_1(A), \ldots, \pm \sigma_p(A)$, together with $n - p$ copies of the eigenvalue zero. (This generalises Exercise 16 from Notes 3.) What is the relationship between the singular vectors of $A$ and the eigenvectors of $\tilde A$?
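(A numerical sketch of this augmentation, assuming numpy.)

```python
import numpy as np

rng = np.random.default_rng(9)
p, n = 3, 5
A = rng.standard_normal((p, n)) + 1j * rng.standard_normal((p, n))
sigma = np.linalg.svd(A, compute_uv=False)   # descending singular values

# Augmented (p+n) x (p+n) Hermitian matrix [[0, A], [A*, 0]].
T = np.block([[np.zeros((p, p)), A], [A.conj().T, np.zeros((n, n))]])
eigs = np.sort(np.linalg.eigvalsh(T))
expected = np.sort(np.concatenate([sigma, -sigma, np.zeros(n - p)]))
assert np.allclose(eigs, expected)
```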
Exercise 18 If $A$ is an $n \times n$ Hermitian matrix, show that the singular values $\sigma_1(A),\ldots,\sigma_n(A)$ of $A$ are simply the absolute values $|\lambda_1(A)|,\ldots,|\lambda_n(A)|$ of the eigenvalues of $A$, arranged in descending order. Show that the same claim also holds when $A$ is a normal matrix. What is the relationship between the singular vectors and eigenvectors of $A$?
Remark 6 When $A$ is not normal, the relationship between eigenvalues and singular values is more subtle. We will discuss this point in later notes.
Exercise 19 If $A$ is a $p \times n$ complex matrix for some $1 \leq p \leq n$, show that $AA^*$ has eigenvalues $\sigma_1(A)^2,\ldots,\sigma_p(A)^2$, and $A^*A$ has eigenvalues $\sigma_1(A)^2,\ldots,\sigma_p(A)^2$ together with $n - p$ copies of the eigenvalue zero. Based on this observation, give an alternate proof of the singular value decomposition theorem using the spectral theorem for (positive semi-definite) Hermitian matrices.
Exercise 20 Show that the rank of a $p \times n$ matrix is equal to the number of non-zero singular values.
Exercise 21 Let $A$ be a $p \times n$ complex matrix for some $1 \leq p \leq n$. Establish the Courant-Fischer min-max formula

$\sigma_i(A) = \sup_{\dim(V) = i} \inf_{v \in V: |v| = 1} |Av|$

for all $1 \leq i \leq p$, where the supremum ranges over all subspaces of ${\bf C}^n$ of dimension $i$.
One can use the above exercises to deduce many inequalities about singular values from analogous ones about eigenvalues. We give some examples below.
Exercise 22 Let $A, B$ be $p \times n$ complex matrices for some $1 \leq p \leq n$.
- (i) Establish the Weyl inequality

$\sigma_{i+j-1}(A+B) \leq \sigma_i(A) + \sigma_j(B)$

whenever $1 \leq i, j, i+j-1 \leq p$.
- (ii) Establish the Lidskii inequality

$\sigma_{i_1}(A+B) + \ldots + \sigma_{i_k}(A+B) \leq \sigma_{i_1}(A) + \ldots + \sigma_{i_k}(A) + \sigma_1(B) + \ldots + \sigma_k(B)$

whenever $1 \leq i_1 < \ldots < i_k \leq p$.
- (iii) Show that for any $1 \leq k \leq p$, the map $A \mapsto \sigma_1(A) + \ldots + \sigma_k(A)$ defines a norm on the space ${\bf C}^{p \times n}$ of complex $p \times n$ matrices (this norm is known as the $k$th Ky Fan norm).
- (iv) Establish the Weyl inequality

$|\sigma_i(A+B) - \sigma_i(A)| \leq \| B \|_{op}$

for all $1 \leq i \leq p$.
- (v) More generally, establish the $q$-Wielandt-Hoffman inequality

$\| (\sigma_i(A+B) - \sigma_i(A))_{i=1}^p \|_{\ell^q_p} \leq \| B \|_{S^q}$

for any $1 \leq q \leq \infty$, where $\| B \|_{S^q} := \| (\sigma_i(B))_{i=1}^p \|_{\ell^q_p}$ is the $q$-Schatten norm of $B$. (Note that this is consistent with the previous definition of the Schatten norms.)
- (vi) Show that the $q$-Schatten norm is indeed a norm on ${\bf C}^{p \times n}$ for any $1 \leq q \leq \infty$.
- (vii) If $A'$ is formed by removing one row from $A$, show that $\sigma_{i+1}(A) \leq \sigma_i(A') \leq \sigma_i(A)$ for all $1 \leq i < p$.
- (viii) If $p < n$ and $A'$ is formed by removing one column from $A$, show that $\sigma_{i+1}(A) \leq \sigma_i(A') \leq \sigma_i(A)$ for all $1 \leq i < p$ and $\sigma_p(A') \leq \sigma_p(A)$. What changes when $p = n$?
Exercise 23 Let $A$ be a $p \times n$ complex matrix for some $1 \leq p \leq n$. Observe that the linear transformation $A: {\bf C}^n \rightarrow {\bf C}^p$ naturally induces a linear transformation $A^{\wedge k}: \bigwedge^k {\bf C}^n \rightarrow \bigwedge^k {\bf C}^p$ from $k$-forms on ${\bf C}^n$ to $k$-forms on ${\bf C}^p$. We give $\bigwedge^k {\bf C}^n$ the structure of a Hilbert space by declaring the basic forms $e_{i_1} \wedge \ldots \wedge e_{i_k}$ for $1 \leq i_1 < \ldots < i_k \leq n$ to be orthonormal.

For any $1 \leq k \leq p$, show that the operator norm of $A^{\wedge k}$ is equal to $\sigma_1(A) \cdots \sigma_k(A)$.
Exercise 24 Let $A$ be a $p \times n$ matrix for some $1 \leq p \leq n$, let $B$ be an $n \times m$ matrix for some $m \geq 1$, and let $C$ be an $l \times p$ matrix for some $l \geq 1$.

Show that

$\sigma_i(AB) \leq \sigma_i(A) \| B \|_{op}$

and

$\sigma_i(CA) \leq \| C \|_{op} \sigma_i(A)$

for any $1 \leq i \leq p$ (with the convention that $\sigma_i$ vanishes when $i$ exceeds the number of singular values).
Exercise 25 Let $A$ be a $p \times n$ matrix for some $1 \leq p \leq n$, let $i_1,\ldots,i_k \in \{1,\ldots,p\}$ be distinct, and let $j_1,\ldots,j_k \in \{1,\ldots,n\}$ be distinct. Show that

$|a_{i_1 j_1}| + \ldots + |a_{i_k j_k}| \leq \sigma_1(A) + \ldots + \sigma_k(A).$

Using this, show that if $j_1,\ldots,j_p \in \{1,\ldots,n\}$ are distinct, then

$\| (a_{i j_i})_{i=1}^p \|_{\ell^q_p} \leq \| A \|_{S^q}$

for every $1 \leq q \leq \infty$.
Exercise 26 Establish the Hölder inequality

$|\hbox{tr}(AB^*)| \leq \| A \|_{S^q} \| B \|_{S^{q'}}$

whenever $A, B$ are $p \times n$ complex matrices and $1 \leq q, q' \leq \infty$ are such that $1/q + 1/q' = 1$.
Acknowledgments: Thanks to Allen Knutson for corrections.
13 January, 2010 at 6:10 am
Marius
Hi Terry,
Linear algebra is indeed fascinating. Maybe the readers of this post may find
interesting some connections between this subject and quasiconvexity in the
calculus of variations and nonlinear elasticity:
http://www.sciencedirect.com/science?_ob=ArticleURL&_udi=B6V0R-4SPSHN6-3&_user=3797462&_rdoc=1&_fmt=&_orig=search&_sort=d&_docanchor=&view=c&_searchStrId=1164865628&_rerunOrigin=google&_acct=C000061416&_version=1&_urlVersion=0&_userid=3797462&md5=d8361adf4dc60e9323fc42216695b1cd
or
14 January, 2010 at 1:50 pm
Anonymous
A potential typo: in eq. 19 (and elsewhere), I thought the norm should be 1, not 0.
[Corrected, thanks – T.]
15 January, 2010 at 7:59 am
kashif
A small typo start of section 4: “From the Weyl inequality (13) we now that…” should be “From the Weyl inequality (13) we know that…”
[Corrected, thanks – T.]
19 January, 2010 at 4:15 pm
David Speyer
Possible typo: Should equation (19) read u_i^* u_i =1 (rather than 0)?
[Ack, I thought I corrected that typo already. Corrected again, thanks – T.]
17 February, 2010 at 8:30 am
Steven Heilman
Some pretty pedantic potential corrections:
1. A basic question in linear algebra …\lambda_{1}(B),\ldots,\lambda_{n}(B)
2. Eqs. (5),(9),(12): numerical label outside viewing window?
3. Section 3: Similarly, [for] the Ky Fan…
4. Section 4: From the Weyl inequality… we [k]now
5. After Exercise 10: … from the above exercise and [ ] (13)
6(a). Statement of Theorem 4: A^* v_j=\sigma_j u_j
6(b). Proof of Theorem 4: \sigma_1 :=\|Au_1 \|>0 ?
7. Exercise 23: induces a linear transformation … \bigwedge^k {\mathbb C}^p …
——————-
[and a few for Notes 3b]
1. After Remark 6: Applying [Lemma] (5)
2. … with disjoint increments being jointly indepen[d]ent
3. Proof of Theorem 6: …G continues [to] have the GUE distribution…
[Corrected, thanks – T.]
8 May, 2010 at 1:30 am
Anonymous
Dear Prof. Tao,
I had some problems on Ex. 8 and Ex. 13:
Ex. 8: Could you give some hint on how to prove it? This is most probably very trivial, but I tried it for several hours.
Ex. 13: I tried to do it according to your hint, but it is hard for me to see how to make the other eigenvalues involved in the equations.
Thank you very much in advance!
8 May, 2010 at 1:12 pm
Terence Tao
For Ex 8, apply the method of Ex6 but with (12) replaced by (9).
For Ex13, a solution can be found at the link provided.
21 May, 2010 at 10:11 am
Ben
Hi,
I’m a bit confused about your use of the inverse/implicit function theorem to get smoothness of the eigenvectors $u_i(A)$ in a neighbourhood of a given matrix. Is there not an issue that the matrix $A - \lambda_i(A)$ is singular? Is this where the normalization $u_i^* u_i = 1$ comes in? Thanks for your help.
21 May, 2010 at 12:07 pm
Terence Tao
Yes, with the normalisation $u^* u = 1$ (and working in the real symmetric setting for simplicity), the relevant map is a map from an n-dimensional space to an n-dimensional space, and one can check by hand that the Jacobian is nondegenerate when the eigenvalues are simple, so the inverse function theorem lets one select both $\lambda_i$ and $u_i$ smoothly in this case. In the complex Hermitian setting it is a bit more complicated; one has to also fix the phase of one of the coordinates of $u_i$ (e.g. insisting that the first coordinate is a positive real) to make the dimensions match.
There is a theorem of Rellich that shows that eigenvectors and eigenvalues can in fact be selected in an analytic fashion even when there are repeated eigenvalues, but only if one abandons the idea of keeping the order of the eigenvalues fixed (i.e. one allows one eigenvalue to overtake another). This is a nontrivial theorem to prove, though.
2 October, 2014 at 10:02 am
hmobahi
Thank you Terence. This was an extremely helpful comment. Does your comment imply that the function $\sum_{k=1}^n (\frac{d}{d t} \lambda_k(t)) f_k(\lambda_k(t))$ is well-defined even when the eigenvalues are not simple? Here $f_k(x)$ is any functional that depends analytically on $x$. I think about this because the sum kills the dependency on a specific ordering of the eigenvalues.
30 May, 2010 at 5:12 pm
Anonymous
Question and a correction.
In Exercise 5, the second inequality should have A -> A+B, n -> A, n -> B.
Question: could a hint be given on how to prove the p-Wielandt-Hoffman inequality using Hölder’s inequality? I do not have any clue.
30 May, 2010 at 11:05 pm
Terence Tao
Thanks for the correction!
By Hölder’s inequality, the $\ell^p$ norm of a sequence is equal to the largest inner product between that sequence and a sequence of unit $\ell^{p'}$ norm. Now use the fact that the $\ell^{p'}$ norm is invariant with respect to rearrangement.
6 February, 2012 at 9:03 pm
Anonymous
Dear Prof. Tao, to get the first inequality in Exercise 6, do you multiply the Lidskii inequality by the integral given in the hint? And then how can one get $c_i^*$, the decreasing rearrangement of the $c_i$, in front of $\lambda_i(B)$? Thanks for your help!
31 May, 2010 at 11:48 am
Anonymous
Thanks for the reply. But I am still confused about how to get (14) using ONLY the first inequality given in Exercise 6 and Hölder’s inequality.
To generate the $\ell^p$ norm for $\lambda_i(A+B) - \lambda_i(A)$ using Hölder’s inequality, we need to use a $c_i$ that is negative if $\lambda_i(A+B) - \lambda_i(A)$ is negative. But this inequality is only about $c_i$ that are nonnegative.
For example, consider the scalar case when A, B are both scalars: $\lambda_1(A+B) - \lambda_1(A) \leq \lambda_1(B)$ does not necessarily mean the absolute value of the former is smaller than the absolute value of the latter.
Should we use some other conditions?
31 May, 2010 at 6:56 pm
Terence Tao
Because the trace of A+B is the sum of the trace of A and the trace of B, one can extend the original inequality to the case when the c_i are arbitrary reals.
19 January, 2011 at 6:55 pm
San
In Proposition 3, how do you use the induction hypothesis, as the hypothesis would be true for the sum of the first k eigenvalues?
Thanks,
San
20 January, 2011 at 12:13 pm
Terence Tao
When one restricts A to the span of $e_2,\ldots,e_n$ to obtain an n-1-dimensional linear operator, the eigenvalues are now $\lambda_2(A),\ldots,\lambda_n(A)$, thanks to the hypothesis that A has the standard eigenbasis (so that it is given by the diagonal matrix with entries $\lambda_1(A),\ldots,\lambda_n(A)$).
20 January, 2011 at 5:41 pm
San
Thanks for the reply. Can I ask where I can find the solution of Exercise 3?
Sangeeta
24 July, 2019 at 5:37 am
Anonymous
https://math.stackexchange.com/q/2863568/469791
22 January, 2011 at 9:26 am
Sandor
A direct link to “my survey with Allen Knutson” is http://www.ams.org/notices/200102/fea-knutson.pdf
13 February, 2012 at 8:52 am
kushal
Hi,
1. Is there any upper bound for the norm of the inverse of a matrix (i.e. ||inv(A)||) in terms of the norm of the matrix, i.e. ||A||?
NB. There is no restriction on A such as det(A) = …
2. Is there any lower bound for problem 1 that involves only A and not inv(A)?
11 July, 2013 at 11:03 am
Anonymous
Hello Prof. Tao,
Can these results be applied to matrices of different dimensions? For example, I have an m*m matrix (A) of rank n<m, and add another matrix of dimension q*q (B), where q<n. All matrices are Hermitian. Can I find a bound on the absolute eigenvalues of the system? The eigenvalues are lambda_1, -lambda_1, lambda_2, -lambda_2, and so on.
16 April, 2014 at 3:42 pm
Anonymous
Is there a relation for the eigenvalues of the sum of an anti-Hermitian matrix A and a Hermitian matrix B? For example, if I want to know the eigenvalues of A + B, can I simply calculate the eigenvalues of A and B separately and then get the eigenvalues of A + B from them?
16 April, 2014 at 4:03 pm
Terence Tao
If A and B commute, then the eigenvalues of A+B are the sum of eigenvalues of A and some permutation of the eigenvalues of B.
In general, though, the situation is quite complicated, since A+B will usually not be a normal matrix, and the eigenvalues will be quite badly behaved complex numbers. One can still control the singular values, and hence the eigenvalues to some extent, by the Weyl inequalities, and one also controls the trace of A+B, but other than that things look quite messy, even for the 2×2 matrix case.
26 August, 2015 at 4:38 am
Hamed
I do not know about the application you have in mind. But in case you need to know the bounds of the eigenvalues (for example, if you want to analyse the stability of some dynamical system), this structure can help you. You can use Bendixson’s theorem, which says that the eigenvalues of A+B are bounded in a box, where the real parts are bounded by the eigenvalues of B (Hermitian) and the imaginary parts are bounded by the eigenvalues of A (anti-Hermitian).
27 June, 2014 at 2:18 am
Felix V.
Dear Professor Tao,
in Exercise 10, you ask the reader to prove that
“the space of Hermitian matrices with at least one repeated eigenvalue has codimension 3 in the space of all Hermitian matrices.”
It seemed to me that “space” should be interpreted as “(real) vector space” in this context.
But then the problem is that the set of matrices with at least one repeated eigenvalue is NOT a vector space (not closed under addition), as can be seen by considering e.g. diagonal matrices:

$\left(\begin{array}{ccc} 1 & & \\ & 1 & \\ & & 2 \end{array}\right) + \left(\begin{array}{ccc} 0 & & \\ & 1 & \\ & & 1 \end{array}\right) = \left(\begin{array}{ccc} 1 & & \\ & 2 & \\ & & 3 \end{array}\right)$
Best regards,
Felix V.
27 June, 2014 at 8:10 am
Terence Tao
Here I am not interpreting “space” in the linear algebra sense; you may substitute “set” or “algebraic set” instead, if you prefer.
28 March, 2015 at 7:57 pm
kumar vishwajeet
Dear Dr Tao,
I am currently working on the distribution of eigenvalues of non-central Wishart matrices. I found the following interesting thing. Let $H$ be a random complex matrix whose entries have a circular Gaussian distribution with mean 0 and some variance. Consider another case where the entries have a real Gaussian distribution with mean 0 and some variance. Using Monte Carlo, I found that the probability density function of the eigenvalues in the first case and that in the second case only differ by a multiplying constant. Is there any mathematical proof or logical explanation for this?
29 March, 2015 at 8:08 am
Terence Tao
The eigenvalues of the two matrices you consider are the squares of the eigenvalues of the corresponding Hermitian matrices. So my guess is that the two matrices have essentially the same spectrum (semicircle law, I think) and are behaving as if they were free. (I’m not sure what you mean by the multiplying constant difference though – the probability density function has to integrate to 1.)
28 March, 2015 at 8:12 pm
kumar vishwajeet
Just a modification of my last post: the probability density functions of the eigenvalues of the two matrices differ by a multiplying constant.
29 March, 2015 at 12:46 pm
kumar vishwajeet
Thank you for the prompt reply. By “multiplying constant difference” I mean that if $f$ is the pdf of the eigenvalues of one matrix and $g$ is the pdf of the eigenvalues of the other, then $f$ and $g$ differ by a multiplying constant.
29 March, 2015 at 3:12 pm
kumar vishwajeet
I got it. The integral of the pdf has to be equal to one.
24 September, 2015 at 4:31 pm
Abbas
What does $|Av|$ denote in Exercise 21?
[The magnitude of the vector $Av$ formed by multiplying the matrix $A$ with the vector $v$. -T.]
30 March, 2016 at 4:39 pm
David
Prof. Tao,
I am a novice going through a few of these exercises and ran into trouble on #6. Here you suggest using Hölder’s inequality to prove the p-Wielandt-Hoffman inequality. This works very straightforwardly to yield an Lp norm for the right-hand side of the inequality (in (14)), but the Lp norm also must be obtained on the left-hand side, which is more difficult. Can you comment on how to proceed for the left-hand side of (14)?
30 March, 2016 at 9:24 pm
Terence Tao
Use the converse form of Hölder’s inequality.
31 March, 2016 at 11:22 am
David
Very nice, thanks.
24 May, 2016 at 4:44 am
Anonymous
Dear Dr. Tao,
I was thinking about a way to show that the partial trace is independent of the choice of an orthonormal basis. But I kept failing to find a good proof. Is there any hint you can give me?
24 May, 2016 at 8:35 am
Terence Tao
Given two orthonormal bases for $V$, write the elements of one basis as linear combinations of the other basis; the coefficients will then form a unitary matrix since both bases are orthonormal. Insert these expansions of the basis elements to write the partial trace in one basis in terms of the other basis, and use the definition of a unitary matrix to simplify.
Alternatively, one can note that the partial trace is also equal to the full trace of $P_V A P_V$, where $P_V$ is the orthogonal projection onto $V$, and use the standard fact that the full trace is basis-independent.
24 May, 2016 at 10:32 am
Anonymous
Thank you, the second one is the one I was looking for.
27 May, 2016 at 12:01 am
David
Dear Dr. Tao,
I would be interested to see the proof of the dual Weyl inequality if you happen to have it handy. I have read the proof in “A Note on Weyl’s Inequality” by Steve Fisk, but I failed to really understand it.
Thanks,
David
27 May, 2016 at 7:56 am
Terence Tao
One can for instance derive the dual Weyl inequality from the original Weyl inequality by repeatedly using the identity $\lambda_i(A) = -\lambda_{n-i+1}(-A)$.
24 June, 2016 at 8:43 am
vsk1996
Tao sir, please give some simpler version of the Courant-Fischer theorem and relate it to the Rayleigh quotient.
25 October, 2016 at 3:39 am
Archy
Prof. Tao, is there a simplification of this theory if all the Hermitian matrices are rank 1? Thank you very much!
26 November, 2016 at 11:05 am
keej
These notes are great. I think there is a typo in the definition of spectral projection in Exercise 1. It should be $u_j(A) u_j(A)^*$, not $u_j(A)^* u_j(A)$.
[Corrected, thanks – T.]
2 March, 2017 at 12:08 am
keej
In the proof of the SVD, we use the fact that a critical point of the map $u \mapsto \|Au\|^2 - \sigma_1^2 \|u\|^2$ must satisfy the identity $A^* A u = \sigma_1^2 u$. I have two questions about this.
First, I can see why this is true by expanding out all the entries and differentiating, but I feel like there should be a faster way. Is there?
Secondly, what does the derivative even mean in the case that $A$ has complex entries? The map in question is real-valued, so how can its gradient be a complex vector?
2 March, 2017 at 9:48 am
Terence Tao
One can first try to understand the situation when $A$ is a diagonal matrix, in which case the map basically splits as the sum of non-interacting one-dimensional maps, and the claim is then easy to understand in that setting. The existence of the SVD in fact tells us that the critical point behaviour of a general matrix $A$ will be the same as that of a diagonal matrix (with the singular values as entries), so if one believes the SVD to exist then this explains why this identity ought to be true. Of course one cannot use this observation to prove the existence of the SVD, as this would be circular, but it can certainly be used to motivate such a proof.
Any complex vector space can be viewed as a real vector space of twice the dimension, so one can take the gradient in that setting.
16 February, 2018 at 4:52 pm
Dita
Dear Prof. Tao, I have one question. If the eigenvalues of the matrices A and A + B are arranged in increasing order, but the eigenvalues of B are not ordered, is it possible to use Weyl’s inequality to get a bound for the eigenvalues of A + B? Thank you for your attention.
4 May, 2018 at 10:55 am
Anonymous
Is Weyl’s inequality true even in the case of a separable Hilbert space?
4 May, 2018 at 12:54 pm
Terence Tao
For compact self-adjoint operators, basically all the inequalities for finite-dimensional operators extend to the infinite-dimensional setting by a limiting argument, see https://arxiv.org/abs/0709.1088 . For non-compact operators, in which the spectrum need not be pure point, the situation is significantly more complicated, and I am not sure what can cleanly be said (beyond the subadditivity of operator norms).
12 July, 2018 at 4:08 pm
Anonymous
Do (vii)/(viii) in Exercise 22 imply that the Cauchy interlacing theorem holds for eigenvalues of matrices that are not necessarily Hermitian?
[Oops, this was a typo: all references to eigenvalues here have been replaced with singular values. There is no interlacing for eigenvalues for non-Hermitian matrices, as these eigenvalues need not even be real. -T.]
30 September, 2020 at 10:02 am
Vaibhav
Does the positive-definiteness of the matrices A and B alter the eigenvalue inequalities, such as the Weyl inequalities? Is that a condition that needs consideration? Thanks in advance!
1 October, 2020 at 7:54 am
Terence Tao
Any self-adjoint matrix can be made positive definite by adding a large multiple of the identity, which has the effect of translating all the eigenvalues by a constant. One can check that inequalities such as Weyl’s inequalities are invariant under such a translation, hence the imposition of positive definiteness does not lead to any significant improvement to Weyl’s inequalities. On the other hand, being positive definite is equivalent to the eigenvalues being non-negative, so one now has some additional inequalities (the non-negativity of the eigenvalues) that one can combine with Weyl’s inequalities if one wishes to obtain some new, but *weaker*, inequalities. For instance, adding the non-negativity of an eigenvalue of B to one of the Weyl inequalities gives a new, but worse, bound on the corresponding eigenvalue of A+B.