You are currently browsing the tag archive for the ‘Courant-Fisher theorem’ tag.

Let {A} be a Hermitian {n \times n} matrix. By the spectral theorem for Hermitian matrices (which, for sake of completeness, we prove below), one can diagonalise {A} using a sequence

\displaystyle  \lambda_1(A) \geq \ldots \geq \lambda_n(A)

of {n} real eigenvalues, together with an orthonormal basis of eigenvectors {u_1(A),\ldots,u_n(A) \in {\mathbb C}^n}. (The eigenvalues are uniquely determined by {A}, but the eigenvectors have a little ambiguity to them, particularly if there are repeated eigenvalues; for instance, one could multiply each eigenvector by a complex phase {e^{i\theta}}. In these notes we are arranging eigenvalues in descending order; of course, one can also arrange eigenvalues in increasing order, which causes some slight notational changes in the results below.) The set {\{\lambda_1(A),\ldots,\lambda_n(A)\}} is known as the spectrum of {A}.

A basic question in linear algebra asks the extent to which the eigenvalues {\lambda_1(A),\ldots,\lambda_n(A)} and {\lambda_1(B),\ldots,\lambda_n(B)} of two Hermitian matrices {A, B} constrains the eigenvalues {\lambda_1(A+B),\ldots,\lambda_n(A+B)} of the sum. For instance, the linearity of trace

\displaystyle  \hbox{tr}(A+B) = \hbox{tr}(A)+\hbox{tr}(B),

when expressed in terms of eigenvalues, gives the trace constraint

\displaystyle  \lambda_1(A+B)+\ldots+\lambda_n(A+B) = \lambda_1(A)+\ldots+\lambda_n(A) \ \ \ \ \ (1)

\displaystyle  +\lambda_1(B)+\ldots+\lambda_n(B);

the identity

\displaystyle  \lambda_1(A) = \sup_{|v|=1} v^* Av \ \ \ \ \ (2)

(together with the counterparts for {B} and {A+B}) gives the inequality

\displaystyle  \lambda_1(A+B) \leq \lambda_1(A) + \lambda_1(B); \ \ \ \ \ (3)

and so forth.

The complete answer to this problem is a fascinating one, requiring a strangely recursive description (once known as Horn’s conjecture, which is now solved), and connected to a large number of other fields of mathematics, such as geometric invariant theory, intersection theory, and the combinatorics of a certain gadget known as a “honeycomb”. See for instance my survey with Allen Knutson on this topic some years ago.

In typical applications to random matrices, one of the matrices (say, {B}) is “small” in some sense, so that {A+B} is a perturbation of {A}. In this case, one does not need the full strength of the above theory, and instead rely on a simple aspect of it pointed out by Helmke and Rosenthal and by Totaro, which generates several of the eigenvalue inequalities relating {A}, {B}, and {C}, of which (1) and (3) are examples. (Actually, this method eventually generates all of the eigenvalue inequalities, but this is a non-trivial fact to prove.) These eigenvalue inequalities can mostly be deduced from a number of minimax characterisations of eigenvalues (of which (2) is a typical example), together with some basic facts about intersections of subspaces. Examples include the Weyl inequalities

\displaystyle  \lambda_{i+j-1}(A+B) \leq \lambda_i(A) + \lambda_j(B), \ \ \ \ \ (4)

valid whenever {i,j \geq 1} and {i+j-1 \leq n}, and the Ky Fan inequality

\displaystyle  \lambda_1(A+B)+\ldots+\lambda_k(A+B) \leq

\displaystyle  \lambda_1(A)+\ldots+\lambda_k(A) + \lambda_1(B)+\ldots+\lambda_k(B). \ \ \ \ \ (5)

One consequence of these inequalities is that the spectrum of a Hermitian matrix is stable with respect to small perturbations.

We will also establish some closely related inequalities concerning the relationships between the eigenvalues of a matrix, and the eigenvalues of its minors.

Many of the inequalities here have analogues for the singular values of non-Hermitian matrices (which is consistent with the discussion near Exercise 16 of Notes 3). However, the situation is markedly different when dealing with eigenvalues of non-Hermitian matrices; here, the spectrum can be far more unstable, if pseudospectrum is present. Because of this, the theory of the eigenvalues of a random non-Hermitian matrix requires an additional ingredient, namely upper bounds on the prevalence of pseudospectrum, which after recentering the matrix is basically equivalent to establishing lower bounds on least singular values. We will discuss this point in more detail in later notes.

We will work primarily here with Hermitian matrices, which can be viewed as self-adjoint transformations on complex vector spaces such as {{\mathbb C}^n}. One can of course specialise the discussion to real symmetric matrices, in which case one can restrict these complex vector spaces to their real counterparts {{\mathbb R}^n}. The specialisation of the complex theory below to the real case is straightforward and is left to the interested reader.

Read the rest of this entry »