
Suppose we have an {n \times n} matrix {M} that is expressed in block-matrix form as

\displaystyle  M = \begin{pmatrix} A & B \\ C & D \end{pmatrix}

where {A} is an {(n-k) \times (n-k)} matrix, {B} is an {(n-k) \times k} matrix, {C} is a {k \times (n-k)} matrix, and {D} is a {k \times k} matrix for some {1 \leq k < n}. If {A} is invertible, we can use the technique of Schur complementation to express the inverse of {M} (if it exists) in terms of the inverse of {A}, and the other components {B,C,D} of course. Indeed, to solve the equation

\displaystyle  M \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} a \\ b \end{pmatrix},

where {x, a} are {(n-k) \times 1} column vectors and {y,b} are {k \times 1} column vectors, we can expand this out as a system

\displaystyle  Ax + By = a

\displaystyle  Cx + Dy = b.

Using the invertibility of {A}, we can write the first equation as

\displaystyle  x = A^{-1} a - A^{-1} B y \ \ \ \ \ (1)

and substituting this into the second equation yields

\displaystyle  (D - C A^{-1} B) y = b - C A^{-1} a

and thus (assuming that {D - CA^{-1} B} is invertible)

\displaystyle  y = - (D - CA^{-1} B)^{-1} CA^{-1} a + (D - CA^{-1} B)^{-1} b

and then inserting this back into (1) gives

\displaystyle  x = (A^{-1} + A^{-1} B (D - CA^{-1} B)^{-1} C A^{-1}) a - A^{-1} B (D - CA^{-1} B)^{-1} b.

Comparing this with

\displaystyle  \begin{pmatrix} x \\ y \end{pmatrix} = M^{-1} \begin{pmatrix} a \\ b \end{pmatrix},

we have managed to express the inverse of {M} as

\displaystyle  M^{-1} =

\displaystyle  \begin{pmatrix} A^{-1} + A^{-1} B (D - CA^{-1} B)^{-1} C A^{-1} & - A^{-1} B (D - CA^{-1} B)^{-1} \\ - (D - CA^{-1} B)^{-1} CA^{-1} & (D - CA^{-1} B)^{-1} \end{pmatrix}. \ \ \ \ \ (2)
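
As a quick sanity check, here is a minimal numerical sketch (in Python/numpy, with an arbitrary choice of {n=7} and {k=3}, and a generic random matrix so that all the relevant inverses exist) comparing the block formula (2) with a direct inversion:

```python
import numpy as np

# Minimal sketch: compare the block-inverse formula (2) with a direct inversion,
# for an arbitrary (generically invertible) random 7 x 7 matrix and k = 3.
rng = np.random.default_rng(0)
n, k = 7, 3
M = rng.standard_normal((n, n))
A, B = M[:n - k, :n - k], M[:n - k, n - k:]
C, D = M[n - k:, :n - k], M[n - k:, n - k:]
Ai = np.linalg.inv(A)
Si = np.linalg.inv(D - C @ Ai @ B)          # inverse of the Schur complement
block_inv = np.block([[Ai + Ai @ B @ Si @ C @ Ai, -Ai @ B @ Si],
                      [-Si @ C @ Ai, Si]])
assert np.allclose(block_inv, np.linalg.inv(M))
```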

One can consider the inverse problem: given the inverse {M^{-1}} of {M}, does one have a nice formula for the inverse {A^{-1}} of the minor {A}? Trying to recover this directly from (2) looks somewhat messy. However, one can proceed as follows. Let {U} denote the {n \times k} matrix

\displaystyle  U := \begin{pmatrix} 0 \\ I_k \end{pmatrix}

(with {I_k} the {k \times k} identity matrix), and let {V} be its transpose:

\displaystyle  V := \begin{pmatrix} 0 & I_k \end{pmatrix}.

Then for any scalar {t} (which we identify with {t} times the identity matrix), one has

\displaystyle  M + UtV = \begin{pmatrix} A & B \\ C & D+t \end{pmatrix},

and hence by (2)

\displaystyle  (M+UtV)^{-1} =

\displaystyle \begin{pmatrix} A^{-1} + A^{-1} B (D + t - CA^{-1} B)^{-1} C A^{-1} & - A^{-1} B (D + t- CA^{-1} B)^{-1} \\ - (D + t - CA^{-1} B)^{-1} CA^{-1} & (D + t - CA^{-1} B)^{-1} \end{pmatrix},

noting that the inverses here will exist for {t} large enough. Taking limits as {t \rightarrow \infty}, we conclude that

\displaystyle  \lim_{t \rightarrow \infty} (M+UtV)^{-1} = \begin{pmatrix} A^{-1} & 0 \\ 0 & 0 \end{pmatrix}.

On the other hand, by the Woodbury matrix identity (discussed in this previous blog post), we have

\displaystyle  (M+UtV)^{-1} = M^{-1} - M^{-1} U (t^{-1} + V M^{-1} U)^{-1} V M^{-1}

and hence on taking limits and comparing with the preceding identity, one has

\displaystyle  \begin{pmatrix} A^{-1} & 0 \\ 0 & 0 \end{pmatrix} = M^{-1} - M^{-1} U (V M^{-1} U)^{-1} V M^{-1}.

This achieves the aim of expressing the inverse {A^{-1}} of the minor in terms of the inverse of the full matrix. Taking traces and rearranging, we conclude in particular that

\displaystyle  \mathrm{tr} A^{-1} = \mathrm{tr} M^{-1} - \mathrm{tr} \left( (V M^{-2} U) (V M^{-1} U)^{-1} \right). \ \ \ \ \ (3)

In the {k=1} case, this can be simplified to

\displaystyle  \mathrm{tr} A^{-1} = \mathrm{tr} M^{-1} - \frac{e_n^T M^{-2} e_n}{e_n^T M^{-1} e_n} \ \ \ \ \ (4)

where {e_n} is the {n^{th}} basis column vector.
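
For instance, (4) is easy to verify numerically; here is a minimal sketch (using a generic random matrix, so that the relevant inverses exist):

```python
import numpy as np

# Sketch check of (4): A is the top left (n-1) x (n-1) minor of a generic M,
# and e_n is the last standard basis vector (index n-1 in 0-based numpy indexing).
rng = np.random.default_rng(1)
n = 8
M = rng.standard_normal((n, n))
Mi = np.linalg.inv(M)
Mi2 = Mi @ Mi                                   # M^{-2}
lhs = np.trace(np.linalg.inv(M[:n - 1, :n - 1]))
rhs = np.trace(Mi) - Mi2[n - 1, n - 1] / Mi[n - 1, n - 1]
assert np.isclose(lhs, rhs)
```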

We can apply this identity to understand how the spectrum of an {n \times n} random matrix {M} relates to that of its top left {n-1 \times n-1} minor {A}. Subtracting any complex multiple {z} of the identity from {M} (and hence from {A}), we can relate the Stieltjes transform {s_M(z) := \frac{1}{n} \mathrm{tr}(M-z)^{-1}} of {M} with the Stieltjes transform {s_A(z) := \frac{1}{n-1} \mathrm{tr}(A-z)^{-1}} of {A}:

\displaystyle  s_A(z) = \frac{n}{n-1} s_M(z) - \frac{1}{n-1} \frac{e_n^T (M-z)^{-2} e_n}{e_n^T (M-z)^{-1} e_n} \ \ \ \ \ (5)

At this point we begin to proceed informally. Assume for sake of argument that the random matrix {M} is Hermitian, with distribution that is invariant under conjugation by the unitary group {U(n)}; for instance, {M} could be drawn from the Gaussian Unitary Ensemble (GUE), or alternatively {M} could be of the form {M = U D U^*} for some real diagonal matrix {D} and {U} a unitary matrix drawn randomly from {U(n)} using Haar measure. To fix normalisations we will assume that the eigenvalues of {M} are typically of size {O(1)}. Then {A} is also Hermitian and {U(n-1)}-invariant. Furthermore, the law of {e_n^T (M-z)^{-1} e_n} will be the same as the law of {u^* (M-z)^{-1} u}, where {u} is now drawn uniformly from the unit sphere (independently of {M}). Diagonalising {M} into eigenvalues {\lambda_j} and eigenvectors {v_j}, we have

\displaystyle u^* (M-z)^{-1} u = \sum_{j=1}^n \frac{|u^* v_j|^2}{\lambda_j - z}.

One can think of {u} as a random (complex) Gaussian vector, divided by the magnitude of that vector (which, by the Chernoff inequality, will concentrate to {\sqrt{n}}). Thus the coefficients {u^* v_j} with respect to the orthonormal basis {v_1,\dots,v_n} can be thought of as independent (complex) Gaussian variables, divided by that magnitude. Using this and the Chernoff inequality again, we see (for {z} distance {\sim 1} away from the real axis at least) that one has the concentration of measure

\displaystyle  u^* (M-z)^{-1} u \approx \frac{1}{n} \sum_{j=1}^n \frac{1}{\lambda_j - z}

and thus

\displaystyle  e_n^T (M-z)^{-1} e_n \approx \frac{1}{n} \mathrm{tr} (M-z)^{-1} = s_M(z)

(that is to say, the diagonal entries of {(M-z)^{-1}} are roughly constant). Similarly we have

\displaystyle  e_n^T (M-z)^{-2} e_n \approx \frac{1}{n} \mathrm{tr} (M-z)^{-2} = \frac{d}{dz} s_M(z).
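
Both approximations are easy to test numerically; the following rough sketch (a single GUE-type sample, with an arbitrarily chosen spectral parameter {z}) illustrates the first one:

```python
import numpy as np

# Illustrative sketch: for a GUE-type matrix normalised to have eigenvalues O(1),
# the bottom-right diagonal entry of the resolvent (M - z)^{-1} is close to the
# normalised trace s_M(z); the gap is typically of size O(n^{-1/2}).
rng = np.random.default_rng(2)
n, z = 1000, 0.3 + 1.0j
H = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
M = (H + H.conj().T) / (2 * np.sqrt(n))         # Hermitian, entries of variance ~ 1/n
R = np.linalg.inv(M - z * np.eye(n))            # resolvent (M - z)^{-1}
s_M = np.trace(R) / n
print(abs(R[-1, -1] - s_M))                     # small compared to |s_M|, which is O(1)
```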

Inserting this into (5) and discarding terms of size {O(1/n^2)}, we thus conclude the approximate relationship

\displaystyle  s_A(z) \approx s_M(z) + \frac{1}{n} ( s_M(z) - s_M(z)^{-1} \frac{d}{dz} s_M(z) ).
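
As a rough numerical sanity check of this approximation, one can compare the corrected prediction against the actual minor for a single GUE-type sample (the parameters below are arbitrary):

```python
import numpy as np

# Sketch: the corrected prediction s_M + (1/n)(s_M - s_M^{-1} s_M') should track
# the minor's Stieltjes transform s_A noticeably better than s_M alone.
rng = np.random.default_rng(3)
n, z = 1000, 0.3 + 1.0j
H = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
M = (H + H.conj().T) / (2 * np.sqrt(n))
R = np.linalg.inv(M - z * np.eye(n))
s_M = np.trace(R) / n
s_M_prime = np.trace(R @ R) / n                 # = (d/dz) s_M(z)
s_A = np.trace(np.linalg.inv(M[:n - 1, :n - 1] - z * np.eye(n - 1))) / (n - 1)
prediction = s_M + (s_M - s_M_prime / s_M) / n
print(abs(s_A - s_M), abs(s_A - prediction))    # the second error is typically much smaller
```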

This can be viewed as a difference equation for the Stieltjes transform of top left minors of {M}. Iterating this equation, and formally replacing the difference equation by a differential equation in the large {n} limit, we see that when {n} is large and {k \approx e^{-t} n} for some {t \geq 0}, one expects the top left {k \times k} minor {A_k} of {M} to have Stieltjes transform

\displaystyle  s_{A_k}(z) \approx s( t, z ) \ \ \ \ \ (6)

where {s(t,z)} solves the Burgers-type equation

\displaystyle  \partial_t s(t,z) = s(t,z) - s(t,z)^{-1} \frac{d}{dz} s(t,z) \ \ \ \ \ (7)

with initial data {s(0,z) = s_M(z)}.

Example 1 If {M} is a constant multiple {M = cI_n} of the identity, then {s_M(z) = \frac{1}{c-z}}. One checks that {s(t,z) = \frac{1}{c-z}} is a steady state solution to (7), which is unsurprising given that all minors of {M} are also {c} times the identity.

Example 2 If {M} is GUE normalised so that each entry has variance {\sigma^2/n}, then by the semi-circular law (see previous notes) one has {s_M(z) \approx \frac{-z + \sqrt{z^2-4\sigma^2}}{2\sigma^2} = -\frac{2}{z + \sqrt{z^2-4\sigma^2}}} (using an appropriate branch of the square root). One can then verify the self-similar solution

\displaystyle  s(t,z) = \frac{-z + \sqrt{z^2 - 4\sigma^2 e^{-t}}}{2\sigma^2 e^{-t}} = -\frac{2}{z + \sqrt{z^2 - 4\sigma^2 e^{-t}}}

to (7), which is consistent with the fact that a top {k \times k} minor of {M} also has the law of GUE, with each entry having variance {\sigma^2 / n \approx \sigma^2 e^{-t} / k} when {k \approx e^{-t} n}.
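
One can also confirm by finite differences that this self-similar formula does solve (7); here is a small sketch at an arbitrarily chosen point in the upper half-plane, using the principal branch of the square root:

```python
import numpy as np

# Sketch: finite-difference check that the self-similar solution of Example 2
# satisfies the Burgers-type equation (7) at one (arbitrary) point.
def s(t, z, sigma=1.0):
    a = sigma**2 * np.exp(-t)
    return (-z + np.sqrt(z**2 - 4 * a)) / (2 * a)   # principal branch, Im z > 0

t, z, h = 0.7, 0.5 + 1.2j, 1e-6
ds_dt = (s(t + h, z) - s(t - h, z)) / (2 * h)
ds_dz = (s(t, z + h) - s(t, z - h)) / (2 * h)
residual = ds_dt - (s(t, z) - ds_dz / s(t, z))
print(abs(residual))                                # small (discretisation/rounding error only)
```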

One can justify the approximation (6) given a sufficiently good well-posedness theory for the equation (7). We will not do so here, but will note that (as with the classical inviscid Burgers equation) the equation can be solved exactly (formally, at least) by the method of characteristics. For any initial position {z_0}, we consider the characteristic flow {t \mapsto z(t)} formed by solving the ODE

\displaystyle  \frac{d}{dt} z(t) = s(t,z(t))^{-1} \ \ \ \ \ (8)

with initial data {z(0) = z_0}, ignoring for this discussion the problems of existence and uniqueness. Then from the chain rule, the equation (7) implies that

\displaystyle  \frac{d}{dt} s( t, z(t) ) = s(t,z(t))

and thus {s(t,z(t)) = e^t s(0,z_0)}. Inserting this back into (8) we see that

\displaystyle  z(t) = z_0 + s(0,z_0)^{-1} (1-e^{-t})

and thus (7) may be solved implicitly via the equation

\displaystyle  s(t, z_0 + s(0,z_0)^{-1} (1-e^{-t}) ) = e^t s(0, z_0) \ \ \ \ \ (9)

for all {t} and {z_0}.

Remark 3 In practice, the equation (9) may stop working when {z_0 + s(0,z_0)^{-1} (1-e^{-t})} crosses the real axis, as (7) does not necessarily hold in this region. It is a cute exercise (ultimately coming from the Cauchy-Schwarz inequality) to show that this crossing always happens: for instance, if {z_0} has positive imaginary part, then {z_0 + s(0,z_0)^{-1}} necessarily has negative or zero imaginary part.

Example 4 Suppose we have {s(0,z) = \frac{1}{c-z}} as in Example 1. Then (9) becomes

\displaystyle  s( t, z_0 + (c-z_0) (1-e^{-t}) ) = \frac{e^t}{c-z_0}

for any {t,z_0}, which after making the change of variables {z = z_0 + (c-z_0) (1-e^{-t}) = c - e^{-t} (c - z_0)} becomes

\displaystyle  s(t, z ) = \frac{1}{c-z}

as in Example 1.

Example 5 Suppose we have

\displaystyle  s(0,z) = \frac{-z + \sqrt{z^2-4\sigma^2}}{2\sigma^2} = -\frac{2}{z + \sqrt{z^2-4\sigma^2}}.

as in Example 2. Then (9) becomes

\displaystyle  s(t, z_0 - \frac{z_0 + \sqrt{z_0^2-4\sigma^2}}{2} (1-e^{-t}) ) = e^t \frac{-z_0 + \sqrt{z_0^2-4\sigma^2}}{2\sigma^2}.

If we write

\displaystyle  z := z_0 - \frac{z_0 + \sqrt{z_0^2-4\sigma^2}}{2} (1-e^{-t})

\displaystyle  = \frac{(1+e^{-t}) z_0 - (1-e^{-t}) \sqrt{z_0^2-4\sigma^2}}{2}

one can calculate that

\displaystyle  z^2 - 4 \sigma^2 e^{-t} = (\frac{(1-e^{-t}) z_0 - (1+e^{-t}) \sqrt{z_0^2-4\sigma^2}}{2})^2

and hence

\displaystyle  \frac{-z + \sqrt{z^2 - 4\sigma^2 e^{-t}}}{2\sigma^2 e^{-t}} = e^t \frac{-z_0 + \sqrt{z_0^2-4\sigma^2}}{2\sigma^2}

which gives

\displaystyle  s(t,z) = \frac{-z + \sqrt{z^2 - 4\sigma^2 e^{-t}}}{2\sigma^2 e^{-t}}. \ \ \ \ \ (10)
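
The algebra above can be double-checked numerically: for an arbitrary {t} and an arbitrary {z_0} in the upper half-plane (again using the principal branch of the square root), the explicit formula (10) should agree with the right-hand side of (9). A short sketch:

```python
import numpy as np

# Sketch: consistency of the explicit formula (10) with the implicit solution (9)
# for semicircular initial data, at one arbitrary point.
def s_semi(a, z, sigma=1.0):     # Stieltjes transform of the semicircle law of variance sigma^2 * a
    return (-z + np.sqrt(z**2 - 4 * sigma**2 * a)) / (2 * sigma**2 * a)

t, z0 = 0.9, 0.4 + 1.5j
z = z0 + (1 - np.exp(-t)) / s_semi(1.0, z0)     # point on the characteristic through z0
lhs = s_semi(np.exp(-t), z)                     # formula (10)
rhs = np.exp(t) * s_semi(1.0, z0)               # right-hand side of (9)
print(abs(lhs - rhs))                           # ~ 0 up to rounding
```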

One can recover the spectral measure {\mu} from the Stieltjes transform {s(z)} as the weak limit of {x \mapsto \frac{1}{\pi} \mathrm{Im} s(x+i\varepsilon)} as {\varepsilon \rightarrow 0}; we write this informally as

\displaystyle  d\mu(x) = \frac{1}{\pi} \mathrm{Im} s(x+i0^+)\ dx.

In this informal notation, we have for instance that

\displaystyle  \delta_c(x) = \frac{1}{\pi} \mathrm{Im} \frac{1}{c-x-i0^+}\ dx

which can be interpreted as the fact that the Cauchy distributions {\frac{1}{\pi} \frac{\varepsilon}{(c-x)^2+\varepsilon^2}} converge weakly to the Dirac mass at {c} as {\varepsilon \rightarrow 0}. Similarly, the spectral measure associated to (10) is the semicircular measure {\frac{1}{2\pi \sigma^2 e^{-t}} (4 \sigma^2 e^{-t}-x^2)_+^{1/2}}.
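
For instance, a quick numerical sketch (with arbitrary parameter choices) recovering the density in Example 2 from (10) in this way:

```python
import numpy as np

# Sketch: recover the density at a bulk point x from the Stieltjes transform (10)
# via (1/pi) Im s(x + i*eps), and compare with the semicircular density above.
t, sigma, x, eps = 0.5, 1.0, 0.3, 1e-6
a = sigma**2 * np.exp(-t)
s_val = (-(x + 1j * eps) + np.sqrt((x + 1j * eps)**2 - 4 * a)) / (2 * a)
density_from_s = s_val.imag / np.pi
density_exact = np.sqrt(max(4 * a - x**2, 0.0)) / (2 * np.pi * a)
print(density_from_s, density_exact)            # agree for small eps
```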

If we let {\mu_\alpha} be the spectral measure associated to {s(t,\cdot)}, where {\alpha = e^{-t}}, then the curve {\alpha \mapsto \mu_\alpha} from {(0,1]} to the space of measures is the high-dimensional limit {n \rightarrow \infty} of a Gelfand-Tsetlin pattern (discussed in this previous post), if the pattern is randomly generated amongst all matrices {M} with spectrum asymptotic to {\mu_0} as {n \rightarrow \infty}. For instance, if {\mu_0 = \delta_c}, then the curve is {\alpha \mapsto \delta_c}, corresponding to a pattern that is entirely filled with {c}'s. If instead {\mu_0 = \frac{1}{2\pi \sigma^2} (4\sigma^2-x^2)_+^{1/2}} is a semicircular distribution, then the pattern is

\displaystyle  \alpha \mapsto \frac{1}{2\pi \sigma^2 \alpha} (4\sigma^2 \alpha -x^2)_+^{1/2},

thus at height {\alpha} from the top, the pattern is semicircular on the interval {[-2\sigma \sqrt{\alpha}, 2\sigma \sqrt{\alpha}]}. The interlacing property of Gelfand-Tsetlin patterns translates to the claim that {\alpha \mu_\alpha(-\infty,\lambda)} and {\alpha \mu_\alpha(\lambda,\infty)} are both non-decreasing in {\alpha} for any fixed {\lambda}. In principle one should be able to establish these monotonicity claims directly from the PDE (7) or from the implicit solution (9), but it was not clear to me how to do so.
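
At the finite-{n} level these monotonicity claims simply reflect the interlacing of eigenvalues of nested minors, which is easy to illustrate numerically; a small sketch:

```python
import numpy as np

# Sketch (finite-n analogue of the monotonicity claim above): for nested top left
# minors of a Hermitian matrix, the number of eigenvalues below, and the number
# above, any fixed threshold lambda cannot decrease as the minor grows.
rng = np.random.default_rng(4)
n, lam = 12, 0.2
H = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
M = (H + H.conj().T) / (2 * np.sqrt(n))
below = [np.sum(np.linalg.eigvalsh(M[:k, :k]) <= lam) for k in range(1, n + 1)]
above = [np.sum(np.linalg.eigvalsh(M[:k, :k]) > lam) for k in range(1, n + 1)]
assert all(np.diff(below) >= 0) and all(np.diff(above) >= 0)
```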

An interesting example of such a limiting Gelfand-Tsetlin pattern occurs when {\mu_0 = \frac{1}{2} \delta_{-1} + \frac{1}{2} \delta_1}, which corresponds to {M} being {2P-I}, where {P} is an orthogonal projection to a random {n/2}-dimensional subspace of {{\bf C}^n}. Here we have

\displaystyle  s(0,z) = \frac{1}{2} \frac{1}{-1-z} + \frac{1}{2} \frac{1}{1-z} = \frac{z}{1-z^2}

and so (9) in this case becomes

\displaystyle  s(t, z_0 + \frac{1-z_0^2}{z_0} (1-e^{-t}) ) = \frac{e^t z_0}{1-z_0^2}.

A tedious calculation then gives the solution

\displaystyle  s(t,z) = \frac{(2e^{-t}-1)z + \sqrt{z^2 - 4e^{-t}(1-e^{-t})}}{2e^{-t}(1-z^2)}. \ \ \ \ \ (11)

For {\alpha = e^{-t} > 1/2}, there are simple poles at {z=-1,+1}, and the associated measure is

\displaystyle  \mu_\alpha = \frac{2\alpha-1}{2\alpha} \delta_{-1} + \frac{2\alpha-1}{2\alpha} \delta_1 + \frac{1}{2\pi \alpha(1-x^2)} (4\alpha(1-\alpha)-x^2)_+^{1/2}\ dx.

This reflects the interlacing property, which forces {\frac{2\alpha-1}{2\alpha} \alpha n} of the {\alpha n} eigenvalues of the {\alpha n \times \alpha n} minor to be equal to {-1} (resp. {+1}). For {\alpha = e^{-t} \leq 1/2}, the poles disappear and one just has

\displaystyle  \mu_\alpha = \frac{1}{2\pi \alpha(1-x^2)} (4\alpha(1-\alpha)-x^2)_+^{1/2}\ dx.
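
The atoms at {\pm 1} predicted above for {\alpha > 1/2} are easy to see in a simulation; here is a rough sketch with the arbitrary choices {n = 600} and {\alpha = 3/4}:

```python
import numpy as np

# Sketch: for M = 2P - I with P a projection onto a Haar-random n/2-dimensional
# subspace, the top (alpha*n) x (alpha*n) minor with alpha > 1/2 should have a
# fraction (2*alpha - 1)/(2*alpha) of its eigenvalues at -1, and the same at +1.
rng = np.random.default_rng(5)
n, alpha = 600, 0.75
G = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
Q, _ = np.linalg.qr(G)                          # column phases do not affect Q diag(d) Q^*
d = np.concatenate([np.ones(n // 2), -np.ones(n // 2)])
M = (Q * d) @ Q.conj().T                        # = Q diag(d) Q^*, i.e. 2P - I
k = int(alpha * n)
evals = np.linalg.eigvalsh(M[:k, :k])
print(np.mean(evals < -1 + 1e-8), np.mean(evals > 1 - 1e-8), (2 * alpha - 1) / (2 * alpha))
```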

For {\alpha=1/2}, one has an inverse semicircle distribution

\displaystyle  \mu_{1/2} = \frac{1}{\pi} (1-x^2)_+^{-1/2}\ dx.

There is presumably a direct geometric explanation of this fact (basically describing the singular values of the product of two random orthogonal projections to half-dimensional subspaces of {{\bf C}^n}), but I do not know of one off-hand.

The evolution of {s(t,z)} can also be understood using the {R}-transform and {S}-transform from free probability. Formally, let {z(t,s)} be the inverse of {s(t,z)}, thus

\displaystyle  s(t,z(t,s)) = s

for all {t,s}, and then define the {R}-transform

\displaystyle  R(t,s) := z(t,-s) - \frac{1}{s}.

The equation (9) may be rewritten as

\displaystyle  z( t, e^t s ) = z(0,s) + s^{-1} (1-e^{-t})

and hence

\displaystyle  R(t, -e^t s) = R(0, -s)

or equivalently

\displaystyle  R(t,s) = R(0, e^{-t} s). \ \ \ \ \ (12)

See these previous notes for a discussion of free probability topics such as the {R}-transform.

Example 6 If {s(t,z) = \frac{1}{c-z}} then the {R} transform is {R(t,s) = c}.

Example 7 If {s(t,z)} is given by (10), then the {R} transform is

\displaystyle  R(t,s) = \sigma^2 e^{-t} s.

Example 8 If {s(t,z)} is given by (11), then the {R} transform is

\displaystyle  R(t,s) = \frac{-1 + \sqrt{1 + 4 s^2 e^{-2t}}}{2 s e^{-t}}.

This simple relationship (12) is essentially due to Nica and Speicher (thanks to Dima Shlyakhtenko for this reference). It has the remarkable consequence that when {\alpha = 1/m} is the reciprocal of a natural number {m}, then {\mu_{1/m}} is the free arithmetic mean of {m} copies of {\mu_1}, that is to say {\mu_{1/m}} is the free convolution {\mu_1 \boxplus \dots \boxplus \mu_1} of {m} copies of {\mu_1}, pushed forward by the map {\lambda \rightarrow \lambda/m}. In terms of random matrices, this is asserting that the top {n/m \times n/m} minor of a random matrix {M} has spectral measure approximately equal to that of an arithmetic mean {\frac{1}{m} (M_1 + \dots + M_m)} of {m} independent copies of {M}, so that the process of taking top left minors is in some sense a continuous analogue of the process of taking freely independent arithmetic means. There ought to be a geometric proof of this assertion, but I do not know of one. In the limit {m \rightarrow \infty} (or {\alpha \rightarrow 0}), the {R}-transform becomes linear and the spectral measure becomes semicircular, which is of course consistent with the free central limit theorem.
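
Here is a rough Monte Carlo sketch of the {m=2} case of this assertion, for the two-point example {\mu_1 = \frac{1}{2} \delta_{-1} + \frac{1}{2} \delta_1} discussed earlier (the matrix size below is arbitrary):

```python
import numpy as np

# Monte Carlo sketch of the m = 2 case of the claim above, for the two-point
# example mu_1 = (1/2) delta_{-1} + (1/2) delta_1: the spectrum of the top
# (n/2) x (n/2) minor of one sample should roughly match the spectrum of the
# arithmetic mean of two independent samples, for large n.
rng = np.random.default_rng(6)

def sign_matrix(m):                              # U diag(+1,...,-1,...) U^*, i.e. 2P - I
    G = rng.standard_normal((m, m)) + 1j * rng.standard_normal((m, m))
    Q, _ = np.linalg.qr(G)
    d = np.concatenate([np.ones(m // 2), -np.ones(m // 2)])
    return (Q * d) @ Q.conj().T

n = 400
minor_spec = np.linalg.eigvalsh(sign_matrix(n)[:n // 2, :n // 2])
mean_spec = np.linalg.eigvalsh((sign_matrix(n) + sign_matrix(n)) / 2)
print(np.quantile(minor_spec, [0.1, 0.5, 0.9]))
print(np.quantile(mean_spec, [0.1, 0.5, 0.9]))   # the two rows should be close for large n
```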

In a similar vein, if one defines the function

\displaystyle  \omega(t,z) := \alpha \int_{\bf R} \frac{zx}{1-zx}\ d\mu_\alpha(x) = e^{-t} (- 1 - z^{-1} s(t, z^{-1}))

and inverts it to obtain a function {z(t,\omega)} with

\displaystyle  \omega(t, z(t,\omega)) = \omega

for all {t, \omega}, then the {S}-transform {S(t,\omega)} is defined by

\displaystyle  S(t,\omega) := \frac{1+\omega}{\omega} z(t,\omega).

Writing

\displaystyle  s(t,z) = - z^{-1} ( 1 + e^t \omega(t, z^{-1}) )

for any {t}, {z}, we have

\displaystyle  z_0 + s(0,z_0)^{-1} (1-e^{-t}) = z_0 \frac{\omega(0,z_0^{-1})+e^{-t}}{\omega(0,z_0^{-1})+1}

and so (9) becomes

\displaystyle  - z_0^{-1} \frac{\omega(0,z_0^{-1})+1}{\omega(0,z_0^{-1})+e^{-t}} (1 + e^{t} \omega(t, z_0^{-1} \frac{\omega(0,z_0^{-1})+1}{\omega(0,z_0^{-1})+e^{-t}}))

\displaystyle = - e^t z_0^{-1} (1 + \omega(0, z_0^{-1}))

which simplifies to

\displaystyle  \omega(t, z_0^{-1} \frac{\omega(0,z_0^{-1})+1}{\omega(0,z_0^{-1})+e^{-t}}) = \omega(0, z_0^{-1});

replacing {z_0} by {z(0,\omega)^{-1}} we obtain

\displaystyle  \omega(t, z(0,\omega) \frac{\omega+1}{\omega+e^{-t}}) = \omega

and thus

\displaystyle  z(0,\omega)\frac{\omega+1}{\omega+e^{-t}} = z(t, \omega)

and hence

\displaystyle  S(0, \omega) = \frac{\omega+e^{-t}}{\omega+1} S(t, \omega).

One can compute {\frac{\omega+1}{\omega+e^{-t}}} to be the {S}-transform of the measure {(1-\alpha) \delta_0 + \alpha \delta_1} (recalling that {\alpha = e^{-t}}); from the link between {S}-transforms and free products (see e.g. these notes of Guionnet), we conclude that {(1-\alpha)\delta_0 + \alpha \mu_\alpha} is the free product of {\mu_1} and {(1-\alpha) \delta_0 + \alpha \delta_1}. This is consistent with the random matrix theory interpretation, since {(1-\alpha)\delta_0 + \alpha \mu_\alpha} is also the spectral measure of {PMP}, where {P} is the orthogonal projection to the span of the first {\alpha n} basis elements, so in particular {P} has spectral measure {(1-\alpha) \delta_0 + \alpha \delta_1}. If {M} is unitarily invariant then (by a fundamental result of Voiculescu) it is asymptotically freely independent of {P}, so the spectral measure of {PMP = P^{1/2} M P^{1/2}} is asymptotically the free product of that of {M} and of {P}.
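
The {S}-transform computation for the two-point measure {(1-\alpha) \delta_0 + \alpha \delta_1} can also be done symbolically; a short sketch with sympy:

```python
import sympy as sp

# Sketch: for nu = (1 - alpha) delta_0 + alpha delta_1 one has
# psi_nu(z) = alpha*z/(1 - z); inverting and forming (1 + w)/w * psi^{-1}(w)
# recovers the S-transform (1 + w)/(w + alpha) quoted above.
z, w, alpha = sp.symbols('z w alpha', positive=True)
psi = alpha * z / (1 - z)
z_of_w = sp.solve(sp.Eq(psi, w), z)[0]           # psi^{-1}(w) = w / (w + alpha)
S = sp.simplify((1 + w) / w * z_of_w)
print(S)                                         # (w + 1)/(alpha + w)
```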

Israel Gelfand, who made profound and prolific contributions to many areas of mathematics, including functional analysis, representation theory, operator algebras, and partial differential equations, died on Monday, age 96.

Gelfand’s beautiful theory of {C^*}-algebras and related spaces made quite an impact on me as a graduate student in Princeton, to the point where I was seriously considering working in this area; but there was not much activity in operator algebras there at the time, and I ended up working in harmonic analysis under Eli Stein instead. (Though I am currently involved in another operator algebras project, which I hope to be able to discuss in the near future. The commutative version of Gelfand’s theory is discussed in these lecture notes of mine.)

I met Gelfand only once, in one of the famous “Gelfand seminars” at the IHES in 2000. The speaker was Tim Gowers, on his new proof of Szemerédi’s theorem. (Endre Szemerédi, incidentally, was Gelfand’s student.) Gelfand’s introduction to the seminar, on the subject of Banach spaces which both mathematicians contributed so greatly to, was approximately as long as Gowers’ talk itself!

There are far too many contributions to mathematics by Gelfand to name here, so I will only mention two. The first is the Gelfand-Tsetlin pattern induced by an {n \times n} Hermitian matrix {A}. Such matrices have {n} real eigenvalues {\lambda_{n,1} \leq \ldots \leq \lambda_{n,n}}. If one takes the top {n-1 \times n-1} minor, this is another Hermitian matrix, whose {n-1} eigenvalues {\lambda_{n-1,1} \leq \ldots \leq \lambda_{n-1,n-1}} intersperse the {n} eigenvalues of the original matrix: {\lambda_{n,i} \leq \lambda_{n-1,i} \leq \lambda_{n,i+1}} for every {1 \leq i \leq n-1}. This interspersing can be easily seen from the minimax characterisation

\displaystyle  \lambda_{n,i} = \inf_{\hbox{dim}(V)=i} \sup_{v \in V: \|v\|=1} \langle Av, v \rangle

of the eigenvalues of {A}, with the eigenvalues of the {n-1 \times n-1} minor being similar but with {V} now restricted to a subspace of {{\mathbb C}^{n-1}} rather than {{\mathbb C}^n}.

Similarly, the eigenvalues {\lambda_{n-2,1} \leq \ldots \leq \lambda_{n-2,n-2}} of the top {n-2 \times n-2} minor of {A} intersperse those of the previous minor. Repeating this procedure one eventually gets a pyramid of real numbers of height and width {n}, with the numbers in each row interspersing the ones in the row below. Such a pattern is known as a Gelfand-Tsetlin pattern. The space of such patterns forms a convex cone, and (if one fixes the initial eigenvalues {\lambda_{n,1},\ldots,\lambda_{n,n}}) becomes a compact convex polytope. If one fixes the initial eigenvalues {\lambda_{n,1},\ldots,\lambda_{n,n}} of {A} but chooses the eigenvectors randomly (using the Haar measure of the unitary group), then the resulting Gelfand-Tsetlin pattern is uniformly distributed across this polytope; the {n=2} case of this observation is essentially the classic observation of Archimedes that corresponding zones of a sphere and its circumscribing cylinder have the same surface area. (Ultimately, the reason for this is that the Gelfand-Tsetlin pattern almost turns the space of all {A} with a fixed spectrum (i.e. the co-adjoint orbit associated to that spectrum) into a toric variety. More precisely, there exists a mostly diffeomorphic map from the co-adjoint orbit to a (singular) toric variety, and the Gelfand-Tsetlin pattern induces a complete set of action variables on that variety.) There is also a “quantum” (or more precisely, representation-theoretic) version of this observation, in which one can decompose any irreducible representation of the unitary group {U(n)} into a canonical basis (the Gelfand-Tsetlin basis), indexed by integer-valued Gelfand-Tsetlin patterns, by first decomposing this representation into irreducible representations of {U(n-1)}, then {U(n-2)}, and so forth.
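
The {n=2} observation can be illustrated by a tiny Monte Carlo sketch (taking the eigenvalues to be {0} and {1}, so that the {1 \times 1} minor should be uniformly distributed on {[0,1]}):

```python
import numpy as np

# Sketch: a 2 x 2 Hermitian matrix with eigenvalues {0, 1} and Haar-random
# eigenvectors is a projection onto a uniformly random complex line, and its
# top left 1 x 1 minor |u_1|^2 / |u|^2 should be uniformly distributed on [0, 1].
rng = np.random.default_rng(7)
u = rng.standard_normal((20000, 2)) + 1j * rng.standard_normal((20000, 2))
minor = np.abs(u[:, 0])**2 / (np.abs(u[:, 0])**2 + np.abs(u[:, 1])**2)
print(np.quantile(minor, [0.25, 0.5, 0.75]))     # ~ [0.25, 0.5, 0.75], as for Unif[0, 1]
```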

The structure, symplectic geometry, and representation theory of Gelfand-Tsetlin patterns were enormously influential in my own work with Allen Knutson on honeycomb patterns, which control the possible eigenvalues of sums of Hermitian matrices and also the structure constants of the tensor product operation for representations of {U(n)}; indeed, Gelfand-Tsetlin patterns arise as the degenerate limit of honeycombs in three different ways, and we in fact discovered honeycombs by trying to glue three Gelfand-Tsetlin patterns together. (See for instance our Notices article for more discussion. The honeycomb analogue of the representation-theoretic properties of these patterns was eventually established by Henriques and Kamnitzer, using {gl(n)} crystals and their Kashiwara bases.)

The second contribution of Gelfand I want to discuss is the Gelfand-Levitan-Marchenko equation for solving the one-dimensional inverse scattering problem: given the scattering data of an unknown potential function {V(x)}, recover {V}. This is already interesting in and of itself, but is also instrumental in solving integrable systems such as the Korteweg-de Vries equation, because the Lax pair formulation of such equations implies that they can be linearised (and solved explicitly) by applying the scattering and inverse scattering transforms associated with the Lax operator. I discuss the derivation of this equation below the fold.

