You are currently browsing the tag archive for the ‘eigenvectors’ tag.

Peter Denton, Stephen Parke, Xining Zhang, and I have just uploaded to the arXiv the short unpublished note “Eigenvectors from eigenvalues“. This note gives two proofs of a general eigenvector identity observed recently by Denton, Parke and Zhang in the course of some quantum mechanical calculations. The identity is as follows:

Theorem 1 Let {A} be an {n \times n} Hermitian matrix, with eigenvalues {\lambda_1(A),\dots,\lambda_n(A)}. Let {v_i} be a unit eigenvector corresponding to the eigenvalue {\lambda_i(A)}, and let {v_{i,j}} be the {j^{th}} component of {v_i}. Then

\displaystyle  |v_{i,j}|^2 \prod_{k=1; k \neq i}^n (\lambda_i(A) - \lambda_k(A)) = \prod_{k=1}^{n-1} (\lambda_i(A) - \lambda_k(M_j))

where {M_j} is the {n-1 \times n-1} Hermitian matrix formed by deleting the {j^{th}} row and column from {A}.

For instance, if we have

\displaystyle  A = \begin{pmatrix} a & X^* \\ X & M \end{pmatrix}

for some real number {a}, {n-1}-dimensional vector {X}, and {n-1 \times n-1} Hermitian matrix {M}, then we have

\displaystyle  |v_{i,1}|^2 = \frac{\prod_{k=1}^{n-1} (\lambda_i(A) - \lambda_k(M))}{\prod_{k=1; k \neq i}^n (\lambda_i(A) - \lambda_k(A))} \ \ \ \ \ (1)

assuming that the denominator is non-zero.

Once one is aware of the identity, it is not so difficult to prove it; we give two proofs, each about half a page long, one of which is based on a variant of the Cauchy-Binet formula, and the other based on properties of the adjugate matrix. But perhaps it is surprising that such a formula exists at all; one does not normally expect to learn much information about eigenvectors purely from knowledge of eigenvalues. In the random matrix theory literature, for instance in this paper of Erdos, Schlein, and Yau, or this later paper of Van Vu and myself, a related identity has been used, namely

\displaystyle  |v_{i,1}|^2 = \frac{1}{1 + \| (M-\lambda_i(A))^{-1} X \|^2}, \ \ \ \ \ (2)

but it is not immediately obvious that one can derive the former identity from the latter. (I do so below the fold; we ended up not putting this proof in the note as it was longer than the two other proofs we found. I also give two other proofs below the fold, one from a more geometric perspective and one proceeding via Cramer’s rule.) It was certainly something of a surprise to me that there is no explicit appearance of the {a,X} components of {A} in the formula (1) (though they do indirectly appear through their effect on the eigenvalues {\lambda_k(A)}; for instance from taking traces one sees that {a = \sum_{k=1}^n \lambda_k(A) - \sum_{k=1}^{n-1} \lambda_k(M)}).

One can get some feeling of the identity (1) by considering some special cases. Suppose for instance that {A} is a diagonal matrix with all distinct entries. The upper left entry {a} of {A} is one of the eigenvalues of {A}. If it is equal to {\lambda_i(A)}, then the eigenvalues of {M} are the other {n-1} eigenvalues of {A}, and now the left and right-hand sides of (1) are equal to {1}. At the other extreme, if {a} is equal to a different eigenvalue of {A}, then {\lambda_i(A)} now appears as an eigenvalue of {M}, and both sides of (1) now vanish. More generally, if we order the eigenvalues {\lambda_1(A) \leq \dots \leq \lambda_n(A)} and {\lambda_1(M) \leq \dots \leq \lambda_{n-1}(M)}, then the Cauchy interlacing inequalities tell us that

\displaystyle  0 \leq \lambda_i(A) - \lambda_k(M) \leq \lambda_i(A) - \lambda_k(A)

for {1 \leq k < i}, and

\displaystyle  \lambda_i(A) - \lambda_{k+1}(A) \leq \lambda_i(A) - \lambda_k(M) < 0

for {i \leq k \leq n-1}, so that the right-hand side of (1) lies between {0} and {1}, which is of course consistent with (1) as {v_i} is a unit vector. Thus the identity relates the coefficient sizes of an eigenvector with the extent to which the Cauchy interlacing inequalities are sharp.

Read the rest of this entry »

Perhaps the most important structural result about general large dense graphs is the Szemerédi regularity lemma. Here is a standard formulation of that lemma:

Lemma 1 (Szemerédi regularity lemma) Let {G = (V,E)} be a graph on {n} vertices, and let {\epsilon > 0}. Then there exists a partition {V = V_1 \cup \ldots \cup V_M} for some {M \leq M(\epsilon)} with the property that for all but at most {\epsilon M^2} of the pairs {1 \leq i \leq j \leq M}, the pair {V_i, V_j} is {\epsilon}-regular in the sense that

\displaystyle  | d( A, B ) - d( V_i, V_j ) | \leq \epsilon

whenever {A \subset V_i, B \subset V_j} are such that {|A| \geq \epsilon |V_i|} and {|B| \geq \epsilon |V_j|}, and {d(A,B) := |\{ (a,b) \in A \times B: \{a,b\} \in E \}|/|A| |B|} is the edge density between {A} and {B}. Furthermore, the partition is equitable in the sense that {||V_i| - |V_j|| \leq 1} for all {1 \leq i \leq j \leq M}.

There are many proofs of this lemma, which is actually not that difficult to establish; see for instance these previous blog posts for some examples. In this post I would like to record one further proof, based on the spectral decomposition of the adjacency matrix of {G}, which is essentially due to Frieze and Kannan. (Strictly speaking, Frieze and Kannan used a variant of this argument to establish a weaker form of the regularity lemma, but it is not difficult to modify the Frieze-Kannan argument to obtain the usual form of the regularity lemma instead. Some closely related spectral regularity lemmas were also developed by Szegedy.) I found recently (while speaking at the Abel conference in honour of this year’s laureate, Endre Szemerédi) that this particular argument is not as widely known among graph theory experts as I had thought, so I thought I would record it here.

For reasons of exposition, it is convenient to first establish a slightly weaker form of the lemma, in which one drops the hypothesis of equitability (but then has to weight the cells {V_i} by their magnitude when counting bad pairs):

Lemma 2 (Szemerédi regularity lemma, weakened variant) . Let {G = (V,E)} be a graph on {n} vertices, and let {\epsilon > 0}. Then there exists a partition {V = V_1 \cup \ldots \cup V_M} for some {M \leq M(\epsilon)} with the property that for all pairs {(i,j) \in \{1,\ldots,M\}^2} outside of an exceptional set {\Sigma}, one has

\displaystyle  | E(A,B) - d_{ij} |A| |B| | \ll \epsilon |V_i| |V_j| \ \ \ \ \ (1)

whenever {A \subset V_i, B \subset V_j}, for some real number {d_{ij}}, where {E(A,B) := |\{ (a,b) \in A \times B: \{a,b\} \in E \}|} is the number of edges between {A} and {B}. Furthermore, we have

\displaystyle  \sum_{(i,j) \in \Sigma} |V_i| |V_j| \ll \epsilon |V|^2. \ \ \ \ \ (2)

Let us now prove Lemma 2. We enumerate {V} (after relabeling) as {V = \{1,\ldots,n\}}. The adjacency matrix {T} of the graph {G} is then a self-adjoint {n \times n} matrix, and thus admits an eigenvalue decomposition

\displaystyle  T = \sum_{i=1}^n \lambda_i u_i^* u_i

for some orthonormal basis {u_1,\ldots,u_n} of {{\bf C}^n} and some eigenvalues {\lambda_1,\ldots,\lambda_n \in {\bf R}}, which we arrange in decreasing order of magnitude:

\displaystyle  |\lambda_1| \geq \ldots \geq |\lambda_n|.

We can compute the trace of {T^2} as

\displaystyle  \hbox{tr}(T^2) = \sum_{i=1}^n |\lambda_i|^2.

But we also have {\hbox{tr}(T^2) = 2|E| \leq n^2}, so

\displaystyle  \sum_{i=1}^n |\lambda_i|^2 \leq n^2. \ \ \ \ \ (3)

Among other things, this implies that

\displaystyle  |\lambda_i| \leq \frac{n}{\sqrt{i}} \ \ \ \ \ (4)

for all {i \geq 1}.

Let {F: {\bf N} \rightarrow {\bf N}} be a function (depending on {\epsilon}) to be chosen later, with {F(i) \geq i} for all {i}. Applying (3) and the pigeonhole principle (or the finite convergence principle, see this blog post), we can find {J \leq C(F,\epsilon)} such that

\displaystyle  \sum_{J \leq i < F(J)} |\lambda_i|^2 \leq \epsilon^3 n^2.

(Indeed, the bound on {J} is basically {F} iterated {1/\epsilon^3} times.) We can now split

\displaystyle  T = T_1 + T_2 + T_3, \ \ \ \ \ (5)

where {T_1} is the “structured” component

\displaystyle  T_1 := \sum_{i < J} \lambda_i u_i^* u_i, \ \ \ \ \ (6)

{T_2} is the “small” component

\displaystyle  T_2 := \sum_{J \leq i < F(J)} \lambda_i u_i^* u_i, \ \ \ \ \ (7)

and {T_3} is the “pseudorandom” component

\displaystyle  T_3 := \sum_{i > F(J)} \lambda_i u_i^* u_i. \ \ \ \ \ (8)

We now design a vertex partition to make {T_1} approximately constant on most cells. For each {i < J}, we partition {V} into {O_{J,\epsilon}(1)} cells on which {u_i} (viewed as a function from {V} to {{\bf C}}) only fluctuates by {O(\epsilon n^{-1/2} /J)}, plus an exceptional cell of size {O( \frac{\epsilon}{J} |V|)} coming from the values where {|u_i|} is excessively large (larger than {\sqrt{\frac{J}{\epsilon}} n^{-1/2}}). Combining all these partitions together, we can write {V = V_1 \cup \ldots \cup V_{M-1} \cup V_M} for some {M = O_{J,\epsilon}(1)}, where {V_M} has cardinality at most {\epsilon |V|}, and for all {1 \leq i \leq M-1}, the eigenfunctions {u_1,\ldots,u_{J-1}} all fluctuate by at most {O(\epsilon/J)}. In particular, if {1 \leq i,j \leq M-1}, then (by (4) and (6)) the entries of {T_1} fluctuate by at most {O(\epsilon)} on each block {V_i \times V_j}. If we let {d_{ij}} be the mean value of these entries on {V_i \times V_j}, we thus have

\displaystyle  1_B^* T_1 1_A = d_{ij} |A| |B| + O( \epsilon |V_i| |V_j| ) \ \ \ \ \ (9)

for any {1 \leq i,j \leq M-1} and {A \subset V_i, B \subset V_j}, where we view the indicator functions {1_A, 1_B} as column vectors of dimension {n}.

Next, we observe from (3) and (7) that {\hbox{tr} T_2^2 \leq \epsilon^3 n^2}. If we let {x_{ab}} be the coefficients of {T_2}, we thus have

\displaystyle  \sum_{a,b \in V} |x_{ab}|^2 \leq \epsilon^3 n^2

and hence by Markov’s inequality we have

\displaystyle  \sum_{a \in V_i} \sum_{b \in V_j} |x_{ab}|^2 \leq \epsilon^2 |V_i| |V_j| \ \ \ \ \ (10)

for all pairs {(i,j) \in \{1,\ldots,M-1\}^2} outside of an exceptional set {\Sigma_1} with

\displaystyle  \sum_{(i,j) \in \Sigma_1} |V_i| |V_j| \leq \epsilon |V|^2.

If {(i,j) \in \{1,\ldots,M-1\}^2} avoids {\Sigma_1}, we thus have

\displaystyle  1_B^* T_2 1_A = O( \epsilon |V_i| |V_j| ) \ \ \ \ \ (11)

for any {A \subset V_i, B \subset V_j}, by (10) and the Cauchy-Schwarz inequality.

Finally, to control {T_3} we see from (4) and (8) that {T_3} has an operator norm of at most {n/\sqrt{F(J)}}. In particular, we have from the Cauchy-Schwarz inequality that

\displaystyle  1_B^* T_3 1_A = O( n^2 / \sqrt{F(J)} ) \ \ \ \ \ (12)

for any {A, B \subset V}.

Let {\Sigma} be the set of all pairs {(i,j) \in \{1,\ldots,M\}^2} where either {(i,j) \in \Sigma_1}, {i = M}, {j=M}, or

\displaystyle  \min(|V_i|, |V_j|) \leq \frac{\epsilon}{M} n.

One easily verifies that (2) holds. If {(i,j) \in \{1,\ldots,M\}^2} is not in {\Sigma}, then by summing (9), (11), (12) and using (5), we see that

\displaystyle  1_B^* T 1_A = d_{ij} |A| |B| + O( \epsilon |V_i| |V_j| ) + O( n^2 / \sqrt{F(J)} ) \ \ \ \ \ (13)

for all {A \subset V_i, B \subset V_j}. The left-hand side is just {E(A,B)}. As {(i,j) \not \in \Sigma}, we have

\displaystyle  |V_i|, |V_j| > \frac{\epsilon}{M} n

and so (since {M = O_{J,\epsilon}(1)})

\displaystyle  n^2 / \sqrt{F(J)} \ll_{J,\epsilon} |V_i| |V_j| / \sqrt{F(J)}.

If we let {F} be a sufficiently rapidly growing function of {J} that depends on {\epsilon}, the second error term in (13) can be absorbed in the first, and (1) follows. This concludes the proof of Lemma 2.

To prove Lemma 1, one argues similarly (after modifying {\epsilon} as necessary), except that the initial partition {V_1,\ldots,V_M} of {V} constructed above needs to be subdivided further into equitable components (of size {\epsilon |V|/M+O(1)}), plus some remainder sets which can be aggregated into an exceptional component of size {O( \epsilon |V| )} (and which can then be redistributed amongst the other components to arrive at a truly equitable partition). We omit the details.

Remark 1 It is easy to verify that {F} needs to be growing exponentially in {J} in order for the above argument to work, which leads to tower-exponential bounds in the number of cells {M} in the partition. It was shown by Gowers that a tower-exponential bound is actually necessary here. By varying {F}, one basically obtains the strong regularity lemma first established by Alon, Fischer, Krivelevich, and Szegedy; in the opposite direction, setting {F(J) := J} essentially gives the weak regularity lemma of Frieze and Kannan.

Remark 2 If we specialise to a Cayley graph, in which {V = (V,+)} is a finite abelian group and {E = \{ (a,b): a-b \in A \}} for some (symmetric) subset {A} of {V}, then the eigenvectors are characters, and one essentially recovers the arithmetic regularity lemma of Green, in which the vertex partition classes {V_i} are given by Bohr sets (and one can then place additional regularity properties on these Bohr sets with some additional arguments). The components {T_1, T_2, T_3} of {T}, representing high, medium, and low eigenvalues of {T}, then become a decomposition associated to high, medium, and low Fourier coefficients of {A}.

Remark 3 The use of spectral theory here is parallel to the use of Fourier analysis to establish results such as Roth’s theorem on arithmetic progressions of length three. In analogy with this, one could view hypergraph regularity as being a sort of “higher order spectral theory”, although this spectral perspective is not as convenient as it is in the graph case.

Van Vu and I have just uploaded to the arXiv our paper “Random matrices: Universality of eigenvectors“, submitted to Random Matrices: Theory and Applications. This paper concerns an extension of our four moment theorem for eigenvalues. Roughly speaking, that four moment theorem asserts (under mild decay conditions on the coefficients of the random matrix) that the fine-scale structure of individual eigenvalues of a Wigner random matrix depend only on the first four moments of each of the entries.

In this paper, we extend this result from eigenvalues to eigenvectors, and specifically to the coefficients of, say, the {i^{th}} eigenvector {u_i(M_n)} of a Wigner random matrix {M_n}. Roughly speaking, the main result is that the distribution of these coefficients also only depends on the first four moments of each of the entries. In particular, as the distribution of coefficients eigenvectors of invariant ensembles such as GOE or GUE are known to be asymptotically gaussian real (in the GOE case) or gaussian complex (in the GUE case), the same asymptotic automatically holds for Wigner matrices whose coefficients match GOE or GUE to fourth order.

(A technical point here: strictly speaking, the eigenvectors {u_i(M_n)} are only determined up to a phase, even when the eigenvalues are simple. So, to phrase the question properly, one has to perform some sort of normalisation, for instance by working with the coefficients of the spectral projection operators {P_i(M_n) := u_i(M_n) u_i(M_n)^*} instead of the eigenvectors, or rotating each eigenvector by a random phase, or by fixing the first component of each eigenvector to be positive real. This is a fairly minor technical issue here, though, and will not be discussed further.)

This theorem strengthens a four moment theorem for eigenvectors recently established by Knowles and Yin (by a somewhat different method), in that the hypotheses are weaker (no level repulsion assumption is required, and the matrix entries only need to obey a finite moment condition rather than an exponential decay condition), and a slightly stronger conclusion (less regularity is needed on the test function, and one can handle the joint distribution of polynomially many coefficients, rather than boundedly many coefficients). On the other hand, the Knowles-Yin paper can also handle generalised Wigner ensembles in which the variances of the entries are allowed to fluctuate somewhat.

The method used here is a variation of that in our original paper (incorporating the subsequent improvements to extend the four moment theorem from the bulk to the edge, and to replace exponential decay by a finite moment condition). That method was ultimately based on the observation that if one swapped a single entry (and its adjoint) in a Wigner random matrix, then an individual eigenvalue {\lambda_i(M_n)} would not fluctuate much as a consequence (as long as one had already truncated away the event of an unexpectedly small eigenvalue gap). The same analysis shows that the projection matrices {P_i(M_n)} obeys the same stability property.

As an application of the eigenvalue four moment theorem, we establish a four moment theorem for the coefficients of resolvent matrices {(\frac{1}{\sqrt{n}} M_n - zI)^{-1}}, even when {z} is on the real axis (though in that case we need to make a level repulsion hypothesis, which has been already verified in many important special cases and is likely to be true in general). This improves on an earlier four moment theorem for resolvents of Erdos, Yau, and Yin, which required {z} to stay some distance away from the real axis (specifically, that {\hbox{Im}(z) \geq n^{-1-c}} for some small {c>0}).