
Marcel Filoche, Svitlana Mayboroda, and I have just uploaded to the arXiv our preprint “The effective potential of an ${M}$-matrix”. This paper explores the analogue of the effective potential of Schrödinger operators ${-\Delta + V}$ provided by the “landscape function” ${u}$, when one works with a certain type of self-adjoint matrix known as an ${M}$-matrix instead of a Schrödinger operator.

Suppose one has an eigenfunction

$\displaystyle (-\Delta + V) \phi = E \phi$

of a Schrödinger operator ${-\Delta+V}$, where ${\Delta}$ is the Laplacian on ${{\bf R}^d}$, ${V: {\bf R}^d \rightarrow {\bf R}}$ is a potential, and ${E}$ is an energy. Where would one expect the eigenfunction ${\phi}$ to be concentrated? If the potential ${V}$ is smooth and slowly varying, the correspondence principle suggests that the eigenfunction ${\phi}$ should be mostly concentrated in the potential energy wells ${\{ x: V(x) \leq E \}}$, with an exponentially decaying amount of tunnelling between the wells. One way to rigorously establish such an exponential decay is through an argument of Agmon, which we will sketch later in this post, and which gives an exponentially decaying upper bound (in an ${L^2}$ sense) on eigenfunctions ${\phi}$ in terms of the distance to the wells ${\{ V \leq E \}}$ with respect to a certain “Agmon metric” on ${{\bf R}^d}$ determined by the potential ${V}$ and energy level ${E}$ (or any upper bound ${\overline{E}}$ on this energy). Similar exponential decay results can also be obtained for discrete Schrödinger matrix models, in which the domain ${{\bf R}^d}$ is replaced with a discrete set such as the lattice ${{\bf Z}^d}$, and the Laplacian ${\Delta}$ is replaced by a discrete analogue such as a graph Laplacian.

When the potential ${V}$ is very “rough”, as occurs for instance in the random potentials arising in the theory of Anderson localisation, the Agmon bounds, while still true, become very weak because the wells ${\{ V \leq E \}}$ are dispersed in a fairly dense fashion throughout the domain ${{\bf R}^d}$, and the eigenfunction can tunnel relatively easily between different wells. However, as was first discovered in 2012 by my two coauthors, in these situations one can replace the rough potential ${V}$ by a smoother effective potential ${1/u}$, with the eigenfunctions typically localised to a single connected component of the effective wells ${\{ 1/u \leq E \}}$. In fact, a good choice of effective potential comes from the landscape function ${u}$, defined as the solution to the equation ${(-\Delta + V) u = 1}$ with reasonable behaviour at infinity; this function is non-negative by the maximum principle, and its reciprocal ${1/u}$ then serves as an effective potential.
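To make this concrete, here is a minimal numerical sketch (using numpy; the discretisation and parameters are illustrative choices, not taken from the papers discussed) of the landscape function for a one-dimensional discrete Schrödinger operator with a rough random potential. It also checks the basic landscape bound ${|\phi_j| \leq E u_j \max_k |\phi_k|}$, which follows from writing ${\phi = E(-\Delta+V)^{-1}\phi}$ and using the entrywise non-negativity of ${(-\Delta+V)^{-1}}$:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200

# 1D discrete Schrodinger operator -Delta + V on a path with Dirichlet
# boundary conditions, where V is a rough (iid uniform) random potential.
V = rng.uniform(0.0, 4.0, size=n)
lap = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
H = lap + np.diag(V)

# Landscape function: solve (-Delta + V) u = 1; u > 0 by the maximum principle.
u = np.linalg.solve(H, np.ones(n))
assert np.all(u > 0)

# For any eigenfunction H phi = E phi with E > 0, phi = E H^{-1} phi and the
# entrywise nonnegativity of H^{-1} give |phi_j| <= E u_j max_k |phi_k|.
E, vecs = np.linalg.eigh(H)
phi = vecs[:, 0]  # ground state
assert np.all(np.abs(phi) <= E[0] * u * np.abs(phi).max() + 1e-9)

# The ground state concentrates where the effective potential 1/u is small.
print("argmin 1/u:", np.argmin(1.0 / u), " argmax |phi|:", np.argmax(np.abs(phi)))
```

In typical runs the peak of the ground state sits at or very near the minimum of the effective potential ${1/u}$, even though the raw potential ${V}$ fluctuates wildly from site to site.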

There are now several explanations for why this particular choice ${1/u}$ is a good effective potential. Perhaps the simplest (as found for instance in this recent paper of Arnold, David, Jerison, and my two coauthors) is the following observation: if ${\phi}$ is an eigenvector for ${-\Delta+V}$ with energy ${E}$, then ${\phi/u}$ is an eigenvector for ${-\frac{1}{u^2} \mathrm{div}(u^2 \nabla \cdot) + \frac{1}{u}}$ with the same energy ${E}$, thus the original Schrödinger operator ${-\Delta+V}$ is conjugate to a (variable coefficient, but still in divergence form) Schrödinger operator with potential ${1/u}$ instead of ${V}$. Closely related to this, we have the integration by parts identity

$\displaystyle \int_{{\bf R}^d} |\nabla f|^2 + V |f|^2\ dx = \int_{{\bf R}^d} u^2 |\nabla(f/u)|^2 + \frac{1}{u} |f|^2\ dx \ \ \ \ \ (1)$

for any reasonable function ${f}$, thus again highlighting the emergence of the effective potential ${1/u}$.

These particular explanations seem rather specific to the Schrödinger equation (continuous or discrete); we have for instance not been able to find similar identities to explain an effective potential for the bi-Schrödinger operator ${\Delta^2 + V}$.

In this paper, we demonstrate the (perhaps surprising) fact that effective potentials continue to exist for operators that bear very little resemblance to Schrödinger operators. Our chosen model is that of an ${M}$-matrix: self-adjoint positive definite matrices ${A}$ whose off-diagonal entries are negative. This model includes discrete Schrödinger operators (with non-negative potentials) but can allow for significantly more non-local interactions. The analogue of the landscape function would then be the vector ${u := A^{-1} 1}$, where ${1}$ denotes the vector with all entries ${1}$. Our main result, roughly speaking, asserts that an eigenvector ${A \phi = E \phi}$ of ${A}$ will then be exponentially localised to the “potential wells” ${K := \{ j: \frac{1}{u_j} \leq E \}}$, where ${u_j}$ denotes the coordinates of the landscape function ${u}$. In particular, we establish the inequality

$\displaystyle \sum_k \phi_k^2 e^{2 \rho(k,K) / \sqrt{W}} ( \frac{1}{u_k} - E )_+ \leq W \max_{i,j} |a_{ij}|$

if ${\phi}$ is normalised in ${\ell^2}$, where the connectivity ${W}$ is the maximum number of non-zero entries of ${A}$ in any row or column, ${a_{ij}}$ are the coefficients of ${A}$, and ${\rho}$ is a certain moderately complicated but explicit metric function on the spatial domain. Informally, this inequality asserts that the eigenfunction ${\phi_k}$ should decay like ${e^{-\rho(k,K) / \sqrt{W}}}$ or faster. Indeed, our numerics show a very strong log-linear relationship between ${\phi_k}$ and ${\rho(k,K)}$, although it appears that our exponent ${1/\sqrt{W}}$ is not quite optimal. We also provide an associated localisation result which is technical to state, but which very roughly asserts that a given eigenvector will in fact be localised to a single connected component of ${K}$ unless there is a resonance between two wells (by which we mean that an eigenvalue for a localisation of ${A}$ associated to one well is extremely close to an eigenvalue for a localisation of ${A}$ associated to another well); such localisation is also strongly supported by numerics. (Analogous results for Schrödinger operators had previously been obtained in the aforementioned paper of Arnold, David, Jerison, and my two coauthors, and were extended to quantum graphs in a very recent paper of Harrell and Maltsev.)
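As a quick numerical sketch of this setup (the matrix below is a generic diagonally dominant example, not one from the paper), one can generate a random symmetric ${M}$-matrix, form the landscape vector ${u = A^{-1}1}$, and check that each eigenvector peaks inside its wells ${K}$. Indeed, ${\phi = E A^{-1} \phi}$ and the entrywise non-negativity of ${A^{-1}}$ give ${|\phi_j| \leq E u_j \max_k |\phi_k|}$, which forces ${1/u_{j^*} \leq E}$ at the coordinate ${j^*}$ where ${|\phi|}$ is largest:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100

# Random symmetric M-matrix: nonpositive off-diagonal entries, strictly
# diagonally dominant, hence positive definite with A^{-1} >= 0 entrywise.
C = rng.random((n, n)) * (rng.random((n, n)) < 0.05)  # sparse nonneg couplings
B = (C + C.T) / 2
np.fill_diagonal(B, 0.0)
A = -B + np.diag(B.sum(axis=1) + rng.uniform(0.5, 2.0, n))

# Landscape vector u = A^{-1} 1 (positive since A^{-1} is nonnegative).
u = np.linalg.solve(A, np.ones(n))
assert np.all(u > 0)

# For each eigenvector A phi = E phi, the coordinate where |phi| peaks must
# lie in the wells K = { j : 1/u_j <= E }.
evals, vecs = np.linalg.eigh(A)
for i in range(n):
    jstar = np.argmax(np.abs(vecs[:, i]))
    assert 1.0 / u[jstar] <= evals[i] + 1e-9
```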

Our approach is based on Agmon’s methods, which we interpret as a double commutator method, relying in particular on the negative definiteness of certain double commutator operators. In the case of Schrödinger operators ${-\Delta+V}$, this negative definiteness is provided by the identity

$\displaystyle \langle [[-\Delta+V,g],g] u, u \rangle = -2\int_{{\bf R}^d} |\nabla g|^2 |u|^2\ dx \leq 0 \ \ \ \ \ (2)$

for any sufficiently reasonable functions ${u, g: {\bf R}^d \rightarrow {\bf R}}$, where we view ${g}$ (like ${V}$) as a multiplier operator. To exploit this, we use the commutator identity

$\displaystyle \langle g [\psi, -\Delta+V] u, g \psi u \rangle = \frac{1}{2} \langle [[-\Delta+V, g \psi],g\psi] u, u \rangle$

$\displaystyle -\frac{1}{2} \langle [[-\Delta+V, g],g] \psi u, \psi u \rangle$

valid for any ${g,\psi,u: {\bf R}^d \rightarrow {\bf R}}$ after a brief calculation. The double commutator identity then tells us that

$\displaystyle \langle g [\psi, -\Delta+V] u, g \psi u \rangle \leq \int_{{\bf R}^d} |\nabla g|^2 |\psi u|^2\ dx.$

If we choose ${u}$ to be a non-negative weight and let ${\psi := \phi/u}$ for an eigenfunction ${\phi}$, then we can write

$\displaystyle [\psi, -\Delta+V] u = [\psi, -\Delta+V - E] u = \psi (-\Delta+V - E) u$

and we conclude that

$\displaystyle \int_{{\bf R}^d} \frac{(-\Delta+V-E)u}{u} |g|^2 |\phi|^2\ dx \leq \int_{{\bf R}^d} |\nabla g|^2 |\phi|^2\ dx. \ \ \ \ \ (3)$

We have considerable freedom in this inequality to select the functions ${u,g}$. If we select ${u=1}$, we obtain the clean inequality

$\displaystyle \int_{{\bf R}^d} (V-E) |g|^2 |\phi|^2\ dx \leq \int_{{\bf R}^d} |\nabla g|^2 |\phi|^2\ dx.$

If we take ${g}$ to be a function which equals ${1}$ on the wells ${\{ V \leq E \}}$ but increases exponentially away from these wells, in such a way that

$\displaystyle |\nabla g|^2 \leq \frac{1}{2} (V-E) |g|^2$

outside of the wells, we can obtain the estimate

$\displaystyle \int_{V > E} (V-E) |g|^2 |\phi|^2\ dx \leq 2 \int_{V < E} (E-V) |\phi|^2\ dx,$

which then gives an exponential type decay of ${\phi}$ away from the wells. This is essentially the classic exponential decay estimate of Agmon; one can take ${g}$ to be the distance to the wells ${\{ V \leq E \}}$ with respect to the Euclidean metric conformally weighted by a suitably normalised version of ${V-E}$. If we instead select ${u}$ to be the landscape function ${u = (-\Delta+V)^{-1} 1}$, (3) then gives

$\displaystyle \int_{{\bf R}^d} (\frac{1}{u} - E) |g|^2 |\phi|^2\ dx \leq \int_{{\bf R}^d} |\nabla g|^2 |\phi|^2\ dx,$

and by selecting ${g}$ appropriately this gives an exponential decay estimate away from the effective wells ${\{ \frac{1}{u} \leq E \}}$, using a metric weighted by ${\frac{1}{u}-E}$.

It turns out that this argument extends without much difficulty to the ${M}$-matrix setting. The analogue of the crucial double commutator identity (2) is

$\displaystyle \langle [[A,D],D] u, u \rangle = \sum_{i \neq j} a_{ij} u_i u_j (d_{ii} - d_{jj})^2 \leq 0$

for any diagonal matrix ${D = \mathrm{diag}(d_{11},\dots,d_{NN})}$. The remainder of the Agmon type arguments go through after making the natural modifications.
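This identity is easy to test numerically; the following sketch (with illustrative parameters only) compares both sides for a random symmetric ${M}$-matrix, a diagonal ${D}$, and a non-negative vector ${u}$, using ${[[A,D],D]_{ij} = a_{ij}(d_{ii}-d_{jj})^2}$:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 8

# Small symmetric M-matrix A (nonpositive off-diagonal, diagonally dominant).
B = rng.random((n, n))
B = (B + B.T) / 2
np.fill_diagonal(B, 0.0)
A = -B + np.diag(B.sum(axis=1) + 1.0)

D = np.diag(rng.standard_normal(n))   # arbitrary diagonal matrix
u = rng.random(n)                     # nonnegative vector (like the landscape)

# Double commutator [[A, D], D]
comm = lambda X, Y: X @ Y - Y @ X
DC = comm(comm(A, D), D)

lhs = u @ DC @ u
d = np.diag(D)
rhs = sum(A[i, j] * u[i] * u[j] * (d[i] - d[j]) ** 2
          for i in range(n) for j in range(n) if i != j)

assert np.isclose(lhs, rhs)
assert lhs <= 1e-12   # nonpositive: a_ij <= 0 off-diagonal and u >= 0
```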

Numerically we have also found some aspects of the landscape theory to persist beyond the ${M}$-matrix setting, even though the double commutators cease to be negative definite, so this may not yet be the end of the story; but it does at least demonstrate that the utility of the landscape function does not rely purely on identities such as (1).

Peter Denton, Stephen Parke, Xining Zhang, and I have just uploaded to the arXiv a completely rewritten version of our previous paper, now titled “Eigenvectors from Eigenvalues: a survey of a basic identity in linear algebra”. This paper is now a survey of the various literature surrounding the following basic identity in linear algebra, which we propose to call the eigenvector-eigenvalue identity:

Theorem 1 (Eigenvector-eigenvalue identity) Let ${A}$ be an ${n \times n}$ Hermitian matrix, with eigenvalues ${\lambda_1(A),\dots,\lambda_n(A)}$. Let ${v_i}$ be a unit eigenvector corresponding to the eigenvalue ${\lambda_i(A)}$, and let ${v_{i,j}}$ be the ${j^{th}}$ component of ${v_i}$. Then

$\displaystyle |v_{i,j}|^2 \prod_{k=1; k \neq i}^n (\lambda_i(A) - \lambda_k(A)) = \prod_{k=1}^{n-1} (\lambda_i(A) - \lambda_k(M_j))$

where ${M_j}$ is the ${(n-1) \times (n-1)}$ Hermitian matrix formed by deleting the ${j^{th}}$ row and column from ${A}$.
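The identity is straightforward to verify numerically; here is a short numpy check on a random Hermitian matrix (the indices ${i,j}$ below are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 6

# Random Hermitian matrix with (almost surely) distinct eigenvalues
B = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
A = (B + B.conj().T) / 2
lam, vecs = np.linalg.eigh(A)   # columns of vecs are unit eigenvectors

i, j = 2, 4   # eigenvalue index and coordinate (0-based)
lhs = abs(vecs[j, i]) ** 2 * np.prod([lam[i] - lam[k] for k in range(n) if k != i])

# M_j: delete the j-th row and column of A
Mj = np.delete(np.delete(A, j, axis=0), j, axis=1)
mu = np.linalg.eigvalsh(Mj)
rhs = np.prod(lam[i] - mu)

assert np.isclose(lhs, rhs)
```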

When we posted the first version of this paper, we were unaware of previous appearances of this identity in the literature; a related identity had been used by Erdos-Schlein-Yau and by myself and Van Vu for applications to random matrix theory, but to our knowledge this specific identity appeared to be new. Even two months after our preprint first appeared on the arXiv in August, we had only learned of one other place in the literature where the identity showed up (by Forrester and Zhang, who also cite an earlier paper of Baryshnikov).

The situation changed rather dramatically with the publication of a popular science article in Quanta on this identity in November, which gave this result significantly more exposure. Within a few weeks we became informed (through private communication, online discussion, and exploration of the citation tree around the references we were alerted to) of over three dozen places where the identity, or some other closely related identity, had previously appeared in the literature, in such areas as numerical linear algebra, various aspects of graph theory (graph reconstruction, chemical graph theory, and walks on graphs), inverse eigenvalue problems, random matrix theory, and neutrino physics. As a consequence, we have decided to completely rewrite our article in order to collate this crowdsourced information, and survey the history of this identity, all the known proofs (we collect seven distinct ways to prove the identity (or generalisations thereof)), and all the applications of it that we are currently aware of. The citation graph of the literature that this ad hoc crowdsourcing effort produced is only very weakly connected, which we found surprising.

The earliest explicit appearance of the eigenvector-eigenvalue identity we are now aware of is in a 1966 paper of Thompson, although this paper is only cited (directly or indirectly) by a fraction of the known literature, and also there is a precursor identity of Löwner from 1934 that can be shown to imply the identity as a limiting case. At the end of the paper we speculate on some possible reasons why this identity only achieved a modest amount of recognition and dissemination prior to the November 2019 Quanta article.

Peter Denton, Stephen Parke, Xining Zhang, and I have just uploaded to the arXiv the short unpublished note “Eigenvectors from eigenvalues”. This note gives two proofs of a general eigenvector identity observed recently by Denton, Parke and Zhang in the course of some quantum mechanical calculations. The identity is as follows:

Theorem 1 Let ${A}$ be an ${n \times n}$ Hermitian matrix, with eigenvalues ${\lambda_1(A),\dots,\lambda_n(A)}$. Let ${v_i}$ be a unit eigenvector corresponding to the eigenvalue ${\lambda_i(A)}$, and let ${v_{i,j}}$ be the ${j^{th}}$ component of ${v_i}$. Then

$\displaystyle |v_{i,j}|^2 \prod_{k=1; k \neq i}^n (\lambda_i(A) - \lambda_k(A)) = \prod_{k=1}^{n-1} (\lambda_i(A) - \lambda_k(M_j))$

where ${M_j}$ is the ${(n-1) \times (n-1)}$ Hermitian matrix formed by deleting the ${j^{th}}$ row and column from ${A}$.

For instance, if we have

$\displaystyle A = \begin{pmatrix} a & X^* \\ X & M \end{pmatrix}$

for some real number ${a}$, ${(n-1)}$-dimensional vector ${X}$, and ${(n-1) \times (n-1)}$ Hermitian matrix ${M}$, then we have

$\displaystyle |v_{i,1}|^2 = \frac{\prod_{k=1}^{n-1} (\lambda_i(A) - \lambda_k(M))}{\prod_{k=1; k \neq i}^n (\lambda_i(A) - \lambda_k(A))} \ \ \ \ \ (1)$

assuming that the denominator is non-zero.

Once one is aware of the identity, it is not so difficult to prove it; we give two proofs, each about half a page long, one of which is based on a variant of the Cauchy-Binet formula, and the other based on properties of the adjugate matrix. But perhaps it is surprising that such a formula exists at all; one does not normally expect to learn much information about eigenvectors purely from knowledge of eigenvalues. In the random matrix theory literature, for instance in this paper of Erdos, Schlein, and Yau, or this later paper of Van Vu and myself, a related identity has been used, namely

$\displaystyle |v_{i,1}|^2 = \frac{1}{1 + \| (M-\lambda_i(A))^{-1} X \|^2}, \ \ \ \ \ (2)$

but it is not immediately obvious that one can derive the former identity from the latter. (I do so below the fold; we ended up not putting this proof in the note as it was longer than the two other proofs we found. I also give two other proofs below the fold, one from a more geometric perspective and one proceeding via Cramer’s rule.) It was certainly something of a surprise to me that there is no explicit appearance of the ${a,X}$ components of ${A}$ in the formula (1) (though they do indirectly appear through their effect on the eigenvalues ${\lambda_k(A)}$; for instance from taking traces one sees that ${a = \sum_{k=1}^n \lambda_k(A) - \sum_{k=1}^{n-1} \lambda_k(M)}$).
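Both (1) and (2) can be checked side by side numerically; the following sketch (on a generic random Hermitian matrix, not tied to any particular application) confirms that the two expressions for ${|v_{i,1}|^2}$ agree:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 6
B = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
A = (B + B.conj().T) / 2
lam, vecs = np.linalg.eigh(A)

# Block decomposition A = [[a, X^*], [X, M]]
X = A[1:, 0]
M = A[1:, 1:]
mu = np.linalg.eigvalsh(M)

for i in range(n):
    # Formula (1): ratio of products of eigenvalue differences
    ratio = np.prod(lam[i] - mu) / np.prod(
        [lam[i] - lam[k] for k in range(n) if k != i])
    # Formula (2): resolvent expression
    res = 1.0 / (1.0 + np.linalg.norm(
        np.linalg.solve(M - lam[i] * np.eye(n - 1), X)) ** 2)
    assert np.isclose(abs(vecs[0, i]) ** 2, ratio)
    assert np.isclose(abs(vecs[0, i]) ** 2, res)
```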

One can get some feeling of the identity (1) by considering some special cases. Suppose for instance that ${A}$ is a diagonal matrix with all distinct entries. The upper left entry ${a}$ of ${A}$ is one of the eigenvalues of ${A}$. If it is equal to ${\lambda_i(A)}$, then the eigenvalues of ${M}$ are the other ${n-1}$ eigenvalues of ${A}$, and now the left and right-hand sides of (1) are equal to ${1}$. At the other extreme, if ${a}$ is equal to a different eigenvalue of ${A}$, then ${\lambda_i(A)}$ now appears as an eigenvalue of ${M}$, and both sides of (1) now vanish. More generally, if we order the eigenvalues ${\lambda_1(A) \leq \dots \leq \lambda_n(A)}$ and ${\lambda_1(M) \leq \dots \leq \lambda_{n-1}(M)}$, then the Cauchy interlacing inequalities tell us that

$\displaystyle 0 \leq \lambda_i(A) - \lambda_k(M) \leq \lambda_i(A) - \lambda_k(A)$

for ${1 \leq k < i}$, and

$\displaystyle \lambda_i(A) - \lambda_{k+1}(A) \leq \lambda_i(A) - \lambda_k(M) < 0$

for ${i \leq k \leq n-1}$, so that the right-hand side of (1) lies between ${0}$ and ${1}$, which is of course consistent with (1) as ${v_i}$ is a unit vector. Thus the identity relates the coefficient sizes of an eigenvector with the extent to which the Cauchy interlacing inequalities are sharp.
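A quick numerical check of the interlacing inequalities, and of the resulting containment of the right-hand side of (1) in ${[0,1]}$ (using a generic random symmetric matrix):

```python
import numpy as np

rng = np.random.default_rng(4)
n = 8
B = rng.standard_normal((n, n))
A = (B + B.T) / 2
lam = np.linalg.eigvalsh(A)          # lambda_1(A) <= ... <= lambda_n(A)
mu = np.linalg.eigvalsh(A[1:, 1:])   # lambda_1(M) <= ... <= lambda_{n-1}(M)

# Cauchy interlacing: lambda_k(A) <= lambda_k(M) <= lambda_{k+1}(A)
assert np.all(lam[:-1] <= mu + 1e-12)
assert np.all(mu <= lam[1:] + 1e-12)

# Hence the right-hand side of (1) always lies in [0, 1]
for i in range(n):
    r = np.prod(lam[i] - mu) / np.prod(
        [lam[i] - lam[k] for k in range(n) if k != i])
    assert -1e-12 <= r <= 1 + 1e-12
```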

Perhaps the most important structural result about general large dense graphs is the Szemerédi regularity lemma. Here is a standard formulation of that lemma:

Lemma 1 (Szemerédi regularity lemma) Let ${G = (V,E)}$ be a graph on ${n}$ vertices, and let ${\epsilon > 0}$. Then there exists a partition ${V = V_1 \cup \ldots \cup V_M}$ for some ${M \leq M(\epsilon)}$ with the property that for all but at most ${\epsilon M^2}$ of the pairs ${1 \leq i \leq j \leq M}$, the pair ${V_i, V_j}$ is ${\epsilon}$-regular in the sense that

$\displaystyle | d( A, B ) - d( V_i, V_j ) | \leq \epsilon$

whenever ${A \subset V_i, B \subset V_j}$ are such that ${|A| \geq \epsilon |V_i|}$ and ${|B| \geq \epsilon |V_j|}$, and ${d(A,B) := |\{ (a,b) \in A \times B: \{a,b\} \in E \}|/|A| |B|}$ is the edge density between ${A}$ and ${B}$. Furthermore, the partition is equitable in the sense that ${||V_i| - |V_j|| \leq 1}$ for all ${1 \leq i \leq j \leq M}$.

There are many proofs of this lemma, which is actually not that difficult to establish; see for instance these previous blog posts for some examples. In this post I would like to record one further proof, based on the spectral decomposition of the adjacency matrix of ${G}$, which is essentially due to Frieze and Kannan. (Strictly speaking, Frieze and Kannan used a variant of this argument to establish a weaker form of the regularity lemma, but it is not difficult to modify the Frieze-Kannan argument to obtain the usual form of the regularity lemma instead. Some closely related spectral regularity lemmas were also developed by Szegedy.) I found recently (while speaking at the Abel conference in honour of this year’s laureate, Endre Szemerédi) that this particular argument is not as widely known among graph theory experts as I had thought, so I thought I would record it here.

For reasons of exposition, it is convenient to first establish a slightly weaker form of the lemma, in which one drops the hypothesis of equitability (but then has to weight the cells ${V_i}$ by their magnitude when counting bad pairs):

Lemma 2 (Szemerédi regularity lemma, weakened variant) Let ${G = (V,E)}$ be a graph on ${n}$ vertices, and let ${\epsilon > 0}$. Then there exists a partition ${V = V_1 \cup \ldots \cup V_M}$ for some ${M \leq M(\epsilon)}$ with the property that for all pairs ${(i,j) \in \{1,\ldots,M\}^2}$ outside of an exceptional set ${\Sigma}$, one has

$\displaystyle | E(A,B) - d_{ij} |A| |B| | \ll \epsilon |V_i| |V_j| \ \ \ \ \ (1)$

whenever ${A \subset V_i, B \subset V_j}$, for some real number ${d_{ij}}$, where ${E(A,B) := |\{ (a,b) \in A \times B: \{a,b\} \in E \}|}$ is the number of edges between ${A}$ and ${B}$. Furthermore, we have

$\displaystyle \sum_{(i,j) \in \Sigma} |V_i| |V_j| \ll \epsilon |V|^2. \ \ \ \ \ (2)$

Let us now prove Lemma 2. We enumerate ${V}$ (after relabeling) as ${V = \{1,\ldots,n\}}$. The adjacency matrix ${T}$ of the graph ${G}$ is then a self-adjoint ${n \times n}$ matrix, and thus admits an eigenvalue decomposition

$\displaystyle T = \sum_{i=1}^n \lambda_i u_i^* u_i$

for some orthonormal basis ${u_1,\ldots,u_n}$ of ${{\bf C}^n}$ and some eigenvalues ${\lambda_1,\ldots,\lambda_n \in {\bf R}}$, which we arrange in decreasing order of magnitude:

$\displaystyle |\lambda_1| \geq \ldots \geq |\lambda_n|.$

We can compute the trace of ${T^2}$ as

$\displaystyle \hbox{tr}(T^2) = \sum_{i=1}^n |\lambda_i|^2.$

But we also have ${\hbox{tr}(T^2) = 2|E| \leq n^2}$, so

$\displaystyle \sum_{i=1}^n |\lambda_i|^2 \leq n^2. \ \ \ \ \ (3)$

Among other things, this implies that

$\displaystyle |\lambda_i| \leq \frac{n}{\sqrt{i}} \ \ \ \ \ (4)$

for all ${i \geq 1}$.
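Both (3) and (4) are easy to confirm numerically for the adjacency matrix of a random graph (the Erdős–Rényi model below is just an illustrative choice):

```python
import numpy as np

rng = np.random.default_rng(5)
n = 60

# Adjacency matrix of a random graph with edge probability 0.3
T = np.triu(rng.random((n, n)) < 0.3, 1).astype(float)
T = T + T.T

lam = np.linalg.eigvalsh(T)
mag = np.sort(np.abs(lam))[::-1]      # |lambda_1| >= ... >= |lambda_n|

# (3): sum of squared eigenvalues equals tr(T^2) = 2|E| <= n^2
assert np.isclose(np.sum(mag ** 2), np.trace(T @ T))
assert np.sum(mag ** 2) <= n ** 2 + 1e-9

# (4): |lambda_i| <= n / sqrt(i), with i running from 1 to n
i = np.arange(1, n + 1)
assert np.all(mag <= n / np.sqrt(i) + 1e-9)
```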

Let ${F: {\bf N} \rightarrow {\bf N}}$ be a function (depending on ${\epsilon}$) to be chosen later, with ${F(i) \geq i}$ for all ${i}$. Applying (3) and the pigeonhole principle (or the finite convergence principle, see this blog post), we can find ${J \leq C(F,\epsilon)}$ such that

$\displaystyle \sum_{J \leq i < F(J)} |\lambda_i|^2 \leq \epsilon^3 n^2.$

(Indeed, the bound on ${J}$ is basically ${F}$ iterated ${1/\epsilon^3}$ times.) We can now split

$\displaystyle T = T_1 + T_2 + T_3, \ \ \ \ \ (5)$

where ${T_1}$ is the “structured” component

$\displaystyle T_1 := \sum_{i < J} \lambda_i u_i^* u_i, \ \ \ \ \ (6)$

${T_2}$ is the “small” component

$\displaystyle T_2 := \sum_{J \leq i < F(J)} \lambda_i u_i^* u_i, \ \ \ \ \ (7)$

and ${T_3}$ is the “pseudorandom” component

$\displaystyle T_3 := \sum_{i \geq F(J)} \lambda_i u_i^* u_i. \ \ \ \ \ (8)$
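The three-way splitting can be sketched in a few lines of numpy (the cutoffs ${J}$, ${F(J)}$ below are illustrative toy values, far smaller than the argument actually requires; the code writes the spectral decomposition with column vectors, which is equivalent to the row-vector convention used above):

```python
import numpy as np

rng = np.random.default_rng(6)
n = 200

# Adjacency matrix of a dense random graph
T = np.triu(rng.random((n, n)) < 0.5, 1).astype(float)
T = T + T.T

lam, u = np.linalg.eigh(T)
order = np.argsort(-np.abs(lam))       # sort by decreasing magnitude
lam, u = lam[order], u[:, order]

J, FJ = 1, 20                          # toy cutoffs standing in for J, F(J)
proj = lambda sel: (u[:, sel] * lam[sel]) @ u[:, sel].T
T1 = proj(slice(0, J))                 # "structured": top eigenvalues
T2 = proj(slice(J, FJ))                # "small": small trace norm
T3 = proj(slice(FJ, n))                # "pseudorandom": small operator norm

assert np.allclose(T, T1 + T2 + T3)

# By (4), the operator norm of T3 is at most n / sqrt(F(J))
assert np.abs(lam[FJ:]).max() <= n / np.sqrt(FJ + 1) + 1e-9
```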

We now design a vertex partition to make ${T_1}$ approximately constant on most cells. For each ${i < J}$, we partition ${V}$ into ${O_{J,\epsilon}(1)}$ cells on which ${u_i}$ (viewed as a function from ${V}$ to ${{\bf C}}$) only fluctuates by ${O(\epsilon n^{-1/2} /J)}$, plus an exceptional cell of size ${O( \frac{\epsilon}{J} |V|)}$ coming from the values where ${|u_i|}$ is excessively large (larger than ${\sqrt{\frac{J}{\epsilon}} n^{-1/2}}$). Combining all these partitions together, we can write ${V = V_1 \cup \ldots \cup V_{M-1} \cup V_M}$ for some ${M = O_{J,\epsilon}(1)}$, where ${V_M}$ has cardinality at most ${\epsilon |V|}$, and on each of the cells ${V_i}$ with ${1 \leq i \leq M-1}$, the eigenfunctions ${u_1,\ldots,u_{J-1}}$ all fluctuate by at most ${O(\epsilon n^{-1/2}/J)}$. In particular, if ${1 \leq i,j \leq M-1}$, then (by (4) and (6)) the entries of ${T_1}$ fluctuate by at most ${O(\epsilon)}$ on each block ${V_i \times V_j}$. If we let ${d_{ij}}$ be the mean value of these entries on ${V_i \times V_j}$, we thus have

$\displaystyle 1_B^* T_1 1_A = d_{ij} |A| |B| + O( \epsilon |V_i| |V_j| ) \ \ \ \ \ (9)$

for any ${1 \leq i,j \leq M-1}$ and ${A \subset V_i, B \subset V_j}$, where we view the indicator functions ${1_A, 1_B}$ as column vectors of dimension ${n}$.

Next, we observe from (3) and (7) that ${\hbox{tr} T_2^2 \leq \epsilon^3 n^2}$. If we let ${x_{ab}}$ be the coefficients of ${T_2}$, we thus have

$\displaystyle \sum_{a,b \in V} |x_{ab}|^2 \leq \epsilon^3 n^2$

and hence by Markov’s inequality we have

$\displaystyle \sum_{a \in V_i} \sum_{b \in V_j} |x_{ab}|^2 \leq \epsilon^2 |V_i| |V_j| \ \ \ \ \ (10)$

for all pairs ${(i,j) \in \{1,\ldots,M-1\}^2}$ outside of an exceptional set ${\Sigma_1}$ with

$\displaystyle \sum_{(i,j) \in \Sigma_1} |V_i| |V_j| \leq \epsilon |V|^2.$

If ${(i,j) \in \{1,\ldots,M-1\}^2}$ avoids ${\Sigma_1}$, we thus have

$\displaystyle 1_B^* T_2 1_A = O( \epsilon |V_i| |V_j| ) \ \ \ \ \ (11)$

for any ${A \subset V_i, B \subset V_j}$, by (10) and the Cauchy-Schwarz inequality.

Finally, to control ${T_3}$ we see from (4) and (8) that ${T_3}$ has an operator norm of at most ${n/\sqrt{F(J)}}$. In particular, we have from the Cauchy-Schwarz inequality that

$\displaystyle 1_B^* T_3 1_A = O( n^2 / \sqrt{F(J)} ) \ \ \ \ \ (12)$

for any ${A, B \subset V}$.

Let ${\Sigma}$ be the set of all pairs ${(i,j) \in \{1,\ldots,M\}^2}$ where either ${(i,j) \in \Sigma_1}$, ${i = M}$, ${j=M}$, or

$\displaystyle \min(|V_i|, |V_j|) \leq \frac{\epsilon}{M} n.$

One easily verifies that (2) holds. If ${(i,j) \in \{1,\ldots,M\}^2}$ is not in ${\Sigma}$, then by summing (9), (11), (12) and using (5), we see that

$\displaystyle 1_B^* T 1_A = d_{ij} |A| |B| + O( \epsilon |V_i| |V_j| ) + O( n^2 / \sqrt{F(J)} ) \ \ \ \ \ (13)$

for all ${A \subset V_i, B \subset V_j}$. The left-hand side is just ${E(A,B)}$. As ${(i,j) \not \in \Sigma}$, we have

$\displaystyle |V_i|, |V_j| > \frac{\epsilon}{M} n$

and so (since ${M = O_{J,\epsilon}(1)}$)

$\displaystyle n^2 / \sqrt{F(J)} \ll_{J,\epsilon} |V_i| |V_j| / \sqrt{F(J)}.$

If we let ${F}$ be a sufficiently rapidly growing function of ${J}$ that depends on ${\epsilon}$, the second error term in (13) can be absorbed in the first, and (1) follows. This concludes the proof of Lemma 2.

To prove Lemma 1, one argues similarly (after modifying ${\epsilon}$ as necessary), except that the initial partition ${V_1,\ldots,V_M}$ of ${V}$ constructed above needs to be subdivided further into equitable components (of size ${\epsilon |V|/M+O(1)}$), plus some remainder sets which can be aggregated into an exceptional component of size ${O( \epsilon |V| )}$ (and which can then be redistributed amongst the other components to arrive at a truly equitable partition). We omit the details.

Remark 1 It is easy to verify that ${F}$ needs to be growing exponentially in ${J}$ in order for the above argument to work, which leads to tower-exponential bounds in the number of cells ${M}$ in the partition. It was shown by Gowers that a tower-exponential bound is actually necessary here. By varying ${F}$, one basically obtains the strong regularity lemma first established by Alon, Fischer, Krivelevich, and Szegedy; in the opposite direction, setting ${F(J) := J}$ essentially gives the weak regularity lemma of Frieze and Kannan.

Remark 2 If we specialise to a Cayley graph, in which ${V = (V,+)}$ is a finite abelian group and ${E = \{ (a,b): a-b \in A \}}$ for some (symmetric) subset ${A}$ of ${V}$, then the eigenvectors are characters, and one essentially recovers the arithmetic regularity lemma of Green, in which the vertex partition classes ${V_i}$ are given by Bohr sets (and one can then place additional regularity properties on these Bohr sets with some additional arguments). The components ${T_1, T_2, T_3}$ of ${T}$, representing high, medium, and low eigenvalues of ${T}$, then become a decomposition associated to high, medium, and low Fourier coefficients of ${A}$.
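For a concrete instance of this specialisation, here is a small numpy check (with an arbitrarily chosen symmetric generating set) that the characters ${\chi_k(x) = e^{2\pi i k x/n}}$ diagonalise the Cayley graph adjacency matrix, with eigenvalues given by the Fourier transform of the indicator of the generating set:

```python
import numpy as np

n = 16
S = {1, 3, n - 3, n - 1}   # symmetric generating set in Z_n

# Adjacency matrix of the Cayley graph: a ~ b iff a - b mod n lies in S
T = np.array([[1.0 if (a - b) % n in S else 0.0 for b in range(n)]
              for a in range(n)])

# Each character chi_k is an eigenvector; its eigenvalue is the Fourier
# transform of the indicator of S at frequency k (real, since S = -S).
x = np.arange(n)
for k in range(n):
    chi = np.exp(2j * np.pi * k * x / n)
    lam = sum(np.exp(2j * np.pi * k * s / n) for s in S)
    assert np.allclose(T @ chi, lam * chi)
```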

Remark 3 The use of spectral theory here is parallel to the use of Fourier analysis to establish results such as Roth’s theorem on arithmetic progressions of length three. In analogy with this, one could view hypergraph regularity as being a sort of “higher order spectral theory”, although this spectral perspective is not as convenient as it is in the graph case.

Van Vu and I have just uploaded to the arXiv our paper “Random matrices: Universality of eigenvectors”, submitted to Random Matrices: Theory and Applications. This paper concerns an extension of our four moment theorem for eigenvalues. Roughly speaking, that four moment theorem asserts (under mild decay conditions on the coefficients of the random matrix) that the fine-scale structure of individual eigenvalues of a Wigner random matrix depends only on the first four moments of each of the entries.

In this paper, we extend this result from eigenvalues to eigenvectors, and specifically to the coefficients of, say, the ${i^{th}}$ eigenvector ${u_i(M_n)}$ of a Wigner random matrix ${M_n}$. Roughly speaking, the main result is that the distribution of these coefficients also depends only on the first four moments of each of the entries. In particular, as the distribution of the coefficients of eigenvectors of invariant ensembles such as GOE or GUE is known to be asymptotically real gaussian (in the GOE case) or complex gaussian (in the GUE case), the same asymptotics automatically hold for Wigner matrices whose coefficients match GOE or GUE to fourth order.

(A technical point here: strictly speaking, the eigenvectors ${u_i(M_n)}$ are only determined up to a phase, even when the eigenvalues are simple. So, to phrase the question properly, one has to perform some sort of normalisation, for instance by working with the coefficients of the spectral projection operators ${P_i(M_n) := u_i(M_n) u_i(M_n)^*}$ instead of the eigenvectors, or rotating each eigenvector by a random phase, or by fixing the first component of each eigenvector to be positive real. This is a fairly minor technical issue here, though, and will not be discussed further.)

This theorem strengthens a four moment theorem for eigenvectors recently established by Knowles and Yin (by a somewhat different method): the hypotheses here are weaker (no level repulsion assumption is required, and the matrix entries only need to obey a finite moment condition rather than an exponential decay condition), and the conclusion is slightly stronger (less regularity is needed on the test function, and one can handle the joint distribution of polynomially many coefficients, rather than boundedly many coefficients). On the other hand, the Knowles-Yin paper can also handle generalised Wigner ensembles in which the variances of the entries are allowed to fluctuate somewhat.

The method used here is a variation of that in our original paper (incorporating the subsequent improvements to extend the four moment theorem from the bulk to the edge, and to replace exponential decay by a finite moment condition). That method was ultimately based on the observation that if one swapped a single entry (and its adjoint) in a Wigner random matrix, then an individual eigenvalue ${\lambda_i(M_n)}$ would not fluctuate much as a consequence (as long as one had already truncated away the event of an unexpectedly small eigenvalue gap). The same analysis shows that the projection matrices ${P_i(M_n)}$ obey the same stability property.

As an application of the eigenvector four moment theorem, we establish a four moment theorem for the coefficients of resolvent matrices ${(\frac{1}{\sqrt{n}} M_n - zI)^{-1}}$, even when ${z}$ is on the real axis (though in that case we need to make a level repulsion hypothesis, which has already been verified in many important special cases and is likely to be true in general). This improves on an earlier four moment theorem for resolvents of Erdos, Yau, and Yin, which required ${z}$ to stay some distance away from the real axis (specifically, that ${\hbox{Im}(z) \geq n^{-1-c}}$ for some small ${c>0}$).