Perhaps the most important structural result about general large dense graphs is the Szemerédi regularity lemma. Here is a standard formulation of that lemma:
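Lemma 1 (Szemerédi regularity lemma) Let $G = (V,E)$ be a graph on $n$ vertices, and let $\epsilon > 0$. Then there exists a partition $V = V_1 \cup \ldots \cup V_k$ for some $k \leq C(\epsilon)$ with the property that for all but at most $\epsilon k^2$ of the pairs $1 \leq i \leq j \leq k$, the pair $V_i, V_j$ is $\epsilon$-regular in the sense that
$|d(A,B) - d(V_i,V_j)| \leq \epsilon$
whenever $A \subset V_i$, $B \subset V_j$ are such that $|A| \geq \epsilon |V_i|$ and $|B| \geq \epsilon |V_j|$, and $d(A,B) := |\{ (a,b) \in A \times B: \{a,b\} \in E \}| / (|A| |B|)$ is the edge density between $A$ and $B$. Furthermore, the partition is equitable in the sense that $||V_i| - |V_j|| \leq 1$ for all $1 \leq i \leq j \leq k$.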
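To make the notion of regularity concrete, here is a minimal Python sketch. It assumes the usual conventions, namely that the edge density $d(A,B)$ is the number of pairs in $A \times B$ joined by an edge divided by $|A||B|$, and that a pair of cells is $\epsilon$-regular when every pair of sufficiently large subsets has density within $\epsilon$ of the density of the whole pair. The function names and the two toy graphs are hypothetical, and the brute-force check is exponential in the cell size, so it is only meant for tiny examples.

```python
from itertools import combinations, product
from math import ceil

def edge_density(edges, A, B):
    """d(A, B) = (# pairs (a, b) in A x B with {a, b} an edge) / (|A| |B|)."""
    A, B = list(A), list(B)
    if not A or not B:
        raise ValueError("A and B must be non-empty")
    hits = sum(1 for a, b in product(A, B) if frozenset((a, b)) in edges)
    return hits / (len(A) * len(B))

def large_subsets(S, min_size):
    """All subsets of S with at least min_size (and at least one) elements."""
    S = list(S)
    start = max(1, ceil(min_size))
    for size in range(start, len(S) + 1):
        yield from combinations(S, size)

def is_eps_regular(edges, Vi, Vj, eps):
    """Brute-force check of eps-regularity; exponential time, tiny cells only."""
    d = edge_density(edges, Vi, Vj)
    for A in large_subsets(Vi, eps * len(Vi)):
        for B in large_subsets(Vj, eps * len(Vj)):
            if abs(edge_density(edges, A, B) - d) > eps:
                return False
    return True

# Complete bipartite pair: every sub-pair has density 1, so it is regular.
left, right = [0, 1, 2], [3, 4, 5]
complete = {frozenset((a, b)) for a in left for b in right}

# Perfect matching between {0, 1} and {2, 3}: density 1/2 overall,
# but the sub-pair ({0}, {2}) has density 1, so regularity fails for eps = 0.4.
matching = {frozenset((0, 2)), frozenset((1, 3))}

complete_is_regular = is_eps_regular(complete, left, right, 0.4)
matching_is_regular = is_eps_regular(matching, [0, 1], [2, 3], 0.4)
```

The matching example shows why regularity is a statement about all large sub-pairs rather than just the overall density.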
There are many proofs of this lemma, which is actually not that difficult to establish; see for instance these previous blog posts for some examples. In this post I would like to record one further proof, based on the spectral decomposition of the adjacency matrix of the graph, which is essentially due to Frieze and Kannan. (Strictly speaking, Frieze and Kannan used a variant of this argument to establish a weaker form of the regularity lemma, but it is not difficult to modify the Frieze-Kannan argument to obtain the usual form of the regularity lemma instead. Some closely related spectral regularity lemmas were also developed by Szegedy.) I found recently (while speaking at the Abel conference in honour of this year’s laureate, Endre Szemerédi) that this particular argument is not as widely known among graph theory experts as I had expected, so I thought I would record it here.
For reasons of exposition, it is convenient to first establish a slightly weaker form of the lemma, in which one drops the hypothesis of equitability (but then has to weight the cells by their magnitude when counting bad pairs):
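Lemma 2 (Szemerédi regularity lemma, weakened variant) Let $G = (V,E)$ be a graph on $n$ vertices, and let $\epsilon > 0$. Then there exists a partition $V = V_1 \cup \ldots \cup V_k$ for some $k \leq C(\epsilon)$ with the property that for all pairs $(i,j) \in \{1,\ldots,k\}^2$ outside of an exceptional set $\Sigma$, one has
$|E(A,B) - d_{ij} |A| |B|| \ll \epsilon |V_i| |V_j| \ \ (1)$
whenever $A \subset V_i$, $B \subset V_j$, for some real number $d_{ij}$, where $E(A,B) := |\{ (a,b) \in A \times B: \{a,b\} \in E \}|$ is the number of edges between $A$ and $B$. Furthermore, we have
$\sum_{(i,j) \in \Sigma} |V_i| |V_j| \ll \epsilon |V|^2. \ \ (2)$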
Let us now prove Lemma 2. We enumerate (after relabeling) as
. The adjacency matrix
of the graph
is then a self-adjoint
matrix, and thus admits an eigenvalue decomposition
for some orthonormal basis of
and some eigenvalues
, which we arrange in decreasing order of magnitude:
We can compute the trace of as
But we also have , so
.
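The trace computation above is easy to sanity-check numerically. The following is a small sketch using NumPy on a random graph; the helper `random_adjacency` and the parameters are hypothetical choices for illustration. It checks that the trace of the square of the adjacency matrix equals twice the number of edges, that it equals the sum of the squared eigenvalues, and that this sum is at most $n^2$. It also checks the elementary consequence that the $i$-th largest eigenvalue in magnitude is at most $n/\sqrt{i}$, since $i$ times its square is bounded by the full sum.

```python
import numpy as np

def random_adjacency(n, p, seed=0):
    """Symmetric 0/1 adjacency matrix of a random graph, with zero diagonal."""
    rng = np.random.default_rng(seed)
    upper = np.triu(rng.random((n, n)) < p, k=1)
    return (upper | upper.T).astype(float)

n = 30
T = random_adjacency(n, 0.4)
num_edges = int(T.sum()) // 2          # each edge is counted twice in T

eigenvalues = np.linalg.eigvalsh(T)    # real, since T is symmetric
trace_T2 = float(np.trace(T @ T))
sum_sq = float(np.sum(eigenvalues ** 2))

# Magnitudes in decreasing order: i * |lambda_i|^2 <= sum_j |lambda_j|^2 <= n^2.
mags = np.sort(np.abs(eigenvalues))[::-1]
decay_ok = all(mags[i] <= n / np.sqrt(i + 1) + 1e-9 for i in range(n))
```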
Let be a function (depending on
) to be chosen later, with
for all
. Applying (3) and the pigeonhole principle (or the finite convergence principle, see this blog post), we can find
such that
(Indeed, the bound on is basically
iterated
times.) We can now split
is the “structured” component
is the “small” component
is the “pseudorandom” component
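As a rough illustration of this three-way splitting, here is a NumPy sketch that groups the eigenvalues by rank in decreasing order of magnitude. The cutoffs are an assumption made for illustration: ranks below `J` form the structured part, ranks from `J` up to but not including `FJ` form the small part, and the remaining ranks form the pseudorandom part. The values `J = 3` and `FJ = 10` are arbitrary and not the ones produced by the pigeonhole argument.

```python
import numpy as np

def spectral_split(T, J, FJ):
    """Split a symmetric matrix T as T1 + T2 + T3 by eigenvalue rank.

    Ranks are 1-indexed with |lambda_1| >= |lambda_2| >= ...
    T1 uses ranks < J, T2 uses J <= rank < FJ, T3 uses rank >= FJ.
    """
    lam, U = np.linalg.eigh(T)
    order = np.argsort(-np.abs(lam))
    lam, U = lam[order], U[:, order]
    n = len(lam)

    def part(lo, hi):
        # Sum of lambda_i u_i u_i^T over ranks lo <= i < hi.
        idx = slice(lo - 1, hi - 1)
        return (U[:, idx] * lam[idx]) @ U[:, idx].T

    return part(1, J), part(J, FJ), part(FJ, n + 1)

# Hypothetical test graph: a random graph on 40 vertices.
rng = np.random.default_rng(0)
n = 40
upper = np.triu(rng.random((n, n)) < 0.5, k=1)
T = (upper | upper.T).astype(float)

J, FJ = 3, 10
T1, T2, T3 = spectral_split(T, J, FJ)
reconstruction_error = np.linalg.norm(T1 + T2 + T3 - T)
T3_operator_norm = np.linalg.norm(T3, 2)
```

In this sketch the operator norm of `T3` is the magnitude of the eigenvalue of rank `FJ`, which is at most $n/\sqrt{F(J)}$; this is the sense in which the third component is small in operator norm even though it may be large in Frobenius norm.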
We now design a vertex partition to make the structured component approximately constant on most cells. For each
, we partition
into
cells on which
(viewed as a function from
to
) only fluctuates by
, plus an exceptional cell of size
coming from the values where
is excessively large (larger than
). Combining all these partitions together, we can write
for some
, where
has cardinality at most
, and for all
, the eigenfunctions
all fluctuate by at most
. In particular, if
, then (by (4) and (6)) the entries of
fluctuate by at most
on each block
. If we let
be the mean value of these entries on
, we thus have
and
, where we view the indicator functions
as column vectors of dimension
.
Next, we observe from (3) and (7) that . If we let
be the coefficients of
, we thus have
and hence by Markov’s inequality we have
outside of an exceptional set
with
If avoids
, we thus have
, by (10) and the Cauchy-Schwarz inequality.
Finally, to control we see from (4) and (8) that
has an operator norm of at most
. In particular, we have from the Cauchy-Schwarz inequality that
.
Let be the set of all pairs
where either
,
,
, or
One easily verifies that (2) holds. If is not in
, then by summing (9), (11), (12) and using (5), we see that
. The left-hand side is just
. As
, we have
and so (since )
If we let be a sufficiently rapidly growing function of
that depends on
, the second error term in (13) can be absorbed in the first, and (1) follows. This concludes the proof of Lemma 2.
To prove Lemma 1, one argues similarly (after modifying as necessary), except that the initial partition
of
constructed above needs to be subdivided further into equitable components (of size
), plus some remainder sets which can be aggregated into an exceptional component of size
(and which can then be redistributed amongst the other components to arrive at a truly equitable partition). We omit the details.
Remark 1 It is easy to verify that
needs to be growing exponentially in
in order for the above argument to work, which leads to tower-exponential bounds in the number of cells
in the partition. It was shown by Gowers that a tower-exponential bound is actually necessary here. By varying
, one basically obtains the strong regularity lemma first established by Alon, Fischer, Krivelevich, and Szegedy; in the opposite direction, setting
essentially gives the weak regularity lemma of Frieze and Kannan.
Remark 2 If we specialise to a Cayley graph, in which
is a finite abelian group and
for some (symmetric) subset
of
, then the eigenvectors are characters, and one essentially recovers the arithmetic regularity lemma of Green, in which the vertex partition classes
are given by Bohr sets (and one can then place additional regularity properties on these Bohr sets with some additional arguments). The components
of
, representing high, medium, and low eigenvalues of
, then become a decomposition associated to high, medium, and low Fourier coefficients of
.
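The claim in Remark 2 that the eigenvectors of a Cayley graph are characters can be checked directly in the simplest case. The sketch below assumes the cyclic group $\mathbb{Z}/n\mathbb{Z}$, with two elements joined by an edge when their difference lies in a symmetric set `S`; the choices `n = 12` and `S = {1, 3, 9, 11}` are hypothetical. Each character $x \mapsto e^{2\pi i \xi x/n}$ is then an eigenvector, with eigenvalue equal to the (unnormalised) Fourier coefficient of the indicator of `S` at $\xi$.

```python
import cmath

def cayley_adjacency(n, S):
    """Adjacency matrix of the Cayley graph on Z/nZ with connection set S."""
    return [[1 if (x - y) % n in S else 0 for y in range(n)] for x in range(n)]

def character(n, xi):
    """The character x -> exp(2 pi i xi x / n) of Z/nZ, as a list of values."""
    return [cmath.exp(2j * cmath.pi * xi * x / n) for x in range(n)]

def eigen_defect(T, n, S, xi):
    """Return (eigenvalue, max_x |(T chi)(x) - eigenvalue * chi(x)|)."""
    chi = character(n, xi)
    # Fourier coefficient of the indicator of S; real because S = -S.
    eigenvalue = sum(cmath.exp(2j * cmath.pi * xi * s / n) for s in S)
    T_chi = [sum(T[x][y] * chi[y] for y in range(n)) for x in range(n)]
    defect = max(abs(T_chi[x] - eigenvalue * chi[x]) for x in range(n))
    return eigenvalue, defect

n = 12
S = {1, 3, 9, 11}                       # symmetric: s in S implies -s mod n in S
T = cayley_adjacency(n, S)
results = [eigen_defect(T, n, S, xi) for xi in range(n)]
max_defect = max(defect for _, defect in results)
trivial_eigenvalue = results[0][0]      # the constant character has eigenvalue |S|
```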
Remark 3 The use of spectral theory here is parallel to the use of Fourier analysis to establish results such as Roth’s theorem on arithmetic progressions of length three. In analogy with this, one could view hypergraph regularity as being a sort of “higher order spectral theory”, although this spectral perspective is not as convenient as it is in the graph case.
Van Vu and I have just uploaded to the arXiv our paper “Random matrices: Universality of local spectral statistics of non-Hermitian matrices”. The main result of this paper is a “Four Moment Theorem” that establishes universality for local spectral statistics of non-Hermitian matrices with independent entries, under the additional hypotheses that the entries of the matrix have exponentially decaying tails and match moments with either the real or complex gaussian ensemble to fourth order. This is the non-Hermitian analogue of a long string of recent results establishing universality of local statistics in the Hermitian case (as discussed for instance in this recent survey of Van and myself, and also in several other places).
The complex case is somewhat easier to describe. Given a (non-Hermitian) random matrix ensemble of
matrices, one can arbitrarily enumerate the (geometric) eigenvalues as
, and one can then define the
-point correlation functions
to be the symmetric functions such that
In the case when the matrix is drawn from the complex gaussian ensemble, so that all the entries are independent complex gaussians of mean zero and variance one, it is a classical result of Ginibre that the asymptotics of
near some point
as
and
is fixed are given by the determinantal rule
and
for , where
is the reproducing kernel
(There is also an asymptotic for the boundary case , but it is more complicated to state.) In particular, we see that
for almost every
, which is a manifestation of the well-known circular law for these matrices; but the circular law only captures the macroscopic structure of the spectrum, whereas the asymptotic (1) describes the microscopic structure.
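The macroscopic statement, the circular law, is easy to observe numerically. The sketch below samples one matrix with independent complex gaussian entries of mean zero and variance one, rescales the eigenvalues by $\sqrt{n}$, and measures the fraction lying in two disks. The size `n = 200`, the seed, and the radii are arbitrary choices. If the rescaled eigenvalues are approximately uniform on the unit disk, roughly a quarter of them should lie within radius $1/2$, and almost all within a radius slightly larger than one.

```python
import numpy as np

n = 200
rng = np.random.default_rng(0)

# Independent complex gaussian entries with mean zero and variance one.
M = (rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))) / np.sqrt(2)

# Rescale so that the spectrum fills (approximately) the unit disk.
z = np.linalg.eigvals(M) / np.sqrt(n)

frac_inside = float(np.mean(np.abs(z) <= 1.1))   # should be close to 1
frac_half = float(np.mean(np.abs(z) <= 0.5))     # should be close to 1/4
```

This only probes the macroscopic scale; the microscopic correlation structure discussed above would need statistics at the scale of individual eigenvalue spacings.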
Our first main result is that the asymptotic (1) also holds (in the sense of vague convergence) when
is a matrix whose entries are independent with mean zero, variance one, exponentially decaying tails, and which all match moments with the complex gaussian to fourth order. (Actually we prove a stronger result than this which is valid for all bounded
and has more uniform bounds, but is a bit more technical to state.) An analogous result is also established for real gaussians (but now one has to separate the correlation function into components depending on how many eigenvalues are real and how many are strictly complex; also, the limiting distribution is more complicated, being described by Pfaffians rather than determinants). Among other things, this allows us to partially extend some known results on complex or real gaussian ensembles to more general ensembles. For instance, a result of Rider establishes a central limit theorem for the number of eigenvalues of a complex gaussian matrix in a mesoscopic disk; from our results, we can extend this central limit theorem to matrices that match the complex gaussian ensemble to fourth order, provided that the disk is small enough (for technical reasons, our error bounds are not strong enough to handle large disks). Similarly, extending some results of Edelman-Kostlan-Shub and of Forrester-Nagao, we can show that for a matrix matching the real gaussian ensemble to fourth order, the number of real eigenvalues is
with probability
for some absolute constant
.
There are several steps involved in the proof. The first step is to apply the Girko Hermitisation trick to replace the problem of understanding the spectrum of a non-Hermitian matrix, with that of understanding the spectrum of various Hermitian matrices. The two identities that realise this trick are, firstly, Jensen’s formula
that relates the local distribution of eigenvalues to the log-determinants , and secondly the elementary identity
that relates the log-determinants of to the log-determinants of the Hermitian matrices
The main difficulty is then to obtain concentration and universality results for the Hermitian log-determinants . This turns out to be a task that is analogous to the task of obtaining concentration for Wigner matrices (as we did in this recent paper), as well as central limit theorems for log-determinants of Wigner matrices (as we did in this other recent paper). In both of these papers, the main idea was to use the Four Moment Theorem for Wigner matrices (which can now be proven relatively easily by a combination of the local semi-circular law and resolvent swapping methods), combined with (in the latter paper) a central limit theorem for the gaussian unitary ensemble (GUE). This latter task was achieved by using the convenient Trotter normal form to tridiagonalise a GUE matrix, which has the effect of revealing the determinant of that matrix as the solution to a certain linear stochastic difference equation, and one can analyse the distribution of that solution via such tools as the martingale central limit theorem.
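The elementary identity behind the Hermitisation trick can be illustrated as follows. This is only a sketch under an assumption: it uses the unnormalised Hermitian block matrix with off-diagonal blocks $M - z$ and $(M-z)^*$, which may differ from the normalisation used in the paper. For this block matrix the absolute value of the determinant is the square of $|\det(M - z)|$, so the log-determinant of the non-Hermitian matrix is half that of a Hermitian one.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 6
M = (rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))) / np.sqrt(2)
z = 0.3 + 0.2j

A = M - z * np.eye(n)

# Hermitian 2n x 2n block matrix with A and A^* off the diagonal
# (an assumed, unnormalised form of the Hermitisation).
H = np.block([[np.zeros((n, n)), A],
              [A.conj().T, np.zeros((n, n))]])

# slogdet returns (sign, log|det|); index 1 is the log of the absolute value.
log_abs_det_A = np.linalg.slogdet(A)[1]
log_abs_det_H = np.linalg.slogdet(H)[1]
log_det_AstarA = np.linalg.slogdet(A.conj().T @ A)[1]

is_hermitian = np.allclose(H, H.conj().T)
```

Both `0.5 * log_abs_det_H` and `0.5 * log_det_AstarA` should agree with `log_abs_det_A` up to rounding error. The eigenvalues of `H` are plus and minus the singular values of `A`, which is why spectral information about Hermitian matrices controls the log-determinant.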
These Hermitian matrices are somewhat more complicated than Wigner matrices (for instance, the semi-circular law must be replaced by a distorted Marchenko-Pastur law), but the same general strategy works to obtain concentration and universality for their log-determinants. The main new difficulty that arises is that the analogue of the Trotter normal form for gaussian random matrices is not tridiagonal, but rather Hessenberg (i.e. upper-triangular except for the lower diagonal). This ultimately has the effect of expressing the relevant determinant as the solution to a nonlinear stochastic difference equation, which is a bit trickier to solve. Fortunately, it turns out that one only needs good lower bounds on the solution, as one can use the second moment method to upper bound the determinant and hence the log-determinant (following a classical computation of Turan). This simplifies the analysis of the equation somewhat.
While this result is the first local universality result in the category of random matrices with independent entries, there are still two limitations to the result which one would like to remove. The first is the moment matching hypothesis on the matrix. Very recently, one of the ingredients of our paper, namely the local circular law, was proved without moment matching hypotheses by Bourgade, Yau, and Yin (provided one stays away from the edge of the spectrum); however, at the time of writing the other main ingredient – the universality of the log-determinant – still requires moment matching. (The standard tool for obtaining universality without moment matching hypotheses is the heat flow method (and more specifically, the local relaxation flow method), but the analogue of Dyson Brownian motion in the non-Hermitian setting appears to be somewhat intractable, being a coupled flow on both the eigenvalues and eigenvectors rather than just on the eigenvalues alone.)
I’ve just uploaded to the arXiv my paper The asymptotic distribution of a single eigenvalue gap of a Wigner matrix, submitted to Probability Theory and Related Fields. This paper (like several of my previous papers) is concerned with the asymptotic distribution of the eigenvalues of a random Wigner matrix
in the limit
, with a particular focus on matrices drawn from the Gaussian Unitary Ensemble (GUE). This paper is focused on the bulk of the spectrum, i.e. on eigenvalues
with
for some fixed
.
The location of an individual eigenvalue is by now quite well understood. If we normalise the entries of the matrix
to have mean zero and variance
, then in the asymptotic limit
, the Wigner semicircle law tells us that with probability
one has
where the classical location of the eigenvalue is given by the formula
and the semicircular distribution is given by the formula
Actually, one can improve the error term here from to
for any
(see this previous recent paper of Van and myself for more discussion of these sorts of estimates, sometimes known as eigenvalue rigidity estimates).
From the semicircle law (and the fundamental theorem of calculus), one expects the eigenvalue spacing
to have an average size of
. It is thus natural to introduce the normalised eigenvalue spacing
and ask what the distribution of this quantity is.
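The classical locations and the normalised spacings can be explored numerically. The sketch below fixes one normalisation as an assumption, which may differ from the one used in the paper: the GUE entries have variance $1/n$, so that the spectrum concentrates on $[-2,2]$ with density $\frac{1}{2\pi}\sqrt{4-x^2}$, the classical location $\gamma_i$ solves $\int_{-2}^{\gamma_i} \rho_{sc} = i/n$, and the normalised gap is $n \rho_{sc}(\lambda_i)(\lambda_{i+1} - \lambda_i)$. The function names, the size `n = 400`, and the choice of the middle half of the spectrum as "the bulk" are hypothetical.

```python
import numpy as np

def semicircle_density(x):
    return np.sqrt(np.maximum(4.0 - x ** 2, 0.0)) / (2.0 * np.pi)

def semicircle_cdf(x):
    """Closed form for the integral of the semicircle density from -2 to x."""
    x = np.clip(x, -2.0, 2.0)
    return 0.5 + x * np.sqrt(4.0 - x ** 2) / (4.0 * np.pi) + np.arcsin(x / 2.0) / np.pi

def classical_location(q, tol=1e-12):
    """Solve semicircle_cdf(gamma) = q by bisection, for 0 < q < 1."""
    lo, hi = -2.0, 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if semicircle_cdf(mid) < q:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def sample_gue(n, rng):
    """GUE matrix normalised so that the entries have variance 1/n."""
    X = (rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))) / np.sqrt(2)
    return (X + X.conj().T) / np.sqrt(2 * n)

n = 400
rng = np.random.default_rng(0)
eig = np.linalg.eigvalsh(sample_gue(n, rng))      # ascending order

lo_idx, hi_idx = n // 4, 3 * n // 4               # a crude choice of "bulk"
gamma = np.array([classical_location((i + 1) / n) for i in range(lo_idx, hi_idx)])
max_dev = float(np.max(np.abs(eig[lo_idx:hi_idx] - gamma)))

gaps = np.diff(eig)[lo_idx:hi_idx]                # gaps[i] = eig[i+1] - eig[i]
normalised_gaps = n * semicircle_density(eig[lo_idx:hi_idx]) * gaps
mean_gap = float(normalised_gaps.mean())
```

With this normalisation the mean normalised gap should be close to one, and a histogram of `normalised_gaps` gives an empirical picture of the gap distribution, although a single sample of this size is far too small to distinguish fine features.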
As mentioned previously, we will focus on the bulk case , and begin with the model case when
is drawn from GUE. (In the edge case when
is close to
or to
, the distribution is given by the famous Tracy-Widom law.) Here, the distribution was almost (but as we shall see, not quite) worked out by Gaudin and Mehta. By using the theory of determinantal processes, they were able to compute a quantity closely related to
, namely the probability
near
of length comparable to the expected eigenvalue spacing
is devoid of eigenvalues. For
in the bulk and fixed
, they showed that this probability is equal to
where is the Dyson projection
to Fourier modes in , and
is the Fredholm determinant. As shown by Jimbo, Miwa, Mori, and Sato, this determinant can also be expressed in terms of a solution to a Painleve V ODE, though we will not need this fact here. In view of this asymptotic and some standard integration by parts manipulations, it becomes plausible to propose that
will be asymptotically distributed according to the Gaudin-Mehta distribution
, where
A reasonably accurate approximation for is given by the Wigner surmise
, which was presciently proposed by Wigner as early as 1957; it is exact for
but not in the asymptotic limit
.
Unfortunately, when one tries to make this argument rigorous, one finds that the asymptotic for (1) does not control a single gap , but rather an ensemble of gaps
, where
is drawn from an interval
of some moderate size
(e.g.
); see for instance this paper of Deift, Kriecherbauer, McLaughlin, Venakides, and Zhou for a more precise formalisation of this statement (which is phrased slightly differently, in which one samples all gaps inside a fixed window of spectrum, rather than inside a fixed range of eigenvalue indices
). (This result is stated for GUE, but can be extended to other Wigner ensembles by the Four Moment Theorem, at least if one assumes a moment matching condition; see this previous paper with Van Vu for details. The moment condition can in fact be removed, as was done in this subsequent paper with Erdos, Ramirez, Schlein, Vu, and Yau.)
The problem is that when one specifies a given window of spectrum such as , one cannot quite pin down in advance which eigenvalues
are going to lie to the left or right of this window; even with the strongest eigenvalue rigidity results available, there is a natural uncertainty of
or so in the
index (as can be quantified quite precisely by this central limit theorem of Gustavsson).
The main difficulty here is that there could potentially be some strange coupling between the event (1) of an interval being devoid of eigenvalues, and the number of eigenvalues to the left of that interval. For instance, one could conceive of a possible scenario in which the interval in (1) tends to have many eigenvalues when
is even, but very few when
is odd. In this sort of situation, the gaps
may have different behaviour for even
than for odd
, and such anomalies would not be picked up in the averaged statistics in which
is allowed to range over some moderately large interval.
The main result of the current paper is that these anomalies do not actually occur, and that all of the eigenvalue gaps in the bulk are asymptotically governed by the Gaudin-Mehta law without the need for averaging in the
parameter. Again, this is shown first for GUE, and then extended to other Wigner matrices obeying a matching moment condition using the Four Moment Theorem. (It is likely that the moment matching condition can be removed here, but I was unable to achieve this, despite all the recent advances in establishing universality of local spectral statistics for Wigner matrices, mainly because the universality results in the literature are more focused on specific energy levels
than on specific eigenvalue indices
. To make matters worse, in some cases universality is currently known only after an additional averaging in the energy parameter.)
The main task in the proof is to show that the random variable is largely decoupled from the event in (1) when
is drawn from GUE. To do this we use some of the theory of determinantal processes, and in particular the nice fact that when one conditions a determinantal process to the event that a certain spatial region (such as an interval) contains no points of the process, then one obtains a new determinantal process (with a kernel that is closely related to the original kernel). The main task is then to obtain a sufficiently good control on the distance between the new determinantal kernel and the old one, which we do by some functional-analytic considerations involving the manipulation of norms of operators (and specifically, the operator norm, Hilbert-Schmidt norm, and nuclear norm). Amusingly, the Fredholm alternative makes a key appearance, as I end up having to invert a compact perturbation of the identity at one point (specifically, I need to invert
, where
is the Dyson projection and
is an interval). As such, the bounds in my paper become ineffective, though I am sure that with more work one can invert this particular perturbation of the identity by hand, without the need to invoke the Fredholm alternative.
Van Vu and I have just uploaded to the arXiv our paper Random matrices: Sharp concentration of eigenvalues, submitted to the Electronic Journal of Probability. As with many of our previous papers, this paper is concerned with the distribution of the eigenvalues of a random Wigner matrix
(such as a matrix drawn from the Gaussian Unitary Ensemble (GUE) or Gaussian Orthogonal Ensemble (GOE)). To simplify the discussion we shall mostly restrict attention to the bulk of the spectrum, i.e. to eigenvalues
with
for some fixed
, although analogues of most of the results below have also been obtained at the edge of the spectrum.
If we normalise the entries of the matrix to have mean zero and variance
, then in the asymptotic limit
, we have the Wigner semicircle law, which asserts that the eigenvalues are asymptotically distributed according to the semicircular distribution
, where
An essentially equivalent way of saying this is that for large , we expect the
eigenvalue
of
to stay close to the classical location
, defined by the formula
In particular, from the Wigner semicircle law it can be shown that asymptotically almost surely, one has
.
In the modern study of the spectrum of Wigner matrices (and in particular as a key tool in establishing universality results), it has become of interest to improve the error term in (1) as much as possible. A typical early result in this direction was by Bai, who used the Stieltjes transform method to obtain polynomial convergence rates of the shape for some absolute constant
; see also the subsequent papers of Alon-Krivelevich-Vu and of Meckes, who were able to obtain such convergence rates (with exponentially high probability) by using concentration of measure tools, such as Talagrand’s inequality. On the other hand, in the case of the GUE ensemble it is known (by this paper of Gustavsson) that
has variance comparable to
in the bulk, so that the optimal error term in (1) should be about
. (One may think that if one wanted bounds on (1) that were uniform in
, one would need to enlarge the error term further, but this does not appear to be the case, due to strong correlations between the
; note for instance this recent result of Ben Arous and Bourgade that the largest gap between eigenvalues in the bulk is typically of order
.)
A significant advance in this direction was achieved by Erdos, Schlein, and Yau in a series of papers where they used a combination of Stieltjes transform and concentration of measure methods to obtain local semicircle laws which showed, among other things, that one had asymptotics of the form
with exponentially high probability for intervals in the bulk that were as short as
for some
, where
is the number of eigenvalues. These asymptotics are consistent with a good error term in (1), and are already sufficient for many applications, but do not quite imply a strong concentration result for individual eigenvalues
(basically because they do not preclude long-range or “secular” shifts in the spectrum that involve large blocks of eigenvalues at mesoscopic scales). Nevertheless, this was rectified in a subsequent paper of Erdos, Yau, and Yin, which roughly speaking obtained a bound of the form
in the bulk with exponentially high probability, for Wigner matrices obeying some exponential decay conditions on the entries. This was achieved by a rather delicate high moment calculation, in which the contribution of the diagonal entries of the resolvent (whose average forms the Stieltjes transform) was shown to mostly cancel each other out.
As the GUE computations show, this concentration result is sharp up to the quasilogarithmic factor . The main result of this paper is to improve the concentration result to one more in line with the GUE case, namely
with exponentially high probability (see the paper for a more precise statement of results). The one catch is that an additional hypothesis is required, namely that the entries of the Wigner matrix have vanishing third moment. We also obtain similar results for the edge of the spectrum (but with a different scaling).
Our arguments are rather different from those of Erdos, Yau, and Yin, and thus provide an alternate approach to establishing eigenvalue concentration. The main tool is the Lindeberg exchange strategy, which is also used to prove the Four Moment Theorem (although we do not directly invoke the Four Moment Theorem in our analysis). The main novelty is that this exchange strategy is now used to establish large deviation estimates (i.e. exponentially small tail probabilities) rather than universality of the limiting distribution. Roughly speaking, the basic point is as follows. The Lindeberg exchange strategy seeks to compare a function of many independent random variables
with the same function
of a different set of random variables (which match moments with the original set of variables to some order, such as to second or fourth order) by exchanging the random variables one at a time. Typically, one tries to upper bound expressions such as
for various smooth test functions , by performing a Taylor expansion in the variable being swapped and taking advantage of the matching moment hypotheses. In previous implementations of this strategy,
was a bounded test function, which allowed one to get control of the bulk of the distribution of
, and in particular in controlling probabilities such as
for various thresholds and
, but did not give good control on the tail as the error terms tended to be polynomially decaying in
rather than exponentially decaying. However, it turns out that one can modify the exchange strategy to deal with moments such as
for various moderately large (e.g. of size comparable to
), obtaining results such as
after performing all the relevant exchanges. As such, one can then use large deviation estimates on to deduce large deviation estimates on
.
In this paper we also take advantage of a simplification, first noted by Erdos, Yau, and Yin, that Four Moment Theorems become somewhat easier to prove if one works with resolvents (and the closely related Stieltjes transform
) rather than with individual eigenvalues, as the Taylor expansions of resolvents are very simple (essentially being a Neumann series). The relationship between the Stieltjes transform and the location of individual eigenvalues can be seen by taking advantage of the identity
for any energy level , which can be verified from elementary calculus. (In practice, we would truncate
near zero and near infinity to avoid some divergences, but this is a minor technicality.) As such, a concentration result for the Stieltjes transform can be used to establish an analogous concentration result for the eigenvalue counting functions
, which in turn can be used to deduce concentration results for individual eigenvalues
by some basic combinatorial manipulations.
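The link between the Stieltjes transform and eigenvalue counting can be checked on a toy spectrum. The sketch below assumes the convention $s(z) = \frac{1}{n} \sum_i \frac{1}{\lambda_i - z}$. For real $a \neq 0$ one has $\int_0^\infty \frac{a}{a^2+\eta^2}\, d\eta = \frac{\pi}{2} \hbox{sgn}(a)$, so integrating the real part of $s(E + i\eta)$ over $\eta > 0$ gives $\frac{\pi}{2} - \frac{\pi}{n} N_E$, where $N_E$ is the number of eigenvalues below $E$ (assuming no eigenvalue equals $E$). The eigenvalue list and the energy are hypothetical, and the integral is computed with a midpoint rule after the substitution $\eta = \tan \theta$.

```python
import math

def stieltjes(eigs, z):
    """s(z) = (1/n) * sum_i 1 / (lambda_i - z)."""
    return sum(1.0 / (lam - z) for lam in eigs) / len(eigs)

def integral_re_s(eigs, E, steps=20000):
    """Midpoint rule for the integral of Re s(E + i eta) over eta in (0, inf),
    using the substitution eta = tan(theta) with theta in (0, pi/2)."""
    h = (math.pi / 2) / steps
    total = 0.0
    for k in range(steps):
        theta = (k + 0.5) * h
        eta = math.tan(theta)
        jacobian = 1.0 / math.cos(theta) ** 2      # d(eta)/d(theta)
        total += stieltjes(eigs, complex(E, eta)).real * jacobian * h
    return total

# Hypothetical spectrum and energy level (E is not an eigenvalue).
eigs = [-1.5, -0.7, -0.2, 0.4, 0.9, 1.3, 1.8]
E = 0.1
n = len(eigs)

integral = integral_re_s(eigs, E)
recovered_count = n * (0.5 - integral / math.pi)   # should be close to N_E
true_count = sum(1 for lam in eigs if lam < E)
```

In this toy setting nothing diverges because the sum is finite; for actual random matrices one would truncate the range of $\eta$ as indicated above.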
Let be a self-adjoint operator on a finite-dimensional Hilbert space
. The behaviour of this operator can be completely described by the spectral theorem for finite-dimensional self-adjoint operators (i.e. Hermitian matrices, when viewed in coordinates), which provides a sequence
of eigenvalues and an orthonormal basis
of eigenfunctions such that
for all
. In particular, given any function
on the spectrum
of
, one can then define the linear operator
by the formula
which then gives a functional calculus, in the sense that the map is a
-algebra isometric homomorphism from the algebra
of bounded continuous functions from
to
, to the algebra
of bounded linear operators on
. Thus, for instance, one can define heat operators
for
, Schrödinger operators
for
, resolvents
for
, and (if
is positive) wave operators
for
. These will be bounded operators (and, in the case of the Schrödinger and wave operators, unitary operators, and in the case of the heat operators with
positive, they will be contractions). Among other things, this functional calculus can then be used to solve differential equations such as the heat equation
The functional calculus can also be associated to a spectral measure. Indeed, for any vectors , there is a complex measure
on
with the property that
indeed, one can set to be the discrete measure on
defined by the formula
One can also view this complex measure as a coefficient
of a projection-valued measure on
, defined by setting
Finally, one can view as unitarily equivalent to a multiplication operator
on
, where
is the real-valued function
, and the intertwining map
is given by
so that .
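In finite dimensions the functional calculus is just "apply the function to the eigenvalues", which can be sketched in a few lines of NumPy. The sign conventions below are assumptions: heat operators $e^{-tT}$, Schrödinger operators $e^{itT}$, and resolvents $(T - z)^{-1}$. The helper name `functional_calculus` and the random test matrix are hypothetical.

```python
import numpy as np

def functional_calculus(T, f):
    """Return f(T) = sum_i f(lambda_i) e_i e_i^* for a Hermitian matrix T."""
    lam, E = np.linalg.eigh(T)
    return (E * f(lam)) @ E.conj().T

rng = np.random.default_rng(2)
d = 5
X = rng.standard_normal((d, d)) + 1j * rng.standard_normal((d, d))
T = (X + X.conj().T) / 2                  # a Hermitian test matrix
I = np.eye(d)
t = 0.7

# Schrodinger propagator exp(i t T): should be unitary.
U = functional_calculus(T, lambda x: np.exp(1j * t * x))
unitary_defect = np.linalg.norm(U.conj().T @ U - I)

# Heat operator exp(-t P) for the positive semi-definite P = T^2: a contraction.
P = T @ T
heat = functional_calculus(P, lambda x: np.exp(-t * x))
heat_norm = np.linalg.norm(heat, 2)

# Homomorphism property: the function x -> x^2 gives T @ T.
square_defect = np.linalg.norm(functional_calculus(T, lambda x: x ** 2) - T @ T)

# Resolvent at the non-real point z = i inverts T - z.
z = 1j
R = functional_calculus(T, lambda x: 1.0 / (x - z))
resolvent_defect = np.linalg.norm(R @ (T - z * I) - I)
```

The infinite-dimensional theory discussed next is concerned with when this simple picture survives for unbounded operators.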
It is an important fact in analysis that many of the above assertions extend to operators on an infinite-dimensional Hilbert space, so long as one is careful about what “self-adjoint operator” means; these facts are collectively referred to as the spectral theorem. For instance, it turns out that most of the above claims have analogues for bounded self-adjoint operators
. However, in the theory of partial differential equations, one often needs to apply the spectral theorem to unbounded, densely defined linear operators
, which (initially, at least), are only defined on a dense subspace
of the Hilbert space
. A very typical situation arises when
is the square-integrable functions on some domain or manifold
(which may have a boundary or be otherwise “incomplete”), and
are the smooth compactly supported functions on
, and
is some linear differential operator. It is then of interest to obtain the spectral theorem for such operators, so that one can build operators such as
or to solve equations such as (1), (2), (3), (4).
In order to do this, some necessary conditions on the densely defined operator must be imposed. The most obvious is that of symmetry, which asserts that
. In some applications, one also wants to impose positive definiteness, which asserts that
. These hypotheses are sufficient in the case when
is bounded, and in particular when
is finite dimensional. However, as it turns out, for unbounded operators these conditions are not, by themselves, enough to obtain a good spectral theory. For instance, one consequence of the spectral theorem should be that the resolvents
are well-defined for any strictly complex
, which by duality implies that the image of
should be dense in
. However, this can fail if one just assumes symmetry, or symmetry and positive definiteness. A well-known example occurs when
is the Hilbert space
,
is the space of test functions, and
is the one-dimensional Laplacian
. Then
is symmetric and positive, but the operator
does not have dense image for any complex
, since
for all test functions , as can be seen from a routine integration by parts. As such, the resolvent map is not everywhere uniquely defined. There is also a lack of uniqueness for the wave, heat, and Schrödinger equations for this operator (note that there are no spatial boundary conditions specified in these equations).
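The routine integration by parts can be spelled out as follows. This is a sketch under assumptions that may differ in detail from the conventions of the notes: the test function $f$ is smooth and compactly supported in the open half-line $(0,+\infty)$, the operator is $-\frac{d^2}{dx^2}$, and $k$ is a complex number with positive imaginary part, so that $e^{ikx}$ is square-integrable on the half-line.

```latex
% Assumptions: f \in C_c^\infty((0,+\infty)), and \mathrm{Im}(k) > 0,
% so that x \mapsto e^{ikx} lies in L^2((0,+\infty)).
% No boundary terms appear because f vanishes near 0 and near infinity.
\begin{align*}
\int_0^\infty \left( -f''(x) - k^2 f(x) \right) e^{ikx}\, dx
  &= \int_0^\infty f(x) \left( -\frac{d^2}{dx^2} e^{ikx} - k^2 e^{ikx} \right) dx \\
  &= \int_0^\infty f(x) \left( k^2 - k^2 \right) e^{ikx}\, dx \\
  &= 0.
\end{align*}
```

In terms of the Hilbert space inner product, this says that every function in the image is orthogonal to the non-zero square-integrable function $\overline{e^{ikx}}$, so the image cannot be dense.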
Another example occurs when ,
,
is the momentum operator
. Then the resolvent
can be uniquely defined for
in the upper half-plane, but not in the lower half-plane, due to the obstruction
for all test functions (note that the function
lies in
when
is in the lower half-plane). For related reasons, the translation operators
have a problem with either uniqueness or existence (depending on whether
is positive or negative), due to the unspecified boundary behaviour at the origin.
The key property that lets one avoid this bad behaviour is that of essential self-adjointness. Once the operator is essentially self-adjoint, the spectral theorem becomes applicable again, leading to all the expected behaviour (e.g. existence and uniqueness for the various PDE given above).
Unfortunately, the concept of essential self-adjointness is defined rather abstractly, and is difficult to verify directly; unlike the symmetry condition (5) or the positivity condition (6), it is not a “local” condition that can be easily verified just by testing on various inputs, but is instead a more “global” condition. In practice, to verify this property, one needs to invoke one of a number of partial converses to the spectral theorem, which roughly speaking assert that if at least one of the expected consequences of the spectral theorem is true for some symmetric densely defined operator
, then
is self-adjoint. Examples of “expected consequences” include:
- Existence of resolvents
(or equivalently, dense image for
);
- Existence of a contractive heat propagator semigroup
(in the positive case);
- Existence of a unitary Schrödinger propagator group
;
- Existence of a unitary wave propagator group
(in the positive case);
- Existence of a “reasonable” functional calculus;
- Unitary equivalence with a multiplication operator.
Thus, to actually verify essential self-adjointness of a differential operator, one typically has to first solve a PDE (such as the wave, Schrödinger, heat, or Helmholtz equation) by some non-spectral method (e.g. by a contraction mapping argument, or a perturbation argument based on an operator already known to be essentially self-adjoint). Once one can solve one of the PDEs, then one can apply one of the known converse spectral theorems to obtain essential self-adjointness, and then by the forward spectral theorem one can then solve all the other PDEs as well. But there is no getting out of that first step, which requires some input (typically of an ODE, PDE, or geometric nature) that is external to what abstract spectral theory can provide. For instance, if one wants to establish essential self-adjointness of the Laplace-Beltrami operator on a smooth Riemannian manifold
(using
as the domain space), it turns out (under reasonable regularity hypotheses) that essential self-adjointness is equivalent to geodesic completeness of the manifold, which is a global ODE condition rather than a local one: one needs geodesics to continue indefinitely in order to be able to (unitarily) solve PDEs such as the wave equation, which in turn leads to essential self-adjointness. (Note that the domains
and
in the previous examples were not geodesically complete.) For this reason, essential self-adjointness of a differential operator is sometimes referred to as quantum completeness (with the completeness of the associated Hamilton-Jacobi flow then being the analogous classical completeness).
In these notes, I wanted to record (mostly for my own benefit) the forward and converse spectral theorems, and to verify essential self-adjointness of the Laplace-Beltrami operator on geodesically complete manifolds. This is extremely standard analysis (covered, for instance, in the texts of Reed and Simon), but I wanted to write it down myself to make sure that I really understood this foundational material properly.
In the previous set of notes we saw how a representation-theoretic property of groups, namely Kazhdan’s property (T), could be used to demonstrate expansion in Cayley graphs. In this set of notes we discuss a different representation-theoretic property of groups, namely quasirandomness, which is also useful for demonstrating expansion in Cayley graphs, though in a somewhat different way to property (T). For instance, whereas property (T), being qualitative in nature, is only interesting for infinite groups such as $SL_d({\bf Z})$ or $SL_d({\bf R})$, and only creates Cayley graphs after passing to a finite quotient, quasirandomness is a quantitative property which is directly applicable to finite groups, and is able to deduce expansion in a Cayley graph, provided that random walks in that graph are known to become sufficiently “flat” in a certain sense.
The definition of quasirandomness is easy enough to state:
Definition 1 (Quasirandom groups) Let $G$ be a finite group, and let $D \geq 1$. We say that $G$ is $D$-quasirandom if all non-trivial unitary representations $\rho: G \rightarrow U(V)$ of $G$ have dimension at least $D$. (Recall a representation is trivial if $\rho(g)$ is the identity for all $g \in G$.)
Exercise 1 Let $G$ be a finite group, and let $D \geq 1$. A unitary representation $\rho: G \rightarrow U(V)$ is said to be irreducible if $V$ has no $G$-invariant subspaces other than $\{0\}$ and $V$. Show that $G$ is $D$-quasirandom if and only if every non-trivial irreducible representation of $G$ has dimension at least $D$.
Remark 1 The terminology “quasirandom group” was introduced explicitly (though with slightly different notational conventions) by Gowers in 2008 in his detailed study of the concept; the name arises because dense Cayley graphs in quasirandom groups are quasirandom graphs in the sense of Chung, Graham, and Wilson, as we shall see below. This property had already been used implicitly to construct expander graphs by Sarnak and Xue in 1991, and more recently by Gamburd in 2002 and by Bourgain and Gamburd in 2008. One can of course define quasirandomness for more general locally compact groups than the finite ones, but we will only need this concept in the finite case. (A paper of Kunze and Stein from 1960, for instance, exploits the quasirandomness properties of the locally compact group
$SL_2({\bf R})$ to obtain mixing estimates in that group.)
Quasirandomness behaves fairly well with respect to quotients and short exact sequences:
Exercise 2 Let $0 \rightarrow H \rightarrow G \rightarrow K \rightarrow 0$ be a short exact sequence of finite groups $H, G, K$.
- (i) If $G$ is $D$-quasirandom, show that $K$ is $D$-quasirandom also. (Equivalently: any quotient of a $D$-quasirandom finite group is again a $D$-quasirandom finite group.)
- (ii) Conversely, if $H$ and $K$ are both $D$-quasirandom, show that $G$ is $D$-quasirandom also. (In particular, the direct or semidirect product of two $D$-quasirandom finite groups is again a $D$-quasirandom finite group.)
Informally, we will call $G$ quasirandom if it is $D$-quasirandom for some “large” $D$, though the precise meaning of “large” will depend on context. For applications to expansion in Cayley graphs, “large” will mean “$D \geq c|G|^c$ for some constant $c > 0$ independent of the size of $G$”, but other regimes of $D$ are certainly of interest.
The way we have set things up, the trivial group $G = \{1\}$ is infinitely quasirandom (i.e. it is $D$-quasirandom for every $D$). This is however a degenerate case and will not be discussed further here. In the non-trivial case, a finite group can only be quasirandom if it is large and has no large subgroups:
Exercise 3 Let $D \geq 1$, and let $G$ be a finite $D$-quasirandom group.
- (i) Show that if $G$ is non-trivial, then $|G| \geq D+1$. (Hint: use the mean zero component $\tau\downharpoonright_{\ell^2(G)_0}$ of the regular representation $\tau: G \rightarrow U(\ell^2(G))$.) In particular, non-trivial finite groups cannot be infinitely quasirandom.
- (ii) Show that any proper subgroup $H$ of $G$ has index $[G:H] \geq D+1$. (Hint: use the mean zero component of the quasiregular representation.)
The following exercise shows that quasirandom groups have to be quite non-abelian, and in particular perfect:
Exercise 4 (Quasirandomness, abelianness, and perfection) Let $G$ be a finite group.
- (i) If $G$ is abelian and non-trivial, show that $G$ is not $2$-quasirandom. (Hint: use Fourier analysis or the classification of finite abelian groups.)
- (ii) Show that $G$ is $2$-quasirandom if and only if it is perfect, i.e. the commutator group $[G,G]$ is equal to $G$. (Equivalently, $G$ is $2$-quasirandom if and only if it has no non-trivial abelian quotients.)
Later on we shall see that there is a converse to the above two exercises; any non-trivial perfect finite group with no large subgroups will be quasirandom.
Exercise 5 Let $G$ be a finite $D$-quasirandom group. Show that for any subgroup $H$ of $G$, $H$ is $D/[G:H]$-quasirandom, where $[G:H]$ is the index of $H$ in $G$. (Hint: use induced representations.)
Now we give an example of a more quasirandom group.
Lemma 2 (Frobenius lemma) If ${\bf F}_p$ is a field of some prime order $p$, then $SL_2({\bf F}_p)$ is $\frac{p-1}{2}$-quasirandom.
This should be compared with the cardinality $|SL_2({\bf F}_p)|$ of the special linear group, which is easily computed to be $(p^2-1) \times p = p^3 - p$.
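The cardinality of the special linear group can be checked by brute force for small primes. The following is only a minimal sketch; the helper name `sl2_order` is introduced here for illustration and is not part of the notes.

```python
from itertools import product

def sl2_order(p: int) -> int:
    """Count the 2x2 matrices over Z/pZ with determinant 1 by brute force."""
    return sum(
        1
        for a, b, c, d in product(range(p), repeat=4)
        if (a * d - b * c) % p == 1
    )

# Compare the brute-force count with p^3 - p for a few small primes.
for p in (2, 3, 5, 7):
    print(p, sl2_order(p), p**3 - p)
```

The count grows like $p^3$, while the quasirandomness bound in the lemma grows like $p$, i.e. like a cube root of the size of the group.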
Proof: We may of course take $p$ to be odd. Suppose for contradiction that we have a non-trivial representation $\rho: SL_2({\bf F}_p) \rightarrow U_d({\bf C})$ on a unitary group of some dimension $d$ with $d < \frac{p-1}{2}$. Set $a$ to be the group element
$$ a := \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}, $$
and suppose first that $\rho(a)$ is non-trivial. Since $a^p = 1$, we have $\rho(a)^p = 1$; thus all the eigenvalues of $\rho(a)$ are $p^{th}$ roots of unity. On the other hand, by conjugating $a$ by diagonal matrices in $SL_2({\bf F}_p)$, we see that $a$ is conjugate to $a^m$ (and hence $\rho(a)$ conjugate to $\rho(a)^m$) whenever $m$ is a quadratic residue mod $p$. As such, the eigenvalues of $\rho(a)$ must be permuted by the operation $x \mapsto x^m$ for any quadratic residue $m$ mod $p$. Since $\rho(a)$ has at least one non-trivial eigenvalue, and there are $\frac{p-1}{2}$ distinct quadratic residues, we conclude that $\rho(a)$ has at least $\frac{p-1}{2}$ distinct eigenvalues. But $\rho(a)$ is a $d \times d$ matrix with $d < \frac{p-1}{2}$, a contradiction. Thus $a$ lies in the kernel of $\rho$. By conjugation, we then see that this kernel contains all unipotent matrices. But these matrices generate $SL_2({\bf F}_p)$ (see exercise below), and so $\rho$ is trivial, a contradiction.
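The conjugation step in the proof can be made concrete: conjugating the upper unipotent element by the diagonal matrix with entries $t, t^{-1}$ replaces the off-diagonal entry $1$ by $t^2$, which is why only quadratic residues appear. A small sketch of this check over ${\bf Z}/p{\bf Z}$ follows; the function names are hypothetical and the choice $p = 11$ is arbitrary.

```python
def matmul(A, B, p):
    """Multiply two 2x2 matrices (tuples of tuples) modulo p."""
    return tuple(
        tuple(sum(A[i][k] * B[k][j] for k in range(2)) % p for j in range(2))
        for i in range(2)
    )

def conjugate_by_diagonal(t, p):
    """Return D a D^{-1} for D = diag(t, t^{-1}) and a = [[1, 1], [0, 1]]."""
    t_inv = pow(t, -1, p)  # modular inverse (requires Python 3.8+)
    D = ((t, 0), (0, t_inv))
    D_inv = ((t_inv, 0), (0, t))
    a = ((1, 1), (0, 1))
    return matmul(matmul(D, a, p), D_inv, p)

def power_of_a(m, p):
    """a^m = [[1, m], [0, 1]] modulo p."""
    return ((1, m % p), (0, 1))

p = 11
residues = sorted({(t * t) % p for t in range(1, p)})
# Every diagonal conjugate of a is a^(t^2), a power by a quadratic residue.
all_match = all(
    conjugate_by_diagonal(t, p) == power_of_a(t * t, p) for t in range(1, p)
)
print(all_match, residues, len(residues) == (p - 1) // 2)
```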
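Exercise 6 Show that for any prime $p$, the unipotent matrices
$$ \begin{pmatrix} 1 & t \\ 0 & 1 \end{pmatrix}, \begin{pmatrix} 1 & 0 \\ t & 1 \end{pmatrix} $$
for $t$ ranging over ${\bf F}_p$ generate $SL_2({\bf F}_p)$ as a group.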
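This is not a proof, but the claim of the exercise can be sanity-checked numerically for small primes by closing the two elementary unipotent matrices (the case $t=1$, whose powers give all other $t$) under multiplication. The sketch below uses a breadth-first search; the function name is an assumption made for illustration.

```python
def generated_subgroup_size(p: int) -> int:
    """Size of the subgroup of SL_2(Z/pZ) generated by the two elementary
    unipotent matrices [[1, 1], [0, 1]] and [[1, 0], [1, 1]]."""
    def mul(A, B):
        (a, b, c, d), (e, f, g, h) = A, B
        return ((a*e + b*g) % p, (a*f + b*h) % p,
                (c*e + d*g) % p, (c*f + d*h) % p)

    gens = [(1, 1, 0, 1), (1, 0, 1, 1)]   # matrices stored row by row
    identity = (1, 0, 0, 1)
    seen = {identity}
    frontier = [identity]
    # In a finite group, closing under products of generators already
    # gives the generated subgroup (inverses are positive powers).
    while frontier:
        next_frontier = []
        for x in frontier:
            for g in gens:
                y = mul(x, g)
                if y not in seen:
                    seen.add(y)
                    next_frontier.append(y)
        frontier = next_frontier
    return len(seen)

for p in (2, 3, 5, 7):
    print(p, generated_subgroup_size(p), p**3 - p)
```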
Exercise 7 Let $G$ be a finite group, and let $D \geq 1$. If $G$ is generated by a collection $G_1,\ldots,G_k$ of $D$-quasirandom subgroups, show that $G$ is itself $D$-quasirandom.
Exercise 8 Show that $SL_d({\bf F}_p)$ is $\frac{p-1}{2}$-quasirandom for any $d \geq 2$ and any prime $p$. (This is not sharp; the optimal bound here is $\gg_d p^{d-1}$, which follows from the results of Landazuri and Seitz.)
As a corollary of the above results and Exercise 2, we see that the projective special linear group $PSL_d({\bf F}_p)$ is also $\frac{p-1}{2}$-quasirandom.
Remark 2 One can ask whether the bound $\frac{p-1}{2}$ in Lemma 2 is sharp, assuming of course that $p$ is odd. Noting that $SL_2({\bf F}_p)$ acts linearly on the plane ${\bf F}_p^2$, we see that it also acts projectively on the projective line ${\bf P}{\bf F}_p^1 := ({\bf F}_p^2 \backslash \{0\}) / {\bf F}_p^\times$, which has $p+1$ elements. Thus $SL_2({\bf F}_p)$ acts via the quasiregular representation on the $p+1$-dimensional space $\ell^2({\bf P}{\bf F}_p^1)$, and also on the $p$-dimensional subspace $\ell^2({\bf P}{\bf F}_p^1)_0$; this latter representation (known as the Steinberg representation) is irreducible. This shows that the $\frac{p-1}{2}$ bound cannot be improved beyond $p$. More generally, given any character $\chi: {\bf F}_p^\times \rightarrow S^1$, $SL_2({\bf F}_p)$ acts on the $p+1$-dimensional space $V_\chi$ of functions $f \in \ell^2({\bf F}_p^2 \backslash \{0\})$ that obey the twisted dilation invariance $f(tx) = \chi(t) f(x)$ for all $t \in {\bf F}_p^\times$ and $x \in {\bf F}_p^2 \backslash \{0\}$; these are known as the principal series representations. When $\chi$ is the trivial character, this is the quasiregular representation discussed earlier. For most other characters, this is an irreducible representation, but it turns out that when $\chi$ is the quadratic representation (thus taking values in $\{-1,+1\}$ while being non-trivial), the principal series representation splits into the direct sum of two $\frac{p+1}{2}$-dimensional representations, which comes very close to matching the bound in Lemma 2. There is a parallel series of representations to the principal series (known as the discrete series) which is more complicated to describe (roughly speaking, one has to embed ${\bf F}_p$ in a quadratic extension ${\bf F}_{p^2}$ and then use a rotated version of the above construction, to change a split torus into a non-split torus), but can generate irreducible representations of dimension $\frac{p-1}{2}$, showing that the bound in Lemma 2 is in fact exactly sharp. These constructions can be generalised to arbitrary finite groups of Lie type using Deligne-Lusztig theory, but this is beyond the scope of this course (and of my own knowledge in the subject).
Exercise 9 Let $p$ be an odd prime. Show that for any $n \geq p+2$, the alternating group $A_n$ is $p-1$-quasirandom. (Hint: show that all cycles of order $p$ in $A_n$ are conjugate to each other in $A_n$ (and not just in $S_n$); in particular, a cycle is conjugate to its $j^{th}$ power for all $j=1,\ldots,p-1$. Also, as $n \geq 5$, $A_n$ is simple, and so the cycles of order $p$ generate the entire group.)
Remark 3 By using more precise information on the representations of the alternating group (using the theory of Specht modules and Young tableaux), one can show the slightly sharper statement that $A_n$ is $n-1$-quasirandom for $n \geq 6$ (but is only $3$-quasirandom for $n=5$ due to icosahedral symmetry, and $1$-quasirandom for $n = 3,4$ due to lack of perfectness). Using Exercise 3 with the index $n$ subgroup $A_{n-1}$, we see that the bound $n-1$ cannot be improved. Thus, $A_n$ (for large $n$) is not as quasirandom as the special linear groups $SL_d({\bf F}_p)$ (for $p$ large and $d$ bounded), because in the latter case the quasirandomness is as strong as a power of the size of the group, whereas in the former case it is only logarithmic in size.
If one replaces the alternating group $A_n$ with the slightly larger symmetric group $S_n$, then quasirandomness is destroyed (since $S_n$, having the abelian quotient $S_n/A_n$, is not perfect); indeed, $S_n$ is $1$-quasirandom and no better.
Remark 4 Thanks to the monumental achievement of the classification of finite simple groups, we know that apart from a finite number (26, to be precise) of sporadic exceptions, all finite simple groups (up to isomorphism) are either a cyclic group ${\bf Z}/p{\bf Z}$, an alternating group $A_n$, or a finite simple group of Lie type such as $PSL_d({\bf F}_p)$. (We will define the concept of a finite simple group of Lie type more precisely in later notes, but suffice to say for now that such groups are constructed from reductive algebraic groups, for instance $PSL_d({\bf F}_p)$ is constructed from $SL_d$ in characteristic $p$.) In the case of finite simple groups $G$ of Lie type with bounded rank $r = O(1)$, it is known from the work of Landazuri and Seitz that such groups are $\gg |G|^c$-quasirandom for some $c > 0$ depending only on the rank. On the other hand, by the previous remark, the large alternating groups do not have this property, and one can show that the finite simple groups of Lie type with large rank also do not have this property. Thus, we see using the classification that if a finite simple group $G$ is $|G|^c$-quasirandom for some $c > 0$ and $|G|$ is sufficiently large depending on $c$, then $G$ is a finite simple group of Lie type with rank $O_c(1)$. It would be of interest to see if there was an alternate way to establish this fact that did not rely on the classification, as it may lead to an alternate approach to proving the classification (or perhaps a weakened version thereof).
A key reason why quasirandomness is desirable for the purposes of demonstrating expansion is that quasirandom groups happen to be rapidly mixing at large scales, as we shall see below the fold. As such, quasirandomness is an important tool for demonstrating expansion in Cayley graphs, though because expansion is a phenomenon that must hold at all scales, one needs to supplement quasirandomness with some additional input that creates mixing at small or medium scales also before one can deduce expansion. As an example of this technique of combining quasirandomness with mixing at small and medium scales, we present a proof (due to Sarnak-Xue, and simplified by Gamburd) of a weak version of the famous “3/16 theorem” of Selberg on the least non-trivial eigenvalue of the Laplacian on a modular curve, which among other things can be used to construct a family of expander Cayley graphs in $SL_2({\bf F}_p)$ (compare this with the property (T)-based methods in the previous notes, which could construct expander Cayley graphs in $SL_d({\bf F}_p)$ for any fixed $d \geq 3$).
Van Vu and I have just uploaded to the arXiv our short survey article, “Random matrices: The Four Moment Theorem for Wigner ensembles“, submitted to the MSRI book series, as part of the proceedings on the MSRI semester program on random matrix theory from last year. This is a highly condensed version (at 17 pages) of a much longer survey (currently at about 48 pages, though not completely finished) that we are currently working on, devoted to the recent advances in understanding the universality phenomenon for spectral statistics of Wigner matrices. In this abridged version of the survey, we focus on a key tool in the subject, namely the Four Moment Theorem which roughly speaking asserts that the statistics of a Wigner matrix depend only on the first four moments of the entries. We give a sketch of proof of this theorem, and two sample applications: a central limit theorem for individual eigenvalues of a Wigner matrix (extending a result of Gustavsson in the case of GUE), and the verification of a conjecture of Wigner, Dyson, and Mehta on the universality of the asymptotic k-point correlation functions even for discrete ensembles (provided that we interpret convergence in the vague topology sense).
For reasons of space, this paper is very far from an exhaustive survey even of the narrow topic of universality for Wigner matrices, but should hopefully be an accessible entry point into the subject nevertheless.
In the previous set of notes we introduced the notion of expansion in arbitrary $k$-regular graphs. For the rest of the course, we will now focus attention primarily on a special type of $k$-regular graph, namely a Cayley graph.
Definition 1 (Cayley graph) Let $G = (G,\cdot)$ be a group, and let $S$ be a finite subset of $G$. We assume that $S$ is symmetric (thus $s^{-1} \in S$ whenever $s \in S$) and does not contain the identity $1$ (this is to avoid loops). Then the (right-invariant) Cayley graph $Cay(G,S)$ is defined to be the graph with vertex set $G$ and edge set $\{ \{sx,x\}: x \in G, s \in S \}$, thus each vertex $x \in G$ is connected to the $|S|$ elements $sx$ for $s \in S$, and so $Cay(G,S)$ is a $|S|$-regular graph.
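To make the definition concrete, here is a minimal sketch that builds the edge set of a Cayley graph on the symmetric group on three letters, with permutations stored as tuples. The helper names and the particular generating set are assumptions chosen for illustration.

```python
from itertools import permutations

def compose(s, x):
    """Product s x of permutations given as tuples: (s x)(i) = s(x(i))."""
    return tuple(s[x[i]] for i in range(len(x)))

def inverse(s):
    inv = [0] * len(s)
    for i, si in enumerate(s):
        inv[si] = i
    return tuple(inv)

def cayley_graph(G, S):
    """Edge set {{s x, x} : x in G, s in S} of the right-invariant Cayley graph."""
    identity = tuple(range(len(G[0])))
    assert identity not in S, "S must not contain the identity"
    assert all(inverse(s) in S for s in S), "S must be symmetric"
    return {frozenset((compose(s, x), x)) for x in G for s in S}

G = list(permutations(range(3)))         # the symmetric group on 3 letters
S = [(1, 0, 2), (1, 2, 0), (2, 0, 1)]    # a transposition, a 3-cycle and its inverse
edges = cayley_graph(G, S)
degree = {x: sum(1 for e in edges if x in e) for x in G}
print(len(edges), set(degree.values()))
```

Each vertex has degree $|S|$, and every right translation $x \mapsto xg$ maps edges to edges, which is the sense in which the graph is right-invariant.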
Example 1 The graph in Exercise 3 of Notes 1 is the Cayley graph on
with generators
.
Remark 1 We call the above Cayley graphs right-invariant because every right translation $x \mapsto xg$ on $G$ is a graph automorphism of $Cay(G,S)$. This group of automorphisms acts transitively on the vertex set of the Cayley graph. One can thus view a Cayley graph as a homogeneous space of $G$, as it “looks the same” from every vertex. One could of course also consider left-invariant Cayley graphs, in which $x$ is connected to $xs$ rather than $sx$. However, the two such graphs are isomorphic using the inverse map $x \mapsto x^{-1}$, so we may without loss of generality restrict our attention throughout to the right-invariant Cayley graphs defined above.
Remark 2 For minor technical reasons, it will be convenient later on to allow $S$ to contain the identity and to come with multiplicity (i.e. it will be a multiset rather than a set). If one does so, of course, the resulting Cayley graph will now contain some loops and multiple edges.
For the purposes of building expander families, we would of course want the underlying group $G$ to be finite. However, it will be convenient at various times to “lift” a finite Cayley graph up to an infinite one, and so we permit $G$ to be infinite in our definition of a Cayley graph.
We will also sometimes consider a generalisation of a Cayley graph, known as a Schreier graph:
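Definition 2 (Schreier graph) Let $G$ be a finite group that acts (on the left) on a space $X$, thus there is a map $(g,x) \mapsto gx$ from $G \times X$ to $X$ such that $1x = x$ and $(gh)x = g(hx)$ for all $g,h \in G$ and $x \in X$. Let $S$ be a symmetric subset of $G$ which acts freely on $X$ in the sense that $sx \neq x$ for all $s \in S$ and $x \in X$, and $sx \neq s'x$ for all distinct $s,s' \in S$ and $x \in X$. Then the Schreier graph $Sch(X,S)$ is defined to be the graph with vertex set $X$ and edge set $\{ \{sx,x\}: s \in S, x \in X \}$.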
Example 2 Every Cayley graph
is also a Schreier graph
, using the obvious left-action of
on itself. The
-regular graphs formed from
permutations
that were studied in the previous set of notes is also a Schreier graph provided that
for all distinct
, with the underlying group being the permutation group
(which acts on the vertex set
in the obvious manner), and
.
Exercise 1 If $k$ is an even integer, show that every $k$-regular graph is a Schreier graph involving a set $S$ of generators of cardinality $k/2$. (Hint: first show that every $k$-regular graph can be decomposed into $k/2$ unions of cycles, each of which partition the vertex set, then use the previous example.)
We return now to Cayley graphs. It is easy to characterise qualitative expansion properties of Cayley graphs:
Exercise 2 (Qualitative expansion) Let $Cay(G,S)$ be a finite Cayley graph.
- (i) Show that $Cay(G,S)$ is a one-sided $\varepsilon$-expander for $G$ for some $\varepsilon > 0$ if and only if $S$ generates $G$.
- (ii) Show that $Cay(G,S)$ is a two-sided $\varepsilon$-expander for $G$ for some $\varepsilon > 0$ if and only if $S$ generates $G$, and furthermore $S$ intersects each index $2$ subgroup of $G$.
We will however be interested in more quantitative expansion properties, in which the expansion constant $\varepsilon$ is independent of the size of the Cayley graph, so that one can construct non-trivial expander families $Cay(G_n,S_n)$ of Cayley graphs.
One can analyse the expansion of Cayley graphs in a number of ways. For instance, by taking the edge expansion viewpoint, one can study Cayley graphs combinatorially, using the product set operation
$$ A \cdot B := \{ ab: a \in A, b \in B \} $$
of subsets of $G$.
Exercise 3 (Combinatorial description of expansion) Let
be a family of finite
-regular Cayley graphs. Show that
is a one-sided expander family if and only if there is a constant
independent of
such that
for all sufficiently large
and all subsets
of
with
.
One can also give a combinatorial description of two-sided expansion, but it is more complicated and we will not use it here.
Exercise 4 (Abelian groups do not expand) Let
be a family of finite
-regular Cayley graphs, with the
all abelian, and the
generating
. Show that
are a one-sided expander family if and only if the Cayley graphs have bounded cardinality (i.e.
). (Hint: assume for contradiction that
is a one-sided expander family with
, and show by two different arguments that
grows at least exponentially in
and also at most polynomially in
, giving the desired contradiction.)
The right-invariant nature of Cayley graphs also suggests that such graphs can be profitably analysed using some sort of Fourier analysis; as the underlying symmetry group is not necessarily abelian, one should use the Fourier analysis of non-abelian groups, which is better known as (unitary) representation theory. The Fourier-analytic nature of Cayley graphs can be highlighted by recalling the operation of convolution of two functions $f, g \in \ell^2(G)$, defined by the formula
$$ f * g(x) := \sum_{y \in G} f(y) g(y^{-1} x) = \sum_{y \in G} f(x y^{-1}) g(y). $$
This convolution operation is bilinear and associative (at least when one imposes a suitable decay condition on the functions, such as compact support), but is not commutative unless $G$ is abelian. (If one is more algebraically minded, one can also identify $\ell^2(G)$ (when $G$ is finite, at least) with the group algebra ${\bf C} G$, in which case convolution is simply the multiplication operation in this algebra.) The adjacency operator $A$ on a Cayley graph $Cay(G,S)$ can then be viewed as a convolution
$$ Af = |S| \mu * f $$
where $\mu$ is the probability density
$$ \mu := \frac{1}{|S|} \sum_{s \in S} \delta_s $$
and $\delta_s$ is the Kronecker delta function on $s$. Using the spectral definition of expansion, we thus see that $Cay(G,S)$ is a one-sided $\varepsilon$-expander if and only if
$$ \langle f, \mu * f \rangle \leq (1-\varepsilon) \|f\|_{\ell^2(G)}^2 $$
whenever $f \in \ell^2(G)$ is orthogonal to the constant function $1$, and is a two-sided $\varepsilon$-expander if
$$ \| \mu * f \|_{\ell^2(G)} \leq (1-\varepsilon) \|f\|_{\ell^2(G)} $$
whenever $f \in \ell^2(G)$ is orthogonal to the constant function $1$.
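As a small numerical illustration of the spectral description, the sketch below takes the cyclic group ${\bf Z}/7{\bf Z}$ with generators $\pm 1$ (an abelian example, so the left and right conventions for convolution coincide), forms the operator of convolution with the uniform probability density on the generators, and reads off the one-sided and two-sided spectral gaps. The variable names are assumptions made for this example.

```python
import numpy as np

def convolve(f, g):
    """Convolution on Z/n: (f*g)(x) = sum_y f(y) g(x - y)."""
    n = len(f)
    return np.array([sum(f[y] * g[(x - y) % n] for y in range(n))
                     for x in range(n)])

n, S = 7, [1, 6]                  # Cay(Z/7, {+1, -1}), a cycle of length 7
mu = np.zeros(n)
mu[S] = 1.0 / len(S)              # uniform probability density on S

# Matrix of the normalised adjacency operator f -> mu * f
# (column x is the image of the Kronecker delta at x).
T = np.column_stack([convolve(mu, np.eye(n)[x]) for x in range(n)])

eigs = np.sort(np.linalg.eigvalsh(T))[::-1]            # 1 = eigs[0] >= eigs[1] >= ...
one_sided_gap = 1 - eigs[1]                            # controls one-sided expansion
two_sided_gap = 1 - max(abs(eigs[1]), abs(eigs[-1]))   # controls two-sided expansion
print(one_sided_gap, two_sided_gap)
```

For this cycle the eigenvalues are $\cos(2\pi j/7)$, so both gaps are positive for fixed size but shrink as the cycle gets longer, in line with the fact that abelian groups do not give expander families.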
We remark that the above spectral definition of expansion can be easily extended to symmetric sets $S$ which contain the identity or have multiplicity (i.e. are multisets). (We retain symmetry, though, in order to keep the operation of convolution by $\mu$ self-adjoint.) In particular, one can say (with some slight abuse of notation) that a set of elements $s_1,\ldots,s_l$ of $G$ (possibly with repetition, and possibly with some elements equalling the identity) generates a one-sided or two-sided $\varepsilon$-expander if the associated symmetric probability density
$$ \mu := \frac{1}{2l} \sum_{i=1}^l \delta_{s_i} + \delta_{s_i^{-1}} $$
obeys the corresponding spectral condition above.
We saw in the last set of notes that expansion can be characterised in terms of random walks. One can of course specialise this characterisation to the Cayley graph case:
Exercise 5 (Random walk description of expansion) Let
be a family of finite
-regular Cayley graphs, and let
be the associated probability density functions. Let
be a constant.
- Show that the
are a two-sided expander family if and only if there exists a
such that for all sufficiently large
, one has
for some
, where
denotes the convolution of
copies of
.
- Show that the
are a one-sided expander family if and only if there exists a
such that for all sufficiently large
, one has
for some
.
In this set of notes, we will connect expansion of Cayley graphs to an important property of certain infinite groups, known as Kazhdan’s property (T) (or property (T) for short). In 1973, Margulis exploited this property to create the first known explicit and deterministic examples of expanding Cayley graphs. As it turns out, property (T) is somewhat overpowered for this purpose; in particular, we now know that there are many families of Cayley graphs for which the associated infinite group does not obey property (T) (or weaker variants of this property, such as property $(\tau)$). In later notes we will therefore turn to other methods of creating Cayley graphs that do not rely on property (T). Nevertheless, property (T) is of substantial intrinsic interest, and also has many connections to other parts of mathematics than the theory of expander graphs, so it is worth spending some time to discuss it here.
The material here is based in part on this recent text on property (T) by Bekka, de la Harpe, and Valette (available online here).
Van Vu and I have just uploaded to the arXiv our paper A central limit theorem for the determinant of a Wigner matrix, submitted to Adv. Math. It studies the asymptotic distribution of the determinant of a random Wigner matrix (such as a matrix drawn from the Gaussian Unitary Ensemble (GUE) or Gaussian Orthogonal Ensemble (GOE)).
Before we get to these results, let us first discuss the simpler problem of studying the determinant of a random iid matrix $A_n = (\xi_{ij})_{1 \leq i,j \leq n}$, such as a real gaussian matrix (where all entries are independently and identically distributed using the standard real normal distribution $\xi_{ij} \equiv N(0,1)_{\bf R}$), a complex gaussian matrix (where all entries are independently and identically distributed using the standard complex normal distribution $\xi_{ij} \equiv N(0,1)_{\bf C}$, thus the real and imaginary parts are independent with law $N(0,1/2)_{\bf R}$), or the random sign matrix (in which all entries are independently and identically distributed according to the Bernoulli distribution $\xi_{ij} \equiv \pm 1$, with a $1/2$ chance of either sign). More generally, one can consider a matrix $A_n$ in which all the entries $\xi_{ij}$ are independently and identically distributed with mean zero and variance $1$.
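We can expand $\det A_n$ using the Leibniz expansion
$$ \det A_n = \sum_{\sigma \in S_n} I_\sigma $$
where $\sigma$ ranges over the permutations of $\{1,\ldots,n\}$, and $I_\sigma$ is the product
$$ I_\sigma := \hbox{sgn}(\sigma) \prod_{i=1}^n \xi_{i\sigma(i)}. $$
From the iid nature of the $\xi_{ij}$, we easily see that each $I_\sigma$ has mean zero and variance one, and are pairwise uncorrelated as $\sigma$ varies. We conclude that $\det A_n$ has mean zero and variance $n!$ (an observation first made by Turán). In particular, from Chebyshev’s inequality we see that $\det A_n$ is typically of size $O(\sqrt{n!})$.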
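The Leibniz expansion and the variance computation can be illustrated numerically. The sketch below (with hypothetical helper names) sums the signed products over all permutations for a small random sign matrix, and then estimates the second moment of the determinant of a $4 \times 4$ real gaussian matrix by Monte Carlo; the estimate should be close to $4! = 24$ up to sampling error.

```python
import numpy as np
from itertools import permutations

def sign(perm):
    """Sign of a permutation, computed by counting inversions."""
    inversions = sum(
        1
        for i in range(len(perm))
        for j in range(i + 1, len(perm))
        if perm[i] > perm[j]
    )
    return -1 if inversions % 2 else 1

def leibniz_det(A):
    """Sum of the terms I_sigma = sgn(sigma) * prod_i A[i, sigma(i)]."""
    n = A.shape[0]
    return sum(
        sign(s) * np.prod([A[i, s[i]] for i in range(n)])
        for s in permutations(range(n))
    )

rng = np.random.default_rng(0)
A = rng.choice([-1.0, 1.0], size=(5, 5))        # a random sign matrix
print(leibniz_det(A), np.linalg.det(A))

# Monte Carlo estimate of E (det A_4)^2 for real gaussian matrices.
samples = rng.standard_normal((20000, 4, 4))
second_moment = np.mean(np.linalg.det(samples) ** 2)
print(second_moment)
```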
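It turns out, though, that this is not quite best possible. This is easiest to explain in the real gaussian case, by performing a computation first made by Goodman. In this case, $\det A_n$ is clearly symmetrical, so we can focus attention on the magnitude $|\det A_n|$. We can interpret this quantity geometrically as the volume of an $n$-dimensional parallelopiped whose generating vectors $X_1,\ldots,X_n$ are independent real gaussian vectors in ${\bf R}^n$ (i.e. their coefficients are iid with law $N(0,1)_{\bf R}$). Using the classical base-times-height formula, we thus have
$$ |\det A_n| = \prod_{i=1}^n \hbox{dist}(X_i, V_i) $$
where $V_i$ is the $i-1$-dimensional linear subspace of ${\bf R}^n$ spanned by $X_1,\ldots,X_{i-1}$ (note that $X_1,\ldots,X_n$, having an absolutely continuous joint distribution, are almost surely linearly independent). Taking logarithms, we conclude
$$ \log |\det A_n| = \sum_{i=1}^n \log \hbox{dist}(X_i, V_i). \ \ \ \ \ (2) $$
Now, we take advantage of a fundamental symmetry property of the Gaussian vector distribution, namely its invariance with respect to the orthogonal group $O(n)$. Because of this, we see that if we fix $X_1,\ldots,X_{i-1}$ (and thus $V_i$), the random variable $\hbox{dist}(X_i,V_i)$ has the same distribution as $\hbox{dist}(X_i, {\bf R}^{i-1})$, or equivalently the $\chi$ distribution
$$ \chi_{n-i+1} := (\sum_{j=1}^{n-i+1} \xi_j^2)^{1/2} $$
where $\xi_1,\ldots,\xi_{n-i+1}$ are iid copies of $N(0,1)_{\bf R}$. As this distribution does not depend on the $X_1,\ldots,X_{i-1}$, we conclude that the law of $\log |\det A_n|$ is given by the sum of $n$ independent $\log \chi$-variables:
$$ \log |\det A_n| \equiv \sum_{i=1}^n \log \chi_{n-i+1}. $$
A standard computation shows that each $\chi_{n-i+1}^2$ has mean $n-i+1$ and variance $2(n-i+1)$, and then a Taylor series (or Ito calculus) computation (using concentration of measure tools to control tails) shows that $\log \chi_{n-i+1}$ has mean $\frac{1}{2} \log(n-i+1) - \frac{1}{2(n-i+1)} + O(\frac{1}{(n-i+1)^{3/2}})$ and variance $\frac{1}{2(n-i+1)} + O(\frac{1}{(n-i+1)^{3/2}})$. As such, $\log |\det A_n|$ has mean $\frac{1}{2} \log n! - \frac{1}{2} \log n + O(1)$ and variance $\frac{1}{2} \log n + O(1)$. Applying a suitable version of the central limit theorem, one obtains the asymptotic law
$$ \frac{\log |\det A_n| - \frac{1}{2} \log n! + \frac{1}{2} \log n}{\sqrt{\frac{1}{2} \log n}} \rightarrow N(0,1)_{\bf R}, \ \ \ \ \ (3) $$
where $\rightarrow$ denotes convergence in distribution. A bit more informally, we have
$$ |\det A_n| \approx n^{-1/2} \sqrt{n!} \exp( N( 0, \log n / 2 )_{\bf R} ) \ \ \ \ \ (4) $$
when $A_n$ is a real gaussian matrix; thus, for instance, the median value of $|\det A_n|$ is $n^{-1/2+o(1)} \sqrt{n!}$. At first glance, this appears to conflict with the second moment bound ${\bf E} |\det A_n|^2 = n!$ of Turán mentioned earlier, but once one recalls that $\exp(N(0,t)_{\bf R})$ has a second moment of $\exp(2t)$, we see that the two facts are in fact perfectly consistent; the upper tail of the normal distribution in the exponent in (4) ends up dominating the second moment.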
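The base-times-height identity and the resulting sum of independent log-chi variables can be checked numerically. In the sketch below (names are assumptions for illustration), the distances are computed by Gram-Schmidt on the rows of a gaussian matrix, and the law of the log-determinant is then sampled directly from independent chi-square variables with $1, 2, \ldots, n$ degrees of freedom.

```python
import math
import numpy as np

def row_distances(A):
    """dist(X_i, V_i) for the rows X_i of A, where V_i = span(X_1, ..., X_{i-1})."""
    basis, dists = [], []
    for x in A:
        r = x - sum((q @ x) * q for q in basis)   # component of x orthogonal to V_i
        d = np.linalg.norm(r)
        dists.append(d)
        basis.append(r / d)                       # extend the orthonormal basis
    return np.array(dists)

rng = np.random.default_rng(1)
A = rng.standard_normal((6, 6))
lhs = np.linalg.slogdet(A)[1]                     # log |det A|
rhs = np.sum(np.log(row_distances(A)))            # sum of log dist(X_i, V_i)
print(lhs, rhs)

# Sample sum_i log chi_i via chi-square variables and compare with the
# centring (1/2) log n! - (1/2) log n and the variance scale (1/2) log n.
n, trials = 200, 5000
dof = np.arange(1, n + 1)
samples = 0.5 * np.log(rng.chisquare(dof, size=(trials, n))).sum(axis=1)
centre = 0.5 * math.lgamma(n + 1) - 0.5 * math.log(n)
print(samples.mean() - centre, samples.var(), 0.5 * math.log(n))
```

The sample mean should differ from the centring only by a bounded amount, and the sample variance should be within a bounded amount of $\frac{1}{2} \log n$; both comparisons hold only up to the $O(1)$ errors in the text.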
It turns out that the central limit theorem (3) is valid for any real iid matrix with mean zero, variance one, and an exponential decay condition on the entries; this was first claimed by Girko, though the arguments in that paper appear to be incomplete. Another proof of this result, with more quantitative bounds on the convergence rate has been recently obtained by Hoi Nguyen and Van Vu. The basic idea in these arguments is to express the sum in (2) in terms of a martingale and apply the martingale central limit theorem.
If one works with complex gaussian random matrices instead of real gaussian random matrices, the above computations change slightly (one has to replace the real $\chi$ distribution with the complex $\chi$ distribution, in which the $\xi_j$ are distributed according to the complex gaussian $N(0,1)_{\bf C}$ instead of the real one). At the end of the day, one ends up with the law
$$ \frac{\log |\det A_n| - \frac{1}{2} \log n! + \frac{1}{4} \log n}{\sqrt{\frac{1}{4} \log n}} \rightarrow N(0,1)_{\bf R}. $$
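We can now turn to the results of our paper. Here, we replace the iid matrices $A_n$ by Wigner matrices $M_n = (\xi_{ij})_{1 \leq i,j \leq n}$, which are defined similarly but are constrained to be Hermitian (or real symmetric), thus $\xi_{ij} = \overline{\xi_{ji}}$ for all $i,j$. Model examples here include the Gaussian Unitary Ensemble (GUE), in which $\xi_{ij} \equiv N(0,1)_{\bf C}$ for $1 \leq i < j \leq n$ and $\xi_{ij} \equiv N(0,1)_{\bf R}$ for $1 \leq i = j \leq n$, the Gaussian Orthogonal Ensemble (GOE), in which $\xi_{ij} \equiv N(0,1)_{\bf R}$ for $1 \leq i < j \leq n$ and $\xi_{ij} \equiv N(0,2)_{\bf R}$ for $1 \leq i = j \leq n$, and the symmetric Bernoulli ensemble, in which $\xi_{ij} \equiv \pm 1$ for $1 \leq i \leq j \leq n$ (with probability $1/2$ of either sign). In all cases, the upper triangular entries of the matrix are assumed to be jointly independent. For a more precise definition of the Wigner matrix ensembles we are considering, see the introduction to our paper.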
The determinants $\det M_n$ of these matrices still have a Leibniz expansion. However, in the Wigner case, the mean and variance of the $I_\sigma$ are slightly different, and what is worse, they are not all pairwise uncorrelated any more. For instance, the mean of $I_\sigma$ is still usually zero, but equals $(-1)^{n/2}$ in the exceptional case when $\sigma$ is a perfect matching (i.e. the union of exactly $n/2$ $2$-cycles, a possibility that can of course only happen when $n$ is even). As such, the mean ${\bf E} \det M_n$ still vanishes when $n$ is odd, but for even $n$ it is equal to
$$ (-1)^{n/2} \frac{n!}{(n/2)! 2^{n/2}} $$
(the fraction here simply being the number of perfect matchings on $n$ vertices). Using Stirling’s formula, one then computes that $|{\bf E} \det M_n|$ is comparable to $n^{-1/4} \sqrt{n!}$ when $n$ is large and even. The second moment calculation is more complicated (and uses facts about the distribution of cycles in random permutations, mentioned in this previous post), but one can compute that ${\bf E} |\det M_n|^2$ is comparable to $n^{1/2} n!$ for GUE and $n^{3/2} n!$ for GOE. (The discrepancy here comes from the fact that in the GOE case, $I_\sigma$ and $I_\rho$ can correlate when $\rho$ contains reversals of $k$-cycles of $\sigma$ for $k \geq 3$, but this does not happen in the GUE case.) For GUE, much more precise asymptotics for the moments of the determinant are known, starting from the work of Brezin and Hikami, though we do not need these more sophisticated computations here.
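The count of perfect matchings and the Stirling comparison can be checked directly. The following sketch (function names are hypothetical) verifies the counting formula by brute force for a small even size, and then computes the ratio between the number of perfect matchings and $n^{-1/4} \sqrt{n!}$ using logarithms to avoid overflow.

```python
import math
from itertools import permutations

def num_perfect_matchings(n):
    """n! / ((n/2)! 2^(n/2)) for even n."""
    return math.factorial(n) // (math.factorial(n // 2) * 2 ** (n // 2))

def count_matching_permutations(n):
    """Brute force: permutations that are unions of disjoint 2-cycles."""
    return sum(
        1
        for s in permutations(range(n))
        if all(s[i] != i and s[s[i]] == i for i in range(n))
    )

def ratio_to_heuristic(n):
    """(number of perfect matchings) / (n^(-1/4) sqrt(n!)), via logarithms."""
    log_ratio = (math.log(num_perfect_matchings(n))
                 + 0.25 * math.log(n)
                 - 0.5 * math.lgamma(n + 1))
    return math.exp(log_ratio)

print(count_matching_permutations(6), num_perfect_matchings(6))
for n in (10, 100, 1000):
    print(n, ratio_to_heuristic(n))
```

The ratio stays bounded and settles near $(2/\pi)^{1/4}$, as Stirling’s formula predicts, which is the sense in which the mean is comparable to $n^{-1/4} \sqrt{n!}$.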
Our main results are then as follows.
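Theorem 1 Let $M_n$ be a Wigner matrix.
- If $M_n$ is drawn from GUE, then
$$ \frac{\log |\det M_n| - \frac{1}{2} \log n! + \frac{1}{4} \log n}{\sqrt{\frac{1}{2} \log n}} \rightarrow N(0,1)_{\bf R}. $$
- If $M_n$ is drawn from GOE, then
$$ \frac{\log |\det M_n| - \frac{1}{2} \log n! + \frac{1}{4} \log n}{\sqrt{\log n}} \rightarrow N(0,1)_{\bf R}. $$
- The previous two results also hold for more general Wigner matrices, assuming that the real and imaginary parts are independent, a finite moment condition is satisfied, and the entries match moments with those of GOE or GUE to fourth order. (See the paper for a more precise formulation of the result.)
Thus, we informally have
$$ |\det M_n| \approx n^{-1/4} \sqrt{n!} \exp( N( 0, \log n / 2 )_{\bf R} ) $$
when $M_n$ is drawn from GUE, or from another Wigner ensemble matching GUE to fourth order (and obeying some additional minor technical hypotheses); and
$$ |\det M_n| \approx n^{-1/4} \sqrt{n!} \exp( N( 0, \log n )_{\bf R} ) $$
when $M_n$ is drawn from GOE, or from another Wigner ensemble matching GOE to fourth order. Again, these asymptotic limiting distributions are consistent with the asymptotic behaviour for the second moments.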
The extension from the GUE or GOE case to more general Wigner ensembles is a fairly routine application of the four moment theorem for Wigner matrices, although for various technical reasons we do not quite use the existing four moment theorems in the literature, but adapt them to the log determinant. The main idea is to express the log-determinant as an integral
of . Strictly speaking, the integral in (7) is divergent at infinity (and also can be ill-behaved near zero), but this can be addressed by standard truncation and renormalisation arguments (combined with known facts about the least singular value of Wigner matrices), which we omit here. We then use a variant of the four moment theorem for the Stieltjes transform, as used by Erdos, Yau, and Yin (based on a previous four moment theorem for individual eigenvalues introduced by Van Vu and myself). The four moment theorem is proven by the now-standard Lindeberg exchange method, combined with the usual resolvent identities to control the behaviour of the resolvent (and hence the Stieltjes transform) with respect to modifying one or two entries, together with the delocalisation of eigenvector property (which in turn arises from local semicircle laws) to control the error terms.
Somewhat surprisingly (to us, at least), it turned out that it was the first part of the theorem (namely, the verification of the limiting law for the invariant ensembles GUE and GOE) that was more difficult than the extension to the Wigner case. Even in an ensemble as highly symmetric as GUE, the rows are no longer independent, and the formula (2) is basically useless for getting any non-trivial control on the log determinant. There is an explicit formula for the joint distribution of the eigenvalues of GUE (or GOE), which does eventually give the distribution of the cumulants of the log determinant, which then gives the required central limit theorem; but this is a lengthy computation, first performed by Delannay and Le Caer.
Following a suggestion of my colleague, Rowan Killip, we give an alternate proof of this central limit theorem in the GUE and GOE cases, by using a beautiful observation of Trotter, namely that the GUE or GOE ensemble can be conjugated into a tractable tridiagonal form. Let me state it just for GUE:
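Proposition 2 (Tridiagonal form of GUE) (Trotter) Let $M'_n$ be the random tridiagonal real symmetric matrix
$$ M'_n = \begin{pmatrix} a_1 & b_1 & 0 & \ldots & 0 & 0 \\ b_1 & a_2 & b_2 & \ldots & 0 & 0 \\ 0 & b_2 & a_3 & \ldots & 0 & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\ 0 & 0 & 0 & \ldots & a_{n-1} & b_{n-1} \\ 0 & 0 & 0 & \ldots & b_{n-1} & a_n \end{pmatrix} $$
where the $a_1,\ldots,a_n, b_1,\ldots,b_{n-1}$ are jointly independent real random variables, with $a_1,\ldots,a_n \equiv N(0,1)_{\bf R}$ being standard real Gaussians, and each $b_i$ having a $\chi$-distribution:
$$ b_i = (\sum_{j=1}^i |z_{i,j}|^2)^{1/2} $$
where $z_{i,j} \equiv N(0,1)_{\bf C}$ are iid complex gaussians. Let $M_n$ be drawn from GUE. Then the joint eigenvalue distribution of $M_n$ is identical to the joint eigenvalue distribution of $M'_n$.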
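Proof: Let $M_n$ be drawn from GUE. We can write
$$ M_n = \begin{pmatrix} M_{n-1} & X_n \\ X_n^* & a_n \end{pmatrix} $$
where $M_{n-1}$ is drawn from the $n-1 \times n-1$ GUE, $a_n \equiv N(0,1)_{\bf R}$, and $X_n \in {\bf C}^{n-1}$ is a random gaussian vector with all entries iid with distribution $N(0,1)_{\bf C}$. Furthermore, $M_{n-1}, X_n, a_n$ are jointly independent.
We now apply the tridiagonal matrix algorithm. Let $b_{n-1} := |X_n|$, then $b_{n-1}$ has the $\chi$-distribution indicated in the proposition. We then conjugate $M_n$ by a unitary matrix $U$ that preserves the final basis vector $e_n$, and maps $X_n$ to $b_{n-1} e_{n-1}$. Then we have
$$ U M_n U^* = \begin{pmatrix} \tilde M_{n-1} & b_{n-1} e_{n-1} \\ b_{n-1} e_{n-1}^* & a_n \end{pmatrix} $$
where $\tilde M_{n-1}$ is conjugate to $M_{n-1}$. Now we make the crucial observation: because $M_{n-1}$ is distributed according to GUE (which is a unitarily invariant ensemble), and $U$ is a unitary matrix independent of $M_{n-1}$, $\tilde M_{n-1}$ is also distributed according to GUE, and remains independent of both $b_{n-1}$ and $a_n$.
We continue this process, expanding $U M_n U^*$ as
$$ \begin{pmatrix} M_{n-2} & X_{n-1} & 0 \\ X_{n-1}^* & a_{n-1} & b_{n-1} \\ 0 & b_{n-1} & a_n \end{pmatrix}. $$
Applying a further unitary conjugation that fixes $e_{n-1}, e_n$ but maps $X_{n-1}$ to $b_{n-2} e_{n-2}$, we may replace $X_{n-1}$ by $b_{n-2} e_{n-2}$ while transforming $M_{n-2}$ to another GUE matrix $\tilde M_{n-2}$ independent of $a_n, b_{n-1}, a_{n-1}, b_{n-2}$. Iterating this process, we eventually obtain a coupling of $M_n$ to $M'_n$ by unitary conjugations, and the claim follows.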
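The unitary conjugations in the proof can be imitated numerically with Householder reflections. The sketch below is only an illustration under stated assumptions: it eliminates columns starting from the first one rather than from the last row as in the proof, the function names are hypothetical, and it checks only that the eigenvalues are preserved (it says nothing about the distribution of the resulting entries).

```python
import numpy as np

def sample_gue(n, rng):
    """Hermitian matrix with N(0,1)_R diagonal and N(0,1)_C entries above it."""
    Z = (rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))) / np.sqrt(2)
    upper = np.triu(Z, 1)
    return upper + upper.conj().T + np.diag(rng.standard_normal(n))

def tridiagonalise(M):
    """Householder reduction of a Hermitian matrix to a real symmetric
    tridiagonal matrix with the same eigenvalues."""
    A = M.astype(complex).copy()
    n = A.shape[0]
    for k in range(n - 2):
        x = A[k + 1:, k].copy()
        norm_x = np.linalg.norm(x)
        if norm_x == 0:
            continue
        phase = x[0] / abs(x[0]) if x[0] != 0 else 1.0
        v = x.copy()
        v[0] += phase * norm_x            # reflection sends x to -phase*|x| e_1
        v /= np.linalg.norm(v)
        H = np.eye(n, dtype=complex)
        H[k + 1:, k + 1:] -= 2.0 * np.outer(v, v.conj())
        A = H @ A @ H.conj().T            # unitary conjugation
    a = np.real(np.diag(A))
    b = np.abs(np.diag(A, -1))            # a diagonal unitary removes the phases
    return np.diag(a) + np.diag(b, 1) + np.diag(b, -1)

rng = np.random.default_rng(2)
M = sample_gue(8, rng)
T = tridiagonalise(M)
print(np.max(np.abs(np.linalg.eigvalsh(M) - np.linalg.eigvalsh(T))))
```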
The determinant of a tridiagonal matrix is not quite as simple as the determinant of a triangular matrix (in which it is simply the product of the diagonal entries), but it is pretty close: the determinant $D_n$ of the above matrix is given by solving the recursion
$$ D_i = a_i D_{i-1} - b_{i-1}^2 D_{i-2} $$
with $D_0 = 1$ and $D_{-1} = 0$. Thus, instead of the product of a sequence of independent scalar $\chi$ distributions as in the gaussian matrix case, the determinant of GUE ends up being controlled by the product of a sequence of independent $2 \times 2$ matrices whose entries are given by gaussians and $\chi$ distributions. In this case, one cannot immediately take logarithms and hope to get something for which the martingale central limit theorem can be applied, but some ad hoc manipulation of these $2 \times 2$ matrix products eventually does make this strategy work. (Roughly speaking, one has to work with the logarithm of the Frobenius norm of the matrix first.)
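The three-term recursion for the determinant of a symmetric tridiagonal matrix is easy to implement and to compare with a general-purpose determinant. The sketch below uses a hypothetical helper `tridiagonal_det`, and samples the entries as in the tridiagonal model, with the square of each off-diagonal entry drawn from a Gamma distribution (an assumption equivalent to the sum of squared moduli of iid complex gaussians).

```python
import numpy as np

def tridiagonal_det(a, b):
    """Determinant of the symmetric tridiagonal matrix with diagonal a and
    off-diagonal b, via D_i = a_i D_{i-1} - b_{i-1}^2 D_{i-2},
    started from D_0 = 1 and D_{-1} = 0."""
    d_prev, d = 0.0, 1.0
    for i in range(len(a)):
        off = b[i - 1] if i > 0 else 0.0
        d_prev, d = d, a[i] * d - off ** 2 * d_prev
    return d

rng = np.random.default_rng(3)
n = 8
a = rng.standard_normal(n)                       # diagonal: standard real gaussians
b = np.sqrt(rng.gamma(np.arange(1, n)))          # b_i^2 ~ Gamma(i, 1)
T = np.diag(a) + np.diag(b, 1) + np.diag(b, -1)
print(tridiagonal_det(a, b), np.linalg.det(T))
```

Each step of the recursion multiplies the vector $(D_{i-1}, D_{i-2})$ by a $2 \times 2$ matrix built from $a_i$ and $b_{i-1}$, which is the matrix product referred to in the text.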
