
Let {F} be a field. A definable set over {F} is a set of the form

\displaystyle  \{ x \in F^n | \phi(x) \hbox{ is true} \} \ \ \ \ \ (1)

where {n} is a natural number, and {\phi(x)} is a predicate involving the ring operations {+,\times} of {F}, the equality symbol {=}, an arbitrary number of constants and free variables in {F}, the quantifiers {\forall, \exists}, boolean operators such as {\vee,\wedge,\neg}, and parentheses and colons, where the quantifiers are always understood to be over the field {F}. Thus, for instance, the set of quadratic residues

\displaystyle  \{ x \in F | \exists y: x = y \times y \}

is definable over {F}, and any algebraic variety over {F} is also a definable set over {F}. Henceforth we will abbreviate “definable over {F}” simply as “definable”.
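
As a concrete illustration of the formalism, the quadratic residue set can be enumerated by brute force, interpreting the existential quantifier as a finite search over the field. Here is a minimal sketch in Python (the prime {p=13} is an arbitrary choice):

    # Enumerate the definable set { x in F_p : exists y : x = y*y }
    # by interpreting the quantifier "exists y" as a search over F_p.
    p = 13  # an arbitrary small prime; F = F_p
    F = range(p)
    quadratic_residues = {x for x in F if any(x == (y * y) % p for y in F)}
    print(sorted(quadratic_residues))  # [0, 1, 3, 4, 9, 10, 12] when p = 13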

If {F} is a finite field, then every subset of {F^n} is definable, since finite sets are automatically definable. However, we can obtain a more interesting notion in this case by restricting the complexity of a definable set. We say that {E \subset F^n} is a definable set of complexity at most {M} if {n \leq M}, and {E} can be written in the form (1) for some predicate {\phi} of length at most {M} (where all operators, quantifiers, relations, variables, constants, and punctuation symbols are considered to have unit length). Thus, for instance, a hypersurface in {n} dimensions of degree {d} would be a definable set of complexity {O_{n,d}(1)}. We will then be interested in the regime where the complexity remains bounded, but the field size (or field characteristic) becomes large.

In a recent paper, I established (in the large characteristic case) the following regularity lemma for dense definable graphs, which significantly strengthens the Szemerédi regularity lemma in this context, by eliminating “bad” pairs, giving a polynomially strong regularity, and also giving definability of the cells:

Lemma 1 (Algebraic regularity lemma) Let {F} be a finite field, let {V,W} be definable non-empty sets of complexity at most {M}, and let {E \subset V \times W} also be definable with complexity at most {M}. Assume that the characteristic of {F} is sufficiently large depending on {M}. Then we may partition {V = V_1 \cup \ldots \cup V_m} and {W = W_1 \cup \ldots \cup W_n} with {m,n = O_M(1)}, with the following properties:

  • (Definability) Each of the {V_1,\ldots,V_m,W_1,\ldots,W_n} are definable of complexity {O_M(1)}.
  • (Size) We have {|V_i| \gg_M |V|} and {|W_j| \gg_M |W|} for all {i=1,\ldots,m} and {j=1,\ldots,n}.
  • (Regularity) We have

    \displaystyle  |E \cap (A \times B)| = d_{ij} |A| |B| + O_M( |F|^{-1/4} |V| |W| ) \ \ \ \ \ (2)

    for all {i=1,\ldots,m}, {j=1,\ldots,n}, {A \subset V_i}, and {B\subset W_j}, where {d_{ij}} is a rational number in {[0,1]} with numerator and denominator {O_M(1)}.
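
Though the proof is deferred below the fold, the shape of the regularity estimate (2) can be tested numerically in a model case. The sketch below (in Python) uses the Paley-type graph {E = \{ (v,w) \in F_p \times F_p: v-w \hbox{ is a nonzero quadratic residue} \}}, for which the trivial partition {m=n=1} with {d_{11}=1/2} already obeys (2); the prime and the random test sets are arbitrary choices, and the sketch illustrates only the estimate, not the partitioning argument.

    import random

    # Paley-type graph over F_p: (v,w) is an edge iff v - w is a nonzero
    # quadratic residue.  With the trivial partition (m = n = 1) and
    # d_11 = 1/2, estimate (2) predicts
    #     |E n (A x B)| = (1/2)|A||B| + O(p^{-1/4} p^2).
    p = 1009  # an arbitrary prime congruent to 1 mod 4 (so E is symmetric)
    squares = {(y * y) % p for y in range(1, p)}

    def edge_count(A, B):
        return sum(1 for v in A for w in B if (v - w) % p in squares)

    random.seed(0)
    A = random.sample(range(p), p // 3)
    B = random.sample(range(p), p // 3)
    discrepancy = edge_count(A, B) - 0.5 * len(A) * len(B)
    print(abs(discrepancy), p ** 1.75)  # the discrepancy sits far below p^{7/4}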

My original proof of this lemma was quite complicated, based on an explicit calculation of the “square”

\displaystyle  \mu(w,w') := |\{ v \in V: (v,w), (v,w') \in E \}|

of {E} using the Lang-Weil bound and some facts about the étale fundamental group. It was the reliance on the latter which was the main reason why the result was restricted to the large characteristic setting. (I then applied this lemma to classify expanding polynomials over finite fields of large characteristic, but I will not discuss these applications here; see this previous blog post for more discussion.)

Recently, Anand Pillay and Sergei Starchenko (and independently, Udi Hrushovski) have observed that the theory of the étale fundamental group is not necessary in the argument, and the lemma can in fact be deduced from quite general model-theoretic techniques, in particular using (a local version of) the concept of stability. One consequence of this new proof is that the hypothesis of large characteristic can be omitted; the lemma is now known to be valid for arbitrary finite fields {F} (although its content is trivial unless the field is sufficiently large depending on the complexity bound {M}).

Inspired by this, I decided to see if I could find yet another proof of the algebraic regularity lemma, again avoiding the theory of the étale fundamental group. It turns out that the spectral proof of the Szemerédi regularity lemma (discussed in this previous blog post) adapts very nicely to this setting. The key fact needed about definable sets over finite fields is that their cardinality takes on an essentially discrete set of values. More precisely, we have the following fundamental result of Chatzidakis, van den Dries, and Macintyre:

Proposition 2 Let {F} be a finite field, and let {M > 0}.

  • (Discretised cardinality) If {E} is a non-empty definable set of complexity at most {M}, then one has

    \displaystyle  |E| = c |F|^d + O_M( |F|^{d-1/2} ) \ \ \ \ \ (3)

    where {d = O_M(1)} is a natural number, and {c} is a positive rational number with numerator and denominator {O_M(1)}. In particular, we have {|F|^d \ll_M |E| \ll_M |F|^d}.

  • (Definable cardinality) Assume {|F|} is sufficiently large depending on {M}. If {V}, {W}, and {E \subset V \times W} are definable sets of complexity at most {M}, so that {E_w := \{ v \in V: (v,w) \in E \}} can be viewed as a definable subset of {V} that is definably parameterised by {w \in W}, then for each natural number {d = O_M(1)} and each positive rational {c} with numerator and denominator {O_M(1)}, the set

    \displaystyle  \{ w \in W: |E_w| = c |F|^d + O_M( |F|^{d-1/2} ) \} \ \ \ \ \ (4)

    is definable with complexity {O_M(1)}, where the implied constants in the asymptotic notation used to define (4) are the same as those appearing in (3). (Informally: the “dimension” {d} and “measure” {c} of {E_w} depend definably on {w}.)
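
For instance, for the quadratic residue set from the introduction one has {d=1} and {c=1/2}, which is easily confirmed numerically (a quick sketch; the primes are arbitrary choices):

    # The set E = { x in F_p : exists y : x = y*y } has cardinality
    # (p+1)/2 = (1/2)|F| + O(1), consistent with (3) for d = 1, c = 1/2.
    for p in [101, 1009, 10007]:
        E = {(y * y) % p for y in range(p)}
        print(p, len(E), len(E) - 0.5 * p)  # deviation well within O_M(p^{1/2})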

We will take this proposition as a black box; a proof can be obtained by combining the description of definable sets over pseudofinite fields (discussed in this previous post) with the Lang-Weil bound (discussed in this previous post). (The former fact is phrased using nonstandard analysis, but one can use standard compactness-and-contradiction arguments to convert such statements to statements in standard analysis, as discussed in this post.)

The above proposition places severe restrictions on the cardinality of definable sets; for instance, it shows that one cannot have a definable set of complexity at most {M} and cardinality {|F|^{1/2}}, if {|F|} is sufficiently large depending on {M}. If {E \subset V} are definable sets of complexity at most {M}, it shows that {|E| = (c+ O_M(|F|^{-1/2})) |V|} for some rational {0\leq c \leq 1} with numerator and denominator {O_M(1)}; furthermore, if {c=0}, we may improve this bound to {|E| = O_M( |F|^{-1} |V|)}. In particular, we obtain the following “self-improving” properties:

  • If {E \subset V} are definable of complexity at most {M} and {|E| \leq \epsilon |V|} for some {\epsilon>0}, then (if {\epsilon} is sufficiently small depending on {M} and {F} is sufficiently large depending on {M}) this forces {|E| = O_M( |F|^{-1} |V| )}.
  • If {E \subset V} are definable of complexity at most {M} and {||E| - c |V|| \leq \epsilon |V|} for some {\epsilon>0} and positive rational {c}, then (if {\epsilon} is sufficiently small depending on {M,c} and {F} is sufficiently large depending on {M,c}) this forces {|E| = c |V| + O_M( |F|^{-1/2} |V| )}. (A numerical illustration of this {|F|^{-1/2}} fluctuation is sketched below.)
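
The {|F|^{-1/2}} fluctuation in (3) is genuinely attained in general; for instance, for the definable set cut out by the affine curve {y^2 = x^3 - x} (an arbitrary choice of elliptic curve), the Hasse bound gives {|E| = |F| + O(|F|^{1/2})}, i.e. {c=1} and {d=1}. A numerical sketch, computing the Legendre symbol by Euler's criterion:

    # Count points on the affine curve y^2 = x^3 - x over F_p.  Hasse's
    # bound gives |E| = p + O(p^{1/2}), i.e. (3) with c = 1 and d = 1,
    # and here the O(p^{1/2}) fluctuation is genuinely attained.
    def legendre(a, p):
        a %= p
        if a == 0:
            return 0
        return 1 if pow(a, (p - 1) // 2, p) == 1 else -1

    for p in [101, 1009, 10007]:
        # the number of y with y^2 = f(x) is exactly 1 + legendre(f(x), p)
        count = sum(1 + legendre(x ** 3 - x, p) for x in range(p))
        print(p, count, (count - p) / p ** 0.5)  # normalised deviation is O(1)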

It turns out that these self-improving properties can be applied to the coefficients of various matrices (basically powers of the adjacency matrix associated to {E}) that arise in the spectral proof of the regularity lemma to significantly improve the bounds in that lemma; we describe how this is done below the fold. We also make some connections to the stability-based proofs of Pillay-Starchenko and Hrushovski.
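
One can already see the spectral mechanism in the Paley-type model case from before: the adjacency matrix has a single large singular value of size comparable to the edge density times {|F|}, with all remaining singular values of size {O(|F|^{1/2})}, and it is this spectral gap that drives the polynomially strong error terms. A sketch (same arbitrary prime as before):

    import numpy as np

    # Singular values of the adjacency matrix of the Paley-type graph on
    # F_p: the top singular value is ~ p/2 (density 1/2 times p), while
    # every other singular value is O(p^{1/2}).
    p = 1009
    squares = {(y * y) % p for y in range(1, p)}
    T = np.array([[1.0 if (v - w) % p in squares else 0.0
                   for w in range(p)] for v in range(p)])
    sv = np.linalg.svd(T, compute_uv=False)
    print(sv[0], sv[1], np.sqrt(p))  # ~ p/2, then already below sqrt(p)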


Van Vu and I have just uploaded to the arXiv our survey paper “From the Littlewood-Offord problem to the Circular Law: universality of the spectral distribution of random matrices“, submitted to Bull. Amer. Math. Soc. This survey recaps (avoiding most of the technical details) the recent work of ourselves and others that exploits the inverse theory for the Littlewood-Offord problem (which, roughly speaking, amounts to figuring out what types of random walks exhibit concentration at any given point), and shows how this leads to bounds on condition numbers, least singular values, and resolvents of random matrices; and how the latter in turn leads to universality of the empirical spectral distributions (ESDs) of random matrices, and in particular to the circular law for the ESDs of iid random matrices with zero mean and unit variance (see my previous blog post on this topic, or my Lewis lectures). We conclude by mentioning a few open problems in the subject.

While this subject does unfortunately contain a large amount of technical theory and detail, every so often we find a very elementary observation that simplifies the work required significantly. One such observation is an identity, which we call the negative second moment identity, that I would like to discuss here. Let A be an n \times n matrix; for simplicity we assume that the entries are real-valued. Denote the n rows of A by X_1,\ldots,X_n, which we view as vectors in {\Bbb R}^n. Let \sigma_1(A) \geq \ldots \geq \sigma_n(A) \geq 0 be the singular values of A. In our applications, the vectors X_j are easily described (e.g. they might be randomly distributed on the discrete cube \{-1,1\}^n), but the distribution of the singular values \sigma_j(A) is much more mysterious, and understanding this distribution is a key objective in this entire theory.
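
The identity itself is stated and applied below the fold, but it is easy to check numerically: for an invertible n \times n matrix A with rows X_1,\ldots,X_n, one has \sum_{j=1}^n \sigma_j(A)^{-2} = \sum_{j=1}^n \hbox{dist}(X_j, V_j)^{-2}, where V_j is the span of the other n-1 rows. Here is a quick verification sketch in Python/NumPy (the 8 \times 8 random sign matrix is an arbitrary choice):

    import numpy as np

    # Negative second moment identity: for an invertible n x n matrix A
    # with rows X_1, ..., X_n,
    #     sum_j sigma_j(A)^{-2} = sum_j dist(X_j, V_j)^{-2},
    # where V_j is the span of the other n-1 rows.
    rng = np.random.default_rng(0)
    n = 8
    A = rng.choice([-1.0, 1.0], size=(n, n))  # rows drawn from the cube {-1,1}^n
    assert abs(np.linalg.det(A)) > 1e-9       # re-sample in the (rare) singular case

    lhs = np.sum(np.linalg.svd(A, compute_uv=False) ** (-2.0))

    rhs = 0.0
    for j in range(n):
        others = np.delete(A, j, axis=0)                # the rows X_i with i != j
        Q, _ = np.linalg.qr(others.T)                   # orthonormal basis of V_j
        dist = np.linalg.norm(A[j] - Q @ (Q.T @ A[j]))  # distance from X_j to V_j
        rhs += dist ** (-2.0)

    print(lhs, rhs)  # the two sums agree up to rounding error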


This is my second Milliman lecture, in which I talk about recent applications of ideas from additive combinatorics (and in particular, from the inverse Littlewood-Offord problem) to the theory of discrete random matrices.

Van Vu and I have recently uploaded our joint paper, “Random matrices: the circular law“, submitted to Contributions to Discrete Mathematics. In this paper we come close to fully resolving the circular law conjecture regarding the eigenvalue distribution of random matrices, for arbitrary choices of coefficient distribution.

More precisely, suppose we have an n \times n matrix N_n for some large n, where each coefficient x_{ij} of N_n is an independent identically distributed copy of a single random variable x (possibly complex-valued). x could be continuous (e.g. a Gaussian) or discrete (e.g. a Bernoulli random variable, taking values +1 and -1 with equal probability). For simplicity, let us normalise x to have mean 0 and variance 1 (in particular, the second moment is finite). This matrix will not be self-adjoint or normal, but we still expect it to be diagonalisable, with n complex eigenvalues. Heuristic arguments suggest that these eigenvalues should mostly have magnitude O(\sqrt{n}); for instance, one can see this by observing that the square of the Hilbert-Schmidt norm (a.k.a. the Frobenius norm), namely \hbox{tr} N_n^* N_n, which can be shown to dominate the sum of squares of the magnitudes of the eigenvalues, is of size comparable to n^2 on the average. Because of this, it is customary to normalise the matrix by 1/\sqrt{n}; thus let \lambda_1,\ldots,\lambda_n be the n complex eigenvalues of \frac{1}{\sqrt{n}} N_n, arranged in any order.

Numerical evidence (as seen for instance here) soon reveals that these n eigenvalues appear to distribute themselves uniformly in the unit disk \{ z \in {\Bbb C}: |z| \leq 1 \} in the limit n \to \infty. This phenomenon is known as the circular law. It can be made more precise; if we define the empirical spectral distribution \mu_n: {\Bbb R}^2 \to [0,1] to be the function

\mu_n(s,t) := \frac{1}{n} \# \{ 1 \leq k \leq n: \hbox{Re}(\lambda_k) \leq s; \hbox{Im}(\lambda_k) \leq t \}

then with probability 1, \mu_n should converge uniformly to the uniform distribution \mu_\infty on the unit disk, defined as

\mu_\infty(s,t) := \frac{1}{\pi} \hbox{mes}(\{ (x,y): x^2 + y^2 \leq 1; x \leq s; y \leq t \}).

This statement is known as the circular law conjecture. In the case when x is a complex Gaussian, this law was verified by Mehta (using an explicit formula of Ginibre for the joint density function of the eigenvalues in this case). A strategy for attacking the general case was then formulated by Girko, although a fully rigorous execution of that strategy was first achieved by Bai (and then improved slightly by Bai and Silverstein). They established the circular law under the assumption that x had slightly better than bounded second moment (i.e. {\Bbb E}|x|^{2+\delta} < \infty for some \delta > 0), but more importantly that the probability density function of x in the complex plane was bounded (in particular, this ruled out all discrete random variables, such as the Bernoulli random variable). The reason for this latter restriction was in order to control the event that the matrix N_n (or more precisely \frac{1}{\sqrt{n}} N_n - z I for various complex numbers z) becomes too ill-conditioned by having a very small least singular value.
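
(The conjecture is also easy to probe by simulation: under the uniform distribution on the unit disk, the fraction of eigenvalues of modulus at most r should approach r^2. Here is a minimal sketch; the Bernoulli signs and the dimension n = 1000 are arbitrary choices.)

    import numpy as np

    # Circular law check: for the uniform measure on the unit disk, the
    # fraction of eigenvalues of (1/sqrt(n)) N_n with modulus <= r tends to r^2.
    rng = np.random.default_rng(0)
    n = 1000
    N = rng.choice([-1.0, 1.0], size=(n, n))  # iid Bernoulli +-1 entries
    eigs = np.linalg.eigvals(N / np.sqrt(n))
    for r in [0.25, 0.5, 0.75, 1.0]:
        print(r, np.mean(np.abs(eigs) <= r), r ** 2)  # empirical vs r^2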

In the last few years, work of Rudelson, of myself with Van Vu, and of Rudelson-Vershynin (building upon earlier work of Kahn, Komlós, and Szemerédi, and of Van and myself) has opened the way to control the condition number of random matrices even when the matrices are discrete, and so there has been a recent string of results using these techniques to extend the circular law to discrete settings. In particular, Götze and Tikhomirov established the circular law for discrete random variables which were sub-Gaussian, which was then relaxed by Pan and Zhou to an assumption of bounded fourth moment. In our paper, we get very close to the ideal assumption of bounded second moment; we need {\Bbb E} |x|^2 \log^C(1+|x|) < \infty for some C > 16. (The exponent 16 in the logarithm can certainly be improved, though our methods do not allow the logarithm to be removed entirely.)

The main new difficulty that arises when relaxing the moment condition so close to the optimal one is that one begins to lose control on the largest singular value of \frac{1}{\sqrt{n}} N_n, i.e. on the operator norm of \frac{1}{\sqrt{n}} N_n. Under high moment assumptions (e.g. fourth moment) one can keep this operator norm bounded with reasonable probability (especially after truncating away some exceptionally large elements), but when the moment conditions are loosened, one can only bound this operator norm by a quantity bounded polynomially in n, even after truncation. This in turn causes certain metric entropy computations to become significantly more delicate, as one has to reduce the scale \epsilon of the net below what one would ordinarily like to have.
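
This loss of control is already visible in simulation; the following sketch compares Gaussian entries with symmetrised Pareto entries of tail index 2.5 (an arbitrary choice with finite variance but infinite fourth moment), both normalised to unit empirical variance:

    import numpy as np

    # With finite fourth moment, the operator norm of (1/sqrt(n)) N_n stays
    # bounded (approaching 2 for unit-variance entries); with little more
    # than a second moment, it grows polynomially in n.
    rng = np.random.default_rng(0)
    alpha = 2.5  # tail index: finite variance, infinite fourth moment
    for n in [200, 400, 800]:
        G = rng.standard_normal((n, n))
        P = rng.pareto(alpha, size=(n, n)) * rng.choice([-1.0, 1.0], size=(n, n))
        P /= P.std()  # normalise to unit (empirical) variance
        print(n,
              np.linalg.norm(G, 2) / np.sqrt(n),  # stays near 2
              np.linalg.norm(P, 2) / np.sqrt(n))  # grows with n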

