Van Vu and I have just uploaded to the arXiv our paper “Random matrices: A general approach for the least singular value problem“, submitted to Israel J. Math.. This paper continues a recent series of papers by ourselves and also by Rudelson and by RudelsonVershynin on understanding the least singular value $\sigma_n(A) := \inf_{\|v\|=1} \|Av\|$ of a large random $n \times n$ random complex matrix A. There are many random matrix models that one can consider, but here we consider models of the form $A = M_n + N_n$, where $M_n = {\Bbb E}(A)$ is a deterministic matrix depending on n, and $N_n$ is a random matrix whose entries are iid with some complex distribution x of mean zero and unit variance. (In particular, this model is useful for studying the normalised resolvents $(\frac{1}{\sqrt{n}} N_n - zI)^{-1}$ of random iid matrices $N_n$, which are of importance in the spectral theory of these matrices; understanding the least singular value of random perturbations of deterministic matrices is also important in numerical analysis, and particularly in smoothed analysis of algorithms such as the simplex method.)

In the model mean zero case $M_n = 0$, the normalised singular values $\frac{1}{\sqrt{n}} \sigma_1(A) \geq \ldots \geq \frac{1}{\sqrt{n}} \sigma_n(A) \geq 0$ of $A = N_n$ are known to be asymptotically distributed according to the Marchenko-Pastur distribution $\frac{1}{\pi} (4-x^2)_+^{1/2} dx$, which in particular implies that most of the singular values are continuously distributed (via a semicircular distribution) in the interval ${}[0, 2\sqrt{n}]$. (Assuming only second moment hypotheses on the underlying distribution x, this result is due to Yin; there are many earlier results assuming stronger hypotheses on x.) This strongly suggests, but does not formally prove, that the least singular value $\sigma_n(A)$ should be of size $\sim 1/\sqrt{n}$ on the average. (To get such a sharp bound on the least singular value via the Marchenko-Pastur law would require an incredibly strong bound on the convergence rate to this law, which seems out of reach at present, especially when one does not assume strong moment conditions on x; current results such as those of Götze-Tikhomirov or Chatterjee-Bose give some upper bound on $\sigma_n(A)$ which improves upon the trivial bound of $O(n^{1/2})$ by a polynomial factor assuming certain moment conditions on x, but as far as I am aware these bounds do not get close to the optimal value of $O(n^{-1/2})$, except perhaps in the special case when x is Gaussian.) The statement that $\sigma_n(A) \sim 1/\sqrt{n}$ with high probability has been conjectured (in various forms) in a number of places, for instance by von Neumann, by Smale, and by Spielman-Teng.

There have been several papers establishing lower tail bounds on the least singular value consistent with the above conjecture of varying degrees of strength, under various hypotheses on the matrix $M_n$ and distribution x. For instance:

1. In 1988, Edelman showed that ${\Bbb P}( \sigma_n(A) \leq \varepsilon/\sqrt{n}) = O( \varepsilon )$ for any $\varepsilon > 0$ when $M_n=0$ and x is Gaussian (there is also a matching lower bound when $\varepsilon = O(1)$).
2. Some non-trivial bounds on ${\Bbb P}( \sigma_n(A) \leq \varepsilon/\sqrt{n})$ for very small $\varepsilon$ (polynomially small in size) are established implicitly in the case when $M_n$ is a multiple of the identity in the papers of Girko and of Bai assuming some moment and continuity assumptions on x, although these bounds are not specified explicitly.
3. In 2005, Rudelson showed that ${\Bbb P}( \sigma_n(A) \leq \varepsilon/\sqrt{n}) = O( n \varepsilon + n^{-1/2})$ when $M_n=0$ and x is subgaussian.
4. In 2005, Van Vu and I showed that for every $C_1 > 0$ there existed $C_2 > 0$ such that ${\Bbb P}( \sigma_n(A) \leq n^{-C_2}) = O( n^{-C_1} )$, when $M_n=0$ and x is the Bernoulli distribution (equal to +1 or -1 with a equal probability of each).
5. In 2006, Senkar, Teng, and Spielman extended Edelman’s result to the case when $M_n$ is nonzero (but x is still Gaussian).
6. In 2007, Rudelson-Vershynin showed that ${\Bbb P}( \sigma_n(A) \leq \varepsilon/\sqrt{n})$ is bounded by $O( \varepsilon^c )$ for $\varepsilon$ independent of n when $M_n=0$ and x has bounded fourth moment, and is bounded by $O( \varepsilon + c^n)$ for all $\varepsilon > 0$ if x is subgaussian.
7. In 2007, Van Vu and I showed that our earlier result extends to the case when $M_n$ is non-zero, as long as it has polynomial size in n and is also integer-valued; we also allow x to be more general than the Bernoulli distribution, but still integer-valued. (Our restriction to the integer case was due to a certain rounding trick that converted finite precision arithmetic to exact arithmetic, but only when the expressions involved ultimately come from integers.)
8. In 2007, Van Vu and I removed the integrality assumptions on our previous results, thus establishing the bound ${\Bbb P}( \sigma_n(A) \leq n^{-C_2}) = O( n^{-C_1} )$ whenever M has polynomial size and with no assumptions on x other than zero mean and unit variance. (See also my earlier blog post on this paper.)
9. In 2008, Rudelson and Vershynin obtained generalisations of their previous results to rectangular matrices (thus also generalising some previous work of Litvak, Pajor, Rudelson, and Tomczak-Jaegermann), but still for the case $M_n=0$ and with subgaussian hypotheses on x.

These recent results are largely based on entropy (or “epsilon-net”) arguments, combined with conditioning arguments, with the entropy bounds in turn originating from inverse Littlewood-Offord theorems; see my Lewis lectures for further discussion.

The current paper is a partial unification of the results of Rudelson and Vershynin (which give very sharp tail estimates, but under strong hypotheses on M and x) and ourselves (which have very general assumptions on M and x, but a weak tail estimate). A little more precisely, we obtain an estimate of the form

${\Bbb P}( \sigma_n(A) \leq \varepsilon/\sqrt{n}) \leq O( n^{o(1)} \varepsilon + n^{-C} + {\Bbb P}( \|N_n\| \geq C \sqrt{n} ) )$

for any fixed $C > 0$, assuming that $\|M\| = O(\sqrt{n})$. This result almost recovers those of Rudelson and Vershynin under subgaussian hypotheses on x (which are known to imply exponentially good bounds on ${\Bbb P}( \|N_n\| \geq C \sqrt{n} )$) in the case when $\varepsilon$ has polynomial size, except that we lose a factor of $n^{o(1)}$. One has analogous results here in which $\|M\|$ and $\|N_n\|$ are bounded by some weaker power of n than $n^{1/2}$; roughly speaking, if $\|M\|$ and $\|N_n\|$ have size $O(n^\gamma)$, then we can show that $\sigma_n(A)$ is at least $n^{-(2C+1) \gamma}$ with probability $1 - O(n^{-C+o(1)})$ for any given C.

We also give a simple example that show that the deterioration of these bounds when $\|M\|$ gets large is necessary; in particular, we show that the universality of the bounds of Edelman type can break down once $\|M\|$ exceeds $n^2$, because one can design M to force the existence of an unusually small value of $Av = (M +N_n)v$ for a certain type of unit vector v.

Our methods here are based on those of earlier papers, particularly our circular law paper; the key innovation is to run a certain scale pigeonholing argument as efficiently as possible, so that one only loses factors of $n^{o(1)}$ in the final bound.

[Update, May 22: some typos corrected.]