You are currently browsing the tag archive for the ‘epsilon net argument’ tag.
Now we turn attention to another important spectral statistic, the least singular value of an matrix (or, more generally, the least non-trivial singular value of a matrix with ). This quantity controls the invertibility of . Indeed, is invertible precisely when is non-zero, and the operator norm of is given by . This quantity is also related to the condition number of , which is of importance in numerical linear algebra. As we shall see in the next set of notes, the least singular value of (and more generally, of the shifts for complex ) will be of importance in rigorously establishing the circular law for iid random matrices , as it plays a key role in computing the Stieltjes transform of such matrices, which as we have already seen is a powerful tool in understanding the spectra of random matrices.
The least singular value
which sits at the “hard edge” of the spectrum, bears a superficial similarity to the operator norm
at the “soft edge” of the spectrum, that was discussed back in Notes 3, so one may at first think that the methods that were effective in controlling the latter, namely the epsilon-net argument and the moment method, would also work to control the former. The epsilon-net method does indeed have some effectiveness when dealing with rectangular matrices (in which the spectrum stays well away from zero), but the situation becomes more delicate for square matrices; it can control some “low entropy” portions of the infimum that arise from “structured” or “compressible” choices of , but are not able to control the “generic” or “incompressible” choices of , for which new arguments will be needed. As for the moment method, this can give the coarse order of magnitude (for instance, for rectangular matrices with for , it gives an upper bound of for the singular value with high probability, thanks to the Marchenko-Pastur law), but again this method begins to break down for square matrices, although one can make some partial headway by considering negative moments such as , though these are more difficult to compute than positive moments .
So one needs to supplement these existing methods with additional tools. It turns out that the key issue is to understand the distance between one of the rows of the matrix , and the hyperplane spanned by the other rows. The reason for this is as follows. First suppose that , so that is non-invertible, and there is a linear dependence between the rows . Thus, one of the will lie in the hyperplane spanned by the other rows, and so one of the distances mentioned above will vanish; in fact, one expects many of the distances to vanish. Conversely, whenever one of these distances vanishes, one has a linear dependence, and so .
More generally, if the least singular value is small, one generically expects many of these distances to be small also, and conversely. Thus, control of the least singular value is morally equivalent to control of the distance between a row and the hyperplane spanned by the other rows. This latter quantity is basically the dot product of with a unit normal of this hyperplane.
When working with random matrices with jointly independent coefficients, we have the crucial property that the unit normal (which depends on all the rows other than ) is independent of , so even after conditioning to be fixed, the entries of remain independent. As such, the dot product is a familiar scalar random walk, and can be controlled by a number of tools, most notably Littlewood-Offord theorems and the Berry-Esséen central limit theorem. As it turns out, this type of control works well except in some rare cases in which the normal is “compressible” or otherwise highly structured; but epsilon-net arguments can be used to dispose of these cases. (This general strategy was first developed for the technically simpler singularity problem by Komlós, and then extended to the least singular value problem by Rudelson.)
These methods rely quite strongly on the joint independence on all the entries; it remains a challenge to extend them to more general settings. Even for Wigner matrices, the methods run into difficulty because of the non-independence of some of the entries (although it turns out one can understand the least singular value in such cases by rather different methods).
To simplify the exposition, we shall focus primarily on just one specific ensemble of random matrices, the Bernoulli ensemble of random sign matrices, where are independent Bernoulli signs. However, the results can extend to more general classes of random matrices, with the main requirement being that the coefficients are jointly independent.