I have just uploaded to the arXiv the third installment of my “heatwave” project, entitled “Global regularity of wave maps V. Large data local well-posedness in the energy class”. This (rather technical) paper establishes another of the key ingredients necessary to establish the global existence of smooth wave maps from 2+1-dimensional spacetime to hyperbolic space $\mathbf{H}^m$. Specifically, a large data local well-posedness result is established, constructing a local solution from any initial data with finite (but possibly quite large) energy, and furthermore showing that the solution depends continuously on the initial data in the energy topology. (This topology was constructed in my previous paper.) Once one has this result, the only remaining task is to show a “Palais-Smale property” for wave maps, in that if singularities form in the wave maps equation, then there exists a non-trivial minimal-energy blowup solution, whose orbit is almost periodic modulo the symmetries of the equation. I anticipate this to be the most difficult component of the whole project; it is the subject of the fourth (and hopefully final) installment of this series.
This local result is closely related to the small energy global regularity theory developed in recent years by myself, by Krieger, and by Tataru. In particular, the complicated function spaces used in that theory (which ultimately originate from a precursor paper of Tataru) are reused here. The main new difficulties here are to extend the small energy theory to large energy (by localising time suitably), and to establish continuous dependence on the data (i.e. two solutions which are initially close in the energy topology need to stay close in that topology). The former difficulty is in principle manageable by exploiting finite speed of propagation (together with the fact, arising from the monotone convergence theorem, that large energy data becomes small energy data at sufficiently small spatial scales), but for technical reasons (having to do with my choice of gauge) I was not able to do this and had to deal with the large energy case directly (and in any case, a genuinely large energy theory is going to be needed to construct the minimal energy blowup solution in the next paper). The latter difficulty is in principle resolvable by adapting the existence theory to differences of solutions, rather than to individual solutions, but the nonlinear choice of gauge adds a rather tedious amount of complexity to the task of making this rigorous. (It may be that simpler gauges, such as the Coulomb gauge, are usable here, at least in the case of the hyperbolic plane (cf. the work of Krieger), but such gauges cause additional analytic problems as they do not renormalise the nonlinearity as strongly as the caloric gauge. The paper of Tataru establishes these goals, but assumes an isometric embedding of the target manifold into a Euclidean space, which is unfortunately not available for hyperbolic space targets.)
The main technical difficulty that had to be overcome in the paper was that there were two different time variables t, s (one for the wave maps equation and one for the heat flow), and three types of PDE (hyperbolic, parabolic, and ODE) that one has to solve forward in t, forward in s, and backwards in s respectively. In order to close the argument in the large energy case, this necessitated a rather complicated iteration-type scheme, in which one solved for the caloric gauge, established parabolic regularity estimates for that gauge, propagated a “wave-tension field” by the heat flow, and then solved a wave maps type equation using that field as a forcing term. The argument can eventually be closed using mostly “off-the-shelf” function space estimates from previous papers, but is remarkably lengthy, especially when analysing differences of two solutions. (One drawback of using off-the-shelf estimates, though, is that one does not get particularly good control of the solution over extended periods of time; in particular, the spaces used here cannot detect the decay of the solution over extended periods of time (unlike, say, Strichartz spaces $L^q_t L^r_x$ for $q < \infty$) and so will not be able to supply the long-time perturbation theory that will be needed in the next paper in this series. I believe I know how to re-engineer these spaces to achieve this, though, and the details should follow in the forthcoming paper.)
Van Vu and I have just uploaded to the arXiv our new paper, “Random matrices: Universality of ESDs and the circular law”, with an appendix by Manjunath Krishnapur (and some numerical data and graphs by Philip Wood). One of the things we do in this paper (which was our original motivation for this project) is to finally establish the endpoint case of the circular law (in both strong and weak forms) for random iid matrices $A_n = (a_{ij})_{1 \leq i,j \leq n}$, where the coefficients $a_{ij}$ are iid random variables with mean zero and unit variance. (The strong circular law says that with probability 1, the empirical spectral distribution (ESD) of the normalised eigenvalues $\frac{1}{\sqrt{n}} \lambda_1, \ldots, \frac{1}{\sqrt{n}} \lambda_n$ of the matrix $A_n$ converges to the uniform distribution on the unit disk as $n \to \infty$. The weak circular law asserts the same thing, but with convergence in probability rather than almost sure convergence; this is in complete analogy with the weak and strong law of large numbers, and in fact this law is used in the proof.) In a previous paper, we had established the same claim but under the additional assumption that the $(2+\eta)^{\mathrm{th}}$ moment $\mathbf{E} |a_{ij}|^{2+\eta}$ was finite for some $\eta > 0$; this builds upon a significant body of earlier work by Mehta, Girko, Bai, Bai-Silverstein, Götze-Tikhomirov, and Pan-Zhou, as discussed in the blog article for the previous paper.
As it turned out, though, in the course of this project we found a more general universality principle (or invariance principle) which implied our results about the circular law, but is perhaps more interesting in its own right. Observe that the statement of the circular law can be split into two sub-statements:
1. (Universality for iid ensembles) In the asymptotic limit $n \to \infty$, the ESD of the random matrix $\frac{1}{\sqrt{n}} A_n$ is independent of the choice of distribution of the coefficients, so long as they are normalised in mean and variance. In particular, the ESD of such a matrix is asymptotically the same as that of a (real or complex) gaussian matrix $\frac{1}{\sqrt{n}} G_n$ with the same mean and variance.
2. (Circular law for gaussian matrices) In the asymptotic limit $n \to \infty$, the ESD of a gaussian matrix $\frac{1}{\sqrt{n}} G_n$ converges to the circular law.
The reason we single out the gaussian matrix ensemble is that it has a much richer algebraic structure (for instance, the real (resp. complex) gaussian ensemble is invariant under right and left multiplication by the orthogonal group O(n) (resp. the unitary group U(n))). Because of this, it is possible to compute the eigenvalue distribution very explicitly by algebraic means (for instance, using the machinery of orthogonal polynomials). In particular, the circular law for complex gaussian matrices (Statement 2 above) was established all the way back in 1967 by Mehta, using an explicit formula for the distribution of the ESD in this case due to Ginibre.
These highly algebraic techniques completely break down for more general iid ensembles, such as the Bernoulli ensemble of matrices whose entries are +1 or -1 with an equal probability of each. Nevertheless, it is a remarkable phenomenon - which has been referred to as universality in the literature, for instance in this survey by Deift - that the spectral properties of random matrices for non-algebraic ensembles are in many cases asymptotically indistinguishable in the limit from those of algebraic ensembles with the same mean and variance (i.e. Statement 1 above). One might view this as a sort of “non-Hermitian, non-commutative” analogue of the universality phenomenon represented by the central limit theorem, in which the limiting distribution of a normalised average
$\overline{X}_n := \frac{X_1 + \ldots + X_n}{\sqrt{n}}$ (1)
of an iid sequence depends only on the mean and variance of the elements of that sequence (assuming of course that these quantities are finite), and not on the underlying distribution. (The Hermitian non-commutative analogue of the CLT is known as Wigner’s semicircular law.)
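The universality phenomenon is easy to probe numerically. The following is a minimal sketch and is not taken from the paper: it assumes NumPy, and the function names, the matrix size and the radii tested are illustrative choices. It samples one Bernoulli and one real gaussian matrix, divides the eigenvalues by $\sqrt{n}$, and compares the fraction lying within distance r of the origin with the value $r^2$ that the uniform distribution on the unit disk would give.

```python
import numpy as np

def normalised_eigenvalues(matrix):
    """Eigenvalues of an n x n matrix, divided by sqrt(n)."""
    n = matrix.shape[0]
    return np.linalg.eigvals(matrix) / np.sqrt(n)

def fraction_within(eigs, r):
    """Fraction of the eigenvalues lying in the closed disk of radius r."""
    return float(np.mean(np.abs(eigs) <= r))

def sample_esds(n=300, seed=0):
    """Sample one Bernoulli and one real gaussian matrix (sizes are illustrative)."""
    rng = np.random.default_rng(seed)
    bernoulli = rng.choice([-1.0, 1.0], size=(n, n))  # entries of mean 0, variance 1
    gaussian = rng.standard_normal((n, n))            # entries of mean 0, variance 1
    return normalised_eigenvalues(bernoulli), normalised_eigenvalues(gaussian)

bern, gauss = sample_esds()
for r in (0.25, 0.5, 0.75):
    # Columns: radius, Bernoulli fraction, gaussian fraction, circular law prediction.
    print(r, fraction_within(bern, r), fraction_within(gauss, r), r ** 2)
```

A single sample at a moderate value of n can only suggest the asymptotic statement, of course; it proves nothing about the limit.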
Previous approaches to the circular law did not build upon the gaussian case, but instead proceeded directly, in particular controlling the ESD of a random matrix $A_n$ via estimates on the Stieltjes transform
$s_n(z) := \frac{1}{n} \mathrm{tr} \left( \frac{1}{\sqrt{n}} A_n - z I \right)^{-1}$ (2)
of that matrix for complex numbers z. This method required a combination of delicate analysis (in particular, a bound on the least singular values of $\frac{1}{\sqrt{n}} A_n - zI$), and algebra (in order to compute and then invert the Stieltjes transform). [As a general rule, and oversimplifying somewhat, algebra tends to be used to control main terms in a computation, while analysis is used to control error terms.]
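For concreteness, here is a small numerical sketch of the Stieltjes transform; it is an illustration under stated assumptions rather than anything from the papers mentioned. It assumes the usual normalisation $s_n(z) = \frac{1}{n} \mathrm{tr} (\frac{1}{\sqrt{n}} A_n - zI)^{-1}$, uses NumPy, and all function names are hypothetical. The transform is computed once from the resolvent and once from the eigenvalues. For $|z| > 1$ the uniform measure on the unit disk has Stieltjes transform $-1/z$ (by the mean value property), which gives a simple sanity check.

```python
import numpy as np

def stieltjes_from_resolvent(a, z):
    """(1/n) tr((A/sqrt(n) - z I)^{-1}), computed by inverting the matrix."""
    n = a.shape[0]
    resolvent = np.linalg.inv(a / np.sqrt(n) - z * np.eye(n))
    return np.trace(resolvent) / n

def stieltjes_from_eigenvalues(a, z):
    """The same quantity, computed as the average of 1/(lambda_j/sqrt(n) - z)."""
    n = a.shape[0]
    eigs = np.linalg.eigvals(a) / np.sqrt(n)
    return np.mean(1.0 / (eigs - z))

rng = np.random.default_rng(1)
a = rng.choice([-1.0, 1.0], size=(200, 200))  # Bernoulli entries (illustrative choice)
z = 2.0 + 1.0j                                 # a point outside the unit disk
s_res = stieltjes_from_resolvent(a, z)
s_eig = stieltjes_from_eigenvalues(a, z)
print(s_res, s_eig, -1 / z)
```

The point z was deliberately chosen far from the spectrum, where the resolvent is well conditioned. The delicate part of the method is precisely the opposite regime, where z lies inside the disk and the least singular value of $\frac{1}{\sqrt{n}} A_n - zI$ has to be bounded from below.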
What we discovered while working on our paper was that the algebra and analysis could be largely decoupled from each other: that one could establish a universality principle (Statement 1 above) by relying primarily on tools from analysis (most notably the bound on least singular values mentioned earlier, but also Talagrand’s concentration of measure inequality, and a universality principle for the singular value distribution of random matrices due to Dozier and Silverstein), so that the algebraic heavy lifting only needs to be done in the gaussian case (Statement 2 above) where the task is greatly simplified by all the additional algebraic structure available in that setting. This suggests a possible strategy for proving other conjectures in random matrices (for instance concerning the eigenvalue spacing distribution of random iid matrices), by first establishing universality to swap the general random matrix ensemble with an algebraic ensemble (without fully understanding the limiting behaviour of either), and then using highly algebraic tools to understand the latter ensemble. (There is now a sophisticated theory in place to deal with the latter task, but the former task - understanding universality - is still only poorly understood in many cases.)
I’ve just uploaded to the arXiv the paper “Global existence and uniqueness results for weak solutions of the focusing mass-critical non-linear Schrödinger equation”, submitted to Analysis & PDE. This paper is concerned with solutions to the focusing mass-critical NLS equation
$i u_t + \Delta u = -|u|^{4/d} u$, (1)
where the only regularity we assume on the solution is that the mass $M(u(t)) := \int_{\mathbf{R}^d} |u(t,x)|^2\ dx$ is finite and locally bounded in time. (For sufficiently strong notions of solution, the mass is in fact conserved, but part of the point with this paper is that mass conservation breaks down when the solution becomes too weak.) Note that the mass is dimensionless (i.e. scale-invariant) with respect to the natural scale invariance $u(t,x) \mapsto \frac{1}{\lambda^{d/2}} u(\frac{t}{\lambda^2}, \frac{x}{\lambda})$ for this equation. For various technical reasons I work in high dimensions $d \geq 4$ (this in particular allows the nonlinearity in (1) to be locally integrable in space).
In the classical (smooth) category, there is no ambiguity as to what it means for a function u to “solve” an equation such as (1); but once one is in a low regularity class (such as the class of finite mass solutions), there are several competing notions of solution, in particular the notions of a strong solution and a weak solution. To oversimplify a bit, both strong and weak solutions solve (1) in a distributional sense, but strong solutions are also continuous in time (in the space of functions of finite mass). A canonical example here is given by the pseudoconformally transformed soliton blowup solution
$u(t,x) := \frac{1}{|t|^{d/2}} e^{-i/t} e^{i|x|^2/4t} Q(x/t)$ (2)
to (1), where Q is a solution to the ground state equation $\Delta Q + Q^{1+4/d} = Q$. This solution is a strong solution on (say) the time interval $(-\infty, 0)$, but cannot be continued as a strong solution beyond time zero due to the discontinuity at t=0. Nevertheless, it can be continued as a weak solution by extending by zero at t=0 and at $t > 0$ (or alternatively, one could extend for $t > 0$ using (2)); thus there is no uniqueness for the initial value problem in the weak solution class. (Note this example also shows that weak solutions need not conserve mass; all the mass in (2) concentrates into the spatial origin as $t \to 0$ and disappears in the limit t=0.)
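The loss of mass in this example can be checked by a change of variables. The computation below is a sketch that assumes the standard form of the pseudoconformally transformed soliton, $u(t,x) = |t|^{-d/2} e^{-i/t} e^{i|x|^2/4t} Q(x/t)$, with Q of finite mass; the explicit formula is assumed here for the purposes of illustration. Substituting $y = x/t$ gives

```latex
\int_{\mathbf{R}^d} |u(t,x)|^2\, dx
  = \frac{1}{|t|^d} \int_{\mathbf{R}^d} |Q(x/t)|^2\, dx
  = \int_{\mathbf{R}^d} |Q(y)|^2\, dy
  \qquad (t \neq 0),
\\[1ex]
\int_{|x| \geq r} |u(t,x)|^2\, dx
  = \int_{|y| \geq r/|t|} |Q(y)|^2\, dy \;\to\; 0
  \qquad \text{as } t \to 0^-, \text{ for each fixed } r > 0 .
```

So the mass is constant for negative times, yet all of it collapses into the spatial origin as t approaches zero, and the extension by zero retains none of it.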
There is a slightly stronger notion than a strong solution, which I call a Strichartz-class solution, in which one adds an additional regularity assumption $u \in L^2_{t,\mathrm{loc}} L^{2d/(d-2)}_x$. This assumption is natural from the point of view of Strichartz estimates, which are a major tool in the analysis of such equations.
There is a vast theory for the initial value problem for these sorts of equations, but basically one has the following situation: in the category of Strichartz class solutions, one has local existence and uniqueness, but not global existence (as the example (2) already shows); at the other extreme, in the category of weak solutions, one has global existence, but not uniqueness (as (2) again shows).
(This contrast between strong and weak solutions shows up in many other PDE as well. For instance, global existence of smooth solutions to the Navier-Stokes equation is one of the Clay Millennium problems that I have blogged about before, but global existence of weak solutions is quite easy with today’s technology and was first done by Leray back in 1933.)
In this paper, I introduce a new solution class, which I call the semi-Strichartz class; rather than being continuous in time, it varies right-continuously (in both the mass space and the Strichartz space) in time in the future of the initial time $t_0$, and left-continuously in the past of $t_0$. With this tweak of the definition, it turns out that one has both global existence and uniqueness in this class. (For instance, if one started with the initial data u(-1) given by (2) at time t=-1, the unique global semi-Strichartz solution from this initial data would be given by (2) for negative times and by zero for non-negative times.) This notion of solution is analogous to (but much, much simpler than) the notion of Ricci flow with surgery used by Hamilton and Perelman; basically, every time a singularity develops, the semi-Strichartz solution removes the portion of mass that was becoming discontinuous, leaving only the non-singular portion of the solution to continue onwards in time.
Peter Petersen and I have just uploaded to the arXiv our paper, “Classification of Almost Quarter-Pinched Manifolds”, submitted to Proc. Amer. Math. Soc. This is perhaps the shortest paper (3 pages) I have ever been involved in, because we were fortunate enough that we could simply cite (as a black box) a reference for every single fact that we needed here.
The paper is related to the famous sphere theorem from Riemannian geometry. This theorem asserts that any n-dimensional complete simply connected Riemannian manifold which is strictly quarter-pinched (i.e. the sectional curvatures all lie in the interval $(K/4, K]$ for some $K > 0$) must necessarily be homeomorphic to the n-sphere $S^n$. (In dimensions 3 or less, this already follows from simple connectedness thanks to the Poincaré conjecture (and Myers theorem), so the theorem is really only interesting in higher dimensions. One can easily drop the simple connectedness hypothesis by passing to a universal cover, but then one has to admit sphere quotients $S^n/\Gamma$ as well as spheres.)
Due to the existence of exotic spheres in higher dimensions, being homeomorphic to a sphere does not necessarily imply being diffeomorphic to a sphere. (For instance, an example of an exotic sphere with positive sectional curvature (but not quarter-pinched) was recently constructed by Petersen and Wilhelm.) Nevertheless, Brendle and Schoen recently proved the diffeomorphic version of the sphere theorem: every strictly quarter-pinched complete simply connected Riemannian manifold is diffeomorphic to a sphere. The proof is based on Ricci flow, and involves three main steps:
- A verification that if M is quarter-pinched, then the manifold $M \times \mathbf{R}^2$ has non-negative isotropic curvature. (The same statement is true without adding the two additional flat dimensions, but these additional dimensions are very convenient for simplifying the analysis by allowing certain two-planes to wander freely in the product tangent space.)
- A verification that the property of having non-negative isotropic curvature is preserved by Ricci flow. (By contrast, the quarter-pinched property is not preserved by Ricci flow.)
- The pinching theory of Böhm and Wilking, which is a refinement of the work of Hamilton (who handled the three and four-dimensional cases).
Brendle and Schoen in fact proved a slightly stronger statement in which the curvature bound K is allowed to vary with position x, but we will not discuss this strengthening here.
The quarter-pinching is sharp; the Fubini-Study metric on complex projective spaces is non-strictly quarter-pinched (the sectional curvatures lie in $[K/4, K]$) but is not homeomorphic to a sphere. Nevertheless, by refining the above methods, an endpoint result was established by Brendle and Schoen (see also a later refinement by Seshadri): any complete simply-connected manifold which is non-strictly quarter-pinched is diffeomorphic to either a sphere or a compact rank one symmetric space (or CROSS, for short) such as complex projective space. (In the latter case one also has some further control on the metric, which we will not detail here.) The homeomorphic version of this statement was established earlier by Berger and by Klingenberg.
Our result pushes this further by an epsilon. More precisely, we show for each dimension n that there exists $\varepsilon_n > 0$ such that any $(1/4 - \varepsilon_n)$-pinched complete simply connected manifold (i.e. the curvatures lie in $[K(1/4 - \varepsilon_n), K]$) is diffeomorphic to either a sphere or a CROSS. (The homeomorphic version of this statement was established earlier in even dimensions by Berger.) We do not know if $\varepsilon_n$ can be made independent of n.
Ben Green and I have just uploaded to the arXiv our paper, “The Möbius function is asymptotically orthogonal to nilsequences”, which is a sequel to our earlier paper “The quantitative behaviour of polynomial orbits on nilmanifolds”, which I talked about in this post. In this paper, we apply our previous results on quantitative equidistribution of polynomial orbits in nilmanifolds to settle the Möbius and nilsequences conjecture from our earlier paper, as part of our program to detect and count solutions to linear equations in primes. (The other major plank of that program, namely the inverse conjecture for the Gowers norm, remains partially unresolved at present.) Roughly speaking, this conjecture asserts the asymptotic orthogonality
$\left| \frac{1}{N} \sum_{n=1}^N \mu(n) f(n) \right| \ll_A \log^{-A} N$ (1)
between the Möbius function $\mu$ and any Lipschitz nilsequence f(n) (for any $A > 0$), by which we mean a sequence of the form $f(n) = F(g^n x)$ for some orbit $g^n x$ in a nilmanifold $G/\Gamma$, and some Lipschitz function $F: G/\Gamma \to \mathbf{C}$ on that nilmanifold. (The implied constant can depend on the nilmanifold and on the Lipschitz constant of F, but it is important that it be independent of the generator g of the orbit or the base point x.) The case when f is constant is essentially the prime number theorem; the case when f is periodic is essentially the prime number theorem in arithmetic progressions. The case when f is almost periodic (e.g. $f(n) = e^{2\pi i \alpha n}$ for some irrational $\alpha$) was established by Davenport, using the method of Vinogradov. The case when f is a 2-step nilsequence (such as the quadratic phase $f(n) = e^{2\pi i \alpha n^2}$; bracket quadratic phases such as $f(n) = e^{2\pi i \lfloor \alpha n \rfloor \beta n}$ can also be covered by an approximation argument, though the logarithmic decay in (1) is weakened as a consequence) was done by Ben and myself a few years ago, by a rather ad hoc adaptation of Vinogradov’s method. By using the equidistribution theory of nilmanifolds, we were able to apply Vinogradov’s method more systematically, and in fact the proof is relatively short (20 pages), although it relies on the 64-page predecessor paper on equidistribution. I’ll talk a little bit more about the proof after the fold.
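The orthogonality can be observed numerically in the simplest cases. What follows is a minimal sketch under stated assumptions, not code from the paper: the sieve, the function names, the cutoff N and the choice $\alpha = \sqrt{2}$ are all illustrative. It computes the average of $\mu(n) f(n)$ over $n \leq N$ for a constant f, a linear phase and a quadratic phase.

```python
import cmath
import math

def mobius_sieve(limit):
    """Return a list mu with mu[n] equal to the Mobius function of n, 0 <= n <= limit."""
    mu = [1] * (limit + 1)
    mu[0] = 0
    is_prime = [True] * (limit + 1)
    for p in range(2, limit + 1):
        if is_prime[p]:
            for m in range(p, limit + 1, p):
                if m > p:
                    is_prime[m] = False
                mu[m] = -mu[m]              # one more distinct prime factor
            for m in range(p * p, limit + 1, p * p):
                mu[m] = 0                   # divisible by the square of a prime
    return mu

def correlation(mu, phase, n_max):
    """Average of mu(n) * phase(n) over 1 <= n <= n_max."""
    total = sum(mu[n] * phase(n) for n in range(1, n_max + 1))
    return total / n_max

def linear_phase(alpha):
    return lambda n: cmath.exp(2j * math.pi * alpha * n)

def quadratic_phase(alpha):
    return lambda n: cmath.exp(2j * math.pi * alpha * n * n)

N = 20000
mu = mobius_sieve(N)
print(abs(correlation(mu, lambda n: 1.0, N)))                     # constant f
print(abs(correlation(mu, linear_phase(math.sqrt(2)), N)))       # almost periodic f
print(abs(correlation(mu, quadratic_phase(math.sqrt(2)), N)))    # 2-step example
```

A computation at one value of N says nothing about the rate of decay, and the substance of the result is the uniformity over all orbits, which an experiment of this kind cannot capture.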
There is an amusing way to interpret the conjecture (using the close relationship between nilsequences and bracket polynomials) as an assertion of the pseudorandomness of the Liouville function from a computational complexity perspective. Suppose you possess a calculator with the wonderful property of being infinite precision: it can accept arbitrarily large real numbers as input, manipulate them precisely, and also store them in memory. However, this calculator has two limitations. Firstly, the only operations available are addition, subtraction, multiplication, integer part $x \mapsto \lfloor x \rfloor$, fractional part $x \mapsto \{x\}$, memory store (into one of O(1) registers), and memory recall (from one of these O(1) registers). In particular, there is no ability to perform division. Secondly, the calculator only has a finite display screen, and when it shows a real number, it only shows O(1) digits before and after the decimal point. (Thus, for instance, the real number 1234.56789 might be displayed only as $\ldots 34.56 \ldots$.)
Now suppose you play the following game with an opponent.
- The opponent specifies a large integer d.
- You get to enter in O(1) real constants of your choice into your calculator. These can be absolute constants such as $\sqrt{2}$ and $\pi$, or they can depend on d (e.g. you can enter in $10^{-d}$).
- The opponent randomly selects a d-digit integer n, and enters n into one of the registers of your calculator.
- You are allowed to perform O(1) operations on your calculator and record what is displayed on the calculator’s viewscreen.
- After this, you have to guess whether the opponent’s number n had an odd or even number of prime factors (i.e. you guess $\lambda(n)$, where $\lambda$ is the Liouville function).
- If you guess correctly, you win $1; otherwise, you lose $1.
For instance, using your calculator you can work out the first few digits of $\{ \sqrt{2} n \lfloor \sqrt{3} n \rfloor \}$, provided of course that you entered the constants $\sqrt{2}$ and $\sqrt{3}$ in advance. You can also work out the leading digits of n by storing $10^{-d}$ in advance, and computing the first few digits of $10^{-d} n$.
Our theorem is equivalent to the assertion that as d goes to infinity (keeping the O(1) constants fixed), your probability of winning this game converges to 1/2; in other words, your calculator becomes asymptotically useless to you for the purposes of guessing whether n has an odd or even number of prime factors, and you may as well just guess randomly.
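The game is simple enough to simulate. The sketch below is an illustration under stated assumptions: double-precision floats stand in for the infinite-precision calculator, the strategy shown is one hypothetical choice (guessing from the fractional part of a bracket polynomial), and the values of d and the number of trials are arbitrary. It plays the game repeatedly and reports the fraction of correct guesses.

```python
import math
import random

def liouville(n):
    """(-1) raised to the number of prime factors of n, counted with multiplicity."""
    sign = 1
    p = 2
    while p * p <= n:
        while n % p == 0:
            n //= p
            sign = -sign
        p += 1
    if n > 1:           # one prime factor left over
        sign = -sign
    return sign

def frac(x):
    """Fractional part, one of the operations the calculator permits."""
    return x - math.floor(x)

def example_strategy(n):
    """A hypothetical strategy using O(1) operations on stored constants."""
    value = frac(math.sqrt(2) * n * math.floor(math.sqrt(3) * n))
    return 1 if value < 0.5 else -1

def win_rate(strategy, d, trials, seed=0):
    """Fraction of correct guesses over randomly chosen d-digit integers."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(trials):
        n = rng.randrange(10 ** (d - 1), 10 ** d)
        wins += strategy(n) == liouville(n)
    return wins / trials

print(win_rate(example_strategy, d=5, trials=2000))
```

Any other strategy built from the permitted operations can be passed to `win_rate` in the same way; the theorem asserts that none of them does asymptotically better than a coin toss.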
[I should mention a recent result in a similar spirit by Mauduit and Rivat; in this language, their result asserts that knowing the last few digits of the digit-sum of n does not increase your odds of guessing correctly.]
I have just uploaded to the arXiv the second installment of my “heatwave” project, entitled “Global regularity of wave maps IV. Absence of stationary or self-similar solutions in the energy class”. In the first installment of this project, I was able to establish the global existence of smooth wave maps from 2+1-dimensional spacetime to hyperbolic space $\mathbf{H}^m$ from arbitrary smooth initial data, conditionally on five claims:
1. A construction of an energy space for maps into hyperbolic space obeying a certain set of reasonable properties, such as compatibility with symmetries, approximability by smooth maps, and existence of a well-defined stress-energy tensor.
2. A large data local well-posedness result for wave maps in the above energy space.
3. The existence of an almost periodic “minimal-energy blowup solution” to the wave maps equation in the energy class, if this equation is such that singularities can form in finite time.
4. The non-existence of any non-trivial degenerate maps into hyperbolic space in the energy class, where “degenerate” means that one of the partial derivatives of this map vanishes identically.
5. The non-existence of any travelling or self-similar solution to the wave maps equation in the energy class.
In this paper, the second of four in this series (or, as the title suggests, the fourth in a series of six papers on wave maps, the first two of which can be found here and here), I verify Claims 1, 4, and 5. (The third paper in the series will tackle Claim 2, while the fourth paper will tackle Claim 3.) These claims are largely “elliptic” in nature (as opposed to the “hyperbolic” Claims 2, 3), but I will establish them by a “parabolic” method, relying very heavily on the harmonic map heat flow, and on the closely associated caloric gauge introduced in an earlier paper of mine. The results of this paper can be viewed as nonlinear analogues of standard facts about the linear energy space $\dot H^1(\mathbf{R}^2) \times L^2(\mathbf{R}^2)$, for instance the fact that smooth compactly supported functions are dense in that space, and that this space contains no non-trivial harmonic functions, or functions which are constant in one of the two spatial directions. The paper turned out a little longer than I had expected (77 pages) due to some surprisingly subtle technicalities, especially when excluding self-similar wave maps. On the other hand, the heat flow and caloric gauge machinery developed here will be reused in the last two papers in this series, hopefully keeping their length to under 100 pages as well.
A key stumbling block here, related to the critical (scale-invariant) nature of the energy space (or to the failure of the endpoint Sobolev embedding $\dot H^1(\mathbf{R}^2) \not\subset L^\infty(\mathbf{R}^2)$) is that changing coordinates in hyperbolic space can be a non-uniformly-continuous operation in the energy space. Thus, for the purposes of making quantitative estimates in that space, it is preferable to work in as covariant (or co-ordinate free) a manner as possible, or if one is to use co-ordinates, to pick them in some canonical manner which is optimally adapted to the tasks at hand. Ideally, one would work directly with maps $\phi: \mathbf{R}^2 \to \mathbf{H}^m$ (as well as their velocity field $\partial_t \phi: \mathbf{R}^2 \to T\mathbf{H}^m$) without using any coordinates on $\mathbf{H}^m$, but then it becomes difficult to perform basic analytical operations on such maps, such as taking the Fourier transform, or (even more elementarily) taking the difference of two maps in order to measure how distinct they are from each other.
I’ve uploaded a new paper to the arXiv entitled “The sum-product phenomenon in arbitrary rings”, and submitted it to Contributions to Discrete Mathematics. The sum-product phenomenon asserts, very roughly speaking, that given a finite non-empty set A in a ring R, then either the sum set $A+A := \{a+b: a,b \in A\}$ or the product set $A \cdot A := \{ab: a,b \in A\}$ will be significantly larger than A, unless A is somehow very close to being a subring of R, or if A is highly degenerate (for instance, containing a lot of zero divisors). For instance, in the case of the integers $R = \mathbf{Z}$, which has no non-trivial finite subrings, a long-standing conjecture of Erdős and Szemerédi asserts that $|A+A| + |A \cdot A| \gg_\varepsilon |A|^{2-\varepsilon}$ for every finite non-empty $A \subset \mathbf{Z}$ and every $\varepsilon > 0$. (The current best result on this problem is a very recent result of Solymosi, who shows that the conjecture holds for any $\varepsilon$ greater than 2/3.) In recent years, many other special rings have been studied intensively, most notably finite fields and cyclic groups, but also the complex numbers, quaternions, and other division algebras, and continuous counterparts in which A is now (for instance) a collection of intervals on the real line. I will not try to summarise all the work on sum-product estimates and their applications (which range from number theory to graph theory to ergodic theory to computer science) here, but I discuss this in the introduction to my paper, which has over 50 references to this literature (and I am probably still missing out on a few).
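The dichotomy is already visible in the two most basic examples over the integers. The following short sketch (with illustrative names and an arbitrary choice of size) computes the sum set and product set of an arithmetic progression, which has a small sum set but a large product set, and of a geometric progression, for which the roles are reversed.

```python
def sum_set(a):
    """A + A = {x + y : x, y in A}."""
    return {x + y for x in a for y in a}

def product_set(a):
    """A . A = {x * y : x, y in A}."""
    return {x * y for x in a for y in a}

n = 20
arithmetic = set(range(1, n + 1))        # {1, 2, ..., n}: small sum set
geometric = {2 ** k for k in range(n)}   # {1, 2, 4, ..., 2^(n-1)}: small product set

for name, a in (("arithmetic", arithmetic), ("geometric", geometric)):
    # Columns: name, |A|, |A+A|, |A.A|
    print(name, len(a), len(sum_set(a)), len(product_set(a)))
```

In each case one of the two sets has the minimal possible size 2n-1 while the other is far larger, consistent with the expectation that no set of integers can make both small at once.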
I was recently asked the question as to what could be said about the sum-product phenomenon in an arbitrary ring R, which need not be commutative or contain a multiplicative identity. Once one makes some assumptions to avoid the degenerate case when A (or related sets, such as A-A) are full of zero-divisors, it turns out that there is in fact quite a bit one can say, using only elementary methods from additive combinatorics (in particular, the Plünnecke-Ruzsa sum set theory). Roughly speaking, the main results of the paper assert that in an arbitrary ring, a set A which is non-degenerate and has small sum set and product set must be mostly contained inside a subring of R of comparable size to A, or a dilate of such a subring, though in the absence of invertible elements one sometimes has to enlarge the ambient ring R slightly before one can find the subring. At the end of the paper I specialise these results to specific rings, such as division algebras or products of division algebras, cyclic groups, or finite-dimensional algebras over fields. Generally speaking, the methods here give very good results when the set of zero divisors is sparse and easily describable, but poor results otherwise. (In particular, the sum-product theory in cyclic groups, as worked out by Bourgain and coauthors, is only recovered for groups which are the product of a bounded number of primes; the case of cyclic groups whose order has many factors seems to require a more multi-scale analysis which I did not attempt to perform in this paper.)
I’ve just uploaded to the arXiv a new paper, “Global regularity of wave maps III. Large energy from $\mathbf{R}^{1+2}$ to hyperbolic spaces”, to be submitted when three other companion papers (“Global regularity of wave maps” IV, V, and VI) are finished. This project (which I had called “Heatwave”, due to the use of a heat flow to renormalise a wave equation) has a somewhat lengthy history to it, which I will now attempt to explain.
For the last nine years or so, I have been working on and off on the global regularity problem for wave maps $\phi: \mathbf{R}^{1+d} \to M$. The wave map equation $(\phi^* \nabla)^\alpha \partial_\alpha \phi = 0$ is a nonlinear generalisation of the wave equation $\partial^\alpha \partial_\alpha \phi = 0$ in which the unknown field $\phi$ takes values in a Riemannian manifold $M$ rather than in a vector space (much as the concept of a harmonic map is a nonlinear generalisation of a harmonic function). This equation (also known as the nonlinear $\sigma$ model) is one of the simplest examples of a geometric nonlinear wave equation, and also arises as a simplified model of the Einstein equations (after making a U(1) symmetry assumption). The global regularity problem seeks to determine when smooth initial data for a wave map (i.e. an initial position $\phi_0: \mathbf{R}^d \to M$ and an initial velocity $\phi_1: \mathbf{R}^d \to TM$ tangent to the position) necessarily leads to a smooth global solution.
The problem is particularly interesting in the energy-critical dimension d=2, in which the conserved energy becomes invariant under the scaling symmetry $\phi(t,x) \mapsto \phi(t/\lambda, x/\lambda)$. (In the subcritical dimension d=1, global regularity is fairly easy to establish, and was first done by Gu and by Ladyzhenskaya-Shubov; in supercritical dimensions $d \geq 3$, examples of singularity formation are known, starting with the self-similar examples of Shatah.)
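The criticality of d=2 can be seen from a one-line scaling computation. The sketch below assumes the standard form of the energy, $E(\phi)(t) = \frac{1}{2} \int_{\mathbf{R}^d} |\partial_t \phi|^2 + |\nabla_x \phi|^2\, dx$ (with lengths measured using the metric on the target), and writes $\phi_\lambda(t,x) := \phi(t/\lambda, x/\lambda)$ for the rescaled map; both formulas are assumptions made here for illustration.

```latex
E(\phi_\lambda)(t)
  = \frac{1}{2} \int_{\mathbf{R}^d} \lambda^{-2}
      \left( |\partial_t \phi|^2 + |\nabla_x \phi|^2 \right)(t/\lambda, x/\lambda)\, dx
  = \lambda^{d-2}\, E(\phi)(t/\lambda) .
```

Each derivative contributes a factor of $\lambda^{-1}$ and the change of variables $y = x/\lambda$ contributes $\lambda^d$, so the energy is scale-invariant exactly when d=2.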
It is generally believed that in two dimensions, singularities can form when M is positively curved but that global regularity should persist when M is negatively curved, in analogy with known results (in particular, the landmark paper of Eells and Sampson) for the harmonic map heat flow (a parabolic cousin of the wave map equation). In particular, one should always have global regularity when the target is a hyperbolic space. There has been a large number of results supporting this conjecture; for instance, when the target is the sphere, examples of singularity formation have recently been constructed by Rodnianski-Sterbenz and by Krieger-Schlag-Tataru, while for suitably negatively curved manifolds such as hyperbolic space, global regularity was established assuming equivariant symmetry by Shatah and Tahvildar-Zadeh, and assuming spherical symmetry by Christodoulou and Tahvildar-Zadeh. I will not attempt to mention all the other results on this problem here, but see for instance one of these survey articles or books for further discussion.
Van Vu and I have just uploaded to the arXiv our paper “Random matrices: A general approach for the least singular value problem”, submitted to Israel J. Math. This paper continues a recent series of papers by ourselves and also by Rudelson and by Rudelson-Vershynin on understanding the least singular value of a large random $n \times n$ complex matrix A. There are many random matrix models that one can consider, but here we consider models of the form $A = M_n + N_n$, where $M_n$ is a deterministic matrix depending on n, and $N_n$ is a random matrix whose entries are iid with some complex distribution x of mean zero and unit variance. (In particular, this model is useful for studying the normalised resolvents $(\frac{1}{\sqrt{n}} N_n - zI)^{-1}$ of random iid matrices $N_n$, which are of importance in the spectral theory of these matrices; understanding the least singular value of random perturbations of deterministic matrices is also important in numerical analysis, and particularly in smoothed analysis of algorithms such as the simplex method.)
In the model mean zero case $M_n = 0$, the normalised singular values $\frac{1}{\sqrt{n}} \sigma_1(A) \geq \ldots \geq \frac{1}{\sqrt{n}} \sigma_n(A) \geq 0$ of $A = N_n$ are known to be asymptotically distributed according to the Marchenko-Pastur distribution $\frac{1}{\pi} (4-x^2)_+^{1/2}\ dx$, which in particular implies that most of the singular values are continuously distributed (via a semicircular distribution) in the interval $[0,2]$. (Assuming only second moment hypotheses on the underlying distribution x, this result is due to Yin; there are many earlier results assuming stronger hypotheses on x.) This strongly suggests, but does not formally prove, that the least singular value $\sigma_n(A)$ should be of size $\sim n^{-1/2}$ on the average. (To get such a sharp bound on the least singular value via the Marchenko-Pastur law would require an incredibly strong bound on the convergence rate to this law, which seems out of reach at present, especially when one does not assume strong moment conditions on x; current results such as those of Götze-Tikhomirov or Chatterjee-Bose give some upper bound on $\sigma_n(A)$ which improves upon the trivial bound of $O(n^{1/2})$ by a polynomial factor assuming certain moment conditions on x, but as far as I am aware these bounds do not get close to the optimal value of $O(n^{-1/2})$, except perhaps in the special case when x is Gaussian.) The statement that $\sigma_n(A) \sim n^{-1/2}$ with high probability has been conjectured (in various forms) in a number of places, for instance by von Neumann, by Smale, and by Spielman-Teng.
I’ve just uploaded to the arXiv my paper “A global compact attractor for high-dimensional defocusing non-linear Schrödinger equations with potential”, submitted to Dynamics of PDE. This paper continues some earlier work of mine in an attempt to understand the soliton resolution conjecture for various nonlinear dispersive equations, and in particular, nonlinear Schrödinger equations (NLS). This conjecture (which I also discussed in my third Simons lecture) asserts, roughly speaking, that any reasonable (e.g. bounded energy) solution to such equations eventually resolves into a superposition of a radiation component (which behaves like a solution to the linear Schrödinger equation) plus a finite number of “nonlinear bound states” or “solitons”. This conjecture is known in many perturbative cases (when the solution is close to a special solution, such as the vacuum state or a ground state) as well as in defocusing cases (in which no non-trivial bound states or solitons exist), but is still almost completely open in non-perturbative situations (in which the solution is large and not close to a special solution) which contain at least one bound state. In my earlier papers, I was able to show that for certain NLS models in sufficiently high dimension, one could at least say that such solutions resolved into a radiation term plus a finite number of “weakly bound” states whose evolution was essentially almost periodic (or almost periodic modulo translation symmetries). These bound states also enjoyed various additional decay and regularity properties. As a consequence of this, in five and higher dimensions (and for reasonable nonlinearities), and assuming spherical symmetry, I showed that there was a (local) compact attractor for the flow: any solution with energy bounded by some given level E would eventually decouple into a radiation term, plus a state which converged to this compact attractor
. In that result, I did not rule out the possibility that this attractor depended on the energy E. Indeed, it is conceivable for many models that there exist nonlinear bound states of arbitrarily high energy, which would mean that
must increase in size as E increases to accommodate these states. (I discuss these results in a recent talk of mine.)
In my new paper, following a suggestion of Michael Weinstein, I consider the NLS equation $i u_t + \Delta u = |u|^{p-1} u + Vu$, where $u: \mathbf{R} \times \mathbf{R}^d \to \mathbf{C}$ is the solution, and $V \in C^\infty_0(\mathbf{R}^d)$ is a smooth compactly supported real potential. We make the standard assumption $1 + \frac{4}{d} < p < 1 + \frac{4}{d-2}$ (which is asserting that the nonlinearity is mass-supercritical and energy-subcritical). In the absence of this potential (i.e. when V=0), this is the defocusing nonlinear Schrödinger equation, which is known to have no bound states, and in fact it is known in this case that all finite energy solutions eventually scatter into a radiation state (which asymptotically resembles a solution to the linear Schrödinger equation). However, once one adds a potential (particularly one which is large and negative), both linear bound states (solutions to the linear eigenstate equation
) and nonlinear bound states (solutions to the nonlinear eigenstate equation
) can appear. Thus in this case the soliton resolution conjecture predicts that solutions should resolve into a scattering state (that behaves as if the potential was not present), plus a finite number of (nonlinear) bound states. There is a fair amount of work towards this conjecture for this model in perturbative cases (when the energy is small), but the case of large energy solutions is still open.
In my new paper, I consider the large energy case, assuming spherical symmetry. For technical reasons, I also need to assume very high dimension . The main result is the existence of a global compact attractor K: every finite energy solution, no matter how large, eventually resolves into a scattering state and a state which converges to K. In particular, since K is bounded, all but a bounded amount of energy will be radiated off to infinity. Another corollary of this result is that the space of all nonlinear bound states for this model is compact. Intuitively, the point is that when the solution gets very large, the defocusing nonlinearity dominates any attractive aspects of the potential V, and so the solution will disperse in this case; thus one expects the only bound states to be bounded. The spherical symmetry assumption also restricts the bound states to lie near the origin, thus yielding the compactness. (It is also conceivable that the localised nature of V also restricts bound states to lie near the origin, even without the help of spherical symmetry, but I was not able to establish this rigorously.)
A few months ago, I announced that I was going to convert a significant fraction of my 2007 blog posts into a book format. For various reasons, this conversion took a little longer than I had anticipated, but I have finally completed a draft copy of this book, which I have uploaded here; note that this is a moderately large file (1.1MB), as the book is 270 pages long. There are still several formatting issues to resolve, but the content has all been converted.
It may be a while before I hear back from the editors at the American Mathematical Society as to the status of the book project, but in the meantime any comments on the book, ranging from typos to suggestions as to the format, are of course welcome.
[Update, April 21: New version uploaded, incorporating contributed corrections. The formatting has been changed for the internet version to significantly reduce the number of pages. As a consequence, note that the page numbering for the internet version of the book will differ substantially from that in the print version.]
[Update, April 21: As some readers may have noticed, I have placed paraphrased versions of some of the blog comments in the book, using the handles given in the blog comments to identify the authors. If any such commenters wish to change their handle (e.g. to their full name) or to otherwise modify or remove any comments I have placed in the book, they are welcome to contact me by email to do so.]
[Update, April 23: Another new version uploaded, incorporating contributed corrections and shrinking the page size a little further.]
[Update, May 8: A few additional corrections to the book.]
Van Vu and I have just uploaded to the arXiv our preprint “On the permanent of random Bernoulli matrices“, submitted to Adv. Math. This paper establishes analogues of some recent results on the determinant of random Bernoulli matrices (matrices in which all entries are either +1 or -1, with equal probability of each), in which the determinant is replaced by the permanent.
More precisely, let M be a random $n \times n$ Bernoulli matrix, with n large. Since every row of this matrix has magnitude $\sqrt{n}$, it is easy to see (by interpreting the determinant as the signed volume of a parallelepiped) that $|\det(M)|$ is at most $n^{n/2}$, with equality being satisfied exactly when M is a Hadamard matrix. In fact, it is known that the determinant $\det(M)$ has magnitude $n^{(1/2-o(1))n}$ with probability $1-o(1)$; for a more precise result, see my earlier paper with Van. (There is in fact believed to be a central limit theorem for $\log |\det(M)|$; see this paper of Girko for details.) These results are based upon the elementary “base times height” formula for the volume of a parallelepiped; the main difficulty is to understand what the distance is from one row of M to a subspace spanned by several other rows of M.
The permanent looks formally very similar to the determinant, but does not have a geometric interpretation as a signed volume of a parallelepiped and so can only be analysed combinatorially; the main difficulty is to understand the cancellation that can arise from the various signs in the matrix. It can be somewhat larger than the determinant; for instance, the maximum value of $|\mathrm{per}(M)|$ for a Bernoulli matrix M is $n!$, attained when M consists entirely of +1’s. Nevertheless, it is not hard to see that $\mathrm{per}(M)$ has the same mean and standard deviation as $\det(M)$, namely 0 and $\sqrt{n!}$ respectively, which shows that $|\mathrm{per}(M)|$ is at most $n^{(1/2+o(1))n}$ with probability 1-o(1). Our main result is to show that one also has that $|\mathrm{per}(M)|$ is at least $n^{(1/2-o(1))n}$ with probability 1-o(1), thus obtaining the analogue of the previously mentioned result for the determinant (though our o(1) bounds are significantly weaker).
In particular, this shows that the probability that the permanent vanishes completely is o(1) (in fact, we get a bound of for some absolute constant
). This result appears to be new (although there is a cute observation of Alon (see e.g. this paper of Wanless for a proof) that if $n$ is one less than a power of 2, then every Bernoulli matrix has non-zero permanent). In contrast, the probability that the determinant vanishes completely is conjectured to equal $(\frac{1}{2}+o(1))^n$ (which is easily seen to be a lower bound), but the best known upper bound for this probability is $(\frac{1}{\sqrt{2}}+o(1))^n$, due to Bourgain, Vu, and Wood.
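For very small n, the moment claims above can be checked by exhaustive enumeration. The following is only a brute-force sketch for illustration, not the method of the paper; the names `sign`, `det_and_per` and `bernoulli_stats` are made up for this example, and the Leibniz expansion used here is only practical for n up to about 3 or 4.

```python
from itertools import permutations, product
from math import factorial, prod


def sign(perm):
    """Sign of a permutation, computed from its inversion count."""
    inversions = sum(
        1
        for i in range(len(perm))
        for j in range(i + 1, len(perm))
        if perm[i] > perm[j]
    )
    return -1 if inversions % 2 else 1


def det_and_per(M):
    """Determinant and permanent via the Leibniz expansion (tiny n only)."""
    n = len(M)
    det = per = 0
    for perm in permutations(range(n)):
        term = prod(M[i][perm[i]] for i in range(n))
        det += sign(perm) * term  # signed sum gives the determinant
        per += term               # unsigned sum gives the permanent
    return det, per


def bernoulli_stats(n):
    """Exact sums over all 2^(n*n) matrices with entries +1 or -1."""
    stats = {
        "count": 0, "sum_det": 0, "sum_per": 0,
        "sum_det_sq": 0, "sum_per_sq": 0,
        "max_abs_per": 0, "zero_per": 0,
    }
    for entries in product((1, -1), repeat=n * n):
        M = [entries[i * n:(i + 1) * n] for i in range(n)]
        det, per = det_and_per(M)
        stats["count"] += 1
        stats["sum_det"] += det
        stats["sum_per"] += per
        stats["sum_det_sq"] += det * det
        stats["sum_per_sq"] += per * per
        stats["max_abs_per"] = max(stats["max_abs_per"], abs(per))
        stats["zero_per"] += (per == 0)
    return stats


if __name__ == "__main__":
    s = bernoulli_stats(3)
    # Mean zero and second moment n! for both det and per:
    print(s["sum_det"], s["sum_per"])
    print(s["sum_det_sq"] == s["count"] * factorial(3))
    print(s["sum_per_sq"] == s["count"] * factorial(3))
    # Largest permanent is n!, and no 3x3 sign matrix has zero permanent:
    print(s["max_abs_per"], s["zero_per"])
```

For n=3 (which is one less than a power of 2) the enumeration finds no matrix with vanishing permanent, in line with Alon's observation, whereas for n=2 half of the sixteen matrices have zero permanent.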
Over two years ago, Emmanuel Candès and I submitted the paper “The Dantzig selector: Statistical estimation when $p$ is much larger than $n$” to the Annals of Statistics. This paper, which appeared last year, proposed a new type of selector (which we called the Dantzig selector, due to its reliance on the linear programming methods to which George Dantzig, who had died as we were finishing our paper, had contributed so much) for statistical estimation, in the case when the number $p$ of unknown parameters is much larger than the number $n$ of observations. More precisely, we considered the problem of obtaining a reasonable estimate $\hat\beta$ for an unknown vector $\beta \in \mathbf{R}^p$ of parameters given a vector $y = X\beta + z \in \mathbf{R}^n$ of measurements, where $X$ is a known $n \times p$ predictor matrix and $z$ is a (Gaussian) noise error with some variance $\sigma^2$. We assumed that the predictor matrix X obeyed the restricted isometry property (RIP, also known as UUP), which roughly speaking asserts that $X\beta$ has norm comparable to $\beta$ whenever the vector $\beta$ is sparse. This RIP property is known to hold for various ensembles of random matrices of interest; see my earlier blog post on this topic.
Our selection algorithm, inspired by our previous work on compressed sensing, chooses the estimated parameters $\hat\beta$ to have minimal $\ell^1$ norm amongst all vectors which are consistent with the data in the sense that the residual vector $r := y - X\hat\beta$ obeys the condition $\|X^* r\|_{\ell^\infty} \leq \lambda$, where
(1)
(one can check that such a condition is obeyed with high probability in the case that $\hat\beta = \beta$, thus the true vector of parameters is feasible for this selection algorithm). This selector is similar, though not identical, to the more well-studied lasso selector in the literature, which minimises the $\ell^1$ norm of $\hat\beta$ penalised by the $\ell^2$ norm of the residual.
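A simple model case arises when n=p and X is the identity matrix, thus the observations are given by a simple additive noise model $y = \beta + z$. In this case, the Dantzig selector $\hat\beta$ is given by the soft thresholding formula $\hat\beta_i = \mathrm{sgn}(y_i) \max(|y_i| - \lambda, 0)$, where $\lambda$ is the threshold level used in the selector.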
The mean square error for this selector can be computed to be roughly
(2)
and one can show that this is basically best possible (except for constants and logarithmic factors) amongst all selectors in this model. More generally, the main result of our paper was that under the assumption that the predictor matrix obeys the RIP, the mean square error of the Dantzig selector is essentially equal to (2) and thus close to best possible.
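In the identity model, the constraint on the residual decouples into the coordinatewise conditions $|y_i - \hat\beta_i| \leq \lambda$, so minimising the $\ell^1$ norm amounts to moving each observation towards zero by at most $\lambda$. Here is a minimal sketch of that soft thresholding step; the function name `soft_threshold`, the symbol `lam` for the threshold level, and the sample numbers are all chosen purely for illustration.

```python
import math


def soft_threshold(y, lam):
    """Shrink each observation towards zero by lam; values within lam of zero become zero."""
    return [math.copysign(max(abs(v) - lam, 0.0), v) for v in y]


if __name__ == "__main__":
    y = [3.0, -0.5, 1.0, -2.5]   # hypothetical noisy observations
    lam = 1.0                    # hypothetical threshold level
    beta_hat = soft_threshold(y, lam)
    print(beta_hat)
    # Feasibility: every residual y_i - beta_hat_i has size at most lam.
    print(max(abs(a - b) for a, b in zip(y, beta_hat)) <= lam)
```

Each coordinate of the output is the point of smallest magnitude in the interval $[y_i - \lambda, y_i + \lambda]$, which is why this estimate has the least $\ell^1$ norm among all feasible vectors in this model.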
After accepting our paper, the Annals of Statistics took the (somewhat uncommon) step of soliciting responses to the paper from various experts in the field, and then soliciting a rejoinder to these responses from Emmanuel and me. Recently, the Annals posted these responses and rejoinder on the arXiv:
I’m closing my series of articles for the Princeton Companion to Mathematics with my article on “Ricci flow“. Of course, this flow on Riemannian manifolds is now very well known to mathematicians, due to its fundamental role in Perelman’s celebrated proof of the Poincaré conjecture. In this short article, I do not focus on that proof, but instead on the more basic questions as to what a Riemannian manifold is, what the Ricci curvature tensor is on such a manifold, and how Ricci flow qualitatively changes the geometry (and with surgery, the topology) of such manifolds over time.
I’ve saved this article for last, in part because it ties in well with my upcoming course on Perelman’s proof which will start in a few weeks (details to follow soon).
The last external article for the PCM that I would like to point out here is Brian Osserman’s article on the Weil conjectures, which include the “Riemann hypothesis over finite fields” that was famously solved by Deligne. These (now solved) conjectures, which among other things give some quite precise control on the number of points in an algebraic variety over a finite field, were (and continue to be) a major motivating force behind much of modern arithmetic and algebraic geometry.
[Update, Mar 13: Actual link to Weil conjecture article added.]
My penultimate article for my PCM series is a very short one, on “Hamiltonians“. The PCM has a number of short articles to define terms which occur frequently in the longer articles, but are not substantive enough topics by themselves to warrant a full-length treatment. One of these is the term “Hamiltonian”, which is used in all the standard types of physical mechanics (classical or quantum, microscopic or statistical) to describe the total energy of a system. It is a remarkable feature of the laws of physics that this single object (which is a scalar-valued function in classical physics, and a self-adjoint operator in quantum mechanics) suffices to describe the entire dynamics of a system, although from a mathematical perspective it is not always easy to read off all the analytic aspects of this dynamics just from the form of the Hamiltonian.
In mathematics, Hamiltonians of course arise in the equations of mathematical physics (such as Hamilton’s equations of motion, or Schrödinger’s equations of motion), but also show up in symplectic geometry (as a special case of a moment map) and in microlocal analysis.
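To illustrate how a single scalar function can drive the whole dynamics in the classical setting, here is a small sketch that evolves a system using nothing but evaluations of its Hamiltonian, via Hamilton's equations dq/dt = ∂H/∂p, dp/dt = -∂H/∂q. The toy system (a unit mass, unit frequency harmonic oscillator), the function names, the finite-difference step and the symplectic Euler time-stepping are all assumptions made for this example; the update as written is only appropriate for separable Hamiltonians of the form "kinetic plus potential".

```python
import math


def hamiltonian(q, p):
    """Total energy of a toy harmonic oscillator: kinetic plus potential."""
    return 0.5 * p * p + 0.5 * q * q


def partials(H, q, p, h=1e-6):
    """Central-difference approximations to dH/dq and dH/dp."""
    dH_dq = (H(q + h, p) - H(q - h, p)) / (2 * h)
    dH_dp = (H(q, p + h) - H(q, p - h)) / (2 * h)
    return dH_dq, dH_dp


def evolve(H, q, p, t_final, steps):
    """Integrate Hamilton's equations with symplectic Euler steps."""
    dt = t_final / steps
    for _ in range(steps):
        dH_dq, _ = partials(H, q, p)
        p -= dt * dH_dq          # dp/dt = -dH/dq
        _, dH_dp = partials(H, q, p)
        q += dt * dH_dp          # dq/dt = +dH/dp, using the updated p
    return q, p


if __name__ == "__main__":
    q, p = evolve(hamiltonian, 1.0, 0.0, t_final=1.0, steps=10000)
    # Exact solution from (q, p) = (1, 0) is (cos t, -sin t).
    print(q, math.cos(1.0))
    print(p, -math.sin(1.0))
    print(hamiltonian(q, p))     # stays close to the initial energy 0.5
```

Swapping in a different function for `hamiltonian` changes the dynamics without touching the integrator, which is the sense in which this one object encodes the evolution.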
For this post, I would also like to highlight an article of my good friend Andrew Granville on one of my own favorite topics, “Analytic number theory“, focusing in particular on the classical problem of understanding the distribution of the primes, via such analytic tools as zeta functions and L-functions, sieve theory, and the circle method.
From Tim Gowers, I hear the good news that the editing process of the Princeton Companion to Mathematics is finally nearing completion. It therefore seems like a good time to resume my own series of Companion articles, while there is still time to correct any errors.
I’ll start today with my article on “Function spaces”. Just as the analysis of numerical quantities relies heavily on the concept of magnitude or absolute value to measure the size of such quantities, or the extent to which two such quantities are close to each other, the analysis of functions relies on the concept of a norm to measure various “sizes” of such functions, as well as the extent to which two functions resemble each other. But while numbers mainly have just one notion of magnitude (not counting the p-adic valuations, which are of importance in number theory), functions have a wide variety of such magnitudes, such as “height” ($L^\infty$ or $C^0$ norm), “mass” ($L^1$ norm), “mean square” or “energy” ($L^2$ or $H^1$ norms), “slope” (
