You are currently browsing the monthly archive for August 2009.
I’ve just uploaded to the arXiv a draft version of the final paper in my “heatwave” project, “Global regularity of wave maps VII. Control of delocalised or dispersed solutions”. This paper establishes the final ingredient needed to obtain global regularity and (uniform) scattering for wave maps into hyperbolic space, by showing that any sufficiently delocalised or dispersed wave map (an approximate superposition of two maps of lower energy that are widely separated in space, frequency, or time) can be controlled by wave maps of lesser energy.
This type of result is well understood for scalar semilinear equations, such as the nonlinear Schrödinger equation (NLS) or the nonlinear wave equation (NLW). The main new difficulties here are that
- (a) the wave maps equation is an overdetermined system, rather than a scalar equation, rendering such basic operations as decomposing a scalar field into components, or superimposing those components to reassemble the solution, much more delicate and nonlinear;
- (b) the wave maps equation is no longer semilinear (in the sense that it can be viewed as a perturbation of the free wave equation) unless a gauge transform is first performed, but the gauge is itself nonlinear and thus interacts in a complicated way with the decompositions and superpositions in (a);
- (c) the function spaces required to control wave maps in two dimensions are extremely complicated and delicate compared to, say, the NLS theory (in which Strichartz spaces largely suffice), and the estimates are not as favourable. In particular, the “low-high” frequency interactions are non-negligible; the low frequency components of wave maps have a non-trivial “magnetic” effect on the high-frequency components. Furthermore, in contrast to the NLS and NLW settings, it takes substantial effort to show that the function spaces are “divisible”, which roughly means that a wave map only exhibits substantial nonlinear behaviour on a bounded number of time intervals and length scales.
Juggling these three difficulties together made this paper unusually long (124 pages, and that is after taking some shortcuts; see below).
Last month, Sterbenz and Tataru managed, by a slightly different argument, to also establish global regularity and (non-uniform) scattering for wave maps into compact targets (and thence also to hyperbolic space targets, by a lifting argument). Their argument is significantly shorter (a net length of about 100 pages, compared to about 300 pages for my heatwave project) as it relies on a clever shortcut. In my approach, I seek to control all components of the wave map at once, as well as the nonlinear interactions between those components, in order to show that a delocalised wave map can be controlled by wave maps of lesser energy. In contrast, Sterbenz and Tataru focus on just the finest scale at which nontrivial blowup behaviour occurs; it turns out that the small energy theory and finite speed of propagation, together with a regularising effect arising from the Morawetz estimate, are enough to show that this behaviour is controlled by harmonic maps, and so blowup cannot occur below the critical energy. This approach requires substantially less perturbation theory, and thus largely eliminates the need to develop a nonlinear theory of decomposition and superposition alluded to in (a) above (developing this theory, and meshing it with (b) and (c), occupies the bulk of the current paper). On the other hand, the approach in my papers provides more information on the solution, in particular providing certain spacetime “scattering” bounds on the solution that depend only on the energy, as opposed to a “non-uniform” scattering result in which the scattering norms are finite but potentially unbounded.
Nevertheless, my arguments are much more complicated (though I do feel that the machinery set up to disassemble and reassemble maps into manifolds should be useful for other applications), and in the course of this project, I found that I had not quite set up the material in the earlier papers in a way which was perfectly suited for this last (and longest) paper. Because of this, this final paper proved to be far more difficult to write than it would have been with the correct framework. At some point in the future, when it becomes clearer exactly what that framework is, I am thinking of collecting and reorganising all this material into a reasonably self-contained book (as opposed to leaving it spread out over a half-dozen papers totaling hundreds of pages). But this would take a significant amount of effort, and this project has already distracted me from my other tasks for several months now. As such, I have decided to compromise somewhat and release only a draft version of this paper here, with some of the arguments merely sketched rather than given in full, and continuing to use the existing framework provided by the preceding papers as much as possible, rather than overhauling the entire series. This is not the most satisfactory outcome – and in particular, I do not consider these papers ready for publication at this stage – but all of the important mathematical material in the arguments should be present here for those who are interested. I do hope, though, that the various technical components of the theory (particularly the points (a), (b), (c) mentioned above) will be simplified in the future (and the results generalised to other targets), at which point I may begin the process of converting these papers into a publication-quality monograph.
There are many situations in combinatorics in which one is running some sort of iteration algorithm to continually “improve” some object $A$; each loop of the algorithm replaces $A$ with some better version $A'$ of itself, until some desired property of $A$ is attained and the algorithm halts. In order for such arguments to yield a useful conclusion, it is often necessary that the algorithm halts in a finite amount of time, or (even better) in a bounded amount of time. (In general, one cannot use infinitary iteration tools, such as transfinite induction or Zorn’s lemma, in combinatorial settings, because the iteration processes used to improve some target object $A$ often degrade some other finitary quantity $B$ in the process, and an infinite iteration would then have the undesirable effect of making $B$ infinite.)
A basic strategy to ensure termination of an algorithm is to exploit a monotonicity property, or more precisely to show that some key quantity keeps increasing (or keeps decreasing) with each loop of the algorithm, while simultaneously staying bounded. (Or, as the economist Herbert Stein was fond of saying, “If something cannot go on forever, it must stop.”)
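As a toy illustration of this strategy (the setup here is my own, not from the text), consider a greedy iteration whose termination is guaranteed because the size of the object strictly increases at each loop while staying bounded by the size of a finite universe:

```python
# Toy illustration (my own setup): an iteration that halts because a
# monotone quantity -- here the size of a set A inside a finite
# universe -- strictly increases at each loop while staying bounded.

def iterate_until_maximal(universe, is_useful):
    """Greedily add 'useful' elements to A; halts in <= |universe| loops."""
    A = set()
    loops = 0
    while True:
        candidates = [x for x in universe if x not in A and is_useful(x, A)]
        if not candidates:
            return A, loops       # no usable element remains: A is maximal
        A.add(min(candidates))    # |A| strictly increases ...
        loops += 1                # ... and |A| <= |universe|, so we halt

# Example: build a set of integers in which no two elements are adjacent.
A, loops = iterate_until_maximal(range(6),
                                 lambda x, A: all(abs(x - a) > 1 for a in A))
print(sorted(A), loops)   # -> [0, 2, 4] 3
```

The halting guarantee is exactly the monotonicity argument: the loop counter can never exceed the fixed upper bound on the monotone quantity.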
Here are four common flavours of this monotonicity strategy:
- The mass increment argument. This is perhaps the most familiar way to ensure termination: make each improved object $A'$ “heavier” than the previous one $A$ by some non-trivial amount (e.g. by ensuring that the cardinality of $A'$ is strictly greater than that of $A$, thus $|A'| \geq |A| + 1$). Dually, one can try to force the amount of “mass” remaining “outside” of $A$ in some sense to decrease at every stage of the iteration. If there is a good upper bound on the “mass” of $A$ that stays essentially fixed throughout the iteration process, and a lower bound on the mass increment at each stage, then the argument terminates. Many “greedy algorithm” arguments are of this type. The proof of the Hahn decomposition theorem in measure theory also falls into this category. The general strategy here is to keep looking for useful pieces of mass outside of $A$, and add them to $A$ to form $A'$, thus exploiting the additivity properties of mass. Eventually no further usable mass remains to be added (i.e. $A$ is maximal in some sense), and this should force some desirable property on $A$.
- The density increment argument. This is a variant of the mass increment argument, in which one increments the “density” of $A$ rather than the “mass”. For instance, $A$ might be contained in some ambient space $V$, and one seeks to improve $A$ to $A'$ (and $V$ to $V'$) in such a way that the density of the new object in the new ambient space is better than that of the previous object (e.g. $|A'|/|V'| \geq |A|/|V| + c$ for some $c > 0$). On the other hand, the density of $A$ is clearly bounded above by $1$. As long as one has a sufficiently good lower bound on the density increment at each stage, one can conclude an upper bound on the number of iterations in the algorithm. The prototypical example of this is Roth’s proof of his theorem that every set of integers of positive upper density contains an arithmetic progression of length three. The general strategy here is to keep looking for useful density fluctuations inside $A$, and then “zoom in” to a region of increased density by reducing $A$ and $V$ appropriately. Eventually no further usable density fluctuation remains (i.e. $A$ is uniformly distributed inside $V$), and this should force some desirable property on $A$.
- The energy increment argument. This is an “$L^2$” analogue of the “$L^0$”-based mass increment argument (or the “$L^\infty$”-based density increment argument), in which one seeks to increment the amount of “energy” that $A$ captures from some reference object $f$, or (equivalently) to decrement the amount of energy of $f$ which is still “orthogonal” to $A$. Here $A$ and $f$ are related somehow to a Hilbert space, and the energy involves the square of the norm on that space. A classic example of this type of argument is the existence of orthogonal projections onto closed subspaces of a Hilbert space; this leads among other things to the construction of conditional expectation in measure theory, which then underlies a number of arguments in ergodic theory, as discussed for instance in this earlier blog post. Another basic example is the standard proof of the Szemerédi regularity lemma (where the “energy” is often referred to as the “index”). These examples are related; see this blog post for further discussion. The general strategy here is to keep looking for useful pieces of energy orthogonal to $A$, and add them to $A$ to form $A'$, thus exploiting square-additivity properties of energy, such as Pythagoras’ theorem. Eventually, no further usable energy outside of $A$ remains to be added (i.e. $A$ is maximal in some sense), and this should force some desirable property on $A$.
- The rank reduction argument. Here, one seeks to make each new object $A'$ have a lower “rank”, “dimension”, or “order” than the previous one $A$. A classic example here is the proof of the linear algebra fact that given any finite set of vectors, there exists a linearly independent subset which spans the same subspace; the proof of the more general Steinitz exchange lemma is in the same spirit. The general strategy here is to keep looking for “collisions” or “dependencies” within $A$, and use them to collapse $A$ to an object $A'$ of lower rank. Eventually, no further usable collisions within $A$ remain, and this should force some desirable property on $A$.
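As a minimal numerical sketch of the energy increment flavour (the vectors, the threshold `eps`, and the function name are my own choices for illustration), one can greedily capture pieces of a vector's energy along orthonormal directions; Pythagoras makes the captured energy additive, and boundedness by the total energy caps the number of steps:

```python
# Illustrative sketch (symbols f, eps are mine): an "energy increment"
# iteration.  We approximate a vector f by components along orthonormal
# directions, taking a direction only if it captures at least eps^2 of
# energy.  By Pythagoras the captured energy increases by >= eps^2 per
# step while staying <= ||f||^2, so at most ||f||^2 / eps^2 steps occur.

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def energy_increment(f, orthonormal_basis, eps):
    residual = list(f)
    captured = 0.0            # monotone quantity, bounded by dot(f, f)
    steps = 0
    for e in orthonormal_basis:
        c = dot(residual, e)
        if c * c >= eps * eps:                      # a usable piece of energy
            residual = [r - c * x for r, x in zip(residual, e)]
            captured += c * c                       # Pythagoras: energy adds
            steps += 1
    return captured, steps

f = [3.0, 4.0, 0.05]
basis = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
captured, steps = energy_increment(f, basis, eps=1.0)
# captures 9 + 16 = 25.0 of the energy in 2 steps; the third coordinate
# carries energy 0.0025 < eps^2 and is ignored.
```

The same shape underlies the regularity lemma proof: each refinement either gains a fixed amount of index, or none is available and the partition is regular.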
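The linear algebra fact cited in the rank reduction item can itself be run as such an iteration. Here is a hedged sketch (the function name and example vectors are mine): each “collision”, i.e. a linear dependency, lets us discard a vector without shrinking the span, and since the number of kept vectors is squeezed between the rank and the original count, the process halts:

```python
# Illustrative sketch of the "rank reduction" flavour: from a finite
# list of vectors, extract a linearly independent sublist with the same
# span.  A vector that is a dependency (a "collision") is discarded;
# every kept vector raises the rank, which is bounded by the dimension.

from fractions import Fraction

def independent_spanning_subset(vectors):
    pivots = []   # kept rows, reduced to an echelon-like form
    kept = []     # the corresponding original vectors
    for v in vectors:
        row = [Fraction(x) for x in v]
        for p in pivots:
            j = next(i for i, x in enumerate(p) if x != 0)  # pivot column
            if row[j] != 0:
                factor = row[j] / p[j]
                row = [a - factor * b for a, b in zip(row, p)]
        if any(x != 0 for x in row):     # v increases the rank: keep it
            pivots.append(row)
            kept.append(v)
    return kept

vecs = [(1, 0, 0), (2, 0, 0), (0, 1, 0), (1, 1, 0)]
print(independent_spanning_subset(vecs))   # -> [(1, 0, 0), (0, 1, 0)]
```

Exact rational arithmetic (`Fraction`) is used so that the zero test is reliable; with floats one would need a tolerance.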
Much of my own work in additive combinatorics relies heavily on at least one of these types of arguments (and, in some cases, on a nested combination of two or more of them). Many arguments in nonlinear partial differential equations also have a similar flavour, relying on various monotonicity formulae for solutions to such equations, though the objective in PDE is usually slightly different, in that one wants to keep control of a solution as one approaches a singularity (or as some time or space coordinate goes off to infinity), rather than to ensure termination of an algorithm. (On the other hand, many arguments in the theory of concentration compactness, which is used heavily in PDE, do have the same algorithm-terminating flavour as the combinatorial arguments; see this earlier blog post for more discussion.)
Recently, a new species of monotonicity argument was introduced by Moser, as the primary tool in his elegant new proof of the Lovász local lemma. This argument could be dubbed an entropy compression argument, and only applies to probabilistic algorithms which require a certain collection of random “bits” or other random choices as part of the input; thus each loop of the algorithm takes an object $A$ (which may also have been generated randomly) and some portion of the random string $s$ to (deterministically) create a better object $A'$ (and a shorter random string $s'$, formed by throwing away those bits of $s$ that were used in the loop). The key point is to design the algorithm to be partially reversible, in the sense that given $A'$ and $s'$ and some additional data $H'$ that logs the cumulative history of the algorithm up to this point, one can reconstruct $A$ together with the remaining portion of $s$ not already contained in $s'$. Thus, each stage of the argument compresses the information-theoretic content of the pair $(A, s)$ into the pair $(A', s')$ plus the log $H'$ in a lossless fashion. However, a random variable such as $s$ cannot be compressed losslessly into a string of expected size smaller than the Shannon entropy of that variable. Thus, if one has a good lower bound on the entropy of $s$, and if the length of $(A', s', H')$ is significantly less than that of $(A, s)$ (i.e. we need the marginal growth in the length of the history file $H'$ per iteration to be less than the marginal amount of randomness used per iteration), then there is a limit as to how many times the algorithm can be run, much as there is a limit as to how many times a random data file can be compressed before no further length reduction occurs.
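To see why the compression constraint bounds the number of loops, here is a back-of-envelope calculation under simplifying assumptions of my own (a fixed number $r$ of random bits consumed, and at most $h < r$ history bits logged, per loop):

```latex
% Sketch of the entropy bound, under simplifying assumptions: s consists
% of n uniform random bits, each loop consumes exactly r bits of s and
% appends at most h bits to the history log H', with h < r, and the
% final object A' has description length |A'|.  Lossless recovery of
% (A, s) from (A', s', H') then forces, after T loops,
\[
  n = H(s) \le |A'| + |s'| + |H'| \le |A'| + (n - Tr) + Th,
\]
% and rearranging gives the uniform bound
\[
  T \le \frac{|A'|}{r - h}
\]
% on the number of iterations.
```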
It is interesting to compare this method with the ones discussed earlier. In the previous methods, the failure of the algorithm to halt led to a new iteration of the object $A$ which was “heavier”, “denser”, captured more “energy”, or was of “lower rank” than the previous instance of $A$. Here, the failure of the algorithm to halt leads to new information that can be used to “compress” $A$ (or more precisely, the full state $(A, s)$) into a smaller amount of space. I don’t know yet of any application of this new type of termination strategy to the fields I work in, but one could imagine that it could eventually be of use (perhaps to show that solutions to PDE with sufficiently “random” initial data can avoid singularity formation?), so I thought I would discuss it here.
Below the fold I give a special case of Moser’s argument, based on a blog post of Lance Fortnow on this topic.
The most fundamental unsolved problem in complexity theory is undoubtedly the P=NP problem, which asks (roughly speaking) whether a problem which can be solved by a non-deterministic polynomial-time (NP) algorithm can also be solved by a deterministic polynomial-time (P) algorithm. The general belief is that $P \neq NP$, i.e. there exist problems which can be solved by non-deterministic polynomial-time algorithms but not by deterministic polynomial-time algorithms.
One reason why the question is so difficult to resolve is that a certain generalisation of this question has an affirmative answer in some cases, and a negative answer in other cases. More precisely, if we give all the algorithms access to an oracle $A$, then for one choice of this oracle, all the problems that are solvable by non-deterministic polynomial-time algorithms that call $A$ (the class $NP^A$) can also be solved by a deterministic polynomial-time algorithm that calls $A$ (the class $P^A$), thus $P^A = NP^A$; but for another choice of this oracle, there exist problems solvable by non-deterministic polynomial-time algorithms that call $A$ which cannot be solved by a deterministic polynomial-time algorithm that calls $A$, thus $P^A \neq NP^A$. One particular consequence of this result (which is due to Baker, Gill, and Solovay) is that there cannot be any relativisable proof of either $P = NP$ or $P \neq NP$, where “relativisable” means that the proof would also work without any changes in the presence of an oracle.
The Baker-Gill-Solovay result was quite surprising, but the idea of the proof turns out to be rather simple. To get an oracle $A$ such that $P^A = NP^A$, one basically sets $A$ to be a powerful simulator that can simulate non-deterministic machines (and, furthermore, can also simulate itself); it turns out that any PSPACE-complete oracle would suffice for this task. To get an oracle $A$ for which $P^A \neq NP^A$, one has to be a bit sneakier, setting $A$ to be a query device for a sparse set of random (or high-complexity) strings, which are too complex to be guessed at by any deterministic polynomial-time algorithm.
Unfortunately, the simple idea of the proof can be obscured by various technical details (e.g. using Turing machines to define $P$ and $NP$ precisely), which require a certain amount of time to properly absorb. To help myself try to understand this result better, I have decided to give a sort of “allegory” of the proof, based around a (rather contrived) story about various students trying to pass a multiple choice test, which avoids all the technical details but still conveys the basic ideas of the argument. This allegory was primarily for my own benefit, but I thought it might also be of interest to some readers here (and also has some tangential relation to the proto-polymath project of deterministically finding primes), so I reproduce it below.