I’ve just uploaded to the arXiv my paper “An inverse theorem for the bilinear $L^2$ Strichartz estimate for the wave equation”.  This paper is another technical component of my “heatwave project”, which aims to establish the global regularity conjecture for energy-critical wave maps into hyperbolic space.  I have been in the process of writing the final paper of that project, in which I will show that the only way singularities can form is if a special type of solution, known as an “almost periodic blowup solution”, exists.  However, I recently discovered that the existing function space estimates that I was relying on for the large energy perturbation theory were not quite adequate, and in particular I needed a certain “inverse theorem” for a standard bilinear estimate which was not quite in the literature.  The purpose of this paper is to establish that inverse theorem, which may also have some application to other nonlinear wave equations.

To explain the inverse theorem, let me first discuss the bilinear estimate that it inverts.  Define a wave to be a solution to the free wave equation $-\phi_{tt} + \Delta \phi = 0$.  If the wave has a finite amount of energy, then one expects the wave to disperse as time goes to infinity; this is captured by the Strichartz estimates, which establish various spacetime $L^p$ bounds on such waves in terms of the energy (or related quantities, such as Sobolev norms of the initial data).  These estimates are fundamental to the local and global theory of nonlinear wave equations, as they can be used to control the effect of the nonlinearity.

In some cases (especially in low dimensions and/or low regularities, and with equations whose nonlinear terms contain derivatives), Strichartz estimates are too weak to control nonlinearities; roughly speaking, this is because waves decay too slowly in low dimensions.  (For instance, one-dimensional waves $\phi(t,x) = f(x+t)+g(x-t)$ do not decay at all.)  However, it has been understood for some time that if the nonlinearity has a special null structure, which roughly means that it consists only of interactions between transverse waves rather than parallel waves, then there is more decay that one can exploit.  For instance, while one-dimensional waves do not decay in time, the product between a left-propagating wave $f(x+t)$ and a right-propagating wave $g(x-t)$ does decay in time.  In particular, if f and g are bounded in $L^2({\Bbb R})$, then this product is bounded in spacetime $L^2_{t,x}({\Bbb R} \times {\Bbb R})$, thanks to the Fubini-Tonelli theorem.
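
To spell out this one-dimensional computation (a short worked sketch; the variables $u, v$ are introduced here purely for illustration): substituting $u = x+t$ and $v = x-t$, so that $dx\, dt = \frac{1}{2}\, du\, dv$, the Fubini-Tonelli theorem gives

$\int_{\Bbb R} \int_{\Bbb R} |f(x+t)|^2 |g(x-t)|^2\ dx dt = \frac{1}{2} \int_{\Bbb R} |f(u)|^2\ du \int_{\Bbb R} |g(v)|^2\ dv = \frac{1}{2} \|f\|_{L^2({\Bbb R})}^2 \|g\|_{L^2({\Bbb R})}^2.$

Thus the spacetime $L^2$ norm of the product is exactly $2^{-1/2} \|f\|_{L^2({\Bbb R})} \|g\|_{L^2({\Bbb R})}$, even though neither factor decays by itself.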

There is a similar “bilinear $L^2$” estimate for products of transverse waves in higher dimensions.  This estimate is the basic building block for the bilinear $X^{s,b}$ estimates and their variants as developed by Bourgain, Klainerman-Machedon, Kenig-Ponce-Vega, Tataru, and others, and which are the tool of choice for establishing local and global control on nonlinear wave equations, particularly at low dimensions and at critical regularities.  In particular, these estimates (or more precisely, a complicated variant of these estimates in sophisticated function spaces, due to Tataru and myself), are used in the theory of the energy-critical wave map equation.  [These bilinear (and trilinear) estimates are not, by themselves, enough to handle this equation; one also needs an additional gauge fixing procedure before the equation is sufficiently close to linear in behaviour that these estimates become effective.  But I do not wish to discuss the (significant) gauge fixing issue here.]

To cut a (very) long story short, these estimates, when combined with a suitable perturbative theory, allow one to control energy-critical wave maps as long as the energy is small.  However, the whole point of the “heatwave” project is to control the non-perturbative setting when the energy is large (but finite), and one wants to control the solution for long periods of time.

In my previous “heatwave” paper, in which I established large data local well-posedness for this equation, I finessed this issue by localising time to very short intervals, which made certain spacetime norms small enough for the perturbation theory to apply.  This sufficed for the local well-posedness theory,  but is not good enough for the global perturbative theory, because the number of very short intervals needed to cover the entire time axis becomes unbounded.  For that, one needs the ability to make certain norms or estimates “small” by only chopping up time into a bounded number of intervals.  I refer to this property as divisibility (I used to refer to it, somewhat incorrectly, as fungibility).

In the case of semilinear wave (or Schrödinger) equations, in which Strichartz estimates are already sufficient to obtain a satisfactory perturbative theory, divisibility is well-understood, and boils down to the following simple observation: if a function $\phi: {\Bbb R} \times {\Bbb R}^n \to {\Bbb C}$ obeys a global spacetime integrability bound such as $\int_{\Bbb R} \int_{{\Bbb R}^n} |\phi(t,x)|^p\ dx dt \leq M$

for some finite exponent p and some finite bound M, then one can partition ${\Bbb R}$ into intervals I on which $\int_I \int_{{\Bbb R}^n} |\phi(t,x)|^p\ dx dt \leq \varepsilon$

for some $\varepsilon > 0$ at one’s disposal to select.  Indeed, the number of such intervals is at most $M/\varepsilon + 1$, and the intervals can be selected by a simple “greedy algorithm” argument.  This divisibility property of $L^p$-type spacetime norms allows one to easily generalise the small-data perturbation theory to the large-data setting, and is relied upon heavily in the modern theory of the critical nonlinear wave and Schrödinger equations; see for instance this survey of Killip and Visan.
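
The greedy selection can be illustrated by a minimal sketch in a discretised setting. The discretisation is an assumption made here for illustration, not something taken from the paper: time is cut into small cells, `masses[i]` stands for the value of $\int \int_{{\Bbb R}^n} |\phi(t,x)|^p\ dx dt$ over the i-th cell, and the function name `greedy_partition` is hypothetical. Each block is extended to the right for as long as its total mass stays at most `eps`.

```python
def greedy_partition(masses, eps):
    """Split cells 0..len(masses)-1 into consecutive blocks of total mass <= eps.

    masses[i] is the nonnegative mass of the i-th time cell (an assumed
    discretisation of the spacetime integral).  Returns a list of half-open
    index ranges (start, stop).
    """
    if eps <= 0:
        raise ValueError("eps must be positive")
    blocks = []
    start = 0       # first cell of the block currently being built
    running = 0     # mass accumulated in the current block
    for i, m in enumerate(masses):
        if m > eps:
            # No block can contain this cell; the grid is too coarse.
            raise ValueError("a single cell exceeds eps; refine the time grid")
        if running + m > eps:
            # Adding cell i would overshoot, so close the block just before it.
            blocks.append((start, i))
            start = i
            running = 0
        running += m
    if start < len(masses):
        blocks.append((start, len(masses)))
    return blocks


masses = [1, 2, 3, 1, 1, 4, 2]   # total mass M = 14
blocks = greedy_partition(masses, eps=4)
print(blocks)
```

In this discrete model any two consecutive blocks together carry mass greater than `eps` (otherwise the greedy step would have merged them), so the number of blocks stays bounded in terms of $M/\varepsilon$ alone, independently of how fine the time grid is. That uniformity is the point of divisibility.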

Unfortunately, the function spaces used in wave maps are not easily divisible in this manner (very roughly speaking, this is because the function space norms contain too many $L^\infty_t$ type norms within them).   So one cannot rely purely on refining the function space; one must also work on refining the bilinear (and trilinear) estimates on these spaces.   The standard way to do this is to strengthen the $L^p$ exponents in these estimates, and for the basic bilinear $L^2$ estimate this has indeed been done (in work of Wolff and myself).  This suffices for “equal-frequency” interactions, in which one is multiplying two transverse waves of the same frequency, but turns out to be inadequate for “imbalanced-frequency” interactions, when one is multiplying a low-frequency wave by a high-frequency transverse wave.  For this, I rely instead on establishing an inverse theorem for the estimate.

Generally speaking, whenever one is faced with an estimate, e.g. a linear estimate $\| Tf \|_Y \leq C \|f\|_X,$

one can pose the inverse problem of trying to classify the functions f for which the estimate is tight in the sense that $\| Tf \|_Y \geq \delta \|f\|_X$

for some $\delta > 0$ which is not too small.  Such inverse theorems are a current area of study in additive combinatorics, and have recently begun making an appearance in PDE as well.  For instance:

• Young’s inequality $\|f*g\|_{L^r} \leq \|f\|_{L^p} \|g\|_{L^q}$ or the Hausdorff-Young inequality $\|\hat f\|_{L^{p'}} \leq \|f\|_{L^p}$ is only tight (for non-endpoint p,q,r) when f, g are concentrated on balls, arithmetic progressions, or Bohr sets (this is a consequence of several basic theorems in additive combinatorics, including Freiman’s theorem and the Balog-Szemerédi-Gowers theorem);
• The trivial inequality $\|f\|_{U^k} \leq \|f\|_{L^\infty}$ for the Gowers uniformity norms is only expected to be tight when f correlates with a highly algebraic object, such as a polynomial phase or nilsequence (this is the inverse conjecture for the Gowers norm, which is partially proven so far);
• The Sobolev embedding $\| f \|_{L^q} \leq C \|f\|_{W^{s,r}}$ is only tight when f is concentrated on a unit ball (for non-endpoint estimates) or a ball of arbitrary radius (for endpoint estimates);
• Strichartz estimates are only tight when f is concentrated on a ball (for non-endpoint estimates) or a tube (for endpoint estimates).

Inverse theorems for such estimates as Sobolev inequalities and Strichartz estimates are also closely related to the theory of concentration compactness and profile decompositions; see this previous blog post of mine for a discussion.

I can now state, informally, the main result of this paper:

Theorem 1 (informal statement).  A bilinear $L^2$ estimate between two waves of different frequency is only tight when the waves are concentrated on a small number of light rays.  Outside of these rays, the $L^2$ norm is small.

This leads to a corollary which will be used in my final heatwave paper:

Corollary 2 (informal statement).  Any large-energy wave $\phi$ can have its time axis subdivided into a bounded number of intervals, such that on each interval the bilinear estimates for that wave (when interacted against any high-frequency transverse wave) behave “as if” $\phi$ were small-energy rather than large-energy.

The method of proof relies on a paper of mine from several years ago on bilinear $L^p$ estimates for the wave equation, which in turn is based on a celebrated paper of Wolff.   Roughly speaking, the idea is to use wave packet decompositions and the combinatorics of light rays to isolate the regions of spacetime where the waves are concentrating, cover these regions by tubular neighbourhoods of light rays, then remove the light rays to reduce the energy (or mass) of the solution and iterate.  The wave packet analysis is moderately complicated, but fortunately I can use a proposition on this topic from my paper as a black box, leaving only the other components of the argument to write out in detail.