I’ve just uploaded to the arXiv my paper Finite time blowup for high dimensional nonlinear wave systems with bounded smooth nonlinearity, submitted to Comm. PDE. This paper is in the same spirit as (though not directly related to) my previous paper on finite time blowup of supercritical NLW systems, and was inspired by a question posed to me some time ago by Jeffrey Rauch. Here, instead of looking at supercritical equations, we look at an extremely subcritical equation, namely a system of the form

$\displaystyle \Box u = f(u) \ \ \ \ \ (1)$

where ${u: {\bf R}^{1+d} \rightarrow {\bf R}^m}$ is the unknown field, and ${f: {\bf R}^m \rightarrow {\bf R}^m}$ is the nonlinearity, which we assume to have all derivatives bounded. A typical example of such an equation is the higher-dimensional sine-Gordon equation

$\displaystyle \Box u = \sin u$

for a scalar field ${u: {\bf R}^{1+d} \rightarrow {\bf R}}$. Here ${\Box = -\partial_t^2 + \Delta}$ is the d’Alembertian operator. We restrict attention here to classical (i.e. smooth) solutions to (1).

We do not assume any Hamiltonian structure, so we do not require ${f}$ to be a gradient ${f = \nabla F}$ of a potential ${F: {\bf R}^m \rightarrow {\bf R}}$. But even without such Hamiltonian structure, the equation (1) is very well behaved, with many a priori bounds available. For instance, if the initial position ${u_0(x) = u(0,x)}$ and initial velocity ${u_1(x) = \partial_t u(0,x)}$ are smooth and compactly supported, then from finite speed of propagation ${u(t)}$ has uniformly bounded compact support for all ${t}$ in a bounded interval. As the nonlinearity ${f}$ is bounded, this immediately places ${f(u)}$ in ${L^\infty_t L^2_x}$ in any bounded time interval, which by the energy inequality gives an a priori ${L^\infty_t H^1_x}$ bound on ${u}$ in this time interval. Next, from the chain rule we have

$\displaystyle \nabla f(u) = (\nabla_{{\bf R}^m} f)(u) \nabla u$

which (from the assumption that ${\nabla_{{\bf R}^m} f}$ is bounded) shows that ${f(u)}$ is in ${L^\infty_t H^1_x}$, which by the energy inequality again now gives an a priori ${L^\infty_t H^2_x}$ bound on ${u}$.

One might expect that one could keep iterating this and obtain a priori bounds on ${u}$ in arbitrarily smooth norms. In low dimensions such as ${d \leq 3}$, this is a fairly easy task, since the above estimates and Sobolev embedding already place one in ${L^\infty_t L^\infty_x}$, and the nonlinear map ${f}$ is easily verified to preserve the space ${L^\infty_t H^k_x \cap L^\infty_t L^\infty_x}$ for any natural number ${k}$, from which one obtains a priori bounds in any Sobolev space; from this and standard energy methods, one can then establish global regularity for this equation (that is to say, any smooth choice of initial data generates a global smooth solution). However, one starts running into trouble in higher dimensions, in which no ${L^\infty_x}$ bound is available. The main problem is that even a really nice nonlinearity such as ${u \mapsto \sin u}$ is unbounded in higher Sobolev norms. The estimates

$\displaystyle |\sin u| \leq |u|$

and

$\displaystyle |\nabla(\sin u)| \leq |\nabla u|$

ensure that the map ${u \mapsto \sin u}$ is bounded in low regularity spaces like ${L^2_x}$ or ${H^1_x}$, but one already runs into trouble with the second derivative

$\displaystyle \nabla^2(\sin u) = (\cos u) \nabla^2 u - (\sin u) \nabla u \nabla u$

where there is a troublesome lower order term of size ${O( |\nabla u|^2 )}$ which becomes difficult to control in higher dimensions, preventing the map ${u \mapsto \sin u}$ from being bounded in ${H^2_x}$. Ultimately, the issue here is that when ${u}$ is not controlled in ${L^\infty}$, the function ${\sin u}$ can oscillate at a much higher frequency than ${u}$; for instance, if ${u}$ is the one-dimensional wave ${u = A \sin(kx)}$ for some ${k > 0}$ and ${A>1}$, then ${u}$ oscillates at frequency ${k}$, but the function ${\sin(u)= \sin(A \sin(kx))}$ more or less oscillates at the larger frequency ${Ak}$.
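This frequency inflation is easy to see numerically. The following is a minimal sketch (the helper `sign_changes` and the sample values ${A = 50}$, ${k = 3}$ are illustrative choices of mine, not taken from the paper) that counts the sign changes of ${u = A \sin(kx)}$ and of ${\sin u}$ over one period of ${u}$:

```python
import math

def sign_changes(f, a, b, n=20_000):
    """Count sign changes of f on (a, b), sampling at n interior midpoints."""
    vals = [f(a + (b - a) * (i + 0.5) / n) for i in range(n)]
    return sum(1 for p, q in zip(vals, vals[1:]) if p * q < 0)

A, k = 50.0, 3.0                 # illustrative amplitude and frequency
period = 2 * math.pi / k         # one full period of u

u = lambda x: A * math.sin(k * x)
sin_u = lambda x: math.sin(u(x))

crossings_u = sign_changes(u, 0.0, period)          # oscillation of u itself
crossings_sin_u = sign_changes(sin_u, 0.0, period)  # oscillation of sin(u)
```

One should find that `crossings_sin_u` exceeds `crossings_u` by a factor that grows roughly linearly in ${A}$, in line with the heuristic that ${\sin u}$ oscillates at frequency about ${Ak}$.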

In medium dimensions, it is possible to use dispersive estimates for the wave equation (such as the famous Strichartz estimates) to overcome these problems. This line of inquiry was pursued (albeit for slightly different classes of nonlinearity ${f}$ than those considered here) by Heinz-von Wahl, Pecher (in a series of papers), Brenner, and Brenner-von Wahl; to cut a long story short, one of the conclusions of these papers was that one had global regularity for equations such as (1) in dimensions ${d \leq 9}$. (I reprove this result using modern Strichartz estimate and Littlewood-Paley techniques in an appendix to my paper. The references given also allow for some growth in the nonlinearity ${f}$, but we will not detail the precise hypotheses used in these papers here.)

In my paper, I complement these positive results with an almost matching negative result:

Theorem 1 If ${d \geq 11}$ and ${m \geq 2}$, then there exists a nonlinearity ${f: {\bf R}^m \rightarrow {\bf R}^m}$ with all derivatives bounded, and a solution ${u}$ to (1) that is smooth at time zero, but develops a singularity in finite time.

The construction crucially relies on the ability to choose the nonlinearity ${f}$, and also needs some injectivity properties on the solution ${u: {\bf R}^{1+d} \rightarrow {\bf R}^m}$ (after making a symmetry reduction using an assumption of spherical symmetry to view ${u}$ as a function of ${1+1}$ variables rather than ${1+d}$) which restricts our counterexample to the ${m \geq 2}$ case. Thus the model case of the higher-dimensional sine-Gordon equation ${\Box u =\sin u}$ is not covered by our arguments. Nevertheless (as with previous finite-time blowup results discussed on this blog), one can view this result as a barrier to trying to prove regularity for equations such as ${\Box u = \sin u}$ in eleven and higher dimensions, as any such argument must somehow use a property of that equation that is not applicable to the more general system (1).

Let us first give some back-of-the-envelope calculations suggesting why there could be finite time blowup in eleven and higher dimensions. For the sake of this discussion let us restrict attention to the sine-Gordon equation ${\Box u = \sin u}$. The blowup ansatz we will use is as follows: for each frequency ${N_j}$ in a sequence ${1 < N_1 < N_2 < N_3 < \dots}$ of large quantities going to infinity, there will be a spacetime “cube” ${Q_j = \{ (t,x): t \sim \frac{1}{N_j}; x = O(\frac{1}{N_j})\}}$ on which the solution ${u}$ oscillates with “amplitude” ${N_j^\alpha}$ and “frequency” ${N_j}$, where ${\alpha>0}$ is an exponent to be chosen later; this ansatz is of course compatible with the uncertainty principle. Since ${N_j^\alpha \rightarrow \infty}$ as ${j \rightarrow \infty}$, this will create a singularity at the spacetime origin ${(0,0)}$. To make this ansatz plausible, we wish to make the oscillation of ${u}$ on ${Q_j}$ driven primarily by the forcing term ${\sin u}$ at ${Q_{j-1}}$. Thus, by Duhamel’s formula, we expect a relation roughly of the form

$\displaystyle u(t,x) \approx \int \frac{\sin((s-t)\sqrt{-\Delta})}{\sqrt{-\Delta}} \sin(1_{Q_{j-1}} u(s)) (x)\ ds$

on ${Q_j}$, where ${\frac{\sin((s-t)\sqrt{-\Delta})}{\sqrt{-\Delta}}}$ is the usual free wave propagator, and ${1_{Q_{j-1}}}$ is the indicator function of ${Q_{j-1}}$.

Since ${u}$ oscillates with amplitude ${N_{j-1}^\alpha}$ and frequency ${N_{j-1}}$ on ${Q_{j-1}}$, we expect the derivative ${\nabla_{t,x} u}$ to be of size about ${N_{j-1}^{\alpha+1}}$ there, and so from the principle of stationary phase we expect ${\sin(u)}$ to oscillate at frequency about ${N_{j-1}^{\alpha+1}}$. Since the wave propagator ${\frac{\sin((s-t)\sqrt{-\Delta})}{\sqrt{-\Delta}}}$ preserves frequencies, and ${u}$ is supposed to be of frequency ${N_j}$ on ${Q_j}$, we are thus led to the requirement

$\displaystyle N_j \approx N_{j-1}^{\alpha+1}. \ \ \ \ \ (2)$

Next, when restricted to frequencies of order ${N_{j}}$, the propagator ${\frac{\sin((s-t)\sqrt{-\Delta})}{\sqrt{-\Delta}}}$ “behaves like” ${N_{j}^{\frac{d-3}{2}} (s-t)^{\frac{d-1}{2}} A_{s-t}}$, where ${A_{s-t}}$ is the spherical averaging operator

$\displaystyle A_{s-t} f(x) := \frac{1}{\omega_{d-1}} \int_{S^{d-1}} f(x + (s-t)\theta)\ d\theta$

where ${d\theta}$ is surface measure on the unit sphere ${S^{d-1}}$, and ${\omega_{d-1}}$ is the volume of that sphere. In our setting, ${s-t}$ is comparable to ${1/N_{j-1}}$, and so we have the informal approximation

$\displaystyle u(t,x) \approx N_j^{\frac{d-3}{2}} N_{j-1}^{-\frac{d-1}{2}} \int_{s \sim 1/N_{j-1}} A_{s-t} \sin(u(s))(x)\ ds$

on ${Q_j}$.

Since ${\sin(u(s))}$ is bounded, ${A_{s-t} \sin(u(s))}$ is bounded as well. This gives a (non-rigorous) upper bound

$\displaystyle u(t,x) \lessapprox N_j^{\frac{d-3}{2}} N_{j-1}^{-\frac{d-1}{2}} \frac{1}{N_{j-1}}$

which, when combined with our ansatz that ${u}$ has amplitude about ${N_j^\alpha}$ on ${Q_j}$, gives the constraint

$\displaystyle N_j^\alpha \lessapprox N_j^{\frac{d-3}{2}} N_{j-1}^{-\frac{d-1}{2}} \frac{1}{N_{j-1}}$

which on applying (2) gives the further constraint

$\displaystyle \alpha(\alpha+1) \leq \frac{d-3}{2} (\alpha+1) - \frac{d-1}{2} - 1$

which can be rearranged as

$\displaystyle \left(\alpha - \frac{d-5}{4}\right)^2 \leq \frac{d^2-10d-7}{16}.$

It is now clear that the optimal choice of ${\alpha}$ is

$\displaystyle \alpha = \frac{d-5}{4},$

and this blowup ansatz is only self-consistent when

$\displaystyle \frac{d^2-10d-7}{16} \geq 0$

or equivalently if ${d \geq 11}$.
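The numerology above can be double-checked with exact rational arithmetic. This is only a sketch of the heuristic constraint, not of the rigorous argument in the paper; the function names are my own.

```python
from fractions import Fraction

def optimal_alpha(d):
    """Centre (d - 5)/4 of the admissible interval for alpha."""
    return Fraction(d - 5, 4)

def discriminant(d):
    """Right-hand side (d^2 - 10 d - 7)/16 of the completed-square constraint."""
    return Fraction(d * d - 10 * d - 7, 16)

def ansatz_consistent(d, alpha):
    """Check alpha(alpha+1) <= (d-3)/2 (alpha+1) - (d-1)/2 - 1 exactly."""
    lhs = alpha * (alpha + 1)
    rhs = Fraction(d - 3, 2) * (alpha + 1) - Fraction(d - 1, 2) - 1
    return lhs <= rhs

# Smallest dimension in which the optimal exponent satisfies the constraint.
smallest_d = next(d for d in range(1, 100)
                  if ansatz_consistent(d, optimal_alpha(d)))
```

For instance, in ${d=11}$ the optimal exponent is ${\alpha = 3/2}$, for which the constraint reads ${15/4 \leq 4}$; in ${d=10}$ one has ${\alpha = 5/4}$ and the constraint would read ${45/16 \leq 19/8}$, which fails.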

To turn this ansatz into an actual blowup example, we will construct ${u}$ as the sum of various functions ${u_j}$ that solve the wave equation with forcing term in ${Q_{j+1}}$, and which concentrate in ${Q_j}$ with the amplitude and frequency indicated by the above heuristic analysis. The remaining task is to show that ${\Box u}$ can be written in the form ${f(u)}$ for some ${f}$ with all derivatives bounded. For this one needs some injectivity properties of ${u}$ (after imposing spherical symmetry to impose a dimensional reduction on the domain of ${u}$ from ${1+d}$ dimensions to ${1+1}$). This requires one to construct some solutions to the free wave equation that have some unusual restrictions on the range (for instance, we will need a solution taking values in the plane ${{\bf R}^2}$ that avoids one quadrant of that plane). In order to do this we take advantage of the very explicit nature of the fundamental solution to the wave equation in odd dimensions (such as ${d=11}$), particularly under the assumption of spherical symmetry. Specifically, one can show that in odd dimension ${d}$, any spherically symmetric function ${u(t,x) = u(t,r)}$ of the form

$\displaystyle u(t,r) = \left(\frac{1}{r} \partial_r\right)^{\frac{d-1}{2}} (g(t+r) + g(t-r))$

for an arbitrary smooth function ${g: {\bf R} \rightarrow {\bf R}^m}$, will solve the free wave equation; this is ultimately due to iterating the “ladder operator” identity

$\displaystyle \left( -\partial_{tt} + \partial_{rr} + \frac{d-1}{r} \partial_r \right) \frac{1}{r} \partial_r = \frac{1}{r} \partial_r \left( -\partial_{tt} + \partial_{rr} + \frac{d-3}{r} \partial_r \right).$

This precise and relatively simple formula for ${u}$ allows one to create “bespoke” solutions ${u}$ that obey various unusual properties, without too much difficulty.
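As a sanity check on the explicit formula, here is a minimal numerical sketch in the case ${d=5}$ with the illustrative choice ${g = \sin}$ (both choices, and the helper names, are mine). Applying ${\frac{1}{r}\partial_r}$ twice to ${g(t+r)+g(t-r)}$ by hand gives the closed form used in `u_d5`, and a finite-difference approximation of the radial wave operator ${-\partial_{tt} + \partial_{rr} + \frac{d-1}{r}\partial_r}$ should then nearly vanish.

```python
import math

def u_d5(t, r, dg=math.cos, d2g=lambda s: -math.sin(s)):
    """(1/r d/dr)^2 applied to g(t+r) + g(t-r), expanded by hand.

    dg and d2g are the first and second derivatives of g (default g = sin).
    """
    return (d2g(t + r) + d2g(t - r)) / r**2 - (dg(t + r) - dg(t - r)) / r**3

def wave_residual(u, t, r, d, h=1e-3):
    """Central-difference approximation of -u_tt + u_rr + (d-1)/r u_r."""
    u_tt = (u(t + h, r) - 2 * u(t, r) + u(t - h, r)) / h**2
    u_rr = (u(t, r + h) - 2 * u(t, r) + u(t, r - h)) / h**2
    u_r = (u(t, r + h) - u(t, r - h)) / (2 * h)
    return -u_tt + u_rr + (d - 1) / r * u_r

res_d5 = wave_residual(u_d5, t=0.7, r=1.3, d=5)  # correct dimension
res_d3 = wave_residual(u_d5, t=0.7, r=1.3, d=3)  # wrong dimension, for contrast
```

The residual in the matching dimension should be at the level of the discretisation error, while the residual computed with the wrong value of ${d}$ is not small.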

It is not clear to me what to conjecture for ${d=10}$. The blowup ansatz given above is a little inefficient, in that the frequency ${N_{j+1}}$ component of the solution is only generated from a portion of the ${N_j}$ component, namely the portion close to a certain light cone. In particular, the solution does not saturate the Strichartz estimates that are used to establish the positive results for ${d \leq 9}$, which helps explain the slight gap between the positive and negative results. It may be that a more complicated ansatz could work to give a negative result in ten dimensions; conversely, it is also possible that one could use more advanced estimates than the Strichartz estimate (that somehow capture the “thinness” of the fundamental solution, and not just its dispersive properties) to stretch the positive results to ten dimensions. Which side the ${d=10}$ case falls on will all come down to some rather delicate numerology.

Nominations for the 2017 Breakthrough Prize in mathematics and the New Horizons Prizes in mathematics are now open.  In 2016, the Breakthrough Prize was awarded to Ian Agol.  The New Horizons prizes are for breakthroughs given by junior mathematicians, usually restricted to within 10 years of PhD; the 2016 prizes were awarded to Andre Neves, Larry Guth, and Peter Scholze (declined).

The rules for the prizes are listed on this page, and nominations can be made at this page.  (No self-nominations are allowed, for the obvious reasons; a third-party letter of recommendation is also required.)

Just a quick post to note that the arXiv overlay journal Discrete Analysis, managed by Timothy Gowers, has now gone live with its permanent (and quite modern looking) web site, which is run using the Scholastica platform, as well as the first half-dozen or so accepted papers (including one of my own).  See Tim’s announcement for more details.  I am one of the editors of this journal (and am already handling a few submissions). Needless to say, we are happy to take in more submissions (though they will have to be peer reviewed if they are to be accepted, of course).

I’ve just uploaded to the arXiv my paper Finite time blowup for a supercritical defocusing nonlinear wave system, submitted to Analysis and PDE. This paper was inspired by a question asked of me by Sergiu Klainerman recently, regarding whether there were any analogues of my blowup example for Navier-Stokes type equations in the setting of nonlinear wave equations.

Recall that the defocusing nonlinear wave (NLW) equation reads

$\displaystyle \Box u = |u|^{p-1} u \ \ \ \ \ (1)$

where ${u: {\bf R}^{1+d} \rightarrow {\bf R}}$ is the unknown scalar field, ${\Box = -\partial_t^2 + \Delta}$ is the d’Alembertian operator, and ${p>1}$ is an exponent. We can generalise this equation to the defocusing nonlinear wave system

$\displaystyle \Box u = (\nabla F)(u) \ \ \ \ \ (2)$

where ${u: {\bf R}^{1+d} \rightarrow {\bf R}^m}$ is now a system of scalar fields, and ${F: {\bf R}^m \rightarrow {\bf R}}$ is a potential which is homogeneous of degree ${p+1}$ and strictly positive away from the origin; the scalar equation corresponds to the case where ${m=1}$ and ${F(u) = \frac{1}{p+1} |u|^{p+1}}$. We will be interested in smooth solutions ${u}$ to (2). It is only natural to restrict to the smooth category when the potential ${F}$ is also smooth; unfortunately, if one requires ${F}$ to be homogeneous of order ${p+1}$ all the way down to the origin, then ${F}$ cannot be smooth unless it is identically zero or ${p+1}$ is an even integer. This is too restrictive for us, so we will only require that ${F}$ be homogeneous away from the origin (e.g. outside the unit ball). In any event it is the behaviour of ${F(u)}$ for large ${u}$ which will be decisive in understanding regularity or blowup for the equation (2).

Formally, solutions to the equation (2) enjoy a conserved energy

$\displaystyle E[u] = \int_{{\bf R}^d} \frac{1}{2} \|\partial_t u \|^2 + \frac{1}{2} \| \nabla_x u \|^2 + F(u)\ dx.$

Using this conserved energy, it is possible to establish global regularity for the Cauchy problem (2) in the energy-subcritical case when ${d \leq 2}$, or when ${d \geq 3}$ and ${p < 1+\frac{4}{d-2}}$. This means that for any smooth initial position ${u_0: {\bf R}^d \rightarrow {\bf R}^m}$ and initial velocity ${u_1: {\bf R}^d \rightarrow {\bf R}^m}$, there exists a (unique) smooth global solution ${u: {\bf R}^{1+d} \rightarrow {\bf R}^m}$ to the equation (2) with ${u(0,x) = u_0(x)}$ and ${\partial_t u(0,x) = u_1(x)}$. These classical global regularity results (essentially due to Jörgens) were famously extended to the energy-critical case when ${d \geq 3}$ and ${p = 1 + \frac{4}{d-2}}$ by Grillakis, Struwe, and Shatah-Struwe (though for various technical reasons, the global regularity component of these results was limited to the range ${3 \leq d \leq 7}$). A key tool used in the energy-critical theory is the Morawetz estimate

$\displaystyle \int_0^T \int_{{\bf R}^d} \frac{|u(t,x)|^{p+1}}{|x|}\ dx dt \lesssim E[u]$

which can be proven by manipulating the properties of the stress-energy tensor

$\displaystyle T_{\alpha \beta} = \langle \partial_\alpha u, \partial_\beta u \rangle - \frac{1}{2} \eta_{\alpha \beta} (\langle \partial^\gamma u, \partial_\gamma u \rangle + 2 F(u))$

(with the usual summation conventions involving the Minkowski metric ${\eta_{\alpha \beta} dx^\alpha dx^\beta = -dt^2 + |dx|^2}$) and in particular exploiting the divergence-free nature of this tensor: ${\partial^\beta T_{\alpha \beta} = 0}$. See for instance the text of Shatah-Struwe, or my own PDE book, for more details. The energy-critical regularity results have also been extended to slightly supercritical settings in which the potential grows by a logarithmic factor or so faster than the critical rate; see the results of myself and of Roy.
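The exponent thresholds quoted above are easy to tabulate. The following small sketch (the function names are my own) computes the energy-critical exponent ${1 + \frac{4}{d-2}}$ exactly and classifies a pair ${(d,p)}$ accordingly, with the convention from the text that every ${p > 1}$ is energy-subcritical when ${d \leq 2}$.

```python
from fractions import Fraction

def critical_exponent(d):
    """Energy-critical exponent 1 + 4/(d-2); only defined for d >= 3."""
    if d < 3:
        raise ValueError("no energy-critical exponent when d <= 2")
    return 1 + Fraction(4, d - 2)

def regime(d, p):
    """Classify (d, p) as 'subcritical', 'critical' or 'supercritical'."""
    if d <= 2:
        return "subcritical"
    p_c = critical_exponent(d)
    if p < p_c:
        return "subcritical"
    return "critical" if p == p_c else "supercritical"
```

In three dimensions the critical exponent is ${5}$, so for example `regime(3, 7)` returns `'supercritical'`.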

This leaves the question of global regularity for the energy supercritical case when ${d \geq 3}$ and ${p > 1+\frac{4}{d-2}}$. On the one hand, global smooth solutions are known for small data (if ${F}$ vanishes to sufficiently high order at the origin, see e.g. the work of Lindblad and Sogge), and global weak solutions for large data were constructed long ago by Segal. On the other hand, the solution map, if it exists, is known to be extremely unstable, particularly at high frequencies; see for instance this paper of Lebeau, this paper of Christ, Colliander, and myself, this paper of Brenner and Kumlin, or this paper of Ibrahim, Majdoub, and Masmoudi for various formulations of this instability. In the case of the focusing NLW ${-\partial_{tt} u + \Delta u = - |u|^{p-1} u}$, one can easily create solutions that blow up in finite time by ODE constructions, for instance one can take ${u(t,x) = c (1-t)^{-\frac{2}{p-1}}}$ with ${c = (\frac{2(p+1)}{(p-1)^2})^{\frac{1}{p-1}}}$, which blows up as ${t}$ approaches ${1}$. However the situation in the defocusing supercritical case is less clear. The strongest positive results are of Kenig-Merle and Killip-Visan, which show (under some additional technical hypotheses) that global regularity for such equations holds under the additional assumption that the critical Sobolev norm of the solution stays bounded. Roughly speaking, this shows that “Type II blowup” cannot occur for (2).
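The ODE blowup solution for the focusing equation mentioned above can be checked numerically. For spatially constant ${u}$ the focusing NLW reduces to ${\partial_{tt} u = |u|^{p-1} u}$, and the sketch below (helper names are mine; the finite-difference step is an arbitrary choice) measures how well ${u(t) = c(1-t)^{-\frac{2}{p-1}}}$ satisfies it.

```python
def ode_blowup_profile(p):
    """Return t -> c (1 - t)^(-2/(p-1)) with c = (2(p+1)/(p-1)^2)^(1/(p-1))."""
    a = 2.0 / (p - 1)
    c = (2.0 * (p + 1) / (p - 1) ** 2) ** (1.0 / (p - 1))
    return lambda t: c * (1.0 - t) ** (-a)

def relative_residual(p, t, h=1e-4):
    """Relative size of u_tt - u^p, with u_tt from a central difference."""
    u = ode_blowup_profile(p)
    u_tt = (u(t + h) - 2.0 * u(t) + u(t - h)) / h**2
    return abs(u_tt - u(t) ** p) / u(t) ** p
```

For times safely before the blowup time ${t=1}$, for instance `relative_residual(7, 0.3)`, the residual should be at the level of the discretisation error.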

Our main result is that finite time blowup can in fact occur, at least for three-dimensional systems where the number ${m}$ of degrees of freedom is sufficiently large:

Theorem 1 Let ${d=3}$, ${p > 5}$, and ${m \geq 76}$. Then there exists a smooth potential ${F: {\bf R}^m \rightarrow {\bf R}}$, positive and homogeneous of degree ${p+1}$ away from the origin, and a solution to (2) with smooth initial data that develops a singularity in finite time.

The rather large lower bound of ${76}$ on ${m}$ here is primarily due to our use of the Nash embedding theorem (which is the first time I have actually had to use this theorem in an application!). It can certainly be lowered, but unfortunately our methods do not seem to be able to bring ${m}$ all the way down to ${1}$, so we do not directly exhibit finite time blowup for the scalar supercritical defocusing NLW. Nevertheless, this result presents a barrier to any attempt to prove global regularity for that equation, in that it must somehow use a property of the scalar equation which is not available for systems. It is likely that the methods can be adapted to higher dimensions than three, but we take advantage of some special structure to the equations in three dimensions (related to the strong Huygens principle) which does not seem to be available in higher dimensions.

The blowup will in fact be of discrete self-similar type in a backwards light cone, thus ${u}$ will obey a relation of the form

$\displaystyle u(e^S t, e^S x) = e^{-\frac{2}{p-1} S} u(t,x)$

for some fixed ${S>0}$ (the exponent ${-\frac{2}{p-1}}$ is mandated by dimensional analysis considerations). It would be natural to consider continuously self-similar solutions (in which the above relation holds for all ${S}$, not just one ${S}$), and rough self-similar solutions have indeed been constructed in the literature by perturbative methods (see this paper of Planchon, or this paper of Ribaud and Youssfi). However, it turns out that continuously self-similar solutions to a defocusing equation have to obey an additional monotonicity formula which causes them to not exist in three spatial dimensions; this argument is given in my paper. So we have to work just with discretely self-similar solutions.
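To spell out the dimensional analysis (a short sketch, under the simplifying assumption that ${F}$ is homogeneous of degree ${p+1}$ everywhere, so that ${\nabla F}$ is homogeneous of degree ${p}$): given a solution ${u}$ to (2) and a scaling parameter ${\lambda > 0}$, set

$\displaystyle u_\lambda(t,x) := \lambda^{\frac{2}{p-1}} u(\lambda t, \lambda x).$

Then, using ${\frac{2}{p-1} + 2 = \frac{2p}{p-1}}$,

$\displaystyle \Box u_\lambda(t,x) = \lambda^{\frac{2}{p-1}+2} (\Box u)(\lambda t, \lambda x) = \lambda^{\frac{2p}{p-1}} (\nabla F)(u(\lambda t,\lambda x)) = (\nabla F)(u_\lambda(t,x)),$

so ${u_\lambda}$ is again a solution, and no other power of ${\lambda}$ has this property. The discrete self-similarity relation is then the assertion that ${u_\lambda = u}$ for the single value ${\lambda = e^S}$.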

Because of the discrete self-similarity, the finite time blowup solution will be “locally Type II” in the sense that scale-invariant norms inside the backwards light cone stay bounded as one approaches the singularity. But it will not be “globally Type II”, which would require scale-invariant norms to stay bounded outside the light cone as well; indeed energy will leak from the light cone at every scale. This is consistent with the results of Kenig-Merle and Killip-Visan which preclude “globally Type II” blowup solutions to these equations in many cases.

We now sketch the arguments used to prove this theorem. Usually when studying the NLW, we think of the potential ${F}$ (and the initial data ${u_0,u_1}$) as being given in advance, and then try to solve for ${u}$ as an unknown field. However, in this problem we have the freedom to select ${F}$. So we can look at this problem from a “backwards” direction: we first choose the field ${u}$, and then fit the potential ${F}$ (and the initial data) to match that field.

Now, one cannot write down a completely arbitrary field ${u}$ and hope to find a potential ${F}$ obeying (2), as there are some constraints coming from the homogeneity of ${F}$. Namely, from the Euler identity

$\displaystyle \langle u, (\nabla F)(u) \rangle = (p+1) F(u)$

we see that ${F(u)}$ can be recovered from (2) by the formula

$\displaystyle F(u) = \frac{1}{p+1} \langle u, \Box u \rangle \ \ \ \ \ (3)$

so the defocusing nature of ${F}$ imposes a constraint

$\displaystyle \langle u, \Box u \rangle > 0.$

Furthermore, taking a derivative of (3) we obtain another constraining equation

$\displaystyle \langle \partial_\alpha u, \Box u \rangle = \frac{1}{p+1} \partial_\alpha \langle u, \Box u \rangle$

that does not explicitly involve the potential ${F}$. Actually, one can write this equation in the more familiar form

$\displaystyle \partial^\beta T_{\alpha \beta} = 0$

where ${T_{\alpha \beta}}$ is the stress-energy tensor

$\displaystyle T_{\alpha \beta} = \langle \partial_\alpha u, \partial_\beta u \rangle - \frac{1}{2} \eta_{\alpha \beta} (\langle \partial^\gamma u, \partial_\gamma u \rangle + \frac{2}{p+1} \langle u, \Box u \rangle),$

now written in a manner that does not explicitly involve ${F}$.

This reformulation suggests a strategy for locating ${u}$: first one selects a stress-energy tensor ${T_{\alpha \beta}}$ that is divergence-free and obeys suitable positive definiteness and self-similarity properties, and then locates a self-similar map ${u}$ from the backwards light cone to ${{\bf R}^m}$ that has that stress-energy tensor (one also needs the map ${u}$, or more precisely the direction component ${u/\|u\|}$ of that map, to be injective up to the discrete self-similarity, in order to define ${F(u)}$ consistently). If the stress-energy tensor were replaced by the simpler “energy tensor”

$\displaystyle E_{\alpha \beta} = \langle \partial_\alpha u, \partial_\beta u \rangle$

then the question of constructing an (injective) map ${u}$ with the specified energy tensor is precisely the embedding problem that was famously solved by Nash (viewing ${E_{\alpha \beta}}$ as a Riemannian metric on the domain of ${u}$, which in this case is a backwards light cone quotiented by a discrete self-similarity to make it compact). It turns out that one can adapt the Nash embedding theorem to work with the stress-energy tensor as well (as long as one also specifies the mass density ${M = \|u\|^2}$, and as long as a certain positive definiteness property, related to the positive semi-definiteness of Gram matrices, is obeyed). Here is where the dimension ${76}$ shows up:

Proposition 2 Let ${M}$ be a smooth compact Riemannian ${4}$-manifold, and let ${m \geq 76}$. Then ${M}$ smoothly isometrically embeds into the sphere ${S^{m-1}}$.

Proof: The Nash embedding theorem (in the form given in this ICM lecture of Gunther) shows that ${M}$ can be smoothly isometrically embedded into ${{\bf R}^{19}}$, and thus in ${[-R,R]^{19}}$ for some large ${R}$. Using an irrational slope, the interval ${[-R,R]}$ can be smoothly isometrically embedded into the ${2}$-torus ${\frac{1}{\sqrt{38}} (S^1 \times S^1)}$, and so ${[-R,R]^{19}}$ and hence ${M}$ can be smoothly isometrically embedded in ${\frac{1}{\sqrt{38}} (S^1)^{38}}$. But from Pythagoras’ theorem, ${\frac{1}{\sqrt{38}} (S^1)^{38}}$ can be identified with a subset of ${S^{m-1}}$ for any ${m \geq 76}$, and the claim follows. $\Box$
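The last two steps of this proof are completely explicit, and can be illustrated numerically. The sketch below covers only the winding of ${{\bf R}^{19}}$ into the unit sphere of ${{\bf R}^{76}}$, not the Nash embedding itself; the specific winding speeds ${1}$ and ${\sqrt{37}}$ are my own choice of an irrational slope with squares summing to ${38}$, and the helper names are hypothetical.

```python
import math

# Winding speeds with A^2 + B^2 = 38 (unit speed after the 1/sqrt(38) scaling)
# and A/B irrational (so the winding line never closes up).
A, B = 1.0, math.sqrt(37.0)
SCALE = 1.0 / math.sqrt(38.0)

def wind(s):
    """Map a real number s into the torus (1/sqrt(38)) (S^1 x S^1) in R^4."""
    return [SCALE * math.cos(A * s), SCALE * math.sin(A * s),
            SCALE * math.cos(B * s), SCALE * math.sin(B * s)]

def embed(point):
    """Map a point of R^19 to R^76 by winding each coordinate separately."""
    image = []
    for s in point:
        image.extend(wind(s))
    return image

def norm(v):
    return math.sqrt(sum(c * c for c in v))

x = [0.3 * i - 2.0 for i in range(19)]          # sample point of R^19
step = [1e-6 * (-1) ** i for i in range(19)]    # small displacement
y = [a + b for a, b in zip(x, step)]

radius = norm(embed(x))  # distance of the image point from the origin
stretch = norm([p - q for p, q in zip(embed(y), embed(x))]) / norm(step)
```

Here `radius` should equal ${1}$ up to rounding (this is the Pythagoras step: ${38}$ circles of radius ${1/\sqrt{38}}$), and `stretch` should be close to ${1}$, reflecting the fact that the map preserves lengths infinitesimally.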

One can presumably improve upon the bound ${76}$ by being more efficient with the embeddings (e.g. by modifying the proof of Nash embedding to embed directly into a round sphere), but I did not try to optimise the bound here.

The remaining task is to construct the stress-energy tensor ${T_{\alpha \beta}}$. One can reduce to tensors that are invariant with respect to rotations around the spatial origin, but this still leaves a fair number of degrees of freedom (it turns out that there are four fields that need to be specified, which are denoted ${M, E_{tt}, E_{tr}, E_{rr}}$ in my paper). However a small miracle occurs in three spatial dimensions, in that the divergence-free condition involves only two of the four degrees of freedom (or three out of four, depending on whether one considers a function that is even or odd in ${r}$ to only be half a degree of freedom). This is easiest to illustrate with the scalar NLW (1). Assuming spherical symmetry, this equation becomes

$\displaystyle - \partial_{tt} u + \partial_{rr} u + \frac{2}{r} \partial_r u = |u|^{p-1} u.$

Making the substitution ${\phi := ru}$, we can eliminate the lower order term ${\frac{2}{r} \partial_r}$ completely to obtain

$\displaystyle - \partial_{tt} \phi + \partial_{rr} \phi= \frac{1}{r^{p-1}} |\phi|^{p-1} \phi.$

(This can be compared with the situation in higher dimensions, in which an undesirable zeroth order term ${\frac{(d-1)(d-3)}{r^2} \phi}$ shows up.) In particular, if one introduces the null energy density

$\displaystyle e_+ := \frac{1}{2} |\partial_t \phi + \partial_r \phi|^2$

and the potential energy density

$\displaystyle V := \frac{|\phi|^{p+1}}{(p+1) r^{p-1}}$

then one can verify the equation

$\displaystyle (\partial_t - \partial_r) e_+ + (\partial_t + \partial_r) V = - \frac{p-1}{r} V$

which can be viewed as a transport equation for ${e_+}$ with forcing term depending on ${V}$ (or vice versa), and is thus quite easy to solve explicitly by choosing one of these fields and then solving for the other. As it turns out, once one is in the supercritical regime ${p>5}$, one can solve this equation while giving ${e_+}$ and ${V}$ the right homogeneity (they have to be homogeneous of order ${-\frac{4}{p-1}}$, which is greater than ${-1}$ in the supercritical case) and positivity properties, and from this it is possible to prescribe all the other fields one needs to satisfy the conclusions of the main theorem. (It turns out that ${e_+}$ and ${V}$ will be concentrated near the boundary of the light cone, so this is how the solution ${u}$ will concentrate also.)
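For the reader's convenience, here is a sketch of the verification of the transport equation in the scalar case. Since ${(\partial_t - \partial_r)(\partial_t + \partial_r) = \partial_{tt} - \partial_{rr}}$, the equation for ${\phi}$ gives

$\displaystyle (\partial_t - \partial_r) e_+ = (\partial_t \phi + \partial_r \phi) (\partial_{tt} \phi - \partial_{rr} \phi) = - (\partial_t \phi + \partial_r \phi) \frac{|\phi|^{p-1} \phi}{r^{p-1}},$

while the product rule gives

$\displaystyle (\partial_t + \partial_r) V = (\partial_t \phi + \partial_r \phi) \frac{|\phi|^{p-1} \phi}{r^{p-1}} - \frac{(p-1) |\phi|^{p+1}}{(p+1) r^{p}} = (\partial_t \phi + \partial_r \phi) \frac{|\phi|^{p-1} \phi}{r^{p-1}} - \frac{p-1}{r} V.$

Adding the two identities, the terms involving ${\partial_t \phi + \partial_r \phi}$ cancel, leaving only ${-\frac{p-1}{r} V}$ on the right-hand side.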

I’ve been meaning to return to fluids for some time now, in order to build upon my construction two years ago of a solution to an averaged Navier-Stokes equation that exhibited finite time blowup. (I recently spoke on this work at the conference in Princeton in honour of Sergiu Klainerman; my slides for that talk are here.)

One of the biggest deficiencies with my previous result is the fact that the averaged Navier-Stokes equation does not enjoy any good equation for the vorticity ${\omega = \nabla \times u}$, in contrast to the true Navier-Stokes equations which, when written in vorticity-stream formulation, become

$\displaystyle \partial_t \omega + (u \cdot \nabla) \omega = (\omega \cdot \nabla) u + \nu \Delta \omega$

$\displaystyle u = (-\Delta)^{-1} (\nabla \times \omega).$

(Throughout this post we will be working in three spatial dimensions ${{\bf R}^3}$.) So one of my main near-term goals in this area is to exhibit an equation resembling Navier-Stokes as much as possible which enjoys a vorticity equation, and for which there is finite time blowup.

Heuristically, this task should be easier for the Euler equations (i.e. the zero viscosity case ${\nu=0}$ of Navier-Stokes) than the viscous Navier-Stokes equation, as one expects the viscosity to only make it easier for the solution to stay regular. Indeed, morally speaking, the assertion that finite time blowup solutions of Navier-Stokes exist should be roughly equivalent to the assertion that finite time blowup solutions of Euler exist which are “Type I” in the sense that all Navier-Stokes-critical and Navier-Stokes-subcritical norms of this solution go to infinity (which, as explained in the above slides, heuristically means that the effects of viscosity are negligible when compared against the nonlinear components of the equation). In vorticity-stream formulation, the Euler equations can be written as

$\displaystyle \partial_t \omega + (u \cdot \nabla) \omega = (\omega \cdot \nabla) u$

$\displaystyle u = (-\Delta)^{-1} (\nabla \times \omega).$

As discussed in this previous blog post, a natural generalisation of this system of equations is the system

$\displaystyle \partial_t \omega + (u \cdot \nabla) \omega = (\omega \cdot \nabla) u \ \ \ \ \ (1)$

$\displaystyle u = T (-\Delta)^{-1} (\nabla \times \omega).$

where ${T}$ is a linear operator on divergence-free vector fields that is “zeroth order” in some sense; ideally it should also be invertible, self-adjoint, and positive definite (in order to have a Hamiltonian that is comparable to the kinetic energy ${\frac{1}{2} \int_{{\bf R}^3} |u|^2}$). (In the previous blog post, it was observed that the surface quasi-geostrophic (SQG) equation could be embedded in a system of the form (1).) The system (1) has many features in common with the Euler equations; for instance vortex lines are transported by the velocity field ${u}$, and Kelvin’s circulation theorem is still valid.

So far, I have not been able to fully achieve this goal. However, I have the following partial result, stated somewhat informally:

Theorem 1 There is a “zeroth order” linear operator ${T}$ (which, unfortunately, is not invertible, self-adjoint, or positive definite) for which the system (1) exhibits smooth solutions that blow up in finite time.

The operator ${T}$ constructed is not quite a zeroth-order pseudodifferential operator; it is instead merely in the “forbidden” symbol class ${S^0_{1,1}}$, and more precisely it takes the form

$\displaystyle T v = \sum_{j \in {\bf Z}} 2^{3j} \langle v, \phi_j \rangle \psi_j \ \ \ \ \ (2)$

for some compactly supported divergence-free ${\phi,\psi}$ of mean zero with

$\displaystyle \phi_j(x) := \phi(2^j x); \quad \psi_j(x) := \psi(2^j x)$

being (${L^\infty}$-normalised) rescalings of ${\phi,\psi}$. This operator is still bounded on all ${L^p({\bf R}^3)}$ spaces ${1 < p < \infty}$, and so is arguably still a zeroth order operator, though not as convincingly as I would like. Another, less significant, issue with the result is that the solution constructed does not have good spatial decay properties, but this is mostly for convenience and it is likely that the construction can be localised to give solutions that have reasonable decay in space. But the biggest drawback of this theorem is the fact that ${T}$ is not invertible, self-adjoint, or positive definite, so in particular there is no non-negative Hamiltonian for this equation. It may be that some modification of the arguments below can fix these issues, but I have so far been unable to do so. Still, the construction does show that the circulation theorem is insufficient by itself to prevent blowup.
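One formal way to see why an operator of the form (2) deserves to be called “zeroth order” is that it commutes with dyadic dilations. Here is a sketch of the computation, assuming that ${\langle v, \phi_j \rangle}$ denotes the usual inner product ${\int_{{\bf R}^3} v \cdot \phi_j\ dx}$ and ignoring convergence issues. Writing ${v(2 \cdot)}$ for the function ${x \mapsto v(2x)}$, the change of variables ${y = 2x}$ gives

$\displaystyle \langle v(2 \cdot), \phi_j \rangle = \int_{{\bf R}^3} v(2x) \cdot \phi(2^j x)\ dx = 2^{-3} \int_{{\bf R}^3} v(y) \cdot \phi(2^{j-1} y)\ dy = 2^{-3} \langle v, \phi_{j-1} \rangle,$

and hence, since ${\psi_j(x) = \psi_{j-1}(2x)}$,

$\displaystyle T( v(2 \cdot) )(x) = \sum_{j \in {\bf Z}} 2^{3(j-1)} \langle v, \phi_{j-1} \rangle \psi_{j-1}(2x) = (Tv)(2x).$

Thus the factor ${2^{3j}}$ in (2) is exactly what is needed to make ${T}$ invariant under rescaling by powers of two.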

We sketch the proof of the above theorem as follows. We use the barrier method, introducing the time-varying hyperboloid domains

$\displaystyle \Omega(t) := \{ (r,\theta,z): r^2 \leq 1-t + z^2 \}$

for ${t>0}$ (expressed in cylindrical coordinates ${(r,\theta,z)}$). We will select initial data ${\omega(0)}$ to be ${\omega(0,r,\theta,z) = (0,0,\eta(r))}$ for some non-negative even bump function ${\eta}$ supported on ${[-1,1]}$, normalised so that

$\displaystyle \int\int \eta(r)\ r dr d\theta = 1;$

in particular ${\omega(0)}$ is divergence-free supported in ${\Omega(0)}$, with vortex lines connecting ${z=-\infty}$ to ${z=+\infty}$. Suppose for contradiction that we have a smooth solution ${\omega}$ to (1) with this initial data; to simplify the discussion we assume that the solution behaves well at spatial infinity (this can be justified with the choice (2) of vorticity-stream operator, but we will not do so here). Since the domains ${\Omega(t)}$ disconnect ${z=-\infty}$ from ${z=+\infty}$ at time ${t=1}$, there must exist a time ${0 < T_* < 1}$ which is the first time where the support of ${\omega(T_*)}$ touches the boundary of ${\Omega(T_*)}$, with ${\omega(t)}$ supported in ${\Omega(t)}$ for all ${0 \leq t \leq T_*}$.

From (1) we see that the support of ${\omega(t)}$ is transported by the velocity field ${u(t)}$. Thus, at the point of contact of the support of ${\omega(T_*)}$ with the boundary of ${\Omega(T_*)}$, the inward component of the velocity field ${u(T_*)}$ cannot exceed the inward velocity of ${\Omega(T_*)}$. We will construct the functions ${\phi,\psi}$ so that this is not the case, leading to the desired contradiction. (Geometrically, what is going on here is that the operator ${T}$ is pinching the flow to pass through the narrow cylinder ${\{ z, r = O( \sqrt{1-t} )\}}$, leading to a singularity by time ${t=1}$ at the latest.)

First we observe from conservation of circulation, and from the fact that ${\omega(t)}$ is supported in ${\Omega(t)}$, that the integrals

$\displaystyle \int\int \omega_z(t,r,\theta,z) \ r dr d\theta$

are constant in both space and time for ${0 \leq t \leq T_*}$. From the choice of initial data we thus have

$\displaystyle \int\int \omega_z(t,r,\theta,z) \ r dr d\theta = 1$

for all ${t \leq T_*}$ and all ${z}$. On the other hand, if ${T}$ is of the form (2) with ${\phi = \nabla \times \eta}$ for some bump function ${\eta = (0,0,\eta_z)}$ that only has ${z}$-components, then ${\phi}$ is divergence-free with mean zero, and

$\displaystyle \langle (-\Delta) (\nabla \times \omega), \phi_j \rangle = 2^{-j} \langle (-\Delta) (\nabla \times \omega), \nabla \times \eta_j \rangle$

$\displaystyle = 2^{-j} \langle \omega, \eta_j \rangle$

$\displaystyle = 2^{-j} \int\int\int \omega_z(t,r,\theta,z) \eta_z(2^j r, \theta, 2^j z)\ r dr d\theta dz,$

where ${\eta_j(x) := \eta(2^j x)}$. We choose ${\eta_z}$ to be supported in the slab ${\{ C \leq z \leq 2C\}}$ for some large constant ${C}$, and to equal a function ${f(z)}$ depending only on ${z}$ on the cylinder ${\{ C \leq z \leq 2C; r \leq 10C \}}$, normalised so that ${\int f(z)\ dz = 1}$. If ${C/2^j \geq (1-t)^{1/2}}$, then ${\Omega(t)}$ passes through this cylinder, and we conclude that

$\displaystyle \langle (-\Delta) (\nabla \times \omega), \phi_j \rangle = 2^{-j} \int f(2^j z)\ dz$

$\displaystyle = 2^{-2j}.$

Inserting this into (2), (1) we conclude that

$\displaystyle u = \sum_{j: C/2^j \geq (1-t)^{1/2}} 2^j \psi_j + \sum_{j: C/2^j < (1-t)^{1/2}} c_j(t) \psi_j$

for some coefficients ${c_j(t)}$. We will not be able to control these coefficients ${c_j(t)}$, but fortunately we only need to understand ${u}$ on the boundary ${\partial \Omega(t)}$, for which ${r+|z| \gg (1-t)^{1/2}}$. So, if ${\psi}$ happens to be supported on an annulus ${1 \ll r+|z| \ll 1}$, then ${\psi_j}$ vanishes on ${\partial \Omega(t)}$ if ${C}$ is large enough. We then have

$\displaystyle u = \sum_j 2^j \psi_j$

on the boundary ${\partial \Omega(t)}$.

Let ${\Phi(r,\theta,z)}$ be a function of the form

$\displaystyle \Phi(r,\theta,z) = C z \varphi(z/r)$

where ${\varphi}$ is a bump function supported on ${[-2,2]}$ that equals ${1}$ on ${[-1,1]}$. We can perform a dyadic decomposition ${\Phi = \sum_j \Psi_j}$ where

$\displaystyle \Psi_j(r,\theta,z) = \Phi(r,\theta,z) a(2^j r)$

where ${a}$ is a bump function supported on ${[1/2,2]}$ with ${\sum_j a(2^j r) = 1}$. If we then set

$\displaystyle \psi_j = \frac{2^{-j}}{r} (-\partial_z \Psi_j, 0, \partial_r \Psi_j)$

then one can check that ${\psi_j(x) = \psi(2^j x)}$ for a function ${\psi}$ that is divergence-free and mean zero, and supported on the annulus ${1 \ll r+|z| \ll 1}$, and

$\displaystyle \sum_j 2^j \psi_j = \frac{1}{r} (-\partial_z \Phi, 0, \partial_r \Phi)$

so on ${\partial \Omega(t)}$ (where ${|z| \leq r}$) we have

$\displaystyle u = (-\frac{C}{r}, 0, 0 ).$

One can manually check that the inward velocity of this vector on ${\partial \Omega(t)}$ exceeds the inward velocity of ${\Omega(t)}$ if ${C}$ is large enough, and the claim follows.
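The final check can be sketched numerically. This is only a minimal sketch under an assumption of mine: I track the defining function ${g = r^2 - z^2 - (1-t)}$ of ${\Omega(t)}$ (the name ${g}$ and the helper names below are introduced here for illustration). Along a particle moving with velocity ${(u_r, u_\theta, u_z)}$ one has ${\frac{d}{dt} g = 2 r u_r - 2 z u_z + 1}$, and a strictly negative value at a boundary point means the particle moves into the interior faster than the boundary does. For ${u = (-\frac{C}{r},0,0)}$ this rate is ${1-2C}$, so on this reading any ${C > 1/2}$ would already suffice.

```python
import math

def boundary_rate(r, z, u_r, u_z):
    """d/dt of g = r^2 - z^2 - (1 - t) along a particle with velocity (u_r, u_z).

    g <= 0 describes Omega(t); a negative rate at a boundary point (g = 0)
    means the particle is pushed strictly into the interior.
    """
    return 2.0 * r * u_r - 2.0 * z * u_z + 1.0

def rate_on_boundary(C, t, z):
    """Rate at the boundary point of Omega(t) at height z, for u = (-C/r, 0, 0)."""
    r = math.sqrt(1.0 - t + z * z)  # boundary of Omega(t): r^2 = 1 - t + z^2
    return boundary_rate(r, z, -C / r, 0.0)

if __name__ == "__main__":
    for C in (0.25, 1.0, 10.0):
        rates = [rate_on_boundary(C, t, z)
                 for t in (0.0, 0.5, 0.9) for z in (-2.0, 0.0, 3.0)]
        print(C, min(rates), max(rates))
```

The rate does not depend on ${t}$ or ${z}$ here, which reflects the fact that ${r u_r = -C}$ is constant on the region ${|z| \leq r}$.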

Remark 2 The type of blowup suggested by this construction, where a unit amount of circulation is squeezed into a narrow cylinder, is of “Type II” with respect to the Navier-Stokes scaling, because Navier-Stokes-critical norms such as ${L^3({\bf R}^3)}$ (or at least ${L^{3,\infty}({\bf R}^3)}$) look like they stay bounded during this squeezing procedure (the velocity field is of size about ${2^j}$ in cylinders of radius and length about ${2^{-j}}$). So even if the various issues with ${T}$ are repaired, it does not seem likely that this construction can be directly adapted to obtain a corresponding blowup for a Navier-Stokes type equation. To get a “Type I” blowup that is consistent with Kelvin’s circulation theorem, it seems that one needs to coil the vortex lines around a loop multiple times in order to get increased circulation in a small space. This seems possible to pull off to me – there don’t appear to be any unavoidable obstructions coming from topology, scaling, or conservation laws – but would require a more complicated construction than the one given above.

The Institute for Pure and Applied Mathematics (IPAM) here at UCLA is seeking applications for its new director in 2017 or 2018, to replace Russ Caflisch, who is nearing the end of his five-year term as IPAM director.  The previous directors of IPAM (Tony Chan, Mark Green, and Russ Caflisch) were also from the mathematics department here at UCLA, but the position is open to all qualified applicants with extensive scientific and administrative experience in mathematics, computer science, or statistics.  Applications will be reviewed on June 1, 2016 (though the applications process will remain open through to Dec 1, 2016).

Over on the polymath blog, I’ve posted (on behalf of Dinesh Thakur) a new polymath proposal, which is to explain some numerically observed identities involving the irreducible polynomials $P$ in the polynomial ring ${\bf F}_2[t]$ over the finite field of characteristic two, the simplest of which is

$\displaystyle \sum_P \frac{1}{1+P} = 0$

(expanded in terms of Taylor series in $u = 1/t$).  Comments on the problem should be placed in the polymath blog post; if there is enough interest, we can start a formal polymath project on it.
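For readers who want to experiment with this identity, here is a rough sketch of how one might test it to low order. The encoding is my own assumption and not part of the proposal: a polynomial over ${\bf F}_2$ is stored as a Python integer whose bit $i$ is the coefficient of $t^i$, and all helper names are hypothetical. Since $\frac{1}{1+P} = O(u^{\deg P})$, the coefficients of $u^0,\dots,u^K$ in the full sum only involve irreducibles of degree at most $K$, so a finite computation suffices for each coefficient.

```python
def degree(p):
    """Degree of a polynomial over F_2 encoded as an int (degree(0) = -1)."""
    return p.bit_length() - 1

def poly_mod(a, b):
    """Remainder of a modulo b in F_2[t]."""
    db = degree(b)
    while a and degree(a) >= db:
        a ^= b << (degree(a) - db)
    return a

def is_irreducible(p):
    """Trial division by every polynomial of degree 1 .. deg(p) // 2."""
    d = degree(p)
    if d < 1:
        return False
    return all(poly_mod(p, q) != 0 for q in range(2, 1 << (d // 2 + 1)))

def series_of_inverse(P, K):
    """Coefficients of u^0..u^K of 1/(1+P), expanded in u = 1/t over F_2."""
    Q = P ^ 1                       # 1 + P in characteristic two
    d = degree(Q)
    # Q = t^d * q(u), where q has constant term 1
    q = [(Q >> (d - i)) & 1 for i in range(d + 1)]
    inv = [1] + [0] * K             # power series inverse of q(u)
    for k in range(1, K + 1):
        inv[k] = sum(q[i] & inv[k - i] for i in range(1, min(k, d) + 1)) % 2
    return ([0] * d + inv)[:K + 1]  # multiply by u^d and truncate

def partial_sum(K):
    """Sum of 1/(1+P) over irreducible P of degree <= K, modulo u^(K+1)."""
    total = [0] * (K + 1)
    for P in range(2, 1 << (K + 1)):
        if is_irreducible(P):
            total = [a ^ b for a, b in zip(total, series_of_inverse(P, K))]
    return total

if __name__ == "__main__":
    print(partial_sum(8))
```

Trial division is only practical for small degrees; a serious numerical investigation would want a faster irreducibility test.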

In this blog post, I would like to specialise the arguments of Bourgain, Demeter, and Guth from the previous post to the two-dimensional case of the Vinogradov main conjecture, namely

Theorem 1 (Two-dimensional Vinogradov main conjecture) One has

$\displaystyle \int_{[0,1]^2} |\sum_{j=0}^N e( j x + j^2 y)|^6\ dx dy \ll N^{3+o(1)}$

as ${N \rightarrow \infty}$.

This particular case of the main conjecture has a classical proof using some elementary number theory. Indeed, the left-hand side can be viewed as the number of solutions to the system of equations

$\displaystyle j_1 + j_2 + j_3 = k_1 + k_2 + k_3$

$\displaystyle j_1^2 + j_2^2 + j_3^2 = k_1^2 + k_2^2 + k_3^2$

with ${j_1,j_2,j_3,k_1,k_2,k_3 \in \{0,\dots,N\}}$. These two equations can combine (using the algebraic identity ${(a+b-c)^2 - (a^2+b^2-c^2) = 2 (a-c)(b-c)}$ applied to ${(a,b,c) = (j_1,j_2,k_3), (k_1,k_2,j_3)}$) to imply the further equation

$\displaystyle (j_1 - k_3) (j_2 - k_3) = (k_1 - j_3) (k_2 - j_3)$

which, when combined with the divisor bound, shows that each ${k_1,k_2,j_3}$ is associated to ${O(N^{o(1)})}$ choices of ${j_1,j_2,k_3}$ excluding diagonal cases when two of the ${j_1,j_2,j_3,k_1,k_2,k_3}$ collide, and this easily yields Theorem 1. However, the Bourgain-Demeter-Guth argument (which, in the two dimensional case, is essentially contained in a previous paper of Bourgain and Demeter) does not require the divisor bound, and extends for instance to the more general case where ${j}$ ranges in a ${1}$-separated set of reals between ${0}$ and ${N}$.
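The counting interpretation is easy to test by brute force for small ${N}$. The following is a minimal sketch (the function names are mine): it groups triples by their pair of power sums, counts solutions to the system, and confirms that every solution also obeys the product equation derived above.

```python
from collections import Counter
from itertools import product

def signature(triple):
    """(sum, sum of squares) of a triple of integers."""
    return (sum(triple), sum(x * x for x in triple))

def count_solutions(N):
    """Number of (j1,j2,j3,k1,k2,k3) in {0..N}^6 solving both equations."""
    sigs = Counter(signature(t) for t in product(range(N + 1), repeat=3))
    return sum(c * c for c in sigs.values())

def identity_holds(N):
    """Check (j1-k3)(j2-k3) == (k1-j3)(k2-j3) on every solution."""
    by_sig = {}
    for t in product(range(N + 1), repeat=3):
        by_sig.setdefault(signature(t), []).append(t)
    for triples in by_sig.values():
        for (j1, j2, j3) in triples:
            for (k1, k2, k3) in triples:
                if (j1 - k3) * (j2 - k3) != (k1 - j3) * (k2 - j3):
                    return False
    return True

if __name__ == "__main__":
    for N in (1, 2, 4, 8):
        print(N, count_solutions(N), (N + 1) ** 3, identity_holds(N))
```

The diagonal solutions, in which the ${k_i}$ are a permutation of the ${j_i}$, already contribute at least ${(N+1)^3}$, so the count can only be compared against ${N^{3+o(1)}}$ from above.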

In this special case, the Bourgain-Demeter argument simplifies, as the lower dimensional inductive hypothesis becomes a simple ${L^2}$ almost orthogonality claim, and the multilinear Kakeya estimate needed is also easy (collapsing to just Fubini’s theorem). Also one can work entirely in the context of the Vinogradov main conjecture, and not turn to the increased generality of decoupling inequalities (though this additional generality is convenient in higher dimensions). As such, I am presenting this special case as an introduction to the Bourgain-Demeter-Guth machinery.

We now give the specialisation of the Bourgain-Demeter argument to Theorem 1. It will suffice to establish the bound

$\displaystyle \int_{[0,1]^2} |\sum_{j=0}^N e( j x + j^2 y)|^p\ dx dy \ll N^{p/2+o(1)}$

for all ${4 < p < 6}$ (where we keep ${p}$ fixed and send ${N}$ to infinity), as the ${L^6}$ bound then follows by combining the above bound with the trivial bound ${|\sum_{j=0}^N e( j x + j^2 y)| \ll N}$. Accordingly, for any ${\eta > 0}$ and ${4 < p < 6}$, we let ${P(p,\eta)}$ denote the claim that

$\displaystyle \int_{[0,1]^2} |\sum_{j=0}^N e( j x + j^2 y)|^p\ dx dy \ll N^{p/2+\eta+o(1)}$

as ${N \rightarrow \infty}$. Clearly, for any fixed ${p}$, ${P(p,\eta)}$ holds for some large ${\eta}$, and it will suffice to establish

Proposition 2 Let ${4 < p < 6}$, and let ${\eta>0}$ be such that ${P(p,\eta)}$ holds. Then there exists ${0 < \eta' < \eta}$ such that ${P(p,\eta')}$ holds.

Indeed, this proposition shows that for ${4 < p < 6}$, the infimum of the ${\eta}$ for which ${P(p,\eta)}$ holds is zero.

We prove the proposition below the fold, using a simplified form of the methods discussed in the previous blog post. To simplify the exposition we will be a bit cavalier with the uncertainty principle, for instance by essentially ignoring the tails of rapidly decreasing functions.

Given any finite collection of elements ${(f_i)_{i \in I}}$ in some Banach space ${X}$, the triangle inequality tells us that

$\displaystyle \| \sum_{i \in I} f_i \|_X \leq \sum_{i \in I} \|f_i\|_X.$

However, when the ${f_i}$ all “oscillate in different ways”, one expects to improve substantially upon the triangle inequality. For instance, if ${X}$ is a Hilbert space and the ${f_i}$ are mutually orthogonal, we have the Pythagorean theorem

$\displaystyle \| \sum_{i \in I} f_i \|_X = (\sum_{i \in I} \|f_i\|_X^2)^{1/2}.$

For sake of comparison, from the triangle inequality and Cauchy-Schwarz one has the general inequality

$\displaystyle \| \sum_{i \in I} f_i \|_X \leq (\# I)^{1/2} (\sum_{i \in I} \|f_i\|_X^2)^{1/2} \ \ \ \ \ (1)$

for any finite collection ${(f_i)_{i \in I}}$ in any Banach space ${X}$, where ${\# I}$ denotes the cardinality of ${I}$. Thus orthogonality in a Hilbert space yields “square root cancellation”, saving a factor of ${(\# I)^{1/2}}$ or so over the trivial bound coming from the triangle inequality.

More generally, let us somewhat informally say that a collection ${(f_i)_{i \in I}}$ exhibits decoupling in ${X}$ if one has the Pythagorean-like inequality

$\displaystyle \| \sum_{i \in I} f_i \|_X \ll_\varepsilon (\# I)^\varepsilon (\sum_{i \in I} \|f_i\|_X^2)^{1/2}$

for any ${\varepsilon>0}$, thus one obtains almost the full square root cancellation in the ${X}$ norm. The theory of almost orthogonality can then be viewed as the theory of decoupling in Hilbert spaces such as ${L^2({\bf R}^n)}$. In ${L^p}$ spaces for ${p < 2}$ one usually does not expect this sort of decoupling; for instance, if the ${f_i}$ are disjointly supported one has

$\displaystyle \| \sum_{i \in I} f_i \|_{L^p} = (\sum_{i \in I} \|f_i\|_{L^p}^p)^{1/p}$

and the right-hand side can be much larger than ${(\sum_{i \in I} \|f_i\|_{L^p}^2)^{1/2}}$ when ${p < 2}$. At the opposite extreme, one usually does not expect to get decoupling in ${L^\infty}$, since one could conceivably align the ${f_i}$ to all attain a maximum magnitude at the same location with the same phase, at which point the triangle inequality in ${L^\infty}$ becomes sharp.
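To see the size of the failure for ${p<2}$ concretely, here is a small sketch in a toy discrete model, which is an assumption of mine rather than anything from the discussion above: each ${f_i}$ is the indicator of a single point, so the supports are disjoint and ${\|f_i\|_{L^p}=1}$ in counting measure. The ratio of the left-hand side to the decoupled right-hand side is then ${(\# I)^{1/p - 1/2}}$.

```python
def lp_norm(values, p):
    """Discrete l^p norm with counting measure, a stand-in for an L^p norm."""
    return sum(abs(v) ** p for v in values) ** (1.0 / p)

def decoupling_ratio(N, p):
    """|| sum_i f_i ||_p divided by (sum_i ||f_i||_p^2)^(1/2).

    Here f_i is the indicator of the single point i, for i = 0..N-1.
    """
    fs = [[1.0 if k == i else 0.0 for k in range(N)] for i in range(N)]
    total = [sum(f[k] for f in fs) for k in range(N)]
    lhs = lp_norm(total, p)
    rhs = sum(lp_norm(f, p) ** 2 for f in fs) ** 0.5
    return lhs / rhs

if __name__ == "__main__":
    for p in (1.0, 1.5, 2.0):
        print(p, decoupling_ratio(100, p))
```

At ${p=2}$ the ratio is exactly ${1}$, which is the Pythagorean theorem again.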

However, in some cases one can get decoupling for certain ${2 < p < \infty}$. For instance, suppose we are in ${L^4}$, and that ${f_1,\dots,f_N}$ are bi-orthogonal in the sense that the products ${f_i f_j}$ for ${1 \leq i < j \leq N}$ are pairwise orthogonal in ${L^2}$. Then we have

$\displaystyle \| \sum_{i = 1}^N f_i \|_{L^4}^2 = \| (\sum_{i=1}^N f_i)^2 \|_{L^2}$

$\displaystyle = \| \sum_{1 \leq i,j \leq N} f_i f_j \|_{L^2}$

$\displaystyle \ll (\sum_{1 \leq i,j \leq N} \|f_i f_j \|_{L^2}^2)^{1/2}$

$\displaystyle = \| (\sum_{1 \leq i,j \leq N} |f_i f_j|^2)^{1/2} \|_{L^2}$

$\displaystyle = \| \sum_{i=1}^N |f_i|^2 \|_{L^2}$

$\displaystyle \leq \sum_{i=1}^N \| |f_i|^2 \|_{L^2}$

$\displaystyle = \sum_{i=1}^N \|f_i\|_{L^4}^2$

giving decoupling in ${L^4}$. (Similarly if each of the ${f_i f_j}$ is orthogonal to all but ${O_\varepsilon( N^\varepsilon )}$ of the other ${f_{i'} f_{j'}}$.) A similar argument also gives ${L^6}$ decoupling when one has tri-orthogonality (with the ${f_i f_j f_k}$ mostly orthogonal to each other), and so forth. As a slight variant, Khintchine’s inequality also indicates that decoupling should occur for any fixed ${2 < p < \infty}$ if one multiplies each of the ${f_i}$ by an independent random sign ${\epsilon_i \in \{-1,+1\}}$.
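A concrete instance of bi-orthogonality, offered as an illustrative assumption rather than an example from the text above, is ${f_i(x) = e(n_i x)}$ on ${[0,1]}$ with frequencies ${n_i}$ whose pairwise sums are distinct, for instance powers of two. By orthogonality, ${\| \sum_i f_i \|_{L^4}^4}$ is the number of quadruples with ${n_a+n_b=n_c+n_d}$ (the additive energy), which the sketch below counts. For ${N}$ such frequencies the energy is ${2N^2-N}$, so ${\| \sum_i f_i \|_{L^4}^2 \leq \sqrt{2} N}$, whereas for an arithmetic progression the energy is of order ${N^3}$ and decoupling fails.

```python
from collections import Counter
from itertools import product

def additive_energy(freqs):
    """Number of (a, b, c, d) drawn from freqs with a + b == c + d."""
    sums = Counter(a + b for a, b in product(freqs, repeat=2))
    return sum(c * c for c in sums.values())

def l4_norm_squared(freqs):
    """|| sum_n e(n x) ||_{L^4([0,1])}^2, computed from the additive energy."""
    return additive_energy(freqs) ** 0.5

if __name__ == "__main__":
    N = 6
    sidon = [2 ** i for i in range(N)]   # pairwise sums are distinct
    progression = list(range(1, N + 1))  # many coincidences among sums
    print(l4_norm_squared(sidon), l4_norm_squared(progression), N)
```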

In recent years, Bourgain and Demeter have been establishing decoupling theorems in ${L^p({\bf R}^n)}$ spaces for various key exponents of ${2 < p < \infty}$, in the “restriction theory” setting in which the ${f_i}$ are Fourier transforms of measures supported on different portions of a given surface or curve; this builds upon the earlier decoupling theorems of Wolff. In a recent paper with Guth, they established the following decoupling theorem for the curve ${\gamma({\bf R}) \subset {\bf R}^n}$ parameterised by the polynomial curve

$\displaystyle \gamma: t \mapsto (t, t^2, \dots, t^n).$

For any ball ${B = B(x_0,r)}$ in ${{\bf R}^n}$, let ${w_B: {\bf R}^n \rightarrow {\bf R}^+}$ denote the weight

$\displaystyle w_B(x) := \frac{1}{(1 + \frac{|x-x_0|}{r})^{100n}},$

which should be viewed as a smoothed out version of the indicator function ${1_B}$ of ${B}$. In particular, the space ${L^p(w_B) = L^p({\bf R}^n, w_B(x)\ dx)}$ can be viewed as a smoothed out version of the space ${L^p(B)}$. For future reference we observe a fundamental self-similarity of the curve ${\gamma({\bf R})}$: any arc ${\gamma(I)}$ in this curve, with ${I}$ a compact interval, is affinely equivalent to the standard arc ${\gamma([0,1])}$.

Theorem 1 (Decoupling theorem) Let ${n \geq 1}$. Subdivide the unit interval ${[0,1]}$ into ${N}$ equal subintervals ${I_i}$ of length ${1/N}$, and for each such ${I_i}$, let ${f_i: {\bf R}^n \rightarrow {\bf C}}$ be the Fourier transform

$\displaystyle f_i(x) = \int_{\gamma(I_i)} e(x \cdot \xi)\ d\mu_i(\xi)$

of a finite Borel measure ${\mu_i}$ on the arc ${\gamma(I_i)}$, where ${e(\theta) := e^{2\pi i \theta}}$. Then the ${f_i}$ exhibit decoupling in ${L^{n(n+1)}(w_B)}$ for any ball ${B}$ of radius ${N^n}$.

Orthogonality gives the ${n=1}$ case of this theorem. The bi-orthogonality type arguments sketched earlier only give decoupling in ${L^p}$ up to the range ${2 \leq p \leq 2n}$; the point here is that we can now get a much larger value of ${p}$. The ${n=2}$ case of this theorem was previously established by Bourgain and Demeter (who obtained in fact an analogous theorem for any curved hypersurface). The exponent ${n(n+1)}$ (and the radius ${N^n}$) is best possible, as can be seen by the following basic example. If

$\displaystyle f_i(x) := \int_{I_i} e(x \cdot \gamma(\xi)) g_i(\xi)\ d\xi$

where ${g_i}$ is a bump function adapted to ${I_i}$, then standard Fourier-analytic computations show that ${f_i}$ will be comparable to ${1/N}$ on a rectangular box of dimensions ${N \times N^2 \times \dots \times N^n}$ (and thus volume ${N^{n(n+1)/2}}$) centred at the origin, and exhibit decay away from this box, with ${\|f_i\|_{L^{n(n+1)}(w_B)}}$ comparable to

$\displaystyle 1/N \times (N^{n(n+1)/2})^{1/(n(n+1))} = 1/\sqrt{N}.$

On the other hand, ${\sum_{i=1}^N f_i}$ is comparable to ${1}$ on a ball of radius comparable to ${1}$ centred at the origin, so ${\|\sum_{i=1}^N f_i\|_{L^{n(n+1)}(w_B)}}$ is ${\gg 1}$, which is just barely consistent with decoupling. This calculation shows that decoupling will fail if ${n(n+1)}$ is replaced by any larger exponent, and also if the radius of the ball ${B}$ is reduced to be significantly smaller than ${N^n}$.
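The exponent bookkeeping in this example can be checked mechanically. The sketch below tracks only powers of ${N}$, using the heuristic sizes stated above (height ${1/N}$ on a box of volume ${N^{n(n+1)/2}}$); the function names are mine. Decoupling in ${L^p}$ is consistent with the lower bound ${\gg 1}$ only if the exponent returned by `decoupled_exponent` is non-negative.

```python
from fractions import Fraction

def piece_exponent(n, p):
    """Exponent a with ||f_i||_{L^p} ~ N^a for the bump example."""
    return Fraction(-1) + Fraction(n * (n + 1), 2) / p

def decoupled_exponent(n, p):
    """Exponent of (sum_i ||f_i||_{L^p}^2)^(1/2) over the N pieces."""
    return Fraction(1, 2) + piece_exponent(n, p)

if __name__ == "__main__":
    for n in (1, 2, 3):
        critical = n * (n + 1)
        for p in (critical - 1, critical, critical + 1):
            print(n, p, piece_exponent(n, p), decoupled_exponent(n, p))
```

At the critical exponent ${p = n(n+1)}$ the decoupled exponent is exactly zero, and it becomes negative for any larger ${p}$.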

This theorem has the following consequence of importance in analytic number theory:

Corollary 2 (Vinogradov main conjecture) Let ${s, n, N \geq 1}$ be integers, and let ${\varepsilon > 0}$. Then

$\displaystyle \int_{[0,1]^n} |\sum_{j=1}^N e( j x_1 + j^2 x_2 + \dots + j^n x_n)|^{2s}\ dx_1 \dots dx_n$

$\displaystyle \ll_{\varepsilon,s,n} N^{s+\varepsilon} + N^{2s - \frac{n(n+1)}{2}+\varepsilon}.$

Proof: By the Hölder inequality (and the trivial bound of ${N}$ for the exponential sum), it suffices to treat the critical case ${s = n(n+1)/2}$, that is to say to show that

$\displaystyle \int_{[0,1]^n} |\sum_{j=1}^N e( j x_1 + j^2 x_2 + \dots + j^n x_n)|^{n(n+1)}\ dx_1 \dots dx_n \ll_{\varepsilon,n} N^{\frac{n(n+1)}{2}+\varepsilon}.$

We can rescale this as

$\displaystyle \int_{[0,N] \times [0,N^2] \times \dots \times [0,N^n]} |\sum_{j=1}^N e( x \cdot \gamma(j/N) )|^{n(n+1)}\ dx \ll_{\varepsilon,n} N^{n(n+1)+\varepsilon}.$

As the integrand is periodic along the lattice ${N{\bf Z} \times N^2 {\bf Z} \times \dots \times N^n {\bf Z}}$, this is equivalent to

$\displaystyle \int_{[0,N^n]^n} |\sum_{j=1}^N e( x \cdot \gamma(j/N) )|^{n(n+1)}\ dx \ll_{\varepsilon,n} N^{\frac{n(n+1)}{2}+n^2+\varepsilon}.$

The left-hand side may be bounded by ${\ll \| \sum_{j=1}^N f_j \|_{L^{n(n+1)}(w_B)}^{n(n+1)}}$, where ${B := B(0,N^n)}$ and ${f_j(x) := e(x \cdot \gamma(j/N))}$. Since

$\displaystyle \| f_j \|_{L^{n(n+1)}(w_B)} \ll (N^{n^2})^{\frac{1}{n(n+1)}},$

the claim now follows from the decoupling theorem and a brief calculation. $\Box$
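For what it is worth, the “brief calculation” at the end of this proof amounts to the following exponent count, sketched here with exact rational arithmetic and ignoring all ${N^\varepsilon}$ factors (the function name is mine). Decoupling bounds the ${L^{n(n+1)}(w_B)}$ norm of the sum by the square root of ${N}$ copies of ${\|f_j\|^2}$, and raising this to the power ${n(n+1)}$ should reproduce the exponent ${\frac{n(n+1)}{2}+n^2}$.

```python
from fractions import Fraction

def proof_exponents(n):
    """Return (exponent obtained from decoupling, exponent required)."""
    p = n * (n + 1)
    piece = Fraction(n * n, p)           # ||f_j|| << N^(n^2 / p)
    decoupled = Fraction(1, 2) + piece   # (N * ||f_j||^2)^(1/2)
    obtained = p * decoupled             # raise the norm to the p-th power
    required = Fraction(p, 2) + n * n
    return obtained, required

if __name__ == "__main__":
    for n in range(1, 6):
        print(n, proof_exponents(n))
```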

Using the Plancherel formula, one may equivalently (when ${s}$ is an integer) write the Vinogradov main conjecture in terms of solutions ${j_1,\dots,j_s,k_1,\dots,k_s \in \{1,\dots,N\}}$ to the system of equations

$\displaystyle j_1^i + \dots + j_s^i = k_1^i + \dots + k_s^i \forall i=1,\dots,n,$

but we will not use this formulation here.
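Although this formulation is not used here, it makes small cases easy to compute. As a sketch (with a function name of my own choosing), one can group the ${s}$-tuples by their vector of power sums and add up the squares of the group sizes:

```python
from collections import Counter
from itertools import product

def vinogradov_count(s, n, N):
    """Number of (j_1..j_s, k_1..k_s) in {1..N}^(2s) with matching power sums
    j_1^i + ... + j_s^i == k_1^i + ... + k_s^i for every i = 1..n."""
    sigs = Counter(
        tuple(sum(j ** i for j in js) for i in range(1, n + 1))
        for js in product(range(1, N + 1), repeat=s)
    )
    return sum(c * c for c in sigs.values())

if __name__ == "__main__":
    for N in (2, 4, 8):
        print(N, vinogradov_count(3, 2, N), vinogradov_count(2, 2, N))
```

For ${s=n=2}$ the only solutions are those in which ${(k_1,k_2)}$ is a permutation of ${(j_1,j_2)}$, giving ${2N^2-N}$ solutions. The enumeration costs ${N^s}$ operations, so this is only a toy for small parameters.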

A history of the Vinogradov main conjecture may be found in this survey of Wooley; prior to the Bourgain-Demeter-Guth theorem, the conjecture was solved completely for ${n \leq 3}$, or for ${n > 3}$ and ${s}$ either below ${n(n+1)/2 - n/3 + O(n^{2/3})}$ or above ${n(n-1)}$, with the bulk of recent progress coming from the efficient congruencing technique of Wooley. It has numerous applications to exponential sums, Waring’s problem, and the zeta function; to give just one application, the main conjecture implies the predicted asymptotic for the number of ways to express a large number as the sum of ${23}$ fifth powers (the previous best result required ${28}$ fifth powers). The Bourgain-Demeter-Guth approach to the Vinogradov main conjecture, based on decoupling, is ostensibly very different from the efficient congruencing technique, which relies heavily on the arithmetic structure of the problem, but it appears (as I have been told from second-hand sources) that the two methods are actually closely related, with the former being a sort of “Archimedean” version of the latter (with the intervals ${I_i}$ in the decoupling theorem being analogous to congruence classes in the efficient congruencing method); hopefully there will be some future work making this connection more precise. One advantage of the decoupling approach is that it generalises to non-arithmetic settings in which the set ${\{1,\dots,N\}}$ that ${j}$ is drawn from is replaced by some other similarly separated set of real numbers. (A random thought – could this allow the Vinogradov-Korobov bounds on the zeta function to extend to Beurling zeta functions?)

Below the fold we sketch the Bourgain-Demeter-Guth argument proving Theorem 1.

I thank Jean Bourgain and Andrew Granville for helpful discussions.

Let ${\lambda}$ denote the Liouville function. The prime number theorem is equivalent to the estimate

$\displaystyle \sum_{n \leq x} \lambda(n) = o(x)$

as ${x \rightarrow \infty}$, that is to say that ${\lambda}$ exhibits cancellation on large intervals such as ${[1,x]}$. This result can be improved to give cancellation on shorter intervals. For instance, using the known zero density estimates for the Riemann zeta function, one can establish that

$\displaystyle \int_X^{2X} |\sum_{x \leq n \leq x+H} \lambda(n)|\ dx = o( HX ) \ \ \ \ \ (1)$

as ${X \rightarrow \infty}$ if ${X^{1/6+\varepsilon} \leq H \leq X}$ for some fixed ${\varepsilon>0}$; I believe this result is due to Ramachandra (see also Exercise 21 of this previous blog post), and in fact one could obtain a better error term on the right-hand side that for instance gained an arbitrary power of ${\log X}$. On the Riemann hypothesis (or the weaker density hypothesis), it was known that the ${X^{1/6+\varepsilon}}$ could be lowered to ${X^\varepsilon}$.
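One can get a feel for (1) numerically. The sketch below is only a toy: it replaces the integral over ${x}$ by a sum over integers ${x \in [X,2X]}$, and the function names are my own. It computes ${\lambda}$ with a smallest-prime-factor sieve and returns the average of ${|\sum_{x \leq n \leq x+H} \lambda(n)|}$ divided by ${H}$, which one would expect to shrink as ${H}$ grows if (1) holds in that range.

```python
def liouville_upto(n):
    """lam[k] = (-1)^Omega(k) for 1 <= k <= n (lam[0] is unused)."""
    spf = list(range(n + 1))  # smallest prime factor of each integer
    for p in range(2, int(n ** 0.5) + 1):
        if spf[p] == p:
            for m in range(p * p, n + 1, p):
                if spf[m] == m:
                    spf[m] = p
    lam = [0] * (n + 1)
    if n >= 1:
        lam[1] = 1
    for k in range(2, n + 1):
        lam[k] = -lam[k // spf[k]]  # removing one prime factor flips the sign
    return lam

def short_interval_average(X, H):
    """Mean over integers x in [X, 2X] of |sum_{x <= n <= x+H} lam(n)| / H."""
    lam = liouville_upto(2 * X + H)
    prefix = [0] * len(lam)
    for k in range(1, len(lam)):
        prefix[k] = prefix[k - 1] + lam[k]
    total = sum(abs(prefix[x + H] - prefix[x - 1]) for x in range(X, 2 * X + 1))
    return total / ((X + 1) * H)

if __name__ == "__main__":
    for H in (10, 100, 1000):
        print(H, short_interval_average(10 ** 5, H))
```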

Early this year, there was a major breakthrough by Matomaki and Radziwill, who (among other things) showed that the asymptotic (1) was in fact valid for any ${H = H(X)}$ with ${H \leq X}$ that went to infinity as ${X \rightarrow \infty}$, thus yielding cancellation on extremely short intervals. This has many further applications; for instance, this estimate, or more precisely its extension to other “non-pretentious” bounded multiplicative functions, was a key ingredient in my recent solution of the Erdös discrepancy problem, as well as in obtaining logarithmically averaged cases of Chowla’s conjecture, such as

$\displaystyle \sum_{n \leq x} \frac{\lambda(n) \lambda(n+1)}{n} = o(\log x). \ \ \ \ \ (2)$

It is of interest to twist the above estimates by phases such as the linear phase ${n \mapsto e(\alpha n) := e^{2\pi i \alpha n}}$. In 1937, Davenport showed that

$\displaystyle \sup_\alpha |\sum_{n \leq x} \lambda(n) e(\alpha n)| \ll_A x \log^{-A} x$

which of course improves the prime number theorem. Recently with Matomaki and Radziwill, we obtained a common generalisation of this estimate with (1), showing that

$\displaystyle \sup_\alpha \int_X^{2X} |\sum_{x \leq n \leq x+H} \lambda(n) e(\alpha n)|\ dx = o(HX) \ \ \ \ \ (3)$

as ${X \rightarrow \infty}$, for any ${H = H(X) \leq X}$ that went to infinity as ${X \rightarrow \infty}$. We were able to use this estimate to obtain an averaged form of Chowla’s conjecture.

In that paper, we asked whether one could improve this estimate further by moving the supremum inside the integral, that is to say to establish the bound

$\displaystyle \int_X^{2X} \sup_\alpha |\sum_{x \leq n \leq x+H} \lambda(n) e(\alpha n)|\ dx = o(HX) \ \ \ \ \ (4)$

as ${X \rightarrow \infty}$, for any ${H = H(X) \leq X}$ that went to infinity as ${X \rightarrow \infty}$. This bound is asserting that ${\lambda}$ is locally Fourier-uniform on most short intervals; it can be written equivalently in terms of the “local Gowers ${U^2}$ norm” as

$\displaystyle \int_X^{2X} \sum_{1 \leq a \leq H} |\sum_{x \leq n \leq x+H} \lambda(n) \lambda(n+a)|^2\ dx = o( H^3 X )$

from which one can see that this is another averaged form of Chowla’s conjecture (stronger than the one I was able to prove with Matomaki and Radziwill, but a consequence of the unaveraged Chowla conjecture). If one inserted such a bound into the machinery I used to solve the Erdös discrepancy problem, it should lead to further averaged cases of Chowla’s conjecture, such as

$\displaystyle \sum_{n \leq x} \frac{\lambda(n) \lambda(n+1) \lambda(n+2)}{n} = o(\log x), \ \ \ \ \ (5)$

though I have not fully checked the details of this implication. It should also have a number of new implications for sign patterns of the Liouville function, though we have not explored these in detail yet.
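The difference between (3) and (4) is simply whether the worst frequency ${\alpha}$ is chosen once for all windows or separately for each window. A crude numerical sketch of the two quantities follows, with the caveat that both discretisations are assumptions of mine: the supremum is taken over a finite grid of ${\alpha}$, and the integral in ${x}$ is replaced by a sum over integers.

```python
import cmath

def e(theta):
    """e(theta) = exp(2 pi i theta)."""
    return cmath.exp(2j * cmath.pi * theta)

def liouville(n):
    """(-1)^(number of prime factors of n, counted with multiplicity)."""
    sign, p = 1, 2
    while p * p <= n:
        while n % p == 0:
            n //= p
            sign = -sign
        p += 1
    return -sign if n > 1 else sign

def compare_sups(X, H, grid=32):
    """Return (sup outside the x-sum as in (3), sup inside as in (4))."""
    lam = [0] + [liouville(n) for n in range(1, 2 * X + H + 1)]
    alphas = [a / grid for a in range(grid)]
    table = [
        [abs(sum(lam[n] * e(alpha * n) for n in range(x, x + H + 1)))
         for alpha in alphas]
        for x in range(X, 2 * X + 1)
    ]
    sup_outside = max(sum(row[a] for row in table) for a in range(grid))
    sup_inside = sum(max(row) for row in table)
    return sup_outside, sup_inside

if __name__ == "__main__":
    print(compare_sups(200, 16))
```

The second quantity always dominates the first, which is one way to see why (4) is the stronger statement.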

One can write (4) equivalently in the form

$\displaystyle \int_X^{2X} \sum_{x \leq n \leq x+H} \lambda(n) e( \alpha(x) n + \beta(x) )\ dx = o(HX) \ \ \ \ \ (6)$

uniformly for all ${x}$-dependent phases ${\alpha(x), \beta(x)}$. In contrast, (3) is equivalent to the subcase of (6) when the linear phase coefficient ${\alpha(x)}$ is independent of ${x}$. This dependency of ${\alpha(x)}$ on ${x}$ seems to necessitate some highly nontrivial additive combinatorial analysis of the function ${x \mapsto \alpha(x)}$ in order to establish (4) when ${H}$ is small. To date, this analysis has proven to be elusive, but I would like to record what one can do with more classical methods like Vaughan’s identity, namely:

Proposition 1 The estimate (4) (or equivalently (6)) holds in the range ${X^{2/3+\varepsilon} \leq H \leq X}$ for any fixed ${\varepsilon>0}$. (In fact one can improve the right-hand side by an arbitrary power of ${\log X}$ in this case.)

The values of ${H}$ in this range are far too large to yield implications such as new cases of the Chowla conjecture, but it appears that the ${2/3}$ exponent is the limit of “classical” methods (at least as far as I was able to apply them), in the sense that one does not do any combinatorial analysis on the function ${x \mapsto \alpha(x)}$, nor does one use modern equidistribution results on “Type III sums” that require deep estimates on Kloosterman-type sums. The latter may shave a little bit off of the ${2/3}$ exponent, but I don’t see how one would ever hope to go below ${1/2}$ without doing some non-trivial combinatorics on the function ${x \mapsto \alpha(x)}$. UPDATE: I have come across this paper of Zhan which uses mean-value theorems for L-functions to lower the ${2/3}$ exponent to ${5/8}$.

Let me now sketch the proof of the proposition, omitting many of the technical details. We first remark that known estimates on sums of the Liouville function (or similar functions such as the von Mangoldt function) in short arithmetic progressions, based on zero-density estimates for Dirichlet ${L}$-functions, can handle the “major arc” case of (4) (or (6)) where ${\alpha}$ is restricted to be of the form ${\alpha = \frac{a}{q} + O( X^{-1/6-\varepsilon} )}$ for ${q = O(\log^{O(1)} X)}$ (the exponent here being of the same numerology as the ${X^{1/6+\varepsilon}}$ exponent in the classical result of Ramachandra, tied to the best zero density estimates currently available); for instance a modification of the arguments in this recent paper of Koukoulopoulos would suffice. Thus we can restrict attention to “minor arc” values of ${\alpha}$ (or ${\alpha(x)}$, using the interpretation of (6)).

Next, one breaks up ${\lambda}$ (or the closely related Möbius function) into Dirichlet convolutions using one of the standard identities (e.g. Vaughan’s identity or Heath-Brown’s identity), as discussed for instance in this previous post (which is focused more on the von Mangoldt function, but analogous identities exist for the Liouville and Möbius functions). The exact choice of identity is not terribly important, but the upshot is that ${\lambda(n)}$ can be decomposed into ${\log^{O(1)} X}$ terms, each of which is either of the “Type I” form

$\displaystyle \sum_{d \sim D; m \sim M: dm=n} a_d$

for some coefficients ${a_d}$ that are roughly of logarithmic size on the average, and scales ${D, M}$ with ${D \ll X^{2/3}}$ and ${DM \sim X}$, or else of the “Type II” form

$\displaystyle \sum_{d \sim D; m \sim M: dm=n} a_d b_m$

for some coefficients ${a_d, b_m}$ that are roughly of logarithmic size on the average, and scales ${D,M}$ with ${X^{1/3} \ll D,M \ll X^{2/3}}$ and ${DM \sim X}$. As discussed in the previous post, the ${2/3}$ exponent is a natural barrier in these identities if one is unwilling to also consider “Type III” terms which are roughly of the shape of the third divisor function ${\tau_3(n) := \sum_{d_1d_2d_3=n} 1}$.
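As a reminder of what the third divisor function counts, namely ordered factorisations ${n = d_1 d_2 d_3}$ into positive integers, here is a direct (and deliberately naive) sketch; the function name is mine:

```python
def tau3(n):
    """Number of ordered triples (d1, d2, d3) of positive integers with d1*d2*d3 == n."""
    count = 0
    for d1 in range(1, n + 1):
        if n % d1 == 0:
            m = n // d1
            # each divisor d2 of m determines d3 = m // d2
            count += sum(1 for d2 in range(1, m + 1) if m % d2 == 0)
    return count

if __name__ == "__main__":
    print([tau3(n) for n in range(1, 13)])
```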

A Type I sum makes a contribution to ${ \sum_{x \leq n \leq x+H} \lambda(n) e( \alpha(x) n + \beta(x) )}$ that can be bounded (via Cauchy-Schwarz) in terms of an expression such as

$\displaystyle \sum_{d \sim D} | \sum_{x/d \leq m \leq x/d+H/d} e(\alpha(x) dm )|^2.$

The inner sum exhibits a lot of cancellation unless ${\alpha(x) d}$ is within ${O(D/H)}$ of an integer. (Here, “a lot” should be loosely interpreted as “gaining many powers of ${\log X}$ over the trivial bound”.) Since ${H}$ is significantly larger than ${D}$, standard Vinogradov-type manipulations (see e.g. Lemma 13 of these previous notes) show that this bad case occurs for many ${d}$ only when ${\alpha}$ is “major arc”, which is the case we have specifically excluded. This lets us dispose of the Type I contributions.
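The cancellation in the inner sum is just the geometric series bound ${|\sum_{m \in I} e(\theta m)| \leq \min( \# I, \frac{1}{2\|\theta\|} )}$, where ${\|\theta\|}$ is the distance from ${\theta}$ to the nearest integer. With ${\theta = \alpha(x) d}$ and an interval of length about ${H/D}$, the bound is non-trivial unless ${\|\theta\| \ll D/H}$. A quick numerical sketch of this bound, with helper names of my own:

```python
import cmath

def e(theta):
    """e(theta) = exp(2 pi i theta)."""
    return cmath.exp(2j * cmath.pi * theta)

def dist_to_integer(theta):
    """Distance from theta to the nearest integer."""
    return abs(theta - round(theta))

def inner_sum(theta, start, length):
    """| sum of e(theta * m) over m = start .. start + length - 1 |."""
    return abs(sum(e(theta * m) for m in range(start, start + length)))

def geometric_bound(theta, length):
    """min(length, 1 / (2 ||theta||)), the standard geometric series bound."""
    d = dist_to_integer(theta)
    return float(length) if d == 0 else min(float(length), 1.0 / (2.0 * d))

if __name__ == "__main__":
    for theta in (0.0, 0.001, 0.013, 0.1, 0.5):
        print(theta, inner_sum(theta, 1000, 200), geometric_bound(theta, 200))
```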

A Type II sum makes a contribution to ${ \sum_{x \leq n \leq x+H} \lambda(n) e( \alpha(x) n + \beta(x) )}$ roughly of the form

$\displaystyle \sum_{d \sim D} | \sum_{x/d \leq m \leq x/d+H/d} b_m e(\alpha(x) dm)|.$

We can break this up into a number of sums roughly of the form

$\displaystyle \sum_{d = d_0 + O( H / M )} | \sum_{x/d_0 \leq m \leq x/d_0 + H/D} b_m e(\alpha(x) dm)|$

for ${d_0 \sim D}$; note that the ${d}$ range is non-trivial because ${H}$ is much larger than ${M}$. Applying the usual bilinear sum Cauchy-Schwarz methods (e.g. Theorem 14 of these notes) we conclude that there is a lot of cancellation unless one has ${\alpha(x) = a/q + O( \frac{X \log^{O(1)} X}{H^2} )}$ for some ${q = O(\log^{O(1)} X)}$. But with ${H \geq X^{2/3+\varepsilon}}$, ${X \log^{O(1)} X/H^2}$ is well below the threshold ${X^{-1/6-\varepsilon}}$ for the definition of major arc, so we can exclude this case and obtain the required cancellation.