You are currently browsing the tag archive for the ‘Nash embedding theorem’ tag.

I’ve just uploaded to the arXiv my paper “Embedding the Heisenberg group into a bounded dimensional Euclidean space with optimal distortion“, submitted to Revista Matematica Iberoamericana. This paper concerns the extent to which one can accurately embed the metric structure of the Heisenberg group

\displaystyle H := \begin{pmatrix} 1 & {\bf R} & {\bf R} \\ 0 & 1 & {\bf R} \\ 0 & 0 & 1 \end{pmatrix}

into Euclidean space, which we can write as {\{ [x,y,z]: x,y,z \in {\bf R} \}} with the notation

\displaystyle [x,y,z] := \begin{pmatrix} 1 & x & z \\ 0 & 1 & y \\ 0 & 0 & 1 \end{pmatrix}.

Here we give {H} the right-invariant Carnot-Carathéodory metric {d} coming from the right-invariant vector fields

\displaystyle X := \frac{\partial}{\partial x} + y \frac{\partial}{\partial z}; \quad Y := \frac{\partial}{\partial y}

but not from the commutator vector field

\displaystyle Z := [Y,X] = \frac{\partial}{\partial z}.

This gives {H} the geometry of a Carnot group. As observed by Semmes, it follows from the Carnot group differentiation theory of Pansu that there is no bilipschitz map from {(H,d)} to any Euclidean space {{\bf R}^D} or even to {\ell^2}, since such a map must be differentiable almost everywhere in the sense of Carnot groups, which in particular shows that the derivative map annihilate {Z} almost everywhere, which is incompatible with being bilipschitz.

On the other hand, if one snowflakes the Heisenberg group by replacing the metric {d} with {d^{1-\varepsilon}} for some {0 < \varepsilon < 1}, then it follows from the general theory of Assouad on embedding snowflaked metrics of doubling spaces that {(H,d^{1-\varepsilon})} may be embedded in a bilipschitz fashion into {\ell^2}, or even to {{\bf R}^{D_\varepsilon}} for some {D_\varepsilon} depending on {\varepsilon}.

Of course, the distortion of this bilipschitz embedding must degenerate in the limit {\varepsilon \rightarrow 0}. From the work of Austin-Naor-Tessera and Naor-Neiman it follows that {(H,d^{1-\varepsilon})} may be embedded into {\ell^2} with a distortion of {O( \varepsilon^{-1/2} )}, but no better. The Naor-Neiman paper also embeds {(H,d^{1-\varepsilon})} into a finite-dimensional space {{\bf R}^D} with {D} independent of {\varepsilon}, but at the cost of worsening the distortion to {O(\varepsilon^{-1})}. They then posed the question of whether this worsening of the distortion is necessary.

The main result of this paper answers this question in the negative:

Theorem 1 There exists an absolute constant {D} such that {(H,d^{1-\varepsilon})} may be embedded into {{\bf R}^D} in a bilipschitz fashion with distortion {O(\varepsilon^{-1/2})} for any {0 < \varepsilon \leq 1/2}.

To motivate the proof of this theorem, let us first present a bilipschitz map {\Phi: {\bf R} \rightarrow \ell^2} from the snowflaked line {({\bf R},d_{\bf R}^{1-\varepsilon})} (with {d_{\bf R}} being the usual metric on {{\bf R}}) into complex Hilbert space {\ell^2({\bf C})}. The map is given explicitly as a Weierstrass type function

\displaystyle \Phi(x) := \sum_{k \in {\bf Z}} 2^{-\varepsilon k} (\phi_k(x) - \phi_k(0))

where for each {k}, {\phi_k: {\bf R} \rightarrow \ell^2} is the function

\displaystyle \phi_k(x) := 2^k e^{2\pi i x / 2^k} e_k.

and {(e_k)_{k \in {\bf Z}}} are an orthonormal basis for {\ell^2({\bf C})}. The subtracting of the constant {\phi_k(0)} is purely in order to make the sum convergent as {k \rightarrow \infty}. If {x,y \in {\bf R}} are such that {2^{k_0-2} \leq d_{\bf R}(x,y) \leq 2^{k_0-1}} for some integer {k_0}, one can easily check the bounds

\displaystyle |\phi_k(x) - \phi_k(y)| \lesssim d_{\bf R}(x,y)^{(1-\varepsilon)} \min( 2^{-(1-\varepsilon) (k_0-k)}, 2^{-\varepsilon (k-k_0)} )

with the lower bound

\displaystyle |\phi_{k_0}(x) - \phi_{k_0}(y)| \gtrsim d_{\bf R}(x,y)^{(1-\varepsilon)}

at which point one finds that

\displaystyle d_{\bf R}(x,y)^{1-\varepsilon} \lesssim |\Phi(x) - \Phi(y)| \lesssim \varepsilon^{-1/2} d_{\bf R}(x,y)^{1-\varepsilon}

as desired.

The key here was that each function {\phi_k} oscillated at a different spatial scale {2^k}, and the functions were all orthogonal to each other (so that the upper bound involved a factor of {\varepsilon^{-1/2}} rather than {\varepsilon^{-1}}). One can replicate this example for the Heisenberg group without much difficulty. Indeed, if we let {\Gamma := \{ [a,b,c]: a,b,c \in {\bf Z} \}} be the discrete Heisenberg group, then the nilmanifold {H/\Gamma} is a three-dimensional smooth compact manifold; thus, by the Whitney embedding theorem, it smoothly embeds into {{\bf R}^6}. This gives a smooth immersion {\phi: H \rightarrow {\bf R}^6} which is {\Gamma}-automorphic in the sense that {\phi(p\gamma) = \phi(p)} for all {p \in H} and {\gamma \in \Gamma}. If one then defines {\phi_k: H \rightarrow \ell^2 \otimes {\bf R}^6} to be the function

\displaystyle \phi_k(p) := 2^k \phi( \delta_{2^{-k}}(p) ) \otimes e_k

where {\delta_\lambda: H \rightarrow H} is the scaling map

\displaystyle \delta_\lambda([x,y,z]) := [\lambda x, \lambda y, \lambda^2 z],

then one can repeat the previous arguments to obtain the required bilipschitz bounds

\displaystyle d(p,q)^{1-\varepsilon} \lesssim |\Phi(p) - \Phi(q) \lesssim \varepsilon^{-1/2} d(p,q)^{1-\varepsilon}

for the function

\displaystyle \Phi(p) :=\sum_{k \in {\bf Z}} 2^{-\varepsilon k} (\phi_k(p) - \phi_k(0)).

To adapt this construction to bounded dimension, the main obstruction was the requirement that the {\phi_k} took values in orthogonal subspaces. But if one works things out carefully, it is enough to require the weaker orthogonality requirement

\displaystyle B( \phi_{k_0}, \sum_{k>k_0} 2^{-\varepsilon(k-k_0)} \phi_k ) = 0

for all {k_0 \in {\bf Z}}, where {B(\phi, \psi): H \rightarrow {\bf R}^2} is the bilinear form

\displaystyle B(\phi,\psi) := (X \phi \cdot X \psi, Y \phi \cdot Y \psi ).

One can then try to construct the {\phi_k: H \rightarrow {\bf R}^D} for bounded dimension {D} by an iterative argument. After some standard reductions, the problem becomes this (roughly speaking): given a smooth, slowly varying function {\psi: H \rightarrow {\bf R}^{D}} whose derivatives obey certain quantitative upper and lower bounds, construct a smooth oscillating function {\phi: H \rightarrow {\bf R}^{D}}, whose derivatives also obey certain quantitative upper and lower bounds, which obey the equation

\displaystyle B(\phi,\psi) = 0. \ \ \ \ \ (1)

 

We view this as an underdetermined system of differential equations for {\phi} (two equations in {D} unknowns; after some reductions, our {D} can be taken to be the explicit value {36}). The trivial solution {\phi=0} to this equation will be inadmissible for our purposes due to the lower bounds we will require on {\phi} (in order to obtain the quantitative immersion property mentioned previously, as well as for a stronger “freeness” property that is needed to close the iteration). Because this construction will need to be iterated, it will be essential that the regularity control on {\phi} is the same as that on {\psi}; one cannot afford to “lose derivatives” when passing from {\psi} to {\phi}.

This problem has some formal similarities with the isometric embedding problem (discussed for instance in this previous post), which can be viewed as the problem of solving an equation of the form {Q(\phi,\phi) = g}, where {(M,g)} is a Riemannian manifold and {Q} is the bilinear form

\displaystyle Q(\phi,\psi)_{ij} = \partial_i \phi \cdot \partial_j \psi.

The isometric embedding problem also has the key obstacle that naive attempts to solve the equation {Q(\phi,\phi)=g} iteratively can lead to an undesirable “loss of derivatives” that prevents one from iterating indefinitely. This obstacle was famously resolved by the Nash-Moser iteration scheme in which one alternates between perturbatively adjusting an approximate solution to improve the residual error term, and mollifying the resulting perturbation to counteract the loss of derivatives. The current equation (1) differs in some key respects from the isometric embedding equation {Q(\phi,\phi)=g}, in particular being linear in the unknown field {\phi} rather than quadratic; nevertheless the key obstacle is the same, namely that naive attempts to solve either equation lose derivatives. Our approach to solving (1) was inspired by the Nash-Moser scheme; in retrospect, I also found similarities with Uchiyama’s constructive proof of the Fefferman-Stein decomposition theorem, discussed in this previous post (and in this recent one).

To motivate this iteration, we first express {B(\phi,\psi)} using the product rule in a form that does not place derivatives directly on the unknown {\phi}:

\displaystyle B(\phi,\psi) = \left( W(\phi \cdot W \psi) - \phi \cdot WW \psi\right)_{W = X,Y} \ \ \ \ \ (2)

 

This reveals that one can construct solutions {\phi} to (1) by solving the system of equations

\displaystyle \phi \cdot W \psi = \phi \cdot WW \psi = 0 \ \ \ \ \ (3)

 

for {W \in \{X, Y \}}. Because this system is zeroth order in {\phi}, this can easily be done by linear algebra (even in the presence of a forcing term {B(\phi,\psi)=F}) if one imposes a “freeness” condition (analogous to the notion of a free embedding in the isometric embedding problem) that {X \psi(p), Y \psi(p), XX \psi(p), YY \psi(p)} are linearly independent at each point {p}, which (together with some other technical conditions of a similar nature) one then adds to the list of upper and lower bounds required on {\psi} (with a related bound then imposed on {\phi}, in order to close the iteration). However, as mentioned previously, there is a “loss of derivatives” problem with this construction: due to the presence of the differential operators {W} in (3), a solution {\phi} constructed by this method can only be expected to have two degrees less regularity than {\psi} at best, which makes this construction unsuitable for iteration.

To get around this obstacle (which also prominently appears when solving (linearisations of) the isometric embedding equation {Q(\phi,\phi)=g}), we instead first construct a smooth, low-frequency solution {\phi_{\leq N_0} \colon H \rightarrow {\bf R}^{D}} to a low-frequency equation

\displaystyle B( \phi_{\leq N_0}, P_{\leq N_0} \psi ) = 0 \ \ \ \ \ (4)

 

where {P_{\leq N_0} \psi} is a mollification of {\psi} (of Littlewood-Paley type) applied at a small spatial scale {1/N_0} for some {N_0}, and then gradually relax the frequency cutoff {P_{\leq N_0}} to deform this low frequency solution {\phi_{\leq N_0}} to a solution {\phi} of the actual equation (1).

We will construct the low-frequency solution {\phi_{\leq N_0}} rather explicitly, using the Whitney embedding theorem to construct an initial oscillating map {f} into a very low dimensional space {{\bf R}^6}, composing it with a Veronese type embedding into a slightly larger dimensional space {{\bf R}^{27}} to obtain a required “freeness” property, and then composing further with a slowly varying isometry {U(p) \colon {\bf R}^{27} \rightarrow {\bf R}^{36}} depending on {P_{\leq N_0}} and constructed by a quantitative topological lemma (relying ultimately on the vanishing of the first few homotopy groups of high-dimensional spheres), in order to obtain the required orthogonality (4). (This sort of “quantitative null-homotopy” was first proposed by Gromov, with some recent progress on optimal bounds by Chambers-Manin-Weinberger and by Chambers-Dotterer-Manin-Weinberger, but we will not need these more advanced results here, as one can rely on the classical qualitative vanishing {\pi^k(S^d)=0} for {k < d} together with a compactness argument to obtain (ineffective) quantitative bounds, which suffice for this application).

To perform the deformation of {\phi_{\leq N_0}} into {\phi}, we must solve what is essentially the linearised equation

\displaystyle B( \dot \phi, \psi ) + B( \phi, \dot \psi ) = 0 \ \ \ \ \ (5)

 

of (1) when {\phi}, {\psi} (viewed as low frequency functions) are both being deformed at some rates {\dot \phi, \dot \psi} (which should be viewed as high frequency functions). To avoid losing derivatives, the magnitude of the deformation {\dot \phi} in {\phi} should not be significantly greater than the magnitude of the deformation {\dot \psi} in {\psi}, when measured in the same function space norms.

As before, if one directly solves the difference equation (5) using a naive application of (2) with {B(\phi,\dot \psi)} treated as a forcing term, one will lose at least one derivative of regularity when passing from {\dot \psi} to {\dot \phi}. However, observe that (2) (and the symmetry {B(\phi, \dot \psi) = B(\dot \psi,\phi)}) can be used to obtain the identity

\displaystyle B( \dot \phi, \psi ) + B( \phi, \dot \psi ) = \left( W(\dot \phi \cdot W \psi + \dot \psi \cdot W \phi) - (\dot \phi \cdot WW \psi + \dot \psi \cdot WW \phi)\right)_{W = X,Y} \ \ \ \ \ (6)

 

and then one can solve (5) by solving the system of equations

\displaystyle \dot \phi \cdot W \psi = - \dot \psi \cdot W \phi

for {W \in \{X,XX,Y,YY\}}. The key point here is that this system is zeroth order in both {\dot \phi} and {\dot \psi}, so one can solve this system without losing any derivatives when passing from {\dot \psi} to {\dot \phi}; compare this situation with that of the superficially similar system

\displaystyle \dot \phi \cdot W \psi = - \phi \cdot W \dot \psi

that one would obtain from naively linearising (3) without exploiting the symmetry of {B}. There is still however one residual “loss of derivatives” problem arising from the presence of a differential operator {W} on the {\phi} term, which prevents one from directly evolving this iteration scheme in time without losing regularity in {\phi}. It is here that we borrow the final key idea of the Nash-Moser scheme, which is to replace {\phi} by a mollified version {P_{\leq N} \phi} of itself (where the projection {P_{\leq N}} depends on the time parameter). This creates an error term in (5), but it turns out that this error term is quite small and smooth (being a “high-high paraproduct” of {\nabla \phi} and {\nabla\psi}, it ends up being far more regular than either {\phi} or {\psi}, even with the presence of the derivatives) and can be iterated away provided that the initial frequency cutoff {N_0} is large and the function {\psi} has a fairly high (but finite) amount of regularity (we will eventually use the Hölder space {C^{20,\alpha}} on the Heisenberg group to measure this).

I’ve just uploaded to the arXiv my paper Finite time blowup for a supercritical defocusing nonlinear Schrödinger system, submitted to Analysis and PDE. This paper is an analogue of a recent paper of mine in which I constructed a supercritical defocusing nonlinear wave (NLW) system {-\partial_{tt} u + \Delta u = (\nabla F)(u)} which exhibited smooth solutions that developed singularities in finite time. Here, we achieve essentially the same conclusion for the (inhomogeneous) supercritical defocusing nonlinear Schrödinger (NLS) equation

\displaystyle  i \partial_t u + \Delta u = (\nabla F)(u) + G \ \ \ \ \ (1)

where {u: {\bf R} \times {\bf R}^d \rightarrow {\bf C}^m} is now a system of scalar fields, {F: {\bf C}^m \rightarrow {\bf R}} is a potential which is strictly positive and homogeneous of degree {p+1} (and invariant under phase rotations {u \mapsto e^{i\theta} u}), and {G: {\bf R} \times {\bf R}^d \rightarrow {\bf C}^m} is a smooth compactly supported forcing term, needed for technical reasons.

To oversimplify somewhat, the equation (1) is known to be globally regular in the energy-subcritical case when {d \leq 2}, or when {d \geq 3} and {p < 1+\frac{4}{d-2}}; global regularity is also known (but is significantly more difficult to establish) in the energy-critical case when {d \geq 3} and {p = 1 +\frac{4}{d-2}}. (This is an oversimplification for a number of reasons, in particular in higher dimensions one only knows global well-posedness instead of global regularity. See this previous post for some exploration of this issue in the context of nonlinear wave equations.) The main result of this paper is to show that global regularity can break down in the remaining energy-supercritical case when {d \geq 3} and {p > 1 + \frac{4}{d-2}}, at least when the target dimension {m} is allowed to be sufficiently large depending on the spatial dimension {d} (I did not try to achieve the optimal value of {m} here, but the argument gives a value of {m} that grows quadratically in {d}). Unfortunately, this result does not directly impact the most interesting case of the defocusing scalar NLS equation

\displaystyle  i \partial_t u + \Delta u = |u|^{p-1} u \ \ \ \ \ (2)

in which {m=1}; however it does establish a rigorous barrier to any attempt to prove global regularity for the scalar NLS equation, in that such an attempt needs to crucially use some property of the scalar NLS that is not shared by the more general systems in (1). For instance, any approach that is primarily based on the conservation laws of mass, momentum, and energy (which are common to both (1) and (2)) will not be sufficient to establish global regularity of supercritical defocusing scalar NLS.

The method of proof in this paper is broadly similar to that in the previous paper for NLW, but with a number of additional technical complications. Both proofs begin by reducing matters to constructing a discretely self-similar solution. In the case of NLW, this solution lived on a forward light cone {\{ (t,x): |x| \leq t \}} and obeyed a self-similarity

\displaystyle  u(2t, 2x) = 2^{-\frac{2}{p-1}} u(t,x).

The ability to restrict to a light cone arose from the finite speed of propagation properties of NLW. For NLS, the solution will instead live on the domain

\displaystyle  H_d := ([0,+\infty) \times {\bf R}^d) \backslash \{(0,0)\}

and obey a parabolic self-similarity

\displaystyle  u(4t, 2x) = 2^{-\frac{2}{p-1}} u(t,x)

and solve the homogeneous version {G=0} of (1). (The inhomogeneity {G} emerges when one truncates the self-similar solution so that the initial data is compactly supported in space.) A key technical point is that {u} has to be smooth everywhere in {H_d}, including the boundary component {\{ (0,x): x \in {\bf R}^d \backslash \{0\}\}}. This unfortunately rules out many of the existing constructions of self-similar solutions, which typically will have some sort of singularity at the spatial origin.

The remaining steps of the argument can broadly be described as quantifier elimination: one systematically eliminates each of the degrees of freedom of the problem in turn by locating the necessary and sufficient conditions required of the remaining degrees of freedom in order for the constraints of a particular degree of freedom to be satisfiable. The first such degree of freedom to eliminate is the potential function {F}. The task here is to determine what constraints must exist on a putative solution {u} in order for there to exist a (positive, homogeneous, smooth away from origin) potential {F} obeying the homogeneous NLS equation

\displaystyle  i \partial_t u + \Delta u = (\nabla F)(u).

Firstly, the requirement that {F} be homogeneous implies the Euler identity

\displaystyle  \langle (\nabla F)(u), u \rangle = (p+1) F(u)

(where {\langle,\rangle} denotes the standard real inner product on {{\bf C}^m}), while the requirement that {F} be phase invariant similarly yields the variant identity

\displaystyle  \langle (\nabla F)(u), iu \rangle = 0,

so if one defines the potential energy field to be {V = F(u)}, we obtain from the chain rule the equations

\displaystyle  \langle i \partial_t u + \Delta u, u \rangle = (p+1) V

\displaystyle  \langle i \partial_t u + \Delta u, iu \rangle = 0

\displaystyle  \langle i \partial_t u + \Delta u, \partial_t u \rangle = \partial_t V

\displaystyle  \langle i \partial_t u + \Delta u, \partial_{x_j} u \rangle = \partial_{x_j} V.

Conversely, it turns out (roughly speaking) that if one can locate fields {u} and {V} obeying the above equations (as well as some other technical regularity and non-degeneracy conditions), then one can find an {F} with all the required properties. The first of these equations can be thought of as a definition of the potential energy field {V}, and the other three equations are basically disguised versions of the conservation laws of mass, energy, and momentum respectively. The construction of {F} relies on a classical extension theorem of Seeley that is a relative of the Whitney extension theorem.

Now that the potential {F} is eliminated, the next degree of freedom to eliminate is the solution field {u}. One can observe that the above equations involving {u} and {V} can be expressed instead in terms of {V} and the Gram-type matrix {G[u,u]} of {u}, which is a {(2d+4) \times (2d+4)} matrix consisting of the inner products {\langle D_1 u, D_2 u \rangle} where {D_1,D_2} range amongst the {2d+4} differential operators

\displaystyle  D_1,D_2 \in \{ 1, i, \partial_t, i\partial_t, \partial_{x_1},\dots,\partial_{x_d}, i\partial_{x_1}, \dots, i\partial_{x_d}\}.

To eliminate {u}, one thus needs to answer the question of what properties are required of a {(2d+4) \times (2d+4)} matrix {G} for it to be the Gram-type matrix {G = G[u,u]} of a field {u}. Amongst some obvious necessary conditions are that {G} needs to be symmetric and positive semi-definite; there are also additional constraints coming from identities such as

\displaystyle  \partial_t \langle u, u \rangle = 2 \langle u, \partial_t u \rangle

\displaystyle  \langle i u, \partial_t u \rangle = - \langle u, i \partial_t u \rangle

and

\displaystyle  \partial_{x_j} \langle iu, \partial_{x_k} u \rangle - \partial_{x_k} \langle iu, \partial_{x_j} u \rangle = 2 \langle i \partial_{x_j} u, \partial_{x_k} u \rangle.

Ideally one would like a theorem that asserts (for {m} large enough) that as long as {G} obeys all of the “obvious” constraints, then there exists a suitably non-degenerate map {u} such that {G = G[u,u]}. In the case of NLW, the analogous claim was basically a consequence of the Nash embedding theorem (which can be viewed as a theorem about the solvability of the system of equations {\langle \partial_{x_j} u, \partial_{x_k} u \rangle = g_{jk}} for a given positive definite symmetric set of fields {g_{jk}}). However, the presence of the complex structure in the NLS case poses some significant technical challenges (note for instance that the naive complex version of the Nash embedding theorem is false, due to obstructions such as Liouville’s theorem that prevent a compact complex manifold from being embeddable holomorphically in {{\bf C}^m}). Nevertheless, by adapting the proof of the Nash embedding theorem (in particular, the simplified proof of Gunther that avoids the need to use the Nash-Moser iteration scheme) we were able to obtain a partial complex analogue of the Nash embedding theorem that sufficed for our application; it required an artificial additional “curl-free” hypothesis on the Gram-type matrix {G[u,u]}, but fortunately this hypothesis ends up being automatic in our construction. Also, this version of the Nash embedding theorem is unable to prescribe the component {\langle \partial_t u, \partial_t u \rangle} of the Gram-type matrix {G[u,u]}, but fortunately this component is not used in any of the conservation laws and so the loss of this component does not cause any difficulty.

After applying the above-mentioned Nash-embedding theorem, the task is now to locate a matrix {G} obeying all the hypotheses of that theorem, as well as the conservation laws for mass, momentum, and energy (after defining the potential energy field {V} in terms of {G}). This is quite a lot of fields and constraints, but one can cut down significantly on the degrees of freedom by requiring that {G} is spherically symmetric (in a tensorial sense) and also continuously self-similar (not just discretely self-similar). Note that this hypothesis is weaker than the assertion that the original field {u} is spherically symmetric and continuously self-similar; indeed we do not know if non-trivial solutions of this type actually exist. These symmetry hypotheses reduce the number of independent components of the {(2d+4) \times (2d+4)} matrix {G} to just six: {g_{1,1}, g_{1,i\partial_t}, g_{1,i\partial_r}, g_{\partial_r, \partial_r}, g_{\partial_\omega, \partial_\omega}, g_{\partial_r, \partial_t}}, which now take as their domain the {1+1}-dimensional space

\displaystyle  H_1 := ([0,+\infty) \times {\bf R}) \backslash \{(0,0)\}.

One now has to construct these six fields, together with a potential energy field {v}, that obey a number of constraints, notably some positive definiteness constraints as well as the aforementioned conservation laws for mass, momentum, and energy.

The field {g_{1,i\partial_t}} only arises in the equation for the potential {v} (coming from Euler’s identity) and can easily be eliminated. Similarly, the field {g_{\partial_r,\partial_t}} only makes an appearance in the current of the energy conservation law, and so can also be easily eliminated so long as the total energy is conserved. But in the energy-supercritical case, the total energy is infinite, and so it is relatively easy to eliminate the field {g_{\partial_r, \partial_t}} from the problem also. This leaves us with the task of constructing just five fields {g_{1,1}, g_{1,i\partial_r}, g_{\partial_r,\partial_r}, g_{\partial_\omega,\partial_\omega}, v} obeying a number of positivity conditions, symmetry conditions, regularity conditions, and conservation laws for mass and momentum.

The potential field {v} can effectively be absorbed into the angular stress field {g_{\partial_\omega,\partial_\omega}} (after placing an appropriate counterbalancing term in the radial stress field {g_{\partial_r, \partial_r}} so as not to disrupt the conservation laws), so we can also eliminate this field. The angular stress field {g_{\partial_\omega, \partial_\omega}} is then only constrained through the momentum conservation law and a requirement of positivity; one can then eliminate this field by converting the momentum conservation law from an equality to an inequality. Finally, the radial stress field {g_{\partial_r, \partial_r}} is also only constrained through a positive definiteness constraint and the momentum conservation inequality, so it can also be eliminated from the problem after some further modification of the momentum conservation inequality.

The task then reduces to locating just two fields {g_{1,1}, g_{1,i\partial_r}} that obey a mass conservation law

\displaystyle  \partial_t g_{1,1} = 2 \left(\partial_r + \frac{d-1}{r} \right) g_{1,i\partial r}

together with an additional inequality that is the remnant of the momentum conservation law. One can solve for the mass conservation law in terms of a single scalar field {W} using the ansatz

\displaystyle g_{1,1} = 2 r^{1-d} \partial_r (r^d W)

\displaystyle g_{1,i\partial_r} = r^{1-d} \partial_t (r^d W)

so the problem has finally been simplified to the task of locating a single scalar field {W} with some scaling and homogeneity properties that obeys a certain differential inequality relating to momentum conservation. This turns out to be possible by explicitly writing down a specific scalar field {W} using some asymptotic parameters and cutoff functions.

Throughout this post we shall always work in the smooth category, thus all manifolds, maps, coordinate charts, and functions are assumed to be smooth unless explicitly stated otherwise.

A (real) manifold {M} can be defined in at least two ways. On one hand, one can define the manifold extrinsically, as a subset of some standard space such as a Euclidean space {{\bf R}^d}. On the other hand, one can define the manifold intrinsically, as a topological space equipped with an atlas of coordinate charts. The fundamental embedding theorems show that, under reasonable assumptions, the intrinsic and extrinsic approaches give the same classes of manifolds (up to isomorphism in various categories). For instance, we have the following (special case of) the Whitney embedding theorem:

Theorem 1 (Whitney embedding theorem) Let {M} be a compact manifold. Then there exists an embedding {u: M \rightarrow {\bf R}^d} from {M} to a Euclidean space {{\bf R}^d}.

In fact, if {M} is {n}-dimensional, one can take {d} to equal {2n}, which is often best possible (easy examples include the circle {{\bf R}/{\bf Z}} which embeds into {{\bf R}^2} but not {{\bf R}^1}, or the Klein bottle that embeds into {{\bf R}^4} but not {{\bf R}^3}). One can also relax the compactness hypothesis on {M} to second countability, but we will not pursue this extension here. We give a “cheap” proof of this theorem below the fold which allows one to take {d} equal to {2n+1}.

A significant strengthening of the Whitney embedding theorem is (a special case of) the Nash embedding theorem:

Theorem 2 (Nash embedding theorem) Let {(M,g)} be a compact Riemannian manifold. Then there exists a isometric embedding {u: M \rightarrow {\bf R}^d} from {M} to a Euclidean space {{\bf R}^d}.

In order to obtain the isometric embedding, the dimension {d} has to be a bit larger than what is needed for the Whitney embedding theorem; in this article of Gunther the bound

\displaystyle d = \max( n(n+5)/2, n(n+3)/2 + 5) \ \ \ \ \ (1)

 

is attained, which I believe is still the record for large {n}. (In the converse direction, one cannot do better than {d = \frac{n(n+1)}{2}}, basically because this is the number of degrees of freedom in the Riemannian metric {g}.) Nash’s original proof of theorem used what is now known as Nash-Moser inverse function theorem, but a subsequent simplification of Gunther allowed one to proceed using just the ordinary inverse function theorem (in Banach spaces).

I recently had the need to invoke the Nash embedding theorem to establish a blowup result for a nonlinear wave equation, which motivated me to go through the proof of the theorem more carefully. Below the fold I give a proof of the theorem that does not attempt to give an optimal value of {d}, but which hopefully isolates the main ideas of the argument (as simplified by Gunther). One advantage of not optimising in {d} is that it allows one to freely exploit the very useful tool of pairing together two maps {u_1: M \rightarrow {\bf R}^{d_1}}, {u_2: M \rightarrow {\bf R}^{d_2}} to form a combined map {(u_1,u_2): M \rightarrow {\bf R}^{d_1+d_2}} that can be closer to an embedding or an isometric embedding than the original maps {u_1,u_2}. This lets one perform a “divide and conquer” strategy in which one first starts with the simpler problem of constructing some “partial” embeddings of {M} and then pairs them together to form a “better” embedding.

In preparing these notes, I found the articles of Deane Yang and of Siyuan Lu to be helpful.

Read the rest of this entry »

I’ve just uploaded to the arXiv my paper Finite time blowup for a supercritical defocusing nonlinear wave system, submitted to Analysis and PDE. This paper was inspired by a question asked of me by Sergiu Klainerman recently, regarding whether there were any analogues of my blowup example for Navier-Stokes type equations in the setting of nonlinear wave equations.

Recall that the defocusing nonlinear wave (NLW) equation reads

\displaystyle \Box u = |u|^{p-1} u \ \ \ \ \ (1)

 

where {u: {\bf R}^{1+d} \rightarrow {\bf R}} is the unknown scalar field, {\Box = -\partial_t^2 + \Delta} is the d’Alambertian operator, and {p>1} is an exponent. We can generalise this equation to the defocusing nonlinear wave system

\displaystyle \Box u = (\nabla F)(u) \ \ \ \ \ (2)

 

where {u: {\bf R}^{1+d} \rightarrow {\bf R}^m} is now a system of scalar fields, and {F: {\bf R}^m \rightarrow {\bf R}} is a potential which is homogeneous of degree {p+1} and strictly positive away from the origin; the scalar equation corresponds to the case where {m=1} and {F(u) = \frac{1}{p+1} |u|^{p+1}}. We will be interested in smooth solutions {u} to (2). It is only natural to restrict to the smooth category when the potential {F} is also smooth; unfortunately, if one requires {F} to be homogeneous of order {p+1} all the way down to the origin, then {F} cannot be smooth unless it is identically zero or {p+1} is an odd integer. This is too restrictive for us, so we will only require that {F} be homogeneous away from the origin (e.g. outside the unit ball). In any event it is the behaviour of {F(u)} for large {u} which will be decisive in understanding regularity or blowup for the equation (2).

Formally, solutions to the equation (2) enjoy a conserved energy

\displaystyle E[u] = \int_{{\bf R}^d} \frac{1}{2} \|\partial_t u \|^2 + \frac{1}{2} \| \nabla_x u \|^2 + F(u)\ dx.

Using this conserved energy, it is possible to establish global regularity for the Cauchy problem (2) in the energy-subcritical case when {d \leq 2}, or when {d \geq 3} and {p < 1+\frac{4}{d-2}}. This means that for any smooth initial position {u_0: {\bf R}^d \rightarrow {\bf R}^m} and initial velocity {u_1: {\bf R}^d \rightarrow {\bf R}^m}, there exists a (unique) smooth global solution {u: {\bf R}^{1+d} \rightarrow {\bf R}^m} to the equation (2) with {u(0,x) = u_0(x)} and {\partial_t u(0,x) = u_1(x)}. These classical global regularity results (essentially due to Jörgens) were famously extended to the energy-critical case when {d \geq 3} and {p = 1 + \frac{4}{d-2}} by Grillakis, Struwe, and Shatah-Struwe (though for various technical reasons, the global regularity component of these results was limited to the range {3 \leq d \leq 7}). A key tool used in the energy-critical theory is the Morawetz estimate

\displaystyle \int_0^T \int_{{\bf R}^d} \frac{|u(t,x)|^{p+1}}{|x|}\ dx dt \lesssim E[u]

which can be proven by manipulating the properties of the stress-energy tensor

\displaystyle T_{\alpha \beta} = \langle \partial_\alpha u, \partial_\beta u \rangle - \frac{1}{2} \eta_{\alpha \beta} (\langle \partial^\gamma u, \partial_\gamma u \rangle + F(u))

(with the usual summation conventions involving the Minkowski metric {\eta_{\alpha \beta} dx^\alpha dx^\beta = -dt^2 + |dx|^2}) and in particular exploiting the divergence-free nature of this tensor: {\partial^\beta T_{\alpha \beta}} See for instance the text of Shatah-Struwe, or my own PDE book, for more details. The energy-critical regularity results have also been extended to slightly supercritical settings in which the potential grows by a logarithmic factor or so faster than the critical rate; see the results of myself and of Roy.

This leaves the question of global regularity for the energy supercritical case when {d \geq 3} and {p > 1+\frac{4}{d-2}}. On the one hand, global smooth solutions are known for small data (if {F} vanishes to sufficiently high order at the origin, see e.g. the work of Lindblad and Sogge), and global weak solutions for large data were constructed long ago by Segal. On the other hand, the solution map, if it exists, is known to be extremely unstable, particularly at high frequencies; see for instance this paper of Lebeau, this paper of Christ, Colliander, and myself, this paper of Brenner and Kumlin, or this paper of Ibrahim, Majdoub, and Masmoudi for various formulations of this instability. In the case of the focusing NLW {-\partial_{tt} u + \Delta u = - |u|^{p-1} u}, one can easily create solutions that blow up in finite time by ODE constructions, for instance one can take {u(t,x) = c (1-t)^{-\frac{2}{p-1}}} with {c = (\frac{2(p+1)}{(p-1)^2})^{\frac{1}{p-1}}}, which blows up as {t} approaches {1}. However the situation in the defocusing supercritical case is less clear. The strongest positive results are of Kenig-Merle and Killip-Visan, which show (under some additional technical hypotheses) that global regularity for such equations holds under the additional assumption that the critical Sobolev norm of the solution stays bounded. Roughly speaking, this shows that “Type II blowup” cannot occur for (2).

Our main result is that finite time blowup can in fact occur, at least for three-dimensional systems where the number {m} of degrees of freedom is sufficiently large:

Theorem 1 Let {d=3}, {p > 5}, and {m \geq 76}. Then there exists a smooth potential {F: {\bf R}^m \rightarrow {\bf R}}, positive and homogeneous of degree {p+1} away from the origin, and a solution to (2) with smooth initial data that develops a singularity in finite time.

The rather large lower bound of {76} on {m} here is primarily due to our use of the Nash embedding theorem (which is the first time I have actually had to use this theorem in an application!). It can certainly be lowered, but unfortunately our methods do not seem to be able to bring {m} all the way down to {1}, so we do not directly exhibit finite time blowup for the scalar supercritical defocusing NLW. Nevertheless, this result presents a barrier to any attempt to prove global regularity for that equation, in that it must somehow use a property of the scalar equation which is not available for systems. It is likely that the methods can be adapted to higher dimensions than three, but we take advantage of some special structure to the equations in three dimensions (related to the strong Huygens principle) which does not seem to be available in higher dimensions.

The blowup will in fact be of discrete self-similar type in a backwards light cone, thus {u} will obey a relation of the form

\displaystyle u(e^S t, e^S x) = e^{-\frac{2}{p-1} S} u(t,x)

for some fixed {S>0} (the exponent {-\frac{2}{p-1}} is mandated by dimensional analysis considerations). It would be natural to consider continuously self-similar solutions (in which the above relation holds for all {S}, not just one {S}). And rough self-similar solutions have been constructed in the literature by perturbative methods (see this paper of Planchon, or this paper of Ribaud and Youssfi). However, it turns out that continuously self-similar solutions to a defocusing equation have to obey an additional monotonicity formula which causes them to not exist in three spatial dimensions; this argument is given in my paper. So we have to work just with discretely self-similar solutions.

Because of the discrete self-similarity, the finite time blowup solution will be “locally Type II” in the sense that scale-invariant norms inside the backwards light cone stay bounded as one approaches the singularity. But it will not be “globally Type II” in that scale-invariant norms stay bounded outside the light cone as well; indeed energy will leak from the light cone at every scale. This is consistent with the results of Kenig-Merle and Killip-Visan which preclude “globally Type II” blowup solutions to these equations in many cases.

We now sketch the arguments used to prove this theorem. Usually when studying the NLW, we think of the potential {F} (and the initial data {u_0,u_1}) as being given in advance, and then try to solve for {u} as an unknown field. However, in this problem we have the freedom to select {F}. So we can look at this problem from a “backwards” direction: we first choose the field {u}, and then fit the potential {F} (and the initial data) to match that field.

Now, one cannot write down a completely arbitrary field {u} and hope to find a potential {F} obeying (2), as there are some constraints coming from the homogeneity of {F}. Namely, from the Euler identity

\displaystyle \langle u, (\nabla F)(u) \rangle = (p+1) F(u)

we see that {F(u)} can be recovered from (2) by the formula

\displaystyle F(u) = \frac{1}{p+1} \langle u, \Box u \rangle \ \ \ \ \ (3)

 

so the defocusing nature of {F} imposes a constraint

\displaystyle \langle u, \Box u \rangle > 0.

Furthermore, taking a derivative of (3) we obtain another constraining equation

\displaystyle \langle \partial_\alpha u, \Box u \rangle = \frac{1}{p+1} \partial_\alpha \langle u, \Box u \rangle

that does not explicitly involve the potential {F}. Actually, one can write this equation in the more familiar form

\displaystyle \partial^\beta T_{\alpha \beta} = 0

where {T_{\alpha \beta}} is the stress-energy tensor

\displaystyle T_{\alpha \beta} = \langle \partial_\alpha u, \partial_\beta u \rangle - \frac{1}{2} \eta_{\alpha \beta} (\langle \partial^\gamma u, \partial_\gamma u \rangle + \frac{1}{p+1} \langle u, \Box u \rangle),

now written in a manner that does not explicitly involve {F}.

With this reformulation, this suggests a strategy for locating {u}: first one selects a stress-energy tensor {T_{\alpha \beta}} that is divergence-free and obeys suitable positive definiteness and self-similarity properties, and then locates a self-similar map {u} from the backwards light cone to {{\bf R}^m} that has that stress-energy tensor (one also needs the map {u} (or more precisely the direction component {u/\|u\|} of that map) injective up to the discrete self-similarity, in order to define {F(u)} consistently). If the stress-energy tensor was replaced by the simpler “energy tensor”

\displaystyle E_{\alpha \beta} = \langle \partial_\alpha u, \partial_\beta u \rangle

then the question of constructing an (injective) map {u} with the specified energy tensor is precisely the embedding problem that was famously solved by Nash (viewing {E_{\alpha \beta}} as a Riemannian metric on the domain of {u}, which in this case is a backwards light cone quotiented by a discrete self-similarity to make it compact). It turns out that one can adapt the Nash embedding theorem to also work with the stress-energy tensor as well (as long as one also specifies the mass density {M = \|u\|^2}, and as long as a certain positive definiteness property, related to the positive semi-definiteness of Gram matrices, is obeyed). Here is where the dimension {76} shows up:

Proposition 2 Let {M} be a smooth compact Riemannian {4}-manifold, and let {m \geq 76}. Then {M} smoothly isometrically embeds into the sphere {S^{m-1}}.

Proof: The Nash embedding theorem (in the form given in this ICM lecture of Gunther) shows that {M} can be smoothly isometrically embedded into {{\bf R}^{19}}, and thus in {[-R,R]^{19}} for some large {R}. Using an irrational slope, the interval {[-R,R]} can be smoothly isometrically embedded into the {2}-torus {\frac{1}{\sqrt{38}} (S^1 \times S^1)}, and so {[-R,R]^{19}} and hence {M} can be smoothly embedded in {\frac{1}{\sqrt{38}} (S^1)^{38}}. But from Pythagoras’ theorem, {\frac{1}{\sqrt{38}} (S^1)^{38}} can be identified with a subset of {S^{m-1}} for any {m \geq 76}, and the claim follows. \Box

One can presumably improve upon the bound {76} by being more efficient with the embeddings (e.g. by modifying the proof of Nash embedding to embed directly into a round sphere), but I did not try to optimise the bound here.

The remaining task is to construct the stress-energy tensor {T_{\alpha \beta}}. One can reduce to tensors that are invariant with respect to rotations around the spatial origin, but this still leaves a fair amount of degrees of freedom (it turns out that there are four fields that need to be specified, which are denoted {M, E_{tt}, E_{tr}, E_{rr}} in my paper). However a small miracle occurs in three spatial dimensions, in that the divergence-free condition involves only two of the four degrees of freedom (or three out of four, depending on whether one considers a function that is even or odd in {r} to only be half a degree of freedom). This is easiest to illustrate with the scalar NLW (1). Assuming spherical symmetry, this equation becomes

\displaystyle - \partial_{tt} u + \partial_{rr} u + \frac{2}{r} \partial_r u = |u|^{p-1} u.

Making the substitution {\phi := ru}, we can eliminate the lower order term {\frac{2}{r} \partial_r} completely to obtain

\displaystyle - \partial_{tt} \phi + \partial_{rr} \phi= \frac{1}{r^{p-1}} |\phi|^{p-1} \phi.

(This can be compared with the situation in higher dimensions, in which an undesirable zeroth order term {\frac{(d-1)(d-3)}{r^2} \phi} shows up.) In particular, if one introduces the null energy density

\displaystyle e_+ := \frac{1}{2} |\partial_t \phi + \partial_r \phi|^2

and the potential energy density

\displaystyle V := \frac{|\phi|^{p+1}}{(p+1) r^{p-1}}

then one can verify the equation

\displaystyle (\partial_t - \partial_r) e_+ + (\partial_t + \partial_r) V = - \frac{p-1}{r} V

which can be viewed as a transport equation for {e_+} with forcing term depending on {V} (or vice versa), and is thus quite easy to solve explicitly by choosing one of these fields and then solving for the other. As it turns out, once one is in the supercritical regime {p>5}, one can solve this equation while giving {e_+} and {V} the right homogeneity (they have to be homogeneous of order {-\frac{4}{p-1}}, which is greater than {-1} in the supercritical case) and positivity properties, and from this it is possible to prescribe all the other fields one needs to satisfy the conclusions of the main theorem. (It turns out that {e_+} and {V} will be concentrated near the boundary of the light cone, so this is how the solution {u} will concentrate also.)

Archives