Notes on the Nash embedding theorem

11 May, 2016 in expository, math.AP, math.DG, math.MG | Tags: Nash embedding theorem, Riemannian geometry, Whitney embedding theorem | by Terence Tao

Throughout this post we shall always work in the smooth category, thus all manifolds, maps, coordinate charts, and functions are assumed to be smooth unless explicitly stated otherwise.

A (real) manifold ${M}$ can be defined in at least two ways. On one hand, one can define the manifold extrinsically, as a subset of some standard space such as a Euclidean space ${{\bf R}^d}$ . On the other hand, one can define the manifold intrinsically, as a topological space equipped with an atlas of coordinate charts. The fundamental embedding theorems show that, under reasonable assumptions, the intrinsic and extrinsic approaches give the same classes of manifolds (up to isomorphism in various categories). For instance, we have the following (special case of) the Whitney embedding theorem:

Theorem 1 (Whitney embedding theorem) Let ${M}$ be a compact manifold. Then there exists an embedding ${u: M \rightarrow {\bf R}^d}$ from ${M}$ to a Euclidean space ${{\bf R}^d}$ .

In fact, if ${M}$ is ${n}$ -dimensional, one can take ${d}$ to equal ${2n}$ , which is often best possible (easy examples include the circle ${{\bf R}/{\bf Z}}$ which embeds into ${{\bf R}^2}$ but not ${{\bf R}^1}$ , or the Klein bottle that embeds into ${{\bf R}^4}$ but not ${{\bf R}^3}$ ). One can also relax the compactness hypothesis on ${M}$ to second countability, but we will not pursue this extension here. We give a “cheap” proof of this theorem below the fold which allows one to take ${d}$ equal to ${2n+1}$ .

A significant strengthening of the Whitney embedding theorem is (a special case of) the Nash embedding theorem:

Theorem 2 (Nash embedding theorem) Let ${(M,g)}$ be a compact Riemannian manifold. Then there exists an isometric embedding ${u: M \rightarrow {\bf R}^d}$ from ${M}$ to a Euclidean space ${{\bf R}^d}$ .

In order to obtain the isometric embedding, the dimension ${d}$ has to be a bit larger than what is needed for the Whitney embedding theorem; in this article of Gunther the bound

$\displaystyle d = \max( n(n+5)/2, n(n+3)/2 + 5) \ \ \ \ \ (1)$

is attained, which I believe is still the record for large ${n}$ . (In the converse direction, one cannot do better than ${d = \frac{n(n+1)}{2}}$ , basically because this is the number of degrees of freedom in the Riemannian metric ${g}$ .) Nash’s original proof of the theorem used what is now known as the Nash-Moser inverse function theorem, but a subsequent simplification of Gunther allowed one to proceed using just the ordinary inverse function theorem (in Banach spaces).
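To get a feel for the sizes involved, the following is a minimal Python sketch that tabulates the bound (1) against the lower bound ${\frac{n(n+1)}{2}}$ . The function names are illustrative choices and are not taken from any reference.

```python
def gunther_dimension(n: int) -> int:
    """Target dimension from the bound (1): max(n(n+5)/2, n(n+3)/2 + 5)."""
    # n(n+5) and n(n+3) are always even, so integer division is exact.
    return max(n * (n + 5) // 2, n * (n + 3) // 2 + 5)


def metric_degrees_of_freedom(n: int) -> int:
    """Number of independent components of a symmetric n x n tensor."""
    return n * (n + 1) // 2


# Columns: n, lower bound n(n+1)/2, bound (1).
for n in range(1, 8):
    print(n, metric_degrees_of_freedom(n), gunther_dimension(n))
```

The two arguments of the maximum differ by ${n-5}$ , so the first one dominates exactly when ${n \geq 5}$ .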

I recently had the need to invoke the Nash embedding theorem to establish a blowup result for a nonlinear wave equation, which motivated me to go through the proof of the theorem more carefully. Below the fold I give a proof of the theorem that does not attempt to give an optimal value of ${d}$ , but which hopefully isolates the main ideas of the argument (as simplified by Gunther). One advantage of not optimising in ${d}$ is that it allows one to freely exploit the very useful tool of pairing together two maps ${u_1: M \rightarrow {\bf R}^{d_1}}$ , ${u_2: M \rightarrow {\bf R}^{d_2}}$ to form a combined map ${(u_1,u_2): M \rightarrow {\bf R}^{d_1+d_2}}$ that can be closer to an embedding or an isometric embedding than the original maps ${u_1,u_2}$ . This lets one perform a “divide and conquer” strategy in which one first starts with the simpler problem of constructing some “partial” embeddings of ${M}$ and then pairs them together to form a “better” embedding.

In preparing these notes, I found the articles of Deane Yang and of Siyuan Lu to be helpful.

— 1. The Whitney embedding theorem —

To prove the Whitney embedding theorem, we first prove a weaker version in which the embedding is replaced by an immersion:

Theorem 3 (Weak Whitney embedding theorem) Let ${M}$ be a compact manifold. Then there exists an immersion ${u: M \rightarrow {\bf R}^d}$ from ${M}$ to a Euclidean space ${{\bf R}^d}$ .

Proof: Our objective is to construct a map ${u: M \rightarrow {\bf R}^d}$ such that the derivatives ${\partial_\alpha u(x)}$ are linearly independent in ${{\bf R}^d}$ for each ${x \in M}$ . For any given point ${x_0 \in M}$ , we have a coordinate chart ${\phi_{x_0}: U_{x_0} \rightarrow {\bf R}^n}$ from some neighbourhood ${U_{x_0}}$ of ${x_0}$ to ${{\bf R}^n}$ . If we set ${u_{x_0}: M \rightarrow {\bf R}^n}$ to be ${\phi_{x_0}}$ multiplied by a suitable cutoff function supported near ${x_0}$ , we see that ${u_{x_0}}$ is an immersion in a neighbourhood of ${x_0}$ . Pairing together finitely many of the ${u_{x_0}}$ and using compactness, we obtain the claim. $\Box$

Now we upgrade the immersion ${u}$ from the above theorem to an embedding by further use of pairing. First observe that as ${M}$ is smooth and compact, an embedding is nothing more than an immersion that is injective. Let ${u: M \rightarrow {\bf R}^d}$ be an immersion. Let ${\Sigma \subset M \times M}$ be the set of pairs ${(x_1,x_2)}$ of distinct points ${x_1,x_2 \in M}$ such that ${u(x_1)=u(x_2)}$ ; note that this set is compact since ${u}$ is an immersion (and so there is no failure of injectivity when ${(x_1,x_2)}$ is near the diagonal). If ${\Sigma}$ is empty then ${u}$ is injective and we are done. If ${\Sigma}$ contains a point ${(x_1,x_2)}$ , then by pairing ${u}$ with some scalar function ${\eta: M \rightarrow {\bf R}}$ that separates ${x_1}$ and ${x_2}$ , we can replace ${u}$ by another immersion (in one higher dimension ${{\bf R}^{d+1}}$ ) such that a neighbourhood of ${x_1}$ and a neighbourhood of ${x_2}$ get mapped to disjoint sets, thus effectively removing an open neighbourhood of ${(x_1,x_2)}$ from ${\Sigma}$ . Repeating this procedure finitely many times, using the compactness of ${\Sigma}$ , we end up with an immersion which is injective, giving the Whitney embedding theorem.

At present, the embedding ${u: M \rightarrow {\bf R}^d}$ of an ${n}$ -dimensional compact manifold ${M}$ could be extremely high dimensional. However, if ${d > 2n+1}$ , then it is possible to project ${u}$ from ${{\bf R}^d}$ to ${{\bf R}^{d-1}}$ by the random projection trick (discussed in this previous post). Indeed, if one picks a random element ${\omega \in S^{d-1}}$ of the unit sphere, and then lets ${T: {\bf R}^d \rightarrow \omega^\perp}$ be the (random) orthogonal projection to the hyperplane ${\omega^\perp}$ orthogonal to ${\omega}$ , then it is geometrically obvious that ${T \circ u: M \rightarrow \omega^\perp}$ will remain an embedding unless ${\omega}$ either is of the form ${\pm \frac{u(x)-u(y)}{\|u(x)-u(y)\|}}$ for some distinct ${x,y \in M}$ , or lies in the tangent plane to ${u(M)}$ at ${u(x)}$ for some ${x \in M}$ . But the set of all such excluded ${\omega}$ is of dimension at most ${2n}$ (using, for instance, the Hausdorff notion of dimension), and so for ${d > 2n+1}$ almost every ${\omega}$ in ${S^{d-1}}$ will avoid this set. Thus one can use these projections to cut the dimension ${d}$ down by one for ${d>2n+1}$ ; iterating this observation we can end up with the final value of ${d=2n+1}$ for the Whitney embedding theorem.

Remark 4 The Whitney embedding theorem for ${d=2n}$ is more difficult to prove. Using the random projection trick, one can arrive at an immersion ${u: M \rightarrow {\bf R}^{2n}}$ which is injective except at a finite number of “double points” where ${u(M)}$ meets itself transversally (think of projecting a knot in ${{\bf R}^3}$ randomly down to ${{\bf R}^2}$ ). One then needs to “push” the double points out of existence using a device known as the “Whitney trick”.

— 2. Reduction to a local isometric embedding theorem —

We now begin the proof of the Nash embedding theorem. In this section we make a series of reductions that reduce the “global” problem of isometrically embedding a compact manifold to a “local” problem of turning a near-isometric embedding of a torus into a true isometric embedding.

We first make a convenient (though not absolutely necessary) reduction: in order to prove Theorem 2, it suffices to do so in the case when ${M}$ is a torus ${({\bf R}/{\bf Z})^n}$ (equipped with some metric ${g}$ which is not necessarily flat). Indeed, if ${M}$ is not a torus, we can use the Whitney embedding theorem to embed ${M}$ (non-isometrically) into some Euclidean space ${{\bf R}^m}$ , which by rescaling and then quotienting out by ${{\bf Z}^m}$ lets one assume without loss of generality that ${M}$ is some submanifold of a torus ${({\bf R}/{\bf Z})^m}$ equipped with some metric ${g}$ . One can then use a smooth version of the Tietze extension theorem to extend the metric ${g}$ smoothly from ${M}$ to all of ${({\bf R}/{\bf Z})^m}$ ; this extended metric ${\tilde g}$ will remain positive definite in some neighbourhood of ${M}$ , so by using a suitable (smooth) partition of unity and taking a convex combination of ${\tilde g}$ with the flat metric on ${({\bf R}/{\bf Z})^m}$ , one can find another extension ${g'}$ of ${g}$ to ${({\bf R}/{\bf Z})^m}$ that remains positive definite (and symmetric) on all of ${({\bf R}/{\bf Z})^m}$ , giving rise to a Riemannian torus ${(({\bf R}/{\bf Z})^{m}, g')}$ . Any isometric embedding of this torus into ${{\bf R}^d}$ will induce an isometric embedding of the original manifold ${M}$ , completing the reduction.

The main advantage of this reduction to the torus case is that it gives us a global system of (periodic) coordinates on ${M}$ , so that we no longer need to work with local coordinate charts. Also, one can easily use Fourier analysis on the torus to verify the ellipticity properties of the Laplacian that we will need later in the proof. These are however fairly minor conveniences, and it would not be difficult to continue the argument below without having first reduced to the torus case.

Henceforth our manifold ${M}$ is assumed to be the torus ${({\bf R}/{\bf Z})^n}$ equipped with a Riemannian metric ${g = g_{\alpha \beta}}$ , where the indices ${\alpha,\beta}$ run from ${1}$ to ${n}$ . Our task is to find an injective map ${u: ({\bf R}/{\bf Z})^n \rightarrow {\bf R}^d}$ which is isometric in the sense that it obeys the system of partial differential equations

$\displaystyle \partial_\alpha u \cdot \partial_\beta u = g_{\alpha \beta}$

for ${\alpha,\beta=1,\dots,n}$ , where ${\cdot}$ denotes the usual dot product on ${{\bf R}^d}$ . Let us write this equation as

$\displaystyle Q(u) = g$

where ${Q(u)}$ is the symmetric tensor

$\displaystyle Q(u)_{\alpha \beta} := \partial_\alpha u \cdot \partial_\beta u.$

The operator ${Q}$ is a nonlinear differential operator, but it behaves very well with respect to pairing:

$\displaystyle Q( (u_1,u_2) ) = Q(u_1) + Q(u_2). \ \ \ \ \ (2)$
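The identity (2) is just the statement that the dot product on ${{\bf R}^{d_1+d_2}}$ splits as a sum over the two blocks of coordinates. As a sanity check, here is a small Python sketch that compares both sides at one point of the 2-torus, with derivatives approximated by central differences. The sample maps `u1`, `u2` and the helper names are arbitrary illustrative choices.

```python
import math

TWO_PI = 2 * math.pi


def jacobian(u, x, h=1e-6):
    """Central-difference partial derivatives of u: R^n -> R^d at x.
    Returns a list of n vectors, the a-th approximating d_a u(x)."""
    partials = []
    for a in range(len(x)):
        x_plus, x_minus = list(x), list(x)
        x_plus[a] += h
        x_minus[a] -= h
        partials.append([(p - m) / (2 * h) for p, m in zip(u(x_plus), u(x_minus))])
    return partials


def Q(u, x):
    """The symmetric tensor Q(u)_{ab} = d_a u . d_b u at the point x."""
    J = jacobian(u, x)
    n = len(x)
    return [[sum(p * q for p, q in zip(J[a], J[b])) for b in range(n)] for a in range(n)]


def pair(u, v):
    """The paired map (u, v): concatenate the components of the two maps."""
    return lambda x: u(x) + v(x)


# Two sample periodic maps on the 2-torus (arbitrary illustrative choices).
def u1(x):
    return [math.cos(TWO_PI * x[0]), math.sin(TWO_PI * x[0])]


def u2(x):
    return [
        math.sin(TWO_PI * (x[0] + x[1])),
        math.cos(TWO_PI * x[1]),
        math.cos(TWO_PI * x[0]) * math.sin(TWO_PI * x[1]),
    ]


x = [0.3, 0.7]
paired, first, second = Q(pair(u1, u2), x), Q(u1, x), Q(u2, x)
max_gap = max(
    abs(paired[a][b] - first[a][b] - second[a][b]) for a in range(2) for b in range(2)
)
print(max_gap)  # only floating-point rounding separates the two sides
```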

We can use (2) to obtain a number of very useful reductions (at the cost of worsening the eventual value of ${d}$ , which as stated in the introduction we will not be attempting to optimise). First we claim that we can drop the injectivity requirement on ${u}$ , that is to say it suffices to show that every Riemannian metric ${g}$ on ${({\bf R}/{\bf Z})^n}$ is of the form ${g = Q(u)}$ for some map ${u: ({\bf R}/{\bf Z})^n \rightarrow {\bf R}^d}$ into some Euclidean space ${{\bf R}^d}$ . Indeed, suppose that this were the case. Let ${u_1: ({\bf R}/{\bf Z})^n \rightarrow {\bf R}^d}$ be any (not necessarily isometric) embedding (the existence of which is guaranteed by the Whitney embedding theorem; alternatively, one can use the usual exponential map ${\theta \mapsto (\cos \theta, \sin \theta)}$ to embed ${{\bf R}/{\bf Z}}$ into ${{\bf R}^2}$ ). For ${\varepsilon>0}$ small enough, the map ${\varepsilon u_1}$ is short in the sense that ${Q(\varepsilon u_1) < g}$ pointwise in the sense of symmetric tensors (or equivalently, the map ${\varepsilon u_1}$ is a contraction from ${(M,g)}$ to ${{\bf R}^d}$ ). For such an ${\varepsilon}$ , we can write ${g = Q(\varepsilon u_1) + g'}$ for some Riemannian metric ${g'}$ . If we then write ${g' = Q(u')}$ for some (not necessarily injective) map ${u': ({\bf R}/{\bf Z})^n \rightarrow {\bf R}^{d'}}$ , then from (2) we see that ${g = Q( (\varepsilon u_1, u') )}$ ; since ${(\varepsilon u_1, u')}$ inherits its injectivity from the component map ${u_1}$ , this gives the desired isometric embedding.

Call a metric ${g}$ on ${({\bf R}/{\bf Z})^n}$ good if it is of the form ${Q(u)}$ for some map ${u: ({\bf R}/{\bf Z})^n \rightarrow {\bf R}^d}$ into a Euclidean space ${{\bf R}^d}$ . Our task is now to show that every metric is good; the relation (2) tells us that the sum of any two good metrics is good.

In order to make the local theory work later, it will be convenient to introduce the following notion: a map ${u: ({\bf R}/{\bf Z})^n \rightarrow {\bf R}^d}$ is said to be free if, for every point ${x \in ({\bf R}/{\bf Z})^n}$ , the ${n}$ vectors ${\partial_\alpha u(x)}$ , ${\alpha=1,\dots,n}$ and the ${\frac{n(n+1)}{2}}$ vectors ${\partial_\alpha \partial_\beta u(x)}$ , ${1 \leq \alpha \leq \beta \leq n}$ are all linearly independent; equivalently, given a further map ${v: ({\bf R}/{\bf Z})^n \rightarrow {\bf R}^d}$ , there are no dependencies whatsoever between the ${n + \frac{n(n+1)}{2}}$ scalar functions ${\partial_\alpha u \cdot v}$ , ${\alpha=1,\dots,n}$ and ${\partial_{\alpha \beta} u \cdot v}$ , ${1 \leq \alpha \leq \beta \leq n}$ . Clearly, a free map into ${{\bf R}^d}$ is only possible for ${d \geq n + \frac{n(n+1)}{2}}$ , and this explains the bulk of the formula (1) of the best known value of ${d}$ .

For any natural number ${m}$ , the “Veronese embedding” ${\iota: {\bf R}^m \rightarrow {\bf R}^{m + \frac{m(m+1)}{2}}}$ defined by

$\displaystyle \iota(x_1,\dots,x_m) := ( (x_\alpha)_{1 \leq \alpha \leq m}, (x_\alpha x_\beta)_{1 \leq \alpha \leq \beta \leq m} )$

can easily be verified to be free. From this, one can construct a free map ${u: ({\bf R}/{\bf Z})^n \rightarrow {\bf R}^{m + \frac{m(m+1)}{2}}}$ by starting with an arbitrary immersion ${u: ({\bf R}/{\bf Z})^n \rightarrow {\bf R}^m}$ and composing it with the Veronese embedding (the fact that the composition is free will follow after several applications of the chain rule).
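The freeness of the Veronese embedding can be checked by hand, but it is also easy to confirm by machine. The sketch below (with hypothetical helper names) builds the ${m + \frac{m(m+1)}{2}}$ vectors ${\partial_\gamma \iota}$ , ${\partial_\gamma \partial_\delta \iota}$ at a rational point and computes their rank exactly using rational arithmetic.

```python
from fractions import Fraction
from itertools import combinations_with_replacement


def veronese_derivatives(x):
    """Rows: the first derivatives d_g iota (g = 0..m-1), then the second
    derivatives d_g d_h iota (g <= h), for iota(x) = (x_a ; x_a x_b, a <= b)."""
    m = len(x)
    pairs = list(combinations_with_replacement(range(m), 2))
    rows = []
    for g in range(m):
        linear = [Fraction(int(a == g)) for a in range(m)]
        # d_g (x_a x_b) = delta_{ga} x_b + delta_{gb} x_a
        quad = [int(a == g) * x[b] + int(b == g) * x[a] for a, b in pairs]
        rows.append(linear + [Fraction(q) for q in quad])
    for g, h in pairs:
        linear = [Fraction(0)] * m
        # d_g d_h (x_a x_b) = delta_{ga} delta_{hb} + delta_{gb} delta_{ha}
        quad = [Fraction(int(a == g and b == h) + int(a == h and b == g)) for a, b in pairs]
        rows.append(linear + quad)
    return rows


def rank(rows):
    """Rank of a matrix of Fractions, by exact Gaussian elimination."""
    rows = [list(r) for r in rows]
    rk = 0
    for c in range(len(rows[0])):
        pivot = next((i for i in range(rk, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue
        rows[rk], rows[pivot] = rows[pivot], rows[rk]
        for i in range(len(rows)):
            if i != rk and rows[i][c] != 0:
                factor = rows[i][c] / rows[rk][c]
                rows[i] = [a - factor * b for a, b in zip(rows[i], rows[rk])]
        rk += 1
        if rk == len(rows):
            break
    return rk


point = [Fraction(1, 2), Fraction(-3), Fraction(7, 5)]  # arbitrary point in R^3
rows = veronese_derivatives(point)
print(len(rows), rank(rows))  # full rank means the vectors are linearly independent
```

The matrix is block upper triangular, with an identity block and a diagonal block with entries ${1}$ or ${2}$ , which is why the rank is full at every point.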

Given a Riemannian metric ${g}$ , one can find a free map ${u}$ which is short in the sense that ${Q(u) < g}$ , by taking an arbitrary free map and scaling it down by some small scaling factor ${\varepsilon>0}$ . This gives us a decomposition

$\displaystyle g = Q(u) + g'$

for some Riemannian metric ${g'}$ .

The metric ${Q(u)}$ is clearly good, so by (2) it would suffice to show that ${g'}$ is good. What is easy to show is that ${g'}$ is approximately good:

Proposition 5 Let ${g'}$ be a Riemannian metric on ${({\bf R}/{\bf Z})^n}$ . Then there exists a smooth symmetric tensor ${h}$ on ${({\bf R}/{\bf Z})^n}$ with the property that ${g' + \varepsilon^2 h}$ is good for every ${\varepsilon>0}$ .

Proof: Roughly speaking, the idea here is to use “tightly wound spirals” to capture various “rank one” components of the metric ${g'}$ , the point being that if a map ${u: ({\bf R}/{\bf Z})^n \rightarrow {\bf R}^d}$ “oscillates” at some high frequency ${\xi \in {\bf R}^n}$ with some “amplitude” ${A}$ , then ${Q(u)}$ is approximately equal to the rank one tensor ${A^2 \xi_\alpha \xi_\beta}$ . The argument here is related to the technique of convex integration, which among other things leads to one way to establish the ${h}$ -principle of Gromov.

By the spectral theorem, every positive definite tensor ${g_{\alpha \beta}}$ can be written as a positive linear combination of symmetric rank one tensors ${v_\alpha v_\beta}$ for some vectors ${v \in {\bf R}^n}$ . By adding some additional rank one tensors if necessary, one can make this decomposition stable, in the sense that any nearby tensor ${g_{\alpha \beta}}$ is also a positive linear combination of the ${v_\alpha v_\beta}$ . One can think of ${v_\alpha}$ as the gradient ${\partial_\alpha \psi}$ of some linear function ${\psi: {\bf R}^n \rightarrow {\bf R}}$ . Using compactness and a smooth partition of unity, one can then arrive at a decomposition

$\displaystyle g'_{\alpha \beta} = \sum_{i=1}^m (\eta^{(i)})^2 \partial_\alpha \psi^{(i)} \partial_\beta \psi^{(i)}$

for some finite ${m}$ , some smooth scalar functions ${\eta^{(i)}, \psi^{(i)}: ({\bf R}/{\bf Z})^n \rightarrow {\bf R}}$ (one can take ${\psi^{(i)}}$ to be linear functions on small coordinate charts, and ${\eta^{(i)}}$ to basically be cutoffs to these charts).
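To illustrate the pointwise step of this decomposition, here is a minimal Python sketch for ${n=2}$ that writes a single positive definite matrix as a positive combination of rank one tensors via its eigenvectors. It only covers the spectral decomposition at one point (and assumes a nonzero off-diagonal entry to keep the eigenvector formula simple); it does not attempt the stable version needed on an open set. The function names are hypothetical.

```python
import math


def rank_one_decomposition(a, b, c):
    """Write [[a, b], [b, c]] (assumed positive definite with b != 0) as
    sum_i lam_i * v_i v_i^T, with unit eigenvectors v_i and eigenvalues lam_i."""
    mean = (a + c) / 2
    radius = math.hypot((a - c) / 2, b)
    terms = []
    for lam in (mean + radius, mean - radius):
        vx, vy = b, lam - a  # solves (a - lam) vx + b vy = 0
        norm = math.hypot(vx, vy)
        terms.append((lam, (vx / norm, vy / norm)))
    return terms


def reassemble(terms):
    """Sum the rank one tensors lam * v_alpha v_beta back into a 2 x 2 matrix."""
    g = [[0.0, 0.0], [0.0, 0.0]]
    for lam, v in terms:
        for i in range(2):
            for j in range(2):
                g[i][j] += lam * v[i] * v[j]
    return g


terms = rank_one_decomposition(2.0, 1.0, 3.0)
print(terms)
print(reassemble(terms))  # should reproduce [[2, 1], [1, 3]] up to rounding
```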

For any ${\varepsilon>0}$ and ${i=1,\dots,m}$ , consider the “spiral” map ${u_\varepsilon^{(i)}: ({\bf R}/{\bf Z})^n \rightarrow {\bf R}^2}$ defined by

$\displaystyle u_\varepsilon^{(i)} := \varepsilon \eta^{(i)} ( \cos( \frac{\psi^{(i)}}{\varepsilon} ), \sin( \frac{\psi^{(i)}}{\varepsilon} ) )$

Direct computation shows that

$\displaystyle Q(u_\varepsilon^{(i)})_{\alpha \beta} = (\eta^{(i)})^2 \partial_\alpha \psi^{(i)} \partial_\beta \psi^{(i)} + \varepsilon^2 \eta^{(i)}_\alpha \eta^{(i)}_\beta$

and the claim follows by summing in ${i}$ (using (2)) and taking ${h := \sum_{i=1}^m \eta^{(i)}_\alpha \eta^{(i)}_\beta}$ . $\Box$
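The spiral computation in the above proof can be tested numerically in the simplest case ${n=1}$ . The sketch below uses an arbitrary amplitude ${\eta}$ and linear phase ${\psi}$ (both illustrative assumptions), differentiates the spiral by central differences, and compares ${|u_\varepsilon'|^2}$ with ${\eta^2 (\psi')^2 + \varepsilon^2 (\eta')^2}$ .

```python
import math


def eta(x):
    return 2.0 + math.sin(2 * math.pi * x)  # sample amplitude (illustrative)


def deta(x):
    return 2 * math.pi * math.cos(2 * math.pi * x)


def psi(x):
    return 3.0 * x  # sample linear phase with slope 3 (illustrative)


def dpsi(x):
    return 3.0


def spiral(x, eps):
    """The spiral map u_eps = eps * eta * (cos(psi/eps), sin(psi/eps))."""
    phase = psi(x) / eps
    return (eps * eta(x) * math.cos(phase), eps * eta(x) * math.sin(phase))


def Q_numeric(x, eps, h=1e-6):
    """Q(u_eps) = |u_eps'|^2 for n = 1, via central differences."""
    up, um = spiral(x + h, eps), spiral(x - h, eps)
    return sum(((p - m) / (2 * h)) ** 2 for p, m in zip(up, um))


def Q_formula(x, eps):
    """The claimed closed form eta^2 (psi')^2 + eps^2 (eta')^2."""
    return (eta(x) * dpsi(x)) ** 2 + (eps * deta(x)) ** 2


x, eps = 0.3, 0.05
print(Q_numeric(x, eps), Q_formula(x, eps))  # the two values should nearly agree
```

Note that the cross terms between ${(\cos, \sin)}$ and ${(-\sin, \cos)}$ vanish exactly, which is why the error term is ${O(\varepsilon^2)}$ rather than ${O(\varepsilon)}$ .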

The claim then reduces to the following local (perturbative) statement, that shows that the property of being good is stable around a free map:

Theorem 6 (Local embedding) Let ${u: ({\bf R}/{\bf Z})^n \rightarrow {\bf R}^d}$ be a free map. Then ${Q(u)+h}$ is good for all symmetric tensors ${h}$ sufficiently close to zero in the ${C^\infty}$ topology.

Indeed, assuming Theorem 6, and with ${h}$ as in Proposition 5, we have ${Q(u) - \varepsilon^2 h}$ good for ${\varepsilon}$ small enough. By (2) and Proposition 5, we then have ${g = (Q(u) - \varepsilon^2 h) + (g' + \varepsilon^2 h)}$ good, as required.

The remaining task is to prove Theorem 6. This is a problem in perturbative PDE, to which we now turn.

— 3. Proof of local embedding —

We are given a free map ${u: ({\bf R}/{\bf Z})^n \rightarrow {\bf R}^d}$ and a small tensor ${h}$ . It will suffice to find a perturbation ${u+v: ({\bf R}/{\bf Z})^n \rightarrow {\bf R}^d}$ of ${u}$ that solves the PDE

$\displaystyle Q(u+v) = Q(u) + h.$

We can expand the left-hand side and cancel off ${Q(u)}$ to write this as

$\displaystyle L(v)= h - Q(v) \ \ \ \ \ (3)$

where the symmetric tensor-valued first-order linear operator ${L}$ is defined (in terms of the fixed free map ${u}$ ) as

$\displaystyle L(v)_{\alpha \beta} := \partial_\alpha u \cdot \partial_\beta v + \partial_\beta u \cdot \partial_\alpha v.$

To exploit the free nature of ${u}$ , we would like to write the operator ${L}$ in terms of the inner products ${\partial_\alpha u \cdot v}$ and ${\partial_{\alpha \beta} u \cdot v}$ . After some rearranging using the product rule, we arrive at the representation

$\displaystyle L(v)_{\alpha \beta} = \partial_\beta ( \partial_\alpha u \cdot v) + \partial_\alpha( \partial_\beta u \cdot v ) - 2 \partial_{\alpha \beta} u \cdot v.$

Among other things, this allows for a way to right-invert the underdetermined linear operator ${L}$ . As ${u}$ is free, we can use Cramer’s rule to find smooth maps ${w_{\alpha \beta}: ({\bf R}/{\bf Z})^n \rightarrow {\bf R}^d}$ for ${\alpha,\beta=1,\dots,n}$ (with ${w_{\alpha \beta} = w_{\beta \alpha}}$ ) that are dual to ${\partial_{\alpha \beta} u}$ in the sense that

$\displaystyle \partial_\alpha u \cdot w_{\alpha' \beta'} = 0$

$\displaystyle \partial_{\alpha \beta} u \cdot w_{\alpha' \beta'} = \delta_{\alpha \alpha'} \delta_{\beta \beta'} + \delta_{\alpha \beta'} \delta_{\beta \alpha'}$

where ${\delta}$ denotes the Kronecker delta. If one then defines the linear zeroth-order operator ${M}$ from symmetric tensors ${f = f_{\alpha \beta}}$ to maps ${Mf: ({\bf R}/{\bf Z})^n \rightarrow {\bf R}^d}$ by the formula

$\displaystyle Mf := -\frac{1}{4} \sum_{1 \leq \alpha,\beta \leq n} f_{\alpha \beta} w_{\alpha \beta}$

then direct computation shows that ${LMf = f}$ for any sufficiently regular ${f}$ . As a consequence of this, one could try to use the ansatz ${v = Mf}$ and transform the equation (3) to the fixed point equation

$\displaystyle f = h - Q(Mf). \ \ \ \ \ (4)$

One can hope to solve this equation by standard perturbative techniques, such as the inverse function theorem or the contraction mapping theorem, hopefully exploiting the smallness of ${h}$ to obtain the required contraction. Unfortunately we run into a fundamental loss of derivatives problem, in that the quadratic differential operator ${Q}$ loses a degree of regularity, and this loss is not recovered by the operator ${M}$ (which has no smoothing properties).
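The right-inverse identity ${LMf = f}$ can be seen concretely in the simplest free example. The following Python sketch takes ${n=1}$ , ${d=2}$ and the unit circle ${u(x) = (\cos 2\pi x, \sin 2\pi x)}$ (the ${2\pi}$ normalisation and all function names are assumptions made for this illustration). Here ${u'}$ and ${u''}$ are orthogonal, so the dual vector is ${w_{11} = 2 u''/|u''|^2}$ , and ${L(v)_{11} = 2 u' \cdot v'}$ .

```python
import math

TWO_PI = 2 * math.pi


def du(x):
    """First derivative of the circle map u(x) = (cos 2 pi x, sin 2 pi x)."""
    return (-TWO_PI * math.sin(TWO_PI * x), TWO_PI * math.cos(TWO_PI * x))


def w11(x):
    """Dual vector: u' . w11 = 0 and u'' . w11 = 2, i.e. w11 = 2 u'' / |u''|^2."""
    scale = -1.0 / (2 * math.pi ** 2)
    return (scale * math.cos(TWO_PI * x), scale * math.sin(TWO_PI * x))


def M(f):
    """Zeroth-order operator Mf = -(1/4) f w11 (only one index pair when n = 1)."""
    return lambda x: tuple(-0.25 * f(x) * comp for comp in w11(x))


def L(v, x, h=1e-5):
    """L(v)_{11} = 2 u' . v', with v' approximated by central differences."""
    dv = [(p - m) / (2 * h) for p, m in zip(v(x + h), v(x - h))]
    return 2 * sum(a * b for a, b in zip(du(x), dv))


def f(x):
    return 1.0 + 0.5 * math.cos(TWO_PI * x)  # sample right-hand side (illustrative)


x = 0.2
print(L(M(f), x), f(x))  # the two values should nearly agree
```

Observe that ${Mf}$ involves no derivatives of ${f}$ and no smoothing, which is exactly why (4) loses a derivative at each iteration.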

We know of two ways around this difficulty. The original argument of Nash used what is now known as the Nash-Moser iteration scheme to overcome the loss of derivatives by replacing the simple iterative scheme used in the contraction mapping theorem with a much more rapidly convergent scheme that generalises Newton’s method; see this previous blog post for a similar idea. The other way out, due to Gunther, is to observe that ${Q(v)}$ can be factored as

$\displaystyle Q(v) = L Q_0(v) \ \ \ \ \ (5)$

where ${Q_0}$ is a zeroth order quadratic operator, so that (3) can be written instead as

$\displaystyle L( v + Q_0(v) ) = h,$

and using the right-inverse ${M}$ , it now suffices to solve the equation

$\displaystyle v = Mh - Q_0(v) \ \ \ \ \ (6)$

(compare with (4)), which can be done perturbatively if ${Q_0}$ is indeed zeroth order (e.g. if it is bounded on Hölder spaces such as ${C^{2,\alpha}}$ ).
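As a toy model of how (6) is solved, one can replace the Banach space by the real line and ${Q_0}$ by the scalar quadratic ${v \mapsto v^2}$ . This is only a caricature (the actual argument takes place in a Hölder space, and the names below are hypothetical), but it shows the mechanism: for small data the map ${v \mapsto c - Q_0(v)}$ is a contraction near the origin and the iteration converges.

```python
import math


def solve_fixed_point(c, quadratic, iterations=50):
    """Iterate v -> c - quadratic(v) starting from v = 0."""
    v = 0.0
    for _ in range(iterations):
        v = c - quadratic(v)
    return v


def square(v):
    return v * v  # scalar stand-in for the quadratic operator Q_0


c = 0.1  # plays the role of the small datum Mh
v = solve_fixed_point(c, square)
exact = (-1 + math.sqrt(1 + 4 * c)) / 2  # positive root of v^2 + v - c = 0
print(v, exact)  # the iterate should match the exact root closely
```

Near the fixed point the derivative of ${v \mapsto c - v^2}$ has size about ${2|v| \approx 0.18}$ , so each iteration reduces the error by roughly that factor.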

It remains to achieve the desired factoring (5). We can bilinearise ${Q(v)}$ as ${\frac{1}{2} B(v,v)}$ , where

$\displaystyle B(v,w)_{\alpha \beta} := \partial_\alpha v \cdot \partial_\beta w + \partial_\beta v \cdot \partial_\alpha w.$

The basic point is that when ${v}$ is much higher frequency than ${w}$ , then

$\displaystyle B(v,w)_{\alpha \beta} \approx \partial_\alpha ( v \cdot \partial_\beta w ) + \partial_\beta ( v \cdot \partial_\alpha w ) \ \ \ \ \ (7)$

which can be approximated by ${L}$ applied to some quantity relating to the vector field ${v \cdot \partial_\beta w}$ ; similarly if ${w}$ is much higher frequency than ${v}$ . One can formalise these notions of “much higher frequency” using the machinery of paraproducts, but one can proceed in a slightly more elementary fashion by using the Laplacian operator ${\Delta = \sum_{\alpha=1}^n \partial_{\alpha \alpha}}$ and its (modified) inverse operator ${(1-\Delta)^{-1}}$ (which is easily defined on the torus using the Fourier transform, and has good smoothing properties) as a substitute for the paraproduct calculus. We begin by writing

$\displaystyle Q(v) = \frac{1}{2} B(v,v) = \frac{1}{2} (1 - \Delta)^{-1} ( B(v,v) - \Delta B(v,v) ).$

The dangerous term here is ${\Delta B(v,v)}$ . Using the product rule and symmetry, we can write

$\displaystyle \Delta B(v,v) = 2 B(\Delta v, v) + 2 \sum_{\gamma=1}^n B(\partial_\gamma v, \partial_\gamma v).$

The second term will be “lower order” in that it only involves second derivatives of ${v}$ , rather than third derivatives. As for the higher order term ${B(\Delta v, v)}$ , the main contribution will come from the terms where ${\Delta v}$ is higher frequency than ${v}$ (since the Laplacian accentuates high frequencies and dampens low frequencies, as can be seen by inspecting the Fourier symbol of the Laplacian). As such, we can profitably use the approximation (7) here. Indeed, from the product rule we have

$\displaystyle B(\Delta v, v)_{\alpha \beta} = \partial_\alpha ( \Delta v \cdot \partial_\beta v) + \partial_\beta (\Delta v \cdot \partial_\alpha v) - 2 \Delta v \cdot \partial_{\alpha \beta} v.$

Putting all this together, we obtain the decomposition

$\displaystyle Q(v)_{\alpha \beta} = \partial_\alpha Q_\beta(v) + \partial_\beta Q_\alpha(v) + Q'_{\alpha \beta}(v)$

where

$\displaystyle Q_\alpha(v) := - (1-\Delta)^{-1} ( \Delta v \cdot \partial_\alpha v )$

and

$\displaystyle Q'_{\alpha \beta}(v) := (1-\Delta)^{-1} ( \frac{1}{2} B(v,v)_{\alpha \beta} - \sum_{\gamma=1}^n B(\partial_\gamma v, \partial_\gamma v)_{\alpha \beta} + 2 \Delta v \cdot \partial_{\alpha \beta} v ).$

If we then use Cramer’s rule to create smooth functions ${w_\alpha}$ dual to the ${\partial_\alpha u}$ in the sense that

$\displaystyle \partial_\alpha u \cdot w_{\alpha'} = \delta_{\alpha \alpha'}$

$\displaystyle \partial_{\alpha \beta} u \cdot w_{\alpha'} = 0$

then we obtain the desired factorisation (5) with

$\displaystyle Q_0(v) := \sum_{\alpha=1}^n Q_\alpha(v) w_\alpha - \frac{1}{4} \sum_{1 \leq \alpha,\beta \leq n} Q'_{\alpha \beta}(v) w_{\alpha \beta}.$

Note that ${Q_0(v)}$ is the smoothing operator ${(1-\Delta)^{-1}}$ applied to quadratic expressions of up to two derivatives of ${v}$ . As such, one can use elliptic (Schauder) estimates to show that ${Q_0}$ is Lipschitz continuous in the Hölder spaces ${C^{2,\alpha}(({\bf R}/{\bf Z})^n)}$ for ${0 < \alpha < 1}$ (with the Lipschitz constant being small when ${v}$ has small norm); this together with the contraction mapping theorem in the Banach space ${C^{2,\alpha}}$ is already enough to solve the equation (6) in this space if ${h}$ is small enough. This is not quite enough because we also need ${v}$ to be smooth; but it is possible (using Schauder estimates and product Hölder estimates) to establish bounds of the form

$\displaystyle \| Q_0(v) \|_{C^{k,\alpha}} \lesssim \| v \|_{C^{k,\alpha}} \|v\|_{C^{2,\alpha}} + O_k( \|v\|_{C^{k-1,\alpha}}^2 )$

for any ${k \geq 2}$ (with implied constants depending on ${\alpha,u}$ but independent of ${k}$ ), which can be used (for ${h}$ small enough) to show that the solution ${v}$ constructed by the contraction mapping principle lies in ${C^{k,\alpha}}$ for any ${k \geq 2}$ (by showing that the iterates used in the construction remain bounded in these norms), and is thus smooth.
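The smoothing operator ${(1-\Delta)^{-1}}$ used throughout this section is a Fourier multiplier on the torus. As a minimal sketch in one dimension, with a function on ${{\bf R}/{\bf Z}}$ represented by a dictionary of Fourier coefficients of ${e^{2\pi i k x}}$ (a representation chosen purely for illustration), the operator divides the ${k}$-th coefficient by ${1 + 4\pi^2 k^2}$ .

```python
import math


def symbol(k):
    """Fourier symbol of 1 - Laplacian on R/Z at frequency k."""
    return 1 + 4 * math.pi ** 2 * k ** 2


def apply_inverse(coeffs):
    """Apply (1 - Laplacian)^{-1} to a trigonometric polynomial.
    coeffs maps an integer frequency k to the coefficient of exp(2 pi i k x)."""
    return {k: c / symbol(k) for k, c in coeffs.items()}


def apply_one_minus_laplacian(coeffs):
    """Apply 1 - Laplacian, i.e. multiply each coefficient by the symbol."""
    return {k: c * symbol(k) for k, c in coeffs.items()}


coeffs = {0: 1.0, 1: 1.0, 10: 1.0}  # equal weight at three frequencies
smoothed = apply_inverse(coeffs)
recovered = apply_one_minus_laplacian(smoothed)
print(smoothed)   # high frequencies are damped by roughly 1 / (4 pi^2 k^2)
print(recovered)  # should reproduce coeffs up to rounding
```

Since ${k^2/(1+4\pi^2 k^2)}$ is bounded, two derivatives of the output are controlled by the input, which is the gain of regularity that the Schauder estimates quantify in Hölder spaces.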

29 comments

12 May, 2016 at 12:43 am

Bo Jacoby

Does theorem 2 imply that space-time around a star can be imbedded in some higher dimensional euclidean space?

12 May, 2016 at 8:29 am

Terence Tao

Spacetime is pseudo-Riemannian rather than Riemannian, and so cannot be embedded in any Riemannian space. I guess the natural conjecture would be that a $d+1$ -dimensional spacetime may be locally isometrically embedded into some high dimensional Minkowski spacetime, but I don’t know if there are such results in the literature. (Global embedding is unrealistic because the spacetime may contain causal loops, which Minkowski spacetime of course doesn’t have.) There are at least two obstacles to extending the arguments in this post to the pseudo-Riemannian setting: firstly, the Cartesian product of two Minkowski spacetimes is not a Minkowski spacetime, making the pairing operation much less useful; and secondly, the d’Alembertian (the natural analogue of the Laplacian in this setting) is no longer elliptic. But the latter obstacle at least might still be manageable using the original Nash-Moser method, which can tolerate some loss of derivatives in the nonlinearity.

12 May, 2016 at 9:59 am

Harshvardhan Tandon

Sir, I am a high school student and a big fan of yours. I was a bit curious regarding your thoughts about the hotly debated ABC conjecture (about which I believe you must have given some thought). So what do you think about the enormous proof posed by Shinichi Mochizuki? Is it anywhere close to being correct or is it just a pointless and confusing pursuit which will not lead to anything important?

12 May, 2016 at 1:38 pm

Terence Tao

This recent post by Brian Conrad is probably the best summary of the current state of affairs: https://mathbabe.org/2015/12/15/notes-on-the-oxford-iut-workshop-by-brian-conrad/

26 August, 2016 at 12:35 am

Olaf Müller

Fortunately, your conjecture has a positive answer that Miguel Sánchez and I found some years ago:

https://arxiv.org/abs/0812.4439

As you say, noncausality is an obstruction to the existence of an isometric embedding into a Minkowski space-time. An equivalent characterization for the existence is the existence of a steep temporal function, i.e. a function whose gradient v satisfies g(v,v) <-1, and such a function exists in every globally hyperbolic space-time. (If we asked for a conformal embedding instead of an isometric one, then the embeddable space-times would be exactly the stably causal ones.)
Now, if we focus on the local question, then we can use that in every space-time, every point has a globally hyperbolic neighborhood, thus, indeed, the local question is unobstructed.

Best,
-olaf.

18 August, 2020 at 10:26 am

rogerwest

What is meant by “causal” here? And why does it surface here, while the main article doesn’t allude to it?

20 August, 2020 at 2:19 pm

Terence Tao

Lorentzian spacetimes have causal structure, but Riemannian manifolds (the focus of the main article) do not.

13 May, 2016 at 12:44 pm

lewallen

Thanks a lot for the notes, great reading. I was a bit confused early on until I realized that by “pairing” two functions you meant the cartesian product (is that right?) — I’d never seen that particular terminology. What do you think about making that more explicit (e.g. in the proof of the Whitney immersion theorem)?

13 May, 2016 at 12:48 pm

lewallen

hah I just noticed that you explicitly defined it in the introduction, in fact it’s a central technique as you point out ;). Ok, sorry about that!

14 May, 2016 at 2:15 am

Paul Bryan

The paragraph on upgrading the weak Whitney theorem to an embedding is a little unclear. The immersion $u$ is locally an embedding, so is injective in a neighbourhood of any point. The statement, “if $x_1 = x_2$ , then $u$ is injective on a neighbourhood of $x_1$ …” is thus a little odd. It may also be helpful to remark that, although an injective immersion is not in general an embedding, for $M$ compact, it is since any continuous bijection from a compact space to a Hausdorff space is a homeomorphism.

[Text modified, thanks – T.]

14 May, 2016 at 8:08 pm

Anonymous

nice

17 May, 2016 at 11:32 am

Luiz Botelho -Físico Matemático de Altas energias e Turbulencia estocástica.

Terence
Beautiful argument. I think it can work for sigma-compact Riemannian manifolds. Related to the general relativity problem, it is usual to postulate that you can complexify your pseudo-Riemannian manifold to a complex manifold with a complex bilinear form as the Riemannian metric (if this makes sense at all), and thus apply the Nash theorem to the Euclidean section (now a truly Riemannian manifold!). Any chance to extend this result by analytic continuation to the whole, now complex, manifold? Of course some deep topological restrictions must be imposed, like global orientability of the manifold (by the way, if you assume global orientability, can this simplify the theorem?), being a spin manifold, etc.

18 September, 2016 at 7:21 pm

There is a very small mistake in a symbol:
Call a metric {g} on {({\bf R}/{\bf Z})^d} good if it is of the form {Q(u)} for some map {u: ({\bf R}/{\bf Z})^n \rightarrow {\bf R}^d} into a Euclidean space {{\bf R}^d}. Our task is now to show that every metric is good; the relation (2) tells us that the sum of any two good metrics is good.
{({\bf R}/{\bf Z})^d} should be {({\bf R}/{\bf Z})^n}

[Corrected, thanks – T.]

15 February, 2019 at 1:36 am

maria

Professor Tao, could you maybe explain the decomposition of g_{\alpha,\beta} in the proof of proposition 5? As far as I understand, one can write a symmetric positive definite matrix as a product V^T\times V where the rows of V are the eigenvectors of the matrix multiplied by the square root of the positive eigenvalues. Hence, if \psi(x)=V\times x, then g_{\alpha,\beta}is given by the inner product of \partial_\alpha \psi and \partial_\beta \psi.

15 February, 2019 at 2:27 pm

Terence Tao

This gives the required decomposition at a single point $x$ . To make a decomposition that applies for an open set of $x$ one needs a more stable decomposition than the spectral decomposition; see Lemma 16 of my more recent notes https://terrytao.wordpress.com/2019/01/08/255b-notes-2-onsagers-conjecture/ for details.
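
The pointwise decomposition discussed in this exchange can be checked numerically. The following is only a minimal sketch, assuming NumPy is available; the helper name `spectral_factor` and the random test matrix are hypothetical and not taken from the post. It illustrates the single-point spectral factorisation $g = V^T V$ only, not the more stable decomposition referred to in the reply.

```python
import numpy as np

def spectral_factor(g):
    """Return V with g = V.T @ V for a symmetric positive definite matrix g.

    Hypothetical helper: the rows of V are the orthonormal eigenvectors of g,
    each scaled by the square root of its eigenvalue.
    """
    eigvals, eigvecs = np.linalg.eigh(g)       # columns of eigvecs are eigenvectors
    return np.sqrt(eigvals)[:, None] * eigvecs.T

# A random symmetric positive definite matrix standing in for g at one point.
rng = np.random.default_rng(0)
a = rng.standard_normal((3, 3))
g = a.T @ a + np.eye(3)

V = spectral_factor(g)

# For the linear map psi(x) = V x, the partial derivative in direction alpha
# is column alpha of V, so the inner products of these columns recover g.
gram = np.array([[V[:, i] @ V[:, j] for j in range(3)] for i in range(3)])
print(np.allclose(gram, g))
```

The eigenvectors returned by `eigh` need not vary smoothly with $g$ when eigenvalues collide, which is one way to see why a factorisation of this kind is only used here at a single point.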

15 February, 2020 at 8:36 pm

Brian

I notice that both arguments of the max operator are second degree polynomials. A gross but true overestimate then for the d in Nash’s embedding theorem would be, for large enough n, n^3.

In fact, it seems 3 is the minimum of k such that n^k (the monomial) is a bound for d for all n after a certain point.

Why is this? I guess another way to ask is why is the bound Gunther found second degree? Why not first, third, or more than third?

16 February, 2020 at 10:06 am

Terence Tao

Well, the equation $Q(u) = g$ is $\frac{n(n+1)}{2}$ distinct equations in $d$ unknowns, so one should certainly expect smooth embedding to only be possible for $d \geq \frac{n(n+1)}{2}$ (this can be made rigorous by computing the dimension of the image of the Taylor coefficients at the origin of $Q(u)$ up to some large order as $u$ varies over smooth maps, and comparing this with the dimension of the space of Taylor coefficients of $g$ ; I give an argument of this form for instance in Section 3 of https://arxiv.org/abs/1902.06313 ).

The main constraint in the known arguments for Nash embedding is that they require one to begin with a smooth (but not isometric) embedding $u$ which is free in the sense that the first derivatives $\partial_\alpha u$ and the second derivatives $\partial_{\alpha \beta} u$ are linearly independent (after taking into account the obvious symmetry $\partial_{\alpha \beta} u = \partial_{\beta \alpha} u$ ). This forces $d \geq \frac{n(n+1)}{2}+n$ at a bare minimum. It remains an open question what the precise optimal dependence of $d$ on $n$ is.
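
The two counts in this reply are easy to tabulate. The sketch below is purely illustrative; the function names are hypothetical. It lists, for small $n$, the number $\frac{n(n+1)}{2}$ of independent equations in $Q(u) = g$ and the minimal target dimension $\frac{n(n+1)}{2} + n$ forced by the freeness condition.

```python
def metric_equation_count(n):
    """Number of independent components g_{alpha beta} with alpha <= beta."""
    return n * (n + 1) // 2

def free_map_min_dimension(n):
    """Lower bound on d for a free map: the n first derivatives and the
    n(n+1)/2 distinct second derivatives must be linearly independent in R^d."""
    return metric_equation_count(n) + n

for n in range(1, 6):
    print(n, metric_equation_count(n), free_map_min_dimension(n))
```

Both counts are quadratic in $n$, which is consistent with the quadratic growth of the known bounds asked about in the preceding comment.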

6 March, 2020 at 7:34 pm

Anonymous

After (5), it states “$L(v- Q_{0}(v)) = h, ... v = Mh + Q_{0}(v)$”. Isn’t the sign off? In other words, should it not be “$L(v + Q_{0}(v)) = h ... v = Mh - Q_{0}(v)$”? Also, later there is a definition of “$Q_{\alpha}(v) : = -(1- \Delta)^{-1} (\Delta v \cdot \partial_{\beta} v)$”. Surely this must be $Q_{\beta}(v)$.

[Corrected, thanks – T.]

28 March, 2020 at 5:06 pm

Martin Ondrejat

Concerning the smoothness of the solution (the final step), the proof seems to use uniform boundedness of the operator norms of $(1-\Delta)^{-1}$ from $C^{k,\alpha}$ to $C^{k+2,\alpha}$ over all positive $k$. Is this really true? If one realizes that the number of derivatives of $k$-th order does not increase linearly in $k$, this would be surprising …

29 March, 2020 at 7:48 am

Terence Tao

The operator $(1-\Delta)^{-1}$ commutes with all constant coefficient differential operators, and in particular with $\nabla^k$ . Because of this, once one knows that $(1-\Delta)^{-1}$ maps $C^{0,\alpha}$ to $C^{2,\alpha}$ , it is immediate that it also maps $C^{k,\alpha}$ to $C^{k+2,\alpha}$ . (To put it another way: the operator $(1-\Delta)^{-1}$ is a Fourier multiplier and thus does not change the frequency of the function it is applied to, only the amplitude. Varying the regularity exponent in Hölder or Sobolev norms such as $C^{k,\alpha}$ corresponds to multiplying the norm by some power of the frequency (at least for functions that are localised to a single frequency annulus). If the frequency doesn’t change, then this makes the $C^{k,\alpha}$ norm effectively a scalar multiple of the norm $C^{0,\alpha}$ for the function, and $C^{k+2,\alpha}$ the same scalar multiple of $C^{2,\alpha}$ of the output, so the effect of increasing $k$ on both sides cancels out.)
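
The commutation property in this reply can be illustrated on the one-dimensional torus ${\bf R}/{\bf Z}$ with a discrete Fourier transform. This is only a sketch under stated assumptions: NumPy is assumed, the grid size `N`, the test function `f`, and the helper names are hypothetical, and the operators are discretised as Fourier multipliers with symbols $1/(1+4\pi^2 k^2)$ and $2\pi i k$.

```python
import numpy as np

N = 64                                   # number of grid points on R/Z
x = np.arange(N) / N
k = np.fft.fftfreq(N, d=1.0 / N)         # integer frequencies 0, 1, ..., -1
f = np.exp(np.sin(2 * np.pi * x))        # a smooth periodic test function

def resolvent(h):
    """Apply (1 - Laplacian)^{-1} as the multiplier 1 / (1 + 4 pi^2 k^2)."""
    return np.fft.ifft(np.fft.fft(h) / (1 + (2 * np.pi * k) ** 2))

def derivative(h):
    """Apply d/dx as the multiplier 2 pi i k."""
    return np.fft.ifft(2j * np.pi * k * np.fft.fft(h))

# Both operators act diagonally on each frequency, so they commute.
lhs = resolvent(derivative(f))
rhs = derivative(resolvent(f))
print(np.allclose(lhs, rhs))

# Sanity check: applying (1 - d^2/dx^2) to the resolvent output recovers f.
u = resolvent(f)
recovered = u - derivative(derivative(u))
print(np.allclose(recovered, f))
```

Because each multiplier only rescales the amplitude at a given frequency, the order of application does not matter, which is the mechanism described in the parenthetical remark.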

29 March, 2020 at 11:04 am

Martin Ondrejat

I can see that what you write works in the scale of the Sobolev spaces $W^{k,2}$ (because there, everything can be transferred to the language of coordinates with respect to the orthonormal basis). And it is actually sufficient for the proof. One does not even need the (difficult) Schauder estimates for $C^{k,\alpha}$, which can be replaced by (easy) estimates in the Sobolev spaces $W^{k,2}$ (which are also an algebra for large $k$). And then the proof is really simple. Great!

30 August, 2020 at 12:27 pm

James Green

Hello,

I went to read the proof of the Nash embedding theorem after I read your notes above. When you mention the “loss of derivatives problem”, is this overcome in Nash's original proof by the mollification (via convolution) used in his paper?

30 August, 2020 at 1:13 pm

Anonymous

Yes.

3 January, 2021 at 11:50 am

Anonymous

Can you comment on paraproduct machinery?

8 January, 2021 at 8:43 pm

Anonymous

In the formula after “If we then use Cramer’s rule to create smooth functions”, should it be delta_{…} instead of 0?

[No. One could augment the $w_{\alpha \beta}$ to a more complete dual basis by also locating vectors $w_{\alpha}$ that are dual to the $\partial_\alpha u$ , but these are not necessary for this argument. -T]
