We consider the Euler equations for incompressible fluid flow on a Euclidean space ${\bf R}^d$; we will label ${\bf R}^d$ as the “Eulerian space” ${\bf R}^d_E$ (or “Euclidean space”, or “physical space”) to distinguish it from the “Lagrangian space” ${\bf R}^d_L$ (or “labels space”) that we will introduce shortly (but the reader is free to also ignore the ${}_E$ or ${}_L$ subscripts if he or she wishes). Elements of Eulerian space ${\bf R}^d_E$ will be referred to by symbols such as $x$; we use $dx$ to denote Lebesgue measure on ${\bf R}^d_E$, we use $x^1,\dots,x^d$ for the $d$ coordinates of $x$, and we use indices such as $i,j,k$ to index these coordinates (with the usual summation conventions); for instance, $\partial_i$ denotes partial differentiation along the $x^i$ coordinate. (We use superscripts for coordinates $x^i$ instead of subscripts $x_i$ to be compatible with some differential geometry notation that we will use shortly; in particular, when using the summation notation, we will now be matching subscripts with superscripts for the pair of indices being summed.)

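In Eulerian coordinates, the Euler equations read

$$\partial_t u + u \cdot \nabla u = -\nabla p$$

$$\nabla \cdot u = 0 \ \ \ \ \ (1)$$

where $u: [0,T) \times {\bf R}^d_E \to {\bf R}^d_E$ is the velocity field and $p: [0,T) \times {\bf R}^d_E \to {\bf R}$ is the pressure field. These are functions of the time $t \in [0,T)$ and of the spatial location variable $x \in {\bf R}^d_E$. We will refer to the coordinates $(t,x) = (t,x^1,\dots,x^d)$ as Eulerian coordinates. However, if one reviews the physical derivation of the Euler equations from 254A Notes 0, before one takes the continuum limit, the fundamental unknowns were not the velocity field $u$ or the pressure field $p$, but rather the trajectories $(x^{(a)}(t))_{a \in A}$, which can be thought of as a single function $x: [0,T) \times A \to {\bf R}^d_E$ from the coordinates $(t,a)$ (where $t$ is a time and $a$ is an element of the label set $A$) to ${\bf R}^d_E$. The relationship between the trajectories $x^{(a)}(t) = x(t,a)$ and the velocity field was given by the informal relationship

$$\partial_t x(t,a) \approx u( t, x(t,a) ). \ \ \ \ \ (2)$$

We will refer to the coordinates $(t,a)$ as (discrete) *Lagrangian coordinates* for describing the fluid.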

In view of this, it is natural to ask whether there is an alternate way to formulate the continuum limit of incompressible inviscid fluids, by using a continuous version $(t,a)$ of the Lagrangian coordinates, rather than Eulerian coordinates. This is indeed the case. Suppose for instance one has a smooth solution $u, p$ to the Euler equations on a spacetime slab $[0,T) \times {\bf R}^d_E$ in Eulerian coordinates; assume furthermore that the velocity field $u$ is uniformly bounded. We introduce another copy ${\bf R}^d_L$ of ${\bf R}^d$, which we call *Lagrangian space* or *labels space*; we use symbols such as $a$ to refer to elements of this space, $da$ to denote Lebesgue measure on ${\bf R}^d_L$, and $a^1,\dots,a^d$ to refer to the $d$ coordinates of $a$. We use indices such as $\alpha,\beta,\gamma$ to index these coordinates, thus for instance $\partial_\alpha$ denotes partial differentiation along the $a^\alpha$ coordinate. We will use summation conventions for both the Eulerian coordinates $i,j,k$ and the Lagrangian coordinates $\alpha,\beta,\gamma$, with an index being summed if it appears as both a subscript and a superscript in the same term. While ${\bf R}^d_L$ and ${\bf R}^d_E$ are of course isomorphic, we will try to refrain from identifying them, except perhaps at the initial time $t=0$ in order to fix the initialisation of Lagrangian coordinates.

Given a smooth and bounded velocity field $u: [0,T) \times {\bf R}^d_E \to {\bf R}^d_E$, define a *trajectory map* for this velocity to be any smooth map $X: [0,T) \times {\bf R}^d_L \to {\bf R}^d_E$ that obeys the ODE

$$\partial_t X(t,a) = u( t, X(t,a) ); \ \ \ \ \ (3)$$

in view of (2), this describes the trajectory (in ${\bf R}^d_E$) of a particle labeled by an element $a$ of ${\bf R}^d_L$. From the Picard existence theorem and the hypothesis that $u$ is smooth and bounded, such a map exists and is unique as long as one specifies the initial location $X(0,a)$ assigned to each label $a$. Traditionally, one chooses the initial condition

$$X(0,a) = a \ \ \ \ \ (4)$$

for $a \in {\bf R}^d_L$, so that we label each particle by its initial location at time $t=0$; we are also free to specify other initial conditions for the trajectory map if we please. Indeed, we have the freedom to “permute” the labels $a \in {\bf R}^d_L$ by an arbitrary diffeomorphism: if $X: [0,T) \times {\bf R}^d_L \to {\bf R}^d_E$ is a trajectory map, and $\pi: {\bf R}^d_L \to {\bf R}^d_L$ is any diffeomorphism (a smooth map whose inverse exists and is also smooth), then the map $X \circ \pi: (t,a) \mapsto X(t,\pi(a))$ is also a trajectory map, albeit one with different initial conditions $X \circ \pi(0,a) = X(0,\pi(a))$.

Despite the popularity of the initial condition (4), we will try to keep conceptually separate the Eulerian space ${\bf R}^d_E$ from the Lagrangian space ${\bf R}^d_L$, as they play different physical roles in the interpretation of the fluid; for instance, while the Euclidean metric is an important feature of Eulerian space ${\bf R}^d_E$, it is not a geometrically natural structure to use in Lagrangian space ${\bf R}^d_L$. We have the following more general version of Exercise 8 from 254A Notes 2:

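Exercise 1 Let $u: [0,T) \times {\bf R}^d_E \to {\bf R}^d_E$ be smooth and bounded.

- If $X_0: {\bf R}^d_L \to {\bf R}^d_E$ is a smooth map, show that there exists a unique smooth trajectory map $X: [0,T) \times {\bf R}^d_L \to {\bf R}^d_E$ with initial condition $X(0,a) = X_0(a)$ for all $a \in {\bf R}^d_L$.
- Show that if $X_0$ is a diffeomorphism and $t \in [0,T)$, then the map $X(t): a \mapsto X(t,a)$ is also a diffeomorphism.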

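Remark 2 The first of the Euler equations (1) can now be written in the form

$$\frac{d^2}{dt^2} X(t,a) = - (\nabla p)( t, X(t,a) ),$$

which can be viewed as a continuous limit of Newton’s second law $m^{(a)} \frac{d^2}{dt^2} x^{(a)}(t) = F^{(a)}(t)$.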

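Call a diffeomorphism $Y: {\bf R}^d_L \to {\bf R}^d_E$ *(oriented) volume preserving* if one has the equation

$$\det( \nabla Y )(a) = 1$$

for all $a \in {\bf R}^d_L$, where the total differential $\nabla Y$ is the $d \times d$ matrix with entries $\partial_\alpha Y^i$ for $\alpha = 1,\dots,d$ and $i = 1,\dots,d$, where $Y^1,\dots,Y^d: {\bf R}^d_L \to {\bf R}$ are the components of $Y$. (If one wishes, one can also view $\nabla Y$ as a linear transformation from the tangent space $T_a {\bf R}^d_L$ of Lagrangian space at $a$ to the tangent space $T_{Y(a)} {\bf R}^d_E$ of Eulerian space at $Y(a)$.) Equivalently, $Y$ is orientation preserving and one has a Jacobian-free change of variables formula

$$\int_{{\bf R}^d_L} f( Y(a) )\ da = \int_{{\bf R}^d_E} f(x)\ dx$$

for all $f \in C_c({\bf R}^d_E \to {\bf R})$, which is in turn equivalent to $Y(E) \subset {\bf R}^d_E$ having the same Lebesgue measure as $E$ for any measurable set $E \subset {\bf R}^d_L$.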

The divergence-free condition $\nabla \cdot u = 0$ then can be nicely expressed in terms of volume-preserving properties of the trajectory maps $X$, in a manner which confirms the interpretation of this condition as an incompressibility condition on the fluid:

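Lemma 3 Let $u: [0,T) \times {\bf R}^d_E \to {\bf R}^d_E$ be smooth and bounded, let $X_0: {\bf R}^d_L \to {\bf R}^d_E$ be a volume-preserving diffeomorphism, and let $X: [0,T) \times {\bf R}^d_L \to {\bf R}^d_E$ be the trajectory map with initial condition $X(0) = X_0$. Then the following are equivalent:

- $\nabla \cdot u = 0$ on $[0,T) \times {\bf R}^d_E$.
- $X(t): {\bf R}^d_L \to {\bf R}^d_E$ is volume-preserving for all $t \in [0,T)$.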

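*Proof:* Since $X_0$ is orientation-preserving, we see from continuity that $X(t)$ is also orientation-preserving. Suppose that $X(t)$ is also volume-preserving; then for any $f \in C^\infty_c({\bf R}^d_E \to {\bf R})$ we have the conservation law

$$\int_{{\bf R}^d_L} f( X(t,a) )\ da = \int_{{\bf R}^d_E} f(x)\ dx$$

for all $t \in [0,T)$. Differentiating in time using the chain rule and (3) we conclude that

$$\int_{{\bf R}^d_L} (u(t) \cdot \nabla f)( X(t,a) )\ da = 0$$

for all $t \in [0,T)$, and hence by change of variables

$$\int_{{\bf R}^d_E} (u(t) \cdot \nabla f)(x)\ dx = 0,$$

which by integration by parts gives

$$\int_{{\bf R}^d_E} (\nabla \cdot u)(t,x) f(x)\ dx = 0$$

for all $f \in C^\infty_c({\bf R}^d_E \to {\bf R})$ and $t \in [0,T)$, so $u$ is divergence-free.

To prove the converse implication, it is convenient to introduce the *labels map* $A: [0,T) \times {\bf R}^d_E \to {\bf R}^d_L$, defined by setting $A(t): {\bf R}^d_E \to {\bf R}^d_L$ to be the inverse of the diffeomorphism $X(t): {\bf R}^d_L \to {\bf R}^d_E$, thus

$$A( t, X(t,a) ) = a$$

for all $(t,a) \in [0,T) \times {\bf R}^d_L$. By the implicit function theorem, $A$ is smooth, and by differentiating the above equation in time using (3) we see that

$$D_t A(t,x) = 0$$

where $D_t$ is the usual material derivative

$$D_t := \partial_t + u \cdot \nabla$$

acting on functions on $[0,T) \times {\bf R}^d_E$. If $u$ is divergence-free, we have from integration by parts that

$$\partial_t \int_{{\bf R}^d_E} \phi(t,x)\ dx = \int_{{\bf R}^d_E} D_t \phi(t,x)\ dx$$

for any test function $\phi: [0,T) \times {\bf R}^d_E \to {\bf R}$. In particular, for any $g \in C^\infty_c({\bf R}^d_L \to {\bf R})$, we can calculate

$$\partial_t \int_{{\bf R}^d_E} g( A(t,x) )\ dx = \int_{{\bf R}^d_E} D_t ( g( A(t,x) ) )\ dx = 0$$

and hence

$$\int_{{\bf R}^d_E} g( A(t,x) )\ dx = \int_{{\bf R}^d_E} g( A(0,x) )\ dx$$

for any $t \in [0,T)$. Since $X_0$ is volume-preserving, so is $A(0)$, thus

$$\int_{{\bf R}^d_E} g( A(t,x) )\ dx = \int_{{\bf R}^d_L} g(a)\ da.$$

Thus $A(t)$ is volume-preserving, and hence $X(t)$ is also.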
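Lemma 3 can be checked numerically in a toy case. The following is a minimal sketch, not part of the argument above: it assumes two hypothetical velocity fields on ${\bf R}^2$ (one divergence-free, one not), integrates the trajectory ODE (3) with a classical Runge-Kutta scheme starting from the initial condition (4), and estimates $\det(\nabla X(t))$ by central differences in the label $a$. All function names are illustrative.

```python
import math

def div_free_field(t, x):
    # Hypothetical smooth bounded field u = (sin x2, sin x1); its divergence is 0.
    return (math.sin(x[1]), math.sin(x[0]))

def compressible_field(t, x):
    # Hypothetical field u = (sin x1, 0) with divergence cos x1, for contrast.
    return (math.sin(x[0]), 0.0)

def rk4_step(u, t, x, h):
    """One classical Runge-Kutta step for the trajectory ODE dX/dt = u(t, X)."""
    def shifted(k, s):
        return tuple(xi + s * ki for xi, ki in zip(x, k))
    k1 = u(t, x)
    k2 = u(t + h / 2, shifted(k1, h / 2))
    k3 = u(t + h / 2, shifted(k2, h / 2))
    k4 = u(t + h, shifted(k3, h))
    return tuple(xi + h / 6 * (a + 2 * b + 2 * c + d)
                 for xi, a, b, c, d in zip(x, k1, k2, k3, k4))

def trajectory(u, a, T, steps=200):
    """Approximate X(T, a) with the traditional initial condition X(0, a) = a."""
    x, h = tuple(a), T / steps
    for n in range(steps):
        x = rk4_step(u, n * h, x, h)
    return x

def jacobian_det(u, a, T, eps=1e-5):
    """Central-difference approximation to det(grad X(T))(a) in two dimensions."""
    rows = []
    for axis in range(2):
        plus, minus = list(a), list(a)
        plus[axis] += eps
        minus[axis] -= eps
        xp, xm = trajectory(u, plus, T), trajectory(u, minus, T)
        # rows[alpha][i] approximates the partial derivative d_alpha X^i
        rows.append([(p - m) / (2 * eps) for p, m in zip(xp, xm)])
    return rows[0][0] * rows[1][1] - rows[0][1] * rows[1][0]

label = (0.3, -0.7)
det_free = jacobian_det(div_free_field, label, 1.0)
det_comp = jacobian_det(compressible_field, label, 1.0)
print(det_free, det_comp)
```

Under these assumptions the first determinant should stay at $1$ up to discretisation error, while the second drifts away from $1$, in line with the equivalence in the lemma.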

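Exercise 4 Let $M: [0,T) \to GL_d({\bf R})$ be a continuously differentiable map from the time interval $[0,T)$ to the general linear group $GL_d({\bf R})$ of invertible $d \times d$ matrices. Establish Jacobi’s formula

$$\partial_t \det( M(t) ) = \det( M(t) )\, \mathrm{tr}( M(t)^{-1} \partial_t M(t) )$$

and use this and (6) to give an alternate proof of Lemma 3 that does not involve any integration.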
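As a sanity check (not a proof), Jacobi’s formula in its standard form $\partial_t \det M = \det(M)\, \mathrm{tr}(M^{-1} \partial_t M)$ can be compared against a finite-difference derivative. The sketch below assumes a hypothetical smooth path `M(t)` of $2 \times 2$ matrices that is invertible near the sample time; the path and the helper names are illustrative only.

```python
import math

def M(t):
    # Hypothetical smooth path of 2x2 matrices, invertible near t = 0.5.
    return [[1.0 + t, t * t], [math.sin(t), 2.0 + math.cos(t)]]

def dM(t):
    # Entrywise time derivative of M(t), computed by hand.
    return [[1.0, 2.0 * t], [math.cos(t), -math.sin(t)]]

def det2(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def inv2(m):
    d = det2(m)
    return [[m[1][1] / d, -m[0][1] / d], [-m[1][0] / d, m[0][0] / d]]

def trace_of_product(a, b):
    # tr(a b) = sum over i, j of a[i][j] * b[j][i]
    return sum(a[i][j] * b[j][i] for i in range(2) for j in range(2))

def jacobi_rhs(t):
    """Right-hand side of Jacobi's formula: det(M) tr(M^{-1} dM/dt)."""
    m = M(t)
    return det2(m) * trace_of_product(inv2(m), dM(t))

def det_derivative(t, h=1e-6):
    """Central-difference approximation to d/dt det(M(t))."""
    return (det2(M(t + h)) - det2(M(t - h))) / (2 * h)

print(jacobi_rhs(0.5), det_derivative(0.5))
```

The two printed numbers should agree to within the finite-difference error.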

Remark 5 One can view the use of Lagrangian coordinates as an extension of the method of characteristics. Indeed, from the chain rule we see that for any smooth function $f: [0,T) \times {\bf R}^d_E \to {\bf R}$ of Eulerian spacetime, one has

$$\frac{d}{dt} f( t, X(t,a) ) = (D_t f)( t, X(t,a) )$$

and hence any transport equation that in Eulerian coordinates takes the form

$$D_t f = g$$

for smooth functions $f, g: [0,T) \times {\bf R}^d_E \to {\bf R}$ of Eulerian spacetime is equivalent to the ODE

$$\frac{d}{dt} F = G$$

where $F, G: [0,T) \times {\bf R}^d_L \to {\bf R}$ are the smooth functions of Lagrangian spacetime defined by

$$F(t,a) := f( t, X(t,a) ); \quad G(t,a) := g( t, X(t,a) ).$$

In this set of notes we recall some basic differential geometry notation, particularly with regards to pullbacks and Lie derivatives of differential forms and similar fields on manifolds such as ${\bf R}^d_E$ and ${\bf R}^d_L$, and explore how the Euler equations look in this notation. Our discussion will be entirely formal in nature; we will assume that all functions have enough smoothness and decay at infinity to justify the relevant calculations. (It is possible to work rigorously in Lagrangian coordinates; see for instance the work of Ebin and Marsden. We will not do so here.) As a general rule, Lagrangian coordinates tend to be somewhat less convenient to use than Eulerian coordinates for establishing the basic analytic properties of the Euler equations, such as local existence, uniqueness, and continuous dependence on the data; however, they are quite good at clarifying the more algebraic properties of these equations, such as conservation laws and the variational nature of the equations. It may well be that in the future we will be able to use the Lagrangian formalism more effectively on the analytic side of the subject also.

Remark 6 One can also write the Navier-Stokes equations in Lagrangian coordinates, but the equations are not expressed in a favourable form in these coordinates, as the Laplacian $\Delta$ appearing in the viscosity term becomes replaced with a time-varying Laplace-Beltrami operator. As such, we will not discuss the Lagrangian coordinate formulation of Navier-Stokes here.

** — 1. Pullbacks and Lie derivatives — **

In order to efficiently change coordinates, it is convenient to use the language of differential geometry, which is designed to be almost entirely independent of the choice of coordinates. We therefore spend some time recalling the basic concepts of differential geometry that we will need. Our presentation will be based on explicitly working in coordinates; there are of course more coordinate-free approaches to the subject (for instance setting up the machinery of vector bundles, or of derivations), but we will not adopt these approaches here.

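Throughout this section, we fix a diffeomorphism $Y: {\bf R}^d_L \to {\bf R}^d_E$ from Lagrangian space ${\bf R}^d_L$ to Eulerian space ${\bf R}^d_E$; one can for instance take $Y = X(t)$ where $X$ is a trajectory map and $t \in [0,T)$ is some time. Then all the differential geometry structures on Eulerian space ${\bf R}^d_E$ can be pulled back via $Y$ to Lagrangian space ${\bf R}^d_L$. For instance, a physical point $x \in {\bf R}^d_E$ can be pulled back to a label $Y^* x := Y^{-1}(x) \in {\bf R}^d_L$, and similarly a subset $E \subset {\bf R}^d_E$ of physical space can be pulled back to a subset $Y^* E := Y^{-1}(E) \subset {\bf R}^d_L$ of label space. A scalar field $f: {\bf R}^d_E \to {\bf R}$ can be pulled back to a scalar field $Y^* f: {\bf R}^d_L \to {\bf R}$, defined by pre-composition:

$$Y^* f(a) := f( Y(a) ).$$

These operations are all compatible with each other in various ways; for instance, if $x \in {\bf R}^d_E$, $E \subset {\bf R}^d_E$, $f: {\bf R}^d_E \to {\bf R}$, and $c \in {\bf R}$, then

- $x \in E$ if and only if $Y^* x \in Y^* E$.
- $f(x) = c$ if and only if $Y^* f( Y^* x ) = c$.
- The map $E \mapsto Y^* E$ is an isomorphism of $\sigma$-algebras.
- The map $f \mapsto Y^* f$ is an algebra isomorphism.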

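**Differential forms.** The next family of structures we will pull back are that of differential forms, which we will define using coordinates. (See also my previous notes on this topic for more discussion on differential forms.) For any $k \geq 0$, a *$k$-form* $\omega$ on ${\bf R}^d_E$ will be defined as a family of functions $\omega_{i_1 \dots i_k}: {\bf R}^d_E \to {\bf R}$ for $i_1,\dots,i_k \in \{1,\dots,d\}$ which is totally antisymmetric with respect to permutations of the indices $i_1,\dots,i_k$, thus if one interchanges $i_j$ and $i_{j'}$ for any distinct $j, j' \in \{1,\dots,k\}$, then $\omega_{i_1 \dots i_k}$ flips to $-\omega_{i_1 \dots i_k}$. Thus for instance

- A $0$-form is just a scalar field $\omega: {\bf R}^d_E \to {\bf R}$;
- A $1$-form, when viewed in coordinates, is a collection $\omega_i: {\bf R}^d_E \to {\bf R}$ of $d$ scalar functions;
- A $2$-form, when viewed in coordinates, is a collection $\omega_{ij}: {\bf R}^d_E \to {\bf R}$ of $d^2$ scalar functions with $\omega_{ji} = -\omega_{ij}$ (so in particular $\omega_{ii} = 0$);
- A $3$-form, when viewed in coordinates, is a collection $\omega_{ijk}: {\bf R}^d_E \to {\bf R}$ of $d^3$ scalar functions with $\omega_{jik} = -\omega_{ijk}$, $\omega_{ikj} = -\omega_{ijk}$, and $\omega_{kji} = -\omega_{ijk}$.

The antisymmetry makes the component $\omega_{i_1 \dots i_k}$ of a $k$-form vanish whenever two of the indices agree. In particular, if $k > d$, then the only $k$-form that exists is the zero $k$-form $0$. A $d$-form is also known as a volume form; amongst all such forms we isolate the *standard volume form* $d\mathrm{vol}$, defined by setting $d\mathrm{vol}_{\sigma(1) \dots \sigma(d)} := \mathrm{sgn}(\sigma)$ for any permutation $\sigma: \{1,\dots,d\} \to \{1,\dots,d\}$ (with $\mathrm{sgn}(\sigma) \in \{-1,+1\}$ being the sign of the permutation), and setting all other components of $d\mathrm{vol}$ equal to zero. For instance, in three dimensions one has $d\mathrm{vol}_{ijk}$ equal to $+1$ when $(i,j,k) = (1,2,3), (2,3,1), (3,1,2)$, $-1$ when $(i,j,k) = (1,3,2), (3,2,1), (2,1,3)$, and $0$ otherwise. We use $\Omega^k({\bf R}^d_E)$ to denote the space of $k$-forms on ${\bf R}^d_E$.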

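If $f: {\bf R}^d_E \to {\bf R}$ is a scalar field and $\omega \in \Omega^k({\bf R}^d_E)$, we can define the product $f \omega$ by pointwise multiplication of components:

$$(f\omega)_{i_1 \dots i_k}(x) := f(x) \omega_{i_1 \dots i_k}(x).$$

More generally, given two forms $\omega \in \Omega^k({\bf R}^d_E)$, $\theta \in \Omega^l({\bf R}^d_E)$, we define the wedge product $\omega \wedge \theta \in \Omega^{k+l}({\bf R}^d_E)$ to be the $k+l$-form given by the formula

$$(\omega \wedge \theta)_{i_1 \dots i_{k+l}}(x) := \frac{1}{k!\, l!} \sum_{\sigma \in S_{k+l}} \mathrm{sgn}(\sigma)\, \omega_{i_{\sigma(1)} \dots i_{\sigma(k)}}(x)\, \theta_{i_{\sigma(k+1)} \dots i_{\sigma(k+l)}}(x)$$

where $S_{k+l}$ is the symmetric group of permutations on $\{1,\dots,k+l\}$. For instance, for a scalar field $f$ (so $f \in \Omega^0({\bf R}^d_E)$), $f \wedge \omega = \omega \wedge f = f \omega$. Similarly, if $\theta, \lambda \in \Omega^1({\bf R}^d_E)$ and $\omega \in \Omega^2({\bf R}^d_E)$, we have the pointwise identities

$$(\theta \wedge \lambda)_{ij} = \theta_i \lambda_j - \theta_j \lambda_i$$

$$(\theta \wedge \omega)_{ijk} = \theta_i \omega_{jk} - \theta_j \omega_{ik} + \theta_k \omega_{ij}.$$

Exercise 7 Show that the wedge product is a bilinear map from $\Omega^k({\bf R}^d_E) \times \Omega^l({\bf R}^d_E)$ to $\Omega^{k+l}({\bf R}^d_E)$ that obeys the supercommutative property

$$\omega \wedge \theta = (-1)^{kl}\, \theta \wedge \omega$$

for $\omega \in \Omega^k({\bf R}^d_E)$ and $\theta \in \Omega^l({\bf R}^d_E)$, and the associative property

$$(\omega \wedge \theta) \wedge \lambda = \omega \wedge (\theta \wedge \lambda)$$

for $\omega \in \Omega^k({\bf R}^d_E)$, $\theta \in \Omega^l({\bf R}^d_E)$, $\lambda \in \Omega^m({\bf R}^d_E)$. (In other words, the space of formal linear combinations of forms, graded by the parity of the order of the forms, is a supercommutative algebra. Very roughly speaking, the prefix “super” means that “odd order objects anticommute with each other rather than commute”.)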
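The wedge product can be experimented with at a single point. The sketch below is an illustration under stated assumptions, not a replacement for Exercise 7: it represents the components of a $k$-form at one point as a dictionary keyed by index tuples (0-based, unlike the 1-based indices in the text), and uses the normalisation $\frac{1}{k!\,l!} \sum_\sigma \mathrm{sgn}(\sigma) \dots$ for the wedge product. The helper names `one_form`, `sign` and `wedge` are hypothetical.

```python
from fractions import Fraction
from itertools import permutations, product
from math import factorial

def sign(perm):
    """Sign of a permutation of 0..n-1, computed by sorting with transpositions."""
    perm, s = list(perm), 1
    for i in range(len(perm)):
        while perm[i] != i:
            j = perm[i]
            perm[i], perm[j] = perm[j], perm[i]
            s = -s
    return s

def one_form(components):
    """Components of a 1-form at a point, keyed by 1-tuples of 0-based indices."""
    return {(i,): Fraction(c) for i, c in enumerate(components)}

def wedge(omega, k, theta, l, d):
    """Components of the (k+l)-form omega ^ theta at a point of R^d."""
    n = k + l
    norm = Fraction(1, factorial(k) * factorial(l))
    out = {}
    for idx in product(range(d), repeat=n):
        total = Fraction(0)
        for sigma in permutations(range(n)):
            p = tuple(idx[s] for s in sigma)
            total += sign(sigma) * omega[p[:k]] * theta[p[k:]]
        out[idx] = norm * total
    return out

a, b, c = one_form([1, 2, 3]), one_form([4, 5, 6]), one_form([7, 8, 10])
ab = wedge(a, 1, b, 1, 3)
ba = wedge(b, 1, a, 1, 3)
left = wedge(ab, 2, c, 1, 3)                    # (a ^ b) ^ c
right = wedge(a, 1, wedge(b, 1, c, 1, 3), 2, 3)  # a ^ (b ^ c)
print(ab[(0, 1)], ba[(0, 1)], left[(0, 1, 2)])
```

For these sample $1$-forms one sees the sign flip $a \wedge b = -b \wedge a$ expected when $k = l = 1$, and the two bracketings of the triple product agree; the top component of $a \wedge b \wedge c$ is the determinant of the matrix with rows $a, b, c$.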

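If $\omega \in \Omega^k({\bf R}^d_E)$ is continuously differentiable, we define the exterior derivative $d\omega \in \Omega^{k+1}({\bf R}^d_E)$ in coordinates as

$$(d\omega)_{i_1 \dots i_{k+1}} := \sum_{j=1}^{k+1} (-1)^{j-1} \partial_{i_j} \omega_{i_1 \dots i_{j-1} i_{j+1} \dots i_{k+1}}.$$

It is easy to verify that this is indeed a $k+1$-form. Thus for instance:

- If $f$ is a continuously differentiable scalar field, then $(df)_i = \partial_i f$.
- If $v$ is a continuously differentiable $1$-form, then $(dv)_{ij} = \partial_i v_j - \partial_j v_i$.
- If $\omega$ is a continuously differentiable $2$-form, then $(d\omega)_{ijk} = \partial_i \omega_{jk} - \partial_j \omega_{ik} + \partial_k \omega_{ij}$.

Exercise 8 If $\omega \in \Omega^k({\bf R}^d_E)$ and $\theta \in \Omega^l({\bf R}^d_E)$ are continuously differentiable, establish the antiderivation (or super-Leibniz) law

$$d( \omega \wedge \theta ) = (d\omega) \wedge \theta + (-1)^k \omega \wedge d\theta$$

and if $\omega$ is twice continuously differentiable, establish the chain complex law

$$d d \omega = 0. \ \ \ \ \ (10)$$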

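Each of the coordinates $x^i$, $i=1,\dots,d$ can be viewed as scalar fields $x \mapsto x^i$. In particular, the exterior derivatives $dx^i$, $i=1,\dots,d$ are $1$-forms. It is easy to verify the identity

$$\omega = \frac{1}{k!} \omega_{i_1 \dots i_k}\, dx^{i_1} \wedge \dots \wedge dx^{i_k}$$

for any $\omega \in \Omega^k({\bf R}^d_E)$ with the usual summation conventions (which, in this differential geometry formalism, assert that we sum indices whenever they appear as a subscript-superscript pair). In particular the volume form $d\mathrm{vol}$ can be written as

$$d\mathrm{vol} = dx^1 \wedge \dots \wedge dx^d.$$

One can of course define differential forms on Lagrangian space ${\bf R}^d_L$ as well, changing the indices from Roman to Greek. For instance, if $\theta \in \Omega^1({\bf R}^d_L)$ is continuously differentiable, then $d\theta \in \Omega^2({\bf R}^d_L)$ is given in coordinates as

$$(d\theta)_{\alpha\beta} = \partial_\alpha \theta_\beta - \partial_\beta \theta_\alpha.$$

If $\omega \in \Omega^k({\bf R}^d_E)$, we define the pullback form $Y^* \omega \in \Omega^k({\bf R}^d_L)$ by the formula

$$(Y^* \omega)_{\alpha_1 \dots \alpha_k}(a) := \omega_{i_1 \dots i_k}( Y(a) )\, \partial_{\alpha_1} Y^{i_1}(a) \dots \partial_{\alpha_k} Y^{i_k}(a)$$

with the usual summation conventions. Thus for instance

- If $f$ is a scalar field, then the pullback $Y^* f$ is given by the same formula $Y^* f(a) = f(Y(a))$ as before.
- If $v$ is a $1$-form, then the pullback $Y^* v$ is given by the formula $(Y^* v)_\alpha(a) = v_i( Y(a) )\, \partial_\alpha Y^i(a)$.
- If $\omega$ is a $2$-form, then the pullback $Y^* \omega$ is given by the formula $(Y^* \omega)_{\alpha\beta}(a) = \omega_{ij}( Y(a) )\, \partial_\alpha Y^i(a)\, \partial_\beta Y^j(a)$.

It is easy to see that pullback $Y^*$ is a linear map from $\Omega^k({\bf R}^d_E)$ to $\Omega^k({\bf R}^d_L)$. It also preserves the exterior algebra and exterior derivative:

Exercise 9 Let $\omega \in \Omega^k({\bf R}^d_E)$ and $\theta \in \Omega^l({\bf R}^d_E)$. Show that

$$Y^*( \omega \wedge \theta ) = (Y^* \omega) \wedge (Y^* \theta)$$

and if $\omega$ is continuously differentiable, show that

$$Y^* d\omega = d Y^* \omega.$$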

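One can integrate $k$-forms on oriented $k$-manifolds. Suppose for instance that an oriented $k$-manifold $M \subset {\bf R}^d_E$ has a parameterisation $\{ \phi(a): a \in U \}$, where $U$ is an open subset of ${\bf R}^k$ and $\phi: U \to {\bf R}^d_E$ is an injective immersion. Then any continuous compactly supported $k$-form $\omega \in \Omega^k({\bf R}^d_E)$ can be integrated on $M$ by the formula

$$\int_M \omega := \int_U \omega_{i_1 \dots i_k}( \phi(a) )\, \partial_1 \phi^{i_1}(a) \dots \partial_k \phi^{i_k}(a)\ da$$

with the usual summation conventions. It can be shown that this definition is independent of the choice of parameterisation. For a more general manifold $M$, one can use a partition of unity to decompose the integral into parameterised manifolds, and define the total integral to be the sum of the components; again, one can show (after some tedious calculation) that this is independent of the choice of parameterisation. If $M$ is all of ${\bf R}^d_E$ (with the standard orientation), and $f \in C_c({\bf R}^d_E \to {\bf R})$, then we have the identity

$$\int_{{\bf R}^d_E} f\ d\mathrm{vol} = \int_{{\bf R}^d_E} f(x)\ dx$$

linking integration on differential forms with the Lebesgue (or Riemann) integral. We also record Stokes’ theorem

$$\int_{\partial M} \omega = \int_M d\omega \ \ \ \ \ (14)$$

whenever $M$ is a smooth orientable $k+1$-manifold with smooth boundary $\partial M$, and $\omega$ is a continuous, compactly supported $k$-form. The regularity conditions on $M, \omega$ here can often be relaxed by the usual limiting arguments; for the purposes of this set of notes, we shall proceed formally and assume that identities such as (14) hold for all manifolds $M$ and forms $\omega$ under consideration.

From the change of variables formula we see that pullback also respects integration on manifolds, in that

$$\int_{Y^* M} Y^* \omega = \int_M \omega$$

whenever $M$ is a smooth orientable $k$-manifold, and $\omega$ is a continuous compactly supported $k$-form.

Exercise 10 Establish the identity

$$Y^* d\mathrm{vol} = \det( \nabla Y )\ d\mathrm{vol},$$

where $d\mathrm{vol}$ denotes the standard volume form on ${\bf R}^d_E$ on the left-hand side and on ${\bf R}^d_L$ on the right-hand side. Conclude in particular that $Y$ is volume-preserving if and only if

$$Y^* d\mathrm{vol} = d\mathrm{vol}.$$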

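**Vector fields.** Having pulled back differential forms, we now pull back vector fields. A vector field $Z$ on ${\bf R}^d_E$, when viewed in coordinates, is a collection $Z^i: {\bf R}^d_E \to {\bf R}$, $i=1,\dots,d$ of scalar functions; superficially, this resembles a $1$-form $\omega_i$, except that we use superscripts $Z^i$ instead of subscripts $\omega_i$ to denote the components. On the other hand, we will transform vector fields under pullback in a different manner from $1$-forms. For each $i$, a basic example of a vector field is the coordinate vector field $\frac{d}{dx^i}$, defined by setting $(\frac{d}{dx^i})^j$ to equal $1$ when $i=j$ and $0$ otherwise. Then every vector field $Z$ may be written as

$$Z = Z^i \frac{d}{dx^i}$$

where we multiply scalar functions against vector fields in the obvious fashion. The space of all vector fields will be denoted $\Gamma(T{\bf R}^d_E)$. One can of course define vector fields on ${\bf R}^d_L$ similarly.

The pullback $Y^* Z \in \Gamma(T{\bf R}^d_L)$ of $Z \in \Gamma(T{\bf R}^d_E)$ is defined to be the unique vector field such that

$$Z^i( Y(a) ) = (Y^* Z)^\alpha(a)\, \partial_\alpha Y^i(a)$$

for all $a \in {\bf R}^d_L$ (so that $Z$ is the pushforward of $Y^* Z$). Equivalently, if $(\nabla Y)^{-1}$ is the inverse matrix to the total differential $\nabla Y$ (which we recall in coordinates is $\partial_\alpha Y^i$), so that

$$((\nabla Y)^{-1})^\alpha_i\, \partial_\beta Y^i = \delta^\alpha_\beta; \quad ((\nabla Y)^{-1})^\alpha_i\, \partial_\alpha Y^j = \delta^j_i$$

with $\delta$ denoting the Kronecker delta, then

$$(Y^* Z)^\alpha(a) = ((\nabla Y)^{-1})^\alpha_i(a)\, Z^i( Y(a) ).$$

From the inverse function theorem one can also write

$$((\nabla Y)^{-1})^\alpha_i(a) = (\partial_i (Y^{-1})^\alpha)( Y(a) ),$$

thus $Y^* Z$ is also the pushforward of $Z$ by $Y^{-1}$.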

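If $\omega$ is a $k$-form and $Z_1,\dots,Z_k$ are vector fields, one can form the scalar field $\omega(Z_1,\dots,Z_k)$ by the formula

$$\omega(Z_1,\dots,Z_k) := \omega_{i_1 \dots i_k} Z_1^{i_1} \dots Z_k^{i_k}.$$

Thus for instance if $\omega$ is a $2$-form and $Z, W$ are vector fields, then

$$\omega(Z,W) = \omega_{ij} Z^i W^j.$$

It is clear that $\omega(Z_1,\dots,Z_k)$ is a totally antisymmetric form in the $Z_1,\dots,Z_k$. If $\omega$ is a $k$-form for some $k \geq 1$ and $Z$ is a vector field, we define the contraction (or *interior product*) $\iota_Z \omega \in \Omega^{k-1}({\bf R}^d_E)$ in coordinates by the formula

$$(\iota_Z \omega)_{i_2 \dots i_k} := Z^{i_1} \omega_{i_1 i_2 \dots i_k}$$

or equivalently that

$$(\iota_{Z_1} \omega)(Z_2,\dots,Z_k) = \omega(Z_1,\dots,Z_k)$$

for all vector fields $Z_1,\dots,Z_k$. Thus for instance if $\omega$ is a $2$-form, and $Z$ is a vector field, then $\iota_Z \omega$ is the $1$-form

$$(\iota_Z \omega)_i = Z^j \omega_{ji}.$$

If $Z$ is a vector field and $f$ is a continuously differentiable scalar field, then $\iota_Z df$ is just the directional derivative of $f$ along the vector field $Z$:

$$\iota_Z df = Z^i \partial_i f.$$

The contraction $\iota_Z \omega$ is also denoted $Z \lrcorner\, \omega$ in the literature. If one contracts a vector field $Z$ against the standard volume form $d\mathrm{vol}$, one obtains a $d-1$-form which we will call (by slight abuse of notation) the Hodge dual $*Z$ of $Z$:

$$*Z := \iota_Z\, d\mathrm{vol}.$$

This can easily be seen to be a bijection between vector fields and $d-1$-forms. The inverse of this operation will also be denoted by the Hodge star $*$:

$$*( \iota_Z\, d\mathrm{vol} ) := Z.$$

In a similar spirit, the Hodge dual of a scalar field $f$ will be defined as the volume form

$$*f := f\, d\mathrm{vol}$$

and conversely the Hodge dual of a volume form is a scalar field:

$$*( f\, d\mathrm{vol} ) := f.$$

More generally one can form a Hodge duality relationship between $k$-vector fields and $d-k$-forms for any $0 \leq k \leq d$, but we will not do so here as we will not have much use for the notion of a $k$-vector field for any $k > 1$.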

These operations behave well under pullback (if one assumes volume preservation in the case of the Hodge star):

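Exercise 11

- (i) If $\omega \in \Omega^k({\bf R}^d_E)$ and $Z_1,\dots,Z_k \in \Gamma(T{\bf R}^d_E)$, show that $Y^*( \omega(Z_1,\dots,Z_k) ) = (Y^* \omega)( Y^* Z_1, \dots, Y^* Z_k )$.
- (ii) If $\omega \in \Omega^k({\bf R}^d_E)$ for some $k \geq 1$ and $Z \in \Gamma(T{\bf R}^d_E)$, show that $Y^*( \iota_Z \omega ) = \iota_{Y^* Z} Y^* \omega$.
- (iii) If $Y$ is volume-preserving, show that $Y^*( *T ) = *( Y^* T )$ whenever $T$ is a scalar field, vector field, $d-1$-form, or $d$-form on ${\bf R}^d_E$.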

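**Riemannian metrics.** A Riemannian metric $g$ on ${\bf R}^d_E$, when expressed in coordinates, is a collection of scalar functions $g_{ij}: {\bf R}^d_E \to {\bf R}$ such that for each point $x \in {\bf R}^d_E$, the matrix $(g_{ij}(x))_{1 \leq i,j \leq d}$ is symmetric and strictly positive definite. In particular it has an inverse metric $g^{-1}$, which is a collection of scalar functions $g^{ij} = (g^{-1})^{ij}$ such that

$$g^{ij} g_{jk} = \delta^i_k$$

where $\delta^i_k$ denotes the Kronecker delta; here we have abused notation (and followed the conventions of general relativity) by allowing the inverse on the metric to be omitted when expressed in coordinates (relying instead on the superscripting of the indices, as opposed to subscripting, to indicate the metric inversion). The Euclidean metric $\eta$ is an example of a metric tensor, with $\eta_{ij}$ equal to $1$ when $i=j$ and zero otherwise; the coefficients $\eta^{ij}$ of the inverse Euclidean metric $\eta^{-1}$ are similarly equal to $1$ when $i=j$ and $0$ otherwise. Given two vector fields $Z, W$ and a Riemannian metric $g$, we can form the scalar field $g(Z,W)$ by

$$g(Z,W) := g_{ij} Z^i W^j;$$

this is a symmetric bilinear form in $Z, W$.

We can define the pullback metric $Y^* g$ by the formula

$$(Y^* g)_{\alpha\beta}(a) := g_{ij}( Y(a) )\, \partial_\alpha Y^i(a)\, \partial_\beta Y^j(a);$$

this is easily seen to be a Riemannian metric on ${\bf R}^d_L$, and one has the compatibility property

$$Y^*( g(Z,W) ) = (Y^* g)( Y^* Z, Y^* W )$$

for all $Z, W \in \Gamma(T{\bf R}^d_E)$. It is then not difficult to check that if we pull back the inverse metric $g^{-1}$ by the formula

$$(Y^* (g^{-1}))^{\alpha\beta}(a) := ((\nabla Y)^{-1})^\alpha_i(a)\, ((\nabla Y)^{-1})^\beta_j(a)\, g^{ij}( Y(a) )$$

then we have the expected relationship

$$Y^*( g^{-1} ) = (Y^* g)^{-1}.$$

Exercise 12 If $\pi: {\bf R}^d_L \to {\bf R}^d_L$ is a diffeomorphism, show that

$$(Y \circ \pi)^* \omega = \pi^* Y^* \omega$$

for any $\omega \in \Omega^k({\bf R}^d_E)$, and similarly

$$(Y \circ \pi)^* Z = \pi^* Y^* Z$$

for any $Z \in \Gamma(T{\bf R}^d_E)$, and

$$(Y \circ \pi)^* g = \pi^* Y^* g$$

for any Riemannian metric $g$.

Exercise 13 Show that $Y$ is an isometry (with respect to the Euclidean metric on both ${\bf R}^d_L$ and ${\bf R}^d_E$) if and only if $Y^* \eta = \eta$.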

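Every Riemannian metric $g$ induces a musical isomorphism between vector fields on ${\bf R}^d_E$ and $1$-forms: if $Z$ is a vector field, the associated $1$-form $g \cdot Z$ (also denoted $Z^\flat_g$ or simply $Z^\flat$) is defined in coordinates as

$$(g \cdot Z)_i := g_{ij} Z^j$$

and similarly if $\omega \in \Omega^1({\bf R}^d_E)$, the associated vector field $g^{-1} \cdot \omega$ (also denoted $\omega^\sharp_g$ or $\omega^\sharp$) is defined in coordinates as

$$(g^{-1} \cdot \omega)^i := g^{ij} \omega_j.$$

These operations clearly invert each other: $g^{-1} \cdot (g \cdot Z) = Z$ and $g \cdot (g^{-1} \cdot \omega) = \omega$. Note that $g \cdot Z$ can still be defined if $g$ is not positive definite, though it might not be an isomorphism in this case. Observe the identities

$$g(W,Z) = \iota_W ( g \cdot Z ) = \iota_Z ( g \cdot W ) = g^{-1}( g \cdot W, g \cdot Z ).$$

The musical isomorphism interacts well with pullback, provided that one also pulls back the metric $g$:

Exercise 14 If $g$ is a Riemannian metric, show that

$$Y^*( g \cdot Z ) = (Y^* g) \cdot Y^* Z$$

for all $Z \in \Gamma(T{\bf R}^d_E)$, and

$$Y^*( g^{-1} \cdot \omega ) = (Y^* g)^{-1} \cdot Y^* \omega$$

for all $\omega \in \Omega^1({\bf R}^d_E)$.

We can now interpret some classical operations on vector fields in this differential geometry notation. For instance, if $Z, W$ are vector fields, the dot product $Z \cdot W$ can be written as

$$Z \cdot W = \eta(Z,W) = \iota_Z( \eta \cdot W )$$

and also

$$Z \cdot W = *( (\eta \cdot Z) \wedge *W ),$$

and for $d=3$, the cross product $Z \times W$ can be written in differential geometry notation as

$$Z \times W = *( (\eta \cdot Z) \wedge (\eta \cdot W) ).$$

Exercise 15 Formulate a definition for the pullback $Y^* T$ of a rank $(k,l)$ tensor field $T$ (which in coordinates would be given by $T^{i_1 \dots i_k}_{j_1 \dots j_l}$ for $i_1,\dots,i_k,j_1,\dots,j_l \in \{1,\dots,d\}$) that generalises the pullback of differential forms, vector fields, and Riemannian metrics. Argue why your definition is the natural one.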

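**Lie derivatives.** If $Z$ is a continuously differentiable vector field, and $\omega$ is a continuously differentiable $k$-form, we will define the *Lie derivative* $\mathcal{L}_Z \omega$ of $\omega$ along $Z$ by the *Cartan formula*

$$\mathcal{L}_Z \omega := \iota_Z d\omega + d \iota_Z \omega \ \ \ \ \ (19)$$

with the convention that $\iota_Z \omega$ vanishes if $\omega$ is a $0$-form. Thus for instance:

- If $f$ is a continuously differentiable scalar field, then $\mathcal{L}_Z f$ is just the directional derivative of $f$ along $Z$: $\mathcal{L}_Z f = \iota_Z df = Z^i \partial_i f$.
- If $v$ is a continuously differentiable $1$-form, then $\mathcal{L}_Z v$ is the $1$-form $$(\mathcal{L}_Z v)_i = Z^j \partial_j v_i + (\partial_i Z^j) v_j. \ \ \ \ \ (20)$$
- If $\omega$ is a continuously differentiable $2$-form, then $\mathcal{L}_Z \omega$ is the $2$-form $$(\mathcal{L}_Z \omega)_{ij} = Z^k \partial_k \omega_{ij} + (\partial_i Z^k) \omega_{kj} + (\partial_j Z^k) \omega_{ik}.$$

One can interpret the Lie derivative as the infinitesimal version of pullback:

Exercise 16 Let $u: [0,T) \times {\bf R}^d_E \to {\bf R}^d_E$ be smooth and bounded (so that $u(t)$ can be viewed as a smooth vector field on ${\bf R}^d_E$ for each $t$), and let $X: [0,T) \times {\bf R}^d_L \to {\bf R}^d_E$ be a trajectory map. If $\omega \in \Omega^k({\bf R}^d_E)$ is a smooth $k$-form, show that

$$\frac{d}{dt} ( X(t)^* \omega ) = X(t)^* ( \mathcal{L}_{u(t)} \omega ).$$

More generally, if $\omega(t) \in \Omega^k({\bf R}^d_E)$ is a smooth $k$-form that varies smoothly in $t$, show that

$$\frac{d}{dt} ( X(t)^* \omega(t) ) = X(t)^* ( \mathcal{D}_t \omega )(t)$$

where $\mathcal{D}_t$ denotes the *material Lie derivative*

$$\mathcal{D}_t := \partial_t + \mathcal{L}_{u(t)}.$$

Note that the material Lie derivative specialises to the material derivative when applied to scalar fields. The above exercise shows that the trajectory map intertwines the ordinary time derivative $\frac{d}{dt}$ with the material (Lie) derivative.

Remark 17 If one lets $\exp(tZ) := X(t)$ be the trajectory map associated to a time-independent vector field $Z$ with initial condition (4) (thus $\frac{d}{dt} \exp(tZ)(a) = Z( \exp(tZ)(a) )$ and $\exp(0Z)(a) = a$), then the above exercise shows that $\mathcal{L}_Z \omega = \frac{d}{dt} \exp(tZ)^* \omega |_{t=0}$ for any differential form $\omega$. This can be used as an alternate definition of the Lie derivative (and has the advantage of readily extending to other tensors than differential forms, for which the Cartan formula is not available).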

The Lie derivative behaves very well with respect to exterior product and exterior derivative:

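Exercise 18 Let $Z \in \Gamma(T{\bf R}^d_E)$ be continuously differentiable, and let $\omega \in \Omega^k({\bf R}^d_E)$, $\theta \in \Omega^l({\bf R}^d_E)$ also be continuously differentiable. Establish the Leibniz rule

$$\mathcal{L}_Z ( \omega \wedge \theta ) = (\mathcal{L}_Z \omega) \wedge \theta + \omega \wedge \mathcal{L}_Z \theta.$$

If $\omega$ is twice continuously differentiable, also establish the commutativity

$$d \mathcal{L}_Z \omega = \mathcal{L}_Z d\omega$$

of exterior derivative and Lie derivative.

Exercise 19 Let $Z \in \Gamma(T{\bf R}^d_E)$ be continuously differentiable. Show that

$$\mathcal{L}_Z\, d\mathrm{vol} = \mathrm{div}(Z)\ d\mathrm{vol}$$

where $\mathrm{div}(Z) := \partial_i Z^i$ is the divergence of $Z$. Use this and Exercise 16 to give an alternate proof of Lemma 3.

Exercise 20 Let $Z \in \Gamma(T{\bf R}^d_E)$ be continuously differentiable. For any smooth compactly supported volume form $\omega$, show that

$$\int_{{\bf R}^d_E} \mathcal{L}_Z \omega = 0.$$

Conclude in particular that if $Z$ is divergence-free then

$$\int_{{\bf R}^d_E} (\mathcal{L}_Z f)\ d\mathrm{vol} = 0$$

for any $f \in C^\infty_c({\bf R}^d_E \to {\bf R})$.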

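The Lie derivative $\mathcal{L}_Z W$ of a continuously differentiable vector field $W \in \Gamma(T{\bf R}^d_E)$ is defined in coordinates as

$$(\mathcal{L}_Z W)^i := Z^j \partial_j W^i - W^j \partial_j Z^i$$

and the Lie derivative $\mathcal{L}_Z g$ of a continuously differentiable rank $(0,2)$ tensor $g$ is defined in coordinates as

$$(\mathcal{L}_Z g)_{ij} := Z^k \partial_k g_{ij} + (\partial_i Z^k) g_{kj} + (\partial_j Z^k) g_{ik}.$$

Thus for instance the Lie derivative of the Euclidean metric $\eta_{ij}$ is expressible in coordinates as

$$(\mathcal{L}_Z \eta)_{ij} = \partial_i Z^j + \partial_j Z^i \ \ \ \ \ (21)$$

(compare with the *deformation tensor* used in Notes 0).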

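We have similar properties to Exercise 18:

Exercise 21 Let $Z \in \Gamma(T{\bf R}^d_E)$ be continuously differentiable.

- (i) If $\omega \in \Omega^k({\bf R}^d_E)$ and $Z_1,\dots,Z_k \in \Gamma(T{\bf R}^d_E)$ are continuously differentiable, establish the Leibniz rule $$\mathcal{L}_Z ( \omega(Z_1,\dots,Z_k) ) = (\mathcal{L}_Z \omega)(Z_1,\dots,Z_k) + \sum_{i=1}^k \omega( Z_1, \dots, Z_{i-1}, \mathcal{L}_Z Z_i, Z_{i+1}, \dots, Z_k ).$$ If $k \geq 1$, and $W \in \Gamma(T{\bf R}^d_E)$ is continuously differentiable, establish the variant Leibniz rule $$\mathcal{L}_Z ( \iota_W \omega ) = \iota_{\mathcal{L}_Z W} \omega + \iota_W \mathcal{L}_Z \omega.$$
- (ii) If $g$ is a continuously differentiable rank $(0,2)$ tensor and $W_1, W_2 \in \Gamma(T{\bf R}^d_E)$ are continuously differentiable, establish the Leibniz rule $$\mathcal{L}_Z ( g(W_1,W_2) ) = (\mathcal{L}_Z g)(W_1,W_2) + g( \mathcal{L}_Z W_1, W_2 ) + g( W_1, \mathcal{L}_Z W_2 );$$ similarly, for continuously differentiable $W \in \Gamma(T{\bf R}^d_E)$, show that $$\mathcal{L}_Z ( g \cdot W ) = (\mathcal{L}_Z g) \cdot W + g \cdot \mathcal{L}_Z W.$$
- (iii) Establish the analogue of Exercise 16 in which the differential form $\omega$ is replaced by a vector field $W$ or a rank $(0,2)$-tensor $g$.
- (iv) If $Z$ is divergence-free, show that $$\mathcal{L}_Z ( *T ) = *( \mathcal{L}_Z T )$$ whenever $T$ is a continuously differentiable scalar field, vector field, $d-1$-form, or $d$-form on ${\bf R}^d_E$.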

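Exercise 22 If $Z \in \Gamma(T{\bf R}^d_E)$ is continuously differentiable, establish the identity

$$Y^*( \mathcal{L}_Z T ) = \mathcal{L}_{Y^* Z} Y^* T$$

whenever $T$ is a continuously differentiable differential form, vector field, or metric tensor.

Exercise 23 If $Z, W \in \Gamma(T{\bf R}^d_E)$ are smooth, define the *Lie bracket* $[Z,W]$ by the formula

$$[Z,W] := \mathcal{L}_Z W.$$

Establish the anti-symmetry $[Z,W] = -[W,Z]$ (so in particular $[Z,Z] = 0$) and the Jacobi identity

$$[[Z_1,Z_2],Z_3] + [[Z_2,Z_3],Z_1] + [[Z_3,Z_1],Z_2] = 0,$$

and also

$$\mathcal{L}_{[Z,W]} T = \mathcal{L}_Z \mathcal{L}_W T - \mathcal{L}_W \mathcal{L}_Z T$$

whenever $Z, W, Z_1, Z_2, Z_3 \in \Gamma(T{\bf R}^d_E)$ are smooth, and $T$ is a smooth differential form, vector field, or metric tensor.

Exercise 24 Formulate a definition for the Lie derivative $\mathcal{L}_Z T$ of a (continuously differentiable) rank $(k,l)$ tensor field $T$ along a vector field $Z$ that generalises the Lie derivative of differential forms, vector fields, and Riemannian metrics. Argue why your definition is the natural one.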

** — 2. The Euler equations in differential geometry notation — **

Now we write the Euler equations (1) in differential geometry language developed in the above section. This will make it relatively painless to change coordinates. As in the rest of the notes, we work formally, assuming that all fields are smooth enough to justify the manipulations below.

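The Euler equations involve a time-dependent scalar field $p$, which can be viewed as an element of $\Omega^0({\bf R}^d_E)$ at each time, and a time-dependent velocity field $u$, which can be viewed as an element of $\Gamma(T{\bf R}^d_E)$ at each time. The second of the Euler equations simply asserts that this vector field is divergence-free:

$$\mathrm{div}\, u = 0$$

or equivalently (by Exercise 19 and the definition $\mathcal{D}_t = \partial_t + \mathcal{L}_u$ of the material Lie derivative)

$$\mathcal{D}_t\, d\mathrm{vol} = 0.$$

For the first equation, it is convenient to work instead with the *covelocity field* $v \in \Omega^1({\bf R}^d_E)$, formed by applying the Euclidean musical isomorphism to $u$:

$$v = \eta \cdot u.$$

In coordinates, we have $v_i = \eta_{ij} u^j = u^i$. The Euler equations can then be written in coordinates as

$$\partial_t v_i + u^j \partial_j v_i = -\partial_i p.$$

The left-hand side is close to the $i$ component of the material Lie derivative $\mathcal{D}_t v$ of $v$. Indeed, from (20) we have

$$(\mathcal{D}_t v)_i = \partial_t v_i + u^j \partial_j v_i + (\partial_i u^j) v_j$$

and so the first Euler equation becomes

$$(\mathcal{D}_t v)_i = -\partial_i p + (\partial_i u^j) v_j.$$

Since $(\partial_i u^j) v_j = \partial_i ( \frac{1}{2} u^j v_j )$, we can express the right-hand side as a total derivative $-(d\tilde p)_i$, where $\tilde p$ is the *modified pressure*

$$\tilde p := p - \frac{1}{2} u^j v_j = p - \frac{1}{2} \eta(u,u).$$

We thus see that the Euler equations can be transformed to the system

$$\mathcal{D}_t v = -d\tilde p \ \ \ \ \ (22)$$

$$v = \eta \cdot u \ \ \ \ \ (23)$$

$$\mathcal{D}_t\, d\mathrm{vol} = 0. \ \ \ \ \ (24)$$

Using the Cartan formula (19), one can also write (22) as

$$\partial_t v + \iota_u dv = -dp' \ \ \ \ \ (25)$$

where $p'$ is another modification of the pressure:

$$p' := \tilde p + \iota_u v = p + \frac{1}{2} \eta(u,u).$$

In coordinates, (25) becomes

$$\partial_t v_i + u^j ( \partial_j v_i - \partial_i v_j ) = -\partial_i p'. \ \ \ \ \ (26)$$

One advantage of the formulation (22)-(24) is that one can pull back by an arbitrary diffeomorphic change of coordinates (both time-dependent and time-independent), with the only things potentially changing being the material Lie derivative $\mathcal{D}_t$, the metric $\eta$, and the volume form $d\mathrm{vol}$. (Another, related, advantage is that this formulation readily suggests an extension to more general Riemannian manifolds, by replacing $\eta$ with a general Riemannian metric and $d\mathrm{vol}$ with the associated volume form, without the need to explicitly introduce other Riemannian geometry concepts such as covariant derivatives or Christoffel symbols.)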

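For instance, suppose $d=3$, and we wish to view the Euler equations in cylindrical coordinates $(r,\theta,z) \in [0,+\infty) \times {\bf R}/2\pi{\bf Z} \times {\bf R}$, thus pulling back under the time-independent map $Y$ defined by

$$Y(r,\theta,z) := ( r \cos\theta, r \sin\theta, z ).$$

Strictly speaking, this is not a diffeomorphism due to singularities at $r=0$, but we ignore this issue for now by only working away from the axis $r=0$. As is well known, the metric $d\eta^2 = (dx^1)^2 + (dx^2)^2 + (dx^3)^2$ pulls back under this change of coordinates as

$$dr^2 + r^2\, d\theta^2 + dz^2,$$

thus the pullback metric $Y^* \eta$ is diagonal in $r,\theta,z$ coordinates with entries

$$(Y^* \eta)_{rr} = 1; \quad (Y^* \eta)_{\theta\theta} = r^2; \quad (Y^* \eta)_{zz} = 1.$$

The volume form $d\mathrm{vol}$ similarly pulls back to the familiar cylindrical coordinate volume form

$$Y^* d\mathrm{vol} = r\, dr \wedge d\theta \wedge dz.$$

If (by slight abuse of notation) we write the components of $Y^* u$ as $u^r, u^\theta, u^z$, and the components of $Y^* v$ as $v_r, v_\theta, v_z$, then the second equation (23) in the current formulation of the Euler equations now becomes

$$v_r = u^r; \quad v_\theta = r^2 u^\theta; \quad v_z = u^z \ \ \ \ \ (27)$$

and the third equation (24) is

$$\mathcal{D}_t ( r\, dr \wedge d\theta \wedge dz ) = 0,$$

which by the product rule and Exercise 19 becomes

$$\mathcal{L}_u r + r ( \partial_r u^r + \partial_\theta u^\theta + \partial_z u^z ) = 0$$

or after expanding in coordinates

$$u^r + r ( \partial_r u^r + \partial_\theta u^\theta + \partial_z u^z ) = 0.$$

If one substitutes (27) into (26) in the $r,\theta,z$ coordinates to eliminate the $v$ variables, we thus see that the cylindrical coordinate form of the Euler equations is

$$\partial_t u^r + u^\theta ( \partial_\theta u^r - \partial_r ( r^2 u^\theta ) ) + u^z ( \partial_z u^r - \partial_r u^z ) = -\partial_r p' \ \ \ \ \ (28)$$

$$\partial_t ( r^2 u^\theta ) + u^r ( \partial_r ( r^2 u^\theta ) - \partial_\theta u^r ) + u^z ( \partial_z ( r^2 u^\theta ) - \partial_\theta u^z ) = -\partial_\theta p' \ \ \ \ \ (29)$$

$$\partial_t u^z + u^r ( \partial_r u^z - \partial_z u^r ) + u^\theta ( \partial_\theta u^z - \partial_z ( r^2 u^\theta ) ) = -\partial_z p' \ \ \ \ \ (30)$$

$$\partial_r ( r u^r ) + \partial_\theta ( r u^\theta ) + \partial_z ( r u^z ) = 0. \ \ \ \ \ (31)$$

One should compare how readily one can derive these equations using the differential geometry formalism with the more pedestrian approach using the chain rule: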

Exercise 25 Starting with a smooth solution $(u,p)$ to the Euler equations (1) in ${\bf R}^3_E$, and transforming to cylindrical coordinates $(r,\theta,z)$, establish the chain rule formulae

and use this and the identity

to rederive the system (28)-(31) (away from the axis) without using the language of differential geometry.
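The cylindrical pullback described above can be spot-checked numerically. The sketch below is only an illustration: it assumes the standard cylindrical map $(r,\theta,z) \mapsto (r\cos\theta, r\sin\theta, z)$, writes its total differential by hand, and evaluates the pullback of the Euclidean metric as a sum of products of partial derivatives at one sample point. The function names are hypothetical.

```python
import math

def jacobian(r, theta, z):
    """Total differential of (r, theta, z) -> (r cos(theta), r sin(theta), z).

    Entry [i][alpha] is the derivative of the i-th Cartesian coordinate
    with respect to the alpha-th cylindrical coordinate.
    """
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -r * s, 0.0],
            [s, r * c, 0.0],
            [0.0, 0.0, 1.0]]

def pullback_euclidean_metric(r, theta, z):
    """Pullback metric: entry [alpha][beta] is the sum over i of J[i][alpha] * J[i][beta]."""
    J = jacobian(r, theta, z)
    return [[sum(J[i][a] * J[i][b] for i in range(3)) for b in range(3)]
            for a in range(3)]

def det3(m):
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

g = pullback_euclidean_metric(2.0, 0.7, -1.0)
vol_density = det3(jacobian(2.0, 0.7, -1.0))
print(g, vol_density)
```

At $r = 2$ the metric comes out diagonal with entries $1, r^2, 1$ and the Jacobian determinant equals $r$, matching the line element $dr^2 + r^2 d\theta^2 + dz^2$ and the density of the cylindrical volume form.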

Exercise 26 Turkington coordinates $(x,y,\zeta)$ are a variant of cylindrical coordinates $(r,\theta,z)$, defined by the formulae

$$(x,y,\zeta) := ( z, r^2/2, \theta );$$

the advantage of these coordinates is that the map from Cartesian coordinates $(x^1,x^2,x^3)$ to Turkington coordinates $(x,y,\zeta)$ is volume preserving. Show that in these coordinates, the Euler equations become

(These coordinates are particularly useful for studying solutions to Euler that are “axisymmetric with swirl”, in the sense that the fields do not depend on the angular variable, so that all the terms involving the angular derivative vanish; one can specialise further to the case of solutions that are “axisymmetric without swirl”, in which case the angular component of the velocity also vanishes.)

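We can use the differential geometry formalism to formally verify the conservation laws of the Euler equation. We begin with conservation of energy

$$\frac{1}{2} \int_{{\bf R}^d_E} |u|^2\ dx = \frac{1}{2} \int_{{\bf R}^d_E} \eta^{-1}(v,v)\ d\mathrm{vol}.$$

Formally differentiating this in time (and noting that the form $\eta^{-1}(\cdot,\cdot)$ is symmetric in its two arguments) we have

$$\partial_t \frac{1}{2} \int_{{\bf R}^d_E} \eta^{-1}(v,v)\ d\mathrm{vol} = \int_{{\bf R}^d_E} \eta^{-1}( \partial_t v, v )\ d\mathrm{vol}.$$

Using (22), we can write

$$\eta^{-1}( \partial_t v, v ) = -\eta^{-1}( \mathcal{L}_u v, v ) - \eta^{-1}( d\tilde p, v ).$$

From the Cartan formula (19) one has $\eta^{-1}( d\tilde p, v ) = \iota_u d\tilde p = \mathcal{L}_u \tilde p$; from Exercise 23 one has $\mathcal{L}_u u = 0$, and hence by the Leibniz rule (Exercise 21(i)) we have $\eta^{-1}( \mathcal{L}_u v, v ) = \iota_u \mathcal{L}_u v = \mathcal{L}_u ( \iota_u v )$. We thus can write $\eta^{-1}( \partial_t v, v )$ as a total derivative:

$$\eta^{-1}( \partial_t v, v ) = -\mathcal{L}_u ( \iota_u v + \tilde p ).$$

From Exercise 20 we thus formally obtain the conservation law $\partial_t \frac{1}{2} \int_{{\bf R}^d_E} \eta^{-1}(v,v)\ d\mathrm{vol} = 0$.

Now suppose that $Z$ is a time-independent vector field that is a Killing vector field for the Euclidean metric $\eta$, by which we mean that

$$\mathcal{L}_Z \eta = 0.$$

Taking traces in (21), this implies in particular that $Z$ is divergence-free, or equivalently

$$\mathcal{L}_Z\, d\mathrm{vol} = 0.$$

(Geometrically, this implication arises because the volume form $d\mathrm{vol}$ can be constructed from the Euclidean metric $\eta$ (up to a choice of orientation).) Consider the formal quantity

$$\int_{{\bf R}^d_E} ( \iota_Z v )\ d\mathrm{vol}.$$

As $v$ is the only time-dependent quantity here, we may formally differentiate to obtain

$$\partial_t \int_{{\bf R}^d_E} ( \iota_Z v )\ d\mathrm{vol} = \int_{{\bf R}^d_E} ( \iota_Z \partial_t v )\ d\mathrm{vol}.$$

Using (22), the right-hand side is

$$-\int_{{\bf R}^d_E} ( \iota_Z \mathcal{L}_u v )\ d\mathrm{vol} - \int_{{\bf R}^d_E} ( \iota_Z d\tilde p )\ d\mathrm{vol}.$$

By Cartan’s formula, $\iota_Z d\tilde p$ is a total derivative $\mathcal{L}_Z \tilde p$, and hence this contribution to the integral formally vanishes as $Z$ is divergence-free. The quantity $\iota_Z \mathcal{L}_u v$ can be written using the Leibniz rule as the difference of the total derivative $\mathcal{L}_u ( \iota_Z v )$ and the quantity $\iota_{\mathcal{L}_u Z} v$. The former quantity also gives no contribution to the integral as $u$ is divergence free, thus

$$\partial_t \int_{{\bf R}^d_E} ( \iota_Z v )\ d\mathrm{vol} = \int_{{\bf R}^d_E} ( \iota_{\mathcal{L}_u Z} v )\ d\mathrm{vol}.$$

By Exercise 23, we have $\mathcal{L}_u Z = -\mathcal{L}_Z u$. Since $\eta$ (and hence $\eta^{-1}$) is annihilated by $\mathcal{L}_Z$, and the form $\eta^{-1}(\cdot,\cdot)$ is symmetric in its two arguments, we can express $\iota_{\mathcal{L}_u Z} v$ as a total derivative

$$\iota_{\mathcal{L}_u Z} v = -\eta^{-1}( \mathcal{L}_Z v, v ) = -\frac{1}{2} \mathcal{L}_Z ( \eta^{-1}(v,v) ),$$

and so this integral also vanishes. Thus we obtain the conservation law $\partial_t \int_{{\bf R}^d_E} ( \iota_Z v )\ d\mathrm{vol} = 0$. If we set the Killing vector field $Z$ equal to the constant vector field $\frac{d}{dx^i}$ for some $i=1,\dots,d$, we obtain conservation of the momentum components

$$\int_{{\bf R}^d_E} u^i\ d\mathrm{vol}$$

for $i=1,\dots,d$; if we instead set the Killing vector field $Z$ equal to the rotation vector field $x^i \frac{d}{dx^j} - x^j \frac{d}{dx^i}$ (which one can easily verify to be Killing using (21)) we obtain conservation of the angular momentum components

$$\int_{{\bf R}^d_E} ( x^i u^j - x^j u^i )\ d\mathrm{vol}$$

for $i,j = 1,\dots,d$. Unfortunately, this essentially exhausts the supply of Killing vector fields: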

Exercise 27 Let $Z$ be a smooth Killing vector field of the Euclidean metric $\eta$. Show that $Z$ is a linear combination (with real coefficients) of the constant vector fields $\frac{d}{dx^i}$, $i=1,\dots,d$ and the rotation vector fields $x^i \frac{d}{dx^j} - x^j \frac{d}{dx^i}$, $i,j = 1,\dots,d$. (Hint: use (21) to show that all the second derivatives of components of $Z$ vanish.)
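The Killing condition can be tested numerically using the coordinate expression for the Lie derivative of the Euclidean metric, namely the symmetrised gradient $\partial_i Z^j + \partial_j Z^i$. The sketch below is illustrative only: it assumes two hypothetical test fields on ${\bf R}^3$ (a rotation field and a dilation field) and approximates the partial derivatives by central differences.

```python
def rotation_field(x):
    # Rotation field in the first two coordinates: components (-x2, x1, 0).
    return (-x[1], x[0], 0.0)

def dilation_field(x):
    # Dilation field Z(x) = x, which is not Killing.
    return (x[0], x[1], x[2])

def lie_derivative_of_euclidean_metric(Z, x, h=1e-5):
    """Approximate the matrix d_i Z^j + d_j Z^i at the point x by central differences."""
    d = len(x)
    grad = []  # grad[i][j] approximates d_i Z^j
    for i in range(d):
        xp, xm = list(x), list(x)
        xp[i] += h
        xm[i] -= h
        zp, zm = Z(xp), Z(xm)
        grad.append([(zp[j] - zm[j]) / (2 * h) for j in range(d)])
    return [[grad[i][j] + grad[j][i] for j in range(d)] for i in range(d)]

point = (0.3, -1.2, 2.0)
rot = lie_derivative_of_euclidean_metric(rotation_field, point)
dil = lie_derivative_of_euclidean_metric(dilation_field, point)
print(rot)
print(dil)
```

The rotation field gives a matrix that vanishes up to rounding error, so it passes the Killing test, whereas the dilation field gives twice the identity matrix and so fails it.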

The *vorticity $2$-form* $\omega$ is defined as the exterior derivative of the covelocity:

$$\omega := dv.$$

It already made an appearance in Notes 3 from the previous quarter. Taking exterior derivatives of (22) using (10) and Exercise 18 we obtain the appealingly simple *vorticity equation*

$$\mathcal{D}_t \omega = 0. \ \ \ \ \ (32)$$

In two and three dimensions we may take the Hodge dual $*\omega$ of the vorticity $2$-form to obtain either a scalar field (in dimension $d=2$) or a vector field (in dimension $d=3$), and then Exercise 21(iv) implies that

$$\mathcal{D}_t ( *\omega ) = 0.$$

In two dimensions, this gives us a lot of conservation laws, since one can apply the scalar chain rule to then formally conclude that

$$\mathcal{D}_t F( *\omega ) = 0$$

for any $F: {\bf R} \to {\bf R}$, which upon integration on ${\bf R}^2_E$ using Exercise 20 gives the conservation law

$$\partial_t \int_{{\bf R}^2_E} F( *\omega )\ d\mathrm{vol} = 0$$

for any such function $F$. Thus for instance the $L^p({\bf R}^2_E)$ norms of $*\omega$ are formally conserved for every finite exponent $p$, and hence also for $p = \infty$ by a limiting argument, recovering Proposition 24 from Notes 3 of the previous quarter.

In three dimensions there is also an interesting conservation law involving the vorticity. Observe that the wedge product $v \wedge \omega$ of the covelocity and the vorticity is a $3$-form and can thus be integrated over ${\bf R}^3_E$. The helicity

$$\int_{{\bf R}^3_E} v \wedge \omega$$

is a formally conserved quantity of the Euler equations. Indeed, formally differentiating and using Exercise 20 we have

$$\partial_t \int_{{\bf R}^3_E} v \wedge \omega = \int_{{\bf R}^3_E} \mathcal{D}_t ( v \wedge \omega ).$$

From the Leibniz rule and (32) we have

$$\mathcal{D}_t ( v \wedge \omega ) = ( \mathcal{D}_t v ) \wedge \omega.$$

Applying (22) we can write this expression as $-d\tilde p \wedge \omega$. From (10) we have $d\omega = 0$, hence this expression is also a total derivative $-d( \tilde p\, \omega )$. From Stokes’ theorem (14) we thus formally obtain the conservation of helicity: $\partial_t \int_{{\bf R}^3_E} v \wedge \omega = 0$ (first observed by Moreau).

Exercise 28 Formally verify the conservation of momentum, angular momentum, and helicity directly from the original form (1) of the Euler equations.

Exercise 29 In even dimensions , show that the integral (formed by taking the exterior product of copies of ) is conserved by the flow, while in odd dimensions , show that the generalised helicity is conserved by the flow. (This observation is due to Denis Serre, as well as unpublished work of Tartar.)

As it turns out, there are no further conservation laws for the Euler equations in Eulerian coordinates that are linear or quadratic integrals of the velocity field and its derivatives, at least in three dimensions; see this paper of Denis Serre. In particular, the Euler equations are not believed to be completely integrable. (But there are a few more conserved integrals of motion in the Lagrangian formalism; see Exercise 39.)

Exercise 30 Let be a smooth solution to the Euler equations in three dimensions , let be the vorticity vector field, and let be an arbitrary smooth scalar field. Establish *Ertel’s theorem*

Exercise 31 (Clebsch variables) Let be a smooth solution to the Euler equations. Suppose that at time zero, the covelocity takes the form for some smooth scalar fields . Show that at all subsequent times , the covelocity takes the form

where are smooth scalar fields obeying the transport equations

(The classical Clebsch variables take , but as was observed by Constantin, the analysis also extends without difficulty to the case .)

** — 3. Viewing the Euler equations in Lagrangian coordinates — **

Throughout this section, is a smooth solution to the Euler equations on , and is a trajectory map.

We pull back the Euler equations (22), (23), (24), to create a Lagrangian velocity field , a Lagrangian covelocity field , a Lagrangian modified pressure field , and a Lagrangian vorticity field by the formulae

By Exercise 16, the Euler equations now take the form

and the vorticity is given by

and obeys the vorticity equation

We thus see that in Lagrangian coordinates, the vorticity is a *pointwise* conserved quantity:

This lets us solve for the Eulerian vorticity in terms of the trajectory map. Indeed, from (12), (35) we have

applying the inverse of the linear transformation , we thus obtain the *Cauchy vorticity formula*

If we normalise the trajectory map by (4), then , and we thus have

Thus for instance, we see that the support of the vorticity is transported by the flow:

Among other things, this shows that the volume and topology of the support of the vorticity remain constant in time. It also suggests that the Euler equations admit a number of “vortex patch” solutions in which the vorticity is compactly supported.

Exercise 32 Assume the normalisation (4).

- (i) In the two-dimensional case , show that the Cauchy vorticity formula simplifies to
Thus in this case, vorticity is simply transported by the flow.

- (ii) In the three-dimensional case , show that the Cauchy vorticity formula can be written using the Hodge dual of the vorticity as
Thus we see that the vorticity is transported and also stretched by the flow, with the stretching given by the matrix .

One can also phrase the conservation of vorticity in an integral form. If is a two-dimensional oriented surface in that does not vary in time, then from (37) we see that the integral

is formally conserved in time:

Composing this with the trajectory map using (35), we conclude that

Writing and using Stokes’ theorem (14), we arrive at the Kelvin circulation theorem

The integral of the covelocity along a loop is known as the *circulation* of the fluid along the loop; the Kelvin circulation theorem then asserts that this circulation remains constant over time as long as the loop evolves along the flow.
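The circulation theorem can be checked numerically on a simple explicit example. The sketch below is an illustration only, not part of the notes: it uses the steady circular flow $u(x,y) = \Omega(r)(-y,x)$ with the assumed profile $\Omega(r) = e^{-r^2}$, which is a smooth steady solution of the two-dimensional Euler equations whose trajectory map rotates each label $a$ by the angle $t\,\Omega(|a|)$. All function names are hypothetical. A circular loop of labels is transported by this flow, and the circulation is approximated by a midpoint rule on the resulting polygon.

```python
import math

def angular_speed(r):
    # Assumed angular velocity profile Omega(r) = exp(-r^2).
    return math.exp(-r * r)

def velocity(x, y):
    # Steady circular flow u(x, y) = Omega(r) * (-y, x).
    w = angular_speed(math.hypot(x, y))
    return (-w * y, w * x)

def flow_map(t, ax, ay):
    # Explicit trajectory map: rotate the label (ax, ay) by the angle t * Omega(|a|).
    theta = t * angular_speed(math.hypot(ax, ay))
    c, s = math.cos(theta), math.sin(theta)
    return (c * ax - s * ay, s * ax + c * ay)

def circulation(t, n=20000):
    # Transport a circle of labels (centre (0.5, 0), radius 0.4) to time t.
    pts = []
    for k in range(n):
        s = 2.0 * math.pi * k / n
        pts.append(flow_map(t, 0.5 + 0.4 * math.cos(s), 0.4 * math.sin(s)))
    # Midpoint-rule approximation of the line integral of u along the closed polygon.
    total = 0.0
    for k in range(n):
        x0, y0 = pts[k]
        x1, y1 = pts[(k + 1) % n]
        ux, uy = velocity(0.5 * (x0 + x1), 0.5 * (y0 + y1))
        total += ux * (x1 - x0) + uy * (y1 - y0)
    return total

if __name__ == "__main__":
    for t in (0.0, 1.0, 2.0):
        print(t, circulation(t))
```

Although the loop is visibly sheared by the differential rotation, the printed circulations should agree up to discretisation error. A loop held fixed in space would also have constant circulation here, since the flow is steady, so this only exercises the transported-loop statement in a special case.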

Exercise 33 (Cauchy invariants)

- (i) Use (3) to establish the identity
expressing the Lagrangian covelocity in terms of the Euclidean metric and the trajectory map .

- (ii) Use (i) and (36) to establish the Lagrangian equation of motion

- (iii) Show that is also the pullback of the unmodified Eulerian pressure , thus
and recover Newton’s first law (5).

- (iv) Use (ii) to conclude that the *Cauchy invariants*
are conserved in time.

- (v) Show that the Cauchy invariants are precisely the Lagrangian vorticity , thus the conservation of the Cauchy invariants is equivalent to the Cauchy vorticity formula.
For more discussion of Cauchy’s investigation of the Cauchy invariants and vorticity formula, see this article of Frisch and Villone.

Exercise 34 (Transport of vorticity lines) Suppose we are in three dimensions , so that the Hodge dual of vorticity is a vector field. A smooth curve (either infinite on both ends, or a closed loop) in is said to be a vortex line (or vortex ring, in the case of a closed loop) at time if at every point of the curve , the tangent to at is parallel to the vorticity at that point. Suppose that the trajectory map is normalised using (4). Show that if is a vortex line at time , then is a vortex line at any other time ; thus, vortex lines (or vortex rings) flow along with the fluid.

Exercise 35 (Conservation of helicity in Lagrangian coordinates)

- (i) In any dimension, establish the identity
in Lagrangian spacetime.

- (ii) Conclude that in three dimensions , the quantity
is formally conserved in time. Explain why this conserved quantity is the same as the helicity (34).

- (iii) Continue assuming . Define a *vortex tube* at time to be a region in which, at every point on the boundary , the vorticity vector field is tangent to . Show that if is a vortex tube at time , then is a vortex tube at time , and the helicity on the vortex tube is formally conserved in time.

- (iv) Let . If the covelocity can be expressed in Clebsch variables (Exercise 31) with , show that the local helicity formally vanishes on every vortex tube . This provides an obstruction to the existence of Clebsch variables. (On the other hand, it is easy to find Clebsch variables on with for an arbitrary covelocity , simply by setting equal to the coordinate functions .)

Exercise 36 In the three-dimensional case , show that the material derivative commutes with the operation of differentiation along the (Hodge dual of the) vorticity.

The Cauchy vorticity formula (39) can be used to obtain an integral representation for the velocity in terms of the trajectory map , leading to the *vorticity-stream formulation* of the Euler equations. Recall from 254A Notes 3 that if one takes the divergence of the (Eulerian) vorticity , one obtains the Laplacian of the (Eulerian) covelocity :

where are the partial derivatives raised by the Euclidean metric. For , we can use the fundamental solution of the Laplacian (see Exercise 18 of 254A Notes 1) to conclude that (formally, at least)

Integrating by parts (after first removing a small ball around , and observing that the boundary terms from this ball go to zero as one shrinks the radius to zero) one obtains the Biot-Savart law

for the covelocity, or equivalently

for the velocity.
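For orientation, in three dimensions the law takes the following familiar form in classical vector notation. This is an assumed translation of the index notation used in these notes, with $\omega = \nabla \times u$ denoting the vorticity vector field (the Hodge dual of the vorticity $2$-form), and the formula is formal in the same sense as the computation above.

```latex
% Biot-Savart law in d = 3, vector form (assumed notation: \omega = \nabla \times u)
u(x) = \frac{1}{4\pi} \int_{\mathbb{R}^3} \frac{\omega(y) \times (x - y)}{|x - y|^3} \, dy
```

This is the same kernel as in magnetostatics, with the vorticity playing the role of the current density and the velocity playing the role of the magnetic field.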

Exercise 37 Show that this law is also valid in the two-dimensional case .

Changing to Lagrangian variables, we conclude that

Using the Cauchy vorticity formula (39) (assuming the normalisation (4)), we obtain

Combining this with (3), we obtain an integral-differential equation for the evolution of the trajectory map:

This is known as the *vorticity-stream formulation* of the Euler equations. In two and three dimensions, the formulation can be simplified using the alternate forms of the vorticity formula in Exercise 32. While the equation (42) looks complicated, it is actually well suited for Picard-type iteration arguments (of the type used in 254A Notes 1), due to the relatively small number of derivatives on the right-hand side. Indeed, it turns out that one can iterate this equation with the trajectory map placed in function spaces such as ; see Chapter 4 of Bertozzi-Majda for details.

Remark 38 Because of the ability to solve the Euler equations in Lagrangian coordinates by an iteration method, the local well-posedness theory is slightly stronger in some respects in Lagrangian coordinates than it is in Eulerian coordinates. For instance, in this paper of Constantin, Kukavica, and Vicol it is shown that Lagrangian coordinate Euler equations are well-posed in Gevrey spaces, while Eulerian coordinate Euler equations are not. It also happens that the trajectory maps are real-analytic in even if the initial data is merely smooth; see for instance this paper of Constantin-Vicol-Wu and the references therein.

Exercise 39 Show that the integral is formally conserved in time. (*Hint:* some of the terms arising from computing the derivative are more easily treated by moving to Eulerian coordinates and performing integration by parts there, rather than in Lagrangian coordinates.) This conservation law is related to a scaling symmetry of the Euler equations in Lagrangian coordinates, and is due to Shankar. It does not have a local expression in Eulerian coordinates (mainly because of the appearance of the labels coordinate ).

** — 4. Variational characterisation of the Euler equations — **

Our computations in this section will be even more formal than in previous sections.

From Exercise 1, a (smooth, bounded) vector field (together with a choice of initial map ) gives rise to a trajectory map . From Lemma 3, we see that is volume preserving for all times if and only if is volume preserving and if is divergence-free. Given such a trajectory map, let us formally define the *Lagrangian* by the formula

As observed by Arnold, the Euler equations can be viewed as the Euler-Lagrange equations for this Lagrangian, subject to the constraint that the trajectory map is always volume-preserving:

Proposition 40 Let be a smooth bounded divergence-free vector field with a volume-preserving trajectory map . Then the following are formally equivalent:

- (i) There is a pressure field such that solves the Euler equations.
- (ii) The trajectory map is a critical point of the Lagrangian with respect to all compactly supported infinitesimal perturbations of in that preserve the volume-preserving nature of the trajectory map.

*Proof:* First suppose that (i) holds. Consider an infinitesimal deformation of the trajectory map, with compactly supported in , where one can view either as an infinitesimal or as a parameter tending to zero (in this formal analysis we will not bother to make the setup more precise than this). If this deformation is still volume-preserving, then we have

differentiating at using Exercise 4 we see that

Writing , we thus see from the chain rule that the Eulerian vector field is divergence-free

Now, let us compute the infinitesimal variation of the Lagrangian:

Formally differentiating under the integral sign, this expression becomes

which by symmetry simplifies to

We integrate by parts in time to move the derivative off of the perturbation , to arrive at

Using Newton’s first law (41), this becomes

Writing , we can change to Eulerian variables to obtain

We can now integrate by parts and use (45) and conclude that this variation vanishes. Thus is a formal critical point of the Lagrangian.
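The chain of steps in this direction of the proof can be summarised as follows. This is a sketch in vector notation under assumed names: $X$ is the trajectory map, $X + \varepsilon Y$ the deformation, $Z$ the Eulerian field defined by $Y(t,a) = Z(t, X(t,a))$, $p$ the pressure, and $\mathcal{L}$ the Lagrangian, taken here to be the time integral of the kinetic energy with the normalising factor $\frac12$ assumed. All integrals are over spacetime and all manipulations are formal.

```latex
\begin{align*}
\mathcal{L}(X) &:= \frac12 \iint |\partial_t X(t,a)|^2 \, da \, dt \\
\frac{d}{d\varepsilon} \mathcal{L}(X + \varepsilon Y) \Big|_{\varepsilon = 0}
  &= \iint \partial_t X \cdot \partial_t Y \, da \, dt
     && \text{(differentiate under the integral)} \\
  &= - \iint \partial_{tt} X \cdot Y \, da \, dt
     && \text{(integrate by parts in time)} \\
  &= \iint (\nabla p)(t, X(t,a)) \cdot Y(t,a) \, da \, dt
     && \text{(equation of motion } \partial_{tt} X = -(\nabla p)(X)\text{)} \\
  &= \iint \nabla p(t,x) \cdot Z(t,x) \, dx \, dt
     && \text{(volume-preserving change of variables)} \\
  &= - \iint p \, (\nabla \cdot Z) \, dx \, dt = 0
     && \text{(integrate by parts, } \nabla \cdot Z = 0\text{)}
\end{align*}
```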

Conversely, if is a formal critical point, then the above analysis shows that the expression (46) vanishes whenever obeys (45). Changing variables to Euclidean space, this expression becomes

Hodge theory (cf. Exercise 16 of 254A Notes 1) then implies (formally) that must be a differential , which is equivalent to Newton’s first law (41), which is in turn equivalent to the Euler equations (recalling that is assumed to be divergence-free).

Remark 41 The above analysis reveals that the pressure field can be interpreted as a Lagrange multiplier arising from the constraint that the trajectory map be volume-preserving.

Following Arnold, one can use Proposition 40 to formally interpret the Euler equations as a geodesic flow on an infinite dimensional Riemannian manifold. Indeed, for a finite-dimensional Riemannian manifold , it is well known that (constant speed) geodesics are formal critical points of the energy functional

Thus we see that if we formally take to be the infinite-dimensional space of volume-preserving diffeomorphisms , with the formal Riemannian metric at a point in the directions of two infinitesimal perturbations defined by

then Proposition 40 asserts, formally, that solutions to the Euler equations coincide with constant speed geodesic flows on . As it turns out, a number of other physical equations, including several further fluid equations, also have such a geodesic interpretation, such as Burgers’ equation, the Korteweg-de Vries equation, and the Camassa-Holm equations; see for instance this paper of Vizman for a survey. In principle this means that the tools of Riemannian geometry could be deployed to obtain a better understanding of the Euler equations (and of the other equations mentioned above), although to date this has proven to be somewhat elusive (except when discussing conservation laws, as in Remark 42 below) for a number of reasons, not the least of which is that rigorous Riemannian geometry on infinite-dimensional manifolds is technically quite problematic. (Nevertheless, one can at least recover the local existence theory for the Euler equations this way; see the aforementioned work of Ebin and Marsden.)
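To make the analogy concrete, one can place the two functionals side by side. The notation here is assumed for illustration: $\gamma$ is a curve in a finite-dimensional Riemannian manifold with metric $g$, $X$ is the trajectory map, and $Y_1, Y_2$ are infinitesimal perturbations viewed as vector-valued functions on Lagrangian space.

```latex
\begin{align*}
% energy functional of a curve in a finite-dimensional Riemannian manifold (M, g)
E(\gamma) &= \frac12 \int g_{\gamma(t)}\big( \dot\gamma(t), \dot\gamma(t) \big) \, dt \\
% formal L^2 metric on the space of volume-preserving diffeomorphisms
\langle Y_1, Y_2 \rangle_X &= \int Y_1(a) \cdot Y_2(a) \, da \\
% with this metric, the kinetic energy Lagrangian is the energy functional of the curve t -> X(t)
\frac12 \int \langle \partial_t X(t), \partial_t X(t) \rangle_{X(t)} \, dt
  &= \frac12 \iint |\partial_t X(t,a)|^2 \, da \, dt
\end{align*}
```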

Remark 42 Noether’s theorem tells us that one should expect a one-to-one correspondence between symmetries of a Lagrangian and conservation laws of the corresponding Euler-Lagrange equation. Applying this to Proposition 40, we conclude that the conservation laws of the Euler equations should correspond to symmetries of the Lagrangian (43). There are basically two obvious symmetries of this Lagrangian; one coming from isometries of Eulerian spacetime , and in particular time translation, spatial translation, and spatial rotation; and the other coming from volume-preserving diffeomorphisms of Lagrangian space . One can check that time translation corresponds to energy conservation, spatial translation corresponds to momentum conservation, and spatial rotation corresponds to angular momentum conservation, while Lagrangian diffeomorphism invariance corresponds to conservation of Lagrangian vorticity (or equivalently, the Cauchy vorticity formula). In three dimensions, if one specialises to the specific Lagrangian diffeomorphism created by flow along the vorticity vector field , one also recovers conservation of helicity; see this previous blog post for more discussion.

Remark 43 There are also Hamiltonian formulations of the Euler equations that do not correspond exactly to the geodesic flow interpretation here; see this paper of Olver. Again, one can explain each of the known conservation laws for the Euler equations in terms of symmetries of the Hamiltonian.

Further discussion of the geodesic flow interpretation of the Euler equations may be found in this previous blog post.

In Notes 2, we reviewed the classical construction of Leray of global weak solutions to the Navier-Stokes equations. We did not quite follow Leray’s original proof, in that the notes relied more heavily on the machinery of Littlewood-Paley projections, which have become increasingly common tools in modern PDE. On the other hand, we did use the same “exploiting compactness to pass to weakly convergent subsequence” strategy that is the standard one in the PDE literature used to construct weak solutions.

As I discussed in a previous post, the manipulation of sequences and their limits is analogous to a “cheap” version of nonstandard analysis in which one uses the Fréchet filter rather than an ultrafilter to construct the nonstandard universe. (The manipulation of generalised functions of Colombeau-type can also be comfortably interpreted within this sort of cheap nonstandard analysis.) Augmenting the manipulation of sequences with the right to pass to subsequences whenever convenient is then analogous to a sort of “lazy” nonstandard analysis, in which the implied ultrafilter is never actually constructed as a “completed object”, but is instead lazily evaluated, in the sense that whenever membership of a given subsequence of the natural numbers in the ultrafilter needs to be determined, one either passes to that subsequence (thus placing it in the ultrafilter) or the complement of the sequence (placing it out of the ultrafilter). This process can be viewed as the initial portion of the transfinite induction that one usually uses to construct ultrafilters (as discussed using a voting metaphor in this post), except that there is generally no need in any given application to perform the induction for any uncountable ordinal (or indeed for most of the countable ordinals also).

On the other hand, it is also possible to work directly in the orthodox framework of nonstandard analysis when constructing weak solutions. This leads to an approach to the subject which is largely equivalent to the usual subsequence-based approach, though there are some minor technical differences (for instance, the subsequence approach occasionally requires one to work with separable function spaces, whereas in the ultrafilter approach the reliance on separability is largely eliminated, particularly if one imposes a strong notion of saturation on the nonstandard universe). The subject acquires a more “algebraic” flavour, as the quintessential analysis operation of taking a limit is replaced with the “standard part” operation, which is an algebra homomorphism. The notion of a sequence is replaced by the distinction between standard and nonstandard objects, and the need to pass to subsequences disappears entirely. Also, the distinction between “bounded sequences” and “convergent sequences” is largely eradicated, particularly when the space that the sequences ranged in enjoys some compactness properties on bounded sets. Also, in this framework, the notorious non-uniqueness features of weak solutions can be “blamed” on the non-uniqueness of the nonstandard extension of the standard universe (as well as on the multiple possible ways to construct nonstandard mollifications of the original standard PDE). However, many of these changes are largely cosmetic; switching from a subsequence-based theory to a nonstandard analysis-based theory does *not* seem to bring one significantly closer for instance to the global regularity problem for Navier-Stokes, but it could have been an alternate path for the historical development and presentation of the subject.

In any case, I would like to present below the fold this nonstandard analysis perspective, quickly translating the relevant components of real analysis, functional analysis, and distributional theory that we need to this perspective, and then use it to re-prove Leray’s theorem on existence of global weak solutions to Navier-Stokes.

** — 1. Quick review of nonstandard analysis — **

In this section we quickly review the aspects of nonstandard analysis that we need. Let denote the “standard” universe of “standard” mathematical objects; this includes what one might think of as “primitive” standard objects such as (standard) numbers and (standard) points, but also sets of standard objects (such as the set of real numbers, or the Euclidean space ), or functions from one standard space to another, or function spaces such as of such functions (possibly quotiented out by almost everywhere equivalence), and so forth. In short, should contain all the standard objects that one generally works with in analysis. One can require that this universe obey various axioms (e.g. the Zermelo-Fraenkel-Choice axioms of set theory), but we will not be particularly concerned with the precise properties of this universe (we won’t even need to know whether is a set or a proper class).

What nonstandard analysis does is take this standard universe of standard objects and embed it in a larger *nonstandard universe* of nonstandard objects which has similar properties to the standard one, but also some additional properties. As discussed in this previous post, the relationship between the standard universe and the nonstandard universe is somewhat analogous to that between the rationals and its metric completion ; most of the algebraic properties of carry over to , but also has some additional completeness and (local) compactness properties that lacks. Also, one should think of as being far “larger” than , in much the same way that is larger than in various senses, for instance in the sense of cardinality.

There is one important subtlety concerning the nonstandard universe : it comes with a more restrictive notion of subset (or of function) than the “external” notion of subset or function that one has if one views from some external metatheory (e.g., if one places both and inside a very large model of ZFC). Thus, for instance, an externally defined subset of the nonstandard reals may or may not be an internal subset of these reals (in particular, the embedded copy of the standard reals is *not* an internal subset of , being merely an external subset instead); similarly, an externally defined function from to need not be an internal function (for instance, the standard part function will be external rather than internal). The relationship between internal sets/functions and external sets/functions in nonstandard analysis is somewhat analogous to the relationship between measurable sets/functions and arbitrary sets/functions in measure theory.

The reals can be constructed from the rationals in a number of ways, such as by forming Cauchy sequences in and quotienting out by the sequences that converge to zero; similarly, the nonstandard universe can be formed from the standard one in a number of ways, such as by forming arbitrary sequences in and quotienting out by a non-principal ultrafilter. See for instance this previous post for details. However, much as the precise construction of the reals is often of little direct importance in applications, we will not need to care too much about how the nonstandard universe is constructed. Rather, the following properties of this universe will be used:

- (i) (Embedding) Every standard object, space, operation, or function in has a nonstandard counterpart in . For instance, if is a real number in the set of standard reals, then will be an element of the set of nonstandard reals; if is a standard function, then is a nonstandard function from the nonstandard Euclidean space to the nonstandard reals . The standard addition operation on the standard reals induces a nonstandard addition operation on the nonstandard reals, though to avoid notational clutter we will write as , and similarly for other basic mathematical operations. Similarly, the norm function has a nonstandard counterpart that assigns a nonstandard non-negative real to any nonstandard function . (To avoid notational clutter, we will often abuse notation by identifying with for various “primitive” mathematical objects such as real numbers, arithmetic operations such as , or functions such as , unless we have a pressing need to carefully distinguish a standard object from its representative in the nonstandard universe.)
- (ii) (Transfer) If is a standard predicate in first order logic involving some finite number of standard objects (with a fixed standard natural number), and possibly some quantification over standard sets, and is the nonstandard version of the predicate in which one quantifies over nonstandard sets, then is true if and only if is true. Important caveat: the predicate needs to be *internal* to the mathematical language used internally to both and separately; it is not allowed to use *external* concepts dependent on the way in which embeds into , or how either universe embeds into an external metatheory.
- (iii) (–saturation) Let be standard natural numbers, and suppose that for each standard natural number , is a nonstandard predicate on nonstandard variables and nonstandard constants . If any finite collection of the predicates are simultaneously satisfiable (thus, for each standard , there exist nonstandard objects such that holds for all ), then the entire collection is simultaneously satisfiable (thus there exist nonstandard objects such that holds for all ).

The -saturation property (also informally referred to as *countable saturation*, though this is technically a slight misnomer) resembles the finite intersection property that characterises compactness of topological spaces (and can thus be viewed as somewhat analogous to the local compactness property for the reals ), except that the finite intersection property involves arbitrary families of (closed) sets, whereas the -saturation property requires the collection of predicates involved to be countable. It is possible to construct nonstandard models with a higher degree of saturation (where one can use more predicates , as long as the total number does not exceed some cardinal which relates to the size of the nonstandard universe ), for instance by replacing the sequences used to construct the nonstandard universe with tuples ranging over a larger cardinality set. This may potentially be useful for certain types of analysis, for instance ones involving non-separable spaces, or Fréchet spaces involving an uncountable number of seminorms.

Let us take for granted the existence of a nonstandard universe obeying the embedding, transfer, and saturation properties, and see what we can do with them. Firstly, transfer shows that the map is injective: if and only if . The field axioms of the standard reals can be phrased in the language of first-order logic, and hence by transfer the nonstandard reals also form a field. For instance, the assertion “For every non-zero standard real , there exists a standard real such that ” transfers over to “For every non-zero nonstandard real , there exists a nonstandard real such that ”. If is a standard natural number, one can transfer the statement “ if and only if ” from standard tuples to nonstandard tuples; among other things, this gives the nice identification when is a standard natural number. (The situation is more subtle when is a nonstandard natural number, but in most PDE applications one works in a fixed dimension and will not need to deal with this subtlety.) As one final example, “If , then holds if and only if ” transfers to “If , then holds if and only if ”. More generally, basic inequalities such as Hölder’s inequality, Sobolev embedding, or the Bernstein inequalities transfer over to the nonstandard setting without difficulty.

As a basic example of saturation, for each standard natural number let denote the statement “There exists a nonstandard real such that ”. These statements are finitely satisfiable, hence by -saturation they are jointly satisfiable, thus there exists a nonstandard real which is *unbounded* in the sense that it is larger than every standard natural number (and hence also than every standard real number, by the Archimedean property of the reals). Similarly, there exist nonstandard real numbers which are non-zero but still *infinitesimal* in the sense that for every standard real .

On the other hand, one cannot apply the saturation property to the statements “There exists a nonstandard real such that and ”, since is not known to be an internal subset of the nonstandard universe and so cannot be used as a constant for the purposes of saturation. (Indeed, since this sequence of statements is finitely satisfiable but not jointly satisfiable, this is a proof that is *not* an internal subset of , and must instead be viewed only as an external subset.)

Now we develop analogues of the sequential-based theory of limits in nonstandard analysis. The following dictionary may be helpful to keep in mind when comparing the two:

| Nonstandard analysis | Sequential analysis |
| --- | --- |
| Standard real | A real number |
| Nonstandard reals | A sequence of reals |
| Embedding of standard real | A constant sequence of reals |
| Internal set of nonstandard reals | A sequence of subsets of reals |
| Embedding of standard set of reals | A constant sequence of subsets of reals |
| External set | A collection of sequences of reals |
| Internal function | A sequence of functions |
| Embedding of a standard function | A constant sequence of functions |
| External function | A map from sequences of vectors to sequences of reals |
| Equality of nonstandard reals | After passing to a subsequence, for all |
| | is bounded |
| | converges to zero (possibly after passing to subsequence) |
| | Bounded sequences |
| | Sequences converging to zero (possibly after passing to subsequence) |
| | Convergent sequences (possibly after passing to subsequence) |
| | Bolzano-Weierstrass theorem |
| Standard part of bounded real | Limit of bounded sequence (possibly after passing to subsequence) |

Note in particular that in the nonstandard analysis formalism there is no need to repeatedly pass to subsequences, as is often the case in sequential-based analysis.

A nonstandard real is said to be *bounded* if one has for some standard . In this case, we write , and let denote the set of all bounded reals. It is an external subring of that in turn contains as an external subring.

A nonstandard real is said to be *infinitesimal* if one has for all standard . In this case, we write , and let denote the set of all infinitesimal reals. This is another external subring (in fact, an ideal) of , and can be viewed as external vector spaces over .

The Bolzano-Weierstrass theorem is fundamental to orthodox real analysis. Its counterpart in nonstandard analysis is

Theorem 1 (Nonstandard version of Bolzano-Weierstrass) As external vector spaces over , we have the decomposition .

*Proof:* The only real which is simultaneously standard and infinitesimal is zero, so . It thus suffices to show that every bounded real can be written in the form for some standard . But the set is a Dedekind cut; setting to be the supremum of this cut, we have for all standard natural numbers , hence as desired.

If and for some standard real , we call the *standard part* of and denote it by : thus is the linear projection from to with kernel . It is an algebra homomorphism (this is the analogue of the usual limit laws in real analysis).
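As a small worked illustration of the homomorphism property, write $\mathrm{st}$ for the standard part (a notation assumed here), let $x, y$ be bounded nonstandard reals, let $x_0$ be a standard real, and let $\varepsilon$ be a non-zero infinitesimal. The last line recovers the derivative of $t \mapsto t^2$ at $x_0$ without taking a limit.

```latex
\begin{align*}
% the standard part respects sums and products of bounded nonstandard reals
\mathrm{st}(x + y) &= \mathrm{st}(x) + \mathrm{st}(y), \qquad
\mathrm{st}(x y) = \mathrm{st}(x)\, \mathrm{st}(y) \\
% a difference quotient with infinitesimal increment
\mathrm{st}\left( \frac{(x_0 + \varepsilon)^2 - x_0^2}{\varepsilon} \right)
  &= \mathrm{st}( 2 x_0 + \varepsilon )
   = 2 x_0 + \mathrm{st}(\varepsilon) = 2 x_0
\end{align*}
```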

In real analysis, we know that continuous functions on a compact set that are pointwise bounded are automatically uniformly bounded. There is a handy analogue of this fact in nonstandard analysis:

Lemma 2 (Pointwise bounded/infinitesimal internal functions are uniformly bounded/infinitesimal) Let be an internal function.

- (i) If for all , then there is a standard such that for all .
- (ii) If for all , then there is an infinitesimal such that for all .

*Proof:* If (i) were not the case, then the predicates “ and ” would be finitely satisfiable, hence jointly satisfiable by -saturation. But then there would exist such that for all , contradicting the hypothesis that .

For (ii), observe that the predicates “ is a nonstandard real such that for all ” are finitely satisfiable, hence jointly satisfiable by -saturation, giving the claim.

Because the rationals are dense in the reals, we see (from saturation) that every standard real number can be expressed as the standard part of a bounded rational, thus . This can in fact be viewed as a way to *construct* the reals; it is a minor variant of the standard construction of the reals as the space of Cauchy sequences of rationals, quotiented out by Cauchy equivalence.

Closely related to Lemma 2 is the overspill (or underspill) principle:

Lemma 3 Let be an internal predicate of a nonnegative nonstandard real number .

- (i) (Overspill) If is true for arbitrarily large standard , then it is also true for at least one unbounded .
- (ii) (Underspill) If is true for arbitrarily small standard , then it is also true for at least one infinitesimal .

*Proof:* To prove (i), observe that the predicates “ and ” for a standard natural number are finitely satisfiable, hence jointly satisfiable by -saturation, and the claim follows. The claim (ii) is proven similarly, using instead of .

Corollary 4 Let be an internal function. If for all standard , then one has for at least one unbounded .

*Proof:* Apply Lemma 3 to the predicate .

The overspill principle and its analogues correspond, roughly speaking, to the “diagonalisation” arguments that are common in sequential analysis, for instance in the proof of the Arzelà-Ascoli theorem.

** — 2. Some nonstandard functional analysis — **

In the discussion of the previous section, the real numbers could be replaced by the complex numbers or finite-dimensional vector spaces (with a standard natural number) with essentially no change in theory. However the situation becomes a bit more subtle when one works with infinite dimensional spaces, such as the functional spaces that are commonly used in PDE.

Let be a standard normed vector space with norm , then we can form the nonstandard function space with a nonstandard (or internal) norm . This is not quite a normed vector space when viewed externally, because the nonstandard norm takes values in the nonstandard nonnegative reals rather than the standard nonnegative reals . However, we can form the subspace of consisting of those vectors which are *strongly bounded* in the sense that . This is an external real subspace of that contains . It comes with a seminorm defined by

It is easy to see that this is a seminorm. The null space of this seminorm is the subspace of consisting of those vectors which are *strongly infinitesimal* (in ) in the sense that ; we say two elements of are *strongly equivalent* (in ) if their difference is strongly infinitesimal. In infinite dimensions, is no longer locally compact, and the Bolzano-Weierstrass theorem now only gives an inclusion:

In general we expect to be significantly larger than (this is the nonstandard analogue of the sequential analysis phenomenon that most bounded sequences in will fail to have convergent subsequences). For instance if is a standard Hilbert space with an orthonormal system , and is an unbounded natural number, one can check that lies in but is not in . The quotient space is a normed vector space that contains as an isometric subspace and is known as the *nonstandard hull* of , but we will not explicitly use this space much in these notes.
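The Hilbert space example above can be checked by a short computation. This is a sketch under assumed notation: $(e_n)$ for the orthonormal system, $N$ for the unbounded index, and $x$ for an arbitrary standard vector.

```latex
% Assumptions: (e_n)_{n \in \mathbb{N}} is the standard orthonormal system in H,
% N is an unbounded natural number, and x is an arbitrary standard element of H.
% Bessel's inequality gives \langle x, e_n \rangle \to 0 as n \to \infty, so by transfer
% \langle x, e_N \rangle is infinitesimal.
\begin{align*}
\| e_N \|_{{}^*H} &= 1
  && \text{(so $e_N$ is strongly bounded)},\\
\| e_N - x \|_{{}^*H}^2
  &= 1 - 2\,\mathrm{Re}\,\langle e_N, x \rangle + \|x\|_H^2\\
  &= 1 + \|x\|_H^2 + o(1) \;\ge\; 1 - o(1)
  && \text{(so $e_N - x$ is never strongly infinitesimal)}.
\end{align*}
```

Since no standard $x$ is strongly equivalent to $e_N$, the vector $e_N$ cannot be written as a standard element plus a strongly infinitesimal one.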

In functional analysis one often has an embedding of standard function spaces , with an inequality of the form for all and some constant . For instance, one has the Sobolev embeddings whenever , , and . One easily sees that such an embedding induces also embeddings and .

If the embedding is compact – so that bounded subsets in are precompact in – then we can partially recover the missing inclusion in the Bolzano-Weierstrass theorem. This follows from

Lemma 5 (Compactness) Let be a standard compact subset (in the strong topology) of a standard normed vector space . Then one has the inclusion

that is to say, every can be decomposed (uniquely) as with and .

This is the nonstandard analysis analogue of the assertion that compact subsets of a normed metric space are sequentially compact.

*Proof:* Uniqueness is clear (since non-zero standard elements of have non-zero standard, hence non-infinitesimal, nonstandard norm), so we turn to existence. If this failed for some , then for every , there exists a standard such that . Hence, by compactness of, one can find a standard natural number and standard such that for all , one has for some . By transfer (viewing as constants), this implies that for all , one has for some . Applying this with , we obtain a contradiction.

We also note an easy converse inclusion: if is a standard open subset of a standard normed vector space , then

Exercise 6 Suppose that the standard normed vector space is separable. Establish the converse implications, that a standard subset of is compact whenever , and a standard subset is open whenever . (The hypothesis of separability can be relaxed if one imposes stronger saturation properties on the nonstandard universe than -saturation.)

Exercise 7 Let be a standard function on a standard subset of a normed vector space , and let be the nonstandard counterpart.

- (i) Show that is bounded if and only if for all .
- (ii) Show that is continuous (in the strong topology) if and only if whenever , are such that is strongly equivalent to .
- (iii) Show that is uniformly continuous (in the strong topology) if and only if whenever are such that is strongly equivalent to .
- (iv) If is compact, show that . Conclude the well-known fact that a standard continuous function on a compact set is uniformly continuous and bounded.

Now we have

Theorem 8 (Compact embeddings) If is a compact embedding of standard normed vector spaces , then

If furthermore the closed unit ball of is compact (not just precompact) in , we can sharpen this to

This is the analogue of Proposition 3 of Notes 2.

*Proof:* Since , it suffices to show that every can be written as with and . But this is immediate from Lemma 5 (applied to the closure in of a closed unit ball in ).

From the above theorem and the Arzelà-Ascoli theorem, we see for instance that a nonstandard Lipschitz function from to with bounded nonstandard Lipschitz norm can be expressed as the sum of a standard Lipschitz function and a function which is infinitesimal in the uniform norm. Here is a more general version of this latter assertion:

Exercise 9 (Nonstandard version of Arzelà-Ascoli) Let be a standard open set, and let be a nonstandard function obeying the following axioms:

- (i) (Pointwise boundedness) For all , we have .
- (ii) (Pointwise equicontinuity) For all , we have whenever and .
Let denote the function

(this is well-defined by pointwise boundedness). Then is continuous, and its nonstandard representative is locally infinitesimally close to in the sense that

whenever . Conclude in particular that for every standard compact there exists an infinitesimal such that

for all .

We remark that the above discussion for normed vector spaces also extends without difficulty to Fréchet spaces that have at most countably many seminorms , with now consisting of those with for all , and now consisting of those with for all .

Now suppose we work with the dual space of a normed vector space . (Here unfortunately we have a clash of notation, as the asterisk will now be used both to denote nonstandard representative and dual; hopefully the mathematics will still be unambiguous.) A nonstandard element of (thus, a nonstandardly continuous linear functional ) is said to be *weak*-bounded* if for all , and *weak*-infinitesimal* if . The space of weak*-bounded elements of will be denoted , and the space of weak*-infinitesimal elements denoted . These are related to the strong counterparts of these spaces by the inclusions

we also have the inclusion

For instance, if is a standard Hilbert space with orthonormal system , and is an unbounded natural number, then lies in but not in , , or , while lies in (and hence also in ) but not in or .

(One could also develop similar notations in which one uses weak topologies instead of the weak* topology, but we will not need the weak topology in these notes.)

We have the following nonstandard version of the Banach-Alaoglu theorem:

Theorem 10 (Nonstandard version of Banach-Alaoglu) If is a normed vector space with dual , then we have the inclusion

*Proof:* Since , it suffices to show that every can be decomposed as for some and .

Let , thus there is a standard such that for all . In particular, if we define for , then depends linearly on and for all ; thus is an element of . Setting , we see from construction that for all , giving the claim.
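The construction in this proof can be laid out step by step. The display below is a sketch under assumed notation: $u$ is the strongly bounded nonstandard functional, $C$ the standard constant from the proof, and $u_0$ a name introduced here for the standard functional being constructed.

```latex
% Assumptions: u is a nonstandard functional with |u(x)| \le C \|x\| for a standard C
% (as in the proof above); u_0 is a name introduced here for its weak-* standard part.
\begin{align*}
u_0(x) &:= \mathrm{st}\bigl(u(x)\bigr) && \text{for standard } x \in X,\\
u_0(ax + by) &= a\,u_0(x) + b\,u_0(y) && \text{($\mathrm{st}$ is linear over standard scalars $a, b$)},\\
|u_0(x)| &\le C \|x\|_X && \text{(so $u_0 \in X^*$)},\\
(u - u_0)(x) &= u(x) - \mathrm{st}\bigl(u(x)\bigr) = o(1) && \text{(so $u - u_0$ is weak*-infinitesimal)}.
\end{align*}
```

Note that no countable dense subset of $X$ is used anywhere, which is why separability is not needed.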

Theorem 10 can be compared with Proposition 2 of Notes 2, except in this nonstandard analysis setting, no separability is required on the predual space .

If is a standard bounded linear operator between normed vector spaces, then one has a nonstandard linear operator . It is easy to see that this operator maps to and to . The adjoint operator similarly maps to and to , but also takes to and to .

A (nonstandard) linear operator will be said to be an *approximate identity* on if maps to . Here are two basic and useful examples of such approximate identities:

Exercise 11 (Frequency and spatial localisation as approximate identities)

- (i) For standard and unbounded , show that the nonstandard Littlewood-Paley projection is an approximate identity on , and more generally on the Sobolev spaces for any standard natural number .
- (ii) For standard and unbounded , and a standard test function that equals near the origin, show that the nonstandard spatial truncation operator is an approximate identity on , and more generally on the Sobolev spaces for any standard natural number . What happens at ?

** — 3. Nonstandard analysis and distributions — **

Let be a standard open set. The dual of the standard space of test functions on is the standard space of distributions. Like any other standard space, it has a nonstandard counterpart , whose elements are the nonstandard distributions.

A nonstandard distribution will be said to be *weakly bounded* if, for any standard compact set , there is a standard natural number and standard such that one has the bound

for all standard . (It would be slightly more accurate to use the terminology “weak-* bounded” instead of “weakly bounded”, but we will omit the asterisk here to make the notation a little less clunky. Similarly for the related concepts below.) Thus for instance any standard distribution is weakly bounded, and if is any normed vector space structure on that is continuous in the test function topology, and , then will be weakly bounded. We say that a nonstandard distribution is *weakly infinitesimal* if one has for all standard . For instance, if is any normed vector space structure on and , then will be weakly infinitesimal. We say that two nonstandard distributions are *weakly equivalent*, and write , if they differ by a weakly infinitesimal distribution, thus

for all standard test functions .

If is a weakly bounded distribution, we can define the *weakly standard part* to be the distribution defined by the formula

This is the unique standard distribution that is weakly equivalent to . As such, it must be compatible with the other decompositions of the preceding section. For instance, if is a normed vector space structure on and , then the decomposition must agree with the one in Theorem 10, thus and . In particular, if embeds compactly into a normed vector space , we also have , thus weak equivalence in implies strong equivalence in . By the definition of the dual norm we also conclude the Fatou-type inequality

Informally, represents the portion of that one can “observe” at standard physical scales and standard frequency scales, ignoring all components of that are at unbounded or infinitesimal physical or frequency scales. The following examples may help illustrate this point:

Example 12 Let be an unbounded natural number, and on let be the nonstandard distribution . Then is weakly infinitesimal, so . This is in contrast to the pointwise standard part , which is a rather wild (and almost certainly non-measurable) function from to . The nonstandard distribution is not even pointwise bounded in general, so does not have a pointwise standard part, but is still weakly infinitesimal and so . Similarly if one replaces by for some standard bump function .
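To see why rapid oscillation forces weak infinitesimality, here is a worked computation. The formulas of the example are not displayed in the text, so the profiles $\sin(Nx)$ and $N^{1/2}\sin(Nx)$ below are illustrative assumptions, as is the name $\phi$ for the test function.

```latex
% Assumptions: N is an unbounded natural number, \phi is a standard test function on \mathbb{R},
% and the oscillatory profiles \sin(Nx), N^{1/2}\sin(Nx) are illustrative choices.
\begin{align*}
\int_{\mathbb{R}} \sin(Nx)\,\phi(x)\,dx
  &= \frac{1}{N} \int_{\mathbb{R}} \cos(Nx)\,\phi'(x)\,dx
  && \text{(integration by parts; $\phi$ has compact support)},\\
\left| \int_{\mathbb{R}} \sin(Nx)\,\phi(x)\,dx \right|
  &\le \frac{\|\phi'\|_{L^1(\mathbb{R})}}{N} = o(1),\\
\left| \int_{\mathbb{R}} N^{1/2}\sin(Nx)\,\phi(x)\,dx \right|
  &\le \frac{\|\phi'\|_{L^1(\mathbb{R})}}{N^{1/2}} = o(1).
\end{align*}
```

The second profile takes unbounded values at some points, yet it still pairs infinitesimally with every standard test function.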

The following table gives the analogy between these nonstandard analysis concepts, and the more familiar ones from sequential weak compactness theory:

| Nonstandard analysis | Sequential analysis |
| --- | --- |
| Standard distribution | Distribution |
| Nonstandard distribution | Sequence of distributions |
| Embedding of standard distribution | Constant sequence |
| Strongly bounded | is bounded in norm |
| Strongly infinitesimal | converges strongly to zero in |
| Strongly equivalent to a standard element | converges in norm (possibly after passing to subsequence) |
| Weak*-bounded | is weak-* bounded in |
| Weak*-infinitesimal | converges weak-* to zero in |
| is weakly infinitesimal | converges to zero in distribution |
| Weakly standard part | Weak limit of (possibly after passing to subsequence) |

Exercise 13 Let , establish the Pythagorean identity

This identity can be used as a starting point for the theory of concentration compactness, as discussed in these notes. What happens in other spaces?

Exercise 14 Show that every standard distribution is the weakly standard part of some weakly bounded nonstandard test function . (Hint: When is , one can convolve with a nonstandard approximation to the identity and also apply a nonstandard spatial cutoff. When is a proper subset of one also has to smoothly cut off outside an infinitesimal neighbourhood of the boundary of if one wants to make the convolution well defined.) This result can be viewed as analogous to the previous observation that every standard real is the standard part of a bounded rational. This also provides an alternate way to construct distributions, as weakly bounded nonstandard test functions up to weak equivalence.

Exercise 15 (Arzelà-Ascoli, nonstandard distributional form) Let be a nonstandard continuous function obeying the pointwise boundedness and pointwise equicontinuity axioms from Exercise 9. Show that is also a weakly bounded distribution and that the weakly standard part of agrees with the pointwise standard part: . In particular, if is also weakly infinitesimal, conclude that for every standard compact set .

** — 4. Leray-Hopf solutions to Navier-Stokes — **

For that is divergence-free, recall that a *Leray-Hopf solution* to the Navier-Stokes equations on is a distribution vanishing outside of that has the additional regularity

for almost all , and solves the equation

in the sense of distributions. We now give a nonstandard interpretation of this concept:

Proposition 16 (Nonstandard interpretation of Leray-Hopf solution) Let be divergence-free. Let be a nonstandard smooth function obeying the following properties:

- (i) (Energy inequality) For all nonstandard , one has the nonstandard energy inequality
Here of course we use the nonstandard version of the Lebesgue integral, and extend to in the usual fashion (i.e., we identify with ).

- (ii) (Initial data) We have .
- (iii) (Weak time regularity) For any standard , .
- (iv) (Navier-Stokes) On , one has the nonstandard (classical) forced Navier-Stokes equation
where the forcing term , after extending by zero to , is weakly infinitesimal.

Then after extending by zero to , is weakly bounded and is a Leray-Hopf solution to Navier-Stokes. Conversely, every such Leray-Hopf solution arises in this fashion.

Thus, roughly speaking, Leray-Hopf solutions arise from nonstandard strong solutions to Navier-Stokes in which one permits weakly infinitesimal changes to the initial data and forcing term, which are arbitrary save for the constraint that these changes do not add more than an infinitesimal amount of energy to the system, and which also obey a technical but weak condition on the time derivative. Thus, for instance, one can insert a nonstandard frequency mollification at an unbounded frequency cutoff , or a spatial truncation at an unbounded spatial scale , without difficulty (so long as one checks that such modifications do not introduce more than an infinitesimal amount of energy into the system), which can be used to recover the standard construction of Leray-Hopf solutions.

*Proof:* First suppose that obeys all the axioms (i)-(iv). We now repeat the arguments used to prove Theorem 14 of Notes 2, but translated to the language of nonstandard analysis. (All the key inputs are still basically the same; the changes are almost all entirely in the surrounding formalism.)

From (i) we see that

and

which implies that

and

Also, for every standard we have

and hence by Sobolev embedding

for sufficiently close to . From Hölder this gives

and then by the boundedness of the Leray projector

We have

in the sense of nonstandard distributions. Taking weakly standard part, we will obtain a weak solution to the Navier-Stokes equations as long as

Both sides lie in , but the equality requires some care. Applying a standard test function , it suffices to show that

for all such , which we can assume to be supported in . One can check that lies in . If we can show that for every standard compact set that

then by Corollary 4 the same claim is true for some unbounded ; however, the nonstandard norm of outside of an unbounded ball is infinitesimal, and we then conclude (6).

Fix . By Hölder’s inequality, it now suffices to show that

that is to say is strongly equivalent to in . By another application of Hölder, it suffices to show that is strongly equivalent to in .

For unbounded , is strongly equivalent to in ; since is bounded in , we also see from Bernstein’s theorem and the triangle inequality that is strongly equivalent to in . Thus it suffices to show that for at least one unbounded , that is strongly equivalent to in . By overspill, it suffices to do this for arbitrarily large standard . But by Bernstein’s inequality, the difference is bounded in , has space derivative bounded in , and time derivative bounded in on by hypothesis, hence is equicontinuous from the fundamental theorem of calculus and Cauchy-Schwarz; by Exercise 15 we conclude that is strongly infinitesimal in , and hence also in as required.

To show the energy inequality for , we again repeat the arguments from Notes 2. If is standard and is a standard non-negative test function supported on of total mass one for some small standard , then from averaging (4) we have

Taking weakly standard parts using (1) we conclude that

The energy inequality (2) then follows from the Lebesgue differentiation theorem (which, incidentally, can also be translated into a nonstandard analysis framework, as discussed to some extent in this previous post).

Now we establish the converse direction. Let be a Leray-Hopf solution, then from Sobolev and Hölder we have

for some sufficiently close to , and hence from the weak Navier-Stokes equation and Bernstein’s inequality we have

for every standard (but without claiming the bound to be uniform in ).

Now let be infinitesimal and be unbounded, let be a standard non-negative test function of total mass supported on , and let be the nonstandard function

arising from performing a frequency truncation in space and a smooth averaging in time. This is a nonstandard smooth function. From Minkowski’s inequality, (2), and the non-expansive nature of on we see that we have the energy inequality (4) (in fact we do not even need the error here). From Exercise 19 of Notes 2, we know that converges strongly in to as , and hence is strongly equivalent to in for all , and hence is also. From (7) and Minkowski’s inequality we also conclude property (iii) of the proposition.

The only remaining thing to verify is property (iv), which we will do assuming that is sufficiently small depending on . From (3) we see that on , we have (in the classical sense) that

and so (5) holds with forcing term

To show that is weakly infinitesimal, it suffices as before to show that

is strongly infinitesimal in for every standard and compact . But from (7) we know that for all , , and , and hence by overspill one has

for the same range of if is sufficiently small depending on . Thus is strongly equivalent in (and hence in ) to the commutator type expression

But from Bernstein’s inequality and the boundedness of , we know that is strongly equivalent to in , so by Hölder it suffices to show that is strongly infinitesimal in . But this follows from the fact that is strongly equivalent to in , and that vanishes.

Remark 17 The above proof shows that we can in fact demand stronger regularity on the time derivative than is required in Proposition 16(iii) if desired; for instance, one can place in for close enough to .

Exercise 18 State and prove a more traditional analogue of this proposition that asserts (roughly speaking) that any weak limit of a sequence of smooth solutions to Navier-Stokes with changes in initial data and forcing term that converge weakly to zero, which asymptotically obeys the energy inequality, and which has some weak uniform control on the time derivative, will produce a Leray-Hopf solution, and conversely that every Leray-Hopf solution arises in this fashion.


Exercise 19 Translate the proof of weak-strong uniqueness from Proposition 20 of Notes 2 to nonstandard analysis, by first using Proposition 16 to interpret the weak solution as the weakly standard part of a strong nonstandard approximate solution. (One will need the improved control on mentioned in Remark 17.)

whenever and goes to infinity as . Informally, this says that the Liouville function has small mean for almost all short intervals . The remarkable thing about this theorem is that there is no lower bound on how goes to infinity with ; one can take for instance . This lack of lower bound was crucial when I applied this result (or more precisely, a generalisation of this result to arbitrary non-pretentious bounded multiplicative functions) a few years ago to solve the Erdős discrepancy problem, as well as a logarithmically averaged two-point Chowla conjecture; for instance, it implies that

The local Fourier uniformity conjecture asserts the stronger asymptotic

under the same hypotheses on and . As I worked out in a previous paper, this conjecture would imply a logarithmically averaged three-point Chowla conjecture, implying for instance that

This particular bound also follows from some slightly different arguments of Joni Teräväinen and myself, but the implication would also work for other non-pretentious bounded multiplicative functions, whereas the arguments of Joni and myself rely more heavily on the specific properties of the Liouville function (in particular that for all primes ).

There is also a higher order version of the local Fourier uniformity conjecture in which the linear phase is replaced with a polynomial phase such as , or more generally a nilsequence ; as shown in my previous paper, this conjecture implies (and is in fact equivalent to, after logarithmic averaging) a logarithmically averaged version of the full Chowla conjecture (not just the two-point or three-point versions), as well as a logarithmically averaged version of the Sarnak conjecture.

The main result of the current paper is to obtain some cases of the local Fourier uniformity conjecture:

Theorem 1 The asymptotic (2) is true when for a fixed .

Previously this was known for by the work of Zhan (who in fact proved the stronger pointwise assertion for in this case). In a previous paper with Kaisa and Maksym, we also proved a weak version

of (2) for any growing arbitrarily slowly with ; this is stronger than (1) (and is in fact proven by a variant of the method) but significantly weaker than (2), because in the latter the worst-case is permitted to depend on the parameter, whereas in (3) must remain independent of .

Unfortunately, the restriction is not strong enough to give applications to Chowla-type conjectures (one would need something more like for this). However, it can still be used to control some sums that had not previously been manageable. For instance, a quick application of the circle method lets one use the above theorem to derive the asymptotic

whenever for a fixed , where is the von Mangoldt function. Amusingly, the seemingly simpler question of establishing the expected asymptotic for

is only known in the range (from the work of Zaccagnini). Thus we have a rare example of a number theory sum that becomes *easier* to control when one inserts a Liouville function!

We now give an informal description of the strategy of proof of the theorem (though for numerous technical reasons, the actual proof deviates in some respects from the description given here). If (2) failed, then for many values of we would have the lower bound

for some frequency . We informally describe this correlation between and by writing

for (informally, one should view this as asserting that “behaves like” a constant multiple of ). For sake of discussion, suppose we have this relationship for *all* , not just *many*.

As mentioned before, the main difficulty here is to understand how varies with . As it turns out, the multiplicativity properties of the Liouville function place a significant constraint on this dependence. Indeed, let be a fairly small prime (e.g. of size for some ); we can then use the identity for the Liouville function to conclude (at least heuristically) from (4) that

for . (In practice, we will have this sort of claim for *many* primes rather than *all* primes , after using tools such as the Turán-Kubilius inequality, but we ignore this distinction for this informal argument.)

Now let and be primes comparable to some fixed range such that

and

on essentially the same range of (two nearby intervals of length ). This suggests that the frequencies and should be close to each other modulo , in particular one should expect the relationship

Comparing this with (5) one is led to the expectation that should depend inversely on in some sense (for instance one can check that

would solve (6) if ; by Taylor expansion, this would correspond to a global approximation of the form ). One now has a problem of an additive combinatorial flavour (or of a “local to global” flavour), namely to leverage the relation (6) to obtain global control on that resembles (7).
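The consistency check described in words above can be written out as follows. Since the displayed formulas are not shown in the text, the notation here is assumed: $e(\theta) := e^{2\pi i \theta}$, $\alpha_x$ is the frequency attached to the interval $[x, x+H]$, $p, q$ are primes, and $T$ is a real parameter.

```latex
% Assumed notation: e(\theta) := e^{2\pi i \theta}; \alpha_x is the frequency attached to [x, x+H];
% p, q are primes; T is a real parameter. The ansatz \alpha_x = T/x is the one suggested in the text.
\begin{align*}
\lambda(n) \approx e(\alpha_x n) \text{ on } [x, x+H]
  &\;\Longrightarrow\;
  \lambda(m) = -\lambda(pm) \approx -e(p\,\alpha_x m) \text{ on } [x/p, (x+H)/p],\\
\frac{x}{p} = \frac{y}{q}
  &\;\Longrightarrow\;
  p\,\alpha_x = \frac{pT}{x} = \frac{qT}{y} = q\,\alpha_y
  \quad \text{when } \alpha_x = \frac{T}{x},\\
e(T \log n)
  &= e\bigl(T \log x + T \log(1 + \tfrac{n-x}{x})\bigr)
  \approx e(T\log x - T)\; e\bigl(\tfrac{T}{x}\, n\bigr)
  \quad \text{for } n \in [x, x+H].
\end{align*}
```

The last line is the first-order Taylor expansion of the logarithm, so it is only a heuristic in the regime where $T H^2 / x^2$ is small. It shows that a global character of the form $n \mapsto e(T \log n) = n^{2\pi i T}$ looks locally like a linear phase with frequency $T/x$.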

A key obstacle in solving (6) efficiently is the fact that one only knows that and are close modulo , rather than close on the real line. One can start resolving this problem by the Chinese remainder theorem, using the fact that we have the freedom to shift (say) by an arbitrary integer. After doing so, one can arrange matters so that one in fact has the relationship

whenever and obey (5). (This may force to become extremely large, on the order of , but this will not concern us.)

Now suppose that we have and primes such that

For every prime , we can find an such that is within of both and . Applying (8) twice we obtain

and

and thus by the triangle inequality we have

for all ; hence by the Chinese remainder theorem

In practice, in the regime that we are considering, the modulus is so huge we can effectively ignore it (in the spirit of the Lefschetz principle); so let us pretend that we in fact have

whenever and obey (9).

Now let be an integer to be chosen later, and suppose we have primes such that the difference

is small but non-zero. If is chosen so that

(where one is somewhat loose about what means) then one can then find real numbers such that

for , with the convention that . We then have

which telescopes to

and thus

and hence

In particular, for each , we expect to be able to write

for some . This quantity can vary with ; but from (10) and a short calculation we see that

whenever obey (9) for some .

Now imagine a “graph” in which the vertices are elements of , and two elements are joined by an edge if (9) holds for some . Because of exponential sum estimates on , this graph turns out to essentially be an “expander” in the sense that any two vertices can be connected (in multiple ways) by fairly short paths in this graph (if one allows one to modify one of or by ). As a consequence, we can assume that this quantity is essentially constant in (cf. the application of the ergodic theorem in this previous blog post), thus we now have

for most and some . By Taylor expansion, this implies that

on for most , thus

But this can be shown to contradict the Matomäki-Radziwiłł theorem (because the multiplicative function is known to be non-pretentious).

into Euclidean space, which we can write as with the notation

Here we give the right-invariant Carnot-Carathéodory metric coming from the right-invariant vector fields

but not from the commutator vector field

This gives the geometry of a Carnot group. As observed by Semmes, it follows from the Carnot group differentiation theory of Pansu that there is no bilipschitz map from to any Euclidean space or even to , since such a map must be differentiable almost everywhere in the sense of Carnot groups, which in particular shows that the derivative map annihilates almost everywhere, which is incompatible with being bilipschitz.

On the other hand, if one *snowflakes* the Heisenberg group by replacing the metric with for some , then it follows from the general theory of Assouad on embedding snowflaked metrics of doubling spaces that may be embedded in a bilipschitz fashion into , or even to for some depending on .

Of course, the distortion of this bilipschitz embedding must degenerate in the limit . From the work of Austin-Naor-Tessera and Naor-Neiman it follows that may be embedded into with a distortion of , but no better. The Naor-Neiman paper also embeds into a finite-dimensional space with independent of , but at the cost of worsening the distortion to . They then posed the question of whether this worsening of the distortion is necessary.

The main result of this paper answers this question in the negative:

Theorem 1 There exists an absolute constant such that may be embedded into in a bilipschitz fashion with distortion for any .

To motivate the proof of this theorem, let us first present a bilipschitz map from the snowflaked line (with being the usual metric on ) into complex Hilbert space . The map is given explicitly as a Weierstrass type function

where for each , is the function

and are an orthonormal basis for . The subtracting of the constant is purely in order to make the sum convergent as . If are such that for some integer , one can easily check the bounds

with the lower bound

at which point one finds that

as desired.
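For concreteness, the bounds for the snowflaked line can be sketched as follows. The displayed formulas are not shown in the text, so the normalisation of $\Phi$, the exponent $1-\varepsilon$, and the dyadic range for $|x-y|$ are assumptions chosen to be consistent with the surrounding discussion.

```latex
% Assumed normalisation (the displayed formulas are not shown in the text):
% \Phi(x) := \sum_{k \in \mathbb{Z}} 2^{(1-\varepsilon)k} \bigl(e^{2\pi i x/2^k} - 1\bigr) e_k,
% with (e_k) orthonormal, 0 < \varepsilon \le 1/2, and 2^{k_0 - 2} \le |x-y| \le 2^{k_0 - 1}.
\begin{align*}
|\Phi(x) - \Phi(y)|^2
  &= \sum_{k \in \mathbb{Z}} 2^{2(1-\varepsilon)k}\,\bigl| e^{2\pi i (x-y)/2^k} - 1 \bigr|^2
  && \text{(orthogonality)},\\
\sum_{k < k_0} 2^{2(1-\varepsilon)k} \cdot 4
  &\lesssim 2^{2(1-\varepsilon)k_0}
  && \text{(trivial bound $|e^{i\theta} - 1| \le 2$)},\\
\sum_{k \ge k_0} 2^{2(1-\varepsilon)k} \cdot \frac{4\pi^2 |x-y|^2}{2^{2k}}
  &\lesssim \frac{2^{2(1-\varepsilon)k_0}}{\varepsilon}
  && \text{(bound $|e^{i\theta} - 1| \le |\theta|$)},\\
2^{2(1-\varepsilon)k_0}\,\bigl| e^{2\pi i (x-y)/2^{k_0}} - 1 \bigr|^2
  &\ge 2 \cdot 2^{2(1-\varepsilon)k_0}
  && \text{(the $k = k_0$ term, since $\tfrac14 \le \tfrac{|x-y|}{2^{k_0}} \le \tfrac12$)}.
\end{align*}
% Consequently |x-y|^{1-\varepsilon} \lesssim |\Phi(x) - \Phi(y)| \lesssim \varepsilon^{-1/2} |x-y|^{1-\varepsilon}.
```

The factor $1/\varepsilon$ in the third line comes from summing the geometric series $\sum_{k \ge k_0} 2^{-2\varepsilon k}$. Because it appears inside the squared norm, the distortion is of order $\varepsilon^{-1/2}$ rather than $\varepsilon^{-1}$.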

The key here was that each function oscillated at a different spatial scale , and the functions were all orthogonal to each other (so that the upper bound involved a factor of rather than ). One can replicate this example for the Heisenberg group without much difficulty. Indeed, if we let be the discrete Heisenberg group, then the nilmanifold is a three-dimensional smooth compact manifold; thus, by the Whitney embedding theorem, it smoothly embeds into . This gives a smooth immersion which is -automorphic in the sense that for all and . If one then defines to be the function

where is the scaling map

then one can repeat the previous arguments to obtain the required bilipschitz bounds

for the function

To adapt this construction to bounded dimension, the main obstruction was the requirement that the took values in orthogonal subspaces. But if one works things out carefully, it is enough to require the weaker orthogonality requirement

for all , where is the bilinear form

One can then try to construct the for bounded dimension by an iterative argument. After some standard reductions, the problem becomes this (roughly speaking): given a smooth, slowly varying function whose derivatives obey certain quantitative upper and lower bounds, construct a smooth oscillating function , whose derivatives also obey certain quantitative upper and lower bounds, which obey the equation


We view this as an underdetermined system of differential equations for (two equations in unknowns; after some reductions, our can be taken to be the explicit value ). The trivial solution to this equation will be inadmissible for our purposes due to the lower bounds we will require on (in order to obtain the quantitative immersion property mentioned previously, as well as for a stronger “freeness” property that is needed to close the iteration). Because this construction will need to be iterated, it will be essential that the regularity control on is the same as that on ; one cannot afford to “lose derivatives” when passing from to .

This problem has some formal similarities with the isometric embedding problem (discussed for instance in this previous post), which can be viewed as the problem of solving an equation of the form , where is a Riemannian manifold and is the bilinear form

The isometric embedding problem also has the key obstacle that naive attempts to solve the equation iteratively can lead to an undesirable “loss of derivatives” that prevents one from iterating indefinitely. This obstacle was famously resolved by the Nash-Moser iteration scheme in which one alternates between perturbatively adjusting an approximate solution to improve the residual error term, and mollifying the resulting perturbation to counteract the loss of derivatives. The current equation (1) differs in some key respects from the isometric embedding equation , in particular being linear in the unknown field rather than quadratic; nevertheless the key obstacle is the same, namely that naive attempts to solve either equation lose derivatives. Our approach to solving (1) was inspired by the Nash-Moser scheme; in retrospect, I also found similarities with Uchiyama’s constructive proof of the Fefferman-Stein decomposition theorem, discussed in this previous post (and in this recent one).

To motivate this iteration, we first express using the product rule in a form that does not place derivatives directly on the unknown :

This reveals that one can construct solutions to (1) by solving the system of equations

for . Because this system is zeroth order in , this can easily be done by linear algebra (even in the presence of a forcing term ) if one imposes a “freeness” condition (analogous to the notion of a free embedding in the isometric embedding problem) that are linearly independent at each point , which (together with some other technical conditions of a similar nature) one then adds to the list of upper and lower bounds required on (with a related bound then imposed on , in order to close the iteration). However, as mentioned previously, there is a “loss of derivatives” problem with this construction: due to the presence of the differential operators in (3), a solution constructed by this method can only be expected to have two degrees less regularity than at best, which makes this construction unsuitable for iteration.
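To make the role of the product rule concrete, here is a short worked identity. It is only a sketch under an assumption: the displayed formulas are not reproduced above, so we suppose that each component of the bilinear form has the shape "derivative of the unknown, dotted with the same derivative of the given function". The symbols W, \phi and \psi below are our own labels for a first-order differential operator, the unknown, and the given function respectively.

```latex
% Assumption: W is a first-order differential operator obeying the Leibniz rule,
% and \phi, \psi are smooth vector-valued functions.
\begin{align*}
W(\phi \cdot W\psi) &= W\phi \cdot W\psi + \phi \cdot WW\psi, \\
\text{hence}\quad W\phi \cdot W\psi &= W(\phi \cdot W\psi) - \phi \cdot WW\psi .
\end{align*}
% If \phi \cdot W\psi = 0 and \phi \cdot WW\psi = 0 pointwise, the right-hand side
% vanishes; both conditions involve no derivatives of \phi.
```

Under this assumption, the two pointwise orthogonality conditions are zeroth order in the unknown but involve two derivatives of the given function, which is consistent with the loss of two degrees of regularity described above.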

To get around this obstacle (which also prominently appears when solving (linearisations of) the isometric embedding equation ), we instead first construct a smooth, low-frequency solution to a low-frequency equation

where is a mollification of (of Littlewood-Paley type) applied at a small spatial scale for some , and then gradually relax the frequency cutoff to deform this low frequency solution to a solution of the actual equation (1).

We will construct the low-frequency solution rather explicitly, using the Whitney embedding theorem to construct an initial oscillating map into a very low dimensional space , composing it with a Veronese type embedding into a slightly larger dimensional space to obtain a required “freeness” property, and then composing further with a slowly varying isometry depending on and constructed by a quantitative topological lemma (relying ultimately on the vanishing of the first few homotopy groups of high-dimensional spheres), in order to obtain the required orthogonality (4). (This sort of “quantitative null-homotopy” was first proposed by Gromov, with some recent progress on optimal bounds by Chambers-Manin-Weinberger and by Chambers-Dotterer-Manin-Weinberger, but we will not need these more advanced results here, as one can rely on the classical qualitative vanishing for together with a compactness argument to obtain (ineffective) quantitative bounds, which suffice for this application).

To perform the deformation of into , we must solve what is essentially the linearised equation

of (1) when , (viewed as low frequency functions) are both being deformed at some rates (which should be viewed as high frequency functions). To avoid losing derivatives, the magnitude of the deformation in should not be significantly greater than the magnitude of the deformation in , when measured in the same function space norms.

As before, if one directly solves the difference equation (5) using a naive application of (2) with treated as a forcing term, one will lose at least one derivative of regularity when passing from to . However, observe that (2) (and the symmetry ) can be used to obtain the identity

and then one can solve (5) by solving the system of equations

for . The key point here is that this system is zeroth order in both and , so one can solve this system without losing any derivatives when passing from to ; compare this situation with that of the superficially similar system

that one would obtain from naively linearising (3) without exploiting the symmetry of . There is still however one residual “loss of derivatives” problem arising from the presence of a differential operator on the term, which prevents one from directly evolving this iteration scheme in time without losing regularity in . It is here that we borrow the final key idea of the Nash-Moser scheme, which is to replace by a mollified version of itself (where the projection depends on the time parameter). This creates an error term in (5), but it turns out that this error term is quite small and smooth (being a “high-high paraproduct” of and , it ends up being far more regular than either or , even with the presence of the derivatives) and can be iterated away provided that the initial frequency cutoff is large and the function has a fairly high (but finite) amount of regularity (we will eventually use the Hölder space on the Heisenberg group to measure this).

modulo constants, for some , where are the Riesz transforms. A technical note here: a function in BMO is defined only up to constants (as well as up to the usual almost everywhere equivalence); related to this, if is an function, then the Riesz transform is well defined as an element of , but is also only defined up to constants and almost everywhere equivalence.

The original proof of Fefferman and Stein was indirect (relying for instance on the Hahn-Banach theorem). A constructive proof was later given by Uchiyama, and was in fact the topic of the second post on this blog. A notable feature of Uchiyama’s argument is that the construction is quite nonlinear; the vector-valued function is defined to take values on a sphere, and the iterative construction to build these functions from involves repeatedly projecting a potential approximant to this function to the sphere (also, the high-frequency components of this approximant are constructed in a manner that depends nonlinearly on the low-frequency components, which is a type of technique that has become increasingly common in analysis and PDE in recent years).

It is natural to ask whether the Fefferman-Stein decomposition (1) can be made linear in , in the sense that each of the depend linearly on . Strictly speaking this is easily accomplished using the axiom of choice: take a Hamel basis of , choose a decomposition (1) for each element of this basis, and then extend linearly to all finite linear combinations of these basis functions, which then cover by definition of Hamel basis. But these linear operations have no reason to be continuous as a map from to . So the correct question is whether the decomposition can be made *continuously linear* (or equivalently, boundedly linear) in , that is to say whether there exist continuous linear transformations such that

modulo constants for all . Note from the open mapping theorem that one can choose the functions to depend in a bounded fashion on (thus for some constant ); however, the open mapping theorem does not guarantee linearity. Using a result of Bartle and Graves one can also make the depend continuously on , but again the dependence is not guaranteed to be linear.

It is generally accepted folklore that continuous linear dependence is known to be impossible, but I had difficulty recently tracking down an explicit proof of this assertion in the literature (if anyone knows of a reference, I would be glad to know of it). The closest I found was a proof of a similar statement in this paper of Bourgain and Brezis, which I was able to adapt to establish the current claim. The basic idea is to average over the symmetries of the decomposition, which in the case of (1) are translation invariance, rotation invariance, and dilation invariance. This effectively makes the operators invariant under all these symmetries, which forces them to themselves be linear combinations of the identity and Riesz transform operators; however, no such non-trivial linear combination maps to , and the claim follows. Formal details of this argument (which we phrase in a dual form in order to avoid some technicalities) appear below the fold.

** — 1. Formal argument — **

Suppose for contradiction that we have bounded linear maps such that (2) holds for all . We convert this to an assertion about bilinear forms. Let be the space of Schwartz functions, and let be the codimension one subspace of Schwartz functions of integral zero. We define the bilinear forms

and

by the formulae

and

From (2) and integration by parts we have the identity

for all with vanishing near the origin (so that is also Schwartz), where is the vector Riesz transform. Meanwhile, from the boundedness of and Hölder’s inequality we have the bounds

for any and , where is the covector Riesz transform, thus . (Note that is indeed in due to the hypothesis that has integral zero.)

We will show that there are no bilinear forms obeying the identity (3) (for the indicated range of ) and the bounds (4), (5) (for the indicated range of ), which will give the claim.

Now we exploit symmetries of the hypotheses (3), (4), (5). For any , let be the translation operator

This operator also acts on vector-valued functions by acting on each component separately. The translation operators are uniformly bounded on and , and commute with Riesz transforms and preserve Lebesgue measure . As a consequence, if the forms obey the hypotheses (3), (4), (5), then so do the translated forms

and

with implied constants in (4), (5) uniform in .

On the other hand, , being abelian, is an amenable group. Thus, if , obey the hypotheses (3), (4), (5), then so do the means

and

where is an invariant mean of . (Note from (4), (5) that the quantities , depend continuously on .) By construction, these new bilinear forms are translation-invariant in the sense that

for all , , and . Thus, to show the non-existence of bilinear forms obeying (3), (4), (5), we may assume without loss of generality that the forms also obey the translation invariance (6).
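The averaging step can be illustrated in a finite toy model. The sketch below is an assumption-laden analogue, not the construction in the text: it replaces the translation group by the cyclic group Z/n (where the invariant mean is just the uniform average), represents a bilinear form by an n-by-n matrix, and checks that averaging over all simultaneous shifts of both arguments produces a translation-invariant (circulant) form. The function names are hypothetical.

```python
import numpy as np


def translate_average(B):
    """Average the bilinear form (f, g) -> f^T B g over all cyclic shifts.

    Shifting both arguments by h corresponds to rolling both axes of the
    matrix B by the same amount, so the average over the group Z/n is the
    mean of these rolled matrices.
    """
    n = B.shape[0]
    avg = np.zeros_like(B, dtype=float)
    for h in range(n):
        avg += np.roll(np.roll(B, h, axis=0), h, axis=1)
    return avg / n


def is_translation_invariant(B, tol=1e-12):
    """Check B[i+1, j+1] == B[i, j] (indices mod n), i.e. B is circulant."""
    shifted = np.roll(np.roll(B, 1, axis=0), 1, axis=1)
    return np.allclose(B, shifted, atol=tol)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    B = rng.standard_normal((6, 6))   # a generic, non-invariant bilinear form
    A = translate_average(B)
    print(is_translation_invariant(B), is_translation_invariant(A))
```

In this toy setting the averaged matrix depends only on the difference of its two indices, which is the discrete counterpart of the kernel being "really just a function of" the difference of its two arguments.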

We can argue similarly to assume rotational invariance, provided we apply the rotation correctly to the vector field . More precisely, if is any orthogonal matrix (a rotation or reflection), and obey (3), (4), (5), (6), then one can check that the rotated forms

also obey these hypotheses (with implied constants uniform in ), a key point being the identity

which follows from the chain rule. Being compact, is also amenable (indeed one can simply integrate against the probability Haar measure on ), and so by arguing as before we may assume without loss of generality that the bilinear forms obey the rotational invariance

Finally, we impose dilation invariance. For any in the positive real line , we define the dilation operator by

The operator preserves , while preserves . As a consequence, if obey (3), (4), (5), (6), (7), then so do the dilated forms

with constants uniform in . As is also amenable, we may thus assume without loss of generality that the bilinear forms obey the dilation invariance

Remark 1 One could have combined the translation invariance, rotational invariance, and dilation invariance steps into a single step by considering the action of the group of similarity transformations generated by translations, rotations, and dilations, which is also amenable, but I found it conceptually easier to consider the three invariances separately.

Now we use all these invariances to almost completely determine the operators . We begin with the analysis of , which is slightly simpler. First observe from the Schwartz kernel theorem that there is a (tempered) distribution on such that

for all , where by abuse of notation the right-hand side denotes the action of the distribution on the tensor product of and . The translation invariance (6) then asserts that

for all and . Since Schwartz functions in can be approximated (in Schwartz seminorms) by linear combinations of Schwartz functions in (as can be done for instance using dyadic partitions of unity and Fourier series), we conclude that

for any Schwartz .

Morally, this means that , that is to say is really just a function of . To justify this at the level of distributions, we argue as follows. Now suppose that is such that for all . If we let be a bump function supported on of total mass one, and is sufficiently large, then we have

(from translation invariance and averaging),

(because vanishes unless , and because ),

(by translating by ),

(by translating by ),

(because ). On the other hand, the integrand can be seen to be non-vanishing only when , and has uniform compact support in , with all norms . From this and the triangle inequality we have

and thus on sending we conclude that

whenever is such that , or equivalently whenever . Thus the expression only depends on the function , so there exists a distribution on such that

We then have the convolution representation

for all , where is the reflection of . By a limiting argument one can allow to be in the Schwartz space rather than . The rotation and dilation invariances of then imply that

The rotation invariance and an averaging argument implies that

is the spherically symmetric component of , and denotes probability Haar measure on . Next, if vanishes near the origin and is such that for all , and is a bump function supported on of total mass one, then (similarly to the translation invariance analysis) we have

for large enough. But the integrand is non-vanishing only when is comparable to , and has uniform compact support in with all norms . From this and the triangle inequality we conclude that

and hence

whenever vanishes near the origin with for all . In the case when is radial, , the latter condition simplifies to . We conclude that there is a constant such that

whenever is spherically symmetric and vanishes near the origin. On the other hand, if is a spherically symmetric bump function equal to one near the origin and is decreasing, then is strictly positive, while from (9) we have

We conclude that , thus vanishes whenever is spherically symmetric and vanishes near . From (10) the same is true without the spherical symmetry hypothesis; thus the distribution is supported at the origin, and hence is a linear combination of the Dirac delta function and its derivatives. To be consistent with the dilation invariance (9), it can in fact only be a constant multiple of the Dirac delta; thus

for some . But this is only consistent with (4) if (one can take for instance to be a smoothed out version of for some large , which has a bounded BMO norm, and to be a bump function supported near the origin, to obtain a contradiction for non-zero ). Thus must vanish identically.

We now apply a similar (but slightly more complicated) argument for . The Schwartz kernel theorem (and the Hahn-Banach theorem) shows that there exists a vector-valued distribution on such that

whenever and . Translation invariance then gives

whenever is such that for all . The translation averaging argument then shows that

whenever is such that for all and for all ; this implies the existence of a vector-valued distribution on such that

whenever is such that for all . In particular we have

whenever and . The rotation and dilation invariances of then imply that

The rotation invariance implies that

is the equivariant version of (in particular one has for all and ). From dilation invariance as before we have

whenever vanishes near the origin with for all . For equivariant , this latter condition is equivalent to . In particular, there is a constant such that

whenever is equivariant and vanishes near the origin. Combining this with (12) we conclude that

whenever vanishes near the origin. Using polar coordinates, the right-hand side can also be expressed as

for another constant ( divided by the surface area of the unit sphere). Thus, is equal to the distribution away from the origin, and thus differs from that distribution by a (vector-valued) linear combination of Dirac delta functions and their derivatives. By dilation invariance there cannot actually be any derivative terms, and from rotation invariance there cannot be any delta function terms either. Hence we have

and hence

for some absolute constant ; comparing this with (3) we see that . We conclude that for , we have for ; but one can easily show that fails to be bounded from to (even when restricted to Schwartz functions), and the claim follows.

For sake of discussion we will just work in the non-periodic domain , , although the arguments here can be adapted without much difficulty to the periodic setting. We will only work with solutions in which the pressure is normalised in the usual fashion:

Formally, the Euler equations (with normalised pressure) arise as the vanishing viscosity limit of the Navier-Stokes equations

that was studied in previous notes. However, because most of the bounds established in previous notes, either on the lifespan of the solution or on the size of the solution itself, depended on , it is not immediate how to justify passing to the limit and obtain either a strong well-posedness theory or a weak solution theory for the limiting equation (1). (For instance, weak solutions to the Navier-Stokes equations (or the approximate solutions used to create such weak solutions) have lying in for , but the bound on the norm is and so one could lose this regularity in the limit , at which point it is not clear how to ensure that the nonlinear term still converges in the sense of distributions to what one expects.)

Nevertheless, by carefully using the energy method (which we will do loosely following an approach of Bertozzi and Majda), it is still possible to obtain *local-in-time* estimates on (high-regularity) solutions to (3) that are uniform in the limit . Such *a priori* estimates can then be combined with a number of variants of these estimates to obtain a satisfactory local well-posedness theory for the Euler equations. Among other things, we will be able to establish the *Beale-Kato-Majda criterion* – smooth solutions to the Euler (or Navier-Stokes) equations can be continued indefinitely unless the integral

becomes infinite at the final time , where is the *vorticity* field. The vorticity has the important property that it is transported by the Euler flow, and in two spatial dimensions it can be used to establish global regularity for both the Euler and Navier-Stokes equations in these settings. (Unfortunately, in three and higher dimensions the phenomenon of vortex stretching has frustrated all attempts to date to use the vorticity transport property to establish global regularity of either equation in this setting.)
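For orientation, the transport property and its failure in higher dimensions can be written out as follows. This is a sketch in standard notation (u for the velocity field, \omega for the vorticity), which is an assumption on our part since the displayed formulas are not reproduced in the text above.

```latex
% Two dimensions: the scalar vorticity \omega = \partial_1 u_2 - \partial_2 u_1
% of a smooth Euler solution u is transported by the flow:
\[ \partial_t \omega + (u \cdot \nabla) \omega = 0 . \]
% Three dimensions: the vector vorticity \omega = \nabla \times u
% picks up the vortex stretching term on the right-hand side:
\[ \partial_t \omega + (u \cdot \nabla) \omega = (\omega \cdot \nabla) u . \]
```

In the two-dimensional case the first equation keeps the supremum of the vorticity constant along the flow, whereas the right-hand side of the second equation can in principle amplify the vorticity, which is the obstruction alluded to above.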

There is a rather different approach to establishing local well-posedness for the Euler equations, which relies on the *vorticity-stream* formulation of these equations. This will be discussed in a later set of notes.

** — 1. A priori bounds — **

We now develop some *a priori* bounds for very smooth solutions to Navier-Stokes that are uniform in the viscosity . Define an function to be a function that lies in every space; similarly define an function to be a function that lies in for every . Given divergence-free initial data , an mild solution to the Navier-Stokes initial value problem (3) is a solution that is an mild solution for all . From the (non-periodic version of) Corollary 40 of Notes 1, we know that for any divergence-free initial data , there is a unique maximal Cauchy development , with infinite if is finite.

Here are our first bounds:

Theorem 1 (A priori bound) Let be an maximal Cauchy development to (3) with initial data .

- (i) For any integer , we have
Furthermore, if for a sufficiently small constant depending only on , then

- (ii) For any and integer , one has

The hypothesis that is integer can be dropped by more heavily exploiting the theory of paraproducts, but we shall restrict attention to integer for simplicity.

We now prove this theorem using the energy method. Using the Navier-Stokes equations, we see that and all lie in for any ; an easy iteration argument then shows that the same is true for all higher derivatives of also. This will make it easy to justify the differentiation under the integral sign that we shall shortly perform.

Let be an integer. For each time , we introduce the energy-type quantity

Here we think of as taking values in the Euclidean space . This quantity is of course comparable to , up to constants depending on . It is easy to verify that is continuously differentiable in time, with derivative

where we suppress explicit dependence on in the integrand for brevity. We now try to bound this quantity in terms of . We expand the right-hand side in coordinates using (3) to obtain

where

For , we can integrate by parts to move the operator onto and use the divergence-free nature of to conclude that . Similarly, we may integrate by parts for to move one copy of over to the other factor in the integrand to conclude

so in particular (note that as we are seeking bounds that are uniform in , we can’t get much further use out of beyond this bound). Thus we have

Now we expand out using the Leibniz rule. There is one dangerous term, in which all the derivatives in fall on the factor, giving rise to the expression

But we can locate a total derivative to write this as

and then an integration by parts using as before shows that this term vanishes. Estimating the remaining contributions to using the triangle inequality, we arrive at the bound

At this point we now need a variant of Proposition 35 from Notes 1:

Exercise 2 Let be integers. For any , show that

(Hint: for or , use Hölder’s inequality. Otherwise, use a suitable Littlewood-Paley decomposition.)

Using this exercise and Hölder’s inequality, we see that

By Gronwall’s inequality we conclude that

for any and , which gives part (ii).
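Two of the steps used above can be written out explicitly. The following is a sketch in assumed notation (u for the divergence-free velocity field on d-dimensional space, \nabla^k u for the tensor of its k-th order derivatives, E for a nonnegative energy-type quantity, a for a continuous coefficient); none of these symbols is taken from the displays of the text, which are not reproduced here.

```latex
% (a) Cancellation of the "dangerous" term, in which all k derivatives
%     fall on the transported factor. Assumes \nabla \cdot u = 0 and
%     enough decay at infinity to discard boundary terms.
\begin{align*}
\int_{\mathbb{R}^d} \big( (u \cdot \nabla) \nabla^k u \big) \cdot \nabla^k u \, dx
  &= \frac{1}{2} \int_{\mathbb{R}^d} (u \cdot \nabla) |\nabla^k u|^2 \, dx \\
  &= -\frac{1}{2} \int_{\mathbb{R}^d} (\nabla \cdot u) \, |\nabla^k u|^2 \, dx = 0 .
\end{align*}
% (b) Gronwall's inequality in differential form: if E \geq 0 is continuously
%     differentiable and \partial_t E(t) \leq a(t) E(t), then
\[ E(t) \leq E(0) \exp\Big( \int_0^t a(t') \, dt' \Big) . \]
```

In the application one would take the coefficient a to be a constant multiple of the supremum norm of the velocity gradient at each time, so the exponential factor only involves the time integral of that norm.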

Now assume . Then we have the Sobolev embedding

which when inserted into (4) yields the differential inequality

or equivalently

for some constant (strictly speaking one should work with for some small which one sends to zero later, if one wants to avoid the possibility that vanishes, but we will ignore this small technicality for sake of exposition.) Since , we conclude that stays bounded for a time interval of the form ; this, together with the blowup criterion that must go to infinity as , gives part (i).
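The way a superlinear differential inequality yields a lifespan proportional to the reciprocal of the square root of the initial energy can be checked on the comparison ODE. The sketch below assumes the inequality has the model form y' <= c * y**1.5 (our reading of the step above, with hypothetical names e0 and c for the initial energy and the constant), solves the equality case in closed form, and compares it with a Runge-Kutta integration.

```python
import math


def energy_bound(t, e0, c):
    """Closed-form solution of y' = c * y**1.5 with y(0) = e0.

    Derived from (y**-0.5)' = -c/2; valid only before blowup_time(e0, c).
    """
    denom = 1.0 - 0.5 * c * math.sqrt(e0) * t
    if denom <= 0:
        raise ValueError("t is at or beyond the blow-up time of the comparison ODE")
    return e0 / denom ** 2


def blowup_time(e0, c):
    """Time at which the comparison solution becomes infinite."""
    return 2.0 / (c * math.sqrt(e0))


def rk4(e0, c, t_end, steps=10000):
    """Classical fourth-order Runge-Kutta integration of y' = c * y**1.5."""
    h = t_end / steps
    y = e0
    f = lambda v: c * v ** 1.5
    for _ in range(steps):
        k1 = f(y)
        k2 = f(y + 0.5 * h * k1)
        k3 = f(y + 0.5 * h * k2)
        k4 = f(y + h * k3)
        y += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6.0
    return y


if __name__ == "__main__":
    e0, c = 4.0, 0.5
    t_half = 0.5 * blowup_time(e0, c)      # half of the comparison lifespan
    print(energy_bound(t_half, e0, c), rk4(e0, c, t_half))
```

At half of the comparison blow-up time the closed form gives exactly four times the initial value, so on that interval the energy is controlled by a fixed multiple of its initial size.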

As a consequence, we can now obtain local existence for the Euler equations from smooth data:

Corollary 3 (Local existence for smooth solutions) Let be divergence-free. Let be an integer, and set

Then there is a smooth solution , to (1) with all derivatives of in for appropriate . Furthermore, for any integer , one has

*Proof:* We use the compactness method, which will be more powerful here than in the last section because we have much higher regularity uniform bounds (but they are only local in time rather than global). Let be a sequence of viscosities going to zero. By the local existence theory for Navier-Stokes (Corollary 40 of Notes 1), for each we have a maximal Cauchy development , to the Navier-Stokes initial value problem (3) with viscosity and initial data . From Theorem 1(i), we have for all (if is small enough), and

for all . By Sobolev embedding, this implies that

and then by Theorem 1(ii) one has

for every integer . Thus, for each , is bounded in , uniformly in . By repeatedly using (3) and product estimates for Sobolev spaces, we see the same is true for , and for all higher derivatives of . In particular, all derivatives of are equicontinuous.

Using weak compactness (Proposition 2 of Notes 2), one can pass to a subsequence such that converge weakly to some limits , such that and all their derivatives lie in on ; in particular, are smooth. From the Arzelá-Ascoli theorem (and Proposition 3 of Notes 2), and converge locally uniformly to , and similarly for all derivatives of . One can then take limits in (3) and conclude that solve (1). The bound (5) follows from taking limits in (6).

Remark 4 We are able to easily pass to the zero viscosity limit here because our domain has no boundary. In the presence of a boundary, we cannot freely differentiate in space as casually as we have been doing above, and one no longer has bounds on higher derivatives on and near the boundary that are uniform in the viscosity. Instead, it is possible for the fluid to form a thin boundary layer that has a non-trivial effect on the limiting dynamics. We hope to return to this topic in a future set of notes.

We have constructed a local smooth solution to the Euler equations from smooth data, but have not yet established uniqueness or continuous dependence on the data; related to the latter point, we have not extended the construction to larger classes of initial data than the smooth class . To accomplish these tasks we need a further *a priori* estimate, now involving *differences* of two solutions, rather than just bounding a single solution:

Theorem 5 (A priori bound for differences) Let , let be an integer, and let be divergence-free with norm at most . Let

where is sufficiently small depending on . Let and be an solution to (1) with initial data (this exists thanks to Corollary 3), and let and be an solution to (1) with initial data . Then one has

Note the asymmetry between and in (8): this estimate requires control on the initial data in the high regularity space in order to be usable, but has no such requirement on the initial data . This asymmetry will be important in some later applications.

*Proof:* From Corollary 3 we have

Now we need bounds on the difference . Initially we have , where . To evolve later in time, we will need to use the energy method. Subtracting (1) for and , we have

By hypothesis, all derivatives of and lie in on , which will allow us to justify the manipulations below without difficulty. We introduce the low regularity energy for the difference:

Arguing as in the proof of Theorem 1, we see that

where

As before, the divergence-free nature of ensures that vanishes. For , we use the Leibniz rule and again extract out the dangerous term

which again vanishes by integration by parts. We then use the triangle inequality to bound

The key point here is that at most derivatives are being applied to at any given time, although the full derivatives may also hit or . Using Exercise 2 and Hölder, we may bound the above expression by

which by Sobolev embedding gives

Applying (9) and Gronwall’s inequality, we conclude that

for , and (7) follows.

Now we work with the high regularity energy

Arguing as before we have

Using Exercise 2 and Hölder, we may bound this by

Using Sobolev embedding we thus have

By the chain rule, we obtain

(one can work with in place of and then send later if one wishes to avoid a lack of differentiability at ). By Gronwall’s inequality, we conclude that

for all , and (8) follows.

By specialising (7) (or (8)) to the case where , we see the solution constructed in Corollary 3 is unique. Now we can extend to wider classes of initial data than initial data. The following result is essentially due to Kato and to Swann (with a similar result obtained by different methods by Ebin-Marsden):

Proposition 6 Let be an integer, and let be divergence-free. Set

where is sufficiently small depending on . Let be a sequence of divergence-free vector fields converging to in norm (for instance, one could apply Littlewood-Paley projections to ). Let , be the associated solutions to (1) provided by Corollary 3 (these are well-defined for large enough). Then and converge in norm on to limits , respectively, which solve (1) in a distributional sense.

*Proof:* We use a variant of Kato’s argument (see also the paper of Bona and Smith for a related technique). It will suffice to show that the form a Cauchy sequence in , since the algebra properties of then give the same for , and one can then easily take limits (in this relatively high regularity setting) to obtain the limiting solution that solves (1) in a distributional sense.

Let be a large dyadic integer. By Corollary 3, we may find an solution to the Euler equations (1) with initial data (which lies in ). From Theorem 5, one has

Applying the triangle inequality and then taking limit superior, we conclude that

But by Plancherel’s theorem and dominated convergence we see that

as , and hence

giving the claim.
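The last step of this argument, that the high-frequency part of a fixed Sobolev function becomes small in Sobolev norm as the frequency cutoff grows, can be seen numerically on Fourier coefficients. The sketch below is a toy model with several assumptions: a one-dimensional periodic setting, a sharp frequency cutoff in place of a smooth Littlewood-Paley projection, and hypothetical function names.

```python
import numpy as np


def hs_norm_sq(coeffs, freqs, s):
    """Squared Sobolev H^s norm computed from Fourier coefficients (Plancherel)."""
    return float(np.sum((1.0 + freqs ** 2) ** s * np.abs(coeffs) ** 2))


def high_frequency_tail(coeffs, freqs, s, cutoff):
    """Squared H^s norm of the part of the function at frequencies |k| > cutoff."""
    mask = np.abs(freqs) > cutoff
    return hs_norm_sq(coeffs[mask], freqs[mask], s)


if __name__ == "__main__":
    s = 3.0
    freqs = np.arange(-256, 257).astype(float)
    # Coefficients decaying just fast enough for the H^s norm to be finite.
    coeffs = (1.0 + np.abs(freqs)) ** (-(s + 1.0))
    for cutoff in [1, 4, 16, 64, 256]:
        print(cutoff, high_frequency_tail(coeffs, freqs, s, cutoff))
```

The printed tails decrease as the cutoff grows, since each one is a sum over a smaller set of the same nonnegative terms; this is the finite counterpart of the dominated convergence step.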

Remark 7 Since the sequence can converge to at most one limit , we see that the solution to (1) is unique in the class of distributional solutions that are limits of smooth solutions (with initial data of those solutions converging to in ). However, this leaves open the possibility that there are other distributional solutions that do not arise as the limits of smooth solutions (or as limits of smooth solutions whose initial data only converge to in a weaker sense). It is possible to recover some uniqueness results for fairly weak solutions to the Euler equations if one also assumes some additional regularity on the fields (or on related fields such as the vorticity ). In two dimensions, for instance, there is a celebrated theorem of Yudovich that weak solutions to 2D Euler are unique if one has an bound on the vorticity. In higher dimensions one can also obtain uniqueness results if one assumes that the solution is in a high-regularity space such as , . See for instance this paper of Chae for an example of such a result.

Exercise 8 (Continuous dependence on initial data) Let be an integer, let , and set , where is sufficiently small depending on . Let be the closed ball of radius around the origin of divergence-free vector fields in . The above proposition provides a solution to the associated initial value problem. Show that the map from to is a continuous map from to .

Remark 9 The continuity result provided by the above exercise is not as strong as in Navier-Stokes, where the solution map is in fact Lipschitz continuous (see e.g., Exercise 43 of Notes 1). In fact for the Euler equations, which is classified as a “quasilinear” equation rather than a “semilinear” one due to the lack of the dissipative term in the equation, the solution map is not expected to be uniformly continuous on this ball, let alone Lipschitz continuous. See this previous blog post for some more discussion.

Exercise 10 (Maximal Cauchy development) Let be an integer, and let be divergence-free. Show that there exists a unique and unique , with the following properties:

- (i) If and is divergence-free and converges to in norm, then for large enough, there is an solution to (1) with initial data on , and furthermore and converge in norm on to .
- (ii) If , then as .
- (iii) If , then we have the weak Beale-Kato-Majda criterion

Furthermore, show that do not depend on the particular choice of , in the sense that if belongs to both and for two integers then the time and the fields produced by the above claims are the same for both and .

We will refine part (iii) of the above exercise in the next section. It is a major open problem as to whether the case (i.e., finite time blowup) can actually occur. (It is important here that we have some spatial decay at infinity, as represented here by the presence of the norm; when the solution is allowed to diverge at spatial infinity, it is not difficult to construct smooth solutions to the Euler equations that blow up in finite time; see e.g., this article of Stuart for an example.)

Remark 11 The condition that recurs in the above results can be explained using the heuristics from Section 5 of Notes 1. Assume that at a given time , the velocity field fluctuates at a spatial frequency , with the fluctuations being of amplitude . (We however permit the velocity field to contain a “bulk” low frequency component which can have much higher amplitude than ; for instance, the first component of might take the form where is a quantity much larger than .) Suppose one considers the trajectories of two particles whose separation at time zero is comparable to the wavelength of the frequency oscillation. Then the relative velocities of will differ by about , so one would expect the particles to stay roughly the same distance from each other up to time , and then exhibit more complicated and unpredictable behaviour after that point. Thus the natural time scale here is , so one only expects to have a reasonable local well-posedness theory in the regime

On the other hand, if lies in , and the frequency fluctuations are spread out over a set of volume , the heuristics from the previous notes predict that

The uncertainty principle predicts , and so

Thus we force the regime (11) to occur if , and barely have a chance of doing so in the endpoint case , but would not expect to have a local theory (at least using the sort of techniques deployed in this section) for .
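To make the bookkeeping in this remark explicit, here is one way to write out the heuristic chain. The symbols $N$ (frequency), $A$ (amplitude), $V$ (volume), $s$ (Sobolev exponent) and $d$ (dimension) are assumed names, chosen for illustration because the displayed formulas are not reproduced here; the relations are heuristic, not rigorous.

```latex
% Assumed notation: frequency N >= 1, amplitude A, volume V, regularity s, dimension d.
\begin{align*}
 T &\ll \frac{1}{A N}
   && \text{(particles at separation } 1/N \text{ with relative speed } A\text{)} \\
 \|u\|_{H^s} &\approx A N^{s} V^{1/2}
   && \text{(heuristic size of the fluctuation in } H^s\text{)} \\
 V &\gtrsim N^{-d}
   && \text{(uncertainty principle)} \\
 A N &\lesssim N^{\frac{d}{2} + 1 - s} \, \|u\|_{H^s}
   && \text{(combining the previous two lines)}
\end{align*}
```

Under these assumptions, the right-hand side of the last line stays bounded uniformly in $N \geq 1$ precisely when $s \geq \frac{d}{2} + 1$, with room to spare only when the inequality is strict, which is consistent with the three regimes described above.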

Exercise 12 Use similar heuristics to explain the relevance of quantities of the form that occur in various places in this section.

Because the solutions constructed in Exercise 10 are limits (in rather strong topologies) of smooth solutions, it is fairly easy to extend estimates and conservation laws that are known for smooth solutions to these slightly less regular solutions. For instance:

Exercise 13 Let be as in Exercise 10.

- (i) (Energy conservation) Show that for all .
- (ii) Show that
for all .

Exercise 14 (Vanishing viscosity limit) Let the notation and hypotheses be as in Corollary 3. For each , let , be the solution to (3) with this choice of viscosity and with initial data . Show that as , and converge locally uniformly to , and similarly for all derivatives of and . (In other words, there is actually no need to pass to a subsequence as is done in the proof of Corollary 3.) Hint: apply the energy method to control the difference .

Exercise 15 (Local existence for forced Euler) Let be divergence-free, and let , thus is smooth and for any and any integer and , . Show that there exists and a smooth solution to the forced Euler equation

Note: one will first need a local existence theory for the forced Navier-Stokes equation. It is also possible to develop forced analogues of most of the other results in this section, but we will not detail this here.

** — 2. The Beale-Kato-Majda blowup criterion — **

In Exercise 10 we saw that we could continue solutions to the Euler equations indefinitely in time, unless the integral became infinite at some finite time . There is an important refinement of this blowup criterion, due to Beale, Kato, and Majda, in which the tensor is replaced by the vorticity two-form (or vorticity, for short)

that is to say is essentially the anti-symmetric component of . Whereas is the tensor field

is the anti-symmetric tensor field

Remark 16 In two dimensions, is essentially a scalar, since and . As such, it is common in fluid mechanics to refer to the scalar field as the vorticity, rather than the two-form . In three dimensions, there are three independent components of the vorticity, and it is common to view as a vector field rather than a two-form in this case (actually, to be precise would be a pseudovector field rather than a vector field, because it behaves slightly differently to vectors with respect to changes of coordinate). With this interpretation, the vorticity is now the curl of the velocity field . From a differential geometry viewpoint, one can view the two-form as an antisymmetric bilinear map from vector fields to scalar functions , and the relation between the vorticity two-form and the vorticity (pseudo-)vector field in is given by the relation

for arbitrary vector fields , where is the volume form on , which can be viewed in three dimensions as an antisymmetric trilinear form on vector fields. The fact that is a pseudovector rather than a vector then arises from the fact that the volume form changes sign upon applying a reflection.

The point is that vorticity behaves better under the Euler flow than the full derivative . Indeed, if one takes a smooth solution to the Euler equation in coordinates

and applies to both sides, one obtains

If one interchanges and then subtracts, the pressure terms disappear, and one is left with

which we can rearrange using the material derivative as

Writing and , this becomes the *vorticity equation*
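For reference, the steps above can be written out as follows. This is a sketch in assumed notation (lower indices throughout, $\omega_{ij} := \partial_i u_j - \partial_j u_i$, and material derivative $D_t := \partial_t + u_k \partial_k$), since the displayed equations are not reproduced here; the sign and index conventions may differ from those of the original displays.

```latex
% Assumed conventions: \omega_{ij} = \partial_i u_j - \partial_j u_i, D_t = \partial_t + u_k \partial_k.
\begin{align*}
 \partial_t u_j + u_k \partial_k u_j &= -\partial_j p \\
 \partial_t \partial_i u_j + u_k \partial_k \partial_i u_j + (\partial_i u_k)(\partial_k u_j) &= -\partial_i \partial_j p \\
 \partial_t \omega_{ij} + u_k \partial_k \omega_{ij} + (\partial_i u_k)(\partial_k u_j) - (\partial_j u_k)(\partial_k u_i) &= 0 \\
 D_t \omega_{ij} + (\partial_i u_k)(\partial_k u_j) - (\partial_j u_k)(\partial_k u_i) &= 0 \\
 D_t \omega_{ij} + (\partial_i u_k)\,\omega_{kj} - (\partial_j u_k)\,\omega_{ki} &= 0
\end{align*}
```

The last step substitutes $\partial_k u_j = \omega_{kj} + \partial_j u_k$ and $\partial_k u_i = \omega_{ki} + \partial_i u_k$; the two copies of $(\partial_i u_k)(\partial_j u_k)$ that this produces cancel.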

The vorticity equation is particularly simple in two and three dimensions:

Exercise 17 (Transport of vorticity) Let be a smooth solution to the Euler equations in , and let be the vorticity two-form.

- (i) If , show that
- (ii) If , show that
where is the vorticity pseudovector.

Remark 18 One can interpret the vorticity equation in the language of differential geometry, which is a more convenient formalism when working on more general Riemannian manifolds than . To be consistent with the conventions of differential geometry, we now write the components of the velocity field as rather than (and the coordinates of as rather than ). Define the covelocity 1-form as

where is the Euclidean metric tensor (in the standard coordinates, is the Kronecker delta, though can take other values than if one uses a different coordinate system). Thus in coordinates, ; the covelocity field is thus the musical isomorphism applied to the velocity field. The vorticity 2-form can then be interpreted as the exterior derivative of the covelocity, thus

or in coordinates

The Euler equations can be rearranged as

where is the Lie derivative along , which for 1-forms is given in coordinates as

and is the modified pressure

If one takes exterior derivatives of both sides of (14) using the basic differential geometry identities and , one obtains the vorticity equation

where the Lie derivative for 2-forms is given in coordinates as

and so we recover (13) after some relabeling.

We now present the Beale-Kato-Majda condition.

Theorem 19 (Beale-Kato-Majda) Let be an integer, and let be divergence free. Let , be the maximal Cauchy development from Exercise 10, and let be the vorticity.

The double exponential in (i) is not a typo! It is an open question though as to whether this double exponential bound can be at all improved, even in the simplest case of two spatial dimensions.

We turn to the proof of this theorem. Part (ii) will be implied by part (i), since if is finite then part (i) gives a uniform bound on as , preventing finite time blowup. So it suffices to prove part (i). To do this, it suffices to do so for solutions, since one can then pass to a limit (using the strong continuity in ) to establish the general case. In particular, we can now assume that are smooth.

We would like to convert control on back to control of the full derivative . If one takes divergences of the vorticity using (12) and the divergence-free nature of , we see that

Thus, we can recover the derivative from the vorticity by the formula

where one can define via the Fourier transform as a multiplier bounded on every space.

If the operators were bounded in , then we would have

and the claimed bound (15) would follow from Theorem 1(ii) (with one exponential to spare). Unfortunately, is not quite bounded on . Indeed, from Exercise 18 of Notes 1 we have the formula

for any test function and , where is the singular kernel

If one sets to be a (smooth approximation) to the signum restricted to an annulus , we conclude that the operator norm of is at least as large as

But one can calculate using polar coordinates that this expression diverges like in the limit , , giving unboundedness.
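The logarithmic divergence can be checked numerically in a simple case. The sketch below assumes $d = 2$ and the off-diagonal kernel $K(x) = -x_1 x_2 / (\pi |x|^4)$, which is the mixed second derivative of the fundamental solution $\frac{1}{2\pi} \log |x|$ away from the origin; this specific kernel and the helper names `annulus_kernel_mass` and `predicted_mass` are illustrative assumptions, not formulas taken from the text above. In polar coordinates the integral of $|K|$ over the annulus $r \leq |x| \leq R$ works out to $\frac{2}{\pi} \log(R/r)$.

```python
import math


def annulus_kernel_mass(r, R, n_rho=200, n_theta=400):
    """Midpoint-rule value of the integral of |K(x)| over r <= |x| <= R,
    for the assumed 2D kernel K(x) = -x1*x2 / (pi * |x|^4)."""
    log_r, log_R = math.log(r), math.log(R)
    ds = (log_R - log_r) / n_rho        # step in s = log(rho)
    dtheta = 2.0 * math.pi / n_theta    # step in the angle
    total = 0.0
    for i in range(n_rho):
        rho = math.exp(log_r + (i + 0.5) * ds)
        for j in range(n_theta):
            theta = (j + 0.5) * dtheta
            x1, x2 = rho * math.cos(theta), rho * math.sin(theta)
            kernel = abs(x1 * x2) / (math.pi * rho ** 4)
            # area element: dx = rho d(rho) d(theta) = rho^2 ds d(theta)
            total += kernel * rho ** 2 * ds * dtheta
    return total


def predicted_mass(r, R):
    """Closed form (2/pi) * log(R/r) obtained from polar coordinates."""
    return (2.0 / math.pi) * math.log(R / r)


if __name__ == "__main__":
    # Each tenfold widening of the annulus on both sides adds the same amount.
    for r, R in [(0.1, 1.0), (0.01, 10.0), (0.001, 100.0)]:
        print(r, R, annulus_kernel_mass(r, R), predicted_mass(r, R))
```

The numerical and closed-form columns should agree to a few digits, and both grow linearly in $\log(R/r)$ rather than staying bounded.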

As it turns out, though, the Gronwall argument used to establish Theorem 1(ii) can just barely tolerate an additional “logarithmic loss” of the above form, albeit at the cost of worsening the exponential term to a double exponential one. The key lemma is the following result that quantifies the logarithmic divergence indicated by the previous calculation, and is similar in spirit to a well-known inequality of Brezis and Wainger.

The lower order terms will be easily dealt with in practice; the main point is that one can almost bound the norm of by that of , up to a logarithmic factor.

*Proof:* By a limiting argument we may assume that (and hence ) are test functions. We apply Littlewood-Paley decomposition to write

and hence by the triangle inequality we may bound the left-hand side of (17) by

where we omit the domain and range from the function space norms for brevity.

By Bernstein’s inequality we have

Also, from Bernstein and Plancherel we have

and hence by geometric series we have

for any . This gives an acceptable contribution if we select . This leaves remaining values of to control, so if one can bound

Observe from applying the scaling (that is, replacing with ) that to prove (18) for all it suffices to do so for . By Fourier analysis, the function is the convolution of with the inverse Fourier transform of the function

This function is a test function, so is a Schwartz function, and the claim now follows from Young’s inequality.

We return now to the proof of (15). We adapt the proof of Proposition 1(i). As in that proposition, we introduce the higher energy

We no longer have the viscosity term as , but that term was discarded anyway in the analysis. From (4) we have

Applying (16), (20) one thus has

From Exercise 13 one has

By the chain rule, one then has

and hence by Gronwall’s inequality one has

The claim (15) follows.
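The appearance of a double exponential can be seen in a model computation. Suppose, as a simplifying assumption, that a non-negative quantity $E(t)$ obeys $\partial_t E \leq C a(t) E \log(2+E)$ for some non-negative function $a$ (playing the role of the sup norm of the vorticity) and some constant $C$; the names $E$, $a$, $C$ are placeholders for this sketch rather than the exact quantities in the proof.

```latex
% Model inequality (assumed): \partial_t E \le C a(t) E \log(2+E), with E, a \ge 0.
\begin{align*}
 \partial_t \log(2+E) &= \frac{\partial_t E}{2+E} \leq C a(t) \log(2+E) \\
 \partial_t \log\log(2+E) &\leq C a(t) \\
 \log(2+E(t)) &\leq \log(2+E(0)) \exp\Big( C \int_0^t a(t')\, dt' \Big) \\
 2 + E(t) &\leq (2+E(0))^{\exp\left( C \int_0^t a(t')\, dt' \right)}
\end{align*}
```

Without the logarithm, the same argument would stop at a single exponential; the logarithmic loss costs exactly one more.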

Remark 21 The Beale-Kato-Majda criterion can be sharpened a little bit, by replacing the sup norm with slightly smaller norms, such as the bounded mean oscillation (BMO) norm of , basically by improving the right-hand side of Lemma 20 slightly. See for instance this paper of Planchon and the references therein.

Remark 22 An inspection of the proof of Theorem 19 reveals that the same result holds if the Euler equations are replaced by the Navier-Stokes equations; the energy estimates acquire an additional “” term by doing so (as in the proof of Proposition 1), but the sign of that term is favorable.

We now apply the Beale-Kato-Majda criterion to obtain global well-posedness for the Euler equations in two dimensions:

Theorem 23 (Global well-posedness) Let be as in Exercise 10. If , then .

This theorem will be immediate from Theorem 19 and the following conservation law:

Proposition 24 (Conservation of vorticity distribution) Let be as in Exercise 10 with . Then one has

for all and .

*Proof:* By a limiting argument it suffices to show the claim for , thus we need to show

By another limiting argument we can take to be an solution. By the monotone convergence theorem (and Sobolev embedding), it suffices to show that

whenever is a test function that vanishes in a neighbourhood of the origin . Note that as and all its derivatives are in on for every , and so is smooth and compactly supported in . We may therefore differentiate under the integral sign to obtain

where we omit explicit dependence on for brevity. By Exercise 17(i), the right-hand side is

which one can write as a total derivative

which vanishes thanks to integration by parts and the divergence-free nature of . The claim follows.
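Written out in assumed notation (scalar vorticity $\omega$ in two dimensions obeying the transport equation $\partial_t \omega + u \cdot \nabla \omega = 0$, and $F$ the test function from the proof), the computation just described can be sketched as:

```latex
% Assumed: d = 2, \partial_t \omega + u \cdot \nabla \omega = 0, \nabla \cdot u = 0,
% and F a test function vanishing near the origin.
\begin{align*}
 \partial_t \int_{\mathbf{R}^2} F(\omega)\, dx
 &= \int_{\mathbf{R}^2} F'(\omega)\, \partial_t \omega \, dx \\
 &= - \int_{\mathbf{R}^2} F'(\omega)\, u \cdot \nabla \omega \, dx \\
 &= - \int_{\mathbf{R}^2} u \cdot \nabla \big( F(\omega) \big) \, dx \\
 &= \int_{\mathbf{R}^2} (\nabla \cdot u)\, F(\omega) \, dx = 0
\end{align*}
```

The third line is the chain rule, and the fourth is the integration by parts; no boundary terms appear because $F(\omega)$ is compactly supported.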

The above proposition shows that in two dimensions, is constant, and so the integral cannot diverge for finite . Applying Theorem 19, we obtain Theorem 23. We remark that global regularity for two-dimensional Euler was established well before the Beale-Kato-Majda theorem, starting with the work of Wolibner.

One can adapt this argument to the Navier-Stokes equations:

Exercise 25 Let be an integer, let , let be divergence-free, and let , be a maximal Cauchy development to the Navier-Stokes equations with initial data . Let be the vorticity.

- (i) Establish the vorticity equation .
- (ii) Show that for all and . (Note: to adapt the proof of Proposition 24, one should restrict attention to functions that are convex on the range of on, say, . The case of this inequality can also be established using the maximum principle for parabolic equations.)
- (iii) Show that .

Remark 26 There are other ways to establish global regularity for two-dimensional Navier-Stokes (originally due to Ladyzhenskaya); for instance, the bound on the vorticity in Exercise 25(ii), combined with energy conservation, gives a uniform bound on the velocity field, which can then be inserted into (the non-periodic version of) Theorem 38 of Notes 1.


Remark 27 If solve the Euler equations on some time interval with initial data , then the time-reversed fields solve the Euler equations on the reflected interval with initial data . Because of this time reversal symmetry, the local and global well-posedness theory for the Euler equations can also be extended backwards in time; for instance, in two dimensions any divergence free initial data leads to an solution to the Euler equations on the whole time interval . However, the Navier-Stokes equations are very much *not* time-reversible in this fashion.

However, it is possible to construct “weak” solutions which lack many of the desirable features of strong solutions (notably, uniqueness, propagation of regularity, and conservation laws) but can often be constructed globally in time even when one is unable to do so for strong solutions. Broadly speaking, one usually constructs weak solutions by some sort of “compactness method”, which can generally be described as follows.

- Construct a sequence of “approximate solutions” to the desired equation, for instance by developing a well-posedness theory for some “regularised” approximation to the original equation. (This theory often follows similar lines to those in the previous set of notes, for instance using such tools as the contraction mapping theorem to construct the approximate solutions.)
- Establish some *uniform* bounds (over appropriate time intervals) on these approximate solutions, even in the limit as an approximation parameter is sent to zero. (Uniformity is key; *non-uniform* bounds are often easy to obtain if one puts enough “mollification”, “hyper-dissipation”, or “discretisation” in the approximating equation.)
- Use some sort of “weak compactness” (e.g., the Banach-Alaoglu theorem, the Arzelà-Ascoli theorem, or the Rellich compactness theorem) to extract a subsequence of approximate solutions that converges (in a topology weaker than that associated to the available uniform bounds) to a limit. (Note that there is no reason *a priori* to expect such limit points to be unique, or to have any regularity properties beyond those implied by the available uniform bounds.)
- Show that this limit solves the original equation in a suitable weak sense.

The quality of these weak solutions is very much determined by the type of uniform bounds one can obtain on the approximate solution; the stronger these bounds are, the more properties one can obtain on these weak solutions. For instance, if the approximate solutions enjoy an energy identity leading to uniform energy bounds, then (by using tools such as Fatou’s lemma) one tends to obtain energy *inequalities* for the resulting weak solution; but if one somehow is able to obtain uniform bounds in a higher regularity norm than the energy then one can often recover the full energy *identity*. If the uniform bounds are at the regularity level needed to obtain well-posedness, then one generally expects to upgrade the weak solution to a strong solution. (This phenomenon is often formalised through *weak-strong uniqueness* theorems, which we will discuss later in these notes.) Thus we see that as far as attacking global regularity is concerned, both the theory of strong solutions and the theory of weak solutions encounter essentially the same obstacle, namely the inability to obtain uniform bounds on (exact or approximate) solutions at high regularities (and at arbitrary times).

For simplicity, we will focus our discussion in these notes on finite energy weak solutions on . There is a completely analogous theory for periodic weak solutions on (or equivalently, weak solutions on the torus ), which we will leave to the interested reader.

In recent years, a completely different way to construct weak solutions to the Navier-Stokes or Euler equations has been developed that is based not on the above compactness methods, but instead on techniques of convex integration. These will be discussed in a later set of notes.

** — 1. A brief review of some aspects of distribution theory — **

We have already been using the concept of a distribution in previous notes, but we will rely more heavily on this theory in this set of notes, so we pause to review some key aspects of the theory. A more comprehensive discussion of distributions may be found in this previous blog post. To avoid some minor subtleties involving complex conjugation that are not relevant for this post, we will restrict attention to real-valued (scalar) distributions here. (One can then define vector-valued distributions (taking values in a finite-dimensional vector space) as a vector of scalar-valued distributions.)

Let us work in some non-empty open subset of a Euclidean space (which may eventually correspond to space, time, or spacetime). We recall that is the space of (real-valued) test functions . It has a rather subtle topological structure (see previous notes) which we will not detail here. A (real-valued) distribution on is a continuous linear functional from test functions to the reals . (This pairing may also be denoted or in other texts.) There are two basic examples of distributions to keep in mind:

- Any locally integrable function gives rise to a distribution (which by abuse of notation we also call ) by the formula .
- Any Radon measure gives rise to a distribution (which we will again call ) by the formula . For instance, if , the Dirac mass at is a distribution with .

Two distributions are equal in the sense of distributions if for all . For instance, it is not difficult to show that two locally integrable functions are equal in the sense of distributions if and only if they agree almost everywhere, and two Radon measures are equal in the sense of distributions if and only if they are identical.

As a general principle, any “linear” operation that makes sense for “nice” functions (such as test functions) can also be defined for distributions, but any “nonlinear” operation is unlikely to be usefully defined for arbitrary distributions (though it may still be a good concept to use for distributions with additional regularity). For instance, one can take a partial derivative (known as the weak derivative) of any distribution by the definition

for all . Note that this definition agrees with the “strong” or “classical” notion of a derivative when is a smooth function, thanks to integration by parts. Similarly, if is smooth, one can define the product distribution by the formula

for all . One can also take linear combinations of two distributions in the usual fashion, thus

for all and .
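As a sanity check on the definition of the weak derivative, the following sketch tests numerically that the weak derivative of $|x|$ on the real line is the sign function, in the sense that $-\langle |x|, \phi' \rangle = \langle \mathrm{sgn}, \phi \rangle$. The choice of function $|x|$, the shifted bump test function `phi`, and the midpoint-rule helper `integrate` are all assumptions made for illustration; they do not come from the text above.

```python
import math

SHIFT = 0.3  # shift the bump off-centre so that both pairings are non-zero


def bump(y):
    """Standard test function exp(-1/(1-y^2)), supported on [-1, 1]."""
    return math.exp(-1.0 / (1.0 - y * y)) if abs(y) < 1.0 else 0.0


def bump_prime(y):
    """Classical derivative of bump, computed by the chain rule."""
    if abs(y) >= 1.0:
        return 0.0
    return bump(y) * (-2.0 * y) / (1.0 - y * y) ** 2


def phi(x):
    return bump(x - SHIFT)


def dphi(x):
    return bump_prime(x - SHIFT)


def integrate(g, a, b, n=100_000):
    """Composite midpoint rule for the integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (k + 0.5) * h) for k in range(n))


def weak_derivative_pairing():
    """<d/dx |x|, phi>, defined as -<|x|, phi'>."""
    return -integrate(lambda x: abs(x) * dphi(x), SHIFT - 1.0, SHIFT + 1.0)


def sign_pairing():
    """<sgn, phi>, the pairing of the candidate derivative with phi."""
    return integrate(lambda x: math.copysign(1.0, x) * phi(x),
                     SHIFT - 1.0, SHIFT + 1.0)


if __name__ == "__main__":
    print(weak_derivative_pairing(), sign_pairing())
```

The two printed numbers should agree up to quadrature error, even though $|x|$ has no classical derivative at the origin.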

Exercise 1 Let be a connected open subset of . Let be a distribution on such that in the sense of distributions for all . Show that is a constant, that is to say there exists such that in the sense of distributions.

A sequence of distributions is said to converge in the weak-* sense or *converge in the sense of distributions* to another distribution if one has

as for every test function ; in this case we write . This notion of convergence is sometimes referred to also as weak convergence (and one writes instead of ), although there is a subtle distinction between weak and weak-* convergence in non-reflexive spaces and so I will try to avoid this terminology (though in many cases one will be working in a reflexive space in which there is no distinction).

The linear operations alluded to above tend to be continuous in the distributional sense. For instance, it is easy to see that if , then for all , and for any smooth ; similarly, if , , and , are sequences of real numbers, then .

Suppose that one places a norm or seminorm on . Then one can define a subspace of the space of distributions, defined to be the space of all distributions for which the norm

is finite. For instance, if is the norm for some , then is just the dual space (with the (equivalence classes of) locally integrable functions in identified with distributions as above).

We have the following version of the Banach-Alaoglu theorem which allows us to easily create sequences that converge in the sense of distributions:

Proposition 2 (Variant of Banach-Alaoglu) Suppose that is a norm or seminorm on which makes the space separable. Let be a bounded sequence in . Then there is a subsequence of the which converges in the sense of distributions to a limit .

*Proof:* By hypothesis, there is a constant such that

for all . For each given , we may thus pass to a subsequence of such that converges to a limit. Passing to a subsequence a countably infinite number of times and using the Arzelà-Ascoli diagonalisation trick, we can thus find a dense subset of (using the metric) and a subsequence of the such that the limit exists for every , and hence for every by a limiting argument and (1). If one then defines to be the function

then one can verify that is a distribution, and by (1) we will have . By construction, converges in the sense of distributions to , and we are done.

It is important to note that there is no uniqueness claimed for ; while any given subsequence of the can have at most one limit , it is certainly possible for different subsequences to converge to different limits. Also, the proposition only applies for spaces that have preduals ; this covers many popular function spaces, such as spaces for , but omits endpoint spaces such as or . (For instance, approximations to the identity are uniformly bounded in , but converge weakly to a Dirac mass, which lies outside of .)

From the definition we see that if , then we have the Fatou-type lemma

Thus, upper bounds on the approximating distributions are usually inherited by their limit . However, it is essential to be aware that the same is not true for lower bounds; there can be “loss of mass” in the limit. The following four examples illustrate some key ways in which this can occur:

- (Escape to spatial infinity) If is a non-zero test function, and is a sequence in going to infinity, then the translations of converge in the sense of distributions to zero, even though they will not go to zero in many function space norms (such as ).
- (Escape to frequency infinity) If is a non-zero test function, and is a sequence in going to infinity, then the modulations of converge in the sense of distributions to zero (cf. the Riemann-Lebesgue lemma), even though they will not go to zero in many function space norms (such as ).
- (Escape to infinitely fine scales) If , is a sequence of positive reals going to infinity, and , then the sequence converges in the sense of distributions to zero, but will not go to zero in several function space norms (e.g. with ).
- (Escape to infinitely coarse scales) If , is a sequence of positive reals going to zero, and , then the sequence converges in the sense of distributions to zero, but will not go to zero in several function space norms (e.g. with ).
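The last two examples come down to a change of variables. As an assumption for illustration (the normalisation is not reproduced above), take the rescaled functions to be $f_n(x) := \lambda_n^{d/p} f(\lambda_n x)$ with $f$ a test function on $\mathbf{R}^d$, so that the $L^p$ norm is independent of $n$; then for any test function $\phi$ one has the following sketch:

```latex
% Assumed normalisation: f_n(x) = \lambda_n^{d/p} f(\lambda_n x), with f, \phi test functions.
\begin{align*}
 \|f_n\|_{L^p(\mathbf{R}^d)} &= \|f\|_{L^p(\mathbf{R}^d)} \\
 \langle f_n, \phi \rangle
 &= \lambda_n^{d/p} \int_{\mathbf{R}^d} f(\lambda_n x)\, \phi(x)\, dx
  = \lambda_n^{\frac{d}{p} - d} \int_{\mathbf{R}^d} f(y)\, \phi(y/\lambda_n)\, dy \\
 |\langle f_n, \phi \rangle| &\leq \lambda_n^{\frac{d}{p} - d}\, \|f\|_{L^1} \|\phi\|_{L^\infty}
 \to 0 \quad (\lambda_n \to \infty,\ p > 1) \\
 |\langle f_n, \phi \rangle| &\leq \lambda_n^{\frac{d}{p}}\, \|f\|_{L^\infty} \|\phi\|_{L^1}
 \to 0 \quad (\lambda_n \to 0,\ p < \infty)
\end{align*}
```

In both regimes the norm is conserved while every pairing with a fixed test function decays, which is the loss of mass described above.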

Related to this loss of mass phenomenon is the important fact that the operation of pointwise multiplication is generally *not* continuous in the distributional topology: and does *not* necessarily imply in general (in fact in many cases the products or might not even be well-defined). For instance:

- Using the escape to frequency infinity example, the functions converge in the sense of distributions to zero, but their squares instead converge in the sense of distributions to , as can be seen from the double angle formula .
- Using the escape to infinitely fine scales example, the functions converge in the sense of distributions to zero, but their squares will not if .
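The first of these two examples is easy to observe numerically. The sketch below assumes the oscillating functions are $\cos(nx)$ on the real line and pairs them, and their squares, against a fixed bump function; the specific functions and the helper names `pair_oscillation`, `pair_square` and `integrate` are illustrative assumptions rather than the notation of the text.

```python
import math


def bump(x):
    """Test function exp(-1/(1-x^2)), supported on [-1, 1]."""
    return math.exp(-1.0 / (1.0 - x * x)) if abs(x) < 1.0 else 0.0


def integrate(g, a=-1.0, b=1.0, n=100_000):
    """Composite midpoint rule for the integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (k + 0.5) * h) for k in range(n))


def pair_oscillation(freq):
    """Pairing <cos(freq * x), bump>."""
    return integrate(lambda x: math.cos(freq * x) * bump(x))


def pair_square(freq):
    """Pairing <cos(freq * x)^2, bump>."""
    return integrate(lambda x: math.cos(freq * x) ** 2 * bump(x))


MASS = integrate(bump)  # pairing <1, bump>

if __name__ == "__main__":
    for freq in (1, 10, 200):
        # the first pairing decays to 0, the second approaches MASS / 2
        print(freq, pair_oscillation(freq), pair_square(freq), 0.5 * MASS)
```

By the double angle formula $\cos^2(nx) = \frac{1}{2} + \frac{1}{2}\cos(2nx)$, the second pairing tends to half the integral of the bump, so the limit of the squares is not the square of the limit.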

This lack of continuity of multiplication means that one has to take a non-trivial amount of care when applying the theory of distributions to nonlinear PDE; a sufficiently careless regard for this issue (or more generally, treating distribution theory as some sort of “magic wand”) is likely to lead to serious errors in one’s arguments.

One way to recover continuity of pointwise multiplication is to somehow upgrade distributional convergence to stronger notions of convergence. For instance, from Hölder’s inequality one sees that if converges strongly to in (thus and both lie in , and goes to zero), and converges strongly to in , then will converge strongly in to , where .

One key way to obtain strong convergence in some norm is to obtain uniform bounds in an even stronger norm – so strong that the associated space embeds compactly in the space associated to the original norm. More precisely:

Proposition 3 (Upgrading to strong convergence) Let be two norms on , with associated spaces of distributions. Suppose that embeds compactly into , that is to say the closed unit ball in is a compact subset of . If is a bounded sequence in that converges in the sense of distributions to a limit , then converges strongly in to as well.

*Proof:* By the Urysohn subsequence principle, it suffices to show that every subsequence of has a further subsequence that converges strongly in to . But by the compact embedding of into , every subsequence of has a further subsequence that converges strongly in to some limit , and hence also in the sense of distributions to by definition of the norm. But this subsequence also converges in the sense of distributions to , and hence , and the claim follows.

** — 2. Simple examples of weak solutions — **

We now study weak solutions for some very simple equations, as a warmup for discussing weak solutions for Navier-Stokes.

We begin with an extremely simple initial value problem, the ODE

on a half-open time interval with , with initial condition , where and are given and is the unknown. Of course, when are smooth, then the fundamental theorem of calculus gives the unique solution

for . If one integrates the identity against a test function (that is to say, one multiplies both sides of this identity by and then integrates) on , one obtains

which upon integration by parts and rearranging gives

where we extend by zero to the open set . Thus, we have

in the sense of distributions (on ). More generally, if are locally integrable functions on , we say that is a *weak solution* to the initial value problem if (4) holds in the sense of distributions on . Thanks to the fundamental theorem of calculus for locally integrable functions, we still recover the unique solution (16):
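The integration by parts behind (4) can be spelled out as follows. This sketch assumes the notation $u$ for the unknown, $F$ for the forcing term, $u_0$ for the initial value, $[0,T)$ for the time interval, and $\delta$ for the Dirac mass at time zero, with $\phi$ a test function supported in $(-\infty, T)$ and $u, F$ extended by zero to negative times; these names are assumptions, since the displays are not reproduced here.

```latex
% Assumed: \partial_t u = F on [0,T), u(0) = u_0, \phi \in C^\infty_c((-\infty,T)).
\begin{align*}
 \int_0^T \partial_t u(t)\, \phi(t)\, dt &= \int_0^T F(t)\, \phi(t)\, dt \\
 -u_0\, \phi(0) - \int_0^T u(t)\, \partial_t \phi(t)\, dt &= \int_0^T F(t)\, \phi(t)\, dt \\
 -\int_{-\infty}^T u(t)\, \partial_t \phi(t)\, dt &= \int_{-\infty}^T F(t)\, \phi(t)\, dt + u_0\, \phi(0) \\
 \partial_t u &= F + u_0\, \delta
\end{align*}
```

The boundary term at $t = T$ is absent because $\phi$ vanishes near $T$, while the boundary term at $t = 0$ is what produces the Dirac mass encoding the initial condition.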

Exercise 4 Let be locally integrable functions (extended by zero to all of ), and let . Show that the following are equivalent:

Now let be a finite dimensional vector space, let be a continuous function, let , and consider the initial value problem

on some forward time interval . The Picard existence theorem lets us construct such solutions when is Lipschitz continuous and is small enough, but now we are merely requiring to be continuous and not necessarily Lipschitz. As in the preceding case, we introduce the notion of a weak solution. If is locally bounded (and measurable) on , then will be locally integrable on ; we then extend by zero to be distributions on , and we say that is a *weak solution* to (5) if one has

in the sense of distributions on , or equivalently that one has the identity

for all test functions compactly supported in . In this simple ODE setting, the notion of a weak solution coincides with stronger notions of solutions:

Exercise 5 Let be finite dimensional, let be continuous, let , and let be locally bounded and measurable. Show that the following are equivalent:

In particular, if the ODE initial value problem (5) exhibits finite time blowup for its (unique) classical solution, then it will also do so for weak solutions (with exactly the same blowup time). This will be in contrast with the situation for PDE, in which it is possible for weak solutions to persist beyond the time in which classical solutions exist.

Now we give a compactness argument to produce weak solutions (which will then be classical solutions, by the above exercise):

Proposition 6 (Weak existence) Let be a finite dimensional vector space, let , let , and let be a continuous function. Let be the time

Then there exists a continuously differentiable solution to the initial value problem (5) on .

*Proof:* By construction, we have

Using the Weierstrass approximation theorem (or Stone-Weierstrass theorem), we can express on as the uniform limit of Lipschitz continuous functions , such that

for all ; we can then extend in a Lipschitz continuous fashion to all of . (The Lipschitz constant of is permitted to diverge to infinity as .) Applying the Picard existence theorem (Theorem 8 of Notes 1), for each we have a (continuously differentiable) maximal Cauchy development of the initial value problem