For the sake of discussion we will just work in the non-periodic domain $\mathbf{R}^d$, $d \geq 2$, although the arguments here can be adapted without much difficulty to the periodic setting. We will only work with solutions in which the pressure $p$ is normalised in the usual fashion:

$$p = -\Delta^{-1} \nabla \cdot \nabla \cdot (u \otimes u). \qquad (2)$$

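Formally, the Euler equations (with normalised pressure) arise as the vanishing viscosity limit $\nu \to 0$ of the Navier-Stokes equations

$$\begin{aligned} \partial_t u + (u \cdot \nabla) u &= -\nabla p + \nu \Delta u \\ \nabla \cdot u &= 0 \\ p &= -\Delta^{-1} \nabla \cdot \nabla \cdot (u \otimes u) \\ u(0,x) &= u_0(x) \end{aligned} \qquad (3)$$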

that was studied in previous notes. However, because most of the bounds established in previous notes, either on the lifespan $T_*$ of the solution or on the size of the solution itself, depended on $\nu$, it is not immediate how to justify passing to the limit and obtain either a strong well-posedness theory or a weak solution theory for the limiting equation (1). (For instance, weak solutions to the Navier-Stokes equations (or the approximate solutions used to create such weak solutions) have $\nabla u$ lying in $L^2_t L^2_x$ for $\nu > 0$, but the bound on the norm is $O(\nu^{-1/2})$ and so one could lose this regularity in the limit $\nu \to 0$, at which point it is not clear how to ensure that the nonlinear term $(u \cdot \nabla) u$ still converges in the sense of distributions to what one expects.)

Nevertheless, by carefully using the energy method (which we will do loosely following an approach of Bertozzi and Majda), it is still possible to obtain *local-in-time* estimates on (high-regularity) solutions to (3) that are uniform in the limit $\nu \to 0$. Such *a priori* estimates can then be combined with a number of variants of these estimates to obtain a satisfactory local well-posedness theory for the Euler equations. Among other things, we will be able to establish the *Beale-Kato-Majda criterion*: smooth solutions to the Euler (or Navier-Stokes) equations can be continued indefinitely unless the integral

$$\int_0^{T_*} \| \omega(t) \|_{L^\infty_x(\mathbf{R}^d)}\, dt$$

becomes infinite at the final time $T_*$, where $\omega := \nabla \wedge u$ is the *vorticity* field. The vorticity has the important property that it is transported by the Euler flow, and in two spatial dimensions it can be used to establish global regularity for both the Euler and Navier-Stokes equations. (Unfortunately, in three and higher dimensions the phenomenon of vortex stretching has frustrated all attempts to date to use the vorticity transport property to establish global regularity of either equation in this setting.)

There is a rather different approach to establishing local well-posedness for the Euler equations, which relies on the *vorticity-stream* formulation of these equations. This will be discussed in a later set of notes.

** — 1. A priori bounds — **

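We now develop some *a priori* bounds for very smooth solutions to Navier-Stokes that are uniform in the viscosity $\nu$. Define an $H^\infty$ function to be a function that lies in every $H^s$ space; similarly define an $L^p_t H^\infty_x$ function to be a function that lies in $L^p_t H^s_x$ for every $s$. Given divergence-free $H^\infty(\mathbf{R}^d \to \mathbf{R}^d)$ initial data $u_0$, an $H^\infty$ mild solution $u$ to the Navier-Stokes initial value problem (3) is a solution that is an $H^s$ mild solution for all $s$. From the (non-periodic version of) Corollary 40 of Notes 1, we know that for any $H^\infty(\mathbf{R}^d \to \mathbf{R}^d)$ divergence-free initial data $u_0$, there is a unique $H^\infty$ maximal Cauchy development $u: [0,T_*) \times \mathbf{R}^d \to \mathbf{R}^d$, with $\|u\|_{L^\infty_t L^\infty_x([0,T_*) \times \mathbf{R}^d)}$ infinite if $T_*$ is finite.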

Here are our first bounds:

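Theorem 1 (A priori bound) Let $u: [0,T_*) \times \mathbf{R}^d \to \mathbf{R}^d$ be an $H^\infty$ maximal Cauchy development to (3) with initial data $u_0$.

- (i) For any integer $s > \frac{d}{2}+1$, we have
$$T_* \gtrsim_{s,d} \frac{1}{\|u_0\|_{H^s(\mathbf{R}^d)}}.$$
Furthermore, if $0 \leq T \leq \frac{c}{\|u_0\|_{H^s(\mathbf{R}^d)}}$ for a sufficiently small constant $c > 0$ depending only on $s,d$, then
$$\|u\|_{L^\infty_t H^s_x([0,T] \times \mathbf{R}^d)} \lesssim_{s,d} \|u_0\|_{H^s(\mathbf{R}^d)}.$$

- (ii) For any $0 < T < T_*$ and integer $s \geq 0$, one has
$$\|u\|_{L^\infty_t H^s_x([0,T] \times \mathbf{R}^d)} \lesssim_{s,d} \exp\big( O_{s,d}( \|\nabla u\|_{L^1_t L^\infty_x([0,T] \times \mathbf{R}^d)} ) \big) \|u_0\|_{H^s(\mathbf{R}^d)}.$$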

The hypothesis that $s$ is an integer can be dropped by more heavily exploiting the theory of paraproducts, but we shall restrict attention to integer $s$ for simplicity.

We now prove this theorem using the energy method. Using the Navier-Stokes equations, we see that $u$, $p$ and $\partial_t u$ all lie in $L^\infty_t H^\infty_x([0,T] \times \mathbf{R}^d)$ for any $0 < T < T_*$; an easy iteration argument then shows that the same is true for all higher derivatives of $u, p$ also. This will make it easy to justify the differentiation under the integral sign that we shall shortly perform.

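Let $s \geq 0$ be an integer. For each time $t \in [0,T_*)$, we introduce the energy-type quantity

$$E(t) := \sum_{m=0}^s \frac{1}{2} \int_{\mathbf{R}^d} |\nabla^m u(t,x)|^2\, dx.$$

Here we think of $\nabla^m u$ as taking values in the Euclidean space $\mathbf{R}^{d^{m+1}}$. This quantity is of course comparable to $\|u(t)\|_{H^s(\mathbf{R}^d)}^2$, up to constants depending on $d,s$. It is easy to verify that $E(t)$ is continuously differentiable in time, with derivative

$$\partial_t E(t) = \sum_{m=0}^s \int_{\mathbf{R}^d} \nabla^m u \cdot \nabla^m \partial_t u\, dx,$$

where we suppress explicit dependence on $t,x$ in the integrand for brevity. We now try to bound this quantity in terms of $E(t)$. We expand the right-hand side in coordinates using (3) to obtain

$$\partial_t E = -X_1 - X_2 + \nu X_3$$

where

$$\begin{aligned} X_1 &:= \sum_{m=0}^s \int_{\mathbf{R}^d} \nabla^m u_i \cdot \nabla^m (u_j \partial_j u_i)\, dx \\ X_2 &:= \sum_{m=0}^s \int_{\mathbf{R}^d} \nabla^m u_i \cdot \nabla^m \partial_i p\, dx \\ X_3 &:= \sum_{m=0}^s \int_{\mathbf{R}^d} \nabla^m u_i \cdot \nabla^m \partial_j \partial_j u_i\, dx. \end{aligned}$$

For $X_2$, we can integrate by parts to move the $\partial_i$ operator onto $u_i$ and use the divergence-free nature $\partial_i u_i = 0$ of $u$ to conclude that $X_2 = 0$. Similarly, we may integrate by parts for $X_3$ to move one copy of $\partial_j$ over to the other factor in the integrand to conclude

$$X_3 = -\sum_{m=0}^s \int_{\mathbf{R}^d} |\nabla^{m+1} u|^2\, dx$$

so in particular $X_3 \leq 0$ (note that as we are seeking bounds that are uniform in $\nu$, we can’t get much further use out of $X_3$ beyond this bound). Thus we have

$$\partial_t E \leq -X_1.$$

Now we expand out $X_1$ using the Leibniz rule. There is one dangerous term, in which all the derivatives in $\nabla^m$ fall on the $\partial_j u_i$ factor, giving rise to the expression

$$\sum_{m=0}^s \int_{\mathbf{R}^d} u_j\, \nabla^m u_i \cdot \nabla^m \partial_j u_i\, dx.$$

But we can locate a total derivative to write this as

$$\frac{1}{2} \sum_{m=0}^s \int_{\mathbf{R}^d} u_j\, \partial_j |\nabla^m u|^2\, dx$$

and then an integration by parts using $\partial_j u_j = 0$ as before shows that this term vanishes. Estimating the remaining contributions to $X_1$ using the triangle inequality, we arrive at the bound

$$|X_1| \lesssim_{s,d} \sum_{m=1}^s \sum_{a=1}^m \int_{\mathbf{R}^d} |\nabla^m u|\, |\nabla^a u|\, |\nabla^{m-a+1} u|\, dx.$$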

At this point we now need a variant of Proposition 35 from Notes 1:

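Exercise 2 Let $n, m \geq 0$ be integers. For any $f, g \in H^\infty(\mathbf{R}^d \to \mathbf{R})$, show that

$$\| |\nabla^n f|\, |\nabla^m g| \|_{L^2(\mathbf{R}^d)} \lesssim_{n,m,d} \|f\|_{L^\infty(\mathbf{R}^d)} \|g\|_{H^{n+m}(\mathbf{R}^d)} + \|f\|_{H^{n+m}(\mathbf{R}^d)} \|g\|_{L^\infty(\mathbf{R}^d)}.$$

(Hint: for $n=0$ or $m=0$, use Hölder’s inequality. Otherwise, use a suitable Littlewood-Paley decomposition.)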

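Using this exercise and Hölder’s inequality, we see that

$$\partial_t E(t) \lesssim_{s,d} \| \nabla u(t) \|_{L^\infty_x(\mathbf{R}^d)}\, E(t). \qquad (4)$$

By Gronwall’s inequality we conclude that

$$E(T) \leq \exp\big( O_{s,d}( \| \nabla u \|_{L^1_t L^\infty_x([0,T] \times \mathbf{R}^d)} ) \big)\, E(0)$$

for any $0 < T < T_*$ and $s \geq 0$, which gives part (ii).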

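Now assume $s > \frac{d}{2} + 1$. Then we have the Sobolev embedding

$$\| \nabla u(t) \|_{L^\infty_x(\mathbf{R}^d)} \lesssim_{s,d} \| u(t) \|_{H^s_x(\mathbf{R}^d)} \lesssim_{s,d} E(t)^{1/2}$$

which when inserted into (4) yields the differential inequality

$$\partial_t E(t) \lesssim_{s,d} E(t)^{3/2}$$

or equivalently

$$\partial_t \big( E(t)^{-1/2} \big) \geq -C_{s,d}$$

for some constant $C_{s,d}$ (strictly speaking one should work with $(\varepsilon + E(t))^{-1/2}$ for some small $\varepsilon > 0$ which one sends to zero later, if one wants to avoid the possibility that $E(t)$ vanishes, but we will ignore this small technicality for the sake of exposition). Since $E(0)^{1/2} \lesssim_{s,d} \|u_0\|_{H^s(\mathbf{R}^d)}$, we conclude that $E(t)$ stays bounded for a time interval of the form $0 \leq t < \min( T_*, \frac{c}{\|u_0\|_{H^s(\mathbf{R}^d)}} )$; this, together with the blowup criterion that the solution must become unbounded as $t \to T_*$ when $T_*$ is finite, gives part (i).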
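To make the last step concrete, here is a sketch of how a differential inequality of the form $\partial_t E \leq C E^{3/2}$ integrates to an explicit lifespan bound. The constant $C$ and the positivity of $E$ are assumptions of this sketch; the possible vanishing of $E$ is ignored, as in the text.

```latex
% Assumed: E(t) > 0 is the energy-type quantity, comparable to \|u(t)\|_{H^s}^2,
% and C = C_{s,d} is the implied constant in \partial_t E \le C E^{3/2}.
\begin{align*}
\partial_t \big( E(t)^{-1/2} \big)
  &= -\tfrac{1}{2}\, E(t)^{-3/2}\, \partial_t E(t) \;\ge\; -\tfrac{C}{2}, \\
E(t)^{-1/2} &\ge E(0)^{-1/2} - \tfrac{C}{2}\, t, \\
E(t) &\le 4\, E(0) \quad \text{whenever } 0 \le t \le \frac{1}{C\, E(0)^{1/2}}.
\end{align*}
```

Since $E(0)^{1/2}$ is comparable to the $H^s$ norm of the initial data, the time threshold in the last line is of the form "small constant divided by that norm", which is the shape of the lifespan bound in part (i).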

As a consequence, we can now obtain local existence for the Euler equations from smooth data:

Corollary 3 (Local existence for smooth solutions) Let be divergence-free. Let be an integer, and set

Then there is a smooth solution , to (1) with all derivatives of in for appropriate . Furthermore, for any integer , one has

*Proof:* We use the compactness method, which will be more powerful here than in the last section because we have much higher regularity uniform bounds (but they are only local in time rather than global). Let be a sequence of viscosities going to zero. By the local existence theory for Navier-Stokes (Corollary 40 of Notes 1), for each we have a maximal Cauchy development , to the Navier-Stokes initial value problem (3) with viscosity and initial data . From Theorem 1(i), we have for all (if is small enough), and

for all . By Sobolev embedding, this implies that

and then by Theorem 1(ii) one has

for every integer . Thus, for each , is bounded in , uniformly in . By repeatedly using (3) and product estimates for Sobolev spaces, we see the same is true for , and for all higher derivatives of . In particular, all derivatives of are equicontinuous.

Using weak compactness (Proposition 2 of Notes 2), one can pass to a subsequence such that converge weakly to some limits , such that and all their derivatives lie in on ; in particular, are smooth. From the Arzelá-Ascoli theorem (and Proposition 3 of Notes 2), and converge locally uniformly to , and similarly for all derivatives of . One can then take limits in (3) and conclude that solve (1). The bound (5) follows from taking limits in (6).

Remark 4We are able to easily pass to the zero viscosity limit here because our domain has no boundary. In the presence of a boundary, we cannot freely differentiate in space as casually as we have been doing above, and one no longer has bounds on higher derivatives on and near the boundary that are uniform in the viscosity. Instead, it is possible for the fluid to form a thin boundary layer that has a non-trivial effect on the limiting dynamics. We hope to return to this topic in a future set of notes.

We have constructed a local smooth solution to the Euler equations from smooth data, but have not yet established uniqueness or continuous dependence on the data; related to the latter point, we have not extended the construction to larger classes of initial data than the smooth class . To accomplish these tasks we need a further *a priori* estimate, now involving *differences* of two solutions, rather than just bounding a single solution:

Theorem 5 (A priori bound for differences) Let , let be an integer, and let be divergence-free with norm at most . Let

where is sufficiently small depending on . Let and be an solution to (1) with initial data (this exists thanks to Corollary 3), and let and be an solution to (1) with initial data . Then one has

Note the asymmetry between and in (8): this estimate requires control on the initial data in the high regularity space in order to be usable, but has no such requirement on the initial data . This asymmetry will be important in some later applications.

*Proof:* From Corollary 3 we have

Now we need bounds on the difference . Initially we have , where . To evolve later in time, we will need to use the energy method. Subtracting (1) for and , we have

By hypothesis, all derivatives of and lie in on , which will allow us to justify the manipulations below without difficulty. We introduce the low regularity energy for the difference:

Arguing as in the proof of Theorem 1, we see that

where

As before, the divergence-free nature of ensures that vanishes. For , we use the Leibniz rule and again extract out the dangerous term

which again vanishes by integration by parts. We then use the triangle inequality to bound

Using Exercise 2 and Hölder, we may bound this by

which by Sobolev embedding gives

Applying (9) and Gronwall’s inequality, we conclude that

for , and (7) follows.

Now we work with the high regularity energy

Arguing as before we have

Using Exercise 2 and Hölder, we may bound this by

Using Sobolev embedding we thus have

By the chain rule, we obtain

(one can work with in place of and then send later if one wishes to avoid a lack of differentiability at ). By Gronwall’s inequality, we conclude that

for all , and (8) follows.

By specialising (7) (or (8)) to the case where , we see the solution constructed in Corollary 3 is unique. Now we can extend to wider classes of initial data than initial data. The following result is essentially due to Kato and to Swann (with a similar result obtained by different methods by Ebin-Marsden):

Proposition 6 Let be an integer, and let be divergence-free. Set

where is sufficiently small depending on . Let be a sequence of divergence-free vector fields converging to in norm (for instance, one could apply Littlewood-Paley projections to ). Let , be the associated solutions to (1) provided by Corollary 3 (these are well-defined for large enough). Then and converge in norm on to limits , respectively, which solve (1) in a distributional sense.

*Proof:* We use a variant of Kato’s argument (see also the paper of Bona and Smith for a related technique). It will suffice to show that the form a Cauchy sequence in , since the algebra properties of then give the same for , and one can then easily take limits (in this relatively high regularity setting) to obtain the limiting solution that solves (1) in a distributional sense.

Let $N$ be a large dyadic integer. By Corollary 3, we may find an $H^\infty$ solution to the Euler equations (1) with initial data $P_{\leq N} u_0$ (which lies in $H^\infty$). From Theorem 5, one has

Applying the triangle inequality and then taking limit superior, we conclude that

But by Plancherel’s theorem and dominated convergence we see that

as , and hence

giving the claim.

Remark 7Since the sequence can converge to at most one limit , we see that the solution to (1) is unique in the class of distributional solutions that are limits of smooth solutions (with initial data of those solutions converging to in ). However, this leaves open the possibility that there are other distributional solutions that do not arise as the limits of smooth solutions (or as limits of smooth solutions whose initial data only converge to in a weaker sense). It is possible to recover some uniqueness results for fairly weak solutions to the Euler equations if one also assumes some additional regularity on the fields (or on related fields such as the vorticity ). In two dimensions, for instance, there is a celebrated theorem of Yudovich that weak solutions to 2D Euler are unique if one has an bound on the vorticity. In higher dimensions one can also obtain uniqueness results if one assumes that the solution is in a high-regularity space such as , . See for instance this paper of Chae for an example of such a result.

Exercise 8 (Continuous dependence on initial data)Let be an integer, let , and set , where is sufficiently small depending on . Let be the closed ball of radius around the origin of divergence-free vector fields in . The above proposition provides a solution to the associated initial value problem. Show that the map from to is a continuous map from to .

Remark 9The continuity result provided by the above exercise is not as strong as in Navier-Stokes, where the solution map is in fact Lipschitz continuous (see e.g., Exercise 43 of Notes 1). In fact for the Euler equations, which is classified as a “quasilinear” equation rather than a “semilinear” one due to the lack of the dissipative term in the equation, the solution map is not expected to be uniformly continuous on this ball, let alone Lipschitz continuous. See this previous blog post for some more discussion.

Exercise 10 (Maximal Cauchy development)Let be an integer, and let be divergence free. Show that there exists a unique and unique , with the following properties:

- (i) If and is divergence-free and converges to in norm, then for large enough, there is an solution to (1) with initial data on , and furthermore and converge in norm on to .
- (ii) If , then as .
- (iii) If $T_* < \infty$, then we have the *weak Beale-Kato-Majda criterion*
$$\int_0^{T_*} \| \nabla u(t) \|_{L^\infty_x(\mathbf{R}^d)}\, dt = \infty.$$

Furthermore, show that do not depend on the particular choice of , in the sense that if belongs to both and for two integers then the time and the fields produced by the above claims are the same for both and .

We will refine part (iii) of the above exercise in the next section. It is a major open problem as to whether the case (i.e., finite time blowup) can actually occur. (It is important here that we have some spatial decay at infinity, as represented here by the presence of the norm; when the solution is allowed to diverge at spatial infinity, it is not difficult to construct smooth solutions to the Euler equations that blow up in finite time; see e.g., this article of Stuart for an example.)

Remark 11 The condition that recurs in the above results can be explained using the heuristics from Section 5 of Notes 1. Assume that at a given time , the velocity field fluctuates at a spatial frequency , with the fluctuations being of amplitude . (We however permit the velocity field to contain a “bulk” low frequency component which can have much higher amplitude than ; for instance, the first component of might take the form where is a quantity much larger than .) Suppose one considers the trajectories of two particles whose separation at time zero is comparable to the wavelength of the frequency oscillation. Then the relative velocities of will differ by about , so one would expect the particles to stay roughly the same distance from each other up to time , and then exhibit more complicated and unpredictable behaviour after that point. Thus the natural time scale here is , so one only expects to have a reasonable local well-posedness theory in the regime

On the other hand, if lies in , and the frequency fluctuations are spread out over a set of volume , the heuristics from the previous notes predict that

The uncertainty principle predicts , and so

Thus we force the regime (11) to occur if , and barely have a chance of doing so in the endpoint case , but would not expect to have a local theory (at least using the sort of techniques deployed in this section) for .
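The heuristic in this remark can be summarised in a few lines. The symbols below ($N$ for the spatial frequency, $A$ for the amplitude of the fluctuation, $V$ for the volume of the region it occupies, $T$ for the time scale, $s$ for the Sobolev exponent) are labels chosen for this sketch, and every relation is a heuristic rather than a rigorous bound.

```latex
\begin{align*}
T &\lesssim \frac{1}{A N}
  && \text{(particles at separation $1/N$ drift apart at relative speed $\sim A$)} \\
\|u\|_{H^s} &\gtrsim A\, N^{s}\, V^{1/2}
  && \text{(amplitude $A$ at frequency $N$ on a set of volume $V$)} \\
V &\gtrsim N^{-d}
  && \text{(uncertainty principle)} \\
A N &\lesssim N^{\frac{d}{2} + 1 - s}\, \|u\|_{H^s}
  && \text{(combine the previous two lines)}
\end{align*}
```

For frequencies $N \gtrsim 1$ the exponent $\frac{d}{2} + 1 - s$ is negative exactly when $s > \frac{d}{2} + 1$. In that case $AN \lesssim \|u\|_{H^s}$, so a time scale of order $1/\|u\|_{H^s}$ stays inside the regime of the first line.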

Exercise 12 Use similar heuristics to explain the relevance of quantities of the form that occur in various places in this section.

Because the solutions constructed in Exercise 10 are limits (in rather strong topologies) of smooth solutions, it is fairly easy to extend estimates and conservation laws that are known for smooth solutions to these slightly less regular solutions. For instance:

Exercise 13Let be as in Exercise 10.

- (i) (Energy conservation) Show that for all .
- (ii) Show that
for all .

Exercise 14 (Vanishing viscosity limit) Let the notation and hypotheses be as in Corollary 3. For each , let , be the solution to (3) with this choice of viscosity and with initial data . Show that as , and converge locally uniformly to , and similarly for all derivatives of and . (In other words, there is actually no need to pass to a subsequence as is done in the proof of Corollary 3.) Hint: apply the energy method to control the difference .

Exercise 15 (Local existence for forced Euler)Let be divergence-free, and let , thus is smooth and for any and any integer and , . Show that there exists and a smooth solution to the forced Euler equation

Note: one will first need a local existence theory for the forced Navier-Stokes equation. It is also possible to develop forced analogues of most of the other results in this section, but we will not detail this here.

** — 2. The Beale-Kato-Majda blowup criterion — **

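In Exercise 10 we saw that we could continue $H^s$ solutions to the Euler equations indefinitely in time, unless the integral $\int_0^{T_*} \| \nabla u(t) \|_{L^\infty_x(\mathbf{R}^d)}\, dt$ became infinite at some finite time $T_*$. There is an important refinement of this blowup criterion, due to Beale, Kato, and Majda, in which the tensor $\nabla u$ is replaced by the vorticity two-form (or vorticity, for short)

$$\omega := \nabla \wedge u,$$

that is to say $\omega$ is essentially the anti-symmetric component of $\nabla u$. Whereas $\nabla u$ is the tensor field

$$(\nabla u)_{ij} = \partial_i u_j,$$

$\omega$ is the anti-symmetric tensor field

$$\omega_{ij} = \partial_i u_j - \partial_j u_i. \qquad (12)$$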

Remark 16 In two dimensions, is essentially a scalar, since and . As such, it is common in fluid mechanics to refer to the scalar field as the vorticity, rather than the two-form . In three dimensions, there are three independent components of the vorticity, and it is common to view as a vector field rather than a two-form in this case (actually, to be precise would be a pseudovector field rather than a vector field, because it behaves slightly differently to vectors with respect to changes of coordinate). With this interpretation, the vorticity is now the curl of the velocity field . From a differential geometry viewpoint, one can view the two-form as an antisymmetric bilinear map from vector fields to scalar functions , and the relation between the vorticity two-form and the vorticity (pseudo-)vector field in is given by the relation

for arbitrary vector fields , where is the volume form on , which can be viewed in three dimensions as an antisymmetric trilinear form on vector fields. The fact that is a pseudovector rather than a vector then arises from the fact that the volume form changes sign upon applying a reflection.

The point is that vorticity behaves better under the Euler flow than the full derivative . Indeed, if one takes a smooth solution to the Euler equation in coordinates

and applies to both sides, one obtains

If one interchanges and then subtracts, the pressure terms disappear, and one is left with

which we can rearrange using the material derivative as

Writing and , this becomes the *vorticity equation*
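The derivation just described can be written out in coordinates as follows. This is a sketch under assumed conventions: $\omega_{ij} := \partial_i u_j - \partial_j u_i$, the material derivative $D_t := \partial_t + u_k \partial_k$, and summation over repeated indices; the index names are chosen here for illustration.

```latex
% Assumed conventions: \omega_{ij} := \partial_i u_j - \partial_j u_i,
% D_t := \partial_t + u_k \partial_k, summation over repeated indices.
\begin{align*}
\partial_t u_j + u_k \partial_k u_j &= -\partial_j p, \\
\partial_t \partial_i u_j + u_k \partial_k \partial_i u_j
  + (\partial_i u_k)(\partial_k u_j) &= -\partial_i \partial_j p, \\
D_t \omega_{ij} + (\partial_i u_k)(\partial_k u_j)
  - (\partial_j u_k)(\partial_k u_i) &= 0, \\
D_t \omega_{ij} + \omega_{kj}\, \partial_i u_k - \omega_{ki}\, \partial_j u_k &= 0.
\end{align*}
```

The third line comes from swapping $i,j$ in the second and subtracting, which cancels the symmetric pressure term. The last line uses $\partial_k u_j = \omega_{kj} + \partial_j u_k$ and $\partial_k u_i = \omega_{ki} + \partial_i u_k$, after which the terms quadratic in $\nabla u$ that do not involve $\omega$ cancel.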

The vorticity equation is particularly simple in two and three dimensions:

Exercise 17 (Transport of vorticity)Let be a smooth solution to Euler equation in , and let be the vorticity two-form.

- (i) If , show that
- (ii) If , show that
where is the vorticity pseudovector.

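Remark 18 One can interpret the vorticity equation in the language of differential geometry, which is a more convenient formalism when working on more general Riemannian manifolds than $\mathbf{R}^d$. To be consistent with the conventions of differential geometry, we now write the components of the velocity field $u$ as $u^i$ rather than $u_i$ (and the coordinates of $\mathbf{R}^d$ as $x^i$ rather than $x_i$). Define the *covelocity $1$-form* $v$ as

$$v := \eta_{ij} u^i\, dx^j$$

where $\eta$ is the Euclidean metric tensor (in the standard coordinates, $\eta_{ij} = \delta_{ij}$ is the Kronecker delta, though $\eta_{ij}$ can take other values than $\delta_{ij}$ if one uses a different coordinate system). Thus in coordinates, $v_j = \eta_{ij} u^i$; the covelocity field is thus the musical isomorphism applied to the velocity field. The vorticity $2$-form $\omega$ can then be interpreted as the exterior derivative of the covelocity, thus

$$\omega = dv$$

or in coordinates

$$\omega_{ij} = \partial_i v_j - \partial_j v_i.$$

The Euler equations can be rearranged as

$$\partial_t v + \mathcal{L}_u v = -d \tilde p \qquad (14)$$

where $\mathcal{L}_u$ is the Lie derivative along $u$, which for $1$-forms is given in coordinates as

$$(\mathcal{L}_u v)_i = u^j \partial_j v_i + (\partial_i u^j) v_j$$

and $\tilde p$ is the modified pressure

$$\tilde p := p - \frac{1}{2} u^j v_j.$$

If one takes exterior derivatives of both sides of (14) using the basic differential geometry identities $d \mathcal{L}_u = \mathcal{L}_u d$ and $dd = 0$, one obtains the vorticity equation

$$\partial_t \omega + \mathcal{L}_u \omega = 0$$

where the Lie derivative for $2$-forms is given in coordinates as

$$(\mathcal{L}_u \omega)_{ij} = u^k \partial_k \omega_{ij} + (\partial_i u^k) \omega_{kj} + (\partial_j u^k) \omega_{ik}$$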

and so we recover (13) after some relabeling.

We now present the Beale-Kato-Majda condition.

Theorem 19 (Beale-Kato-Majda)Let be an integer, and let be divergence free. Let , be the maximal Cauchy development from Exercise 10, and let be the vorticity.

The double exponential in (i) is not a typo! It is an open question though as to whether this double exponential bound can be at all improved, even in the simplest case of two spatial dimensions.

We turn to the proof of this theorem. Part (ii) will be implied by part (i), since if is finite then part (i) gives a uniform bound on as , preventing finite time blowup. So it suffices to prove part (i). To do this, it suffices to do so for solutions, since one can then pass to a limit (using the strong continuity in ) to establish the general case. In particular, we can now assume that are smooth.

We would like to convert control on back to control of the full derivative . If one takes divergences of the vorticity using (12) and the divergence-free nature of , we see that

Thus, we can recover the derivative from the vorticity by the formula

where one can define via the Fourier transform as a multiplier bounded on every space.

If the operators were bounded in , then we would have

and the claimed bound (15) would follow from Theorem 1(ii) (with one exponential to spare). Unfortunately, is not quite bounded on . Indeed, from Exercise 18 of Notes 1 we have the formula

for any test function and , where is the singular kernel

If one sets to be a (smooth approximation) to the signum restricted to an annulus , we conclude that the operator norm of is at least as large as

But one can calculate using polar coordinates that this expression diverges like $\log \frac{R}{r}$ in the limit $r \to 0$, $R \to \infty$, giving unboundedness.
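The polar-coordinate calculation can be sketched as follows, under the assumption (not displayed above) that the singular kernel is homogeneous of degree $-d$, say $K(x) = \Omega(x/|x|)/|x|^d$ for a bounded function $\Omega$ on the unit sphere that is not identically zero. The symbols $\Omega$, $\rho$, $\theta$ and the annulus radii $r < R$ are notation for this sketch only.

```latex
\begin{align*}
\int_{r \le |x| \le R} |K(x)|\, dx
  &= \int_{S^{d-1}} \int_r^R \frac{|\Omega(\theta)|}{\rho^{d}}\, \rho^{d-1}\, d\rho\, d\theta \\
  &= \Big( \int_{S^{d-1}} |\Omega(\theta)|\, d\theta \Big) \int_r^R \frac{d\rho}{\rho} \\
  &= \Big( \int_{S^{d-1}} |\Omega(\theta)|\, d\theta \Big) \log \frac{R}{r}.
\end{align*}
```

The degree $-d$ homogeneity exactly balances the Jacobian factor $\rho^{d-1}$, leaving the scale-invariant measure $d\rho/\rho$, whose integral is logarithmic.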

As it turns out, though, the Gronwall argument used to establish Theorem 1(ii) can just barely tolerate an additional “logarithmic loss” of the above form, albeit at the cost of worsening the exponential term to a double exponential one. The key lemma is the following result that quantifies the logarithmic divergence indicated by the previous calculation, and is similar in spirit to a well known inequality of Brezis and Wainger.

Lemma 20 (Near-boundedness of )For any and , one has

The lower order terms will be easily dealt with in practice; the main point is that one can almost bound the norm of by that of , up to a logarithmic factor.

*Proof:* By a limiting argument we may assume that is a test function. We apply Littlewood-Paley decomposition to write

and hence by the triangle inequality we may bound the left-hand side of (17) by

where we omit the domain and range from the function space norms for brevity.

By Bernstein’s inequality we have

Also, from Bernstein and Plancherel we have

and hence by geometric series we have

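for any $N_0 \geq 1$. This gives an acceptable contribution if we select $N_0$ to be a suitable power of $2 + \|\omega\|_{H^s}$. This leaves $O(\log(2 + \|\omega\|_{H^s}))$ remaining values of $N$ to control, so we will be done if one can bound

$$\| P_N \Delta^{-1} \partial_i \partial_j \omega \|_{L^\infty} \lesssim_d \| \omega \|_{L^\infty} \qquad (18)$$

uniformly in $N$. Observe from applying the scaling $x \mapsto x/N$ (that is, replacing $\omega(x)$ with $\omega(x/N)$) that to prove (18) for all $N$ it suffices to do so for $N = 1$. By Fourier analysis, the function $P_1 \Delta^{-1} \partial_i \partial_j \omega$ is the convolution of $\omega$ with the inverse Fourier transform of the function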

This function is a test function, so is a Schwartz function, and the claim now follows from Young’s inequality.
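One way to tally the three frequency ranges in this proof is sketched below. The sketch assumes $s > d/2$, writes $T := \Delta^{-1} \partial_i \partial_j$ as shorthand, takes $P_N$ to be Littlewood-Paley projections to dyadic frequencies $N$, and uses $N_0$ for the frequency cutoff; these normalisations are assumptions of the sketch rather than quotations from the text.

```latex
\begin{align*}
\sum_{N \le 1} \| P_N T \omega \|_{L^\infty}
  &\lesssim \sum_{N \le 1} N^{d/2}\, \| \omega \|_{L^2}
   \lesssim \| \omega \|_{L^2}, \\
\sum_{N > N_0} \| P_N T \omega \|_{L^\infty}
  &\lesssim \sum_{N > N_0} N^{d/2 - s}\, \| \omega \|_{H^s}
   \lesssim N_0^{d/2 - s}\, \| \omega \|_{H^s}, \\
\sum_{1 < N \le N_0} \| P_N T \omega \|_{L^\infty}
  &\lesssim \log(2 + N_0)\, \| \omega \|_{L^\infty}.
\end{align*}
```

The first two lines use Bernstein's inequality and Plancherel's theorem, and the third uses a frequency-localised $L^\infty$ bound on each of the roughly $\log N_0$ dyadic scales. Taking $N_0 = (2 + \|\omega\|_{H^s})^{1/(s - d/2)}$ makes the second line $O(1)$ while keeping $\log(2 + N_0)$ comparable to $\log(2 + \|\omega\|_{H^s})$.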

We return now to the proof of (15). We adapt the proof of Theorem 1. As in that theorem, we introduce the higher energy

We no longer have the viscosity term as , but that term was discarded anyway in the analysis. From (4) we have

Applying (16), (20) one thus has

From Exercise 13 one has

By the chain rule, one then has

and hence by Gronwall’s inequality one has

The claim (15) follows.
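To see where the double exponential comes from, here is a sketch under a simplifying assumption: suppose the energy $E(t) \geq 0$ obeys $\partial_t E \leq C\, a(t)\, (2+E) \log(2+E)$, where $a(t)$ stands for the sup norm of the vorticity at time $t$ plus lower-order terms. The function $a$ and the constant $C$ are notation introduced for this sketch only.

```latex
\begin{align*}
\partial_t \log(2 + E(t))
  &= \frac{\partial_t E(t)}{2 + E(t)} \le C\, a(t)\, \log(2 + E(t)), \\
\log(2 + E(t))
  &\le \log(2 + E(0))\, \exp\Big( C \int_0^t a(t')\, dt' \Big), \\
2 + E(t)
  &\le (2 + E(0))^{\exp\left( C \int_0^t a(t')\, dt' \right)}.
\end{align*}
```

The first exponential comes from Gronwall's inequality applied to $\log(2+E)$, which is bounded below by $\log 2 > 0$, and the second comes from undoing the logarithm.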

Remark 21The Beale-Kato-Majda criterion can be sharpened a little bit, by replacing the sup norm with slightly smaller norms, such as the bounded mean oscillation (BMO) norm of , basically by improving the right-hand side of Lemma 20 slightly. See for instance this paper of Planchon and the references therein.

Remark 22 An inspection of the proof of Theorem 19 reveals that the same result holds if the Euler equations are replaced by the Navier-Stokes equations; the energy estimates acquire an additional “” term by doing so (as in the proof of Theorem 1), but the sign of that term is favorable.

We now apply the Beale-Kato-Majda criterion to obtain global well-posedness for the Euler equations in two dimensions:

Theorem 23 (Global well-posedness)Let be as in Exercise 10. If , then .

This theorem will be immediate from Theorem 19 and the following conservation law:

Proposition 24 (Conservation of vorticity distribution) Let be as in Exercise 10 with . Then one has

for all and .

*Proof:* By a limiting argument it suffices to show the claim for , thus we need to show

By another limiting argument we can take to be an solution. By the monotone convergence theorem (and Sobolev embedding), it suffices to show that

whenever is a test function that vanishes in a neighbourhood of the origin . Note that as and all its derivatives are in on for every , it is Lipschitz in space and time, which among other things implies that the level sets are compact for every , and so is smooth and compactly supported in . We may therefore differentiate under the integral sign to obtain

where we omit explicit dependence on for brevity. By Exercise 17(i), the right-hand side is

which one can write as a total derivative

which vanishes thanks to integration by parts and the divergence-free nature of . The claim follows.
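In two dimensions the computation can be written out as follows, treating $\omega$ as a scalar that obeys the transport equation $\partial_t \omega + (u \cdot \nabla) \omega = 0$ and writing $F$ for the test function that vanishes near the origin (the name $F$ is an assumption of this sketch).

```latex
\begin{align*}
\partial_t \int_{\mathbf{R}^2} F(\omega)\, dx
  &= \int_{\mathbf{R}^2} F'(\omega)\, \partial_t \omega\, dx \\
  &= -\int_{\mathbf{R}^2} F'(\omega)\, (u \cdot \nabla) \omega\, dx \\
  &= -\int_{\mathbf{R}^2} u \cdot \nabla \big( F(\omega) \big)\, dx \\
  &= -\int_{\mathbf{R}^2} \nabla \cdot \big( u\, F(\omega) \big)\, dx = 0.
\end{align*}
```

The passage to the last line uses $\nabla \cdot u = 0$, and the final integral vanishes because $F(\omega)$ is compactly supported in space.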

The above proposition shows that in two dimensions, is constant, and so the integral cannot diverge for finite . Applying Theorem 19, we obtain Theorem 23. We remark that global regularity for two-dimensional Euler was established well before the Beale-Kato-Majda theorem, starting with the work of Wolibner.

One can adapt this argument to the Navier-Stokes equations:

Exercise 25Let be an integer, let , let be divergence-free, and let , be a maximal Cauchy development to the Navier-Stokes equations with initial data . Let be the vorticity.

- (i) Establish the vorticity equation .
- (ii) Show that for all and . (Note: to adapt the proof of Proposition 24, one should restrict attention to functions that are convex on the range of on, say, . The case of this inequality can also be established using the maximum principle for parabolic equations.)
- (iii) Show that .

Remark 26There are other ways to establish global regularity for two-dimensional Navier-Stokes (originally due to Ladyzhenskaya); for instance, the bound on the vorticity in Exercise 25(ii), combined with energy conservation, gives a uniform bound on the velocity field, which can then be inserted into (the non-periodic version of) Theorem 38 of Notes 1.


Remark 27 If solve the Euler equations on some time interval with initial data , then the time-reversed fields solve the Euler equations on the reflected interval with initial data . Because of this time reversal symmetry, the local and global well-posedness theory for the Euler equations can also be extended backwards in time; for instance, in two dimensions any divergence free initial data leads to an solution to the Euler equations on the whole time interval . However, the Navier-Stokes equations are very much *not* time-reversible in this fashion.
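The time-reversal symmetry can be checked directly. The formulas for the reversed fields used below, $\tilde u(t,x) := -u(-t,x)$ and $\tilde p(t,x) := p(-t,x)$, are the standard choice and are written out here as an assumption, since they are not displayed in the remark.

```latex
% Assumed definitions: \tilde u(t,x) := -u(-t,x), \tilde p(t,x) := p(-t,x).
\begin{align*}
\partial_t \tilde u(t,x) &= (\partial_t u)(-t,x), \\
(\tilde u \cdot \nabla) \tilde u(t,x) &= \big( (u \cdot \nabla) u \big)(-t,x), \\
\partial_t \tilde u + (\tilde u \cdot \nabla) \tilde u
  &= \big( \partial_t u + (u \cdot \nabla) u \big)(-t,x) = -\nabla \tilde p(t,x), \\
\nu \Delta \tilde u(t,x) &= -\big( \nu \Delta u \big)(-t,x).
\end{align*}
```

The last line shows why the same substitution fails for Navier-Stokes: the viscosity term changes sign, so the reversed fields solve the equation with viscosity $-\nu$ instead of $\nu$.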

However, it is possible to construct “weak” solutions which lack many of the desirable features of strong solutions (notably, uniqueness, propagation of regularity, and conservation laws) but can often be constructed globally in time even when one is unable to do so for strong solutions. Broadly speaking, one usually constructs weak solutions by some sort of “compactness method”, which can generally be described as follows.

- Construct a sequence of “approximate solutions” to the desired equation, for instance by developing a well-posedness theory for some “regularised” approximation to the original equation. (This theory often follows similar lines to those in the previous set of notes, for instance using such tools as the contraction mapping theorem to construct the approximate solutions.)
- Establish some *uniform* bounds (over appropriate time intervals) on these approximate solutions, even in the limit as an approximation parameter is sent to zero. (Uniformity is key; *non-uniform* bounds are often easy to obtain if one puts enough “mollification”, “hyper-dissipation”, or “discretisation” in the approximating equation.)
- Use some sort of “weak compactness” (e.g., the Banach-Alaoglu theorem, the Arzela-Ascoli theorem, or the Rellich compactness theorem) to extract a subsequence of approximate solutions that converge (in a topology weaker than that associated to the available uniform bounds) to a limit. (Note that there is no reason *a priori* to expect such limit points to be unique, or to have any regularity properties beyond that implied by the available uniform bounds.)
- Show that this limit solves the original equation in a suitable weak sense.

The quality of these weak solutions is very much determined by the type of uniform bounds one can obtain on the approximate solution; the stronger these bounds are, the more properties one can obtain on these weak solutions. For instance, if the approximate solutions enjoy an energy identity leading to uniform energy bounds, then (by using tools such as Fatou’s lemma) one tends to obtain energy *inequalities* for the resulting weak solution; but if one somehow is able to obtain uniform bounds in a higher regularity norm than the energy then one can often recover the full energy *identity*. If the uniform bounds are at the regularity level needed to obtain well-posedness, then one generally expects to upgrade the weak solution to a strong solution. (This phenomenon is often formalised through *weak-strong uniqueness* theorems, which we will discuss later in these notes.) Thus we see that as far as attacking global regularity is concerned, both the theory of strong solutions and the theory of weak solutions encounter essentially the same obstacle, namely the inability to obtain uniform bounds on (exact or approximate) solutions at high regularities (and at arbitrary times).

For simplicity, we will focus our discussion in these notes on finite energy weak solutions on . There is a completely analogous theory for periodic weak solutions on (or equivalently, weak solutions on the torus ), which we will leave to the interested reader.

In recent years, a completely different way to construct weak solutions to the Navier-Stokes or Euler equations has been developed that is not based on the above compactness methods, but is instead based on techniques of convex integration. These will be discussed in a later set of notes.

** — 1. A brief review of some aspects of distribution theory — **

We have already been using the concept of a distribution in previous notes, but we will rely more heavily on this theory in this set of notes, so we pause to review some key aspects of the theory. A more comprehensive discussion of distributions may be found in this previous blog post. To avoid some minor subtleties involving complex conjugation that are not relevant for this post, we will restrict attention to real-valued (scalar) distributions here. (One can then define vector-valued distributions (taking values in a finite-dimensional vector space) as a vector of scalar-valued distributions.)

Let us work in some non-empty open subset of a Euclidean space (which may eventually correspond to space, time, or spacetime). We recall that is the space of (real-valued) test functions . It has a rather subtle topological structure (see previous notes) which we will not detail here. A (real-valued) distribution on is a continuous linear functional from test functions to the reals . (This pairing may also be denoted or in other texts.) There are two basic examples of distributions to keep in mind:

- Any locally integrable function gives rise to a distribution (which by abuse of notation we also call ) by the formula .
- Any Radon measure gives rise to a distribution (which we will again call ) by the formula . For instance, if , the Dirac mass at is a distribution with .

Two distributions are equal in the sense of distributions if for all . For instance, it is not difficult to show that two locally integrable functions are equal in the sense of distributions if and only if they agree almost everywhere, and two Radon measures are equal in the sense of distributions if and only if they are identical.

As a general principle, any “linear” operation that makes sense for “nice” functions (such as test functions) can also be defined for distributions, but any “nonlinear” operation is unlikely to be usefully defined for arbitrary distributions (though it may still be a good concept to use for distributions with additional regularity). For instance, one can take a partial derivative (known as the weak derivative) of any distribution by the definition

for all . Note that this definition agrees with the “strong” or “classical” notion of a derivative when is a smooth function, thanks to integration by parts. Similarly, if is smooth, one can define the product distribution by the formula

for all . One can also take linear combinations of two distributions in the usual fashion, thus

for all and .

Exercise 1 Let be a connected open subset of . Let be a distribution on such that in the sense of distributions for all . Show that is a constant, that is to say there exists such that in the sense of distributions.

A sequence of distributions is said to converge in the weak-* sense or *converge in the sense of distributions* to another distribution if one has

as for every test function ; in this case we write . This notion of convergence is sometimes referred to also as weak convergence (and one writes instead of ), although there is a subtle distinction between weak and weak-* convergence in non-reflexive spaces and so I will try to avoid this terminology (though in many cases one will be working in a reflexive space in which there is no distinction).

The linear operations alluded to above tend to be continuous in the distributional sense. For instance, it is easy to see that if , then for all , and for any smooth ; similarly, if , , and , are sequences of real numbers, then .

Suppose that one places a norm or seminorm on . Then one can define a subspace of the space of distributions, defined to be the space of all distributions for which the norm

is finite. For instance, if is the norm for some , then is just the dual space (with the (equivalence classes of) locally integrable functions in identified with distributions as above).

We have the following version of the Banach-Alaoglu theorem which allows us to easily create sequences that converge in the sense of distributions:

Proposition 2 (Variant of Banach-Alaoglu) Suppose that is a norm or seminorm on which makes the space separable. Let be a bounded sequence in . Then there is a subsequence of the which converges in the sense of distributions to a limit .

*Proof:* By hypothesis, there is a constant such that

for all . For each given , we may thus pass to a subsequence of such that converges to a limit. Passing to a subsequence a countably infinite number of times and using the Arzelá-Ascoli diagonalisation trick, we can thus find a dense subset of (using the metric) and a subsequence of the such that the limit exists for every , and hence for every by a limiting argument and (1). If one then defines to be the function

then one can verify that is a distribution, and by (1) we will have . By construction, converges in the sense of distributions to , and we are done.

It is important to note that there is no uniqueness claimed for ; while any given subsequence of the can have at most one limit , it is certainly possible for different subsequences to converge to different limits. Also, the proposition only applies for spaces that have preduals ; this covers many popular function spaces, such as spaces for , but omits endpoint spaces such as or . (For instance, approximations to the identity are uniformly bounded in , but converge weakly to a Dirac mass, which lies outside of .)

From the definition we see that if , then we have the Fatou-type lemma

Thus, upper bounds on the approximating distributions are usually inherited by their limit . However, it is essential to be aware that the same is not true for lower bounds; there can be “loss of mass” in the limit. The following four examples illustrate some key ways in which this can occur:

- (Escape to spatial infinity) If is a non-zero test function, and is a sequence in going to infinity, then the translations of converge in the sense of distributions to zero, even though they will not go to zero in many function space norms (such as ).
- (Escape to frequency infinity) If is a non-zero test function, and is a sequence in going to infinity, then the modulations of converge in the sense of distributions to zero (cf. the Riemann-Lebesgue lemma), even though they will not go to zero in many function space norms (such as ).
- (Escape to infinitely fine scales) If , is a sequence of positive reals going to infinity, and , then the sequence converges in the sense of distributions to zero, but will not go to zero in several function space norms (e.g. with ).
- (Escape to infinitely coarse scales) If , is a sequence of positive reals going to zero, and , then the sequence converges in the sense of distributions to zero, but will not go to zero in several function space norms (e.g. with ).
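The first of these mechanisms is easy to see numerically. The following is a minimal sketch, assuming a concrete bump function as both the profile being translated and the test function it is paired against (the names `bump`, `integrate` and `translate` are illustrative choices, not notation from these notes). The pairing drops to exactly zero once the supports separate, while the $L^2$ norm of the translate does not change at all.

```python
import math

def bump(x):
    """Smooth function supported on (-1, 1)."""
    return math.exp(-1.0 / (1.0 - x * x)) if abs(x) < 1.0 else 0.0

def integrate(g, a, b, m=4000):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / m
    return h * sum(g(a + (i + 0.5) * h) for i in range(m))

def translate(f, shift):
    """Return the translated function x -> f(x - shift)."""
    return lambda x: f(x - shift)

phi = bump  # fixed test function
pairings, l2_norms = [], []
for shift in (0.0, 1.0, 3.0, 10.0):
    f_n = translate(bump, shift)
    # The window [-1, shift + 1] contains the supports of both phi and f_n.
    a, b = -1.0, shift + 1.0
    pairings.append(integrate(lambda x: f_n(x) * phi(x), a, b))
    l2_norms.append(math.sqrt(integrate(lambda x: f_n(x) ** 2, a, b)))

for shift, p, nrm in zip((0.0, 1.0, 3.0, 10.0), pairings, l2_norms):
    print(f"shift={shift:5.1f}  <f_n, phi>={p:.6f}  ||f_n||_2={nrm:.6f}")
```

Only one test function is checked here, so this illustrates rather than proves the distributional convergence; the general statement follows because every test function has compact support.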

Related to this loss of mass phenomenon is the important fact that the operation of pointwise multiplication is generally *not* continuous in the distributional topology: and does *not* necessarily imply in general (in fact in many cases the products or might not even be well-defined). For instance:

- Using the escape to frequency infinity example, the functions converge in the sense of distributions to zero, but their squares instead converge in the sense of distributions to , as can be seen from the double angle formula .
- Using the escape to infinitely fine scales example, the functions converge in the sense of distributions to zero, but their squares will not if .
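To make the first of these failures concrete, here is a small numerical sketch. It assumes the oscillating functions are $x \mapsto \sin(nx)$ (one representative choice of modulation) and pairs them against an off-centre bump function `phi`; both choices, and the helper `pair`, are illustrative assumptions rather than anything fixed by the text. The pairing of $\sin(nx)$ tends to zero, while the pairing of $\sin^2(nx)$ tends to half the mass of the test function, as the double angle formula predicts.

```python
import math

def bump(x):
    """Smooth function supported on (-1, 1)."""
    return math.exp(-1.0 / (1.0 - x * x)) if abs(x) < 1.0 else 0.0

def phi(x):
    """Test function: a bump centred at 0.3, so that sin(n x) is not
    orthogonal to it merely for symmetry reasons."""
    return bump(x - 0.3)

def pair(f, m=20000):
    """Midpoint-rule approximation of <f, phi> over the support [-0.7, 1.3] of phi."""
    a, b = -0.7, 1.3
    h = (b - a) / m
    total = 0.0
    for i in range(m):
        x = a + (i + 0.5) * h
        total += f(x) * phi(x)
    return h * total

mass = pair(lambda x: 1.0)  # integral of phi
rows = []
for n in (1, 10, 100, 1000):
    linear = pair(lambda x: math.sin(n * x))
    square = pair(lambda x: math.sin(n * x) ** 2)
    rows.append((n, linear, square))
    print(f"n={n:5d}  <sin(nx), phi>={linear:+.6f}  "
          f"<sin^2(nx), phi>={square:.6f}  mass/2={0.5 * mass:.6f}")
```

The weak limit of the squares is thus not the square of the weak limit, which is exactly the obstruction one faces when passing to the limit in a quadratic nonlinearity.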

This lack of continuity of multiplication means that one has to take a non-trivial amount of care when applying the theory of distributions to nonlinear PDE; a sufficiently careless regard for this issue (or more generally, treating distribution theory as some sort of “magic wand”) is likely to lead to serious errors in one’s arguments.

One way to recover continuity of pointwise multiplication is to somehow upgrade distributional convergence to stronger notions of convergence. For instance, from Hölder’s inequality one sees that if converges strongly to in (thus and both lie in , and goes to zero), and converges strongly to in , then will converge strongly in to , where .

One key way to obtain strong convergence in some norm is to obtain uniform bounds in an even stronger norm – so strong that the associated space embeds compactly in the space associated to the original norm. More precisely:

Proposition 3 (Upgrading to strong convergence) Let be two norms on , with associated spaces of distributions. Suppose that embeds compactly into , that is to say the closed unit ball in is a compact subset of . If is a bounded sequence in that converges in the sense of distributions to a limit , then converges strongly in to as well.

*Proof:* By the Urysohn subsequence principle, it suffices to show that every subsequence of has a further subsequence that converges strongly in to . But by the compact embedding of into , every subsequence of has a further subsequence that converges strongly in to some limit , and hence also in the sense of distributions to by definition of the norm. But this subsequence also converges in the sense of distributions to , and hence , and the claim follows.

** — 2. Simple examples of weak solutions — **

We now study weak solutions for some very simple equations, as a warmup for discussing weak solutions for Navier-Stokes.

We begin with an extremely simple initial value problem, the ODE

on a half-open time interval with , with initial condition , where and are given and is the unknown. Of course, when are smooth, then the fundamental theorem of calculus gives the unique solution

for . If one integrates the identity against a test function (that is to say, one multiplies both sides of this identity by and then integrates) on , one obtains

which upon integration by parts and rearranging gives

where we extend by zero to the open set . Thus, we have

in the sense of distributions (on ). More generally, if are locally integrable functions on , we say that is a *weak solution* to the initial value problem if (4) holds in the sense of distributions on . Thanks to the fundamental theorem of calculus for locally integrable functions, we still recover the same unique solution formula as in the smooth case:

Exercise 4 Let be locally integrable functions (extended by zero to all of ), and let . Show that the following are equivalent:

Now let be a finite dimensional vector space, let be a continuous function, let , and consider the initial value problem

on some forward time interval . The Picard existence theorem lets us construct such solutions when is Lipschitz continuous and is small enough, but now we are merely requiring to be continuous and not necessarily Lipschitz. As in the preceding case, we introduce the notion of a weak solution. If is locally bounded (and measurable) on , then will be locally integrable on ; we then extend by zero to be distributions on , and we say that is a *weak solution* to (5) if one has

in the sense of distributions on , or equivalently that one has the identity

for all test functions compactly supported in . In this simple ODE setting, the notion of a weak solution coincides with stronger notions of solutions:

Exercise 5 Let be finite dimensional, let be continuous, let , and let be locally bounded and measurable. Show that the following are equivalent:

In particular, if the ODE initial value problem (5) exhibits finite time blowup for its (unique) classical solution, then it will also do so for weak solutions (with exactly the same blowup time). This will be in contrast with the situation for PDE, in which it is possible for weak solutions to persist beyond the time in which classical solutions exist.

Now we give a compactness argument to produce weak solutions (which will then be classical solutions, by the above exercise):

Proposition 6 (Weak existence) Let be a finite dimensional vector space, let , let , and let be a continuous function. Let be the time

Then there exists a continuously differentiable solution to the initial value problem (5) on .

*Proof:* By construction, we have

Using the Weierstrass approximation theorem (or Stone-Weierstrass theorem), we can express on as the uniform limit of Lipschitz continuous functions , such that

for all ; we can then extend in a Lipschitz continuous fashion to all of . (The Lipschitz constant of is permitted to diverge to infinity as .) By the Picard existence theorem (Theorem 8 of Notes 1), for each we then have a (continuously differentiable) maximal Cauchy development of the initial value problem

with as if is finite. (We could also solve the ODE backwards in time, but will not need to do so here.) We now claim that , and furthermore that one has the uniform bound

for all and all . Indeed, if this were not the case then by continuity (and the fact that ) there would be some and some such that , and for all . But then by the fundamental theorem of calculus and the triangle inequality (and (6)) we have

a contradiction. Thus we have (8) for all and , so takes values in on . Applying (7), (6) we conclude that

for all and all ; in particular, the are uniformly Lipschitz continuous and uniformly bounded on . Applying the Arzelá-Ascoli theorem, we can then pass to a subsequence in which the converge uniformly on to a limit , which then also takes values in . (Alternatively, one could use Proposition 2 to have converge in the sense of distributions, followed by Proposition 3 to upgrade to uniform convergence.) As converges uniformly to on , we conclude that converges uniformly to on . Since we have

in the sense of distributions (extending , by zero to ), we can take distributional limits and conclude that

in the sense of distributions, which by Exercise 5 shows that is a continuously differentiable solution to the initial value problem (5) as required.

In contrast to the Picard theory when is Lipschitz, Proposition 6 does not assert any uniqueness of the solution to the initial value problem (5). And in fact uniqueness often fails once the Lipschitz hypothesis is dropped! Consider the simple example of the scalar initial value problem

on , so the nonlinearity here is the continuous, but not Lipschitz continuous, function . Clearly the zero function is a solution to this ODE. But so is the function . In fact there are a continuum of solutions: for any , the function is a solution. Proposition 6 will select one of these solutions, but the precise solution selected will depend on the choice of approximating functions :

Exercise 7 Let . For each , let denote the function

- (i) Show that each is Lipschitz continuous, and the converge uniformly to the function as .
- (ii) Show that the solution to the initial value problem is given by
for and

for .

- (iii) Show that as , converges uniformly to the function .
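The non-uniqueness discussed above can be checked numerically. The sketch below assumes the standard model nonlinearity $F(u) = |u|^{1/2}$ with initial data $u(0)=0$, for which the delayed parabolas $u(t) = \max(t-t_0,0)^2/4$ are all solutions. It also uses two hypothetical families of Lipschitz approximants, `F_shifted` and `F_capped`, which are our own illustrative choices and not the family from Exercise 7. Both families converge uniformly to $F$, yet the Picard solutions they generate stay near two different limits.

```python
import math

def F(u):
    """Assumed nonlinearity: continuous, but not Lipschitz at u = 0."""
    return math.sqrt(abs(u))

def delayed(t0):
    """Candidate solution that stays at zero until time t0, then lifts off."""
    return lambda t: max(t - t0, 0.0) ** 2 / 4.0

def ode_residual(u, T=3.0, m=3000, h=1e-6):
    """Largest |u'(t) - F(u(t))| on a grid in (0, T), using central differences."""
    worst = 0.0
    for i in range(1, m):
        t = T * i / m
        du = (u(t + h) - u(t - h)) / (2.0 * h)
        worst = max(worst, abs(du - F(u(t))))
    return worst

def rk4(G, u0=0.0, T=2.0, steps=2000):
    """Classical Runge-Kutta solve of u' = G(u), u(0) = u0; returns u(T)."""
    h, u = T / steps, u0
    for _ in range(steps):
        k1 = G(u)
        k2 = G(u + 0.5 * h * k1)
        k3 = G(u + 0.5 * h * k2)
        k4 = G(u + h * k3)
        u += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return u

n = 10_000  # approximation parameter; both families are within 1/sqrt(n) of F

def F_shifted(u):
    """Lipschitz approximant that is strictly positive at u = 0."""
    return math.sqrt(abs(u) + 1.0 / n)

def F_capped(u):
    """Lipschitz approximant that still vanishes at u = 0."""
    return min(n * abs(u), math.sqrt(abs(u)))

# Every delayed parabola solves the same initial value problem.
residuals = {t0: ode_residual(delayed(t0)) for t0 in (0.0, 0.5, 1.0, 2.0)}

# The two approximating problems select different solutions at time T = 2.
u_shifted = rk4(F_shifted)  # close to t^2/4 at t = 2
u_capped = rk4(F_capped)    # identically zero
print(residuals, u_shifted, u_capped)
```

For `F_shifted` the exact solution is $t^2/4 + t/\sqrt{n}$, which tends to the parabola $t^2/4$; for `F_capped` the zero function is the unique solution for every $n$. So the limit produced by the compactness argument genuinely depends on the approximation scheme.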

Now we give a simple example of a weak solution construction for a PDE, namely the linear transport equation

where the initial data and a position-dependent velocity field is given, and is the unknown field.

Suppose for the moment that are smooth, with bounded. Then one can solve this problem using the method of characteristics. For any , let denote the solution to the initial value problem

The Picard existence theorem gives us a smooth maximal Cauchy development for this problem; as is bounded, this development cannot go to infinity in finite time (either forward or backwards in time), and so the solution is global. Thus we have a well-defined map for each time . In fact we can say more:

Exercise 8 Let the assumptions be as above.

- (i) Show the semigroup property for all .
- (ii) Show that is a homeomorphism for each .
- (iii) Show that for every , is differentiable, and the derivative obeys the linear initial value problem
(Hint: while this system formally can be obtained by differentiating (10) in , this formal differentiation requires rigorous justification. One can for instance proceed by first principles, showing that the Newton quotients approximately obey this equation, and then using a Gronwall inequality argument to compare this approximate solution to an exact solution.)

- (iv) Show that is a diffeomorphism for each ; that is to say, and its inverse are both continuously differentiable.
- (v) Show that is a smooth diffeomorphism (that is to say and its inverse are both smooth). (Caution: one may require a bit of planning to prevent the proof from becoming extremely long and tedious.)

From (10) and the chain rule we have the identity

for any smooth function (cf. the material derivative used in Notes 0). Thus, one can rewrite the initial value problem (9) as

at which point it is clear that the unique smooth solution to the initial value problem (9) is given by

Among other things, this shows that the sup norm is a conserved quantity:

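As a sanity check on the method of characteristics, here is a short one-dimensional sketch. It assumes the transport equation has the form $\partial_t u + v \partial_x u = 0$, and uses the hypothetical choices $v(x) = \sin x$ (smooth with bounded derivative) and $u_0(x) = e^{-x^2}$; the helper names `flow`, `u` and `residual` are ours. The solution is obtained by tracing each characteristic backwards in time and reading off the initial data.

```python
import math

def v(x):
    """Assumed velocity field: smooth, with bounded derivative."""
    return math.sin(x)

def u0(x):
    """Assumed initial data: smooth and bounded, with sup norm 1."""
    return math.exp(-x * x)

def flow(t, x, steps=200):
    """RK4 solve of dX/ds = v(X), X(0) = x, up to time t (t may be negative)."""
    h = t / steps
    for _ in range(steps):
        k1 = v(x)
        k2 = v(x + 0.5 * h * k1)
        k3 = v(x + 0.5 * h * k2)
        k4 = v(x + h * k3)
        x += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return x

def u(t, x):
    """Solution by characteristics: follow the trajectory through x back to time 0."""
    return u0(flow(-t, x))

def residual(t, x, h=1e-4):
    """Central-difference approximation of u_t + v u_x at (t, x)."""
    ut = (u(t + h, x) - u(t - h, x)) / (2.0 * h)
    ux = (u(t, x + h) - u(t, x - h)) / (2.0 * h)
    return ut + v(x) * ux

grid = [-3.0 + 0.1 * i for i in range(61)]
sup_at_time_one = max(abs(u(1.0, x)) for x in grid)
print(residual(1.0, 0.7), sup_at_time_one)
```

Since $x = 0$ is a fixed point of this particular flow, the maximum value $u_0(0) = 1$ is attained at every time, consistent with conservation of the sup norm.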
Now we drop the hypothesis that is bounded. One can no longer assume that the trajectories are globally defined, or even that they are defined for a positive time independent of the starting point . Nevertheless, we have

Proposition 9 (Weak existence) Let be smooth, and let be smooth and bounded. Then there exists a bounded measurable function which weakly solves (9) in the sense that

in the sense of distributions on (extending by zero outside of ), or equivalently that

*Proof:* By multiplying by appropriate smooth cutoff functions, we can express as the locally uniform limit of smooth bounded functions with equal to on (say) . By the preceding discussion, for each we have a smooth global solution to the initial value problem

in the sense of distributions on . By (11), the are uniformly bounded with

Thus, by Proposition 2, we can pass to a subsequence and assume that converges in the sense of distributions to an element on ; by (2) we have

Since the are all supported on , is also. Taking weak limits in (13) (multiplying first by a cutoff function to localise to

This gives the required weak solution.

The following exercise shows that while one can construct global weak solutions, there is significant failure of uniqueness and persistence of regularity:

Exercise 10 Set , thus we are solving the transport equation

- (i) If are bounded measurable functions, show that the function defined by
for and

for is a weak solution to (14) with initial data

for and

for . (Note that one does not need to specify these functions at , since this describes a measure zero set.)

- (ii) Suppose further that , and that is smooth and compactly supported in . Show that the weak solution described in (i) is the solution constructed by Proposition 9.
- (iii) Show that there exist at least two bounded measurable weak solutions to (14) with initial data , thus showing that weak solutions are not unique. (Of course, at most one of these solutions could obey the inequality (12), so there are some weak solutions that are not constructible using Proposition 9.) Show that this lack of uniqueness persists even if one also demands that the weak solutions be smooth; conversely, show that there exist weak solutions with initial data that are discontinuous.

Remark 11 As the above example illustrates, the loss of mass phenomenon for weak solutions arises because the approximants to those weak solutions “escape to infinity” in the limit; similarly, the loss of uniqueness phenomenon for weak solutions arises because the approximants “come from infinity” in the limit. In this particular case of a transport equation, the infinity is spatial infinity, but for other types of PDE it can be possible for approximate solutions to escape from, or come from, other types of infinity, such as frequency infinity, fine scale infinity, or coarse scale infinity. (In the former two cases, the loss of mass phenomenon will also be closely related to a loss of regularity in the weak solution.) Eliminating these types of “bad behaviour” for weak solutions is morally equivalent to obtaining uniform bounds for the approximating solutions that are strong enough to prevent such solutions from having a significant presence near infinity; in the case of Navier-Stokes, this basically corresponds to controlling such solutions uniformly in subcritical or critical norms.

** — 3. Leray-Hopf weak solutions — **

We now adapt the above formalism to construct weak solutions to the Navier-Stokes equations, following the fundamental work of Leray, who constructed such solutions on ${\bf R}^d$, $d \geq 2$ (as before, we discard the $d=1$ case as being degenerate). The later work of Hopf extended this construction to other domains, but we will work solely with ${\bf R}^d$ here for simplicity.

In the previous set of notes, several formulations of the Navier-Stokes equations were considered. For smooth solutions (with suitable decay at infinity, and in some cases a normalisation hypothesis on the pressure also), these formulations were shown to be essentially equivalent to each other. But at the very low level of regularity that weak solutions are known to have, these different formulations of Navier-Stokes are no longer obviously equivalent. As such, there is not a single notion of a “weak solution to the Navier-Stokes equations”; the notion depends on which formulation of these equations one chooses to work with. This leads to a number of rather technical subtleties when developing a theory of weak solutions. We will largely avoid these issues here, focusing on a specific type of weak solution that arises from our version of Leray’s construction.

It will be convenient to work with the formulation

of the initial value problem for the Navier-Stokes equations. Writing out the divergence as and interchanging with , we can rewrite this as

The point of this formulation is that it can be interpreted distributionally with fairly weak regularity hypotheses on . For Leray’s construction, it turns out that a natural regularity class is

basically because the norms associated to these function spaces are precisely the quantities that will be controlled by the important *energy identity* that we will discuss later. With this regularity, we have in particular that

by which we mean that

for all . Next, we need a special case of the Sobolev embedding theorem:

Exercise 12 (Non-endpoint Sobolev embedding theorem) Let be such that . Show that for any , one has with

(Hint: this non-endpoint case can be proven using the Littlewood-Paley projections from the previous set of notes.) The endpoint case of the Sobolev embedding theorem is also true (as long as ), but the proof requires the Hardy-Littlewood-Sobolev fractional integration inequality, which we will not cover here; see for instance these previous lecture notes.

We conclude that there is some for which

and hence by Hölder’s inequality

for all . (The precise value of is not terribly important for our arguments.)

Next, we invoke the following result from harmonic analysis:

Proposition 13 (Boundedness of the Leray projection) For any , one has the bound

for all . In particular, has a unique continuous extension to a linear map from to itself.

For $p=2$, this proposition follows easily from Plancherel’s theorem. For $p \neq 2$, the proposition is less trivial, and is usually proven using the Calderón-Zygmund theory of singular integrals. A proof can be found for instance in Stein’s “Singular integrals”; we shall simply assume it as a black box here. We conclude that for in the regularity class (16), we have

In particular, is locally integrable in spacetime and thus can be interpreted as a distribution on (after extending by zero outside of ). Thus also can be interpreted as a distribution. Similarly for the other two terms in (15). We then say that a function in the regularity class (16) is a *weak solution* to the initial value problem (15) for some distribution if one has

in the sense of spacetime distributions on (after extending by zero outside of ). Unpacking the definition of the distributional derivative, this is equivalent to requiring that

for all spacetime test functions .

We can now state a form of Leray’s theorem:

Theorem 14 (Leray’s weak solutions) Let be divergence free (in the sense of distributions), and let . Then there exists a weak solution to the initial value problem (15). Furthermore, obeys the energy inequality

for almost every .

We now prove this theorem using the same sort of scheme that was used previously to construct weak solutions to other equations. We first need to set up some approximate solutions to (15). There are many ways to do this – the traditional way being to use some variant of the Galerkin method – but we will proceed using the Littlewood-Paley projections that were already introduced in the previous set of notes. Let be a sequence of dyadic integers going to infinity. We consider solutions to the initial value problem

this is (15) except with some additional factors of inserted in the initial data and in the nonlinear term. Formally, in the limit , the factors should converge to the identity and one should recover (15); but this requires rigorous justification. The number of factors of in the nonlinear term may seem excessive, but as we shall see, this turns out to be a convenient choice as it will lead to a favourable energy inequality for these solutions.

The Fujita-Kato theory of mild solutions for (15) from the previous set of notes can be easily adapted to the initial value problem (19), because the projections are bounded on all the function spaces of interest. Thus, for any , and any divergence-free , we can define an -mild solution to (15) on a time interval to be a function in the function space

such that

(in the sense of distributions) for all ; a mild solution on is a solution that is an mild solution when restricted to every compact subinterval . Note that the frequency-localised initial data lies in every space. By a modification of the theory of the previous set of notes, we thus see that there is a maximal Cauchy development that is a smooth solution to (19) (and an mild solution for every ), with if . Note that as is divergence-free, , and preserves the divergence-free property, and projects to divergence-free functions, is divergence-free for all . Similarly, as projects to functions with Fourier transform supported on the ball in , and this property is preserved by , , and we see that also has Fourier transform supported on the ball . This (non-uniformly) bounded frequency support is the key additional feature enjoyed by our approximate solutions that has no analogue for the actual solution , and effectively serves as a sort of “discretisation” of the problem (as per the uncertainty principle).

The next step is to ensure that the approximate solutions exist globally in time, that is to say that . We can do this by exploiting the energy conservation law for this equation. Indeed for any time , define the energy

(compare with Exercise 4 from Notes 0). From (19) we know that and lie in for any and any . This very high regularity allows us to easily justify operations such as integration by parts or differentiation under the integral sign in what follows. In particular, it is easy to establish the identity

for any . Inserting (19) (and suppressing explicit dependence on for brevity), we obtain

For the second term, we integrate by parts to obtain

For the first term

we use the self-adjointness of and , the skew-adjointness of , and the fact that all three of these operators (being Fourier multipliers) commute with each other to write it as

Since is divergence-free, the Leray projection acts as the identity on it, so we may write the above expression as

Recalling the rules of thumb for the energy method from the previous set of notes, we locate a total derivative to rewrite the preceding expression as

(It is here that we begin to see how important it was to have so many factors of in our approximating equation.) We may now integrate by parts (easily justified using the high regularity of ) to obtain

But is divergence-free, so vanishes. To summarise, we conclude the (differential form of the) *energy identity*

by the fundamental theorem of calculus, we conclude in particular that

for all . Among other things, this gives a uniform bound

Ordinarily, this type of bound would be too weak to combine with the blowup criterion mentioned earlier. But we know that has Fourier transform supported in , so in particular we have the reproducing formula . We may thus use the Bernstein inequality (Exercise 52 from Notes 1) and conclude that

This bound is not uniform in , but it is still finite, and so by combining with the blowup criterion we conclude that .
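The cancellation that drives the energy identity, namely that $\int (u \cdot \nabla w)\, w = \int u \cdot \nabla (w^2/2)$ vanishes when $u$ is divergence-free, can be tested directly. The sketch below is only an illustration under simplifying assumptions: it works on the two-dimensional torus rather than on ${\bf R}^d$, with a scalar $w$, and the fields `u_solenoidal`, `u_compressible` and `w` are hypothetical trigonometric polynomials chosen so that the quadrature is exact up to rounding.

```python
import math

N = 32  # grid points per direction; enough for the rectangle rule to be exact here

def torus_integral(g):
    """Integrate g(x, y) over the torus [0, 2*pi)^2 with the periodic rectangle rule."""
    h = 2.0 * math.pi / N
    return h * h * sum(g(i * h, j * h) for i in range(N) for j in range(N))

def u_solenoidal(x, y):
    """Divergence-free field (d_y psi, -d_x psi) for the stream function
    psi(x, y) = sin(x) cos(2y) + 0.5 cos(3x + y)."""
    return (-2.0 * math.sin(x) * math.sin(2 * y) - 0.5 * math.sin(3 * x + y),
            -math.cos(x) * math.cos(2 * y) + 1.5 * math.sin(3 * x + y))

def u_compressible(x, y):
    """Field with non-zero divergence cos(x), for contrast."""
    return (math.sin(x), 0.0)

def w(x, y):
    """Transported scalar."""
    return 1.0 + math.cos(x) + math.sin(x + 2 * y)

def grad_w(x, y):
    """Exact gradient of w."""
    return (-math.sin(x) + math.cos(x + 2 * y), 2.0 * math.cos(x + 2 * y))

def trilinear(u):
    """Integral of (u . grad w) w, which equals the integral of u . grad(w^2 / 2)."""
    def integrand(x, y):
        u1, u2 = u(x, y)
        wx, wy = grad_w(x, y)
        return (u1 * wx + u2 * wy) * w(x, y)
    return torus_integral(integrand)

I_solenoidal = trilinear(u_solenoidal)
I_compressible = trilinear(u_compressible)
print(I_solenoidal, I_compressible, -2.0 * math.pi ** 2)
```

For the compressible field, integration by parts gives $-\int (\nabla \cdot u)\, w^2/2 = -2\pi^2$, so the nonlinear term would contribute to the energy balance; for the divergence-free field it drops out entirely.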

Now we need to start taking limits as . For this we need uniform bounds. Returning to the energy identity (20), we have the uniform bounds

so in particular for any finite one has

This is enough regularity for Proposition 2 to apply, and we can pass to a subsequence of which converges in the sense of spacetime distributions in (after extending by zero outside of ) to a limit , which is in for every .

Now we work on verifying the energy inequality (18). Let be a test function with which is non-increasing on . From (20) and integration by parts we have

Taking the limit inferior and using the Fatou-type lemma (2), we conclude that

Now let , take to equal on and zero outside of for some small . Then we have

The function is supported on , is non-negative, and has total mass one. By the Lebesgue differentiation theorem applied to the bounded measurable function , we conclude that for almost every , we have

as . The claim (18) follows.

It remains to show that is a weak solution of (15), that is to say that (17) holds in the sense of spacetime distributions. Certainly the smooth solution of (19) will also be a weak solution, thus

in the sense of spacetime distributions on , where we extend by zero outside of .

At this point it is tempting to just take distributional limits of both sides of (22) to obtain (17). Certainly we have the expected convergence for the linear components of the equation:

However, it is not immediately clear that

mainly because of the previously mentioned problem that multiplication is not continuous with respect to weak notions of convergence. But if we can show (23), then we do indeed recover (17) as the limit of (22), which will complete the proof of Theorem 14.

Let’s try to simplify the task of proving (23). The partial derivative operator is continuous with respect to convergence in distributions, so it suffices to show that

where

We now try to get rid of the outer Littlewood-Paley projection. We claim that

Let be a fixed time. By Sobolev embedding and (21), is bounded in , uniformly in , for some . The same is then true for , hence by Hölder’s inequality and Proposition 13, is uniformly bounded in . On the other hand, for any spacetime test function , it is not difficult (using the rapid decrease of the Fourier transform of ) to show that goes to zero in the dual space . This gives (24).

It thus suffices to show that converges in the sense of distributions to , thus one wants

for any spacetime test function . One can easily calculate that lies in the dual space to the space that and are bounded in, so it will suffice to show that converges strongly in to for sufficiently close to and any compact subset of spacetime (since the norm of outside of can be made arbitrarily small by making large enough).

Let be a dyadic integer, then we can split

The functions are uniformly bounded in by some bound , hence by Plancherel’s theorem the functions , have an norm of (assuming is large enough so that ). Indeed, by Littlewood-Paley decomposition and Bernstein’s inequality we also see that these functions have an norm of if is close enough to that the exponent of is negative. It will therefore suffice to show that

strongly in for every fixed and .

We already know that goes to zero in the sense of distributions, so (as Proposition 3 indicates) the main difficulty is to obtain compactness of the sequence. The operator localises in spatial frequency, and the restriction to localises in both space and time; however, there is still the possibility of escaping to temporal frequency infinity. To prevent this, we need some sort of equicontinuity in time. For this, we may turn to the equation (19) obeyed by . Applying , we see that

when is large enough. We have already seen that is bounded in uniformly in , so by the Bernstein inequality is bounded in (we allow the bound to depend on ). Similarly for . We conclude that is bounded in uniformly in ; taking weak limits using (2), the same is true for , and hence is bounded in . Also, is bounded in by Bernstein’s inequality; thus is equicontinuous in . By the Arzelá-Ascoli theorem and Proposition 3, must therefore go to zero uniformly, and the claim follows. This completes the proof of Theorem 14.

Exercise 15 (Rellich compactness theorem) Let be such that .

- (i) Show that if is a bounded sequence in that converges in the sense of distributions to a limit , then there is a subsequence which converges strongly in to (thus, for any compact set , the restrictions of to converge strongly in to the restriction of to ).
- (ii) Show that for any compact set , the linear map defined by setting to be the restriction of to is a compact linear map.
- (iii) Show that the above two claims fail at the endpoint (which of course only occurs when ).

The weak solutions constructed by Theorem 14 have additional properties beyond the ones listed in the above theorem. For instance:

Exercise 16 Let be as in Theorem 14, and let be a weak solution constructed using the proof of Theorem 14.

- (i) Show that is divergence-free in the sense of spacetime distributions.
- (ii) Show that there is a measure zero subset of such that one has the energy inequality
for all with . Furthermore, show that for all , the time-shifted function defined by is a weak solution to the initial value problem (15) with initial data .

- (iii) Show that after modifying on a set of measure zero, the function is continuous for any . (Hint: first establish this when is a test function.)

We will discuss some further properties of the Leray weak solutions in later notes.

** — 4. Weak-strong uniqueness — **

If is a (non-zero) element in a Hilbert space , and is another element obeying the inequality

then this is very far from the assertion that is equal to , since the ball of elements of obeying (25) is far larger than the single point . However, if one also possesses the information that agrees with when tested against , in the sense that

then (25) and (26) together do allow one to conclude that . Geometrically, this is because the above-mentioned ball is tangent to the hyperplane described by (26) at the point . Algebraically, one can establish this claim by the cosine rule computation

giving the claim.
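As a numerical sanity check of this tangency picture, here is a minimal Python sketch in the finite-dimensional Hilbert space of real 3-vectors. It assumes that (25) is the norm bound `|v| <= |u|` and that (26) is the identity `<v, u> = <u, u>`; the helper names and the sample vector are hypothetical choices made only for illustration.

```python
import math
import random

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def norm(a):
    return math.sqrt(dot(a, a))

def project_to_hyperplane(v, u):
    """Closest point to v on the hyperplane {w : <w, u> = <u, u>}."""
    shift = (dot(u, u) - dot(v, u)) / dot(u, u)
    return [vi + shift * ui for vi, ui in zip(v, u)]

def squared_distance_by_cosine_rule(u, v):
    """|u - v|^2 expanded as |v|^2 - 2 <v, u> + |u|^2."""
    return dot(v, v) - 2.0 * dot(v, u) + dot(u, u)

def sample_hyperplane(u, count=1000, seed=0):
    """Random points v obeying the (assumed) constraint <v, u> = <u, u>."""
    rng = random.Random(seed)
    return [project_to_hyperplane([rng.uniform(-3.0, 3.0) for _ in u], u)
            for _ in range(count)]

u = [1.0, -2.0, 0.5]
points = sample_hyperplane(u)
# On the hyperplane, |u - v|^2 = |v|^2 - |u|^2, so |v| <= |u| forces v = u.
smallest_excess = min(norm(v) - norm(u) for v in points)
largest_mismatch = max(abs(squared_distance_by_cosine_rule(u, v)
                           - (dot(v, v) - dot(u, u))) for v in points)
print(smallest_excess, largest_mismatch)
```

Every sampled point on the hyperplane has norm at least that of `u` (up to rounding), which is the numerical shadow of the statement that the ball only touches the hyperplane at the single point `u`.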

This basic argument has many variants. Here are two of them:

Exercise 17 (Weak convergence plus norm bound equals strong convergence (Hilbert spaces)) Let be an element of a Hilbert space , and let be a sequence in which weakly converges to , that is to say that for all . Show that the following are equivalent:

- (i) .
- (ii) .
- (iii) converges strongly to .

Exercise 18 (Weak convergence plus norm bound equals strong convergence ( norms)) Let be a measure space, let be an absolutely integrable non-negative function, and let be a sequence of absolutely integrable non-negative functions that converge pointwise to . Show that the following are equivalent:

- (i) .
- (ii) .
- (iii) converges strongly in to .
(Hint: express and in terms of the positive and negative parts of . The latter can be controlled using the dominated convergence theorem.)

Exercise 19 Let be as in Theorem 14, and let be a weak solution constructed using the proof of Theorem 14. Show that (after modifying on a set of measure zero if necessary), converges strongly in to as . (Hint: use Exercise 16(iii) and Exercise 17.)

Now we give a variant relating to weak and strong solutions of the Navier-Stokes equations.

Proposition 20 (Weak-strong uniqueness) Let be an mild solution to the Navier-Stokes equations (15) for some , , and with . Let be a weak solution to the Navier-Stokes equation which obeys the energy inequality (18) for almost all . Then and agree almost everywhere on .

Roughly speaking, this proposition asserts that weak solutions obeying the energy inequality stay unique as long as a strong solution exists (in particular, it is unique whenever it is regular enough to be a strong solution). However, once a strong solution reaches the end of its maximal Cauchy development, there is no further guarantee of uniqueness for the rest of the weak solution. Also, there is no guarantee of uniqueness of weak solutions if the energy inequality is dropped, and indeed there is now increasing evidence that uniqueness is simply false in this case; see for instance this paper of Buckmaster and Vicol for recent work in this direction. The conditions on can be relaxed somewhat (in particular, it is possible to drop the condition ), though they still need to be “subcritical” or “critical” in nature; see for instance the classic papers of Prodi, of Serrin, and of Ladyzhenskaya, which show that weak solutions on obeying the energy inequality are necessarily unique and smooth (after time ) if they lie in the space for some exponents with and ; the endpoint case was worked out more recently by Escauriaza, Seregin, and Sverak. For a recent survey of weak-strong uniqueness results for fluid equations, see this paper of Wiedemann.

*Proof:* Before we give the formal proof, let us first give a non-rigorous proof in which we pretend that the weak solution can be manipulated like a strong solution. Then we have

and

As in the beginning of the section, the idea is to analyse the norm of the difference . Writing in the first equation and subtracting from the second equation, we obtain the *difference equation*

If we formally differentiate the energy using this equation, we obtain

(omitting the explicit dependence of the integrand on and ) which after some integration by parts (noting that is divergence-free and thus is the identity on ) formally becomes

The and terms formally cancel out by the usual trick of writing as a total derivative and integrating by parts, using the divergence-free nature of both and . For the term , we can cancel it against the term by the arithmetic mean-geometric mean inequality

to obtain

thanks to Hölder’s inequality. As is an mild solution, it lies in , which by Sobolev embedding and Hölder means that it is also in . Since , Grönwall’s inequality then should give for all , giving the claim.

Now we begin the rigorous proof, in which is only known to be a weak solution. Here, we do not directly manipulate the difference equation, but instead carefully use the equations for and as a substitute. Define and as before. From the cosine rule we have

where we drop the explicit dependence on in the integrand. From the energy inequality hypothesis (18), we have

for almost all , where we also drop explicit dependence on in the integrand. The strong solution also obeys the energy inequality; in fact we have the energy equality

as can be seen by first working with smooth solutions and taking limits using the local well-posedness theory. We conclude that

Now we work on the integral . Because we only know to solve the equation

in the sense of spacetime integrals, it is difficult to directly treat this spatial integral. Instead (similarly to the proof of the energy inequality for Leray solutions), we will first work with a proxy

where is a test function in time, which we normalise with ; eventually we will make an approximation to the indicator function of and apply the Lebesgue differentiation theorem to recover information about for almost every .

By hypothesis, we have

for any spacetime test function . We would like to apply this identity with replaced by (in order to obtain an identity involving the expression (28)). Now is not a test function; however, as is an mild solution, it has the regularity

also, using the equation (15), Sobolev embedding, Hölder’s inequality, and the hypotheses and we see that

(If one wishes, one can first obtain this bound for smooth solutions, and take limits using the local well-posedness theory.) As a consequence, one can find a sequence of test functions , such that converges to in and norm (so converges to in norm), and converges to in norm. Since lies in , lies in , and lies in by Hölder and Sobolev, we can take limits and conclude that

Since is divergence-free, and does not depend on the spatial variables, we can simplify this slightly as

and so we can write (28) as

Using the Lebesgue differentiation theorem as in the proof of Theorem 14, we conclude that for almost every , one has the identity

Applying (15), the right-hand side is

(Note that expressions such as are well defined because lie in .) We can integrate by parts (justified using the usual limiting argument and the bounds on ) and use the divergence-free nature of to write this as

Inserting this into (27), we conclude that

We write and write this as

noting from the regularity , on and Sobolev embedding that one can ensure that all integrals here are absolutely convergent.

The integral can be rewritten using integration by parts as (noting that there is enough regularity to justify the integration by parts by the usual limiting argument); expressing as a total derivative and integrating by parts again using the divergence-free nature of , we see that this expression vanishes. Similarly for the term. Now we eliminate the remaining terms which are linear in :

We may integrate by parts, and write the dot product in coordinates, to write this as

Applying the Leibniz rule and the divergence-free nature of , we see that this expression vanishes. We conclude that

Now we use the Leibniz rule, the divergence-free nature of , and the arithmetic mean-geometric mean inequality to write

to obtain

and hence by Sobolev embedding we have

for almost all . Applying Grönwall’s inequality (modifying on a set of measure zero) we conclude that for almost all , giving the claim.

One application of weak-strong uniqueness results is to give (in the case at least) *partial regularity* on the weak solutions constructed by Leray, in that the solutions agree with smooth solutions on large regions of spacetime – large enough, in fact, to cover all but a measure zero set of times . Unfortunately, the complement of this measure zero set could be disconnected, and so one could have different smooth solutions agreeing with at different epochs, so this is still quite far from an assertion of global regularity of the solution. Nevertheless it is still a non-trivial and interesting result:

Theorem 21 (Partial regularity) Let . Let be as in Theorem 14, and let be a weak solution constructed using the proof of Theorem 14.

- (i) (Eventual regularity) There exists a time such that (after modification on a set of measure zero), the weak solution on agrees with an mild solution on with initial data (where we time shift the notion of a mild solution to start at instead of ).
- (ii) (Epochs of regularity) There exists a compact exceptional set of measure zero, such that for any time , there is a time interval containing in its interior such that on agrees almost everywhere with an mild solution on with initial data .

*Proof:* (Sketch) We begin with (i). From (18), the norm of and the norm of are finite. Thus, for any , one can find a positive measure set of times such that

which by Plancherel and Cauchy-Schwarz implies that

In particular, by Exercise 16, one can find a time such that is a weak solution on with initial data obeying the energy inequality, with

By the small data global existence theory (Theorem 45 from Notes 1), if is chosen small enough, then there is a global mild solution on to the Navier-Stokes equations with initial data , which must then agree with by weak-strong uniqueness. This proves (i).

Now we look at (ii). In view of (i) we can work in a fixed compact interval . Let be a time, and let be a sufficiently small constant. If there is a positive measure set of times for which

then by the same argument as above (but now using well-posedness theory instead of well-posedness theory), we will be able to equate (almost everywhere) with an mild solution on for some neighbourhood of . Thus the only times for which we cannot do this are those for which one has

for almost all . In particular, for any , one can cover such times by a collection of intervals of length , such that for almost every in that interval. On the other hand, as is bounded in , the number of disjoint time intervals of this form is at most (where we allow the implied constant to depend on and ). Thus the set of exceptional times can be covered by intervals of length , and thus its closure has Lebesgue measure . Sending we see that the exceptional times are contained in a closed measure zero subset of , and the claim follows.

The above argument in fact shows that the exceptional set in part (ii) of the above theorem will have upper Minkowski dimension at most (and hence also Hausdorff dimension at most ). There is a significant strengthening of this partial regularity result due to Caffarelli, Kohn, and Nirenberg, which we will discuss in later notes.

where is a given constant (the *kinematic viscosity*, or *viscosity* for short), is an unknown vector field (the *velocity field*), and is an unknown scalar field (the *pressure field*). Here is a time interval, usually of the form or . We will either be interested in spatially decaying situations, in which decays to zero as , or -periodic (or *periodic* for short) settings, in which one has for all . (One can also require the pressure to be periodic as well; this brings up a small subtlety in the uniqueness theory for these equations, which we will address later in this set of notes.) As is usual, we abuse notation by identifying a -periodic function on with a function on the torus .

In order for the system (1) to even make sense, one requires some level of regularity on the unknown fields ; this turns out to be a relatively important technical issue that will require some attention later in this set of notes, and we will end up transforming (1) into other forms that are more suitable for lower regularity candidate solutions. Our focus here will be on local existence of these solutions in a short time interval or , for some . (One could in principle also consider solutions that extend to negative times, but it turns out that the equations are not time-reversible, and the forward evolution is significantly more natural to study than the backwards one.) The study of Euler equations, in which , will be deferred to subsequent lecture notes.

As the unknown fields involve a time parameter , and the first equation of (1) involves time derivatives of , the system (1) should be viewed as describing an evolution for the velocity field . (As we shall see later, the pressure is not really an independent dynamical field, as it can essentially be expressed in terms of the velocity field without requiring any differentiation or integration in time.) As such, the natural question to study for this system is the initial value problem, in which an initial velocity field is specified, and one wishes to locate a solution to the system (1) with initial condition

for . Of course, in order for this initial condition to be compatible with the second equation in (1), we need the compatibility condition

and one should also impose some regularity, decay, and/or periodicity hypotheses on in order to be compatible with the corresponding level of regularity etc. on the solution .

The fundamental questions in the local theory of an evolution equation are those of *existence*, *uniqueness*, and *continuous dependence*. In the context of the Navier-Stokes equations, these questions can be phrased (somewhat broadly) as follows:

- (a) (Local existence) Given suitable initial data , does there exist a solution to the above initial value problem that exists for some time ? What can one say about the time of existence? How regular is the solution?
- (b) (Uniqueness) Is it possible to have two solutions of a certain regularity class to the same initial value problem on a common time interval ? To what extent does the answer to this question depend on the regularity assumed on one or both of the solutions? Does one need to normalise the solutions beforehand in order to obtain uniqueness?
- (c) (Continuous dependence on data) If one perturbs the initial conditions by a small amount, what happens to the solution and to the time of existence ? (This question tends to only be sensible once one has a reasonable uniqueness theory.)

The answers to these questions tend to be more complicated than a simple “Yes” or “No”, for instance they can depend on the precise regularity hypotheses one wishes to impose on the data and on the solution, and even on exactly how one interprets the concept of a “solution”. However, once one settles on such a set of hypotheses, it generally happens that one either gets a “strong” theory (in which one has existence, uniqueness, and continuous dependence on the data), a “weak” theory (in which one has existence of somewhat low-quality solutions, but with only limited uniqueness results (or even some spectacular failures of uniqueness) and almost no continuous dependence on data), or no satisfactory theory whatsoever. In the former case, we say (roughly speaking) that the initial value problem is *locally well-posed*, and one can then try to build upon the theory to explore more interesting topics such as global existence and asymptotics, classifying potential blowup, rigorous justification of conservation laws, and so forth. With a weak local theory, it becomes much more difficult to address these latter sorts of questions, and there are serious analytic pitfalls that one could fall into if one tries too strenuously to treat weak solutions as if they were strong. (For instance, conservation laws that are rigorously justified for strong, high-regularity solutions may well fail for weak, low-regularity ones.) Also, even if one is primarily interested in solutions at one level of regularity, the well-posedness theory at another level of regularity can be very helpful; for instance, if one is interested in smooth solutions in , it turns out that the well-posedness theory at the critical regularity of can be used to establish *globally* smooth solutions from small initial data. As such, it can become quite important to know what kind of local theory one can obtain for a given equation.

This set of notes will focus on the “strong” theory, in which a substantial amount of regularity is assumed in the initial data and solution, giving a satisfactory (albeit largely local-in-time) well-posedness theory. “Weak” solutions will be considered in later notes.

The Navier-Stokes equations are not the simplest of partial differential equations to study, in part because they are an amalgam of three more basic equations, which behave rather differently from each other (for instance the first equation is nonlinear, while the latter two are linear):

- (a) *Transport equations* such as .
- (b) *Diffusion equations* (or *heat equations*) such as .
- (c) Systems such as , , which (for want of a better name) we will call *Leray systems*.

Accordingly, we will devote some time to getting some preliminary understanding of the linear diffusion and Leray systems before returning to the theory for the Navier-Stokes equation. Transport systems will be discussed further in subsequent notes; in this set of notes, we will instead focus on a more basic example of nonlinear equations, namely the first-order *ordinary differential equation*

where takes values in some finite-dimensional (real or complex) vector space on some time interval , and is a given linear or nonlinear function. (Here, we use “interval” to denote a connected non-empty subset of ; in particular, we allow intervals to be half-infinite or infinite, or to be open, closed, or half-open.) Fundamental results in this area include the Picard existence and uniqueness theorem, the Duhamel formula, and Grönwall’s inequality; they will serve as motivation for the approach to local well-posedness that we will adopt in this set of notes. (There are other ways to construct strong or weak solutions for Navier-Stokes and Euler equations, which we will discuss in later notes.)

A key role in our treatment here will be played by the fundamental theorem of calculus (in various forms and variations). Roughly speaking, this theorem, and its variants, allow us to recast differential equations (such as (1) or (4)) as integral equations. Such integral equations are less tractable algebraically than their differential counterparts (for instance, they are not ideal for verifying conservation laws), but are significantly more convenient for well-posedness theory, basically because integration tends to increase the regularity of a function, while differentiation reduces it. (Indeed, the problem of “losing derivatives”, or more precisely “losing regularity”, is a key obstacle that one often has to address when trying to establish well-posedness for PDE, particularly those that are quite nonlinear and with rough initial data, though for nonlinear parabolic equations such as Navier-Stokes the obstacle is not as serious as it is for some other PDE, due to the smoothing effects of the heat equation.)

One weakness of the methods deployed here is that the quantitative bounds produced deteriorate to the point of uselessness in the inviscid limit , rendering these techniques unsuitable for analysing the Euler equations in which . However, some of the methods developed in later notes have bounds that remain uniform in the limit, allowing one to also treat the Euler equations.

In this and subsequent sets of notes, we use the following asymptotic notation (a variant of Vinogradov notation that is commonly used in PDE and harmonic analysis). The statement , , or will be used to denote an estimate of the form (or equivalently ) for some constant , and will be used to denote the estimates . If the constant depends on other parameters (such as the dimension ), this will be indicated by subscripts, thus for instance denotes the estimate for some depending on .

** — 1. Ordinary differential equations — **

We now study solutions to ordinary differential equations (4), focusing in particular on the initial value problem when the initial state is specified. We restrict attention to *strong solutions* , in which is continuously differentiable () in the time variable, so that the derivative in (4) can be interpreted as the classical (strong) derivative, and one has the classical fundamental theorem of calculus

whenever (in this post we use the signed definite integral, thus ).

We begin with homogeneous linear equations

where is a linear operator. Using the integrating factor , where is the matrix exponential of , and noting that , we see that this equation is equivalent to

and hence from the fundamental theorem of calculus we see that if then we have the unique global solution , or equivalently

More generally, if one wishes to solve the inhomogeneous linear equation

for some continuous with initial condition , then from the fundamental theorem of calculus we have a unique global solution given by

or equivalently one has Duhamel’s formula

which is continuously differentiable in time if is continuous. Intuitively, the first term represents the contribution of the initial data to the solution at time (with the factor representing the evolution from time to time ), while the integrand represents the contribution of the forcing term at time to the solution at time (with the factor representing the evolution from time to time ).
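To make Duhamel's formula concrete, the following Python sketch treats the scalar case. It assumes the inhomogeneous equation has the form `u'(t) = a*u(t) + f(t)` with a real constant `a` playing the role of the linear operator, so that the formula reads `u(t) = exp(t*a)*u0 + integral over [0, t] of exp((t-s)*a)*f(s) ds`. The function names `duhamel` and `rk4`, and the sample data, are illustrative choices rather than anything fixed by the notes.

```python
import math

def duhamel(a, u0, f, t, n=2000):
    """exp(t a) u0 + int_0^t exp((t - s) a) f(s) ds, via composite Simpson (n even)."""
    h = t / n
    total = 0.0
    for k in range(n + 1):
        s = k * h
        weight = 1 if k in (0, n) else (4 if k % 2 else 2)
        total += weight * math.exp((t - s) * a) * f(s)
    return math.exp(t * a) * u0 + total * h / 3.0

def rk4(a, u0, f, t, n=2000):
    """Integrate u' = a u + f(t) from time 0 to time t with classical RK4."""
    h = t / n
    u, s = u0, 0.0
    for _ in range(n):
        k1 = a * u + f(s)
        k2 = a * (u + 0.5 * h * k1) + f(s + 0.5 * h)
        k3 = a * (u + 0.5 * h * k2) + f(s + 0.5 * h)
        k4 = a * (u + h * k3) + f(s + h)
        u += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        s += h
    return u

# Sample data (hypothetical): damping a = -0.5, forcing f = cos, u(0) = 2.
from_formula = duhamel(-0.5, 2.0, math.cos, 3.0)
from_ode = rk4(-0.5, 2.0, math.cos, 3.0)
print(from_formula, from_ode)
```

The two numbers should agree to many digits: the first evaluates the integral formula directly, while the second steps the differential equation forward in time without using the formula at all.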

One can apply a similar analysis to the differential inequality

where is now a scalar continuously differentiable function, are continuous functions, and is an interval containing as its left endpoint; we also assume an initial condition . Here, the natural integrating factor is , whose derivative is by the chain rule and the fundamental theorem of calculus. Applying this integrating factor to (7), we may write it as

and hence by the fundamental theorem of calculus we have

for all (compare with (6)). This is the differential form of Grönwall’s inequality. In the homogeneous case , the inequality of course simplifies to

We continue assuming that for simplicity. From the fundamental theorem of calculus, (7) (and the initial condition ) implies the integral inequality

although the converse implication of (7) from (10) is false in general. Nevertheless, there is an analogue of (9) just assuming the weaker inequality (10), and not requiring any differentiability on , at least when all functions involved are non-negative:

Lemma 1 (Integral form of Grönwall inequality) Let be an interval containing as left endpoint, let , and let be continuous functions obeying the inequality (10) for all . Then one has (9) for all .

*Proof:* From (10) and the fundamental theorem of calculus, the function is continuously differentiable and obeys the differential inequality

(note here that we use the hypothesis that is non-negative). Applying the differential form (9) of Grönwall’s inequality, we conclude that

The claim now follows from (10).
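The point that the integral form needs no differentiability can be checked numerically. The Python sketch below assumes that (10) reads `u(t) <= u0 + integral over [0, t] of A(s)*u(s) ds` and that (9) reads `u(t) <= u0 * exp(integral over [0, t] of A(s) ds)`. It tests both on a grid for the non-differentiable function `u(t) = 1 + |sin(3t)|` with `A = 3` and `u0 = 1`; the function names and the small tolerance are assumptions of this sketch.

```python
import math

def cumulative_trapezoid(values, h):
    """Running trapezoid-rule integrals of grid values with spacing h."""
    out = [0.0]
    for left, right in zip(values, values[1:]):
        out.append(out[-1] + 0.5 * h * (left + right))
    return out

def check_gronwall(u, A, u0, T, n=4000, tol=1e-9):
    """Return (hypothesis_holds, conclusion_holds) on a uniform grid of [0, T]."""
    h = T / n
    ts = [k * h for k in range(n + 1)]
    int_Au = cumulative_trapezoid([A(t) * u(t) for t in ts], h)
    int_A = cumulative_trapezoid([A(t) for t in ts], h)
    hypothesis = all(u(t) <= u0 + I + tol for t, I in zip(ts, int_Au))
    conclusion = all(u(t) <= u0 * math.exp(I) + tol for t, I in zip(ts, int_A))
    return hypothesis, conclusion

kinked = check_gronwall(lambda t: 1.0 + abs(math.sin(3.0 * t)), lambda t: 3.0, 1.0, 2.0)
too_fast = check_gronwall(lambda t: math.exp(2.0 * t), lambda t: 1.0, 1.0, 2.0)
print(kinked, too_fast)
```

The second call uses `u(t) = exp(2t)` with `A = 1`, which grows too quickly to satisfy the integral inequality, so both the hypothesis and the conclusion fail there; this is consistent with the lemma, which only asserts the conclusion when the hypothesis holds.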

Exercise 2 Relax the hypotheses of continuity on to that of being measurable and bounded on compact intervals. (You will need tools such as the fundamental theorem of calculus for absolutely continuous or Lipschitz functions, covered for instance in this previous set of notes.)

Grönwall’s inequality is an excellent tool for bounding the growth of a solution to an ODE or PDE, or the difference between two such solutions. Here is a basic example, one half of the Picard (or Picard-Lindelöf) theorem:

Theorem 3 (Picard uniqueness theorem) Let be an interval, let be a finite-dimensional vector space, let be a function that is Lipschitz continuous on every bounded subset of , and let be continuously differentiable solutions to the ODE (4), thus on . If for some , then and agree identically on , thus for all .

*Proof:* By translating and we may assume without loss of generality that . By splitting into at most two intervals, we may assume that is either the left or right endpoint of ; by applying the time reversal symmetry of replacing by respectively, and also replacing by and , we may assume without loss of generality that is the left endpoint of . Finally, by writing as the union of compact intervals with left endpoint , we may assume without loss of generality that is compact. In particular, are bounded and hence is Lipschitz continuous with some finite Lipschitz constant on the ranges of and .

From the fundamental theorem of calculus we have

and

for every ; subtracting, we conclude

Applying the Lipschitz property of and the triangle inequality, we conclude that

By the integral form of Grönwall’s inequality, we conclude that

and the claim follows.

Remark 4 The same result applies for infinite-dimensional normed vector spaces , at least if one requires to be continuously differentiable in the strong (Fréchet) sense; the proof is identical.

Exercise 5 (Comparison principle) Let be a function that is Lipschitz continuous on compact intervals. Let be an interval, and let be continuously differentiable functions such that and

for all .

- (a) Suppose that for some . Show that for all with . (Hint: there are several ways to proceed here. One is to try to verify the hypotheses of Grönwall’s inequality for the quantity or .)
- (b) Suppose that for some . Show that for all with .

Now we turn to the existence side of the Picard theorem.

Theorem 6 (Picard existence theorem) Let be a finite dimensional normed vector space, let , and let lie in the closed ball . Let be a function which has a Lipschitz constant of on the ball . If one sets then there exists a continuously differentiable solution to the ODE (4) with initial data such that for all .

Note that the solution produced by this theorem is unique on , thanks to Theorem 3. We will be primarily concerned with the case , in which case the time of existence simplifies to .

*Proof:* Using the fundamental theorem of calculus, we write (4) (with initial condition ) in integral form as

Indeed, if is continuously differentiable and solves (4) with on , then (12) holds on . Conversely, if is continuous and solves (12) on , then by the fundamental theorem of calculus the right-hand side of (12) (and hence ) is continuously differentiable and solves (4) with . Thus it suffices to solve the integral equation (12) with a solution taking values in .

We can view this as a fixed point problem. Let denote the space of continuous functions from to . We give this the uniform metric

As is well known, becomes a complete metric space with this metric. Let denote the map

Let us first verify that does map to . If , then is clearly continuous. For any , one has from the triangle inequality that

by choice of , hence as claimed. A similar argument shows that is in fact a contraction on . Namely, if , then

and hence by choice of . Applying the contraction mapping theorem, we obtain a fixed point to the equation , which is precisely (12), and the claim follows.
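The fixed point argument can be run numerically. The following Python sketch applies the map sending `u` to `u0 + integral over [0, t] of F(u(s)) ds` repeatedly on a uniform grid, using the trapezoid rule for the integral. The example `F(u) = u*u`, `u0 = 1`, `T = 0.2` is a hypothetical choice made so that the iterates stay in the interval `[0.5, 1.5]`, where `F` has Lipschitz constant 3 and the map contracts by a factor of at most `3 * 0.2 = 0.6`; the exact solution is `1/(1-t)`. The time `T` here is chosen by hand and is not the quantity (11).

```python
import math

def picard_step(F, u0, u, h):
    """Apply u -> u0 + int_0^t F(u(s)) ds on a uniform grid (trapezoid rule)."""
    values = [F(x) for x in u]
    out = [u0]
    for left, right in zip(values, values[1:]):
        out.append(out[-1] + 0.5 * h * (left + right))
    return out

def picard_iterate(F, u0, T, n=400, iterations=40):
    """Iterate the Picard map from the constant function u0.

    Returns the final grid function and the sup-distances between
    consecutive iterates."""
    h = T / n
    u = [u0] * (n + 1)
    distances = []
    for _ in range(iterations):
        new_u = picard_step(F, u0, u, h)
        distances.append(max(abs(a - b) for a, b in zip(new_u, u)))
        u = new_u
    return u, distances

u, distances = picard_iterate(lambda x: x * x, 1.0, 0.2)
print(u[-1], 1.0 / (1.0 - 0.2))
print(distances[:5])
```

The printed distances shrink at least geometrically, as the contraction mapping theorem predicts, and the final value approximates the exact solution at time `T` up to the discretisation error of the trapezoid rule.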

Remark 7 The proof extends without difficulty to infinite dimensional Banach spaces . Up to a multiplicative constant, the result is sharp. For instance, consider the linear ODE for some , with . Here, the function is of course Lipschitz with constant on all of , and the solution is of the form , hence will exit in time , which is only larger than the time given by the above theorem by a multiplicative constant.

We can iterate the Picard existence theorem (and combine it with the uniqueness theorem) to conclude that there is a *maximal Cauchy development* to the ODE (4) with initial data , with the solution diverging to infinity (or “blowing up”) at the endpoint if this endpoint is finite, and similarly for (thus one has a dichotomy between global existence and finite time blowup). More precisely:

Theorem 8 (Maximal Cauchy development) Let be a finite dimensional normed vector space, let , and let be a function which is Lipschitz on bounded sets. Then there exists and a continuously differentiable solution to (4) with , such that if is finite, and if is finite. Furthermore, , and are unique.

*Proof:* Uniqueness follows easily from Theorem 3. For existence, let be the union of all the intervals containing for which there is a continuously differentiable solution to (4) with . From Theorem 6 contains a neighbourhood of the origin. From Theorem 3 one can glue all the solutions together to obtain a continuously differentiable solution to (4) with . If is contained in , then by Theorem 6 (and time translation) one could find a solution to (4) in a neighbourhood of such that ; by Theorem 3 we must then have , otherwise we could glue to and obtain a solution on a larger domain than , contradicting the definition of . Thus is open, and is of the form for some .

Suppose for contradiction that is finite and does not go to infinity as . Then there exists a finite and a sequence such that . Let be the Lipschitz constant of on . By Theorem 6, for each one can find a solution to (4) on with , where does not depend on . For large enough, this and Theorem 3 allow us to extend the solution outside of , contradicting the definition of . Thus we have when is finite, and a similar argument gives when is finite.

Remark 9 Theorem 6 gives a more quantitative description of the blowup: if is finite, then for any , one must have where is the Lipschitz constant of on . This can be used to give some explicit lower bound on blowup rates. For instance, if and behaves like for some in the sense that the Lipschitz constant of on is for any , then we obtain a lower bound

as , if is finite, and similarly when is finite. This type of blowup rate is sharp. For instance, consider the scalar ODE

where takes values in and is fixed. Then for any , one has explicit solutions on of the form

where is a positive constant depending only on . The blowup rate at is consistent with (13) and also with (11).
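The blowup rate in this example can be observed numerically. The Python sketch below assumes that the scalar ODE is `u' = |u|^p` with `p > 1`, and that the explicit solutions are `u(t) = c_p * (T_plus - t)^(-1/(p-1))` with `c_p = (p-1)^(-1/(p-1))`; both the form of the equation and the value of the constant are assumptions of this sketch, as are the function names. It integrates the ODE with a classical Runge-Kutta scheme and compares against the closed form.

```python
import math

def exact_solution(t, p, T_plus):
    """c_p (T_plus - t)^(-1/(p-1)) with c_p = (p-1)^(-1/(p-1)) (assumed closed form)."""
    a = 1.0 / (p - 1.0)
    return (p - 1.0) ** (-a) * (T_plus - t) ** (-a)

def rk4_solve(p, u0, t_end, n):
    """Integrate u' = |u|^p from t = 0 to t_end with classical RK4."""
    F = lambda x: abs(x) ** p
    h = t_end / n
    u = u0
    for _ in range(n):
        k1 = F(u)
        k2 = F(u + 0.5 * h * k1)
        k3 = F(u + 0.5 * h * k2)
        k4 = F(u + h * k3)
        u += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return u

p, T_plus = 3.0, 1.0
for t in (0.5, 0.9, 0.99):
    numeric = rk4_solve(p, exact_solution(0.0, p, T_plus), t, 20000)
    # Multiplying by (T_plus - t)^(1/(p-1)) should give roughly the constant c_p.
    rescaled = numeric * (T_plus - t) ** (1.0 / (p - 1.0))
    print(t, numeric, exact_solution(t, p, T_plus), rescaled)
```

As `t` approaches `T_plus` the numerical solution grows without bound, while the rescaled quantity in the last column stays close to a fixed constant, which is the numerical signature of the stated blowup rate.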

Exercise 10 (Higher regularity) Let the notation and hypotheses be as in Theorem 8. Suppose that is times continuously differentiable for some natural number . Show that the maximal Cauchy development is times continuously differentiable. In particular, if is smooth, then so is .

Exercise 11 (Lipschitz continuous dependence on data) Let be a finite-dimensional normed vector space.

- (a) Let , let be a function which has a Lipschitz constant of on the ball , and let be the quantity (11). If , and are the solutions to (4) with given by Theorem 6, show that
- (b) Let be a function which is Lipschitz on bounded sets, let , and let be the maximal Cauchy development of (4) with initial data given by Theorem 6. Show that for any compact interval containing , there exists an open neighbourhood of , such that for any , there exists a solution of (4) with initial data . Furthermore, the map from to is a Lipschitz continuous map from to .

Exercise 12 (Non-autonomous Picard theorem) Let be a finite-dimensional normed vector space, and let be a function which is Lipschitz on bounded sets. Let . Show that there exist and a continuously differentiable function solving the non-autonomous ODE for with initial data ; furthermore one has if is finite, and if is finite. Finally, show that are unique. (Hint: this could be done by repeating all of the previous arguments, but there is also a way to deduce this non-autonomous version of the Picard theorem directly from the Picard theorem by adding one extra dimension to the space .)

The above theory is symmetric with respect to the time reversal of replacing with and with . However, one can break this symmetry by introducing a dissipative linear term, in which case one only obtains the forward-in-time portion of the Picard existence theorem:

Exercise 13 Let be a finite dimensional normed vector space, let , and let lie in the closed ball . Let be a function which has a Lipschitz constant of on the ball . Let be the quantity in (11). Let be a linear operator obeying the dissipative estimates for all and . Show that there exists a continuously differentiable solution to the ODE

Remark 14 With the hypotheses of the above exercise, one can also solve the ODE backwards in time by an amount , where denotes the operator norm of . However, in the limit as the operator norm of goes to infinity, the amount to which one can evolve backwards in time goes to zero, whereas the time in which one can evolve forwards in time remains bounded away from zero, thus breaking the time symmetry.

** — 2. Leray systems — **

Now we discuss the Leray system of equations

$$ u + \nabla p = F; \quad \nabla \cdot u = 0 \qquad (15)$$

where $F: {\bf R}^d \to {\bf R}^d$ is given, and the vector field $u: {\bf R}^d \to {\bf R}^d$ and the scalar field $p: {\bf R}^d \to {\bf R}$ are unknown. In other words, we wish to decompose a specified function $F$ as the sum of a gradient $\nabla p$ and a divergence-free vector field $u$. We will use the usual Lebesgue spaces $L^q(X)$ of measurable functions $f$ (up to almost everywhere equivalence) defined on some measure space $X$ (which in our case will always be either ${\bf R}^d$ or ${\bf R}^d/{\bf Z}^d$ with Lebesgue measure) such that the norm $\|f\|_{L^q(X)} := (\int_X |f|^q)^{1/q}$ is finite. (For $q = \infty$, the norm is defined instead to be the essential supremum of $|f|$.)

Proceeding purely formally, we could solve this system by taking the divergence of the first equation to conclude that

$$ \Delta p = \nabla \cdot F$$

where $\Delta p = \nabla \cdot \nabla p$ is the Laplacian of $p$, and then we could formally solve for $p, u$ as

$$ p = \Delta^{-1} (\nabla \cdot F) \qquad (16)$$

$$ u = F - \nabla \Delta^{-1} (\nabla \cdot F). \qquad (17)$$

However, if one wishes to justify this rigorously one runs into the issue that the Laplacian $\Delta$ is not quite invertible. To sort this out and make this problem well-defined, we need to specify the regularity and decay one wishes to impose on the data $F$ and on the solution $u, p$. To begin with, let us suppose that $u, p, F$ are all smooth.

We first understand the uniqueness theory for this problem. By linearity, this amounts to solving the homogeneous equation when $F = 0$, thus we wish to classify the smooth fields $u$ and $p$ solving the system

$$ u + \nabla p = 0; \quad \nabla \cdot u = 0.$$

Of course, we can eliminate $u$ and write this as a single equation

$$ \Delta p = 0.$$

That is to say, the solutions to this equation arise by selecting $p$ to be a (smooth) harmonic function, and $u$ to be the negative gradient of $p$. This is consistent with our preceding discussion that identified the potential lack of invertibility of $\Delta$ as a key issue.

By linearity, this implies that (smooth) solutions to the system (15) are only unique up to the addition of an arbitrary harmonic function to $p$, and the subtraction of the gradient of that harmonic function from $u$.

We can largely eliminate this lack of uniqueness by imposing further requirements on . For instance, suppose in addition that we require to all be -periodic (or *periodic* for short), thus

for and . Then the only freedom we have is to modify by an arbitrary periodic harmonic function (and to subtract the gradient of that function from ). However, by Liouville’s theorem, the only periodic harmonic functions are the constants, whose gradient vanishes. Thus the only freedom in this setting is to add a constant to . This freedom will be almost irrelevant when we consider the Euler and Navier-Stokes equations, since it is only the gradient of the pressure which appears in those equations, rather than the pressure itself. Nevertheless, if one wishes, one could remove this freedom by requiring that be of mean zero: .

Now suppose instead that we only require that and be -periodic, but do not require to be -periodic. Then we have the freedom to modify by a harmonic function which need not be -periodic, but whose gradient is -periodic. Since the gradient of a harmonic function is also harmonic, has to be constant, and so is an affine-linear function. Conversely, all affine-linear functions are harmonic, and their gradients are constant and thus also -periodic. Thus, one has the freedom in this setting to add an arbitrary affine-linear function to , and subtract the constant gradient of that function from .

Instead of periodicity, one can also impose decay conditions on the various functions. Suppose for instance that we require the pressure to lie in an space for some ; roughly speaking, this forces the pressure to decay to zero at infinity “on the average”. Then we only have the freedom to modify by a harmonic function that is also in the class (and modify by the negative gradient of this harmonic function). However, the mean value property of harmonic functions implies that

for any ball of some radius centred around , where denotes the measure of the ball. By Hölder’s inequality, we conclude that

Sending we conclude that vanishes identically; thus there are no non-trivial harmonic functions in . Thus there is uniqueness for the problem (15) if we require the pressure to lie in . If instead we require the vector field to be in , then we can modify by a harmonic function with in , thus vanishes identically and hence is constant. So if we require then we only have the freedom to adjust by arbitrary constants.

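Having discussed uniqueness, we now turn to existence. We begin with the periodic setting in which $u, p, F$ are required to be ${\bf Z}^d$-periodic and smooth, so that they can also be viewed (by slight abuse of notation) as functions on the torus ${\bf R}^d/{\bf Z}^d$. The system (15) is linear and translation-invariant, which strongly suggests that one solve the system using the Fourier transform (which tends to diagonalise linear translation-invariant equations, because the plane waves $x \mapsto e^{2\pi i k \cdot x}$ that underlie the Fourier transform are the eigenfunctions of translation). Indeed, we may expand $u, p, F$ as Fourier series

$$ u(x) = \sum_{k \in {\bf Z}^d} \hat u(k) e^{2\pi i k \cdot x}, \quad p(x) = \sum_{k \in {\bf Z}^d} \hat p(k) e^{2\pi i k \cdot x}, \quad F(x) = \sum_{k \in {\bf Z}^d} \hat F(k) e^{2\pi i k \cdot x}$$

where the Fourier coefficients $\hat u(k) \in {\bf C}^d$, $\hat p(k) \in {\bf C}$, $\hat F(k) \in {\bf C}^d$ are given by the formulae

$$ \hat u(k) := \int_{{\bf R}^d/{\bf Z}^d} u(x) e^{-2\pi i k \cdot x}\ dx, \quad \hat p(k) := \int_{{\bf R}^d/{\bf Z}^d} p(x) e^{-2\pi i k \cdot x}\ dx, \quad \hat F(k) := \int_{{\bf R}^d/{\bf Z}^d} F(x) e^{-2\pi i k \cdot x}\ dx.$$

When $u, p, F$ are smooth, then $\hat u, \hat p, \hat F$ are rapidly decreasing as $|k| \to \infty$, which will allow us to justify manipulations such as interchanging summation and derivatives without difficulty. Expanding out (15) in Fourier series and then comparing Fourier coefficients (which are unique for smooth functions), we obtain the system

$$ \hat u(k) + 2\pi i k\, \hat p(k) = \hat F(k) \qquad (18)$$

$$ 2\pi i k \cdot \hat u(k) = 0 \qquad (19)$$

for each $k \in {\bf Z}^d$. As mentioned above, the Fourier transform has *diagonalised* the system (15), in that there are no interactions between different frequencies $k$, and we now have a decoupled system of vector equations. To solve these equations, we can take the inner product of both sides of (18) with $2\pi i k$ and apply (19) to conclude that

$$ -4\pi^2 |k|^2 \hat p(k) = 2\pi i k \cdot \hat F(k).$$

For non-zero $k$, we can then solve for $\hat p(k)$ and hence $\hat u(k)$ by the formulae

$$ \hat p(k) = \frac{2\pi i k \cdot \hat F(k)}{-4\pi^2 |k|^2}$$

and

$$ \hat u(k) = \hat F(k) - 2\pi i k \frac{2\pi i k \cdot \hat F(k)}{-4\pi^2 |k|^2}.$$

For $k = 0$, these formulae no longer apply; however from (18) we see that $\hat u(0) = \hat F(0)$, while $\hat p(0)$ can be arbitrary (which corresponds to the aforementioned freedom to add an arbitrary constant to $p$). Thus we have the explicit general solution

$$ u(x) = \hat F(0) + \sum_{k \in {\bf Z}^d \backslash \{0\}} \left( \hat F(k) - 2\pi i k \frac{2\pi i k \cdot \hat F(k)}{-4\pi^2 |k|^2} \right) e^{2\pi i k \cdot x}$$

$$ p(x) = C + \sum_{k \in {\bf Z}^d \backslash \{0\}} \frac{2\pi i k \cdot \hat F(k)}{-4\pi^2 |k|^2} e^{2\pi i k \cdot x}$$

where $C$ is an arbitrary constant. Note that if $F$ is smooth, then $\hat F$ is rapidly decreasing and the functions $u, p$ defined by the above formulae are also smooth.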

We can write the above general solution in a form similar to (16), (17) as

where, *by definition*, the inverse Laplacian of a smooth periodic function of mean zero is given by the Fourier series formula

(Note that automatically has mean zero.) It is easy to see that for such functions , thus justifying the choice of notation. We refer to as the (periodic) Leray projection of and denote it , thus in the above solution we have . By construction, is divergence-free, and vanishes whenever is a gradient .
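To make the frequency-by-frequency solution concrete, here is a minimal Python sketch that solves the decoupled system at a single frequency. It assumes the Fourier series convention $e^{2\pi i k \cdot x}$, so that the gradient acts as multiplication by $2\pi i k$; the function name `leray_mode` and the choice to normalise the free constant $\hat p(0)$ to zero are assumptions made for illustration.

```python
import math

def leray_mode(k, F_hat):
    """Solve u_hat + 2*pi*i*k*p_hat = F_hat with k . u_hat = 0 at one frequency k.

    k is a tuple of integers and F_hat a list of complex numbers of the same
    length. Returns (u_hat, p_hat), where u_hat is the Leray projection of F_hat.
    """
    k2 = sum(kj * kj for kj in k)
    if k2 == 0:
        # Zero frequency: u_hat(0) = F_hat(0), and p_hat(0) is free (set to 0 here).
        return list(F_hat), 0j
    k_dot_F = sum(kj * Fj for kj, Fj in zip(k, F_hat))
    p_hat = k_dot_F / (2j * math.pi * k2)
    # Subtract the component of F_hat parallel to k.
    u_hat = [Fj - kj * k_dot_F / k2 for kj, Fj in zip(k, F_hat)]
    return u_hat, p_hat

if __name__ == "__main__":
    k = (1, 2, -1)
    F_hat = [1 + 2j, -3j, 0.5 + 0j]
    u_hat, p_hat = leray_mode(k, F_hat)
    # k . u_hat should vanish up to rounding error.
    print(abs(sum(kj * uj for kj, uj in zip(k, u_hat))))
```

At each non-zero frequency the projection simply removes the component of $\hat F(k)$ parallel to $k$, which is why it annihilates gradients and cannot increase the size of any Fourier coefficient.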

If we require to be -periodic, but do not require to be -periodic, then by the previous uniqueness discussion, the general solution is now

where and are arbitrary.

The above discussion was for smooth periodic functions , but one can make the same construction in other function spaces. For instance, recall that for any , the Sobolev space consists of those elements of whose Sobolev norm

is finite, where we use the “Japanese bracket” convention . (One can also define Sobolev spaces for negative , but we will not need them here.) Basic properties of these Sobolev spaces can be found in this previous post. From comparing Fourier coefficients we see that the operators and defined for smooth periodic functions can be extended without difficulty to (taking values in and respectively), with bounds of the form

Thus, if , then one can solve (15) (in the sense of distributions, at least) with some and , with bounds

In particular, the Leray projection is bounded on . (In fact it is a contraction; see Exercise 16.)

One can argue similarly in the non-periodic setting, as long as one avoids the one-dimensional case which contains some technical divergences. Recall (see e.g., these previous lecture notes on this blog) that functions have a Fourier transform , which for in the dense subclass of is defined by the formula

and then is extended to the rest of by continuous extension in the topology, taking advantage of the Plancherel identity

The Fourier transform is then extended to tempered distributions in the usual fashion (see this previous set of notes).

We then define the Sobolev space for to be the collection of those functions for which the norm

is finite; equivalently, one has

where the Fourier multiplier is defined by

For any vector-valued function in the Schwartz class, we define to be the scalar tempered distribution whose (distributional) Fourier transform is given by the formula

and define the Leray projection to be the vector-valued distribution

or in terms of the (distributional) Fourier transform

Then by using the well-known relationship

between (distributional) derivatives and (distributional) Fourier transforms we see that the tempered distributions

solve the equation (15) in the distributional sense, and hence also in the classical sense since have rapidly decreasing Fourier transforms and are thus smooth.

As in the periodic case we see that we have the bound

for all Schwartz vector fields (in fact is again a contraction), so we can extend the Leray projection without difficulty to functions. The operator can similarly be extended continuously to a map from to the space of scalar tempered distributions with gradient in , although we will not need to work directly with the pressure much in this course. This allows us to solve (15) in a distributional sense for all .

Remark 15 (Remark removed due to inaccuracy.)

Exercise 16 (Hodge decomposition) Define the following three subspaces of the Hilbert space :

- is the space of all elements of of the form (in the sense of distributions) for some ;
- is the space of all elements of that are weakly harmonic in the sense that (in the sense of distributions).
- is the space of all elements of which take the form
(with the usual summation conventions) for some tensor obeying the antisymmetry property .

- (a) Show that these three spaces are closed subspaces of , and one has the orthogonal decomposition
This is a simple case of a more general splitting known as the Hodge decomposition, which is available for more general differential forms on manifolds.

- (b) Show that on , the Leray projection is the orthogonal projection to .
- (c) Show that the Leray projection is a contraction on for all .

Exercise 17 (Helmholtz decomposition) Define the following two subspaces of the Hilbert space :

- is the space of functions which are divergence-free, by which we mean that in the sense of distributions.
- is the space of functions which are curl-free, by which we mean that in the sense of distributions, where is the rank two tensor with components .

- (a) Show that these two spaces are closed subspaces of , and one has the orthogonal decomposition
This is known as the Helmholtz decomposition (particularly in the three-dimensional case , in which one can interpret as the curl of ).

- (b) Show that on , the Leray projection is the orthogonal projection to .
- (c) Show that the Leray projection is a contraction on for all .

Exercise 18 (Singular integral form of Leray projection) Let . Then the function is locally integrable and thus well-defined as a distribution.

- (a) For , show that the distribution , defined on test functions by the formula
can be expressed in principal value form as

where denotes the surface area of the unit sphere in and is the Kronecker delta.

- (b) Conclude in particular the Newtonian potential identity
where (at the risk of a mild notational clash) is the Dirac delta distribution at .

- (c) For a test vector field , establish the explicit form
- (d) Extend part (c) to the case . (Hint: Replace the role of with , in the spirit of the replica trick from physics.)

Remark 19 One can also solve (15) in -based Sobolev spaces for exponents other than by using Calderón-Zygmund theory and the singular integral form of the Leray projection given in Exercise 18. However, we will try to avoid having to rely on this theory in these notes.

** — 3. The heat equation — **

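We now turn to the study of the heat equation

$$ \partial_t u = \nu \Delta u \qquad (23)$$

on a spacetime region $[0,T] \times {\bf R}^d$, with initial data $u(0,x) = u_0(x)$, where $\nu > 0$ is a fixed constant; we also consider the inhomogeneous analog

$$ \partial_t u = \nu \Delta u + F. \qquad (24)$$

Formally, the solution to the initial value problem for (23) should be given by $u(t) = e^{\nu t \Delta} u_0$, and (by the Duhamel formula (6)) the solution to (24) should similarly be

$$ u(t) = e^{\nu t \Delta} u_0 + \int_0^t e^{\nu (t-s) \Delta} F(s)\ ds,$$

but there are subtleties arising from the unbounded nature of $\Delta$.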

The first issue is that even if vanishes and is required to be smooth without any decay hypothesis at infinity, one can have non-uniqueness. The following counterexample is basically due to Tychonoff:

Exercise 20 (Tychonoff example) Let be a real number, and let .

- (a) Show that there exists a smooth, compactly supported function , not identically zero, obeying the derivative bounds
for all and . (Hint: one can construct as the convolution of an infinite number of approximate identities , where each is supported on an interval of length , and use the identity repeatedly. To justify things rigorously, one may need to first work with finite convolutions and take limits.)
- (b) With as in part (a), show that the function
is well-defined as a smooth function on that is compactly supported in time, and obeys the heat equation (23) for without being identically zero.

- (c) Show that the initial value problem to (23) is not unique (for any dimension ) if is only required to be smooth, even if vanishes.

Exercise 21 (Kowalevski example)

- (a) Let be the function . Show that there does not exist any solution to (23) that is jointly real analytic in at (that is to say, it can be expressed as an absolutely convergent power series in in a neighbourhood of ).
- (b) Modify the above example by replacing by a function that extends to an entire function on (as opposed to , which has poles at ).
This classic example, due to Sofia Kowalevski, demonstrates the need for some hypotheses on the PDE in order to invoke the Cauchy-Kowalevski theorem.

One can recover uniqueness (forwards in time) by imposing some growth condition at infinity. We give a simple example of this, which illustrates a basic tool in the subject, namely the *energy method*, which is based on understanding the rate of change of various “energy” integrals of integrands which primarily involve quadratic expressions of the solution or its derivatives. The reason for favouring quadratic expressions is that they are more likely to produce integrals with a definite sign (positive definite or negative definite), such as (squares of) norms or higher Sobolev norms of the solution, particularly after suitable application of integration by parts.

Proposition 22 (Uniqueness with energy bounds) Let , and let be smooth solutions to (24) with common initial data and forcing term such that the norm of is finite, and similarly for . Then .

*Proof:* As the heat equation (23) is linear, we may subtract from and assume without loss of generality that , , and . By working with each component separately we may take .

Let be a non-negative test function supported on that equals on . Let be a parameter, and consider the “energy” (or more precisely, “local mass”)

for . As , we have . As is smooth and is compactly supported, depends smoothly on , and we can differentiate under the integral sign to obtain

Using (23) we thus have

using the usual summation conventions.

A basic rule of thumb in the energy method is this: whenever one is faced with an integral in which one term in the integrand has much lower regularity (or much less control on regularity) than any other, due to a large number of derivatives placed on that term, one should integrate by parts to move one or more derivatives off of that term to other terms in order to make the distribution of derivatives more balanced (which, as we shall see, tends to make the integrals easier to estimate, or to ascribe a definite sign to). Accordingly, we integrate by parts to write

The first term is non-positive, thus we may discard it to obtain the inequality

Another rule of thumb in the energy method is to keep an eye out for opportunities to express some expression appearing in the integrand as a total derivative. In this case, we can write

and then integrate by parts to move the derivative on to the much more slowly varying function to conclude

In particular we have a bound of the form

where the subscript indicates that the implied constant can depend on and . Since , we conclude from the fundamental theorem of calculus that

for all (note how it is important here that we evolve forwards in time, rather than backwards). Sending and using the dominated convergence theorem, we conclude that

and thus vanishes identically, as required.

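Now we turn to existence for the heat equation, restricting attention to forward in time solutions. Formally, if one solves the heat equation (23), then on taking spatial Fourier transforms

$$ \hat u(t,\xi) := \int_{{\bf R}^d} u(t,x) e^{-2\pi i x \cdot \xi}\ dx$$

the equation transforms to the ODE

$$ \partial_t \hat u(t,\xi) = -4\pi^2 \nu |\xi|^2 \hat u(t,\xi)$$

which when combined with the initial condition $u(0) = u_0$ gives

$$ \hat u(t,\xi) = e^{-4\pi^2 \nu t |\xi|^2} \hat u_0(\xi)$$

and hence by the Fourier inversion formula we arrive (formally, at least) at the representation

$$ u(t,x) = \int_{{\bf R}^d} e^{-4\pi^2 \nu t |\xi|^2} \hat u_0(\xi) e^{2\pi i x \cdot \xi}\ d\xi. \qquad (25)$$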

As we are assuming forward time evolution , the exponential factor here is bounded. In the case that is a Schwartz function, then is also Schwartz, and this formula is certainly well-defined to be smooth in both time and space (and rapidly decreasing in space for any fixed time), and in particular in ; one can easily justify differentiation under the integral sign to conclude that (23) is indeed verified, and the Fourier inversion formula shows that we have the initial data condition . So this is the unique solution to the initial value problem (23) for the heat equation that lies in . By definition we declare the right-hand side of (25) to be , thus

for all and all Schwartz functions ; equivalently, one has

(One can justify this choice of notation using the functional calculus of the self-adjoint operator , as discussed for instance in this previous blog post, but we will not do so here since the Fourier transform is available as a substitute.) It is also clear from (27) that commutes with other Fourier multipliers such as or constant-coefficient differential operators, on Schwartz functions at least.

From (27) and Plancherel’s theorem we see that for is a contraction in (the Schwartz functions of) , and more generally in for any , thus

for any Schwartz and any . Thus by density one can extend the heat propagator for to all of , in a fashion that is a contraction on and more generally on . By a limiting argument, (27) holds almost everywhere for all .

There is also a smoothing effect:

Exercise 23 (Smoothing effect) Let . Show that for all and .

Exercise 24 (Fundamental solution for the heat equation) For and , establish the identity for almost every . (Hint: first work with Schwartz functions. Either compute the Fourier transform explicitly, or verify directly that the heat equation initial value problem is solved by the right-hand side.) Conclude in particular that (after modification on a measure zero set if necessary) is smooth for any .
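As a purely numerical sanity check (not a substitute for the exercise), the following Python sketch works in one spatial dimension and assumes that the fundamental solution in question is the standard Gaussian heat kernel $K_t(x) = (4\pi \nu t)^{-1/2} e^{-x^2/(4\nu t)}$. It checks that the kernel has total mass one and that it obeys $\partial_t K = \nu \partial_{xx} K$ up to finite-difference error. All function names are hypothetical.

```python
import math

def heat_kernel(t, x, nu=1.0):
    """Assumed one-dimensional Gaussian heat kernel at time t > 0."""
    return math.exp(-x * x / (4.0 * nu * t)) / math.sqrt(4.0 * math.pi * nu * t)

def kernel_mass(t, nu=1.0, L=20.0, n=4000):
    """Midpoint-rule approximation of the integral of the kernel over [-L, L]."""
    h = 2.0 * L / n
    return sum(heat_kernel(t, -L + (i + 0.5) * h, nu) for i in range(n)) * h

def heat_residual(t, x, nu=1.0, h=1e-3):
    """Finite-difference estimate of |dK/dt - nu * d^2K/dx^2| at (t, x)."""
    dt = (heat_kernel(t + h, x, nu) - heat_kernel(t - h, x, nu)) / (2.0 * h)
    dxx = (heat_kernel(t, x + h, nu) - 2.0 * heat_kernel(t, x, nu)
           + heat_kernel(t, x - h, nu)) / (h * h)
    return abs(dt - nu * dxx)

if __name__ == "__main__":
    print(kernel_mass(0.3, nu=0.7))         # should be close to 1
    print(heat_residual(0.3, 0.4, nu=0.7))  # should be close to 0
```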

Exercise 25 (Ill-posedness of the backwards heat equation) Show that there exists a Schwartz function with the property that there is no solution to (23) with final data for any . (Hint: choose so that the Fourier transform decays somewhat, but not extremely rapidly. Then argue by contradiction using (27).)

Exercise 26 (Continuity in the strong operator topology) For any , let denote the Banach space of functions such that for each , lies in and varies continuously and boundedly in in the strong topology, with norm . Show that if and solves the heat equation on , then with

Similar considerations apply to the inhomogeneous heat equation (24). If and are Schwartz for some , then the function defined by the Duhamel formula

can easily be verified to also be Schwartz and solve (24) with initial data ; by Proposition 22, this is the only such solution in . It also obeys good estimates:

Exercise 27 (Energy estimates) Let and be Schwartz functions for some , and let be the solution to the equation with initial condition given by the Duhamel formula. For any , establish the energy estimate

in two different ways:

- (i) By using the Fourier representation (27) and Plancherel’s formula;
- (ii) By using energy methods as in the proof of Proposition 22. (Hint: first reduce to the case . You may find the arithmetic mean-geometric mean inequality to be useful.)

Here of course we are using the norms

and

The energy estimate contains some smoothing effects similar (though not identical) to those in Exercise 23, since it shows that can in principle be one degree of regularity smoother than (if one averages in time in an sense, and the viscosity is not sent to zero), and two degrees of regularity smoother than the forcing term (with the same caveats). As we shall shortly see, this smoothing effect will allow us to handle the nonlinear terms in the Navier-Stokes equations for the purposes of setting up a local well-posedness theory.
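The Duhamel formula (28) can be checked one Fourier mode at a time. The sketch below assumes the convention in which the heat propagator multiplies the mode at frequency $\xi$ by $e^{-4\pi^2 \nu t |\xi|^2}$, so that a single mode obeys the scalar ODE $\partial_t \hat u = -a \hat u + \hat F(t)$ with $a = 4\pi^2 \nu |\xi|^2$. It evaluates the Duhamel integral by the midpoint rule; the function name and the quadrature choice are illustrative assumptions.

```python
import math

def duhamel_mode(t, u0_hat, f_hat, nu, xi2, n=2000):
    """Evaluate e^{-a t} u0_hat + int_0^t e^{-a (t - s)} f_hat(s) ds, a = 4 pi^2 nu xi2.

    u0_hat is the initial Fourier coefficient, f_hat a callable giving the
    forcing coefficient at time s, and xi2 the squared frequency |xi|^2.
    """
    a = 4.0 * math.pi ** 2 * nu * xi2
    h = t / n
    integral = 0.0
    for i in range(n):
        s = (i + 0.5) * h  # midpoint of the i-th subinterval
        integral += math.exp(-a * (t - s)) * f_hat(s)
    return math.exp(-a * t) * u0_hat + integral * h

if __name__ == "__main__":
    nu, xi2, t = 0.1, 1.0, 0.5
    a = 4.0 * math.pi ** 2 * nu * xi2
    # For constant forcing 2.0 the exact solution is known in closed form.
    exact = math.exp(-a * t) * 1.5 + 2.0 * (1.0 - math.exp(-a * t)) / a
    print(duhamel_mode(t, 1.5, lambda s: 2.0, nu, xi2), exact)
```

Note that the formula remains meaningful at $\nu = 0$, where the mode simply accumulates the time integral of the forcing; the smoothing described above is lost in that limit.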

Exercise 28 (Distributional solution) Let , let , and let for some . Let be given by the Duhamel formula (28). Show that (24) is true in the spacetime distributional sense, or more precisely that in the sense of spacetime distributions for any test function supported in the interior of .

Pretty much all of the above discussion can be extended to the periodic setting:

Exercise 29

- (a) If is smooth, define by the formula
where are the Fourier coefficients of . Show that extends continuously to a contraction on for every , and that if then the function lies in .

- (b) For and , establish the formula
for almost every , where (by abuse of notation) we identify functions with -periodic functions in the usual fashion.

- (c) If , and and are smooth, show that the function defined by (28) is smooth and solves the inhomogeneous equation (24) with initial data , and that this is the unique smooth solution to that initial value problem.
- (d) If , , and and are smooth, and is the unique smooth solution to the heat equation with , establish the energy estimate
- (e) If , and , show that the function given by (28) is in and obeys (24) in the sense of spacetime distributions (30).
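For intuition about part (a), here is a minimal Python sketch of the periodic heat propagator acting on a finite set of Fourier coefficients. It assumes the multiplier $e^{-4\pi^2 \nu t |k|^2}$ (the natural form for Fourier series in $e^{2\pi i k \cdot x}$) and the Sobolev norm $(\sum_k \langle k \rangle^{2s} |\hat u(k)|^2)^{1/2}$ with $\langle k \rangle = (1 + |k|^2)^{1/2}$. Representing a function by a dictionary of coefficients is an illustrative choice, not something prescribed by the notes.

```python
import math

def heat_propagate(coeffs, t, nu):
    """Apply the assumed multiplier exp(-4 pi^2 nu t |k|^2) to each coefficient.

    coeffs maps integer frequency tuples k to complex Fourier coefficients.
    """
    out = {}
    for k, c in coeffs.items():
        k2 = sum(kj * kj for kj in k)
        out[k] = math.exp(-4.0 * math.pi ** 2 * nu * t * k2) * c
    return out

def sobolev_norm(coeffs, s):
    """H^s norm computed from Fourier coefficients with the Japanese bracket weight."""
    total = 0.0
    for k, c in coeffs.items():
        k2 = sum(kj * kj for kj in k)
        total += (1.0 + k2) ** s * abs(c) ** 2
    return math.sqrt(total)

if __name__ == "__main__":
    u0 = {(0, 0): 1 + 0j, (1, 0): 2 - 1j, (1, -2): 0.5j}
    u = heat_propagate(u0, 0.1, 0.5)
    # The H^2 norm should not increase under the flow.
    print(sobolev_norm(u0, 2), sobolev_norm(u, 2))
```

Since every multiplier has modulus at most one for $t \geq 0$, the contraction property on each Sobolev space is visible coefficient by coefficient, and the zero mode (the mean) is left unchanged.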

Remark 30 The heat equation for negative viscosities can be transformed into a positive viscosity heat equation by time reversal: if solves the equation , then solves the equation . Thus one can solve negative viscosity heat equations (also known as backwards heat equations) backwards in time, but one tends not to have well-posedness forwards in time. In a similar spirit, if is positive, one can normalise it to (say) by an appropriate rescaling of the time variable, . However, we will generally keep the parameter non-normalised in preparation for understanding the limit as .

** — 4. Local well-posedness for Navier-Stokes — **

We now have all the ingredients necessary to create a local well-posedness theory for the Navier-Stokes equations (1).

We first dispose of the one-dimensional case $d = 1$, which is rather degenerate as incompressible one-dimensional fluids are somewhat boring. Namely, suppose that one had a smooth solution $u, p$ to the one-dimensional Navier-Stokes equations

$$ \partial_t u + u \partial_x u = \nu \partial_{xx} u - \partial_x p$$

$$ \partial_x u = 0.$$

The second equation implies that $u$ is just a function of time, $u(t,x) = u(t)$, and the first equation becomes

$$ \partial_t u = - \partial_x p.$$

To solve this equation, one can set $u(t)$ to be an arbitrary smooth function of time, and then set

$$ p(t,x) = - x\, \partial_t u(t) + C(t)$$

for an arbitrary smooth function $C(t)$. If one requires the pressure to be bounded, then $\partial_t u$ vanishes identically, and then $u$ is constant in time, which among other things shows that the initial value problem is (rather trivially) well-posed in the category of smooth solutions, up to the ability to alter the pressure by an arbitrary constant $C(t)$. On the other hand, if one does not require the pressure to stay bounded, then one has a lot less uniqueness, since the function $u(t)$ is essentially unconstrained.

Now we work in two or higher dimensions , and consider solutions to (1) on the spacetime region . To begin with, we assume that is smooth and periodic in space: for ; we assume is smooth but do not place any periodicity hypotheses on it. Then, by (1), is periodic. In particular, for any and , the function has vanishing gradient and is thus constant in , so that

for all and some function of . The map is a homomorphism for fixed , so we can write for some , which will be smooth since is smooth. We thus have for some smooth -periodic function . By subtracting off the mean, we can further decompose

for some smooth function and some smooth -periodic function which has mean zero at every time.

Note that one can simply omit the constant term from the pressure without affecting the system (1). One can also eliminate the linear term by the following “generalised Galilean transformation”. If are as above, and one lets

be the primitive of , then a short calculation reveals that the smooth function defined by

solves the Navier-Stokes equations

with having the same initial data as ; conversely, if is a solution to Navier-Stokes, then so is . In particular this reveals a lack of uniqueness for the periodic Navier-Stokes equations that is essentially the same lack of uniqueness that is present for the Leray system: one can add an arbitrary spatially affine function to the pressure by applying a suitable Galilean transform to . On the other hand, we can eliminate this lack of uniqueness by requiring that the pressure be *normalised* in the sense that and , that is to say we require to be -periodic and mean zero. The above discussion shows that any smooth solution to Navier-Stokes with periodic can be transformed by a Galilean transformation to one in which the pressure is normalised.

Once the pressure is normalised, it turns out that one can recover uniqueness (much as was the case with the Leray system):

Theorem 31 (Uniqueness with normalised pressure) Let be two smooth periodic solutions to (1) on with normalised pressure such that . Then .

*Proof:* We use the energy method. Write , then subtracting (1) for from we see that is smooth with

and

Now we consider the energy . This varies smoothly with , and we can differentiate under the integral sign to obtain

where

and we have omitted the explicit dependence on and for brevity.

For , we observe the total derivative and integrate by parts to conclude that

since is divergence-free. Similarly, integration by parts shows that vanishes since is divergence-free. Another integration by parts gives

and hence . Finally, from Hölder’s inequality we have

and hence

Since , we conclude from Gronwall’s inequality that for all , and hence is identically zero, thus . Substituting this into (1) we conclude that ; as have mean zero, we conclude (e.g., from Fourier inversion) that , and the claim follows.

Now we turn to existence in the periodic setting, assuming normalised pressure. For various technical reasons, it is convenient to reduce to the case when the velocity field has zero mean. Observe that the right-hand sides , of (1) have zero mean on , thanks to integration by parts. A further integration by parts, using the divergence-free condition , reveals that the transport term also has zero mean:

Thus, we see that the mean is a conserved integral of motion: if is the mean initial velocity, and is a solution to (1) (obeying some minimal regularity hypothesis), then continues to have mean velocity for all subsequent times. On the other hand, if is a smooth periodic solution to (1) with normalised pressure and initial velocity , then the Galilean transform defined by

can be easily verified to be a smooth periodic solution to (1) with normalised pressure and initial velocity . Of course, one can reconstruct from by the inverse transformation

Thus, up to this simple transformation, solving the initial value problem for (1) for is equivalent to that of , so we may assume without loss of generality that the initial velocity (and hence the velocity at all subsequent times) has zero mean.

A general rule of thumb is that whenever an integral of a solution to a PDE can be proven to vanish (or be equal to boundary terms) by integration by parts, it is because the integrand can be rewritten in “divergence form” – as the divergence of a tensor of one higher rank. (This is because the integration by parts identity arises from the divergence form of the expression .) Thus we expect the transport term to be in divergence form. Indeed, in components we have

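$$ ((u \cdot \nabla) u)_i = u_j \partial_j u_i;$$

since we have the divergence-free condition $\partial_j u_j = 0$, we thus have from the Leibniz rule that

$$ ((u \cdot \nabla) u)_i = \partial_j (u_j u_i).$$

We write this in coordinate-free notation as

$$ (u \cdot \nabla) u = \nabla \cdot (u \otimes u)$$

where $u \otimes u$ is the tensor product $(u \otimes u)_{ij} := u_i u_j$ and $\nabla \cdot T$ denotes the divergence

$$ (\nabla \cdot T)_i := \partial_j T_{ji}.$$

Thus we can rewrite (1) as the system

$$ \partial_t u + \nabla \cdot (u \otimes u) = \nu \Delta u - \nabla p; \quad \nabla \cdot u = 0.$$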

Next, we observe that we can use the Leray projection operator to eliminate the role of the (normalised) pressure. Namely, if are a smooth periodic solution to (1) with normalised pressure, then on applying (which preserves divergence-free vector fields such as and , but annihilates gradients such as ) we conclude an equation that does not involve the pressure at all:

Conversely, suppose that one has a smooth periodic solution to (33) with initial condition for some smooth periodic divergence-free vector field . Taking divergences of both sides of (33), we then conclude that

that is to say obeys the heat equation (23). Since is periodic, smooth, and vanishes at , we see from Exercise 29(c) that vanishes on all of , thus is divergence free on the entire time interval . From (33) and (22) we thus see that if one defines to be the function

(which can easily be verified to be a smooth function in both space and time) then is a smooth periodic solution to (1) with normalised pressure and initial condition (and is thus the unique solution to this system, thanks to Theorem 31). Thus, the problem of finding a smooth solution to (1) in the smooth periodic setting with normalised pressure and divergence-free initial data is equivalent to that of solving (33) with the same initial data.
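The divergence-form rewriting of the transport term used in passing to (33) can be checked numerically. The following Python sketch compares $u_j \partial_j u_i$ with $\partial_j (u_j u_i)$ by centred finite differences in two dimensions, for a divergence-free field built from an assumed stream function and, for contrast, for a field that is not divergence-free. The particular fields and function names are hypothetical examples.

```python
import math

TWO_PI = 2.0 * math.pi

def u_div_free(x, y):
    """Velocity (d_y psi, -d_x psi) for psi = sin(2 pi x) sin(2 pi y) / (2 pi)."""
    return (math.sin(TWO_PI * x) * math.cos(TWO_PI * y),
            -math.cos(TWO_PI * x) * math.sin(TWO_PI * y))

def u_compressible(x, y):
    """A periodic field with non-zero divergence, for comparison."""
    return (math.sin(TWO_PI * x), 0.0)

def partial(f, x, y, axis, h=1e-4):
    """Centred difference of the scalar function f along the given axis."""
    if axis == 0:
        return (f(x + h, y) - f(x - h, y)) / (2.0 * h)
    return (f(x, y + h) - f(x, y - h)) / (2.0 * h)

def transport_term(u, x, y):
    """Components u_j d_j u_i of (u . grad) u, summed over j."""
    here = u(x, y)
    result = []
    for i in range(2):
        component = lambda a, b, i=i: u(a, b)[i]
        result.append(sum(here[j] * partial(component, x, y, j) for j in range(2)))
    return tuple(result)

def divergence_form(u, x, y):
    """Components d_j (u_j u_i) of div(u tensor u), summed over j."""
    result = []
    for i in range(2):
        total = 0.0
        for j in range(2):
            product = lambda a, b, i=i, j=j: u(a, b)[j] * u(a, b)[i]
            total += partial(product, x, y, j)
        result.append(total)
    return tuple(result)

if __name__ == "__main__":
    print(transport_term(u_div_free, 0.13, 0.37), divergence_form(u_div_free, 0.13, 0.37))
    print(transport_term(u_compressible, 0.1, 0.2), divergence_form(u_compressible, 0.1, 0.2))
```

The two expressions agree (up to discretisation error) only for the divergence-free field; for the second field they differ by the term $u_i \partial_j u_j$, which is exactly what the Leibniz rule predicts.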

By Duhamel’s formula (Exercise 29(c)), any smooth solution to the initial value problem (33) with obeys the Duhamel formula

(The operator is sometimes referred to as the *Oseen operator* in the literature.) Conversely, a smooth solution to (34) will solve the initial value problem (33) with initial data .
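
To make the Duhamel formula concrete, here is a minimal sketch for a single Fourier mode of a forced heat equation, assuming the mode obeys the scalar ODE $a'(t) = -\nu k^2 a(t) + f(t)$. The names `nu`, `k`, `forcing`, and both helper functions are hypothetical, chosen only for this illustration. The sketch evaluates the Duhamel integral by the trapezoid rule and compares it with a direct RK4 integration of the ODE.

```python
import math

def duhamel_mode(a0, forcing, nu, k, t, steps=2000):
    """Evaluate e^{-nu k^2 t} a0 + int_0^t e^{-nu k^2 (t-s)} forcing(s) ds
    using the trapezoid rule with `steps` subintervals."""
    lam = nu * k * k
    h = t / steps
    total = 0.0
    for n in range(steps + 1):
        s = n * h
        weight = 0.5 if n in (0, steps) else 1.0
        total += weight * math.exp(-lam * (t - s)) * forcing(s)
    return math.exp(-lam * t) * a0 + h * total

def rk4_mode(a0, forcing, nu, k, t, steps=2000):
    """Integrate a'(s) = -nu k^2 a(s) + forcing(s) from 0 to t with classical RK4."""
    lam = nu * k * k
    h = t / steps

    def rhs(s, a):
        return -lam * a + forcing(s)

    a, s = a0, 0.0
    for _ in range(steps):
        s1 = rhs(s, a)
        s2 = rhs(s + h / 2, a + h * s1 / 2)
        s3 = rhs(s + h / 2, a + h * s2 / 2)
        s4 = rhs(s + h, a + h * s3)
        a += h * (s1 + 2 * s2 + 2 * s3 + s4) / 6
        s += h
    return a

if __name__ == "__main__":
    print(duhamel_mode(1.0, math.cos, 0.5, 2.0, 1.0))
    print(rk4_mode(1.0, math.cos, 0.5, 2.0, 1.0))
```

The two values should agree up to quadrature error, reflecting the equivalence between the differential and integral formulations for smooth solutions.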

To obtain existence of smooth periodic solutions (with normalised pressure) to the Navier-Stokes equations with given smooth divergence-free periodic initial data , it thus suffices to find a smooth periodic solution to the integral equation (34). We will achieve this by a two-step procedure:

- (i) (Existence at finite regularity) Construct a solution to (34) in a certain function space with a finite amount of regularity (assuming that the initial data has a similarly finite amount of regularity); and then
- (ii) (Propagation of regularity) show that if is in fact smooth, then the solution constructed in (i) is also smooth.

The reason for this two-step procedure is that one wishes to solve (34) using iteration-type methods (which for instance power the contraction mapping theorem that was used to prove the Picard existence theorem); however the function space that one ultimately wishes the solution to lie in is not well adapted for such iteration (for instance, it is not a Banach space, instead being merely a Fréchet space). Instead, we iterate in an auxiliary lower regularity space first, and then “bootstrap” the lower regularity to the desired higher regularity. Observe that the same situation occurred with the Picard existence theorem, where one performed the iteration in the low regularity space , even though ultimately one desired the solution to be continuously differentiable or even smooth.
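
The iteration idea can be illustrated on a toy scalar analogue, which is an assumption made purely for illustration and is not the Navier-Stokes system: the integral equation $u(t) = e^{-t} u_0 + \int_0^t e^{-(t-s)} u(s)^2\, ds$, whose smooth solutions solve $u' = -u + u^2$. The sketch below applies the map on a uniform time grid and records the sup-norm gap between successive iterates; the function names and grid parameters are hypothetical.

```python
import math

def picard_map(u, u0, T):
    """Apply Phi(u)(t) = e^{-t} u0 + int_0^t e^{-(t-s)} u(s)^2 ds
    to grid values u[0..n] on [0, T], using the trapezoid rule."""
    n = len(u) - 1
    h = T / n
    out = []
    for i in range(n + 1):
        t = i * h
        integral = 0.0
        if i > 0:  # the integral over [0, 0] is zero
            for j in range(i + 1):
                weight = 0.5 if j in (0, i) else 1.0
                integral += weight * math.exp(-(t - j * h)) * u[j] ** 2
        out.append(math.exp(-t) * u0 + h * integral)
    return out

def sup_distance(u, v):
    """Discrete sup-norm distance between two grid functions."""
    return max(abs(a - b) for a, b in zip(u, v))

def iterate(u0, T, grid=200, rounds=30):
    """Iterate Phi starting from the zero function; return the final
    iterate and the list of gaps between successive iterates."""
    u = [0.0] * (grid + 1)
    gaps = []
    for _ in range(rounds):
        new_u = picard_map(u, u0, T)
        gaps.append(sup_distance(new_u, u))
        u = new_u
    return u, gaps

if __name__ == "__main__":
    u, gaps = iterate(0.5, 0.5)
    print(gaps[:5])
    print(u[-1], 1.0 / (1.0 + math.exp(0.5)))  # compare with the exact solution
```

For $u_0 = 1/2$ the exact solution of the toy ODE is $1/(1+e^t)$, so the last line compares the fixed point of the iteration with a known answer. The gaps shrink rapidly when $T$ is small relative to the size of the data, which is the mechanism behind the contraction mapping argument.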

Of course, to run this procedure, one actually has to write down an explicit function space in which one will perform the iteration argument. Selection of this space is actually a non-trivial matter and often requires a substantial amount of trial and error, as well as experience with similar iteration arguments for other PDE. Often one is guided by the function space theory for the linearised counterpart of the PDE, which in this case is the heat equation (23). As such, the following definition can be at least partially motivated by the energy estimates in Exercise 29(d).

Definition 32 (Mild solution) Let , , and let be divergence-free, where denotes the subspace of consisting of mean zero functions. An -mild solution (or *Fujita-Kato mild solution*) to the Navier-Stokes equations with initial data is a function in the function space that obeys the integral equation (34) (in the sense of distributions) for all . We say that is a mild solution on if it is a mild solution on for every .

Remark 33 The definition of a mild solution could be extended to those choices of initial data that are not divergence-free, but then this solution concept no longer has any direct connection with the Navier-Stokes equations, so we will not consider such “solutions” here. Similarly, one could also consider mild solutions without the mean zero hypothesis, but the function space estimates are slightly less favourable in this setting and so we shall restrict attention to mean zero solutions only.

Note that the regularity on places in (with plenty of room to spare), which is more than enough regularity to make sense of the right-hand side of (34). One can also define mild solutions for other function spaces than the one provided here, but we focus on this notion for now, which was introduced in the work of Fujita and Kato. We record a simple compatibility property of mild solutions:

Exercise 34 (Splitting) Let , , let be divergence-free, and let . Let . Show that the following are equivalent:

- (i) is an mild solution to the Navier-Stokes equations on with initial data .
- (ii) is an mild solution to the Navier-Stokes equations on with initial data , and the translated function defined by is an mild solution to the Navier-Stokes equations with initial condition .

To use this notion of a mild solution, we will need the following harmonic analysis estimate:

Proposition 35 (Product estimate) Let , and let . Then one has , with the estimate

When this claim follows immediately from Hölder’s inequality. For the claim is similarly immediate from the Leibniz rule and the triangle and Hölder inequalities (noting that is comparable to ). For more general the claim is not quite so immediate (for instance, when one runs into difficulties controlling the intermediate term arising in the Leibniz expansion of ). Nevertheless the bound is still true. However, to prove it we will need to introduce a tool from harmonic analysis, namely Littlewood-Paley theory, and we defer the proof to the appendix.

We also need a simple case of Sobolev embedding:

Exercise 36 (Sobolev embedding)

- (a) If , show that for any , one has with
- (b) Show that the inequality fails at .
- (c) Establish the same statements with replaced by throughout.
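
One way to see where the threshold in this exercise comes from, assuming it is the usual condition that the regularity exceeds half the dimension, is that Cauchy-Schwarz on the Fourier side reduces the embedding to convergence of the lattice sum $\sum_k (1+|k|^2)^{-s}$. The sketch below is a hypothetical illustration in dimension two (so the endpoint is $s = 1$): it sums this weight over dyadic shells of the lattice. The function name and shell radii are assumptions.

```python
def lattice_shell_sum(s, r_inner, r_outer):
    """Sum of (1 + |k|^2)^(-s) over k in Z^2 with
    r_inner < max(|k1|, |k2|) <= r_outer."""
    total = 0.0
    for k1 in range(-r_outer, r_outer + 1):
        for k2 in range(-r_outer, r_outer + 1):
            if max(abs(k1), abs(k2)) > r_inner:
                total += (1.0 + k1 * k1 + k2 * k2) ** (-s)
    return total

if __name__ == "__main__":
    for s in (1.0, 1.5):
        shells = [lattice_shell_sum(s, r, 2 * r) for r in (25, 50, 100)]
        print(s, shells)
```

At the endpoint each dyadic shell contributes a comparable amount, so the full sum diverges logarithmically; above the endpoint the shell contributions decay and the sum converges.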

In particular, combining this exercise with Proposition 35 we see that for , is a Banach algebra:

Now we can construct mild solutions at high regularities .

Theorem 37 (Local well-posedness of mild solutions at high regularity) Let , and let be divergence-free. Then there exists a time and an mild solution to (34). Furthermore, this mild solution is unique.

The hypothesis is not optimal; we return to this point later in these notes.

*Proof:* We begin with existence. We can write (34) in the fixed point form

We remark that this expression automatically has mean zero since has mean zero. Let denote the function space

with norm

This is a Banach space. Because of the mean zero restriction on , we may estimate

Note that if , then by (35), which by Exercise 29(d) (and the fact that commutes with and is a contraction on ) implies that . Thus is a map from to . In fact we can obtain more quantitative control on this map. By using Exercise 29(d), (35), and the Hölder bound

Thus, if we set for a suitably large constant , and set for a sufficiently small constant , then maps the closed ball in to itself. Furthermore, for , we have by similar arguments to above

and hence if the constant is chosen small enough, is also a contraction (with constant, say, ) on . Thus there exists such that , thus is an mild solution.

Now we show that it is the only mild solution. Suppose for contradiction that there is another mild solution with the same initial data . This solution might not lie in , but it will lie in for some . By the same arguments as above, if is sufficiently small depending on then will be a contraction on , which implies that and agree on . Now we apply Exercise 34 to advance in time by and iterate this process (noting that depends on but does not otherwise depend on or ) until one concludes that on all of .

Iterating this as in the proof of Theorem 8, we have

Theorem 38 (Maximal Cauchy development) Let , and let be divergence-free. Then there exists a time and an mild solution to (34), such that if then as . Furthermore, and are unique.

In principle, if the initial data belongs to multiple Sobolev spaces the maximal time of existence could depend on (so that the solution exits different regularity classes at different times). However, this is not the case, because there is an -independent blowup criterion:

Proposition 39 (Blowup criterion) Let be as in Theorem 38. If , then .

Note from Exercise 36 that is finite for any . This shows that is the unique time at which the norm “blows up” (becomes infinite) and thus is independent of .

*Proof:* Suppose for contradiction that but that the quantity was finite. Let be parameters to be optimised later. We define the norm

As is a mild solution, this expression is finite.

We adapt the proof of Theorem 37. Using Exercise 29(d) (and Exercise 34) we have

Again we discard and use (a variant of) (37) to conclude

If we now use Proposition 35 in place of (35), we conclude that

If we choose to be sufficiently close to (depending on and ), we can absorb the second term on the RHS into the LHS and conclude that

In particular, stays bounded as , contradicting Theorem 38.

Corollary 40 (Existence of smooth solutions) If is smooth and divergence-free then there is a and a smooth periodic solution to the Navier-Stokes equations on with normalised pressure such that if , then . Furthermore, and are unique.

*Proof:* As discussed previously, we may assume without loss of generality that has mean zero. As is periodic and smooth, it lies in for every . From the preceding discussion we already have and a function that is an mild solution for every , and with if is finite. It will suffice to show that is smooth, since we know from preceding discussion that a smooth solution to (33) can be converted to a smooth solution to (1).

By Exercise 29, one has

in the sense of spacetime distributions. The right-hand side lies in for every , hence the left-hand side does also; this makes lie in . It is then easy to see that this implies that the right-hand side of the above equation lies in for every , and so now lies in for every . Iterating this (and using Sobolev embedding) we conclude that is smooth in space and time, giving the claim.

Remark 41 When , it is a notorious open problem whether the maximal lifespan given by the above corollary is always infinite.

Exercise 42 (Instantaneous smoothing) Let , let be divergence-free, and let be the maximal Cauchy development provided by Theorem 38. Show that is smooth on (note the omission of the initial time ). (Hint: first show that is a mild solution for arbitrarily small .)

Exercise 43 (Lipschitz continuous dependence on initial data) Let , let , and let be divergence-free. Suppose one has an mild solution to the Navier-Stokes equations with initial data . Show that there is a neighbourhood of in (the divergence-free elements of) , such that for every , there exists an mild solution to the Navier-Stokes equations with initial data with the map from to Lipschitz continuous (using the metric for the initial data and the metric for the solution ).

Now we discuss the issue of relaxing the regularity condition in the above theory. The main inefficiency in the above arguments is the use of the crude estimate (37), which sacrifices some of the exponent in time in exchange for extracting a positive power of the lifespan that can be used to create a contraction mapping, as long as is small enough. It turns out that by using a different energy estimate than Exercise 29(d), one can avoid such an exchange, allowing one to construct solutions at lower regularity, and in particular at the “critical” regularity of . Furthermore, in the category of smooth solutions, one can even achieve the desirable goal of ensuring that the time of existence is infinite – but only provided that the initial data is small. More precisely,

Proposition 44 Let and let . Then the function defined by the Duhamel formula also has mean zero for all , and obeys the estimates

*Proof:* By Minkowski’s integral inequality, it will suffice to establish the bounds in the case . The first two norms of the right-hand side are already established by Exercise 29(d), so it remains to establish the estimate

By working with the rescaled function (and also rescaling ), we may normalise . By a limiting argument we may assume without loss of generality that is Schwartz. We cannot directly apply Exercise 36 here due to the failure of endpoint Sobolev embedding; nevertheless we may argue as follows. For any , we see from (31), the mean zero hypothesis, and the triangle inequality that

and hence by Cauchy-Schwarz and the bound

(which can be verified using the integral test for , while for it is easy to bound the LHS by ) we have

Integrating this in using Fubini’s theorem, we conclude

and the claim follows.

This gives the following small data global existence result, also due to Fujita and Kato:

Theorem 45 (Small data global existence) Suppose that is divergence-free with norm at most , where is a sufficiently small constant depending only on . Then there exists a mild solution to the Navier-Stokes equations on . Furthermore, if is smooth, then this mild solution is also smooth.

*Proof:* By working with the rescaled function , we may normalise . Let denote the Banach space of functions

with the obvious norm

Let be the Duhamel operator (36). If , then by Proposition 44 and Proposition 35 one has

In particular, maps to . A similar argument establishes the bound

for all . For small enough, will be a contraction on for some constant depending only on , and hence has a fixed point which will be the desired mild solution.

Now suppose that is smooth. Let , and let be the maximal Cauchy development provided by Theorem 38. For any , if one defines

then the preceding arguments give

thus either or . On the other hand, depends continuously on and converges to as . For small enough, this implies that for all (this is an example of a “continuity argument”). Next, if we set

then repeating the previous arguments also gives

as is finite and , we conclude (for small enough) that

In particular we have

for all , and hence by Theorem 38 we have . The argument used to prove Corollary 40 shows that is smooth, and the claim follows.

Remark 46 Modifications of this argument also allow one to establish local existence of mild solutions when the initial data lies in , but has large norm rather than norm less than . However, in this case one does not have a lower bound on the time of existence that depends only on the norm of the data, as was the case with Theorem 37. Further modification of the argument also allows one to extend Theorem 38 to the entire “subcritical” range of regularities . See the paper of Fujita and Kato for details.

We now turn attention to the non-periodic case in two and higher dimensions . The theory is largely identical, though with some minor technical differences. Unlike the periodic case, we will not attempt to reduce to the case of having mean zero (indeed, we will not even assume that is absolutely integrable, so that the mean might not even be well defined).

In the periodic case, we focused initially on smooth solutions. Smoothness is not sufficient by itself in the non-periodic setting to provide a good well-posedness theory, as we already saw in Section 3 when discussing the linear heat equation; some additional decay at spatial infinity is needed. There is some flexibility as to how much smoothness to prescribe. Let us say that a solution to Navier-Stokes is *classical* if and are smooth, and furthermore lies in for every .

Now we work on normalising the pressure. Suppose is a classical solution. As before we may write the Navier-Stokes equation in divergence form as (32). Taking a further divergence we obtain the equation

The function belongs to for every , so if we define the normalised pressure

via the Fourier transform as

then will also belong to for every . We then have for some smooth harmonic function . To control this harmonic function, we return to (32), which we write as

where

and apply the fundamental theorem of calculus to conclude that

The left-hand side is harmonic (thanks to differentiating under the integral sign), and the right-hand side lies in (in fact it is in for every ), hence both sides vanish. By the fundamental theorem of calculus this implies that vanishes identically, thus is constant in space. One can then subtract it from the pressure without affecting (1). Thus, in the category of classical solutions, at least, we may assume without loss of generality that we have *normalised pressure*

in which case the Navier-Stokes equations may be written as before as (33). (See also this paper of mine for some variants of this argument.)

Exercise 47 (Uniqueness with normalised pressure) Let be two smooth classical solutions to (1) on with normalised pressure such that . Then .

We can now define the notion of a Fujita-Kato mild solution as before, except that we replace all mention of the torus with the Euclidean space , and omit all requirements for the solution to be of mean zero. As stated in the appendix, the product estimate in Proposition 35 continues to hold in , so one can obtain the analogue of Theorem 37, Theorem 38, Proposition 39, and Corollary 40 on by repeating the proofs with the obvious changes; we leave the details as an exercise for the interested reader.

Exercise 48 Establish an analogue of Proposition 44 on , using the homogeneous Sobolev space defined to be the closure of the Schwartz functions with respect to the norm and use this to state and prove an analogue of Theorem 45.

** — 5. Heuristics — **

There are several further extensions of these types of local and global existence results for smooth solutions, in which the role of the Sobolev spaces here is replaced by other function spaces. For instance, in three dimensions in the non-periodic setting, the role of the critical space $\dot H^{1/2}(\mathbb{R}^3)$ was replaced by the larger critical space $L^3(\mathbb{R}^3)$ by Kato, and then by the even larger space $BMO^{-1}(\mathbb{R}^3)$ by Koch and Tataru, who also gave evidence that the latter space is essentially the limit of the method; in even larger spaces such as the Besov space $\dot B^{-1}_{\infty,\infty}(\mathbb{R}^3)$, there are constructions of Bourgain and Pavlovic that demonstrate ill-posedness in the sense of “norm inflation”: solutions that start from arbitrarily small norm data but end up being arbitrarily large in arbitrarily small amounts of time. (This grossly abbreviated history skips over dozens of other results, both positive and negative, in yet further function spaces, such as Morrey spaces or Besov spaces. See for instance the recent text of Lemarie-Rieusset for a survey.)

Rather than detail these other results, let us present instead a *scaling heuristic* which can be used to interpret these results (and can clarify why all the positive well-posedness results discussed here involve either “subcritical” or “critical” function spaces, rather than “supercritical” ones). For simplicity we restrict our discussion to the non-periodic setting , although the discussion here could also be adapted without much difficulty to the periodic setting (which effectively just imposes an additional constraint on the frequency parameter to be introduced below).

In this heuristic discussion, we assume that at any given time , the velocity field is primarily located at a certain frequency (or equivalently, at a certain wavelength ) in the sense that the spatial Fourier transform is largely concentrated in the region . We also assume that at this time, the solution has an amplitude , in the sense that tends to be of order in magnitude in the region where it is concentrated. (We are deliberately leaving terms such as “concentrated” vague for the purposes of this discussion.) Using this ansatz, one can then heuristically compute the magnitude of various terms in the Navier-Stokes equations (1) or the projected version (33). For instance, if has amplitude and frequency , then should have amplitude (and frequency ), since the Laplacian operator multiplies the Fourier transform by ; one can also take a more “physical space” viewpoint and view the second derivatives in as being roughly like dividing out by the wavelength twice. Thus we see that the viscosity term in (1) or (33) should have size about . Similarly, the expression in (33) should have magnitude and frequency (or maybe slightly less due to cancellation), so and hence should have magnitude . The terms and in (1) can similarly be computed to have magnitude . Finally, if the solution oscillates (or blows up) in time in intervals of length (which one can think of as the natural time scale for the solution), then the term should have magnitude .

This leads to the following heuristics:

- If (or equivalently if ), then the viscosity term dominates the nonlinear terms in (1) or (33), and one should expect the Navier-Stokes equations to behave like the heat equation (23) in this regime. In particular solutions should exist and maintain (or even improve) their regularity as long as this regime persists. To balance the equation (1) or (33), one expects , so the natural time scale here is .
- If (or equivalently if ), then nonlinear effects dominate, and the behaviour is likely to be quite different to that of the heat equation. One now expects , so the natural time scale here is . In particular, one could theoretically have blowup or other bad behaviour after this time scale.
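
The two regimes above can be encoded in a small helper. This is only a sketch of the heuristic bookkeeping: the parameter names `amplitude`, `frequency`, and `viscosity` are labels assumed here for the amplitude, frequency, and viscosity of the discussion, and the sharp comparison in the code stands in for the vaguer "much smaller than" and "much larger than" comparisons of the heuristic.

```python
def natural_time_scale(amplitude, frequency, viscosity):
    """Classify the heuristic regime and return (regime, time_scale).

    The viscosity term has heuristic size viscosity * frequency^2 * amplitude,
    while the nonlinear terms have heuristic size frequency * amplitude^2.
    """
    viscous_size = viscosity * frequency ** 2 * amplitude
    nonlinear_size = frequency * amplitude ** 2
    if viscous_size >= nonlinear_size:
        # Balance the time derivative against the viscosity term.
        return "viscosity-dominated", 1.0 / (viscosity * frequency ** 2)
    # Balance the time derivative against the nonlinear terms.
    return "nonlinear", 1.0 / (amplitude * frequency)

if __name__ == "__main__":
    print(natural_time_scale(1.0, 100.0, 1.0))
    print(natural_time_scale(100.0, 2.0, 0.1))
```

When the two sizes are comparable the classification is not meaningful, and both candidate time scales agree up to constants.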

As a general rule of thumb, the known well-posedness theory for the Navier-Stokes equation is only applicable when the hypotheses on the initial data (and on the timescale being considered) are compatible either with the viscosity-dominated regime , or the time-limited regime . Outside of these regimes, we expect the evolution to be highly nonlinear in nature, and techniques such as the ones in this set of notes, which are primarily based on approximating the evolution by the linear heat flow, are not expected to apply.

Let’s discuss some of the results in this set of notes using these heuristics. Suppose we are given that the initial data is bounded in norm by some bound :

As in the above heuristics, we assume that exhibits some amplitude and frequency . Heuristically, the norm of should resemble times the norm of , which should be roughly , where is the volume of the region where is concentrated in. Thus we morally have a bound of the form

To use this bound, we invoke (at a heuristic level) the uncertainty principle , which indicates that the data should be spatially spread out at a scale of at least the wavelength , which implies that the volume should be at least . Thus we have

Suppose we have , then we have the crude bound

so we expect to have an amplitude bound . If we are in the nonlinear regime , this implies that , and so the natural time scale here is lower bounded by . This matches up with the local existence time given in Theorem 37 (or the non-periodic analogue of this theorem). However, the use of the crude bound (38) suggests that one can make improvements to this bound when is far from :

Exercise 49 If , make a heuristic argument as to why the optimal lower bound for the time of existence for the Navier-Stokes equation in terms of the norm of the initial data should take the form

In a similar spirit, suppose we have the smallness hypothesis

on the critical norm , then a similar analysis to above leads to

and hence we will be in the viscosity dominated regime if is small enough, regardless of what time scale one uses; this is consistent with the global existence result in Theorem 45. On the other hand, if the norm is much larger than , then can be larger than , and we can fail to be in the viscosity dominated regime at any choice of frequency ; setting to be a large multiple of and sending to infinity, we see that the natural time scale could be arbitrarily small.

Finally, if one only controls a supercritical norm such as for some , this gives a bound on a quantity of the form , which allows one to leave the viscosity dominated regime (with plenty of room to spare) when is large, creating examples of initial data for which the natural time scale can be made arbitrarily small. As increases (restricting to, say, powers of two), the supercritical norm of these examples decays geometrically, so one can superimpose an infinite number of these examples together, leading to a choice of initial data with arbitrarily small supercritical norm for which the natural time scale is in fact zero. This strongly suggests that there is no good local well-posedness theory at such regularities.

Exercise 50 Discuss the product estimate in Proposition 35, the Sobolev estimate in Exercise 36, and the energy estimates in Exercise 29(d) and Proposition 44 using the above heuristics.

Remark 51 These heuristics can also be used to locate errors in many purported solutions to the Navier-Stokes global regularity problem that proceed through a sequence of estimates on a Navier-Stokes solution. At some point, the estimates have to rule out the scenario that the solution leaves the viscosity-dominated regime at larger and larger frequencies (and at smaller and smaller time scales ), with the time scales converging to zero to achieve a finite time blowup. If the estimates in the proposed solution are strong enough to heuristically rule out this scenario by the end of the argument, but not at the beginning of the argument, then there must be some step inside the argument where one moves from “supercritical” estimates that are too weak to rule out this scenario, to “critical” or “subcritical” estimates which are capable of doing so. This step is often where the error in the argument may be found.

The above heuristics are closely tied to the classification of various function space norms as being “subcritical”, “supercritical”, or “critical”. Roughly speaking, a norm is subcritical if bounding that norm heuristically places one in the linear-dominated regime (which, for Navier-Stokes, is the viscosity-dominated regime) at high frequencies; critical if control of the norm very nearly places one in the linear-dominated regime at high frequencies; and supercritical if control of the norm completely fails to place one in the linear-dominated regime at high frequencies. When the equation in question enjoys a scaling symmetry, the distinction between subcritical, supercritical, and critical norms can be made by seeing how the top-order component of these norms varies with respect to scaling a function to be high frequency. In the case of the Navier-Stokes equations (1), the scaling is given by the formulae

with the initial data similarly being scaled to

Here is a scaling parameter; as , the functions are being sent to increasingly fine scales (i.e., high frequencies). One easily checks that if solves the Navier-Stokes equations (1) with initial data , then solves the same equations with initial data ; similarly for other formulations of the Navier-Stokes equations such as (33) or (34). (In terms of the parameters from the previous heuristic discussion, this scaling corresponds to the map .)

Typically, if one considers a function space norm of (or of or ) in the limit , the top order behaviour will be given by some power of . A norm is called *subcritical* if the exponent is positive, *supercritical* if the exponent is negative, and *critical* if the exponent is zero. For instance, one can calculate the Fourier transform

and hence

As , this expression behaves like to top order; hence the norm is subcritical when , supercritical when , and critical when .
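
The scaling exponent can be checked numerically in a toy setting. The sketch below assumes the standard initial-data rescaling $u_0^{(\lambda)}(x) = \lambda u_0(\lambda x)$ and works in one spatial dimension with a Gaussian profile, both purely for illustration; in that setting the homogeneous Sobolev norm of order $s$ should scale like $\lambda^{s + 1 - d/2}$ with $d = 1$. All function names and numerical parameters are hypothetical.

```python
import math

def profile(x):
    """Illustrative one-dimensional profile (a Gaussian)."""
    return math.exp(-0.5 * x * x)

def rescaled(lam):
    """Rescaled data x -> lam * profile(lam * x)."""
    return lambda x: lam * profile(lam * x)

def derivative(g, eps=1e-4):
    """Central finite-difference approximation to g'."""
    return lambda x: (g(x + eps) - g(x - eps)) / (2 * eps)

def l2_norm(g, half_width=12.0, n=24000):
    """L^2 norm of g on [-half_width, half_width] via the trapezoid rule."""
    h = 2 * half_width / n
    total = 0.0
    for i in range(n + 1):
        weight = 0.5 if i in (0, n) else 1.0
        total += weight * g(-half_width + i * h) ** 2
    return math.sqrt(h * total)

def scaling_exponent(s, lam=2.0):
    """Observed exponent e in norm(rescaled) = lam^e * norm(original),
    for the L^2 norm (s = 0) or the L^2 norm of the derivative (s = 1)."""
    pick = (lambda g: g) if s == 0 else derivative
    ratio = l2_norm(pick(rescaled(lam))) / l2_norm(pick(rescaled(1.0)))
    return math.log(ratio) / math.log(lam)

if __name__ == "__main__":
    print(scaling_exponent(0))  # compare with 0 + 1 - 1/2
    print(scaling_exponent(1))  # compare with 1 + 1 - 1/2
```

Both observed exponents are positive here, matching the classification of these norms as subcritical in this one-dimensional toy model, where the critical regularity would be $d/2 - 1 = -1/2$.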

Another way to phrase this classification is to use dimensional analysis. If we use to denote the unit of length, and the unit of time, then the velocity field should have units , and the terms and in (1) then have units . To be dimensionally consistent, the kinematic viscosity must then have the units , and the pressure should have units . (This differs from the usual units given in physics to the pressure, which is where is the unit of mass; the discrepancy comes from the choice to normalise the density, which usually has units , to equal .) If we fix to be a dimensionless constant such as , this forces a relation between the time and length units, so now and have the units and respectively (compare with (39) and (40)). Of course will then also have units