
Let us call an arithmetic function $f: \mathbb{N} \to \mathbb{C}$ *$1$-bounded* if we have $|f(n)| \leq 1$ for all $n \in \mathbb{N}$. In this section we focus on the asymptotic behaviour of $1$-bounded multiplicative functions. Some key examples of such functions include:

- The Möbius function $\mu$;
- The Liouville function $\lambda$;
- “Archimedean” characters $n \mapsto n^{it}$ for a real number $t$ (which I call Archimedean because they are pullbacks of a Fourier character $x \mapsto x^{it}$ on the multiplicative group $\mathbb{R}^+$, which has the Archimedean property);
- Dirichlet characters (or “non-Archimedean” characters) $\chi$ (which are essentially pullbacks of Fourier characters on a multiplicative cyclic group with the discrete (non-Archimedean) metric);
- Hybrid characters $n \mapsto \chi(n) n^{it}$.

The space of $1$-bounded multiplicative functions is also closed under multiplication and complex conjugation.
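As a small numerical companion to the examples above, here is a minimal Python sketch that tabulates the Liouville and Möbius functions with a smallest-prime-factor sieve, checks multiplicativity on coprime pairs, and computes a normalised mean value. The helper names `liouville_and_mobius`, `is_multiplicative` and `mean_value` are hypothetical names chosen for this illustration, and the sieve is just one convenient way to produce the tables.

```python
from math import gcd


def liouville_and_mobius(limit):
    """Tabulate lambda(n) and mu(n) for 1 <= n <= limit (index 0 is unused)."""
    spf = list(range(limit + 1))  # spf[n] will hold the smallest prime factor of n
    p = 2
    while p * p <= limit:
        if spf[p] == p:  # p is prime
            for m in range(p * p, limit + 1, p):
                if spf[m] == m:
                    spf[m] = p
        p += 1
    lam = [0] * (limit + 1)
    mu = [0] * (limit + 1)
    lam[1] = mu[1] = 1
    for n in range(2, limit + 1):
        p = spf[n]
        m = n // p
        lam[n] = -lam[m]                     # lambda is completely multiplicative
        mu[n] = 0 if m % p == 0 else -mu[m]  # mu vanishes when p^2 divides n
    return lam, mu


def is_multiplicative(f, limit):
    """Check f(ab) = f(a) f(b) for all coprime a, b with ab <= limit."""
    return all(
        f[a * b] == f[a] * f[b]
        for a in range(1, limit + 1)
        for b in range(1, limit // a + 1)
        if gcd(a, b) == 1
    )


def mean_value(f, x):
    """Normalised long average (1/x) * sum_{n <= x} f(n)."""
    return sum(f[1:x + 1]) / x


lam, mu = liouville_and_mobius(10_000)
product = [a * b for a, b in zip(lam, mu)]  # pointwise product of two 1-bounded functions
print(is_multiplicative(product, 2_000))
print(mean_value(lam, 10_000), mean_value(mu, 10_000))
```

The pointwise product is again $1$-bounded and multiplicative, which illustrates the closure property just stated; the two printed mean values are small, in line with the cancellation one expects from the prime number theorem.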

Given a multiplicative function $f$, we are often interested in the asymptotics of long averages such as

$$\frac{1}{X} \sum_{n \leq X} f(n)$$

for large values of $X$, as well as short sums

$$\frac{1}{H} \sum_{x \leq n \leq x+H} f(n)$$

where $H$ and $x$ are both large, but $H$ is significantly smaller than $x$. (Throughout these notes we will try to normalise most of the sums and integrals appearing here as averages that are trivially bounded by $O(1)$; note that other normalisations are preferred in some of the literature cited here.) For instance, as we established in Theorem 58 of Notes 1, the prime number theorem is equivalent to the assertion that

$$\frac{1}{x} \sum_{n \leq x} \mu(n) = o(1) \ \ \ \ \ (1)$$

as $x \to \infty$. The Liouville function behaves almost identically to the Möbius function, in that estimates for one function almost always imply analogous estimates for the other:

Exercise 1 Without using the prime number theorem, show that (1) is also equivalent to

$$\frac{1}{x} \sum_{n \leq x} \lambda(n) = o(1). \ \ \ \ \ (2)$$

Henceforth we shall focus our discussion more on the Liouville function, and turn our attention to averages on shorter intervals. From (2) one has

$$\frac{1}{H} \sum_{x \leq n \leq x+H} \lambda(n) = o(1) \ \ \ \ \ (3)$$

as $x \to \infty$ if $H = H(x)$ is such that $H \geq \varepsilon x$ for some fixed $\varepsilon > 0$. However it is significantly more difficult to understand what happens when $H$ grows much slower than this. By using the techniques based on zero density estimates discussed in Notes 6, it was shown by Motohashi and by Ramachandra that one can also establish (3) for $H \geq x^{7/12+\varepsilon}$. On the Riemann Hypothesis Maier and Montgomery lowered the threshold to $H \geq x^{1/2} \log^C x$ for an absolute constant $C$ (the bound $H \geq x^{1/2+\varepsilon}$ is more classical, following from Exercise 33 of Notes 2). On the other hand, the randomness heuristics from Supplement 4 suggest that $H$ should be able to be taken as small as $x^\varepsilon$, and perhaps even $\log^{1+\varepsilon} x$ if one is particularly optimistic about the accuracy of these probabilistic models. In the opposite direction, the Chowla conjecture (mentioned for instance in Supplement 4) predicts that $H$ cannot be taken arbitrarily slowly growing in $x$, due to the conjectured existence of arbitrarily long strings of consecutive numbers where the Liouville function does not change sign (and in fact one can already show from the known partial results towards the Chowla conjecture that (3) fails for some sequence $x = x_n \to \infty$ and some sufficiently slowly growing $H = H(x)$, by modifying the arguments in these papers of mine).

The situation is better when one asks to understand the mean value on *almost all* short intervals, rather than all intervals. There are several equivalent ways to formulate this question:

Exercise 2 Let be a function of such that and as . Let be a $1$-bounded function. Show that the following assertions are equivalent:

As it turns out the second moment formulation in (iii) will be the most convenient for us to work with in this set of notes, as it is well suited to Fourier-analytic techniques (and in particular the Plancherel theorem).

Using zero density methods, for instance, it was shown by Ramachandra that

whenever and . With this quality of bound (saving arbitrary powers of over the trivial bound of ), this is still the lowest value of one can reach unconditionally. However, in a striking recent breakthrough, it was shown by Matomaki and Radziwill that as long as one is willing to settle for weaker bounds (saving a small power of or , or just a qualitative decay of ), one can obtain non-trivial estimates on far shorter intervals. For instance, they show

Theorem 3 (Matomaki-Radziwill theorem for Liouville) For any , one has

for some absolute constant .

In fact they prove a slightly more precise result: see Theorem 1 of that paper. In particular, they obtain the asymptotic (4) for *any* function that goes to infinity as , no matter how slowly! This ability to let grow slowly with is important for several applications; for instance, in order to combine this type of result with the entropy decrement methods from Notes 9, it is essential that be allowed to grow more slowly than . See also this survey of Soundararajan for further discussion.
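To get a feel for the "almost all short intervals" phenomenon, one can compute a discrete second moment of short averages of the Liouville function numerically. The sketch below is only an illustration under stated assumptions: it replaces the integral over $x$ by a sum over integers $X \leq x \leq 2X$, uses intervals $x \leq n < x+H$, and the names `liouville` and `short_interval_second_moment` are hypothetical helpers, not notation from these notes.

```python
def liouville(limit):
    """Tabulate the Liouville function lambda(n) for 1 <= n <= limit."""
    lam = [1] * (limit + 1)
    spf = [0] * (limit + 1)  # smallest prime factor, filled in lazily
    for n in range(2, limit + 1):
        if spf[n] == 0:  # n is prime
            for m in range(n, limit + 1, n):
                if spf[m] == 0:
                    spf[m] = n
        lam[n] = -lam[n // spf[n]]
    return lam


def short_interval_second_moment(lam, X, H):
    """Average over integers X <= x <= 2X of |(1/H) sum_{x <= n < x+H} lam(n)|^2.

    Assumes lam is tabulated at least up to 2X + H - 1.
    """
    top = 2 * X + H - 1
    prefix = [0] * (top + 1)
    for n in range(1, top + 1):
        prefix[n] = prefix[n - 1] + lam[n]
    total = 0.0
    for x in range(X, 2 * X + 1):
        average = (prefix[x + H - 1] - prefix[x - 1]) / H
        total += average * average
    return total / (X + 1)


X = 20_000
lam = liouville(2 * X + 1_000)
for H in (1, 10, 100, 1_000):
    print(H, short_interval_second_moment(lam, X, H))
```

With $H = 1$ the quantity equals $1$ (the trivial bound), and it drops as $H$ grows; this is consistent with, but of course no substitute for, the decay asserted in Theorem 3.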

Exercise 4 In this exercise you may use Theorem 3 freely.

- (i) Establish the lower bound
for some absolute constant and all sufficiently large . (Hint: if this bound failed, then would hold for almost all ; use this to create many intervals for which is extremely large.)
- (ii) Show that Theorem 3 also holds with $\lambda$ replaced by $\chi_2 \lambda$, where $\chi_2$ is the principal character of period $2$. (Use the fact that $\lambda(2n) = -\lambda(n)$ for all $n$.) Use this to establish the corresponding upper bound
to (i).

(There is a curious asymmetry to the difficulty level of these bounds; the upper bound in (ii) was established much earlier by Harman, Pintz, and Wolke, but the lower bound in (i) was only established in the Matomaki-Radziwill paper.)

The techniques discussed previously were highly complex-analytic in nature, relying in particular on the fact that functions such as $\mu$ or $\lambda$ have Dirichlet series $\frac{1}{\zeta(s)}$, $\frac{\zeta(2s)}{\zeta(s)}$ that extend meromorphically into the critical strip. In contrast, the Matomaki-Radziwill theorem does *not* rely on such meromorphic continuations, and in fact holds for more general classes of $1$-bounded multiplicative functions $f$, for which one typically does not expect any meromorphic continuation into the strip. Instead, one can view the Matomaki-Radziwill theory as following the philosophy of a slightly different approach to multiplicative number theory, namely the *pretentious multiplicative number theory* of Granville and Soundararajan (as presented for instance in their draft monograph). A basic notion here is the *pretentious distance* between two $1$-bounded multiplicative functions $f, g$ (at a given scale $x$), which informally measures the extent to which $f$ “pretends” to be like $g$ (or vice versa). The precise definition is

Definition 5 (Pretentious distance) Given two $1$-bounded multiplicative functions $f, g$, and a threshold $x > 0$, the *pretentious distance* $\mathbb{D}(f,g;x)$ between $f$ and $g$ up to scale $x$ is given by the formula

$$\mathbb{D}(f,g;x) := \left( \sum_{p \leq x} \frac{1 - \mathrm{Re}( f(p) \overline{g(p)} )}{p} \right)^{1/2}.$$

Note that one can also define an infinite version $\mathbb{D}(f,g;\infty)$ of this distance by removing the constraint $p \leq x$, though in such cases the pretentious distance may then be infinite. The pretentious distance is not quite a metric (because $\mathbb{D}(f,f;x)$ can be non-zero, and furthermore $\mathbb{D}(f,g;x)$ can vanish without $f, g$ being equal), but it is still quite close to behaving like a metric, in particular it obeys the triangle inequality; see Exercise 16 below. The philosophy of pretentious multiplicative number theory is that two $1$-bounded multiplicative functions $f, g$ will exhibit similar behaviour at scale $x$ if their pretentious distance $\mathbb{D}(f,g;x)$ is bounded, but will become uncorrelated from each other if this distance becomes large. A simple example of this philosophy is given by the following “weak Halasz theorem”, proven in Section 2:

Proposition 6 (Logarithmically averaged version of Halasz) Let be sufficiently large. Then for any $1$-bounded multiplicative functions , one has

for an absolute constant .

In particular, if does not pretend to be , then the logarithmic average will be small. This condition is basically necessary, since of course .
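The pretentious distance is easy to experiment with numerically. The following Python sketch assumes the standard Granville-Soundararajan formula $\mathbb{D}(f,g;x)^2 = \sum_{p \leq x} \frac{1 - \mathrm{Re}(f(p)\overline{g(p)})}{p}$, and represents a multiplicative function only through its values at primes; the names `primes_up_to`, `pretentious_distance` and `archimedean` are hypothetical helpers introduced for this illustration.

```python
import cmath
import math


def primes_up_to(x):
    """Return the list of primes p <= x (simple sieve of Eratosthenes)."""
    if x < 2:
        return []
    is_prime = [True] * (x + 1)
    is_prime[0] = is_prime[1] = False
    for p in range(2, math.isqrt(x) + 1):
        if is_prime[p]:
            for m in range(p * p, x + 1, p):
                is_prime[m] = False
    return [p for p in range(2, x + 1) if is_prime[p]]


def pretentious_distance(f, g, x):
    """D(f, g; x), where f and g are callables giving the values at primes."""
    total = sum(
        (1 - (f(p) * complex(g(p)).conjugate()).real) / p
        for p in primes_up_to(x)
    )
    return math.sqrt(max(total, 0.0))


def archimedean(t):
    """Values p -> p^{it} of an Archimedean character at primes."""
    return lambda p: cmath.exp(1j * t * math.log(p))


one = lambda p: 1          # the constant function 1
liouville_p = lambda p: -1  # lambda(p) = -1 (also mu(p) = -1) at every prime

for x in (10**2, 10**4, 10**6):
    print(x, pretentious_distance(liouville_p, one, x) ** 2, 2 * math.log(math.log(x)))
print(pretentious_distance(archimedean(1.0), one, 10**4))
```

For the Liouville function the squared distance to $1$ is $2 \sum_{p \leq x} \frac{1}{p}$, which grows like $2 \log\log x$ by Mertens' theorem, so $\lambda$ does not pretend to be $1$; the growth is slow, which is why the printed values are modest at these scales.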

If one works with non-logarithmic averages , then not pretending to be is insufficient to establish decay, as was already observed in Exercise 11 of Notes 1: if is an Archimedean character for some non-zero real , then goes to zero as (which is consistent with Proposition 6), but does not go to zero. However, this is in some sense the “only” obstruction to these averages decaying to zero, as quantified by the following basic result:

Theorem 7 (Halasz’s theorem) Let be sufficiently large. Then for any $1$-bounded multiplicative function , one has

for an absolute constant and any .

Informally, we refer to a $1$-bounded multiplicative function as “pretentious” if it pretends to be a character such as , and “non-pretentious” otherwise. The precise distinction is rather malleable, as the precise class of characters that one views as “obstructions” varies from situation to situation. For instance, in Proposition 6 it is just the trivial character which needs to be considered, but in Theorem 7 it is the characters with . In other contexts one may also need to add Dirichlet characters or hybrid characters such as to the list of characters that one might pretend to be. The division into pretentious and non-pretentious functions in multiplicative number theory is faintly analogous to the division into major and minor arcs in the circle method applied to additive number theory problems; see Notes 8. The Möbius and Liouville functions are model examples of non-pretentious functions; see Exercise 24.

In the contrapositive, Halasz’ theorem can be formulated as the assertion that if one has a large mean

for some , then one has the pretentious property

for some . This has the flavour of an “inverse theorem”, of the type often found in arithmetic combinatorics.

Among other things, Halasz’s theorem gives yet another proof of the prime number theorem (1); see Section 2.

We now give a version of the Matomaki-Radziwill theorem for general (non-pretentious) multiplicative functions that is formulated in a similar contrapositive (or “inverse theorem”) fashion, though to simplify the presentation we only state a qualitative version that does not give explicit bounds.

Theorem 8 ((Qualitative) Matomaki-Radziwill theorem) Let , and let , with sufficiently large depending on . Suppose that is a $1$-bounded multiplicative function such that

Then one has

for some .

The condition is basically optimal, as the following example shows:

Exercise 9Let be a sufficiently small constant, and let be such that . Let be the Archimedean character for some . Show that

Combining Theorem 8 with standard non-pretentiousness facts about the Liouville function (see Exercise 24), we recover Theorem 3 (but with a decay rate of only rather than ). We refer the reader to the original paper of Matomaki-Radziwill (as well as this followup paper with myself) for the quantitative version of Theorem 8 that is strong enough to recover the full version of Theorem 3, and which can also handle real-valued pretentious functions.

With our current state of knowledge, the only arguments that can establish the full strength of the Halasz and Matomaki-Radziwill theorems are Fourier analytic in nature, relating sums involving an arithmetic function $f$ with its Dirichlet series

$$\mathcal{D} f(s) := \sum_{n=1}^\infty \frac{f(n)}{n^s}$$

which one can view as a discrete Fourier transform of $f$ (or more precisely of the measure $\sum_n \frac{f(n)}{n} \delta_{\log n}$, if one evaluates the Dirichlet series on the right edge $\{ 1+it: t \in \mathbb{R} \}$ of the critical strip). In this aspect, the techniques resemble the complex-analytic methods from Notes 2, but with the key difference that no analytic or meromorphic continuation into the strip is assumed. The key identity that allows us to pass to Dirichlet series is the following variant of Proposition 7 of Notes 2:

Proposition 10 (Parseval type identity) Let be finitely supported arithmetic functions, and let be a Schwartz function. Then

where is the Fourier transform of . (Note that the finite support of and the Schwartz nature of ensure that both sides of the identity are absolutely convergent.)

The restriction that be finitely supported will be slightly annoying in places, since most multiplicative functions will fail to be finitely supported, but this technicality can usually be overcome by suitably truncating the multiplicative function, and taking limits if necessary.

*Proof:* By expanding out the Dirichlet series, it suffices to show that

for any natural numbers . But this follows from the Fourier inversion formula applied at .
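A Parseval-type identity of this kind can be sanity-checked numerically. The sketch below is written under explicitly stated assumptions, since the normalisation used in Proposition 10 may differ: it takes the Fourier transform convention $\hat\psi(u) := \int_{\mathbb{R}} \psi(t) e^{-itu}\, dt$, evaluates the Dirichlet series $\mathcal{D}f(s) = \sum_n f(n) n^{-s}$ on the line $s = 1+it$, and tests the resulting identity $\sum_{n,m} \frac{f(n) \overline{g(m)}}{nm} \hat\psi(\log n - \log m) = \int_{\mathbb{R}} \mathcal{D}f(1+it) \overline{\mathcal{D}g(1+it)} \psi(t)\, dt$ for the Gaussian $\psi(t) = e^{-t^2/2}$, for which $\hat\psi(u) = \sqrt{2\pi} e^{-u^2/2}$. The function names are hypothetical, and the two finitely supported functions are arbitrary test data.

```python
import cmath
import math


def dirichlet_series(f, s):
    """D f(s) = sum_n f(n) n^{-s} for a finitely supported f given as a dict."""
    return sum(c * cmath.exp(-s * math.log(n)) for n, c in f.items())


def psi(t):
    return math.exp(-t * t / 2)


def psi_hat(u):
    # Fourier transform of psi under the convention psi_hat(u) = int psi(t) e^{-itu} dt
    return math.sqrt(2 * math.pi) * math.exp(-u * u / 2)


def sum_side(f, g):
    """Double sum over n, m weighted by psi_hat(log n - log m)."""
    return sum(
        a * complex(b).conjugate() / (n * m) * psi_hat(math.log(n) - math.log(m))
        for n, a in f.items()
        for m, b in g.items()
    )


def integral_side(f, g, T=12.0, steps=4800):
    """Trapezoidal approximation of int_{-T}^{T} Df(1+it) conj(Dg(1+it)) psi(t) dt."""
    h = 2 * T / steps
    total = 0j
    for k in range(steps + 1):
        t = -T + k * h
        weight = 0.5 if k in (0, steps) else 1.0
        s = complex(1.0, t)
        total += weight * dirichlet_series(f, s) * dirichlet_series(g, s).conjugate() * psi(t)
    return total * h


f = {1: 1, 2: -1, 3: -1, 5: -1, 6: 1}  # arbitrary finitely supported test data
g = {1: 1, 2: 1j, 4: -0.5}
print(sum_side(f, g))
print(integral_side(f, g))
```

The two printed complex numbers should agree to many decimal places, because the trapezoidal rule is extremely accurate for smooth, rapidly decaying integrands; the check only exercises the algebra of expanding the Dirichlet series and is not a proof.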

For applications to Halasz type theorems, one sets $g$ equal to the Kronecker delta $\delta_{n=1}$, producing weighted integrals of the Dirichlet series $\mathcal{D} f$ of “$L^1$” type. For applications to Matomaki-Radziwill theorems, one instead sets $f = g$, and more precisely uses the following corollary of the above proposition, to obtain weighted integrals of $|\mathcal{D} f|^2$ of “$L^2$” type:

Exercise 11 (Plancherel type identity)If is finitely supported, and is a Schwartz function, establish the identity

In contrast, information about the non-pretentious nature of a multiplicative function $f$ will give “pointwise” or “$L^\infty$” type control on the Dirichlet series $\mathcal{D} f$, as is suggested from the Euler product factorisation of $\mathcal{D} f$.

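It will be convenient to formalise the notion of $L^1$, $L^2$, and $L^\infty$ control of the Dirichlet series $\mathcal{D} f(s) := \sum_n \frac{f(n)}{n^s}$, which as previously mentioned can be viewed as a sort of “Fourier transform” of $f$:

Definition 12 (Fourier norms) Let $f: \mathbb{N} \to \mathbb{C}$ be finitely supported, and let $\Omega \subset \mathbb{R}$ be a bounded measurable set. We define the *Fourier $L^\infty$ norm*

$$\| f \|_{FL^\infty(\Omega)} := \sup_{t \in \Omega} |\mathcal{D} f(1+it)|,$$

the *Fourier $L^2$ norm*

$$\| f \|_{FL^2(\Omega)} := \left( \int_\Omega |\mathcal{D} f(1+it)|^2\ dt \right)^{1/2},$$

and the *Fourier $L^1$ norm*

$$\| f \|_{FL^1(\Omega)} := \int_\Omega |\mathcal{D} f(1+it)|\ dt.$$

One could more generally define $FL^p$ norms for other exponents $p$, but we will only need the exponents $p = 1, 2, \infty$ in this current set of notes. It is clear that all the above norms are in fact (semi-)norms on the space of finitely supported arithmetic functions.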

As mentioned above, Halasz’s theorem gives good control on the Fourier norm for restrictions of non-pretentious functions to intervals:

Exercise 13 (Fourier control via Halasz) Let be a $1$-bounded multiplicative function, let be an interval in for some , let , and let be a bounded measurable set. Show that

(Hint: you will need to use summation by parts (or an equivalent device) to deal with a weight.)

Meanwhile, the Plancherel identity in Exercise 11 gives good control on the Fourier norm for functions on long intervals (compare with Exercise 2 from Notes 6):

Exercise 14 ($L^2$ mean value theorem) Let , and let be finitely supported. Show that

Conclude in particular that if is supported in for some and , then

In the simplest case of the logarithmically averaged Halasz theorem (Proposition 6), Fourier estimates are already sufficient to obtain decent control on the (weighted) Fourier type expressions that show up. However, these estimates are not enough by themselves to establish the full Halasz theorem or the Matomaki-Radziwill theorem. To get from Fourier control to Fourier or control more efficiently, the key trick is to use Hölder’s inequality, which when combined with the basic Dirichlet series identity

The strategy is then to factor (or approximately factor) the original function as a Dirichlet convolution (or average of convolutions) of various components, each of which enjoys reasonably good Fourier or estimates on various regions , and then combine them using the Hölder inequalities (5), (6) and the triangle inequality. For instance, to prove Halasz’s theorem, we will split into the Dirichlet convolution of three factors, one of which will be estimated in using the non-pretentiousness hypothesis, and the other two being estimated in using Exercise 14. For the Matomaki-Radziwill theorem, one uses a significantly more complicated decomposition of into a variety of Dirichlet convolutions of factors, and also splits up the Fourier domain into several subregions depending on whether the Dirichlet series associated to some of these components are large or small. In each region and for each component of these decompositions, all but one of the factors will be estimated in , and the other in ; but the precise way in which this is done will vary from component to component. For instance, in some regions a key factor will be small in by construction of the region; in other places, the control will come from Exercise 13. Similarly, in some regions, satisfactory control is provided by Exercise 14, but in other regions one must instead use “large value” theorems (in the spirit of Proposition 9 from Notes 6), or amplify the power of the standard mean value theorems by combining the Dirichlet series with other Dirichlet series that are known to be large in this region.

There are several ways to achieve the desired factorisation. In the case of Halasz’s theorem, we can simply work with a crude version of the Euler product factorisation, dividing the primes into three categories (“small”, “medium”, and “large” primes) and expressing $f$ as a triple Dirichlet convolution accordingly. For the Matomaki-Radziwill theorem, one instead exploits the Turan-Kubilius phenomenon (Section 5 of Notes 1, or Lemma 2 of Notes 9) that for various moderately wide ranges $[P,Q]$ of primes, the number of prime divisors of a large number $n$ in the range $[P,Q]$ is almost always close to $\log\log Q - \log\log P$. Thus, if we introduce the arithmetic functions

and more generally we have a twisted approximation

for multiplicative functions . (Actually, for technical reasons it will be convenient to work with a smoothed out version of these functions; see Section 3.) Informally, these formulas suggest that the “ energy” of a multiplicative function is concentrated in those regions where is extremely large in a sense. Iterations of this formula (or variants of this formula, such as an identity due to Ramaré) will then give the desired (approximate) factorisation of .

In these notes we presume familiarity with the basic concepts of probability theory, such as random variables (which could take values in the reals, vectors, or other measurable spaces), probability, and expectation. Much of this theory is in turn based on measure theory, which we will also presume familiarity with. See for instance this previous set of lecture notes for a brief review.

The basic objects of study in analytic number theory are deterministic; there is nothing inherently random about the set of prime numbers, for instance. Despite this, one can still interpret many of the averages encountered in analytic number theory in probabilistic terms, by introducing random variables into the subject. Consider for instance the form

$$\sum_{n \leq x} \mu(n) = o(x) \ \ \ \ \ (1)$$

of the prime number theorem (where we take the limit $x \to \infty$). One can interpret this estimate probabilistically as

$$\mathbb{E} \mu(\mathbf{n}) = o(1) \ \ \ \ \ (2)$$

where $\mathbf{n}$ is a random variable drawn uniformly from the natural numbers up to $x$, and $\mathbb{E}$ denotes the expectation. (In this set of notes we will use boldface symbols to denote random variables, and non-boldface symbols for deterministic objects.) By itself, such an interpretation is little more than a change of notation. However, the power of this interpretation becomes more apparent when one then imports concepts from probability theory (together with all their attendant intuitions and tools), such as independence, conditioning, stationarity, total variation distance, and entropy. For instance, suppose we want to use the prime number theorem (1) to make a prediction for the sum

$$\sum_{n \leq x} \mu(n) \mu(n+1).$$

After dividing by $x$, this is essentially

$$\mathbb{E} \mu(\mathbf{n}) \mu(\mathbf{n}+1).$$

With probabilistic intuition, one may expect the random variables $\mu(\mathbf{n}), \mu(\mathbf{n}+1)$ to be approximately independent (there is no obvious relationship between the number of prime factors of $\mathbf{n}$, and of $\mathbf{n}+1$), and so the above average would be expected to be approximately equal to

$$(\mathbb{E} \mu(\mathbf{n})) (\mathbb{E} \mu(\mathbf{n}+1))$$

which by (2) is equal to $o(1)$. Thus we are led to the prediction

$$\sum_{n \leq x} \mu(n) \mu(n+1) = o(x). \ \ \ \ \ (3)$$

The asymptotic (3) is widely believed (it is a special case of the *Chowla conjecture*, which we will discuss in later notes); while there has been recent progress towards establishing it rigorously, it remains open for now.
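The independence heuristic behind this prediction can be explored numerically. The sketch below is a rough experiment rather than evidence of any theorem: it uses the Möbius function as the test function (an assumption made for this illustration; the Liouville function could be substituted), and the helper names `mobius` and `two_point_average` are hypothetical.

```python
def mobius(limit):
    """Tabulate the Moebius function mu(n) for 1 <= n <= limit."""
    mu = [1] * (limit + 1)
    spf = [0] * (limit + 1)  # smallest prime factor, filled in lazily
    for n in range(2, limit + 1):
        if spf[n] == 0:  # n is prime
            for m in range(n, limit + 1, n):
                if spf[m] == 0:
                    spf[m] = n
        p = spf[n]
        m = n // p
        mu[n] = 0 if m % p == 0 else -mu[m]
    return mu


def one_point_average(f, x):
    """(1/x) * sum_{n <= x} f(n), i.e. the expectation of f(n) for uniform n <= x."""
    return sum(f[1:x + 1]) / x


def two_point_average(f, x):
    """(1/x) * sum_{n <= x} f(n) f(n+1); requires f tabulated up to x + 1."""
    return sum(f[n] * f[n + 1] for n in range(1, x + 1)) / x


x = 100_000
mu = mobius(x + 1)
print(one_point_average(mu, x))  # small, as the prime number theorem predicts
print(two_point_average(mu, x))  # also small in practice, as the heuristic suggests
```

Both averages come out small at this scale; the first is a consequence of the prime number theorem, while the smallness of the second for all large $x$ is exactly what remains unproven.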

How would one try to make these probabilistic intuitions more rigorous? The first thing one needs to do is find a more quantitative measurement of what it means for two random variables to be “approximately” independent. There are several candidates for such measurements, but we will focus in these notes on two particularly convenient measures of approximate independence: the “” measure of independence known as covariance, and the “” measure of independence known as mutual information (actually we will usually need the more general notion of conditional mutual information that measures conditional independence). The use of type methods in analytic number theory is well established, though it is usually not described in probabilistic terms, being referred to instead by such names as the “second moment method”, the “large sieve” or the “method of bilinear sums”. The use of methods (or “entropy methods”) is much more recent, and has been able to control certain types of averages in analytic number theory that were out of reach of previous methods such as methods. For instance, in later notes we will use entropy methods to establish the logarithmically averaged version

of (3), which is implied by (3) but strictly weaker (much as the prime number theorem (1) implies the bound , but the latter bound is much easier to establish than the former).

As with many other situations in analytic number theory, we can exploit the fact that certain assertions (such as approximate independence) can become significantly easier to prove if one only seeks to establish them *on average*, rather than uniformly. For instance, given two random variables $\mathbf{X}$ and $\mathbf{Y}$ of number-theoretic origin (such as the two random variables mentioned previously), it can often be extremely difficult to determine the extent to which $\mathbf{X}, \mathbf{Y}$ behave “independently” (or “conditionally independently”). However, thanks to second moment tools or entropy based tools, it is often possible to assert results of the following flavour: if $\mathbf{X}_1, \dots, \mathbf{X}_n$ are a large collection of “independent” random variables, and $\mathbf{Y}$ is a further random variable that is “not too large” in some sense, then $\mathbf{Y}$ must necessarily be nearly independent (or conditionally independent) of many of the $\mathbf{X}_i$, even if one cannot pinpoint precisely which of the $\mathbf{X}_i$ the variable $\mathbf{Y}$ is independent of. In the case of the second moment method, this allows us to compute correlations such as $\mathbb{E} \mathbf{X}_i \mathbf{Y}$ for “most” $i$. The entropy method gives bounds that are significantly weaker quantitatively than the second moment method (and in particular, in its current incarnation at least it is only able to make non-trivial assertions involving interactions with residue classes at small primes), but can control significantly more general quantities $\mathbb{E} F(\mathbf{X}_i, \mathbf{Y})$ for “most” $i$ thanks to tools such as the Pinsker inequality.

In the fall quarter (starting Sep 27) I will be teaching a graduate course on analytic prime number theory. This will be similar to a graduate course I taught in 2015, and in particular will reuse several of the lecture notes from that course, though it will also incorporate some new material (and omit some material covered in the previous course, to compensate). I anticipate covering the following topics:

- Elementary multiplicative number theory
- Complex-analytic multiplicative number theory
- The entropy decrement argument
- Bounds for exponential sums
- Zero density theorems
- Halasz’s theorem and the Matomaki-Radziwill theorem
- The circle method
- (If time permits) Chowla’s conjecture and the Erdos discrepancy problem

Lecture notes for topics 3, 6, and 8 will be forthcoming.

These lecture notes are a continuation of the 254A lecture notes from the previous quarter.

We consider the Euler equations for incompressible fluid flow on a Euclidean space ; we will label as the “Eulerian space” (or “Euclidean space”, or “physical space”) to distinguish it from the “Lagrangian space” (or “labels space”) that we will introduce shortly (but the reader is free to also ignore the or subscripts if he or she wishes). Elements of Eulerian space will be referred to by symbols such as , we use to denote Lebesgue measure on and we will use for the coordinates of , and use indices such as to index these coordinates (with the usual summation conventions), for instance denotes partial differentiation along the coordinate. (We use superscripts for coordinates instead of subscripts to be compatible with some differential geometry notation that we will use shortly; in particular, when using the summation notation, we will now be matching subscripts with superscripts for the pair of indices being summed.)

In Eulerian coordinates, the Euler equations read

$$\partial_t u + u \cdot \nabla u = -\nabla p, \qquad \nabla \cdot u = 0 \ \ \ \ \ (1)$$

where $u$ is the velocity field and $p$ is the pressure field. These are functions of time $t$ and of the spatial location variable $x$. We will refer to the coordinates $(t,x)$ as Eulerian coordinates. However, if one reviews the physical derivation of the Euler equations from 254A Notes 0, before one takes the continuum limit, the fundamental unknowns were not the velocity field $u$ or the pressure field $p$, but rather the trajectories $(x^{(a)}(t))_{a \in A}$, which can be thought of as a single function $x$ from the coordinates $(t,a)$ (where $t$ is a time and $a$ is an element of the label set $A$) to $\mathbb{R}^d$. The relationship between the trajectories $x^{(a)}(t) = x(t,a)$ and the velocity field was given by the informal relationship

$$\partial_t x(t,a) \approx u(t, x(t,a)). \ \ \ \ \ (2)$$

We will refer to the coordinates $(t,a)$ as (discrete) *Lagrangian coordinates* for describing the fluid.

In view of this, it is natural to ask whether there is an alternate way to formulate the continuum limit of incompressible inviscid fluids, by using a continuous version of the Lagrangian coordinates, rather than Eulerian coordinates. This is indeed the case. Suppose for instance one has a smooth solution to the Euler equations on a spacetime slab in Eulerian coordinates; assume furthermore that the velocity field is uniformly bounded. We introduce another copy of , which we call *Lagrangian space* or *labels space*; we use symbols such as to refer to elements of this space, to denote Lebesgue measure on , and to refer to the coordinates of . We use indices such as to index these coordinates, thus for instance denotes partial differentiation along the coordinate. We will use summation conventions for both the Eulerian coordinates and the Lagrangian coordinates , with an index being summed if it appears as both a subscript and a superscript in the same term. While and are of course isomorphic, we will try to refrain from identifying them, except perhaps at the initial time in order to fix the initialisation of Lagrangian coordinates.

Given a smooth and bounded velocity field $u$, define a *trajectory map* for this velocity to be any smooth map $X$ from Lagrangian spacetime to Eulerian space that obeys the ODE

$$\partial_t X(t,a) = u(t, X(t,a)); \ \ \ \ \ (3)$$

in view of (2), this describes the trajectory (in Eulerian space $\mathbb{R}^d_E$) of a particle labeled by an element $a$ of Lagrangian space $\mathbb{R}^d_L$. From the Picard existence theorem and the hypothesis that $u$ is smooth and bounded, such a map exists and is unique as long as one specifies the initial location $X(0,a)$ assigned to each label $a$. Traditionally, one chooses the initial condition

$$X(0,a) = a \ \ \ \ \ (4)$$

for $a \in \mathbb{R}^d_L$, so that we label each particle by its initial location at time $t=0$; we are also free to specify other initial conditions for the trajectory map if we please. Indeed, we have the freedom to “permute” the labels by an arbitrary diffeomorphism: if $X$ is a trajectory map, and $\pi: \mathbb{R}^d_L \to \mathbb{R}^d_L$ is any diffeomorphism (a smooth map whose inverse exists and is also smooth), then the map $(t,a) \mapsto X(t, \pi(a))$ is also a trajectory map, albeit one with different initial conditions $X(0, \pi(a))$.

Despite the popularity of the initial condition (4), we will try to keep conceptually separate the Eulerian space from the Lagrangian space , as they play different physical roles in the interpretation of the fluid; for instance, while the Euclidean metric is an important feature of Eulerian space , it is not a geometrically natural structure to use in Lagrangian space . We have the following more general version of Exercise 8 from 254A Notes 2:

Exercise 1 Let be smooth and bounded.

- If is a smooth map, show that there exists a unique smooth trajectory map with initial condition for all .
- Show that if is a diffeomorphism and , then the map is also a diffeomorphism.

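Remark 2 The first of the Euler equations (1) can now be written, in terms of the trajectory map $X$, in the form

$$\frac{d^2}{dt^2} X(t,a) = -(\nabla p)(t, X(t,a)),$$

which can be viewed as a continuous limit of Newton’s second law $m \frac{d^2}{dt^2} x = F$.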

Call a diffeomorphism *(oriented) volume preserving* if one has the equation

for all , where the total differential is the matrix with entries for and , where are the components of . (If one wishes, one can also view as a linear transformation from the tangent space of Lagrangian space at to the tangent space of Eulerian space at .) Equivalently, is orientation preserving and one has a Jacobian-free change of variables formula

for all , which is in turn equivalent to having the same Lebesgue measure as for any measurable set .

The divergence-free condition then can be nicely expressed in terms of volume-preserving properties of the trajectory maps , in a manner which confirms the interpretation of this condition as an incompressibility condition on the fluid:

Lemma 3 Let be smooth and bounded, let be a volume-preserving diffeomorphism, and let be the trajectory map. Then the following are equivalent:

- on .
- is volume-preserving for all .

*Proof:* Since is orientation-preserving, we see from continuity that is also orientation-preserving. Suppose that is also volume-preserving, then for any we have the conservation law

for all . Differentiating in time using the chain rule and (3) we conclude that

for all , and hence by change of variables

which by integration by parts gives

for all and , so is divergence-free.

To prove the converse implication, it is convenient to introduce the *labels map* , defined by setting to be the inverse of the diffeomorphism , thus

for all . By the implicit function theorem, is smooth, and by differentiating the above equation in time using (3) we see that

where is the usual material derivative

acting on functions on . If is divergence-free, we have from integration by parts that

for any test function . In particular, for any , we can calculate

and hence

for any . Since is volume-preserving, so is , thus

Thus is volume-preserving, and hence is also.
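Lemma 3 can be illustrated numerically in two dimensions. The sketch below is a rough check under stated assumptions: it takes the trajectory ODE to be $\partial_t X(t,a) = u(t, X(t,a))$ with initial condition $X(0,a) = a$, integrates it with a classical fourth-order Runge-Kutta scheme, and estimates the Jacobian determinant of $a \mapsto X(t,a)$ by central differences. The two velocity fields and all function names are hypothetical choices made for illustration only.

```python
import math


def divergence_free_field(t, x, y):
    # u = (sin y, sin x): d/dx(sin y) + d/dy(sin x) = 0, and u is smooth and bounded
    return math.sin(y), math.sin(x)


def compressible_field(t, x, y):
    # u = (sin x, 0) has divergence cos x, which is not identically zero
    return math.sin(x), 0.0


def trajectory(u, a, t_final, steps=400):
    """Approximate X(t_final, a) for dX/dt = u(t, X), X(0, a) = a, using RK4."""
    x, y = a
    h = t_final / steps
    t = 0.0
    for _ in range(steps):
        k1 = u(t, x, y)
        k2 = u(t + h / 2, x + h / 2 * k1[0], y + h / 2 * k1[1])
        k3 = u(t + h / 2, x + h / 2 * k2[0], y + h / 2 * k2[1])
        k4 = u(t + h, x + h * k3[0], y + h * k3[1])
        x += h / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0])
        y += h / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1])
        t += h
    return x, y


def jacobian_det(u, a, t_final, eps=1e-5):
    """Central-difference estimate of det(dX/da) at label a and time t_final."""
    ax, ay = a
    xp = trajectory(u, (ax + eps, ay), t_final)
    xm = trajectory(u, (ax - eps, ay), t_final)
    yp = trajectory(u, (ax, ay + eps), t_final)
    ym = trajectory(u, (ax, ay - eps), t_final)
    d11 = (xp[0] - xm[0]) / (2 * eps)  # dX^1 / da^1
    d21 = (xp[1] - xm[1]) / (2 * eps)  # dX^2 / da^1
    d12 = (yp[0] - ym[0]) / (2 * eps)  # dX^1 / da^2
    d22 = (yp[1] - ym[1]) / (2 * eps)  # dX^2 / da^2
    return d11 * d22 - d12 * d21


print(jacobian_det(divergence_free_field, (0.3, 1.1), 2.0))
print(jacobian_det(compressible_field, (1.0, 0.0), 1.0))
```

For the divergence-free field the determinant stays at $1$ up to discretisation error, while for the compressible field it drifts away from $1$, matching the equivalence in the lemma.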

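Exercise 4 Let $A$ be a continuously differentiable map from the time interval to the general linear group $GL_d(\mathbb{R})$ of invertible $d \times d$ matrices. Establish Jacobi’s formula

$$\frac{d}{dt} \det(A(t)) = \det(A(t)) \, \mathrm{tr}\left( A(t)^{-1} \frac{d}{dt} A(t) \right)$$

and use this and (6) to give an alternate proof of Lemma 3 that does not involve any integration in space.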

Remark 5 One can view the use of Lagrangian coordinates as an extension of the method of characteristics. Indeed, from the chain rule we see that for any smooth function of Eulerian spacetime, one has

and hence any transport equation that in Eulerian coordinates takes the form

for smooth functions of Eulerian spacetime is equivalent to the ODE

where are the smooth functions of Lagrangian spacetime defined by

In this set of notes we recall some basic differential geometry notation, particularly with regards to pullbacks and Lie derivatives of differential forms and other tensor fields on manifolds such as and , and explore how the Euler equations look in this notation. Our discussion will be entirely formal in nature; we will assume that all functions have enough smoothness and decay at infinity to justify the relevant calculations. (It is possible to work rigorously in Lagrangian coordinates – see for instance the work of Ebin and Marsden – but we will not do so here.) As a general rule, Lagrangian coordinates tend to be somewhat less convenient to use than Eulerian coordinates for establishing the basic analytic properties of the Euler equations, such as local existence, uniqueness, and continuous dependence on the data; however, they are quite good at clarifying the more algebraic properties of these equations, such as conservation laws and the variational nature of the equations. It may well be that in the future we will be able to use the Lagrangian formalism more effectively on the analytic side of the subject also.

Remark 6One can also write the Navier-Stokes equations in Lagrangian coordinates, but the equations are not expressed in a favourable form in these coordinates, as the Laplacian appearing in the viscosity term becomes replaced with a time-varying Laplace-Beltrami operator. As such, we will not discuss the Lagrangian coordinate formulation of Navier-Stokes here.

Note: this post is not required reading for this course, or for the sequel course in the winter quarter.

In Notes 2, we reviewed the classical construction of Leray of global weak solutions to the Navier-Stokes equations. We did not quite follow Leray’s original proof, in that the notes relied more heavily on the machinery of Littlewood-Paley projections, which have become increasingly common tools in modern PDE. On the other hand, we did use the same “exploiting compactness to pass to a weakly convergent subsequence” strategy that is the standard one in the PDE literature used to construct weak solutions.

As I discussed in a previous post, the manipulation of sequences and their limits is analogous to a “cheap” version of nonstandard analysis in which one uses the Fréchet filter rather than an ultrafilter to construct the nonstandard universe. (The manipulation of generalised functions of Colombeau-type can also be comfortably interpreted within this sort of cheap nonstandard analysis.) Augmenting the manipulation of sequences with the right to pass to subsequences whenever convenient is then analogous to a sort of “lazy” nonstandard analysis, in which the implied ultrafilter is never actually constructed as a “completed object”, but is instead lazily evaluated, in the sense that whenever membership of a given subsequence of the natural numbers in the ultrafilter needs to be determined, one either passes to that subsequence (thus placing it in the ultrafilter) or the complement of the sequence (placing it out of the ultrafilter). This process can be viewed as the initial portion of the transfinite induction that one usually uses to construct ultrafilters (as discussed using a voting metaphor in this post), except that there is generally no need in any given application to perform the induction for any uncountable ordinal (or indeed for most of the countable ordinals also).

On the other hand, it is also possible to work directly in the orthodox framework of nonstandard analysis when constructing weak solutions. This leads to an approach to the subject which is largely equivalent to the usual subsequence-based approach, though there are some minor technical differences (for instance, the subsequence approach occasionally requires one to work with separable function spaces, whereas in the ultrafilter approach the reliance on separability is largely eliminated, particularly if one imposes a strong notion of saturation on the nonstandard universe). The subject acquires a more “algebraic” flavour, as the quintessential analysis operation of taking a limit is replaced with the “standard part” operation, which is an algebra homomorphism. The notion of a sequence is replaced by the distinction between standard and nonstandard objects, and the need to pass to subsequences disappears entirely. Also, the distinction between “bounded sequences” and “convergent sequences” is largely eradicated, particularly when the space that the sequences ranged in enjoys some compactness properties on bounded sets. Also, in this framework, the notorious non-uniqueness features of weak solutions can be “blamed” on the non-uniqueness of the nonstandard extension of the standard universe (as well as on the multiple possible ways to construct nonstandard mollifications of the original standard PDE). However, many of these changes are largely cosmetic; switching from a subsequence-based theory to a nonstandard analysis-based theory does *not* seem to bring one significantly closer for instance to the global regularity problem for Navier-Stokes, but it could have been an alternate path for the historical development and presentation of the subject.

In any case, I would like to present below the fold this nonstandard analysis perspective, quickly translating the relevant components of real analysis, functional analysis, and distributional theory that we need to this perspective, and then use it to re-prove Leray’s theorem on existence of global weak solutions to Navier-Stokes.

We now turn to the local existence theory for the initial value problem for the incompressible Euler equations

For the sake of discussion we will just work in the non-periodic domain , , although the arguments here can be adapted without much difficulty to the periodic setting. We will only work with solutions in which the pressure is normalised in the usual fashion:

Formally, the Euler equations (with normalised pressure) arise as the vanishing viscosity limit of the Navier-Stokes equations

that was studied in previous notes. However, because most of the bounds established in previous notes, either on the lifespan of the solution or on the size of the solution itself, depended on , it is not immediate how to justify passing to the limit and obtain either a strong well-posedness theory or a weak solution theory for the limiting equation (1). (For instance, weak solutions to the Navier-Stokes equations (or the approximate solutions used to create such weak solutions) have lying in for , but the bound on the norm is and so one could lose this regularity in the limit , at which point it is not clear how to ensure that the nonlinear term still converges in the sense of distributions to what one expects.)

Nevertheless, by carefully using the energy method (which we will do loosely following an approach of Bertozzi and Majda), it is still possible to obtain *local-in-time* estimates on (high-regularity) solutions to (3) that are uniform in the limit . Such *a priori* estimates can then be combined with a number of variants of these estimates to obtain a satisfactory local well-posedness theory for the Euler equations. Among other things, we will be able to establish the *Beale-Kato-Majda criterion* – smooth solutions to the Euler (or Navier-Stokes) equations can be continued indefinitely unless the integral

becomes infinite at the final time , where is the *vorticity* field. The vorticity has the important property that it is transported by the Euler flow, and in two spatial dimensions it can be used to establish global regularity for both the Euler and Navier-Stokes equations in these settings. (Unfortunately, in three and higher dimensions the phenomenon of vortex stretching has frustrated all attempts to date to use the vorticity transport property to establish global regularity of either equation in this setting.)
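
For reference, both the transport property and the vortex stretching obstruction can be read off from the vorticity equation for the Euler equations. The display below is the standard form, written in notation that is assumed here and may differ from that of the notes: $u$ is the velocity field, and $\omega$ is the scalar vorticity $\partial_1 u_2 - \partial_2 u_1$ in two dimensions or the curl $\nabla \times u$ in three dimensions.

```latex
% Two dimensions: the scalar vorticity is transported by the flow.
\partial_t \omega + (u \cdot \nabla) \omega = 0

% Three dimensions: the right-hand side is the vortex stretching term.
\partial_t \omega + (u \cdot \nabla) \omega = (\omega \cdot \nabla) u
```

In the two-dimensional case the supremum of $|\omega|$ is formally conserved along the flow, so the integral in the criterion cannot blow up in finite time; in three dimensions the stretching term removes this conservation. For Navier-Stokes, a viscosity term $\nu \Delta \omega$ is added to the right-hand side of each equation.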

There is a rather different approach to establishing local well-posedness for the Euler equations, which relies on the *vorticity-stream* formulation of these equations. This will be discussed in a later set of notes.

In the previous set of notes we developed a theory of “strong” solutions to the Navier-Stokes equations. This theory, based around viewing the Navier-Stokes equations as a perturbation of the linear heat equation, has many attractive features: solutions exist locally, are unique, depend continuously on the initial data, have a high degree of regularity, can be continued in time as long as a sufficiently high regularity norm is under control, and tend to enjoy the same sort of conservation laws that classical solutions do. However, it is a major open problem as to whether these solutions can be extended to be (forward) global in time, because the norms that we know how to control globally in time do not have high enough regularity to be useful for continuing the solution. Also, the theory becomes degenerate in the inviscid limit .

However, it is possible to construct “weak” solutions which lack many of the desirable features of strong solutions (notably, uniqueness, propagation of regularity, and conservation laws) but can often be constructed globally in time even when one is unable to do so for strong solutions. Broadly speaking, one usually constructs weak solutions by some sort of “compactness method”, which can generally be described as follows.

- Construct a sequence of “approximate solutions” to the desired equation, for instance by developing a well-posedness theory for some “regularised” approximation to the original equation. (This theory often follows similar lines to those in the previous set of notes, for instance using such tools as the contraction mapping theorem to construct the approximate solutions.)
- Establish some *uniform* bounds (over appropriate time intervals) on these approximate solutions, even in the limit as an approximation parameter is sent to zero. (Uniformity is key; *non-uniform* bounds are often easy to obtain if one puts enough “mollification”, “hyper-dissipation”, or “discretisation” in the approximating equation.)
- Use some sort of “weak compactness” (e.g., the Banach-Alaoglu theorem, the Arzela-Ascoli theorem, or the Rellich compactness theorem) to extract a subsequence of approximate solutions that converge (in a topology weaker than that associated to the available uniform bounds) to a limit. (Note that there is no reason *a priori* to expect such limit points to be unique, or to have any regularity properties beyond that implied by the available uniform bounds.)
- Show that this limit solves the original equation in a suitable weak sense.
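
One reason the final step requires care is that weak convergence does not commute with nonlinear operations. The following is a minimal numerical sketch of this phenomenon in a toy setting that is not taken from the notes: the functions sin(nx) on the interval [0, 2π] converge weakly to zero, but their squares converge weakly to the constant 1/2 rather than to zero. The test function `phi`, the midpoint quadrature, and all function names are hypothetical choices made for illustration.

```python
import math

def pair(f, phi, n_points=20000):
    """Midpoint-rule approximation of the integral of f(x) * phi(x) over [0, 2*pi]."""
    h = 2 * math.pi / n_points
    return h * sum(f((j + 0.5) * h) * phi((j + 0.5) * h) for j in range(n_points))

def phi(x):
    # A fixed smooth test function (an arbitrary choice for illustration).
    return math.exp(-(x - 2.0) ** 2)

def linear_pairing(n):
    """Pairing of u_n(x) = sin(n x) against phi."""
    return pair(lambda x: math.sin(n * x), phi)

def quadratic_pairing(n):
    """Pairing of the nonlinear quantity u_n(x)**2 against phi."""
    return pair(lambda x: math.sin(n * x) ** 2, phi)

# Integral of phi alone; half of this is the weak limit of the quadratic pairings.
mass = pair(lambda x: 1.0, phi)

if __name__ == "__main__":
    for n in (1, 5, 50):
        print(n, linear_pairing(n), quadratic_pairing(n), 0.5 * mass)
```

As n grows, the linear pairings tend to zero while the quadratic pairings approach half the integral of `phi`. The square of the weak limit (zero) is therefore not the weak limit of the squares, which is why some strong convergence is typically needed to pass to the limit in a quadratic nonlinearity.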

The quality of these weak solutions is very much determined by the type of uniform bounds one can obtain on the approximate solution; the stronger these bounds are, the more properties one can obtain on these weak solutions. For instance, if the approximate solutions enjoy an energy identity leading to uniform energy bounds, then (by using tools such as Fatou’s lemma) one tends to obtain energy *inequalities* for the resulting weak solution; but if one somehow is able to obtain uniform bounds in a higher regularity norm than the energy then one can often recover the full energy *identity*. If the uniform bounds are at the regularity level needed to obtain well-posedness, then one generally expects to upgrade the weak solution to a strong solution. (This phenomenon is often formalised through *weak-strong uniqueness* theorems, which we will discuss later in these notes.) Thus we see that as far as attacking global regularity is concerned, both the theory of strong solutions and the theory of weak solutions encounter essentially the same obstacle, namely the inability to obtain uniform bounds on (exact or approximate) solutions at high regularities (and at arbitrary times).

For simplicity, we will focus our discussion in these notes on finite energy weak solutions on . There is a completely analogous theory for periodic weak solutions on (or equivalently, weak solutions on the torus ), which we will leave to the interested reader.

In recent years, a completely different way to construct weak solutions to the Navier-Stokes or Euler equations has been developed, which is based not on the above compactness methods but on techniques of convex integration. This will be discussed in a later set of notes.

We now begin the rigorous theory of the incompressible Navier-Stokes equations

where is a given constant (the *kinematic viscosity*, or *viscosity* for short), is an unknown vector field (the *velocity field*), and is an unknown scalar field (the *pressure field*). Here is a time interval, usually of the form or . We will either be interested in spatially decaying situations, in which decays to zero as , or -periodic (or *periodic* for short) settings, in which one has for all . (One can also require the pressure to be periodic as well; this brings up a small subtlety in the uniqueness theory for these equations, which we will address later in this set of notes.) As is usual, we abuse notation by identifying a -periodic function on with a function on the torus .

In order for the system (1) to even make sense, one requires some level of regularity on the unknown fields ; this turns out to be a relatively important technical issue that will require some attention later in this set of notes, and we will end up transforming (1) into other forms that are more suitable for lower regularity candidate solutions. Our focus here will be on local existence of these solutions in a short time interval or , for some . (One could in principle also consider solutions that extend to negative times, but it turns out that the equations are not time-reversible, and the forward evolution is significantly more natural to study than the backwards one.) The study of the Euler equations, in which , will be deferred to subsequent lecture notes.

As the unknown fields involve a time parameter , and the first equation of (1) involves time derivatives of , the system (1) should be viewed as describing an evolution for the velocity field . (As we shall see later, the pressure is not really an independent dynamical field, as it can essentially be expressed in terms of the velocity field without requiring any differentiation or integration in time.) As such, the natural question to study for this system is the initial value problem, in which an initial velocity field is specified, and one wishes to locate a solution to the system (1) with initial condition

for . Of course, in order for this initial condition to be compatible with the second equation in (1), we need the compatibility condition

and one should also impose some regularity, decay, and/or periodicity hypotheses on in order to be compatible with the corresponding level of regularity etc. on the solution .

The fundamental questions in the local theory of an evolution equation are that of *existence*, *uniqueness*, and *continuous dependence*. In the context of the Navier-Stokes equations, these questions can be phrased (somewhat broadly) as follows:

- (a) (Local existence) Given suitable initial data , does there exist a solution to the above initial value problem that exists for some time ? What can one say about the time of existence? How regular is the solution?
- (b) (Uniqueness) Is it possible to have two solutions of a certain regularity class to the same initial value problem on a common time interval ? To what extent does the answer to this question depend on the regularity assumed on one or both of the solutions? Does one need to normalise the solutions beforehand in order to obtain uniqueness?
- (c) (Continuous dependence on data) If one perturbs the initial conditions by a small amount, what happens to the solution and to the time of existence ? (This question tends to only be sensible once one has a reasonable uniqueness theory.)

The answers to these questions tend to be more complicated than a simple “Yes” or “No”; for instance, they can depend on the precise regularity hypotheses one wishes to impose on the data and on the solution, and even on exactly how one interprets the concept of a “solution”. However, once one settles on such a set of hypotheses, it generally happens that one either gets a “strong” theory (in which one has existence, uniqueness, and continuous dependence on the data), a “weak” theory (in which one has existence of somewhat low-quality solutions, but with only limited uniqueness results (or even some spectacular failures of uniqueness) and almost no continuous dependence on data), or no satisfactory theory whatsoever. In the first case, we say (roughly speaking) that the initial value problem is *locally well-posed*, and one can then try to build upon the theory to explore more interesting topics such as global existence and asymptotics, classifying potential blowup, rigorous justification of conservation laws, and so forth. With a weak local theory, it becomes much more difficult to address these latter sorts of questions, and there are serious analytic pitfalls that one could fall into if one tries too strenuously to treat weak solutions as if they were strong. (For instance, conservation laws that are rigorously justified for strong, high-regularity solutions may well fail for weak, low-regularity ones.) Also, even if one is primarily interested in solutions at one level of regularity, the well-posedness theory at another level of regularity can be very helpful; for instance, if one is interested in smooth solutions in , it turns out that the well-posedness theory at the critical regularity of can be used to establish *globally* smooth solutions from small initial data. As such, it can become quite important to know what kind of local theory one can obtain for a given equation.

This set of notes will focus on the “strong” theory, in which a substantial amount of regularity is assumed in the initial data and solution, giving a satisfactory (albeit largely local-in-time) well-posedness theory. “Weak” solutions will be considered in later notes.

The Navier-Stokes equations are not the simplest of partial differential equations to study, in part because they are an amalgam of three more basic equations, which behave rather differently from each other (for instance the first equation is nonlinear, while the latter two are linear):

- (a) *Transport equations* such as .
- (b) *Diffusion equations* (or *heat equations*) such as .
- (c) Systems such as , , which (for want of a better name) we will call *Leray systems*.
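
To give a rough feel for the third type of equation, here is a minimal sketch that solves a Leray-type system for a single Fourier mode. It assumes (as a labelled guess, since the exact system is not reproduced above) that the system has the form v = F − ∇p with ∇·v = 0, so that on a plane wave with wave vector k it becomes v̂ = F̂ − i k p̂ and k·v̂ = 0. The function name `leray_project_mode` and the sample data are hypothetical.

```python
def leray_project_mode(k, F_hat):
    """Solve v_hat = F_hat - i*k*p_hat, k . v_hat = 0 for one nonzero wave vector k.

    k     : sequence of real numbers (the wave vector), not all zero
    F_hat : sequence of complex Fourier coefficients of the forcing, same length
    Returns (v_hat, p_hat).
    """
    k_sq = sum(kj * kj for kj in k)
    if k_sq == 0:
        raise ValueError("the zero mode needs a separate normalisation of the pressure")
    k_dot_F = sum(kj * Fj for kj, Fj in zip(k, F_hat))
    # Taking the divergence of the first equation gives -|k|^2 p_hat = i (k . F_hat).
    p_hat = -1j * k_dot_F / k_sq
    # Subtract the gradient part; what remains is orthogonal to k.
    v_hat = [Fj - 1j * kj * p_hat for kj, Fj in zip(k, F_hat)]
    return v_hat, p_hat

# Sample data (arbitrary values chosen for illustration).
k = (1.0, 2.0, -2.0)
F_hat = (3.0 + 1.0j, -1.0 + 0.5j, 2.0 - 4.0j)
v_hat, p_hat = leray_project_mode(k, F_hat)
divergence = sum(kj * vj for kj, vj in zip(k, v_hat))
```

Under this assumption the system is linear and involves no time derivative: each mode is solved by an explicit algebraic formula, with `v_hat` being the component of `F_hat` orthogonal to `k`.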

Accordingly, we will devote some time to getting some preliminary understanding of the linear diffusion and Leray systems before returning to the theory for the Navier-Stokes equation. Transport systems will be discussed further in subsequent notes; in this set of notes, we will instead focus on a more basic example of nonlinear equations, namely the first-order *ordinary differential equation*

where takes values in some finite-dimensional (real or complex) vector space on some time interval , and is a given linear or nonlinear function. (Here, we use “interval” to denote a connected non-empty subset of ; in particular, we allow intervals to be half-infinite or infinite, or to be open, closed, or half-open.) Fundamental results in this area include the Picard existence and uniqueness theorem, the Duhamel formula, and Grönwall’s inequality; they will serve as motivation for the approach to local well-posedness that we will adopt in this set of notes. (There are other ways to construct strong or weak solutions for Navier-Stokes and Euler equations, which we will discuss in later notes.)
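
As a rough illustration of the idea behind the Picard theorem, the sketch below rewrites an ODE of the assumed autonomous form u'(t) = F(u(t)), u(0) = u0 as the integral equation u(t) = u0 + ∫₀ᵗ F(u(s)) ds and iterates the right-hand side starting from the constant function u0. For simplicity it assumes a scalar unknown and uses a trapezoid rule on a uniform grid; the names `picard_iterate`, `steps`, and `iterations` are hypothetical.

```python
import math

def picard_iterate(F, u0, T, steps=1000, iterations=30):
    """Approximate the solution of u'(t) = F(u(t)), u(0) = u0 on [0, T]
    by iterating the integral map u -> u0 + int_0^t F(u(s)) ds on a grid."""
    dt = T / steps
    u = [u0] * (steps + 1)            # initial guess: the constant function u0
    for _ in range(iterations):
        f = [F(x) for x in u]         # evaluate the nonlinearity on the current iterate
        new_u = [u0]
        integral = 0.0
        for j in range(steps):
            integral += 0.5 * dt * (f[j] + f[j + 1])   # trapezoid rule on [t_j, t_{j+1}]
            new_u.append(u0 + integral)
        u = new_u
    return u

# Test case u' = u, u(0) = 1, whose exact solution is exp(t).
approx = picard_iterate(lambda x: x, 1.0, 1.0)
error = abs(approx[-1] - math.e)
```

For this linear test case the n-th exact Picard iterate is the n-th Taylor polynomial of the exponential, so after thirty iterations the remaining error comes almost entirely from the quadrature rather than from the iteration.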

A key role in our treatment here will be played by the fundamental theorem of calculus (in various forms and variations). Roughly speaking, this theorem, and its variants, allow us to recast differential equations (such as (1) or (4)) as integral equations. Such integral equations are less tractable algebraically than their differential counterparts (for instance, they are not ideal for verifying conservation laws), but are significantly more convenient for well-posedness theory, basically because integration tends to increase the regularity of a function, while differentiation reduces it. (Indeed, the problem of “losing derivatives”, or more precisely “losing regularity”, is a key obstacle that one often has to address when trying to establish well-posedness for PDE, particularly those that are quite nonlinear and with rough initial data, though for nonlinear parabolic equations such as Navier-Stokes the obstacle is not as serious as it is for some other PDE, due to the smoothing effects of the heat equation.)

One weakness of the methods deployed here is that the quantitative bounds produced deteriorate to the point of uselessness in the inviscid limit , rendering these techniques unsuitable for analysing the Euler equations in which . However, some of the methods developed in later notes have bounds that remain uniform in the limit, allowing one to also treat the Euler equations.

In this and subsequent sets of notes, we use the following asymptotic notation (a variant of Vinogradov notation that is commonly used in PDE and harmonic analysis). The statement , , or will be used to denote an estimate of the form (or equivalently ) for some constant , and will be used to denote the estimates . If the constant depends on other parameters (such as the dimension ), this will be indicated by subscripts, thus for instance denotes the estimate for some depending on .

This coming fall quarter, I am teaching a class on topics in the mathematical theory of incompressible fluid equations, focusing particularly on the incompressible Euler and Navier-Stokes equations. These two equations are by no means the only equations used to model fluids, but I will focus on these two equations in this course to narrow the focus down to something manageable. I have not fully decided on the choice of topics to cover in this course, but I would probably begin with some core topics such as local well-posedness theory and blowup criteria, conservation laws, and construction of weak solutions, then move on to some topics such as boundary layers and the Prandtl equations, the Euler-Poincare-Arnold interpretation of the Euler equations as an infinite dimensional geodesic flow, and some discussion of the Onsager conjecture. I will probably also continue to more advanced and recent topics in the winter quarter.

In this initial set of notes, we begin by reviewing the physical derivation of the Euler and Navier-Stokes equations from the first principles of Newtonian mechanics, and specifically from Newton’s famous three laws of motion. Strictly speaking, this derivation is not needed for the mathematical analysis of these equations, which can be viewed if one wishes as an arbitrarily chosen system of partial differential equations without any physical motivation; however, I feel that the derivation sheds some insight and intuition on these equations, and is also worth knowing on purely intellectual grounds regardless of its mathematical consequences. I also find it instructive to actually see the journey from Newton’s law

to the seemingly rather different-looking law

for incompressible Navier-Stokes (or, if one drops the viscosity term , the Euler equations).

Our discussion in this set of notes is physical rather than mathematical, and so we will not be working at mathematical levels of rigour and precision. In particular we will be fairly casual about interchanging summations, limits, and integrals, we will manipulate approximate identities as if they were exact identities (e.g., by differentiating both sides of the approximate identity), and we will not attempt to verify any regularity or convergence hypotheses in the expressions being manipulated. (The same holds for the exercises in this text, which also do not need to be justified at mathematical levels of rigour.) Of course, once we resume the mathematical portion of this course in subsequent notes, such issues will be an important focus of careful attention. This is a basic division of labour in mathematical modeling: non-rigorous heuristic reasoning is used to derive a mathematical model from physical (or other “real-life”) principles, but once a precise model is obtained, the analysis of that model should be completely rigorous if at all possible (even if this requires applying the model to regimes which do not correspond to the original physical motivation of that model). See the discussion by John Ball quoted at the end of these slides of Gero Friesecke for an expansion of these points.

Note: our treatment here will differ slightly from that presented in many fluid mechanics texts, in that it will emphasise first-principles derivations from many-particle systems, rather than relying on bulk laws of physics, such as the laws of thermodynamics, which we will not cover here. (However, the derivations from bulk laws tend to be more robust, in that they are not as reliant on assumptions about the particular interactions between particles. In particular, the physical hypotheses we assume in this post are probably quite a bit stronger than the minimal assumptions needed to justify the Euler or Navier-Stokes equations, which can hold even in situations in which one or more of the hypotheses assumed here break down.)
