You are currently browsing the tag archive for the ‘distributions’ tag.

Note: this post is not required reading for this course, or for the sequel course in the winter quarter.
In a Notes 2, we reviewed the classical construction of Leray of global weak solutions to the Navier-Stokes equations. We did not quite follow Leray’s original proof, in that the notes relied more heavily on the machinery of Littlewood-Paley projections, which have become increasingly common tools in modern PDE. On the other hand, we did use the same “exploiting compactness to pass to weakly convergent subsequence” strategy that is the standard one in the PDE literature used to construct weak solutions.
As I discussed in a previous post, the manipulation of sequences and their limits is analogous to a “cheap” version of nonstandard analysis in which one uses the Fréchet filter rather than an ultrafilter to construct the nonstandard universe. (The manipulation of generalised functions of Columbeau-type can also be comfortably interpreted within this sort of cheap nonstandard analysis.) Augmenting the manipulation of sequences with the right to pass to subsequences whenever convenient is then analogous to a sort of “lazy” nonstandard analysis, in which the implied ultrafilter is never actually constructed as a “completed object“, but is instead lazily evaluated, in the sense that whenever membership of a given subsequence of the natural numbers in the ultrafilter needs to be determined, one either passes to that subsequence (thus placing it in the ultrafilter) or the complement of the sequence (placing it out of the ultrafilter). This process can be viewed as the initial portion of the transfinite induction that one usually uses to construct ultrafilters (as discussed using a voting metaphor in this post), except that there is generally no need in any given application to perform the induction for any uncountable ordinal (or indeed for most of the countable ordinals also).
On the other hand, it is also possible to work directly in the orthodox framework of nonstandard analysis when constructing weak solutions. This leads to an approach to the subject which is largely equivalent to the usual subsequence-based approach, though there are some minor technical differences (for instance, the subsequence approach occasionally requires one to work with separable function spaces, whereas in the ultrafilter approach the reliance on separability is largely eliminated, particularly if one imposes a strong notion of saturation on the nonstandard universe). The subject acquires a more “algebraic” flavour, as the quintessential analysis operation of taking a limit is replaced with the “standard part” operation, which is an algebra homomorphism. The notion of a sequence is replaced by the distinction between standard and nonstandard objects, and the need to pass to subsequences disappears entirely. Also, the distinction between “bounded sequences” and “convergent sequences” is largely eradicated, particularly when the space that the sequences ranged in enjoys some compactness properties on bounded sets. Also, in this framework, the notorious non-uniqueness features of weak solutions can be “blamed” on the non-uniqueness of the nonstandard extension of the standard universe (as well as on the multiple possible ways to construct nonstandard mollifications of the original standard PDE). However, many of these changes are largely cosmetic; switching from a subsequence-based theory to a nonstandard analysis-based theory does not seem to bring one significantly closer for instance to the global regularity problem for Navier-Stokes, but it could have been an alternate path for the historical development and presentation of the subject.
In any case, I would like to present below the fold this nonstandard analysis perspective, quickly translating the relevant components of real analysis, functional analysis, and distributional theory that we need to this perspective, and then use it to re-prove Leray’s theorem on existence of global weak solutions to Navier-Stokes.
Read the rest of this entry »

The celebrated decomposition theorem of Fefferman and Stein shows that every function ${f \in \mathrm{BMO}({\bf R}^n)}$ of bounded mean oscillation can be decomposed in the form

$\displaystyle f = f_0 + \sum_{i=1}^n R_i f_i \ \ \ \ \ (1)$

modulo constants, for some ${f_0,f_1,\dots,f_n \in L^\infty({\bf R}^n)}$, where ${R_i := |\nabla|^{-1} \partial_i}$ are the Riesz transforms. A technical note here a function in BMO is defined only up to constants (as well as up to the usual almost everywhere equivalence); related to this, if ${f_i}$ is an ${L^\infty({\bf R}^n)}$ function, then the Riesz transform ${R_i f_i}$ is well defined as an element of ${\mathrm{BMO}({\bf R}^n)}$, but is also only defined up to constants and almost everywhere equivalence.

The original proof of Fefferman and Stein was indirect (relying for instance on the Hahn-Banach theorem). A constructive proof was later given by Uchiyama, and was in fact the topic of the second post on this blog. A notable feature of Uchiyama’s argument is that the construction is quite nonlinear; the vector-valued function ${(f_0,f_1,\dots,f_n)}$ is defined to take values on a sphere, and the iterative construction to build these functions from ${f}$ involves repeatedly projecting a potential approximant to this function to the sphere (also, the high-frequency components of this approximant are constructed in a manner that depends nonlinearly on the low-frequency components, which is a type of technique that has become increasingly common in analysis and PDE in recent years).

It is natural to ask whether the Fefferman-Stein decomposition (1) can be made linear in ${f}$, in the sense that each of the ${f_i, i=0,\dots,n}$ depend linearly on ${f}$. Strictly speaking this is easily accomplished using the axiom of choice: take a Hamel basis of ${\mathrm{BMO}({\bf R}^n)}$, choose a decomposition (1) for each element of this basis, and then extend linearly to all finite linear combinations of these basis functions, which then cover ${\mathrm{BMO}({\bf R}^n)}$ by definition of Hamel basis. But these linear operations have no reason to be continuous as a map from ${\mathrm{BMO}({\bf R}^n)}$ to ${L^\infty({\bf R}^n)}$. So the correct question is whether the decomposition can be made continuously linear (or equivalently, boundedly linear) in ${f}$, that is to say whether there exist continuous linear transformations ${T_i: \mathrm{BMO}({\bf R}^n) \rightarrow L^\infty({\bf R}^n)}$ such that

$\displaystyle f = T_0 f + \sum_{i=1}^n R_i T_i f \ \ \ \ \ (2)$

modulo constants for all ${f \in \mathrm{BMO}({\bf R}^n)}$. Note from the open mapping theorem that one can choose the functions ${f_0,\dots,f_n}$ to depend in a bounded fashion on ${f}$ (thus ${\|f_i\|_{L^\infty} \leq C \|f\|_{BMO}}$ for some constant ${C}$, however the open mapping theorem does not guarantee linearity. Using a result of Bartle and Graves one can also make the ${f_i}$ depend continuously on ${f}$, but again the dependence is not guaranteed to be linear.

It is generally accepted folklore that continuous linear dependence is known to be impossible, but I had difficulty recently tracking down an explicit proof of this assertion in the literature (if anyone knows of a reference, I would be glad to know of it). The closest I found was a proof of a similar statement in this paper of Bourgain and Brezis, which I was able to adapt to establish the current claim. The basic idea is to average over the symmetries of the decomposition, which in the case of (1) are translation invariance, rotation invariance, and dilation invariance. This effectively makes the operators ${T_0,T_1,\dots,T_n}$ invariant under all these symmetries, which forces them to themselves be linear combinations of the identity and Riesz transform operators; however, no such non-trivial linear combination maps ${\mathrm{BMO}}$ to ${L^\infty}$, and the claim follows. Formal details of this argument (which we phrase in a dual form in order to avoid some technicalities) appear below the fold.

In the previous set of notes we developed a theory of “strong” solutions to the Navier-Stokes equations. This theory, based around viewing the Navier-Stokes equations as a perturbation of the linear heat equation, has many attractive features: solutions exist locally, are unique, depend continuously on the initial data, have a high degree of regularity, can be continued in time as long as a sufficiently high regularity norm is under control, and tend to enjoy the same sort of conservation laws that classical solutions do. However, it is a major open problem as to whether these solutions can be extended to be (forward) global in time, because the norms that we know how to control globally in time do not have high enough regularity to be useful for continuing the solution. Also, the theory becomes degenerate in the inviscid limit ${\nu \rightarrow 0}$.

However, it is possible to construct “weak” solutions which lack many of the desirable features of strong solutions (notably, uniqueness, propagation of regularity, and conservation laws) but can often be constructed globally in time even when one us unable to do so for strong solutions. Broadly speaking, one usually constructs weak solutions by some sort of “compactness method”, which can generally be described as follows.

1. Construct a sequence of “approximate solutions” to the desired equation, for instance by developing a well-posedness theory for some “regularised” approximation to the original equation. (This theory often follows similar lines to those in the previous set of notes, for instance using such tools as the contraction mapping theorem to construct the approximate solutions.)
2. Establish some uniform bounds (over appropriate time intervals) on these approximate solutions, even in the limit as an approximation parameter is sent to zero. (Uniformity is key; non-uniform bounds are often easy to obtain if one puts enough “mollification”, “hyper-dissipation”, or “discretisation” in the approximating equation.)
3. Use some sort of “weak compactness” (e.g., the Banach-Alaoglu theorem, the Arzela-Ascoli theorem, or the Rellich compactness theorem) to extract a subsequence of approximate solutions that converge (in a topology weaker than that associated to the available uniform bounds) to a limit. (Note that there is no reason a priori to expect such limit points to be unique, or to have any regularity properties beyond that implied by the available uniform bounds..)
4. Show that this limit solves the original equation in a suitable weak sense.

The quality of these weak solutions is very much determined by the type of uniform bounds one can obtain on the approximate solution; the stronger these bounds are, the more properties one can obtain on these weak solutions. For instance, if the approximate solutions enjoy an energy identity leading to uniform energy bounds, then (by using tools such as Fatou’s lemma) one tends to obtain energy inequalities for the resulting weak solution; but if one somehow is able to obtain uniform bounds in a higher regularity norm than the energy then one can often recover the full energy identity. If the uniform bounds are at the regularity level needed to obtain well-posedness, then one generally expects to upgrade the weak solution to a strong solution. (This phenomenon is often formalised through weak-strong uniqueness theorems, which we will discuss later in these notes.) Thus we see that as far as attacking global regularity is concerned, both the theory of strong solutions and the theory of weak solutions encounter essentially the same obstacle, namely the inability to obtain uniform bounds on (exact or approximate) solutions at high regularities (and at arbitrary times).

For simplicity, we will focus our discussion in this notes on finite energy weak solutions on ${{\bf R}^d}$. There is a completely analogous theory for periodic weak solutions on ${{\bf R}^d}$ (or equivalently, weak solutions on the torus ${({\bf R}^d/{\bf Z}^d)}$ which we will leave to the interested reader.

In recent years, a completely different way to construct weak solutions to the Navier-Stokes or Euler equations has been developed that are not based on the above compactness methods, but instead based on techniques of convex integration. These will be discussed in a later set of notes.

In set theory, a function ${f: X \rightarrow Y}$ is defined as an object that evaluates every input ${x}$ to exactly one output ${f(x)}$. However, in various branches of mathematics, it has become convenient to generalise this classical concept of a function to a more abstract one. For instance, in operator algebras, quantum mechanics, or non-commutative geometry, one often replaces commutative algebras of (real or complex-valued) functions on some space ${X}$, such as ${C(X)}$ or ${L^\infty(X)}$, with a more general – and possibly non-commutative – algebra (e.g. a ${C^*}$-algebra or a von Neumann algebra). Elements in this more abstract algebra are no longer definable as functions in the classical sense of assigning a single value ${f(x)}$ to every point ${x \in X}$, but one can still define other operations on these “generalised functions” (e.g. one can multiply or take inner products between two such objects).

Generalisations of functions are also very useful in analysis. In our study of ${L^p}$ spaces, we have already seen one such generalisation, namely the concept of a function defined up to almost everywhere equivalence. Such a function ${f}$ (or more precisely, an equivalence class of classical functions) cannot be evaluated at any given point ${x}$, if that point has measure zero. However, it is still possible to perform algebraic operations on such functions (e.g. multiplying or adding two functions together), and one can also integrate such functions on measurable sets (provided, of course, that the function has some suitable integrability condition). We also know that the ${L^p}$ spaces can usually be described via duality, as the dual space of ${L^{p'}}$ (except in some endpoint cases, namely when ${p=\infty}$, or when ${p=1}$ and the underlying space is not ${\sigma}$-finite).

We have also seen (via the Lebesgue-Radon-Nikodym theorem) that locally integrable functions ${f \in L^1_{\hbox{loc}}({\bf R})}$ on, say, the real line ${{\bf R}}$, can be identified with locally finite absolutely continuous measures ${m_f}$ on the line, by multiplying Lebesgue measure ${m}$ by the function ${f}$. So another way to generalise the concept of a function is to consider arbitrary locally finite Radon measures ${\mu}$ (not necessarily absolutely continuous), such as the Dirac measure ${\delta_0}$. With this concept of “generalised function”, one can still add and subtract two measures ${\mu, \nu}$, and integrate any measure ${\mu}$ against a (bounded) measurable set ${E}$ to obtain a number ${\mu(E)}$, but one cannot evaluate a measure ${\mu}$ (or more precisely, the Radon-Nikodym derivative ${d\mu/dm}$ of that measure) at a single point ${x}$, and one also cannot multiply two measures together to obtain another measure. From the Riesz representation theorem, we also know that the space of (finite) Radon measures can be described via duality, as linear functionals on ${C_c({\bf R})}$.

There is an even larger class of generalised functions that is very useful, particularly in linear PDE, namely the space of distributions, say on a Euclidean space ${{\bf R}^d}$. In contrast to Radon measures ${\mu}$, which can be defined by how they “pair up” against continuous, compactly supported test functions ${f \in C_c({\bf R}^d)}$ to create numbers ${\langle f, \mu \rangle := \int_{{\bf R}^d} f\ d\overline{\mu}}$, a distribution ${\lambda}$ is defined by how it pairs up against a smooth compactly supported function ${f \in C^\infty_c({\bf R}^d)}$ to create a number ${\langle f, \lambda \rangle}$. As the space ${C^\infty_c({\bf R}^d)}$ of smooth compactly supported functions is smaller than (but dense in) the space ${C_c({\bf R}^d)}$ of continuous compactly supported functions (and has a stronger topology), the space of distributions is larger than that of measures. But the space ${C^\infty_c({\bf R}^d)}$ is closed under more operations than ${C_c({\bf R}^d)}$, and in particular is closed under differential operators (with smooth coefficients). Because of this, the space of distributions is similarly closed under such operations; in particular, one can differentiate a distribution and get another distribution, which is something that is not always possible with measures or ${L^p}$ functions. But as measures or functions can be interpreted as distributions, this leads to the notion of a weak derivative for such objects, which makes sense (but only as a distribution) even for functions that are not classically differentiable. Thus the theory of distributions can allow one to rigorously manipulate rough functions “as if” they were smooth, although one must still be careful as some operations on distributions are not well-defined, most notably the operation of multiplying two distributions together. Nevertheless one can use this theory to justify many formal computations involving derivatives, integrals, etc. (including several computations used routinely in physics) that would be difficult to formalise rigorously in a purely classical framework.

If one shrinks the space of distributions slightly, to the space of tempered distributions (which is formed by enlarging dual class ${C^\infty_c({\bf R}^d)}$ to the Schwartz class ${{\mathcal S}({\bf R}^d)}$), then one obtains closure under another important operation, namely the Fourier transform. This allows one to define various Fourier-analytic operations (e.g. pseudodifferential operators) on such distributions.

Of course, at the end of the day, one is usually not all that interested in distributions in their own right, but would like to be able to use them as a tool to study more classical objects, such as smooth functions. Fortunately, one can recover facts about smooth functions from facts about the (far rougher) space of distributions in a number of ways. For instance, if one convolves a distribution with a smooth, compactly supported function, one gets back a smooth function. This is a particularly useful fact in the theory of constant-coefficient linear partial differential equations such as ${Lu=f}$, as it allows one to recover a smooth solution ${u}$ from smooth, compactly supported data ${f}$ by convolving ${f}$ with a specific distribution ${G}$, known as the fundamental solution of ${L}$. We will give some examples of this later in these notes.

It is this unusual and useful combination of both being able to pass from classical functions to generalised functions (e.g. by differentiation) and then back from generalised functions to classical functions (e.g. by convolution) that sets the theory of distributions apart from other competing theories of generalised functions, in particular allowing one to justify many formal calculations in PDE and Fourier analysis rigorously with relatively little additional effort. On the other hand, being defined by linear duality, the theory of distributions becomes somewhat less useful when one moves to more nonlinear problems, such as nonlinear PDE. However, they still serve an important supporting role in such problems as a “ambient space” of functions, inside of which one carves out more useful function spaces, such as Sobolev spaces, which we will discuss in the next set of notes.

Today I’d like to discuss (in the Tricks Wiki format) a fundamental trick in “soft” analysis, sometimes known as the “limiting argument” or “epsilon regularisation argument”.

Title: Give yourself an epsilon of room.

Quick description: You want to prove some statement $S_0$ about some object $x_0$ (which could be a number, a point, a function, a set, etc.).  To do so, pick a small $\varepsilon > 0$, and first prove a weaker statement $S_\varepsilon$ (which allows for “losses” which go to zero as $\varepsilon \to 0$) about some perturbed object $x_\varepsilon$.  Then, take limits $\varepsilon \to 0$.  Provided that the dependency and continuity of the weaker conclusion $S_\varepsilon$ on $\varepsilon$ are sufficiently controlled, and $x_\varepsilon$ is converging to $x_0$ in an appropriately strong sense, you will recover the original statement.

One can of course play a similar game when proving a statement $S_\infty$ about some object $X_\infty$, by first proving a weaker statement $S_N$ on some approximation $X_N$ to $X_\infty$ for some large parameter N, and then send $N \to \infty$ at the end.

General discussion: Here are some typical examples of a target statement $S_0$, and the approximating statements $S_\varepsilon$ that would converge to $S$:

 $S_0$ $S_\varepsilon$ $f(x_0) = g(x_0)$ $f(x_\varepsilon) = g(x_\varepsilon) + o(1)$ $f(x_0) \leq g(x_0)$ $f(x_\varepsilon) \leq g(x_\varepsilon) + o(1)$ $f(x_0) > 0$ $f(x_\varepsilon) \geq c - o(1)$ for some $c>0$ independent of $\varepsilon$ $f(x_0)$ is finite $f(x_\varepsilon)$ is bounded uniformly in $\varepsilon$ $f(x_0) \geq f(x)$ for all $x \in X$ (i.e. $x_0$ maximises f) $f(x_\varepsilon) \geq f(x)-o(1)$ for all $x \in X$ (i.e. $x_\varepsilon$ nearly maximises f) $f_n(x_0)$ converges as $n \to \infty$ $f_n(x_\varepsilon)$ fluctuates by at most o(1) for sufficiently large n $f_0$ is a measurable function $f_\varepsilon$ is a measurable function converging pointwise to $f_0$ $f_0$ is a continuous function $f_\varepsilon$ is an equicontinuous family of functions converging pointwise to $f_0$ OR $f_\varepsilon$ is continuous and converges (locally) uniformly to $f_0$ The event $E_0$ holds almost surely The event $E_\varepsilon$ holds with probability 1-o(1) The statement $P_0(x)$ holds for almost every x The statement $P_\varepsilon(x)$ holds for x outside of a set of measure o(1)

Of course, to justify the convergence of $S_\varepsilon$ to $S_0$, it is necessary that $x_\varepsilon$ converge to $x_0$ (or $f_\varepsilon$ converge to $f_0$, etc.) in a suitably strong sense. (But for the purposes of proving just upper bounds, such as $f(x_0) \leq M$, one can often get by with quite weak forms of convergence, thanks to tools such as Fatou’s lemma or the weak closure of the unit ball.)  Similarly, we need some continuity (or at least semi-continuity) hypotheses on the functions f, g appearing above.

It is also necessary in many cases that the control $S_\varepsilon$ on the approximating object $x_\varepsilon$ is somehow “uniform in $\varepsilon$“, although for “$\sigma$-closed” conclusions, such as measurability, this is not required. [It is important to note that it is only the final conclusion $S_\varepsilon$ on $x_\varepsilon$ that needs to have this uniformity in $\varepsilon$; one is permitted to have some intermediate stages in the derivation of $S_\varepsilon$ that depend on $\varepsilon$ in a non-uniform manner, so long as these non-uniformities cancel out or otherwise disappear at the end of the argument.]

By giving oneself an epsilon of room, one can evade a lot of familiar issues in soft analysis.  For instance, by replacing “rough”, “infinite-complexity”, “continuous”,  “global”, or otherwise “infinitary” objects $x_0$ with “smooth”, “finite-complexity”, “discrete”, “local”, or otherwise “finitary” approximants $x_\varepsilon$, one can finesse most issues regarding the justification of various formal operations (e.g. exchanging limits, sums, derivatives, and integrals).  [It is important to be aware, though, that any quantitative measure on how smooth, discrete, finite, etc. $x_\varepsilon$ should be expected to degrade in the limit $\varepsilon \to 0$, and so one should take extreme caution in using such quantitative measures to derive estimates that are uniform in $\varepsilon$.]  Similarly, issues such as whether the supremum $M := \sup \{ f(x): x \in X \}$ of a function on a set is actually attained by some maximiser $x_0$ become moot if one is willing to settle instead for an almost-maximiser $x_\varepsilon$, e.g. one which comes within an epsilon of that supremum M (or which is larger than $1/\varepsilon$, if M turns out to be infinite).  Last, but not least, one can use the epsilon room to avoid degenerate solutions, for instance by perturbing a non-negative function to be strictly positive, perturbing a non-strictly monotone function to be strictly monotone, and so forth.

To summarise: one can view the epsilon regularisation argument as a “loan” in which one borrows an epsilon here and there in order to be able to ignore soft analysis difficulties, and can temporarily be able to utilise estimates which are non-uniform in epsilon, but at the end of the day one needs to “pay back” the loan by establishing a final “hard analysis” estimate which is uniform in epsilon (or whose error terms decay to zero as epsilon goes to zero).

A variant: It may seem that the epsilon regularisation trick is useless if one is already in “hard analysis” situations when all objects are already “finitary”, and all formal computations easily justified.  However, there is an important variant of this trick which applies in this case: namely, instead of sending the epsilon parameter to zero, choose epsilon to be a sufficiently small (but not infinitesimally small) quantity, depending on other parameters in the problem, so that one can eventually neglect various error terms and to obtain a useful bound at the end of the day.  (For instance, any result proven using the Szemerédi regularity lemma is likely to be of this type.)  Since one is not sending epsilon to zero, not every term in the final bound needs to be uniform in epsilon, though for quantitative applications one still would like the dependencies on such parameters to be as favourable as possible.

Prerequisites: Graduate real analysis.  (Actually, this isn’t so much a prerequisite as it is a corequisite: the limiting argument plays a central role in many fundamental results in real analysis.)  Some examples also require some exposure to PDE.

We now begin using the theory established in the last two lectures to rigorously extract an asymptotic gradient shrinking soliton from the scaling limit of any given $\kappa$-solution. This will require a number of new tools, including the notion of a geometric limit of pointed Ricci flows $t \mapsto (M, g(t), p)$, which can be viewed as the analogue of the Gromov-Hausdorff limit in the category of smooth Riemannian flows. A key result here is Hamilton’s compactness theorem: a sequence of complete pointed non-collapsed Ricci flows with uniform bounds on curvature will have a subsequence which converges geometrically to another Ricci flow. This result, which one can view as an analogue of the Arzelá-Ascoli theorem for Ricci flows, relies on some parabolic regularity estimates for Ricci flow due to Shi.

Next, we use the estimates on reduced length from the Harnack inequality analysis in Lecture 13 to locate some good regions of spacetime of a $\kappa$-solution in which to do the asymptotic analysis. Rescaling these regions and applying Hamilton’s compactness theorem (relying heavily here on the $\kappa$-noncollapsed nature of such solutions) we extract a limit. Formally, the reduced volume is now constant and so Lecture 14 suggests that this limit is a gradient soliton; however, some care is required to make this argument rigorous. In the next section we shall study such solitons, which will then reveal important information about the original $\kappa$-solution.

Our treatment here is primarily based on Morgan-Tian’s book and the notes of Ye. Other treatments can be found in Perelman’s original paper, the notes of Kleiner-Lott, and the paper of Cao-Zhu. See also the foundational papers of Shi and Hamilton, as well as the book of Chow, Lu, and Ni.

I’m continuing my series of articles for the Princeton Companion to Mathematics through the winter break with my article on distributions. These “generalised functions” can be viewed either as the limits of actual functions, as well as the dual of suitable “test” functions. Having such a space of virtual functions to work in is very convenient for several reasons, in particular it allws one to perform various algebraic manipulations while avoiding (or at least deferring) technical analytical issues, such as how to differentiate a non-differentiable function. You can also find a more recent draft of my article at the PCM web site (username Guest, password PCM).

Today I will highlight Carl Pomerance‘s informative PCM article on “Computational number theory“, which in particular focuses on topics such as primality testing and factoring, which are of major importance in modern cryptography. Interestingly, sieve methods play a critical role in making modern factoring arguments (such as the quadratic sieve and number field sieve) practical even for rather large numbers, although the use of sieves here is rather different from the use of sieves in additive prime number theory.