
In set theory, a function $f: X \to Y$ is defined as an object that *evaluates* every input $x$ to exactly one output $f(x)$. However, in various branches of mathematics, it has become convenient to generalise this classical concept of a function to a more abstract one. For instance, in operator algebras, quantum mechanics, or non-commutative geometry, one often replaces commutative algebras of (real or complex-valued) functions on some space $X$, such as $C(X)$ or $L^\infty(X)$, with a more general – and possibly non-commutative – algebra (e.g. a $C^*$-algebra or a von Neumann algebra). Elements in this more abstract algebra are no longer definable as functions in the classical sense of assigning a single value $f(x)$ to every point $x \in X$, but one can still define other operations on these “generalised functions” (e.g. one can multiply or take inner products between two such objects).

Generalisations of functions are also very useful in analysis. In our study of $L^p$ spaces, we have already seen one such generalisation, namely the concept of a function defined up to almost everywhere equivalence. Such a function (or more precisely, an equivalence class of classical functions) cannot be evaluated at any given point $x$, if that point $x$ has measure zero. However, it is still possible to perform algebraic operations on such functions (e.g. multiplying or adding two functions together), and one can also integrate such functions on measurable sets $E$ (provided, of course, that the function has some suitable integrability condition). We also know that the $L^p$ spaces can usually be described via duality, as the dual space of $L^{p'}$ (except in some endpoint cases, namely when $p = 1$, or when $p = \infty$ and the underlying space is not $\sigma$-finite).

We have also seen (via the Lebesgue-Radon-Nikodym theorem) that locally integrable functions $f \in L^1_{loc}(\mathbb{R})$ on, say, the real line $\mathbb{R}$, can be identified with locally finite absolutely continuous measures $m_f$ on the line, by multiplying Lebesgue measure $m$ by the function $f$. So another way to generalise the concept of a function is to consider arbitrary locally finite Radon measures $\mu$ (not necessarily absolutely continuous), such as the Dirac measure $\delta_0$. With this concept of “generalised function”, one can still add and subtract two measures $\mu, \nu$, and integrate any measure $\mu$ against a (bounded) measurable set $E$ to obtain a number $\mu(E)$, but one cannot evaluate a measure $\mu$ (or more precisely, the Radon-Nikodym derivative $d\mu/dm$ of that measure) at a single point $x$, and one also cannot multiply two measures together to obtain another measure. From the Riesz representation theorem, we also know that the space of (finite) Radon measures can be described via duality, as linear functionals on $C_c(\mathbb{R})$.

There is an even larger class of generalised functions that is very useful, particularly in linear PDE, namely the space of distributions, say on a Euclidean space $\mathbb{R}^d$. In contrast to Radon measures $\mu$, which can be defined by how they “pair up” against continuous, compactly supported test functions $f \in C_c(\mathbb{R}^d)$ to create numbers $\langle f, \mu \rangle := \int_{\mathbb{R}^d} f\, d\mu$, a distribution $\lambda$ is defined by how it pairs up against a *smooth* compactly supported function $\phi \in C_c^\infty(\mathbb{R}^d)$ to create a number $\langle \phi, \lambda \rangle$. As the space $C_c^\infty(\mathbb{R}^d)$ of smooth compactly supported functions is smaller than (but dense in) the space $C_c(\mathbb{R}^d)$ of continuous compactly supported functions (and has a stronger topology), the space of distributions is larger than that of measures. But the space $C_c^\infty(\mathbb{R}^d)$ is closed under more operations than $C_c(\mathbb{R}^d)$, and in particular is closed under differential operators (with smooth coefficients). Because of this, the space of distributions is similarly closed under such operations; in particular, one can differentiate a distribution and get another distribution, which is something that is not always possible with measures or functions. But as measures or functions can be interpreted as distributions, this leads to the notion of a weak derivative for such objects, which makes sense (but only as a distribution) even for functions that are not classically differentiable. Thus the theory of distributions can allow one to rigorously manipulate rough functions “as if” they were smooth, although one must still be careful as some operations on distributions are not well-defined, most notably the operation of multiplying two distributions together. Nevertheless one can use this theory to justify many formal computations involving derivatives, integrals, etc. (including several computations used routinely in physics) that would be difficult to formalise rigorously in a purely classical framework.
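As a small computational illustration (not from the original notes), the weak derivative of the Heaviside step function – which is not classically differentiable at the origin – can be computed symbolically; here I use sympy's built-in `Heaviside` and `DiracDelta` objects, assuming sympy is available:

```python
# Sketch: distributional (weak) derivatives via sympy.
from sympy import Heaviside, DiracDelta, cos, diff, integrate, oo, symbols

x = symbols('x', real=True)

# The Heaviside step function has no classical derivative at 0,
# but its weak derivative is the Dirac delta distribution:
d = diff(Heaviside(x), x)
print(d)  # DiracDelta(x)

# Pairing the delta against a smooth test function phi recovers phi(0);
# here phi = cos, so the pairing evaluates to cos(0) = 1:
pairing = integrate(DiracDelta(x) * cos(x), (x, -oo, oo))
print(pairing)  # 1
```

Note that sympy's `DiracDelta` only makes sense under an integral sign against a test function, exactly as in the duality definition above.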

If one shrinks the space of distributions slightly, to the space of *tempered distributions* (which is formed by enlarging the space $C_c^\infty(\mathbb{R}^d)$ of test functions to the Schwartz class $\mathcal{S}(\mathbb{R}^d)$), then one obtains closure under another important operation, namely the Fourier transform. This allows one to define various Fourier-analytic operations (e.g. pseudodifferential operators) on such distributions.
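The closure of the Schwartz class under the Fourier transform can be seen concretely on a Gaussian, which is a Schwartz function mapped to another Schwartz function (indeed to itself, with the right normalisation). A quick sympy check (an added illustration, not from the original text; sympy's `fourier_transform` uses the convention $F(k) = \int f(x) e^{-2\pi i x k}\, dx$):

```python
# Sketch: the Fourier transform maps the Schwartz class to itself;
# the Gaussian exp(-pi x^2) is a fixed point of the transform.
from sympy import exp, fourier_transform, pi, simplify, symbols

x, k = symbols('x k', real=True)

F = fourier_transform(exp(-pi * x**2), x, k)
print(simplify(F - exp(-pi * k**2)))  # 0: the Gaussian is its own transform
```

By duality, the Fourier transform then extends to all tempered distributions, which is what gives objects like the delta function a well-defined transform.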

Of course, at the end of the day, one is usually not all that interested in distributions in their own right, but would like to be able to use them as a tool to study more classical objects, such as smooth functions. Fortunately, one can recover facts about smooth functions from facts about the (far rougher) space of distributions in a number of ways. For instance, if one convolves a distribution with a smooth, compactly supported function, one gets back a smooth function. This is a particularly useful fact in the theory of constant-coefficient linear partial differential equations such as $Lu = f$, as it allows one to recover a smooth solution $u$ from smooth, compactly supported data $f$ by convolving $f$ with a specific distribution $G$, known as the fundamental solution of $L$. We will give some examples of this later in these notes.
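As a numerical sketch of this convolution recipe (an added example, not from the original notes): in one dimension the operator $L = \frac{d^2}{dx^2}$ has fundamental solution $G(x) = |x|/2$, since $G'' = \delta$ in the distributional sense, so convolving smooth compactly supported data $f$ with $G$ should produce a solution of $u'' = f$:

```python
# Sketch: solve u'' = f by convolving f with the fundamental
# solution G(x) = |x|/2 of d^2/dx^2, then verify u'' ~ f.
import numpy as np

N, L = 401, 10.0
x = np.linspace(-L, L, N)
dx = x[1] - x[0]

f = np.exp(-x**2)        # smooth data, effectively supported in [-5, 5]
G = np.abs(x) / 2.0      # fundamental solution of d^2/dx^2

# u = G * f, discretised as a Riemann sum over the same grid:
u = np.convolve(f, G, mode='same') * dx

# Second difference quotient of u should recover f away from the
# grid boundary (where the truncation of G introduces errors):
upp = (u[2:] - 2 * u[1:-1] + u[:-2]) / dx**2
mid = slice(140, 260)    # central region, x roughly in [-3, 3]
err = np.max(np.abs(upp[mid] - f[1:-1][mid]))
print(err)               # tiny: u'' matches f in the interior
```

The discrete second difference of $|x|/2$ on a uniform grid is exactly a discrete delta, which is why the recovery is essentially exact in the interior.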

It is this unusual and useful combination of both being able to pass from classical functions to generalised functions (e.g. by differentiation) and then back from generalised functions to classical functions (e.g. by convolution) that sets the theory of distributions apart from other competing theories of generalised functions, in particular allowing one to justify many formal calculations in PDE and Fourier analysis rigorously with relatively little additional effort. On the other hand, being defined by linear duality, the theory of distributions becomes somewhat less useful when one moves to more nonlinear problems, such as nonlinear PDE. However, they still serve an important supporting role in such problems as an “ambient space” of functions, inside of which one carves out more useful function spaces, such as Sobolev spaces, which we will discuss in the next set of notes.

Today I’d like to discuss (in the Tricks Wiki format) a fundamental trick in “soft” analysis, sometimes known as the “limiting argument” or “epsilon regularisation argument”.

**Title**: Give yourself an epsilon of room.

**Quick description**: You want to prove some statement $S_0$ about some object $x_0$ (which could be a number, a point, a function, a set, etc.). To do so, pick a small $\epsilon > 0$, and first prove a weaker statement $S_\epsilon$ (which allows for “losses” which go to zero as $\epsilon \to 0$) about some perturbed object $x_\epsilon$. Then, take limits $\epsilon \to 0$. Provided that the dependency and continuity of the weaker conclusion $S_\epsilon$ on $\epsilon$ are sufficiently controlled, and $x_\epsilon$ is converging to $x_0$ in an appropriately strong sense, you will recover the original statement.

One can of course play a similar game when proving a statement $S_\infty$ about some object $x_\infty$, by first proving a weaker statement $S_N$ on some approximation $x_N$ to $x_\infty$ for some large parameter $N$, and then sending $N \to \infty$ at the end.
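As a quick worked illustration of this trick (an example added here, not part of the original post), consider the Riemann-Lebesgue lemma, whose standard proof is exactly an epsilon-of-room argument:

```latex
% Claim (S_0): for $f \in L^1(\mathbb{R})$, $\hat{f}(\xi) \to 0$ as $|\xi| \to \infty$.
%
% Step 1: give yourself an epsilon of room. Pick $\epsilon > 0$ and an
% approximant $f_\epsilon \in C^\infty_c(\mathbb{R})$ with
%   \[ \| f - f_\epsilon \|_{L^1} \leq \epsilon. \]
%
% Step 2 (the weaker statement $S_\epsilon$): for the smooth compactly
% supported $f_\epsilon$, integration by parts gives, for $\xi \neq 0$,
%   \[ |\hat{f_\epsilon}(\xi)|
%        = \frac{1}{2\pi|\xi|} \Big| \int f_\epsilon'(x) e^{-2\pi i x \xi}\, dx \Big|
%        \leq \frac{\| f_\epsilon' \|_{L^1}}{2\pi |\xi|} \to 0. \]
%
% Step 3: pay back the loan. By the triangle inequality,
%   \[ |\hat{f}(\xi)| \leq |\hat{f_\epsilon}(\xi)| + \| f - f_\epsilon \|_{L^1}
%        \leq o(1) + \epsilon, \]
% so $\limsup_{|\xi| \to \infty} |\hat{f}(\xi)| \leq \epsilon$ for every
% $\epsilon > 0$, and the claim $S_0$ follows.
```

Note how the bound $\| f_\epsilon' \|_{L^1}$ in Step 2 is badly non-uniform in $\epsilon$, but this non-uniformity disappears from the final conclusion, exactly as described below.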

**General discussion:** Here are some typical examples of a target statement $S_0$, and the approximating statements $S_\epsilon$ that would converge to $S_0$:

| $S_0$ | $S_\epsilon$ |
| --- | --- |
| $f(x_0) > 0$ | $f(x_\epsilon) \geq c - o(1)$ for some $c > 0$ independent of $\epsilon$ |
| $f(x_0)$ is finite | $f(x_\epsilon)$ is bounded uniformly in $\epsilon$ |
| $f(x_0) \geq f(x)$ for all $x \in X$ (i.e. $x_0$ maximises $f$) | $f(x_\epsilon) \geq f(x) - o(1)$ for all $x \in X$ (i.e. $x_\epsilon$ nearly maximises $f$) |
| $f_n$ converges as $n \to \infty$ | $f_n$ fluctuates by at most $o(1)$ for sufficiently large $n$ |
| $f$ is a measurable function | $f_\epsilon$ is a measurable function converging pointwise to $f$ |
| $f$ is a continuous function | $f_\epsilon$ is an equicontinuous family of functions converging pointwise to $f$ OR $f_\epsilon$ is continuous and converges (locally) uniformly to $f$ |
| The event $E$ holds almost surely | The event $E_\epsilon$ holds with probability $1-o(1)$ |
| The statement $P(x)$ holds for almost every $x$ | The statement $P_\epsilon(x)$ holds for $x$ outside of a set of measure $o(1)$ |

Of course, to justify the convergence of $S_\epsilon$ to $S_0$, it is necessary that $x_\epsilon$ converge to $x_0$ (or $f_\epsilon$ converge to $f$, etc.) in a suitably strong sense. (But for the purposes of proving just *upper* bounds, such as $f(x_0) \leq M$, one can often get by with quite weak forms of convergence, thanks to tools such as Fatou’s lemma or the weak closure of the unit ball.) Similarly, we need some continuity (or at least semi-continuity) hypotheses on the functions $f$, $g$ appearing above.

It is also necessary in many cases that the control on the approximating object $x_\epsilon$ is somehow “uniform in $\epsilon$”, although for conclusions which are preserved under limits, such as measurability, this is not required. [It is important to note that it is only the *final* conclusion on $x_\epsilon$ that needs to have this uniformity in $\epsilon$; one is permitted to have some intermediate stages in the derivation of $S_\epsilon$ that depend on $\epsilon$ in a non-uniform manner, so long as these non-uniformities cancel out or otherwise disappear at the end of the argument.]

By giving oneself an epsilon of room, one can evade a lot of familiar issues in soft analysis. For instance, by replacing “rough”, “infinite-complexity”, “continuous”, “global”, or otherwise “infinitary” objects $x_0$ with “smooth”, “finite-complexity”, “discrete”, “local”, or otherwise “finitary” approximants $x_\epsilon$, one can finesse most issues regarding the justification of various formal operations (e.g. exchanging limits, sums, derivatives, and integrals). [It is important to be aware, though, that any quantitative measure on how smooth, discrete, finite, etc. $x_\epsilon$ should be expected to degrade in the limit $\epsilon \to 0$, and so one should take extreme caution in using such quantitative measures to derive estimates that are uniform in $\epsilon$.] Similarly, issues such as whether the supremum $M := \sup \{ f(x): x \in X \}$ of a function on a set is actually attained by some maximiser $x_0$ become moot if one is willing to settle instead for an almost-maximiser $x_\epsilon$, e.g. one which comes within an epsilon of that supremum $M$ (or which is larger than $1/\epsilon$, if $M$ turns out to be infinite). Last, but not least, one can use the epsilon room to avoid degenerate solutions, for instance by perturbing a non-negative function to be strictly positive, perturbing a non-strictly monotone function to be strictly monotone, and so forth.

To summarise: one can view the epsilon regularisation argument as a “loan” in which one borrows an epsilon here and there in order to be able to ignore soft analysis difficulties, and to temporarily utilise estimates which are non-uniform in epsilon; but at the end of the day one needs to “pay back” the loan by establishing a final “hard analysis” estimate which is uniform in epsilon (or whose error terms decay to zero as epsilon goes to zero).

**A variant:** It may seem that the epsilon regularisation trick is useless in “hard analysis” situations where all objects are already “finitary”, and all formal computations are easily justified. However, there is an important variant of this trick which applies in this case: namely, instead of sending the epsilon parameter to zero, choose epsilon to be a *sufficiently* small (but not *infinitesimally* small) quantity, depending on other parameters in the problem, so that one can eventually neglect various error terms and obtain a useful bound at the end of the day. (For instance, any result proven using the Szemerédi regularity lemma is likely to be of this type.) Since one is not sending epsilon to zero, not every term in the final bound needs to be uniform in epsilon, though for quantitative applications one still would like the dependencies on such parameters to be as favourable as possible.

**Prerequisites**: Graduate real analysis. (Actually, this isn’t so much a prerequisite as it is a *corequisite*: the limiting argument plays a central role in many fundamental results in real analysis.) Some examples also require some exposure to PDE.

We now begin using the theory established in the last two lectures to rigorously extract an asymptotic gradient shrinking soliton from the scaling limit of any given $\kappa$-solution. This will require a number of new tools, including the notion of a *geometric limit* of pointed Ricci flows, which can be viewed as the analogue of the Gromov-Hausdorff limit in the category of smooth Riemannian flows. A key result here is *Hamilton’s compactness theorem*: a sequence of complete pointed non-collapsed Ricci flows with uniform bounds on curvature will have a subsequence which converges geometrically to another Ricci flow. This result, which one can view as an analogue of the Arzelà-Ascoli theorem for Ricci flows, relies on some parabolic regularity estimates for Ricci flow due to Shi.

Next, we use the estimates on reduced length from the Harnack inequality analysis in Lecture 13 to locate some good regions of spacetime of a $\kappa$-solution in which to do the asymptotic analysis. Rescaling these regions and applying Hamilton’s compactness theorem (relying heavily here on the $\kappa$-noncollapsed nature of such solutions), we extract a limit. Formally, the reduced volume is now constant, and so the analysis of Lecture 14 suggests that this limit is a gradient soliton; however, some care is required to make this argument rigorous. In the next lecture we shall study such solitons, which will then reveal important information about the original $\kappa$-solution.

Our treatment here is primarily based on Morgan-Tian’s book and the notes of Ye. Other treatments can be found in Perelman’s original paper, the notes of Kleiner-Lott, and the paper of Cao-Zhu. See also the foundational papers of Shi and Hamilton, as well as the book of Chow, Lu, and Ni.

I’m continuing my series of articles for the Princeton Companion to Mathematics through the winter break with my article on distributions. These “generalised functions” can be viewed either as limits of actual functions, or as elements of the dual of a suitable space of “test” functions. Having such a space of virtual functions to work in is very convenient for several reasons; in particular, it allows one to perform various algebraic manipulations while avoiding (or at least deferring) technical analytical issues, such as how to differentiate a non-differentiable function. You can also find a more recent draft of my article at the PCM web site (username Guest, password PCM).

Today I will highlight Carl Pomerance‘s informative PCM article on “Computational number theory“, which in particular focuses on topics such as primality testing and factoring, which are of major importance in modern cryptography. Interestingly, sieve methods play a critical role in making modern factoring arguments (such as the quadratic sieve and number field sieve) practical even for rather large numbers, although the use of sieves here is rather different from the use of sieves in additive prime number theory.

[*Update*, Jan 1: Link fixed.]
