In set theory, a function $f: X \to Y$ is defined as an object that *evaluates* every input $x$ to exactly one output $f(x)$. However, in various branches of mathematics, it has become convenient to generalise this classical concept of a function to a more abstract one. For instance, in operator algebras, quantum mechanics, or non-commutative geometry, one often replaces commutative algebras of (real or complex-valued) functions on some space $X$, such as $C(X)$ or $L^\infty(X)$, with a more general – and possibly non-commutative – algebra (e.g. a $C^*$-algebra or a von Neumann algebra). Elements in this more abstract algebra are no longer definable as functions in the classical sense of assigning a single value $f(x)$ to every point $x \in X$, but one can still define other operations on these “generalised functions” (e.g. one can multiply or take inner products between two such objects).

Generalisations of functions are also very useful in analysis. In our study of $L^p$ spaces, we have already seen one such generalisation, namely the concept of a function defined up to almost everywhere equivalence. Such a function $f$ (or more precisely, an equivalence class of classical functions) cannot be evaluated at any given point $x$, if that point has measure zero. However, it is still possible to perform algebraic operations on such functions (e.g. multiplying or adding two functions together), and one can also integrate such functions on measurable sets (provided, of course, that the function has some suitable integrability condition). We also know that the $L^p$ spaces can usually be described via duality, as the dual space of $L^{p'}$ (except in some endpoint cases, namely when $p=\infty$, or when $p=1$ and the underlying space is not $\sigma$-finite).

We have also seen (via the Lebesgue-Radon-Nikodym theorem) that locally integrable functions $f \in L^1_{\mathrm{loc}}(\mathbf{R})$ on, say, the real line $\mathbf{R}$, can be identified with locally finite absolutely continuous measures $m_f$ on the line, by multiplying Lebesgue measure $m$ by the function $f$. So another way to generalise the concept of a function is to consider arbitrary locally finite Radon measures $\mu$ (not necessarily absolutely continuous), such as the Dirac measure $\delta_0$. With this concept of “generalised function”, one can still add and subtract two measures $\mu, \nu$, and integrate any measure $\mu$ against a (bounded) measurable set $E$ to obtain a number $\mu(E)$, but one cannot evaluate a measure $\mu$ (or more precisely, the Radon-Nikodym derivative $d\mu/dm$ of that measure) at a single point $x$, and one also cannot multiply two measures together to obtain another measure. From the Riesz representation theorem, we also know that the space of (finite) Radon measures can be described via duality, as linear functionals on $C_c(\mathbf{R})$.

There is an even larger class of generalised functions that is very useful, particularly in linear PDE, namely the space of distributions, say on a Euclidean space $\mathbf{R}^d$. In contrast to Radon measures $\mu$, which can be defined by how they “pair up” against continuous, compactly supported test functions $f \in C_c(\mathbf{R}^d)$ to create numbers $\langle f, \mu \rangle := \int_{\mathbf{R}^d} f\ d\overline{\mu}$, a distribution $\lambda$ is defined by how it pairs up against a *smooth* compactly supported function $f \in C^\infty_c(\mathbf{R}^d)$ to create a number $\langle f, \lambda \rangle$. As the space $C^\infty_c(\mathbf{R}^d)$ of smooth compactly supported functions is smaller than (but dense in) the space $C_c(\mathbf{R}^d)$ of continuous compactly supported functions (and has a stronger topology), the space of distributions is larger than that of measures. But the space $C^\infty_c(\mathbf{R}^d)$ is closed under more operations than $C_c(\mathbf{R}^d)$, and in particular is closed under differential operators (with smooth coefficients). Because of this, the space of distributions is similarly closed under such operations; in particular, one can differentiate a distribution and get another distribution, which is something that is not always possible with measures or $L^p$ functions. But as measures or functions can be interpreted as distributions, this leads to the notion of a weak derivative for such objects, which makes sense (but only as a distribution) even for functions that are not classically differentiable. Thus the theory of distributions can allow one to rigorously manipulate rough functions “as if” they were smooth, although one must still be careful as some operations on distributions are not well-defined, most notably the operation of multiplying two distributions together. Nevertheless one can use this theory to justify many formal computations involving derivatives, integrals, etc. (including several computations used routinely in physics) that would be difficult to formalise rigorously in a purely classical framework.

If one shrinks the space of distributions slightly, to the space of *tempered distributions* (which is formed by enlarging the class of test functions from $C^\infty_c(\mathbf{R}^d)$ to the Schwartz class $\mathcal{S}(\mathbf{R}^d)$ before taking the dual), then one obtains closure under another important operation, namely the Fourier transform. This allows one to define various Fourier-analytic operations (e.g. pseudodifferential operators) on such distributions.

Of course, at the end of the day, one is usually not all that interested in distributions in their own right, but would like to be able to use them as a tool to study more classical objects, such as smooth functions. Fortunately, one can recover facts about smooth functions from facts about the (far rougher) space of distributions in a number of ways. For instance, if one convolves a distribution with a smooth, compactly supported function, one gets back a smooth function. This is a particularly useful fact in the theory of constant-coefficient linear partial differential equations such as $Lu = f$, as it allows one to recover a smooth solution $u$ from smooth, compactly supported data $f$ by convolving $f$ with a specific distribution $G$, known as the fundamental solution of $L$. We will give some examples of this later in these notes.

It is this unusual and useful combination of both being able to pass from classical functions to generalised functions (e.g. by differentiation) and then back from generalised functions to classical functions (e.g. by convolution) that sets the theory of distributions apart from other competing theories of generalised functions, in particular allowing one to justify many formal calculations in PDE and Fourier analysis rigorously with relatively little additional effort. On the other hand, being defined by linear duality, the theory of distributions becomes somewhat less useful when one moves to more nonlinear problems, such as nonlinear PDE. However, distributions still serve an important supporting role in such problems as an “ambient space” of functions, inside of which one carves out more useful function spaces, such as Sobolev spaces, which we will discuss in the next set of notes.

** — 1. Smooth functions with compact support — **

In the rest of the notes we will work on a fixed Euclidean space $\mathbf{R}^d$. (One can also define distributions on other domains related to $\mathbf{R}^d$, such as open subsets of $\mathbf{R}^d$, or $d$-dimensional manifolds, but for simplicity we shall restrict attention to Euclidean spaces in these notes.)

A test function is any smooth, compactly supported function $f: \mathbf{R}^d \to \mathbf{C}$; the space of such functions is denoted $C^\infty_c(\mathbf{R}^d)$. (In some texts, this space is denoted $C^\infty_0(\mathbf{R}^d)$ instead.)

From analytic continuation one sees that there are no real-analytic test functions other than the zero function. Despite this negative result, test functions actually exist in abundance:

Exercise 1

- (i) Show that there exists at least one test function that is not identically zero. (Hint: it suffices to do this for $d=1$. One starting point is to use the fact that the function $f: \mathbf{R} \to \mathbf{R}$ defined by $f(x) := e^{-1/x}$ for $x > 0$ and $f(x) := 0$ otherwise is smooth, even at the origin $0$.)
- (ii) Show that if $f \in C^\infty_c(\mathbf{R}^d)$ and $g: \mathbf{R}^d \to \mathbf{C}$ is absolutely integrable and compactly supported, then the convolution $f * g$ is also in $C^\infty_c(\mathbf{R}^d)$. (Hint: first show that $f*g$ is continuously differentiable with $\nabla(f*g) = (\nabla f)*g$.)
- (iii) ($C^\infty$ Urysohn lemma) Let $K$ be a compact subset of $\mathbf{R}^d$, and let $U$ be an open neighbourhood of $K$. Show that there exists a function $f \in C^\infty_c(\mathbf{R}^d)$ supported in $U$ which equals $1$ on $K$. (Hint: use the ordinary Urysohn lemma to find a function in $C_c(\mathbf{R}^d)$ that equals $1$ on a neighbourhood of $K$ and is supported in a compact subset of $U$, then convolve this function by a suitable test function.)
- (iv) Show that $C^\infty_c(\mathbf{R}^d)$ is dense in $C_0(\mathbf{R}^d)$ (in the uniform topology), and dense in $L^p(\mathbf{R}^d)$ (with the $L^p$ topology) for all $1 \leq p < \infty$.
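As a concrete illustration of part (i), here is a minimal numerical sketch of one standard way to build a one-dimensional test function out of the function $e^{-1/x}$ from the hint. The names `smooth_step` and `bump`, and the choice of the support $[-1,1]$, are illustrative assumptions rather than anything fixed by the notes, and evaluating the function numerically is of course no substitute for proving smoothness.

```python
import math


def smooth_step(x: float) -> float:
    """The function from the hint: exp(-1/x) for x > 0, and 0 otherwise."""
    return math.exp(-1.0 / x) if x > 0 else 0.0


def bump(x: float) -> float:
    """A candidate test function on R: the product s(1 + x) * s(1 - x).

    It is positive exactly on (-1, 1) and vanishes outside that interval.
    """
    return smooth_step(1.0 + x) * smooth_step(1.0 - x)


if __name__ == "__main__":
    # Tabulate a few values: zero outside (-1, 1), positive inside.
    for x in (-1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5):
        print(f"bump({x:+.1f}) = {bump(x):.6f}")
```

Rescaling and translating `bump` gives test functions supported in any prescribed interval, which is the sort of building block the remaining parts of the exercise rely on.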

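The space $C^\infty_c(\mathbf{R}^d)$ is clearly a vector space. Now we place a (very strong!) topology on it. We first observe that $C^\infty_c(\mathbf{R}^d) = \bigcup_K C^\infty_c(K)$, where $K$ ranges over all compact subsets of $\mathbf{R}^d$ and $C^\infty_c(K)$ consists of those functions $f \in C^\infty_c(\mathbf{R}^d)$ which are supported in $K$. Each $C^\infty_c(K)$ will be given a topology (called the *smooth topology*) generated by the norms

$$\|f\|_{C^k} := \sup_{x \in \mathbf{R}^d} \sum_{j=0}^k |\nabla^j f(x)|$$

for $k = 0, 1, \ldots$, where we view $\nabla^j f(x)$ as a $d^j$-dimensional vector (or, if one wishes, a $d$-dimensional rank $j$ tensor); thus a sequence $f_n \in C^\infty_c(K)$ converges to a limit $f \in C^\infty_c(K)$ if and only if $\nabla^j f_n$ converges uniformly to $\nabla^j f$ for all $j = 0, 1, \ldots$. (This gives $C^\infty_c(K)$ the structure of a Fréchet space, though we will not use this fact here.)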

We make the trivial remark that if $K \subset K'$ are compact sets, then $C^\infty_c(K)$ is a subspace of $C^\infty_c(K')$, and the topology on the former space is the restriction of the topology of the latter space. Because of this, we are able to give $C^\infty_c(\mathbf{R}^d)$ a (very strong) topology as follows. Call a seminorm $\|\cdot\|$ on $C^\infty_c(\mathbf{R}^d)$ *good* if it is a continuous function on $C^\infty_c(K)$ for each compact $K$ (or equivalently, the ball $\{ f \in C^\infty_c(K): \|f\| < 1 \}$ is open in $C^\infty_c(K)$ for each compact $K$). We then give $C^\infty_c(\mathbf{R}^d)$ the topology defined by all good seminorms. Clearly, this makes $C^\infty_c(\mathbf{R}^d)$ a (locally convex) topological vector space.

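Exercise 2 Let $f_n$ be a sequence in $C^\infty_c(\mathbf{R}^d)$, and let $f$ be another function in $C^\infty_c(\mathbf{R}^d)$. Show that $f_n$ converges in the topology of $C^\infty_c(\mathbf{R}^d)$ to $f$ if and only if there exists a compact set $K$ such that $f_n, f$ are all supported in $K$, and $f_n$ converges to $f$ in the smooth topology of $C^\infty_c(K)$.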

Exercise 3

- (i) Show that the topology of $C^\infty_c(K)$ is first countable for every compact $K$.
- (ii) Show that the topology of $C^\infty_c(\mathbf{R}^d)$ is *not* first countable. (Hint: given any countable sequence of open neighbourhoods of $0$, build a new open neighbourhood that does not contain any of the previous ones, using the $\sigma$-compact nature of $\mathbf{R}^d$.)
- (iii) As an additional challenge, construct a set $E \subset C^\infty_c(\mathbf{R}^d)$ such that $0$ is an adherent point of $E$, but $0$ is not the limit of any sequence in $E$.

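There are plenty of continuous operations on $C^\infty_c(\mathbf{R}^d)$:

Exercise 4

- (i) Let $K$ be a compact set. Show that a linear map $T: C^\infty_c(K) \to X$ into a normed vector space $X$ is continuous if and only if there exist $k \geq 0$ and $C > 0$ such that $\|Tf\|_X \leq C \|f\|_{C^k}$ for all $f \in C^\infty_c(K)$.
- (ii) Let $K, K'$ be compact sets. Show that a linear map $T: C^\infty_c(K) \to C^\infty_c(K')$ is continuous if and only if for every $k \geq 0$ there exist $k' \geq 0$ and a constant $C_k > 0$ such that $\|Tf\|_{C^k} \leq C_k \|f\|_{C^{k'}}$ for all $f \in C^\infty_c(K)$.
- (iii) Show that a linear map $T: C^\infty_c(\mathbf{R}^d) \to X$ from the space of test functions into a topological vector space $X$ generated by some family of seminorms (i.e., a locally convex topological vector space) is continuous if and only if it is sequentially continuous (i.e. whenever $f_n$ converges to $f$ in $C^\infty_c(\mathbf{R}^d)$, $Tf_n$ converges to $Tf$ in $X$), and if and only if $T: C^\infty_c(K) \to X$ is continuous for each compact $K \subset \mathbf{R}^d$. Thus while first countability fails for $C^\infty_c(\mathbf{R}^d)$, we have a serviceable substitute for this property.
- (iv) Show that the inclusion map from $C^\infty_c(\mathbf{R}^d)$ to $L^p(\mathbf{R}^d)$ is continuous for every $1 \leq p \leq \infty$.
- (v) Show that a map $T: C^\infty_c(\mathbf{R}^d) \to C^\infty_c(\mathbf{R}^d)$ is continuous if and only if for every compact set $K \subset \mathbf{R}^d$ there exists a compact set $K'$ such that $T$ maps $C^\infty_c(K)$ continuously to $C^\infty_c(K')$.
- (vi) Show that every linear differential operator with smooth coefficients is a continuous operation on $C^\infty_c(\mathbf{R}^d)$.
- (vii) Show that convolution with any absolutely integrable, compactly supported function is a continuous operation on $C^\infty_c(\mathbf{R}^d)$.
- (viii) Show that the product operation $f, g \mapsto fg$ is continuous from $C^\infty_c(\mathbf{R}^d) \times C^\infty_c(\mathbf{R}^d)$ to $C^\infty_c(\mathbf{R}^d)$.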

A sequence $\phi_n \in C_c(\mathbf{R}^d)$ of continuous, compactly supported functions is said to be an approximation to the identity if the $\phi_n$ are non-negative, have total mass $\int_{\mathbf{R}^d} \phi_n$ equal to $1$, and have supports that shrink to the origin, thus for any fixed $r > 0$, $\phi_n$ is supported on the ball $B(0,r)$ for $n$ sufficiently large. One can generate such a sequence by starting with a single non-negative continuous compactly supported function $\phi$ of total mass $1$, and then setting $\phi_n(x) := n^d \phi(nx)$; many other constructions are possible also.
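The rescaling construction can be checked numerically in one dimension. The sketch below is only an illustration under stated assumptions: the base function `phi` is taken to be a "tent" function (an arbitrary choice of a non-negative continuous function of total mass $1$), `integrate` is a plain midpoint rule, and `pair_with` is a hypothetical helper approximating $\int f \phi_n$.

```python
import math


def phi(x: float) -> float:
    """A non-negative continuous 'tent' function supported in [-1, 1], total mass 1."""
    return max(0.0, 1.0 - abs(x))


def phi_n(n: int, x: float) -> float:
    """The rescaling phi_n(x) = n * phi(n x) in dimension d = 1 (support [-1/n, 1/n])."""
    return n * phi(n * x)


def integrate(g, a: float, b: float, steps: int = 20000) -> float:
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / steps
    return h * sum(g(a + (i + 0.5) * h) for i in range(steps))


def pair_with(f, n: int) -> float:
    """Approximates the integral of f * phi_n over the support of phi_n."""
    return integrate(lambda x: f(x) * phi_n(n, x), -1.0 / n, 1.0 / n)


if __name__ == "__main__":
    f = math.cos  # any continuous function; here f(0) = 1
    for n in (1, 4, 16, 64):
        mass = integrate(lambda x: phi_n(n, x), -1.0 / n, 1.0 / n)
        print(n, round(mass, 6), round(pair_with(f, n), 6))
```

The total mass stays at $1$ for every $n$, while the integral against a continuous $f$ approaches $f(0)$ as the supports shrink.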

One has the following useful fact:

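Exercise 5 Let $\phi_n$ be a sequence of approximations to the identity.

- (i) If $f \in C(\mathbf{R}^d)$ is continuous, show that $f * \phi_n$ converges uniformly on compact sets to $f$.
- (ii) If $f \in L^p(\mathbf{R}^d)$ for some $1 \leq p < \infty$, show that $f * \phi_n$ converges in $L^p(\mathbf{R}^d)$ to $f$. (Hint: use (i), the density of $C_c(\mathbf{R}^d)$ in $L^p(\mathbf{R}^d)$, and Young’s inequality.)
- (iii) If $f \in C^\infty_c(\mathbf{R}^d)$, show that $f * \phi_n$ converges in $C^\infty_c(\mathbf{R}^d)$ to $f$. (Hint: use the identity $\nabla(f * \phi_n) = (\nabla f) * \phi_n$, cf. Exercise 1(ii).)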

Exercise 6 Show that $C^\infty_c(\mathbf{R}^d)$ is separable. (Hint: it suffices to show that $C^\infty_c(K)$ is separable for each compact $K$. There are several ways to accomplish this. One is to begin with the Stone-Weierstrass theorem, which will give a countable set which is dense in the uniform topology, then use the fundamental theorem of calculus to strengthen the topology. Another is to use Exercise 5 and then discretise the convolution. Another is to embed $K$ into a torus and use Fourier series, noting that the Fourier coefficients $\hat f(n)$ of a smooth function $f$ on the torus decay faster than any power of $|n|$.)

** — 2. Distributions — **

Now we can define the concept of a distribution.

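Definition 1 (Distribution) A *distribution* on $\mathbf{R}^d$ is a continuous linear functional $\lambda: f \mapsto \langle f, \lambda \rangle$ from $C^\infty_c(\mathbf{R}^d)$ to $\mathbf{C}$. The space of such distributions is denoted $C^\infty_c(\mathbf{R}^d)^*$, and is given the weak-* topology. In particular, a sequence of distributions $\lambda_n$ converges (in the sense of distributions) to a limit $\lambda$ if one has $\langle f, \lambda_n \rangle \to \langle f, \lambda \rangle$ for all $f \in C^\infty_c(\mathbf{R}^d)$.

A technical point: we endow the space $C^\infty_c(\mathbf{R}^d)^*$ with the *conjugate* complex structure. Thus, if $\lambda \in C^\infty_c(\mathbf{R}^d)^*$, and $c$ is a complex number, then $c\lambda$ is the distribution that maps a test function $f$ to $\overline{c} \langle f, \lambda \rangle$ rather than $c \langle f, \lambda \rangle$; thus $\langle f, c\lambda \rangle = \overline{c} \langle f, \lambda \rangle$. This is to keep the analogy between the evaluation of a distribution against a function, and the usual Hermitian inner product $\langle f, g \rangle = \int_{\mathbf{R}^d} f \overline{g}$ of two test functions.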

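From Exercise 4, we see that a linear functional $\lambda: C^\infty_c(\mathbf{R}^d) \to \mathbf{C}$ is a distribution if and only if, for every compact set $K \subset \mathbf{R}^d$, there exist $k \geq 0$ and $C > 0$ such that

$$|\langle f, \lambda \rangle| \leq C \|f\|_{C^k}$$

for all $f \in C^\infty_c(K)$.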

Exercise 7 Show that $C^\infty_c(\mathbf{R}^d)^*$ is a Hausdorff topological vector space.

We note two basic examples of distributions:

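- Any locally integrable function $g \in L^1_{\mathrm{loc}}(\mathbf{R}^d)$ can be viewed as a distribution, by writing $\langle f, g \rangle := \int_{\mathbf{R}^d} f(x) \overline{g(x)}\ dx$ for all test functions $f$.
- Any complex Radon measure $\mu$ can be viewed as a distribution, by writing $\langle f, \mu \rangle := \int_{\mathbf{R}^d} f(x)\ d\overline{\mu}(x)$, where $\overline{\mu}$ is the complex conjugate of $\mu$ (thus $\overline{\mu}(E) := \overline{\mu(E)}$). (Note that this example generalises the preceding one, which corresponds to the case when $\mu$ is absolutely continuous with respect to Lebesgue measure.) Thus, for instance, the Dirac measure $\delta$ at the origin is a distribution, with $\langle f, \delta \rangle = f(0)$ for all test functions $f$.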

Exercise 8 Show that the above identifications of locally integrable functions or complex Radon measures with distributions are injective. (Hint: use Exercise 1(iv).)

From the above exercise, we may view locally integrable functions and locally finite measures as a special type of distribution. In particular, $C^\infty_c(\mathbf{R}^d)$ and $L^p(\mathbf{R}^d)$ are now contained in $C^\infty_c(\mathbf{R}^d)^*$ for all $1 \leq p \leq \infty$.

Exercise 9 Show that if a sequence of locally integrable functions converges in $L^1_{\mathrm{loc}}$ to a limit, then it also converges in the sense of distributions; similarly, if a sequence of complex Radon measures converges in the vague topology to a limit, then it also converges in the sense of distributions.

Thus we see that convergence in the sense of distributions is among the weakest of the notions of convergence used in analysis; however, from the Hausdorff property, distributional limits are still *unique*.

Exercise 10 If $\phi_n$ is a sequence of approximations to the identity, show that $\phi_n$ converges in the sense of distributions to the Dirac distribution $\delta$.

More exotic examples of distributions can be given:

Exercise 11 (Derivative of the delta function) Let $d=1$. Show that the functional $\delta': f \mapsto -f'(0)$ for all test functions $f$ is a distribution which does not arise from either a locally integrable function or a Radon measure. (Note how it is important here that $f$ is smooth, and in particular differentiable, and not merely continuous.) The presence of the minus sign will be explained shortly.

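Exercise 12 (Principal value of $1/x$) Let $d=1$. Show that the functional $\operatorname{p.v.} \frac{1}{x}$ defined by the formula

$$\langle f, \operatorname{p.v.} \frac{1}{x} \rangle := \lim_{\varepsilon \to 0} \int_{|x| > \varepsilon} \frac{f(x)}{x}\ dx$$

is a distribution which does not arise from either a locally integrable function or a Radon measure. (Note that $1/x$ is not a locally integrable function!)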
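To see why the symmetric truncation in the principal value is harmless, one can compare the truncated integrals $\int_{\varepsilon < |x|} f(x)/x\ dx$ with the absolutely convergent integral $\int_0^\infty (f(x) - f(-x))/x\ dx$, which they equal up to an error of size $O(\varepsilon)$. The sketch below does this numerically for one real-valued test function (so complex conjugation plays no role). The particular function `f`, the cutoff `radius`, and the midpoint-rule helper `integrate` are all illustrative assumptions.

```python
import math


def bump(x: float) -> float:
    """Smooth function supported in [-1, 1], built from exp(-1/x)."""
    if abs(x) >= 1.0:
        return 0.0
    return math.exp(-1.0 / (1.0 + x)) * math.exp(-1.0 / (1.0 - x))


def f(x: float) -> float:
    """A real-valued test function that is not even, supported in [-0.7, 1.3]."""
    return bump(x - 0.3)


def integrate(g, a: float, b: float, steps: int = 20000) -> float:
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / steps
    return h * sum(g(a + (i + 0.5) * h) for i in range(steps))


def truncated(eps: float, radius: float = 2.0) -> float:
    """Integral of f(x)/x over eps < |x| < radius (radius covers the support of f)."""
    g = lambda x: f(x) / x
    return integrate(g, -radius, -eps) + integrate(g, eps, radius)


def symmetrised(radius: float = 2.0) -> float:
    """Integral over (0, radius) of (f(x) - f(-x)) / x; the integrand is bounded near 0."""
    return integrate(lambda x: (f(x) - f(-x)) / x, 0.0, radius)


if __name__ == "__main__":
    limit = symmetrised()
    for eps in (1e-1, 1e-2, 1e-3):
        print(eps, truncated(eps), abs(truncated(eps) - limit))
```

The printed differences shrink roughly linearly in $\varepsilon$, consistent with the limit defining the principal value existing for this $f$.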

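Exercise 13 (Distributional interpretations of $1/|x|$) Let $d=1$. For any $r > 0$, show that the functional $\lambda_r$ defined by the formula

$$\langle f, \lambda_r \rangle := \int_{|x| < r} \frac{f(x) - f(0)}{|x|}\ dx + \int_{|x| \geq r} \frac{f(x)}{|x|}\ dx$$

is a distribution that does not arise from either a locally integrable function or a Radon measure. Note that any two such functionals $\lambda_r, \lambda_{r'}$ differ by a constant multiple of the Dirac delta distribution.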

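Exercise 14 A distribution $\lambda$ is said to be *real* if $\langle f, \lambda \rangle$ is real for every real-valued test function $f$. Show that every distribution $\lambda$ can be uniquely expressed as $\operatorname{Re}(\lambda) + i \operatorname{Im}(\lambda)$ for some real distributions $\operatorname{Re}(\lambda), \operatorname{Im}(\lambda)$.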

Exercise 15 A distribution $\lambda$ is said to be *non-negative* if $\langle f, \lambda \rangle$ is non-negative for every non-negative test function $f$. Show that a distribution is non-negative if and only if it is a non-negative Radon measure. (Hint: use the Riesz representation theorem and Exercise 1(iv).) Note that this implies that the analogue of the Jordan decomposition fails for distributions; any distribution which is not a Radon measure will not be the difference of non-negative distributions.

We will now extend various operations on locally integrable functions or Radon measures to distributions by arguing by analogy. (Shortly we will give a more formal approach, based on density.)

We begin with the operation of multiplying a distribution $\lambda$ by a smooth function $h: \mathbf{R}^d \to \mathbf{C}$. Observe that

$$\langle f, gh \rangle = \langle f \overline{h}, g \rangle$$

for all test functions $f, g, h$. Inspired by this formula, we define the product $\lambda h = h \lambda$ of a distribution with a smooth function by setting

$$\langle f, \lambda h \rangle := \langle f \overline{h}, \lambda \rangle$$

for all test functions $f$. It is easy to see (e.g. using Exercise 4(vi)) that this defines a distribution $\lambda h$, and that this operation is compatible with existing definitions of products between a locally integrable function (or Radon measure) with a smooth function. It is important that $h$ is smooth (and not merely, say, continuous) because one needs the product of a test function $f$ with $\overline{h}$ to still be a test function.

Exercise 16 Let $d=1$. Establish the identity

$$\delta f = f(0) \delta$$

for any smooth function $f$. In particular,

$$\delta x = 0$$

where we abuse notation slightly and write $x$ for the identity function $x \mapsto x$. Conversely, if $\lambda$ is a distribution such that

$$\lambda x = 0,$$

show that $\lambda$ is a constant multiple of $\delta$. (Hint: Use the identity $f(x) = f(0) + x \int_0^1 f'(tx)\ dt$ to write $f$ as the sum of $f(0) \psi$ and $x$ times a test function for any test function $f$, where $\psi$ is a fixed test function equalling $1$ at the origin.)

Remark 1 Even though distributions are not, strictly speaking, functions, it is often useful heuristically to view them as such, thus for instance one might write a distributional identity such as $\delta x = 0$ suggestively as $\delta(x) x = 0$. Another useful (and rigorous) way to view such identities is to write distributions such as $\delta$ as a limit of approximations to the identity $\phi_n$, and show that the relevant identity becomes true in the limit; thus, for instance, to show that $\delta x = 0$, one can show that $\phi_n x \to 0$ in the sense of distributions as $n \to \infty$. (In fact, $\phi_n x$ converges to zero in the $L^1$ norm.)

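Exercise 17 Let $d=1$. With the distribution $\operatorname{p.v.} \frac{1}{x}$ from Exercise 12, show that $(\operatorname{p.v.} \frac{1}{x}) x$ is equal to $1$. With the distributions $\lambda_r$ from Exercise 13, show that $\lambda_r x = \operatorname{sgn}$, where $\operatorname{sgn}$ is the signum function.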

A distribution $\lambda$ is said to be *supported* in a closed set $K$ if $\langle f, \lambda \rangle = 0$ for all $f$ that vanish on an open neighbourhood of $K$. The intersection of all $K$ that $\lambda$ is supported on is denoted $\operatorname{supp}(\lambda)$ and is referred to as the *support* of the distribution; this is the smallest closed set that $\lambda$ is supported on. Thus, for instance, the Dirac delta function is supported on $\{0\}$, as are all derivatives of that function. (Note here that it is important that $f$ vanish on a *neighbourhood* of $K$, rather than merely vanishing on $K$ itself; for instance, in one dimension, there certainly exist test functions $f$ that vanish at $0$ but nevertheless have a non-zero inner product with $\delta'$.)

Exercise 18 Show that every distribution is the limit of a sequence of compactly supported distributions (using the weak-* topology, of course). (Hint: Approximate a distribution $\lambda$ by the truncated distributions $\lambda \eta_n$ for some smooth cutoff functions $\eta_n$ constructed using Exercise 1(iii).)

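In a similar spirit, we can convolve a distribution $\lambda$ by an absolutely integrable, compactly supported function $h \in L^1(\mathbf{R}^d)$. From Fubini’s theorem we observe the formula

$$\langle f * g, h \rangle = \langle g, \tilde f * h \rangle$$

for all test functions $f, g, h$, where $\tilde f(x) := \overline{f(-x)}$. Inspired by this formula, we define the convolution $\lambda * h = h * \lambda$ of a distribution with an absolutely integrable, compactly supported function by the formula

$$\langle f, \lambda * h \rangle := \langle f * \tilde h, \lambda \rangle \ \ \ \ \ (2)$$

for all test functions $f$. This gives a well-defined distribution $\lambda * h$ (thanks to Exercise 4(vii)) which is compatible with previous notions of convolution.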

Example 1 One has $\delta * f = f * \delta = f$ for all test functions $f$. In one dimension, we have $\delta' * f = f'$ (why?), thus differentiation can be viewed as convolution with a distribution.

A remarkable fact about convolutions $f * g$ of two functions is that they inherit the regularity of the *smoother* of the two factors (in contrast to products $fg$, which tend to inherit the regularity of the *rougher* of the two factors). (This disparity can also be seen by contrasting the identity $\nabla(f*g) = (\nabla f)*g = f*(\nabla g)$ with the identity $\nabla(fg) = (\nabla f) g + f (\nabla g)$.) In the case of convolving distributions with test functions, this phenomenon is manifested as follows:

Lemma 2 Let $\lambda$ be a distribution, and let $h$ be a test function. Then $\lambda * h$ is equal to a smooth function.

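*Proof:* If $\lambda$ were itself a smooth function, then one could easily verify the identity

$$\lambda * h(x) = \overline{\langle h_x, \lambda \rangle} \ \ \ \ \ (3)$$

where $h_x(y) := \overline{h(x-y)}$. As $h$ is a test function, it is easy to see that $h_x$ varies smoothly in $x$ in any $C^k$ norm (indeed, it has Taylor expansions to any order in such norms) and so the right-hand side is a smooth function of $x$. So it suffices to verify the identity (3). As distributions are defined against test functions $f$, it suffices to show that

$$\langle f, \lambda * h \rangle = \int_{\mathbf{R}^d} f(x) \langle h_x, \lambda \rangle\ dx.$$

On the other hand, we have from (2) that

$$\langle f, \lambda * h \rangle = \langle f * \tilde h, \lambda \rangle = \langle \int_{\mathbf{R}^d} f(x) h_x\ dx, \lambda \rangle.$$

So the only issue is to justify the interchange of integral and inner product:

$$\int_{\mathbf{R}^d} f(x) \langle h_x, \lambda \rangle\ dx = \langle \int_{\mathbf{R}^d} f(x) h_x\ dx, \lambda \rangle.$$

Certainly, (from the compact support of $f$) any Riemann sum can be interchanged with the inner product:

$$\sum_{x_n} f(x_n) \langle h_{x_n}, \lambda \rangle \Delta x = \langle \sum_{x_n} f(x_n) h_{x_n} \Delta x, \lambda \rangle,$$

where $x_n$ ranges over some lattice and $\Delta x$ is the volume of the fundamental domain. A modification of the argument that shows convergence of the Riemann integral for smooth, compactly supported functions then works here and allows one to take limits; we omit the details.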

This has an important corollary:

Lemma 3 Every distribution is the limit of a sequence of test functions. In particular, $C^\infty_c(\mathbf{R}^d)$ is dense in $C^\infty_c(\mathbf{R}^d)^*$.

*Proof:* By Exercise 18, it suffices to verify this for compactly supported distributions $\lambda$. We let $\phi_n \in C^\infty_c(\mathbf{R}^d)$ be a sequence of approximations to the identity. By Exercise 5(iii) and (2), we see that $\lambda * \phi_n$ converges in the sense of distributions to $\lambda$. By Lemma 2, $\lambda * \phi_n$ is a smooth function; as $\lambda$ and $\phi_n$ are both compactly supported, $\lambda * \phi_n$ is compactly supported also. The claim follows.

Because of this lemma, we can formalise the previous procedure of extending operations that were previously defined on test functions, to distributions, provided that these operations were continuous in distributional topologies. However, we shall continue to proceed by analogy as it requires fewer verifications in order to motivate the definition.

Exercise 19 Another consequence of Lemma 2 is that it allows one to extend the definition (2) of convolution to the case when $h$ is not an integrable function of compact support, but is instead merely a distribution of compact support. Adopting this convention, show that convolution of distributions of compact support is both commutative and associative. (Hint: this can either be done directly, or by carefully taking limits using Lemma 3.)

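The next operation we will introduce is that of differentiation. An integration by parts reveals the identity

$$\langle f, \frac{\partial}{\partial x_j} g \rangle = - \langle \frac{\partial}{\partial x_j} f, g \rangle$$

for any test functions $f, g$ and $j = 1, \ldots, d$. Inspired by this, we define the (distributional) partial derivative $\frac{\partial}{\partial x_j} \lambda$ of a distribution $\lambda$ by the formula

$$\langle f, \frac{\partial}{\partial x_j} \lambda \rangle := - \langle \frac{\partial}{\partial x_j} f, \lambda \rangle.$$

This can be verified to still be a distribution, and by Exercise 4(vi), the operation of differentiation is a continuous one on distributions. More generally, given any linear differential operator $P$ with smooth coefficients, one can define $P\lambda$ for a distribution $\lambda$ by the formula

$$\langle f, P \lambda \rangle := \langle P^* f, \lambda \rangle$$

where $P^*$ is the adjoint differential operator of $P$, which can be defined implicitly by the formula

$$\langle f, P g \rangle = \langle P^* f, g \rangle$$

for test functions $f, g$, or more explicitly by replacing all coefficients with complex conjugates, replacing each partial derivative $\frac{\partial}{\partial x_j}$ with its negative, and reversing the order of operations (thus for instance the adjoint of the first-order operator $a(x) \frac{d}{dx}: f \mapsto a f'$ would be $-\frac{d}{dx} \overline{a(x)}: f \mapsto -(\overline{a} f)'$).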

Example 2 The distribution $\delta'$ defined in Exercise 11 is the derivative $\frac{d}{dx} \delta$ of $\delta$, as defined by the above formula.

Many of the identities one is used to in classical calculus extend to the distributional setting (as one would already expect from Lemma 3). For instance:

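Exercise 20 (Product rule) Let $\lambda$ be a distribution, and let $f$ be smooth. Show that

$$\frac{\partial}{\partial x_j} (\lambda f) = \left(\frac{\partial}{\partial x_j} \lambda\right) f + \lambda \left(\frac{\partial}{\partial x_j} f\right)$$

for all $j = 1, \ldots, d$.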

Exercise 21 Let $d=1$. Show that $\delta' x = -\delta$ in three different ways:

- Directly from the definitions;
- Using the product rule;
- Writing $\delta$ as the limit of approximations $\phi_n$ to the identity.

Exercise 22 Let $d=1$.

- (i) Show that if $\lambda$ is a distribution and $n \geq 1$ is an integer, then $\lambda x^n = 0$ if and only if $\lambda$ is a linear combination of $\delta$ and its first $n-1$ derivatives $\delta', \delta'', \ldots, \delta^{(n-1)}$.
- (ii) Show that a distribution $\lambda$ is supported on $\{0\}$ if and only if it is a linear combination of $\delta$ and finitely many of its derivatives.
- (iii) Generalise (ii) to the case of general dimension $d$ (where of course one now uses partial derivatives instead of derivatives).

Exercise 23 Let $d=1$.

- Show that the derivative of the Heaviside function $1_{[0,+\infty)}$ is equal to $\delta$.
- Show that the derivative of the signum function $\operatorname{sgn}(x)$ is equal to $2\delta$.
- Show that the derivative of the locally integrable function $\log |x|$ is equal to $\operatorname{p.v.} \frac{1}{x}$.
- Show that the derivative of the locally integrable function $\log|x| \operatorname{sgn}(x)$ is equal to the distribution $\lambda_1$ from Exercise 13.
- Show that the derivative of the locally integrable function $|x|$ is the locally integrable function $\operatorname{sgn}(x)$.

If a locally integrable function has a distributional derivative which is also a locally integrable function, we refer to the latter as the weak derivative of the former. Thus, for instance, the weak derivative of $|x|$ is $\operatorname{sgn}(x)$ (as one would expect), but $\operatorname{sgn}(x)$ does not have a weak derivative (despite being (classically) differentiable almost everywhere), because the distributional derivative $2\delta$ of this function is not itself a locally integrable function. Thus weak derivatives differ in some respects from their classical counterparts, though of course the two concepts agree for smooth functions.
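The defining identity of the distributional derivative, $\langle f, \lambda' \rangle = -\langle f', \lambda \rangle$, can be sanity-checked numerically for a single real-valued test function. The sketch below compares $-\int f'(x) |x|\ dx$ with $\int f(x) \operatorname{sgn}(x)\ dx$, and $-\int f'(x) 1_{[0,+\infty)}(x)\ dx$ with $f(0)$. The test function `f`, its hand-computed derivative `f_prime`, and the midpoint rule `integrate` are illustrative assumptions; a numerical check on one function is of course not a proof.

```python
import math


def bump(x: float) -> float:
    """exp(-2 / (1 - x^2)) on (-1, 1), zero elsewhere (a smooth bump)."""
    if abs(x) >= 1.0:
        return 0.0
    return math.exp(-2.0 / (1.0 - x * x))


def bump_prime(x: float) -> float:
    """Classical derivative of bump, computed by the chain rule."""
    if abs(x) >= 1.0:
        return 0.0
    return bump(x) * (-4.0 * x) / (1.0 - x * x) ** 2


SHIFT = 0.3  # shift so that f is not even and f(0) is non-zero


def f(x: float) -> float:
    return bump(x - SHIFT)


def f_prime(x: float) -> float:
    return bump_prime(x - SHIFT)


def integrate(g, a: float, b: float, steps: int = 20000) -> float:
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / steps
    return h * sum(g(a + (i + 0.5) * h) for i in range(steps))


def sgn(x: float) -> float:
    return float((x > 0) - (x < 0))


def check_abs():
    """Returns (-integral of f' |x|, integral of f sgn); these should agree."""
    lhs = -integrate(lambda x: f_prime(x) * abs(x), -2.0, 2.0)
    rhs = integrate(lambda x: f(x) * sgn(x), -2.0, 2.0)
    return lhs, rhs


def check_heaviside():
    """Returns (-integral of f' over [0, 2], f(0)); these should agree."""
    lhs = -integrate(f_prime, 0.0, 2.0)
    return lhs, f(0.0)


if __name__ == "__main__":
    print(check_abs())
    print(check_heaviside())
```

The first pair agreeing reflects that the weak derivative of $|x|$ is $\operatorname{sgn}(x)$; the second reflects that the derivative of the Heaviside function pairs with $f$ to give $f(0)$, which is not integration against any locally integrable function.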

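Exercise 24 Let $d \geq 1$. Show that for any $1 \leq i, j \leq d$, and any distribution $\lambda$, we have $\frac{\partial}{\partial x_i} \frac{\partial}{\partial x_j} \lambda = \frac{\partial}{\partial x_j} \frac{\partial}{\partial x_i} \lambda$, thus weak derivatives commute with each other. (This is in contrast to classical derivatives, which can fail to commute for non-smooth functions; for instance, $\frac{\partial}{\partial x} \frac{\partial}{\partial y} \frac{xy^3}{x^2+y^2} \neq \frac{\partial}{\partial y} \frac{\partial}{\partial x} \frac{xy^3}{x^2+y^2}$ at the origin $(x,y) = 0$, despite both derivatives being defined. More generally, weak derivatives tend to be less pathological than classical derivatives, but of course the downside is that weak derivatives do not always have a classical interpretation as a limit of a Newton quotient.)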

Exercise 25 Let $d=1$, and let $k \geq 0$ be an integer. Let us say that a compactly supported distribution $\lambda$ has *order at most $k$* if the functional $f \mapsto \langle f, \lambda \rangle$ is continuous in the $C^k$ norm. Thus, for instance, $\delta$ has order at most $0$, and $\delta'$ has order at most $1$, and every compactly supported distribution is of order at most $k$ for some sufficiently large $k$.

- Show that if $\lambda$ is a compactly supported distribution of order at most $0$, then it is a compactly supported Radon measure.
- Show that if $\lambda$ is a compactly supported distribution of order at most $k$, then $\lambda'$ has order at most $k+1$.
- Conversely, if $\lambda$ is a compactly supported distribution of order $k+1$, then we can write $\lambda = \rho' + \nu$ for some compactly supported distributions $\rho, \nu$ of order $k$. (Hint: one has to “dualise” the fundamental theorem of calculus, and then apply smooth cutoffs to recover compact support.)
- Show that every compactly supported distribution can be expressed as a finite linear combination of (distributional) derivatives of compactly supported Radon measures.
- Show that every compactly supported distribution can be expressed as a finite linear combination of (distributional) derivatives of functions in $L^p(\mathbf{R})$, for any fixed $1 \leq p \leq \infty$.

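We now set out some other operations on distributions. If we define the translation $\tau_x f$ of a test function $f$ by a shift $x \in \mathbf{R}^d$ by the formula $\tau_x f(y) := f(y-x)$, then we have

$$\langle \tau_x f, g \rangle = \langle f, \tau_{-x} g \rangle$$

for all test functions $f, g$, so it is natural to define the translation $\tau_x \lambda$ of a distribution $\lambda$ by the formula

$$\langle f, \tau_x \lambda \rangle := \langle \tau_{-x} f, \lambda \rangle.$$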
Next, we consider linear changes of variable.

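Exercise 26 (Linear changes of variable) Let $d \geq 1$, and let $L: \mathbf{R}^d \to \mathbf{R}^d$ be an invertible linear transformation. Given a distribution $\lambda$, let $\lambda \circ L$ be the distribution given by the formula

$$\langle f, \lambda \circ L \rangle := \frac{1}{|\det L|} \langle f \circ L^{-1}, \lambda \rangle$$

for all test functions $f$. (How would one motivate this formula?)

- Show that $\delta \circ L = \frac{1}{|\det L|} \delta$ for all invertible linear transformations $L$.
- If $d=1$, show that $\operatorname{p.v.} \frac{1}{x} \circ L = \frac{1}{\det L} \operatorname{p.v.} \frac{1}{x}$ for all invertible linear transformations $L$.
- Conversely, if $d=1$ and $\lambda$ is a distribution such that $\lambda \circ L = \frac{1}{\det L} \lambda$ for all invertible linear transformations $L$, show that $\lambda$ is a constant multiple of $\operatorname{p.v.} \frac{1}{x}$. (Hint: first show that there exists a constant $c$ such that $\langle f, \lambda \rangle = c \int_0^\infty \frac{f(x)}{x}\ dx$ whenever $f$ is a bump function supported in $(0,+\infty)$. To show this, approximate $f$ by the function $x \mapsto \int_{-\infty}^\infty f(e^t x) \psi_n(t)\ dt$ for $\psi_n$ an approximation to the identity.)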

Remark 2 One can also compose distributions with diffeomorphisms. However, things become much more delicate if the map one is composing with contains stationary points; for instance, in one dimension, one cannot meaningfully make sense of $\delta(x^2)$ (the composition of the Dirac delta distribution with $x \mapsto x^2$); this can be seen by first noting that for an approximation $\phi_n$ to the identity, $\phi_n(x^2)$ does not converge to a limit in the distributional sense.

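Exercise 27 (Tensor product of distributions) Let $d, d' \geq 1$ be integers. If $\lambda \in C^\infty_c(\mathbf{R}^d)^*$ and $\rho \in C^\infty_c(\mathbf{R}^{d'})^*$ are distributions, show that there is a unique distribution $\lambda \otimes \rho \in C^\infty_c(\mathbf{R}^{d+d'})^*$ with the property that

$$\langle f \otimes g, \lambda \otimes \rho \rangle = \langle f, \lambda \rangle \langle g, \rho \rangle \ \ \ \ \ (4)$$

for all test functions $f \in C^\infty_c(\mathbf{R}^d)$, $g \in C^\infty_c(\mathbf{R}^{d'})$, where $f \otimes g \in C^\infty_c(\mathbf{R}^{d+d'})$ is the tensor product $f \otimes g(x,x') := f(x) g(x')$ of $f$ and $g$. (Hint: like many other constructions of tensor products, this is rather intricate. One way is to start by fixing two cutoff functions $\psi, \psi'$ on $\mathbf{R}^d, \mathbf{R}^{d'}$ respectively, and define $\lambda \otimes \rho$ on modulated test functions $e^{2\pi i (\xi \cdot x + \xi' \cdot x')} \psi(x) \psi'(x')$ for various frequencies $\xi, \xi'$, and then use Fourier series to define $\lambda \otimes \rho$ on $F(x,x') \psi(x) \psi'(x')$ for smooth $F$. Then show that these definitions of $\lambda \otimes \rho$ are compatible for different choices of $\psi, \psi'$ and can be glued together to form a distribution; finally, go back and verify (4).)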

We close this section with one caveat. Despite the many operations that one can perform on distributions, there are two types of operations which cannot, in general, be defined on arbitrary distributions (at least while remaining in the class of distributions):

- Nonlinear operations (e.g. taking the absolute value of a distribution); or
- Multiplying a distribution by anything rougher than a smooth function.

Thus, for instance, there is no meaningful way to interpret the square $\delta^2$ of the Dirac delta function as a distribution. This is perhaps easiest to see using an approximation $\phi_n$ to the identity: $\phi_n$ converges to $\delta$ in the sense of distributions, but $\phi_n^2$ does not converge to anything (the integral against a test function that does not vanish at the origin will go to infinity as $n \to \infty$). For similar reasons, one cannot meaningfully interpret the absolute value $|\delta'|$ of the derivative of the delta function. (One also cannot multiply $\delta$ by $\operatorname{sgn}$ – why?)
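The divergence of the squared approximations can be observed numerically. In the sketch below, the approximation to the identity is built from a "tent" function (an illustrative assumption, as are the helper names), and $f = \cos$ stands in for a test function that does not vanish at the origin; only the values of $f$ near $0$ matter, since each $\phi_n$ is supported in $[-1/n, 1/n]$.

```python
import math


def phi(x: float) -> float:
    """Non-negative continuous 'tent' function supported in [-1, 1], total mass 1."""
    return max(0.0, 1.0 - abs(x))


def phi_n(n: int, x: float) -> float:
    """One-dimensional rescaling phi_n(x) = n * phi(n x)."""
    return n * phi(n * x)


def integrate(g, a: float, b: float, steps: int = 20000) -> float:
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / steps
    return h * sum(g(a + (i + 0.5) * h) for i in range(steps))


def pair_phi(f, n: int) -> float:
    """Integral of f * phi_n: stays bounded and tends to f(0)."""
    return integrate(lambda x: f(x) * phi_n(n, x), -1.0 / n, 1.0 / n)


def pair_phi_squared(f, n: int) -> float:
    """Integral of f * phi_n^2: grows linearly in n when f(0) is non-zero."""
    return integrate(lambda x: f(x) * phi_n(n, x) ** 2, -1.0 / n, 1.0 / n)


if __name__ == "__main__":
    for n in (4, 16, 64, 256):
        print(n, pair_phi(math.cos, n), pair_phi_squared(math.cos, n))
```

For this choice of $\phi$ the second column grows roughly like $\frac{2}{3} n f(0)$, so quadrupling $n$ roughly quadruples the pairing, and no distributional limit can exist.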

Exercise 28 Let $X$ be a normed vector space which contains $C^\infty_c(\mathbf{R}^d)$ as a dense subspace (and such that the inclusion of $C^\infty_c(\mathbf{R}^d)$ to $X$ is continuous). The adjoint (or transpose) of this inclusion map is then an injection from $X^*$ to the space of distributions $C^\infty_c(\mathbf{R}^d)^*$; thus $X^*$ can be viewed as a subspace of the space of distributions.

- Show that the closed unit ball in $X^*$ is also closed in the space of distributions.
- Conclude that any distributional limit of a bounded sequence in $L^p(\mathbf{R}^d)$ for $1 < p \leq \infty$, is still in $L^p(\mathbf{R}^d)$.
- Show that the previous claim fails for $L^1(\mathbf{R}^d)$, but holds for the space $M(\mathbf{R}^d)$ of finite measures.

** — 3. Tempered distributions — **

The list of operations one can define on distributions has one major omission – the Fourier transform $\mathcal{F}$. Unfortunately, one cannot easily define the Fourier transform for all distributions. One can see this as follows. From Plancherel’s theorem one has the identity

$$\langle f, \mathcal{F} g \rangle = \langle \mathcal{F}^* f, g \rangle$$

for test functions $f, g$, so one would like to define the Fourier transform $\mathcal{F} \lambda = \hat \lambda$ of a distribution $\lambda$ by the formula

$$\langle f, \mathcal{F} \lambda \rangle := \langle \mathcal{F}^* f, \lambda \rangle. \ \ \ \ \ (5)$$

Unfortunately this does not quite work, because the adjoint Fourier transform $\mathcal{F}^*$ of a test function is not a test function, but is instead just a Schwartz function. (Indeed, by Exercise 46 of Notes 2, it is not possible to find a non-trivial test function whose Fourier transform is again a test function.) To address this, we need to work with a slightly smaller space than that of all distributions, namely those of *tempered* distributions:

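Definition 4 (Tempered distributions) A tempered distribution is a continuous linear functional $\lambda: f \mapsto \langle f, \lambda \rangle$ on the Schwartz space $\mathcal{S}(\mathbf{R}^d)$ (with the topology given by Exercise 25 of Notes 2), i.e. an element of $\mathcal{S}(\mathbf{R}^d)^*$.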

Since $C^\infty_c(\mathbf{R}^d)$ embeds continuously into $\mathcal{S}(\mathbf{R}^d)$ (with a dense image), we see that the space of tempered distributions can be embedded into the space of distributions. However, not every distribution is tempered:

Example 3 The distribution $e^x$ is not tempered. Indeed, if $\psi$ is a bump function, observe that the sequence of functions $e^{-n} \psi(x-n)$ converges to zero in the Schwartz space topology, but $\langle e^{-n} \psi(x-n), e^x \rangle$ does not go to zero, and so this distribution does not correspond to a tempered distribution.
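The two claims in this example can be made explicit by a short computation. The following worked version assumes, for concreteness, that the distribution is the function $e^x$ on $\mathbf{R}$, that $\psi$ is a non-negative bump function supported in $[-1,1]$ and not identically zero, and that the test sequence is $f_n(x) := e^{-n} \psi(x-n)$; these normalisations are illustrative choices.

```latex
\begin{align*}
\langle f_n, e^x \rangle
  &= \int_{\mathbf{R}} e^{-n} \psi(x-n) e^{x}\ dx
   = \int_{\mathbf{R}} \psi(y) e^{y}\ dy > 0
  && \text{(substituting } y = x - n\text{)}, \\
\sup_{x \in \mathbf{R}} |x|^a \left| \frac{d^b}{dx^b} f_n(x) \right|
  &\leq e^{-n} (n+1)^a \sup_{y \in \mathbf{R}} |\psi^{(b)}(y)|
   \to 0 \quad \text{as } n \to \infty
  && \text{for all integers } a, b \geq 0.
\end{align*}
```

The first line is independent of $n$, so the pairings do not tend to zero, while the second line shows that every Schwartz seminorm of $f_n$ does: the exponential factor $e^{-n}$ beats the polynomial weight $(n+1)^a$ picked up from the location of the support.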

On the other hand, distributions which avoid this sort of exponential growth, and instead only grow polynomially, tend to be tempered:

**Exercise 29** Show that any Radon measure $\mu$ which is of *polynomial growth* in the sense that $|\mu|(B(0,R)) \leq C R^k$ for all $R \geq 1$ and some constants $C, k > 0$, where $B(0,R)$ is the ball of radius $R$ centred at the origin in $\mathbf{R}^d$, is tempered.

**Remark 3** As a zeroth approximation, one can roughly think of “tempered” as being synonymous with “polynomial growth”. However, this is not strictly true: for instance, the (weak) derivative of a function of polynomial growth will still be tempered, but need not be of polynomial growth (for instance, the derivative $-e^x \sin(e^x)$ of $\cos(e^x)$ is a tempered distribution, despite having exponential growth). While one can eventually describe which distributions are tempered by measuring their “growth” in both physical space and in frequency space, we will not do so here.
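The point of this remark can be illustrated numerically. The sketch below (an illustration under assumptions, not part of the original notes) pairs a Gaussian $\phi(x) = e^{-x^2}$ with $-e^x\sin(e^x)$ directly, and compares the result with the weak-derivative pairing $-\int \phi'(x)\cos(e^x)\,dx$, which only involves the bounded function $\cos(e^x)$. The Gaussian, the integration window $[-7,7]$ and the helper names are all assumed choices.

```python
import math

def phi(x):
    """Gaussian test function (a Schwartz function), an assumed choice."""
    return math.exp(-x * x)

def dphi(x):
    """Derivative of the Gaussian test function."""
    return -2.0 * x * math.exp(-x * x)

def midpoint(func, a, b, steps):
    """Composite midpoint rule on [a, b]."""
    h = (b - a) / steps
    return h * sum(func(a + (i + 0.5) * h) for i in range(steps))

def pair_with_derivative():
    """Integral of phi(x) * (-e^x sin(e^x)): pairing with the classical derivative."""
    return midpoint(lambda x: phi(x) * (-math.exp(x) * math.sin(math.exp(x))),
                    -7.0, 7.0, 140000)

def pair_by_duality():
    """Minus the integral of phi'(x) * cos(e^x): the weak-derivative pairing."""
    return -midpoint(lambda x: dphi(x) * math.cos(math.exp(x)), -7.0, 7.0, 140000)

if __name__ == "__main__":
    print(pair_with_derivative(), pair_by_duality())
```

The two numbers agree up to quadrature error, reflecting the integration by parts that moves the derivative onto the rapidly decaying test function.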

Most of the operations that preserve the space of distributions, also preserve the space of tempered distributions. For instance:

**Exercise 30**

- Show that any derivative of a tempered distribution is again a tempered distribution.
- Show that any convolution of a tempered distribution with a compactly supported distribution is again a tempered distribution.
- Show that if $f$ is a measurable function which is *rapidly decreasing* in the sense that $|x|^k f(x)$ is an $L^\infty(\mathbf{R}^d)$ function for each $k = 0, 1, 2, \ldots$, then a convolution of a tempered distribution with $f$ can be defined, and is again a tempered distribution.
- Show that if $f$ is a smooth function such that $f$ and all its derivatives have *at most polynomial growth* (thus for each $j \geq 0$ there exist $C, k \geq 0$ such that $|\nabla^j f(x)| \leq C(1+|x|)^k$ for all $x \in \mathbf{R}^d$) then the product of a tempered distribution with $f$ is again a tempered distribution. Give a counterexample to show that this statement fails if the polynomial growth hypotheses are dropped.
- Show that the translate of a tempered distribution is again a tempered distribution.

But we can now add a new operation to this list using (5): as the Fourier transform maps Schwartz functions continuously to Schwartz functions, it also continuously maps the space of tempered distributions to itself. One can also define the inverse Fourier transform on tempered distributions in a similar manner.
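A finite-dimensional analogue may make the definition by duality more concrete. In the sketch below (an illustration only, not the construction in the notes), test functions are replaced by vectors in $\mathbf{C}^n$, the Fourier transform by the unitary discrete Fourier transform, and a "distribution" by a linear functional on vectors; its transform is defined by applying the functional to the adjoint transform of the test vector, in the spirit of (5). All names are hypothetical.

```python
import cmath

def dft(v):
    """Unitary discrete Fourier transform on C^n."""
    n = len(v)
    return [sum(v[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n)) / n ** 0.5
            for k in range(n)]

def adjoint_dft(v):
    """Adjoint (which is also the inverse) of the unitary discrete Fourier transform."""
    n = len(v)
    return [sum(v[j] * cmath.exp(2j * cmath.pi * j * k / n) for j in range(n)) / n ** 0.5
            for k in range(n)]

def fourier_of_functional(lam):
    """Transform of a functional lam, defined by f -> lam(F^* f)."""
    return lambda f: lam(adjoint_dft(f))

def functional_from_vector(h):
    """The functional f -> sum_j f_j * conj(h_j) induced by a vector h."""
    return lambda f: sum(a * b.conjugate() for a, b in zip(f, h))

def delta(f):
    """Discrete Dirac mass at index 0: evaluates the test vector there."""
    return f[0]

# The transform of the discrete delta pairs against a constant vector.
delta_hat = fourier_of_functional(delta)
```

For a functional induced by a vector `h`, the functional returned by `fourier_of_functional` coincides with the one induced by `dft(h)`, so the definition by duality is consistent with the classical transform in this toy setting.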

It is not difficult to extend many of the properties of the Fourier transform from Schwartz functions to distributions. For instance:

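**Exercise 31** Let $\lambda$ be a tempered distribution, and let $f$ be a Schwartz function.

- (Inversion formula) Show that $\mathcal{F}^* \mathcal{F}\lambda = \mathcal{F}\mathcal{F}^*\lambda = \lambda$.
- (Multiplication intertwines with convolution) Show that $\mathcal{F}(\lambda f) = (\mathcal{F}\lambda) * (\mathcal{F} f)$ and $\mathcal{F}(\lambda * f) = (\mathcal{F}\lambda)(\mathcal{F} f)$.
- (Translation intertwines with modulation) For any $x_0 \in \mathbf{R}^d$, show that $\mathcal{F}(\tau_{x_0}\lambda) = e_{-x_0}\mathcal{F}\lambda$, where $e_{-x_0}(\xi) := e^{-2\pi i \xi \cdot x_0}$. Similarly, show that for any $\xi_0 \in \mathbf{R}^d$, one has $\mathcal{F}(e_{\xi_0}\lambda) = \tau_{\xi_0}\mathcal{F}\lambda$.
- (Linear transformations) For any invertible linear transformation $L: \mathbf{R}^d \to \mathbf{R}^d$, show that $\mathcal{F}(\lambda \circ L) = \frac{1}{|\det L|}(\mathcal{F}\lambda) \circ (L^*)^{-1}$.
- (Differentiation intertwines with polynomial multiplication) For any $1 \leq j \leq d$, show that $\mathcal{F}(\frac{\partial}{\partial x_j}\lambda) = 2\pi i \xi_j \mathcal{F}\lambda$, where $x_j$ and $\xi_j$ are the $j^{\text{th}}$ coordinate functions in physical space and frequency space respectively, and similarly $\mathcal{F}(-2\pi i x_j \lambda) = \frac{\partial}{\partial \xi_j}\mathcal{F}\lambda$.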

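**Exercise 32** Let $d \geq 1$.

- (Inversion formula) Show that $\mathcal{F}\delta = 1$ and $\mathcal{F}1 = \delta$.
- (Orthogonality) Let $V$ be a subspace of $\mathbf{R}^d$, and let $\mu$ be Lebesgue measure on $V$. Show that $\mathcal{F}\mu$ is Lebesgue measure on the orthogonal complement $V^\perp$ of $V$. (Note that this generalises the previous exercise.)
- (Poisson summation formula) Let $\sum_{k \in \mathbf{Z}^d} \tau_k \delta$ be the distribution
$$\langle f, \sum_{k \in \mathbf{Z}^d} \tau_k \delta \rangle := \sum_{k \in \mathbf{Z}^d} f(k).$$
Show that this is a tempered distribution which is equal to its own Fourier transform.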
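The Poisson summation formula can be spot-checked numerically in one dimension. The sketch below (not part of the original notes) uses the Gaussian $f(x) = e^{-\pi t^2 x^2}$, whose Fourier transform in the normalisation $\hat f(\xi) = \int f(x) e^{-2\pi i x \xi}\,dx$ is $t^{-1} e^{-\pi \xi^2/t^2}$, and compares $\sum_k f(k)$ with $\sum_k \hat f(k)$. The truncation of the lattice sums and the function names are assumptions.

```python
import math

def gaussian(t, x):
    """f(x) = exp(-pi t^2 x^2)."""
    return math.exp(-math.pi * t * t * x * x)

def gaussian_hat(t, xi):
    """Fourier transform of f in one dimension: exp(-pi xi^2 / t^2) / t."""
    return math.exp(-math.pi * xi * xi / (t * t)) / t

def lattice_sum(func, cutoff=50):
    """Sum of func over the integers k with |k| <= cutoff (the tail is negligible here)."""
    return sum(func(k) for k in range(-cutoff, cutoff + 1))

if __name__ == "__main__":
    for t in (0.5, 1.0, 2.0):
        lhs = lattice_sum(lambda k: gaussian(t, k))
        rhs = lattice_sum(lambda k: gaussian_hat(t, k))
        print(t, lhs, rhs)
```

Agreement of the two columns is what one expects if the Dirac comb is its own Fourier transform, tested here against a single family of Schwartz functions only.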

One can use these properties of tempered distributions to start solving constant-coefficient PDE. We first illustrate this by an ODE example, showing how the formal symbolic calculus for solving such ODE that you may have seen as an undergraduate, can now be (sometimes) justified using tempered distributions.

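**Exercise 33** Let $d = 1$, let $a, b$ be real numbers, and let $D$ be the operator $D = (\frac{d}{dx} - ia)(\frac{d}{dx} - ib)$.

- If $a \neq b$, use the Fourier transform to show that all tempered distribution solutions to the ODE $D\lambda = 0$ are of the form $\lambda = A e^{iax} + B e^{ibx}$ for some constants $A, B$.
- If $a = b$, show that all tempered distribution solutions to the ODE $D\lambda = 0$ are of the form $\lambda = A e^{iax} + B x e^{iax}$ for some constants $A, B$.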

**Remark 4** More generally, one can solve any homogeneous constant-coefficient ODE using tempered distributions and the Fourier transform so long as the roots of the characteristic polynomial are purely imaginary. In all other cases, solutions can grow exponentially as $x \to +\infty$ or $x \to -\infty$ and so are not tempered. There are other theories of generalised functions that can handle these objects (e.g. hyperfunctions) but we will not discuss them here.

Now we turn to PDE. To illustrate the method, let us focus on solving Poisson’s equation

$$\Delta u = f \qquad (6)$$

in $\mathbf{R}^d$, where $f$ is a Schwartz function and $u$ is a distribution, where $\Delta := \sum_{j=1}^d \frac{\partial^2}{\partial x_j^2}$ is the Laplacian. (In some texts, particularly those using spectral analysis, the Laplacian is occasionally defined instead as $-\sum_{j=1}^d \frac{\partial^2}{\partial x_j^2}$, to make it positive semi-definite, but we will eschew that sign convention here, though of course the theory is only changed in a trivial fashion if one adopts it.)

We first settle the question of uniqueness:

**Exercise 34** Let $d \geq 1$. Using the Fourier transform, show that the only tempered distributions $\lambda \in \mathcal{S}(\mathbf{R}^d)^*$ which are harmonic (by which we mean that $\Delta\lambda = 0$ in the sense of distributions) are the harmonic polynomials. (*Hint:* use Exercise 22.) Note that this generalises Liouville’s theorem. There are of course many other harmonic functions than the harmonic polynomials, e.g. $e^x \cos(y)$, but such functions are not tempered distributions.

From the above exercise, we know that the solution $u$ to (6), if tempered, is defined up to harmonic polynomials. To find a solution, we observe that it is enough to find a *fundamental solution*, i.e. a tempered distribution $K$ solving the equation

$$\Delta K = \delta.$$

Indeed, if one then convolves this equation with the Schwartz function $f$, and uses the identity $(\Delta K) * f = \Delta (K * f)$ (which can either be seen directly, or by using Exercise 31), we see that $u = K * f$ will be a tempered distribution solution to (6) (and all the other solutions will equal this solution plus a harmonic polynomial). So, it is enough to locate a fundamental solution $K$. We can take Fourier transforms and rewrite this equation as

$$-4\pi^2 |\xi|^2 \hat{K}(\xi) = 1$$

(here we are treating the tempered distribution $\hat{K}$ as a function to emphasise that the dependent variable is now $\xi$). It is then natural to propose to solve this equation as

$$\hat{K}(\xi) = \frac{1}{-4\pi^2 |\xi|^2}, \qquad (7)$$

though this may not be the unique solution (for instance, one is free to modify $\hat{K}$ by a multiple of the Dirac delta function, cf. Exercise 16).

A short computation in polar coordinates shows that $\frac{1}{-4\pi^2|\xi|^2}$ is locally integrable in dimensions $d \geq 3$, so the right-hand side of (7) makes sense. To then compute $K$ explicitly, we have from the distributional inversion formula that

$$K = \frac{-1}{4\pi^2} \mathcal{F}^* |\xi|^{-2},$$

so we now need to figure out what the Fourier transform of a negative power of $|x|$ (or the adjoint Fourier transform of a negative power of $|\xi|$) is.

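Let us work formally at first, and consider the problem of computing the Fourier transform of the function $|x|^{-\alpha}$ in $\mathbf{R}^d$ for some exponent $\alpha$. A direct attack, based on evaluating the (formal) Fourier integral

$$\widehat{|x|^{-\alpha}}(\xi) = \int_{\mathbf{R}^d} |x|^{-\alpha} e^{-2\pi i \xi \cdot x}\, dx \qquad (8)$$

does not seem to make much sense (the integral is not absolutely integrable), although a change of variables (or dimensional analysis) heuristic can at least lead to the prediction that the integral (8) should be some multiple of $|\xi|^{\alpha - d}$. But which multiple should it be? To continue the formal calculation, we can write the non-integrable function $|x|^{-\alpha}$ as an average of integrable functions whose Fourier transforms are already known. There are many such functions that one could use here, but it is natural to use Gaussians, as they have a particularly pleasant Fourier transform, namely

$$\widehat{e^{-\pi t^2 |x|^2}}(\xi) = t^{-d} e^{-\pi |\xi|^2 / t^2}$$

for $t > 0$ (see Exercise 42 of Notes 2). To get from Gaussians to $|x|^{-\alpha}$, one can observe that $|x|^{-\alpha}$ is invariant under the scaling $f(x) \mapsto t^\alpha f(tx)$ for $t > 0$. Thus, it is natural to average the standard Gaussian $e^{-\pi |x|^2}$ with respect to this scaling, thus producing the function $t^\alpha e^{-\pi t^2 |x|^2}$, then integrate with respect to the multiplicative Haar measure $\frac{dt}{t}$. A straightforward change of variables then gives the identity

$$\int_0^\infty t^\alpha e^{-\pi t^2 |x|^2}\, \frac{dt}{t} = \frac{1}{2} \pi^{-\alpha/2} |x|^{-\alpha}\, \Gamma(\alpha/2)$$

where

$$\Gamma(s) := \int_0^\infty t^s e^{-t}\, \frac{dt}{t}$$

is the Gamma function. If we formally take Fourier transforms of this identity, we obtain

$$\int_0^\infty t^\alpha t^{-d} e^{-\pi |\xi|^2 / t^2}\, \frac{dt}{t} = \frac{1}{2} \pi^{-\alpha/2}\, \widehat{|x|^{-\alpha}}(\xi)\, \Gamma(\alpha/2).$$

Another change of variables shows that

$$\int_0^\infty t^\alpha t^{-d} e^{-\pi |\xi|^2 / t^2}\, \frac{dt}{t} = \frac{1}{2} \pi^{-(d-\alpha)/2} |\xi|^{-(d-\alpha)}\, \Gamma((d-\alpha)/2)$$

and so we conclude (formally) that

$$\widehat{|x|^{-\alpha}}(\xi) = \frac{\pi^{-(d-\alpha)/2}\, \Gamma((d-\alpha)/2)}{\pi^{-\alpha/2}\, \Gamma(\alpha/2)}\, |\xi|^{-(d-\alpha)}, \qquad (9)$$

thus solving the problem of what the constant multiple of $|\xi|^{-(d-\alpha)}$ should be.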
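The two ingredients of this formal computation can be checked numerically. The sketch below (an illustration, not part of the original notes) verifies the scaling-average identity $\int_0^\infty t^\alpha e^{-\pi t^2 r^2}\,\frac{dt}{t} = \frac{1}{2}\pi^{-\alpha/2} r^{-\alpha}\Gamma(\alpha/2)$ by quadrature in the variable $s = \log t$, and evaluates the resulting constant $\pi^{-(d-\alpha)/2}\Gamma((d-\alpha)/2) / (\pi^{-\alpha/2}\Gamma(\alpha/2))$. The function names, the truncation window and the step count are assumptions.

```python
import math

def scaling_average(alpha, r, lo=-40.0, hi=6.0, steps=46000):
    """Trapezoid rule for the integral of t^alpha exp(-pi t^2 r^2) dt/t, using t = e^s."""
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps + 1):
        s = lo + i * h
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * math.exp(alpha * s - math.pi * math.exp(2.0 * s) * r * r)
    return total * h

def closed_form(alpha, r):
    """Right-hand side (1/2) pi^(-alpha/2) r^(-alpha) Gamma(alpha/2)."""
    return 0.5 * math.pi ** (-alpha / 2.0) * r ** (-alpha) * math.gamma(alpha / 2.0)

def riesz_constant(alpha, d=3):
    """Constant multiplying |xi|^(-(d - alpha)) in the formal Fourier transform of |x|^(-alpha)."""
    numerator = math.pi ** (-(d - alpha) / 2.0) * math.gamma((d - alpha) / 2.0)
    denominator = math.pi ** (-alpha / 2.0) * math.gamma(alpha / 2.0)
    return numerator / denominator

if __name__ == "__main__":
    print(scaling_average(2.0, 1.5), closed_form(2.0, 1.5))
    print(riesz_constant(2.0, d=3), math.pi)
```

For $d = 3$ and $\alpha = 2$ the constant evaluates to $\pi$, and the constants for $\alpha$ and $d - \alpha$ are reciprocals of each other, as one would expect from the inversion formula.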

**Exercise 35** Give a rigorous proof of (9) for $0 < \alpha < d$ (when both sides are locally integrable) in the sense of distributions. (*Hint:* basically, one needs to test the entire formal argument against an arbitrary Schwartz function.) The identity (9) can in fact be continued meromorphically in $\alpha$, but the interpretation of distributions such as $|x|^{-\alpha}$ when $|x|^{-\alpha}$ is not locally integrable is somewhat complicated (cf. Exercise 12) and will not be discussed here.

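Specialising back to the current situation with $d = 3$, $\alpha = 2$, and using the standard identities

$$\Gamma(n) = (n-1)!; \quad \Gamma(\tfrac{1}{2}) = \sqrt{\pi}$$

we see that

$$\widehat{\frac{1}{|x|^2}}(\xi) = \pi |\xi|^{-1}$$

and similarly

$$\mathcal{F}^* \frac{1}{|\xi|^2} = \pi |x|^{-1}$$

and so from (7) we see that one choice of the fundamental solution $K$ is the Newton potential

$$K(x) = \frac{-1}{4\pi |x|},$$

leading to an explicit (and rigorously derived) solution

$$u(x) := f * K(x) = -\frac{1}{4\pi} \int_{\mathbf{R}^3} \frac{f(y)}{|x-y|}\, dy \qquad (10)$$

to the Poisson equation (6) in $d = 3$ for Schwartz functions $f$. (This is not quite the only fundamental solution $K$ available; one can add a harmonic polynomial to $K$, which will end up adding a harmonic polynomial to $u$, since the convolution of a harmonic polynomial with a Schwartz function is easily seen to still be harmonic.)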
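The Newton potential solution can be tested numerically on a radial datum. The sketch below (an illustration under assumptions, not part of the original notes) takes $f(x) = e^{-\pi|x|^2}$ in $\mathbf{R}^3$, computes $u = f * K$ with $K(x) = -1/(4\pi|x|)$ using the classical fact that a uniform spherical shell of radius $s$ produces the potential $4\pi s^2/\max(|x|, s)$, and then checks $u'' + \frac{2}{r}u' \approx f$ by finite differences. The datum, cutoff, step sizes and helper names are all assumed choices.

```python
import math

def f(s):
    """Radial Schwartz datum f(x) = exp(-pi |x|^2), an assumed test case."""
    return math.exp(-math.pi * s * s)

def midpoint(func, a, b, steps=4000):
    """Composite midpoint rule on [a, b]."""
    h = (b - a) / steps
    return h * sum(func(a + (i + 0.5) * h) for i in range(steps))

def potential(r, cutoff=8.0):
    """u(r) = -(1/(4 pi)) * integral of f(y) / |x - y| dy for |x| = r, shell by shell."""
    inner = midpoint(lambda s: f(s) * s * s, 0.0, r)   # shells of radius s < r contribute s^2 / r
    outer = midpoint(lambda s: f(s) * s, r, cutoff)     # shells of radius s > r contribute s
    return -(inner / r + outer)

def radial_laplacian(u, r, delta=0.01):
    """Finite-difference approximation of u'' + (2/r) u' for a radial function on R^3."""
    second = (u(r + delta) - 2.0 * u(r) + u(r - delta)) / delta ** 2
    first = (u(r + delta) - u(r - delta)) / (2.0 * delta)
    return second + 2.0 * first / r

if __name__ == "__main__":
    for r in (0.5, 1.0, 1.5):
        print(r, radial_laplacian(potential, r), f(r))
```

The agreement is only up to discretisation error, and of course tests a single radial $f$ rather than proving anything.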

**Exercise 36** Without using the theory of distributions, give an alternate (and still rigorous) proof that the function $u$ defined in (10) solves (6) in $d = 3$.

**Exercise 37**

- Show that for any $d \geq 3$, a fundamental solution $K$ to the Poisson equation is given by the locally integrable function
$$K(x) = \frac{1}{d(2-d)\omega_d} \frac{1}{|x|^{d-2}},$$
where $\omega_d = \pi^{d/2} / \Gamma(\frac{d}{2}+1)$ is the volume of the unit ball in $d$ dimensions.
- Show that for $d = 1$, a fundamental solution is given by the locally integrable function $K(x) = |x|/2$.
- Show that for $d = 2$, a fundamental solution is given by the locally integrable function $K(x) = \frac{1}{2\pi} \log |x|$.

Thus we see that for the Poisson equation, $d = 2$ is a “critical” dimension, requiring a logarithmic correction to the usual formula.
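The normalising constants in this exercise can be spot-checked numerically. A fundamental solution $K$ should satisfy $\int_{\mathbf{R}^d} K(x)\,\Delta\phi(x)\,dx = \phi(0)$; the sketch below (an illustration, not part of the original notes) tests this for the single radial function $\phi(x) = e^{-|x|^2}$, for which $\Delta\phi = (4r^2 - 2d)e^{-r^2}$ and $\phi(0) = 1$, using $K(x) = |x|^{2-d}/(d(2-d)\omega_d)$ for $d \neq 2$ and $K(x) = \frac{1}{2\pi}\log|x|$ for $d = 2$. The quadrature parameters and function names are assumptions.

```python
import math

def unit_ball_volume(d):
    """Volume of the unit ball in d dimensions: pi^(d/2) / Gamma(d/2 + 1)."""
    return math.pi ** (d / 2.0) / math.gamma(d / 2.0 + 1.0)

def fundamental_solution(d, r):
    """Candidate fundamental solution of the Laplacian, as a function of r = |x| > 0."""
    if d == 2:
        return math.log(r) / (2.0 * math.pi)
    return r ** (2 - d) / (d * (2 - d) * unit_ball_volume(d))

def pairing_with_laplacian(d, cutoff=10.0, steps=20000):
    """Midpoint rule for the integral of K * Laplacian(phi) over R^d, phi = exp(-|x|^2)."""
    sphere_area = d * unit_ball_volume(d)  # surface area of the unit sphere
    h = cutoff / steps
    total = 0.0
    for i in range(steps):
        r = (i + 0.5) * h
        laplacian_phi = (4.0 * r * r - 2.0 * d) * math.exp(-r * r)
        total += fundamental_solution(d, r) * laplacian_phi * r ** (d - 1)
    return sphere_area * h * total

if __name__ == "__main__":
    for d in (1, 2, 3, 4, 5):
        print(d, pairing_with_laplacian(d))
```

Note that the general formula with $d = 1$ already reproduces $|x|/2$, and with $d = 3$ it reproduces the Newton potential $-1/(4\pi|x|)$; only $d = 2$ needs the logarithm.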

Similar methods can solve other constant coefficient linear PDE. We give some standard examples in the exercises below.

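**Exercise 38** Let $d \geq 1$. Show that a smooth solution $u: \mathbf{R}^+ \times \mathbf{R}^d \to \mathbf{C}$ to the heat equation $\partial_t u = \Delta u$ with initial data $u(0,x) = f(x)$ for some Schwartz function $f$ is given by $u(t) = f * K_t$ for $t > 0$, where $K_t$ is the heat kernel

$$K_t(x) = \frac{1}{(4\pi t)^{d/2}} e^{-|x|^2/4t}.$$

(This solution is unique assuming certain smoothness and decay conditions at infinity, but we will not pursue this issue here.)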
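As a quick sanity check of the heat kernel formula (a sketch only, in one dimension and with assumed step sizes and names), one can verify by finite differences that $K_t(x) = (4\pi t)^{-1/2} e^{-x^2/4t}$ satisfies $\partial_t K = \partial_{xx} K$, and by quadrature that it has total mass one, which is what makes it an approximation to the identity as $t \to 0$.

```python
import math

def heat_kernel(t, x):
    """One-dimensional heat kernel (4 pi t)^(-1/2) exp(-x^2 / (4 t))."""
    return math.exp(-x * x / (4.0 * t)) / math.sqrt(4.0 * math.pi * t)

def heat_residual(t, x, dt=1e-4, dx=1e-3):
    """Central-difference approximation of d/dt K - d^2/dx^2 K at the point (t, x)."""
    time_derivative = (heat_kernel(t + dt, x) - heat_kernel(t - dt, x)) / (2.0 * dt)
    space_second = (heat_kernel(t, x + dx) - 2.0 * heat_kernel(t, x)
                    + heat_kernel(t, x - dx)) / dx ** 2
    return time_derivative - space_second

def total_mass(t, cutoff=30.0, steps=60000):
    """Midpoint-rule approximation of the integral of K_t over [-cutoff, cutoff]."""
    h = 2.0 * cutoff / steps
    return h * sum(heat_kernel(t, -cutoff + (i + 0.5) * h) for i in range(steps))

if __name__ == "__main__":
    print(heat_residual(0.5, 0.3), total_mass(0.5))
```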

**Exercise 39** Let $d \geq 1$. Show that a smooth solution $u: \mathbf{R} \times \mathbf{R}^d \to \mathbf{C}$ to the Schrödinger equation $i\partial_t u + \Delta u = 0$ with initial data $u(0,x) = f(x)$ for some Schwartz function $f$ is given by $u(t) = f * K_t$ for $t \neq 0$, where $K_t$ is the *Schrödinger kernel*

$$K_t(x) = \frac{1}{(4\pi i t)^{d/2}} e^{i|x|^2/4t}$$

and we use the standard branch of the complex logarithm (with cut on the negative real axis) to define $(4\pi i t)^{d/2}$. (*Hint:* You may wish to investigate the Fourier transform of $e^{-z|\xi|^2}$, where $z$ is a complex number with positive real part, and then let $z$ approach the imaginary axis.) (The close similarity with the heat kernel is a manifestation of Wick rotation in action. However, from an analytical viewpoint, the two kernels are very different. For instance, the convergence of $f * K_t$ to $f$ as $t \to 0$ follows in the heat kernel case by the theory of approximations to the identity, whereas the convergence in the Schrödinger case is much more subtle, and is best seen via Fourier analysis.)

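**Exercise 40** Let $d = 3$. Show that a smooth solution $u: \mathbf{R} \times \mathbf{R}^3 \to \mathbf{C}$ to the wave equation $-\partial_{tt} u + \Delta u = 0$ with initial data $u(0,x) = f(x)$, $\partial_t u(0,x) = g(x)$ for some Schwartz functions $f, g$ is given by the formula

$$u(t) = f * \partial_t K_t + g * K_t$$

for $t \neq 0$, where $K_t$ is the distribution

$$\langle f, K_t \rangle := \frac{t}{4\pi} \int_{S^2} f(t\omega)\, d\omega,$$

where $d\omega$ is Lebesgue measure on the sphere $S^2$, and the derivative $\partial_t K_t$ is defined in the Newtonian sense $\lim_{dt \to 0} \frac{K_{t+dt} - K_t}{dt}$, with the limit taken in the sense of distributions.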

**Remark 5** The theory of (tempered) distributions is also highly effective for studying variable coefficient linear PDE, especially if the coefficients are fairly smooth, and particularly if one is primarily interested in the singularities of solutions to such PDE and how they propagate; here the Fourier transform must be augmented with more general transforms of this type, such as Fourier integral operators. A classic reference for this topic is the four volumes of Hörmander’s “The analysis of linear partial differential operators”. For nonlinear PDE, subspaces of the space of distributions, such as Sobolev spaces, tend to be more useful.

## 145 comments

19 April, 2009 at 6:14 pm

Anonymous

WONDERFUL POST.

THANK YOU PROF. TAO

20 April, 2009 at 11:22 am

Successful Researcher

Superb! Thank you!

20 April, 2009 at 4:38 pm

bird

Dear Prof. Tao,

in some places, you write ” absolutely integrable and compactly supported”…

but if a function is compactly supported, then isn’t it absolutely integrable? is there a specific reason that you are writing in that way?

thanks

20 April, 2009 at 4:46 pm

Anonymous

When we say “a function f is supported in a set V”, do we mean that the set where f is nonzero is a subset of V or the closure of the set where f is nonzero is a subset of V?

thanks

20 April, 2009 at 5:06 pm

bird

in exercise 16, on the right hand side is it ?

20 April, 2009 at 5:25 pm

Anonymous

Dear Prof. Tao,

in exercise 17,

is that right integral value of f?

if it is right, why the answer is 1, I did not understand?

thanks

20 April, 2009 at 9:03 pm

Dale Roberts

Typo: The display eq. between (7) & (8), shouldn’t RHS be K (without hat)?

Like all your (analysis course) posts, your exposition really clarifies some subtle points that were glossed over in a course that I took as an undergrad and provide a lot of great intuition.

IMHO, your blog posts really help fill the lack of good graduate-level analysis courses in Australia. It is true that we can sit down and plough through the books but your style really brings it to life. Thanks, I personally really appreciate it.

21 April, 2009 at 11:19 am

Terence Tao

Not all compactly supported functions are absolutely integrable, e.g. the restriction of 1/x to the interval [0,1].

Either definition of support would work here, but one can for sake of concreteness take the closure of the set where the function is non-zero.

For exercise 16, I forgot to mention that to make the various notions of multiplication consistent, distributions should be given the anti-complex structure rather than the complex structure; thus if $\lambda$ is a distribution and $c$ is a complex number, then $c\lambda$ is the distribution given by $\langle f, c\lambda \rangle := \overline{c} \langle f, \lambda \rangle$ rather than $\langle f, c\lambda \rangle := c \langle f, \lambda \rangle$. This conjugation is always something of an irritant (the same technicality shows up when considering duals of complex Hilbert spaces), though if one wishes one can avoid this issue by restricting attention to real-valued functions (but then it becomes harder to implement Fourier analysis).

For exercise 17, 1 is indeed the distribution that maps any test function f to its integral value: $\langle f, 1 \rangle = \int_{\mathbf{R}^d} f(x)\, dx$.

Dale: thanks for the correction!

21 April, 2009 at 3:23 pm

lutfu

Dear Prof. Tao,

for Exc 11,12,13 showing that they define a distribution is not too hard but the second part, i.e. showing that they do not arise from either a locally integrable function or a Radon measure is not easy.

Could you please give some hint for anyone of them?

thanks

23 April, 2009 at 12:11 pm

Terence Tao

Dear Lutfu,

In all of these exercises you should try arguing by contradiction, assuming that, say, arises from a Radon measure and then trying to deduce as much about as possible. For instance, in that specific example one can look at lower bounds for the total variation of on the interval for small .

30 April, 2009 at 9:34 pm

245C, Notes 4: Sobolev spaces « What’s new

[…] dual space of , the distributional limit of any sequence bounded in remains in , by Exercise 28 of Notes 3.) To prove (1), observe from the fundamental theorem of calculus […]

2 May, 2009 at 5:55 pm

Student

I’m sorry if it’s a dumb question, but I don’t understand how you can take a Fourier Transform of the right hand side of the equation that is 2 lines after (8). It’s not in L^1 as it blows up at the origin.

If you could specify the change of variables you are using, that would be appreciated too.

2 May, 2009 at 9:01 pm

Terence Tao

Ah, several of the powers of in the text should have been instead. This should be fixed now…

6 May, 2009 at 6:41 pm

At the Fefferman conference « What’s new

[…] some explicit constant (this can be seen by computations similar to those in my recent lecture notes on distributions, or by analytically continuing such computations; see also Stein’s “Singular integrals […]

11 May, 2009 at 4:09 pm

maxbaroi

There is a disparity in Exercise 19’s hint. Your hint’s text references Exercise 3, but links to Lemma 3.

[Fixed, thanks – T.]

13 May, 2009 at 7:28 pm

Yasser Taima

Dear Professor Tao,

In lemma 2, isn’t it that the Riemann sum can be interchanged with the inner product because of the linearity of the inner product, rather than the compactness of the support of f?

It was mentioned in lecture that converges to as pointwise. We then said that the convergence is uniform. Why is that?

This appears to be useful to obtain the Lipschitz condition needed for the convergence of the Riemann sum. Is that right?

Thanks and apologies as I’m new to latex.

13 May, 2009 at 7:40 pm

Terence Tao

Dear Yasser,

The compactness of f is needed to make the Riemann sum finite, at which point the sum and inner product can be interchanged by linearity.

The uniform convergence of the Newton quotients to the derivative (i.e. the uniform differentiability of in x) is a consequence of the smoothness of h. For instance, one can use Taylor’s theorem with remainder to get a uniform bound on the error between and in terms of the norm of h. (Actually, this is overkill; having h be would already be sufficient thanks to the fundamental theorem of calculus or mean value theorem and the uniform continuity of h’, although one wouldn’t get as quantitative a decay rate as with that approach.)

19 May, 2009 at 9:24 pm

maxbaroi

I believe there’s a typo in Exercise 16’s hint.

I believe it should read .

[Corrected, thanks, -T.]

19 May, 2009 at 11:04 pm

Solutions to Selected Exercises from 245C, Notes 3: Distributions « Less Incompetence

[…] Of course, the actual exercises can be found at Professor Tao’s blog. […]

21 May, 2009 at 8:36 am

Cloclo

Thank you very much for this very very interesting post, as usual !!

I would like to ask you the following question since I couldn’t find an answer in the references I looked at:

Given two functions, f in $L^1$, g in $L^p$ ($1 \leq p \leq \infty$), let h be their convolution in the "classical" sense (without the help of the theory of distribution) $h = f * g$. Thus h is a function in $L^p$ and its Fourier transform is a tempered distribution.

Can we relate it to the Fourier transform of f and g ?

Or in other words is it possible under maybe given assumptions to have the formula $\hat{h} = \hat{f}\hat{g}$ ?

Thank you in advance !

21 May, 2009 at 8:44 am

Terence Tao

Dear Cloclo,

Yes, this is true for $p \leq 2$. The easiest way to prove this is to express f and g as the limit (in $L^1$ and $L^p$ respectively) of test functions, which by Young’s inequality gives h as the limit (in $L^p$) of the convolution of those test functions. In particular, all the convergences here are also in the sense of tempered distributions, and so their Fourier transforms will also converge in this sense. Meanwhile, one gets a similar convergence to $\hat{f}\hat{g}$ by their test function counterparts using Hausdorff-Young and Holder’s inequality. Since the Fourier transform is already known to intertwine convolution and multiplication for test functions, one can now take distributional limits to conclude the claim (note that the space of distributions is Hausdorff and so limits are unique).

For $p > 2$ the situation is more subtle, because the expression $\hat{f}\hat{g}$ does not necessarily make sense ($\hat{g}$ need not be a locally integrable function, instead being merely a distribution in general; meanwhile, $\hat{f}$ is continuous but need not be smooth). One could *define* this product by fiat to be $\widehat{f * g}$, but this is a tautological way to solve the problem and does not have any actual content.

This general strategy (establishing identities for rough functions by taking limits of the corresponding identity for classical functions) is generally quite a good way to establish identities for rough functions, especially if they are somehow “linear” or otherwise likely to obey good continuity properties.

21 May, 2009 at 10:07 am

Cloclo

Thank you so much for your answer.

And I am really sorry for bothering you once again. But I am not sure I understand how you prove that (if and ), then , indeed may be just a continuous but not a smooth function.

Thank you so much once again and sorry !

21 May, 2009 at 10:14 am

Terence Tao

If $f_n$ converges to f in $L^1$, then $\hat{f}_n$ will converge to $\hat{f}$ in $L^\infty$ by Hausdorff-Young; similarly, $\hat{g}_n$ will converge to $\hat{g}$ in $L^{p'}$ (here we need p to be at most 2). From Holder’s inequality we conclude that $\hat{f}_n \hat{g}_n$ converges to $\hat{f}\hat{g}$ in $L^{p'}$ and thus in the sense of tempered distributions.

21 May, 2009 at 10:14 am

Cloclo

Thank you very much. I have now to study carefully your answer (I didn’t get it in time). So don’t take into account my last questions and sorry for that.

with best regards.

10 June, 2009 at 4:07 pm

student

Dear Prof Tao,

in dimension 1, for the wave equation, if the given initial data are distributions, how do we define the solution for the equation? how do we interpret the equation?

thanks

11 June, 2009 at 6:36 pm

Terence Tao

Dear student,

There are a number of different ways to define “weak” solutions to these sorts of equations, which are mostly equivalent for linear equations such as the free wave equation, but which are subtly different for nonlinear equations. For instance, one can integrate the equation against a test function in spacetime in the half-space , and interpret the resulting integral equation in a distributional sense. Or, one can take Fourier transforms in space, solve the equation in time using the fundamental solution, and then interpret everything distributionally. A closely related approach is to use the Duhamel formula to define the notion of a solution. Yet another approach is to interpret a rough solution as a limit of smooth solutions to the same equation (or a regularised or “viscosity” version of that equation) in suitable topologies (e.g. the distributional topology). Then there are other approaches based on comparison with subsolutions or supersolutions, or based on variational characterisations of the equation, or on entropy or kinetic formulations of the equation; as one can imagine, the topic of how to properly define a rough solution is quite a subtle one in general, particularly for nonlinear equations.

For wave equations, there is some discussion of this in Sogge’s book “nonlinear wave equations”; I also discuss this a little in my own PDE book.

11 June, 2009 at 6:50 pm

student

Thank you very much Prof Tao, you are really very kind person. you are always trying to help everyone as much as you can….

thanks again….

2 August, 2009 at 12:56 pm

George

Why isn’t the following a counterexample to exercise 2 section 1?

Assume that d=1 and the limit function f is identically 0.

Take a positive bump function b(x) with support [-1,1] . And let f_n(x) be b(x/n)/n.

The support of f_n is [-n,n] (so that the union of supports is all R).

And on each compact K, (f_n)^(j) tends uniformly to 0 for each j.

3 August, 2009 at 5:41 am

Terence Tao

Dear George,

This sequence does not converge to zero in the topology. Indeed, consider the set of all f in such that (say) for all x. It is not hard to see that the restriction of this set to is open in for every compact K, so this set is an open neighbourhood of the origin in the final topology of ; however, it does not absorb the sequence .

One can modify this argument to show that in order to get convergence in this topology, the supports of the must be uniformly bounded (indeed, this is the bulk of the work in establishing Exercise 2).

3 August, 2009 at 8:57 am

George

Dear Prof. Tao,

Thank you for your explanation.

7 January, 2010 at 8:42 pm

Mean field games « What’s new

[…] will evolve as time goes forward. There are several ways to find the answer, but we will take a distributional viewpoint and test the density against various test functions – smooth, compactly supported […]

18 January, 2010 at 6:29 pm

254A, Notes 3b: Brownian motion and Dyson Brownian motion « What’s new

[…] the sense of (tempered) distributions (see e.g. my earlier notes on this topic). In other words, is a (tempered distributional) solution to the heat equation (3). […]

2 February, 2010 at 1:35 pm

254A, Notes 4: The semi-circular law « What’s new

[…] of approximations to the identity, and thus converges in the vague topology to (see e.g. my notes on distributions). Thus we see […]

5 March, 2010 at 1:46 pm

2010 Mar 消磨时间 « 逝去日子

[…] then use the fundamental theorem of calculus to strengthen the topology. Another is to use Exercise 5 and then discretise the convolution. Another is to embed into a torus and use Fourier series, […]

19 September, 2010 at 7:22 pm

245A, Notes 2: The Lebesgue integral « What’s new

[…] functions are all dense subsets of with respect to the (semi-)metric. Much later in the course (in 245C), we will see that a similar statement holds if one replaces continuous, compactly supported […]

16 October, 2010 at 8:29 pm

245A, Notes 5: Differentiation theorems « What’s new

[…] and its derivative is continuous, then we say that is continuously differentiable. Remark 1 Much later in this sequence, when we cover the theory of distributions, we will see the notion of a weak derivative or […]

8 November, 2010 at 7:46 pm

245A, Notes 4: Modes of convergence « What’s new

[…] (not to be confused with convergence in the sense of distributions, which we will study later in this sequence) is commonly used in probability; but, as the above exercise demonstrates, it is quite a weak […]

16 December, 2010 at 4:30 am

245A, Notes 2: The Lebesgue integral « mathTHÍCHinTOÁNmyHỌCbrain

[…] functions are all dense subsets of with respect to the (semi-)metric. Much later in the course (in 245C), we will see that a similar statement holds if one replaces continuous, compactly supported […]

30 April, 2011 at 7:29 pm

Anonymous

In Exercise 23, I used integration by parts and finally get

How can I get rid of the first term on the right hand side? Since , one may get . But how to deal with the part?

1 May, 2011 at 12:02 am

Terence Tao

Try the mean value theorem.

23 September, 2011 at 12:11 am

fedfue

A technical question: Don’t you need that the topology on $C_c^\infty(\R^d)$ be defined as a locally convex topology (and not simply as the final topology)? My question is because Rudin and other books have this requirement, but I’m not really sure. If not, why do these books make this more complex construction, when a much simpler and natural one can be defined (as done here)?

16 December, 2011 at 10:14 am

254B, Notes 3: Quasirandom groups, expansion, and Selberg’s 3/16 theorem « What’s new

[…] See for instance this blog post for a very brief introduction to Riemannian geometry, and these two previous posts for an introduction to distributions and Sobolev […]

29 February, 2012 at 4:49 pm

Rex

For Exercise 5, it seems to me that, because of the fact that is not compact, we need something a little stronger in the definition of “approximations to the identity” than just uniformly converging to zero away from the origin for the convolution to converge pointwise to . Perhaps we need to require that the support of is also shrinking to the origin as well.

Here is the situation I am worried about: suppose that is a sequence of kernels which actually has increasing support as goes to infinity. We can arrange to have a big “spike” of width centered at the origin, but also a long “tail” of height along the interval of length centered at the origin. Thus is supported on an interval of length . For most of this interval, (which is close to zero), with the exception of the spike in the center. Set up correctly, we should be able to make into a sequence of approximations of the identity satisfying the definition you’ve given.

However, suppose we now choose a continuous such that and increases very rapidly as moves away from the origin. So rapidly, in fact, that if we convolve with , then the “tail” portion of the function collects a very large contribution to the integral because is very large when is around . Then it seems that will not even converge to .

This can’t happen if has compact support, but in Exercise 5, is only assumed to be continuous.

[Thanks, I’ve altered the definition of approximation to the identity suitably. -T]

17 July, 2012 at 6:19 pm

zuchongzhi

Professor, what is a good way to define the convolution of multiple distributions? Do we need at least one of them having compact support? I have been thinking about this for a while but feel uncertain what is the best way to define it. The textbook (Friedlander) used the notion of proper maps, but unfortunately his ‘proof’ has a technical flaw that made the support of convolutions of distributions closed but could be non-compact. So I am looking for an alternative.

17 July, 2012 at 7:06 pm

zuchongzhi

Sorry I made a mistake, Friedlander is right.

8 March, 2013 at 6:39 am

Sriyan Wickramasuriya

Dear Professor Tao,

Your notes are just brilliant! When we restrict the metric defined on Schwartz Space(pg 204 of Mike Taylor’s PDE book) to C_c^{infinity}(R^n) we do not get a complete metric topology and therefore we put the inductive limit topology on C_c^{infinity}(R^n). But this inductive limit topology is not metrizable and the proof of this fact involves Baire’s Category theorem.

Is there any way we prove that the inductive limit topology on C_c^{infinity}(R^n) is not metrizable without using the Baire’s Category Theorem?

8 March, 2013 at 8:31 am

Terence Tao

See exercise 3 (metrizable topologies are necessarily first countable).

8 March, 2013 at 6:38 pm

Sriyan Wickramasuriya

Dear Professor Tao,

Thank you very very much!

Best Wishes,

Sriyan

20 May, 2013 at 6:34 pm

Sara

Respected Terence Tao,

How will prove that sin(nx)—>0 as n goes to infinity in D'(R),

but sin^2(nx) –/->0 as n goes to infinity.

That is, multiplication of distn. is not a continuous operation even it is defined.

4 July, 2013 at 6:56 pm

Debdeep

Dear Prof. Tao

I want to construct a function f which is in Holder \alpha space for some \alpha > 0 and compactly supported such that the tail of the L_2 integral of the Fourier transform is lower bounded by |x|^{-\beta} for some \beta i.e.

int_{|t| > x} |\hat{f}(t)|^2 dt > = |x|^{-\beta}

What is the smallest \beta (related to \alpha) (tightest lower bound) we can achieve?

Can \beta = \alpha achievable? Thanks for your reply

11 March, 2014 at 8:23 pm

Federico

Dear Professor Tao,

These notes are great! However, I’m having a hard time to believe (and prove) the last part of exercise 3. You seem to suggest that the space of test functions is sequential with that topology (despite not being first countable). However I do not think this is true. A question I asked here and an article by Dudley seem to suggest otherwise. Indeed, sequential continuity is enough for proving continuity only in the case of linear maps, but to me this does not imply the space itself is sequential. Naturally, I could be wrong, but could you please clarify this subtle point?

11 March, 2014 at 10:48 pm

Terence Tao

Yes, you’re right; this issue had been pointed out to me before but I had neglected to update the exercise. But now I’ve altered it to just assert that continuity is equivalent to sequential continuity (which is actually relatively easy to prove, and is what is actually needed in applications).

12 March, 2014 at 12:26 pm

Federico

Thanks for the update Professor Tao! It is a relief. Just to finish clearing things up in my head: Isn’t sequential continuity (for the space of test functions) equivalent to continuity only in the case of *linear* maps to another locally convex topological vector space? Or is it true for arbitrary maps as well?

12 March, 2014 at 1:26 pm

Terence Tao

Linearity is not required (in fact the range can be an arbitrary topological space). If is given the final topology from topological spaces , then a map to any topological space is continuous if and only if its restrictions to each are continuous. From this it is not difficult to see the equivalence of continuity and sequential continuity for maps from to any topological space.

12 March, 2014 at 1:49 pm

Federico: Thank you for your reply! Yes, I agree with you regarding the final topology. Indeed this was my first approach to the exercise. But isn’t the test function space an inductive limit in the category of locally convex topological vector spaces (which is different from the typical inductive limit, which in turn is equivalent to the final topology induced by these identity maps)? I mean, isn’t the test function topology the finest (largest) *locally convex* topology on which makes the identity maps continuous as opposed to simply the final topology induced by the identity maps ? I think this subtle difference in the inductive limit topology (of LF spaces in general, I believe) might make a difference in regard to the sequential properties of the test function space. This is the true source of my doubt.

12 March, 2014 at 4:31 pm

Terence Tao: You’re right, I had not realised this subtlety between the two different notions of a final topology. I’ve modified the text (using the locally convex final topology rather than just the plain final topology), and hopefully all these issues are fixed now…

12 March, 2014 at 5:04 pm

Federico: Professor Tao, I really appreciate your answers very much! Thank you for your attention to this issue and for making this clearer for everyone.

30 March, 2014 at 9:48 pm

Anonymous: Dear Prof. Tao,

I am struggling to solve all the exercises from Section 1. I am pretty sure I solved Exercises 1 through 5, but I am lost on Exercise 6. There I am trying to use Exercise 5, but I don’t understand what you mean by “discretise the convolution”, or how one can actually do so.

Thanks!

31 March, 2014 at 8:53 am

Terence Tao: If one approximates an integral by a Riemann sum, one can approximate $f*\phi_n$ by a finite linear combination of translates of $\phi_n$, which is one way to approach separability. (There are many others.)

31 March, 2014 at 10:05 pm

Anonymous: Thanks, Prof. Tao!

Yes, I know $f*\phi_n$ is infinitely differentiable, but if we discretise with a Riemann sum, then the Riemann sum is not going to be in $C_c^\infty$. How do we make this precise?

Thanks in advance!

31 March, 2014 at 10:30 pm

Terence Tao: If one replaces the convolution with a Riemann sum approximant , one will obtain a function in .
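To make the Riemann-sum idea concrete, here is a minimal numerical sketch in one dimension. The tent function `f`, the bump `phi`, the mesh `h = 0.01` and all helper names are illustrative assumptions, not taken from the notes. The point is only that the approximant is a finite linear combination of translates of a test function, and hence is itself smooth and compactly supported.

```python
import math

def phi(x):
    """Smooth bump supported in [-1, 1] (a standard test function)."""
    return math.exp(-1.0 / (1.0 - x * x)) if abs(x) < 1.0 else 0.0

def f(y):
    """Continuous, compactly supported 'tent' function (illustrative choice)."""
    return max(0.0, 1.0 - abs(y))

def riemann_coefficients(h):
    """Nodes y_k = k*h in [-1, 1] paired with weights c_k = h * f(y_k)."""
    m = int(round(1.0 / h))
    return [(k * h, h * f(k * h)) for k in range(-m, m + 1)]

def riemann_approximant(x, coeffs):
    """S(x) = sum_k c_k * phi(x - y_k): finitely many translates of phi."""
    return sum(c * phi(x - y) for y, c in coeffs)

def convolution_reference(x, n=20000):
    """Fine midpoint-rule value of (f * phi)(x) = int f(y) phi(x - y) dy."""
    step = 2.0 / n
    total = 0.0
    for j in range(n):
        y = -1.0 + (j + 0.5) * step
        total += f(y) * phi(x - y)
    return step * total

coeffs = riemann_coefficients(0.01)
sample_points = (-1.5, -0.5, 0.0, 0.3, 1.2)
max_error = max(
    abs(riemann_approximant(x, coeffs) - convolution_reference(x))
    for x in sample_points
)
print("largest discrepancy at the sample points:", max_error)
```

Since each translate of `phi` vanishes outside an interval of length 2 and the nodes lie in `[-1, 1]`, the approximant vanishes identically outside `[-2, 2]`.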

2 April, 2014 at 11:28 pm

Anonymous: Thank you very much, Professor Tao, for your guidance.

22 April, 2014 at 10:38 pm

Anonymous: Dear Dr. Tao, I am not sure what you mean by ‘dualise’ the Fundamental Theorem of Calculus in Exercise 25 (iii). Would you mind giving me more details?

Thanks in advance!

23 April, 2014 at 7:37 am

Terence Tao: As a model case, try k=0 and assume the “cheat” that the antiderivative of a test function supported on an interval is again a test function supported on that interval. (In this case, the additional distribution would not be required.) The fundamental theorem of calculus, which guarantees the existence of an antiderivative for any test function, is what will be needed to construct in this case.

24 April, 2014 at 1:09 am

Anonymous: So, do you mean to define where ?

Then order of not gonna be , right?

1 May, 2014 at 8:13 pm

Anonymous: Professor T. Tao,

I am teaching from your notes for the first time, and I am gaining a better understanding along with my students. But I have almost no knowledge of PDE, and we are struggling to solve Exercise 40. So it would be a great help if you could give us some direction on this exercise.

Thanks in advance!

3 May, 2014 at 9:36 pm

Anonymous: Dear Dr. Tao,

In Exercise 40, is it $t/4\pi$ or $1/4\pi t$?

3 May, 2014 at 10:43 pm

Terence Tao: It is $t/4\pi$, due to the rescaling of the integral to the unit sphere, rather than the sphere of radius t.
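The rescaling can be written out as follows. This is a sketch under the assumption that the integral in question is a spherical average of a function, here called $g$ (a hypothetical name), over the sphere of radius $t$ centred at $x$ in ${\bf R}^3$, with $dS$ denoting surface measure and $d\omega$ surface measure on the unit sphere $S^2$:

```latex
% Surface measure on the sphere of radius t is t^2 times that of the unit sphere,
% via the substitution y = x + t\omega with \omega \in S^2.
\[
  \frac{1}{4\pi t} \int_{|y-x|=t} g(y)\, dS(y)
  = \frac{1}{4\pi t} \int_{S^2} g(x+t\omega)\, t^2 \, d\omega
  = \frac{t}{4\pi} \int_{S^2} g(x+t\omega)\, d\omega .
\]
```

So the factor is $1/4\pi t$ when integrating over the sphere of radius $t$, and $t/4\pi$ once the integral is taken over the unit sphere.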

4 May, 2014 at 8:12 pm

Anonymous: Thanks, Dr. Tao, for your response.

I again have a couple of confusions, and it would be great if you could clarify these too.

For Exercises 38 and 39, in most PDE textbooks I found just {e^{-|x|^2/4t}} and {e^{i|x|^2/4t}} respectively, instead of {e^{-|x-y|^2/4t}} and {e^{i|x-y|^2/4t}}. Is there any special reason for subtracting {y} from {x}?

In Exercise 40, you said just wave equation {-\partial_{tt} u + \Delta u}. Do you mean {-\partial_{tt} u + \Delta u =0}?

Thanks in advance!

[Corrected, thanks – T.]

8 May, 2014 at 10:22 pm

Anonymous: Dear Dr. Tao,

In the process of solving Exercise 40, how do we define the Fourier transform of a tempered distribution? For example, let \lambda be a tempered distribution; is it (F\lambda)(f)=\lambda(Ff) or (F\lambda)(f)=\lambda(F^{-1}f), where F denotes the Fourier transform?

9 May, 2014 at 7:25 am

Terence Tao: See equation (5).

9 May, 2014 at 8:09 pm

Anonymous: Dear Dr. Tao,

Are we supposed to use Kirchhoff’s formula to get the distribution K_{t} in Exercise 40?

Thanks.

14 May, 2014 at 8:13 pm

Anonymous: Dear Dr. Tao,

In Exercise 40, what is the meaning of “Newtonian Sense” and how to use it in solving this Exercise?

Thanks!

14 May, 2014 at 8:28 pm

Terence Tao: http://en.wikipedia.org/wiki/Newton_quotient

14 May, 2014 at 10:47 pm

Anonymous: Just a minor comment for correction: in Exercise 40, isn’t it “… for some Schwartz functions {f} and {g} …” instead of “… for some Schwartz functions {f} is given by the formula …”?

[Corrected, thanks – T.]

20 August, 2015 at 5:04 am

KE operator and eigenfunctions: […] I could explain what's going on but it will involve a long sojourn into distribution theory: https://terrytao.wordpress.com/2009/04/19/245c-notes-3-distributions/#more-2072 Really your teacher needs to explain it – post here with what he/she says – it should prove […]

26 August, 2015 at 6:52 am

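Anonymous: In some books, a distribution on ${\bf R}^d$ is defined as a linear functional $\lambda: C^\infty_c({\bf R}^d) \to {\bf C}$ with the following property:

Suppose $\phi_n$ is a sequence in $C^\infty_c({\bf R}^d)$ such that there exists a compact subset $K$ of ${\bf R}^d$ with $\mathrm{supp}(\phi_n) \subset K$ for all $n$, and $\partial^\alpha \phi_n \to 0$ uniformly as $n \to \infty$ for all multi-indices $\alpha$. Then $\lambda(\phi_n) \to 0$ as $n \to \infty$.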

Is it exactly the same as the distributions defined in this note?

[Yes; see Exercise 4(iii). -T.]

27 August, 2015 at 10:52 am

Anonymous: What is the weak derivative of the distribution defined in Exercise 12? It seems that p.v.(1/x^2) is not well defined…

27 August, 2015 at 11:17 am

Terence Tao: If one chases the definitions and integrates by parts, one has

So the derivative of p.v. 1/x is a renormalised principal value of -1/x^2.
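A sketch of the kind of computation being described, with $\phi$ denoting an arbitrary test function on the real line (the notation here is an assumption, and this is one way to carry out the calculation rather than necessarily the exact display intended):

```latex
\begin{align*}
\Big\langle \big(\mathrm{p.v.}\,\tfrac{1}{x}\big)', \phi \Big\rangle
  &= - \Big\langle \mathrm{p.v.}\,\tfrac{1}{x}, \phi' \Big\rangle
   = - \lim_{\varepsilon \to 0} \int_{|x| > \varepsilon} \frac{\phi'(x)}{x}\, dx \\
  % integrate by parts on each of (-\infty,-\varepsilon) and (\varepsilon,\infty)
  &= \lim_{\varepsilon \to 0} \left( \frac{\phi(\varepsilon) + \phi(-\varepsilon)}{\varepsilon}
     - \int_{|x| > \varepsilon} \frac{\phi(x)}{x^2}\, dx \right) \\
  % Taylor expansion: \phi(\varepsilon) + \phi(-\varepsilon) = 2\phi(0) + O(\varepsilon^2)
  &= \lim_{\varepsilon \to 0} \left( \frac{2\phi(0)}{\varepsilon}
     - \int_{|x| > \varepsilon} \frac{\phi(x)}{x^2}\, dx \right) \\
  % use \int_{|x| > \varepsilon} x^{-2}\, dx = 2/\varepsilon
  &= - \lim_{\varepsilon \to 0} \int_{|x| > \varepsilon} \frac{\phi(x) - \phi(0)}{x^2}\, dx .
\end{align*}
```

The subtraction of $\phi(0)$ is the renormalisation: without it the integral of $\phi(x)/x^2$ over $|x| > \varepsilon$ diverges like $2\phi(0)/\varepsilon$.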

27 August, 2015 at 12:10 pm

Anonymous: Hmm, how can one get the last line?

[Taylor expansion -T.]

28 August, 2015 at 7:21 am

Anonymous: It seems that this relates to https://en.wikipedia.org/wiki/Hadamard_regularization

But I don’t see why the limit exists…

28 August, 2015 at 7:35 am

Anonymous: Can one say that the limit in the last line exists since it is “equal” to , the existence of which has been proven in Ex. 12? It looks like a circular argument to me since in the very beginning one assumes the existence of .

[Yes, the existence of the earlier limits (which indeed follows from the easily verified fact that the distributional derivative of a distribution is again a distribution, together with Exercise 12) can be used to establish the existence of the later limits. It is also instructive to establish (again, using Taylor expansion) directly that the later limit is of a Cauchy sequence and thus convergent. -T.]

29 August, 2015 at 1:25 pm

Anonymous: I read this somewhere before, but I don’t remember exactly in which book:

Let $(a_n)$ be a strictly decreasing (I don’t remember if it is increasing or decreasing) positive sequence such that . If we define

where , then converges (uniformly) to a function in .

Has anybody here seen a reference for such constructions of a nontrivial test function before? It seems that Exercise 1(i) is more standard.

18 September, 2015 at 1:25 pm

Anonymous: Stop cheating on math M541 homework.

2 November, 2015 at 7:06 pm

275A, Notes 4: The central limit theorem | What's new: […] in the sense of distributions that arises in distribution theory (discussed for instance in this previous blog post), however strictly speaking the two notions of convergence are distinct and should not be confused […]

9 November, 2015 at 6:52 am

Anonymous: When constructing an approximation to the identity, is there a particular reason that one chooses a sequence instead of a continuous version ?

9 November, 2015 at 7:00 am

Anonymous: In Exercise 5(ii), does one need to *assume* that first in order to argue that “ converges in to “?

9 November, 2015 at 10:25 am

Terence Tao: No, this is implicitly part of the conclusion (but this follows easily from Young’s inequality or Minkowski’s inequality).

If one only assumes local integrability or on , then one needs compact support of the approximating sequence , as otherwise the tail of could interact with an arbitrarily large growth of and it is not even immediate that is anywhere finite, let alone convergent to anything interesting.

In most applications there is not much difference between using a discrete sequence of approximations rather than a continuous sequence, but using a countable discrete sequence makes it essentially trivial to establish measurability of any limit objects obtained, and also there are some sequential compactness results one can exploit when working with a discrete sequence that are not easily available in the continuous setting. (Of course in many situations one can use topological compactness as a substitute for sequential compactness, e.g. use topological Banach-Alaoglu in place of sequential Banach-Alaoglu.)
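For the first point, a one-line sketch of why membership in $L^p$ comes for free. This assumes the question is whether $f * \phi_n \in L^p$ must be assumed separately when $f \in L^p$ with $1 \le p \le \infty$ and $\phi_n$ is an approximation to the identity (non-negative, total mass one); the symbols are chosen here for illustration:

```latex
% Minkowski's integral inequality, followed by translation invariance of the L^p norm.
\[
  \| f * \phi_n \|_{L^p}
  = \Big\| \int f(\cdot - y)\, \phi_n(y)\, dy \Big\|_{L^p}
  \le \int \| f(\cdot - y) \|_{L^p}\, |\phi_n(y)|\, dy
  = \| f \|_{L^p} \, \| \phi_n \|_{L^1}
  = \| f \|_{L^p} .
\]
```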

9 November, 2015 at 7:12 am

Anonymous: If is locally integrable or more generally , can one have the similar fact in Exercise 5?

9 November, 2015 at 11:55 am

Anonymous: In Exercise 1(iv), what is ? I’ve looked it up in your textbook and I didn’t find it there.

[See Exercise 1.10.7 (or Exercise 7 of 245B Notes 12), or also Exercise 1.10.17, Remark 1.10.16, etc.]

9 November, 2015 at 12:03 pm

Anonymous: In Exercise 9,

“if a sequence of locally integrable functions converge in to a limit…”

what is the topology defined for ?

[The topology of is the topology generated by the seminorms for compact , as in Example 1.9.5. -T]

30 November, 2015 at 5:49 pm

Anonymous: Can one safely replace with any open set in this note to get the theory of distributions on ?

30 November, 2015 at 6:53 pm

Terence Tao: The theory of distributions localises fairly easily (basically just replace with ). But the theory of *tempered* distributions is significantly more difficult to work with on domains, because it is not obvious how to define the Schwartz class on a domain (and because the main tool that makes the tempered distribution concept useful, namely the distributional Fourier transform, is not obviously available).

9 March, 2016 at 12:26 pm

Anonymous: I’m confused about distributional derivatives in the setting of linear PDE. When one uses the formula

to define , is just the “formal adjoint” of ? For general , when one works in a boundary value problem in PDE (say Dirichlet problem or the Neumann problem for the Laplacian equations), should the adjoint depend also on the boundary conditions so that one has different definitions of the distributional derivative of the same differential operator ?

9 March, 2016 at 1:58 pm

Terence Tao: For distributions on a domain , the test functions involved are in , so they vanish to infinite order on the boundary. In particular, the precise notion of adjoint used is not relevant, as they will all agree on these test functions; in particular, the formal adjoint suffices. But one certainly has to take boundary issues into account when attempting to extend a distribution on to all of ; the situation here is more subtle than the corresponding situation with measurable functions, in which one can simply extend the function by zero outside of the domain. In general, the extension operation on distributions is not unique, which is related to the non-uniqueness of the adjoint operator when applied to more general functions than functions.

31 December, 2015 at 2:39 pm

Anonymous: Given a distribution , can we in general find a distribution with as its distributional derivative?

[Yes; this is a good exercise for you to establish. The key point is that the test functions of mean zero form a hyperplane in the space of all test functions, in that any test function can be made mean zero by subtraction of a scalar multiple of a fixed reference test function. -T.]

11 May, 2016 at 4:48 pm

Anonymous2: A result due to de Rham says the following:

Let , , be distributions, where is a domain in . Then for some iff for all with .

Does this contradict the positive answer to the question above?

11 May, 2016 at 5:57 pm

Terence Tao: No; the previous question concerned only the one-dimensional case (or of a single partial derivative, rather than the full gradient), in which case the condition forces to vanish identically.
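For the one-dimensional question, the hyperplane remark can be turned into a construction along the following lines. This is a sketch, not necessarily the intended solution; the names $\lambda$ (the given distribution), $\mu$ (the antiderivative being built), $\psi$ (the reference test function) and the pairing notation $\langle \cdot, \cdot \rangle$ are assumptions made for illustration:

```latex
% Fix a reference test function \psi on the real line with \int \psi = 1.
% For a test function \phi, subtract the multiple of \psi that makes the mean zero,
% and take the antiderivative:
\[
  \Phi(x) := \int_{-\infty}^{x} \Big( \phi(s) - \Big( \int_{\bf R} \phi \Big) \psi(s) \Big)\, ds .
\]
% \Phi is smooth and compactly supported, because the integrand is a test function of mean zero.
% Define the candidate antiderivative \mu of \lambda by
\[
  \langle \mu, \phi \rangle := - \langle \lambda, \Phi \rangle .
\]
% If \phi = \eta' for a test function \eta, then \int \phi = 0 and \Phi = \eta, so
\[
  \langle \mu', \eta \rangle = - \langle \mu, \eta' \rangle = \langle \lambda, \eta \rangle .
\]
```

To finish, one still has to check that $\phi \mapsto \Phi$ is linear and continuous on the space of test functions, so that $\mu$ is indeed a distribution.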

20 January, 2016 at 11:50 am

Anonymous: How do we see that ” embeds ‘continuously’ into ”

[Easiest way is via nets – show that every net that converges in also converges in the Schwartz topology. Or one can explicitly show that pullbacks of basic open sets are open. -T.]

17 February, 2016 at 5:01 pm

Debanjana Kundu: I was wondering if when we’re in and we know that , also ; for a test function if is not in the support of can we say for all ?

10 March, 2016 at 3:09 am

Anonymous: i am in grade 7 and i knew about your iq

30 March, 2016 at 10:16 am

Anonymous: In Exercise 5(i), I get

and

. I want to do the estimate

But I don’t see how to go on. Do you have a hint?

30 March, 2016 at 10:37 am

Terence Tao: is continuous and compactly supported, and hence uniformly continuous. is mostly concentrated in the region where is small, which by uniform continuity lets one place a bound on . So one should chop up the integral into two pieces, one where is small and one where is large, and use two different estimates for each component.

30 March, 2016 at 1:56 pm

jack: Instead of , one needs a bounded for , I think?

Consider the standard symmetric mollifier and . Fix . For large enough , uniformly in

Why do we need "chop up the integral into two pieces"? Am I doing something wrong here?

30 March, 2016 at 2:10 pm

Terence Tao: Thanks for the correction. I had in mind a more general notion of approximation to the identity than the one that is in this post (in which some leakage of mass outside of a small ball is permitted), but you are right that with the notion of approximation to the identity used here, one does not need to decompose the integral.
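A sketch of the one-piece estimate under discussion, assuming $f$ is continuous and compactly supported (hence uniformly continuous) and that each $\phi_n$ is non-negative, has total mass one, and is supported in a ball $B(0, r_n)$ with $r_n \to 0$ (the symbols $f$, $\phi_n$, $r_n$ are named here for illustration):

```latex
% Use \int \phi_n = 1 to write f(x) as an average, then bound the difference
% using only the values of y in the support of \phi_n.
\[
  | f * \phi_n(x) - f(x) |
  = \Big| \int \big( f(x-y) - f(x) \big)\, \phi_n(y)\, dy \Big|
  \le \sup_{|y| \le r_n} | f(x-y) - f(x) | \int \phi_n(y)\, dy
  \le \sup_{x' \in {\bf R}^d,\ |y| \le r_n} | f(x'-y) - f(x') | .
\]
% The right-hand side does not depend on x, and tends to zero as n \to \infty
% by uniform continuity of f.
```

If instead some mass of $\phi_n$ were allowed to leak outside the small ball, the region $|y| > r_n$ would have to be estimated separately, which is where splitting the integral into two pieces comes in.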

4 April, 2016 at 3:56 pm

Anonymous: When one tries to upgrade the convergence to uniform convergence on , the proof above certainly would not work. But do you have a counterexample that one cannot upgrade the compact convergence? Why is compactness crucial here?

30 March, 2016 at 2:14 pm

Anonymous: What can one say in general for Exercise 5? Let for some function space such that makes sense. Can one always expect that in some convergence mode $Y$ (or are there cases that are not convergent in any mode at all)?

31 March, 2016 at 12:28 pm

Anonymous: In Exercise 5(ii), do you have a quick counterexample for ?

31 March, 2016 at 1:26 pm

Terence Tao: Uniform limits of continuous functions are necessarily continuous.

4 April, 2016 at 10:59 am

Anonymous: In Exercise 5(iii), what is the definition of when is a vector-valued function? Perhaps you mean

?

4 April, 2016 at 1:15 pm

Terence Tao: Yes. By default, operations on scalar-valued functions are understood to act componentwise on vector-valued functions unless otherwise stated.

5 July, 2016 at 3:21 pm

Anonymous: Since embeds continuously into (with a dense image), we see that the space of tempered distributions can be embedded into the space of distributions. How “good” is the embedding? In other words, is every tempered distribution locally integrable? Or is it in the “basic two examples of distributions”?

[No. For instance, the Dirac delta distribution is tempered, but not locally integrable. -T.]

30 August, 2016 at 6:33 am

Anonymous: In your definition of the “approximation to the identity”, is the third condition “whose supports shrink to the origin” equivalent to the following condition?

where the limit is understood in the space of Schwartz distributions.

22 September, 2016 at 9:53 pm

246A, Notes 1: Complex differentiation | What's new: […] by the way, can be largely explained using the theory of distributions, as covered for instance in this previous post, but this is beyond the scope of the current […]

15 October, 2016 at 6:31 am

Anonymous: I don’t see in the notes whether the following is true:

the weak derivative, if it exists, must be unique.

I found that I end up with examining the following

if is locally integrable and

for all test functions , then a.e.

I don’t find a proof here. Would you give me a hint on how this can be proved?

15 October, 2016 at 10:48 pm

Terence Tao: Use a limiting argument to show that if is orthogonal to all test functions, then it is also orthogonal to indicator functions of bounded measurable sets. Then argue by contradiction.
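One way to unpack this hint, as a sketch rather than necessarily the intended argument. Here $f$ is the locally integrable function with $\int f \phi = 0$ for all test functions $\phi$, $E$ is a bounded measurable set, and $\phi_n$ is an approximation to the identity (the symbol names are chosen here for illustration):

```latex
% Step 1 (limiting argument). The functions 1_E * \phi_n are test functions taking
% values in [0,1], all supported in one fixed compact set K. Since 1_E * \phi_n \to 1_E
% in L^1, some subsequence n_j converges almost everywhere. The integrands are
% dominated by |f| 1_K, which is integrable, so by dominated convergence
\[
  \int_E f \, dx = \lim_{j \to \infty} \int f \, ( 1_E * \phi_{n_j} )\, dx = 0 .
\]

% Step 2 (contradiction). If f were not zero almost everywhere, then for some R > 0
% at least one of the four bounded sets
\[
  \{ |x| < R : \pm \mathrm{Re}\, f(x) > 0 \}, \qquad \{ |x| < R : \pm \mathrm{Im}\, f(x) > 0 \}
\]
% would have positive measure. Taking E to be that set, the real or imaginary part of
% \int_E f would be non-zero, contradicting Step 1.
```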

1 February, 2018 at 8:27 am

Anonymous: Would anyone elaborate on the hint? (1) Where does this set of notes show that the set of test functions is dense in the set of indicator functions of bounded measurable sets? (2) What does the “contradiction” mentioned above refer to?

17 February, 2017 at 9:57 pm

254A, Notes 2: The central limit theorem | What's new: […] in the sense of distributions that arises in distribution theory (discussed for instance in this previous blog post), however strictly speaking the two notions of convergence are distinct and should not be confused […]

2 August, 2017 at 5:42 am

LR: Dear Prof. Tao,

I suggest you take a look at “Multiplication of the Distributions”, by Colombeau, which has some thoughts about applying it to nonlinear problems and the “Impossibility of multiplication” cited by Schwartz.

11 February, 2018 at 4:45 am

Sébastien: In your excellent PCM note on distributions, you mention that distributions can be composed both ways by suitably smooth functions. Was this a typo or not? Indeed, here and on the internet, I can only find composition in the “apply the true function first” order, and whenever people speak of the exponential of the Gaussian Free Field, they insist on the “one cannot a priori take the exponential of a distribution” issue, so I am confused now.

11 February, 2018 at 11:00 am

Terence Tao: Oops, that is indeed a typo. Distributions can be composed on the right with smooth functions but not on the left in general.

11 February, 2018 at 12:00 pm

Sébastien: Thank you very much!

19 March, 2018 at 8:49 am

Anonymous: Probably too late but I am curious. Would you define |delta(x)|=delta(x)?

I think there might be reasons for doing it but also reasons for not doing it.

19 March, 2018 at 6:26 pm

Terence Tao: Sure; one can for instance view this as a (somewhat degenerate) special case of the concept of a total variation of a measure. In general, though, it is only the measures of finite total variation for which one has a meaningful absolute value; more general distributions, such as the (distributional) derivative of , do not have any particularly useful notion of an absolute value.

20 March, 2018 at 2:41 am

Anonymous: [sorry I am messing up with latex]

Thank you very much for your answer. What bothers me is that the reasoning used to explain why we do not define \delta^2 also might be invoked for |\delta|. If one takes the functions f_n(x)= n*c_n*\mbox{sinc}(n*x)/\mbox{rect}(x)

where c_n is an appropriate constant, then it seems to me that f_n\to \delta, but |f_n| does not converge. Am I missing something here?

20 March, 2018 at 7:42 am

Terence Tao: I do not know what the rect(x) function is, but it is certainly the case that the operation (as defined on “nice” functions) is not continuously extendible to the space of all distributions, much as is the case with . So there is no meaningful notion of the absolute value of an *arbitrary* distribution. However, if one restricts to the subclass of distributions that are measures of finite total variation, then becomes a continuous operation (using now the total variation topology), and so one can define for in this class. (The sequence you provide will likely not converge in the total variation topology, but only in the distributional topology.)

20 March, 2018 at 3:12 pm

Anonymous: Thank you very much, I think I understand although I am a bit surprised by this idea of considering a subclass of measures of finite total variation. I wrote the functions in a wrong way but I meant to multiply n*sinc(n*x) by the indicator function of an interval say like [-1,1] and rescale appropriately to have integral =1. But actually even n*sinc(n*x) would have worked for my concern.

I thought (I am not a professional mathematician) one should usually require nice behaviour over any converging sequence in the distributional topology. I will think more about it.

– Actually, now I am puzzled by ; if I consider again n*sinc(n*x) then this converges to in distribution and its square does converge to \delta too… so we might consider L_2 and define \delta^2?… mm…. ok I don’t want to bother further, you helped already enough, thank you very much again! :)
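The behaviour of n*sinc(n*x) raised in this exchange can be probed numerically. The sketch below is only an illustration under stated assumptions: it takes the normalised sinc, so that f_n(x) = sin(pi n x)/(pi x) has integral 1 over the line and its square has integral n, and it pairs against one particular bump function supported in [-1, 1]. The function names and the midpoint-rule resolution are hypothetical choices.

```python
import math

def sinc_kernel(n, x):
    """f_n(x) = n * sinc(n x) with the normalised sinc, i.e. sin(pi n x) / (pi x)."""
    if x == 0.0:
        return float(n)
    return math.sin(math.pi * n * x) / (math.pi * x)

def bump(x):
    """Test function supported in [-1, 1], with bump(0) = exp(-1)."""
    return math.exp(-1.0 / (1.0 - x * x)) if abs(x) < 1.0 else 0.0

def midpoint(g, a=-1.0, b=1.0, steps=20000):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / steps
    return h * sum(g(a + (j + 0.5) * h) for j in range(steps))

def pairing(n):
    """Approximates the pairing of f_n with the bump function."""
    return midpoint(lambda x: sinc_kernel(n, x) * bump(x))

def abs_mass(n):
    """Approximates the integral of |f_n| over [-1, 1]."""
    return midpoint(lambda x: abs(sinc_kernel(n, x)))

def square_pairing(n):
    """Approximates the pairing of f_n squared with the bump function."""
    return midpoint(lambda x: sinc_kernel(n, x) ** 2 * bump(x))

for n in (4, 16, 64):
    print(n, pairing(n), abs_mass(n), square_pairing(n))
```

In this experiment the first column of values approaches bump(0), consistent with convergence to the delta function against this test function, while the mass of |f_n| on [-1, 1] and the pairing of the unnormalised square both keep growing with n. With this normalisation it is the rescaled square f_n^2 / n, rather than f_n^2 itself, that has total integral 1.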

28 April, 2018 at 5:55 pm

Anonymous: Since Wolff’s notes are used in 271A, I think my following question is related to this post.

In an estimate, the author gets (the appendix of Chapter 2 The Schwartz Space) a pointwise bound:

where

But what one really needs is the bound with instead of . It is said in Wolff’s notes that one can simply do it using rescaling. Can any one elaborate how the rescaling is done?

28 April, 2018 at 5:59 pm

Anonymous: The original goal is to prove by induction that

with where .

The course should be MATH 247A: http://www.math.ucla.edu/~tao/247a.1.06f/

29 April, 2018 at 1:32 pm

Terence Tao: Apply the previous bound to the function .

29 April, 2018 at 1:50 pm

Anonymous: In the first comment, it should be “But what one really needs is the bound with instead of .”

I find that I messed up with the change of variables calculations. In the simplest case, one has

and hence

Thanks!

12 May, 2018 at 3:22 pm

Anonymous: Considering both Exercises 1 and 5, I still can’t see how to answer the question in https://terrytao.wordpress.com/2009/04/06/the-fourier-transform/#comment-498013

If one uses an approximation to the identity, , then

in both and for . This seems to be quite close to what I want. But there is no guarantee that

On the other hand, since , there exists so that . So one can approximate by

where

.

Since as well, similarly one has

where . But then one has two different sequences, $\phi_n*h$ and $\phi_n*g$.

12 May, 2018 at 3:32 pm

Anonymous: In the first case, I have already had

in both the and the topologies. What I want is

where .

But it is known that . If one can control the decay (and retain the differentiability) of to get , then the proof would be complete…

13 May, 2018 at 7:45 am

Terence Tao: In addition to convolving with a mollifier, one should also multiply by a smooth cutoff such as to obtain compact support (which, together with the smoothness provided by the mollifier, place one in the Schwartz class without difficulty).

5 July, 2018 at 8:39 pm

Josh Chen: Is there any reference or literature you might point to for generalized functions/distributions/Schwartz functions in infinite dimensions? It’s interesting to consider tempered distributions rather than, for example, Gaussian measures for PDEs with random coefficient fields or time stochastic PDE processes.

13 August, 2018 at 3:24 am

Daniel Pires: Hi Professor Tao! I was trying to solve this one problem and I thought it was related to these notes:

if , then

(here denotes the inverse Fourier Transform).

Do you have any hint for how can I solve this?

13 August, 2018 at 7:33 am

Terence Tao: Basically one needs some decay estimates on the Fourier transform (it turns out in this case that it decays like ). One can see this for instance by dyadic decomposition (splitting into a bunch of rescaled bump functions). In this particular case there may also be some exact formulae that could be helpful (there are some computations of related kernels for instance in Stein’s “singular integrals”, possibly also in Stein-Weiss’s “Fourier analysis on Euclidean spaces”).

3 September, 2018 at 3:13 pm

254A, Notes 0: Physical derivation of the incompressible Euler and Navier-Stokes equations | What's new: […] but we will adopt the viewpoint of the theory of distributions (as reviewed for instance in these old lecture notes of mine) and consider approximation against test functions in spacetime, thus we assume […]

2 October, 2018 at 3:47 pm

254A, Notes 2: Weak solutions of the Navier-Stokes equations | What's new: […] some key aspects of the theory. A more comprehensive discussion of distributions may be found in this previous blog post. To avoid some minor subtleties involving complex conjugation that are not relevant for this post, […]

26 October, 2018 at 12:21 am

Rajnikant Snha: Dear Professor Tao,

On p. 172, line 5, of Rudin’s Functional Analysis, there is a limit in the space of test functions. It seems wrong to me. Can you supply an intelligible proof?

With thanks in advance.

28 November, 2018 at 1:58 pm

254A, Notes 3: Local well-posedness for the Euler equations | What's new: […] some key aspects of the theory. A more comprehensive discussion of distributions may be found in this previous blog post. To avoid some minor subtleties involving complex conjugation that are not relevant for this post, […]

7 February, 2019 at 7:14 am

Laszlo: Does the convolution algebra of compactly supported distributions on $R^n$ have any delicate algebraic properties (like being a Noetherian ring, etc.)?

7 February, 2019 at 8:34 am

Terence Tao: I did a bit of poking around and found this recent paper by Vogt on the subject: https://mathscinet.ams.org/mathscinet-getitem?mr=3858282 . Presumably the references cited in the introduction will describe the current state of knowledge in this direction.

7 February, 2019 at 11:11 am

Laszlo: Thx!

28 June, 2019 at 4:57 pm

acx01bc: You should mention which works everywhere even for things such as , the only point is that the limit doesn’t need to converge in the sense of distributions.