As discussed in previous notes, a function space norm can be viewed as a means to rigorously quantify various statistics of a function {f: X \rightarrow {\bf C}}. For instance, the “height” and “width” can be quantified via the {L^p(X,\mu)} norms (and their relatives, such as the Lorentz norms {\|f\|_{L^{p,q}(X,\mu)}}). Indeed, if {f} is a step function {f = A 1_E}, then the {L^p} norm of {f} is a combination {\|f\|_{L^p(X,\mu)} = |A| \mu(E)^{1/p}} of the height (or amplitude) {A} and the width {\mu(E)}.
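
To spell out the step function computation (a routine check from the definition of the {L^p} norm, assuming here that {1 \leq p < \infty} and that {\mu(E)} is finite):

```latex
\|A 1_E\|_{L^p(X,\mu)}
  = \Big( \int_X |A|^p 1_E \, d\mu \Big)^{1/p}
  = \big( |A|^p \mu(E) \big)^{1/p}
  = |A| \, \mu(E)^{1/p}.
```

Thus doubling the height doubles the norm for every {p}, whereas doubling the width only multiplies it by {2^{1/p}}; the exponent {p} controls how heavily the width is weighted relative to the height.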

However, there are more features of a function {f} of interest than just its width and height. When the domain {X} is a Euclidean space {{\bf R}^d} (or domains related to Euclidean spaces, such as open subsets of {{\bf R}^d}, or manifolds), then another important feature of such functions (especially in PDE) is the regularity of a function, as well as the related concept of the frequency scale of a function. These terms are not rigorously defined; but roughly speaking, regularity measures how smooth a function is (or how many times one can differentiate the function before it ceases to be a function), while the frequency scale of a function measures how quickly the function oscillates (and would be inversely proportional to the wavelength). One can illustrate this informal concept with some examples:

  • Let {\phi \in C^\infty_c({\bf R})} be a test function that equals {1} near the origin, and {N} be a large number. Then the function {f(x) := \phi(x) \sin(Nx)} oscillates at a wavelength of about {1/N}, and a frequency scale of about {N}. While {f} is, strictly speaking, a smooth function, it becomes increasingly less smooth in the limit {N \rightarrow \infty}; for instance, the derivative {f'(x) = \phi'(x) \sin(Nx) + N \phi(x) \cos(Nx)} grows at a roughly linear rate as {N \rightarrow \infty}, and the higher derivatives grow at even faster rates. So this function does not really have any regularity in the limit {N \rightarrow \infty}. Note however that the height and width of this function are bounded uniformly in {N}; so regularity and frequency scale are independent of height and width.
  • Continuing the previous example, now consider the function {g(x) := N^{-s} \phi(x) \sin(Nx)}, where {s \geq 0} is some parameter. This function also has a frequency scale of about {N}. But now it has a certain amount of regularity, even in the limit {N \rightarrow \infty}; indeed, one easily checks that the {k^{th}} derivative of {g} stays bounded in {N} as long as {k \leq s}. So one could view this function as having “{s} degrees of regularity” in the limit {N \rightarrow \infty}.
  • In a similar vein, the function {N^{-s} \phi(Nx)} also has a frequency scale of about {N}, and can be viewed as having {s} degrees of regularity in the limit {N \rightarrow \infty}.
  • The function {\phi(x) |x|^s 1_{x > 0}} also has about {s} degrees of regularity, in the sense that it can be differentiated up to {s} times before becoming unbounded. By performing a dyadic decomposition of the {x} variable, one can also decompose this function into components {\psi(2^n x) |x|^s} for {n \geq 0}, where {\psi(x) := (\phi(x)-\phi(2x)) 1_{x>0}} is a bump function supported away from the origin; each such component has frequency scale about {2^n} and {s} degrees of regularity. Thus we see that the original function {\phi(x) |x|^s 1_{x > 0}} has a range of frequency scales, ranging from about {1} all the way to {+\infty}.
  • One can of course concoct higher-dimensional analogues of these examples. For instance, the localised plane wave {\phi(x) \sin(\xi \cdot x)} in {{\bf R}^d}, where {\phi \in C^\infty_c({\bf R}^d)} is a test function, would have a frequency scale of about {|\xi|}.
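
As a sketch of the computation behind the second example above (using only the Leibniz rule; the constant {C_{k,\phi}} is our notation, and depends on {k} and {\phi} but not on {N \geq 1}):

```latex
g^{(k)}(x) = N^{-s} \sum_{j=0}^k \binom{k}{j} \phi^{(k-j)}(x) \, N^j \sin\Big( Nx + \frac{j\pi}{2} \Big),
\qquad \text{so} \qquad
\sup_{x \in {\bf R}} |g^{(k)}(x)|
  \leq N^{-s} \sum_{j=0}^k \binom{k}{j} \|\phi^{(k-j)}\|_{L^\infty({\bf R})} N^j
  \leq C_{k,\phi} N^{k-s}.
```

The right-hand side stays bounded as {N \rightarrow \infty} exactly when {k \leq s}, which matches the claim that {g} has about {s} degrees of regularity.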

There are a variety of function space norms that can be used to capture frequency scale (or regularity) in addition to height and width. The most common and well-known examples of such spaces are the Sobolev space norms {\| f\|_{W^{s,p}({\bf R}^d)}}, although there are a number of other norms with similar features (such as Hölder norms, Besov norms, and Triebel-Lizorkin norms). Very roughly speaking, the {W^{s,p}} norm is like the {L^p} norm, but with “{s} additional degrees of regularity”. For instance, in one dimension, the function {A \phi(x/R) \sin(Nx)}, where {\phi} is a fixed test function and {R, N} are large, will have a {W^{s,p}} norm of about {|A| R^{1/p} N^s}, thus combining the “height” {|A|}, the “width” {R}, and the “frequency scale” {N} of this function together. (Compare this with the {L^p} norm of the same function, which is about {|A| R^{1/p}}.)
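
For instance, when {s=1} and {1 \leq p < \infty}, the upper bound can be checked by hand. The following is a sketch, assuming the convention {\|F\|_{W^{1,p}({\bf R})} = \|F\|_{L^p({\bf R})} + \|F'\|_{L^p({\bf R})}}; the name {F} is ours.

```latex
\begin{aligned}
F(x) &:= A \phi(x/R) \sin(Nx), \qquad
F'(x) = \frac{A}{R} \phi'(x/R) \sin(Nx) + A N \phi(x/R) \cos(Nx), \\
\|F\|_{L^p({\bf R})} &\leq |A| R^{1/p} \|\phi\|_{L^p({\bf R})}, \qquad
\|F'\|_{L^p({\bf R})} \leq |A| R^{1/p} \Big( \frac{1}{R} \|\phi'\|_{L^p({\bf R})} + N \|\phi\|_{L^p({\bf R})} \Big).
\end{aligned}
```

Since {R} and {N} are large, the term involving {N} dominates, so the {W^{1,p}} norm is at most a constant multiple (depending on {\phi} and {p}) of {|A| R^{1/p} N}.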

To a large extent, the theory of the Sobolev spaces {W^{s,p}({\bf R}^d)} resembles that of their Lebesgue counterparts {L^p({\bf R}^d)} (which arise as the special case of Sobolev spaces when {s=0}), but with the additional benefit of being able to interact very nicely with (weak) derivatives: a first derivative {\frac{\partial f}{\partial x_j}} of a function in an {L^p} space usually leaves all Lebesgue spaces, but a first derivative of a function in the Sobolev space {W^{s,p}} will end up in another Sobolev space {W^{s-1,p}}. This compatibility with the differentiation operation begins to explain why Sobolev spaces are so useful in the theory of partial differential equations. Furthermore, the regularity parameter {s} in Sobolev spaces is not restricted to be a natural number; it can be any real number, and one can use fractional derivative or integration operators to move from one regularity to another. Despite the fact that most partial differential equations involve differential operators of integer order, fractional spaces are still of importance; for instance it often turns out that the Sobolev spaces which are critical (scale-invariant) for a certain PDE are of fractional order.

The uncertainty principle in Fourier analysis places a constraint between the width and frequency scale of a function; roughly speaking (and in one dimension for simplicity), the product of the two quantities has to be bounded away from zero (or to put it another way, a wave is always at least as wide as its wavelength). This constraint can be quantified as the very useful Sobolev embedding theorem, which allows one to trade regularity for integrability: a function in a Sobolev space {W^{s,p}} will automatically lie in a number of other Sobolev spaces {W^{\tilde s,\tilde p}} with {\tilde s < s} and {\tilde p > p}; in particular, one can often embed Sobolev spaces into Lebesgue spaces. The trade is not reversible: one cannot start with a function with a lot of integrability and no regularity, and expect to recover regularity in a space of lower integrability. (One can already see this with the most basic example of Sobolev embedding, coming from the fundamental theorem of calculus. If a (continuously differentiable) function {f: {\bf R} \rightarrow {\bf R}} has {f'} in {L^1({\bf R})}, then we of course have {f \in L^\infty({\bf R})}; but the converse is far from true.)

Plancherel’s theorem reveals that Fourier-analytic tools are particularly powerful when applied to {L^2} spaces. Because of this, the Fourier transform is very effective at dealing with the {L^2}-based Sobolev spaces {W^{s,2}({\bf R}^d)}, often abbreviated {H^s({\bf R}^d)}. Indeed, using the fact that the Fourier transform converts regularity to decay, we will see that the {H^s({\bf R}^d)} spaces are nothing more than Fourier transforms of weighted {L^2} spaces, and in particular enjoy a Hilbert space structure. These Sobolev spaces, and in particular the energy space {H^1({\bf R}^d)}, are of particular importance in any PDE that involves some sort of energy functional (this includes large classes of elliptic, parabolic, dispersive, and wave equations, and especially those equations connected to physics and/or geometry).

We will not fully develop the theory of Sobolev spaces here, as this would require the theory of singular integrals, which is beyond the scope of this course. There are of course many references for further reading; one is Stein’s “Singular integrals and differentiability properties of functions”.

— 1. Hölder spaces —

Throughout these notes, {d \geq 1} is a fixed dimension.

Before we study Sobolev spaces, let us first look at the more elementary theory of Hölder spaces {C^{k,\alpha}({\bf R}^d)}, which resemble Sobolev spaces but with the aspect of width removed (thus Hölder norms only measure a combination of height and frequency scale). One can define these spaces on many domains (for instance, the {C^{0,\alpha}} norm can be defined on any metric space) but we shall largely restrict attention to Euclidean spaces {{\bf R}^d} for the sake of concreteness.

We first recall the {C^k({\bf R}^d)} spaces, which we have already been implicitly using in previous lectures. The space {C^0({\bf R}^d) = BC({\bf R}^d)} is the space of bounded continuous functions {f: {\bf R}^d \rightarrow {\bf C}} on {{\bf R}^d}, with norm

\displaystyle \|f\|_{C^0({\bf R}^d)} := \sup_{x \in {\bf R}^d} |f(x)| = \|f\|_{L^\infty({\bf R}^d)}.

This norm gives {C^0} the structure of a Banach space. More generally, one can then define the spaces {C^k({\bf R}^d)} for any non-negative integer {k} as the space of all functions which are {k} times continuously differentiable, with all derivatives of order up to {k} bounded, and whose norm is given by the formula

\displaystyle \|f\|_{C^k({\bf R}^d)} := \sum_{j=0}^k \sup_{x \in {\bf R}^d} |\nabla^j f(x)| = \sum_{j=0}^k \|\nabla^j f\|_{L^\infty({\bf R}^d)},

where we view {\nabla^j f} as a rank {j}, dimension {d} tensor with complex coefficients (or equivalently, as a vector of dimension {d^j} with complex coefficients), thus

\displaystyle |\nabla^j f(x)| = (\sum_{i_1,\ldots,i_j = 1,\ldots,d} |\frac{\partial^j}{\partial {x_{i_1}} \ldots \partial {x_{i_j}}} f(x)|^2)^{1/2}.

(One does not have to use the {\ell^2} norm here, actually; since all norms on a finite-dimensional space are equivalent, any other means of taking norms here will lead to an equivalent definition of the {C^k} norm. More generally, all the norms discussed here tend to have several definitions which are equivalent up to constants, and in most cases the exact choice of norm one uses is just a matter of personal taste.)

Remark 1 In some texts, {C^k({\bf R}^d)} is used to denote the functions which are {k} times continuously differentiable, but whose derivatives up to {k^{th}} order are allowed to be unbounded, so for instance {e^x} would lie in {C^k({\bf R})} for every {k} under this definition. Here, we will refer to such functions (with unbounded derivatives) as lying in {C^k_{loc}({\bf R}^d)} (i.e. they are locally in {C^k}), rather than {C^k({\bf R}^d)}. Similarly, we make a distinction between {C^\infty_{loc}({\bf R}^d) = \bigcap_{k=1}^\infty C^k_{loc}({\bf R}^d)} (smooth functions, with no bounds on derivatives) and {C^\infty({\bf R}^d) = \bigcap_{k=1}^\infty C^k({\bf R}^d)} (smooth functions, all of whose derivatives are bounded). Thus, for instance, {e^x} lies in {C^\infty_{loc}({\bf R})} but not {C^\infty({\bf R})}.

Exercise 2 Show that {C^k({\bf R}^d)} is a Banach space.

Exercise 3 Show that for every {d \geq 1} and {k \geq 0}, the {C^k({\bf R}^d)} norm is equivalent to the modified norm

\displaystyle \| f \|_{\tilde C^k({\bf R}^d)} := \|f\|_{L^\infty({\bf R}^d)} + \|\nabla^k f\|_{L^\infty({\bf R}^d)}

in the sense that there exists a constant {C} (depending on {k} and {d}) such that

\displaystyle C^{-1} \|f\|_{C^k({\bf R}^d)} \leq \| f \|_{\tilde C^k({\bf R}^d)} \leq \|f\|_{C^k({\bf R}^d)}

for all {f \in C^k({\bf R}^d)}. (Hint: use Taylor series with remainder.) Thus when defining the {C^k} norms, one does not really need to bound all the intermediate derivatives {\nabla^j f} for {0 < j < k}; the two extreme terms {j=0, j=k} suffice. (This is part of a more general interpolation phenomenon; the extreme terms in a sum often already suffice to control the intermediate terms.)

Exercise 4 Let {\phi \in C^\infty_c({\bf R}^d)} be a bump function, and {k \geq 0}. Show that if {\xi \in {\bf R}^d} with {|\xi| \geq 1}, {R \geq 1/|\xi|}, and {A > 0}, then the function {A \phi(x/R) \sin(\xi \cdot x)} has a {C^k} norm of at most {C A |\xi|^k}, where {C} is a constant depending only on {\phi}, {d} and {k}. Thus we see how the {C^k} norm relates to the height {A}, width {R^d}, and frequency scale {|\xi|} of the function, and in particular how the width {R} is largely irrelevant. What happens when the condition {R \geq 1/|\xi|} is dropped?

We clearly have the inclusions

\displaystyle C^0({\bf R}^d) \supset C^1({\bf R}^d) \supset C^2({\bf R}^d) \supset \ldots

and for any constant-coefficient partial differential operator

\displaystyle L = \sum_{i_1,\ldots,i_d \geq 0: i_1+\ldots+i_d \leq m} c_{i_1,\ldots,i_d} \frac{\partial^{i_1+\ldots+i_d} }{\partial {x_1^{i_1}} \ldots \partial {x_d^{i_d}}}

of some order {m \geq 0}, it is easy to see that {L} is a bounded linear operator from {C^{k+m}({\bf R}^d)} to {C^k({\bf R}^d)} for any {k \geq 0}.

The Hölder spaces {C^{k,\alpha}({\bf R}^d)} are designed to “fill up the gaps” between the discrete spectrum {C^k({\bf R}^d)} of the continuously differentiable spaces. For {k=0} and {0 \leq \alpha \leq 1}, these spaces are defined as the subspace of functions {f \in C^0({\bf R}^d)} whose norm

\displaystyle \|f\|_{C^{0,\alpha}({\bf R}^d)} := \|f\|_{C^0({\bf R}^d)} + \sup_{x,y \in {\bf R}^d: x \neq y} \frac{|f(x)-f(y)|}{|x-y|^\alpha}

is finite. To put it another way, {f \in C^{0,\alpha}({\bf R}^d)} if {f} is bounded and continuous, and furthermore obeys the Hölder continuity bound

\displaystyle |f(x)-f(y)| \leq C |x-y|^\alpha

for some constant {C > 0} and all {x,y \in {\bf R}^d}.

The space {C^{0,0}({\bf R}^d)} is easily seen to be just {C^0({\bf R}^d)} (with an equivalent norm). At the other extreme, {C^{0,1}({\bf R}^d)} is the class of Lipschitz functions, and is also denoted {\hbox{Lip}({\bf R}^d)} (and the {C^{0,1}} norm is also known as the Lipschitz norm).
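
As a concrete illustration (our example, not needed in the sequel), take {0 < \alpha \leq 1} and {f(x) := \min(|x|,1)^\alpha} on {{\bf R}}. The elementary inequality {|a^\alpha - b^\alpha| \leq |a-b|^\alpha} for {a,b \geq 0} gives the first bound below, while testing against {y=0} gives the second:

```latex
|f(x) - f(y)| \leq \big| \min(|x|,1) - \min(|y|,1) \big|^\alpha \leq |x-y|^\alpha,
\qquad \text{while} \qquad
\frac{|f(x) - f(0)|}{|x - 0|^\beta} = |x|^{\alpha - \beta} \rightarrow \infty
\ \text{ as } x \rightarrow 0 \text{ when } \beta > \alpha.
```

Thus {f} lies in {C^{0,\alpha}({\bf R})} (with norm at most {2}), but in no {C^{0,\beta}({\bf R})} with {\alpha < \beta \leq 1}; the Hölder exponent records exactly how sharp the cusp at the origin is.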

Exercise 5 Show that {C^{0,\alpha}({\bf R}^d)} is a Banach space for every {0 \leq \alpha \leq 1}.

Exercise 6 Show that {C^{0,\alpha}({\bf R}^d) \supset C^{0,\beta}({\bf R}^d)} for every {0 \leq \alpha \leq \beta \leq 1}, and that the inclusion map is continuous.

Exercise 7 If {\alpha > 1}, show that the {C^{0,\alpha}({\bf R}^d)} norm of a function {f} is finite if and only if {f} is constant. This explains why we generally restrict the Hölder index {\alpha} to be less than or equal to {1}.

Exercise 8 Show that {C^1({\bf R}^d)} is a proper subspace of {C^{0,1}({\bf R}^d)}, and that the restriction of the {C^{0,1}({\bf R}^d)} norm to {C^1({\bf R}^d)} is equivalent to the {C^1} norm. (The relationship between {C^1({\bf R}^d)} and {C^{0,1}({\bf R}^d)} is in fact closely analogous to that between {C^0({\bf R}^d)} and {L^\infty({\bf R}^d)}, as can be seen from the fundamental theorem of calculus.)

Exercise 9 Let {f \in (C^\infty_c({\bf R}))^*} be a distribution. Show that {f \in C^{0,1}({\bf R})} if and only if {f \in L^\infty({\bf R})}, and the distributional derivative {f'} of {f} also lies in {L^\infty({\bf R})}. Furthermore, for {f \in C^{0,1}({\bf R})}, show that {\|f\|_{C^{0,1}({\bf R})}} is comparable to {\|f\|_{L^\infty({\bf R})} + \|f'\|_{L^\infty({\bf R})}}.

We can then define the {C^{k,\alpha}({\bf R}^d)} spaces for natural numbers {k \geq 0} and {0 \leq \alpha \leq 1} to be the subspace of {C^k({\bf R}^d)} whose norm

\displaystyle \| f \|_{C^{k,\alpha}({\bf R}^d)} := \sum_{j=0}^k \| \nabla^j f \|_{C^{0,\alpha}({\bf R}^d)}

is finite. (As before, there are a variety of ways to define the {C^{0,\alpha}} norm of the tensor-valued quantity {\nabla^j f}, but they are all equivalent to each other.)

Exercise 10 Show that {C^{k,\alpha}({\bf R}^d)} is a Banach space which contains {C^{k+1}({\bf R}^d)}, and is contained in turn in {C^k({\bf R}^d)}.

As before, {C^{k,0}({\bf R}^d)} is equal to {C^k({\bf R}^d)}, and {C^{k,\alpha}({\bf R}^d)} contains {C^{k,\beta}({\bf R}^d)} when {\alpha \leq \beta}. The space {C^{k,1}({\bf R}^d)} is slightly larger than {C^{k+1}}, but is fairly close to it, thus providing a near-continuum of spaces between the sequence of spaces {C^k({\bf R}^d)}. The following example illustrates this:

Exercise 11 Let {\phi \in C^\infty_c({\bf R})} be a test function, let {k \geq 0} be a natural number, and let {0 \leq \alpha \leq 1}.

  • Show that the function {|x|^s \phi(x)} lies in {C^{k,\alpha}({\bf R})} whenever {s \geq k+\alpha}.
  • Conversely, if {s} is not an integer, {\phi(0) \neq 0}, and {s < k+\alpha}, show that {|x|^s \phi(x)} does not lie in {C^{k,\alpha}({\bf R})}.
  • Show that {|x|^{k+1} \phi(x) 1_{x>0}} lies in {C^{k,1}({\bf R})}, but not in {C^{k+1}({\bf R})}.

This example illustrates that the quantity {k+\alpha} can be viewed as measuring the total amount of regularity held by functions in {C^{k,\alpha}({\bf R})}: {k} full derivatives, plus an additional {\alpha} amount of Hölder continuity.

Exercise 12 Let {\phi \in C^\infty_c({\bf R}^d)} be a test function, let {k \geq 0} be a natural number, and let {0 \leq \alpha \leq 1}. Show that for {\xi \in {\bf R}^d} with {|\xi| \geq 1}, the function {\phi(x) \sin(\xi \cdot x)} has a {C^{k,\alpha}({\bf R}^d)} norm of at most {C |\xi|^{k+\alpha}}, for some {C} depending on {\phi, d, k, \alpha}.

By construction, it is clear that constant-coefficient differential operators {L} of order {m} will map {C^{k+m,\alpha}({\bf R}^d)} continuously to {C^{k,\alpha}({\bf R}^d)}.

Now we consider what happens with products.

Exercise 13 Let {k,l \geq 0} be natural numbers, and {0 \leq \alpha,\beta \leq 1}.

  • If {f \in C^k({\bf R}^d)} and {g \in C^l({\bf R}^d)}, show that {fg \in C^{\min(k,l)}({\bf R}^d)}, and that the multiplication map is continuous from {C^k({\bf R}^d) \times C^l({\bf R}^d)} to {C^{\min(k,l)}({\bf R}^d)}. (Hint: reduce to the case {k=l} and use induction.)
  • If {f \in C^{k,\alpha}({\bf R}^d)} and {g \in C^{l,\beta}({\bf R}^d)}, and {k+\alpha \leq l+\beta}, show that {fg \in C^{k,\alpha}({\bf R}^d)}, and that the multiplication map is continuous from {C^{k,\alpha}({\bf R}^d) \times C^{l,\beta}({\bf R}^d)} to {C^{k,\alpha}({\bf R}^d)}.

It is easy to see that the regularity in these results cannot be improved (just take {g=1}). This illustrates a general principle, namely that a pointwise product {fg} tends to acquire the lower of the regularities of the two factors {f, g}.

As one consequence of this exercise, we see that any variable-coefficient differential operator {L} of order {m} with {C^\infty({\bf R}^d)} coefficients will map {C^{m+k,\alpha}({\bf R}^d)} to {C^{k,\alpha}({\bf R}^d)} for any {k \geq 0} and {0 \leq \alpha \leq 1}.

We now briefly remark on Hölder spaces on open domains {\Omega} in Euclidean space {{\bf R}^d}. Here, a new subtlety emerges; instead of having just one space {C^{k,\alpha}} for each choice of exponents {k,\alpha}, one actually has a range of spaces to choose from, depending on what kind of behaviour one wants to impose at the boundary of the domain. At one extreme, one has the space {C^{k,\alpha}(\Omega)}, defined as the space of {k} times continuously differentiable functions {f: \Omega \rightarrow {\bf C}} whose Hölder norm

\displaystyle \|f\|_{C^{k,\alpha}(\Omega)} := \sum_{j=0}^k \left( \sup_{x \in \Omega} |\nabla^j f(x)| + \sup_{x,y \in \Omega: x \neq y} \frac{|\nabla^j f(x)-\nabla^j f(y)|}{|x-y|^\alpha} \right)

is finite; this is the “maximal” choice for the {C^{k,\alpha}} space on {\Omega}. At the other extreme, one has the space {C^{k,\alpha}_0(\Omega)}, defined as the closure of the compactly supported functions in {C^{k,\alpha}(\Omega)}. This space is smaller than {C^{k,\alpha}(\Omega)}; for instance, functions in {C^{0,\alpha}_0((0,1))} must converge to zero at the endpoints {0,1}, while functions in {C^{k,\alpha}((0,1))} do not need to do so. An intermediate space is {C^{k,\alpha}({\bf R}^d)\downharpoonright_{\Omega}}, defined as the space of restrictions of functions in {C^{k,\alpha}({\bf R}^d)} to {\Omega}. For instance, the restriction of {|x| \psi(x)} to {{\bf R} \backslash \{0\}}, where {\psi} is a cutoff function non-vanishing at the origin, lies in {C^{1,0}({\bf R} \backslash \{0\})}, but is not in {C^{1,0}({\bf R})\downharpoonright_{{\bf R} \backslash \{0\}}} or {C^{1,0}_0({\bf R} \backslash \{0\})} (note that {|x| \psi(x)} itself is not in {C^{1,0}({\bf R})}, as it is not continuously differentiable at the origin). It is possible to clarify the exact relationships between the various flavours of Hölder spaces on domains (and similarly for the Sobolev spaces discussed below), but we will not discuss these topics here.

Exercise 14 Let {k \geq 0} and {0 \leq \alpha' < \alpha \leq 1}. Show that {C^\infty_c({\bf R}^d)} is a dense subset of {C^{k,\alpha}_0({\bf R}^d)} if one places the {C^{k,\alpha'}} topology on the latter space. (Hint: To approximate a compactly supported {C^{k,\alpha}} function by a {C^\infty_c} one, convolve with a smooth, compactly supported approximation to the identity.) What happens in the endpoint case {\alpha=\alpha'}?

Hölder spaces are particularly useful in elliptic PDE, because tools such as the maximum principle lend themselves well to the suprema that appear inside the definition of the {C^{k,\alpha}} norms; see for instance the book of Gilbarg and Trudinger for a thorough treatment. For simple examples of elliptic PDE, such as the Poisson equation {\Delta u = f}, one can also use the explicit fundamental solution, through lengthy but straightforward computations. We give a typical example here:

Exercise 15 (Schauder estimate) Let {0 < \alpha < 1}, and let {f \in C^{0,\alpha}({\bf R}^3)} be a function supported on the unit ball {B(0,1)}. Let {u} be the unique bounded solution to the Poisson equation {-\Delta u = f} (where {\Delta = \sum_{j=1}^3 \frac{\partial^2}{\partial x_j^2}} is the Laplacian), given by convolution with the Newton kernel:

\displaystyle u(x) := \frac{1}{4\pi} \int_{{\bf R}^3} \frac{f(y)}{|x-y|}\ dy.

  • (i) Show that {u \in C^0({\bf R}^3)}.
  • (ii) Show that {u \in C^1({\bf R}^3)}, and rigorously establish the formula

    \displaystyle \frac{\partial u}{\partial x_j}(x) = -\frac{1}{4\pi} \int_{{\bf R}^3} (x_j-y_j) \frac{f(y)}{|x-y|^3}\ dy

    for {j=1,2,3}.

  • (iii) Show that {u \in C^2({\bf R}^3)}, and rigorously establish the formula

    \displaystyle \frac{\partial^2 u}{\partial x_i \partial x_j}(x) = \frac{1}{4\pi} \lim_{\varepsilon \rightarrow 0}\int_{|x-y| \geq \varepsilon} [\frac{3 (x_i-y_i) (x_j-y_j)}{|x-y|^5} - \frac{\delta_{ij}}{|x-y|^3}] f(y)\ dy

    for {i,j=1,2,3}, where {\delta_{ij}} is the Kronecker delta. (Hint: first establish this in the two model cases when {f(x)=0}, and when {f} is constant near {x}.)

  • (iv) Show that {u \in C^{2,\alpha}({\bf R}^3)}, and establish the Schauder estimate

    \displaystyle \|u\|_{C^{2,\alpha}({\bf R}^3)} \leq C_\alpha \|f\|_{C^{0,\alpha}({\bf R}^3)}

    where {C_\alpha} depends only on {\alpha}.

  • (v) Show that the Schauder estimate fails when {\alpha=0}. Using this, conclude that there exists {f \in C^0({\bf R}^3)} supported in the unit ball such that the function {u} defined above fails to be in {C^2({\bf R}^3)}. (Hint: use the closed graph theorem.) This failure helps explain why it is necessary to introduce Hölder spaces into elliptic theory in the first place (as opposed to the more intuitive {C^k} spaces).

Remark 16 Roughly speaking, the Schauder estimate asserts that if {\Delta u} has {C^{0,\alpha}} regularity, then all other second derivatives of {u} have {C^{0,\alpha}} regularity as well. This phenomenon – that control of a special derivative of {u} at some order implies control of all other derivatives of {u} at that order – is known as elliptic regularity, and relies crucially on {\Delta} being an elliptic differential operator. We will discuss ellipticity a little bit more later in Exercise 45. The theory of Schauder estimates is by now extremely well developed, and applies to large classes of elliptic operators on quite general domains, but we will not discuss these estimates and their applications to various linear and nonlinear elliptic PDE here.

Exercise 17 (Rellich-Kondrachov type embedding theorem for Hölder spaces) Let {0 \leq \alpha < \beta \leq 1}. Show that any bounded sequence of functions {f_n \in C^{0,\beta}({\bf R}^d)} that are all supported in the same compact subset of {{\bf R}^d} will have a subsequence that converges in {C^{0,\alpha}({\bf R}^d)}. (Hint: use the Arzelà-Ascoli theorem to first obtain uniform convergence, then upgrade this convergence.) This is part of a more general phenomenon: sequences bounded in a high regularity space, and constrained to lie in a compact domain, will tend to have convergent subsequences in low regularity spaces.

— 2. Classical Sobolev spaces —

We now turn to the “classical” Sobolev spaces {W^{k,p}({\bf R}^d)}, which involve only an integral amount {k} of regularity.

Definition 18 Let {1 \leq p \leq \infty}, and let {k \geq 0} be a natural number. A function {f} is said to lie in {W^{k,p}({\bf R}^d)} if its weak derivatives {\nabla^j f} exist and lie in {L^p({\bf R}^d)} for all {j=0,\ldots,k}. If {f} lies in {W^{k,p}({\bf R}^d)}, we define the {W^{k,p}} norm of {f} by the formula

\displaystyle \|f\|_{W^{k,p}({\bf R}^d)} := \sum_{j=0}^k \|\nabla^j f\|_{L^p({\bf R}^d)}.

(As before, the exact choice of convention in which one measures the {L^p} norm of {\nabla^j f} is not particularly relevant for most applications, as all such conventions are equivalent up to multiplicative constants.)

The space {W^{k,p}({\bf R}^d)} is also denoted {L^p_k({\bf R}^d)} in some texts.

Example 19 {W^{0,p}({\bf R}^d)} is of course the same space as {L^p({\bf R}^d)}, thus the Sobolev spaces generalise the Lebesgue spaces. From Exercise 9 we see that {W^{1,\infty}({\bf R})} is the same space as {C^{0,1}({\bf R})}, with an equivalent norm. More generally, one can see from induction that {W^{k+1,\infty}({\bf R})} is the same space as {C^{k,1}({\bf R})} for {k \geq 0}, with an equivalent norm. It is also clear that {W^{k,p}({\bf R}^d)} contains {W^{k+1,p}({\bf R}^d)} for any {k,p}.

Example 20 The function {|\sin x|} lies in {W^{1,\infty}({\bf R})}, but is not everywhere differentiable in the classical sense; nevertheless, it has a bounded weak derivative of {\cos x \hbox{sgn}(\sin(x))}. On the other hand, the Cantor function (aka the “Devil’s staircase”) is not in {W^{1,\infty}({\bf R})}, despite having a classical derivative of zero at almost every point; the weak derivative is a Cantor measure, which does not lie in any {L^p} space. Thus one really does need to work with weak derivatives rather than classical derivatives to define Sobolev spaces properly (in contrast to the {C^{k,\alpha}} spaces).

Exercise 21 Let {\phi \in C^\infty_c({\bf R}^d)} be a bump function, {k \geq 0}, and {1 \leq p \leq \infty}. Show that if {\xi \in {\bf R}^d} with {|\xi| \geq 1}, {R \geq 1/|\xi|}, and {A > 0}, then the function {A \phi(x/R) \sin(\xi \cdot x)} has a {W^{k,p}({\bf R}^d)} norm of at most {C A |\xi|^k R^{d/p}}, where {C} is a constant depending only on {\phi}, {p} and {k}. (Compare this with Exercise 4 and Exercise 12.) What happens when the condition {R \geq 1/|\xi|} is dropped?

Exercise 22 Show that {W^{k,p}({\bf R}^d)} is a Banach space for any {1 \leq p \leq \infty} and {k \geq 0}.

The fact that Sobolev spaces are defined using weak derivatives is a technical nuisance, but in practice one can often end up working with classical derivatives anyway by means of the following lemma:

Lemma 23 Let {1 \leq p < \infty} and {k \geq 0}. Then the space {C^\infty_c({\bf R}^d)} of test functions is a dense subspace of {W^{k,p}({\bf R}^d)}.

Proof: It is clear that {C^\infty_c({\bf R}^d)} is a subspace of {W^{k,p}({\bf R}^d)}. We first show that the smooth functions {C^\infty_{loc}({\bf R}^d) \cap W^{k,p}({\bf R}^d)} form a dense subspace of {W^{k,p}({\bf R}^d)}, and then show that {C^\infty_c({\bf R}^d)} is dense in {C^\infty_{loc}({\bf R}^d) \cap W^{k,p}({\bf R}^d)}.

We begin with the former claim. Let {f \in W^{k,p}({\bf R}^d)}, and let {\phi_n} be a sequence of smooth, compactly supported approximations to the identity. Since {f \in L^p({\bf R}^d)}, we see that {f*\phi_n} converges to {f} in {L^p({\bf R}^d)}. More generally, since {\nabla^j f} is in {L^p({\bf R}^d)} for {0 \leq j \leq k}, we see that {(\nabla^j f) * \phi_n = \nabla^j (f * \phi_n)} converges to {\nabla^j f} in {L^p({\bf R}^d)}. Thus we see that {f*\phi_n} converges to {f} in {W^{k,p}({\bf R}^d)}. On the other hand, as {\phi_n} is smooth, {f*\phi_n} is smooth; and the claim follows.

Now we prove the latter claim. Let {f} be a smooth function in {W^{k,p}({\bf R}^d)}, thus {\nabla^j f \in L^p({\bf R}^d)} for all {0 \leq j \leq k}. We let {\eta \in C^\infty_c({\bf R}^d)} be a compactly supported function which equals {1} near the origin, and consider the functions {f_R(x) := f(x) \eta(x/R)} for {R > 0}. Clearly, each {f_R} lies in {C^\infty_c({\bf R}^d)}. As {R \rightarrow \infty}, dominated convergence shows that {f_R} converges to {f} in {L^p({\bf R}^d)}. An application of the product rule then lets us write {\nabla f_R(x) = (\nabla f)(x) \eta(x/R) + \frac{1}{R} f(x) (\nabla \eta)(x/R)}. The first term converges to {\nabla f} in {L^p({\bf R}^d)} by dominated convergence, while the second term goes to zero in the same topology; thus {\nabla f_R} converges to {\nabla f} in {L^p({\bf R}^d)}. A similar argument shows that {\nabla^j f_R} converges to {\nabla^j f} in {L^p({\bf R}^d)} for all {0 \leq j \leq k}, and so {f_R} converges to {f} in {W^{k,p}({\bf R}^d)}, and the claim follows. \Box

As a corollary of this lemma we also see that the space {{\mathcal S}({\bf R}^d)} of Schwartz functions is dense in {W^{k,p}({\bf R}^d)}.

Exercise 24 Let {k \geq 0}. Show that the closure of {C^\infty_c({\bf R}^d)} in {W^{k,\infty}({\bf R}^d)} is {C^{k}_0({\bf R}^d)} (the space of {C^{k}} functions whose first {k} derivatives all go to zero at infinity), thus Lemma 23 fails at the endpoint {p=\infty}.

Now we come to the important Sobolev embedding theorem, which allows one to trade regularity for integrability. We illustrate this phenomenon first with some very simple cases. First, we claim that the space {W^{1,1}({\bf R})} embeds continuously into {W^{0,\infty}({\bf R}) = L^\infty({\bf R})}, thus trading in one degree of regularity to upgrade {L^1} integrability to {L^\infty} integrability. To prove this claim, it suffices to establish the bound

\displaystyle \|f\|_{L^\infty({\bf R})} \leq C \|f\|_{W^{1,1}({\bf R})} \ \ \ \ \ (1)

for all test functions {f \in C^\infty_c({\bf R})} and some constant {C}, as the claim then follows by taking limits using Lemma 23. (Note that any limit in either the {L^\infty} or {W^{1,1}} topologies is also a limit in the sense of distributions, and such limits are necessarily unique. Also, since {L^\infty({\bf R})} is the dual space of {L^1({\bf R})}, the distributional limit of any sequence bounded in {L^\infty({\bf R})} remains in {L^\infty({\bf R})}, by Exercise 28 of Notes 3.) To prove (1), observe from the fundamental theorem of calculus that

\displaystyle |f(x) - f(0)| = |\int_0^x f'(t)\ dt| \leq \|f'\|_{L^1({\bf R})} \leq \|f\|_{W^{1,1}({\bf R})}

for all {x}; in particular, from the triangle inequality

\displaystyle \| f \|_{L^\infty({\bf R})} \leq |f(0)| + \|f\|_{W^{1,1}({\bf R})}.

Also, taking {x} to be sufficiently large, we see (from the compact support of {f}) that

\displaystyle |f(0)| \leq \|f\|_{W^{1,1}({\bf R})}

and (1) follows.
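
Chaining the two preceding displays gives (1) with {C = 2}. A slightly shorter variant (a sketch, not the argument used above) integrates from {\pm\infty} rather than from {0}, again using the compact support of {f}:

```latex
f(x) = \int_{-\infty}^x f'(t)\ dt = -\int_x^{\infty} f'(t)\ dt
\quad \Longrightarrow \quad
2 |f(x)| \leq \int_{-\infty}^x |f'(t)|\ dt + \int_x^\infty |f'(t)|\ dt = \|f'\|_{L^1({\bf R})}.
```

Taking the supremum over {x} then yields {\|f\|_{L^\infty({\bf R})} \leq \frac{1}{2} \|f'\|_{L^1({\bf R})} \leq \frac{1}{2} \|f\|_{W^{1,1}({\bf R})}} for test functions {f}.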

Since the closure of {C^\infty_c({\bf R})} in {L^\infty({\bf R})} is {C_0({\bf R})}, we actually obtain the stronger embedding, that {W^{1,1}({\bf R})} embeds continuously into {C_0({\bf R})}.

Exercise 25 Show that {W^{d,1}({\bf R}^d)} embeds continuously into {C_0({\bf R}^d)}, thus there exists a constant {C} (depending only on {d}) such that

\displaystyle \|f\|_{C_0({\bf R}^d)} \leq C \|f\|_{W^{d,1}({\bf R}^d)}

for all {f \in W^{d,1}({\bf R}^d)}.

Now we turn to Sobolev embedding for exponents other than {p=1} and {p=\infty}.

Theorem 26 (Sobolev embedding theorem for one derivative) Let {1 \leq p \leq q \leq \infty} be such that {\frac{d}{p}-1 \leq \frac{d}{q} \leq \frac{d}{p}}, but that one is not in the endpoint cases {(p,q) = (d,\infty), (1, \frac{d}{d-1})}. Then {W^{1,p}({\bf R}^d)} embeds continuously into {L^q({\bf R}^d)}.

Proof: By Lemma 23 and the same limiting argument as before, it suffices to establish the Sobolev embedding inequality

\displaystyle \|f\|_{L^q({\bf R}^d)} \leq C_{p,q,d} \|f\|_{W^{1,p}({\bf R}^d)}

for all test functions {f \in C^\infty_c({\bf R}^d)}, and some constant {C_{p,q,d}} depending only on {p,q,d}, as the inequality will then extend to all {f \in W^{1,p}({\bf R}^d)}. To simplify the notation we shall use {X \lesssim Y} to denote an estimate of the form {X \leq C_{p,q,d} Y}, where {C_{p,q,d}} is a constant depending on {p,q,d} (the exact value of this constant may vary from instance to instance).

The case {p=q} is trivial. Now let us look at another extreme case, namely when {\frac{d}{p}-1 = \frac{d}{q}}; by our hypotheses, this forces {1 < p < d}. Here, we use the fundamental theorem of calculus (and the compact support of {f}) to write

\displaystyle f(x) = -\int_0^\infty \omega \cdot \nabla f(x+r\omega)\ dr

for any {x \in {\bf R}^d} and any direction {\omega \in S^{d-1}}. Taking absolute values, we conclude in particular that

\displaystyle |f(x)| \lesssim \int_0^\infty |\nabla f(x+r\omega)|\ dr.

We can average this over all directions {\omega}:

\displaystyle |f(x)| \lesssim \int_{S^{d-1}} \int_0^\infty |\nabla f(x+r\omega)|\ dr d\omega.

Switching from polar coordinates back to Cartesian (multiplying and dividing by {r^{d-1}}) we conclude that

\displaystyle |f(x)| \lesssim \int_{{\bf R}^d} \frac{1}{|y|^{d-1}} |\nabla f(x-y)|\ dy,

thus {f} is pointwise controlled by the convolution of {|\nabla f|} with the fractional integration kernel {\frac{1}{|x|^{d-1}}}. By the Hardy-Littlewood-Sobolev theorem on fractional integration (Corollary 7 of Notes 1) we conclude that

\displaystyle \|f\|_{L^q({\bf R}^d)} \lesssim \|\nabla f \|_{L^p({\bf R}^d)}

and the claim follows. (Note that the hypotheses {1<p<d} are needed here in order to be able to invoke this theorem.)
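
The change of variables in the step from polar to Cartesian coordinates can be spelled out as follows (a sketch, with {g := |\nabla f|} and {z = r\omega} used as auxiliary notation not appearing in the argument above). Since Lebesgue measure in polar coordinates is {dz = r^{d-1}\ dr d\omega}, one has

\displaystyle \int_{S^{d-1}} \int_0^\infty g(x+r\omega)\ dr d\omega = \int_{S^{d-1}} \int_0^\infty \frac{g(x+r\omega)}{r^{d-1}}\ r^{d-1} dr d\omega = \int_{{\bf R}^d} \frac{g(x+z)}{|z|^{d-1}}\ dz,

and the substitution {y = -z} gives the convolution form {\int_{{\bf R}^d} \frac{1}{|y|^{d-1}} g(x-y)\ dy}. In the usual formulation of the Hardy-Littlewood-Sobolev inequality, convolution with {\frac{1}{|y|^{d-s}}} maps {L^p({\bf R}^d)} to {L^q({\bf R}^d)} when {0 < s < d}, {1 < p < q < \infty} and {\frac{1}{q} = \frac{1}{p} - \frac{s}{d}}; taking {s=1} recovers the condition {\frac{d}{q} = \frac{d}{p} - 1}, and the requirement {q < \infty} becomes the condition {p<d}.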

Now we handle intermediate cases, when {\frac{d}{p} - 1 < \frac{d}{q} < \frac{d}{p}}. (Many of these cases can be obtained from the endpoints already established by interpolation, but unfortunately not all such cases can be, so we will treat this case separately.) Here, the trick is not to integrate out to infinity, but instead to integrate out to a bounded distance. For instance, the fundamental theorem of calculus gives

\displaystyle f(x) = f(x+R\omega)-\int_0^R \omega \cdot \nabla f(x+r\omega)\ dr

for any {R>0}, hence

\displaystyle |f(x)| \lesssim |f(x+R\omega)| + \int_0^R |\nabla f(x+r\omega)|\ dr

What value of {R} should one pick? If one picks any specific value of {R}, one would end up with an average of {f} over spheres, which looks somewhat unpleasant. But what one can do here is average over a range of values of {R}, for instance between {1} and {2}. This leads to

\displaystyle |f(x)| \lesssim \int_1^2 |f(x+R\omega)|\ dR + \int_0^2 |\nabla f(x+r\omega)|\ dr;

averaging over all directions {\omega} and converting back to Cartesian coordinates, we see that

\displaystyle |f(x)| \lesssim \int_{1 \leq |y| \leq 2} |f(x-y)|\ dy + \int_{|y| \leq 2} \frac{1}{|y|^{d-1}} |\nabla f(x-y)|\ dy.

Thus one is bounding {|f|} pointwise (up to constants) by the convolution of {|f|} with the kernel {K_1(y) := 1_{1 \leq |y| \leq 2}}, plus the convolution of {|\nabla f|} with the kernel {K_2(y) := 1_{|y| \leq 2} \frac{1}{|y|^{d-1}}}. A short computation shows that both kernels lie in {L^r({\bf R}^d)}, where {r} is the exponent in Young’s inequality, and more specifically that {\frac{1}{q} + 1 = \frac{1}{p} + \frac{1}{r}} (and in particular {1 < r < \frac{d}{d-1}}). Applying Young’s inequality (Exercise 25 of Notes 1), we conclude that

\displaystyle \|f\|_{L^q({\bf R}^d)} \lesssim \|f\|_{L^p({\bf R}^d)} + \| \nabla f \|_{L^p({\bf R}^d)}

and the claim follows. \Box
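
The “short computation” in the proof above can be sketched as follows (the radial variable {\rho} is auxiliary notation). The kernel {K_1} is the indicator function of a bounded set, so it lies in every {L^r({\bf R}^d)}. For {K_2}, polar coordinates give

\displaystyle \|K_2\|_{L^r({\bf R}^d)}^r = \int_{|y| \leq 2} \frac{dy}{|y|^{(d-1)r}} = |S^{d-1}| \int_0^2 \rho^{(d-1)(1-r)}\ d\rho,

which is finite precisely when {(d-1)(1-r) > -1}, that is to say when {r < \frac{d}{d-1}} (for {d \geq 2}; when {d=1} the kernel {K_2} is just the indicator function {1_{|y| \leq 2}}). On the other hand, the relation {\frac{1}{q} + 1 = \frac{1}{p} + \frac{1}{r}} can be rewritten as {\frac{1}{r} = 1 - (\frac{1}{p} - \frac{1}{q})}, and the hypothesis {\frac{d}{p} - 1 < \frac{d}{q} < \frac{d}{p}} is equivalent to {0 < \frac{1}{p} - \frac{1}{q} < \frac{1}{d}}. Hence

\displaystyle 1 - \frac{1}{d} < \frac{1}{r} < 1,

which is exactly the range {1 < r < \frac{d}{d-1}} in which both kernels lie in {L^r({\bf R}^d)}.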

Remark 27 It is instructive to insert the example in Exercise 21 into the Sobolev embedding theorem. By replacing the {W^{1,p}({\bf R}^d)} norm with the {L^q({\bf R}^d)} norm, one trades one factor of the frequency scale {|\xi|} for {\frac{1}{q}-\frac{1}{p}} powers of the width {R^d}. This is consistent with the Sobolev embedding theorem so long as {R^d \gtrsim 1/|\xi|^d}, which is essentially one of the hypotheses in that exercise. Thus, one can view Sobolev embedding as an assertion that the width of a function must always be greater than or comparable to the wavelength scale (the reciprocal of the frequency scale), raised to the power of the dimension; this is a manifestation of the uncertainty principle.

Exercise 28 Let {d \geq 2}. Show that the Sobolev endpoint estimate fails in the case {(p,q)=(d,\infty)}. (Hint: experiment with functions {f} of the form {f(x) := \sum_{n=1}^N \phi(2^n x)}, where {\phi} is a test function supported on the ball {\{ |x| \leq 2 \}} and equal to one on {\{ |x| \leq 1 \}}.) Conclude in particular that {W^{1,d}({\bf R}^d)} is not a subset of {L^\infty({\bf R}^d)}. (Hint: Either use the closed graph theorem, or use some variant of the function {f} used in the first part of this exercise.) Note that when {d=1}, the Sobolev endpoint theorem for {(p,q)=(1,\infty)} follows from the fundamental theorem of calculus, as mentioned earlier. There are substitutes known for the endpoint Sobolev embedding theorem, but they involve more sophisticated function spaces, such as the space {\hbox{BMO}} of functions of bounded mean oscillation, which we will not discuss here.

The {p=1} case of the Sobolev inequality cannot be proven via the Hardy-Littlewood-Sobolev inequality; however, there are other proofs available. One of these (due to Gagliardo and Nirenberg) is based on

Exercise 29 (Loomis-Whitney inequality) Let {d \geq 1}, let {f_1,\ldots,f_d \in L^p({\bf R}^{d-1})} for some {0 < p \leq \infty}, and let {F: {\bf R}^d \rightarrow {\bf C}} be the function

\displaystyle F( x_1,\ldots,x_d ) := \prod_{i=1}^d f_i( x_1,\ldots,x_{i-1},x_{i+1},\ldots,x_d).

Show that

\displaystyle \| F \|_{L^{p/(d-1)}({\bf R}^d)} \leq \prod_{i=1}^d \|f_i\|_{L^p({\bf R}^{d-1})}.

(Hint: induct on {d}, using Hölder’s inequality and Fubini’s theorem.)

Lemma 30 (Endpoint Sobolev inequality) {W^{1,1}({\bf R}^d)} embeds continuously into {L^{d/(d-1)}({\bf R}^d)}.

Proof: It will suffice to show that

\displaystyle \|f\|_{L^{d/(d-1)}({\bf R}^d)} \leq \|\nabla f \|_{L^1({\bf R}^d)}

for all test functions {f \in C^\infty_c({\bf R}^d)}. From the fundamental theorem of calculus we see that

\displaystyle |f(x_1,\ldots,x_d)| \leq \int_{\bf R} |\frac{\partial f}{\partial x_i}(x_1,\ldots,x_{i-1},t,x_{i+1},\ldots,x_d)|\ dt

and thus

\displaystyle |f(x_1,\ldots,x_d)| \leq f_i(x_1,\ldots,x_{i-1},x_{i+1},\ldots,x_d)

where

\displaystyle f_i(x_1,\ldots,x_{i-1},x_{i+1},\ldots,x_d) := \int_{\bf R} |\nabla f(x_1,\ldots,x_{i-1},t,x_{i+1},\ldots,x_d)|\ dt.

From Fubini’s theorem we have

\displaystyle \|f_i\|_{L^1({\bf R}^{d-1})} = \|\nabla f \|_{L^1({\bf R}^d)}

and hence by the Loomis-Whitney inequality

\displaystyle \|f_1 \ldots f_d \|_{L^{1/(d-1)}({\bf R}^d)} \leq \|\nabla f \|_{L^1({\bf R}^d)}^d,

and the claim follows. \Box
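
The final step of this proof can be unpacked as follows (a sketch, stated for {d \geq 2}; the case {d=1} is the bound established earlier via the fundamental theorem of calculus). Multiplying the pointwise bounds {|f| \leq f_i} over {i=1,\ldots,d} gives {|f|^d \leq f_1 \ldots f_d}, while a direct computation of norms gives

\displaystyle \| |f|^d \|_{L^{1/(d-1)}({\bf R}^d)} = \left( \int_{{\bf R}^d} |f(x)|^{d/(d-1)}\ dx \right)^{d-1} = \|f\|_{L^{d/(d-1)}({\bf R}^d)}^d.

Combining these two facts with the Loomis-Whitney bound, we obtain

\displaystyle \|f\|_{L^{d/(d-1)}({\bf R}^d)}^d \leq \|f_1 \ldots f_d \|_{L^{1/(d-1)}({\bf R}^d)} \leq \|\nabla f \|_{L^1({\bf R}^d)}^d,

and taking {d^{th}} roots gives the desired inequality.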

Exercise 31 (Connection between Sobolev embedding and isoperimetric inequality) Let {d \geq 2}, and let {\Omega} be an open subset of {{\bf R}^d} whose boundary {\partial \Omega} is a smooth {d-1}-dimensional manifold. Show that the surface area {|\partial \Omega|} of {\Omega} is related to the volume {|\Omega|} of {\Omega} by the isoperimetric inequality

\displaystyle |\Omega| \leq C_d |\partial \Omega|^{d/(d-1)}

for some constant {C_d} depending only on {d}. (Hint: Apply the endpoint Sobolev theorem to a suitably smoothed out version of {1_\Omega}.) It is also possible to reverse this implication and deduce the endpoint Sobolev embedding theorem from the isoperimetric inequality and the coarea formula, which we will do in later notes.

Exercise 32 Use dimensional analysis to argue why the Sobolev embedding theorem should fail when {\frac{d}{q} < \frac{d}{p}-1}. Then create a rigorous counterexample to that theorem in this case.

Exercise 33 Show that {W^{k,p}({\bf R}^d)} embeds into {W^{l,q}({\bf R}^d)} whenever {k \geq l \geq 0} and {1 < p < q \leq \infty} are such that {\frac{d}{p} - k \leq \frac{d}{q}-l}, and such that at least one of the two inequalities {q \leq \infty}, {\frac{d}{p}-k \leq \frac{d}{q}-l} is strict.

Exercise 34 Show that the Sobolev embedding theorem fails whenever {q<p}. (Hint: experiment with functions of the form {f(x) = \sum_{j=1}^n \phi(x-x_j)}, where {\phi} is a test function and the {x_j} are widely separated points in space.)

Exercise 35 (Hölder-Sobolev embedding) Let {d < p < \infty}. Show that {W^{1,p}({\bf R}^d)} embeds continuously into {C^{0,\alpha}({\bf R}^d)}, where {0 < \alpha < 1} is defined by the scaling relationship {\frac{d}{p} - 1 = - \alpha}. Use dimensional analysis to justify why one would expect this scaling relationship to arise naturally, and give an example to show that {\alpha} cannot be improved to any higher exponent.

More generally, with the same assumptions on {p,\alpha}, show that {W^{k+1,p}({\bf R}^d)} embeds continuously into {C^{k,\alpha}({\bf R}^d)} for all natural numbers {k \geq 0}.

Exercise 36 (Sobolev product theorem, special case) Let {k \geq 1}, {1 < p,q < d/k}, and {1 < r < \infty} be such that {\frac{1}{p}+\frac{1}{q} - \frac{k}{d} = \frac{1}{r}}. Show that whenever {f \in W^{k,p}({\bf R}^d)} and {g \in W^{k,q}({\bf R}^d)}, then {fg \in W^{k,r}({\bf R}^d)}, and that

\displaystyle \|fg\|_{W^{k,r}({\bf R}^d)} \leq C_{p,q,k,d,r} \|f\|_{W^{k,p}({\bf R}^d)} \|g\|_{W^{k,q}({\bf R}^d)}

for some constant {C_{p,q,k,d,r}} depending only on the subscripted parameters. (This is not the most general range of parameters for which this sort of product theorem holds, but it is an instructive special case.)

Exercise 37 Let {L} be a differential operator of order {m} whose coefficients lie in {C^\infty({\bf R}^d)}. Show that {L} maps {W^{k+m,p}({\bf R}^d)} continuously to {W^{k,p}({\bf R}^d)} for all {1 \leq p \leq \infty} and all integers {k \geq 0}.

— 3. {L^2}-based Sobolev spaces —

It is possible to develop more general Sobolev spaces {W^{s,p}({\bf R}^d)} than the integer-regularity spaces {W^{k,p}({\bf R}^d)} defined above, in which {s} is allowed to take any real number (including negative numbers) as a value, although the theory becomes somewhat pathological unless one restricts attention to the range {1 < p < \infty}, for reasons having to do with the theory of singular integrals.

As the theory of singular integrals is beyond the scope of this course, we will illustrate this theory only in the model case {p=2}, in which Plancherel’s theorem is available, which allows one to avoid dealing with singular integrals by working purely on the frequency space side.

To explain this, we begin with the Plancherel identity

\displaystyle \int_{{\bf R}^d} |f(x)|^2\ dx = \int_{{\bf R}^d} |\hat f(\xi)|^2\ d\xi,

which is valid for all {L^2({\bf R}^d)} functions and in particular for Schwartz functions {f \in {\mathcal S}({\bf R}^d)}. Also, we know that the Fourier transform of any derivative {\frac{\partial f}{\partial x_j}} of {f} is {2\pi i \xi_j \hat f(\xi)}. From this we see that

\displaystyle \int_{{\bf R}^d} |\frac{\partial f}{\partial x_j}(x)|^2\ dx = \int_{{\bf R}^d} (2\pi |\xi_j|)^2 |\hat f(\xi)|^2\ d\xi,

for all {f \in {\mathcal S}({\bf R}^d)} and so on summing in {j} we have

\displaystyle \int_{{\bf R}^d} |\nabla f(x)|^2\ dx = \int_{{\bf R}^d} (2\pi |\xi|)^2 |\hat f(\xi)|^2\ d\xi.

A similar argument then gives

\displaystyle \int_{{\bf R}^d} |\nabla^j f(x)|^2\ dx = \int_{{\bf R}^d} (2\pi |\xi|)^{2j} |\hat f(\xi)|^2\ d\xi

and so on summing in {j} we have

\displaystyle \|f\|_{W^{k,2}({\bf R}^d)}^2 = \int_{{\bf R}^d} \sum_{j=0}^k (2\pi |\xi|)^{2j} |\hat f(\xi)|^2\ d\xi

for all {k \geq 0} and all Schwartz functions {f \in {\mathcal S}({\bf R}^d)}. Since the Schwartz functions are dense in {W^{k,2}({\bf R}^d)}, a limiting argument (using the fact that {L^2} is complete) then shows that the above formula also holds for all {f \in W^{k,2}({\bf R}^d)}.

Now observe that the quantity {\sum_{j=0}^k (2\pi |\xi|)^{2j}} is comparable (up to constants depending on {k,d}) to the expression {\langle \xi \rangle^{2k}}, where {\langle x \rangle := (1+|x|^2)^{1/2}} (this quantity is sometimes known as the “Japanese bracket” of {x}). We thus conclude that

\displaystyle \| f \|_{W^{k,2}({\bf R}^d)} \sim \| \langle \xi \rangle^k \hat f(\xi) \|_{L^2({\bf R}^d)},

where we use {x \sim y} here to denote the fact that {x} and {y} are comparable up to constants depending on {d,k}, and {\xi} denotes the independent variable on the right-hand side. If we then define, for any real number {s}, the space {H^s({\bf R}^d)} to be the space of all tempered distributions {f} such that the distribution {\langle \xi \rangle^s \hat f(\xi)} lies in {L^2}, and give this space the norm

\displaystyle \|f\|_{H^s({\bf R}^d)} := \| \langle \xi \rangle^s \hat f(\xi) \|_{L^2({\bf R}^d)},

then we see that {W^{k,2}({\bf R}^d)} embeds into {H^k({\bf R}^d)}, and that the norms are equivalent.
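
To make the comparability claim used above concrete, here is one way to see it (a sketch; the constants are not optimized, and {t := |\xi|} is auxiliary notation). By the binomial theorem, {(1+t^2)^k = \sum_{j=0}^k \binom{k}{j} t^{2j}}. Since {1 \leq \binom{k}{j} \leq 2^k} and {1 \leq (2\pi)^{2j} \leq (2\pi)^{2k}} for {0 \leq j \leq k}, we obtain

\displaystyle 2^{-k} \langle \xi \rangle^{2k} \leq \sum_{j=0}^k t^{2j} \leq \sum_{j=0}^k (2\pi t)^{2j} \leq (2\pi)^{2k} \sum_{j=0}^k t^{2j} \leq (2\pi)^{2k} \langle \xi \rangle^{2k}.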

Actually, the two spaces are equal:

Exercise 38 For any {s \in {\bf R}}, show that {{\mathcal S}({\bf R}^d)} is a dense subspace of {H^s({\bf R}^d)}. Use this to conclude that {W^{k,2}({\bf R}^d) = H^k({\bf R}^d)} for all non-negative integers {k}.

It is clear that {H^0({\bf R}^d) \equiv L^2({\bf R}^d)}, and that {H^s({\bf R}^d) \subset H^{s'}({\bf R}^d)} whenever {s > s'}. The spaces {H^s({\bf R}^d)} are also (complex) Hilbert spaces, with the Hilbert space inner product

\displaystyle \langle f, g \rangle_{H^s({\bf R}^d)} := \int_{{\bf R}^d} \langle \xi \rangle^{2s} \hat f(\xi) \overline{\hat g(\xi)}\ d\xi.

It is not hard to verify that this inner product does indeed give {H^s({\bf R}^d)} the structure of a Hilbert space (indeed, it is isomorphic under the Fourier transform to the Hilbert space {L^2(\langle \xi \rangle^{2s} d\xi)} which is isomorphic in turn under the map {F(\xi) \mapsto \langle \xi \rangle^s F(\xi)} to the standard Hilbert space {L^2({\bf R}^d)}).

Being a Hilbert space, {H^s({\bf R}^d)} is isomorphic to its dual {H^s({\bf R}^d)^*} (or more precisely, to the complex conjugate of this dual). There is another duality relationship which is also useful:

Exercise 39 (Duality between {H^s} and {H^{-s}}) Let {s \in {\bf R}}. Show that for any continuous linear functional {\lambda: H^s({\bf R}^d) \rightarrow {\bf C}} there exists a unique {g \in H^{-s}({\bf R}^d)} such that

\displaystyle \lambda(f) = \langle f, g \rangle_{L^2({\bf R}^d)}

for all {f \in H^s({\bf R}^d)}, where the inner product {\langle f, g \rangle_{L^2({\bf R}^d)}} is defined via the Fourier transform as

\displaystyle \langle f, g\rangle_{L^2({\bf R}^d)} := \int_{{\bf R}^d} \hat f(\xi) \overline{\hat g(\xi)}\ d\xi.

Also show that

\displaystyle \|f\|_{H^s({\bf R}^d)} = \sup \{ |\langle f, g \rangle_{L^2({\bf R}^d)}|: g \in {\mathcal S}({\bf R}^d); \|g\|_{H^{-s}({\bf R}^d)} \leq 1 \}

for all {f \in H^s({\bf R}^d)}.

The {H^s} Sobolev spaces also enjoy the same type of embedding estimates as their classical counterparts:

Exercise 40 (Sobolev embedding for {H^s}, I) If {s > d/2}, show that {H^s({\bf R}^d)} embeds continuously into {C^{0,\alpha}({\bf R}^d)} whenever {0 < \alpha \leq \min( s-\frac{d}{2}, 1)}. (Hint: use the Fourier inversion formula and the Cauchy-Schwarz inequality.)

Exercise 41 (Sobolev embedding for {H^s}, II) If {0 < s < d/2}, show that {H^s({\bf R}^d)} embeds continuously into {L^q({\bf R}^d)} whenever {\frac{d}{2}-s \leq \frac{d}{q} \leq \frac{d}{2}}. (Hint: it suffices to handle the extreme case {\frac{d}{q} = \frac{d}{2}-s}. For this, first reduce the task of establishing the bound {\|f\|_{L^q({\bf R}^d)} \leq C \|f\|_{H^s({\bf R}^d)}} (where {C} depends on {s,d,q}) to the case when {f \in H^s({\bf R}^d)} is a Schwartz function whose Fourier transform vanishes near the origin, and write {\hat f(\xi) = \hat g(\xi)/|\xi|^s} for some {g} which is bounded in {L^2({\bf R}^d)}. Then use Exercise 35 from Notes 3 and Corollary 7 from Notes 1.)

Exercise 42 In this exercise we develop a more elementary variant of Sobolev spaces, the {L^p} Hölder spaces. For any {1 \leq p \leq \infty} and {0 < \alpha < 1}, let {\Lambda^p_\alpha({\bf R}^d)} be the space of functions {f} whose norm

\displaystyle \|f\|_{\Lambda^p_\alpha({\bf R}^d)} := \|f\|_{L^p({\bf R}^d)} + \sup_{x \in {\bf R}^d \backslash \{0\}} \frac{ \| \tau_x f - f \|_{L^p({\bf R}^d)} }{|x|^\alpha}

is finite, where {\tau_x f(y) := f(y-x)} is the translation of {f} by {x}. Note that {\Lambda^\infty_\alpha({\bf R}^d) = C^{0,\alpha}({\bf R}^d)} (with equivalent norms).

  • (i) For any {0 < \alpha < 1}, establish the inclusions {\Lambda^2_{\alpha+\varepsilon}({\bf R}^d) \subset H^\alpha({\bf R}^d) \subset \Lambda^2_\alpha({\bf R}^d)} for any {0 < \varepsilon < 1-\alpha}. (Hint: take Fourier transforms and work in frequency space.)
  • (ii) Let {\phi \in C^\infty_c({\bf R}^d)} be a bump function, and let {\phi_n} be the approximations to the identity {\phi_n(x) := 2^{dn} \phi(2^n x)}. If {f \in \Lambda^p_\alpha({\bf R}^d)}, show that one has the equivalence

    \displaystyle \|f\|_{\Lambda^p_\alpha({\bf R}^d)} \sim \|f\|_{L^p({\bf R}^d)} + \sup_{n \geq 0} 2^{\alpha n} \| f*\phi_{n+1} - f*\phi_n\|_{L^p({\bf R}^d)}

    where we use {x \sim y} to denote the assertion that {x} and {y} are comparable up to constants depending on {p,d,\alpha}. (Hint: To upper bound {\| \tau_x f - f \|_{L^p({\bf R}^d)}} for {|x| \leq 1}, express {f} as a telescoping sum of {f*\phi_{n+1}-f*\phi_n} for {2^{-n} \leq |x|}, plus a final term {f*\phi_{n_0}} where {2^{-n_0}} is comparable to {|x|}.)

  • (iii) If {1 \leq p \leq q \leq \infty} and {0 < \alpha < 1} are such that {\frac{d}{p}-\alpha < \frac{d}{q}}, show that {\Lambda^p_\alpha({\bf R}^d)} embeds continuously into {L^q({\bf R}^d)}. (Hint: express {f(x)} as {f*\phi_1*\phi_0} plus a telescoping series of {f*\phi_{n+1}*\phi_n-f*\phi_n*\phi_{n-1}}, where {\phi_n} is as in part (ii). The additional convolution is in place in order to apply Young’s inequality.)

The functions {f*\phi_{n+1}-f*\phi_n} are crude versions of Littlewood-Paley projections, which play an important role in harmonic analysis and nonlinear wave and dispersive equations.

Exercise 43 (Sobolev trace theorem, special case) Let {s > 1/2}. For any {f \in C^\infty_c({\bf R}^d)}, establish the Sobolev trace inequality

\displaystyle \| f\downharpoonright_{{\bf R}^{d-1}} \|_{H^{s-1/2}({\bf R}^{d-1})} \leq C \|f\|_{H^s({\bf R}^d)}

where {C} depends only on {d} and {s}, and {f\downharpoonright_{{\bf R}^{d-1}}} is the restriction of {f} to the standard hyperplane {{\bf R}^{d-1} \equiv {\bf R}^{d-1} \times \{0\} \subset {\bf R}^d}. (Hint: Convert everything to {L^2}-based statements involving the Fourier transform of {f}, and use either the Cauchy-Schwarz inequality or Schur’s test, see Lemma 5 of Notes 1.)

Exercise 44

  • (i) Show that if {f \in H^s({\bf R}^d)} for some {s \in {\bf R}}, and {g \in C^\infty({\bf R}^d)}, then {fg \in H^s({\bf R}^d)} (note that this product has to be defined in the sense of tempered distributions if {s} is negative), and the map {f \mapsto fg} is continuous from {H^s({\bf R}^d)} to {H^s({\bf R}^d)}. (Hint: First prove this when {s} is a non-negative integer using an argument similar to that in Exercise 36, then exploit duality to handle the case of negative integer {s}. To handle the remaining cases, decompose the Fourier transform of {f} into annular regions of the form {\{ \xi: 2^n \leq |\xi| \leq 2^{n+1}\}} for {n \geq 0}, as well as the ball {\{ \xi: |\xi| \leq 1 \}}, and use the preceding cases to estimate the {L^2} norm of the Fourier transform of {fg} on these annular regions and on the ball.)
  • (ii) Let {L} be a partial differential operator of order {m} with coefficients in {C^\infty({\bf R}^d)} for some {m \geq 0}. Show that {L} maps {H^s({\bf R}^d)} continuously to {H^{s-m}({\bf R}^d)} for all {s \in {\bf R}}.

Now we consider a partial converse to Exercise 44.

Exercise 45 (Elliptic regularity) Let {m \geq 0}, and let

\displaystyle L = \sum_{j_1,\ldots,j_d \geq 0; j_1+\ldots+j_d=m} c_{j_1,\ldots,j_d} \frac{\partial^m}{\partial x_1^{j_1} \ldots \partial x_d^{j_d}}

be a constant-coefficient homogeneous differential operator of order {m}. Define the symbol {l: {\bf R}^d \rightarrow {\bf C}} of {L} to be the homogeneous polynomial of degree {m}, defined by the formula

\displaystyle l(\xi_1,\ldots,\xi_d) := \sum_{j_1,\ldots,j_d \geq 0; j_1+\ldots+j_d=m} c_{j_1,\ldots,j_d} \xi_1^{j_1} \ldots \xi_d^{j_d}.

We say that {L} is elliptic if one has the lower bound

\displaystyle |l(\xi)| \geq c |\xi|^m

for all {\xi \in {\bf R}^d} and some constant {c>0}. Thus, for instance, the Laplacian is elliptic. Another example of an elliptic operator is the Cauchy-Riemann operator {\frac{\partial}{\partial x_1} - i \frac{\partial}{\partial x_2}} in {{\bf R}^2}. On the other hand, the heat operator {\frac{\partial}{\partial t} - \Delta}, the Schrödinger operator {i\frac{\partial}{\partial t} + \Delta}, and the wave operator {-\frac{\partial^2}{\partial t^2}+\Delta} are not elliptic on {{\bf R}^{1+d}}.
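
These examples can be checked directly against the definition (a brief sketch; {\tau} is notation introduced here for the frequency variable dual to {t}). The Laplacian {\Delta = \sum_{j=1}^d \frac{\partial^2}{\partial x_j^2}} has symbol

\displaystyle l(\xi) = \xi_1^2 + \ldots + \xi_d^2 = |\xi|^2,

so the lower bound holds with {m=2} and {c=1}. The Cauchy-Riemann operator has symbol {l(\xi_1,\xi_2) = \xi_1 - i \xi_2}, so that {|l(\xi)| = (\xi_1^2+\xi_2^2)^{1/2} = |\xi|} and the lower bound holds with {m=1} and {c=1}. In contrast, the wave operator has symbol

\displaystyle l(\tau,\xi) = -\tau^2 + |\xi|^2,

which vanishes on the cone {\tau = \pm |\xi|}, so no lower bound of the form {|l(\tau,\xi)| \geq c |(\tau,\xi)|^2} with {c>0} is possible. The heat and Schrödinger operators are not homogeneous, and their second-order part {\pm \Delta} has symbol {\pm |\xi|^2}, which vanishes at every point {(\tau,0)} of {{\bf R}^{1+d}}.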

  • (i) Show that if {L} is elliptic of order {m}, and {f} is a tempered distribution such that {f, Lf \in H^s({\bf R}^d)}, then {f \in H^{s+m}({\bf R}^d)}, and that one has the bound

    \displaystyle \| f\|_{H^{s+m}({\bf R}^d)} \leq C ( \|f\|_{H^s({\bf R}^d)} + \| Lf \|_{H^s({\bf R}^d)} ) \ \ \ \ \ (2)

    for some {C} depending on {s,m,d,L}. (Hint: Once again, rewrite everything in terms of the Fourier transform {\hat f} of {f}.)

  • (ii) Show that if {L} is a constant-coefficient differential operator of order {m} which is not elliptic, then the estimate (2) fails.
  • (iii) Let {f \in L^2_{loc}({\bf R}^d)} be a function which is locally in {L^2}, and let {L} be an elliptic operator of order {m}. Show that if {Lf=0}, then {f} is smooth. (Hint: First show inductively that {f \phi \in H^k({\bf R}^d)} for every test function {\phi} and every natural number {k \geq 0}.)

Remark 46 The symbol {l} of an elliptic operator (with real coefficients) tends to have level sets that resemble ellipsoids, hence the name. In contrast, the symbol of parabolic operators such as the heat operator {\frac{\partial}{\partial t}-\Delta} has level sets resembling paraboloids, and the symbol of hyperbolic operators such as the wave operator {-\frac{\partial^2}{\partial t^2}+\Delta} has level sets resembling hyperboloids. The symbol in fact encodes many important features of linear differential operators, in particular controlling whether singularities can form, and how they must propagate in space and/or time; but this topic is beyond the scope of this course.