
As discussed in previous notes, a function space norm can be viewed as a means to rigorously quantify various statistics of a function ${f: X \rightarrow {\bf C}}$. For instance, the “height” and “width” can be quantified via the ${L^p(X,\mu)}$ norms (and their relatives, such as the Lorentz norms ${\|f\|_{L^{p,q}(X,\mu)}}$). Indeed, if ${f}$ is a step function ${f = A 1_E}$, then the ${L^p}$ norm of ${f}$ is a combination ${\|f\|_{L^p(X,\mu)} = |A| \mu(E)^{1/p}}$ of the height (or amplitude) ${A}$ and the width ${\mu(E)}$.

However, there are more features of a function ${f}$ of interest than just its width and height. When the domain ${X}$ is a Euclidean space ${{\bf R}^d}$ (or domains related to Euclidean spaces, such as open subsets of ${{\bf R}^d}$, or manifolds), then another important feature of such functions (especially in PDE) is the regularity of a function, as well as the related concept of the frequency scale of a function. These terms are not rigorously defined; but roughly speaking, regularity measures how smooth a function is (or how many times one can differentiate the function before it ceases to be a function), while the frequency scale of a function measures how quickly the function oscillates (and would be inversely proportional to the wavelength). One can illustrate this informal concept with some examples:

• Let ${\phi \in C^\infty_c({\bf R})}$ be a test function that equals ${1}$ near the origin, and let ${N}$ be a large number. Then the function ${f(x) := \phi(x) \sin(Nx)}$ oscillates at a wavelength of about ${1/N}$, and so has a frequency scale of about ${N}$. While ${f}$ is, strictly speaking, a smooth function, it becomes increasingly less smooth in the limit ${N \rightarrow \infty}$; for instance, the derivative ${f'(x) = \phi'(x) \sin(Nx) + N \phi(x) \cos(Nx)}$ grows at a roughly linear rate as ${N \rightarrow \infty}$, and the higher derivatives grow at even faster rates. So this function does not really have any regularity in the limit ${N \rightarrow \infty}$. Note however that the height and width of this function are bounded uniformly in ${N}$; so regularity and frequency scale are independent of height and width.
• Continuing the previous example, now consider the function ${g(x) := N^{-s} \phi(x) \sin(Nx)}$, where ${s \geq 0}$ is some parameter. This function also has a frequency scale of about ${N}$. But now it has a certain amount of regularity, even in the limit ${N \rightarrow \infty}$; indeed, one easily checks that the ${k^{th}}$ derivative of ${g}$ stays bounded in ${N}$ as long as ${k \leq s}$. So one could view this function as having “${s}$ degrees of regularity” in the limit ${N \rightarrow \infty}$.
• In a similar vein, the function ${N^{-s} \phi(Nx)}$ also has a frequency scale of about ${N}$, and can be viewed as having ${s}$ degrees of regularity in the limit ${N \rightarrow \infty}$.
• The function ${\phi(x) |x|^s 1_{x > 0}}$ also has about ${s}$ degrees of regularity, in the sense that it can be differentiated up to ${s}$ times before becoming unbounded. By performing a dyadic decomposition of the ${x}$ variable, one can also decompose this function into components ${\psi(2^n x) |x|^s}$ for ${n \geq 0}$, where ${\psi(x) := (\phi(x)-\phi(2x)) 1_{x>0}}$ is a bump function supported away from the origin; each such component has frequency scale about ${2^n}$ and ${s}$ degrees of regularity. Thus we see that the original function ${\phi(x) |x|^s 1_{x > 0}}$ has a range of frequency scales, ranging from about ${1}$ all the way to ${+\infty}$.
• One can of course concoct higher-dimensional analogues of these examples. For instance, the localised plane wave ${\phi(x) \sin(\xi \cdot x)}$ in ${{\bf R}^d}$, where ${\phi \in C^\infty_c({\bf R}^d)}$ is a test function, would have a frequency scale of about ${|\xi|}$.
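
The first two examples can be checked numerically with a minimal Python sketch. It is an illustration under stated assumptions rather than part of the argument: a Gaussian stands in for the compactly supported test function ${\phi}$ (it has comparable height and width), suprema are approximated on a finite grid, and the helper names `sup_on_grid`, `sup_g`, `sup_dg` and `growth_exponent` are hypothetical. The sketch estimates the power of ${N}$ at which the sup norms of ${g(x) = N^{-s} \phi(x) \sin(Nx)}$ and of ${g'}$ grow, by comparing two values of ${N}$.

```python
import math


def phi(x):
    # Assumption: a Gaussian envelope replaces the compactly supported bump.
    return math.exp(-x * x)


def dphi(x):
    # Exact derivative of the Gaussian envelope above.
    return -2.0 * x * math.exp(-x * x)


def sup_on_grid(func, lo=-5.0, hi=5.0, steps=100_000):
    # Approximate sup |func| on [lo, hi] by sampling a uniform grid.
    h = (hi - lo) / steps
    return max(abs(func(lo + i * h)) for i in range(steps + 1))


def sup_g(N, s):
    # Sup norm of g(x) = N^{-s} phi(x) sin(N x).
    return sup_on_grid(lambda x: N ** (-s) * phi(x) * math.sin(N * x))


def sup_dg(N, s):
    # Sup norm of g'(x), computed from the product rule.
    return sup_on_grid(
        lambda x: N ** (-s)
        * (dphi(x) * math.sin(N * x) + N * phi(x) * math.cos(N * x))
    )


def growth_exponent(quantity, N1, N2):
    # If quantity(N) behaves like C * N^a, this returns approximately a.
    return math.log(quantity(N2) / quantity(N1)) / math.log(N2 / N1)
```

For instance, `growth_exponent(lambda N: sup_g(N, 0.0), 100, 200)` should come out close to ${0}$ and `growth_exponent(lambda N: sup_dg(N, 0.0), 100, 200)` close to ${1}$, while replacing `0.0` by `0.5` in the latter should lower the exponent to about ${1/2}$. This is consistent with the heuristic that the ${k^{th}}$ derivative of ${g}$ scales like ${N^{k-s}}$.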

There are a variety of function space norms that can be used to capture frequency scale (or regularity) in addition to height and width. The most common and well-known examples of such spaces are the Sobolev space norms ${\| f\|_{W^{s,p}({\bf R}^d)}}$, although there are a number of other norms with similar features (such as Hölder norms, Besov norms, and Triebel-Lizorkin norms). Very roughly speaking, the ${W^{s,p}}$ norm is like the ${L^p}$ norm, but with “${s}$ additional degrees of regularity”. For instance, in one dimension, the function ${A \phi(x/R) \sin(Nx)}$, where ${\phi}$ is a fixed test function and ${R, N}$ are large, will have a ${W^{s,p}}$ norm of about ${|A| R^{1/p} N^s}$, thus combining the “height” ${|A|}$, the “width” ${R}$, and the “frequency scale” ${N}$ of this function together. (Compare this with the ${L^p}$ norm of the same function, which is about ${|A| R^{1/p}}$.)
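
For integer ${s}$ this heuristic can be checked by hand. The following is a rough sketch for ${s=1}$, with implied constants depending on ${\phi}$ and ${p}$, and under the assumed convention that the ${W^{1,p}}$ norm is comparable to ${\|f\|_{L^p} + \|f'\|_{L^p}}$. Writing ${f(x) := A \phi(x/R) \sin(Nx)}$, the product rule gives two terms of very different size:

```latex
f'(x) = \frac{A}{R}\, \phi'(x/R) \sin(Nx) + A N\, \phi(x/R) \cos(Nx),
\qquad
\Big\| \frac{A}{R}\, \phi'(x/R) \sin(Nx) \Big\|_{L^p({\bf R})} \lesssim |A| R^{\frac{1}{p}-1},
\qquad
\big\| A N\, \phi(x/R) \cos(Nx) \big\|_{L^p({\bf R})} \approx |A| R^{\frac{1}{p}} N.
```

Since ${R}$ and ${N}$ are large, the second term dominates, so ${\|f'\|_{L^p}}$ is about ${|A| R^{1/p} N}$; each further derivative contributes another factor of ${N}$ in the same way.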

To a large extent, the theory of the Sobolev spaces ${W^{s,p}({\bf R}^d)}$ resembles that of their Lebesgue counterparts ${L^p({\bf R}^d)}$ (which are the special case of Sobolev spaces with ${s=0}$), but with the additional benefit of interacting very nicely with (weak) derivatives: a first derivative ${\frac{\partial f}{\partial x_j}}$ of a function in an ${L^p}$ space usually leaves all Lebesgue spaces, but a first derivative of a function in the Sobolev space ${W^{s,p}}$ will end up in another Sobolev space ${W^{s-1,p}}$. This compatibility with the differentiation operation begins to explain why Sobolev spaces are so useful in the theory of partial differential equations. Furthermore, the regularity parameter ${s}$ in Sobolev spaces is not restricted to be a natural number; it can be any real number, and one can use fractional derivative or integration operators to move from one regularity to another. Despite the fact that most partial differential equations involve differential operators of integer order, fractional spaces are still of importance; for instance, it often turns out that the Sobolev spaces which are critical (scale-invariant) for a certain PDE are of fractional order.

The uncertainty principle in Fourier analysis places a constraint between the width and frequency scale of a function; roughly speaking (and in one dimension for simplicity), the product of the two quantities has to be bounded away from zero (or to put it another way, a wave is always at least as wide as its wavelength). This constraint can be quantified as the very useful Sobolev embedding theorem, which allows one to trade regularity for integrability: a function in a Sobolev space ${W^{s,p}}$ will automatically lie in a number of other Sobolev spaces ${W^{\tilde s,\tilde p}}$ with ${\tilde s < s}$ and ${\tilde p > p}$; in particular, one can often embed Sobolev spaces into Lebesgue spaces. The trade is not reversible: one cannot start with a function with a lot of integrability and no regularity, and expect to recover regularity in a space of lower integrability. (One can already see this with the most basic example of Sobolev embedding, coming from the fundamental theorem of calculus. If a (continuously differentiable) function ${f: {\bf R} \rightarrow {\bf R}}$ has ${f'}$ in ${L^1({\bf R})}$, then we of course have ${f \in L^\infty({\bf R})}$; but the converse is far from true.)
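
The fundamental theorem of calculus example can be written out in one line (a sketch; the base point ${0}$ is an arbitrary choice). For a continuously differentiable ${f}$ with ${f' \in L^1({\bf R})}$ and any ${x \in {\bf R}}$,

```latex
f(x) = f(0) + \int_0^x f'(t)\, dt,
\qquad \hbox{so} \qquad
\sup_{x \in {\bf R}} |f(x)| \leq |f(0)| + \|f'\|_{L^1({\bf R})}.
```

For the failure of the converse, ${f(x) = \sin(x)}$ is bounded, but its derivative ${f'(x) = \cos(x)}$ is not absolutely integrable on ${{\bf R}}$.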

Plancherel’s theorem reveals that Fourier-analytic tools are particularly powerful when applied to ${L^2}$ spaces. Because of this, the Fourier transform is very effective at dealing with the ${L^2}$-based Sobolev spaces ${W^{s,2}({\bf R}^d)}$, often abbreviated ${H^s({\bf R}^d)}$. Indeed, using the fact that the Fourier transform converts regularity to decay, we will see that the ${H^s({\bf R}^d)}$ spaces are nothing more than Fourier transforms of weighted ${L^2}$ spaces, and in particular enjoy a Hilbert space structure. These Sobolev spaces, and in particular the energy space ${H^1({\bf R}^d)}$, are of particular importance in any PDE that involves some sort of energy functional (this includes large classes of elliptic, parabolic, dispersive, and wave equations, and especially those equations connected to physics and/or geometry).
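
Concretely, one common way to write the ${H^s}$ norm is the following. It is stated here as a standard convention rather than as the definition adopted later, and the constant depends on how the Fourier transform ${\hat f}$ is normalised:

```latex
\| f \|_{H^s({\bf R}^d)} := \Big( \int_{{\bf R}^d} (1 + |\xi|^2)^s\, |\hat f(\xi)|^2\, d\xi \Big)^{1/2}.
```

Thus ${f}$ lies in ${H^s}$ exactly when ${\hat f}$ lies in ${L^2}$ with respect to the weight ${(1+|\xi|^2)^s}$, and the Hilbert space structure comes from the inner product ${\int_{{\bf R}^d} (1+|\xi|^2)^s \hat f(\xi) \overline{\hat g(\xi)}\ d\xi}$.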

We will not fully develop the theory of Sobolev spaces here, as this would require the theory of singular integrals, which is beyond the scope of this course. There are of course many references for further reading; one is Stein’s “Singular integrals and differentiability properties of functions”.

Title: Use basic examples to calibrate exponents

Motivation: In the more quantitative areas of mathematics, such as analysis and combinatorics, one has to frequently keep track of a large number of exponents in one’s identities, inequalities, and estimates.  For instance, if one is studying a set of N elements, then many expressions that one is faced with will often involve some power $N^p$ of N; if one is instead studying a function f on a measure space X, then perhaps it is an $L^p$ norm $\|f\|_{L^p(X)}$ which will appear instead.  The exponent $p$ involved will typically evolve slowly over the course of the argument, as various algebraic or analytic manipulations are applied.  In some cases, the exact value of this exponent is immaterial, but at other times it is crucial to have the correct value of $p$ at hand.   One can (and should) of course carefully go through one’s arguments line by line to work out the exponents correctly, but it is all too easy to make a sign error or other mis-step at one of the lines, causing all the exponents on subsequent lines to be incorrect.  However, one can guard against this (and avoid some tedious line-by-line exponent checking) by continually calibrating these exponents at key junctures of the arguments by using basic examples of the object of study (sets, functions, graphs, etc.) as test cases.  This is a simple trick, but it lets one avoid many unforced errors with exponents, and also lets one compute more rapidly.

Quick description: When trying to quickly work out what an exponent p in an estimate, identity, or inequality should be without deriving that statement line-by-line, test that statement with a simple example which has non-trivial behaviour with respect to that exponent p, but trivial behaviour with respect to as many other components of that statement as one is able to manage.   The “non-trivial” behaviour should be parametrised by some very large or very small parameter.  By matching the dependence on this parameter on both sides of the estimate, identity, or inequality, one should recover p (or at least a good prediction as to what p should be).
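
As an illustration (a sketch, not taken from the description above), suppose one expects an inequality of the form ${\|f\|_{L^q({\bf R}^d)} \leq C \|\nabla f\|_{L^p({\bf R}^d)}}$ for test functions ${f}$, but has forgotten how ${q}$ depends on ${p}$ and ${d}$. One can test it with the rescaled bump functions ${f_\lambda(x) := \phi(x/\lambda)}$, where ${\phi}$ is a fixed non-zero test function and ${\lambda > 0}$ is the very large or very small parameter. A change of variables gives

```latex
\| f_\lambda \|_{L^q({\bf R}^d)} = \lambda^{d/q}\, \| \phi \|_{L^q({\bf R}^d)},
\qquad
\| \nabla f_\lambda \|_{L^p({\bf R}^d)} = \lambda^{\frac{d}{p} - 1}\, \| \nabla \phi \|_{L^p({\bf R}^d)}.
```

For the inequality to survive both ${\lambda \rightarrow 0}$ and ${\lambda \rightarrow \infty}$, the powers of ${\lambda}$ must match: ${d/q = d/p - 1}$, or equivalently ${1/q = 1/p - 1/d}$. This is only a necessary condition, but it recovers the exponent relation in the Sobolev embedding for one derivative.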

General discussion: The test examples should be as basic as possible; ideally they should have trivial behaviour in all aspects except for one feature that relates to the exponent p that one is trying to calibrate, thus being only “barely” non-trivial.   When the object of study is a function, then (appropriately rescaled, or otherwise modified) bump functions are very typical test objects, as are Dirac masses, constant functions, Gaussians, or other functions that are simple and easy to compute with.  In additive combinatorics, when the object of study is a subset of a group, then subgroups, arithmetic progressions, or random sets are typical test objects.  In graph theory, typical examples of test objects include complete graphs, complete bipartite graphs, and random graphs. And so forth.
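
A graph-theoretic calibration can be sketched in a few lines of Python. Suppose (as a hypothetical target) one wants to guess the exponent ${p}$ in a bound of the shape "number of triangles ${\leq C \cdot}$ (number of edges)${^p}$". The complete graph ${K_n}$ is a natural test object with the single large parameter ${n}$. The function names below are illustrative, and the output is only a prediction for ${p}$ based on this one family of test cases, not a proof.

```python
import math


def edges_complete(n):
    """Number of edges in the complete graph K_n."""
    return math.comb(n, 2)


def triangles_complete(n):
    """Number of triangles in the complete graph K_n."""
    return math.comb(n, 3)


def calibrate_exponent(n_small, n_large):
    """Estimate p in 'triangles ~ edges^p' from two sizes of K_n.

    Compares how the two counts grow between n_small and n_large
    and returns the ratio of the logarithmic growth rates.
    """
    log_triangle_ratio = math.log(
        triangles_complete(n_large) / triangles_complete(n_small)
    )
    log_edge_ratio = math.log(edges_complete(n_large) / edges_complete(n_small))
    return log_triangle_ratio / log_edge_ratio


if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(n, calibrate_exponent(n, 2 * n))
```

As ${n}$ grows, the estimate approaches ${3/2}$, matching the hand computation that ${K_n}$ has about ${n^2/2}$ edges and about ${n^3/6}$ triangles.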

This trick is closely related to that of using dimensional analysis to recover exponents; indeed, one can view dimensional analysis as the special case of exponent calibration when using test objects which are non-trivial in one dimensional aspect (e.g. they exist at a single very large or very small length scale) but are otherwise of a trivial or “featureless” nature.   But the calibration trick is more general, as it can involve parameters (such as probabilities, angles, or eccentricities) which are not commonly associated with the physical concept of a dimension.  And personally, I find example-based calibration to be a much more satisfying (and convincing) explanation of an exponent than a calibration arising from formal dimensional analysis.

When one is trying to calibrate an inequality or estimate, one should try to pick a basic example which one expects to saturate that inequality or estimate, i.e. an example for which the inequality is close to being an equality.  Otherwise, one would only expect to obtain some partial information on the desired exponent p (e.g. a lower bound or an upper bound only).  Knowing the examples that saturate an estimate that one is trying to prove is also useful for several other reasons – for instance, it strongly suggests that any technique which is not efficient when applied to the saturating example, is unlikely to be strong enough to prove the estimate in general, thus eliminating fruitless approaches to a problem and (hopefully) refocusing one’s attention on those strategies which actually have a chance of working.

Calibration is best used for the type of quick-and-dirty calculations one uses when trying to rapidly map out an argument that one has roughly worked out already, but without precise details; in particular, I find it useful when writing up a rapid prototype.  When the time comes to write out the paper in full detail, then of course one should instead carefully work things out line by line, but if all goes well, the exponents obtained in that process should match up with the preliminary guesses for those exponents obtained by calibration, which adds confidence that no exponent errors have been committed.