
A fundamental characteristic of many mathematical spaces (e.g. vector spaces, metric spaces, topological spaces, etc.) is their dimension, which measures the “complexity” or “degrees of freedom” inherent in the space. There is no single notion of dimension; instead, there are a variety of different versions of this concept, with different versions being suitable for different classes of mathematical spaces. Typically, a single mathematical object may have several subtly different notions of dimension that one can place on it, which will be related to each other, and which will often agree with each other in “non-pathological” cases, but can also deviate from each other in many other situations. For instance:

• One can define the dimension of a space ${X}$ by seeing how it compares to some standard reference spaces, such as ${{\bf R}^n}$ or ${{\bf C}^n}$; one may view a space as having dimension ${n}$ if it can be (locally or globally) identified with a standard ${n}$-dimensional space. The dimension of a vector space or a manifold can be defined in this fashion.
• Another way to define dimension of a space ${X}$ is as the largest number of “independent” objects one can place inside that space; this can be used to give an alternate notion of dimension for a vector space, or of an algebraic variety, as well as the closely related notion of the transcendence degree of a field. The concept of VC dimension in machine learning also broadly falls into this category.
• One can also try to define dimension inductively, for instance declaring a space ${X}$ to be ${n}$-dimensional if it can be “separated” somehow by an ${n-1}$-dimensional object; thus an ${n}$-dimensional object will tend to have “maximal chains” of sub-objects of length ${n}$ (or ${n+1}$, depending on how one initialises the chain and how one defines length). This can give a notion of dimension for a topological space or a commutative ring.

The notions of dimension as defined above tend to necessarily take values in the natural numbers (or the cardinal numbers); there is no such space as ${{\bf R}^{\sqrt{2}}}$, for instance, nor can one talk about a basis consisting of ${\pi}$ linearly independent elements, or a chain of maximal ideals of length ${e}$. There is however a somewhat different approach to the concept of dimension which makes no distinction between integer and non-integer dimensions, and is suitable for studying “rough” sets such as fractals. The starting point is to observe that in the ${d}$-dimensional space ${{\bf R}^d}$, the volume ${V}$ of a ball of radius ${R}$ grows like ${R^d}$, thus giving the following heuristic relationship

$\displaystyle \frac{\log V}{\log R} \approx d \ \ \ \ \ (1)$

between volume, scale, and dimension. Formalising this heuristic leads to a number of useful notions of dimension for subsets of ${{\bf R}^n}$ (or more generally, for metric spaces), including (upper and lower) Minkowski dimension (also known as box-packing dimension or Minkowski-Bouligand dimension), and Hausdorff dimension.

[In ${K}$-theory, it is also convenient to work with “virtual” vector spaces or vector bundles, such as formal differences of such spaces, and which may therefore have a negative dimension; but as far as I am aware there is no connection between this notion of dimension and the metric ones given here.]

Minkowski dimension can either be defined externally (relating the external volume of ${\delta}$-neighbourhoods of a set ${E}$ to the scale ${\delta}$) or internally (relating the internal ${\delta}$-entropy of ${E}$ to the scale). Hausdorff dimension is defined internally by first introducing the ${d}$-dimensional Hausdorff measure of a set ${E}$ for any parameter ${0 \leq d < \infty}$, which generalises the familiar notions of length, area, and volume to non-integer dimensions, or to rough sets, and is of interest in its own right. Hausdorff dimension has a lengthier definition than its Minkowski counterpart, but is more robust with respect to operations such as countable unions, and is generally accepted as the “standard” notion of dimension in metric spaces. We will compare these concepts against each other later in these notes.
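As a rough illustration of the internal point of view, the following Python sketch counts how many triadic intervals of length ${\delta = 3^{-k}}$ meet a finite-stage approximation of the middle-thirds Cantor set, and then forms the ratio ${\log N_\delta / \log(1/\delta)}$ suggested by the heuristic (1). The choice of the Cantor set, the function names, and the use of grid intervals as a stand-in for the ${\delta}$-entropy (which they match up to constant factors) are assumptions made here for illustration; integer arithmetic is used to avoid rounding issues.

```python
import math
from itertools import product

def cantor_left_endpoints(n):
    """Left endpoints of the 2^n intervals of the stage-n middle-thirds Cantor set,
    stored as integers m representing the point m / 3^n (ternary digits 0 or 2 only)."""
    points = []
    for digits in product((0, 2), repeat=n):
        m = 0
        for d in digits:
            m = 3 * m + d
        points.append(m)
    return points

def box_count(points, n, k):
    """Number of triadic intervals [j/3^k, (j+1)/3^k) containing at least one point."""
    cell = 3 ** (n - k)  # width of one box, measured in units of 3^-n
    return len({m // cell for m in points})

def dimension_estimate(n, k):
    """The ratio log N_delta / log(1/delta) at scale delta = 3^-k."""
    count = box_count(cantor_left_endpoints(n), n, k)
    return math.log(count) / math.log(3 ** k)

if __name__ == "__main__":
    stage = 10
    for k in range(1, 9):
        print(k, box_count(cantor_left_endpoints(stage), stage, k),
              dimension_estimate(stage, k))
```

In this toy model the count at scale ${3^{-k}}$ is ${2^k}$, so the ratio equals ${\log 2/\log 3}$ at every scale, a non-integer value.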

One use of the notion of dimension is to create finer distinctions between various types of “small” subsets of spaces such as ${{\bf R}^n}$, beyond what can be achieved by the usual Lebesgue measure (or Baire category). For instance, a point, line, and plane in ${{\bf R}^3}$ all have zero measure with respect to three-dimensional Lebesgue measure (and are nowhere dense), but of course have different dimensions (${0}$, ${1}$, and ${2}$ respectively). (The Kakeya set conjecture, discussed recently on this blog, offers another good example.) This can be used to clarify the nature of various singularities, such as that arising from non-smooth solutions to PDE; a function which is non-smooth on a set of large Hausdorff dimension can be considered less smooth than one which is non-smooth on a set of small Hausdorff dimension, even if both are smooth almost everywhere. While many properties of the singular set of such a function are worth studying (e.g. their rectifiability), understanding their dimension is often an important starting point. The interplay between these types of concepts is the subject of geometric measure theory.

As discussed in previous notes, a function space norm can be viewed as a means to rigorously quantify various statistics of a function ${f: X \rightarrow {\bf C}}$. For instance, the “height” and “width” can be quantified via the ${L^p(X,\mu)}$ norms (and their relatives, such as the Lorentz norms ${\|f\|_{L^{p,q}(X,\mu)}}$). Indeed, if ${f}$ is a step function ${f = A 1_E}$, then the ${L^p}$ norm of ${f}$ is a combination ${\|f\|_{L^p(X,\mu)} = |A| \mu(E)^{1/p}}$ of the height (or amplitude) ${A}$ and the width ${\mu(E)}$.

However, there are more features of a function ${f}$ of interest than just its width and height. When the domain ${X}$ is a Euclidean space ${{\bf R}^d}$ (or domains related to Euclidean spaces, such as open subsets of ${{\bf R}^d}$, or manifolds), then another important feature of such functions (especially in PDE) is the regularity of a function, as well as the related concept of the frequency scale of a function. These terms are not rigorously defined; but roughly speaking, regularity measures how smooth a function is (or how many times one can differentiate the function before it ceases to be a function), while the frequency scale of a function measures how quickly the function oscillates (and would be inversely proportional to the wavelength). One can illustrate this informal concept with some examples:

• Let ${\phi \in C^\infty_c({\bf R})}$ be a test function that equals ${1}$ near the origin, and ${N}$ be a large number. Then the function ${f(x) := \phi(x) \sin(Nx)}$ oscillates at a wavelength of about ${1/N}$, and a frequency scale of about ${N}$. While ${f}$ is, strictly speaking, a smooth function, it becomes increasingly less smooth in the limit ${N \rightarrow \infty}$; for instance, the derivative ${f'(x) = \phi'(x) \sin(Nx) + N \phi(x) \cos(Nx)}$ grows at a roughly linear rate as ${N \rightarrow \infty}$, and the higher derivatives grow at even faster rates. So this function does not really have any regularity in the limit ${N \rightarrow \infty}$. Note however that the height and width of this function are bounded uniformly in ${N}$; so regularity and frequency scale are independent of height and width.
• Continuing the previous example, now consider the function ${g(x) := N^{-s} \phi(x) \sin(Nx)}$, where ${s \geq 0}$ is some parameter. This function also has a frequency scale of about ${N}$. But now it has a certain amount of regularity, even in the limit ${N \rightarrow \infty}$; indeed, one easily checks that the ${k^{th}}$ derivative of ${g}$ stays bounded in ${N}$ as long as ${k \leq s}$. So one could view this function as having “${s}$ degrees of regularity” in the limit ${N \rightarrow \infty}$.
• In a similar vein, the function ${N^{-s} \phi(Nx)}$ also has a frequency scale of about ${N}$, and can be viewed as having ${s}$ degrees of regularity in the limit ${N \rightarrow \infty}$.
• The function ${\phi(x) |x|^s 1_{x > 0}}$ also has about ${s}$ degrees of regularity, in the sense that it can be differentiated up to ${s}$ times before becoming unbounded. By performing a dyadic decomposition of the ${x}$ variable, one can also decompose this function into components ${\psi(2^n x) |x|^s}$ for ${n \geq 0}$, where ${\psi(x) := (\phi(x)-\phi(2x)) 1_{x>0}}$ is a bump function supported away from the origin; each such component has frequency scale about ${2^n}$ and ${s}$ degrees of regularity. Thus we see that the original function ${\phi(x) |x|^s 1_{x > 0}}$ has a range of frequency scales, ranging from about ${1}$ all the way to ${+\infty}$.
• One can of course concoct higher-dimensional analogues of these examples. For instance, the localised plane wave ${\phi(x) \sin(\xi \cdot x)}$ in ${{\bf R}^d}$, where ${\phi \in C^\infty_c({\bf R}^d)}$ is a test function, would have a frequency scale of about ${|\xi|}$.
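To spell out the claim in the second example, here is a routine check via the Leibniz rule (the constant ${C_{k,\phi}}$ below is notation introduced here; it depends on ${k}$ and ${\phi}$ but not on ${N}$). For ${g(x) := N^{-s} \phi(x) \sin(Nx)}$ one has

$\displaystyle g^{(k)}(x) = N^{-s} \sum_{j=0}^k \binom{k}{j} N^j \phi^{(k-j)}(x) \sin\left(Nx + \frac{j\pi}{2}\right),$

and hence

$\displaystyle \sup_x |g^{(k)}(x)| \leq C_{k,\phi} N^{k-s}$

for ${N \geq 1}$. The ${j=k}$ term shows that this bound is of the right order near the origin (where ${\phi = 1}$), so the ${k^{th}}$ derivative stays bounded as ${N \rightarrow \infty}$ when ${k \leq s}$ and grows like ${N^{k-s}}$ otherwise.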

There are a variety of function space norms that can be used to capture frequency scale (or regularity) in addition to height and width. The most common and well-known examples of such spaces are the Sobolev space norms ${\| f\|_{W^{s,p}({\bf R}^d)}}$, although there are a number of other norms with similar features (such as Hölder norms, Besov norms, and Triebel-Lizorkin norms). Very roughly speaking, the ${W^{s,p}}$ norm is like the ${L^p}$ norm, but with “${s}$ additional degrees of regularity”. For instance, in one dimension, the function ${A \phi(x/R) \sin(Nx)}$, where ${\phi}$ is a fixed test function and ${R, N}$ are large, will have a ${W^{s,p}}$ norm of about ${|A| R^{1/p} N^s}$, thus combining the “height” ${|A|}$, the “width” ${R}$, and the “frequency scale” ${N}$ of this function together. (Compare this with the ${L^p}$ norm of the same function, which is about ${|A| R^{1/p}}$.)
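In the case ${s=1}$ this heuristic can be checked by hand, under the informal reading (an assumption at this point, since the norm has not yet been defined) that the ${W^{1,p}}$ norm of ${f}$ is comparable to ${\|f\|_{L^p} + \|f'\|_{L^p}}$. Differentiating ${f(x) := A \phi(x/R) \sin(Nx)}$ gives

$\displaystyle f'(x) = \frac{A}{R} \phi'(x/R) \sin(Nx) + A N \phi(x/R) \cos(Nx).$

The first term has ${L^p}$ norm at most about ${|A| R^{1/p - 1}}$, while the second has ${L^p}$ norm about ${|A| R^{1/p} N}$; for ${R, N}$ large the second term dominates, which matches the claimed size ${|A| R^{1/p} N^s}$ with ${s=1}$.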

To a large extent, the theory of the Sobolev spaces ${W^{s,p}({\bf R}^d)}$ resembles that of their Lebesgue counterparts ${L^p({\bf R}^d)}$ (which are the special case of Sobolev spaces when ${s=0}$), but with the additional benefit of being able to interact very nicely with (weak) derivatives: a first derivative ${\frac{\partial f}{\partial x_j}}$ of a function in an ${L^p}$ space usually leaves all Lebesgue spaces, but a first derivative of a function in the Sobolev space ${W^{s,p}}$ will end up in another Sobolev space ${W^{s-1,p}}$. This compatibility with the differentiation operation begins to explain why Sobolev spaces are so useful in the theory of partial differential equations. Furthermore, the regularity parameter ${s}$ in Sobolev spaces is not restricted to be a natural number; it can be any real number, and one can use fractional derivative or integration operators to move from one regularity to another. Despite the fact that most partial differential equations involve differential operators of integer order, fractional spaces are still of importance; for instance it often turns out that the Sobolev spaces which are critical (scale-invariant) for a certain PDE are of fractional order.

The uncertainty principle in Fourier analysis places a constraint between the width and frequency scale of a function; roughly speaking (and in one dimension for simplicity), the product of the two quantities has to be bounded away from zero (or to put it another way, a wave is always at least as wide as its wavelength). This constraint can be quantified as the very useful Sobolev embedding theorem, which allows one to trade regularity for integrability: a function in a Sobolev space ${W^{s,p}}$ will automatically lie in a number of other Sobolev spaces ${W^{\tilde s,\tilde p}}$ with ${\tilde s < s}$ and ${\tilde p > p}$; in particular, one can often embed Sobolev spaces into Lebesgue spaces. The trade is not reversible: one cannot start with a function with a lot of integrability and no regularity, and expect to recover regularity in a space of lower integrability. (One can already see this with the most basic example of Sobolev embedding, coming from the fundamental theorem of calculus. If a (continuously differentiable) function ${f: {\bf R} \rightarrow {\bf R}}$ has ${f'}$ in ${L^1({\bf R})}$, then we of course have ${f \in L^\infty({\bf R})}$; but the converse is far from true.)
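The parenthetical example can be written out in one line. For continuously differentiable ${f}$ with ${f' \in L^1({\bf R})}$, the fundamental theorem of calculus gives

$\displaystyle |f(x)| = \left| f(0) + \int_0^x f'(t)\ dt \right| \leq |f(0)| + \|f'\|_{L^1({\bf R})}$

for every ${x}$, so ${f}$ is bounded. For the failure of the converse, one illustrative choice (not taken from the text above) is ${f(x) = \sin(x)}$, which is bounded but whose derivative ${\cos(x)}$ is not absolutely integrable on ${{\bf R}}$.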

Plancherel’s theorem reveals that Fourier-analytic tools are particularly powerful when applied to ${L^2}$ spaces. Because of this, the Fourier transform is very effective at dealing with the ${L^2}$-based Sobolev spaces ${W^{s,2}({\bf R}^d)}$, often abbreviated ${H^s({\bf R}^d)}$. Indeed, using the fact that the Fourier transform converts regularity to decay, we will see that the ${H^s({\bf R}^d)}$ spaces are nothing more than Fourier transforms of weighted ${L^2}$ spaces, and in particular enjoy a Hilbert space structure. These Sobolev spaces, and in particular the energy space ${H^1({\bf R}^d)}$, are of particular importance in any PDE that involves some sort of energy functional (this includes large classes of elliptic, parabolic, dispersive, and wave equations, and especially those equations connected to physics and/or geometry).
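As a sketch of what “Fourier transforms of weighted ${L^2}$ spaces” means, one common way to normalise the ${H^s}$ norm (stated here as an assumption, up to the choice of Fourier transform conventions and constants) is

$\displaystyle \|f\|_{H^s({\bf R}^d)}^2 := \int_{{\bf R}^d} (1+|\xi|^2)^s |\hat f(\xi)|^2\ d\xi,$

so that ${f \in H^s}$ exactly when ${\hat f}$ lies in ${L^2}$ with respect to the weight ${(1+|\xi|^2)^s}$. When ${s=1}$, Plancherel’s theorem makes the right-hand side comparable to ${\|f\|_{L^2}^2 + \|\nabla f\|_{L^2}^2}$, which is the sort of quantity that appears in energy functionals.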

We will not fully develop the theory of Sobolev spaces here, as this would require the theory of singular integrals, which is beyond the scope of this course. There are of course many references for further reading; one is Stein’s “Singular integrals and differentiability properties of functions”.

In set theory, a function ${f: X \rightarrow Y}$ is defined as an object that evaluates every input ${x}$ to exactly one output ${f(x)}$. However, in various branches of mathematics, it has become convenient to generalise this classical concept of a function to a more abstract one. For instance, in operator algebras, quantum mechanics, or non-commutative geometry, one often replaces commutative algebras of (real or complex-valued) functions on some space ${X}$, such as ${C(X)}$ or ${L^\infty(X)}$, with a more general – and possibly non-commutative – algebra (e.g. a ${C^*}$-algebra or a von Neumann algebra). Elements in this more abstract algebra are no longer definable as functions in the classical sense of assigning a single value ${f(x)}$ to every point ${x \in X}$, but one can still define other operations on these “generalised functions” (e.g. one can multiply or take inner products between two such objects).

Generalisations of functions are also very useful in analysis. In our study of ${L^p}$ spaces, we have already seen one such generalisation, namely the concept of a function defined up to almost everywhere equivalence. Such a function ${f}$ (or more precisely, an equivalence class of classical functions) cannot be evaluated at any given point ${x}$, if that point has measure zero. However, it is still possible to perform algebraic operations on such functions (e.g. multiplying or adding two functions together), and one can also integrate such functions on measurable sets (provided, of course, that the function has some suitable integrability condition). We also know that the ${L^p}$ spaces can usually be described via duality, as the dual space of ${L^{p'}}$ (except in some endpoint cases, namely when ${p=\infty}$, or when ${p=1}$ and the underlying space is not ${\sigma}$-finite).

We have also seen (via the Lebesgue-Radon-Nikodym theorem) that locally integrable functions ${f \in L^1_{\hbox{loc}}({\bf R})}$ on, say, the real line ${{\bf R}}$, can be identified with locally finite absolutely continuous measures ${m_f}$ on the line, by multiplying Lebesgue measure ${m}$ by the function ${f}$. So another way to generalise the concept of a function is to consider arbitrary locally finite Radon measures ${\mu}$ (not necessarily absolutely continuous), such as the Dirac measure ${\delta_0}$. With this concept of “generalised function”, one can still add and subtract two measures ${\mu, \nu}$, and integrate any measure ${\mu}$ against a (bounded) measurable set ${E}$ to obtain a number ${\mu(E)}$, but one cannot evaluate a measure ${\mu}$ (or more precisely, the Radon-Nikodym derivative ${d\mu/dm}$ of that measure) at a single point ${x}$, and one also cannot multiply two measures together to obtain another measure. From the Riesz representation theorem, we also know that the space of (finite) Radon measures can be described via duality, as linear functionals on ${C_c({\bf R})}$.

There is an even larger class of generalised functions that is very useful, particularly in linear PDE, namely the space of distributions, say on a Euclidean space ${{\bf R}^d}$. In contrast to Radon measures ${\mu}$, which can be defined by how they “pair up” against continuous, compactly supported test functions ${f \in C_c({\bf R}^d)}$ to create numbers ${\langle f, \mu \rangle := \int_{{\bf R}^d} f\ d\overline{\mu}}$, a distribution ${\lambda}$ is defined by how it pairs up against a smooth compactly supported function ${f \in C^\infty_c({\bf R}^d)}$ to create a number ${\langle f, \lambda \rangle}$. As the space ${C^\infty_c({\bf R}^d)}$ of smooth compactly supported functions is smaller than (but dense in) the space ${C_c({\bf R}^d)}$ of continuous compactly supported functions (and has a stronger topology), the space of distributions is larger than that of measures. But the space ${C^\infty_c({\bf R}^d)}$ is closed under more operations than ${C_c({\bf R}^d)}$, and in particular is closed under differential operators (with smooth coefficients). Because of this, the space of distributions is similarly closed under such operations; in particular, one can differentiate a distribution and get another distribution, which is something that is not always possible with measures or ${L^p}$ functions. But as measures or functions can be interpreted as distributions, this leads to the notion of a weak derivative for such objects, which makes sense (but only as a distribution) even for functions that are not classically differentiable. Thus the theory of distributions can allow one to rigorously manipulate rough functions “as if” they were smooth, although one must still be careful as some operations on distributions are not well-defined, most notably the operation of multiplying two distributions together. Nevertheless one can use this theory to justify many formal computations involving derivatives, integrals, etc. (including several computations used routinely in physics) that would be difficult to formalise rigorously in a purely classical framework.
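A standard illustration of the weak derivative (not taken from these notes; the test functions are taken to be real-valued here to sidestep conjugation conventions in the pairing) is the Heaviside function ${H := 1_{[0,+\infty)}}$ on ${{\bf R}}$. It is not classically differentiable at the origin, but if one defines its derivative by formally integrating by parts against a test function ${f \in C^\infty_c({\bf R})}$, one obtains

$\displaystyle \langle f, H' \rangle := - \langle f', H \rangle = - \int_0^\infty f'(x)\ dx = f(0) = \langle f, \delta_0 \rangle,$

so that ${H' = \delta_0}$ in the sense of distributions, where ${\delta_0}$ is the Dirac measure at the origin.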

If one shrinks the space of distributions slightly, to the space of tempered distributions (which is formed by enlarging the dual class ${C^\infty_c({\bf R}^d)}$ to the Schwartz class ${{\mathcal S}({\bf R}^d)}$), then one obtains closure under another important operation, namely the Fourier transform. This allows one to define various Fourier-analytic operations (e.g. pseudodifferential operators) on such distributions.

Of course, at the end of the day, one is usually not all that interested in distributions in their own right, but would like to be able to use them as a tool to study more classical objects, such as smooth functions. Fortunately, one can recover facts about smooth functions from facts about the (far rougher) space of distributions in a number of ways. For instance, if one convolves a distribution with a smooth, compactly supported function, one gets back a smooth function. This is a particularly useful fact in the theory of constant-coefficient linear partial differential equations such as ${Lu=f}$, as it allows one to recover a smooth solution ${u}$ from smooth, compactly supported data ${f}$ by convolving ${f}$ with a specific distribution ${G}$, known as the fundamental solution of ${L}$. We will give some examples of this later in these notes.
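Perhaps the simplest instance, given here only as a sketch, is the operator ${L = \frac{d}{dx}}$ on ${{\bf R}}$. Taking ${G := 1_{[0,+\infty)}}$ (whose distributional derivative is the Dirac mass at the origin), the convolution with smooth, compactly supported data ${f}$ is

$\displaystyle u(x) := f * G(x) = \int_{\bf R} f(y) 1_{[0,+\infty)}(x-y)\ dy = \int_{-\infty}^x f(y)\ dy,$

which is a smooth function obeying ${Lu = u' = f}$. Thus ${G}$ plays the role of a fundamental solution for this particular ${L}$, even though ${G}$ itself is not differentiable in the classical sense.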

It is this unusual and useful combination of both being able to pass from classical functions to generalised functions (e.g. by differentiation) and then back from generalised functions to classical functions (e.g. by convolution) that sets the theory of distributions apart from other competing theories of generalised functions, in particular allowing one to justify many formal calculations in PDE and Fourier analysis rigorously with relatively little additional effort. On the other hand, being defined by linear duality, the theory of distributions becomes somewhat less useful when one moves to more nonlinear problems, such as nonlinear PDE. However, they still serve an important supporting role in such problems as an “ambient space” of functions, inside of which one carves out more useful function spaces, such as Sobolev spaces, which we will discuss in the next set of notes.

In these notes we lay out the basic theory of the Fourier transform, which is of course the most fundamental tool in harmonic analysis and also of major importance in related fields (functional analysis, complex analysis, PDE, number theory, additive combinatorics, representation theory, signal processing, etc.). The Fourier transform, in conjunction with the Fourier inversion formula, allows one to take essentially arbitrary (complex-valued) functions on a group ${G}$ (or more generally, a space ${X}$ that ${G}$ acts on, e.g. a homogeneous space ${G/H}$), and decompose them as a (discrete or continuous) superposition of much more symmetric functions on the domain, such as characters ${\chi: G \rightarrow S^1}$; the precise superposition is given by Fourier coefficients ${\hat f(\xi)}$, which take values in some dual object such as the Pontryagin dual ${\hat G}$ of ${G}$. Characters behave in a very simple manner with respect to translation (indeed, they are eigenfunctions of the translation action), and so the Fourier transform tends to simplify any mathematical problem which enjoys a translation invariance symmetry (or an approximation to such a symmetry), and is somehow “linear” (i.e. it interacts nicely with superpositions). In particular, Fourier analytic methods are particularly useful for studying operations such as convolution ${f, g \mapsto f*g}$ and set-theoretic addition ${A, B \mapsto A+B}$, or the closely related problem of counting solutions to additive problems such as ${x = a_1 + a_2 + a_3}$ or ${x = a_1 - a_2}$, where ${a_1, a_2, a_3}$ are constrained to lie in specific sets ${A_1, A_2, A_3}$. The Fourier transform is also a particularly powerful tool for solving constant-coefficient linear ODE and PDE (because of the translation invariance), and can also approximately solve some variable-coefficient (or slightly non-linear) equations if the coefficients vary smoothly enough and the nonlinear terms are sufficiently tame.

The Fourier transform ${\hat f(\xi)}$ also provides an important new way of looking at a function ${f(x)}$, as it highlights the distribution of ${f}$ in frequency space (the domain of the frequency variable ${\xi}$) rather than physical space (the domain of the physical variable ${x}$). A given property of ${f}$ in the physical domain may be transformed to a rather different-looking property of ${\hat f}$ in the frequency domain. For instance:

• Smoothness of ${f}$ in the physical domain corresponds to decay of ${\hat f}$ in the Fourier domain, and conversely. (More generally, fine scale properties of ${f}$ tend to manifest themselves as coarse scale properties of ${\hat f}$, and conversely.)
• Convolution in the physical domain corresponds to pointwise multiplication in the Fourier domain, and conversely.
• Constant coefficient differential operators such as ${d/dx}$ in the physical domain correspond to multiplication by polynomials such as ${2\pi i \xi}$ in the Fourier domain, and conversely.
• More generally, translation invariant operators in the physical domain correspond to multiplication by symbols in the Fourier domain, and conversely.
• Rescaling in the physical domain by an invertible linear transformation corresponds to an inverse (adjoint) rescaling in the Fourier domain.
• Restriction to a subspace (or subgroup) in the physical domain corresponds to projection to the dual quotient space (or quotient group) in the Fourier domain, and conversely.
• Frequency modulation in the physical domain corresponds to translation in the frequency domain, and conversely.

(We will make these statements more precise below.)
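Two of these correspondences can be tested numerically in a toy setting. The sketch below uses the finite cyclic group ${{\bf Z}/N{\bf Z}}$ in place of the torus or Euclidean space (a simplification chosen here, not the setting of these notes), with an unnormalised Fourier transform; the function names and sample data are illustrative.

```python
import cmath

def dft(f):
    """Fourier transform on Z/NZ: fhat(xi) = sum_x f(x) e^{-2 pi i x xi / N}."""
    n = len(f)
    return [sum(f[x] * cmath.exp(-2j * cmath.pi * x * xi / n) for x in range(n))
            for xi in range(n)]

def convolve(f, g):
    """Cyclic convolution (f*g)(x) = sum_y f(y) g(x - y), with indices taken mod N."""
    n = len(f)
    return [sum(f[y] * g[(x - y) % n] for y in range(n)) for x in range(n)]

def modulate(f, eta):
    """Multiply f by the character x -> e^{2 pi i x eta / N}."""
    n = len(f)
    return [f[x] * cmath.exp(2j * cmath.pi * x * eta / n) for x in range(n)]

def max_diff(u, v):
    """Largest entrywise discrepancy between two sequences."""
    return max(abs(a - b) for a, b in zip(u, v))

f = [1.0, 2.0, 0.0, -1.0, 0.5, 3.0]
g = [0.0, 1.0, 1.0, 0.0, -2.0, 0.25]
fhat, ghat = dft(f), dft(g)

# Convolution in physical space <-> pointwise product in frequency space.
conv_error = max_diff(dft(convolve(f, g)), [a * b for a, b in zip(fhat, ghat)])

# Modulation by frequency eta <-> translation of fhat by eta.
eta = 2
shifted = [fhat[(xi - eta) % len(f)] for xi in range(len(f))]
mod_error = max_diff(dft(modulate(f, eta)), shifted)

print(conv_error, mod_error)
```

Both discrepancies should be at the level of floating-point rounding error.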

On the other hand, some operations in the physical domain remain essentially unchanged in the Fourier domain. Most importantly, the ${L^2}$ norm (or energy) of a function ${f}$ is the same as that of its Fourier transform, and more generally the inner product ${\langle f, g \rangle}$ of two functions ${f, g}$ is the same as that of their Fourier transforms. Indeed, the Fourier transform is a unitary operator on ${L^2}$ (a fact which is variously known as the Plancherel theorem or the Parseval identity). This makes it easier to pass back and forth between the physical domain and frequency domain, so that one can combine techniques that are easy to execute in the physical domain with other techniques that are easy to execute in the frequency domain. (In fact, one can combine the physical and frequency domains together into a product domain known as phase space, and there are entire fields of mathematics (e.g. microlocal analysis, geometric quantisation, time-frequency analysis) devoted to performing analysis on these sorts of spaces directly, but this is beyond the scope of this course.)
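The unitarity statement has an exact finite analogue that is easy to test. The following sketch again works on the cyclic group ${{\bf Z}/N{\bf Z}}$ (an assumption made for the sake of a runnable example), with the Fourier transform normalised by ${1/\sqrt{N}}$ so that it is unitary; the names and sample data are illustrative.

```python
import cmath

def dft_unitary(f):
    """Unitary Fourier transform on Z/NZ (normalised by 1/sqrt(N))."""
    n = len(f)
    return [sum(f[x] * cmath.exp(-2j * cmath.pi * x * xi / n) for x in range(n)) / n ** 0.5
            for xi in range(n)]

def inner(f, g):
    """Inner product <f, g> = sum_x f(x) * conj(g(x))."""
    return sum(a * b.conjugate() for a, b in zip(f, g))

f = [1 + 2j, 0.5j, -3.0 + 0j, 2 - 1j, 0j]
g = [0.25 + 0j, 1 - 1j, 2j, -1 + 0j, 4 + 0.5j]
fhat, ghat = dft_unitary(f), dft_unitary(g)

# Energy (squared L^2 norm) and inner products are preserved by the transform.
energy_gap = abs(inner(f, f) - inner(fhat, fhat))
pairing_gap = abs(inner(f, g) - inner(fhat, ghat))
print(energy_gap, pairing_gap)
```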

In these notes, we briefly discuss the general theory of the Fourier transform, but will mainly focus on the two classical domains for Fourier analysis: the torus ${{\Bbb T}^d := ({\bf R}/{\bf Z})^d}$, and the Euclidean space ${{\bf R}^d}$. For these domains one has the advantage of being able to perform very explicit algebraic calculations, involving concrete functions such as plane waves ${x \mapsto e^{2\pi i x \cdot \xi}}$ or Gaussians ${x \mapsto A^{d/2} e^{-\pi A |x|^2}}$.

In the previous two quarters, we have been focusing largely on the “soft” side of real analysis, which is primarily concerned with “qualitative” properties such as convergence, compactness, measurability, and so forth. In contrast, we will begin this quarter with more of an emphasis on the “hard” side of real analysis, in which we study estimates and upper and lower bounds of various quantities, such as norms of functions or operators. (Of course, the two sides of analysis are closely connected to each other; an understanding of both sides, and of their interrelationships, is needed in order to get the broadest and most complete perspective for this subject.)

One basic tool in hard analysis is that of interpolation, which allows one to start with a hypothesis of two (or more) “upper bound” estimates, e.g. ${A_0 \leq B_0}$ and ${A_1 \leq B_1}$, and conclude a family of intermediate estimates ${A_\theta \leq B_\theta}$ (or maybe ${A_\theta \leq C_\theta B_\theta}$, where ${C_\theta}$ is a constant) for any choice of parameter ${0 < \theta < 1}$. Of course, interpolation is not a magic wand; one needs various hypotheses (e.g. linearity, sublinearity, convexity, or complexifiability) on ${A_i, B_i}$ in order for interpolation methods to be applicable. Nevertheless, these techniques are available for many important classes of problems, most notably that of establishing boundedness estimates such as ${\| T f \|_{L^q(Y, \nu)} \leq C \| f \|_{L^p(X, \mu)}}$ for linear (or “linear-like”) operators ${T}$ from one Lebesgue space ${L^p(X,\mu)}$ to another ${L^q(Y,\nu)}$. (Interpolation can also be performed for many other normed vector spaces than the Lebesgue spaces, but we will just focus on Lebesgue spaces in these notes to focus the discussion.) Using interpolation, it is possible to reduce the task of proving such estimates to that of proving various “endpoint” versions of these estimates. In some cases, each endpoint only faces a portion of the difficulty that the interpolated estimate did, and so by using interpolation one has split the task of proving the original estimate into two or more simpler subtasks. In other cases, one of the endpoint estimates is very easy, and the other one is significantly more difficult than the original estimate; thus interpolation does not really simplify the task of proving estimates in this case, but at least clarifies the relative difficulty between various estimates in a given family.
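Perhaps the simplest estimate of this shape is the log-convexity of ${L^p}$ norms, a consequence of Hölder’s inequality: if ${1/p_\theta = (1-\theta)/p_0 + \theta/p_1}$, then ${\|f\|_{L^{p_\theta}} \leq \|f\|_{L^{p_0}}^{1-\theta} \|f\|_{L^{p_1}}^{\theta}}$. The Python sketch below checks this numerically for a finite sequence with counting measure; the sample data, exponents, and function names are illustrative assumptions, and a numerical check of this kind is of course not a proof.

```python
def lp_norm(values, p):
    """l^p norm of a finite sequence with respect to counting measure (0 < p < infinity)."""
    return sum(abs(v) ** p for v in values) ** (1.0 / p)

def interpolated_exponent(p0, p1, theta):
    """Exponent p_theta defined by 1/p_theta = (1 - theta)/p0 + theta/p1."""
    return 1.0 / ((1.0 - theta) / p0 + theta / p1)

f = [3.0, -1.0, 0.5, 2.0, 0.0, -4.0]
p0, p1 = 1.0, 4.0

checks = []
for step in range(1, 10):
    theta = step / 10.0
    p_theta = interpolated_exponent(p0, p1, theta)
    lhs = lp_norm(f, p_theta)                                      # A_theta
    rhs = lp_norm(f, p0) ** (1 - theta) * lp_norm(f, p1) ** theta  # B_theta
    checks.append(lhs <= rhs * (1 + 1e-12))  # small tolerance for rounding

print(all(checks))
```

Here the two endpoint norms play the role of ${B_0}$ and ${B_1}$, and the intermediate bound ${B_\theta = B_0^{1-\theta} B_1^\theta}$ is their weighted geometric mean.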

As is the case with many other tools in analysis, interpolation is not captured by a single “interpolation theorem”; instead, there is a family of such theorems, which can be broadly divided into two major categories, reflecting the two basic methods that underlie the principle of interpolation. The real interpolation method is based on a divide and conquer strategy: to understand how to obtain control on some expression such as ${\| T f \|_{L^q(Y, \nu)}}$ for some operator ${T}$ and some function ${f}$, one would divide ${f}$ into two or more components, e.g. into components where ${f}$ is large and where ${f}$ is small, or where ${f}$ is oscillating with high frequency or only varying with low frequency. Each component would be estimated using a carefully chosen combination of the extreme estimates available; optimising over these choices and summing up (using whatever linearity-type properties on ${T}$ are available), one would hope to get a good estimate on the original expression. The strengths of the real interpolation method are that the linearity hypotheses on ${T}$ can be relaxed to weaker hypotheses, such as sublinearity or quasilinearity; also, the endpoint estimates are allowed to be of a weaker “type” than the interpolated estimates. On the other hand, the real interpolation method often concedes a multiplicative constant in the final estimates obtained, and one is usually obligated to keep the operator ${T}$ fixed throughout the interpolation process. The proofs of real interpolation theorems are also a little bit messy, though in many cases one can simply invoke a standard instance of such theorems (e.g. the Marcinkiewicz interpolation theorem) as a black box in applications.

The complex interpolation method instead proceeds by exploiting the powerful tools of complex analysis, in particular the maximum modulus principle and its relatives (such as the Phragmén-Lindelöf principle). The idea is to rewrite the estimate to be proven (e.g. ${\| T f \|_{L^q(Y, \nu)} \leq C \| f \|_{L^p(X, \mu)}}$) in such a way that it can be embedded into a family of such estimates which depend holomorphically on a complex parameter ${s}$ in some domain (e.g. the strip ${\{ \sigma+it: t \in {\mathbb R}, \sigma \in [0,1]\}}$). One then exploits things like the maximum modulus principle to bound an estimate corresponding to an interior point of this domain by the estimates on the boundary of this domain. The strengths of the complex interpolation method are that it typically gives cleaner constants than the real interpolation method, and also allows the underlying operator ${T}$ to vary holomorphically with respect to the parameter ${s}$, which can significantly increase the flexibility of the interpolation technique. The proofs of these methods are also very short (if one takes the maximum modulus principle and its relatives as a black box), which makes the method particularly amenable for generalisation to more intricate settings (e.g. multilinear operators, mixed Lebesgue norms, etc.). On the other hand, the somewhat rigid requirement of holomorphicity makes it much more difficult to apply this method to non-linear operators, such as sublinear or quasilinear operators; also, the interpolated estimate tends to be of the same “type” as the extreme ones, so that one does not enjoy the upgrading of weak type estimates to strong type estimates that the real interpolation method typically produces. Also, the complex method runs into some minor technical problems when the target space ${L^q(Y,\nu)}$ ceases to be a Banach space (i.e. when ${q<1}$) as this makes it more difficult to exploit duality.
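The prototypical complex-analytic input of this type is the Hadamard three lines theorem, recalled here as a sketch (the symbols ${F}$, ${B_0}$, ${B_1}$ are notation introduced for this statement). If ${F}$ is bounded and continuous on the closed strip ${\{ \sigma+it: t \in {\mathbb R}, \sigma \in [0,1]\}}$, holomorphic in its interior, and obeys ${|F(it)| \leq B_0}$ and ${|F(1+it)| \leq B_1}$ for all real ${t}$, then

$\displaystyle |F(\theta + it)| \leq B_0^{1-\theta} B_1^{\theta}$

for all ${0 \leq \theta \leq 1}$ and all real ${t}$. The bound at an interior point is thus controlled by a weighted geometric mean of the two boundary bounds, which is the typical shape of an interpolated estimate.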

Despite these differences, the real and complex methods tend to give broadly similar results in practice, especially if one is willing to ignore constant losses in the estimates or epsilon losses in the exponents.

The theory of both real and complex interpolation can be studied abstractly, in general normed or quasi-normed spaces; see e.g. this book for a detailed treatment. However in these notes we shall focus exclusively on interpolation for Lebesgue spaces ${L^p}$ (and their cousins, such as the weak Lebesgue spaces ${L^{p,\infty}}$ and the Lorentz spaces ${L^{p,r}}$).

The 245B final can be found here.  I am not posting solutions, but readers (both students and non-students) are welcome to discuss the final questions in the comments below.

The continuation to this course, 245C, will begin on Monday, March 29.  The topics for this course are still somewhat fluid – but I tentatively plan to cover the following topics, roughly in order:

• $L^p$ spaces and interpolation; fractional integration
• The Fourier transform on ${\Bbb R}^n$ (a very quick review; this is of course covered more fully in 247A)
• Schwartz functions, and the theory of distributions
• Hausdorff measure
• The spectral theorem (introduction only; the topic is covered in depth in 255A)

I am open to further suggestions for topics that would build upon the 245AB material, which would be of interest to students, and which would not overlap too substantially with other graduate courses offered at UCLA.