
One way to study a general class of mathematical objects is to embed them into a more structured class of mathematical objects; for instance, one could study manifolds by embedding them into Euclidean spaces. In these (optional) notes we study two (related) embedding theorems for topological spaces:

The 245B final can be found here.  I am not posting solutions, but readers (both students and non-students) are welcome to discuss the final questions in the comments below.

The continuation to this course, 245C, will begin on Monday, March 29.  The topics for this course are still somewhat fluid – but I tentatively plan to cover the following topics, roughly in order:

• $L^p$ spaces and interpolation; fractional integration
• The Fourier transform on ${\Bbb R}^n$ (a very quick review; this is of course covered more fully in 247A)
• Schwartz functions, and the theory of distributions
• Hausdorff measure
• The spectral theorem (introduction only; the topic is covered in depth in 255A)

I am open to further suggestions for topics that would build upon the 245AB material, which would be of interest to students, and which would not overlap too substantially with other graduate courses offered at UCLA.

A key theme in real analysis is that of studying general functions ${f: X \rightarrow {\bf R}}$ or ${f: X \rightarrow {\bf C}}$ by first approximating them by “simpler” or “nicer” functions. But the precise class of “simple” or “nice” functions may vary from context to context. In measure theory, for instance, it is common to approximate measurable functions by indicator functions or simple functions. But in other parts of analysis, it is often more convenient to approximate rough functions by continuous or smooth functions (perhaps with compact support, or some other decay condition), or by functions in some algebraic class, such as the class of polynomials or trigonometric polynomials.

In order to approximate rough functions by more continuous ones, one of course needs tools that can generate continuous functions with some specified behaviour. The two basic tools for this are Urysohn’s lemma, which approximates indicator functions by continuous functions, and the Tietze extension theorem, which extends continuous functions on a subdomain to continuous functions on a larger domain. An important consequence of these theorems is the Riesz representation theorem for linear functionals on the space of compactly supported continuous functions, which describes such functionals in terms of Radon measures.
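In the special case of a metric space, the function produced by Urysohn's lemma can be written down explicitly. The following is only a sketch under that assumption: the symbols $(X,d)$, $A$, $B$ and $d(x,A)$ are introduced here for illustration and do not appear in the notes above, and in a general normal space no such formula is available.

```latex
% Assumption: (X,d) is a metric space, and A, B are disjoint non-empty closed subsets of X.
% The distance function d(x,A) := \inf_{a \in A} d(x,a) is continuous in x,
% and vanishes exactly on A because A is closed.
\[
  f(x) := \frac{d(x,A)}{d(x,A) + d(x,B)}
\]
% The denominator never vanishes, since no point lies in both A and B.
% Hence f : X \to [0,1] is continuous, with f = 0 on A and f = 1 on B.
```

Thus $f$ is a continuous stand-in for the indicator function of $B$, at least on the set $A \cup B$.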

Sometimes, approximation by continuous functions is not enough; one must approximate continuous functions in turn by an even smoother class of functions. A useful tool in this regard is the Stone-Weierstrass theorem, which generalises the classical Weierstrass approximation theorem to more general algebras of functions.

As an application of this theory (and of many of the results accumulated in previous lecture notes), we will present (in an optional section) the commutative Gelfand-Naimark theorem classifying all commutative unital ${C^*}$-algebras.

Today I’d like to discuss (in the Tricks Wiki format) a fundamental trick in “soft” analysis, sometimes known as the “limiting argument” or “epsilon regularisation argument”.

Title: Give yourself an epsilon of room.

Quick description: You want to prove some statement $S_0$ about some object $x_0$ (which could be a number, a point, a function, a set, etc.).  To do so, pick a small $\varepsilon > 0$, and first prove a weaker statement $S_\varepsilon$ (which allows for “losses” which go to zero as $\varepsilon \to 0$) about some perturbed object $x_\varepsilon$.  Then, take limits $\varepsilon \to 0$.  Provided that the dependency and continuity of the weaker conclusion $S_\varepsilon$ on $\varepsilon$ are sufficiently controlled, and $x_\varepsilon$ is converging to $x_0$ in an appropriately strong sense, you will recover the original statement.

One can of course play a similar game when proving a statement $S_\infty$ about some object $X_\infty$, by first proving a weaker statement $S_N$ on some approximation $X_N$ to $X_\infty$ for some large parameter $N$, and then sending $N \to \infty$ at the end.

General discussion: Here are some typical examples of a target statement $S_0$, and the approximating statements $S_\varepsilon$ that would converge to $S_0$:

| $S_0$ | $S_\varepsilon$ |
|---|---|
| $f(x_0) = g(x_0)$ | $f(x_\varepsilon) = g(x_\varepsilon) + o(1)$ |
| $f(x_0) \leq g(x_0)$ | $f(x_\varepsilon) \leq g(x_\varepsilon) + o(1)$ |
| $f(x_0) > 0$ | $f(x_\varepsilon) \geq c - o(1)$ for some $c>0$ independent of $\varepsilon$ |
| $f(x_0)$ is finite | $f(x_\varepsilon)$ is bounded uniformly in $\varepsilon$ |
| $f(x_0) \geq f(x)$ for all $x \in X$ (i.e. $x_0$ maximises f) | $f(x_\varepsilon) \geq f(x)-o(1)$ for all $x \in X$ (i.e. $x_\varepsilon$ nearly maximises f) |
| $f_n(x_0)$ converges as $n \to \infty$ | $f_n(x_\varepsilon)$ fluctuates by at most o(1) for sufficiently large n |
| $f_0$ is a measurable function | $f_\varepsilon$ is a measurable function converging pointwise to $f_0$ |
| $f_0$ is a continuous function | $f_\varepsilon$ is an equicontinuous family of functions converging pointwise to $f_0$ OR $f_\varepsilon$ is continuous and converges (locally) uniformly to $f_0$ |
| The event $E_0$ holds almost surely | The event $E_\varepsilon$ holds with probability 1-o(1) |
| The statement $P_0(x)$ holds for almost every x | The statement $P_\varepsilon(x)$ holds for x outside of a set of measure o(1) |

Of course, to justify the convergence of $S_\varepsilon$ to $S_0$, it is necessary that $x_\varepsilon$ converge to $x_0$ (or $f_\varepsilon$ converge to $f_0$, etc.) in a suitably strong sense. (But for the purposes of proving just upper bounds, such as $f(x_0) \leq M$, one can often get by with quite weak forms of convergence, thanks to tools such as Fatou’s lemma or the weak closure of the unit ball.)  Similarly, we need some continuity (or at least semi-continuity) hypotheses on the functions f, g appearing above.

It is also necessary in many cases that the control $S_\varepsilon$ on the approximating object $x_\varepsilon$ is somehow “uniform in $\varepsilon$”, although for “$\sigma$-closed” conclusions, such as measurability, this is not required. [It is important to note that it is only the final conclusion $S_\varepsilon$ on $x_\varepsilon$ that needs to have this uniformity in $\varepsilon$; one is permitted to have some intermediate stages in the derivation of $S_\varepsilon$ that depend on $\varepsilon$ in a non-uniform manner, so long as these non-uniformities cancel out or otherwise disappear at the end of the argument.]

By giving oneself an epsilon of room, one can evade a lot of familiar issues in soft analysis.  For instance, by replacing “rough”, “infinite-complexity”, “continuous”,  “global”, or otherwise “infinitary” objects $x_0$ with “smooth”, “finite-complexity”, “discrete”, “local”, or otherwise “finitary” approximants $x_\varepsilon$, one can finesse most issues regarding the justification of various formal operations (e.g. exchanging limits, sums, derivatives, and integrals).  [It is important to be aware, though, that any quantitative measure on how smooth, discrete, finite, etc. $x_\varepsilon$ is should be expected to degrade in the limit $\varepsilon \to 0$, and so one should take extreme caution in using such quantitative measures to derive estimates that are uniform in $\varepsilon$.]  Similarly, issues such as whether the supremum $M := \sup \{ f(x): x \in X \}$ of a function on a set is actually attained by some maximiser $x_0$ become moot if one is willing to settle instead for an almost-maximiser $x_\varepsilon$, e.g. one which comes within an epsilon of that supremum M (or which is larger than $1/\varepsilon$, if M turns out to be infinite).  Last, but not least, one can use the epsilon room to avoid degenerate solutions, for instance by perturbing a non-negative function to be strictly positive, perturbing a non-strictly monotone function to be strictly monotone, and so forth.
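As a small illustration of the almost-maximiser device, here is a sketch of how one might prove that the supremum of a sum is at most the sum of the suprema, without ever asking whether any supremum is attained. The functions $f, g$ and the assumption that they are bounded above on a non-empty set $X$ are chosen here purely for illustration.

```latex
% Assumption: f, g : X \to \mathbf{R} are bounded above on a non-empty set X.
% Claim: \sup_X (f+g) \leq \sup_X f + \sup_X g.
% Step 1: for each \varepsilon > 0, pick an almost-maximiser x_\varepsilon of f+g:
\[
  f(x_\varepsilon) + g(x_\varepsilon) \geq \sup_X (f+g) - \varepsilon .
\]
% Step 2: bound each term at x_\varepsilon by its own supremum:
\[
  \sup_X (f+g) - \varepsilon
  \leq f(x_\varepsilon) + g(x_\varepsilon)
  \leq \sup_X f + \sup_X g .
\]
% Step 3: the right-hand side does not depend on \varepsilon, so send \varepsilon \to 0.
```

The final bound is uniform in $\varepsilon$, which is what allows the limit to be taken, even though the point $x_\varepsilon$ itself need not converge to anything.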

To summarise: one can view the epsilon regularisation argument as a “loan” in which one borrows an epsilon here and there in order to be able to ignore soft analysis difficulties, and can temporarily be able to utilise estimates which are non-uniform in epsilon, but at the end of the day one needs to “pay back” the loan by establishing a final “hard analysis” estimate which is uniform in epsilon (or whose error terms decay to zero as epsilon goes to zero).

A variant: It may seem that the epsilon regularisation trick is useless if one is already in “hard analysis” situations in which all objects are already “finitary”, and all formal computations easily justified.  However, there is an important variant of this trick which applies in this case: namely, instead of sending the epsilon parameter to zero, choose epsilon to be a sufficiently small (but not infinitesimally small) quantity, depending on other parameters in the problem, so that one can eventually neglect various error terms and obtain a useful bound at the end of the day.  (For instance, any result proven using the Szemerédi regularity lemma is likely to be of this type.)  Since one is not sending epsilon to zero, not every term in the final bound needs to be uniform in epsilon, though for quantitative applications one still would like the dependencies on such parameters to be as favourable as possible.
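A toy model of this variant, with hypothetical quantities $A$, $X$, $Y$ that are not taken from any particular application: suppose one has managed to prove a bound with one term that improves and one that degrades as $\varepsilon$ shrinks. Sending $\varepsilon \to 0$ would then be useless, but optimising in $\varepsilon$ is not.

```latex
% Assumption: X, Y > 0, and for every \varepsilon > 0 one has the bound
\[
  A \leq \varepsilon X + \frac{Y}{\varepsilon} .
\]
% The second term blows up as \varepsilon \to 0, so instead balance the two terms
% by choosing \varepsilon := \sqrt{Y/X}, which depends on the other parameters:
\[
  A \leq \sqrt{\frac{Y}{X}} \, X + Y \sqrt{\frac{X}{Y}} = 2 \sqrt{XY} .
\]
```

The choice of $\varepsilon$ that makes the two terms equal is, up to constants, usually the best available one in situations of this shape.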

Prerequisites: Graduate real analysis.  (Actually, this isn’t so much a prerequisite as it is a corequisite: the limiting argument plays a central role in many fundamental results in real analysis.)  Some examples also require some exposure to PDE.

A normed vector space ${(X, \| \|_X)}$ automatically generates a topology, known as the norm topology or strong topology on ${X}$, generated by the open balls ${B(x,r) := \{ y \in X: \|y-x\|_X < r \}}$. A sequence ${x_n}$ in such a space converges strongly (or converges in norm) to a limit ${x}$ if and only if ${\|x_n-x\|_X \rightarrow 0}$ as ${n \rightarrow \infty}$. This is the topology we have implicitly been using in our previous discussion of normed vector spaces.

However, in some cases it is useful to work in topologies on vector spaces that are weaker than a norm topology. One reason for this is that many important modes of convergence, such as pointwise convergence, convergence in measure, smooth convergence, or convergence on compact subsets, are not captured by a norm topology, and so it is useful to have a more general theory of topological vector spaces that contains these modes. Another reason (of particular importance in PDE) is that the norm topology on infinite-dimensional spaces is so strong that very few sets are compact or pre-compact in these topologies, making it difficult to apply compactness methods in these topologies. Instead, one often first works in a weaker topology, in which compactness is easier to establish, and then somehow upgrades any weakly convergent sequences obtained via compactness to stronger modes of convergence (or alternatively, one abandons strong convergence and exploits the weak convergence directly). Two basic weak topologies for this purpose are the weak topology on a normed vector space ${X}$, and the weak* topology on a dual vector space ${X^*}$. Compactness in the latter topology is usually obtained from the Banach-Alaoglu theorem (and its sequential counterpart), which will be a quick consequence of the Tychonoff theorem (and its sequential counterpart) from the previous lecture.
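A standard example may help to fix ideas about why the norm topology is too strong for compactness. The sketch below takes $X = \ell^2({\bf N})$ and uses the usual definition of weak convergence ($\lambda(x_n) \rightarrow \lambda(x)$ for every continuous linear functional $\lambda$), together with the fact that every such functional on a Hilbert space is given by an inner product; the notation $e_n$ and $y$ is introduced here for illustration only.

```latex
% Assumption: X = \ell^2(\mathbf{N}), and e_n is the sequence equal to 1 in the
% n-th slot and 0 elsewhere. Every continuous linear functional on X has the
% form x \mapsto \langle x, y \rangle for some y = (y_1, y_2, \ldots) \in \ell^2.
\[
  |\langle e_n, y \rangle| = |y_n| \rightarrow 0 \quad (n \rightarrow \infty),
  \qquad \text{since } \sum_n |y_n|^2 < \infty ,
\]
% so e_n converges weakly to 0. On the other hand,
\[
  \| e_n \|_{\ell^2} = 1
  \quad \text{and} \quad
  \| e_n - e_m \|_{\ell^2} = \sqrt{2} \quad (n \neq m),
\]
% so no subsequence of (e_n) is Cauchy, and hence none converges in norm.
```

In particular the closed unit ball of $\ell^2({\bf N})$ is not sequentially compact in the norm topology, even though this particular sequence does converge in the weak topology.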

The strong and weak topologies on normed vector spaces also have analogues for the space ${B(X \rightarrow Y)}$ of bounded linear operators from ${X}$ to ${Y}$, thus supplementing the operator norm topology on that space with two weaker topologies, which (somewhat confusingly) are named the strong operator topology and the weak operator topology.

One of the most useful concepts for analysis that arise from topology and metric spaces is the concept of compactness; recall that a space ${X}$ is compact if every open cover of ${X}$ has a finite subcover, or equivalently if any collection of closed sets with the finite intersection property (i.e. every finite subcollection of these sets has non-empty intersection) has non-empty intersection. In these notes, we explore how compactness interacts with other key topological concepts: the Hausdorff property, bases and sub-bases, product spaces, and equicontinuity, in particular establishing the useful Tychonoff and Arzelà-Ascoli theorems that give criteria for compactness (or precompactness).

Exercise 1 (Basic properties of compact sets)

• Show that any finite set is compact.
• Show that any finite union of compact subsets of a topological space is still compact.
• Show that any image of a compact space under a continuous map is still compact.

Show that these three statements continue to hold if “compact” is replaced by “sequentially compact”.

The notion of what it means for a subset E of a space X to be “small” varies from context to context.  For instance, in measure theory, when $X = (X, {\mathcal X}, \mu)$ is a measure space, one useful notion of a “small” set is that of a null set: a set E of measure zero (or at least contained in a set of measure zero).  By countable additivity, countable unions of null sets are null.  Taking contrapositives, we obtain

Lemma 1. (Pigeonhole principle for measure spaces) Let $E_1, E_2, \ldots$ be an at most countable sequence of measurable subsets of a measure space X.  If $\bigcup_n E_n$ has positive measure, then at least one of the $E_n$ has positive measure.
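For the record, the contrapositive of Lemma 1 can be sketched as a one-line computation. Writing the measure as $\mu$ as above, the only input is countable subadditivity, which follows from countable additivity.

```latex
% Assumption: each E_n is measurable with \mu(E_n) = 0.
\[
  \mu\Big( \bigcup_n E_n \Big) \leq \sum_n \mu(E_n) = \sum_n 0 = 0 ,
\]
% so the union is null. Contrapositively, if the union has positive measure,
% then \mu(E_n) > 0 for at least one n.
```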

Now suppose that X was a Euclidean space ${\Bbb R}^d$ with Lebesgue measure m.  The Lebesgue differentiation theorem easily implies that having positive measure is equivalent to being “dense” in certain balls:

Proposition 1. Let $E$ be a measurable subset of ${\Bbb R}^d$.  Then the following are equivalent:

1. E has positive measure.
2. For any $\varepsilon > 0$, there exists a ball B such that $m( E \cap B ) \geq (1-\varepsilon) m(B)$.

Thus one can think of a null set as a set which is “nowhere dense” in some measure-theoretic sense.

It turns out that there are analogues of these results when the measure space $X = (X, {\mathcal X}, \mu)$  is replaced instead by a complete metric space $X = (X,d)$.  Here, the appropriate notion of a “small” set is not a null set, but rather that of a nowhere dense set: a set E which is not dense in any ball, or equivalently a set whose closure has empty interior.  (A good example of a nowhere dense set would be a proper subspace, or smooth submanifold, of ${\Bbb R}^d$, or a Cantor set; on the other hand, the rationals are a dense subset of ${\Bbb R}$ and thus clearly not nowhere dense.)   We then have the following important result:

Theorem 1. (Baire category theorem). Let $E_1, E_2, \ldots$ be an at most countable sequence of subsets of a complete metric space X.  If $\bigcup_n E_n$ contains a ball B, then at least one of the $E_n$ is dense in a sub-ball $B'$ of B (and in particular is not nowhere dense).  To put it in the contrapositive: the countable union of nowhere dense sets cannot contain a ball.

Exercise 1. Show that the Baire category theorem is equivalent to the claim that in a complete metric space, the countable intersection of open dense sets remains dense.  $\diamond$

Exercise 2. Using the Baire category theorem, show that any non-empty complete metric space without isolated points is uncountable.  (In particular, this shows that the Baire category theorem can fail for incomplete metric spaces such as the rationals ${\Bbb Q}$.)  $\diamond$

To quickly illustrate an application of the Baire category theorem, observe that it implies that one cannot cover a finite-dimensional real or complex vector space ${\Bbb R}^n, {\Bbb C}^n$ by a countable number of proper subspaces.  One can of course also establish this fact by using Lebesgue measure on this space.  However, the advantage of the Baire category approach is that it also works well in infinite dimensional complete normed vector spaces, i.e. Banach spaces, whereas the measure-theoretic approach runs into significant difficulties in infinite dimensions.  This leads to three fundamental equivalences between the qualitative theory of continuous linear operators on Banach spaces (e.g. finiteness, surjectivity, etc.) and the quantitative theory (i.e. estimates):

1. The uniform boundedness principle, that equates the qualitative boundedness (or convergence) of a family of continuous operators with their quantitative boundedness.
2. The open mapping theorem, that equates the qualitative solvability of a linear problem Lu = f with the quantitative solvability.
3. The closed graph theorem, that equates the qualitative regularity of a (weakly continuous) operator T with the quantitative regularity of that operator.

Strictly speaking, these theorems are not used much directly in practice, because one usually works in the reverse direction (i.e. first proving quantitative bounds, and then deriving qualitative corollaries); but the above three theorems help explain why we usually approach qualitative problems in functional analysis via their quantitative counterparts.

To progress further in our study of function spaces, we will need to develop the standard theory of metric spaces, and the closely related theory of topological spaces (i.e. point-set topology).  I will be assuming that students in my class will already have encountered these concepts in an undergraduate topology or real analysis course, but for the sake of completeness I will briefly review the basics of both spaces here.

Notational convention: As in Notes 2, I will colour a statement red in this post if it assumes the axiom of choice.  We will, of course, rely on every other axiom of Zermelo-Fraenkel set theory here (and in the rest of the course).  $\diamond$

In this course we will often need to iterate some sort of operation “infinitely many times” (e.g. to create an infinite basis by choosing one basis element at a time).  In order to do this rigorously, we will rely on Zorn’s lemma:

Zorn’s Lemma. Let $(X, \leq)$ be a non-empty partially ordered set, with the property that every chain (i.e. a totally ordered subset) in X has an upper bound.  Then X contains a maximal element (i.e. an element with no larger element).

Indeed, we have used this lemma several times already in previous notes.  Given the other standard axioms of set theory, this lemma is logically equivalent to

Axiom of choice. Let X be a set, and let ${\mathcal F}$ be a collection of non-empty subsets of X.  Then there exists a choice function $f: {\mathcal F} \to X$, i.e. a function such that $f(A) \in A$ for all $A \in {\mathcal F}$.

One implication is easy:

Proof of axiom of choice using Zorn’s lemma. Define a partial choice function to be a pair $({\mathcal F}', f')$, where ${\mathcal F}'$ is a subset of ${\mathcal F}$ and $f': {\mathcal F}' \to X$ is a choice function for ${\mathcal F}'$.  We can partially order the collection of partial choice functions by writing $({\mathcal F}', f') \leq ({\mathcal F}'', f'')$ if ${\mathcal F}' \subset {\mathcal F}''$ and $f''$ extends $f'$.  The collection of partial choice functions is non-empty (since it contains the pair $(\emptyset, ())$ consisting of the empty set and the empty function), and it is easy to see that any chain of partial choice functions has an upper bound (formed by gluing all the partial choices together).  Hence, by Zorn’s lemma, there is a maximal partial choice function $({\mathcal F}_*, f_*)$.  But the domain ${\mathcal F}_*$ of this function must be all of ${\mathcal F}$, since otherwise one could enlarge ${\mathcal F}_*$ by a single set A and extend $f_*$ to A by choosing a single element of A.  (One does not need the axiom of choice to make a single choice, or finitely many choices; it is only when making infinitely many choices that the axiom becomes necessary.)  The claim follows. $\Box$

In the rest of these notes I would like to supply the reverse implication, using the machinery of well-ordered sets.  Instead of giving the shortest or slickest proof of Zorn’s lemma here, I would like to take the opportunity to place the lemma in the context of several related topics, such as ordinals and transfinite induction, noting that much of this material is in fact independent of the axiom of choice.  The material here is standard, but for the purposes of this course one may simply take Zorn’s lemma as a “black box” and not worry about the proof, so this material is optional.

When studying a mathematical space X (e.g. a vector space, a topological space, a manifold, a group, an algebraic variety etc.), there are two fundamentally basic ways to try to understand the space:

1. By looking at subobjects in X, or more generally maps $f: Y \to X$ from some other space Y into X.  For instance, a point in a space X can be viewed as a map from $pt$ to X; a curve in a space X could be thought of as a map from $[0,1]$ to X; a group G can be studied via its subgroups K, and so forth.
2. By looking at objects on X, or more precisely maps $f: X \to Y$ from X into some other space Y.  For instance, one can study a topological space X via the real- or complex-valued continuous functions $f \in C(X)$ on X; one can study a group G via its quotient groups $\pi: G \to G/H$; one can study an algebraic variety V by studying the polynomials on V (and in particular, the ideal of polynomials that vanish identically on V); and so forth.

(There are also more sophisticated ways to study an object via its maps, e.g. by studying extensions, joinings, splittings, universal lifts, etc.  The general study of objects via the maps between them is formalised abstractly in modern mathematics as category theory, and is also closely related to homological algebra.)

A remarkable phenomenon in many areas of mathematics is that of (contravariant) duality: that the maps into and out of one type of mathematical object X can be naturally associated to the maps out of and into a dual object $X^*$ (note the reversal of arrows here!).  In some cases, the dual object $X^*$ looks quite different from the original object X.  (For instance, in Stone duality, discussed in Notes 4, X would be a Boolean algebra (or some other partially ordered set) and $X^*$ would be a compact totally disconnected Hausdorff space (or some other topological space).)   In other cases, most notably with Hilbert spaces as discussed in Notes 5, the dual object $X^*$ is essentially identical to X itself.

In these notes we discuss a third important case of duality, namely duality of normed vector spaces, which is of an intermediate nature to the previous two examples: the dual $X^*$ of a normed vector space turns out to be another normed vector space, but generally one which is not equivalent to X itself (except in the important special case when X is a Hilbert space, as mentioned above).  On the other hand, the double dual $(X^*)^*$ turns out to be closely related to X, and in several (but not all) important cases, is essentially identical to X.  One of the most important uses of dual spaces in functional analysis is that it allows one to define the transpose $T^*: Y^* \to X^*$ of a continuous linear operator $T: X \to Y$.
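To make the last point concrete, here is a brief sketch of the usual definition of the transpose, together with the estimate showing that it is well defined and bounded. The symbol $\lambda$ for an element of $Y^*$ and the notation for the norms are choices made here for illustration.

```latex
% Assumption: T : X \to Y is a continuous linear operator between normed
% vector spaces, with operator norm \|T\|, and \lambda \in Y^*.
% Definition: T^* \lambda is the functional on X obtained by composing with T,
\[
  (T^* \lambda)(x) := \lambda(T x) \qquad (x \in X).
\]
% Boundedness: for every x \in X,
\[
  |(T^* \lambda)(x)| \leq \|\lambda\|_{Y^*} \, \|T x\|_Y
  \leq \|\lambda\|_{Y^*} \, \|T\| \, \|x\|_X ,
\]
% so T^* \lambda \in X^*, the map T^* : Y^* \to X^* is linear and continuous,
% and \|T^*\| \leq \|T\|.
```

Note the reversal of arrows: $T$ goes from $X$ to $Y$, while $T^*$ goes from $Y^*$ to $X^*$.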

A fundamental tool in understanding duality of normed vector spaces will be the Hahn-Banach theorem, which is indispensable for exploring the dual of a vector space.  (Indeed, without this theorem, it is not clear at all that the dual of a non-trivial normed vector space is non-trivial!)  Thus, we shall study this theorem in detail in these notes concurrently with our discussion of duality.