You are currently browsing the category archive for the ‘245B – Real analysis’ category.

In functional analysis, it is common to endow various (infinite-dimensional) vector spaces with a variety of topologies. For instance, a normed vector space can be given the strong topology as well as the weak topology; if the vector space has a predual, it also has a weak-* topology. Similarly, spaces of operators have a number of useful topologies on them, including the operator norm topology, strong operator topology, and the weak operator topology. For function spaces, one can use topologies associated to various modes of convergence, such as uniform convergence, pointwise convergence, locally uniform convergence, or convergence in the sense of distributions. (A small minority of such modes are not topologisable, though, the most common of which is pointwise almost everywhere convergence; see Exercise 8 of this previous post).

Some of these topologies are much stronger than others (in that they contain many more open sets, or equivalently that they have many fewer convergent sequences and nets). However, even the weakest topologies used in analysis (e.g. convergence in distributions) tend to be Hausdorff, since this at least ensures the uniqueness of limits of sequences and nets, which is a fundamentally useful feature for analysis. On the other hand, some Hausdorff topologies used are “better” than others in that many more analysis tools are available for those topologies. In particular, topologies that come from Banach space norms are particularly valued, as such topologies (and their attendant norm and metric structures) grant access to many convenient additional results such as the Baire category theorem, the uniform boundedness principle, the open mapping theorem, and the closed graph theorem.

Of course, most topologies placed on a vector space will not come from Banach space norms. For instance, if one takes the space $C_0(\mathbb{R})$ of continuous functions on $\mathbb{R}$ that converge to zero at infinity, the topology of uniform convergence comes from a Banach space norm on this space (namely, the uniform norm $\|\cdot\|_{L^\infty}$), but the topology of pointwise convergence does not; and indeed all the other usual modes of convergence one could use here (e.g. $L^1$ convergence, locally uniform convergence, convergence in measure, etc.) do not arise from Banach space norms.

I recently realised (while teaching a graduate class in real analysis) that the closed graph theorem provides a quick explanation for why Banach space topologies are so rare:

Proposition 1 Let $V = (V, \mathcal{F})$ be a Hausdorff topological vector space. Then, up to equivalence of norms, there is at most one norm $\|\cdot\|$ one can place on $V$ so that $(V, \|\cdot\|)$ is a Banach space whose topology is at least as strong as $\mathcal{F}$. In particular, there is at most one topology stronger than $\mathcal{F}$ that comes from a Banach space norm.

*Proof:* Suppose one had two norms $\|\cdot\|_1, \|\cdot\|_2$ on $V$ such that $(V, \|\cdot\|_1)$ and $(V, \|\cdot\|_2)$ were both Banach spaces with topologies stronger than $\mathcal{F}$. Now consider the graph of the identity function $\iota: V \to V$ from the Banach space $(V, \|\cdot\|_1)$ to the Banach space $(V, \|\cdot\|_2)$. This graph is closed; indeed, if $(x_n, x_n)$ is a sequence in this graph that converged in the product topology to $(x, y)$, then $x_n$ converges to $x$ in $\|\cdot\|_1$ norm and hence in $\mathcal{F}$, and similarly $x_n$ converges to $y$ in $\|\cdot\|_2$ norm and hence in $\mathcal{F}$. But limits are unique in the Hausdorff topology $\mathcal{F}$, so $x = y$. Applying the closed graph theorem (see also previous discussions on this theorem), we see that the identity map is continuous from $(V, \|\cdot\|_1)$ to $(V, \|\cdot\|_2)$; similarly for the inverse. Thus the norms are equivalent as claimed.
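To spell out the last step of the proof: writing $V$ for the space and $\|\cdot\|_1, \|\cdot\|_2$ for the two norms (symbols chosen here for illustration), continuity of the identity map in both directions amounts to a two-sided bound with finite positive constants $C_1, C_2$, which is what equivalence of norms means.

```latex
% Continuity of the identity (V, ||.||_1) -> (V, ||.||_2) gives C_1;
% continuity of its inverse gives C_2. Both constants are assumed
% finite and positive, and are introduced only for this sketch.
\[
  \|x\|_2 \le C_1 \|x\|_1
  \quad\text{and}\quad
  \|x\|_1 \le C_2 \|x\|_2
  \qquad \text{for all } x \in V,
\]
% Combining the two inequalities:
\[
  C_2^{-1} \|x\|_1 \;\le\; \|x\|_2 \;\le\; C_1 \|x\|_1
  \qquad \text{for all } x \in V .
\]
```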

By using various generalisations of the closed graph theorem, one can generalise the above proposition to Fréchet spaces, or even to F-spaces. The proposition can fail if one drops the requirement that the norms be stronger than a specified Hausdorff topology; indeed, if $V$ is infinite dimensional, one can use a Hamel basis of $V$ to construct a linear bijection on $V$ that is unbounded with respect to a given Banach space norm $\|\cdot\|$, and which can then be used to give an inequivalent Banach space structure on $V$.

One can interpret Proposition 1 as follows: once one equips a vector space with some “weak” (but still Hausdorff) topology, there is a *canonical* choice of “strong” topology one can place on that space that is stronger than the “weak” topology but arises from a Banach space structure (or at least a Fréchet or F-space structure), provided that at least one such structure exists. In the case of function spaces, one can usually use the topology of convergence in the sense of distributions as the “weak” Hausdorff topology for this purpose, since this topology is weaker than almost all of the other topologies used in analysis. This helps justify the common practice of describing a Banach or Fréchet function space just by giving the set of functions that belong to that space (e.g. $\mathcal{S}(\mathbb{R}^n)$ is the space of Schwartz functions on $\mathbb{R}^n$) without bothering to specify the precise topology to serve as the “strong” topology, since it is usually understood that one is using the canonical such topology (e.g. the Fréchet space structure on $\mathcal{S}(\mathbb{R}^n)$ given by the usual Schwartz space seminorms).

Of course, there are still some topological vector spaces which have no “strong topology” arising from a Banach space at all. Consider for instance the space $c_c(\mathbb{N})$ of finitely supported sequences. A weak, but still Hausdorff, topology to place on this space is the topology of pointwise convergence. But there is no norm $\|\cdot\|$ stronger than this topology that makes this space a Banach space. For, if there were, then letting $e_1, e_2, e_3, \ldots$ be the standard basis of $c_c(\mathbb{N})$, the series $\sum_{n=1}^\infty 2^{-n} e_n / \|e_n\|$ would have to converge in $\|\cdot\|$, and hence pointwise, to an element of $c_c(\mathbb{N})$, but the only available pointwise limit for this series lies outside of $c_c(\mathbb{N})$. But I do not know if there is an easily checkable criterion to test whether a given vector space (equipped with a Hausdorff “weak” topology) can be equipped with a stronger Banach space (or Fréchet space or F-space) topology.
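The computation behind this counterexample can be sketched as follows. The sketch assumes a hypothetical Banach norm $\|\cdot\|$ on the space of finitely supported sequences, writes $e_n$ for the $n$-th standard basis vector, and uses the coefficients $2^{-n}$ purely for illustration (any positive summable sequence would serve).

```latex
% The normalised series is absolutely convergent in the hypothetical norm:
\[
  \sum_{n=1}^\infty \left\| \frac{2^{-n} e_n}{\|e_n\|} \right\|
  \;=\; \sum_{n=1}^\infty 2^{-n} \;=\; 1 \;<\; \infty .
\]
% By completeness, the partial sums converge in norm to some x in the space.
% Norm convergence is assumed to imply pointwise convergence, and the n-th
% coordinate of every partial sum past the n-th term equals 2^{-n}/||e_n||, so
\[
  x(n) \;=\; \frac{2^{-n}}{\|e_n\|} \;\neq\; 0
  \qquad \text{for every } n \ge 1 .
\]
% Hence x has infinitely many non-zero coordinates and is not finitely
% supported, contradicting the assumption that x lies in the space.
```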

One way to study a general class of mathematical objects is to embed them into a more structured class of mathematical objects; for instance, one could study manifolds by embedding them into Euclidean spaces. In these (optional) notes we study two (related) embedding theorems for topological spaces:

- The Stone-Čech compactification, which embeds locally compact Hausdorff spaces into compact Hausdorff spaces in a “universal” fashion; and
- The Urysohn metrization theorem, that shows that every second-countable normal Hausdorff space is metrizable.

The 245B final can be found here. I am not posting solutions, but readers (both students and non-students) are welcome to discuss the final questions in the comments below.

The continuation to this course, 245C, will begin on Monday, March 29. The topics for this course are still somewhat fluid – but I tentatively plan to cover the following topics, roughly in order:

- $L^p$ spaces and interpolation; fractional integration
- The Fourier transform on $\mathbb{R}^n$ (a very quick review; this is of course covered more fully in 247A)
- Schwartz functions, and the theory of distributions
- Hausdorff measure
- The spectral theorem (introduction only; the topic is covered in depth in 255A)

I am open to further suggestions for topics that would build upon the 245AB material, which would be of interest to students, and which would not overlap too substantially with other graduate courses offered at UCLA.

A key theme in real analysis is that of studying general functions $f: X \to \mathbb{R}$ or $f: X \to \mathbb{C}$ by first approximating them by “simpler” or “nicer” functions. But the precise class of “simple” or “nice” functions may vary from context to context. In measure theory, for instance, it is common to approximate measurable functions by indicator functions or simple functions. But in other parts of analysis, it is often more convenient to approximate rough functions by continuous or smooth functions (perhaps with compact support, or some other decay condition), or by functions in some algebraic class, such as the class of polynomials or trigonometric polynomials.

In order to approximate rough functions by more continuous ones, one of course needs tools that can generate continuous functions with some specified behaviour. The two basic tools for this are Urysohn’s lemma, which approximates indicator functions by continuous functions, and the Tietze extension theorem, which extends continuous functions on a subdomain to continuous functions on a larger domain. An important consequence of these theorems is the Riesz representation theorem for linear functionals on the space of compactly supported continuous functions, which describes such functionals in terms of Radon measures.

Sometimes, approximation by continuous functions is not enough; one must approximate continuous functions in turn by an even smoother class of functions. A useful tool in this regard is the Stone-Weierstrass theorem, which generalises the classical Weierstrass approximation theorem to more general algebras of functions.

As an application of this theory (and of many of the results accumulated in previous lecture notes), we will present (in an optional section) the commutative Gelfand-Neimark theorem classifying all commutative unital $C^*$-algebras.

Today I’d like to discuss (in the Tricks Wiki format) a fundamental trick in “soft” analysis, sometimes known as the “limiting argument” or “epsilon regularisation argument”.

**Title**: Give yourself an epsilon of room.

**Quick description**: You want to prove some statement $S_0$ about some object $x_0$ (which could be a number, a point, a function, a set, etc.). To do so, pick a small $\varepsilon > 0$, and first prove a weaker statement $S_\varepsilon$ (which allows for “losses” which go to zero as $\varepsilon \to 0$) about some perturbed object $x_\varepsilon$. Then, take limits $\varepsilon \to 0$. Provided that the dependency and continuity of the weaker conclusion $S_\varepsilon$ on $\varepsilon$ are sufficiently controlled, and $x_\varepsilon$ is converging to $x_0$ in an appropriately strong sense, you will recover the original statement.

One can of course play a similar game when proving a statement $S_\infty$ about some object $X_\infty$, by first proving a weaker statement $S_N$ on some approximation $X_N$ to $X_\infty$ for some large parameter N, and then sending $N \to \infty$ at the end.

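**General discussion:** Here are some typical examples of a target statement $S_0$, and the approximating statements $S_\varepsilon$ that would converge to $S_0$:

| Target statement $S_0$ | Approximating statements $S_\varepsilon$ |
| --- | --- |
| $f(x_0) > 0$ | $f(x_\varepsilon) \geq c - o(1)$ for some $c > 0$ independent of $\varepsilon$ |
| $f(x_0)$ is finite | $f(x_\varepsilon)$ is bounded uniformly in $\varepsilon$ |
| $f(x_0) \geq f(x)$ for all $x \in X$ (i.e. $x_0$ maximises f) | $f(x_\varepsilon) \geq f(x) - o(1)$ for all $x \in X$ (i.e. $x_\varepsilon$ nearly maximises f) |
| $f_n(x_0)$ converges as $n \to \infty$ | $f_n(x_\varepsilon)$ fluctuates by at most o(1) for sufficiently large n |
| $f_0$ is a measurable function | $f_\varepsilon$ is a measurable function converging pointwise to $f_0$ |
| $f_0$ is a continuous function | $f_\varepsilon$ is an equicontinuous family of functions converging pointwise to $f_0$ OR $f_\varepsilon$ is continuous and converges (locally) uniformly to $f_0$ |
| The event $E_0$ holds almost surely | The event $E_\varepsilon$ holds with probability 1-o(1) |
| The statement $P_0(x)$ holds for almost every x | The statement $P_\varepsilon(x)$ holds for x outside of a set of measure o(1) |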

Of course, to justify the convergence of $S_\varepsilon$ to $S_0$, it is necessary that $x_\varepsilon$ converge to $x_0$ (or $f_\varepsilon$ converge to $f_0$, etc.) in a suitably strong sense. (But for the purposes of proving just *upper* bounds, such as $f(x_0) \leq M$, one can often get by with quite weak forms of convergence, thanks to tools such as Fatou’s lemma or the weak closure of the unit ball.) Similarly, we need some continuity (or at least semi-continuity) hypotheses on the functions f, g appearing above.

It is also necessary in many cases that the control $S_\varepsilon$ on the approximating object $x_\varepsilon$ is somehow “uniform in $\varepsilon$”, although for “$\sigma$-closed” conclusions, such as measurability, this is not required. [It is important to note that it is only the *final* conclusion $S_\varepsilon$ on $x_\varepsilon$ that needs to have this uniformity in $\varepsilon$; one is permitted to have some intermediate stages in the derivation of $S_\varepsilon$ that depend on $\varepsilon$ in a non-uniform manner, so long as these non-uniformities cancel out or otherwise disappear at the end of the argument.]

By giving oneself an epsilon of room, one can evade a lot of familiar issues in soft analysis. For instance, by replacing “rough”, “infinite-complexity”, “continuous”, “global”, or otherwise “infinitary” objects $x_0$ with “smooth”, “finite-complexity”, “discrete”, “local”, or otherwise “finitary” approximants $x_\varepsilon$, one can finesse most issues regarding the justification of various formal operations (e.g. exchanging limits, sums, derivatives, and integrals). [It is important to be aware, though, that any quantitative measure on how smooth, discrete, finite, etc. $x_\varepsilon$ is should be expected to degrade in the limit $\varepsilon \to 0$, and so one should take extreme caution in using such quantitative measures to derive estimates that are uniform in $\varepsilon$.] Similarly, issues such as whether the supremum $M := \sup_{x \in X} f(x)$ of a function on a set is actually attained by some maximiser $x_0$ become moot if one is willing to settle instead for an almost-maximiser $x_\varepsilon$, e.g. one which comes within an epsilon of that supremum M (or which is larger than $1/\varepsilon$, if M turns out to be infinite). Last, but not least, one can use the epsilon room to avoid degenerate solutions, for instance by perturbing a non-negative function to be strictly positive, perturbing a non-strictly monotone function to be strictly monotone, and so forth.

To summarise: one can view the epsilon regularisation argument as a “loan” in which one borrows an epsilon here and there in order to be able to ignore soft analysis difficulties, and can temporarily be able to utilise estimates which are non-uniform in epsilon, but at the end of the day one needs to “pay back” the loan by establishing a final “hard analysis” estimate which is uniform in epsilon (or whose error terms decay to zero as epsilon goes to zero).
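A minimal worked instance of the “loan” picture, chosen purely for illustration (the real numbers $a, b$ and the constant $C_\varepsilon$ are hypothetical and do not come from the examples above): a loss that goes to zero with $\varepsilon$ can be paid back in the limit, whereas a loss multiplied by a constant that blows up as $\varepsilon \to 0$ cannot.

```latex
% Uniform case: the loss epsilon tends to zero, so the target a <= b is recovered.
\[
  a \le b + \varepsilon \ \ \text{for all } \varepsilon > 0
  \quad\Longrightarrow\quad
  a \le \lim_{\varepsilon \to 0^+} (b + \varepsilon) = b .
\]
% Non-uniform case: here the constant C_eps = 1/eps degrades as eps -> 0,
% and the error term eps * C_eps = 1 does not decay.
\[
  a \le b + \varepsilon \, C_\varepsilon , \quad C_\varepsilon = \frac{1}{\varepsilon}
  \quad\Longrightarrow\quad
  a \le b + 1 \ \ \text{only.}
\]
```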

**A variant:** It may seem that the epsilon regularisation trick is useless if one is already in “hard analysis” situations when all objects are already “finitary”, and all formal computations easily justified. However, there is an important variant of this trick which applies in this case: namely, instead of sending the epsilon parameter to zero, choose epsilon to be a *sufficiently* small (but not *infinitesimally* small) quantity, depending on other parameters in the problem, so that one can eventually neglect various error terms and obtain a useful bound at the end of the day. (For instance, any result proven using the Szemerédi regularity lemma is likely to be of this type.) Since one is not sending epsilon to zero, not every term in the final bound needs to be uniform in epsilon, though for quantitative applications one still would like the dependencies on such parameters to be as favourable as possible.

**Prerequisites**: Graduate real analysis. (Actually, this isn’t so much a prerequisite as it is a *corequisite*: the limiting argument plays a central role in many fundamental results in real analysis.) Some examples also require some exposure to PDE.

A normed vector space $(X, \|\cdot\|_X)$ automatically generates a topology, known as the *norm topology* or *strong topology* on $X$, generated by the open balls $B(x,r) := \{ y \in X : \|y - x\|_X < r \}$. A sequence $x_n$ in such a space *converges strongly* (or *converges in norm*) to a limit $x$ if and only if $\|x_n - x\|_X \to 0$ as $n \to \infty$. This is the topology we have implicitly been using in our previous discussion of normed vector spaces.

However, in some cases it is useful to work in topologies on vector spaces that are weaker than a norm topology. One reason for this is that many important modes of convergence, such as pointwise convergence, convergence in measure, smooth convergence, or convergence on compact subsets, are not captured by a norm topology, and so it is useful to have a more general theory of topological vector spaces that contains these modes. Another reason (of particular importance in PDE) is that the norm topology on infinite-dimensional spaces is so strong that very few sets are compact or pre-compact in these topologies, making it difficult to apply *compactness methods* in these topologies. Instead, one often first works in a weaker topology, in which compactness is easier to establish, and then somehow upgrades any weakly convergent sequences obtained via compactness to stronger modes of convergence (or alternatively, one abandons strong convergence and exploits the weak convergence directly). Two basic weak topologies for this purpose are the weak topology on a normed vector space $X$, and the weak* topology on a dual vector space $X^*$. Compactness in the latter topology is usually obtained from the Banach-Alaoglu theorem (and its sequential counterpart), which will be a quick consequence of the Tychonoff theorem (and its sequential counterpart) from the previous lecture.
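A standard example showing that weak convergence is genuinely weaker than strong convergence, stated here for the Hilbert space $\ell^2(\mathbb{N})$ (an illustrative choice of space), with $e_n$ denoting the $n$-th standard basis vector:

```latex
% Strong convergence to 0 fails: every e_n has norm exactly 1.
\[
  \|e_n - 0\|_{\ell^2} = 1 \quad \text{for all } n ,
  \qquad \text{so } e_n \not\to 0 \text{ in norm.}
\]
% Weak convergence to 0 holds: every continuous linear functional on \ell^2
% is given by pairing with some y in \ell^2 (Riesz representation), and the
% coordinates of a square-summable sequence tend to zero.
\[
  |\langle e_n, y \rangle| = |y(n)| \to 0
  \quad \text{as } n \to \infty ,
  \qquad \text{for each fixed } y \in \ell^2(\mathbb{N}) .
\]
```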

The strong and weak topologies on normed vector spaces also have analogues for the space $B(X \to Y)$ of bounded linear operators from $X$ to $Y$, thus supplementing the operator norm topology on that space with two weaker topologies, which (somewhat confusingly) are named the strong operator topology and the weak operator topology.

One of the most useful concepts for analysis that arise from topology and metric spaces is the concept of compactness; recall that a space $X$ is compact if every open cover of $X$ has a finite subcover, or equivalently if any collection of closed sets with the finite intersection property (i.e. every finite subcollection of these sets has non-empty intersection) has non-empty intersection. In these notes, we explore how compactness interacts with other key topological concepts: the Hausdorff property, bases and sub-bases, product spaces, and equicontinuity, in particular establishing the useful Tychonoff and Arzelà-Ascoli theorems that give criteria for compactness (or precompactness).
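The equivalence of the two formulations of compactness recalled above is an instance of de Morgan's laws. In the sketch below, $X$ is the space, $(U_\alpha)_{\alpha \in A}$ is an arbitrary family of open sets, and $F_\alpha$ are their closed complements (all symbols chosen here for illustration).

```latex
% Passing to complements turns covers into families with empty intersection.
\[
  F_\alpha := X \setminus U_\alpha ,
  \qquad
  \bigcup_{\alpha \in A} U_\alpha = X
  \iff
  \bigcap_{\alpha \in A} F_\alpha = \emptyset .
\]
% The same identity applied to a finite subfamily A' of A shows:
% "some finite subfamily covers X" iff "some finite subfamily of the F_alpha
% has empty intersection". Taking the contrapositive of
% "every cover has a finite subcover" therefore gives
\[
  \Big( \bigcap_{\alpha \in A'} F_\alpha \neq \emptyset
        \ \text{ for all finite } A' \subset A \Big)
  \;\Longrightarrow\;
  \bigcap_{\alpha \in A} F_\alpha \neq \emptyset .
\]
```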

Exercise 1 (Basic properties of compact sets)

- Show that any finite set is compact.
- Show that any finite union of compact subsets of a topological space is still compact.
- Show that any image of a compact space under a continuous map is still compact.
Show that these three statements continue to hold if “compact” is replaced by “sequentially compact”.

The notion of what it means for a subset E of a space X to be “small” varies from context to context. For instance, in measure theory, when $X = (X, \mathcal{B}, \mu)$ is a measure space, one useful notion of a “small” set is that of a null set: a set E of measure zero (or at least contained in a set of measure zero). By countable additivity, countable unions of null sets are null. Taking contrapositives, we obtain

Lemma 1. (Pigeonhole principle for measure spaces) Let $E_1, E_2, \ldots$ be an at most countable sequence of measurable subsets of a measure space X. If $\bigcup_n E_n$ has positive measure, then at least one of the $E_n$ has positive measure.
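The lemma is the contrapositive of countable subadditivity, which can be written out as follows; here $\mu$ denotes the measure on X and $E_n$ the sets in the sequence (notation assumed for this sketch).

```latex
% Countable subadditivity of the measure mu:
\[
  \mu\Big( \bigcup_{n} E_n \Big) \;\le\; \sum_{n} \mu(E_n) .
\]
% If every E_n were null, the right-hand side would vanish:
\[
  \mu(E_n) = 0 \ \text{ for all } n
  \quad\Longrightarrow\quad
  \mu\Big( \bigcup_{n} E_n \Big) = 0 ,
\]
% so a union of positive measure forces mu(E_n) > 0 for at least one n.
```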

Now suppose that X was a Euclidean space with Lebesgue measure m. The Lebesgue differentiation theorem easily implies that having positive measure is equivalent to being “dense” in certain balls:

Proposition 1. Let $E$ be a measurable subset of $\mathbb{R}^d$. Then the following are equivalent:

- E has positive measure.
- For any $\varepsilon > 0$, there exists a ball B such that $m(E \cap B) \geq (1 - \varepsilon) m(B)$.

Thus one can think of a null set as a set which is “nowhere dense” in some measure-theoretic sense.

It turns out that there are analogues of these results when the measure space $X = (X, \mathcal{B}, \mu)$ is replaced instead by a complete metric space $X = (X, d)$. Here, the appropriate notion of a “small” set is not a null set, but rather that of a nowhere dense set: a set E which is not dense in any ball, or equivalently a set whose closure has empty interior. (A good example of a nowhere dense set would be a proper subspace, or smooth submanifold, of $\mathbb{R}^d$, or a Cantor set; on the other hand, the rationals are a dense subset of $\mathbb{R}$ and thus clearly not nowhere dense.) We then have the following important result:

Theorem 1. (Baire category theorem). Let $E_1, E_2, \ldots$ be an at most countable sequence of subsets of a complete metric space X. If $\bigcup_n E_n$ contains a ball B, then at least one of the $E_n$ is dense in a sub-ball B’ of B (and in particular is not nowhere dense). To put it in the contrapositive: the countable union of nowhere dense sets cannot contain a ball.

**Exercise 1.** Show that the Baire category theorem is equivalent to the claim that in a complete metric space, the countable intersection of open dense sets remains dense.

**Exercise 2.** Using the Baire category theorem, show that any non-empty complete metric space without isolated points is uncountable. (In particular, this shows that the Baire category theorem can fail for incomplete metric spaces such as the rationals $\mathbb{Q}$.)

To quickly illustrate an application of the Baire category theorem, observe that it implies that one cannot cover a finite-dimensional real or complex vector space by a countable number of proper subspaces. One can of course also establish this fact by using Lebesgue measure on this space. However, the advantage of the Baire category approach is that it also works well in infinite dimensional complete normed vector spaces, i.e. Banach spaces, whereas the measure-theoretic approach runs into significant difficulties in infinite dimensions. This leads to three fundamental equivalences between the *qualitative* theory of continuous linear operators on Banach spaces (e.g. finiteness, surjectivity, etc.) and the *quantitative* theory (i.e. estimates):

- The uniform boundedness principle, that equates the qualitative boundedness (or convergence) of a family of continuous operators with their quantitative boundedness.
- The open mapping theorem, that equates the qualitative solvability of a linear problem Lu = f with the quantitative solvability.
- The closed graph theorem, that equates the qualitative regularity of a (weakly continuous) operator T with the quantitative regularity of that operator.
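For concreteness, the quantitative conclusions of the three theorems can be written as estimates. The sketch below uses standard formulations and assumes $X, Y$ are Banach spaces, $(T_\alpha)$ is a family of bounded linear operators from $X$ to $Y$, and $L, T: X \to Y$ are linear; the constant $C$ and the exact hypotheses are as usually stated in textbooks rather than taken from the text above.

```latex
% Uniform boundedness: pointwise boundedness implies a uniform operator-norm bound.
\[
  \sup_\alpha \|T_\alpha x\|_Y < \infty \ \text{ for every } x \in X
  \quad\Longrightarrow\quad
  \sup_\alpha \|T_\alpha\|_{\mathrm{op}} < \infty .
\]
% Open mapping: if L is bounded and surjective, Lu = f is solvable with a bound.
\[
  \exists\, C < \infty : \quad
  \forall f \in Y \ \ \exists\, u \in X \ \text{ with } \ Lu = f
  \ \text{ and } \ \|u\|_X \le C \|f\|_Y .
\]
% Closed graph: if the graph of T is closed in X x Y, then T is bounded.
\[
  \exists\, C < \infty : \quad
  \|Tx\|_Y \le C \|x\|_X \quad \text{for all } x \in X .
\]
```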

Strictly speaking, these theorems are not used much directly in practice, because one usually works in the reverse direction (i.e. first proving quantitative bounds, and then deriving qualitative corollaries); but the above three theorems help explain *why* we usually approach qualitative problems in functional analysis via their quantitative counterparts.

To progress further in our study of function spaces, we will need to develop the standard theory of metric spaces, and of the closely related theory of topological spaces (i.e. point-set topology). I will be assuming that students in my class will already have encountered these concepts in an undergraduate topology or real analysis course, but for the sake of completeness I will briefly review the basics of both spaces here.

**Notational convention:** As in Notes 2, I will colour a statement red in this post if it assumes the axiom of choice. We will, of course, rely on every other axiom of Zermelo-Fraenkel set theory here (and in the rest of the course).

In this course we will often need to iterate some sort of operation “infinitely many times” (e.g. to create an infinite basis by choosing one basis element at a time). In order to do this rigorously, we will rely on *Zorn’s lemma*:

Zorn’s Lemma. Let $(X, \leq)$ be a non-empty partially ordered set, with the property that every chain (i.e. a totally ordered set) in X has an upper bound. Then X contains a maximal element (i.e. an element with no larger element).

Indeed, we have used this lemma several times already in previous notes. Given the other standard axioms of set theory, this lemma is logically equivalent to

Axiom of choice. Let X be a set, and let $\mathcal{F}$ be a collection of non-empty subsets of X. Then there exists a choice function $f: \mathcal{F} \to X$, i.e. a function such that $f(A) \in A$ for all $A \in \mathcal{F}$.

One implication is easy:

**Proof of axiom of choice using Zorn’s lemma.** Define a *partial choice function* to be a pair $(\mathcal{F}', f')$, where $\mathcal{F}'$ is a subset of $\mathcal{F}$ and $f': \mathcal{F}' \to X$ is a choice function for $\mathcal{F}'$. We can partially order the collection of partial choice functions by writing $(\mathcal{F}', f') \leq (\mathcal{F}'', f'')$ if $\mathcal{F}' \subset \mathcal{F}''$ and $f''$ extends $f'$. The collection of partial choice functions is non-empty (since it contains the pair consisting of the empty set and the empty function), and it is easy to see that any chain of partial choice functions has an upper bound (formed by gluing all the partial choices together). Hence, by Zorn’s lemma, there is a maximal partial choice function $(\mathcal{F}_*, f_*)$. But the domain $\mathcal{F}_*$ of this function must be all of $\mathcal{F}$, since otherwise one could enlarge $\mathcal{F}_*$ by a single set A and extend $f_*$ to A by choosing a single element of A. (One does not need the axiom of choice to make a single choice, or finitely many choices; it is only when making infinitely many choices that the axiom becomes necessary.) The claim follows.
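The “gluing” step in this proof can be made explicit as follows. The sketch assumes a non-empty chain of partial choice functions indexed by a set $I$, written $(\mathcal{F}_\alpha, f_\alpha)_{\alpha \in I}$; the index set and this notation are introduced only for illustration.

```latex
% Upper bound of the chain: take the union of the domains and define the
% function piecewise.
\[
  \mathcal{F}' := \bigcup_{\alpha \in I} \mathcal{F}_\alpha ,
  \qquad
  f'(A) := f_\alpha(A) \ \text{ for any } \alpha \in I
  \text{ with } A \in \mathcal{F}_\alpha .
\]
% Well-definedness: if A lies in both F_alpha and F_beta, then, since the
% family is a chain, one of f_alpha, f_beta extends the other, so
\[
  f_\alpha(A) = f_\beta(A) .
\]
% Moreover f'(A) = f_alpha(A) is an element of A, so (F', f') is again a
% partial choice function, and it extends every member of the chain.
```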

In the rest of these notes I would like to supply the reverse implication, using the machinery of well-ordered sets. Instead of giving the shortest or slickest proof of Zorn’s lemma here, I would like to take the opportunity to place the lemma in the context of several related topics, such as ordinals and transfinite induction, noting that much of this material is in fact independent of the axiom of choice. The material here is standard, but for the purposes of this course one may simply take Zorn’s lemma as a “black box” and not worry about the proof, so this material is optional.
