You are currently browsing the category archive for the ‘245B – Real analysis’ category.

Suppose one has a bounded sequence {(a_n)_{n=1}^\infty = (a_1, a_2, \dots)} of real numbers. What kinds of limits can one form from this sequence?

Of course, we have the usual notion of limit {\lim_{n \rightarrow \infty} a_n}, which in this post I will refer to as the classical limit to distinguish from the other limits discussed in this post. The classical limit, if it exists, is the unique real number {L} such that for every {\varepsilon>0}, one has {|a_n-L| \leq \varepsilon} for all sufficiently large {n}. We say that a sequence is (classically) convergent if its classical limit exists. The classical limit obeys many useful limit laws when applied to classically convergent sequences. Firstly, it is linear: if {(a_n)_{n=1}^\infty} and {(b_n)_{n=1}^\infty} are classically convergent sequences, then {(a_n+b_n)_{n=1}^\infty} is also classically convergent with

\displaystyle \lim_{n \rightarrow \infty} (a_n + b_n) = (\lim_{n \rightarrow \infty} a_n) + (\lim_{n \rightarrow \infty} b_n) \ \ \ \ \ (1)

and similarly for any scalar {c}, {(ca_n)_{n=1}^\infty} is classically convergent with

\displaystyle \lim_{n \rightarrow \infty} (ca_n) = c \lim_{n \rightarrow \infty} a_n. \ \ \ \ \ (2)

It is also an algebra homomorphism: {(a_n b_n)_{n=1}^\infty} is also classically convergent with

\displaystyle \lim_{n \rightarrow \infty} (a_n b_n) = (\lim_{n \rightarrow \infty} a_n) (\lim_{n \rightarrow \infty} b_n). \ \ \ \ \ (3)

We also have shift invariance: if {(a_n)_{n=1}^\infty} is classically convergent, then so is {(a_{n+1})_{n=1}^\infty} with

\displaystyle \lim_{n \rightarrow \infty} a_{n+1} = \lim_{n \rightarrow \infty} a_n \ \ \ \ \ (4)

and more generally in fact for any injection {\phi: {\bf N} \rightarrow {\bf N}}, {(a_{\phi(n)})_{n=1}^\infty} is classically convergent with

\displaystyle \lim_{n \rightarrow \infty} a_{\phi(n)} = \lim_{n \rightarrow \infty} a_n. \ \ \ \ \ (5)

The classical limit of a sequence is unchanged if one modifies any finite number of elements of the sequence. Finally, we have boundedness: for any classically convergent sequence {(a_n)_{n=1}^\infty}, one has

\displaystyle \inf_n a_n \leq \lim_{n \rightarrow \infty} a_n \leq \sup_n a_n. \ \ \ \ \ (6)

One can in fact show without much difficulty that these laws uniquely determine the classical limit functional on convergent sequences.

One would like to extend the classical limit notion to more general bounded sequences; however, when doing so one must give up one or more of the desirable limit laws that were listed above. Consider for instance the sequence {a_n = (-1)^n}. On the one hand, one has {a_n^2 = 1} for all {n}, so if one wishes to retain the homomorphism property (3), any “limit” of this sequence {a_n} would have to necessarily square to {1}, that is to say it must equal {+1} or {-1}. On the other hand, if one wished to retain the shift invariance property (4) as well as the homogeneity property (2), any “limit” of this sequence would have to equal its own negation and thus be zero.
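To spell out the second half of this argument: write {L} for a hypothetical “limit” of {a_n = (-1)^n} under some functional obeying (2) and (4). The symbol {\mathop{\widetilde \lim}} below is only a placeholder for such a functional. Since {a_{n+1} = -a_n}, one would have

```latex
L = \mathop{\widetilde \lim}_{n \rightarrow \infty} a_n
  = \mathop{\widetilde \lim}_{n \rightarrow \infty} a_{n+1}      % shift invariance (4)
  = \mathop{\widetilde \lim}_{n \rightarrow \infty} (-a_n)       % because a_{n+1} = -a_n
  = - \mathop{\widetilde \lim}_{n \rightarrow \infty} a_n        % homogeneity (2) with c = -1
  = -L
```

so that {2L = 0} and hence {L = 0}, which is incompatible with the requirement {L^2 = 1} coming from (3).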

Nevertheless there are a number of useful generalisations and variants of the classical limit concept for non-convergent sequences that obey a significant portion of the above limit laws. For instance, we have the limit superior

\displaystyle \limsup_{n \rightarrow \infty} a_n := \inf_N \sup_{n \geq N} a_n

and limit inferior

\displaystyle \liminf_{n \rightarrow \infty} a_n := \sup_N \inf_{n \geq N} a_n

which are well-defined real numbers for any bounded sequence {(a_n)_{n=1}^\infty}; they agree with the classical limit when the sequence is convergent, but differ from one another otherwise. They enjoy the shift-invariance property (4), and the boundedness property (6), but do not in general obey the homomorphism property (3) or the linearity property (1); indeed, we only have the subadditivity property

\displaystyle \limsup_{n \rightarrow \infty} (a_n + b_n) \leq (\limsup_{n \rightarrow \infty} a_n) + (\limsup_{n \rightarrow \infty} b_n)

for the limit superior, and the superadditivity property

\displaystyle \liminf_{n \rightarrow \infty} (a_n + b_n) \geq (\liminf_{n \rightarrow \infty} a_n) + (\liminf_{n \rightarrow \infty} b_n)

for the limit inferior. The homogeneity property (2) is only obeyed by the limits superior and inferior for non-negative {c}; for negative {c}, one must have the limit inferior on one side of (2) and the limit superior on the other, thus for instance

\displaystyle \limsup_{n \rightarrow \infty} (-a_n) = - \liminf_{n \rightarrow \infty} a_n.
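As a rough numerical illustration of these inequalities, here is a minimal Python sketch. The helper names `tail_sup` and `tail_inf` and the window `N, M` are choices made for this example only. A finite window can only approximate a tail supremum in general, but for the periodic sequences used here it is exact.

```python
def tail_sup(a, N, M):
    """Max of a(n) over N <= n <= M: a finite-window stand-in for sup_{n >= N} a_n."""
    return max(a(n) for n in range(N, M + 1))


def tail_inf(a, N, M):
    """Min of a(n) over N <= n <= M: a finite-window stand-in for inf_{n >= N} a_n."""
    return min(a(n) for n in range(N, M + 1))


def a(n):
    return (-1) ** n


def b(n):
    return (-1) ** (n + 1)


def a_plus_b(n):
    return a(n) + b(n)  # identically zero


def neg_a(n):
    return -a(n)


# Hypothetical window; exact for these sequences because they have period 2.
N, M = 1000, 2000

limsup_a, limsup_b = tail_sup(a, N, M), tail_sup(b, N, M)
liminf_a, liminf_b = tail_inf(a, N, M), tail_inf(b, N, M)
limsup_sum, liminf_sum = tail_sup(a_plus_b, N, M), tail_inf(a_plus_b, N, M)

print("limsup(a+b) =", limsup_sum, "<=", limsup_a + limsup_b)
print("liminf(a+b) =", liminf_sum, ">=", liminf_a + liminf_b)
print("limsup(-a)  =", tail_sup(neg_a, N, M), "= -liminf(a) =", -liminf_a)
```

For this pair of sequences both inequalities are strict, which shows that linearity (1) genuinely fails for the limit superior and limit inferior.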

The limit superior and limit inferior are examples of limit points of the sequence, which can for instance be defined as points that are limits of at least one subsequence of the original sequence. Indeed, the limit superior is always the largest limit point of the sequence, and the limit inferior is always the smallest limit point. However, limit points can be highly non-unique (indeed they are unique if and only if the sequence is classically convergent), and so it is difficult to sensibly interpret most of the usual limit laws in this setting, with the exception of the homogeneity property (2) and the boundedness property (6) that are easy to state for limit points.

Another notion of limit is given by the Césaro limits

\displaystyle \mathrm{C}\!\!-\!\!\lim_{n \rightarrow \infty} a_n := \lim_{N \rightarrow \infty} \frac{1}{N} \sum_{n=1}^N a_n;

if this limit exists, we say that the sequence is Césaro convergent. If the sequence {(a_n)_{n=1}^\infty} already has a classical limit, then it also has a Césaro limit that agrees with the classical limit; but there are additional sequences that have a Césaro limit but not a classical one. For instance, the non-classically convergent sequence {a_n= (-1)^n} discussed above is Césaro convergent, with a Césaro limit of {0}. However, there are still bounded sequences that do not have a Césaro limit, such as {a_n := \sin( \log n )} (exercise!), basically because such sequences oscillate too slowly for the Césaro averaging to be of much use in accelerating the convergence. The Césaro limit is linear, bounded, and shift invariant, but not an algebra homomorphism and also does not obey the rearrangement property (5).
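The two examples can be explored numerically with a short Python sketch. The function name `cesaro_mean` and the particular cutoffs are choices made for this illustration. The cutoffs 5650 and 130900 were picked so that {\log N} is close to {3\pi/4} and {7\pi/4} modulo {2\pi}, where a comparison with the integral of {\sin(\log x)} suggests the averages should be near opposite extremes. This is numerical evidence only, not a solution to the exercise.

```python
import math


def cesaro_mean(a, N):
    """Return (1/N) * sum_{n=1}^N a(n), using fsum to limit rounding error."""
    return math.fsum(a(n) for n in range(1, N + 1)) / N


def alternating(n):
    return (-1) ** n


def slow_oscillation(n):
    return math.sin(math.log(n))


# The averages of (-1)^n are 0 for even N and -1/N for odd N.
for N in (10, 11, 10**5):
    print(N, cesaro_mean(alternating, N))

# The averages of sin(log n) keep swinging between distinctly positive
# and distinctly negative values instead of settling down.
for N in (5650, 130900):
    print(N, cesaro_mean(slow_oscillation, N))
```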

Using the Hahn-Banach theorem, one can extend the classical limit functional to generalised limit functionals {\mathop{\widetilde \lim}_{n \rightarrow \infty} a_n}, defined to be bounded linear functionals from the space {\ell^\infty({\bf N})} of bounded real sequences to the real numbers {{\bf R}} that extend the classical limit functional (defined on the space {c_0({\bf N}) + {\bf R}} of convergent sequences) without any increase in the operator norm. (In some of my past writings I made the slight error of referring to these generalised limit functionals as Banach limits, though as discussed below, the latter actually refers to a subclass of generalised limit functionals.) It is not difficult to see that such generalised limit functionals will range between the limit inferior and limit superior. In fact, for any specific sequence {(a_n)_{n=1}^\infty} and any number {L} lying in the closed interval {[\liminf_{n \rightarrow \infty} a_n, \limsup_{n \rightarrow \infty} a_n]}, there exists at least one generalised limit functional {\mathop{\widetilde \lim}_{n \rightarrow \infty}} that takes the value {L} when applied to {a_n}; for instance, for any number {\theta} in {[-1,1]}, there exists a generalised limit functional that assigns that number {\theta} as the “limit” of the sequence {a_n = (-1)^n}. This claim can be seen by first designing such a limit functional on the vector space spanned by the convergent sequences and by {(a_n)_{n=1}^\infty}, and then appealing to the Hahn-Banach theorem to extend to all sequences. This observation also gives a necessary and sufficient criterion for a bounded sequence {(a_n)_{n=1}^\infty} to classically converge to a limit {L}, namely that all generalised limits of this sequence must equal {L}.

Because of the reliance on the Hahn-Banach theorem, the existence of generalised limits requires the axiom of choice (or some weakened version thereof); there are models of set theory without the axiom of choice in which no generalised limits exist. For instance, consider a Solovay model in which all subsets of the real numbers are measurable. If one lets {e_n: {\bf R} \rightarrow \{0,1,2\}} denote the function that extracts the {n^{th}} ternary digit past the decimal point (thus {e_n(x) = \lfloor 3^n x \rfloor \hbox{ mod } 3}), and lets {\mathop{\widetilde \lim}} be a generalised limit functional, then the function {f(x) := \mathop{\widetilde \lim}_{n \rightarrow \infty} e_n(x)} is non-constant (e.g. {f(0)=0} and {f(1/2)=1}), but also invariant almost everywhere with respect to translation by ternary rationals {a/3^n}, and hence cannot be measurable (due to the continuity of translation in the strong operator topology, or the Steinhaus lemma), and so generalised limit functionals cannot exist.
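The digit functions {e_n} are easy to experiment with. The sketch below uses exact rational arithmetic; the name `ternary_digit` and the sample values {x = 1/7} and {a/3^m = 5/27} are choices made for this illustration. It checks that {e_n(0) = 0} and {e_n(1/2) = 1} for the first few {n}, and that translating by a ternary rational {a/3^m} leaves the digits with {n > m} unchanged, which is the mechanism behind the translation invariance of {f}.

```python
from fractions import Fraction
from math import floor


def ternary_digit(x, n):
    """e_n(x) = floor(3^n * x) mod 3, computed exactly for rational x."""
    return floor(Fraction(3) ** n * x) % 3


half_digits = [ternary_digit(Fraction(1, 2), n) for n in range(1, 11)]
zero_digits = [ternary_digit(Fraction(0), n) for n in range(1, 11)]
print(half_digits)  # every digit of 1/2 = 0.1111..._3 is 1
print(zero_digits)  # every digit of 0 is 0

# Translation by a ternary rational a / 3^m (hypothetical sample values).
x = Fraction(1, 7)
m, a = 3, 5
shift = Fraction(a, 3**m)
unchanged = all(
    ternary_digit(x + shift, n) == ternary_digit(x, n) for n in range(m + 1, 20)
)
print(unchanged)
```

The invariance holds because {3^n (x + a/3^m) = 3^n x + 3^{n-m} a}, and {3^{n-m} a} is an integer multiple of {3} once {n > m}.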

Generalised limits can obey the shift-invariance property (4) or the algebra homomorphism property (3), but as the above analysis of the sequence {a_n = (-1)^n} shows, they cannot do both. Generalised limits that obey the shift-invariance property (4) are known as Banach limits; one can for instance construct them by applying the Hahn-Banach theorem to the Césaro limit functional; alternatively, if {\mathop{\widetilde \lim}} is any generalised limit, then the Césaro-type functional {(a_n)_{n=1}^\infty \mapsto \mathop{\widetilde \lim}_{N \rightarrow \infty} \frac{1}{N} \sum_{n=1}^N a_n} will be a Banach limit. The existence of Banach limits can be viewed as a demonstration of the amenability of the natural numbers (or integers); see this previous blog post for further discussion.

Generalised limits that obey the algebra homomorphism property (3) are known as ultrafilter limits. If one is given a generalised limit functional {p\!\!-\!\!\lim_{n \rightarrow \infty}} that obeys (3), then for any subset {A} of the natural numbers {{\bf N}}, the generalised limit {p\!\!-\!\!\lim_{n \rightarrow \infty} 1_A(n)} must equal its own square (since {1_A(n)^2 = 1_A(n)}) and is thus either {0} or {1}. If one defines {p \subset 2^{\bf N}} to be the collection of all subsets {A} of {{\bf N}} for which {p\!\!-\!\!\lim_{n \rightarrow \infty} 1_A(n) = 1}, one can verify that {p} obeys the axioms of a non-principal ultrafilter. Conversely, if {p} is a non-principal ultrafilter, one can define the associated generalised limit {p\!\!-\!\!\lim_{n \rightarrow \infty} a_n} of any bounded sequence {(a_n)_{n=1}^\infty} to be the unique real number {L} such that the sets {\{ n \in {\bf N}: |a_n - L| \leq \varepsilon \}} lie in {p} for all {\varepsilon>0}; one can check that this does indeed give a well-defined generalised limit that obeys (3). Non-principal ultrafilters can be constructed using Zorn’s lemma. In fact, they do not quite need the full strength of the axiom of choice; see the Wikipedia article on the ultrafilter lemma for examples.

We have previously noted that generalised limits of a sequence can converge to any point between the limit inferior and limit superior. The same is not true if one restricts to Banach limits or ultrafilter limits. For instance, by the arguments already given, the only possible Banach limit for the sequence {a_n = (-1)^n} is zero. Meanwhile, an ultrafilter limit must converge to a limit point of the original sequence, but conversely every limit point can be attained by at least one ultrafilter limit; we leave these assertions as an exercise to the interested reader. In particular, a bounded sequence converges classically to a limit {L} if and only if all ultrafilter limits converge to {L}.

There is no generalisation of the classical limit functional to any space that includes non-classically convergent sequences that obeys the subsequence property (5), since any non-classically-convergent sequence will have one subsequence that converges to the limit superior, and another subsequence that converges to the limit inferior, and one of these will have to violate (5) since the limit superior and limit inferior are distinct. So the above limit notions come close to the best generalisations of limit that one can use in practice.

(Added after comments) If {\beta {\bf N}} denotes the Stone-Čech compactification of the natural numbers, then {\ell^\infty({\bf N})} can be canonically identified with the continuous functions on {\beta {\bf N}}, and hence by the Riesz representation theorem, bounded linear functionals on {\ell^\infty({\bf N})} can be identified with finite measures on this space. From this it is not difficult to show that generalised limit functionals can be canonically identified with probability measures on the compact Hausdorff space {\beta {\bf N} \backslash {\bf N}}, that ultrafilter limits correspond to those probability measures that are Dirac measures (i.e. they can be canonically identified with points in {\beta {\bf N} \backslash {\bf N}}), and Banach limits correspond to those probability measures that are invariant with respect to the translation action of the integers {{\bf Z}} on {\beta {\bf N} \backslash {\bf N}}.

We summarise (some of) the above discussion in the following table:

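| Limit | Always defined | Linear | Shift-invariant | Homomorphism | Constructive |
| --- | --- | --- | --- | --- | --- |
| Classical | No | Yes | Yes | Yes | Yes |
| Superior | Yes | No | Yes | No | Yes |
| Inferior | Yes | No | Yes | No | Yes |
| Césaro | No | Yes | Yes | No | Yes |
| Generalised | Yes | Yes | Depends | Depends | No |
| Banach | Yes | Yes | Yes | No | No |
| Ultrafilter | Yes | Yes | No | Yes | No |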

 

In functional analysis, it is common to endow various (infinite-dimensional) vector spaces with a variety of topologies. For instance, a normed vector space can be given the strong topology as well as the weak topology; if the vector space has a predual, it also has a weak-* topology. Similarly, spaces of operators have a number of useful topologies on them, including the operator norm topology, strong operator topology, and the weak operator topology. For function spaces, one can use topologies associated to various modes of convergence, such as uniform convergence, pointwise convergence, locally uniform convergence, or convergence in the sense of distributions. (A small minority of such modes are not topologisable, though, the most common of which is pointwise almost everywhere convergence; see Exercise 8 of this previous post).

Some of these topologies are much stronger than others (in that they contain many more open sets, or equivalently that they have many fewer convergent sequences and nets). However, even the weakest topologies used in analysis (e.g. convergence in distributions) tend to be Hausdorff, since this at least ensures the uniqueness of limits of sequences and nets, which is a fundamentally useful feature for analysis. On the other hand, some Hausdorff topologies used are “better” than others in that many more analysis tools are available for those topologies. In particular, topologies that come from Banach space norms are particularly valued, as such topologies (and their attendant norm and metric structures) grant access to many convenient additional results such as the Baire category theorem, the uniform boundedness principle, the open mapping theorem, and the closed graph theorem.

Of course, most topologies placed on a vector space will not come from Banach space norms. For instance, if one takes the space {C_0({\bf R})} of continuous functions on {{\bf R}} that converge to zero at infinity, the topology of uniform convergence comes from a Banach space norm on this space (namely, the uniform norm {\| \|_{L^\infty}}), but the topology of pointwise convergence does not; and indeed all the other usual modes of convergence one could use here (e.g. {L^1} convergence, locally uniform convergence, convergence in measure, etc.) do not arise from Banach space norms.

I recently realised (while teaching a graduate class in real analysis) that the closed graph theorem provides a quick explanation for why Banach space topologies are so rare:

Proposition 1 Let {V = (V, {\mathcal F})} be a Hausdorff topological vector space. Then, up to equivalence of norms, there is at most one norm {\| \|} one can place on {V} so that {(V,\| \|)} is a Banach space whose topology is at least as strong as {{\mathcal F}}. In particular, there is at most one topology stronger than {{\mathcal F}} that comes from a Banach space norm.

Proof: Suppose one had two norms {\| \|_1, \| \|_2} on {V} such that {(V, \| \|_1)} and {(V, \| \|_2)} were both Banach spaces with topologies stronger than {{\mathcal F}}. Now consider the graph of the identity function {\hbox{id}: V \rightarrow V} from the Banach space {(V, \| \|_1)} to the Banach space {(V, \| \|_2)}. This graph is closed; indeed, if {(x_n,x_n)} is a sequence in this graph that converged in the product topology to {(x,y)}, then {x_n} converges to {x} in {\| \|_1} norm and hence in {{\mathcal F}}, and similarly {x_n} converges to {y} in {\| \|_2} norm and hence in {{\mathcal F}}. But limits are unique in the Hausdorff topology {{\mathcal F}}, so {x=y}. Applying the closed graph theorem (see also previous discussions on this theorem), we see that the identity map is continuous from {(V, \| \|_1)} to {(V, \| \|_2)}; similarly for the inverse. Thus the norms {\| \|_1, \| \|_2} are equivalent as claimed. \Box

By using various generalisations of the closed graph theorem, one can generalise the above proposition to Fréchet spaces, or even to F-spaces. The proposition can fail if one drops the requirement that the norms be stronger than a specified Hausdorff topology; indeed, if {V} is infinite dimensional, one can use a Hamel basis of {V} to construct a linear bijection on {V} that is unbounded with respect to a given Banach space norm {\| \|}, and which can then be used to give an inequivalent Banach space structure on {V}.

One can interpret Proposition 1 as follows: once one equips a vector space with some “weak” (but still Hausdorff) topology, there is a canonical choice of “strong” topology one can place on that space that is stronger than the “weak” topology but arises from a Banach space structure (or at least a Fréchet or F-space structure), provided that at least one such structure exists. In the case of function spaces, one can usually use the topology of convergence in distribution as the “weak” Hausdorff topology for this purpose, since this topology is weaker than almost all of the other topologies used in analysis. This helps justify the common practice of describing a Banach or Fréchet function space just by giving the set of functions that belong to that space (e.g. {{\mathcal S}({\bf R}^n)} is the space of Schwartz functions on {{\bf R}^n}) without bothering to specify the precise topology to serve as the “strong” topology, since it is usually understood that one is using the canonical such topology (e.g. the Fréchet space structure on {{\mathcal S}({\bf R}^n)} given by the usual Schwartz space seminorms).

Of course, there are still some topological vector spaces which have no “strong topology” arising from a Banach space at all. Consider for instance the space {c_c({\bf N})} of finitely supported sequences. A weak, but still Hausdorff, topology to place on this space is the topology of pointwise convergence. But there is no norm {\| \|} stronger than this topology that makes this space a Banach space. For, if there were, then letting {e_1,e_2,e_3,\dots} be the standard basis of {c_c({\bf N})}, the series {\sum_{n=1}^\infty 2^{-n} e_n / \| e_n \|} would have to converge in {\| \|}, and hence pointwise, to an element of {c_c({\bf N})}, but the only available pointwise limit for this series lies outside of {c_c({\bf N})}. But I do not know if there is an easily checkable criterion to test whether a given vector space (equipped with a Hausdorff “weak” topology) can be equipped with a stronger Banach space (or Fréchet space or {F}-space) topology.
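In slightly more detail, here is a sketch of the series argument for {c_c({\bf N})}; the notation {s_N} for the partial sums is introduced here for convenience. By the triangle inequality,

```latex
s_N := \sum_{n=1}^N \frac{2^{-n}}{\| e_n \|} e_n, \qquad
\| s_M - s_N \| \leq \sum_{n=N+1}^M 2^{-n} \leq 2^{-N} \quad \hbox{for } M > N,
```

so {(s_N)_{N=1}^\infty} is a Cauchy sequence in {\| \|}, and by completeness it converges in norm to some {s \in c_c({\bf N})}. Since the norm topology is assumed to be stronger than the topology of pointwise convergence, {s_N} also converges to {s} pointwise. But the {k^{th}} coordinate of {s_N} equals {2^{-k}/\|e_k\|} for all {N \geq k}, so every coordinate of {s} is non-zero, contradicting the finite support of {s}.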

One way to study a general class of mathematical objects is to embed them into a more structured class of mathematical objects; for instance, one could study manifolds by embedding them into Euclidean spaces. In these (optional) notes we study two (related) embedding theorems for topological spaces:

The 245B final can be found here.  I am not posting solutions, but readers (both students and non-students) are welcome to discuss the final questions in the comments below.

The continuation to this course, 245C, will begin on Monday, March 29.  The topics for this course are still somewhat fluid – but I tentatively plan to cover the following topics, roughly in order:

  • L^p spaces and interpolation; fractional integration
  • The Fourier transform on {\Bbb R}^n (a very quick review; this is of course covered more fully in 247A)
  • Schwartz functions, and the theory of distributions
  • Hausdorff measure
  • The spectral theorem (introduction only; the topic is covered in depth in 255A)

I am open to further suggestions for topics that would build upon the 245AB material, which would be of interest to students, and which would not overlap too substantially with other graduate courses offered at UCLA.

A key theme in real analysis is that of studying general functions {f: X \rightarrow {\bf R}} or {f: X \rightarrow {\bf C}} by first approximating them by “simpler” or “nicer” functions. But the precise class of “simple” or “nice” functions may vary from context to context. In measure theory, for instance, it is common to approximate measurable functions by indicator functions or simple functions. But in other parts of analysis, it is often more convenient to approximate rough functions by continuous or smooth functions (perhaps with compact support, or some other decay condition), or by functions in some algebraic class, such as the class of polynomials or trigonometric polynomials.

In order to approximate rough functions by more continuous ones, one of course needs tools that can generate continuous functions with some specified behaviour. The two basic tools for this are Urysohn’s lemma, which approximates indicator functions by continuous functions, and the Tietze extension theorem, which extends continuous functions on a subdomain to continuous functions on a larger domain. An important consequence of these theorems is the Riesz representation theorem for linear functionals on the space of compactly supported continuous functions, which describes such functionals in terms of Radon measures.

Sometimes, approximation by continuous functions is not enough; one must approximate continuous functions in turn by an even smoother class of functions. A useful tool in this regard is the Stone-Weierstrass theorem, that generalises the classical Weierstrass approximation theorem to more general algebras of functions.
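The classical Weierstrass approximation theorem can be made concrete with Bernstein polynomials, which give one standard constructive proof (not necessarily the route taken in these notes). The sketch below is an illustration under that assumption. The function names, the test function {|x - 1/2|} and the evaluation grid are all choices made for the example.

```python
from math import comb


def bernstein(f, n):
    """Return the degree-n Bernstein polynomial of f on [0, 1] as a callable.

    B_n f(x) = sum_{k=0}^n f(k/n) * C(n, k) * x^k * (1 - x)^(n - k).
    """
    coeffs = [f(k / n) * comb(n, k) for k in range(n + 1)]

    def poly(x):
        return sum(c * x**k * (1 - x) ** (n - k) for k, c in enumerate(coeffs))

    return poly


def sup_error(f, g, points=201):
    """Largest value of |f - g| on an evenly spaced grid in [0, 1]."""
    grid = [j / (points - 1) for j in range(points)]
    return max(abs(f(x) - g(x)) for x in grid)


def kink(x):
    """A continuous but non-differentiable test function."""
    return abs(x - 0.5)


err_10 = sup_error(kink, bernstein(kink, 10))
err_200 = sup_error(kink, bernstein(kink, 200))
print(err_10, err_200)  # the error shrinks as the degree grows, though slowly
```

The slow decay of the error near the kink reflects the fact that Bernstein polynomials trade speed of convergence for simplicity and positivity.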

As an application of this theory (and of many of the results accumulated in previous lecture notes), we will present (in an optional section) the commutative Gelfand-Naimark theorem classifying all commutative unital {C^*}-algebras.

Today I’d like to discuss (in the Tricks Wiki format) a fundamental trick in “soft” analysis, sometimes known as the “limiting argument” or “epsilon regularisation argument”.

Title: Give yourself an epsilon of room.

Quick description: You want to prove some statement S_0 about some object x_0 (which could be a number, a point, a function, a set, etc.).  To do so, pick a small \varepsilon > 0, and first prove a weaker statement S_\varepsilon (which allows for “losses” which go to zero as \varepsilon \to 0) about some perturbed object x_\varepsilon.  Then, take limits \varepsilon \to 0.  Provided that the dependency and continuity of the weaker conclusion S_\varepsilon on \varepsilon are sufficiently controlled, and x_\varepsilon is converging to x_0 in an appropriately strong sense, you will recover the original statement.

One can of course play a similar game when proving a statement S_\infty about some object X_\infty, by first proving a weaker statement S_N on some approximation X_N to X_\infty for some large parameter N, and then send N \to \infty at the end.

General discussion: Here are some typical examples of a target statement S_0, and the approximating statements S_\varepsilon that would converge to S_0:

| S_0 | S_\varepsilon |
| --- | --- |
| f(x_0) = g(x_0) | f(x_\varepsilon) = g(x_\varepsilon) + o(1) |
| f(x_0) \leq g(x_0) | f(x_\varepsilon) \leq g(x_\varepsilon) + o(1) |
| f(x_0) > 0 | f(x_\varepsilon) \geq c - o(1) for some c>0 independent of \varepsilon |
| f(x_0) is finite | f(x_\varepsilon) is bounded uniformly in \varepsilon |
| f(x_0) \geq f(x) for all x \in X (i.e. x_0 maximises f) | f(x_\varepsilon) \geq f(x)-o(1) for all x \in X (i.e. x_\varepsilon nearly maximises f) |
| f_n(x_0) converges as n \to \infty | f_n(x_\varepsilon) fluctuates by at most o(1) for sufficiently large n |
| f_0 is a measurable function | f_\varepsilon is a measurable function converging pointwise to f_0 |
| f_0 is a continuous function | f_\varepsilon is an equicontinuous family of functions converging pointwise to f_0 OR f_\varepsilon is continuous and converges (locally) uniformly to f_0 |
| The event E_0 holds almost surely | The event E_\varepsilon holds with probability 1-o(1) |
| The statement P_0(x) holds for almost every x | The statement P_\varepsilon(x) holds for x outside of a set of measure o(1) |

Of course, to justify the convergence of S_\varepsilon to S_0, it is necessary that x_\varepsilon converge to x_0 (or f_\varepsilon converge to f_0, etc.) in a suitably strong sense. (But for the purposes of proving just upper bounds, such as f(x_0) \leq M, one can often get by with quite weak forms of convergence, thanks to tools such as Fatou’s lemma or the weak closure of the unit ball.)  Similarly, we need some continuity (or at least semi-continuity) hypotheses on the functions f, g appearing above.

It is also necessary in many cases that the control S_\varepsilon on the approximating object x_\varepsilon is somehow “uniform in \varepsilon”, although for “\sigma-closed” conclusions, such as measurability, this is not required. [It is important to note that it is only the final conclusion S_\varepsilon on x_\varepsilon that needs to have this uniformity in \varepsilon; one is permitted to have some intermediate stages in the derivation of S_\varepsilon that depend on \varepsilon in a non-uniform manner, so long as these non-uniformities cancel out or otherwise disappear at the end of the argument.]

By giving oneself an epsilon of room, one can evade a lot of familiar issues in soft analysis.  For instance, by replacing “rough”, “infinite-complexity”, “continuous”,  “global”, or otherwise “infinitary” objects x_0 with “smooth”, “finite-complexity”, “discrete”, “local”, or otherwise “finitary” approximants x_\varepsilon, one can finesse most issues regarding the justification of various formal operations (e.g. exchanging limits, sums, derivatives, and integrals).  [It is important to be aware, though, that any quantitative measure of how smooth, discrete, finite, etc. x_\varepsilon is should be expected to degrade in the limit \varepsilon \to 0, and so one should take extreme caution in using such quantitative measures to derive estimates that are uniform in \varepsilon.]  Similarly, issues such as whether the supremum M := \sup \{ f(x): x \in X \} of a function on a set is actually attained by some maximiser x_0 become moot if one is willing to settle instead for an almost-maximiser x_\varepsilon, e.g. one which comes within an epsilon of that supremum M (or which is larger than 1/\varepsilon, if M turns out to be infinite).  Last, but not least, one can use the epsilon room to avoid degenerate solutions, for instance by perturbing a non-negative function to be strictly positive, perturbing a non-strictly monotone function to be strictly monotone, and so forth.
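As a toy illustration of the almost-maximiser idea, consider the function f(x) = x/(1+x) on X = [0,\infty); this specific function, and the formula for x_\varepsilon below, are hypothetical choices made for the example. Its supremum M = 1 is not attained, but an almost-maximiser within \varepsilon of M can be written down explicitly. The sketch uses exact rational arithmetic so that the identities hold without rounding.

```python
from fractions import Fraction


def f(x):
    """f(x) = x / (1 + x) on [0, infinity); its supremum 1 is never attained."""
    return x / (1 + x)


def almost_maximiser(eps):
    """Return a point x_eps with f(x_eps) = 1 - eps, for 0 < eps <= 1."""
    return (1 - eps) / eps


M = 1  # the supremum of f
for k in (1, 3, 6):
    eps = Fraction(1, 10**k)
    x_eps = almost_maximiser(eps)
    # f(x_eps) falls short of the supremum by exactly eps.
    print(eps, x_eps, M - f(x_eps))
```

Note that x_\varepsilon escapes to infinity as \varepsilon \to 0, which is a small instance of the degradation of quantitative control in the limit.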

To summarise: one can view the epsilon regularisation argument as a “loan” in which one borrows an epsilon here and there in order to be able to ignore soft analysis difficulties, and can temporarily be able to utilise estimates which are non-uniform in epsilon, but at the end of the day one needs to “pay back” the loan by establishing a final “hard analysis” estimate which is uniform in epsilon (or whose error terms decay to zero as epsilon goes to zero).

A variant: It may seem that the epsilon regularisation trick is useless if one is already in “hard analysis” situations when all objects are already “finitary”, and all formal computations easily justified.  However, there is an important variant of this trick which applies in this case: namely, instead of sending the epsilon parameter to zero, choose epsilon to be a sufficiently small (but not infinitesimally small) quantity, depending on other parameters in the problem, so that one can eventually neglect various error terms and obtain a useful bound at the end of the day.  (For instance, any result proven using the Szemerédi regularity lemma is likely to be of this type.)  Since one is not sending epsilon to zero, not every term in the final bound needs to be uniform in epsilon, though for quantitative applications one still would like the dependencies on such parameters to be as favourable as possible.
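A schematic instance of this variant is as follows. The quantities X, A, B and the shape of the bound are hypothetical, chosen only to show the mechanism. Suppose one has shown, for every \varepsilon > 0, a bound with one term that improves and one term that worsens as \varepsilon shrinks:

```latex
X \leq A \varepsilon + \frac{B}{\varepsilon} \quad \hbox{for all } \varepsilon > 0,
\qquad \hbox{where } A, B > 0.
% Balancing the two terms, i.e. taking \varepsilon := \sqrt{B/A}, gives
X \leq A \sqrt{B/A} + \frac{B}{\sqrt{B/A}} = 2 \sqrt{AB}.
```

Sending \varepsilon \to 0 here would be useless, since the second term blows up; instead one picks the value of \varepsilon (depending on the other parameters A, B) that balances the two error terms.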

Prerequisites: Graduate real analysis.  (Actually, this isn’t so much a prerequisite as it is a corequisite: the limiting argument plays a central role in many fundamental results in real analysis.)  Some examples also require some exposure to PDE.

A normed vector space {(X, \| \|_X)} automatically generates a topology, known as the norm topology or strong topology on {X}, generated by the open balls {B(x,r) := \{ y \in X: \|y-x\|_X < r \}}. A sequence {x_n} in such a space converges strongly (or converges in norm) to a limit {x} if and only if {\|x_n-x\|_X \rightarrow 0} as {n \rightarrow \infty}. This is the topology we have implicitly been using in our previous discussion of normed vector spaces.

However, in some cases it is useful to work in topologies on vector spaces that are weaker than a norm topology. One reason for this is that many important modes of convergence, such as pointwise convergence, convergence in measure, smooth convergence, or convergence on compact subsets, are not captured by a norm topology, and so it is useful to have a more general theory of topological vector spaces that contains these modes. Another reason (of particular importance in PDE) is that the norm topology on infinite-dimensional spaces is so strong that very few sets are compact or pre-compact in these topologies, making it difficult to apply compactness methods in these topologies. Instead, one often first works in a weaker topology, in which compactness is easier to establish, and then somehow upgrades any weakly convergent sequences obtained via compactness to stronger modes of convergence (or alternatively, one abandons strong convergence and exploits the weak convergence directly). Two basic weak topologies for this purpose are the weak topology on a normed vector space {X}, and the weak* topology on a dual vector space {X^*}. Compactness in the latter topology is usually obtained from the Banach-Alaoglu theorem (and its sequential counterpart), which will be a quick consequence of the Tychonoff theorem (and its sequential counterpart) from the previous lecture.
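To get a feel for the gap between strong and weak convergence, one can look at the standard basis vectors {e_n} of {\ell^2({\bf N})}. This is a standard example, sketched here under the usual definition that {x_n} converges weakly to {x} when {\lambda(x_n) \rightarrow \lambda(x)} for every bounded linear functional {\lambda}. The code below only tests one such functional (pairing with the square-summable sequence {y_k = 1/k}), so it is an illustration rather than a proof; the sparse dictionary representation and all function names are choices made for the example.

```python
import math


def basis(n):
    """Standard basis vector e_n, stored sparsely as {index: value}."""
    return {n: 1.0}


def sub(u, v):
    """Difference u - v of two finitely supported sequences."""
    out = dict(u)
    for k, c in v.items():
        out[k] = out.get(k, 0.0) - c
    return out


def norm(u):
    """The l^2 norm of a finitely supported sequence."""
    return math.sqrt(sum(c * c for c in u.values()))


def pair(u, y):
    """Pairing sum_k u_k * y(k) of a finitely supported u with a sequence y."""
    return sum(c * y(k) for k, c in u.items())


def y(k):
    """A fixed square-summable sequence, y_k = 1/k."""
    return 1.0 / k


# Distinct basis vectors stay at distance sqrt(2), so (e_n) is not Cauchy in norm...
print(norm(sub(basis(5), basis(9))), norm(sub(basis(50), basis(90))))
# ...but the pairings <e_n, y> = 1/n tend to zero.
print([pair(basis(n), y) for n in (1, 10, 100, 1000)])
```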

The strong and weak topologies on normed vector spaces also have analogues for the space {B(X \rightarrow Y)} of bounded linear operators from {X} to {Y}, thus supplementing the operator norm topology on that space with two weaker topologies, which (somewhat confusingly) are named the strong operator topology and the weak operator topology.

One of the most useful concepts for analysis that arise from topology and metric spaces is the concept of compactness; recall that a space {X} is compact if every open cover of {X} has a finite subcover, or equivalently if any collection of closed sets with the finite intersection property (i.e. every finite subcollection of these sets has non-empty intersection) has non-empty intersection. In these notes, we explore how compactness interacts with other key topological concepts: the Hausdorff property, bases and sub-bases, product spaces, and equicontinuity, in particular establishing the useful Tychonoff and Arzelà-Ascoli theorems that give criteria for compactness (or precompactness).

Exercise 1 (Basic properties of compact sets)

  • Show that any finite set is compact.
  • Show that any finite union of compact subsets of a topological space is still compact.
  • Show that any image of a compact space under a continuous map is still compact.

Show that these three statements continue to hold if “compact” is replaced by “sequentially compact”.

The notion of what it means for a subset E of a space X to be “small” varies from context to context.  For instance, in measure theory, when X = (X, {\mathcal X}, \mu) is a measure space, one useful notion of a “small” set is that of a null set: a set E of measure zero (or at least contained in a set of measure zero).  By countable additivity, countable unions of null sets are null.  Taking contrapositives, we obtain

Lemma 1. (Pigeonhole principle for measure spaces) Let E_1, E_2, \ldots be an at most countable sequence of measurable subsets of a measure space X.  If \bigcup_n E_n has positive measure, then at least one of the E_n has positive measure.
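
For the reader who wants the one-line computation behind Lemma 1, here is a sketch using countable subadditivity of \mu (which follows from countable additivity together with monotonicity):

```latex
% If \bigcup_n E_n has positive measure, then
\[
  0 \;<\; \mu\Big( \bigcup_n E_n \Big) \;\leq\; \sum_n \mu(E_n),
\]
% so the sum on the right cannot vanish, and hence \mu(E_n) > 0 for at least one n.
```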

Now suppose that X was a Euclidean space {\Bbb R}^d with Lebesgue measure m.  The Lebesgue differentiation theorem easily implies that having positive measure is equivalent to being “dense” in certain balls:

Proposition 1. Let E be a measurable subset of {\Bbb R}^d.  Then the following are equivalent:

  1. E has positive measure.
  2. For any \varepsilon > 0, there exists a ball B such that m( E \cap B ) \geq (1-\varepsilon) m(B).

Thus one can think of a null set as a set which is “nowhere dense” in some measure-theoretic sense.
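
A brief sketch of why Proposition 1 holds may be helpful here; it assumes the Lebesgue differentiation theorem in the form applied to the indicator function of E, and writes B(x,r) for the ball of radius r centred at x (notation introduced only for this sketch).

```latex
% (2) => (1): apply (2) with \varepsilon = 1/2 to get a ball B with
\[
  m(E) \;\geq\; m(E \cap B) \;\geq\; \tfrac{1}{2}\, m(B) \;>\; 0 .
\]
% (1) => (2): by the Lebesgue differentiation theorem, for almost every x in E,
\[
  \lim_{r \to 0} \frac{m(E \cap B(x,r))}{m(B(x,r))} \;=\; 1 .
\]
% Since m(E) > 0, at least one such x exists; taking r small enough gives
% m(E \cap B(x,r)) \geq (1 - \varepsilon)\, m(B(x,r)).
```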

It turns out that there are analogues of these results when the measure space X = (X, {\mathcal X}, \mu)  is replaced instead by a complete metric space X = (X,d).  Here, the appropriate notion of a “small” set is not a null set, but rather that of a nowhere dense set: a set E which is not dense in any ball, or equivalently a set whose closure has empty interior.  (A good example of a nowhere dense set would be a proper subspace, or smooth submanifold, of {\Bbb R}^d, or a Cantor set; on the other hand, the rationals are a dense subset of {\Bbb R} and thus clearly not nowhere dense.)   We then have the following important result:

Theorem 1. (Baire category theorem). Let E_1, E_2, \ldots be an at most countable sequence of subsets of a complete metric space X.  If \bigcup_n E_n contains a ball B, then at least one of the E_n is dense in a sub-ball B' of B (and in particular is not nowhere dense).  To put it in the contrapositive: the countable union of nowhere dense sets cannot contain a ball.

Exercise 1. Show that the Baire category theorem is equivalent to the claim that in a complete metric space, the countable intersection of open dense sets remains dense.  \diamond

Exercise 2. Using the Baire category theorem, show that any non-empty complete metric space without isolated points is uncountable.  (In particular, this shows that the Baire category theorem can fail for incomplete metric spaces such as the rationals {\Bbb Q}.)  \diamond

To quickly illustrate an application of the Baire category theorem, observe that it implies that one cannot cover a finite-dimensional real or complex vector space {\Bbb R}^n, {\Bbb C}^n by a countable number of proper subspaces.  One can of course also establish this fact by using Lebesgue measure on this space.  However, the advantage of the Baire category approach is that it also works well in infinite-dimensional complete normed vector spaces, i.e. Banach spaces, whereas the measure-theoretic approach runs into significant difficulties in infinite dimensions.  This leads to three fundamental equivalences between the qualitative theory of continuous linear operators on Banach spaces (e.g. finiteness, surjectivity, etc.) and the quantitative theory (i.e. estimates):

  1. The uniform boundedness principle, that equates the qualitative boundedness (or convergence) of a family of continuous operators with their quantitative boundedness.
  2. The open mapping theorem, that equates the qualitative solvability of a linear problem Lu = f with the quantitative solvability.
  3. The closed graph theorem, that equates the qualitative regularity of a (weakly continuous) operator T with the quantitative regularity of that operator.
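
To indicate what "qualitative" and "quantitative" mean in each of the three cases, here is a sketch of the standard textbook formulations. These are assumptions of this sketch, and the precise statements in the full notes may be phrased differently; X and Y denote Banach spaces, all operators are linear, and C denotes a finite constant.

```latex
\begin{enumerate}
  \item (Uniform boundedness) For a family $(T_\alpha)_{\alpha \in A}$
        of continuous operators from $X$ to $Y$,
        \[
          \sup_{\alpha \in A} \|T_\alpha x\|_Y < \infty \ \text{ for every } x \in X
          \quad \Longleftrightarrow \quad
          \sup_{\alpha \in A} \|T_\alpha\|_{op} < \infty .
        \]
  \item (Open mapping) For a continuous operator $L: X \to Y$,
        \[
          L \text{ is surjective}
          \quad \Longleftrightarrow \quad
          \exists\, C : \ \forall f \in Y \ \exists\, u \in X \text{ with }
          Lu = f \text{ and } \|u\|_X \leq C \|f\|_Y .
        \]
  \item (Closed graph) For a linear operator $T: X \to Y$,
        \[
          \text{the graph of } T \text{ is closed in } X \times Y
          \quad \Longleftrightarrow \quad
          \exists\, C : \ \|Tx\|_Y \leq C \|x\|_X \ \text{ for all } x \in X .
        \]
\end{enumerate}
```

In each case the right-to-left implication is elementary, and it is the left-to-right implication (from a qualitative hypothesis to a quantitative bound) that relies on the Baire category theorem.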

Strictly speaking, these theorems are not used much directly in practice, because one usually works in the reverse direction (i.e. first proving quantitative bounds, and then deriving qualitative corollaries); but the above three theorems help explain why we usually approach qualitative problems in functional analysis via their quantitative counterparts.

To progress further in our study of function spaces, we will need to develop the standard theory of metric spaces, and the closely related theory of topological spaces (i.e. point-set topology).  I will be assuming that students in my class will already have encountered these concepts in an undergraduate topology or real analysis course, but for the sake of completeness I will briefly review the basics of both types of spaces here.
