The notion of what it means for a subset E of a space X to be “small” varies from context to context. For instance, in measure theory, when $X = (X, \mathcal{X}, \mu)$ is a measure space, one useful notion of a “small” set is that of a null set: a set E of measure zero (or at least contained in a set of measure zero). By countable additivity, countable unions of null sets are null. Taking contrapositives, we obtain
Lemma 1. (Pigeonhole principle for measure spaces) Let $E_1, E_2, \ldots$ be an at most countable sequence of measurable subsets of a measure space X. If $\bigcup_n E_n$ has positive measure, then at least one of the $E_n$ has positive measure.
Now suppose that X was a Euclidean space $\mathbb{R}^d$ with Lebesgue measure m. The Lebesgue differentiation theorem easily implies that having positive measure is equivalent to being “dense” in certain balls:
Proposition 1. Let E be a measurable subset of $\mathbb{R}^d$. Then the following are equivalent:
- E has positive measure.
- For any $\varepsilon > 0$, there exists a ball B such that $m(E \cap B) \geq (1-\varepsilon) m(B)$.
Thus one can think of a null set as a set which is “nowhere dense” in some measure-theoretic sense.
It turns out that there are analogues of these results when the measure space $X = (X, \mathcal{X}, \mu)$ is replaced instead by a complete metric space $X = (X,d)$. Here, the appropriate notion of a “small” set is not a null set, but rather that of a nowhere dense set: a set E which is not dense in any ball, or equivalently a set whose closure has empty interior. (A good example of a nowhere dense set would be a proper subspace, or smooth submanifold, of $\mathbb{R}^d$, or a Cantor set; on the other hand, the rationals are a dense subset of $\mathbb{R}$ and thus clearly not nowhere dense.) We then have the following important result:
Theorem 1. (Baire category theorem). Let $E_1, E_2, \ldots$ be an at most countable sequence of subsets of a complete metric space X. If $\bigcup_n E_n$ contains a ball B, then at least one of the $E_n$ is dense in a sub-ball B’ of B (and in particular is not nowhere dense). To put it in the contrapositive: the countable union of nowhere dense sets cannot contain a ball.
Exercise 1. Show that the Baire category theorem is equivalent to the claim that in a complete metric space, the countable intersection of open dense sets remains dense.
Exercise 2. Using the Baire category theorem, show that any non-empty complete metric space without isolated points is uncountable. (In particular, this shows that the Baire category theorem can fail for incomplete metric spaces such as the rationals $\mathbb{Q}$.)
To quickly illustrate an application of the Baire category theorem, observe that it implies that one cannot cover a finite-dimensional real or complex vector space $\mathbb{R}^n, \mathbb{C}^n$ by a countable number of proper subspaces. One can of course also establish this fact by using Lebesgue measure on this space. However, the advantage of the Baire category approach is that it also works well in infinite dimensional complete normed vector spaces, i.e. Banach spaces, whereas the measure-theoretic approach runs into significant difficulties in infinite dimensions. This leads to three fundamental equivalences between the qualitative theory of continuous linear operators on Banach spaces (e.g. finiteness, surjectivity, etc.) and the quantitative theory (i.e. estimates):
- The uniform boundedness principle, that equates the qualitative boundedness (or convergence) of a family of continuous operators with their quantitative boundedness.
- The open mapping theorem, that equates the qualitative solvability of a linear problem Lu = f with the quantitative solvability.
- The closed graph theorem, that equates the qualitative regularity of a (weakly continuous) operator T with the quantitative regularity of that operator.
Strictly speaking, these theorems are not used much directly in practice, because one usually works in the reverse direction (i.e. first proving quantitative bounds, and then deriving qualitative corollaries); but the above three theorems help explain why we usually approach qualitative problems in functional analysis via their quantitative counterparts.
— Proof of Baire category theorem —
Assume that the Baire category theorem failed; then it would be possible to cover a ball $B(x_0,r_0)$ in a complete metric space by a countable family $E_1, E_2, \ldots$ of nowhere dense sets.
We now invoke the following easy observation: if E is nowhere dense, then every ball B contains a subball B’ which is disjoint from E. Indeed, this follows immediately from the definition of a nowhere dense set.
Invoking this observation, we can find a ball $B(x_1,r_1)$ in $B(x_0,r_0/2)$ (say) which is disjoint from $E_1$; we may also assume that $r_1 \leq r_0/10$ by shrinking $r_1$ as necessary. Then, inside $B(x_1,r_1/2)$, we can find a ball $B(x_2,r_2)$ which is also disjoint from $E_2$, with $r_2 \leq r_1/10$. Continuing this process, we end up with a nested sequence of balls $B(x_n,r_n)$, each of which is disjoint from $E_n$, and such that $B(x_n,r_n) \subset B(x_{n-1},r_{n-1}/2)$ and $r_n \leq r_{n-1}/10$ for all $n = 1,2,\ldots$.
From the triangle inequality we have $d(x_n,x_m) \leq \sum_{k=m}^{n-1} r_k/2 \leq r_m \leq 10^{-m} r_0$ for all $n > m \geq 0$, and so the sequence $x_n$ is a Cauchy sequence. As X is complete, $x_n$ converges to a limit x. Summing the geometric series, one verifies that $x \in B(x_n,r_n)$ for all $n = 0,1,2,\ldots$, and in particular x is an element of $B(x_0,r_0)$ which avoids all of $E_1, E_2, E_3, \ldots$, a contradiction.
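To spell out the geometric series step, here is a short worked computation. It is only a sketch, and it assumes the notation that the balls in the construction are $B(x_n,r_n)$ with $B(x_n,r_n) \subset B(x_{n-1},r_{n-1}/2)$ and $r_n \leq r_{n-1}/10$; the constant $5/9$ is simply what these assumed choices produce.

```latex
% Assumed notation: x_k, r_k are the centres and radii of the nested balls,
% so that d(x_{k+1}, x_k) < r_k / 2 and r_k <= 10^{-(k-n)} r_n for k >= n.
\begin{align*}
d(x, x_n) \leq \sum_{k=n}^{\infty} d(x_{k+1}, x_k)
  \leq \sum_{k=n}^{\infty} \frac{r_k}{2}
  \leq \frac{r_n}{2} \sum_{j=0}^{\infty} 10^{-j}
  = \frac{5}{9} r_n < r_n .
\end{align*}
```

Thus x lies in every $B(x_n,r_n)$, and since $B(x_n,r_n)$ is disjoint from $E_n$, the point x lies in none of the $E_n$.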
We can illustrate the analogy between the Baire category theorem and the measure-theoretic analogs by introducing some further definitions. Call a set E meager or of the first category if it can be expressed as (or covered by) a countable union of nowhere dense sets, and of the second category if it is not meager. Thus, the Baire category theorem shows that any subset of a complete metric space with non-empty interior is of the second category, which may help explain the name of the theorem. Call a set co-meager or residual if its complement is meager, and call a set Baire or almost open if it differs from an open set by a meager set (note that a Baire set is unrelated to the Baire $\sigma$-algebra). Then we have the following analogy between complete metric space topology, and measure theory:
| Complete non-empty metric space X | Measure space X of positive measure |
| --- | --- |
| first category (meager) | zero measure (null) |
| second category | positive measure |
| residual (co-meager) | full measure (co-null) |
Nowhere dense sets are meager, and meager sets have empty interior. Contrapositively, sets with non-empty interior are of the second category, and sets of the second category are somewhere dense. Taking complements instead of contrapositives, we see that sets with dense interior (in particular, open dense sets) are co-meager, and co-meager sets are dense.
While there are certainly many analogies between meager sets and null sets (for instance, both classes are closed under countable unions, or under intersections with arbitrary sets), the two concepts can differ in practice. For instance, in the real line $\mathbb{R}$ with the standard metric and measure space structures, the set
$$\bigcup_{n=1}^\infty (q_n - 2^{-n}, q_n + 2^{-n}),$$
where $q_1, q_2, \ldots$ is an enumeration of the rationals, is open and dense, but has Lebesgue measure at most 2; thus its complement has infinite measure in $\mathbb{R}$ but is nowhere dense (hence meager). As a variant of this, the set
$$\bigcap_{m=1}^\infty \bigcup_{n=1}^\infty \left(q_n - \frac{2^{-n}}{m}, q_n + \frac{2^{-n}}{m}\right)$$
is a null set, but is the intersection of countably many open dense sets and is thus co-meager.
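The two measure bounds can be checked directly by countable subadditivity. The following is a sketch under the assumption that the sets are built from intervals of radius $2^{-n}/k$ around the n-th rational $q_n$; the name $U_k$ is introduced here only for illustration.

```latex
% Assumed notation: q_1, q_2, ... enumerates the rationals, m is Lebesgue measure,
% and U_k := \bigcup_n (q_n - 2^{-n}/k, q_n + 2^{-n}/k), so U_1 is the first set above.
\begin{align*}
m(U_k) &\leq \sum_{n=1}^{\infty} m\left( \left( q_n - \frac{2^{-n}}{k},\, q_n + \frac{2^{-n}}{k} \right) \right)
  = \sum_{n=1}^{\infty} \frac{2 \cdot 2^{-n}}{k} = \frac{2}{k}, \\
m\Big( \bigcap_{k=1}^{\infty} U_k \Big) &\leq \inf_{k \geq 1} m(U_k) \leq \inf_{k \geq 1} \frac{2}{k} = 0 .
\end{align*}
```

Each $U_k$ is open and contains every rational, hence is open and dense, which is what makes the intersection co-meager despite being null.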
Exercise 3. A real number x is Diophantine if for every $\varepsilon > 0$ there exists $c_\varepsilon > 0$ such that $|x - \frac{a}{q}| \geq \frac{c_\varepsilon}{|q|^{2+\varepsilon}}$ for every rational number $\frac{a}{q}$. Show that the set of Diophantine real numbers has full measure but is meager.
Remark 1. If one assumes some additional axioms of set theory (e.g. the continuum hypothesis), it is possible to show that the collection of meager subsets of $\mathbb{R}^d$ and the collection of null subsets of $\mathbb{R}^d$ (viewed as $\sigma$-ideals of the collection of all subsets of $\mathbb{R}^d$) are isomorphic; this is the Sierpinski-Erdős theorem, which we will not prove here. Roughly speaking, this theorem tells us that any “effective” first-order statement which is true about meager sets will also be true about null sets, and conversely.
— The uniform boundedness principle —
As mentioned in the introduction, the Baire category theorem implies various equivalences between qualitative and quantitative properties of linear transformations between Banach spaces. (Lemma 1 of Notes 3 already gives a prototypical such equivalence between a qualitative property (continuity) and a quantitative one (boundedness).)
Theorem 2. (Uniform boundedness principle) Let X be a Banach space, let Y be a normed vector space, and let $(T_\alpha)_{\alpha \in A}$ be a family of continuous linear operators $T_\alpha: X \to Y$. Then the following are equivalent:
1. (Pointwise boundedness) For every $x \in X$, the set $\{T_\alpha x: \alpha \in A\}$ is bounded.
2. (Uniform boundedness) The operator norms $\{\|T_\alpha\|_{op}: \alpha \in A\}$ are bounded.
The uniform boundedness principle is also known as the Banach-Steinhaus theorem.
Proof. It is clear that 2. implies 1.; now assume 1 holds and let us obtain 2.
For each $n = 1,2,\ldots$, let $E_n \subset X$ be the set
$$E_n := \{ x \in X: \|T_\alpha x\|_Y \leq n \hbox{ for all } \alpha \in A \}.$$
The hypothesis 1 is nothing more than the assertion that the $E_n$ cover X, and thus by the Baire category theorem one of the $E_n$ must be dense in a ball. Since the $T_\alpha$ are continuous, the $E_n$ are closed, and so one of the $E_n$ contains a ball. Since $E_n - E_n \subset E_{2n}$, we see that one of the $E_n$ contains a ball centred at the origin. Dilating n as necessary, we see that one of the $E_n$ contains the unit ball $B(0,1)$. But then all the $\|T_\alpha\|_{op}$ are bounded by n, and the claim follows.
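The last few steps of this proof compress some routine arithmetic. A sketch of it is given below, assuming that $E_n$ denotes the set of x with $\|T_\alpha x\|_Y \leq n$ for all $\alpha$, and that $B(x_0,r)$ is a ball (with hypothetical centre $x_0$ and radius r) contained in $E_n$.

```latex
% Assumed notation: E_n = \{x : \|T_\alpha x\|_Y \leq n \text{ for all } \alpha\},
% and B(x_0, r) \subset E_n is the ball supplied by the Baire category theorem.
\begin{align*}
x, x' \in E_n &\implies \|T_\alpha (x - x')\|_Y \leq \|T_\alpha x\|_Y + \|T_\alpha x'\|_Y \leq 2n, \\
B(0,r) = B(x_0,r) - x_0 &\subset E_n - E_n \subset E_{2n}, \\
\|x\|_X < 1 &\implies \|T_\alpha x\|_Y = \frac{1}{r} \|T_\alpha (r x)\|_Y \leq \frac{2n}{r}, \\
\|T_\alpha\|_{op} &\leq \frac{2n}{r} \quad \text{for all } \alpha \in A .
\end{align*}
```

Note where each hypothesis enters: linearity is used in the first and third lines, continuity in the closedness of $E_n$, and completeness only through the Baire category theorem.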
Exercise 4. Give counterexamples to show that the uniform boundedness principle fails if one relaxes the assumptions in any of the following ways:
- X is merely a normed vector space rather than a Banach space (i.e. completeness is dropped).
- The $T_\alpha$ are not assumed to be continuous.
- The $T_\alpha$ are allowed to be nonlinear rather than linear.
Thus completeness, continuity, and linearity are all essential for the uniform boundedness principle to apply.
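Remark 2. It is instructive to establish the uniform boundedness principle more “constructively” without the Baire category theorem (though the proof of the Baire category theorem is still implicitly present), as follows. Suppose that 2 fails, then $\|T_\alpha\|_{op}$ is unbounded. We can then find a sequence $\alpha_n \in A$ such that $\|T_{\alpha_{n+1}}\|_{op} > 100^n \|T_{\alpha_n}\|_{op}$ (say) for all n. We can then find unit vectors $x_n$ such that $\|T_{\alpha_n} x_n\|_Y \geq \frac{1}{2} \|T_{\alpha_n}\|_{op}$.
We can then form the absolutely convergent (and hence conditionally convergent, by completeness) sum $x = \sum_{n=1}^\infty \epsilon_n 10^{-n} x_n$ for some choice of signs $\epsilon_n = \pm 1$ recursively as follows: once $\epsilon_1, \ldots, \epsilon_{n-1}$ have been chosen, choose the sign $\epsilon_n$ so that
$$\Big\| \sum_{m=1}^n \epsilon_m 10^{-m} T_{\alpha_n} x_m \Big\|_Y \geq \| 10^{-n} T_{\alpha_n} x_n \|_Y \geq \frac{1}{2} 10^{-n} \|T_{\alpha_n}\|_{op}.$$
From the triangle inequality we soon conclude that
$$\|T_{\alpha_n} x\|_Y \geq \frac{1}{4} 10^{-n} \|T_{\alpha_n}\|_{op}.$$
But by hypothesis, the RHS is unbounded in n, contradicting 1.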
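For the reader who wants the triangle inequality step written out, here is a sketch. The normalisations (weights $10^{-m}$, unit vectors $x_m$, and the factor $\frac{1}{2}$ in the sign selection) are assumptions made for this illustration; any sufficiently rapidly decaying weights would serve equally well.

```latex
% Assumed normalisations: x = \sum_m \epsilon_m 10^{-m} x_m with \|x_m\|_X = 1, and the
% signs chosen so that the partial sum up to n has norm >= (1/2) 10^{-n} \|T_{\alpha_n}\|_{op}.
\begin{align*}
\|T_{\alpha_n} x\|_Y
 &\geq \Big\| \sum_{m=1}^{n} \epsilon_m 10^{-m} T_{\alpha_n} x_m \Big\|_Y
   - \sum_{m=n+1}^{\infty} 10^{-m} \|T_{\alpha_n} x_m\|_Y \\
 &\geq \frac{1}{2} 10^{-n} \|T_{\alpha_n}\|_{op} - \frac{1}{9} 10^{-n} \|T_{\alpha_n}\|_{op}
 \geq \frac{1}{4} 10^{-n} \|T_{\alpha_n}\|_{op} .
\end{align*}
```

The sign $\epsilon_n$ can always be chosen as required because $\|a+b\|_Y + \|a-b\|_Y \geq 2\|b\|_Y$ for any vectors a, b, so at least one of $a \pm b$ has norm at least $\|b\|_Y$.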
A common way to apply the uniform boundedness principle is via the following corollary:
Corollary 1. (Uniform boundedness principle for norm convergence) Let X and Y be Banach spaces, and let $(T_n)_{n=1}^\infty$ be a family of continuous linear operators $T_n: X \to Y$. Then the following are equivalent:
1. (Pointwise convergence) For every $x \in X$, $T_n x$ converges strongly in Y as $n \to \infty$.
2. (Pointwise convergence to a continuous limit) There exists a continuous linear $T: X \to Y$ such that for every $x \in X$, $T_n x$ converges strongly in Y to $Tx$ as $n \to \infty$.
3. (Uniform boundedness + dense subclass convergence) The operator norms $\{\|T_n\|_{op}: n = 1,2,\ldots\}$ are bounded, and for a dense set of x in X, $T_n x$ converges strongly in Y as $n \to \infty$.
Proof. Clearly 2. implies 1., and as convergent sequences are bounded, we see from Theorem 2 that 1. implies 3. The implication of 2 from 3 follows by a standard limiting argument and is left as an exercise.
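One possible version of the limiting argument is sketched below. The names M, D and x' are introduced here for illustration: M is the assumed uniform bound on the operator norms, D is the dense set on which convergence is known, and x' is an element of D within $\varepsilon$ of a given x.

```latex
% Assumed notation: M := \sup_n \|T_n\|_{op} < \infty, D is the dense set of convergence,
% and x' \in D is chosen with \|x - x'\|_X \leq \varepsilon.
\begin{align*}
\|T_n x - T_m x\|_Y
 &\leq \|T_n (x - x')\|_Y + \|T_n x' - T_m x'\|_Y + \|T_m (x' - x)\|_Y \\
 &\leq 2 M \varepsilon + \|T_n x' - T_m x'\|_Y, \\
\limsup_{n,m \to \infty} \|T_n x - T_m x\|_Y &\leq 2 M \varepsilon .
\end{align*}
```

Since $\varepsilon$ is arbitrary, $T_n x$ is a Cauchy sequence and so converges in the Banach space Y to a limit which we call Tx; the map T is then linear and obeys $\|Tx\|_Y \leq M \|x\|_X$, hence is continuous.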
Remark 3. The same equivalences hold if one replaces the sequence $(T_n)_{n=1}^\infty$ by a net $(T_\alpha)_{\alpha \in A}$.
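Example 1 (Fourier inversion formula). For any $f \in L^2(\mathbb{R})$ and N > 0, define the Dirichlet summation operator
$$S_N f(x) := \int_{-N}^N \hat f(\xi) e^{2\pi i x \xi}\ d\xi$$
where $\hat f$ is the Fourier transform of f, defined on smooth compactly supported functions $f \in C^\infty_0(\mathbb{R})$ by the formula $\hat f(\xi) := \int_{-\infty}^\infty f(x) e^{-2\pi i x \xi}\ dx$ and then extended to $L^2(\mathbb{R})$ by the Plancherel theorem. Using the Plancherel identity, we can verify that the operator norms $\|S_N\|_{op}$ are uniformly bounded (indeed, they are all 1); also, one can check that for $f \in C^\infty_0(\mathbb{R})$, $S_N f$ converges in $L^2$ norm to f as $N \to \infty$. As $C^\infty_0(\mathbb{R})$ is known to be dense in $L^2(\mathbb{R})$, this implies that $S_N f$ converges in $L^2$ norm to f for every $f \in L^2(\mathbb{R})$.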
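The norm bound can be read off from Plancherel in one line. The computation below is a sketch that assumes the normalisation $\hat f(\xi) = \int f(x) e^{-2\pi i x \xi}\ dx$, under which the Fourier transform is an isometry on $L^2(\mathbb{R})$.

```latex
% Assumed convention: \hat f(\xi) = \int f(x) e^{-2\pi i x \xi} dx, so that
% Plancherel reads \|\hat f\|_{L^2} = \|f\|_{L^2}.
\begin{align*}
\widehat{S_N f} &= 1_{[-N,N]} \hat f, \\
\|S_N f\|_{L^2(\mathbb{R})} &= \|1_{[-N,N]} \hat f\|_{L^2(\mathbb{R})}
  \leq \|\hat f\|_{L^2(\mathbb{R})} = \|f\|_{L^2(\mathbb{R})}, \\
\|S_N f - f\|_{L^2(\mathbb{R})}^2 &= \int_{|\xi| > N} |\hat f(\xi)|^2\ d\xi .
\end{align*}
```

Equality holds in the second line whenever $\hat f$ is supported in $[-N,N]$, which is why the operator norms are exactly 1.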
This argument only used the “easy” implication of Corollary 1, namely the deduction of 2. from 3. The “hard” implication using the Baire category theorem was not directly utilised. However, from a metamathematical standpoint, that implication is important because it tells us that the above strategy to prove convergence in norm of the Fourier inversion formula on $L^2(\mathbb{R})$ (i.e. to obtain uniform operator norm bounds on the partial sums, and to establish convergence on a dense subclass of “nice” functions) is in some sense the only strategy available to prove such a result.
Remark 4. There is a partial analogue of Corollary 1 for the question of pointwise almost everywhere convergence rather than norm convergence, known as Stein’s maximal principle (discussed for instance in this previous blog post of mine). For instance, it reduces Carleson’s theorem on the pointwise almost everywhere convergence of Fourier series to the boundedness of a certain maximal function (the Carleson maximal operator) related to Fourier summation, although the latter task is again quite non-trivial. (As in Example 1, the role of the maximal principle is meta-mathematical rather than direct.)
Of course, if we omit some of the hypotheses, it is no longer true that pointwise boundedness and uniform boundedness are the same. For instance, if we let $X = c_c(\mathbb{N})$ be the space of complex sequences with only finitely many non-zero entries and with the uniform topology, and let $\lambda_n: c_c(\mathbb{N}) \to \mathbb{C}$ be the map $(a_m)_{m=1}^\infty \mapsto n a_n$, then the $\lambda_n$ are pointwise bounded but not uniformly bounded; thus completeness of X is important. Also, even in the one-dimensional case $X = Y = \mathbb{R}$, the uniform boundedness principle can easily be seen to fail if the $T_\alpha$ are non-linear transformations rather than linear ones.
— The open mapping theorem —
A map $f: X \to Y$ between topological spaces X and Y is said to be open if it maps open sets to open sets. This is similar to, but slightly different from, the more familiar property of being continuous, which is equivalent to the inverse image of open sets being open. For instance, the map $f: \mathbb{R} \to \mathbb{R}$ defined by $f(x) := x^2$ is continuous but not open; conversely, the function $g: \mathbb{R}^2 \to \mathbb{R}$ defined by $g(x,y) := \hbox{sgn}(y) + x$ is discontinuous but open.
We have seen that it is quite possible for non-linear continuous maps to fail to be open. But for linear maps between Banach spaces, the situation is much better:
Theorem 3. (Open mapping theorem) Let $L: X \to Y$ be a continuous linear transformation between two Banach spaces X and Y. Then the following are equivalent:
1. L is surjective.
2. L is open.
3. (Qualitative solvability) For every $f \in Y$ there exists a solution $u \in X$ to the equation $Lu = f$.
4. (Quantitative solvability) There exists a constant $C > 0$ such that for every $f \in Y$ there exists a solution $u \in X$ to the equation $Lu = f$, which obeys the bound $\|u\|_X \leq C \|f\|_Y$.
5. (Quantitative solvability for a dense subclass) There exists a constant $C > 0$ such that for a dense set of f in Y, there exists a solution $u \in X$ to the equation $Lu = f$, which obeys the bound $\|u\|_X \leq C \|f\|_Y$.
Proof. Clearly 4. implies 3., which is equivalent to 1., and it is easy to see from linearity that 2. and 4. are equivalent (cf. the proof of Lemma 1 from Notes 3). 4. trivially implies 5., while to obtain 4. from 5., observe that if E is any dense subset of the Banach space Y, then any f in Y can be expressed as an absolutely convergent series $f = \sum_{n=1}^\infty f_n$ of elements $f_n$ in E (since one can iteratively approximate the residual $f - \sum_{n=1}^{N-1} f_n$ to arbitrary accuracy by an element of E for $N = 1,2,3,\ldots$), and the claim easily follows. So it suffices to show that 3. implies 4.
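For each n, let $E_n \subset Y$ be the set of all $f \in Y$ for which there exists a solution u to Lu=f with $\|u\|_X \leq n \|f\|_Y$. From the hypothesis 3, we see that $\bigcup_n E_n = Y$. Since Y is complete, the Baire category theorem implies that there is some $E_n$ which is dense in some ball $B(f_0,r)$ in Y. In other words, the problem Lu=f is approximately quantitatively solvable in the ball $B(f_0,r)$ in the sense that
- For every $\varepsilon > 0$ and every $f \in B(f_0,r)$, there exists an approximate solution u with $\|Lu - f\|_Y \leq \varepsilon$ and $\|u\|_X \leq n \|Lu\|_Y$, and thus $\|u\|_X \leq n (\|f_0\|_Y + r + \varepsilon)$.
By subtracting two such approximate solutions, we conclude that
- For any $f \in B(0,r)$ and any $\varepsilon > 0$, there exists $u \in X$ with $\|Lu - f\|_Y \leq 2\varepsilon$ and $\|u\|_X \leq 2n (\|f_0\|_Y + r + \varepsilon)$.
Since L is homogeneous, we can rescale and conclude that
- For any $f \in Y$ and any $\varepsilon > 0$ there exists $u \in X$ with $\|Lu - f\|_Y \leq \varepsilon$ and $\|u\|_X \leq C (\|f\|_Y + \varepsilon)$, where the constant C depends only on n, r and $\|f_0\|_Y$.
In particular, setting $\varepsilon := \frac{1}{2} \|f\|_Y$ (treating the case f=0 separately), we conclude that
- For any $f \in Y$, we may write $f = Lu + f'$, where $\|f'\|_Y \leq \frac{1}{2} \|f\|_Y$ and $\|u\|_X \leq 2C \|f\|_Y$.
We can iterate this procedure and then take limits (now using the completeness of X rather than Y) to obtain a solution to Lu=f for every $f \in Y$ with $\|u\|_X \leq 4C \|f\|_Y$, and the claim follows.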
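The iteration in the final step can be written out as follows. This is a sketch only: $C_0$ stands for whatever constant appears in the bound on u in the last bullet point, and the names $g_k$, $u_k$ are introduced here for the successive residuals and corrections.

```latex
% Assumed notation: the last bullet point gives g = Lu + g' with \|g'\|_Y \leq \|g\|_Y / 2
% and \|u\|_X \leq C_0 \|g\|_Y. Set g_0 := f and apply it repeatedly: g_k = L u_k + g_{k+1}.
\begin{align*}
\|g_k\|_Y &\leq 2^{-k} \|f\|_Y, \qquad \|u_k\|_X \leq C_0 \|g_k\|_Y \leq C_0 2^{-k} \|f\|_Y, \\
u &:= \sum_{k=0}^{\infty} u_k, \qquad \|u\|_X \leq \sum_{k=0}^{\infty} C_0 2^{-k} \|f\|_Y = 2 C_0 \|f\|_Y, \\
L\Big( \sum_{k=0}^{K-1} u_k \Big) &= \sum_{k=0}^{K-1} (g_k - g_{k+1}) = f - g_K \to f \quad \text{as } K \to \infty .
\end{align*}
```

The series defining u converges because it is absolutely convergent and X is complete; continuity of L then gives Lu = f.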
Remark 5. The open mapping theorem provides metamathematical justification for the method of a priori estimates for solving linear equations such as $Lu = f$ for a given datum $f \in Y$ and for an unknown $u \in X$, which is of course a familiar problem in linear PDE. The a priori method assumes that f is in some dense class of nice functions (e.g. smooth functions) in which solvability of Lu=f is presumably easy, and then proceeds to obtain the a priori estimate $\|u\|_X \leq C \|f\|_Y$ for some constant C. Theorem 3 then assures that Lu=f is solvable for all f in Y (with a similar bound). As before, this implication does not directly use the Baire category theorem, but that theorem helps explain why this method is “not wasteful”.
A pleasant corollary of the open mapping theorem is that, as with ordinary linear algebra or with arbitrary functions, invertibility is the same thing as bijectivity:
Corollary 2. Let $T: X \to Y$ be a continuous linear operator between two Banach spaces X, Y. Then the following are equivalent:
- (Qualitative invertibility) T is bijective.
- (Quantitative invertibility) T is bijective, and $T^{-1}: Y \to X$ is a continuous (hence bounded) linear transformation.
Remark 6. The claim fails without the completeness hypotheses on X and Y. For instance, consider the operator $T: c_c(\mathbb{N}) \to c_c(\mathbb{N})$ defined by $T (a_n)_{n=1}^\infty := (\frac{a_n}{n})_{n=1}^\infty$, where we give $c_c(\mathbb{N})$ the uniform norm. Then T is continuous and bijective, but $T^{-1}$ is unbounded.
Exercise 5. Show that Corollary 2 can still fail if we drop the completeness hypothesis on just X, or just Y.
Exercise 6. Suppose that $L: X \to Y$ is a surjective continuous linear transformation between Banach spaces. By using the open mapping theorem, show that the transpose map $L^*: Y^* \to X^*$ is bounded from below, i.e. there exists $c > 0$ such that $\|L^* \lambda\|_{X^*} \geq c \|\lambda\|_{Y^*}$ for all $\lambda \in Y^*$. Conclude that $L^*$ is an isomorphism between $Y^*$ and $L^*(Y^*)$.
Let L be as in Theorem 3, so that the problem Lu=f is both qualitatively and quantitatively solvable. A standard application of Zorn’s lemma (similar to that used to prove the Hahn-Banach theorem) shows that the problem Lu=f is also qualitatively linearly solvable, in the sense that there exists a linear transformation $S: Y \to X$ such that $LSf = f$ for all $f \in Y$ (i.e. S is a right-inverse of L). In view of the open mapping theorem, it is then tempting to conjecture that L must also be quantitatively linearly solvable, in the sense that there exists a continuous linear transformation $S: Y \to X$ such that $LSf = f$ for all $f \in Y$. By Corollary 2, we see that this conjecture is true when the problem Lu=f is determined, i.e. there is exactly one solution u for each datum f. Unfortunately, the conjecture can fail when Lu=f is underdetermined (more than one solution u for each f); we discuss this in the appendix to these notes. On the other hand, the situation is much better for Hilbert spaces:
Exercise 7. Suppose that $L: H \to H'$ is a surjective continuous linear transformation between Hilbert spaces. Show that there exists a continuous linear transformation $S: H' \to H$ such that $LS = I$. Furthermore, we can ensure that the range of S is orthogonal to the kernel of L, and that this condition determines S uniquely.
Remark 7. In fact, Hilbert spaces are essentially the only type of Banach space for which we have this nice property, due to the Lindenstrauss-Tzafriri solution of the complemented subspaces problem.
Exercise 8. Let M and N be closed subspaces of a Banach space X. Show that the following statements are equivalent:
- (Qualitative complementation) Every x in X can be expressed in the form m+n for $m \in M, n \in N$ in exactly one way.
- (Quantitative complementation) Every x in X can be expressed in the form m+n for $m \in M, n \in N$ in exactly one way. Furthermore there exists C > 0 such that $\|m\|_X, \|n\|_X \leq C \|x\|_X$ for all x.
When either of these two properties holds, we say that M (or N) is a complemented subspace, and that N is a complement of M (or vice versa).
The property of being complemented is closely related to that of quantitative linear solvability:
Exercise 9. Let $L: X \to Y$ be a surjective bounded linear map between Banach spaces. Show that there exists a bounded linear map $S: Y \to X$ such that $LSf = f$ for all $f \in Y$ if and only if the kernel $\{ u \in X: Lu = 0 \}$ is a complemented subspace of X.
Exercise 10. Show that any finite-dimensional or closed finite co-dimensional subspace of a Banach space is complemented.
Remark 8. The problem of determining whether a given closed subspace of a Banach space is complemented or not is, in general, quite difficult. However, non-complemented subspaces do exist in abundance; some examples are given in the appendix, and the Lindenstrauss-Tzafriri theorem referred to in Remark 7 asserts that any Banach space not isomorphic to a Hilbert space contains at least one non-complemented subspace. There is also a remarkable construction of Gowers and Maurey of a Banach space such that every subspace, other than those ruled out by Exercise 10, is uncomplemented.
— The closed graph theorem —
Recall that a map $T: X \to Y$ between two metric spaces is continuous if and only if, whenever $x_n$ converges to x in X, $Tx_n$ converges to Tx in Y. We can also define the weaker property of being closed: a map $T: X \to Y$ is closed if and only if whenever $x_n$ converges to x in X, and $Tx_n$ converges to a limit y in Y, then y is equal to Tx; equivalently, T is closed if its graph $\{ (x,Tx): x \in X \}$ is a closed subset of $X \times Y$. This is weaker than continuity because it has the additional requirement that the sequence $Tx_n$ is already convergent. (Despite the name, closed operators are not directly related to open operators.)
Example 2. Let $T: c_c(\mathbb{N}) \to c_c(\mathbb{N})$ be the transformation $T (a_m)_{m=1}^\infty := (m a_m)_{m=1}^\infty$. This transformation is unbounded and hence discontinuous, but one easily verifies that it is closed.
As Example 2 shows, being closed is often a weaker property than being continuous. However, the remarkable closed graph theorem shows that as long as the domain and range of the operator are both Banach spaces, the two statements are equivalent:
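Theorem 4. (Closed graph theorem) Let $T: X \to Y$ be a linear transformation between two Banach spaces. Then the following are equivalent:
1. T is continuous.
2. T is closed.
3. (Weak continuity) There exists some topology $\mathcal{F}$ on Y, weaker than the norm topology but still Hausdorff, for which $T: X \to (Y, \mathcal{F})$ is continuous.
Proof. It is clear that 1 implies 3 (just take $\mathcal{F}$ to equal the norm topology). To see why 3 implies 2, observe that if $x_n \to x$ in X and $Tx_n \to y$ in norm, then $Tx_n \to y$ in the weaker topology $\mathcal{F}$ as well; but by weak continuity $Tx_n \to Tx$ in $\mathcal{F}$. Since Hausdorff topological spaces have unique limits, we have Tx=y and so T is closed.
Now we show that 2 implies 1. If T is closed, then the graph $\Gamma := \{ (x,Tx): x \in X \}$ is a closed linear subspace of the Banach space $X \times Y$ and is thus also a Banach space. On the other hand, the projection map $\pi: (x,Tx) \mapsto x$ from $\Gamma$ to X is clearly a continuous linear bijection. By Corollary 2, its inverse $x \mapsto (x,Tx)$ is also continuous, and so T is continuous as desired.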
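The final deduction can be made quantitative in one line. The sketch below assumes that the product $X \times Y$ is given the norm $\|(x,y)\| := \|x\|_X + \|y\|_Y$ (any equivalent product norm would do), and writes C for the operator norm of the inverse of the projection.

```latex
% Assumed convention: X \times Y carries the norm \|(x,y)\| := \|x\|_X + \|y\|_Y,
% and C is the operator norm of the inverse map x \mapsto (x, Tx) given by Corollary 2.
\begin{align*}
\|Tx\|_Y \leq \|x\|_X + \|Tx\|_Y = \|(x,Tx)\|_{X \times Y} \leq C \|x\|_X .
\end{align*}
```

So the closedness of the graph, a purely qualitative property, is converted into the explicit bound $\|T\|_{op} \leq C$.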
We can reformulate the closed graph theorem in the following fashion:
Corollary 3. Let X, Y be Banach spaces, and suppose we have some continuous inclusion $Y \subset Z$ of Y into a Hausdorff topological vector space Z. Let $T: X \to Z$ be a continuous linear transformation. Then the following are equivalent.
1. (Qualitative regularity) For all $x \in X$, $Tx \in Y$.
2. (Quantitative regularity) For all $x \in X$, $Tx \in Y$, and furthermore $\|Tx\|_Y \leq C \|x\|_X$ for some $C > 0$ independent of x.
3. (Quantitative regularity on a dense subclass) For all x in a dense subset of X, $Tx \in Y$, and furthermore $\|Tx\|_Y \leq C \|x\|_X$ for some $C > 0$ independent of x.
Proof. Clearly 2. implies both 3. and 1. If we have 3., then T extends uniquely to a bounded linear map from X to Y, which must agree with the original continuous map from X to Z since limits in the Hausdorff space Z are unique, and so 3. implies 2. Finally, if 1. holds, then we can view T as a map from X to Y, which by Theorem 4 is continuous, and the claim now follows from Lemma 1 from Notes 3.
In practice, one should think of Z as some sort of “low regularity” space with a weak topology, and Y as a “high regularity” subspace with a stronger topology. Corollary 3 motivates the method of a priori estimates to establish the Y-regularity of some linear transform Tx of an arbitrary element x in a Banach space X, by first establishing the a priori estimate $\|Tx\|_Y \leq C \|x\|_X$ for a dense subclass of “nice” elements of X, and then using the above corollary (and some weak continuity of T in a low regularity space) to conclude. The closed graph theorem provides the metamathematical explanation as to why this approach is at least as powerful as any other approach to proving regularity.
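Example 3. Let $1 \leq p \leq 2$, and let $p'$ be the dual exponent of p. To prove that the Fourier transform $\hat f$ of a function $f \in L^p(\mathbb{R})$ lies in $L^{p'}(\mathbb{R})$, it suffices to prove the Hausdorff-Young inequality
$$\| \hat f \|_{L^{p'}(\mathbb{R})} \leq C_p \|f\|_{L^p(\mathbb{R})}$$
for some constant $C_p$ and all f in some suitable dense subclass of $L^p(\mathbb{R})$ (e.g. the space $C^\infty_0(\mathbb{R})$ of smooth functions of compact support), together with the “soft” observation that the Fourier transform is continuous from $L^p(\mathbb{R})$ to the space of tempered distributions, which is a Hausdorff space into which $L^{p'}(\mathbb{R})$ embeds continuously. One can replace the Hausdorff-Young inequality here by countless other estimates in harmonic analysis to obtain similar qualitative regularity conclusions.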
— Appendix: Nonlinear solvability (optional) —
In this appendix we give an example of a linear equation Lu=f which can only be quantitatively solved in a nonlinear fashion. We will use a number of basic tools which we will only cover later in this course, and so this material is optional reading.
Let $X := \{0,1\}^{\mathbb{N}}$ be the infinite discrete cube with the product topology; by Tychonoff’s theorem, this is a compact Hausdorff space. The Borel $\sigma$-algebra is generated by the cylinder sets
$$E_n := \{ (x_m)_{m=1}^\infty \in \{0,1\}^{\mathbb{N}}: x_n = 1 \}.$$
(From a probabilistic view point, one can think of X as the event space for flipping a countably infinite number of coins, and $E_n$ as the event that the $n^{th}$ coin lands as heads.)
Let $M(X)$ be the space of finite Borel measures on X; this can be verified to be a Banach space. There is a map $L: M(X) \to \ell^\infty(\mathbb{N})$ defined by
$$L(\mu) := (\mu(E_n))_{n=1}^\infty.$$
This is a continuous linear transformation. The equation $Lu = f$ is quantitatively solvable for every $f \in \ell^\infty(\mathbb{N})$. Indeed, if f is an indicator function $f = 1_A$, then $f = L\delta_{x_A}$, where $x_A \in \{0,1\}^{\mathbb{N}}$ is the sequence that equals 1 on A and 0 outside of A, and $\delta_{x_A}$ is the Dirac mass at $x_A$. The general case then follows by expressing a bounded sequence as an integral of indicator functions (e.g. if f takes values in [0,1], we can write $f = \int_0^1 1_{\{f > t\}}\ dt$). Note however that this is a nonlinear operation, since the indicator $1_{\{f>t\}}$ depends nonlinearly on f.
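As a sanity check, the two solvability claims can be verified directly. The sketch below assumes that $E_n$ is the cylinder set of sequences whose n-th entry is 1 and that $x_A$ is the indicator sequence of A; the measure $\mu_f$ is a hypothetical name, introduced here, for the pushforward of Lebesgue measure on [0,1] under $t \mapsto x_{\{f > t\}}$.

```latex
% Assumed notation: E_n = \{x : x_n = 1\}, x_A is the indicator sequence of A, and for f
% with values in [0,1], \mu_f := \int_0^1 \delta_{x_{\{f > t\}}} dt (a hypothetical name).
\begin{align*}
L(\delta_{x_A}) &= (\delta_{x_A}(E_n))_{n=1}^\infty = (1_{x_A \in E_n})_{n=1}^\infty
  = (1_A(n))_{n=1}^\infty = 1_A, \\
L(\mu_f) &= \Big( \int_0^1 \delta_{x_{\{f>t\}}}(E_n)\ dt \Big)_{n=1}^\infty
  = \Big( \int_0^1 1_{f(n) > t}\ dt \Big)_{n=1}^\infty = (f(n))_{n=1}^\infty = f .
\end{align*}
```

Since $\mu_f$ is a probability measure, one would first normalise f to have sup norm comparable to 1 (and split a general bounded sequence into real and imaginary, positive and negative parts) in order to obtain a bound of the form $\|u\|_{M(X)} \leq C \|f\|_{\ell^\infty}$; this normalisation is one more place where nonlinearity enters.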
We now claim that the equation $Lu = f$ is not quantitatively linearly solvable, i.e. there is no bounded linear map $S: \ell^\infty(\mathbb{N}) \to M(X)$ such that LSf = f for all $f \in \ell^\infty(\mathbb{N})$. This fact was first observed by Banach and Mazur; we shall give two proofs, one of a “soft analysis” flavour and one of a “hard analysis” flavour.
We begin with the “soft analysis” proof, starting with a measure-theoretic result which is of independent interest.
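Theorem 5. (Nikodym convergence theorem) Let $(X, \mathcal{B})$ be a measurable space, and let $\mu_n: \mathcal{B} \to \mathbb{R}$ be a sequence of signed finite measures which is weakly convergent in the sense that $\mu_n(E)$ converges to some limit $\mu(E)$ for each $E \in \mathcal{B}$. Then:
- The $\mu_n$ are uniformly countably additive, which means that for any sequence $E_1, E_2, \ldots$ of disjoint measurable sets, the series $\sum_{m=1}^\infty |\mu_n(E_m)|$ converges uniformly in n.
- $\mu$ is a signed finite measure.
Proof. It suffices to prove the first part, since this easily implies that $\mu$ is also countably additive, and is thence a signed finite measure. Suppose for contradiction that the claim failed, then one could find disjoint $E_1, E_2, \ldots$ and $\varepsilon > 0$ such that one has $\limsup_{n \to \infty} \sum_{m=M}^\infty |\mu_n(E_m)| > \varepsilon$ for all M. We now construct disjoint sets $A_1, A_2, \ldots$, each consisting of the union of a finite collection of the $E_j$, and an increasing sequence $n_1, n_2, \ldots$ of positive integers, by the following recursive procedure:
1. Initialise $k = 0$.
2. Suppose recursively that $n_1 < \ldots < n_{2k}$ and $A_1, \ldots, A_k$ have already been constructed for some $k \geq 0$.
3. Choose $n_{2k+1} > n_{2k}$ so large that for all $n \geq n_{2k+1}$, $\mu_n(A_1 \cup \ldots \cup A_k)$ differs from $\mu(A_1 \cup \ldots \cup A_k)$ by at most $\varepsilon/10$.
4. Choose $M_k$ so large that $M_k$ is larger than j for any $E_j \subset A_1 \cup \ldots \cup A_k$, and such that $\sum_{m=M_k}^\infty |\mu_{n_j}(E_m)| \leq \varepsilon/100^{k+1}$ for all $1 \leq j \leq 2k+1$.
5. Choose $n_{2k+2} > n_{2k+1}$ so that $\sum_{m=M_k}^\infty |\mu_{n_{2k+2}}(E_m)| > \varepsilon$.
6. Pick $A_{k+1}$ to be a finite union of the $E_j$ with $j \geq M_k$ such that $|\mu_{n_{2k+2}}(A_{k+1})| > \varepsilon/2$.
7. Increment k to k+1 and then return to Step 2.
It is then a routine matter to show that if $A := \bigcup_{j=1}^\infty A_j$, then $|\mu_{n_{2j+2}}(A) - \mu_{n_{2j+1}}(A)| \geq \varepsilon/10$ for all j, contradicting the hypothesis that $\mu_n$ is weakly convergent to $\mu$.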
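The “routine matter” can be sketched as follows. The specific thresholds used here ($\varepsilon/10$ for the closeness of $\mu_n$ to $\mu$ on the sets already built, $\varepsilon/100^{k+1}$ for the tail sums, and $\varepsilon/2$ for the new set) are assumptions chosen for this illustration and need not be the author's; the names $B_k$ and $R_k$ are likewise introduced only here.

```latex
% Assumed constants (not necessarily the author's):
%   |\mu_n(B_k) - \mu(B_k)| \leq \varepsilon/10 for n \geq n_{2k+1}, where B_k := A_1 \cup \dots \cup A_k;
%   \sum_{m \geq M_k} |\mu_{n_j}(E_m)| \leq \varepsilon/100^{k+1} for j \leq 2k+1;
%   |\mu_{n_{2k+2}}(A_{k+1})| > \varepsilon/2.
% Decompose A = B_k \cup A_{k+1} \cup R_k, with R_k := \bigcup_{j \geq k+2} A_j.
\begin{align*}
|\mu_{n_{2k+2}}(A) - \mu_{n_{2k+1}}(A)|
 &\geq |\mu_{n_{2k+2}}(A_{k+1})| - |\mu_{n_{2k+1}}(A_{k+1})| \\
 &\quad - |\mu_{n_{2k+2}}(B_k) - \mu_{n_{2k+1}}(B_k)|
   - |\mu_{n_{2k+2}}(R_k)| - |\mu_{n_{2k+1}}(R_k)| \\
 &\geq \frac{\varepsilon}{2} - \frac{\varepsilon}{100} - \frac{2\varepsilon}{10}
   - \frac{2\varepsilon}{100^{k+2}}
 \geq \frac{\varepsilon}{10} .
\end{align*}
```

Here the bound on $A_{k+1}$ at time $n_{2k+1}$ uses that $A_{k+1}$ is built from $E_j$ with $j \geq M_k$, and the bound on $R_k$ uses that the later sets are built from $E_j$ with $j \geq M_{k+1}$, whose tail sums were made small for all indices up to $n_{2k+3}$ at the next stage of the recursion.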
Exercise 11. (Schur’s property for $\ell^1$) Show that if a sequence in $\ell^1(\mathbb{N})$ is convergent in the weak topology, then it is convergent in the strong topology.
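We return now to the map $S: \ell^\infty(\mathbb{N}) \to M(X)$. Consider the sequence $a_n \in c_0(\mathbb{N}) \subset \ell^\infty(\mathbb{N})$ defined by $a_n := (1_{m \leq n})_{m=1}^\infty$, i.e. each $a_n$ is the sequence consisting of n 1’s followed by an infinite number of 0’s. As the dual of $c_0(\mathbb{N})$ is isomorphic to $\ell^1(\mathbb{N})$, we see from the dominated convergence theorem that $a_n$ is a weakly Cauchy sequence in $c_0(\mathbb{N})$, in the sense that $\lambda(a_n)$ is Cauchy for any $\lambda \in c_0(\mathbb{N})^*$. Applying S, we conclude that $S(a_n)$ is weakly Cauchy in $M(X)$. In particular, using the bounded linear functionals $\mu \mapsto \mu(E)$ on M(X), we see that $S(a_n)(E)$ converges to some limit $\mu(E)$ for all measurable sets E. Applying the Nikodym convergence theorem we see that $\mu$ is also a signed finite measure. We then see that $S(a_n)$ converges in the weak topology to $\mu$. (One way to see this is to define $\nu := \sum_{n=1}^\infty 2^{-n} |S(a_n)| + |\mu|$, then $\nu$ is finite and $S(a_n), \mu$ are all absolutely continuous with respect to $\nu$; now use the Radon-Nikodym theorem (see Notes 1) and the fact that $L^1(\nu)^* \equiv L^\infty(\nu)$.) On the other hand, as $LS = I$ and L and S are both bounded, S is a Banach space isomorphism between $c_0(\mathbb{N})$ and $S(c_0(\mathbb{N}))$. Thus $S(c_0(\mathbb{N}))$ is complete, hence closed, hence weakly closed (by Hahn-Banach), and so $\mu = S(a)$ for some $a \in c_0(\mathbb{N})$. By Hahn-Banach again, this implies that $a_n$ converges weakly to $a$. But this is easily seen to be impossible, since the constant sequence $(1)_{m=1}^\infty$ does not lie in $c_0(\mathbb{N})$, and the claim follows.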
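Now we give the “hard analysis” proof. Let $e_1, e_2, \ldots$ be the standard basis for $\ell^\infty(\mathbb{N})$, let N be a large number, and consider the random sums
$$S( \varepsilon_1 e_1 + \ldots + \varepsilon_N e_N )$$
where $\varepsilon_n \in \{-1,+1\}$ are iid random signs. Since the $\ell^\infty$ norm of $\varepsilon_1 e_1 + \ldots + \varepsilon_N e_N$ is 1, we have
$$\| \varepsilon_1 S(e_1) + \ldots + \varepsilon_N S(e_N) \|_{M(X)} \leq C$$
for some constant C independent of N. On the other hand, we can write $S(e_n) = f_n \nu$ for some finite measure $\nu$ and some $f_n \in L^1(\nu)$ using Radon-Nikodym as in the previous proof, and then
$$\| \varepsilon_1 f_1 + \ldots + \varepsilon_N f_N \|_{L^1(\nu)} \leq C.$$
Taking expectations and applying Khintchine’s inequality we conclude
$$\Big\| \Big( \sum_{n=1}^N |f_n|^2 \Big)^{1/2} \Big\|_{L^1(\nu)} \leq C'$$
for some constant C’ independent of N. By Cauchy-Schwarz this implies that
$$\Big\| \sum_{n=1}^N |f_n| \Big\|_{L^1(\nu)} \leq C' \sqrt{N}.$$
But as $\|f_n\|_{L^1(\nu)} = \|S(e_n)\|_{M(X)} \geq c$ for some constant c > 0 independent of N, we obtain a contradiction for N large enough, and the claim follows.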
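The contradiction at the end is a comparison of growth rates in N, which the following sketch makes explicit. It assumes the notation $S(e_n) = f_n \nu$ with $f_n \in L^1(\nu)$, a lower bound $\|f_n\|_{L^1(\nu)} \geq c$, and a Khintchine-type bound with constant C'.

```latex
% Assumed notation: S(e_n) = f_n \nu with f_n \in L^1(\nu); c is the uniform lower bound
% on \|f_n\|_{L^1(\nu)}, and C' is the constant obtained from Khintchine's inequality.
\begin{align*}
cN \leq \sum_{n=1}^N \|f_n\|_{L^1(\nu)}
 = \Big\| \sum_{n=1}^N |f_n| \Big\|_{L^1(\nu)}
 \leq \sqrt{N}\, \Big\| \Big( \sum_{n=1}^N |f_n|^2 \Big)^{1/2} \Big\|_{L^1(\nu)}
 \leq C' \sqrt{N} .
\end{align*}
```

This forces $N \leq (C'/c)^2$, which fails once N is large enough. The lower bound c is available because $LS = I$ gives $1 = \|e_n\|_{\ell^\infty} \leq \|L\|_{op} \|S(e_n)\|_{M(X)}$.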
Remark 9. The phenomenon of nonlinear quantitative solvability actually comes up in many applications of interest. For instance, consider the Fefferman-Stein decomposition theorem, which asserts that any $f \in BMO(\mathbb{R})$ of bounded mean oscillation can be decomposed as $f = g + Hh$ for some $g, h \in L^\infty(\mathbb{R})$, where H is the Hilbert transform. This theorem was first proven by using the duality of the Hardy space $H^1(\mathbb{R})$ and BMO (and by using Exercise 13 from Notes 6), and by using the fact that a function f is in $H^1(\mathbb{R})$ if and only if f and Hf both lie in $L^1(\mathbb{R})$. From the open mapping theorem we know that we can pick g, h so that the $L^\infty$ norms of g, h are bounded by a multiple of the BMO norm of f. But it turns out not to be possible to pick g and h in a bounded linear manner in terms of f, although this is a little tricky to prove. (Uchiyama famously gave an explicit construction of g, h in terms of f, but the construction was highly nonlinear; see my blog post on the topic.)
An example in a similar spirit was given more recently by Bourgain and Brezis, who considered the problem of solving the equation $\hbox{div}\, u = f$ on the d-dimensional torus $\mathbb{T}^d$ for some function $f: \mathbb{T}^d \to \mathbb{C}$ on the torus with mean zero, and with some unknown vector field $u: \mathbb{T}^d \to \mathbb{C}^d$, where the derivatives are interpreted in the weak sense. They showed that if $d \geq 2$ and $f \in L^d(\mathbb{T}^d)$, then there existed a solution u to this problem with $u \in W^{1,d} \cap C^0$, despite the failure of Sobolev embedding at this endpoint. Again, the open mapping theorem allows one to choose u with $W^{1,d} \cap C^0$ norm bounded by a multiple of the $L^d$ norm of f, but Bourgain and Brezis also show that one cannot select u in a bounded linear fashion depending on f.
Question. All of the above constructions of non-complemented closed subspaces, or of linear problems that can only be quantitatively solved nonlinearly, were quite involved. Is there a “soft” or “elementary” way to see that closed subspaces of Banach spaces exist which are not complemented, or (equivalently) that surjective continuous linear maps between Banach spaces do not always enjoy a continuous linear right-inverse? I do not have a good answer to this question.
[Update, Feb 4: definition of “residual” corrected.]