Thus far, we have only focused on measure and integration theory in the context of Euclidean spaces $\mathbf{R}^d$. Now, we will work in a more abstract and general setting, in which the Euclidean space $\mathbf{R}^d$ is replaced by a more general space $X$.
It turns out that in order to properly define measure and integration on a general space $X$, it is not enough to just specify the set $X$. One also needs to specify two additional pieces of data:
- A collection $\mathcal{B}$ of subsets of $X$ that one is allowed to measure; and
- The measure $\mu(E) \in [0,+\infty]$ one assigns to each measurable set $E \in \mathcal{B}$.
For instance, Lebesgue measure theory covers the case when $X$ is a Euclidean space $\mathbf{R}^d$, $\mathcal{B}$ is the collection $\mathcal{B} = \mathcal{L}[\mathbf{R}^d]$ of all Lebesgue measurable subsets of $\mathbf{R}^d$, and $\mu(E)$ is the Lebesgue measure $\mu(E) = m(E)$ of $E$.
The collection $\mathcal{B}$ has to obey a number of axioms (e.g. being closed with respect to countable unions) that make it a $\sigma$-algebra, which is a stronger variant of the more well-known concept of a Boolean algebra. Similarly, the measure $\mu$ has to obey a number of axioms (most notably, a countable additivity axiom) in order to obtain a measure and integration theory comparable to the Lebesgue theory on Euclidean spaces. When all these axioms are satisfied, the triple $(X, \mathcal{B}, \mu)$ is known as a measure space. These play much the same role in abstract measure theory that metric spaces or topological spaces play in abstract point-set topology, or that vector spaces play in abstract linear algebra.
On any measure space, one can set up the unsigned and absolutely convergent integrals in almost exactly the same way as was done in the previous notes for the Lebesgue integral on Euclidean spaces, although the approximation theorems are largely unavailable at this level of generality due to the lack of such concepts as “elementary set” or “continuous function” for an abstract measure space. On the other hand, one does have the fundamental convergence theorems for the subject, namely Fatou’s lemma, the monotone convergence theorem and the dominated convergence theorem, and we present these results here.
One question that will not be addressed much in this current set of notes is how one actually constructs interesting examples of measures. We will discuss this issue more in later notes (although one of the most powerful tools for such constructions, namely the Riesz representation theorem, will not be covered until 245B).
— 1. Boolean algebras —
We begin by recalling the concept of a Boolean algebra.
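Definition 1 (Boolean algebras) Let $X$ be a set. A (concrete) Boolean algebra on $X$ is a collection $\mathcal{B}$ of subsets of $X$ which obeys the following properties:
- (Empty set) $\emptyset \in \mathcal{B}$.
- (Complement) If $E \in \mathcal{B}$, then the complement $E^c := X \backslash E$ also lies in $\mathcal{B}$.
- (Finite unions) If $E, F \in \mathcal{B}$, then $E \cup F \in \mathcal{B}$.
We sometimes say that $E$ is $\mathcal{B}$-measurable, or measurable with respect to $\mathcal{B}$, if $E \in \mathcal{B}$.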
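Given two Boolean algebras $\mathcal{B}, \mathcal{B}'$ on $X$, we say that $\mathcal{B}'$ is finer than, or a refinement of, $\mathcal{B}$, or that $\mathcal{B}$ is coarser than, a sub-algebra of, or a coarsening of $\mathcal{B}'$, if $\mathcal{B} \subset \mathcal{B}'$.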
We have chosen a “minimalist” definition of a Boolean algebra, in which one is only assumed to be closed under two of the basic Boolean operations, namely complement and finite union. However, by using the laws of Boolean algebra (such as de Morgan’s laws), it is easy to see that a Boolean algebra is also closed under other Boolean algebra operations such as intersection $E \cap F$, set difference $E \backslash F$, and symmetric difference $E \Delta F$. So one could have placed these additional closure properties inside the definition of a Boolean algebra without any loss of generality. However, when we are verifying that a given collection $\mathcal{B}$ of sets is indeed a Boolean algebra, it is convenient to have as minimal a set of axioms as possible. (This point is discussed further in this Math Overflow comment of mine.)
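On a finite set, the axioms of Definition 1 and the derived closure properties can be checked by brute force. The following is a minimal sketch under that finiteness assumption; the function names `is_boolean_algebra` and `closed_under_derived_operations` are hypothetical helpers introduced only for illustration.

```python
from itertools import combinations

def is_boolean_algebra(X, B):
    """Check the three axioms of Definition 1 for a collection B of subsets of a finite set X."""
    X = frozenset(X)
    B = {frozenset(E) for E in B}
    if not all(E <= X for E in B):
        return False                                        # members must be subsets of X
    if frozenset() not in B:
        return False                                        # (Empty set)
    if any(X - E not in B for E in B):
        return False                                        # (Complement)
    return all(E | F in B for E, F in combinations(B, 2))   # (Finite unions)

def closed_under_derived_operations(B):
    """Check closure under intersection, set difference and symmetric difference."""
    B = {frozenset(E) for E in B}
    return all(E & F in B and E - F in B and E ^ F in B for E in B for F in B)

X = {1, 2, 3, 4}
B = [set(), {1, 2}, {3, 4}, {1, 2, 3, 4}]
not_B = [set(), {1}, {1, 2, 3, 4}]          # fails: the complement {2, 3, 4} is missing

print(is_boolean_algebra(X, B))              # True
print(closed_under_derived_operations(B))    # True
print(is_boolean_algebra(X, not_B))          # False
```

Only the three minimal axioms are tested by `is_boolean_algebra`; the second function confirms on this example that the remaining closure properties come for free.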
Remark 1 One can also consider abstract Boolean algebras $\mathcal{B}$, which do not necessarily live in an ambient domain $X$, but for which one has a collection of abstract Boolean operations such as meet $\wedge$ and join $\vee$ instead of the concrete operations of intersection $\cap$ and union $\cup$. We will not take this abstract perspective here, but see this blog post of mine for some further discussion of the relationship between concrete and abstract Boolean algebras, which is codified by Stone’s theorem.
Example 1 (Trivial and discrete algebra) Given any set $X$, the coarsest Boolean algebra is the trivial algebra $\{\emptyset, X\}$, in which the only measurable sets are the empty set and the whole set. The finest Boolean algebra is the discrete algebra $2^X := \{E: E \subset X\}$, in which every set is measurable. All other Boolean algebras are intermediate between these two extremes: finer than the trivial algebra, but coarser than the discrete one.
Exercise 1 (Elementary algebra) Let $\overline{\mathcal{E}[\mathbf{R}^d]}$ be the collection of those sets $E \subset \mathbf{R}^d$ that are either elementary sets, or co-elementary sets (i.e. the complement of an elementary set). Show that $\overline{\mathcal{E}[\mathbf{R}^d]}$ is a Boolean algebra. We will call this algebra the elementary Boolean algebra of $\mathbf{R}^d$.
Example 2 (Jordan algebra) Let $\overline{\mathcal{J}[\mathbf{R}^d]}$ be the collection of subsets of $\mathbf{R}^d$ that are either Jordan measurable or co-Jordan measurable (i.e. the complement of a Jordan measurable set). Then $\overline{\mathcal{J}[\mathbf{R}^d]}$ is a Boolean algebra that is finer than the elementary algebra. We refer to this algebra as the Jordan algebra on $\mathbf{R}^d$ (but caution that there is a completely different concept of a Jordan algebra in mathematics).
Example 3 (Lebesgue algebra) Let $\mathcal{L}[\mathbf{R}^d]$ be the collection of Lebesgue measurable subsets of $\mathbf{R}^d$. Then $\mathcal{L}[\mathbf{R}^d]$ is a Boolean algebra that is finer than the Jordan algebra; we refer to this as the Lebesgue algebra on $\mathbf{R}^d$.
Example 4 (Null algebra) Let $\mathcal{N}(\mathbf{R}^d)$ be the collection of subsets of $\mathbf{R}^d$ that are either Lebesgue null sets or Lebesgue co-null sets (the complement of null sets). Then $\mathcal{N}(\mathbf{R}^d)$ is a Boolean algebra that is coarser than the Lebesgue algebra; we refer to it as the null algebra on $\mathbf{R}^d$.
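Exercise 2 (Restriction) Let $\mathcal{B}$ be a Boolean algebra on a set $X$, and let $Y$ be a subset of $X$ (not necessarily $\mathcal{B}$-measurable). Show that the restriction $\mathcal{B}\downharpoonright_Y := \{E \cap Y: E \in \mathcal{B}\}$ of $\mathcal{B}$ to $Y$ is a Boolean algebra on $Y$. If $Y$ is $\mathcal{B}$-measurable, show that
$$\mathcal{B}\downharpoonright_Y = \mathcal{B} \cap 2^Y = \{E \subset Y: E \in \mathcal{B}\}.$$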
Example 5 (Atomic algebra) Let $X$ be partitioned into a union $X = \bigcup_{\alpha \in I} A_\alpha$ of disjoint sets $A_\alpha$, which we refer to as atoms. Then this partition generates a Boolean algebra $\mathcal{A}((A_\alpha)_{\alpha \in I})$, defined as the collection of all the sets $E$ of the form $E = \bigcup_{\alpha \in J} A_\alpha$ for some $J \subset I$, i.e. $\mathcal{A}((A_\alpha)_{\alpha \in I})$ is the collection of all sets that can be represented as the union of zero or more atoms. This is easily verified to be a Boolean algebra, and we refer to it as the atomic algebra with atoms $(A_\alpha)_{\alpha \in I}$. The trivial algebra corresponds to the trivial partition $X = X$ into a single atom; at the other extreme, the discrete algebra corresponds to the discrete partition $X = \bigcup_{x \in X} \{x\}$ into singleton atoms. More generally, note that finer (resp. coarser) partitions lead to finer (resp. coarser) atomic algebras. In this definition, we permit some of the atoms in the partition to be empty; but it is clear that empty atoms have no impact on the final atomic algebra, and so without loss of generality one can delete all empty atoms and assume that all atoms are non-empty if one wishes.
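For a finite partition, the atomic algebra can be enumerated directly by taking the union over every sub-collection of atoms. The sketch below assumes finitely many atoms; `atomic_algebra` is a hypothetical helper name.

```python
from itertools import combinations

def atomic_algebra(atoms):
    """All unions of sub-collections (including the empty one) of the given disjoint atoms."""
    atoms = [frozenset(A) for A in atoms]
    algebra = set()
    for r in range(len(atoms) + 1):
        for chosen in combinations(atoms, r):
            algebra.add(frozenset().union(*chosen))
    return algebra

atoms = [{1, 2}, {3}, {4, 5, 6}]
A = atomic_algebra(atoms)
print(len(A))  # 8

# A finer partition yields a finer (larger) atomic algebra.
finer = atomic_algebra([{1}, {2}, {3}, {4, 5, 6}])
print(A <= finer)  # True
```

With three non-empty atoms there are $2^3 = 8$ measurable sets, and adding an empty atom to the list leaves the result unchanged.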
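Example 6 (Dyadic algebras) Let $n$ be an integer. The dyadic algebra $\mathcal{D}_n(\mathbf{R}^d)$ at scale $2^{-n}$ in $\mathbf{R}^d$ is defined to be the atomic algebra generated by the half-open dyadic cubes
$$\left[\frac{i_1}{2^n}, \frac{i_1+1}{2^n}\right) \times \ldots \times \left[\frac{i_d}{2^n}, \frac{i_d+1}{2^n}\right)$$
of length $2^{-n}$ (see Exercise 14 of the prologue). These are Boolean algebras which are increasing in $n$: $\mathcal{D}_{n+1} \supset \mathcal{D}_n$. Draw a diagram to indicate how these algebras sit in relation to the elementary, Jordan, Lebesgue, null, discrete, and trivial algebras.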
Remark 2 The dyadic algebras are analogous to the finite resolution one has on modern computer monitors, which subdivide space into square pixels. A low resolution monitor (in which each pixel has a large size) can only resolve a very small set of “blocky” images, as opposed to the larger class of images that can be resolved by a finer resolution monitor.
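Exercise 3 Show that the non-empty atoms of an atomic algebra are determined up to relabeling. More precisely, show that if $X = \bigcup_{\alpha \in I} A_\alpha = \bigcup_{\alpha' \in I'} A'_{\alpha'}$ are two partitions of $X$ into non-empty atoms $A_\alpha$, $A'_{\alpha'}$, then $\mathcal{A}((A_\alpha)_{\alpha \in I}) = \mathcal{A}((A'_{\alpha'})_{\alpha' \in I'})$ if and only if there exists a bijection $\phi: I \rightarrow I'$ such that $A'_{\phi(\alpha)} = A_\alpha$ for all $\alpha \in I$.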
While many Boolean algebras are atomic, many are not, as the following two exercises indicate.
Exercise 4 Show that every finite Boolean algebra is an atomic algebra. (A Boolean algebra $\mathcal{B}$ is finite if its cardinality is finite, i.e. there are only finitely many measurable sets.) Conclude that every finite Boolean algebra has a cardinality of the form $2^n$ for some natural number $n$. From this exercise and Exercise 3 we see that there is a one-to-one correspondence between finite Boolean algebras on $X$ and finite partitions of $X$ into non-empty sets (up to relabeling).
Exercise 5 Show that the elementary, Jordan, Lebesgue, and null algebras are not atomic algebras. (Hint: argue by contradiction. If these algebras were atomic, what must the atoms be?)
Now we describe some further ways to generate Boolean algebras.
Exercise 6 (Intersection of algebras) Let $(\mathcal{B}_\alpha)_{\alpha \in I}$ be a family of Boolean algebras on a set $X$, indexed by a (possibly infinite or uncountable) label set $I$. Show that the intersection $\bigwedge_{\alpha \in I} \mathcal{B}_\alpha := \bigcap_{\alpha \in I} \mathcal{B}_\alpha$ of these algebras is still a Boolean algebra, and is the finest Boolean algebra that is coarser than all of the $\mathcal{B}_\alpha$. (If $I$ is empty, we adopt the convention that $\bigwedge_{\alpha \in I} \mathcal{B}_\alpha$ is the discrete algebra.)
Definition 2 (Generation of algebras) Let $\mathcal{F}$ be any family of sets in $X$. We define $\langle \mathcal{F} \rangle_{\rm bool}$ to be the intersection of all the Boolean algebras that contain $\mathcal{F}$, which is again a Boolean algebra by Exercise 6. Equivalently, $\langle \mathcal{F} \rangle_{\rm bool}$ is the coarsest Boolean algebra that contains $\mathcal{F}$. We say that $\langle \mathcal{F} \rangle_{\rm bool}$ is the Boolean algebra generated by $\mathcal{F}$.
Example 7 $\mathcal{F}$ is a Boolean algebra if and only if $\langle \mathcal{F} \rangle_{\rm bool} = \mathcal{F}$; thus each Boolean algebra is generated by itself.
Exercise 8 Let $n$ be a natural number. Show that if $\mathcal{F}$ is a finite collection of $n$ sets, then $\langle \mathcal{F} \rangle_{\rm bool}$ is a finite Boolean algebra of cardinality at most $2^{2^n}$ (in particular, finite sets generate finite algebras). Give an example to show that this bound is best possible. (Hint: for the latter, it may be convenient to use a discrete ambient space such as the discrete cube $X = \{0,1\}^n$.)
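The Boolean algebra $\langle \mathcal{F} \rangle_{\rm bool}$ can be described explicitly in terms of $\mathcal{F}$ as follows:
Exercise 9 (Recursive description of a generated Boolean algebra) Let $\mathcal{F}$ be a collection of sets in a set $X$. Define the sets $\mathcal{F}_0, \mathcal{F}_1, \mathcal{F}_2, \ldots$ recursively as follows:
- $\mathcal{F}_0 := \mathcal{F}$.
- For each $n \geq 1$, we define $\mathcal{F}_n$ to be the collection of all sets that are either the union of a finite number of sets in $\mathcal{F}_{n-1}$ (including the empty union $\emptyset$), or the complement of such a union.
Show that $\langle \mathcal{F} \rangle_{\rm bool} = \bigcup_{n=0}^\infty \mathcal{F}_n$.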
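On a finite ambient set, the generated Boolean algebra can be computed by repeatedly adding unions and complements until nothing new appears. The sketch below is an illustration under that finiteness assumption, not the recursion of the exercise verbatim: each pass adds only the empty set, pairwise unions and complements, which reaches the same limit after enough passes. The name `generate_boolean_algebra` is hypothetical. The example uses the discrete cube $\{0,1\}^n$ with the $n$ coordinate sets $\{x: x_i = 1\}$ as generators.

```python
from itertools import combinations, product

def generate_boolean_algebra(X, F):
    """Iterate 'add the empty set, pairwise unions and complements' until stable."""
    X = frozenset(X)
    current = {frozenset(E) for E in F}
    while True:
        unions = {frozenset()} | current | {E | G for E, G in combinations(current, 2)}
        nxt = unions | {X - U for U in unions}
        if nxt == current:          # closed under all three operations: done
            return current
        current = nxt

n = 2
cube = set(product((0, 1), repeat=n))
F = [{x for x in cube if x[i] == 1} for i in range(n)]   # n generating sets
algebra = generate_boolean_algebra(cube, F)
print(len(algebra), 2 ** (2 ** n))   # 16 16
```

In this example the two generators already produce every subset of the four-point cube, so the count matches $2^{2^n}$ for $n = 2$.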
— 2. $\sigma$-algebras and measurable spaces —
In order to obtain a measure and integration theory that can cope well with limits, the finite union axiom of a Boolean algebra is insufficient, and must be improved to a countable union axiom:
Definition 3 (Sigma algebras) Let $X$ be a set. A $\sigma$-algebra on $X$ is a collection $\mathcal{B}$ of subsets of $X$ which obeys the following properties:
- (Empty set) $\emptyset \in \mathcal{B}$.
- (Complement) If $E \in \mathcal{B}$, then the complement $E^c := X \backslash E$ also lies in $\mathcal{B}$.
- (Countable unions) If $E_1, E_2, \ldots \in \mathcal{B}$, then $\bigcup_{n=1}^\infty E_n \in \mathcal{B}$.
We refer to the pair $(X, \mathcal{B})$ of a set $X$ together with a $\sigma$-algebra on that set as a measurable space.
From de Morgan’s law (which is just as valid for infinite unions and intersections as it is for finite ones), we see that $\sigma$-algebras are closed under countable intersections as well as countable unions.
By padding a finite union into a countable union by using the empty set, we see that every $\sigma$-algebra is automatically a Boolean algebra. Thus, we automatically inherit the notion of being measurable with respect to a $\sigma$-algebra, or of one $\sigma$-algebra being coarser or finer than another.
Exercise 10 Show that all atomic algebras are $\sigma$-algebras. In particular, the discrete algebra and trivial algebra are $\sigma$-algebras, as are the finite algebras and the dyadic algebras on Euclidean spaces.
Exercise 11 Show that the Lebesgue and null algebras are $\sigma$-algebras, but the elementary and Jordan algebras are not.
Exercise 12 Show that any restriction $\mathcal{B}\downharpoonright_Y$ of a $\sigma$-algebra $\mathcal{B}$ to a subspace $Y$ of $X$ (as defined in Exercise 2) is again a $\sigma$-algebra on the subspace $Y$.
There is an exact analogue of Exercise 6:
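Exercise 13 (Intersection of $\sigma$-algebras) Show that the intersection $\bigwedge_{\alpha \in I} \mathcal{B}_\alpha := \bigcap_{\alpha \in I} \mathcal{B}_\alpha$ of an arbitrary (and possibly infinite or uncountable) number of $\sigma$-algebras $\mathcal{B}_\alpha$ is again a $\sigma$-algebra, and is the finest $\sigma$-algebra that is coarser than all of the $\mathcal{B}_\alpha$.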
Similarly, we have a notion of generation:
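Definition 4 (Generation of $\sigma$-algebras) Let $\mathcal{F}$ be any family of sets in $X$. We define $\langle \mathcal{F} \rangle$ to be the intersection of all the $\sigma$-algebras that contain $\mathcal{F}$, which is again a $\sigma$-algebra by Exercise 13. Equivalently, $\langle \mathcal{F} \rangle$ is the coarsest $\sigma$-algebra that contains $\mathcal{F}$. We say that $\langle \mathcal{F} \rangle$ is the $\sigma$-algebra generated by $\mathcal{F}$.
Since every $\sigma$-algebra is a Boolean algebra, we have the trivial inclusion
$$\langle \mathcal{F} \rangle_{\rm bool} \subset \langle \mathcal{F} \rangle.$$
However, equality need not hold; it holds if and only if $\langle \mathcal{F} \rangle_{\rm bool}$ is a $\sigma$-algebra. For instance, if $\mathcal{F}$ is the collection of all boxes in $\mathbf{R}^d$, then $\langle \mathcal{F} \rangle_{\rm bool}$ is the elementary algebra (Exercise 7), but $\langle \mathcal{F} \rangle$ cannot equal this algebra, as the elementary algebra is not a $\sigma$-algebra.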
Remark 4 From the definitions, it is clear that we have the following principle, somewhat analogous to the principle of mathematical induction: if $\mathcal{F}$ is a family of sets in $X$, and $P(E)$ is a property of sets $E \subset X$ which obeys the following axioms:
- $P(\emptyset)$ is true.
- $P(E)$ is true for all $E \in \mathcal{F}$.
- If $P(E)$ is true for some $E \subset X$, then $P(X \backslash E)$ is true also.
- If $E_1, E_2, \ldots \subset X$ are such that $P(E_n)$ is true for all $n$, then $P(\bigcup_{n=1}^\infty E_n)$ is true also.
Then one can conclude that $P(E)$ is true for all $E \in \langle \mathcal{F} \rangle$. Indeed, the set of all $E$ for which $P(E)$ holds is a $\sigma$-algebra that contains $\mathcal{F}$, whence the claim. This principle is particularly useful for establishing properties of Borel measurable sets (see below).
We now turn to an important example of a $\sigma$-algebra:
Definition 5 (Borel $\sigma$-algebra) Let $X$ be a metric space, or more generally a topological space. The Borel $\sigma$-algebra $\mathcal{B}[X]$ of $X$ is defined to be the $\sigma$-algebra generated by the open subsets of $X$. Elements of $\mathcal{B}[X]$ will be called Borel measurable.
Thus, for instance, the Borel $\sigma$-algebra contains the open sets, the closed sets (which are complements of open sets), the countable unions of closed sets (i.e. $F_\sigma$ sets), the countable intersections of open sets (i.e. $G_\delta$ sets), the countable intersections of $F_\sigma$ sets, and so forth.
In $\mathbf{R}^d$, every open set is Lebesgue measurable, and so we see that the Borel $\sigma$-algebra is coarser than the Lebesgue $\sigma$-algebra. We will shortly see, though, that the two $\sigma$-algebras are not equal.
We defined the Borel $\sigma$-algebra to be generated by the open sets. However, it is also generated by several other collections of sets:
Exercise 14 Show that the Borel $\sigma$-algebra $\mathcal{B}[\mathbf{R}^d]$ of a Euclidean space is generated by any of the following collections of sets:
- The open subsets of $\mathbf{R}^d$.
- The closed subsets of $\mathbf{R}^d$.
- The compact subsets of $\mathbf{R}^d$.
- The open balls of $\mathbf{R}^d$.
- The boxes in $\mathbf{R}^d$.
- The elementary sets in $\mathbf{R}^d$.
(Hint: To show that two families $\mathcal{F}, \mathcal{F}'$ of sets generate the same $\sigma$-algebra, it suffices to show that every $\sigma$-algebra that contains $\mathcal{F}$ contains $\mathcal{F}'$ also, and conversely.)
There is an analogue of Exercise 9, which illustrates the extent to which a generated $\sigma$-algebra is “larger” than the analogous generated Boolean algebra:
Exercise 15 (Recursive description of a generated $\sigma$-algebra) (This exercise requires familiarity with the theory of ordinals, which is reviewed here. Recall that we are assuming the axiom of choice throughout this course.) Let $\mathcal{F}$ be a collection of sets in a set $X$, and let $\omega_1$ be the first uncountable ordinal. Define the sets $\mathcal{F}_\alpha$ for every countable ordinal $\alpha \in \omega_1$ via transfinite induction as follows:
- $\mathcal{F}_0 := \mathcal{F}$.
- For each countable successor ordinal $\alpha = \beta + 1$, we define $\mathcal{F}_\alpha$ to be the collection of all sets that are either the union of an at most countable number of sets in $\mathcal{F}_\beta$ (including the empty union $\emptyset$), or the complement of such a union.
- For each countable limit ordinal $\alpha = \sup_{\beta < \alpha} \beta$, we define $\mathcal{F}_\alpha := \bigcup_{\beta < \alpha} \mathcal{F}_\beta$.
Show that $\langle \mathcal{F} \rangle = \bigcup_{\alpha \in \omega_1} \mathcal{F}_\alpha$.
Remark 5 The first uncountable ordinal $\omega_1$ will make several further cameo appearances throughout this course, for instance by generating counterexamples to various plausible statements in point-set topology. In the case when $\mathcal{F}$ is the collection of open sets in a topological space $X$, so that $\langle \mathcal{F} \rangle = \mathcal{B}[X]$, the sets $\mathcal{F}_\alpha$ are essentially the Borel hierarchy (which starts at the open and closed sets, then moves on to the $F_\sigma$ and $G_\delta$ sets, and so forth); these play an important role in descriptive set theory.
Exercise 16 (This exercise requires familiarity with the theory of cardinals.) Let $\mathcal{F}$ be an infinite family of subsets of $X$ of cardinality $\kappa$ (thus $\kappa$ is an infinite cardinal). Show that $\langle \mathcal{F} \rangle$ has cardinality at most $\kappa^{\aleph_0}$. (Hint: use Exercise 15.) In particular, show that the Borel $\sigma$-algebra $\mathcal{B}[\mathbf{R}^d]$ has cardinality at most $c := 2^{\aleph_0}$.
Conclude that there exist Jordan measurable (and hence Lebesgue measurable) subsets of $\mathbf{R}^d$ which are not Borel measurable. (Hint: How many subsets of the Cantor set are there?) Use this to place the Borel $\sigma$-algebra on the diagram that you drew for the dyadic algebras (Example 6).
Remark 6 Despite this demonstration that not all Lebesgue measurable subsets are Borel measurable, it is remarkably difficult (though not impossible) to exhibit a specific set that is not Borel measurable. Indeed, a large majority of the explicitly constructible sets that one actually encounters in practice tend to be Borel measurable, and one can view the property of Borel measurability intuitively as a kind of “constructibility” property. (As a very crude first approximation, one can view the Borel measurable sets as those sets of “countable descriptive complexity”; in contrast, sets of finite descriptive complexity tend to be Jordan measurable, assuming they are bounded, of course.)
Exercise 17 Let $E, F$ be Borel measurable subsets of $\mathbf{R}^{d_1}, \mathbf{R}^{d_2}$ respectively. Show that $E \times F$ is a Borel measurable subset of $\mathbf{R}^{d_1+d_2}$. (Hint: first establish this in the case when $F$ is a box, by using Remark 4. To obtain the general case, apply Remark 4 yet again.)
The above exercise has a partial converse:
Exercise 18 Let $E$ be a Borel measurable subset of $\mathbf{R}^{d_1+d_2}$.
- Show that for any $x_1 \in \mathbf{R}^{d_1}$, the slice $\{x_2 \in \mathbf{R}^{d_2}: (x_1, x_2) \in E\}$ is a Borel measurable subset of $\mathbf{R}^{d_2}$. Similarly, show that for every $x_2 \in \mathbf{R}^{d_2}$, the slice $\{x_1 \in \mathbf{R}^{d_1}: (x_1, x_2) \in E\}$ is a Borel measurable subset of $\mathbf{R}^{d_1}$.
- Give a counterexample to show that this claim is not true if “Borel” is replaced with “Lebesgue” throughout. (Hint: the Cartesian product of any set with a point is a null set, even if the first set was not measurable.)
Exercise 19 Show that the Lebesgue $\sigma$-algebra on $\mathbf{R}^d$ is generated by the union of the Borel $\sigma$-algebra and the null $\sigma$-algebra.
— 3. Countably additive measures and measure spaces —
Having set out the concepts of a $\sigma$-algebra and a measurable space, we now endow these structures with a measure.
We begin with the finitely additive theory, although this theory is too weak for our purposes and will soon be supplanted by the countably additive theory.
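Definition 6 (Finitely additive measure) Let $\mathcal{B}$ be a Boolean algebra on a space $X$. An (unsigned) finitely additive measure $\mu$ on $\mathcal{B}$ is a map $\mu: \mathcal{B} \rightarrow [0,+\infty]$ that obeys the following axioms:
- (Empty set) $\mu(\emptyset) = 0$.
- (Finite additivity) Whenever $E, F \in \mathcal{B}$ are disjoint, then $\mu(E \cup F) = \mu(E) + \mu(F)$.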
Remark 7 The empty set axiom is needed in order to rule out the degenerate situation in which every set (including the empty set) has infinite measure.
Example 8 Lebesgue measure $m$ is a finitely additive measure on the Lebesgue $\sigma$-algebra, and hence on all sub-algebras (such as the null algebra, the Jordan algebra, or the elementary algebra). In particular, Jordan measure and elementary measure are finitely additive (adopting the convention that co-Jordan measurable sets have infinite Jordan measure, and co-elementary sets have infinite elementary measure).
On the other hand, as we saw in previous notes, Lebesgue outer measure is not finitely additive on the discrete algebra, and Jordan outer measure is not finitely additive on the Lebesgue algebra.
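Example 9 (Dirac measure) Let $x \in X$ and $\mathcal{B}$ be an arbitrary Boolean algebra on $X$. Then the Dirac measure $\delta_x$ at $x$, defined by setting $\delta_x(E) := 1_E(x)$, is finitely additive.
Example 10 (Zero measure) The zero measure $0: E \mapsto 0$ is a finitely additive measure on any Boolean algebra.
Example 11 (Linear combinations of measures) If $\mathcal{B}$ is a Boolean algebra on $X$, and $\mu, \nu: \mathcal{B} \rightarrow [0,+\infty]$ are finitely additive measures on $\mathcal{B}$, then $\mu + \nu: E \mapsto \mu(E) + \nu(E)$ is also a finitely additive measure, as is $c\mu: E \mapsto c \times \mu(E)$ for any $c \in [0,+\infty]$. Thus, for instance, the sum of Lebesgue measure and a Dirac measure is also a finitely additive measure on the Lebesgue algebra (or on any of its sub-algebras).
Example 12 (Restriction of a measure) If $\mathcal{B}$ is a Boolean algebra on $X$, $\mu: \mathcal{B} \rightarrow [0,+\infty]$ is a finitely additive measure, and $Y$ is a $\mathcal{B}$-measurable subset of $X$, then the restriction $\mu\downharpoonright_Y: \mathcal{B}\downharpoonright_Y \rightarrow [0,+\infty]$ of $\mu$ to $Y$, defined by setting $\mu\downharpoonright_Y(E) := \mu(E)$ whenever $E \in \mathcal{B}\downharpoonright_Y$ (i.e. if $E \in \mathcal{B}$ and $E \subset Y$), is also a finitely additive measure.
Example 13 (Counting measure) If $\mathcal{B}$ is a Boolean algebra on $X$, then the function $\#: \mathcal{B} \rightarrow [0,+\infty]$ defined by setting $\#(E)$ to be the cardinality of $E$ if $E$ is finite, and $\#(E) := +\infty$ if $E$ is infinite, is a finitely additive measure, known as counting measure.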
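As with our definition of Boolean algebras and $\sigma$-algebras, we adopted a “minimalist” definition so that the axioms are easy to verify. But they imply several further useful properties:
Exercise 20 Let $\mu$ be a finitely additive measure on a Boolean algebra $\mathcal{B}$. Establish the following facts:
- (Monotonicity) If $E, F$ are $\mathcal{B}$-measurable and $E \subset F$, then $\mu(E) \leq \mu(F)$.
- (Finite additivity) If $k$ is a natural number, and $E_1, \ldots, E_k$ are $\mathcal{B}$-measurable and disjoint, then $\mu(E_1 \cup \ldots \cup E_k) = \mu(E_1) + \ldots + \mu(E_k)$.
- (Finite sub-additivity) If $k$ is a natural number, and $E_1, \ldots, E_k$ are $\mathcal{B}$-measurable, then $\mu(E_1 \cup \ldots \cup E_k) \leq \mu(E_1) + \ldots + \mu(E_k)$.
- (Inclusion-exclusion for two sets) If $E, F$ are $\mathcal{B}$-measurable, then $\mu(E \cup F) + \mu(E \cap F) = \mu(E) + \mu(F)$.
(Caution: remember that the cancellation law $a + c = b + c \implies a = b$ does not hold in $[0,+\infty]$ if $c$ is infinite, and so the use of cancellation (or subtraction) should be avoided if possible.)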
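The examples and properties above can be spot-checked on the discrete algebra of a small finite set. This is a minimal sketch under that assumption; `powerset`, `dirac`, `counting` and `is_finitely_additive` are hypothetical helper names, and the measure `mu` is the sum of counting measure and twice a Dirac mass.

```python
from itertools import combinations

def powerset(X):
    """The discrete algebra 2^X of a finite set X."""
    X = list(X)
    return [frozenset(c) for r in range(len(X) + 1) for c in combinations(X, r)]

def dirac(x):
    """Dirac measure at x: E -> 1_E(x)."""
    return lambda E: 1 if x in E else 0

def counting(E):
    """Counting measure on finite sets."""
    return len(E)

def is_finitely_additive(mu, algebra):
    """Check the empty set axiom and additivity on every disjoint pair."""
    if mu(frozenset()) != 0:
        return False
    return all(mu(E | F) == mu(E) + mu(F)
               for E in algebra for F in algebra if not (E & F))

X = {0, 1, 2, 3}
B = powerset(X)
mu = lambda E: counting(E) + 2 * dirac(0)(E)     # a linear combination of measures

print(is_finitely_additive(mu, B))                       # True
print(is_finitely_additive(lambda E: len(E) ** 2, B))    # False

# Inclusion-exclusion for two sets, written without subtraction.
print(all(mu(E | F) + mu(E & F) == mu(E) + mu(F) for E in B for F in B))  # True
```

The inclusion-exclusion identity is kept in its additive form, in line with the caution about avoiding cancellation.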
One can characterise measures completely for any finite algebra:
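Exercise 21 Let $\mathcal{B}$ be a finite Boolean algebra, generated by a finite family $A_1, \ldots, A_k$ of non-empty atoms. Show that for every finitely additive measure $\mu$ on $\mathcal{B}$ there exist $c_1, \ldots, c_k \in [0,+\infty]$ such that
$$\mu(E) = \sum_{1 \leq j \leq k: A_j \subset E} c_j.$$
Equivalently, if $x_j$ is a point in $A_j$ for each $1 \leq j \leq k$, then
$$\mu = \sum_{j=1}^k c_j \delta_{x_j}.$$
Furthermore, show that the $c_1, \ldots, c_k$ are uniquely determined by $\mu$.
This is about the limit of what one can say about finitely additive measures at this level of generality. We now specialise to the countably additive measures on $\sigma$-algebras.
Definition 7 (Countably additive measure) Let $(X, \mathcal{B})$ be a measurable space. An (unsigned) countably additive measure $\mu$ on $\mathcal{B}$, or measure for short, is a map $\mu: \mathcal{B} \rightarrow [0,+\infty]$ that obeys the following axioms:
- (Empty set) $\mu(\emptyset) = 0$.
- (Countable additivity) Whenever $E_1, E_2, \ldots \in \mathcal{B}$ are a countable sequence of disjoint measurable sets, then $\mu(\bigcup_{n=1}^\infty E_n) = \sum_{n=1}^\infty \mu(E_n)$.
A triplet $(X, \mathcal{B}, \mu)$, where $(X, \mathcal{B})$ is a measurable space and $\mu: \mathcal{B} \rightarrow [0,+\infty]$ is a countably additive measure, is known as a measure space.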
Note the distinction between a measure space and a measurable space. The latter has the capability to be equipped with a measure, but the former is actually equipped with a measure.
Example 14 Lebesgue measure is a countably additive measure on the Lebesgue $\sigma$-algebra, and hence on every sub-$\sigma$-algebra (such as the Borel $\sigma$-algebra).
Example 15 The Dirac measures from Example 9 are countably additive, as is counting measure.
Example 16 Any restriction of a countably additive measure to a measurable subspace is again countably additive.
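Exercise 22 (Countable combinations of measures) Let $(X, \mathcal{B})$ be a measurable space.
- If $\mu$ is a countably additive measure on $\mathcal{B}$, and $c \in [0,+\infty]$, then $c\mu$ is also countably additive.
- If $\mu_1, \mu_2, \ldots$ are a sequence of countably additive measures on $\mathcal{B}$, then the sum $\sum_{n=1}^\infty \mu_n: E \mapsto \sum_{n=1}^\infty \mu_n(E)$ is also a countably additive measure.
Note that countably additive measures are necessarily finitely additive (by padding out a finite union into a countable union using the empty set), and so countably additive measures inherit all the properties of finitely additive measures, such as monotonicity and finite subadditivity. But one also has additional properties:
Exercise 23 Let $(X, \mathcal{B}, \mu)$ be a measure space.
- (Countable subadditivity) If $E_1, E_2, \ldots$ are $\mathcal{B}$-measurable, then $\mu(\bigcup_{n=1}^\infty E_n) \leq \sum_{n=1}^\infty \mu(E_n)$.
- (Upwards monotone convergence) If $E_1 \subset E_2 \subset \ldots$ are $\mathcal{B}$-measurable, then
$$\mu\left(\bigcup_{n=1}^\infty E_n\right) = \lim_{n \rightarrow \infty} \mu(E_n) = \sup_n \mu(E_n).$$
- (Downwards monotone convergence) If $E_1 \supset E_2 \supset \ldots$ are $\mathcal{B}$-measurable, and $\mu(E_n) < \infty$ for at least one $n$, then
$$\mu\left(\bigcap_{n=1}^\infty E_n\right) = \lim_{n \rightarrow \infty} \mu(E_n) = \inf_n \mu(E_n).$$
Show that the downward monotone convergence claim can fail if the hypothesis that $\mu(E_n) < \infty$ for at least one $n$ is dropped. (Hint: copy the argument used for Exercise 10 in Notes 1.)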
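Exercise 24 (Dominated convergence for sets) Let $(X, \mathcal{B}, \mu)$ be a measure space. Let $E_1, E_2, \ldots$ be a sequence of $\mathcal{B}$-measurable sets that converge to another set $E$, in the sense that $1_{E_n}$ converges pointwise to $1_E$.
- Show that $E$ is also $\mathcal{B}$-measurable.
- If there exists a $\mathcal{B}$-measurable set $F$ of finite measure (i.e. $\mu(F) < \infty$) that contains all of the $E_n$, show that $\lim_{n \rightarrow \infty} \mu(E_n) = \mu(E)$. (Hint: Apply downward monotonicity to the sets $\bigcup_{n > N} (E_n \Delta E)$.)
- Show that the previous part of this exercise can fail if the hypothesis that all the $E_n$ are contained in a set of finite measure is omitted.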
Exercise 25 Let $X$ be an at most countable set with the discrete $\sigma$-algebra. Show that every measure $\mu$ on this measurable space can be uniquely represented in the form
$$\mu = \sum_{x \in X} c_x \delta_x$$
for some $c_x \in [0,+\infty]$, thus
$$\mu(E) = \sum_{x \in E} c_x$$
for all $E \subset X$. (This claim fails in the uncountable case, although showing this is slightly tricky.)
A null set of a measure space $(X, \mathcal{B}, \mu)$ is defined to be a $\mathcal{B}$-measurable set of measure zero. A sub-null set is any subset of a null set. A measure space is said to be complete if every sub-null set is a null set. Thus, for instance, the Lebesgue measure space $(\mathbf{R}^d, \mathcal{L}[\mathbf{R}^d], m)$ is complete, but the Borel measure space $(\mathbf{R}^d, \mathcal{B}[\mathbf{R}^d], m)$ is not (as can be seen from the solution to Exercise 16).
Completeness is a convenient property to have in some cases, particularly when dealing with properties that hold almost everywhere. Fortunately, it is fairly easy to modify any measure space to be complete:
Exercise 26 (Completion) Let $(X, \mathcal{B}, \mu)$ be a measure space. Show that there exists a unique refinement $(X, \overline{\mathcal{B}}, \overline{\mu})$, known as the completion of $(X, \mathcal{B}, \mu)$, which is the coarsest refinement of $(X, \mathcal{B}, \mu)$ that is complete. Furthermore, show that $\overline{\mathcal{B}}$ consists precisely of those sets that differ from a $\mathcal{B}$-measurable set by a $\mathcal{B}$-subnull set.
Exercise 27 Show that the Lebesgue measure space $(\mathbf{R}^d, \mathcal{L}[\mathbf{R}^d], m)$ is the completion of the Borel measure space $(\mathbf{R}^d, \mathcal{B}[\mathbf{R}^d], m)$.
— 4. Measurable functions, and integration on a measure space —
Now we are ready to define integration on measure spaces. We first need the notion of a measurable function, which is analogous to that of a continuous function in topology. Recall that a function $f: X \rightarrow Y$ between two topological spaces $X, Y$ is continuous if the inverse image $f^{-1}(U)$ of any open set $U$ is open. In a similar spirit, we have
Definition 8 Let $(X, \mathcal{B})$ be a measurable space, and let $f: X \rightarrow [0,+\infty]$ or $f: X \rightarrow \mathbf{C}$ be an unsigned or complex-valued function. We say that $f$ is measurable if $f^{-1}(U)$ is $\mathcal{B}$-measurable for every open subset $U$ of $[0,+\infty]$ or $\mathbf{C}$.
From Lemma 7 of Notes 2, we see that this generalises the notion of a Lebesgue measurable function.
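Exercise 28 Let $(X, \mathcal{B})$ be a measurable space.
- Show that a function $f: X \rightarrow [0,+\infty]$ is measurable if and only if the level sets $\{x \in X: f(x) > \lambda\}$ are $\mathcal{B}$-measurable for every $\lambda \in [0,+\infty)$.
- Show that an indicator function $1_E$ of a set $E \subset X$ is measurable if and only if $E$ itself is $\mathcal{B}$-measurable.
- Show that a function $f: X \rightarrow [0,+\infty]$ or $f: X \rightarrow \mathbf{C}$ is measurable if and only if $f^{-1}(E)$ is $\mathcal{B}$-measurable for every Borel-measurable subset $E$ of $[0,+\infty]$ or $\mathbf{C}$.
- Show that a function $f: X \rightarrow \mathbf{C}$ is measurable if and only if its real and imaginary parts are measurable.
- Show that a function $f: X \rightarrow \mathbf{R}$ is measurable if and only if the magnitudes $f_+ := \max(f, 0)$, $f_- := \max(-f, 0)$ of its positive and negative parts are measurable.
- If $f_n: X \rightarrow [0,+\infty]$ are a sequence of measurable functions that converge pointwise to a limit $f: X \rightarrow [0,+\infty]$, then show that $f$ is also measurable. Obtain the same claim if $[0,+\infty]$ is replaced by $\mathbf{C}$.
- If $f: X \rightarrow [0,+\infty]$ is measurable and $\phi: [0,+\infty] \rightarrow [0,+\infty]$ is continuous, show that $\phi \circ f$ is measurable. Obtain the same claim if $[0,+\infty]$ is replaced by $\mathbf{C}$.
- Show that the sum or product of two measurable functions in $[0,+\infty]$ or $\mathbf{C}$ is still measurable.
Remark 8 One can also view measurable functions in a more category theoretic fashion. Define a measurable morphism or measurable map $f$ from one measurable space $(X, \mathcal{B})$ to another $(Y, \mathcal{C})$ to be a function $f: X \rightarrow Y$ with the property that $f^{-1}(E)$ is $\mathcal{B}$-measurable for every $\mathcal{C}$-measurable set $E$. Then a measurable function $f: X \rightarrow [0,+\infty]$ or $f: X \rightarrow \mathbf{C}$ is the same thing as a measurable morphism from $X$ to $[0,+\infty]$ or $\mathbf{C}$, where the latter is equipped with the Borel $\sigma$-algebra. Also, one $\sigma$-algebra $\mathcal{B}$ on a space $X$ is coarser than another $\mathcal{B}'$ precisely when the identity map ${\rm id}_X: X \rightarrow X$ is a measurable morphism from $(X, \mathcal{B}')$ to $(X, \mathcal{B})$. The main purpose of adopting this viewpoint is that it is obvious that the composition of measurable morphisms is again a measurable morphism. This is important in those fields of mathematics, such as ergodic theory, in which one frequently wishes to compose measurable transformations (and in particular, to compose a transformation with itself repeatedly); but it will not play a major role in this course.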
Measurable functions are particularly easy to describe on atomic spaces:
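Exercise 29 Let $(X, \mathcal{B})$ be a measurable space that is atomic, thus $\mathcal{B} = \mathcal{A}((A_\alpha)_{\alpha \in I})$ for some partition $X = \bigcup_{\alpha \in I} A_\alpha$ of $X$ into disjoint non-empty atoms. Show that a function $f: X \rightarrow [0,+\infty]$ or $f: X \rightarrow \mathbf{C}$ is measurable if and only if it is constant on each atom, or equivalently if one has a representation of the form
$$f = \sum_{\alpha \in I} c_\alpha 1_{A_\alpha}$$
for some constants $c_\alpha$ in $[0,+\infty]$ or in $\mathbf{C}$ as appropriate. Furthermore, the $c_\alpha$ are uniquely determined by $f$.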
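The equivalence between measurability and being constant on atoms can be illustrated on a finite atomic space. The sketch below assumes a finite set with finitely many atoms and a finitely-valued function stored as a dictionary; in that setting it suffices to test the level sets $f^{-1}(\{v\})$, since every preimage is a finite union of them. All helper names are hypothetical.

```python
from itertools import combinations

def atomic_algebra(atoms):
    """All unions of sub-collections of the given disjoint atoms."""
    atoms = [frozenset(A) for A in atoms]
    return {frozenset().union(*c)
            for r in range(len(atoms) + 1) for c in combinations(atoms, r)}

def is_measurable(f, algebra):
    """Every level set f^{-1}({v}) must belong to the algebra."""
    return all(frozenset(x for x in f if f[x] == v) in algebra
               for v in set(f.values()))

def constant_on_atoms(f, atoms):
    """f takes exactly one value on each atom."""
    return all(len({f[x] for x in A}) == 1 for A in atoms)

atoms = [{1, 2}, {3}, {4, 5}]
B = atomic_algebra(atoms)
f = {1: 7, 2: 7, 3: 0, 4: 2, 5: 2}     # constant on each atom
g = {1: 7, 2: 1, 3: 0, 4: 2, 5: 2}     # splits the atom {1, 2}

print(is_measurable(f, B), constant_on_atoms(f, atoms))   # True True
print(is_measurable(g, B), constant_on_atoms(g, atoms))   # False False
```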
Exercise 30 (Egorov’s theorem) Let $(X, \mathcal{B}, \mu)$ be a finite measure space (so $\mu(X) < \infty$), and let $f_n: X \rightarrow \mathbf{C}$ be a sequence of measurable functions that converge pointwise almost everywhere to a limit $f: X \rightarrow \mathbf{C}$, and let $\varepsilon > 0$. Show that there exists a measurable set $E$ of measure at most $\varepsilon$ such that $f_n$ converges uniformly to $f$ outside of $E$. Give an example to show that the claim can fail when the measure $\mu$ is not finite.
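In Notes 2 we defined first a simple integral, then an unsigned integral, and then finally an absolutely convergent integral. We perform the same three stages here. We begin with the simple integral in the case when the $\sigma$-algebra $\mathcal{B}$ is finite:
Definition 9 (Simple integral) Let $(X, \mathcal{B}, \mu)$ be a measure space with $\mathcal{B}$ finite. By Exercise 4, $X$ is partitioned into a finite number of atoms $A_1, \ldots, A_n$. If $f: X \rightarrow [0,+\infty]$ is measurable, then by Exercise 29 it has a unique representation of the form
$$f = \sum_{i=1}^n c_i 1_{A_i}$$
for some $c_1, \ldots, c_n \in [0,+\infty]$. We then define the simple integral $\mathrm{Simp} \int_X f\ d\mu$ of $f$ by the formula
$$\mathrm{Simp} \int_X f\ d\mu := \sum_{i=1}^n c_i \mu(A_i).$$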
Note that, thanks to Exercise 3, the precise decomposition into atoms does not affect the definition of the simple integral.
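One could also define a simple integral for absolutely convergent complex-valued functions on a measurable space with a finite $\sigma$-algebra, but we will not need to do so here.
With this definition, it is clear that one has the monotonicity property
$$\mathrm{Simp} \int_X f\ d\mu \leq \mathrm{Simp} \int_X g\ d\mu$$
whenever $f \leq g$ are unsigned measurable, as well as the linearity properties
$$\mathrm{Simp} \int_X (f+g)\ d\mu = \mathrm{Simp} \int_X f\ d\mu + \mathrm{Simp} \int_X g\ d\mu$$
and
$$\mathrm{Simp} \int_X cf\ d\mu = c \times \mathrm{Simp} \int_X f\ d\mu$$
for unsigned measurable $f, g$ and $c \in [0,+\infty]$. We also make the following important technical observation:
Exercise 31 (Simple integral unaffected by refinements) Let $(X, \mathcal{B}, \mu)$ be a measure space, and let $(X, \mathcal{B}', \mu')$ be a refinement of $(X, \mathcal{B}, \mu)$, which means that $\mathcal{B}'$ contains $\mathcal{B}$ and $\mu': \mathcal{B}' \rightarrow [0,+\infty]$ agrees with $\mu: \mathcal{B} \rightarrow [0,+\infty]$ on $\mathcal{B}$. Suppose that both $\mathcal{B}, \mathcal{B}'$ are finite, and let $f: X \rightarrow [0,+\infty]$ be $\mathcal{B}$-measurable. Show that
$$\mathrm{Simp} \int_X f\ d\mu = \mathrm{Simp} \int_X f\ d\mu'.$$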
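The simple integral on a finite $\sigma$-algebra is a finite weighted sum over atoms, which is easy to sketch in code. The version below is an illustration only: the function is given by its values on the atoms and the measure by the measures of the atoms, and the convention $0 \times (+\infty) = 0$ is an assumption brought in from the usual arithmetic of $[0,+\infty]$, since it is not spelled out in this section. The name `simple_integral` is hypothetical.

```python
import math

def simple_integral(values, weights):
    """Sum of c_i * mu(A_i) over the atoms, with the convention 0 * (+inf) = 0."""
    total = 0.0
    for c, m in zip(values, weights):
        if c == 0 or m == 0:
            continue                 # assumed convention: 0 * (+inf) = 0
        total += c * m
    return total

weights = [1.5, math.inf, 0.5]       # measures of three atoms A_1, A_2, A_3
f = [2, 0, 5]                        # f vanishes on the atom of infinite measure
g = [1, 3, 0]

print(simple_integral(f, weights))   # 5.5
print(simple_integral(g, weights))   # inf

# Linearity on a space where all atoms have finite measure.
finite_weights = [1.5, 2.0, 0.5]
f_plus_g = [a + b for a, b in zip(f, g)]
print(simple_integral(f_plus_g, finite_weights)
      == simple_integral(f, finite_weights) + simple_integral(g, finite_weights))  # True
```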
This allows one to extend the simple integral to simple functions:
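Definition 10 (Integral of simple functions) An (unsigned) simple function $f: X \rightarrow [0,+\infty]$ on a measurable space $(X, \mathcal{B})$ is a measurable function that takes on finitely many values $a_1, \ldots, a_k$. Note that such a function is then automatically measurable with respect to at least one finite sub-$\sigma$-algebra $\mathcal{B}'$ of $\mathcal{B}$, namely the $\sigma$-algebra $\mathcal{B}'$ generated by the preimages $f^{-1}(\{a_1\}), \ldots, f^{-1}(\{a_k\})$ of $a_1, \ldots, a_k$. Given a measure $\mu$ on $(X, \mathcal{B})$, we then define the simple integral $\mathrm{Simp} \int_X f\ d\mu$ by the formula
$$\mathrm{Simp} \int_X f\ d\mu := \mathrm{Simp} \int_X f\ d\mu\downharpoonright_{\mathcal{B}'}$$
where $\mu\downharpoonright_{\mathcal{B}'}: \mathcal{B}' \rightarrow [0,+\infty]$ is the restriction of $\mu: \mathcal{B} \rightarrow [0,+\infty]$ to $\mathcal{B}'$.
Note that there could be multiple finite $\sigma$-algebras with respect to which $f$ is measurable, but Exercise 31 guarantees that all such extensions will give the same simple integral. Indeed, if $f$ were measurable with respect to two separate finite sub-$\sigma$-algebras $\mathcal{B}'$ and $\mathcal{B}''$ of $\mathcal{B}$, then it would also be measurable with respect to their common refinement $\mathcal{B}' \vee \mathcal{B}'' := \langle \mathcal{B}' \cup \mathcal{B}'' \rangle$, which is also finite (by Exercise 8), and then by Exercise 31, $\mathrm{Simp} \int_X f\ d\mu\downharpoonright_{\mathcal{B}'}$ and $\mathrm{Simp} \int_X f\ d\mu\downharpoonright_{\mathcal{B}''}$ are both equal to $\mathrm{Simp} \int_X f\ d\mu\downharpoonright_{\mathcal{B}' \vee \mathcal{B}''}$, and hence equal to each other.
From this we can deduce the following properties of the simple integral. As with the Lebesgue theory, we say that a property $P(x)$ of an element $x \in X$ of a measure space $(X, \mathcal{B}, \mu)$ holds $\mu$-almost everywhere if it holds outside of a sub-null set.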
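Exercise 32 (Basic properties of the simple integral) Let $(X, \mathcal{B}, \mu)$ be a measure space, and let $f, g: X \rightarrow [0,+\infty]$ be simple functions.
- (Monotonicity) If $f \leq g$ pointwise, then $\mathrm{Simp} \int_X f\ d\mu \leq \mathrm{Simp} \int_X g\ d\mu$.
- (Compatibility with measure) For every $\mathcal{B}$-measurable set $E$, we have $\mathrm{Simp} \int_X 1_E\ d\mu = \mu(E)$.
- (Homogeneity) For every $c \in [0,+\infty]$, one has $\mathrm{Simp} \int_X cf\ d\mu = c \times \mathrm{Simp} \int_X f\ d\mu$.
- (Finite additivity) $\mathrm{Simp} \int_X (f+g)\ d\mu = \mathrm{Simp} \int_X f\ d\mu + \mathrm{Simp} \int_X g\ d\mu$.
- (Insensitivity to refinement) If $(X, \mathcal{B}', \mu')$ is a refinement of $(X, \mathcal{B}, \mu)$ (as defined in Exercise 31), then $\mathrm{Simp} \int_X f\ d\mu = \mathrm{Simp} \int_X f\ d\mu'$.
- (Almost everywhere equivalence) If $f(x) = g(x)$ for $\mu$-almost every $x \in X$, then $\mathrm{Simp} \int_X f\ d\mu = \mathrm{Simp} \int_X g\ d\mu$.
- (Finiteness) $\mathrm{Simp} \int_X f\ d\mu < \infty$ if and only if $f$ is finite almost everywhere, and is supported on a set of finite measure.
- (Vanishing) $\mathrm{Simp} \int_X f\ d\mu = 0$ if and only if $f$ is zero almost everywhere.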
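Exercise 33 (Inclusion-exclusion principle) Let $(X, \mathcal{B}, \mu)$ be a measure space, and let $A_1, \ldots, A_n$ be $\mathcal{B}$-measurable sets of finite measure. Show that
$$\mu\left(\bigcup_{i=1}^n A_i\right) = \sum_{J \subset \{1,\ldots,n\}: J \neq \emptyset} (-1)^{|J|-1} \mu\left(\bigcap_{i \in J} A_i\right).$$
(Hint: Compute $\mathrm{Simp} \int_X (1 - \prod_{i=1}^n (1 - 1_{A_i}))\ d\mu$ in two different ways.)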
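The inclusion-exclusion principle can be checked numerically for counting measure on finite sets. The following is a small sketch under that assumption (the measure defaults to `len`, i.e. counting measure); `inclusion_exclusion` is a hypothetical helper name.

```python
from itertools import combinations

def inclusion_exclusion(sets, mu=len):
    """Alternating sum over non-empty index sets J of mu(intersection of A_i, i in J)."""
    n = len(sets)
    total = 0
    for r in range(1, n + 1):
        for J in combinations(range(n), r):
            inter = set.intersection(*(sets[i] for i in J))
            total += (-1) ** (r - 1) * mu(inter)
    return total

A = [{1, 2, 3}, {3, 4}, {1, 4, 5}]
union = set().union(*A)
print(inclusion_exclusion(A), len(union))   # 5 5
```

Here the single sets contribute $3 + 2 + 3$, the three pairwise intersections each contribute $-1$, and the triple intersection is empty, giving $5$.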
Remark 9 The simple integral could also be defined on finitely additive measure spaces, rather than countably additive ones, and all the above properties would still apply. However, on a finitely additive measure space one would have difficulty extending the integral beyond simple functions, as we will now do.
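From the simple integral, we can now define the unsigned integral, similarly to what was done for the unsigned Lebesgue integral in Notes 2:
Definition 11 Let $(X, \mathcal{B}, \mu)$ be a measure space, and let $f: X \rightarrow [0,+\infty]$ be measurable. Then we define the unsigned integral $\int_X f\ d\mu$ of $f$ by the formula
$$\int_X f\ d\mu := \sup_{0 \leq g \leq f;\ g \hbox{ simple}} \mathrm{Simp} \int_X g\ d\mu.$$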
Clearly, this definition generalises the corresponding definition in Definition 10 of Notes 2. Indeed, if is Lebesgue measurable, then .
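We record some easy properties of this integral:
Exercise 34 (Easy properties of the unsigned integral) Let $(X, \mathcal{B}, \mu)$ be a measure space, and let $f, g: X \to [0,+\infty]$ be measurable.
- (Almost everywhere equivalence) If $f = g$ $\mu$-almost everywhere, then $\int_X f\ d\mu = \int_X g\ d\mu$.
- (Monotonicity) If $f \leq g$ $\mu$-almost everywhere, then $\int_X f\ d\mu \leq \int_X g\ d\mu$.
- (Homogeneity) We have $\int_X cf\ d\mu = c \int_X f\ d\mu$ for every $c \in [0,+\infty]$.
- (Superadditivity) We have $\int_X (f+g)\ d\mu \geq \int_X f\ d\mu + \int_X g\ d\mu$.
- (Compatibility with the simple integral) If $f$ is simple, then $\int_X f\ d\mu = \mathrm{Simp}\int_X f\ d\mu$.
- (Markov’s inequality) For any $0 < \lambda < \infty$, one has
$$\mu(\{ x \in X: f(x) \geq \lambda \}) \leq \frac{1}{\lambda} \int_X f\ d\mu.$$
In particular, if $\int_X f\ d\mu < \infty$, then the sets $\{ x \in X: f(x) \geq \lambda \}$ have finite measure for each $\lambda > 0$.
- (Finiteness) If $\int_X f\ d\mu < \infty$, then $f(x)$ is finite for $\mu$-almost every $x$.
- (Vanishing) If $\int_X f\ d\mu = 0$, then $f(x)$ is zero for $\mu$-almost every $x$.
- (Vertical truncation) We have $\lim_{n \to \infty} \int_X \min(f, n)\ d\mu = \int_X f\ d\mu$.
- (Horizontal truncation) If $E_1 \subset E_2 \subset \ldots$ is an increasing sequence of $\mathcal{B}$-measurable sets, then
$$\lim_{n \to \infty} \int_X f 1_{E_n}\ d\mu = \int_X f 1_{\bigcup_{n=1}^\infty E_n}\ d\mu.$$
- (Restriction) If $Y$ is a measurable subset of $X$, then $\int_X f 1_Y\ d\mu = \int_Y f\downharpoonright_Y\ d\mu\downharpoonright_Y$, where $f\downharpoonright_Y: Y \to [0,+\infty]$ is the restriction of $f$ to $Y$, and the restriction $\mu\downharpoonright_Y$ was defined in Example 12. We will often abbreviate $\int_Y f\downharpoonright_Y\ d\mu\downharpoonright_Y$ (by slight abuse of notation) as $\int_Y f\ d\mu$.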
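To make Markov's inequality concrete, here is a small sketch on a finite measure space, where the unsigned integral reduces to a weighted sum. Markov's inequality bounds the measure of the superlevel set $\{f \geq \lambda\}$ by $\frac{1}{\lambda}\int_X f\ d\mu$. The point masses `mu`, the function `f`, and the helper names are all hypothetical choices made for illustration.

```python
from fractions import Fraction

# Hypothetical finite measure space: points 0..4 with point masses mu({x}).
mu = {0: Fraction(1, 2), 1: Fraction(1), 2: Fraction(3, 2), 3: Fraction(2), 4: Fraction(1, 4)}
# Hypothetical unsigned function on that space.
f = {0: 0, 1: 3, 2: 1, 3: 5, 4: 8}

def integral(f, mu):
    """Unsigned integral on a finite space: sum of f(x) * mu({x})."""
    return sum(f[x] * mu[x] for x in mu)

def superlevel_measure(f, mu, lam):
    """mu({x : f(x) >= lam})."""
    return sum(mu[x] for x in mu if f[x] >= lam)

def markov_holds(f, mu, lam):
    """Check mu({f >= lam}) <= (1/lam) * integral of f."""
    return superlevel_measure(f, mu, lam) <= integral(f, mu) / lam
```

Exact rational arithmetic (`Fraction`) is used so that the comparison is not affected by floating-point rounding.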
As before, one of the key properties of this integral is its additivity:
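Theorem 12 Let $(X, \mathcal{B}, \mu)$ be a measure space, and let $f, g: X \to [0,+\infty]$ be measurable. Then
$$\int_X (f+g)\ d\mu = \int_X f\ d\mu + \int_X g\ d\mu.$$
Proof: In view of super-additivity, it suffices to establish the sub-additivity property
$$\int_X (f+g)\ d\mu \leq \int_X f\ d\mu + \int_X g\ d\mu.$$
We establish this in stages. We first deal with the case when $\mu$ is a finite measure (which means that $\mu(X) < \infty$) and $f, g$ are bounded. Pick an $\varepsilon > 0$, and let $f_\varepsilon$ be $f$ rounded down to the nearest integer multiple of $\varepsilon$, and $f^\varepsilon$ be $f$ rounded up to the nearest integer multiple. Clearly, we have the pointwise bounds
$$f_\varepsilon(x) \leq f(x) \leq f^\varepsilon(x) \quad \text{and} \quad f^\varepsilon(x) - f_\varepsilon(x) \leq \varepsilon. \qquad (1)$$
Since $f$ is bounded, $f_\varepsilon$ and $f^\varepsilon$ are simple. Similarly define $g_\varepsilon, g^\varepsilon$. We then have the pointwise bound
$$f + g \leq f^\varepsilon + g^\varepsilon \leq f_\varepsilon + g_\varepsilon + 2\varepsilon,$$
hence by Exercise 34 and the properties of the simple integral,
$$\int_X (f+g)\ d\mu \leq \int_X (f_\varepsilon + g_\varepsilon + 2\varepsilon)\ d\mu = \mathrm{Simp}\int_X (f_\varepsilon + g_\varepsilon + 2\varepsilon)\ d\mu = \mathrm{Simp}\int_X f_\varepsilon\ d\mu + \mathrm{Simp}\int_X g_\varepsilon\ d\mu + 2\varepsilon \mu(X).$$
From (1) we conclude that
$$\int_X (f+g)\ d\mu \leq \int_X f\ d\mu + \int_X g\ d\mu + 2\varepsilon \mu(X).$$
Letting $\varepsilon \to 0$ and using the assumption that $\mu(X)$ is finite, we obtain the claim.
Now we continue to assume that $\mu$ is a finite measure, but now do not assume that $f, g$ are bounded. Then for any natural number $n$, we can use the previous case to deduce that
$$\int_X (\min(f,n) + \min(g,n))\ d\mu \leq \int_X \min(f,n)\ d\mu + \int_X \min(g,n)\ d\mu.$$
Since $\min(f+g, n) \leq \min(f,n) + \min(g,n)$, we conclude that
$$\int_X \min(f+g, n)\ d\mu \leq \int_X \min(f,n)\ d\mu + \int_X \min(g,n)\ d\mu.$$
Taking limits as $n \to \infty$ using vertical truncation, we obtain the claim.
Finally, we no longer assume that $\mu$ is of finite measure, and also do not require $f, g$ to be bounded. If either $\int_X f\ d\mu$ or $\int_X g\ d\mu$ is infinite, then by monotonicity, $\int_X (f+g)\ d\mu$ is infinite as well, and the claim follows; so we may assume that $\int_X f\ d\mu$ and $\int_X g\ d\mu$ are both finite. By Markov’s inequality, we conclude that for each natural number $n$, the set $E_n := \{ x \in X: f(x) > \frac{1}{n} \} \cup \{ x \in X: g(x) > \frac{1}{n} \}$ has finite measure. These sets are increasing in $n$, and $f, g, f+g$ are supported on $\bigcup_{n=1}^\infty E_n$, and so by horizontal truncation
$$\int_X (f+g)\ d\mu = \lim_{n \to \infty} \int_X (f+g) 1_{E_n}\ d\mu.$$
From the previous case, we have
$$\int_X (f+g) 1_{E_n}\ d\mu \leq \int_X f 1_{E_n}\ d\mu + \int_X g 1_{E_n}\ d\mu.$$
Letting $n \to \infty$ and using horizontal truncation we obtain the claim.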
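Exercise 35 (Linearity in $\mu$) Let $(X, \mathcal{B}, \mu)$ be a measure space, and let $f: X \to [0,+\infty]$ be measurable.
- Show that $\int_X f\ d(c\mu) = c \times \int_X f\ d\mu$ for every $c \in [0,+\infty]$.
- If $\mu_1, \mu_2, \ldots$ are a sequence of measures on $\mathcal{B}$, show that
$$\int_X f\ d\Big(\sum_{n=1}^\infty \mu_n\Big) = \sum_{n=1}^\infty \int_X f\ d\mu_n.$$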
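Exercise 36 (Change of variables formula) Let $(X, \mathcal{B}, \mu)$ be a measure space, and let $\phi: X \to Y$ be a measurable morphism (as defined in Remark 8) from $(X, \mathcal{B})$ to another measurable space $(Y, \mathcal{C})$. Define the pushforward $\phi_* \mu: \mathcal{C} \to [0,+\infty]$ of $\mu$ by $\phi$ by the formula $\phi_* \mu(E) := \mu(\phi^{-1}(E))$.
- Show that $\phi_* \mu$ is a measure on $\mathcal{C}$, so that $(Y, \mathcal{C}, \phi_* \mu)$ is a measure space.
- If $f: Y \to [0,+\infty]$ is measurable, show that $\int_Y f\ d\phi_* \mu = \int_X (f \circ \phi)\ d\mu$.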
(Hint: the quickest proof here is via the monotone convergence theorem below, but it is also possible to prove the exercise without this theorem.)
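The change of variables formula can be checked by hand in the discrete setting, where a measure is determined by its point masses and the pushforward simply moves the mass at $x$ to $\phi(x)$. The following is a minimal sketch under that assumption (finite set, discrete $\sigma$-algebra); the names `pushforward`, `integral`, and the sample data are illustrative only.

```python
from fractions import Fraction
from collections import defaultdict

def pushforward(mu, phi):
    """Pushforward of a discrete measure (dict: point -> mass) under the map phi."""
    nu = defaultdict(Fraction)  # Fraction() is 0, so masses accumulate from zero
    for x, mass in mu.items():
        nu[phi(x)] += mass      # mass at x is transported to phi(x)
    return dict(nu)

def integral(f, mu):
    """Integral of f against a discrete measure: sum of f(x) * mass(x)."""
    return sum(f(x) * mass for x, mass in mu.items())

# Hypothetical data: a measure on {-2,...,2}, the map phi(x) = x^2, and f(y) = y + 1.
mu = {-2: Fraction(1, 4), -1: Fraction(1, 2), 0: Fraction(1), 1: Fraction(1, 3), 2: Fraction(2)}
phi = lambda x: x * x
f = lambda y: y + 1

nu = pushforward(mu, phi)                   # measure on {0, 1, 4}
lhs = integral(f, nu)                       # integral of f against the pushforward
rhs = integral(lambda x: f(phi(x)), mu)     # integral of f composed with phi against mu
```

Here `lhs` and `rhs` agree because both sums group the same terms, which is the discrete shadow of the general argument.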
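Exercise 37 Let $T: \mathbb{R}^d \to \mathbb{R}^d$ be an invertible linear transformation, and let $m$ be Lebesgue measure on $\mathbb{R}^d$. Show that $T_* m = \frac{1}{|\det T|} m$, where the pushforward $T_* m$ of $m$ was defined in Exercise 36.
Exercise 38 (Sums as integrals) Let $X$ be an arbitrary set (with the discrete $\sigma$-algebra), let $\#$ be counting measure (see Exercise 13), and let $f: X \to [0,+\infty]$ be an arbitrary unsigned function. Show that $f$ is measurable with
$$\int_X f\ d\# = \sum_{x \in X} f(x).$$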
Once one has the unsigned integral, one can define the absolutely convergent integral exactly as in the Lebesgue case:
Definition 13 (Absolutely convergent integral) Let $(X, \mathcal{B}, \mu)$ be a measure space. A measurable function $f: X \to \mathbb{R}$ or $f: X \to \mathbb{C}$ is said to be absolutely integrable if the unsigned integral
$$\|f\|_{L^1(X, \mathcal{B}, \mu)} := \int_X |f|\ d\mu$$
is finite. We use $L^1(X, \mathcal{B}, \mu)$, $L^1(X)$, or $L^1(\mu)$ to denote the space of absolutely integrable functions. If $f$ is real-valued and absolutely integrable, we define the integral $\int_X f\ d\mu$ by the formula
$$\int_X f\ d\mu := \int_X f_+\ d\mu - \int_X f_-\ d\mu,$$
where $f_+ := \max(f, 0)$, $f_- := \max(-f, 0)$ are the magnitudes of the positive and negative components of $f$. If $f$ is complex-valued and absolutely integrable, we define the integral $\int_X f\ d\mu$ by the formula
$$\int_X f\ d\mu := \int_X \mathrm{Re}\, f\ d\mu + i \int_X \mathrm{Im}\, f\ d\mu,$$
where the two integrals on the right are interpreted as real-valued integrals. It is easy to see that the unsigned, real-valued, and complex-valued integrals defined in this manner are compatible on their common domains of definition.
Clearly, this definition generalises the corresponding definition in Definition 13 of Notes 2.
We record some of the key facts about the absolutely convergent integral:
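Exercise 39 Let $(X, \mathcal{B}, \mu)$ be a measure space.
- Show that $L^1(X, \mathcal{B}, \mu)$ is a complex vector space.
- Show that the integration map $f \mapsto \int_X f\ d\mu$ is a complex-linear map from $L^1(X, \mathcal{B}, \mu)$ to $\mathbb{C}$.
- Establish the triangle inequality $\|f+g\|_{L^1(\mu)} \leq \|f\|_{L^1(\mu)} + \|g\|_{L^1(\mu)}$ and the homogeneity property $\|cf\|_{L^1(\mu)} = |c| \|f\|_{L^1(\mu)}$ for all $f, g \in L^1(X, \mathcal{B}, \mu)$ and $c \in \mathbb{C}$.
- Show that if $f, g \in L^1(X, \mathcal{B}, \mu)$ are such that $f(x) = g(x)$ for $\mu$-almost every $x \in X$, then $\int_X f\ d\mu = \int_X g\ d\mu$.
- If $f \in L^1(X, \mathcal{B}, \mu)$, and $(X, \mathcal{B}', \mu')$ is a refinement of $(X, \mathcal{B}, \mu)$, then $f \in L^1(X, \mathcal{B}', \mu')$, and $\int_X f\ d\mu' = \int_X f\ d\mu$. (Hint: it is easy to get one inequality. To get the other inequality, first work in the case when $f$ is both bounded and has finite measure support (i.e. $f$ is both vertically and horizontally truncated).)
- Show that if $f \in L^1(X, \mathcal{B}, \mu)$, then $\|f\|_{L^1(\mu)} = 0$ if and only if $f$ is zero $\mu$-almost everywhere.
- If $Y \subset X$ is $\mathcal{B}$-measurable and $f \in L^1(X, \mathcal{B}, \mu)$, then $f\downharpoonright_Y \in L^1(Y, \mathcal{B}\downharpoonright_Y, \mu\downharpoonright_Y)$ and $\int_Y f\downharpoonright_Y\ d\mu\downharpoonright_Y = \int_X f 1_Y\ d\mu$. As before, by abuse of notation we write $\int_Y f\ d\mu$ for $\int_Y f\downharpoonright_Y\ d\mu\downharpoonright_Y$.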
— 5. The convergence theorems —
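Let $(X, \mathcal{B}, \mu)$ be a measure space, and let $f_1, f_2, \ldots: X \to [0,+\infty]$ be a sequence of measurable functions. Suppose that as $n \to \infty$, $f_n(x)$ converges pointwise either everywhere, or $\mu$-almost everywhere, to a measurable limit $f$. A basic question in the subject is to determine the conditions under which such pointwise convergence would imply convergence of the integral:
$$\int_X f_n\ d\mu \stackrel{?}{\to} \int_X f\ d\mu.$$
To put it another way: when can we ensure that one can interchange integrals and limits,
$$\lim_{n \to \infty} \int_X f_n\ d\mu \stackrel{?}{=} \int_X \lim_{n \to \infty} f_n\ d\mu?$$
There are certainly some cases in which one can safely do this:
Exercise 40 (Uniform convergence on a finite measure space) Suppose that $(X, \mathcal{B}, \mu)$ is a finite measure space (so $\mu(X) < \infty$), and $f_n: X \to [0,+\infty]$ (resp. $f_n: X \to \mathbb{C}$) are a sequence of unsigned measurable functions (resp. absolutely integrable functions) that converge uniformly to a limit $f$. Show that $\int_X f_n\ d\mu$ converges to $\int_X f\ d\mu$.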
However, there are also cases in which one cannot interchange limits and integrals, even when the are unsigned. We give the three classic examples, all of “moving bump” type, though the way in which the bump moves varies from example to example:
Example 17 (Escape to horizontal infinity) Let $X$ be the real line with Lebesgue measure, and let $f_n := 1_{[n,n+1]}$. Then $f_n$ converges pointwise to $f := 0$, but $\int_{\mathbb{R}} f_n(x)\ dx = 1$ does not converge to $\int_{\mathbb{R}} f(x)\ dx = 0$. Somehow, all the mass in the $f_n$ has escaped by moving off to infinity in a horizontal direction, leaving none behind for the pointwise limit $f$.
Example 18 (Escape to width infinity) Let $X$ be the real line with Lebesgue measure, and let $f_n := \frac{1}{n} 1_{[0,n]}$. Then $f_n$ now converges uniformly to $f := 0$, but $\int_{\mathbb{R}} f_n(x)\ dx = 1$ still does not converge to $\int_{\mathbb{R}} f(x)\ dx = 0$. Exercise 40 would prevent this from happening if all the $f_n$ were supported in a single set of finite measure, but the increasingly wide nature of the support of the $f_n$ prevents this from happening.
Example 19 (Escape to vertical infinity) Let $X$ be the unit interval $[0,1]$ with Lebesgue measure (restricted from $\mathbb{R}$), and let $f_n := n 1_{[\frac{1}{n}, \frac{2}{n}]}$. Now, we have finite measure, and $f_n$ converges pointwise to $f := 0$, but no uniform convergence. And again, $\int_{[0,1]} f_n(x)\ dx = 1$ is not converging to $\int_{[0,1]} f(x)\ dx = 0$. This time, the mass has escaped vertically, through the increasingly large values of $f_n$.
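The three moving bumps can be tabulated exactly. The sketch below assumes the standard choices $1_{[n,n+1]}$, $\frac{1}{n} 1_{[0,n]}$ and $n 1_{[1/n, 2/n]}$ for the horizontal, width and vertical escapes respectively, and represents each bump as a single step function $h \cdot 1_{[a,b]}$, whose integral is $h(b-a)$. The function names and the `kind` labels are hypothetical.

```python
from fractions import Fraction

def bump(kind, n):
    """Return (a, b, h) describing the step function h * 1_[a, b]."""
    n = Fraction(n)
    if kind == "horizontal":   # 1_[n, n+1]
        return (n, n + 1, Fraction(1))
    if kind == "width":        # (1/n) 1_[0, n]
        return (Fraction(0), n, 1 / n)
    if kind == "vertical":     # n 1_[1/n, 2/n]
        return (1 / n, 2 / n, n)
    raise ValueError(kind)

def integral(step):
    """Lebesgue integral of h * 1_[a, b], namely h * (b - a)."""
    a, b, h = step
    return h * (b - a)

def value(step, x):
    """Pointwise value of the step function at x."""
    a, b, h = step
    return h if a <= x <= b else Fraction(0)
```

In each family the integral is identically $1$, while the value at any fixed point (for instance $x = 1/2$) becomes small or zero once $n$ is large; this is the failure of interchanging limit and integral in miniature.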
Remark 10 From the perspective of time-frequency analysis (or perhaps more accurately, space-frequency analysis), these three escapes are analogous (though not quite identical) to escape to spatial infinity, escape to zero frequency, and escape to infinite frequency respectively, thus describing the three different ways in which phase space fails to be compact (if one excises the zero frequency as being singular).
However, once one shuts down these avenues of escape to infinity, it turns out that one can recover convergence of the integral. There are two major ways to accomplish this. One is to enforce monotonicity, which prevents each $f_{n+1}$ from abandoning the location where the mass of the preceding $f_n$ was concentrated and which thus shuts down the above three escape scenarios. More precisely, we have the monotone convergence theorem:
Theorem 14 (Monotone convergence theorem) Let $(X, \mathcal{B}, \mu)$ be a measure space, and let $0 \leq f_1 \leq f_2 \leq \ldots$ be a monotone non-decreasing sequence of unsigned measurable functions on $X$. Then we have
$$\lim_{n \to \infty} \int_X f_n\ d\mu = \int_X \lim_{n \to \infty} f_n\ d\mu.$$
Note that in the special case when each $f_n$ is an indicator function $f_n = 1_{E_n}$, this theorem collapses to the upwards monotone convergence property (Exercise 23.2). Conversely, the upwards monotone convergence property will play a key role in the proof of this theorem.
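Proof: Write $f := \lim_{n \to \infty} f_n = \sup_n f_n$, then $f: X \to [0,+\infty]$ is measurable. Since the $f_n$ are non-decreasing to $f$, we see from monotonicity that $\int_X f_n\ d\mu$ are non-decreasing and bounded above by $\int_X f\ d\mu$, which gives the bound
$$\lim_{n \to \infty} \int_X f_n\ d\mu \leq \int_X f\ d\mu.$$
It remains to establish the reverse inequality
$$\int_X f\ d\mu \leq \lim_{n \to \infty} \int_X f_n\ d\mu.$$
By definition, it suffices to show that
$$\int_X g\ d\mu \leq \lim_{n \to \infty} \int_X f_n\ d\mu$$
whenever $g$ is a simple function that is bounded pointwise by $f$. By vertical truncation we may assume without loss of generality that $g$ also is finite everywhere, then we can write
$$g = \sum_{i=1}^k c_i 1_{A_i}$$
for some $0 < c_i < \infty$ (discarding any terms with zero coefficient) and some disjoint $\mathcal{B}$-measurable sets $A_1, \ldots, A_k$, thus
$$\int_X g\ d\mu = \sum_{i=1}^k c_i \mu(A_i).$$
Let $0 < \varepsilon < 1$ be arbitrary. Then we have
$$f(x) = \sup_n f_n(x) > (1-\varepsilon) c_i$$
for all $x \in A_i$. Thus, if we define the sets
$$A_{i,n} := \{ x \in A_i: f_n(x) > (1-\varepsilon) c_i \}$$
then the $A_{i,n}$ increase to $A_i$ and are measurable. By upwards monotonicity of measure, we conclude that
$$\lim_{n \to \infty} \mu(A_{i,n}) = \mu(A_i).$$
On the other hand, observe the pointwise bound
$$f_n \geq \sum_{i=1}^k (1-\varepsilon) c_i 1_{A_{i,n}}$$
for any $n$; integrating this, we obtain
$$\int_X f_n\ d\mu \geq (1-\varepsilon) \sum_{i=1}^k c_i \mu(A_{i,n}).$$
Taking limits as $n \to \infty$, we obtain
$$\lim_{n \to \infty} \int_X f_n\ d\mu \geq (1-\varepsilon) \sum_{i=1}^k c_i \mu(A_i);$$
sending $\varepsilon \to 0$ we then obtain the claim.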
Remark 11 It is easy to see that the result still holds if the monotonicity only holds almost everywhere rather than everywhere.
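As a concrete illustration of the monotone convergence theorem (a worked example chosen here, not taken from the notes), take $X = (0,1]$ with Lebesgue measure, $f(x) := x^{-1/2}$, and the truncations $f_n := \min(f, n)$, which are non-decreasing in $n$ and increase pointwise to $f$. Assuming the usual agreement of the Lebesgue and Riemann integrals for these bounded piecewise continuous functions, both sides can be computed directly:

```latex
\begin{align*}
\int_{(0,1]} f_n\ dm
  &= \int_0^{1/n^2} n\ dx + \int_{1/n^2}^1 x^{-1/2}\ dx \\
  &= \frac{1}{n} + \left( 2 - \frac{2}{n} \right) = 2 - \frac{1}{n}, \\
\int_{(0,1]} f\ dm
  &= \lim_{n \to \infty} \int_{(0,1]} f_n\ dm = \lim_{n \to \infty} \left( 2 - \frac{1}{n} \right) = 2 .
\end{align*}
```

The integrals $2 - \frac{1}{n}$ increase to $2$, in agreement with the improper Riemann integral of $x^{-1/2}$ on $(0,1]$, even though $f$ itself is unbounded.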
Corollary 15 (Tonelli’s theorem for sums and integrals) Let $(X, \mathcal{B}, \mu)$ be a measure space, and let $f_1, f_2, \ldots: X \to [0,+\infty]$ be a sequence of unsigned measurable functions. Then one has
$$\int_X \sum_{n=1}^\infty f_n\ d\mu = \sum_{n=1}^\infty \int_X f_n\ d\mu.$$
Proof: Apply the monotone convergence theorem to the partial sums $F_N := \sum_{n=1}^N f_n$.
Exercise 41 Give an example to show that this corollary can fail if the $f_n$ are assumed to be absolutely integrable rather than unsigned measurable, even if the sum $\sum_{n=1}^\infty f_n(x)$ is absolutely convergent for each $x$. (Hint: think about the three escapes to infinity.)
Exercise 42 (Borel-Cantelli lemma) Let $(X, \mathcal{B}, \mu)$ be a measure space, and let $E_1, E_2, E_3, \ldots$ be a sequence of $\mathcal{B}$-measurable sets such that $\sum_{n=1}^\infty \mu(E_n) < \infty$. Show that almost every $x \in X$ is contained in at most finitely many of the $E_n$ (i.e. $\{ n \in \mathbb{N}: x \in E_n \}$ is finite for almost every $x \in X$). (Hint: Apply Tonelli’s theorem to the indicator functions $1_{E_n}$.)
Exercise 43
- Give an alternate proof of the Borel-Cantelli lemma (Exercise 42) that does not go through any of the convergence theorems, but instead exploits the more basic properties of measure from Exercise 23.
- Give a counterexample that shows that the Borel-Cantelli lemma can fail if the condition $\sum_{n=1}^\infty \mu(E_n) < \infty$ is relaxed to $\lim_{n \to \infty} \mu(E_n) = 0$.
Secondly, when one does not have monotonicity, one can at least obtain an important inequality, known as Fatou’s lemma:
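Corollary 16 (Fatou’s lemma) Let $(X, \mathcal{B}, \mu)$ be a measure space, and let $f_1, f_2, \ldots: X \to [0,+\infty]$ be a sequence of unsigned measurable functions. Then
$$\int_X \liminf_{n \to \infty} f_n\ d\mu \leq \liminf_{n \to \infty} \int_X f_n\ d\mu.$$
Proof: Write $F_N := \inf_{n \geq N} f_n$ for each $N$. Then the $F_N$ are measurable and non-decreasing, and hence by the monotone convergence theorem
$$\int_X \sup_{N > 0} \inf_{n \geq N} f_n\ d\mu = \sup_{N > 0} \int_X \inf_{n \geq N} f_n\ d\mu.$$
By definition of lim inf, we have $\sup_{N > 0} \inf_{n \geq N} f_n = \liminf_{n \to \infty} f_n$. By monotonicity, we have $\int_X \inf_{n \geq N} f_n\ d\mu \leq \int_X f_n\ d\mu$ for all $n \geq N$, and thus
$$\int_X \inf_{n \geq N} f_n\ d\mu \leq \inf_{n \geq N} \int_X f_n\ d\mu.$$
Hence we have
$$\int_X \liminf_{n \to \infty} f_n\ d\mu \leq \sup_{N > 0} \inf_{n \geq N} \int_X f_n\ d\mu.$$
The claim then follows by another appeal to the definition of lim inf.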
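The inequality in Fatou's lemma can be strict. A minimal sketch, on a hypothetical two-point probability space $\{a, b\}$ with mass $\frac{1}{2}$ at each point: let $f_n$ alternate between the indicator of $\{a\}$ and the indicator of $\{b\}$. The names below are illustrative; because the sequence is periodic, its lim inf coincides with the infimum over a finite window, which is what the code computes.

```python
from fractions import Fraction

# Hypothetical two-point measure space with equal masses.
mu = {"a": Fraction(1, 2), "b": Fraction(1, 2)}

def f(n):
    """Alternating indicators: 1_{a} for even n, 1_{b} for odd n."""
    return {"a": 1, "b": 0} if n % 2 == 0 else {"a": 0, "b": 1}

def integral(g):
    """Integral on the finite space: sum of g(x) * mu({x})."""
    return sum(g[x] * mu[x] for x in mu)

N = 20  # any window covering at least one full period (length 2) suffices
# For a periodic sequence, liminf equals the infimum over one period.
liminf_pointwise = {x: min(f(n)[x] for n in range(N)) for x in mu}
lhs = integral(liminf_pointwise)              # integral of liminf f_n
rhs = min(integral(f(n)) for n in range(N))   # liminf of the integrals
```

Here the pointwise lim inf vanishes identically, so `lhs` is $0$, while every $f_n$ has integral $\frac{1}{2}$: half of the mass is lost in the limit, but none is created.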
Remark 12 Informally, Fatou’s lemma tells us that when taking the pointwise limit of unsigned functions $f_n$, mass $\int_X f_n\ d\mu$ can be destroyed in the limit (as was the case in the three key moving bump examples), but it cannot be created in the limit. Of course the unsigned hypothesis is necessary here (consider for instance multiplying any of the moving bump examples by $-1$). While this lemma was stated only for pointwise limits, the same general principle (that mass can be destroyed, but not created, by the process of taking limits) tends to hold for other “weak” notions of convergence. We will see some instances of this in 245B.
Finally, we give the other major way to shut down loss of mass via escape to infinity, which is to dominate all of the functions involved by an absolutely convergent one. This result is known as the dominated convergence theorem:
Theorem 17 (Dominated convergence theorem) Let $(X, \mathcal{B}, \mu)$ be a measure space, and let $f_1, f_2, \ldots: X \to \mathbb{C}$ be a sequence of measurable functions that converge pointwise $\mu$-almost everywhere to a measurable limit $f: X \to \mathbb{C}$. Suppose that there is an unsigned absolutely integrable function $G: X \to [0,+\infty]$ such that $|f_n|$ are pointwise $\mu$-almost everywhere bounded by $G$ for each $n$. Then we have
$$\lim_{n \to \infty} \int_X f_n\ d\mu = \int_X f\ d\mu.$$
From the moving bump examples we see that this statement fails if there is no absolutely integrable dominating function $G$. The reader is encouraged to see why, in each of the moving bump examples, no such dominating function exists, without appealing to the above theorem. Note also that when each of the $f_n$ is an indicator function $f_n = 1_{E_n}$, the dominated convergence theorem collapses to Exercise 24.
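Proof: By modifying $f_n, f$ on a null set, we may assume without loss of generality that the $f_n$ converge to $f$ pointwise everywhere rather than $\mu$-almost everywhere, and similarly we can assume that $|f_n|$ are bounded by $G$ pointwise everywhere rather than $\mu$-almost everywhere.
By taking real and imaginary parts we may assume without loss of generality that $f_n, f$ are real, thus $-G \leq f_n \leq G$ pointwise. Of course, this implies that $-G \leq f \leq G$ pointwise also.
If we apply Fatou’s lemma to the unsigned functions $f_n + G$, we see that
$$\int_X (f + G)\ d\mu \leq \liminf_{n \to \infty} \int_X (f_n + G)\ d\mu,$$
which on subtracting the finite quantity $\int_X G\ d\mu$ gives
$$\int_X f\ d\mu \leq \liminf_{n \to \infty} \int_X f_n\ d\mu.$$
Similarly, if we apply that lemma to the unsigned functions $G - f_n$, we obtain
$$\int_X (G - f)\ d\mu \leq \liminf_{n \to \infty} \int_X (G - f_n)\ d\mu;$$
negating this inequality and then cancelling $\int_X G\ d\mu$ again we conclude that
$$\limsup_{n \to \infty} \int_X f_n\ d\mu \leq \int_X f\ d\mu.$$
The claim then follows by combining these inequalities.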
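To see how the dominated convergence theorem is typically applied (an illustrative example, not taken from the notes), consider on $\mathbb{R}$ with Lebesgue measure the functions $f_n(x) := \frac{n \sin(x/n)}{x(1+x^2)}$ for $x \neq 0$; the value at the null set $\{0\}$ is irrelevant. The elementary bound $|\sin t| \leq |t|$ supplies an absolutely integrable dominating function, and $\sin(t)/t \to 1$ as $t \to 0$ gives the pointwise limit:

```latex
\begin{align*}
|f_n(x)| &= \frac{|\sin(x/n)|}{|x/n|} \cdot \frac{1}{1+x^2}
  \leq \frac{1}{1+x^2} =: G(x),
  \qquad \int_{\mathbb{R}} G\ dm = \pi < \infty, \\
\lim_{n \to \infty} f_n(x) &= \frac{1}{1+x^2} \quad \text{for every } x \neq 0, \\
\lim_{n \to \infty} \int_{\mathbb{R}} f_n\ dm
  &= \int_{\mathbb{R}} \frac{dx}{1+x^2} = \pi .
\end{align*}
```

No explicit formula for $\int_{\mathbb{R}} f_n\ dm$ is needed; the domination by $G$ alone justifies passing the limit inside the integral.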
Remark 13 We deduced the dominated convergence theorem from Fatou’s lemma, and Fatou’s lemma from the monotone convergence theorem. However, one can obtain these theorems in a different order, depending on one’s taste, as they are so closely related. For instance, in Stein-Shakarchi, the logic is somewhat different; one first obtains the slightly simpler bounded convergence theorem, which is the dominated convergence theorem under the assumption that the functions are uniformly bounded and all supported on a single set of finite measure, and then uses that to deduce Fatou’s lemma, which in turn is used to deduce the monotone convergence theorem; and then the horizontal and vertical truncation properties are used to extend the bounded convergence theorem to the dominated convergence theorem. It is instructive to view a couple different derivations of these key theorems to get more of an intuitive understanding as to how they work.
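Exercise 44 Under the hypotheses of the dominated convergence theorem, establish also that $\|f_n - f\|_{L^1(\mu)} \to 0$ as $n \to \infty$.
Exercise 45 (Almost dominated convergence) Let $(X, \mathcal{B}, \mu)$ be a measure space, and let $f_1, f_2, \ldots: X \to \mathbb{C}$ be a sequence of measurable functions that converge pointwise $\mu$-almost everywhere to a measurable limit $f: X \to \mathbb{C}$. Suppose that there are unsigned absolutely integrable functions $G, g_1, g_2, \ldots: X \to [0,+\infty]$ such that the $|f_n|$ are pointwise $\mu$-almost everywhere bounded by $G + g_n$, and that $\int_X g_n\ d\mu \to 0$ as $n \to \infty$. Show that
$$\lim_{n \to \infty} \int_X f_n\ d\mu = \int_X f\ d\mu.$$
Exercise 46 (Defect version of Fatou’s lemma) Let $(X, \mathcal{B}, \mu)$ be a measure space, and let $f_1, f_2, \ldots: X \to [0,+\infty]$ be a sequence of unsigned absolutely integrable functions that converges pointwise to an absolutely integrable limit $f$. Show that
$$\int_X f_n\ d\mu - \int_X f\ d\mu - \|f - f_n\|_{L^1(\mu)} \to 0$$
as $n \to \infty$. (Hint: Apply the dominated convergence theorem to $\min(f_n, f)$.) Informally, this tells us that the gap between the left and right hand sides of Fatou’s lemma can be measured by the quantity $\|f - f_n\|_{L^1(\mu)}$.
Exercise 47 Let $(X, \mathcal{B}, \mu)$ be a measure space, and let $f: X \to [0,+\infty]$ be measurable. Show that the function $\mu_f: \mathcal{B} \to [0,+\infty]$ defined by the formula
$$\mu_f(E) := \int_X 1_E f\ d\mu = \int_E f\ d\mu$$
is a measure. (We will study such measures in greater detail in 245B.)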
The monotone convergence theorem is, in some sense, a defining property of the unsigned integral, as the following exercise illustrates.
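Exercise 48 (Characterisation of the unsigned integral) Let $(X, \mathcal{B})$ be a measurable space. Let $I: \mathcal{U}(X, \mathcal{B}) \to [0,+\infty]$ be a map from the space $\mathcal{U}(X, \mathcal{B})$ of unsigned measurable functions $f: X \to [0,+\infty]$ to $[0,+\infty]$ that obeys the following axioms:
- (Homogeneity) For every $f \in \mathcal{U}(X, \mathcal{B})$ and $c \in [0,+\infty]$, one has $I(cf) = c I(f)$.
- (Finite additivity) For every $f, g \in \mathcal{U}(X, \mathcal{B})$, one has $I(f+g) = I(f) + I(g)$.
- (Monotone convergence) If $0 \leq f_1 \leq f_2 \leq \ldots$ are a non-decreasing sequence of unsigned measurable functions, then $I(\lim_{n \to \infty} f_n) = \lim_{n \to \infty} I(f_n)$.
Then there exists a unique measure $\mu$ on $(X, \mathcal{B})$ such that $I(f) = \int_X f\ d\mu$ for all $f \in \mathcal{U}(X, \mathcal{B})$. Furthermore, $\mu$ is given by the formula $\mu(E) := I(1_E)$ for all $\mathcal{B}$-measurable sets $E$.
Exercise 49 Let $(X, \mathcal{B}, \mu)$ be a finite measure space (i.e. $\mu(X) < \infty$), and let $f: X \to \mathbb{R}$ be a bounded function. Suppose that $\mu$ is complete, which means that every sub-null set is a null set. Suppose that the upper integral
$$\overline{\int_X} f\ d\mu := \inf_{g \geq f;\ g \text{ simple}} \int_X g\ d\mu$$
and lower integral
$$\underline{\int_X} f\ d\mu := \sup_{g \leq f;\ g \text{ simple}} \int_X g\ d\mu$$
agree. Show that $f$ is measurable. (This is a converse to Exercise 11 of Notes 2.)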
We will continue to see the monotone convergence theorem, Fatou’s lemma, and the dominated convergence theorem make an appearance throughout the rest of this course sequence.
— 6. Probability spaces (optional) —
We now pause to isolate a special type of measure space, namely a probability space. As the name suggests, these spaces are of fundamental importance in the foundations of probability, although it should be emphasised that probability theory should not be viewed as the study of probability spaces, as these are merely models for the true objects of study of that theory, namely the behaviour of random events and random variables. (See this post for further discussion of this point.) This course will not be focused on applications to probability theory, but other courses (such as the Math 275 sequence at UCLA) will certainly be taking several results from measure theory (e.g. the Borel-Cantelli lemma, Exercise 42) and transferring them to a probabilistic context in order to apply them to problems of interest in probability theory.
Definition 18 (Probability space) A probability space is a measure space $(\Omega, \mathcal{F}, \mathbf{P})$ of total measure $1$: $\mathbf{P}(\Omega) = 1$. The measure $\mathbf{P}$ is known as a probability measure.
Note the change of notation: whereas measure spaces are traditionally denoted by symbols such as $(X, \mathcal{B}, \mu)$, probability spaces are traditionally denoted by symbols such as $(\Omega, \mathcal{F}, \mathbf{P})$. Of course, such notational changes have no impact on the underlying mathematical formalism, but they reflect the different cultures of measure theory and probability theory. In particular, the various components $\Omega$, $\mathcal{F}$, $\mathbf{P}$ carry the following interpretations in probability theory, that are absent in other applications of measure theory:
- The space $\Omega$ is known as the sample space, and is interpreted as the set of all possible states $\omega \in \Omega$ that a random system could be in.
- The $\sigma$-algebra $\mathcal{F}$ is known as the event space, and is interpreted as the set of all possible events $E \in \mathcal{F}$ that one can measure.
- The measure $\mathbf{P}(E)$ of an event $E$ is known as the probability of that event.
The various axioms of a probability space then formalise the foundational axioms of probability, as set out by Kolmogorov.
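Example 20 (Normalised measure) Given any measure space $(X, \mathcal{B}, \mu)$ with $0 < \mu(X) < +\infty$, the space $(X, \mathcal{B}, \frac{1}{\mu(X)} \mu)$ is a probability space. For instance, if $\Omega$ is a non-empty finite set with the discrete $\sigma$-algebra $2^\Omega$ and the counting measure $\#$, then the normalised counting measure $\frac{1}{\#\Omega} \#$ is a probability measure (known as the (discrete) uniform probability measure on $\Omega$), and $(\Omega, 2^\Omega, \frac{1}{\#\Omega} \#)$ is a probability space. In probability theory, this probability space models the act of drawing an element of the discrete set $\Omega$ uniformly at random.
Similarly, if $\Omega \subset \mathbb{R}^d$ is a Lebesgue measurable set of positive finite Lebesgue measure, $0 < m(\Omega) < \infty$, then $(\Omega, \mathcal{L}[\mathbb{R}^d]\downharpoonright_\Omega, \frac{1}{m(\Omega)} m\downharpoonright_\Omega)$ is a probability space. The probability measure $\frac{1}{m(\Omega)} m\downharpoonright_\Omega$ is known as the (continuous) uniform probability measure on $\Omega$. In probability theory, this probability space models the act of drawing an element of the continuous set $\Omega$ uniformly at random.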
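Example 21 (Discrete and continuous probability measures) If $\Omega$ is a (possibly infinite) non-empty set with the discrete $\sigma$-algebra $2^\Omega$, and if $(p_\omega)_{\omega \in \Omega}$ are a collection of real numbers in $[0,1]$ with $\sum_{\omega \in \Omega} p_\omega = 1$, then the measure $\mathbf{P}$ defined by $\mathbf{P} := \sum_{\omega \in \Omega} p_\omega \delta_\omega$, or in other words
$$\mathbf{P}(E) := \sum_{\omega \in E} p_\omega,$$
is indeed a probability measure, and $(\Omega, 2^\Omega, \mathbf{P})$ is a probability space. The function $\omega \mapsto p_\omega$ is known as the (discrete) probability distribution of the state variable $\omega$.
Similarly, if $\Omega$ is a Lebesgue measurable subset of $\mathbb{R}^d$ of positive (and possibly infinite) measure, and $f: \Omega \to [0,+\infty]$ is a Lebesgue measurable function on $\Omega$ (where of course we restrict the Lebesgue measure space on $\mathbb{R}^d$ to $\Omega$ in the usual fashion) with $\int_\Omega f(x)\ dx = 1$, then $(\Omega, \mathcal{L}[\mathbb{R}^d]\downharpoonright_\Omega, \mathbf{P})$ is a probability space, where $\mathbf{P} := m_f$ is the measure
$$\mathbf{P}(E) := \int_\Omega 1_E(x) f(x)\ dx = \int_E f(x)\ dx.$$
The function $f$ is known as the (continuous) probability density of the state variable $\omega$. (This density is not quite unique, since one can modify it on a set of probability zero, but it is well-defined up to this ambiguity. We will return to this point in 245B.)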
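Exercise 50 (No translation-invariant random integer) Show that there is no probability measure $\mathbf{P}$ on the integers $\mathbb{Z}$ with the discrete $\sigma$-algebra $2^{\mathbb{Z}}$ with the translation-invariance property $\mathbf{P}(E + n) = \mathbf{P}(E)$ for every event $E \in 2^{\mathbb{Z}}$ and every integer $n$.
Exercise 51 (No translation-invariant random real) Show that there is no probability measure $\mathbf{P}$ on the reals $\mathbb{R}$ with the Lebesgue $\sigma$-algebra $\mathcal{L}[\mathbb{R}]$ with the translation-invariance property $\mathbf{P}(E + x) = \mathbf{P}(E)$ for every event $E \in \mathcal{L}[\mathbb{R}]$ and every real $x$.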
Many concepts in measure theory are of importance in probability theory, although the terminology is changed to reflect the different perspective on the subject. For instance, the notion of a property holding almost everywhere is now replaced with that of a property holding almost surely. A measurable function is now referred to as a random variable and is often denoted by symbols such as $X$, and the integral $\int_\Omega X\ d\mathbf{P}$ of that function on the probability space (if the random variable is unsigned or absolutely convergent) is known as the expectation of that random variable, and is denoted $\mathbf{E}(X)$. Thus, for instance, the Borel-Cantelli lemma (Exercise 42) now reads as follows: given any sequence $E_1, E_2, E_3, \ldots$ of events such that $\sum_{n=1}^\infty \mathbf{P}(E_n) < \infty$, it is almost surely true that at most finitely many of these events hold.
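The dictionary between measure-theoretic and probabilistic language is easiest to see on a finite sample space, where the probability of an event is a sum of point probabilities and the expectation of a random variable is a weighted sum. The sketch below uses a fair six-sided die as a hypothetical example; the names `omega`, `p`, `prob` and `expectation` are illustrative and not notation from the notes.

```python
from fractions import Fraction

# Hypothetical sample space: outcomes of a fair six-sided die.
omega = range(1, 7)
# Discrete probability distribution: each outcome has probability 1/6.
p = {w: Fraction(1, 6) for w in omega}

def prob(event):
    """Probability of an event E: the sum of p_w over outcomes w in E."""
    return sum(p[w] for w in event)

def expectation(X):
    """Expectation of a random variable X: the integral of X against P,
    which on a finite sample space is the weighted sum of X(w) * p_w."""
    return sum(X(w) * p[w] for w in p)
```

For instance, `prob({2, 4, 6})` is the probability of rolling an even number, and `expectation(lambda w: w)` is the expected value of the roll.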
In later notes, when we develop the machinery of product measures and other tools to construct measures, we will see some more interesting examples of probability spaces, which would correspond in probability theory to random processes that are generated by an infinite number of independent random sources.
The following exercise will be moved to a more suitable location in the published version of the notes, but is here currently so as not to disrupt the exercise numbering.
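Exercise 52 (Approximation by an algebra) Let $\mathcal{A}$ be a Boolean algebra on $X$, and let $\mu$ be a measure on $\langle \mathcal{A} \rangle$.
- If $\mu(X) < \infty$, show that for every $E \in \langle \mathcal{A} \rangle$ and $\varepsilon > 0$ there exists $F \in \mathcal{A}$ such that $\mu(E \Delta F) < \varepsilon$.
- More generally, if $X = \bigcup_{n=1}^\infty A_n$ for some $A_1, A_2, \ldots \in \mathcal{A}$ with $\mu(A_n) < \infty$ for all $n$, $E \in \langle \mathcal{A} \rangle$ has finite measure, and $\varepsilon > 0$, show that there exists $F \in \mathcal{A}$ such that $\mu(E \Delta F) < \varepsilon$.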