In this course so far, we have focused primarily on one specific example of a countably additive measure, namely Lebesgue measure. This measure was constructed from a more primitive concept of Lebesgue outer measure, which in turn was constructed from the even more primitive concept of elementary measure.
It turns out that both of these constructions can be abstracted. In this set of notes, we will give the Carathéodory lemma, which constructs a countably additive measure from any abstract outer measure; this generalises the construction of Lebesgue measure from Lebesgue outer measure. One can in turn construct outer measures from another concept known as a pre-measure, of which elementary measure is a typical example.
With these tools, one can start constructing many more measures, such as Lebesgue-Stieltjes measures, product measures, and Hausdorff measures. With a little more effort, one can also establish the Kolmogorov extension theorem, which allows one to construct a variety of measures on infinite-dimensional spaces, and is of particular importance in the foundations of probability theory, as it allows one to set up probability spaces associated to both discrete and continuous random processes, even if they have infinite length.
The most important result about product measure, beyond the fact that it exists, is that one can use it to evaluate iterated integrals, and to interchange their order, provided that the integrand is either unsigned or absolutely integrable. This fact is known as the Fubini-Tonelli theorem, and is an absolutely indispensable tool for computing integrals, and for deducing higher-dimensional results from lower-dimensional ones.
We remark that these notes omit a very important way to construct measures, namely the Riesz representation theorem, but we will defer discussion of this theorem to 245B.
This is the final set of notes in this sequence. If time permits, the course will then begin covering the 245B notes, starting with the material on signed measures and the Radon-Nikodym-Lebesgue theorem.
— 1. Outer measures and the Carathéodory extension theorem —
We begin with the abstract concept of an outer measure.
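Definition 1 (Abstract outer measure) Let $X$ be a set. An abstract outer measure (or outer measure for short) is a map $\mu^*: 2^X \to [0,+\infty]$ that assigns an unsigned extended real number $\mu^*(E) \in [0,+\infty]$ to every set $E \subset X$ which obeys the following axioms:
- (Empty set) $\mu^*(\emptyset) = 0$.
- (Monotonicity) If $E \subset F$, then $\mu^*(E) \leq \mu^*(F)$.
- (Countable subadditivity) If $E_1, E_2, \ldots \subset X$ is a countable sequence of subsets of $X$, then $\mu^*(\bigcup_{n=1}^\infty E_n) \leq \sum_{n=1}^\infty \mu^*(E_n)$.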
Outer measures are also known as exterior measures.
Thus, for instance, Lebesgue outer measure $m^*$ is an outer measure (see Exercise 4 of Notes 1). On the other hand, Jordan outer measure $m^{*,(J)}$ is only finitely subadditive rather than countably subadditive and thus is not, strictly speaking, an outer measure; for this reason this concept is often referred to as Jordan outer content rather than Jordan outer measure.
Note that outer measures are weaker than measures in that they are merely countably subadditive, rather than countably additive. On the other hand, they are able to measure all subsets of $X$, whereas measures can only measure a $\sigma$-algebra of measurable sets.
In Definition 1 of Notes 1, we used Lebesgue outer measure together with the notion of an open set to define the concept of Lebesgue measurability. This definition is not available in our more abstract setting, as we do not necessarily have the notion of an open set. An alternative definition of measurability was put forth in Exercise 17 of Notes 1, but this still required the notion of a box or an elementary set, which is still not available in this setting. Nevertheless, we can modify that definition to give an abstract definition of measurability:
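Definition 2 (Carathéodory measurability) Let $\mu^*$ be an outer measure on a set $X$. A set $E \subset X$ is said to be Carathéodory measurable with respect to $\mu^*$ if one has
$$\mu^*(A) = \mu^*(A \cap E) + \mu^*(A \backslash E)$$
for every set $A \subset X$.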
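On a finite set, the Carathéodory condition $\mu^*(A) = \mu^*(A \cap E) + \mu^*(A \backslash E)$ can be tested by brute force over all test sets $A$. The following is a minimal sketch of such a check, not part of the original notes; the outer measure `toy_outer` (equal to 0 on the empty set, 2 on the whole space, and 1 otherwise) and the helper names are hypothetical choices made purely for illustration.

```python
from itertools import combinations

def powerset(universe):
    """All subsets of a finite universe, as frozensets."""
    items = sorted(universe)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def caratheodory_measurable(universe, outer):
    """Sets E with outer(A) == outer(A & E) + outer(A - E) for every test set A."""
    sets = powerset(universe)
    return [E for E in sets
            if all(outer(A) == outer(A & E) + outer(A - E) for A in sets)]

X = frozenset({0, 1, 2})

def toy_outer(E):
    # Hypothetical outer measure: monotone and subadditive, but far from additive.
    if not E:
        return 0
    return 2 if E == X else 1

measurable = caratheodory_measurable(X, toy_outer)
```

For this particular outer measure only the empty set and $X$ pass the test, which illustrates that the collection of Carathéodory measurable sets can be very small. By contrast, passing `len` (counting measure, which is already additive) as the outer measure makes every subset measurable.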
Exercise 3 (Null sets are Carathéodory measurable) Suppose that $E$ is a null set for an outer measure $\mu^*$ (i.e. $\mu^*(E) = 0$). Show that $E$ is Carathéodory measurable with respect to $\mu^*$.
Exercise 4 (Compatibility with Lebesgue measurability) Show that a set $E \subset \mathbb{R}^d$ is Carathéodory measurable with respect to Lebesgue outer measure if and only if it is Lebesgue measurable. (Hint: one direction follows from Exercise 17 of Notes 1. For the other direction, first verify simple cases, such as when $E$ is a box, or when $E$ or $A$ are bounded.)
The construction of Lebesgue measure can then be abstracted as follows:
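Theorem 5 (Carathéodory lemma) Let $\mu^*: 2^X \to [0,+\infty]$ be an outer measure on a set $X$, let $\mathcal{B}$ be the collection of all subsets of $X$ that are Carathéodory measurable with respect to $\mu^*$, and let $\mu: \mathcal{B} \to [0,+\infty]$ be the restriction of $\mu^*$ to $\mathcal{B}$ (thus $\mu(E) := \mu^*(E)$ whenever $E \in \mathcal{B}$). Then $\mathcal{B}$ is a $\sigma$-algebra, and $\mu$ is a measure.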
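Proof: We begin with the $\sigma$-algebra property. It is easy to see that the empty set lies in $\mathcal{B}$, and that the complement of a set in $\mathcal{B}$ lies in $\mathcal{B}$ also. Next, we verify that $\mathcal{B}$ is closed under finite unions (which will make $\mathcal{B}$ a Boolean algebra). Let $E, F \in \mathcal{B}$, and let $A \subset X$ be arbitrary. By definition, it suffices to show that
$$\mu^*(A) = \mu^*(A \cap (E \cup F)) + \mu^*(A \backslash (E \cup F)). \qquad (1)$$
To simplify the notation, we partition $A$ into the four disjoint sets
$$A_{00} := A \backslash (E \cup F); \quad A_{10} := (A \backslash F) \cap E; \quad A_{01} := (A \backslash E) \cap F; \quad A_{11} := A \cap E \cap F$$
(the reader may wish to draw a Venn diagram here to understand the nature of these sets). Thus (1) becomes
$$\mu^*(A_{00} \cup A_{01} \cup A_{10} \cup A_{11}) = \mu^*(A_{01} \cup A_{10} \cup A_{11}) + \mu^*(A_{00}). \qquad (2)$$
On the other hand, from the Carathéodory measurability of $E$, one has
$$\mu^*(A_{00} \cup A_{01} \cup A_{10} \cup A_{11}) = \mu^*(A_{00} \cup A_{01}) + \mu^*(A_{10} \cup A_{11})$$
and
$$\mu^*(A_{01} \cup A_{10} \cup A_{11}) = \mu^*(A_{01}) + \mu^*(A_{10} \cup A_{11}),$$
while from the Carathéodory measurability of $F$ one has
$$\mu^*(A_{00} \cup A_{01}) = \mu^*(A_{00}) + \mu^*(A_{01});$$
putting these identities together we obtain (2). (Note that no subtraction is employed here, and so the arguments still work when some sets have infinite outer measure.)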
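Now we verify that $\mathcal{B}$ is a $\sigma$-algebra. As it is already a Boolean algebra, it suffices (see Exercise 6 below) to verify that $\mathcal{B}$ is closed with respect to countable disjoint unions. Thus, let $E_1, E_2, \ldots$ be a sequence of disjoint Carathéodory-measurable sets, and let $A \subset X$ be arbitrary. We wish to show that
$$\mu^*(A) = \mu^*(A \cap \bigcup_{n=1}^\infty E_n) + \mu^*(A \backslash \bigcup_{n=1}^\infty E_n).$$
In view of subadditivity, it suffices to show that
$$\mu^*(A) \geq \mu^*(A \cap \bigcup_{n=1}^\infty E_n) + \mu^*(A \backslash \bigcup_{n=1}^\infty E_n).$$
For any $N \geq 1$, $\bigcup_{n=1}^N E_n$ is Carathéodory measurable (as $\mathcal{B}$ is a Boolean algebra), and so
$$\mu^*(A) \geq \mu^*(A \cap \bigcup_{n=1}^N E_n) + \mu^*(A \backslash \bigcup_{n=1}^N E_n).$$
By monotonicity, $\mu^*(A \backslash \bigcup_{n=1}^N E_n) \geq \mu^*(A \backslash \bigcup_{n=1}^\infty E_n)$. Taking limits as $N \to \infty$, it thus suffices to show that
$$\mu^*(A \cap \bigcup_{n=1}^\infty E_n) \leq \lim_{N \to \infty} \mu^*(A \cap \bigcup_{n=1}^N E_n).$$
But by the Carathéodory measurability of $\bigcup_{n=1}^N E_n$ (and the disjointness of the $E_n$), we have
$$\mu^*(A \cap \bigcup_{n=1}^{N+1} E_n) = \mu^*(A \cap \bigcup_{n=1}^N E_n) + \mu^*(A \cap E_{N+1})$$
for any $N \geq 0$, and thus on iteration
$$\lim_{N \to \infty} \mu^*(A \cap \bigcup_{n=1}^N E_n) = \sum_{N=0}^\infty \mu^*(A \cap E_{N+1}).$$
On the other hand, from countable subadditivity one has
$$\mu^*(A \cap \bigcup_{n=1}^\infty E_n) \leq \sum_{N=0}^\infty \mu^*(A \cap E_{N+1})$$
and the claim follows.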
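Finally, we show that $\mu$ is a measure. It is clear that $\mu(\emptyset) = 0$, so it suffices to establish countable additivity, thus we need to show that
$$\mu^*(\bigcup_{n=1}^\infty E_n) = \sum_{n=1}^\infty \mu^*(E_n)$$
whenever $E_1, E_2, \ldots$ are Carathéodory-measurable and disjoint. By subadditivity it suffices to show that
$$\mu^*(\bigcup_{n=1}^\infty E_n) \geq \sum_{n=1}^\infty \mu^*(E_n).$$
By monotonicity it suffices to show that
$$\mu^*(\bigcup_{n=1}^N E_n) = \sum_{n=1}^N \mu^*(E_n)$$
for any finite $N$. But from the Carathéodory measurability of $\bigcup_{n=1}^N E_n$ one has
$$\mu^*(\bigcup_{n=1}^{N+1} E_n) = \mu^*(\bigcup_{n=1}^N E_n) + \mu^*(E_{N+1})$$
for any $N \geq 0$, and the claim follows from induction.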
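Exercise 6 Let $\mathcal{B}$ be a Boolean algebra on a set $X$. Show that $\mathcal{B}$ is a $\sigma$-algebra if and only if it is closed under countable disjoint unions, which means that $\bigcup_{n=1}^\infty E_n \in \mathcal{B}$ whenever $E_1, E_2, E_3, \ldots$ are a countable sequence of disjoint sets in $\mathcal{B}$.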
Remark 7 Note that the above theorem, combined with Exercise 4, gives a slightly different way to construct Lebesgue measure from Lebesgue outer measure than the construction given in Notes 1. This is arguably a more efficient way to proceed, but is also less geometrically intuitive than the approach taken in Notes 1.
Remark 8 From Exercise 3 we see that the measure $\mu$ constructed by the Carathéodory lemma is automatically complete, in the sense that any sub-null set for $\mu$ (a subset of a null set for $\mu$) is also a null set.
Remark 9 In 245C we will give an important example of a measure constructed by Carathéodory’s lemma, namely the $d$-dimensional Hausdorff measure $\mathcal{H}^d$ on $\mathbb{R}^n$ that is good for measuring the size of $d$-dimensional subsets of $\mathbb{R}^n$.
— 2. Pre-measures —
In previous notes, we saw that finitely additive measures, such as elementary measure or Jordan measure, could be extended to a countably additive measure, namely Lebesgue measure. It is natural to ask whether this property is true in general. In other words, given a finitely additive measure $\mu_0: \mathcal{B}_0 \to [0,+\infty]$ on a Boolean algebra $\mathcal{B}_0$, is it possible to find a $\sigma$-algebra $\mathcal{B}$ refining $\mathcal{B}_0$, and a countably additive measure $\mu: \mathcal{B} \to [0,+\infty]$ that extends $\mu_0$?
There is an obvious necessary condition in order for $\mu_0$ to have a countably additive extension, namely that $\mu_0$ already has to be countably additive within $\mathcal{B}_0$. More precisely, suppose that $E_1, E_2, E_3, \ldots \in \mathcal{B}_0$ were disjoint sets such that their union $\bigcup_{n=1}^\infty E_n$ was also in $\mathcal{B}_0$. (Note that this latter property is not automatic as $\mathcal{B}_0$ is merely a Boolean algebra rather than a $\sigma$-algebra.) Then, in order for $\mu_0$ to be extendible to a countably additive measure, it is clearly necessary that
$$\mu_0(\bigcup_{n=1}^\infty E_n) = \sum_{n=1}^\infty \mu_0(E_n).$$
Using the Carathéodory lemma, we can show that this necessary condition is also sufficient. More precisely, we have
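Definition 10 (Pre-measure) A pre-measure on a Boolean algebra $\mathcal{B}_0$ is a finitely additive measure $\mu_0: \mathcal{B}_0 \to [0,+\infty]$ with the property that $\mu_0(\bigcup_{n=1}^\infty E_n) = \sum_{n=1}^\infty \mu_0(E_n)$ whenever $E_1, E_2, E_3, \ldots \in \mathcal{B}_0$ are disjoint sets such that $\bigcup_{n=1}^\infty E_n$ is in $\mathcal{B}_0$.
Exercise 11
- Show that the requirement that $\mu_0$ is finitely additive could be relaxed to the condition that $\mu_0(\emptyset) = 0$ without affecting the definition of a pre-measure.
- Show that the condition $\mu_0(\bigcup_{n=1}^\infty E_n) = \sum_{n=1}^\infty \mu_0(E_n)$ could be relaxed to $\mu_0(\bigcup_{n=1}^\infty E_n) \leq \sum_{n=1}^\infty \mu_0(E_n)$ without affecting the definition of a pre-measure.
- On the other hand, give an example to show that if one performs both of the above two relaxations at once, one starts admitting objects $\mu_0$ that are not pre-measures.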
Exercise 12 Without using the theory of Lebesgue measure, show that elementary measure (on the elementary Boolean algebra) is a pre-measure. (Hint: use Lemma 6 from Notes 1. Note that one has to also deal with co-elementary sets as well as elementary sets in the elementary Boolean algebra.)
Exercise 13 Construct a finitely additive measure $\mu_0: \mathcal{B}_0 \to [0,+\infty]$ that is not a pre-measure. (Hint: take $X$ to be the natural numbers, take $\mathcal{B}_0 = 2^{\mathbb{N}}$ to be the discrete algebra, and define $\mu_0$ separately for finite and infinite sets.)
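Theorem 14 (Hahn-Kolmogorov theorem) Every pre-measure $\mu_0: \mathcal{B}_0 \to [0,+\infty]$ on a Boolean algebra $\mathcal{B}_0$ in $X$ can be extended to a countably additive measure $\mu: \mathcal{B} \to [0,+\infty]$.
Proof: We mimic the construction of Lebesgue measure from elementary measure. Namely, for any set $E \subset X$, define the outer measure $\mu^*(E)$ of $E$ to be the quantity
$$\mu^*(E) := \inf \{ \sum_{n=1}^\infty \mu_0(E_n): E \subset \bigcup_{n=1}^\infty E_n; \ E_n \in \mathcal{B}_0 \hbox{ for all } n \}.$$
It is easy to verify (cf. Exercise 4 of Notes 1) that $\mu^*$ is indeed an outer measure. Let $\mathcal{B}$ be the collection of all sets $E \subset X$ that are Carathéodory measurable with respect to $\mu^*$, and let $\mu$ be the restriction of $\mu^*$ to $\mathcal{B}$. By the Carathéodory lemma, $\mathcal{B}$ is a $\sigma$-algebra and $\mu$ is a countably additive measure.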
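It remains to show that $\mathcal{B}$ contains $\mathcal{B}_0$ and that $\mu$ extends $\mu_0$. Thus, let $E \in \mathcal{B}_0$; we need to show that $E$ is Carathéodory measurable with respect to $\mu^*$ and that $\mu^*(E) = \mu_0(E)$. To prove the first claim, let $A \subset X$ be arbitrary. We need to show that
$$\mu^*(A) = \mu^*(A \cap E) + \mu^*(A \backslash E);$$
by subadditivity, it suffices to show that
$$\mu^*(A) \geq \mu^*(A \cap E) + \mu^*(A \backslash E).$$
We may assume that $\mu^*(A)$ is finite, since the claim is trivial otherwise.
Fix $\varepsilon > 0$. By definition of $\mu^*$, one can find $E_1, E_2, \ldots \in \mathcal{B}_0$ covering $A$ such that
$$\sum_{n=1}^\infty \mu_0(E_n) \leq \mu^*(A) + \varepsilon.$$
The sets $E_n \cap E$ lie in $\mathcal{B}_0$ and cover $A \cap E$ and thus
$$\mu^*(A \cap E) \leq \sum_{n=1}^\infty \mu_0(E_n \cap E).$$
Similarly we have
$$\mu^*(A \backslash E) \leq \sum_{n=1}^\infty \mu_0(E_n \backslash E).$$
Meanwhile, from finite additivity we have
$$\mu_0(E_n \cap E) + \mu_0(E_n \backslash E) = \mu_0(E_n).$$
Combining all of these estimates, we obtain
$$\mu^*(A \cap E) + \mu^*(A \backslash E) \leq \mu^*(A) + \varepsilon;$$
since $\varepsilon > 0$ was arbitrary, the claim follows.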
Finally, we have to show that $\mu^*(E) = \mu_0(E)$. Since $E$ covers itself, we certainly have $\mu^*(E) \leq \mu_0(E)$. To show the converse inequality, it suffices to show that
$$\sum_{n=1}^\infty \mu_0(E_n) \geq \mu_0(E)$$
whenever $E_1, E_2, \ldots \in \mathcal{B}_0$ cover $E$. By replacing each $E_n$ with the smaller set $E_n \backslash \bigcup_{m=1}^{n-1} E_m$ (which still lies in $\mathcal{B}_0$, and still covers $E$), we may assume without loss of generality (thanks to the monotonicity of $\mu_0$) that the $E_n$ are disjoint. Similarly, by replacing each $E_n$ with the smaller set $E_n \cap E$ we may assume without loss of generality that the union of the $E_n$ is exactly equal to $E$. But then the claim follows from the hypothesis that $\mu_0$ is a pre-measure (and not merely a finitely additive measure).
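To see the construction in the proof at work, one can carry it out by brute force on a finite set. The sketch below is an illustration under simplifying assumptions rather than anything taken from the notes: the space, the three blocks generating the Boolean algebra, and their weights are all hypothetical. On a finite algebra the infimum defining the outer measure is attained by a single covering set, which is what `outer` exploits.

```python
from itertools import combinations

def subsets(items):
    """All subsets of a finite collection, as frozensets."""
    items = list(items)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

# Hypothetical data: three disjoint blocks partitioning X, with their mu_0 weights.
blocks = {frozenset({0, 1}): 1, frozenset({2}): 2, frozenset({3, 4}): 0}
X = frozenset().union(*blocks)

# B_0 consists of all unions of blocks; mu_0 adds up the block weights.
algebra = {frozenset().union(*chosen): sum(blocks[b] for b in chosen)
           for chosen in subsets(blocks)}

def outer(E):
    """mu^*(E): cheapest cover of E by one set of B_0 (sufficient on a finite algebra)."""
    return min(weight for B, weight in algebra.items() if E <= B)

# Caratheodory measurable sets with respect to mu^*, found by exhaustive search.
all_sets = subsets(X)
measurable = [E for E in all_sets
              if all(outer(A) == outer(A & E) + outer(A - E) for A in all_sets)]
```

In this toy case every set of the algebra is measurable and keeps its original weight, as the proof predicts. The measurable sets also include subsets of the weight-zero block such as `{3}`, which do not lie in the algebra; this reflects the completeness of the extension. A set such as `{0}`, which splits a block of positive weight, is not measurable.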
Let us call the measure $\mu$ constructed in the above proof the Hahn-Kolmogorov extension of the pre-measure $\mu_0$. Thus, for instance, from Exercise 4, the Hahn-Kolmogorov extension of elementary measure (with the convention that co-elementary sets have infinite elementary measure) is Lebesgue measure. This is not quite the unique extension of $\mu_0$ to a countably additive measure, though. For instance, one could restrict Lebesgue measure to the Borel $\sigma$-algebra, and this would still be a countably additive extension of elementary measure. However, the extension is unique within its own $\sigma$-algebra:
Exercise 15 Let $\mu_0: \mathcal{B}_0 \to [0,+\infty]$ be a pre-measure, let $\mu: \mathcal{B} \to [0,+\infty]$ be the Hahn-Kolmogorov extension of $\mu_0$, and let $\mu': \mathcal{B}' \to [0,+\infty]$ be another countably additive extension of $\mu_0$. Suppose also that $\mu_0$ is $\sigma$-finite, which means that one can express the whole space $X$ as the countable union of sets $E_1, E_2, \ldots \in \mathcal{B}_0$ for which $\mu_0(E_n) < \infty$ for all $n$. Show that $\mu$ and $\mu'$ agree on their common domain of definition. In other words, show that $\mu(E) = \mu'(E)$ for all $E \in \mathcal{B} \cap \mathcal{B}'$. (Hint: first show that $\mu'(E) \leq \mu^*(E)$ for all $E \in \mathcal{B}'$.)
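Exercise 16 The purpose of this exercise is to show that the $\sigma$-finite hypothesis in Exercise 15 cannot be removed. Let $\mathcal{A}$ be the collection of all subsets in $\mathbb{R}$ that can be expressed as finite unions of half-open intervals $[a,b)$. Let $\mu_0: \mathcal{A} \to [0,+\infty]$ be the function such that $\mu_0(E) = +\infty$ for non-empty $E$ and $\mu_0(\emptyset) = 0$.
- Show that $\mu_0$ is a pre-measure.
- Show that $\langle \mathcal{A} \rangle$ is the Borel $\sigma$-algebra $\mathcal{B}[\mathbb{R}]$.
- Show that the Hahn-Kolmogorov extension $\mu: \mathcal{B}[\mathbb{R}] \to [0,+\infty]$ of $\mu_0$ assigns an infinite measure to any non-empty Borel set.
- Show that counting measure $\#$ (or more generally, $c\#$ for any $c \in (0,+\infty]$) is another extension of $\mu_0$ on $\mathcal{B}[\mathbb{R}]$.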
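Exercise 17 Let $\mu_0: \mathcal{B}_0 \to [0,+\infty]$ be a pre-measure which is $\sigma$-finite (thus $X$ is the countable union of sets in $\mathcal{B}_0$ of finite $\mu_0$-measure), and let $\mu: \mathcal{B} \to [0,+\infty]$ be the Hahn-Kolmogorov extension of $\mu_0$.
- Show that if $E \in \mathcal{B}$, then there exists $F \in \langle \mathcal{B}_0 \rangle$ containing $E$ such that $\mu(F \backslash E) = 0$ (thus $F$ consists of the union of $E$ and a null set). Furthermore, show that $F$ can be chosen to be a countable intersection $F = \bigcap_{n=1}^\infty F_n$ of sets $F_n$, each of which is a countable union $F_n = \bigcup_{m=1}^\infty F_{n,m}$ of sets $F_{n,m}$ in $\mathcal{B}_0$.
- If $E \in \mathcal{B}$ has finite measure (i.e. $\mu(E) < \infty$), and $\varepsilon > 0$, show that there exists $F \in \mathcal{B}_0$ such that $\mu(E \Delta F) \leq \varepsilon$.
- Conversely, if $E$ is a set such that for every $\varepsilon > 0$ there exists $F \in \mathcal{B}_0$ such that $\mu^*(E \Delta F) \leq \varepsilon$, show that $E \in \mathcal{B}$.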
— 3. Lebesgue-Stieltjes measure —
Now we use the Hahn-Kolmogorov extension theorem to construct a variety of measures. We begin with Lebesgue-Stieltjes measure.
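Theorem 18 (Existence of Lebesgue-Stieltjes measure) Let $F: \mathbb{R} \to \mathbb{R}$ be a monotone non-decreasing function, and define the left and right limits
$$F_-(x) := \sup_{y < x} F(y); \quad F_+(x) := \inf_{y > x} F(y),$$
thus one has $F_-(x) \leq F(x) \leq F_+(x)$ for all $x$. Let $\mathcal{B}[\mathbb{R}]$ be the Borel $\sigma$-algebra on $\mathbb{R}$. Then there exists a unique Borel measure $\mu_F: \mathcal{B}[\mathbb{R}] \to [0,+\infty]$ such that
$$\mu_F([a,b]) = F_+(b) - F_-(a), \quad \mu_F([a,b)) = F_-(b) - F_-(a),$$
$$\mu_F((a,b]) = F_+(b) - F_+(a), \quad \mu_F((a,b)) = F_-(b) - F_+(a) \qquad (3)$$
for all $-\infty < a < b < \infty$.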
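Proof: (Sketch) For this proof, we will deviate from our previous notational conventions, and allow intervals to be unbounded, thus in particular including the half-infinite intervals $[a,+\infty)$, $(a,+\infty)$, $(-\infty,a]$, $(-\infty,a)$ and the doubly infinite interval $(-\infty,+\infty)$ as intervals.
Define the $F$-volume $|I|_F \in [0,+\infty]$ of any interval $I$ to be the required value of $\mu_F(I)$ given by (3) (e.g., $|[a,b]|_F = F_+(b) - F_-(a)$), adopting the obvious conventions that $F_-(+\infty) = \sup_{y \in \mathbb{R}} F(y)$ and $F_+(-\infty) = \inf_{y \in \mathbb{R}} F(y)$, and also adopting the convention that the empty interval $\emptyset$ has zero $F$-volume, $|\emptyset|_F = 0$. Note that $F_-(+\infty)$ could equal $+\infty$ and $F_+(-\infty)$ could equal $-\infty$, but in all circumstances the $F$-volume $|I|_F$ is well-defined and takes values in $[0,+\infty]$, after adopting the obvious conventions to evaluate expressions such as $+\infty - (-\infty)$.
A somewhat tedious case check (Exercise!) gives the additivity property
$$|I \cup J|_F = |I|_F + |J|_F$$
whenever $I$, $J$ are disjoint intervals that share a common endpoint. As a corollary, we see that if an interval $I$ is partitioned into finitely many disjoint sub-intervals $I_1, \ldots, I_k$, we have $|I|_F = |I_1|_F + \ldots + |I_k|_F$.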
Let $\mathcal{B}_0$ be the Boolean algebra generated by the (possibly infinite) intervals; then $\mathcal{B}_0$ consists of those sets that can be expressed as a finite union of intervals. (This is slightly larger than the elementary algebra, as it allows for half-infinite intervals such as $[0,+\infty)$, whereas the elementary algebra does not.) We can define a measure $\mu_0$ on this algebra by declaring
$$\mu_0(E) = |I_1|_F + \ldots + |I_k|_F$$
whenever $E = I_1 \cup \ldots \cup I_k$ is the disjoint union of finitely many intervals. One can check (Exercise!) that this measure is well-defined (in the sense that it gives a unique value to $\mu_0(E)$ for each $E \in \mathcal{B}_0$) and is finitely additive. We now claim that $\mu_0$ is a pre-measure: thus we suppose that $E \in \mathcal{B}_0$ is the disjoint union of countably many sets $E_1, E_2, \ldots \in \mathcal{B}_0$, and wish to show that
$$\mu_0(E) = \sum_{n=1}^\infty \mu_0(E_n).$$
By splitting up $E$ into intervals and then intersecting each of the $E_n$ with these intervals and using finite additivity, we may assume that $E$ is a single interval. By splitting up the $E_n$ into their component intervals and using finite additivity, we may assume that the $E_n$ are also individual intervals. By finite additivity, we have
$$\mu_0(E) \geq \sum_{n=1}^N \mu_0(E_n)$$
for every $N$, so it suffices to show that
$$\mu_0(E) \leq \sum_{n=1}^\infty \mu_0(E_n).$$
By the definition of $\mu_0(E)$, one can check that
$$\mu_0(E) = \sup_{K \subset E} \mu_0(K) \qquad (5)$$
where $K$ ranges over all compact intervals contained in $E$ (Exercise!). Thus, it suffices to show that
$$\mu_0(K) \leq \sum_{n=1}^\infty \mu_0(E_n)$$
for each compact sub-interval $K$ of $E$. In a similar spirit, one can show that
$$\mu_0(E_n) = \inf_{U \supset E_n} \mu_0(U)$$
where $U$ ranges over all open intervals containing $E_n$ (Exercise!). Using the $\varepsilon/2^n$ trick, it thus suffices to show that
$$\mu_0(K) \leq \sum_{n=1}^\infty \mu_0(U_n)$$
whenever $U_n$ is an open interval containing $E_n$. But by the Heine-Borel theorem, one can cover $K$ by a finite number $U_1, \ldots, U_N$ of the $U_n$, hence by finite subadditivity
$$\mu_0(K) \leq \sum_{n=1}^N \mu_0(U_n)$$
and the claim follows.
As $\mu_0$ is now verified to be a pre-measure, we may use the Hahn-Kolmogorov extension theorem to extend it to a countably additive measure $\mu$ on a $\sigma$-algebra $\mathcal{B}$ that contains $\mathcal{B}_0$. In particular, $\mathcal{B}$ contains all the elementary sets and hence (by Exercise 14 of Notes 3) contains the Borel $\sigma$-algebra. Restricting $\mu$ to the Borel $\sigma$-algebra we obtain the existence claim.
Finally, we establish uniqueness. If $\mu'$ is another Borel measure with the stated properties, then $\mu'(K) = |K|_F$ for every compact interval $K$, and hence by (5) and upward monotone convergence, one has $\mu'(I) = |I|_F$ for every interval (including the unbounded ones). This implies that $\mu'$ agrees with $\mu_0$ on $\mathcal{B}_0$, and thus (by Exercise 15, noting that $\mu_0$ is $\sigma$-finite) agrees with $\mu$ on Borel measurable sets.
Exercise 19 Verify the claims marked “Exercise!” in the above proof.
The measure $\mu_F$ given by the above theorem is known as the Lebesgue-Stieltjes measure $\mu_F$ of $F$. (In some texts, this measure is only defined when $F$ is right-continuous, or equivalently if $F = F_+$.)
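The Lebesgue-Stieltjes measure of an interval depends only on the one-sided limits of $F$ at the endpoints: a closed endpoint on the left uses the left limit $F_-$, an open one uses $F_+$, and the roles are reversed on the right. The following minimal sketch encodes this rule. The class name `Stieltjes` and its interface are hypothetical, and the one-sided limits are supplied by hand rather than computed from $F$.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Stieltjes:
    """A monotone non-decreasing F, described through its one-sided limits."""
    left: Callable[[float], float]    # F_-(x) = sup of F(y) over y < x
    right: Callable[[float], float]   # F_+(x) = inf of F(y) over y > x

    def volume(self, a, b, closed_left=True, closed_right=True):
        """F-volume of the interval with endpoints a <= b."""
        if a > b or (a == b and not (closed_left and closed_right)):
            return 0.0  # the interval is empty
        lower = self.left(a) if closed_left else self.right(a)
        upper = self.right(b) if closed_right else self.left(b)
        return upper - lower

# Heaviside function 1_{[0,+infinity)}: it jumps by 1 at the origin.
heaviside = Stieltjes(left=lambda x: 1.0 if x > 0 else 0.0,
                      right=lambda x: 1.0 if x >= 0 else 0.0)
# Identity function: continuous, so both one-sided limits coincide with F.
identity = Stieltjes(left=lambda x: x, right=lambda x: x)
```

With these definitions, `heaviside.volume(0, 0)` (the point $\{0\}$) equals 1, while `heaviside.volume(0, 1, closed_left=False)` (the interval $(0,1]$) equals 0, so all of the mass sits at the origin. For the identity function, `identity.volume(2, 5)` returns the usual length 3 regardless of which endpoints are included.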
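Exercise 20 Define a Radon measure on $\mathbb{R}$ to be a Borel measure $\mu$ obeying the following additional properties:
- (Local finiteness) $\mu(K) < \infty$ for every compact $K$.
- (Inner regularity) One has $\mu(E) = \sup_{K \subset E, K \hbox{ compact}} \mu(K)$ for every Borel set $E$.
- (Outer regularity) One has $\mu(E) = \inf_{U \supset E, U \hbox{ open}} \mu(U)$ for every Borel set $E$.
Show that for every monotone function $F: \mathbb{R} \to \mathbb{R}$, the Lebesgue-Stieltjes measure $\mu_F$ is a Radon measure on $\mathbb{R}$; conversely, if $\mu$ is a Radon measure on $\mathbb{R}$, show that there exists a monotone function $F: \mathbb{R} \to \mathbb{R}$ such that $\mu = \mu_F$.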
Radon measures will be studied in more detail in 245B.
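Exercise 21 (Near uniqueness) If $F, F': \mathbb{R} \to \mathbb{R}$ are monotone non-decreasing functions, show that $\mu_F = \mu_{F'}$ if and only if there exists a constant $C \in \mathbb{R}$ such that $F_+(x) = F'_+(x) + C$ and $F_-(x) = F'_-(x) + C$ for all $x \in \mathbb{R}$. Note that this implies that the values of $F$ at its points of discontinuity are irrelevant for the purposes of determining the Lebesgue-Stieltjes measure $\mu_F$; in particular, $\mu_F = \mu_{F_+} = \mu_{F_-}$.
In the special case when $F_+(-\infty) = 0$ and $F_-(+\infty) = 1$, $\mu_F$ is a probability measure, and $F_+(x) = \mu_F((-\infty,x])$ is known as the cumulative distribution function of $\mu_F$.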
Now we give some examples of Lebesgue-Stieltjes measure.
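Exercise 22 (Lebesgue-Stieltjes measure, absolutely continuous case)
- If $F: \mathbb{R} \to \mathbb{R}$ is the identity function $F(x) = x$, show that $\mu_F$ is equal to Lebesgue measure $m$.
- If $F: \mathbb{R} \to \mathbb{R}$ is monotone non-decreasing and absolutely continuous (which in particular implies that $F'$ exists and is absolutely integrable), show that $\mu_F = m_{F'}$ in the sense of Exercise 47 of Notes 3, thus $\mu_F(E) = \int_E F'(x)\ dx$ for any Borel measurable $E$, and $\int_{\mathbb{R}} f(x)\ d\mu_F(x) = \int_{\mathbb{R}} f(x) F'(x)\ dx$ for any unsigned Borel measurable $f: \mathbb{R} \to [0,+\infty]$.
In view of the above exercise, the integral $\int_{\mathbb{R}} f\ d\mu_F$ is often abbreviated $\int_{\mathbb{R}} f\ dF$, and referred to as the Lebesgue-Stieltjes integral of $f$ with respect to $F$. In particular, observe the identity
$$\int_{[a,b]} dF = F_+(b) - F_-(a)$$
for any monotone non-decreasing $F: \mathbb{R} \to \mathbb{R}$ and any $-\infty < a < b < +\infty$, which can be viewed as yet another formulation of the fundamental theorem of calculus.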
Exercise 23 (Lebesgue-Stieltjes measure, pure point case)
- If $H: \mathbb{R} \to \mathbb{R}$ is the Heaviside function $H := 1_{[0,+\infty)}$, show that $\mu_H$ is equal to the Dirac measure $\delta_0$ at the origin (defined in Example 9 of Notes 3).
- If $F = \sum_n c_n J_n$ is a jump function (as defined in Definition 17 of Notes 5), show that $\mu_F$ is equal to the linear combination $\sum_n c_n \delta_{x_n}$ of delta functions (as defined in Exercise 22 of Notes 3), where $x_n$ is the point of discontinuity for the basic jump function $J_n$.
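Exercise 24 (Lebesgue-Stieltjes measure, singular continuous case)
- If $F: \mathbb{R} \to \mathbb{R}$ is a monotone non-decreasing function, show that $F$ is continuous if and only if $\mu_F(\{x\}) = 0$ for all $x \in \mathbb{R}$.
- If $F$ is the Cantor function (defined in Exercise 46 of Notes 5), show that $\mu_F$ is a probability measure supported on the middle-thirds Cantor set $C$ (Exercise 10 from Notes 1) in the sense that $\mu_F(\mathbb{R} \backslash C) = 0$. The measure $\mu_F$ is known as Cantor measure.
- If $\mu$ is Cantor measure, establish the self-similarity properties $\mu(\frac{1}{3} \cdot E) = \frac{1}{2} \mu(E)$ and $\mu(\frac{1}{3} \cdot E + \frac{2}{3}) = \frac{1}{2} \mu(E)$ for every Borel-measurable $E \subset [0,1]$, where $\frac{1}{3} \cdot E := \{ \frac{1}{3} x: x \in E \}$.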
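The self-similarity of Cantor measure can be checked numerically on intervals, since for a continuous $F$ the measure of an interval with endpoints $a \leq b$ is simply $F(b) - F(a)$. The sketch below is only an illustration and not a proof. It approximates the Cantor function by reading off ternary digits; the truncation depth and the helper names are assumptions of this example.

```python
from fractions import Fraction

def cantor(x, depth=48):
    """Approximate the Cantor function at a rational x via its ternary digits."""
    x = Fraction(x)
    if x <= 0:
        return 0.0
    if x >= 1:
        return 1.0
    total, scale = 0.0, 0.5
    for _ in range(depth):
        x *= 3
        digit = int(x)        # next ternary digit: 0, 1 or 2
        x -= digit
        if digit == 1:        # x sits in a removed middle third, where F is constant
            return total + scale
        total += scale * (digit // 2)   # ternary digit 2 becomes binary digit 1
        scale /= 2
    return total

def cantor_measure(a, b):
    """Cantor measure of an interval with endpoints a <= b (points carry no mass)."""
    return cantor(b) - cantor(a)

a, b = Fraction(1, 4), Fraction(7, 9)
left_copy = cantor_measure(a / 3, b / 3)
right_copy = cantor_measure(a / 3 + Fraction(2, 3), b / 3 + Fraction(2, 3))
```

Up to the truncation error of roughly $2^{-48}$, both `left_copy` and `right_copy` agree with `0.5 * cantor_measure(a, b)`, and the removed middle third $[1/3, 2/3]$ receives zero mass.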
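Exercise 25 (Connection with Riemann-Stieltjes integral) Let $F: \mathbb{R} \to \mathbb{R}$ be monotone non-decreasing, let $[a,b]$ be a compact interval, and let $f: [a,b] \to \mathbb{R}$ be continuous. Suppose that $F$ is continuous at the endpoints $a, b$ of the interval. Show that for every $\varepsilon > 0$ there exists $\delta > 0$ such that
$$|\sum_{i=1}^n f(t^*_i) (F(t_i) - F(t_{i-1})) - \int_{[a,b]} f\ dF| \leq \varepsilon$$
whenever $a = t_0 < t_1 < \ldots < t_n = b$ and $t^*_i \in [t_{i-1}, t_i]$ for $1 \leq i \leq n$ are such that $\sup_{1 \leq i \leq n} |t_i - t_{i-1}| \leq \delta$. In the language of the Riemann-Stieltjes integral, this result asserts that the Lebesgue-Stieltjes integral extends the Riemann-Stieltjes integral.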
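As a numerical sanity check of this connection, one can compare a Riemann-Stieltjes sum with the value predicted by the absolutely continuous case, where $\int_{[a,b]} f\ dF = \int_a^b f(x) F'(x)\ dx$. The sketch below assumes the hypothetical test data $f(t) = t$ and $F(t) = t^2$ on $[0,1]$, a uniform partition, and left endpoints as tags. For these choices the exact value is $\int_0^1 2t^2\ dt = 2/3$.

```python
def riemann_stieltjes(f, F, a, b, n):
    """Left-tagged Riemann-Stieltjes sum of f against F on a uniform partition."""
    ts = [a + (b - a) * i / n for i in range(n + 1)]
    return sum(f(ts[i - 1]) * (F(ts[i]) - F(ts[i - 1])) for i in range(1, n + 1))

# Hypothetical test data: f(t) = t and F(t) = t^2 on [0, 1].
approx = riemann_stieltjes(lambda t: t, lambda t: t * t, 0.0, 1.0, 10_000)
exact = 2 / 3
```

With 10,000 cells the sum differs from $2/3$ by less than $10^{-3}$, consistent with the convergence asserted in the exercise as the mesh of the partition shrinks.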
Exercise 26 (Integration by parts formula) Let $F, G: \mathbb{R} \to \mathbb{R}$ be monotone non-decreasing and continuous. Show that
$$\int_{[a,b]} F\ dG = -\int_{[a,b]} G\ dF + F(b) G(b) - F(a) G(a)$$
for any compact interval $[a,b]$. (Hint: use Exercise 25.) This formula can be partially extended to the case when one or both of $F, G$ have discontinuities, but care must be taken when $F$ and $G$ are simultaneously discontinuous at the same location.
— 4. Product measure —
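Given two sets $X$ and $Y$, one can form their Cartesian product $X \times Y = \{ (x,y): x \in X, y \in Y \}$. This set is naturally equipped with the coordinate projection maps $\pi_X: X \times Y \to X$ and $\pi_Y: X \times Y \to Y$ defined by setting $\pi_X(x,y) := x$ and $\pi_Y(x,y) := y$. One can certainly take Cartesian products $X_1 \times \ldots \times X_d$ of more than two sets, or even take an infinite product $\prod_{\alpha \in A} X_\alpha$, but for simplicity we will only discuss the theory for products of two sets for now.
Now suppose that $(X, \mathcal{B}_X)$ and $(Y, \mathcal{B}_Y)$ are measurable spaces. Then we can still form the Cartesian product $X \times Y$ and the projection maps $\pi_X: X \times Y \to X$ and $\pi_Y: X \times Y \to Y$. But now we can also form the pullback $\sigma$-algebras
$$\pi_X^*(\mathcal{B}_X) := \{ \pi_X^{-1}(E): E \in \mathcal{B}_X \} = \{ E \times Y: E \in \mathcal{B}_X \}$$
and
$$\pi_Y^*(\mathcal{B}_Y) := \{ \pi_Y^{-1}(F): F \in \mathcal{B}_Y \} = \{ X \times F: F \in \mathcal{B}_Y \}.$$
We then define the product $\sigma$-algebra $\mathcal{B}_X \times \mathcal{B}_Y$ to be the $\sigma$-algebra generated by the union of these two $\sigma$-algebras:
$$\mathcal{B}_X \times \mathcal{B}_Y := \langle \pi_X^*(\mathcal{B}_X) \cup \pi_Y^*(\mathcal{B}_Y) \rangle.$$
This definition has several equivalent formulations: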
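Exercise 27 Let $(X, \mathcal{B}_X)$ and $(Y, \mathcal{B}_Y)$ be measurable spaces.
- Show that $\mathcal{B}_X \times \mathcal{B}_Y$ is the $\sigma$-algebra generated by the sets $E \times F$ with $E \in \mathcal{B}_X$, $F \in \mathcal{B}_Y$. In other words, $\mathcal{B}_X \times \mathcal{B}_Y$ is the coarsest $\sigma$-algebra on $X \times Y$ with the property that the product of a $\mathcal{B}_X$-measurable set and a $\mathcal{B}_Y$-measurable set is always $\mathcal{B}_X \times \mathcal{B}_Y$ measurable.
- Show that $\mathcal{B}_X \times \mathcal{B}_Y$ is the coarsest $\sigma$-algebra on $X \times Y$ that makes the projection maps $\pi_X, \pi_Y$ both measurable morphisms (see Remark 8 from Notes 3).
- If $E \in \mathcal{B}_X \times \mathcal{B}_Y$, show that the sets $E_x := \{ y \in Y: (x,y) \in E \}$ lie in $\mathcal{B}_Y$ for every $x \in X$, and similarly that the sets $E^y := \{ x \in X: (x,y) \in E \}$ lie in $\mathcal{B}_X$ for every $y \in Y$.
- If $f: X \times Y \to [0,+\infty]$ is measurable (with respect to $\mathcal{B}_X \times \mathcal{B}_Y$), show that the function $f_x: y \mapsto f(x,y)$ is $\mathcal{B}_Y$-measurable for every $x \in X$, and similarly that the function $f^y: x \mapsto f(x,y)$ is $\mathcal{B}_X$-measurable for every $y \in Y$.
- If $E \in \mathcal{B}_X \times \mathcal{B}_Y$, show that the slices $E_x := \{ y \in Y: (x,y) \in E \}$ lie in a countably generated $\sigma$-algebra. In other words, show that there exists an at most countable collection $\mathcal{A} = \mathcal{A}_E$ of sets (which can depend on $E$) such that $\{ E_x: x \in X \} \subset \langle \mathcal{A} \rangle$. Conclude in particular that the number of distinct slices $E_x$ is at most $c$, the cardinality of the continuum. (The last part of this exercise is only suitable for students who are comfortable with cardinal arithmetic.)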
Exercise 28
- Show that the product of two trivial $\sigma$-algebras (on two different spaces $X, Y$) is again trivial.
- (Exercise removed)
- Show that the product of two finite $\sigma$-algebras is again finite.
- Show that the product of two Borel $\sigma$-algebras (on two Euclidean spaces $\mathbb{R}^d, \mathbb{R}^{d'}$ with $d, d' \geq 1$) is again the Borel $\sigma$-algebra (on $\mathbb{R}^d \times \mathbb{R}^{d'} \equiv \mathbb{R}^{d+d'}$).
- Show that the product of two Lebesgue $\sigma$-algebras (on two Euclidean spaces $\mathbb{R}^d, \mathbb{R}^{d'}$ with $d, d' \geq 1$) is not the Lebesgue $\sigma$-algebra. (Hint: argue by contradiction and use Exercise 27(3).)
- However, show that the Lebesgue $\sigma$-algebra on $\mathbb{R}^{d+d'}$ is the completion of the product of the Lebesgue $\sigma$-algebras of $\mathbb{R}^d$ and $\mathbb{R}^{d'}$ with respect to $d+d'$-dimensional Lebesgue measure (see Exercise 26 of Notes 3 for the definition of completion of a measure space).
- This part of the exercise is only for students who are comfortable with cardinal arithmetic. Give an example to show that the product of two discrete $\sigma$-algebras is not necessarily discrete.
- On the other hand, show that the product of two discrete $\sigma$-algebras $2^X, 2^Y$ is again a discrete $\sigma$-algebra $2^{X \times Y}$ if at least one of the domains $X, Y$ is at most countably infinite.
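Now suppose we have two measure spaces $(X, \mathcal{B}_X, \mu_X)$ and $(Y, \mathcal{B}_Y, \mu_Y)$. Given that we can multiply together the sets $X$ and $Y$ to form a product set $X \times Y$, and can multiply the $\sigma$-algebras $\mathcal{B}_X$ and $\mathcal{B}_Y$ together to form a product $\sigma$-algebra $\mathcal{B}_X \times \mathcal{B}_Y$, it is natural to expect that we can multiply the two measures $\mu_X: \mathcal{B}_X \to [0,+\infty]$ and $\mu_Y: \mathcal{B}_Y \to [0,+\infty]$ to form a product measure $\mu_X \times \mu_Y: \mathcal{B}_X \times \mathcal{B}_Y \to [0,+\infty]$. In view of the “base times height formula” that one learns in elementary school, one expects to have
$$\mu_X \times \mu_Y(E \times F) = \mu_X(E) \mu_Y(F) \qquad (6)$$
whenever $E \in \mathcal{B}_X$ and $F \in \mathcal{B}_Y$.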
To construct this measure, it is convenient to make the assumption that both spaces are $\sigma$-finite.
Definition 29 ($\sigma$-finite) A measure space $(X, \mathcal{B}, \mu)$ is $\sigma$-finite if $X$ can be expressed as the countable union of sets of finite measure.
Thus, for instance, $\mathbb{R}^d$ with Lebesgue measure is $\sigma$-finite, as $\mathbb{R}^d$ can be expressed as the union of (for instance) the balls $B(0,n)$ for $n = 1, 2, 3, \ldots$, each of which has finite measure. On the other hand, $\mathbb{R}^d$ with counting measure is not $\sigma$-finite (why?). But most measure spaces that one actually encounters in analysis (including, clearly, all probability spaces) are $\sigma$-finite. It is possible to partially extend the theory of product spaces to the non-$\sigma$-finite setting, but there are a number of very delicate technical issues that arise and so we will not discuss them here.
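As long as we restrict attention to the $\sigma$-finite case, product measure always exists and is unique:
Proposition 30 (Existence and uniqueness of product measure) Let $(X, \mathcal{B}_X, \mu_X)$ and $(Y, \mathcal{B}_Y, \mu_Y)$ be $\sigma$-finite measure spaces. Then there exists a unique measure $\mu_X \times \mu_Y$ on $\mathcal{B}_X \times \mathcal{B}_Y$ that obeys $\mu_X \times \mu_Y(E \times F) = \mu_X(E) \mu_Y(F)$ whenever $E \in \mathcal{B}_X$ and $F \in \mathcal{B}_Y$.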
Proof: We first show existence. Inspired by the fact that Lebesgue measure is the Hahn-Kolmogorov extension of elementary (pre-)measure, we shall first construct an “elementary product pre-measure” to which we will then apply Theorem 14.
Let $\mathcal{B}_0$ be the collection of all finite unions
$$S := (E_1 \times F_1) \cup \ldots \cup (E_k \times F_k)$$
of Cartesian products of $\mathcal{B}_X$-measurable sets $E_1, \ldots, E_k$ and $\mathcal{B}_Y$-measurable sets $F_1, \ldots, F_k$. (One can think of such sets as being somewhat analogous to elementary sets in Euclidean space, although the analogy is not perfectly exact.) It is not difficult to verify that this is a Boolean algebra (though it is not, in general, a $\sigma$-algebra). Also, any set in $\mathcal{B}_0$ can be easily decomposed into a disjoint union of product sets $E_1 \times F_1, \ldots, E_k \times F_k$ of $\mathcal{B}_X$-measurable sets and $\mathcal{B}_Y$-measurable sets (cf. Lemma 2 (and Exercise 2) from the prologue). We then define the quantity $\mu_0(S)$ associated to such a disjoint union $S$ by the formula
$$\mu_0(S) := \sum_{j=1}^k \mu_X(E_j) \mu_Y(F_j)$$
whenever $S$ is the disjoint union of products $E_1 \times F_1, \ldots, E_k \times F_k$ of $\mathcal{B}_X$-measurable sets and $\mathcal{B}_Y$-measurable sets. One can show that this definition does not depend on exactly how $S$ is decomposed, and gives a finitely additive measure $\mu_0: \mathcal{B}_0 \to [0,+\infty]$ (cf. Exercise 2 from the prologue, and also Exercise 31 from Notes 3).
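Now we show that $\mu_0$ is a pre-measure. It suffices to show that if $S \in \mathcal{B}_0$ is the countable disjoint union of sets $S_1, S_2, \ldots \in \mathcal{B}_0$, then $\mu_0(S) = \sum_{n=1}^\infty \mu_0(S_n)$.
Splitting $S$ up into disjoint product sets, and restricting the $S_n$ to each of these product sets in turn, we may assume without loss of generality (using the finite additivity of $\mu_0$) that $S = E \times F$ for some $E \in \mathcal{B}_X$ and $F \in \mathcal{B}_Y$. In a similar spirit, by breaking each $S_n$ up into component product sets and using finite additivity again, we may assume without loss of generality that each $S_n$ takes the form $S_n = E_n \times F_n$ for some $E_n \in \mathcal{B}_X$ and $F_n \in \mathcal{B}_Y$. By definition of $\mu_0$, our objective is now to show that
$$\mu_X(E) \mu_Y(F) = \sum_{n=1}^\infty \mu_X(E_n) \mu_Y(F_n).$$
To do this, first observe from construction that we have the pointwise identity
$$1_E(x) 1_F(y) = \sum_{n=1}^\infty 1_{E_n}(x) 1_{F_n}(y)$$
for all $x \in X$ and $y \in Y$. We fix $x \in X$, and integrate this identity in $y$ (noting that both sides are measurable and unsigned) to conclude that
$$\int_Y 1_E(x) 1_F(y)\ d\mu_Y(y) = \int_Y \sum_{n=1}^\infty 1_{E_n}(x) 1_{F_n}(y)\ d\mu_Y(y).$$
The left-hand side simplifies to $1_E(x) \mu_Y(F)$. To compute the right-hand side, we use the monotone convergence theorem to interchange the summation and integration, and soon see that the right-hand side is $\sum_{n=1}^\infty 1_{E_n}(x) \mu_Y(F_n)$, thus
$$1_E(x) \mu_Y(F) = \sum_{n=1}^\infty 1_{E_n}(x) \mu_Y(F_n)$$
for all $x$. Both sides are measurable and unsigned in $x$, so we may integrate in $X$ and conclude that
$$\int_X 1_E(x) \mu_Y(F)\ d\mu_X(x) = \int_X \sum_{n=1}^\infty 1_{E_n}(x) \mu_Y(F_n)\ d\mu_X(x).$$
The left-hand side here is $\mu_X(E) \mu_Y(F)$. Using monotone convergence as before, the right-hand side simplifies to $\sum_{n=1}^\infty \mu_X(E_n) \mu_Y(F_n)$, and the claim follows.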
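Now that we have established that $\mu_0$ is a pre-measure, we may apply Theorem 14 to extend this measure to a countably additive measure $\mu_X \times \mu_Y$ on a $\sigma$-algebra containing $\mathcal{B}_0$. By Exercise 27(2), $\mu_X \times \mu_Y$ is a countably additive measure on $\mathcal{B}_X \times \mathcal{B}_Y$, and as it extends $\mu_0$, it will obey (6). Finally, to show uniqueness, observe from finite additivity that any measure $\mu_X \times \mu_Y$ on $\mathcal{B}_X \times \mathcal{B}_Y$ that obeys (6) must extend $\mu_0$, and so uniqueness follows from Exercise 15 (noting that $\mu_0$ is $\sigma$-finite, because $\mu_X$ and $\mu_Y$ are).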
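In the simplest situation, where $X$ and $Y$ are finite sets carrying point masses, the product measure can be written down directly and the rectangle formula $\mu_X \times \mu_Y(E \times F) = \mu_X(E) \mu_Y(F)$ verified exhaustively. The following sketch does this; the particular weights and helper names are hypothetical, and the finite setting sidesteps all of the measurability issues handled in the proof.

```python
from itertools import combinations, product

mu_X = {"a": 1, "b": 3}          # hypothetical point masses on X
mu_Y = {0: 2, 1: 0, 2: 5}        # hypothetical point masses on Y

def subsets(items):
    """All subsets of a finite collection, as frozensets."""
    items = list(items)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def measure(weights, subset):
    """Measure of a subset under the given point masses."""
    return sum(weights[p] for p in subset)

# Point masses of the candidate product measure on X x Y.
mu_XY = {(x, y): mu_X[x] * mu_Y[y] for x, y in product(mu_X, mu_Y)}

# Check the rectangle formula on every rectangle E x F.
rectangle_formula_holds = all(
    measure(mu_XY, product(E, F)) == measure(mu_X, E) * measure(mu_Y, F)
    for E in subsets(mu_X) for F in subsets(mu_Y))
```

Here `rectangle_formula_holds` evaluates to `True`, and the total mass of `mu_XY` is $4 \times 7 = 28$, the product of the total masses of the two factors.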
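Remark 31 When $X$, $Y$ are not both $\sigma$-finite, then one can still construct at least one product measure, but it will, in general, not be unique. This makes the theory much more subtle, and we will not discuss it in these notes.
Example 32 From Exercise 22 of Notes 1, we see that the product $m^d \times m^{d'}$ of the Lebesgue measures $m^d, m^{d'}$ on $(\mathbb{R}^d, \mathcal{L}[\mathbb{R}^d])$ and $(\mathbb{R}^{d'}, \mathcal{L}[\mathbb{R}^{d'}])$ respectively will agree with Lebesgue measure $m^{d+d'}$ on the product space $\mathcal{L}[\mathbb{R}^d] \times \mathcal{L}[\mathbb{R}^{d'}]$, which as noted in Exercise 28 is a subalgebra of $\mathcal{L}[\mathbb{R}^{d+d'}]$. After taking the completion $\overline{m^d \times m^{d'}}$ of this product measure, one obtains the full Lebesgue measure $m^{d+d'}$.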
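Exercise 33 Let $(X, \mathcal{B}_X)$, $(Y, \mathcal{B}_Y)$ be measurable spaces.
- Show that the product of two Dirac measures on $(X, \mathcal{B}_X)$, $(Y, \mathcal{B}_Y)$ is a Dirac measure on $(X \times Y, \mathcal{B}_X \times \mathcal{B}_Y)$.
- If $X, Y$ are at most countable, show that the product of the two counting measures on $(X, 2^X)$, $(Y, 2^Y)$ is the counting measure on $(X \times Y, 2^{X \times Y})$.
Exercise 34 (Associativity of product) Let $(X, \mathcal{B}_X, \mu_X)$, $(Y, \mathcal{B}_Y, \mu_Y)$, $(Z, \mathcal{B}_Z, \mu_Z)$ be $\sigma$-finite measure spaces. We may identify the Cartesian products $(X \times Y) \times Z$ and $X \times (Y \times Z)$ with each other in the obvious manner. If we do so, show that $(\mathcal{B}_X \times \mathcal{B}_Y) \times \mathcal{B}_Z = \mathcal{B}_X \times (\mathcal{B}_Y \times \mathcal{B}_Z)$ and $(\mu_X \times \mu_Y) \times \mu_Z = \mu_X \times (\mu_Y \times \mu_Z)$.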
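Now we integrate using this product measure. We will need the following technical lemma. Define a monotone class in $X$ to be a collection $\mathcal{B}$ of subsets of $X$ with the following two closure properties:
- If $E_1 \subset E_2 \subset \ldots$ are a countable increasing sequence of sets in $\mathcal{B}$, then $\bigcup_{n=1}^\infty E_n \in \mathcal{B}$.
- If $E_1 \supset E_2 \supset \ldots$ are a countable decreasing sequence of sets in $\mathcal{B}$, then $\bigcap_{n=1}^\infty E_n \in \mathcal{B}$.
Lemma 35 (Monotone class lemma) Let $\mathcal{A}$ be a Boolean algebra on $X$. Then $\langle \mathcal{A} \rangle$ is the smallest monotone class that contains $\mathcal{A}$.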
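Proof: Let $\mathcal{B}$ be the intersection of all the monotone classes that contain $\mathcal{A}$. Since $\langle \mathcal{A} \rangle$ is clearly one such class, $\mathcal{B}$ is a subset of $\langle \mathcal{A} \rangle$. Our task is then to show that $\mathcal{B}$ contains $\langle \mathcal{A} \rangle$.
It is also clear that $\mathcal{B}$ is a monotone class that contains $\mathcal{A}$. By replacing all the elements of $\mathcal{B}$ with their complements, we see that $\mathcal{B}$ is necessarily closed under complements.
For any $E \in \mathcal{A}$, consider the set $\mathcal{C}_E$ of all sets $F \in \mathcal{B}$ such that $F \backslash E$, $E \backslash F$, $F \cap E$, and $X \backslash (E \cup F)$ all lie in $\mathcal{B}$. It is clear that $\mathcal{C}_E$ contains $\mathcal{A}$; since $\mathcal{B}$ is a monotone class, we see that $\mathcal{C}_E$ is also. By definition of $\mathcal{B}$, we conclude that $\mathcal{C}_E = \mathcal{B}$ for all $E \in \mathcal{A}$.
Next, let $\mathcal{D}$ be the set of all $E \in \mathcal{B}$ such that $F \backslash E$, $E \backslash F$, $F \cap E$, and $X \backslash (E \cup F)$ all lie in $\mathcal{B}$ for all $F \in \mathcal{B}$. By the previous discussion, we see that $\mathcal{D}$ contains $\mathcal{A}$. One also easily verifies that $\mathcal{D}$ is a monotone class. By definition of $\mathcal{B}$, we conclude that $\mathcal{D} = \mathcal{B}$. Since $\mathcal{B}$ is also closed under complements, this implies that $\mathcal{B}$ is closed with respect to finite unions. Since this class also contains $\mathcal{A}$, which contains $\emptyset$, we conclude that $\mathcal{B}$ is a Boolean algebra. Since $\mathcal{B}$ is also closed under increasing countable unions, we conclude that it is closed under arbitrary countable unions, and is thus a $\sigma$-algebra. As it contains $\mathcal{A}$, it must also contain $\langle \mathcal{A} \rangle$.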
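Theorem 36 (Tonelli’s theorem, incomplete version) Let $(X, \mathcal{B}_X, \mu_X)$ and $(Y, \mathcal{B}_Y, \mu_Y)$ be $\sigma$-finite measure spaces, and let $f: X \times Y \to [0,+\infty]$ be measurable with respect to $\mathcal{B}_X \times \mathcal{B}_Y$. Then:
- The functions $x \mapsto \int_Y f(x,y)\ d\mu_Y(y)$ and $y \mapsto \int_X f(x,y)\ d\mu_X(x)$ (which are well-defined, thanks to Exercise 27) are measurable with respect to $\mathcal{B}_X$ and $\mathcal{B}_Y$ respectively.
- We have
$$\int_{X \times Y} f(x,y)\ d\mu_X \times \mu_Y(x,y) = \int_X (\int_Y f(x,y)\ d\mu_Y(y))\ d\mu_X(x) = \int_Y (\int_X f(x,y)\ d\mu_X(x))\ d\mu_Y(y).$$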
Proof: By writing the $\sigma$-finite space $X$ as an increasing union $X = \bigcup_{n=1}^\infty X_n$ of finite measure sets, we see from several applications of the monotone convergence theorem that it suffices to prove the claims with $X$ replaced by $X_n$. Thus we may assume without loss of generality that $X$ has finite measure. Similarly we may assume $Y$ has finite measure. Note from (6) that this implies that $X \times Y$ has finite measure also.
Every unsigned measurable function is the increasing limit of unsigned simple functions. By several applications of the monotone convergence theorem, we thus see that it suffices to verify the claim when $f$ is a simple function. By linearity, it then suffices to verify the claim when $f$ is an indicator function, thus $f = 1_S$ for some $S \in \mathcal{B}_X \times \mathcal{B}_Y$.
Let $\mathcal{C}$ be the set of all $S \in \mathcal{B}_X \times \mathcal{B}_Y$ for which the claims hold. From repeated applications of the monotone convergence theorem and the downward monotone convergence theorem (which is available in this finite measure setting) we see that $\mathcal{C}$ is a monotone class.
By direct computation (using (6)), we see that $\mathcal{C}$ contains as an element any product $S = E \times F$ with $E \in \mathcal{B}_X$ and $F \in \mathcal{B}_Y$. By finite additivity, we conclude that $\mathcal{C}$ also contains as an element any disjoint finite union $S = (E_1 \times F_1) \cup \ldots \cup (E_k \times F_k)$ of such products. This implies that $\mathcal{C}$ also contains the Boolean algebra $\mathcal{B}_0$ in the proof of Proposition 30, as such sets can always be expressed as the disjoint finite union of Cartesian products of measurable sets. Applying the monotone class lemma, we conclude that $\mathcal{C}$ contains $\langle \mathcal{B}_0 \rangle = \mathcal{B}_X \times \mathcal{B}_Y$, and the claim follows.
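For weighted counting measures on finite sets, all three integrals in Tonelli's theorem reduce to finite sums, and the theorem becomes the statement that a finite double sum can be evaluated in either order. The sketch below checks this for one hypothetical choice of weights and of an unsigned function `f`. It illustrates the identity and makes no attempt to mirror the monotone class argument.

```python
mu_X = {0: 1, 1: 2, 2: 0}     # hypothetical point masses on X = {0, 1, 2}
mu_Y = {0: 3, 1: 1}           # hypothetical point masses on Y = {0, 1}

def f(x, y):
    """An arbitrary unsigned function on X x Y."""
    return (x + 1) * (y + 2)

# Integral of f against the product measure, whose point masses are mu_X[x] * mu_Y[y].
double_integral = sum(f(x, y) * mu_X[x] * mu_Y[y] for x in mu_X for y in mu_Y)

# Iterated integrals: integrate in y first and then x, and vice versa.
y_then_x = sum(mu_X[x] * sum(f(x, y) * mu_Y[y] for y in mu_Y) for x in mu_X)
x_then_y = sum(mu_Y[y] * sum(f(x, y) * mu_X[x] for x in mu_X) for y in mu_Y)
```

All three quantities come out equal (to 45 for this data). Taking both families of weights to be identically 1 recovers the interchange of two finite sums.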
Remark 37 Note that Tonelli’s theorem for sums (Theorem 2 from Notes 1) is a special case of the above result when $\mu_X, \mu_Y$ are counting measure. In a similar spirit, Corollary 15 from Notes 3 is the special case when just one of $\mu_X, \mu_Y$ is counting measure.
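Corollary 38 Let $(X, \mathcal{B}_X, \mu_X)$ and $(Y, \mathcal{B}_Y, \mu_Y)$ be $\sigma$-finite measure spaces, and let $E \in \mathcal{B}_X \times \mathcal{B}_Y$ be a null set with respect to $\mu_X \times \mu_Y$. Then for $\mu_X$-almost every $x \in X$, the set $E_x := \{ y \in Y: (x,y) \in E \}$ is a $\mu_Y$-null set; and similarly, for $\mu_Y$-almost every $y \in Y$, the set $E^y := \{ x \in X: (x,y) \in E \}$ is a $\mu_X$-null set.
Proof: Applying the Tonelli theorem to the indicator function $1_E$, we conclude that
$$0 = \int_X (\int_Y 1_E(x,y)\ d\mu_Y(y))\ d\mu_X(x) = \int_Y (\int_X 1_E(x,y)\ d\mu_X(x))\ d\mu_Y(y)$$
and thus
$$0 = \int_X \mu_Y(E_x)\ d\mu_X(x) = \int_Y \mu_X(E^y)\ d\mu_Y(y),$$
and the claim follows.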
With this corollary, we can extend Tonelli’s theorem to the completion $(X \times Y, \overline{\mathcal{B}_X \times \mathcal{B}_Y}, \overline{\mu_X \times \mu_Y})$ of the product space $(X \times Y, \mathcal{B}_X \times \mathcal{B}_Y, \mu_X \times \mu_Y)$ (see Exercise 26 of Notes 3 for the definition of completion):
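Theorem 39 (Tonelli’s theorem, complete version) Let $(X, \mathcal{B}_X, \mu_X)$ and $(Y, \mathcal{B}_Y, \mu_Y)$ be complete $\sigma$-finite measure spaces, and let $f: X \times Y \to [0,+\infty]$ be measurable with respect to $\overline{\mathcal{B}_X \times \mathcal{B}_Y}$. Then:
- For $\mu_X$-almost every $x \in X$, the function $y \mapsto f(x,y)$ is $\mathcal{B}_Y$-measurable, and in particular $\int_Y f(x,y)\, d\mu_Y(y)$ exists. Furthermore, the ($\mu_X$-almost everywhere defined) map $x \mapsto \int_Y f(x,y)\, d\mu_Y(y)$ is $\mathcal{B}_X$-measurable.
- For $\mu_Y$-almost every $y \in Y$, the function $x \mapsto f(x,y)$ is $\mathcal{B}_X$-measurable, and in particular $\int_X f(x,y)\, d\mu_X(x)$ exists. Furthermore, the ($\mu_Y$-almost everywhere defined) map $y \mapsto \int_X f(x,y)\, d\mu_X(x)$ is $\mathcal{B}_Y$-measurable.
- We have
$$\int_{X \times Y} f(x,y)\, d\overline{\mu_X \times \mu_Y}(x,y) = \int_X \left( \int_Y f(x,y)\, d\mu_Y(y) \right) d\mu_X(x) = \int_Y \left( \int_X f(x,y)\, d\mu_X(x) \right) d\mu_Y(y). \qquad (7)$$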
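Proof: From Exercise 26 of Notes 3, every measurable set in $\overline{\mathcal{B}_X \times \mathcal{B}_Y}$ is equal to a measurable set in $\mathcal{B}_X \times \mathcal{B}_Y$ outside of a $\mu_X \times \mu_Y$-null set. This implies that the $\overline{\mathcal{B}_X \times \mathcal{B}_Y}$-measurable function $f$ agrees with a $\mathcal{B}_X \times \mathcal{B}_Y$-measurable function $\tilde f$ outside of a $\mu_X \times \mu_Y$-null set $E$ (as can be seen by expressing $f$ as the limit of simple functions). From Corollary 38, we see that for $\mu_X$-almost every $x \in X$, the function $y \mapsto f(x,y)$ agrees with $y \mapsto \tilde f(x,y)$ outside of a $\mu_Y$-null set (and is in particular measurable, as $(Y, \mathcal{B}_Y, \mu_Y)$ is complete); and similarly for $\mu_Y$-almost every $y \in Y$, the function $x \mapsto f(x,y)$ agrees with $x \mapsto \tilde f(x,y)$ outside of a $\mu_X$-null set and is measurable. The claim then follows by applying the previous version of Tonelli’s theorem to $\tilde f$.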
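Specialising to the case when $f$ is an indicator function $f = 1_E$, we conclude
Corollary 40 (Tonelli’s theorem for sets) Let $(X, \mathcal{B}_X, \mu_X)$ and $(Y, \mathcal{B}_Y, \mu_Y)$ be complete $\sigma$-finite measure spaces, and let $E \in \overline{\mathcal{B}_X \times \mathcal{B}_Y}$. Then:
- For $\mu_X$-almost every $x \in X$, the set $E_x := \{ y \in Y: (x,y) \in E \}$ lies in $\mathcal{B}_Y$, and in particular $\mu_Y(E_x)$ exists. Furthermore, the ($\mu_X$-almost everywhere defined) map $x \mapsto \mu_Y(E_x)$ is $\mathcal{B}_X$-measurable.
- For $\mu_Y$-almost every $y \in Y$, the set $E^y := \{ x \in X: (x,y) \in E \}$ lies in $\mathcal{B}_X$, and in particular $\mu_X(E^y)$ exists. Furthermore, the ($\mu_Y$-almost everywhere defined) map $y \mapsto \mu_X(E^y)$ is $\mathcal{B}_Y$-measurable.
- We have
$$\overline{\mu_X \times \mu_Y}(E) = \int_X \mu_Y(E_x)\, d\mu_X(x) = \int_Y \mu_X(E^y)\, d\mu_Y(y).$$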
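Exercise 41 The purpose of this exercise is to demonstrate that Tonelli’s theorem can fail if the $\sigma$-finite hypothesis is removed, and also that product measure need not be unique. Let $(X, \mathcal{B}_X, \mu_X)$ be the unit interval $[0,1]$ with Lebesgue measure $m$ (and the Lebesgue $\sigma$-algebra $\mathcal{L}([0,1])$), and let $(Y, \mathcal{B}_Y, \mu_Y)$ be the unit interval $[0,1]$ with counting measure $\#$ (and the discrete $\sigma$-algebra $2^{[0,1]}$). Let $f := 1_E$ be the indicator function of the diagonal $E := \{ (x,x): x \in [0,1] \}$.
- Show that $f$ is measurable in the product $\sigma$-algebra.
- Show that $\int_X \left( \int_Y f(x,y)\, d\mu_Y(y) \right) d\mu_X(x) = 1$.
- Show that $\int_Y \left( \int_X f(x,y)\, d\mu_X(x) \right) d\mu_Y(y) = 0$.
- Show that there is more than one measure $\mu$ on $\mathcal{B}_X \times \mathcal{B}_Y$ with the property that $\mu(E \times F) = \mu_X(E) \mu_Y(F)$ for all $E \in \mathcal{B}_X$ and $F \in \mathcal{B}_Y$. (Hint: use the two different ways to perform a double integral to create two different measures.)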
Remark 42 If $f$ is not assumed to be measurable in the product space (or its completion), then of course the expression $\int_{X \times Y} f(x,y)\, d\overline{\mu_X \times \mu_Y}(x,y)$ does not make sense. Furthermore, in this case the remaining two expressions in (7) may become different as well (in some models of set theory, at least), even when $X$ and $Y$ are finite measure. For instance, let us assume the continuum hypothesis, which implies that the unit interval $[0,1]$ can be placed in one-to-one correspondence with the first uncountable ordinal $\omega_1$. Let $\prec$ be the ordering of $[0,1]$ that is associated to this ordinal, let $E := \{ (x,y) \in [0,1]^2: x \prec y \}$, and let $f := 1_E$. Then, for any $y \in [0,1]$, there are at most countably many $x$ such that $x \prec y$, and so $\int_{[0,1]} f(x,y)\, dx$ exists and is equal to zero for every $y$. On the other hand, for every $x \in [0,1]$, one has $x \prec y$ for all but countably many $y \in [0,1]$, and so $\int_{[0,1]} f(x,y)\, dy$ exists and is equal to one for every $x$, and so the last two expressions in (7) exist but are unequal. (In particular, Tonelli’s theorem implies that $E$ cannot be a Lebesgue measurable subset of $[0,1]^2$.) Thus we see that measurability in the product space is an important hypothesis. (There do however exist models of set theory (with the axiom of choice) in which such counterexamples cannot be constructed, at least in the case when $X$ and $Y$ are the unit interval with Lebesgue measure.)
Tonelli’s theorem is for the unsigned integral, but it leads to an important analogue for the absolutely convergent integral, known as Fubini’s theorem:
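Theorem 43 (Fubini’s theorem) Let $(X, \mathcal{B}_X, \mu_X)$ and $(Y, \mathcal{B}_Y, \mu_Y)$ be complete $\sigma$-finite measure spaces, and let $f: X \times Y \to \mathbb{C}$ be absolutely integrable with respect to $\overline{\mathcal{B}_X \times \mathcal{B}_Y}$. Then:
- For $\mu_X$-almost every $x \in X$, the function $y \mapsto f(x,y)$ is absolutely integrable with respect to $\mu_Y$, and in particular $\int_Y f(x,y)\, d\mu_Y(y)$ exists. Furthermore, the ($\mu_X$-almost everywhere defined) map $x \mapsto \int_Y f(x,y)\, d\mu_Y(y)$ is absolutely integrable with respect to $\mu_X$.
- For $\mu_Y$-almost every $y \in Y$, the function $x \mapsto f(x,y)$ is absolutely integrable with respect to $\mu_X$, and in particular $\int_X f(x,y)\, d\mu_X(x)$ exists. Furthermore, the ($\mu_Y$-almost everywhere defined) map $y \mapsto \int_X f(x,y)\, d\mu_X(x)$ is absolutely integrable with respect to $\mu_Y$.
- We have
$$\int_{X \times Y} f(x,y)\, d\overline{\mu_X \times \mu_Y}(x,y) = \int_X \left( \int_Y f(x,y)\, d\mu_Y(y) \right) d\mu_X(x) = \int_Y \left( \int_X f(x,y)\, d\mu_X(x) \right) d\mu_Y(y).$$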
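Proof: By taking real and imaginary parts we may assume that $f$ is real; by taking positive and negative parts we may assume that $f$ is unsigned. But then the claim follows from Tonelli’s theorem; note from (7) that $\int_X \left( \int_Y f(x,y)\, d\mu_Y(y) \right) d\mu_X(x)$ is finite, and so $\int_Y f(x,y)\, d\mu_Y(y) < \infty$ for $\mu_X$-almost every $x \in X$, and similarly $\int_X f(x,y)\, d\mu_X(x) < \infty$ for $\mu_Y$-almost every $y \in Y$.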
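Exercise 44 Give an example of a Borel measurable function $f: [0,1]^2 \to \mathbb{R}$ such that the integrals $\int_{[0,1]} f(x,y)\, dy$ and $\int_{[0,1]} f(x,y)\, dx$ exist and are absolutely integrable for all $x \in [0,1]$ and $y \in [0,1]$ respectively, and that $\int_{[0,1]} \left( \int_{[0,1]} f(x,y)\, dy \right) dx$ and $\int_{[0,1]} \left( \int_{[0,1]} f(x,y)\, dx \right) dy$ exist and are absolutely integrable, but such that these two iterated integrals are unequal. (Hint: adapt the example from Remark 2 of Notes 1.) Thus we see that Fubini’s theorem fails when one drops the hypothesis that $f$ is absolutely integrable with respect to the product space.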
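The discrete phenomenon behind such counterexamples can be seen in a few lines of Python. The array below (`+1` on the diagonal, `-1` immediately to its right) is a standard illustrative choice and is an assumption here; it is not claimed to be the exact example of Remark 2 of Notes 1, and transplanting it to $[0,1]^2$ remains the content of the exercise.

```python
def a(n: int, m: int) -> int:
    """Hypothetical signed array on n, m >= 1: +1 on the diagonal, -1 just right of it."""
    if m == n:
        return 1
    if m == n + 1:
        return -1
    return 0

def row_sum(n: int) -> int:
    # Row n has only two non-zero entries, at m = n and m = n + 1.
    return sum(a(n, m) for m in range(1, n + 3))

def column_sum(m: int) -> int:
    # Column m has non-zero entries only at n = m and (if m >= 2) n = m - 1.
    return sum(a(n, m) for n in range(1, m + 2))

N = 50
sum_of_row_sums = sum(row_sum(n) for n in range(1, N + 1))        # every row sums to 0
sum_of_column_sums = sum(column_sum(m) for m in range(1, N + 1))  # only column 1 contributes
print(sum_of_row_sums, sum_of_column_sums)
```

Each row and each column has finite support, so the inner sums are exact, and the outer sums stabilise for every `N`. The two orders disagree because the sum of the absolute values of the entries is infinite.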
Remark 45 Despite the failure of Tonelli’s theorem in the non-$\sigma$-finite setting, it is possible to (carefully) extend Fubini’s theorem to the non-$\sigma$-finite setting, as the absolute integrability hypotheses, when combined with Markov’s inequality, can provide a substitute for the $\sigma$-finite property. However, we will not do so here, and indeed I would recommend proceeding with extreme caution when performing any sort of interchange of integrals or invoking of product measure when one is not in the $\sigma$-finite setting.
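Informally, Fubini’s theorem allows one to always interchange the order of two integrals, as long as the integrand is absolutely integrable in the product space (or its completion). In particular, specialising to Lebesgue measure, we have
$$\int_{\mathbb{R}^{d+d'}} f(x,y)\, d(x,y) = \int_{\mathbb{R}^d} \left( \int_{\mathbb{R}^{d'}} f(x,y)\, dy \right) dx = \int_{\mathbb{R}^{d'}} \left( \int_{\mathbb{R}^d} f(x,y)\, dx \right) dy$$
whenever $f: \mathbb{R}^{d+d'} \to \mathbb{C}$ is absolutely integrable. In view of this, we often write $dx\, dy$ (or $dy\, dx$) for $d(x,y)$.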
By combining Fubini’s theorem with Tonelli’s theorem, we can recast the absolute integrability hypothesis:
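Corollary 46 (Fubini-Tonelli theorem) Let $(X, \mathcal{B}_X, \mu_X)$ and $(Y, \mathcal{B}_Y, \mu_Y)$ be complete $\sigma$-finite measure spaces, and let $f: X \times Y \to \mathbb{C}$ be measurable with respect to $\overline{\mathcal{B}_X \times \mathcal{B}_Y}$. If
$$\int_X \left( \int_Y |f(x,y)|\, d\mu_Y(y) \right) d\mu_X(x) < \infty$$
(note the left-hand side always exists, by Tonelli’s theorem) then $f$ is absolutely integrable with respect to $\overline{\mathcal{B}_X \times \mathcal{B}_Y}$, and in particular the conclusions of Fubini’s theorem hold. Similarly if we use $\int_Y \left( \int_X |f(x,y)|\, d\mu_X(x) \right) d\mu_Y(y)$ instead of $\int_X \left( \int_Y |f(x,y)|\, d\mu_Y(y) \right) d\mu_X(x)$.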
The Fubini-Tonelli theorem is an indispensable tool for computing integrals. We give some basic examples below:
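Exercise 47 (Area interpretation of integral) Let $(X, \mathcal{B}, \mu)$ be a $\sigma$-finite measure space, and let $\mathbb{R}$ be equipped with Lebesgue measure $m$ and the Borel $\sigma$-algebra $\mathcal{B}[\mathbb{R}]$. Show that if $f: X \to [0,+\infty]$ is measurable, then the set $\{ (x,t) \in X \times \mathbb{R}: 0 \leq t \leq f(x) \}$ is measurable in $\mathcal{B} \times \mathcal{B}[\mathbb{R}]$, and
$$(\mu \times m)( \{ (x,t) \in X \times \mathbb{R}: 0 \leq t \leq f(x) \} ) = \int_X f(x)\, d\mu(x).$$
Similarly if we replace $\{ (x,t) \in X \times \mathbb{R}: 0 \leq t \leq f(x) \}$ by $\{ (x,t) \in X \times \mathbb{R}: 0 \leq t < f(x) \}$.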
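Exercise 48 (Distribution formula) Let $(X, \mathcal{B}, \mu)$ be a $\sigma$-finite measure space, and let $f: X \to [0,+\infty]$ be measurable. Show that
$$\int_X f(x)\, d\mu(x) = \int_{[0,+\infty)} \mu( \{ x \in X: f(x) \geq \lambda \} )\, d\lambda.$$
(Note that the integrand on the right-hand side is monotone and thus Lebesgue measurable.) Similarly if we replace $\{ x \in X: f(x) \geq \lambda \}$ by $\{ x \in X: f(x) > \lambda \}$.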
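The distribution formula can be checked by hand for simple functions, where both sides are finite sums. The sketch below does this in Python on a hypothetical four-point measure space; the point names, weights and function values are illustrative assumptions, and exact rationals are used throughout.

```python
from fractions import Fraction

# Hypothetical finite measure space: each point x carries the weight mu({x}).
mu = {"a": Fraction(1, 2), "b": Fraction(3, 2), "c": Fraction(2), "d": Fraction(1, 4)}
# Hypothetical unsigned simple function f on that space.
f = {"a": Fraction(3), "b": Fraction(1), "c": Fraction(3), "d": Fraction(0)}

def integral(f: dict, mu: dict) -> Fraction:
    """Integral of f against mu on a finite space: sum of f(x) * mu({x})."""
    return sum((f[x] * mu[x] for x in mu), Fraction(0))

def layer_cake(f: dict, mu: dict) -> Fraction:
    """Integrate lambda -> mu({f >= lambda}) over (0, +infinity) exactly.

    When f takes finitely many values the integrand is a step function,
    constant on each interval (previous value, next value].
    """
    total = Fraction(0)
    previous = Fraction(0)
    for value in sorted(set(f.values())):
        if value == 0:
            continue
        level_measure = sum((mu[x] for x in mu if f[x] >= value), Fraction(0))
        total += (value - previous) * level_measure
        previous = value
    return total

print(integral(f, mu), layer_cake(f, mu))
```

The general case of the exercise then follows the usual route from simple functions to unsigned measurable functions via monotone convergence, or more directly from Tonelli’s theorem applied to the region under the graph.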
Exercise 49 (Approximations to the identity) Let $P: \mathbb{R}^d \to \mathbb{R}^+$ be a good kernel (see Exercise 26 from Notes 5), and let $P_t$ be the associated rescaled functions. Show that if $f: \mathbb{R}^d \to \mathbb{C}$ is absolutely integrable, then $f * P_t$ converges in $L^1$ norm to $f$ as $t \to 0$. (Hint: use the density argument. You will need an upper bound on $\| f * P_t \|_{L^1(\mathbb{R}^d)}$ which can be obtained using Tonelli’s theorem.)
— 5. Application: the Rademacher differentiation theorem (Optional) —
The Fubini-Tonelli theorem is often used in extending lower-dimensional results to higher-dimensional ones. We illustrate this by extending the one-dimensional Lipschitz differentiation theorem (Exercise 40 from Notes 5) to higher dimensions. We first recall some higher-dimensional definitions:
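Definition 50 (Lipschitz continuity) A function $f: X \to Y$ from one metric space $(X, d_X)$ to another $(Y, d_Y)$ is said to be Lipschitz continuous if there exists a constant $C > 0$ such that $d_Y(f(x), f(x')) \leq C d_X(x,x')$ for all $x, x' \in X$. (In our current application, $X$ will be $\mathbb{R}^d$ and $Y$ will be $\mathbb{R}$, with the usual metrics.)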
Exercise 51 Show that Lipschitz continuous functions are uniformly continuous, and hence continuous. Then give an example of a uniformly continuous function $f: [0,1] \to [0,1]$ that is not Lipschitz continuous.
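Definition 52 (Differentiability) Let $f: \mathbb{R}^d \to \mathbb{R}$ be a function, and let $x_0 \in \mathbb{R}^d$. For any $v \in \mathbb{R}^d$, we say that $f$ is directionally differentiable at $x_0$ in the direction $v$ if the limit
$$D_v f(x_0) := \lim_{h \to 0;\ h \in \mathbb{R} \backslash \{0\}} \frac{f(x_0 + hv) - f(x_0)}{h}$$
exists, in which case we call $D_v f(x_0)$ the directional derivative of $f$ at $x_0$ in this direction. If $v = e_i$ is one of the standard basis vectors $e_1, \ldots, e_d$ of $\mathbb{R}^d$, we write $D_v f(x_0)$ as $\frac{\partial f}{\partial x_i}(x_0)$, and refer to this as the partial derivative of $f$ at $x_0$ in the $e_i$ direction.
We say that $f$ is totally differentiable at $x_0$ if there exists a vector $\nabla f(x_0) \in \mathbb{R}^d$ with the property that
$$\lim_{h \to 0;\ h \in \mathbb{R}^d \backslash \{0\}} \frac{f(x_0 + h) - f(x_0) - h \cdot \nabla f(x_0)}{|h|} = 0,$$
where $v \cdot w$ is the usual dot product on $\mathbb{R}^d$. We refer to $\nabla f(x_0)$ (if it exists) as the gradient of $f$ at $x_0$.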
Remark 53 From the viewpoint of differential geometry, it is better to work not with the gradient vector $\nabla f(x_0) \in \mathbb{R}^d$, but rather with the derivative covector $df(x_0): \mathbb{R}^d \to \mathbb{R}$ given by $df(x_0): v \mapsto v \cdot \nabla f(x_0)$. This is because one can then define the notion of total differentiability without any mention of the Euclidean dot product, which allows one to extend this notion to other manifolds in which there is no Euclidean (or more generally, Riemannian) structure. However, as we are working exclusively in Euclidean space for this application, this distinction will not be important for us.
Total differentiability implies directional and partial differentiability, but not conversely, as the following three exercises demonstrate.
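Exercise 54 (Total differentiability implies directional and partial differentiability) Show that if $f: \mathbb{R}^d \to \mathbb{R}$ is totally differentiable at $x_0$, then it is directionally differentiable at $x_0$ in each direction $v \in \mathbb{R}^d$, and one has the formula
$$D_v f(x_0) = v \cdot \nabla f(x_0). \qquad (9)$$
In particular, the partial derivatives $\frac{\partial f}{\partial x_i}(x_0)$ exist for $i = 1, \ldots, d$ and
$$\nabla f(x_0) = \left( \frac{\partial f}{\partial x_1}(x_0), \ldots, \frac{\partial f}{\partial x_d}(x_0) \right). \qquad (10)$$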
Exercise 55 (Continuous partial differentiability implies total differentiability) Let $f: \mathbb{R}^d \to \mathbb{R}$ be such that the partial derivatives $\frac{\partial f}{\partial x_i}: \mathbb{R}^d \to \mathbb{R}$ exist everywhere and are continuous. Then show that $f$ is totally differentiable everywhere, which in particular implies that the gradient is given by the formula (10) and the directional derivatives are given by (9).
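Exercise 56 (Directional differentiability does not imply total differentiability) Let $f: \mathbb{R}^2 \to \mathbb{R}$ be defined by setting $f(0,0) := 0$ and $f(x_1,x_2) := \frac{x_1 x_2^2}{x_1^2 + x_2^2}$ for $(x_1,x_2) \in \mathbb{R}^2 \backslash \{(0,0)\}$. Show that the directional derivatives $D_v f(x)$ exist for all $x, v \in \mathbb{R}^2$ (so in particular, the partial derivatives exist), but that $f$ is not totally differentiable at the origin $(0,0)$.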
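The gap between directional and total differentiability can be probed numerically. The Python sketch below takes, as an assumption, the function $f(x_1,x_2) = x_1 x_2^2 / (x_1^2 + x_2^2)$ with $f(0,0) = 0$ (a standard example of this phenomenon) and computes difference quotients at the origin in exact rational arithmetic.

```python
from fractions import Fraction

def f(x1: Fraction, x2: Fraction) -> Fraction:
    """Assumed example: f(0, 0) = 0 and f = x1 * x2^2 / (x1^2 + x2^2) elsewhere."""
    if x1 == 0 and x2 == 0:
        return Fraction(0)
    return x1 * x2**2 / (x1**2 + x2**2)

def difference_quotient(v, h: Fraction) -> Fraction:
    """(f(0 + h v) - f(0)) / h at the origin, for a direction v and a step h != 0."""
    return (f(h * v[0], h * v[1]) - f(Fraction(0), Fraction(0))) / h

e1 = (Fraction(1), Fraction(0))
e2 = (Fraction(0), Fraction(1))
v = (Fraction(1), Fraction(1))
h = Fraction(1, 1000)

# f is homogeneous of degree 1, so the quotient does not depend on h and
# equals f(v): the directional derivative at the origin exists in every direction.
d_e1 = difference_quotient(e1, h)   # partial derivative in the e_1 direction
d_e2 = difference_quotient(e2, h)   # partial derivative in the e_2 direction
d_v = difference_quotient(v, h)     # directional derivative in the direction (1, 1)

# Total differentiability at the origin would force D_v f(0) = v . grad f(0).
linear_prediction = v[0] * d_e1 + v[1] * d_e2
print(d_e1, d_e2, d_v, linear_prediction)
```

Both partial derivatives at the origin vanish, so the only possible gradient is zero, yet the derivative in the direction $(1,1)$ is non-zero. This failure of linearity in $v$ is exactly what rules out total differentiability there.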
Now we can state the Rademacher differentiation theorem.
Theorem 57 (Rademacher differentiation theorem) Let $f: \mathbb{R}^d \to \mathbb{R}$ be Lipschitz continuous. Then $f$ is totally differentiable at $x_0$ for almost every $x_0 \in \mathbb{R}^d$.
Note that the $d=1$ case of this theorem is Exercise 40 from Notes 5, and indeed we will use the one-dimensional theorem to imply the higher-dimensional one, though there will be some technical issues due to the gap between directional and total differentiability.
Proof: The strategy here is to first aim for the more modest goal of directional differentiability, and then find a way to link the directional derivatives together to get total differentiability.
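Let $v \in \mathbb{R}^d$. As $f$ is continuous, we see that in order for the directional derivative
$$D_v f(x_0) := \lim_{h \to 0;\ h \in \mathbb{R} \backslash \{0\}} \frac{f(x_0 + hv) - f(x_0)}{h}$$
to exist, it suffices to let $h$ range in the dense subset $\mathbb{Q} \backslash \{0\}$ of $\mathbb{R} \backslash \{0\}$ for the purposes of determining whether the limit exists. In particular, $D_v f(x_0)$ exists if and only if
$$\limsup_{h \to 0;\ h \in \mathbb{Q} \backslash \{0\}} \frac{f(x_0 + hv) - f(x_0)}{h} = \liminf_{h \to 0;\ h \in \mathbb{Q} \backslash \{0\}} \frac{f(x_0 + hv) - f(x_0)}{h}.$$
From this we easily conclude that for each direction $v \in \mathbb{R}^d$, the set
$$E_v := \{ x_0 \in \mathbb{R}^d: D_v f(x_0) \hbox{ does not exist} \}$$
is Lebesgue measurable in $\mathbb{R}^d$ (indeed, it is even Borel measurable). A similar argument reveals that $D_v f$ is a measurable function outside of $E_v$. From the Lipschitz nature of $f$, we see that $D_v f$ is also a bounded function.
Now we claim that $E_v$ is a null set for each $v \in \mathbb{R}^d$. For $v = 0$, $E_v$ is clearly empty, so we may assume $v \neq 0$. Applying an invertible linear transformation to map $v$ to $e_1$ (noting that such transformations will map Lipschitz functions to Lipschitz functions, and null sets to null sets) we may assume without loss of generality that $v$ is the basis vector $e_1$. Thus our task is now to show that $\frac{\partial f}{\partial x_1}(x)$ exists for almost every $x \in \mathbb{R}^d$.
We now split $\mathbb{R}^d$ as $\mathbb{R} \times \mathbb{R}^{d-1}$. For each $x_0 \in \mathbb{R}$ and $y_0 \in \mathbb{R}^{d-1}$, we see from the definitions that $\frac{\partial f}{\partial x_1}(x_0,y_0)$ exists if and only if the one-dimensional function $x \mapsto f(x,y_0)$ is differentiable at $x_0$. But this function is Lipschitz continuous (this is inherited from the Lipschitz continuity of $f$), and so we see that for each fixed $y_0 \in \mathbb{R}^{d-1}$, the set $E^{y_0} := \{ x_0 \in \mathbb{R}: (x_0,y_0) \in E_{e_1} \}$ is a null set in $\mathbb{R}$. Applying Tonelli’s theorem for sets (Corollary 40), we conclude that $E_{e_1}$ is a null set as required.
We would like to now conclude that $\bigcup_{v \in \mathbb{R}^d} E_v$ is a null set, but there are uncountably many $v$’s, so this is not directly possible. However, as $\mathbb{Q}^d$ is countable, we can at least assert that $E := \bigcup_{v \in \mathbb{Q}^d} E_v$ is a null set. In particular, for almost every $x_0 \in \mathbb{R}^d$, $f$ is directionally differentiable in every rational direction $v \in \mathbb{Q}^d$.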
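Now we perform an important trick, in which we interpret the directional derivative as a weak derivative. We already know that $D_v f$ is almost everywhere defined, bounded and measurable. Now let $g: \mathbb{R}^d \to \mathbb{R}$ be any function that is compactly supported and Lipschitz continuous. We investigate the integral
$$\int_{\mathbb{R}^d} D_v f(x) g(x)\, dx.$$
This integral is absolutely convergent since $D_v f$ is bounded and measurable, and $g$ is continuous and compactly supported, hence bounded. We expand this out as
$$\int_{\mathbb{R}^d} \lim_{h \to 0;\ h \in \mathbb{R} \backslash \{0\}} \frac{f(x+hv) - f(x)}{h} g(x)\, dx.$$
Note (from the Lipschitz nature of $f$) that the expression $\frac{f(x+hv) - f(x)}{h} g(x)$ is bounded uniformly in $h$ and $x$, and is also uniformly compactly supported in $x$ for $h$ in a bounded set. We may thus apply the Lebesgue dominated convergence theorem to pull the limit out of the integral to obtain
$$\lim_{h \to 0;\ h \in \mathbb{R} \backslash \{0\}} \int_{\mathbb{R}^d} \frac{f(x+hv) - f(x)}{h} g(x)\, dx.$$
Now, from translation invariance of the Lebesgue integral (Exercise 15) we have
$$\int_{\mathbb{R}^d} f(x+hv) g(x)\, dx = \int_{\mathbb{R}^d} f(x) g(x-hv)\, dx$$
and so (by the linearity of the Lebesgue integral) we may rearrange the previous expression as
$$\lim_{h \to 0;\ h \in \mathbb{R} \backslash \{0\}} \int_{\mathbb{R}^d} f(x) \frac{g(x-hv) - g(x)}{h}\, dx.$$
Now, as $g$ is Lipschitz, we know that $\frac{g(x-hv) - g(x)}{h}$ is uniformly bounded and converges pointwise almost everywhere to $-D_v g(x)$ as $h \to 0$. We may thus apply the dominated convergence theorem again and end up with the integration by parts formula
$$\int_{\mathbb{R}^d} D_v f(x) g(x)\, dx = - \int_{\mathbb{R}^d} f(x) D_v g(x)\, dx.$$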
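The integration by parts identity $\int D_v f \cdot g = - \int f \cdot D_v g$ can be sanity-checked numerically in one dimension. In the Python sketch below, the Lipschitz function $f(x) = |x|$, the compactly supported $C^1$ test function $g$, the integration window and the midpoint rule are all illustrative assumptions.

```python
def f(x: float) -> float:
    """Lipschitz but not differentiable at 0 (illustrative choice)."""
    return abs(x)

def df(x: float) -> float:
    """Almost-everywhere derivative of |x|; the value at x = 0 is irrelevant."""
    return 1.0 if x > 0 else -1.0

def g(x: float) -> float:
    """C^1 test function supported on [-0.7, 1.3] (illustrative choice)."""
    u = x - 0.3
    return (1.0 - u * u) ** 2 if abs(u) < 1.0 else 0.0

def dg(x: float) -> float:
    """Derivative of g."""
    u = x - 0.3
    return -4.0 * u * (1.0 - u * u) if abs(u) < 1.0 else 0.0

def midpoint_integral(func, a: float, b: float, cells: int) -> float:
    """Midpoint rule on [a, b] with the given number of equal cells."""
    width = (b - a) / cells
    return width * sum(func(a + (k + 0.5) * width) for k in range(cells))

# The window [-1, 2] contains the support of g; with 3000 cells the kink of f
# at 0 falls on a cell boundary rather than at a midpoint.
lhs = midpoint_integral(lambda x: df(x) * g(x), -1.0, 2.0, 3000)
rhs = -midpoint_integral(lambda x: f(x) * dg(x), -1.0, 2.0, 3000)
print(lhs, rhs)
```

The two printed values should agree up to quadrature error, even though $f$ fails to be differentiable at a point in the interior of the support of $g$.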
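This formula moves the directional derivative operator $D_v$ from $f$ over to $g$. At present, this does not look like much of an advantage, because $g$ is the same sort of function that $f$ is. However, the key point is that we can choose $g$ to be whatever we please, whereas $f$ is fixed. In particular, we can choose $g$ to be a compactly supported, continuously differentiable function (such functions are Lipschitz from the fundamental theorem of calculus, as their derivatives are bounded). By Exercise 55, one has $D_v g = v \cdot \nabla g$ for such functions, and so
$$\int_{\mathbb{R}^d} D_v f(x) g(x)\, dx = - \int_{\mathbb{R}^d} f(x) (v \cdot \nabla g)(x)\, dx.$$
The right-hand side is linear in $v$, and so the left-hand side must be linear in $v$ also. In particular, if $v = (v_1, \ldots, v_d)$, then we have
$$\int_{\mathbb{R}^d} D_v f(x) g(x)\, dx = \sum_{j=1}^d v_j \int_{\mathbb{R}^d} D_{e_j} f(x) g(x)\, dx.$$
If we define the gradient candidate function
$$\nabla f(x) := (D_{e_1} f(x), \ldots, D_{e_d} f(x)) = \left( \frac{\partial f}{\partial x_1}(x), \ldots, \frac{\partial f}{\partial x_d}(x) \right)$$
(note that this function is well-defined almost everywhere, even though we don’t know yet whether $f$ is totally differentiable almost everywhere), we thus have
$$\int_{\mathbb{R}^d} (D_v f - v \cdot \nabla f)(x) g(x)\, dx = 0$$
for all compactly supported, continuously differentiable $g$. This implies (see Exercise 58 below) that $D_v f - v \cdot \nabla f$ vanishes almost everywhere, thus (by countable subadditivity) we have
$$D_v f(x) = v \cdot \nabla f(x) \qquad (12)$$
for almost every $x \in \mathbb{R}^d$ and every $v \in \mathbb{Q}^d$.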
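Let $x_0 \in \mathbb{R}^d$ be such that (12) holds for all $v \in \mathbb{Q}^d$. We claim that this forces $f$ to be totally differentiable at $x_0$, which would give the claim. Let $f_1: \mathbb{R}^d \to \mathbb{R}$ be the modified function
$$f_1(h) := f(x_0 + h) - f(x_0) - h \cdot \nabla f(x_0).$$
Our objective is to show that
$$\lim_{h \to 0;\ h \in \mathbb{R}^d \backslash \{0\}} |f_1(h)| / |h| = 0.$$
On the other hand, we have $f_1(0) = 0$, $f_1$ is Lipschitz, and from (12) we see that $D_v f_1(0) = 0$ for every $v \in \mathbb{Q}^d$.
Let $\varepsilon > 0$, and suppose that $h \in \mathbb{R}^d \backslash \{0\}$. Then we can write $h = ru$ where $r := |h|$ and $u := h/|h|$ lies on the unit sphere. This $u$ need not lie in $\mathbb{Q}^d$, but we can approximate it by some vector $v \in \mathbb{Q}^d$ with $|u - v| \leq \varepsilon$. Furthermore, by the total boundedness of the unit sphere, we can make $v$ lie in a finite subset $V_\varepsilon$ of $\mathbb{Q}^d$ that only depends on $\varepsilon$ (and on $d$).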
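The finite set of rational directions can be made quite explicit. The Python sketch below uses one possible construction, which is an assumption rather than the one intended in the notes: round each coordinate of the unit vector $u$ to the nearest multiple of $1/q$, with $q$ chosen so that $\sqrt{d}/(2q) \leq \varepsilon$. The helper name `rational_net_point` is hypothetical.

```python
import math
from fractions import Fraction

def rational_net_point(u, eps: Fraction):
    """Round each coordinate of u to the nearest multiple of 1/q.

    q is chosen so that sqrt(d) / (2 q) <= eps, which bounds |u - v| by eps.
    The grid of such points inside [-1, 1]^d is finite and depends only on
    eps and d, not on u.
    """
    d = len(u)
    q = math.ceil(math.sqrt(d) / (2 * float(eps)))
    return tuple(Fraction(round(c * q), q) for c in u)

eps = Fraction(1, 10)
net = set()      # the rational directions actually used
worst = 0.0      # largest observed distance |u - v|
for k in range(360):
    t = 2 * math.pi * k / 360
    u = (math.cos(t), math.sin(t))
    v = rational_net_point(u, eps)
    net.add(v)
    worst = max(worst, math.dist(u, [float(c) for c in v]))
print(len(net), worst)
```

The approximating vectors need not lie on the unit sphere themselves, which matches the text: only $v \in \mathbb{Q}^d$ and $|u - v| \leq \varepsilon$ are required.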
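Since $D_v f_1(0) = 0$ for all $v \in V_\varepsilon$, we see (by making $r$ small enough depending on $V_\varepsilon$) that we have
$$\left| \frac{f_1(rv) - f_1(0)}{r} \right| \leq \varepsilon$$
for all $v \in V_\varepsilon$, and thus
$$|f_1(rv)| \leq r \varepsilon.$$
On the other hand, from the Lipschitz nature of $f_1$, we have
$$|f_1(ru) - f_1(rv)| \leq C r |u - v| \leq C r \varepsilon$$
where $C$ is the Lipschitz constant of $f_1$. As $h = ru$, we conclude that
$$|f_1(h)| \leq (C+1) r \varepsilon.$$
In other words, we have shown that
$$|f_1(h)| / |h| \leq (C+1) \varepsilon$$
whenever $|h|$ is sufficiently small depending on $\varepsilon$. Letting $\varepsilon \to 0$, we obtain the claim.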
Exercise 58 Let $f: \mathbb{R}^d \to \mathbb{R}$ be a locally integrable function with the property that $\int_{\mathbb{R}^d} f(x) g(x)\, dx = 0$ whenever $g$ is a compactly supported, continuously differentiable function. Show that $f$ is zero almost everywhere. (Hint: if not, use the Lebesgue differentiation theorem to find a Lebesgue point $x_0$ of $f$ for which $f(x_0) \neq 0$, then pick a $g$ which is supported in a sufficiently small neighbourhood of $x_0$.)
— 6. Infinite product spaces and the Kolmogorov extension theorem (optional) —
In Section 4 we considered the product of two sets, measurable spaces, or ($\sigma$-finite) measure spaces. We now consider how to generalise this concept to products of more than two such spaces. The axioms of set theory allow us to form a Cartesian product $X_A := \prod_{\alpha \in A} X_\alpha$ of any family $(X_\alpha)_{\alpha \in A}$ of sets indexed by another set $A$, which consists of the space of all tuples $x_A = (x_\alpha)_{\alpha \in A}$ indexed by $A$, for which $x_\alpha \in X_\alpha$ for all $\alpha \in A$. This concept allows for a succinct formulation of the axiom of choice (Axiom 3 from Notes 1), namely that an arbitrary Cartesian product of non-empty sets remains non-empty.
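For any $\beta \in A$, we have the coordinate projection maps $\pi_\beta: X_A \to X_\beta$ defined by $\pi_\beta( (x_\alpha)_{\alpha \in A} ) := x_\beta$. More generally, given any $B \subset A$, we define the partial projections $\pi_B: X_A \to X_B$ to the partial product space $X_B := \prod_{\alpha \in B} X_\alpha$ by $\pi_B( (x_\alpha)_{\alpha \in A} ) := (x_\alpha)_{\alpha \in B}$. More generally still, given two subsets $C \subset B \subset A$, we have the partial subprojections $\pi_{C \leftarrow B}: X_B \to X_C$ defined by $\pi_{C \leftarrow B}( (x_\alpha)_{\alpha \in B} ) := (x_\alpha)_{\alpha \in C}$. These partial subprojections obey the composition law $\pi_{D \leftarrow C} \circ \pi_{C \leftarrow B} = \pi_{D \leftarrow B}$ for all $D \subset C \subset B \subset A$ (and thus form a very simple example of a category).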
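For readers who find it helpful to see the bookkeeping concretely, here is a minimal Python sketch of the partial subprojections, with tuples indexed by a set modelled as dictionaries. The index names, the sample point and the helper `restrict` are hypothetical and chosen only for illustration.

```python
def restrict(x: dict, C) -> dict:
    """Partial (sub)projection: keep only the coordinates indexed by C.

    x is a tuple (x_alpha) indexed by some set containing C, stored as a dict.
    """
    return {alpha: x[alpha] for alpha in C}

A = ("a", "b", "c", "d")                  # hypothetical index set
x_A = {"a": 0, "b": 1, "c": 1, "d": 0}    # a point of the product space X_A
B = ("a", "b", "c")
C = ("a", "c")
D = ("c",)

x_B = restrict(x_A, B)                    # pi_B(x_A)
via_C = restrict(restrict(x_B, C), D)     # (pi_{D<-C} o pi_{C<-B})(x_B)
direct = restrict(x_B, D)                 # pi_{D<-B}(x_B)
print(via_C, direct)
```

The composition law amounts to the observation that discarding coordinates in two stages gives the same result as discarding them all at once.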
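As before, given any $\sigma$-algebra $\mathcal{B}_\alpha$ on $X_\alpha$, we can pull it back by $\pi_\alpha$ to create a $\sigma$-algebra
$$\pi_\alpha^*(\mathcal{B}_\alpha) := \{ \pi_\alpha^{-1}(E_\alpha): E_\alpha \in \mathcal{B}_\alpha \}$$
on $X_A$. One easily verifies that this is indeed a $\sigma$-algebra. Informally, $\pi_\alpha^*(\mathcal{B}_\alpha)$ describes those sets (or “events”, if one is thinking in probabilistic terms) that depend only on the $x_\alpha$ coordinate of the state $x_A$, and whose dependence on $x_\alpha$ is $\mathcal{B}_\alpha$-measurable. We can then define the product $\sigma$-algebra
$$\prod_{\beta \in A} \mathcal{B}_\beta := \left\langle \bigcup_{\beta \in A} \pi_\beta^*(\mathcal{B}_\beta) \right\rangle.$$
We have a generalisation of Exercise 27: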
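Exercise 59 Let $((X_\alpha, \mathcal{B}_\alpha))_{\alpha \in A}$ be a family of measurable spaces. For any $B \subset A$, write $\mathcal{B}_B := \prod_{\beta \in B} \mathcal{B}_\beta$.
- Show that $\mathcal{B}_A$ is the coarsest $\sigma$-algebra on $X_A$ that makes the projection maps $\pi_\beta$ measurable morphisms for all $\beta \in A$.
- Show that for each $B \subset A$, $\pi_B$ is a measurable morphism from $(X_A, \mathcal{B}_A)$ to $(X_B, \mathcal{B}_B)$.
- If $E \in \mathcal{B}_A$, show that there exists an at most countable set $B \subset A$ and a set $E_B \in \mathcal{B}_B$ such that $E = \pi_B^{-1}(E_B)$. Informally, this asserts that a measurable event can only depend on at most countably many of the coordinates.
- If $f: X_A \to [0,+\infty]$ is $\mathcal{B}_A$-measurable, show that there exists an at most countable set $B \subset A$ and a $\mathcal{B}_B$-measurable function $f_B: X_B \to [0,+\infty]$ such that $f = f_B \circ \pi_B$.
- If $A$ is at most countable, show that $\mathcal{B}_A$ is the $\sigma$-algebra generated by the sets $\prod_{\beta \in A} E_\beta$ with $E_\beta \in \mathcal{B}_\beta$ for all $\beta \in A$.
- On the other hand, show that if $A$ is uncountable and the $\mathcal{B}_\alpha$ are all non-trivial, then $\mathcal{B}_A$ is not the $\sigma$-algebra generated by sets $\prod_{\beta \in A} E_\beta$ with $E_\beta \in \mathcal{B}_\beta$ for all $\beta \in A$.
- If $B \subset A$, $E \in \mathcal{B}_A$, and $x_{A \backslash B} \in X_{A \backslash B}$, show that the set $E_{x_{A \backslash B}, B} := \{ x_B \in X_B: (x_B, x_{A \backslash B}) \in E \}$ lies in $\mathcal{B}_B$, where we identify $X_B \times X_{A \backslash B}$ with $X_A$ in the obvious manner.
- If $B \subset A$, $f: X_A \to [0,+\infty]$ is $\mathcal{B}_A$-measurable, and $x_{A \backslash B} \in X_{A \backslash B}$, show that the function $f_{x_{A \backslash B}, B}: x_B \mapsto f(x_B, x_{A \backslash B})$ is $\mathcal{B}_B$-measurable.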
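Now we consider the problem of constructing a measure $\mu_A$ on the product space $X_A$. Any such measure $\mu_A$ will induce pushforward measures $\mu_B := (\pi_B)_* \mu_A$ on $X_B$ (introduced in Exercise 36 of Notes 3), thus
$$\mu_B(E_B) := \mu_A( \pi_B^{-1}(E_B) )$$
for all $E_B \in \mathcal{B}_B$. These measures obey the compatibility relation
$$(\pi_{C \leftarrow B})_* \mu_B = \mu_C \qquad (13)$$
whenever $C \subset B \subset A$, as can be easily seen by chasing the definitions.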
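The compatibility relation between pushforwards can be verified mechanically on a small finite example. The Python sketch below uses a hypothetical probability measure on $\{0,1\}^A$ with $A$ a three-element set; the weights and the helper `pushforward` are assumptions made for illustration, and the measure is deliberately not a product measure.

```python
from fractions import Fraction
from itertools import product

def pushforward(mu: dict, B, C) -> dict:
    """Push a measure on {0,1}^B (given by its point masses) forward to {0,1}^C.

    Points are tuples whose entries are ordered like B; C must be a subset of B.
    """
    keep = [B.index(alpha) for alpha in C]
    out = {}
    for point, mass in mu.items():
        image = tuple(point[i] for i in keep)
        out[image] = out.get(image, Fraction(0)) + mass
    return out

A = ("a", "b", "c")
# Hypothetical probability measure on {0,1}^A with unequal point masses.
weights = [1, 2, 3, 4, 5, 6, 7, 8]
mu_A = {pt: Fraction(w, 36) for pt, w in zip(product((0, 1), repeat=3), weights)}

B = ("a", "b")
C = ("a",)
mu_B = pushforward(mu_A, A, B)   # (pi_B)_* mu_A
mu_C = pushforward(mu_A, A, C)   # (pi_C)_* mu_A
print(pushforward(mu_B, B, C) == mu_C)
```

Pushing forward in two stages and in one stage give the same marginal, which is the finite-dimensional shadow of the composition law for the partial subprojections.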
One can then ask whether one can reconstruct $\mu_A$ just from the projections $\mu_B$ to finite subsets $B$. This is possible in the important special case when the $\mu_B$ (and hence $\mu_A$) are probability measures, provided one imposes an additional inner regularity hypothesis on the measures $\mu_B$. More precisely:
Definition 60 (Inner regularity) A (metrisable) inner regular measure space $(X, \mathcal{B}, \mu, d)$ is a measure space $(X, \mathcal{B}, \mu)$ equipped with a metric $d$ such that
- Every compact set is measurable; and
- One has $\mu(E) = \sup_{K \subset E,\ K \hbox{ compact}} \mu(K)$ for all measurable $E$.
We say that $\mu$ is inner regular if it is associated to an inner regular measure space.
Thus for instance Lebesgue measure is inner regular, as are Dirac measures and counting measures. Indeed, most measures that one actually encounters in applications will be inner regular. For instance, any finite Borel measure on $\mathbb{R}^d$ (or more generally, on a locally compact, $\sigma$-compact space) is inner regular (see Exercise 12 of 245B Notes 12). Inner regularity is one of the axioms of a Radon measure, which we will discuss in more detail in 245B.
Remark 61 One can generalise the concept of an inner regular measure space to one which is given by a topology rather than a metric; Kolmogorov’s extension theorem still holds in this more general setting, but requires Tychonoff’s theorem, which we will cover in 245B Notes 10. However, some minimal regularity hypotheses of a topological nature are needed to make the Kolmogorov extension theorem work, although this is usually not a severe restriction in practice.
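Theorem 62 (Kolmogorov extension theorem) Let $((X_\alpha, \mathcal{B}_\alpha), \mathcal{F}_\alpha)_{\alpha \in A}$ be a family of measurable spaces $(X_\alpha, \mathcal{B}_\alpha)$, equipped with a topology $\mathcal{F}_\alpha$. For each finite $B \subset A$, let $\mu_B$ be an inner regular probability measure on $\mathcal{B}_B := \prod_{\alpha \in B} \mathcal{B}_\alpha$ with the product topology $\mathcal{F}_B := \prod_{\alpha \in B} \mathcal{F}_\alpha$, obeying the compatibility condition (13) whenever $C \subset B \subset A$ are two nested finite subsets of $A$. Then there exists a unique probability measure $\mu_A$ on $\mathcal{B}_A$ with the property that $(\pi_B)_* \mu_A = \mu_B$ for all finite $B \subset A$.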
Proof: Our main tool here will be the Hahn-Kolmogorov extension theorem for pre-measures (Theorem 14), combined with the Heine-Borel theorem.
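Let $\mathcal{B}_0$ be the set of all subsets of $X_A$ that are of the form $\pi_B^{-1}(E_B)$ for some finite $B \subset A$ and some $E_B \in \mathcal{B}_B$. One easily verifies that this is a Boolean algebra that is contained in $\mathcal{B}_A$. We define a function $\mu_0: \mathcal{B}_0 \to [0,+\infty]$ by setting
$$\mu_0(E) := \mu_B(E_B)$$
whenever $E$ takes the form $\pi_B^{-1}(E_B)$ for some finite $B \subset A$ and $E_B \in \mathcal{B}_B$. Note that a set $E \in \mathcal{B}_0$ may have two different representations $E = \pi_B^{-1}(E_B) = \pi_{B'}^{-1}(E_{B'})$ for some finite $B, B' \subset A$, but then one must have $E_{B \cup B'} = \pi_{B \leftarrow B \cup B'}^{-1}(E_B)$ and $E_{B \cup B'} = \pi_{B' \leftarrow B \cup B'}^{-1}(E_{B'})$, where $E_{B \cup B'} := \pi_{B \cup B'}(E)$. Applying (13), we see that
$$\mu_B(E_B) = \mu_{B \cup B'}(E_{B \cup B'})$$
and
$$\mu_{B'}(E_{B'}) = \mu_{B \cup B'}(E_{B \cup B'})$$
and thus $\mu_B(E_B) = \mu_{B'}(E_{B'})$. This shows that $\mu_0(E)$ is well defined. As the $\mu_B$ are probability measures, we see that $\mu_0(X_A) = 1$.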
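It is not difficult to see that $\mu_0$ is finitely additive. We now claim that $\mu_0$ is a pre-measure. In other words, we claim that if $E \in \mathcal{B}_0$ is the disjoint countable union $E = \bigcup_{n=1}^\infty E_n$ of sets $E_n \in \mathcal{B}_0$, then $\mu_0(E) = \sum_{n=1}^\infty \mu_0(E_n)$.
For each $N \geq 1$, let $F_N := E \backslash \bigcup_{n=1}^N E_n$. Then the $F_N$ lie in $\mathcal{B}_0$, are decreasing, and are such that $\bigcap_{N=1}^\infty F_N = \emptyset$. By finite additivity (and the finiteness of $\mu_0$), we see that it suffices to show that $\lim_{N \to \infty} \mu_0(F_N) = 0$.
Suppose this is not the case, then there exists $\varepsilon > 0$ such that $\mu_0(F_N) > \varepsilon$ for all $N$. As each $F_N$ lies in $\mathcal{B}_0$, we have $F_N = \pi_{B_N}^{-1}(G_N)$ for some finite sets $B_N \subset A$ and some $\mathcal{B}_{B_N}$-measurable sets $G_N$. By enlarging each $B_N$ as necessary we may assume that the $B_N$ are increasing in $N$. The decreasing nature of the $F_N$ then gives the inclusions
$$G_{N+1} \subset \pi_{B_N \leftarrow B_{N+1}}^{-1}(G_N).$$
By inner regularity, one can find a compact subset $K_N$ of each $G_N$ such that
$$\mu_{B_N}(K_N) \geq \mu_{B_N}(G_N) - \varepsilon / 2^{N+1}.$$
If we then set
$$K'_N := \bigcap_{N'=1}^N \pi_{B_{N'} \leftarrow B_N}^{-1}(K_{N'})$$
then we see that each $K'_N$ is compact and
$$\mu_{B_N}(K'_N) \geq \mu_{B_N}(G_N) - \varepsilon/2 \geq \varepsilon/2.$$
In particular, the sets $K'_N$ are non-empty. By construction, we also have the inclusions
$$K'_{N+1} \subset \pi_{B_N \leftarrow B_{N+1}}^{-1}(K'_N)$$
and thus the sets $H_N := \pi_{B_N}^{-1}(K'_N)$ are decreasing in $N$. On the other hand, since these sets are contained in $F_N$, we have $\bigcap_{N=1}^\infty H_N = \emptyset$.
By the axiom of choice, we can select an element $x_N \in X_A$ from $H_N$ for each $N$. Observe that for any $N_0$, $\pi_{B_{N_0}}(x_N)$ will lie in the compact set $K'_{N_0}$ whenever $N \geq N_0$. Applying the Heine-Borel theorem repeatedly, we may thus find a subsequence $x_{N_{1,m}}$ of the $x_N$ for $m = 1, 2, \ldots$ such that $\pi_{B_1}(x_{N_{1,m}})$ converges; then we can find a further subsequence $x_{N_{2,m}}$ of that subsequence such that $\pi_{B_2}(x_{N_{2,m}})$ converges, and more generally obtain nested subsequences $x_{N_{j,m}}$ for $m = 1, 2, \ldots$ and $j = 1, 2, \ldots$ such that for each $j$, the sequence $m \mapsto \pi_{B_j}(x_{N_{j,m}})$ converges.
Now we use the diagonalisation trick. Consider the sequence $x_{N_{m,m}}$ for $m = 1, 2, \ldots$. By construction, we see that for each $j$, $\pi_{B_j}(x_{N_{m,m}})$ converges to a limit as $m \to \infty$. This implies that for each $\alpha \in \bigcup_{j=1}^\infty B_j$, $\pi_\alpha(x_{N_{m,m}})$ converges to a limit $y_\alpha$ as $m \to \infty$. As $K'_j$ is closed, we see that $(y_\alpha)_{\alpha \in B_j} \in K'_j$ for each $j$. If we then extend $(y_\alpha)_{\alpha \in \bigcup_j B_j}$ arbitrarily from $\bigcup_{j=1}^\infty B_j$ to $A$, then the point $y := (y_\alpha)_{\alpha \in A}$ lies in $H_j$ for each $j$. But this contradicts the fact that $\bigcap_{N=1}^\infty H_N = \emptyset$. This contradiction completes the proof that $\mu_0$ is a pre-measure.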
If we then let $\mu_A$ be the Hahn-Kolmogorov extension of $\mu_0$, one easily verifies that $\mu_A$ obeys all the required properties, and the uniqueness follows from Exercise 15.
The Kolmogorov extension theorem is a fundamental tool in the foundations of probability theory, as it allows one to construct a probability space to hold a variety of random processes $(X_t)_{t \in T}$, both in the discrete case (when the set of times $T$ is something like the integers $\mathbb{Z}$) and in the continuous case (when the set of times $T$ is something like $\mathbb{R}$). In particular, it can be used to rigorously construct a process for Brownian motion, known as the Wiener process. We will however not focus on this topic, which can be found in many graduate probability texts. But we will give one common special case of the Kolmogorov extension theorem, which is to construct product probability measures:
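Theorem 63 (Existence of product measures) Let $A$ be an arbitrary set. For each $\alpha \in A$, let $(X_\alpha, \mathcal{B}_\alpha, \mu_\alpha)$ be a probability space in which $X_\alpha$ is a locally compact, $\sigma$-compact metric space, with $\mathcal{B}_\alpha$ being its Borel $\sigma$-algebra (i.e. the $\sigma$-algebra generated by the open sets). Then there exists a unique probability measure $\mu_A = \prod_{\alpha \in A} \mu_\alpha$ on $(X_A, \mathcal{B}_A) := (\prod_{\alpha \in A} X_\alpha, \prod_{\alpha \in A} \mathcal{B}_\alpha)$ with the property that
$$\mu_A\left( \prod_{\alpha \in A} E_\alpha \right) = \prod_{\alpha \in A} \mu_\alpha(E_\alpha)$$
whenever $E_\alpha \in \mathcal{B}_\alpha$ for each $\alpha \in A$, and one has $E_\alpha = X_\alpha$ for all but finitely many of the $\alpha$.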
Proof: We apply the Kolmogorov extension theorem to the finite product measures $\mu_B := \prod_{\alpha \in B} \mu_\alpha$ for finite $B \subset A$, which can be constructed using the machinery in Section 4. These are Borel probability measures on a locally compact, $\sigma$-compact space and are thus inner regular by Exercise 12 of 245B Notes 12. The compatibility condition (13) can be verified from the uniqueness properties of finite product measures.
Remark 64 This result can also be obtained from the Riesz representation theorem, which we will cover in 245B Notes 12.
Example 65 (Bernoulli cube) Let $A := \mathbb{N}$, and for each $\alpha \in A$, let $(X_\alpha, \mathcal{B}_\alpha, \mu_\alpha)$ be the two-element set $X_\alpha = \{0,1\}$ with the discrete metric (and thus discrete $\sigma$-algebra) and the uniform probability measure $\mu_\alpha$. Then Theorem 63 gives a probability measure $\mu$ on the infinite discrete cube $X_A := \{0,1\}^{\mathbb{N}}$, known as the (uniform) Bernoulli measure on this cube. The coordinate functions $\pi_\alpha: X_A \to \{0,1\}$ can then be interpreted as a countable sequence of random variables taking values in $\{0,1\}$. From the properties of product measure one can easily check that these random variables are uniformly distributed on $\{0,1\}$ and are jointly independent. Informally, Bernoulli measure allows one to model an infinite number of “coin flips”. One can replace the natural numbers here by any other index set, and have a similar construction.
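Events that depend on only finitely many coin flips can be measured directly from the product formula. The Python sketch below does so for the uniform Bernoulli measure; the helper names `cylinder_measure` and `event_measure` and the chosen coordinates are hypothetical, and only finite-dimensional events are handled.

```python
from fractions import Fraction
from itertools import product

def cylinder_measure(constraints: dict) -> Fraction:
    """Bernoulli measure of {x : x_alpha = b for each (alpha, b) in constraints}.

    Only finitely many coordinates are constrained; every other factor has
    measure 1, so the product formula gives 2^(-number of constraints).
    """
    return Fraction(1, 2 ** len(constraints))

def event_measure(coords, predicate) -> Fraction:
    """Measure of an event that depends only on the finitely many coordinates coords."""
    total = Fraction(0)
    for bits in product((0, 1), repeat=len(coords)):
        outcome = dict(zip(coords, bits))
        if predicate(outcome):
            total += cylinder_measure(outcome)
    return total

p_joint = event_measure((1, 3), lambda x: x[1] == 1 and x[3] == 0)
p_first = event_measure((1,), lambda x: x[1] == 1)
p_second = event_measure((3,), lambda x: x[3] == 0)
p_two_of_three = event_measure((1, 2, 3), lambda x: sum(x.values()) == 2)
print(p_joint, p_first * p_second, p_two_of_three)
```

The equality of `p_joint` and `p_first * p_second` is an instance of the joint independence of the coordinate functions. Events depending on infinitely many coordinates are where the Kolmogorov extension theorem is genuinely needed.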
Example 66 (Continuous cube) We repeat the previous example, but replace $\{0,1\}$ with the unit interval $[0,1]$ (with the usual metric, the Borel $\sigma$-algebra, and the uniform probability measure). This gives a probability measure on the infinite continuous cube $[0,1]^{\mathbb{N}}$, and the coordinate functions $\pi_\alpha: X_A \to [0,1]$ can now be interpreted as jointly independent random variables, each having the uniform distribution on $[0,1]$.
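Example 67 (Independent gaussians) We repeat the previous example, but now replace $[0,1]$ with $\mathbb{R}$ (with the usual metric, and the Borel $\sigma$-algebra), and the normal probability distribution $d\mu_\alpha = \frac{1}{\sqrt{2\pi}} e^{-x^2/2}\, dx$ (thus $\mu_\alpha(E) = \frac{1}{\sqrt{2\pi}} \int_E e^{-x^2/2}\, dx$ for every Borel set $E \subset \mathbb{R}$). This gives a probability space that supports a countable sequence of jointly independent gaussian random variables $\pi_\alpha$.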
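For intervals, the gaussian measure $\mu_\alpha(E) = \frac{1}{\sqrt{2\pi}} \int_E e^{-x^2/2}\, dx$ can be evaluated with the error function. The Python sketch below compares that closed form with a direct midpoint-rule approximation of the defining integral; the function names, the choice of intervals and the quadrature are illustrative assumptions.

```python
import math

def gaussian_measure(a: float, b: float) -> float:
    """mu(E) for E = [a, b] under the standard normal distribution, via erf."""
    cdf = lambda t: 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))
    return cdf(b) - cdf(a)

def gaussian_measure_midpoint(a: float, b: float, cells: int = 20000) -> float:
    """The same quantity from the defining integral, using the midpoint rule."""
    width = (b - a) / cells
    density = lambda x: math.exp(-x * x / 2.0) / math.sqrt(2.0 * math.pi)
    return width * sum(density(a + (k + 0.5) * width) for k in range(cells))

one_sigma = gaussian_measure(-1.0, 1.0)
one_sigma_numeric = gaussian_measure_midpoint(-1.0, 1.0)

# Product formula for a box constraining two coordinates and leaving the rest free:
# E_1 = [-1, 1], E_2 = [0, +infinity), E_alpha = R otherwise.
box = gaussian_measure(-1.0, 1.0) * gaussian_measure(0.0, math.inf)
print(one_sigma, one_sigma_numeric, box)
```

The last line uses the defining property of the product measure from Theorem 63: the measure of a box with only finitely many constrained sides is the product of the measures of those sides.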
84 comments
21 April, 2018 at 3:10 am
Anonymous
For lemma 35,
Correct me if I am wrong, but is the second paragraph necessary: “It is also clear… closed under complements. ” As
, the proof already shows
is closed under complements,
.
21 April, 2018 at 7:57 am
Terence Tao
Yes, one could arrange the argument in this fashion instead if one wished.
3 October, 2018 at 6:00 am
Nathanael Schilling
I think there is a small mistake in the proof of the Kolmogorov extension theorem (Theorem 62). Here you write:
As
potentially only has measure
, and hence
(which is contained in the inverse image under a projection of
) cannot have a larger measure.
[Corrected, thanks – T.]
22 January, 2021 at 6:22 am
Anonymous
Dear Professor Tao
Can I ask you about the following, please? In (4) of Exercise 59, how can we obtain the at most countable set $B$ and the function $f_B$?
So far I only know that if we want $f = f_B \circ \pi_B$, then
$(x_\alpha)_{\alpha \in A}$ and $(x'_\alpha)_{\alpha \in A}$ must have the same value under the given function $f$ whenever $x_\alpha = x'_\alpha$ for all $\alpha \in B$.
22 January, 2021 at 1:22 pm
Anonymous
In theorem 18, it seems that there is a typo in the definition of left and right limits of
.
[Corrected, thanks – T.]