In these notes we quickly review the basics of abstract measure theory and integration theory, which were covered in the previous course but will be relied upon in the current course. This is only a brief summary of the material; one should consult a real analysis text for the full details of the theory.
— Measurable spaces —
Ideally, measure theory on a space X should be able to assign a measure (or “volume”, or “mass”, etc.) to every set in X. Unfortunately, due to paradoxes such as the Banach-Tarski paradox, many natural notions of measure (e.g. Lebesgue measure) cannot be applied to measure all subsets of X; instead, one must restrict attention to certain measurable subsets of X. This turns out to suffice for most applications; for instance, just about any “non-pathological” subset of Euclidean space that one actually encounters will be Lebesgue measurable (as a general rule of thumb, any set which does not rely on the axiom of choice in its construction will be measurable).
To formalise this abstractly, we use
Definition 1. A measurable space $(X, \mathcal{X})$ is a set X, together with a collection $\mathcal{X}$ of subsets of X which form a $\sigma$-algebra, thus $\mathcal{X}$ contains the empty set and X, and is closed under countable intersections, countable unions, and complements. A subset of X is said to be measurable with respect to the measurable space if it lies in $\mathcal{X}$.
A function $f: X \to Y$ from one measurable space $(X, \mathcal{X})$ to another $(Y, \mathcal{Y})$ is said to be measurable if $f^{-1}(E) \in \mathcal{X}$ for all $E \in \mathcal{Y}$.
Remark 1. The class of measurable spaces forms a category, with the measurable functions being the morphisms. The symbol $\sigma$ stands for “countable union”; cf. $\sigma$-compact, $\sigma$-finite, $F_\sigma$ set.
Remark 2. The notion of a measurable space (and of a measurable function) is superficially similar to that of a topological space $(X, \mathcal{F})$ (and of a continuous function); the topology $\mathcal{F}$ contains $\emptyset$ and X just as the $\sigma$-algebra $\mathcal{X}$ does, but is now closed under arbitrary unions and finite intersections, rather than countable unions, countable intersections, and complements. The two categories are linked to each other by the Borel algebra construction, see Example 2 below.
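Example 1. We say that one $\sigma$-algebra $\mathcal{X}$ on a set X is coarser than another $\mathcal{X}'$ (or that $\mathcal{X}'$ is finer than $\mathcal{X}$) if $\mathcal{X} \subset \mathcal{X}'$ (or equivalently, if the identity map from $(X, \mathcal{X}')$ to $(X, \mathcal{X})$ is measurable); thus every set which is measurable in the coarse space is also measurable in the fine space. The coarsest $\sigma$-algebra on a set X is the trivial $\sigma$-algebra $\{\emptyset, X\}$, while the finest is the discrete $\sigma$-algebra $2^X := \{E: E \subset X\}$.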
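Example 2. The intersection $\bigwedge_{\alpha \in A} \mathcal{X}_\alpha := \bigcap_{\alpha \in A} \mathcal{X}_\alpha$ of an arbitrary family $(\mathcal{X}_\alpha)_{\alpha \in A}$ of $\sigma$-algebras on X is another $\sigma$-algebra on X. Because of this, given any collection $\mathcal{F}$ of sets on X we can define the $\sigma$-algebra $\mathcal{B}[\mathcal{F}]$ generated by $\mathcal{F}$, defined to be the intersection of all the $\sigma$-algebras containing $\mathcal{F}$, or equivalently the coarsest $\sigma$-algebra for which all sets in $\mathcal{F}$ are measurable. (This intersection is non-vacuous, since it will always involve the discrete $\sigma$-algebra $2^X$.) In particular, the open sets $\mathcal{F}$ of a topological space $(X, \mathcal{F})$ generate a $\sigma$-algebra, known as the Borel $\sigma$-algebra of that space.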
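When X is a finite set, the generated $\sigma$-algebra of Example 2 can be computed by brute force, since countable unions reduce to finite ones. The following is a minimal Python sketch under that finiteness assumption; the name `generate_sigma_algebra` is hypothetical and chosen only for this illustration. It closes the generators under complements and pairwise unions until nothing new appears (intersections then follow from De Morgan's laws).

```python
from itertools import combinations

def generate_sigma_algebra(X, generators):
    """Smallest family of subsets of the finite set X that contains the
    generators, the empty set and X, and is closed under complement and union."""
    X = frozenset(X)
    family = {frozenset(), X} | {frozenset(g) for g in generators}
    changed = True
    while changed:
        changed = False
        current = list(family)
        # Close under complements.
        for E in current:
            complement = X - E
            if complement not in family:
                family.add(complement)
                changed = True
        # Close under pairwise unions (enough for finite X).
        for E, F in combinations(current, 2):
            union = E | F
            if union not in family:
                family.add(union)
                changed = True
    return family

X = {1, 2, 3, 4}
sigma = generate_sigma_algebra(X, [{1}, {1, 2}])
print(sorted(sorted(E) for E in sigma))
```

In this example the generators {1} and {1,2} cannot separate the points 3 and 4, so {3,4} is measurable but {3} is not.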
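We can also define the join $\bigvee_{\alpha \in A} \mathcal{X}_\alpha$ of any family $(\mathcal{X}_\alpha)_{\alpha \in A}$ of $\sigma$-algebras on X by the formula
$\bigvee_{\alpha \in A} \mathcal{X}_\alpha := \mathcal{B}[\bigcup_{\alpha \in A} \mathcal{X}_\alpha]$. (1)
For instance, the Lebesgue $\sigma$-algebra $\mathcal{L}$ of Lebesgue measurable sets on a Euclidean space $\mathbb{R}^n$ is the join of the Borel $\sigma$-algebra $\mathcal{B}$ and of the algebra of null sets and their complements (also called co-null sets).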
Exercise 1. A function $f: X \to Y$ from one topological space to another is said to be Borel measurable if it is measurable once X and Y are equipped with their respective Borel $\sigma$-algebras. Show that every continuous function is Borel measurable. (The converse statement, of course, is very far from being true; for instance, the pointwise limit of a sequence of measurable functions, if it exists, is also measurable, whereas the analogous claim for continuous functions is completely false.)
Remark 3. A function $f: \mathbb{R}^n \to \mathbb{R}^m$ is said to be Lebesgue measurable if it is measurable from $\mathbb{R}^n$ (with the Lebesgue $\sigma$-algebra) to $\mathbb{R}^m$ (with the Borel $\sigma$-algebra), or equivalently if $f^{-1}(B)$ is Lebesgue measurable for every open ball B in $\mathbb{R}^m$. Note the asymmetry between Lebesgue and Borel here; in particular, the composition of two Lebesgue measurable functions need not be Lebesgue measurable.
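Example 3. Given a function $f: X \to Y$ from a set X to a measurable space $(Y, \mathcal{Y})$, we can define the pullback $f^{-1}(\mathcal{Y})$ of $\mathcal{Y}$ to be the $\sigma$-algebra $f^{-1}(\mathcal{Y}) := \{ f^{-1}(E): E \in \mathcal{Y} \}$; this is the coarsest structure on X that makes f measurable. For instance, the pullback of the Borel $\sigma$-algebra from [0,1] to $[0,1]^2$ under the map $(x,y) \mapsto x$ consists of all sets of the form $E \times [0,1]$, where $E \subset [0,1]$ is Borel-measurable.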
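More generally, given a family $(f_\alpha: X \to Y_\alpha)_{\alpha \in A}$ of functions into measurable spaces $(Y_\alpha, \mathcal{Y}_\alpha)$, we can form the $\sigma$-algebra $\bigvee_{\alpha \in A} f_\alpha^{-1}(\mathcal{Y}_\alpha)$ generated by the $f_\alpha$; this is the coarsest structure on X that makes all the $f_\alpha$ simultaneously measurable.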
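On finite sets, the pullback construction of Example 3 and the definition of measurability can be checked mechanically. The sketch below is only an illustration under that finiteness assumption; the helper names `pullback` and `is_measurable` and the parity map are hypothetical choices made for this example.

```python
def pullback(f, X, sigma_Y):
    """Pullback sigma-algebra {f^{-1}(E) : E in sigma_Y} on the finite set X."""
    return {frozenset(x for x in X if f(x) in E) for E in sigma_Y}

def is_measurable(f, X, sigma_X, sigma_Y):
    """f is measurable iff every preimage f^{-1}(E), E in sigma_Y, lies in sigma_X."""
    return pullback(f, X, sigma_Y) <= set(sigma_X)

X = {0, 1, 2, 3}
Y = {"even", "odd"}
# Discrete sigma-algebra on the two-point set Y.
sigma_Y = {frozenset(), frozenset({"even"}), frozenset({"odd"}), frozenset(Y)}

def parity(x):
    return "even" if x % 2 == 0 else "odd"

pulled = pullback(parity, X, sigma_Y)       # coarsest structure making parity measurable
trivial = {frozenset(), frozenset(X)}       # trivial sigma-algebra on X

print(is_measurable(parity, X, pulled, sigma_Y))
print(is_measurable(parity, X, trivial, sigma_Y))
```

The parity map is measurable with respect to its own pullback, but not with respect to the trivial $\sigma$-algebra, which is too coarse to contain the set of even points.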
Remark 4. In probability theory and information theory, the functions $f_\alpha$ in Example 3 can be interpreted as observables, and the $\sigma$-algebra generated by these observables thus captures mathematically the concept of observable information. For instance, given a time parameter t, one might define the $\sigma$-algebra $\mathcal{B}_{\leq t}$ generated by all observables for some random process (e.g. Brownian motion) that can be made at time t or earlier; this endows the underlying event space X with an uncountable increasing family of $\sigma$-algebras.
Example 4. If E is a subset of a measurable space $(Y, \mathcal{Y})$, the pullback of $\mathcal{Y}$ under the inclusion map $\iota: E \to Y$ is called the restriction of $\mathcal{Y}$ to E and is denoted $\mathcal{Y}\downharpoonright_E$. Thus, for instance, we can restrict the Borel and Lebesgue $\sigma$-algebras on a Euclidean space $\mathbb{R}^n$ to any subset of such a space.
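Exercise 2. Let M be an n-dimensional manifold, and let $(\pi_\alpha: U_\alpha \to V_\alpha)_{\alpha \in A}$ be an atlas of coordinate charts for M, where $U_\alpha$ is an open cover of M and $V_\alpha$ are open subsets of $\mathbb{R}^n$. Show that the Borel $\sigma$-algebra on M is the unique $\sigma$-algebra whose restriction to each $U_\alpha$ is the pullback via $\pi_\alpha$ of the restriction of the Borel $\sigma$-algebra of $\mathbb{R}^n$ to $V_\alpha$.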
Example 5. A function $f: X \to A$ into some index set A will partition X into level sets $f^{-1}(\{\alpha\})$ for $\alpha \in A$; conversely, every partition $X = \bigcup_{\alpha \in A} E_\alpha$ of X arises from at least one function f in this manner (one can just take f to be the map from points in X to the partition cell that that point lies in). Given such an f, we call the $\sigma$-algebra $f^{-1}(2^A)$ the $\sigma$-algebra generated by the partition; a set is measurable with respect to this structure if and only if it is the union $\bigcup_{\alpha \in B} E_\alpha$ of some sub-collection of cells of the partition.
Exercise 3. Show that a $\sigma$-algebra on a finite set X necessarily arises from a partition $X = \bigcup_{\alpha \in A} E_\alpha$ as in Example 5, and furthermore the partition is unique (up to relabeling). Thus in the finite world, $\sigma$-algebras are essentially the same concept as partitions.
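The correspondence between partitions and $\sigma$-algebras on a finite set can be explored numerically (this is an experiment, not a proof of Exercise 3). In the hedged sketch below, `sigma_from_partition` forms all unions of cells as in Example 5, and `atoms` recovers the cells by intersecting all measurable sets containing a given point; both names are hypothetical.

```python
from itertools import chain, combinations

def sigma_from_partition(cells):
    """All unions of sub-collections of the given cells (including the empty union)."""
    cells = list(cells)
    subcollections = chain.from_iterable(
        combinations(cells, r) for r in range(len(cells) + 1))
    return {frozenset().union(*sub) for sub in subcollections}

def atoms(X, sigma):
    """For each point x, intersect all measurable sets containing x;
    the distinct results are the cells of the underlying partition."""
    cells = set()
    for x in X:
        cell = frozenset(X)
        for E in sigma:
            if x in E:
                cell = cell & E
        cells.add(cell)
    return cells

X = {1, 2, 3, 4}
partition = {frozenset({1}), frozenset({2}), frozenset({3, 4})}
sigma = sigma_from_partition(partition)
recovered = atoms(X, sigma)
print(len(sigma), recovered == partition)
```

A partition into k cells produces $2^k$ measurable sets, and the round trip from partition to $\sigma$-algebra and back returns the original cells.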
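Example 6. Let $(X_\alpha, \mathcal{X}_\alpha)_{\alpha \in A}$ be a family of measurable spaces; then the Cartesian product $\prod_{\alpha \in A} X_\alpha$ has canonical projection maps $\pi_\beta: \prod_{\alpha \in A} X_\alpha \to X_\beta$ for each $\beta \in A$. The product $\sigma$-algebra $\prod_{\alpha \in A} \mathcal{X}_\alpha$ is defined as the $\sigma$-algebra on $\prod_{\alpha \in A} X_\alpha$ generated by the $\pi_\alpha$ as in Example 3.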
Exercise 4. Let $(X_\alpha)_{\alpha \in A}$ be an at most countable family of second countable topological spaces. Show that the Borel $\sigma$-algebra of the product space (with the product topology) is equal to the product of the Borel $\sigma$-algebras of the factor spaces. In particular, the Borel $\sigma$-algebra on $\mathbb{R}^n$ is the product of n copies of the Borel $\sigma$-algebra on $\mathbb{R}$. (The claim can fail when the countability hypotheses are dropped, though in most applications in analysis, these hypotheses are satisfied.) We caution however that the Lebesgue $\sigma$-algebra on $\mathbb{R}^n$ is not the product of n copies of the one-dimensional Lebesgue $\sigma$-algebra, as it contains some additional null sets; however, it is the completion of that product.
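Exercise 5. Let $(X, \mathcal{X})$ and $(Y, \mathcal{Y})$ be measurable spaces. Show that if E is measurable with respect to $\mathcal{X} \times \mathcal{Y}$, then for every $x \in X$, the set $\{ y \in Y: (x,y) \in E \}$ is measurable in $\mathcal{Y}$, and similarly for every $y \in Y$, the set $\{ x \in X: (x,y) \in E \}$ is measurable in $\mathcal{X}$. Thus, sections of Borel-measurable sets are again Borel-measurable. (The same is not true for Lebesgue-measurable sets.)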
— Measure spaces —
Now we endow measurable spaces with a measure, turning them into measure spaces.
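Definition 2 (Measures) A (non-negative) measure $\mu$ on a measurable space $(X, \mathcal{X})$ is a function $\mu: \mathcal{X} \to [0,+\infty]$ such that $\mu(\emptyset) = 0$, and such that we have the countable additivity property $\mu(\bigcup_{n=1}^\infty E_n) = \sum_{n=1}^\infty \mu(E_n)$ whenever $E_1, E_2, \ldots$ are disjoint measurable sets. We refer to the triplet $(X, \mathcal{X}, \mu)$ as a measure space.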
A measure space $(X, \mathcal{X}, \mu)$ is finite if $\mu(X) < \infty$; it is a probability space if $\mu(X) = 1$ (and then we call $\mu$ a probability measure). It is $\sigma$-finite if X can be covered by countably many sets of finite measure.
A measurable set E is a null set if $\mu(E) = 0$. A property on points x in X is said to hold for almost every $x \in X$ (or almost surely, for probability spaces) if it holds outside of a null set. We abbreviate almost every and almost surely as a.e. and a.s. respectively. The complement of a null set is said to be a co-null set or to have full measure.
Example 7. (Dirac measures) Given any measurable space $(X, \mathcal{X})$ and a point $x \in X$, we can define the Dirac measure (or Dirac mass) $\delta_x$ to be the measure such that $\delta_x(E) = 1$ when $x \in E$ and $\delta_x(E) = 0$ otherwise. This is a probability measure.
Example 8. (Counting measure) Given any measurable space $(X, \mathcal{X})$, we define counting measure $\#$ by defining $\#(E)$ to be the cardinality |E| of E when E is finite, or $+\infty$ otherwise. This measure is finite when X is finite, and $\sigma$-finite when X is at most countable. If X is finite (and non-empty), we can define normalised counting measure $\frac{1}{|X|} \#$; this is a probability measure, also known as uniform probability measure on X (especially if we give X the discrete $\sigma$-algebra).
Example 9. Any finite non-negative linear combination of measures is again a measure; any finite convex combination of probability measures is again a probability measure.
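Example 10. If $f: X \to Y$ is a measurable map from one measurable space $(X, \mathcal{X})$ to another $(Y, \mathcal{Y})$, and $\mu$ is a measure on $\mathcal{X}$, we can define the push-forward $f_* \mu: \mathcal{Y} \to [0,+\infty]$ by the formula $f_* \mu(E) := \mu(f^{-1}(E))$; this is a measure on $(Y, \mathcal{Y})$. Thus, for instance, $f_* \delta_x = \delta_{f(x)}$ for all $x \in X$.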
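The Dirac, counting, and push-forward constructions of Examples 7, 8 and 10 are easy to model on finite sets, where every subset can be treated as measurable. The following Python sketch makes that simplifying assumption; the function names `dirac`, `counting` and `push_forward` are hypothetical. It checks on a small example that pushing forward a Dirac mass at x gives the Dirac mass at f(x).

```python
def dirac(x):
    """Dirac measure at x: a set E gets mass 1 if x lies in E, and 0 otherwise."""
    return lambda E: 1 if x in E else 0

def counting(E):
    """Counting measure, restricted here to finite sets E."""
    return len(E)

def push_forward(f, mu, X):
    """Push-forward f_* mu, sending E to mu(f^{-1}(E)); X is the finite domain of f."""
    return lambda E: mu(frozenset(x for x in X if f(x) in E))

X = range(6)

def f(x):
    return x % 3

# Compare f_* delta_4 with delta_{f(4)} = delta_1 on every subset of {0, 1, 2}.
subsets = [set(), {0}, {1}, {2}, {0, 1}, {0, 2}, {1, 2}, {0, 1, 2}]
agree = all(push_forward(f, dirac(4), X)(E) == dirac(f(4))(E) for E in subsets)
print(agree)

# The push-forward of counting measure counts preimages: f^{-1}({0}) = {0, 3}.
print(push_forward(f, counting, X)({0}))
```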
We record some basic properties of measures of sets:
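Exercise 6. Let $(X, \mathcal{X}, \mu)$ be a measure space.
- (Monotonicity) If $E \subset F$ are measurable sets, then $\mu(E) \leq \mu(F)$. (In particular, any measurable subset of a null set is again a null set.)
- (Countable subadditivity) If $E_1, E_2, \ldots$ are a countable sequence of measurable sets, then $\mu(\bigcup_{n=1}^\infty E_n) \leq \sum_{n=1}^\infty \mu(E_n)$. (Of course, one also has subadditivity for finite sequences.) In particular, any countable union of null sets is again a null set.
- (Monotone convergence for sets) If $E_1 \subset E_2 \subset \ldots$ are measurable, then $\mu(\bigcup_{n=1}^\infty E_n) = \lim_{n \to \infty} \mu(E_n)$.
- (Dominated convergence for sets) If $E_1 \supset E_2 \supset \ldots$ are measurable, and $\mu(E_1)$ is finite, then $\mu(\bigcap_{n=1}^\infty E_n) = \lim_{n \to \infty} \mu(E_n)$. Show that the claim can fail if $\mu(E_1)$ is infinite.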
Exercise 7. A measure space $(X, \mathcal{X}, \mu)$ is said to be complete if every subset of a null set is measurable (and is thus again a null set). Show that every measure space $(X, \mathcal{X}, \mu)$ has a unique minimal complete refinement $(X, \overline{\mathcal{X}}, \overline{\mu})$, known as the completion of $(X, \mathcal{X}, \mu)$, and that a set is measurable in $\overline{\mathcal{X}}$ if and only if it is equal almost everywhere to a measurable set in $\mathcal{X}$. (The completion of the Borel $\sigma$-algebra with respect to Lebesgue measure is known as the Lebesgue $\sigma$-algebra.)
A powerful way to construct measures on $\sigma$-algebras $\mathcal{X}$ is to first construct them on a smaller Boolean algebra $\mathcal{A}$ that generates $\mathcal{X}$, and then extend them via the following result:
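Theorem 1. (Carathéodory’s extension theorem, special case) Let $(X, \mathcal{X})$ be a measurable space, and let $\mathcal{A}$ be a Boolean algebra (i.e. closed under finite unions, intersections, and complements) that generates $\mathcal{X}$. Let $\mu: \mathcal{A} \to [0,+\infty]$ be a function such that
- $\mu(\emptyset) = 0$;
- If $A_1, A_2, \ldots \in \mathcal{A}$ are disjoint and $\bigcup_{n=1}^\infty A_n \in \mathcal{A}$, then $\mu(\bigcup_{n=1}^\infty A_n) = \sum_{n=1}^\infty \mu(A_n)$.
Then $\mu$ can be extended to a measure $\mu: \mathcal{X} \to [0,+\infty]$ on $\mathcal{X}$, which we shall also call $\mu$.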
Remark 5. The conditions 1,2 in the above theorem are clearly necessary if $\mu$ has any hope to be extended to a measure on $\mathcal{X}$. Thus this theorem gives a necessary and sufficient condition for a function on a Boolean algebra to be extended to a measure. The extension can easily be shown to be unique when X is $\sigma$-finite.
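Proof. (sketch) Define the outer measure $\mu_*(E)$ of any set $E \subset X$ as the infimum of $\sum_{n=1}^\infty \mu(A_n)$, where $(A_n)_{n=1}^\infty$ ranges over all coverings of E by elements in $\mathcal{A}$. It is not hard to see that $\mu_*$ agrees with $\mu$ on $\mathcal{A}$, so it will suffice to show that it is a measure on $\mathcal{X}$.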
It is easy to check that $\mu_*$ is monotone and countably subadditive (as in parts 1,2 of Exercise 6) on all of $2^X$, and assigns zero to $\emptyset$; thus it is an outer measure in the abstract sense. But we need to show countable additivity on $\mathcal{X}$. The key is to first show the related property
$\mu_*(A) = \mu_*(A \cap E) + \mu_*(A \backslash E)$ (2)
for all $A \subset X$ and $E \in \mathcal{X}$. This can first be shown for $E \in \mathcal{A}$, and then one observes that the class of E that obey (2) for all A is a $\sigma$-algebra; we leave this as a (moderately lengthy) exercise.
The identity (2) already shows that $\mu_*$ is finitely additive on $\mathcal{X}$; combining this with countable subadditivity and monotonicity, we conclude that $\mu_*$ is countably additive, as required.
Exercise 8. Let the notation and hypotheses be as in Theorem 1. Show that given any $\varepsilon > 0$ and any set $E \in \mathcal{X}$ of finite measure, there exists a set $F \in \mathcal{A}$ which differs from E by a set of measure at most $\varepsilon$. Thus sets in the $\sigma$-algebra $\mathcal{X}$ “almost” lie in the algebra $\mathcal{A}$; this is an example of Littlewood’s first principle. The same statements of course apply for the completion $\overline{\mathcal{X}}$ of $\mathcal{X}$.
One can use Theorem 1 to construct Lebesgue measure on $\mathbb{R}$ and on $\mathbb{R}^n$ (taking $\mathcal{A}$ to be, say, the algebra generated by half-open intervals or boxes), although the verification of hypothesis 2 of Theorem 1 turns out to be somewhat delicate, even in the one-dimensional case. But one can at least get the higher-dimensional Lebesgue measure from the one-dimensional one by the product measure construction:
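Exercise 9. Let $(X_i, \mathcal{X}_i, \mu_i)_{i=1}^n$ be a finite collection of $\sigma$-finite measure spaces, and let $(\prod_{i=1}^n X_i, \prod_{i=1}^n \mathcal{X}_i)$ be the product measurable space. Show that there exists a unique measure $\mu$ on this space such that $\mu(\prod_{i=1}^n A_i) = \prod_{i=1}^n \mu_i(A_i)$ for all $A_i \in \mathcal{X}_i$. The measure $\mu$ is referred to as the product measure of the $\mu_1, \ldots, \mu_n$ and is denoted $\prod_{i=1}^n \mu_i$.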
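In the simplest setting of two finite sets carrying the discrete $\sigma$-algebra, a measure is determined by its point weights, and the product measure is obtained by multiplying weights. The sketch below illustrates the defining identity on boxes under that assumption; `product_measure` and `measure` are hypothetical helper names, and the weights are arbitrary test data.

```python
from itertools import product

def product_measure(w1, w2):
    """Point weights of the product measure on X1 x X2, built from the
    point weights w1 (on X1) and w2 (on X2)."""
    return {(x, y): w1[x] * w2[y] for x, y in product(w1, w2)}

def measure(weights, E):
    """Measure of a set E, computed as the sum of the point weights of its elements."""
    return sum(weights[p] for p in E)

w1 = {"a": 1, "b": 2}        # a measure on X1 = {a, b}
w2 = {0: 3, 1: 4, 2: 5}      # a measure on X2 = {0, 1, 2}
w = product_measure(w1, w2)

A, B = {"a", "b"}, {0, 2}
box = set(product(A, B))     # the box A x B inside X1 x X2
print(measure(w, box), measure(w1, A) * measure(w2, B))
```

Both printed numbers agree, which is the finite case of the identity that characterises the product measure on boxes.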
Exercise 10. Let E be a Lebesgue measurable subset of $\mathbb{R}^n$, and let m be Lebesgue measure. Establish the inner regularity property
$m(E) = \sup \{ m(K): K \subset E, K \hbox{ compact} \}$ (3)
and the outer regularity property
$m(E) = \inf \{ m(U): E \subset U, U \hbox{ open} \}$. (4)
Combined with the fact that m is locally finite, this implies that m is a Radon measure.
— Integration —
Now we define integration on a measure space $(X, \mathcal{X}, \mu)$.
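Definition 3. (Integration) Let $(X, \mathcal{X}, \mu)$ be a measure space.
- If $f: X \to [0,+\infty]$ is a non-negative simple function (i.e. a measurable function that only takes on finitely many values $a_1, \ldots, a_n$), we define the integral $\int_X f\ d\mu$ of f to be $\int_X f\ d\mu := \sum_{i=1}^n a_i \mu(f^{-1}(\{a_i\}))$ (with the convention that $\infty \cdot 0 = 0$). In particular, if $f = 1_A$ is the indicator function of a measurable set A, then $\int_X 1_A\ d\mu = \mu(A)$.
- If $f: X \to [0,+\infty]$ is a non-negative measurable function, we define the integral $\int_X f\ d\mu$ to be the supremum of $\int_X g\ d\mu$, where g ranges over all simple functions bounded between 0 and f.
- If $f: X \to [-\infty,+\infty]$ is a measurable function, whose positive and negative parts $f_+ := \max(f,0)$, $f_- := \max(-f,0)$ have finite integral, we say that f is absolutely integrable and define $\int_X f\ d\mu := \int_X f_+\ d\mu - \int_X f_-\ d\mu$.
- If $f: X \to \mathbb{C}$ is a measurable function with real and imaginary parts absolutely integrable, we say that f is absolutely integrable and define $\int_X f\ d\mu := \int_X \hbox{Re} f\ d\mu + i \int_X \hbox{Im} f\ d\mu$.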
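The first case of Definition 3 is a finite sum and can be evaluated directly. The sketch below does this on a finite set X, with the measure supplied as a function on subsets; the name `integrate_simple` and the weights used are hypothetical test data. Note that the convention $\infty \cdot 0 = 0$ has to be implemented by hand, since floating-point arithmetic would otherwise produce NaN.

```python
import math

def integrate_simple(f, X, mu):
    """Integral of a non-negative simple function f over the finite set X:
    the sum of a * mu(f^{-1}({a})) over the values a taken by f."""
    total = 0.0
    for a in {f(x) for x in X}:
        level_set = frozenset(x for x in X if f(x) == a)
        m = mu(level_set)
        if a == 0 or m == 0:
            continue  # convention: infinity times zero equals zero
        total += a * m
    return total

X = {1, 2, 3, 4}
weights = {1: 0.5, 2: 0.25, 3: 0.25, 4: 0.0}   # the point 4 is a null set

def mu(E):
    return sum(weights[x] for x in E)

values = {1: 2, 2: 2, 3: 5, 4: math.inf}        # f is infinite only on a null set
print(integrate_simple(lambda x: values[x], X, mu))

A = {1, 3}
print(integrate_simple(lambda x: 1 if x in A else 0, X, mu), mu(A))
```

The first integral is finite even though f takes the value $+\infty$, because it does so only on a null set; the second line illustrates that the integral of an indicator function is the measure of the set.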
We will sometimes show the variable of integration, e.g. writing $\int_X f(x)\ d\mu(x)$ for $\int_X f\ d\mu$, for sake of clarity.
The following results are standard, and the proofs are omitted:
Theorem 2. (Standard facts about integration) Let $(X, \mathcal{X}, \mu)$ be a measure space.
- All the above integration notions are compatible with each other; for instance, if f is both non-negative and absolutely integrable, then definitions 2 and 3 (and 4) agree.
- The functional $f \mapsto \int_X f\ d\mu$ is linear over $\mathbb{R}^+$ for simple functions or non-negative functions, is linear over $\mathbb{R}$ for real-valued absolutely integrable functions, and linear over $\mathbb{C}$ for complex-valued absolutely integrable functions. In particular, the set of (real or complex) absolutely integrable functions on $(X, \mathcal{X}, \mu)$ is a (real or complex) vector space.
- A complex-valued measurable function $f: X \to \mathbb{C}$ is absolutely integrable if and only if $\int_X |f|\ d\mu < \infty$, in which case we have the triangle inequality $|\int_X f\ d\mu| \leq \int_X |f|\ d\mu$. Of course, the same claim holds for real-valued measurable functions.
- If $f: X \to [0,+\infty]$ is non-negative, then $\int_X f\ d\mu \geq 0$, with equality holding if and only if f = 0 a.e.
- If one modifies an absolutely integrable function on a set of measure zero, then the new function is also absolutely integrable, and has the same integral as the original function. Similarly, two non-negative functions that agree a.e. have the same integral. (Because of this, we can meaningfully integrate functions that are only defined almost everywhere.)
- If $f: X \to [-\infty,+\infty]$ is absolutely integrable, then f is finite a.e., and vanishes outside of a $\sigma$-finite set.
- If $f: X \to \mathbb{C}$ is absolutely integrable, and $\varepsilon > 0$, then there exists a complex-valued simple function $g: X \to \mathbb{C}$ such that $\int_X |f-g|\ d\mu \leq \varepsilon$. (This is a manifestation of Littlewood’s second principle.)
- (Change of variables formula) If $\phi: X \to Y$ is a measurable map to another measurable space $(Y, \mathcal{Y})$, and $f: Y \to \mathbb{C}$, then we have $\int_X f \circ \phi\ d\mu = \int_Y f\ d\phi_* \mu$, in the sense that whenever one of the integrals is well defined, then the other is also, and equals the first.
It is also important to note that the Lebesgue integral on $\mathbb{R}^n$ extends the more classical Riemann integral. As a consequence, many properties of the Riemann integral (e.g. change of variables formula with respect to smooth diffeomorphisms) are inherited by the Lebesgue integral, thanks to various limiting arguments.
We now recall the fundamental convergence theorems relating limits and integration: the first three are for non-negative functions, the last three are for absolutely integrable functions. They are ultimately derived from their namesakes in Exercise 6 and an approximation argument by simple functions, and the proofs are again omitted. (They are also closely related to each other, and are in fact largely equivalent.)
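Theorem 3. (Convergence theorems) Let $(X, \mathcal{X}, \mu)$ be a measure space.
- (Monotone convergence for sequences) If $0 \leq f_1 \leq f_2 \leq \ldots$ are measurable, then $\int_X \lim_{n \to \infty} f_n\ d\mu = \lim_{n \to \infty} \int_X f_n\ d\mu$.
- (Monotone convergence for series) If $f_n: X \to [0,+\infty]$ are measurable, then $\int_X \sum_{n=1}^\infty f_n\ d\mu = \sum_{n=1}^\infty \int_X f_n\ d\mu$.
- (Fatou’s lemma) If $f_n: X \to [0,+\infty]$ are measurable, then $\int_X \liminf_{n \to \infty} f_n\ d\mu \leq \liminf_{n \to \infty} \int_X f_n\ d\mu$.
- (Dominated convergence for sequences) If $f_n: X \to \mathbb{C}$ are measurable functions converging pointwise a.e. to a limit f, and $|f_n| \leq g$ a.e. for some absolutely integrable $g: X \to [0,+\infty]$, then $\int_X \lim_{n \to \infty} f_n\ d\mu = \lim_{n \to \infty} \int_X f_n\ d\mu$.
- (Dominated convergence for series) If $f_n: X \to \mathbb{C}$ are measurable functions with $\sum_{n=1}^\infty \int_X |f_n|\ d\mu < \infty$, then $\sum_{n=1}^\infty f_n(x)$ is absolutely convergent for a.e. x and $\int_X \sum_{n=1}^\infty f_n\ d\mu = \sum_{n=1}^\infty \int_X f_n\ d\mu$.
- (Egorov’s theorem) If $f_n: X \to \mathbb{C}$ are measurable functions converging pointwise a.e. to a limit f on a subset A of X of finite measure, and $\varepsilon > 0$, then there exists a set of measure at most $\varepsilon$, outside of which $f_n$ converges uniformly to f in A. (This is a manifestation of Littlewood’s third principle.)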
Remark 7. As a rule of thumb, if one does not have exact or approximate monotonicity or domination (where “approximate” means “up to an error whose $L^1$ norm goes to zero”), then one should not expect the integral of a limit to equal the limit of the integral in general; there is just too much room for oscillation.
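A standard worked example of this failure, written here with m denoting Lebesgue measure on $\mathbb{R}$ (the choice of example is ours, not taken from the notes above), is a bump of unit mass escaping to infinity. The sequence is neither monotone nor dominated by any absolutely integrable function, and the integral of the limit differs from the limit of the integrals:

```latex
f_n := 1_{[n,n+1]}, \qquad \lim_{n \to \infty} f_n(x) = 0 \hbox{ for every } x \in \mathbb{R},
\qquad
\int_{\mathbb{R}} \lim_{n \to \infty} f_n\ dm = 0 \;<\; 1 = \lim_{n \to \infty} \int_{\mathbb{R}} f_n\ dm.
```

The same sequence shows that the inequality in Fatou's lemma can be strict.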
Exercise 11. Let $f: X \to \mathbb{C}$ be an absolutely integrable function on a measure space $(X, \mathcal{X}, \mu)$. Show that f is uniformly integrable, in the sense that for every $\varepsilon > 0$ there exists $\delta > 0$ such that $\int_E |f|\ d\mu \leq \varepsilon$ whenever E is a measurable set of measure at most $\delta$. (The property of uniform integrability becomes more interesting, of course, when applied to a family of functions, rather than to a single function.)
With regard to product measures and integration, the fundamental theorem in this subject is
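Theorem 4. (Fubini-Tonelli theorem) Let $(X, \mathcal{X}, \mu)$ and $(Y, \mathcal{Y}, \nu)$ be $\sigma$-finite measure spaces, with product space $(X \times Y, \mathcal{X} \times \mathcal{Y}, \mu \times \nu)$.
- (Tonelli theorem) If $f: X \times Y \to [0,+\infty]$ is measurable, then $\int_X (\int_Y f(x,y)\ d\nu(y))\ d\mu(x) = \int_Y (\int_X f(x,y)\ d\mu(x))\ d\nu(y) = \int_{X \times Y} f\ d(\mu \times \nu)$.
- (Fubini theorem) If $f: X \times Y \to \mathbb{C}$ is absolutely integrable, then we also have $\int_X (\int_Y f(x,y)\ d\nu(y))\ d\mu(x) = \int_Y (\int_X f(x,y)\ d\mu(x))\ d\nu(y) = \int_{X \times Y} f\ d(\mu \times \nu)$, with the inner integrals being absolutely integrable a.e. and the outer integrals all being absolutely integrable.
- If $(X, \mathcal{X}, \mu)$ and $(Y, \mathcal{Y}, \nu)$ are complete measure spaces, then the same claims hold with the product $\sigma$-algebra $\mathcal{X} \times \mathcal{Y}$ replaced by its completion.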
Remark 8. The theorem fails for non-$\sigma$-finite spaces, but virtually every measure space actually encountered in “hard analysis” applications will be $\sigma$-finite. (One should be cautious, however, with any space constructed using ultrafilters or the first uncountable ordinal.) It is also important that f obey some measurability in the product space; there exist non-measurable f for which the iterated integrals exist (and may or may not be equal to each other, depending on the properties of f and even on which axioms of set theory one chooses), but the product integral (of course) does not.
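A standard illustration of the failure without $\sigma$-finiteness (our choice of example, not taken from the notes above) takes X = Y = [0,1] with the Borel $\sigma$-algebra, Lebesgue measure m on X, and counting measure $\#$ on Y, which is not $\sigma$-finite. The diagonal $D := \{(x,x): x \in [0,1]\}$ is closed, hence measurable in the product $\sigma$-algebra, yet the two iterated integrals of its indicator function disagree:

```latex
\int_X \Big( \int_Y 1_D(x,y)\ d\#(y) \Big)\ dm(x) = \int_X \#(\{x\})\ dm(x) = \int_X 1\ dm = 1,
\qquad
\int_Y \Big( \int_X 1_D(x,y)\ dm(x) \Big)\ d\#(y) = \int_Y m(\{y\})\ d\#(y) = \int_Y 0\ d\# = 0.
```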
[Update, Jan 20: statement of Egorov’s theorem corrected.]
52 comments
1 January, 2009 at 2:38 pm
Xinwei Yu
Dear Prof. Tao,
You said “any set which does not rely on the axiom of choice in its construction will be [Lebesgue] measurable”. Could you give me a hint on how to prove it or tell me where to find the proof? Thank you very much!
1 January, 2009 at 3:03 pm
Anonymous
Thanks again for a great post. A few typos:
1. Definition 1: Last sentence of the first paragraph, $\mathcal B$ should be $\mathcal X$?
2. Example 1: $/sigma$ should be $\sigma$.
3. Example 4: In the first sentence $(Y, \mathcal Y)$ is missing the right parenthesis.
4. Definition 3: In the second item, missing $\$ in int_g.
5. Right before Theorem 3: “They are derived from their namesakes in Exercise 5” should really be Exercise 6? Same for the second paragraph of “Proof” after Remark 5.
1 January, 2009 at 5:14 pm
Ian
Xinwei Yu, the paper you’re looking for is by Robert Solovay
http://en.wikipedia.org/wiki/Robert_M._Solovay
1 January, 2009 at 5:22 pm
Andy
Hi Xinwei,
Solovay proved that it’s consistent with the negation of the axiom of choice that all functions are measurable. See
http://en.wikipedia.org/wiki/Non-measurable_set
and Solovay’s wiki entry for a discussion & ref. (Almost surely a difficult result to reprove on one’s own…)
Prof. Tao–tiny typo, in the para before Ex. 1, “the join if the Borel” –> “the join of the Borel”
1 January, 2009 at 7:40 pm
Xinwei Yu
Thanks Andy and Ian!
1 January, 2009 at 7:42 pm
Terence Tao
Thanks for the corrections!
Regarding non-measurable sets, perhaps the more precise statement is that any set which can be constructed without using the axiom of choice is measurable in at least one model of set theory (excluding the axiom of choice), though it need not be measurable in every model. But as a first approximation at least, “sets constructed without choice are measurable” is a good rule of thumb.
1 January, 2009 at 10:50 pm
Roger
Don’t you mean: any set which can be constructed WITHOUT using the axiom of choice?
2 January, 2009 at 6:32 am
Sune Kristian Jakobsen
Shouldn’t the equation in theorem 3 item 1 be:

2 January, 2009 at 7:05 am
James
Thanks for this post. It was a pleasure to read and to find to my surprise that I could remember most of it from when I learnt it many years ago (although I would fail miserably trying to construct any of the proofs these days as I’m long out of practice).
One thing I recall is that the first text I used (Ash: Real Analysis and Probability) derived the Caratheodory theorem and then proceeded to construct Lebesgue measure etc. much as you do. Later on I picked up Rudin’s Real and Complex Analysis and found that this approach was not used. Instead some form of the Riesz Representation Theorem was used to perform the construction using linear functionals on locally compact Hausdorff spaces (if I remember correctly…). I was a bit upset at the time since, having fought my way through many pages using the first approach, it now seemed I would need to climb the same mountain again by a different route. Anyway, how might you rate the pros and cons of these two approaches?
2 January, 2009 at 8:19 am
KKK
Hi Prof. Tao,
Is it possible to have a pdf (or latex) version of your notes? Thank you.
2 January, 2009 at 10:14 am
Terence Tao
Thanks for the corrections!
Dear James: I do plan to discuss the Riesz representation approach a bit later in this course. Both approaches are important, but arrive at the same goal (construction of the Lebesgue measure and Lebesgue integral) from opposite directions. Philosophically, it comes down to whether one views measure as the fundamental concept (and integration being defined in terms of measure, via simple functions etc.), or integration as the fundamental concept (with measure then being a special case of the integral, by specialising to indicator functions). The Caratheodory approach takes the former view, building measures first and only later defining the integral, while the Riesz approach does the opposite. The two approaches roughly correspond to two of the basic perspectives we have to view modern mathematics, namely the set-theoretic perspective (studying a space concretely, via its points and subsets) and the category-theoretic perspective (studying a space abstractly, via the functions or maps into or out of that space). Both of these perspectives are important (as are other perspectives, e.g. information-theoretic perspective, algorithmic perspective, model-theoretic perspective, etc.), and I feel it is important to have one’s foundational theoretical concepts supported by as many of these perspectives as possible.
The Caratheodory approach is more elementary – no integrals, function spaces, point set topology, etc. – but there is definitely a fair amount of combinatorial trickery involved (e.g. the verification that if one decomposes an interval into countably many subintervals, that the length of the big interval is the sum of the smaller ones is really rather non-trivial). The Riesz approach is slicker, and more natural in many ways, but does require much more theory (though this theory is definitely useful for many other things also). Also, in the Riesz approach, the hard part of building the integral still exists; but it’s just hidden in the construction of the Riemann integral (or, alternatively, in the construction of Haar measure).
From a pedagogical perspective, I prefer to do Caratheodory first, to get an initial construction of Lebesgue measure and integral, then do function spaces (using the L^p spaces provided by the Lebesgue integral as model examples), and then finally to do Riesz representation and close the circle. One can do things in a different order, but I find that without the L^p spaces, the function space material becomes a bit abstract, and not as easy to motivate.
Dear KKK: I plan to include these notes in my yearly book compilation of the blog, though the timing of this particular notes turns out to be maximally unfortunate in this regard. (I am currently in the process of compiling the 2008 posts and should have a draft PDF for those shortly.) If you “print preview” the post, though, then you can get a slightly cleaner version of the file, which can be converted to PDF.
2 January, 2009 at 10:27 am
Terence Tao
p.s. Incidentally, one of the quickest ways I know of to construct Lebesgue measure on, say, [0,1], is to first construct it on the infinite discrete cube $\{0,1\}^{\mathbb{N}}$ (which, being totally disconnected, allows for some simplifications in either the Caratheodory or Riesz approaches) and then use the binary expansion to identify this with [0,1] (modulo a countable set of terminating binary decimals, which can be easily dealt with.) But from a pedagogical perspective this is not a particularly good way to see Lebesgue measure for the first time; if you will pardon the pun, it is totally disconnected from the geometric intuition one already has for the concepts of measure and integration.
2 January, 2009 at 10:56 am
Sune Kristian Jakobsen
“any set which can be constructed without using the axiom of choice is measurable in at least one model of set theory (excluding the axiom of choice)”
I don’t know if I understand this statement correct, but you seem to suggest that the model to choose depends on the set? If I understand the title of Solovay’s paper “A model of set-theory in which every set of reals is Lebesgue measurable” correctly: There is at least one model of set theory (excluding the axiom of choice) in which any set that can be constructed without using the axiom of choice is measurable?
3 January, 2009 at 8:03 am
Eric
The trick of constructing Lebesgue measure by first using the totally disconnected Cantor set can also be used in proving the Riesz representation theorem (at least for finite measures). First, you construct a measure from a functional on any totally disconnected compact Hausdorff space, which is significantly simpler than the general case (in particular, if you have already done the Caratheodory approach, verifying the countable additivity hypothesis to apply the Caratheodory extension theorem is trivial). Since free compact Hausdorff spaces (in the categorical sense) are totally disconnected, every compact Hausdorff space X is a quotient of a totally disconnected compact Hausdorff space Y. By Hahn-Banach, a functional on C(X) extends to C(Y), which then gives a measure on Y which can be pushed forward to X.
3 January, 2009 at 2:51 pm
James
Dear Terence,
I had never considered the contrast between the Caratheodory and Riesz approaches as an example of the two mathematical points of view that you have talked about (“problem solving and model building”- maybe that was Timothy Gowers…)
I agree that the Caratheodory approach is better from a pedagogical point of view – more “hands on” and leaves you with a feeling that you have constructed something “from the ground up” rather than by some clever sleight of hand.
However, how would you place the Daniell integral, or the Henstock–Kurzweil one – in so far as they might be useful for a first calculus course for people who might not go on to higher level maths?
3 January, 2009 at 4:15 pm
Terence Tao
Dear Sune: Yes, you are right, Solovay’s result even allows one to take the same model to make every set constructible without choice measurable. (Though this does raise a question that I do not know the answer to: is there a set that can be constructed without AC which is not necessarily measurable, i.e. it is non-measurable in at least one model of set theory? I suppose one first has to formalise what “constructed without AC” means more precisely to answer this.)
Dear Eric: Thanks for the comments! Another way to use the totally disconnected trick is to then invoke the Kolmogorov extension theorem rather than the Caratheodory one, though this is only a minor variation of the usual Caratheodory approach in the end.
Dear James: The set-theoretic vs. category-theoretic distinction (or in this context, the measure-theoretic vs. functional distinction) is perhaps slightly different from the problem solving vs. theory building distinction that Gowers writes about, though I can see that there might be some correlation between the two in practice.
I was not too familiar with the two integrals you mentioned, so I looked them up. The Daniell integral seems like a variant of the Riesz approach, using elementary functions instead of continuous ones to build the initial linear functional that will become the integral. Meanwhile, the Henstock-Kurzweil integral seems to be a variant of the Riemann integral and thus coming from the geometric perspective rather than the measure-theoretic or functional perspectives.
For undergraduate mathematics purposes, I think the Riemann integral is largely adequate, especially outside of analysis, which is the main place in mathematics where one has to deal with singularities and other pathologies. While the Riemann integral does not enjoy the convergence theorems (Theorem 3) that make the Lebesgue integral so convenient for analysis, one at least still has good behaviour under the uniform convergence topology (which is good enough for many applications outside of analysis). Furthermore, the Riemann integral ties in closely with one’s intuitive geometric notions of area, etc., and reinforces the close analogy between summation and integration (and more generally, between the discrete and the continuous) that helps build a lot of algebraic intuition for integration also.
There is unlikely to be any one “best” notion of integration that covers all cases and applications. For instance, the Lebesgue integral (or any of the other integrals mentioned above) cannot directly handle principal value integrals, or integrals obtained via zeta function regularisation, even though both of these types of integration are of significant importance in certain subfields of mathematics. Like many other broad and fundamental mathematical concepts (e.g. “number”, “space”, “limit”, “distance”, “size”, “similar”, etc.), it seems best to view integration as a loosely related family of mathematical operations, rather than to try to force all of them artificially into a single unifying formal or axiomatic framework.
The Lebesgue approach does have a philosophy which seems particularly compatible with analysis and related disciplines, namely that it treats “small” sets as “negligible” (thus, for instance, identifying two functions if they agree outside of a set of measure zero). This fits in well with the general tendency in analysis to routinely discard various small errors. Littlewood’s principles tell us that up to such small errors, Lebesgue-measurable sets are essentially elementary, Lebesgue-measurable functions are essentially continuous, and pointwise convergence is essentially uniform, so the Lebesgue theory is basically nothing more than the completion of the classical Riemann theory, and thus very natural in any situation in which one is willing to ignore what is going on in a set of small measure.
3 January, 2009 at 5:28 pm
254A, notes 0a. An alternate approach to the Carathéodory extension theorem « What’s new
[…] theorem, elementary sets, Lebesgue measure | by Terence Tao In this supplemental note to the previous lecture notes, I would like to give an alternate proof of a (weak form of the) Carathéodory extension theorem. […]
3 January, 2009 at 5:52 pm
wangtwo
Dear Prof. Tao,
I think in example 3 that you should use
but not
.
In the following sentence:
“we can form the σ-algebra generated by the ;”
[Corrected, thanks – T.]
4 January, 2009 at 7:21 am
liuxiaochuan
Dear Professor Tao:
When I meet a problem in measure theory which asks for some property of a σ-algebra, I find I can always just take the family of sets which satisfy this property and then try to prove that it is indeed a σ-algebra. This method is very strong, and even in this post it can be applied several times. In the end, though I can finish the proof, I feel as if the conclusion just comes “out of nowhere”.
For an arbitrarily chosen set in a σ-algebra, I sometimes have the impulse to get it “by hand”: I want to obtain this set through several countable unions or countable intersections, which is obviously impossible. Every time I am in this situation, I find that the above method can be used.
I am wondering if there are some more important thoughts behind this phenomenon.
4 January, 2009 at 9:41 am
Terence Tao
Dear Liuxiaochuan,
One can indeed replicate the argument you describe (i.e. showing that the collection of sets that obey a certain property forms a σ-algebra, and also contains some generating set, and thus must contain the σ-algebra generated by that set) by repeatedly taking countable unions and intersections of generators instead. The catch is that “repeatedly” means “iterated up to the first uncountable ordinal” in the case of the Borel algebra (see http://en.wikipedia.org/wiki/Borel_algebra ), and is more complicated still in the case of general σ-algebras (involving “trees” of at most countable depth, if I recall correctly). So the relationship between a generating set and the algebra that it generates is significantly more complicated for σ-algebras than it is for other algebraic structures (e.g. rings, groups, vector spaces, Boolean algebras, topologies, etc.) where the description of the generated structure tends to be much more concrete and explicit.
Basically, what is going on here is that one is using powerful axioms of set theory (indeed, one is literally relying on the power set axiom, in order to consider the collection of all σ-algebras containing a certain base set) to bypass the tedious task of actually working out constructively what sets can be generated from some base set of generators, and jumping straight to the existence of the σ-algebra generated by that set, without getting much information as to precisely what that σ-algebra consists of. As a consequence, the only real way to establish properties of this σ-algebra is to exploit its definition as the minimal σ-algebra containing the generating set.
There is a certain amount of reverse mathematics analysis in the literature that shows that some reliance on strong axioms of set theory (axiom of choice being of course the most famous one, but there are others) is necessary to establish some of the basic results in measure theory, which may give some formal limitations on what one can do by purely “constructive” methods here, but I am not really an expert on these things.
Of course, the Borel measurable sets that one actually encounters in concrete applications tend to be fairly low on the Borel hierarchy, e.g. they are often just F_σ or G_δ sets at worst, and similarly for Lebesgue measurable sets, etc. So for many specific applications, one can probably replace the general theory with some more concrete but ad hoc substitute, in which everything is done “by hand” and one only takes countable unions and intersections a fairly limited number of times. This would probably make one’s arguments significantly longer and messier, though, and also harder to generalise.
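The two descriptions of the generated σ-algebra contrasted in this reply can be written out as follows. This is a sketch in notation chosen here ($\mathcal{F}$ for the generating collection of subsets of $X$, $\sigma(\mathcal{F})$ for the σ-algebra it generates, $\Sigma_\alpha$ for the stages of the transfinite construction); the precise indexing of the stages varies between textbooks.

```latex
% "Top-down" definition: intersect all sigma-algebras containing the generators.
\sigma(\mathcal{F}) := \bigcap \{ \mathcal{B} : \mathcal{B} \text{ is a } \sigma\text{-algebra on } X,\ \mathcal{F} \subset \mathcal{B} \}.

% "Bottom-up" description: iterate over all countable ordinals.
\Sigma_0 := \mathcal{F} \cup \{\emptyset\}, \qquad
\Sigma_\alpha := \Big\{ \bigcup_{n=1}^\infty A_n : \text{for each } n,\ A_n \text{ or } X \setminus A_n
   \text{ lies in } \bigcup_{\beta < \alpha} \Sigma_\beta \Big\} \quad (\alpha \ge 1),
\qquad
\sigma(\mathcal{F}) = \bigcup_{\alpha < \omega_1} \Sigma_\alpha.

% The method described in the question: if
%   \mathcal{P} := \{ E \subset X : E \text{ has the desired property} \}
% is a sigma-algebra and \mathcal{F} \subset \mathcal{P}, then minimality gives
\sigma(\mathcal{F}) \subset \mathcal{P}.
```

The union stops at the first uncountable ordinal $\omega_1$ because any countable family of sets drawn from the stages $\Sigma_{\alpha_n}$ already lies below a single countable stage.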
4 January, 2009 at 12:13 pm
Phil
Hi Prof. Tao,
In Exercise 8, I think you meant to put “element of” rather than “subset of” in two instances.
Exercise 9 only seems correct when all the measures involved are probability measures.
You commented on Theorem 3 that the convergence theorems for integrals of functions were largely equivalent to the corresponding ones for sets. On the other hand, on face value, they seem to be “stronger” statements since they reduce to Exercise 5 when we apply them to characteristic functions of sets. I think this is a relevant case study for the recent discussion on Gowers’ Weblog regarding how one equivalent statement can be “stronger” than another.
From the way integration theory is developed here, they seem stronger, but consider the following alternative approach to integration of functions (which maybe you had in mind when you said they were equivalent). First construct the Lebesgue measure on R, and develop the product measure without reference to integration of functions. Define the integral of a positive function as, literally, “the area under the curve” in the product measure sense, then proceed as one usually would. From this standpoint, the linearity of the integral becomes a consequence of the translation invariance of the Lebesgue measure, and the convergence theorems for functions become special cases of the convergence theorems for sets.
I think this approach is geometrically intuitive and gives insight to the “equivalences” discussed, but probably it’s a bit too gross to practice because the constructions of Lebesgue measure and the product measure without appeal to integration can be tough and it also doesn’t generalize to vector valued integration.
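The “area under the curve” approach described in this comment can be sketched as follows. The notation is chosen here: $m$ denotes Lebesgue measure on $[0,\infty)$, $\Gamma_f$ the region under the graph, and $(X,\mu)$ is assumed to be σ-finite so that the product measure is unambiguous.

```latex
\Gamma_f := \{ (x,t) \in X \times [0,\infty) : 0 \le t < f(x) \},
\qquad
\int_X f \, d\mu := (\mu \times m)(\Gamma_f)
\quad \text{for measurable } f \colon X \to [0,+\infty].

% Monotone convergence for functions from monotone convergence for sets:
0 \le f_1 \le f_2 \le \dots, \quad f_n \to f \text{ pointwise}
\ \Longrightarrow\
\Gamma_{f_1} \subset \Gamma_{f_2} \subset \dots, \quad \bigcup_{n=1}^\infty \Gamma_{f_n} = \Gamma_f
\ \Longrightarrow\
\lim_{n \to \infty} \int_X f_n \, d\mu = \int_X f \, d\mu.
```

The strict inequality $t < f(x)$ is what makes the union of the regions exactly $\Gamma_f$: if $t < f(x)$, then $t < f_n(x)$ for all sufficiently large $n$.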
4 January, 2009 at 3:31 pm
Terence Tao
Dear Phil: Thanks for the corrections!
It is true that by interpreting a (non-negative) integral as the measure of a set in a higher-dimensional space, one can equate the integral convergence theorems with their measure counterparts, but I had a slightly different correspondence in mind, namely if one specialises a convergence theorem to the case of simple functions taking a fixed finite set of values, then the integral theorem quickly collapses to just the set version. The integral convergence theorems can then be recovered from this special case by a discretisation argument. (One has to be a little careful to avoid using the very same convergence theorem one is trying to prove in order to justify the limiting procedure, of course.)
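As a sketch of the correspondence described in this reply, consider monotone convergence for simple functions taking values in a fixed finite set $0 = c_0 < c_1 < \dots < c_k$. The notation ($c_j$, $E_{n,j}$, $E_j$) is chosen here for illustration. Writing each function as a combination of indicators of superlevel sets reduces the statement to upward monotone convergence for sets:

```latex
f_n = \sum_{j=1}^{k} (c_j - c_{j-1}) \, 1_{E_{n,j}}, \qquad E_{n,j} := \{ x : f_n(x) \ge c_j \},
\qquad
\int_X f_n \, d\mu = \sum_{j=1}^{k} (c_j - c_{j-1}) \, \mu(E_{n,j}).

% If f_n increases pointwise to f (also valued in \{c_0,\dots,c_k\}), then for each j
E_{1,j} \subset E_{2,j} \subset \dots, \qquad \bigcup_{n=1}^\infty E_{n,j} = E_j := \{ x : f(x) \ge c_j \},

% so monotone convergence for sets gives
\lim_{n\to\infty} \int_X f_n \, d\mu
  = \sum_{j=1}^{k} (c_j - c_{j-1}) \, \mu(E_j)
  = \int_X f \, d\mu.
```

The identity for the union uses that the $f_n$ take only finitely many values, so $f_n(x) = f(x)$ for all sufficiently large $n$ at each point $x$.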
4 January, 2009 at 11:08 pm
245B, notes 1: Signed measures and the Radon-Nikodym-Lebesgue theorem « What’s new
[…] The implication of 3. from 1. is Exercise 11 from Notes 0. The implication of 2. from 3. is trivial. To deduce 1. from 2., apply Theorem 1 to and […]
6 January, 2009 at 9:33 am
Joe Shipman
You say in your last sentence that iterated integrals may or may not be equal to each other. In my thesis (Cardinal Conditions for Strong Fubini Theorems, October 1990 TAMS) I showed that for non-negative functions in n dimensions, it is consistent with ZFC (and a consequence of a real-valued measurable cardinal) that iterated integrals are always equal whenever they exist (consistency previously shown by H. Friedman for n=2; Sierpinski’s famous example where iterated integrals of non-negative functions differ depends on the continuum hypothesis). The non-negativity assumption is necessary, or there are simple counterexamples: for example, the function on [0,∞)^2 that is 0 for y>x, 1 for x>y>(x-1), -1 for (x-1)>y>(x-2), and 0 for (x-2)>y.
— Joe Shipman
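The two iterated integrals of the signed counterexample in this comment can be worked out directly. The following is a verification sketch, with $f$ denoting the function described in the comment:

```latex
% Integrating in x first: for every fixed y \ge 0, f = +1 on (y, y+1) and f = -1 on (y+1, y+2), so
\int_0^\infty f(x,y)\,dx = 1 - 1 = 0
\quad\Longrightarrow\quad
\int_0^\infty \left( \int_0^\infty f(x,y)\,dx \right) dy = 0.

% Integrating in y first: the strips are cut off by the boundary y = 0 when x < 2, so
\int_0^\infty f(x,y)\,dy =
\begin{cases}
  x     & 0 \le x \le 1, \\
  2 - x & 1 \le x \le 2, \\
  0     & x \ge 2,
\end{cases}
\quad\Longrightarrow\quad
\int_0^\infty \left( \int_0^\infty f(x,y)\,dy \right) dx = \tfrac12 + \tfrac12 = 1.
```

This does not contradict Fubini’s theorem, since $\int\!\!\int |f| = +\infty$: the function is not absolutely integrable on $[0,\infty)^2$.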
6 January, 2009 at 10:59 am
Terence Tao
Thanks for the clarification!
7 January, 2009 at 5:03 am
Mohamed
Thanks
I would like to ask if you have some simple notes on differential geometry and gauge theory, since I have found this theory so hard to understand.
regards
Mohamed
7 January, 2009 at 9:17 am
Terence Tao
You might try these posts of mine:
https://terrytao.wordpress.com/2008/09/27/what-is-a-gauge/
https://terrytao.wordpress.com/2008/03/12/pcm-article-ricci-flow/
https://terrytao.wordpress.com/2008/03/26/285g-lecture-0-riemannian-manifolds-and-curvature/
9 January, 2009 at 7:18 pm
245B, notes 3: L^p spaces « What’s new
[…] to f outside of a set of measure zero. (Compare with Egorov’s theorem (Theorem 3.6 from Notes 0), which equates pointwise convergence with uniform convergence outside of a set of arbitrarily […]
20 January, 2009 at 8:57 am
PDEbeginner
Dear Prof. Tao,
Thanks a lot for your very nice lecture notes!
There seems to be a problem with your Egorov theorem. I think we need to restrict the theorem to some
-finite set
. Otherwise, there is a counterexample: Let
,
and
. Clearly,
pointwisely, but we cannot find any small set
to make
uniformly on
.
20 January, 2009 at 10:38 am
Terence Tao
Oops, you’re right; it’s corrected now.
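A standard counterexample of the kind raised in this exchange is sketched below. It is not necessarily the one the commenter had in mind, and the choices of space and functions are assumptions made here for illustration. On a space of infinite measure, pointwise convergence need not be uniform outside a small set:

```latex
X = {\bf R} \text{ with Lebesgue measure } m, \qquad f_n := 1_{[n,n+1]}, \qquad f := 0.

% Pointwise convergence: for each fixed x, f_n(x) = 0 once n > x, so f_n \to f everywhere.
% Failure of near-uniform convergence: if m(E) < 1, then for every n
m([n,n+1] \setminus E) \ge 1 - m(E) > 0
\quad\Longrightarrow\quad
\sup_{x \in X \setminus E} |f_n(x) - f(x)| = 1 \quad \text{for all } n.
```

So $f_n$ does not converge uniformly to $f$ on the complement of any set of measure less than $1$, which is why Egorov’s theorem needs a finite measure hypothesis.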
16 February, 2009 at 7:28 am
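Real Analysis 0-10 « Liu Xiaochuan’s Weblog
[…] Section 0 is a review of measure theory, consisting mostly of known results; Littlewood’s three principles for learning real analysis are quite interesting. This was the first time I had heard of them, although I had already sensed these ideas while studying. Measure theory has many standard proof techniques, and these techniques are often related to choice. In the comments I asked Professor Tao a question that had been bothering me for some time: some problems in measure theory ask one to prove that a σ-algebra has such-and-such a property, and there is a very powerful method for this, namely to write down directly the class of sets satisfying the property and then prove that it is indeed a σ-algebra. When using this method, one often feels that the conclusion comes out of nowhere; the proof is not illuminating. Professor Tao answered at some length. For the problem above, the normal approach would be to take an arbitrary element of the σ-algebra and prove directly that it satisfies the desired property. The trouble is that it is hard to give a precise description of this arbitrary element. In fact, one has to perform operations such as “countable intersection” and “countable union” uncountably many times. This is itself really a manifestation of the power of the axiom of choice, because even if one used the approach above to construct a “clearer” proof, the process would certainly be very complicated and ugly. […]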
3 March, 2009 at 12:40 pm
Anonymous
Dear Professor Tao,
What am I missing in Exercise 8? It seems that the example you give in Remark 1 under your “alternate approach” shows the necessity of X being σ-finite. In this example, the algebra is the collection of finite unions of sets
,
, which generates the discrete σ-algebra on
, and every non-empty set in the algebra has measure
. If we extend to the counting measure, then the assertion in Exercise 8 fails for every
and
with
.
Jim Lewis
3 March, 2009 at 2:34 pm
Terence Tao
Hmm, you’re right. I think I thought I could restrict the algebra
to E so that I can work entirely in a finite measure setting, but of course this doesn’t work since E doesn’t lie in
. I’ve adjusted the exercise accordingly.
14 June, 2009 at 10:22 am
student
Dear Prof. Tao,
is the following claim right?
” any measure on
is the limit of finite linear combination of Dirac delta unit masses at points of [0,1].”
thanks
14 June, 2009 at 10:28 am
Terence Tao
Yes, this is correct, as long as one uses the vague topology to define the concept of a limit of a sequence of measures. The claim is closely related to the fact that the Riemann-Stieltjes integral is compatible with the Lebesgue-Stieltjes integral. For instance, the vague convergence of
to Lebesgue measure on [0,1] follows from the compatibility of the Riemann and Lebesgue integrals when integrating continuous functions.
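One natural sequence of this kind is sketched below. The specific choice of equally spaced point masses $\mu_n$ is an assumption made here for illustration. Testing against a continuous function $f$ on $[0,1]$ turns the integral against $\mu_n$ into a Riemann sum:

```latex
\mu_n := \frac{1}{n} \sum_{j=1}^{n} \delta_{j/n},
\qquad
\int_{[0,1]} f \, d\mu_n = \frac{1}{n} \sum_{j=1}^{n} f\!\left( \frac{j}{n} \right)
\ \longrightarrow\ \int_0^1 f(x)\,dx
\quad (n \to \infty) \quad \text{for every continuous } f \colon [0,1] \to {\bf R}.
```

The limit on the right is the Riemann integral of $f$, which agrees with its Lebesgue integral; this is the compatibility referred to in the reply.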
14 June, 2009 at 10:34 am
student
Dear Prof. Tao,
I am asking this question to clarify my understanding.
is the following right?
for any
right continuous,
nondecreasing function F,
we can define an outer measure
on power set of R, then
thanks
14 June, 2009 at 10:39 am
student
wow! Prof Tao, you are so fast… thanks a lot
2 August, 2009 at 2:32 am
Mark Reid
Dear Professor Tao,
When I studied integration theory as an undergraduate it was in the Caratheodory style you mention (measures first, integration later). Some of the work I am doing now in statistical learning theory involves a fair bit of functional analysis and expectations are a central object of study. The Riesz approach to integration theory you mention in your comment above sounds like an approach I would like to understand better.
I was wondering if you could recommend any good books (or papers) on the Riesz approach to integration theory, particularly probability theory.
Thanks in advance,
Mark.
1 January, 2010 at 8:47 pm
254A, Notes 0: A review of probability theory « What’s new
[…] will assume familiarity with the foundations of measure theory; see for instance these earlier lecture notes of mine for a quick review of that topic. This is also not intended to be a first introduction to […]
2 January, 2010 at 4:31 am
Anonymous
“In these notes we quickly review the basics of abstract measure theory and integration theory, which was covered in the previous course..”
Which previous course is being referred to here, and is the referred course available online? If not, what is the recommended book for the referred course?
Thanks
2 January, 2010 at 11:31 am
Terence Tao
245a, taught by my colleague, Jim Ralston. The textbook was Folland’s “Real Analysis”.
8 April, 2010 at 9:53 am
254B, Notes 2: Roth’s theorem « What’s new
[…] additive probability measure on . (Hint: use the Carathéodory extension theorem, see e.g. my 254B notes 0 or notes […]
25 May, 2010 at 12:51 am
abdussalam mustapha
Dear prof.
I am a student newly admitted into a Master’s programme in mathematics, but I have a poor background in some of the basics, especially in measure theory and integration. Could you please assist me with materials that will help me grasp the main ideas of measure theory? Thank you.
Mustapha Abdussalam
Usmanu Danfodiyo University, Sokoto,
Nigeria.
31 August, 2010 at 9:03 am
Course announcement: 245A, Real analysis « What’s new
[…] also this preliminary 245B post for a summary of the material to be covered in […]
31 October, 2011 at 8:16 am
An engineering introduction to measure theory « Memming
[…] Terry Tao has a great summary on measure theory and integration. […]
10 May, 2015 at 10:11 am
Qasim
Hi all,
I am looking for categorical properties of measurable spaces, such as initial structure, final structure and discrete structure. Thanks in advance.
17 November, 2017 at 6:12 am
Anonymous
Example 2 gives a connection between “sigma-algebra” and “topology”. Why would one need to introduce the “sigma-algebra” structure in the first place? Is it possible to develop the measure theory (and the integration theory) with merely the notion of “topology” on a set X? What could go wrong with this attempt?
17 November, 2017 at 10:56 am
Terence Tao
The space of continuous functions (or, for that matter, Riemann integrable functions) is not closed under pointwise limits or infinite sums, whereas the space of measurable functions is. This is one of the main advantages of the Lebesgue integration theory in analysis, as it often allows one to interchange such limits and sums with integrals using such tools as the monotone convergence theorem or dominated convergence theorem. If one could only integrate functions that respected the topology (aka continuous functions), then it becomes significantly more difficult at a technical level to perform such interchanges.
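A simple illustration of this point (the example $f_n(x) = x^n$ is chosen here, and is not taken from the reply): a sequence of continuous functions whose pointwise limit is discontinuous but still measurable, with the interchange of limit and integral justified by dominated convergence.

```latex
f_n(x) := x^n \ \text{ on } [0,1],
\qquad
\lim_{n \to \infty} f_n(x) = 1_{\{1\}}(x) =
\begin{cases} 0 & 0 \le x < 1, \\ 1 & x = 1, \end{cases}

% |f_n| \le 1 on [0,1], and the constant 1 is integrable there, so dominated convergence applies:
\lim_{n \to \infty} \int_0^1 x^n \, dx = \lim_{n \to \infty} \frac{1}{n+1} = 0 = \int_0^1 1_{\{1\}}(x)\,dx.
```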
29 April, 2018 at 9:32 am
Maxie Schmidt
We recently had a similar extra credit assignment in my graduate Analysis II course. The assignment was to create a “roadmap” of notions of (modes of) convergence relevant to the course. I talked with the professor after showing him this link as a reference, and he suggested that I post my solution roadmap as a comment here. Without going on too much about what’s in the pdf and why it’s useful to organize and compare these topics for understanding, I will just give the link to my comparison: [roadmap pdf linked here](http://people.math.gatech.edu/~mschmidt34/files/roadmap.pdf). I would appreciate any **constructive** feedback about this document.
12 September, 2019 at 2:22 pm
Jack Dippel
Dear Terry, A small comment about Exercise 8. I am not sure about the assertion that when the measure is σ-finite, any element of the σ-algebra is well-approximated by an element of the algebra. For example, consider the case that the algebra consists of the finite and co-finite subsets of a countably infinite set such as the natural numbers, and the measure is counting measure. Then the extension theorem simply gives counting measure on all subsets of that set; and if a set is neither finite nor co-finite, then any set in the algebra will have infinite symmetric difference with it.
[You are right; I have deleted this part of the exercise. -T]
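The counterexample in this comment can be checked explicitly. In the sketch below, the ambient set ${\bf N}$, the set $E$ of even numbers, and the symbol $\#$ for counting measure are choices made here for illustration:

```latex
% Sigma-finiteness: N is a countable union of sets of the algebra of finite counting measure.
{\bf N} = \bigcup_{n \in {\bf N}} \{n\}, \qquad \#(\{n\}) = 1 < \infty.

% Let E be the even numbers (neither finite nor co-finite), and F any set in the algebra.
F \text{ finite} \ \Longrightarrow\ E \,\Delta\, F \supset E \setminus F \text{ is infinite};
\qquad
F \text{ co-finite} \ \Longrightarrow\ E \,\Delta\, F \supset F \setminus E \text{ is infinite}.

% In either case
\#(E \,\Delta\, F) = +\infty,
```

so no element of the algebra approximates $E$ to within finite measure, even though the measure is σ-finite.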
14 September, 2019 at 6:49 pm
Nadroj Straw
This is a reply to Mr. Jack Dippel. I’m having trouble understanding this counter-example. Can you help me understand why the space is sigma finite?
31 December, 2020 at 8:33 am
Anonymous
We now recall the fundamental convergence theorems relating limits and integration: the first three are for non-negative functions, the last three are for absolutely integrable functions. They are ultimately derived from their namesakes in Exercise 5 and an approximation argument by simple functions, and the proofs are again omitted.
Is there a typo in the numbering? I don’t see how the convergence theorems follow from Exercise 5. (It is probably Exercise 6.)
[Corrected, thanks – T.]