In these notes we quickly review the basics of abstract measure theory and integration theory, which was covered in the previous course and will be relied upon throughout the current one. This is only a brief summary of the material; one should consult a real analysis text for the full details of the theory.

— Measurable spaces —

Ideally, measure theory on a space X should be able to assign a measure (or “volume”, or “mass”, etc.) to every set in X. Unfortunately, due to paradoxes such as the Banach-Tarski paradox, many natural notions of measure (e.g. Lebesgue measure) cannot be applied to measure all subsets of X; instead, one must restrict attention to certain *measurable* subsets of X. This turns out to suffice for most applications; for instance, just about any “non-pathological” subset of Euclidean space that one actually encounters will be Lebesgue measurable (as a general rule of thumb, any set which does not rely on the axiom of choice in its construction will be measurable).

To formalise this abstractly, we use

**Definition 1.** A *measurable space* $(X, {\mathcal X})$ is a set X, together with a collection ${\mathcal X}$ of subsets of X which form a $\sigma$-algebra, thus ${\mathcal X}$ contains the empty set and X, and is closed under countable intersections, countable unions, and complements. A subset of X is said to be *measurable* with respect to the measurable space if it lies in ${\mathcal X}$.

A function $f: X \to Y$ from one measurable space $(X, {\mathcal X})$ to another $(Y, {\mathcal Y})$ is said to be *measurable* if $f^{-1}(E) \in {\mathcal X}$ for all $E \in {\mathcal Y}$.

**Remark 1.** The class of measurable spaces forms a category, with the measurable functions being the morphisms. The symbol $\sigma$ stands for “countable union”; cf. $\sigma$-compact, $\sigma$-finite, $F_\sigma$ set.

**Remark 2.** The notion of a measurable space (and of a measurable function) is superficially similar to that of a topological space (and of a continuous function); the topology contains $\emptyset$ and X just as the $\sigma$-algebra does, but is now closed under arbitrary unions and finite intersections, rather than countable unions, countable intersections, and complements. The two categories are linked to each other by the Borel algebra construction, see Example 2 below.

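**Example 1.** We say that one $\sigma$-algebra ${\mathcal X}$ on a set X is *coarser* than another ${\mathcal X}'$ (or that ${\mathcal X}'$ is *finer* than ${\mathcal X}$) if ${\mathcal X} \subset {\mathcal X}'$ (or equivalently, if the identity map from $(X, {\mathcal X}')$ to $(X, {\mathcal X})$ is measurable); thus every set which is measurable in the coarse space is also measurable in the fine space. The coarsest $\sigma$-algebra on a set X is the *trivial* $\sigma$-algebra $\{\emptyset, X\}$, while the finest is the *discrete* $\sigma$-algebra $2^X := \{ E: E \subset X \}$.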

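**Example 2.** The intersection $\bigwedge_{\alpha \in A} {\mathcal X}_\alpha := \bigcap_{\alpha \in A} {\mathcal X}_\alpha$ of an arbitrary family $({\mathcal X}_\alpha)_{\alpha \in A}$ of $\sigma$-algebras on X is another $\sigma$-algebra on X. Because of this, given any collection ${\mathcal F}$ of sets on X we can define the $\sigma$-algebra ${\mathcal B}[{\mathcal F}]$ *generated* by ${\mathcal F}$, defined to be the intersection of all the $\sigma$-algebras containing ${\mathcal F}$, or equivalently the coarsest $\sigma$-algebra for which all sets in ${\mathcal F}$ are measurable. (This intersection is non-vacuous, since it will always involve the discrete $\sigma$-algebra $2^X$.) In particular, the open sets ${\mathcal F}$ of a topological space $(X, {\mathcal F})$ generate a $\sigma$-algebra, known as the *Borel $\sigma$-algebra* of that space.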

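We can also define the join $\bigvee_{\alpha \in A} {\mathcal X}_\alpha$ of any family $({\mathcal X}_\alpha)_{\alpha \in A}$ of $\sigma$-algebras on X by the formula

$\displaystyle \bigvee_{\alpha \in A} {\mathcal X}_\alpha := {\mathcal B}\Big[ \bigcup_{\alpha \in A} {\mathcal X}_\alpha \Big]$ (1)

that is, as the $\sigma$-algebra generated by the union of the ${\mathcal X}_\alpha$.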

For instance, the Lebesgue $\sigma$-algebra ${\mathcal L}$ of Lebesgue measurable sets on a Euclidean space ${\bf R}^n$ is the join of the Borel $\sigma$-algebra ${\mathcal B}$ and of the algebra of null sets and their complements (also called *co-null sets*).

**Exercise 1.** A function $f: X \to Y$ from one topological space to another is said to be *Borel measurable* if it is measurable once X and Y are equipped with their respective Borel $\sigma$-algebras. Show that every continuous function is Borel measurable. (The converse statement, of course, is very far from being true; for instance, the pointwise limit of a sequence of measurable functions, if it exists, is also measurable, whereas the analogous claim for continuous functions is completely false.)

**Remark 3.** A function $f: {\bf R}^n \to {\bf R}^m$ is said to be *Lebesgue measurable* if it is measurable from ${\bf R}^n$ (with the Lebesgue $\sigma$-algebra) to ${\bf R}^m$ (with the *Borel* $\sigma$-algebra), or equivalently if $f^{-1}(B)$ is Lebesgue measurable for every open ball B in ${\bf R}^m$. Note the asymmetry between Lebesgue and Borel here; in particular, the composition of two Lebesgue measurable functions need *not* be Lebesgue measurable.

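**Example 3.** Given a function $f: X \to Y$ from a set X to a measurable space $(Y, {\mathcal Y})$, we can define the pullback $f^{-1}({\mathcal Y})$ of ${\mathcal Y}$ to be the $\sigma$-algebra $f^{-1}({\mathcal Y}) := \{ f^{-1}(E): E \in {\mathcal Y} \}$; this is the coarsest structure on X that makes f measurable. For instance, the pullback of the Borel $\sigma$-algebra from [0,1] to $[0,1]^2$ under the map $(x,y) \mapsto x$ consists of all sets of the form $E \times [0,1]$, where $E \subset [0,1]$ is Borel-measurable.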

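More generally, given a family $(f_\alpha: X \to Y_\alpha)_{\alpha \in A}$ of functions into measurable spaces $(Y_\alpha, {\mathcal Y}_\alpha)$, we can form the $\sigma$-algebra $\bigvee_{\alpha \in A} f_\alpha^{-1}({\mathcal Y}_\alpha)$ *generated* by the $f_\alpha$; this is the coarsest structure on X that makes all the $f_\alpha$ simultaneously measurable.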

**Remark 4.** In probability theory and information theory, the functions $f_\alpha$ in Example 3 can be interpreted as observables, and the $\sigma$-algebra generated by these observables thus captures mathematically the concept of observable information. For instance, given a time parameter t, one might define the $\sigma$-algebra ${\mathcal F}_t$ generated by all observables for some random process (e.g. Brownian motion) that can be made at time t or earlier; this endows the underlying event space X with an uncountable increasing family of $\sigma$-algebras.

**Example 4.** If E is a subset of a measurable space $(Y, {\mathcal Y})$, the pullback of ${\mathcal Y}$ under the inclusion map $\iota: E \to Y$ is called the *restriction* of ${\mathcal Y}$ to E and is denoted ${\mathcal Y}\downharpoonright_E$. Thus, for instance, we can restrict the Borel and Lebesgue $\sigma$-algebras on a Euclidean space ${\bf R}^n$ to any subset of such a space.

**Exercise 2.** Let M be an n-dimensional manifold, and let $(\pi_\alpha: U_\alpha \to V_\alpha)$ be an atlas of coordinate charts for M, where $U_\alpha$ is an open cover of M and $V_\alpha$ are open subsets of ${\bf R}^n$. Show that the Borel $\sigma$-algebra on M is the unique $\sigma$-algebra whose restriction to each $U_\alpha$ is the pullback via $\pi_\alpha$ of the restriction of the Borel $\sigma$-algebra of ${\bf R}^n$ to $V_\alpha$.

**Example 5.** A function $f: X \to A$ into some index set A will partition X into level sets $f^{-1}(\{\alpha\})$ for $\alpha \in A$; conversely, every partition of X arises from at least one function f in this manner (one can just take f to be the map from points in X to the partition cell that that point lies in). Given such an f, we call the $\sigma$-algebra $f^{-1}(2^A)$ the $\sigma$-algebra *generated* by the partition; a set is measurable with respect to this structure if and only if it is the union of some sub-collection of cells of the partition.

**Exercise 3.** Show that a $\sigma$-algebra on a finite set X necessarily arises from a partition as in Example 5, and furthermore the partition is unique (up to relabeling). Thus in the finite world, $\sigma$-algebras are essentially the same concept as partitions.
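
As a small illustration of Example 5 and Exercise 3 (not a proof), the following Python sketch generates a $\sigma$-algebra on a finite set by brute force and reads off its cells. The function names `generate_sigma_algebra` and `atoms`, and the particular generating sets, are choices made for this illustration only.

```python
from itertools import combinations


def generate_sigma_algebra(X, family):
    """Smallest sigma-algebra on the finite set X containing every set in `family`.

    On a finite set, countable unions reduce to finite unions, so it is enough
    to close under complements and pairwise unions until nothing new appears.
    """
    X = frozenset(X)
    sets = {frozenset(), X} | {frozenset(E) for E in family}
    while True:
        new = {X - E for E in sets}
        new |= {E | F for E, F in combinations(sets, 2)}
        if new <= sets:
            return sets
        sets |= new


def atoms(sigma_algebra):
    """Minimal non-empty members; these are the cells of the partition."""
    nonempty = [E for E in sigma_algebra if E]
    return {E for E in nonempty if not any(F < E for F in nonempty)}


X = range(6)
algebra = generate_sigma_algebra(X, [{0, 1}, {1, 2, 3}])
cells = atoms(algebra)
print(sorted(sorted(c) for c in cells))
print(len(algebra))
```

Here the two generating sets $\{0,1\}$ and $\{1,2,3\}$ cut $\{0,\ldots,5\}$ into the four cells $\{0\}$, $\{1\}$, $\{2,3\}$, $\{4,5\}$, and the generated $\sigma$-algebra consists of the $2^4 = 16$ unions of these cells.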

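**Example 6.** Let $(X_\alpha, {\mathcal X}_\alpha)_{\alpha \in A}$ be a family of measurable spaces, then the Cartesian product $\prod_{\alpha \in A} X_\alpha$ has canonical projection maps $\pi_\beta: \prod_{\alpha \in A} X_\alpha \to X_\beta$ for each $\beta \in A$. The *product $\sigma$-algebra* $\prod_{\alpha \in A} {\mathcal X}_\alpha$ is defined as the $\sigma$-algebra on $\prod_{\alpha \in A} X_\alpha$ generated by the $\pi_\alpha$ as in Example 3.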

**Exercise 4.** Let $(X_\alpha)_{\alpha \in A}$ be an at most countable family of second countable topological spaces. Show that the Borel $\sigma$-algebra of the product space (with the product topology) is equal to the product of the Borel $\sigma$-algebras of the factor spaces. In particular, the Borel $\sigma$-algebra on ${\bf R}^n$ is the product of n copies of the Borel $\sigma$-algebra on ${\bf R}$. (The claim can fail when the countability hypotheses are dropped, though in most applications in analysis, these hypotheses are satisfied.) We caution however that the Lebesgue $\sigma$-algebra on ${\bf R}^n$ is *not* the product of n copies of the one-dimensional Lebesgue $\sigma$-algebra, as it contains some additional null sets; however, it is the completion of that product.

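**Exercise 5.** Let $(X, {\mathcal X})$ and $(Y, {\mathcal Y})$ be measurable spaces. Show that if E is measurable with respect to ${\mathcal X} \times {\mathcal Y}$, then for every $x \in X$, the set $\{ y \in Y: (x,y) \in E \}$ is measurable in ${\mathcal Y}$, and similarly for every $y \in Y$, the set $\{ x \in X: (x,y) \in E \}$ is measurable in ${\mathcal X}$. Thus, sections of Borel-measurable sets are again Borel-measurable. (The same is not true for Lebesgue-measurable sets.)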

— Measure spaces —

Now we endow measurable spaces with a measure, turning them into measure spaces.

**Definition 2.** (Measures) A (non-negative) *measure* $\mu$ on a measurable space $(X, {\mathcal X})$ is a function $\mu: {\mathcal X} \to [0,+\infty]$ such that $\mu(\emptyset) = 0$, and such that we have the countable additivity property $\mu( \bigcup_{n=1}^\infty E_n ) = \sum_{n=1}^\infty \mu(E_n)$ whenever $E_1, E_2, \ldots$ are disjoint measurable sets. We refer to the triplet $(X, {\mathcal X}, \mu)$ as a *measure space*.

A measure space $(X, {\mathcal X}, \mu)$ is *finite* if $\mu(X) < \infty$; it is a *probability space* if $\mu(X) = 1$ (and then we call $\mu$ a *probability measure*). It is *$\sigma$-finite* if X can be covered by countably many sets of finite measure.

A measurable set E is a *null set* if $\mu(E) = 0$. A property on points x in X is said to hold for *almost every* $x \in X$ (or *almost surely*, for probability spaces) if it holds outside of a null set. We abbreviate almost every and almost surely as a.e. and a.s. respectively. The complement of a null set is said to be a *co-null set* or to have *full measure*.

**Example 7.** (Dirac measures) Given any measurable space $(X, {\mathcal X})$ and a point $x \in X$, we can define the *Dirac measure* (or *Dirac mass*) $\delta_x$ to be the measure such that $\delta_x(E) = 1$ when $x \in E$ and $\delta_x(E) = 0$ otherwise. This is a probability measure.

**Example 8.** (Counting measure) Given any measurable space $(X, {\mathcal X})$, we define counting measure $\#$ by defining $\#(E)$ to be the cardinality |E| of E when E is finite, or $+\infty$ otherwise. This measure is finite when X is finite, and $\sigma$-finite when X is at most countable. If X is also finite, we can define *normalised counting measure* $\frac{1}{|X|} \#$; this is a probability measure, also known as *uniform probability measure* on X (especially if we give X the discrete $\sigma$-algebra).

**Example 9.** Any finite non-negative linear combination of measures is again a measure; any finite convex combination of probability measures is again a probability measure.

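**Example 10.** If $f: X \to Y$ is a measurable map from one measurable space $(X, {\mathcal X})$ to another $(Y, {\mathcal Y})$, and $\mu$ is a measure on ${\mathcal X}$, we can define the push-forward $f_* \mu: {\mathcal Y} \to [0,+\infty]$ by the formula $f_* \mu(E) := \mu(f^{-1}(E))$; this is a measure on $(Y, {\mathcal Y})$. Thus, for instance, $f_* \delta_x = \delta_{f(x)}$ for all $x \in X$.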
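
To make the push-forward formula concrete, here is a minimal Python sketch on finite sets with the discrete $\sigma$-algebra, where a measure is determined by its point masses. The helper names (`dirac`, `measure_of`, `push_forward`) and the map $x \mapsto x \bmod 3$ are assumptions made for this illustration.

```python
from collections import defaultdict
from fractions import Fraction


def dirac(x):
    """Dirac mass at x, stored as a dict of point masses."""
    return {x: Fraction(1)}


def measure_of(mu, E):
    """mu(E) for a measure given by point masses on a discrete space."""
    return sum((mass for point, mass in mu.items() if point in E), Fraction(0))


def push_forward(f, mu):
    """Point masses of f_* mu, using f_* mu(E) = mu(f^{-1}(E))."""
    nu = defaultdict(Fraction)
    for point, mass in mu.items():
        nu[f(point)] += mass
    return dict(nu)


def f(x):
    return x % 3


# Uniform probability measure on X = {0, ..., 5}.
mu = {x: Fraction(1, 6) for x in range(6)}
nu = push_forward(f, mu)

E = {0, 2}  # a subset of Y = {0, 1, 2}
preimage = {x for x in range(6) if f(x) in E}

print(measure_of(nu, E) == measure_of(mu, preimage))
print(push_forward(f, dirac(4)) == dirac(f(4)))
```

Both comparisons hold: the first is the defining formula applied to one set, and the second is the identity $f_* \delta_x = \delta_{f(x)}$ at the point $x = 4$.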

We record some basic properties of measures of sets:

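**Exercise 6.** Let $(X, {\mathcal X}, \mu)$ be a measure space.

1. (Monotonicity) If $E \subset F$ are measurable sets, then $\mu(E) \leq \mu(F)$. (In particular, any measurable subset of a null set is again a null set.)
2. (Countable subadditivity) If $E_1, E_2, \ldots$ are a countable sequence of measurable sets, then $\mu( \bigcup_{n=1}^\infty E_n ) \leq \sum_{n=1}^\infty \mu(E_n)$. (Of course, one also has subadditivity for finite sequences.) In particular, any countable union of null sets is again a null set.
3. (Monotone convergence for sets) If $E_1 \subset E_2 \subset \ldots$ are measurable, then $\mu( \bigcup_{n=1}^\infty E_n ) = \lim_{n \to \infty} \mu(E_n)$.
4. (Dominated convergence for sets) If $E_1 \supset E_2 \supset \ldots$ are measurable, and $\mu(E_1)$ is finite, then $\mu( \bigcap_{n=1}^\infty E_n ) = \lim_{n \to \infty} \mu(E_n)$. Show that the claim can fail if $\mu(E_1)$ is infinite.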

**Exercise 7.** A measure space is said to be *complete* if every subset of a null set is measurable (and is thus again a null set). Show that every measure space $(X, {\mathcal X}, \mu)$ has a unique minimal complete refinement $(X, \overline{{\mathcal X}}, \mu)$, known as the *completion* of $(X, {\mathcal X}, \mu)$, and that a set is measurable in $\overline{{\mathcal X}}$ if and only if it is equal almost everywhere to a measurable set in ${\mathcal X}$. (The completion of the Borel $\sigma$-algebra with respect to Lebesgue measure is known as the Lebesgue $\sigma$-algebra.)

A powerful way to construct measures on $\sigma$-algebras ${\mathcal X}$ is to first construct them on a smaller *Boolean algebra* ${\mathcal A}$ that generates ${\mathcal X}$, and then extend them via the following result:

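**Theorem 1.** (Carathéodory’s extension theorem, special case) Let $(X, {\mathcal X})$ be a measurable space, and let ${\mathcal A}$ be a Boolean algebra (i.e. closed under *finite* unions, intersections, and complements) that generates ${\mathcal X}$. Let $\mu: {\mathcal A} \to [0,+\infty]$ be a function such that

1. $\mu(\emptyset) = 0$;
2. If $A_1, A_2, \ldots \in {\mathcal A}$ are disjoint and $\bigcup_{n=1}^\infty A_n \in {\mathcal A}$, then $\mu( \bigcup_{n=1}^\infty A_n ) = \sum_{n=1}^\infty \mu(A_n)$.

Then $\mu$ can be extended to a measure $\mu: {\mathcal X} \to [0,+\infty]$ on ${\mathcal X}$, which we shall also call $\mu$.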

**Remark 5.** The conditions 1,2 in the above theorem are clearly necessary if $\mu$ has any hope to be extended to a measure on ${\mathcal X}$. Thus this theorem gives a necessary and sufficient condition for a function on a Boolean algebra to be extended to a measure. The extension can easily be shown to be unique when X is $\sigma$-finite.

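**Proof.** (sketch) Define the outer measure $\mu_*(E)$ of any set $E \subset X$ as the infimum of $\sum_{n=1}^\infty \mu(A_n)$, where $(A_n)_{n=1}^\infty$ ranges over all coverings of E by elements in ${\mathcal A}$. It is not hard to see that $\mu_*$ agrees with $\mu$ on ${\mathcal A}$, so it will suffice to show that it is a measure on ${\mathcal X}$.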

It is easy to check that $\mu_*$ is monotone and countably subadditive (as in parts 1,2 of Exercise 6) on all of $2^X$, and assigns zero to $\emptyset$; thus it is an outer measure in the abstract sense. But we need to show countable additivity on ${\mathcal X}$. The key is to first show the related property

$\displaystyle \mu_*(A) = \mu_*(A \cap E) + \mu_*(A \backslash E)$ (2)

for all $A \subset X$ and $E \in {\mathcal X}$. This can first be shown for $E \in {\mathcal A}$, and then one observes that the class of E that obey (2) for all A is a $\sigma$-algebra; we leave this as a (moderately lengthy) exercise.

The identity (2) already shows that $\mu_*$ is finitely additive on ${\mathcal X}$; combining this with countable subadditivity and monotonicity, we conclude that $\mu_*$ is countably additive, as required.
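
The following Python sketch illustrates the outer measure and the splitting property (2) in a toy finite setting; it is an illustration under simplifying assumptions, not part of the proof. The four-point space, the two cells and their masses are hypothetical choices, and the shortcut used in `outer_measure` (summing the masses of the cells that meet E) is only valid because the algebra here is generated by a finite partition.

```python
from fractions import Fraction
from itertools import chain, combinations

X = frozenset(range(4))
# Hypothetical pre-measure on the Boolean algebra generated by two cells.
cell_mass = {frozenset({0, 1}): Fraction(1), frozenset({2, 3}): Fraction(2)}


def outer_measure(E):
    """mu_*(E): total mass of the cells that meet E.

    In this finite setting the cheapest cover of E by algebra elements is the
    union of the cells meeting E, so the infimum is attained there.
    """
    return sum((m for cell, m in cell_mass.items() if cell & E), Fraction(0))


def subsets(S):
    """All subsets of the finite set S, as frozensets."""
    items = sorted(S)
    return (frozenset(c) for c in chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1)))


def obeys_splitting_property(E):
    """Check mu_*(A) = mu_*(A & E) + mu_*(A - E) for every subset A of X."""
    return all(outer_measure(A) == outer_measure(A & E) + outer_measure(A - E)
               for A in subsets(X))


good_sets = {E for E in subsets(X) if obeys_splitting_property(E)}
print(sorted(sorted(E) for E in good_sets))
```

In this example the sets obeying (2) are exactly the empty set, the two cells and X, that is, the algebra itself. A set such as $\{0\}$ fails already for $A = \{0,1\}$, since $\mu_*(\{0\}) + \mu_*(\{1\}) = 2$ while $\mu_*(\{0,1\}) = 1$.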

**Exercise 8.** Let the notation and hypotheses be as in Theorem 1. Show that given any $\varepsilon > 0$ and any set $E \in {\mathcal X}$ of finite measure, there exists a set $F \in {\mathcal A}$ which differs from E by a set of measure at most $\varepsilon$. If X is $\sigma$-finite, show that the hypothesis that E have finite measure can be removed. (Hint: first reduce to the case when X is finite, then show that the class of all E obeying this property is a $\sigma$-algebra.) Thus sets in the $\sigma$-algebra ${\mathcal X}$ “almost” lie in the algebra ${\mathcal A}$; this is an example of Littlewood’s first principle. The same statements of course apply for the completion $\overline{{\mathcal X}}$ of ${\mathcal X}$.

One can use Theorem 1 to construct Lebesgue measure on ${\bf R}$ and on ${\bf R}^n$ (taking ${\mathcal A}$ to be, say, the algebra generated by half-open intervals or boxes), although the verification of hypothesis 2 of Theorem 1 turns out to be somewhat delicate, even in the one-dimensional case. But one can at least get the higher-dimensional Lebesgue measure from the one-dimensional one by the product measure construction:

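**Exercise 9.** Let $(X_1, {\mathcal X}_1, \mu_1), \ldots, (X_n, {\mathcal X}_n, \mu_n)$ be a finite collection of measure spaces, and let $(X, {\mathcal X})$ be the product measurable space $(\prod_{i=1}^n X_i, \prod_{i=1}^n {\mathcal X}_i)$. Show that there exists a unique measure $\mu$ on this space such that $\mu( \prod_{i=1}^n A_i ) = \prod_{i=1}^n \mu_i(A_i)$ for all $A_i \in {\mathcal X}_i$. The measure $\mu$ is referred to as the *product measure* of the $\mu_1, \ldots, \mu_n$ and is denoted $\prod_{i=1}^n \mu_i$.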

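**Exercise 10.** Let E be a Lebesgue measurable subset of ${\bf R}^n$, and let m be Lebesgue measure. Establish the inner regularity property

$\displaystyle m(E) = \sup \{ m(K): K \subset E, K \hbox{ compact} \}$ (3)

and the outer regularity property

$\displaystyle m(E) = \inf \{ m(U): E \subset U, U \hbox{ open} \}$ (4).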

Combined with the fact that m is locally finite, this implies that m is a Radon measure.

— Integration —

Now we define integration on a measure space $(X, {\mathcal X}, \mu)$.

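**Definition 3.** (Integration) Let $(X, {\mathcal X}, \mu)$ be a measure space.

1. If $f: X \to [0,+\infty]$ is a non-negative *simple function* (i.e. a measurable function that only takes on finitely many values $a_1, \ldots, a_n$), we define the integral $\int_X f\ d\mu$ of f to be $\int_X f\ d\mu = \sum_{j=1}^n a_j \mu( f^{-1}(\{a_j\}) )$ (with the convention that $\infty \cdot 0 = 0$). In particular, if $f = 1_A$ is the indicator function of a measurable set A, then $\int_X 1_A\ d\mu = \mu(A)$.
2. If $f: X \to [0,+\infty]$ is a non-negative measurable function, we define the integral $\int_X f\ d\mu$ to be the supremum of $\int_X g\ d\mu$, where g ranges over all simple functions bounded between 0 and f.
3. If $f: X \to [-\infty,+\infty]$ is a measurable function, whose positive and negative parts $f_+ := \max(f,0)$, $f_- := \max(-f,0)$ have finite integral, we say that f is *absolutely integrable* and define $\int_X f\ d\mu := \int_X f_+\ d\mu - \int_X f_-\ d\mu$.
4. If $f: X \to {\bf C}$ is a measurable function with real and imaginary parts absolutely integrable, we say that f is *absolutely integrable* and define $\int_X f\ d\mu := \int_X \hbox{Re} f\ d\mu + i \int_X \hbox{Im} f\ d\mu$.

We will sometimes show the variable of integration, e.g. writing $\int_X f(x)\ d\mu(x)$ for $\int_X f\ d\mu$, for sake of clarity.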
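
As a sanity check on the first part of the definition, here is a minimal Python sketch of the integral of a non-negative simple function on a finite set, including the convention $\infty \cdot 0 = 0$. The function name `integrate_simple`, the four-point space and the masses are hypothetical and chosen only so that the convention is actually exercised.

```python
import math
from collections import defaultdict


def integrate_simple(f, X, mu):
    """Integral of a non-negative simple function f over a finite set X.

    `mu` maps each point to its mass, which may be math.inf. The integral is
    sum_j a_j * mu(f^{-1}({a_j})), with the convention infinity * 0 = 0.
    """
    level_mass = defaultdict(float)
    for x in X:
        level_mass[f(x)] += mu[x]  # mass of the level set f^{-1}({a})
    total = 0.0
    for a, mass in level_mass.items():
        if a == 0 or mass == 0:  # convention: infinity * 0 = 0
            continue
        total += a * mass
    return total


X = range(4)
mu = {0: 0.5, 1: 1.5, 2: 0.0, 3: math.inf}


def f(x):
    """Takes the value 2 on {0, 1}, infinity on a null point, 0 on an infinite-mass point."""
    return {0: 2.0, 1: 2.0, 2: math.inf, 3: 0.0}[x]


def indicator(x):
    """Indicator function of A = {0, 1}."""
    return 1.0 if x in (0, 1) else 0.0


value = integrate_simple(f, X, mu)
indicator_value = integrate_simple(indicator, X, mu)
print(value, indicator_value)
```

Without the convention, the products $\infty \cdot 0$ arising from the points 2 and 3 would be undefined; with it, the integral of f is $2 \cdot \mu(\{0,1\}) = 4$, and the integral of the indicator of $\{0,1\}$ is $\mu(\{0,1\}) = 2$.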

The following results are standard, and the proofs are omitted:

**Theorem 2.** (Standard facts about integration) Let $(X, {\mathcal X}, \mu)$ be a measure space.

1. All the above integration notions are compatible with each other; for instance, if f is both non-negative and absolutely integrable, then definitions 2 and 3 (and 4) agree.
2. The functional $f \mapsto \int_X f\ d\mu$ is linear over ${\bf R}^+$ for simple functions or non-negative functions, is linear over ${\bf R}$ for real-valued absolutely integrable functions, and linear over ${\bf C}$ for complex-valued absolutely integrable functions. In particular, the set of (real or complex) absolutely integrable functions on $(X, {\mathcal X}, \mu)$ is a (real or complex) vector space.
3. A complex-valued measurable function $f: X \to {\bf C}$ is absolutely integrable if and only if $\int_X |f|\ d\mu < \infty$, in which case we have the triangle inequality $|\int_X f\ d\mu| \leq \int_X |f|\ d\mu$. Of course, the same claim holds for real-valued measurable functions.
4. If $f: X \to [0,+\infty]$ is non-negative, then $\int_X f\ d\mu \geq 0$, with equality holding if and only if f = 0 a.e.
5. If one modifies an absolutely integrable function on a set of measure zero, then the new function is also absolutely integrable, and has the same integral as the original function. Similarly, two non-negative functions that agree a.e. have the same integral. (Because of this, we can meaningfully integrate functions that are only defined almost everywhere.)
6. If $f: X \to {\bf C}$ is absolutely integrable, then f is finite a.e., and vanishes outside of a $\sigma$-finite set.
7. If $f: X \to {\bf C}$ is absolutely integrable, and $\varepsilon > 0$, then there exists a complex-valued simple function $g: X \to {\bf C}$ such that $\int_X |f-g|\ d\mu \leq \varepsilon$. (This is a manifestation of Littlewood’s second principle.)
8. (Change of variables formula) If $\phi: X \to Y$ is a measurable map to another measurable space $(Y, {\mathcal Y})$, and $f: Y \to {\bf C}$, then we have $\int_X f \circ \phi\ d\mu = \int_Y f\ d\phi_* \mu$, in the sense that whenever one of the integrals is well defined, then the other is also, and equals the first.

It is also important to note that the Lebesgue integral on ${\bf R}^n$ extends the more classical Riemann integral. As a consequence, many properties of the Riemann integral (e.g. change of variables formula with respect to smooth diffeomorphisms) are inherited by the Lebesgue integral, thanks to various limiting arguments.

We now recall the fundamental convergence theorems relating limits and integration: the first three are for non-negative functions, the last three are for absolutely integrable functions. They are ultimately derived from their namesakes in Exercise 6 and an approximation argument by simple functions, and the proofs are again omitted. (They are also closely related to each other, and are in fact largely equivalent.)

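**Theorem 3.** (Convergence theorems) Let $(X, {\mathcal X}, \mu)$ be a measure space.

1. (Monotone convergence for sequences) If $0 \leq f_1 \leq f_2 \leq \ldots$ are measurable, then $\int_X \lim_{n \to \infty} f_n\ d\mu = \lim_{n \to \infty} \int_X f_n\ d\mu$.
2. (Monotone convergence for series) If $f_n: X \to [0,+\infty]$ are measurable, then $\int_X \sum_{n=1}^\infty f_n\ d\mu = \sum_{n=1}^\infty \int_X f_n\ d\mu$.
3. (Fatou’s lemma) If $f_n: X \to [0,+\infty]$ are measurable, then $\int_X \liminf_{n \to \infty} f_n\ d\mu \leq \liminf_{n \to \infty} \int_X f_n\ d\mu$.
4. (Dominated convergence for sequences) If $f_n: X \to {\bf C}$ are measurable functions converging pointwise a.e. to a limit f, and $|f_n| \leq g$ a.e. for some absolutely integrable $g: X \to [0,+\infty]$, then $\int_X \lim_{n \to \infty} f_n\ d\mu = \lim_{n \to \infty} \int_X f_n\ d\mu$.
5. (Dominated convergence for series) If $f_n: X \to {\bf C}$ are measurable functions with $\sum_{n=1}^\infty \int_X |f_n|\ d\mu < \infty$, then $\sum_{n=1}^\infty f_n(x)$ is absolutely convergent for a.e. x and $\int_X \sum_{n=1}^\infty f_n\ d\mu = \sum_{n=1}^\infty \int_X f_n\ d\mu$.
6. (Egorov’s theorem) If $f_n: X \to {\bf C}$ are measurable functions converging pointwise a.e. to a limit f on a subset A of X of finite measure, and $\varepsilon > 0$, then there exists a set of measure at most $\varepsilon$, outside of which $f_n$ converges uniformly to f in A. (This is a manifestation of Littlewood’s third principle.)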

**Remark 7.** As a rule of thumb, if one does not have exact or approximate monotonicity or domination (where “approximate” means “up to an error whose $L^1$ norm goes to zero”), then one should not expect the integral of a limit to equal the limit of the integral in general; there is just too much room for oscillation.
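
A standard example of this failure is the “moving bump” on the natural numbers with counting measure: $f_n$ is the indicator of the single point n. The Python sketch below (with hypothetical helper names `bump`, `integral` and `value`) checks, over a finite range, that every $f_n$ has integral 1 while $f_n(x)$ vanishes as soon as $n > x$.

```python
def bump(n):
    """f_n = indicator of {n} on the natural numbers, as a finite-support dict."""
    return {n: 1}


def integral(f):
    """Integral against counting measure: the sum of the values."""
    return sum(f.values())


def value(f, x):
    """f(x), with f equal to 0 off its support."""
    return f.get(x, 0)


N = 50
integrals = [integral(bump(n)) for n in range(N)]
# For each fixed x, f_n(x) = 0 as soon as n > x, so the pointwise limit is 0.
tail_values = [value(bump(n), x) for x in range(10) for n in range(x + 1, N)]

print(set(integrals), set(tail_values))
```

So the integral of the limit is 0 while the limit of the integrals is 1. The sequence is neither monotone nor dominated (any g with $|f_n| \leq g$ for all n is at least 1 at every point, hence not absolutely integrable), and Fatou’s lemma holds with strict inequality.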

**Exercise 11.** Let $f: X \to {\bf C}$ be an absolutely integrable function on a measure space $(X, {\mathcal X}, \mu)$. Show that f is *uniformly integrable*, in the sense that for every $\varepsilon > 0$ there exists $\delta > 0$ such that $\int_E |f|\ d\mu \leq \varepsilon$ whenever E is a measurable set of measure at most $\delta$. (The property of uniform integrability becomes more interesting, of course, when applied to a family of functions, rather than to a single function.)

With regard to product measures and integration, the fundamental theorem in this subject is

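**Theorem 4.** (Fubini-Tonelli theorem) Let $(X, {\mathcal X}, \mu)$ and $(Y, {\mathcal Y}, \nu)$ be $\sigma$-finite measure spaces, with product space $(X \times Y, {\mathcal X} \times {\mathcal Y}, \mu \times \nu)$.

1. (Tonelli theorem) If $f: X \times Y \to [0,+\infty]$ is measurable, then $\int_X (\int_Y f(x,y)\ d\nu(y))\ d\mu(x) = \int_Y (\int_X f(x,y)\ d\mu(x))\ d\nu(y) = \int_{X \times Y} f\ d(\mu \times \nu)$.
2. (Fubini theorem) If $f: X \times Y \to {\bf C}$ is absolutely integrable, then we also have $\int_X (\int_Y f(x,y)\ d\nu(y))\ d\mu(x) = \int_Y (\int_X f(x,y)\ d\mu(x))\ d\nu(y) = \int_{X \times Y} f\ d(\mu \times \nu)$, with the inner integrals being absolutely integrable a.e. and the outer integrals all being absolutely integrable.
3. If $(X, {\mathcal X}, \mu)$ and $(Y, {\mathcal Y}, \nu)$ are complete measure spaces, then the same claims hold with the product $\sigma$-algebra ${\mathcal X} \times {\mathcal Y}$ replaced by its completion.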
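
In the simplest possible case, where both spaces are finite sets with the discrete $\sigma$-algebra, the Tonelli identity reduces to interchanging two finite weighted sums. The following Python sketch checks this with exact arithmetic; the point masses and the function f are hypothetical values chosen for illustration.

```python
from fractions import Fraction

# Hypothetical finite measure spaces, given by point masses on X and Y.
mu = {"a": Fraction(1, 2), "b": Fraction(3)}
nu = {0: Fraction(2), 1: Fraction(1, 3), 2: Fraction(5)}


def f(x, y):
    """An arbitrary non-negative function on X x Y."""
    return Fraction(y + 1) if x == "a" else Fraction(y * y)


# Integrate in y first, then in x.
x_outer = sum(mu[x] * sum(nu[y] * f(x, y) for y in nu) for x in mu)
# Integrate in x first, then in y.
y_outer = sum(nu[y] * sum(mu[x] * f(x, y) for x in mu) for y in nu)
# Integrate against the product measure, whose point masses are mu[x] * nu[y].
product = sum(mu[x] * nu[y] * f(x, y) for x in mu for y in nu)

print(x_outer == y_outer == product)
```

The content of the theorem lies in the infinite setting, where measurability, $\sigma$-finiteness and (for signed or complex f) absolute integrability are what justify the interchange; the finite case only shows what the three expressions mean.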

**Remark 8.** The theorem fails for non-$\sigma$-finite spaces, but virtually every measure space actually encountered in “hard analysis” applications will be $\sigma$-finite. (One should be cautious, however, with any space constructed using ultrafilters or the first uncountable ordinal.) It is also important that f obey some measurability in the product space; there exist non-measurable f for which the iterated integrals exist (and may or may not be equal to each other, depending on the properties of f and even on which axioms of set theory one chooses), but the product integral (of course) does not.

[*Update*, Jan 20: statement of Egorov’s theorem corrected.]

## 45 comments

1 January, 2009 at 2:38 pm

Xinwei Yu

Dear Prof. Tao,

You said “any set which does not rely on the axiom of choice in its construction will be [Lebesgue] measurable”. Could you give me a hint on how to prove it or tell me where to find the proof? Thank you very much!

1 January, 2009 at 3:03 pm

Anonymous

Thanks again for a great post. A few typos:

1. Definition 1: Last sentence of the first paragraph, $\mathcal B$ should be $\mathcal X$?

2. Example 1: $/sigma$ should be $\sigma$.

3. Example 4: In the first sentence $(Y, \mathcal Y)$ is missing the right parenthesis.

4. Definition 3: In the second item, missing $\$ in int_g.

5. Right before Theorem 3: “They are derived from their namesakes in Exercise 5” should really be Exercise 6? Same for the second paragraph of “Proof” after Remark 5.

1 January, 2009 at 5:14 pm

Ian

Xinwei Yu, the paper you’re looking for is by Robert Solovay

http://en.wikipedia.org/wiki/Robert_M._Solovay

1 January, 2009 at 5:22 pm

Andy

Hi Xinwei,

Solovay proved that it’s consistent with the negation of the axiom of choice that all functions are measurable. See

http://en.wikipedia.org/wiki/Non-measurable_set

and Solovay’s wiki entry for a discussion & ref. (Almost surely a difficult result to reprove on one’s own…)

Prof. Tao–tiny typo, in the para before Ex. 1, “the join if the Borel” –> “the join of the Borel”

1 January, 2009 at 7:40 pm

Xinwei Yu

Thanks Andy and Ian!

1 January, 2009 at 7:42 pm

Terence Tao

Thanks for the corrections!

Regarding non-measurable sets, perhaps the more precise statement is that any set which can be constructed ~~using~~ without using the axiom of choice is measurable in at least one model of set theory (excluding the axiom of choice), though it need not be measurable in every model. But as a first approximation at least, “sets constructed without choice are measurable” is a good rule of thumb.

1 January, 2009 at 10:50 pm

Roger

Don’t you mean: any set which can be constructed WITHOUT using the axiom of choice?

2 January, 2009 at 6:32 am

Sune Kristian Jakobsen

Shouldn’t the equation in theorem 3 item 1 be:

2 January, 2009 at 7:05 am

James

Thanks for this post. It was a pleasure to read and to find to my surprise that I could remember most of it from when I learnt it many years ago (although I would fail miserably trying to construct any of the proofs these days as I’m long out of practice).

One thing I recall is that the first text I used (Ash: Real Analysis and Probability) derived the Caratheodory theorem and then proceeded to construct Lebesgue measure etc. much as you do. Later on I picked up Rudin’s Real and Complex Analysis and found that this approach was not used. Instead some form of the Riesz Representation Theorem was used to perform the construction using linear functionals on locally compact Hausdorff spaces (if I remember correctly…). I was a bit upset at the time since, having fought my way through many pages using the first approach, it now seemed I would need to climb the same mountain again by a different route. Anyway, how might you rate the pros and cons of these two approaches?

2 January, 2009 at 8:19 am

KKK

Hi Prof. Tao,

Is it possible to have a pdf (or latex) version of your notes? Thank you.

2 January, 2009 at 10:14 am

Terence Tao

Thanks for the corrections!

Dear James: I do plan to discuss the Riesz representation approach a bit later in this course. Both approaches are important, but arrive at the same goal (construction of the Lebesgue measure and Lebesgue integral) from opposite directions. Philosophically, it comes down to whether one views measure as the fundamental concept (and integration being defined in terms of measure, via simple functions etc.), or integration as the fundamental concept (with measure then being a special case of the integral, by specialising to indicator functions). The Caratheodory approach takes the former view, building measures first and only later defining the integral, while the Riesz approach does the opposite. The two approaches roughly correspond to two of the basic perspectives we have to view modern mathematics, namely the set-theoretic perspective (studying a space concretely, via its points and subsets) and the category-theoretic perspective (studying a space abstractly, via the functions or maps into or out of that space). Both of these perspectives are important (as are other perspectives, e.g. information-theoretic perspective, algorithmic perspective, model-theoretic perspective, etc.), and I feel it is important to have one’s foundational theoretical concepts supported by as many of these perspectives as possible.

The Caratheodory approach is more elementary – no integrals, function spaces, point set topology, etc. – but there is definitely a fair amount of combinatorial trickery involved (e.g. the verification that if one decomposes an interval into countably many subintervals, that the length of the big interval is the sum of the smaller ones is really rather non-trivial). The Riesz approach is slicker, and more natural in many ways, but does require much more theory (though this theory is definitely useful for many other things also). Also, in the Riesz approach, the hard part of building the integral still exists; but it’s just hidden in the construction of the *Riemann* integral (or, alternatively, in the construction of Haar measure).

From a pedagogical perspective, I prefer to do Caratheodory first, to get an initial construction of Lebesgue measure and integral, then do function spaces (using the L^p spaces provided by the Lebesgue integral as model examples), and then finally to do Riesz representation and close the circle. One can do things in a different order, but I find that without the L^p spaces, the function space material becomes a bit abstract, and not as easy to motivate.

Dear KKK: I plan to include these notes in my yearly book compilation of the blog, though the timing of these particular notes turns out to be maximally unfortunate in this regard. (I am currently in the process of compiling the 2008 posts and should have a draft PDF for those shortly.) If you “print preview” the post, though, then you can get a slightly cleaner version of the file, which can be converted to PDF.

2 January, 2009 at 10:27 am

Terence Tao

p.s. Incidentally, one of the quickest ways I know of to construct Lebesgue measure on, say, [0,1], is to first construct it on the infinite discrete cube (which, being totally disconnected, allows for some simplifications in either the Caratheodory or Riesz approaches) and then use the binary expansion to identify this with [0,1] (modulo a countable set of terminating binary decimals, which can be easily dealt with.) But from a pedagogical perspective this is not a particularly good way to see Lebesgue measure for the first time; if you will pardon the pun, it is totally disconnected from the geometric intuition one already has for the concepts of measure and integration.

2 January, 2009 at 10:56 am

Sune Kristian Jakobsen

“any set which can be constructed without using the axiom of choice is measurable in at least one model of set theory (excluding the axiom of choice)”

I don’t know if I understand this statement correct, but you seem to suggest that the model to choose depends on the set? If I understand the title of Solovay’s paper “A model of set-theory in which every set of reals is Lebesgue measurable” correctly: There is at least one model of set theory (excluding the axiom of choice) in which any set that can be constructed without using the axiom of choice is measurable?

3 January, 2009 at 8:03 am

Eric

The trick of constructing Lebesgue measure by first using the totally disconnected Cantor set can also be used in proving the Riesz representation theorem (at least for finite measures). First, you construct a measure from a functional on any totally disconnected compact Hausdorff space, which is significantly simpler than the general case (in particular, if you have already done the Caratheodory approach, verifying the countable additivity hypothesis to apply the Caratheodory extension theorem is trivial). Since free compact Hausdorff spaces (in the categorical sense) are totally disconnected, every compact Hausdorff space X is a quotient of a totally disconnected compact Hausdorff space Y. By Hahn-Banach, a functional on C(X) extends to C(Y), which then gives a measure on Y which can be pushed forward to X.

3 January, 2009 at 2:51 pm

James

Dear Terence,

I had never considered the contrast between the Caratheodory and Riesz approaches as an example of the two mathematical points of view that you have talked about (“problem solving and model building”- maybe that was Timothy Gowers…)

I agree that I think the Caratheodory approach is better from a pedagogical point of view – more “hands on” and leaves you with a feeling that you have constructed something “from the ground up” rather than by some clever sleight of hand.

However, how would you place the Daniell integral, or the Henstock–Kurzweil one – in so far as they might be useful for a first calculus course for people who might not go on to higher level maths?

3 January, 2009 at 4:15 pm

Terence Tao

Dear Sune: Yes, you are right, Solovay’s result even allows one to take the same model to make every set constructible without choice measurable. (Though this does raise a question that I do not know the answer to: is there a set that can be constructed without AC which is not *necessarily* measurable, i.e. it is non-measurable in at least one model of set theory? I suppose one first has to formalise what “constructed without AC” means more precisely to answer this.)

Dear Eric: Thanks for the comments! Another way to use the totally disconnected trick is to then invoke the Kolmogorov extension theorem rather than the Caratheodory one, though this is only a minor variation of the usual Caratheodory approach in the end.

Dear James: The set-theoretic vs. category-theoretic distinction (or in this context, the measure-theoretic vs. functional distinction) is perhaps slightly different from the problem solving vs. theory building distinction that Gowers writes about, though I can see that there might be some correlation between the two in practice.

I was not too familiar with the two integrals you mentioned, so I looked them up. The Daniell integral seems like a variant of the Riesz approach, using elementary functions instead of continuous ones to build the initial linear functional that will become the integral. Meanwhile, the Henstock-Kurzweil integral seems to be a variant of the Riemann integral and thus coming from the geometric perspective rather than the measure-theoretic or functional perspectives.

For undergraduate mathematics purposes, I think the Riemann integral is largely adequate, especially outside of analysis, which is the main place in mathematics where one has to deal with singularities and other pathologies. While the Riemann integral does not enjoy the convergence theorems (Theorem 3) that make the Lebesgue integral so convenient for analysis, one at least still has good behaviour under the uniform convergence topology (which is good enough for many applications outside of analysis). Furthermore, the Riemann integral ties in closely with one’s intuitive geometric notions of area, etc., and reinforces the close analogy between summation and integration (and more generally, between the discrete and the continuous) that helps build a lot of algebraic intuition for integration also.

There is unlikely to be any one “best” notion of integration that covers all cases and applications. For instance, the Lebesgue integral (or any of the other integrals mentioned above) cannot directly handle principal value integrals, or integrals obtained via zeta function regularisation, even though both of these types of integration are of significant importance in certain subfields of mathematics. Like many other broad and fundamental mathematical concepts (e.g. “number”, “space”, “limit”, “distance”, “size”, “similar”, etc.), it seems best to view integration as a loosely related family of mathematical operations, rather than to try to force all of them artificially into a single unifying formal or axiomatic framework.

The Lebesgue approach does have a philosophy which seems particularly compatible with analysis and related disciplines, namely that it treats “small” sets as “negligible” (thus, for instance, identifying two functions if they agree outside of a set of measure zero). This fits in well with the general tendency in analysis to routinely discard various small errors. Littlewood’s principles tell us that up to such small errors, Lebesgue-measurable sets are essentially elementary, Lebesgue-measurable functions are essentially continuous, and pointwise convergence is essentially uniform, so the Lebesgue theory is basically nothing more than the completion of the classical Riemann theory, and thus very natural in any situation in which one is willing to ignore what is going on in a set of small measure.

3 January, 2009 at 5:52 pm

wangtwo

Dear Prof. Tao,

I think in example 3 that you should use but not .

In the following sentence

“we can form the $\sigma$-algebra generated by the ;”

[Corrected, thanks – T.]

4 January, 2009 at 7:21 am

liuxiaochuan

Dear Professor Tao:

When I meet a problem in measure theory which asks for some property of a $\sigma$-algebra, I find that I can always just assume that the family of sets which satisfy this property is (say) and then try to prove that it is indeed a $\sigma$-algebra. This method is so strong that even in this post it can be applied several times. In the end, though I can finish the proof, I feel like the conclusion just comes “out of nowhere”.

For an arbitrarily chosen set in a $\sigma$-algebra, I sometimes have the impulse to get it “by hand”: I want to obtain this set through several countable unions or countable intersections, which is obviously impossible. Every time I am in this situation, I find that the above method can be used.

I am wondering whether there is some deeper idea behind this phenomenon.

4 January, 2009 at 9:41 am

Terence Tao

Dear Liuxiaochuan,

One can indeed replicate the argument you describe (i.e. showing that the collection of sets that obey a certain property forms a $\sigma$-algebra, and also contains some generating set, and thus must contain the $\sigma$-algebra generated by that set) by repeatedly taking countable unions and intersections of the generators instead. The catch is that “repeatedly” means “iterated up to the first uncountable ordinal” in the case of the Borel algebra (see http://en.wikipedia.org/wiki/Borel_algebra ), and is more complicated still in the case of general $\sigma$-algebras (involving “trees” of at most countable depth, if I recall correctly). So the relationship between a generating set and the algebra that it generates is significantly more complicated for $\sigma$-algebras than it is for other algebraic structures (e.g. rings, groups, vector spaces, Boolean algebras, topologies, etc.), where the description of the generated structure tends to be much more concrete and explicit.

Basically, what is going on here is that one is using powerful axioms of set theory (indeed, one is literally relying on the power set axiom, in order to consider the collection of all $\sigma$-algebras containing a certain base set) to bypass the tedious task of actually working out constructively what sets can be generated from some base set of generators, and jumping straight to the existence of the $\sigma$-algebra generated by that set, without getting much information as to precisely what that $\sigma$-algebra consists of. As a consequence, the only real way to establish properties of this $\sigma$-algebra is to exploit its definition as the minimal $\sigma$-algebra containing the base set.
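On a finite set the “by hand” generation does terminate after finitely many rounds, which gives a toy model of the iteration discussed above. The following is only a sketch under that finiteness assumption (where countable unions reduce to finite ones); the function name `generated_sigma_algebra` and the sample generators are hypothetical and chosen for illustration.

```python
from itertools import combinations


def generated_sigma_algebra(universe, generators):
    """Close `generators` under complements and pairwise unions in a FINITE universe.

    Returns the resulting family of frozensets together with the number of
    closure rounds that actually added new sets.
    """
    universe = frozenset(universe)
    family = {frozenset(), universe} | {frozenset(g) for g in generators}
    rounds = 0
    while True:
        # One round: add all complements and all pairwise unions of current sets.
        new_sets = {universe - a for a in family}
        new_sets |= {a | b for a, b in combinations(family, 2)}
        if new_sets <= family:
            # Nothing new appeared, so the family is closed under both operations.
            return family, rounds
        family |= new_sets
        rounds += 1


# Hypothetical example: two overlapping generators in a six-point set.
sigma, rounds = generated_sigma_algebra(range(6), [{0, 1}, {1, 2, 3}])
print(len(sigma), rounds)
```

Intersections never need to be taken explicitly, since they arise from complements and unions. For the Borel algebra the analogous loop would have to run through all countable ordinals, which is why the minimality definition is used in practice.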

There is a certain amount of reverse mathematics analysis in the literature that shows that some reliance on strong axioms of set theory (axiom of choice being of course the most famous one, but there are others) is necessary to establish some of the basic results in measure theory, which may give some formal limitations on what one can do by purely “constructive” methods here, but I am not really an expert on these things.

Of course, the Borel measurable sets that one actually encounters in concrete applications tend to be fairly low on the Borel hierarchy, e.g. they are often just $F_\sigma$ or $G_\delta$ sets at worst, and similarly for Lebesgue measurable sets, etc. So for many specific applications, one can probably replace the general theory with some more concrete but ad hoc substitute, in which everything is done “by hand” and one only takes countable unions and intersections a fairly limited number of times. This would probably make one’s arguments significantly longer and messier, though, and also harder to generalise.

4 January, 2009 at 12:13 pm

Phil

Hi Prof. Tao,

In Exercise 8, I think you meant to put “element of” rather than “subset of” in two instances.

Exercise 9 only seems correct when all the measures involved are probability measures.

You commented on Theorem 3 that the convergence theorems for integrals of functions were largely equivalent to the corresponding ones for sets. On the other hand, on face value, they seem to be “stronger” statements since they reduce to Exercise 5 when we apply them to characteristic functions of sets. I think this is a relevant case study for the recent discussion on Gowers’ Weblog regarding how one equivalent statement can be “stronger” than another.

From the way integration theory is developed here, they seem stronger, but consider the following alternative approach to integration of functions (which maybe you had in mind when you said they were equivalent). First construct the Lebesgue measure on R, and develop the product measure without reference to integration of functions. Define the integral of a positive function as, literally, “the area under the curve” in the product measure sense, then proceed as one usually would. From this standpoint, the linearity of the integral becomes a consequence of the translation invariance of the Lebesgue measure, and the convergence theorems for functions become special cases of the convergence theorems for sets.

I think this approach is geometrically intuitive and gives insight to the “equivalences” discussed, but probably it’s a bit too gross to practice because the constructions of Lebesgue measure and the product measure without appeal to integration can be tough and it also doesn’t generalize to vector valued integration.
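The “area under the curve” definition described above can be written out as follows. This is a sketch only: the measure space $(X, \mathcal{B}, \mu)$, Lebesgue measure $m$ on $[0,\infty)$ and the notation $\Gamma_f$ are assumptions made for illustration, and $\mu$ is taken to be $\sigma$-finite so that the product measure $\mu \times m$ is unambiguous.

```latex
% Integral of a measurable f : X -> [0,\infty] as the measure of its subgraph
\int_X f \, d\mu := (\mu \times m)(\Gamma_f),
\qquad
\Gamma_f := \{ (x,t) \in X \times [0,\infty) : 0 \le t < f(x) \}.

% Monotone convergence for functions from monotone convergence for sets:
% if 0 \le f_1 \le f_2 \le \dots and f_n \to f pointwise, then
\Gamma_{f_1} \subset \Gamma_{f_2} \subset \dots,
\qquad
\bigcup_{n=1}^\infty \Gamma_{f_n} = \Gamma_f,
\qquad\text{hence}\qquad
\int_X f_n \, d\mu = (\mu \times m)(\Gamma_{f_n}) \uparrow (\mu \times m)(\Gamma_f) = \int_X f \, d\mu .
```

The strict inequality $t < f(x)$ in the definition of $\Gamma_f$ is what makes the union of the subgraphs equal to $\Gamma_f$ exactly.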

4 January, 2009 at 3:31 pm

Terence Tao

Dear Phil: Thanks for the corrections!

It is true that by interpreting a (non-negative) integral as the measure of a set in a higher-dimensional space, one can equate the integral convergence theorems with their measure counterparts, but I had a slightly different correspondence in mind, namely if one specialises a convergence theorem to the case of simple functions taking a fixed finite set of values, then the integral theorem quickly collapses to just the set version. The integral convergence theorems can then be recovered from this special case by a discretisation argument. (One has to be a little careful to avoid using the very same convergence theorem one is trying to prove in order to justify the limiting procedure, of course.)

6 January, 2009 at 9:33 am

Joe Shipman

You say in your last sentence that iterated integrals may or may not be equal to each other. In my thesis (Cardinal Conditions for Strong Fubini Theorems, October 1990 TAMS) I showed that for non-negative functions in n dimensions, it is consistent with ZFC (and a consequence of a real-valued measurable cardinal) that iterated integrals are always equal whenever they exist (consistency previously shown by H. Friedman for n=2; Sierpinski’s famous example where iterated integrals of non-negative functions differ depends on the continuum hypothesis). The non-negativity assumption is necessary, as otherwise there are simple counterexamples, for example the function on $[0,\infty)^2$ that is 0 for y>x, 1 for x>y>(x-1), -1 for (x-1)>y>(x-2), and 0 for (x-2)>y.

— Joe Shipman
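For the signed example described in this comment, the two iterated integrals can be computed directly. The following is a worked sketch, with the function named $f$ for convenience:

```latex
f(x,y) =
\begin{cases}
 1 & \text{if } x-1 < y < x, \\
 -1 & \text{if } x-2 < y < x-1, \\
 0 & \text{otherwise,}
\end{cases}
\qquad (x,y) \in [0,\infty)^2 .

% Integrating in x first: for each fixed y \ge 0, f = 1 on (y, y+1) and f = -1 on (y+1, y+2)
\int_0^\infty f(x,y)\,dx = 1 - 1 = 0
\quad\Longrightarrow\quad
\int_0^\infty \int_0^\infty f(x,y)\,dx\,dy = 0 .

% Integrating in y first: the strips are cut off by the boundary y = 0 when x < 2
\int_0^\infty f(x,y)\,dy =
\begin{cases}
 x & \text{if } 0 \le x \le 1, \\
 2 - x & \text{if } 1 \le x \le 2, \\
 0 & \text{if } x \ge 2,
\end{cases}
\quad\Longrightarrow\quad
\int_0^\infty \int_0^\infty f(x,y)\,dy\,dx = \tfrac12 + \tfrac12 = 1 .
```

Both iterated integrals exist but differ. This is consistent with Fubini's theorem, since $|f|$ equals 1 on a strip of infinite area and so $f$ is not absolutely integrable on the quadrant.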

6 January, 2009 at 10:59 am

Terence Tao

Thanks for the clarification!

7 January, 2009 at 5:03 am

Mohamed

Thanks.

I would like to ask whether you have some simple notes on differential geometry and gauge theory, since I have found this theory very hard to understand.

regards

Mohamed

7 January, 2009 at 9:17 am

Terence Tao

You might try these posts of mine:

https://terrytao.wordpress.com/2008/09/27/what-is-a-gauge/

https://terrytao.wordpress.com/2008/03/12/pcm-article-ricci-flow/

https://terrytao.wordpress.com/2008/03/26/285g-lecture-0-riemannian-manifolds-and-curvature/

20 January, 2009 at 8:57 am

PDEbeginner

Dear Prof. Tao,

Thanks a lot for your very nice lecture notes!

There seems to be a problem in your statement of Egorov’s theorem. I think we need to restrict the theorem to some -finite set . Otherwise, there is a counterexample: Let , and . Clearly, pointwise, but we cannot find any small set to make uniformly on .

20 January, 2009 at 10:38 am

Terence Tao

Oops, you’re right; it’s corrected now.
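A standard example of how Egorov's theorem can fail on a space of infinite measure is the “moving bump”. It is offered here only as an illustration of the kind of counterexample under discussion, and is not necessarily the one intended in the comment above; the choice of $X = \mathbb{R}$ with Lebesgue measure $m$ is an assumption.

```latex
f_n := 1_{[n,n+1]}, \qquad f := 0 \qquad \text{on } X = \mathbb{R} \text{ with Lebesgue measure } m .

% Pointwise convergence: each fixed x lies in [n,n+1] for at most two values of n
f_n(x) \to 0 \quad \text{for every } x \in \mathbb{R} .

% No uniform convergence outside a small set: if m(E) < 1 then m([n,n+1] \setminus E) > 0, so
\sup_{x \in \mathbb{R} \setminus E} |f_n(x) - f(x)| = 1 \quad \text{for every } n .
```

When $\mu(X) < \infty$ this obstruction disappears, and Egorov's theorem holds.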

16 February, 2009 at 7:28 am

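Real Analysis 0-10 « Liu Xiaochuan’s Weblog

[…] Section zero is a review of measure theory, consisting mostly of known results. Littlewood's three principles for learning real analysis are quite interesting; this is the first time I have heard of them, although I had already sensed these ideas while studying. Measure theory has many standard methods of proof, and these methods are often related to choice. In the comments I asked Professor Tao a question that had been bothering me for some time: some problems in measure theory ask one to prove that a σ-algebra has a certain property, and there is a very powerful method for this, namely to define directly the class of sets satisfying the property and then prove that it is indeed a σ-algebra. When using this method, one often feels that the conclusion comes out of nowhere, and the proof is not illuminating. Professor Tao replied at some length. For the problem above, the natural approach would be to take an arbitrary element of the σ-algebra and prove directly that it satisfies the desired property. The trouble is that it is hard to give a precise description of this arbitrary element. In fact, one would have to perform operations such as “countable intersection” and “countable union” uncountably many times. This in itself is really a manifestation of the power of the axiom of choice. For even if one used the approach above to construct a “clearer” proof, the process would certainly be very complicated and ugly. […]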

3 March, 2009 at 12:40 pm

Anonymous

Dear Professor Tao,

What am I missing in Exercise 8? It seems that the example you give in Remark 1 under your “alternate approach” shows the necessity of X -finite. In this example, the algebra is the collection of finite unions of sets , , which generates the discrete $\sigma$-algebra on , and every non-empty set in the algebra has measure . If we extend to the counting measure, then the assertion in Exercise 8 fails for every and with .

Jim Lewis

3 March, 2009 at 2:34 pm

Terence Tao

Hmm, you’re right. I think I thought I could restrict the algebra to E so that I can work entirely in a finite measure setting, but of course this doesn’t work since E doesn’t lie in . I’ve adjusted the exercise accordingly.

14 June, 2009 at 10:22 am

student

Dear Prof. Tao,

Is the following claim right?

“Any measure on [0,1] is the limit of finite linear combinations of Dirac delta unit masses at points of [0,1].”

thanks

14 June, 2009 at 10:28 am

Terence Tao

Yes, this is correct, as long as one uses the vague topology to define the concept of a limit of a sequence of measures. The claim is closely related to the fact that the Riemann-Stieltjes integral is compatible with the Lebesgue-Stieltjes integral. For instance, the vague convergence of to Lebesgue measure on [0,1] follows from the compatibility of the Riemann and Lebesgue integrals when integrating continuous functions.
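The link with Riemann sums can be checked numerically. The sketch below assumes, purely for illustration, the equally weighted measures $\mu_n = \frac{1}{n} \sum_{j=1}^n \delta_{j/n}$ and one hypothetical continuous test function; integrating against $\mu_n$ is then a right-endpoint Riemann sum, and vague convergence says these sums approach the Lebesgue integral for every continuous test function.

```python
import math


def integrate_against_mu_n(f, n):
    """Integral of f against mu_n = (1/n) * sum_{j=1..n} delta_{j/n} (assumed form)."""
    return sum(f(j / n) for j in range(1, n + 1)) / n


def f(x):
    # Hypothetical continuous test function on [0, 1].
    return math.cos(3 * x)


# Exact Lebesgue (equivalently Riemann) integral of cos(3x) over [0, 1].
exact = math.sin(3) / 3

errors = [abs(integrate_against_mu_n(f, n) - exact) for n in (10, 100, 1000)]
print(errors)
```

A single test function only illustrates the convergence; vague convergence requires it for every continuous test function, which is what the compatibility of the Riemann and Lebesgue integrals provides.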

14 June, 2009 at 10:34 am

student

Dear Prof. Tao,

I am asking this question to clarify my understanding.

Is the following right?

For any right-continuous, nondecreasing function F, we can define an outer measure on the power set of R; then the measurable sets form a sigma algebra which contains the Borel sigma algebra, and the restriction of the outer measure to this sigma algebra is a complete measure, and it is inner and outer regular…

thanks

14 June, 2009 at 10:39 am

student

Wow! Prof. Tao, you are so fast… thanks a lot

2 August, 2009 at 2:32 am

Mark Reid

Dear Professor Tao,

When I studied integration theory as an undergraduate it was in the Carathéodory style you mention (measures first, integration later). Some of the work I am doing now in statistical learning theory involves a fair bit of functional analysis, and expectations are a central object of study. The Riesz approach to integration theory you mention in your comment above sounds like an approach I would like to understand better.

I was wondering if you could recommend any good books (or papers) on the Riesz approach to integration theory, particularly probability theory.

Thanks in advance,

Mark.

2 January, 2010 at 4:31 am

Anonymous

“In these notes we quickly review the basics of abstract measure theory and integration theory, which was covered in the previous course..”

Which previous course is being referred to here, and is the referred course available online? If not, what is the recommended book for the referred course?

Thanks

2 January, 2010 at 11:31 am

Terence Tao

245A, taught by my colleague, Jim Ralston. The textbook was Folland’s “Real Analysis”.

25 May, 2010 at 12:51 am

abdussalam mustapha

Dear Prof.,

I am a student newly admitted into a Master’s programme in mathematics, but I have a poor background in some of the basics, especially in measure theory and integration. Could you please help me with materials that will help me grasp the main ideas of measure theory? Thank you.

Mustapha Abdussalam

Usmanu Danfodiyo University, Sokoto,

Nigeria.
