In the previous notes, we defined the Lebesgue measure {m(E)} of a Lebesgue measurable set {E \subset {\bf R}^d}, and set out the basic properties of this measure. In this set of notes, we use Lebesgue measure to define the Lebesgue integral

\displaystyle \int_{{\bf R}^d} f(x)\ dx

of functions {f: {\bf R}^d \rightarrow {\bf C} \cup \{\infty\}}. Just as not every set can be measured by Lebesgue measure, not every function can be integrated by the Lebesgue integral; the function will need to be Lebesgue measurable. Furthermore, the function will either need to be unsigned (taking values in {[0,+\infty]}), or absolutely integrable.

To motivate the Lebesgue integral, let us first briefly review two simpler integration concepts. The first is that of an infinite summation

\displaystyle \sum_{n=1}^\infty c_n

of a sequence of numbers {c_n}, which can be viewed as a discrete analogue of the Lebesgue integral. Actually, there are two overlapping, but different, notions of summation that we wish to recall here. The first is that of the unsigned infinite sum, when the {c_n} lie in the extended non-negative real axis {[0,+\infty]}. In this case, the infinite sum can be defined as the limit of the partial sums

\displaystyle \sum_{n=1}^\infty c_n = \lim_{N \rightarrow \infty} \sum_{n=1}^N c_n \ \ \ \ \ (1)

 

or equivalently as a supremum of arbitrary finite partial sums:

\displaystyle \sum_{n=1}^\infty c_n = \sup_{A \subset {\bf N}, A \hbox{ finite}} \sum_{n \in A} c_n. \ \ \ \ \ (2)

 

The unsigned infinite sum {\sum_{n=1}^\infty c_n} always exists, but its value may be infinite, even when each term is individually finite (consider e.g. {\sum_{n=1}^\infty 1}).

The second notion of a summation is the absolutely summable infinite sum, in which the {c_n} lie in the complex plane {{\bf C}} and obey the absolute summability condition

\displaystyle \sum_{n=1}^\infty |c_n| < \infty,

where the left-hand side is of course an unsigned infinite sum. When this occurs, one can show that the partial sums {\sum_{n=1}^N c_n} converge to a limit, and we can then define the infinite sum by the same formula (1) as in the unsigned case, though now the sum takes values in {{\bf C}} rather than {[0,+\infty]}. The absolute summability condition confers a number of useful properties that are not obeyed by sums that are merely conditionally convergent; most notably, the value of an absolutely convergent sum is unchanged if one rearranges the terms in the series in an arbitrary fashion. Note also that the absolutely summable infinite sums can be defined in terms of the unsigned infinite sums by taking advantage of the formulae

\displaystyle \sum_{n=1}^\infty c_n = (\sum_{n=1}^\infty \hbox{Re}(c_n)) + i (\sum_{n=1}^\infty \hbox{Im}(c_n))

for complex absolutely summable {c_n}, and

\displaystyle \sum_{n=1}^\infty c_n = \sum_{n=1}^\infty c_n^+ - \sum_{n=1}^\infty c_n^-

for real absolutely summable {c_n}, where {c_n^+ := \max(c_n,0)} and {c_n^- := \max(-c_n,0)} are the (magnitudes of the) positive and negative parts of {c_n}.
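As a quick sanity check of the last formula, the following sketch splits a real sequence into its positive and negative parts and compares finite partial sums exactly. The sample sequence {c_n = (-1)^{n+1}/n^2} and the helper names are our own choices for illustration; only finite truncations are computed, so this illustrates the identity rather than proving it.

```python
from fractions import Fraction

def positive_part(c):
    """c^+ := max(c, 0)."""
    return max(c, Fraction(0))

def negative_part(c):
    """c^- := max(-c, 0)."""
    return max(-c, Fraction(0))

def partial_sums(N):
    """Partial sums up to N of c_n, c_n^+ and c_n^- for c_n = (-1)^(n+1) / n^2."""
    terms = [Fraction((-1) ** (n + 1), n * n) for n in range(1, N + 1)]
    total = sum(terms, Fraction(0))
    plus = sum((positive_part(c) for c in terms), Fraction(0))
    minus = sum((negative_part(c) for c in terms), Fraction(0))
    return total, plus, minus

total, plus, minus = partial_sums(50)
print(total == plus - minus)
```

Both `plus` and `minus` are unsigned partial sums, so the signed sum is recovered from two unsigned ones, which is the pattern we will imitate for integrals.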

In an analogous spirit, we will first define an unsigned Lebesgue integral {\int_{{\bf R}^d} f(x)\ dx} of (measurable) unsigned functions {f: {\bf R}^d \rightarrow [0,+\infty]}, and then use that to define the absolutely convergent Lebesgue integral {\int_{{\bf R}^d} f(x)\ dx} of absolutely integrable functions {f: {\bf R}^d \rightarrow {\bf C} \cup \{\infty\}}. (In contrast to absolutely summable series, which cannot have any infinite terms, absolutely integrable functions will be allowed to occasionally become infinite. However, as we will see, this can only happen on a set of Lebesgue measure zero.)

To define the unsigned Lebesgue integral, we now turn to another more basic notion of integration, namely the Riemann integral {\int_a^b f(x)\ dx} of a Riemann integrable function {f: [a,b] \rightarrow {\bf R}}. Recall from the prologue that this integral is equal to the lower Darboux integral

\displaystyle \int_a^b f(x)\ dx = \underline{\int_a^b} f(x)\ dx := \sup_{g \leq f; g \hbox{ piecewise constant}} \hbox{p.c.} \int_a^b g(x)\ dx.

(It is also equal to the upper Darboux integral; but much as the theory of Lebesgue measure is easiest to define by relying solely on outer measure and not on inner measure, the theory of the unsigned Lebesgue integral is easiest to define by relying solely on lower integrals rather than upper ones; the upper integral is somewhat problematic when dealing with “improper” integrals of functions that are unbounded or are supported on sets of infinite measure.) Compare this formula also with (2). The integral {\hbox{p.c.} \int_a^b g(x)\ dx} is a piecewise constant integral, formed by breaking up the piecewise constant function {g} into a finite linear combination of indicator functions of intervals, and then measuring the length of each interval.

It turns out that virtually the same definition allows us to define a lower Lebesgue integral {\underline{\int_{{\bf R}^d}} f(x)\ dx} of any unsigned function {f: {\bf R}^d \rightarrow [0,+\infty]}, simply by replacing intervals with the more general class of Lebesgue measurable sets (and thus replacing piecewise constant functions with the more general class of simple functions). If the function is Lebesgue measurable (a concept that we will define presently), then we refer to the lower Lebesgue integral simply as the Lebesgue integral. As we shall see, it obeys all the basic properties one expects of an integral, such as monotonicity and additivity; in subsequent notes we will also see that it behaves quite well with respect to limits, by establishing the two basic convergence theorems of the unsigned Lebesgue integral, namely Fatou’s lemma and the monotone convergence theorem.

Once we have the theory of the unsigned Lebesgue integral, we will then be able to define the absolutely convergent Lebesgue integral, similarly to how the absolutely convergent infinite sum can be defined using the unsigned infinite sum. This integral also obeys all the basic properties one expects, such as linearity and compatibility with the more classical Riemann integral; in subsequent notes we will see that it also obeys a fundamentally important convergence theorem, the dominated convergence theorem. This convergence theorem makes the Lebesgue integral (and its abstract generalisations to other measure spaces than {{\bf R}^d}) particularly suitable for analysis, as well as allied fields that rely heavily on limits of functions, such as PDE, probability, and ergodic theory.

Remark 1 This is not the only route to setting up the unsigned and absolutely convergent Lebesgue integrals. Stein-Shakarchi, for instance, proceeds slightly differently, beginning with the unsigned integral but then making an auxiliary stop at integration of functions that are bounded and are supported on a set of finite measure, before going to the absolutely convergent Lebesgue integral. Another approach (which will not be discussed here) is to take the metric completion of the Riemann integral with respect to the {L^1} metric.

The Lebesgue integral and Lebesgue measure can be viewed as completions of the Riemann integral and Jordan measure respectively. This means three things. Firstly, the Lebesgue theory extends the Riemann theory: every Jordan measurable set is Lebesgue measurable, and every Riemann integrable function is Lebesgue measurable, with the measures and integrals from the two theories being compatible. Conversely, the Lebesgue theory can be approximated by the Riemann theory; as we saw in the previous notes, every Lebesgue measurable set can be approximated (in various senses) by simpler sets, such as open sets or elementary sets, and in a similar fashion, Lebesgue measurable functions can be approximated by nicer functions, such as Riemann integrable or continuous functions. Finally, the Lebesgue theory is complete in various ways; we will formalise this properly only in the next quarter when we study {L^p} spaces, but the convergence theorems mentioned above already hint at this completeness. A related fact, known as Egorov’s theorem, asserts that a pointwise converging sequence of functions can be approximated as a (locally) uniformly converging sequence of functions. The facts listed here are manifestations of Littlewood’s three principles of real analysis, which capture much of the essence of the Lebesgue theory.

— 1. Integration of simple functions —

Much as the Riemann integral was set up by first using the integral for piecewise constant functions, the Lebesgue integral is set up using the integral for simple functions.

Definition 1 (Simple function) A (complex-valued) simple function {f: {\bf R}^d \rightarrow {\bf C}} is a finite linear combination

\displaystyle f = c_1 1_{E_1} + \ldots + c_k 1_{E_k} \ \ \ \ \ (3)

 

of indicator functions {1_{E_i}} of Lebesgue measurable sets {E_i \subset {\bf R}^d} for {i=1,\ldots,k}, where {k \geq 0} is a natural number and {c_1,\ldots,c_k \in {\bf C}} are complex numbers. An unsigned simple function {f: {\bf R}^d \rightarrow [0,+\infty]} is defined similarly, but with the {c_i} taking values in {[0,+\infty]} rather than {{\bf C}}.

It is clear from construction that the space {\hbox{Simp}({\bf R}^d)} of complex-valued simple functions forms a complex vector space; moreover, {\hbox{Simp}({\bf R}^d)} is also closed under pointwise product {f, g \mapsto fg} and complex conjugation {f \mapsto \overline{f}}. In short, {\hbox{Simp}({\bf R}^d)} is a commutative {*}-algebra. Meanwhile, the space {\hbox{Simp}^+({\bf R}^d)} of unsigned simple functions is a {[0,+\infty]}-module; it is closed under addition, and under scalar multiplication by elements in {[0,+\infty]}.

In this definition, we did not require the {E_1,\ldots,E_k} to be disjoint. However, it is easy enough to arrange this, basically by exploiting Venn diagrams (or, to use fancier language, finite boolean algebras). Indeed, any {k} subsets {E_1,\ldots,E_k} of {{\bf R}^d} partition {{\bf R}^d} into {2^k} disjoint sets, each of which is an intersection of {E_i} or the complement {{\bf R}^d \backslash E_i} for {i=1,\ldots,k} (and in particular, is measurable). The (complex or unsigned) simple function is constant on each of these sets, and so can easily be decomposed as a linear combination of the indicator functions of these sets. One easy consequence of this is that if {f} is a complex-valued simple function, then its absolute value {|f|: x \mapsto |f(x)|} is an unsigned simple function.

It is geometrically intuitive that we should define the integral {\int_{{\bf R}^d} 1_E(x)\ dx} of an indicator function of a measurable set {E} to equal {m(E)}:

\displaystyle \int_{{\bf R}^d} 1_E(x)\ dx = m(E).

Using this and applying the laws of integration formally, we are led to propose the following definition for the integral of an unsigned simple function:

Definition 2 (Integral of an unsigned simple function) If {f = c_1 1_{E_1} + \ldots + c_k 1_{E_k}} is an unsigned simple function, the integral {\hbox{Simp} \int_{{\bf R}^d} f(x)\ dx} is defined by the formula

\displaystyle \hbox{Simp} \int_{{\bf R}^d} f(x)\ dx := c_1 m(E_1) + \ldots + c_k m(E_k),

thus {\hbox{Simp} \int_{{\bf R}^d} f(x)\ dx} will take values in {[0,+\infty]}.
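To make the formula concrete, here is a minimal one-dimensional sketch in which each {E_i} is a finite union of pairwise disjoint intervals, so that {m(E_i)} is just a sum of lengths. The data layout and the helper names `length`, `ext_mul` and `simple_integral` are our own, and we assume the usual measure-theoretic convention {0 \cdot (+\infty) = 0}, which has to be coded explicitly because floating-point arithmetic returns NaN for that product.

```python
import math

def length(intervals):
    """Total length of a finite list of pairwise disjoint intervals (a, b)."""
    return sum(b - a for a, b in intervals)

def ext_mul(c, m):
    """Product on [0, +inf], with the assumed convention 0 * inf = 0."""
    if c == 0 or m == 0:
        return 0.0
    return c * m

def simple_integral(terms):
    """Integral of sum_i c_i 1_{E_i}; terms is a list of pairs (c_i, E_i)."""
    return sum(ext_mul(c, length(E)) for c, E in terms)

# f = 2 * 1_[0,1] + 3 * 1_[0.5,2]; the sets E_i are allowed to overlap.
f = [(2.0, [(0.0, 1.0)]), (3.0, [(0.5, 2.0)])]
print(simple_integral(f))  # 2*1 + 3*1.5 = 6.5
```

Note that the value can be {+\infty}, for instance for the indicator function of a half-line, while a zero coefficient on a set of infinite measure contributes nothing.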

However, one has to actually check that this definition is well-defined, in the sense that different representations

\displaystyle f = c_1 1_{E_1} + \ldots + c_k 1_{E_k} = c'_1 1_{E'_1} + \ldots + c'_{k'} 1_{E'_{k'}}

of a function as a finite unsigned combination of indicator functions of measurable sets will give the same value for the integral {\hbox{Simp} \int_{{\bf R}^d} f(x)\ dx}. This is the purpose of the following lemma:

Lemma 3 (Well-definedness of simple integral) Let {k, k' \geq 0} be natural numbers, {c_1,\ldots,c_k,c'_1,\ldots,c'_{k'} \in [0,+\infty]}, and let {E_1,\ldots,E_k,E'_1,\ldots,E'_{k'} \subset {\bf R}^d} be Lebesgue measurable sets such that the identity

\displaystyle c_1 1_{E_1} + \ldots + c_k 1_{E_k} = c'_1 1_{E'_1} + \ldots + c'_{k'} 1_{E'_{k'}} \ \ \ \ \ (4)

 

holds identically on {{\bf R}^d}. Then one has

\displaystyle c_1 m(E_1) + \ldots + c_k m(E_k) = c'_1 m(E'_1) + \ldots + c'_{k'} m(E'_{k'}).

Proof: We again use a Venn diagram argument. The {k+k'} sets {E_1,\ldots,E_k,E'_1,\ldots,E'_{k'}} partition {{\bf R}^d} into {2^{k+k'}} disjoint sets, each of which is an intersection of some of the {E_1,\ldots,E_k,E'_1,\ldots,E'_{k'}} and their complements. We throw away any sets that are empty, leaving us with a partition of {{\bf R}^d} into {m} non-empty disjoint sets {A_1,\ldots,A_m} for some {0 \leq m \leq 2^{k+k'}}. As the {E_1,\ldots,E_k,E'_1,\ldots,E'_{k'}} are Lebesgue measurable, the {A_1,\ldots,A_m} are too. By construction, each of the {E_1,\ldots,E_k,E'_1,\ldots,E'_{k'}} arises as a union of some of the {A_1,\ldots,A_m}, thus we can write

\displaystyle E_i = \bigcup_{j \in J_i} A_j

and

\displaystyle E'_{i'} = \bigcup_{j' \in J'_{i'}} A_{j'}

for all {i=1,\ldots,k} and {i'=1,\ldots,k'}, and some subsets {J_i, J'_{i'} \subset \{1,\ldots,m\}}. By finite additivity of Lebesgue measure, we thus have

\displaystyle m(E_i) = \sum_{j \in J_i} m(A_j)

and

\displaystyle m(E'_{i'}) = \sum_{j \in J'_{i'}} m(A_{j}).

Thus, our objective is now to show that

\displaystyle \sum_{i=1}^k c_i \sum_{j \in J_i} m(A_j) = \sum_{i'=1}^{k'} c'_{i'} \sum_{j \in J'_{i'}} m(A_{j}). \ \ \ \ \ (5)

 

To obtain this, we fix {1 \leq j \leq m} and evaluate (4) at a point {x} in the non-empty set {A_j}. At such a point, {1_{E_i}(x)} is equal to {1_{J_i}(j)}, and similarly {1_{E'_{i'}}(x)} is equal to {1_{J'_{i'}}(j)}. From (4) we conclude that

\displaystyle \sum_{i=1}^k c_i 1_{J_i}(j) = \sum_{i'=1}^{k'} c'_{i'} 1_{J'_{i'}}(j).

Multiplying this by {m(A_j)} and then summing over all {j=1,\ldots,m} we obtain (5). \Box
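The argument can be replayed numerically in a simple one-dimensional case. In the sketch below (our own illustration, with hypothetical helper names), every set is a single half-open interval {[a,b)}, and instead of the {2^{k+k'}} Venn atoms we cut the line at every endpoint; each resulting cell lies inside a single atom, so the same reasoning applies to this finer partition. Exact rational arithmetic is used so that the comparisons are not affected by rounding.

```python
from fractions import Fraction as F

def cells(*representations):
    """Cut the line at every endpoint; each cell [p, q) lies in one Venn atom."""
    points = sorted({x for rep in representations
                     for _, (a, b) in rep for x in (a, b)})
    return list(zip(points, points[1:]))

def value_on(rep, cell):
    """Value of sum_i c_i 1_{E_i} on a cell, i.e. sum_i c_i 1_{J_i}(j)."""
    p, q = cell
    return sum(c for c, (a, b) in rep if a <= p and q <= b)

def integral_via_cells(rep, all_cells):
    """Sum over cells of (value on the cell) times (length of the cell)."""
    return sum(value_on(rep, cell) * (cell[1] - cell[0]) for cell in all_cells)

def integral_direct(rep):
    """c_1 m(E_1) + ... + c_k m(E_k) for intervals E_i = [a_i, b_i)."""
    return sum(c * (b - a) for c, (a, b) in rep)

# Two representations of the same function on the line.
rep1 = [(F(2), (F(0), F(1))), (F(3), (F(1, 2), F(2)))]
rep2 = [(F(2), (F(0), F(1, 2))), (F(5), (F(1, 2), F(1))), (F(3), (F(1), F(2)))]

A = cells(rep1, rep2)
print(all(value_on(rep1, c) == value_on(rep2, c) for c in A))  # pointwise identity
print(integral_direct(rep1) == integral_direct(rep2))          # equal integrals
```

The first check is the cell-by-cell identity obtained by evaluating the two representations at a point of each cell; multiplying by the cell lengths and summing gives the second.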

We now make some important definitions that we will use repeatedly in the course:

Definition 4 (Almost everywhere and support) A property {P(x)} of a point {x \in {\bf R}^d} is said to hold (Lebesgue) almost everywhere in {{\bf R}^d}, or for (Lebesgue) almost every point {x \in {\bf R}^d}, if the set of {x \in {\bf R}^d} for which {P(x)} fails has Lebesgue measure zero (i.e. {P} is true outside of a null set). We usually omit the prefix Lebesgue, and often abbreviate “almost everywhere” or “almost every” as a.e.

Two functions {f, g: {\bf R}^d \rightarrow Z} into an arbitrary range {Z} are said to agree almost everywhere if one has {f(x)=g(x)} for almost every {x \in {\bf R}^d}.

The support of a function {f: {\bf R}^d \rightarrow {\bf C}} or {f: {\bf R}^d \rightarrow [0,+\infty]} is defined to be the set {\{x \in {\bf R}^d: f(x) \neq 0 \}} where {f} is non-zero.

Note that if {P(x)} holds for almost every {x}, and {P(x)} implies {Q(x)}, then {Q(x)} holds for almost every {x}. Also, if {P_1(x), P_2(x), \ldots} are an at most countable family of properties, each of which individually holds for almost every {x}, then they will simultaneously be true for almost every {x}, because the countable union of null sets is still a null set. Because of these properties, one can (as a rule of thumb) treat the almost universal quantifier “for almost every” as if it was the truly universal quantifier “for every”, as long as one is only concatenating at most countably many properties together, and as long as one never specialises the free variable {x} to a null set. Observe also that the property of agreeing almost everywhere is an equivalence relation, which we will refer to as almost everywhere equivalence.

In later notes we will also see the notion of the closed support of a function {f: {\bf R}^d \rightarrow {\bf C}}, defined as the closure of the support.

The following properties of the simple unsigned integral are easily obtained from the definitions:

Exercise 1 (Basic properties of the simple unsigned integral) Let {f, g: {\bf R}^d \rightarrow [0,+\infty]} be simple unsigned functions.

  1. (Unsigned linearity) We have

    \displaystyle \hbox{Simp} \int_{{\bf R}^d} f(x)+g(x)\ dx = \hbox{Simp} \int_{{\bf R}^d} f(x)\ dx

    \displaystyle + \hbox{Simp} \int_{{\bf R}^d} g(x)\ dx

    and

    \displaystyle \hbox{Simp} \int_{{\bf R}^d} c f(x)\ dx = c \times \hbox{Simp} \int_{{\bf R}^d} f(x)\ dx

    for all {c \in [0,+\infty]}.

  2. (Finiteness) We have {\hbox{Simp} \int_{{\bf R}^d} f(x)\ dx < \infty} if and only if {f} is finite almost everywhere, and its support has finite measure.
  3. (Vanishing) We have {\hbox{Simp} \int_{{\bf R}^d} f(x)\ dx = 0} if and only if {f} is zero almost everywhere.
  4. (Equivalence) If {f} and {g} agree almost everywhere, then {\hbox{Simp} \int_{{\bf R}^d} f(x)\ dx = \hbox{Simp} \int_{{\bf R}^d} g(x)\ dx}.
  5. (Monotonicity) If {f(x) \leq g(x)} for almost every {x \in {\bf R}^d}, then {\hbox{Simp} \int_{{\bf R}^d} f(x)\ dx \leq \hbox{Simp} \int_{{\bf R}^d} g(x)\ dx}.
  6. (Compatibility with Lebesgue measure) For any Lebesgue measurable {E}, one has {\hbox{Simp} \int_{{\bf R}^d} 1_E(x)\ dx = m(E)}.

Furthermore, show that the simple unsigned integral {f \mapsto \hbox{Simp} \int_{{\bf R}^d} f(x)\ dx} is the only map from the space {\hbox{Simp}^+({\bf R}^d)} of unsigned simple functions to {[0,+\infty]} that obeys all of the above properties.

We can now define an absolutely convergent counterpart to the simple unsigned integral. This integral will be superseded by the absolutely convergent Lebesgue integral, but we give it here as motivation for that more general notion of integration.

Definition 5 (Absolutely convergent simple integral) A complex-valued simple function {f: {\bf R}^d \rightarrow {\bf C}} is said to be absolutely integrable if {\hbox{Simp} \int_{{\bf R}^d} |f(x)|\ dx < \infty}. If {f} is absolutely integrable, the integral {\hbox{Simp} \int_{{\bf R}^d} f(x)\ dx} is defined for real signed {f} by the formula

\displaystyle \hbox{Simp} \int_{{\bf R}^d} f(x)\ dx := \hbox{Simp} \int_{{\bf R}^d} f_+(x)\ dx - \hbox{Simp} \int_{{\bf R}^d} f_-(x)\ dx

where {f_+(x) := \max(f(x),0)} and {f_-(x) := \max(-f(x),0)} (note that these are unsigned simple functions that are pointwise dominated by {|f|} and thus have finite integral), and for complex-valued {f} by the formula

\displaystyle \hbox{Simp} \int_{{\bf R}^d} f(x)\ dx := \hbox{Simp} \int_{{\bf R}^d} \hbox{Re} f(x)\ dx

\displaystyle + i \hbox{Simp} \int_{{\bf R}^d} \hbox{Im} f(x)\ dx.

(Strictly speaking, this is an abuse of notation as we have now defined the simple integral {\hbox{Simp} \int_{{\bf R}^d}} three different times, for unsigned, real signed, and complex-valued simple functions, but one easily verifies that these three definitions agree with each other on their common domains of definition, so it is safe to use a single notation for all three.)

Note from the preceding exercise that a complex-valued simple function {f} is absolutely integrable if and only if it has finite measure support (since finiteness almost everywhere is automatic). In particular, the space {\hbox{Simp}^{abs}({\bf R}^d)} of absolutely integrable simple functions is closed under addition and scalar multiplication by complex numbers, and is thus a complex vector space.

The properties of the unsigned simple integral then can be used to deduce analogous properties for the complex-valued integral:

Exercise 2 (Basic properties of the complex-valued simple integral) Let {f, g: {\bf R}^d \rightarrow {\bf C}} be absolutely integrable simple functions.

  1. (*-linearity) We have

    \displaystyle \hbox{Simp} \int_{{\bf R}^d} f(x)+g(x)\ dx = \hbox{Simp} \int_{{\bf R}^d} f(x)\ dx

    \displaystyle + \hbox{Simp} \int_{{\bf R}^d} g(x)\ dx

    and

    \displaystyle \hbox{Simp} \int_{{\bf R}^d} c f(x)\ dx = c \times \hbox{Simp} \int_{{\bf R}^d} f(x)\ dx \ \ \ \ \ (6)

     

    for all {c \in {\bf C}}. Also we have

    \displaystyle \hbox{Simp} \int_{{\bf R}^d} \overline{f}(x)\ dx = \overline{\hbox{Simp} \int_{{\bf R}^d} f(x)\ dx}.

  2. (Equivalence) If {f} and {g} agree almost everywhere, then {\hbox{Simp} \int_{{\bf R}^d} f(x)\ dx = \hbox{Simp} \int_{{\bf R}^d} g(x)\ dx}.
  3. (Compatibility with Lebesgue measure) For any Lebesgue measurable {E}, one has {\hbox{Simp} \int_{{\bf R}^d} 1_E(x)\ dx = m(E)}.

(Hints: Work out the real-valued counterpart of the linearity property first. To establish (6), treat the cases {c>0, c=0, c=-1} separately. To deal with the additivity for real functions {f,g}, start with the identity

\displaystyle f + g = (f+g)_+ - (f+g)_- = (f_+-f_-) + (g_+-g_-)

and rearrange the second equality so that no subtraction appears.) Furthermore, show that the complex-valued simple integral {f \mapsto \hbox{Simp} \int_{{\bf R}^d} f(x)\ dx} is the only map from the space {\hbox{Simp}^{abs}({\bf R}^d)} of absolutely integrable simple functions to {{\bf C}} that obeys all of the above properties.

We now comment further on the fact that (simple) functions that agree almost everywhere have the same integral. We can view this as an assertion that integration is a noise-tolerant operation: one can have “noise” or “errors” in a function {f(x)} on a null set, and this will not affect the final value of the integral. Indeed, once one has this noise tolerance, one can even integrate functions {f} that are not defined everywhere on {{\bf R}^d}, but merely defined almost everywhere on {{\bf R}^d} (i.e. {f} is defined on some set {{\bf R}^d \backslash N} where {N} is a null set), simply by extending {f} to all of {{\bf R}^d} in some arbitrary fashion (e.g. by setting {f} equal to zero on {N}). This is extremely convenient for analysis, as there are many natural functions (e.g. {\frac{\sin x}{x}} in one dimension, or {\frac{1}{|x|^\alpha}} for various {\alpha > 0} in higher dimensions) that are only defined almost everywhere instead of everywhere (often due to “division by zero” problems when a denominator vanishes). While such functions cannot be evaluated at certain singular points, they can still be integrated (provided they obey some integrability condition, of course, such as absolute integrability), and so one can still perform a large portion of analysis on such functions.

In fact, in the subfield of analysis known as functional analysis, it is convenient to abstract the notion of an almost everywhere defined function somewhat, by replacing any such function {f} with the equivalence class of almost everywhere defined functions that are equal to {f} almost everywhere. Such classes are then no longer functions in the standard set-theoretic sense (they do not map each point in the domain to a unique point in the range, since points in {{\bf R}^d} have measure zero), but the properties of various function spaces improve when one does this (various semi-norms become norms, various topologies become Hausdorff, and so forth). See these 245B lecture notes for further discussion.

Remark 2 The “Lebesgue philosophy” that one is willing to lose control on sets of measure zero is a perspective that distinguishes Lebesgue-type analysis from other types of analysis, most notably that of descriptive set theory, which is also interested in studying subsets of {{\bf R}^d}, but can give completely different structural classifications to a pair of sets that agree almost everywhere. This loss of control on null sets is the price one has to pay for gaining access to the powerful tool of the Lebesgue integral; if one needs to control a function at absolutely every point, and not just almost every point, then one often needs to use other tools than integration theory (unless one has some regularity on the function, such as continuity, that lets one pass from almost everywhere true statements to everywhere true statements).

— 2. Measurable functions —

Much as the piecewise constant integral can be completed to the Riemann integral, the unsigned simple integral can be completed to the unsigned Lebesgue integral, by extending the class of unsigned simple functions to the larger class of unsigned Lebesgue measurable functions. One of the shortest ways to define this class is as follows:

Definition 6 (Unsigned measurable function) An unsigned function {f: {\bf R}^d \rightarrow [0,+\infty]} is unsigned Lebesgue measurable, or measurable for short, if it is the pointwise limit of unsigned simple functions, i.e. if there exists a sequence {f_1, f_2, f_3, \ldots: {\bf R}^d \rightarrow [0,+\infty]} of unsigned simple functions such that {f_n(x) \rightarrow f(x)} for every {x \in {\bf R}^d}.

This particular definition is not always the most tractable. Fortunately, it has many equivalent forms:

Lemma 7 (Equivalent notions of measurability) Let {f: {\bf R}^d \rightarrow [0,+\infty]} be an unsigned function. Then the following are equivalent:

  1. {f} is unsigned Lebesgue measurable.
  2. {f} is the pointwise limit of unsigned simple functions {f_n} (thus the limit {\lim_{n \rightarrow \infty} f_n(x)} exists and is equal to {f(x)} for all {x \in {\bf R}^d}).
  3. {f} is the pointwise almost everywhere limit of unsigned simple functions {f_n} (thus the limit {\lim_{n \rightarrow \infty} f_n(x)} exists and is equal to {f(x)} for almost every {x \in {\bf R}^d}).
  4. {f} is the supremum {f(x) = \sup_n f_n(x)} of an increasing sequence {0 \leq f_1 \leq f_2 \leq \ldots} of unsigned simple functions {f_n}, each of which is bounded with finite measure support.
  5. For every {\lambda \in [0,+\infty]}, the set {\{ x \in {\bf R}^d: f(x) > \lambda \}} is Lebesgue measurable.
  6. For every {\lambda \in [0,+\infty]}, the set {\{ x \in {\bf R}^d: f(x) \geq \lambda \}} is Lebesgue measurable.
  7. For every {\lambda \in [0,+\infty]}, the set {\{ x \in {\bf R}^d: f(x) < \lambda \}} is Lebesgue measurable.
  8. For every {\lambda \in [0,+\infty]}, the set {\{ x \in {\bf R}^d: f(x) \leq \lambda \}} is Lebesgue measurable.
  9. For every interval {I \subset [0,+\infty)}, the set {f^{-1}(I) := \{ x\in {\bf R}^d: f(x) \in I \}} is Lebesgue measurable.
  10. For every (relatively) open set {U \subset [0,+\infty)}, the set {f^{-1}(U) := \{ x \in {\bf R}^d: f(x) \in U \}} is Lebesgue measurable.
  11. For every (relatively) closed set {K \subset [0,+\infty)}, the set {f^{-1}(K) := \{ x \in {\bf R}^d: f(x) \in K \}} is Lebesgue measurable.

Proof: (1.) and (2.) are equivalent by definition. (2.) clearly implies (3.). As every monotone sequence in {[0,+\infty]} converges, (4.) implies (2.). Now we show that (3.) implies (5.). If {f} is the pointwise almost everywhere limit of {f_n}, then for almost every {x \in {\bf R}^d} one has

\displaystyle f(x) = \lim_{n \rightarrow \infty} f_n(x) = \limsup_{n \rightarrow \infty} f_n(x) = \inf_{N>0} \sup_{n\geq N} f_n(x).

This implies that, for any {\lambda}, the set {\{ x \in {\bf R}^d: f(x) > \lambda \}} is equal to

\displaystyle \bigcup_{M>0} \bigcap_{N>0} \{ x \in {\bf R}^d: \sup_{n \geq N} f_n(x) > \lambda + \frac{1}{M} \}

outside of a set of measure zero; this set in turn is equal to

\displaystyle \bigcup_{M>0} \bigcap_{N>0} \bigcup_{n \ge N} \{ x \in {\bf R}^d: f_n(x) > \lambda + \frac{1}{M} \}

outside of a set of measure zero. But as each {f_n} is an unsigned simple function, the sets {\{ x \in {\bf R}^d: f_n(x) > \lambda + \frac{1}{M} \}} are Lebesgue measurable. Since countable unions or countable intersections of Lebesgue measurable sets are Lebesgue measurable, and modifying a Lebesgue measurable set on a null set produces another Lebesgue measurable set, we obtain (5.).

To obtain the equivalence of (5.) and (6.), observe that

\displaystyle \{ x \in {\bf R}^d: f(x) \geq \lambda \} = \bigcap_{\lambda' \in {\bf Q}^+: \lambda' < \lambda} \{ x \in {\bf R}^d: f(x) > \lambda' \}

for {\lambda \in (0,+\infty]} and

\displaystyle \{ x \in {\bf R}^d: f(x) > \lambda \} = \bigcup_{\lambda' \in {\bf Q}^+: \lambda' > \lambda} \{ x \in {\bf R}^d: f(x) \geq \lambda' \}

for {\lambda \in [0,+\infty)}, where {{\bf Q}^+ := {\bf Q} \cap [0,+\infty]} are the non-negative rationals. The claim then easily follows from the countable nature of {{\bf Q}^+} (treating the extreme cases {\lambda = 0, +\infty} separately if necessary). A similar argument lets one deduce (5.) or (6.) from (9.).

The equivalence of (5.), (6.) with (7.), (8.) comes from the observation that {\{ x \in {\bf R}^d: f(x) \leq \lambda \}} is the complement of {\{ x \in {\bf R}^d: f(x) > \lambda \}}, and {\{ x \in {\bf R}^d: f(x) < \lambda \}} is the complement of {\{ x \in {\bf R}^d: f(x) \geq \lambda \}}. A similar argument shows that (10.) and (11.) are equivalent.

By expressing an interval as the intersection of two half-intervals, we see that (9.) follows from (5.)-(8.), and so all of (5.)-(9.) are now shown to be equivalent.

Clearly (10.) implies (7.), and hence (5.)-(9.). Conversely, because every open set in {[0,+\infty)} is the union of countably many open intervals in {[0,+\infty)}, (9.) implies (10.).

The only remaining task is to show that (5.)-(11.) implies (4.). Let {f} obey (5.)-(11.). For each positive integer {n}, we let {f_n(x)} be defined to be the largest integer multiple of {2^{-n}} that is less than or equal to {\min(f(x),n)} when {|x| \leq n}, with {f_n(x) := 0} for {|x| > n}. From construction it is easy to see that the {f_n: {\bf R}^d \rightarrow [0,+\infty]} are increasing and have {f} as their supremum. Furthermore, each {f_n} takes on only finitely many values, and for each non-zero value {c} it attains, the set {f_n^{-1}(c)} takes the form {f^{-1}(I_c) \cap \{ x \in {\bf R}^d: |x| \leq n \}} for some interval or ray {I_c}, and is thus measurable. As a consequence, {f_n} is a simple function, and by construction it is bounded and has finite measure support. The claim follows. \Box
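The approximants {f_n} built in the last step of the proof are easy to compute. The following sketch does so in dimension {d=1}; the function name `dyadic_approx` and the sample function {f(x) = 1/|x|} (with {f(0) = +\infty}) are our own choices for illustration.

```python
import math

def dyadic_approx(f, n):
    """Return f_n: min(f(x), n) rounded down to a multiple of 2^-n on |x| <= n,
    and 0 for |x| > n (one-dimensional case)."""
    def f_n(x):
        if abs(x) > n:
            return 0.0
        capped = min(f(x), n)  # finite even when f(x) = +inf
        return math.floor(capped * 2**n) / 2**n
    return f_n

def f(x):
    """Sample unsigned function, taking the value +inf at the origin."""
    return math.inf if x == 0 else 1.0 / abs(x)

values = [dyadic_approx(f, n)(0.3) for n in range(1, 21)]
print(values[:5])
```

At a fixed point {x} the values {f_n(x)} are non-decreasing in {n}, and once {n \geq \max(|x|, f(x))} they lie within {2^{-n}} of {f(x)}; at the origin, where {f = +\infty}, one simply gets {f_n(0) = n}.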

With these equivalent formulations, we can now generate plenty of measurable functions:

Exercise 3

  1. Show that every continuous function {f: {\bf R}^d \rightarrow [0,+\infty]} is measurable.
  2. Show that every unsigned simple function is measurable.
  3. Show that the supremum, infimum, limit superior, or limit inferior of unsigned measurable functions is unsigned measurable.
  4. Show that an unsigned function that is equal almost everywhere to an unsigned measurable function is itself measurable.
  5. Show that if a sequence {f_n} of unsigned measurable functions converges pointwise almost everywhere to an unsigned limit {f}, then {f} is also measurable.
  6. If {f: {\bf R}^d \rightarrow [0,+\infty]} is measurable and {\phi: [0,+\infty] \rightarrow [0,+\infty]} is continuous, show that {\phi \circ f: {\bf R}^d \rightarrow [0,+\infty]} is measurable.
  7. If {f, g} are unsigned measurable functions, show that {f+g} and {fg} are measurable.

In view of part (4.) of the above exercise, one can define the concept of measurability for an unsigned function that is only defined almost everywhere on {{\bf R}^d}, rather than everywhere on {{\bf R}^d}, by extending that function arbitrarily to the null set where it is currently undefined.

Exercise 4 Let {f: {\bf R}^d \rightarrow [0,+\infty]}. Show that {f} is a bounded unsigned measurable function if and only if {f} is the uniform limit of bounded simple functions.

Exercise 5 Show that an unsigned function {f: {\bf R}^d \rightarrow [0,+\infty]} is a simple function if and only if it is measurable and takes on at most finitely many values.

Exercise 6 Let {f: {\bf R}^d \rightarrow [0,+\infty]} be an unsigned measurable function. Show that the region {\{ (x,t) \in {\bf R}^d \times {\bf R}: 0 \leq t \leq f(x) \}} is a measurable subset of {{\bf R}^{d+1}}. (There is a converse to this statement, but we will wait until later notes to prove it, once we have the Fubini-Tonelli theorem available to us.)

Remark 3 Lemma 7 tells us that if {f: {\bf R}^d \rightarrow [0,+\infty]} is measurable, then {f^{-1}(E)} is Lebesgue measurable for many classes of sets {E}. However, we caution that it is not necessarily the case that {f^{-1}(E)} is Lebesgue measurable if {E} is Lebesgue measurable. To see this, we let {C} be the Cantor set

\displaystyle C := \{ \sum_{j=1}^\infty a_j 3^{-j}: a_j \in \{0,2\} \hbox{ for all } j\}

and let {f: {\bf R} \rightarrow [0,+\infty]} be the function defined by setting

\displaystyle f(x) := \sum_{j=1}^\infty 2 b_j 3^{-j}

whenever {x \in [0,1]} is not a terminating binary decimal, and so has a unique binary expansion {x = \sum_{j=1}^\infty b_j 2^{-j}} for some {b_j \in\{0,1\}}, and {f(x) := 0} otherwise. We thus see that {f} takes values in {C}, and is bijective on the set {A} of non-terminating decimals in {[0,1]}. Using Lemma 7, it is not difficult to show that {f} is measurable. On the other hand, by modifying the construction from the previous notes, we can find a subset {F} of {A} which is non-measurable. If we set {E := f(F)}, then {E} is a subset of the null set {C} and is thus itself a null set; but {f^{-1}(E) = F} is non-measurable, and so the inverse image of a Lebesgue measurable set by a measurable function need not remain Lebesgue measurable.

However, in later notes we will see that it is still true that {f^{-1}(E)} is Lebesgue measurable if {E} has a slightly stronger measurability property than Lebesgue measurability, namely Borel measurability.
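To make the map {f} in Remark 3 concrete, here is a minimal numerical sketch of it, truncated to finitely many binary digits. The helper names `binary_digits` and `cantor_map` and the truncation depth are illustrative assumptions, not part of the notes. For {x = 1/3 = 0.010101\ldots_2}, the exact value is {f(x) = 0.020202\ldots_3 = 1/4}, which lies in the Cantor set.

```python
from fractions import Fraction

def binary_digits(x, count):
    """First `count` binary digits b_1, b_2, ... of a number x in [0, 1)."""
    digits = []
    for _ in range(count):
        x *= 2
        bit = int(x >= 1)
        digits.append(bit)
        x -= bit
    return digits

def cantor_map(x, count=40):
    """Truncation of f(x) = sum_j 2 b_j 3^{-j} to the first `count` digits (hypothetical helper)."""
    return sum(Fraction(2 * b, 3 ** j)
               for j, b in enumerate(binary_digits(x, count), start=1))

# x = 1/3 is not a terminating binary fraction; its digits alternate 0, 1, 0, 1, ...
approx = cantor_map(Fraction(1, 3))
```

Every ternary digit of the output is {0} or {2} by construction, which is why {f} takes values in {C}.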

Now we can define the concept of a complex-valued measurable function. As discussed earlier, it will be convenient to allow for such functions to only be defined almost everywhere, rather than everywhere, to allow for the possibility that the function becomes singular or otherwise undefined on a null set.

Definition 8 (Complex measurability) An almost everywhere defined complex-valued function {f: {\bf R}^d \rightarrow {\bf C}} is Lebesgue measurable, or measurable for short, if it is the pointwise almost everywhere limit of complex-valued simple functions.

As before, there are several equivalent definitions:

Exercise 7 Let {f: {\bf R}^d \rightarrow {\bf C}} be an almost everywhere defined complex-valued function. Then the following are equivalent:

  1. {f} is measurable.
  2. {f} is the pointwise almost everywhere limit of complex-valued simple functions.
  3. The (magnitudes of the) positive and negative parts of {\hbox{Re}(f)} and {\hbox{Im}(f)} are unsigned measurable functions.
  4. {f^{-1}(U)} is Lebesgue measurable for every open set {U \subset {\bf C}}.
  5. {f^{-1}(K)} is Lebesgue measurable for every closed set {K \subset {\bf C}}.

From the above exercise, we see that the notions of complex-valued measurability and unsigned measurability are compatible when applied to a function that takes values in {[0,+\infty) = [0,+\infty] \cap {\bf C}} everywhere (or almost everywhere).

Exercise 8

  1. Show that every continuous function {f: {\bf R}^d \rightarrow {\bf C}} is measurable.
  2. Show that a function {f: {\bf R}^d \rightarrow {\bf C}} is simple if and only if it is measurable and takes on at most finitely many values.
  3. Show that a complex-valued function that is equal almost everywhere to a measurable function, is itself measurable.
  4. Show that if a sequence {f_n} of complex-valued measurable functions converges pointwise almost everywhere to a complex-valued limit {f}, then {f} is also measurable.
  5. If {f: {\bf R}^d \rightarrow {\bf C}} is measurable and {\phi: {\bf C} \rightarrow {\bf C}} is continuous, show that {\phi \circ f: {\bf R}^d \rightarrow {\bf C}} is measurable.
  6. If {f, g} are measurable functions, show that {f+g} and {fg} are measurable.

Exercise 9 Let {f: [a,b] \rightarrow {\bf R}} be a Riemann integrable function. Show that if one extends {f} to all of {{\bf R}} by defining {f(x)=0} for {x \not \in [a,b]}, then {f} is measurable.

— 3. Unsigned Lebesgue integrals —

We are now ready to integrate unsigned measurable functions. We begin with the notion of the lower unsigned Lebesgue integral, which can be defined for arbitrary unsigned functions (not necessarily measurable):

Definition 9 (Lower unsigned Lebesgue integral) Let {f: {\bf R}^d \rightarrow [0,+\infty]} be an unsigned function (not necessarily measurable). We define the lower unsigned Lebesgue integral {\underline{\int_{{\bf R}^d}} f(x)\ dx} to be the quantity

\displaystyle \underline{\int_{{\bf R}^d}} f(x)\ dx := \sup_{0 \leq g \leq f; g \hbox{ simple}} \hbox{Simp} \int_{{\bf R}^d} g(x)\ dx

where {g} ranges over all unsigned simple functions {g: {\bf R}^d \rightarrow [0,+\infty]} that are pointwise bounded by {f}.

One can also define the upper unsigned Lebesgue integral

\displaystyle \overline{\int_{{\bf R}^d}} f(x)\ dx := \inf_{h \geq f; h \hbox{ simple}} \hbox{Simp} \int_{{\bf R}^d} h(x)\ dx

but we will use this integral much more rarely. Note that both integrals take values in {[0,+\infty]}, and that the upper Lebesgue integral is always at least as large as the lower Lebesgue integral.

In the definition of the lower unsigned Lebesgue integral, {g} is required to be bounded by {f} pointwise everywhere, but it is easy to see that one could also require {g} to just be bounded by {f} pointwise almost everywhere without affecting the value of the integral, since the simple integral is not affected by modifications on sets of measure zero.
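As an illustration of the supremum in Definition 9, the following sketch evaluates the simple integrals of one particular increasing family of simple functions below {f}. The test function {f(x) = x^2 1_{[0,1]}(x)} and the dyadic family {g_n := 2^{-n} \lfloor 2^n f \rfloor} are assumptions made for the example only; the closed form for the measure of the level sets is specific to this {f}.

```python
import math

def dyadic_lower_integral(n):
    """Simple integral of g_n = 2^{-n} * floor(2^n f) for f(x) = x^2 on [0, 1].

    g_n = sum_{k=1}^{2^n} 2^{-n} 1_{f >= k 2^{-n}}, and the set
    {x in [0,1] : x^2 >= t} is the interval [sqrt(t), 1], of measure 1 - sqrt(t).
    """
    K = 2 ** n
    return math.fsum((1.0 - math.sqrt(k / K)) / K for k in range(1, K + 1))

# Each g_n is simple and satisfies 0 <= g_n <= f, so every entry is a lower
# bound for the lower unsigned Lebesgue integral of f.
lower_sums = [dyadic_lower_integral(n) for n in range(1, 13)]
```

The values increase with {n} and approach {1/3} from below, consistent with the compatibility with the Riemann integral established later in these notes.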

The following properties of the lower Lebesgue integral are easy to establish:

Exercise 10 (Basic properties of the lower Lebesgue integral) Let {f, g: {\bf R}^d \rightarrow [0,+\infty]} be unsigned functions (not necessarily measurable).

  1. (Compatibility with the simple integral) If {f} is simple, then {\underline{\int_{{\bf R}^d}} f(x)\ dx = \overline{\int_{{\bf R}^d}} f(x)\ dx = \hbox{Simp} \int_{{\bf R}^d} f(x)\ dx}.
  2. (Monotonicity) If {f \leq g} pointwise almost everywhere, then {\underline{\int_{{\bf R}^d}} f(x)\ dx \leq \underline{\int_{{\bf R}^d}} g(x)\ dx} and {\overline{\int_{{\bf R}^d}} f(x)\ dx \leq \overline{\int_{{\bf R}^d}} g(x)\ dx}.
  3. (Homogeneity) If {c \in [0,+\infty)}, then {\underline{\int_{{\bf R}^d}} cf(x)\ dx = c \underline{\int_{{\bf R}^d}} f(x)\ dx}.
  4. (Equivalence) If {f, g} agree almost everywhere, then {\underline{\int_{{\bf R}^d}} f(x)\ dx = \underline{\int_{{\bf R}^d}} g(x)\ dx} and {\overline{\int_{{\bf R}^d}} f(x)\ dx = \overline{\int_{{\bf R}^d}} g(x)\ dx}.
  5. (Superadditivity) {\underline{\int_{{\bf R}^d}} f(x)+g(x)\ dx \geq \underline{\int_{{\bf R}^d}} f(x)\ dx + \underline{\int_{{\bf R}^d}} g(x)\ dx}.
  6. (Subadditivity of upper integral) {\overline{\int_{{\bf R}^d}} f(x)+g(x)\ dx \leq \overline{\int_{{\bf R}^d}} f(x)\ dx + \overline{\int_{{\bf R}^d}} g(x)\ dx}.
  7. (Divisibility) For any measurable set {E}, one has {\underline{\int_{{\bf R}^d}} f(x)\ dx = \underline{\int_{{\bf R}^d}} f(x) 1_E(x)\ dx + \underline{\int_{{\bf R}^d}} f(x) 1_{{\bf R}^d \backslash E}(x)\ dx}.
  8. (Horizontal truncation) As {n \rightarrow \infty}, {\underline{\int_{{\bf R}^d}} \min(f(x),n)\ dx} converges to {\underline{\int_{{\bf R}^d}} f(x)\ dx}.
  9. (Vertical truncation) As {n \rightarrow \infty}, {\underline{\int_{{\bf R}^d}} f(x) 1_{|x| \leq n}\ dx} converges to {\underline{\int_{{\bf R}^d}} f(x)\ dx}. Hint: From Exercise 11 of Notes 1, we have {m( E \cap \{ x: |x| \leq n \}) \rightarrow m(E)} for any measurable set {E}.
  10. (Reflection) If {f+g} is a simple function that is bounded with finite measure support (i.e. it is absolutely integrable), then {\hbox{Simp} \int_{{\bf R}^d} f(x)+g(x)\ dx = \underline{\int_{{\bf R}^d}} f(x)\ dx + \overline{\int_{{\bf R}^d}} g(x)\ dx}.

Do the horizontal and vertical truncation properties hold if the lower Lebesgue integral is replaced with the upper Lebesgue integral?

Now we restrict attention to measurable functions.

Definition 10 (Unsigned Lebesgue integral) If {f: {\bf R}^d \rightarrow [0,+\infty]} is measurable, we define the unsigned Lebesgue integral {\int_{{\bf R}^d} f(x)\ dx} of {f} to equal the lower unsigned Lebesgue integral {\underline{\int_{{\bf R}^d}} f(x)\ dx}. (For non-measurable functions, we leave the unsigned Lebesgue integral undefined.)

One nice feature of measurable functions is that the lower and upper Lebesgue integrals can match, if one also assumes some boundedness:

Exercise 11 Let {f: {\bf R}^d \rightarrow [0,+\infty]} be measurable, bounded, and vanishing outside of a set of finite measure. Show that the lower and upper Lebesgue integrals of {f} agree. (Hint: use Exercise 4.) There is a converse to this statement, but we will defer it to later notes. What happens if {f} is allowed to be unbounded, or is not supported inside a set of finite measure?

This gives an important corollary:

Corollary 11 (Finite additivity of the Lebesgue integral) Let {f, g: {\bf R}^d \rightarrow [0,+\infty]} be measurable. Then {\int_{{\bf R}^d} f(x)+g(x)\ dx = \int_{{\bf R}^d} f(x)\ dx + \int_{{\bf R}^d} g(x)\ dx}.

Proof: From the horizontal truncation property and a limiting argument, we may assume that {f, g} are bounded. From the vertical truncation property and another limiting argument, we may assume that {f, g} are supported inside a bounded set. From Exercise 11, we now see that the lower and upper Lebesgue integrals of {f}, {g}, and {f+g} agree. The claim now follows by combining the superadditivity of the lower Lebesgue integral with the subadditivity of the upper Lebesgue integral. \Box

In later notes we will improve this finite additivity property for the unsigned Lebesgue integral further, to countable additivity; this property is also known as the monotone convergence theorem.

Exercise 12 (Upper Lebesgue integral and outer Lebesgue measure) Show that for any set {E \subset {\bf R}^d}, {\overline{\int_{{\bf R}^d}} 1_E(x)\ dx = m^*(E)}. Conclude that the upper and lower Lebesgue integrals are not necessarily additive if no measurability hypotheses are assumed.

Exercise 13 (Area interpretation of integral) If {f: {\bf R}^d \rightarrow [0,+\infty]} is measurable, show that {\int_{{\bf R}^d} f(x)\ dx} is equal to the {d+1}-dimensional Lebesgue measure of the region {\{ (x,t) \in {\bf R}^d \times {\bf R}: 0 \leq t \leq f(x) \}}. (This can be used as an alternate, and more geometrically intuitive, definition of the unsigned Lebesgue integral; it is a more convenient formulation for establishing the basic convergence theorems, but not quite as convenient for establishing basic properties such as additivity.) (Hint: use Exercise 22 from Notes 1.)

Exercise 14 Show that the Lebesgue integral {f \mapsto \int_{{\bf R}^d} f(x)\ dx} is the only map from measurable unsigned functions {f: {\bf R}^d \rightarrow [0,+\infty]} to {[0,+\infty]} that obeys the following properties for measurable {f, g: {\bf R}^d \rightarrow [0,+\infty]}:

  • (Compatibility with the simple integral) If {f} is simple, then {\int_{{\bf R}^d} f(x)\ dx = \hbox{Simp} \int_{{\bf R}^d} f(x)\ dx}.
  • (Finite additivity) {\int_{{\bf R}^d} f(x)+g(x)\ dx = \int_{{\bf R}^d} f(x)\ dx + \int_{{\bf R}^d} g(x)\ dx}.
  • (Horizontal truncation) As {n \rightarrow \infty}, {\int_{{\bf R}^d} \min(f(x),n)\ dx} converges to {\int_{{\bf R}^d} f(x)\ dx}.
  • (Vertical truncation) As {n \rightarrow \infty}, {\int_{{\bf R}^d} f(x) 1_{|x| \leq n}\ dx} converges to {\int_{{\bf R}^d} f(x)\ dx}.

Exercise 15 (Translation invariance) Let {f: {\bf R}^d \rightarrow [0,+\infty]} be measurable. Show that {\int_{{\bf R}^d} f(x+y)\ dx = \int_{{\bf R}^d} f(x)\ dx} for any {y \in {\bf R}^d}.

Exercise 16 (Linear change of variables) Let {f: {\bf R}^d \rightarrow [0,+\infty]} be measurable, and let {T: {\bf R}^d \rightarrow {\bf R}^d} be an invertible linear transformation. Show that {\int_{{\bf R}^d} f(T^{-1}(x))\ dx = |\det T| \int_{{\bf R}^d} f(x)\ dx}, or equivalently {\int_{{\bf R}^d} f(T x)\ dx = \frac{1}{|\det T|} \int_{{\bf R}^d} f(x)\ dx}.
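The simplest instance of Exercise 16 can be checked by hand or by machine. In the sketch below we take {d=2} and {f = 1_{[0,1]^2}}, so that {f(T^{-1}(x))} is the indicator of the parallelogram {T([0,1]^2)}; the matrix {T} and the helper names are hypothetical choices for illustration.

```python
from fractions import Fraction

def det2(T):
    (a, b), (c, d) = T
    return a * d - b * c

def apply(T, v):
    (a, b), (c, d) = T
    return (a * v[0] + b * v[1], c * v[0] + d * v[1])

def shoelace_area(polygon):
    """Area of a simple polygon from its vertices listed in order."""
    total = 0
    for (x0, y0), (x1, y1) in zip(polygon, polygon[1:] + polygon[:1]):
        total += x0 * y1 - x1 * y0
    return abs(total) / 2

# Hypothetical invertible matrix with determinant 5.
T = ((Fraction(2), Fraction(1)), (Fraction(1), Fraction(3)))
square = [(0, 0), (1, 0), (1, 1), (0, 1)]
image = [apply(T, v) for v in square]

# Integral of 1_{[0,1]^2}(T^{-1} x) dx = area of the image parallelogram.
area = shoelace_area(image)
```

Here the area of the image equals {|\det T|} times the area of the unit square, as the exercise predicts for this {f}.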

Exercise 17 (Compatibility with the Riemann integral) Let {f: [a,b] \rightarrow [0,+\infty]} be Riemann integrable. If we extend {f} to {{\bf R}} by declaring {f} to equal zero outside of {[a,b]}, show that {\int_{\bf R} f(x)\ dx = \int_a^b f(x)\ dx}.

We record a basic inequality, known as Markov’s inequality, that asserts that the Lebesgue integral of an unsigned measurable function controls how often that function can be large:

Lemma 12 (Markov’s inequality) Let {f: {\bf R}^d \rightarrow [0,+\infty]} be measurable. Then for any {0 < \lambda < \infty}, one has

\displaystyle m( \{ x \in {\bf R}^d: f(x) \geq \lambda \} ) \leq \frac{1}{\lambda} \int_{{\bf R}^d} f(x)\ dx.

Proof: We have the trivial pointwise inequality

\displaystyle \lambda 1_{\{ x \in {\bf R}^d: f(x) \geq \lambda \}}(x) \leq f(x).

From the definition of the lower Lebesgue integral, we conclude that

\displaystyle \lambda m( \{ x \in {\bf R}^d: f(x) \geq \lambda \} ) \leq \int_{{\bf R}^d} f(x)\ dx

and the claim follows. \Box
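Markov's inequality can be verified directly on a simple function. The sketch below represents a simple function by (value, measure) pairs on disjoint measurable sets; the particular data and the helper names are assumptions for illustration, and exact rational arithmetic is used so that the comparison is not affected by rounding.

```python
from fractions import Fraction

# A simple function given as (value, measure) pairs on disjoint measurable sets.
# The values and measures are hypothetical illustrative data.
pieces = [(Fraction(1, 2), Fraction(4)),
          (Fraction(3), Fraction(1)),
          (Fraction(10), Fraction(1, 5))]

def simple_integral(pieces):
    """Simp integral: sum of value times measure."""
    return sum(c * m for c, m in pieces)

def superlevel_measure(pieces, lam):
    """m({x : f(x) >= lam})."""
    return sum(m for c, m in pieces if c >= lam)

def markov_holds(pieces, lam):
    """Check m({f >= lam}) <= (1/lam) * integral of f."""
    return superlevel_measure(pieces, lam) <= simple_integral(pieces) / lam
```

For this data the integral is {7}, and for instance {m(\{f \geq 3\}) = 6/5 \leq 7/3}.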

By sending {\lambda} to infinity or to zero, we obtain the following important corollary:

Exercise 18 Let {f: {\bf R}^d \rightarrow [0,+\infty]} be measurable.

  • Show that if {\int_{{\bf R}^d} f(x)\ dx < \infty}, then {f} is finite almost everywhere. Give a counterexample to show that the converse statement is false.
  • Show that {\int_{{\bf R}^d} f(x)\ dx = 0} if and only if {f} is zero almost everywhere.

Remark 4 The use of the integral {\int_{{\bf R}^d} f(x)\ dx} to control the distribution of {f} is known as the first moment method. One can also control this distribution using higher moments such as {\int_{{\bf R}^d} |f(x)|^p\ dx} for various values of {p}, or exponential moments such as {\int_{{\bf R}^d} e^{tf(x)}\ dx} or the Fourier moments {\int_{{\bf R}^d} e^{itf(x)}\ dx} for various values of {t}; such moment methods are fundamental to probability theory.

— 4. Absolute integrability —

Having set out the theory of the unsigned Lebesgue integral, we can now define the absolutely convergent Lebesgue integral.

Definition 13 (Absolute integrability) An almost everywhere defined measurable function {f: {\bf R}^d \rightarrow {\bf C}} is said to be absolutely integrable if the unsigned integral

\displaystyle \|f\|_{L^1({\bf R}^d)} := \int_{{\bf R}^d} |f(x)|\ dx

is finite. We refer to this quantity {\|f\|_{L^1({\bf R}^d)}} as the {L^1({\bf R}^d)} norm of {f}, and use {L^1({\bf R}^d)} or {L^1({\bf R}^d \rightarrow {\bf C})} to denote the space of absolutely integrable functions. If {f} is real-valued and absolutely integrable, we define the Lebesgue integral {\int_{{\bf R}^d} f(x)\ dx} by the formula

\displaystyle \int_{{\bf R}^d} f(x)\ dx := \int_{{\bf R}^d} f_+(x)\ dx - \int_{{\bf R}^d} f_-(x)\ dx \ \ \ \ \ (7)

 

where {f_+ := \max(f,0)}, {f_- := \max(-f,0)} are the magnitudes of the positive and negative components of {f} (note that the two unsigned integrals on the right-hand side are finite, as {f_+, f_-} are pointwise dominated by {|f|}). If {f} is complex-valued and absolutely integrable, we define the Lebesgue integral {\int_{{\bf R}^d} f(x)\ dx} by the formula

\displaystyle \int_{{\bf R}^d} f(x)\ dx := \int_{{\bf R}^d} \hbox{Re} f(x)\ dx + i \int_{{\bf R}^d} \hbox{Im} f(x)\ dx

where the two integrals on the right are interpreted as real-valued absolutely integrable Lebesgue integrals. It is easy to see that the unsigned, real-valued, and complex-valued Lebesgue integrals defined in this manner are compatible on their common domains of definition.

Note from construction that the absolutely integrable Lebesgue integral extends the absolutely integrable simple integral, which is now redundant and will not be needed any further in the sequel.

Remark 5 One can attempt to define integrals for non-absolutely-integrable functions, analogous to the improper integrals {\int_0^\infty f(x)\ dx := \lim_{R \rightarrow \infty} \int_0^R f(x)\ dx} or the principal value integrals {p.v. \int_{-\infty}^\infty f(x)\ dx := \lim_{R \rightarrow \infty} \int_{-R}^R f(x)\ dx} one sees in the classical one-dimensional Riemann integration theory. While one can certainly generate any number of such extensions of the Lebesgue integral concept, such extensions tend to be poorly behaved with respect to various important operations, such as change of variables or exchanging limits and integrals. As a result, it is usually not worthwhile to try to set up a systematic theory for such non-absolutely-integrable integrals that is anywhere near as complete as the absolutely integrable theory; instead, one deals with such exotic integrals on an ad hoc basis.

From the pointwise triangle inequality {|f(x)+g(x)| \leq |f(x)|+|g(x)|}, we conclude the {L^1} triangle inequality

\displaystyle \|f+g\|_{L^1({\bf R}^d)} \leq \|f\|_{L^1({\bf R}^d)} + \|g\|_{L^1({\bf R}^d)} \ \ \ \ \ (8)

 

for any almost everywhere defined measurable {f, g: {\bf R}^d \rightarrow {\bf C}}. It is also easy to see that

\displaystyle \|c f\|_{L^1({\bf R}^d)} = |c| \|f\|_{L^1({\bf R}^d)}

for any complex number {c}. As such, we see that {L^1({\bf R}^d \rightarrow {\bf C})} is a complex vector space. (The {L^1} norm is then a seminorm on this space, but we will not need to discuss norms and seminorms in detail until 245B.) From Exercise 18 we make the important observation that a function {f \in L^1({\bf R}^d \rightarrow {\bf C})} has zero {L^1} norm, {\|f\|_{L^1({\bf R}^d)}=0}, if and only if {f} is zero almost everywhere.

Given two functions {f, g \in L^1({\bf R}^d \rightarrow {\bf C})}, we can define the {L^1} distance {d_{L^1}(f,g)} between them by the formula

\displaystyle d_{L^1}(f,g) := \|f-g\|_{L^1({\bf R}^d)}.

Thanks to (8), this distance obeys almost all the axioms of a metric on {L^1({\bf R}^d)}, with one exception: it is possible for two different functions {f, g \in L^1({\bf R}^d \rightarrow {\bf C})} to have a zero {L^1} distance, if they agree almost everywhere. As such, {d_{L^1}} is only a semi-metric (also known as a pseudo-metric) rather than a metric. However, if one adopts the convention that any two functions that agree almost everywhere are considered equivalent (or more formally, one works in the quotient space of {L^1({\bf R}^d)} by the equivalence relation of almost everywhere agreement, which by abuse of notation is also denoted {L^1({\bf R}^d)}), then one recovers a genuine metric. (Later on, we will establish the important fact that this metric makes the (quotient space) {L^1({\bf R}^d)} a complete metric space, a fact known as the {L^1} Riesz-Fischer theorem; this completeness is one of the main reasons we spend so much effort setting up Lebesgue integration theory in the first place.)

The linearity properties of the unsigned integral induce analogous linearity properties of the absolutely convergent Lebesgue integral:

Exercise 19 (Integration is linear) Show that integration {f \mapsto \int_{{\bf R}^d} f(x)\ dx} is a (complex) linear operation from {L^1({\bf R}^d)} to {{\bf C}}. In other words, show that

\displaystyle \int_{{\bf R}^d} f(x) + g(x)\ dx = \int_{{\bf R}^d} f(x)\ dx + \int_{{\bf R}^d} g(x)\ dx

and

\displaystyle \int_{{\bf R}^d} cf(x)\ dx = c\int_{{\bf R}^d} f(x)\ dx

for all absolutely integrable {f, g: {\bf R}^d \rightarrow {\bf C}} and complex numbers {c}. Also establish the identity

\displaystyle \int_{{\bf R}^d} \overline{f(x)}\ dx = \overline{\int_{{\bf R}^d} f(x)\ dx},

which makes integration not just a linear operation, but a *-linear operation.

Exercise 20 Show that Exercises 15, 16, and 17 also hold for complex-valued, absolutely integrable functions rather than for unsigned measurable functions.

Exercise 21 (Absolute summability is a special case of absolute integrability) Let {(c_n)_{n \in {\bf Z}}} be a doubly infinite sequence of complex numbers, and let {f: {\bf R} \rightarrow {\bf C}} be the function

\displaystyle f(x) := \sum_{n \in {\bf Z}} c_n 1_{[n,n+1)}(x) = c_{\lfloor x \rfloor}

where {\lfloor x \rfloor} is the greatest integer less than or equal to {x}. Show that {f} is absolutely integrable if and only if the series {\sum_{n \in {\bf Z}} c_n} is absolutely convergent, in which case one has {\int_{\bf R} f(x)\ dx = \sum_{n \in {\bf Z}} c_n}.
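The correspondence in Exercise 21 can be illustrated as follows. The sequence {c_n = 2^{-|n|}} is a hypothetical example, and `truncated_integral` is an illustrative helper; the floor is computed with `math.floor`, which rounds towards minus infinity, so every {x \in [n,n+1)} is sent to {n}.

```python
import math
from fractions import Fraction

def c(n):
    """Hypothetical absolutely summable sequence c_n = 2^{-|n|}."""
    return Fraction(1, 2 ** abs(n))

def f(x):
    """f(x) = c_{floor(x)}, constant on each interval [n, n+1)."""
    return c(math.floor(x))

def truncated_integral(N):
    """Integral of f over [-N, N): each interval [n, n+1) has measure 1."""
    return sum(c(n) for n in range(-N, N))
```

For this choice of {c_n} the truncated integrals equal {3 - 3 \cdot 2^{-N}}, which increases to {\sum_{n \in {\bf Z}} 2^{-|n|} = 3}.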

We can localise the absolutely convergent integral to any measurable subset {E} of {{\bf R}^d}. Indeed, if {f: E \rightarrow {\bf C}} is a function, we say that {f} is measurable (resp. absolutely integrable) if its extension {\tilde f: {\bf R}^d \rightarrow {\bf C}} is measurable (resp. absolutely integrable), where {\tilde f(x)} is defined to equal {f(x)} when {x \in E} and zero otherwise, and then we define {\int_E f(x)\ dx := \int_{{\bf R}^d} \tilde f(x)\ dx}. Thus, for instance, the absolutely integrable analogue of Exercise 17 tells us that

\displaystyle \int_a^b f(x)\ dx = \int_{[a,b]} f(x)\ dx

for any Riemann-integrable {f: [a,b] \rightarrow {\bf C}}.

Exercise 22 If {E, F} are disjoint measurable subsets of {{\bf R}^d}, and {f: E \cup F \rightarrow {\bf C}} is absolutely integrable, show that

\displaystyle \int_E f(x)\ dx = \int_{E \cup F} f(x) 1_E(x)\ dx

and

\displaystyle \int_E f(x)\ dx + \int_F f(x)\ dx = \int_{E \cup F} f(x)\ dx.

We will study the properties of the absolutely convergent Lebesgue integral in more detail in later notes, as a special case of the more general Lebesgue integration theory on abstract measure spaces. For now, we record one very basic inequality:

Lemma 14 (Triangle inequality) Let {f \in L^1({\bf R}^d \rightarrow {\bf C})}. Then

\displaystyle |\int_{{\bf R}^d} f(x)\ dx| \leq \int_{{\bf R}^d} |f(x)|\ dx.

Proof: If {f} is real-valued, then {|f| = f_+ + f_-} and the claim is obvious from (7). When {f} is complex-valued, one cannot argue quite so simply; a naive mimicking of the real-valued argument would lose a factor of {2}, giving the inferior bound

\displaystyle |\int_{{\bf R}^d} f(x)\ dx| \leq 2\int_{{\bf R}^d} |f(x)|\ dx.

To do better, we exploit the phase rotation invariance properties of the absolute value operation and of the integral, as follows. Note that for any complex number {z}, one can write {|z|} as {z e^{i\theta}} for some real {\theta}. In particular, we have

\displaystyle |\int_{{\bf R}^d} f(x)\ dx| = e^{i\theta} \int_{{\bf R}^d} f(x)\ dx = \int_{{\bf R}^d} e^{i\theta} f(x)\ dx

for some real {\theta}. Taking real parts of both sides, we obtain

\displaystyle |\int_{{\bf R}^d} f(x)\ dx| = \int_{{\bf R}^d} \hbox{Re}( e^{i\theta} f(x) )\ dx.

Since {\hbox{Re}(e^{i\theta} f(x)) \leq |e^{i\theta} f(x)| = |f(x)|}, we obtain the claim. \Box
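The phase rotation step in the proof of Lemma 14 can be followed numerically on a complex-valued simple function. In the sketch below the (value, measure) pairs are hypothetical data, and {\theta} is taken to be minus the argument of the integral, which is one valid choice when the integral is non-zero.

```python
import cmath

# Simple function given as (complex value, measure) pairs on disjoint sets
# (hypothetical illustrative data).
pieces = [(1 + 2j, 1.0), (-3 + 0.5j, 2.0), (0.5 - 4j, 0.5)]

integral = sum(z * m for z, m in pieces)

# Choose theta so that e^{i theta} * integral = |integral|.
theta = -cmath.phase(integral)
rotation = cmath.exp(1j * theta)

# Integral of Re(e^{i theta} f), which the proof shows equals |integral of f|.
rotated_real_part = sum((rotation * z).real * m for z, m in pieces)

# L^1 norm of f, which dominates the quantity above pointwise.
l1_norm = sum(abs(z) * m for z, m in pieces)
```

Up to rounding error, `rotated_real_part` agrees with `abs(integral)`, and both are bounded by `l1_norm` without any loss of a factor of {2}.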

— 5. Littlewood’s three principles —

Littlewood’s three principles are informal heuristics that convey much of the basic intuition behind the measure theory of Lebesgue. Briefly, the three principles are as follows:

  1. Every (measurable) set is nearly a finite sum of intervals;
  2. Every (absolutely integrable) function is nearly continuous; and
  3. Every (pointwise) convergent sequence of functions is nearly uniformly convergent.

Various manifestations of the first principle were given in the previous set of notes (and specifically, in Exercises 7 and 14 of those notes). We now discuss the second principle. Define a step function to be a finite linear combination of indicator functions of boxes.

Theorem 15 (Approximation of {L^1} functions) Let {f \in L^1({\bf R}^d)} and {\epsilon > 0}.

  1. There exists an absolutely integrable simple function {g} such that {\|f-g\|_{L^1({\bf R}^d)} \leq \epsilon}.
  2. There exists a step function {g} such that {\|f-g\|_{L^1({\bf R}^d)} \leq \epsilon}.
  3. There exists a continuous, compactly supported {g} such that {\|f-g\|_{L^1({\bf R}^d)} \leq \epsilon}.

To put things another way, the absolutely integrable simple functions, the step functions, and the continuous, compactly supported functions are all dense subsets of {L^1({\bf R}^d)} with respect to the {L^1({\bf R}^d)} (semi-)metric. Much later in the course (in 245C), we will see that a similar statement holds if one replaces continuous, compactly supported functions with smooth, compactly supported functions, also known as test functions; this is an important fact for the theory of distributions.

Proof: We begin with part (1.). When {f} is unsigned, we see from the definition of the lower Lebesgue integral that there exists an unsigned simple function {g} such that {g \leq f} (so, in particular, {g} is absolutely integrable) and

\displaystyle \int_{{\bf R}^d} g(x)\ dx \geq \int_{{\bf R}^d} f(x)\ dx - \epsilon,

which by linearity implies that {\|f-g\|_{L^1({\bf R}^d)} \leq \epsilon}. This gives (1.) when {f} is unsigned. The case when {f} is real-valued then follows by splitting {f} into positive and negative parts (and adjusting {\epsilon} as necessary), and the case when {f} is complex-valued then follows by splitting {f} into real and imaginary parts (and adjusting {\epsilon} yet again).

To establish part (2.), we see from (1.) and the triangle inequality in {L^1} that it suffices to show this when {f} is an absolutely integrable simple function. By linearity (and more applications of the triangle inequality), it then suffices to show this when {f=1_E} is the indicator function of a measurable set {E \subset {\bf R}^d} of finite measure. But then, by Exercise 14 of Notes 1, such a set can be approximated (up to an error of measure at most {\epsilon}) by an elementary set, and the claim follows.

To establish part (3.), we see from (2.) and the argument from the preceding paragraph that it suffices to show this when {f = 1_E} is the indicator function of a box. But one can then establish the claim by direct construction. Indeed, if one makes a slightly larger box {F} that contains the closure of {E} in its interior, but has a volume at most {\epsilon} more than that of {E}, then one can directly construct a piecewise linear continuous function {g} supported on {F} that equals {1} on {E} (e.g. one can set {g(x) = \max(1 - R \hbox{dist}(x,E),0)} for some sufficiently large {R}; one may also invoke Urysohn’s lemma, which we will cover in 245B). It is then clear from construction that {\|f-g\|_{L^1({\bf R}^d)} \leq \epsilon} as required. \Box
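In one dimension the construction {g(x) = \max(1 - R \hbox{dist}(x,E),0)} can be made completely explicit. The sketch below takes {E = [a,b]}; the helper names and the specific values of {a}, {b}, {\epsilon} are assumptions for illustration. In this case {g} is a trapezoid supported on {F = [a-1/R, b+1/R]}, and the {L^1} error is the area of the two linear ramps.

```python
from fractions import Fraction

def dist_to_interval(x, a, b):
    """Distance from x to the interval [a, b]."""
    return max(a - x, x - b, 0)

def g(x, a, b, R):
    """g(x) = max(1 - R * dist(x, E), 0) for E = [a, b]."""
    return max(1 - R * dist_to_interval(x, a, b), 0)

def l1_error(a, b, R):
    """||1_E - g||_{L^1}: two triangular ramps, each of base 1/R and height 1."""
    ramp = Fraction(1, 2) * Fraction(1, R)
    return 2 * ramp

a, b, eps = Fraction(0), Fraction(1), Fraction(1, 100)
R = 1 / eps   # in this one-dimensional case any R >= 1/eps suffices
```

With these values {g} equals {1} on {E}, vanishes outside {F}, and the error is exactly {1/R = \epsilon}. In higher dimensions the required size of {R} also depends on the surface area of the box.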

This is not the only way to make Littlewood’s second principle manifest; we return to this point shortly. For now, we turn to Littlewood’s third principle. We recall three basic ways in which a sequence {f_n: {\bf R}^d \rightarrow {\bf C}} of functions can converge to a limit {f: {\bf R}^d \rightarrow {\bf C}}:

  • (Pointwise convergence) {f_n(x) \rightarrow f(x)} for every {x \in {\bf R}^d}.
  • (Pointwise almost everywhere convergence) {f_n(x) \rightarrow f(x)} for almost every {x \in {\bf R}^d}.
  • (Uniform convergence) For every {\epsilon > 0}, there exists {N} such that {|f_n(x)-f(x)| \leq \epsilon} for all {n \geq N} and all {x \in {\bf R}^d}.

Uniform convergence implies pointwise convergence, which in turn implies pointwise almost everywhere convergence.

We now add a fourth mode of convergence, that is weaker than uniform convergence but stronger than pointwise convergence:

Definition 16 (Locally uniform convergence) A sequence of functions {f_n: {\bf R}^d \rightarrow {\bf C}} converges locally uniformly to a limit {f: {\bf R}^d \rightarrow {\bf C}} if, for every bounded subset {E} of {{\bf R}^d}, {f_n} converges uniformly to {f} on {E}. In other words, for every bounded {E \subset {\bf R}^d} and every {\epsilon > 0}, there exists {N > 0} such that {|f_n(x) - f(x)| \leq \epsilon} for all {n \geq N} and {x \in E}.

Remark 6 At least as far as {{\bf R}^d} is concerned, an equivalent definition of local uniform convergence is: {f_n} converges locally uniformly to {f} if, for every point {x_0 \in {\bf R}^d}, there exists an open neighbourhood {U} of {x_0} such that {f_n} converges uniformly to {f} on {U}. The equivalence of the two definitions is immediate from the Heine-Borel theorem. More generally, the adverb “locally” in mathematics is usually used in this fashion; a property {P} is said to hold locally on some domain {X} if, for every point {x_0} in that domain, there is an open neighbourhood of {x_0} in {X} on which {P} holds.

One should caution, though, that on domains on which the Heine-Borel theorem does not hold, the bounded-set notion of local uniform convergence is not equivalent to the open-set notion of local uniform convergence (though, for locally compact spaces, one can recover equivalence if one replaces “bounded” by “compact”).

Example 1 The functions {x \mapsto x/n} on {{\bf R}} for {n=1,2,\ldots} converge locally uniformly (and hence pointwise) to zero on {{\bf R}}, but do not converge uniformly.

Example 2 The partial sums {\sum_{n=0}^N \frac{x^n}{n!}} of the Taylor series {e^x = \sum_{n=0}^\infty \frac{x^n}{n!}} converge to {e^x} locally uniformly (and hence pointwise) on {{\bf R}}, but not uniformly.

Example 3 The functions {f_n(x) := \frac{1}{nx} 1_{x>0}} for {n=1,2,\ldots} (with the convention that {f_n(0)=0}) converge pointwise everywhere to zero, but do not converge locally uniformly.
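The difference between Examples 1 and 3 shows up in a short computation. The sketch below uses hypothetical helper names, a hypothetical radius {R=5}, and the sample points {x = 0.25} and {x_n = 1/n^2}. It is a numerical illustration rather than a proof.

```python
def sup_example1(n, R):
    """sup over |x| <= R of |x / n|, attained at x = R (Example 1)."""
    return R / n

def f3(n, x):
    """Example 3: f_n(x) = 1/(n x) for x > 0, and 0 otherwise."""
    return 1.0 / (n * x) if x > 0 else 0.0

ns = [1, 10, 100, 1000]

# Example 1: the sup over the bounded set [-5, 5] tends to zero.
bounded_sups = [sup_example1(n, 5.0) for n in ns]

# Example 3: at the fixed point x = 0.25 the values tend to zero ...
pointwise = [f3(n, 0.25) for n in ns]

# ... but f_n(1/n^2) = n, so the sup over the bounded set (0, 1] is at least n.
witnesses = [f3(n, 1.0 / n ** 2) for n in ns]
```

The moving sample points {x_n = 1/n^2} all lie in the bounded set {(0,1]}, which is why uniform convergence fails there even though each fixed {x} is eventually handled.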

From the preceding example, we see that pointwise convergence (either everywhere or almost everywhere) is a weaker concept than local uniform convergence. Nevertheless, a remarkable theorem of Egorov, which demonstrates Littlewood’s third principle, asserts that one can recover local uniform convergence as long as one is willing to delete a set of small measure:

Theorem 17 (Egorov’s theorem) Let {f_n: {\bf R}^d \rightarrow {\bf C}} be a sequence of measurable functions that converge pointwise almost everywhere to another function {f: {\bf R}^d \rightarrow {\bf C}}, and let {\epsilon > 0}. Then there exists a Lebesgue measurable set {A} of measure at most {\epsilon}, such that {f_n} converges locally uniformly to {f} outside of {A}.

Note that Example 3 demonstrates that the exceptional set {A} in Egorov’s theorem cannot be taken to have zero measure, at least if one uses the bounded-set definition of local uniform convergence from Definition 16. (If one instead takes the “open neighbourhood” definition, then the sequence in Example 3 does converge locally uniformly on {{\bf R} \backslash \{0\}} in the open neighbourhood sense, even if it does not do so in the bounded-set sense. On a domain such as {{\bf R}^d \backslash A}, bounded-set locally uniform convergence implies open-neighbourhood locally uniform convergence, but not conversely, so for the purposes of applying Egorov’s theorem, the distinction is not too important since one has local uniform convergence in both senses.)

Proof: By modifying {f_n} and {f} on a set of measure zero (that can be absorbed into {A} at the end of the argument) we may assume that {f_n} converges pointwise everywhere to {f}, thus for every {x \in {\bf R}^d} and {m > 0} there exists {N \geq 0} such that {|f_n(x)-f(x)| \leq 1/m} for all {n \geq N}. We can rewrite this fact set-theoretically as

\displaystyle \bigcap_{N=0}^\infty E_{N,m} = \emptyset

for each {m}, where

\displaystyle E_{N,m} := \{ x \in {\bf R}^d: |f_n(x)-f(x)| > 1/m \hbox{ for some } n \geq N \}.

It is clear that the {E_{N,m}} are Lebesgue measurable, and are decreasing in {N}. Applying downward monotone convergence (Exercise 9 of Notes 1) we conclude that, for any radius {R>0}, one has

\displaystyle \lim_{N \rightarrow \infty} m( E_{N,m} \cap B(0,R) ) = 0.

(The restriction to the ball {B(0,R)} is necessary, because the downward monotone convergence property only works when the sets involved have finite measure.) In particular, for any {m \geq 1}, we can find {N_m} such that

\displaystyle m( E_{N,m} \cap B(0,m) ) \leq \frac{\epsilon}{2^m}

for all {N \geq N_m}.

Now let {A := \bigcup_{m=1}^\infty E_{N_m,m} \cap B(0,m)}. Then {A} is Lebesgue measurable, and by countable subadditivity, {m(A) \leq \epsilon}. By construction, we have

\displaystyle |f_n(x)-f(x)| \leq 1/m

whenever {m \geq 1}, {x \in {\bf R}^d \backslash A}, {|x| \leq m}, and {n \geq N_m}. In particular, we see that for any ball {B(0,m_0)} with integer radius {m_0}, {f_n} converges uniformly to {f} on {B(0,m_0) \backslash A} (apply the preceding bound with any {m \geq m_0}). Since every bounded set is contained in such a ball, the claim follows. \Box

Remark 7 Unfortunately, one cannot in general upgrade local uniform convergence to uniform convergence in Egorov’s theorem. A basic example here is the moving bump example {f_n := 1_{[n,n+1]}} on {{\bf R}}. This sequence converges pointwise (and locally uniformly) to the zero function {f \equiv 0}. However, for any {0 < \epsilon < 1} and any {n}, we have {|f_n(x) - f(x)| > \epsilon} on a set of measure {1}, namely on the interval {[n,n+1]}. Thus, if one wanted {f_n} to converge uniformly to {f} outside of a set {A}, then that set {A} has to contain a set of measure {1}. In fact, it must contain the intervals {[n,n+1]} for all sufficiently large {n} and must therefore have infinite measure.

However, if all the {f_n} and {f} were supported on a fixed set {E} of finite measure (e.g. on a ball {B(0,R)}), then the above “escape to horizontal infinity” cannot occur, and it is easy to see from the above argument that one can recover uniform convergence (and not just locally uniform convergence) outside of a set of arbitrarily small measure.
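The moving bump example of Remark 7 can be checked mechanically. The following Python sketch (the function names `bump_sup` and `forced_measure` are hypothetical, introduced only for this illustration) computes the exact supremum of {f_n = 1_{[n,n+1]}} over a bounded window {[a,b]}, and the measure of the intervals that any exceptional set {A} would be forced to contain.

```python
def bump_sup(n, a, b):
    """Exact sup of the indicator 1_[n, n+1] over the interval [a, b]."""
    # The supremum is 1 exactly when [n, n+1] meets [a, b], and 0 otherwise.
    return 1 if (n <= b and a <= n + 1) else 0


def forced_measure(first_n, last_n):
    """Lebesgue measure of the union of [n, n+1] for first_n <= n <= last_n."""
    # Consecutive unit intervals overlap only at endpoints, so the union is
    # the single interval [first_n, last_n + 1].
    return (last_n + 1) - first_n


if __name__ == "__main__":
    # On the window [-3, 3] the bump has left for good once n > 3 ...
    print([bump_sup(n, -3, 3) for n in range(8)])
    # ... but the set on which f_n differs from 0 keeps accumulating measure.
    print([forced_measure(4, last) for last in (4, 14, 104)])
```

On any fixed window the supremum is eventually zero, which is the local uniform convergence; the second computation reflects why no set of finite measure can absorb all the bumps.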

We now use Theorem 15 to give another version of Littlewood’s second principle, known as Lusin’s theorem:

Theorem 18 (Lusin’s theorem) Let {f: {\bf R}^d \rightarrow {\bf C}} be absolutely integrable, and let {\epsilon > 0}. Then there exists a Lebesgue measurable set {E \subset {\bf R}^d} of measure at most {\epsilon} such that the restriction of {f} to the complementary set {{\bf R}^d \backslash E} is continuous on that set.

Caution: this theorem does not imply that the unrestricted function {f} is continuous on {{\bf R}^d \backslash E}. For instance, the absolutely integrable function {1_{\bf Q}: {\bf R} \rightarrow {\bf C}} is nowhere continuous, so is certainly not continuous on {{\bf R} \backslash E} for any {E} of finite measure; but on the other hand, if one deletes the measure zero set {E := {\bf Q}} from the reals, then the restriction of {f} to {{\bf R} \backslash E} is identically zero and thus continuous.

Proof: By Theorem 15, for any {n \geq 1} one can find a continuous, compactly supported function {f_n} such that {\|f-f_n\|_{L^1({\bf R}^d)} \leq \epsilon/4^n} (say). By Markov’s inequality, this implies that {|f(x)-f_n(x)| \leq 1/2^{n}} for all {x} outside of a Lebesgue measurable set {E_n} of measure at most {\epsilon/2^{n}}. Letting {E := \bigcup_{n=1}^\infty E_n}, we conclude that {E} is Lebesgue measurable with measure at most {\epsilon}, and {f_n} converges uniformly to {f} outside of {E}. But the uniform limit of continuous functions is continuous. We conclude that the restriction of {f} to {{\bf R}^d \backslash E} is continuous, as required. \Box
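The bookkeeping in this proof (an {L^1} error of {\epsilon/4^n}, a threshold of {1/2^n}, and hence a Markov bound of {\epsilon/2^n}) can be verified with exact rational arithmetic. The sketch below is only a check of the arithmetic under the choices made in the proof; the names `markov_bound` and `exceptional_measures` are hypothetical, and `eps` is assumed to be passed as a `Fraction`.

```python
from fractions import Fraction


def markov_bound(l1_norm, threshold):
    """Markov's inequality: m({|g| > threshold}) <= ||g||_{L^1} / threshold."""
    return l1_norm / threshold


def exceptional_measures(eps, terms):
    """Markov bounds on m(E_n) for n = 1..terms, with the proof's choices."""
    return [
        markov_bound(eps / 4**n, Fraction(1, 2**n))
        for n in range(1, terms + 1)
    ]


if __name__ == "__main__":
    eps = Fraction(1, 10)
    bounds = exceptional_measures(eps, 20)
    # Each bound equals eps / 2**n, and every partial sum stays below eps,
    # which is what countable subadditivity needs to give m(E) <= eps.
    print(bounds[:3], sum(bounds) < eps)
```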

Exercise 23 Show that the hypothesis that {f} is absolutely integrable in Lusin’s theorem can be relaxed to being locally absolutely integrable (i.e. absolutely integrable on every bounded set), and then relaxed further to that of being measurable (but still finite everywhere or almost everywhere). (To achieve the latter goal, one can replace {f} locally with a horizontal truncation {f 1_{|f| \leq n}}; alternatively, one can replace {f} with a bounded variant, such as {\frac{f}{(1+|f|^2)^{1/2}}}.)
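One way to read the second hint is that the map {z \mapsto \frac{z}{(1+|z|^2)^{1/2}}} sends {{\bf C}} continuously into the open unit disk and can be undone by the continuous map {w \mapsto \frac{w}{(1-|w|^2)^{1/2}}}, so continuity statements about the bounded variant can be transferred back to {f}. The Python sketch below merely checks this round trip numerically at a few sample points; the names `squash` and `unsquash` are hypothetical, and this is not a solution to the exercise.

```python
import math


def squash(z):
    """Bounded variant z / (1 + |z|^2)^(1/2); the result has modulus < 1."""
    return z / math.sqrt(1.0 + abs(z) ** 2)


def unsquash(w):
    """Inverse map w / (1 - |w|^2)^(1/2), defined for |w| < 1."""
    return w / math.sqrt(1.0 - abs(w) ** 2)


if __name__ == "__main__":
    for z in (0, 1 + 2j, -3.5, 10j):
        w = squash(z)
        # The modulus of w is below 1, and unsquash recovers z up to rounding.
        print(z, abs(w), abs(unsquash(w) - z))
```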

Exercise 24 Show that a function {f: {\bf R}^d \rightarrow {\bf C}} is measurable if and only if it is the pointwise almost everywhere limit of continuous functions {f_n: {\bf R}^d \rightarrow {\bf C}}. (Hint: if {f: {\bf R}^d \rightarrow {\bf C}} is measurable and {n \geq 1}, show that there exists a continuous function {f_n: {\bf R}^d \rightarrow {\bf C}} for which the set {\{ x \in B(0,n): |f(x)-f_n(x)| \geq 1/n \}} has measure at most {\frac{1}{2^n}}. You may find Exercise 25 below to be useful for this.) Use this (and Egorov’s theorem) to give an alternate proof of Lusin’s theorem for arbitrary measurable functions.

Remark 8 This is a trivial but important remark: when dealing with unsigned measurable functions such as {f: {\bf R}^d \rightarrow [0,+\infty]}, Lusin’s theorem does not apply directly, because {f} could be infinite on a set of positive measure, which is clearly in contradiction with the conclusion of Lusin’s theorem (unless one allows the continuous function to also take values in the extended non-negative reals {[0,+\infty]} with the extended topology). However, if one knows already that {f} is almost everywhere finite (which is for instance the case when {f} is absolutely integrable), then Lusin’s theorem applies (since one can simply zero out {f} on the null set where it is infinite, and add that null set to the exceptional set of Lusin’s theorem).

Remark 9 By combining Lusin’s theorem with inner regularity (Exercise 13 from Notes 1) and the Tietze extension theorem (which we will cover in 245B), one can conclude that every measurable function {f: {\bf R}^d \rightarrow {\bf C}} agrees (outside of a set of arbitrarily small measure) with a continuous function {g: {\bf R}^d \rightarrow {\bf C}}.

Exercise 25 (Littlewood-like principles) The following facts are not, strictly speaking, instances of any of Littlewood’s three principles, but are in a similar spirit.

  • (Absolutely integrable functions almost have bounded support) Let {f: {\bf R}^d \rightarrow {\bf C}} be an absolutely integrable function, and let {\epsilon > 0}. Show that there exists a ball {B(0,R)} outside of which {f} has an {L^1} norm of at most {\epsilon}, or in other words that {\int_{{\bf R}^d \backslash B(0,R)} |f(x)|\ dx \leq \epsilon}.
  • (Measurable functions are almost locally bounded) Let {f: {\bf R}^d \rightarrow {\bf C}} be a measurable function supported on a set of finite measure, and let {\epsilon > 0}. Show that there exists a measurable set {E \subset {\bf R}^d} of measure at most {\epsilon} outside of which {f} is locally bounded, or in other words that for every {R > 0} there exists {M < \infty} such that {|f(x)| \leq M} for all {x \in B(0,R) \backslash E}.

As with Remark 8, it is important in the second part of the exercise that {f} is known to be finite everywhere (or at least almost everywhere); the result would of course fail if {f} was, say, unsigned but took the value {+\infty} on a set of positive measure.
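To make the first part of Exercise 25 concrete, consider the specific absolutely integrable function {f(x) = e^{-|x|}} on {{\bf R}}. This function is an assumption chosen for illustration and does not appear in the exercise. Its tail satisfies {\int_{|x| > R} e^{-|x|}\ dx = 2 e^{-R}}, so any {R \geq \log(2/\epsilon)} works. The Python sketch below, with hypothetical helper names `tail_l1` and `radius_for`, picks an integer radius of this kind.

```python
import math


def tail_l1(radius):
    """Exact integral of exp(-|x|) over |x| > radius, for radius >= 0."""
    return 2.0 * math.exp(-radius)


def radius_for(eps):
    """An integer radius R >= 1 with tail_l1(R) <= eps."""
    # Solving 2 * exp(-R) = eps gives R = log(2 / eps); rounding up to an
    # integer keeps the inequality and avoids floating-point edge cases.
    return max(math.ceil(math.log(2.0 / eps)), 1)


if __name__ == "__main__":
    for eps in (5.0, 1e-2, 1e-6):
        R = radius_for(eps)
        print(eps, R, tail_l1(R) <= eps)
```

For a general absolutely integrable {f} no such closed form is available, of course; the example only shows what the quantifiers in the exercise are asking for.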