Now that we have reviewed the foundations of measure theory, let us put it to work to set up the basic theory of one of the fundamental families of function spaces in analysis, namely the $L^p$ spaces (also known as Lebesgue spaces). These spaces serve as important model examples for the general theory of topological and normed vector spaces, which we will discuss a little bit in this lecture and then in much greater detail in later lectures. (See also my previous blog post on function spaces.)
Just as scalar quantities live in the space of real or complex numbers, and vector quantities live in vector spaces, functions (or other objects closely related to functions, such as measures) live in function spaces. Like other spaces in mathematics (e.g. vector spaces, metric spaces, topological spaces, etc.) a function space $V$ is not just a mere set of objects (in this case, the objects are functions), but also comes with various important structures that allow one to do some useful operations inside the space, and from one space to another. For example, function spaces tend to have several (though usually not all) of the following types of structures, which are usually related to each other by various compatibility conditions:
- Vector space structure. One can often add two functions $f, g$ in a function space $V$, and expect to get another function $f+g$ in that space $V$; similarly, one can multiply a function $f$ in $V$ by a scalar $c$ and get another function $cf$ in $V$. Usually, these operations obey the axioms of a vector space, though it is important to caution that the dimension of a function space is typically infinite. (In some cases, the space of scalars is a more complicated ring than the real or complex field, in which case we need the notion of a module rather than a vector space, but we will not use this more general notion in this course.) Virtually all of the function spaces we shall encounter in this course will be vector spaces. Because the field of scalars is real or complex, vector spaces also come with the notion of convexity, which turns out to be crucial in many aspects of analysis. As a consequence (and in marked contrast to algebra or number theory), much of the theory in real analysis does not seem to extend to other fields of scalars (in particular, real analysis fails spectacularly in the finite characteristic setting).
- Algebra structure. Sometimes (though not always), we also wish to multiply two functions $f$, $g$ in $V$ and get another function $fg$ in $V$; when combined with the vector space structure and assuming some compatibility conditions (e.g. the distributive law), this makes $V$ an algebra. This multiplication operation is often just pointwise multiplication, but there are other important multiplication operations on function spaces too, such as convolution. (One sometimes sees other algebraic structures than multiplication appear in function spaces, most notably derivations, but again we will not encounter those in this course. Another common algebraic operation for function spaces is conjugation or adjoint, leading to the notion of a *-algebra.)
- Norm structure. We often want to distinguish “large” functions in $V$ from “small” ones, especially in analysis, in which “small” terms in an expression are routinely discarded or deemed to be acceptable errors. One way to do this is to assign a magnitude or norm $\|f\|_V$ to each function that measures its size. Unlike the situation with scalars, where there is basically a single notion of magnitude, functions have a wide variety of useful notions of size, each measuring a different aspect (or combination of aspects) of the function, such as height, width, oscillation, regularity, decay, and so forth. Typically, each such norm gives rise to a separate function space (although sometimes it is useful to consider a single function space with multiple norms on it). We usually require the norm to be compatible with the vector space structure (and algebra structure, if present), for instance by demanding that the triangle inequality hold.
- Metric structure. We also want to tell whether two functions $f, g$ in a function space $V$ are “near together” or “far apart”. A typical way to do this is to impose a metric $d: V \times V \to [0,+\infty)$ on the space $V$. If both a norm $\| \cdot \|_V$ and a vector space structure are available, there is an obvious way to do this: define the distance between two functions $f, g$ in $V$ to be $d(f,g) := \|f-g\|_V$. (This will be the only type of metric on function spaces encountered in this course. But there are some nonlinear function spaces of importance in nonlinear analysis (e.g. spaces of maps from one manifold to another) which have no vector space structure or norm, but still have a metric.) It is often important to know if the vector space is complete with respect to the given metric; this allows one to take limits of Cauchy sequences, and (with a norm and vector space structure) sum absolutely convergent series, as well as use some useful results from point set topology such as the Baire category theorem. All of these operations are of course vital in analysis. [Compactness would be an even better property than completeness to have, but function spaces unfortunately tend to be non-compact in various rather nasty ways, although there are useful partial substitutes for compactness that are available, see e.g. this blog post of mine.]
- Topological structure. It is often important to know when a sequence (or, occasionally, a net) of functions $f_n$ in $V$ “converges” in some sense to a limit $f$ (which, hopefully, is still in $V$); there are often many distinct modes of convergence (e.g. pointwise convergence, uniform convergence, etc.) that one wishes to carefully distinguish from each other. Also, in order to apply various powerful topological theorems (or to justify various formal operations involving limits, suprema, etc.), it is important to know when certain subsets of $V$ enjoy key topological properties (most notably compactness and connectedness), and to know which operations on $V$ are continuous. For all of this, one needs a topology on $V$. If one already has a metric, then one of course has a topology generated by the open balls of that metric; but there are many important topologies on function spaces in analysis that do not arise from metrics. We also often require the topology to be compatible with the other structures on the function space; for instance, we usually require the vector space operations of addition and scalar multiplication to be continuous. In some cases, the topology on $V$ extends to some natural superspace $W$ of more general functions that contains $V$; in such cases, it is often important to know whether $V$ is closed in $W$, so that limits of sequences in $V$ stay in $V$.
- Functional structures. Since numbers are easier to understand and deal with than functions, it is not surprising that we often study functions $f$ in a function space $V$ by first applying some functional $\lambda: V \to \mathbf{C}$ to $V$ to identify some key numerical quantity $\lambda(f)$ associated to $f$. Norms $f \mapsto \|f\|_V$ are of course one important example of a functional; integration $f \mapsto \int_X f\ d\mu$ provides another; and evaluation $f \mapsto f(x)$ at a point $x$ provides a third important class. (Note, though, that while evaluation is the fundamental feature of a function in set theory, it is often a quite minor operation in analysis; indeed, in many function spaces, evaluation is not even defined at all, for instance because the functions in the space are only defined almost everywhere!) An inner product $\langle \cdot, \cdot \rangle$ on $V$ (see below) also provides a large family $f \mapsto \langle f, g \rangle$ of useful functionals. It is of particular interest to study functionals that are compatible with the vector space structure (i.e. are linear) and with the topological structure (i.e. are continuous); this will give rise to the important notion of duality on function spaces.
- Inner product structure. One often would like to pair a function $f$ in a function space $V$ with another object $g$ (which is often, though not always, another function in the same function space $V$) and obtain a number $\langle f, g \rangle$, that typically measures the amount of “interaction” or “correlation” between $f$ and $g$. Typical examples include inner products arising from integration, such as $\langle f, g \rangle := \int_X f \overline{g}\ d\mu$; integration itself can also be viewed as a pairing, $\int_X f\ d\mu = \langle f, \mu \rangle$. Of course, we usually require such inner products to be compatible with the other structures present on the space (e.g., to be compatible with the vector space structure, we usually require the inner product to be bilinear or sesquilinear). Inner products, when available, are incredibly useful in understanding the metric and norm geometry of a space, due to such fundamental facts as the Cauchy-Schwarz inequality and the parallelogram law. They also give rise to the important notion of orthogonality between functions.
- Group actions. We often expect our function spaces to enjoy various symmetries; we might wish to rotate, reflect, translate, modulate, or dilate our functions and expect to preserve most of the structure of the space when doing so. In modern mathematics, symmetries are usually encoded by group actions (or actions of other group-like objects, such as semigroups or groupoids; one also often upgrades groups to more structured objects such as Lie groups). As usual, we typically require the group action to preserve the other structures present on the space, e.g. one often restricts attention to group actions that are linear (to preserve the vector space structure), continuous (to preserve topological structure), unitary (to preserve inner product structure), isometric (to preserve metric structure), and so forth. Besides giving us useful symmetries to spend, the presence of such group actions allows one to apply the powerful techniques of representation theory, Fourier analysis, and ergodic theory. However, as this is a foundational real analysis class, we will not discuss these important topics much here (and in fact will not deal with group actions much at all).
- Order structure. In some cases, we want to utilise the notion of a function f being “non-negative”, or “dominating” another function g. One might also want to take the “max” or “supremum” of two or more functions in a function space V, or split a function into “positive” and “negative” components. Such order structures interact with the other structures on a space in many useful ways (e.g. via the Stone-Weierstrass theorem). Much like convexity, order structure is specific to the real line and is another reason why much of real analysis breaks down over other fields. (The complex plane is of course an extension of the real line and so is able to exploit the order structure of that line, usually by treating the real and imaginary components separately.)
There are of course many ways to combine various flavours of these structures together, and there are entire subfields of mathematics that are devoted to studying particularly common and useful categories of such combinations (e.g. topological vector spaces, normed vector spaces, Banach spaces, Banach algebras, von Neumann algebras, $C^*$-algebras, Fréchet spaces, Hilbert spaces, group algebras, etc.). The study of these sorts of spaces is known collectively as functional analysis. We will study some (but certainly not all) of these combinations in an abstract and general setting later in this course, but to begin with we will focus on the $L^p$ spaces, which are very good model examples for many of the above general classes of spaces, and also of importance in many applications of analysis (such as probability or PDE).
– $L^p$ spaces –
In this post, $(X, \mathcal{X}, \mu)$ will be a fixed measure space; notions such as “measurable”, “measure”, “almost everywhere”, etc. will always be with respect to this space, unless otherwise specified. Similarly, unless otherwise specified, all subsets of $X$ mentioned are restricted to be measurable, as are all scalar functions on $X$.
For sake of concreteness, we shall select the field of scalars to be the complex numbers $\mathbf{C}$. The theory of real Lebesgue spaces is virtually identical to that of complex Lebesgue spaces, and the former can largely be deduced from the latter as a special case.
We already have the notion of an absolutely integrable function on $X$, which is a function $f: X \to \mathbf{C}$ such that $\int_X |f|\ d\mu$ is finite. More generally, given any exponent $0 < p < \infty$, we can define a $p^{\mathrm{th}}$-power integrable function to be a function $f: X \to \mathbf{C}$ such that $\int_X |f|^p\ d\mu$ is finite. (Besides $p=1$, the case of most interest is the case of square-integrable functions, when $p=2$. We will also extend this notion later to $p=\infty$, which is also an important special case.)
Remark 1. One can also extend these notions to functions that take values in the extended complex plane $\mathbf{C} \cup \{\infty\}$, but one easily observes that $p^{\mathrm{th}}$-power integrable functions must be finite almost everywhere, and so there is essentially no increase in generality afforded by extending the range in this manner.
Following the “Lebesgue philosophy” that one should ignore whatever is going on on a set of measure zero, let us declare two measurable functions to be equivalent if they agree almost everywhere. This is easily checked to be an equivalence relation, which does not affect the property of being $p^{\mathrm{th}}$-power integrable. Thus, we can define the Lebesgue space $L^p(X, \mathcal{X}, \mu)$ to be the space of $p^{\mathrm{th}}$-power integrable functions, quotiented out by this equivalence relation. Thus, strictly speaking, a typical element of $L^p(X, \mathcal{X}, \mu)$ is not actually a specific function $f$, but is instead an equivalence class $[f]$, consisting of all functions equivalent to a single function $f$. However, we shall abuse notation and speak loosely of a function $f$ “belonging” to $L^p(X, \mathcal{X}, \mu)$, where it is understood that $f$ is only defined up to equivalence, or more imprecisely is “defined almost everywhere”. For the purposes of integration, this equivalence is quite harmless, but this convention does mean that we can no longer evaluate a function $f$ in $L^p(X, \mathcal{X}, \mu)$ at a single point $x$ if that point $x$ has zero measure. It takes a little bit of getting used to the idea of a function that cannot actually be evaluated at any specific point, but with some practice you will find that it will not cause any significant conceptual difficulty. [One could also take a more abstract view, dispensing with the set $X$ altogether and defining the Lebesgue space $L^p(\mathcal{X}, \mu)$ on abstract measure spaces $(\mathcal{X}, \mu)$, but we will not do so here. Another way to think about elements of $L^p$ is that they are functions which are "unreliable" on an unknown set of measure zero, but remain "reliable" almost everywhere.]
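Exercise 0. If $(X, \mathcal{X}, \mu)$ is a measure space, and $(X, \overline{\mathcal{X}}, \overline{\mu})$ is the completion of $(X, \mathcal{X}, \mu)$, show that the spaces $L^p(X, \mathcal{X}, \mu)$ and $L^p(X, \overline{\mathcal{X}}, \overline{\mu})$ are isomorphic using the obvious candidate for the isomorphism. Because of this, when dealing with $L^p$ spaces, we will usually not be too concerned with whether the underlying measure space is complete.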
Remark 2. Depending on which of the three structures $X$, $\mathcal{X}$, $\mu$ of the measure space one wishes to emphasise, the space $L^p(X, \mathcal{X}, \mu)$ is often abbreviated $L^p(X)$, $L^p(\mathcal{X})$, $L^p(\mu)$, or even just $L^p$. Since for this discussion the measure space $(X, \mathcal{X}, \mu)$ will be fixed, we shall usually use the $L^p$ abbreviation in this post. When the space $X$ is discrete (i.e. $\mathcal{X} = 2^X$) and $\mu$ is counting measure, then $L^p(X, \mathcal{X}, \mu)$ is usually abbreviated $\ell^p(X)$ or just $\ell^p$ (and the almost everywhere equivalence relation trivialises and can thus be completely ignored).
At present, the Lebesgue spaces $L^p$ are just sets. We now begin to place several of the structures mentioned in the introduction on them, to upgrade these sets to richer spaces.
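We begin with vector space structure. Fix $0 < p < \infty$, and let $f, g \in L^p$ be two $p^{\mathrm{th}}$-power integrable functions. From the crude pointwise (or more precisely, “pointwise almost everywhere”) inequality
$$|f(x)+g(x)|^p \leq (2 \max(|f(x)|, |g(x)|))^p = 2^p \max(|f(x)|^p, |g(x)|^p) \leq 2^p (|f(x)|^p + |g(x)|^p) \qquad (1)$$
we see that the sum of two $p^{\mathrm{th}}$-power integrable functions is also $p^{\mathrm{th}}$-power integrable. It is also easy to see that any scalar multiple of a $p^{\mathrm{th}}$-power integrable function is also $p^{\mathrm{th}}$-power integrable. These operations respect almost everywhere equivalence, and so $L^p$ becomes a (complex) vector space.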
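Next, we set up the norm structure. If $f \in L^p$, we define the $L^p$ norm $\|f\|_{L^p}$ of $f$ to be the number
$$\|f\|_{L^p} := \left(\int_X |f|^p\ d\mu\right)^{1/p}; \qquad (2)$$
this is a finite non-negative number by definition of $L^p$; in particular, we have the identity
$$\|f^r\|_{L^p} = \|f\|_{L^{pr}}^r \qquad (3)$$
for all $0 < p, r < \infty$.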
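To make definition (2) and identity (3) concrete, here is a minimal sketch on a finite measure space, where integrals reduce to weighted sums. The helper `lp_norm`, the three-point space and the sample values are illustrative assumptions, not part of the notes; the function $|f|^r$ is used in place of $f^r$ to avoid choosing a branch of the power.

```python
import math

def lp_norm(values, weights, p):
    """L^p (quasi-)norm on a finite measure space, for 0 < p < infinity.

    values[i] plays the role of f(x_i) and weights[i] of mu({x_i}) >= 0;
    this finite model is an assumption made purely for illustration.
    """
    if p <= 0:
        raise ValueError("p must be positive")
    integral = sum(abs(v) ** p * w for v, w in zip(values, weights))
    return integral ** (1.0 / p)

# A hypothetical three-point measure space and a function on it.
f = [3.0, -1.0, 0.5]
mu = [0.5, 2.0, 1.0]

p, r = 1.5, 2.0
f_pow_r = [abs(v) ** r for v in f]   # the function |f|^r
lhs = lp_norm(f_pow_r, mu, p)        # || |f|^r ||_{L^p}
rhs = lp_norm(f, mu, p * r) ** r     # || f ||_{L^{pr}}^r
print(lhs, rhs)                      # the two sides of identity (3)
```

The two printed numbers should agree up to rounding error, since both equal $(\sum_i |f(x_i)|^{pr} \mu(\{x_i\}))^{1/p}$.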
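The $L^p$ norm has the following three basic properties:
Lemma 1. Let $0 < p < \infty$ and $f, g \in L^p$.
- (Non-degeneracy) $\|f\|_{L^p} = 0$ if and only if $f = 0$.
- (Homogeneity) $\|cf\|_{L^p} = |c| \|f\|_{L^p}$ for all complex numbers $c$.
- ((Quasi-)triangle inequality) We have $\|f+g\|_{L^p} \leq C (\|f\|_{L^p} + \|g\|_{L^p})$ for some constant $C$ depending on $p$. If $p \geq 1$, then we can take $C=1$ (this fact is also known as Minkowski’s inequality).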
Proof. The claims 1, 2 are obvious. (Note how important it is that we equate functions that vanish almost everywhere in order to get 1.) The quasi-triangle inequality follows from a variant of the estimates in (1) and is left as an exercise. For the triangle inequality, we have to be more efficient than the crude estimate (1). By the non-degeneracy property we may take $\|f\|_{L^p}$ and $\|g\|_{L^p}$ to be non-zero. Using the homogeneity, we can normalise $\|f\|_{L^p} + \|g\|_{L^p}$ to equal 1, thus (by homogeneity again) we can write $f = (1-\theta) F$ and $g = \theta G$ for some $0 < \theta < 1$ and $F, G \in L^p$ with $\|F\|_{L^p} = \|G\|_{L^p} = 1$. Our task is now to show that
$$\int_X |(1-\theta) F(x) + \theta G(x)|^p\ d\mu \leq 1. \qquad (4)$$
But observe that for $1 \leq p < \infty$, the function $x \mapsto |x|^p$ is convex on $\mathbf{C}$, and in particular that
$$|(1-\theta) F(x) + \theta G(x)|^p \leq (1-\theta) |F(x)|^p + \theta |G(x)|^p. \qquad (5)$$
(If one wishes, one can use the complex triangle inequality to first reduce to the case when $F, G$ are non-negative, in which case one only needs convexity on $[0,+\infty)$ rather than all of $\mathbf{C}$.) The claim (4) then follows from (5) and the normalisations of $F, G$.
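Exercise 1. Let $0 < p \leq 1$ and $f, g \in L^p$.
- Establish the variant $\|f+g\|_{L^p}^p \leq \|f\|_{L^p}^p + \|g\|_{L^p}^p$ of the triangle inequality.
- If furthermore $f$ and $g$ are non-negative (almost everywhere), establish also the reverse triangle inequality $\|f+g\|_{L^p} \geq \|f\|_{L^p} + \|g\|_{L^p}$.
- Show that the best constant $C$ in the quasi-triangle inequality is $2^{\frac{1}{p}-1}$. In particular, the triangle inequality is false for $p < 1$.
- Now suppose instead that $1 < p < \infty$ or $0 < p < 1$. If $f, g \in L^p$ are such that $\|f+g\|_{L^p} = \|f\|_{L^p} + \|g\|_{L^p}$, show that one of the functions $f, g$ is a non-negative scalar multiple of the other (up to equivalence, of course). What happens when $p=1$?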
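The failure of the triangle inequality for $p < 1$ can be seen numerically on the simplest possible example. The following sketch (an illustration only, not a solution to the exercise) takes a two-point space with counting measure and the indicator functions of the two points; the names `lp_norm` and `triangle_ratio` are hypothetical helpers introduced here.

```python
import math

def lp_norm(values, weights, p):
    """(sum |f(x_i)|^p mu({x_i}))^(1/p) on a finite measure space, 0 < p < infinity."""
    integral = sum(abs(v) ** p * w for v, w in zip(values, weights))
    return integral ** (1.0 / p)

# Two-point space with counting measure; f and g are indicators of the two points.
mu = [1.0, 1.0]
f = [1.0, 0.0]
g = [0.0, 1.0]
h = [a + b for a, b in zip(f, g)]   # f + g

def triangle_ratio(p):
    """Return ||f+g||_{L^p} / (||f||_{L^p} + ||g||_{L^p}) for the functions above."""
    return lp_norm(h, mu, p) / (lp_norm(f, mu, p) + lp_norm(g, mu, p))

for p in (0.5, 1.0, 2.0):
    # Second column: observed ratio; third column: the constant 2^(1/p - 1).
    print(p, triangle_ratio(p), 2 ** (1 / p - 1))
```

For this particular pair the ratio equals $2^{\frac{1}{p}-1}$, which exceeds 1 exactly when $p < 1$; showing that no pair of functions does worse is the content of the exercise.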
A vector space $V$ with a function $\| \cdot \|: V \to [0,+\infty)$ obeying the non-degeneracy, homogeneity, and (quasi-)triangle inequality is known as a (quasi-)normed vector space, and the function $f \mapsto \|f\|$ is then known as a (quasi-)norm; thus $L^p$ is a normed vector space for $1 \leq p < \infty$ but only a quasi-normed vector space for $0 < p < 1$. A function $\| \cdot \|: V \to [0,+\infty)$ obeying the homogeneity and triangle inequality, but not necessarily the non-degeneracy property, is known as a seminorm; thus for instance the $L^p$ norms for $1 \leq p < \infty$ would have been seminorms if we did not equate functions that agreed almost everywhere. (Conversely, given a seminormed vector space $(V, \| \cdot \|)$, one can convert it into a normed vector space by quotienting out the subspace $\{ f \in V: \|f\| = 0 \}$; we leave the details as an exercise for the reader.)
Exercise 2. Let $\| \cdot \|: V \to [0,+\infty)$ be a function on a vector space which obeys the non-degeneracy and homogeneity properties. Show that $\| \cdot \|$ is a norm if and only if the closed unit ball $\{ x \in V: \|x\| \leq 1 \}$ is convex; show that the same equivalence also holds for the open unit ball. This emphasises the geometric nature of the triangle inequality.
Exercise 3. If $f \in L^p$ for some $0 < p < \infty$, show that the support $\{ x \in X: f(x) \neq 0 \}$ of $f$ (which is defined only up to sets of measure zero) is a $\sigma$-finite set. (Because of this, we can often reduce from the non-$\sigma$-finite case to the $\sigma$-finite case in many, though not all, questions concerning $L^p$ spaces.)
We now are able to define $L^p$ norms and spaces in the limit $p = \infty$. We say that a function $f: X \to \mathbf{C}$ is essentially bounded if there exists an $M$ such that $|f(x)| \leq M$ for almost every $x \in X$, and define $\|f\|_{L^\infty}$ to be the least $M$ that serves as such a bound. We let $L^\infty$ denote the space of essentially bounded functions, quotiented out by equivalence, and given the norm $\| \cdot \|_{L^\infty}$. It is not hard to see that this is also a normed vector space. Observe that a sequence $f_n \in L^\infty$ converges to a limit $f \in L^\infty$ if and only if $f_n$ converges essentially uniformly to $f$, i.e. it converges uniformly to $f$ outside of a set of measure zero. (Compare with Egorov’s theorem (Theorem 3.6 from Notes 0), which equates pointwise convergence with uniform convergence outside of a set of arbitrarily small measure.)
Now we explain why we call this norm the $L^\infty$ norm:
Example 1. Let $f$ be a (generalised) step function, thus $f = A 1_E$ for some amplitude $A > 0$ and some set $E$; let us assume that $E$ has positive finite measure. Then $\|f\|_{L^p} = A \mu(E)^{1/p}$ for all $0 < p < \infty$, and also $\|f\|_{L^\infty} = A$. Thus in this case, at least, the $L^\infty$ norm is the limit of the $L^p$ norms. This example illustrates also that the $L^p$ norms behave like combinations of the “height” $A$ of a function, and the “width” $\mu(E)$ of such a function, though of course the concepts of height and width are not formally defined for functions that are not step functions.
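The convergence of the $L^p$ norms of a step function to its height can be tabulated directly from definition (2). This is a small sketch; the height `A`, the width `measure_E` and the helper name `step_lp_norm` are illustrative assumptions.

```python
import math

def step_lp_norm(A, measure_E, p):
    """L^p norm of the step function A * 1_E, from definition (2): (|A|^p mu(E))^(1/p)."""
    return (abs(A) ** p * measure_E) ** (1.0 / p)

A, measure_E = 2.0, 5.0   # hypothetical height A and width mu(E)
for p in (1, 2, 10, 100, 1000):
    # The value equals A * mu(E)^(1/p), which tends to A as p grows.
    print(p, step_lp_norm(A, measure_E, p))
```

For small $p$ the width $\mu(E)$ has a large influence on the norm, while for large $p$ only the height $A$ matters.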
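Exercise 4.
- If $f \in L^\infty \cap L^{p_0}$ for some $0 < p_0 < \infty$, show that $\|f\|_{L^p} \to \|f\|_{L^\infty}$ as $p \to \infty$. (Hint: use the monotone convergence theorem.)
- If $f \not\in L^\infty$, show that $\|f\|_{L^p} \to \infty$ as $p \to \infty$.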
Once one has a vector space structure and a (quasi-)norm structure, we immediately get a (quasi-)metric structure:
Exercise 5. Let $(V, \| \cdot \|)$ be a normed vector space. Show that the function $d: V \times V \to [0,+\infty)$ defined by $d(f,g) := \|f-g\|$ is a metric on $V$ which is translation-invariant (thus $d(f+h,g+h) = d(f,g)$ for all $f,g,h \in V$) and homogeneous (thus $d(cf,cg) = |c| d(f,g)$ for all $f,g \in V$ and scalars $c$). Conversely, show that every translation-invariant homogeneous metric on $V$ arises from precisely one norm in this manner. Establish a similar claim relating quasi-norms with quasi-metrics (which are defined as metrics, but with the triangle inequality replaced by a quasi-triangle inequality; note that the term “quasi-metric” is occasionally used to denote a slightly different concept), or between seminorms and semimetrics (which are defined as metrics, but where distinct points are allowed to have a zero separation; these are also known as pseudometrics, with “semimetric” used to denote something else).
The (quasi-)metric structure in turn generates a topological structure in the usual manner using the (quasi-)metric balls as a base for the topology. In particular, a sequence of functions $f_n \in L^p$ converges to a limit $f \in L^p$ if $\|f_n - f\|_{L^p} \to 0$ as $n \to \infty$. We refer to this type of convergence as convergence in $L^p$ norm, or strong convergence in $L^p$ (we will discuss other modes of convergence in later lectures). As is usual in (quasi-)metric spaces (or more generally for Hausdorff spaces), the limit, if it exists, is unique. (This is however not the case for topological structures induced by seminorms or semimetrics, though we can solve this problem by quotienting out the degenerate elements as discussed earlier.)
Recall that any series $\sum_{n=1}^\infty a_n$ of scalars is convergent if it is absolutely convergent (i.e. if $\sum_{n=1}^\infty |a_n| < \infty$). This fact turns out to be closely related to the fact that the field of scalars $\mathbf{C}$ is complete. This can be seen from the following result:
Exercise 6. Let $(V, \| \cdot \|)$ be a normed vector space (and hence also a metric space and a topological space). Show that the following are equivalent:
- $V$ is a complete metric space (i.e. every Cauchy sequence converges).
- Every sequence $f_n \in V$ which is absolutely convergent (i.e. $\sum_{n=1}^\infty \|f_n\| < \infty$), is also conditionally convergent (i.e. $\sum_{n=1}^N f_n$ converges to a limit as $N \to \infty$).
Remark 3. The situation is more complicated for complete quasi-normed vector spaces; not every absolutely convergent series is conditionally convergent. On the other hand, if $\|f_n\|$ decays faster than a sufficiently large negative power of $n$, one recovers conditional convergence; see these old notes of mine.
Remark 4. Let X be a topological space, and let BC(X) be the space of bounded continuous functions on X; this is a vector space. We can place the uniform norm on this space; this makes BC(X) into a normed vector space. It is not hard to verify that this space is complete, and so every absolutely convergent series in BC(X) is conditionally convergent. This fact is better known as the Weierstrass M-test.
A space obeying the properties in Exercise 6 (i.e. a complete normed vector space) is known as a Banach space. We will study Banach spaces in more detail later in this course. For now, we give one of the fundamental examples of Banach spaces.
Proposition 1. $L^p$ is a Banach space for every $1 \leq p \leq \infty$.
Proof. By Exercise 6, it suffices to show that any series $\sum_{n=1}^\infty f_n$ of functions in $L^p$ which is absolutely convergent, is also conditionally convergent. This is easy in the case $p = \infty$ and is left as an exercise. In the case $1 \leq p < \infty$, we write $M := \sum_{n=1}^\infty \|f_n\|_{L^p}$, which is a finite quantity by hypothesis. By the triangle inequality, we have $\| \sum_{n=1}^N |f_n| \|_{L^p} \leq M$ for all $N$. By monotone convergence, we conclude $\| \sum_{n=1}^\infty |f_n| \|_{L^p} \leq M$. In particular, $\sum_{n=1}^\infty f_n(x)$ is absolutely convergent for almost every $x$. Write the limit of this series as $F(x)$. By dominated convergence, we see that $\sum_{n=1}^N f_n$ converges in $L^p$ norm to $F$, and we are done.
An important fact is that functions in $L^p$ can be approximated by simple functions:
Proposition 2. If $0 < p < \infty$, then the space of simple functions with finite measure support is a dense subspace of $L^p$.
(The concept of a non-trivial dense subspace is one which only comes up in infinite dimensions, and is hard to visualise directly. Very roughly speaking, the infinite number of degrees of freedom in an infinite dimensional space gives a subspace an infinite number of “opportunities” to come as close as one desires to any given point in that space, which is what allows such spaces to be dense.)
Proof. The only non-trivial thing to show is the density. An application of the monotone convergence theorem shows that the space of bounded $L^p$ functions is dense in $L^p$. Another application of monotone convergence (and Exercise 3) then shows that the space of bounded $L^p$ functions of finite measure support is dense in the space of bounded $L^p$ functions. Finally, by discretising the range of bounded $L^p$ functions, we see that the space of simple functions with finite measure support is dense in the space of bounded $L^p$ functions with finite measure support.
Remark 5. Since not every function in $L^p$ is a simple function with finite measure support, we thus see that the space of simple functions with finite measure support with the $L^p$ norm is an example of a normed vector space which is not complete.
Exercise 7. Show that the space of simple functions (not necessarily with finite measure support) is a dense subspace of $L^\infty$. Is the same true if one reinstates the finite measure support restriction?
Exercise 7a. Suppose that $\mu$ is $\sigma$-finite and $\mathcal{X}$ is separable (i.e. countably generated). Show that $L^p$ is separable (i.e. has a countable dense subset) for all $1 \leq p < \infty$. Give a counterexample that shows that $L^\infty$ need not be separable. (Hint: take the integers with counting measure.)
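Next, we turn to algebra properties of $L^p$ spaces. The key fact here is
Proposition 3. (Hölder’s inequality) Let $f \in L^p$ and $g \in L^q$ for some $0 < p, q \leq \infty$. Then $fg \in L^r$ and $\|fg\|_{L^r} \leq \|f\|_{L^p} \|g\|_{L^q}$, where the exponent $r$ is defined by the formula $\frac{1}{r} = \frac{1}{p} + \frac{1}{q}$.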
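Proof. This will be a variant of the proof of the triangle inequality in Lemma 1, again relying ultimately on convexity. The claim is easy when $p = \infty$ or $q = \infty$ and is left as an exercise for the reader in this case, so we assume $p, q < \infty$. Raising $f$ and $g$ to the power $r$ using (3) we may assume $r=1$, which makes $1 < p, q < \infty$ dual exponents in the sense that $\frac{1}{p} + \frac{1}{q} = 1$. The claim is obvious if either $\|f\|_{L^p}$ or $\|g\|_{L^q}$ are zero, so we may assume they are non-zero; by homogeneity we may then normalise $\|f\|_{L^p} = \|g\|_{L^q} = 1$. Our task is now to show that
$$\int_X |fg|\ d\mu \leq 1. \qquad (6)$$
Here, we use the convexity of the exponential function $t \mapsto e^t$ on $\mathbf{R}$, which implies the convexity of the function $t \mapsto |f(x)|^{p(1-t)} |g(x)|^{qt}$ for $t \in [0,1]$ for any $x$. In particular (taking $t = 1/q$) we have
$$|f(x) g(x)| \leq \frac{1}{p} |f(x)|^p + \frac{1}{q} |g(x)|^q \qquad (7)$$
and the claim (6) follows from the normalisations on $p, q, f, g$.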
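As a sanity check on Proposition 3, one can compare the two sides of Hölder's inequality on a finite measure space, where the integrals are weighted sums. The sketch below only treats finite exponents; the helpers `lp_norm` and `holder_sides` and the sample data are assumptions made for illustration.

```python
import math

def lp_norm(values, weights, p):
    """(sum |f(x_i)|^p mu({x_i}))^(1/p) on a finite measure space, 0 < p < infinity."""
    integral = sum(abs(v) ** p * w for v, w in zip(values, weights))
    return integral ** (1.0 / p)

def holder_sides(f, g, mu, p, q):
    """Return (||fg||_{L^r}, ||f||_{L^p} ||g||_{L^q}) where 1/r = 1/p + 1/q."""
    r = 1.0 / (1.0 / p + 1.0 / q)
    fg = [a * b for a, b in zip(f, g)]
    return lp_norm(fg, mu, r), lp_norm(f, mu, p) * lp_norm(g, mu, q)

# A hypothetical four-point measure space with two sample functions.
mu = [0.5, 1.0, 2.0, 0.25]
f = [1.0, -2.0, 0.5, 3.0]
g = [2.0, 0.5, -1.0, 1.0]

for p, q in [(2.0, 2.0), (1.5, 3.0), (0.5, 4.0)]:
    lhs, rhs = holder_sides(f, g, mu, p, q)
    print(p, q, lhs, rhs, lhs <= rhs)

# With p = q = 2 and g = f, the functions |f|^p and |g|^q coincide,
# so the two sides should agree up to rounding.
lhs_eq, rhs_eq = holder_sides(f, f, mu, 2.0, 2.0)
print(lhs_eq, rhs_eq)
```

The last pair of numbers illustrates the equality case, in which $|f|^p$ and $|g|^q$ are scalar multiples of each other.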
Remark 6. For a different proof of this inequality (based on the tensor power trick), see Example 1 of this blog post of mine.
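Remark 7. One can also use Hölder’s inequality to prove the triangle inequality for $L^p$, $1 \leq p < \infty$ (i.e. Minkowski’s inequality). From the complex triangle inequality $|f+g| \leq |f| + |g|$, it suffices to check the case when $f, g$ are non-negative. In this case we have the identity
$$\|f+g\|_{L^p}^p = \| f |f+g|^{p-1} \|_{L^1} + \| g |f+g|^{p-1} \|_{L^1} \qquad (8)$$
while Hölder’s inequality gives $\| f |f+g|^{p-1} \|_{L^1} \leq \|f\|_{L^p} \|f+g\|_{L^p}^{p-1}$ and $\| g |f+g|^{p-1} \|_{L^1} \leq \|g\|_{L^p} \|f+g\|_{L^p}^{p-1}$. The claim then follows from some algebra (and checking the degenerate cases separately, e.g. when $\|f+g\|_{L^p} = 0$).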
Remark 8. The proofs of Hölder’s inequality and Minkowski’s inequality both relied on convexity of various functions in $\mathbf{C}$ or $[0,+\infty)$. One way to emphasise this is to deduce both inequalities from Jensen’s inequality, which is an inequality which manifestly exploits this convexity. We will not take this approach here, but see for instance the book of Lieb and Loss for a discussion.
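Example 2. It is instructive to test Hölder’s inequality (and also Exercises 8-12 below) in the special case when $f, g$ are generalised step functions, say $f = A 1_E$ and $g = B 1_F$ with $A, B$ non-zero. The inequality then simplifies to
$$\mu(E \cap F)^{1/r} \leq \mu(E)^{1/p} \mu(F)^{1/q} \qquad (8)$$
which can be easily deduced from the hypothesis $\frac{1}{p} + \frac{1}{q} = \frac{1}{r}$ and the trivial inequalities $\mu(E \cap F) \leq \mu(E)$ and $\mu(E \cap F) \leq \mu(F)$. One then easily sees (when $p, q$ are finite) that equality in (8) only holds if $\mu(E \cap F) = \mu(E) = \mu(F)$, or in other words if $E$ and $F$ agree almost everywhere. Note the above computations also explain why the condition $\frac{1}{p} + \frac{1}{q} = \frac{1}{r}$ is necessary.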
Exercise 8. Let $0 < p, q < \infty$, and let $f \in L^p$, $g \in L^q$ be such that Hölder’s inequality is obeyed with equality. Show that of the functions $|f|^p$, $|g|^q$, one of them is a scalar multiple of the other (up to equivalence, of course). What happens if $p$ or $q$ is infinite?
An important corollary of Hölder’s inequality is the Cauchy-Schwarz inequality
$$\left|\int_X f \overline{g}\ d\mu\right| \leq \|f\|_{L^2} \|g\|_{L^2} \qquad (9)$$
which can of course be proven by many other means.
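Exercise 9. If $f \in L^p$ for some $0 < p \leq \infty$, and is also supported on a set $E$ of finite measure, show that $f \in L^q$ for all $0 < q \leq p$, with $\|f\|_{L^q} \leq \mu(E)^{\frac{1}{q} - \frac{1}{p}} \|f\|_{L^p}$. When does equality occur?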
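Exercise 10. If $f \in L^p$ for some $0 < p \leq \infty$, and every set of positive measure in $X$ has measure at least $m$, show that $f \in L^q$ for all $p < q \leq \infty$, with $\|f\|_{L^q} \leq m^{\frac{1}{q} - \frac{1}{p}} \|f\|_{L^p}$. When does equality occur? (This result is especially useful for the $\ell^p$ spaces, in which $\mu$ is counting measure and $m$ can be taken to be 1.)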
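The opposite monotonicity behaviour in Exercises 9 and 10 is easy to observe numerically. The sketch below (an illustration, not a proof) evaluates the norms of one sample function under counting measure, where $m = 1$, and under a uniform probability measure, where $\mu(E) = 1$; the helper `lp_norm` and the sample data are assumptions.

```python
import math

def lp_norm(values, weights, p):
    """(sum |f(x_i)|^p mu({x_i}))^(1/p) on a finite measure space, 0 < p < infinity."""
    integral = sum(abs(v) ** p * w for v, w in zip(values, weights))
    return integral ** (1.0 / p)

f = [3.0, -1.0, 0.5, 2.0]

# Exercise 10 setting: counting measure, so every non-empty set has measure >= 1.
counting = [1.0] * len(f)
# Exercise 9 setting: a probability measure, so the whole space has measure 1.
uniform = [1.0 / len(f)] * len(f)

exponents = [0.5, 1.0, 2.0, 4.0]
counting_norms = [lp_norm(f, counting, p) for p in exponents]
uniform_norms = [lp_norm(f, uniform, p) for p in exponents]
print(counting_norms)   # expected to be non-increasing in p
print(uniform_norms)    # expected to be non-decreasing in p
```

Informally, on $\ell^p$ spaces raising the exponent makes the norm smaller, while on probability spaces it makes the norm larger.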
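Exercise 11. If $f \in L^{p_0} \cap L^{p_1}$ for some $0 < p_0 < p_1 \leq \infty$, show that $f \in L^p$ for all $p_0 \leq p \leq p_1$, and that $\|f\|_{L^p} \leq \|f\|_{L^{p_0}}^{1-\theta} \|f\|_{L^{p_1}}^{\theta}$, where $0 \leq \theta \leq 1$ is such that $\frac{1}{p} = \frac{1-\theta}{p_0} + \frac{\theta}{p_1}$. Another way of saying this is that the function $\frac{1}{p} \mapsto \log \|f\|_{L^p}$ is convex. When does equality occur? This convexity is a prototypical example of interpolation, about which we shall say more in a later lecture.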
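Exercise 12. If $f \in L^{p_0}$ for some $0 < p_0 \leq \infty$, and its support $E := \{ x \in X: f(x) \neq 0 \}$ has finite measure, show that $f \in L^p$ for all $0 < p < p_0$, and that $\|f\|_{L^p}^p \to \mu(E)$ as $p \to 0$. (Because of this, the measure of the support of $f$ is sometimes known as the $L^0$ norm of $f$, or more precisely the $L^0$ norm raised to the power 0.)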
– Linear functionals on $L^p$ –
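Given an exponent $1 \leq p \leq \infty$, define the dual exponent $1 \leq p' \leq \infty$ by the formula $\frac{1}{p} + \frac{1}{p'} = 1$ (thus $p' = p/(p-1)$ for $1 < p < \infty$, while 1 and $\infty$ are duals of each other). From Hölder’s inequality, we see that for any $g \in L^{p'}$, the functional $\lambda_g: L^p \to \mathbf{C}$ defined by
$$\lambda_g(f) := \int_X f \overline{g}\ d\mu \qquad (10)$$
is well-defined on $L^p$; the functional is also clearly linear. Furthermore, Hölder’s inequality also tells us that this functional is continuous.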
A deep and important fact about $L^p$ spaces is that, in most cases, the converse is true: the recipe (10) is the only way to create continuous linear functionals on $L^p$.
Theorem 1. Let $1 \leq p < \infty$, and assume $\mu$ is $\sigma$-finite. Let $\lambda: L^p \to \mathbf{C}$ be a continuous linear functional. Then there exists a unique $g \in L^{p'}$ such that $\lambda = \lambda_g$.
This result should be compared with the Radon-Nikodym theorem (Corollary 1 from Notes 1). Both theorems start with an abstract function $\nu: \mathcal{X} \to \mathbf{R}$ or $\lambda: L^p \to \mathbf{C}$, and create a function out of it. Indeed, we shall see shortly that the two theorems are essentially equivalent to each other. We will develop Theorem 1 further in later lectures, once we introduce the notion of a dual space.
To prove Theorem 1, we first need a simple and useful lemma:
Lemma 1. (Continuity is equivalent to boundedness for linear operators) Let $T: X \to Y$ be a linear transformation from one normed vector space $(X, \| \cdot \|_X)$ to another $(Y, \| \cdot \|_Y)$. Then the following are equivalent:
- $T$ is continuous.
- $T$ is continuous at 0.
- There exists a constant $C$ such that $\|Tx\|_Y \leq C \|x\|_X$ for all $x \in X$.
Proof. It is clear that 1 implies 2, and that 3 implies 2. Next, from linearity we have $Tx = Tx_0 + T(x - x_0)$ for any $x, x_0 \in X$, which (together with the continuity of addition, which follows from the triangle inequality) shows that continuity of $T$ at 0 implies continuity of $T$ at any $x_0 \in X$, so that 2 implies 1. The only remaining task is to show that 1 implies 3. By continuity, the inverse image of the unit ball in $Y$ must be an open neighbourhood of 0 in $X$, thus there exists some radius $r > 0$ such that $\|Tx\|_Y < 1$ whenever $\|x\|_X < r$. The claim then follows (with $C := 1/r$) by homogeneity. (Alternatively, one can deduce 3 from 2 by contradiction. If 3 failed, then there exists a sequence $x_n$ of non-zero elements of $X$ such that $\|Tx_n\|_Y / \|x_n\|_X$ goes to infinity. By homogeneity, we can arrange matters so that $\|x_n\|_X$ goes to zero, but $\|Tx_n\|_Y$ stays away from zero, thus contradicting continuity at 0.)
Proof of Theorem 1. The uniqueness claim is similar to the uniqueness claim in the Radon-Nikodym theorem (Exercise 2 from Notes 1) and is left as an exercise to the reader; the hard part is establishing existence.
Let us first consider the case when is finite. The linear functional
induces a functional
on sets E by the formula
. (11)
Since $\lambda$ is linear, $\nu$ is finitely additive (and sends the empty set to zero). Also, if $E_1, E_2, \ldots$ are a sequence of disjoint sets, then $1_{\bigcup_{n=1}^N E_n}$ converges in $L^p$ to $1_{\bigcup_{n=1}^\infty E_n}$ as $N \to \infty$ (by the dominated convergence theorem and the finiteness of $\mu$), and thus (by continuity of $\lambda$ and finite additivity of $\nu$), $\nu$ is countably additive as well. Finally, from (11) we also see that $\nu(E) = 0$ whenever $\mu(E) = 0$, thus $\nu$ is absolutely continuous with respect to $\mu$. Applying the Radon-Nikodym theorem (Corollary 1 from Notes 1) to both the real and imaginary components of $\nu$, we conclude that $\nu = \mu_g$ for some $g \in L^1$; thus by (11) we have
$\lambda(1_E) = \lambda_g(1_E)$ (12)
for all measurable E. By linearity, this implies that $\lambda$ and $\lambda_g$ agree on simple functions. Taking uniform limits (using Exercise 7) and using continuity (and the finite measure of $\mu$) we conclude that $\lambda$ and $\lambda_g$ agree on all bounded functions. Taking monotone limits (working on the positive and negative supports of the real and imaginary parts of g separately) we conclude that $\lambda$ and $\lambda_g$ agree on all functions in $L^p$, and in particular that $\int_X f g\ d\mu$ is absolutely convergent for all $f \in L^p$.
To finish the theorem in this case, we need to establish that g lies in $L^{p'}$. By taking real and imaginary parts we may assume without loss of generality that g is real; by splitting into the regions where g is positive and negative we may assume that g is non-negative.
We already know that $\lambda_g = \lambda$ is a continuous functional from $L^p$ to ${\bf C}$. By Lemma 1, this implies a bound of the form $|\lambda_g(f)| \leq C \|f\|_{L^p}$ for some $C > 0$.
Suppose first that $p > 1$. Heuristically, we would like to test this inequality with $f := g^{p'-1}$, since we formally have $\lambda_g(f) = \|g\|_{L^{p'}}^{p'}$ and $\|f\|_{L^p} = \|g\|_{L^{p'}}^{p'-1}$. (Not coincidentally, this is also the choice that would make Hölder’s inequality an equality, see Exercise 8.) Cancelling the $\|g\|_{L^{p'}}$ factors would then give the desired finiteness of $\|g\|_{L^{p'}}$.
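We can’t quite make that argument work, because it is circular: it assumes $\|g\|_{L^{p'}}$ is finite in order to show that $\|g\|_{L^{p'}}$ is finite! But this can be easily remedied. We test the inequality with $f_N := \min(g,N)^{p'-1}$ for some large N; this lies in $L^p$. We have $\lambda_g(f_N) \geq \|\min(g,N)\|_{L^{p'}}^{p'}$ and $\|f_N\|_{L^p} = \|\min(g,N)\|_{L^{p'}}^{p'-1}$, and hence $\|\min(g,N)\|_{L^{p'}} \leq C$ for all N. Letting N go to infinity and using monotone convergence, we obtain the claim.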
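The truncated computation can be written out line by line. The following is a sketch under the standing assumptions of this case ($\mu$ finite, g non-negative, $1 < p < \infty$); the shorthand $g_N := \min(g,N)$ is introduced here for readability and is not notation from the notes.

```latex
% Assumptions: \mu finite, g \geq 0, 1 < p < \infty, 1/p + 1/p' = 1,
% and |\lambda_g(f)| \leq C \|f\|_{L^p} for all f \in L^p.
% Shorthand (introduced here): g_N := \min(g,N), f_N := g_N^{p'-1}.
\begin{align*}
\lambda_g(f_N) &= \int_X g_N^{p'-1}\, g \, d\mu \;\geq\; \int_X g_N^{p'} \, d\mu
   = \|g_N\|_{L^{p'}}^{p'}
   && \text{(since } g \geq g_N \geq 0\text{)}, \\
\|f_N\|_{L^p} &= \Big( \int_X g_N^{(p'-1)p} \, d\mu \Big)^{1/p}
   = \|g_N\|_{L^{p'}}^{p'/p} = \|g_N\|_{L^{p'}}^{p'-1}
   && \text{(since } (p'-1)p = p'\text{)}, \\
\|g_N\|_{L^{p'}}^{p'} &\leq \lambda_g(f_N) \leq C\, \|g_N\|_{L^{p'}}^{p'-1} .
\end{align*}
% Because 0 \leq g_N \leq N and \mu(X) < \infty, the norm \|g_N\|_{L^{p'}} is finite,
% so we may divide (the case of zero norm being trivial) to get \|g_N\|_{L^{p'}} \leq C.
```

The point of the truncation is visible in the last step: the division is legitimate only because $\|g_N\|_{L^{p'}}$ is known in advance to be finite, which is exactly what was missing in the heuristic argument.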
In the p=1 case, we instead use $f := 1_{g > N}$ as the test functions, to conclude that g is bounded almost everywhere by N once N is sufficiently large; we leave the details to the reader.
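One possible way to fill in the details of the p=1 case is sketched below, under the same assumptions as before ($\mu$ finite, g non-negative) and using the bound from Lemma 1 with constant C. This is an illustration of the omitted argument, not necessarily the route the reader is expected to take.

```latex
% Assumptions: p = 1, \mu finite, g \geq 0, and |\lambda_g(f)| \leq C \|f\|_{L^1}.
% Test with f := 1_{\{g > N\}}, which lies in L^1 since \mu is finite.
\begin{align*}
N\, \mu(\{g > N\}) \;\leq\; \int_{\{g > N\}} g \, d\mu = \lambda_g(f)
  \;\leq\; C\, \|f\|_{L^1} = C\, \mu(\{g > N\}).
\end{align*}
% If \mu(\{g > N\}) > 0, dividing gives N \leq C. Hence for every N > C
% the set \{g > N\} is null. Writing \{g > C\} as the countable union of the
% null sets \{g > C + 1/k\}, k = 1, 2, \ldots, we get g \leq C almost everywhere.
```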
This handles the case when $\mu$ is finite. When $\mu$ is $\sigma$-finite, we can write X as the union of an increasing sequence $E_n$ of sets of finite measure. On each such set, the above arguments let us write $\lambda = \lambda_{g_n}$ for some $g_n \in L^{p'}(E_n)$. The uniqueness arguments tell us that the $g_n$ are all compatible with each other, in particular if $n < m$, then $g_n$ and $g_m$ agree on $E_n$. Thus all the $g_n$ are in fact restrictions of a single function g to $E_n$. The previous arguments also tell us that the $L^{p'}$ norm of $g_n$ is bounded by the same constant C uniformly in n, so by monotone convergence, g has bounded $L^{p'}$ norm also, and we are done.
Remark 9. When $1 < p < \infty$, the hypothesis that $\mu$ is $\sigma$-finite can be dropped, but not when $p = 1$; see e.g. Section 6.2 of Folland for further discussion. In these lectures, though, we will be content with working in the $\sigma$-finite setting. On the other hand, the claim fails when $p = \infty$ (except when X is finite); we will see this in later lectures, when we discuss the Hahn-Banach theorem.
Remark 10. We have seen how the Lebesgue-Radon-Nikodym theorem can be used to establish Theorem 1. The converse is also true: Theorem 1 can be used to deduce the Lebesgue-Radon-Nikodym theorem (a fact essentially observed by von Neumann). For simplicity, let us restrict attention to the unsigned finite case, thus $\mu$ and $m$ are unsigned and finite. This implies that the sum $\mu + m$ is also unsigned and finite. We observe that the linear functional $\lambda: f \mapsto \int_X f\ d\mu$ is continuous on $L^1(\mu+m)$, hence by Theorem 1 there must exist a function $g \in L^\infty(\mu+m)$ such that
$\int_X f\ d\mu = \int_X f g\ d(\mu+m)$ (13)
for all $f \in L^1(\mu+m)$. It is easy to see that g must be real and non-negative, and also at most 1 almost everywhere. If E is the set where g=1, we see by setting $f = 1_E$ in (13) that E has m-measure zero, and so $\mu\downharpoonright_E$ is singular. Outside of E, we see from (13) and some rearrangement that
$\int_{X \backslash E} (1-g) f\ d\mu = \int_{X \backslash E} f g\ dm$ (14)
and one then easily verifies that $\mu$ agrees with $m_{\frac{g}{1-g}}$ outside of E. This gives the desired Lebesgue-Radon-Nikodym decomposition $\mu = m_{\frac{g}{1-g}} + \mu\downharpoonright_E$.
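The "some rearrangement" in Remark 10 can be sketched as follows. The test function f is taken to be bounded and measurable so that every integral below is finite, and F denotes an arbitrary measurable set; both are illustrative choices made here, and the truncation used to justify the final substitution is an assumption about how one might make the step rigorous.

```latex
% Assumptions: \mu, m unsigned finite measures; g as in (13) with 0 \leq g \leq 1 a.e.;
% E := \{g = 1\}, which has m-measure zero. f is bounded and measurable.
\begin{align*}
\int_X f \, d\mu &= \int_X f g \, d\mu + \int_X f g \, dm
   && \text{(expand } d(\mu+m) \text{ in (13))}, \\
\int_X (1-g) f \, d\mu &= \int_X f g \, dm
   && \text{(subtract } \textstyle\int_X f g \, d\mu\text{)}.
\end{align*}
% Formally substituting f = 1_{F \setminus E} / (1-g) for a measurable set F
% (justified by first restricting to \{g \leq 1 - 1/k\}, where f is bounded by k,
% and then letting k \to \infty with monotone convergence):
\begin{align*}
\mu(F \setminus E) = \int_{F \setminus E} \frac{g}{1-g} \, dm
  = \int_{F} \frac{g}{1-g} \, dm
   && \text{(since } m(E) = 0\text{)}.
\end{align*}
```

Adding $\mu(F \cap E)$ to both sides then recovers the stated decomposition of $\mu$ into an absolutely continuous part and a part supported on the m-null set E.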
Remark 11. The argument used in Remark 10 also shows that the Radon-Nikodym theorem implies the Lebesgue-Radon-Nikodym theorem.
In a later set of notes, we will give an alternate proof of Theorem 1, which relies on the geometry of $L^p$ spaces rather than on the Radon-Nikodym theorem, and can thus be viewed as giving an independent proof of that theorem.
[Update, Jan 10: Another exercise added.]
[Update, Jan 13: Lemma 1 added.]
[Update, Jan 14: More remarks and another exercise added; note this changes exercise numbering.]
[Update, Jan 15: Exercise 0 added; other exercise numbering unchanged.]
[Update, Jan 16: Exercise 7a added; other exercise numbering unchanged.]

37 comments
10 January, 2009 at 1:26 am
liuxiaochuan
Dear Professor Tao:
This post doesn’t appear in the “254B, Real Analysis” page. [Corrected, thanks - T.]
13 January, 2009 at 1:51 am
liuxiaochuan
Dear Professor Tao:
I am considering (4) of exercise 1. I think the scalar between f and g should be real, am I right? [Yes - and it should be non-negative, too. Thanks - T.]
13 January, 2009 at 4:16 am
Mustafa Said
Dear Professor Tao:
In Exercise #3 I think you need
where
.
13 January, 2009 at 5:06 am
Matthew Folz
It is not necessary. We can get one inequality by using Holder’s inequality to control the
(
) norm of f by the
and
norms, and then letting p go to infinity (perhaps you were trying to use the estimate in Exercise 8 here instead?). For the reverse inequality, if
on a set of measure
, then Chebyshev’s inequality shows that the
norm of f is at least
. Letting p go to infinity gives the other inequality we need.
13 January, 2009 at 5:10 am
Matthew Folz
(strictly speaking, the aforementioned inequalities are not quite ‘reverses’, one of them has a ‘liminf’ and the other has a ‘limsup’, but since the liminf of a sequence is always smaller than the limsup, things work out fine)
14 January, 2009 at 8:24 am
liuxiaochuan
Dear Professor Tao:
Here is a correction ( if I am right):
In the fourth paragraph after (12), I think f should be
. Then
and
should be
and
(with p modified to p’)
The same problem happens in the fifth paragraph.
14 January, 2009 at 9:10 am
Terence Tao
Thanks for the correction!
17 January, 2009 at 9:46 am
254A, notes 5: Hilbert spaces « What’s new
[...] 6. From Proposition 1 from Notes 3, (real or complex) is a Hilbert space for any measure space . In particular, and are Hilbert [...]
26 January, 2009 at 3:41 pm
245B, Notes 6: Duality and the Hahn-Banach theorem « What’s new
[...] of a continuous linear transformation between two normed vector spaces X, Y. By Lemma 1 from Notes 3, any such linear transformation is bounded, in the sense that there exists a constant C such that [...]
1 February, 2009 at 11:15 pm
245B, Notes 9: The Baire category theorem and its Banach space consequences « What’s new
[...] and quantitative properties of linear transformations between Banach spaces. (Lemma 1 of Notes 3 already gives a prototypical such equivalence between a qualitative property (continuity) and a [...]
16 February, 2009 at 7:28 am
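Real Analysis 0-10 « Liu Xiaochuan’s Weblog
[...] The third section covers the L^p spaces, which everyone knows well. In the first two pages the author discusses, from a very high vantage point, how general mathematical objects are studied, which I like very much; beginners, though, would do better to skip this at first. Inequalities cannot be avoided in L^p spaces. Besides reviewing a few I had seen before, I paid attention to what it means when equality holds in these inequalities. The notions of pseudometrics and quasi-norms are also quite interesting. [...]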
14 March, 2009 at 3:03 pm
ERIC
Dear Prof. Tao,
is the following right? “f is not in $L^\infty$” means that for every $M > 0$, there exists a set E with positive measure s.t. $f(x)$ is greater than M for all $x \in E$.
14 March, 2009 at 4:10 pm
Terence Tao
Dear Eric,
One needs to replace f(x) by |f(x)| (or assume that f is non-negative), but other than that, your statement is correct.
30 March, 2009 at 8:57 am
245C, Notes 1: Interpolation of L^p spaces « What’s new
[...] real or complex spaces; for sake of concretness we work with complex spaces. Then for , recall (see 245B Notes 3) that is the space of all functions whose [...]
6 April, 2009 at 2:58 pm
254C, Notes 2: The Fourier transform « What’s new
[...] with . (Unfortunately, is not always -finite, and so the standard duality theorem from Notes 3 of 245B does not directly apply. However, one can get around this using Exercise [...]
17 September, 2009 at 10:56 pm
Paul J
Hello,
I have some trouble proving the quasi-triangle inequality from Lemma 1. Could anyone give me some hint?
20 January, 2010 at 8:05 am
Seminar „Funktionalanalysis“ « UGroh's Weblog
[...] spaces (T. Tao, Spaces, blog notes for the course 245B) [...]
8 June, 2010 at 9:19 am
Kestutis Cesnavicius
A couple of typos:
1. Before “In the
case…” paragraph in the proof of dual of
theorem, one should have
instead of
.
should be
. Also, after the last display in this remark
should be
.
2. In Remark 10 between the two displays
[Corrected, thanks - T.]
19 September, 2010 at 7:21 pm
245A, Notes 2: The Lebesgue integral « What’s new
[...] is complete in various ways; we will formalise this properly only in the next quarter when we study spaces, but the convergence theorems mentioned above already hint at this completeness. A related fact, [...]
2 October, 2010 at 3:01 pm
245A, Notes 4: Modes of convergence « What’s new
[...] whenever and . This is a sequence of indicator functions of intervals of decreasing length, marching across the unit interval over and over again. Then converges to zero in measure and in norm, but not pointwise almost everywhere (and hence also not pointwise, not almost uniformly, nor in norm, nor uniformly). Remark 2 The norm of a measurable function is defined to the infimum of all the quantities that are essential upper bounds for in the sense that for almost every . Then converges to in norm if and only if as . The and norms are part of the larger family of norms, which we will study in more detail in 245B. [...]
12 November, 2010 at 6:28 pm
quantum probability
Oh … so THIS is why people are interested in Lp norms.
Is there another post that contrasts Lp spaces with other possible metrizations?
16 December, 2010 at 4:28 am
245A, Notes 2: The Lebesgue integral « mathTHÍCHinTOÁNmyHỌCbrain
[...] is complete in various ways; we will formalise this properly only in the next quarter when we study spaces, but the convergence theorems mentioned above already hint at this completeness. A related fact, [...]
21 December, 2010 at 7:47 pm
mcknight0219
Hi, Prof Tao
In the last two line of proof for Proposition 2 there is a typo, ‘finit emeasure’ should be ‘finite measure’.
Qiang
[Corrected, thanks - T.]
19 January, 2011 at 4:35 pm
Lecture Notes on Topology: 1 « Mcknight0219's Blog
[...] vector space, space,…). A very good and abstract illustration of this conception can be found here. In the following, I will stick to contents of lecture notes. Definition 1.2: The discrete topology [...]
12 March, 2011 at 6:53 pm
Anonymous
With Holder’s inequality, one can estimate any integral of the form
with assumption that
and
. However, if one likes bound it below, it seems to be unhelpful. For instance, if one wants to find
, such that for any
,
, Holder may not help, even just for finding a smaller group of
. For a more concrete example, find all
such that
where
. Maybe another kind of estimation is needed.
16 March, 2011 at 12:07 pm
Anonymous
In the case of
, one can talk about the range and the kernel of the function
. However, if
,
is a equivalent class instead of a “function”. Is it still possible to define the counterpart of the concepts in
, say, “range” and “kernel”?
16 March, 2011 at 12:28 pm
Terence Tao
http://en.wikipedia.org/wiki/Essential_range
http://en.wikipedia.org/wiki/Support_(mathematics)
2 November, 2011 at 7:49 am
Jack
I am always not confident in using the
space. Since its element is not a function at all but a “equivalent class”. Every time I read something like “
“, I convert it into
. This is really strange. It seems that what we actually care about is the “property” that
. The "equivalent class" is only used for forming a Banach space. Am I right?
2 November, 2011 at 8:08 am
Terence Tao
Well, yes, but most of the point of introducing L^p spaces in the first place is in order to exploit the properties of a Banach space. For instance, if one has
, one would like to conclude that
(because this is what normally happens in a Banach space), but because of the equivalence class in the way, one can only conclude that
is equal to
almost everywhere.
As mentioned in the lecture notes, L^p spaces adhere to the “Lebesgue philosophy” of analysis, in which one considers sets of measure zero to be negligible, and in particular allows functions to be uncontrolled on such sets; this is in order to take full advantage of the powerful tools of measure theory, integration theory, and function space theory. As such, analysis using Lebesgue methods (such as L^p spaces, the Lebesgue integral, etc.) tends to only give almost everywhere control of one’s functions, rather than everywhere control. If one has sufficient regularity (e.g. continuity is usually enough), one can upgrade almost everywhere control to everywhere control, but it is important to keep in mind (particularly in subjects such as ergodic theory) that this upgrade is not automatic in the absence of such regularity.
2 November, 2011 at 11:10 am
Jack
Ok.
can be read as either
or
, where
is the equivalent class. I am always wondering if it is necessarily to do so.
What’s more, I don’t know how to completely accept the philosophy of Lebesgue. Since this “almost everywhere” issue, I cannot specify the value of
from a equivalent class “anywhere”! In this sense, one cannot control the function at all, instead of “almost everywhere”.
I don’t quite understand the word “control” you used in the answer, though I have seen it in lots of places. It seems to be a common jargon in mathematics. Does mean the integration of the function? But the “everywhere control” means, I think, specifying the value of the function.
2 November, 2011 at 1:40 pm
Terence Tao
The Lebesgue philosophy is analogous to the “noise-tolerant” philosophy in modern signal processing. If one is receiving a signal (e.g. a television signal) from a noisy source (e.g. a television station in the presence of electrical interference), then any individual component of that signal (e.g. a pixel of the television image) may be corrupted. But as long as the total number of corrupted data points is negligible, one can still get a good enough idea of the image to do things like distinguish foreground from background, compute the area of an object, or the mean intensity, etc. This is similar to how one can measure a set or integrate a function even in the presence of a measure zero “noise” which renders any specific point of the set or function value “unreliable”.
“Control” is a loose term which can mean exact specification of a function and its values, but can also mean estimation of the values to within some error tolerance, or some convergence rate of those values to some limit. Basically, control is the ability to acquire useful information on an object from the given hypotheses.
22 November, 2011 at 6:31 pm
Anonymous
As I understand from your comment, if “control” means exact specification and its values, then in the sense of L^p space, one cannot control the function “anywhere”.
4 December, 2011 at 5:28 am
How to understand "It takes a little bit of getting used to the idea…"? | web technical support
[...] The following sentence is from a mathematical lecture note here: [...]
29 January, 2012 at 6:10 am
nn
can anyone help with second part of exercise 7.a I want to guess what the completion would be in the case of simple functions with finite support measure.
29 January, 2012 at 6:16 am
nn
I meant exercise 7. sorry
8 June, 2012 at 8:09 am
[T]he point of introducing L^p spaces in the first... • see things differently
[...] exists + addition exists + everything’s included = it’s a Banach space) (Source: terrytao.wordpress.com) View the discussion [...]