You are currently browsing the monthly archive for January 2009.

Today, Prof. Margulis continued his lecture series, focusing on two specific examples of homogeneous dynamics applications to number theory, namely counting lattice points on algebraic varieties, and quantitative versions of the Oppenheim conjecture.  (Due to lack of time, the third application mentioned in the previous lecture, namely metric theory of Diophantine approximation, was not covered.)

The final distinguished lecture series for the academic year here at UCLA is being given this week by Gregory Margulis, who is giving three lectures on “homogeneous dynamics and number theory”.  In his first lecture, Prof. Margulis surveyed some classical problems in number theory that turn out, rather surprisingly, to have more or less equivalent counterparts in homogeneous dynamics – the theory of dynamical systems on homogeneous spaces $G/\Gamma$.

As usual, any errors in this post are due to my transcription of the talk.

A (concrete) Boolean algebra is a pair $(X, {\mathcal B})$, where X is a set, and ${\mathcal B}$ is a collection of subsets of X which contains the empty set $\emptyset$, and which is closed under unions $A, B \mapsto A \cup B$, intersections $A, B \mapsto A \cap B$, and complements $A \mapsto A^c := X \backslash A$. The subset relation $\subset$ also gives a relation on ${\mathcal B}$. Because ${\mathcal B}$ is concretely represented as subsets of a space X, these relations automatically obey various axioms; in particular, for any $A,B,C \in {\mathcal B}$, we have:

1. $\subset$ is a partial ordering on ${\mathcal B}$, and A and B have join $A \cup B$ and meet $A \cap B$.
2. We have the distributive laws $A \cup (B \cap C) = (A \cup B) \cap (A \cup C)$ and $A \cap (B \cup C) = (A \cap B) \cup (A \cap C)$.
3. $\emptyset$ is the minimal element of the partial ordering $\subset$, and $\emptyset^c$ is the maximal element.
4. $A \cap A^c = \emptyset$ and $A \cup A^c = \emptyset^c$.

(More succinctly: ${\mathcal B}$ is a lattice which is distributive, bounded, and complemented.)
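As a small illustration of the concrete definition, here is a minimal Python sketch that brute-force checks closure and axioms 2-4 for a collection of subsets of a finite set. The helper names `power_set` and `check_boolean_axioms` are hypothetical, and the restriction to finite X is an assumption made purely so that the check terminates. For honest sets, the identities in axioms 2-4 hold automatically (which is the point made above), so in practice the only thing that can fail is membership of $\emptyset$ or closure under the operations.

```python
from itertools import combinations

def power_set(X):
    """Return all subsets of the finite set X as frozensets."""
    items = list(X)
    return [frozenset(c)
            for r in range(len(items) + 1)
            for c in combinations(items, r)]

def check_boolean_axioms(X, B):
    """Check that B (a collection of subsets of X) contains the empty set,
    is closed under union, intersection and complement, and obeys axioms 2-4."""
    X = frozenset(X)
    B = set(B)
    empty = frozenset()

    def comp(A):
        return X - A

    if empty not in B:
        return False
    for A in B:
        if comp(A) not in B:
            return False
        # Axiom 3: the empty set is minimal, its complement is maximal.
        if not (empty <= A <= comp(empty)):
            return False
        # Axiom 4: A and its complement are disjoint and cover everything.
        if A & comp(A) != empty or A | comp(A) != comp(empty):
            return False
        for C in B:
            if (A | C) not in B or (A & C) not in B:
                return False
            for D in B:
                # Axiom 2: the two distributive laws.
                if A | (C & D) != (A | C) & (A | D):
                    return False
                if A & (C | D) != (A & C) | (A & D):
                    return False
    return True
```

For instance, the full power set of $\{1,2,3\}$ passes the check, while the collection $\{\emptyset, \{1\}\}$ of subsets of $\{1,2\}$ fails it because the complement $\{2\}$ is missing.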

We can then define an abstract Boolean algebra ${\mathcal B} = ({\mathcal B}, \emptyset, \cdot^c, \cup, \cap, \subset)$ to be an abstract set ${\mathcal B}$ with the specified objects, operations, and relations that obey the axioms 1-4. [Of course, some of these operations are redundant; for instance, intersection can be defined in terms of complement and union by de Morgan’s laws. In the literature, different authors select different initial operations and axioms when defining an abstract Boolean algebra, but they are all easily seen to be equivalent to each other. To emphasise the abstract nature of these algebras, the symbols $\emptyset, \cdot^c, \cup, \cap, \subset$ are often replaced with other symbols such as $0, \overline{\cdot}, \vee, \wedge, <$.]

Clearly, every concrete Boolean algebra is an abstract Boolean algebra. In the converse direction, we have Stone’s representation theorem (see below), which asserts (among other things) that every abstract Boolean algebra is isomorphic to a concrete one (and even constructs this concrete representation of the abstract Boolean algebra canonically). So, up to (abstract) isomorphism, there is really no difference between a concrete Boolean algebra and an abstract one.

Now let us turn from Boolean algebras to $\sigma$-algebras.

A concrete $\sigma$-algebra (also known as a measurable space) is a pair $(X,{\mathcal B})$, where X is a set, and ${\mathcal B}$ is a collection of subsets of X which contains $\emptyset$ and is closed under countable unions, countable intersections, and complements; thus every concrete $\sigma$-algebra is a concrete Boolean algebra, but not conversely. As before, concrete $\sigma$-algebras come equipped with the structures $\emptyset, \cdot^c, \cup, \cap, \subset$ which obey axioms 1-4, but they also come with the operations of countable union $(A_n)_{n=1}^\infty \mapsto \bigcup_{n=1}^\infty A_n$ and countable intersection $(A_n)_{n=1}^\infty \mapsto \bigcap_{n=1}^\infty A_n$, which obey an additional axiom:

5. Any countable family $A_1, A_2, \ldots$ of elements of ${\mathcal B}$ has supremum $\bigcup_{n=1}^\infty A_n$ and infimum $\bigcap_{n=1}^\infty A_n$.

As with Boolean algebras, one can now define an abstract $\sigma$-algebra to be a set ${\mathcal B} = ({\mathcal B}, \emptyset, \cdot^c, \cup, \cap, \subset, \bigcup_{n=1}^\infty, \bigcap_{n=1}^\infty )$ with the indicated objects, operations, and relations, which obeys axioms 1-5. Again, every concrete $\sigma$-algebra is an abstract one; but is it still true that every abstract $\sigma$-algebra is representable as a concrete one?

The answer turns out to be no, but the obstruction can be described precisely (namely, one needs to quotient out an ideal of “null sets” from the concrete $\sigma$-algebra), and there is a satisfactory representation theorem, namely the Loomis-Sikorski representation theorem (see below). As a corollary of this representation theorem, one can also represent abstract measure spaces $({\mathcal B},\mu)$ (also known as measure algebras) by concrete measure spaces, $(X, {\mathcal B}, \mu)$, after quotienting out by null sets.

In the rest of this post, I will state and prove these representation theorems. They are not actually used directly in the rest of the course (and they will also require some results that we haven’t proven yet, most notably Tychonoff’s theorem), and so these notes are optional reading; but these theorems do help explain why it is “safe” to focus attention primarily on concrete $\sigma$-algebras and measure spaces when doing measure theory, since the abstract analogues of these mathematical concepts are largely equivalent to their concrete counterparts. (The situation is quite different for non-commutative measure theories, such as quantum probability, in which there is basically no good representation theorem available to equate the abstract with the classically concrete, but I will not discuss these theories here.)

Now that we have reviewed the foundations of measure theory, let us now put it to work to set up the basic theory of one of the fundamental families of function spaces in analysis, namely the $L^p$ spaces (also known as Lebesgue spaces). These spaces serve as important model examples for the general theory of topological and normed vector spaces, which we will discuss a little bit in this lecture and then in much greater detail in later lectures. (See also my previous blog post on function spaces.)

Just as scalar quantities live in the space of real or complex numbers, and vector quantities live in vector spaces, functions $f: X \to {\Bbb C}$ (or other objects closely related to functions, such as measures) live in function spaces. Like other spaces in mathematics (e.g. vector spaces, metric spaces, topological spaces, etc.) a function space $V$ is not just a mere set of objects (in this case, the objects are functions), but also comes with various important structures that allow one to do some useful operations inside the space, and from one space to another. For example, function spaces tend to have several (though usually not all) of the following types of structures, which are usually related to each other by various compatibility conditions:

1. Vector space structure. One can often add two functions $f, g$ in a function space $V$, and expect to get another function $f+g$ in that space $V$; similarly, one can multiply a function $f$ in $V$ by a scalar $c$ and get another function $cf$ in $V$. Usually, these operations obey the axioms of a vector space, though it is important to caution that the dimension of a function space is typically infinite. (In some cases, the space of scalars is a more complicated ring than the real or complex field, in which case we need the notion of a module rather than a vector space, but we will not use this more general notion in this course.) Virtually all of the function spaces we shall encounter in this course will be vector spaces. Because the field of scalars is real or complex, vector spaces also come with the notion of convexity, which turns out to be crucial in many aspects of analysis. As a consequence (and in marked contrast to algebra or number theory), much of the theory in real analysis does not seem to extend to other fields of scalars (in particular, real analysis fails spectacularly in the finite characteristic setting).
2. Algebra structure. Sometimes (though not always), we also wish to multiply two functions $f$, $g$ in $V$ and get another function $fg$ in $V$; when combined with the vector space structure and assuming some compatibility conditions (e.g. the distributive law), this makes $V$ an algebra. This multiplication operation is often just pointwise multiplication, but there are other important multiplication operations on function spaces too, such as convolution. (One sometimes sees other algebraic structures than multiplication appear in function spaces, most notably derivations, but again we will not encounter those in this course. Another common algebraic operation for function spaces is conjugation or adjoint, leading to the notion of a *-algebra.)
3. Norm structure. We often want to distinguish “large” functions in $V$ from “small” ones, especially in analysis, in which “small” terms in an expression are routinely discarded or deemed to be acceptable errors. One way to do this is to assign a magnitude or norm $\|f\|_V$ to each function that measures its size. Unlike the situation with scalars, where there is basically a single notion of magnitude, functions have a wide variety of useful notions of size, each measuring a different aspect (or combination of aspects) of the function, such as height, width, oscillation, regularity, decay, and so forth. Typically, each such norm gives rise to a separate function space (although sometimes it is useful to consider a single function space with multiple norms on it). We usually require the norm to be compatible with the vector space structure (and algebra structure, if present), for instance by demanding that the triangle inequality hold.
4. Metric structure. We also want to tell whether two functions f, g in a function space V are “near together” or “far apart”. A typical way to do this is to impose a metric $d: V \times V \to {\Bbb R}^+$ on the space $V$. If both a norm $\| \|_V$ and a vector space structure are available, there is an obvious way to do this: define the distance between two functions $f, g$ in $V$ to be $d( f, g ) := \|f-g\|_V$. (This will be the only type of metric on function spaces encountered in this course. But there are some nonlinear function spaces of importance in nonlinear analysis (e.g. spaces of maps from one manifold to another) which have no vector space structure or norm, but still have a metric.) It is often important to know if the vector space is complete with respect to the given metric; this allows one to take limits of Cauchy sequences, and (with a norm and vector space structure) sum absolutely convergent series, as well as use some useful results from point set topology such as the Baire category theorem. All of these operations are of course vital in analysis. [Compactness would be an even better property than completeness to have, but function spaces unfortunately tend to be non-compact in various rather nasty ways, although there are useful partial substitutes for compactness that are available, see e.g. this blog post of mine.]
5. Topological structure. It is often important to know when a sequence (or, occasionally, a net) of functions $f_n$ in $V$ “converges” in some sense to a limit $f$ (which, hopefully, is still in $V$); there are often many distinct modes of convergence (e.g. pointwise convergence, uniform convergence, etc.) that one wishes to carefully distinguish from each other. Also, in order to apply various powerful topological theorems (or to justify various formal operations involving limits, suprema, etc.), it is important to know when certain subsets of $V$ enjoy key topological properties (most notably compactness and connectedness), and to know which operations on $V$ are continuous. For all of this, one needs a topology on $V$. If one already has a metric, then one of course has a topology generated by the open balls of that metric; but there are many important topologies on function spaces in analysis that do not arise from metrics. We also often require the topology to be compatible with the other structures on the function space; for instance, we usually require the vector space operations of addition and scalar multiplication to be continuous. In some cases, the topology on $V$ extends to some natural superspace $W$ of more general functions that contains $V$; in such cases, it is often important to know whether $V$ is closed in $W$, so that limits of sequences in $V$ stay in $V$.
6. Functional structures. Since numbers are easier to understand and deal with than functions, it is not surprising that we often study functions f in a function space V by first applying some functional $\lambda: V \to {\Bbb C}$ to V to identify some key numerical quantity $\lambda(f)$ associated to f. Norms $f \mapsto \|f\|_V$ are of course one important example of a functional; integration $f \mapsto \int_X f\ d\mu$ provides another; and evaluation $f \mapsto f(x)$ at a point x provides a third important class. (Note, though, that while evaluation is the fundamental feature of a function in set theory, it is often a quite minor operation in analysis; indeed, in many function spaces, evaluation is not even defined at all, for instance because the functions in the space are only defined almost everywhere!) An inner product $\langle,\rangle$ on $V$ (see below) also provides a large family $f \mapsto \langle f, g \rangle$ of useful functionals. It is of particular interest to study functionals that are compatible with the vector space structure (i.e. are linear) and with the topological structure (i.e. are continuous); this will give rise to the important notion of duality on function spaces.
7. Inner product structure. One often would like to pair a function f in a function space V with another object g (which is often, though not always, another function in the same function space V) and obtain a number $\langle f, g \rangle$, that typically measures the amount of “interaction” or “correlation” between f and g. Typical examples include inner products arising from integration, such as $\langle f, g\rangle := \int_X f \overline{g}\ d\mu$; integration itself can also be viewed as a pairing, $\langle f, \mu \rangle := \int_X f\ d\mu$. Of course, we usually require such inner products to be compatible with the other structures present on the space (e.g., to be compatible with the vector space structure, we usually require the inner product to be bilinear or sesquilinear). Inner products, when available, are incredibly useful in understanding the metric and norm geometry of a space, due to such fundamental facts as the Cauchy-Schwarz inequality and the parallelogram law. They also give rise to the important notion of orthogonality between functions.
8. Group actions. We often expect our function spaces to enjoy various symmetries; we might wish to rotate, reflect, translate, modulate, or dilate our functions and expect to preserve most of the structure of the space when doing so. In modern mathematics, symmetries are usually encoded by group actions (or actions of other group-like objects, such as semigroups or groupoids; one also often upgrades groups to more structured objects such as Lie groups). As usual, we typically require the group action to preserve the other structures present on the space, e.g. one often restricts attention to group actions that are linear (to preserve the vector space structure), continuous (to preserve topological structure), unitary (to preserve inner product structure), isometric (to preserve metric structure), and so forth. Besides giving us useful symmetries to spend, the presence of such group actions allows one to apply the powerful techniques of representation theory, Fourier analysis, and ergodic theory. However, as this is a foundational real analysis class, we will not discuss these important topics much here (and in fact will not deal with group actions much at all).
9. Order structure. In some cases, we want to utilise the notion of a function f being “non-negative”, or “dominating” another function g. One might also want to take the “max” or “supremum” of two or more functions in a function space V, or split a function into “positive” and “negative” components. Such order structures interact with the other structures on a space in many useful ways (e.g. via the Stone-Weierstrass theorem). Much like convexity, order structure is specific to the real line and is another reason why much of real analysis breaks down over other fields. (The complex plane is of course an extension of the real line and so is able to exploit the order structure of that line, usually by treating the real and imaginary components separately.)

There are of course many ways to combine various flavours of these structures together, and there are entire subfields of mathematics that are devoted to studying particularly common and useful categories of such combinations (e.g. topological vector spaces, normed vector spaces, Banach spaces, Banach algebras, von Neumann algebras, C^* algebras, Frechet spaces, Hilbert spaces, group algebras, etc.). The study of these sorts of spaces is known collectively as functional analysis. We will study some (but certainly not all) of these combinations in an abstract and general setting later in this course, but to begin with we will focus on the $L^p$ spaces, which are very good model examples for many of the above general classes of spaces, and also of importance in many applications of analysis (such as probability or PDE).
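To make the norm, metric, and inner product structures above a little more concrete, here is a minimal Python sketch on the simplest possible function space: functions on a finite set, stored as lists, with counting measure. The helper names (`lp_norm`, `distance`, `parallelogram_defect`) are hypothetical, and the formula used for the $L^p$ norm is the standard one for finite sequences, which is an assumption here since these notes have not yet defined it. The sketch illustrates two of the points made above: different norms measure different aspects of a function, and the parallelogram law singles out the norm that comes from an inner product.

```python
def lp_norm(f, p):
    """L^p norm of a finite sequence f with respect to counting measure."""
    return sum(abs(x) ** p for x in f) ** (1.0 / p)

def distance(f, g, p):
    """Metric induced by the norm: d(f, g) := ||f - g||."""
    return lp_norm([a - b for a, b in zip(f, g)], p)

def parallelogram_defect(f, g, p):
    """||f+g||^2 + ||f-g||^2 - 2||f||^2 - 2||g||^2 in the L^p norm.

    This vanishes for every pair f, g when the norm comes from an
    inner product (the parallelogram law)."""
    s = [a + b for a, b in zip(f, g)]
    d = [a - b for a, b in zip(f, g)]
    return (lp_norm(s, p) ** 2 + lp_norm(d, p) ** 2
            - 2 * lp_norm(f, p) ** 2 - 2 * lp_norm(g, p) ** 2)

# A "tall, narrow" function and a "short, wide" one.
tall = [4.0, 0.0, 0.0, 0.0]
wide = [1.0, 1.0, 1.0, 1.0]

# Two functions with disjoint supports.
f = [1.0, 0.0]
g = [0.0, 1.0]
```

With these inputs, `tall` and `wide` have the same size in the $p=1$ norm but different sizes in the $p=2$ norm, and `parallelogram_defect(f, g, p)` is zero (up to rounding) for $p=2$ but not for $p=1$.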

Notational convention: In this post only, I will colour a statement red if it assumes the axiom of choice. (For the rest of the course, the axiom of choice will be implicitly assumed throughout.) $\diamond$

The famous Banach-Tarski paradox asserts that one can take the unit ball in three dimensions, divide it up into finitely many pieces, and then translate and rotate each piece so that their union is now two disjoint unit balls.  As a consequence of this paradox, it is not possible to create a finitely additive measure on ${\Bbb R}^3$ that is both translation and rotation invariant, which can measure every subset of ${\Bbb R}^3$, and which gives the unit ball a non-zero measure. This paradox helps explain why Lebesgue measure (which is countably additive and both translation and rotation invariant, and gives the unit ball a non-zero measure) cannot measure every set, instead being restricted to measuring sets that are Lebesgue measurable.

On the other hand, it is not possible to replicate the Banach-Tarski paradox in one or two dimensions; the unit interval in ${\Bbb R}$ or unit disk in ${\Bbb R}^2$ cannot be rearranged into two unit intervals or two unit disks using only finitely many pieces, translations, and rotations, and indeed there do exist non-trivial finitely additive measures on these spaces. However, it is possible to obtain a Banach-Tarski type paradox in one or two dimensions using countably many such pieces; this rules out the possibility of extending Lebesgue measure to a countably additive translation invariant measure on all subsets of ${\Bbb R}$ (or any higher-dimensional space).

In these notes I would like to establish all of the above results, and tie them in with some important concepts and tools in modern group theory, most notably amenability and the ping-pong lemma.  This material is not required for the rest of the course, but nevertheless has some independent interest.

For these notes, $X = (X, {\mathcal X})$ is a fixed measurable space. We shall often omit the $\sigma$-algebra ${\mathcal X}$, and simply refer to elements of ${\mathcal X}$ as measurable sets. Unless otherwise indicated, all subsets of X appearing below are restricted to be measurable, and all functions on X appearing below are also restricted to be measurable.

We let ${\mathcal M}_+(X)$ denote the space of measures on X, i.e. functions $\mu: {\mathcal X} \to [0,+\infty]$ which are countably additive and send $\emptyset$ to 0. For reasons that will be clearer later, we shall refer to such measures as unsigned measures. In this section we investigate the structure of this space, together with the closely related spaces of signed measures and finite measures.

Suppose that we have already constructed one unsigned measure $m \in {\mathcal M}_+(X)$ on X (e.g. think of X as the real line with the Borel $\sigma$-algebra, and let m be Lebesgue measure). Then we can obtain many further unsigned measures on X by multiplying m by a function $f: X \to [0,+\infty]$, to obtain a new unsigned measure $m_f$, defined by the formula

$m_f(E) := \int_X 1_E f\ dm$. (1)

If $f = 1_A$ is an indicator function, we write $m\downharpoonright_A$ for $m_{1_A}$, and refer to this measure as the restriction of m to A.

Exercise 1. Show (using the monotone convergence theorem) that $m_f$ is indeed an unsigned measure, and for any $g: X \to [0,+\infty]$, we have ${}\int_X g\ dm_f = \int_X gf\ dm$. We will express this relationship symbolically as

$dm_f = f dm$.$\diamond$ (2)
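The relationship (2) can be sanity-checked numerically in the simplest setting. The following is a minimal Python sketch, assuming that X is a finite set in which every subset is measurable, that m is given by finite point masses, and that f and g take finite values (so that no conventions about $0 \cdot \infty$ are needed). The helper names `integral`, `weighted_measure` and `measure_of` are hypothetical. It is of course only an illustration on one example, not a proof of Exercise 1.

```python
def integral(g, m):
    """Integrate g against a measure m on a finite set.

    m is a dict mapping each point x to the point mass m({x})."""
    return sum(g(x) * m[x] for x in m)

def weighted_measure(f, m):
    """Point masses of m_f, where m_f({x}) = f(x) * m({x})."""
    return {x: f(x) * m[x] for x in m}

def measure_of(E, m):
    """Measure m(E) of a subset E, by finite additivity."""
    return sum(m[x] for x in m if x in E)

# An example measure on X = {0, 1, 2}, with a weight f and a test function g.
m = {0: 0.5, 1: 2.0, 2: 1.0}
f = lambda x: x + 1.0
g = lambda x: float(x * x)

m_f = weighted_measure(f, m)
lhs = integral(g, m_f)                       # integral of g against m_f
rhs = integral(lambda x: g(x) * f(x), m)     # integral of g*f against m
```

Here `lhs` and `rhs` agree, and `measure_of({1, 2}, m_f)` agrees with the integral of $1_E f$ against m for $E = \{1,2\}$, as in (1).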

Exercise 2. Let m be $\sigma$-finite. Given two functions $f, g: X \to [0,+\infty]$, show that $m_f = m_g$ if and only if $f(x) = g(x)$ for m-almost every x. (Hint: as usual, first do the case when m is finite. The key point is that if f and g are not equal m-almost everywhere, then either f>g on a set of positive measure, or f<g on a set of positive measure.) Give an example to show that this uniqueness statement can fail if m is not $\sigma$-finite. (Hint: take a very simple example, e.g. let X consist of just one point.) $\diamond$

In view of Exercises 1 and 2, let us temporarily call a measure $\mu$ differentiable with respect to m if $d\mu = f dm$ (i.e. $\mu = m_f$) for some $f: X \to [0,+\infty]$, and call f the Radon-Nikodym derivative of $\mu$ with respect to m, writing

$\displaystyle f = \frac{d\mu}{dm}$; (3)

by Exercise 2, we see that if $m$ is $\sigma$-finite, then this derivative is well defined up to m-almost everywhere equivalence.

Exercise 3. (Relationship between Radon-Nikodym derivative and classical derivative) Let m be Lebesgue measure on ${}[0,+\infty)$, and let $\mu$ be an unsigned measure that is differentiable with respect to m. If $\mu$ has a continuous Radon-Nikodym derivative $\frac{d\mu}{dm}$, show that the function $x \mapsto \mu( [0,x])$ is differentiable, and $\frac{d}{dx} \mu([0,x]) = \frac{d\mu}{dm}(x)$ for all x. $\diamond$
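As a quick numerical illustration of Exercise 3 (not a solution to it), here is a minimal Python sketch for one specific choice of density. The density $f(x) = 1 + x^2$ and the closed form $\mu([0,x]) = x + x^3/3$ are assumptions chosen for the example, and the function names are hypothetical. The classical derivative is approximated by a central difference, so the check is only made at points $x > 0$.

```python
def density(x):
    """An example continuous Radon-Nikodym derivative d(mu)/dm."""
    return 1.0 + x * x

def mu_interval(x):
    """mu([0, x]), i.e. the integral of the density over [0, x],
    computed by hand as x + x^3/3."""
    return x + x ** 3 / 3.0

def central_difference(F, x, h=1e-5):
    """Symmetric finite-difference approximation to F'(x)."""
    return (F(x + h) - F(x - h)) / (2.0 * h)

# Compare the classical derivative of x -> mu([0, x]) with the density.
errors = [abs(central_difference(mu_interval, x) - density(x))
          for x in (0.5, 1.0, 2.0)]
```

Each entry of `errors` is small (well below $10^{-6}$ for these inputs), consistent with the identity in the exercise.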

Exercise 4. Let X be at most countable. Show that every measure on X is differentiable with respect to counting measure $\#$. $\diamond$

If every measure were differentiable with respect to m (as is the case in Exercise 4), then we would have completely described the space of measures of X in terms of the non-negative functions of X (modulo m-almost everywhere equivalence). Unfortunately, not every measure is differentiable with respect to every other: for instance, if x is a point in X, then the only measures that are differentiable with respect to the Dirac measure $\delta_x$ are the scalar multiples of that measure. We will explore the precise obstruction that prevents all measures from being differentiable, culminating in the Radon-Nikodym-Lebesgue theorem that gives a satisfactory understanding of the situation in the $\sigma$-finite case (which is the case of interest for most applications).
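The claim about the Dirac measure can be verified directly from (1). Taking $m = \delta_x$ and any $f: X \to [0,+\infty]$, and using the fact that integrating a function against $\delta_x$ simply evaluates it at x, one obtains for every measurable set E:

```latex
(\delta_x)_f(E) = \int_X 1_E f\ d\delta_x = 1_E(x) f(x) = f(x)\, \delta_x(E).
```

Thus $(\delta_x)_f = f(x) \delta_x$ is the multiple of $\delta_x$ by the scalar $f(x) \in [0,+\infty]$, and conversely every such multiple $c \delta_x$ arises by taking f to be the constant function c.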

In order to establish this theorem, it will be important to first study some other basic operations on measures, notably the ability to subtract one measure from another. This will necessitate the study of signed measures, to which we now turn.

[The material here is largely based on Folland’s text, except for the last section.]

In this supplemental note to the previous lecture notes, I would like to give an alternate proof of a (weak form of the) Carathéodory extension theorem.  This argument is restricted to the $\sigma$-finite case, and does not extend the measure to quite as large a $\sigma$-algebra as is provided by the standard proof of this theorem, but I find it conceptually clearer (in particular, hewing quite closely to Littlewood’s principles, and the general Lebesgue philosophy of treating sets of small measure as negligible), and suffices for many standard applications of this theorem, in particular the construction of Lebesgue measure.

Let us first give the precise statement of the theorem:

Theorem 1. (Weak Carathéodory extension theorem)  Let ${\mathcal A}$ be a Boolean algebra of subsets of a set X, and let $\mu: {\mathcal A} \to [0,+\infty]$ be a function obeying the following three properties:

1. $\mu(\emptyset) = 0$.
2. (Pre-countable additivity) If $A_1,A_2,\ldots \in {\mathcal A}$ are disjoint and such that $\bigcup_{n=1}^\infty A_n$ also lies in ${\mathcal A}$, then $\mu(\bigcup_{n=1}^\infty A_n) = \sum_{n=1}^\infty \mu(A_n)$.
3. ($\sigma$-finiteness) X can be covered by at most countably many sets in ${\mathcal A}$, each of which has finite $\mu$-measure.

Let ${\mathcal X}$ be the $\sigma$-algebra generated by ${\mathcal A}$.  Then $\mu$ can be uniquely extended to a countably additive measure on ${\mathcal X}$.

We will refer to sets in ${\mathcal A}$ as elementary sets and sets in ${\mathcal X}$ as measurable sets. A typical example is when X=[0,1] and ${\mathcal A}$ is the collection of all sets that are unions of finitely many intervals; in this case, ${\mathcal X}$ consists of the Borel-measurable sets.
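In the typical example above, a natural choice of $\mu$ on the elementary sets is total length (this choice is an assumption here, motivated by the construction of Lebesgue measure mentioned earlier). The following minimal Python sketch computes this quantity for a finite union of intervals, which may overlap, by sorting and merging them. The function name `elementary_measure` is hypothetical, and intervals are represented as pairs `(a, b)` of endpoints; since single points have length zero, it does not matter whether endpoints are included.

```python
def elementary_measure(intervals):
    """Total length of a finite union of intervals, given as (a, b) pairs.

    Overlapping intervals are merged first, so that no length is
    counted twice."""
    total = 0.0
    current_start, current_end = None, None
    for a, b in sorted(intervals):
        if b <= a:
            continue  # empty or degenerate interval: contributes no length
        if current_end is None or a > current_end:
            # Start a new merged block; bank the length of the previous one.
            if current_end is not None:
                total += current_end - current_start
            current_start, current_end = a, b
        else:
            # Overlaps (or touches) the current block: extend it.
            current_end = max(current_end, b)
    if current_end is not None:
        total += current_end - current_start
    return total
```

For example, the overlapping union $[0, 0.5] \cup [0.25, 0.75]$ and the disjoint union $[0, 0.25] \cup [0.5, 1]$ both receive measure 0.75, the latter illustrating finite additivity on disjoint elementary sets.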

In these notes we quickly review the basics of abstract measure theory and integration theory, which was covered in the previous course but will of course be relied upon in the current course.  This is only a brief summary of the material; of course, one should consult a real analysis text for the full details of the theory.
