For these notes, $X = (X, {\mathcal X})$ is a fixed measurable space. We shall often omit the $\sigma$-algebra ${\mathcal X}$, and simply refer to elements of ${\mathcal X}$ as measurable sets. Unless otherwise indicated, all subsets of X appearing below are restricted to be measurable, and all functions on X appearing below are also restricted to be measurable.

We let ${\mathcal M}_+(X)$ denote the space of measures on X, i.e. functions $\mu: {\mathcal X} \to [0,+\infty]$ which are countably additive and send $\emptyset$ to 0. For reasons that will be clearer later, we shall refer to such measures as unsigned measures. In this section we investigate the structure of this space, together with the closely related spaces of signed measures and finite measures.

Suppose that we have already constructed one unsigned measure $m \in {\mathcal M}_+(X)$ on X (e.g. think of X as the real line with the Borel $\sigma$-algebra, and let m be Lebesgue measure). Then we can obtain many further unsigned measures on X by multiplying m by a function $f: X \to [0,+\infty]$, to obtain a new unsigned measure $m_f$, defined by the formula

$m_f(E) := \int_X 1_E f\ dm$. (1)

If $f = 1_A$ is an indicator function, we write $m\downharpoonright_A$ for $m_{1_A}$, and refer to this measure as the restriction of m to A.
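On a finite space, where the integral in (1) reduces to a finite sum, one can check the definition of $m_f$ and of restriction directly by hand. Here is a minimal sketch in Python (the function names and numbers are illustrative, not from the text):

```python
# Sketch of the measure m_f from formula (1) on a finite space, where
# the integral reduces to a finite sum.  All names are illustrative.

def make_m_f(m, f):
    """Given a measure m (dict: point -> mass) and a non-negative
    density f (dict: point -> weight), return the set function
    m_f with m_f(E) = integral over E of f dm = sum of f*m over E."""
    def m_f(E):
        return sum(f[x] * m[x] for x in E)
    return m_f

m = {1: 0.5, 2: 0.5, 3: 1.0}       # reference measure
f = {1: 2.0, 2: 0.0, 3: 4.0}       # non-negative density

m_f = make_m_f(m, f)
print(m_f({1, 2, 3}))               # 2*0.5 + 0*0.5 + 4*1.0 = 5.0
print(m_f({2}))                     # 0.0

# The restriction m|_A is the special case f = 1_A:
A = {1, 3}
m_restr_A = make_m_f(m, {x: (1.0 if x in A else 0.0) for x in m})
print(m_restr_A({1, 2, 3}))         # m(A) = 0.5 + 1.0 = 1.5
```

Countable additivity of $m_f$ in this discrete setting is just the rearrangement of an absolutely convergent sum; the monotone convergence theorem is what makes the same argument work in general.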

Exercise 1. Show (using the monotone convergence theorem) that $m_f$ is indeed a unsigned measure, and for any $g: X \to [0,+\infty]$, we have ${}\int_X g\ dm_f = \int_X gf\ dm$. We will express this relationship symbolically as

$dm_f = f\ dm$. $\diamond$ (2)

Exercise 2. Let m be $\sigma$-finite. Given two functions $f, g: X \to [0,+\infty]$, show that $m_f = m_g$ if and only if $f(x) = g(x)$ for m-almost every x. (Hint: as usual, first do the case when m is finite. The key point is that if f and g are not equal m-almost everywhere, then either f>g on a set of positive measure, or f<g on a set of positive measure.) Give an example to show that this uniqueness statement can fail if m is not $\sigma$-finite. (Hint: take a very simple example, e.g. let X consist of just one point.) $\diamond$

In view of Exercises 1 and 2, let us temporarily call a measure $\mu$ differentiable with respect to m if $d\mu = f dm$ (i.e. $\mu = m_f$) for some $f: X \to [0,+\infty]$, and call f the Radon-Nikodym derivative of $\mu$ with respect to m, writing

$\displaystyle f = \frac{d\mu}{dm}$; (3)

by Exercise 2, we see that if $m$ is $\sigma$-finite, then this derivative is defined up to m-almost everywhere equivalence.

Exercise 3. (Relationship between Radon-Nikodym derivative and classical derivative) Let m be Lebesgue measure on ${}[0,+\infty)$, and let $\mu$ be an unsigned measure that is differentiable with respect to m. If $\mu$ has a continuous Radon-Nikodym derivative $\frac{d\mu}{dm}$, show that the function $x \mapsto \mu( [0,x])$ is differentiable, and $\frac{d}{dx} \mu([0,x]) = \frac{d\mu}{dm}(x)$ for all x. $\diamond$

Exercise 4. Let X be at most countable. Show that every measure on X is differentiable with respect to counting measure $\#$. $\diamond$
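To see why Exercise 4 works, note that on an at most countable space one can simply take $f(x) := \mu(\{x\})$, since countable additivity recovers $\mu$ from its point masses. A quick numerical sanity check of this on a three-point space (names and values hypothetical):

```python
# Exercise 4 in miniature: on a finite set, every measure mu is
# differentiable with respect to counting measure #, with derivative
# f(x) = mu({x}).  Illustrative sketch only.
from itertools import chain, combinations

mu = {'a': 0.2, 'b': 0.0, 'c': 3.5}      # mu({x}) for each point x

f = {x: mu[x] for x in mu}               # candidate dmu/d#

def integral_wrt_counting(g, E):
    # Integral of g over E against counting measure = sum of g on E.
    return sum(g[x] for x in E)

# Check mu(E) = int_E f d# on every subset E:
points = list(mu)
all_subsets = chain.from_iterable(
    combinations(points, r) for r in range(len(points) + 1))
assert all(
    abs(sum(mu[x] for x in E) - integral_wrt_counting(f, E)) < 1e-12
    for E in all_subsets
)
print("mu = #_f with f(x) = mu({x})")
```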

If every measure were differentiable with respect to m (as is the case in Exercise 4), then we would have completely described the space of measures on X in terms of the non-negative functions on X (modulo m-almost everywhere equivalence). Unfortunately, not every measure is differentiable with respect to every other: for instance, if x is a point in X, then the only measures that are differentiable with respect to the Dirac measure $\delta_x$ are the scalar multiples of that measure. We will explore the precise obstruction that prevents all measures from being differentiable, culminating in the Radon-Nikodym-Lebesgue theorem that gives a satisfactory understanding of the situation in the $\sigma$-finite case (which is the case of interest for most applications).

In order to establish this theorem, it will be important to first study some other basic operations on measures, notably the ability to subtract one measure from another. This will necessitate the study of signed measures, to which we now turn.

[The material here is largely based on Folland’s text, except for the last section.]

— Signed measures —

We have seen that if we fix a reference measure m, then non-negative functions $f: X \to [0,+\infty]$ (modulo m-almost everywhere equivalence) can be identified with unsigned measures $m_f: {\mathcal X} \to [0,+\infty]$. This motivates various operations on measures that are analogous to operations on functions (indeed, one could view measures as a kind of “generalised function” with respect to a fixed reference measure m). For instance, we can define the sum of two unsigned measures $\mu, \nu: {\mathcal X} \to [0,+\infty]$ as

$(\mu+\nu)(E) :=\mu(E) + \nu(E)$ (4)

and non-negative scalar multiples $c\mu$ for $c > 0$ by

$(c\mu)(E) := c(\mu(E))$. (5)

We can also say that one measure $\mu$ is less than another $\nu$ if

$\mu(E) \leq \nu(E)$ for all $E \in {\mathcal X}$. (6)

These operations are all consistent with their functional counterparts, e.g. $m_{f+g} = m_f + m_g$, etc.

Next, we would like to define the difference $\mu-\nu$ of two unsigned measures. The obvious thing to do is to define

$(\mu-\nu)(E) := \mu(E) - \nu(E)$ (7)

but we have a problem if $\mu(E)$ and $\nu(E)$ are both infinite: $\infty-\infty$ is undefined! To fix this problem, we will only define the difference of two unsigned measures $\mu,\nu$ if at least one of them is a finite measure. Observe that in that case, $\mu-\nu$ takes values in $(-\infty,+\infty]$ (if $\nu$ is finite) or in ${}[-\infty,+\infty)$ (if $\mu$ is finite); in particular, it never attains both the values $+\infty$ and $-\infty$.

Of course, we no longer expect $\mu-\nu$ to be monotone. However, it is still finitely additive, and even countably additive in the sense that the sum $\sum_{n=1}^\infty (\mu-\nu)(E_n)$ converges to $(\mu-\nu)(\bigcup_{n=1}^\infty E_n)$ whenever $E_1,E_2,\ldots$ are disjoint sets, and furthermore that the sum is absolutely convergent when $(\mu-\nu)(\bigcup_{n=1}^\infty E_n)$ is finite. This motivates

Definition 1. (Signed measure) A signed measure is a map $\mu: {\mathcal X} \to [-\infty,+\infty]$ such that

1. $\mu(\emptyset) = 0$;
2. $\mu$ can take either the value $+\infty$ or $-\infty$, but not both;
3. If $E_1,E_2,\ldots \subset X$ are disjoint, then $\sum_{n=1}^\infty \mu(E_n)$ converges to $\mu( \bigcup_{n=1}^\infty E_n)$, with the former sum being absolutely convergent if the latter expression is finite. [Actually, the absolute convergence is automatic from the Riemann rearrangement theorem. Another consequence of 3. is that any subset of a set of finite measure again has finite measure, and that any finite union of sets of finite measure again has finite measure.]

Thus every unsigned measure is a signed measure, and the difference of two unsigned measures is a signed measure if at least one of the unsigned measures is finite; we will see shortly that the converse statement is also true, i.e. every signed measure is the difference of two unsigned measures (with one of the unsigned measures being finite). Other examples of signed measures are the measures $m_f$ defined by (1), where $f: X \to [-\infty,+\infty]$ is now signed rather than unsigned, but with the assumption that at least one of the positive part $f_+ := \max(f,0)$ and the negative part $f_- := \max(-f,0)$ of f is absolutely integrable.

We also observe that a signed measure $\mu$ is unsigned if and only if $\mu \geq 0$ (where we use (6) to define order on measures).

Given a function $f: X \to [-\infty,+\infty]$, we can partition X into one set $X_+ := \{x: f(x) \geq 0 \}$ on which f is non-negative, and another set $X_- := \{x: f(x) < 0\}$ on which f is negative; thus $f\downharpoonright_{X_+} \geq 0$ and $f\downharpoonright_{X_-} \leq 0$. It turns out that the same is true for signed measures:

Theorem 1. (Hahn decomposition theorem) Let $\mu$ be a signed measure. Then one can find a partition $X = X_+ \cup X_-$ such that $\mu\downharpoonright_{X_+} \geq 0$ and $\mu\downharpoonright_{X_-} \leq 0$.

Proof. By replacing $\mu$ with $-\mu$ if necessary, we may assume that $\mu$ avoids the value $+\infty$.

Call a set E totally positive if $\mu\downharpoonright_E \geq 0$, and totally negative if $\mu\downharpoonright_E \leq 0$. The idea is to pick $X_+$ to be the totally positive set of maximal measure – a kind of “greedy algorithm”, if you will. More precisely, define $m_+$ to be the supremum of $\mu(E)$, where E ranges over all totally positive sets. (The supremum is non-vacuous, since the empty set is totally positive.) We claim that the supremum is actually attained. Indeed, we can always find a maximising sequence $E_1, E_2, \ldots$ of totally positive sets with $\mu(E_n) \to m_+$. It is not hard to see that the union $X_+ := \bigcup_{n=1}^\infty E_n$ is also totally positive, and $\mu(X_+) = m_+$ as required. Since $\mu$ avoids $+\infty$, we see in particular that $m_+$ is finite.

Set $X_- := X \backslash X_+$. We claim that $X_-$ is totally negative. We do this as follows. Suppose for contradiction that $X_-$ is not totally negative; then there exists a set $E_1$ in $X_-$ of strictly positive measure. If $E_1$ is totally positive, then $X_+ \cup E_1$ is a totally positive set having measure strictly greater than $m_+$, a contradiction. Thus $E_1$ must contain a subset $E_2$ of strictly larger measure. Let us pick $E_2$ so that $\mu(E_2) \geq \mu(E_1) + 1/n_1$, where $n_1$ is the smallest integer for which such an $E_2$ exists. If $E_2$ is totally positive, then we are again done, so we can find a subset $E_3$ with $\mu(E_3) \geq \mu(E_2) + 1/n_2$, where $n_2$ is the smallest integer for which such an $E_3$ exists. Continuing in this fashion, we either stop and get a contradiction, or obtain a nested sequence of sets $E_1 \supset E_2 \supset \ldots$ in $X_-$ of increasing positive measure (with $\mu(E_{j+1}) \geq \mu(E_j) + 1/n_j$). The intersection $E := \bigcap_j E_j$ then also has positive measure, which is finite since $\mu$ avoids $+\infty$; this implies that the $n_j$ go to infinity. It is then not difficult to see that E itself cannot contain any subsets of strictly larger measure, and so E is a totally positive set of positive measure in $X_-$, and we again obtain a contradiction. $\Box$
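On a finite space the theorem collapses to something very concrete: a signed measure is determined by its point masses, and one can take $X_+$ to be the set of points of non-negative mass (no greedy iteration is needed). A sketch of this special case, with illustrative values:

```python
# Hahn decomposition on a finite space (illustrative sketch).  Here a
# signed measure is determined by its values on points, and X_+ can
# simply be taken to be the points of non-negative mass.
from itertools import chain, combinations

mu = {1: 3.0, 2: -1.5, 3: 0.0, 4: -2.0, 5: 0.5}   # signed point masses

X_plus = {x for x in mu if mu[x] >= 0}
X_minus = set(mu) - X_plus

def mu_of(E):
    return sum(mu[x] for x in E)

def subsets(S):
    S = list(S)
    return chain.from_iterable(combinations(S, r) for r in range(len(S) + 1))

# mu restricted to X_+ is >= 0 on every subset, and <= 0 on X_-:
assert all(mu_of(E) >= 0 for E in subsets(X_plus))
assert all(mu_of(E) <= 0 for E in subsets(X_minus))
print(sorted(X_plus), sorted(X_minus))   # [1, 3, 5] [2, 4]
```

Note that the point 3, which carries zero mass, could be placed on either side; this matches the uniqueness-modulo-null-sets discussed below.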

Remark 0. A somewhat simpler proof of the Hahn decomposition theorem is available if we assume $\mu$ to have finite positive variation (which means that $\mu(E)$ is bounded above as E varies). For each positive integer n, let $E_n$ be a set whose measure $\mu(E_n)$ is within $2^{-n}$ of $\sup \{ \mu(E): E \in {\mathcal X} \}$. One can easily show that any subset of $E_n \backslash E_{n-1}$ has measure $O(2^{-n})$, and in particular that $E_n \backslash \bigcup_{n'=n_0}^{n-1} E_{n'}$ has measure $O(2^{-n})$ for any $n_0 \leq n$. This allows one to control the unions $\bigcup_{n=n_0}^\infty E_n$, and thence the lim sup $X_+$ of the $E_n$, which one can then show to have the required properties. One can in fact show that any signed measure that avoids $+\infty$ must have finite positive variation, but this turns out to require a certain amount of work. $\diamond$

Let us say that a set E is null for a signed measure $\mu$ if $\mu\downharpoonright_E = 0$. (This implies that $\mu(E)=0$, but the converse is not true, since a set E of signed measure zero could contain subsets of non-zero measure.) It is easy to see that the sets $X_-, X_+$ given by the Hahn decomposition theorem are unique modulo null sets.

Let us say that a signed measure $\mu$ is supported on E if the complement of E is null (or equivalently, if $\mu\downharpoonright_E = \mu$). If two signed measures $\mu, \nu$ can be supported on disjoint sets, we say that they are mutually singular (or that $\mu$ is singular with respect to $\nu$) and write $\mu \perp \nu$. If we write $\mu_+ := \mu\downharpoonright_{X_+}$ and $\mu_- := - \mu \downharpoonright_{X_-}$, we can now quickly establish

Exercise 5. (Jordan decomposition theorem) Every signed measure $\mu$ can be uniquely decomposed as $\mu = \mu_+ - \mu_-$, where $\mu_+, \mu_-$ are mutually singular unsigned measures. (The only claim not already established is the uniqueness.) We refer to $\mu_+, \mu_-$ as the positive and negative parts (or positive and negative variations) of $\mu$. $\diamond$

This is of course analogous to the decomposition $f = f_+ - f_-$ of a function into positive and negative parts. Inspired by this, we define the absolute value (or total variation) $|\mu|$ of a signed measure to be $|\mu| := \mu_+ + \mu_-$.

Exercise 6. Show that $|\mu|$ is the minimal unsigned measure such that $-|\mu| \leq \mu \leq |\mu|$. Furthermore, $|\mu|(E)$ is equal to the maximum value of $\sum_{n=1}^\infty |\mu(E_n)|$, where $(E_n)_{n=1}^\infty$ ranges over the partitions of E. (This may help explain the terminology “total variation”.) $\diamond$
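Continuing the finite-space picture, the Jordan decomposition and total variation can be computed pointwise, and the finest partition (into singletons) attains the supremum in Exercise 6. A sketch with hypothetical values:

```python
# Jordan decomposition and total variation on a finite space
# (illustrative sketch): mu_+ and mu_- keep the positive and negative
# point masses respectively, and |mu| = mu_+ + mu_-.

mu = {1: 3.0, 2: -1.5, 3: 0.0, 4: -2.0}

mu_plus  = {x: max(mu[x], 0.0) for x in mu}
mu_minus = {x: max(-mu[x], 0.0) for x in mu}
total_var = {x: mu_plus[x] + mu_minus[x] for x in mu}

# mu = mu_+ - mu_- pointwise, hence on every set:
assert all(mu[x] == mu_plus[x] - mu_minus[x] for x in mu)

# |mu|(X) agrees with the partition characterisation of Exercise 6;
# on a finite space the partition into points attains the maximum:
assert sum(total_var.values()) == sum(abs(mu[x]) for x in mu)
print(sum(total_var.values()))   # 3.0 + 1.5 + 0.0 + 2.0 = 6.5
```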

Exercise 7. Show that $\mu(E)$ is finite for every E if and only if $|\mu|$ is a finite unsigned measure, if and only if $\mu_+, \mu_-$ are finite unsigned measures. If any of these properties hold, we call $\mu$ a finite measure. (In a similar spirit, we call a signed measure $\mu$ $\sigma$-finite if $|\mu|$ is $\sigma$-finite.) $\diamond$

The space of finite measures on X is clearly a real vector space, and is denoted ${\mathcal M}(X)$.

Let m be a reference unsigned measure. We saw in the introduction that the map $f \mapsto m_f$ is an embedding of the space $L^+(X,dm)$ of non-negative functions (modulo m-almost everywhere equivalence) into the space ${\mathcal M}_+(X)$ of unsigned measures. The same map is also an embedding of the space $L^1(X,dm)$ of absolutely integrable functions (again modulo m-almost everywhere equivalence) into the space ${\mathcal M}(X)$ of finite measures. (To verify this, one first makes the easy observation that the Jordan decomposition of a measure $m_f$ given by an absolutely integrable function f is simply $m_f = m_{f_+} - m_{f_-}$.)

In the converse direction, one can ask if every finite measure $\mu$ in ${\mathcal M}(X)$ can be expressed as $m_f$ for some absolutely integrable f. Unfortunately, there are some obstructions to this. Firstly, from (1) we see that if $\mu = m_f$, then any set that has measure zero with respect to $m$ must also have measure zero with respect to $\mu$. In particular, this implies that a non-trivial measure that is singular with respect to m cannot be expressed in the form $m_f$.

In the $\sigma$-finite case, this turns out to be the only obstruction:

Theorem 2. (Lebesgue-Radon-Nikodym theorem) Let m be an unsigned $\sigma$-finite measure, and let $\mu$ be a signed $\sigma$-finite measure. Then there exists a unique decomposition $\mu = m_f + \mu_s$, where $f: X \to {\bf R}$ is measurable and $\mu_s \perp m$. If $\mu$ is unsigned, then f and $\mu_s$ are also. If $\mu$ is finite, then $f$ lies in $L^1(X,dm)$ and $\mu_s$ is finite.

Proof. We prove this only for the case when $\mu, m$ are finite rather than $\sigma$-finite, and leave the general case as an exercise. The uniqueness follows from Exercise 2 and the previous observation that a non-trivial measure of the form $m_f$ cannot be mutually singular with m, so it suffices to prove existence. By the Jordan decomposition theorem, we may assume that $\mu$ is unsigned as well. (In this case, we expect f and $\mu_s$ to be unsigned also.)

The idea is to select f "greedily". More precisely, let M be the supremum of the quantity $\int_X f\ dm$, where f ranges over all non-negative functions such that $m_f \leq \mu$. Since $\mu$ is finite, M is finite. We claim that the supremum is actually attained for some f. Indeed, if we let $f_n$ be a maximising sequence, thus $m_{f_n} \leq \mu$ and $\int_X f_n\ dm \to M$, one easily checks that the function $f = \sup_n f_n$ attains the supremum. (To see that $m_f \leq \mu$, first observe that $m_{\max(g,h)} \leq \mu$ whenever $m_g, m_h \leq \mu$, by splitting any set into the regions where $g > h$ and $g \leq h$; one can thus replace each $f_n$ by $\max(f_1,\ldots,f_n)$ and then apply the monotone convergence theorem.)

The measure $\mu_s := \mu - m_f$ is a non-negative finite measure by construction. To finish the theorem, it suffices to show that $\mu_s \perp m$.

It will suffice to show that $(\mu_s - \varepsilon m)_+ \perp m$ for all $\varepsilon > 0$, as the claim then easily follows by letting $\varepsilon$ run through a countable sequence going to zero. But if $(\mu_s - \varepsilon m)_+$ were not singular with respect to m, we see from the Hahn decomposition theorem that there is a set E with $m(E)>0$ such that $(\mu_s - \varepsilon m)\downharpoonright_E \geq 0$, and thus $\mu_s \geq \varepsilon m\downharpoonright_E$. But then one could add $\varepsilon 1_E$ to f, which increases $\int_X f\ dm$ by $\varepsilon m(E) > 0$ while keeping $m_f \leq \mu$, contradicting the construction of f. $\Box$
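In the finite-space case the decomposition of Theorem 2 can be written down explicitly: take $f(x) = \mu(\{x\})/m(\{x\})$ where m has positive mass, and let $\mu_s$ collect the mass of $\mu$ living where m vanishes. A sketch (values illustrative):

```python
# Lebesgue-Radon-Nikodym in miniature (illustrative sketch): on a
# finite space, take f = dmu/dm on the support of m, and let mu_s be
# the part of mu living where m vanishes.

m  = {1: 1.0, 2: 2.0, 3: 0.0}     # reference measure (vanishes at 3)
mu = {1: 0.5, 2: 4.0, 3: 7.0}     # measure to decompose

f    = {x: (mu[x] / m[x] if m[x] > 0 else 0.0) for x in m}
mu_s = {x: (mu[x] if m[x] == 0 else 0.0) for x in m}

# Check mu({x}) = f(x) m({x}) + mu_s({x}) at every point:
assert all(abs(mu[x] - (f[x] * m[x] + mu_s[x])) < 1e-12 for x in m)

# mu_s is singular with respect to m: it is supported on {3}, where
# m vanishes, while m is supported on {1, 2}.
print(f)      # {1: 0.5, 2: 2.0, 3: 0.0}
print(mu_s)   # {1: 0.0, 2: 0.0, 3: 7.0}
```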

Exercise 8. Complete the proof of Theorem 2 for the $\sigma$-finite case. $\diamond$

We have the following corollary:

Corollary 1. (Radon-Nikodym theorem) Let m be an unsigned $\sigma$-finite measure, and let $\mu$ be a signed finite measure. Then the following are equivalent.

1. $\mu = m_f$ for some measurable $f$.
2. $\mu(E)=0$ whenever $m(E)=0$.
3. For every $\varepsilon > 0$, there exists $\delta > 0$ such that $|\mu(E)| < \varepsilon$ whenever $m(E) \leq \delta$.

When $\mu$ is $\sigma$-finite instead of finite, 1 and 2 are still equivalent.

When statement 1 or 2 (or 3, in the finite case) occurs, we say that $\mu$ is absolutely continuous with respect to m, and write $\mu \ll m$. As in the introduction, we call f the Radon-Nikodym derivative of $\mu$ with respect to m, and write $f = \frac{d\mu}{dm}$.

Proof. It suffices to establish the case when $\mu$ is finite. The implication of 3. from 1. is Exercise 11 from Notes 0. The implication of 2. from 3. is trivial. To deduce 1. from 2., apply Theorem 2 to $\mu$ and observe that $\mu_s$ is supported on a set E of m-measure zero by hypothesis. Since E is null for m, it is null for $m_f$ and (by 2.) for $\mu$ also, and so $\mu_s$ is trivial, giving 1. $\Box$

Corollary 2. (Lebesgue decomposition theorem) Let m be an unsigned $\sigma$-finite measure, and let $\mu$ be a signed $\sigma$-finite measure. Then there is a unique decomposition $\mu = \mu_{ac} + \mu_s$, where $\mu_{ac} \ll m$ and $\mu_s \perp m$. (We refer to $\mu_{ac}$ and $\mu_s$ as the absolutely continuous and singular components of $\mu$ with respect to m.) If $\mu$ is unsigned, then $\mu_{ac}$ and $\mu_s$ are also.

Exercise 9. If every point in X is measurable, we call a signed measure $\mu$ continuous if $\mu(\{x\})=0$ for all x. Let the hypotheses be as in Corollary 2, but suppose also that every point is measurable and m is continuous. Show that there is a unique decomposition $\mu = \mu_{ac} + \mu_{sc} + \mu_{pp}$, where $\mu_{ac} \ll m$, $\mu_{pp}$ is supported on an at most countable set, and $\mu_{sc}$ is both singular with respect to m and continuous. Furthermore, if $\mu$ is unsigned, then $\mu_{ac}, \mu_{sc}, \mu_{pp}$ are also. We call $\mu_{sc}$ and $\mu_{pp}$ the singular continuous and pure point components of $\mu$ respectively. $\diamond$

Example 1. A Cantor measure is singular continuous with respect to Lebesgue measure, while Dirac measures are pure point. Lebesgue measure on a line is singular continuous with respect to Lebesgue measure on a plane containing that line. $\diamond$

Remark 1. Suppose one is decomposing a measure $\mu$ on a Euclidean space ${\Bbb R}^d$ with respect to Lebesgue measure m on that space. Very roughly speaking, a measure is pure point if it is supported on a 0-dimensional subset of ${\Bbb R}^d$, it is absolutely continuous if its support is spread out on a full dimensional subset, and is singular continuous if it is supported on some set of dimension intermediate between 0 and d. For instance, if $\mu$ is the sum of a Dirac mass at $(0,0) \in {\Bbb R}^2$, one-dimensional Lebesgue measure on the x-axis, and two-dimensional Lebesgue measure on ${\Bbb R}^2$, then these are the pure point, singular continuous, and absolutely continuous components of $\mu$ respectively. This heuristic is not completely accurate (in part because I have left the definition of “dimension” vague) but is not a bad rule of thumb for a first approximation. $\diamond$

To motivate the terminology “continuous” and “singular continuous”, we recall two definitions on an interval $I \subset {\Bbb R}$, and make a third:

1. A function $f: I \to {\Bbb R}$ is continuous if for every $x \in I$ and every $\varepsilon > 0$, there exists $\delta > 0$ such that $|f(y)-f(x)| \leq \varepsilon$ whenever $y \in I$ is such that $|y-x| \leq \delta$.
2. A function $f: I \to {\Bbb R}$ is uniformly continuous if for every $\varepsilon > 0$, there exists $\delta > 0$ such that $|f(y)-f(x)| \leq \varepsilon$ whenever ${}[x, y] \subset I$ has length at most $\delta$.
3. A function $f: I \to {\Bbb R}$ is absolutely continuous if for every $\varepsilon > 0$, there exists $\delta > 0$ such that $\sum_{i=1}^n |f(y_i)-f(x_i)| \leq \varepsilon$ whenever ${}[x_1,y_1],\ldots,{}[x_n,y_n]$ are disjoint intervals in I of total length at most $\delta$.

Clearly, absolute continuity implies uniform continuity, which in turn implies continuity. The significance of absolute continuity is that it is the largest class of functions for which the fundamental theorem of calculus holds (using the classical derivative, and the Lebesgue integral), as was shown in the previous course.

Exercise 10. Let m be Lebesgue measure on the interval ${}[0,+\infty)$, and let $\mu$ be a finite unsigned measure.

1. Show that $\mu$ is a continuous measure if and only if the function $x \mapsto \mu([0,x])$ is continuous.
2. Show that $\mu$ is an absolutely continuous measure with respect to m if and only if the function $x \mapsto \mu([0,x])$ is absolutely continuous. $\diamond$

— A finitary analogue of the Lebesgue decomposition (optional) —

At first glance, the above theory is only non-trivial when the underlying set X is infinite. For instance, if X is finite, and m is the uniform distribution on X, then every other measure on X will be absolutely continuous with respect to m, making the Lebesgue decomposition trivial. Nevertheless, there is a non-trivial version of the above theory that can be applied to finite sets (cf. my blog post on the relationship between soft analysis and hard analysis). The cleanest formulation is to apply it to a sequence of (increasingly large) sets, rather than to a single set:

Theorem 3. (Finitary analogue of the Lebesgue-Radon-Nikodym theorem) Let $X_n$ be a sequence of finite sets (each equipped with the discrete $\sigma$-algebra), and for each n, let $m_n$ be the uniform distribution on $X_n$, and let $\mu_n$ be another probability measure on $X_n$. Then, after passing to a subsequence, one has a decomposition

$\mu_n = \mu_{n,ac} + \mu_{n,sc} + \mu_{n,pp}$ (9)

where

1. (Uniform absolute continuity) For every $\varepsilon > 0$, there exists $\delta > 0$ (independent of n) such that $\mu_{n,ac}(E) \leq \varepsilon$ whenever $m_n(E) \leq \delta$, for all n and all $E \subset X_n$.
2. (Asymptotic singular continuity) $\mu_{n,sc}$ is supported on a set of $m_n$-measure $o(1)$, and we have $\mu_{n,sc}(\{x\}) = o(1)$ uniformly for all $x \in X_n$, where $o(1)$ denotes an error that goes to zero as $n \to \infty$.
3. (Uniform pure point) For every $\varepsilon > 0$ there exists $N > 0$ (independent of n) such that for each n, there exists a set $E_n \subset X_n$ of cardinality at most N such that $\mu_{n,pp}(X_n \backslash E_n) \leq \varepsilon$.

Proof. Using the Radon-Nikodym theorem (or just working by hand, since everything is finite), we can write $d\mu_n = f_n\ dm_n$ for some $f_n: X_n \to [0,+\infty)$ with average value 1.

For each positive integer k, the sequence $\mu_n( \{ f_n \geq k \} )$ is bounded between 0 and 1, so by the Bolzano-Weierstrass theorem, it has a convergent subsequence. Applying the usual diagonalisation argument (as in the proof of the Arzelà-Ascoli theorem), we may thus assume (after passing to a subsequence, and relabeling) that $\mu_n( \{ f_n \geq k \} )$ converges for each positive k to some limit $c_k$.

Clearly, the $c_k$ are decreasing and range between 0 and 1, and so converge as $k \to \infty$ to some limit $0 \leq c \leq 1$.

Since $\lim_{k \to \infty} \lim_{n \to \infty} \mu_n(\{f_n \geq k \}) = c$, we can find a sequence $k_n$ going to infinity such that $\mu_n( \{f_n \geq k_n\} ) \to c$ as $n \to \infty$. We now set $\mu_{n,ac}$ to be the restriction of $\mu_n$ to the set $\{f_n < k_n\}$. We claim the absolute continuity property 1. Indeed, for any $\varepsilon > 0$, we can find a k such that $c_k \leq c + \varepsilon/10$ (recall that the $c_k$ decrease to c). For n sufficiently large, we thus have

$\mu_n(\{f_n \geq k \}) \leq c + \varepsilon/5$ (10)

and

$\mu_n(\{f_n \geq k_n \}) \geq c - \varepsilon/5$ (11)

and hence

$\mu_{n,ac}( \{ f_n \geq k \} ) \leq 2 \varepsilon / 5$. (12)

If we take $\delta < \varepsilon / (5k)$, we thus see (for n sufficiently large) that 1. holds: any set E with $m_n(E) \leq \delta$ receives at most $k \delta < \varepsilon/5$ of $\mu_{n,ac}$-mass from the region where $f_n < k$, plus at most $2\varepsilon/5$ from (12). (For the remaining n, one simply shrinks $\delta$ as much as is necessary.)

Write $\mu_{n,s} := \mu_n - \mu_{n,ac}$; thus $\mu_{n,s}$ is supported on the set $\{f_n \geq k_n\}$, which has size at most $|X_n|/k_n = o(|X_n|)$ by Markov's inequality. It remains to extract out the pure point components. This we do by a similar procedure as above. Indeed, by arguing as before we may assume (after passing to a subsequence as necessary) that the quantities $\mu_n( \{ x: \mu_n(\{x\}) \geq 1/j \} )$ converge to a limit $d_j$ for each positive integer j, that the $d_j$ themselves converge to a limit d, and that there exists a sequence $j_n \to \infty$ such that $\mu_n( \{ x: \mu_n(\{x\}) \geq 1/j_n \} )$ converges to d. If one sets $\mu_{n,sc}$ and $\mu_{n,pp}$ to be the restrictions of $\mu_{n,s}$ to the sets $\{ x: \mu_n(\{x\}) < 1/j_n \}$ and $\{ x: \mu_n(\{x\}) \geq 1/j_n \}$ respectively, one can verify the remaining claims by arguments similar to those already given. $\Box$
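For a single n, the two thresholding steps of the proof can be carried out explicitly on a finite space. The following sketch (with hypothetical thresholds $k_n, j_n$, which in the actual theorem would grow with n) shows one big atom landing in $\mu_{n,pp}$, two small-but-dense atoms landing in $\mu_{n,sc}$, and the diffuse remainder in $\mu_{n,ac}$:

```python
# One step of the construction in Theorem 3 (illustrative sketch): for
# a single n, split mu_n first by the density threshold k_n, then by
# atom size 1/j_n.  All values here are hypothetical.

n_points = 100
m = {x: 1.0 / n_points for x in range(n_points)}      # uniform m_n
mu = {x: 0.4 / 97 for x in range(3, 100)}             # diffuse part
mu.update({0: 0.5, 1: 0.05, 2: 0.05})                 # heavy atoms

k_n, j_n = 3.0, 10.0        # thresholds (both would grow with n)

f = {x: mu[x] / m[x] for x in mu}                     # density dmu/dm

mu_ac = {x: (mu[x] if f[x] < k_n else 0.0) for x in mu}
mu_s  = {x: mu[x] - mu_ac[x] for x in mu}
mu_pp = {x: (mu_s[x] if mu_s[x] >= 1.0 / j_n else 0.0) for x in mu}
mu_sc = {x: mu_s[x] - mu_pp[x] for x in mu}

# The three pieces recover mu:
assert all(abs(mu[x] - (mu_ac[x] + mu_sc[x] + mu_pp[x])) < 1e-12 for x in mu)
print({x for x in mu if mu_pp[x] > 0})   # {0}: the big atom
print({x for x in mu if mu_sc[x] > 0})   # {1, 2}: dense but small atoms
```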

Exercise 11. Generalise Theorem 3 to the setting where the $X_n$ can be infinite and non-discrete (but we still require every point to be measurable), the $m_n$ are arbitrary probability measures, and the $\mu_n$ are arbitrary finite measures of uniformly bounded total variation. $\diamond$

Remark 2. This result is still not fully “finitary” because it deals with a sequence of finite structures, rather than with a single finite structure. It appears in fact to be quite difficult (and perhaps even impossible) to make a fully finitary version of the Lebesgue decomposition (in the same way that the finite convergence principle in this blog post of mine was a fully finitary analogue of the infinite convergence principle), though one can certainly form some weaker finitary statements that capture a portion of the strength of this theorem. For instance, one very cheap thing to do, given two probability measures $\mu, m$, is to introduce a threshold parameter k, and partition $\mu = \mu_{\leq k} + \mu_{>k}$, where $\mu_{\leq k} \leq k m$, and $\mu_{>k}$ is supported on a set of m-measure at most $1/k$; such a decomposition is automatic from Theorem 2 and Markov’s inequality, and has meaningful content even when the underlying space X is finite, but this type of decomposition is not as powerful as the full Lebesgue decomposition (mainly because the size of the support for $\mu_{>k}$ is relatively large compared to the threshold k). Using the finite convergence principle, one can do a bit better, writing $\mu = \mu_{\leq k} + \mu_{k < \cdot \leq F(k)} + \mu_{\geq F(k)}$ for any function F and any $\varepsilon > 0$, where $k = O_{F,\varepsilon}(1)$, $\mu_{\leq k} \leq k m$, $\mu_{\geq F(k)}$ is supported on a set of m-measure at most $1/F(k)$, and $\mu_{k < \cdot \leq F(k)}$ has total mass at most $\varepsilon$, but this still fails to capture the full strength of the infinitary decomposition, because $\varepsilon$ needs to be fixed in advance. I have not been able to find a fully finitary statement that is equivalent to, say, Theorem 3; I suspect that if it does exist, it will have quite a messy formulation. $\diamond$
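The “cheap” decomposition $\mu = \mu_{\leq k} + \mu_{>k}$ mentioned above is easy to implement directly; the following sketch (the name `cheap_split` and the numbers are hypothetical) splits a probability measure against a reference measure at threshold k and verifies the two claimed properties:

```python
# The "cheap" decomposition from Remark 2 (illustrative sketch): given
# probability measures mu, m on a finite set and a threshold k, split
# mu into a part dominated by k*m and a part supported where the
# density exceeds k (a set of m-measure at most 1/k, by Markov).

def cheap_split(mu, m, k):
    small = {x: (mu[x] if mu[x] <= k * m[x] else 0.0) for x in mu}
    big   = {x: (mu[x] if mu[x] >  k * m[x] else 0.0) for x in mu}
    return small, big

m  = {x: 0.25 for x in 'abcd'}                # uniform distribution
mu = {'a': 0.7, 'b': 0.1, 'c': 0.1, 'd': 0.1}
k  = 2.0

small, big = cheap_split(mu, m, k)

# small <= k*m everywhere:
assert all(small[x] <= k * m[x] + 1e-12 for x in mu)

# big is supported on a set of m-measure <= 1/k:
support = {x for x in mu if big[x] > 0}
assert sum(m[x] for x in support) <= 1.0 / k + 1e-12
print(support, sum(m[x] for x in support))    # {'a'} 0.25
```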