
Suppose one has a measure space ${X = (X, {\mathcal B}, \mu)}$ and a sequence of operators ${T_n: L^p(X) \rightarrow L^p(X)}$ that are bounded on some ${L^p(X)}$ space, with ${1 \leq p < \infty}$. Suppose that on some dense subclass of functions ${f}$ in ${L^p(X)}$ (e.g. continuous compactly supported functions, if the space ${X}$ is reasonable), one already knows that ${T_n f}$ converges pointwise almost everywhere to some limit ${Tf}$, for another bounded operator ${T: L^p(X) \rightarrow L^p(X)}$ (e.g. ${T}$ could be the identity operator). What additional ingredient does one need to pass to the limit and conclude that ${T_n f}$ converges almost everywhere to ${Tf}$ for all ${f}$ in ${L^p(X)}$ (and not just for ${f}$ in a dense subclass)?

One standard way to proceed here is to study the maximal operator

$\displaystyle T_* f(x) := \sup_n |T_n f(x)|$

and aim to establish a weak-type maximal inequality

$\displaystyle \| T_* f \|_{L^{p,\infty}(X)} \leq C \| f \|_{L^p(X)} \ \ \ \ \ (1)$

for all ${f \in L^p(X)}$ (or all ${f}$ in the dense subclass), and some constant ${C}$, where ${\| \cdot \|_{L^{p,\infty}(X)}}$ is the weak ${L^p}$ norm

$\displaystyle \|f\|_{L^{p,\infty}(X)} := \sup_{t > 0} t \mu( \{ x \in X: |f(x)| \geq t \})^{1/p}.$
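As a rough numerical illustration of this definition, the following Python sketch evaluates the weak ${L^p}$ quantity for a simple function given as a list of (value, measure) pairs. The helper names `weak_lp_quasinorm` and `lp_norm`, and the dyadic step function comparable to ${1/x}$ on ${(0,1)}$, are assumptions chosen for the example and are not taken from the text. For a simple function the supremum over ${t}$ is attained at one of the values ${|f|}$ takes, so only those thresholds are tested.

```python
def weak_lp_quasinorm(pieces, p):
    """sup_t t * mu({|f| >= t})^(1/p) for a simple function.

    pieces: list of (value, measure) pairs, one per disjoint set.
    """
    best = 0.0
    for value, _ in pieces:
        t = abs(value)
        if t == 0:
            continue
        # measure of the level set {|f| >= t}
        level_set_measure = sum(m for v, m in pieces if abs(v) >= t)
        best = max(best, t * level_set_measure ** (1.0 / p))
    return best


def lp_norm(pieces, p):
    """Ordinary L^p norm of the same simple function."""
    return sum(abs(v) ** p * m for v, m in pieces) ** (1.0 / p)


# Step function comparable to 1/x: value 2^k on a dyadic shell of measure 2^-(k+1).
pieces = [(2.0 ** k, 2.0 ** -(k + 1)) for k in range(20)]

weak = weak_lp_quasinorm(pieces, 1)
strong = lp_norm(pieces, 1)
print(weak, strong)
```

In this example the weak ${L^1}$ quantity stays below ${1}$ however many shells are included, while the ${L^1}$ norm grows linearly in the number of shells, which is the sense in which (1) is weaker than a strong-type bound.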

A standard approximation argument using (1) then shows that ${T_n f}$ will now indeed converge to ${Tf}$ pointwise almost everywhere for all ${f}$ in ${L^p(X)}$, and not just in the dense subclass. See for instance these lecture notes of mine, in which this method is used to deduce the Lebesgue differentiation theorem from the Hardy-Littlewood maximal inequality. This is by now a very standard approach to establishing pointwise almost everywhere convergence theorems, but it is natural to ask whether it is strictly necessary. In particular, is it possible to have a pointwise convergence result ${T_n f \rightarrow T f}$ without being able to obtain a weak-type maximal inequality of the form (1)?

In the case of norm convergence (in which one asks for ${T_n f}$ to converge to ${Tf}$ in the ${L^p}$ norm, rather than in the pointwise almost everywhere sense), the answer is no, thanks to the uniform boundedness principle, which among other things shows that norm convergence is only possible if one has the uniform bound

$\displaystyle \sup_n \| T_n f \|_{L^p(X)} \leq C \| f \|_{L^p(X)} \ \ \ \ \ (2)$

for some ${C>0}$ and all ${f \in L^p(X)}$; and conversely, if one has the uniform bound, and one has already established norm convergence of ${T_n f}$ to ${Tf}$ on a dense subclass of ${L^p(X)}$, (2) will extend that norm convergence to all of ${L^p(X)}$.

Returning to pointwise almost everywhere convergence, the answer in general is “yes”. Consider for instance the rank one operators

$\displaystyle T_n f(x) := 1_{[n,n+1]}(x) \int_0^1 f(y)\ dy$

from ${L^1({\bf R})}$ to ${L^1({\bf R})}$. It is clear that ${T_n f}$ converges pointwise almost everywhere to zero as ${n \rightarrow \infty}$ for any ${f \in L^1({\bf R})}$, and the operators ${T_n}$ are uniformly bounded on ${L^1({\bf R})}$, but the maximal function ${T_*}$ does not obey (1). One can modify this example in a number of ways to defeat almost any reasonable conjecture that something like (1) should be necessary for pointwise almost everywhere convergence.
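To see the failure of (1) concretely, here is a small Python sketch for the test function ${f = 1_{[0,1]}}$, which has ${\|f\|_{L^1} = 1}$ and ${\int_0^1 f = 1}$. It estimates the measure of the level set of the truncated maximal function ${\max_{1 \leq n \leq N} |T_n f|}$ by midpoint sampling. The truncation to ${n \leq N}$, the grid resolution, and the function names are assumptions made for the illustration.

```python
def indicator(a, b, x):
    return 1.0 if a <= x <= b else 0.0


def T(n, integral_of_f, x):
    # T_n f(x) = 1_{[n,n+1]}(x) * (integral of f over [0,1])
    return indicator(n, n + 1, x) * integral_of_f


def level_set_measure(N, integral_of_f, t, cells_per_unit=20):
    """Approximate measure of {x in [0, N+2] : max_{1<=n<=N} |T_n f(x)| >= t}."""
    count = 0
    for i in range((N + 2) * cells_per_unit):
        x = (i + 0.5) / cells_per_unit          # midpoint of the i-th cell
        maximal = max(abs(T(n, integral_of_f, x)) for n in range(1, N + 1))
        if maximal >= t:
            count += 1
    return count / cells_per_unit


# t * mu({T_* f >= t}) at t = 1/2, to be compared with ||f||_{L^1} = 1
t = 0.5
ratios = {N: t * level_set_measure(N, 1.0, t) for N in (5, 10, 20)}
print(ratios)
```

The quantity ${t \mu(\{ T_* f \geq t \})}$ grows linearly in ${N}$ while ${\|f\|_{L^1}}$ stays fixed, so no constant ${C}$ can work in (1) with ${p=1}$.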

In spite of this, a remarkable observation of Stein, now known as Stein’s maximal principle, asserts that the maximal inequality is necessary to prove pointwise almost everywhere convergence, if one is working on a compact group and the operators ${T_n}$ are translation invariant, and if the exponent ${p}$ is at most ${2}$:

Theorem 1 (Stein maximal principle) Let ${G}$ be a compact group, let ${X}$ be a homogeneous space of ${G}$ with a finite Haar measure ${\mu}$, let ${1\leq p \leq 2}$, and let ${T_n: L^p(X) \rightarrow L^p(X)}$ be a sequence of bounded linear operators commuting with translations, such that ${T_n f}$ converges pointwise almost everywhere for each ${f \in L^p(X)}$. Then (1) holds.

This is not quite the most general version of the principle; some additional variants and generalisations are given in the original paper of Stein. For instance, one can replace the discrete sequence ${T_n}$ of operators with a continuous family ${T_t}$ without much difficulty. As a typical application of this principle, we see that Carleson’s celebrated theorem that the partial Fourier series ${\sum_{n=-N}^N \hat f(n) e^{2\pi i nx}}$ of an ${L^2({\bf R}/{\bf Z})}$ function ${f: {\bf R}/{\bf Z} \rightarrow {\bf C}}$ converge almost everywhere is in fact equivalent to the estimate

$\displaystyle \| \sup_{N>0} |\sum_{n=-N}^N \hat f(n) e^{2\pi i n\cdot}|\|_{L^{2,\infty}({\bf R}/{\bf Z})} \leq C \|f\|_{L^2({\bf R}/{\bf Z})}. \ \ \ \ \ (3)$

And unsurprisingly, most of the proofs of this (difficult) theorem have proceeded by first establishing (3), and Stein’s maximal principle strongly suggests that this is the optimal way to try to prove this theorem.

On the other hand, the theorem does fail for ${p>2}$, and almost everywhere convergence results in ${L^p}$ for ${p>2}$ can be proven by other methods than weak ${(p,p)}$ estimates. For instance, the convergence of Bochner-Riesz multipliers in ${L^p({\bf R}^n)}$ for any ${n}$ (and for ${p}$ in the range predicted by the Bochner-Riesz conjecture) was verified for ${p > 2}$ by Carbery, Rubio de Francia, and Vega, despite the fact that the weak ${(p,p)}$ boundedness of even a single Bochner-Riesz multiplier, let alone the maximal function, has still not been completely verified in this range. (Carbery, Rubio de Francia and Vega use weighted ${L^2}$ estimates for the maximal Bochner-Riesz operator, rather than ${L^p}$ type estimates.) For ${p \leq 2}$, though, Stein’s principle (after localising to a torus) does apply, and pointwise almost everywhere convergence of Bochner-Riesz means is equivalent to the weak ${(p,p)}$ estimate (1).

Stein’s principle is restricted to compact groups (such as the torus ${({\bf R}/{\bf Z})^n}$ or the rotation group ${SO(n)}$) and their homogeneous spaces (such as the torus ${({\bf R}/{\bf Z})^n}$ again, or the sphere ${S^{n-1}}$). As stated, the principle fails in the noncompact setting; for instance, in ${{\bf R}}$, the convolution operators ${T_n f := f * 1_{[n,n+1]}}$ are such that ${T_n f}$ converges pointwise almost everywhere to zero for every ${f \in L^1({\bf R})}$, but the maximal function is not of weak-type ${(1,1)}$. However, in many applications on non-compact domains, the ${T_n}$ are “localised” enough that one can transfer from a non-compact setting to a compact setting and then apply Stein’s principle. For instance, Carleson’s theorem on the real line ${{\bf R}}$ is equivalent to Carleson’s theorem on the circle ${{\bf R}/{\bf Z}}$ (due to the localisation of the Dirichlet kernels), which as discussed before is equivalent to the estimate (3) on the circle, which by a scaling argument is equivalent to the analogous estimate on the real line ${{\bf R}}$.

Stein’s argument from his 1961 paper can be viewed nowadays as an application of the probabilistic method; starting with a sequence of increasingly bad counterexamples to the maximal inequality (1), one randomly combines them together to create a single “infinitely bad” counterexample. To make this idea work, Stein employs two basic ideas:

1. The random rotations (or random translations) trick. Given a subset ${E}$ of ${X}$ of small but positive measure, one can randomly select about ${|G|/|E|}$ translates ${g_i E}$ of ${E}$ that cover most of ${X}$.
2. The random sums trick. Given a collection ${f_1,\ldots,f_n: X \rightarrow {\bf C}}$ of signed functions that may possibly cancel each other in a deterministic sum ${\sum_{i=1}^n f_i}$, one can perform a random sum ${\sum_{i=1}^n \pm f_i}$ instead to obtain a random function whose magnitude will usually be comparable to the square function ${(\sum_{i=1}^n |f_i|^2)^{1/2}}$; this can be made rigorous by concentration of measure results, such as Khintchine’s inequality.
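The random sums trick can be checked by hand at a single point ${x}$, where the values ${f_i(x)}$ are just scalars ${a_i}$. The Python sketch below averages ${|\sum_i \pm a_i|}$ and ${|\sum_i \pm a_i|^2}$ over all ${2^n}$ sign patterns, which is the exact expectation for independent uniform random signs. The coefficients ${a_i = \pm 1}$ and the function name `random_sum_moments` are assumptions made for this illustration.

```python
import itertools
import math


def random_sum_moments(coeffs):
    """Exact mean of |S| and S^2, where S = sum of (random sign) * a_i."""
    total_abs = 0.0
    total_sq = 0.0
    for signs in itertools.product((1, -1), repeat=len(coeffs)):
        s = sum(e * a for e, a in zip(signs, coeffs))
        total_abs += abs(s)
        total_sq += s * s
    count = 2 ** len(coeffs)
    return total_abs / count, total_sq / count


coeffs = [1, -1] * 5                       # the deterministic sum cancels completely
deterministic_sum = sum(coeffs)
mean_abs, mean_sq = random_sum_moments(coeffs)
square_function = math.sqrt(sum(a * a for a in coeffs))
print(deterministic_sum, mean_abs, mean_sq, square_function)
```

Here the deterministic sum vanishes, the mean square of the random sum equals the squared square function exactly, and the mean absolute value is within a constant factor of the square function, as Khintchine’s inequality predicts.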

These ideas have since been used repeatedly in harmonic analysis. For instance, I used the random rotations trick in a recent paper with Jordan Ellenberg and Richard Oberlin on Kakeya-type estimates in finite fields. The random sums trick is by now a standard tool to build various counterexamples to estimates (or to convergence results) in harmonic analysis, for instance being used by Fefferman in his famous paper disproving the boundedness of the ball multiplier on ${L^p({\bf R}^n)}$ for ${p \neq 2}$, ${n \geq 2}$. Another use of the random sum trick is to show that Theorem 1 fails once ${p>2}$; see Stein’s original paper for details.

Another use of the random rotations trick, closely related to Theorem 1, is the Nikishin-Stein factorisation theorem. Here is Stein’s formulation of this theorem:

Theorem 2 (Stein factorisation theorem) Let ${G}$ be a compact group, let ${X}$ be a homogeneous space of ${G}$ with a finite Haar measure ${\mu}$, let ${1\leq p \leq 2}$ and ${q>0}$, and let ${T: L^p(X) \rightarrow L^q(X)}$ be a bounded linear operator commuting with translations and obeying the estimate

$\displaystyle \|T f \|_{L^q(X)} \leq A \|f\|_{L^p(X)}$

for all ${f \in L^p(X)}$ and some ${A>0}$. Then ${T}$ also maps ${L^p(X)}$ to ${L^{p,\infty}(X)}$, with

$\displaystyle \|T f \|_{L^{p,\infty}(X)} \leq C_{p,q} A \|f\|_{L^p(X)}$

for all ${f \in L^p(X)}$, with ${C_{p,q}}$ depending only on ${p, q}$.

This result is trivial with ${q \geq p}$, but becomes useful when ${q < p}$. In this regime, the translation invariance allows one to freely “upgrade” a strong-type ${(p,q)}$ result to a weak-type ${(p,p)}$ result. In other words, bounded linear operators from ${L^p(X)}$ to ${L^q(X)}$ automatically factor through the inclusion ${L^{p,\infty}(X) \subset L^q(X)}$, which helps explain the name “factorisation theorem”. Factorisation theory has been developed further by many authors, including Maurey and Pisier.

Stein’s factorisation theorem (or more precisely, a variant of it) is useful in the theory of Kakeya and restriction theorems in Euclidean space, as first observed by Bourgain.

In 1970, Nikishin obtained the following generalisation of Stein’s factorisation theorem in which the translation-invariance hypothesis can be dropped, at the cost of excluding a set of small measure:

Theorem 3 (Nikishin-Stein factorisation theorem) Let ${X}$ be a finite measure space, let ${1\leq p \leq 2}$ and ${q>0}$, and let ${T: L^p(X) \rightarrow L^q(X)}$ be a bounded linear operator obeying the estimate

$\displaystyle \|T f \|_{L^q(X)} \leq A \|f\|_{L^p(X)}$

for all ${f \in L^p(X)}$ and some ${A>0}$. Then for any ${\epsilon > 0}$, there exists a subset ${E}$ of ${X}$ of measure at most ${\epsilon}$ such that

$\displaystyle \|T f \|_{L^{p,\infty}(X \backslash E)} \leq C_{p,q,\epsilon} A \|f\|_{L^p(X)} \ \ \ \ \ (4)$

for all ${f \in L^p(X)}$, with ${C_{p,q,\epsilon}}$ depending only on ${p, q, \epsilon}$.

One can recover Theorem 2 from Theorem 3 by an averaging argument to eliminate the exceptional set; we omit the details.

If one has a sequence ${x_1, x_2, x_3, \ldots \in {\bf R}}$ of real numbers ${x_n}$, it is unambiguous what it means for that sequence to converge to a limit ${x \in {\bf R}}$: it means that for every ${\epsilon > 0}$, there exists an ${N}$ such that ${|x_n-x| \leq \epsilon}$ for all ${n > N}$. Similarly for a sequence ${z_1, z_2, z_3, \ldots \in {\bf C}}$ of complex numbers ${z_n}$ converging to a limit ${z \in {\bf C}}$.

More generally, if one has a sequence ${v_1, v_2, v_3, \ldots}$ of ${d}$-dimensional vectors ${v_n}$ in a real vector space ${{\bf R}^d}$ or complex vector space ${{\bf C}^d}$, it is also unambiguous what it means for that sequence to converge to a limit ${v \in {\bf R}^d}$ or ${v \in {\bf C}^d}$; it means that for every ${\epsilon > 0}$, there exists an ${N}$ such that ${\|v_n-v\| \leq \epsilon}$ for all ${n \geq N}$. Here, the norm ${\|v\|}$ of a vector ${v = (v^{(1)},\ldots,v^{(d)})}$ can be chosen to be the Euclidean norm ${\|v\|_2 := (\sum_{j=1}^d |v^{(j)}|^2)^{1/2}}$, the supremum norm ${\|v\|_\infty := \sup_{1 \leq j \leq d} |v^{(j)}|}$, or any number of other norms, but for the purposes of convergence, these norms are all equivalent; a sequence of vectors converges in the Euclidean norm if and only if it converges in the supremum norm, and similarly for any other two norms on the finite-dimensional space ${{\bf R}^d}$ or ${{\bf C}^d}$.
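The equivalence of the Euclidean and supremum norms comes from the elementary bounds ${\|v\|_\infty \leq \|v\|_2 \leq \sqrt{d} \|v\|_\infty}$. The following Python sketch checks these bounds and tracks both errors along a convergent sequence. The particular limit vector, the sequence ${v_n}$, and the small rounding slack are assumptions chosen for the example.

```python
import math


def euclidean_norm(v):
    # uses |c|^2 rather than c^2, so complex coordinates are handled correctly
    return math.sqrt(sum(abs(c) ** 2 for c in v))


def sup_norm(v):
    return max(abs(c) for c in v)


def norms_comparable(v, slack=1e-12):
    """Check sup <= euclidean <= sqrt(d) * sup, up to floating-point rounding."""
    d = len(v)
    lo, mid, hi = sup_norm(v), euclidean_norm(v), math.sqrt(d) * sup_norm(v)
    return lo <= mid * (1 + slack) and mid <= hi * (1 + slack)


limit = [1.0, 2.0j, -3.0]                  # hypothetical limit vector in C^3


def v(n):
    return [c + 1.0 / n for c in limit]    # v_n -> limit as n -> infinity


def errors(n):
    diff = [a - b for a, b in zip(v(n), limit)]
    return sup_norm(diff), euclidean_norm(diff)


for n in (10, 100, 1000):
    print(n, errors(n))
```

Both errors decay at the same rate ${1/n}$ and differ only by the fixed factor ${\sqrt{3}}$, so convergence in one norm forces convergence in the other.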

If however one has a sequence ${f_1, f_2, f_3, \ldots}$ of functions ${f_n: X \rightarrow {\bf R}}$ or ${f_n: X \rightarrow {\bf C}}$ on a common domain ${X}$, and a putative limit ${f: X \rightarrow {\bf R}}$ or ${f: X \rightarrow {\bf C}}$, there can now be many different ways in which the sequence ${f_n}$ may or may not converge to the limit ${f}$. (One could also consider convergence of functions ${f_n: X_n \rightarrow {\bf C}}$ on different domains ${X_n}$, but we will not discuss this issue at all here.) This is in contrast with the situation with scalars ${x_n}$ or ${z_n}$ (which corresponds to the case when ${X}$ is a single point) or vectors ${v_n}$ (which corresponds to the case when ${X}$ is a finite set such as ${\{1,\ldots,d\}}$). Once ${X}$ becomes infinite, the functions ${f_n}$ acquire an infinite number of degrees of freedom, and this allows them to approach ${f}$ in any number of inequivalent ways.

What different types of convergence are there? As an undergraduate, one learns of the following two basic modes of convergence:

1. We say that ${f_n}$ converges to ${f}$ pointwise if, for every ${x \in X}$, ${f_n(x)}$ converges to ${f(x)}$. In other words, for every ${\epsilon > 0}$ and ${x \in X}$, there exists ${N}$ (that depends on both ${\epsilon}$ and ${x}$) such that ${|f_n(x)-f(x)| \leq \epsilon}$ whenever ${n \geq N}$.
2. We say that ${f_n}$ converges to ${f}$ uniformly if, for every ${\epsilon > 0}$, there exists ${N}$ such that for every ${n \geq N}$, ${|f_n(x) - f(x)| \leq \epsilon}$ for every ${x \in X}$. The difference between uniform convergence and pointwise convergence is that with the former, the time ${N}$ at which ${f_n(x)}$ must be permanently ${\epsilon}$-close to ${f(x)}$ is not permitted to depend on ${x}$, but must instead be chosen uniformly in ${x}$.

Uniform convergence implies pointwise convergence, but not conversely. A typical example: the functions ${f_n: {\bf R} \rightarrow {\bf R}}$ defined by ${f_n(x) := x/n}$ converge pointwise to the zero function ${f(x) := 0}$, but not uniformly.
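A few lines of Python make the distinction in this example visible. The sample point ${x = 5}$ and the choice of the moving point ${x = n}$ are assumptions made for illustration.

```python
def f(n, x):
    return x / n


ns = (1, 10, 100, 1000)

# At a fixed point x = 5 the values tend to zero (pointwise convergence).
fixed_point_values = [abs(f(n, 5.0)) for n in ns]

# At the moving point x = n the value is always 1, so sup_x |f_n(x)| >= 1 for every n.
moving_point_values = [abs(f(n, float(n))) for n in ns]

print(fixed_point_values)
print(moving_point_values)
```

The second list never drops below ${1}$, so no single ${N}$ can work for all ${x}$ simultaneously once ${\epsilon < 1}$.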

However, pointwise and uniform convergence are only two of dozens of modes of convergence that are of importance in analysis. We will not attempt to exhaustively enumerate these modes here (but see this Wikipedia page, and see also these 245B notes on strong and weak convergence). We will, however, discuss some of the modes of convergence that arise from measure theory, when the domain ${X}$ is equipped with the structure of a measure space ${(X, {\mathcal B}, \mu)}$, and the functions ${f_n}$ (and their limit ${f}$) are measurable with respect to this space. In this context, we have some additional modes of convergence:

1. We say that ${f_n}$ converges to ${f}$ pointwise almost everywhere if, for (${\mu}$-)almost every ${x \in X}$, ${f_n(x)}$ converges to ${f(x)}$.
2. We say that ${f_n}$ converges to ${f}$ uniformly almost everywhere, essentially uniformly, or in ${L^\infty}$ norm if, for every ${\epsilon > 0}$, there exists ${N}$ such that for every ${n \geq N}$, ${|f_n(x) - f(x)| \leq \epsilon}$ for ${\mu}$-almost every ${x \in X}$.
3. We say that ${f_n}$ converges to ${f}$ almost uniformly if, for every ${\epsilon > 0}$, there exists an exceptional set ${E \in {\mathcal B}}$ of measure ${\mu(E) \leq \epsilon}$ such that ${f_n}$ converges uniformly to ${f}$ on the complement of ${E}$.
4. We say that ${f_n}$ converges to ${f}$ in ${L^1}$ norm if the quantity ${\|f_n-f\|_{L^1(\mu)} = \int_X |f_n(x)-f(x)|\ d\mu}$ converges to ${0}$ as ${n \rightarrow \infty}$.
5. We say that ${f_n}$ converges to ${f}$ in measure if, for every ${\epsilon > 0}$, the measures ${\mu( \{ x \in X: |f_n(x) - f(x)| \geq \epsilon \} )}$ converge to zero as ${n \rightarrow \infty}$.

Observe that each of these five modes of convergence is unaffected if one modifies ${f_n}$ or ${f}$ on a set of measure zero. In contrast, the pointwise and uniform modes of convergence can be affected if one modifies ${f_n}$ or ${f}$ even on a single point.

Remark 1 In the context of probability theory, in which ${f_n}$ and ${f}$ are interpreted as random variables, convergence in ${L^1}$ norm is often referred to as convergence in mean, pointwise convergence almost everywhere is often referred to as almost sure convergence, and convergence in measure is often referred to as convergence in probability.

Exercise 1 (Linearity of convergence) Let ${(X, {\mathcal B}, \mu)}$ be a measure space, let ${f_n, g_n: X \rightarrow {\bf C}}$ be sequences of measurable functions, and let ${f, g: X \rightarrow {\bf C}}$ be measurable functions.

1. Show that ${f_n}$ converges to ${f}$ along one of the above seven modes of convergence if and only if ${|f_n-f|}$ converges to ${0}$ along the same mode.
2. If ${f_n}$ converges to ${f}$ along one of the above seven modes of convergence, and ${g_n}$ converges to ${g}$ along the same mode, show that ${f_n+g_n}$ converges to ${f+g}$ along the same mode, and that ${cf_n}$ converges to ${cf}$ along the same mode for any ${c \in {\bf C}}$.
3. (Squeeze test) If ${f_n}$ converges to ${0}$ along one of the above seven modes, and ${|g_n| \leq f_n}$ pointwise for each ${n}$, show that ${g_n}$ converges to ${0}$ along the same mode.

We have some easy implications between modes:

Exercise 2 (Easy implications) Let ${(X, {\mathcal B}, \mu)}$ be a measure space, and let ${f_n: X \rightarrow {\bf C}}$ and ${f: X \rightarrow {\bf C}}$ be measurable functions.

1. If ${f_n}$ converges to ${f}$ uniformly, then ${f_n}$ converges to ${f}$ pointwise.
2. If ${f_n}$ converges to ${f}$ uniformly, then ${f_n}$ converges to ${f}$ in ${L^\infty}$ norm. Conversely, if ${f_n}$ converges to ${f}$ in ${L^\infty}$ norm, then ${f_n}$ converges to ${f}$ uniformly outside of a null set (i.e. there exists a null set ${E}$ such that the restriction ${f_n\downharpoonright_{X \backslash E}}$ of ${f_n}$ to the complement of ${E}$ converges to the restriction ${f\downharpoonright_{X \backslash E}}$ of ${f}$).
3. If ${f_n}$ converges to ${f}$ in ${L^\infty}$ norm, then ${f_n}$ converges to ${f}$ almost uniformly.
4. If ${f_n}$ converges to ${f}$ almost uniformly, then ${f_n}$ converges to ${f}$ pointwise almost everywhere.
5. If ${f_n}$ converges to ${f}$ pointwise, then ${f_n}$ converges to ${f}$ pointwise almost everywhere.
6. If ${f_n}$ converges to ${f}$ in ${L^1}$ norm, then ${f_n}$ converges to ${f}$ in measure.
7. If ${f_n}$ converges to ${f}$ almost uniformly, then ${f_n}$ converges to ${f}$ in measure.

The reader is encouraged to draw a diagram that summarises the logical implications between the seven modes of convergence that the above exercise describes.

We give four key examples that distinguish between these modes, in the case when ${X}$ is the real line ${{\bf R}}$ with Lebesgue measure. The first three of these examples were already introduced in the previous set of notes.

Example 1 (Escape to horizontal infinity) Let ${f_n := 1_{[n,n+1]}}$. Then ${f_n}$ converges to zero pointwise (and thus, pointwise almost everywhere), but not uniformly, in ${L^\infty}$ norm, almost uniformly, in ${L^1}$ norm, or in measure.

Example 2 (Escape to width infinity) Let ${f_n := \frac{1}{n} 1_{[0,n]}}$. Then ${f_n}$ converges to zero uniformly (and thus, pointwise, pointwise almost everywhere, in ${L^\infty}$ norm, almost uniformly, and in measure), but not in ${L^1}$ norm.

Example 3 (Escape to vertical infinity) Let ${f_n := n 1_{[\frac{1}{n}, \frac{2}{n}]}}$. Then ${f_n}$ converges to zero pointwise (and thus, pointwise almost everywhere) and almost uniformly (and hence in measure), but not uniformly, in ${L^\infty}$ norm, or in ${L^1}$ norm.
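The three escape examples are all of the form "height times the indicator of an interval", so the relevant quantities can be computed exactly. The Python sketch below does this with rational arithmetic. The tuple representation `(height, left, right)` and the helper names are assumptions made for the illustration.

```python
from fractions import Fraction


def horizontal(n):                      # Example 1: 1_{[n, n+1]}
    return (Fraction(1), Fraction(n), Fraction(n + 1))


def width(n):                           # Example 2: (1/n) 1_{[0, n]}
    return (Fraction(1, n), Fraction(0), Fraction(n))


def vertical(n):                        # Example 3: n 1_{[1/n, 2/n]}
    return (Fraction(n), Fraction(1, n), Fraction(2, n))


def sup_norm(bump):
    return abs(bump[0])


def l1_norm(bump):
    height, left, right = bump
    return abs(height) * (right - left)


def level_set_measure(bump, eps):
    """Lebesgue measure of {x : |f(x)| >= eps}."""
    height, left, right = bump
    return right - left if abs(height) >= eps else Fraction(0)


eps = Fraction(1, 2)
for name, family in (("horizontal", horizontal), ("width", width), ("vertical", vertical)):
    for n in (1, 10, 100):
        b = family(n)
        print(name, n, sup_norm(b), l1_norm(b), level_set_measure(b, eps))
```

In all three families the ${L^1}$ norm stays equal to ${1}$. The sup norm tends to zero only for the second family, and the measure of the level set tends to zero for the second and third families but not the first, matching the claims in the examples.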

Example 4 (Typewriter sequence) Let ${f_n}$ be defined by the formula

$\displaystyle f_n := 1_{[\frac{n-2^k}{2^k}, \frac{n-2^k+1}{2^k}]}$

whenever ${k \geq 0}$ and ${2^k \leq n < 2^{k+1}}$. This is a sequence of indicator functions of intervals of decreasing length, marching across the unit interval ${[0,1]}$ over and over again. Then ${f_n}$ converges to zero in measure and in ${L^1}$ norm, but not pointwise almost everywhere (and hence also not pointwise, not almost uniformly, nor in ${L^\infty}$ norm, nor uniformly).
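The typewriter sequence is easy to tabulate exactly. The Python sketch below computes the interval for each ${n}$, its length (which is the ${L^1}$ norm of ${f_n}$), and the indices ${n}$ at which a fixed point is hit. The test point ${x = 1/3}$ and the helper names are assumptions made for the illustration.

```python
from fractions import Fraction


def typewriter_interval(n):
    """Endpoints of the interval whose indicator is f_n, for n >= 1."""
    k = n.bit_length() - 1              # the unique k >= 0 with 2^k <= n < 2^(k+1)
    return Fraction(n - 2 ** k, 2 ** k), Fraction(n - 2 ** k + 1, 2 ** k)


def f(n, x):
    left, right = typewriter_interval(n)
    return 1 if left <= x <= right else 0


def l1_norm(n):
    left, right = typewriter_interval(n)
    return right - left


x = Fraction(1, 3)
hits = [n for n in range(1, 2 ** 10) if f(n, x) == 1]
print(hits)
print(l1_norm(1023))
```

The ${L^1}$ norms shrink like ${2^{-k}}$, yet the point ${x = 1/3}$ is hit once in every dyadic generation ${2^k \leq n < 2^{k+1}}$, so ${f_n(x)}$ equals ${1}$ infinitely often and cannot converge to zero.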

Remark 2 The ${L^\infty}$ norm ${\|f\|_{L^\infty(\mu)}}$ of a measurable function ${f: X \rightarrow {\bf C}}$ is defined to be the infimum of all the quantities ${M \in [0,+\infty]}$ that are essential upper bounds for ${f}$ in the sense that ${|f(x)| \leq M}$ for almost every ${x}$. Then ${f_n}$ converges to ${f}$ in ${L^\infty}$ norm if and only if ${\|f_n-f\|_{L^\infty(\mu)} \rightarrow 0}$ as ${n \rightarrow \infty}$. The ${L^\infty}$ and ${L^1}$ norms are part of the larger family of ${L^p}$ norms, which we will study in more detail in 245B.

One particular advantage of ${L^1}$ convergence is that, in the case when the ${f_n}$ are absolutely integrable, it implies convergence of the integrals,

$\displaystyle \int_X f_n\ d\mu \rightarrow \int_X f\ d\mu,$

as one sees from the triangle inequality. Unfortunately, none of the other modes of convergence automatically imply this convergence of the integral, as the above examples show.

The purpose of these notes is to compare these modes of convergence with each other. Unfortunately, the relationship between these modes is not particularly simple; unlike the situation with pointwise and uniform convergence, one cannot simply rank these modes in a linear order from strongest to weakest. This is ultimately because the different modes react in different ways to the three “escape to infinity” scenarios described above, as well as to the “typewriter” behaviour when a single set is “overwritten” many times. On the other hand, if one imposes some additional assumptions to shut down one or more of these escape to infinity scenarios, such as a finite measure hypothesis ${\mu(X) < \infty}$ or a uniform integrability hypothesis, then one can obtain some additional implications between the different modes.