In a few weeks, Princeton University will host a conference in Analysis and Applications in honour of the 80th birthday of Elias Stein (though, technically, Eli’s 80th birthday was actually in January). As one of Eli’s students, I was originally scheduled to be one of the speakers at this conference; but unfortunately, for family reasons I will be unable to attend. In lieu of speaking at this conference, I have decided to devote some space on this blog for this month to present some classic results of Eli from his many decades of work in harmonic analysis, ergodic theory, several complex variables, and related topics. My choice of selections here will be a personal and idiosyncratic one; the results I present are not necessarily the “best” or “deepest” of his results, but are ones that I find particularly elegant and appealing. (There will also inevitably be some overlap here with Charlie Fefferman’s article “Selected theorems by Eli Stein”, which not coincidentally was written for Stein’s 60th birthday conference in 1991.)

In this post I would like to describe one of Eli Stein’s very first results that is still used extremely widely today, namely his interpolation theorem from 1956 (and its refinement, the Fefferman-Stein interpolation theorem from 1972). This is a deceptively innocuous, yet remarkably powerful, generalisation of the classic Riesz-Thorin interpolation theorem which uses methods from complex analysis (and in particular, the Lindelöf theorem or the Phragmén-Lindelöf principle) to show that if a linear operator ${T: L^{p_0}(X) + L^{p_1}(X) \rightarrow L^{q_0}(Y) + L^{q_1}(Y)}$ from one (${\sigma}$-finite) measure space ${X = (X,{\mathcal X},\mu)}$ to another ${Y = (Y, {\mathcal Y}, \nu)}$ obeyed the estimates

$\displaystyle \| Tf \|_{L^{q_0}(Y)} \leq B_0 \|f\|_{L^{p_0}(X)} \ \ \ \ \ (1)$

for all ${f \in L^{p_0}(X)}$ and

$\displaystyle \| Tf \|_{L^{q_1}(Y)} \leq B_1 \|f\|_{L^{p_1}(X)} \ \ \ \ \ (2)$

for all ${f \in L^{p_1}(X)}$, where ${1 \leq p_0,p_1,q_0,q_1 \leq \infty}$ and ${B_0,B_1 > 0}$, then one automatically also has the interpolated estimates

$\displaystyle \| Tf \|_{L^{q_\theta}(Y)} \leq B_\theta \|f\|_{L^{p_\theta}(X)} \ \ \ \ \ (3)$

for all ${f \in L^{p_\theta}(X)}$ and ${0 \leq \theta \leq 1}$, where the quantities ${p_\theta, q_\theta, B_\theta}$ are defined by the formulae

$\displaystyle \frac{1}{p_\theta} = \frac{1-\theta}{p_0} + \frac{\theta}{p_1}$

$\displaystyle \frac{1}{q_\theta} = \frac{1-\theta}{q_0} + \frac{\theta}{q_1}$

$\displaystyle B_\theta = B_0^{1-\theta} B_1^\theta.$
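As a quick sanity check on these formulae, here is a minimal Python sketch that computes ${p_\theta}$, ${q_\theta}$ and ${B_\theta}$ from the endpoint data. The function names `interpolate_exponent` and `interpolate_constant`, and the sample endpoints (${L^2 \rightarrow L^2}$ and ${L^1 \rightarrow L^\infty}$ at ${\theta = 1/3}$), are illustrative choices of mine and are not part of any library.

```python
from fractions import Fraction
import math

def interpolate_exponent(r0, r1, theta):
    """Return r_theta with 1/r_theta = (1-theta)/r0 + theta/r1.

    Exponents may be math.inf (so that 1/r = 0); theta should be a
    Fraction so that the arithmetic is exact.
    """
    inv0 = Fraction(0) if r0 == math.inf else 1 / Fraction(r0)
    inv1 = Fraction(0) if r1 == math.inf else 1 / Fraction(r1)
    inv = (1 - theta) * inv0 + theta * inv1
    return math.inf if inv == 0 else 1 / inv

def interpolate_constant(B0, B1, theta):
    """Return B_theta = B0^(1-theta) * B1^theta (floating point)."""
    return B0 ** (1 - float(theta)) * B1 ** float(theta)

# Illustrative endpoints: L^2 -> L^2 at theta = 0, L^1 -> L^infty at theta = 1.
theta = Fraction(1, 3)
p = interpolate_exponent(2, 1, theta)
q = interpolate_exponent(2, math.inf, theta)
print(p, q)  # the interpolated pair (p_theta, q_theta)
```

With these sample endpoints the interpolated pair is ${L^{3/2} \rightarrow L^3}$.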

The Riesz-Thorin theorem is already quite useful (it gives, for instance, by far the quickest proof of the Hausdorff-Young inequality for the Fourier transform, to name just one application), but it requires the same linear operator ${T}$ to appear in (1), (2), and (3). Eli Stein realised, though, that due to the complex-analytic nature of the proof of the Riesz-Thorin theorem, it was possible to allow different linear operators to appear in (1), (2), (3), so long as the dependence was analytic. A bit more precisely: if one had a family ${T_z}$ of operators which depended in an analytic manner on a complex variable ${z}$ in the strip ${\{ z \in {\bf C}: 0 \leq \hbox{Re}(z) \leq 1 \}}$ (thus, for any test functions ${f, g}$, the inner product ${\langle T_z f, g \rangle}$ would be analytic in ${z}$) which obeyed some mild regularity assumptions (which are slightly technical and are omitted here), and one had the estimates

$\displaystyle \| T_{0+it} f \|_{L^{q_0}(Y)} \leq C_t \|f\|_{L^{p_0}(X)}$

and

$\displaystyle \| T_{1+it} f \|_{L^{q_1}(Y)} \leq C_t\|f\|_{L^{p_1}(X)}$

for all ${t \in {\bf R}}$ and some quantities ${C_t}$ that grew at most exponentially in ${t}$ (actually, any growth rate significantly slower than the double-exponential ${e^{\exp(\pi |t|)}}$ would suffice here), then one also has the interpolated estimates

$\displaystyle \| T_\theta f \|_{L^{q_\theta}(Y)} \leq C' \|f\|_{L^{p_\theta}(X)}$

for all ${0 \leq \theta \leq 1}$ and a constant ${C'}$ depending only on the bounds ${C_t}$ and on ${p_0, p_1, q_0, q_1}$.

In Fefferman’s survey, he notes that the proof of the Stein interpolation theorem can be obtained from that of the Riesz-Thorin theorem simply “by adding a single letter of the alphabet”. Indeed, the way the Riesz-Thorin theorem is proven is to study an expression of the form

$\displaystyle F(z) := \int_Y T f_z(y) g_z(y)\ dy,$

where ${f_z, g_z}$ are functions depending on ${z}$ in a suitably analytic manner, for instance taking ${f_z = |f|^{\frac{1-z}{p_0}+\frac{z}{p_1}} \hbox{sgn}(f)}$ for some test function ${f}$, and similarly for ${g}$. If ${f_z, g_z}$ are chosen properly, ${F}$ will depend analytically on ${z}$ as well, and the two hypotheses (1), (2) give bounds on ${F(0+it)}$ and ${F(1+it)}$ for ${t \in {\bf R}}$ respectively. The Lindelöf theorem then gives bounds on intermediate values of ${F}$, such as ${F(\theta)}$; and the Riesz-Thorin theorem can then be deduced by a duality argument. (This is covered in many graduate real analysis texts; I myself covered it here.)
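To get a feel for the kind of bound that Lindelöf’s theorem provides (in its “three lines” form, ${\sup_t |F(\theta+it)| \leq M_0^{1-\theta} M_1^\theta}$ where ${M_0, M_1}$ are the suprema on the two boundary lines), here is a small numerical sketch in Python. The function ${F(z) = e^{z^2}}$ is a hypothetical stand-in, not the ${F}$ from the proof; it is chosen only because it is analytic and bounded on the strip. The suprema are approximated on a finite grid, so this is a sanity check rather than a proof.

```python
import cmath

def F(z):
    # Stand-in analytic function, bounded on the strip 0 <= Re(z) <= 1:
    # |exp(z^2)| = exp(x^2 - t^2) for z = x + it.
    return cmath.exp(z * z)

def sup_on_line(x, t_max=10.0, steps=2001):
    """Approximate sup over t in [-t_max, t_max] of |F(x + it)| on a grid."""
    ts = [-t_max + 2 * t_max * k / (steps - 1) for k in range(steps)]
    return max(abs(F(complex(x, t))) for t in ts)

M0 = sup_on_line(0.0)  # bound on the line Re(z) = 0
M1 = sup_on_line(1.0)  # bound on the line Re(z) = 1

for theta in (0.25, 0.5, 0.75):
    lhs = sup_on_line(theta)
    rhs = M0 ** (1 - theta) * M1 ** theta
    # The three-lines bound predicts lhs <= rhs.
    print(theta, lhs, rhs, lhs <= rhs + 1e-12)
```

For this particular ${F}$ the interior supremum is ${e^{\theta^2}}$, which is indeed at most the interpolated bound ${e^\theta}$.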

The Stein interpolation theorem proceeds by instead studying the expression

$\displaystyle F(z) := \int_Y T_z f_z(y) g_z(y)\ dy.$

One can then repeat the proof of the Riesz-Thorin theorem more or less verbatim to obtain the Stein interpolation theorem.

The ability to vary the operator ${T}$ makes the Stein interpolation theorem significantly more flexible than the Riesz-Thorin theorem. We illustrate this with the following sample result:

Proposition 1 For any (test) function ${f: {\bf R}^2 \rightarrow {\bf R}}$, let ${Tf: {\bf R}^2 \rightarrow {\bf R}}$ be the average of ${f}$ along an arc of a parabola:

$\displaystyle Tf(x_1,x_2) := \int_{\bf R} f(x_1-t, x_2-t^2)\eta(t)\ dt$

where ${\eta}$ is a bump function supported on (say) ${[-1,1]}$. Then ${T}$ is bounded from ${L^{3/2}({\bf R}^2)}$ to ${L^3({\bf R}^2)}$, thus

$\displaystyle \|Tf\|_{L^3({\bf R}^2)} \leq C \|f\|_{L^{3/2}({\bf R}^2)}. \ \ \ \ \ (4)$

There is nothing too special here about the parabola; the same result in fact holds for convolution operators on any arc of a smooth curve with nonzero curvature (and there are many extensions to higher dimensions, to variable-coefficient operators, etc.). We will however restrict attention to the parabola for the sake of exposition. One can view ${Tf}$ as a convolution ${Tf = f * \sigma}$, where ${\sigma}$ is a measure on the parabola arc ${\{ (t,t^2): |t| \leq 1 \}}$. We will also be somewhat vague about what “test function” means in this exposition in order to gloss over some minor technical details.

By testing ${T}$ (and its adjoint) on the indicator function of a small ball of some radius ${\delta > 0}$ (or of small rectangles such as ${[-\delta,\delta] \times [0,\delta^2]}$) one sees that the exponents ${L^{3/2}}$, ${L^3}$ here are best possible.

This proposition was first proven by Littman in 1973 using the Stein interpolation theorem. To illustrate the power of this theorem, it should be noted that for almost two decades this was the only known proof of this result; a proof based on multilinear interpolation (exploiting the fact that the exponent ${3}$ in (4) is an integer) was obtained by Oberlin, and a fully combinatorial proof was only obtained in 2008 in an unpublished note of Christ (see also the recent papers of Stovall and of Dendrinos-Laghi-Wright for further extensions of the combinatorial argument).

To motivate the Stein interpolation argument, let us first try using the Riesz-Thorin interpolation theorem. The exponent pair ${L^{3/2} \rightarrow L^3}$ is an interpolant between ${L^2 \rightarrow L^2}$ and ${L^1 \rightarrow L^\infty}$, so a first attempt to proceed here would be to establish the bounds

$\displaystyle \|Tf\|_{L^2({\bf R}^2)} \leq C \|f\|_{L^2({\bf R}^2)} \ \ \ \ \ (5)$

and

$\displaystyle \|Tf\|_{L^\infty({\bf R}^2)} \leq C \|f\|_{L^1({\bf R}^2)} \ \ \ \ \ (6)$

for all (test) functions ${f}$.

The bound (5) is an easy consequence of Minkowski’s integral inequality (or Young’s inequality, noting that ${\sigma}$ is a finite measure). On the other hand, because the measure ${\sigma}$ is not absolutely continuous, let alone arising from an ${L^\infty({\bf R}^2)}$ function, the estimate (6) is very false. For instance, if one applies ${T}$ to the indicator function ${f = 1_{[-\delta,\delta] \times [-\delta,\delta]}}$ for some small ${\delta>0}$, then the ${L^1}$ norm of ${f}$ is comparable to ${\delta^2}$, but the ${L^\infty}$ norm of ${Tf}$ is comparable to ${\delta}$, contradicting (6) as one sends ${\delta}$ to zero.
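The failure of (6) is easy to see numerically. The following Python sketch evaluates ${Tf}$ at the origin for the indicator of the square of side ${2\delta}$ and compares it with the ${L^1}$ norm of that indicator. The specific bump function `eta` (the standard ${e^{-1/(1-t^2)}}$ profile) and the midpoint-rule quadrature are assumptions made for illustration; the value at the origin is used as a representative lower bound for the size of ${Tf}$.

```python
import math

def eta(t):
    """A standard smooth bump supported on [-1, 1] (an illustrative choice)."""
    return math.exp(-1.0 / (1.0 - t * t)) if abs(t) < 1 else 0.0

def f_indicator(x1, x2, delta):
    """Indicator function of the square [-delta, delta] x [-delta, delta]."""
    return 1.0 if abs(x1) <= delta and abs(x2) <= delta else 0.0

def T_at_origin(delta, n=20000):
    """Midpoint-rule approximation of Tf(0,0) = int f(-t, -t^2) eta(t) dt."""
    h = 2.0 / n
    total = 0.0
    for k in range(n):
        t = -1.0 + (k + 0.5) * h
        total += f_indicator(-t, -t * t, delta) * eta(t)
    return total * h

def ratio(delta):
    """Tf(0,0) divided by the L^1 norm of f, which is (2*delta)^2."""
    l1_norm = (2 * delta) ** 2
    return T_at_origin(delta) / l1_norm

for delta in (0.1, 0.05, 0.025):
    # If (6) held, this ratio would stay bounded as delta shrinks.
    print(delta, ratio(delta))
```

The printed ratio roughly doubles each time ${\delta}$ is halved, consistent with growth like ${1/\delta}$.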

To get around this, one first notes that there is a lot of “room” in (5) due to the smoothing properties of the measure ${\sigma}$. Indeed, from Plancherel’s theorem one has

$\displaystyle \|f\|_{L^2({\bf R}^2)} = \| \hat f \|_{L^2({\bf R}^2)}$

and

$\displaystyle \|Tf\|_{L^2({\bf R}^2)} = \| \hat f \hat \sigma \|_{L^2({\bf R}^2)}$

for all test functions ${f}$, where

$\displaystyle \hat f(\xi) := \int_{{\bf R}^2} e^{-2\pi i x \cdot \xi} f(x)\ dx$

is the Fourier transform of ${f}$, and

$\displaystyle \hat \sigma(\xi_1, \xi_2) := \int_{\bf R} e^{-2\pi i (t \xi_1 + t^2 \xi_2)} \eta(t)\ dt.$

It is clear that ${\hat \sigma(\xi)}$ is uniformly bounded in ${\xi}$, which already gives (5). But a standard application of the method of stationary phase reveals that one in fact has a decay estimate

$\displaystyle |\hat \sigma(\xi)| \leq \frac{C}{|\xi|^{1/2}} \ \ \ \ \ (7)$

for some ${C > 0}$. This shows that ${Tf}$ is not just in ${L^2}$, but is somewhat smoother as well; in particular, one has

$\displaystyle \| D^{1/2} Tf \|_{L^2({\bf R}^2)} \leq C \| f\|_{L^2({\bf R}^2)}$

for any (fractional) differential operator ${D^{1/2}}$ of order ${1/2}$. (Here we adopt the usual convention that the constant ${C}$ is allowed to vary from line to line.)
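The decay estimate (7) can also be observed numerically, at least along the ${\xi_2}$ axis where the phase ${t^2 \xi_2}$ has a stationary point at ${t=0}$. The sketch below is a rough check under stated assumptions: `eta` is the same illustrative bump ${e^{-1/(1-t^2)}}$ as one might pick for ${\eta}$ (the post does not fix a specific one), and the integral is approximated by a midpoint rule with a step small enough to resolve the oscillation for the frequencies shown.

```python
import cmath
import math

def eta(t):
    """A standard smooth bump supported on [-1, 1] (an illustrative choice)."""
    return math.exp(-1.0 / (1.0 - t * t)) if abs(t) < 1 else 0.0

def sigma_hat(xi1, xi2, n=40000):
    """Midpoint-rule approximation of int exp(-2 pi i (t xi1 + t^2 xi2)) eta(t) dt."""
    h = 2.0 / n
    total = 0j
    for k in range(n):
        t = -1.0 + (k + 0.5) * h
        total += cmath.exp(-2j * math.pi * (t * xi1 + t * t * xi2)) * eta(t)
    return total * h

for xi2 in (1.0, 4.0, 16.0, 64.0):
    val = abs(sigma_hat(0.0, xi2))
    # (7) predicts that val * sqrt(xi2) stays bounded as xi2 grows.
    print(xi2, val, val * math.sqrt(xi2))
```

Stationary phase suggests that the rescaled quantity in the last column should approach ${\eta(0)/\sqrt{2}}$ for large ${\xi_2}$; the sketch only checks boundedness.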

Using the numerology of the Stein interpolation theorem, this suggests that if we can somehow obtain the counterbalancing estimate

$\displaystyle \| D^{-1} Tf \|_{L^\infty({\bf R}^2)} \leq C \|f\|_{L^1({\bf R}^2)}$

for some differential operator ${D^{-1}}$ of order ${-1}$, then we should be able to interpolate and obtain the desired estimate (4). And indeed, we can take an antiderivative in the ${x_2}$ direction, giving the operator

$\displaystyle \partial_{x_2}^{-1} Tf(x_1,x_2) := \int_{\bf R} \int_{-\infty}^0 f(x_1-t, x_2-t^2-s)\ \eta(t) dt ds;$

and a simple change of variables does indeed verify that this operator is bounded from ${L^1({\bf R}^2)}$ to ${L^\infty({\bf R}^2)}$.
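The change of variables can be spelled out as follows (a sketch for test functions ${f}$, writing ${\|\eta\|_{L^\infty({\bf R})}}$ for the sup norm of the bump function). Taking absolute values inside the integral gives

$\displaystyle |\partial_{x_2}^{-1} Tf(x_1,x_2)| \leq \|\eta\|_{L^\infty({\bf R})} \int_{\bf R} \int_{-\infty}^0 |f(x_1-t, x_2-t^2-s)|\ dt ds.$

The substitution ${(u,v) := (x_1-t, x_2-t^2-s)}$ is injective in ${(t,s)}$ and has Jacobian determinant

$\displaystyle \det \begin{pmatrix} -1 & 0 \\ -2t & -1 \end{pmatrix} = 1,$

so it preserves Lebesgue measure, and the right-hand side is therefore at most ${\|\eta\|_{L^\infty({\bf R})} \|f\|_{L^1({\bf R}^2)}}$, uniformly in ${(x_1,x_2)}$.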

Unfortunately, the above argument is not rigorous, because we need an analytic family of operators ${T_z}$ in order to invoke the Stein interpolation theorem, rather than just two operators ${T_0}$ and ${T_1}$. This turns out to require some slightly tricky complex analysis: after some trial and error, one finds that one can use the family ${T_z}$ defined for ${\hbox{Re}(z) > 1/3}$ by the formula

$\displaystyle T_z f(x_1,x_2) = \frac{1}{\Gamma((3z-1)/2)} \int_{\bf R} \int_{-\infty}^0 \frac{1}{s^{(3-3z)/2}} f(x_1-t, x_2-t^2-s)\ \eta(t) dt ds$

where ${\Gamma}$ is the Gamma function, and extended to the rest of the complex plane by analytic continuation. The Gamma factor is a technical one, needed to compensate for the divergence of the weight ${\frac{1}{s^{(3-3z)/2}}}$ as ${z}$ approaches ${1/3}$; it also makes the Fourier representation of ${T_z}$ cleaner (indeed, ${T_z f}$ is morally ${\partial_{x_2}^{(1-3z)/2} f * \sigma}$). It is then easy to verify the estimates

$\displaystyle \|T_{1+it} f \|_{L^\infty({\bf R}^2)} \leq C_t \|f\|_{L^1({\bf R}^2)} \ \ \ \ \ (8)$

for all ${t \in {\bf R}}$ (with ${C_t}$ growing at a controlled rate), while from Fourier analysis one also can show that

$\displaystyle \|T_{0+it} f \|_{L^2({\bf R}^2)} \leq C_t \|f\|_{L^2({\bf R}^2)} \ \ \ \ \ (9)$

for all ${t \in {\bf R}}$. Finally, one can verify that ${T_{1/3} = T}$, and (4) then follows from the Stein interpolation theorem.
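The role of the Gamma factor can be illustrated in one variable. Writing ${\alpha := (3z-1)/2}$, the weight in ${T_z}$ is ${s^{\alpha-1}/\Gamma(\alpha)}$ (after reflecting to the half-line ${s > 0}$), and as ${\alpha \rightarrow 0}$, i.e. ${z \rightarrow 1/3}$, this normalised weight concentrates at ${s=0}$ like a Dirac mass, which is the heuristic reason why ${T_{1/3} = T}$. The Python sketch below checks this numerically for the hypothetical test function ${\phi(s) = e^{-2s}}$, for which the exact value ${2^{-\alpha}}$ is available for comparison; the function name and quadrature parameters are my own choices.

```python
import math

def normalized_weight_integral(alpha, phi, u_max=6.0, n=60000):
    """Approximate (1/Gamma(alpha)) * int_0^inf s^(alpha-1) phi(s) ds.

    The substitution s = u^(1/alpha) removes the singularity at s = 0:
    the quantity becomes (1/Gamma(alpha+1)) * int_0^inf phi(u^(1/alpha)) du,
    which is approximated here by a midpoint rule on [0, u_max].
    """
    h = u_max / n
    total = 0.0
    for k in range(n):
        u = (k + 0.5) * h
        total += phi(u ** (1.0 / alpha))
    return total * h / math.gamma(alpha + 1.0)

def phi(s):
    """Illustrative test function with phi(0) = 1."""
    return math.exp(-2.0 * s)

for alpha in (0.5, 0.2, 0.05):
    # Numerical value next to the exact value 2^(-alpha); both tend to phi(0) = 1.
    print(alpha, normalized_weight_integral(alpha, phi), 2.0 ** (-alpha))
```

As ${\alpha}$ decreases, both columns approach ${\phi(0) = 1}$, as the Dirac mass heuristic predicts.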

It is instructive to compare this result with what can be obtained by real-variable methods. One can perform a smooth dyadic partition of unity

$\displaystyle \delta(s) = \phi(s) + \sum_{j=1}^\infty 2^j \psi( 2^j s )$

for some bump function ${\phi}$ (of total mass ${1}$) and bump function ${\psi}$ (of total mass zero), which (formally, at least) leads to the decomposition

$\displaystyle T f = T_0 f + \sum_{j=1}^\infty T_j f$

where ${T_0 f}$ is a harmless smoothing operator (which certainly maps ${L^{3/2}({\bf R}^2)}$ to ${L^3({\bf R}^2)}$) and

$\displaystyle T_j f(x_1,x_2) := \int_{\bf R} \int_{\bf R} 2^j \psi(2^j s) f(x_1-t, x_2-t^2-s) \eta(t)\ dt ds.$

It is not difficult to show that

$\displaystyle \| T_j f \|_{L^\infty({\bf R}^2)} \leq C 2^j \|f\|_{L^1({\bf R}^2)} \ \ \ \ \ (10)$

while a Fourier-analytic computation (using (7)) reveals that

$\displaystyle \| T_j f \|_{L^2({\bf R}^2)} \leq C 2^{-j/2} \|f\|_{L^2({\bf R}^2)} \ \ \ \ \ (11)$

which interpolates (by, say, the Riesz-Thorin theorem, or the real-variable Marcinkiewicz interpolation theorem) to

$\displaystyle \| T_j f \|_{L^3({\bf R}^2)} \leq C \|f\|_{L^{3/2}({\bf R}^2)}$

which is close to (4). Unfortunately, we still have to sum in ${j}$, and this creates a “logarithmic divergence” that just barely fails to recover (4). (With a slightly more refined real interpolation argument, one can at least obtain a restricted weak-type estimate from ${L^{3/2,1}({\bf R}^2)}$ to ${L^{3,\infty}({\bf R}^2)}$ this way, but one can concoct abstract counterexamples to show that the estimates (10), (11) are insufficient to obtain an ${L^{3/2} \rightarrow L^3}$ bound on ${\sum_{j=1}^\infty T_j}$.)
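The numerology behind this divergence is simple enough to tabulate. The following Python sketch (with a hypothetical helper `interpolated_constant`) interpolates the constants in (10) and (11) at ${\theta = 1/3}$ and sums the resulting bounds over ${1 \leq j \leq J}$. It only tracks the constants supplied by the triangle inequality, not the operators themselves.

```python
def interpolated_constant(j, theta=1.0 / 3.0):
    """Interpolate the bounds (11) (theta = 0) and (10) (theta = 1) for T_j."""
    A0 = 2.0 ** (-j / 2.0)  # L^2 -> L^2 constant from (11)
    A1 = 2.0 ** j           # L^1 -> L^infty constant from (10)
    return A0 ** (1 - theta) * A1 ** theta

for J in (10, 100, 1000):
    partial = sum(interpolated_constant(j) for j in range(1, J + 1))
    # Each term is (up to rounding) equal to 1, so the partial sum grows like J.
    print(J, partial)
```

Since ${J}$ dyadic scales correspond to frequencies up to ${2^J}$, a bound that grows linearly in ${J}$ is logarithmic in the frequency scale.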

The key difference is that the inputs (8), (9) used in the Stein interpolation theorem are more powerful than the inputs (10), (11) in the real-variable method. Indeed, (8) is roughly equivalent to the assertion that

$\displaystyle \| \sum_{j=1}^\infty e^{2\pi i j t} 2^{-j} T_j f \|_{L^\infty({\bf R}^2)} \leq C_t \|f\|_{L^1({\bf R}^2)}$

and (9) is similarly equivalent to the assertion that

$\displaystyle \| \sum_{j=1}^\infty e^{2\pi i j t} 2^{j/2} T_j f \|_{L^2({\bf R}^2)} \leq C_t \|f\|_{L^2({\bf R}^2)}.$

A Fourier averaging argument shows that these estimates imply (10) and (11), but not conversely. If one unpacks the proof of Lindelöf’s theorem (which is ultimately powered by an integral representation, such as that provided by the Cauchy integral formula) and hence of the Stein interpolation theorem, one can interpret Stein interpolation in this case as using a clever integral representation of ${\sum_{j=1}^\infty T_j f}$ in terms of expressions such as ${\sum_{j=1}^\infty e^{2\pi i j t} 2^{-j} T_j f_{1+it}}$ and ${\sum_{j=1}^\infty e^{2\pi i j t} 2^{j/2} T_j f_{0+it}}$, where ${f_{1+it}, f_{0+it}}$ are various nonlinear transforms of ${f}$. Technically, it would then be possible to rewrite the Stein interpolation argument as a real-variable one, without explicit mention of Lindelöf’s theorem, but the proof would then look extremely contrived. The complex-analytic framework is much more natural (much as it is in analytic number theory, where the distribution of the primes is best handled by a complex-analytic study of the Riemann zeta function).

Remark 1 A useful strengthening of the Stein interpolation theorem is the Fefferman-Stein interpolation theorem, in which the endpoint spaces ${L^1}$ and ${L^\infty}$ are replaced by the Hardy space ${{\mathcal H}^1}$ and the space ${BMO}$ of functions of bounded mean oscillation respectively. These spaces are more stable with respect to various harmonic analysis operators, such as singular integrals (and in particular, with respect to the Marcinkiewicz operators ${|\nabla|^{it}}$, which come up frequently when attempting to use the complex method), which makes the Fefferman-Stein theorem particularly useful for controlling expressions derived from these sorts of operators.