One of my favourite unsolved problems in harmonic analysis is the restriction problem. This problem, first posed explicitly by Elias Stein, can take many equivalent forms, but one of them is this: one starts with a smooth compact hypersurface {S} (possibly with boundary) in {{\bf R}^d}, such as the unit sphere {S = S^2} in {{\bf R}^3}, and equips it with surface measure {d\sigma}. One then takes a bounded measurable function {f \in L^\infty(S,d\sigma)} on this surface, and then computes the (inverse) Fourier transform

\displaystyle  \widehat{fd\sigma}(x) = \int_S e^{2\pi i x \cdot \omega} f(\omega) d\sigma(\omega)

of the measure {fd\sigma}. As {f} is bounded and {d\sigma} is a finite measure, this is a bounded function on {{\bf R}^d}; from the dominated convergence theorem, it is also continuous. The restriction problem asks whether this Fourier transform also decays in space, and specifically whether {\widehat{fd\sigma}} lies in {L^q({\bf R}^d)} for some {q < \infty}. (This is a natural space to control decay because it is translation invariant, which is compatible on the frequency space side with the modulation invariance of {L^\infty(S,d\sigma)}.) By the closed graph theorem, this is the case if and only if there is an estimate of the form

\displaystyle  \| \widehat{f d\sigma} \|_{L^q({\bf R}^d)} \leq C_{q,d,S} \|f\|_{L^\infty(S,d\sigma)} \ \ \ \ \ (1)

for some constant {C_{q,d,S}} that can depend on {q,d,S} but not on {f}. By a limiting argument, it suffices to prove such an estimate under the additional assumption that {f} is smooth.
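To get a feel for the kind of decay at stake, it may help to look at the simplest example {f \equiv 1} on the unit sphere {S^2 \subset {\bf R}^3}, where a standard computation gives {\widehat{d\sigma}(x) = 2 \sin(2\pi |x|)/|x|}. The following Python sketch (the function names are illustrative and not taken from any library) compares this closed form against a crude midpoint-rule quadrature, using the fact that the height coordinate is uniformly distributed on {S^2}.

```python
import cmath
import math

def sphere_transform_numeric(r, n=20000):
    """Midpoint-rule approximation of the integral of exp(2*pi*i*x.w) over S^2, |x| = r.

    Taking x along the vertical axis, the height t of a point on S^2 is uniformly
    distributed on [-1, 1], so the surface integral reduces to
    2*pi * int_{-1}^{1} exp(2*pi*i*r*t) dt.
    """
    h = 2.0 / n
    total = sum(cmath.exp(2j * math.pi * r * (-1 + (k + 0.5) * h)) for k in range(n))
    return 2 * math.pi * h * total

def sphere_transform_exact(r):
    """Closed form 2*sin(2*pi*r)/r for the same integral (r > 0)."""
    return 2 * math.sin(2 * math.pi * r) / r

for r in (0.5, 1.3, 7.25):
    print(r, sphere_transform_numeric(r).real, sphere_transform_exact(r))
```

Since this function decays like {1/|x|} and no faster, it lies in {L^q({\bf R}^3)} precisely when {q > 3}.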

Strictly speaking, the above problem should be called the extension problem, but it is dual to the original formulation of the restriction problem, which asks to find those exponents {1 \leq q' \leq \infty} for which the Fourier transform of an {L^{q'}({\bf R}^d)} function {g} can be meaningfully restricted to a hypersurface {S}, in the sense that the map {g \mapsto \hat g|_{S}} can be continuously defined from {L^{q'}({\bf R}^d)} to, say, {L^1(S,d\sigma)}. A duality argument shows that the exponents {q'} for which the restriction property holds are the dual exponents to the exponents {q} for which the extension problem holds.

There are several motivations for studying the restriction problem. The problem is connected to the classical question of determining the nature of the convergence of various Fourier summation methods (and specifically, Bochner-Riesz summation); very roughly speaking, if one wishes to perform a partial Fourier transform by restricting the frequencies (possibly using a well-chosen weight) to some region {B} (such as a ball), then one expects this operation to be well behaved if the boundary {\partial B} of this region has good restriction (or extension) properties. More generally, the restriction problem for a surface {S} is connected to the behaviour of Fourier multipliers whose symbols are singular at {S}. The problem is also connected to the analysis of various linear PDE such as the Helmholtz equation, Schrödinger equation, wave equation, and the (linearised) Korteweg-de Vries equation, because solutions to such equations can be expressed via the Fourier transform in the form {fd\sigma} for various surfaces {S} (the sphere, paraboloid, light cone, and cubic for the Helmholtz, Schrödinger, wave, and linearised Korteweg-de Vries equation respectively). A particular family of restriction-type theorems for such surfaces, known as Strichartz estimates, play a foundational role in the nonlinear perturbations of these linear equations (e.g. the nonlinear Schrödinger equation, the nonlinear wave equation, and the Korteweg-de Vries equation). Last, but not least, there is a fundamental connection between the restriction problem and the Kakeya problem, which roughly speaking concerns how tubes that point in different directions can overlap.
Indeed, by superimposing special functions of the type {\widehat{fd\sigma}}, known as wave packets, and which are concentrated on tubes in various directions, one can “encode” the Kakeya problem inside the restriction problem; in particular, the conjectured solution to the restriction problem implies the conjectured solution to the Kakeya problem. Finally, the restriction problem serves as a simplified toy model for studying discrete exponential sums whose coefficients do not have a well controlled phase; this perspective was, for instance, used by Ben Green when he established Roth’s theorem in the primes by Fourier-analytic methods, which was in turn one of the main inspirations for our later work establishing arbitrarily long progressions in the primes, although we ended up using ergodic-theoretic arguments instead of Fourier-analytic ones and so did not directly use restriction theory in that paper.

The estimate (1) is trivial for {q=\infty} and becomes harder for smaller {q}. The geometry, and more precisely the curvature, of the surface {S}, plays a key role: if {S} contains a portion which is completely flat, then it is not difficult to concoct an {f} for which {\widehat{f d\sigma}} fails to decay in the normal direction to this flat portion, and so there are no restriction estimates for any finite {q}. Conversely, if {S} is not infinitely flat at any point, then from the method of stationary phase, the Fourier transform {\widehat{d\sigma}} can be shown to decay at a power rate at infinity, and this together with a standard method known as the {TT^*} argument can be used to give non-trivial restriction estimates for finite {q}. However, these arguments fall somewhat short of obtaining the best possible exponents {q}. For instance, in the case of the sphere {S = S^{d-1} \subset {\bf R}^d}, the Fourier transform {\widehat{d\sigma}(x)} is known to decay at the rate {O(|x|^{-(d-1)/2})} and no better as {|x| \rightarrow \infty}, which shows that the condition {q > \frac{2d}{d-1}} is necessary in order for (1) to hold for this surface. The restriction conjecture for {S^{d-1}} asserts that this necessary condition is also sufficient. However, the {TT^*}-based argument gives only the Tomas-Stein theorem, which in this context gives (1) in the weaker range {q \geq \frac{2(d+1)}{d-1}}. (On the other hand, by the nature of the {TT^*} method, the Tomas-Stein theorem does allow the {L^\infty(S,d\sigma)} norm on the right-hand side to be relaxed to {L^2(S,d\sigma)}, at which point the Tomas-Stein exponent {\frac{2(d+1)}{d-1}} becomes best possible. The fact that the Tomas-Stein theorem has an {L^2} norm on the right-hand side is particularly valuable for applications to PDE, leading in particular to the Strichartz estimates mentioned earlier.)
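To see how far apart the conjectured and known ranges are, one can simply tabulate the two exponents for small dimensions. Here is a minimal Python sketch using exact rational arithmetic; the helper names are hypothetical and chosen only for this illustration.

```python
from fractions import Fraction

def necessary_exponent(d):
    """Threshold 2d/(d-1): (1) can only hold for the sphere S^{d-1} when q exceeds this."""
    return Fraction(2 * d, d - 1)

def tomas_stein_exponent(d):
    """Endpoint 2(d+1)/(d-1): the Tomas-Stein theorem gives (1) for q at or above this."""
    return Fraction(2 * (d + 1), d - 1)

# Print d, the conjectured threshold, and the Tomas-Stein endpoint.
for d in range(2, 6):
    print(d, necessary_exponent(d), tomas_stein_exponent(d))
```

In three dimensions this gives the thresholds {3} and {4} respectively, which is the gap discussed next.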

Over the last two decades, there was a fair amount of work in pushing past the Tomas-Stein barrier. For the sake of concreteness let us work just with the restriction problem for the unit sphere {S^2} in {{\bf R}^3}. Here, the restriction conjecture asserts that (1) holds for all {q > 3}, while the Tomas-Stein theorem gives only {q \geq 4}. By combining a multiscale analysis approach with some new progress on the Kakeya conjecture, Bourgain was able to obtain the first improvement on this range, establishing the restriction conjecture for {q > 4 - \frac{2}{15}}. The methods were steadily refined over the years; until recently, the best result (due to myself) was that the conjecture held for all {q > 3 \frac{1}{3}}, which proceeded by analysing a “bilinear {L^2}” variant of the problem studied previously by Bourgain and by Wolff. This is essentially the limit of that method; the relevant bilinear {L^2} estimate fails for {q < 3 + \frac{1}{3}}. (This estimate was recently established at the endpoint {q=3+\frac{1}{3}} by Jungjin Lee (personal communication), though this does not quite improve the range of exponents in (1) due to a logarithmic inefficiency in converting the bilinear estimate to a linear one.)

On the other hand, the full range {q>3} of exponents in (1) was obtained by Bennett, Carbery, and myself (with an alternate proof later given by Guth), but only under the additional assumption of non-coplanar interactions. In three dimensions, this assumption was enforced by replacing (1) with the weaker trilinear (and localised) variant

\displaystyle  \| \widehat{f_1 d\sigma_1} \widehat{f_2 d\sigma_2} \widehat{f_3 d\sigma_3} \|_{L^{q/3}(B(0,R))} \leq C_{q,d,S_1,S_2,S_3,\epsilon} R^\epsilon \|f_1\|_{L^\infty(S_1,d\sigma_1)} \|f_2\|_{L^\infty(S_2,d\sigma_2)} \|f_3\|_{L^\infty(S_3,d\sigma_3)} \ \ \ \ \ (2)

where {\epsilon>0} and {R \geq 1} are arbitrary, {B(0,R)} is the ball of radius {R} in {{\bf R}^3}, and {S_1,S_2,S_3} are compact portions of {S} whose unit normals {n_1(\cdot),n_2(\cdot),n_3(\cdot)} are never coplanar, thus there is a uniform lower bound

\displaystyle  |n_1(\omega_1) \wedge n_2(\omega_2) \wedge n_3(\omega_3)| \geq c

for some {c>0} and all {\omega_1 \in S_1, \omega_2 \in S_2, \omega_3 \in S_3}. If it were not for this non-coplanarity restriction, (2) would be equivalent to (1) (by setting {S_1=S_2=S_3} and {f_1=f_2=f_3}, with the converse implication coming from Hölder’s inequality; the {R^\epsilon} loss can be removed by a lemma from a paper of mine). At the time we wrote this paper, we tried fairly hard to remove this non-coplanarity restriction in order to recover progress on the original restriction conjecture, but without much success.
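The non-coplanarity condition is easy to test numerically, since in {{\bf R}^3} the magnitude of {n_1 \wedge n_2 \wedge n_3} is just the absolute value of the determinant of the matrix with rows {n_1, n_2, n_3}. A minimal Python sketch (the helper name `triple_wedge` is purely illustrative):

```python
def triple_wedge(n1, n2, n3):
    """|n1 ^ n2 ^ n3| for vectors in R^3: the absolute value of the 3x3 determinant."""
    (a, b, c), (d, e, f), (g, h, i) = n1, n2, n3
    return abs(a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g))

s = 2 ** -0.5
# Three mutually orthogonal normals: as far from coplanar as possible.
print(triple_wedge((1, 0, 0), (0, 1, 0), (0, 0, 1)))
# Three normals on the same great circle (the equator): the wedge product vanishes.
print(triple_wedge((1, 0, 0), (0, 1, 0), (s, s, 0)))
```

The hypothesis in (2) asks that this quantity stay above a fixed {c > 0} for every choice of one normal from each of the three portions.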

A few weeks ago, though, Bourgain and Guth found a new way to use multiscale analysis to “interpolate” between the result of Bennett, Carbery and myself (that has optimal exponents, but requires non-coplanar interactions) and a more classical square function estimate of Córdoba that handles the coplanar case. A direct application of this interpolation method already ties with the previous best known result in three dimensions (i.e. that (1) holds for {q > 3 \frac{1}{3}}). But it also allows for the insertion of additional input, such as the best Kakeya estimate currently known in three dimensions, due to Wolff. This enlarges the range slightly to {q > 3.3}. The method also can extend to variable-coefficient settings, and in some of these cases (where there is so much “compression” going on that no additional Kakeya estimates are available) the estimates are best possible.

As is often the case in this field, there is a lot of technical book-keeping and juggling of parameters in the formal arguments of Bourgain and Guth, but the main ideas and “numerology” can be expressed fairly readily. (In mathematics, numerology refers to the empirically observed relationships between various key exponents and other numerical parameters; in many cases, one can use shortcuts such as dimensional analysis or informal heuristics to compute these exponents long before the formal argument is completely in place.) Below the fold, I would like to record this numerology for the simplest of the Bourgain-Guth arguments, namely a reproof of (1) for {q > 3 \frac{1}{3}}. This is primarily for my own benefit, but may be of interest to other experts in this particular topic. (See also my 2003 lecture notes on the restriction conjecture.)

In order to focus on the ideas in the paper (rather than on the technical details), I will adopt an informal, heuristic approach, for instance by interpreting the uncertainty principle and the pigeonhole principle rather liberally, and by focusing on main terms in a decomposition and ignoring secondary terms. I will also be somewhat vague with regard to asymptotic notation such as {\ll}. Making the arguments rigorous requires a certain amount of standard but tedious effort (and is one of the main reasons why the Bourgain-Guth paper is as long as it is), which I will not focus on here.

— 1. The Córdoba square function estimate —

In two dimensions, the restriction theory is well understood, due to the work of Córdoba, Fefferman, and others. The situation is particularly simple when one looks at bilinear expressions such as

\displaystyle  \| F_1 F_2 \|_{L^2({\bf R}^2)}

where {F_1 :=\widehat{f_1 d\sigma_1}}, {F_2 := \widehat{f_2 d\sigma_2}}, and {d\sigma_1, d\sigma_2} are surface measures on two smooth compact curves {S_1, S_2} that are transverse in the sense that the unit normals of {S_1} are never oriented in the same direction as the unit normals of {S_2}. (A model case to consider here is a pair of arcs of the unit circle, one near {(1,0)} and one near {(0,1)}.) In this case, we can use Plancherel’s theorem to rewrite the above expression as a convolution

\displaystyle  \| f_1 d\sigma_1 * f_2 d\sigma_2 \|_{L^2({\bf R}^2)}.

The transversality of {S_1} and {S_2}, combined with the inverse function theorem, shows that {f_1 d\sigma_1 * f_2 d\sigma_2} is a non-degenerate pushforward of the tensor product {f_1 \otimes f_2}, and so one obtains the basic bilinear restriction estimate

\displaystyle  \| F_1 F_2 \|_{L^2({\bf R}^2)} \ll \|f_1\|_{L^2(S_1,d\sigma_1)} \|f_2\|_{L^2(S_2,d\sigma_2)}.

This estimate (and higher-dimensional analogues thereof) leads to the bilinear {X^{s,b}} estimates which are of fundamental importance in nonlinear dispersive equations (particularly those in which the nonlinearity contains derivatives).

This bilinear estimate can be localised. Suppose one splits {S_1} into arcs {S_{1,\alpha}} of diameter {\sim 1/r} for some {r \gg 1}, which induces a decomposition {F_1 = \sum_\alpha F_{1,\alpha}} of {F_1} into components {F_{1,\alpha} := \widehat{ f_1 1_{S_{1,\alpha}} d\sigma_1}}. Similarly decompose {F_2 = \sum_\beta F_{2,\beta}}. Then we have

\displaystyle  F_1 F_2 = \sum_\alpha \sum_\beta F_{1,\alpha} F_{2,\beta}.

The Fourier transform of {F_{1,\alpha} F_{2,\beta}} is supported in the Minkowski sum {S_{1,\alpha} + S_{2,\beta}}. The transversality of {S_1, S_2} ensures that these sums are basically disjoint as {\alpha,\beta} varies, so by almost orthogonality one has

\displaystyle  \| F_1 F_2 \|_{L^2({\bf R}^2)} \ll (\sum_{\alpha} \sum_\beta \| F_{1,\alpha} F_{2,\beta} \|_{L^2({\bf R}^2)}^2)^{1/2}

or equivalently

\displaystyle  \| F_1 F_2 \|_{L^2({\bf R}^2)} \ll \| (\sum_{\alpha} |F_{1,\alpha}|^2)^{1/2} ( \sum_\beta |F_{2,\beta}|^2)^{1/2} \|_{L^2({\bf R}^2)}.

Actually, this estimate is morally localisable to balls {B(x,r)} of radius {r}; heuristically, we have

\displaystyle  \| F_1 F_2 \|_{L^2(B(x,r))} \ll \| (\sum_{\alpha} |F_{1,\alpha}|^2)^{1/2} ( \sum_\beta |F_{2,\beta}|^2)^{1/2} \|_{L^2(B(x,r))}. \ \ \ \ \ (3)

Informally, this is due to the uncertainty principle: localising in space to scale {r} would cause the arcs {S_{1,\alpha}, S_{2,\beta}} in Fourier space to blur out at the scale {1/r}, but this will not significantly affect the almost disjointness of the Minkowski sums {S_{1,\alpha} + S_{2,\beta}}. (To make this rigorous, one would use a smoother cutoff than {1_{B(x,r)}}, and in particular it is convenient to use a cutoff which is compactly supported in Fourier space rather than physical space; we will not discuss these technicalities further here.)
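The almost disjointness of the Minkowski sums can be checked by hand in the model case of two arcs of the unit circle near {(1,0)} and {(0,1)}. The Python sketch below is only an illustration under assumptions of my choosing: each arc has angular half-width {0.2} (the hypothetical parameter `half_width`), each is cut into pieces of angular width {1/r}, and each sum {S_{1,\alpha} + S_{2,\beta}} is represented by the sum of the two arc midpoints. It then measures how close two distinct such centres can get.

```python
import math
from itertools import combinations

def minkowski_centres(r, half_width=0.2):
    """Centres e(t) + e(pi/2 + s) of the sums S_{1,alpha} + S_{2,beta}.

    S_1 is an arc centred at angle 0, S_2 an arc centred at angle pi/2; both are
    cut into sub-arcs of angular width 1/r, and t, s range over sub-arc midpoints.
    """
    n = max(1, round(2 * half_width * r))            # number of sub-arcs per curve
    offsets = [-half_width + (k + 0.5) / r for k in range(n)]
    return [(math.cos(t) + math.cos(math.pi / 2 + s),
             math.sin(t) + math.sin(math.pi / 2 + s))
            for t in offsets for s in offsets]

def min_separation(points):
    """Smallest distance between two distinct points of the list."""
    return min(math.dist(p, q) for p, q in combinations(points, 2))

r = 50
# Separation measured in units of 1/r; it stays comparable to 1.
print(min_separation(minkowski_centres(r)) * r)
```

In this model the centres stay separated at the scale {1/r} of the arcs themselves, so blurring each arc by {O(1/r)} changes the overlap of the sums by at most a bounded factor. If the two arcs were instead tangent to each other, many distinct pairs {(\alpha,\beta)} would produce nearly the same centre.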

Furthermore, the uncertainty principle suggests to us that {F_{1,\alpha}} and {F_{2,\beta}} are essentially constant on balls {B(x,r)} of radius {r}. As such, the expression inside the norm on the right-hand side of (3) is morally constant on such balls, which allows us to apply Hölder’s inequality and conclude that

\displaystyle  \| F_1 F_2 \|_{L^q(B(x,r))} \ll \| (\sum_{\alpha} |F_{1,\alpha}|^2)^{1/2} ( \sum_\beta |F_{2,\beta}|^2)^{1/2} \|_{L^q(B(x,r))} \ \ \ \ \ (4)

for any {q \leq 2}.

This is a bilinear estimate, but for heuristic purposes it is morally equivalent to the linear estimate

\displaystyle  \| F \|_{L^q(B(x,r))} \ll \| (\sum_{\alpha} |F_{\alpha}|^2)^{1/2} \|_{L^q(B(x,r))} \ \ \ \ \ (5)

for {q \leq 4}, where {F = \widehat{f d\sigma}} and {d\sigma} is the surface measure on a curve {S} which “exhibits curvature” and such that {F} is “dominated by transverse interactions”, {F_\alpha = \widehat{f 1_{S_\alpha} d\sigma}}, and {S} is partitioned into arcs {S_\alpha} of diameter {\sim 1/r}. For the purposes of numerology, we will pretend that (5) is true as stated, though in practice one has to actually work with the bilinearisation (4) instead.

We remark that Córdoba used (a rigorous form of) (5) to establish the restriction conjecture (1) for curves in the plane (such as the unit circle) in the optimal range {q > 4}.

The estimate (5) is a two-dimensional one, but it can be stepped up to a three-dimensional estimate

\displaystyle  \| F \|_{L^q(B(x,r))} \ll \| (\sum_{\alpha} |F_{\alpha}|^2)^{1/2} \|_{L^q(B(x,r))} \ \ \ \ \ (6)

for {q \leq 4}, where {F = \widehat{f d\sigma}}, {d\sigma} is now surface measure on the sphere {S^2 \subset {\bf R}^3}, which one decomposes into caps {S_\alpha} of diameter {O(1/r)}, {f} is supported on the {O(1/r)}-neighbourhood of a great circle in {S^2} with {F_\alpha:= \widehat{ f 1_{S_\alpha} d\sigma}}, and {F} is “dominated by transverse interactions” in a sense that we will not quantify precisely here. This gives efficient control on {F} in terms of square functions, but only in the “transverse coplanar case” in which the frequencies that dominate {F} are both coplanar (in the sense that they all lie roughly on the same great circle) and transverse.

— 2. The Bourgain-Guth argument —

Now we sketch how the Bourgain-Guth argument works to establish (1) for {q > \frac{10}{3}}. Fix {q}; we may assume {q<4}. For each radius {R \geq 1}, let {Q_R} be the best constant in the local restriction estimate

\displaystyle  \| F \|_{L^q(B(x,R))} \leq Q_R \|f\|_{L^\infty(S^2)}

where {F := \widehat{f d\sigma}}. To show (1), one has to show that {Q_R} is bounded uniformly in {R}. Actually, thanks to an “epsilon removal lemma” that I proved some time ago using a variant of the Tomas-Stein argument, it suffices to establish the weaker growth estimate {Q_R \ll R^\epsilon} for every {\epsilon > 0} (with the implied constant allowed to depend on {\epsilon}).

An effective technique for achieving this is an induction on scales argument, bounding {Q_R} efficiently in terms of {Q_{R'}} for various scales {R'} between {1} and {R}. This technique was introduced by Bourgain, using the intermediate scale {R' := \sqrt{R}} (which is a natural scale for the purposes of approximating spherical caps by disks while still respecting the uncertainty principle). A subsequent paper of Wolff adapted this argument by also relying on scales {R' = R^{1-\epsilon}} that were much closer to {R}. The Bourgain-Guth argument is closer in spirit to this latter approach.

Specifically, one sets {K := R^\epsilon} to be a small power of {R}, and divides the sphere {S^2} into largish caps {S_\alpha} of radius {\sim 1/K}, thus splitting {F = \sum_\alpha F_\alpha}. At the same time, we cover {B(x,R)} by smallish balls {B(y,K)} of radius {K}. On each such ball {B(y,K)}, the functions {F_\alpha} are morally constant, as per the uncertainty principle. Of course, the amplitude of the {F_\alpha} on {B(y,K)} depends on {\alpha}; for each small ball {B(y,K)}, only a fraction of the {F_\alpha} will “dominate” the sum {F}. Roughly speaking, we can then sort the balls {B(y,K)} into three classes:

  1. (Non-coplanar case) There exist three dominant caps {S_\alpha} which do not lie within {O(1/K)} of a great circle.
  2. (Non-transverse case) All the dominant caps {S_\alpha} lie in a cap of size {o(1)}.
  3. (Transverse coplanar case) All the dominant caps lie within {O(1/K)} of a great circle, but at least two of them are at distance {\sim 1} from each other.

In the first case, one can control {\| F \|_{L^q(B(y,K))}} by {O(K^{O(1)})} non-coplanar interactions of the form {\| F_1 F_2 F_3 \|_{L^{q/3}(B(y,K))}}, where {F_1, F_2, F_3} are portions of {F} on non-coplanar portions of the sphere {S^2}. One can then use (2) to obtain a contribution of {O( K^{O(1)} ) = O(R^{O(\epsilon)} )} in this case.

It has been known for some time (since a paper of myself, Vargas, and Vega) that the non-transverse case can always be eliminated. Basically, if we group the caps {S_\alpha} into larger caps {\tilde S_\beta} of radius {1/K' = o(1)}, and decompose {F = \sum_\beta \tilde F_\beta} accordingly, then in the non-transverse case we can morally bound

\displaystyle  |F| \ll (\sum_\beta |\tilde F_{\beta}|^q)^{1/q}

and so

\displaystyle  \|F\|_{L^q(B(x,R))} \ll (\sum_\beta \|\tilde F_{\beta}\|_{L^q(B(x,R))}^q)^{1/q}.

However, a standard parabolic rescaling argument (which, strictly speaking, requires one to generalise the sphere to a larger family of similarly curved surfaces, but let us ignore this technical detail) shows that

\displaystyle  \|\tilde F_{\beta}\|_{L^q(B(x,R))} \ll Q_{R/K'} (K')^{4/q-2}

and so (since there are {\sim (K')^2} large caps {\tilde S_\beta})

\displaystyle  \|F\|_{L^q(B(x,R))} \ll (K')^{6/q-2} Q_{R/K'}.

Since {q>3}, the exponent of {K'} here is negative, and so this is a good term for the recurrence.

Finally, we deal with the transverse, coplanar case. Here, the main tool is the Córdoba-type square function estimate (6). Being coplanar, there are only about {O(K)} caps {S_\alpha} that contribute here, so we can pay a factor of {O(K^{1/2 - 1/q})} and convert the square function to an {\ell^q}-function:

\displaystyle  \| F \|_{L^q(B(y,K))} \ll K^{1/2 - 1/q} \| (\sum_{\alpha} |F_{\alpha}|^q)^{1/q} \|_{L^q(B(y,K))}.
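The factor {K^{1/2-1/q}} here is just Hölder's inequality for finite sequences: a vector with {N} entries has {\ell^2} norm at most {N^{1/2-1/q}} times its {\ell^q} norm when {q \geq 2}. A small Python sketch (the names `lp_norm` and `square_to_lq_loss` are hypothetical helpers, and the values of {K} and {q} are arbitrary sample choices) confirms that the loss is attained when all entries are equal and disappears when one entry dominates.

```python
def lp_norm(values, p):
    """The l^p norm of a finite sequence."""
    return sum(abs(v) ** p for v in values) ** (1.0 / p)

def square_to_lq_loss(n_caps, q):
    """Worst-case ratio of the l^2 norm to the l^q norm over n_caps entries (q >= 2)."""
    return n_caps ** (0.5 - 1.0 / q)

K, q = 64, 3.5
flat = [1.0] * K                    # every cap contributes equally: extremal case
spiky = [1.0] + [0.0] * (K - 1)     # a single dominant cap: no loss at all

print(lp_norm(flat, 2) / lp_norm(flat, q), square_to_lq_loss(K, q))
print(lp_norm(spiky, 2) / lp_norm(spiky, q))
```

So the conversion is only lossy on those balls where essentially all {O(K)} caps along the great circle contribute comparably.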

Summing over all such balls, we obtain

\displaystyle  \| F \|_{L^q(B(x,R))} \ll K^{1/2 - 1/q} (\sum_{\alpha} \|F_{\alpha}\|_{L^q(B(x,R))}^q)^{1/q}.

Again, a parabolic rescaling gives

\displaystyle  \|F_{\alpha}\|_{L^q(B(x,R))} \ll K^{4/q-2} Q_{R/K}

so the net contribution to {\|F\|_{L^q(B(x,R))}} here is {O( K^{1/2 - 1/q} K^{6/q-2} Q_{R/K} )}. This leads to the recursion

\displaystyle  Q_R \ll R^{O(\epsilon)} + (K')^{6/q-2} Q_{R/K'} + K^{1/2 - 1/q} K^{6/q-2} Q_{R/K}.

For {q > 10/3}, the exponents of {K'} and {K} are negative, and this allows one to induct on scales and get the required bound {Q_R \ll R^{O(\epsilon)}}.
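The numerology of this recursion is simple enough to verify mechanically. The following Python sketch (with hypothetical function names, and exact rational arithmetic to avoid rounding issues) computes the two exponents appearing in the recursion at a few sample values of {q}.

```python
from fractions import Fraction

def nontransverse_exponent(q):
    """Power of K' in the recursion: 2/q from the ~(K')^2 caps, plus 4/q - 2 from rescaling."""
    return Fraction(2) / q + (Fraction(4) / q - 2)

def coplanar_exponent(q):
    """Power of K in the recursion: 1/2 - 1/q from Hoelder, plus 6/q - 2 as in the other term."""
    return (Fraction(1, 2) - Fraction(1) / q) + (Fraction(6) / q - 2)

# Print q, then the exponents of K' and K at that q.
for q in (Fraction(3), Fraction(33, 10), Fraction(10, 3), Fraction(7, 2)):
    print(q, nontransverse_exponent(q), coplanar_exponent(q))
```

The first exponent is {6/q - 2}, which vanishes at {q=3}, while the second simplifies to {5/q - 3/2}, which vanishes exactly at {q = 10/3}; it is the coplanar term that limits this version of the argument.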

The argument given above is not optimal; the main inefficiency here is the factor of {O(K^{1/2-1/q})} that one pays to convert the square function to the {\ell^q} function. This factor is only truly present if almost every cap {S_\alpha} along a great circle is genuinely contributing to {F}. However, one can use Kakeya estimates to prevent this event from happening too often. Indeed, thanks to the nature of parabolic scaling, the functions {F_\alpha} are not merely essentially constant on balls of radius {K}, but are in fact essentially constant on {K \times K^2} tubes oriented in the normal direction of {S_\alpha}. One can use a Kakeya estimate (such as Wolff’s Kakeya estimate) to then prevent these tubes from congregating too often with too high of a multiplicity; quantifying this, Bourgain and Guth were able to relax the constraint {q > 10/3} to {q > 3.3}. Unfortunately, there are still some residual inefficiencies, and even with the full Kakeya conjecture, the argument given in that paper only gets down to {3 \frac{3}{11}}.