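We work in a Euclidean space $\mathbb{R}^d$. Recall that $L^p(\mathbb{R}^d)$ is the space of $p^{\text{th}}$-power integrable functions $f: \mathbb{R}^d \to \mathbb{C}$, quotiented out by almost everywhere equivalence, with the usual modifications when $p = \infty$. If $f \in L^1(\mathbb{R}^d)$ then the Fourier transform $\hat f: \mathbb{R}^d \to \mathbb{C}$ will be defined in this course by the formula

$$\hat f(\xi) := \int_{\mathbb{R}^d} f(x) e^{-2\pi i x \cdot \xi}\, dx. \qquad (1)$$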

From the dominated convergence theorem we see that $\hat f$ is a continuous function; from the Riemann-Lebesgue lemma we see that it goes to zero at infinity. Thus $\hat f$ lies in the space $C_0(\mathbb{R}^d)$ of continuous functions that go to zero at infinity, which is a subspace of $L^\infty(\mathbb{R}^d)$. Indeed, from the triangle inequality it is obvious that

$$\|\hat f\|_{L^\infty(\mathbb{R}^d)} \leq \|f\|_{L^1(\mathbb{R}^d)}. \qquad (2)$$

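If $f \in L^1(\mathbb{R}^d) \cap L^2(\mathbb{R}^d)$, then Plancherel’s theorem tells us that we have the identity

$$\|\hat f\|_{L^2(\mathbb{R}^d)} = \|f\|_{L^2(\mathbb{R}^d)}. \qquad (3)$$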

Because of this, there is a unique way to extend the Fourier transform $f \mapsto \hat f$ from $L^1(\mathbb{R}^d) \cap L^2(\mathbb{R}^d)$ to $L^2(\mathbb{R}^d)$, in such a way that it becomes a unitary map from $L^2(\mathbb{R}^d)$ to itself. By abuse of notation we continue to denote this extension of the Fourier transform by $f \mapsto \hat f$. Strictly speaking, this extension is no longer defined in a pointwise sense by the formula (1) (indeed, the integral on the RHS ceases to be absolutely integrable once $f$ leaves $L^1(\mathbb{R}^d)$); we will return to the (surprisingly difficult) question of whether pointwise convergence continues to hold (at least in an almost everywhere sense) later in this course, when we discuss Carleson’s theorem. On the other hand, the formula (1) remains valid in the sense of distributions, and in practice most of the identities and inequalities one can show about the Fourier transform of “nice” functions (e.g., functions in $L^1(\mathbb{R}^d) \cap L^2(\mathbb{R}^d)$, or in the Schwartz class $\mathcal{S}(\mathbb{R}^d)$, or test function class $C_c^\infty(\mathbb{R}^d)$) can be extended to functions in “rough” function spaces such as $L^2(\mathbb{R}^d)$ by standard limiting arguments.

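By (2), (3), and the Riesz-Thorin interpolation theorem, we also obtain the Hausdorff-Young inequality

$$\|\hat f\|_{L^{p'}(\mathbb{R}^d)} \leq \|f\|_{L^p(\mathbb{R}^d)} \qquad (4)$$

for all $1 \leq p \leq 2$ and $f \in L^1(\mathbb{R}^d) \cap L^2(\mathbb{R}^d)$, where $2 \leq p' \leq \infty$ is the dual exponent to $p$, defined by the usual formula $\frac{1}{p} + \frac{1}{p'} = 1$. (One can improve this inequality by a constant factor, with the optimal constant worked out by Beckner, but the focus in these notes will not be on optimal constants.) As a consequence, the Fourier transform can also be uniquely extended as a continuous linear map from $L^p(\mathbb{R}^d)$ to $L^{p'}(\mathbb{R}^d)$. (The situation with $p > 2$ is much worse; see below the fold.)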
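
The interpolation behind the Hausdorff-Young inequality can be illustrated in a finite model. The following is a minimal sketch, assuming we replace $\mathbb{R}^d$ by the cyclic group $\mathbb{Z}/N\mathbb{Z}$ with counting measure and the Fourier transform by the unitary discrete Fourier transform; in that setting the two endpoint bounds are $\|Uf\|_{\ell^\infty} \leq N^{-1/2} \|f\|_{\ell^1}$ and $\|Uf\|_{\ell^2} = \|f\|_{\ell^2}$, so Riesz-Thorin gives $\|Uf\|_{\ell^{p'}} \leq N^{-(1/p - 1/2)} \|f\|_{\ell^p}$ for $1 \leq p \leq 2$. The function names are illustrative and not taken from the notes.

```python
import cmath
import math
import random


def dual_exponent(p):
    """Return p' with 1/p + 1/p' = 1 (p' = inf when p = 1)."""
    if p == 1:
        return math.inf
    if p == math.inf:
        return 1.0
    return p / (p - 1)


def unitary_dft(f):
    """Unitary discrete Fourier transform on Z/NZ, computed directly in O(N^2)."""
    n = len(f)
    return [
        sum(f[x] * cmath.exp(-2j * math.pi * x * k / n) for x in range(n)) / math.sqrt(n)
        for k in range(n)
    ]


def lp_norm(f, p):
    """l^p norm with respect to counting measure (p may be math.inf)."""
    if p == math.inf:
        return max(abs(v) for v in f)
    return sum(abs(v) ** p for v in f) ** (1 / p)


def hausdorff_young_ratio(f, p):
    """LHS/RHS in the discrete inequality ||U f||_{p'} <= N^{-(1/p - 1/2)} ||f||_p."""
    n = len(f)
    lhs = lp_norm(unitary_dft(f), dual_exponent(p))
    rhs = n ** (-(1 / p - 0.5)) * lp_norm(f, p)
    return lhs / rhs


# Random complex test vector (an arbitrary choice for illustration).
random.seed(0)
f = [complex(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(16)]
ratios = {p: hausdorff_young_ratio(f, p) for p in (1, 1.25, 1.5, 2)}
```

Each entry of `ratios` should be at most $1$, with equality (up to rounding) at $p = 2$, which is the discrete Plancherel identity.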

The *restriction problem* asks, for a given exponent $1 \leq p \leq 2$ and a subset $S$ of $\mathbb{R}^d$, whether it is possible to meaningfully restrict the Fourier transform $\hat f$ of a function $f \in L^p(\mathbb{R}^d)$ to the set $S$. If the set $S$ has positive Lebesgue measure, then the answer is yes, since $\hat f$ lies in $L^{p'}(\mathbb{R}^d)$ and therefore has a meaningful restriction to $S$ even though functions in $L^{p'}(\mathbb{R}^d)$ are only defined up to sets of measure zero. But what if $S$ has measure zero? If $p = 1$, then $\hat f \in C_0(\mathbb{R}^d)$ is continuous and therefore can be meaningfully restricted to any set $S$. At the other extreme, if $p = 2$ and $f$ is an arbitrary function in $L^2(\mathbb{R}^d)$, then by Plancherel’s theorem, $\hat f$ is also an arbitrary function in $L^2(\mathbb{R}^d)$, and thus has no well-defined restriction to any set $S$ of measure zero.

It was observed by Stein (as reported in the Ph.D. thesis of Charlie Fefferman) that for certain measure zero subsets $S$ of $\mathbb{R}^d$, such as the sphere $S^{d-1} := \{\xi \in \mathbb{R}^d : |\xi| = 1\}$, one can obtain meaningful restrictions of the Fourier transforms of functions $f \in L^p(\mathbb{R}^d)$ for certain $p$ between $1$ and $2$, thus demonstrating that the Fourier transform of such functions retains more structure than a typical element of $L^{p'}(\mathbb{R}^d)$:

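Theorem 1 (Preliminary restriction theorem) If $d \geq 2$ and $1 \leq p < \frac{4d}{3d+1}$, then one has the estimate

$$\|\hat f\|_{L^2(S^{d-1}, d\sigma)} \lesssim_{d,p} \|f\|_{L^p(\mathbb{R}^d)}$$

for all Schwartz functions $f \in \mathcal{S}(\mathbb{R}^d)$, where $d\sigma$ denotes surface measure on the sphere $S^{d-1}$. In particular, the restriction $\hat f|_{S^{d-1}}$ can be meaningfully defined by continuous linear extension to an element of $L^2(S^{d-1}, d\sigma)$.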

*Proof:* Fix . We expand out

From (1) and Fubini’s theorem, the right-hand side may be expanded as

where the inverse Fourier transform of the measure is defined by the formula

In other words, we have the identity

using the Hermitian inner product $\langle f, g \rangle := \int_{\mathbb{R}^d} f(x) \overline{g(x)}\, dx$. Since the sphere has bounded measure, we have from the triangle inequality that

Also, from the method of stationary phase (as covered in the previous class 247A), or Bessel function asymptotics, we have the decay

for any (note that the bound already follows from (6) unless ). We remark that the exponent here can be seen geometrically from the following considerations. For , the phase on the sphere is stationary at the two antipodal points of the sphere, and constant on the tangent hyperplanes to the sphere at these points. The wavelength of this phase is proportional to , so the phase would be approximately stationary on a cap formed by intersecting the sphere with a neighbourhood of the tangent hyperplane to one of the stationary points. As the sphere is tangent to second order at these points, this cap will have diameter in the directions of the -dimensional tangent space, so the cap will have surface measure , which leads to the prediction (7). We combine (6), (7) into the unified estimate

where the “Japanese bracket” is defined as . Since lies in precisely when , we conclude that

Applying Young’s convolution inequality, we conclude (after some arithmetic) that

whenever , and the claim now follows from (5) and Hölder’s inequality.
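
The arithmetic behind this last step can be made explicit. The sketch below takes the kernel bound in (8) to have decay exponent $(d-1)/2$, so that the kernel lies in $L^r(\mathbb{R}^d)$ exactly when $r > \frac{2d}{d-1}$, and it takes Young’s inequality in the form $L^p * L^r \subset L^{p'}$ with $1 + \frac{1}{p'} = \frac{1}{p} + \frac{1}{r}$. Both are read off from the structure of the proof, and the helper names are hypothetical. Under these assumptions the admissible range works out to $1 \leq p < \frac{4d}{3d+1}$.

```python
from fractions import Fraction


def inverse_kernel_exponent(p):
    """1/r for the Young exponent r solving 1 + 1/p' = 1/p + 1/r, i.e. 1/r = 2 - 2/p."""
    p = Fraction(p)
    return 2 - 2 / p


def young_step_applies(d, p):
    """True when the kernel <x>^{-(d-1)/2} lies in L^r(R^d) for the r that Young needs.

    Membership in L^r requires r > 2d/(d-1), i.e. 1/r < (d-1)/(2d).
    """
    return 0 <= inverse_kernel_exponent(p) < Fraction(d - 1, 2 * d)


def theorem1_threshold(d):
    """Critical exponent 4d/(3d+1) obtained by solving 2 - 2/p = (d-1)/(2d)."""
    return Fraction(4 * d, 3 * d + 1)
```

For instance, `theorem1_threshold(3)` is `Fraction(6, 5)`, so in three dimensions this argument covers $1 \leq p < 6/5$.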

Remark 2 By using the Hardy-Littlewood-Sobolev inequality in place of Young’s convolution inequality, one can also establish this result for $p = \frac{4d}{3d+1}$.
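
The decay (7) used in the proof of Theorem 1 can be sanity-checked numerically in the one case where the inverse Fourier transform of surface measure has an elementary closed form, namely $d = 3$. The sketch below assumes the normalisation of (1), so that the inverse transform of $d\sigma$ at $x$ is $\int_{S^2} e^{2\pi i x \cdot \omega}\, d\sigma(\omega)$; passing to polar coordinates about the direction of $x$ reduces this to a one-dimensional integral, which equals $2 \sin(2\pi |x|)/|x|$. The function names are illustrative only.

```python
import cmath
import math


def sphere_measure_transform_3d(r, steps=20000):
    """Approximate the inverse Fourier transform of surface measure on S^2
    at a point of norm r, using the midpoint rule.

    With u = cos(angle to x), the integral over S^2 reduces to
    2*pi * int_{-1}^{1} exp(2*pi*i*r*u) du.
    """
    h = 2.0 / steps
    total = 0j
    for k in range(steps):
        u = -1.0 + (k + 0.5) * h
        total += cmath.exp(2j * math.pi * r * u)
    return 2 * math.pi * total * h


def closed_form_3d(r):
    """Exact value 2*sin(2*pi*r)/r of the same integral for r > 0."""
    return 2 * math.sin(2 * math.pi * r) / r
```

The closed form is bounded in magnitude by $2 |x|^{-1} = 2 |x|^{-(d-1)/2}$ with $d = 3$, matching the exponent in (7); the oscillating numerator is the reason one cannot do better.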

Motivated by this result, given any Radon measure $\mu$ on $\mathbb{R}^d$ and any exponents $1 \leq p, q \leq \infty$, we use $R_\mu(p \to q)$ to denote the claim that the *restriction estimate*

$$\|\hat f\|_{L^q(\mathbb{R}^d, \mu)} \lesssim_{d,p,q,\mu} \|f\|_{L^p(\mathbb{R}^d)} \qquad (9)$$

holds for all Schwartz functions $f$; if $S$ is a $k$-dimensional submanifold of $\mathbb{R}^d$ (possibly with boundary), we write $R_S(p \to q)$ for $R_\mu(p \to q)$ where $\mu$ is the $k$-dimensional surface measure on $S$. Thus, for instance, we trivially always have $R_S(1 \to \infty)$, while Theorem 1 asserts that $R_{S^{d-1}}(p \to 2)$ holds whenever $1 \leq p < \frac{4d}{3d+1}$. We will not give a comprehensive survey of restriction theory in these notes, but instead focus on some model results that showcase some of the basic techniques in the field. (I have a more detailed survey on this topic from 2003, but it is somewhat out of date.)

** — 1. Necessary conditions — **

It is relatively easy to find necessary conditions for a restriction estimate $R_S(p \to q)$ to hold, as one simply needs to test the estimate (9) against a suitable family of examples. We begin with the simplest case $S = \mathbb{R}^d$. The Hausdorff-Young inequality (4) tells us that we have the restriction estimate $R_{\mathbb{R}^d}(p \to p')$ whenever $1 \leq p \leq 2$. These are the only restriction estimates available:

Proposition 3 (Restriction to $\mathbb{R}^d$) Suppose that $1 \leq p, q \leq \infty$ are such that $R_{\mathbb{R}^d}(p \to q)$ holds. Then $q = p'$ and $1 \leq p \leq 2$.

We first establish the necessity of the duality condition $q = p'$. This is easily shown, but we will demonstrate it in three slightly different ways in order to illustrate different perspectives. The first perspective is from scale invariance. Suppose that the estimate $R_{\mathbb{R}^d}(p \to q)$ holds, thus one has

$$\|\hat f\|_{L^q(\mathbb{R}^d)} \lesssim_{d,p,q} \|f\|_{L^p(\mathbb{R}^d)} \qquad (10)$$

for all Schwartz functions . For any scaling factor , we define the scaled version of by the formula

Applying (10) with replaced by , we then have

From change of variables, we have

and from the definition of Fourier transform and further change of variables we have

so that

combining all these estimates and rearranging, we conclude that

If is non-zero, then by sending either to zero or infinity we conclude that for all , which is absurd. Thus we must have the necessary condition , or equivalently that .
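
The scaling argument can be checked on an explicit example. The sketch below assumes the rescaling $f_\lambda(x) := f(x/\lambda)$ (the notes’ own definition may instead use $\lambda x$, which only swaps the roles of $\lambda \to 0$ and $\lambda \to \infty$) and takes $f(x) = e^{-\pi |x|^2}$, which is its own Fourier transform under the convention (1). Then $\hat f_\lambda(\xi) = \lambda^d e^{-\pi |\lambda \xi|^2}$ and every norm has a closed form, so the ratio $\|\hat f_\lambda\|_{L^q} / \|f_\lambda\|_{L^p}$ is a constant multiple of $\lambda^{d - d/q - d/p}$. The function names are illustrative.

```python
import math


def gaussian_lp_norm(scale, p, d):
    """L^p(R^d) norm of x -> exp(-pi |x/scale|^2), namely (scale^d * p^(-d/2))^(1/p)."""
    return (scale ** d * p ** (-d / 2)) ** (1 / p)


def restriction_ratio(lam, p, q, d):
    """||hat f_lam||_{L^q} / ||f_lam||_{L^p} for f_lam(x) = exp(-pi |x/lam|^2).

    Uses hat f_lam(xi) = lam^d * exp(-pi |lam xi|^2), a Gaussian at scale 1/lam.
    """
    fourier_norm = lam ** d * gaussian_lp_norm(1 / lam, q, d)
    return fourier_norm / gaussian_lp_norm(lam, p, d)


# Dual exponents (p, q) = (3/2, 3): the ratio does not depend on lam.
dual_small = restriction_ratio(0.1, 1.5, 3.0, 2)
dual_large = restriction_ratio(10.0, 1.5, 3.0, 2)

# Non-dual exponents (p, q) = (3/2, 2): the ratio blows up as lam -> 0.
non_dual = [restriction_ratio(lam, 1.5, 2.0, 2) for lam in (0.01, 1.0, 100.0)]
```

When $q = p'$ the ratio is independent of $\lambda$; otherwise it is unbounded in one of the two limits, so no estimate of the form (10) with a constant independent of $f$ can hold.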

We now establish the same necessary condition from the perspective of dimensional analysis, which one can view as an abstraction of scale invariance arguments. We give the spatial variable a unit of length. It is not so important what units we assign to the range of the function (it will cancel out of both sides), but let us make it dimensionless for sake of discussion. Then the norm

will have the units of , because integration against -dimensional Lebesgue measure will have the units of (note this conclusion can also be justified in the limiting case ). For similar reasons, the Fourier transform

will have the units of ; also, the frequency variable must have the units of in order to make the exponent appearing in the exponential dimensionless. As such, the norm

has units . In order for the estimate (10) to be dimensionally consistent, we must therefore have , or equivalently that .

Finally, we establish the necessary condition once again using the example of a rescaled bump function, which is basically the same as the first approach but with replaced by a bump function. We will argue at a slightly heuristic level, but it is not difficult to make the arguments below rigorous and we leave this as an exercise to the reader. Given a length scale , let be a bump function adapted to the ball of radius around the origin, thus where is some fixed test function supported on . We refer to this as a bump function *adapted* to ; more generally, given an ellipsoid (or other convex region, such as a cube, tube, or disk) , we define a bump function adapted to to be a function of the form , where is an affine map from (or other fixed convex region) to and is a bump function with all derivatives uniformly bounded. As long as is non-zero, the norm is comparable to (up to constant factors that can depend on but are independent of ). The uncertainty principle then predicts that the Fourier transform will be concentrated in the dual ball , and within this ball (or perhaps a slightly smaller version of this ball) would be expected to be of size comparable to (the phase does not vary enough to cause significant cancellation). From this we expect to be comparable in size to . If (10) held, we would then have

for all , which is only possible if , or equivalently .

Now we turn to the other necessary condition . Here one does not use scaling considerations; instead, it is more convenient to work with randomised examples. A useful tool in this regard is Khintchine’s inequality, which encodes the *square root cancellation* heuristic that a sum of numbers or functions with randomised signs (or phases) should have magnitude roughly comparable to the *square function* .

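Lemma 4 (Khintchine’s inequality) Let $0 < p < \infty$, and let $\epsilon_1, \dots, \epsilon_N \in \{-1, +1\}$ be independent random variables that each take the values $-1, +1$ with an equal probability of $1/2$.

- (i) For any complex numbers $z_1, \dots, z_N$, one has $$\Bigl(\mathbb{E} \Bigl|\sum_{n=1}^N \epsilon_n z_n\Bigr|^p\Bigr)^{1/p} \sim_p \Bigl(\sum_{n=1}^N |z_n|^2\Bigr)^{1/2}.$$
- (ii) For any functions $f_1, \dots, f_N \in L^p(X)$ on a measure space $X = (X, \mathcal{B}, \mu)$, one has $$\Bigl(\mathbb{E} \Bigl\|\sum_{n=1}^N \epsilon_n f_n\Bigr\|_{L^p(X)}^p\Bigr)^{1/p} \sim_p \Bigl\|\Bigl(\sum_{n=1}^N |f_n|^2\Bigr)^{1/2}\Bigr\|_{L^p(X)}.$$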

*Proof:* We begin with (i). By taking real and imaginary parts we may assume without loss of generality that the are all real, then by normalisation it suffices to show the upper bound

for all , whenever are real numbers with .

When the upper and lower bounds follow by direct calculation (in fact we have equality in this case). By Hölder’s inequality, this yields the upper bound for and the lower bound for . To handle the remaining cases of (11) it is convenient to use the exponential moment method. Let be an arbitrary threshold, and consider the upper tail probability

For any , we see from Markov’s inequality that this quantity is less than or equal to

The expectation here can be computed to equal

By comparing power series we see that for any real , hence by the normalisation we see that

If we set we conclude that

since the random variable is symmetric around the origin, we conclude that

From the Fubini-Tonelli theorem we have

and this then gives the upper bound (11) for any . The claim (12) for then follows from this, Hölder's inequality (applied in reverse), and the fact that (12) was already established for .

To prove (ii), observe from (i) that for every one has

integrating in and applying the Fubini-Tonelli theorem, we obtain the claim.
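
For a small number of terms, the moments in Khintchine’s inequality can be computed exactly by enumerating all sign patterns, which makes the square root cancellation heuristic concrete. This is only an illustrative sketch with hypothetical function names; it uses the elementary identities $\mathbb{E}|\sum_n \epsilon_n a_n|^2 = \sum_n a_n^2$ and, for real $a_n$, $\mathbb{E}|\sum_n \epsilon_n a_n|^4 = 3 (\sum_n a_n^2)^2 - 2 \sum_n a_n^4$ as checks.

```python
import itertools
import math


def sign_moment(a, p):
    """Exact E|sum_n eps_n a_n|^p, averaging over all 2^N equally likely sign patterns."""
    n = len(a)
    total = 0.0
    for signs in itertools.product((-1, 1), repeat=n):
        total += abs(sum(s * x for s, x in zip(signs, a))) ** p
    return total / 2 ** n


def square_function(a):
    """(sum_n |a_n|^2)^(1/2)."""
    return math.sqrt(sum(abs(x) ** 2 for x in a))


def khintchine_ratio(a, p):
    """(E|sum_n eps_n a_n|^p)^(1/p) divided by the square function."""
    return sign_moment(a, p) ** (1 / p) / square_function(a)


# Arbitrary real coefficients chosen for illustration.
a = [1.0, 2.0, -0.5, 3.0, 1.5, 0.25]
ratio_by_p = {p: khintchine_ratio(a, p) for p in (1, 2, 4)}
```

The ratio equals $1$ at $p = 2$, is at most $1$ for $p < 2$ and at least $1$ for $p > 2$ (by Hölder’s inequality), and by the fourth moment identity is at most $3^{1/4}$ at $p = 4$.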

Exercise 5

- (i) How does the implied constant in (11) depend on in the limit if one analyses the above argument more carefully?
- (ii) Establish (11) for the case of even integers by direct expansion of the left-hand side and some combinatorial calculation. How does the dependence of the implied constant in (11) (that is to say, the supremum of over all and with ) on compare with (i) if one does this?
- (iii) Establish a matching lower bound (up to absolute constants) for the implied constant in (11).

Now we show that the estimate (10) fails in the large $p$ regime $p > 2$, even when $q = p'$. Here, the idea is to have $f$ “spread out” in physical space (in order to keep the $L^p$ norm low), and also to have $\hat f$ somewhat spread out in frequency space (in order to prevent the $L^q$ norm from dropping too much). We use the probabilistic method (constructing a random counterexample rather than a deterministic one) in order to exploit Khintchine’s inequality. Let $\phi$ be a non-zero bump function supported on (say) the unit ball $B(0,1)$, and consider a (random) function of the form

where are the random signs from Lemma 4, and are sufficiently separated points in (all we need for this construction is that for all ); thus is the random sum of bump functions adapted to disjoint balls . In particular, the summands here have disjoint supports and

(note that the signs have no effect on the magnitude of ). If (10) were true, this would give the (deterministic) bound

On the other hand, the Fourier transform of is

so by Khintchine’s inequality

The phases can be deleted, and is not identically zero, so one arrives at

Comparing this with (13) and sending , we obtain a contradiction if . This completes the proof of Proposition 3.

Exercise 6 Find a deterministic construction that explains why the estimate (10) fails when $p > 2$ and $q = p'$.

Exercise 7 (Marcinkiewicz-Zygmund theorem) Let be measure spaces, let , and suppose is a bounded linear operator with operator norm . Show that

for any at most countable index set and any functions . Informally, this result asserts that if a linear operator is bounded from scalar-valued functions to scalar-valued functions, then it is automatically bounded from *vector-valued* functions to vector-valued functions. (By using gaussians instead of random sums, one can even obtain this bound with the implied constant equal to .)

Exercise 8 Let be a bounded open subset of , and let . Show that holds if and only if and . (Note: in order to use either the scale invariance argument or the dimensional analysis argument to get the condition , one should replace with something like a ball of some radius , and allow the estimates to depend on .)

Now we study the restriction problem for two model hypersurfaces:

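- (i) The *paraboloid* $\Sigma := \{(\xi', \xi_d) \in \mathbb{R}^{d-1} \times \mathbb{R} : \xi_d = |\xi'|^2\}$, equipped with the measure $\mu_\Sigma$ induced from Lebesgue measure $d\xi'$ in the horizontal variables $\xi'$, thus $$\int_\Sigma f\, d\mu_\Sigma := \int_{\mathbb{R}^{d-1}} f(\xi', |\xi'|^2)\, d\xi'$$ (note this is *not* the same as surface measure on $\Sigma$, although it is mutually absolutely continuous with this measure).
- (ii) The *sphere* $S^{d-1} := \{\xi \in \mathbb{R}^d : |\xi| = 1\}$.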

These two hypersurfaces differ from each other in one important respect: the paraboloid is non-compact, while the sphere is compact. Aside from that, though, they behave very similarly; they are both quadric hypersurfaces with everywhere positive curvature. Furthermore, they are also very highly symmetric surfaces. The sphere of course enjoys the rotation symmetry under the orthogonal group . At first glance the paraboloid only enjoys symmetry under the smaller orthogonal group that rotates the variable (leaving the final coordinate unchanged), but it also has a family of Galilean symmetries

for any , which preserves (and also can be seen to preserve the measure , since the horizontal variable is simply translated by ). Furthermore, the paraboloid also enjoys a *parabolic scaling symmetry*

for any , for which the sphere does not have an exact analogue (though morally Taylor expansion suggests that the sphere “behaves like” the paraboloid at small scales, or equivalently that certain parabolically rescaled copies of the sphere behave like the paraboloid in the limit). The following exercise exploits these symmetries:

Exercise 9

- (i) Let be a non-empty open subset of , and let . Show that holds if and only if holds.
- (ii) Let be bounded non-empty open subsets of (endowed with the restriction of to ), and let . Show that holds if and only if holds.
- (iii) Suppose that are such that holds. Show that . (Hint: Any of the three methods of scale invariance, dimensional analysis, or rescaled bump functions will work here.)
- (iv) Suppose that are such that holds. Show that . (Hint: The same three methods still work, but some will be easier to pull off than others.)
- (v) Suppose that are such that holds for some bounded non-empty open subset of , and that . Conclude that holds.
- (vi) Suppose that are such that holds, and that . Conclude that holds.

Exercise 10 (No non-trivial restriction estimates for flat hypersurfaces) Let be an open non-empty subset of a hyperplane in , and let . Show that can only hold when .

To obtain a further necessary condition on the restriction estimates or holding, it is convenient to dualise the restriction estimate to an *extension estimate*.

Exercise 11 (Duality) Let be a Radon measure on , let , and let . Show that the following claims are equivalent:

This gives a further necessary condition as follows. Suppose for instance that holds; then by the above exercise, one has

for all . In particular, . However, we have the following stationary phase computation:

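Exercise 12 Establish the asymptotic

$$(d\sigma)^\vee(x) = c_+ \frac{e^{2\pi i |x|}}{|x|^{(d-1)/2}} + c_- \frac{e^{-2\pi i |x|}}{|x|^{(d-1)/2}} + O_d\bigl(|x|^{-(d+1)/2}\bigr)$$

for all $|x| \geq 1$ and some non-zero constants $c_+, c_-$ depending only on $d$. Conclude that the estimate $R_{S^{d-1}}(p \to q)$ can only hold if $p' > \frac{2d}{d-1}$.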

Exercise 13 Show that the estimate can only hold if . (Hint: one can explicitly test (15) when is a gaussian; the fact that gaussians are not, strictly speaking, compactly supported can be dealt with by a limiting argument.)

It is conjectured that the necessary conditions claimed above are sufficient. Namely, we have

Conjecture 14 (Restriction conjecture for the sphere) Let $d \geq 2$. Then we have $R_{S^{d-1}}(p \to q)$ whenever $p' > \frac{2d}{d-1}$ and $q \leq \frac{d-1}{d+1} p'$.

Conjecture 15 (Restriction conjecture for the paraboloid) Let $d \geq 2$. Then we have $R_\Sigma(p \to q)$ whenever $p' > \frac{2d}{d-1}$ and $q = \frac{d-1}{d+1} p'$.

It is also conjectured that Conjecture 14 holds if one replaces the sphere $S^{d-1}$ by any bounded open non-empty subset of the paraboloid $\Sigma$.
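
The conjectured ranges are easy to tabulate with exact rational arithmetic. The following sketch encodes the two necessary conditions in the form $p' > \frac{2d}{d-1}$ (from the decay of the inverse Fourier transform of surface measure) and $q \leq \frac{d-1}{d+1} p'$ (with equality required for the paraboloid, by scaling). That reading of the exponents is an assumption drawn from the surrounding exercises, and the function names are hypothetical.

```python
from fractions import Fraction


def dual(p):
    """Dual exponent p' = p/(p-1) as a Fraction (requires p > 1)."""
    p = Fraction(p)
    return p / (p - 1)


def sphere_conjecture_allows(d, p, q):
    """Conjectured range for the sphere in R^d, for finite exponents p, q >= 1."""
    p, q = Fraction(p), Fraction(q)
    if p < 1 or q < 1:
        return False
    if p == 1:
        return True  # p' is infinite, so both conditions hold for every finite q
    pd = dual(p)
    return pd > Fraction(2 * d, d - 1) and q <= Fraction(d - 1, d + 1) * pd


def paraboloid_conjecture_allows(d, p, q):
    """Conjectured range for the full paraboloid in R^d (finite exponents only).

    The trivial case p = 1, q = infinity is not represented by this helper.
    """
    p, q = Fraction(p), Fraction(q)
    if p <= 1 or q < 1:
        return False
    pd = dual(p)
    return pd > Fraction(2 * d, d - 1) and q == Fraction(d - 1, d + 1) * pd


def l2_endpoint(d):
    """Largest p for which q = 2 satisfies q <= (d-1)/(d+1) * p', namely 2(d+1)/(d+3)."""
    return Fraction(2 * (d + 1), d + 3)
```

For example, in $d = 3$ the pair $(p, q) = (4/3, 2)$ is admissible for both surfaces, while $(3/2, 1)$ is excluded because $p' = 3$ sits exactly on the boundary $\frac{2d}{d-1} = 3$.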

The current status of these conjectures is that they are fully solved in the two-dimensional case (as we will see later in these notes) and partially resolved in higher dimensions. For instance, in $\mathbb{R}^3$ one of the strongest results currently is due to Hong Wang, who established $R_S(p \to 1)$ for $S$ a bounded open non-empty subset of $\Sigma$ when $p' > 3 + \frac{3}{13}$ (conjecturally this should hold for all $p' > 3$); for higher dimensions see this paper of Hickman and Rogers for the most recent results.

We close this section with an important connection between the restriction conjecture and another conjecture known as the *Kakeya maximal function conjecture*. To describe this connection, we first give an alternate derivation of the necessary condition in Conjecture 14, using a basic example known as the *Knapp example* (as described for instance in this article of Strichartz).

Let be a spherical cap in of some small radius , thus for some . Let be a bump function adapted to this cap, say where is a fixed non-zero bump function supported on . We refer to as a *Knapp example* at frequency (and spatially centred at the origin). The cap (or any slightly smaller version of ) has surface measure , thus

for any . We then apply the extension operator to :

The integrand is only non-vanishing if ; since also from the cosine rule we have

we also have . Thus, if lies in the tube

for a sufficiently small absolute constant , then the phase has real part . If we set to be non-negative and not identically zero, and note that , we conclude that

for . Since the tube has dimensions , its volume is

and thus

for any . By Exercise 11, we thus see that if the estimate holds, then

for all small ; sending to zero, we conclude that

or equivalently that , recovering the second necessary condition in Conjecture 14.
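
The exponent bookkeeping in the Knapp example can be laid out as follows. This is a sketch under the assumptions that the cap has radius $\delta$, that the tube $T$ has dimensions $\delta^{-2} \times \delta^{-1} \times \dots \times \delta^{-1}$ (long side in the direction normal to the cap), and that the extension estimate from Exercise 11 maps $L^{q'}(S^{d-1})$ to $L^{p'}(\mathbb{R}^d)$; these are inferred from the surrounding argument, and $g$ denotes the Knapp example.

```latex
\begin{align*}
\|g\|_{L^{q'}(S^{d-1})} &\sim \delta^{(d-1)/q'}
  && \text{(cap of surface measure } \sim \delta^{d-1}\text{)} \\
|(g\,d\sigma)^\vee(x)| &\gtrsim \delta^{d-1}
  && \text{for } x \in T \\
|T| &\sim \delta^{-2}\,(\delta^{-1})^{d-1} = \delta^{-(d+1)} \\
\|(g\,d\sigma)^\vee\|_{L^{p'}(\mathbb{R}^d)} &\gtrsim \delta^{d-1}\,\delta^{-(d+1)/p'} \\
\delta^{\,d-1-\frac{d+1}{p'}} &\lesssim \delta^{\frac{d-1}{q'}}
  && \text{(extension estimate)} \\
d-1-\frac{d+1}{p'} &\geq \frac{d-1}{q'}
  && \text{(let } \delta \to 0\text{)} \\
\frac{d+1}{p'} &\leq (d-1)\Bigl(1-\frac{1}{q'}\Bigr) = \frac{d-1}{q}
  \iff q \leq \frac{d-1}{d+1}\,p'
\end{align*}
```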

Exercise 16

- (i) By considering a random superposition of Knapp examples located at different frequencies , and using Khintchine’s inequality, recover the first necessary condition of Conjecture 14.
- (ii) Suppose that holds for some . Establish the estimate
whenever is a collection of tubes – that is to say, sets of the form

whose directions are -separated (thus for any two distinct ), and the are arbitrary real numbers.

- (iii) Establish claims (i) and (ii) with the sphere replaced by a bounded non-empty open subset of the paraboloid .

Using this exercise, we can show that restriction estimates imply assertions about the dimension of Kakeya sets (also known as *Besicovitch sets*).

Exercise 17 (Restriction implies Kakeya) Assume that either Conjecture 14 or Conjecture 15 holds. Define a *Kakeya set* to be a compact subset $E$ of $\mathbb{R}^d$ that contains a unit line segment in every direction (thus for every $\omega \in S^{d-1}$, there exists a line segment $\{x_\omega + t\omega : t \in [0,1]\}$ for some $x_\omega \in \mathbb{R}^d$ that is contained in $E$). Show that for any $0 < \delta < 1$, the $\delta$-neighbourhood of $E$ has Lebesgue measure $\gtrsim_{d,\varepsilon} \delta^{\varepsilon}$ for any $\varepsilon > 0$. (This is equivalent to the assertion that $E$ has Minkowski dimension $d$.) It is also possible to show that the restriction conjecture implies that all Kakeya sets have Hausdorff dimension $d$, but this is trickier; see this paper of Bourgain. (This can be viewed as a challenge problem for those students who are familiar with the concept of Hausdorff dimension.)

The *Kakeya conjecture* asserts that all Kakeya sets in $\mathbb{R}^d$ have Minkowski and Hausdorff dimension equal to $d$. As with the restriction conjecture, this is known to be true in two dimensions (as was first proven by Davies), but only partial results are known in higher dimensions. For instance, in three dimensions, Kakeya sets are known to have (upper) Minkowski dimension at least $\frac{5}{2} + \epsilon$ for some absolute constant $\epsilon > 0$ (a result of Katz, Laba, and myself), and also more recently for Hausdorff dimension (a result of Katz and Zahl). For the latest results in higher dimensions, see these papers of Hickman-Rogers-Zhang and Zahl.

Much of the modern progress on the restriction conjecture has come from trying to reverse the implication in Exercise 17, and use known partial results towards the Kakeya conjecture (or its relatives) to obtain restriction estimates. We will not give the latest arguments in this direction here, but give an illustrative example (in the multilinear setting) at the end of this set of notes.

** — 2. $L^2$ theory — **

One of the best understood cases of the restriction conjecture is the $q = 2$ case. Note that Conjecture 14 asserts that $R_{S^{d-1}}(p \to 2)$ holds whenever $p \leq \frac{2(d+1)}{d+3}$, and Conjecture 15 asserts that $R_\Sigma(p \to 2)$ holds when $p = \frac{2(d+1)}{d+3}$. Theorem 1 already gave a partial result in this direction. Now we establish the full range of the $L^2$ restriction conjecture, due to Tomas and Stein:

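Theorem 18 (Tomas-Stein restriction theorem) Let $d \geq 2$. Then $R_{S^{d-1}}(p \to 2)$ holds for all $1 \leq p \leq \frac{2(d+1)}{d+3}$, and $R_\Sigma(p \to 2)$ holds for $p = \frac{2(d+1)}{d+3}$.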

The exponent $\frac{2(d+1)}{d+3}$ is sometimes referred to in the literature as the *Tomas-Stein exponent*; though the dual exponent $\frac{2(d+1)}{d-1}$ is also referred to by this name.

We first establish the restriction estimate in the non-endpoint case by an interpolation method. Fix . By the identity (5) and Hölder’s inequality, it suffices to establish the inequality

We use the standard technique of *dyadic decomposition*. Let be a bump function supported on that equals on . Then one has the telescoping series

where is a bump function supported on the annulus . We can then decompose the convolution kernel as

so by the triangle inequality it will suffice to establish the bounds

for all and some constant depending only on .

The function is smooth and compactly supported, so (18) is immediate from Young’s inequality (note that when ). So it remains to prove (19). Firstly, we recall from (7) (or (8)) that the kernel is of magnitude . Thus by Young’s inequality we have

We now complement this with an estimate. The Fourier transform of can be computed as

for any , and hence by the triangle inequality and the rapid decay of the Schwartz function we have

By dyadic decomposition we then have

From elementary geometry we have

(basically because the sphere is -dimensional), and then on summing the geometric series we conclude that

From Plancherel’s theorem we conclude that

Applying either the Marcinkiewicz interpolation theorem or the Riesz-Thorin interpolation theorem to (20) and (21), we conclude (after some arithmetic) the required estimate (19) with

which is indeed positive when .
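
The “some arithmetic” in the interpolation step can be spelled out. The sketch below assumes that (20) is an $L^1 \to L^\infty$ bound of size $2^{-(d-1)j/2}$ and that (21) is an $L^2 \to L^2$ bound of size $2^{j}$, as suggested by the kernel size and the geometric series above. Interpolating with $\frac{1}{p} = \frac{1-\theta}{1} + \frac{\theta}{2}$ gives an $L^p \to L^{p'}$ bound of size $2^{-\varepsilon j}$, where $\varepsilon = (1-\theta)\frac{d-1}{2} - \theta$. The function name is hypothetical.

```python
from fractions import Fraction


def dyadic_gain(d, p):
    """Exponent eps in the interpolated L^p -> L^{p'} bound 2^{-eps * j}.

    theta solves 1/p = (1 - theta)/1 + theta/2, so theta = 2 - 2/p = 2/p'.
    The L^1 -> L^infty bound contributes (1 - theta) * (d - 1)/2 of decay,
    and the L^2 -> L^2 bound costs theta * 1 of growth.
    """
    p = Fraction(p)
    theta = 2 - 2 / p
    return (1 - theta) * Fraction(d - 1, 2) - theta
```

Simplifying, $\varepsilon = \frac{d+1}{p} - \frac{d+3}{2}$, which is positive exactly when $p < \frac{2(d+1)}{d+3}$ and vanishes at the Tomas-Stein exponent; this is why the dyadic sum converges in the non-endpoint case but not at the endpoint.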

At the endpoint the above argument does not quite work; we obtain a decent bound for each dyadic component of , but then we have trouble getting a good bound for the sum. The original argument of Stein got around this problem by using complex interpolation instead of dyadic decomposition, embedding in an analytic family of functions. We present here another approach, which is now popular in PDE applications; the basic inputs (namely, an to estimate similar to (20), an estimate similar to (21), and an interpolation) are the same, but we employ the additional tool of Hardy-Littlewood-Sobolev fractional integration to recover the endpoint.

We turn to the details. Set . We write as and parameterise the frequency variable by with , thus for instance . (One can think of as a “time” variable that we will give a privileged role in the physical domain .) We split the spatial variable similarly. Let be a non-negative bump function localised to a small neighbourhood of the north pole of . By Exercise 9 it will suffice to show that

for . Squaring as before, it suffices to show that

For each , let denote the function , and let denote the function

Then we have

where on the right-hand side the convolution is now over rather than . By the Fubini-Tonelli theorem and Minkowski’s inequality, we thus have

From Exercise 12 we have the bounds

leading to the *dispersive estimate*

for any (the claim is vacuous when vanishes). On the other hand, the -dimensional Fourier transform of can be computed as

which is bounded by , hence by Plancherel we have the *energy estimate*

Interpolating, we conclude after some arithmetic that

Applying the one-dimensional Hardy-Littlewood-Sobolev inequality we conclude (after some more arithmetic) that

and the claim follows.
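
The two pieces of arithmetic in the endpoint argument fit together as follows. This is a sketch under the assumptions that the dispersive estimate is an $L^1 \to L^\infty$ bound in the $d-1$ spatial variables with decay $|t|^{-(d-1)/2}$ in the time separation $t$, that the energy estimate is an $L^2 \to L^2$ bound of size $O(1)$, and that $p = \frac{2(d+1)}{d+3}$; the variable names are chosen here for illustration.

```latex
\begin{align*}
\text{dispersive:}\quad
  & L^1_x \to L^\infty_x \text{ with bound } |t|^{-\frac{d-1}{2}}, \\
\text{energy:}\quad
  & L^2_x \to L^2_x \text{ with bound } O(1), \\
\text{interpolation } \bigl(\tfrac{1}{p'} = \tfrac{\theta}{2}\bigr):\quad
  & L^{p}_x \to L^{p'}_x \text{ with bound }
    |t|^{-\frac{d-1}{2}(1-\theta)} = |t|^{-\frac{d-1}{2}\left(1 - \frac{2}{p'}\right)}, \\
p' = \tfrac{2(d+1)}{d-1}:\quad
  & \tfrac{d-1}{2}\Bigl(1 - \tfrac{d-1}{d+1}\Bigr) = \tfrac{d-1}{d+1} =: \alpha \in (0,1), \\
\text{Hardy-Littlewood-Sobolev on } \mathbb{R}:\quad
  & \tfrac{1}{p} - \tfrac{1}{p'} = 1 - \tfrac{2}{p'} = \tfrac{2}{d+1} = 1 - \alpha .
\end{align*}
```

The last line is exactly the condition under which convolution with $|t|^{-\alpha}$ maps $L^p(\mathbb{R})$ to $L^{p'}(\mathbb{R})$, which is why the endpoint exponent is the one at which fractional integration in the time variable closes the argument.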

This latter argument can be adapted for the paraboloid, which in turn leads to some very useful estimates for the Schrödinger equation:

Exercise 19 (Strichartz estimates for the Schrödinger equation) Let .

- (i) By modifying the above arguments, establish the restriction estimate .
- (ii) Let , and let denote the function
(This is the solution to the Schrödinger equation with initial data .) Establish the *Strichartz estimate*
- (iii) More generally, with the hypotheses as in (ii), establish the bound
whenever are exponents obeying the scaling condition . (The endpoint case of this estimate is also available when , using a more sophisticated interpolation argument; see this paper of Keel and myself.)

The Strichartz estimates in the above exercise were for the linear Schrödinger equation, but Strichartz estimates can also be established by the same method (namely, interpolating between energy and dispersive estimates) for other linear dispersive equations, such as the linear wave equation . Such Strichartz estimates are a fundamental tool in the modern analysis of *nonlinear* dispersive equations, as they often allow one to view such nonlinear equations as perturbations of linear ones. The topic is too vast to survey in these notes, but see for instance my monograph on this topic.

** — 3. Bilinear estimates — **

A restriction estimate such as

are *linear* estimates, asserting the boundedness of either a restriction operator (where denotes the support of ) or an extension operator . In the last ten or twenty years, it has been realised that one should also consider *bilinear* or *multilinear* versions of the extension estimate, both as stepping stones towards making progress on the linear estimate, and also as being of independent interest and application.

In this section we will show how the consideration of bilinear extension estimates can be used to resolve the restriction conjecture for the circle (i.e., the $d = 2$ case of Conjecture 14):

Theorem 20 (Restriction conjecture for $S^1$) One has $R_{S^1}(p \to q)$ whenever $p' > 4$ and $q \leq \frac{p'}{3}$.

Note from Exercise 9(vi) that this theorem also implies the $d = 2$ case of Conjecture 15. This case of the restriction conjecture was first established by Zygmund; Zygmund’s proof is shorter than the one given here (relying on the Hausdorff-Young inequality (4)), but the arguments here have broader applicability, in particular they are also useful in higher-dimensional settings.

To prove this conjecture, it suffices to verify it at the endpoint , since from Hölder’s inequality the norm is essentially non-decreasing in , where is arclength measure on . By Exercise 9, we may replace here by (say) the first quadrant of the circle, where is the map ; we let be the arc length measure on that quadrant. (This reduction is technically convenient to avoid having to deal with antipodal points with parallel tangents a little later in the argument.)

By (22) and relabeling, it suffices to show that

whenever , , and (we drop the requirement that is smooth, in order to apply rough cutoffs shortly), and arcs such as are always understood to be endowed with arclength measure.

We now bilinearise this estimate. It is clear that the estimate (23) is equivalent to

for any , since (24) follows from (23) and Hölder’s inequality, and (23) follows from (24) by setting .

Right now, the two functions and are both allowed to occupy the entirety of the arc . However, one can get better estimates if one separates the functions to lie in *transverse* sub-arcs of (where by “transverse” we mean that there is some non-zero separation between the normal vectors of and the normal vectors of ). The key estimate is

Proposition 21 (Bilinear estimate) Let be subintervals of such that . Then we have

for all , where denotes the arclength measure on .

*Proof:* To avoid some very minor technicalities involving convolutions of measures, let us approximate the arclength measures . Observe that we have

in the sense of distributions, where is the annular region

Thus we have the pointwise bound

and

and similarly for . Hence by monotone convergence it suffices to show that

for sufficiently small . By Plancherel’s theorem, it thus suffices to show that

for , if is sufficiently small. From Young’s inequality one has

so by interpolation it suffices to show that

But this follows from the pointwise bound

for sufficiently small , whose proof we leave as an exercise.

Exercise 22Establish (25).

Remark 23 Higher-dimensional bilinear estimates, involving more complicated manifolds than arcs, play an important role in the modern theory of nonlinear dispersive equations, especially when combined with the formalism of dispersive variants of Sobolev spaces known as $X^{s,b}$ spaces, introduced by Bourgain (and independently by Klainerman-Machedon). See for instance this book of mine for further discussion.

From the triangle inequality we have

so by complex interpolation (which works perfectly well for bilinear operators) we have

for any . The estimate (26) begins to look rather similar to (24), and we can deduce (24) from (26) as follows. Firstly, it is convenient to use Marcinkiewicz interpolation (using the fact that we have an open range of ) to reduce (23) to proving a restricted estimate

for any measurable subset of the circle, so to prove (24) it suffices to show that

We can view the expression as a two-dimensional integral

We now perform a Whitney decomposition away from the diagonal of the square to decompose it as rectangles of the form for which Proposition 21 applies. There are many ways to perform this decomposition; here is one such. Define a *dyadic interval* in to be a subinterval of of the form ; then each dyadic interval other than the full interval is contained in a unique parent interval of twice the length. Let us say that two dyadic intervals are *close*, and write , if they are the same length and are disjoint, but whose parents are not disjoint. Observe that if then and almost every pair lies in exactly one pair of the form with . Therefore we can decompose (42) as the sum of

where range over pairs of close dyadic intervals, leading to the decomposition

We perform a triangle inequality based on the length of the dyadic intervals, thus it will suffice to show that

One could apply the triangle inequality a second time to pull out the sum on the left-hand side, but this turns out to be too inefficient for certain ranges of . Instead we need to exploit some further orthogonality of the factors :

Exercise 24 Let .

- (i) Show that for each pair with and , the function is supported in a rectangle in with the property that the dilates of these rectangles (the rectangle of twice the sidelengths of but the same center) have bounded overlap (any point in is contained in of these dilates ).
- (ii) Establish the bound
for all and all functions whose Fourier transform is supported in . (Hint: one needs to apply an interpolation theorem, but one has to take care to make sure the interpolation is rigorous.)

Using the above exercise, we can bound the left-hand side of (28) by

Applying (26) and noting that , this is bounded by

Bounding and noting that each is close to only intervals and vice versa, we can bound this by

Since and , we can bound this by

and a routine calculation (dividing into the regimes and and using the hypothesis ) shows that the right-hand side is as required. This completes the proof of Theorem 20.
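The Whitney decomposition into close dyadic intervals used in the proof above can be sanity-checked numerically. The following is only a sketch under stated assumptions: dyadic intervals are indexed by an integer `j` at level `k` (the interval from `j/2^k` to `(j+1)/2^k`), "disjoint" is read as "non-adjacent" (as for closed intervals), and "parents not disjoint" is read as "parents equal or adjacent". The function names are hypothetical and not taken from the notes. The check confirms that a generic pair of points lies in exactly one product of close intervals, and that the separation of the points is comparable to the common length of those intervals.

```python
import random

def dyadic_index(x, k):
    """Index j of the level-k dyadic interval [j/2^k, (j+1)/2^k) containing x in [0,1)."""
    return int(x * 2**k)  # scaling a float by a power of two is exact

def is_close(j1, j2):
    """Closeness test for two dyadic intervals at a common level (assumed convention):
    the intervals are non-adjacent, but their parents coincide or are adjacent."""
    return abs(j1 - j2) >= 2 and abs(j1 // 2 - j2 // 2) <= 1

def close_levels(x, y, max_level=60):
    """All levels k at which the dyadic intervals containing x and y are close."""
    return [k for k in range(1, max_level + 1)
            if is_close(dyadic_index(x, k), dyadic_index(y, k))]

if __name__ == "__main__":
    random.seed(0)
    for _ in range(5):
        x, y = random.random(), random.random()
        print(x, y, close_levels(x, y))
```

The uniqueness of the level reflects the fact that once two points fall into non-adjacent intervals at some scale, they remain in non-adjacent intervals at all finer scales.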

Exercise 25 (Bilinear restriction for paraboloid implies linear restriction) It is a known theorem (first conjectured by Klainerman and Machedon) that one has the bilinear restriction theorem whenever , , disjoint compact subsets of , and functions , where denotes the measure given by the integral

(The range is known to be sharp for (29) except possibly for the endpoint , which remains open currently.) Assuming this result, show that Conjecture 15 holds for all . (Hint: one repeats the above arguments, but at one point one will be faced with estimating a bilinear expression involving two “close” regions , which could be very large or very small. The hypothesis (29) does not specify how the implied constants depend on the size or location of , but one can obtain such a dependence by exploiting the translation and Galilean symmetries of the paraboloid.)

** — 4. Multilinear estimates — **

We now turn to multilinear (or more precisely, -linear) Kakeya and restriction estimates, where we happen to have nearly optimal estimates. For instance, we have the following estimate (cf. (17)), first established by Bennett, Carbery, and myself:

Theorem 26 (Multilinear Kakeya estimate) Let , let be sufficiently small, and let . Suppose that are collections of tubes such that each tube in is oriented within of the basis vector . Then we have

Exercise 27 Assuming Theorem 26, obtain an estimate for for any in terms of and , and use examples to show that this estimate is optimal in the sense that the exponents for and can only be improved by epsilon factors at best.

In the two-dimensional case the estimate is easily established with no epsilon loss. Indeed, in this case we can expand the left-hand side of (30) as

But if is a rectangle oriented near , and is a -rectangle oriented near , then is comparable with , and the claim follows.

The epsilon loss was removed in general dimension by Guth, using the polynomial method. We will not give that argument here, but instead give a simpler proof of Theorem 26, also due to Guth, and based primarily on the method of *induction on scales*. We first treat the case when , that is when all the tubes in each family are completely parallel:

Exercise 28

- (i) (Loomis-Whitney inequality) Let , and for each , let be the linear projection . Establish the inequality
for all . (Hint: induct on and use Hölder’s inequality and the Fubini-Tonelli theorem.)
- (ii) Establish Theorem 26 in the case .
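Before attempting part (i), it can be reassuring to test the Loomis-Whitney inequality numerically in a discrete model. The sketch below makes several assumptions that are not taken from the notes: it works in dimension three, replaces Lebesgue measure by counting measure on a finite grid, and uses the standard form of the inequality in which the sum of `f1(y,z) * f2(x,z) * f3(x,y)` over all grid points is at most the product of the three `l^2` norms. The function name is hypothetical.

```python
import math
import random

def loomis_whitney_sides(f1, f2, f3):
    """Return (lhs, rhs) for the discrete three-dimensional Loomis-Whitney inequality.

    f1, f2, f3 are n-by-n nested lists of non-negative numbers, viewed as functions
    of the coordinate pairs (y,z), (x,z), (x,y) respectively (each one omits a
    different coordinate of the point (x,y,z))."""
    n = len(f1)
    lhs = sum(f1[y][z] * f2[x][z] * f3[x][y]
              for x in range(n) for y in range(n) for z in range(n))

    def l2_norm(f):
        return math.sqrt(sum(v * v for row in f for v in row))

    rhs = l2_norm(f1) * l2_norm(f2) * l2_norm(f3)
    return lhs, rhs

if __name__ == "__main__":
    random.seed(0)
    n = 5
    fs = [[[random.random() for _ in range(n)] for _ in range(n)] for _ in range(3)]
    lhs, rhs = loomis_whitney_sides(*fs)
    print(lhs, "<=", rhs)
```

Equality holds when all three functions are the indicator of the full grid, which corresponds to the case of a cube in the continuous setting.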

Now we turn to the case of positive . Fix . For any , let denote the best constant in the inequality

whenever are collections of tubes such that each tube in is oriented within of the basis vector . Note that each tube can be covered by tubes whose direction is *exactly* equal to . From this we obtain the crude bound

In particular, is finite, and we have

whenever . Our objective is to show that whenever is sufficiently small, , and . The bound (32) establishes the claim when is large; the strategy is now to use the *induction on scales* method to push this “base case” bound down to smaller values of . The key estimate is

*Proof:* For each , let be a collection of tubes oriented within of . Our objective is to show that

Let be a small constant depending only on . We partition into cubes of sidelength ; then the left-hand side of (33) can be decomposed as

Clearly we can restrict the inner sum to those tubes that actually intersect . For small enough, the intersection of with is contained in a tube oriented within of ; such a tube can be viewed as a rescaling by of a tube, also oriented within of . From (31) and rescaling we conclude that

Now let be the tube with the same central axis and center of mass as . For small enough, if then equals on all of , and hence

Combining all these estimates, we can bound the left-hand side of (33) by

But by (31) and rescaling we have

and the claim follows.

Now let , and let be sufficiently large depending on . If is sufficiently small depending on , then from (32) we have the claim

whenever . On the other hand, from Proposition 29 we see (for large enough) that if (34) holds in some range with then it also holds in the larger range . By induction we then have (34) for all . Combining this with (32), we have shown that

for all , whenever is sufficiently small depending on . This is *almost* what we need to prove Theorem 26, except that we are requiring to be small depending on as well as , whereas Theorem 26 only requires to be sufficiently small depending on and not . We can overcome this (at the cost of worsening the implied constants by an -dependent factor) by the triangle inequality and exploiting affine invariance (somewhat in the spirit of Exercise 9). Namely, suppose that and is only assumed to be small depending on but not on . By what we have previously established, we have

whenever the tubes lie within of , where is a quantity that is sufficiently small depending on . Now we apply a linear transformation to both sides, and also modify slightly, and conclude that for any within of , we still have the bound (35) if the are assumed to lie within (say) of instead of . On the other hand, by compactness (or more precisely, total boundedness), we can find directions that lie within of , such that any other direction that lies within of lies within of one of the . Applying the (quasi-)triangle inequality for , we conclude that

whenever the directions of are merely assumed to lie within of . This concludes the proof of Theorem 26.

Exercise 30 By optimising the parameters in the above argument, refine the estimate in Theorem 26 slightly to for any .

We can use the multilinear Kakeya estimate to prove a multilinear restriction (or more precisely, multilinear extension) estimate:

Theorem 31 (Multilinear restriction estimate) Let , let be sufficiently small, and let . Suppose that are open subsets of that lie within of the basis vector . Then we have

Exercise 32 By modifying the arguments used to prove Exercise 16(ii), show that Theorem 31 implies Theorem 26.

Exercise 33 Assuming Theorem 31, obtain for each and an estimate of the form whenever , , and and some exponent , and use examples to show that the exponent you obtain is best possible.

Remark 34 In the case, this result with no epsilon loss follows from Proposition 21. It is an open question whether the epsilon can be removed in higher dimensions; see this recent paper of mine for some progress in this direction.

To prove Theorem 31, we again turn to induction on scales; the argument here is a corrected version of one from this paper of Bennett, Carbery and myself, which first appeared in this paper of Bennett. Fix , and let be sufficiently small. For technical reasons it is convenient to replace the subsets of the sphere by annuli. More precisely, for each , let denote the best constant in the inequality

whenever , where is the annular cap

Because we have restricted both the Fourier and spatial domains to be compactly supported it is clear that is finite for each , thus

Exercise 35 Show that (40) implies Theorem 31. (Hint: starting with , multiply by a suitable weight function that is large on and has Fourier transform supported on , and write this as for a suitable . Then obtain estimates on .)

To establish (40), the key estimate is

Proposition 36 (Induction on scales) For any , one has where the multilinear Kakeya constant was defined in (31).

Suppose we can establish this claim. Applying Theorem 26, we conclude that if for a sufficiently large depending on , one has

From (39) one has

for all and some , and an easy induction then shows that (41) holds for all , giving the claim.

It remains to prove Proposition 36. We will rely on the *wave packet decomposition*. Informally, this decomposes into a sum of “wave packets” that is approximately of the form

where ranges over -tubes in oriented in various directions oriented near , and the coefficients obey an type estimate

(This decomposition is inaccurate in a number of technical ways, for instance the sharp cutoff should be replaced by something smoother, but we ignore these issues for the sake of this informal discussion.) Heuristically speaking, (42) is asserting that behaves like a superposition of various (translated) Knapp examples (16) with .

Let us informally indicate why we would expect the wave packet decomposition to hold, and then why it should imply something like Proposition 36. Geometrically, the annular cap behaves like the union of essentially disjoint -disks , each centred at some point on the unit sphere that is close to , and oriented normal to the direction . Thus should behave like the sum of the components . By the uncertainty principle, each such component should behave like a constant multiple of the plane wave on each translate of the dual region to , which is a tube oriented in the direction . By Plancherel’s theorem, the total norm of should equal . Thus we expect to have a decomposition roughly of the form

where is a collection of parallel and boundedly overlapping tubes oriented in the direction , and the are coefficients with

Summing over and collecting powers of , we (heuristically) obtain the wave packet decomposition (42) with bound (43).

Now we informally explain why the decomposition (42) (and attendant bound (43)) should yield Proposition 36. Our task is to show that

for . We may as well normalise . Applying the wave packet decomposition, one expects to have an approximation of the form

and the are essentially distinct tubes oriented within of . We cover by balls of radius . On each such ball, the cutoffs are morally constant, and so

From the uncertainty principle, the trigonometric polynomial behaves on like the inverse Fourier transform of a function supported on with

and hence by (38) we expect the expression (46) to be bounded by

which is also morally

Averaging in , we thus expect the left-hand side of (44) to be

Applying a rescaled (and weighted) version of Theorem 26, this is bounded by

and the claim now follows from (45).

Now we begin the rigorous argument. We need to prove (44), and we normalise . By Fubini’s theorem we have

Let be a fixed Schwartz function that is bounded away from zero on and has Fourier transform supported on , thus the function is bounded away from zero on and has Fourier transform supported on . In particular we have

We can write

and observe that has Fourier transform supported in . Thus

Thus it remains to establish the bound

We cover by a collection of disks , each one centered at an element that lies within of , and is oriented with normal , with the separated from each other by . A partition of unity then lets us write where each with

The functions then have boundedly overlapping supports in the sense that every is contained in at most of these supports. Hence

By Plancherel’s theorem the right-hand side is at most

This is morally bounded by

so one has morally bounded the left-hand side of (47) by

In practice, due to the rapid decay of , one has to add some additional terms involving some translates of the balls , but these can be handled by the same method as the one given below and we omit this technicality for brevity. We can write , where is a Schwartz function adapted to a slight dilate of whose inverse Fourier transform is a bump function adapted to a tube oriented along through the origin, and with

This gives a reproducing-type formula

which by Cauchy-Schwarz (or Jensen’s inequality) gives the pointwise bound

By enlarging slightly, we then have

for all , hence

We have thus bounded the left-hand side of (47) by

which we can rearrange as

Using a rescaled version of (31) (and viewing the convolution here as a limit of Riemann sums) we can bound this by

which by (49), (48) is bounded by

giving (47) as desired.

This seems like a very interesting and timely proposal to me and I would like to open it up for discussion, for instance by proposing some seed requests for data and data cleaning and to discuss possible platforms that such a repository could be built on. In the spirit of “building the plane while flying it”, one could begin by creating a basic github repository as a prototype and use the comments in this blog post to handle requests, and then migrate to a more high quality platform once it becomes clear what direction this project might move in. (For instance one might eventually move beyond data cleaning to more sophisticated types of data analysis.)

UPDATE, Mar 25: a prototype page for such a clearinghouse is now up at this wiki page.

UPDATE, Mar 27: the data cleaning aspect of this project largely duplicates the existing efforts at the United against COVID-19 project, so we are redirecting requests of this type to that project (and specifically to their data discourse page). The polymath proposal will now refocus on crowdsourcing a list of public data sets relating to the COVID-19 pandemic.


(I am on this board, but could not make it to this particular meeting; I caught up on the presentation later, and thought it would be of interest to several readers of this blog.) While there is some mathematics in the presentation, it is relatively non-technical.

- Restriction theory and Strichartz estimates
- Decoupling estimates and applications
- Paraproducts; time frequency analysis; Carleson’s theorem

As usual, lecture notes will be made available on this blog.

Unlike previous courses, this one will be given online as part of UCLA’s social distancing efforts. In particular, the course will be open to anyone with an internet connection (no UCLA affiliation is required), though non-UCLA participants will not have full access to all aspects of the course, and there is the possibility that some restrictions on participation may be imposed if there are significant disruptions to class activity. For more information, see the course description. **UPDATE**: due to time limitations, I will not be able to respond to personal email inquiries about this class from non-UCLA participants in the course. Please use the comment thread to this blog post for such inquiries. I will also update the course description throughout the course to reflect the latest information about the course, both for UCLA students enrolled in the course and for non-UCLA participants.

The same goes for giving mathematical talks. I learned recently (from Jordan Ellenberg) that Rachel Pries has recently launched a “virtual math seminar on open conjectures in number theory and arithmetic geometry” (VaNTAGe) that is run using the BlueJeans platform. And for many years there has been a regular joint math seminar between UC Berkeley, U. Paris-Nord, U. Zurich, and U. Bonn (see e.g., this calendar), and nowadays many mathematical institutes stream their talks or at least videotape them to place them online later. Our own department does not have a dedicated lecture hall for videocasting, so I would be interested in knowing of any successful ways to improvise such casting with more portable technology. (Skype in principle could work here, but I have found this to be clunky even for smaller meetings involving just a handful of participants.)

EDIT: in addition to lectures and talks, it would also be topical to discuss online options for office hours, midterms, and final exams.

where we use the averaging notation

for any non-empty finite set (with denoting the cardinality of ), and is the multiplicative discrete derivative operator

One reason why these norms play an important role is that they control various multilinear averages. We give two sample examples here:

We establish these claims a little later in this post.
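To make the Gowers norms concrete, here is a brute-force computation on a cyclic group. This is a sketch only: it assumes the standard definition, in which the `2^k`-th power of the `U^k` norm is the average over `x, h_1, ..., h_k` of the iterated multiplicative derivative of `f`, with the multiplicative derivative in the direction `h` sending `f(x)` to `f(x+h)` times the complex conjugate of `f(x)`. The function names and the Fourier normalisation (an average rather than a sum) are choices made for this illustration.

```python
import cmath

def e(t):
    """The standard character e(t) = exp(2 pi i t)."""
    return cmath.exp(2j * cmath.pi * t)

def mult_derivative(f, h):
    """Multiplicative discrete derivative: x -> f(x+h) * conj(f(x)) on Z/nZ."""
    n = len(f)
    return [f[(x + h) % n] * f[x].conjugate() for x in range(n)]

def gowers_norm_power(f, k):
    """The quantity ||f||_{U^k}^{2^k}, computed by averaging over x, h_1, ..., h_k."""
    n = len(f)
    if k == 0:
        return sum(f) / n
    return sum(gowers_norm_power(mult_derivative(f, h), k - 1) for h in range(n)) / n

def fourier(f):
    """Fourier coefficients of f on Z/nZ, normalised as averages."""
    n = len(f)
    return [sum(f[x] * e(-x * xi / n) for x in range(n)) / n for xi in range(n)]

if __name__ == "__main__":
    n = 11
    quadratic_phase = [e((x * x % n) / n) for x in range(n)]
    print(abs(gowers_norm_power(quadratic_phase, 2)))
    print(abs(gowers_norm_power(quadratic_phase, 3)))
```

A quadratic phase on a group of odd prime order is a useful test case: it has small `U^2` norm but `U^3` norm equal to one, and the fourth power of its `U^2` norm agrees with the sum of the fourth powers of its Fourier coefficients.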

In some more recent literature (e.g., this paper of Conlon, Fox, and Zhao), the role of Gowers norms has been replaced by (generalisations of) the *cut norm*, a concept originating from graph theory. In this blog post, it will be convenient to define these cut norms in the language of probability theory (using boldface to denote random variables).

Definition 2 (Cut norm) Let be independent random variables with ; to avoid minor technicalities we assume that these random variables are discrete and take values in a finite set. Given a random variable of these independent random variables, we define the cut norm where the supremum ranges over all choices of random variables that are -bounded (thus surely), and such that does not depend on .

If , we abbreviate as .

Strictly speaking, the cut norm is only a cut semi-norm when , but we will abuse notation by referring to it as a norm nevertheless.

Example 3 If is a bipartite graph, and , are independent random variables chosen uniformly from respectively, then where the supremum ranges over all -bounded functions , . The right hand side is essentially the cut norm of the graph , as defined for instance by Frieze and Kannan.
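For very small examples, the two-variable cut norm in Example 3 can be computed by brute force. The sketch below is an illustration under simplifying assumptions that are not in the post: the function is real-valued and given as a matrix, and the supremum is restricted to real-valued multipliers bounded by one in magnitude. By multilinearity the supremum is then attained at multipliers taking values in plus or minus one, and for a fixed choice of the row multiplier the optimal column multiplier is the sign of each signed column sum. Allowing complex multipliers could in principle give a somewhat larger value. The function name is hypothetical.

```python
from itertools import product

def cut_norm_real(f):
    """Brute-force value of sup |E f(x,y) a(x) b(y)| over real multipliers a, b with
    |a|, |b| <= 1, where x and y are uniform over the rows and columns of the matrix f.

    The running time is exponential in the number of rows, so this is only meant
    for tiny examples."""
    m, n = len(f), len(f[0])
    best = 0.0
    for a in product((1, -1), repeat=m):
        # For this choice of a, the best b(y) is the sign of the signed column sum.
        column_sums = [sum(a[x] * f[x][y] for x in range(m)) for y in range(n)]
        best = max(best, sum(abs(c) for c in column_sums))
    return best / (m * n)

if __name__ == "__main__":
    adjacency = [[1, 0], [1, 1]]    # a small bipartite graph
    balanced = [[1, 1], [1, -1]]    # a sign pattern with no large rank-one piece
    print(cut_norm_real(adjacency), cut_norm_real(balanced))
```

For a non-negative matrix such as an adjacency matrix, the supremum is attained by the constant multipliers, so this quantity is just the edge density; the cut norm becomes informative once one subtracts off the density and measures the balanced function.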

The cut norm is basically an expectation when :

Example 4 If , we see from the definition that If , one easily checks that

where is the conditional expectation of to the -algebra generated by all the variables other than , i.e., the -algebra generated by . In particular, if are independent random variables drawn uniformly from respectively, then

Here are some basic properties of the cut norm:

Lemma 5 (Basic properties of cut norm) Let be independent discrete random variables, and a function of these variables.

- (i) (Permutation invariance) The cut norm is invariant with respect to permutations of the , or permutations of the .
- (ii) (Conditioning) One has
where on the right-hand side we view, for each realisation of , as a function of the random variables alone, thus the right-hand side may be expanded as

- (iii) (Monotonicity) If , we have
- (iv) (Multiplicative invariances) If is a -bounded function that does not depend on one of the , then
In particular, if we additionally assume , then

- (v) (Cauchy-Schwarz) If , one has
where is a copy of that is independent of and is the random variable

- (vi) (Averaging) If and , where is another random variable independent of , and is a random variable depending on both and , then

*Proof:* The claims (i), (ii) are clear from expanding out all the definitions. The claim (iii) also easily follows from the definitions (the left-hand side involves a supremum over a more general class of multipliers , while the right-hand side omits the multiplier), as does (iv) (the multiplier can be absorbed into one of the multipliers in the definition of the cut norm). The claim (vi) follows by expanding out the definitions, and observing that all of the terms in the supremum appearing in the left-hand side also appear as terms in the supremum on the right-hand side. It remains to prove (v). By definition, the left-hand side is the supremum over all quantities of the form

where the are -bounded functions of that do not depend on . We average out in the direction (that is, we condition out the variables ), and pull out the factor (which does not depend on ), to write this as

which by Cauchy-Schwarz is bounded by

which can be expanded using the copy as

Expanding

and noting that each is -bounded and independent of for , we obtain the claim.

Now we can relate the cut norm to Gowers uniformity norms:

Lemma 6 Let be a finite abelian group, let be independent random variables uniformly drawn from for some , and let . Then If is additionally assumed to be -bounded, we have the converse inequalities

*Proof:* Applying Lemma 5(v) times, we can bound

where are independent copies of that are also independent of . The expression inside the norm can also be written as

so by Example 4 one can write (6) as

which after some change of variables simplifies to

which by Cauchy-Schwarz is bounded by

which one can rearrange as

giving (2). A similar argument bounds

by

which gives (3).

For (4), we can reverse the above steps and expand as

which we can write as

for some -bounded function . This can in turn be expanded as

for some -bounded functions that do not depend on . By Example 4, this can be written as

which by several applications of Lemma 5(iii) and then Lemma 5(iv) can be bounded by

giving (4). A similar argument gives (5).

Now we can prove Proposition 1. We begin with part (i). By permutation we may assume , then by translation we may assume . Replacing by and by , we can write the left-hand side of (1) as

where

is a -bounded function that does not depend on . Taking to be independent random variables drawn uniformly from , the left-hand side of (1) can then be written as

which by Example 4 is bounded in magnitude by

After many applications of Lemma 5(iii), (iv), this is bounded by

By Lemma 5(ii) we may drop the variable, and then the claim follows from Lemma 6.

For part (ii), we replace by and by to write the left-hand side as

the point here is that the first factor does not involve , the second factor does not involve , and the third factor has no quadratic terms in . Letting be independent variables drawn uniformly from , we can use Example 4 to bound this in magnitude by

which by Lemma 5(i),(iii),(iv) is bounded by

and then by Lemma 5(v) we may bound this by

which by Example 4 is

Now the expression inside the expectation is the product of four factors, each of which is or applied to an affine form where depends on and is one of , , , . With probability , the four different values of are distinct, and then by part (i) we have

When they are not distinct, we can instead bound this quantity by . Taking expectations in , we obtain the claim.

The analogue of the inverse theorem for cut norms is the following claim (which I learned from Ben Green):

Lemma 7 (-type inverse theorem) Let be independent random variables drawn from a finite abelian group , and let be -bounded. Then we have where is the group of homomorphisms from to , and .

*Proof:* Suppose first that for some , then by definition

for some -bounded . By Fourier expansion, the left-hand side is also

where . From Plancherel’s theorem we have

hence by Hölder’s inequality one has for some , and hence

Conversely, suppose (7) holds. Then there is such that

which on substitution and Example 4 implies

The term splits into the product of a factor not depending on , and a factor not depending on . Applying Lemma 5(iii), (iv) we conclude that

The claim follows.
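The Fourier expansion step in the first half of this proof can be checked numerically on a cyclic group. The sketch below assumes the normalisation in which Fourier coefficients are averages of `f(x) e(-x xi / n)`; with that convention the average of `f(x+y) g(x) h(y)` over `x, y` equals the sum over frequencies `xi` of the coefficient of `f` at `xi` times the coefficients of `g` and `h` at `-xi`. Plancherel and Hölder then bound the magnitude by the largest Fourier coefficient of `f` whenever `g` and `h` are bounded by one. The function names are hypothetical.

```python
import cmath
import random

def e(t):
    """The standard character e(t) = exp(2 pi i t)."""
    return cmath.exp(2j * cmath.pi * t)

def fourier(f):
    """Fourier coefficients of f on Z/nZ, normalised as averages."""
    n = len(f)
    return [sum(f[x] * e(-x * xi / n) for x in range(n)) / n for xi in range(n)]

def physical_side(f, g, h):
    """The average of f(x+y) g(x) h(y) over x, y in Z/nZ."""
    n = len(f)
    return sum(f[(x + y) % n] * g[x] * h[y] for x in range(n) for y in range(n)) / n**2

def frequency_side(f, g, h):
    """The same quantity after Fourier expansion of f."""
    n = len(f)
    F, G, H = fourier(f), fourier(g), fourier(h)
    return sum(F[xi] * G[-xi % n] * H[-xi % n] for xi in range(n))

def random_phases(n):
    """A random function on Z/nZ with values on the unit circle."""
    return [e(random.random()) for _ in range(n)]

if __name__ == "__main__":
    random.seed(0)
    f, g, h = random_phases(7), random_phases(7), random_phases(7)
    print(abs(physical_side(f, g, h)), max(abs(c) for c in fourier(f)))
```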

The higher order inverse theorems are much less trivial (and the optimal quantitative bounds are not currently known). However, there is a useful *degree lowering* argument, due to Peluse and Prendiville, that can allow one to lower the order of a uniformity norm in some cases. We give a simple version of this argument here:

Lemma 8 (Degree lowering argument, special case) Let be a finite abelian group, let be a non-empty finite set, and let be a function of the form for some -bounded functions indexed by . Suppose that for some and . Then one of the following claims holds (with implied constants allowed to depend on ):

- (i) (Degree lowering) one has .
- (ii) (Non-zero frequency) There exist and non-zero such that

There are more sophisticated versions of this argument in which the frequency is “minor arc” rather than “zero frequency”, and then the Gowers norms are localised to suitable large arithmetic progressions; this is implicit in the above-mentioned paper of Peluse and Prendiville.

*Proof:* One can write

and hence we conclude that

for a set of tuples of density . Applying Lemma 6 and Lemma 7, we see that for each such tuple, there exists such that

where is drawn uniformly from .

Let us adopt the convention that vanishes for not in , then from Lemma 5(ii) we have

where are independent random variables drawn uniformly from and also independent of . By repeated application of Lemma 5(iii) we then have

Expanding out and using Lemma 5(iv) repeatedly we conclude that

From definition of we then have

By Lemma 5(vi), we see that the left-hand side is less than

where is drawn uniformly from , independently of . By repeated application of Lemma 5(i), (v), we conclude that

where are independent copies of that are also independent of , . By Lemma 5(ii) and Example 4 we conclude that

with probability .

The left-hand side can be rewritten as

where is the additive version of , thus

Translating , we can simplify this a little to

If the frequency is ever non-vanishing in the event (9) then conclusion (ii) applies. We conclude that

with probability . In particular, by the pigeonhole principle, there exist such that

with probability . Expanding this out, we obtain a representation of the form

holding with probability , where the are functions that do not depend on the coordinate. From (8) we conclude that

for of the tuples . Thus by Lemma 5(ii)

By repeated application of Lemma 5(iii) we then have

and then by repeated application of Lemma 5(iv)

and then the conclusion (i) follows from Lemma 6.

As an application of degree lowering, we give an inverse theorem for the average in Proposition 1(ii), first established by Bourgain-Chang and later reproved by Peluse (by different methods from those given here):

Proposition 9 Let be a cyclic group of prime order. Suppose that one has -bounded functions such that for some . Then either , or one has

We remark that a modification of the arguments below also gives .

*Proof:* The left-hand side of (10) can be written as

where is the *dual function*

By Cauchy-Schwarz one thus has

and hence by Proposition 1, we either have (in which case we are done) or

Writing with , we conclude that either , or that

for some and non-zero . The left-hand side can be rewritten as

where and . We can rewrite this in turn as

which is bounded by

where are independent random variables drawn uniformly from . Applying Lemma 5(v), we conclude that

However, a routine Gauss sum calculation reveals that the left-hand side is for some absolute constant because is non-zero, so that . The only remaining case to consider is when

Repeating the above arguments we then conclude that

and then

The left-hand side can be computed to equal , and the claim follows.
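The "routine Gauss sum calculation" invoked in this proof rests on the classical fact that a normalised quadratic exponential sum over a field of odd prime order has magnitude exactly the inverse square root of the order. The sketch below checks this prototype numerically; it is not the specific expression appearing in the proof (which is not reproduced here), and the function name and the choice of prime are assumptions made for illustration.

```python
import cmath

def quadratic_gauss_average(p, xi, eta=0):
    """The average over x in Z/pZ of e((xi * x^2 + eta * x) / p)."""
    return sum(cmath.exp(2j * cmath.pi * ((xi * x * x + eta * x) % p) / p)
               for x in range(p)) / p

if __name__ == "__main__":
    p = 101
    for xi in (1, 2, 3):
        print(xi, abs(quadratic_gauss_average(p, xi)), p ** -0.5)
```

The linear term does not change the magnitude when the quadratic coefficient is non-zero, since one can complete the square; when the quadratic coefficient vanishes the average is either zero or one.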

This argument was given for the cyclic group setting, but the argument can also be applied to the integers (see Peluse-Prendiville) and can also be used to establish an analogue over the reals (that was first obtained by Bourgain).

We are currently in the process of designing posters (and possibly even a more interactive online resource) for each of the ten topics listed in the webinars; hopefully these will be available in a few months.

Basel has historically been home to a number of very prominent mathematicians, most notably Jacob Bernoulli, whose headstone I saw at the Basel Minster,

and also Leonhard Euler, for which I could not find a formal memorial, but I did at least see a hotel bearing his name:
