You are currently browsing the tag archive for the ‘Jean Bourgain’ tag.

One of the basic problems in analytic number theory is to estimate sums of the form

$\displaystyle \sum_{p

as ${x \rightarrow \infty}$, where ${p}$ ranges over primes and ${f}$ is some explicit function of interest (e.g. a linear phase function ${f(p) = e^{2\pi i \alpha p}}$ for some real number ${\alpha}$). This is essentially the same task as obtaining estimates on the sum

$\displaystyle \sum_{n

where ${\Lambda}$ is the von Mangoldt function. If ${f}$ is bounded, ${f(n)=O(1)}$, then from the prime number theorem one has the trivial bound

$\displaystyle \sum_{n

but often (when ${f}$ is somehow “oscillatory” in nature) one is seeking the refinement

$\displaystyle \sum_{n

or equivalently

$\displaystyle \sum_{p

Thanks to identities such as

$\displaystyle \Lambda(n) = \sum_{d|n} \mu(d) \log(\frac{n}{d}), \ \ \ \ \ (3)$

where ${\mu}$ is the Möbius function, refinements such as (1) are similar in spirit to estimates of the form

$\displaystyle \sum_{n

Unfortunately, the connection between (1) and (4) is not particularly tight; roughly speaking, one needs to improve the bounds in (4) (and variants thereof) by about two factors of ${\log x}$ before one can use identities such as (3) to recover (1). Still, one generally thinks of (1) and (4) as being “morally” equivalent, even if they are not formally equivalent.

When ${f}$ is oscillating in a sufficiently “irrational” way, then one standard way to proceed is the method of Type I and Type II sums, which uses truncated versions of divisor identities such as (3) to expand out either (1) or (4) into linear (Type I) or bilinear sums (Type II) with which one can exploit the oscillation of ${f}$. For instance, Vaughan’s identity lets one rewrite the sum in (1) as the sum of the Type I sum

$\displaystyle \sum_{d \leq U} \mu(d) (\sum_{V/d \leq r \leq x/d} (\log r) f(rd)),$

the Type I sum

$\displaystyle -\sum_{d \leq UV} a(d) \sum_{V/d \leq r \leq x/d} f(rd),$

the Type II sum

$\displaystyle -\sum_{V \leq d \leq x/U} \sum_{U < m \leq x/V} \Lambda(d) b(m) f(dm),$

and the error term ${\sum_{d \leq V} \Lambda(n) f(n)}$, whenever ${1 \leq U, V \leq x}$ are parameters, and ${a, b}$ are the sequences

$\displaystyle a(d) := \sum_{e \leq U, f \leq V: ef = d} \Lambda(d) \mu(e)$

and

$\displaystyle b(m) := \sum_{d|m: d \leq U} \mu(d).$

Similarly one can express (4) as the Type I sum

$\displaystyle -\sum_{d \leq UV} c(d) \sum_{UV/d \leq r \leq x/d} f(rd),$

the Type II sum

$\displaystyle - \sum_{V < d \leq x/U} \sum_{U < m \leq x/d} \mu(m) b(d) f(dm)$

and the error term ${\sum_{d \leq UV} \mu(n) f(N)}$, whenever ${1 \leq U,V \leq x}$ with ${UV \leq x}$, and ${c}$ is the sequence

$\displaystyle c(d) := \sum_{e \leq U, f \leq V: ef = d} \mu(d) \mu(e).$

After eliminating troublesome sequences such as ${a(), b(), c()}$ via Cauchy-Schwarz or the triangle inequality, one is then faced with the task of estimating Type I sums such as

$\displaystyle \sum_{r \leq y} f(rd)$

or Type II sums such as

$\displaystyle \sum_{r \leq y} f(rd) \overline{f(rd')}$

for various ${y, d, d' \geq 1}$. Here, the trivial bound is ${O(y)}$, but due to a number of logarithmic inefficiencies in the above method, one has to obtain bounds that are more like ${O( \frac{y}{\log^C y})}$ for some constant ${C}$ (e.g. ${C=5}$) in order to end up with an asymptotic such as (1) or (4).

However, in a recent paper of Bourgain, Sarnak, and Ziegler, it was observed that as long as one is only seeking the Mobius orthogonality (4) rather than the von Mangoldt orthogonality (1), one can avoid losing any logarithmic factors, and rely purely on qualitative equidistribution properties of ${f}$. A special case of their orthogonality criterion (which actually dates back to an earlier paper of Katai, as was pointed out to me by Nikos Frantzikinakis) is as follows:

Proposition 1 (Orthogonality criterion) Let ${f: {\bf N} \rightarrow {\bf C}}$ be a bounded function such that

$\displaystyle \sum_{n \leq x} f(pn) \overline{f(qn)} = o(x) \ \ \ \ \ (5)$

for any distinct primes ${p, q}$ (where the decay rate of the error term ${o(x)}$ may depend on ${p}$ and ${q}$). Then

$\displaystyle \sum_{n \leq x} \mu(n) f(n) =o(x). \ \ \ \ \ (6)$

Actually, the Bourgain-Sarnak-Ziegler paper establishes a more quantitative version of this proposition, in which ${\mu}$ can be replaced by an arbitrary bounded multiplicative function, but we will content ourselves with the above weaker special case. (See also these notes of Harper, which uses the Katai argument to give a slightly weaker quantitative bound in the same spirit.) This criterion can be viewed as a multiplicative variant of the classical van der Corput lemma, which in our notation asserts that ${\sum_{n \leq x} f(n) = o(x)}$ if one has ${\sum_{n \leq x} f(n+h) \overline{f(n)} = o(x)}$ for each fixed non-zero ${h}$.

As a sample application, Proposition 1 easily gives a proof of the asymptotic

$\displaystyle \sum_{n \leq x} \mu(n) e^{2\pi i \alpha n} = o(x)$

for any irrational ${\alpha}$. (For rational ${\alpha}$, this is a little trickier, as it is basically equivalent to the prime number theorem in arithmetic progressions.) The paper of Bourgain, Sarnak, and Ziegler also apply this criterion to nilsequences (obtaining a quick proof of a qualitative version of a result of Ben Green and myself, see these notes of Ziegler for details) and to horocycle flows (for which no Möbius orthogonality result was previously known).

Informally, the connection between (5) and (6) comes from the multiplicative nature of the Möbius function. If (6) failed, then ${\mu(n)}$ exhibits strong correlation with ${f(n)}$; by change of variables, we then expect ${\mu(pn)}$ to correlate with ${f(pn)}$ and ${\mu(pm)}$ to correlate with ${f(qn)}$, for “typical” ${p,q}$ at least. On the other hand, since ${\mu}$ is multiplicative, ${\mu(pn)}$ exhibits strong correlation with ${\mu(qn)}$. Putting all this together (and pretending correlation is transitive), this would give the claim (in the contrapositive). Of course, correlation is not quite transitive, but it turns out that one can use the Cauchy-Schwarz inequality as a substitute for transitivity of correlation in this case.

I will give a proof of Proposition 1 below the fold (which is not quite based on the argument in the above mentioned paper, but on a variant of that argument communicated to me by Tamar Ziegler, and also independently discovered by Adam Harper). The main idea is to exploit the following observation: if ${P}$ is a “large” but finite set of primes (in the sense that the sum ${A := \sum_{p \in P} \frac{1}{p}}$ is large), then for a typical large number ${n}$ (much larger than the elements of ${P}$), the number of primes in ${P}$ that divide ${n}$ is pretty close to ${A = \sum_{p \in P} \frac{1}{p}}$:

$\displaystyle \sum_{p \in P: p|n} 1 \approx A. \ \ \ \ \ (7)$

A more precise formalisation of this heuristic is provided by the Turan-Kubilius inequality, which is proven by a simple application of the second moment method.

In particular, one can sum (7) against ${\mu(n) f(n)}$ and obtain an approximation

$\displaystyle \sum_{n \leq x} \mu(n) f(n) \approx \frac{1}{A} \sum_{p \in P} \sum_{n \leq x: p|n} \mu(n) f(n)$

that approximates a sum of ${\mu(n) f(n)}$ by a bunch of sparser sums of ${\mu(n) f(n)}$. Since

$\displaystyle x = \frac{1}{A} \sum_{p \in P} \frac{x}{p},$

we see (heuristically, at least) that in order to establish (4), it would suffice to establish the sparser estimates

$\displaystyle \sum_{n \leq x: p|n} \mu(n) f(n) = o(\frac{x}{p})$

for all ${p \in P}$ (or at least for “most” ${p \in P}$).

Now we make the change of variables ${n = pm}$. As the Möbius function is multiplicative, we usually have ${\mu(n) = \mu(p) \mu(m) = - \mu(m)}$. (There is an exception when ${n}$ is divisible by ${p^2}$, but this will be a rare event and we will be able to ignore it.) So it should suffice to show that

$\displaystyle \sum_{m \leq x/p} \mu(m) f(pm) = o( x/p )$

for most ${p \in P}$. However, by the hypothesis (5), the sequences ${m \mapsto f(pm)}$ are asymptotically orthogonal as ${p}$ varies, and this claim will then follow from a Cauchy-Schwarz argument.

One of my favourite unsolved problems in harmonic analysis is the restriction problem. This problem, first posed explicitly by Elias Stein, can take many equivalent forms, but one of them is this: one starts with a smooth compact hypersurface ${S}$ (possibly with boundary) in ${{\bf R}^d}$, such as the unit sphere ${S = S^2}$ in ${{\bf R}^3}$, and equips it with surface measure ${d\sigma}$. One then takes a bounded measurable function ${f \in L^\infty(S,d\sigma)}$ on this surface, and then computes the (inverse) Fourier transform

$\displaystyle \widehat{fd\sigma}(x) = \int_S e^{2\pi i x \cdot \omega} f(\omega) d\sigma(\omega)$

of the measure ${fd\sigma}$. As ${f}$ is bounded and ${d\sigma}$ is a finite measure, this is a bounded function on ${{\bf R}^d}$; from the dominated convergence theorem, it is also continuous. The restriction problem asks whether this Fourier transform also decays in space, and specifically whether ${\widehat{fd\sigma}}$ lies in ${L^q({\bf R}^d)}$ for some ${q < \infty}$. (This is a natural space to control decay because it is translation invariant, which is compatible on the frequency space side with the modulation invariance of ${L^\infty(S,d\sigma)}$.) By the closed graph theorem, this is the case if and only if there is an estimate of the form

$\displaystyle \| \widehat{f d\sigma} \|_{L^q({\bf R}^d)} \leq C_{q,d,S} \|f\|_{L^\infty(S,d\sigma)} \ \ \ \ \ (1)$

for some constant ${C_{q,d,S}}$ that can depend on ${q,d,S}$ but not on ${f}$. By a limiting argument, to provide such an estimate, it suffices to prove such an estimate under the additional assumption that ${f}$ is smooth.

Strictly speaking, the above problem should be called the extension problem, but it is dual to the original formulation of the restriction problem, which asks to find those exponents ${1 \leq q' \leq \infty}$ for which the Fourier transform of an ${L^{q'}({\bf R}^d)}$ function ${g}$ can be meaningfully restricted to a hypersurface ${S}$, in the sense that the map ${g \mapsto \hat g|_{S}}$ can be continuously defined from ${L^{q'}({\bf R}^d)}$ to, say, ${L^1(S,d\sigma)}$. A duality argument shows that the exponents ${q'}$ for which the restriction property holds are the dual exponents to the exponents ${q}$ for which the extension problem holds.

There are several motivations for studying the restriction problem. The problem is connected to the classical question of determining the nature of the convergence of various Fourier summation methods (and specifically, Bochner-Riesz summation); very roughly speaking, if one wishes to perform a partial Fourier transform by restricting the frequencies (possibly using a well-chosen weight) to some region ${B}$ (such as a ball), then one expects this operation to well behaved if the boundary ${\partial B}$ of this region has good restriction (or extension) properties. More generally, the restriction problem for a surface ${S}$ is connected to the behaviour of Fourier multipliers whose symbols are singular at ${S}$. The problem is also connected to the analysis of various linear PDE such as the Helmholtz equation, Schro\”dinger equation, wave equation, and the (linearised) Korteweg-de Vries equation, because solutions to such equations can be expressed via the Fourier transform in the form ${fd\sigma}$ for various surfaces ${S}$ (the sphere, paraboloid, light cone, and cubic for the Helmholtz, Schrödinger, wave, and linearised Korteweg de Vries equation respectively). A particular family of restriction-type theorems for such surfaces, known as Strichartz estimates, play a foundational role in the nonlinear perturbations of these linear equations (e.g. the nonlinear Schrödinger equation, the nonlinear wave equation, and the Korteweg-de Vries equation). Last, but not least, there is a a fundamental connection between the restriction problem and the Kakeya problem, which roughly speaking concerns how tubes that point in different directions can overlap. Indeed, by superimposing special functions of the type ${\widehat{fd\sigma}}$, known as wave packets, and which are concentrated on tubes in various directions, one can “encode” the Kakeya problem inside the restriction problem; in particular, the conjectured solution to the restriction problem implies the conjectured solution to the Kakeya problem. Finally, the restriction problem serves as a simplified toy model for studying discrete exponential sums whose coefficients do not have a well controlled phase; this perspective was, for instance, used by Ben Green when he established Roth’s theorem in the primes by Fourier-analytic methods, which was in turn one of the main inspirations for our later work establishing arbitrarily long progressions in the primes, although we ended up using ergodic-theoretic arguments instead of Fourier-analytic ones and so did not directly use restriction theory in that paper.

The estimate (1) is trivial for ${q=\infty}$ and becomes harder for smaller ${q}$. The geometry, and more precisely the curvature, of the surface ${S}$, plays a key role: if ${S}$ contains a portion which is completely flat, then it is not difficult to concoct an ${f}$ for which ${\widehat{f d\sigma}}$ fails to decay in the normal direction to this flat portion, and so there are no restriction estimates for any finite ${q}$. Conversely, if ${S}$ is not infinitely flat at any point, then from the method of stationary phase, the Fourier transform ${\widehat{d\sigma}}$ can be shown to decay at a power rate at infinity, and this together with a standard method known as the ${TT^*}$ argument can be used to give non-trivial restriction estimates for finite ${q}$. However, these arguments fall somewhat short of obtaining the best possible exponents ${q}$. For instance, in the case of the sphere ${S = S^{d-1} \subset {\bf R}^d}$, the Fourier transform ${\widehat{d\sigma}(x)}$ is known to decay at the rate ${O(|x|^{-(d-1)/2})}$ and no better as ${d \rightarrow \infty}$, which shows that the condition ${q > \frac{2d}{d-1}}$ is necessary in order for (1) to hold for this surface. The restriction conjecture for ${S^{d-1}}$ asserts that this necessary condition is also sufficient. However, the ${TT^*}$-based argument gives only the Tomas-Stein theorem, which in this context gives (1) in the weaker range ${q \geq \frac{2(d+1)}{d-1}}$. (On the other hand, by the nature of the ${TT^*}$ method, the Tomas-Stein theorem does allow the ${L^\infty(S,d\sigma)}$ norm on the right-hand side to be relaxed to ${L^2(S,d\sigma)}$, at which point the Tomas-Stein exponent ${\frac{2(d+1)}{d-1}}$ becomes best possible. The fact that the Tomas-Stein theorem has an ${L^2}$ norm on the right-hand side is particularly valuable for applications to PDE, leading in particular to the Strichartz estimates mentioned earlier.)

Over the last two decades, there was a fair amount of work in pushing past the Tomas-Stein barrier. For sake of concreteness let us work just with the restriction problem for the unit sphere ${S^2}$ in ${{\bf R}^3}$. Here, the restriction conjecture asserts that (1) holds for all ${q > 3}$, while the Tomas-Stein theorem gives only ${q \geq 4}$. By combining a multiscale analysis approach with some new progress on the Kakeya conjecture, Bourgain was able to obtain the first improvement on this range, establishing the restriction conjecture for ${q > 4 - \frac{2}{15}}$. The methods were steadily refined over the years; until recently, the best result (due to myself) was that the conjecture held for all ${q > 3 \frac{1}{3}}$, which proceeded by analysing a “bilinear ${L^2}$” variant of the problem studied previously by Bourgain and by Wolff. This is essentially the limit of that method; the relevant bilinear ${L^2}$ estimate fails for ${q < 3 + \frac{1}{3}}$. (This estimate was recently established at the endpoint ${q=3+\frac{1}{3}}$ by Jungjin Lee (personal communication), though this does not quite improve the range of exponents in (1) due to a logarithmic inefficiency in converting the bilinear estimate to a linear one.)

On the other hand, the full range ${q>3}$ of exponents in (1) was obtained by Bennett, Carbery, and myself (with an alternate proof later given by Guth), but only under the additional assumption of non-coplanar interactions. In three dimensions, this assumption was enforced by replacing (1) with the weaker trilinear (and localised) variant

$\displaystyle \| \widehat{f_1 d\sigma_1} \widehat{f_2 d\sigma_2} \widehat{f_3 d\sigma_3} \|_{L^{q/3}(B(0,R))} \leq C_{q,d,S_1,S_2,S_3,\epsilon} R^\epsilon \ \ \ \ \ (2)$

$\displaystyle \|f_1\|_{L^\infty(S_1,d\sigma_1)} \|f_2\|_{L^\infty(S_2,d\sigma_2)} \|f_3\|_{L^\infty(S_3,d\sigma_3)}$

where ${\epsilon>0}$ and ${R \geq 1}$ are arbitrary, ${B(0,R)}$ is the ball of radius ${R}$ in ${{\bf R}^3}$, and ${S_1,S_2,S_3}$ are compact portions of ${S}$ whose unit normals ${n_1(),n_2(),n_3()}$ are never coplanar, thus there is a uniform lower bound

$\displaystyle |n_1(\omega_1) \wedge n_2(\omega_2) \wedge n_3(\omega_3)| \geq c$

for some ${c>0}$ and all ${\omega_1 \in S_1, \omega_2 \in S_2, \omega_3 \in S_3}$. If it were not for this non-coplanarity restriction, (2) would be equivalent to (1) (by setting ${S_1=S_2=S_3}$ and ${f_1=f_2=f_3}$, with the converse implication coming from Hölder’s inequality; the ${R^\epsilon}$ loss can be removed by a lemma from a paper of mine). At the time we wrote this paper, we tried fairly hard to try to remove this non-coplanarity restriction in order to recover progress on the original restriction conjecture, but without much success.

A few weeks ago, though, Bourgain and Guth found a new way to use multiscale analysis to “interpolate” between the result of Bennett, Carbery and myself (that has optimal exponents, but requires non-coplanar interactions), with a more classical square function estimate of Córdoba that handles the coplanar case. A direct application of this interpolation method already ties with the previous best known result in three dimensions (i.e. that (1) holds for ${q > 3 \frac{1}{3}}$). But it also allows for the insertion of additional input, such as the best Kakeya estimate currently known in three dimensions, due to Wolff. This enlarges the range slightly to ${q > 3.3}$. The method also can extend to variable-coefficient settings, and in some of these cases (where there is so much “compression” going on that no additional Kakeya estimates are available) the estimates are best possible.

As is often the case in this field, there is a lot of technical book-keeping and juggling of parameters in the formal arguments of Bourgain and Guth, but the main ideas and “numerology” can be expressed fairly readily. (In mathematics, numerology refers to the empirically observed relationships between various key exponents and other numerical parameters; in many cases, one can use shortcuts such as dimensional analysis or informal heuristic, to compute these exponents long before the formal argument is completely in place.) Below the fold, I would like to record this numerology for the simplest of the Bourgain-Guth arguments, namely a reproof of (1) for ${p > 3 \frac{1}{3}}$. This is primarily for my own benefit, but may be of interest to other experts in this particular topic. (See also my 2003 lecture notes on the restriction conjecture.)

In order to focus on the ideas in the paper (rather than on the technical details), I will adopt an informal, heuristic approach, for instance by interpreting the uncertainty principle and the pigeonhole principle rather liberally, and by focusing on main terms in a decomposition and ignoring secondary terms. I will also be somewhat vague with regard to asymptotic notation such as ${\ll}$. Making the arguments rigorous requires a certain amount of standard but tedious effort (and is one of the main reasons why the Bourgain-Guth paper is as long as it is), which I will not focus on here.

In this final lecture in the Marker lecture series, I discuss the recent work of Bourgain, Gamburd, and Sarnak on how arithmetic combinatorics and expander graphs were used to sieve for almost primes in various algebraic sets.