In this blog post, I would like to specialise the arguments of Bourgain, Demeter, and Guth from the previous post to the two-dimensional case of the Vinogradov main conjecture, namely

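Theorem 1 (Two-dimensional Vinogradov main conjecture) One has

$$\int_{[0,1]^2} \left|\sum_{j=0}^N e(jx + j^2 y)\right|^6\, dx\, dy \ll N^{3+o(1)}$$

as $N \to \infty$.

This particular case of the main conjecture has a classical proof using some elementary number theory. Indeed, the left-hand side can be viewed as the number of solutions to the system of equations

$$j_1 + j_2 + j_3 = k_1 + k_2 + k_3$$

$$j_1^2 + j_2^2 + j_3^2 = k_1^2 + k_2^2 + k_3^2$$

with $j_1,j_2,j_3,k_1,k_2,k_3 \in \{0,\dots,N\}$. These two equations can combine (using the algebraic identity $(a+b-c)^2 - (a^2+b^2-c^2) = 2(a-c)(b-c)$ applied to $(a,b,c) = (j_1,j_2,k_3), (k_1,k_2,j_3)$) to imply the further equation

$$(j_1-k_3)(j_2-k_3) = (k_1-j_3)(k_2-j_3)$$

which, when combined with the divisor bound, shows that each $k_1,k_2,j_3$ is associated to $O(N^{o(1)})$ choices of $j_1,j_2,k_3$, excluding diagonal cases when two of the $j_1,j_2,j_3,k_1,k_2,k_3$ collide, and this easily yields Theorem 1. However, the Bourgain-Demeter-Guth argument (which, in the two-dimensional case, is essentially contained in a previous paper of Bourgain and Demeter) does not require the divisor bound, and extends for instance to the more general case where $j$ ranges in a $1$-separated set of reals between $0$ and $N$.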
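The counting interpretation above can be checked by brute force for small $N$. The following is only an illustrative sketch: it assumes the exponential sum is $\sum_{j=0}^N e(jx+j^2y)$ with $e(t) = e^{2\pi i t}$, and the function names `count_solutions`, `sixth_moment` and `identity_holds` are hypothetical helpers introduced here, not anything from the post. It counts the solutions of the two-equation system, compares that count with the sixth moment computed on a grid fine enough to integrate the relevant trigonometric polynomial exactly, and confirms that the product identity holds on every solution.

```python
import cmath
from collections import Counter
from itertools import product


def count_solutions(N):
    """Count (j1,j2,j3,k1,k2,k3) in {0..N}^6 with equal sums and equal sums of squares."""
    # Group triples by (sum, sum of squares); a solution is an ordered pair
    # of triples sharing the same key.
    tally = Counter((a + b + c, a * a + b * b + c * c)
                    for a, b, c in product(range(N + 1), repeat=3))
    return sum(v * v for v in tally.values())


def sixth_moment(N):
    """Integral of |sum_{j<=N} e(jx + j^2 y)|^6 over the unit square.

    |S|^6 is a trigonometric polynomial with frequencies at most 3N in x
    and 3N^2 in y, so averaging over a (3N+1) x (3N^2+1) grid is exact
    up to floating-point rounding.
    """
    Mx, My = 3 * N + 1, 3 * N * N + 1
    total = 0.0
    for a in range(Mx):
        for b in range(My):
            s = sum(cmath.exp(2j * cmath.pi * (j * a / Mx + j * j * b / My))
                    for j in range(N + 1))
            total += abs(s) ** 6
    return total / (Mx * My)


def identity_holds(N):
    """Check (j1-k3)(j2-k3) == (k1-j3)(k2-j3) on every solution of the system."""
    groups = {}
    for t in product(range(N + 1), repeat=3):
        groups.setdefault((sum(t), sum(v * v for v in t)), []).append(t)
    return all((j1 - k3) * (j2 - k3) == (k1 - j3) * (k2 - j3)
               for triples in groups.values()
               for (j1, j2, j3) in triples
               for (k1, k2, k3) in triples)


for N in (1, 2, 3):
    print(N, count_solutions(N), round(sixth_moment(N)), identity_holds(N))
```

The grid average is used instead of a generic quadrature rule because it is exact for trigonometric polynomials of bounded degree, so the two counts should agree up to rounding.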

In this special case, the Bourgain-Demeter argument simplifies, as the lower dimensional inductive hypothesis becomes a simple almost orthogonality claim, and the multilinear Kakeya estimate needed is also easy (collapsing to just Fubini’s theorem). Also one can work entirely in the context of the Vinogradov main conjecture, and not turn to the increased generality of decoupling inequalities (though this additional generality is convenient in higher dimensions). As such, I am presenting this special case as an introduction to the Bourgain-Demeter-Guth machinery.

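We now give the specialisation of the Bourgain-Demeter argument to Theorem 1. It will suffice to establish the bound

$$\int_{[0,1]^2} \left|\sum_{j=0}^N e(jx + j^2 y)\right|^p\, dx\, dy \ll N^{p/2+o(1)}$$

for all $4 < p < 6$ (where we keep $p$ fixed and send $N$ to infinity), as the $L^6$ bound then follows by combining the above bound with the trivial bound $\left|\sum_{j=0}^N e(jx + j^2 y)\right| \ll N$. Accordingly, for any $\eta > 0$ and $4 < p < 6$, we let $P(p,\eta)$ denote the claim that

$$\int_{[0,1]^2} \left|\sum_{j=0}^N e(jx + j^2 y)\right|^p\, dx\, dy \ll N^{p/2+\eta+o(1)}$$

as $N \to \infty$. Clearly, for any fixed $p$, $P(p,\eta)$ holds for some large $\eta$, and it will suffice to establish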

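Proposition 2 Let $4 < p < 6$, and let $\eta > 0$ be such that $P(p,\eta)$ holds. Then there exists $0 < \eta' < \eta$ (with $\eta'$ depending continuously on $\eta$) such that $P(p,\eta')$ holds.

Indeed, this proposition shows that for $4 < p < 6$, the infimum of the $\eta$ for which $P(p,\eta)$ holds is zero.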
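For readers who want the passage from the $L^p$ bounds to the $L^6$ bound of Theorem 1 spelled out, here is a short worked computation. It is a sketch under stated assumptions: $S_N(x,y)$ is shorthand introduced here for the exponential sum $\sum_{j=0}^N e(jx+j^2y)$, and the $L^p$ bound is assumed to take the form $\int_{[0,1]^2} |S_N|^p \ll N^{p/2+o(1)}$ for each fixed $4 < p < 6$. Using the trivial pointwise bound $|S_N| \le N+1$,

$$\int_{[0,1]^2} |S_N|^6\, dx\, dy \le (N+1)^{6-p} \int_{[0,1]^2} |S_N|^p\, dx\, dy \ll N^{6-p} \cdot N^{p/2+o(1)} = N^{3 + \frac{6-p}{2} + o(1)}.$$

Since $p$ may be taken arbitrarily close to $6$, the exponent can be brought below $3+\varepsilon$ for any fixed $\varepsilon > 0$, which is the $N^{3+o(1)}$ bound.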

We prove the proposition below the fold, using a simplified form of the methods discussed in the previous blog post. To simplify the exposition we will be a bit cavalier with the uncertainty principle, for instance by essentially ignoring the tails of rapidly decreasing functions.

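Henceforth we fix $4 < p < 6$ and $\eta > 0$, and assume that $P(p,\eta)$ holds. For any interval $I$, let $f_I$ denote the exponential sum

$$f_I(x,y) := \sum_{j \in I \cap \mathbb{Z}} e(jx + j^2 y);$$

this function is periodic with respect to the lattice $\mathbb{Z}^2$ and can thus also be thought of as a function on the torus $\mathbb{R}^2/\mathbb{Z}^2$. The hypothesis $P(p,\eta)$ is then asserting that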

A Galilean rescaling argument (noting that the Galilean transform used lies in $SL_2(\mathbb{Z})$) then shows that

for any interval of length going to infinity as .
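The Galilean rescaling can be made concrete with a small numerical check. This is a sketch under assumptions: it takes the exponential sum over an interval to be $\sum_{j} e(jx+j^2y)$ over the integers $j$ in that interval, and the helper names `e`, `f` and `shifted` are hypothetical. Substituting $j = a + k$ shows that the sum over $\{a,\dots,a+M\}$ equals the phase $e(ax+a^2y)$ times the sum over $\{0,\dots,M\}$ evaluated at $(x+2ay, y)$; for integer $a$ the map $(x,y) \mapsto (x+2ay,y)$ has an integer matrix of determinant one, so it preserves the torus.

```python
import cmath


def e(t):
    """The standard character e(t) = exp(2*pi*i*t)."""
    return cmath.exp(2j * cmath.pi * t)


def f(lo, hi, x, y):
    """Exponential sum over integers lo <= j <= hi of e(j*x + j^2*y)."""
    return sum(e(j * x + j * j * y) for j in range(lo, hi + 1))


def shifted(a, M, x, y):
    """Phase e(a*x + a^2*y) times the sum over [0, M] at the sheared point (x + 2*a*y, y)."""
    return e(a * x + a * a * y) * f(0, M, x + 2 * a * y, y)


# Compare both sides at a few arbitrary sample points.
for (x, y) in [(0.123, 0.456), (0.9, 0.01), (0.5, 0.25)]:
    lhs = f(5, 12, x, y)
    rhs = shifted(5, 7, x, y)
    print(abs(lhs - rhs))
```

Because the phase has modulus one and the shear is measure-preserving on the torus, any $L^p$ norm of the sum over a translated interval agrees with that of the sum over the untranslated one.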

for some . We first observe that it will suffice to show the apparently weaker *bilinear estimate*

whenever are disjoint intervals in that are separated by . Indeed, suppose the bilinear estimate (4) held for all . If we define the quantity

then by decomposing into intervals of length about , with a moderately large natural number, we can use the triangle inequality to bound

By (4), the contribution of those with is . On the other hand, by Hölder’s inequality and affine rescaling, the contribution of the near-diagonal with is . This gives the inequality

and by taking to be a sufficiently large constant (depending on ) and using a trivial bound for small , one can obtain the bound , which gives (3). Thus it suffices to show (4).

Let be as in (4). For any fixed and , we let denote the best constant for which one has the bound

as , where for , ranges over a partition of into intervals of length , and

is the local norm of near , where is the rectangle

(Actually, to make the argument below work rigorously we have to replace the indicator by a smoothed out variant , but to simplify the exposition we shall simply ignore this technical issue.) The function has Fourier support in the rectangle , and so by uncertainty principle heuristics one morally has (ignoring the technical issue alluded to above) a pointwise bound of the form

for any . We will shortly establish the inequality

for any and for any that is sufficiently small depending on ; inserting this bound into (5) for a suitably large and sufficiently small gives the desired bound (4).

It remains to establish (6). This will follow from the following claims.

Proposition 3 For sufficiently small , we have

- (i) (Hölder) The functions and are convex non-increasing in .
- (ii) (Rescaled induction hypothesis) We have .
- (iii) ($L^2$ decoupling) We have .
- (iv) (Bilinear Kakeya) We have .

Let us now see why this proposition implies (6) for all . From the proposition we have

which gives the claim for . To increase , assume that (6) already holds for some value of , then by Proposition 3(iii) we have

for sufficiently small . On the other hand, from (ii) we have . Interpolating using (i) and the hypothesis , we have

for sufficiently small and for some depending only on . Applying (iv) followed by (i) we conclude that (6) holds with replaced by . Iterating this, we can obtain (6) for arbitrarily large , as required.

The claim (i) is an easy application of Hölder’s inequality; we now turn to the more interesting claims (ii), (iii), (iv).

** — 1. Rescaled induction hypothesis — **

To prove (ii), we need to show

where ranges over a partition of into intervals of length , and similarly for . By Hölder’s inequality it suffices to show that

for . Since , we can use Minkowski’s inequality to conclude that

and the claim then follows from (2) (since there are intervals to sum over).

** — 2. $L^2$ decoupling — **

To prove (iii), it will suffice to show that

where the and are partitions of into intervals of length and respectively. This will follow from the pointwise estimates

for any , any and any interval of length (assuming the intervals are nicely nested in some dyadic fashion for simplicity). This expands as

where is a rectangle of dimensions roughly with sides parallel to the coordinate axes. Without the localisation to , this would be immediate from the orthogonality of the . Morally, the localisation to introduces a Fourier uncertainty by a rectangle of dimensions roughly . But the frequencies that the are Fourier supported in are essentially disjoint in even up to this uncertainty, so the global orthogonality of the should localise to the scale of the rectangle . (This can be made rigorous using suitable smoothed approximants to the indicator of , but we omit this technical detail here.)
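The global (unlocalised) orthogonality mentioned above is easy to test numerically. The sketch below assumes the exponential sums have the form $\sum_{j \in I} e(jx+j^2y)$, uses half-open integer intervals `(lo, hi)`, and introduces a hypothetical helper `torus_inner`. It only illustrates the global statement; the localisation to the rectangle is not modelled. The inner product over the torus of the sums attached to two intervals counts their common integers, so sums over disjoint intervals are orthogonal and the squared $L^2$ norms add.

```python
import cmath


def e(t):
    """The standard character e(t) = exp(2*pi*i*t)."""
    return cmath.exp(2j * cmath.pi * t)


def f(lo, hi, x, y):
    """Exponential sum over integers lo <= j < hi of e(j*x + j^2*y)."""
    return sum(e(j * x + j * j * y) for j in range(lo, hi))


def torus_inner(I, J, N):
    """Integral over the torus of f_I * conj(f_J), for intervals inside [0, N).

    The integrand has frequencies of size at most N in x and N^2 in y,
    so a grid average over (N+1) x (N^2+1) points is exact up to rounding.
    """
    Mx, My = N + 1, N * N + 1
    total = 0j
    for a in range(Mx):
        for b in range(My):
            x, y = a / Mx, b / My
            total += f(I[0], I[1], x, y) * f(J[0], J[1], x, y).conjugate()
    return total / (Mx * My)


N = 8
left, right, whole = (0, 4), (4, 8), (0, 8)
print(abs(torus_inner(left, right, N)))    # cross term of two disjoint pieces
print(torus_inner(left, left, N).real,
      torus_inner(right, right, N).real,
      torus_inner(whole, whole, N).real)   # squared norms of the pieces and the whole
```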

** — 3. Bilinear Kakeya — **

To prove (iv), it will suffice to show that

as , where ranges over a partition of into intervals of length . By averaging, it suffices to show that

whenever is a rectangle of dimensions essentially with sides parallel to the axes. If we set , then we morally have

on , and so the estimate will follow if we can show that

(As before, to be rigorous we need to replace the localisation with a smoother weight , but we ignore this technicality here.) We now apply a logarithmic pigeonholing (conceding a factor of ) to restrict to a set in which all the means are comparable to each other, and similarly to restrict to a set where the means are comparable to each other. We can then normalise so that

for all surviving , so it now suffices to show that

Since , we have

for , so it suffices to show that

By the triangle inequality, it suffices to show that

Recall that is a rectangle of dimensions about . As each is an interval of length about , we see from the uncertainty principle that the are essentially constant along parallelograms with a horizontal side of length and a vertical height of that fit inside the rectangle in a certain orientation (depending on the location of ; the slanted side has vertical slope ). Thus the functions also exhibit similar behaviour, and can be essentially written within as

for some non-negative coefficients and some parallelograms of horizontal side and height in . The estimate (7) then takes the form

so it would suffice (since ) to show that

for any parallelograms associated to intervals from respectively. But the transversality of ensures that these parallelograms have vertical slopes that differ by , and the claim follows from simple geometry ( behaves like a parallelogram of horizontal side and height ).

## 25 comments

11 December, 2015 at 10:35 pm

Anonymous

Possible typo: “to reduce to restrict” – 4 lines above (7).

[Corrected, thanks – T.]

12 December, 2015 at 4:18 am

Anonymous

Just under the second display above (5) there is “to make the argument work below rigorously” which is not a good word order; “argument below work rigorously” would be better. In the statement of Proposition 3(i), “Holder” should probably be “Hölder”. The sentence immediately before heading “3. Bilinear Kakeya” about needing to replace 1_B to make things rigorous is not inside parentheses, although every other mention of it appears to be. A few lines above (7) there is “pigenholing”.

[Corrected, thanks -T.]

12 December, 2015 at 8:02 am

Anonymous

Is the three-dimensional case still simpler than the general case?

12 December, 2015 at 12:03 pm

Terence Tao

The iterative portion of the argument, in which all the various component estimates (lower dimensional decoupling, multilinear Kakeya, etc.) are put together to obtain the full dimensional decoupling, is a little simpler in 3D than it is in general; the Bourgain-Demeter-Guth paper spends some time on the 3D (and also 4D) cases of the iteration in order to motivate the general case.

12 December, 2015 at 12:46 pm

Anonymous

Could you explain what you mean by “affine rescaling” after (4)?

12 December, 2015 at 6:40 pm

Terence Tao

The functions are compositions of functions with affine transformations, up to a harmless phase modulation. For instance, if , then .

12 December, 2015 at 7:03 pm

monsieurcactus

I am lost.

13 December, 2015 at 5:36 am

wooley

Two displays below (4), it seems that decomposing into intervals requires some Hölder argument before entering into the -th power of each generating function. So isn’t there an extra factor (squared) here to carry through the argument? This is compensated by A(N/K) in the diagonal part of the argument, though only just, since we assume slightly worse than diagonal behaviour. For the non-diagonal part, one has to win this loss back later.

[Corrected, thanks – T.]

13 December, 2015 at 6:33 am

John Mangual

If intervals are separated by then how can they both fit in ??

Man, I am really fighting with the notation here. I am guessing in your definition of after (4)

Your use of Hölder’s inequality may have to do with how noisy the function is. I failed to verify the exponent of on my computer. It oscillates too rapidly.

13 December, 2015 at 8:17 am

Terence Tao

Thanks for the correction. I am using Vinogradov notation here (appropriately enough, given the subject matter); means “bounded below by for some ”. Thus for instance and would qualify.

It is a result of Blomer and Brüdern (refining a previous work of Rogovskaya) that the integral in Theorem 1 is in fact equal to

but there may well be some oscillation initially for small values of N (for instance, the integral is necessarily an integer). One can view this result as a more complicated variant of Dirichlet’s bound

for the divisor summatory function.

14 December, 2015 at 9:46 am

John Mangual

This Bourgain-Demeter-Guth integrand oscillates a bunch with respect to Here is how the Riemann sum varies with respect to the size of The exact answer was hard to punch into the calculator. Perhaps it’s like 60000. Using numerical integration – literally adding up the value at many points – I got 71000 or so… Which is about the same size as which is 48000

13 December, 2015 at 8:19 pm

anonymous

I don’t see why proposition 2 implies that P(p, eta) is true for all eta > 0.

Maybe P(p, eta) is only true for eta > 1 .. that seems to be consistent with proposition 2.

13 December, 2015 at 8:34 pm

Anonymous

If P(p, eta) holds for all eta > 1 (say) then it holds for eta = 1 using its definition. Once we know it holds for eta = 1 it must hold for some eta < 1.

14 December, 2015 at 3:35 pm

Anonymous

Th 1 also follows from the Strichartz inequality of JB for the Schrodinger eq. on the circle. It is also shown in the JB paper that o(1) cannot be avoided in your Th 1.

23 December, 2015 at 2:13 pm

Edward Spencer

Terence Tao,

When you have a god given gift, why don’t you use it to solve the problems of this world.

If the universe can be described mathematically and you being the most intelligent person to ever live on this planet then surely there must be a mathematical formula for peace.

You just need to find it. All the best.

6 January, 2016 at 11:04 pm

Anonymous

Where does the come from in the expression ?

6 January, 2016 at 11:14 pm

Anonymous

Hölder’s inequality on the second equation after (4) seems to give that

7 January, 2016 at 9:21 am

Terence Tao

You need to restrict first to the region in order to only lose rather than .

16 September, 2016 at 8:01 pm

Zak

I do not understand your claim following proposition 6. It seems like, as stated, there is no reason the infimum has to be 0. In a similar spirit, in equation 6, the fact that the smallness of a depends on W seems troublesome.

In the rescaled induction hypothesis section, the last equality should be an inequality. Also, all that is used is that p is at least 4, not that p is at most six.

Lastly, thank you for this post, very helpful!

16 September, 2016 at 9:34 pm

Terence Tao

Thanks for the correction. As regards Proposition 2, the point is that depends continuously on (as can be seen from inspection of the argument) and so is uniformly bounded away from zero whenever ranges in a compact set. In particular, if approaches a non-zero infimum then will dip below that infimum. (For similar reasons, the quantities and can also be chosen to depend continuously on and so cannot degenerate to infinity or zero as approaches any non-zero value.)

17 September, 2016 at 3:01 pm

Zak

Thanks for the clarification!

I have also a conceptual question. In the proof of bilinear Kakeya, the Fourier support is a power of N and is translated of order N. So on the domain side the modulation induced from the translation is rather fast and seems to take over, not allowing one to show that the original function is constant on the appropriate scale.

17 September, 2016 at 6:54 pm

Terence Tao

There was a typo: it is rather than that behaves like a constant at the indicated scale.

4 October, 2016 at 7:41 am

Zak

Thanks again for your responses!

I was curious if there is a heuristic reason why one should choose the scales to be u and 2u (as in the third and fourth part of Proposition 3). Checking the arguments, I see that is a boundary case.

Also, in case it helps future readers, I believe one can take $\eta' = (p-4)/2 \eta$.

[This appears to be the optimal scale for the “ball inflation” step based on the bilinear Kakeya estimate to work; this is the most nontrivial part of the argument, and seems to be the main engine powering the decoupling phenomenon. -T.]

14 October, 2016 at 9:22 am

Zak

Actually, I believe there is a subtle relation between the limit of ball inflation and orthogonality estimates and the limitations of how small we can take eta, as follows. Let s be the solution to , coming from interpolating (iii) and (iv). Then we need s to be strictly bigger than 1/2, which is the reciprocal of the ball inflation constant. Otherwise we cannot prove for all eta greater than zero. To check this took me several pages of computations; in other words, it’s not obvious to me a priori.