In this blog post, I would like to specialise the arguments of Bourgain, Demeter, and Guth from the previous post to the two-dimensional case of the Vinogradov main conjecture, namely

Theorem 1 (Two-dimensional Vinogradov main conjecture) One has

\displaystyle \int_{[0,1]^2} |\sum_{j=0}^N e( j x + j^2 y)|^6\ dx dy \ll N^{3+o(1)}

as {N \rightarrow \infty}.

This particular case of the main conjecture has a classical proof using some elementary number theory. Indeed, the left-hand side can be viewed as the number of solutions to the system of equations

\displaystyle j_1 + j_2 + j_3 = k_1 + k_2 + k_3

\displaystyle j_1^2 + j_2^2 + j_3^2 = k_1^2 + k_2^2 + k_3^2

with {j_1,j_2,j_3,k_1,k_2,k_3 \in \{0,\dots,N\}}. These two equations can be combined (using the algebraic identity {(a+b-c)^2 - (a^2+b^2-c^2) = 2 (a-c)(b-c)} applied to {(a,b,c) = (j_1,j_2,k_3), (k_1,k_2,j_3)}) to imply the further equation

\displaystyle (j_1 - k_3) (j_2 - k_3) = (k_1 - j_3) (k_2 - j_3)

which, when combined with the divisor bound, shows that each {k_1,k_2,j_3} is associated to {O(N^{o(1)})} choices of {j_1,j_2,k_3} excluding diagonal cases when two of the {j_1,j_2,j_3,k_1,k_2,k_3} collide, and this easily yields Theorem 1. However, the Bourgain-Demeter-Guth argument (which, in the two dimensional case, is essentially contained in a previous paper of Bourgain and Demeter) does not require the divisor bound, and extends for instance to the more general case where {j} ranges in a {1}-separated set of reals between {0} and {N}.
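As a sanity check on the counting interpretation, here is a small brute-force sketch in Python. The function names `count_solutions` and `product_identity_holds` are hypothetical helpers introduced only for this illustration; the code is feasible only for small {N} and says nothing about the asymptotics.

```python
from collections import Counter
from itertools import product


def count_solutions(N):
    """Count (j1,j2,j3,k1,k2,k3) in {0..N}^6 with equal sums and equal sums of squares."""
    # Group triples by (sum, sum of squares); two triples solve the system
    # exactly when they land in the same group.
    groups = Counter(
        (a + b + c, a * a + b * b + c * c)
        for a, b, c in product(range(N + 1), repeat=3)
    )
    return sum(m * m for m in groups.values())


def product_identity_holds(N):
    """Check (j1-k3)(j2-k3) == (k1-j3)(k2-j3) on every solution of the system."""
    triples = list(product(range(N + 1), repeat=3))
    for j1, j2, j3 in triples:
        for k1, k2, k3 in triples:
            same_sum = j1 + j2 + j3 == k1 + k2 + k3
            same_squares = j1**2 + j2**2 + j3**2 == k1**2 + k2**2 + k3**2
            if same_sum and same_squares:
                if (j1 - k3) * (j2 - k3) != (k1 - j3) * (k2 - j3):
                    return False
    return True
```

The count is always at least {(N+1)^3}, since taking {(k_1,k_2,k_3) = (j_1,j_2,j_3)} is always a solution; Theorem 1 asserts that this trivial lower bound is sharp up to a factor of {N^{o(1)}}.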

In this special case, the Bourgain-Demeter argument simplifies, as the lower dimensional inductive hypothesis becomes a simple {L^2} almost orthogonality claim, and the multilinear Kakeya estimate needed is also easy (collapsing to just Fubini’s theorem). Also one can work entirely in the context of the Vinogradov main conjecture, and not turn to the increased generality of decoupling inequalities (though this additional generality is convenient in higher dimensions). As such, I am presenting this special case as an introduction to the Bourgain-Demeter-Guth machinery.

We now give the specialisation of the Bourgain-Demeter argument to Theorem 1. It will suffice to establish the bound

\displaystyle \int_{[0,1]^2} |\sum_{j=0}^N e( j x + j^2 y)|^p\ dx dy \ll N^{p/2+o(1)}

for all {4<p<6} (where we keep {p} fixed and send {N} to infinity), as the {L^6} bound then follows by combining the above bound with the trivial bound {|\sum_{j=0}^N e( j x + j^2 y)| \ll N}. Accordingly, for any {\eta > 0} and {4<p<6}, we let {P(p,\eta)} denote the claim that

\displaystyle \int_{[0,1]^2} |\sum_{j=0}^N e( j x + j^2 y)|^p\ dx dy \ll N^{p/2+\eta+o(1)}

as {N \rightarrow \infty}. Clearly, for any fixed {p}, {P(p,\eta)} holds for some large {\eta}, and it will suffice to establish

Proposition 2 Let {4<p<6}, and let {\eta>0} be such that {P(p,\eta)} holds. Then there exists {0 < \eta' < \eta} (depending continuously on {\eta}) such that {P(p,\eta')} holds.

Indeed, this proposition shows that for {4<p<6}, the infimum of the {\eta} for which {P(p,\eta)} holds is zero.
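For the record, the reduction from the {L^6} bound to the {L^p} bounds with {4<p<6} can be spelled out as follows. Here {S_N} is shorthand introduced only for this display, and the implied constants are allowed to depend on {p}.

```latex
% S_N is shorthand (not used elsewhere in the post) for the exponential sum.
S_N(x,y) := \sum_{j=0}^N e(jx + j^2 y), \qquad \sup_{x,y} |S_N(x,y)| \le N+1.

\int_{[0,1]^2} |S_N|^6 \, dx\, dy
  \le \Big( \sup_{x,y} |S_N| \Big)^{6-p} \int_{[0,1]^2} |S_N|^p \, dx\, dy
  \ll N^{6-p} \cdot N^{p/2 + o(1)}
  = N^{3 + (6-p)/2 + o(1)}.
```

Since {(6-p)/2} can be made smaller than any fixed {\varepsilon > 0} by taking {p} close to {6}, this gives the exponent {3+o(1)} claimed in Theorem 1.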

We prove the proposition below the fold, using a simplified form of the methods discussed in the previous blog post. To simplify the exposition we will be a bit cavalier with the uncertainty principle, for instance by essentially ignoring the tails of rapidly decreasing functions.

Henceforth we fix {4 < p < 6} and {\eta > 0}, and assume that {P(p,\eta)} holds. For any interval {I}, let {f_I: {\bf R}^2 \rightarrow {\bf C}} denote the exponential sum

\displaystyle f_I(x,y) := \sum_{j \in I \cap {\bf Z}} e(jx + j^2 y);

this function is periodic with respect to the lattice {{\bf Z}^2} and can thus also be thought of as a function on the torus {({\bf R}/{\bf Z})^2}. The hypothesis {P(p,\eta)} is then asserting that

\displaystyle \int_{({\bf R}/{\bf Z})^2} |f_{[0,N]}(x,y)|^p\ dx dy \ll N^{p/2+\eta+o(1)}. \ \ \ \ \ (1)


A Galilean rescaling argument (noting that the Galilean transform used lies in {SL_2({\bf Z})}) then shows that

\displaystyle \int_{({\bf R}/{\bf Z})^2} |f_I(x,y)|^p\ dx dy \ll |I|^{p/2+\eta+o(1)} \ \ \ \ \ (2)


for any interval {I} of length going to infinity as {N \rightarrow \infty}.
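The Galilean rescaling rests on the identity {|f_{[a,a+L]}(x,y)| = |f_{[0,L]}(x+2ay,y)|} for integer {a}, which follows from expanding {(a+h)^2 = a^2 + 2ah + h^2}; the shear {(x,y) \mapsto (x+2ay,y)} has integer entries and determinant one, so it preserves the torus and its measure. A minimal numerical sketch of this identity follows (the helper names `e`, `f` and `galilean_gap` are hypothetical and introduced only here).

```python
import cmath
import math


def e(t):
    """The standard character e(t) = exp(2*pi*i*t)."""
    return cmath.exp(2j * math.pi * t)


def f(a, b, x, y):
    """Exponential sum f_I(x, y) over the integers j in I = [a, b]."""
    return sum(e(j * x + j * j * y) for j in range(a, b + 1))


def galilean_gap(a, length, x, y):
    """Difference between |f_{[a,a+length]}(x,y)| and |f_{[0,length]}(x+2ay,y)|.

    For integer a this should vanish up to floating-point rounding.
    """
    shifted = abs(f(a, a + length, x, y))
    base = abs(f(0, length, x + 2 * a * y, y))
    return abs(shifted - base)
```

For instance, `galilean_gap(7, 20, 0.123, 0.456)` should come out at the level of rounding error. The check is pointwise; the bound (2) then follows by integrating over the torus and changing variables.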

Our task is to show that

\displaystyle \int_{({\bf R}/{\bf Z})^2} |f_{[0,N]}(x,y)|^p\ dx dy \ll N^{p/2+\eta'+o(1)} \ \ \ \ \ (3)


for some {0 < \eta' < \eta}. We first observe that it will suffice to show the apparently weaker bilinear estimate

\displaystyle \int_{({\bf R}/{\bf Z})^2} |f_{I_1}(x,y)|^{p/2} |f_{I_2}(x,y)|^{p/2}\ dx dy \ll N^{p/2+\eta'+o(1)} \ \ \ \ \ (4)


whenever {I_1, I_2} are disjoint intervals in {[0,N]} that are separated by {\gg N}. Indeed, suppose the bilinear estimate (4) held for all {N}. If we define the quantity

\displaystyle A(N) := N^{-p/2} \int_{({\bf R}/{\bf Z})^2} |f_{[0,N]}(x,y)|^p\ dx dy

then by decomposing {[0,N]} into {K} intervals {I_1,\dots,I_K} of length about {N/K}, with {K} a moderately large natural number, we can use the triangle inequality to bound

\displaystyle A(N)\leq N^{-p/2} \int_{({\bf R}/{\bf Z})^2} (\sum_{1 \leq i,j \leq K} |f_{I_i}(x,y)||f_{I_j}(x,y)|)^{p/2}\ dx dy

By (4), the contribution of those {i,j} with {|i-j| > 1} is {O_K( N^{\eta'+o(1)} )}. On the other hand, by Hölder’s inequality and affine rescaling, the contribution of the near-diagonal {i,j} with {|i-j| \leq 1} is {O( K^{p/2-1} K K^{-p/2} A(N/K) )}. This gives the inequality

\displaystyle A(N) \ll A(N/K) + O_K( N^{\eta' + o(1)} )

and by taking {K} to be a sufficiently large constant (depending on {\eta'}) and using a trivial bound {A(N) \ll N^{O(1)}} for small {N}, one can obtain the bound {A(N) \ll N^{\eta'+o(1)}}, which gives (3). Thus it suffices to show (4).
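The iteration in this last step can be sketched as follows. This is only a heuristic bookkeeping of exponents: it assumes the recursion holds with a constant {C} independent of {K} (as the notation {O(\cdot)} versus {O_K(\cdot)} above suggests), and it ignores uniformity issues in the {o(1)} errors.

```latex
% Assumed form of the recursion, with C independent of K:
A(N) \le C\, A(N/K) + C_K N^{\eta' + o(1)}.

% Iterate m times, where K^m \sim N, so that m = \log N / \log K:
A(N) \le C^m A(N/K^m) + C_K \sum_{i=0}^{m-1} C^i \, (N/K^i)^{\eta' + o(1)}.

% First term: C^m = N^{\log C / \log K} \le N^{\eta'} once K \ge C^{1/\eta'},
% and A(N/K^m) = O(1) by the trivial bound at bounded scales.
% Second term: C^i K^{-i \eta'} \le 1 for the same choice of K, so each of the
% m = O(\log N) summands is at most N^{\eta' + o(1)}.
A(N) \ll N^{\eta' + o(1)}.
```

This is the sense in which {K} needs to be large depending on {\eta'}: the loss {C} at each step of the iteration has to be absorbed by the gain {K^{\eta'}}.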

Let {I_1, I_2} be as in (4). For any fixed {0 \leq q \leq s \leq 1} and {2 \leq t \leq 6}, we let {a_t(q,s)} denote the best constant for which one has the bound

\displaystyle \int_{({\bf R}/{\bf Z})^2} |\sum_{J_1} F_{t,s,J_1}(x,y)^2|^{p/4} |\sum_{J_2} F_{t,s,J_2}(x,y)^2|^{p/4}\ dx dy

\displaystyle \ll N^{p/2 + a_t(q,s) + o(1)}

as {N \rightarrow \infty}, where for {j=1,2}, {J_j} ranges over a partition of {I_j} into intervals of length {\sim N^{1-q}}, and

\displaystyle F_{t,s,J}(x_0,y_0) := (\frac{1}{|B_{x_0,y_0,s}|} \int_{B_{x_0,y_0,s}} |f_J(x,y)|^t \ dx dy)^{1/t}

is the local {L^t} norm of {f_J} near {(x_0,y_0)}, where {B_{x_0,y_0,s}} is the rectangle

\displaystyle B_{x_0,y_0,s} := [x_0-N^{-1+s}, x_0+N^{-1+s}] \times [y_0-N^{-2+s}, y_0+N^{-2+s}].

(Actually, to make the argument below work rigorously we have to replace the indicator {1_{B_{x_0,y_0,s}}} by a smoothed out variant {w_{B_{x_0,y_0,s}}}, but to simplify the exposition we shall simply ignore this technical issue.) The function {f_{I_j}} has Fourier support in the rectangle {[0,N] \times [0,N^2]}, and so by uncertainty principle heuristics one morally has (ignoring the technical issue alluded to above) a pointwise bound of the form

\displaystyle |f_{I_j}(x,y)| \ll N^{2s} \sum_{J_j} F_{t,s,J_j}(x,y)

which leads to the bound

\displaystyle \int_{({\bf R}/{\bf Z})^2} |f_{I_1}(x,y)|^{p/2} |f_{I_2}(x,y)|^{p/2}\ dx dy \ll N^{p/2+a_t(q,s) + O(q+s)+o(1)} \ \ \ \ \ (5)


for any {0 \leq q \leq s \leq 1}. We will shortly establish the inequality

\displaystyle a_2(u,u) \leq (1-Wu) \eta \ \ \ \ \ (6)


for any {W>0} and for any {u>0} that is sufficiently small depending on {W}; inserting this bound into (5) for a suitably large {W} and sufficiently small {u} gives the desired bound (4).

It remains to establish (6). This will follow from the following claims.

Proposition 3 For sufficiently small {u}, we have

  • (i) (Hölder) The functions {t \mapsto a_t(u,u)} and {t \mapsto a_t(u,2u)} are convex non-increasing in {1/t}.
  • (ii) (Rescaled induction hypothesis) We have {a_p(u,2u) \leq (1-u) \eta}.
  • (iii) ({L^2} decoupling) We have {a_2(u,2u) \leq a_2(2u, 2u)}.
  • (iv) (Bilinear Kakeya) We have {a_{p/2}(u,u) \leq a_{p/2}(u,2u)}.

Let us now see why this proposition implies (6) for all {W>0}. From the proposition we have

\displaystyle a_2(u,u) \leq a_{p/2}(u,u) \leq a_{p/2}(u,2u) \leq a_p(u,2u) \leq (1-u) \eta

which gives the claim for {W = 1}. To increase {W}, assume that (6) already holds for some value of {W}. Then by Proposition 3(iii), followed by (6) with {u} replaced by {2u}, we have

\displaystyle a_2(u,2u) \leq a_2(2u,2u) \leq (1-2Wu)\eta

for sufficiently small {u}. On the other hand, from (ii) we have {a_p(u,2u) \leq \eta}. Interpolating using (i) and the hypothesis {4 < p < 6}, we have

\displaystyle a_{p/2}(u,2u) \leq (1 - (1+\delta)W u) \eta

for sufficiently small {u} and for some {\delta>0} depending only on {p}. Applying (iv) followed by (i) we conclude that (6) holds with {W} replaced by {(1+\delta) W}. Iterating this, we can obtain (6) for arbitrarily large {W}, as required.
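The interpolation step, and the resulting value of {\delta}, can be made explicit. The symbol {\theta} below is introduced only for this computation.

```latex
% Write 2/p as a convex combination of 1/2 and 1/p:
\frac{2}{p} = \theta \cdot \frac{1}{2} + (1-\theta) \cdot \frac{1}{p},
\qquad \theta = \frac{2}{p-2} \in (0,1) \quad \text{since } p > 4.

% Convexity of t \mapsto a_t(u,2u) in 1/t, from Proposition 3(i):
a_{p/2}(u,2u) \le \theta\, a_2(u,2u) + (1-\theta)\, a_p(u,2u)
  \le \theta (1 - 2Wu)\eta + (1-\theta)\eta
  = \Big( 1 - \frac{4}{p-2} W u \Big) \eta.

% Hence
1 + \delta = \frac{4}{p-2}, \qquad \delta = \frac{6-p}{p-2} > 0 \quad \text{since } p < 6.
```

Both halves of the hypothesis {4 < p < 6} are used here: {p > 4} makes {\theta} a legitimate interpolation weight, and {p < 6} makes the gain {\delta} positive. For example, {p = 5} gives {\theta = 2/3} and {\delta = 1/3}.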

The claim (i) is an easy application of Hölder’s inequality; we now turn to the more interesting claims (ii), (iii), (iv).

— 1. Rescaled induction hypothesis —

To prove (ii), we need to show

\displaystyle \int_{({\bf R}/{\bf Z})^2} |\sum_{J_1} F_{p,2u,J_1}(x,y)^2|^{p/4} |\sum_{J_2} F_{p,2u,J_2}(x,y)^2|^{p/4}\ dx dy

\displaystyle \ll N^{p/2 + (1-u)\eta + o(1)}

where {J_1} ranges over a partition of {I_1} into intervals of length {N^{1-u}}, and similarly for {J_2}. By Hölder’s inequality it suffices to show that

\displaystyle \int_{({\bf R}/{\bf Z})^2} |\sum_{J_j} F_{p,2u,J_j}(x,y)^2|^{p/2}\ dx dy \ll N^{p/2 + (1-u)\eta + o(1)}

for {j=1,2}. Since {4 < p < 6}, we can use Minkowski’s inequality to conclude that

\displaystyle (\int_{({\bf R}/{\bf Z})^2} |\sum_{J_j} F_{p,2u,J_j}(x,y)^2|^{p/2}\ dx dy)^{2/p}

\displaystyle \leq \sum_{J_j} (\int_{({\bf R}/{\bf Z})^2} |F_{p,2u,J_j}(x,y)|^p\ dx dy)^{2/p}

\displaystyle \leq \sum_{J_j} (\int_{({\bf R}/{\bf Z})^2} |f_{J_j}(x,y)|^p\ dx dy)^{2/p}

and the claim then follows from (2) (since there are {O(N^u)} intervals {J_j} to sum over).
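The exponent count in this last step runs as follows, using (2) with {|J_j| \sim N^{1-u}}.

```latex
% Each summand, by (2) applied to an interval of length N^{1-u}:
\Big( \int_{({\bf R}/{\bf Z})^2} |f_{J_j}|^p \, dx\, dy \Big)^{2/p}
  \ll \big( N^{(1-u)(p/2 + \eta) + o(1)} \big)^{2/p}
  = N^{(1-u)(1 + 2\eta/p) + o(1)}.

% Summing over the O(N^u) intervals J_j:
\sum_{J_j} \Big( \int_{({\bf R}/{\bf Z})^2} |f_{J_j}|^p \, dx\, dy \Big)^{2/p}
  \ll N^{u + (1-u)(1 + 2\eta/p) + o(1)}
  = N^{1 + (1-u)\, 2\eta/p + o(1)}.

% Raising to the power p/2:
\int_{({\bf R}/{\bf Z})^2} \Big| \sum_{J_j} F_{p,2u,J_j}^2 \Big|^{p/2} dx\, dy
  \ll N^{p/2 + (1-u)\eta + o(1)}.
```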

— 2. {L^2} decoupling —

To prove (iii), it will suffice to show that

\displaystyle \int_{({\bf R}/{\bf Z})^2} |\sum_{J_1} F_{2,2u,J_1}(x,y)^2|^{p/4} |\sum_{J_2} F_{2,2u,J_2}(x,y)^2|^{p/4}\ dx dy

\displaystyle \ll \int_{({\bf R}/{\bf Z})^2} |\sum_{K_1} F_{2,2u,K_1}(x,y)^2|^{p/4} |\sum_{K_2} F_{2,2u,K_2}(x,y)^2|^{p/4}\ dx dy

where the {J_j} and {K_j} are partitions of {I_j} into intervals of length {\sim N^{1-u}} and {\sim N^{1-2u}} respectively. This will follow from the pointwise estimates

\displaystyle F_{2,2u,J_j}(x_0,y_0)^2 \ll \sum_{K_j: K_j \subset J_j} F_{2,2u,K_j}(x_0,y_0)^2

for any {j=1,2}, any {x_0,y_0} and any interval {J_j} of length {\sim N^{1-u}} (assuming the intervals are nicely nested in some dyadic fashion for simplicity). This expands as

\displaystyle \int_{B} |\sum_{K_j \subset J_j} f_{K_j}(x,y)|^2\ dx dy \ll \sum_{K_j \subset J_j} \int_{B} |f_{K_j}(x,y)|^2 \ dx dy,

where {B} is a rectangle of dimensions roughly {N^{-1+2u} \times N^{-2+2u}} with sides parallel to the coordinate axes. Without the localisation to {B}, this would be immediate from the orthogonality of the {f_{K_j}}. Morally, the localisation to {B} introduces a Fourier uncertainty by a rectangle of dimensions roughly {N^{1-2u} \times N^{2-2u}}. But the frequencies {\{ (k,k^2): k \in K_j \}} that the {f_{K_j}} are Fourier supported in are essentially disjoint as {K_j} varies, even up to this uncertainty, so the global orthogonality of the {f_{K_j}} should localise to the scale of the rectangle {B}. (This can be made rigorous using suitable smoothed approximants to the indicator {1_B}, but we omit this technical detail here.)
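The global orthogonality mentioned above (without the localisation to {B}) is easy to test numerically. The sketch below averages {|f_I|^2} over a uniform {M \times M} grid in the torus, which reproduces the torus integral up to rounding provided {M} exceeds the length of {I}; the names `f` and `torus_mean_square` are hypothetical helpers for this illustration. It does not test the localised estimate, which is the real content of (iii).

```python
import cmath
import math


def f(a, b, x, y):
    """Exponential sum over the integers j in [a, b] of e(j*x + j^2*y)."""
    return sum(cmath.exp(2j * math.pi * (j * x + j * j * y)) for j in range(a, b + 1))


def torus_mean_square(a, b, M):
    """Average of |f_{[a,b]}|^2 over the M x M grid {(k/M, l/M)} in the torus.

    Assumes M > b - a, so that distinct frequencies stay distinct modulo M
    and the grid average agrees with the integral over the torus.
    """
    total = 0.0
    for k in range(M):
        for l in range(M):
            total += abs(f(a, b, k / M, l / M)) ** 2
    return total / (M * M)
```

With {I = [0,7]} split into {[0,3]} and {[4,7]}, the value `torus_mean_square(0, 7, 9)` should agree with `torus_mean_square(0, 3, 9) + torus_mean_square(4, 7, 9)`, both being equal to the number of integers in {I}.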

— 3. Bilinear Kakeya —

To prove (iv), it will suffice to show that

\displaystyle \int_{({\bf R}/{\bf Z})^2} |\sum_{J_1} F_{p/2,u,J_1}(x,y)^2|^{p/4} |\sum_{J_2} F_{p/2,u,J_2}(x,y)^2|^{p/4}\ dx dy

\displaystyle \ll N^{o(1)} \int_{({\bf R}/{\bf Z})^2} |\sum_{J_1} F_{p/2,2u,J_1}(x,y)^2|^{p/4} |\sum_{J_2} F_{p/2,2u,J_2}(x,y)^2|^{p/4}\ dx dy

as {N \rightarrow \infty}, where {J_j} ranges over a partition of {I_j} into intervals of length {\sim N^{1-u}}. By averaging, it suffices to show that

\displaystyle \int_B |\sum_{J_1} F_{p/2,u,J_1}(x,y)^2|^{p/4} |\sum_{J_2} F_{p/2,u,J_2}(x,y)^2|^{p/4}\ dx dy

\displaystyle \ll N^{o(1)} \int_B |\sum_{J_1} F_{p/2,2u,J_1}(x,y)^2|^{p/4} |\sum_{J_2} F_{p/2,2u,J_2}(x,y)^2|^{p/4}\ dx dy

whenever {B} is a rectangle of dimensions essentially {N^{-1+2u} \times N^{-2+2u}} with sides parallel to the axes. If we set {G_{J_j} := F_{p/2,u,J_j}^{p/2}}, then we morally have

\displaystyle F_{p/2,2u,J_j}(x,y)^{p/2} \sim |B|^{-1} \int_B G_{J_j}

on {B}, and so the estimate will follow if we can show that

\displaystyle |B|^{-1} \int_B |\sum_{J_1} G_{J_1}^{4/p}|^{p/4} |\sum_{J_2} G_{J_2}^{4/p}|^{p/4}

\displaystyle \ll N^{o(1)} (\sum_{J_1} (|B|^{-1} \int_B G_{J_1})^{4/p})^{p/4} (\sum_{J_2} (|B|^{-1} \int_B G_{J_2})^{4/p})^{p/4}.

(As before, to be rigorous we need to replace the {1_B} localisation with a smoother weight {w_B}, but we ignore this technicality here.) We now apply a logarithmic pigeonholing (conceding a factor of {N^{o(1)}}) to restrict {J_1} to a set {{\mathcal J}_1} in which all the means {|B|^{-1} \int_B G_{J_1}} are comparable to each other, and similarly to restrict {J_2} to a set {{\mathcal J}_2} where the means {|B|^{-1} \int_B G_{J_2}} are comparable to each other. We can then normalise so that

\displaystyle \int_B G_{J} \sim |B| \ \ \ \ \ (7)


for all surviving {J}, so it now suffices to show that

\displaystyle |B|^{-1} \int_B |\sum_{J_1 \in {\mathcal J}_1} G_{J_1}^{4/p}|^{p/4} |\sum_{J_2 \in {\mathcal J}_2} G_{J_2}^{4/p}|^{p/4}

\displaystyle \ll (\# {\mathcal J}_1)^{p/4} (\# {\mathcal J}_2)^{p/4}.

Since {p>4}, we have

\displaystyle |\sum_{J_j} G_{J_j}^{4/p}|^{p/4} \leq (\# {\mathcal J}_j)^{p/4 - 1} \sum_{J_j} G_{J_j}

for {j=1,2}, so it suffices to show that

\displaystyle |B|^{-1} \int_B (\sum_{J_1 \in {\mathcal J}_1} G_{J_1}) (\sum_{J_2 \in {\mathcal J}_2} G_{J_2})

\displaystyle \ll (\# {\mathcal J}_1) (\# {\mathcal J}_2).

By the triangle inequality, it suffices to show that

\displaystyle \int_B G_{J_1} G_{J_2} \ll |B|.

Recall that {B} is a rectangle of dimensions about {N^{-1+2u} \times N^{-2+2u}}. As each {J_j} is an interval of length about {N^{1-u}}, we see from the uncertainty principle that the {|f_{J_j}|} are essentially constant along parallelograms with a horizontal side of length {N^{-1+u}} and a vertical height of {N^{-2+2u}} that fit inside the rectangle {B} in a certain orientation (depending on the location of {J_j}; the slanted side has vertical slope {O(N)}). Thus the functions {G_{J_j}} also exhibit similar behaviour, and can be essentially written within {B} as

\displaystyle G_{J_j} \sim \sum_{P_j} c_{P_j} 1_{P_j}

for some non-negative coefficients {c_{P_j}} and some parallelograms of horizontal side {N^{-1+u}} and height {N^{-2+2u}} in {B}. The estimate (7) then takes the form

\displaystyle \sum_{P_j \subset B} c_{P_j} \ll N^{u}

so it would suffice (since {|B| \sim N^{-3+4u}}) to show that

\displaystyle |P_1 \cap P_2| \ll N^{-3+2u}

for any parallelograms {P_1, P_2} associated to intervals from {{\mathcal J}_1, {\mathcal J}_2} respectively. But the transversality of {I_1,I_2} ensures that these parallelograms {P_1,P_2} have vertical slopes that differ by {\sim N}, and the claim follows from simple geometry ({P_1 \cap P_2} behaves like a parallelogram of horizontal side {N^{-1+u}} and height {N^{-2+u}}).
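The "simple geometry" can be sketched as follows, under the assumed parametrisation that the slice of {P_i} at height {y} is an interval of length {w := N^{-1+u}} centred at {c_i - s_i y}, where {s_i} is the vertical slope and {|s_1 - s_2| \sim N}. The symbols {w, c_i, s_i} are introduced only for this computation.

```latex
% The horizontal slices of P_1 and P_2 at height y overlap in length at most
% max(0, w - |(c_1 - c_2) - (s_1 - s_2) y|); integrate this in y:
|P_1 \cap P_2|
  \le \int_{\bf R} \max\big( 0,\; w - |(c_1 - c_2) - (s_1 - s_2) y| \big) \, dy
  = \frac{w^2}{|s_1 - s_2|}
  \sim \frac{N^{-2+2u}}{N} = N^{-3+2u}.

% Combining with \sum_{P_j \subset B} c_{P_j} \ll N^u for j = 1, 2:
\int_B G_{J_1} G_{J_2}
  \lesssim \sum_{P_1} \sum_{P_2} c_{P_1} c_{P_2} |P_1 \cap P_2|
  \ll N^{u} \cdot N^{u} \cdot N^{-3+2u} = N^{-3+4u} \sim |B|.
```

The overlap is nonempty only for heights {y} in an interval of length about {w/|s_1 - s_2| \sim N^{-2+u}}, which is the height of the small parallelogram described above.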