You are currently browsing the monthly archive for December 2015.
Over on the polymath blog, I’ve posted (on behalf of Dinesh Thakur) a new polymath proposal, which is to explain some numerically observed identities involving the irreducible polynomials $P$ in the polynomial ring $\mathbf{F}_2[t]$ over the finite field of characteristic two, the simplest of which is
$$\sum_P \frac{1}{1+P} = 0$$
(expanded in terms of Taylor series in $u = 1/t$). Comments on the problem should be placed in the polymath blog post; if there is enough interest, we can start a formal polymath project on it.
In this blog post, I would like to specialise the arguments of Bourgain, Demeter, and Guth from the previous post to the two-dimensional case of the Vinogradov main conjecture, namely
Theorem 1 (Two-dimensional Vinogradov main conjecture) One has
$$\int_{[0,1]^2} \left|\sum_{j=0}^N e(jx + j^2 y)\right|^6\, dx\, dy \ll N^{3+o(1)}$$
as $N \to \infty$.
This particular case of the main conjecture has a classical proof using some elementary number theory. Indeed, the left-hand side can be viewed as the number of solutions to the system of equations
$$j_1 + j_2 + j_3 = k_1 + k_2 + k_3$$
$$j_1^2 + j_2^2 + j_3^2 = k_1^2 + k_2^2 + k_3^2$$
with $j_1,j_2,j_3,k_1,k_2,k_3 \in \{0,\dots,N\}$. These two equations can be combined (using the algebraic identity
$$(a+b-c)^2 - (a^2+b^2-c^2) = 2(a-c)(b-c)$$
applied to $(a,b,c) = (j_1,j_2,k_3)$ and to $(a,b,c) = (k_1,k_2,j_3)$) to imply the further equation
$$(j_1-k_3)(j_2-k_3) = (k_1-j_3)(k_2-j_3)$$
which, when combined with the divisor bound, shows that each $(k_1,k_2,j_3)$ is associated to $N^{o(1)}$ choices of $(j_1,j_2,k_3)$, excluding diagonal cases when two of the $j_1,j_2,j_3,k_1,k_2,k_3$ collide, and this easily yields Theorem 1. However, the Bourgain-Demeter-Guth argument (which, in the two-dimensional case, is essentially contained in a previous paper of Bourgain and Demeter) does not require the divisor bound, and extends for instance to the more general case where $j$ ranges in a $1$-separated set of reals between $0$ and $N$.
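As a quick numerical sanity check on the solution-counting interpretation, the following Python sketch counts the sextuples $(j_1,j_2,j_3,k_1,k_2,k_3)$ with $j_1+j_2+j_3 = k_1+k_2+k_3$ and $j_1^2+j_2^2+j_3^2 = k_1^2+k_2^2+k_3^2$ by bucketing triples according to their power sums. The function name `count_solutions` is purely illustrative, and taking the variables in $\{0,\dots,N\}$ is an assumption of this sketch; it is a brute-force experiment, not part of the argument.

```python
from collections import Counter
from itertools import product


def count_solutions(N):
    """Count sextuples in {0,...,N}^6 whose first three and last three
    entries have the same sum and the same sum of squares."""
    # r[(s1, s2)] = number of triples (a, b, c) with a + b + c = s1 and
    # a^2 + b^2 + c^2 = s2; the number of solutions is the sum of r^2.
    r = Counter((a + b + c, a * a + b * b + c * c)
                for a, b, c in product(range(N + 1), repeat=3))
    return sum(v * v for v in r.values())


if __name__ == "__main__":
    for N in (4, 8, 16, 32):
        J = count_solutions(N)
        # If the count is N^{3+o(1)}, this ratio should grow very slowly.
        print(N, J, J / N ** 3)
```

The diagonal solutions alone (where the $k$'s are a permutation of the $j$'s) already contribute at least $(N+1)^3$, so the interest lies in how slowly the ratio in the last column grows.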
In this special case, the Bourgain-Demeter argument simplifies, as the lower dimensional inductive hypothesis becomes a simple almost orthogonality claim, and the multilinear Kakeya estimate needed is also easy (collapsing to just Fubini’s theorem). Also one can work entirely in the context of the Vinogradov main conjecture, and not turn to the increased generality of decoupling inequalities (though this additional generality is convenient in higher dimensions). As such, I am presenting this special case as an introduction to the Bourgain-Demeter-Guth machinery.
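We now give the specialisation of the Bourgain-Demeter argument to Theorem 1. It will suffice to establish the bound
$$\int_{[0,1]^2} \left|\sum_{j=0}^N e(jx + j^2 y)\right|^p\, dx\, dy \ll N^{p/2+o(1)}$$
for all $4 < p < 6$ (where we keep $p$ fixed and send $N$ to infinity), as the $L^6$ bound then follows by combining the above bound with the trivial bound $|\sum_{j=0}^N e(jx + j^2 y)| \ll N$. Accordingly, for any $\eta > 0$ and $4 < p < 6$, we let $P(p,\eta)$ denote the claim that
$$\int_{[0,1]^2} \left|\sum_{j=0}^N e(jx + j^2 y)\right|^p\, dx\, dy \ll N^{p/2+\eta+o(1)}$$
as $N \to \infty$. Clearly, for any fixed $p$, $P(p,\eta)$ holds for some large $\eta$, and it will suffice to establish
Proposition 2 Let $4 < p < 6$, and let $\eta > 0$ be such that $P(p,\eta)$ holds. Then there exists $0 < \eta' < \eta$ (with $\eta'$ depending continuously on $\eta$) such that $P(p,\eta')$ holds.
Indeed, this proposition shows that for $4 < p < 6$, the infimum of the $\eta$ for which $P(p,\eta)$ holds is zero.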
We prove the proposition below the fold, using a simplified form of the methods discussed in the previous blog post. To simplify the exposition we will be a bit cavalier with the uncertainty principle, for instance by essentially ignoring the tails of rapidly decreasing functions.
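Given any finite collection of elements $(f_\alpha)_{\alpha \in A}$ in some Banach space $X$, the triangle inequality tells us that
$$\left\|\sum_{\alpha \in A} f_\alpha\right\|_X \leq \sum_{\alpha \in A} \|f_\alpha\|_X.$$
However, when the $f_\alpha$ all “oscillate in different ways”, one expects to improve substantially upon the triangle inequality. For instance, if $X$ is a Hilbert space and the $f_\alpha$ are mutually orthogonal, we have the Pythagorean theorem
$$\left\|\sum_{\alpha \in A} f_\alpha\right\|_X = \left(\sum_{\alpha \in A} \|f_\alpha\|_X^2\right)^{1/2}.$$
For sake of comparison, from the triangle inequality and Cauchy-Schwarz one has the general inequality
$$\left\|\sum_{\alpha \in A} f_\alpha\right\|_X \leq |A|^{1/2} \left(\sum_{\alpha \in A} \|f_\alpha\|_X^2\right)^{1/2}$$
for any finite collection $(f_\alpha)_{\alpha \in A}$ in any Banach space $X$, where $|A|$ denotes the cardinality of $A$. Thus orthogonality in a Hilbert space yields “square root cancellation”, saving a factor of $|A|^{1/2}$ or so over the trivial bound coming from the triangle inequality.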
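To make the gain from orthogonality concrete, here is a small numerical sketch in Python. It assumes, purely for illustration, that the Hilbert space is $\mathbf{C}^n$ and that the collection consists of the $n$ characters $x \mapsto e(kx/n)$, which are mutually orthogonal; the helper names `norm`, `characters` and `compare` are hypothetical. The three returned quantities are the norm of the sum, the square root of the sum of the squared norms, and the sum of the norms.

```python
import cmath
import math


def norm(v):
    """Euclidean (Hilbert space) norm of a complex vector."""
    return math.sqrt(sum(abs(z) ** 2 for z in v))


def characters(n):
    """The n mutually orthogonal vectors f_k(x) = e(kx/n), x = 0..n-1."""
    return [[cmath.exp(2j * math.pi * k * x / n) for x in range(n)]
            for k in range(n)]


def compare(fs):
    """Return (norm of the sum, l^2 sum of the norms, l^1 sum of the norms)."""
    total = [sum(col) for col in zip(*fs)]
    norms = [norm(f) for f in fs]
    return norm(total), math.sqrt(sum(t * t for t in norms)), sum(norms)


if __name__ == "__main__":
    print(compare(characters(16)))
```

For $n = 16$ the first two quantities agree (both are $16$ up to rounding), while the triangle inequality only gives $64$, a loss of a factor of $\sqrt{n} = 4$.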
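More generally, let us somewhat informally say that a collection $(f_\alpha)_{\alpha \in A}$ exhibits decoupling in $X$ if one has the Pythagorean-like inequality
$$\left\|\sum_{\alpha \in A} f_\alpha\right\|_X \ll_\varepsilon |A|^\varepsilon \left(\sum_{\alpha \in A} \|f_\alpha\|_X^2\right)^{1/2}$$
for any $\varepsilon > 0$, thus one obtains almost the full square root cancellation in the $X$ norm. The theory of almost orthogonality can then be viewed as the theory of decoupling in Hilbert spaces such as $L^2(\mathbf{R}^n)$. In $L^p$ spaces for $p < 2$ one usually does not expect this sort of decoupling; for instance, if the $f_\alpha$ are disjointly supported one has
$$\left\|\sum_{\alpha \in A} f_\alpha\right\|_{L^p} = \left(\sum_{\alpha \in A} \|f_\alpha\|_{L^p}^p\right)^{1/p}$$
and the right-hand side can be much larger than $(\sum_{\alpha \in A} \|f_\alpha\|_{L^p}^2)^{1/2}$ when $p < 2$. At the opposite extreme, one usually does not expect to get decoupling in $L^\infty$, since one could conceivably align the $f_\alpha$ to all attain a maximum magnitude at the same location with the same phase, at which point the triangle inequality in $L^\infty$ becomes sharp.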
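The two failure modes just described can be seen in a toy computation. The Python sketch below assumes, for illustration only, that the functions live on a finite set of $n$ points with counting measure: `disjoint_ratio` takes the $f_\alpha$ to be indicators of distinct points, and `aligned_sup_ratio` takes all the $f_\alpha$ equal to the constant function $1$. Both function names are hypothetical. Each returns the ratio of the norm of the sum to the square root of the sum of the squared norms, so decoupling would require this ratio to grow slower than any power of $n$.

```python
def lp_norm(v, p):
    """The l^p norm of a finite real sequence, for finite p >= 1."""
    return sum(abs(t) ** p for t in v) ** (1.0 / p)


def disjoint_ratio(n, p):
    """Ratio for n indicator functions of distinct points, in l^p."""
    fs = [[1.0 if x == a else 0.0 for x in range(n)] for a in range(n)]
    total = [sum(col) for col in zip(*fs)]
    lhs = lp_norm(total, p)
    rhs = sum(lp_norm(f, p) ** 2 for f in fs) ** 0.5
    return lhs / rhs


def aligned_sup_ratio(n):
    """Ratio for n copies of the constant function 1, in the sup norm."""
    lhs = float(n)          # sup norm of the sum of n copies of 1
    rhs = float(n) ** 0.5   # each copy has sup norm 1
    return lhs / rhs


if __name__ == "__main__":
    for p in (1.0, 1.5, 2.0):
        print(p, disjoint_ratio(100, p))
    print("sup norm:", aligned_sup_ratio(100))
```

With $n = 100$ and $p = 1$ the first ratio is $100^{1/2} = 10$, and in general it is $n^{1/p - 1/2}$, which is a genuine power of $n$ exactly when $p < 2$.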
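However, in some cases one can get decoupling for certain $2 < p < \infty$. For instance, suppose we are in $L^4$, and that $f_1,\dots,f_N$ are bi-orthogonal in the sense that the products $f_i f_j$ for $1 \leq i < j \leq N$ are pairwise orthogonal in $L^2$. Then we have
$$\left\|\sum_{i=1}^N f_i\right\|_{L^4}^2 = \left\|\left(\sum_{i=1}^N f_i\right)^2\right\|_{L^2} = \left\|\sum_{1 \leq i,j \leq N} f_i f_j\right\|_{L^2} \ll \left(\sum_{1 \leq i,j \leq N} \|f_i f_j\|_{L^2}^2\right)^{1/2} = \left\|\sum_{i=1}^N |f_i|^2\right\|_{L^2} \leq \sum_{i=1}^N \|f_i\|_{L^4}^2$$
giving decoupling in $L^4$. (Similarly if each of the $f_i f_j$ is orthogonal to all but $O_\varepsilon(N^\varepsilon)$ of the other $f_{i'} f_{j'}$.) A similar argument also gives $L^6$ decoupling when one has tri-orthogonality (with the $f_i f_j f_k$ mostly orthogonal to each other), and so forth. As a slight variant, Khintchine’s inequality also indicates that decoupling should occur for any fixed $2 < p < \infty$ if one multiplies each of the $f_j$ by an independent random sign $\epsilon_j \in \{-1,+1\}$.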
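A concrete instance of bi-orthogonality can be checked numerically. The Python sketch below is an illustration under assumptions of my own choosing: it takes $f_i(x) = e(a_i x / M)$ on $\mathbf{Z}/M\mathbf{Z}$ with normalised counting measure, where the frequencies $a_i$ are positive integers and $M$ is large enough that sums $a_i + a_j$ do not wrap around. The products $f_i f_j$ are then pairwise orthogonal precisely when the sums $a_i + a_j$ are distinct (a Sidon set, such as powers of two). The names `l4_norm_of_sum` and `additive_quadruples` are hypothetical.

```python
import cmath
import math
from itertools import product


def l4_norm_of_sum(freqs):
    """L^4 norm (normalised counting measure on Z/M) of sum_i e(a_i x / M)."""
    M = 2 * max(freqs) + 1  # large enough that a_i + a_j cannot wrap around
    total = 0.0
    for x in range(M):
        s = sum(cmath.exp(2j * math.pi * a * x / M) for a in freqs)
        total += abs(s) ** 4
    return (total / M) ** 0.25


def additive_quadruples(freqs):
    """Number of (a, b, c, d) drawn from freqs with a + b = c + d."""
    return sum(1 for a, b, c, d in product(freqs, repeat=4) if a + b == c + d)


if __name__ == "__main__":
    for freqs in ([1, 2, 4, 8, 16], [1, 2, 3, 4, 5]):
        n = len(freqs)
        # Each f_i has L^4 norm 1, so the decoupled quantity is sqrt(n).
        print(freqs, additive_quadruples(freqs),
              l4_norm_of_sum(freqs) / math.sqrt(n))
```

The fourth power of the $L^4$ norm equals the number of additive quadruples, which is $2n^2 - n$ for a Sidon set of size $n$. For the five powers of two this is $45$, so the ratio to $\sqrt{n}$ is $(45/25)^{1/4}$, about $1.16$, and it stays below $2^{1/4}$ for every $n$.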
In recent years, Bourgain and Demeter have been establishing decoupling theorems in $L^p(\mathbf{R}^n)$ spaces for various key exponents of $2 < p < \infty$, in the “restriction theory” setting in which the $f_\alpha$ are Fourier transforms of measures supported on different portions of a given surface or curve; this builds upon the earlier decoupling theorems of Wolff. In a recent paper with Guth, they established the following decoupling theorem for the curve $\gamma(\mathbf{R}) \subset \mathbf{R}^n$ parameterised by the polynomial curve
$$\gamma: t \mapsto (t, t^2, \dots, t^n).$$
For any ball $B = B(x_0,r)$ in $\mathbf{R}^n$, let $w_B: \mathbf{R}^n \to \mathbf{R}^+$ denote the weight
$$w_B(x) := \frac{1}{(1 + \frac{|x-x_0|}{r})^{100n}},$$
which should be viewed as a smoothed out version of the indicator function $1_B$ of $B$. In particular, the space $L^p(w_B) = L^p(\mathbf{R}^n, w_B(x)\, dx)$ can be viewed as a smoothed out version of the space $L^p(B)$. For future reference we observe a fundamental self-similarity of the curve $\gamma(\mathbf{R})$: any arc $\gamma(I)$ in this curve, with $I$ a compact interval, is affinely equivalent to the standard arc $\gamma([0,1])$.
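Theorem 1 (Decoupling theorem) Let $n \geq 1$. Subdivide the unit interval $[0,1]$ into $N$ equal subintervals $I_i$ of length $1/N$, and for each such $I_i$, let $f_i: \mathbf{R}^n \to \mathbf{C}$ be the Fourier transform
$$f_i(x) = \int_{\gamma(I_i)} e(x \cdot \xi)\, d\mu_i(\xi)$$
of a finite Borel measure $\mu_i$ on the arc $\gamma(I_i)$, where $e(\theta) := e^{2\pi i \theta}$. Then the $f_i$ exhibit decoupling in $L^{n(n+1)}(w_B)$ for any ball $B$ of radius $N^n$.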
Orthogonality gives the $n=1$ case of this theorem. The bi-orthogonality type arguments sketched earlier only give decoupling in $L^p$ up to the range $2 \leq p \leq 2n$; the point here is that we can now get a much larger value of $p$. The $n=2$ case of this theorem was previously established by Bourgain and Demeter (who obtained in fact an analogous theorem for any curved hypersurface). The exponent $n(n+1)$ (and the radius $N^n$) is best possible, as can be seen by the following basic example. If
$$f_i(x) := \int_{I_i} e(x \cdot \gamma(\xi)) g_i(\xi)\, d\xi$$
where $g_i$ is a bump function adapted to $I_i$, then standard Fourier-analytic computations show that $f_i$ will be comparable to $1/N$ on a rectangular box of dimensions $N \times N^2 \times \dots \times N^n$ (and thus volume $N^{n(n+1)/2}$) centred at the origin, and exhibit decay away from this box, with $\|f_i\|_{L^{n(n+1)}(w_B)}$ comparable to
$$\frac{1}{N} \times (N^{n(n+1)/2})^{1/(n(n+1))} = \frac{1}{\sqrt{N}}.$$
On the other hand, $\sum_{i=1}^N f_i$ is comparable to $1$ on a ball of radius comparable to $1$ centred at the origin, so $\|\sum_{i=1}^N f_i\|_{L^{n(n+1)}(w_B)}$ is $\gg 1$, which is just barely consistent with decoupling. This calculation shows that decoupling will fail if $n(n+1)$ is replaced by any larger exponent, and also if the radius of the ball $B$ is reduced to be significantly smaller than $N^n$.
This theorem has the following consequence of importance in analytic number theory:
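Corollary 2 (Vinogradov main conjecture) Let $n, N \geq 1$ be integers, and let $s, \varepsilon > 0$. Then
$$\int_{[0,1]^n} \left|\sum_{j=1}^N e(j x_1 + j^2 x_2 + \dots + j^n x_n)\right|^{2s}\, dx_1 \dots dx_n \ll_{\varepsilon,s,n} N^{s+\varepsilon} + N^{2s - \frac{n(n+1)}{2}+\varepsilon}.$$
Proof: By the Hölder inequality (and the trivial bound of $N$ for the exponential sum), it suffices to treat the critical case $s = n(n+1)/2$, that is to say to show that
$$\int_{[0,1]^n} \left|\sum_{j=1}^N e(j x_1 + j^2 x_2 + \dots + j^n x_n)\right|^{n(n+1)}\, dx_1 \dots dx_n \ll_{\varepsilon,n} N^{\frac{n(n+1)}{2}+\varepsilon}.$$
We can rescale this as
$$\int_{[0,N] \times [0,N^2] \times \dots \times [0,N^n]} \left|\sum_{j=1}^N e(x \cdot \gamma(j/N))\right|^{n(n+1)}\, dx \ll_{\varepsilon,n} N^{n(n+1)+\varepsilon}.$$
As the integrand is periodic along the lattice $N\mathbf{Z} \times N^2 \mathbf{Z} \times \dots \times N^n \mathbf{Z}$, this is equivalent to
$$\int_{[0,N^n]^n} \left|\sum_{j=1}^N e(x \cdot \gamma(j/N))\right|^{n(n+1)}\, dx \ll_{\varepsilon,n} N^{\frac{n(n+1)}{2}+n^2+\varepsilon}.$$
The left-hand side may be bounded by $\ll \|\sum_{j=1}^N f_j\|_{L^{n(n+1)}(w_B)}^{n(n+1)}$, where $B := B(0,N^n)$ and $f_j(x) := e(x \cdot \gamma(j/N))$. Since
$$\|f_j\|_{L^{n(n+1)}(w_B)} \ll (N^{n^2})^{\frac{1}{n(n+1)}}$$
the claim now follows from the decoupling theorem and a brief calculation.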
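Using the Plancherel formula, one may equivalently (when $s$ is an integer) write the Vinogradov main conjecture in terms of solutions $j_1,\dots,j_s,k_1,\dots,k_s \in \{1,\dots,N\}$ to the system of equations
$$j_1^i + \dots + j_s^i = k_1^i + \dots + k_s^i \quad \forall i = 1,\dots,n$$
but we will not use this formulation here.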
A history of the Vinogradov main conjecture may be found in this survey of Wooley; prior to the Bourgain-Demeter-Guth theorem, the conjecture was solved completely for , or for
and
either below
or above
, with the bulk of recent progress coming from the efficient congruencing technique of Wooley. It has numerous applications to exponential sums, Waring’s problem, and the zeta function; to give just one application, the main conjecture implies the predicted asymptotic for the number of ways to express a large number as the sum of
fifth powers (the previous best result required
fifth powers). The Bourgain-Demeter-Guth approach to the Vinogradov main conjecture, based on decoupling, is ostensibly very different from the efficient congruencing technique, which relies heavily on the arithmetic structure of the problem, but it appears (as I have been told from second-hand sources) that the two methods are actually closely related, with the former being a sort of “Archimedean” version of the latter (with the intervals $I_i$ in the decoupling theorem being analogous to congruence classes in the efficient congruencing method); hopefully there will be some future work making this connection more precise. One advantage of the decoupling approach is that it generalises to non-arithmetic settings in which the set $\{1,\dots,N\}$ that $j$ is drawn from is replaced by some other similarly separated set of real numbers. (A random thought – could this allow the Vinogradov-Korobov bounds on the zeta function to extend to Beurling zeta functions?)
Below the fold we sketch the Bourgain-Demeter-Guth argument proving Theorem 1.
I thank Jean Bourgain and Andrew Granville for helpful discussions.
Let $\lambda$ denote the Liouville function. The prime number theorem is equivalent to the estimate
$$\sum_{n \leq x} \lambda(n) = o(x)$$
as $x \to \infty$, that is to say that $\lambda$ exhibits cancellation on large intervals such as $[1,x]$. This result can be improved to give cancellation on shorter intervals. For instance, using the known zero density estimates for the Riemann zeta function, one can establish that
$$\int_X^{2X} \left|\sum_{x \leq n \leq x+H} \lambda(n)\right|\, dx = o(HX) \ \ \ \ \ (1)$$
as $X \to \infty$ if $X^{1/6+\varepsilon} \leq H \leq X$ for some fixed $\varepsilon > 0$; I believe this result is due to Ramachandra (see also Exercise 21 of this previous blog post), and in fact one could obtain a better error term on the right-hand side that for instance gained an arbitrary power of $\log X$. On the Riemann hypothesis (or the weaker density hypothesis), it was known that the $X^{1/6+\varepsilon}$ could be lowered to $X^\varepsilon$.
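For readers who want to experiment with this cancellation, the following Python sketch tabulates the Liouville function $\lambda(n) = (-1)^{\Omega(n)}$ with a smallest-prime-factor sieve and sums it over intervals. The function names `liouville` and `interval_sum` are illustrative choices, and the cut-off $10^5$ in the demonstration is arbitrary; such small experiments say nothing rigorous about the asymptotic statements above.

```python
import math


def liouville(limit):
    """Return a list lam with lam[n] = (-1)^Omega(n) for 1 <= n <= limit.

    lam[0] is unused and set to 0.
    """
    spf = list(range(limit + 1))  # spf[n] will hold the smallest prime factor
    for p in range(2, math.isqrt(limit) + 1):
        if spf[p] == p:  # p is prime
            for m in range(p * p, limit + 1, p):
                if spf[m] == m:
                    spf[m] = p
    lam = [0] * (limit + 1)
    if limit >= 1:
        lam[1] = 1
    for n in range(2, limit + 1):
        # lambda is completely multiplicative with lambda(p) = -1.
        lam[n] = -lam[n // spf[n]]
    return lam


def interval_sum(lam, x, H):
    """Sum of lam[n] over the integers x <= n <= x + H."""
    return sum(lam[x:x + H + 1])


if __name__ == "__main__":
    lam = liouville(10 ** 5)
    for x in (10 ** 2, 10 ** 3, 10 ** 4, 10 ** 5):
        print(x, interval_sum(lam, 1, x - 1) / x)
```

The printed quantity is the normalised partial sum $\frac{1}{x} \sum_{n \leq x} \lambda(n)$; replacing the arguments of `interval_sum` lets one look at shorter intervals $[x, x+H]$ instead.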
Early this year, there was a major breakthrough by Matomäki and Radziwiłł, who (among other things) showed that the asymptotic (1) was in fact valid for any $H = H(X)$ with $H \leq X$ that went to infinity as $X \to \infty$, thus yielding cancellation on extremely short intervals. This has many further applications; for instance, this estimate, or more precisely its extension to other “non-pretentious” bounded multiplicative functions, was a key ingredient in my recent solution of the Erdős discrepancy problem, as well as in obtaining logarithmically averaged cases of Chowla’s conjecture, such as
$$\sum_{n \leq x} \frac{\lambda(n) \lambda(n+1)}{n} = o(\log x).$$
It is of interest to twist the above estimates by phases such as the linear phase $n \mapsto e(\alpha n) := e^{2\pi i \alpha n}$. In 1937, Davenport showed that
$$\sup_\alpha \left|\sum_{n \leq x} \lambda(n) e(\alpha n)\right| \ll_A x \log^{-A} x \ \ \ \ \ (2)$$
which of course improves the prime number theorem. Recently with Matomäki and Radziwiłł, we obtained a common generalisation of this estimate with (1), showing that
$$\sup_\alpha \int_X^{2X} \left|\sum_{x \leq n \leq x+H} \lambda(n) e(\alpha n)\right|\, dx = o(HX) \ \ \ \ \ (3)$$
as $X \to \infty$, for any $H = H(X) \leq X$ that went to infinity as $X \to \infty$. We were able to use this estimate to obtain an averaged form of Chowla’s conjecture.
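The twisted sums can also be explored numerically. The sketch below is a crude experiment under assumptions of my own: it replaces the supremum over all real $\alpha$ by a maximum over the finite grid $\alpha = a/\mathrm{grid}$, so it only bounds the true supremum from below, and it computes $\lambda(n)$ by trial division, which is adequate only for small $n$. The names `liouville_value` and `sup_twisted_sum` and the parameter `grid` are hypothetical.

```python
import cmath
import math


def liouville_value(n):
    """(-1)^Omega(n), computed by trial division (fine for small n >= 1)."""
    sign, p = 1, 2
    while p * p <= n:
        while n % p == 0:
            n //= p
            sign = -sign
        p += 1
    # Any leftover n > 1 is a single remaining prime factor.
    return -sign if n > 1 else sign


def sup_twisted_sum(x, H, grid=256):
    """Max over alpha = a/grid of |sum_{x <= n <= x+H} lambda(n) e(alpha n)|."""
    lam = [(n, liouville_value(n)) for n in range(x, x + H + 1)]
    best = 0.0
    for a in range(grid):
        alpha = a / grid
        s = sum(l * cmath.exp(2j * math.pi * alpha * n) for n, l in lam)
        best = max(best, abs(s))
    return best


if __name__ == "__main__":
    for H in (100, 200, 400):
        # Normalised by the trivial bound H + 1 (the number of terms).
        print(H, sup_twisted_sum(10 ** 4, H) / (H + 1))
```

Since the grid contains $\alpha = 0$, the returned value is always at least the absolute value of the untwisted sum over the same interval.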
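In that paper, we asked whether one could improve this estimate further by moving the supremum inside the integral, that is to say to establish the bound
$$\int_X^{2X} \sup_\alpha \left|\sum_{x \leq n \leq x+H} \lambda(n) e(\alpha n)\right|\, dx = o(HX) \ \ \ \ \ (4)$$
as $X \to \infty$, for any $H = H(X) \leq X$ that went to infinity as $X \to \infty$. This bound is asserting that $\lambda$ is locally Fourier-uniform on most short intervals; it can be written equivalently in terms of the “local Gowers $U^2$ norm” as
$$\int_X^{2X} \sum_{1 \leq a \leq H} \left|\sum_{x \leq n \leq x+H} \lambda(n) \lambda(n+a)\right|^2\, dx = o(H^3 X)$$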
from which one can see that this is another averaged form of Chowla’s conjecture (stronger than the one I was able to prove with Matomäki and Radziwiłł, but a consequence of the unaveraged Chowla conjecture). If one inserted such a bound into the machinery I used to solve the Erdős discrepancy problem, it should lead to further averaged cases of Chowla’s conjecture, such as
$$\sum_{n \leq x} \frac{\lambda(n) \lambda(n+1) \lambda(n+2)}{n} = o(\log x) \ \ \ \ \ (5)$$
though I have not fully checked the details of this implication. It should also have a number of new implications for sign patterns of the Liouville function, though we have not explored these in detail yet.
One can write (4) equivalently in the form
$$\int_X^{2X} \sum_{x \leq n \leq x+H} \lambda(n) e(\alpha(x) n + \beta(x))\, dx = o(HX) \ \ \ \ \ (6)$$
uniformly for all $x$-dependent phases $\alpha(x), \beta(x)$. In contrast, (3) is equivalent to the subcase of (6) when the linear phase coefficient $\alpha(x)$ is independent of $x$. This dependency of $\alpha(x)$ on $x$ seems to necessitate some highly nontrivial additive combinatorial analysis of the function $x \mapsto \alpha(x)$ in order to establish (4) when $H$ is small. To date, this analysis has proven to be elusive, but I would like to record what one can do with more classical methods like Vaughan’s identity, namely:
Proposition 1 The estimate (4) (or equivalently (6)) holds in the range $X^{2/3+\varepsilon} \leq H \leq X$ for any fixed $\varepsilon > 0$. (In fact one can improve the right-hand side by an arbitrary power of $\log X$ in this case.)
The values of $H$ in this range are far too large to yield implications such as new cases of the Chowla conjecture, but it appears that the $2/3$ exponent is the limit of “classical” methods (at least as far as I was able to apply them), in the sense that one does not do any combinatorial analysis on the function $x \mapsto \alpha(x)$, nor does one use modern equidistribution results on “Type III sums” that require deep estimates on Kloosterman-type sums. The latter may shave a little bit off of the $2/3$
exponent, but I don’t see how one would ever hope to go below
without doing some non-trivial combinatorics on the function $x \mapsto \alpha(x)$. UPDATE: I have come across this paper of Zhan which uses mean-value theorems for L-functions to lower the $2/3$ exponent to $5/8$.
Let me now sketch the proof of the proposition, omitting many of the technical details. We first remark that known estimates on sums of the Liouville function (or similar functions such as the von Mangoldt function) in short arithmetic progressions, based on zero-density estimates for Dirichlet $L$-functions, can handle the “major arc” case of (4) (or (6)) where
is restricted to be of the form
for
(the exponent here being of the same numerology as the
exponent in the classical result of Ramachandra, tied to the best zero density estimates currently available); for instance a modification of the arguments in this recent paper of Koukoulopoulos would suffice. Thus we can restrict attention to “minor arc” values of
(or
, using the interpretation of (6)).
Next, one breaks up $\lambda$ (or the closely related Möbius function) into Dirichlet convolutions using one of the standard identities (e.g. Vaughan’s identity or Heath-Brown’s identity), as discussed for instance in this previous post (which is focused more on the von Mangoldt function, but analogous identities exist for the Liouville and Möbius functions). The exact choice of identity is not terribly important, but the upshot is that
can be decomposed into
terms, each of which is either of the “Type I” form
for some coefficients that are roughly of logarithmic size on the average, and scales
with
and
, or else of the “Type II” form
for some coefficients that are roughly of logarithmic size on the average, and scales
with
and
. As discussed in the previous post, the
exponent is a natural barrier in these identities if one is unwilling to also consider “Type III” type terms which are roughly of the shape of the third divisor function
.
A Type I sum makes a contribution to that can be bounded (via Cauchy-Schwarz) in terms of an expression such as
The inner sum exhibits a lot of cancellation unless is within
of an integer. (Here, “a lot” should be loosely interpreted as “gaining many powers of
over the trivial bound”.) Since
is significantly larger than
, standard Vinogradov-type manipulations (see e.g. Lemma 13 of these previous notes) show that this bad case occurs for many
only when
is “major arc”, which is the case we have specifically excluded. This lets us dispose of the Type I contributions.
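The cancellation in the inner sum is, at bottom, the cancellation in a geometric series: a sum $\sum_{n=0}^{L-1} e(\theta n)$ over $L$ consecutive integers has size at most $\min(L, \frac{1}{2\|\theta\|})$, where $\|\theta\|$ is the distance from $\theta$ to the nearest integer. The Python sketch below checks this standard bound numerically. Here $\theta$ stands in for whatever frequency multiplies the summation variable in the inner sum, and the function names are hypothetical; the sketch illustrates the general principle rather than the specific Type I estimate.

```python
import cmath
import math


def e(theta):
    """The standard character e(theta) = exp(2 pi i theta)."""
    return cmath.exp(2j * math.pi * theta)


def dist_to_integer(theta):
    """Distance from theta to the nearest integer."""
    return abs(theta - round(theta))


def geometric_sum(theta, L):
    """|sum_{n=0}^{L-1} e(theta n)|, computed directly."""
    return abs(sum(e(theta * n) for n in range(L)))


def geometric_bound(theta, L):
    """The bound min(L, 1 / (2 ||theta||))."""
    d = dist_to_integer(theta)
    return float(L) if d == 0 else min(float(L), 1.0 / (2.0 * d))


if __name__ == "__main__":
    L = 1000
    for theta in (0.0, 0.0001, 0.001, 0.01, 0.1, 0.37, 0.5):
        print(theta, geometric_sum(theta, L), geometric_bound(theta, L))
```

The sum is of full size $L$ only when $\theta$ is within about $1/L$ of an integer, which is the shape of the "bad case" in the Type I analysis.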
A Type II sum makes a contribution to roughly of the form
We can break this up into a number of sums roughly of the form
for ; note that the
range is non-trivial because
is much larger than
. Applying the usual bilinear sum Cauchy-Schwarz methods (e.g. Theorem 14 of these notes) we conclude that there is a lot of cancellation unless one has
for some
. But with
,
is well below the threshold
for the definition of major arc, so we can exclude this case and obtain the required cancellation.