You are currently browsing the monthly archive for December 2015.
Over on the polymath blog, I’ve posted (on behalf of Dinesh Thakur) a new polymath proposal, which is to explain some numerically observed identities involving the irreducible polynomials in the polynomial ring over the finite field of characteristic two, the simplest of which is
(expanded in terms of Taylor series in ). Comments on the problem should be placed in the polymath blog post; if there is enough interest, we can start a formal polymath project on it.
In this blog post, I would like to specialise the arguments of Bourgain, Demeter, and Guth from the previous post to the two-dimensional case of the Vinogradov main conjecture, namely
This particular case of the main conjecture has a classical proof using some elementary number theory. Indeed, the left-hand side can be viewed as the number of solutions to the system of equations
with . These two equations can combine (using the algebraic identity applied to ) to imply the further equation
which, when combined with the divisor bound, shows that each is associated to choices of excluding diagonal cases when two of the collide, and this easily yields Theorem 1. However, the Bourgain-Demeter-Guth argument (which, in the two dimensional case, is essentially contained in a previous paper of Bourgain and Demeter) does not require the divisor bound, and extends for instance to the the more general case where ranges in a -separated set of reals between to .
In this special case, the Bourgain-Demeter argument simplifies, as the lower dimensional inductive hypothesis becomes a simple almost orthogonality claim, and the multilinear Kakeya estimate needed is also easy (collapsing to just Fubini’s theorem). Also one can work entirely in the context of the Vinogradov main conjecture, and not turn to the increased generality of decoupling inequalities (though this additional generality is convenient in higher dimensions). As such, I am presenting this special case as an introduction to the Bourgain-Demeter-Guth machinery.
We now give the specialisation of the Bourgain-Demeter argument to Theorem 1. It will suffice to establish the bound
for all , (where we keep fixed and send to infinity), as the bound then follows by combining the above bound with the trivial bound . Accordingly, for any and , we let denote the claim that
as . Clearly, for any fixed , holds for some large , and it will suffice to establish
Proposition 2 Let , and let be such that holds. Then there exists (depending continuously on ) such that holds.
Indeed, this proposition shows that for , the infimum of the for which holds is zero.
We prove the proposition below the fold, using a simplified form of the methods discussed in the previous blog post. To simplify the exposition we will be a bit cavalier with the uncertainty principle, for instance by essentially ignoring the tails of rapidly decreasing functions.
Given any finite collection of elements in some Banach space , the triangle inequality tells us that
However, when the all “oscillate in different ways”, one expects to improve substantially upon the triangle inequality. For instance, if is a Hilbert space and the are mutually orthogonal, we have the Pythagorean theorem
for any finite collection in any Banach space , where denotes the cardinality of . Thus orthogonality in a Hilbert space yields “square root cancellation”, saving a factor of or so over the trivial bound coming from the triangle inequality.
More generally, let us somewhat informally say that a collection exhibits decoupling in if one has the Pythagorean-like inequality
for any , thus one obtains almost the full square root cancellation in the norm. The theory of almost orthogonality can then be viewed as the theory of decoupling in Hilbert spaces such as . In spaces for one usually does not expect this sort of decoupling; for instance, if the are disjointly supported one has
and the right-hand side can be much larger than when . At the opposite extreme, one usually does not expect to get decoupling in , since one could conceivably align the to all attain a maximum magnitude at the same location with the same phase, at which point the triangle inequality in becomes sharp.
However, in some cases one can get decoupling for certain . For instance, suppose we are in , and that are bi-orthogonal in the sense that the products for are pairwise orthogonal in . Then we have
giving decoupling in . (Similarly if each of the is orthogonal to all but of the other .) A similar argument also gives decoupling when one has tri-orthogonality (with the mostly orthogonal to each other), and so forth. As a slight variant, Khintchine’s inequality also indicates that decoupling should occur for any fixed if one multiplies each of the by an independent random sign .
In recent years, Bourgain and Demeter have been establishing decoupling theorems in spaces for various key exponents of , in the “restriction theory” setting in which the are Fourier transforms of measures supported on different portions of a given surface or curve; this builds upon the earlier decoupling theorems of Wolff. In a recent paper with Guth, they established the following decoupling theorem for the curve parameterised by the polynomial curve
For any ball in , let denote the weight
which should be viewed as a smoothed out version of the indicator function of . In particular, the space can be viewed as a smoothed out version of the space . For future reference we observe a fundamental self-similarity of the curve : any arc in this curve, with a compact interval, is affinely equivalent to the standard arc .
of a finite Borel measure on the arc , where . Then the exhibit decoupling in for any ball of radius .
Orthogonality gives the case of this theorem. The bi-orthogonality type arguments sketched earlier only give decoupling in up to the range ; the point here is that we can now get a much larger value of . The case of this theorem was previously established by Bourgain and Demeter (who obtained in fact an analogous theorem for any curved hypersurface). The exponent (and the radius ) is best possible, as can be seen by the following basic example. If
where is a bump function adapted to , then standard Fourier-analytic computations show that will be comparable to on a rectangular box of dimensions (and thus volume ) centred at the origin, and exhibit decay away from this box, with comparable to
On the other hand, is comparable to on a ball of radius comparable to centred at the origin, so is , which is just barely consistent with decoupling. This calculation shows that decoupling will fail if is replaced by any larger exponent, and also if the radius of the ball is reduced to be significantly smaller than .
This theorem has the following consequence of importance in analytic number theory:
Corollary 2 (Vinogradov main conjecture) Let be integers, and let . Then
Proof: By the Hölder inequality (and the trivial bound of for the exponential sum), it suffices to treat the critical case , that is to say to show that
We can rescale this as
As the integrand is periodic along the lattice , this is equivalent to
The left-hand side may be bounded by , where and . Since
the claim now follows from the decoupling theorem and a brief calculation.
Using the Plancherel formula, one may equivalently (when is an integer) write the Vinogradov main conjecture in terms of solutions to the system of equations
but we will not use this formulation here.
A history of the Vinogradov main conjecture may be found in this survey of Wooley; prior to the Bourgain-Demeter-Guth theorem, the conjecture was solved completely for , or for and either below or above , with the bulk of recent progress coming from the efficient congruencing technique of Wooley. It has numerous applications to exponential sums, Waring’s problem, and the zeta function; to give just one application, the main conjecture implies the predicted asymptotic for the number of ways to express a large number as the sum of fifth powers (the previous best result required fifth powers). The Bourgain-Demeter-Guth approach to the Vinogradov main conjecture, based on decoupling, is ostensibly very different from the efficient congruencing technique, which relies heavily on the arithmetic structure of the program, but it appears (as I have been told from second-hand sources) that the two methods are actually closely related, with the former being a sort of “Archimedean” version of the latter (with the intervals in the decoupling theorem being analogous to congruence classes in the efficient congruencing method); hopefully there will be some future work making this connection more precise. One advantage of the decoupling approach is that it generalises to non-arithmetic settings in which the set that is drawn from is replaced by some other similarly separated set of real numbers. (A random thought – could this allow the Vinogradov-Korobov bounds on the zeta function to extend to Beurling zeta functions?)
Below the fold we sketch the Bourgain-Demeter-Guth argument proving Theorem 1.
I thank Jean Bourgain and Andrew Granville for helpful discussions.
Let denote the Liouville function. The prime number theorem is equivalent to the estimate
as , that is to say that exhibits cancellation on large intervals such as . This result can be improved to give cancellation on shorter intervals. For instance, using the known zero density estimates for the Riemann zeta function, one can establish that
as if for some fixed ; I believe this result is due to Ramachandra (see also Exercise 21 of this previous blog post), and in fact one could obtain a better error term on the right-hand side that for instance gained an arbitrary power of . On the Riemann hypothesis (or the weaker density hypothesis), it was known that the could be lowered to .
Early this year, there was a major breakthrough by Matomaki and Radziwill, who (among other things) showed that the asymptotic (1) was in fact valid for any with that went to infinity as , thus yielding cancellation on extremely short intervals. This has many further applications; for instance, this estimate, or more precisely its extension to other “non-pretentious” bounded multiplicative functions, was a key ingredient in my recent solution of the Erdös discrepancy problem, as well as in obtaining logarithmically averaged cases of Chowla’s conjecture, such as
It is of interest to twist the above estimates by phases such as the linear phase . In 1937, Davenport showed that
from which one can see that this is another averaged form of Chowla’s conjecture (stronger than the one I was able to prove with Matomaki and Radziwill, but a consequence of the unaveraged Chowla conjecture). If one inserted such a bound into the machinery I used to solve the Erdös discrepancy problem, it should lead to further averaged cases of Chowla’s conjecture, such as
though I have not fully checked the details of this implication. It should also have a number of new implications for sign patterns of the Liouville function, though we have not explored these in detail yet.
One can write (4) equivalently in the form
uniformly for all -dependent phases . In contrast, (3) is equivalent to the subcase of (6) when the linear phase coefficient is independent of . This dependency of on seems to necessitate some highly nontrivial additive combinatorial analysis of the function in order to establish (4) when is small. To date, this analysis has proven to be elusive, but I would like to record what one can do with more classical methods like Vaughan’s identity, namely:
The values of in this range are far too large to yield implications such as new cases of the Chowla conjecture, but it appears that the exponent is the limit of “classical” methods (at least as far as I was able to apply them), in the sense that one does not do any combinatorial analysis on the function , nor does one use modern equidistribution results on “Type III sums” that require deep estimates on Kloosterman-type sums. The latter may shave a little bit off of the exponent, but I don’t see how one would ever hope to go below without doing some non-trivial combinatorics on the function . UPDATE: I have come across this paper of Zhan which uses mean-value theorems for L-functions to lower the exponent to .
Let me now sketch the proof of the proposition, omitting many of the technical details. We first remark that known estimates on sums of the Liouville function (or similar functions such as the von Mangoldt function) in short arithmetic progressions, based on zero-density estimates for Dirichlet -functions, can handle the “major arc” case of (4) (or (6)) where is restricted to be of the form for (the exponent here being of the same numerology as the exponent in the classical result of Ramachandra, tied to the best zero density estimates currently available); for instance a modification of the arguments in this recent paper of Koukoulopoulos would suffice. Thus we can restrict attention to “minor arc” values of (or , using the interpretation of (6)).
Next, one breaks up (or the closely related Möbius function) into Dirichlet convolutions using one of the standard identities (e.g. Vaughan’s identity or Heath-Brown’s identity), as discussed for instance in this previous post (which is focused more on the von Mangoldt function, but analogous identities exist for the Liouville and Möbius functions). The exact choice of identity is not terribly important, but the upshot is that can be decomposed into terms, each of which is either of the “Type I” form
for some coefficients that are roughly of logarithmic size on the average, and scales with and , or else of the “Type II” form
for some coefficients that are roughly of logarithmic size on the average, and scales with and . As discussed in the previous post, the exponent is a natural barrier in these identities if one is unwilling to also consider “Type III” type terms which are roughly of the shape of the third divisor function .
A Type I sum makes a contribution to that can be bounded (via Cauchy-Schwarz) in terms of an expression such as
The inner sum exhibits a lot of cancellation unless is within of an integer. (Here, “a lot” should be loosely interpreted as “gaining many powers of over the trivial bound”.) Since is significantly larger than , standard Vinogradov-type manipulations (see e.g. Lemma 13 of these previous notes) show that this bad case occurs for many only when is “major arc”, which is the case we have specifically excluded. This lets us dispose of the Type I contributions.
A Type II sum makes a contribution to roughly of the form
We can break this up into a number of sums roughly of the form
for ; note that the range is non-trivial because is much larger than . Applying the usual bilinear sum Cauchy-Schwarz methods (e.g. Theorem 14 of these notes) we conclude that there is a lot of cancellation unless one has for some . But with , is well below the threshold for the definition of major arc, so we can exclude this case and obtain the required cancellation.