You are currently browsing the monthly archive for August 2015.
The twin prime conjecture is one of the oldest unsolved problems in analytic number theory. There are several reasons why this conjecture remains out of reach of current techniques, but the most important obstacle is the parity problem which prevents purely sieve-theoretic methods (or many other popular methods in analytic number theory, such as the circle method) from detecting pairs of prime twins in a way that can distinguish them from other twins of almost primes. The parity problem is discussed in these previous blog posts; this obstruction is ultimately powered by the Möbius pseudorandomness principle that asserts that the Möbius function is asymptotically orthogonal to all “structured” functions (and in particular, to the weight functions constructed from sieve theory methods).
However, there is an intriguing “alternate universe” in which the Möbius function is strongly correlated with some structured functions, and specifically with some Dirichlet characters, leading to the existence of the infamous “Siegel zero“. In this scenario, the parity problem obstruction disappears, and it becomes possible, in principle, to attack problems such as the twin prime conjecture. In particular, we have the following result of Heath-Brown:
Theorem 1 At least one of the following two statements are true:
- (Twin prime conjecture) There are infinitely many primes such that is also prime.
- (No Siegel zeroes) There exists a constant such that for every real Dirichlet character of conductor , the associated Dirichlet -function has no zeroes in the interval .
Informally, this result asserts that if one had an infinite sequence of Siegel zeroes, one could use this to generate infinitely many twin primes. See this survey of Friedlander and Iwaniec for more on this “illusory” or “ghostly” parallel universe in analytic number theory that should not actually exist, but is surprisingly self-consistent and to date proven to be impossible to banish from the realm of possibility.
The strategy of Heath-Brown’s proof is fairly straightforward to describe. The usual starting point is to try to lower bound for some large value of , where is the von Mangoldt function. Actually, in this post we will work with the slight variant
where is the second von Mangoldt function, and denotes Dirichlet convolution, and is an (unsquared) Selberg sieve that damps out small prime factors. This sum also detects twin primes, but will lead to slightly simpler computations. For technical reasons we will also smooth out the interval and remove very small primes from , but we will skip over these steps for the purpose of this informal discussion. (In Heath-Brown’s original paper, the Selberg sieve is essentially replaced by the more combinatorial restriction for some large , where is the primorial of , but I found the computations to be slightly easier if one works with a Selberg sieve, particularly if the sieve is not squared to make it nonnegative.)If there is a Siegel zero with close to and a Dirichlet character of conductor , then multiplicative number theory methods can be used to show that the Möbius function “pretends” to be like the character in the sense that for “most” primes near (e.g. in the range for some small and large ). Traditionally, one uses complex-analytic methods to demonstrate this, but one can also use elementary multiplicative number theory methods to establish these results (qualitatively at least), as will be shown below the fold.
The fact that pretends to be like can be used to construct a tractable approximation (after inserting the sieve weight ) in the range (where for some large ) for the second von Mangoldt function , namely the function
Roughly speaking, we think of the periodic function and the slowly varying function as being of about the same “complexity” as the constant function , so that is roughly of the same “complexity” as the divisor function which is considerably simpler to obtain asymptotics for than the von Mangoldt function as the Möbius function is no longer present. (For instance, note from the Dirichlet hyperbola method that one can estimate to accuracy with little difficulty, whereas to obtain a comparable level of accuracy for or is essentially the Riemann hypothesis.)One expects to be a good approximant to if is of size and has no prime factors less than for some large constant . The Selberg sieve will be mostly supported on numbers with no prime factor less than . As such, one can hope to approximate (1) by the expression as it turns out, the error between this expression and (1) is easily controlled by sieve-theoretic techniques. Let us ignore the Selberg sieve for now and focus on the slightly simpler sum
As discussed above, this sum should be thought of as a slightly more complicated version of the sum Accordingly, let us look (somewhat informally) at the task of estimating the model sum (3). One can think of this problem as basically that of counting solutions to the equation with in various ranges; this is clearly related to understanding the equidistribution of the hyperbola in . Taking Fourier transforms, the latter problem is closely related to estimation of the Kloosterman sums where denotes the inverse of in . One can then use the Weil bound where is the greatest common divisor of (with the convention that this is equal to if vanish), and the decays to zero as . The Weil bound yields good enough control on error terms to estimate (3), and as it turns out the same method also works to estimate (2) (provided that with large enough).Actually one does not need the full strength of the Weil bound here; any power savings over the trivial bound of will do. In particular, it will suffice to use the weaker, but easier to prove, bounds of Kloosterman:
Lemma 2 (Kloosterman bound) One has whenever and are coprime to , where the is with respect to the limit (and is uniform in ).
Proof: Observe from change of variables that the Kloosterman sum is unchanged if one replaces with for . For fixed , the number of such pairs is at least , thanks to the divisor bound. Thus it will suffice to establish the fourth moment bound
The left-hand side can be rearranged as which by Fourier summation is equal to Observe from the quadratic formula and the divisor bound that each pair has at most solutions to the system of equations . Hence the number of quadruples of the desired form is , and the claim follows.We will also need another easy case of the Weil bound to handle some other portions of (2):
Lemma 3 (Easy Weil bound) Let be a primitive real Dirichlet character of conductor , and let . Then
Proof: As is the conductor of a primitive real Dirichlet character, is equal to times a squarefree odd number for some . By the Chinese remainder theorem, it thus suffices to establish the claim when is an odd prime. We may assume that is not divisible by this prime , as the claim is trivial otherwise. If vanishes then does not vanish, and the claim follows from the mean zero nature of ; similarly if vanishes. Hence we may assume that do not vanish, and then we can normalise them to equal . By completing the square it now suffices to show that
whenever . As is on the quadratic residues and on the non-residues, it now suffices to show that But by making the change of variables , the left-hand side becomes , and the claim follows.While the basic strategy of Heath-Brown’s argument is relatively straightforward, implementing it requires a large amount of computation to control both main terms and error terms. I experimented for a while with rearranging the argument to try to reduce the amount of computation; I did not fully succeed in arriving at a satisfactorily minimal amount of superfluous calculation, but I was able to at least reduce this amount a bit, mostly by replacing a combinatorial sieve with a Selberg-type sieve (which was not needed to be positive, so I dispensed with the squaring aspect of the Selberg sieve to simplify the calculations a little further; also for minor reasons it was convenient to retain a tiny portion of the combinatorial sieve to eliminate extremely small primes). Also some modest reductions in complexity can be obtained by using the second von Mangoldt function in place of . These exercises were primarily for my own benefit, but I am placing them here in case they are of interest to some other readers.
The Poincaré upper half-plane (with a boundary consisting of the real line together with the point at infinity ) carries an action of the projective special linear group
via fractional linear transformations:
Here and in the rest of the post we will abuse notation by identifying elements of the special linear group with their equivalence class in ; this will occasionally create or remove a factor of two in our formulae, but otherwise has very little effect, though one has to check that various definitions and expressions (such as (1)) are unaffected if one replaces a matrix by its negation . In particular, we recommend that the reader ignore the signs that appear from time to time in the discussion below.
As the action of on is transitive, and any given point in (e.g. ) has a stabiliser isomorphic to the projective rotation group , we can view the Poincaré upper half-plane as a homogeneous space for , and more specifically the quotient space of of a maximal compact subgroup . In fact, we can make the half-plane a symmetric space for , by endowing with the Riemannian metric
(using Cartesian coordinates ), which is invariant with respect to the action. Like any other Riemannian metric, the metric on generates a number of other important geometric objects on , such as the distance function which can be computed to be given by the formula
the volume measure , which can be computed to be
and the Laplace-Beltrami operator, which can be computed to be (here we use the negative definite sign convention for ). As the metric was -invariant, all of these quantities arising from the metric are similarly -invariant in the appropriate sense.
The Gauss curvature of the Poincaré half-plane can be computed to be the constant , thus is a model for two-dimensional hyperbolic geometry, in much the same way that the unit sphere in is a model for two-dimensional spherical geometry (or is a model for two-dimensional Euclidean geometry). (Indeed, is isomorphic (via projection to a null hyperplane) to the upper unit hyperboloid in the Minkowski spacetime , which is the direct analogue of the unit sphere in Euclidean spacetime or the plane in Galilean spacetime .)
One can inject arithmetic into this geometric structure by passing from the Lie group to the full modular group
or congruence subgroups such as
for natural number , or to the discrete stabiliser of the point at infinity:
These are discrete subgroups of , nested by the subgroup inclusions
There are many further discrete subgroups of (known collectively as Fuchsian groups) that one could consider, but we will focus attention on these three groups in this post.
Any discrete subgroup of generates a quotient space , which in general will be a non-compact two-dimensional orbifold. One can understand such a quotient space by working with a fundamental domain – a set consisting of a single representative of each of the orbits of in . This fundamental domain is by no means uniquely defined, but if the fundamental domain is chosen with some reasonable amount of regularity, one can view as the fundamental domain with the boundaries glued together in an appropriate sense. Among other things, fundamental domains can be used to induce a volume measure on from the volume measure on (restricted to a fundamental domain). By abuse of notation we will refer to both measures simply as when there is no chance of confusion.
For instance, a fundamental domain for is given (up to null sets) by the strip , with identifiable with the cylinder formed by gluing together the two sides of the strip. A fundamental domain for is famously given (again up to null sets) by an upper portion , with the left and right sides again glued to each other, and the left and right halves of the circular boundary glued to itself. A fundamental domain for can be formed by gluing together
copies of a fundamental domain for in a rather complicated but interesting fashion.
While fundamental domains can be a convenient choice of coordinates to work with for some computations (as well as for drawing appropriate pictures), it is geometrically more natural to avoid working explicitly on such domains, and instead work directly on the quotient spaces . In order to analyse functions on such orbifolds, it is convenient to lift such functions back up to and identify them with functions which are -automorphic in the sense that for all and . Such functions will be referred to as -automorphic forms, or automorphic forms for short (we always implicitly assume all such functions to be measurable). (Strictly speaking, these are the automorphic forms with trivial factor of automorphy; one can certainly consider other factors of automorphy, particularly when working with holomorphic modular forms, which corresponds to sections of a more non-trivial line bundle over than the trivial bundle that is implicitly present when analysing scalar functions . However, we will not discuss this (important) more general situation here.)
An important way to create a -automorphic form is to start with a non-automorphic function obeying suitable decay conditions (e.g. bounded with compact support will suffice) and form the Poincaré series defined by
which is clearly -automorphic. (One could equivalently write in place of here; there are good argument for both conventions, but I have ultimately decided to use the convention, which makes explicit computations a little neater at the cost of making the group actions work in the opposite order.) Thus we naturally see sums over associated with -automorphic forms. A little more generally, given a subgroup of and a -automorphic function of suitable decay, we can form a relative Poincaré series by
where is any fundamental domain for , that is to say a subset of consisting of exactly one representative for each right coset of . As is -automorphic, we see (if has suitable decay) that does not depend on the precise choice of fundamental domain, and is -automorphic. These operations are all compatible with each other, for instance . A key example of Poincaré series are the Eisenstein series, although there are of course many other Poincaré series one can consider by varying the test function .
For future reference we record the basic but fundamental unfolding identities
for any function with sufficient decay, and any -automorphic function of reasonable growth (e.g. bounded and compact support, and bounded, will suffice). Note that is viewed as a function on on the left-hand side, and as a -automorphic function on on the right-hand side. More generally, one has
whenever are discrete subgroups of , is a -automorphic function with sufficient decay on , and is a -automorphic (and thus also -automorphic) function of reasonable growth. These identities will allow us to move fairly freely between the three domains , , and in our analysis.
When computing various statistics of a Poincaré series , such as its values at special points , or the quantity , expressions of interest to analytic number theory naturally emerge. We list three basic examples of this below, discussed somewhat informally in order to highlight the main ideas rather than the technical details.
The first example we will give concerns the problem of estimating the sum
where is the divisor function. This can be rewritten (by factoring and ) as
which is basically a sum over the full modular group . At this point we will “cheat” a little by moving to the related, but different, sum
This sum is not exactly the same as (8), but will be a little easier to handle, and it is plausible that the methods used to handle this sum can be modified to handle (8). Observe from (2) and some calculation that the distance between and is given by the formula
and so one can express the above sum as
(the factor of coming from the quotient by in the projective special linear group); one can express this as , where and is the indicator function of the ball . Thus we see that expressions such as (7) are related to evaluations of Poincaré series. (In practice, it is much better to use smoothed out versions of indicator functions in order to obtain good control on sums such as (7) or (9), but we gloss over this technical detail here.)
The second example concerns the relative
of the sum (7). Note from multiplicativity that (7) can be written as , which is superficially very similar to (10), but with the key difference that the polynomial is irreducible over the integers.
As with (7), we may expand (10) as
At first glance this does not look like a sum over a modular group, but one can manipulate this expression into such a form in one of two (closely related) ways. First, observe that any factorisation of into Gaussian integers gives rise (upon taking norms) to an identity of the form , where and . Conversely, by using the unique factorisation of the Gaussian integers, every identity of the form gives rise to a factorisation of the form , essentially uniquely up to units. Now note that is of the form if and only if , in which case . Thus we can essentially write the above sum as something like
and one the modular group is now manifest. An equivalent way to see these manipulations is as follows. A triple of natural numbers with gives rise to a positive quadratic form of normalised discriminant equal to with integer coefficients (it is natural here to allow to take integer values rather than just natural number values by essentially doubling the sum). The group acts on the space of such quadratic forms in a natural fashion (by composing the quadratic form with the inverse of an element of ). Because the discriminant has class number one (this fact is equivalent to the unique factorisation of the gaussian integers, as discussed in this previous post), every form in this space is equivalent (under the action of some element of ) with the standard quadratic form . In other words, one has
which (up to a harmless sign) is exactly the representation , , introduced earlier, and leads to the same reformulation of the sum (10) in terms of expressions like (11). Similar considerations also apply if the quadratic polynomial is replaced by another quadratic, although one has to account for the fact that the class number may now exceed one (so that unique factorisation in the associated quadratic ring of integers breaks down), and in the positive discriminant case the fact that the group of units might be infinite presents another significant technical problem.
Note that has real part and imaginary part . Thus (11) is (up to a factor of two) the Poincaré series as in the preceding example, except that is now the indicator of the sector .
Sums involving subgroups of the full modular group, such as , often arise when imposing congruence conditions on sums such as (10), for instance when trying to estimate the expression when and are large. As before, one then soon arrives at the problem of evaluating a Poincaré series at one or more special points, where the series is now over rather than .
The third and final example concerns averages of Kloosterman sums
where and is the inverse of in the multiplicative group . It turns out that the norms of Poincaré series or are closely tied to such averages. Consider for instance the quantity
where is a natural number and is a -automorphic form that is of the form
for some integer and some test function , which for sake of discussion we will take to be smooth and compactly supported. Using the unfolding formula (6), we may rewrite (13) as
To compute this, we use the double coset decomposition
where for each , are arbitrarily chosen integers such that . To see this decomposition, observe that every element in outside of can be assumed to have by applying a sign , and then using the row and column operations coming from left and right multiplication by (that is, shifting the top row by an integer multiple of the bottom row, and shifting the right column by an integer multiple of the left column) one can place in the interval and to be any specified integer pair with . From this we see that
and so from further use of the unfolding formula (5) we may expand (13) as
The first integral is just . The second expression is more interesting. We have
so we can write
as
which on shifting by simplifies a little to
and then on scaling by simplifies a little further to
Note that as , we have modulo . Comparing the above calculations with (12), we can thus write (13) as
where
is a certain integral involving and a parameter , but which does not depend explicitly on parameters such as . Thus we have indeed expressed the expression (13) in terms of Kloosterman sums. It is possible to invert this analysis and express varius weighted sums of Kloosterman sums in terms of expressions (possibly involving inner products instead of norms) of Poincaré series, but we will not do so here; see Chapter 16 of Iwaniec and Kowalski for further details.
Traditionally, automorphic forms have been analysed using the spectral theory of the Laplace-Beltrami operator on spaces such as or , so that a Poincaré series such as might be expanded out using inner products of (or, by the unfolding identities, ) with various generalised eigenfunctions of (such as cuspidal eigenforms, or Eisenstein series). With this approach, special functions, and specifically the modified Bessel functions of the second kind, play a prominent role, basically because the -automorphic functions
for and non-zero are generalised eigenfunctions of (with eigenvalue ), and are almost square-integrable on (the norm diverges only logarithmically at one end of the cylinder , while decaying exponentially fast at the other end ).
However, as discussed in this previous post, the spectral theory of an essentially self-adjoint operator such as is basically equivalent to the theory of various solution operators associated to partial differential equations involving that operator, such as the Helmholtz equation , the heat equation , the Schrödinger equation , or the wave equation . Thus, one can hope to rephrase many arguments that involve spectral data of into arguments that instead involve resolvents , heat kernels , Schrödinger propagators , or wave propagators , or involve the PDE more directly (e.g. applying integration by parts and energy methods to solutions of such PDE). This is certainly done to some extent in the existing literature; resolvents and heat kernels, for instance, are often utilised. In this post, I would like to explore the possibility of reformulating spectral arguments instead using the inhomogeneous wave equation
Actually it will be a bit more convenient to normalise the Laplacian by , and look instead at the automorphic wave equation
This equation somewhat resembles a “Klein-Gordon” type equation, except that the mass is imaginary! This would lead to pathological behaviour were it not for the negative curvature, which in principle creates a spectral gap of that cancels out this factor.
The point is that the wave equation approach gives access to some nice PDE techniques, such as energy methods, Sobolev inequalities and finite speed of propagation, which are somewhat submerged in the spectral framework. The wave equation also interacts well with Poincaré series; if for instance and are -automorphic solutions to (15) obeying suitable decay conditions, then their Poincaré series and will be -automorphic solutions to the same equation (15), basically because the Laplace-Beltrami operator commutes with translations. Because of these facts, it is possible to replicate several standard spectral theory arguments in the wave equation framework, without having to deal directly with things like the asymptotics of modified Bessel functions. The wave equation approach to automorphic theory was introduced by Faddeev and Pavlov (using the Lax-Phillips scattering theory), and developed further by by Lax and Phillips, to recover many spectral facts about the Laplacian on modular curves, such as the Weyl law and the Selberg trace formula. Here, I will illustrate this by deriving three basic applications of automorphic methods in a wave equation framework, namely
- Using the Weil bound on Kloosterman sums to derive Selberg’s 3/16 theorem on the least non-trivial eigenvalue for on (discussed previously here);
- Conversely, showing that Selberg’s eigenvalue conjecture (improving Selberg’s bound to the optimal ) implies an optimal bound on (smoothed) sums of Kloosterman sums; and
- Using the same bound to obtain pointwise bounds on Poincaré series similar to the ones discussed above. (Actually, the argument here does not use the wave equation, instead it just uses the Sobolev inequality.)
This post originated from an attempt to finally learn this part of analytic number theory properly, and to see if I could use a PDE-based perspective to understand it better. Ultimately, this is not that dramatic a depature from the standard approach to this subject, but I found it useful to think of things in this fashion, probably due to my existing background in PDE.
I thank Bill Duke and Ben Green for helpful discussions. My primary reference for this theory was Chapters 15, 16, and 21 of Iwaniec and Kowalski.
The equidistribution theorem asserts that if is an irrational phase, then the sequence is equidistributed on the unit circle, or equivalently that
for any continuous (or equivalently, for any smooth) function . By approximating uniformly by a Fourier series, this claim is equivalent to that of showing that
for any non-zero integer (where ), which is easily verified from the irrationality of and the geometric series formula. Conversely, if is rational, then clearly fails to go to zero when is a multiple of the denominator of .
One can then ask for more quantitative information about the decay of exponential sums of , or more generally on exponential sums of the form for an arithmetic progression (in this post all progressions are understood to be finite) and a polynomial . It will be convenient to phrase such information in the form of an inverse theorem, describing those phases for which the exponential sum is large. Indeed, we have
Lemma 1 (Geometric series formula, inverse form) Let be an arithmetic progression of length at most for some , and let be a linear polynomial for some . If
for some , then there exists a subprogression of of size such that varies by at most on (that is to say, lies in a subinterval of of length at most ).
Proof: By a linear change of variable we may assume that is of the form for some . We may of course assume that is non-zero in , so that ( denotes the distance from to the nearest integer). From the geometric series formula we see that
and so . Setting for some sufficiently small absolute constant , we obtain the claim.
Thus, in order for a linear phase to fail to be equidistributed on some long progression , must in fact be almost constant on large piece of .
As is well known, this phenomenon generalises to higher order polynomials. To achieve this, we need two elementary additional lemmas. The first relates the exponential sums of to the exponential sums of its “first derivatives” .
Lemma 2 (Van der Corput lemma, inverse form) Let be an arithmetic progression of length at most , and let be an arbitrary function such that
for some . Then, for integers , there exists a subprogression of , of the same spacing as , such that
Proof: Squaring (1), we see that
We write and conclude that
where is a subprogression of of the same spacing. Since , we conclude that
for values of (this can be seen, much like the pigeonhole principle, by arguing via contradiction for a suitable choice of implied constants). The claim follows.
The second lemma (which we recycle from this previous blog post) is a variant of the equidistribution theorem.
Lemma 3 (Vinogradov lemma) Let be an interval for some , and let be such that for at least values of , for some . Then either
or
or else there is a natural number such that
Proof: We may assume that and , since we are done otherwise. Then there are at least two with , and by the pigeonhole principle we can find in with and . By the triangle inequality, we conclude that there exists at least one natural number for which
We take to be minimal amongst all such natural numbers, then we see that there exists coprime to and such that
If then we are done, so suppose that . Suppose that are elements of such that and . Writing for some , we have
By hypothesis, ; note that as and we also have . This implies that and thus . We then have
We conclude that for fixed with , there are at most elements of such that . Iterating this with a greedy algorithm, we see that the number of with is at most ; since , this implies that
and the claim follows.
Now we can quickly obtain a higher degree version of Lemma 1:
Proposition 4 (Weyl exponential sum estimate, inverse form) Let be an arithmetic progression of length at most for some , and let be a polynomial of some degree at most . If
for some , then there exists a subprogression of with such that varies by at most on .
Proof: We induct on . The cases are immediate from Lemma 1. Now suppose that , and that the claim had already been proven for . To simplify the notation we allow implied constants to depend on . Let the hypotheses be as in the proposition. Clearly cannot exceed . By shrinking as necessary we may assume that for some sufficiently small constant depending on .
By rescaling we may assume . By Lemma 3, we see that for choices of such that
for some interval . We write , then is a polynomial of degree at most with leading coefficient . We conclude from induction hypothesis that for each such , there exists a natural number such that , by double-counting, this implies that there are integers in the interval such that . Applying Lemma 3, we conclude that either , or that
In the former case the claim is trivial (just take to be a point), so we may assume that we are in the latter case.
We partition into arithmetic progressions of spacing and length comparable to for some large depending on to be chosen later. By hypothesis, we have
so by the pigeonhole principle, we have
for at least one such progression . On this progression, we may use the binomial theorem and (4) to write as a polynomial in of degree at most , plus an error of size . We thus can write for for some polynomial of degree at most . By the triangle inequality, we thus have (for large enough) that
and hence by induction hypothesis we may find a subprogression of of size such that varies by most on , and thus (for large enough again) that varies by at most on , and the claim follows.
This gives the following corollary (also given as Exercise 16 in this previous blog post):
Corollary 5 (Weyl exponential sum estimate, inverse form II) Let be a discrete interval for some , and let polynomial of some degree at most for some . If
for some , then there is a natural number such that for all .
One can obtain much better exponents here using Vinogradov’s mean value theorem; see Theorem 1.6 this paper of Wooley. (Thanks to Mariusz Mirek for this reference.) However, this weaker result already suffices for many applications, and does not need any result as deep as the mean value theorem.
Proof: To simplify notation we allow implied constants to depend on . As before, we may assume that for some small constant depending only on . We may also assume that for some large , as the claim is trivial otherwise (set ).
Applying Proposition 4, we can find a natural number and an arithmetic subprogression of such that and such that varies by at most on . Writing for some interval of length and some , we conclude that the polynomial varies by at most on . Taking order differences, we conclude that the coefficient of this polynomial is ; by the binomial theorem, this implies that differs by at most on from a polynomial of degree at most . Iterating this, we conclude that the coefficient of is for , and the claim then follows by inverting the change of variables (and replacing with a larger quantity such as as necessary).
For future reference we also record a higher degree version of the Vinogradov lemma.
Lemma 6 (Polynomial Vinogradov lemma) Let be a discrete interval for some , and let be a polynomial of degree at most for some such that for at least values of , for some . Then either
or else there is a natural number such that
for all .
Proof: We induct on . For this follows from Lemma 3 (noting that if then ), so suppose that and that the claim is already proven for . We now allow all implied constants to depend on .
For each , let denote the number of such that . By hypothesis, , and clearly , so we must have for choices of . For each such , we then have for choices of , so by induction hypothesis, either (5) or (6) holds, or else for choices of , there is a natural number such that
for , where are the coefficients of the degree polynomial . We may of course assume it is the latter which holds. By the pigeonhole principle we may take to be independent of .
Since , we have
for choices of , so by Lemma 3, either (5) or (6) holds, or else (after increasing as necessary) we have
We can again assume it is the latter that holds. This implies that modulo , so that
for choices of . Arguing as before and iterating, we obtain the claim.
The above results also extend to higher dimensions. Here is the higher dimensional version of Proposition 4:
Proposition 7 (Multidimensional Weyl exponential sum estimate, inverse form) Let and , and let be arithmetic progressions of length at most for each . Let be a polynomial of degrees at most in each of the variables separately. If
for some , then there exists a subprogression of with for each such that varies by at most on .
A much more general statement, in which the polynomial phase is replaced by a nilsequence, and in which one does not necessarily assume the exponential sum is small, is given in Theorem 8.6 of this paper of Ben Green and myself, but it involves far more notation to even state properly.
Proof: We induct on . The case was established in Proposition 5, so we assume that and that the claim has already been proven for . To simplify notation we allow all implied constants to depend on . We may assume that for some small depending only on .
By a linear change of variables, we may assume that for all .
We write . First suppose that . Then by the pigeonhole principle we can find such that
and the claim then follows from the induction hypothesis. Thus we may assume that for some large depending only on . Similarly we may assume that for all .
By the triangle inequality, we have
The inner sum is , and the outer sum has terms. Thus, for choices of , one has
for some polynomials of degrees at most in the variables . For each obeying (7), we apply Corollary 5 to conclude that there exists a natural number such that
for (the claim also holds for but we discard it as being trivial). By the pigeonhole principle, there thus exists a natural number such that
for all and for choices of . If we write
where is a polynomial of degrees at most , then for choices of we then have
Applying Lemma 6 in the and the largeness hypotheses on the (and also the assumption that ) we conclude (after enlarging as necessary, and pigeonholing to keep independent of ) that
for all (note that we now include that case, which is no longer trivial) and for choices of . Iterating this, we eventually conclude (after enlarging as necessary) that
whenever for , with nonzero. Permuting the indices, and observing that the claim is trivial for , we in fact obtain (8) for all , at which point the claim easily follows by taking for each .
An inspection of the proof of the above result (or alternatively, by combining the above result again with many applications of Lemma 6) reveals the following general form of Proposition 4, which was posed as Exercise 17 in this previous blog post, but had a slight misprint in it (it did not properly treat the possibility that some of the could be small) and was a bit trickier to prove than anticipated (in fact, the reason for this post was that I was asked to supply a more detailed solution for this exercise):
Proposition 8 (Multidimensional Weyl exponential sum estimate, inverse form, II) Let be an natural number, and for each , let be a discrete interval for some . Let
be a polynomial in variables of multidegrees for some . If
for some , or else there is a natural number such that
Again, the factor of is natural in this bound. In the case, the option (10) may be deleted since (11) trivially holds in this case, but this simplification is no longer available for since one needs (10) to hold for all (not just one ) to make (11) completely trivial. Indeed, the above proposition fails for if one removes (10) completely, as can be seen for instance by inspecting the exponential sum , which has size comparable to regardless of how irrational is.
Chantal David, Andrew Granville, Emmanuel Kowalski, Phillipe Michel, Kannan Soundararajan, and I are running a program at MSRI in the Spring of 2017 (more precisely, from Jan 17, 2017 to May 26, 2017) in the area of analytic number theory, with the intention to bringing together many of the leading experts in all aspects of the subject and to present recent work on the many active areas of the subject (the discussion on previous blog posts here have mostly focused on advances in the study of the distribution of the prime numbers, but there have been many other notable recent developments too, such as refinements of the circle method, a deeper understanding of the asymptotics of bounded multiplicative functions and of the “pretentious” approach to analytic number theory, more “analysis-friendly” formulations of the theorems of Deligne and others involving trace functions over fields, and new subconvexity theorems for automorphic forms, to name a few). Like any other semester MSRI program, there will be a number of workshops, seminars, and similar activities taking place while the members are in residence. I’m personally looking forward to the program, which should be occurring in the midst of a particularly productive time for the subject. Needless to say, I (and the rest of the organising committee) plan to be present for most of the program.
Applications for Postdoctoral Fellowships, Research Memberships, and Research Professorships for this program (and for other MSRI programs in this time period, namely the companion program in Harmonic Analysis and the Fall program in Geometric Group Theory, as well as the complementary program in all other areas of mathematics) have just opened up today. Applications are open to everyone (until they close on Dec 1), but require supporting documentation, such as a CV, statement of purpose, and letters of recommendation from other mathematicians; see the application page for more details.
Recent Comments