You are currently browsing the tag archive for the ‘arithmetic progressions’ tag.

In the previous set of notes, we saw how zero-density theorems for the Riemann zeta function, when combined with the zero-free region of Vinogradov and Korobov, could be used to obtain prime number theorems in short intervals. It turns out that a more sophisticated version of this type of argument also works to obtain prime number theorems in arithmetic progressions, in particular establishing the celebrated theorem of Linnik:

Theorem 1 (Linnik’s theorem)Let be a primitive residue class. Then contains a prime with .

In fact it is known that one can find a prime with , a result of Xylouris. For sake of comparison, recall from Exercise 65 of Notes 2 that the Siegel-Walfisz theorem gives this theorem with a bound of , and from Exercise 48 of Notes 2 one can obtain a bound of the form if one assumes the generalised Riemann hypothesis. The probabilistic random models from Supplement 4 suggest that one should in fact be able to take .

We will not aim to obtain the optimal exponents for Linnik’s theorem here, and follow the treatment in Chapter 18 of Iwaniec and Kowalski. We will in fact establish the following more quantitative result (a special case of a more powerful theorem of Gallagher), which splits into two cases, depending on whether there is an exceptional zero or not:

Theorem 2 (Quantitative Linnik theorem)Let be a primitive residue class for some . For any , let denote the quantityAssume that for some sufficiently large .

- (i) (No exceptional zero) If all the real zeroes of -functions of real characters of modulus are such that , then
for all and some absolute constant .

- (ii) (Exceptional zero) If there is a zero of an -function of a real character of modulus with for some sufficiently small , then
for all and some absolute constant .

The implied constants here are effective.

Note from the Landau-Page theorem (Exercise 54 from Notes 2) that at most one exceptional zero exists (if is small enough). A key point here is that the error term in the exceptional zero case is an *improvement* over the error term when no exceptional zero is present; this compensates for the potential reduction in the main term coming from the term. The splitting into cases depending on whether an exceptional zero exists or not turns out to be an essential technique in many advanced results in analytic number theory (though presumably such a splitting will one day become unnecessary, once the possibility of exceptional zeroes are finally eliminated for good).

Exercise 3Assuming Theorem 2, and assuming for some sufficiently large absolute constant , establish the lower boundwhen there is no exceptional zero, and

when there is an exceptional zero . Conclude that Theorem 2 implies Theorem 1, regardless of whether an exceptional zero exists or not.

Remark 4The Brun-Titchmarsh theorem (Exercise 33 from Notes 4), in the sharp form of Montgomery and Vaughan, gives thatfor any primitive residue class and any . This is (barely) consistent with the estimate (1). Any lowering of the coefficient in the Brun-Titchmarsh inequality (with reasonable error terms), in the regime when is a large power of , would then lead to at least some elimination of the exceptional zero case. However, this has not led to any progress on the Landau-Siegel zero problem (and may well be just a reformulation of that problem). (When is a relatively small power of , some improvements to Brun-Titchmarsh are possible that are not in contradiction with the presence of an exceptional zero; see this paper of Maynard for more discussion.

Theorem 2 is deduced in turn from facts about the distribution of zeroes of -functions. Recall from the truncated explicit formula (Exercise 45(iv) of Notes 2) with (say) that

for any non-principal character of modulus , where we assume for some large ; for the principal character one has the same formula with an additional term of on the right-hand side (as is easily deduced from Theorem 21 of Notes 2). Using the Fourier inversion formula

(see Theorem 69 of Notes 1), we thus have

and so it suffices by the triangle inequality (bounding very crudely by , as the contribution of the low-lying zeroes already turns out to be quite dominant) to show that

when no exceptional zero is present, and

when an exceptional zero is present.

To handle the former case (2), one uses two facts about zeroes. The first is the classical zero-free region (Proposition 51 from Notes 2), which we reproduce in our context here:

Proposition 5 (Classical zero-free region)Let . Apart from a potential exceptional zero , all zeroes of -functions with of modulus and are such thatfor some absolute constant .

Using this zero-free region, we have

whenever contributes to the sum in (2), and so the left-hand side of (2) is bounded by

where we recall that is the number of zeroes of any -function of a character of modulus with and (here we use conjugation symmetry to make non-negative, accepting a multiplicative factor of two).

In Exercise 25 of Notes 6, the grand density estimate

is proven. If one inserts this bound into the above expression, one obtains a bound for (2) which is of the form

Unfortunately this is off from what we need by a factor of (and would lead to a weak form of Linnik’s theorem in which was bounded by rather than by ). In the analogous problem for prime number theorems in short intervals, we could use the Vinogradov-Korobov zero-free region to compensate for this loss, but that region does not help here for the contribution of the low-lying zeroes with , which as mentioned before give the dominant contribution. Fortunately, it is possible to remove this logarithmic loss from the zero-density side of things:

Theorem 6 (Log-free grand density estimate)For any and , one hasThe implied constants are effective.

We prove this estimate below the fold. The proof follows the methods of the previous section, but one inserts various sieve weights to restrict sums over natural numbers to essentially become sums over “almost primes”, as this turns out to remove the logarithmic losses. (More generally, the trick of restricting to almost primes by inserting suitable sieve weights is quite useful for avoiding any unnecessary losses of logarithmic factors in analytic number theory estimates.)

Now we turn to the case when there is an exceptional zero (3). The argument used to prove (2) applies here also, but does not gain the factor of in the exponent. To achieve this, we need an additional tool, a version of the Deuring-Heilbronn repulsion phenomenon due to Linnik:

Theorem 8 (Deuring-Heilbronn repulsion phenomenon)Suppose is such that there is an exceptional zero with small. Then all other zeroes of -functions of modulus are such thatIn other words, the exceptional zero enlarges the classical zero-free region by a factor of . The implied constants are effective.

Exercise 9Use Theorem 6 and Theorem 8 to complete the proof of (3), and thus Linnik’s theorem.

Exercise 10Use Theorem 8 to give an alternate proof of (Tatuzawa’s version of) Siegel’s theorem (Theorem 62 of Notes 2). (Hint:if two characters have different moduli, then they can be made to have the same modulus by multiplying by suitable principal characters.)

Theorem 8 is proven by similar methods to that of Theorem 6, the basic idea being to insert a further weight of (in addition to the sieve weights), the point being that the exceptional zero causes this weight to be quite small on the average. There is a strengthening of Theorem 8 due to Bombieri that is along the lines of Theorem 6, obtaining the improvement

with effective implied constants for any and in the presence of an exceptional zero, where the prime in means that the exceptional zero is omitted (thus if ). Note that the upper bound on falls below one when for a sufficiently small , thus recovering Theorem 8. Bombieri’s theorem can be established by the methods in this set of notes, and will be given as an exercise to the reader.

Remark 11There are a number of alternate ways to derive the results in this set of notes, for instance using the Turan power sums method which is based on studying derivatives such asfor and large , and performing various sorts of averaging in to attenuate the contribution of many of the zeroes . We will not develop this method here, but see for instance Chapter 9 of Montgomery’s book. See the text of Friedlander and Iwaniec for yet another approach based primarily on sieve-theoretic ideas.

Remark 12When one optimises all the exponents, it turns out that the exponent in Linnik’s theorem isextremelygood in the presence of an exceptional zero – indeed Friedlander and Iwaniec showed can even get a bound of the form for some , which is even stronger than one can obtain from GRH! There are other places in which exceptional zeroes can be used to obtain results stronger than what one can obtain even on the Riemann hypothesis; for instance, Heath-Brown used the hypothesis of an infinite sequence of Siegel zeroes to obtain the twin prime conejcture.

Kevin Ford, Ben Green, Sergei Konyagin, and myself have just posted to the arXiv our preprint “Large gaps between consecutive prime numbers“. This paper concerns the “opposite” problem to that considered by the recently concluded Polymath8 project, which was concerned with very small values of the prime gap . Here, we wish to consider the *largest* prime gap that one can find in the interval as goes to infinity.

Finding lower bounds on is more or less equivalent to locating long strings of consecutive composite numbers that are not too large compared to the length of the string. A classic (and quite well known) construction here starts with the observation that for any natural number , the consecutive numbers are all composite, because each , is divisible by some prime , while being strictly larger than that prime . From this and Stirling’s formula, it is not difficult to obtain the bound

A more efficient bound comes from the prime number theorem: there are only primes up to , so just from the pigeonhole principle one can locate a string of consecutive composite numbers up to of length at least , thus

where we use or as shorthand for or .

What about upper bounds? The *Cramér random model* predicts that the primes up to are distributed like a random subset of density . Using this model, Cramér arrived at the conjecture

In fact, if one makes the extremely optimistic assumption that the random model perfectly describes the behaviour of the primes, one would arrive at the even more precise prediction

However, it is no longer widely believed that this optimistic version of the conjecture is true, due to some additional irregularities in the primes coming from the basic fact that large primes cannot be divisible by very small primes. Using the Maier matrix method to capture some of this irregularity, Granville was led to the conjecture that

(note that is slightly larger than ). For comparison, the known upper bounds on are quite weak; unconditionally one has by the work of Baker, Harman, and Pintz, and even on the Riemann hypothesis one only gets down to , as shown by Cramér (a slight improvement is also possible if one additionally assumes the pair correlation conjecture; see this article of Heath-Brown and the references therein).

This conjecture remains out of reach of current methods. In 1931, Westzynthius managed to improve the bound (2) slightly to

which Erdös in 1935 improved to

and Rankin in 1938 improved slightly further to

with . Remarkably, this rather strange bound then proved extremely difficult to advance further on; until recently, the only improvements were to the constant , which was raised to in 1963 by Schönhage, to in 1963 by Rankin, to by Maier and Pomerance, and finally to in 1997 by Pintz.

Erdös listed the problem of making arbitrarily large one of his favourite open problems, even offering (“somewhat rashly”, in his words) a cash prize for the solution. Our main result answers this question in the affirmative:

Theorem 1The bound (3) holds for arbitrarily large .

In principle, we thus have a bound of the form

for some that grows to infinity. Unfortunately, due to various sources of ineffectivity in our methods, we cannot provide any explicit rate of growth on at all.

We decided to announce this result the old-fashioned way, as part of a research lecture; more precisely, Ben Green announced the result in his ICM lecture this Tuesday. (The ICM staff have very efficiently put up video of his talks (and most of the other plenary and prize talks) online; Ben’s talk is here, with the announcement beginning at about 0:48. Note a slight typo in his slides, in that the exponent of in the denominator is instead of .) Ben’s lecture slides may be found here.

By coincidence, an independent proof of this theorem has also been obtained very recently by James Maynard.

I discuss our proof method below the fold.

Ben Green and I have just uploaded to the arXiv our paper “New bounds for Szemeredi’s theorem, Ia: Progressions of length 4 in finite field geometries revisited“, submitted to Proc. Lond. Math. Soc.. This is both an erratum to, and a replacement for, our previous paper “New bounds for Szemeredi’s theorem. I. Progressions of length 4 in finite field geometries“. The main objective in both papers is to bound the quantity for a vector space over a finite field of characteristic greater than , where is defined as the cardinality of the largest subset of that does not contain an arithmetic progression of length . In our earlier paper, we gave two arguments that bounded in the regime when the field was fixed and was large. The first “cheap” argument gave the bound

and the more complicated “expensive” argument gave the improvement

for some constant depending only on .

Unfortunately, while the cheap argument is correct, we discovered a subtle but serious gap in our expensive argument in the original paper. Roughly speaking, the strategy in that argument is to employ the *density increment method*: one begins with a large subset of that has no arithmetic progressions of length , and seeks to locate a subspace on which has a significantly increased density. Then, by using a “Koopman-von Neumann theorem”, ultimately based on an iteration of the inverse theorem of Ben and myself (and also independently by Samorodnitsky), one approximates by a “quadratically structured” function , which is (locally) a combination of a bounded number of quadratic phase functions, which one can prepare to be in a certain “locally equidistributed” or “locally high rank” form. (It is this reduction to the high rank case that distinguishes the “expensive” argument from the “cheap” one.) Because has no progressions of length , the count of progressions of length weighted by will also be small; by combining this with the theory of equidistribution of quadratic phase functions, one can then conclude that there will be a subspace on which has increased density.

The error in the paper was to conclude from this that the original function also had increased density on the same subspace; it turns out that the manner in which approximates is not strong enough to deduce this latter conclusion from the former. (One can strengthen the nature of approximation until one restores such a conclusion, but only at the price of deteriorating the quantitative bounds on one gets at the end of the day to be worse than the cheap argument.)

After trying unsuccessfully to repair this error, we eventually found an alternate argument, based on earlier papers of ourselves and of Bergelson-Host-Kra, that avoided the density increment method entirely and ended up giving a simpler proof of a stronger result than (1), and also gives the explicit value of for the exponent in (1). In fact, it gives the following stronger result:

Theorem 1Let be a subset of of density at least , and let . Then there is a subspace of of codimension such that the number of (possibly degenerate) progressions in is at least .

The bound (1) is an easy consequence of this theorem after choosing and removing the degenerate progressions from the conclusion of the theorem.

The main new idea is to work with a *local* Koopman-von Neumann theorem rather than a global one, trading a relatively weak global approximation to with a significantly stronger local approximation to on a subspace . This is somewhat analogous to how sometimes in graph theory it is more efficient (from the point of view of quantative estimates) to work with a local version of the Szemerédi regularity lemma which gives just a single regular pair of cells, rather than attempting to regularise almost all of the cells. This local approach is well adapted to the inverse theorem we use (which also has this local aspect), and also makes the reduction to the high rank case much cleaner. At the end of the day, one ends up with a fairly large subspace on which is quite dense (of density ) and which can be well approximated by a “pure quadratic” object, namely a function of a small number of quadratic phases obeying a high rank condition. One can then exploit a special positivity property of the count of length four progressions weighted by pure quadratic objects, essentially due to Bergelson-Host-Kra, which then gives the required lower bound.

In this, the final lecture notes of this course, we discuss one of the motivating applications of the theory developed thus far, namely to count solutions to linear equations in primes (or in dense subsets of primes ). Unfortunately, the most famous linear equations in primes: the twin prime equation and the even Goldbach equation – remain out of reach of this technology (because the relevant affine linear forms involved are commensurate, and thus have infinite complexity with respect to the Gowers norms), but most other systems of equations, in particular that of arithmetic progressions for (or equivalently, for ) , as well as the odd Goldbach equation , are tractable.

To illustrate the main ideas, we will focus on the following result of Green:

Theorem 1 (Roth’s theorem in the primes)Let be a subset of primes whose upper density is positive. Then contains infinitely many arithmetic progressions of length three.

This should be compared with Roth’s theorem in the integers (Notes 2), which is the same statement but with the primes replaced by the integers (or natural numbers ). Indeed, Roth’s theorem for the primes is proven by *transferring* Roth’s theorem for the integers to the prime setting; the latter theorem is used as a “black box”. The key difficulty here in performing this transference is that the primes have zero density inside the integers; indeed, from the prime number theorem we have .

There are a number of generalisations of this transference technique. In a paper of Green and myself, we extended the above theorem to progressions of longer length (thus transferring Szemerédi’s theorem to the primes). In a series of papers (culminating in a paper to appear shortly) of Green, myself, and also Ziegler, related methods are also used to obtain an asymptotic for the number of solutions in the primes to any system of linear equations of bounded complexity. This latter result uses the full power of higher order Fourier analysis, in particular relying heavily on the inverse conjecture for the Gowers norms; in contrast, Roth’s theorem and Szemerédi’s theorem in the primes are “softer” results that do not need this conjecture.

To transfer results from the integers to the primes, there are three basic steps:

- A general transference principle, that transfers certain types of additive combinatorial results from dense subsets of the integers to dense subsets of a suitably “pseudorandom set” of integers (or more precisely, to the integers weighted by a suitably “pseudorandom measure”);
- An application of sieve theory to show that the primes (or more precisely, an affine modification of the primes) lie inside a suitably pseudorandom set of integers (or more precisely, have significant mass with respect to a suitably pseudorandom measure).
- If one is seeking asymptotics for patterns in the primes, and not simply lower bounds, one also needs to control correlations between the primes (or proxies for the primes, such as the Möbius function) with various objects that arise from higher order Fourier analysis, such as nilsequences.

The former step can be accomplished in a number of ways. For progressions of length three (and more generally, for controlling linear patterns of complexity at most one), transference can be accomplished by Fourier-analytic methods. For more complicated patterns, one can use techniques inspired by ergodic theory; more recently, simplified and more efficient methods based on duality (the Hahn-Banach theorem) have also been used. No number theory is used in this step. (In the case of transference to *genuinely* random sets, rather than pseudorandom sets, similar ideas appeared earlier in the graph theory setting, see this paper of Kohayakawa, Luczak, and Rodl.

The second step is accomplished by fairly standard sieve theory methods (e.g. the Selberg sieve, or the slight variants of this sieve used by Goldston and Yildirim). Remarkably, very little of the formidable apparatus of modern analytic number theory is needed for this step; for instance, the only fact about the Riemann zeta function that is truly needed is that it has a simple pole at , and no knowledge of L-functions is needed.

The third step does draw more significantly on analytic number theory techniques and results (most notably, the method of Vinogradov to compute oscillatory sums over the primes, and also the Siegel-Walfisz theorem that gives a good error term on the prime number theorem in arithemtic progressions). As these techniques are somewhat orthogonal to the main topic of this course, we shall only touch briefly on this aspect of the transference strategy.

This week I am at Penn State University, giving this year’s Marker lectures. My chosen theme for my four lectures here is “recent developments in additive prime number theory”. My first lecture, “Long arithmetic progressions in primes”, is similar to my AMS lecture on the same topic and so I am not reposting it here. The second lecture, the notes for which begin after the fold, is on “Linear equations in primes”. These two lectures focus primarily on work of myself and Ben Green. The third and fourth lectures, entitled “Small gaps between primes” and “Sieving for almost primes and expander graphs”, will instead be focused on the work of Goldston-Yildirim-Pintz and Bourgain-Gamburd-Sarnak respectively.

Read the rest of this entry »

This week I am in San Diego for the annual joint mathematics meeting of the American Mathematical Society and the Mathematical Association of America. I am giving two talks here. One is a lecture (for the AMS “Current Events” Bulletin) on recent developments (by Martel-Merle, Merle-Raphael, and others) on stability of solitons; I will post on that lecture at some point in the near future, once the survey paper associated to that lecture is finalised.

The other, which I am presenting here, is an address on “structure and randomness in the prime numbers“. Of course, I’ve talked about this general topic many times before, (e.g. at my Simons lecture at MIT, my Milliman lecture at U. Washington, and my Science Research Colloquium at UCLA), and I have given similar talks to the one here – which focuses on my original 2004 paper with Ben Green on long arithmetic progressions in the primes – about a dozen or so times. As such, this particular talk has probably run its course, and so I am “retiring” it by posting it here.

p.s. At this meeting, Endre Szemerédi was awarded the 2008 Steele prize for a seminal contribution to research, for his landmark paper establishing what is now known as Szemerédi’s theorem, which underlies the result I discuss in this talk. This prize is richly deserved – congratulations Endre! [The AMS and MAA also awarded prizes to several dozen other mathematicians, including many mentioned previously on this blog; rather than list them all here, let me just point you to their prize booklet.]

This week I am in Boston, giving this year’s Simons lectures at MIT together with David Donoho. (These lectures, incidentally, are endowed by Jim Simons, who was mentioned in some earlier discussion here.) While preparing these lectures, it occurred to me that I may as well post my lecture notes on this blog, since this medium is essentially just an asynchronous version of a traditional lecture series, and the hypertext capability is in some ways more convenient and informal than, say, slides.

I am giving three lectures, each expounding on some aspects of the theme “the dichotomy between structure and randomness”, which I also spoke about (and wrote about) for the ICM last August. This theme seems to pervade many of the areas of mathematics that I work in, and my lectures aim to explore how this theme manifests itself in several of these. In this, the first lecture, I describe the dichotomy as it appears in Fourier analysis and in number theory. (In the second, I discuss the dichotomy in ergodic theory and graph theory, while in the third, I discuss PDE.)

## Recent Comments