
Kevin Ford, Ben Green, Sergei Konyagin, and myself have just posted to the arXiv our preprint “Large gaps between consecutive prime numbers”. This paper concerns the “opposite” problem to that considered by the recently concluded Polymath8 project, which was concerned with very small values of the prime gap ${p_{n+1}-p_n}$. Here, we wish to consider the largest prime gap ${G(X) = p_{n+1}-p_n}$ that one can find in the interval ${[X] = \{1,\dots,X\}}$ as ${X}$ goes to infinity.

Finding lower bounds on ${G(X)}$ is more or less equivalent to locating long strings of consecutive composite numbers that are not too large compared to the length of the string. A classic (and quite well known) construction here starts with the observation that for any natural number ${n}$, the consecutive numbers ${n!+2, n!+3,\dots,n!+n}$ are all composite, because each ${n!+i}$, ${i=2,\dots,n}$ is divisible by some prime ${p \leq n}$, while being strictly larger than that prime ${p}$. From this and Stirling’s formula, it is not difficult to obtain the bound

$\displaystyle G(X) \gg \frac{\log X}{\log\log X}. \ \ \ \ \ (1)$
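The factorial construction is easy to check by machine for small ${n}$. The following is only a minimal Python sketch (the function names are illustrative and do not come from the paper): for each ${i=2,\dots,n}$ it exhibits a prime ${p \leq n}$ dividing ${n!+i}$, taking ${p}$ to be the smallest prime factor of ${i}$.

```python
from math import factorial


def smallest_prime_factor(m):
    """Return the smallest prime factor of an integer m >= 2 (trial division)."""
    d = 2
    while d * d <= m:
        if m % d == 0:
            return d
        d += 1
    return m


def factorial_composite_run(n):
    """Return [(n! + i, p)] for i = 2..n, where p <= n is a prime dividing n! + i."""
    f = factorial(n)
    run = []
    for i in range(2, n + 1):
        p = smallest_prime_factor(i)  # p divides i, and i divides n!, so p divides n! + i
        assert (f + i) % p == 0 and f + i > p  # hence n! + i is composite
        run.append((f + i, p))
    return run


for value, p in factorial_composite_run(6):
    print(value, "is divisible by", p)
```

For ${n=6}$ this lists the five consecutive composites ${722,\dots,726}$, which is a much shorter run than one can actually find below that height; this is consistent with (1) being a rather weak bound.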

A more efficient bound comes from the prime number theorem: there are only ${(1+o(1)) \frac{X}{\log X}}$ primes up to ${X}$, so just from the pigeonhole principle one can locate a string of consecutive composite numbers up to ${X}$ of length at least ${(1-o(1)) \log X}$, thus

$\displaystyle G(X) \gtrsim \log X \ \ \ \ \ (2)$

where we use ${X \gtrsim Y}$ or ${Y \lesssim X}$ as shorthand for ${X \geq (1-o(1)) Y}$ or ${Y \leq (1+o(1)) X}$.

What about upper bounds? The Cramér random model predicts that the primes up to ${X}$ are distributed like a random subset of ${\{1,\dots,X\}}$ of density ${1/\log X}$. Using this model, Cramér arrived at the conjecture

$\displaystyle G(X) \ll \log^2 X.$

In fact, if one makes the extremely optimistic assumption that the random model perfectly describes the behaviour of the primes, one would arrive at the even more precise prediction

$\displaystyle G(X) \sim \log^2 X.$
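To get a feel for these orders of magnitude, one can compute ${G(X)}$ directly for small ${X}$. The sketch below (plain Python, with a basic sieve of Eratosthenes; the helper names are my own) prints ${G(X)}$ together with the ratios ${G(X)/\log X}$ and ${G(X)/\log^2 X}$. The ranges reachable this way are of course far too small to say anything about the asymptotic conjectures.

```python
from math import log


def primes_up_to(x):
    """Sieve of Eratosthenes: return the list of primes <= x (for x >= 2)."""
    sieve = bytearray([1]) * (x + 1)
    sieve[0:2] = b"\x00\x00"
    for p in range(2, int(x ** 0.5) + 1):
        if sieve[p]:
            # Cross out p*p, p*p + p, ... (smaller multiples were handled earlier).
            sieve[p * p :: p] = bytearray(len(range(p * p, x + 1, p)))
    return [n for n in range(2, x + 1) if sieve[n]]


def largest_gap(x):
    """G(x): the largest difference between consecutive primes that are both <= x."""
    ps = primes_up_to(x)
    return max(q - p for p, q in zip(ps, ps[1:]))


for x in (10**3, 10**4, 10**5, 10**6):
    g = largest_gap(x)
    print(x, g, round(g / log(x), 2), round(g / log(x) ** 2, 2))
```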

However, it is no longer widely believed that this optimistic version of the conjecture is true, due to some additional irregularities in the primes coming from the basic fact that large primes cannot be divisible by very small primes. Using the Maier matrix method to capture some of this irregularity, Granville was led to the conjecture that

$\displaystyle G(X) \gtrsim 2e^{-\gamma} \log^2 X$

(note that ${2e^{-\gamma} = 1.1229\dots}$ is slightly larger than ${1}$). For comparison, the known upper bounds on ${G(X)}$ are quite weak; unconditionally one has ${G(X) \ll X^{0.525}}$ by the work of Baker, Harman, and Pintz, and even on the Riemann hypothesis one only gets down to ${G(X) \ll X^{1/2} \log X}$, as shown by Cramér (a slight improvement is also possible if one additionally assumes the pair correlation conjecture; see this article of Heath-Brown and the references therein).

This conjecture remains out of reach of current methods. In 1931, Westzynthius managed to improve the bound (2) slightly to

$\displaystyle G(X) \gg \frac{\log\log\log X}{\log\log\log\log X} \log X ,$

which Erdös in 1935 improved to

$\displaystyle G(X) \gg \frac{\log\log X}{(\log\log\log X)^2} \log X$

and Rankin in 1938 improved slightly further to

$\displaystyle G(X) \gtrsim c \frac{\log\log X (\log\log\log\log X)}{(\log\log\log X)^2} \log X \ \ \ \ \ (3)$

with ${c=1/3}$. Remarkably, this rather strange bound then proved extremely difficult to advance further on; until recently, the only improvements were to the constant ${c}$, which was raised to ${c=\frac{1}{2} e^\gamma}$ in 1963 by Schönhage, to ${c= e^\gamma}$ in 1963 by Rankin, to ${c = 1.31256 e^\gamma}$ by Maier and Pomerance, and finally to ${c = 2e^\gamma}$ in 1997 by Pintz.

Erdös listed the problem of making ${c}$ arbitrarily large as one of his favourite open problems, even offering (“somewhat rashly”, in his words) a cash prize for the solution. Our main result answers this question in the affirmative:

Theorem 1 The bound (3) holds for arbitrarily large ${c>0}$.

In principle, we thus have a bound of the form

$\displaystyle G(X) \geq f(X) \frac{\log\log X (\log\log\log\log X)}{(\log\log\log X)^2} \log X$

for some ${f(X)}$ that grows to infinity. Unfortunately, due to various sources of ineffectivity in our methods, we cannot provide any explicit rate of growth on ${f(X)}$ at all.
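It may help to see how slowly the Rankin-type factor multiplying ${\log X}$ in (3) actually grows. The following Python sketch (the name `rankin_factor` is just an illustrative choice) evaluates ${\frac{\log\log X (\log\log\log\log X)}{(\log\log\log X)^2}}$ numerically; it says nothing about the unknown function ${f(X)}$, only about the explicit factor.

```python
from math import log


def rankin_factor(x):
    """Evaluate log2(x) * log4(x) / log3(x)**2, where logk is the k-fold iterated log.

    Requires x > e**e (about 15.15), so that log log log x is positive and its
    logarithm is defined; the factor is positive only once x exceeds
    exp(exp(e)), roughly 3.8 million.
    """
    l2 = log(log(x))
    l3 = log(l2)
    l4 = log(l3)
    return l2 * l4 / l3 ** 2


for x in (1e10, 1e100, 1e300):
    print(x, rankin_factor(x))
```

Even at ${X = 10^{100}}$ the factor is still close to ${1}$, so in any numerically accessible range the bound (3) is hard to distinguish from the pigeonhole bound (2).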

We decided to announce this result the old-fashioned way, as part of a research lecture; more precisely, Ben Green announced the result in his ICM lecture this Tuesday. (The ICM staff have very efficiently put up video of his talks (and most of the other plenary and prize talks) online; Ben’s talk is here, with the announcement beginning at about 0:48. Note a slight typo in his slides, in that the exponent of ${\log\log\log X}$ in the denominator is ${3}$ instead of ${2}$.) Ben’s lecture slides may be found here.

By coincidence, an independent proof of this theorem has also been obtained very recently by James Maynard.

I discuss our proof method below the fold.

Ben Green and I have just uploaded to the arXiv our paper “New bounds for Szemeredi’s theorem, Ia: Progressions of length 4 in finite field geometries revisited”, submitted to Proc. Lond. Math. Soc. This is both an erratum to, and a replacement for, our previous paper “New bounds for Szemeredi’s theorem. I. Progressions of length 4 in finite field geometries”. The main objective in both papers is to bound the quantity ${r_4(F^n)}$ for a vector space ${F^n}$ over a finite field ${F}$ of characteristic greater than ${4}$, where ${r_4(F^n)}$ is defined as the cardinality of the largest subset of ${F^n}$ that does not contain an arithmetic progression of length ${4}$. In our earlier paper, we gave two arguments that bounded ${r_4(F^n)}$ in the regime when the field ${F}$ was fixed and ${n}$ was large. The first “cheap” argument gave the bound

$\displaystyle r_4(F^n) \ll |F|^n \exp( - c \sqrt{\log n} )$

and the more complicated “expensive” argument gave the improvement

$\displaystyle r_4(F^n) \ll |F|^n n^{-c} \ \ \ \ \ (1)$

for some constant ${c>0}$ depending only on ${F}$.

Unfortunately, while the cheap argument is correct, we discovered a subtle but serious gap in our expensive argument in the original paper. Roughly speaking, the strategy in that argument is to employ the density increment method: one begins with a large subset ${A}$ of ${F^n}$ that has no arithmetic progressions of length ${4}$, and seeks to locate a subspace on which ${A}$ has a significantly increased density. Then, by using a “Koopman-von Neumann theorem”, ultimately based on an iteration of the inverse ${U^3}$ theorem of Ben and myself (and also independently by Samorodnitsky), one approximates ${A}$ by a “quadratically structured” function ${f}$, which is (locally) a combination of a bounded number of quadratic phase functions, which one can prepare to be in a certain “locally equidistributed” or “locally high rank” form. (It is this reduction to the high rank case that distinguishes the “expensive” argument from the “cheap” one.) Because ${A}$ has no progressions of length ${4}$, the count of progressions of length ${4}$ weighted by ${f}$ will also be small; by combining this with the theory of equidistribution of quadratic phase functions, one can then conclude that there will be a subspace on which ${f}$ has increased density.

The error in the paper was to conclude from this that the original function ${1_A}$ also had increased density on the same subspace; it turns out that the manner in which ${f}$ approximates ${1_A}$ is not strong enough to deduce this latter conclusion from the former. (One can strengthen the nature of approximation until one restores such a conclusion, but only at the price of deteriorating the quantitative bounds on ${r_4(F^n)}$ one gets at the end of the day to be worse than the cheap argument.)

After trying unsuccessfully to repair this error, we eventually found an alternate argument, based on earlier papers of ourselves and of Bergelson-Host-Kra, that avoids the density increment method entirely, gives a simpler proof of a stronger result than (1), and also gives the explicit value of ${c = 2^{-22}}$ for the exponent ${c}$ in (1). In fact, it gives the following stronger result:

Theorem 1 Let ${A}$ be a subset of ${F^n}$ of density at least ${\alpha}$, and let ${\epsilon>0}$. Then there is a subspace ${W}$ of ${F^n}$ of codimension ${O( \epsilon^{-2^{20}})}$ such that the number of (possibly degenerate) progressions ${a, a+r, a+2r, a+3r}$ in ${A \cap W}$ is at least ${(\alpha^4-\epsilon)|W|^2}$.

The bound (1) is an easy consequence of this theorem after choosing ${\epsilon := \alpha^4/2}$ and removing the degenerate progressions from the conclusion of the theorem.
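For readers who want the deduction spelled out, here is a sketch of how it might go. The constant ${C}$ below is my own notation for the implied constant in the codimension bound (allowed to depend on ${F}$), and the paper may arrange the details differently.

```latex
% Sketch: deducing (1) from Theorem 1.
% C is an unspecified constant depending only on F (notation introduced for this sketch).
Suppose $A \subset F^n$ has density $\alpha$ and contains no progressions
$a, a+r, a+2r, a+3r$ with $r \neq 0$. Theorem 1 with $\epsilon := \alpha^4/2$
gives a subspace $W$ of codimension at most
$C \alpha^{-4 \cdot 2^{20}} = C \alpha^{-2^{22}}$ such that $A \cap W$ contains
at least $\frac{\alpha^4}{2} |W|^2$ progressions. All of these must be
degenerate ($r = 0$), and there are at most $|W|$ degenerate progressions, so
\[
  \frac{\alpha^4}{2} |W|^2 \leq |W|, \qquad \hbox{and hence} \qquad
  |F|^{n - C \alpha^{-2^{22}}} \leq |W| \leq 2 \alpha^{-4}.
\]
Taking logarithms to base $|F|$ and using $\alpha \leq 1$,
\[
  n \leq C \alpha^{-2^{22}} + \log_{|F|} ( 2 \alpha^{-4} ) \ll \alpha^{-2^{22}},
  \qquad \hbox{that is,} \qquad \alpha \ll n^{-2^{-22}},
\]
which is (1) with $c = 2^{-22}$.
```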

The main new idea is to work with a local Koopman-von Neumann theorem rather than a global one, trading a relatively weak global approximation to ${1_A}$ with a significantly stronger local approximation to ${1_A}$ on a subspace ${W}$. This is somewhat analogous to how sometimes in graph theory it is more efficient (from the point of view of quantitative estimates) to work with a local version of the Szemerédi regularity lemma which gives just a single regular pair of cells, rather than attempting to regularise almost all of the cells. This local approach is well adapted to the inverse ${U^3}$ theorem we use (which also has this local aspect), and also makes the reduction to the high rank case much cleaner. At the end of the day, one ends up with a fairly large subspace ${W}$ on which ${A}$ is quite dense (of density ${\alpha-O(\epsilon)}$) and which can be well approximated by a “pure quadratic” object, namely a function of a small number of quadratic phases obeying a high rank condition. One can then exploit a special positivity property of the count of length four progressions weighted by pure quadratic objects, essentially due to Bergelson-Host-Kra, which then gives the required lower bound.

In this, the final lecture notes of this course, we discuss one of the motivating applications of the theory developed thus far, namely to count solutions to linear equations in primes ${{\mathcal P} = \{2,3,5,7,\ldots\}}$ (or in dense subsets ${A}$ of primes ${{\mathcal P}}$). Unfortunately, the most famous linear equations in primes, namely the twin prime equation ${p_2 - p_1 = 2}$ and the even Goldbach equation ${p_1+p_2=N}$, remain out of reach of this technology (because the relevant affine linear forms involved are commensurate, and thus have infinite complexity with respect to the Gowers norms), but most other systems of equations, in particular that of arithmetic progressions ${p_i = n+ir}$ for ${i=0,\ldots,k-1}$ (or equivalently, ${p_i + p_{i+2} = 2p_{i+1}}$ for ${i=0,\ldots,k-3}$), as well as the odd Goldbach equation ${p_1+p_2+p_3=N}$, are tractable.

To illustrate the main ideas, we will focus on the following result of Green:

Theorem 1 (Roth’s theorem in the primes) Let ${A \subset {\mathcal P}}$ be a subset of primes whose upper density ${\limsup_{N \rightarrow \infty} |A \cap [N]|/|{\mathcal P} \cap [N]|}$ is positive. Then ${A}$ contains infinitely many arithmetic progressions of length three.
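As a concrete illustration of the objects in this theorem, the following Python sketch lists the three-term arithmetic progressions inside a finite set of primes (here ${A}$ is simply all primes up to a small bound; the function names are illustrative). It is a brute-force enumeration, not an implementation of the transference argument.

```python
def primes_up_to(n):
    """Sieve of Eratosthenes: return the list of primes <= n."""
    is_prime = [True] * (n + 1)
    is_prime[0:2] = [False, False]
    for p in range(2, int(n ** 0.5) + 1):
        if is_prime[p]:
            for m in range(p * p, n + 1, p):
                is_prime[m] = False
    return [k for k in range(2, n + 1) if is_prime[k]]


def three_term_progressions(numbers):
    """Return all triples (a, a + r, a + 2r) with r > 0 lying inside the given set."""
    members = set(numbers)
    ordered = sorted(members)
    found = []
    for i, a in enumerate(ordered):
        for b in ordered[i + 1:]:
            c = 2 * b - a  # the third term is forced by the first two
            if c in members:
                found.append((a, b, c))
    return found


print(three_term_progressions(primes_up_to(20)))
# [(3, 5, 7), (3, 7, 11), (3, 11, 19), (5, 11, 17), (7, 13, 19)]
```

Passing a thinner set of primes (for instance, only those in a fixed residue class) lets one experiment with subsets ${A}$ of positive relative density.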

This should be compared with Roth’s theorem in the integers (Notes 2), which is the same statement but with the primes ${{\mathcal P}}$ replaced by the integers ${{\bf Z}}$ (or natural numbers ${{\bf N}}$). Indeed, Roth’s theorem for the primes is proven by transferring Roth’s theorem for the integers to the prime setting; the latter theorem is used as a “black box”. The key difficulty here in performing this transference is that the primes have zero density inside the integers; indeed, from the prime number theorem we have ${|{\mathcal P} \cap [N]| = (1+o(1)) \frac{N}{\log N} = o(N)}$.

There are a number of generalisations of this transference technique. In a paper of Green and myself, we extended the above theorem to progressions of longer length (thus transferring Szemerédi’s theorem to the primes). In a series of papers (culminating in a paper to appear shortly) of Green, myself, and also Ziegler, related methods are also used to obtain an asymptotic for the number of solutions in the primes to any system of linear equations of bounded complexity. This latter result uses the full power of higher order Fourier analysis, in particular relying heavily on the inverse conjecture for the Gowers norms; in contrast, Roth’s theorem and Szemerédi’s theorem in the primes are “softer” results that do not need this conjecture.

To transfer results from the integers to the primes, there are three basic steps:

• A general transference principle, that transfers certain types of additive combinatorial results from dense subsets of the integers to dense subsets of a suitably “pseudorandom set” of integers (or more precisely, to the integers weighted by a suitably “pseudorandom measure”);
• An application of sieve theory to show that the primes (or more precisely, an affine modification of the primes) lie inside a suitably pseudorandom set of integers (or more precisely, have significant mass with respect to a suitably pseudorandom measure);
• If one is seeking asymptotics for patterns in the primes, and not simply lower bounds, one also needs to control correlations between the primes (or proxies for the primes, such as the Möbius function) with various objects that arise from higher order Fourier analysis, such as nilsequences.

The first step can be accomplished in a number of ways. For progressions of length three (and more generally, for controlling linear patterns of complexity at most one), transference can be accomplished by Fourier-analytic methods. For more complicated patterns, one can use techniques inspired by ergodic theory; more recently, simplified and more efficient methods based on duality (the Hahn-Banach theorem) have also been used. No number theory is used in this step. (In the case of transference to genuinely random sets, rather than pseudorandom sets, similar ideas appeared earlier in the graph theory setting; see this paper of Kohayakawa, Luczak, and Rodl.)

The second step is accomplished by fairly standard sieve theory methods (e.g. the Selberg sieve, or the slight variants of this sieve used by Goldston and Yildirim). Remarkably, very little of the formidable apparatus of modern analytic number theory is needed for this step; for instance, the only fact about the Riemann zeta function that is truly needed is that it has a simple pole at ${s=1}$, and no knowledge of L-functions is needed.

The third step does draw more significantly on analytic number theory techniques and results (most notably, the method of Vinogradov to compute oscillatory sums over the primes, and also the Siegel-Walfisz theorem that gives a good error term on the prime number theorem in arithmetic progressions). As these techniques are somewhat orthogonal to the main topic of this course, we shall only touch briefly on this aspect of the transference strategy.

This week I am at Penn State University, giving this year’s Marker lectures.  My chosen theme for my four lectures here is “recent developments in additive prime number theory”.  My first lecture, “Long arithmetic progressions in primes”, is similar to my AMS lecture on the same topic and so I am not reposting it here.  The second lecture, the notes for which begin after the fold, is on “Linear equations in primes”.  These two lectures focus primarily on work of myself and Ben Green.  The third and fourth lectures, entitled “Small gaps between primes” and “Sieving for almost primes and expander graphs”, will instead be focused on the work of Goldston-Yildirim-Pintz and Bourgain-Gamburd-Sarnak respectively.

This week I am in San Diego for the annual joint mathematics meeting of the American Mathematical Society and the Mathematical Association of America. I am giving two talks here. One is a lecture (for the AMS “Current Events” Bulletin) on recent developments (by Martel-Merle, Merle-Raphael, and others) on stability of solitons; I will post on that lecture at some point in the near future, once the survey paper associated to that lecture is finalised.
The other, which I am presenting here, is an address on “structure and randomness in the prime numbers”. Of course, I’ve talked about this general topic many times before (e.g. at my Simons lecture at MIT, my Milliman lecture at U. Washington, and my Science Research Colloquium at UCLA), and I have given similar talks to the one here – which focuses on my original 2004 paper with Ben Green on long arithmetic progressions in the primes – about a dozen or so times. As such, this particular talk has probably run its course, and so I am “retiring” it by posting it here.

p.s. At this meeting, Endre Szemerédi was awarded the 2008 Steele prize for a seminal contribution to research, for his landmark paper establishing what is now known as Szemerédi’s theorem, which underlies the result I discuss in this talk. This prize is richly deserved – congratulations Endre! [The AMS and MAA also awarded prizes to several dozen other mathematicians, including many mentioned previously on this blog; rather than list them all here, let me just point you to their prize booklet.]

This week I am in Boston, giving this year’s Simons lectures at MIT together with David Donoho. (These lectures, incidentally, are endowed by Jim Simons, who was mentioned in some earlier discussion here.) While preparing these lectures, it occurred to me that I may as well post my lecture notes on this blog, since this medium is essentially just an asynchronous version of a traditional lecture series, and the hypertext capability is in some ways more convenient and informal than, say, $\LaTeX$ slides.

I am giving three lectures, each expounding on some aspects of the theme “the dichotomy between structure and randomness”, which I also spoke about (and wrote about) for the ICM last August. This theme seems to pervade many of the areas of mathematics that I work in, and my lectures aim to explore how this theme manifests itself in several of these. In this, the first lecture, I describe the dichotomy as it appears in Fourier analysis and in number theory. (In the second, I discuss the dichotomy in ergodic theory and graph theory, while in the third, I discuss PDE.)