Tamar Ziegler and I have just uploaded to the arXiv our paper “Narrow progressions in the primes“, submitted to the special issue “Analytic Number Theory” in honor of the 60th birthday of Helmut Maier. The results here are vaguely reminiscent of the recent progress on bounded gaps in the primes, but use different methods.

About a decade ago, Ben Green and I showed that the primes contained arbitrarily long arithmetic progressions: given any {k}, one could find a progression {n, n+r, \dots, n+(k-1)r} with {r>0} consisting entirely of primes. In fact we showed the same statement was true if the primes were replaced by any subset of the primes of positive relative density.

A little while later, Tamar Ziegler and I obtained the following generalisation: given any {k} and any polynomials {P_1,\dots,P_k: {\bf Z} \rightarrow {\bf Z}} with {P_1(0)=\dots=P_k(0)}, one could find a “polynomial progression” {n+P_1(r),\dots,n+P_k(r)} with {r>0} consisting entirely of primes. Furthermore, we could make this progression somewhat “narrow” by taking {r = n^{o(1)}} (where {o(1)} denotes a quantity that goes to zero as {n} goes to infinity). Again, the same statement also applies if the primes were replaced by a subset of positive relative density. My previous result with Ben corresponds to the linear case {P_i(r) = (i-1)r}.

In this paper we were able to make the progressions a bit narrower still: given any {k} and any polynomials {P_1,\dots,P_k: {\bf Z} \rightarrow {\bf Z}} with {P_1(0)=\dots=P_k(0)}, one could find a “polynomial progression” {n+P_1(r),\dots,n+P_k(r)} with {r>0} consisting entirely of primes, and such that {r \leq \log^L n}, where {L} depends only on {k} and {P_1,\dots,P_k} (in fact it depends only on {k} and the degrees of {P_1,\dots,P_k}). The result is still true if the primes are replaced by a subset of positive density {\delta}, but unfortunately in our arguments we must then let {L} depend on {\delta}. However, in the linear case {P_i(r) = (i-1)r}, we were able to make {L} independent of {\delta} (although it is still somewhat large, of the order of {k 2^k}).

The polylogarithmic factor is somewhat necessary: using an upper bound sieve, one can easily construct a subset of the primes of density, say, {90\%}, whose arithmetic progressions {n,n+r,\dots,n+(k-1)r} of length {k} all obey the lower bound {r \gg \log^{k-1} n}. On the other hand, the prime tuples conjecture predicts that if one works with the actual primes rather than dense subsets of the primes, then one should have infinitely many length {k} arithmetic progressions of bounded width for any fixed {k}. The {k=2} case of this is precisely the celebrated theorem of Yitang Zhang that was the focus of the recently concluded Polymath8 project here. The higher {k} case is conjecturally true, but appears to be out of reach of known methods. (Using the multidimensional Selberg sieve of Maynard, one can get {m} primes inside an interval of length {O( \exp(O(m)) )}, but this is such a sparse set of primes that one would not expect to find even a progression of length three within such an interval.)

The argument in the previous paper was unable to obtain a polylogarithmic bound on the width of the progressions, due to the reliance on a certain technical “correlation condition” on a certain Selberg sieve weight {\nu}. This correlation condition required one to control arbitrarily long correlations of {\nu}, which was not compatible with a bounded value of {L} (particularly if one wanted to keep {L} independent of {\delta}).

However, thanks to recent advances in this area by Conlon, Fox, and Zhao (who introduced a very nice “densification” technique), it is now possible (in principle, at least) to delete this correlation condition from the arguments. Conlon-Fox-Zhao did this for my original theorem with Ben; and in the current paper we apply the densification method to our previous argument to similarly remove the correlation condition. This method does not fully eliminate the need to control arbitrarily long correlations, but allows most of the factors in such a long correlation to be bounded, rather than merely controlled by an unbounded weight such as {\nu}. This turns out to be significantly easier to control, although in the non-linear case we still unfortunately had to make {L} large compared to {\delta} due to a certain “clearing denominators” step arising from the complicated nature of the Gowers-type uniformity norms that we were using to control polynomial averages. We believe though that this an artefact of our method, and one should be able to prove our theorem with an {L} that is uniform in {\delta}.

Here is a simple instance of the densification trick in action. Suppose that one wishes to establish an estimate of the form

\displaystyle  {\bf E}_n {\bf E}_r f(n) g(n+r) h(n+r^2) = o(1) \ \ \ \ \ (1)

for some real-valued functions {f,g,h} which are bounded in magnitude by a weight function {\nu}, but which are not expected to be bounded; this average will naturally arise when trying to locate the pattern {(n,n+r,n+r^2)} in a set such as the primes. Here I will be vague as to exactly what range the parameters {n,r} are being averaged over. Suppose that the factor {g} (say) has enough uniformity that one can already show a smallness bound

\displaystyle  {\bf E}_n {\bf E}_r F(n) g(n+r) H(n+r^2) = o(1) \ \ \ \ \ (2)

whenever {F, H} are bounded functions. (One should think of {F,H} as being like the indicator functions of “dense” sets, in contrast to {f,h} which are like the normalised indicator functions of “sparse” sets). The bound (2) cannot be directly applied to control (1) because of the unbounded (or “sparse”) nature of {f} and {h}. However one can “densify” {f} and {h} as follows. Since {f} is bounded in magnitude by {\nu}, we can bound the left-hand side of (1) as

\displaystyle  {\bf E}_n \nu(n) | {\bf E}_r g(n+r) h(n+r^2) |.

The weight function {\nu} will be normalised so that {{\bf E}_n \nu(n) = O(1)}, so by the Cauchy-Schwarz inequality it suffices to show that

\displaystyle  {\bf E}_n \nu(n) | {\bf E}_r g(n+r) h(n+r^2) |^2 = o(1).

The left-hand side expands as

\displaystyle  {\bf E}_n {\bf E}_r {\bf E}_s \nu(n) g(n+r) h(n+r^2) g(n+s) h(n+s^2).

Now, it turns out that after an enormous (but finite) number of applications of the Cauchy-Schwarz inequality to steadily eliminate the {g,h} factors, as well as a certain “polynomial forms condition” hypothesis on {\nu}, one can show that

\displaystyle  {\bf E}_n {\bf E}_r {\bf E}_s (\nu-1)(n) g(n+r) h(n+r^2) g(n+s) h(n+s^2) = o(1).

(Because of the polynomial shifts, this requires a method known as “PET induction”, but let me skip over this point here.) In view of this estimate, we now just need to show that

\displaystyle  {\bf E}_n {\bf E}_r {\bf E}_s g(n+r) h(n+r^2) g(n+s) h(n+s^2) = o(1).

Now we can reverse the previous steps. First, we collapse back to

\displaystyle  {\bf E}_n | {\bf E}_r g(n+r) h(n+r^2) |^2 = o(1).

One can bound {|{\bf E}_r g(n+r) h(n+r^2)|} by {{\bf E}_r \nu(n+r) \nu(n+r^2)}, which can be shown to be “bounded on average” in a suitable sense (e.g. bounded {L^4} norm) via the aforementioned polynomial forms condition. Because of this and the Hölder inequality, the above estimate is equivalent to

\displaystyle  {\bf E}_n | {\bf E}_r g(n+r) h(n+r^2) | = o(1).

By setting {F} to be the signum of {{\bf E}_r g(n+r) h(n+r^2)}, this is equivalent to

\displaystyle  {\bf E}_n {\bf E}_r F(n) g(n+r) h(n+r^2) = o(1).

This is halfway between (1) and (2); the sparsely supported function {f} has been replaced by its “densification” {F}, but we have not yet densified {h} to {H}. However, one can shift {n} by {r^2} and repeat the above arguments to achieve a similar densificiation of {h}, at which point one has reduced (1) to (2).