You are currently browsing the tag archive for the ‘transference principle’ tag.

Tamar Ziegler and I have just uploaded to the arXiv our joint paper “A multi-dimensional Szemerédi theorem for the primes via a correspondence principle“. This paper is related to an earlier result of Ben Green and mine in which we established that the primes contain arbitrarily long arithmetic progressions. Actually, in that paper we proved a more general result:

Theorem 1 (Szemerédi’s theorem in the primes) Let ${A}$ be a subset of the primes ${{\mathcal P}}$ of positive relative density, thus ${\limsup_{N \rightarrow \infty} \frac{|A \cap [N]|}{|{\mathcal P} \cap [N]|} > 0}$. Then ${A}$ contains arbitrarily long arithmetic progressions.

This result was based in part on an earlier paper of Green that handled the case of progressions of length three. With the primes replaced by the integers, this is of course the famous theorem of Szemerédi.

Szemerédi’s theorem has now been generalised in many different directions. One of these is the multidimensional Szemerédi theorem of Furstenberg and Katznelson, who used ergodic-theoretic techniques to show that any dense subset of ${{\bf Z}^d}$ necessarily contained infinitely many constellations of any prescribed shape. Our main result is to relativise that theorem to the primes as well:

Theorem 2 (Multidimensional Szemerédi theorem in the primes) Let ${d \geq 1}$, and let ${A}$ be a subset of the ${d^{th}}$ Cartesian power ${{\mathcal P}^d}$ of the primes of positive relative density, thus

$\displaystyle \limsup_{N \rightarrow \infty} \frac{|A \cap [N]^d|}{|{\mathcal P}^d \cap [N]^d|} > 0.$

Then for any ${v_1,\ldots,v_k \in {\bf Z}^d}$, ${A}$ contains infinitely many “constellations” of the form ${a+r v_1, \ldots, a + rv_k}$ with ${a \in {\bf Z}^k}$ and ${r}$ a positive integer.

In the case when ${A}$ is itself a Cartesian product of one-dimensional sets (in particular, if ${A}$ is all of ${{\mathcal P}^d}$), this result already follows from Theorem 1, but there does not seem to be a similarly easy argument to deduce the general case of Theorem 2 from previous results. Simultaneously with this paper, an independent proof of Theorem 2 using a somewhat different method has been established by Cook, Maygar, and Titichetrakun.

The result is reminiscent of an earlier result of mine on finding constellations in the Gaussian primes (or dense subsets thereof). That paper followed closely the arguments of my original paper with Ben Green, namely it first enclosed (a W-tricked version of) the primes or Gaussian primes (in a sieve theoretic-sense) by a slightly larger set (or more precisely, a weight function ${\nu}$) of almost primes or almost Gaussian primes, which one could then verify (using methods closely related to the sieve-theoretic methods in the ongoing Polymath8 project) to obey certain pseudorandomness conditions, known as the linear forms condition and the correlation condition. Very roughly speaking, these conditions assert statements of the following form: if ${n}$ is a randomly selected integer, then the events of ${n+h_1,\ldots,n+h_k}$ simultaneously being an almost prime (or almost Gaussian prime) are approximately independent for most choices of ${h_1,\ldots,h_k}$. Once these conditions are satisfied, one can then run a transference argument (initially based on ergodic-theory methods, but nowadays there are simpler transference results based on the Hahn-Banach theorem, due to Gowers and Reingold-Trevisan-Tulsiani-Vadhan) to obtain relative Szemerédi-type theorems from their absolute counterparts.

However, when one tries to adapt these arguments to sets such as ${{\mathcal P}^2}$, a new difficulty occurs: the natural analogue of the almost primes would be the Cartesian square ${{\mathcal A}^2}$ of the almost primes – pairs ${(n,m)}$ whose entries are both almost primes. (Actually, for technical reasons, one does not work directly with a set of almost primes, but would instead work with a weight function such as ${\nu(n) \nu(m)}$ that is concentrated on a set such as ${{\mathcal A}^2}$, but let me ignore this distinction for now.) However, this set ${{\mathcal A}^2}$ does not enjoy as many pseudorandomness conditions as one would need for a direct application of the transference strategy to work. More specifically, given any fixed ${h, k}$, and random ${(n,m)}$, the four events

$\displaystyle (n,m) \in {\mathcal A}^2$

$\displaystyle (n+h,m) \in {\mathcal A}^2$

$\displaystyle (n,m+k) \in {\mathcal A}^2$

$\displaystyle (n+h,m+k) \in {\mathcal A}^2$

do not behave independently (as they would if ${{\mathcal A}^2}$ were replaced for instance by the Gaussian almost primes), because any three of these events imply the fourth. This blocks the transference strategy for constellations which contain some right-angles to them (e.g. constellations of the form ${(n,m), (n+r,m), (n,m+r)}$) as such constellations soon turn into rectangles such as the one above after applying Cauchy-Schwarz a few times. (But a few years ago, Cook and Magyar showed that if one restricted attention to constellations which were in general position in the sense that any coordinate hyperplane contained at most one element in the constellation, then this obstruction does not occur and one can establish Theorem 2 in this case through the transference argument.) It’s worth noting that very recently, Conlon, Fox, and Zhao have succeeded in removing of the pseudorandomness conditions (namely the correlation condition) from the transference principle, leaving only the linear forms condition as the remaining pseudorandomness condition to be verified, but unfortunately this does not completely solve the above problem because the linear forms condition also fails for ${{\mathcal A}^2}$ (or for weights concentrated on ${{\mathcal A}^2}$) when applied to rectangular patterns.

There are now two ways known to get around this problem and establish Theorem 2 in full generality. The approach of Cook, Magyar, and Titichetrakun proceeds by starting with one of the known proofs of the multidimensional Szemerédi theorem – namely, the proof that proceeds through hypergraph regularity and hypergraph removal – and attach pseudorandom weights directly within the proof itself, rather than trying to add the weights to the result of that proof through a transference argument. (A key technical issue is that weights have to be added to all the levels of the hypergraph – not just the vertices and top-order edges – in order to circumvent the failure of naive pseudorandomness.) As one has to modify the entire proof of the multidimensional Szemerédi theorem, rather than use that theorem as a black box, the Cook-Magyar-Titichetrakun argument is lengthier than ours; on the other hand, it is more general and does not rely on some difficult theorems about primes that are used in our paper.

In our approach, we continue to use the multidimensional Szemerédi theorem (or more precisely, the equivalent theorem of Furstenberg and Katznelson concerning multiple recurrence for commuting shifts) as a black box. The difference is that instead of using a transference principle to connect the relative multidimensional Szemerédi theorem we need to the multiple recurrence theorem, we instead proceed by a version of the Furstenberg correspondence principle, similar to the one that connects the absolute multidimensional Szemerédi theorem to the multiple recurrence theorem. I had discovered this approach many years ago in an unpublished note, but had abandoned it because it required an infinite number of linear forms conditions (in contrast to the transference technique, which only needed a finite number of linear forms conditions and (until the recent work of Conlon-Fox-Zhao) a correlation condition). The reason for this infinite number of conditions is that the correspondence principle has to build a probability measure on an entire ${\sigma}$-algebra; for this, it is not enough to specify the measure ${\mu(A)}$ of a single set such as ${A}$, but one also has to specify the measure ${\mu( T^{n_1} A \cap \ldots \cap T^{n_m} A)}$ of “cylinder sets” such as ${T^{n_1} A \cap \ldots \cap T^{n_m} A}$ where ${m}$ could be arbitrarily large. The larger ${m}$ gets, the more linear forms conditions one needs to keep the correspondence under control.

With the sieve weights ${\nu}$ we were using at the time, standard sieve theory methods could indeed provide a finite number of linear forms conditions, but not an infinite number, so my idea was abandoned. However, with my later work with Green and Ziegler on linear equations in primes (and related work on the Mobius-nilsequences conjecture and the inverse conjecture on the Gowers norm), Tamar and I realised that the primes themselves obey an infinite number of linear forms conditions, so one can basically use the primes (or a proxy for the primes, such as the von Mangoldt function ${\Lambda}$) as the enveloping sieve weight, rather than a classical sieve. Thus my old idea of using the Furstenberg correspondence principle to transfer Szemerédi-type theorems to the primes could actually be realised. In the one-dimensional case, this simply produces a much more complicated proof of Theorem 1 than the existing one; but it turns out that the argument works as well in higher dimensions and yields Theorem 2 relatively painlessly, except for the fact that it needs the results on linear equations in primes, the known proofs of which are extremely lengthy (and also require some of the transference machinery mentioned earlier). The problem of correlations in rectangles is avoided in the correspondence principle approach because one can compensate for such correlations by performing a suitable weighted limit to compute the measure ${\mu( T^{n_1} A \cap \ldots \cap T^{n_m} A)}$ of cylinder sets, with each ${m}$ requiring a different weighted correction. (This may be related to the Cook-Magyar-Titichetrakun strategy of weighting all of the facets of the hypergraph in order to recover pseudorandomness, although our contexts are rather different.)

In this, the final lecture notes of this course, we discuss one of the motivating applications of the theory developed thus far, namely to count solutions to linear equations in primes ${{\mathcal P} = \{2,3,5,7,\ldots\}}$ (or in dense subsets ${A}$ of primes ${{\mathcal P}}$). Unfortunately, the most famous linear equations in primes: the twin prime equation ${p_2 - p_1 = 2}$ and the even Goldbach equation ${p_1+p_2=N}$ – remain out of reach of this technology (because the relevant affine linear forms involved are commensurate, and thus have infinite complexity with respect to the Gowers norms), but most other systems of equations, in particular that of arithmetic progressions ${p_i = n+ir}$ for ${i=0,\ldots,k-1}$ (or equivalently, ${p_i + p_{i+2} = 2p_{i+1}}$ for ${i=0,\ldots,k-2}$) , as well as the odd Goldbach equation ${p_1+p_2+p_3=N}$, are tractable.

To illustrate the main ideas, we will focus on the following result of Green:

Theorem 1 (Roth’s theorem in the primes) Let ${A \subset {\mathcal P}}$ be a subset of primes whose upper density ${\limsup_{N \rightarrow \infty} |A \cap [N]|/|{\mathcal P} \cap [N]|}$ is positive. Then ${A}$ contains infinitely many arithmetic progressions of length three.

This should be compared with Roth’s theorem in the integers (Notes 2), which is the same statement but with the primes ${{\mathcal P}}$ replaced by the integers ${{\bf Z}}$ (or natural numbers ${{\bf N}}$). Indeed, Roth’s theorem for the primes is proven by transferring Roth’s theorem for the integers to the prime setting; the latter theorem is used as a “black box”. The key difficulty here in performing this transference is that the primes have zero density inside the integers; indeed, from the prime number theorem we have ${|{\mathcal P} \cap [N]| = (1+o(1)) \frac{N}{\log N} = o(N)}$.

There are a number of generalisations of this transference technique. In a paper of Green and myself, we extended the above theorem to progressions of longer length (thus transferring Szemerédi’s theorem to the primes). In a series of papers (culminating in a paper to appear shortly) of Green, myself, and also Ziegler, related methods are also used to obtain an asymptotic for the number of solutions in the primes to any system of linear equations of bounded complexity. This latter result uses the full power of higher order Fourier analysis, in particular relying heavily on the inverse conjecture for the Gowers norms; in contrast, Roth’s theorem and Szemerédi’s theorem in the primes are “softer” results that do not need this conjecture.

To transfer results from the integers to the primes, there are three basic steps:

• A general transference principle, that transfers certain types of additive combinatorial results from dense subsets of the integers to dense subsets of a suitably “pseudorandom set” of integers (or more precisely, to the integers weighted by a suitably “pseudorandom measure”);
• An application of sieve theory to show that the primes (or more precisely, an affine modification of the primes) lie inside a suitably pseudorandom set of integers (or more precisely, have significant mass with respect to a suitably pseudorandom measure).
• If one is seeking asymptotics for patterns in the primes, and not simply lower bounds, one also needs to control correlations between the primes (or proxies for the primes, such as the Möbius function) with various objects that arise from higher order Fourier analysis, such as nilsequences.

The former step can be accomplished in a number of ways. For progressions of length three (and more generally, for controlling linear patterns of complexity at most one), transference can be accomplished by Fourier-analytic methods. For more complicated patterns, one can use techniques inspired by ergodic theory; more recently, simplified and more efficient methods based on duality (the Hahn-Banach theorem) have also been used. No number theory is used in this step. (In the case of transference to genuinely random sets, rather than pseudorandom sets, similar ideas appeared earlier in the graph theory setting, see this paper of Kohayakawa, Luczak, and Rodl.

The second step is accomplished by fairly standard sieve theory methods (e.g. the Selberg sieve, or the slight variants of this sieve used by Goldston and Yildirim). Remarkably, very little of the formidable apparatus of modern analytic number theory is needed for this step; for instance, the only fact about the Riemann zeta function that is truly needed is that it has a simple pole at ${s=1}$, and no knowledge of L-functions is needed.

The third step does draw more significantly on analytic number theory techniques and results (most notably, the method of Vinogradov to compute oscillatory sums over the primes, and also the Siegel-Walfisz theorem that gives a good error term on the prime number theorem in arithemtic progressions). As these techniques are somewhat orthogonal to the main topic of this course, we shall only touch briefly on this aspect of the transference strategy.