You are currently browsing the tag archive for the ‘transference principle’ tag.

Asgar Jamneshan and myself have just uploaded to the arXiv our preprint “The inverse theorem for the {U^3} Gowers uniformity norm on arbitrary finite abelian groups: Fourier-analytic and ergodic approaches“. This paper, which is a companion to another recent paper of ourselves and Or Shalom, studies the inverse theory for the third Gowers uniformity norm

\displaystyle  \| f \|_{U^3(G)}^8 = {\bf E}_{h_1,h_2,h_3,x \in G} \Delta_{h_1} \Delta_{h_2} \Delta_{h_3} f(x)

on an arbitrary finite abelian group {G}, where {\Delta_h f(x) := f(x+h) \overline{f(x)}} is the multiplicative derivative. Our main result is as follows:

Theorem 1 (Inverse theorem for {U^3(G)}) Let {G} be a finite abelian group, and let {f: G \rightarrow {\bf C}} be a {1}-bounded function with {\|f\|_{U^3(G)} \geq \eta} for some {0 < \eta \leq 1/2}. Then:
  • (i) (Correlation with locally quadratic phase) There exists a regular Bohr set {B(S,\rho) \subset G} with {|S| \ll \eta^{-O(1)}} and {\exp(-\eta^{-O(1)}) \ll \rho \leq 1/2}, a locally quadratic function {\phi: B(S,\rho) \rightarrow {\bf R}/{\bf Z}}, and a function {\xi: G \rightarrow \hat G} such that

    \displaystyle  {\bf E}_{x \in G} |{\bf E}_{h \in B(S,\rho)} f(x+h) e(-\phi(h)-\xi(x) \cdot h)| \gg \eta^{O(1)}.

  • (ii) (Correlation with nilsequence) There exists an explicit degree two filtered nilmanifold {H/\Lambda} of dimension {O(\eta^{-O(1)})}, a polynomial map {g: G \rightarrow H/\Lambda}, and a Lipschitz function {F: H/\Lambda \rightarrow {\bf C}} of constant {O(\exp(\eta^{-O(1)}))} such that

    \displaystyle  |{\bf E}_{x \in G} f(x) \overline{F}(g(x))| \gg \exp(-\eta^{-O(1)}).

Such a theorem was proven by Ben Green and myself in the case when {|G|} was odd, and by Samorodnitsky in the {2}-torsion case {G = {\bf F}_2^n}. In all cases one uses the “higher order Fourier analysis” techniques introduced by Gowers. After some now-standard manipulations (using for instance what is now known as the Balog-Szemerédi-Gowers lemma), one arrives (for arbitrary {G}) at an estimate that is roughly of the form

\displaystyle  |{\bf E}_{x \in G} {\bf E}_{h,k \in B(S,\rho)} f(x+h+k) b(x,k) b(x,h) e(-B(h,k))| \gg \eta^{O(1)}

where {b} denotes various {1}-bounded functions whose exact values are not too important, and {B: B(S,\rho) \times B(S,\rho) \rightarrow {\bf R}/{\bf Z}} is a symmetric locally bilinear form. The idea is then to “integrate” this form by expressing it in the form

\displaystyle  B(h,k) = \phi(h+k) - \phi(h) - \phi(k) \ \ \ \ \ (1)

for some locally quadratic {\phi: B(S,\rho) \rightarrow {\bf C}}; this then allows us to write the above correlation as

\displaystyle  |{\bf E}_{x \in G} {\bf E}_{h,k \in B(S,\rho)} f(x+h+k) e(-\phi(h+k)) b(x,k) b(x,h)| \gg \eta^{O(1)}

(after adjusting the {b} functions suitably), and one can now conclude part (i) of the above theorem using some linear Fourier analysis. Part (ii) follows by encoding locally quadratic phase functions as nilsequences; for this we adapt an algebraic construction of Manners.

So the key step is to obtain a representation of the form (1), possibly after shrinking the Bohr set {B(S,\rho)} a little if needed. This has been done in the literature in two ways:

  • When {|G|} is odd, one has the ability to divide by {2}, and on the set {2 \cdot B(S,\frac{\rho}{10}) = \{ 2x: x \in B(S,\frac{\rho}{10})\}} one can establish (1) with {\phi(h) := B(\frac{1}{2} h, h)}. (This is similar to how in single variable calculus the function {x \mapsto \frac{1}{2} x^2} is a function whose second derivative is equal to {1}.)
  • When {G = {\bf F}_2^n}, then after a change of basis one can take the Bohr set {B(S,\rho)} to be {{\bf F}_2^m} for some {m}, and the bilinear form can be written in coordinates as

    \displaystyle  B(h,k) = \sum_{1 \leq i,j \leq m} a_{ij} h_i k_j / 2 \hbox{ mod } 1

    for some {a_{ij} \in {\bf F}_2} with {a_{ij}=a_{ji}}. The diagonal terms {a_{ii}} cause a problem, but by subtracting off the rank one form {(\sum_{i=1}^m a_{ii} h_i) ((\sum_{i=1}^m a_{ii} k_i) / 2} one can write

    \displaystyle  B(h,k) = \sum_{1 \leq i,j \leq m} b_{ij} h_i k_j / 2 \hbox{ mod } 1

    on the orthogonal complement of {(a_{11},\dots,a_{mm})} for some coefficients {b_{ij}=b_{ji}} which now vanish on the diagonal: {b_{ii}=0}. One can now obtain (1) on this complement by taking

    \displaystyle  \phi(h) := \sum_{1 \leq i < j \leq m} b_{ij} h_i h_k / 2 \hbox{ mod } 1.

In our paper we can now treat the case of arbitrary finite abelian groups {G}, by means of the following two new ingredients:

  • (i) Using some geometry of numbers, we can lift the group {G} to a larger (possibly infinite, but still finitely generated) abelian group {G_S} with a projection map {\pi: G_S \rightarrow G}, and find a globally bilinear map {\tilde B: G_S \times G_S \rightarrow {\bf R}/{\bf Z}} on the latter group, such that one has a representation

    \displaystyle  B(\pi(x), \pi(y)) = \tilde B(x,y) \ \ \ \ \ (2)

    of the locally bilinear form {B} by the globally bilinear form {\tilde B} when {x,y} are close enough to the origin.
  • (ii) Using an explicit construction, one can show that every globally bilinear map {\tilde B: G_S \times G_S \rightarrow {\bf R}/{\bf Z}} has a representation of the form (1) for some globally quadratic function {\tilde \phi: G_S \rightarrow {\bf R}/{\bf Z}}.

To illustrate (i), consider the Bohr set {B(S,1/10) = \{ x \in {\bf Z}/N{\bf Z}: \|x/N\|_{{\bf R}/{\bf Z}} < 1/10\}} in {G = {\bf Z}/N{\bf Z}} (where {\|\|_{{\bf R}/{\bf Z}}} denotes the distance to the nearest integer), and consider a locally bilinear form {B: B(S,1/10) \times B(S,1/10) \rightarrow {\bf R}/{\bf Z}} of the form {B(x,y) = \alpha x y \hbox{ mod } 1} for some real number {\alpha} and all integers {x,y \in (-N/10,N/10)} (which we identify with elements of {G}. For generic {\alpha}, this form cannot be extended to a globally bilinear form on {G}; however if one lifts {G} to the finitely generated abelian group

\displaystyle  G_S := \{ (x,\theta) \in {\bf Z}/N{\bf Z} \times {\bf R}: \theta = x/N \hbox{ mod } 1 \}

(with projection map {\pi: (x,\theta) \mapsto x}) and introduces the globally bilinear form {\tilde B: G_S \times G_S \rightarrow {\bf R}/{\bf Z}} by the formula

\displaystyle  \tilde B((x,\theta),(y,\sigma)) = N^2 \alpha \theta \sigma \hbox{ mod } 1

then one has (2) when {\theta,\sigma} lie in the interval {(-1/10,1/10)}. A similar construction works for higher rank Bohr sets.

To illustrate (ii), the key case turns out to be when {G_S} is a cyclic group {{\bf Z}/N{\bf Z}}, in which case {\tilde B} will take the form

\displaystyle  \tilde B(x,y) = \frac{axy}{N} \hbox{ mod } 1

for some integer {a}. One can then check by direct construction that (1) will be obeyed with

\displaystyle  \tilde \phi(x) = \frac{a \binom{x}{2}}{N} - \frac{a x \binom{N}{2}}{N^2} \hbox{ mod } 1

regardless of whether {N} is even or odd. A variant of this construction also works for {{\bf Z}}, and the general case follows from a short calculation verifying that the claim (ii) for any two groups {G_S, G'_S} implies the corresponding claim (ii) for the product {G_S \times G'_S}.

This concludes the Fourier-analytic proof of Theorem 1. In this paper we also give an ergodic theory proof of (a qualitative version of) Theorem 1(ii), using a correspondence principle argument adapted from this previous paper of Ziegler, and myself. Basically, the idea is to randomly generate a dynamical system on the group {G}, by selecting an infinite number of random shifts {g_1, g_2, \dots \in G}, which induces an action of the infinitely generated free abelian group {{\bf Z}^\omega = \bigcup_{n=1}^\infty {\bf Z}^n} on {G} by the formula

\displaystyle  T^h x := x + \sum_{i=1}^\infty h_i g_i.

Much as the law of large numbers ensures the almost sure convergence of Monte Carlo integration, one can show that this action is almost surely ergodic (after passing to a suitable Furstenberg-type limit {X} where the size of {G} goes to infinity), and that the dynamical Host-Kra-Gowers seminorms of that system coincide with the combinatorial Gowers norms of the original functions. One is then well placed to apply an inverse theorem for the third Host-Kra-Gowers seminorm {U^3(X)} for {{\bf Z}^\omega}-actions, which was accomplished in the companion paper to this one. After doing so, one almost gets the desired conclusion of Theorem 1(ii), except that after undoing the application of the Furstenberg correspondence principle, the map {g: G \rightarrow H/\Lambda} is merely an almost polynomial rather than a polynomial, which roughly speaking means that instead of certain derivatives of {g} vanishing, they instead are merely very small outside of a small exceptional set. To conclude we need to invoke a “stability of polynomials” result, which at this level of generality was first established by Candela and Szegedy (though we also provide an independent proof here in an appendix), which roughly speaking asserts that every approximate polynomial is close in measure to an actual polynomial. (This general strategy is also employed in the Candela-Szegedy paper, though in the absence of the ergodic inverse theorem input that we rely upon here, the conclusion is weaker in that the filtered nilmanifold {H/\Lambda} is replaced with a general space known as a “CFR nilspace”.)

This transference principle approach seems to work well for the higher step cases (for instance, the stability of polynomials result is known in arbitrary degree); the main difficulty is to establish a suitable higher step inverse theorem in the ergodic theory setting, which we hope to do in future research.

Tamar Ziegler and I have just uploaded to the arXiv our joint paper “A multi-dimensional Szemerédi theorem for the primes via a correspondence principle“. This paper is related to an earlier result of Ben Green and mine in which we established that the primes contain arbitrarily long arithmetic progressions. Actually, in that paper we proved a more general result:

Theorem 1 (Szemerédi’s theorem in the primes) Let {A} be a subset of the primes {{\mathcal P}} of positive relative density, thus {\limsup_{N \rightarrow \infty} \frac{|A \cap [N]|}{|{\mathcal P} \cap [N]|} > 0}. Then {A} contains arbitrarily long arithmetic progressions.

This result was based in part on an earlier paper of Green that handled the case of progressions of length three. With the primes replaced by the integers, this is of course the famous theorem of Szemerédi.

Szemerédi’s theorem has now been generalised in many different directions. One of these is the multidimensional Szemerédi theorem of Furstenberg and Katznelson, who used ergodic-theoretic techniques to show that any dense subset of {{\bf Z}^d} necessarily contained infinitely many constellations of any prescribed shape. Our main result is to relativise that theorem to the primes as well:

Theorem 2 (Multidimensional Szemerédi theorem in the primes) Let {d \geq 1}, and let {A} be a subset of the {d^{th}} Cartesian power {{\mathcal P}^d} of the primes of positive relative density, thus

\displaystyle  \limsup_{N \rightarrow \infty} \frac{|A \cap [N]^d|}{|{\mathcal P}^d \cap [N]^d|} > 0.

Then for any {v_1,\ldots,v_k \in {\bf Z}^d}, {A} contains infinitely many “constellations” of the form {a+r v_1, \ldots, a + rv_k} with {a \in {\bf Z}^k} and {r} a positive integer.

In the case when {A} is itself a Cartesian product of one-dimensional sets (in particular, if {A} is all of {{\mathcal P}^d}), this result already follows from Theorem 1, but there does not seem to be a similarly easy argument to deduce the general case of Theorem 2 from previous results. Simultaneously with this paper, an independent proof of Theorem 2 using a somewhat different method has been established by Cook, Maygar, and Titichetrakun.

The result is reminiscent of an earlier result of mine on finding constellations in the Gaussian primes (or dense subsets thereof). That paper followed closely the arguments of my original paper with Ben Green, namely it first enclosed (a W-tricked version of) the primes or Gaussian primes (in a sieve theoretic-sense) by a slightly larger set (or more precisely, a weight function {\nu}) of almost primes or almost Gaussian primes, which one could then verify (using methods closely related to the sieve-theoretic methods in the ongoing Polymath8 project) to obey certain pseudorandomness conditions, known as the linear forms condition and the correlation condition. Very roughly speaking, these conditions assert statements of the following form: if {n} is a randomly selected integer, then the events of {n+h_1,\ldots,n+h_k} simultaneously being an almost prime (or almost Gaussian prime) are approximately independent for most choices of {h_1,\ldots,h_k}. Once these conditions are satisfied, one can then run a transference argument (initially based on ergodic-theory methods, but nowadays there are simpler transference results based on the Hahn-Banach theorem, due to Gowers and Reingold-Trevisan-Tulsiani-Vadhan) to obtain relative Szemerédi-type theorems from their absolute counterparts.

However, when one tries to adapt these arguments to sets such as {{\mathcal P}^2}, a new difficulty occurs: the natural analogue of the almost primes would be the Cartesian square {{\mathcal A}^2} of the almost primes – pairs {(n,m)} whose entries are both almost primes. (Actually, for technical reasons, one does not work directly with a set of almost primes, but would instead work with a weight function such as {\nu(n) \nu(m)} that is concentrated on a set such as {{\mathcal A}^2}, but let me ignore this distinction for now.) However, this set {{\mathcal A}^2} does not enjoy as many pseudorandomness conditions as one would need for a direct application of the transference strategy to work. More specifically, given any fixed {h, k}, and random {(n,m)}, the four events

\displaystyle  (n,m) \in {\mathcal A}^2

\displaystyle  (n+h,m) \in {\mathcal A}^2

\displaystyle  (n,m+k) \in {\mathcal A}^2

\displaystyle  (n+h,m+k) \in {\mathcal A}^2

do not behave independently (as they would if {{\mathcal A}^2} were replaced for instance by the Gaussian almost primes), because any three of these events imply the fourth. This blocks the transference strategy for constellations which contain some right-angles to them (e.g. constellations of the form {(n,m), (n+r,m), (n,m+r)}) as such constellations soon turn into rectangles such as the one above after applying Cauchy-Schwarz a few times. (But a few years ago, Cook and Magyar showed that if one restricted attention to constellations which were in general position in the sense that any coordinate hyperplane contained at most one element in the constellation, then this obstruction does not occur and one can establish Theorem 2 in this case through the transference argument.) It’s worth noting that very recently, Conlon, Fox, and Zhao have succeeded in removing of the pseudorandomness conditions (namely the correlation condition) from the transference principle, leaving only the linear forms condition as the remaining pseudorandomness condition to be verified, but unfortunately this does not completely solve the above problem because the linear forms condition also fails for {{\mathcal A}^2} (or for weights concentrated on {{\mathcal A}^2}) when applied to rectangular patterns.

There are now two ways known to get around this problem and establish Theorem 2 in full generality. The approach of Cook, Magyar, and Titichetrakun proceeds by starting with one of the known proofs of the multidimensional Szemerédi theorem – namely, the proof that proceeds through hypergraph regularity and hypergraph removal – and attach pseudorandom weights directly within the proof itself, rather than trying to add the weights to the result of that proof through a transference argument. (A key technical issue is that weights have to be added to all the levels of the hypergraph – not just the vertices and top-order edges – in order to circumvent the failure of naive pseudorandomness.) As one has to modify the entire proof of the multidimensional Szemerédi theorem, rather than use that theorem as a black box, the Cook-Magyar-Titichetrakun argument is lengthier than ours; on the other hand, it is more general and does not rely on some difficult theorems about primes that are used in our paper.

In our approach, we continue to use the multidimensional Szemerédi theorem (or more precisely, the equivalent theorem of Furstenberg and Katznelson concerning multiple recurrence for commuting shifts) as a black box. The difference is that instead of using a transference principle to connect the relative multidimensional Szemerédi theorem we need to the multiple recurrence theorem, we instead proceed by a version of the Furstenberg correspondence principle, similar to the one that connects the absolute multidimensional Szemerédi theorem to the multiple recurrence theorem. I had discovered this approach many years ago in an unpublished note, but had abandoned it because it required an infinite number of linear forms conditions (in contrast to the transference technique, which only needed a finite number of linear forms conditions and (until the recent work of Conlon-Fox-Zhao) a correlation condition). The reason for this infinite number of conditions is that the correspondence principle has to build a probability measure on an entire {\sigma}-algebra; for this, it is not enough to specify the measure {\mu(A)} of a single set such as {A}, but one also has to specify the measure {\mu( T^{n_1} A \cap \ldots \cap T^{n_m} A)} of “cylinder sets” such as {T^{n_1} A \cap \ldots \cap T^{n_m} A} where {m} could be arbitrarily large. The larger {m} gets, the more linear forms conditions one needs to keep the correspondence under control.

With the sieve weights {\nu} we were using at the time, standard sieve theory methods could indeed provide a finite number of linear forms conditions, but not an infinite number, so my idea was abandoned. However, with my later work with Green and Ziegler on linear equations in primes (and related work on the Mobius-nilsequences conjecture and the inverse conjecture on the Gowers norm), Tamar and I realised that the primes themselves obey an infinite number of linear forms conditions, so one can basically use the primes (or a proxy for the primes, such as the von Mangoldt function {\Lambda}) as the enveloping sieve weight, rather than a classical sieve. Thus my old idea of using the Furstenberg correspondence principle to transfer Szemerédi-type theorems to the primes could actually be realised. In the one-dimensional case, this simply produces a much more complicated proof of Theorem 1 than the existing one; but it turns out that the argument works as well in higher dimensions and yields Theorem 2 relatively painlessly, except for the fact that it needs the results on linear equations in primes, the known proofs of which are extremely lengthy (and also require some of the transference machinery mentioned earlier). The problem of correlations in rectangles is avoided in the correspondence principle approach because one can compensate for such correlations by performing a suitable weighted limit to compute the measure {\mu( T^{n_1} A \cap \ldots \cap T^{n_m} A)} of cylinder sets, with each {m} requiring a different weighted correction. (This may be related to the Cook-Magyar-Titichetrakun strategy of weighting all of the facets of the hypergraph in order to recover pseudorandomness, although our contexts are rather different.)

In this, the final lecture notes of this course, we discuss one of the motivating applications of the theory developed thus far, namely to count solutions to linear equations in primes {{\mathcal P} = \{2,3,5,7,\ldots\}} (or in dense subsets {A} of primes {{\mathcal P}}). Unfortunately, the most famous linear equations in primes: the twin prime equation {p_2 - p_1 = 2} and the even Goldbach equation {p_1+p_2=N} – remain out of reach of this technology (because the relevant affine linear forms involved are commensurate, and thus have infinite complexity with respect to the Gowers norms), but most other systems of equations, in particular that of arithmetic progressions {p_i = n+ir} for {i=0,\ldots,k-1} (or equivalently, {p_i + p_{i+2} = 2p_{i+1}} for {i=0,\ldots,k-2}) , as well as the odd Goldbach equation {p_1+p_2+p_3=N}, are tractable.

To illustrate the main ideas, we will focus on the following result of Green:

Theorem 1 (Roth’s theorem in the primes) Let {A \subset {\mathcal P}} be a subset of primes whose upper density {\limsup_{N \rightarrow \infty} |A \cap [N]|/|{\mathcal P} \cap [N]|} is positive. Then {A} contains infinitely many arithmetic progressions of length three.

This should be compared with Roth’s theorem in the integers (Notes 2), which is the same statement but with the primes {{\mathcal P}} replaced by the integers {{\bf Z}} (or natural numbers {{\bf N}}). Indeed, Roth’s theorem for the primes is proven by transferring Roth’s theorem for the integers to the prime setting; the latter theorem is used as a “black box”. The key difficulty here in performing this transference is that the primes have zero density inside the integers; indeed, from the prime number theorem we have {|{\mathcal P} \cap [N]| = (1+o(1)) \frac{N}{\log N} = o(N)}.

There are a number of generalisations of this transference technique. In a paper of Green and myself, we extended the above theorem to progressions of longer length (thus transferring Szemerédi’s theorem to the primes). In a series of papers (culminating in a paper to appear shortly) of Green, myself, and also Ziegler, related methods are also used to obtain an asymptotic for the number of solutions in the primes to any system of linear equations of bounded complexity. This latter result uses the full power of higher order Fourier analysis, in particular relying heavily on the inverse conjecture for the Gowers norms; in contrast, Roth’s theorem and Szemerédi’s theorem in the primes are “softer” results that do not need this conjecture.

To transfer results from the integers to the primes, there are three basic steps:

  • A general transference principle, that transfers certain types of additive combinatorial results from dense subsets of the integers to dense subsets of a suitably “pseudorandom set” of integers (or more precisely, to the integers weighted by a suitably “pseudorandom measure”);
  • An application of sieve theory to show that the primes (or more precisely, an affine modification of the primes) lie inside a suitably pseudorandom set of integers (or more precisely, have significant mass with respect to a suitably pseudorandom measure).
  • If one is seeking asymptotics for patterns in the primes, and not simply lower bounds, one also needs to control correlations between the primes (or proxies for the primes, such as the Möbius function) with various objects that arise from higher order Fourier analysis, such as nilsequences.

The former step can be accomplished in a number of ways. For progressions of length three (and more generally, for controlling linear patterns of complexity at most one), transference can be accomplished by Fourier-analytic methods. For more complicated patterns, one can use techniques inspired by ergodic theory; more recently, simplified and more efficient methods based on duality (the Hahn-Banach theorem) have also been used. No number theory is used in this step. (In the case of transference to genuinely random sets, rather than pseudorandom sets, similar ideas appeared earlier in the graph theory setting, see this paper of Kohayakawa, Luczak, and Rodl.

The second step is accomplished by fairly standard sieve theory methods (e.g. the Selberg sieve, or the slight variants of this sieve used by Goldston and Yildirim). Remarkably, very little of the formidable apparatus of modern analytic number theory is needed for this step; for instance, the only fact about the Riemann zeta function that is truly needed is that it has a simple pole at {s=1}, and no knowledge of L-functions is needed.

The third step does draw more significantly on analytic number theory techniques and results (most notably, the method of Vinogradov to compute oscillatory sums over the primes, and also the Siegel-Walfisz theorem that gives a good error term on the prime number theorem in arithemtic progressions). As these techniques are somewhat orthogonal to the main topic of this course, we shall only touch briefly on this aspect of the transference strategy.

Read the rest of this entry »