We now give a basic application of Fourier analysis to the problem of counting additive patterns in sets, namely the following famous theorem of Roth:

Theorem 1 (Roth’s theorem)Let be a subset of the integers whose upper densityis positive. Then contains infinitely many arithmetic progressions of length three, with and .

This is the first non-trivial case of Szemerédi’s theorem, which is the same assertion but with length three arithmetic progressions replaced by progressions of length for any .

As it turns out, one can prove Roth’s theorem by an application of linear Fourier analysis – by comparing the set (or more precisely, the indicator function of that set, or of pieces of that set) against linear characters for various frequencies . There are two extreme cases to consider (which are model examples of a more general dichotomy between structure and randomness). One is when is aligned up almost completely with one of these linear characters, for instance by being a *Bohr set* of the form

or more generally of the form

for some multi-dimensional frequency and some open set . In this case, arithmetic progressions can be located using the equidistribution theory of the previous set of notes. At the other extreme, one has *Fourier-uniform* or *Fourier-pseudorandom sets*, whose correlation with any linear character is negligible. In this case, arithmetic progressions can be produced in abundance via a Fourier-analytic calculation.

To handle the general case, one must somehow synthesise together the argument that deals with the structured case with the argument that deals with the random case. There are several known ways to do this, but they can be basically classified into two general methods, namely the *density increment argument* (or increment argument) and the *energy increment argument* (or increment argument).

The idea behind the density increment argument is to introduce a dichotomy: either the object being studied is pseudorandom (in which case one is done), or else one can use the theory of the structured objects to locate a sub-object of significantly higher “density” than the original object. As the density cannot exceed one, one should thus be done after a finite number of iterations of this dichotomy. This argument was introduced by Roth in his original proof of the above theorem.

The idea behind the energy increment argument is instead to decompose the original object into two pieces (and, sometimes, a small additional error term): a *structured component* that captures all the structured objects that have significant correlation with , and a *pseudorandom component* which has no significant correlation with any structured object. This decomposition usually proceeds by trying to maximise the “energy” (or norm) of the structured component, or dually by trying to minimise the energy of the residual between the original object and the structured object. This argument appears for instance in the proof of the Szemerédi regularity lemma (which, not coincidentally, can also be used to prove Roth’s theorem), and is also implicit in the ergodic theory approach to such problems (through the machinery of conditional expectation relative to a factor, which is a type of orthogonal projection, the existence of which is usually established via an energy increment argument). However, one can also deploy the energy increment argument in the Fourier analytic setting, to give an alternate Fourier-analytic proof of Roth’s theorem that differs in some ways from the density increment proof.

In these notes we give both two Fourier-analytic proofs of Roth’s theorem, one proceeding via the density increment argument, and the other by the energy increment argument. As it turns out, both of these arguments extend to establish Szemerédi’s theorem, and more generally in counting other types of patterns, but this is non-trivial (requiring some sort of *inverse conjecture* for the Gowers uniformity norms in both cases); we will discuss this further in later notes.

** — 1. The density increment argument — **

We begin with the density increment argument. We first rephrase Roth’s theorem in a finitary form:

Theorem 2 (Roth’s theorem, again)For every , there exists an , such that for every , and every with , contains an arithmetic progression of length three.

Exercise 3Show that Theorem 1 and Theorem 2 are equivalent.

We prove Theorem 2 by a downward induction on the density parameter . Let denote the proposition that Theorem 2 holds for that value of (i.e. for sufficiently large and all with , contains an arithmetic progression of length three). Our objective is to show that holds for all .

Clearly, is (vacuously) true for (and trivially true for ). It is also monotone in the sense that if holds for some , then holds for all . To downwardly induct on , we will prove the following dichotomy:

Proposition 4 (Lack of progressions implies density increment)Let , let be sufficiently large depending on , and let be such that . Then one of the following holds:

- contains an arithmetic progression of length three; or
- there exists a subprogression of of length at least such that , where goes to infinity as , and is bounded away from zero whenever is bounded away from zero.

Let us see why Proposition 4 implies Theorem 2. It is slightly more convenient to use a “well-ordering principle” argument rather than an induction argument, though of course the two approaches are equivalent. Let be the infimum of all for which holds, thus . If then we are done, so suppose that is non-zero. Then for any , is false, thus there exist arbitrarily large and with with no progressions of length three. By Proposition 4, we can thus find a subprogression of of length at least with ; if is small enough, this implies that . We then use an affine transformation to map to (noting crucially that the property of having no arithmetic progressions of a given length is preserved by affine transformations). As can be arbitrarily large, can be arbitrarily large also. Since is true, we see that contains an arithmetic progression of length three, hence does also; which gives the desired contradiction.

It remains to prove Proposition 4. There are two main steps. The first relies heavily on the fact that the progressions only have length three, and is proven via Fourier analysis:

Proposition 5 (Lack of progressions implies correlation with a linear phase)Let , let be sufficiently large depending on , let be such that for some , with containing no arithmetic progressions of length three. Then there exists such that .

*Proof:* In order to use Fourier analysis, it will be convenient to embed inside a cyclic group , where is equal to (say) ; the exact choice here is only of minor importance, though it will be convenient to take to be odd. We introduce the trilinear form

for any functions ; we then observe that the quantity

(extending by zero outside of ) is equal to the number of arithmetic progressions in (counting the degenerate progressions in which , and also allowing for to be negative), divided by the normalising factor of . On the other hand, by hypothesis, contains no non-degenerate arithmetic progressions of length three, and clearly has degenerate progressions; thus we have

On the other hand, from the Fourier inversion formula on the cyclic group we may write

for any function , where are the Fourier coefficients

We may thus write as

Now observe that we have the identity

so the phase is trivial whenever is of the form , and so the expectation in (2) is equal to . Conversely, if is not of this form, then the phase is non-trivial, and from Fourier analysis we conclude that the expectation in (2) vanishes. We conclude that the left-hand side of (2) can be expressed as

Now using Plancherel’s theorem we have

(using normalised counting measure). Using this and Hölder’s inequality (and the fact that is odd), we obtain the bounds

and similarly for permutations of on the right-hand side.

We could apply this directly to , but this is not useful, since we seek a *lower* bound on this quantity rather than an upper bound. To get such a lower bound, we split , where is the mean zero portion of , and use trilinearity to split into a main term , plus seven other error terms involving and , with each error term involving at least one copy of . The main term can be computed explicitly as

Comparing this with (1), we conclude that one of the error terms must have magnitude also. For sake of concreteness, let us say that

the other cases are similar.

From the triangle inequality we see that have an norm of , and so from (3) one has

and so we conclude that

for some . (Similarly for other error terms, though sometimes onewill need a permutation of (3) instead of (3) itself.) The claim follows.

Remark 6The above argument relied heavily on the fact that there was only a one-parameter family of linear relations between . The same argument does not work directly for finding arithmetic progressions of length longer than three; we return to this point in later notes.

The second step converts correlation with a linear character into a density increment on a subprogression:

Proposition 7 (Fragmenting a linear character into progressions)Let , let , and let be a linear phase. Then there exists which goes to infinity as for fixed , and a partitionof into arithmetic progressions of length at least , together with an error term of cardinality at most , such that fluctuates by at most on each progression (i.e. whenever ).

*Proof:* We may assume that is sufficiently large depending on , as the claim is trivial otherwise (just set ).

Fix , and let be a slowly growing function of to be chosen later. By using recurrence for the linear phase , we can find a shift of size such that . We then partition into arithmetic progressions of spacing , and then partition each of those progressions in turn into subprogressions of spacing and length , plus an error of cardinality at most , leading to an error set of cardinality at most . On each of the , fluctuates by at most . The claim then follows by choosing to be a sufficiently slowly growing function of .

Now we can prove Proposition 4 (and thus Roth’s theorem). Let be as in Proposition 4. By Proposition 5 (if is large enough), we can find for which

We now let be a small quantity depending on to be chosen later (actually it turns out that we can take to be a small multiple of ) and apply Proposition 7 to decompose into progressions and an error term with the stated properties. Then we have

Since fluctuates by at most on , we can apply the triangle inequality and conclude that

If is sufficiently small depending on , we conclude that

On the other hand, as is the mean of on , we have

and thus

Adding this to (4) and noting that for real , we conclude (for small enough) that

and hence by the pigeonhole principle we can find such that

or in other words

for some absolute constant , and Proposition 4 follows.

It is possible to rewrite the above argument in the ultralimit setting, though it only makes the argument slightly shorter as a consequence. We sketch this alternate formulation below.

Exercise 8Let be as above.

- Show that if is an unbounded limit natural number, and is a limit subset whose density is strictly greater than , then contains a (limit) arithmetic progression of length three (with ).
- Show that there exists an unbounded limit natural number and a limit subset of density , which does not contain any arithmetic progressions of length three.

Exercise 9Show that if is an unbounded limit natural number, and is a limit subset of positive density with no arithmetic progressions of length three, then there exists a limit real such that .

Exercise 10If is an unbounded limit natural number, and is a limit real, show that one can partition , where is a limit natural number, the are limit arithmetic subprogressions of of unbounded length (with the map being a limit function), such that fluctuates by on each (uniformly in ), and .

Exercise 11Use the previous three exercises to reprove Roth’s theorem.

Exercise 12 (Roth’s theorem in bounded characteristic)Let be a finite field, let , and let be a finite vector space. Show that if the dimension of is sufficiently large depending on , and if is such that , then there exists with such that . (Hint:Mimic the above arguments (either finitarily, or with ultralimits), using hyperplanes as a substitute for subprogressions.)

Exercise 13 (Roth’s theorem in finite abelian groups)Let be a finite abelian group, and let . Show that if is sufficiently large depending on , and is such that , then there exists with such that . (Hint:if there is an element of of large order, one can use Theorem 2 and the pigeonhole principle. If all elements have bounded order, one can instead use Exercise 12.) This result (as well as the special case in the preceding exercise) was first established by Meshulam.

** — 2. The energy increment argument — **

Now we turn to the energy increment approach. This approach requires a bit more machinery to set up, but ends up being quite flexible and powerful (for instance, it is the starting point for my theorem with Ben Green establishing arbitrarily long progressions in the primes, which we do not know how to establish via density increment arguments).

Instead of passing from to a subprogression, we now instead *coarsen* to some partition (or *factor*) of , as follows. Define a *factor* of to be a -algebra of subsets of , or equivalently a partition of into disjoint *atoms* or *cells* (with the elements of then being the arbitary unions of atoms). Given a function and a factor , we define the *conditional expectation* to be the function whose value at a given point is given by the formula

where is the unique atom of that contains . One can view the map as the orthogonal projection from to , where is the space of functions with the inner product

and is the subspace of functions in which are measurable with respect to , or equivalently are constant on each atom of .

We say that one factor *refines* another if , or equivalently if every atom of is a union of atoms of , or if every atom of is contained in an atom of , or equivalently again if . Given two factors , , one can define their *join* to be their least common refinement, thus the atoms in are the non-empty intersections of atoms in with atoms in .

The idea is to split a given function in (and specifically, an indicator function ) into a projection onto a “structured factor” to obtain a “structured component” , together with a “pseudorandom component” that is essentially orthogonal to all structured functions. This decomposition is related to the classical decomposition of a vector in a Hilbert space into its orthogonal projection onto a closed subspace , plus the complementary projection to the orthogonal complement ; we will see the relationship between the two decompositions later when we pass to the ultralimit.

We need to make the notion of “structured” more precise. We begin with some definitions. We say that a function has *Fourier complexity* at most if it can be expressed as

for some and some complex numbers of magnitude at most , and some real numbers . Note that from the Fourier inversion formula that every function will have some finite Fourier complexity, but typically one expects the complexity to grow with ; only a few special functions will have complexity bounded uniformly in . Also note that if have Fourier complexity then , or all have Fourier complexity at most ; informally, the space of bounded complexity functions forms an algebra. (We will be able to formalise this statement after we take ultralimits.)

Ideally, we would like to take “functions of bounded Fourier complexity” as our class of structured functions. For technical reasons (related to our desire to use indicator functions as structured functions), we need to take an closure and work with the wider class of *Fourier measurable* functions as our structured class.

Definition 14 (Measurability)Let be a function. We say that a function isFourier measurablewith growth function if, for every , one can find a function of Fourier complexity at most such that .

A subset of isFourier measurablewith growth function if is Fourier measurable with this growth function.

Exercise 15Show that every interval in is Fourier measurable with some growth function independent of . (Hint: apply Fejér summation to .)

Exercise 16Let be a Fourier-measurable function with some growth function , which is bounded in magnitude by . Show that for every , one can find a function which also is bounded in magnitude by , and of Fourier complexity , such that . (Hint:start with the approximating function from Definition 14, which is already bounded in magnitude by , and then set where is a polynomial bounded in magnitude by on the ball of radius which is close to the identity function on the ball of radius (such a function can be constructed via the Stone-Weierstrass theorem).)

Exercise 17Show that if are bounded in magnitude by , and are Fourier measurable with growth functions , then , , and are Fourier measurable with some growth function depending only on and .

Conclude that if are Fourier-measurable with growth function , then , , and are Fourier-measurable with some growth function depending only on .

We thus see that Fourier-measurable sets morally form a Boolean algebra. (Again, we can formalise this assertion once we pass to the ultralimit.)

Now we make a key observation (cf. this paper by Reingold, Trevisan, Tulsiani, and Vadhan).

Lemma 18 (Correlation with a Fourier character implies correlation with a Fourier-measurable set)Let be bounded in magnitude by , and suppose that for some . Then there exists a Fourier-measurable set with some growth function depending on , such that .

*Proof:* By splitting into real and imaginary parts, we may assume without loss of generality that is real. Rotating , we may find a real number such that

We then express

where

By Minkowski’s inequality, we thus have either

or

In the former case we are done (setting ), so suppose that the latter holds. If all the were uniformly Fourier-measurable, we would now be done in this case also by the pigeonhole principle. This is not quite true; however, it turns out that *most* are uniformly measurable, and this will be enough. More precisely, let be a small parameter to be chosen later, and say that is *good* if one has

for all . Let be the set of all bad . Observe that for each bad , we have , where is the probability measure

and is the Hardy-Littlewood maximal function

Applying the Hardy-Littlewood maximal inequality, we conclude that . In particular, if is small enough compared with , we have

and so by the pigeonhole principle, there exists a good such that

It remains to verify that is good. For any , we have (as is good) that

Applying Urysohn’s lemma, we can thus find a smooth function with for and for such that

Using the Weierstrass approximation theorem, one can then approximate uniformly by on by a polynomial of degree and coefficients . This allows one to approximate in norm to an accuracy of by a function of Fourier complexity , and the claim follows.

Corollary 19 (Correlation implies energy increment)Let , and let be a factor generated by at most atoms, each of which is Fourier-measurable with growth function . Suppose that we have the correlationfor some and . Then there exists a refinement generated by at most atoms, each of which is Fourier-measurable with a growth function depending only on , such that

*Proof:* By Lemma 18, we can find a Fourier-measurable set with some growth function depending on , such that

We let be the factor generated by and . As is measurable with respect to , we may project onto and conclude that

By Cauchy-Schwarz, we thus have

Squaring and using Pythagoras’ theorem, we obtain (5). The remaining claims in the corollary follow from Exercise 17.

We can then iterate this corollary via an *energy increment argument* to obtain

Proposition 20 (Weak arithmetic regularity lemma)Let , and let be a factor generated by at most atoms, each of which is Fourier-measurable with growth function . Let . Then there exists an extension of generated by atoms, each of which is Fourier-measurable with growth function depending on , such that

*Proof:* We initially set equal to . If (6) already holds, then we are done; otherwise, we invoke Corollary 19 to increase the “energy” by , at the cost of possibly doubling the number of atoms in , and also altering the growth function somewhat. We iterate this procedure; as the energy is bounded between zero and one, and increases by at each step, the procedure must terminate in steps, at which point the claim follows.

It turns out that the power of this lemma is amplified if we iterate one more time, to obtain

Theorem 21 (Strong arithmetic regularity lemma)Let , let , and let be an arbitrary function. Then we can decompose and find such that

- (Nonnegativity) take values in , and have mean zero;
- (Structure) is Fourier-measurable with a growth function that depends only on ;
- (Smallness) has an norm of at most ; and
- (Pseudorandomness) One has for all .

*Proof:* We recursively define a sequence by setting and (say). Applying Proposition 20 (starting with the trivial factor ), one can then find a nested sequence of refinements , such that

for all and , and such that each consists of atoms that are Fourier-measurable with some growth function depending on (note that this quantity dominates and by construction). By Pythagoras’ theorem, the energies are monotone increasing between and , so by the pigeonhole principle there exists such that

and hence by Pythagoras

Setting , , , we obtain the claim.

Remark 22This result is essentially due to Green (though not quite in this language). Earlier related decompositions are due to Bourgain and to Green-Konyagin. The Szemerédi regularity lemma in graph theory can be viewed as the graph-theoretic analogue of this Fourier-analytic result; see this paper of mine (or my FOCS paper) for further discussion. The double iteration required to prove Theorem 21 means that the bounds here are quite poor (of tower-exponential type, in fact, when is exponential, which is typical in applications), much as in the graph theory case; thus the use of this lemma, while technically quantitative in nature, gives bounds that are usually quite inferior to what is known or suspected to be true.

As with the Ratner-type theorems from the previous notes, it is crucial that the uniformity for the pseudorandom component is of an arbitrarily higher quality than the measurability of the structured component .

Much as the Ratner-type theorems from the previous notes could be used to prove multiple recurrence theorems, the arithmetic regularity lemma can be used (among other things) to give a proof of Roth’s theorem. We do so as follows. Let be a large integer, and let be a subset of with for some . We consider the expression , where is the trilinear form

which implies that the number of all three-term arithmetic progressions in (including the degenerate ones with ) is . For sufficiently large depending on , this number is larger than the number of degenerate progressions, giving the theorem.

It remains to establish (7). We apply Theorem 21 with parameters , to be chosen later (they will depend on ) to obtain a quantity and a decomposition

with the stated properties. This splits the left-hand side of (7) into terms. But we can eliminate several of these terms:

Exercise 23Show that all of the terms in (7) which involve at least one copy of are of size . (Hint:Modify the proof of Proposition 5.)

From this exercise we see that

Now we need to deal with . A key point is the almost periodicity of :

Lemma 24 (Almost periodicity)For values of , one has(where we extend by zero outside of ).

*Proof:* As is Fourier-measurable, we can approximate it to an error of in norm by a function

of Fourier complexity . From the smallness of , we then have

(where we extend using (9) rather than by zero, with the error being when ). We can use (9) and the triangle inequality to bound

Using multiple recurrence, we can find values of such that for all . The claim follows.

Now we can finish the proof of Roth’s theorem. As has the same mean as , we have

and hence by Hölder’s inequality (and the non-negativity of )

Now if is one of the periods in the above lemma, we have

and thus by shifting

and so by the triangle inequality

Putting all this together using the triangle and Hölder inequalities, we obtain

Thus, if is sufficiently small depending on , we have

for values of , and thus

if we then set to be a sufficiently rapidly growing function (depending on ), we obtain the claim from (8). This concludes the proof of Roth’s theorem.

Exercise 25Use the energy increment method to establish a different proof of Exercise 13. (Hint:For the multiple recurrence step, use a pigeonhole principle argument rather than an appeal to equidistribution theory.)

We now briefly indicate how to translate the above arguments into the ultralimit setting. We first need to construct an important measure on limit sets, namely *Loeb measure*.

Exercise 26 (Construction of Loeb measure)Let be an unbounded natural number. Define theLoeb measureof a limit subset of to be the quantity , thus for instance a set of cardinality will have Loeb measure zero.

- Show that if a limit subset of is partitioned into countably many disjoint limit subsets , that all but finitely many of the are empty, and so .
- Define the
outer measureof a subset of (not necessarily a limit subset) to be the infimum of , where is a countable family of limit subsets of that cover , and call a subset ofnullif it has zero outer measure. Call a subsetLoeb measurableif it differs from a limit set by a null set. Show that there is a unique extension of Loeb measure from limit sets to Loeb measurable sets that is a countably additive probability measure on . (Hint:use the Carathéodory extension theorem, see e.g. my 254B notes 0 or notes 0a.)- If is a limit function bounded in magnitude by some standard real , show that is a Loeb measurable function in , with norm at most .
- Show that there exists a unique trilinear form , jointly continuous in the topology for all three inputs, such that
for all bounded limit functions .

- Show that Roth’s theorem is equivalent to the assertion that whenever is a bounded non-negative function with .

Next, we develop the ultralimit analogue of Fourier measurability, which we will rename *Kronecker measurability* due to the close analogy with the *Kronecker factor* in ergodic theory.

Exercise 27 (Construction of the Kronecker factor)Let be an unbounded natural number. We define aFourier characterto be a function in of the form for some limit real number . We define atrigonometric polynomialto be any finite linear combination (over the standard complex numbers) of Fourier characters. Let be the -algebra of Loeb measurable sets generated by the Fourier characters; we refer to as theKronecker factor, and functions or sets measurable in this factor asKronecker measurablefunctions and sets. Thus for instance all trigonometric polynomials are Kronecker measurable. We let denote the orthogonal projection from to , i.e. the conditional expectation to the Kronecker factor.

- Show that if is bounded in magnitude by and is a standard real, then there exists a trigonometric polynomial which is also bounded in magnitude by and is within of in norm.
- Show that if and , then there exists a limit subset of of cardinality such that for all (extending by zero).
- Show that if is non-negative with , then .
- Show that if and for at least one , then .
- Conclude the proof of Roth’s theorem using ultralimits.

Remark 28Note how the (finitary) arithmetic regularity lemma has been replaced by the more familiar (infinitary) theory of conditional expectation to a factor, and the finitary notion of measurability has been replaced by a notion from the traditional (countably additive) infinitary theory of measurability. This is one of the key advantages of the ultralimit approach, namely that it allows one to exploit already established theories of infinitary mathematics (e.g. measure theory, ergodic theory, Hilbert space geometry, etc.) to prove a finitary result.

Exercise 29Use the ultralimit energy increment method to establish yet another proof of Exercise 13.

** — 3. More quantitative bounds (optional) — **

The above proofs of Roth’s theorem (as formulated in, say, Theorem 2) were qualitative in the sense that they did not explicitly give a bound for in terms of . Nevertheless, by analysing the finitary arguments more carefully, a bound can be extracted:

Exercise 30Show that in Proposition 7, one can take . Using this and the density increment argument, show that one can take in Theorem 2. (To put it another way, subsets of of density much larger than will contain progressions of length three.)

Exercise 31Show that in the energy increment proof of Roth’s theorem, one can take the growth functions involved to be polynomial in (but with the exponent growing exponentially with each refinement of the factor), and can be taken to be an iterated exponential; thus ultimately allows one to take to be a tower exponential of height . (To put it another way, subsets of of density much larger than for some will contain progressions of length three.) Thus we see that the energy increment argument, in the form presented here, provides much worse bounds than the density increment argument; but see below.

For the ultralimit arguments, it is significantly harder to extract a quantitative bound from the argument (basically one has to painstakingly “finitise” the argument first, essentially reaching the finitary counterparts of these arguments presented above). Thus we see that there is a tradeoff when using ultralimit analysis; the arguments become slightly cleaner (and one can deploy infinitary methods), but one tends to lose sight of what quantitative bounds the method establishes. (This is particularly the case if one begins to rely heavily on the axiom of choice (or on large cardinal axioms) once one takes ultralimits, although these axioms are not used in the examples above.)

It is possible to run the density increment argument more efficiently by combining it with some aspects of the energy increment argument. As described above, the density increment argument proceeds by locating a *single* large Fourier coefficient of , and uses this to obtain a density increment on a relatively short subprogression of (of length comparable to , ignoring factors of ). One then has to iterate this about times before one obtains a truly significant density increment (e.g. from to ). It is this repeated passage from to which is ultimately responsible for the double exponential bound for at the end of the day.

In an unpublished work, Endre Szemerédi observed that one can run this argument more efficiently by collecting *several* large Fourier coefficients of simultaneously (somewhat in the spirit of the energy increment argument), and only then passing to a subprogression on which all of the relevant Fourier characters are simultaneously close to constant. The subprogression obtained is smaller as a consequence, but the density increment is much more substantial. Using this strategy, Endre was able to improve the original Roth bound of to the somewhat better (or equivalently, he was able to establish length three progressions in any subset of of density much larger than for some ). By carefully optimising the choice of threshold for selecting the “large Fourier coefficients”, Szemerédi (unpublished) and Heath-Brown independently improved this method further to obtain , or equivalently obtaining length three progressions in sets of density much larger than . (This result was extended to arbitrary finite abelian groups by Meshulam.)

The next advance was by Bourgain, who realised that rather than pass to short subprogressions, it was more efficient to work on the significantly larger (but messier) Bohr sets , after ensuring that such Bohr sets were *regular* (this condition is closely related to the Fourier measurability condition used in the energy increment argument). With this modification to the original Roth argument, the bound was lowered to , or equivalently obtaining length three progressions in sets of density much larger than . Even more recently, this argument was combined with the Szemerédi-Heath-Brown argument by Bourgain (not yet published) to obtain the further improvement of , or equivalently obtaining length three progressions in sets of density much larger than . This is getting close to the case of an old conjecture of Erdös that asserts that any subset of the natural numbers whose sums of reciprocals diverge should have infinitely many arithmetic progressions of length for any . To establish the case from quantitative versions of Roth’s theorem, one would basically need a bound of the form for some (or the ability to obtain progressions in sets of density ).

On the other hand, there is an old counterexample of Behrend (based ultimately on the observation that a sphere in a high-dimensional lattice does not contain any arithmetic progressions of length three) which shows that must be at least (in particular, it must be super-polynomial in ); equivalently, it is known that there are subsets of of density about with no arithmetic progressions of length three. For the sharpest results in this direction, see this paper of Elkin (and a later exposition of Green and Wolf).

The question of refining the bounds is an important one, as it tends to improve the technological understanding of current methods, as well as shed light on their relative strengths and weaknesses. However, this comes at the cost of making the arguments somewhat more technical, and so we shall not focus on the sharpest quantitative results in these notes.

## 17 comments

Comments feed for this article

14 April, 2010 at 5:35 am

Joel MoreiraDear Prof. Tao

In theorem 1, shouldn’t be limsup?

Also, after (1) (in the proof of proposition 4) I think it’s the cyclic group Z/N’Z.

And in the second equation in the proof of lemma 7, I believe that there is a sign missing (not that it matters).

[Corrected, thanks – T.]Great Notes!

14 April, 2010 at 5:10 pm

AnonymousThe full entry shows up on the main page for your blog. Did you mean to have it all show, rather than just the beginning? Maybe it’s something weird on my side.

[Corrected, thanks – T.]28 April, 2010 at 2:37 pm

254B, Notes 3: Linear patterns « What’s new[…] von Neumann theorem, Gowers uniformity norms, linear patterns | by Terence Tao In the previous lecture notes, we used (linear) Fourier analysis to control the number of three-term arithmetic progressions in […]

20 May, 2010 at 9:45 pm

254B, Notes 5: The inverse conjecture for the Gowers norm I. The finite field case « What’s new[…] proceed we need the following analogue of Proposition 5 of Notes 2: Exercise 6 (Fragmenting a polynomial into subspaces) Let be a classical polynomial of degree . […]

29 May, 2010 at 2:15 pm

254B, Notes 6: The inverse conjecture for the Gowers norm II. The integer case « What’s new[…] with growth function depending only on , where Fourier measurability was defined in Notes 2. Exercise 18 Show that the class of nilsequences of degree does not change if we drop the […]

5 June, 2010 at 12:57 pm

254B, Lecture Notes 7: The transference principle, and linear equations in primes « What’s new[…] should be compared with Roth’s theorem in the integers (Notes 2), which is the same statement but with the primes replaced by the integers (or natural numbers ). […]

5 September, 2011 at 1:17 pm

ylin2Dear Professor Tao: In exercise 8, you said to consider a Fejer summation method. But Fejer kernel seems to be defined on the continuous torus, so how to adapt that method to the discrete setting as in this problem?

[Corrected, thanks – T.]5 September, 2011 at 6:43 pm

Terence TaoA function on the integers can be viewed as the restriction of a function on the reals.

7 January, 2012 at 9:35 am

Applications of Szemerédi’s regularity lemma: triangle removal lemma, Roth’s theorem, corner’s theorem and graph removal lemma « Disquisitiones Mathematicae[…] regularity lemma, and now we give a few of its various applications: the triangle removal lemma, Roth’s theorem on the existence of arithmetic progressions of length in subsets of the integers with positive […]

14 May, 2018 at 12:51 pm

Finitary versions of theorems – Write down what you’ve done[…] of Roth’s theorem on arithmetic progressions which I wanted to understand. I found it in Terence Tao’s lecture notes on higher order Fourier analysis. While working my way through this proof I tried to solve the exercises. However, already the first […]

20 January, 2020 at 3:38 pm

hcI have a confusion: what does arithmetic regularity lemma buy us in the energy increment proof?

It seems like after Proposition 9, we already found the right sigma algebra of Fourier measurable sets with complexity (say we start with being the trivial -algebra). We can then write

Almost periodicity should hold for as in your proof. By proposition 9, the residual would have small -norm, say . It seems like one can already conclude Roth’s theorem here.

Thank you!

20 January, 2020 at 7:44 pm

Terence TaoAlmost periodicity will give a good contribution from , but only on a relatively small number of shifts (a lower bound of . The control of is not strong enough to control its contribution on such a sparse set of shifts, so one cannot conclude any nontrivial lower bound on the number of 3-term APs this way.

22 January, 2020 at 8:12 am

hcThanks for your quick reply, this is very helpful!

I missed that has another dependency on (coming from having atoms), hence I thought I can still have enough shifts independent of :(

15 May, 2020 at 3:04 pm

S MDear Professor Tao, if you do not terribly mind a couple of stupid questions, could you please comment on how to actually prove that is Fourier-measurable? I do not quite understand what Fejer summation means but do you mean that one should define take our to be where denotes the Fejer kernel?

Secondly, I simply cannot wrap my head around what the definition of is. Usually, if I had to calculate such an integral I would try to approximate such a function by simple functions and then using the monotone convergence theorem or something similar. However, I do not quite know how to work with this integral of . Thank you for patiently reading my questions.

15 May, 2020 at 3:07 pm

S MSorry, I meant and .

17 May, 2020 at 10:07 am

S MI think I (with some help from my teacher!) just figured out what the definition of should naturally be

where we define .

27 May, 2020 at 9:45 am

Abhishek KhetanDear Professor Tao,

In the section on the density increment argument, when you wrote “…conclude that the left-hand side of (1) can be expressed as…” I think you meant to refer to Equation (2) rather than Equation (1).

Thank you.

[Corrected, thanks – T.]