Ben Green and I have (finally!) uploaded to the arXiv our paper “New bounds for Szemerédi’s theorem, III: A polylogarithmic bound for “, submitted to Mathematika. This is the sequel to two previous papers (and an erratum to the former paper), concerning quantitative versions of Szemerédi’s theorem in the case of length four progressions. This sequel has been delayed for over a decade for a number of reasons, but we have finally managed to write the arguments up to our satisfaction and submit it (to a special issue of Mathematika honouring the work of Klaus Roth).

For any natural number , define to be the largest cardinality of a subset of which does not contain any non-trivial arithmetic progressions of length four (where “non-trivial” means that is non-zero). Trivially we have . In 1969, Szemerédi showed that . However, the decay rate that could be theoretically extracted from this argument (and from several subsequent proofs of this bound, including one by Roth) were quite poor. The first significant quantitative bound on this quantity was by Gowers, who showed that for some absolute constant . In the second paper in the above-mentioned series, we managed to improve this bound to . In this paper, we improve the bound further to , which seems to be the limit of the methods. (We remark that if we could take to be larger than one, this would imply the length four case of a well known conjecture of Erdös that any set of natural numbers whose sum of reciprocals diverges would contain arbitrarily long arithmetic progressions. Thanks to the work of Sanders and of Bloom, the corresponding case of the conjecture for length three conjectures is nearly settled, as it is known that for the analogous bound on one can take any less than one.)

Most of the previous work on bounding relied in some form or another on the *density increment argument* introduced by Roth back in 1953; roughly speaking, the idea is to show that if a dense subset of fails to contain arithmetic progressions of length four, one seeks to then locate a long subprogression of in which has increased density. This was the basic method for instance underlying our previous bound , as well as a finite field analogue of the bound ; however we encountered significant technical difficulties for several years in extending this argument to obtain the result of the current paper. Our method is instead based on “energy increment arguments”, and more specifically on establishing quantitative version of a Khintchine-type recurrence theorem, similar to the qualitative recurrence theorems established (in the ergodic theory context) by Bergelson-Host-Kra, and (in the current combinatorial context) by Ben Green and myself.

One way to phrase the latter recurrence theorem is as follows. Suppose that has density . Then one would expect a “randomly” selected arithmetic progression in (using the convention that random variables will be in boldface) to be contained in with probability about . This is not true in general, however it was shown by Ben and myself that for any , there was a set of shifts of cardinality , such that for any such one had

if was chosen uniformly at random from . This easily implies that , but does not give a particularly good bound on the decay rate, because the implied constant in the cardinality lower bound is quite poor (in fact of tower-exponential type, due to the use of regularity lemmas!), and so one has to take to be extremely large compared to to avoid the possibility that the set of shifts in the above theorem consists only of the trivial shift .

We do not know how to improve the lower bound on the set of shifts to the point where it can give bounds that are competitive with those in this paper. However, we can obtain better quantitative results if we permit ourselves to *couple* together the two parameters and of the length four progression. Namely, with , , as above, we are able to show that there exist random variables , not necessarily independent, such that

and such that we have the non-degeneracy bound

This then easily implies the main theorem.

The energy increment method is then deployed to locate a good pair of random variables that will obey the above bounds. One can get some intuition on how to proceed here by considering some model cases. Firstly one can consider a “globally quadratically structured” case in which the indicator function “behaves like” a globally quadratic function such as , for some irrational and some smooth periodic function of mean . If one then takes to be uniformly distributed in and respectively for some small , with no coupling between the two variables, then the left-hand side of (1) is approximately of the form

where the integral is with respect to the probability Haar measure, and the constraint ultimately arises from the algebraic constraint

However, an application of the Cauchy-Schwarz inequality and Fubini’s theorem shows that the integral in (2) is at least , which (morally at least) gives (1) in this case.

Due to the nature of the energy increment argument, it also becomes necessary to consider “locally quadratically structured” cases, in which is partitioned into some number of structured pieces (think of these as arithmetic progressions, or as “Bohr sets), and on each piece , behaves like a locally quadratic function such as , where now varies with , and the mean of will be approximately on the average after averaging in (weighted by the size of the pieces ). Now one should select and in the following coupled manner: first one chooses uniformly from , then one defines to be the label such that , and then selects uniformly from a set which is related to in much the same way that is related to . If one does this correctly, the analogue of (2) becomes

and one can again use Cauchy-Schwarz and Fubini’s theorem to conclude.

The general case proceeds, very roughly, by an iterative argument. At each stage of the iteration, one has some sort of quadratic model of which involves a decomposition of into structured pieces , and a quadratic approximation to on each piece. If this approximation is accurate enough (or more precisely, if a certain (averaged) local Gowers uniformity norm of the error is small enough) to model the count in (1) (for random variables determined by the above partition of into pieces ), and if the frequencies (such as ) involved in the quadratic approximation are “high rank” or “linearly independent over the rationals” in a suitably quantitative sense, then some version of the above arguments can be made to work. If there are some unwanted linear dependencies in the frequencies, we can do some linear algebra to eliminate one of the frequencies (using some geometry of numbers to keep the quantitative bounds under control) and continue the iteration. If instead the approximation is too inaccurate, then the error will be large in a certain averaged local Gowers uniformity norm . A significant fraction of the paper is then devoted to establishing a quantitative *inverse theorem* for that norm that concludes (with good bounds) that the error must then locally correlate with locally quadratic phases, which can be used to refine the quadratic approximation to in a manner that significantly increases its “energy” (basically an norm). Such energy increments cannot continue indefinitely, and when they terminate we obtain the desired claim.

There are existing inverse theorems for type norms in the literature, going back to the pioneering work of Gowers mentioned previously, and relying on arithmetic combinatorics tools such as Freiman’s theorem and the Balog-Szemerédi-Gowers lemma, which are good for analysing the “-structured homomorphisms” that arise in Gowers’ argument. However, when we applied these methods to the local Gowers norms we obtained inferior quantitative results that were not strong enough for our application. Instead, we use arguments from a different paper of Gowers in which he tackled Szemerédi’s theorem for arbitrary length progressions. This method produces “-structured homomorphisms” associated to any function with large Gowers uniformity norm; however the catch is that such homomorphisms are initially supported only on a sparse unstructured set, rather than a structured set such as a Bohr set. To proceed further, one first has to locate inside the sparse unstructured set a sparse *pseudorandom* subset of a Bohr set, and then use “error-correction” type methods (such as “majority-vote” based algorithms) to locally upgrade this -structured homomorphism on pseudorandom subsets of Bohr sets to a -structured homomorphism on the entirety of a Bohr set. It is then possible to use some “approximate cohomology” tools to “integrate” these homomorphisms (and discern a key “local symmetry” property of these homomorphisms) to locate the desired local quadratic structure (in much the same fashion that a -form on that varies linearly with the coordinates can be integrated to be the derivative of a quadratic function if we know that the -form is closed). These portions of the paper are unfortunately rather technical, but broadly follow the methods already used in previous literature.

## 18 comments

Comments feed for this article

5 May, 2017 at 1:39 pm

AnonymousIs this method effective in finding any explicit numerical value for the exponent ?

Is there any conjecture on the maximal possible value of such ?

5 May, 2017 at 9:00 pm

Terence TaoIn principle the exponent can be explicitly calculated from the arguments in the paper, but we made no efforts to optimise it and so it is likely to be quite terrible (on the order of or so). As I mentioned in the blog post, the Erdos conjecture on arithmetic progressions roughly corresponds to the assertion that can be taken to be larger than 1. In view of the strong bounds now available for three-term progressions in the finite field setting, it is probably reasonable to conjecture that can in fact be taken to be arbitrarily large.

6 May, 2017 at 5:02 am

AnonymousMotivated by the introduction in your paper, for each , define as the supremum over all nonnegative exponents for which

Sanders’ result implies that and your result implies that . Is it possible that all are positive and finite ?

6 May, 2017 at 2:52 pm

Terence TaoI would conjecture this to be the case. The known Behrend-type lower bounds are of the shape , and it is quite conceivable that these are close to sharp, though it looks like we are still far off from settling this claim (unless there is some further breakthrough, e.g. if the polynomial method that works so well for the cap set problem can somehow be extended to the cyclic group setting).

6 May, 2017 at 4:04 am

AnonymousIn the definition of , it is not clear why such (unique) largest set should exist.

6 May, 2017 at 2:41 pm

Terence TaoThere are only finitely many subsets of .

6 May, 2017 at 5:34 pm

AnonymousI meant that (since the property of such subset is translation-invariant) it is possible that there are several maximal subsets with this property without any largest one (on which the above definition of is based), so in the above definition of it seems clearer to replace “cardinality of the largest subset ” by “largest cardinality of a subset ” (as in the definition of in the abstract of your paper.)

[Wording changed – T.]7 May, 2017 at 1:51 am

Klaus85“This sequel has been delayed for over a decade for a number of reasons”. I find this very surprising. Does it mean that you had the results already a decade ago, and just did not write it up? Do you have more close-to-finished projects from very long time ago which are unpublished?

10 May, 2017 at 10:44 am

Terence TaoIt’s a long story, and I’ve actually forgotten parts of it at this point, but basically the first version of our argument (based on density increment methods) turned out to have some significant technical difficulties, so we put it aside for a few years until we found a different argument (based on energy increment methods) which worked; however the write up of this paper was extremely technical and we did not feel it was readable enough to be in a publishable state (though we did share the preprint with some close colleagues who requested it), so the paper again languished. About a year ago we were contacted to submit an article for a special issue of Mathematika honouring the work of Klaus Roth, and this encouraged us to rewrite the (quite lengthy) paper to a state that we were both happy with for submission, but we were only able to do so after we were both at MSRI earlier this year, and could devote a significant portion of time to the task.

7 May, 2017 at 5:26 am

mathlogWhy do you call it a polylogarithmic bound?

[ See https://en.wikipedia.org/wiki/Time_complexity#Polylogarithmic_time -T.]8 May, 2017 at 1:31 am

AnonymousPage 2, paragraph 2 in the paper contains: “A major

breakthrough was made by in 1998 by Gowers”; but should probably say, “A major breakthrough was made by Gowers in 1998”?

[Thanks, this will be corrected in the next version of the ms. -T.]8 May, 2017 at 5:16 am

AnonymousIs there any apparent obstruction for applying this method (with similar estimates) to ?

10 May, 2017 at 10:23 am

Terence TaoThere are two obstacles to this currently. The first is that one would need a quantitative (local) inverse U^4 theorem of comparable strength to the quantitative (local) inverse U^3 theorem that is used in this paper, and at present no such quantitative theorem is known. Secondly (and more seriously), the application of Cauchy-Schwarz inequality that lower bounds (2) is not available for the analogue to (2) in the k=5 setting (which is a somewhat complicated quintilinear integral expression on a certain nilmanifold); indeed there is a counterexample by Ruzsa in the appendix to the Bergelson-Host-Kra paper mentioned in the post that suggests that no analogue of the Khintchine-type recurrence theorems proven here in the k=4 case will be available for k equal to 5 or higher.

10 May, 2017 at 5:06 pm

DaniloPerfect!

11 May, 2017 at 9:21 pm

AnonymousIt seems that people are very close to prove that any set of natural numbers whose sum of reciprocals diverges would contain arithmetic progressions of length 3.

25 May, 2017 at 10:03 am

AnonymousComments to the bibliography:

* [8]: “93-128” –> “93–128”

* [9]: The title should be in italic

* [10]: “261-264” –> “261–264”

* [16]: “327,1–29” –> “327, 1–29”

* [23]: The URL should be put in typewriter font, i.e., \href{https://arxiv.org/abs/1205.1330}{\texttt{https://arxiv.org/abs/1205.1330}}.

* [26]: “2009. 180–204” –> “2009, 180–204”

* [32]: “AMS 1994” –> “AMS, 1994”

* [33]: “R.A.” –> ” R. A.”

* [40]: The title should be in italic

* [44]: “Hungar.56” –> “Hungar. 56”

* [44]: “no. 1-2” –> “no. 1–2”

* [46]: Should “2004 El Escorial conference proceedings” be in italic?

[Thanks, this will be corrected in the next version of the ms. -T.]4 March, 2022 at 8:28 pm

James LengIs there a typo in Theorem 8.1 condition (8.2)? I believe i and i + 1 should be switched around?

[Yes, this is a typo – T.]30 May, 2022 at 7:05 am

AnonymousShouldn’t it be (without the negative sign in the argument of exp)?

[Corrected, thanks – T.]