Ben Green and I have (finally!) uploaded to the arXiv our paper “New bounds for Szemerédi’s theorem, III: A polylogarithmic bound for $r_4(N)$“, submitted to Mathematika. This is the sequel to two previous papers (and an erratum to the former paper), concerning quantitative versions of Szemerédi’s theorem in the case of length four progressions. This sequel has been delayed for over a decade for a number of reasons, but we have finally managed to write the arguments up to our satisfaction and submit it (to a special issue of Mathematika honouring the work of Klaus Roth).
For any natural number $N$, define $r_4(N)$ to be the largest cardinality of a subset $A$ of $[N] = \{1,\dots,N\}$ which does not contain any non-trivial arithmetic progressions $a, a+r, a+2r, a+3r$ of length four (where “non-trivial” means that $r$ is non-zero). Trivially we have $r_4(N) \leq N$. In 1969, Szemerédi showed that $r_4(N) = o(N)$. However, the decay rate that could be theoretically extracted from this argument (and from several subsequent proofs of this bound, including one by Roth) was quite poor. The first significant quantitative bound on this quantity was by Gowers, who showed that $r_4(N) \ll N (\log\log N)^{-c}$ for some absolute constant $c > 0$. In the second paper in the above-mentioned series, we managed to improve this bound to $r_4(N) \ll N e^{-c\sqrt{\log\log N}}$. In this paper, we improve the bound further to $r_4(N) \ll N (\log N)^{-c}$, which seems to be the limit of the methods. (We remark that if we could take $c$ to be larger than one, this would imply the length four case of a well known conjecture of Erdős that any set of natural numbers whose sum of reciprocals diverges contains arbitrarily long arithmetic progressions. Thanks to the work of Sanders and of Bloom, the corresponding case of the conjecture for progressions of length three is nearly settled, as it is known that for the analogous bound on $r_3(N)$ one can take any $c$ less than one.)
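To spell out why an exponent $c > 1$ would give the length four case of the Erdős conjecture (this is the standard dyadic summation argument, sketched here with unimportant constants suppressed): if $A \subset \mathbb{N}$ contains no non-trivial progressions of length four, then $|A \cap [2^{j-1}, 2^j)| \leq r_4(2^j) \ll 2^j j^{-c}$ for each $j$, and hence

$\displaystyle \sum_{n \in A} \frac{1}{n} \leq \sum_{j=1}^\infty \frac{|A \cap [2^{j-1}, 2^j)|}{2^{j-1}} \ll \sum_{j=1}^\infty j^{-c},$

which is finite when $c > 1$; thus any set of natural numbers whose sum of reciprocals diverges would be forced to contain a progression of length four.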
Most of the previous work on bounding $r_4(N)$ relied in some form or another on the density increment argument introduced by Roth back in 1953; roughly speaking, the idea is to show that if a dense subset $A$ of $[N]$ fails to contain arithmetic progressions of length four, then one can locate a long subprogression of $[N]$ on which $A$ has increased density. This was the basic method underlying, for instance, our previous bound $r_4(N) \ll N e^{-c\sqrt{\log\log N}}$, as well as a finite field analogue of the bound $r_4(N) \ll N (\log N)^{-c}$; however, we encountered significant technical difficulties for several years in extending this argument to obtain the result of the current paper. Our method is instead based on “energy increment arguments”, and more specifically on establishing a quantitative version of a Khintchine-type recurrence theorem, similar to the qualitative recurrence theorems established (in the ergodic theory context) by Bergelson-Host-Kra, and (in the current combinatorial context) by Ben Green and myself.
One way to phrase the latter recurrence theorem is as follows. Suppose that $A \subset [N]$ has density $\delta$. Then one would expect a “randomly” selected arithmetic progression $\mathbf{a}, \mathbf{a}+\mathbf{r}, \mathbf{a}+2\mathbf{r}, \mathbf{a}+3\mathbf{r}$ in $[N]$ (using the convention that random variables will be in boldface) to be contained in $A$ with probability about $\delta^4$. This is not true in general; however, it was shown by Ben and myself that for any $\epsilon > 0$, there was a set of shifts $r \in [N]$ of cardinality $\gg_{\epsilon,\delta} N$, such that for any such $r$ one had

$\displaystyle \mathbb{P}( \mathbf{a}, \mathbf{a}+r, \mathbf{a}+2r, \mathbf{a}+3r \in A ) \geq \delta^4 - \epsilon$

if $\mathbf{a}$ was chosen uniformly at random from $[N]$. This easily implies that $r_4(N) = o(N)$, but does not give a particularly good bound on the decay rate, because the implied constant in the cardinality lower bound $\gg_{\epsilon,\delta} N$ is quite poor (in fact of tower-exponential type, due to the use of regularity lemmas!), and so one has to take $N$ to be extremely large compared to $1/\epsilon$ and $1/\delta$ to avoid the possibility that the set of shifts in the above theorem consists only of the trivial shift $r = 0$.
We do not know how to improve the lower bound on the set of shifts to the point where it can give bounds that are competitive with those in this paper. However, we can obtain better quantitative results if we permit ourselves to couple together the two parameters $\mathbf{a}$ and $\mathbf{r}$ of the length four progression. Namely, with $A$, $\delta$, $\epsilon$ as above, we are able to show that there exist random variables $\mathbf{a}, \mathbf{r}$, not necessarily independent, such that

$\displaystyle \mathbb{P}( \mathbf{a}, \mathbf{a}+\mathbf{r}, \mathbf{a}+2\mathbf{r}, \mathbf{a}+3\mathbf{r} \in A ) \geq \delta^4 - \epsilon \ \ \ \ \ (1)$

and such that we have the non-degeneracy bound

$\displaystyle \mathbb{P}( \mathbf{r} = 0 ) \leq \frac{\exp( \epsilon^{-O(1)} )}{N}.$

This then easily implies the main theorem.
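Indeed, sketching the deduction with the unimportant constants suppressed: if $A$ contains no non-trivial progressions of length four, then the event in (1) can only occur when $\mathbf{r} = 0$, so that

$\displaystyle \delta^4 - \epsilon \leq \mathbb{P}( \mathbf{r} = 0 ) \leq \frac{\exp( \epsilon^{-O(1)} )}{N};$

choosing $\epsilon$ to be a small multiple of $\delta^4$, this forces $N \leq \exp( \delta^{-O(1)} )$, which on rearranging gives $\delta \ll (\log N)^{-c}$ for some absolute constant $c > 0$.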
The energy increment method is then deployed to locate a good pair of random variables $(\mathbf{a}, \mathbf{r})$ that will obey the above bounds. One can get some intuition on how to proceed here by considering some model cases. Firstly, one can consider a “globally quadratically structured” case in which the indicator function $1_A$ “behaves like” a globally quadratic function such as $n \mapsto \delta F( \alpha n^2 \hbox{ mod } 1 )$, for some irrational $\alpha$ and some smooth periodic function $F$ of mean $1$. If one then takes $\mathbf{a}$ and $\mathbf{r}$ to be uniformly distributed in $[N]$ and $[\eta N]$ respectively for some small $\eta > 0$, with no coupling between the two variables, then the left-hand side of (1) is approximately of the form

$\displaystyle \delta^4 \int_{x_1 - 3x_2 + 3x_3 - x_4 = 0} F(x_1) F(x_2) F(x_3) F(x_4) \ \ \ \ \ (2)$

where the integral is over the subtorus $\{ (x_1,x_2,x_3,x_4) \in (\mathbb{R}/\mathbb{Z})^4: x_1 - 3x_2 + 3x_3 - x_4 = 0 \}$ with respect to the probability Haar measure, and the constraint ultimately arises from the algebraic constraint

$\displaystyle \alpha a^2 - 3 \alpha (a+r)^2 + 3 \alpha (a+2r)^2 - \alpha (a+3r)^2 = 0.$

However, an application of the Cauchy-Schwarz inequality and Fubini’s theorem shows that the integral in (2) is at least $1$, which (morally at least) gives (1) in this case.
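(One way to verify this lower bound in this model case is to write $F = 1 + g$ with $g$ of mean zero; all the terms in the expansion of the integrand involving one, two, or three copies of $g$ vanish by Fubini’s theorem after integrating out a free variable, while the remaining quartic term can be computed in Fourier space: parametrising the subtorus by $(x_1, x_2, x_3) = (a,b,c)$ and $x_4 = a - 3b + 3c$, one finds for real $g$ that

$\displaystyle \int g(x_1) g(x_2) g(x_3) g(x_4) = \sum_{m \in \mathbb{Z}} |\hat g(m)|^2 |\hat g(3m)|^2 \geq 0,$

so the integral in (2) is indeed at least $(\int F)^4 = 1$.)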
Due to the nature of the energy increment argument, it also becomes necessary to consider “locally quadratically structured” cases, in which $[N]$ is partitioned into some number of structured pieces $B_c$ (think of these as arithmetic progressions, or as “Bohr sets”), and on each piece $B_c$, $1_A$ behaves like a locally quadratic function such as $\delta F_c( \alpha_c n^2 \hbox{ mod } 1 )$, where $\alpha_c$ now varies with $c$, and the mean of $F_c$ will be approximately $1$ on the average after averaging in $c$ (weighted by the size of the pieces $B_c$). Now one should select $\mathbf{a}$ and $\mathbf{r}$ in the following coupled manner: first one chooses $\mathbf{a}$ uniformly from $[N]$, then one defines $\mathbf{c}$ to be the label $c$ such that $\mathbf{a} \in B_c$, and then one selects $\mathbf{r}$ uniformly from a set $S_{\mathbf{c}}$ which is related to $B_{\mathbf{c}}$ in much the same way that $[\eta N]$ is related to $[N]$. If one does this correctly, the analogue of (2) becomes

$\displaystyle \delta^4 \sum_c \frac{|B_c|}{N} \int_{x_1 - 3x_2 + 3x_3 - x_4 = 0} F_c(x_1) F_c(x_2) F_c(x_3) F_c(x_4),$

and one can again use Cauchy-Schwarz and Fubini’s theorem to conclude.
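(The only new ingredient here compared to the global case is a final application of Jensen’s inequality: each inner integral is at least $(\int F_c)^4$ by the computation sketched above, so writing $w_c := |B_c|/N$, the whole expression is at least

$\displaystyle \delta^4 \sum_c w_c \Big( \int F_c \Big)^4 \geq \delta^4 \Big( \sum_c w_c \int F_c \Big)^4 \approx \delta^4,$

by convexity of $t \mapsto t^4$ and the fact that the means of the $F_c$ are approximately $1$ on the average.)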
The general case proceeds, very roughly, by an iterative argument. At each stage of the iteration, one has some sort of quadratic model of $1_A$ which involves a decomposition of $[N]$ into structured pieces $B_c$, and a quadratic approximation to $1_A$ on each piece. If this approximation is accurate enough (or more precisely, if a certain (averaged) local Gowers uniformity $U^3$ norm of the error is small enough) to model the count in (1) (for random variables $\mathbf{a}, \mathbf{r}$ determined by the above partition of $[N]$ into pieces $B_c$), and if the frequencies (such as $\alpha_c$) involved in the quadratic approximation are “high rank” or “linearly independent over the rationals” in a suitably quantitative sense, then some version of the above arguments can be made to work. If there are some unwanted linear dependencies in the frequencies, we can do some linear algebra to eliminate one of the frequencies (using some geometry of numbers to keep the quantitative bounds under control) and continue the iteration. If instead the approximation is too inaccurate, then the error will be large in a certain averaged local Gowers uniformity $U^3$ norm. A significant fraction of the paper is then devoted to establishing a quantitative inverse theorem for that norm, which concludes (with good bounds) that the error must then locally correlate with locally quadratic phases; this correlation can be used to refine the quadratic approximation to $1_A$ in a manner that significantly increases its “energy” (basically an $L^2$ norm). Such energy increments cannot continue indefinitely, and when they terminate we obtain the desired claim.
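(Schematically, the termination step is the usual one for energy increment methods: if $\mathcal{E}_i$ denotes the energy at the $i$-th stage of the iteration, then

$\displaystyle \mathcal{E}_{i+1} \geq \mathcal{E}_i + \eta \quad \hbox{and} \quad 0 \leq \mathcal{E}_i \leq O(1),$

where $\eta > 0$ depends only on $\epsilon$, so the iteration must halt after $O(1/\eta)$ steps; the quality of the final bounds is then governed by how $\eta$ and the other parameters depend on $\epsilon$.)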
There are existing inverse theorems for $U^3$-type norms in the literature, going back to the pioneering work of Gowers mentioned previously, and relying on arithmetic combinatorics tools such as Freiman’s theorem and the Balog-Szemerédi-Gowers lemma, which are good for analysing the “$1\%$-structured homomorphisms” that arise in Gowers’ argument. However, when we applied these methods to the local Gowers norms, we obtained inferior quantitative results that were not strong enough for our application. Instead, we use arguments from a different paper of Gowers, in which he tackled Szemerédi’s theorem for arbitrary length progressions. This method produces “$99\%$-structured homomorphisms” associated to any function with large Gowers uniformity norm; however, the catch is that such homomorphisms are initially supported only on a sparse unstructured set, rather than on a structured set such as a Bohr set. To proceed further, one first has to locate inside the sparse unstructured set a sparse pseudorandom subset of a Bohr set, and then use “error-correction” type methods (such as “majority-vote” based algorithms) to locally upgrade this $99\%$-structured homomorphism on pseudorandom subsets of Bohr sets to a $100\%$-structured homomorphism on the entirety of a Bohr set. It is then possible to use some “approximate cohomology” tools to “integrate” these homomorphisms (and discern a key “local symmetry” property of these homomorphisms) to locate the desired local quadratic structure (in much the same fashion that a $1$-form on $\mathbb{R}^n$ that varies linearly with the coordinates can be integrated to be the derivative of a quadratic function if we know that the $1$-form is closed). These portions of the paper are unfortunately rather technical, but broadly follow the methods already used in previous literature.
18 comments
5 May, 2017 at 1:39 pm
Anonymous
Is this method effective in finding any explicit numerical value for the exponent $c$? Is there any conjecture on the maximal possible value of such $c$?
5 May, 2017 at 9:00 pm
Terence Tao
In principle the exponent $c$ can be explicitly calculated from the arguments in the paper, but we made no efforts to optimise it and so it is likely to be quite terrible. As I mentioned in the blog post, the Erdős conjecture on arithmetic progressions roughly corresponds to the assertion that $c$ can be taken to be larger than 1. In view of the strong bounds now available for three-term progressions in the finite field setting, it is probably reasonable to conjecture that $c$ can in fact be taken to be arbitrarily large.
6 May, 2017 at 5:02 am
Anonymous
Motivated by the introduction in your paper, for each $k$, define $c_k$ as the supremum over all nonnegative exponents $c$ for which $r_k(N) \ll N (\log N)^{-c}$. Sanders’ result implies that $c_3 \geq 1$, and your result implies that $c_4 > 0$. Is it possible that all $c_k$ are positive and finite?
6 May, 2017 at 2:52 pm
Terence Tao
I would conjecture this to be the case. The known Behrend-type lower bounds are of the shape $r_k(N) \gg N \exp( - c (\log N)^{1/(k-1)} )$, and it is quite conceivable that these are close to sharp, though it looks like we are still far off from settling this claim (unless there is some further breakthrough, e.g. if the polynomial method that works so well for the cap set problem can somehow be extended to the cyclic group setting).
6 May, 2017 at 4:04 am
Anonymous
In the definition of $r_4(N)$, it is not clear why such a (unique) largest set $A$ should exist.
6 May, 2017 at 2:41 pm
Terence Tao
There are only finitely many subsets $A$ of $[N]$.
6 May, 2017 at 5:34 pm
Anonymous
I meant that (since the property of such a subset $A$ is translation-invariant) it is possible that there are several maximal subsets $A$ with this property without any largest one (on which the above definition of $r_4(N)$ is based), so in the above definition of $r_4(N)$ it seems clearer to replace “cardinality of the largest subset $A$” by “largest cardinality of a subset $A$” (as in the definition of $r_4(N)$ in the abstract of your paper.)
[Wording changed – T.]
7 May, 2017 at 1:51 am
Klaus85
“This sequel has been delayed for over a decade for a number of reasons”. I find this very surprising. Does it mean that you had the results already a decade ago, and just did not write them up? Do you have more close-to-finished projects from a very long time ago which are unpublished?
10 May, 2017 at 10:44 am
Terence Tao
It’s a long story, and I’ve actually forgotten parts of it at this point, but basically the first version of our argument (based on density increment methods) turned out to have some significant technical difficulties, so we put it aside for a few years until we found a different argument (based on energy increment methods) which worked; however the write up of this paper was extremely technical and we did not feel it was readable enough to be in a publishable state (though we did share the preprint with some close colleagues who requested it), so the paper again languished. About a year ago we were contacted to submit an article for a special issue of Mathematika honouring the work of Klaus Roth, and this encouraged us to rewrite the (quite lengthy) paper to a state that we were both happy with for submission, but we were only able to do so after we were both at MSRI earlier this year, and could devote a significant portion of time to the task.
7 May, 2017 at 5:26 am
mathlog
Why do you call it a polylogarithmic bound?
[ See https://en.wikipedia.org/wiki/Time_complexity#Polylogarithmic_time -T.]
8 May, 2017 at 1:31 am
Anonymous
Page 2, paragraph 2 in the paper contains: “A major breakthrough was made by in 1998 by Gowers”; but should probably say, “A major breakthrough was made by Gowers in 1998”?
[Thanks, this will be corrected in the next version of the ms. -T.]
8 May, 2017 at 5:16 am
Anonymous
Is there any apparent obstruction to applying this method (with similar estimates) to $r_5(N)$?
10 May, 2017 at 10:23 am
Terence Tao
There are two obstacles to this currently. The first is that one would need a quantitative (local) inverse U^4 theorem of comparable strength to the quantitative (local) inverse U^3 theorem that is used in this paper, and at present no such quantitative theorem is known. Secondly (and more seriously), the application of the Cauchy-Schwarz inequality that lower bounds (2) is not available for the analogue of (2) in the k=5 setting (which is a somewhat complicated quintilinear integral expression on a certain nilmanifold); indeed there is a counterexample by Ruzsa in the appendix to the Bergelson-Host-Kra paper mentioned in the post that suggests that no analogue of the Khintchine-type recurrence theorems proven here in the k=4 case will be available for k equal to 5 or higher.
10 May, 2017 at 5:06 pm
Danilo
Perfect!
11 May, 2017 at 9:21 pm
Anonymous
It seems that people are very close to proving that any set of natural numbers whose sum of reciprocals diverges contains arithmetic progressions of length 3.
25 May, 2017 at 10:03 am
Anonymous
Comments to the bibliography:
* [8]: “93-128” –> “93–128”
* [9]: The title should be in italic
* [10]: “261-264” –> “261–264”
* [16]: “327,1–29” –> “327, 1–29”
* [23]: The URL should be put in typewriter font, i.e., \href{https://arxiv.org/abs/1205.1330}{\texttt{https://arxiv.org/abs/1205.1330}}.
* [26]: “2009. 180–204” –> “2009, 180–204”
* [32]: “AMS 1994” –> “AMS, 1994”
* [33]: “R.A.” –> “R. A.”
* [40]: The title should be in italic
* [44]: “Hungar.56” –> “Hungar. 56”
* [44]: “no. 1-2” –> “no. 1–2”
* [46]: Should “2004 El Escorial conference proceedings” be in italic?
[Thanks, this will be corrected in the next version of the ms. -T.]
4 March, 2022 at 8:28 pm
James Leng
Is there a typo in Theorem 8.1 condition (8.2)? I believe i and i + 1 should be switched around?
[Yes, this is a typo – T.]
30 May, 2022 at 7:05 am
Anonymous
Shouldn’t it be $\mathbb{P}( \mathbf{r} = 0 ) \leq \exp( \epsilon^{-O(1)} ) / N$ (without the negative sign in the argument of exp)?
[Corrected, thanks – T.]