The following result is due independently to Furstenberg and to Sarkozy:
Theorem 1 (Furstenberg-Sarkozy theorem) Let
, and suppose that
is sufficiently large depending on
. Then every subset
of
of density
at least
contains a pair
for some natural numbers
with
.
This theorem is of course similar in spirit to results such as Roth’s theorem or Szemerédi’s theorem, in which the pattern is replaced by
or
for some fixed
respectively. There are by now many proofs of this theorem (see this recent paper of Lyall for a survey), but most proofs involve some form of Fourier analysis (or spectral theory). This may be compared with the standard proof of Roth’s theorem, which combines some Fourier analysis with what is now known as the density increment argument.
A few years ago, Ben Green, Tamar Ziegler, and myself observed that it is possible to prove the Furstenberg-Sarkozy theorem by just using the Cauchy-Schwarz inequality (or van der Corput lemma) and the density increment argument, removing all invocations of Fourier analysis, and instead relying on Cauchy-Schwarz to linearise the quadratic shift . As such, this theorem can be considered as even more elementary than Roth’s theorem (and its proof can be viewed as a toy model for the proof of Roth’s theorem). We ended up not doing too much with this observation, so decided to share it here.
The first step is to use the density increment argument that goes back to Roth. For any , let
denote the assertion that for
sufficiently large, all sets
of density at least
contain a pair
with
non-zero. Note that
is vacuously true for
. We will show that for any
, one has the implication
for some absolute constant . This implies that
is true for any
(as can be seen by considering the infimum of all
for which
holds), which gives Theorem 1.
It remains to establish the implication (1). Suppose for sake of contradiction that we can find for which
holds (for some sufficiently small absolute constant
), but
fails. Thus, we can find arbitrarily large
, and subsets
of
of density at least
, such that
contains no patterns of the form
with
non-zero. In particular, we have
(The exact ranges of and
are not too important here, and could be replaced by various other small powers of
if desired.)
Let be the density of
, so that
. Observe that
and
If we thus set , then
In particular, for large enough,
On the other hand, one easily sees that
and hence by the Cauchy-Schwarz inequality
which we can rearrange as
Shifting by
we obtain (again for
large enough)
In particular, by the pigeonhole principle (and deleting the diagonal case , which we can do for
large enough) we can find distinct
such that
so in particular
If we set and shift
by
, we can simplify this (again for
large enough) as
On the other hand, since
we have
for any , and thus
Averaging this with (2) we conclude that
In particular, by the pigeonhole principle we can find such that
or equivalently has density at least
on the arithmetic progression
, which has length
and spacing
, for some absolute constant
. By partitioning this progression into subprogressions of spacing
and length
(plus an error set of size
, we see from the pigeonhole principle that we can find a progression
of length
and spacing
on which
has density at least
(and hence at least
) for some absolute constant
. If we then apply the induction hypothesis to the set
we conclude (for large enough) that
contains a pair
for some natural numbers
with
non-zero. This implies that
lie in
, a contradiction, establishing the implication (1).
A more careful analysis of the above argument reveals a more quantitative version of Theorem 1: for (say), any subset of
of density at least
for some sufficiently large absolute constant
contains a pair
with
non-zero. This is not the best bound known; a (difficult) result of Pintz, Steiger, and Szemeredi allows the density to be as low as
. On the other hand, this already improves on the (simpler) Fourier-analytic argument of Green that works for densities at least
(although the original argument of Sarkozy, which is a little more intricate, works up to
). In the other direction, a construction of Rusza gives a set of density
without any pairs
.
Remark 1 A similar argument also applies with
replaced by
for fixed
, because this sort of pattern is preserved by affine dilations
into arithmetic progressions whose spacing
is a
power. By re-introducing Fourier analysis, one can also perform an argument of this type for
where
is the sum of two squares; see the above-mentioned paper of Green for details. However there seems to be some technical difficulty in extending it to patterns of the form
for polynomials
that consist of more than a single monomial (and with the normalisation
, to avoid local obstructions), because one no longer has this preservation property.
25 comments
Comments feed for this article
28 February, 2013 at 11:16 pm
Hossein
WOW!
1 March, 2013 at 1:30 am
anon
Typo: “Suppose for sake of contradiction that …
holds” – should this be
?
[Corrected, thanks. Actually I found just now that one could actually get back the
gain in density by being a bit more careful with the Cauchy-Schwarz, and updated the post accordingly. -T]
1 March, 2013 at 4:09 am
Michael
In the three equations before equation (2) I think r+h+h’ should be 2r+h+h’
[Corrected, thanks – T.]
1 March, 2013 at 6:32 am
Anonymous
Thank you for the nice post! Actually, through shifting by (r+h)^2 one obtains 2r instead of r, and later 2rd but this is a minor thing.
1 March, 2013 at 6:38 am
Gil Kalai
Terry, What are the upper and lower bounds on the required density in this case?
[Thanks for the question! I just added a link to the best known lower bound, by Ruzsa, at the end of this post. The best known upper bound is due to Pintz, Steiger, and Szemeredi and is also given at the end of the post. -T.]
2 March, 2013 at 3:03 am
Ben Green
Terry, I don’t think my arguments from 2002 give
,
,
– in the paper you refer to, the common difference has to be a sum of two squares, a much weaker result. Also, the original Fourier -analytic argument of S\’ark\”ozy, which is not too complicated, gives a bound of
. The idea of this argument is to make sure your large Fourier coefficient is near a low-complexity rational, so you pass to a subprogression with small common difference.
[Corrected, thanks – T.]
2 March, 2013 at 10:45 am
Gil Kalai
Thanks for the lower bound, Terry, That’s quite a gap :)
I have another question: For Roth we have 2 finite alphabet analogs: The first is DHJ(3) (the density Hales Jewett for 3-letters alphabet) and the second is the cupset problem (a set of (0,1,n}^n without an affine line). (Of course, there are intermediate problems, etc.)
For Furstenberg-Sarkozy we have a DHJ analog (which is the first case of a polynomial DHJ that Tim proposed to study; it is over a 2-letters alphabet, the ground set has the structure of edges of a complete graph, and asks for a Sperner theorem where the difference correspond to a complete graph).
http://gowers.wordpress.com/2009/11/14/the-first-unknown-case-of-polynomial-dhj/
My question is about an analog analogous to the the cupset analog. Maybe you have to exclude a difference is in a quadric? Do you know?
(If the forbidden things will includes those of the polynomial DHJ this would be nice but not necessary)
2 March, 2013 at 11:48 am
Terence Tao
Well, the most direct capset analogue of the Furstenberg-Sarkozy theorem is that every dense subset of the polynomial ring
contains a pair of polynomials of the form
for
non-zero. Here density is in the sense that the density in the space of polynomials of degree less than
(which is isomorphic as a vector space to
) is bounded from below as
. Pretty much any of the known proofs of the Furstenberg-Sarkozy theorem will adapt to this finite field setting (and in fact the arguments will usually become a little bit simpler) But this is not a purely alphabetical statement, because the operation of squaring a polynomial becomes quite complicated if one attempts to interpret it purely in terms of combinatorial operations on the coefficients (it is convolution rather than pointwise addition).
2 March, 2013 at 11:58 am
Gil Kalai
That’s fine, but why F_3 and not F_2?
2 March, 2013 at 4:18 pm
Terence Tao
The squaring function
behaves quite differently in characteristic 2 as compared to odd (or zero) characteristic; in particular, in this characteristic the squaring function becomes linear (
), and the result here collapses to something more analogous to the Ajtai-Szemeredi corners theorem than the Furstenberg-Sarkozy theorem.
14 March, 2013 at 1:19 pm
Gil Kalai
Regarding
as representing polynomials in one variable of degree less than n is very natural, but we can have other versions, e.g., polynomials of degree 2 in some m variables (or “bipartite” polynomials of degree 2) , which may make sense over
as well. In particular, it will be nice to have cupset versions related to versions of polynomial HJT that Tim Gowers considered here http://gowers.wordpress.com/2009/11/14/the-first-unknown-case-of-polynomial-dhj/ .
3 March, 2013 at 1:33 pm
Neil Lyall
Terry, thanks for the very nice post!
Just wanted to point out that I believe the argument you present can be adapted in a standard way, using Lucier’s increment strategy (with the polynomials changing with each iteration), to obtain the analogous result for full class of so-called intersective polynomials. A nice exposition of Lucier’s iterative procedure can be found at the beginning of Chapter 6 (pp 40-42) of Alex Rice’s Ph.D. thesis, see http://www.math.uga.edu/~arice/AlexThesis.pdf
A quick calculation seems to give a bound of the order
for intersective polynomials of degree
.
10 March, 2013 at 6:45 am
Joseph
Thank you for one more great post. You work inspire me, Dr. Tao.
Can anyone help me with a very basic question? How does (1) guarantee that the infimus of the set of all \delta>0 for which P(\delta) holds is zero? And how does analyzing this leads to the fact that P holds for all \delta>0?
Sorry for the basic question, but I got stuck.
10 March, 2013 at 10:22 am
anon
Define the set
. If
then for any
we have
, simply because the theorem is stronger for smaller
. So the set is some interval (closed
or open
) and trivially it contains 1. Let’s call
infimum of
.
may or may not be in
but anything greater than it definitely is in
.
If
, then we are done, since this means
contains any positive value, which proves the theorem. So let’s assume
and derive a contradiction.
Let
. Note that
is (strictly) increasing and continuous. From (1) we have
. Equivalently,
.
Take any number
. Since
we have
, therefore
. But since
we have
. So we have a number
which is in
and smaller than
, contradiction.
10 March, 2013 at 12:14 pm
Joseph
Dear anon,
Thank you for your nice explanation!
I now got the idea.
Sincerely, Joseph
11 March, 2013 at 6:20 am
Joseph
Dear anon,
Thank you again. But just a quick question: as we define f(x) = x+cx^3, can we consider the same value of c for all x? I mean, the beginning of Tao’s proof says that for any 0<delta00. Can we then fix c when defining f(x) (for example, by taking a c that satisfies (1) for any delta, if such one exists)?
Otherwise, if we state f(x) = x+c(x)x^3, with c now a function, can we still guarantee that f is invertible?
Sorry if the question is too basic… This is a new field for me.
Regards,
Joseph.
[It appears that you used some < and > signs in your answer that became interpreted as HTML and thus disappeared. If you use < and > instead then this should avoid the problem. In any case,
is indeed the fixed absolute constant from (1). -T.]
13 March, 2013 at 6:45 pm
Joseph
Never mind, I got it… Once you assume that the infimum is delta0 > 0, you define f(x) = x+cx^3, with c the value that satisfies (1) for that particular delta0 value (so, yes, a unique value for c). Than f is of course increasing and continuous, and everything else follows… I was thinking in a different way, but your explanation was perfect. Thanks!
23 March, 2013 at 7:50 am
solver6
Hello, i don’t understand wy
holds. Can someone help me.
23 March, 2013 at 9:21 am
Terence Tao
For all but about
of the choices of n (the choices that are closest to the boundary of
, the inner average
is equal to
.
19 April, 2013 at 6:32 am
Anonymous
Hi, I don’t understand the move from the inequality depending only on $h,n,r$ to the one depending also on $h’$.
Any help?
19 April, 2013 at 8:21 am
Terence Tao
21 April, 2013 at 9:33 am
Anonymous
Maybe the left side should be squared?
[Corrected, thanks – T.],
6 May, 2015 at 8:39 am
Ned
Hi, I need some help: I don’t understand how Cauchy-Schwarz is applied.
6 May, 2015 at 2:26 pm
Terence Tao
Cauchy-Schwarz bounds
by the geometric mean of
and
.
8 May, 2015 at 12:26 am
Ned
thanks!