For any positive integer n, let be the set of strings of length n consisting of 1s, 2s, and 3s, and define a combinatorial line to be a triple of such strings arising by taking a string in with at least one wildcard x, and substituting x=1, x=2, x=3 in that string (e.g. xx1x3 would give the combinatorial line ). Call a set of strings line-free if it contains no combinatorial lines, and let be the size of the largest set A which is line-free. We then have
Density Hales-Jewett theorem (k=3): , i.e. .
This theorem implies several other important results, most notably Roth’s theorem on length three progressions (in an arbitrary abelian group!) and also the corners theorem of Ajtai and Szemeredi (again in an arbitrary abelian group).
Here are some of the questions we are initially exploring in this thread (and which I will comment further on below the fold):
- What are the best upper and lower bounds for for small n? Currently we know , with some partial results for higher n. We also have some relationships between different .
- What do extremal or near-extremal sets (i.e. line-free sets with cardinality close to ) look like?
- What are some good constructions of line-free sets that lead to asymptotic lower bounds on ? The best asymptotic lower bound we have currently is .
- Can these methods extend to other related problems, such as the Moser cube problem or the capset problem?
- How feasible is it to extend the existing combinatorial proofs of Roth’s theorem (or the corners theorem), in particular the arguments of Szemeredi, to the Density Hales-Jewett theorem?
But I imagine that further threads might develop in the course of the discussion.
As with the rest of the project, this is supposed to be an open collaboration: please feel free to pose a question or a comment, even if (or especially if) it is just barely non-trivial. (For more on the “rules of the game”, see this post.)
— I. for small n —
Because the Cartesian product of two line-free sets is line-free, we have a lower bound
for all , although this seems to be an inferior bound in practice.
There is a trivial bound
which comes from the observation that any line-free subset of can be split into three slices, each of which is essentially a line-free subset of . In particular, since , we see that
If we identify with and consider the set
thus . Observe that a combinatorial line with a,b,c,k 1s, 2s, 3s, and wildcards respectively lies in if and only if k is a multiple of 3, and a is not equal to b mod 3. In particular, is line-free for , thus matching the upper bound (3) and concluding that
For n=4, the only combinatorial lines remaining in are and their permutations. Thus (as observed in Neylon.83) the set is line-free, leading to the lower bound .
This bound was complemented with the lower bound in Jakobsen.90. Thus .
For , the best lower bound currently is (Neylon.83) and .
No attempt at getting reasonable bounds for , etc. appears in the previous threads.
Question I.A: Can we improve the upper and lower bounds on ?
Question I.B: Is there some reasonably efficient way to automate this process?
Question I.C: What do the extremal line-free sets (sets of size close to ) look like?
At some point we should submit this sequence to the OEIS.
— II. Lower bounds on for large n —
For any integers a,b,c adding up to n, let be the set of all strings with a 1s, b 2s, and c 3s. Observe that if is any set of triples (a,b,c) that avoids equilateral triangles for , then the set is line-free. In particular, if we choose B to be the set of all triples such that a+2b lies in a set free of length three arithmetic progressions (e.g. a Behrend set), then is line-free. This seems to give a lower bound
Question II.B: What does one get for higher k than 3? Presumably one would use the O’Bryant bound (O’Bryant.126).
Question II.C: Do the examples from Section I extend to give competitive lower bounds for large n?
Question II.D: Are there better ways to avoid triangles or corners in two dimensions than simply lifting up the Behrend example from one dimension?
— III. Miscellanous facts about —
Using the colouring Hales-Jewett theorem, one can show that for every m, one has for sufficiently large m, improving (2) very slightly; see (Tao.85).
Question III.A: Anything else we can say about the ? For instance, is there a relation to Hales-Jewett numbers HJ(3,r) (defined as the first n such that every r-colouring of contains a monochromatic combinatorial line?)
— IV. Related problems —
Let denote the largest subset of which does not contain any geometric line (which is the same as a combinatorial line, but has a second wildcard y which goes from 3 to 1 whilst x goes from 1 to 3, e.g. xx2yy gives the geometric line ). The Moser cube problem is to understand the behaviour of . The first few values are (see OEIS A003142):
It is clear that .
Question IV.A. Can we improve the lower bound in (8)?
Question IV.B. Can we get good bounds on ?
Define to be the largest cardinality of a subset of which contains no algebraic lines, defined as a triple where ; such sets are known as capsets. Clearly . As noted my previous blog post, the best asymptotic bounds are
Question IV.C. Can we improve on these bounds in any way?
Question IV.D. What are the first few values of ? Brute force calculation reveals . (Presumably this sequence is also in the OEIS.)
[Update, Feb 7: I will now maintain a running table of the current “world records” for for small n, formed by combining all the bounds already mentioned in this post and comments.]
[Update, Feb 8: A more complete (and up-to-date) spreadsheet can now be found here.]
- green – obtained via the inequality (1)
- orange – obtained via the inequality (2)
- lavender – uses the inequality
— V. Szemeredi’s proof of Roth’s theorem —
Szemeredi provided a short proof of Roth’s theorem by purely combinatorial means, and also famously proved the full case of Szemeredi’s theorem by much more intricate combinatorial argument. Both arguments are based on the density increment method. In the setting of , the density increment method revolves around the critical density , which exists by (2); the density Hales-Jewett theorem is precisely the statement that vanishes, so suppose for contradiction that is positive.
Then one can find a line-free set of density such that A has density at most for every m-dimensional subspace of (and one can complement this lower bound with an upper bound for “most” subspaces, see Tao.100). One can also regularise the density on other sets than subspaces, see Tao.121). So, basically, the moment one gets a significant density increment on a non-trivial subspace, one is done (see Tao.120, O’Donnell.122, Tao.127).
Szemeredi’s proof of Roth’s theorem proceeds by first showing that a dense set A contains a large cube Q, and thus (if it is free of algebraic lines) must completely avoid the set . This set contains (most of) lots of parallel subspaces and so squeezes A to have higher density on some other subspace, closing the argument. (See Tao.86, Tao.118, Tao.131)
One can at least generate lots of cubes inside A without difficulty (Tao.102, Solymosi.135), lots of combinatorial lines with two elements inside A (Tao.130, Bukh.132, O’Donnell.133, Solymosi.135), and many unorganised subspaces in the complement of A (Tao.118, Solymosi.135), but this does not seem to be enough as yet.
Question V.B: What about using the techniques from Szemeredi’s big paper on Szemeredi’s theorem? (The immediate issue is that there is a key point where one needs to regularise a graph, and the relevant graph in HJ or even for corners is far too sparse; I’ll try to clarify this at some point.)
— A note on comments —
As in the first thread, please number your comments (starting with 200, then 201, etc.) and provide a title, to assist with following the comments. (Presumably, in future projects of this type, we will use a platform that allows for comment threading; see this post for further discussion.)