A capset in the vector space over the finite field
of three elements is a subset
of
that does not contain any lines
, where
and
. A basic problem in additive combinatorics (discussed in one of the very first posts on this blog) is to obtain good upper and lower bounds for the maximal size of a capset in
.
Trivially, one has . Using Fourier methods (and the density increment argument of Roth), the bound of
was obtained by Meshulam, and improved only as late as 2012 to
for some absolute constant
by Bateman and Katz. But in a very recent breakthrough, Ellenberg (and independently Gijswijt) obtained the exponentially superior bound
, using a version of the polynomial method recently introduced by Croot, Lev, and Pach. (In the converse direction, a construction of Edel gives capsets as large as
.) Given the success of the polynomial method in superficially similar problems such as the finite field Kakeya problem (discussed in this previous post), it was natural to wonder that this method could be applicable to the cap set problem (see for instance this MathOverflow comment of mine on this from 2010), but it took a surprisingly long time before Croot, Lev, and Pach were able to identify the precise variant of the polynomial method that would actually work here.
The proof of the capset bound is very short (Ellenberg’s and Gijswijt’s preprints are both 3 pages long, and Croot-Lev-Pach is 6 pages), but I thought I would present a slight reformulation of the argument which treats the three points on a line in symmetrically (as opposed to treating the third point differently from the first two, as is done in the Ellenberg and Gijswijt papers; Croot-Lev-Pach also treat the middle point of a three-term arithmetic progression differently from the two endpoints, although this is a very natural thing to do in their context of
). The basic starting point is this: if
is a capset, then one has the identity
for all , where
is the Kronecker delta function, which we view as taking values in
. Indeed, (1) reflects the fact that the equation
has solutions precisely when
are either all equal, or form a line, and the latter is ruled out precisely when
is a capset.
To exploit (1), we will show that the left-hand side of (1) is “low rank” in some sense, while the right-hand side is “high rank”. Recall that a function taking values in a field
is of rank one if it is non-zero and of the form
for some
, and that the rank of a general function
is the least number of rank one functions needed to express
as a linear combination. More generally, if
, we define the rank of a function
to be the least number of “rank one” functions of the form
for some and some functions
,
, that are needed to generate
as a linear combination. For instance, when
, the rank one functions take the form
,
,
, and linear combinations of
such rank one functions will give a function of rank at most
.
It is a standard fact in linear algebra that the rank of a diagonal matrix is equal to the number of non-zero entries. This phenomenon extends to higher dimensions:
Lemma 1 (Rank of diagonal hypermatrices) Let
, let
be a finite set, let
be a field, and for each
, let
be a coefficient. Then the rank of the function
Proof: We induct on . As mentioned above, the case
follows from standard linear algebra, so suppose now that
and the claim has already been proven for
.
It is clear that the function (2) has rank at most equal to the number of non-zero (since the summands on the right-hand side are rank one functions), so it suffices to establish the lower bound. By deleting from
those elements
with
(which cannot increase the rank), we may assume without loss of generality that all the
are non-zero. Now suppose for contradiction that (2) has rank at most
, then we obtain a representation
for some sets of cardinalities adding up to at most
, and some functions
and
.
Consider the space of functions that are orthogonal to all the
,
in the sense that
for all . This space is a vector space whose dimension
is at least
. A basis of this space generates a
coordinate matrix of full rank, which implies that there is at least one non-singular
minor. This implies that there exists a function
in this space which is nowhere vanishing on some subset
of
of cardinality at least
.
If we multiply (3) by and sum in
, we conclude that
where
The right-hand side has rank at most , since the summands are rank one functions. On the other hand, from induction hypothesis the left-hand side has rank at least
, giving the required contradiction.
On the other hand, we have the following (symmetrised version of a) beautifully simple observation of Croot, Lev, and Pach:
Lemma 2 On
, the rank of the function
is at most
, where
Proof: Using the identity for
, we have
The right-hand side is clearly a polynomial of degree in
, which is then a linear combination of monomials
with with
In particular, from the pigeonhole principle, at least one of is at most
.
Consider the contribution of the monomials for which . We can regroup this contribution as
where ranges over those
with
,
is the monomial
and is some explicitly computable function whose exact form will not be of relevance to our argument. The number of such
is equal to
, so this contribution has rank at most
. The remaining contributions arising from the cases
and
similarly have rank at most
(grouping the monomials so that each monomial is only counted once), so the claim follows.
Upon restricting from to
, the rank of
is still at most
. The two lemmas then combine to give the Ellenberg-Gijswijt bound
All that remains is to compute the asymptotic behaviour of . This can be done using the general tool of Cramer’s theorem, but can also be derived from Stirling’s formula (discussed in this previous post). Indeed, if
,
,
for some
summing to
, Stirling’s formula gives
where is the entropy function
We then have
where is the maximum entropy
subject to the constraints
A routine Lagrange multiplier computation shows that the maximum occurs when
and is approximately
, giving rise to the claimed bound of
.
Remark 3 As noted in the Ellenberg and Gijswijt papers, the above argument extends readily to other fields than
to control the maximal size of subset of
that has no non-trivial solutions to the equation
, where
are non-zero constants that sum to zero. Of course one replaces the function
in Lemma 2 by
in this case.
Remark 4 This symmetrised formulation suggests that one possible way to improve slightly on the numerical quantity
by finding a more efficient way to decompose
into rank one functions, however I was not able to do so (though such improvements are reminiscent of the Strassen type algorithms for fast matrix multiplication).
Remark 5 It is tempting to see if this method can get non-trivial upper bounds for sets
with no length
progressions, in (say)
. One can run the above arguments, replacing the function
with
this leads to the bound
where
Unfortunately,
is asymptotic to
and so this bound is in fact slightly worse than the trivial bound
! However, there is a slim chance that there is a more efficient way to decompose
into rank one functions that would give a non-trivial bound on
. I experimented with a few possible such decompositions but unfortunately without success.
Remark 6 Return now to the capset problem. Since Lemma 1 is valid for any field
, one could perhaps hope to get better bounds by viewing the Kronecker delta function
as taking values in another field than
, such as the complex numbers
. However, as soon as one works in a field of characteristic other than
, one can adjoin a cube root
of unity, and one now has the Fourier decomposition
Moving to the Fourier basis, we conclude from Lemma 1 that the function
on
now has rank exactly
, and so one cannot improve upon the trivial bound of
by this method using fields of characteristic other than three as the range field. So it seems one has to stick with
(or the algebraic completion thereof).
Thanks to Jordan Ellenberg and Ben Green for helpful discussions.
78 comments
Comments feed for this article
18 May, 2016 at 4:14 am
JSE
Beautiful! The symmetric formulation “feels right” and is surely the best way to express it.
20 May, 2016 at 12:57 am
Anonymous
It seems to be a proof from “the book”!
18 May, 2016 at 4:23 am
Nets Katz
Terry, is it obvious that decompositions of the quadratic form in Remark 5 into rank 1 pieces is the only way to go? Any argument that gives a lower bound for the dimension of a set of vectors which pairwise vanish would work. Must such an argument have a simple algebraic formulation? I would like to comment that applying the technology to characteristic zero meets with similar obstructions. A simple model for characteristic zero is to replace F_p^d addition with addition that has carrying but for which the carrying in each digit is independent. Then you have to look at a product of 2^{d-1} forms, one for each carrying profile.
18 May, 2016 at 7:02 am
JSE
I was thinking along similar lines, Nets. It’s an interesting question — what formal properties of polynomials are needed to make some argument of this kind work? Are there families of functions on (Z/NZ) which are “polynomial-like” enough? I truly don’t know. My first instinct would be to look at the Witt vector construction of (Z/p^n Z), which is somehow of the most canonical way of expressing “carrying” and passing from char p to char 0.
18 May, 2016 at 10:02 am
Terence Tao
It could well be that working with more general classes of functions than polynomials of a given degree would be useful – for instance polynomials taking values in some other commutative ring, or even a non-commutative ring (though the linear algebra becomes replaced by something nastier of course if one strays too far from the field setting). One could possibly play with completely non-linear spaces also. Maybe it is worth working with very low dimensional examples (as in my other recent comment here) to experiment if there are other decompositions than the polynomial ones which are more efficient. I played around for a while with multi-degrees instead of degrees, for instance, but eventually concluded that one did not gain much asymptotically by doing so.
18 May, 2016 at 6:01 pm
Will
It depends on whether we want to replace the additive structure of polynomials or just multiplicative. Since like most mathematicians I will only stop using linear algebra under extreme duress, I thought about how to generalize the polynomial method here while still using the notion of tensor rank to measure the complexity of functions.
So we can ask the general question: given a tensor f in V^3, viewing f tensor f tensor … f as a tensor in (V^k)^3, how may we upper bound its tensor rank as a function of k?
We may also try to lower bound its tensor rank, to let us know that we are not missing an easy way to improve the bound, or for use on the other side of the inequality. So maybe the right question is what is the asymptotic of the tensor rank – is it always an exponential function in k?
Anyways the upper bound argument generalizes immediately from polynomial functions to any functions that can be written as a sum of v_a tensor v_b tensor v_c, where v_a,v_b,v_c lie in some list of vector that come with weights w_a,w_b,w_c, such that we have an upper bound w_a + w_b + w_c < W for every term appearing in the sum.
Then we get an upper bound in terms of the maximum entropy of a probability distribution on the indices a where the expected value of w_a is at most W/3, by precisely the same argument.
To recover the polynomial special case, take v_a,v_b,v_c to be monomials in one variable, w_a,w_b,w_c the degree of the monomial, and W the degree of the polynomial function.
In the case where v_a range through a basis v_1,..,v_n (e.g. the monomials in one variable over F_p of degree less than p), then as soon as W/3 is less than the average value of w under the uniform distribution we will force the distribution to have an entropy slightly smaller than log n and hence we will get a nontrivial bound for the tensor ranks.
So for a 3-variable polynomial over F_p, the degree needs to be less than 3(p-1)/2 to get a nontrivial bound, and for a d-variable polynomial, less than d(p-1)/2.
In what should be interesting to some, this condition is precisely the Hilbert-Mumford criterion for GIT-instability under the action of (SL_n)^3 on P^{n^3-1}. So we can summarize the key idea in higher-level language as: There is a power savings bound for the rank of powers of GIT-unstable tensors.
Are there any known bounds for ranks of tensor powers that cannot be derived in this way from tuples of vectors and weight functions? Is there any method that would allow us to lower bound tensor ranks so as to show that this upper bound is sometimes sharp?
19 May, 2016 at 11:59 pm
Terence Tao
If
has some basis
, then we can associate a “Newton diagram”
to a function
by declaring
to be an element of
if the
coefficient of
is non-zero. Define the “capacity”
of this diagram to be the maximum value of
, where
ranges over random variables taking values in
and
denotes Shannon entropy. Then I think Stirling’s formula ends up saying that the rank of
(viewed as an element of
) is at most
, basically because every “monomial” in
becomes associated with a discrete random variable
in
(taking
values). I think this bound covers the previous bounds on rank mentioned here, though I haven’t checked fully.
20 May, 2016 at 6:13 am
Will
I think the bound you sketched and the bound I sketched are the same because the problems you need to solve to compute each bound are dual in a convex optimization sense. Given a weight function, you can upper bound the entropy of the marginal distributions of any probability distribution by observing min(E(w_X,w_Y,w_Z))<E(w_X+w_Y+w_Z)<sup(w_X+w_Y+w_Z)<W. If you have any distribution, you can define a weight function as the derivative of the entropy of each marginal distribution by the probability of each marginal value times the Lagrange multiplier of the entropy. Then the upper bound we get from the weight function is exactly the minimum entropy of a marginal distribution.
This makes it easy to check you aren't missing an obvious improvement – write down a weight function and a distribution and check that their entropies agree.
20 May, 2016 at 11:26 am
Terence Tao
I think one may have to generalise to the non-symmetric setting where the weight functions associated to each of the three components
are allowed to be different from each other, but I think I agree with your point that the two formulations are dual.
18 May, 2016 at 5:09 am
Fred Lunnon
Typos (presumptive?):
for “F_3^n” read “F_{3^n}” (throughout);
for “F^3” read “F_3” (once).
[Second erratum corrected. First is not an erratum, I have tried to clarify in post – T.]
18 May, 2016 at 5:46 am
Will
One can not improve on your bound for 4-term arithmetic progressions using only polynomials. Indeed if
is any multivariate polynomial over
that is
outside a a linear subspace of codimension
but not
everywhere, then
restricted to some subspace of dimension
is a polynomial that vanishes outside a point but not at that point, which by a fragment of Chevalley-Warning (http://mathoverflow.net/a/238527/18060) has degree at least
. So a nontrivial polynomial which is nonvanishing for
-term arithmetic progressions in
has degree at least
, and the upper bound for the degree in a single vector is
, which is precisely the critical value one needs to beat to improve the bound.
18 May, 2016 at 7:09 am
JSE
One note about this higher notion of “rank” is that rank <= r is not a closed condition, the way it is for matrices! I always find this confusing and it's probably good to keep it in mind.
18 May, 2016 at 5:31 pm
Will
Are you sure that is true for this notion of tensor rank and not just the more commonly used notion of tensor rank? I might have a proof that this notion of rank is in fact semicontinuous.
18 May, 2016 at 6:03 pm
JSE
Oh is Terry’s notion not the same as usual tensor rank? I thought it was, but didn’t think about it carefully.
18 May, 2016 at 6:47 pm
Will
The usual definition allows only functions of the form f(x)g(y)h(z), which is much more restrictive. For instance it is clear that every n x n x n tensor has rank at most n in Terry’s sense but this is not true in the usual sense. Maybe we should call Terry’s version something like lax tensor rank to remove ambiguity.
Let me sketch my argument that the space of tensors of bounded lax tensor rank is closed:
A tensor in V tensor V tensor V is of rank < k if it is in the space spanned by W1 x V x V + V x W2 x V + V x V x W3 where dim W1 + dim W2 + dim W3 < k. Now for fixed W1, W2, W3 the space spanned by W1 x V x V + V x W2 x V + V x V x W3 is a linear subspace of V x V x V and varies smoothly as W1 , W2, W3 vary in the Grassmanian of subspaces of fixed dimension, so this forms a closed subset of V tensor V tensor V x G(dim W1, V) x G(dim W2, V) x G(dim W3, V) that projects to a closed subset of V tensor V tensor V. Then a finite sum over all triples of dimensions satisfying the inequality gives the desired result.
19 May, 2016 at 7:42 am
JSE
Oh good point! I had failed to grasp this distinction.
18 May, 2016 at 7:29 am
Will
Also two quick observations:
1. It seems that the bound given for sets in F_p^n with no 3-term arithmetic progressions is ((c+o(1))p)^n for some constant c<1 that I doubt can be put in closed form.
2. For each set S of size 2n+1 in F_p one gets power-saving bounds for subsets of F_p^n that do not contain the image of S under any nonconstant map whose coordinates are polynomials of degree n. I didn't see a trivial proof of power savings here for general S (though for special S we may reduce using special polynomials to a simpler bound) but I also didn't see a relationship to any well-known open problem in combinatorics.
18 May, 2016 at 11:16 am
Eric Naslund
In fact we can show that c=e^{-1/6}. That is we have the upper bound (e^{-1/6}+o(1))^n p^n, where the o(1) term is positive and tending to zero as p–>infty.
18 May, 2016 at 9:35 am
Terence Tao
A small but perhaps amusing computation: for the question of finding the largest number of cards in the game Set without a “set” (or equivalently, the largest capset in
), the upper bound given here is
This can be compared with the asymptotic prediction of
for the upper bound (or the asymptotic prediction of
for the lower bound. Theorem 1.2 of Meshulam’s paper using the Fourier method gives an upper bound of
. The maximal size of a capset in
is actually known to be 20 by the work of Pellegrino. So even in this low dimensional case, the bound given by the polynomial method improves upon existing bounds, but is a bit short of the truth. One could imagine in this case at least one could improve upon the bound of
while staying within the framework of the polynomial method (it might also be worth working out numbers for other low dimensional cases).
19 May, 2016 at 5:22 am
Fedor Petrov
Well, 45 may be replaced by 30, as
may be replaced to
always, see a proof here (sorry for doubling this link, but I think it may be convenient)
Click to access f3_eng.pdf
For specific
this looks better improvement than it actually is:)
27 June, 2016 at 12:12 am
Anurag Bishnoi
The bounds given by Bierbrauer and Edel (based on Meshulam’s Fourier-Analytic proof) seem to be closer to the truth in the low-dimensional cases than the polynomial method bound. For dimensions 4, 5 and 6, they get upper bounds, 21, 48 and 114, respectively. See Section 3 in http://onlinelibrary.wiley.com/doi/10.1002/jcd.10000/abstract. The values of the polynomial bound (constant 2 instead of 3, as obtained by Fedor) for these dimensions are 30, 102 and 336.
18 May, 2016 at 11:30 am
Sam
I think remark 5 has a typo. As written,
is asymptotic to
. Should it say
[Corrected, thanks – T.]
18 May, 2016 at 11:59 am
Eric Naslund
Using this general approach, Will and I applied this directly to the Erdos-Szemeredi sunflower problem, proving that if F is a sunflower-free collection of subsets of {1,2,3,…,n}, then |F|< (3/2^{2/3})^n: https://docs.google.com/viewer?a=v&pid=sites&srcid=ZGVmYXVsdGRvbWFpbnxuYXNsdW5kZXJpY3xneDozZDQ2NzRjNzAxNmJjNDlh
18 May, 2016 at 1:22 pm
The Ellenberg-Gijswijt bound on cap sets | Anurag's Math Blog
[…] [2] Polymath 10 Emergency Post 5: The Erdos-Szemeredi Sunflower Conjecture is Now Proven. [3] A symmetric formulation of the Croot-Lev-Pach-Ellenberg-Gijswijt capset bound [4] Large caps in projective Galois […]
18 May, 2016 at 4:14 pm
Anonymous
It seems that the above constant
is (simply) given by
18 May, 2016 at 4:28 pm
Fedor Petrov
I think, we may improve multiple 3 to 2 (not a big deal, yes.) But maybe modification of method is useful elsewhere.
For
denote by
a number of points
with sum of coordiantes at most
. Assume that
and
for some
(min is attained, of course,
about
and is exponentially smaller than
by a suitable verison of Law of Large Numbers.)
in
for
We need
Lemma. Let
be a field,
be a finite set and let
be a
-dimensional linear subspace of the space
of functions
. Then the squares of functions from
span a space of dimension at least
.
on
Proof (inspired by Ilya Bogdanov). Let
.
of
such that
, where
is the number of the first nonzero
. Their squares work.
By Gauss' method, we may find a basis
entry in
Denote by
the space of
-valued functions on
such that
for any polynomial 
, which has degree at most 2 in each variable and total degree at most
.
of this space
.
, for any polynomial
of degree at most
we have
over
Dimension
is at least
If
(this is clear for monomials: sum factorizes, and there exists zero factor).
Assume that
for
,
in
. Consider the space of polynomials
of degree at most 2 in each variable, equal to 0
, of total degree at most
.
.
(since they are different as functions on
). For all 
0 outside
Dimension of the space of such polynomials is at least
And these polynomials are all different as functions on
we have
By lemma the functions
,
, form a space of dimension at least
,
form a space of dimension at least
. Sum of dimensions
, thus orthogonal complement of the first subspace can not lie in the second.
and functions
exceed
18 May, 2016 at 5:03 pm
Fedor Petrov
Formatting looks really ugly, sorry. Here is pdf
Click to access f3_eng.pdf
19 May, 2016 at 11:20 pm
Terence Tao
Nice! I actually get
instead of
(as I believe it is the former that is equal to
. This seems to give the bounds
,
,
,
, if I did the arithmetic correctly; this is sharp in one and two dimensions but misses a little in three (I believe
) and higher. In terms of the tensor rank formulation, the point seems to be this: if one wishes to write
as a sum
then Lemma 1 of the post tells us that we need
. But if we have the additional information that
and
for
, then we can improve this condition to
. I guess there is some sort of “symmetric rank” here where we symmetrise under the operation of interchanging x and y. (It also seems tempting to symmetrise in all of x,y,z, but there seems to be some issue in characteristic 3 in that one can’t divide by 3!. Given that the bound is now sharp in one and two dimensions, I guess it is not too realistic to hope for a really big further gain.)
20 May, 2016 at 1:17 am
Fedor Petrov
Yes, of course
. I hope that now everything in the linked pdf is correct, and it is expressed in more symmetric form, without two steps. The space
satisfies a condition
for all
in
. In general it implies that
, but we are additionally given that supports of all functions in
are quite large. Does it allow to get essentially better bound on the dimension? Without conditions on the support it is tight, but the example I have in mind (linear span of vectors
) has small supports.
It gives at least some improvement, say, estimate 30 in the game set problem may be easily improved (though I doubt that down to 20.)
19 May, 2016 at 8:25 am
Polymath 10 Emergency Post 5: The Erdos-Szemeredi Sunflower Conjecture is Now Proven. | Combinatorics and more
[…] nice modification) of the Croot-Lev-Pach-Ellenberg-Gijswijt capset and further discussion see this post and comment thread on Terry Tao’s […]
19 May, 2016 at 9:28 am
Eric Naslund
Here is another application of this method which can be seen as progress on the Erdos-Rado sunflower problem. The functions used are combinations of indicator functions of the elements in the underlying set rather than polynomials: https://docs.google.com/viewer?a=v&pid=sites&srcid=ZGVmYXVsdGRvbWFpbnxuYXNsdW5kZXJpY3xneDozZWM4ZTI0ZTcyNGMyY2M2
19 May, 2016 at 11:05 am
Jop Briet
Dear Terry, thanks for the great post! Tiny typo: Eq (3) seems to be missing an equality symbol.
[Corrected, thanks – T.]
20 May, 2016 at 3:54 pm
Terence Tao
For the record: the known values of
(i.e.for
) are at https://oeis.org/A090245
21 May, 2016 at 1:36 am
Fedor Petrov
We may mix Croot-Lev-Pach and Olson’s ideas to rephrase this proof using group rings.
Let
be odd prime. Denote by
a cyclic group of order
. Let
be a finite Abelian
-group with
generators
,
generates
. Assume that we managed to find a subspace
of a group algebra
of codimension
satisfying
for all
. Then any subset
of cardinality at least
contains three different elements
such that
.
Why? Assume the contrary. Consider the subspaces
of
with all coefficients outside
(resp. outside
) equal to 0. They have dimensions at least
. Consider the product
where two first summands are chosen from
and the second from
. If
only for
, the coefficient of 1 in this product is nothing but
. But the space of functions of the form
,
is at least
, as we may see by choosing basis in
by Gauss elimination, and it can not be orthogonal to the space formed by functions
, which is isomorphic to
. (This argument is the same as before.)
Now how to choose such a space with low codimension. Group algebra is generated by the products
, where
. Let
be a subspace generated by monomials for which
. Then any product
for
has some
in a power strictly greater than
, but
.
What is the codimension
of this subspace? If
are random variables uniformly distributed in the set
, then
is the probability that
. It decays expoentially in
. Alas, this does not seem to be better than partitioning
onto classes modulo
and reasoning in them separately.
21 May, 2016 at 5:12 am
Fedor Petrov
Well, at least this gives better exponent for, say, $C_9^n$ compared to $3^n$ copies of $C_3^n$, so I hope that it touches some effect.
31 May, 2016 at 8:54 am
Fedor Petrov
It gives similar estimates for progression-free sets
, where
such that any large enough finite group
(at least of odd order, though this is hardily important) contains
elements without 3-progressions?
31 May, 2016 at 8:57 am
Fedor Petrov
I tried to say that it works for non-Abelian
-groups, but something went wrong, probably inequality signs have eaten my post. Never mind. Here is a draft
Click to access grouprings1.pdf
22 May, 2016 at 11:55 am
Tom Gur
Should delta(x_1 – a) * … * delta(x_k – a) be delta_a(x_1) *…* delta_a(x_k) ?
(Near the end of the proof of Lemma 1.)
[Corrected, thanks – T.]
25 May, 2016 at 8:50 am
Mind Boggling: Following the work of Croot, Lev, and Pach, Jordan Ellenberg settled the cap set problem! | Combinatorics and more
[…] See also this post by Tao (presenting a symmetric version of the proof) and this post by […]
25 May, 2016 at 10:01 am
Thomas Bloom
Interestingly, this proof also works for the similar problem over polynomial rings. For example, bounding the size of the largest subset of F_2[t]_{\deg f< n} without non-trivial solutions to a+tb=(1+t)c, this method gives an exponential upper bound (I think something like 1.9^n for this particular problem).
This is great, because before this not even the $N/\log N$ bound was known, but rather $N(\log\log N)^2/\log N$ – I used to think that this problem was closer to the normal integer setting than the cap set problem, but not any more!
And, of course, there are many analogues between the integers and polynomials over finite fields…
25 May, 2016 at 7:08 pm
Terence Tao
Ben Green has recently obtained an exponential bound for the analogue of Sarkozy’s theorem in function fields using these methods: http://arxiv.org/abs/1605.07263
26 May, 2016 at 12:33 pm
Mario Carneiro
You comment on https://oeis.org/A090245 that “Asymptotically, a(n) = O(3^n/n)” which contradicts this post.
26 May, 2016 at 5:13 pm
Terence Tao
In combinatorics,
is used to denote an upper bound only (one would write instead
if one wanted to assert the exact order of magnitude).
26 May, 2016 at 7:25 pm
Mario Carneiro
Oops, newbie mistake. Indeed I read it as a(n) ~ 3^n/n.
27 May, 2016 at 12:11 am
Polymath 10 post 6: The Erdos-Rado sunflower conjecture, and the Turan (4,3) problem: homological approaches. | Combinatorics and more
[…] We will not regard attacks on the sunflower conjecture based on the recent solution of the cap set problem and the Erdos Szemeredi sunflower conjecture (which is a weaker version of the Erdos-Rado conjecture) as part of the present polymath10 project, but, of course, I will be happy to hear also reports on progress or comments on these directions. Some updates on this story: Eric Naslund and Will Sawin gave a direct proof based on the polynomial method for the Erdos-Szemeredi sunflower conjecture, and an even stronger result is given by Naslund here. (Eric also has stronger quantitative bounds for Erdos-Szemeredi based on bounds for cap sets.) More cap set updates are added to this post, and can be found in the comment threads here and here. […]
27 May, 2016 at 12:38 pm
Anonymous
Typo: I believe $k$ should be $k-1$ in the equation following “If we multiply (3) by…”
[Corrected, thanks – T.]
29 May, 2016 at 3:54 am
Klaus85
This is a very interesting post, thanks. I wonder, do you have a personal favorite-list of open problems? (it seemed this one was on the list). Or do professional mathematicians not have such lists but concentrate in developing the field further independent whether there is some identified open question? I’m honestly interested in this. Thanks!
29 May, 2016 at 8:22 am
Terence Tao
The blog posts listed at https://terrytao.wordpress.com/category/question/ are devoted to discussion of a number of open questions that I am interested in. I did at one point try to maintain a list of such open questions more systematically, but have not done so for more than a decade.
1 June, 2016 at 8:14 am
Terence Tao
A shout out, by the way, to Ben Green, who to my knowledge was the first to dare conjecture [EDIT: explicitly, and in print] that cap sets had to be exponentially sparse (see Conjecture 4.3 of http://arxiv.org/abs/math/0409420 ). I actually disagreed with him at the time, and even said so in my 2007 blog post, but am happy to have been proven wrong on this.
While on the topic of past inaccuracies, it seems I may have inadvertently propagated a slightly incorrect terminology in that post: sets in
free of collinear triples are known as affine caps or simply caps in the design theory literature, rather than cap sets. The latter terminology seems to have become rather entrenched, though, at least in additive combinatorics circles…
1 June, 2016 at 9:33 am
samuelfhopkins
In the commentary after Conjecture 4.3, Green says he would expect any progress on that conjecture to lead to improvements on bounding the size of a maximum subset of $\mathbb{Z}_n$ without 3-term APs. Is the jury out at the moment about the possibility of these latest developments being applied to the $\mathbb{Z}_n$ setting?
1 June, 2016 at 11:56 am
Gil Kalai
It is interesting to look at the original 1987 paper of Peter Frankl, Ron Graham, and Vojta Rodl.
upper bound using Rusza-Szemeredi’s theorem (that applies also to show that
). In that paper FGR asked if cap sets are exponentially sparse. They also relate the question to the Erdos-Rado’s sunflower conjecture.
FGR gave a
I myself worked with Meshulam on this problem around 2000, and we had a Fourier statement about inequality between certain norms that in its strongest form would give exponentially good upper bound, (and in some weaker forms will give weaker improvement to Roy’s original 95 bound.) (These norm-inequalities are still completely open, I think.) Roy certainly conjectured that cap sets are exponentially sparse while I changed sides several times over the years. In 2009 I discussed another approach also initiated with discussions with Roy with certain weakening and strengthening of the 3AP condition, potential relations to variants of the Frankl-Wilson and Frankl-Rodl theorems, and some slightly related polynomial-method lemmas.
On another matter it is certainly an obvious question at this point to try to give upper bounds for the rank of subsets of
without a combinatorial line. The most optimistic constructions are given in terms of minimum of level sets corresponding to 3AP free sets of levels. Maybe there could be a way to dualize somehow and get upper bounds in terms of maximum of some related question. (It will be sufficiently wonderful to relate upper bounds for DHJ to upper bounds for
.)
1 June, 2016 at 12:46 pm
Terence Tao
One can certainly try to apply the same method to the DHJ problem – one replaces the function
by the indicator function of all the triples in
that form a combinatorial line. Unfortunately the rank of this tensor appears to be too large for the method to work (and in any event, a suitable modification of the Behrend construction shows that an exponential gain is not possible in this case anyway).
At present it seems like the most plausible way one could use these methods to improve upon the
problem is through progress on DHJ (or slight variants thereof), perhaps in
for some suitable choice of
and
with
(the Behrend example suggests that
has to grow with
in order for there to be any chance here). I myself haven’t seriously tried this, but presumably some of the many other mathematicians now experimenting with this method would be considering it; it’s worth a shot, certainly.
p.s. Regarding by previous comment, I should clarify that Ben was certainly not the first to pose the question of whether capsets are exponentially sparse or not – the question was also raised as “an interesting open problem” in the 2004 paper of Edel. But I don’t know of an earlier reference in the literature in which the author came out strongly in favour of the exponentially sparse scenario, rather than being undecided or posing the opposing scenario as a question (as was done for instance in the Frankl-Graham-Rodl paper).
2 June, 2016 at 5:52 pm
Paul Delhanty
Edel, in the 2004 paper that you linked previously, writes, “That leaves us with two research problems. Firstly to improve the bounds on $\mu(3)$ by finding better *capsets*.”
So you get a pass on the 2nd point.
2 June, 2016 at 9:36 pm
Paul Delhanty
Ah, my bad – sorry! On reading through Edel’s paper more carefully, *capset* is defined in Definition 9 and it is a somewhat different and much more specific concept from a cap. It looks like a capset is a set of indexing tuples that represent a set of product caps that get unioned together to create the final cap.
1 June, 2016 at 3:04 pm
anthonyquas
Very nice! I have 3 minor comments and a question.
is *nowhere zero* on
(rather than non-zero, which is ambiguous);
for
(not for general
;
1) I think it would be helpful to say that
2) In the first line of Lemma 2,
3) In my browser, I was unable to find the equation label (3) (but I inferred where it was).
And the question: Am I right that the term “capset” comes from spherical caps that were used to build large sets without arithmetic progressions? The terminology is a bit mysterious to non-experts.
[Corrected, thanks. The notion of an “affine cap” comes from design theory, but I do not know the provenance exactly – T.]
2 June, 2016 at 1:39 am
kodlu
In Remark 3, “to control the maximal size of aubset” should end with “a subset”, I presume.
[Corrected, thanks -T.]
8 June, 2016 at 8:02 pm
aram
I’m confused about (1). The LHS =1 if x,y,z form a line but the RHS = 0 then. And the empty set is a valid capset, so the RHS could be always 0. And the RHS could be sometimes 2 but the LHS is always 0 or 1.
9 June, 2016 at 11:01 am
Terence Tao
The case when
form a line never occurs when
and
is a capset, by definition of capset. In the case of the empty set, (1) is vacuously true for
. The RHS cannot actually attain 2 because at most one of the summands can be non-zero at a time.
9 June, 2016 at 7:38 pm
aram
Oh, I missed that x,y,z have to belong to A! That makes sense.
23 June, 2016 at 9:25 am
Abbas
Nice argument and presentation!
On the last line of Remark 3,
should be
? Also in Remark 5, 3rd line from below.
[Corrected, thanks – T.]
8 July, 2016 at 1:12 pm
Bounds for sum free sets in prime power cyclic groups — three ways | Secret Blogging Seminar
[…] now-standard argument (I like Terry Tao’s exposition) shows that is bounded by three times the number of monomials of degree . One needs to check that […]
29 July, 2016 at 8:05 am
Comments on the Ellenberg-Gijswijt bound for caps in higher characteristic | Blogderbeweise - Proofs from the blog
[…] breakthrough due to Croot, Lev and Pach, recycled by Ellenberg-Gijswijt and streamlined by Tao on this blog post, revolutionised the landscape of Roth’s type problems in vector spaces over finite fields. […]
24 August, 2016 at 7:52 am
Notes on the “slice rank” of tensors | What's new
[…] the previous blog post, one of us (Terry) implicitly introduced a notion of rank for tensors which is a little different […]
4 September, 2016 at 6:17 am
Paata Ivanishvili
Once at the end of a seminar about capset problem fedja said: “you don’t need Lagrange multipliers. For
clearly
. Minimizing the LHS gives
"
5 September, 2016 at 11:59 pm
Will
Yes, a similar method is very useful for quickly doing slice rank calculations. Eric Naslund and I used it on the last page of this: https://arxiv.org/pdf/1606.09575v1.pdf
However, it is not obvious that the upper bounds produced by this method are optimal. To analyze that question I don’t see how to avoid the use of Lagrange multipliers. See https://terrytao.wordpress.com/2016/08/24/notes-on-the-slice-rank-of-tensors/
6 September, 2016 at 12:10 am
Fedor Petrov
The multiplicative constant obtained on this way (variants of which are called Cramer theorem or Chernoff bound in probability and which is essentially a saddle-point method) is of course optimal. Further asymptotics (polynomial multiples) may be also obtained by saddle-point method, this is more technical.
6 September, 2016 at 5:27 am
Will
But in the probability theory problem we are trying to bound the tail probability of a fixed random variable. I agree that the method is optimal there. Here we have the freedom to choose the random variable, subject to some constraints. If you write down the constraints, then any way you slice it, you have a convex optimization problem for which Lagrange multipliers are a suitable tool.
Of course optimizing the random variable for progression-free sets in F_q^n with q=3 is not very hard. But for higher values of q it is nontrivial, though solved by these papers: https://arxiv.org/abs/1608.00243 https://arxiv.org/abs/1608.05740
26 October, 2016 at 5:04 am
Thomas Dybdahl Ahle
Wait, why is that clearly an upper bound?
26 October, 2016 at 5:57 am
Thomas Dybdahl Ahle
Ah,
. You just flipped the inequality on
.
9 September, 2016 at 9:48 am
Introduction to the polynomial method (and other similar things) | Short, Fat Matrices
[…] result then follows from analyzing the asymptotic behavior of (see this post for details). The proof of the lemma is perhaps more interesting, considering this is where polynomials […]
15 May, 2017 at 1:40 am
Notes of the seminar ‘On the slice rank of functions’ | Points And Lines
[…] try to elude the idea of ‘slice rank’ of a tensor, as was defined by Terry Tao in his reformulation of the Ellenberg-Gijswijt proof. I will try to indicate where the idea might have come from and how […]
23 June, 2017 at 2:55 pm
TheoryFest Day 3: Plenaries | A bunch of data
[…] came out of the recent flurry of work by Croot, Lev, Pach, Ellenberg, Gijswijt and Tao (and cited Tao’s blog post as well as the papers, which I thought was […]
1 July, 2017 at 9:29 pm
수학동아 연재 (2016년) - Prof. Sang-il Oum
[…] 참고 블로그글: Ellenberg 블로그, Tim Gowers 블로그, Terrence Tao 블로그 […]
16 April, 2018 at 7:21 am
George
In regards to Remark 5 and Will’s nice response to it above, it seems that Hongze Li has made progress for large q. https://arxiv.org/pdf/1610.00247.pdf
16 April, 2018 at 5:45 pm
George Shakan
After reading the paper and getting in touch with the author, unfortunately there is an error.
19 March, 2019 at 4:03 am
수학동아 연재 (2016년) - Sang-il Oum
[…] 참고 블로그글: Ellenberg 블로그, Tim Gowers 블로그, Terrence Tao 블로그 […]
23 January, 2023 at 9:24 am
The Trifference Problem | Anurag's Math Blog
[…] have led to any improvement in the bounds on . For example, Costa and Dalai have shown that the slice rank method, which was recently used to make breakthroughs on the cap set problem, and many other combinatorial […]
28 February, 2023 at 9:00 am
Theory at the Institute and Beyond, February 2023 | Calvin Café: The Simons Institute Blog
[…] Kelley-Meka proof in this mold). A few years ago, an ingenious use of the polynomial method led to stunning progress on the capset problem, leading to a short proof of an upper bound of O(2.756n) for q = 3, a vast improvement to the […]