You are currently browsing the tag archive for the ‘cap sets’ tag.
A capset in the vector space over the finite field
of three elements is a subset
of
that does not contain any lines
, where
and
. A basic problem in additive combinatorics (discussed in one of the very first posts on this blog) is to obtain good upper and lower bounds for the maximal size of a capset in
.
Trivially, one has . Using Fourier methods (and the density increment argument of Roth), the bound of
was obtained by Meshulam, and improved only as late as 2012 to
for some absolute constant
by Bateman and Katz. But in a very recent breakthrough, Ellenberg (and independently Gijswijt) obtained the exponentially superior bound
, using a version of the polynomial method recently introduced by Croot, Lev, and Pach. (In the converse direction, a construction of Edel gives capsets as large as
.) Given the success of the polynomial method in superficially similar problems such as the finite field Kakeya problem (discussed in this previous post), it was natural to wonder that this method could be applicable to the cap set problem (see for instance this MathOverflow comment of mine on this from 2010), but it took a surprisingly long time before Croot, Lev, and Pach were able to identify the precise variant of the polynomial method that would actually work here.
The proof of the capset bound is very short (Ellenberg’s and Gijswijt’s preprints are both 3 pages long, and Croot-Lev-Pach is 6 pages), but I thought I would present a slight reformulation of the argument which treats the three points on a line in symmetrically (as opposed to treating the third point differently from the first two, as is done in the Ellenberg and Gijswijt papers; Croot-Lev-Pach also treat the middle point of a three-term arithmetic progression differently from the two endpoints, although this is a very natural thing to do in their context of
). The basic starting point is this: if
is a capset, then one has the identity
for all , where
is the Kronecker delta function, which we view as taking values in
. Indeed, (1) reflects the fact that the equation
has solutions precisely when
are either all equal, or form a line, and the latter is ruled out precisely when
is a capset.
To exploit (1), we will show that the left-hand side of (1) is “low rank” in some sense, while the right-hand side is “high rank”. Recall that a function taking values in a field
is of rank one if it is non-zero and of the form
for some
, and that the rank of a general function
is the least number of rank one functions needed to express
as a linear combination. More generally, if
, we define the rank of a function
to be the least number of “rank one” functions of the form
for some and some functions
,
, that are needed to generate
as a linear combination. For instance, when
, the rank one functions take the form
,
,
, and linear combinations of
such rank one functions will give a function of rank at most
.
It is a standard fact in linear algebra that the rank of a diagonal matrix is equal to the number of non-zero entries. This phenomenon extends to higher dimensions:
Lemma 1 (Rank of diagonal hypermatrices) Let
, let
be a finite set, let
be a field, and for each
, let
be a coefficient. Then the rank of the function
Proof: We induct on . As mentioned above, the case
follows from standard linear algebra, so suppose now that
and the claim has already been proven for
.
It is clear that the function (2) has rank at most equal to the number of non-zero (since the summands on the right-hand side are rank one functions), so it suffices to establish the lower bound. By deleting from
those elements
with
(which cannot increase the rank), we may assume without loss of generality that all the
are non-zero. Now suppose for contradiction that (2) has rank at most
, then we obtain a representation
for some sets of cardinalities adding up to at most
, and some functions
and
.
Consider the space of functions that are orthogonal to all the
,
in the sense that
for all . This space is a vector space whose dimension
is at least
. A basis of this space generates a
coordinate matrix of full rank, which implies that there is at least one non-singular
minor. This implies that there exists a function
in this space which is nowhere vanishing on some subset
of
of cardinality at least
.
If we multiply (3) by and sum in
, we conclude that
where
The right-hand side has rank at most , since the summands are rank one functions. On the other hand, from induction hypothesis the left-hand side has rank at least
, giving the required contradiction.
On the other hand, we have the following (symmetrised version of a) beautifully simple observation of Croot, Lev, and Pach:
Lemma 2 On
, the rank of the function
is at most
, where
Proof: Using the identity for
, we have
The right-hand side is clearly a polynomial of degree in
, which is then a linear combination of monomials
with with
In particular, from the pigeonhole principle, at least one of is at most
.
Consider the contribution of the monomials for which . We can regroup this contribution as
where ranges over those
with
,
is the monomial
and is some explicitly computable function whose exact form will not be of relevance to our argument. The number of such
is equal to
, so this contribution has rank at most
. The remaining contributions arising from the cases
and
similarly have rank at most
(grouping the monomials so that each monomial is only counted once), so the claim follows.
Upon restricting from to
, the rank of
is still at most
. The two lemmas then combine to give the Ellenberg-Gijswijt bound
All that remains is to compute the asymptotic behaviour of . This can be done using the general tool of Cramer’s theorem, but can also be derived from Stirling’s formula (discussed in this previous post). Indeed, if
,
,
for some
summing to
, Stirling’s formula gives
where is the entropy function
We then have
where is the maximum entropy
subject to the constraints
A routine Lagrange multiplier computation shows that the maximum occurs when
and is approximately
, giving rise to the claimed bound of
.
Remark 3 As noted in the Ellenberg and Gijswijt papers, the above argument extends readily to other fields than
to control the maximal size of subset of
that has no non-trivial solutions to the equation
, where
are non-zero constants that sum to zero. Of course one replaces the function
in Lemma 2 by
in this case.
Remark 4 This symmetrised formulation suggests that one possible way to improve slightly on the numerical quantity
by finding a more efficient way to decompose
into rank one functions, however I was not able to do so (though such improvements are reminiscent of the Strassen type algorithms for fast matrix multiplication).
Remark 5 It is tempting to see if this method can get non-trivial upper bounds for sets
with no length
progressions, in (say)
. One can run the above arguments, replacing the function
with
this leads to the bound
where
Unfortunately,
is asymptotic to
and so this bound is in fact slightly worse than the trivial bound
! However, there is a slim chance that there is a more efficient way to decompose
into rank one functions that would give a non-trivial bound on
. I experimented with a few possible such decompositions but unfortunately without success.
Remark 6 Return now to the capset problem. Since Lemma 1 is valid for any field
, one could perhaps hope to get better bounds by viewing the Kronecker delta function
as taking values in another field than
, such as the complex numbers
. However, as soon as one works in a field of characteristic other than
, one can adjoin a cube root
of unity, and one now has the Fourier decomposition
Moving to the Fourier basis, we conclude from Lemma 1 that the function
on
now has rank exactly
, and so one cannot improve upon the trivial bound of
by this method using fields of characteristic other than three as the range field. So it seems one has to stick with
(or the algebraic completion thereof).
Thanks to Jordan Ellenberg and Ben Green for helpful discussions.
Earlier this month, in the previous incarnation of this page, I posed a question which I thought was unsolved, and obtained the answer (in fact, it was solved 25 years ago) within a week. Now that this new version of the page has better feedback capability, I am now tempted to try again, since I have a large number of such questions which I would like to publicise. (Actually, I even have a secret web page full of these somewhere near my home page, though it will take a non-trivial amount of effort to find it!)
Perhaps my favourite open question is the problem on the maximal size of a cap set – a subset of (
being the finite field of three elements) which contains no lines, or equivalently no non-trivial arithmetic progressions of length three. As an upper bound, one can easily modify the proof of Roth’s theorem to show that cap sets must have size
(see e.g. this paper of Meshulam). This of course is better than the trivial bound of
once n is large. In the converse direction, the trivial example
shows that cap sets can be as large as
; the current world record is
, held by Edel. The gap between these two bounds is rather enormous; I would be very interested in either an improvement of the upper bound to
, or an improvement of the lower bound to
. (I believe both improvements are true, though a good friend of mine disagrees about the improvement to the lower bound.)
Recent Comments