A symmetric formulation of the Croot-Lev-Pach-Ellenberg-Gijswijt capset bound

18 May, 2016 in expository, math.CO, math.RA | Tags: cap sets, Dion Gijswijt, Ernie Croot, Jordan Ellenberg, Peter Pach, polynomial method, Seva Lev | by Terence Tao

A capset in the vector space ${{\bf F}_3^n}$ over the finite field ${{\bf F}_3}$ of three elements is a subset ${A}$ of ${{\bf F}_3^n}$ that does not contain any lines ${\{ x,x+r,x+2r\}}$ , where ${x,r \in {\bf F}_3^n}$ and ${r \neq 0}$ . A basic problem in additive combinatorics (discussed in one of the very first posts on this blog) is to obtain good upper and lower bounds for the maximal size of a capset in ${{\bf F}_3^n}$ .

Trivially, one has ${|A| \leq 3^n}$ . Using Fourier methods (and the density increment argument of Roth), the bound of ${|A| \leq O( 3^n / n )}$ was obtained by Meshulam, and improved only as late as 2012 to ${O( 3^n /n^{1+c})}$ for some absolute constant ${c>0}$ by Bateman and Katz. But in a very recent breakthrough, Ellenberg (and independently Gijswijt) obtained the exponentially superior bound ${|A| \leq O( 2.756^n )}$ , using a version of the polynomial method recently introduced by Croot, Lev, and Pach. (In the converse direction, a construction of Edel gives capsets as large as ${(2.2174)^n}$ .) Given the success of the polynomial method in superficially similar problems such as the finite field Kakeya problem (discussed in this previous post), it was natural to wonder that this method could be applicable to the cap set problem (see for instance this MathOverflow comment of mine on this from 2010), but it took a surprisingly long time before Croot, Lev, and Pach were able to identify the precise variant of the polynomial method that would actually work here.

The proof of the capset bound is very short (Ellenberg’s and Gijswijt’s preprints are both 3 pages long, and Croot-Lev-Pach is 6 pages), but I thought I would present a slight reformulation of the argument which treats the three points on a line in ${{\bf F}_3}$ symmetrically (as opposed to treating the third point differently from the first two, as is done in the Ellenberg and Gijswijt papers; Croot-Lev-Pach also treat the middle point of a three-term arithmetic progression differently from the two endpoints, although this is a very natural thing to do in their context of ${({\bf Z}/4{\bf Z})^n}$ ). The basic starting point is this: if ${A}$ is a capset, then one has the identity

$\displaystyle \delta_{0^n}( x+y+z ) = \sum_{a \in A} \delta_a(x) \delta_a(y) \delta_a(z) \ \ \ \ \ (1)$

for all ${(x,y,z) \in A^3}$ , where ${\delta_a(x) := 1_{a=x}}$ is the Kronecker delta function, which we view as taking values in ${{\bf F}_3}$ . Indeed, (1) reflects the fact that the equation ${x+y+z=0}$ has solutions precisely when ${x,y,z}$ are either all equal, or form a line, and the latter is ruled out precisely when ${A}$ is a capset.

To exploit (1), we will show that the left-hand side of (1) is “low rank” in some sense, while the right-hand side is “high rank”. Recall that a function ${F: A \times A \rightarrow {\bf F}}$ taking values in a field ${{\bf F}}$ is of rank one if it is non-zero and of the form ${(x,y) \mapsto f(x) g(y)}$ for some ${f,g: A \rightarrow {\bf F}}$ , and that the rank of a general function ${F: A \times A \rightarrow {\bf F}}$ is the least number of rank one functions needed to express ${F}$ as a linear combination. More generally, if ${k \geq 2}$ , we define the rank of a function ${F: A^k \rightarrow {\bf F}}$ to be the least number of “rank one” functions of the form

$\displaystyle (x_1,\dots,x_k) \mapsto f(x_i) g(x_1,\dots,x_{i-1},x_{i+1},\dots,x_k)$

for some ${i=1,\dots,k}$ and some functions ${f: A \rightarrow {\bf F}}$ , ${g: A^{k-1} \rightarrow {\bf F}}$ , that are needed to generate ${F}$ as a linear combination. For instance, when ${k=3}$ , the rank one functions take the form ${(x,y,z) \mapsto f(x) g(y,z)}$ , ${(x,y,z) \mapsto f(y) g(x,z)}$ , ${(x,y,z) \mapsto f(z) g(x,y)}$ , and linear combinations of ${r}$ such rank one functions will give a function of rank at most ${r}$ .

It is a standard fact in linear algebra that the rank of a diagonal matrix is equal to the number of non-zero entries. This phenomenon extends to higher dimensions:

Lemma 1 (Rank of diagonal hypermatrices) Let ${k \geq 2}$ , let ${A}$ be a finite set, let ${{\bf F}}$ be a field, and for each ${a \in A}$ , let ${c_a \in {\bf F}}$ be a coefficient. Then the rank of the function

$\displaystyle (x_1,\dots,x_k) \mapsto \sum_{a \in A} c_a \delta_a(x_1) \dots \delta_a(x_k) \ \ \ \ \ (2)$

is equal to the number of non-zero coefficients ${c_a}$ .

Proof: We induct on ${k}$ . As mentioned above, the case ${k=2}$ follows from standard linear algebra, so suppose now that ${k>2}$ and the claim has already been proven for ${k-1}$ .

It is clear that the function (2) has rank at most equal to the number of non-zero ${c_a}$ (since the summands on the right-hand side are rank one functions), so it suffices to establish the lower bound. By deleting from ${A}$ those elements ${a \in A}$ with ${c_a=0}$ (which cannot increase the rank), we may assume without loss of generality that all the ${c_a}$ are non-zero. Now suppose for contradiction that (2) has rank at most ${|A|-1}$ , then we obtain a representation

$\displaystyle \sum_{a \in A} c_a \delta_a(x_1) \dots \delta_a(x_k)$

$\displaystyle = \sum_{i=1}^k \sum_{\alpha \in I_i} f_{i,\alpha}(x_i) g_{i,\alpha}( x_1,\dots,x_{i-1},x_{i+1},\dots,x_k) \ \ \ \ \ (3)$

for some sets ${I_1,\dots,I_k}$ of cardinalities adding up to at most ${|A|-1}$ , and some functions ${f_{i,\alpha}: A \rightarrow {\bf F}}$ and ${g_{i,\alpha}: A^{k-1} \rightarrow {\bf R}}$ .

Consider the space of functions ${h: A \rightarrow {\bf F}}$ that are orthogonal to all the ${f_{k,\alpha}}$ , ${\alpha \in I_k}$ in the sense that

$\displaystyle \sum_{x \in A} f_{k,\alpha}(x) h(x) = 0$

for all ${\alpha \in I_k}$ . This space is a vector space whose dimension ${d}$ is at least ${|A| - |I_k|}$ . A basis of this space generates a ${d \times |A|}$ coordinate matrix of full rank, which implies that there is at least one non-singular ${d \times d}$ minor. This implies that there exists a function ${h: A \rightarrow {\bf F}}$ in this space which is nowhere vanishing on some subset ${A'}$ of ${A}$ of cardinality at least ${|A|-|I_k|}$ .

If we multiply (3) by ${h(x_k)}$ and sum in ${x_k}$ , we conclude that

$\displaystyle \sum_{a \in A} c_a h(a) \delta_a(x_1) \dots \delta_a(x_{k-1})$

$\displaystyle = \sum_{i=1}^{k-1} \sum_{\alpha \in I_i} f_{i,\alpha}(x_i)\tilde g_{i,\alpha}( x_1,\dots,x_{i-1},x_{i+1},\dots,x_{k-1})$

where

$\displaystyle \tilde g_{i,\alpha}(x_1,\dots,x_{i-1},x_{i+1},\dots,x_{k-1})$

$\displaystyle := \sum_{x_k \in A} g_{i,\alpha}(x_1,\dots,x_{i-1},x_{i+1},\dots,x_k) h(x_k).$

The right-hand side has rank at most ${|A|-1-|I_k|}$ , since the summands are rank one functions. On the other hand, from induction hypothesis the left-hand side has rank at least ${|A|-|I_k|}$ , giving the required contradiction. $\Box$

On the other hand, we have the following (symmetrised version of a) beautifully simple observation of Croot, Lev, and Pach:

Lemma 2 On ${({\bf F}_3^n)^3}$ , the rank of the function ${(x,y,z) \mapsto \delta_{0^n}(x+y+z)}$ is at most ${3N}$ , where

$\displaystyle N := \sum_{a,b,c \geq 0: a+b+c=n, b+2c \leq 2n/3} \frac{n!}{a!b!c!}.$

Proof: Using the identity ${\delta_0(x) = 1 - x^2}$ for ${x \in {\bf F}_3}$ , we have

$\displaystyle \delta_{0^n}(x+y+z) = \prod_{i=1}^n (1 - (x_i+y_i+z_i)^2).$

The right-hand side is clearly a polynomial of degree ${2n}$ in ${x,y,z}$ , which is then a linear combination of monomials

$\displaystyle x_1^{i_1} \dots x_n^{i_n} y_1^{j_1} \dots y_n^{j_n} z_1^{k_1} \dots z_n^{k_n}$

with ${i_1,\dots,i_n,j_1,\dots,j_n,k_1,\dots,k_n \in \{0,1,2\}}$ with

$\displaystyle i_1 + \dots + i_n + j_1 + \dots + j_n + k_1 + \dots + k_n \leq 2n.$

In particular, from the pigeonhole principle, at least one of ${i_1 + \dots + i_n, j_1 + \dots + j_n, k_1 + \dots + k_n}$ is at most ${2n/3}$ .

Consider the contribution of the monomials for which ${i_1 + \dots + i_n \leq 2n/3}$ . We can regroup this contribution as

$\displaystyle \sum_\alpha f_\alpha(x) g_\alpha(y,z)$

where ${\alpha}$ ranges over those ${(i_1,\dots,i_n) \in \{0,1,2\}^n}$ with ${i_1 + \dots + i_n \leq 2n/3}$ , ${f_\alpha}$ is the monomial

$\displaystyle f_\alpha(x_1,\dots,x_n) := x_1^{i_1} \dots x_n^{i_n}$

and ${g_\alpha: {\bf F}_3^n \times {\bf F}_3^n \rightarrow {\bf F}_3}$ is some explicitly computable function whose exact form will not be of relevance to our argument. The number of such ${\alpha}$ is equal to ${N}$ , so this contribution has rank at most ${N}$ . The remaining contributions arising from the cases ${j_1 + \dots + j_n \leq 2n/3}$ and ${k_1 + \dots + k_n \leq 2n/3}$ similarly have rank at most ${N}$ (grouping the monomials so that each monomial is only counted once), so the claim follows.

Upon restricting from ${({\bf F}_3^n)^3}$ to ${A^3}$ , the rank of ${(x,y,z) \mapsto \delta_{0^n}(x+y+z)}$ is still at most ${3N}$ . The two lemmas then combine to give the Ellenberg-Gijswijt bound

$\displaystyle |A| \leq 3N.$

All that remains is to compute the asymptotic behaviour of ${N}$ . This can be done using the general tool of Cramer’s theorem, but can also be derived from Stirling’s formula (discussed in this previous post). Indeed, if ${a = (\alpha+o(1)) n}$ , ${b = (\beta+o(1)) n}$ , ${c = (\gamma+o(1)) n}$ for some ${\alpha,\beta,\gamma \geq 0}$ summing to ${1}$ , Stirling’s formula gives

$\displaystyle \frac{n!}{a!b!c!} = \exp( n (h(\alpha,\beta,\gamma) + o(1)) )$

where ${h}$ is the entropy function

$\displaystyle h(\alpha,\beta,\gamma) = \alpha \log \frac{1}{\alpha} + \beta \log \frac{1}{\beta} + \gamma \log \frac{1}{\gamma}.$

We then have

$\displaystyle N = \exp( n (X + o(1))$

where ${X}$ is the maximum entropy ${h(\alpha,\beta,\gamma)}$ subject to the constraints

$\displaystyle \alpha,\beta,\gamma \geq 0; \alpha+\beta+\gamma=1; \beta+2\gamma \leq 2/3.$

A routine Lagrange multiplier computation shows that the maximum occurs when

$\displaystyle \alpha = \frac{32}{3(15 + \sqrt{33})}$

$\displaystyle \beta = \frac{4(\sqrt{33}-1)}{3(15+\sqrt{33})}$

$\displaystyle \gamma = \frac{(\sqrt{33}-1)^2}{6(15+\sqrt{33})}$

and ${h(\alpha,\beta,\gamma)}$ is approximately ${1.013455}$ , giving rise to the claimed bound of ${O( 2.756^n )}$ .

Remark 3 As noted in the Ellenberg and Gijswijt papers, the above argument extends readily to other fields than ${{\bf F}_3}$ to control the maximal size of subset of ${{\bf F}^n}$ that has no non-trivial solutions to the equation ${ax+by+cz=0}$ , where ${a,b,c \in {\bf F}}$ are non-zero constants that sum to zero. Of course one replaces the function ${(x,y,z) \mapsto \delta_{0^n}(x+y+z)}$ in Lemma 2 by ${(x,y,z) \mapsto \delta_{0^n}(ax+by+cz)}$ in this case.

Remark 4 This symmetrised formulation suggests that one possible way to improve slightly on the numerical quantity ${2.756}$ by finding a more efficient way to decompose ${\delta_{0^n}(x+y+z)}$ into rank one functions, however I was not able to do so (though such improvements are reminiscent of the Strassen type algorithms for fast matrix multiplication).

Remark 5 It is tempting to see if this method can get non-trivial upper bounds for sets ${A}$ with no length ${4}$ progressions, in (say) ${{\bf F}_5^n}$ . One can run the above arguments, replacing the function

$\displaystyle (x,y,z) \mapsto \delta_{0^n}(x+y+z)$

with

$\displaystyle (x,y,z,w) \mapsto \delta_{0^n}(x-2y+z) \delta_{0^n}(y-2z+w);$

this leads to the bound ${|A| \leq 4N}$ where

$\displaystyle N := \sum_{a,b,c,d,e \geq 0: a+b+c+d+e=n, b+2c+3d+4e \leq 2n} \frac{n!}{a!b!c!d!e!}.$

Unfortunately, ${N}$ is asymptotic to ${\frac{1}{2} 5^n}$ and so this bound is in fact slightly worse than the trivial bound ${|A| \leq 5^n}$ ! However, there is a slim chance that there is a more efficient way to decompose ${\delta_{0^n}(x-2y+z) \delta_{0^n}(y-2z+w)}$ into rank one functions that would give a non-trivial bound on ${A}$ . I experimented with a few possible such decompositions but unfortunately without success.

Remark 6 Return now to the capset problem. Since Lemma 1 is valid for any field ${{\bf F}}$ , one could perhaps hope to get better bounds by viewing the Kronecker delta function ${\delta}$ as taking values in another field than ${{\bf F}_3}$ , such as the complex numbers ${{\bf C}}$ . However, as soon as one works in a field of characteristic other than ${3}$ , one can adjoin a cube root ${\omega}$ of unity, and one now has the Fourier decomposition

$\displaystyle \delta_{0^n}(x+y+z) = \frac{1}{3^n} \sum_{\xi \in {\bf F}_3^n} \omega^{\xi \cdot x} \omega^{\xi \cdot y} \omega^{\xi \cdot z}.$

Moving to the Fourier basis, we conclude from Lemma 1 that the function ${(x,y,z) \mapsto \delta_{0^n}(x+y+z)}$ on ${{\bf F}_3^n}$ now has rank exactly ${3^n}$ , and so one cannot improve upon the trivial bound of ${|A| \leq 3^n}$ by this method using fields of characteristic other than three as the range field. So it seems one has to stick with ${{\bf F}_3}$ (or the algebraic completion thereof).

Thanks to Jordan Ellenberg and Ben Green for helpful discussions.

78 comments

Comments feed for this article

18 May, 2016 at 4:14 am

JSE

Beautiful! The symmetric formulation “feels right” and is surely the best way to express it.

20 May, 2016 at 12:57 am

Anonymous

It seems to be a proof from “the book”!

18 May, 2016 at 4:23 am

Nets Katz

Terry, is it obvious that decompositions of the quadratic form in Remark 5 into rank 1 pieces is the only way to go? Any argument that gives a lower bound for the dimension of a set of vectors which pairwise vanish would work. Must such an argument have a simple algebraic formulation? I would like to comment that applying the technology to characteristic zero meets with similar obstructions. A simple model for characteristic zero is to replace F_p^d addition with addition that has carrying but for which the carrying in each digit is independent. Then you have to look at a product of 2^{d-1} forms, one for each carrying profile.

18 May, 2016 at 7:02 am

JSE

I was thinking along similar lines, Nets. It’s an interesting question — what formal properties of polynomials are needed to make some argument of this kind work? Are there families of functions on (Z/NZ) which are “polynomial-like” enough? I truly don’t know. My first instinct would be to look at the Witt vector construction of (Z/p^n Z), which is somehow of the most canonical way of expressing “carrying” and passing from char p to char 0.

18 May, 2016 at 10:02 am

Terence Tao

It could well be that working with more general classes of functions than polynomials of a given degree would be useful – for instance polynomials taking values in some other commutative ring, or even a non-commutative ring (though the linear algebra becomes replaced by something nastier of course if one strays too far from the field setting). One could possibly play with completely non-linear spaces also. Maybe it is worth working with very low dimensional examples (as in my other recent comment here) to experiment if there are other decompositions than the polynomial ones which are more efficient. I played around for a while with multi-degrees instead of degrees, for instance, but eventually concluded that one did not gain much asymptotically by doing so.

18 May, 2016 at 6:01 pm

Will

It depends on whether we want to replace the additive structure of polynomials or just multiplicative. Since like most mathematicians I will only stop using linear algebra under extreme duress, I thought about how to generalize the polynomial method here while still using the notion of tensor rank to measure the complexity of functions.

So we can ask the general question: given a tensor f in V^3, viewing f tensor f tensor … f as a tensor in (V^k)^3, how may we upper bound its tensor rank as a function of k?

We may also try to lower bound its tensor rank, to let us know that we are not missing an easy way to improve the bound, or for use on the other side of the inequality. So maybe the right question is what is the asymptotic of the tensor rank – is it always an exponential function in k?

Anyways the upper bound argument generalizes immediately from polynomial functions to any functions that can be written as a sum of v_a tensor v_b tensor v_c, where v_a,v_b,v_c lie in some list of vector that come with weights w_a,w_b,w_c, such that we have an upper bound w_a + w_b + w_c < W for every term appearing in the sum.

Then we get an upper bound in terms of the maximum entropy of a probability distribution on the indices a where the expected value of w_a is at most W/3, by precisely the same argument.

To recover the polynomial special case, take v_a,v_b,v_c to be monomials in one variable, w_a,w_b,w_c the degree of the monomial, and W the degree of the polynomial function.

In the case where v_a range through a basis v_1,..,v_n (e.g. the monomials in one variable over F_p of degree less than p), then as soon as W/3 is less than the average value of w under the uniform distribution we will force the distribution to have an entropy slightly smaller than log n and hence we will get a nontrivial bound for the tensor ranks.

So for a 3-variable polynomial over F_p, the degree needs to be less than 3(p-1)/2 to get a nontrivial bound, and for a d-variable polynomial, less than d(p-1)/2.

In what should be interesting to some, this condition is precisely the Hilbert-Mumford criterion for GIT-instability under the action of (SL_n)^3 on P^{n^3-1}. So we can summarize the key idea in higher-level language as: There is a power savings bound for the rank of powers of GIT-unstable tensors.

Are there any known bounds for ranks of tensor powers that cannot be derived in this way from tuples of vectors and weight functions? Is there any method that would allow us to lower bound tensor ranks so as to show that this upper bound is sometimes sharp?

19 May, 2016 at 11:59 pm

Terence Tao

If $V$ has some basis $v_1,\dots,v_n$ , then we can associate a “Newton diagram” $\Delta \subset \{1,\dots,n\}^3$ to a function $f \in V^{\otimes 3}$ by declaring $(i,j,k)$ to be an element of $\Delta$ if the $v_i \otimes v_j \otimes v_k$ coefficient of $f$ is non-zero. Define the “capacity” $c(\Delta)$ of this diagram to be the maximum value of $\min( h(X), h(Y), h(Z))$ , where $(X,Y,Z)$ ranges over random variables taking values in $\Delta$ and $h$ denotes Shannon entropy. Then I think Stirling’s formula ends up saying that the rank of $f^{\otimes n}$ (viewed as an element of $(V^{\otimes n})^{\otimes 3}$ ) is at most $\exp( (c(\Delta)+o(1)) n)$ , basically because every “monomial” in $f^{\otimes n}$ becomes associated with a discrete random variable $(X,Y,Z)$ in $\Delta$ (taking $n$ values). I think this bound covers the previous bounds on rank mentioned here, though I haven’t checked fully.

20 May, 2016 at 6:13 am

Will

I think the bound you sketched and the bound I sketched are the same because the problems you need to solve to compute each bound are dual in a convex optimization sense. Given a weight function, you can upper bound the entropy of the marginal distributions of any probability distribution by observing min(E(w_X,w_Y,w_Z))<E(w_X+w_Y+w_Z)<sup(w_X+w_Y+w_Z)<W. If you have any distribution, you can define a weight function as the derivative of the entropy of each marginal distribution by the probability of each marginal value times the Lagrange multiplier of the entropy. Then the upper bound we get from the weight function is exactly the minimum entropy of a marginal distribution.

This makes it easy to check you aren't missing an obvious improvement – write down a weight function and a distribution and check that their entropies agree.

20 May, 2016 at 11:26 am

Terence Tao

I think one may have to generalise to the non-symmetric setting where the weight functions associated to each of the three components $v_a, v_b, v_c$ are allowed to be different from each other, but I think I agree with your point that the two formulations are dual.

18 May, 2016 at 5:09 am

Fred Lunnon

Typos (presumptive?):
for “F_3^n” read “F_{3^n}” (throughout);
for “F^3” read “F_3” (once).

[Second erratum corrected. First is not an erratum, I have tried to clarify in post – T.]

18 May, 2016 at 5:46 am

Will

One can not improve on your bound for 4-term arithmetic progressions using only polynomials. Indeed if $f$ is any multivariate polynomial over $\mathbb F_p$ that is $0$ outside a a linear subspace of codimension $c$ but not $0$ everywhere, then $f$ restricted to some subspace of dimension $d$ is a polynomial that vanishes outside a point but not at that point, which by a fragment of Chevalley-Warning (http://mathoverflow.net/a/238527/18060) has degree at least $c(p-1)$ . So a nontrivial polynomial which is nonvanishing for $4$ -term arithmetic progressions in $\mathbb F_p^n$ has degree at least $2n(p-1)$ , and the upper bound for the degree in a single vector is $n(p-1)/2$ , which is precisely the critical value one needs to beat to improve the bound.

18 May, 2016 at 7:09 am

JSE

One note about this higher notion of “rank” is that rank <= r is not a closed condition, the way it is for matrices! I always find this confusing and it's probably good to keep it in mind.

18 May, 2016 at 5:31 pm

Will

Are you sure that is true for this notion of tensor rank and not just the more commonly used notion of tensor rank? I might have a proof that this notion of rank is in fact semicontinuous.

18 May, 2016 at 6:03 pm

JSE

Oh is Terry’s notion not the same as usual tensor rank? I thought it was, but didn’t think about it carefully.

18 May, 2016 at 6:47 pm

Will

The usual definition allows only functions of the form f(x)g(y)h(z), which is much more restrictive. For instance it is clear that every n x n x n tensor has rank at most n in Terry’s sense but this is not true in the usual sense. Maybe we should call Terry’s version something like lax tensor rank to remove ambiguity.

Let me sketch my argument that the space of tensors of bounded lax tensor rank is closed:

A tensor in V tensor V tensor V is of rank < k if it is in the space spanned by W1 x V x V + V x W2 x V + V x V x W3 where dim W1 + dim W2 + dim W3 < k. Now for fixed W1, W2, W3 the space spanned by W1 x V x V + V x W2 x V + V x V x W3 is a linear subspace of V x V x V and varies smoothly as W1 , W2, W3 vary in the Grassmanian of subspaces of fixed dimension, so this forms a closed subset of V tensor V tensor V x G(dim W1, V) x G(dim W2, V) x G(dim W3, V) that projects to a closed subset of V tensor V tensor V. Then a finite sum over all triples of dimensions satisfying the inequality gives the desired result.

19 May, 2016 at 7:42 am

JSE

Oh good point! I had failed to grasp this distinction.

18 May, 2016 at 7:29 am

Will

Also two quick observations:

1. It seems that the bound given for sets in F_p^n with no 3-term arithmetic progressions is ((c+o(1))p)^n for some constant c<1 that I doubt can be put in closed form.

2. For each set S of size 2n+1 in F_p one gets power-saving bounds for subsets of F_p^n that do not contain the image of S under any nonconstant map whose coordinates are polynomials of degree n. I didn't see a trivial proof of power savings here for general S (though for special S we may reduce using special polynomials to a simpler bound) but I also didn't see a relationship to any well-known open problem in combinatorics.

18 May, 2016 at 11:16 am

Eric Naslund

In fact we can show that c=e^{-1/6}. That is we have the upper bound (e^{-1/6}+o(1))^n p^n, where the o(1) term is positive and tending to zero as p–>infty.

18 May, 2016 at 9:35 am

Terence Tao

A small but perhaps amusing computation: for the question of finding the largest number of cards in the game Set without a “set” (or equivalently, the largest capset in ${\bf F}_3^4$ ), the upper bound given here is

$3\sum_{a+b+c = 4; b+2c \leq 8/3} \frac{4!}{a!b!c!} = 45.$

This can be compared with the asymptotic prediction of $(2.756)^4 \approx 58$ for the upper bound (or the asymptotic prediction of $(2.2174)^4 \approx 24$ for the lower bound. Theorem 1.2 of Meshulam’s paper using the Fourier method gives an upper bound of $2 \times 3^4 / 3 = 54$ . The maximal size of a capset in ${\bf F}_3^4$ is actually known to be 20 by the work of Pellegrino. So even in this low dimensional case, the bound given by the polynomial method improves upon existing bounds, but is a bit short of the truth. One could imagine in this case at least one could improve upon the bound of $45$ while staying within the framework of the polynomial method (it might also be worth working out numbers for other low dimensional cases).

19 May, 2016 at 5:22 am

Fedor Petrov

Well, 45 may be replaced by 30, as $3N$ may be replaced to $2N$ always, see a proof here (sorry for doubling this link, but I think it may be convenient)

Click to access f3_eng.pdf

For specific $N$ this looks better improvement than it actually is:)

27 June, 2016 at 12:12 am

Anurag Bishnoi

The bounds given by Bierbrauer and Edel (based on Meshulam’s Fourier-Analytic proof) seem to be closer to the truth in the low-dimensional cases than the polynomial method bound. For dimensions 4, 5 and 6, they get upper bounds, 21, 48 and 114, respectively. See Section 3 in http://onlinelibrary.wiley.com/doi/10.1002/jcd.10000/abstract. The values of the polynomial bound (constant 2 instead of 3, as obtained by Fedor) for these dimensions are 30, 102 and 336.

18 May, 2016 at 11:30 am

Sam

I think remark 5 has a typo. As written, $N$ is asymptotic to $4^n$ . Should it say

$\displaystyle N := \sum_{a,b,c,d,e \geq 0: a+b+c+d+e=n, b+2c+3d+4e \leq 2n} \frac{n!}{a!b!c!d!e!}$ ?

[Corrected, thanks – T.]

18 May, 2016 at 11:59 am

Eric Naslund

Using this general approach, Will and I applied this directly to the Erdos-Szemeredi sunflower problem, proving that if F is a sunflower-free collection of subsets of {1,2,3,…,n}, then |F|< (3/2^{2/3})^n: https://docs.google.com/viewer?a=v&pid=sites&srcid=ZGVmYXVsdGRvbWFpbnxuYXNsdW5kZXJpY3xneDozZDQ2NzRjNzAxNmJjNDlh

18 May, 2016 at 1:22 pm

The Ellenberg-Gijswijt bound on cap sets | Anurag's Math Blog

[…] [2] Polymath 10 Emergency Post 5: The Erdos-Szemeredi Sunflower Conjecture is Now Proven. [3] A symmetric formulation of the Croot-Lev-Pach-Ellenberg-Gijswijt capset bound [4] Large caps in projective Galois […]

18 May, 2016 at 4:14 pm

Anonymous

It seems that the above constant $2.755...$ is (simply) given by

$\alpha^{-\alpha} \beta^{-\beta} \gamma^{-\gamma}$

18 May, 2016 at 4:28 pm

Fedor Petrov

I think, we may improve multiple 3 to 2 (not a big deal, yes.) But maybe modification of method is useful elsewhere.

For $\alpha\in\{0,1,\dots,2n\}$ denote by $d(\alpha)$ a number of points
in $\{0,1,2\}^n$ with sum of coordiantes at most
$\alpha$ . Assume that $A\subset \mathbb{F}_3^n$ and
$N=|A|>d(\alpha)+d(2n-2\alpha-1)$ for some
$\alpha<n$ (min is attained, of course,
for $\alpha$ about
$2n/3$ and is exponentially smaller than $3^n$ by a suitable verison of Law of Large Numbers.)

We need

Lemma. Let $\mathbb{F}$ be a field, $U$ be a finite set and let
$X$ be a $k$ -dimensional linear subspace of the space $\mathbb{F}^U$ of functions
on $U$ . Then the squares of functions from $X$ span a space of dimension at least $k$ .

Proof (inspired by Ilya Bogdanov). Let $U=\{1,\dots,|U|\}$ .
By Gauss' method, we may find a basis $v_1,\dots,v_k$ of $X$ such that
$s(v_1)<s(v_2)<\dots<s(v_k)$ , where $s(v)$ is the number of the first nonzero
entry in $v$ . Their squares work.

Denote by $F$ the space of $\mathbb{F}_3$ -valued functions on $A$ such that
$\sum_{a\in A} f(a) W(a)=0$ for any polynomial $W$
over $\mathbb{F}_3$ , which has degree at most 2 in each variable and total degree at most $\alpha$ .
Dimension $\dim F$ of this space
is at least $N-d(\alpha)$ .
If $f,g\in F$ , for any polynomial $Q(x,y)$ of degree at most $2\alpha+1$ we have

$\displaystyle \sum_{a\in A,b\in B} f(a)g(b) Q(a,b)=0$

(this is clear for monomials: sum factorizes, and there exists zero factor).

Assume that $a+b\ne 2c$ for $a\ne b$ , $c$ in $A$ . Consider the space of polynomials
$P(x)$ of degree at most 2 in each variable, equal to 0
0 outside $2A=\{2a,a\in A\}$ , of total degree at most $2\alpha+1$ .
Dimension of the space of such polynomials is at least $d(2\alpha+1)-3^n+N=N-d(2n-2\alpha-1)$ .
And these polynomials are all different as functions on
$2A$ (since they are different as functions on
$\mathbb{F}_3^n$ ). For all $f\in F$
we have

$\displaystyle \sum_{a\in A} f^2(a)P(2a)=\sum_{a\in A,b\in B} f(a)f(b)P(a+b)=0.$

By lemma the functions $f^2$ , $f\in F$ , form a space of dimension at least $N-d(\alpha)$ ,
and functions $P(2a)$ form a space of dimension at least $N-d(2n-2\alpha-1)$ . Sum of dimensions
exceed $N$ , thus orthogonal complement of the first subspace can not lie in the second.

18 May, 2016 at 5:03 pm

Fedor Petrov

Formatting looks really ugly, sorry. Here is pdf

Click to access f3_eng.pdf

19 May, 2016 at 11:20 pm

Terence Tao

Nice! I actually get $d(2n-2\alpha-2)$ instead of $d(2n-2\alpha-1)$ (as I believe it is the former that is equal to $3^n - d(2\alpha+1)$ . This seems to give the bounds $r_3({\bf F}_3) \leq 2$ , $r_3({\bf F}_3^2) \leq 4$ , $r_3({\bf F}_3)^3) \leq 11$ , $r_3({\bf F}_3^4) \leq 30$ , if I did the arithmetic correctly; this is sharp in one and two dimensions but misses a little in three (I believe $r_3({\bf F}_3^3)=9$ ) and higher. In terms of the tensor rank formulation, the point seems to be this: if one wishes to write $\sum_{a \in A} \delta_a(x) \delta_a(y) \delta_a(z)$ as a sum

$\displaystyle \sum_{i=1}^I f_{1,i}(x) g_{1,i}(y,z) + \sum_{j=1}^J f_{2,j}(y) g_{2,j}(x,z) + \sum_{k=1}^K f_{3,k}(z) g_{3,k}(x,y)$

then Lemma 1 of the post tells us that we need $I+J+K \geq |A|$ . But if we have the additional information that $I=J$ and $f_{1,i} = f_{2,i}$ for $i=1,\dots,I$ , then we can improve this condition to $I+K \geq |A|$ . I guess there is some sort of “symmetric rank” here where we symmetrise under the operation of interchanging x and y. (It also seems tempting to symmetrise in all of x,y,z, but there seems to be some issue in characteristic 3 in that one can’t divide by 3!. Given that the bound is now sharp in one and two dimensions, I guess it is not too realistic to hope for a really big further gain.)

20 May, 2016 at 1:17 am

Fedor Petrov

Yes, of course $d(2n-2\alpha-2)=3^n-d(2\alpha+1)$ . I hope that now everything in the linked pdf is correct, and it is expressed in more symmetric form, without two steps. The space $F$ satisfies a condition $\sum_{a\in A} u(a)v(a)w(a)=0$ for all $u,v,w$ in $F$ . In general it implies that $\dim F\leqslant |A|/2$ , but we are additionally given that supports of all functions in $F$ are quite large. Does it allow to get essentially better bound on the dimension? Without conditions on the support it is tight, but the example I have in mind (linear span of vectors $e_{2i-1}-e_{2i}$ ) has small supports.

It gives at least some improvement, say, estimate 30 in the game set problem may be easily improved (though I doubt that down to 20.)

19 May, 2016 at 8:25 am

Polymath 10 Emergency Post 5: The Erdos-Szemeredi Sunflower Conjecture is Now Proven. | Combinatorics and more

[…] nice modification) of the Croot-Lev-Pach-Ellenberg-Gijswijt capset and further discussion see this post and comment thread on Terry Tao’s […]

19 May, 2016 at 9:28 am

Eric Naslund

Here is another application of this method which can be seen as progress on the Erdos-Rado sunflower problem. The functions used are combinations of indicator functions of the elements in the underlying set rather than polynomials: https://docs.google.com/viewer?a=v&pid=sites&srcid=ZGVmYXVsdGRvbWFpbnxuYXNsdW5kZXJpY3xneDozZWM4ZTI0ZTcyNGMyY2M2

19 May, 2016 at 11:05 am

Jop Briet

Dear Terry, thanks for the great post! Tiny typo: Eq (3) seems to be missing an equality symbol.

[Corrected, thanks – T.]

20 May, 2016 at 3:54 pm

Terence Tao

For the record: the known values of $r_3({\bf F}_3^n)$ (i.e.for $n \leq 6$ ) are at https://oeis.org/A090245

21 May, 2016 at 1:36 am

Fedor Petrov

We may mix Croot-Lev-Pach and Olson’s ideas to rephrase this proof using group rings.

Let $p$ be odd prime. Denote by $C_m$ a cyclic group of order $m$ . Let $G=\prod_{i=1}^n C_{p^{k_i}}$ be a finite Abelian $p$ -group with $n$ generators $g_1,\dots,g_n$ , $g_i$ generates $C_{p^{k_i}}$ . Assume that we managed to find a subspace $X$ of a group algebra $\mathbb{F}_p[G]$ of codimension $t$ satisfying $uvw=0$ for all $u,v,w\in X$ . Then any subset $A\subset G$ of cardinality at least $2t+1$ contains three different elements $a,b,c$ such that $ab=c^2$ .

Why? Assume the contrary. Consider the subspaces $X_1,X_{-2}$ of $X$ with all coefficients outside $A$ (resp. outside $A^{-2}$ ) equal to 0. They have dimensions at least $t+1$ . Consider the product

$0=(\sum_{a\in A} u(a) a)(\sum_{b\in A} v(b) b)(\sum_{c\in A} w(c^{-2})c^{-2}),$

where two first summands are chosen from $X_1$ and the second from $X_{-2}$ . If $abc^{-2}=1$ only for $a=b=c$ , the coefficient of 1 in this product is nothing but $\sum_{u\in A} u(a)v(a)w(a^{-2})$ . But the space of functions of the form $u(a)v(a)$ , $u,v\in X_1$ is at least $\dim X_1$ , as we may see by choosing basis in $X_1$ by Gauss elimination, and it can not be orthogonal to the space formed by functions $w(a^{-2})$ , which is isomorphic to $X_2$ . (This argument is the same as before.)

Now how to choose such a space with low codimension. Group algebra is generated by the products $\prod (1-g_i)^{m_i}$ , where $m_i\in \{0,1,\dots,p^{k_i}-1\}$ . Let $X$ be a subspace generated by monomials for which $\sum_{i=1}^n m_i/(p^{k_i}-1)>n/3$ . Then any product $uvw$ for $u,v,w\in X$ has some $1-g_i$ in a power strictly greater than $p^{k_i}-1$ , but $(1-g_i)^{p^{k_i}}=0$ .

What is the codimension $t$ of this subspace? If $\xi_i$ are random variables uniformly distributed in the set
$\frac1{p^{k_i}-1}\{0,1,\dots,p^{k_i}-1\}$ , then $t/|G|$ is the probability that $\sum (\xi_i-E(\xi_i))\leqslant -n/6$ . It decays expoentially in $n$ . Alas, this does not seem to be better than partitioning $G$ onto classes modulo $C_p^n$ and reasoning in them separately.

21 May, 2016 at 5:12 am

Fedor Petrov

Well, at least this gives better exponent for, say, $C_9^n$ compared to $3^n$ copies of $C_3^n$, so I hope that it touches some effect.

31 May, 2016 at 8:54 am

Fedor Petrov

It gives similar estimates for progression-free sets $|A|\leqslant |G|^{\kappa}$ , where $\kappa1/2$ such that any large enough finite group $G$ (at least of odd order, though this is hardily important) contains $|G|^c$ elements without 3-progressions?

31 May, 2016 at 8:57 am

Fedor Petrov

I tried to say that it works for non-Abelian $p$ -groups, but something went wrong, probably inequality signs have eaten my post. Never mind. Here is a draft

Click to access grouprings1.pdf

22 May, 2016 at 11:55 am

Tom Gur

Should delta(x_1 – a) * … * delta(x_k – a) be delta_a(x_1) *…* delta_a(x_k) ?
(Near the end of the proof of Lemma 1.)

[Corrected, thanks – T.]

25 May, 2016 at 8:50 am

Mind Boggling: Following the work of Croot, Lev, and Pach, Jordan Ellenberg settled the cap set problem! | Combinatorics and more

[…] See also this post by Tao (presenting a symmetric version of the proof) and this post by […]

25 May, 2016 at 10:01 am

Thomas Bloom

Interestingly, this proof also works for the similar problem over polynomial rings. For example, bounding the size of the largest subset of F_2[t]_{\deg f< n} without non-trivial solutions to a+tb=(1+t)c, this method gives an exponential upper bound (I think something like 1.9^n for this particular problem).

This is great, because before this not even the $N/\log N$ bound was known, but rather $N(\log\log N)^2/\log N$ – I used to think that this problem was closer to the normal integer setting than the cap set problem, but not any more!

And, of course, there are many analogues between the integers and polynomials over finite fields…

25 May, 2016 at 7:08 pm

Terence Tao

Ben Green has recently obtained an exponential bound for the analogue of Sarkozy’s theorem in function fields using these methods: http://arxiv.org/abs/1605.07263

26 May, 2016 at 12:33 pm

Mario Carneiro

You comment on https://oeis.org/A090245 that “Asymptotically, a(n) = O(3^n/n)” which contradicts this post.

26 May, 2016 at 5:13 pm

Terence Tao

In combinatorics, $O()$ is used to denote an upper bound only (one would write instead $a(n) = \Theta(3^n/n)$ if one wanted to assert the exact order of magnitude).

26 May, 2016 at 7:25 pm

Mario Carneiro

Oops, newbie mistake. Indeed I read it as a(n) ~ 3^n/n.

27 May, 2016 at 12:11 am

Polymath 10 post 6: The Erdos-Rado sunflower conjecture, and the Turan (4,3) problem: homological approaches. | Combinatorics and more

[…] We will not regard attacks on the sunflower conjecture based on the recent solution of the cap set problem and the Erdos Szemeredi sunflower conjecture (which is a weaker version of the Erdos-Rado conjecture) as part of the present polymath10 project, but, of course, I will be happy to hear also reports on progress or comments on these directions. Some updates on this story: Eric Naslund and Will Sawin gave a direct proof based on the polynomial method for the Erdos-Szemeredi sunflower conjecture, and an even stronger result is given by Naslund here. (Eric also has stronger quantitative bounds for Erdos-Szemeredi based on bounds for cap sets.) More cap set updates are added to this post, and can be found in the comment threads here and here. […]

27 May, 2016 at 12:38 pm

Anonymous

Typo: I believe $k$ should be $k-1$ in the equation following “If we multiply (3) by…”

[Corrected, thanks – T.]

29 May, 2016 at 3:54 am

Klaus85

This is a very interesting post, thanks. I wonder, do you have a personal favorite-list of open problems? (it seemed this one was on the list). Or do professional mathematicians not have such lists but concentrate in developing the field further independent whether there is some identified open question? I’m honestly interested in this. Thanks!

29 May, 2016 at 8:22 am

Terence Tao

The blog posts listed at https://terrytao.wordpress.com/category/question/ are devoted to discussion of a number of open questions that I am interested in. I did at one point try to maintain a list of such open questions more systematically, but have not done so for more than a decade.

1 June, 2016 at 8:14 am

Terence Tao

A shout out, by the way, to Ben Green, who to my knowledge was the first to dare conjecture [EDIT: explicitly, and in print] that cap sets had to be exponentially sparse (see Conjecture 4.3 of http://arxiv.org/abs/math/0409420 ). I actually disagreed with him at the time, and even said so in my 2007 blog post, but am happy to have been proven wrong on this.

While on the topic of past inaccuracies, it seems I may have inadvertently propagated a slightly incorrect terminology in that post: sets in ${\bf F}_3^n$ free of collinear triples are known as affine caps or simply caps in the design theory literature, rather than cap sets. The latter terminology seems to have become rather entrenched, though, at least in additive combinatorics circles…

1 June, 2016 at 9:33 am

samuelfhopkins

In the commentary after Conjecture 4.3, Green says he would expect any progress on that conjecture to lead to improvements on bounding the size of a maximum subset of $\mathbb{Z}_n$ without 3-term APs. Is the jury out at the moment about the possibility of these latest developments being applied to the $\mathbb{Z}_n$ setting?

1 June, 2016 at 11:56 am

Gil Kalai

It is interesting to look at the original 1987 paper of Peter Frankl, Ron Graham, and Vojta Rodl.
FGR gave a $o(3^n)$ upper bound using Rusza-Szemeredi’s theorem (that applies also to show that $r_3(n)=o(n)$ ). In that paper FGR asked if cap sets are exponentially sparse. They also relate the question to the Erdos-Rado’s sunflower conjecture.

I myself worked with Meshulam on this problem around 2000, and we had a Fourier statement about inequality between certain norms that in its strongest form would give exponentially good upper bound, (and in some weaker forms will give weaker improvement to Roy’s original 95 bound.) (These norm-inequalities are still completely open, I think.) Roy certainly conjectured that cap sets are exponentially sparse while I changed sides several times over the years. In 2009 I discussed another approach also initiated with discussions with Roy with certain weakening and strengthening of the 3AP condition, potential relations to variants of the Frankl-Wilson and Frankl-Rodl theorems, and some slightly related polynomial-method lemmas.

On another matter it is certainly an obvious question at this point to try to give upper bounds for the rank of subsets of $\{0,1,2\}^n$ without a combinatorial line. The most optimistic constructions are given in terms of minimum of level sets corresponding to 3AP free sets of levels. Maybe there could be a way to dualize somehow and get upper bounds in terms of maximum of some related question. (It will be sufficiently wonderful to relate upper bounds for DHJ to upper bounds for $r_3(n)$ .)

1 June, 2016 at 12:46 pm

Terence Tao

One can certainly try to apply the same method to the DHJ problem – one replaces the function $\delta_0(x+y+z)$ by the indicator function of all the triples in $\{0,1,2\}^3$ that form a combinatorial line. Unfortunately the rank of this tensor appears to be too large for the method to work (and in any event, a suitable modification of the Behrend construction shows that an exponential gain is not possible in this case anyway).

At present it seems like the most plausible way one could use these methods to improve upon the ${\bf Z}/N{\bf Z}$ problem is through progress on DHJ (or slight variants thereof), perhaps in $\{0,1,\dots,r\}^n$ for some suitable choice of $r$ and $n$ with $r^n \approx N$ (the Behrend example suggests that $r$ has to grow with $N$ in order for there to be any chance here). I myself haven’t seriously tried this, but presumably some of the many other mathematicians now experimenting with this method would be considering it; it’s worth a shot, certainly.

p.s. Regarding by previous comment, I should clarify that Ben was certainly not the first to pose the question of whether capsets are exponentially sparse or not – the question was also raised as “an interesting open problem” in the 2004 paper of Edel. But I don’t know of an earlier reference in the literature in which the author came out strongly in favour of the exponentially sparse scenario, rather than being undecided or posing the opposing scenario as a question (as was done for instance in the Frankl-Graham-Rodl paper).

2 June, 2016 at 5:52 pm

Paul Delhanty

Edel, in the 2004 paper that you linked previously, writes, “That leaves us with two research problems. Firstly to improve the bounds on $\mu(3)$ by finding better *capsets*.”

So you get a pass on the 2nd point.

2 June, 2016 at 9:36 pm

Paul Delhanty

Ah, my bad – sorry! On reading through Edel’s paper more carefully, *capset* is defined in Definition 9 and it is a somewhat different and much more specific concept from a cap. It looks like a capset is a set of indexing tuples that represent a set of product caps that get unioned together to create the final cap.

1 June, 2016 at 3:04 pm

anthonyquas

Very nice! I have 3 minor comments and a question.
1) I think it would be helpful to say that $h$ is *nowhere zero* on $A'$ (rather than non-zero, which is ambiguous);
2) In the first line of Lemma 2, $\delta(x)=1-x^2$ for $x\in\mathbf F_3$ (not for general $\mathbf F$ ;
3) In my browser, I was unable to find the equation label (3) (but I inferred where it was).

And the question: Am I right that the term “capset” comes from spherical caps that were used to build large sets without arithmetic progressions? The terminology is a bit mysterious to non-experts.

[Corrected, thanks. The notion of an “affine cap” comes from design theory, but I do not know the provenance exactly – T.]

2 June, 2016 at 1:39 am

kodlu

In Remark 3, “to control the maximal size of aubset” should end with “a subset”, I presume.

[Corrected, thanks -T.]

8 June, 2016 at 8:02 pm

aram

I’m confused about (1). The LHS =1 if x,y,z form a line but the RHS = 0 then. And the empty set is a valid capset, so the RHS could be always 0. And the RHS could be sometimes 2 but the LHS is always 0 or 1.

9 June, 2016 at 11:01 am

Terence Tao

The case when $x,y,z$ form a line never occurs when $x,y,z \in A$ and $A$ is a capset, by definition of capset. In the case of the empty set, (1) is vacuously true for $x,y,z \in A$ . The RHS cannot actually attain 2 because at most one of the summands can be non-zero at a time.

9 June, 2016 at 7:38 pm

aram

Oh, I missed that x,y,z have to belong to A! That makes sense.

23 June, 2016 at 9:25 am

Abbas

Nice argument and presentation!

On the last line of Remark 3, $\delta(ax+by+cz)$ should be $\delta_{0^n}(ax+by+cz)$ ? Also in Remark 5, 3rd line from below.

[Corrected, thanks – T.]

8 July, 2016 at 1:12 pm

Bounds for sum free sets in prime power cyclic groups — three ways | Secret Blogging Seminar

[…] now-standard argument (I like Terry Tao’s exposition) shows that is bounded by three times the number of monomials of degree . One needs to check that […]

29 July, 2016 at 8:05 am

Comments on the Ellenberg-Gijswijt bound for caps in higher characteristic | Blogderbeweise - Proofs from the blog

[…] breakthrough due to Croot, Lev and Pach, recycled by Ellenberg-Gijswijt and streamlined by Tao on this blog post, revolutionised the landscape of Roth’s type problems in vector spaces over finite fields. […]

24 August, 2016 at 7:52 am

Notes on the “slice rank” of tensors | What's new

[…] the previous blog post, one of us (Terry) implicitly introduced a notion of rank for tensors which is a little different […]

4 September, 2016 at 6:17 am

Paata Ivanishvili

Once at the end of a seminar about capset problem fedja said: “you don’t need Lagrange multipliers. For $0<x<1$ clearly $\frac{(1+x+x^{2})^{n}}{x^{2n/3}}\geq \sum_{a+b+c=n, b+2c \geq 2n/3} \frac{n!}{a!b!c!}$ . Minimizing the LHS gives $(2.75511..)^{n}$ "

5 September, 2016 at 11:59 pm

Will

Yes, a similar method is very useful for quickly doing slice rank calculations. Eric Naslund and I used it on the last page of this: https://arxiv.org/pdf/1606.09575v1.pdf

However, it is not obvious that the upper bounds produced by this method are optimal. To analyze that question I don’t see how to avoid the use of Lagrange multipliers. See https://terrytao.wordpress.com/2016/08/24/notes-on-the-slice-rank-of-tensors/

6 September, 2016 at 12:10 am

Fedor Petrov

The multiplicative constant obtained on this way (variants of which are called Cramer theorem or Chernoff bound in probability and which is essentially a saddle-point method) is of course optimal. Further asymptotics (polynomial multiples) may be also obtained by saddle-point method, this is more technical.

6 September, 2016 at 5:27 am

Will

But in the probability theory problem we are trying to bound the tail probability of a fixed random variable. I agree that the method is optimal there. Here we have the freedom to choose the random variable, subject to some constraints. If you write down the constraints, then any way you slice it, you have a convex optimization problem for which Lagrange multipliers are a suitable tool.

Of course optimizing the random variable for progression-free sets in F_q^n with q=3 is not very hard. But for higher values of q it is nontrivial, though solved by these papers: https://arxiv.org/abs/1608.00243 https://arxiv.org/abs/1608.05740

26 October, 2016 at 5:04 am

Thomas Dybdahl Ahle

Wait, why is that clearly an upper bound?

26 October, 2016 at 5:57 am

Thomas Dybdahl Ahle

Ah, $\frac{(1+x+x^2)^n}{x^{2n/r}} = \sum_{a+b+c=n}x^{b+2c-2n/3}$ . You just flipped the inequality on $b+2c\le 2n/3$ .

9 September, 2016 at 9:48 am

Introduction to the polynomial method (and other similar things) | Short, Fat Matrices

[…] result then follows from analyzing the asymptotic behavior of (see this post for details). The proof of the lemma is perhaps more interesting, considering this is where polynomials […]

15 May, 2017 at 1:40 am

Notes of the seminar ‘On the slice rank of functions’ | Points And Lines

[…] try to elude the idea of ‘slice rank’ of a tensor, as was defined by Terry Tao in his reformulation of the Ellenberg-Gijswijt proof. I will try to indicate where the idea might have come from and how […]

23 June, 2017 at 2:55 pm

TheoryFest Day 3: Plenaries | A bunch of data

[…] came out of the recent flurry of work by Croot, Lev, Pach, Ellenberg, Gijswijt and Tao (and cited Tao’s blog post as well as the papers, which I thought was […]

1 July, 2017 at 9:29 pm

수학동아 연재 (2016년) - Prof. Sang-il Oum

[…] 참고 블로그글: Ellenberg 블로그, Tim Gowers 블로그, Terrence Tao 블로그 […]

16 April, 2018 at 7:21 am

George

In regards to Remark 5 and Will’s nice response to it above, it seems that Hongze Li has made progress for large q. https://arxiv.org/pdf/1610.00247.pdf

16 April, 2018 at 5:45 pm

George Shakan

After reading the paper and getting in touch with the author, unfortunately there is an error.

19 March, 2019 at 4:03 am

수학동아 연재 (2016년) - Sang-il Oum

[…] 참고 블로그글: Ellenberg 블로그, Tim Gowers 블로그, Terrence Tao 블로그 […]

23 January, 2023 at 9:24 am

The Trifference Problem | Anurag's Math Blog

[…] have led to any improvement in the bounds on . For example, Costa and Dalai have shown that the slice rank method, which was recently used to make breakthroughs on the cap set problem, and many other combinatorial […]

28 February, 2023 at 9:00 am

Theory at the Institute and Beyond, February 2023 | Calvin Café: The Simons Institute Blog

[…] Kelley-Meka proof in this mold). A few years ago, an ingenious use of the polynomial method led to stunning progress on the capset problem, leading to a short proof of an upper bound of O(2.756n) for q = 3, a vast improvement to the […]

	Anonymous on Erratum for “An inverse…
	Jas, the Physicist on Career advice
	Jas, the Physicist on Career advice
	Anonymous on Two announcements: AI for Math…
	Anonymous on Erratum for “An inverse…
	Anonymous on Erratum for “An inverse…
	Anonymous on 275A, Notes 3: The weak and st…
	Anonymous on An epsilon of room: pages from…
	Anonymous on An epsilon of room: pages from…
	Anonymous on An epsilon of room: pages from…
	Anonymous on Erratum for “An inverse…
	Anonymous on Two announcements: AI for Math…
	Anonymous on Erratum for “An inverse…
	Anonymous on Marton’s conjecture in a…
	Anonymous on Marton’s conjecture in a…

A symmetric formulation of the Croot-Lev-Pach-Ellenberg-Gijswijt capset bound

Recent Comments

Articles by others

Diversions

Mathematics

Selected articles

Software

The sciences

Top Posts

Archives

Categories

The Polymath Blog

78 comments

Leave a comment Cancel reply

For commenters

A symmetric formulation of the Croot-Lev-Pach-Ellenberg-Gijswijt capset bound

Share this:

Recent Comments

Articles by others

Diversions

Mathematics

Selected articles

Software

The sciences

Top Posts

Archives

Categories

The Polymath Blog

78 comments

Leave a comment Cancel reply

For commenters