
A capset in the vector space ${{\bf F}_3^n}$ over the finite field ${{\bf F}_3}$ of three elements is a subset ${A}$ of ${{\bf F}_3^n}$ that does not contain any lines ${\{ x,x+r,x+2r\}}$, where ${x,r \in {\bf F}_3^n}$ and ${r \neq 0}$. A basic problem in additive combinatorics (discussed in one of the very first posts on this blog) is to obtain good upper and lower bounds for the maximal size of a capset in ${{\bf F}_3^n}$.

Trivially, one has ${|A| \leq 3^n}$. Using Fourier methods (and the density increment argument of Roth), the bound of ${|A| \leq O( 3^n / n )}$ was obtained by Meshulam, and improved only as late as 2012 to ${O( 3^n /n^{1+c})}$ for some absolute constant ${c>0}$ by Bateman and Katz. But in a very recent breakthrough, Ellenberg (and independently Gijswijt) obtained the exponentially superior bound ${|A| \leq O( 2.756^n )}$, using a version of the polynomial method recently introduced by Croot, Lev, and Pach. (In the converse direction, a construction of Edel gives capsets as large as ${(2.2174)^n}$.) Given the success of the polynomial method in superficially similar problems such as the finite field Kakeya problem (discussed in this previous post), it was natural to wonder whether this method could be applicable to the cap set problem (see for instance this MathOverflow comment of mine on this from 2010), but it took a surprisingly long time before Croot, Lev, and Pach were able to identify the precise variant of the polynomial method that would actually work here.

The proof of the capset bound is very short (Ellenberg’s and Gijswijt’s preprints are both 3 pages long, and Croot-Lev-Pach is 6 pages), but I thought I would present a slight reformulation of the argument which treats the three points on a line in ${{\bf F}_3}$ symmetrically (as opposed to treating the third point differently from the first two, as is done in the Ellenberg and Gijswijt papers; Croot-Lev-Pach also treat the middle point of a three-term arithmetic progression differently from the two endpoints, although this is a very natural thing to do in their context of ${({\bf Z}/4{\bf Z})^n}$). The basic starting point is this: if ${A}$ is a capset, then one has the identity

$\displaystyle \delta_{0^n}( x+y+z ) = \sum_{a \in A} \delta_a(x) \delta_a(y) \delta_a(z) \ \ \ \ \ (1)$

for all ${(x,y,z) \in A^3}$, where ${\delta_a(x) := 1_{a=x}}$ is the Kronecker delta function, which we view as taking values in ${{\bf F}_3}$. Indeed, (1) reflects the fact that for ${x,y,z \in A}$, the equation ${x+y+z=0}$ holds precisely when ${x,y,z}$ are either all equal (note that ${3x=0}$ in ${{\bf F}_3^n}$), or form a line, and the latter possibility is ruled out precisely when ${A}$ is a capset.
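As a quick sanity check (this snippet is purely illustrative and not from any of the papers), one can verify both the capset property and identity (1) by brute force for a small capset in ${{\bf F}_3^2}$:

```python
from itertools import product

n = 2  # small enough to check (1) exhaustively

def add(*vs):
    """Coordinatewise sum of vectors in F_3^n."""
    return tuple(sum(c) % 3 for c in zip(*vs))

# A capset in F_3^2: no three *distinct* points sum to zero
# (three distinct points in F_3^n form a line iff they sum to 0^n).
A = [(0, 0), (0, 1), (1, 0), (1, 2)]
assert all(add(x, y, z) != (0,) * n
           for x, y, z in product(A, repeat=3) if len({x, y, z}) == 3)

# Identity (1): delta_{0^n}(x+y+z) = sum_a delta_a(x) delta_a(y) delta_a(z) in F_3
for x, y, z in product(A, repeat=3):
    lhs = 1 if add(x, y, z) == (0,) * n else 0
    rhs = sum((a == x) * (a == y) * (a == z) for a in A) % 3
    assert lhs == rhs
print("identity (1) holds on A^3")
```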

To exploit (1), we will show that the left-hand side of (1) is “low rank” in some sense, while the right-hand side is “high rank”. Recall that a function ${F: A \times A \rightarrow {\bf F}}$ taking values in a field ${{\bf F}}$ is of rank one if it is non-zero and of the form ${(x,y) \mapsto f(x) g(y)}$ for some ${f,g: A \rightarrow {\bf F}}$, and that the rank of a general function ${F: A \times A \rightarrow {\bf F}}$ is the least number of rank one functions needed to express ${F}$ as a linear combination. More generally, if ${k \geq 2}$, we define the rank of a function ${F: A^k \rightarrow {\bf F}}$ to be the least number of “rank one” functions of the form

$\displaystyle (x_1,\dots,x_k) \mapsto f(x_i) g(x_1,\dots,x_{i-1},x_{i+1},\dots,x_k)$

for some ${i=1,\dots,k}$ and some functions ${f: A \rightarrow {\bf F}}$, ${g: A^{k-1} \rightarrow {\bf F}}$, that are needed to generate ${F}$ as a linear combination. For instance, when ${k=3}$, the rank one functions take the form ${(x,y,z) \mapsto f(x) g(y,z)}$, ${(x,y,z) \mapsto f(y) g(x,z)}$, ${(x,y,z) \mapsto f(z) g(x,y)}$, and linear combinations of ${r}$ such rank one functions will give a function of rank at most ${r}$.

It is a standard fact in linear algebra that the rank of a diagonal matrix is equal to the number of non-zero entries. This phenomenon extends to higher dimensions:

Lemma 1 (Rank of diagonal hypermatrices) Let ${k \geq 2}$, let ${A}$ be a finite set, let ${{\bf F}}$ be a field, and for each ${a \in A}$, let ${c_a \in {\bf F}}$ be a coefficient. Then the rank of the function

$\displaystyle (x_1,\dots,x_k) \mapsto \sum_{a \in A} c_a \delta_a(x_1) \dots \delta_a(x_k) \ \ \ \ \ (2)$

is equal to the number of non-zero coefficients ${c_a}$.

Proof: We induct on ${k}$. As mentioned above, the case ${k=2}$ follows from standard linear algebra, so suppose now that ${k>2}$ and the claim has already been proven for ${k-1}$.

It is clear that the function (2) has rank at most equal to the number of non-zero ${c_a}$ (since the summands on the right-hand side are rank one functions), so it suffices to establish the lower bound. By deleting from ${A}$ those elements ${a \in A}$ with ${c_a=0}$ (which cannot increase the rank), we may assume without loss of generality that all the ${c_a}$ are non-zero. Now suppose for contradiction that (2) has rank at most ${|A|-1}$; then we obtain a representation

$\displaystyle \sum_{a \in A} c_a \delta_a(x_1) \dots \delta_a(x_k)$

$\displaystyle = \sum_{i=1}^k \sum_{\alpha \in I_i} f_{i,\alpha}(x_i) g_{i,\alpha}( x_1,\dots,x_{i-1},x_{i+1},\dots,x_k) \ \ \ \ \ (3)$

for some sets ${I_1,\dots,I_k}$ of cardinalities adding up to at most ${|A|-1}$, and some functions ${f_{i,\alpha}: A \rightarrow {\bf F}}$ and ${g_{i,\alpha}: A^{k-1} \rightarrow {\bf F}}$.

Consider the space of functions ${h: A \rightarrow {\bf F}}$ that are orthogonal to all the ${f_{k,\alpha}}$, ${\alpha \in I_k}$ in the sense that

$\displaystyle \sum_{x \in A} f_{k,\alpha}(x) h(x) = 0$

for all ${\alpha \in I_k}$. This space is a vector space whose dimension ${d}$ is at least ${|A| - |I_k|}$. A basis of this space generates a ${d \times |A|}$ coordinate matrix of full rank, which implies that there is at least one non-singular ${d \times d}$ minor. This implies that there exists a function ${h: A \rightarrow {\bf F}}$ in this space which is nowhere vanishing on some subset ${A'}$ of ${A}$ of cardinality at least ${|A|-|I_k|}$.

If we multiply (3) by ${h(x_k)}$ and sum in ${x_k}$, we conclude that

$\displaystyle \sum_{a \in A} c_a h(a) \delta_a(x_1) \dots \delta_a(x_{k-1})$

$\displaystyle = \sum_{i=1}^{k-1} \sum_{\alpha \in I_i} f_{i,\alpha}(x_i)\tilde g_{i,\alpha}( x_1,\dots,x_{i-1},x_{i+1},\dots,x_{k-1})$

where

$\displaystyle \tilde g_{i,\alpha}(x_1,\dots,x_{i-1},x_{i+1},\dots,x_{k-1})$

$\displaystyle := \sum_{x_k \in A} g_{i,\alpha}(x_1,\dots,x_{i-1},x_{i+1},\dots,x_k) h(x_k).$

The right-hand side has rank at most ${|A|-1-|I_k|}$, since the summands are rank one functions. On the other hand, by the induction hypothesis the left-hand side has rank at least ${|A|-|I_k|}$, giving the required contradiction. $\Box$

On the other hand, we have the following (symmetrised version of a) beautifully simple observation of Croot, Lev, and Pach:

Lemma 2 On ${({\bf F}_3^n)^3}$, the rank of the function ${(x,y,z) \mapsto \delta_{0^n}(x+y+z)}$ is at most ${3N}$, where

$\displaystyle N := \sum_{a,b,c \geq 0: a+b+c=n, b+2c \leq 2n/3} \frac{n!}{a!b!c!}.$

Proof: Using the identity ${\delta_0(x) = 1 - x^2}$ for ${x \in {\bf F}_3}$, we have

$\displaystyle \delta_{0^n}(x+y+z) = \prod_{i=1}^n (1 - (x_i+y_i+z_i)^2).$

The right-hand side is clearly a polynomial of degree ${2n}$ in ${x,y,z}$, which is then a linear combination of monomials

$\displaystyle x_1^{i_1} \dots x_n^{i_n} y_1^{j_1} \dots y_n^{j_n} z_1^{k_1} \dots z_n^{k_n}$

with ${i_1,\dots,i_n,j_1,\dots,j_n,k_1,\dots,k_n \in \{0,1,2\}}$ with

$\displaystyle i_1 + \dots + i_n + j_1 + \dots + j_n + k_1 + \dots + k_n \leq 2n.$

In particular, from the pigeonhole principle, at least one of ${i_1 + \dots + i_n, j_1 + \dots + j_n, k_1 + \dots + k_n}$ is at most ${2n/3}$.

Consider the contribution of the monomials for which ${i_1 + \dots + i_n \leq 2n/3}$. We can regroup this contribution as

$\displaystyle \sum_\alpha f_\alpha(x) g_\alpha(y,z)$

where ${\alpha}$ ranges over those ${(i_1,\dots,i_n) \in \{0,1,2\}^n}$ with ${i_1 + \dots + i_n \leq 2n/3}$, ${f_\alpha}$ is the monomial

$\displaystyle f_\alpha(x_1,\dots,x_n) := x_1^{i_1} \dots x_n^{i_n}$

and ${g_\alpha: {\bf F}_3^n \times {\bf F}_3^n \rightarrow {\bf F}_3}$ is some explicitly computable function whose exact form will not be of relevance to our argument. The number of such ${\alpha}$ is equal to ${N}$, so this contribution has rank at most ${N}$. The remaining contributions arising from the cases ${j_1 + \dots + j_n \leq 2n/3}$ and ${k_1 + \dots + k_n \leq 2n/3}$ similarly have rank at most ${N}$ (grouping the monomials so that each monomial is only counted once), so the claim follows.

Upon restricting from ${({\bf F}_3^n)^3}$ to ${A^3}$, the rank of ${(x,y,z) \mapsto \delta_{0^n}(x+y+z)}$ is still at most ${3N}$. The two lemmas then combine to give the Ellenberg-Gijswijt bound

$\displaystyle |A| \leq 3N.$
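One can compute ${N}$ exactly for moderate ${n}$ and confirm that ${3N}$ already improves on the trivial bound ${3^n}$. The following script (an illustration, not part of the argument) sums the multinomial coefficients directly:

```python
from math import factorial

def N(n):
    """Sum of n!/(a!b!c!) over a+b+c = n with b + 2c <= 2n/3
    (here a, b, c count the exponents 0, 1, 2 among i_1, ..., i_n)."""
    total = 0
    for b in range(n + 1):
        for c in range(n - b + 1):
            if 3 * (b + 2 * c) <= 2 * n:  # b + 2c <= 2n/3 in exact arithmetic
                a = n - b - c
                total += factorial(n) // (factorial(a) * factorial(b) * factorial(c))
    return total

for n in (30, 60, 90):
    print(n, 3 * N(n) < 3**n, (3 * N(n)) ** (1 / n))
# the effective base (3N)^(1/n) decreases towards e^h = 2.755... as n grows
```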

All that remains is to compute the asymptotic behaviour of ${N}$. This can be done using the general tool of Cramér's theorem, but can also be derived from Stirling's formula (discussed in this previous post). Indeed, if ${a = (\alpha+o(1)) n}$, ${b = (\beta+o(1)) n}$, ${c = (\gamma+o(1)) n}$ for some ${\alpha,\beta,\gamma \geq 0}$ summing to ${1}$, Stirling's formula gives

$\displaystyle \frac{n!}{a!b!c!} = \exp( n (h(\alpha,\beta,\gamma) + o(1)) )$

where ${h}$ is the entropy function

$\displaystyle h(\alpha,\beta,\gamma) = \alpha \log \frac{1}{\alpha} + \beta \log \frac{1}{\beta} + \gamma \log \frac{1}{\gamma}.$

We then have

$\displaystyle N = \exp( n (X + o(1)) )$

where ${X}$ is the maximum entropy ${h(\alpha,\beta,\gamma)}$ subject to the constraints

$\displaystyle \alpha,\beta,\gamma \geq 0; \alpha+\beta+\gamma=1; \beta+2\gamma \leq 2/3.$

A routine Lagrange multiplier computation shows that the maximum occurs when

$\displaystyle \alpha = \frac{32}{3(15 + \sqrt{33})}$

$\displaystyle \beta = \frac{4(\sqrt{33}-1)}{3(15+\sqrt{33})}$

$\displaystyle \gamma = \frac{(\sqrt{33}-1)^2}{6(15+\sqrt{33})}$

and ${h(\alpha,\beta,\gamma)}$ is approximately ${1.013455}$, giving rise to the claimed bound of ${O( 2.756^n )}$.
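The closed forms above can be checked numerically (a verification sketch, not part of the argument): they satisfy both constraints, with the degree constraint ${\beta+2\gamma = 2/3}$ active at the maximum, and give ${e^h \approx 2.7551 < 2.756}$:

```python
from math import sqrt, log, exp

s = sqrt(33)
alpha = 32 / (3 * (15 + s))
beta = 4 * (s - 1) / (3 * (15 + s))
gamma = (s - 1) ** 2 / (6 * (15 + s))

assert abs(alpha + beta + gamma - 1) < 1e-12   # probability constraint
assert abs(beta + 2 * gamma - 2 / 3) < 1e-12   # degree constraint is active

h = -sum(t * log(t) for t in (alpha, beta, gamma))
print(h, exp(h))  # h is approximately 1.013455, exp(h) approximately 2.7551
```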

Remark 3 As noted in the Ellenberg and Gijswijt papers, the above argument extends readily to fields other than ${{\bf F}_3}$, controlling the maximal size of a subset of ${{\bf F}^n}$ that has no non-trivial solutions to the equation ${ax+by+cz=0}$, where ${a,b,c \in {\bf F}}$ are non-zero constants that sum to zero. Of course, one replaces the function ${(x,y,z) \mapsto \delta_{0^n}(x+y+z)}$ in Lemma 2 by ${(x,y,z) \mapsto \delta_{0^n}(ax+by+cz)}$ in this case.

Remark 4 This symmetrised formulation suggests that one could improve slightly on the numerical quantity ${2.756}$ by finding a more efficient way to decompose ${\delta_{0^n}(x+y+z)}$ into rank one functions; however, I was not able to do so (though such improvements are reminiscent of the Strassen-type algorithms for fast matrix multiplication).

Remark 5 It is tempting to see if this method can get non-trivial upper bounds for sets ${A}$ with no length ${4}$ progressions, in (say) ${{\bf F}_5^n}$. One can run the above arguments, replacing the function

$\displaystyle (x,y,z) \mapsto \delta_{0^n}(x+y+z)$

with

$\displaystyle (x,y,z,w) \mapsto \delta_{0^n}(x-2y+z) \delta_{0^n}(y-2z+w);$

this leads to the bound ${|A| \leq 4N}$ where

$\displaystyle N := \sum_{a,b,c,d,e \geq 0: a+b+c+d+e=n, b+2c+3d+4e \leq 2n} \frac{n!}{a!b!c!d!e!}.$

Unfortunately, ${N}$ is asymptotic to ${\frac{1}{2} 5^n}$ and so this bound is in fact slightly worse than the trivial bound ${|A| \leq 5^n}$! However, there is a slim chance that there is a more efficient way to decompose ${\delta_{0^n}(x-2y+z) \delta_{0^n}(y-2z+w)}$ into rank one functions that would give a non-trivial bound on ${|A|}$. I experimented with a few possible such decompositions, but unfortunately without success.
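The asymptotics in this remark can be checked numerically (an illustrative script, with the count organised by the number of coordinates taking each exponent value): the fraction ${N/5^n}$ approaches ${1/2}$ from above, so ${4N}$ exceeds ${5^n}$:

```python
from math import factorial

def N(n):
    """Sum of n!/(a!b!c!d!e!) over a+b+c+d+e = n with b+2c+3d+4e <= 2n;
    the total degree of n uniform exponents in {0,...,4} has mean 2n."""
    total = 0
    for b in range(n + 1):
        for c in range(n - b + 1):
            for d in range(n - b - c + 1):
                for e in range(n - b - c - d + 1):
                    if b + 2 * c + 3 * d + 4 * e <= 2 * n:
                        a = n - b - c - d - e
                        total += (factorial(n) // (factorial(a) * factorial(b) *
                                  factorial(c) * factorial(d) * factorial(e)))
    return total

vals = {n: N(n) for n in (5, 10, 20, 40)}
for n, Nn in vals.items():
    print(n, Nn / 5**n)   # tends to 1/2 from above
    assert 4 * Nn > 5**n  # so the bound |A| <= 4N is worse than trivial
```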

Remark 6 Return now to the capset problem. Since Lemma 1 is valid for any field ${{\bf F}}$, one could perhaps hope to get better bounds by viewing the Kronecker delta function ${\delta}$ as taking values in another field than ${{\bf F}_3}$, such as the complex numbers ${{\bf C}}$. However, as soon as one works in a field of characteristic other than ${3}$, one can adjoin a cube root ${\omega}$ of unity, and one now has the Fourier decomposition

$\displaystyle \delta_{0^n}(x+y+z) = \frac{1}{3^n} \sum_{\xi \in {\bf F}_3^n} \omega^{\xi \cdot x} \omega^{\xi \cdot y} \omega^{\xi \cdot z}.$

Moving to the Fourier basis, we conclude from Lemma 1 that the function ${(x,y,z) \mapsto \delta_{0^n}(x+y+z)}$ on ${({\bf F}_3^n)^3}$ now has rank exactly ${3^n}$, and so one cannot improve upon the trivial bound of ${|A| \leq 3^n}$ by this method using fields of characteristic other than three as the range field. So it seems one has to stick with ${{\bf F}_3}$ (or an algebraic closure thereof).
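The obstruction can be seen concretely already for ${n=1}$: the following snippet (illustrative only) verifies the Fourier decomposition over ${\bf C}$, which writes ${\delta_0(x+y+z)}$ as a combination of rank one functions in which all ${3}$ Fourier coefficients are non-zero, so Lemma 1 (after the change of basis) pins the rank at exactly ${3}$:

```python
import cmath
from itertools import product

omega = cmath.exp(2j * cmath.pi / 3)  # primitive cube root of unity in C

# n = 1 case: delta_0(x+y+z) = (1/3) sum_xi omega^{xi x} omega^{xi y} omega^{xi z}
for x, y, z in product(range(3), repeat=3):
    fourier = sum(omega ** (xi * (x + y + z)) for xi in range(3)) / 3
    delta = 1 if (x + y + z) % 3 == 0 else 0
    assert abs(fourier - delta) < 1e-9
print("Fourier decomposition of delta_0(x+y+z) verified for n = 1")
```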

Thanks to Jordan Ellenberg and Ben Green for helpful discussions.

Michael Nielsen has posted a draft of his article “Introduction to the Polymath Project and “Density Hales-Jewett and Moser Numbers”” for comments, which will precede our own polymath article in the Szemerédi birthday conference proceedings.

As an unrelated link, Jordan Ellenberg has just written a piece for the Washington Post on statistical sampling and how it could make the US census more accurate.  (Whether this is politically viable is, of course, a different matter.)

Jordan Ellenberg, Richard Oberlin, and I have just uploaded to the arXiv the paper “The Kakeya set and maximal conjectures for algebraic varieties over finite fields“, submitted to Mathematika.  This paper builds upon some work of Dvir and later authors on the Kakeya problem in finite fields, which I have discussed in this earlier blog post.  Dvir established the following:

Kakeya set conjecture for finite fields. Let F be a finite field, and let E be a subset of $F^n$ that contains a line in every direction.  Then E has cardinality at least $c_n |F|^n$ for some $c_n > 0$.

The initial argument of Dvir gave $c_n = 1/n!$.  This was improved to $c_n = c^n$ for some explicit $0 < c < 1$ by Saraf and Sudan, and recently to $c_n =1/2^n$ by Dvir, Kopparty, Saraf, and Sudan, which is within a factor 2 of the optimal result.

In our work we investigate a somewhat different set of improvements to Dvir’s result.  The first concerns the Kakeya maximal function $f^*: {\Bbb P}^{n-1}(F) \to {\Bbb R}$ of a function $f: F^n \to {\Bbb R}$, defined for all directions $\xi \in {\Bbb P}^{n-1}(F)$ in the projective hyperplane at infinity by the formula

$f^*(\xi) = \sup_{\ell // \xi} \sum_{x \in \ell} |f(x)|$

where the supremum ranges over all lines $\ell$ in $F^n$ oriented in the direction $\xi$.  Our first result is the endpoint $L^p$ estimate for this operator, namely

Kakeya maximal function conjecture in finite fields. We have $\| f^* \|_{\ell^n({\Bbb P}^{n-1}(F))} \leq C_n |F|^{(n-1)/n} \|f\|_{\ell^n(F^n)}$ for some constant $C_n > 0$.

This result implies Dvir’s result, since if f is the indicator function of the set E in Dvir’s result, then $f^*(\xi) = |F|$ for every $\xi \in {\Bbb P}^{n-1}(F)$.  However, it also gives information on more general sets E which do not necessarily contain a line in every direction, but instead contain a certain fraction of a line in a subset of directions.  The exponents here are best possible in the sense that all other $\ell^p \to \ell^q$ mapping properties of the operator can be deduced (with bounds that are optimal up to constants) by interpolating the above estimate with more trivial estimates.  This result is the finite field analogue of a long-standing (and still open) conjecture for the Kakeya maximal function in Euclidean spaces; we rely on the polynomial method of Dvir, which thus far has not extended to the Euclidean setting (but note the very interesting variant of this method by Guth that has established the endpoint multilinear Kakeya maximal function estimate in this setting, see this blog post for further discussion).
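To make the objects concrete, here is a brute-force computation of the Kakeya maximal function in the toy case ${F = {\bf F}_5}$, $n=2$ (an illustration of the definitions, not of the proof); for the indicator function of all of $F^n$, the ratio of the two sides of the maximal estimate gives a lower bound on the constant $C_n$:

```python
from itertools import product

p, n = 5, 2  # F = F_5, small enough to enumerate every line
points = list(product(range(p), repeat=n))

# directions in P^{n-1}(F) for n = 2: the p + 1 classes (1, t) and (0, 1)
directions = [(1, t) for t in range(p)] + [(0, 1)]

def line(x, xi):
    """The p points of the line through x in direction xi."""
    return [tuple((a + s * b) % p for a, b in zip(x, xi)) for s in range(p)]

def maximal(f):
    """f*(xi) = sup over lines in direction xi of the sum of |f| on the line."""
    return {xi: max(sum(abs(f[y]) for y in line(x, xi)) for x in points)
            for xi in directions}

f = {x: 1 for x in points}                  # indicator of all of F^n
fstar = maximal(f)
assert all(v == p for v in fstar.values())  # every line carries exactly |F| points

lhs = sum(v**n for v in fstar.values()) ** (1 / n)
rhs = p ** ((n - 1) / n) * sum(abs(v)**n for v in f.values()) ** (1 / n)
print(lhs / rhs)  # a lower bound on C_n in the maximal function conjecture
```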

It turns out that a direct application of the polynomial method is not sufficient to recover the full strength of the maximal function estimate; but by combining the polynomial method with the Nikishin-Maurey-Pisier-Stein "method of random rotations" (as interpreted nowadays by Stein and later by Bourgain, and originally inspired by the factorisation theorems of Nikishin, Maurey, and Pisier), one can already recover a "restricted weak type" version of the above estimate.  If one then enhances the polynomial method with the "method of multiplicities" (as introduced by Saraf and Sudan), one can recover the full "strong type" estimate; a few more details below the fold.

It turns out that one can generalise the above results to more general affine or projective algebraic varieties over finite fields.  In particular, we showed

Kakeya maximal function conjecture in algebraic varieties. Suppose that $W \subset {\Bbb P}^N$ is an (n-1)-dimensional algebraic variety.  Let $d \geq 1$ be an integer. Then we have

$\| \sup_{\gamma \ni x; \gamma \not \subset W} \sum_{y \in \gamma} f(y) \|_{\ell^n_x(W(F))} \leq C_{n,d,N,W} |F|^{(n-1)/n} \|f\|_{\ell^n({\Bbb P}^N(F))}$

for some constant $C_{n,d,N,W} > 0$, where the supremum is over all irreducible algebraic curves $\gamma$ of degree at most d that pass through x but do not lie in W, and W(F) denotes the F-points of W.

The ordinary Kakeya maximal function conjecture corresponds to the case when N=n, W is the hyperplane at infinity, and the degree d is equal to 1.  One corollary of this estimate is a Dvir-type result: a subset of ${\Bbb P}^N(F)$ which contains, for each x in W(F), an irreducible algebraic curve of degree d passing through x but not lying in W, has cardinality $\gg |F|^n$ if $|W(F)| \gg |F|^{n-1}$.  (In particular this implies a lower bound for Nikodym sets worked out by Li.)  The dependence of the implied constant on W is only via the degree of W.

The techniques used in the flat case can easily handle curves $\gamma$ of higher degree (provided that we allow the implied constants to depend on d), but the method of random rotations does not seem to work directly on the algebraic variety W as there are usually no symmetries of this variety to exploit.  Fortunately, we can get around this by using a “random projection trick” to “flatten” W into a hyperplane (after first expressing W as the zero locus of some polynomials, and then composing with the graphing map for such polynomials), reducing the non-flat case to the flat case.

Below the fold, I wish to sketch two of the key ingredients in our arguments, the random rotations method and the random projections trick.  (We of course also use some algebraic geometry, but mostly low-tech stuff, on the level of Bezout’s theorem, though we do need one non-trivial result of Kleiman (from SGA6), that asserts that bounded degree varieties can be cut out by a bounded number of polynomials of bounded degree.)

[Update, March 14: See also Jordan’s own blog post on our paper.]