You are currently browsing the tag archive for the ‘polynomial method’ tag.

Let {F} be a finite field, with algebraic closure {\overline{F}}, and let {V} be an (affine) algebraic variety defined over {\overline{F}}, by which I mean a set of the form

\displaystyle  V = \{ x \in \overline{F}^d: P_1(x) = \ldots = P_m(x) = 0 \}

for some ambient dimension {d \geq 0}, and some finite number of polynomials {P_1,\ldots,P_m: \overline{F}^d \rightarrow \overline{F}}. In order to reduce the number of subscripts later on, let us say that {V} has complexity at most {M} if {d}, {m}, and the degrees of the {P_1,\ldots,P_m} are all less than or equal to {M}. Note that we do not require at this stage that {V} be irreducible (i.e. not the union of two strictly smaller varieties), or defined over {F}, though we will often specialise to these cases later in this post. (Also, everything said here can also be applied with almost no changes to projective varieties, but we will stick with affine varieties for the sake of concreteness.)

One can consider two crude measures of how “big” the variety {V} is. The first measure, which is algebraic geometric in nature, is the dimension {\hbox{dim}(V)} of the variety {V}, which is an integer between {0} and {d} (or, depending on convention, {-\infty}, {-1}, or undefined, if {V} is empty) that can be defined in a large number of ways (e.g. it is the largest {r} for which the generic linear projection from {V} to {\overline{F}^r} is dominant, or the smallest {r} for which the intersection with a generic codimension {r} subspace is finite). The second measure, which is number-theoretic in nature, is the number {|V(F)| = |V \cap F^d|} of {F}-points of {V}, i.e. points {x = (x_1,\ldots,x_d)} in {V} all of whose coordinates lie in the finite field {F}, or equivalently the number of solutions to the system of equations {P_i(x_1,\ldots,x_d) = 0} for {i=1,\ldots,m} with variables {x_1,\ldots,x_d} in {F}.

These two measures are linked together in a number of ways. For instance, we have the basic Schwarz-Zippel type bound (which, in this qualitative form, goes back at least to Lemma 1 of the work of Lang and Weil in 1954).

Lemma 1 (Schwarz-Zippel type bound) Let {V} be a variety of complexity at most {M}. Then we have {|V(F)| \ll_M |F|^{\hbox{dim}(V)}}.

Proof: (Sketch) For the purposes of exposition, we will not carefully track the dependencies of implied constants on the complexity {M}, instead simply assuming that all of these quantities remain controlled throughout the argument. (If one wished, one could obtain ineffective bounds on these quantities by an ultralimit argument, as discussed in this previous post, or equivalently by moving everything over to a nonstandard analysis framework; one could also obtain such uniformity using the machinery of schemes.)

We argue by induction on the ambient dimension {d} of the variety {V}. The {d=0} case is trivial, so suppose {d \geq 1} and that the claim has already been proven for {d-1}. By breaking up {V} into irreducible components we may assume that {V} is irreducible (this requires some control on the number and complexity of these components, but this is available, as discussed in this previous post). For each {x_1,\ldots,x_{d-1} \in \overline{F}}, the fibre {\{ x_d \in \overline{F}: (x_1,\ldots,x_{d-1},x_d) \in V \}} is either one-dimensional (and thus all of {\overline{F}}) or zero-dimensional. In the latter case, one has {O_M(1)} points in the fibre from the fundamental theorem of algebra (indeed one has a bound of {M} in this case, as the defining polynomials have degree at most {M}), and {(x_1,\ldots,x_{d-1})} lives in the projection of {V} to {\overline{F}^{d-1}}, which is a variety of dimension at most {\hbox{dim}(V)} and controlled complexity, so the contribution of this case is acceptable from the induction hypothesis. In the former case, the fibre contributes {|F|} {F}-points, but {(x_1,\ldots,x_{d-1})} lies in a variety in {\overline{F}^{d-1}} of dimension at most {\hbox{dim}(V)-1} (since otherwise {V} would contain a subvariety of dimension at least {\hbox{dim}(V)+1}, which is absurd) and controlled complexity, and so the contribution of this case is also acceptable from the induction hypothesis. \Box

One can improve the bound on the implied constant to be linear in the degree of {V} (see e.g. Claim 7.2 of this paper of Dvir, Kollar, and Lovett, or Lemma A.3 of this paper of Ellenberg, Oberlin, and myself), but we will not be concerned with these improvements here.

Without further hypotheses on {V}, the above upper bound is sharp (except for improvements in the implied constants). For instance, the variety

\displaystyle  V := \{ (x_1,\ldots,x_d) \in \overline{F}^d: \prod_{j=1}^D (x_d - a_j) = 0\},

where {a_1,\ldots,a_D \in F} are distinct, is the union of {D} distinct hyperplanes of dimension {d-1}, with {|V(F)| = D |F|^{d-1}} and complexity {\max(D,d)}; similar examples can easily be concocted for other choices of {\hbox{dim}(V)}. In the other direction, there is also no non-trivial lower bound for {|V(F)|} without further hypotheses on {V}. For a trivial example, if {a} is an element of {\overline{F}} that does not lie in {F}, then the hyperplane

\displaystyle  V := \{ (x_1,\ldots,x_d) \in \overline{F}^d: x_d - a = 0 \}

clearly has no {F}-points whatsoever, despite being a {d-1}-dimensional variety in {\overline{F}^d} of complexity {d}. For a slightly less trivial example, if {a} is an element of {F} that is not a quadratic residue, then the variety

\displaystyle  V := \{ (x_1,\ldots,x_d) \in \overline{F}^d: x_d^2 - a = 0 \},

which is the union of two hyperplanes, still has no {F}-points, even though this time the variety is defined over {F} instead of {\overline{F}} (by which we mean that the defining polynomial(s) have all of their coefficients in {F}). There is however the important Lang-Weil bound that allows for a much better estimate as long as {V} is both defined over {F} and irreducible:

Theorem 2 (Lang-Weil bound) Let {V} be a variety of complexity at most {M}. Assume that {V} is defined over {F}, and that {V} is irreducible as a variety over {\overline{F}} (i.e. {V} is geometrically irreducible or absolutely irreducible). Then

\displaystyle  |V(F)| = (1 + O_M(|F|^{-1/2})) |F|^{\hbox{dim}(V)}.

Again, more explicit bounds on the implied constant here are known, but will not be the focus of this post. As the previous examples show, the hypotheses of definability over {F} and geometric irreducibility are both necessary.
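The examples above can be checked by brute force over a small prime field. The following is a minimal Python sketch, assuming {F = {\bf F}_7} and {d=3}; the helper `count_points` and the three sample polynomials are illustrative choices, not taken from the original argument.

```python
from itertools import product

def count_points(polys, d, p):
    """Brute-force count of x in F_p^d with P(x) = 0 (mod p) for every P in polys."""
    return sum(
        all(P(*x) % p == 0 for P in polys)
        for x in product(range(p), repeat=d)
    )

p, d = 7, 3  # assumed small prime field and ambient dimension

# Union of D = 3 hyperplanes x_3 = a_j with distinct a_j in F_p:
# the count is exactly D * p^(d-1), so the upper bound of Lemma 1 is attained.
union = lambda x1, x2, x3: (x3 - 1) * (x3 - 2) * (x3 - 4)
print(count_points([union], d, p), 3 * p ** (d - 1))

# x_3^2 = 3, where 3 is a quadratic non-residue mod 7:
# defined over F_p but geometrically reducible, and it has no F_p-points.
nonresidue = lambda x1, x2, x3: x3 * x3 - 3
print(count_points([nonresidue], d, p))

# The surface x_1 x_2 = x_3 is a graph, hence geometrically irreducible:
# the count is p^(dim V) = p^2, consistent with Theorem 2.
graph = lambda x1, x2, x3: x1 * x2 - x3
print(count_points([graph], d, p), p ** (d - 1))
```

The brute-force search costs {|F|^d} evaluations, so it is only meant for sanity checks at very small sizes.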

The Lang-Weil bound is already non-trivial in the model case {d=2, \hbox{dim}(V)=1} of plane curves:

Theorem 3 (Hasse-Weil bound) Let {P: \overline{F}^2 \rightarrow \overline{F}} be an irreducible polynomial of degree {D} with coefficients in {F}. Then

\displaystyle  |\{ (x,y) \in F^2: P(x,y) = 0 \}| = |F| + O_D( |F|^{1/2} ).

Thus, for instance, if {a,b \in F}, then the elliptic curve {\{ (x,y) \in F^2: y^2 = x^3 + ax + b \}} has {|F| + O(|F|^{1/2})} {F}-points, a result first established by Hasse. The Hasse-Weil bound is already quite non-trivial, being the analogue of the Riemann hypothesis for plane curves. For hyper-elliptic curves, an elementary proof (due to Stepanov) is discussed in this previous post. For general plane curves, the first proof was by Weil (leading to his famous Weil conjectures); there is also a nice version of Stepanov’s argument due to Bombieri covering this case which is a little less elementary (relying crucially on the Riemann-Roch theorem for the upper bound, and a lifting trick to then get the lower bound), which I briefly summarise later in this post. The full Lang-Weil bound is deduced from the Hasse-Weil bound by an induction argument using generic hyperplane slicing, as I will also summarise later in this post.
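For elliptic curves in short Weierstrass form over a prime field of characteristic greater than {3}, the implied constant in Hasse's bound is explicit: the affine count differs from {|F|} by at most {2|F|^{1/2}}. Here is a minimal Python sketch that checks this numerically; the function names and the sample parameters are assumptions made for illustration.

```python
def affine_point_count(a, b, p):
    """Number of (x, y) in F_p^2 with y^2 = x^3 + a*x + b."""
    # squares[r] = number of y in F_p with y^2 = r
    squares = [0] * p
    for y in range(p):
        squares[y * y % p] += 1
    return sum(squares[(x ** 3 + a * x + b) % p] for x in range(p))

def hasse_deviation(a, b, p):
    """Difference between the affine point count and p."""
    return affine_point_count(a, b, p) - p

p = 101  # assumed sample prime
for a, b in [(1, 1), (2, 3), (0, 7)]:
    if (4 * a ** 3 + 27 * b ** 2) % p:  # skip singular curves
        dev = hasse_deviation(a, b, p)
        # Hasse: dev^2 <= 4p, i.e. |dev| <= 2 sqrt(p)
        print(a, b, dev, dev * dev <= 4 * p)
```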

The hypotheses of definability over {F} and geometric irreducibility in the Lang-Weil bound can be removed after inserting a geometric factor:

Corollary 4 (Lang-Weil bound, alternate form) Let {V} be a variety of complexity at most {M}. Then one has

\displaystyle  |V(F)| = (c(V) + O_M(|F|^{-1/2})) |F|^{\hbox{dim}(V)}

where {c(V)} is the number of top-dimensional components of {V} (i.e. geometrically irreducible components of {V} of dimension {\hbox{dim}(V)}) that are definable over {F}, or equivalently are invariant with respect to the Frobenius endomorphism {x \mapsto x^{|F|}} that defines {F}.

Proof: By breaking up a general variety {V} into components (and using Lemma 1 to dispose of any lower-dimensional components), it suffices to establish this claim when {V} is itself geometrically irreducible. If {V} is definable over {F}, the claim follows from Theorem 2. If {V} is not definable over {F}, then it is not fixed by the Frobenius endomorphism {Frob} (since otherwise one could produce a set of defining polynomials that were fixed by Frobenius and thus defined over {F} by using some canonical basis (such as a reduced Gröbner basis) for the associated ideal), and so {V \cap Frob(V)} has strictly smaller dimension than {V}. But {V \cap Frob(V)} captures all the {F}-points of {V}, so in this case the claim follows from Lemma 1. \Box

Note that if {V} is reducible but is itself defined over {F}, then the Frobenius endomorphism preserves {V} itself, but may permute the components of {V} around. In this case, {c(V)} is the number of fixed points of this permutation action of Frobenius on the components. In particular, {c(V)} is always a natural number between {0} and {O_M(1)}; thus we see that regardless of the geometry of {V}, the normalised count {|V(F)|/|F|^{\hbox{dim}(V)}} is asymptotically restricted to a bounded range of natural numbers (in the regime where the complexity stays bounded and {|F|} goes to infinity).

Example 1 Consider the variety

\displaystyle  V := \{ (x,y) \in \overline{F}^2: x^2 - ay^2 = 0 \}

for some non-zero parameter {a \in F}. Geometrically (by which we basically mean “when viewed over the algebraically closed field {\overline{F}}”), this is the union of two lines, with slopes corresponding to the two square roots of {a}. If {a} is a quadratic residue, then both of these lines are defined over {F}, and are fixed by Frobenius, and {c(V) = 2} in this case. If {a} is not a quadratic residue, then the lines are not defined over {F}, and the Frobenius automorphism permutes the two lines while preserving {V} as a whole, giving {c(V)=0} in this case.
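Example 1 is easy to confirm numerically. The sketch below (plain Python, with {F = {\bf F}_p} for an odd prime {p} as an assumption, and hypothetical helper names) counts the {F}-points of {x^2 - ay^2 = 0}: one gets {2p-1} points when {a} is a quadratic residue and just the origin otherwise, so that {|V(F)|/|F|} is close to {c(V)=2} or {c(V)=0} respectively.

```python
def count_conic(a, p):
    """Number of (x, y) in F_p^2 with x^2 - a*y^2 = 0."""
    return sum((x * x - a * y * y) % p == 0 for x in range(p) for y in range(p))

def is_residue(a, p):
    """Euler's criterion for a nonzero a modulo an odd prime p."""
    return pow(a, (p - 1) // 2, p) == 1

p = 7  # assumed sample prime
for a in range(1, p):
    # residue: two lines through the origin; non-residue: only the origin
    print(a, is_residue(a, p), count_conic(a, p))
```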

Corollary 4 effectively computes (at least to leading order) the number-theoretic size {|V(F)|} of a variety in terms of geometric information about {V}, namely its dimension {\hbox{dim}(V)} and the number {c(V)} of top-dimensional components fixed by Frobenius. It turns out that with a little bit more effort, one can extend this connection to cover not just a single variety {V}, but a family of varieties indexed by points in some base space {W}. More precisely, suppose we now have two affine varieties {V,W} of bounded complexity, together with a regular map {\phi: V \rightarrow W} of bounded complexity (the definition of complexity of a regular map is a bit technical, see e.g. this paper, but one can think for instance of a polynomial or rational map of bounded degree as a good example). It will be convenient to assume that the base space {W} is irreducible. If the map {\phi} is a dominant map (i.e. the image {\phi(V)} is Zariski dense in {W}), then standard algebraic geometry results tell us that the fibres {\phi^{-1}(\{w\})} are an unramified family of {\hbox{dim}(V)-\hbox{dim}(W)}-dimensional varieties outside of an exceptional subset {W'} of {W} of dimension strictly smaller than {\hbox{dim}(W)} (and with {\phi^{-1}(W')} having dimension strictly smaller than {\hbox{dim}(V)}); see e.g. Section I.6.3 of Shafarevich.

Now suppose that {V}, {W}, and {\phi} are defined over {F}. Then, by Lang-Weil, {W(F)} has {(1 + O(|F|^{-1/2})) |F|^{\hbox{dim}(W)}} {F}-points, and by Schwarz-Zippel, for all but {O( |F|^{\hbox{dim}(W)-1})} of these {F}-points {w} (the ones that lie in the subvariety {W'}), the fibre {\phi^{-1}(\{w\})} is an algebraic variety defined over {F} of dimension {\hbox{dim}(V)-\hbox{dim}(W)}. By using ultraproduct arguments (see e.g. Lemma 3.7 of this paper of mine with Emmanuel Breuillard and Ben Green), this variety can be shown to have bounded complexity, and thus by Corollary 4, has {(c(\phi^{-1}(\{w\})) + O(|F|^{-1/2})) |F|^{\hbox{dim}(V)-\hbox{dim}(W)}} {F}-points. One can then ask how the quantity {c(\phi^{-1}(\{w\}))} is distributed. A simple but illustrative example occurs when {V=W=F} and {\phi: F \rightarrow F} is the polynomial {\phi(x) := x^2}. Then {c(\phi^{-1}(\{w\}))} equals {2} when {w} is a non-zero quadratic residue and {0} when {w} is a non-zero quadratic non-residue (and {1} when {w} is zero, but this is a negligible fraction of all {w}). In particular, in the asymptotic limit {|F| \rightarrow \infty}, {c(\phi^{-1}(\{w\}))} is equal to {2} half of the time and {0} half of the time.

Now we describe the asymptotic distribution of the {c(\phi^{-1}(\{w\}))}. We need some additional notation. Let {w_0} be an {F}-point in {W \backslash W'}, and let {\pi_0( \phi^{-1}(\{w_0\}) )} be the connected components of the fibre {\phi^{-1}(\{w_0\})}. As {\phi^{-1}(\{w_0\})} is defined over {F}, this set of components is permuted by the Frobenius endomorphism {Frob}. But there is also an action by monodromy of the fundamental group {\pi_1(W \backslash W')} (this requires a certain amount of étale machinery to properly set up, as we are working over a positive characteristic field rather than over the complex numbers, but I am going to ignore this rather important detail here, as I still don’t fully understand it). This fundamental group may be infinite, but (by the étale construction) is always profinite, and in particular has a Haar probability measure, in which every finite index subgroup (and their cosets) are measurable. Thus we may meaningfully talk about elements drawn uniformly at random from this group, so long as we work only with the profinite {\sigma}-algebra on {\pi_1(W \backslash W')} that is generated by the cosets of the finite index subgroups of this group (which will be the only relevant sets we need to measure when considering the action of this group on finite sets, such as the components of a generic fibre).

Theorem 5 (Lang-Weil with parameters) Let {V, W} be varieties of complexity at most {M} with {W} irreducible, and let {\phi: V \rightarrow W} be a dominant map of complexity at most {M}. Let {w_0} be an {F}-point of {W \backslash W'}. Then, for any natural number {a}, one has {c(\phi^{-1}(\{w\})) = a} for {(\mathop{\bf P}( X = a ) + O_M(|F|^{-1/2})) |F|^{\hbox{dim}(W)}} values of {w \in W(F)}, where {X} is the random variable that counts the number of components of a generic fibre {\phi^{-1}(\{w_0\})} that are invariant under {g \circ Frob}, where {g} is an element chosen uniformly at random from the étale fundamental group {\pi_1(W \backslash W')}. In particular, in the asymptotic limit {|F| \rightarrow \infty}, and with {w} chosen uniformly at random from {W(F)}, {c(\phi^{-1}(\{w\}))} (or, equivalently, {|\phi^{-1}(\{w\})(F)| / |F|^{\hbox{dim}(V)-\hbox{dim}(W)}}) and {X} have the same asymptotic distribution.

This theorem generalises Corollary 4 (which is the case when {W} is just a point, so that {\phi^{-1}(\{w\})} is just {V} and {g} is trivial). Informally, the effect of a non-trivial parameter space {W} on the Lang-Weil bound is to push around the Frobenius map by monodromy for the purposes of counting invariant components, and a randomly chosen set of parameters corresponds to a randomly chosen loop on which to perform monodromy.

Example 2 Let {V=W=F} and {\phi(x) = x^m} for some fixed {m \geq 1}; to avoid some technical issues let us suppose that {m} is coprime to {|F|}. Then {W'} can be taken to be {\{0\}}, and for a base point {w_0 \in W \backslash W'} we can take {w_0=1}. The fibre {\phi^{-1}(\{1\})} – the {m^{th}} roots of unity – can be identified with the cyclic group {{\bf Z}/m{\bf Z}} by using a primitive root of unity. The étale fundamental group {\pi_1(W \backslash W') = \pi_1(\overline{F} \backslash \{0\})} is (I think) isomorphic to the profinite closure {\hat {\bf Z}} of the integers {{\bf Z}} (excluding the part of that closure coming from the characteristic of {F}). Not coincidentally, the integers {{\bf Z}} are the fundamental group of the complex analogue {{\bf C} \backslash \{0\}} of {W \backslash W'}. (Brian Conrad points out to me though that for more complicated varieties, such as covers of {\overline{F} \backslash \{0\}} by a power of the characteristic, the étale fundamental group is more complicated than just a profinite closure of the ordinary fundamental group, due to the presence of Artin-Schreier covers that are only ramified at infinity.) The action of this fundamental group on the fibres {{\bf Z}/m{\bf Z}} can be given by translation. Meanwhile, the Frobenius map {Frob} on {{\bf Z}/m{\bf Z}} is given by multiplication by {|F|}. A random element {g \circ Frob} then becomes a random affine map {x \mapsto |F|x+b} on {{\bf Z}/m{\bf Z}}, where {b} is chosen uniformly at random from {{\bf Z}/m{\bf Z}}. The number of fixed points of this map is equal to the greatest common divisor {(|F|-1,m)} of {|F|-1} and {m} when {b} is divisible by {(|F|-1,m)}, and equal to {0} otherwise. This matches up with the elementary number-theoretic fact that a randomly chosen non-zero element of {F} will be an {m^{th}} power with probability {1/(|F|-1,m)}, and when this occurs, the number of {m^{th}} roots in {F} will be {(|F|-1,m)}.
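Both sides of the comparison in Example 2 can be computed directly. The following minimal Python sketch assumes {F = {\bf F}_p} is a prime field and that {m} is coprime to {p}; the two function names are illustrative. The first function tabulates the number of {m^{th}} roots of a non-zero {w \in F}, and the second tabulates the number of fixed points of {x \mapsto |F|x+b} on {{\bf Z}/m{\bf Z}}.

```python
from collections import Counter
from fractions import Fraction

def root_count_distribution(p, m):
    """Distribution of #{x in F_p : x^m = w} as w ranges over nonzero elements of F_p."""
    roots = Counter(pow(x, m, p) for x in range(1, p))
    counts = Counter(roots.get(w, 0) for w in range(1, p))
    return {k: Fraction(v, p - 1) for k, v in counts.items()}

def fixed_point_distribution(q, m):
    """Distribution of #{x in Z/m : q*x + b = x} as b ranges uniformly over Z/m."""
    counts = Counter(
        sum((q * x + b - x) % m == 0 for x in range(m)) for b in range(m)
    )
    return {k: Fraction(v, m) for k, v in counts.items()}

# Assumed sample parameters (p prime, m coprime to p).
for p, m in [(13, 4), (7, 3), (11, 3)]:
    print(p, m, root_count_distribution(p, m), fixed_point_distribution(p, m))
```

In this simple example the two distributions agree exactly, not just asymptotically, since both are supported on {0} and {(|F|-1,m)}.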

Example 3 (Thanks to Jordan Ellenberg for this example.) Consider a random elliptic curve {E = \{ y^2 = x^3 + ax + b \}}, where {a,b} are chosen uniformly at random, and let {m \geq 1}. Let {E[m]} be the {m}-torsion points of {E} (i.e. those elements {g \in E} with {mg = 0} using the elliptic curve addition law); as a group, this is isomorphic to {{\bf Z}/m{\bf Z} \times {\bf Z}/m{\bf Z}} (assuming that {F} has sufficiently large characteristic, for simplicity), and consider the number of {F} points of {E[m]}, which is a random variable taking values in the natural numbers between {0} and {m^2}. In this case, the base variety {W} is the modular curve {X(1)}, and the covering variety {V} is the modular curve {X_1(m)}. The generic fibre here can be identified with {{\bf Z}/m{\bf Z} \times {\bf Z}/m{\bf Z}}, the monodromy action projects down to the action of {SL_2({\bf Z}/m{\bf Z})}, and the action of Frobenius on this fibre can be shown to be given by a {2 \times 2} matrix with determinant {|F|} (with the exact choice of matrix depending on the choice of fibre and of the identification), so the distribution of the number of {F}-points of {E[m]} is asymptotic to the distribution of the number of fixed points {X} of a random linear map of determinant {|F|} on {{\bf Z}/m{\bf Z} \times {\bf Z}/m{\bf Z}}.
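The limiting random variable {X} in Example 3 can be tabulated by enumeration for small {m}. The sketch below is one reading of the statement, under the assumption that "a random linear map of determinant {|F|}" means a matrix drawn uniformly from all {2 \times 2} matrices over {{\bf Z}/m{\bf Z}} with that determinant (and that {|F|} is coprime to {m}); the function name is hypothetical.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

def torsion_fixed_point_distribution(q, m):
    """Distribution of the number of fixed points on (Z/m)^2 of a uniformly
    chosen 2x2 matrix over Z/m with determinant q (mod m)."""
    counts = Counter()
    for a, b, c, d in product(range(m), repeat=4):
        if (a * d - b * c - q) % m:
            continue  # wrong determinant
        fixed = sum(
            (a * x + b * y - x) % m == 0 and (c * x + d * y - y) % m == 0
            for x, y in product(range(m), repeat=2)
        )
        counts[fixed] += 1
    total = sum(counts.values())
    return {k: Fraction(v, total) for k, v in counts.items()}

# m = 2 with |F| odd: the six matrices of SL_2(Z/2Z).
print(torsion_fixed_point_distribution(3, 2))
```

For {m=2} this gives the values {4}, {2}, {1} with probabilities {1/6}, {1/2}, {1/3}, corresponding to the cubic {x^3+ax+b} having three, one, or no roots in {F}.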

Theorem 5 seems to be well known “folklore” among arithmetic geometers, though I do not know of an explicit reference for it. I enjoyed deriving it for myself (though my derivation is somewhat incomplete due to my lack of understanding of étale cohomology) from the ordinary Lang-Weil theorem and the moment method. I’m recording this derivation later in this post, mostly for my own benefit (as I am still in the process of learning this material), though perhaps some other readers may also be interested in it.

Caveat: not all details are fully fleshed out in this writeup, particularly those involving the finer points of algebraic geometry and étale cohomology, as my understanding of these topics is not as complete as I would like it to be.

Many thanks to Brian Conrad and Jordan Ellenberg for helpful discussions on these topics.


The ham sandwich theorem asserts that, given {d} bounded open sets {U_1,\ldots,U_d} in {{\bf R}^d}, there exists a hyperplane {\{ x \in {\bf R}^d: x \cdot v = c \}} that bisects each of these sets {U_i}, in the sense that each of the two half-spaces {\{ x \in {\bf R}^d: x \cdot v < c \}, \{ x \in {\bf R}^d: x \cdot v > c \}} on either side of the hyperplane captures exactly half of the volume of {U_i}. The shortest proof of this result proceeds by invoking the Borsuk-Ulam theorem.

A useful generalisation of the ham sandwich theorem is the polynomial ham sandwich theorem, which asserts that given {m} bounded open sets {U_1,\ldots,U_m} in {{\bf R}^d}, there exists a hypersurface {\{ x \in {\bf R}^d: Q(x)=0\}} of degree {O_d( m^{1/d} )} (thus {Q: {\bf R}^d \rightarrow {\bf R}} is a polynomial of degree {O_d(m^{1/d})}) such that the two semi-algebraic sets {\{ Q > 0 \}} and {\{ Q < 0\}} each capture half the volume of each of the {U_i}. (More precisely, the degree will be at most {D}, where {D} is the first positive integer for which {\binom{D+d}{d}} exceeds {m}.) This theorem can be deduced from the Borsuk-Ulam theorem in the same manner that the ordinary ham sandwich theorem is (and can also be deduced directly from the ordinary ham sandwich theorem via the Veronese embedding).
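The precise degree bound is simple to compute. Here is a short Python sketch (the function name is an illustrative choice) that returns the first positive integer {D} with {\binom{D+d}{d} > m}; note that {m=d} gives {D=1}, recovering the ordinary ham sandwich theorem.

```python
from math import comb

def ham_sandwich_degree(m, d):
    """First positive integer D for which C(D + d, d) exceeds m."""
    D = 1
    while comb(D + d, d) <= m:
        D += 1
    return D

# Degree needed to bisect m sets in the plane, for a few values of m.
for m in (2, 5, 20, 100):
    print(m, ham_sandwich_degree(m, 2))
```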

The polynomial ham sandwich theorem is a theorem about continuous bodies (bounded open sets), but a simple limiting argument leads one to the following discrete analogue: given {m} finite sets {S_1,\ldots,S_m} in {{\bf R}^d}, there exists a hypersurface {\{ x \in {\bf R}^d: Q(x)=0\}} of degree {O_d( m^{1/d} )}, such that each of the two semi-algebraic sets {\{ Q > 0 \}} and {\{ Q < 0\}} contain at most half of the points on {S_i} (note that some of the points of {S_i} can certainly lie on the boundary {\{Q=0\}}). This can be iterated to give a useful cell decomposition:

Proposition 1 (Cell decomposition) Let {P} be a finite set of points in {{\bf R}^d}, and let {D} be a positive integer. Then there exists a polynomial {Q} of degree at most {D}, and a decomposition

\displaystyle  {\bf R}^d = \{ Q = 0\} \cup C_1 \cup \ldots \cup C_m

into the hypersurface {\{Q=0\}} and a collection {C_1,\ldots,C_m} of cells bounded by {\{Q=0\}}, such that {m = O_d(D^d)}, and such that each cell {C_i} contains at most {O_d( |P|/D^d )} points.

A proof is sketched in this previous blog post. The cells in the argument are not necessarily connected (being instead formed by intersecting together a number of semi-algebraic sets such as {\{ Q > 0\}} and {\{Q<0\}}), but it is a classical result (established independently by Oleinik-Petrovskii, Milnor, and Thom) that any degree {D} hypersurface {\{Q=0\}} divides {{\bf R}^d} into {O_d(D^d)} connected components, so one can easily assume that the cells are connected if desired. (Incidentally, one does not need the full machinery of the results in the above cited papers – which control not just the number of components, but all the Betti numbers of the complement of {\{Q=0\}} – to get the bound on connected components; one can instead observe that every bounded connected component has a critical point where {\nabla Q = 0}, and one can control the number of these points by Bezout’s theorem, after perturbing {Q} slightly to enforce genericity, and then count the unbounded components by an induction on dimension.)

Remark 1 By setting {D} as large as {O_d(|P|^{1/d})}, we obtain as a limiting case of the cell decomposition the fact that any finite set {P} of points in {{\bf R}^d} can be captured by a hypersurface of degree {O_d(|P|^{1/d})}. This fact is actually true over arbitrary fields (not just over {{\bf R}}), and can be proven by a simple linear algebra argument (see e.g. this previous blog post). However, the cell decomposition is more flexible than this algebraic fact due to the ability to arbitrarily select the degree parameter {D}.
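The linear algebra argument mentioned in Remark 1 can be sketched as follows: the space of polynomials of degree at most {D} in {d} variables has dimension {\binom{D+d}{d}}, so as soon as this exceeds the number of points, the linear conditions of vanishing at each point must have a non-trivial solution. The Python sketch below carries this out over a prime field {{\bf F}_p} (an assumption made so that the arithmetic is exact); all function names are hypothetical.

```python
from itertools import product
from math import comb

def monomials(d, D):
    """Exponent tuples (e_1, ..., e_d) with e_1 + ... + e_d <= D."""
    return [e for e in product(range(D + 1), repeat=d) if sum(e) <= D]

def eval_monomial(e, x, p):
    v = 1
    for ei, xi in zip(e, x):
        v = v * pow(xi, ei, p) % p
    return v

def nullspace_vector(rows, ncols, p):
    """A nonzero c with rows * c = 0 (mod p), via Gauss-Jordan elimination; None if none exists."""
    rows = [r[:] for r in rows]
    pivot_cols, r = [], 0
    for c in range(ncols):
        piv = next((i for i in range(r, len(rows)) if rows[i][c] % p), None)
        if piv is None:
            continue
        rows[r], rows[piv] = rows[piv], rows[r]
        inv = pow(rows[r][c], -1, p)  # modular inverse (Python 3.8+)
        rows[r] = [v * inv % p for v in rows[r]]
        for i in range(len(rows)):
            if i != r and rows[i][c] % p:
                f = rows[i][c]
                rows[i] = [(u - f * w) % p for u, w in zip(rows[i], rows[r])]
        pivot_cols.append(c)
        r += 1
        if r == len(rows):
            break
    free = [c for c in range(ncols) if c not in pivot_cols]
    if not free:
        return None
    vec = [0] * ncols
    vec[free[0]] = 1  # set one free variable to 1, solve for the pivots
    for i, pc in enumerate(pivot_cols):
        vec[pc] = -rows[i][free[0]] % p
    return vec

def vanishing_polynomial(points, p):
    """Degree bound D and coefficients {exponents: coeff} of a nonzero polynomial vanishing on points."""
    d = len(points[0])
    D = 1
    while comb(D + d, d) <= len(points):
        D += 1
    monos = monomials(d, D)
    rows = [[eval_monomial(e, x, p) for e in monos] for x in points]
    return D, dict(zip(monos, nullspace_vector(rows, len(monos), p)))

def evaluate(poly, x, p):
    return sum(c * eval_monomial(e, x, p) for e, c in poly.items()) % p
```

For instance, `vanishing_polynomial([(0, 0), (1, 2), (3, 4), (5, 6), (2, 2)], 7)` looks for a conic through five points of {{\bf F}_7^2}.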

The cell decomposition can be viewed as a structural theorem for arbitrary large configurations of points in space, much as the Szemerédi regularity lemma can be viewed as a structural theorem for arbitrary large dense graphs. Indeed, just as many problems in the theory of large dense graphs can be profitably attacked by first applying the regularity lemma and then inspecting the outcome, it now seems that many problems in combinatorial incidence geometry can be attacked by applying the cell decomposition (or a similar such decomposition), with a parameter {D} to be optimised later, to a relevant set of points, and seeing how the cells interact with each other and with the other objects in the configuration (lines, planes, circles, etc.). This strategy was spectacularly illustrated recently with Guth and Katz’s use of the cell decomposition to resolve the Erdős distinct distance problem (up to logarithmic factors), as discussed in this blog post.

In this post, I wanted to record a simpler (but still illustrative) version of this method (that I learned from Nets Katz), namely to provide yet another proof of the Szemerédi-Trotter theorem in incidence geometry:

Theorem 2 (Szemerédi-Trotter theorem) Given a finite set of points {P} and a finite set of lines {L} in {{\bf R}^2}, the set of incidences {I(P,L):= \{ (p,\ell) \in P \times L: p \in \ell \}} has cardinality

\displaystyle  |I(P,L)| \ll |P|^{2/3} |L|^{2/3} + |P| + |L|.

This theorem has many short existing proofs, including one via crossing number inequalities (as discussed in this previous post) or via a slightly different type of cell decomposition (as discussed here). The proof given below is not that different, in particular, from the latter proof, but I believe it still serves as a good introduction to the polynomial method in combinatorial incidence geometry.
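To get a feel for the bound, one can count incidences in a standard grid configuration for which the main term {|P|^{2/3} |L|^{2/3}} is attained up to a constant. The Python sketch below is illustrative only (the function names and the restriction to non-vertical lines {y = ax+b} are assumptions): it takes the points {\{1,\ldots,k\} \times \{1,\ldots,2k^2\}} and the lines with {1 \leq a \leq k}, {1 \leq b \leq k^2}, giving {k^4} incidences with {|P| = 2k^3} and {|L| = k^3}.

```python
def incidences(points, lines):
    """Count pairs (point, line) with the point on the line; a line (a, b) means y = a*x + b."""
    pts = set(points)
    xs = {x for x, _ in pts}
    return sum((x, a * x + b) in pts for a, b in lines for x in xs)

def grid_example(k):
    """Points {1..k} x {1..2k^2} and lines y = a*x + b with 1 <= a <= k, 1 <= b <= k^2."""
    points = [(x, y) for x in range(1, k + 1) for y in range(1, 2 * k * k + 1)]
    lines = [(a, b) for a in range(1, k + 1) for b in range(1, k * k + 1)]
    return points, lines

for k in (2, 3, 4):
    P, L = grid_example(k)
    main_term = (len(P) * len(L)) ** (2 / 3)
    # ratio of the incidence count to the main term stays bounded away from zero
    print(k, incidences(P, L), round(incidences(P, L) / main_term, 3))
```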


Combinatorial incidence geometry is the study of the possible combinatorial configurations between geometric objects such as lines and circles. One of the basic open problems in the subject has been the Erdős distance problem, posed in 1946:

Problem 1 (Erdős distance problem) Let {N} be a large natural number. What is the least number {\# \{ |x_i-x_j|: 1 \leq i < j \leq N \}} of distances that are determined by {N} points {x_1,\ldots,x_N} in the plane?

Erdős called this least number {g(N)}. For instance, one can check that {g(3)=1} and {g(4)=2}, although the precise computation of {g} rapidly becomes more difficult after this. By considering {N} points in arithmetic progression, we see that {g(N) \leq N-1}. By considering the slightly more sophisticated example of a {\sqrt{N} \times \sqrt{N}} lattice grid (assuming that {N} is a square number for simplicity), and using some analytic number theory, one can obtain the slightly better asymptotic bound {g(N) = O( N / \sqrt{\log N} )}.
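The two constructions above are easy to compare numerically. The following Python sketch (with illustrative function names) counts distinct distances for points with integer coordinates, comparing squared distances so that the arithmetic is exact.

```python
from itertools import combinations

def distinct_distances(points):
    """Number of distinct distances determined by integer points (compared via squared distances)."""
    return len({
        (x1 - x2) ** 2 + (y1 - y2) ** 2
        for (x1, y1), (x2, y2) in combinations(points, 2)
    })

def progression(n):
    """n collinear points in arithmetic progression."""
    return [(i, 0) for i in range(n)]

def grid(s):
    """An s x s lattice grid, with s^2 points."""
    return [(i, j) for i in range(s) for j in range(s)]

for s in (2, 4, 8, 16):
    n = s * s
    # progression gives n - 1 distances; the grid gives noticeably fewer
    print(n, distinct_distances(progression(n)), distinct_distances(grid(s)))
```

The case {s=2} (the unit square) exhibits four points with only two distances, matching {g(4)=2}.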

On the other hand, lower bounds are more difficult to obtain. As observed by Erdős, an easy argument, ultimately based on the incidence geometry fact that any two circles intersect in at most two points, gives the lower bound {g(N) \gg N^{1/2}}. The exponent {1/2} has been slowly increasing over the years by a series of increasingly intricate arguments combining incidence geometry facts with other known results in combinatorial incidence geometry (most notably the Szemerédi-Trotter theorem) and also some tools from additive combinatorics; however, these methods seemed to fall quite short of getting to the optimal exponent of {1}. (Indeed, prior to last week, the best lower bound known was approximately {N^{0.8641}}, due to Katz and Tardos.)

Very recently, though, Guth and Katz have obtained a near-optimal result:

Theorem 2 One has {g(N) \gg N / \log N}.

The proof neatly combines together several powerful and modern tools in a new way: a recent geometric reformulation of the problem due to Elekes and Sharir; the polynomial method as used recently by Dvir, Guth, and Guth-Katz on related incidence geometry problems (and discussed previously on this blog); and the somewhat older method of cell decomposition (also discussed on this blog). A key new insight is that the polynomial method (and more specifically, the polynomial Ham Sandwich theorem, also discussed previously on this blog) can be used to efficiently create cells.

In this post, I thought I would sketch some of the key ideas used in the proof, though I will not give the full argument here (the paper itself is largely self-contained, well motivated, and of only moderate length). In particular I will not go through all the various cases of configuration types that one has to deal with in the full argument, but only some illustrative special cases.

To simplify the exposition, I will repeatedly rely on “pigeonholing cheats”. A typical such cheat: if I have {n} objects (e.g. {n} points or {n} lines), each of which could be of one of two types, I will assume that either all {n} of the objects are of the first type, or all {n} of the objects are of the second type. (In truth, I can only assume that at least {n/2} of the objects are of the first type, or at least {n/2} of the objects are of the second type; but in practice, having {n/2} instead of {n} only ends up costing an unimportant multiplicative constant in the type of estimates used here.) A related such cheat: if one has {n} objects {A_1,\ldots,A_n} (again, think of {n} points or {n} circles), and to each object {A_i} one can associate some natural number {k_i} (e.g. some sort of “multiplicity” for {A_i}) that is of “polynomial size” (of size {O(N^{O(1)})}), then I will assume in fact that all the {k_i} are in a fixed dyadic range {[k,2k]} for some {k}. (In practice, the dyadic pigeonhole principle can only achieve this after throwing away all but about {n/\log N} of the original {n} objects; it is this type of logarithmic loss that eventually leads to the logarithmic factor in the main theorem.) Using the notation {X \sim Y} to denote the assertion that {C^{-1} Y \leq X \leq CY} for an absolute constant {C}, we thus have {k_i \sim k} for all {i}, thus {k_i} is morally constant.
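The dyadic pigeonhole step can be made concrete as follows. This is a minimal Python sketch under the assumption that the multiplicities {k_i} are positive integers; the function name is hypothetical. It groups the objects by the dyadic range {[2^j, 2^{j+1})} containing {k_i} and keeps the most populated class, which retains at least a {1/(\lfloor \log_2 \max_i k_i \rfloor + 1)} fraction of the objects.

```python
from collections import defaultdict

def dyadic_pigeonhole(multiplicities):
    """Return (k, indices): the indices i in the most populated dyadic class k <= k_i < 2k."""
    classes = defaultdict(list)
    for i, k in enumerate(multiplicities):
        classes[k.bit_length() - 1].append(i)  # j with 2^j <= k < 2^(j+1)
    j, indices = max(classes.items(), key=lambda kv: len(kv[1]))
    return 2 ** j, indices

ks = [1, 2, 3, 4, 5, 6, 7, 8, 100]  # assumed sample multiplicities
k, kept = dyadic_pigeonhole(ks)
print(k, [ks[i] for i in kept])
```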

I will also use asymptotic notation rather loosely, to avoid cluttering the exposition with a certain amount of routine but tedious bookkeeping of constants. In particular, I will use the informal notation {X \lll Y} or {Y \ggg X} to denote the statement that {X} is “much less than” {Y} or {Y} is “much larger than” {X}, by some large constant factor.

See also Janos Pach’s recent reaction to the Guth-Katz paper on Kalai’s blog.


Below the fold is a version of my talk “Recent progress on the Kakeya conjecture” that I gave at the Fefferman conference.


Jordan Ellenberg, Richard Oberlin, and I have just uploaded to the arXiv the paper “The Kakeya set and maximal conjectures for algebraic varieties over finite fields”, submitted to Mathematika.  This paper builds upon some work of Dvir and later authors on the Kakeya problem in finite fields, which I have discussed in this earlier blog post.  Dvir established the following:

Kakeya set conjecture for finite fields. Let F be a finite field, and let E be a subset of F^n that contains a line in every direction.  Then E has cardinality at least c_n |F|^n for some c_n > 0.

The initial argument of Dvir gave c_n = 1/n!.  This was improved to c_n = c^n for some explicit 0 < c < 1 by Saraf and Sudan, and recently to c_n =1/2^n by Dvir, Kopparty, Saraf, and Sudan, which is within a factor 2 of the optimal result.

In our work we investigate a somewhat different set of improvements to Dvir’s result.  The first concerns the Kakeya maximal function f^*: {\Bbb P}^{n-1}(F) \to {\Bbb R} of a function f: F^n \to {\Bbb R}, defined for all directions \xi \in {\Bbb P}^{n-1}(F) in the projective hyperplane at infinity by the formula

f^*(\xi) = \sup_{\ell // \xi} \sum_{x \in \ell} |f(x)|

where the supremum ranges over all lines \ell in F^n oriented in the direction \xi.  Our first result is the endpoint L^p estimate for this operator, namely

Kakeya maximal function conjecture in finite fields. We have \| f^* \|_{\ell^n({\Bbb P}^{n-1}(F))} \leq C_n |F|^{(n-1)/n} \|f\|_{\ell^n(F^n)} for some constant C_n > 0.

This result implies Dvir’s result, since if f is the indicator function of the set E in Dvir’s result, then f^*(\xi) = |F| for every \xi \in {\Bbb P}^{n-1}(F).  However, it also gives information on more general sets E which do not necessarily contain a line in every direction, but instead contain a certain fraction of a line in a subset of directions.  The exponents here are best possible in the sense that all other \ell^p \to \ell^q mapping properties of the operator can be deduced (with bounds that are optimal up to constants) by interpolating the above estimate with more trivial estimates.  This result is the finite field analogue of a long-standing (and still open) conjecture for the Kakeya maximal function in Euclidean spaces; we rely on the polynomial method of Dvir, which thus far has not extended to the Euclidean setting (but note the very interesting variant of this method by Guth that has established the endpoint multilinear Kakeya maximal function estimate in this setting, see this blog post for further discussion).
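For very small fields the Kakeya maximal function can be computed by brute force, which makes the remark about indicator functions easy to check. The Python sketch below assumes {n=2} and {F = {\bf F}_p} for a prime {p}; the function names and the representation of {f} as a dictionary are illustrative choices.

```python
def directions(p):
    """Representatives of the p + 1 directions in P^1(F_p): (1, t) for t in F_p, and (0, 1)."""
    return [(1, t) for t in range(p)] + [(0, 1)]

def kakeya_maximal(f, p):
    """f maps points of F_p^2 to reals (missing points count as 0).
    Returns {direction: sup over lines in that direction of the sum of |f| on the line}."""
    out = {}
    for u, v in directions(p):
        best = 0
        for x0 in range(p):
            for y0 in range(p):
                total = sum(
                    abs(f.get(((x0 + t * u) % p, (y0 + t * v) % p), 0))
                    for t in range(p)
                )
                best = max(best, total)
        out[(u, v)] = best
    return out

p = 5  # assumed sample prime
# E = union of the p + 1 lines through the origin: contains a line in every direction.
E = {((t * u) % p, (t * v) % p) for (u, v) in directions(p) for t in range(p)}
print(kakeya_maximal({x: 1 for x in E}, p))
```

A set {E} contains a line in every direction exactly when the maximal function of its indicator equals {|F|} at every direction.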

It turns out that a direct application of the polynomial method is not sufficient to recover the full strength of the maximal function estimate; but by combining the polynomial method with the Nikishin-Maurey-Pisier-Stein “method of random rotations” (as interpreted nowadays by Stein and later by Bourgain, and originally inspired by the factorisation theorems of Nikishin, Maurey, and Pisier), one can already recover a “restricted weak type” version of the above estimate.  If one then enhances the polynomial method with the “method of multiplicities” (as introduced by Saraf and Sudan), one can recover the full “strong type” estimate; a few more details are given below the fold.

It turns out that one can generalise the above results to more general affine or projective algebraic varieties over finite fields.  In particular, we showed

Kakeya maximal function conjecture in algebraic varieties. Suppose that W \subset {\Bbb P}^N is an (n-1)-dimensional algebraic variety.  Let d \geq 1 be an integer. Then we have

\| \sup_{\gamma \ni x; \gamma \not \subset W} \sum_{y \in \gamma} f(y) \|_{\ell^n_x(W(F))} \leq C_{n,d,N,W} |F|^{(n-1)/n} \|f\|_{\ell^n({\Bbb P}^N(F))}

for some constant C_{n,d,N,W} > 0, where the supremum is over all irreducible algebraic curves \gamma of degree at most d that pass through x but do not lie in W, and W(F) denotes the F-points of W.

The ordinary Kakeya maximal function conjecture corresponds to the case when N=n, W is the hyperplane at infinity, and the degree d is equal to 1.  One corollary of this estimate is a Dvir-type result: a subset of {\Bbb P}^N(F) which contains, for each x in W, an irreducible algebraic curve of degree d passing through x but not lying in W, has cardinality \gg |F|^n if |W| \gg |F|^{n-1}.  (In particular this implies a lower bound for Nikodym sets worked out by Li.)  The dependence of the implied constant on W is only via the degree of W.

The techniques used in the flat case can easily handle curves \gamma of higher degree (provided that we allow the implied constants to depend on d), but the method of random rotations does not seem to work directly on the algebraic variety W as there are usually no symmetries of this variety to exploit.  Fortunately, we can get around this by using a “random projection trick” to “flatten” W into a hyperplane (after first expressing W as the zero locus of some polynomials, and then composing with the graphing map for such polynomials), reducing the non-flat case to the flat case.

Below the fold, I wish to sketch two of the key ingredients in our arguments, the random rotations method and the random projections trick.  (We of course also use some algebraic geometry, but mostly low-tech stuff, on the level of Bezout’s theorem, though we do need one non-trivial result of Kleiman (from SGA6), that asserts that bounded degree varieties can be cut out by a bounded number of polynomials of bounded degree.)

[Update, March 14: See also Jordan's own blog post on our paper.]

Read the rest of this entry »

One of my favourite families of conjectures (and one that has preoccupied a significant fraction of my own research) is the family of Kakeya conjectures in geometric measure theory and harmonic analysis.  There are many (not quite equivalent) conjectures in this family.  The cleanest one to state is the set conjecture:

Kakeya set conjecture: Let n \geq 1, and let E \subset {\Bbb R}^n contain a unit line segment in every direction (such sets are known as Kakeya sets or Besicovitch sets).  Then E has Hausdorff dimension and Minkowski dimension equal to n.

One reason why I find these conjectures fascinating is the sheer variety of mathematical fields that arise both in the partial results towards this conjecture, and in the applications of those results to other problems.  See for instance this survey of Wolff, my Notices article and this article of Łaba on the connections between this problem and other problems in Fourier analysis, PDE, and additive combinatorics; there have even been some connections to number theory and to cryptography.  At the other end of the pipeline, the mathematical tools that have gone into the proofs of various partial results have included:

[This list is not exhaustive.]

Very recently, I was pleasantly surprised to see yet another mathematical tool used to obtain new progress on the Kakeya conjecture, namely (a generalisation of) the famous Ham Sandwich theorem from algebraic topology.  This was recently used by Guth to establish a certain endpoint multilinear Kakeya estimate left open by the work of Bennett, Carbery, and myself.  With regard to the Kakeya set conjecture, Guth’s arguments assert, roughly speaking, that the only Kakeya sets that can fail to have full dimension are those which obey a certain “planiness” property, which informally means that the line segments that pass through a typical point in the set must be essentially coplanar. (This property first surfaced in my paper with Katz and Łaba.)  Guth’s arguments can be viewed as a partial analogue of Dvir’s arguments in the finite field setting (which I discussed in this blog post) to the Euclidean setting; in particular, both arguments rely crucially on the ability to create a polynomial of controlled degree that vanishes at or near a large number of points.  Unfortunately, while these arguments fully settle the Kakeya conjecture in the finite field setting, it appears that some new ideas are still needed to finish off the problem in the Euclidean setting.  Nevertheless this is an interesting new development in the long history of this conjecture, in particular demonstrating that the polynomial method can be successfully applied to continuous Euclidean problems (i.e. it is not confined to the finite field setting).

In this post I would like to sketch some of the key ideas in Guth’s paper, in particular the role of the Ham Sandwich theorem (or more precisely, a polynomial generalisation of this theorem first observed by Gromov).

Read the rest of this entry »

One of my favourite unsolved problems in mathematics is the Kakeya conjecture in geometric measure theory. This conjecture is descended from the

Kakeya needle problem. (1917) What is the least area in the plane required to continuously rotate a needle of unit length and zero thickness around completely (i.e. by 360^\circ)?

For instance, one can rotate a unit needle inside a disk of unit diameter, which has area \pi/4. By using a deltoid one requires only \pi/8 area.

In 1928, Besicovitch showed that in fact one could rotate a unit needle using an arbitrarily small amount of positive area. This unintuitive fact was a corollary of two observations. The first, which is easy, is that one can translate a needle using arbitrarily small area, by sliding the needle along the direction it points in for a long distance (which costs zero area), turning it slightly (costing a small amount of area), sliding back, and then undoing the turn. The second fact, which is less obvious, can be phrased as follows. Define a Kakeya set in {\Bbb R}^2 to be any set which contains a unit line segment in each direction. (See this Java applet of mine, or the Wikipedia page, for some pictures of such sets.)

Theorem. (Besicovitch, 1919) There exist Kakeya sets in {\Bbb R}^2 of arbitrarily small area (or more precisely, Lebesgue measure).

In fact, one can construct such sets with zero Lebesgue measure. On the other hand, it was shown by Davies that even though these sets had zero area, they were still necessarily two-dimensional (in the sense of either Hausdorff dimension or Minkowski dimension). This led to an analogous conjecture in higher dimensions:

Kakeya conjecture. A Besicovitch set in {\Bbb R}^n (i.e. a subset of {\Bbb R}^n that contains a unit line segment in every direction) has Minkowski and Hausdorff dimension equal to n.

This conjecture remains open in dimensions three and higher (and gets more difficult as the dimension increases), although many partial results are known. For instance, when n=3, it is known that Besicovitch sets have Hausdorff dimension at least 5/2 and (upper) Minkowski dimension at least 5/2 + 10^{-10}. See my Notices article for a general survey of this problem (and its connections with Fourier analysis, additive combinatorics, and PDE), my paper with Katz for a more technical survey, and Wolff’s survey for a systematic treatment of the field (up to about 1998 or so).

In 1999, Wolff proposed a simpler finite field analogue of the Kakeya conjecture as a model problem that avoided all the technical issues involving Minkowski and Hausdorff dimension. If F^n is a vector space over a finite field F, define a Kakeya set to be a subset of F^n which contains a line in every direction.
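To make the definition concrete, here is a minimal brute-force sketch in Python, restricted to the plane F_p^2 over a prime field F_p with p odd (both restrictions are assumptions made purely to keep the example small). The helper names directions, line, is_kakeya and example_set are hypothetical and chosen for illustration only. The example set consists of the points (x,y) with x^2 - y a square, together with the vertical line x = 0; it contains the line y = tx - t^2/4 for every slope t, because on that line x^2 - y = (x - t/2)^2.

```python
from itertools import product


def directions(p):
    """One representative per direction in F_p^2: slopes (1, t), plus the vertical (0, 1)."""
    return [(1, t) for t in range(p)] + [(0, 1)]


def line(point, direction, p):
    """The p points of the line through `point` in the given direction, modulo p."""
    (a, b), (u, v) = point, direction
    return {((a + s * u) % p, (b + s * v) % p) for s in range(p)}


def is_kakeya(E, p):
    """True if E contains a full line of F_p^2 in every direction (brute-force search)."""
    E = set(E)
    return all(
        any(line(pt, d, p) <= E for pt in product(range(p), repeat=2))
        for d in directions(p)
    )


def example_set(p):
    """Illustrative set (assumes p is an odd prime): x^2 - y is a square, plus the line x = 0."""
    squares = {(s * s) % p for s in range(p)}
    E = {(x, y) for x in range(p) for y in range(p) if (x * x - y) % p in squares}
    E |= {(0, y) for y in range(p)}  # supplies the vertical direction
    return E


if __name__ == "__main__":
    p = 7
    E = example_set(p)
    print(is_kakeya(E, p), len(E), p * p)
```

For p = 7 this set is a Kakeya set occupying 31 of the 49 points of the plane, so a Kakeya set can omit a positive proportion of the space while still being of comparable size to it.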

Finite field Kakeya conjecture. Let E \subset F^n be a Kakeya set. Then E has cardinality at least c_n |F|^n, where c_n > 0 depends only on n.

This conjecture has had a significant influence in the subject, in particular inspiring work on the sum-product phenomenon in finite fields, which has since proven to have many applications in number theory and computer science. Modulo minor technicalities, the progress on the finite field Kakeya conjecture was, until very recently, essentially the same as that of the original “Euclidean” Kakeya conjecture.

Last week, the finite field Kakeya conjecture was proven using a beautifully simple argument by Zeev Dvir, using the polynomial method in algebraic extremal combinatorics. The proof is so short that I can present it in full here.

Read the rest of this entry »
