Let $\mathbf{F}$ be a finite field, with algebraic closure $\overline{\mathbf{F}}$, and let $V$ be an (affine) algebraic variety defined over $\overline{\mathbf{F}}$, by which I mean a set of the form
$$V = \{ x \in \overline{\mathbf{F}}^d : P_1(x) = \dots = P_m(x) = 0 \}$$
for some ambient dimension $d \geq 0$, and some finite number of polynomials $P_1, \dots, P_m : \overline{\mathbf{F}}^d \to \overline{\mathbf{F}}$. In order to reduce the number of subscripts later on, let us say that $V$ has complexity at most $M$ if $d$, $m$, and the degrees of the $P_1, \dots, P_m$ are all less than or equal to $M$. Note that we do not require at this stage that $V$ be irreducible (i.e. not the union of two strictly smaller varieties), or defined over $\mathbf{F}$, though we will often specialise to these cases later in this post. (Also, everything said here can be applied with almost no changes to projective varieties, but we will stick with affine varieties for the sake of concreteness.)
One can consider two crude measures of how “big” the variety $V$ is. The first measure, which is algebraic geometric in nature, is the dimension $\dim(V)$ of the variety $V$, which is an integer between $0$ and $d$ (or, depending on convention, $-\infty$, $-1$, or undefined, if $V$ is empty) that can be defined in a large number of ways (e.g. it is the largest $r$ for which the generic linear projection from $V$ to $\overline{\mathbf{F}}^r$ is dominant, or the largest $r$ for which the intersection with a generic codimension $r$ subspace is non-empty). The second measure, which is number-theoretic in nature, is the number $|V(\mathbf{F})|$ of $\mathbf{F}$-points of $V$, i.e. points $x = (x_1,\dots,x_d)$ in $V$ all of whose coefficients lie in the finite field, or equivalently the number of solutions to the system of equations $P_i(x_1,\dots,x_d) = 0$ for $i = 1,\dots,m$ with variables $x_1,\dots,x_d$ in $\mathbf{F}$.
These two measures are linked together in a number of ways. For instance, we have the basic Schwartz-Zippel type bound (which, in this qualitative form, goes back at least to Lemma 1 of the work of Lang and Weil in 1954).
Lemma 1 (Schwartz-Zippel type bound) Let $V$ be a variety of complexity at most $M$. Then we have
$$|V(\mathbf{F})| \ll_M |\mathbf{F}|^{\dim(V)}.$$
Proof: (Sketch) For the purposes of exposition, we will not carefully track the dependencies of implied constants on the complexity $M$, instead simply assuming that all of these quantities remain controlled throughout the argument. (If one wished, one could obtain ineffective bounds on these quantities by an ultralimit argument, as discussed in this previous post, or equivalently by moving everything over to a nonstandard analysis framework; one could also obtain such uniformity using the machinery of schemes.)
We argue by induction on the ambient dimension $d$ of the variety $V$. The $d = 0$ case is trivial, so suppose $d \geq 1$ and that the claim has already been proven for $d-1$. By breaking up $V$ into irreducible components we may assume that $V$ is irreducible (this requires some control on the number and complexity of these components, but this is available, as discussed in this previous post). For each $x_1,\dots,x_{d-1} \in \overline{\mathbf{F}}$, the fibre $\{ x_d \in \overline{\mathbf{F}} : (x_1,\dots,x_{d-1},x_d) \in V \}$ is either one-dimensional (and thus all of $\overline{\mathbf{F}}$) or zero-dimensional. In the latter case, one has $O_M(1)$ points in the fibre from the fundamental theorem of algebra (indeed one has a bound of $M$ in this case), and $(x_1,\dots,x_{d-1})$ lives in the projection of $V$ to $\overline{\mathbf{F}}^{d-1}$, which is a variety of dimension at most $\dim(V)$ and controlled complexity, so the contribution of this case is acceptable from the induction hypothesis. In the former case, the fibre contributes $|\mathbf{F}|$ $\mathbf{F}$-points, but $(x_1,\dots,x_{d-1})$ lies in a variety in $\overline{\mathbf{F}}^{d-1}$ of dimension at most $\dim(V)-1$ (since otherwise $V$ would contain a subvariety of dimension at least $\dim(V)+1$, which is absurd) and controlled complexity, and so the contribution of this case is also acceptable from the induction hypothesis.
One can improve the bound on the implied constant to be linear in the degree of $V$ (see e.g. Claim 7.2 of this paper of Dvir, Kollar, and Lovett, or Lemma A.3 of this paper of Ellenberg, Oberlin, and myself), but we will not be concerned with these improvements here.
Without further hypotheses on $V$, the above upper bound is sharp (except for improvements in the implied constants). For instance, the variety
$$V := \{ (x_1,\dots,x_d) \in \overline{\mathbf{F}}^d : \prod_{j=1}^D (x_d - a_j) = 0 \},$$
where $a_1,\dots,a_D \in \mathbf{F}$ are distinct, is the union of $D$ distinct hyperplanes of dimension $d-1$, with $|V(\mathbf{F})| = D |\mathbf{F}|^{d-1}$ and complexity $\max(D,d)$; similar examples can easily be concocted for other choices of $\dim(V)$. In the other direction, there is also no non-trivial lower bound for $|V(\mathbf{F})|$ without further hypotheses on $V$. For a trivial example, if $a$ is an element of $\overline{\mathbf{F}}$ that does not lie in $\mathbf{F}$, then the hyperplane
$$V := \{ (x_1,\dots,x_d) \in \overline{\mathbf{F}}^d : x_d - a = 0 \}$$
clearly has no $\mathbf{F}$-points whatsoever, despite being a $(d-1)$-dimensional variety in $\overline{\mathbf{F}}^d$ of complexity $d$. For a slightly less trivial example, if $a$ is an element of $\mathbf{F}$ that is not a quadratic residue, then the variety
$$V := \{ (x_1,\dots,x_d) \in \overline{\mathbf{F}}^d : x_d^2 - a = 0 \},$$
which is the union of two hyperplanes, still has no $\mathbf{F}$-points, even though this time the variety is defined over $\mathbf{F}$ instead of $\overline{\mathbf{F}}$ (by which we mean that the defining polynomial(s) have all of their coefficients in $\mathbf{F}$). There is however the important Lang-Weil bound that allows for a much better estimate as long as $V$ is both defined over $\mathbf{F}$ and irreducible:
Theorem 2 (Lang-Weil bound) Let $V$ be a variety of complexity at most $M$. Assume that $V$ is defined over $\mathbf{F}$, and that $V$ is irreducible as a variety over $\overline{\mathbf{F}}$ (i.e. $V$ is geometrically irreducible or absolutely irreducible). Then
$$|V(\mathbf{F})| = (1 + O_M(|\mathbf{F}|^{-1/2})) |\mathbf{F}|^{\dim(V)}.$$
Again, more explicit bounds on the implied constant here are known, but will not be the focus of this post. As the previous examples show, the hypotheses of definability over $\mathbf{F}$ and geometric irreducibility are both necessary.
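To make the preceding examples concrete, here is a small brute-force sketch over a prime field $\mathbf{Z}/p\mathbf{Z}$. The helper `count_points` and the specific choices ($p = 7$, ambient dimension $2$, the roots $1, 2, 4$, and the non-residue $3$) are illustrative assumptions rather than anything taken from the discussion above.

```python
from itertools import product

def count_points(polys, p, d):
    """Brute-force count of x in (Z/pZ)^d at which every polynomial vanishes mod p."""
    return sum(
        all(f(*x) % p == 0 for f in polys)
        for x in product(range(p), repeat=d)
    )

p, d = 7, 2

# Union of D = 3 parallel hyperplanes {x_2 = a_j}: D * p^(d-1) points.
roots = (1, 2, 4)
def union_of_hyperplanes(x1, x2):
    value = 1
    for a in roots:
        value *= (x2 - a)
    return value

n_union = count_points([union_of_hyperplanes], p, d)

# x_2^2 = 3, where 3 is a quadratic non-residue mod 7: no points at all,
# even though the defining polynomial has coefficients in the field.
n_nonresidue = count_points([lambda x1, x2: x2 * x2 - 3], p, d)

# A geometrically irreducible curve (a parabola): exactly p points.
n_parabola = count_points([lambda x1, x2: x2 - x1 * x1], p, d)

print(n_union, n_nonresidue, n_parabola)
```

The first count is three times the size predicted for an irreducible curve, the second is zero, and only the third (defined over the field and geometrically irreducible) has a point count of the expected size.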
The Lang-Weil bound is already non-trivial in the model case of plane curves:
Theorem 3 (Hasse-Weil bound) Let $P: \overline{\mathbf{F}}^2 \to \overline{\mathbf{F}}$ be an irreducible polynomial of degree $d$ with coefficients in $\mathbf{F}$. Then
$$|\{ (x,y) \in \mathbf{F}^2 : P(x,y) = 0 \}| = |\mathbf{F}| + O_d(|\mathbf{F}|^{1/2}).$$
Thus, for instance, if $a, b \in \mathbf{F}$, then the elliptic curve $\{ (x,y) \in \mathbf{F}^2 : y^2 = x^3 + ax + b \}$ has $|\mathbf{F}| + O(|\mathbf{F}|^{1/2})$ $\mathbf{F}$-points, a result first established by Hasse. The Hasse-Weil bound is already quite non-trivial, being the analogue of the Riemann hypothesis for plane curves. For hyper-elliptic curves, an elementary proof (due to Stepanov) is discussed in this previous post. For general plane curves, the first proof was by Weil (leading to his famous Weil conjectures); there is also a nice version of Stepanov’s argument due to Bombieri covering this case which is a little less elementary (relying crucially on the Riemann-Roch theorem for the upper bound, and a lifting trick to then get the lower bound), which I briefly summarise later in this post. The full Lang-Weil bound is deduced from the Hasse-Weil bound by an induction argument using generic hyperplane slicing, as I will also summarise later in this post.
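As a numerical sanity check of the elliptic curve case, the following sketch counts affine points of $y^2 = x^3 + ax + b$ over $\mathbf{Z}/p\mathbf{Z}$ for every non-singular choice of $(a,b)$ and records the largest deviation from $p$. The function name `affine_points` and the prime $p = 101$ are assumptions chosen for illustration; the comparison value $2\sqrt{p}$ is the classical explicit form of Hasse's bound.

```python
import math

def affine_points(a, b, p):
    """Count (x, y) in F_p^2 with y^2 = x^3 + a x + b, for an odd prime p."""
    # sqrt_count[s] = number of y in F_p with y^2 = s.
    sqrt_count = [0] * p
    for y in range(p):
        sqrt_count[y * y % p] += 1
    return sum(sqrt_count[(x * x * x + a * x + b) % p] for x in range(p))

p = 101
worst = max(
    abs(affine_points(a, b, p) - p)
    for a in range(p)
    for b in range(p)
    if (4 * a**3 + 27 * b**2) % p != 0  # skip singular cubics
)
print(worst, 2 * math.sqrt(p))
```

The printed deviation should not exceed $2\sqrt{p}$, in line with the square-root error term in the theorem.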
The hypotheses of definability over $\mathbf{F}$ and geometric irreducibility in the Lang-Weil bound can be removed after inserting a geometric factor:
Corollary 4 (Lang-Weil bound, alternate form) Let $V$ be a variety of complexity at most $M$. Then one has
$$|V(\mathbf{F})| = (c(V) + O_M(|\mathbf{F}|^{-1/2})) |\mathbf{F}|^{\dim(V)},$$
where $c(V)$ is the number of top-dimensional components of $V$ (i.e. geometrically irreducible components of $V$ of dimension $\dim(V)$) that are definable over $\mathbf{F}$, or equivalently are invariant with respect to the Frobenius endomorphism $\mathrm{Frob}: x \mapsto x^{|\mathbf{F}|}$ that defines $\mathbf{F}$.
Proof: By breaking up a general variety $V$ into components (and using Lemma 1 to dispose of any lower-dimensional components), it suffices to establish this claim when $V$ is itself geometrically irreducible. If $V$ is definable over $\mathbf{F}$, the claim follows from Theorem 2. If $V$ is not definable over $\mathbf{F}$, then it is not fixed by the Frobenius endomorphism $\mathrm{Frob}$ (since otherwise one could produce a set of defining polynomials that were fixed by Frobenius and thus defined over $\mathbf{F}$ by using some canonical basis (such as a reduced Gröbner basis) for the associated ideal), and so $V \cap \mathrm{Frob}(V)$ has strictly smaller dimension than $V$. But $V \cap \mathrm{Frob}(V)$ captures all the $\mathbf{F}$-points of $V$, so in this case the claim follows from Lemma 1.
Note that if $V$ is reducible but is itself defined over $\mathbf{F}$, then the Frobenius endomorphism preserves $V$ itself, but may permute the components of $V$ around. In this case, $c(V)$ is the number of fixed points of this permutation action of Frobenius on the components. In particular, $c(V)$ is always a natural number between $0$ and $O_M(1)$; thus we see that regardless of the geometry of $V$, the normalised count $|V(\mathbf{F})| / |\mathbf{F}|^{\dim(V)}$ is asymptotically restricted to a bounded range of natural numbers (in the regime where the complexity stays bounded and $|\mathbf{F}|$ goes to infinity).
Example 1 Consider the variety
$$V := \{ (x,y) \in \overline{\mathbf{F}}^2 : y^2 = a x^2 \}$$
for some non-zero parameter $a \in \mathbf{F}$. Geometrically (by which we basically mean “when viewed over the algebraically closed field $\overline{\mathbf{F}}$”), this is the union of two lines, with slopes corresponding to the two square roots of $a$. If $a$ is a quadratic residue, then both of these lines are defined over $\mathbf{F}$, and are fixed by Frobenius, and $c(V) = 2$ in this case. If $a$ is not a quadratic residue, then the lines are not defined over $\mathbf{F}$, and the Frobenius automorphism permutes the two lines while preserving $V$ as a whole, giving $c(V) = 0$ in this case.
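The dichotomy in Example 1 is easy to observe numerically. The sketch below assumes the concrete model $y^2 = a x^2$ over $\mathbf{Z}/p\mathbf{Z}$ for a pair of crossing lines (an assumed form chosen so that the slopes are the square roots of $a$), with the hypothetical helper `count_two_lines` and the prime $p = 103$ picked for illustration.

```python
def count_two_lines(a, p):
    """Count (x, y) in F_p^2 with y^2 = a x^2 (p an odd prime, a nonzero mod p)."""
    return sum((y * y - a * x * x) % p == 0 for x in range(p) for y in range(p))

p = 103
residues = {x * x % p for x in range(1, p)}  # non-zero quadratic residues mod p
for a in (2, 3):
    n = count_two_lines(a, p)
    # n / p approximates the number of lines fixed by Frobenius.
    print(a, a in residues, n, round(n / p))
```

When $a$ is a residue, the two lines contribute $2p - 1$ points (they share the origin), so the normalised count is close to $2$; when $a$ is a non-residue only the origin survives and the normalised count is close to $0$.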
Corollary 4 effectively computes (at least to leading order) the number-theoretic size $|V(\mathbf{F})|$ of a variety in terms of geometric information about $V$, namely its dimension $\dim(V)$ and the number $c(V)$ of top-dimensional components fixed by Frobenius. It turns out that with a little bit more effort, one can extend this connection to cover not just a single variety $V$, but a family of varieties indexed by points in some base space $W$. More precisely, suppose we now have two affine varieties $V, W$ of bounded complexity, together with a regular map $\phi: V \to W$ of bounded complexity (the definition of complexity of a regular map is a bit technical, see e.g. this paper, but one can think for instance of a polynomial or rational map of bounded degree as a good example). It will be convenient to assume that the base space $W$ is irreducible. If the map $\phi$ is a dominant map (i.e. the image $\phi(V)$ is Zariski dense in $W$), then standard algebraic geometry results tell us that the fibres $\phi^{-1}(\{w\})$ are an unramified family of $(\dim(V)-\dim(W))$-dimensional varieties outside of an exceptional subset $W'$ of $W$ of dimension strictly smaller than $\dim(W)$ (and with $\phi^{-1}(W')$ having dimension strictly smaller than $\dim(V)$); see e.g. Section I.6.3 of Shafarevich.
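Now suppose that $\phi$, $V$, and $W$ are defined over $\mathbf{F}$. Then, by Lang-Weil, $W$ has $(1 + O(|\mathbf{F}|^{-1/2})) |\mathbf{F}|^{\dim(W)}$ $\mathbf{F}$-points, and by Schwartz-Zippel, for all but $O(|\mathbf{F}|^{\dim(W)-1})$ of these $\mathbf{F}$-points $w$ (the ones that lie in the subvariety $W'$), the fibre $\phi^{-1}(\{w\})$ is an algebraic variety defined over $\mathbf{F}$ of dimension $\dim(V)-\dim(W)$. By using ultraproduct arguments (see e.g. Lemma 3.7 of this paper of mine with Emmanuel Breuillard and Ben Green), this variety can be shown to have bounded complexity, and thus by Corollary 4, has $(c(\phi^{-1}(\{w\})) + O(|\mathbf{F}|^{-1/2})) |\mathbf{F}|^{\dim(V)-\dim(W)}$ $\mathbf{F}$-points. One can then ask how the quantity $c(\phi^{-1}(\{w\}))$ is distributed. A simple but illustrative example occurs when $V = W = \overline{\mathbf{F}}$ and $\phi$ is the polynomial $\phi(x) := x^2$. Then $c(\phi^{-1}(\{w\}))$ equals $2$ when $w$ is a non-zero quadratic residue and $0$ when $w$ is a non-zero quadratic non-residue (and $1$ when $w$ is zero, but this is a negligible fraction of all $w$). In particular, in the asymptotic limit $|\mathbf{F}| \to \infty$, $c(\phi^{-1}(\{w\}))$ is equal to $2$ half of the time and $0$ half of the time.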
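The squaring-map example can be tabulated directly. In the sketch below, `fibre_size_histogram` is a hypothetical helper and $p = 1009$ an arbitrary prime; since the fibres here are zero-dimensional, the number of Frobenius-invariant components of a fibre is simply its number of points in $\mathbf{Z}/p\mathbf{Z}$.

```python
from collections import Counter

def fibre_size_histogram(p):
    """For phi(x) = x^2 on F_p, tally how many w in F_p have 0, 1 or 2 preimages."""
    preimages = Counter(x * x % p for x in range(p))
    return Counter(preimages.get(w, 0) for w in range(p))

hist = fibre_size_histogram(1009)
print(hist)
```

Half of the non-zero parameters have two preimages and half have none, with the single parameter $w = 0$ accounting for the one exceptional fibre.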
Now we describe the asymptotic distribution of the $c(\phi^{-1}(\{w\}))$. We need some additional notation. Let $w_0$ be an $\mathbf{F}$-point in $W \backslash W'$, and let $\pi_0(\phi^{-1}(\{w_0\}))$ be the connected components of the fibre $\phi^{-1}(\{w_0\})$. As $\phi^{-1}(\{w_0\})$ is defined over $\mathbf{F}$, this set of components is permuted by the Frobenius endomorphism $\mathrm{Frob}$. But there is also an action by monodromy of the fundamental group $\pi_1(W \backslash W')$ (this requires a certain amount of étale machinery to properly set up, as we are working over a positive characteristic field rather than over the complex numbers, but I am going to ignore this rather important detail here, as I still don’t fully understand it). This fundamental group may be infinite, but (by the étale construction) is always profinite, and in particular has a Haar probability measure, in which every finite index subgroup (and their cosets) are measurable. Thus we may meaningfully talk about elements drawn uniformly at random from this group, so long as we work only with the profinite $\sigma$-algebra on $\pi_1(W \backslash W')$ that is generated by the cosets of the finite index subgroups of this group (which will be the only relevant sets we need to measure when considering the action of this group on finite sets, such as the components of a generic fibre).
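Theorem 5 (Lang-Weil with parameters) Let $V, W$ be varieties of complexity at most $M$ with $W$ irreducible, and let $\phi: V \to W$ be a dominant map of complexity at most $M$. Let $w_0$ be an $\mathbf{F}$-point of $W \backslash W'$. Then, for any natural number $a$, one has $c(\phi^{-1}(\{w\})) = a$ for $(\mathbf{P}(X = a) + O_M(|\mathbf{F}|^{-1/2})) |\mathbf{F}|^{\dim(W)}$ values of $w \in W(\mathbf{F})$, where $X$ is the random variable that counts the number of components of a generic fibre $\phi^{-1}(\{w_0\})$ that are invariant under $g \, \mathrm{Frob}$, where $g$ is an element chosen uniformly at random from the étale fundamental group $\pi_1(W \backslash W')$. In particular, in the asymptotic limit $|\mathbf{F}| \to \infty$, and with $w$ chosen uniformly at random from $W(\mathbf{F})$, $c(\phi^{-1}(\{w\}))$ (or, equivalently, $|\phi^{-1}(\{w\})(\mathbf{F})| / |\mathbf{F}|^{\dim(V)-\dim(W)}$) and $X$ have the same asymptotic distribution.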
This theorem generalises Corollary 4 (which is the case when $W$ is just a point, so that $\phi^{-1}(\{w\})$ is just $V$ and $g$ is trivial). Informally, the effect of a non-trivial parameter space $W$ on the Lang-Weil bound is to push around the Frobenius map by monodromy for the purposes of counting invariant components, and a randomly chosen set of parameters corresponds to a randomly chosen loop on which to perform monodromy.
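Example 2 Let $V = W = \overline{\mathbf{F}} \backslash \{0\}$ and $\phi(x) := x^m$ for some fixed $m \geq 1$; to avoid some technical issues let us suppose that $m$ is coprime to $|\mathbf{F}|$. Then $W'$ can be taken to be empty, and for a base point $w_0$ we can take $w_0 = 1$. The fibre $\phi^{-1}(\{1\})$ (the $m^{\mathrm{th}}$ roots of unity) can be identified with the cyclic group $\mathbf{Z}/m\mathbf{Z}$ by using a primitive root of unity. The étale fundamental group $\pi_1(\overline{\mathbf{F}} \backslash \{0\})$ is (I think) isomorphic to the profinite closure $\hat{\mathbf{Z}}$ of the integers $\mathbf{Z}$ (excluding the part of that closure coming from the characteristic of $\mathbf{F}$). Not coincidentally, the integers $\mathbf{Z}$ are the fundamental group of the complex analogue $\mathbf{C} \backslash \{0\}$ of $\overline{\mathbf{F}} \backslash \{0\}$. (Brian Conrad points out to me though that for more complicated varieties, such as covers of $\overline{\mathbf{F}} \backslash \{0\}$ by a power of the characteristic, the étale fundamental group is more complicated than just a profinite closure of the ordinary fundamental group, due to the presence of Artin-Schreier covers that are only ramified at infinity.) The action of this fundamental group on the fibre $\mathbf{Z}/m\mathbf{Z}$ can be given by translation. Meanwhile, the Frobenius map $\mathrm{Frob}$ on $\mathbf{Z}/m\mathbf{Z}$ is given by multiplication by $|\mathbf{F}|$. A random element $g \, \mathrm{Frob}$ then becomes a random affine map $x \mapsto |\mathbf{F}| x + b$ on $\mathbf{Z}/m\mathbf{Z}$, where $b$ is chosen uniformly at random from $\mathbf{Z}/m\mathbf{Z}$. The number of fixed points of this map is equal to the greatest common divisor $(|\mathbf{F}|-1, m)$ of $|\mathbf{F}|-1$ and $m$ when $b$ is divisible by $(|\mathbf{F}|-1, m)$, and equal to $0$ otherwise. This matches up with the elementary number-theoretic fact that a randomly chosen non-zero element of $\mathbf{F}$ will be an $m^{\mathrm{th}}$ power with probability $1/(|\mathbf{F}|-1, m)$, and when this occurs, the number of $m^{\mathrm{th}}$ roots in $\mathbf{F}$ will be $(|\mathbf{F}|-1, m)$.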
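The matching of the two distributions in Example 2 can be checked exactly for a prime field. In this sketch the field is taken to be $\mathbf{Z}/p\mathbf{Z}$ with $p = 31$ and the exponent is $m = 12$; these values and both function names are assumptions made for illustration. The first function tabulates fixed points of the affine maps $x \mapsto px + b$ on $\mathbf{Z}/m\mathbf{Z}$, the second tabulates $m^{\mathrm{th}}$ roots of non-zero elements of the field.

```python
from collections import Counter
from fractions import Fraction
from math import gcd

def fixed_point_distribution(q, m):
    """Distribution of #{x in Z/m : q*x + b = x} for b uniform in Z/m."""
    counts = Counter(
        sum((q * x + b - x) % m == 0 for x in range(m)) for b in range(m)
    )
    return {k: Fraction(v, m) for k, v in counts.items()}

def root_count_distribution(p, m):
    """Distribution of #{x in F_p : x^m = w} for w uniform among non-zero residues."""
    roots = Counter(pow(x, m, p) for x in range(1, p))
    counts = Counter(roots.get(w, 0) for w in range(1, p))
    return {k: Fraction(v, p - 1) for k, v in counts.items()}

p, m = 31, 12
monodromy_side = fixed_point_distribution(p, m)
arithmetic_side = root_count_distribution(p, m)
print(gcd(p - 1, m), monodromy_side, arithmetic_side)
```

Both dictionaries assign probability $1/6$ to the value $6 = (30, 12)$ and probability $5/6$ to the value $0$, as the example predicts.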
Example 3 (Thanks to Jordan Ellenberg for this example.) Consider a random elliptic curve $E_{a,b} := \{ y^2 = x^3 + ax + b \}$, where $a, b \in \mathbf{F}$ are chosen uniformly at random, and let $m \geq 1$. Let $E_{a,b}[m]$ be the $m$-torsion points of $E_{a,b}$ (i.e. those elements $g \in E_{a,b}$ with $mg = 0$ using the elliptic curve addition law); as a group, this is isomorphic to $\mathbf{Z}/m\mathbf{Z} \times \mathbf{Z}/m\mathbf{Z}$ (assuming that $\mathbf{F}$ has sufficiently large characteristic, for simplicity), and consider the number of $\mathbf{F}$-points of $E_{a,b}[m]$, which is a random variable taking values in the natural numbers between $1$ and $m^2$. In this case, the base variety $W$ is the modular curve $X(1)$, and the covering variety $V$ is the modular curve $X_1(m)$. The generic fibre here can be identified with $\mathbf{Z}/m\mathbf{Z} \times \mathbf{Z}/m\mathbf{Z}$, the monodromy action projects down to the action of $SL_2(\mathbf{Z}/m\mathbf{Z})$, and the action of Frobenius on this fibre can be shown to be given by a $2 \times 2$ matrix with determinant $|\mathbf{F}|$ (with the exact choice of matrix depending on the choice of fibre and of the identification), so the distribution of the number of $\mathbf{F}$-points of $E_{a,b}[m]$ is asymptotic to the distribution of the number of fixed points $X$ of a random linear map of determinant $|\mathbf{F}|$ on $\mathbf{Z}/m\mathbf{Z} \times \mathbf{Z}/m\mathbf{Z}$.
Theorem 5 seems to be well known “folklore” among arithmetic geometers, though I do not know of an explicit reference for it. I enjoyed deriving it for myself (though my derivation is somewhat incomplete due to my lack of understanding of étale cohomology) from the ordinary Lang-Weil theorem and the moment method. I’m recording this derivation later in this post, mostly for my own benefit (as I am still in the process of learning this material), though perhaps some other readers may also be interested in it.
Caveat: not all details are fully fleshed out in this writeup, particularly those involving the finer points of algebraic geometry and étale cohomology, as my understanding of these topics is not as complete as I would like it to be.
Many thanks to Brian Conrad and Jordan Ellenberg for helpful discussions on these topics.
Ben Green and I have just uploaded to the arXiv our new paper “On sets defining few ordinary lines”, submitted to Discrete and Computational Geometry. This paper asymptotically solves two old questions concerning finite configurations of points in the plane $\mathbf{R}^2$. Given a set $P$ of $n$ points in the plane, define an ordinary line to be a line containing exactly two points of $P$. The classical Sylvester-Gallai theorem, first posed as a problem by Sylvester in 1893, asserts that as long as the points of $P$ are not all collinear, $P$ defines at least one ordinary line.
It is then natural to ask for the minimal number of ordinary lines that a set of $n$ non-collinear points can generate. In 1940, Melchior gave an elegant proof of the Sylvester-Gallai theorem based on projective duality and Euler’s formula, showing that at least three ordinary lines must be created; in 1951, Motzkin showed that there must be $\gg n^{1/2}$ ordinary lines. Prior to this paper, the best lower bound was by Csima and Sawyer, who in 1993 showed that there are at least $6n/13$ ordinary lines. In the converse direction, if $n$ is even, then by considering $n/2$ equally spaced points on a circle, and $n/2$ points on the line at infinity in equally spaced directions, one can find a configuration of $n$ points that define just $n/2$ ordinary lines.
As first observed by Böröczky, variants of this example also give few ordinary lines for odd $n$, though not quite as few as $n/2$; more precisely, when $n = 1 \hbox{ mod } 4$ one can find a configuration with $3(n-1)/4$ ordinary lines, and when $n = 3 \hbox{ mod } 4$ one can find a configuration with $3(n-3)/4$ ordinary lines. Our first main result is that these configurations are best possible for sufficiently large $n$:
Theorem 1 (Dirac-Motzkin conjecture) If $n$ is sufficiently large, then any set of $n$ non-collinear points in the plane will define at least $n/2$ ordinary lines. Furthermore, if $n$ is odd, at least $3\lfloor n/4 \rfloor$ ordinary lines must be created.
The Dirac-Motzkin conjecture asserts that the first part of this theorem in fact holds for all $n$, not just for sufficiently large $n$; in principle, our theorem reduces that conjecture to a finite verification, although our bound for “sufficiently large” is far too poor to actually make this feasible (it is of double exponential type). (There are two known configurations for which one has $(n-1)/2$ ordinary lines, one with $n = 7$ (discovered by Kelly and Moser), and one with $n = 13$ (discovered by Crowe and McKee).)
Our second main result concerns not the ordinary lines, but rather the $3$-rich lines of an $n$-point set, that is, lines that meet exactly three points of that set. A simple double counting argument (counting pairs of distinct points in the set in two different ways) shows that there are at most $n(n-1)/6$ $3$-rich lines. On the other hand, on an elliptic curve, three distinct points P,Q,R on that curve are collinear precisely when they sum to zero with respect to the group law on that curve. Thus (as observed first by Sylvester in 1868), any finite subgroup of an elliptic curve (of which one can produce numerous examples, as elliptic curves in $\mathbf{R}^2$ have the group structure of either $\mathbf{R}/\mathbf{Z}$ or $\mathbf{R}/\mathbf{Z} \times (\mathbf{Z}/2\mathbf{Z})$) can provide examples of $n$-point sets with a large number of $3$-rich lines ($\lfloor n(n-3)/6 \rfloor + 1$, to be precise). One can also shift such a finite subgroup by a third root of unity and obtain a similar example with only one fewer $3$-rich line. Sylvester then formally posed the question of determining whether this was best possible.
This problem was known as the Orchard planting problem, and was given a more poetic formulation as such by Jackson in 1821 (nearly fifty years prior to Sylvester!):
Our second main result answers this problem affirmatively when the number of points is large:
Theorem 2 (Orchard planting problem) If $n$ is sufficiently large, then any set of $n$ points in the plane will determine at most $\lfloor n(n-3)/6 \rfloor + 1$ $3$-rich lines.
Again, our threshold for “sufficiently large” for this is extremely large (though slightly less large than in the previous theorem), and so a full solution of the problem, while in principle reduced to a finitary computation, remains infeasible at present.
Our results also classify the extremisers (and near extremisers) for both of these problems; basically, the known examples mentioned previously are (up to projective transformation) the only extremisers when the number of points is sufficiently large.
Our proof strategy follows the “inverse theorem method” from additive combinatorics. Namely, rather than try to prove direct theorems such as lower bounds on the number of ordinary lines, or upper bounds on the number of $3$-rich lines, we instead try to prove inverse theorems (also known as structure theorems), in which one attempts a structural classification of all configurations with very few ordinary lines (or very many $3$-rich lines). In principle, once one has a sufficiently explicit structural description of these sets, one simply has to compute the precise number of ordinary lines or $3$-rich lines in each configuration in the list provided by that structural description in order to obtain results such as the two theorems above.
Note from double counting that sets with many $3$-rich lines will necessarily have few ordinary lines. Indeed, if we let $N_k$ denote the number of lines that meet exactly $k$ points of an $n$-point configuration, so that $N_3$ is the number of $3$-rich lines and $N_2$ is the number of ordinary lines, then we have the double counting identity
$$\sum_{k=2}^n \binom{k}{2} N_k = \binom{n}{2},$$
which among other things implies that any counterexample to the orchard problem can have at most $n + O(1)$ ordinary lines. In particular, any structural theorem that lets us understand configurations with $O(n)$ ordinary lines will, in principle, allow us to obtain results such as the above two theorems.
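The double counting identity is easy to test on a small configuration. The sketch below tallies, for a finite set of distinct integer points, how many lines meet exactly $k$ of the points; the helper `line_multiplicities` and the choice of a $3 \times 3$ grid are illustrative assumptions, not configurations discussed in the paper.

```python
from collections import Counter
from itertools import combinations
from math import comb

def line_multiplicities(points):
    """Map k to the number of lines meeting exactly k of the given distinct points."""
    lines = set()
    for p, q in combinations(points, 2):
        # Collect every input point collinear with p and q (exact integer test).
        on_line = frozenset(
            r for r in points
            if (q[0] - p[0]) * (r[1] - p[1]) == (q[1] - p[1]) * (r[0] - p[0])
        )
        lines.add(on_line)
    return Counter(len(line) for line in lines)

grid = [(x, y) for x in range(3) for y in range(3)]
N = line_multiplicities(grid)
n = len(grid)
pairs_counted = sum(comb(k, 2) * count for k, count in N.items())
print(dict(N), pairs_counted, comb(n, 2))
```

For the grid there are eight lines through three points and twelve ordinary lines, and both sides of the identity come out to $36$.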
As it turns out, we do eventually obtain a structure theorem that is strong enough to achieve these aims, but it is difficult to prove this theorem directly. Instead we proceed more iteratively, beginning with a “cheap” structure theorem that is relatively easy to prove but provides only a partial amount of control on the configurations with $O(n)$ ordinary lines. One then builds upon that theorem with additional arguments to obtain an “intermediate” structure theorem that gives better control, then a “weak” structure theorem that gives even more control, a “strong” structure theorem that gives yet more control, and then finally a “very strong” structure theorem that gives an almost complete description of the configurations (but only in the asymptotic regime when $n$ is very large). It turns out that the “weak” theorem is enough for the orchard planting problem, and the “strong” version is enough for the Dirac-Motzkin conjecture. (So the “very strong” structure theorem ends up being unnecessary for the two applications given, but may be of interest for other applications.) Note that the stronger theorems do not completely supersede the weaker ones, because the quantitative bounds in the theorems get progressively worse as the control gets stronger.
Before we state these structure theorems, note that all the examples mentioned previously of sets with few ordinary lines involved cubic curves: either irreducible examples such as elliptic curves, or reducible examples such as the union of a circle (or more generally, a conic section) and a line. (We also allow singular cubic curves, such as the union of a conic section and a tangent line, or a singular irreducible cubic curve.) This turns out to be no coincidence; cubic curves happen to be very good at providing many $3$-rich lines (and thus, few ordinary lines), and conversely it turns out that they are essentially the only way to produce such lines. This can already be evidenced by our cheap structure theorem:
Theorem 3 (Cheap structure theorem) Let
be a configuration of
points with at most
ordinary lines for some
. Then
can be covered by at most
cubic curves.
This theorem is already a non-trivial amount of control on sets with few ordinary lines, but because the result does not specify the nature of these curves, and how they interact with each other, it does not seem to be directly useful for applications. The intermediate structure theorem given below gives a more precise amount of control on these curves (essentially guaranteeing that all but at most one of the curve components are lines):
Theorem 4 (Intermediate structure theorem) Let
be a configuration of
points with at most
ordinary lines for some
. Then one of the following is true:
lies on the union of an irreducible cubic curve and an additional
points.
lies on the union of an irreducible conic section and an additional
lines, with
of the points on
in either of the two components.
lies on the union of
lines and an additional
points.
By some additional arguments (including a very nice argument supplied to us by Luke Alexander Betts, an undergraduate at Cambridge, which replaces a much more complicated (and weaker) argument we originally had for this paper), one can cut down the number of lines in the above theorem to just one, giving a more useful structure theorem, at least when the number of points is large:
Theorem 5 (Weak structure theorem) Let
be a configuration of
points with at most
ordinary lines for some
. Assume that
for some sufficiently large absolute constant
. Then one of the following is true:
lies on the union of an irreducible cubic curve and an additional
points.
lies on the union of an irreducible conic section, a line, and an additional
points, with
of the points on
in either of the first two components.
lies on the union of a single line and an additional
points.
As mentioned earlier, this theorem is already strong enough to resolve the orchard planting problem for large point sets. The presence of the double exponential here is extremely annoying, and is the main reason why the final thresholds for “sufficiently large” in our results are excessively large, but our methods seem to be unable to eliminate these exponentials from our bounds (though they can fortunately be confined to a lower bound for
, keeping the other bounds in the theorem polynomial in
).
For the Dirac-Motzkin conjecture one needs more precise control on the portion of the configuration lying on the various low-degree curves indicated. This is given by the following result:
Theorem 6 (Strong structure theorem) Let
be a configuration of
points with at most
ordinary lines for some
. Assume that
for some sufficiently large absolute constant
. Then, after adding or deleting
points from
if necessary (modifying
appropriately), and then applying a projective transformation, one of the following is true:
is a finite subgroup of an elliptic curve (EDIT: as pointed out in comments, one also needs to allow for finite subgroups of acnodal singular cubic curves), possibly shifted by a third root of unity.
is the Böröczky example mentioned previously (the union of
equally spaced points on the circle, and
points on the line at infinity).
lies on a single line.
By applying a final “cleanup” we can replace the in the above theorem with the optimal
, which is our “very strong” structure theorem. But the strong structure theorem is already sufficient to establish the Dirac-Motzkin conjecture for large
.
There are many tools that go into proving these theorems, some of which are extremely classical (with at least one going back to the ancient Greeks), and others being more recent. I will discuss some (not all) of these tools below the fold, and more specifically:
- Melchior’s argument, based on projective duality and Euler’s formula, initially used to prove the Sylvester-Gallai theorem;
- Chasles’ version of the Cayley-Bacharach theorem, which can convert dual triangular grids (produced by Melchior’s argument) into cubic curves that meet many points of the original configuration;
- Menelaus’s theorem, which is useful for producing ordinary lines when the point configuration lies on a few non-concurrent lines, particularly when combined with a sum-product estimate of Elekes, Nathanson, and Ruzsa;
- Betts’ argument, that produces ordinary lines when the point configuration lies on a few concurrent lines;
- A result of Poonen and Rubinstein that any point not on the origin or unit circle can lie on at most seven chords connecting roots of unity; this, together with a variant for elliptic curves, gives the very strong structure theorem, and is also (a strong version of) what is needed to finish off the Dirac-Motzkin and orchard planting problems from the structure theorems given above.
There are also a number of more standard tools from arithmetic combinatorics (e.g. a version of the Balog-Szemeredi-Gowers lemma) which are needed to tie things together at various junctures, but I won’t say more about these methods here as they are (by now) relatively routine.
Bill Thurston, who made fundamental contributions to our understanding of low-dimensional manifolds and related structures, died on Tuesday, aged 65.
Perhaps Thurston’s best known achievement is the proof of the hyperbolisation theorem for Haken manifolds, which showed that 3-manifolds obeying certain topological conditions could always be given a hyperbolic geometry (i.e. a Riemannian metric that made the manifold isometric to a quotient of the hyperbolic 3-space $\mathbb{H}^3$). This difficult theorem connecting the topological and geometric structure of 3-manifolds led Thurston to give his influential geometrisation conjecture, which (in principle, at least) completely classifies the topology of an arbitrary compact 3-manifold as a combination of eight model geometries (now known as Thurston model geometries). This conjecture has many consequences, including Thurston’s hyperbolisation theorem and (most famously) the Poincaré conjecture. Indeed, by placing that conjecture in the context of a conceptually appealing general framework, of which many other cases could already be verified, Thurston provided one of the strongest pieces of evidence towards the truth of the Poincaré conjecture, until the work of Grisha Perelman in 2002-2003 proved both the Poincaré conjecture and the geometrisation conjecture by developing Hamilton’s Ricci flow methods. (There are now several variants of Perelman’s proof of both conjectures; in the proof of geometrisation by Bessieres, Besson, Boileau, Maillot, and Porti, Thurston’s hyperbolisation theorem is a crucial ingredient, allowing one to bypass the need for the theory of Alexandrov spaces in a key step in Perelman’s argument.)
One of my favourite results of Thurston’s is his elegant method for everting the sphere (smoothly turning a sphere in $\mathbf{R}^3$ inside out without any folds or singularities). The fact that sphere eversion can be achieved at all is highly unintuitive, and is often referred to as Smale’s paradox, as Stephen Smale was the first to give a proof that such an eversion exists. However, prior to Thurston’s method, the known constructions for sphere eversion were quite complicated. Thurston’s method, relying on corrugating and then twisting the sphere, is sufficiently conceptual and geometric that it can in fact be explained quite effectively in non-technical terms, as was done in the excellent video entitled “Outside In”, produced by the Geometry Center.
In addition to his direct mathematical research contributions, Thurston was also an amazing mathematical expositor, having the rare knack of being able to describe the process of mathematical thinking in addition to the results of that process and the intuition underlying it. His wonderful essay “On proof and progress in mathematics”, which I highly recommend, is the quintessential instance of this; more recent examples include his many insightful questions and answers on MathOverflow.
I unfortunately never had the opportunity to meet Thurston in person (although we did correspond a few times online), but I know many mathematicians who have been profoundly influenced by him and his work. His death is a great loss for mathematics.