Ben Green and I have just uploaded to the arXiv our new paper “On sets defining few ordinary lines“, submitted to Discrete and Computational Geometry. This paper asymptotically solves two old questions concerning finite configurations of points in the plane . Given a set of points in the plane, define an ordinary line to be a line containing exactly two points of . The classical Sylvester-Gallai theorem, first posed as a problem by Sylvester in 1893, asserts that as long as the points of are not all collinear, defines at least one ordinary line:
It is then natural to pose the question of what is the minimal number of ordinary lines that a set of non-collinear points can generate. In 1940, Melchior gave an elegant proof of the Sylvester-Gallai theorem based on projective duality and Euler’s formula , showing that at least three ordinary lines must be created; in 1951, Motzkin showed that there must be ordinary lines. Previously to this paper, the best lower bound was by Csima and Sawyer, who in 1993 showed that there are at least ordinary lines. In the converse direction, if is even, then by considering equally spaced points on a circle, and points on the line at infinity in equally spaced directions, one can find a configuration of points that define just ordinary lines.
As first observed by Böröczky, variants of this example also give few ordinary lines for odd , though not quite as few as ; more precisely, when one can find a configuration with ordinary lines, and when one can find a configuration with ordinary lines. Our first main result is that these configurations are best possible for sufficiently large :
Theorem 1 (Dirac-Motzkin conjecture) If is sufficiently large, then any set of non-collinear points in the plane will define at least ordinary lines. Furthermore, if is odd, at least ordinary lines must be created.
The Dirac-Motzkin conjecture asserts that the first part of this theorem in fact holds for all , not just for sufficiently large ; in principle, our theorem reduces that conjecture to a finite verification, although our bound for “sufficiently large” is far too poor to actually make this feasible (it is of double exponential type). (There are two known configurations for which one has ordinary lines, one with (discovered by Kelly and Moser), and one with (discovered by Crowe and McKee).)
Our second main result concerns not the ordinary lines, but rather the -rich lines of an -point set – a line that meets exactly three points of that set. A simple double counting argument (counting pairs of distinct points in the set in two different ways) shows that there are at most
-rich lines. On the other hand, on an elliptic curve, three distinct points P,Q,R on that curve are collinear precisely when they sum to zero with respect to the group law on that curve. Thus (as observed first by Sylvester in 1868), any finite subgroup of an elliptic curve (of which one can produce numerous examples, as elliptic curves in have the group structure of either or ) can provide examples of -point sets with a large number of -rich lines (, to be precise). One can also shift such a finite subgroup by a third root of unity and obtain a similar example with only one fewer -rich line. Sylvester then formally posed the question of determining whether this was best possible.
This problem was known as the Orchard planting problem, and was given a more poetic formulation as such by Jackson in 1821 (nearly fifty years prior to Sylvester!):
Our second main result answers this problem affirmatively in the large case:
Again, our threshold for “sufficiently large” for this is extremely large (though slightly less large than in the previous theorem), and so a full solution of the problem, while in principle reduced to a finitary computation, remains infeasible at present.
Our results also classify the extremisers (and near extremisers) for both of these problems; basically, the known examples mentioned previously are (up to projective transformation) the only extremisers when is sufficiently large.
Our proof strategy follows the “inverse theorem method” from additive combinatorics. Namely, rather than try to prove direct theorems such as lower bounds on the number of ordinary lines, or upper bounds on the number of -rich lines, we instead try to prove inverse theorems (also known as structure theorems), in which one attempts a structural classification of all configurations with very few ordinary lines (or very many -rich lines). In principle, once one has a sufficiently explicit structural description of these sets, one simply has to compute the precise number of ordinary lines or -rich lines in each configuration in the list provided by that structural description in order to obtain results such as the two theorems above.
Note from double counting that sets with many -rich lines will necessarily have few ordinary lines. Indeed, if we let denote the set of lines that meet exactly points of an -point configuration, so that is the number of -rich lines and is the number of ordinary lines, then we have the double counting identity
which among other things implies that any counterexample to the orchard problem can have at most ordinary lines. In particular, any structural theorem that lets us understand configurations with ordinary lines will, in principle, allow us to obtain results such as the above two theorems.
As it turns out, we do eventually obtain a structure theorem that is strong enough to achieve these aims, but it is difficult to prove this theorem directly. Instead we proceed more iteratively, beginning with a “cheap” structure theorem that is relatively easy to prove but provides only a partial amount of control on the configurations with ordinary lines. One then builds upon that theorem with additional arguments to obtain an “intermediate” structure theorem that gives better control, then a “weak” structure theorem that gives even more control, a “strong” structure theorem that gives yet more control, and then finally a “very strong” structure theorem that gives an almost complete description of the configurations (but only in the asymptotic regime when is very large). It turns out that the “weak” theorem is enough for the orchard planting problem, and the “strong” version is enough for the Dirac-Motzkin conjecture. (So the “very strong” structure theorem ends up being unnecessary for the two applications given, but may be of interest for other applications.) Note that the stronger theorems do not completely supercede the weaker ones, because the quantitative bounds in the theorems get progressively worse as the control gets stronger.
Before we state these structure theorems, note that all the examples mentioned previously of sets with few ordinary lines involved cubic curves: either irreducible examples such as elliptic curves, or reducible examples such as the union of a circle (or more generally, a conic section) and a line. (We also allow singular cubic curves, such as the union of a conic section and a tangent line, or a singular irreducible curve such as .) This turns out to be no coincidence; cubic curves happen to be very good at providing many -rich lines (and thus, few ordinary lines), and conversely it turns out that they are essentially the only way to produce such lines. This can already be evidenced by our cheap structure theorem:
Theorem 3 (Cheap structure theorem) Let be a configuration of points with at most ordinary lines for some . Then can be covered by at most cubic curves.
This theorem is already a non-trivial amount of control on sets with few ordinary lines, but because the result does not specify the nature of these curves, and how they interact with each other, it does not seem to be directly useful for applications. The intermediate structure theorem given below gives a more precise amount of control on these curves (essentially guaranteeing that all but at most one of the curve components are lines):
Theorem 4 (Intermediate structure theorem) Let be a configuration of points with at most ordinary lines for some . Then one of the following is true:
- lies on the union of an irreducible cubic curve and an additional points.
- lies on the union of an irreducible conic section and an additional lines, with of the points on in either of the two components.
- lies on the union of lines and an additional points.
By some additional arguments (including a very nice argument supplied to us by Luke Alexander Betts, an undergraduate at Cambridge, which replaces a much more complicated (and weaker) argument we originally had for this paper), one can cut down the number of lines in the above theorem to just one, giving a more useful structure theorem, at least when is large:
Theorem 5 (Weak structure theorem) Let be a configuration of points with at most ordinary lines for some . Assume that for some sufficiently large absolute constant . Then one of the following is true:
- lies on the union of an irreducible cubic curve and an additional points.
- lies on the union of an irreducible conic section, a line, and an additional points, with of the points on in either of the first two components.
- lies on the union of a single line and an additional points.
As mentioned earlier, this theorem is already strong enough to resolve the orchard planting problem for large . The presence of the double exponential here is extremely annoying, and is the main reason why the final thresholds for “sufficiently large” in our results are excessively large, but our methods seem to be unable to eliminate these exponentials from our bounds (though they can fortunately be confined to a lower bound for , keeping the other bounds in the theorem polynomial in ).
For the Dirac-Motzkin conjecture one needs more precise control on the portion of on the various low-degree curves indicated. This is given by the following result:
Theorem 6 (Strong structure theorem) Let be a configuration of points with at most ordinary lines for some . Assume that for some sufficiently large absolute constant . Then, after adding or deleting points from if necessary (modifying appropriately), and then applying a projective transformation, one of the following is true:
- is a finite subgroup of an elliptic curve (EDIT: as pointed out in comments, one also needs to allow for finite subgroups of acnodal singular cubic curves), possibly shifted by a third root of unity.
- is the Borozcky example mentioned previously (the union of equally spaced points on the circle, and points on the line at infinity).
- lies on a single line.
By applying a final “cleanup” we can replace the in the above theorem with the optimal , which is our “very strong” structure theorem. But the strong structure theorem is already sufficient to establish the Dirac-Motzkin conjecture for large .
There are many tools that go into proving these theorems, some of which are extremely classical (with at least one going back to the ancient Greeks), and others being more recent. I will discuss some (not all) of these tools below the fold, and more specifically:
- Melchior’s argument, based on projective duality and Euler’s formula, initially used to prove the Sylvester-Gallai theorem;
- Chasles’ version of the Cayley-Bacharach theorem, which can convert dual triangular grids (produced by Melchior’s argument) into cubic curves that meet many points of the original configuration );
- Menelaus’s theorem, which is useful for producing ordinary lines when the point configuration lies on a few non-concurrent lines, particularly when combined with a sum-product estimate of Elekes, Nathanson, and Ruzsa;
- Betts’ argument, that produces ordinary lines when the point configuration lies on a few concurrent lines;
- A result of Poonen and Rubinstein that any point not on the origin or unit circle can lie on at most seven chords connecting roots of unity; this, together with a variant for elliptic curves, gives the very strong structure theorem, and is also (a strong version of) what is needed to finish off the Dirac-Motzkin and orchard planting problems from the structure theorems given above.
There are also a number of more standard tools from arithmetic combinatorics (e.g. a version of the Balog-Szemeredi-Gowers lemma) which are needed to tie things together at various junctures, but I won’t say more about these methods here as they are (by now) relatively routine.
1. Melchior’s argument
Our starting point was a closer look at Melchior’s proof of the Sylvester-Gallai theorem establishing a non-trivial bound on the number of ordinary lines. The first step was to look at the projective dual of the point set , in which each point is replaced by its dual line . (For this, one should really work in the projective plane rather than the affine plane , but let us ignore this technical detail for this post.) One now has a collection of lines that intersect each other in a number of points; each ordinary line now dualises to a point incident to exactly two of the lines in , a -rich line dualises to a point incident to exactly three lines, and so forth.
Now look at the graph (or more precisely, the drawing) determined by these points of intersection, and the line segments of lines in cut out by these points. This partitions the (projective) plane into a number of vertices, edges, and polygonal faces, which by Euler’s formula obey the relationship . (Because we are working on the projective plane rather than the affine plane or sphere, the relevant Euler characteristic here is , rather than .) If one counts the number of vertices, edges, and faces in this configuration, one quickly arrives at the identities
where is the number of -rich lines, and is the number of faces with sides. Euler’s formula then can be rearranged, after some elementary algebra, as Melchior’s identity
which among other things gives Melchior’s bound , which implies the Sylvester-Gallai theorem. But it also shows that when the number of ordinary lines is small, the number of faces with more than sides is also small; in particular, when there are ordinary lines, all but the faces are triangles. Thus, in such cases the lines of must form configurations which locally resemble a triangular grid most of the time. The following configuration (of , not of ) is typical:
2. Chasles’ Cayley-Bacharach theorem
Once one sees the above consequence of Melchior’s argument, it becomes natural to ask what configurations of points have the property that their dual lines meet each other in local triangular grids. The three triplets of coloured lines in the above local triangular grid, when one removes the dualisation, become three triples of points which arrange in two different ways as collinear triples of points. This is precisely the type of configuration governed by the Cayley-Bacharach theorem (or more precisely, a version of this theorem first stated by Chasles):
Theorem 7 (Chasles) Suppose that two triples of lines meet each other transversely in nine distinct points. Then any cubic curve that passes through eight of these points, also passes through the ninth.
I blogged about this theorem in this previous post; the result is intimately connected to other classical facts in plane geometry such as Pappus’s theorem, Pascal’s theorem, and the associativity of the elliptic curve group law. This theorem also clearly shows how cubic curves will arise in any analysis of sets with few ordinary lines; using this theorem, one can show that whenever three triplets of points in have their dual lines arranged in the hexagonal formation above, then any cubic curve that passes through eight of these points, also passes through the ninth. By iterating this observation (together with the conclusion from Melchior’s argument that there are large portions of the dual configuration that look like triangular grids), one can obtain the cheap structure theorem from routine combinatorial arguments (omitted here). Actually, these arguments provide not just one way to cover the set by a small number of cubic curves, but a large family of such ways; by comparing these coverings to each other (and making heavy use of Bezout’s theorem) one can eliminate a large number of possibilities for the cubic curves, eventually leading to the intermediate structure theorem.
3. Menelaus theorem
A good model case to consider when trying to understand ordinary lines is the case when the point set can be covered by a bounded number of lines, with a fair number of points () on each line. This is a key case to understand when one wants to improve the intermediate structure theorem; in fact, by some fairly routine (but quantitatively expensive) reductions, one can basically reduce to this case when trying to reduce the number of lines in the intermediate structure theorem and replace it with the weak structure theorem.
One key subcase of this model case occurs when lies on three non-concurrent lines. In this case, Menelaus’s theorem from classical Greek geometry shows that any collinearity on (i.e. lack of ordinary lines) is connected to multiplicative structure on the coordinates of . For instance, if the lines in question are the -axis , the -axis , and the shifted -axis , then observe that any three points , , (not on the intersections between these three lines) are collinear precisely when
thus providing a multiplicative relationship between the coordinates of the three points in question. The multiplicative relations given by Menelaus’s theorem already show that one cannot have too many collinear triples among any finite set of points in a triple of non-concurrent lines, basically because the multiplicative real line do not contain large finite subgroups. (To make this intuition precise requires some basic tools from arithmetic combinatorics, which we omit here.)
A more complicated version of these arguments also allows one to handle larger collections of mostly non-concurrent lines. Each triple of non-concurrent lines gives rise to a slightly different multiplicative relationship between the concurrent points on these triples. It turns out that there is a sum-product phenomenon that ensures that these multiplicative relationships are “incompatible” with each other; any configuration of points which has a lot of collinear triples with respect to one of these relationships, necessarily has few collinear triples with respect to other relationships. This can be formalised by arithmetic combinatorics tools, and more precisely a result of Elekes, Nathanson, and Ruzsa, which (by using the Szemeredi-Trotter theorem) established results to the effect that if was a set of reals for which is small, then was necessarily large for suitably nonlinear choices of (e.g. strictly convex). One can reduce the study of collinearity on bounded collections of mostly non-concurrent lines to the Elekes-Nathanson-Ruzsa result by some routine arithmetic combinatorics arguments which we omit here.
4. Betts’ argument
The arguments based on Menelaus’s theorem allow one to produce many ordinary lines when a large fraction of the points lie on mostly non-collinear lines. But there is a key case not covered by the Menelaus argument, which is when the points lie on a set of collinear lines, which (after projective transformation) we can take to be parallel. To eliminate this case (and some related cases) from consideration, we use an argument of Luke Betts (superceding a much more complicated use of additive combinatorics tools, which no longer appears in the paper). Betts establishes a quantitative version of the following fact:
Theorem 8 Suppose that a set of points lie on parallel lines, with the number of lines being at least two, and with each line containing points of . Then determines ordinary lines.
Betts’s argument proceeds by first dualising the theorem to the following equivalent form;
Theorem 9 Suppose that a set of lines lie in families of parallel lines, with the number of directions being at least two, and with each direction having lines. Then there are points that are ordinary (i.e. are incident to exactly two lines in ).
The proof of this theorem, exploiting convexity, is elementary enough that we can describe the proof here. Let be the set of all non-ordinary intersection points of lines in , i.e. points incident to at least three lines of . The claim is easy if all the points in are collinear, so suppose this is not the case. Then we may find a vertex in that is a vertex of the convex hull of . This point is incident to at least three lines of , and is thus the endpoint of at least three rays in lines in that lie completely outside of the convex hull of , so that any intersection point on these lines is necessarily ordinary. Of these three rays, one lies between the other two (in the complement of ). Every line in parallel to this middle ray, will meet one of the other two rays, and by the previous discussion, every one of these intersection points is ordinary. Thus we have found two lines in (the lines containing the two exterior rays) which between them capture ordinary points. Removing these lines and iterating, we obtain the theorem.
By combining the Menelaus theorem-based argument with Betts’ theorem and some additional combinatorial arguments, we were able to upgrade the intermediate structure theorem to the weak structure theorem. Using additional additive combinatorics tools (exploiting the group (or pseudo-group) structure on cubic curves, which is most well known in the case of elliptic curves, but also survives (in a somewhat degenerate form) for other cubic curves, such as singular cubic curves, unions of three lines, or the union of a conic and a line) we can then get the strong structure theorem.
5. Poonen-Rubinstein theorems
The strong structure theorem shows that sets with ordinary lines are a perturbation (by adding or deleting points, and by applying a projective transformation) from a very explicit example, either arising from a finite subgroup of an elliptic curve, equally spaced points on a circle and line, or points on a line. The projective transform is a symmetry of the problem and is harmless; the only remaining issue is to understand the effect of the points added or deleted on the number of ordinary lines. Due to the very explicit nature of these examples, deleting a point is easy to understand; the main remaining difficulty is to understand the effect that the addition of a single point to one of these explicit configurations has on the number of ordinary lines. In particular, one wishes to prevent the scenario that the addition of such a point removes more ordinary lines than it creates, by being collinear with many pairs of points in the original configuration that were previously defining an ordinary line.
The following result of Poonen and Rubinstein (a corollary of their more detailed results) is relevant for excluding this scenario:
Theorem 10 (Poonen-Rubinstein) Let be a point not on the unit circle or at the origin. Then lies on at most seven chords connecting two roots of unity.
It turns out (basically by an application of Menelaus’s theorem) that if three chords connecting roots of unity are concurrent, then there is a linear relationship between twelve other roots of unity closely related to the original six roots of unity, namely that these roots of unity sum to zero. The Poonen-Rubinstein paper uses some Galois theory and numerical computation to completely classify all the ways in which twelve or fewer roots of unity can sum to zero, which then leads to the above theorem. It turns out that a variant of the argument also works for secants, showing that a point outside the unit circle can lie on at most secants passing through two roots of unity.
One would also like to have a similar result in which “roots of unity” are replaced by “elements of a finite subgroup of an elliptic curve”. Unfortunately, the Galois-theoretic techniques seem to become unavailable, and we were unable to establish the full analogue of the Poonen-Rubinstein theorem for these finite subgroups. We could however show that given a finite subgroup of order , any point not on the curve formed at least ordinary lines with that subgroup, which was good enough for our applications; this came from an order-theoretic argument investigating how triples of adjacent points on the finite subgroup could form collinear triples with .