You are currently browsing the tag archive for the ‘Bezout’s theorem’ tag.
One of the first non-trivial theorems one encounters in classical algebraic geometry is Bézout’s theorem, which we will phrase as follows:
Theorem 1 (Bézout’s theorem) Let
be a field, and let
be non-zero polynomials in two variables
with no common factor. Then the two curves
and
have no common components, and intersect in at most
points.
This theorem can be proven by a number of means, for instance by using the classical tool of resultants. It has many strengthenings, generalisations, and variants; see for instance this previous blog post on Bézout’s inequality. Bézout’s theorem asserts a fundamental algebraic dichotomy, of importance in combinatorial incidence geometry: any two algebraic curves either share a common component, or else have a bounded finite intersection; there is no intermediate case in which the intersection is unbounded in cardinality, but falls short of a common component. This dichotomy is closely related to the integrality gap in algebraic dimension: an algebraic set can have an integer dimension such as or
, but cannot attain any intermediate dimensions such as
. This stands in marked contrast to sets of analytic, combinatorial, or probabilistic origin, whose “dimension” is typically not necessarily constrained to be an integer.
Bézout’s inequality tells us, roughly speaking, that the intersection of a curve of degree and a curve of degree
forms a set of at most
points. One can consider the converse question: given a set
of
points in the plane
, can one find two curves of degrees
with
and no common components, whose intersection contains these points?
A model example that supports the possibility of such a converse is a grid that is a Cartesian product of two finite subsets
of
with
. In this case, one can take one curve to be the union of
vertical lines, and the other curve to be the union of
horizontal lines, to obtain the required decomposition. Thus, if the proposed converse to Bézout’s inequality held, it would assert that any set of
points was essentially behaving like a “nonlinear grid” of size
.
Unfortunately, the naive converse to Bézout’s theorem is false. A counterexample can be given by considering a set of
points for some large perfect square
, where
is a
by
grid of the form described above, and
consists of
points on an line
(e.g. a
or
grid). Each of the two component sets
can be written as the intersection between two curves whose degrees multiply up to
; in the case of
, we can take the two families of parallel lines (viewed as reducible curves of degree
) as the curves, and in the case of
, one can take
as one curve, and the graph of a degree
polynomial on
vanishing on
for the second curve. But, if
is large enough, one cannot cover
by the intersection of a single pair
of curves with no common components whose degrees
multiply up to
. Indeed, if this were the case, then without loss of generality we may assume that
, so that
. By Bézout’s theorem,
either contains
, or intersects
in at most
points. Thus, in order for
to capture all of
, it must contain
, which forces
to not contain
. But
has to intersect
in
points, so by Bézout’s theorem again we have
, thus
. But then (by more applications of Bézout’s theorem)
can only capture
of the
points of
, a contradiction.
But the above counterexample suggests that even if an arbitrary set of (or
) points cannot be covered by the single intersection of a pair of curves with degree multiplying up to
, one may be able to cover such a set by a small number of such intersections. The purpose of this post is to record the simple observation that this is, indeed, the case:
Theorem 2 (Partial converse to Bézout’s theorem) Let
be a field, and let
be a set of
points in
for some
. Then one can find
and pairs
of coprime non-zero polynomials for
such that
and
Informally, every finite set in the plane is (a dense subset of) the union of logarithmically many nonlinear grids. The presence of the logarithm is necessary, as can be seen by modifying the example to be the union of logarithmically many Cartesian products of distinct dimensions, rather than just a pair of such products.
Unfortunately I do not know of any application of this converse, but I thought it was cute anyways. The proof is given below the fold.
An algebraic (affine) plane curve of degree over some field
is a curve
of the form
where is some non-constant polynomial of degree
. Examples of low-degree plane curves include
- Degree
(linear) curves
, which are simply the lines;
- Degree
(quadric) curves
, which (when
) include the classical conic sections (i.e. ellipses, hyperbolae, and parabolae), but also include the reducible example of the union of two lines; and
- Degree
(cubic) curves
, which include the elliptic curves
(with non-zero discriminant
, so that the curve is smooth) as examples (ignoring some technicalities when
has characteristic two or three), but also include the reducible examples of the union of a line and a conic section, or the union of three lines.
- etc.
Algebraic affine plane curves can also be extended to the projective plane by homogenising the polynomial. For instance, the affine quadric curve
would become
.
One of the fundamental theorems about algebraic plane curves is Bézout’s theorem, which asserts that if a degree curve
and a degree
curve
have no common component, then they intersect in at most
points (and if the underlying field
is algebraically closed, one works projectively, and one counts intersections with multiplicity, they intersect in exactly
points). Thus, for instance, two distinct lines intersect in at most one point; a line and a conic section intersect in at most two points; two distinct conic sections intersect in at most four points; a line and an elliptic curve intersect in at most three points; two distinct elliptic curves intersect in at most nine points; and so forth. Bézout’s theorem is discussed in this previous post.
From linear algebra we also have the fundamental fact that one can build algebraic curves through various specified points. For instance, for any two points one can find a line
passing through the points
, because this imposes two linear constraints on three unknowns
and is thus guaranteed to have at least one solution. Similarly, given any five points
, one can find a quadric curve passing through these five points (though note that if three of these points are collinear, then this curve cannot be a conic thanks to Bézout’s theorem, and is thus necessarily reducible to the union of two lines); given any nine points
, one can find a cubic curve going through these nine points; and so forth. This simple observation is one of the foundational building blocks of the polynomial method in combinatorial incidence geometry, discussed in these blog posts.
In the degree case, it is always true that two distinct points
determine exactly one line
. In higher degree, the situation is a bit more complicated. For instance, five collinear points determine more than one quadric curve, as one can simply take the union of the line containing those five points, together with an arbitrary additional line. Similarly, eight points on a conic section plus one additional point determine more than one cubic curve, as one can take that conic section plus an arbitrary line going through the additional point. However, if one places some “general position” hypotheses on these points, then one can recover uniqueness. For instance, given five points, no three of which are collinear, there can be at most one quadric curve that passes through these points (because these five points cannot lie on the union of two lines, and by Bézout’s theorem they cannot simultaneously lie on two distinct conic sections).
For cubic curves, the situation is more complicated still. Consider for instance two distinct cubic curves and
that intersect in precisely nine points
(note from Bézout’s theorem that this is an entirely typical situation). Then there is in fact an entire one-parameter family of cubic curves that pass through these points, namely the curves
for any
(with the convention that the constraint
is interpreted as
when
).
In fact, these are the only cubics that pass through these nine points, or even through eight of the nine points. More precisely, we have the following useful fact, known as the Cayley-Bacharach theorem:
Proposition 1 (Cayley-Bacharach theorem) Let
and
be two cubic curves that intersect (over some algebraically closed field
) in precisely nine distinct points
. Let
be a cubic polynomial that vanishes on eight of these points (say
). Then
is a linear combination of
, and in particular vanishes on the ninth point
.
Proof: (This proof is based off of a text of Husemöller.) We assume for contradiction that there is a cubic polynomial that vanishes on
, but is not a linear combination of
and
.
We first make some observations on the points . No four of these points can be collinear, because then by Bézout’s theorem,
and
would both have to vanish on this line, contradicting the fact that
meet in at most nine points. For similar reasons, no seven of these points can lie on a quadric curve.
One consequence of this is that any five of the determine a unique quadric curve
. The existence of the curve follows from linear algebra as discussed previously. If five of the points lie on two different quadric curves
, then by Bezout’s theorem, they must share a common line; but this line can contain at most three of the five points, and the other two points determine uniquely the other line that is the component of both
and
, and the claim follows.
Now suppose that three of the first eight points, say , are collinear, lying on a line
. The remaining five points
do not lie on
, and determine a unique quadric curve
by the previous discussion. Let
be another point on
, and let
be a point that does not lie on either
or
. By linear algebra, one can find a non-trivial linear combination
of
that vanishes at both
and
. Then
is a cubic polynomial that vanishes on the four collinear points
and thus vanishes on
, thus the cubic curve defined by
consists of
and a quadric curve. This curve passes through
and thus equals
. But then
does not lie on either
or
despite being a vanishing point of
, a contradiction. Thus, no three points from
are collinear.
In a similar vein, suppose next that six of the first eight points, say , lie on a quadric curve
; as no three points are collinear, this quadric curve cannot be the union of two lines, and is thus a conic section. The remaining two points
determine a unique line
. Let
be another point on
, and let
be another point that does not lie on either
and
. As before, we can find a non-trivial cubic
that vanishes at both
. As
vanishes at seven points of a conic section
, it must vanish on all of
, and so the cubic curve defined by
is the union of
and a line that passes through
and
, which must necessarily be
. But then this curve does not pass through
, a contradiction. Thus no six points in
lie on a quadric curve.
Finally, let be the line through the two points
, and
the quadric curve through the five points
; as before,
must be a conic section, and by the preceding paragraphs we see that
does not lie on either
or
. We pick two more points
lying on
but not on
. As before, we can find a non-trivial cubic
that vanishes on
; it vanishes on four points on
and thus
defines a cubic curve that consists of
and a quadric curve. The quadric curve passes through
and is thus
; but then the curve does not pass through
, a contradiction. This contradiction finishes the proof of the proposition.
I recently learned of this proposition and its role in unifying many incidence geometry facts concerning lines, quadric curves, and cubic curves. For instance, we can recover the proof of the classical theorem of Pappus:
Theorem 2 (Pappus’ theorem) Let
be two distinct lines, let
be distinct points on
that do not lie on
, and let
be distinct points on
that do not lie on
. Suppose that for
, the lines
and
meet at a point
. Then the points
are collinear.
Proof: We may assume that are distinct, since the claim is trivial otherwise.
Let be the union of the three lines
,
, and
(the purple lines in the first figure), let
be the union of the three lines
,
, and
(the dark blue lines), and let
be the union of the three lines
,
, and
(the other three lines). By construction,
and
are cubic curves with no common component that meet at the nine points
. Also,
is a cubic curve that passes through the first eight of these points, and thus also passes through the ninth point
, by the Cayley-Bacharach theorem. The claim follows (note that
cannot lie on
or
).
The same argument gives the closely related theorem of Pascal:
Theorem 3 (Pascal’s theorem) Let
be distinct points on a conic section
. Suppose that for
, the lines
and
meet at a point
. Then the points
are collinear.
Proof: Repeat the proof of Pappus’ theorem, with taking the place of
. (Note that as any line meets
in at most two points, the
cannot lie on
.)
One can view Pappus’s theorem as the degenerate case of Pascal’s theorem, when the conic section degenerates to the union of two lines.
Finally, Proposition 1 gives the associativity of the elliptic curve group law:
Theorem 4 (Associativity of the elliptic curve law) Let
be a (projective) elliptic curve, where
is the point at infinity on the
-axis, and the discriminant
is non-zero. Define an addition law
on
by defining
to equal
, where
is the unique point on
collinear with
and
(if
are disjoint) or tangent to
(if
), and
is the reflection of
through the
-axis (thus
are collinear), with the convention
. Then
gives
the structure of an abelian group with identity
and inverse
.
Proof: It is clear that is the identity for
,
is an inverse, and
is abelian. The only non-trivial assertion is associativity:
. By a perturbation (or Zariski closure) argument, we may assume that we are in the generic case when
are all distinct from each other and from
. (Here we are implicitly using the smoothness of the elliptic curve, which is guaranteed by the hypothesis that the discriminant is non-zero.)
Let be the union of the three lines
,
, and
(the purple lines), and let
be the union of the three lines
,
, and
(the green lines). Observe that
and
are cubic curves with no common component that meet at the nine distinct points
. The cubic curve
goes through the first eight of these points, and thus (by Proposition 1) also goes through the ninth point
. This implies that the line through
and
meets
in both
and
, and so these two points must be equal, and so
as required.
One can view Pappus’s theorem and Pascal’s theorem as a degeneration of the associativity of the elliptic curve law, when the elliptic curve degenerates to three lines (in the case of Pappus) or the union of one line and one conic section (in the case of Pascal’s theorem).
Classically, the fundamental object of study in algebraic geometry is the solution set
to multiple algebraic equations
in multiple unknowns in a field
, where the
are polynomials of various degrees
. We adopt the classical perspective of viewing
as a set (and specifically, as an algebraic set), rather than as a scheme. Without loss of generality we may order the degrees in non-increasing order:
We can distinguish between the underdetermined case , when there are more unknowns than equations; the determined case
when there are exactly as many unknowns as equations; and the overdetermined case
, when there are more equations than unknowns.
Experience has shown that the theory of such equations is significantly simpler if one assumes that the underlying field is algebraically closed , and so we shall make this assumption throughout the rest of this post. In particular, this covers the important case when
is the field of complex numbers (but it does not cover the case
of real numbers – see below).
From the general “soft” theory of algebraic geometry, we know that the algebraic set is a union of finitely many algebraic varieties, each of dimension at least
, with none of these components contained in any other. In particular, in the underdetermined case
, there are no zero-dimensional components of
, and thus
is either empty or infinite.
Now we turn to the determined case , where we expect the solution set
to be zero-dimensional and thus finite. Here, the basic control on the solution set is given by Bezout’s theorem. In our notation, this theorem states the following:
Theorem 1 (Bezout’s theorem) Let
. If
is finite, then it has cardinality at most
.
This result can be found in any introductory algebraic geometry textbook; it can for instance be proven using the classical tool of resultants. The solution set will be finite when the two polynomials
are coprime, but can (and will) be infinite if
share a non-trivial common factor.
By defining the right notion of multiplicity on (and adopting a suitably “scheme-theoretic” viewpoint), and working in projective space rather than affine space, one can make the inequality
an equality. However, for many applications (and in particular, for the applications to combinatorial incidence geometry), the upper bound usually suffices.
Bezout’s theorem can be generalised in a number of ways. For instance, the restriction on the finiteness of the solution set can be dropped by restricting attention to
, the union of the zero-dimensional irreducible components of
:
Corollary 2 (Bezout’s theorem, again) Let
. Then
has cardinality at most
.
Proof: We factor into irreducible factors (using unique factorisation of polynomials). By removing repeated factors, we may assume
are square-free. We then write
,
where
is the greatest common divisor of
and
are coprime. Observe that the zero-dimensional component of
is contained in
, which is finite from the coprimality of
. The claim follows.
It is also not difficult to use Bezout’s theorem to handle the overdetermined case in the plane:
Corollary 3 (Bezout’s theorem, yet again) Let
. Then
has cardinality at most
.
Proof: We may assume all the are square-free. We write
, where
is coprime to
and
divides
(and also
). We then write
, where
is coprime to
and
divides
(and also
). Continuing in this fashion we obtain a factorisation
. One then observes that
is contained in the set
, which by Theorem 1 has cardinality at most
. Since
and
, the claim follows.
Remark 1 Of course, in the overdetermined case
one generically expects the solution set to be empty, but if there is enough degeneracy or numerical coincidence then non-zero solution sets can occur. In particular, by considering the case when
and
we see that the bound
can be sharp in some cases. However, one can do a little better in this situation; by decomposing
into irreducible components, for instance, one can improve the upper bound of
slightly to
. However, this improvement seems harder to obtain in higher dimensions (see below).
Bezout’s theorem also extends to higher dimensions. Indeed, we have
Theorem 4 (Higher-dimensional Bezout’s theorem) Let
. If
is finite, then it has cardinality at most
.
This is a standard fact, and can for instance be proved from the more general and powerful machinery of intersection theory. A typical application of this theorem is to show that, given a degree polynomial
over the reals, the number of connected components of
is
. The main idea of the proof is to locate a critical point
inside each connected component, and use Bezout’s theorem to count the number of zeroes of the polynomial map
. (This doesn’t quite work directly because some components may be unbounded, and because the fibre of
at the origin may contain positive-dimensional components, but one can use truncation and generic perturbation to deal with these issues; see my recent paper with Solymosi for further discussion.)
Bezout’s theorem can be extended to the overdetermined case as before:
Theorem 5 (Bezout’s inequality) Let
. Then
has cardinality at most
.
Remark 2 Theorem 5 ostensibly only controls the zero-dimensional components of
, but by throwing in a few generic affine-linear forms to the set of polynomials
(thus intersecting
with a bunch of generic hyperplanes) we can also control the total degree of all the
-dimensional components of
for any fixed
. (Again, by using intersection theory one can get a slightly more precise bound than this, but the proof of that bound is more complicated than the arguments given here.)
This time, though, it is a slightly non-trivial matter to deduce Theorem 5 from Theorem 4, due to the standard difficulty that the intersection of irreducible varieties need not be irreducible (which can be viewed in some ways as the source of many other related difficulties, such as the fact that not every algebraic variety is a complete intersection), and so one cannot evade all irreducibility issues merely by assuming that the original polynomials are irreducible. Theorem 5 first appeared explicitly in the work of Heintz.
As before, the most systematic way to establish Theorem 5 is via intersection theory. In this post, though, I would like to give a slightly more elementary argument (essentially due to Schmid), based on generically perturbing the polynomials in the problem ; this method is less powerful than the intersection-theoretic methods, which can be used for a wider range of algebraic geometry problems, but suffices for the narrow objective of proving Theorem 5. The argument requires some of the “soft” or “qualitative” theory of algebraic geometry (in particular, one needs to understand the semicontinuity properties of preimages of dominant maps), as well as basic linear algebra. As such, the proof is not completely elementary, but it uses only a small amount of algebraic machinery, and as such I found it easier to understand than the intersection theory arguments.
Theorem 5 is a statement about arbitrary polynomials . However, it turns out (in the determined case
, at least) that upper bounds on
are Zariski closed properties, and so it will suffice to establish this claim for generic polynomials
. On the other hand, it is possible to use duality to deduce such upper bounds on
from a Zariski open condition, namely that a certain collection of polynomials are linearly independent. As such, to verify the generic case of this open condition, it suffices to establish this condition for a single family of polynomials, such as a family of monomials, in which case the condition can be verified by direct inspection. Thus we see an example of the somewhat strange strategy of establishing the general case from a specific one, using the generic case as an intermediate step.
Remark 3 There is an important caveat to note here, which is that these theorems only hold for algebraically closed fields, and in particular can fail over the reals
. For instance, in
, the polynomials
have degrees
respectively, but their common zero locus
has cardinality
. In some cases one can safely obtain incidence bounds in
by embedding
inside
, but as the above example shows, one needs to be careful when doing so.
I have blogged a number of times in the past about the relationship between finitary (or “hard”, or “quantitative”) analysis, and infinitary (or “soft”, or “qualitative”) analysis. One way to connect the two types of analysis is via compactness arguments (and more specifically, contradiction and compactness arguments); such arguments can convert qualitative properties (such as continuity) to quantitative properties (such as bounded), basically because of the fundamental fact that continuous functions on a compact space are bounded (or the closely related fact that sequentially continuous functions on a sequentially compact space are bounded).
A key stage in any such compactness argument is the following: one has a sequence of “quantitative” or “finitary” objects or spaces, and one has to somehow end up with a “qualitative” or “infinitary” limit object
or limit space. One common way to achieve this is to embed everything inside some universal space and then use some weak compactness property of that space, such as the Banach-Alaoglu theorem (or its sequential counterpart). This is for instance the idea behind the Furstenberg correspondence principle relating ergodic theory to combinatorics; see for instance this post of mine on this topic.
However, there is a slightly different approach, which I will call ultralimit analysis, which proceeds via the machinery of ultrafilters and ultraproducts; typically, the limit objects one constructs are now the ultraproducts (or ultralimits) of the original objects
. There are two main facts that make ultralimit analysis powerful. The first is that one can take ultralimits of arbitrary sequences of objects, as opposed to more traditional tools such as metric completions, which only allow one to take limits of Cauchy sequences of objects. The second fact is Los’s theorem, which tells us that
is an elementary limit of the
(i.e. every sentence in first-order logic which is true for the
for
large enough, is true for
). This existence of elementary limits is a manifestation of the compactness theorem in logic; see this earlier blog post for more discussion. So we see that compactness methods and ultrafilter methods are closely intertwined. (See also my earlier class notes for a related connection between ultrafilters and compactness.)
Ultralimit analysis is very closely related to nonstandard analysis. I already discussed some aspects of this relationship in an earlier post, and will expand upon it at the bottom of this post. Roughly speaking, the relationship between ultralimit analysis and nonstandard analysis is analogous to the relationship between measure theory and probability theory.
To illustrate how ultralimit analysis is actually used in practice, I will show later in this post how to take a qualitative infinitary theory – in this case, basic algebraic geometry – and apply ultralimit analysis to then deduce a quantitative version of this theory, in which the complexity of the various algebraic sets and varieties that appear as outputs are controlled uniformly by the complexity of the inputs. The point of this exercise is to show how ultralimit analysis allows for a relatively painless conversion back and forth between the quantitative and qualitative worlds, though in some cases the quantitative translation of a qualitative result (or vice versa) may be somewhat unexpected. In an upcoming paper of myself, Ben Green, and Emmanuel Breuillard (announced in the previous blog post), we will rely on ultralimit analysis to reduce the messiness of various quantitative arguments by replacing them with a qualitative setting in which the theory becomes significantly cleaner.
For sake of completeness, I also redo some earlier instances of the correspondence principle via ultralimit analysis, namely the deduction of the quantitative Gromov theorem from the qualitative one, and of Szemerédi’s theorem from the Furstenberg recurrence theorem, to illustrate how close the two techniques are to each other.




Recent Comments