You are currently browsing the monthly archive for September 2012.
One of the first non-trivial theorems one encounters in classical algebraic geometry is Bézout’s theorem, which we will phrase as follows:
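Theorem 1 (Bézout’s theorem) Let $k$ be a field, and let $P, Q \in k[x,y]$ be non-zero polynomials in two variables $x,y$ with no common factor. Then the two curves $\{(x,y) \in k^2: P(x,y) = 0\}$ and $\{(x,y) \in k^2: Q(x,y) = 0\}$ have no common components, and intersect in at most $\deg(P) \deg(Q)$ points.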
This theorem can be proven by a number of means, for instance by using the classical tool of resultants. It has many strengthenings, generalisations, and variants; see for instance this previous blog post on Bézout’s inequality. Bézout’s theorem asserts a fundamental algebraic dichotomy, of importance in combinatorial incidence geometry: any two algebraic curves either share a common component, or else have a bounded finite intersection; there is no intermediate case in which the intersection is unbounded in cardinality, but falls short of a common component. This dichotomy is closely related to the integrality gap in algebraic dimension: an algebraic set can have an integer dimension such as $0$ or $1$, but cannot attain any intermediate dimensions such as $1/2$. This stands in marked contrast to sets of analytic, combinatorial, or probabilistic origin, whose “dimension” is typically not constrained to be an integer.
Bézout’s inequality tells us, roughly speaking, that the intersection of a curve of degree $D_1$ and a curve of degree $D_2$ forms a set of at most $D_1 D_2$ points. One can consider the converse question: given a set $S$ of $N$ points in the plane $k^2$, can one find two curves of degrees $D_1, D_2$ with $D_1 D_2 = O(N)$ and no common components, whose intersection contains these points?
A model example that supports the possibility of such a converse is a grid $S = A \times B$ that is a Cartesian product of two finite subsets $A, B$ of $k$ with $|A| |B| = N$. In this case, one can take one curve to be the union of $|A|$ vertical lines, and the other curve to be the union of $|B|$ horizontal lines, to obtain the required decomposition. Thus, if the proposed converse to Bézout’s inequality held, it would assert that any set of $N$ points was essentially behaving like a “nonlinear grid” of size $N$.
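The grid example can be checked by brute force. The following is a minimal sketch, assuming the integers as a stand-in for the field and using hypothetical sets `A` and `B` (neither appears in the original post): it builds the union of vertical lines and the union of horizontal lines as two polynomials and confirms that their common zeros are exactly the grid.

```python
from itertools import product

def grid_curves(A, B):
    """Return P(x, y) = prod_{a in A} (x - a) and Q(x, y) = prod_{b in B} (y - b).

    P cuts out the union of |A| vertical lines, Q the union of |B| horizontal lines.
    """
    def P(x, y):
        out = 1
        for a in A:
            out *= (x - a)
        return out

    def Q(x, y):
        out = 1
        for b in B:
            out *= (y - b)
        return out

    return P, Q

A = [0, 1, 3]   # hypothetical x-coordinates of the grid
B = [2, 5]      # hypothetical y-coordinates of the grid
P, Q = grid_curves(A, B)

# Search a finite window of integer points for common zeros of P and Q.
window = range(-10, 11)
common_zeros = {(x, y) for x, y in product(window, window)
                if P(x, y) == 0 and Q(x, y) == 0}

grid = set(product(A, B))
degree_product = len(A) * len(B)   # deg(P) * deg(Q)
print(common_zeros == grid, degree_product)
```

Here the degree product equals the number of grid points, so this configuration attains the Bézout bound exactly.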
Unfortunately, the naive converse to Bézout’s theorem is false. A counterexample can be given by considering a set $S = S_1 \cup S_2$ of $2N$ points for some large perfect square $N$, where $S_1$ is a $\sqrt{N}$ by $\sqrt{N}$ grid of the form described above, and $S_2$ consists of $N$ points on a line $\ell$ (e.g. a $1 \times N$ or $N \times 1$ grid). Each of the two component sets $S_1, S_2$ can be written as the intersection between two curves whose degrees multiply up to $N$; in the case of $S_1$, we can take the two families of parallel lines (viewed as reducible curves of degree $\sqrt{N}$) as the curves, and in the case of $S_2$, one can take $\ell$ as one curve, and the graph of a degree $N$ polynomial on $\ell$ vanishing on $S_2$ for the second curve. But, if $N$ is large enough, one cannot cover $S$ by the intersection of a single pair $\gamma_1, \gamma_2$ of curves with no common components whose degrees $D_1, D_2$ multiply up to $D_1 D_2 = O(N)$. Indeed, if this were the case, then without loss of generality we may assume that $D_1 \leq D_2$, so that $D_1 = O(\sqrt{N})$. By Bézout’s theorem, $\gamma_1$ either contains $\ell$, or intersects $\ell$ in at most $D_1 = O(\sqrt{N})$ points. Thus, in order for $\gamma_1$ to capture all of $S$, it must contain $\ell$, which forces $\gamma_2$ to not contain $\ell$. But $\gamma_2$ has to intersect $\ell$ in $N$ points, so by Bézout’s theorem again we have $D_2 \geq N$, thus $D_1 = O(1)$. But then (by more applications of Bézout’s theorem) $\gamma_1$ can only capture $O(\sqrt{N})$ of the $N$ points of $S_1$, a contradiction.
But the above counterexample suggests that even if an arbitrary set of $N$ (or $2N$) points cannot be covered by the single intersection of a pair of curves with degree multiplying up to $O(N)$, one may be able to cover such a set by a small number of such intersections. The purpose of this post is to record the simple observation that this is, indeed, the case:
Informally, every finite set in the plane is (a dense subset of) the union of logarithmically many nonlinear grids. The presence of the logarithm is necessary, as can be seen by modifying the example to be the union of logarithmically many Cartesian products of distinct dimensions, rather than just a pair of such products.
Unfortunately I do not know of any application of this converse, but I thought it was cute anyways. The proof is given below the fold.
There has been a lot of recent interest in the abc conjecture, since the release a few weeks ago of the last of a series of papers by Shinichi Mochizuki which, as one of its major applications, claims to establish this conjecture. It’s still far too early to judge whether this proof is likely to be correct or not (the entire argument runs to several hundred pages, mostly in the area of anabelian geometry, which very few mathematicians are expert in, to the extent that we still do not even have a full outline of the proof strategy yet), and I don’t have anything substantial to add to the existing discussion around that conjecture. (But, for those that are interested, the Polymath wiki page on the ABC conjecture has collected most of the links to that discussion, and to various background materials.)
In the meantime, though, I thought I might give the standard probabilistic heuristic argument that explains why we expect the ABC conjecture to be true. The underlying heuristic is a common one, used throughout number theory, and it can be summarised as follows:
Heuristic 1 (Probabilistic heuristic) Even though number theory is a deterministic subject (one does not need to roll any dice to factorise a number, or figure out if a number is prime), one expects to get a good asymptotic prediction for the answers to many number-theoretic questions by pretending that various number-theoretic assertions (e.g. that a given number is prime) are probabilistic events (with a probability that can vary between $0$ and $1$) rather than deterministic events (that are either always true or always false). Furthermore:
- (Basic heuristic) If two or more of these heuristically probabilistic events have no obvious reason to be strongly correlated to each other, then we should expect them to behave as if they were (jointly) independent.
- (Advanced heuristic) If two or more of these heuristically probabilistic events have some obvious correlation between them, but no further correlations are suspected, then we should expect them to behave as if they were conditionally independent, relative to whatever data is causing the correlation.
This is, of course, an extremely vague and completely non-rigorous heuristic, requiring (among other things) a subjective and ad hoc determination of what an “obvious reason” is, but in practice it tends to give remarkably plausible predictions, some fraction of which can in fact be backed up by rigorous argument (although in many cases, the actual argument has almost nothing in common with the probabilistic heuristic). A famous special case of this heuristic is the Cramér random model for the primes, but this is not the only such instance for that heuristic.
To give the most precise predictions, one should use the advanced heuristic in Heuristic 1, but this can be somewhat complicated to execute, and so we shall focus instead on the predictions given by the basic heuristic (thus ignoring the presence of some number-theoretic correlations), which tends to give predictions that are quantitatively inaccurate but still reasonably good at the qualitative level.
Here is a basic “corollary” of Heuristic 1:
Heuristic 2 (Heuristic Borel-Cantelli) Suppose one has a sequence $E_1, E_2, \ldots$ of number-theoretic statements, which we heuristically interpret as probabilistic events with probabilities $p_1, p_2, \ldots$. Suppose also that we know of no obvious reason for these events to have much of a correlation with each other. Then:
- If $\sum_{n=1}^\infty p_n < \infty$, we expect only finitely many of the statements $E_n$ to be true. (And if $\sum_{n=1}^\infty p_n$ is much smaller than $1$, we in fact expect none of the $E_n$ to be true.)
- If $\sum_{n=1}^\infty p_n = \infty$, we expect infinitely many of the statements $E_n$ to be true.
This heuristic is motivated both by the Borel-Cantelli lemma, and by the standard probabilistic computation that if one is given jointly independent, and genuinely probabilistic, events $E_1, E_2, \ldots$ with $\sum_{n=1}^\infty p_n = \infty$, then one almost surely has an infinite number of the $E_n$ occurring.
Before we get to the ABC conjecture, let us give two simpler (and well known) demonstrations of these heuristics in action:
Example 1 (Twin prime conjecture) One can heuristically justify the twin prime conjecture as follows. Using the prime number theorem, one can heuristically assign a probability of $\frac{1}{\log n}$ to the event that any given large integer $n$ is prime. In particular, the probability that $n+2$ is prime will then be $\frac{1}{\log(n+2)}$. Making the assumption that there are no strong correlations between these events, we are led to the prediction that the probability that $n$ and $n+2$ are simultaneously prime is $\frac{1}{(\log n)(\log(n+2))}$. Since $\sum_n \frac{1}{(\log n)(\log(n+2))} = \infty$, the Borel-Cantelli heuristic then predicts that there should be infinitely many twin primes.
Note that the above argument is a bit too naive, because there are some non-trivial correlations between the primality of $n$ and the primality of $n+2$. Most obviously, if $n$ is prime, this greatly increases the probability that $n$ is odd, which implies that $n+2$ is odd, which then elevates the probability that $n+2$ is prime. A bit more subtly, if $n$ is prime, then $n$ is likely to avoid the residue class $0 \hbox{ mod } 3$, which means that $n+2$ avoids the residue class $2 \hbox{ mod } 3$, which ends up decreasing the probability that $n+2$ is prime. However, there is a standard way to correct for these local correlations; see for instance this previous blog post. As it turns out, these local correlations ultimately alter the prediction for the asymptotic density of twin primes by a constant factor (the twin prime constant), but do not affect the qualitative prediction of there being infinitely many twin primes.
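To get a feel for the naive prediction, one can compare it against an actual count. The sketch below is only an illustration: the function names are hypothetical, and the prediction deliberately uses the uncorrected density $\frac{1}{(\log n)(\log(n+2))}$, so it is expected to be off by a constant factor, as discussed above.

```python
import math

def primes_up_to(limit):
    """Sieve of Eratosthenes; returns a list is_prime[0..limit] of booleans."""
    is_prime = [False, False] + [True] * (limit - 1)
    for p in range(2, math.isqrt(limit) + 1):
        if is_prime[p]:
            for m in range(p * p, limit + 1, p):
                is_prime[m] = False
    return is_prime

def twin_prime_count(N):
    """Number of n <= N such that n and n + 2 are both prime."""
    is_prime = primes_up_to(N + 2)
    return sum(1 for n in range(2, N + 1) if is_prime[n] and is_prime[n + 2])

def naive_prediction(N):
    """Naive heuristic: sum of 1 / (log n * log(n + 2)) over 3 <= n <= N."""
    return sum(1.0 / (math.log(n) * math.log(n + 2)) for n in range(3, N + 1))

for N in (100, 1000, 10000):
    # Both quantities keep growing with N; their ratio reflects the
    # local correlations that the naive heuristic ignores.
    print(N, twin_prime_count(N), round(naive_prediction(N), 1))
```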
Example 2 (Fermat’s last theorem) Let us now heuristically count the number of solutions to $x^n + y^n = z^n$ for various $n$ and natural numbers $x, y, z$ (which we can reduce to be coprime if desired). We recast this (in the spirit of the ABC conjecture) as $a + b = c$, where $a, b, c$ are $n^{th}$ powers. The number of $n^{th}$ powers up to any given number $N$ is about $N^{1/n}$, so heuristically any given natural number $a$ has a probability about $a^{\frac{1}{n}-1}$ of being an $n^{th}$ power. If we make the naive assumption that (in the coprime case at least) there is no strong correlation between the events that $a$ is an $n^{th}$ power, $b$ is an $n^{th}$ power, and $a+b$ being an $n^{th}$ power, then for typical $a, b$, the probability that $a, b, a+b$ are all simultaneously $n^{th}$ powers would then be $a^{\frac{1}{n}-1} b^{\frac{1}{n}-1} (a+b)^{\frac{1}{n}-1}$. For fixed $n$, the total number of solutions to the Fermat equation would then be predicted to be

$$\sum_{a=1}^\infty \sum_{b=1}^\infty a^{\frac{1}{n}-1} b^{\frac{1}{n}-1} (a+b)^{\frac{1}{n}-1}.$$

(Strictly speaking, we need to restrict to the coprime case, but given that a positive density of pairs of integers are coprime, it should not affect the qualitative conclusion significantly if we now omit this restriction.) It might not be immediately obvious as to whether this sum converges or diverges, but (as is often the case with these sorts of unsigned sums) one can clarify the situation by dyadic decomposition. Suppose for instance that we consider the portion of the sum where $c = a+b$ lies between $2^k$ and $2^{k+1}$. Then this portion of the sum can be controlled by

$$\sum_{a \leq 2^{k+1}} \sum_{b \leq 2^{k+1}} a^{\frac{1}{n}-1} b^{\frac{1}{n}-1} O\left( (2^k)^{\frac{1}{n}-1} \right)$$

which (using $\sum_{a \leq M} a^{\frac{1}{n}-1} = O(M^{\frac{1}{n}})$) simplifies to

$$O\left( 2^{(\frac{3}{n}-1)k} \right).$$
Summing in $k$, one thus expects infinitely many solutions for $n=2$, only finitely many solutions for $n>3$ (indeed, a refinement of this argument shows that one expects only finitely many solutions even if one considers all $n>3$ at once), and a borderline prediction of there being a barely infinite number of solutions when $n=3$. Here is of course a place where a naive application of the probabilistic heuristic breaks down; there is enough arithmetic structure in the equation $x^3+y^3=z^3$ that the naive probabilistic prediction ends up being an inaccurate model. Indeed, while this heuristic suggests that a typical homogeneous cubic should have a logarithmic number of integer solutions of a given height $N$, it turns out that some homogeneous cubics (namely, those associated to elliptic curves of positive rank) end up with the bulk of these solutions, while other homogeneous cubics (including those associated to elliptic curves of zero rank, including the Fermat curve $x^3+y^3=z^3$) only get finitely many solutions. The reasons for this are subtle, but certainly the high degree of arithmetic structure present in an elliptic curve (starting with the elliptic curve group law which allows one to generate new solutions from old ones, and which also can be used to exclude solutions to $x^3+y^3=z^3$ via the method of descent) is a major contributing factor.
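The trichotomy in the exponent $\frac{3}{n}-1$ is easy to tabulate. The following is a rough sketch with hypothetical function names: `dyadic_exponent` computes the exponent exactly, `prediction` reads off the heuristic conclusion from its sign, and `block_sum` evaluates one dyadic block of the heuristic sum by brute force (feasible only for small $k$).

```python
from fractions import Fraction

def dyadic_exponent(n):
    """Exponent e such that the block 2^k <= a + b < 2^(k+1) contributes O(2^(e*k))."""
    return Fraction(3, n) - 1

def prediction(n):
    """Heuristic prediction for x^n + y^n = z^n, read off from the sign of the exponent."""
    e = dyadic_exponent(n)
    if e > 0:
        return "infinitely many"
    if e == 0:
        return "borderline"
    return "finitely many"

def block_sum(n, k):
    """Brute-force sum of (a * b * (a + b))^(1/n - 1) over 2^k <= a + b < 2^(k+1)."""
    power = 1.0 / n - 1.0
    total = 0.0
    for c in range(2 ** k, 2 ** (k + 1)):
        for a in range(1, c):
            b = c - a
            total += (a * b * c) ** power
    return total

for n in (2, 3, 4, 5):
    print(n, dyadic_exponent(n), prediction(n))
```

For $n=2$ the brute-force block sums grow with $k$, in line with the positive exponent $\frac{1}{2}$; the borderline case $n=3$ is exactly where, as noted above, the naive model should not be trusted.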
Below the fold, we apply similar heuristics to suggest the truth of the ABC conjecture.
Much as group theory is the study of groups, or graph theory is the study of graphs, model theory is the study of models (also known as structures) of some language $\mathcal{L}$ (which, in this post, will always be a single-sorted, first-order language). A structure is a set $M$, equipped with one or more operations, constants, and relations. This is of course an extremely general type of mathematical object, but (quite remarkably) one can still say a substantial number of interesting things about very broad classes of structures.
We will observe the common abuse of notation of using the set as a metonym for the entire structure, much as we usually refer to a group simply as , a vector space simply as , and so forth. Following another common bending of the rules, we also allow some operations on structures (such as the multiplicative inverse operation on a group or field) to only be partially defined, and we allow use of the usual simplifying conventions for mathematical formulas (e.g. writing instead of or , in cases where associativity is known). We will also deviate slightly from the usual practice in logic by emphasising individual structures, rather than the theory of general classes of structures; for instance, we will talk about the theory of a single field such as or , rather than the theory of all fields of a certain type (e.g. real closed fields or algebraically closed fields).
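Once one has a structure $M$, one can introduce the notion of a definable subset of $M$, or more generally of a Cartesian power $M^n$ of $M$, defined as a set $E \subset M^n$ of the form

$$E = \{ (x_1,\ldots,x_n) \in M^n: P(x_1,\ldots,x_n) \hbox{ true} \} \ \ \ \ \ (1)$$

for some formula $P$ in the language $\mathcal{L}$ with $n$ free variables and any number of constants from $M$ (that is, $P(x_1,\ldots,x_n)$ is a well-formed formula built up from a finite number of constants $c_1,\ldots,c_m$ in $M$, the relations and operations on $M$, logical connectives such as $\neg$, $\wedge$, $\implies$, and the quantifiers $\forall, \exists$). Thus, for instance, in the theory of the arithmetic of the natural numbers $\mathbf{N} = (\mathbf{N}, 0, 1, +, \times)$, the set of primes $\mathcal{P}$ is a definable set, since we have

$$\mathcal{P} = \{ x \in \mathbf{N}: (\exists y: x = y+2) \wedge \neg (\exists z, w: x = (z+2)(w+2)) \}.$$

In the theory of the field of reals $\mathbf{R} = (\mathbf{R}, 0, 1, +, -, \times, ()^{-1})$, the unit circle $S^1$ is an example of a definable set,

$$S^1 = \{ (x,y) \in \mathbf{R}^2: x^2+y^2 = 1 \},$$

but so is the complement of the circle,

$$\mathbf{R}^2 \backslash S^1 = \{ (x,y) \in \mathbf{R}^2: \neg(x^2+y^2 = 1) \}$$

and the interval $[-1,1]$:

$$[-1,1] = \{ x \in \mathbf{R}: \exists y: x^2+y^2 = 1 \}.$$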
Due to the unlimited use of constants, any finite subset of a power of any structure is, by our conventions, definable in that structure. (One can of course also consider definability without parameters (also known as -definability), in which arbitrary constants are not permitted, but we will not do so here.)
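Definability can be made concrete by evaluating a formula by brute force. The sketch below assumes one possible first-order formula for primality, $(\exists y: x = y+2) \wedge \neg(\exists z, w: x = (z+2)(w+2))$, and restricts every quantifier to a finite range $\{0,\ldots,\hbox{bound}\}$; this restriction is harmless here because any witness is at most $x$. The function name and the bound are hypothetical.

```python
def is_prime_by_formula(x, bound):
    """Evaluate (exists y: x = y + 2) and not (exists z, w: x = (z + 2) * (w + 2)),
    with all quantifiers ranging over {0, ..., bound}."""
    universe = range(bound + 1)
    at_least_two = any(x == y + 2 for y in universe)
    composite = any(x == (z + 2) * (w + 2) for z in universe for w in universe)
    return at_least_two and not composite

bound = 30
definable_primes = [x for x in range(bound + 1) if is_prime_by_formula(x, bound)]
print(definable_primes)  # prints [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
```

Note that the formula uses only addition, multiplication, constants, connectives, and existential quantifiers, which is exactly what the definition of a definable set permits.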
We can isolate some special subclasses of definable sets:
- An atomic definable set is a set of the form (1) in which $P$ is an atomic formula (i.e. it does not contain any logical connectives or quantifiers).
- A quantifier-free definable set is a set of the form (1) in which $P$ is quantifier-free (i.e. it can contain logical connectives, but does not contain the quantifiers $\forall, \exists$).
Example 1 In the theory of a field such as $\mathbf{R}$, an atomic definable set is the same thing as an affine algebraic set (also known as an affine algebraic variety, with the understanding that varieties are not necessarily assumed to be irreducible), and a quantifier-free definable set is known as a constructible set; thus we see that algebraic geometry can be viewed in some sense as a special case of model theory. (Conversely, it can in fact be quite profitable to think of model theory as an abstraction of algebraic geometry; for instance, the concepts of Morley rank and Morley degree in model theory (discussed in this previous blog post) directly generalise the concepts of dimension and degree in algebraic geometry.) Over $\mathbf{R}$, the interval $[-1,1]$ is a definable set, but not a quantifier-free definable set (and certainly not an atomic definable set); and similarly for the primes over $\mathbf{N}$.
A quantifier-free definable set in $M^n$ is nothing more than a finite boolean combination of atomic definable sets; in other words, the class of quantifier-free definable sets over $M$ is the smallest class that contains the atomic definable sets and is closed under boolean operations such as complementation and union (which generate all the other boolean operations). Similarly, the class of definable sets over $M$ is the smallest class that contains the quantifier-free definable sets, and is also closed under the operation of projection $\pi_n: M^{n+1} \to M^n$ from $M^{n+1}$ to $M^n$ for every natural number $n$, where $\pi_n$ is the map $\pi_n(x_1,\ldots,x_n,x_{n+1}) := (x_1,\ldots,x_n)$.
Some structures have the property of enjoying quantifier elimination, which means that every definable set is in fact a quantifier-free definable set, or equivalently that the projection of a quantifier-free definable set is again quantifier-free. For instance, an algebraically closed field $k$ (with the field operations) has quantifier elimination (i.e. the projection of a constructible set is again constructible); this fact can be proven by the classical tool of resultants, and among other things can be used to give a proof of Hilbert’s nullstellensatz. (Note though that projection does not necessarily preserve the property of being atomic; for instance, the projection of the atomic set $\{(x,y) \in k^2: xy = 1\}$ is the non-atomic, but still quantifier-free definable, set $\{x \in k: \neg(x=0)\}$.) In the converse direction, it is not difficult to use the nullstellensatz to deduce quantifier elimination. For the theory of the real field $\mathbf{R}$, which is not algebraically closed, one does not have quantifier elimination, as one can see from the example of the unit circle (which is a quantifier-free definable set) projecting down to the interval $[-1,1]$ (which is definable, but not quantifier-free definable). However, if one adds the additional operation of order $<$ to the reals, giving it the language of an ordered field rather than just a field, then quantifier elimination is recovered (the class of quantifier-free definable sets now enlarges to match the class of definable sets, which in this case is also the class of semi-algebraic sets); this is the famous Tarski-Seidenberg theorem.
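The remark that projection need not preserve atomicity can be checked by hand in a toy setting. The sketch below works over a small finite field purely as a computable stand-in (an assumption made for illustration; the prime `p = 7` is hypothetical, and the identity being tested holds in any field): the projection of the hyperbola $\{xy = 1\}$ onto the first coordinate is the set of non-zero elements.

```python
def project(points):
    """Projection (x, y) -> x of a set of points in the plane."""
    return {x for (x, _) in points}

p = 7  # hypothetical small prime; all arithmetic is modulo p
field = range(p)

# Atomic set: the hyperbola xy = 1, cut out by a single polynomial equation.
hyperbola = {(x, y) for x in field for y in field if (x * y) % p == 1}

# Quantifier-free (but not atomic) set: the complement of the origin on the line.
nonzero = {x for x in field if x != 0}

print(project(hyperbola) == nonzero)  # prints True
```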
On the other hand, many important structures do not have quantifier elimination; typically, the projection of a quantifier-free definable set is not, in general, quantifier-free definable. This failure of the projection property also shows up in many contexts outside of model theory; for instance, Lebesgue famously made the error of thinking that the projection of a Borel measurable set remained Borel measurable (it is merely an analytic set instead). Turing’s halting theorem can be viewed as an assertion that the projection of a decidable set (also known as a computable or recursive set) is not necessarily decidable (it is merely semi-decidable (or recursively enumerable) instead). The notorious P=NP problem can also be essentially viewed in this spirit; roughly speaking (and glossing over the placement of some quantifiers), it asks whether the projection of a polynomial-time decidable set is again polynomial-time decidable. And so forth. (See this blog post of Dick Lipton for further discussion of the subtleties of projections.)
Now we consider the status of quantifier elimination for the theory of a finite field . If interpreted naively, quantifier elimination is trivial for a finite field , since every subset of is finite and thus quantifier-free definable. However, we can recover an interesting question in one of two (essentially equivalent) ways. One is to work in the asymptotic regime in which the field is large, but the length of the formulae used to construct one’s definable sets stays bounded uniformly in the size of (where we view any constant in as contributing a unit amount to the length of a formula, no matter how large is). A simple counting argument then shows that only a small number of subsets of become definable in the asymptotic limit , since the number of definable sets clearly grows at most polynomially in for any fixed bound on the formula length, while the number of all subsets of grows exponentially in .
Another way to proceed is to work not with a single finite field $F$, or even with a sequence $F_\alpha$ of finite fields, but with the ultraproduct $F = \prod_{\alpha \to \alpha_\infty} F_\alpha$ of a sequence of finite fields, and to study the properties of definable sets over this ultraproduct. (We will be using the notation of ultraproducts and nonstandard analysis from this previous blog post.) This approach is equivalent to the more finitary approach mentioned in the previous paragraph, at least if one does not care to keep track of the exact bounds on the length of the formulae involved. Indeed, thanks to Łoś’s theorem, a definable subset $E$ of $F^n$ is nothing more than the ultraproduct $E = \prod_{\alpha \to \alpha_\infty} E_\alpha$ of definable subsets $E_\alpha$ of $F_\alpha^n$ for all $\alpha$ sufficiently close to $\alpha_\infty$, with the length of the formulae used to define $E_\alpha$ uniformly bounded in $\alpha$. In the language of nonstandard analysis, one can view $F$ as a nonstandard finite field.
The ultraproduct $F$ of finite fields is an important example of a pseudo-finite field – a field that obeys all the sentences in the language of fields that finite fields do, but is not necessarily itself a finite field. The model theory of pseudo-finite fields was first studied systematically by Ax (in the same paper where the Ax-Grothendieck theorem, discussed previously on this blog, was established), with important further contributions by Kiefe, by Fried-Sacerdote, by two papers of Chatzidakis-van den Dries-Macintyre, and many other authors.
As mentioned before, quantifier elimination trivially holds for finite fields. But for infinite pseudo-finite fields, such as the ultraproduct $F = \prod_{\alpha \to \alpha_\infty} F_\alpha$ of finite fields with $|F_\alpha|$ going to infinity, quantifier elimination fails. For instance, in a finite field $F_\alpha$, the set $E_\alpha := \{ x \in F_\alpha: \exists t \in F_\alpha: x = t^2 \}$ of quadratic residues is a definable set, with a bounded formula length, and so in the ultraproduct $F = \prod_{\alpha \to \alpha_\infty} F_\alpha$, the set $E := \prod_{\alpha \to \alpha_\infty} E_\alpha$ of nonstandard quadratic residues is also a definable set. However, in one dimension, we see from the factor theorem that the only atomic definable sets are either finite or the whole field $F$, and so the only constructible sets (i.e. the only quantifier-free definable sets) are either finite or cofinite in $F$. Since the quadratic residues have asymptotic density $1/2$ in a large finite field, they cannot form a quantifier-free definable set, despite being definable.
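The density claim for quadratic residues is easy to observe numerically. The following is a small sketch (the function name and the choice of primes are hypothetical) that builds the definable set $\{x: \exists t: x = t^2\}$ in a prime field by enumerating all squares.

```python
def quadratic_residues(p):
    """The definable set {x in F_p : exists t in F_p with x = t^2} (0 included)."""
    return {(t * t) % p for t in range(p)}

for p in (11, 101, 1009):
    E = quadratic_residues(p)
    # For an odd prime p there are (p + 1) / 2 squares, so the density tends to 1/2:
    # the set is neither small (bounded size) nor cofinite as p grows.
    print(p, len(E), len(E) / p)
```

Since a quantifier-free definable subset of the line given by a formula of bounded length has either bounded size or bounded complement, a set occupying about half the field cannot be of that form once the field is large.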
Nevertheless, there is a very nice almost quantifier elimination result for these fields, in characteristic zero at least, which we phrase here as follows:
where is an atomic definable subset of (i.e. the -points of an algebraic variety defined over in ) and is a polynomial.
Informally, this theorem says that while we cannot quite eliminate all quantifiers from a definable set over a nonstandard finite field, we can eliminate all but one existential quantifier. (Note that negation has also been eliminated in this theorem; for instance, the definable set $F \backslash \{0\} = \{ x \in F: \neg(x=0) \}$ uses a negation, but can also be described using a single existential quantifier as $\{ x \in F: \exists t: xt = 1 \}$.) I believe that there are more complicated analogues of this result in positive characteristic, but I have not studied this case in detail (Kiefe’s result does not assume characteristic zero, but her conclusion is slightly different from the one given here). In the one-dimensional case $n=1$, the only varieties $V$ are the affine line and finite sets, and we can simplify the above statement, namely that any definable subset of $F$ takes the form $\{ x \in F: \exists t \in F: P(x,t) = 0 \}$ for some polynomial $P$ (i.e. definable sets in $F$ are nothing more than the projections of the $F$-points of a plane curve).
There is an equivalent formulation of this theorem for standard finite fields, namely that if is a finite field and is definable using a formula of length at most , then can be expressed in the form (2) with the degree of bounded by some quantity depending on and , assuming that the characteristic of is sufficiently large depending on .
The theorem gives quite a satisfactory description of definable sets in either standard or nonstandard finite fields (at least if one does not care about effective bounds in some of the constants, and if one is willing to exclude the small characteristic case); for instance, in conjunction with the Lang-Weil bound discussed in this recent blog post, it shows that any non-empty definable subset of a nonstandard finite field has a nonstandard cardinality of for some positive standard rational and integer . Equivalently, any non-empty definable subset of for some standard finite field using a formula of length at most has a standard cardinality of for some positive rational of height and some natural number between and . (For instance, in the example of the quadratic residues given above, is equal to and equal to .) There is a more precise statement to this effect, namely that the Poincaré series of a definable set is rational; see Kiefe’s paper for details.
Below the fold I give a proof of Theorem 1, which relies primarily on the Lang-Weil bound mentioned above.
[Note: the idea for this post originated before the recent preprint of Mochizuki on the abc conjecture was released, and is not intended as a commentary on that work, which offers a much more non-trivial perspective on scheme theory. -T.]
In classical algebraic geometry, the central object of study is an algebraic variety $V$ over a field $k$ (and the theory works best when this field $k$ is algebraically closed). One can talk about either affine or projective varieties; for sake of discussion, let us restrict attention to affine varieties. Such varieties can be viewed in at least four different ways:
- (Algebraic geometry) One can view a variety through the set of points (over $k$) in that variety.
- (Commutative algebra) One can view a variety through the field of rational functions on that variety, or the subring of polynomial functions in that field.
- (Dual algebraic geometry) One can view a variety through a collection of polynomials that cut out that variety.
- (Dual commutative algebra) One can view a variety through the ideal of polynomials that vanish on that variety.
For instance, the unit circle over the reals can be thought of in each of these four different ways:
- (Algebraic geometry) The set of points $\{ (x,y) \in \mathbf{R}^2: x^2+y^2 = 1 \}$.
- (Commutative algebra) The quotient $\mathbf{R}[x,y]/(x^2+y^2-1)$ of the polynomial ring $\mathbf{R}[x,y]$ by the ideal generated by $x^2+y^2-1$ (or equivalently, the algebra generated by $x,y$ subject to the constraint $x^2+y^2=1$), or the fraction field of that quotient.
- (Dual algebraic geometry) The polynomial $x^2+y^2-1$.
- (Dual commutative algebra) The ideal $(x^2+y^2-1)$ generated by $x^2+y^2-1$.
The four viewpoints are almost equivalent to each other (particularly if the underlying field is algebraically closed), as there are obvious ways to pass from one viewpoint to another. For instance, starting with the set of points on a variety, one can form the space of rational functions on that variety, or the ideal of polynomials that vanish on that variety. Given a set of polynomials, one can cut out their zero locus, or form the ideal that they generate. Given an ideal in a polynomial ring, one can quotient out the ring by the ideal and then form the fraction field. Finally, given the ring of polynomials on a variety, one can form its spectrum (the space of prime ideals in the ring) to recover the set of points on that variety (together with the Zariski topology on that variety).
Because of the connections between these viewpoints, there are extensive “dictionaries” (most notably the ideal-variety dictionary) that convert basic concepts in one of these four perspectives into any of the other three. For instance, passing from a variety to a subvariety shrinks the set of points and the function field, but enlarges the set of polynomials needed to cut out the variety, as well as the associated ideal. Taking the intersection or union of two varieties corresponds to adding or multiplying together the two ideals respectively. The dimension of an (irreducible) algebraic variety can be defined as the transcendence degree of the function field, the maximal length of chains of subvarieties, or the Krull dimension of the ring of polynomials. And so on and so forth. Thanks to these dictionaries, it is now commonplace to think of commutative algebras geometrically, or conversely to approach algebraic geometry from the perspective of abstract algebra. There are however some very well known defects to these dictionaries, at least when viewed in the classical setting of algebraic varieties. The main one is that two different ideals (or two inequivalent sets of polynomials) can cut out the same set of points, particularly if the underlying field $k$ is not algebraically closed. For instance, if the underlying field is the real line $\mathbf{R}$, then the polynomial equations $x^2+1=0$ and $1=0$ cut out the same set of points, namely the empty set, but the ideal generated by $x^2+1$ in $\mathbf{R}[x]$ is certainly different from the ideal generated by $1$. This particular example does not work in an algebraically closed field such as $\mathbf{C}$, but in that case the polynomial equations $x^2=0$ and $x=0$ also cut out the same set of points (namely the origin), but again $x^2$ and $x$ generate different ideals in $\mathbf{C}[x]$.
Thanks to Hilbert’s nullstellensatz, we can get around this problem (in the case when $k$ is algebraically closed) by always passing from an ideal to its radical, but this causes many aspects of the theory of algebraic varieties to become more complicated when the varieties involved develop singularities or multiplicities, as can already be seen with the simple example of Bézout’s theorem.
Nowadays, the standard way to deal with these issues is to replace the notion of an algebraic variety with the more general notion of a scheme. Roughly speaking, the way schemes are defined is to focus on the commutative algebra perspective as the primary one, and to allow the base field to be not algebraically closed, or even to just be a commutative ring instead of a field. (One could even consider non-commutative rings, leading to non-commutative geometry, but we will not discuss this extension of scheme theory further here.) Once one generalises to these more abstract rings, the notion of a rational function becomes more complicated (one has to work locally instead of globally, cutting out the points where the function becomes singular), but as a first approximation one can think of a scheme as basically being the same concept as a commutative ring. (In actuality, due to the need to localise, a scheme is defined as a sheaf of rings rather than a single ring, but these technicalities will not be important for the purposes of this discussion.) All the other concepts from algebraic geometry that might previously have been defined using one of the other three perspectives, are then redefined in terms of this ring (or sheaf of rings) in order to generalise them to schemes.
Thus, for instance, in scheme theory the rings $\mathbf{C}[x]/(x^2)$ and $\mathbf{C}[x]/(x)$ describe different schemes; from the classical perspective, they cut out the same locus, namely the point $\{0\}$, but the former scheme makes this point “fatter” than the latter scheme, giving it a degree (or multiplicity) of $2$ rather than $1$.
Because of this, it seems that the link between the commutative algebra perspective and the algebraic geometry perspective is still not quite perfect in scheme theory, unless one is willing to start “fattening” various varieties to correctly model multiplicity or singularity. But – and this is the trivial remark I wanted to make in this blog post – one can recover a tight connection between the two perspectives as long as one allows the freedom to arbitrarily extend the underlying base ring.
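Here’s what I mean by this. Consider classical algebraic geometry over some commutative ring $R$ (not necessarily a field). Any set of polynomials $P_1,\ldots,P_m \in R[x_1,\ldots,x_d]$ in $d$ indeterminate variables $x_1,\ldots,x_d$ with coefficients in $R$ determines, on the one hand, an ideal

$$I := (P_1,\ldots,P_m) = \{ P_1 Q_1 + \ldots + P_m Q_m: Q_1,\ldots,Q_m \in R[x_1,\ldots,x_d] \}$$

in $R[x_1,\ldots,x_d]$, and also cuts out a zero locus

$$V[R] := \{ (y_1,\ldots,y_d) \in R^d: P_1(y_1,\ldots,y_d) = \ldots = P_m(y_1,\ldots,y_d) = 0 \},$$

since each of the polynomials $P_1,\ldots,P_m$ clearly make sense as maps from $R^d$ to $R$. Of course, one can also write $V[R]$ in terms of $I$:

$$V[R] = \{ (y_1,\ldots,y_d) \in R^d: P(y_1,\ldots,y_d) = 0 \hbox{ for all } P \in I \}.$$

Thus the ideal $I$ uniquely determines the zero locus $V[R]$, and we will emphasise this by writing $V[R]$ as $V_I[R]$. As the previous counterexamples illustrate, the converse is not true. However, whenever we have any extension $S$ of the ring $R$ (i.e. a commutative ring $S$ that contains $R$ as a subring), then we can also view the polynomials $P_1,\ldots,P_m$ as maps from $S^d$ to $S$, and so one can also define the zero locus for all the extensions:

$$V[S] := \{ (y_1,\ldots,y_d) \in S^d: P_1(y_1,\ldots,y_d) = \ldots = P_m(y_1,\ldots,y_d) = 0 \}.$$

As before, $V[S] = V_I[S]$ is determined by the ideal $I$:

$$V_I[S] = \{ (y_1,\ldots,y_d) \in S^d: P(y_1,\ldots,y_d) = 0 \hbox{ for all } P \in I \}.$$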
The trivial remark is then that while a single zero locus is insufficient to recover , the collection of zero loci for all extensions of (or more precisely, the assignment map , known as the functor of points of ) is sufficient to recover , as long as at least one zero locus, say , is non-empty. Indeed, suppose we have two ideals of that cut out the same non-empty zero locus for all extensions of , thus
for all extensions of . We apply this with the extension of given by . Note that the embedding of in is injective, since otherwise would cut out the empty set as the zero locus over , and so is indeed an extension of . Tautologically, the point lies in , and thus necessarily lies in as well. Unpacking what this means, we conclude that whenever , that is to say that . By a symmetric argument, we also have , and thus as claimed. (As pointed out in comments, this fact (and its proof) is essentially a special case of the Yoneda lemma. The connection is tighter if one allows to be any ring with a (not necessarily injective) map from into it, rather than an extension of , in which case one can also drop the hypothesis that is non-empty for at least one . For instance, for every extension of the integers, but if one also allows quotients such as or instead, then and are no longer necessarily equal.)
Thus, as long as one thinks of a variety or scheme as cutting out points not just in the original base ring or field, but in all extensions of that base ring or field, one recovers an exact correspondence between the algebraic geometry perspective and the commutative algebra perspective. This is similar to the classical algebraic geometry position of viewing an algebraic variety as being defined simultaneously over all fields that contain the coefficients of the defining polynomials, but the crucial difference between scheme theory and classical algebraic geometry is that one also allows definition over commutative rings, and not just fields. In particular, one needs to allow extensions to rings that may contain nilpotent elements, otherwise one cannot distinguish an ideal from its radical.
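To make the role of nilpotents concrete, here is a minimal sketch using the dual numbers $\mathbf{R}[e]/(e^2)$, a standard extension of the reals containing a nonzero nilpotent $e$. The class name `Dual` and the helper `in_zero_locus` are hypothetical names chosen for illustration. Over the reals, the polynomials $x$ and $x^2$ both cut out only the origin; over the dual numbers, $x^2$ also vanishes at $e$ while $x$ does not, so this one extension already separates the ideal $(x^2)$ from its radical $(x)$.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dual:
    """Element a + b*e of the ring R[e]/(e^2), where e is nilpotent (e*e = 0)."""
    a: float  # real part
    b: float  # coefficient of e

    def __add__(self, other):
        return Dual(self.a + other.a, self.b + other.b)

    def __mul__(self, other):
        # (a + b e)(c + d e) = ac + (ad + bc) e, because e^2 = 0
        return Dual(self.a * other.a, self.a * other.b + self.b * other.a)


ZERO = Dual(0.0, 0.0)
e = Dual(0.0, 1.0)  # the nonzero nilpotent element


def in_zero_locus(poly, point):
    """Check whether the one-variable polynomial `poly` vanishes at `point`."""
    return poly(point) == ZERO


p_x = lambda t: t        # generator of the ideal (x)
p_x2 = lambda t: t * t   # generator of the ideal (x^2)

if __name__ == "__main__":
    print(in_zero_locus(p_x, e))   # is e a zero of x?
    print(in_zero_locus(p_x2, e))  # is e a zero of x^2?
```

The point $e$ lies in the zero locus of $x^2$ over this extension but not in the zero locus of $x$, which is exactly the distinction that is invisible over any field.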
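There are of course many ways to extend a field into a ring, but as an analyst, one way to do so that appeals particularly to me is to introduce an epsilon parameter and work modulo errors of $O(\varepsilon)$. To formalise this algebraically, let’s say for sake of concreteness that the base field is the real line $\mathbf{R}$. Consider the ring $\mathcal{R}$ of real-valued quantities $x = x_\varepsilon$ that depend on a parameter $\varepsilon > 0$ (i.e. functions from $\mathbf{R}^+$ to $\mathbf{R}$), which are locally bounded in the sense that $x$ is bounded whenever $\varepsilon$ is bounded. (One can, if one wishes, impose some further continuity or smoothness hypotheses on how $x$ depends on $\varepsilon$, but this turns out not to be relevant for the following discussion. Algebraists often prefer to use the ring of Puiseux series here in place of $\mathcal{R}$, and a nonstandard analyst might instead use the hyperreals, but again this will not make too much difference for our purposes.) Inside this commutative ring, we can form the ideal $\mathcal{I}$ of quantities $x = x_\varepsilon$ that are of size $O(\varepsilon)$ as $\varepsilon \to 0$, i.e. there exists a quantity $C > 0$ independent of $\varepsilon$ such that $|x| \leq C \varepsilon$ for all sufficiently small $\varepsilon$. This can easily be seen to indeed be an ideal in $\mathcal{R}$. We then form the quotient ring $\mathcal{R}/\mathcal{I} := \{ x \text{ mod } \mathcal{I} : x \in \mathcal{R} \}$. Note that $x = y \text{ mod } \mathcal{I}$ is equivalent to the assertion that $x = y + O(\varepsilon)$, so we are encoding the analyst’s notion of “equal up to errors of $O(\varepsilon)$” into algebraic terms.
Clearly, $\mathcal{R}/\mathcal{I}$ is a commutative ring extending $\mathbf{R}$. Hence, any algebraic variety
$$V[\mathbf{R}] = \{ (y_1,\ldots,y_d) \in \mathbf{R}^d : P_1(y_1,\ldots,y_d) = \ldots = P_m(y_1,\ldots,y_d) = 0 \}$$
defined over the reals $\mathbf{R}$ (so the polynomials $P_1,\ldots,P_m$ have coefficients in $\mathbf{R}$), also is defined over $\mathcal{R}/\mathcal{I}$:
$$V[\mathcal{R}/\mathcal{I}] = \{ (y_1,\ldots,y_d) \in (\mathcal{R}/\mathcal{I})^d : P_1(y_1,\ldots,y_d) = \ldots = P_m(y_1,\ldots,y_d) = 0 \}.$$
In language that more closely resembles analysis, we have
$$V[\mathcal{R}/\mathcal{I}] = \{ (y_1,\ldots,y_d) \in \mathcal{R}^d : P_1(y_1,\ldots,y_d), \ldots, P_m(y_1,\ldots,y_d) = O(\varepsilon) \} \text{ mod } \mathcal{I}^d.$$
Thus we see that $V[\mathcal{R}/\mathcal{I}]$ is in some sense an “$\varepsilon$-thickening” of $V[\mathbf{R}]$, and is thus one way to give rigorous meaning to the intuition that schemes can “thicken” varieties. For instance, the scheme associated to the ideal $(x)$, when interpreted over $\mathcal{R}/\mathcal{I}$, becomes an $O(\varepsilon)$ neighbourhood of the origin
$$V_{(x)}[\mathcal{R}/\mathcal{I}] = \{ x \in \mathcal{R} : x = O(\varepsilon) \} \text{ mod } \mathcal{I},$$
but the scheme associated to the smaller ideal $(x^2)$, when interpreted over $\mathcal{R}/\mathcal{I}$, becomes an $O(\varepsilon^{1/2})$-neighbourhood of the origin, thus being a much “fatter” point:
$$V_{(x^2)}[\mathcal{R}/\mathcal{I}] = \{ x \in \mathcal{R} : x^2 = O(\varepsilon) \} \text{ mod } \mathcal{I}.$$
Once one introduces the analyst’s epsilon, one can see quite clearly that $V_{(x^2)}[\mathcal{R}/\mathcal{I}]$ is coming from a larger scheme than $V_{(x)}[\mathcal{R}/\mathcal{I}]$, with fewer polynomials vanishing on it; in particular, the polynomial $x$ vanishes to order $O(\varepsilon)$ on $V_{(x)}[\mathcal{R}/\mathcal{I}]$ but does not vanish to order $O(\varepsilon)$ on $V_{(x^2)}[\mathcal{R}/\mathcal{I}]$.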
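The difference between the two thickened points can be probed numerically. The sketch below is only a heuristic: it tests a bound of the form $|q(\varepsilon)| \leq C\varepsilon$ on a finite sample of small $\varepsilon$, which can suggest but not prove that a quantity is $O(\varepsilon)$. The function name `looks_O_eps`, the constant `C = 10`, and the sample grid are all assumptions made for illustration. It compares the quantity $3\varepsilon$, which is $O(\varepsilon)$, with the quantity $\varepsilon^{1/2}$, whose square is $O(\varepsilon)$ even though the quantity itself is not.

```python
import math

# Sample values of the parameter eps tending to zero (an illustrative choice).
EPS_SAMPLES = [10.0 ** (-k) for k in range(1, 9)]


def looks_O_eps(q, C=10.0, samples=EPS_SAMPLES):
    """Heuristic test: is |q(eps)| <= C * eps at every sampled eps?"""
    return all(abs(q(eps)) <= C * eps for eps in samples)


def square(q):
    """Return the quantity eps -> q(eps)^2."""
    return lambda eps: q(eps) ** 2


x_thin = lambda eps: 3.0 * eps       # of size O(eps)
x_fat = lambda eps: math.sqrt(eps)   # of size O(eps^(1/2)), not O(eps)

if __name__ == "__main__":
    for name, q in [("3*eps", x_thin), ("sqrt(eps)", x_fat)]:
        print(name, "x = O(eps)?", looks_O_eps(q),
              "| x^2 = O(eps)?", looks_O_eps(square(q)))
```

Under these assumptions, both quantities pass the test for the polynomial $x^2$, but only the first passes it for the polynomial $x$, matching the picture of the second point being "fatter" than the first.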
By working with this analyst’s extension $\mathcal{R}/\mathcal{I}$ of $\mathbf{R}$, one can already get a reasonably good first approximation of what schemes over $\mathbf{R}$ look like, which I found particularly helpful for getting some intuition on these objects. However, since this is only one extension of $\mathbf{R}$, and not a “universal” such extension, it cannot quite distinguish any two schemes from each other, although it does a better job of this than classical algebraic geometry. For instance, consider the scheme cut out by the polynomials $x^2, y^2$ in two dimensions. Over $\mathcal{R}/\mathcal{I}$, this becomes
$$V_{(x^2,y^2)}[\mathcal{R}/\mathcal{I}] = \{ (x,y) \in \mathcal{R}^2 : x^2, y^2 = O(\varepsilon) \} \text{ mod } \mathcal{I}^2.$$
Note that the polynomial $xy$ vanishes to order $O(\varepsilon)$ on this locus (since $|xy| \leq (x^2+y^2)/2$), but $xy$ fails to lie in the ideal $(x^2,y^2)$. Equivalently, we have $V_{(x^2,y^2)}[\mathcal{R}/\mathcal{I}] = V_{(x^2,y^2,xy)}[\mathcal{R}/\mathcal{I}]$, despite $(x^2,y^2)$ and $(x^2,y^2,xy)$ being distinct ideals. Basically, the analogue of the nullstellensatz for $\mathcal{R}/\mathcal{I}$ does not completely remove the need for performing a closure operation on the ideal $I$; it is less severe than taking the radical, but is instead more like taking a “convex hull” in that one needs to be able to “interpolate” between two polynomials in the ideal (such as $x^2$ and $y^2$) to arrive at intermediate polynomials (such as $xy$) that one then places in the ideal.
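Both halves of this example can be checked mechanically. The sketch below relies on the standard fact that a polynomial lies in an ideal generated by monomials exactly when each of its monomials is divisible by one of the generators. Monomials $x^a y^b$ are encoded as exponent pairs `(a, b)`, and the helper names are hypothetical. It confirms that $xy$ is not in the ideal generated by $x^2$ and $y^2$ although its square is, and it spot-checks the inequality $|xy| \leq (x^2+y^2)/2$, which is what forces $xy$ to be small to first order in the epsilon parameter whenever $x^2$ and $y^2$ are.

```python
def divides(g, m):
    """Does the monomial g divide the monomial m? Both are exponent tuples."""
    return all(ge <= me for ge, me in zip(g, m))


def in_monomial_ideal(monomials, generators):
    """True iff every monomial of the polynomial is divisible by some generator.

    `monomials` lists the exponent tuples of the terms with nonzero coefficient.
    """
    return all(any(divides(g, m) for g in generators) for m in monomials)


GENS = [(2, 0), (0, 2)]   # the generators x^2 and y^2
XY = [(1, 1)]             # the polynomial xy
XY_SQUARED = [(2, 2)]     # the polynomial (xy)^2 = x^2 y^2


def am_gm_holds(x, y):
    """Check |xy| <= (x^2 + y^2) / 2 at a single point."""
    return abs(x * y) <= (x * x + y * y) / 2


if __name__ == "__main__":
    print("xy in (x^2, y^2)?    ", in_monomial_ideal(XY, GENS))
    print("(xy)^2 in (x^2, y^2)?", in_monomial_ideal(XY_SQUARED, GENS))
    print("AM-GM at (3, -5)?    ", am_gm_holds(3, -5))
```

So $xy$ sits strictly between the ideal and its radical: it is outside the ideal, inside the radical, and invisible to the single extension considered here.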
One can also view ideals (and hence, schemes), from a model-theoretic perspective. Let $I$ be an ideal of a polynomial ring $R[x_1,\ldots,x_d]$ generated by some polynomials $P_1,\ldots,P_m$. Then, clearly, if $Q$ is another polynomial in the ideal $I$, then we can use the axioms of commutative algebra (which are basically the axioms of high school algebra) to obtain the syntactic deduction
$$P_1(x_1,\ldots,x_d) = \ldots = P_m(x_1,\ldots,x_d) = 0 \implies Q(x_1,\ldots,x_d) = 0 \qquad (1)$$
for any assignment of indeterminates $x_1,\ldots,x_d$ in $R$ (or in any extension $R'$ of $R$). If we restrict $x_1,\ldots,x_d$ to lie in $R$ only, then (even if $R$ is an algebraically closed field), the converse of the above statement is false; there can exist polynomials $Q$ outside of $I$ for which (1) holds for all assignments $x_1,\ldots,x_d$ in $R$. For instance, we have
$$x^2 = 0 \implies x = 0$$
for all $x$ in an algebraically closed field, despite $x$ not lying in the ideal $(x^2)$. Of course, the nullstellensatz again explains what is going on here; (1) holds whenever $Q$ lies in the radical of $I$, which can be larger than $I$ itself. But if one allows the indeterminates $x_1,\ldots,x_d$ to take values in arbitrary extensions $R'$ of $R$, then the truth of the converse is restored, thus giving a “completeness theorem” relating the syntactic deductions of commutative algebra to the semantic interpretations of such algebras over the extensions $R'$. For instance, since
$$x^2 = 0 \text{ mod } \mathcal{I} \;\not\!\!\implies x = 0 \text{ mod } \mathcal{I},$$
we no longer have a counterexample to the converse coming from $Q = x$ and $I = (x^2)$ once we work in $\mathcal{R}/\mathcal{I}$ instead of $\mathbf{R}$. On the other hand, we still have
$$x^2 = y^2 = 0 \text{ mod } \mathcal{I} \implies xy = 0 \text{ mod } \mathcal{I},$$
so the extension $\mathcal{R}/\mathcal{I}$ is not powerful enough to detect that $Q = xy$ does not actually lie in $I = (x^2, y^2)$; a larger ring (which is less easy to assign an analytic interpretation to) is needed to achieve this.