
One of the first non-trivial theorems one encounters in classical algebraic geometry is Bézout’s theorem, which we will phrase as follows:

Theorem 1 (Bézout’s theorem) Let {k} be a field, and let {P, Q \in k[x,y]} be non-zero polynomials in two variables {x,y} with no common factor. Then the two curves {\{ (x,y) \in k^2: P(x,y) = 0\}} and {\{ (x,y) \in k^2: Q(x,y) = 0\}} have no common components, and intersect in at most {\hbox{deg}(P) \hbox{deg}(Q)} points.

This theorem can be proven by a number of means, for instance by using the classical tool of resultants. It has many strengthenings, generalisations, and variants; see for instance this previous blog post on Bézout’s inequality. Bézout’s theorem asserts a fundamental algebraic dichotomy, of importance in combinatorial incidence geometry: any two algebraic curves either share a common component, or else have a bounded finite intersection; there is no intermediate case in which the intersection is unbounded in cardinality, but falls short of a common component. This dichotomy is closely related to the integrality gap in algebraic dimension: an algebraic set can have an integer dimension such as {0} or {1}, but cannot attain any intermediate dimensions such as {1/2}. This stands in marked contrast to sets of analytic, combinatorial, or probabilistic origin, whose “dimension” is typically not necessarily constrained to be an integer.
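
As a concrete illustration of the resultant method (a minimal sympy sketch added here, with an arbitrarily chosen conic and cubic): eliminating one variable produces a one-variable polynomial of degree at most {\hbox{deg}(P) \hbox{deg}(Q)} whose roots contain the coordinates of all intersection points, which is the basic mechanism behind the resultant proof mentioned above.

```python
# A minimal sympy sketch: bounding the intersection of two plane curves
# via the classical resultant.  P has degree 2 and Q has degree 3, so
# Bezout predicts at most 6 intersection points.
from sympy import symbols, resultant, degree

x, y = symbols('x y')
P = x**2 + y**2 - 1      # a conic (degree 2)
Q = y - x**3             # a cubic (degree 3)

# Eliminating y gives a polynomial in x alone whose roots contain the
# x-coordinates of every common zero of P and Q.
R = resultant(P, Q, y)
print(R)                 # x**6 + x**2 - 1
print(degree(R, x))      # 6 = deg(P) * deg(Q)
```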

Bézout’s inequality tells us, roughly speaking, that the intersection of a curve of degree {D_1} and a curve of degree {D_2} forms a set of at most {D_1 D_2} points. One can consider the converse question: given a set {S} of {N} points in the plane {k^2}, can one find two curves of degrees {D_1,D_2} with {D_1 D_2 = O(N)} and no common components, whose intersection contains these points?

A model example that supports the possibility of such a converse is a grid {S = A \times B} that is a Cartesian product of two finite subsets {A, B} of {k} with {|A| |B| = N}. In this case, one can take one curve to be the union of {|A|} vertical lines, and the other curve to be the union of {|B|} horizontal lines, to obtain the required decomposition. Thus, if the proposed converse to Bézout’s inequality held, it would assert that any set of {N} points was essentially behaving like a “nonlinear grid” of size {N}.
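
Here is a small sympy sketch of this model example (with an arbitrarily chosen grid, added purely for illustration), verifying that the two reducible curves cut out exactly the Cartesian product:

```python
# A minimal sketch of the model example: a Cartesian product A x B is cut out
# by a union of |A| vertical lines and a union of |B| horizontal lines, whose
# degrees multiply to |A||B| = N.
from sympy import symbols, Mul

x, y = symbols('x y')
A = [0, 1, 2, 4]                       # |A| = 4
B = [1, 3, 5]                          # |B| = 3

P = Mul(*[x - a for a in A])           # degree |A|: union of vertical lines
Q = Mul(*[y - b for b in B])           # degree |B|: union of horizontal lines

grid = [(a, b) for a in A for b in B]
assert all(P.subs(x, a) == 0 and Q.subs(y, b) == 0 for (a, b) in grid)
print(len(grid), "points covered; deg(P)*deg(Q) =", len(A) * len(B))
```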

Unfortunately, the naive converse to Bézout’s theorem is false. A counterexample can be given by considering a set {S = S_1 \cup S_2} of {2N} points for some large perfect square {N}, where {S_1} is a {\sqrt{N}} by {\sqrt{N}} grid of the form described above, and {S_2} consists of {N} points on a line {\ell} (e.g. a {1 \times N} or {N \times 1} grid). Each of the two component sets {S_1, S_2} can be written as the intersection between two curves whose degrees multiply up to {N}; in the case of {S_1}, we can take the two families of parallel lines (viewed as reducible curves of degree {\sqrt{N}}) as the curves, and in the case of {S_2}, one can take {\ell} as one curve, and the graph of a degree {N} polynomial on {\ell} vanishing on {S_2} for the second curve. But, if {N} is large enough, one cannot cover {S} by the intersection of a single pair {\gamma_1, \gamma_2} of curves with no common components whose degrees {D_1,D_2} multiply up to {D_1 D_2 = O(N)}. Indeed, if this were the case, then without loss of generality we may assume that {D_1 \leq D_2}, so that {D_1 = O(\sqrt{N})}. By Bézout’s theorem, {\gamma_1} either contains {\ell}, or intersects {\ell} in at most {O(D_1) = O(\sqrt{N})} points. Thus, in order for {\gamma_1} to capture all of {S}, it must contain {\ell}, which forces {\gamma_2} to not contain {\ell}. But {\gamma_2} has to intersect {\ell} in {N} points, so by Bézout’s theorem again we have {D_2 \geq N}, thus {D_1 = O(1)}. But then (by more applications of Bézout’s theorem) {\gamma_1} can only capture {O(\sqrt{N})} of the {N} points of {S_1}, a contradiction.

But the above counterexample suggests that even if an arbitrary set of {N} (or {2N}) points cannot be covered by the single intersection of a pair of curves with degree multiplying up to {O(N)}, one may be able to cover such a set by a small number of such intersections. The purpose of this post is to record the simple observation that this is, indeed, the case:

Theorem 2 (Partial converse to Bézout’s theorem) Let {k} be a field, and let {S} be a set of {N} points in {k^2} for some {N > 1}. Then one can find {m = O(\log N)} and pairs {P_i,Q_i \in k[x,y]} of coprime non-zero polynomials for {i=1,\ldots,m} such that

\displaystyle  S \subset \bigcup_{i=1}^m \{ (x,y) \in k^2: P_i(x,y) = Q_i(x,y) = 0 \} \ \ \ \ \ (1)

and

\displaystyle  \sum_{i=1}^m \hbox{deg}(P_i) \hbox{deg}(Q_i) = O( N ). \ \ \ \ \ (2)

Informally, every finite set in the plane is (a dense subset of) the union of logarithmically many nonlinear grids. The presence of the logarithm is necessary, as can be seen by modifying the {S_1 \cup S_2} example to be the union of logarithmically many Cartesian products of distinct dimensions, rather than just a pair of such products.

Unfortunately I do not know of any application of this converse, but I thought it was cute anyways. The proof is given below the fold.


There has been a lot of recent interest in the abc conjecture, since the release a few weeks ago of the last of a series of papers by Shinichi Mochizuki which, as one of its major applications, claims to establish this conjecture. It’s still far too early to judge whether this proof is likely to be correct or not (the entire argument encompasses several hundred pages, mostly in the area of anabelian geometry, which very few mathematicians are expert in, to the extent that we do not even have a full outline of the proof strategy yet), and I don’t have anything substantial to add to the existing discussion around that conjecture. (But, for those that are interested, the Polymath wiki page on the ABC conjecture has collected most of the links to that discussion, and to various background materials.)

In the meantime, though, I thought I might give the standard probabilistic heuristic argument that explains why we expect the ABC conjecture to be true. The underlying heuristic is a common one, used throughout number theory, and it can be summarised as follows:

Heuristic 1 (Probabilistic heuristic) Even though number theory is a deterministic subject (one does not need to roll any dice to factorise a number, or figure out if a number is prime), one expects to get a good asymptotic prediction for the answers to many number-theoretic questions by pretending that various number-theoretic assertions {E} (e.g. that a given number {n} is prime) are probabilistic events (with a probability {\mathop{\bf P}(E)} that can vary between {0} and {1}) rather than deterministic events (that are either always true or always false). Furthermore:

  • (Basic heuristic) If two or more of these heuristically probabilistic events have no obvious reason to be strongly correlated to each other, then we should expect them to behave as if they were (jointly) independent.
  • (Advanced heuristic) If two or more of these heuristically probabilistic events have some obvious correlation between them, but no further correlations are suspected, then we should expect them to behave as if they were conditionally independent, relative to whatever data is causing the correlation.

This is, of course, an extremely vague and completely non-rigorous heuristic, requiring (among other things) a subjective and ad hoc determination of what an “obvious reason” is, but in practice it tends to give remarkably plausible predictions, some fraction of which can in fact be backed up by rigorous argument (although in many cases, the actual argument has almost nothing in common with the probabilistic heuristic). A famous special case of this heuristic is the Cramér random model for the primes, but this is not the only such instance for that heuristic.

To give the most precise predictions, one should use the advanced heuristic in Heuristic 1, but this can be somewhat complicated to execute, and so we shall focus instead on the predictions given by the basic heuristic (thus ignoring the presence of some number-theoretic correlations), which tends to give predictions that are quantitatively inaccurate but still reasonably good at the qualitative level.

Here is a basic “corollary” of Heuristic 1:

Heuristic 2 (Heuristic Borel-Cantelli) Suppose one has a sequence {E_1, E_2, \ldots} of number-theoretic statements, which we heuristically interpret as probabilistic events with probabilities {\mathop{\bf P}(E_1), \mathop{\bf P}(E_2), \ldots}. Suppose also that we know of no obvious reason for these events to have much of a correlation with each other. Then:

  • If {\sum_{i=1}^\infty \mathop{\bf P}(E_i) < \infty}, we expect only finitely many of the statements {E_n} to be true. (And if {\sum_{i=1}^\infty \mathop{\bf P}(E_i)} is much smaller than {1}, we in fact expect none of the {E_n} to be true.)
  • If {\sum_{i=1}^\infty \mathop{\bf P}(E_i) = \infty}, we expect infinitely many of the statements {E_n} to be true.

This heuristic is motivated both by the Borel-Cantelli lemma, and by the standard probabilistic computation that if one is given jointly independent, and genuinely probabilistic, events {E_1, E_2, \ldots} with {\sum_{i=1}^\infty \mathop{\bf P}(E_i) = \infty}, then one almost surely has an infinite number of the {E_i} occurring.
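
As a small illustration (a quick Python simulation added here, with arbitrarily chosen probabilities), one can contrast the two cases of Heuristic 2 for genuinely independent events:

```python
# Contrast the two cases of (heuristic) Borel-Cantelli for independent events:
# P(E_n) = 1/n has a divergent sum, P(E_n) = 1/n**2 a convergent one.
import random

random.seed(0)
N = 10**6

divergent_hits = sum(random.random() < 1.0 / n for n in range(1, N + 1))
convergent_hits = sum(random.random() < 1.0 / n**2 for n in range(1, N + 1))

# Roughly log(N) ~ 14 events occur in the first case (and this count keeps
# growing as N does), while only O(1) events occur in the second.
print("sum 1/n diverges  :", divergent_hits)
print("sum 1/n^2 converges:", convergent_hits)
```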

Before we get to the ABC conjecture, let us give two simpler (and well known) demonstrations of these heuristics in action:

Example 1 (Twin prime conjecture) One can heuristically justify the twin prime conjecture as follows. Using the prime number theorem, one can heuristically assign a probability of {1/\log n} to the event that any given large integer {n} is prime. In particular, the probability that {n+2} is prime will then be {1/\log(n+2)}. Making the assumption that there are no strong correlations between these events, we are led to the prediction that the probability that {n} and {n+2} are simultaneously prime is {\frac{1}{(\log n)(\log(n+2))}}. Since {\sum_{n=2}^\infty \frac{1}{(\log n) (\log(n+2))} = \infty}, the Borel-Cantelli heuristic then predicts that there should be infinitely many twin primes.
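
For a rough numerical sanity check (a small Python sketch added here, using sympy and an arbitrary cutoff), one can compare this naive prediction against the actual count of twin primes:

```python
# Compare the actual number of twin prime pairs below X with the naive
# heuristic sum of 1/(log n * log(n+2)).
from math import log
from sympy import isprime

X = 10**5
actual = sum(1 for n in range(3, X) if isprime(n) and isprime(n + 2))
naive = sum(1.0 / (log(n) * log(n + 2)) for n in range(3, X))

# The naive sum is off by a constant factor (the local corrections discussed
# next), but both quantities grow without bound as X increases.
print("twin prime pairs below", X, ":", actual)
print("naive heuristic prediction   :", round(naive))
```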

Note that the above argument is a bit too naive, because there are some non-trivial correlations between the primality of {n} and the primality of {n+2}. Most obviously, if {n} is prime, this greatly increases the probability that {n} is odd, which implies that {n+2} is odd, which then elevates the probability that {n+2} is prime. A bit more subtly, if {n} is prime, then {n} is likely to avoid the residue class {0 \hbox{ mod } 3}, which means that {n+2} avoids the residue class {2 \hbox{ mod } 3}, which ends up decreasing the probability that {n+2} is prime. However, there is a standard way to correct for these local correlations; see for instance this previous blog post. As it turns out, these local correlations ultimately alter the prediction for the asymptotic density of twin primes by a constant factor (the twin prime constant), but do not affect the qualitative prediction of there being infinitely many twin primes.
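
To make these local corrections concrete, here is a small sketch (added for illustration, truncating the product over primes at an arbitrary bound): for each prime {p} one compares the true probability that {n} and {n+2} both avoid the residue class {0 \hbox{ mod } p} with the naive independent prediction, and multiplies the ratios together.

```python
# Accumulate the local correction factors to the naive twin prime prediction.
from sympy import primerange

correction = 1.0
for p in primerange(2, 10**6):
    if p == 2:
        true_prob = 0.5              # n odd forces n + 2 odd
    else:
        true_prob = (p - 2) / p      # n and n + 2 must both avoid 0 mod p
    naive_prob = (1 - 1 / p) ** 2    # independent model
    correction *= true_prob / naive_prob

print(correction)  # approaches twice the twin prime constant, about 1.3203
```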

Example 2 (Fermat’s last theorem) Let us now heuristically count the number of solutions to {x^n+y^n=z^n} for various {n} and natural numbers {x,y,z} (which we can reduce to be coprime if desired). We recast this (in the spirit of the ABC conjecture) as {a+b=c}, where {a,b,c} are {n^{th}} powers. The number of {n^{th}} powers up to any given number {N} is about {N^{1/n}}, so heuristically any given natural number {a} has a probability about {a^{1/n - 1}} of being an {n^{th}} power. If we make the naive assumption that (in the coprime case at least) there is no strong correlation between the events that {a} is an {n^{th}} power, that {b} is an {n^{th}} power, and that {a+b} is an {n^{th}} power, then for typical {a,b}, the probability that {a,b,a+b} are all simultaneously {n^{th}} powers would then be {a^{1/n-1} b^{1/n-1} (a+b)^{1/n-1}}. For fixed {n}, the total number of solutions to the Fermat equation would then be predicted to be

\displaystyle  \sum_{a=1}^\infty \sum_{b=1}^\infty a^{1/n-1} b^{1/n-1} (a+b)^{1/n-1}.

(Strictly speaking, we need to restrict to the coprime case, but given that a positive density of pairs of integers are coprime, it should not affect the qualitative conclusion significantly if we now omit this restriction.) It might not be immediately obvious as to whether this sum converges or diverges, but (as is often the case with these sorts of unsigned sums) one can clarify the situation by dyadic decomposition. Suppose for instance that we consider the portion of the sum where {c=a+b} lies between {2^k} and {2^{k+1}}. Then this portion of the sum can be controlled by

\displaystyle  \sum_{a \leq 2^{k+1}} \sum_{b \leq 2^{k+1}} a^{1/n-1} b^{1/n-1} O( ( 2^k )^{1/n - 1} )

which simplifies to

\displaystyle  O( 2^{(3/n - 1)k} ).

Summing in {k}, one thus expects infinitely many solutions for {n=2}, only finitely many solutions for {n>3} (indeed, a refinement of this argument shows that one expects only finitely many solutions even if one considers all {n>3} at once), and a borderline prediction of there being a barely infinite number of solutions when {n=3}. Here is of course a place where a naive application of the probabilistic heuristic breaks down; there is enough arithmetic structure in the equation {x^3+y^3=z^3} that the naive probabilistic prediction ends up being an inaccurate model. Indeed, while this heuristic suggests that a typical homogeneous cubic should have a logarithmic number of integer solutions of a given height {N}, it turns out that some homogeneous cubics (namely, those associated to elliptic curves of positive rank) end up with the bulk of these solutions, while other homogeneous cubics (including those associated to elliptic curves of zero rank, including the Fermat curve {x^3+y^3=z^3}) only get finitely many solutions. The reasons for this are subtle, but certainly the high degree of arithmetic structure present in an elliptic curve (starting with the elliptic curve group law which allows one to generate new solutions from old ones, and which also can be used to exclude solutions to {x^3+y^3=z^3} via the method of descent) is a major contributing factor.
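
The dyadic bookkeeping above is easy to reproduce numerically; the following sketch (with arbitrary truncation points, added for illustration) tabulates partial sums of the dyadic blocks {2^{(3/n-1)k}} for a few exponents, showing divergence at {n=2}, logarithmic divergence at {n=3}, and convergence for {n > 3}.

```python
# Partial sums of the dyadic block contributions 2**((3/n - 1) * k).
for n in (2, 3, 4, 5):
    partial_sums = [sum(2.0 ** ((3.0 / n - 1.0) * k) for k in range(K))
                    for K in (10, 20, 40)]
    print(f"n = {n}:", [round(s, 2) for s in partial_sums])
```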

Below the fold, we apply similar heuristics to suggest the truth of the ABC conjecture.


Much as group theory is the study of groups, or graph theory is the study of graphs, model theory is the study of models (also known as structures) of some language {{\mathcal L}} (which, in this post, will always be a single-sorted, first-order language). A structure is a set {X}, equipped with one or more operations, constants, and relations. This is of course an extremely general type of mathematical object, but (quite remarkably) one can still say a substantial number of interesting things about very broad classes of structures.

We will observe the common abuse of notation of using the set {X} as a metonym for the entire structure, much as we usually refer to a group {(G,1,\cdot,()^{-1})} simply as {G}, a vector space {(V, 0, +, \cdot)} simply as {V}, and so forth. Following another common bending of the rules, we also allow some operations on structures (such as the multiplicative inverse operation on a group or field) to only be partially defined, and we allow use of the usual simplifying conventions for mathematical formulas (e.g. writing {a+b+c} instead of {(a+b)+c} or {a+(b+c)}, in cases where associativity is known). We will also deviate slightly from the usual practice in logic by emphasising individual structures, rather than the theory of general classes of structures; for instance, we will talk about the theory of a single field such as {{\bf R}} or {{\bf C}}, rather than the theory of all fields of a certain type (e.g. real closed fields or algebraically closed fields).

Once one has a structure {X}, one can introduce the notion of a definable subset of {X}, or more generally of a Cartesian power {X^n} of {X}, defined as a set {E \subset X^n} of the form

\displaystyle  E = \{ (x_1,\ldots,x_n): P(x_1,\ldots,x_n) \hbox{ true} \} \ \ \ \ \ (1)

for some formula {P} in the language {{\mathcal L}} with {n} free variables and any number of constants from {X} (that is, {P(x_1,\ldots,x_n)} is a well-formed formula built up from a finite number of constants {c_1,\ldots,c_m} in {X}, the relations and operations on {X}, logical connectives such as {\neg}, {\wedge}, {\implies}, and the quantifiers {\forall, \exists}). Thus, for instance, in the theory of the arithmetic of the natural numbers {{\bf N} = ({\bf N}, 0, 1, +, \times)}, the set of primes {{\mathcal P}} is a definable set, since we have

\displaystyle  {\mathcal P} = \{ x \in {\bf N}: (\exists y: x=y+2) \wedge \neg (\exists z,w: x = (z+2)(w+2)) \}.
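
As a quick sanity check of this defining formula (a small Python sketch added here; the finite search bound below is a stand-in for the unbounded quantifiers over {{\bf N}}):

```python
# x is "prime" according to the formula iff x = y + 2 for some y >= 0, and
# x is not of the form (z + 2) * (w + 2) for any z, w >= 0.
def definable_prime(x, bound=100):
    exists_y = any(x == y + 2 for y in range(bound))
    exists_zw = any(x == (z + 2) * (w + 2)
                    for z in range(bound) for w in range(bound))
    return exists_y and not exists_zw

print([x for x in range(50) if definable_prime(x)])
# [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47]
```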

In the theory of the field of reals {{\bf R} = ({\bf R}, 0, 1, +, -, \times, ()^{-1})}, the unit circle {S^1} is an example of a definable set,

\displaystyle  S^1 = \{ (x,y) \in {\bf R}^2: x^2+y^2 = 1 \},

but so is the complement of the circle,

\displaystyle  {\bf R}^2 \backslash S^1 = \{ (x,y) \in {\bf R}^2: \neg(x^2+y^2 = 1) \}

and the interval {[-1,1]}:

\displaystyle  [-1,1] = \{ x \in {\bf R}: \exists y: x^2+y^2 = 1\}.

Due to the unlimited use of constants, any finite subset of a power {X^n} of any structure {X} is, by our conventions, definable in that structure. (One can of course also consider definability without parameters (also known as {0}-definability), in which arbitrary constants are not permitted, but we will not do so here.)

We can isolate some special subclasses of definable sets:

  • An atomic definable set is a set of the form (1) in which {P()} is an atomic formula (i.e. it does not contain any logical connectives or quantifiers).
  • A quantifier-free definable set is a set of the form (1) in which {P()} is quantifier-free (i.e. it can contain logical connectives, but does not contain the quantifiers {\forall, \exists}).

Example 1 In the theory of a field such as {{\bf R}}, an atomic definable set is the same thing as an affine algebraic set (also known as an affine algebraic variety, with the understanding that varieties are not necessarily assumed to be irreducible), and a quantifier-free definable set is known as a constructible set; thus we see that algebraic geometry can be viewed in some sense as a special case of model theory. (Conversely, it can in fact be quite profitable to think of model theory as an abstraction of algebraic geometry; for instance, the concepts of Morley rank and Morley degree in model theory (discussed in this previous blog post) directly generalises the concepts of dimension and degree in algebraic geometry.) Over {{\bf R}}, the interval {[-1,1]} is a definable set, but not a quantifier-free definable set (and certainly not an atomic definable set); and similarly for the primes over {{\bf N}}.

A quantifier-free definable set in {X^n} is nothing more than a finite boolean combination of atomic definable sets; in other words, the class of quantifier-free definable sets over {X} is the smallest class that contains the atomic definable sets and is closed under boolean operations such as complementation and union (which generate all the other boolean operations). Similarly, the class of definable sets over {X} is the smallest class that contains the quantifier-free definable sets, and is also closed under complementation and under the operation of projection {\pi_n: E \mapsto \pi_n(E)} from {X^{n+1}} to {X^n} for every natural number {n}, where {\pi_n: X^{n+1} \rightarrow X^n} is the map {\pi_n(x_1,\ldots,x_n,x_{n+1}) := (x_1,\ldots,x_n)}.

Some structures have the property of enjoying quantifier elimination, which means that every definable set is in fact a quantifier-free definable set, or equivalently that the projection of a quantifier-free definable set is again quantifier-free. For instance, an algebraically closed field {k} (with the field operations) has quantifier elimination (i.e. the projection of a constructible set is again constructible); this fact can be proven by the classical tool of resultants, and among other things can be used to give a proof of Hilbert’s nullstellensatz. (Note though that projection does not necessarily preserve the property of being atomic; for instance, the projection of the atomic set {\{ (x,y) \in k^2: xy=1 \}} is the non-atomic, but still quantifier-free definable, set {\{ x \in k: \neg (x=0) \}}.) In the converse direction, it is not difficult to use the nullstellensatz to deduce quantifier elimination. For the theory of the real field {{\bf R}}, which is not algebraically closed, one does not have quantifier elimination, as one can see from the example of the unit circle (which is a quantifier-free definable set) projecting down to the interval {[-1,1]} (which is definable, but not quantifier-free definable). However, if one adds the additional relation of order {<} to the reals, giving it the language of an ordered field rather than just a field, then quantifier elimination is recovered (the class of quantifier-free definable sets now enlarges to match the class of definable sets, which in this case is also the class of semi-algebraic sets); this is the famous Tarski-Seidenberg theorem.
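
To make the role of resultants concrete, here is a minimal sympy sketch (with arbitrarily chosen polynomials, added for illustration): the existential quantifier over a system of polynomial equations can (essentially) be replaced by the vanishing of a resultant, which is a polynomial condition in the remaining variables.

```python
# Eliminating y from the system f = g = 0 via the resultant: the x-coordinates
# of the common zeros are among the zeros of resultant(f, g, y).
from sympy import symbols, resultant, solve

x, y = symbols('x y')
f = y**2 - x       # "there exists y with y**2 = x ..."
g = y - 2          # "... and y = 2"

r = resultant(f, g, y)
print(r)           # 4 - x: a quantifier-free condition in x alone
print(solve(r, x)) # [4]: the projection of the locus onto the x-axis
```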

On the other hand, many important structures do not have quantifier elimination; typically, the projection of a quantifier-free definable set is not, in general, quantifier-free definable. This failure of the projection property also shows up in many contexts outside of model theory; for instance, Lebesgue famously made the error of thinking that the projection of a Borel measurable set remained Borel measurable (it is merely an analytic set instead). Turing’s halting theorem can be viewed as an assertion that the projection of a decidable set (also known as a computable or recursive set) is not necessarily decidable (it is merely semi-decidable (or recursively enumerable) instead). The notorious P=NP problem can also be essentially viewed in this spirit; roughly speaking (and glossing over the placement of some quantifiers), it asks whether the projection of a polynomial-time decidable set is again polynomial-time decidable. And so forth. (See this blog post of Dick Lipton for further discussion of the subtleties of projections.)

Now we consider the status of quantifier elimination for the theory of a finite field {F}. If interpreted naively, quantifier elimination is trivial for a finite field {F}, since every subset of {F^n} is finite and thus quantifier-free definable. However, we can recover an interesting question in one of two (essentially equivalent) ways. One is to work in the asymptotic regime in which the field {F} is large, but the length of the formulae used to construct one’s definable sets stays bounded uniformly in the size of {F} (where we view any constant in {F} as contributing a unit amount to the length of a formula, no matter how large {F} is). A simple counting argument then shows that only a small number of subsets of {F^n} become definable in the asymptotic limit {|F| \rightarrow \infty}, since the number of definable sets clearly grows at most polynomially in {|F|} for any fixed bound on the formula length, while the number of all subsets of {F^n} grows exponentially in {|F|^n}.

Another way to proceed is to work not with a single finite field {F}, or even with a sequence {F_m} of finite fields, but with the ultraproduct {F = \prod_{m \rightarrow p} F_m} of a sequence of finite fields, and to study the properties of definable sets over this ultraproduct. (We will be using the notation of ultraproducts and nonstandard analysis from this previous blog post.) This approach is equivalent to the more finitary approach mentioned in the previous paragraph, at least if one does not care to keep track of the exact bounds on the length of the formulae involved. Indeed, thanks to Los’s theorem, a definable subset {E} of {F^n} is nothing more than the ultraproduct {E = \prod_{m \rightarrow p} E_m} of definable subsets {E_m} of {F_m^n} for all {m} sufficiently close to {p}, with the length of the formulae used to define {E_m} uniformly bounded in {m}. In the language of nonstandard analysis, one can view {F} as a nonstandard finite field.

The ultraproduct {F} of finite fields is an important example of a pseudo-finite field – a field that obeys all the sentences in the language of fields that all finite fields do, but is not necessarily itself a finite field. The model theory of pseudo-finite fields was first studied systematically by Ax (in the same paper where the Ax-Grothendieck theorem, discussed previously on this blog, was established), with important further contributions by Kiefe, by Fried-Sacerdote, by two papers of Chatzidakis-van den Dries-Macintyre, and many other authors.

As mentioned before, quantifier elimination trivially holds for finite fields. But for infinite pseudo-finite fields, such as the ultraproduct {F = \prod_{m \rightarrow p} F_m} of finite fields with {|F_m|} going to infinity, quantifier elimination fails. For instance, in a finite field {F_m}, the set {E_m := \{ x \in F_m: \exists y \in F_m: x=y^2 \}} of quadratic residues is a definable set, with a bounded formula length, and so in the ultraproduct {F =\prod_{m \rightarrow p} F_m}, the set {E := \prod_{m\rightarrow p} E_m} of nonstandard quadratic residues is also a definable set. However, in one dimension, we see from the factor theorem that the only atomic definable sets are either finite or the whole field {F}, and so the only constructible sets (i.e. the only quantifier-free definable sets) are either finite or cofinite in {F}. Since the quadratic residues have asymptotic density {1/2} in a large finite field, they cannot form a quantifier-free definable set, despite being definable.
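
A quick numerical check of the density claim (a small Python sketch added here, for a few arbitrarily chosen primes):

```python
# The quadratic residues occupy roughly half of F_p, so asymptotically the
# definable set {x : exists y with x = y**2} is neither finite nor cofinite.
for p in (101, 1009, 10007):
    residues = {(y * y) % p for y in range(p)}
    print(p, len(residues) / p)   # density tends to about 1/2 as p grows
```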

Nevertheless, there is a very nice almost quantifier elimination result for these fields, in characteristic zero at least, which we phrase here as follows:

Theorem 1 (Almost quantifier elimination) Let {F} be a nonstandard finite field of characteristic zero, and let {E \subset F^n} be a definable set over {F}. Then {E} is the union of finitely many sets of the form

\displaystyle  \{ x \in V(F): \exists t \in F: P(x,t) = 0 \} \ \ \ \ \ (2)

where {V(F)} is an atomic definable subset of {F^n} (i.e. the {F}-points of an algebraic variety {V} defined over {F} in {F^n}) and {P: F^{n+1} \rightarrow F} is a polynomial.

Results of this type were first obtained by Catarina Kiefe, although the formulation here is closer to that of Chatzidakis-van den Dries-Macintyre.

Informally, this theorem says that while we cannot quite eliminate all quantifiers from a definable set over a nonstandard finite field, we can eliminate all but one existential quantifier. Note that negation has also been eliminated in this theorem; for instance, the definable set {F \backslash \{0\} = \{ x \in F: \neg(x=0) \}} uses a negation, but can also be described using a single existential quantifier as {\{ x \in F: \exists t: xt = 1 \}}. I believe that there are more complicated analogues of this result in positive characteristic, but I have not studied this case in detail (Kiefe’s result does not assume characteristic zero, but her conclusion is slightly different from the one given here). In the one-dimensional case {n=1}, the only varieties {V} are the affine line and finite sets, and the above statement simplifies: any definable subset of {F} takes the form {\{ x\in F: \exists t \in F: P(x,t) = 0 \}} for some polynomial {P} (i.e. definable sets in {F} are nothing more than the projections of the {F}-points of a plane curve).

There is an equivalent formulation of this theorem for standard finite fields, namely that if {F} is a finite field and {E \subset F^n} is definable using a formula of length at most {M}, then {E} can be expressed in the form (2) with the degree of {P} bounded by some quantity {C_{M,n}} depending on {M} and {n}, assuming that the characteristic of {F} is sufficiently large depending on {M, n}.

The theorem gives quite a satisfactory description of definable sets in either standard or nonstandard finite fields (at least if one does not care about effective bounds in some of the constants, and if one is willing to exclude the small characteristic case); for instance, in conjunction with the Lang-Weil bound discussed in this recent blog post, it shows that any non-empty definable subset of {F^n} over a nonstandard finite field {F} has a nonstandard cardinality of {(\alpha + O(|F|^{-1/2})) |F|^d} for some positive standard rational {\alpha} and integer {d}. Equivalently, any non-empty definable subset of {F^n} for some standard finite field {F} using a formula of length at most {M} has a standard cardinality of {(\alpha + O_{M,n}(|F|^{-1/2})) |F|^d} for some positive rational {\alpha} of height {O_{M,n}(1)} and some natural number {d} between {0} and {n}. (For instance, in the example of the quadratic residues given above, {d} is equal to {1} and {\alpha} equal to {1/2}.) There is a more precise statement to this effect, namely that the Poincaré series of a definable set is rational; see Kiefe’s paper for details.
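
As a numerical illustration of this point-counting phenomenon in its simplest nontrivial case (a sketch with an arbitrarily chosen curve, added here; the relevant estimate for this particular curve is the classical Hasse bound, a special case of Lang-Weil), one can count points on an affine cubic over a few prime fields:

```python
# Count points on the affine curve y**2 = x**3 + x + 1 over F_p; the count is
# p + O(p**0.5), i.e. alpha = 1 and d = 1 in the notation above.
def affine_points(p):
    squares = {(y * y) % p for y in range(p)}
    count = 0
    for x in range(p):
        f = (x**3 + x + 1) % p
        if f == 0:
            count += 1          # one solution, y = 0
        elif f in squares:
            count += 2          # two square roots
    return count

for p in (101, 1009, 10007, 100003):
    N = affine_points(p)
    print(p, N, abs(N - p) / p**0.5)   # last column stays bounded (by 2)
```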

Below the fold I give a proof of Theorem 1, which relies primarily on the Lang-Weil bound mentioned above.


[Note: the idea for this post originated before the recent preprint of Mochizuki on the abc conjecture was released, and is not intended as a commentary on that work, which offers a much more non-trivial perspective on scheme theory. -T.]

In classical algebraic geometry, the central object of study is an algebraic variety {V} over a field {k} (and the theory works best when this field {k} is algebraically closed). One can talk about either affine or projective varieties; for sake of discussion, let us restrict attention to affine varieties. Such varieties can be viewed in at least four different ways:

  • (Algebraic geometry) One can view a variety through the set {V(k)} of points (over {k}) in that variety.
  • (Commutative algebra) One can view a variety through the field of rational functions {k(V)} on that variety, or the subring {k[V]} of polynomial functions in that field.
  • (Dual algebraic geometry) One can view a variety through a collection of polynomials {P_1,\ldots,P_m} that cut out that variety.
  • (Dual commutative algebra) One can view a variety through the ideal {I(V)} of polynomials that vanish on that variety.

For instance, the unit circle over the reals can be thought of in each of these four different ways:

  • (Algebraic geometry) The set of points {\{ (x,y) \in {\bf R}^2: x^2+y^2 = 1 \}}.
  • (Commutative algebra) The quotient {{\bf R}[x,y] / (x^2+y^2-1)} of the polynomial ring {{\bf R}[x,y]} by the ideal generated by {x^2+y^2-1} (or equivalently, the algebra generated by {x,y} subject to the constraint {x^2+y^2=1}), or the fraction field of that quotient.
  • (Dual algebraic geometry) The polynomial {x^2+y^2-1}.
  • (Dual commutative algebra) The ideal {(x^2+y^2-1)} generated by {x^2+y^2-1}.

The four viewpoints are almost equivalent to each other (particularly if the underlying field {k} is algebraically closed), as there are obvious ways to pass from one viewpoint to another. For instance, starting with the set of points on a variety, one can form the space of rational functions on that variety, or the ideal of polynomials that vanish on that variety. Given a set of polynomials, one can cut out their zero locus, or form the ideal that they generate. Given an ideal in a polynomial ring, one can quotient out the ring by the ideal and then form the fraction field. Finally, given the ring of polynomials on a variety, one can form its spectrum (the space of prime ideals in the ring) to recover the set of points on that variety (together with the Zariski topology on that variety).

Because of the connections between these viewpoints, there are extensive “dictionaries” (most notably the ideal-variety dictionary) that convert basic concepts in one of these four perspectives into any of the other three. For instance, passing from a variety to a subvariety shrinks the set of points and the function field, but enlarges the set of polynomials needed to cut out the variety, as well as the associated ideal. Taking the intersection or union of two varieties corresponds to adding or multiplying together the two ideals respectively. The dimension of an (irreducible) algebraic variety can be defined as the transcendence degree of the function field, the maximal length of chains of subvarieties, or the Krull dimension of the ring of polynomials. And so on and so forth. Thanks to these dictionaries, it is now commonplace to think of commutative algebras geometrically, or conversely to approach algebraic geometry from the perspective of abstract algebra. There are however some very well known defects to these dictionaries, at least when viewed in the classical setting of algebraic varieties. The main one is that two different ideals (or two inequivalent sets of polynomials) can cut out the same set of points, particularly if the underlying field {k} is not algebraically closed. For instance, if the underlying field is the real line {{\bf R}}, then the polynomial equations {x^2+1=0} and {1=0} cut out the same set of points, namely the empty set, but the ideal generated by {x^2+1} in {{\bf R}[x]} is certainly different from the ideal generated by {1}. This particular example does not work in an algebraically closed field such as {{\bf C}}, but in that case the polynomial equations {x^2=0} and {x=0} also cut out the same set of points (namely the origin), but again {x^2} and {x} generate different ideals in {{\bf C}[x]}. Thanks to Hilbert’s nullstellensatz, we can get around this problem (in the case when {k} is algebraically closed) by always passing from an ideal to its radical, but this causes many aspects of the theory of algebraic varieties to become more complicated when the varieties involved develop singularities or multiplicities, as can already be seen with the simple example of Bezout’s theorem.
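
A tiny sympy sketch of the defect just described (checked symbolically, purely as an illustration): the polynomials {x} and {x^2} have the same zero locus over {{\bf C}}, but {x} does not lie in the ideal {(x^2)}.

```python
from sympy import symbols, solve, reduced

x = symbols('x')
print(solve(x, x), solve(x**2, x))         # both zero loci are just {0}

quotients, remainder = reduced(x, [x**2])  # divide x by the generator x**2
print(remainder)                           # x, nonzero: x is not in (x**2)
```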

Nowadays, the standard way to deal with these issues is to replace the notion of an algebraic variety with the more general notion of a scheme. Roughly speaking, the way schemes are defined is to focus on the commutative algebra perspective as the primary one, and to allow the base field {k} to be not algebraically closed, or even to just be a commutative ring instead of a field. (One could even consider non-commutative rings, leading to non-commutative geometry, but we will not discuss this extension of scheme theory further here.) Once one generalises to these more abstract rings, the notion of a rational function becomes more complicated (one has to work locally instead of globally, cutting out the points where the function becomes singular), but as a first approximation one can think of a scheme as basically being the same concept as a commutative ring. (In actuality, due to the need to localise, a scheme is defined as a sheaf of rings rather than a single ring, but these technicalities will not be important for the purposes of this discussion.) All the other concepts from algebraic geometry that might previously have been defined using one of the other three perspectives, are then redefined in terms of this ring (or sheaf of rings) in order to generalise them to schemes.

Thus, for instance, in scheme theory the rings {{\bf R}[x]/(x^2)} and {{\bf R}[x]/(x)} describe different schemes; from the classical perspective, they cut out the same locus, namely the point {\{0\}}, but the former scheme makes this point “fatter” than the latter scheme, giving it a degree (or multiplicity) of {2} rather than {1}.

Because of this, it seems that the link between the commutative algebra perspective and the algebraic geometry perspective is still not quite perfect in scheme theory, unless one is willing to start “fattening” various varieties to correctly model multiplicity or singularity. But – and this is the trivial remark I wanted to make in this blog post – one can recover a tight connection between the two perspectives as long as one allows the freedom to arbitrarily extend the underlying base ring.

Here’s what I mean by this. Consider classical algebraic geometry over some commutative ring {R} (not necessarily a field). Any set of polynomials {P_1,\ldots,P_m \in R[x_1,\ldots,x_d]} in {d} indeterminate variables {x_1,\ldots,x_d} with coefficients in {R} determines, on the one hand, an ideal

\displaystyle I := (P_1,\ldots,P_m)

\displaystyle  = \{P_1Q_1+\ldots+P_mQ_m: Q_1,\ldots,Q_m \in R[x_1,\ldots,x_d]\}

in {R[x_1,\ldots,x_d]}, and also cuts out a zero locus

\displaystyle  V[R] := \{ (y_1,\ldots,y_d) \in R^d: P_1(y_1,\ldots,y_d) = \ldots = P_m(y_1,\ldots,y_d) = 0 \},

since each of the polynomials {P_1,\ldots,P_m} clearly make sense as maps from {R^d} to {R}. Of course, one can also write {V[R]} in terms of {I}:

\displaystyle  V[R] := \{ (y_1,\ldots,y_d) \in R^d: P(y_1,\ldots,y_d) = 0 \hbox{ for all } P \in I \}.

Thus the ideal {I} uniquely determines the zero locus {V[R]}, and we will emphasise this by writing {V[R]} as {V_I[R]}. As the previous counterexamples illustrate, the converse is not true. However, whenever we have any extension {R'} of the ring {R} (i.e. a commutative ring {R'} that contains {R} as a subring), then we can also view the polynomials {P_1,\ldots,P_m} as maps from {(R')^d} to {R'}, and so one can also define the zero locus for all the extensions:

\displaystyle  V[R'] := \{ (y_1,\ldots,y_d) \in (R')^d:

\displaystyle  P_1(y_1,\ldots,y_d) = \ldots = P_m(y_1,\ldots,y_d) = 0 \}.

As before, {V[R']} is determined by the ideal {I}:

\displaystyle  V[R'] = V_I[R'] := \{ (y_1,\ldots,y_d) \in (R')^d:

\displaystyle  P(y_1,\ldots,y_d) = 0 \hbox{ for all } P \in I \}.

The trivial remark is then that while a single zero locus {V_I[R]} is insufficient to recover {I}, the collection of zero loci {V_I[R']} for all extensions {R'} of {R} (or more precisely, the assignment map {R' \mapsto V_I[R']}, known as the functor of points of {V_I}) is sufficient to recover {I}, as long as at least one zero locus, say {V_I[R_0]}, is non-empty. Indeed, suppose we have two ideals {I, I'} of {R[x_1,\ldots,x_d]} that cut out the same non-empty zero locus for all extensions {R'} of {R}, thus

\displaystyle  V_I[R'] = V_{I'}[R'] \neq \emptyset

for all extensions {R'} of {R}. We apply this with the extension {R'} of {R} given by {R' := R_0[x_1,\ldots,x_d]/I}. Note that the embedding of {R} in {R_0[x_1,\ldots,x_d]/I} is injective, since otherwise {I} would cut out the empty set as the zero locus over {R_0}, and so {R'} is indeed an extension of {R}. Tautologically, the point {(x_1 \hbox{ mod } I, \ldots, x_d \hbox{ mod } I)} lies in {V_I[R']}, and thus necessarily lies in {V_{I'}[R']} as well. Unpacking what this means, we conclude that {P \in I} whenever {P \in I'}, that is to say that {I' \subset I}. By a symmetric argument, we also have {I \subset I'}, and thus {I=I'} as claimed. (As pointed out in comments, this fact (and its proof) is essentially a special case of the Yoneda lemma. The connection is tighter if one allows {R'} to be any ring with a (not necessarily injective) map from {R} into it, rather than an extension of {R}, in which case one can also drop the hypothesis that {V_I[R_0]} is non-empty for at least one {R_0}. For instance, {V_{(2)}[R'] = V_{(3)}[R'] = \emptyset} for every extension {R'} of the integers, but if one also allows quotients such as {{\bf Z}/(2)} or {{\bf Z}/(3)} instead, then {V_{(2)}[R']} and {V_{(3)}[R']} are no longer necessarily equal.)

Thus, as long as one thinks of a variety or scheme as cutting out points not just in the original base ring or field, but in all extensions of that base ring or field, one recovers an exact correspondence between the algebraic geometry perspective and the commutative algebra perspective. This is similar to the classical algebraic geometry position of viewing an algebraic variety as being defined simultaneously over all fields that contain the coefficients of the defining polynomials, but the crucial difference between scheme theory and classical algebraic geometry is that one also allows definition over commutative rings, and not just fields. In particular, one needs to allow extensions to rings that may contain nilpotent elements, otherwise one cannot distinguish an ideal from its radical.

There are of course many ways to extend a field into a ring, but as an analyst, one way to do so that appeals particularly to me is to introduce an epsilon parameter and work modulo errors of {O(\varepsilon)}. To formalise this algebraically, let’s say for sake of concreteness that the base field is the real line {{\bf R}}. Consider the ring {\tilde R} of real-valued quantities {x = x_\varepsilon} that depend on a parameter {\varepsilon \geq 0} (i.e. functions from {{\bf R}^+} to {{\bf R}}), which are locally bounded in the sense that {x} is bounded whenever {\varepsilon} is bounded. (One can, if one wishes, impose some further continuity or smoothness hypotheses on how {x} depends on {\varepsilon}, but this turns out not to be relevant for the following discussion. Algebraists often prefer to use the ring of Puiseux series here in place of {\tilde R}, and a nonstandard analyst might instead use the hyperreals, but again this will not make too much difference for our purposes.) Inside this commutative ring, we can form the ideal {I} of quantities {x = x_\varepsilon} that are of size {O(\varepsilon)} as {\varepsilon \rightarrow 0}, i.e. there exists a quantity {C>0} independent of {\varepsilon} such that {|x| \leq C\varepsilon} for all sufficiently small {\varepsilon}. This can easily be seen to indeed be an ideal in {\tilde R}. We then form the quotient ring {R' := \tilde R/I := \{ x \hbox{ mod } I: x \in \tilde R \}}. Note that {x = y \hbox{ mod } I} is equivalent to the assertion that {x = y + O(\varepsilon)}, so we are encoding the analyst’s notion of “equal up to errors of {O(\varepsilon)}” into algebraic terms.
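
One can model this construction very crudely in a computer algebra system; the following sympy sketch (an informal illustration only, representing elements of {\tilde R} by closed-form expressions in {\varepsilon} and testing membership in {I} by a limit) captures the basic calculus of {O(\varepsilon)}:

```python
# Elements of the ring \tilde R are modelled as expressions in eps > 0; an
# expression lies in the ideal I when expr / eps stays bounded as eps -> 0+.
from sympy import symbols, limit, oo, sqrt

eps = symbols('epsilon', positive=True)

def is_O_eps(expr):
    """True if expr = O(eps) as eps -> 0+ (for expressions having a limit)."""
    return abs(limit(expr / eps, eps, 0, '+')) != oo

print(is_O_eps(3 * eps))         # True:  3*eps lies in I
print(is_O_eps(eps**2))          # True
print(is_O_eps(sqrt(eps)))       # False: eps**(1/2) is not O(eps)
print(is_O_eps(sqrt(eps)**2))    # True:  its square is, so y = sqrt(eps)
                                 # lies in V_(x^2)[R'] but not in V_(x)[R']
```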

Clearly, {R'} is a commutative ring extending {{\bf R}}. Hence, any algebraic variety

\displaystyle  V[{\bf R}] = \{ (y_1,\ldots,y_d) \in {\bf R}^d: P_1(y_1,\ldots,y_d) = \ldots = P_m(y_1,\ldots,y_d) = 0 \}

defined over the reals {{\bf R}} (so the polynomials {P_1,\ldots,P_m} have coefficients in {{\bf R}}), also is defined over {R'}:

\displaystyle  V[R'] = \{ (y_1,\ldots,y_d) \in (R')^d:

\displaystyle  P_1(y_1,\ldots,y_d) = \ldots = P_m(y_1,\ldots,y_d) = 0 \}.

In language that more closely resembles analysis, we have

\displaystyle  V[R'] = \{ (y_1,\ldots,y_d) \in \tilde R^d: P_1(y_1,\ldots,y_d), \ldots, P_m(y_1,\ldots,y_d) = O(\varepsilon) \}

\displaystyle  \hbox{ mod } I^d.

Thus we see that {V[R']} is in some sense an “{\varepsilon}-thickening” of {V[{\bf R}]}, and is thus one way to give rigorous meaning to the intuition that schemes can “thicken” varieties. For instance, the scheme associated to the ideal {(x)}, when interpreted over {R'}, becomes an {O(\varepsilon)} neighbourhood of the origin

\displaystyle  V_{(x)}[R'] = \{ y \in \tilde R: y = O(\varepsilon) \} \hbox{ mod } I,

but the scheme associated to the smaller ideal {(x^2)}, when interpreted over {R'}, becomes an {O(\varepsilon^{1/2})}-neighbourhood of the origin, thus being a much “fatter” point:

\displaystyle  V_{(x^2)}[R'] = \{ y \in \tilde R: y^2 = O(\varepsilon) \} \hbox{ mod } I

\displaystyle  = \{ y \in \tilde R: y = O(\varepsilon^{1/2} ) \} \hbox{ mod } I.

Once one introduces the analyst’s epsilon, one can see quite clearly that {V_{(x^2)}[R']} is coming from a larger scheme than {V_{(x)}[R']}, with fewer polynomials vanishing on it; in particular, the polynomial {x} vanishes to order {O(\varepsilon)} on {V_{(x)}[R']} but does not vanish to order {O(\varepsilon)} on {V_{(x^2)}[R']}.

By working with this analyst’s extension of {{\bf R}}, one can already get a reasonably good first approximation of what schemes over {{\bf R}} look like, which I found particularly helpful for getting some intuition on these objects. However, since this is only one extension of {{\bf R}}, and not a “universal” such extension, it cannot quite distinguish any two schemes from each other, although it does a better job of this than classical algebraic geometry. For instance, consider the scheme cut out by the polynomials {x^2, y^2} in two dimensions. Over {R'}, this becomes

\displaystyle  V_{(x^2,y^2)}[R'] = \{ (x,y) \in \tilde R^2: x^2, y^2 = O(\varepsilon) \} \hbox{ mod } I^2

\displaystyle  = \{ (x,y) \in \tilde R^2: x, y = O(\varepsilon^{1/2}) \} \hbox{ mod } I^2.

Note that the polynomial {xy} vanishes to order {O(\varepsilon)} on this locus, but {xy} fails to lie in the ideal {(x^2,y^2)}. Equivalently, we have {V_{(x^2,y^2)}[R'] = V_{(x^2,y^2,xy)}[R']}, despite {(x^2,y^2)} and {(x^2,y^2,xy)} being distinct ideals. Basically, the analogue of the nullstellensatz for {R'} does not completely remove the need for performing a closure operation on the ideal {I}; it is less severe than taking the radical, but is instead more like taking a “convex hull”, in that one needs to be able to “interpolate” between two polynomials in the ideal (such as {x^2} and {y^2}) to arrive at intermediate polynomials (such as {xy}) that one then places in the ideal.
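
Continuing the sympy sketch from earlier (again purely as an illustration), one can confirm both halves of this assertion: {xy} is {O(\varepsilon)} at the point {(\sqrt{\varepsilon},\sqrt{\varepsilon})}, yet {xy} leaves a nonzero remainder on division by {x^2,y^2} and so does not lie in the ideal {(x^2,y^2)}.

```python
from sympy import symbols, sqrt, limit, reduced

eps = symbols('epsilon', positive=True)
x, y = symbols('x y')

pt = {x: sqrt(eps), y: sqrt(eps)}
print(limit((x * y).subs(pt) / eps, eps, 0, '+'))   # 1, so xy = O(eps) here

# {x**2, y**2} is already a Groebner basis, so a nonzero remainder certifies
# that xy is not a polynomial combination of x**2 and y**2.
_, remainder = reduced(x * y, [x**2, y**2], x, y)
print(remainder)                                    # x*y
```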

One can also view ideals (and hence, schemes), from a model-theoretic perspective. Let {I} be an ideal of a polynomial ring {R[x_1,\ldots,x_d]} generated by some polynomials {P_1,\ldots,P_m \in R[x_1,\ldots,x_d]}. Then, clearly, if {Q} is another polynomial in the ideal {I}, then we can use the axioms of commutative algebra (which are basically the axioms of high school algebra) to obtain the syntactic deduction

\displaystyle  P_1(x_1,\ldots,x_d) = \ldots = P_m(x_1,\ldots,x_d) = 0 \vdash Q(x_1,\ldots,x_d) = 0

(since {Q} is just a sum of multiples of {P_1,\ldots,P_m}). In particular, we have the semantic deduction

\displaystyle  P_1(y_1,\ldots,y_d) = \ldots = P_m(y_1,\ldots,y_d) = 0 \implies Q(y_1,\ldots,y_d) = 0 \ \ \ \ \ (1)

for any assignment of indeterminates {y_1,\ldots,y_d} in {R} (or in any extension {R'} of {R}). If we restrict {y_1,\ldots,y_d} to lie in {R} only, then (even if {R} is an algebraically closed field), the converse of the above statement is false; there can exist polynomials {Q} outside of {I} for which (1) holds for all assignments {y_1,\ldots,y_d} in {R}. For instance, we have

\displaystyle  y^2 = 0 \implies y = 0

for all {y} in an algebraically closed field, despite {x} not lying in the ideal {(x^2)}. Of course, the nullstellensatz again explains what is going on here; (1) holds whenever {Q} lies in the radical of {I}, which can be larger than {I} itself. But if one allows the indeterminates {y_1,\ldots,y_d} to take values in arbitrary extensions {R'} of {R}, then the truth of the converse is restored, thus giving a “completeness theorem” relating the syntactic deductions of commutative algebra to the semantic interpretations of such algebras over the extensions {R'}. For instance, since

\displaystyle  y^2 = O(\varepsilon) \not \implies y = O(\varepsilon)

we no longer have a counterexample to the converse coming from {x} and {(x^2)} once we work in {R'} instead of {{\bf R}}. On the other hand, we still have

\displaystyle  x^2, y^2 = O(\varepsilon) \implies xy = O(\varepsilon)

so the extension {R'} is not powerful enough to detect that {xy} does not actually lie in {(x^2,y^2)}; a larger ring (which is less easy to assign an analytic interpretation to) is needed to achieve this.
