I had occasion recently to look up the proof of Hilbert’s nullstellensatz, which I haven’t studied since cramming for my algebra qualifying exam as a graduate student. I was a little unsatisfied with the proofs I was able to locate – they were fairly abstract and used a certain amount of algebraic machinery, which I was terribly rusty on – so, as an exercise, I tried to find a more computational proof that avoided as much abstract machinery as possible. I found a proof which used only the extended Euclidean algorithm and high school algebra, together with an induction on dimension and the obvious observation that any non-zero polynomial of one variable on an algebraically closed field has at least one non-root. It probably isn’t new (in particular, it might be related to the standard model-theoretic proof of the nullstellensatz, with the Euclidean algorithm and high school algebra taking the place of quantifier elimination), but I thought I’d share it here anyway.
Throughout this post, F is going to be a fixed algebraically closed field (e.g. the complex numbers $\mathbb{C}$). I’d like to phrase the nullstellensatz in a fairly concrete fashion, in terms of the problem of solving a set of simultaneous polynomial equations $P_1(x) = \ldots = P_m(x) = 0$ in several variables $x = (x_1,\ldots,x_d) \in F^d$ over F, thus $P_1,\ldots,P_m \in F[x]$ are polynomials in d variables. One obvious obstruction to solvability of this system is if the equations one is trying to solve are inconsistent in the sense that they can be used to imply 1=0. In particular, if one can find polynomials $Q_1,\ldots,Q_m \in F[x]$ such that $P_1 Q_1 + \ldots + P_m Q_m = 1$, then clearly one cannot solve $P_1(x) = \ldots = P_m(x) = 0$. The weak nullstellensatz asserts that this is, in fact, the only obstruction:
Weak nullstellensatz. Let $P_1,\ldots,P_m \in F[x]$ be polynomials. Then exactly one of the following statements holds:
- The system of equations $P_1(x) = \ldots = P_m(x) = 0$ has a solution $x \in F^d$.
- There exist polynomials $Q_1,\ldots,Q_m \in F[x]$ such that $P_1 Q_1 + \ldots + P_m Q_m = 1$.
Note that the hypothesis that F is algebraically closed is crucial; for instance, if F is the reals, then the equation $x^2+1=0$ has no solution, but there is no polynomial $Q(x)$ such that $(x^2+1) Q(x) = 1$.
Like many results of the “The only obstructions are the obvious obstructions” type, the power of the nullstellensatz lies in the ability to take a hypothesis about non-existence (in this case, non-existence of solutions to $P_1(x) = \ldots = P_m(x) = 0$) and deduce a conclusion about existence (in this case, existence of $Q_1,\ldots,Q_m$ such that $P_1 Q_1 + \ldots + P_m Q_m = 1$). The ability to get “something from nothing” is clearly going to be both non-trivial and useful. In particular, the nullstellensatz offers an important correspondence between algebraic geometry (the negation of conclusion (1) is an assertion that a certain algebraic variety is empty) and commutative algebra (conclusion (2) is an assertion that a certain ideal is non-proper).
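Now suppose one is trying to solve the more complicated system $P_1(x) = \ldots = P_m(x) = 0; R(x) \neq 0$ for some polynomials $P_1,\ldots,P_m,R \in F[x]$. Again, any identity of the form $P_1 Q_1 + \ldots + P_m Q_m = 1$ will be an obstruction to solvability, but now more obstructions are possible: any identity of the form $P_1 Q_1 + \ldots + P_m Q_m = R^r$ for some non-negative integer r will also obstruct solvability. The strong nullstellensatz asserts that this is the only obstruction:
Strong nullstellensatz. Let $P_1,\ldots,P_m,R \in F[x]$ be polynomials. Then exactly one of the following statements holds:
- The system of equations $P_1(x) = \ldots = P_m(x) = 0$, $R(x) \neq 0$ has a solution $x \in F^d$.
- There exist polynomials $Q_1,\ldots,Q_m \in F[x]$ and a non-negative integer r such that $P_1 Q_1 + \ldots + P_m Q_m = R^r$.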
Of course, the weak nullstellensatz corresponds to the special case in which R=1. The strong nullstellensatz is usually phrased instead in terms of ideals and radicals, but the above formulation is easily shown to be equivalent to the usual version (modulo Hilbert’s basis theorem).
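A tiny worked instance may help show why the exponent r cannot be dropped. The choice $P_1 = x^2$, $R = x$ (with d = 1, m = 1) is an illustrative assumption of mine, not an example from the argument that follows; the system is $P_1(x) = 0$, $R(x) \neq 0$ and the obstruction is an identity $P_1 Q_1 = R^r$.

```latex
% Illustrative choice: d = 1, m = 1, P_1 = x^2, R = x.
% (1) fails: x^2 = 0 forces x = 0, which violates R(x) \neq 0.
% (2) holds with Q_1 = 1 and r = 2:
P_1 Q_1 = x^2 \cdot 1 = R^2 .
% No smaller exponent works: x^2 Q_1 = x^r with r \in \{0,1\}
% is impossible, since the left side is zero or has degree at least 2.
```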
One could consider generalising the nullstellensatz a little further by considering systems of the form $P_1(x) = \ldots = P_m(x) = 0; R_1(x), \ldots, R_n(x) \neq 0$, but this is not a significant generalisation, since all the inequations $R_1(x) \neq 0, \ldots, R_n(x) \neq 0$ can be concatenated into a single inequation $R_1(x) \ldots R_n(x) \neq 0$. The presence of the exponent r in conclusion (2) is a little annoying; to get rid of it, one needs to generalise the notion of an algebraic variety to that of a scheme (which is worth doing for several other reasons too, in particular one can now work over much more general objects than just algebraically closed fields), but that is a whole story in itself (and one that I am not really qualified to tell).
[Update, Nov 26: It turns out that my approach is more complicated than I first thought, and so I had to revise the proof quite a bit to fix a certain gap, in particular making it significantly messier than my first version. On the plus side, I was able to at least eliminate any appeal to Hilbert’s basis theorem, so in particular the proof is now manifestly effective (but with terrible bounds). In any case, I am keeping the argument here in case it has some interest.]
– The base case –
I had initially posted here an attempted proof of the weak nullstellensatz by induction on dimension, and then deduced the strong nullstellensatz from the weak via a lifting trick of Zariski (the key observation being that the inequation $R(x) \neq 0$ was equivalent to the solvability of the equation $R(x) y - 1 = 0$ for some y). But I realised that my proof of the weak nullstellensatz was incorrect (it required the strong nullstellensatz in one lower dimension as the inductive hypothesis), and so now I am proceeding by establishing the strong nullstellensatz directly.
We shall induct on the dimension d (i.e. the number of variables in the system of equations).
The case d=0 is trivial, so we use the d=1 case as the base case, thus $P_1,\ldots,P_m,R \in F[x]$ are all polynomials of one variable. This case follows easily from the fundamental theorem of algebra, but it will be important for later purposes to instead use a more algorithmic proof in which the coefficients of the polynomials $Q_1,\ldots,Q_m$ required for (2) are obtained from the coefficients of $P_1,\ldots,P_m,R$ in an explicitly computable fashion (using only the operations of addition, subtraction, multiplication, division, and branching on whether a given field element is zero or non-zero). In particular, one does not need to locate roots of polynomials in order to construct $Q_1,\ldots,Q_m$ (although one will of course need to do so to locate a solution x to (1)). [It is likely that one could get these sorts of computability properties on $Q_1,\ldots,Q_m$ “for free” from Galois theory, but I have not attempted to do so.] It is however instructive to secretly apply the fundamental theorem of algebra throughout the proof which follows, to clarify what is going on.
Let us say that a collection $(P_1,\ldots,P_m; R)$ of polynomials obeys the nullstellensatz if at least one of (1) and (2) is true. It is clear that (1) and (2) cannot both be true, so to prove the nullstellensatz it suffices to show that every collection $(P_1,\ldots,P_m; R)$ obeys the nullstellensatz.
We can of course throw away any of the $P_i$ that are identically zero, as this does not affect whether $(P_1,\ldots,P_m; R)$ obeys the nullstellensatz. If none of the $P_i$ remain and R is not identically zero, then we are in case (1), because the polynomial R has at most finitely many zeroes, and because an algebraically closed field must be infinite. (If none remain and R is identically zero, then (2) holds trivially with r=1, the empty sum being zero.) So suppose that we have some non-zero $P_1,\ldots,P_m$. We repeatedly use the extended Euclidean algorithm to locate the greatest common divisor D(x) of the remaining $P_i$. Note that this algorithm automatically supplies for us some polynomials $Q_1(x),\ldots,Q_m(x)$ such that
$$P_1(x) Q_1(x) + \ldots + P_m(x) Q_m(x) = D(x).$$
Because of this, we see that $(P_1,\ldots,P_m; R)$ obeys the nullstellensatz if and only if $(D; R)$ obeys the nullstellensatz. So we have effectively reduced to the case m=1.
Now we apply the extended Euclidean algorithm again, this time to D and R, to express the gcd D’ of D and R as a combination D’ = D A + R B, and also to factor D = D’ S and R = D’ T for some polynomials A, B, S, T with AS + BT = 1. A little algebra then shows that one has a solution to the problem
$$D(x) = 0; \quad R(x) \neq 0$$
whenever one has a solution to the problem
$$S(x) = 0; \quad D'(x) \neq 0.$$
Also, if some power of D’ is a multiple of S, then some power of R is a multiple of D. Thus we see that if (S; D’) obeys the nullstellensatz, then (D; R) does also. But we see that the net degree of S and D’ is less than the net degree of D and R unless R is constant, so by infinite descent we may reduce to that case. If R is zero then we are clearly in case (2), so we may assume R is non-zero. If D is constant then we are again in case (2), so assume that D is non-constant. But then as the field is algebraically closed, D has at least one root, and so we are in case (1). This completes the proof of the d=1 case.
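The “little algebra” behind this reduction can be written out as follows. This is only a sketch of the two implications used, with D = D’S, R = D’T and AS + BT = 1 as in the text; the cofactor name Z is introduced here purely for illustration.

```latex
% Solutions transfer from (S; D') to (D; R):
S(x) = 0,\ D'(x) \neq 0
  \;\Longrightarrow\; D(x) = D'(x) S(x) = 0
  \text{ and } B(x) T(x) = 1 - A(x) S(x) = 1 ,
% so T(x) \neq 0 and hence R(x) = D'(x) T(x) \neq 0.
%
% Power relations transfer too: if (D')^r = S Z for some polynomial Z, then
R^{r+1} = (D')^{r+1} T^{r+1} = D' \cdot S Z \cdot T^{r+1} = D \cdot Z T^{r+1} .
```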
For the inductive step, it is important to remark that the above proof is algorithmic in the sense that a computer which was given the coefficients for $P_1,\ldots,P_m,R$ as inputs could apply a finite number of arithmetic operations (addition, subtraction, multiplication, division), as well as a finite number of branching operations based on whether a given variable was zero or non-zero, in order to output either
- the coefficients of a non-constant polynomial D with the property that any root x of D would solve (1);
- the coefficients of a non-zero polynomial R with the property that any non-root x of R would solve (1); or
- the coefficients of polynomials $Q_1,\ldots,Q_m$ which solved (2) for some specific r.
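The d=1 procedure can be sketched in code. The following is a minimal illustration, not the definitive algorithm: it works over the rationals (via `fractions.Fraction`) as a stand-in for F, since only field arithmetic and zero tests are needed, and all function names and output tags (`solve_1d`, `descend`, `'root'`, `'nonroot'`, `'certificate'`) are hypothetical. Polynomials are coefficient lists, lowest degree first, with the zero polynomial as the empty list.

```python
from fractions import Fraction

def trim(p):
    """Convert to Fractions and drop trailing zero coefficients."""
    p = [Fraction(c) for c in p]
    while p and p[-1] == 0:
        p.pop()
    return p

def add(p, q):
    n = max(len(p), len(q))
    return trim([(p[i] if i < len(p) else 0) + (q[i] if i < len(q) else 0)
                 for i in range(n)])

def neg(p):
    return [-c for c in p]

def mul(p, q):
    if not p or not q:
        return []
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return trim(out)

def divmod_poly(p, q):
    """Quotient and remainder of p by a non-zero polynomial q."""
    quo, rem = [], trim(p)
    while len(rem) >= len(q):
        term = [Fraction(0)] * (len(rem) - len(q)) + [rem[-1] / q[-1]]
        quo = add(quo, term)
        rem = add(rem, neg(mul(term, q)))
    return quo, rem

def ext_gcd(p, q):
    """Extended Euclid: returns (g, a, b) with a*p + b*q == g."""
    r0, a0, b0 = trim(p), [Fraction(1)], []
    r1, a1, b1 = trim(q), [], [Fraction(1)]
    while r1:
        k, r2 = divmod_poly(r0, r1)
        r0, a0, b0, r1, a1, b1 = (r1, a1, b1, r2,
                                  add(a0, neg(mul(k, a1))),
                                  add(b0, neg(mul(k, b1))))
    return r0, a0, b0

def descend(D, R):
    """For non-zero D: ('root', E) where roots of E solve D=0, R!=0,
    or ('power', r, Z) with R**r == D*Z."""
    if not R:
        return ('power', 1, [])                  # R = 0: R^1 = D*0
    if len(R) == 1:                              # R is a non-zero constant
        if len(D) == 1:
            return ('power', 0, [1 / D[0]])      # R^0 = 1 = D * (1/D)
        return ('root', D)
    g, _, _ = ext_gcd(D, R)                      # g plays the role of D'
    S, _ = divmod_poly(D, g)
    T, _ = divmod_poly(R, g)
    res = descend(S, g)                          # reduce (D; R) to (S; D')
    if res[0] == 'root':
        return res
    _, r, Z = res                                # g^r == S*Z
    for _ in range(r + 1):
        Z = mul(Z, T)                            # R^(r+1) == D * Z * T^(r+1)
    return ('power', r + 1, Z)

def solve_1d(Ps, R):
    Ps, R = [trim(p) for p in Ps], trim(R)
    D, Qs = [], []
    for P in Ps:                                 # running gcd with cofactors
        D, a, b = ext_gcd(D, P)
        Qs = [mul(a, Q) for Q in Qs] + [b]
    if not D:                                    # every P_i was zero
        return ('certificate', Qs, 1) if not R else ('nonroot', R)
    res = descend(D, R)
    if res[0] == 'root':
        return res
    _, r, Z = res
    return ('certificate', [mul(Q, Z) for Q in Qs], r)
```

For instance, `solve_1d([[-1, 0, 1], [2, -3, 1]], [-1, 1])` (that is, $x^2-1$, $x^2-3x+2$ with $R = x-1$) returns a certificate, since the only common root x=1 is also a root of R. Because the rationals are not algebraically closed, a `'root'` answer should be read as "any root in an algebraic closure works". The exponent r produced this way is not claimed to be minimal.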
In most cases, the number of branching operations is rather large (see for instance the example of solving two linear equations below). There is however one simple case in which only one branching is involved, namely when m=2, R=1, and $P_1, P_2$ are monic. In this case, we have an identity of the form
$$P_1 S_1 + P_2 S_2 = \hbox{Res}(P_1,P_2)$$
where $S_1, S_2$ are polynomials (whose coefficients are polynomial combinations of the coefficients of $P_1$ and $P_2$) and $\hbox{Res}(P_1,P_2) \in F$ is the resultant of $P_1$ and $P_2$, which is another polynomial combination of the coefficients of $P_1$ and $P_2$. If the resultant is non-zero then we have
$$\frac{1}{\hbox{Res}(P_1,P_2)} S_1 P_1 + \frac{1}{\hbox{Res}(P_1,P_2)} S_2 P_2 = 1$$
and so the system is unsolvable (we are in case (2)); otherwise, the system is solvable.
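The single branching in the monic case can be illustrated with a short computation. This sketch assumes the resultant is computed as the determinant of the Sylvester matrix (one standard definition, with one particular sign convention) and works over the rationals; the function names are hypothetical, and coefficient lists here run from the highest degree down.

```python
from fractions import Fraction

def sylvester(p, q):
    """Sylvester matrix of p and q (coefficient lists, highest degree first)."""
    n, m = len(p) - 1, len(q) - 1
    size = n + m
    rows = []
    for i in range(m):                       # m shifted copies of p
        rows.append([0] * i + list(p) + [0] * (size - n - 1 - i))
    for i in range(n):                       # n shifted copies of q
        rows.append([0] * i + list(q) + [0] * (size - m - 1 - i))
    return rows

def det(mat):
    """Determinant by Gaussian elimination over the rationals."""
    a = [[Fraction(x) for x in row] for row in mat]
    n, d = len(a), Fraction(1)
    for c in range(n):
        piv = next((r for r in range(c, n) if a[r][c] != 0), None)
        if piv is None:
            return Fraction(0)
        if piv != c:
            a[c], a[piv] = a[piv], a[c]
            d = -d                            # a row swap flips the sign
        d *= a[c][c]
        for r in range(c + 1, n):
            f = a[r][c] / a[c][c]
            a[r] = [x - f * y for x, y in zip(a[r], a[c])]
    return d

def resultant(p, q):
    return det(sylvester(p, q))

def monic_pair_solvable(p, q):
    """The one branch: P_1 = P_2 = 0 is solvable iff the resultant vanishes."""
    return resultant(p, q) == 0
```

With $P_1 = x^2 - 1$ one finds `monic_pair_solvable([1, 0, -1], [1, -1])` is `True` (common root x=1), while `monic_pair_solvable([1, 0, -1], [1, -2])` is `False`.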
– The inductive case –
Now we do the inductive case, when $d \geq 2$ and the claim has already been proven for d-1. The basic idea is to view the system (1) not as a system of equations in d unknowns, but as a d-1-dimensional family of systems in one unknown. We will then apply the d=1 theory to each system in that family and use the algorithmic nature of that theory to glue everything together properly.
We write the variable $x \in F^d$ as $x = (y,t)$ for $y \in F^{d-1}$ and $t \in F$. The ring F[x] of polynomials in d variables can thus be viewed as a ring F[y][t] of polynomials in one variable t, in which the coefficients lie in the ring F[y].
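To make the change of viewpoint concrete, here is a small sketch for d=2. The dictionary representation and the names `as_poly_in_t` and `specialise` are assumptions made for illustration only. It regroups a polynomial in (y,t) by powers of t, and then freezes y to obtain an ordinary one-variable polynomial in t.

```python
from fractions import Fraction

# Hypothetical representation: {(power of y, power of t): coefficient}.
# Example polynomial: y*t^2 + (y^2 + 1)*t + 3.
P = {(1, 2): 1, (2, 1): 1, (0, 1): 1, (0, 0): 3}

def as_poly_in_t(poly):
    """Entry j is the coefficient of t^j, itself a polynomial in y
    stored as {power of y: coefficient}."""
    deg_t = max(j for (_, j) in poly)
    coeffs = [dict() for _ in range(deg_t + 1)]
    for (i, j), c in poly.items():
        coeffs[j][i] = coeffs[j].get(i, 0) + c
    return coeffs

def specialise(poly, y0):
    """Fix y = y0; return the coefficients in t, lowest degree first."""
    y0 = Fraction(y0)
    return [sum(c * y0 ** i for i, c in cy.items())
            for cy in as_poly_in_t(poly)]
```

Here `specialise(P, 2)` gives the coefficients of $3 + 5t + 2t^2$, while `specialise(P, 0)` gives $3 + t + 0 \cdot t^2$: the leading coefficient vanishes at y=0, which is exactly the kind of event that a zero test in the one-variable algorithm has to branch on.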
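Let I be the ideal in F[x] generated by $P_1,\ldots,P_m$. We either need to solve the system
$$P_1(y,t) = \ldots = P_m(y,t) = 0; \quad R(y,t) \neq 0 \qquad (1)$$
or show that
$$R^r = 0 \mod I$$
for some r. (2)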
We assume that no solution to (1) exists, and use this to synthesise a relation of the form (2).
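Let $y \in F^{d-1}$ be arbitrary. We can view the polynomials $P_1(y,t),\ldots,P_m(y,t),R(y,t)$ as polynomials in F[t], whose coefficients lie in F but happen to depend in a polynomial fashion on y. To emphasise this, we write $P_{j,y}(t)$ for $P_j(y,t)$ and $R_y(t)$ for $R(y,t)$. Then by hypothesis, there is no t for which
$$P_{1,y}(t) = \ldots = P_{m,y}(t) = 0; \quad R_y(t) \neq 0.$$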
To motivate the strategy, let us consider the easy case when $R=1$, m=2, and $P_1$, $P_2$ are monic polynomials in t. Then by our previous discussion, the above system is solvable for any fixed y precisely when $\hbox{Res}(P_{1,y},P_{2,y})$ is zero. So either the equation $\hbox{Res}(P_{1,y},P_{2,y}) = 0$ has a solution y, in which case we are in case (1), or it does not. But in the latter case, by applying the nullstellensatz at one lower dimension we see that $\hbox{Res}(P_{1,y},P_{2,y})$ must be a non-zero constant in y. But recall that the resultant is a linear combination $P_{1,y} S_{1,y} + P_{2,y} S_{2,y}$ of $P_{1,y}$ and $P_{2,y}$, where the polynomials $S_{1,y}$ and $S_{2,y}$ depend polynomially on the coefficients of $P_{1,y}$ and $P_{2,y}$ and thus on y itself. Thus we end up in case (2), and the induction closes in this case.
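Two toy instances with d=2 may make the dichotomy in this easy case more tangible. Both pairs of polynomials are illustrative choices of mine, and the resultant is stated only up to sign, since the sign depends on the convention used.

```latex
% (a) P_1 = t - y, \; P_2 = t - y - 1 (both monic in t):
P_1 \cdot 1 + P_2 \cdot (-1) = 1 ,
% so the resultant is the non-zero constant \pm 1 for every y,
% and we are in case (2): the system has no solution.
%
% (b) P_1 = t^2 - y, \; P_2 = t - 1 (both monic in t):
\mathrm{Res}(P_{1,y}, P_{2,y}) = \pm (1 - y) ,
% which vanishes at y = 1, giving the solution (y,t) = (1,1):
% we are in case (1).
```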
Now we turn to the general case. Applying the d=1 analysis, we conclude that there exist polynomials $Q_{1,y},\ldots,Q_{m,y} \in F[t]$ of t, and an $r_y \geq 0$, such that
$$P_{1,y}(t) Q_{1,y}(t) + \ldots + P_{m,y}(t) Q_{m,y}(t) = R_y^{r_y}(t). \qquad (3)$$
Now, if the exponent $r_y$ was constant in y, and the coefficients of $Q_{1,y},\ldots,Q_{m,y}$ depended polynomially on y, we would be in case (2) and therefore done.
It is not difficult to make $r_y$ constant in y. Indeed, we observe that the degrees of $P_{1,y},\ldots,P_{m,y},R_y$ are bounded uniformly in y. Inspecting the d=1 analysis, we conclude that the exponent $r_y$ returned by that algorithm is then also bounded uniformly in y. We can always raise the value of $r_y$ by multiplying both sides of (3) by $R_y$, and so we can make $r = r_y$ independent of y, thus
$$P_1(y,t) Q_{1,y}(t) + \ldots + P_m(y,t) Q_{m,y}(t) = R^r(y,t). \qquad (4)$$
Now we need to work on the Q’s. Unfortunately, the coefficients of the Q’s are not polynomial in y; instead, they are piecewise rational in y. Indeed, by inspecting the algorithm used to prove the d=1 case, we see that the algorithm makes a finite number of branches, depending on whether certain polynomial expressions T(y) of y are zero or non-zero. At the end of each branching path, the algorithm returns polynomials $Q_{1,y},\ldots,Q_{m,y}$ whose coefficients are rational combinations of the coefficients of $P_{1,y},\ldots,P_{m,y},R_y$ and are thus rational functions of y. Furthermore, all the division operations are by polynomials T(y) which were guaranteed to be non-zero by some stage of the branching process, and so the net denominator of any of these coefficients is some product of the T(y) that are guaranteed non-zero.
An example might help illustrate what’s going on here. Suppose m=2, that R = 1, and $P_1, P_2$ are linear in t, thus
$$P_{1,y}(t) = a(y) + t b(y); \quad P_{2,y}(t) = c(y) + t d(y)$$
for some polynomials $a,b,c,d \in F[y]$. To find the gcd of $P_{1,y}$ and $P_{2,y}$ for a given y, which determines the solvability of the system $P_{1,y}(t) = P_{2,y}(t) = 0$, the Euclidean algorithm branches as follows:
- If b(y) is zero, then
  - If a(y) is zero, then
    - If d(y) is non-zero, then $P_{2,y}$ is the gcd (and the system is solvable).
    - Otherwise, if d(y) is zero, then
      - If c(y) is non-zero, then $1$ is the gcd (and the system is unsolvable).
      - Otherwise, if c(y) is zero, then $0$ is the gcd (and the system is solvable).
  - Otherwise, if a(y) is non-zero, then $1$ is the gcd (and the system is unsolvable).
- Otherwise, if b(y) is non-zero, then
  - If a(y)d(y)-b(y)c(y) is non-zero, then $1$ is the gcd (and the system is unsolvable).
  - Otherwise, if a(y)d(y)-b(y)c(y) is zero, then $P_{1,y}$ is the gcd (and the system is solvable).
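The branching for two linear equations can be transcribed directly into code. In this sketch the arguments are the values a(y), b(y), c(y), d(y) at one fixed y, for $P_{1,y}(t) = a + tb$ and $P_{2,y}(t) = c + td$. The function name and the string labels for the gcd ("1" standing for any non-zero constant) are hypothetical conventions chosen for illustration.

```python
def linear_branch(a, b, c, d):
    """Return (gcd_label, solvable) for P1 = a + t*b, P2 = c + t*d."""
    if b == 0:
        if a == 0:                      # P1 is identically zero
            if d != 0:
                return ("P2", True)     # P2 is genuinely linear: one root
            if c != 0:
                return ("1", False)     # P2 is a non-zero constant
            return ("0", True)          # both vanish: every t is a solution
        return ("1", False)             # P1 is a non-zero constant
    if a * d - b * c != 0:
        return ("1", False)             # distinct roots: no common solution
    return ("P1", True)                 # P2 is a multiple of P1
```

Each `return` corresponds to one leaf of the tree above, and each `if` tests one polynomial expression in y for being zero, which is the only kind of branching the one-variable algorithm performs.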
So we see that even in the rather simple case of solving two linear equations in one unknown, there is a moderately complicated branching tree involved. Nevertheless, there are only finitely many branching paths. Some of these paths may be infeasible, in the sense that there do not exist any $y \in F^{d-1}$ which can follow these paths. But given any feasible path, say one in which the polynomials $S_1(y),\ldots,S_a(y)$ are observed to be zero, and $T_1(y),\ldots,T_b(y)$ are observed to be non-zero, we know (since we are assuming no solution to (1)) that the algorithm creates an identity of the form (4) in which the coefficients of $Q_{1,y},\ldots,Q_{m,y}$ are rational functions of y, whose denominators are products of $T_1,\ldots,T_b$. We may thus clear denominators (enlarging r if necessary) and obtain an identity of the form
$$P_1(y,t) U_1(y,t) + \ldots + P_m(y,t) U_m(y,t) = (T_1(y) \ldots T_b(y) R(y,t))^r$$
for some polynomials $U_1,\ldots,U_m \in F[y,t]$. This identity holds whenever y is such that $S_1(y),\ldots,S_a(y)$ are zero and $T_1(y),\ldots,T_b(y)$ are non-zero. But an inspection of the algorithm shows that the only reason we needed $T_1(y),\ldots,T_b(y)$ to be non-zero was in order to divide by these numbers; if we clear denominators throughout, we thus see that we can remove these constraints and deduce that the identity holds whenever $S_1(y),\ldots,S_a(y)$ are zero. Further inspection of the algorithm then shows that even if $S_1(y),\ldots,S_a(y)$ are non-zero, this only introduces additional terms to the identity which are combinations (over $F[y,t]$) of $S_1,\ldots,S_a$. Thus, for any feasible path, we obtain an identity in F[y,t] of the form
$$P_1 U_1 + \ldots + P_m U_m = (T_1 \ldots T_b R)^r + S_1 V_1 + \ldots + S_a V_a$$
for some polynomials $U_1,\ldots,U_m,V_1,\ldots,V_a \in F[y,t]$. In other words, we see that
$$(T_1 \ldots T_b R)^r = 0 \mod I, S_1, \ldots, S_a \qquad (5)$$
for any feasible path.
Now what we need to do is fold up the branching tree and simplify the relations (5) until we obtain (2). More precisely, we claim that (5) holds (for some r) not only for complete feasible paths (in which we follow the branching tree all the way to the end), but for partial feasible paths, in which we branch some of the way and then stop in a place where at least one $y \in F^{d-1}$ can solve all the constraints branched on so far. In particular, the empty feasible path will then give (2).
To prove this claim, we induct backwards on the length of the partial path. So suppose we have some partial feasible path, which required $S_1(y),\ldots,S_a(y)$ to be zero and $T_1(y),\ldots,T_b(y)$ to be non-zero in order to get here. If this path is complete, then we are already done, so suppose there is a further branching, say on a polynomial $W(y)$. At least one of the cases $W(y)=0$ and $W(y) \neq 0$ must be feasible; and so we now divide into three cases.
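Case 1: $W(y)=0$ is feasible and $W(y) \neq 0$ is infeasible. If we follow the $W(y)=0$ path and use the inductive hypothesis, we obtain a constraint
$$(T_1 \ldots T_b R)^r = 0 \mod I, S_1, \ldots, S_a, W \qquad (6)$$
for some r. On the other hand, since $W(y) \neq 0$ is infeasible, we see that the problem
$$S_1(y) = \ldots = S_a(y) = 0; \quad T_1(y) \ldots T_b(y) W(y) \neq 0$$
has no solution. Since the nullstellensatz is assumed to hold for dimension d-1, we conclude that
$$(T_1 \ldots T_b W)^{r'} = 0 \mod S_1, \ldots, S_a$$
for some $r'$. If we then raise (6) to the power $r'$ and multiply by $(T_1 \ldots T_b)^{r'}$ to eliminate the role of W, we conclude (5) (for $rr'+r'$) as required.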
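The elimination of W in Case 1 can be spelled out as follows. This is a sketch under stated assumptions: (6) is taken to read $(T_1 \ldots T_b R)^r = 0$ modulo $I, S_1, \ldots, S_a, W$, and the nullstellensatz in dimension d-1 is taken to give $(T_1 \ldots T_b W)^{r'} = 0$ modulo $S_1, \ldots, S_a$. The abbreviations $T = T_1 \ldots T_b$ and $J = (I, S_1, \ldots, S_a)$, and the cofactor name Z, are introduced here only for illustration.

```latex
\begin{align*}
(TR)^r &= W Z \mod J
  && \text{(6), with the multiple of $W$ made explicit} \\
(TR)^{rr'} &= W^{r'} Z^{r'} \mod J
  && \text{raise to the power } r' \\
T^{r'} (TR)^{rr'} &= (TW)^{r'} Z^{r'} = 0 \mod J
  && \text{multiply by } T^{r'} \\
(TR)^{rr'+r'} &= R^{r'} \cdot T^{r'} (TR)^{rr'} = 0 \mod J
  && \text{which is (5) with exponent } rr'+r'
\end{align*}
```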
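Case 2: $W(y)=0$ is infeasible and $W(y) \neq 0$ is feasible. If we follow the $W(y) \neq 0$ path, we obtain a constraint
$$(T_1 \ldots T_b W R)^{r''} = 0 \mod I, S_1, \ldots, S_a \qquad (7)$$
for some $r''$, while the infeasibility of the $W(y)=0$ path means that there is no solution to
$$S_1(y) = \ldots = S_a(y) = W(y) = 0; \quad T_1(y) \ldots T_b(y) \neq 0$$
and so by the nullstellensatz in dimension d-1 we have
$$(T_1 \ldots T_b)^{r'''} = W Z \mod S_1, \ldots, S_a$$
for some polynomial Z and some $r'''$. If we then multiply (7) by $Z^{r''}$ to eliminate W, we obtain (5) as desired (for $r''+r''r'''$).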
Case 3: $W(y)=0$ and $W(y) \neq 0$ are both feasible.
In this case we obtain the constraints (6) and (7). We rewrite (6) in the form
$$(T_1 \ldots T_b R)^r = W Z \mod I, S_1, \ldots, S_a$$
for some Z, and then multiply (7) by $Z^{r''}$ to eliminate W and obtain (5) as desired (for $r'' + rr''$).
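The eliminations in Cases 2 and 3 follow the same pattern, which can be checked line by line. The sketch below assumes that (7) reads $(T_1 \ldots T_b W R)^{r''} = 0$ modulo $I, S_1, \ldots, S_a$, and uses the hypothetical abbreviations $T = T_1 \ldots T_b$ and $J = (I, S_1, \ldots, S_a)$. The exponents shown are simply the ones this particular computation produces; any exponent for which (5) holds would serve equally well.

```latex
% Case 3: (6) rewritten as (TR)^r = W Z \mod J; multiply (7) by Z^{r''}:
0 = Z^{r''} (T W R)^{r''}
  = (TR)^{r''} (WZ)^{r''}
  = (TR)^{r''} (TR)^{r r''}
  = (TR)^{r'' + r r''} \mod J .
%
% Case 2: T^{r'''} = W Z \mod S_1, \ldots, S_a; multiply (7) by Z^{r''}:
0 = Z^{r''} (T W R)^{r''}
  = (TR)^{r''} (WZ)^{r''}
  = (TR)^{r''} \, T^{r'' r'''} \mod J ,
% and multiplying by R^{r'' r'''} gives (TR)^{r'' + r'' r'''} = 0 \mod J .
```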
This inductively establishes (5) for all partial branching paths, leading eventually to (2) as desired.