Jean-Pierre Serre (whose papers are, of course, always worth reading) recently posted a lovely lecture on the arXiv entitled “How to use finite fields for problems concerning infinite fields”. In it, he describes several ways in which algebraic statements over fields of zero characteristic, such as ${{\mathbb C}}$, can be deduced from their positive characteristic counterparts such as ${F_{p^m}}$, despite the fact that there is no non-trivial field homomorphism between the two types of fields. In particular finitary tools, including such basic concepts as cardinality, can now be deployed to establish infinitary results. This leads to some simple and elegant proofs of non-trivial algebraic results which are not easy to establish by other means.

One deduction of this type is based on the idea that positive characteristic fields can partially model zero characteristic fields, and proceeds like this: if a certain algebraic statement failed over (say) ${{\mathbb C}}$, then there should be a “finitary algebraic” obstruction that “witnesses” this failure over ${{\mathbb C}}$. Because this obstruction is both finitary and algebraic, it must also be definable in some (large) finite characteristic, thus leading to a comparable failure over a finite characteristic field. Taking contrapositives, one obtains the claim.

Algebra is definitely not my own field of expertise, but it is interesting to note that similar themes have also come up in my own area of additive combinatorics (and more generally arithmetic combinatorics), because the combinatorics of addition and multiplication on finite sets is definitely of a “finitary algebraic” nature. For instance, a recent paper of Vu, Wood, and Wood establishes a finitary “Freiman-type” homomorphism from (finite subsets of) the complex numbers to large finite fields that allows them to pull back many results in arithmetic combinatorics in finite fields (e.g. the sum-product theorem) to the complex plane. (Van Vu and I also used a similar trick to control the singularity property of random sign matrices by first mapping them into finite fields in which cardinality arguments became available.) And I have a particular fondness for correspondences between finitary and infinitary mathematics; the correspondence Serre discusses is slightly different from the one I discuss for instance in here or here, although there seems to be a common theme of “compactness” (or of model theory) tying these correspondences together.

As one of his examples, Serre cites one of my own favourite results in algebra, discovered independently by Ax and by Grothendieck (and then rediscovered many times since). Here is a special case of that theorem:

Theorem 1 (Ax-Grothendieck theorem, special case) Let ${P: {\mathbb C}^n \rightarrow {\mathbb C}^n}$ be a polynomial map from a complex vector space to itself. If ${P}$ is injective, then ${P}$ is bijective.

The full version of the theorem allows one to replace ${{\mathbb C}^n}$ by an algebraic variety ${X}$ over any algebraically closed field, and for ${P}$ to be a morphism from the algebraic variety ${X}$ to itself, but for simplicity I will just discuss the above special case. This theorem is not at all obvious; it is not too difficult (see Lemma 4 below) to show that the Jacobian of ${P}$ is non-degenerate, but this does not come close to solving the problem since one would then be faced with the notorious Jacobian conjecture. Also, the claim fails if “polynomial” is replaced by “holomorphic”, due to the existence of Fatou-Bieberbach domains.

In this post I would like to give the proof of Theorem 1 based on finite fields as mentioned by Serre, as well as another elegant proof of Rudin that combines algebra with some elementary complex variable methods. (There are several other proofs of this theorem and its generalisations, for instance a topological proof by Borel, which I will not discuss here.)

Update, March 8: Some corrections to the finite field proof. Thanks to Matthias Aschenbrenner also for clarifying the relationship with Tarski’s theorem and some further references.

— 1. Proof via finite fields —

The first observation is that the theorem is utterly trivial in the finite field case:

Theorem 2 (Ax-Grothendieck theorem in ${F}$) Let ${F}$ be a finite field, and let ${P: F^n \rightarrow F^n}$ be a polynomial. If ${P}$ is injective, then ${P}$ is bijective.

Proof: Any injection from a finite set to itself is necessarily bijective. (The hypothesis that ${P}$ is a polynomial is not needed at this stage, but becomes crucial later on.) $\Box$
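As a quick sanity check of Theorem 2 in the simplest case ${n=1}$, here is a minimal brute-force sketch in Python. The function names `poly_map` and `injective_implies_bijective`, and the choice of ${F_5}$ with degree at most ${3}$, are illustrative assumptions of mine and not taken from Serre's lecture.

```python
from itertools import product

def poly_map(coeffs, p):
    """Value table of x -> sum(c_i * x**i) mod p on F_p = {0, ..., p-1}."""
    return [sum(c * pow(x, i, p) for i, c in enumerate(coeffs)) % p
            for x in range(p)]

def injective_implies_bijective(p, max_degree):
    """Test every polynomial of degree <= max_degree over F_p.

    Returns (no_counterexample_found, number_of_injective_polynomials).
    """
    injective_count = 0
    for coeffs in product(range(p), repeat=max_degree + 1):
        values = poly_map(coeffs, p)
        if len(set(values)) == p:                # injective on F_p
            injective_count += 1
            if set(values) != set(range(p)):     # injective but not surjective
                return False, injective_count
    return True, injective_count

ok, count = injective_implies_bijective(5, 3)
print(ok, count)
```

Of course the search can never find a counterexample, for exactly the counting reason given in the proof; the point is only to make the pigeonhole argument concrete.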

Next, we pass from a finite field ${F}$ to its algebraic closure ${\overline{F}}$.

Theorem 3 (Ax-Grothendieck theorem in ${\overline{F}}$) Let ${F}$ be a finite field, let ${\overline{F}}$ be its algebraic closure, and let ${P: \overline{F}^n \rightarrow \overline{F}^n}$ be a polynomial. If ${P}$ is injective, then ${P}$ is bijective.

Proof: Our main tool here is Hilbert’s nullstellensatz, which we interpret here as an assertion that if an algebraic problem is insoluble, then there exists a finitary algebraic obstruction that witnesses this lack of solution (see also my blog post on this topic). Specifically, suppose for contradiction that we can find a polynomial ${P: \overline{F}^n \rightarrow \overline{F}^n}$ which is injective but not surjective. Injectivity of ${P}$ means that the algebraic system

$\displaystyle P(x) = P(y); \quad x \neq y$

has no solution over the algebraically closed field ${\overline{F}}$; by the nullstellensatz, this implies that there must exist an algebraic identity of the form

$\displaystyle (P(x) - P(y)) \cdot Q(x,y) = (x-y)^r \ \ \ \ \ (1)$

for some ${r \geq 1}$ and some polynomial ${Q: \overline{F}^n \times \overline{F}^n \rightarrow \overline{F}^n}$ that specifically witnesses this lack of solvability. Similarly, lack of surjectivity means the existence of a ${z_0 \in \overline{F}^n}$ such that the algebraic system

$\displaystyle P(x) = z_0$

has no solution over the algebraically closed field ${\overline{F}}$; by another application of the nullstellensatz, there must exist an algebraic identity of the form

$\displaystyle (P(x) - z_0) \cdot R(x) = 1 \ \ \ \ \ (2)$

for some polynomial ${R: \overline{F}^n \rightarrow \overline{F}^n}$ that specifically witnesses this lack of solvability.

Fix ${Q, z_0, R}$ as above, and let ${k}$ be the subfield of ${\overline{F}}$ generated by ${F}$ and the coefficients of ${P, Q, z_0, R}$. Then we observe (thanks to our explicit witnesses (1), (2)) that the counterexample ${P}$ descends from ${\overline{F}}$ to ${k}$; ${P}$ is a polynomial from ${k^n}$ to ${k^n}$ which is injective but not surjective.

But ${k}$ is finitely generated, and every element of ${k}$ is algebraic over the finite field ${F}$, thus ${k}$ is finite. But this contradicts Theorem 2. $\Box$
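To see what a witness of the form (1) can look like, consider (as an illustration of my own, not an example from the lecture) the Frobenius map ${P(x) = x^p}$ on the algebraic closure of ${F_p}$ with ${n=1}$. It is injective, and (1) holds with ${Q = 1}$ and ${r = p}$, since ${x^p - y^p = (x-y)^p}$ in characteristic ${p}$. The following sketch checks this polynomial identity coefficient by coefficient; the function name `frobenius_witness_holds` is hypothetical.

```python
from math import comb

def frobenius_witness_holds(p):
    """Check (x - y)**p == x**p - y**p as polynomials with coefficients mod p.

    The coefficient of x**(p-k) * y**k in (x - y)**p is (-1)**k * comb(p, k).
    On the right-hand side it is 1 for k = 0, -1 for k = p, and 0 otherwise.
    """
    for k in range(p + 1):
        expanded = ((-1) ** k * comb(p, k)) % p
        if k == 0:
            target = 1 % p
        elif k == p:
            target = (-1) % p
        else:
            target = 0
        if expanded != target:
            return False
    return True

for p in (2, 3, 5, 7, 11, 13):
    print(p, frobenius_witness_holds(p))
```

The identity genuinely depends on ${p}$ being prime: the same check fails for a composite modulus such as ${4}$ or ${6}$.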

The complex case ${{\mathbb C}}$ follows by a slight extension of the argument used to prove Theorem 3. Indeed, suppose for contradiction that there is a polynomial ${P: {\mathbb C}^n \rightarrow {\mathbb C}^n}$ which is injective but not surjective. As ${{\mathbb C}}$ is algebraically closed, we may invoke the nullstellensatz as before and find witnesses (1), (2) for some ${Q, z_0, R}$.

Now let ${k={\mathbb Q}[{\mathcal C}]}$ be the subfield of ${{\mathbb C}}$ generated by the rationals ${{\mathbb Q}}$ and the coefficients ${{\mathcal C}}$ of ${P, Q, z_0, R}$. Then we can descend the counterexample to ${k}$. This time, ${k}$ is not finite, but we can descend it to a finite field (and obtain the desired contradiction) by a number of methods. One approach, which is the one taken by Serre, is to quotient the ring ${{\mathbb Z}[{\mathcal C}]}$ generated by the above coefficients by a maximal ideal, observing that this quotient is necessarily a finite field. Another is to use a general mapping theorem of Vu, Wood, and Wood. We sketch the latter approach as follows.

Being finitely generated, we know that ${k}$ has a finite transcendence basis ${\alpha_1,\ldots,\alpha_m}$ over ${{\mathbb Q}}$. Applying the primitive element theorem, we can then express ${k}$ as the finite extension of ${{\mathbb Q}[\alpha_1,\ldots,\alpha_m]}$ by an element ${\beta}$ which is algebraic over ${{\mathbb Q}[\alpha_1,\ldots,\alpha_m]}$; all the coefficients ${{\mathcal C}}$ are thus rational combinations of ${\alpha_1,\ldots,\alpha_m,\beta}$. By rationalising, we can ensure that the denominators of the expressions of these coefficients are integers in ${{\mathbb Z}[\alpha_1,\ldots,\alpha_m]}$; dividing ${\beta}$ by an appropriate power of the product of these denominators we may assume that the coefficients in ${{\mathcal C}}$ all lie in the commutative ring ${{\mathbb Z}[\alpha_1,\ldots,\alpha_m,\beta]}$, which can be identified with the commutative ring ${{\mathbb Z}[a_1,\ldots,a_m,b]}$ generated by formal indeterminates ${a_1,\ldots,a_m,b}$, quotiented by the ideal generated by the minimal polynomial ${f \in {\mathbb Z}[a_1,\ldots,a_m,b]}$ of ${\beta}$; the algebraic identities (1), (2) then transfer to this ring. Now pick a large prime ${p}$, and map ${a_1,\ldots,a_m}$ to random elements of ${F_p}$. With high probability, the image of ${f}$ (which is now in ${F_p[b]}$) is non-degenerate; we can then map ${b}$ to a root of this image in a finite extension of ${F_p}$. (In fact, by using the Chebotarev density theorem (or Frobenius density theorem), we can place ${b}$ back in ${F_p}$ for infinitely many primes ${p}$.) This descends the identities (1), (2) to this finite extension, as desired.
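Here is a toy instance of this descent, under simplifying assumptions of my own: there are no transcendental generators (${m=0}$), the algebraic generator is ${\beta = \sqrt{2}}$ with minimal polynomial ${f(b) = b^2 - 2}$, and the identity ${(1+\beta)(\beta-1) = 1}$ in ${{\mathbb Z}[\sqrt{2}]}$ stands in for the witnesses (1), (2). The sketch maps ${b}$ to a root of ${f}$ in ${F_p}$ whenever one exists and checks that the identity survives; the function names are hypothetical.

```python
def square_roots_of_two(p):
    """All b in F_p with b*b == 2 (mod p), found by brute force."""
    return [b for b in range(p) if (b * b - 2) % p == 0]

def identity_descends(p):
    """For each root b of b^2 - 2 in F_p, check (1 + b)(b - 1) == 1 in F_p.

    Returns (roots, True if the identity holds at every root).
    """
    roots = square_roots_of_two(p)
    return roots, all(((1 + b) * (b - 1)) % p == 1 for b in roots)

for p in (5, 7, 17, 23, 31):
    roots, ok = identity_descends(p)
    print(p, roots, ok)
```

For a prime such as ${p=5}$ the list of roots is empty, which corresponds to the case in which one has to pass to a finite extension of ${F_p}$ (here a quadratic one) to find an image for ${b}$.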

Remark 1 This argument can be generalised substantially; it can be used to show that any first-order sentence in the language of fields is true in all algebraically closed fields of characteristic zero if and only if it is true for all algebraically closed fields of sufficiently large characteristic. This result can be deduced from the famous result (proved by Tarski, and independently, in an equivalent formulation, by Chevalley) that the theory of algebraically closed fields (in the language of rings) admits elimination of quantifiers. See for instance Section IV.23.4 of the Princeton Companion to Mathematics. There are also analogues for real closed fields, starting with the paper of Bialynicki-Birula and Rosenlicht, with a general result established by Kurdyka. Ax-Grothendieck type properties in other categories have been studied by Gromov, who calls this property “surjunctivity”.

— 2. Rudin’s proof —

Now we give Rudin’s proof, which does not use the nullstellensatz, instead relying on some Galois theory and the topological structure of ${{\mathbb C}}$. We first need a basic fact:

Lemma 4 Let ${\Omega \subset {\mathbb C}^n}$ be an open set, and let ${f: \Omega \rightarrow {\mathbb C}^n}$ be an injective holomorphic map. Then the Jacobian of ${f}$ is non-degenerate, i.e. ${\det Df(z) \neq 0}$ for all ${z \in \Omega}$.

Actually, we only need the special case of this lemma when ${f}$ is a polynomial.

Proof: We use an argument of Rosay. For ${n=1}$ the claim follows from Taylor expansion. Now suppose ${n>1}$ and the claim is proven for ${n-1}$. Suppose for contradiction that ${\det Df(z_0)=0}$ for some ${z_0 \in \Omega}$. We claim that ${Df(z_0)}$ in fact vanishes entirely. If not, then we can find ${1 \leq i,j \leq n}$ such that ${\frac{\partial}{\partial z_j} f_i(z_0) \neq 0}$; by permuting we may take ${i=j=1}$. We can also normalise ${z_0=f(z_0)=0}$. Then the map ${h: z \mapsto (f_1(z),z_2,\ldots,z_n)}$ is holomorphic with non-degenerate Jacobian at ${0}$ and is thus locally invertible at ${0}$. The map ${f \circ h^{-1}}$ is then holomorphic at ${0}$ and preserves the ${z_1}$ coordinate, and thus descends to an injective holomorphic map on a neighbourhood of the origin in ${{\mathbb C}^{n-1}}$, and so its Jacobian is non-degenerate by induction hypothesis, a contradiction.

We have just shown that the gradient of ${f}$ vanishes on the zero set ${\{ \det Df = 0 \}}$, which is an analytic variety of codimension ${1}$ (if ${f}$ is polynomial, it is of course an algebraic variety). Thus ${f}$ is locally constant on this variety, which contradicts injectivity and we are done. $\Box$
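The base case ${n=1}$ of the lemma can be illustrated numerically. As a hypothetical example of my own, take ${f(z) = z^2 + z^3}$, whose derivative vanishes at ${0}$; the sketch below produces, for a small ${z \neq 0}$, a second nearby point ${w \neq z}$ with ${f(w) = f(z)}$, so that ${f}$ fails to be injective on every neighbourhood of the origin.

```python
import cmath

def f(z):
    """A holomorphic map with f'(0) = 0 (illustrative choice)."""
    return z**2 + z**3

def partner(z):
    """A second point w != z with f(w) == f(z), for small nonzero z.

    Since f(z) - f(w) = (z - w) * (z + w + z*z + z*w + w*w), the point w
    solves w**2 + (1 + z)*w + (z + z**2) = 0; we take the root near -z.
    """
    disc = (1 + z) ** 2 - 4 * (z + z**2)
    return (-(1 + z) + cmath.sqrt(disc)) / 2

z = 0.01 + 0.02j
w = partner(z)
print(w, abs(f(z) - f(w)))
```

The agreement of ${f(z)}$ and ${f(w)}$ here is only up to floating-point rounding, so this is a numerical illustration rather than a proof.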

From this lemma and the inverse function theorem we have

Corollary 5 Injective holomorphic maps from ${{\mathbb C}^n}$ to ${{\mathbb C}^n}$ are open.

Now we can give Rudin’s proof. Let ${P: {\mathbb C}^n \rightarrow {\mathbb C}^n}$ be an injective polynomial. We let ${k}$ be the field generated by ${{\mathbb Q}}$ and the coefficients of ${P}$; thus ${P}$ is definable over ${k}$. Let ${k[z] = k[z_1,\ldots,z_n]}$ be the extension of ${k}$ by ${n}$ indeterminates ${z_1,\ldots,z_n}$. Inside ${k[z]}$ we have the subfield ${k[P(z)]}$ generated by ${k}$ and the components of ${P(z)}$.

We claim that ${k[P(z)]}$ is all of ${k[z]}$. For if this were not the case, we see from Galois theory that there is a non-trivial automorphism ${\phi: k[z] \rightarrow k[z]}$ that fixes ${k[P(z)]}$; in particular, there exists a non-trivial rational (over ${k}$) combination ${Q(z)/R(z)}$ of ${z}$ such that ${P(Q(z)/R(z)) = P(z)}$. Now map ${z}$ to a random complex vector in ${{\mathbb C}^n}$, which will almost surely be transcendental over the countable field ${k}$; this explicitly demonstrates non-injectivity of ${P}$, a contradiction.

Since ${k[P(z)] = k[z]}$, there exists a rational function ${Q_j(z)/R_j(z)}$ over ${k}$ for each ${j=1,\ldots,n}$ such that ${z_j = Q_j(P(z))/R_j(P(z))}$. We may of course assume that ${Q_j, R_j}$ have no common factors.

We have the polynomial identity ${Q_j(P(z)) = z_j R_j(P(z))}$. In particular, this implies that on the domain ${P({\mathbb C}^n) \subset {\mathbb C}^n}$ (which is open by Corollary 5), the zero set of ${R_j}$ is contained in the zero set of ${Q_j}$. But as ${Q_j}$ and ${R_j}$ have no common factors, this is impossible by elementary algebraic geometry; thus ${R_j}$ is non-vanishing on ${P({\mathbb C}^n)}$. Thus the polynomial ${R_j \circ P}$ has no zeroes and is thus constant; we may then normalise so that ${R_j \circ P = 1}$. Thus we now have ${z = Q(P(z))}$ for some polynomial ${Q}$, which implies that ${w = P(Q(w))}$ for all ${w}$ in the open set ${P({\mathbb C}^n)}$. But ${w}$ and ${P(Q(w))}$ are both polynomials, and thus must agree on all of ${{\mathbb C}^n}$. Thus ${P}$ is bijective as required.

Remark 2 Note that Rudin’s proof gives the stronger statement that if a polynomial map from ${{\mathbb C}^n}$ to ${{\mathbb C}^n}$ is injective, then it is bijective and its inverse is also a polynomial.
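A simple instance of Remark 2, chosen by me for illustration, is the shear ${P(x,y) = (x, y + x^2)}$ on ${{\mathbb C}^2}$, which is injective and has the polynomial inverse ${Q(u,v) = (u, v - u^2)}$. The sketch below checks both compositions on a small grid of Gaussian integers, where Python's complex arithmetic is exact.

```python
def P(x, y):
    """The shear (x, y) -> (x, y + x^2), an injective polynomial map."""
    return (x, y + x * x)

def Q(u, v):
    """Candidate polynomial inverse (u, v) -> (u, v - u^2)."""
    return (u, v - u * u)

# Small Gaussian integers a + bi; products and sums of these are exact.
points = [complex(a, b) for a in range(-2, 3) for b in range(-2, 3)]

left_inverse = all(Q(*P(x, y)) == (x, y) for x in points for y in points)
right_inverse = all(P(*Q(u, v)) == (u, v) for u in points for v in points)
print(left_inverse, right_inverse)
```

A finite grid check does not by itself establish the polynomial identities ${Q(P(z)) = z}$ and ${P(Q(w)) = w}$, although here they are immediate by direct substitution.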