You are currently browsing the tag archive for the ‘Galois theory’ tag.

Let ${k}$ be a field, and let ${E}$ be a finite extension of that field; in this post we will denote such a relationship by ${k \hookrightarrow E}$. We say that ${E}$ is a Galois extension of ${k}$ if the cardinality of the automorphism group ${\mathrm{Aut}(E/k)}$ of ${E}$ fixing ${k}$ is as large as it can be, namely the degree ${[E:k]}$ of the extension. In that case, we call ${\mathrm{Aut}(E/k)}$ the Galois group of ${E}$ over ${k}$ and denote it also by ${\mathrm{Gal}(E/k)}$. The fundamental theorem of Galois theory then gives a one-to-one correspondence (also known as the Galois correspondence) between the intermediate extensions between ${E}$ and ${k}$ and the subgroups of ${\mathrm{Gal}(E/k)}$:

Theorem 1 (Fundamental theorem of Galois theory) Let ${E}$ be a Galois extension of ${k}$.

• (i) If ${k \hookrightarrow F \hookrightarrow E}$ is an intermediate field betwen ${k}$ and ${E}$, then ${E}$ is a Galois extension of ${F}$, and ${\mathrm{Gal}(E/F)}$ is a subgroup of ${\mathrm{Gal}(E/k)}$.
• (ii) Conversely, if ${H}$ is a subgroup of ${\mathrm{Gal}(E/k)}$, then there is a unique intermediate field ${k \hookrightarrow F \hookrightarrow E}$ such that ${\mathrm{Gal}(E/F)=H}$; namely ${F}$ is the set of elements of ${E}$ that are fixed by ${H}$.
• (iii) If ${k \hookrightarrow F_1 \hookrightarrow E}$ and ${k \hookrightarrow F_2 \hookrightarrow E}$, then ${F_1 \hookrightarrow F_2}$ if and only if ${\mathrm{Gal}(E/F_2)}$ is a subgroup of ${\mathrm{Gal}(E/F_1)}$.
• (iv) If ${k \hookrightarrow F \hookrightarrow E}$ is an intermediate field between ${k}$ and ${E}$, then ${F}$ is a Galois extension of ${k}$ if and only if ${\mathrm{Gal}(E/F)}$ is a normal subgroup of ${\mathrm{Gal}(E/k)}$. In that case, ${\mathrm{Gal}(F/k)}$ is isomorphic to the quotient group ${\mathrm{Gal}(E/k) / \mathrm{Gal}(E/F)}$.

Example 2 Let ${k= {\bf Q}}$, and let ${E = {\bf Q}(e^{2\pi i/n})}$ be the degree ${\phi(n)}$ Galois extension formed by adjoining a primitive ${n^{th}}$ root of unity (that is to say, ${E}$ is the cyclotomic field of order ${n}$). Then ${\mathrm{Gal}(E/k)}$ is isomorphic to the multiplicative cyclic group ${({\bf Z}/n{\bf Z})^\times}$ (the invertible elements of the ring ${{\bf Z}/n{\bf Z}}$). Amongst the intermediate fields, one has the cyclotomic fields of the form ${F = {\bf Q}(e^{2\pi i/m})}$ where ${m}$ divides ${n}$; they are also Galois extensions, with ${\mathrm{Gal}(F/k)}$ isomorphic to ${({\bf Z}/m{\bf Z})^\times}$ and ${\mathrm{Gal}(E/F)}$ isomorphic to the elements ${a}$ of ${({\bf Z}/n{\bf Z})^\times}$ such that ${a(n/m) = (n/m)}$ modulo ${n}$. (There can also be other intermediate fields, corresponding to other subgroups of ${({\bf Z}/n{\bf Z})^\times}$.)

Example 3 Let ${k = {\bf C}(z)}$ be the field of rational functions of one indeterminate ${z}$ with complex coefficients, and let ${E = {\bf C}(w)}$ be the field formed by adjoining an ${n^{th}}$ root ${w = z^{1/n}}$ to ${k}$, thus ${k = {\bf C}(w^n)}$. Then ${E}$ is a degree ${n}$ Galois extension of ${k}$ with Galois group isomorphic to ${{\bf Z}/n{\bf Z}}$ (with an element ${a \in {\bf Z}/n{\bf Z}}$ corresponding to the field automorphism of ${k}$ that sends ${w}$ to ${e^{2\pi i a/n} w}$). The intermediate fields are of the form ${F = {\bf C}(w^{n/m})}$ where ${m}$ divides ${n}$; they are also Galois extensions, with ${\mathrm{Gal}(F/k)}$ isomorphic to ${{\bf Z}/m{\bf Z}}$ and ${\mathrm{Gal}(E/F)}$ isomorphic to the multiples of ${m}$ in ${{\bf Z}/n{\bf Z}}$.

There is an analogous Galois correspondence in the covering theory of manifolds. For simplicity we restrict attention to finite covers. If ${L}$ is a connected manifold and ${\pi_{L \leftarrow M}: M \rightarrow L}$ is a finite covering map of ${L}$ by another connected manifold ${M}$, we denote this relationship by ${L \leftarrow M}$. (Later on we will change our function notations slightly and write ${\pi_{L \leftarrow M}: L \leftarrow M}$ in place of the more traditional ${\pi_{L \leftarrow M}: M \rightarrow L}$, and similarly for the deck transformations ${g: M \leftarrow M}$ below; more on this below the fold.) If ${L \leftarrow M}$, we can define ${\mathrm{Aut}(M/L)}$ to be the group of deck transformations: continuous maps ${g: M \rightarrow M}$ which preserve the fibres of ${\pi}$. We say that this covering map is a Galois cover if the cardinality of the group ${\mathrm{Aut}(M/L)}$ is as large as it can be. In that case we call ${\mathrm{Aut}(M/L)}$ the Galois group of ${M}$ over ${L}$ and denote it by ${\mathrm{Gal}(M/L)}$.

Suppose ${M}$ is a finite cover of ${L}$. An intermediate cover ${N}$ between ${M}$ and ${L}$ is a cover of ${N}$ by ${L}$, such that ${L \leftarrow N \leftarrow M}$, in such a way that the covering maps are compatible, in the sense that ${\pi_{L \leftarrow M}}$ is the composition of ${\pi_{L \leftarrow N}}$ and ${\pi_{N \leftarrow M}}$. This sort of compatibilty condition will be implicitly assumed whenever we chain together multiple instances of the ${\leftarrow}$ notation. Two intermediate covers ${N,N'}$ are equivalent if they cover each other, in a fashion compatible with all the other covering maps, thus ${L \leftarrow N \leftarrow N' \leftarrow M}$ and ${L \leftarrow N' \leftarrow N \leftarrow M}$. We then have the analogous Galois correspondence:

Theorem 4 (Fundamental theorem of covering spaces) Let ${L \leftarrow M}$ be a Galois covering.

• (i) If ${L \leftarrow N \leftarrow M}$ is an intermediate cover betwen ${L}$ and ${M}$, then ${M}$ is a Galois extension of ${N}$, and ${\mathrm{Gal}(M/N)}$ is a subgroup of ${\mathrm{Gal}(M/L)}$.
• (ii) Conversely, if ${H}$ is a subgroup of ${\mathrm{Gal}(M/L)}$, then there is a intermediate cover ${L \leftarrow N \leftarrow M}$, unique up to equivalence, such that ${\mathrm{Gal}(M/N)=H}$.
• (iii) If ${L \leftarrow N_1 \leftarrow M}$ and ${L \leftarrow N_2 \leftarrow M}$, then ${L \leftarrow N_1 \leftarrow N_2 \leftarrow M}$ if and only if ${\mathrm{Gal}(M/N_2)}$ is a subgroup of ${\mathrm{Gal}(M/N_1)}$.
• (iv) If ${L \leftarrow N \leftarrow M}$, then ${N}$ is a Galois cover of ${L}$ if and only if ${\mathrm{Gal}(M/N)}$ is a normal subgroup of ${\mathrm{Gal}(M/L)}$. In that case, ${\mathrm{Gal}(N/L)}$ is isomorphic to the quotient group ${\mathrm{Gal}(M/L) / \mathrm{Gal}(M/N)}$.

Example 5 Let ${L= {\bf C}^\times := {\bf C} \backslash \{0\}}$, and let ${M = {\bf C}^\times}$ be the ${n}$-fold cover of ${L}$ with covering map ${\pi_{L \leftarrow M}(w) := w^n}$. Then ${M}$ is a Galois cover of ${L}$, and ${\mathrm{Gal}(M/L)}$ is isomorphic to the cyclic group ${{\bf Z}/n{\bf Z}}$. The intermediate covers are (up to equivalence) of the form ${N = {\bf C}^\times}$ with covering map ${\pi_{L \leftarrow N}(u) := u^m}$ where ${m}$ divides ${n}$; they are also Galois covers, with ${\mathrm{Gal}(N/L)}$ isomorphic to ${{\bf Z}/m{\bf Z}}$ and ${\mathrm{Gal}(M/N)}$ isomorphic to the multiples of ${m}$ in ${{\bf Z}/n{\bf Z}}$.

Given the strong similarity between the two theorems, it is natural to ask if there is some more concrete connection between Galois theory and the theory of finite covers.

In one direction, if the manifolds ${L,M,N}$ have an algebraic structure (or a complex structure), then one can relate covering spaces to field extensions by considering the field of rational functions (or meromorphic functions) on the space. For instance, if ${L = {\bf C}^\times}$ and ${z}$ is the coordinate on ${L}$, one can consider the field ${{\bf C}(z)}$ of rational functions on ${L}$; the ${n}$-fold cover ${M = {\bf C}^\times}$ with coordinate ${w}$ from Example 5 similarly has a field ${{\bf C}(w)}$ of rational functions. The covering ${\pi_{L \leftarrow M}(w) = w^n}$ relates the two coordinates ${z,w}$ by the relation ${z = w^n}$, at which point one sees that the rational functions ${{\bf C}(w)}$ on ${L}$ are a degree ${n}$ extension of that of ${{\bf C}(z)}$ (formed by adjoining the ${n^{th}}$ root of unity ${w}$ to ${z}$). In this way we see that Example 5 is in fact closely related to Example 3.

Exercise 6 What happens if one uses meromorphic functions in place of rational functions in the above example? (To answer this question, I found it convenient to use a discrete Fourier transform associated to the multiplicative action of the ${n^{th}}$ roots of unity on ${M}$ to decompose the meromorphic functions on ${M}$ as a linear combination of functions invariant under this action, times a power ${w^j}$ of the coordinate ${w}$ for ${j=0,\dots,n-1}$.)

I was curious however about the reverse direction. Starting with some field extensions ${k \hookrightarrow F \hookrightarrow E}$, is it is possible to create manifold like spaces ${M_k \leftarrow M_F \leftarrow M_E}$ associated to these fields in such a fashion that (say) ${M_E}$ behaves like a “covering space” to ${M_k}$ with a group ${\mathrm{Aut}(M_E/M_k)}$ of deck transformations isomorphic to ${\mathrm{Aut}(E/k)}$, so that the Galois correspondences agree? Also, given how the notion of a path (and associated concepts such as loops, monodromy and the fundamental group) play a prominent role in the theory of covering spaces, can spaces such as ${M_k}$ or ${M_E}$ also come with a notion of a path that is somehow compatible with the Galois correspondence?

The standard answer from modern algebraic geometry (as articulated for instance in this nice MathOverflow answer by Minhyong Kim) is to set ${M_E}$ equal to the spectrum ${\mathrm{Spec}(E)}$ of the field ${E}$. As a set, the spectrum ${\mathrm{Spec}(R)}$ of a commutative ring ${R}$ is defined as the set of prime ideals of ${R}$. Generally speaking, the map ${R \mapsto \mathrm{Spec}(R)}$ that maps a commutative ring to its spectrum tends to act like an inverse of the operation that maps a space ${X}$ to a ring of functions on that space. For instance, if one considers the commutative ring ${{\bf C}[z, z^{-1}]}$ of regular functions on ${M = {\bf C}^\times}$, then each point ${z_0}$ in ${M}$ gives rise to the prime ideal ${\{ f \in {\bf C}[z, z^{-1}]: f(z_0)=0\}}$, and one can check that these are the only such prime ideals (other than the zero ideal ${(0)}$), giving an almost one-to-one correspondence between ${\mathrm{Spec}( {\bf C}[z,z^{-1}] )}$ and ${M}$. (The zero ideal corresponds instead to the generic point of ${M}$.)

Of course, the spectrum of a field such as ${E}$ is just a point, as the zero ideal ${(0)}$ is the only prime ideal. Naively, it would then seem that there is not enough space inside such a point to support a rich enough structure of paths to recover the Galois theory of this field. In modern algebraic geometry, one addresses this issue by considering not just the set-theoretic elements of ${E}$, but more general “base points” ${p: \mathrm{Spec}(b) \rightarrow \mathrm{Spec}(E)}$ that map from some other (affine) scheme ${\mathrm{Spec}(b)}$ to ${\mathrm{Spec}(E)}$ (one could also consider non-affine base points of course). One has to rework many of the fundamentals of the subject to accommodate this “relative point of view“, for instance replacing the usual notion of topology with an étale topology, but once one does so one obtains a very satisfactory theory.

As an exercise, I set myself the task of trying to interpret Galois theory as an analogue of covering space theory in a more classical fashion, without explicit reference to more modern concepts such as schemes, spectra, or étale topology. After some experimentation, I found a reasonably satisfactory way to do so as follows. The space ${M_E}$ that one associates with ${E}$ in this classical perspective is not the single point ${\mathrm{Spec}(E)}$, but instead the much larger space consisting of ring homomorphisms ${p: E \rightarrow b}$ from ${E}$ to arbitrary integral domains ${b}$; informally, ${M_E}$ consists of all the “models” or “representations” of ${E}$ (in the spirit of this previous blog post). (There is a technical set-theoretic issue here because the class of integral domains ${R}$ is a proper class, so that ${M_E}$ will also be a proper class; I will completely ignore such technicalities in this post.) We view each such homomorphism ${p: E \rightarrow b}$ as a single point in ${M_E}$. The analogous notion of a path from one point ${p: E \rightarrow b}$ to another ${p': E \rightarrow b'}$ is then a homomorphism ${\gamma: b \rightarrow b'}$ of integral domains, such that ${p'}$ is the composition of ${p}$ with ${\gamma}$. Note that every prime ideal ${I}$ in the spectrum ${\mathrm{Spec}(R)}$ of a commutative ring ${R}$ gives rise to a point ${p_I}$ in the space ${M_R}$ defined here, namely the quotient map ${p_I: R \rightarrow R/I}$ to the ring ${R/I}$, which is an integral domain because ${I}$ is prime. So one can think of ${\mathrm{Spec}(R)}$ as being a distinguished subset of ${M_R}$; alternatively, one can think of ${M_R}$ as a sort of “penumbra” surrounding ${\mathrm{Spec}(R)}$. In particular, when ${E}$ is a field, ${\mathrm{Spec}(E) = \{(0)\}}$ defines a special point ${p_R}$ in ${M_R}$, namely the identity homomorphism ${p_R: R \rightarrow R}$.

Below the fold I would like to record this interpretation of Galois theory, by first revisiting the theory of covering spaces using paths as the basic building block, and then adapting that theory to the theory of field extensions using the spaces indicated above. This is not too far from the usual scheme-theoretic way of phrasing the connection between the two topics (basically I have replaced étale-type points ${p: \mathrm{Spec}(b) \rightarrow \mathrm{Spec}(E)}$ with more classical points ${p: E \rightarrow b}$), but I had not seen it explicitly articulated before, so I am recording it here for my own benefit and for any other readers who may be interested.

Much as group theory is the study of groups, or graph theory is the study of graphs, model theory is the study of models (also known as structures) of some language ${{\mathcal L}}$ (which, in this post, will always be a single-sorted, first-order language). A structure is a set ${X}$, equipped with one or more operations, constants, and relations. This is of course an extremely general type of mathematical object, but (quite remarkably) one can still say a substantial number of interesting things about very broad classes of structures.

We will observe the common abuse of notation of using the set ${X}$ as a metonym for the entire structure, much as we usually refer to a group ${(G,1,\cdot,()^{-1})}$ simply as ${G}$, a vector space ${(V, 0, +, \cdot)}$ simply as ${V}$, and so forth. Following another common bending of the rules, we also allow some operations on structures (such as the multiplicative inverse operation on a group or field) to only be partially defined, and we allow use of the usual simplifying conventions for mathematical formulas (e.g. writing ${a+b+c}$ instead of ${(a+b)+c}$ or ${a+(b+c)}$, in cases where associativity is known). We will also deviate slightly from the usual practice in logic by emphasising individual structures, rather than the theory of general classes of structures; for instance, we will talk about the theory of a single field such as ${{\bf R}}$ or ${{\bf C}}$, rather than the theory of all fields of a certain type (e.g. real closed fields or algebraically closed fields).

Once one has a structure ${X}$, one can introduce the notion of a definable subset of ${X}$, or more generally of a Cartesian power ${X^n}$ of ${X}$, defined as a set ${E \subset X^n}$ of the form

$\displaystyle E = \{ (x_1,\ldots,x_n): P(x_1,\ldots,x_n) \hbox{ true} \} \ \ \ \ \ (1)$

for some formula ${P}$ in the language ${{\mathcal L}}$ with ${n}$ free variables and any number of constants from ${X}$ (that is, ${P(x_1,\ldots,x_n)}$ is a well-formed formula built up from a finite number of constants ${c_1,\ldots,c_m}$ in ${X}$, the relations and operations on ${X}$, logical connectives such as ${\neg}$, ${\wedge}$, ${\implies}$, and the quantifiers ${\forall, \exists}$). Thus, for instance, in the theory of the arithmetic of the natural numbers ${{\bf N} = ({\bf N}, 0, 1, +, \times)}$, the set of primes ${{\mathcal P}}$ is a definable set, since we have

$\displaystyle {\mathcal P} = \{ x \in {\bf N}: (\exists y: x=y+2) \wedge \neg (\exists z,w: x = (z+2)(w+2)) \}.$

In the theory of the field of reals ${{\bf R} = ({\bf R}, 0, 1, +, -, \times, ()^{-1})}$, the unit circle ${S^1}$ is an example of a definable set,

$\displaystyle S^1 = \{ (x,y) \in {\bf R}^2: x^2+y^2 = 1 \},$

but so is the the complement of the circle,

$\displaystyle {\bf R}^2 \backslash S^1 = \{ (x,y) \in {\bf R}^2: \neg(x^2+y^2 = 1) \}$

and the interval ${[-1,1]}$:

$\displaystyle [-1,1] = \{ x \in {\bf R}: \exists y: x^2+y^2 = 1\}.$

Due to the unlimited use of constants, any finite subset of a power ${X^n}$ of any structure ${X}$ is, by our conventions, definable in that structure. (One can of course also consider definability without parameters (also known as ${0}$-definability), in which arbitrary constants are not permitted, but we will not do so here.)

We can isolate some special subclasses of definable sets:

• An atomic definable set is a set of the form (1) in which ${P()}$ is an atomic formula (i.e. it does not contain any logical connectives or quantifiers).
• A quantifier-free definable set is a set of the form (1) in which ${P()}$ is quantifier-free (i.e. it can contain logical connectives, but does not contain the quantifiers ${\forall, \exists}$).

Example 1 In the theory of a field such as ${{\bf R}}$, an atomic definable set is the same thing as an affine algebraic set (also known as an affine algebraic variety, with the understanding that varieties are not necessarily assumed to be irreducible), and a quantifier-free definable set is known as a constructible set; thus we see that algebraic geometry can be viewed in some sense as a special case of model theory. (Conversely, it can in fact be quite profitable to think of model theory as an abstraction of algebraic geometry; for instance, the concepts of Morley rank and Morley degree in model theory (discussed in this previous blog post) directly generalises the concepts of dimension and degree in algebraic geometry.) Over ${{\bf R}}$, the interval ${[-1,1]}$ is a definable set, but not a quantifier-free definable set (and certainly not an atomic definable set); and similarly for the primes over ${{\bf N}}$.

A quantifier-free definable set in ${X^n}$ is nothing more than a finite boolean combination of atomic definable sets; in other words, the class of quantifier-free definable sets over ${X}$ is the smallest class that contains the atomic definable sets and is closed under boolean operations such as complementation and union (which generate all the other boolean operations). Similarly, the class of definable sets over ${X}$ is the smallest class that contains the quantifier-free definable sets, and is also closed under the operation of projection ${\pi_n: E \mapsto \pi_n(E)}$ from ${X^{n+1}}$ to ${X^n}$ for every natural number ${n}$, where ${\pi_n: X^{n+1} \rightarrow X^n}$ is the map ${\pi_n(x_1,\ldots,x_n,x_{n+1}) := (x_1,\ldots,x_n)}$.

Some structures have the property of enjoying quantifier elimination, which means that every definable set is in fact a quantifier-free definable set, or equivalently that the projection of a quantifier-free definable set is again quantifier-free. For instance, an algebraically closed field ${k}$ (with the field operations) has quantifier elimination (i.e. the projection of a constructible set is again constructible); this fact can be proven by the classical tool of resultants, and among other things can be used to give a proof of Hilbert’s nullstellensatz. (Note though that projection does not necessary preserve the property of being atomic; for instance, the projection of the atomic set ${\{ (x,y) \in k^2: xy=1 \}}$ is the non-atomic, but still quantifier-free definable, set ${\{ x \in k: \neg (k=0) \}}$.) In the converse direction, it is not difficult to use the nullstellensatz to deduce quantifier elimination. For theory of the real field ${{\bf R}}$, which is not algebraically closed, one does not have quantifier elimination, as one can see from the example of the unit circle (which is a quantifier-free definable set) projecting down to the interval ${[-1,1]}$ (which is definable, but not quantifer-free definable). However, if one adds the additional operation of order ${<}$ to the reals, giving it the language of an ordered field rather than just a field, then quantifier elimination is recovered (the class of quantifier-free definable sets now enlarges to match the class of definable sets, which in this case is also the class of semi-algebraic sets); this is the famous Tarski-Seidenberg theorem.

On the other hand, many important structures do not have quantifier elimination; typically, the projection of a quantifier-free definable set is not, in general, quantifier-free definable. This failure of the projection property also shows up in many contexts outside of model theory; for instance, Lebesgue famously made the error of thinking that the projection of a Borel measurable set remained Borel measurable (it is merely an analytic set instead). Turing’s halting theorem can be viewed as an assertion that the projection of a decidable set (also known as a computable or recursive set) is not necessarily decidable (it is merely semi-decidable (or recursively enumerable) instead). The notorious P=NP problem can also be essentially viewed in this spirit; roughly speaking (and glossing over the placement of some quantifiers), it asks whether the projection of a polynomial-time decidable set is again polynomial-time decidable. And so forth. (See this blog post of Dick Lipton for further discussion of the subtleties of projections.)

Now we consider the status of quantifier elimination for the theory of a finite field ${F}$. If interpreted naively, quantifier elimination is trivial for a finite field ${F}$, since every subset of ${F^n}$ is finite and thus quantifier-free definable. However, we can recover an interesting question in one of two (essentially equivalent) ways. One is to work in the asymptotic regime in which the field ${F}$ is large, but the length of the formulae used to construct one’s definable sets stays bounded uniformly in the size of ${F}$ (where we view any constant in ${F}$ as contributing a unit amount to the length of a formula, no matter how large ${F}$ is). A simple counting argument then shows that only a small number of subsets of ${F^n}$ become definable in the asymptotic limit ${|F| \rightarrow \infty}$, since the number of definable sets clearly grows at most polynomially in ${|F|}$ for any fixed bound on the formula length, while the number of all subsets of ${|F|^n}$ grows exponentially in ${n}$.

Another way to proceed is to work not with a single finite field ${F}$, or even with a sequence ${F_m}$ of finite fields, but with the ultraproduct ${F = \prod_{m \rightarrow p} F_m}$ of a sequence of finite fields, and to study the properties of definable sets over this ultraproduct. (We will be using the notation of ultraproducts and nonstandard analysis from this previous blog post.) This approach is equivalent to the more finitary approach mentioned in the previous paragraph, at least if one does not care to track of the exact bounds on the length of the formulae involved. Indeed, thanks to Los’s theorem, a definable subset ${E}$ of ${F^n}$ is nothing more than the ultraproduct ${E = \prod_{m \rightarrow p} E_m}$ of definable subsets ${E_m}$ of ${F_m^n}$ for all ${m}$ sufficiently close to ${p}$, with the length of the formulae used to define ${E_m}$ uniformly bounded in ${m}$. In the language of nonstandard analysis, one can view ${F}$ as a nonstandard finite field.

The ultraproduct ${F}$ of finite fields is an important example of a pseudo-finite field – a field that obeys all the sentences in the languages of fields that finite fields do, but is not necessarily itself a finite field. The model theory of pseudo-finite fields was first studied systematically by Ax (in the same paper where the Ax-Grothendieck theorem, discussed previously on this blog, was established), with important further contributions by Kiefe, by Fried-Sacerdote, by two papers of Chatzidakis-van den Dries-Macintyre, and many other authors.

As mentioned before, quantifier elimination trivially holds for finite fields. But for infinite pseudo-finite fields, such as the ultraproduct ${F = \prod_{m \rightarrow p} F_m}$ of finite fields with ${|F_m|}$ going to infinity, quantifier elimination fails. For instance, in a finite field ${F_m}$, the set ${E_m := \{ x \in F_m: \exists y \in F_m: x=y^2 \}}$ of quadratic residues is a definable set, with a bounded formula length, and so in the ultraproduct ${F =\prod_{m \rightarrow p} F_m}$, the set ${E := \prod_{m\rightarrow p} E_m}$ of nonstandard quadratic residues is also a definable set. However, in one dimension, we see from the factor theorem that the only atomic definable sets are either finite or the whole field ${F}$, and so the only constructible sets (i.e. the only quantifier-free definable sets) are either finite or cofinite in ${F}$. Since the quadratic residues have asymptotic density ${1/2}$ in a large finite field, they cannot form a quantifier-free definable set, despite being definable.

Nevertheless, there is a very nice almost quantifier elimination result for these fields, in characteristic zero at least, which we phrase here as follows:

Theorem 1 (Almost quantifier elimination) Let ${F}$ be a nonstandard finite field of characteristic zero, and let ${E \subset F^n}$ be a definable set over ${F}$. Then ${E}$ is the union of finitely many sets of the form

$\displaystyle E = \{ x \in V(F): \exists t \in F: P(x,t) = 0 \} \ \ \ \ \ (2)$

where ${V(F)}$ is an atomic definable subset of ${F^n}$ (i.e. the ${F}$-points of an algebraic variety ${V}$ defined over ${F}$ in ${F^n}$) and ${P: F^{n+1} \rightarrow F}$ is a polynomial.

Results of this type were first obtained essentially due to Catarina Kiefe, although the formulation here is closer to that of Chatzidakis-van den Dries-Macintyre.

Informally, this theorem says that while we cannot quite eliminate all quantifiers from a definable set over a nonstandard finite field, we can eliminate all but one existential quantifier. Note that negation has also been eliminated in this theorem; for instance, the definable set ${F \backslash \{0\} = \{ x \in F: \neg(x=0) \}}$ uses a negation, but can also be described using a single existential quantifier as ${\{ x \in F: \exists t: xt = 1 \}}$.) I believe that there are more complicated analogues of this result in positive characteristic, but I have not studied this case in detail (Kiefe’s result does not assume characteristic zero, but her conclusion is slightly different from the one given here). In the one-dimensional case ${n=1}$, the only varieties ${V}$ are the affine line and finite sets, and we can simplify the above statement, namely that any definable subset of ${F}$ takes the form ${\{ x\in F: \exists t \in F: P(x,t) = 0 \}}$ for some polynomial ${P}$ (i.e. definable sets in ${F}$ are nothing more than the projections of the ${F}$-points of a plane curve).

There is an equivalent formulation of this theorem for standard finite fields, namely that if ${F}$ is a finite field and ${E \subset F^n}$ is definable using a formula of length at most ${M}$, then ${E}$ can be expressed in the form (2) with the degree of ${P}$ bounded by some quantity ${C_{M,n}}$ depending on ${M}$ and ${n}$, assuming that the characteristic of ${F}$ is sufficiently large depending on ${M, n}$.

The theorem gives quite a satisfactory description of definable sets in either standard or nonstandard finite fields (at least if one does not care about effective bounds in some of the constants, and if one is willing to exclude the small characteristic case); for instance, in conjunction with the Lang-Weil bound discussed in this recent blog post, it shows that any non-empty definable subset of a nonstandard finite field has a nonstandard cardinality of ${(\alpha + O(|F|^{-1/2})) |F|^d}$ for some positive standard rational ${\alpha}$ and integer ${d}$. Equivalently, any non-empty definable subset of ${F^n}$ for some standard finite field ${F}$ using a formula of length at most ${M}$ has a standard cardinality of ${(\alpha + O_{M,n}(|F|^{-1/2})) |F|^d}$ for some positive rational of height ${O_{M,n}(1)}$ and some natural number ${d}$ between ${0}$ and ${n}$. (For instance, in the example of the quadratic residues given above, ${d}$ is equal to ${1}$ and ${\alpha}$ equal to ${1/2}$.) There is a more precise statement to this effect, namely that the Poincaré series of a definable set is rational; see Kiefe’s paper for details.

Below the fold I give a proof of Theorem 1, which relies primarily on the Lang-Weil bound mentioned above.

One of the most well known problems from ancient Greek mathematics was that of trisecting an angle by straightedge and compass, which was eventually proven impossible in 1837 by Pierre Wantzel, using methods from Galois theory.

Formally, one can set up the problem as follows. Define a configuration to be a finite collection ${{\mathcal C}}$ of points, lines, and circles in the Euclidean plane. Define a construction step to be one of the following operations to enlarge the collection ${{\mathcal C}}$:

• (Straightedge) Given two distinct points ${A, B}$ in ${{\mathcal C}}$, form the line ${\overline{AB}}$ that connects ${A}$ and ${B}$, and add it to ${{\mathcal C}}$.
• (Compass) Given two distinct points ${A, B}$ in ${{\mathcal C}}$, and given a third point ${O}$ in ${{\mathcal C}}$ (which may or may not equal ${A}$ or ${B}$), form the circle with centre ${O}$ and radius equal to the length ${|AB|}$ of the line segment joining ${A}$ and ${B}$, and add it to ${{\mathcal C}}$.
• (Intersection) Given two distinct curves ${\gamma, \gamma'}$ in ${{\mathcal C}}$ (thus ${\gamma}$ is either a line or a circle in ${{\mathcal C}}$, and similarly for ${\gamma'}$), select a point ${P}$ that is common to both ${\gamma}$ and ${\gamma'}$ (there are at most two such points), and add it to ${{\mathcal C}}$.

We say that a point, line, or circle is constructible by straightedge and compass from a configuration ${{\mathcal C}}$ if it can be obtained from ${{\mathcal C}}$ after applying a finite number of construction steps.

Problem 1 (Angle trisection) Let ${A, B, C}$ be distinct points in the plane. Is it always possible to construct by straightedge and compass from ${A,B,C}$ a line ${\ell}$ through ${A}$ that trisects the angle ${\angle BAC}$, in the sense that the angle between ${\ell}$ and ${BA}$ is one third of the angle of ${\angle BAC}$?

Thanks to Wantzel’s result, the answer to this problem is known to be “no” in general; a generic angle ${\angle BAC}$ cannot be trisected by straightedge and compass. (On the other hand, some special angles can certainly be trisected by straightedge and compass, such as a right angle. Also, one can certainly trisect generic angles using other methods than straightedge and compass; see the Wikipedia page on angle trisection for some examples of this.)

The impossibility of angle trisection stands in sharp contrast to the easy construction of angle bisection via straightedge and compass, which we briefly review as follows:

1. Start with three points ${A, B, C}$.
2. Form the circle ${c_0}$ with centre ${A}$ and radius ${AB}$, and intersect it with the line ${\overline{AC}}$. Let ${D}$ be the point in this intersection that lies on the same side of ${A}$ as ${C}$. (${D}$ may well be equal to ${C}$).
3. Form the circle ${c_1}$ with centre ${B}$ and radius ${AB}$, and the circle ${c_2}$ with centre ${D}$ and radius ${AB}$. Let ${E}$ be the point of intersection of ${c_1}$ and ${c_2}$ that is not ${A}$.
4. The line ${\ell := \overline{AE}}$ will then bisect the angle ${\angle BAC}$.

The key difference between angle trisection and angle bisection ultimately boils down to the following trivial number-theoretic fact:

Lemma 2 There is no power of ${2}$ that is evenly divisible by ${3}$.

Proof: Obvious by modular arithmetic, by induction, or by the fundamental theorem of arithmetic. $\Box$

In contrast, there are of course plenty of powers of ${2}$ that are evenly divisible by ${2}$, and this is ultimately why angle bisection is easy while angle trisection is hard.

The standard way in which Lemma 2 is used to demonstrate the impossibility of angle trisection is via Galois theory. The implication is quite short if one knows this theory, but quite opaque otherwise. We briefly sketch the proof of this implication here, though we will not need it in the rest of the discussion. Firstly, Lemma 2 implies the following fact about field extensions.

Corollary 3 Let ${F}$ be a field, and let ${E}$ be an extension of ${F}$ that can be constructed out of ${F}$ by a finite sequence of quadratic extensions. Then ${E}$ does not contain any cubic extensions ${K}$ of ${F}$.

Proof: If $E$ contained a cubic extension $K$ of $F$, then the dimension of $E$ over $F$ would be a multiple of three. On the other hand, if $E$ is obtained from $F$ by a tower of quadratic extensions, then the dimension of $E$ over $F$ is a power of two. The claim then follows from Lemma 2. $\Box$

To conclude the proof, one then notes that any point, line, or circle that can be constructed from a configuration ${{\mathcal C}}$ is definable in a field obtained from the coefficients of all the objects in ${{\mathcal C}}$ after taking a finite number of quadratic extensions, whereas a trisection of an angle ${\angle ABC}$ will generically only be definable in a cubic extension of the field generated by the coordinates of ${A,B,C}$.

The Galois theory method also allows one to obtain many other impossibility results of this type, most famously the Abel-Ruffini theorem on the insolvability of the quintic equation by radicals. For this reason (and also because of the many applications of Galois theory to number theory and other branches of mathematics), the Galois theory argument is the “right” way to prove the impossibility of angle trisection within the broader framework of modern mathematics. However, this argument has the drawback that it requires one to first understand Galois theory (or at least field theory), which is usually not presented until an advanced undergraduate algebra or number theory course, whilst the angle trisection problem requires only high-school level mathematics to formulate. Even if one is allowed to “cheat” and sweep several technicalities under the rug, one still needs to possess a fair amount of solid intuition about advanced algebra in order to appreciate the proof. (This was undoubtedly one reason why, even after Wantzel’s impossibility result was published, a large amount of effort was still expended by amateur mathematicians to try to trisect a general angle.)

In this post I would therefore like to present a different proof (or perhaps more accurately, a disguised version of the standard proof) of the impossibility of angle trisection by straightedge and compass, that avoids explicit mention of Galois theory (though it is never far beneath the surface). With “cheats”, the proof is actually quite simple and geometric (except for Lemma 2, which is still used at a crucial juncture), based on the basic geometric concept of monodromy; unfortunately, some technical work is needed however to remove these cheats.

To describe the intuitive idea of the proof, let us return to the angle bisection construction, that takes a triple ${A, B, C}$ of points as input and returns a bisecting line ${\ell}$ as output. We iterate the construction to create a quadrisecting line ${m}$, via the following sequence of steps that extend the original bisection construction:

1. Start with three points ${A, B, C}$.
2. Form the circle ${c_0}$ with centre ${A}$ and radius ${AB}$, and intersect it with the line ${\overline{AC}}$. Let ${D}$ be the point in this intersection that lies on the same side of ${A}$ as ${C}$. (${D}$ may well be equal to ${C}$).
3. Form the circle ${c_1}$ with centre ${B}$ and radius ${AB}$, and the circle ${c_2}$ with centre ${D}$ and radius ${AB}$. Let ${E}$ be the point of intersection of ${c_1}$ and ${c_2}$ that is not ${A}$.
4. Let ${F}$ be the point on the line ${\ell := \overline{AE}}$ which lies on ${c_0}$, and is on the same side of ${A}$ as ${E}$.
5. Form the circle ${c_3}$ with centre ${F}$ and radius ${AB}$. Let ${G}$ be the point of intersection of ${c_1}$ and ${c_3}$ that is not ${A}$.
6. The line ${m := \overline{AG}}$ will then quadrisect the angle ${\angle BAC}$.

Let us fix the points ${A}$ and ${B}$, but not ${C}$, and view ${m}$ (as well as intermediate objects such as ${D}$, ${c_2}$, ${E}$, ${\ell}$, ${F}$, ${c_3}$, ${G}$) as a function of ${C}$.

Let us now do the following: we begin rotating ${C}$ counterclockwise around ${A}$, which drags around the other objects ${D}$, ${c_2}$, ${E}$, ${\ell}$, ${F}$, ${c_3}$, ${G}$ that were constructed by ${C}$ accordingly. For instance, here is an early stage of this rotation process, when the angle ${\angle BAC}$ has become obtuse:

Now for the slightly tricky bit. We are going to keep rotating ${C}$ beyond a half-rotation of ${180^\circ}$, so that ${\angle BAC}$ now becomes a reflex angle. At this point, a singularity occurs; the point ${E}$ collides into ${A}$, and so there is an instant in which the line ${\ell = \overline{AE}}$ is not well-defined. However, this turns out to be a removable singularity (and the easiest way to demonstrate this will be to tap the power of complex analysis, as complex numbers can easily route around such a singularity), and we can blast through it to the other side, giving a picture like this:

Note that we have now deviated from the original construction in that ${F}$ and ${E}$ are no longer on the same side of ${A}$; we are thus now working in a continuation of that construction rather than with the construction itself. Nevertheless, we can still work with this continuation (much as, say, one works with analytic continuations of infinite series such as ${\sum_{n=1}^\infty \frac{1}{n^s}}$ beyond their original domain of definition).

We now keep rotating ${C}$ around ${A}$. Here, ${\angle BAC}$ is approaching a full rotation of ${360^\circ}$:

When ${\angle BAC}$ reaches a full rotation, a different singularity occurs: ${c_1}$ and ${c_2}$ coincide. Nevertheless, this is also a removable singularity, and we blast through to beyond a full rotation:

And now ${C}$ is back where it started, as are ${D}$, ${c_2}$, ${E}$, and ${\ell}$… but the point ${F}$ has moved, from one intersection point of ${\ell \cap c_3}$ to the other. As a consequence, ${c_3}$, ${G}$, and ${m}$ have also changed, with ${m}$ being at right angles to where it was before. (In the jargon of modern mathematics, the quadrisection construction has a non-trivial monodromy.)

But nothing stops us from rotating ${C}$ some more. If we continue this procedure, we see that after two full rotations of ${C}$ around ${A}$, all points, lines, and circles constructed from ${A, B, C}$ have returned to their original positions. Because of this, we shall say that the quadrisection construction described above is periodic with period ${2}$.

Similarly, if one performs an octisection of the angle ${\angle BAC}$ by bisecting the quadrisection, one can verify that this octisection is periodic with period ${4}$; it takes four full rotations of ${C}$ around ${A}$ before the configuration returns to where it started. More generally, one can show

Proposition 4 Any construction of straightedge and compass from the points ${A,B,C}$ is periodic with period equal to a power of ${2}$.

The reason for this, ultimately, is because any two circles or lines will intersect each other in at most two points, and so at each step of a straightedge-and-compass construction there is an ambiguity of at most ${2! = 2}$. Each rotation of ${C}$ around ${A}$ can potentially flip one of these points to the other, but then if one rotates again, the point returns to its original position, and then one can analyse the next point in the construction in the same fashion until one obtains the proposition.

But now consider a putative trisection operation, that starts with an arbitrary angle ${\angle BAC}$ and somehow uses some sequence of straightedge and compass constructions to end up with a trisecting line ${\ell}$:

What is the period of this construction? If we continuously rotate ${C}$ around ${A}$, we observe that a full rotations of ${C}$ only causes the trisecting line ${\ell}$ to rotate by a third of a full rotation (i.e. by ${120^\circ}$):

Because of this, we see that the period of any construction that contains ${\ell}$ must be a multiple of ${3}$. But this contradicts Proposition 4 and Lemma 2.

Below the fold, I will make the above proof rigorous. Unfortunately, in doing so, I had to again leave the world of high-school mathematics, as one needs a little bit of algebraic geometry and complex analysis to resolve the issues with singularities that we saw in the above sketch. Still, I feel that at an intuitive level at least, this argument is more geometric and accessible than the Galois-theoretic argument (though anyone familiar with Galois theory will note that there is really not that much difference between the proofs, ultimately, as one has simply replaced the Galois group with a closely related monodromy group instead).

In the last few weeks, the Great Internet Mersenne Prime Search (GIMPS) announced the discovery of two new Mersenne primes, both over ten million digits in length, including one discovered by the computing team right here at UCLA (see this page for more information).  [I was not involved in this computing effort.]  As for the question “Why do we want to find such big primes anyway?”, see this page, though this is not the focus of my post today.

The GIMPS approach to finding Mersenne primes relies of course on modern computing power, parallelisation, and efficient programming, but the number-theoretic heart of it – aside from some basic optimisation tricks such as fast multiplication and preliminary sieving to eliminate some obviously non-prime Mersenne number candidates – is the Lucas-Lehmer primality test for Mersenne numbers, which is much faster for this special type of number than any known general-purpose (deterministic) primality test (such as, say, the AKS test).  This test is easy enough to describe, and I will do so later in this post, and also has some short elementary proofs of correctness; but the proofs are sometimes presented in a way that involves pulling a lot of rabbits out of hats, giving the argument a magical feel rather than a natural one.  In this post, I will try to explain the basic ideas that make the primality test work, seeking a proof which is perhaps less elementary and a little longer than some of the proofs in the literature, but is perhaps a bit better motivated.