You are currently browsing the tag archive for the ‘Galois theory’ tag.

Much as group theory is the study of groups, or graph theory is the study of graphs, model theory is the study of models (also known as structures) of some language {{\mathcal L}} (which, in this post, will always be a single-sorted, first-order language). A structure is a set {X}, equipped with one or more operations, constants, and relations. This is of course an extremely general type of mathematical object, but (quite remarkably) one can still say a substantial number of interesting things about very broad classes of structures.

We will observe the common abuse of notation of using the set {X} as a metonym for the entire structure, much as we usually refer to a group {(G,1,\cdot,()^{-1})} simply as {G}, a vector space {(V, 0, +, \cdot)} simply as {V}, and so forth. Following another common bending of the rules, we also allow some operations on structures (such as the multiplicative inverse operation on a group or field) to only be partially defined, and we allow use of the usual simplifying conventions for mathematical formulas (e.g. writing {a+b+c} instead of {(a+b)+c} or {a+(b+c)}, in cases where associativity is known). We will also deviate slightly from the usual practice in logic by emphasising individual structures, rather than the theory of general classes of structures; for instance, we will talk about the theory of a single field such as {{\bf R}} or {{\bf C}}, rather than the theory of all fields of a certain type (e.g. real closed fields or algebraically closed fields).

Once one has a structure {X}, one can introduce the notion of a definable subset of {X}, or more generally of a Cartesian power {X^n} of {X}, defined as a set {E \subset X^n} of the form

\displaystyle  E = \{ (x_1,\ldots,x_n): P(x_1,\ldots,x_n) \hbox{ true} \} \ \ \ \ \ (1)

for some formula {P} in the language {{\mathcal L}} with {n} free variables and any number of constants from {X} (that is, {P(x_1,\ldots,x_n)} is a well-formed formula built up from a finite number of constants {c_1,\ldots,c_m} in {X}, the relations and operations on {X}, logical connectives such as {\neg}, {\wedge}, {\implies}, and the quantifiers {\forall, \exists}). Thus, for instance, in the theory of the arithmetic of the natural numbers {{\bf N} = ({\bf N}, 0, 1, +, \times)}, the set of primes {{\mathcal P}} is a definable set, since we have

\displaystyle  {\mathcal P} = \{ x \in {\bf N}: (\exists y: x=y+2) \wedge \neg (\exists z,w: x = (z+2)(w+2)) \}.

In the theory of the field of reals {{\bf R} = ({\bf R}, 0, 1, +, -, \times, ()^{-1})}, the unit circle {S^1} is an example of a definable set,

\displaystyle  S^1 = \{ (x,y) \in {\bf R}^2: x^2+y^2 = 1 \},

but so is the the complement of the circle,

\displaystyle  {\bf R}^2 \backslash S^1 = \{ (x,y) \in {\bf R}^2: \neg(x^2+y^2 = 1) \}

and the interval {[-1,1]}:

\displaystyle  [-1,1] = \{ x \in {\bf R}: \exists y: x^2+y^2 = 1\}.

Due to the unlimited use of constants, any finite subset of a power {X^n} of any structure {X} is, by our conventions, definable in that structure. (One can of course also consider definability without parameters (also known as {0}-definability), in which arbitrary constants are not permitted, but we will not do so here.)

We can isolate some special subclasses of definable sets:

  • An atomic definable set is a set of the form (1) in which {P()} is an atomic formula (i.e. it does not contain any logical connectives or quantifiers).
  • A quantifier-free definable set is a set of the form (1) in which {P()} is quantifier-free (i.e. it can contain logical connectives, but does not contain the quantifiers {\forall, \exists}).

Example 1 In the theory of a field such as {{\bf R}}, an atomic definable set is the same thing as an affine algebraic set (also known as an affine algebraic variety, with the understanding that varieties are not necessarily assumed to be irreducible), and a quantifier-free definable set is known as a constructible set; thus we see that algebraic geometry can be viewed in some sense as a special case of model theory. (Conversely, it can in fact be quite profitable to think of model theory as an abstraction of algebraic geometry; for instance, the concepts of Morley rank and Morley degree in model theory (discussed in this previous blog post) directly generalises the concepts of dimension and degree in algebraic geometry.) Over {{\bf R}}, the interval {[-1,1]} is a definable set, but not a quantifier-free definable set (and certainly not an atomic definable set); and similarly for the primes over {{\bf N}}.

A quantifier-free definable set in {X^n} is nothing more than a finite boolean combination of atomic definable sets; in other words, the class of quantifier-free definable sets over {X} is the smallest class that contains the atomic definable sets and is closed under boolean operations such as complementation and union (which generate all the other boolean operations). Similarly, the class of definable sets over {X} is the smallest class that contains the quantifier-free definable sets, and is also closed under the operation of projection {\pi_n: E \mapsto \pi_n(E)} from {X^{n+1}} to {X^n} for every natural number {n}, where {\pi_n: X^{n+1} \rightarrow X^n} is the map {\pi_n(x_1,\ldots,x_n,x_{n+1}) := (x_1,\ldots,x_n)}.

Some structures have the property of enjoying quantifier elimination, which means that every definable set is in fact a quantifier-free definable set, or equivalently that the projection of a quantifier-free definable set is again quantifier-free. For instance, an algebraically closed field {k} (with the field operations) has quantifier elimination (i.e. the projection of a constructible set is again constructible); this fact can be proven by the classical tool of resultants, and among other things can be used to give a proof of Hilbert’s nullstellensatz. (Note though that projection does not necessary preserve the property of being atomic; for instance, the projection of the atomic set {\{ (x,y) \in k^2: xy=1 \}} is the non-atomic, but still quantifier-free definable, set {\{ x \in k: \neg (k=0) \}}.) In the converse direction, it is not difficult to use the nullstellensatz to deduce quantifier elimination. For theory of the real field {{\bf R}}, which is not algebraically closed, one does not have quantifier elimination, as one can see from the example of the unit circle (which is a quantifier-free definable set) projecting down to the interval {[-1,1]} (which is definable, but not quantifer-free definable). However, if one adds the additional operation of order {<} to the reals, giving it the language of an ordered field rather than just a field, then quantifier elimination is recovered (the class of quantifier-free definable sets now enlarges to match the class of definable sets, which in this case is also the class of semi-algebraic sets); this is the famous Tarski-Seidenberg theorem.

On the other hand, many important structures do not have quantifier elimination; typically, the projection of a quantifier-free definable set is not, in general, quantifier-free definable. This failure of the projection property also shows up in many contexts outside of model theory; for instance, Lebesgue famously made the error of thinking that the projection of a Borel measurable set remained Borel measurable (it is merely an analytic set instead). Turing’s halting theorem can be viewed as an assertion that the projection of a decidable set (also known as a computable or recursive set) is not necessarily decidable (it is merely semi-decidable (or recursively enumerable) instead). The notorious P=NP problem can also be essentially viewed in this spirit; roughly speaking (and glossing over the placement of some quantifiers), it asks whether the projection of a polynomial-time decidable set is again polynomial-time decidable. And so forth. (See this blog post of Dick Lipton for further discussion of the subtleties of projections.)

Now we consider the status of quantifier elimination for the theory of a finite field {F}. If interpreted naively, quantifier elimination is trivial for a finite field {F}, since every subset of {F^n} is finite and thus quantifier-free definable. However, we can recover an interesting question in one of two (essentially equivalent) ways. One is to work in the asymptotic regime in which the field {F} is large, but the length of the formulae used to construct one’s definable sets stays bounded uniformly in the size of {F} (where we view any constant in {F} as contributing a unit amount to the length of a formula, no matter how large {F} is). A simple counting argument then shows that only a small number of subsets of {F^n} become definable in the asymptotic limit {|F| \rightarrow \infty}, since the number of definable sets clearly grows at most polynomially in {|F|} for any fixed bound on the formula length, while the number of all subsets of {|F|^n} grows exponentially in {n}.

Another way to proceed is to work not with a single finite field {F}, or even with a sequence {F_m} of finite fields, but with the ultraproduct {F = \prod_{m \rightarrow p} F_m} of a sequence of finite fields, and to study the properties of definable sets over this ultraproduct. (We will be using the notation of ultraproducts and nonstandard analysis from this previous blog post.) This approach is equivalent to the more finitary approach mentioned in the previous paragraph, at least if one does not care to track of the exact bounds on the length of the formulae involved. Indeed, thanks to Los’s theorem, a definable subset {E} of {F^n} is nothing more than the ultraproduct {E = \prod_{m \rightarrow p} E_m} of definable subsets {E_m} of {F_m^n} for all {m} sufficiently close to {p}, with the length of the formulae used to define {E_m} uniformly bounded in {m}. In the language of nonstandard analysis, one can view {F} as a nonstandard finite field.

The ultraproduct {F} of finite fields is an important example of a pseudo-finite field – a field that obeys all the sentences in the languages of fields that finite fields do, but is not necessarily itself a finite field. The model theory of pseudo-finite fields was first studied systematically by Ax (in the same paper where the Ax-Grothendieck theorem, discussed previously on this blog, was established), with important further contributions by Kiefe, by Fried-Sacerdote, by two papers of Chatzidakis-van den Dries-Macintyre, and many other authors.

As mentioned before, quantifier elimination trivially holds for finite fields. But for infinite pseudo-finite fields, such as the ultraproduct {F = \prod_{m \rightarrow p} F_m} of finite fields with {|F_m|} going to infinity, quantifier elimination fails. For instance, in a finite field {F_m}, the set {E_m := \{ x \in F_m: \exists y \in F_m: x=y^2 \}} of quadratic residues is a definable set, with a bounded formula length, and so in the ultraproduct {F =\prod_{m \rightarrow p} F_m}, the set {E := \prod_{m\rightarrow p} E_m} of nonstandard quadratic residues is also a definable set. However, in one dimension, we see from the factor theorem that the only atomic definable sets are either finite or the whole field {F}, and so the only constructible sets (i.e. the only quantifier-free definable sets) are either finite or cofinite in {F}. Since the quadratic residues have asymptotic density {1/2} in a large finite field, they cannot form a quantifier-free definable set, despite being definable.

Nevertheless, there is a very nice almost quantifier elimination result for these fields, in characteristic zero at least, which we phrase here as follows:

Theorem 1 (Almost quantifier elimination) Let {F} be a nonstandard finite field of characteristic zero, and let {E \subset F^n} be a definable set over {F}. Then {E} is the union of finitely many sets of the form

\displaystyle  E = \{ x \in V(F): \exists t \in F: P(x,t) = 0 \} \ \ \ \ \ (2)

where {V(F)} is an atomic definable subset of {F^n} (i.e. the {F}-points of an algebraic variety {V} defined over {F} in {F^n}) and {P: F^{n+1} \rightarrow F} is a polynomial.

Results of this type were first obtained essentially due to Catarina Kiefe, although the formulation here is closer to that of Chatzidakis-van den Dries-Macintyre.

Informally, this theorem says that while we cannot quite eliminate all quantifiers from a definable set over a nonstandard finite field, we can eliminate all but one existential quantifier. Note that negation has also been eliminated in this theorem; for instance, the definable set {F \backslash \{0\} = \{ x \in F: \neg(x=0) \}} uses a negation, but can also be described using a single existential quantifier as {\{ x \in F: \exists t: xt = 1 \}}.) I believe that there are more complicated analogues of this result in positive characteristic, but I have not studied this case in detail (Kiefe’s result does not assume characteristic zero, but her conclusion is slightly different from the one given here). In the one-dimensional case {n=1}, the only varieties {V} are the affine line and finite sets, and we can simplify the above statement, namely that any definable subset of {F} takes the form {\{ x\in F: \exists t \in F: P(x,t) = 0 \}} for some polynomial {P} (i.e. definable sets in {F} are nothing more than the projections of the {F}-points of a plane curve).

There is an equivalent formulation of this theorem for standard finite fields, namely that if {F} is a finite field and {E \subset F^n} is definable using a formula of length at most {M}, then {E} can be expressed in the form (2) with the degree of {P} bounded by some quantity {C_{M,n}} depending on {M} and {n}, assuming that the characteristic of {F} is sufficiently large depending on {M, n}.

The theorem gives quite a satisfactory description of definable sets in either standard or nonstandard finite fields (at least if one does not care about effective bounds in some of the constants, and if one is willing to exclude the small characteristic case); for instance, in conjunction with the Lang-Weil bound discussed in this recent blog post, it shows that any non-empty definable subset of a nonstandard finite field has a nonstandard cardinality of {(\alpha + O(|F|^{-1/2})) |F|^d} for some positive standard rational {\alpha} and integer {d}. Equivalently, any non-empty definable subset of {F^n} for some standard finite field {F} using a formula of length at most {M} has a standard cardinality of {(\alpha + O_{M,n}(|F|^{-1/2})) |F|^d} for some positive rational of height {O_{M,n}(1)} and some natural number {d} between {0} and {n}. (For instance, in the example of the quadratic residues given above, {d} is equal to {1} and {\alpha} equal to {1/2}.) There is a more precise statement to this effect, namely that the Poincaré series of a definable set is rational; see Kiefe’s paper for details.

Below the fold I give a proof of Theorem 1, which relies primarily on the Lang-Weil bound mentioned above.

Read the rest of this entry »

One of the most well known problems from ancient Greek mathematics was that of trisecting an angle by straightedge and compass, which was eventually proven impossible in 1837 by Pierre Wantzel, using methods from Galois theory.

Formally, one can set up the problem as follows. Define a configuration to be a finite collection {{\mathcal C}} of points, lines, and circles in the Euclidean plane. Define a construction step to be one of the following operations to enlarge the collection {{\mathcal C}}:

  • (Straightedge) Given two distinct points {A, B} in {{\mathcal C}}, form the line {\overline{AB}} that connects {A} and {B}, and add it to {{\mathcal C}}.
  • (Compass) Given two distinct points {A, B} in {{\mathcal C}}, and given a third point {O} in {{\mathcal C}} (which may or may not equal {A} or {B}), form the circle with centre {O} and radius equal to the length {|AB|} of the line segment joining {A} and {B}, and add it to {{\mathcal C}}.
  • (Intersection) Given two distinct curves {\gamma, \gamma'} in {{\mathcal C}} (thus {\gamma} is either a line or a circle in {{\mathcal C}}, and similarly for {\gamma'}), select a point {P} that is common to both {\gamma} and {\gamma'} (there are at most two such points), and add it to {{\mathcal C}}.

We say that a point, line, or circle is constructible by straightedge and compass from a configuration {{\mathcal C}} if it can be obtained from {{\mathcal C}} after applying a finite number of construction steps.

Problem 1 (Angle trisection) Let {A, B, C} be distinct points in the plane. Is it always possible to construct by straightedge and compass from {A,B,C} a line {\ell} through {A} that trisects the angle {\angle BAC}, in the sense that the angle between {\ell} and {BA} is one third of the angle of {\angle BAC}?

Thanks to Wantzel’s result, the answer to this problem is known to be “no” in general; a generic angle {\angle BAC} cannot be trisected by straightedge and compass. (On the other hand, some special angles can certainly be trisected by straightedge and compass, such as a right angle. Also, one can certainly trisect generic angles using other methods than straightedge and compass; see the Wikipedia page on angle trisection for some examples of this.)

The impossibility of angle trisection stands in sharp contrast to the easy construction of angle bisection via straightedge and compass, which we briefly review as follows:

  1. Start with three points {A, B, C}.
  2. Form the circle {c_0} with centre {A} and radius {AB}, and intersect it with the line {\overline{AC}}. Let {D} be the point in this intersection that lies on the same side of {A} as {C}. ({D} may well be equal to {C}).
  3. Form the circle {c_1} with centre {B} and radius {AB}, and the circle {c_2} with centre {D} and radius {AB}. Let {E} be the point of intersection of {c_1} and {c_2} that is not {A}.
  4. The line {\ell := \overline{AE}} will then bisect the angle {\angle BAC}.

The key difference between angle trisection and angle bisection ultimately boils down to the following trivial number-theoretic fact:

Lemma 2 There is no power of {2} that is evenly divisible by {3}.

Proof: Obvious by modular arithmetic, by induction, or by the fundamental theorem of arithmetic. \Box

In contrast, there are of course plenty of powers of {2} that are evenly divisible by {2}, and this is ultimately why angle bisection is easy while angle trisection is hard.

The standard way in which Lemma 2 is used to demonstrate the impossibility of angle trisection is via Galois theory. The implication is quite short if one knows this theory, but quite opaque otherwise. We briefly sketch the proof of this implication here, though we will not need it in the rest of the discussion. Firstly, Lemma 2 implies the following fact about field extensions.

Corollary 3 Let {F} be a field, and let {E} be an extension of {F} that can be constructed out of {F} by a finite sequence of quadratic extensions. Then {E} does not contain any cubic extensions {K} of {F}.

Proof: If E contained a cubic extension K of F, then the dimension of E over F would be a multiple of three. On the other hand, if E is obtained from F by a tower of quadratic extensions, then the dimension of E over F is a power of two. The claim then follows from Lemma 2. \Box

To conclude the proof, one then notes that any point, line, or circle that can be constructed from a configuration {{\mathcal C}} is definable in a field obtained from the coefficients of all the objects in {{\mathcal C}} after taking a finite number of quadratic extensions, whereas a trisection of an angle {\angle ABC} will generically only be definable in a cubic extension of the field generated by the coordinates of {A,B,C}.

The Galois theory method also allows one to obtain many other impossibility results of this type, most famously the Abel-Ruffini theorem on the insolvability of the quintic equation by radicals. For this reason (and also because of the many applications of Galois theory to number theory and other branches of mathematics), the Galois theory argument is the “right” way to prove the impossibility of angle trisection within the broader framework of modern mathematics. However, this argument has the drawback that it requires one to first understand Galois theory (or at least field theory), which is usually not presented until an advanced undergraduate algebra or number theory course, whilst the angle trisection problem requires only high-school level mathematics to formulate. Even if one is allowed to “cheat” and sweep several technicalities under the rug, one still needs to possess a fair amount of solid intuition about advanced algebra in order to appreciate the proof. (This was undoubtedly be one reason why, even after Wantzel’s impossibility result was published, a large amount of effort was still expended by amateur mathematicians to try to trisect a general angle.)

In this post I would therefore like to present a different proof (or perhaps more accurately, a disguised version of the standard proof) of the impossibility of angle trisection by straightedge and compass, that avoids explicit mention of Galois theory (though it is never far beneath the surface). With “cheats”, the proof is actually quite simple and geometric (except for Lemma 2, which is still used at a crucial juncture), based on the basic geometric concept of monodromy; unfortunately, some technical work is needed however to remove these cheats.

To describe the intuitive idea of the proof, let us return to the angle bisection construction, that takes a triple {A, B, C} of points as input and returns a bisecting line {\ell} as output. We iterate the construction to create a quadrisecting line {m}, via the following sequence of steps that extend the original bisection construction:

  1. Start with three points {A, B, C}.
  2. Form the circle {c_0} with centre {A} and radius {AB}, and intersect it with the line {\overline{AC}}. Let {D} be the point in this intersection that lies on the same side of {A} as {C}. ({D} may well be equal to {C}).
  3. Form the circle {c_1} with centre {B} and radius {AB}, and the circle {c_2} with centre {D} and radius {AB}. Let {E} be the point of intersection of {c_1} and {c_2} that is not {A}.
  4. Let {F} be the point on the line {\ell := \overline{AE}} which lies on {c_0}, and is on the same side of {A} as {E}.
  5. Form the circle {c_3} with centre {F} and radius {AB}. Let {G} be the point of intersection of {c_1} and {c_3} that is not {A}.
  6. The line {m := \overline{AG}} will then quadrisect the angle {\angle BAC}.

Let us fix the points {A} and {B}, but not {C}, and view {m} (as well as intermediate objects such as {D}, {c_2}, {E}, {\ell}, {F}, {c_3}, {G}) as a function of {C}.

Let us now do the following: we begin rotating {C} counterclockwise around {A}, which drags around the other objects {D}, {c_2}, {E}, {\ell}, {F}, {c_3}, {G} that were constructed by {C} accordingly. For instance, here is an early stage of this rotation process, when the angle {\angle BAC} has become obtuse:

Now for the slightly tricky bit. We are going to keep rotating {C} beyond a half-rotation of {180^\circ}, so that {\angle BAC} now becomes a reflex angle. At this point, a singularity occurs; the point {E} collides into {A}, and so there is an instant in which the line {\ell = \overline{AE}} is not well-defined. However, this turns out to be a removable singularity (and the easiest way to demonstrate this will be to tap the power of complex analysis, as complex numbers can easily route around such a singularity), and we can blast through it to the other side, giving a picture like this:

Note that we have now deviated from the original construction in that {F} and {E} are no longer on the same side of {A}; we are thus now working in a continuation of that construction rather than with the construction itself. Nevertheless, we can still work with this continuation (much as, say, one works with analytic continuations of infinite series such as {\sum_{n=1}^\infty \frac{1}{n^s}} beyond their original domain of definition).

We now keep rotating {C} around {A}. Here, {\angle BAC} is approaching a full rotation of {360^\circ}:

When {\angle BAC} reaches a full rotation, a different singularity occurs: {c_1} and {c_2} coincide. Nevertheless, this is also a removable singularity, and we blast through to beyond a full rotation:

And now {C} is back where it started, as are {D}, {c_2}, {E}, and {\ell}… but the point {F} has moved, from one intersection point of {\ell \cap c_3} to the other. As a consequence, {c_3}, {G}, and {m} have also changed, with {m} being at right angles to where it was before. (In the jargon of modern mathematics, the quadrisection construction has a non-trivial monodromy.)

But nothing stops us from rotating {C} some more. If we continue this procedure, we see that after two full rotations of {C} around {A}, all points, lines, and circles constructed from {A, B, C} have returned to their original positions. Because of this, we shall say that the quadrisection construction described above is periodic with period {2}.

Similarly, if one performs an octisection of the angle {\angle BAC} by bisecting the quadrisection, one can verify that this octisection is periodic with period {4}; it takes four full rotations of {C} around {A} before the configuration returns to where it started. More generally, one can show

Proposition 4 Any construction of straightedge and compass from the points {A,B,C} is periodic with period equal to a power of {2}.

The reason for this, ultimately, is because any two circles or lines will intersect each other in at most two points, and so at each step of a straightedge-and-compass construction there is an ambiguity of at most {2! = 2}. Each rotation of {C} around {A} can potentially flip one of these points to the other, but then if one rotates again, the point returns to its original position, and then one can analyse the next point in the construction in the same fashion until one obtains the proposition.

But now consider a putative trisection operation, that starts with an arbitrary angle {\angle BAC} and somehow uses some sequence of straightedge and compass constructions to end up with a trisecting line {\ell}:

What is the period of this construction? If we continuously rotate {C} around {A}, we observe that a full rotations of {C} only causes the trisecting line {\ell} to rotate by a third of a full rotation (i.e. by {120^\circ}):

Because of this, we see that the period of any construction that contains {\ell} must be a multiple of {3}. But this contradicts Proposition 4 and Lemma 2.

Below the fold, I will make the above proof rigorous. Unfortunately, in doing so, I had to again leave the world of high-school mathematics, as one needs a little bit of algebraic geometry and complex analysis to resolve the issues with singularities that we saw in the above sketch. Still, I feel that at an intuitive level at least, this argument is more geometric and accessible than the Galois-theoretic argument (though anyone familiar with Galois theory will note that there is really not that much difference between the proofs, ultimately, as one has simply replaced the Galois group with a closely related monodromy group instead).

Read the rest of this entry »

In the last few weeks, the Great Internet Mersenne Prime Search (GIMPS) announced the discovery of two new Mersenne primes, both over ten million digits in length, including one discovered by the computing team right here at UCLA (see this page for more information).  [I was not involved in this computing effort.]  As for the question “Why do we want to find such big primes anyway?”, see this page, though this is not the focus of my post today.

The GIMPS approach to finding Mersenne primes relies of course on modern computing power, parallelisation, and efficient programming, but the number-theoretic heart of it – aside from some basic optimisation tricks such as fast multiplication and preliminary sieving to eliminate some obviously non-prime Mersenne number candidates – is the Lucas-Lehmer primality test for Mersenne numbers, which is much faster for this special type of number than any known general-purpose (deterministic) primality test (such as, say, the AKS test).  This test is easy enough to describe, and I will do so later in this post, and also has some short elementary proofs of correctness; but the proofs are sometimes presented in a way that involves pulling a lot of rabbits out of hats, giving the argument a magical feel rather than a natural one.  In this post, I will try to explain the basic ideas that make the primality test work, seeking a proof which is perhaps less elementary and a little longer than some of the proofs in the literature, but is perhaps a bit better motivated.

Read the rest of this entry »


RSS Google+ feed

  • An error has occurred; the feed is probably down. Try again later.

Get every new post delivered to your Inbox.

Join 3,307 other followers