Analytic number theory is only one of many different approaches to number theory. Another important branch of the subject is algebraic number theory, which studies algebraic structures (e.g. groups, rings, and fields) of number-theoretic interest. With this perspective, the classical field of rationals {{\bf Q}}, and the classical ring of integers {{\bf Z}}, are placed inside the much larger field {\overline{{\bf Q}}} of algebraic numbers, and the much larger ring {{\mathcal A}} of algebraic integers, respectively. Recall that an algebraic number is a root of a polynomial with integer coefficients, and an algebraic integer is a root of a monic polynomial with integer coefficients; thus for instance {\sqrt{2}} is an algebraic integer (a root of {x^2-2}), while {\sqrt{2}/2} is merely an algebraic number (a root of {4x^2-2}). For the purposes of this post, we will adopt the concrete (but somewhat artificial) perspective of viewing algebraic numbers and integers as lying inside the complex numbers {{\bf C}}, thus {{\mathcal A} \subset \overline{{\bf Q}} \subset {\bf C}}. (From a modern algebraic perspective, it is better to think of {\overline{{\bf Q}}} as existing as an abstract field separate from {{\bf C}}, but which has a number of embeddings into {{\bf C}} (as well as into other fields, such as the completed p-adics {{\bf C}_p}), no one of which should be considered favoured over any other; cf. this mathOverflow post. But for the rudimentary algebraic number theory in this post, we will not need to work at this level of abstraction.) In particular, we identify the algebraic integer {\sqrt{-d}} with the complex number {\sqrt{d} i} for any natural number {d}.

Exercise 1 Show that the field of algebraic numbers {\overline{{\bf Q}}} is indeed a field, and that the ring of algebraic integers {{\mathcal A}} is indeed a ring, and is in fact an integral domain. Also, show that {{\bf Z} = {\mathcal A} \cap {\bf Q}}, that is to say the ordinary integers are precisely the algebraic integers that are also rational. Because of this, we will sometimes refer to elements of {{\bf Z}} as rational integers.

In practice, the field {\overline{{\bf Q}}} is too big to conveniently work with directly, having infinite dimension (as a vector space) over {{\bf Q}}. Thus, algebraic number theory generally restricts attention to intermediate fields {{\bf Q} \subset F \subset \overline{{\bf Q}}} between {{\bf Q}} and {\overline{{\bf Q}}}, which are of finite dimension over {{\bf Q}}; that is to say, finite degree extensions of {{\bf Q}}. Such fields are known as algebraic number fields, or number fields for short. Apart from {{\bf Q}} itself, the simplest examples of such number fields are the quadratic fields, which have dimension exactly two over {{\bf Q}}.

Exercise 2 Show that if {\alpha} is a rational number that is not a perfect square, then the field {{\bf Q}(\sqrt{\alpha})} generated by {{\bf Q}} and either of the square roots of {\alpha} is a quadratic field. Conversely, show that all quadratic fields arise in this fashion. (Hint: show that every element of a quadratic field is a root of a quadratic polynomial over the rationals.)

The ring of algebraic integers {{\mathcal A}} is similarly too large to conveniently work with directly, so in algebraic number theory one usually works with the rings {{\mathcal O}_F := {\mathcal A} \cap F} of algebraic integers inside a given number field {F}. One can (and does) study this situation in great generality, but for the purposes of this post we shall restrict attention to a simple but illustrative special case, namely the quadratic fields with a certain type of negative discriminant. (The positive discriminant case will be briefly discussed in Remark 42 below.)

Exercise 3 Let {d} be a square-free natural number with {d=1\ (4)} or {d=2\ (4)}. Show that the ring {{\mathcal O} = {\mathcal O}_{{\bf Q}(\sqrt{-d})}} of algebraic integers in {{\bf Q}(\sqrt{-d})} is given by

\displaystyle  {\mathcal O} = {\bf Z}[\sqrt{-d}] = \{ a + b \sqrt{-d}: a,b \in {\bf Z} \}.

If instead {d} is square-free with {d=3\ (4)}, show that the ring {{\mathcal O} = {\mathcal O}_{{\bf Q}(\sqrt{-d})}} is instead given by

\displaystyle  {\mathcal O} = {\bf Z}[\frac{1+\sqrt{-d}}{2}] = \{ a + b \frac{1+\sqrt{-d}}{2}: a,b \in {\bf Z} \}.

What happens if {d} is not square-free, or negative?

Remark 4 In the case {d=3\ (4)}, it may naively appear more natural to work with the ring {{\bf Z}[\sqrt{-d}]}, which is an index two subring of {{\mathcal O}}. However, because this ring only captures some of the algebraic integers in {{\bf Q}(\sqrt{-d})} rather than all of them, its algebraic properties are somewhat worse than those of {{\mathcal O}} (in particular, rings of this type generally fail to be Dedekind domains), and so it is not convenient to work with in algebraic number theory.

We refer to fields of the form {{\bf Q}(\sqrt{-d})} for natural square-free numbers {d} as quadratic fields of negative discriminant, and similarly refer to {{\mathcal O}_{{\bf Q}(\sqrt{-d})}} as a ring of quadratic integers of negative discriminant. Quadratic fields and quadratic integers of positive discriminant are just as important to analytic number theory as their negative discriminant counterparts, but we will restrict attention to the latter here for simplicity of discussion.

Thus, for instance, when {d=1}, the ring of integers in {{\bf Q}(\sqrt{-1})} is the ring of Gaussian integers

\displaystyle  {\bf Z}[\sqrt{-1}] = \{ x + y \sqrt{-1}: x,y \in {\bf Z} \}

and when {d=3}, the ring of integers in {{\bf Q}(\sqrt{-3})} is the ring of Eisenstein integers

\displaystyle  {\bf Z}[\omega] := \{ x + y \omega: x,y \in {\bf Z} \}

where {\omega := e^{2\pi i /3}} is a cube root of unity.

As these examples illustrate, the additive structure of a ring {{\mathcal O} = {\mathcal O}_{{\bf Q}(\sqrt{-d})}} of quadratic integers is that of a two-dimensional lattice in {{\bf C}}, which is isomorphic as an additive group to {{\bf Z}^2}. Thus, from an additive viewpoint, one can view quadratic integers as “two-dimensional” analogues of rational integers. From a multiplicative viewpoint, however, the quadratic integers (and more generally, integers in a number field) behave very similarly to the rational integers (as opposed to being some sort of “higher-dimensional” version of such integers). Indeed, a large part of basic algebraic number theory is devoted to treating the multiplicative theory of integers in number fields in a unified fashion, that naturally generalises the classical multiplicative theory of the rational integers.

For instance, every rational integer {n \in {\bf Z}} has an absolute value {|n| \in {\bf N} \cup \{0\}}, with the multiplicativity property {|nm| = |n| |m|} for {n,m \in {\bf Z}}, and the positivity property {|n| > 0} for all {n \neq 0}. Among other things, the absolute value detects units: {|n| = 1} if and only if {n} is a unit in {{\bf Z}} (that is to say, it is multiplicatively invertible in {{\bf Z}}). Similarly, in any ring of quadratic integers {{\mathcal O} = {\mathcal O}_{{\bf Q}(\sqrt{-d})}} with negative discriminant, we can assign a norm {N(n) \in {\bf N} \cup \{0\}} to any quadratic integer {n \in {\mathcal O}_{{\bf Q}(\sqrt{-d})}} by the formula

\displaystyle  N(n) = n \overline{n}

where {\overline{n}} is the complex conjugate of {n}. (When working with other number fields than quadratic fields of negative discriminant, one instead defines {N(n)} to be the product of all the Galois conjugates of {n}.) Thus for instance, when {d=1,2\ (4)} one has

\displaystyle  N(x + y \sqrt{-d}) = x^2 + dy^2 \ \ \ \ \ (1)

and when {d=3\ (4)} one has

\displaystyle  N(x + y \frac{1+\sqrt{-d}}{2}) = x^2 + xy + \frac{d+1}{4} y^2. \ \ \ \ \ (2)

Analogously to the rational integers, we have the multiplicativity property {N(nm) = N(n) N(m)} for {n,m \in {\mathcal O}} and the positivity property {N(n) > 0} for {n \neq 0}, and the units in {{\mathcal O}} are precisely the elements of norm one.
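The norm formulae (1), (2) and the claims of the previous paragraph are easy to test numerically. The following is a minimal sketch, not a proof: it stores a quadratic integer as its coordinate pair {(x,y)} with respect to the basis {1, \sqrt{-d}} (or {1, \frac{1+\sqrt{-d}}{2}} when {d=3\ (4)}), and the function names `norm`, `mul`, `units` are hypothetical names chosen for illustration. It assumes that {d} is a square-free natural number.

```python
def norm(x, y, d):
    """Norm of x + y*sqrt(-d) if d = 1, 2 mod 4, or of x + y*(1 + sqrt(-d))/2 if d = 3 mod 4."""
    if d % 4 == 3:
        return x * x + x * y + (d + 1) // 4 * y * y   # formula (2)
    return x * x + d * y * y                          # formula (1)


def mul(u, v, d):
    """Product of two quadratic integers given by their coordinate pairs."""
    (x1, y1), (x2, y2) = u, v
    if d % 4 == 3:
        # theta = (1 + sqrt(-d))/2 satisfies theta^2 = theta - (d + 1)/4
        c = (d + 1) // 4
        return (x1 * x2 - c * y1 * y2, x1 * y2 + x2 * y1 + y1 * y2)
    return (x1 * x2 - d * y1 * y2, x1 * y2 + x2 * y1)


def units(d, box=2):
    """Coordinates of the norm-one elements; |x|, |y| <= 2 suffices for every d >= 1."""
    rng = range(-box, box + 1)
    return [(x, y) for x in rng for y in rng if norm(x, y, d) == 1]


for d in (1, 2, 3, 5, 7):
    print(d, len(units(d)))   # number of norm-one elements for each d
```

The search box in `units` is justified by the fact that both norm forms are positive definite, so that a norm-one element has small coordinates.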

Exercise 5 Establish the three claims of the previous paragraph. Conclude that the units (invertible elements) of {{\mathcal O}} consist of the four elements {\pm 1, \pm i} if {d=1}, the six elements {\pm 1, \pm \omega, \pm \omega^2} if {d=3}, and the two elements {\pm 1} if {d \neq 1,3}.

For the rational integers, we of course have the fundamental theorem of arithmetic, which asserts that every non-zero rational integer can be uniquely factored (up to permutation and units) as the product of irreducible integers, that is to say non-zero, non-unit integers that cannot be factored into the product of integers of strictly smaller norm. As it turns out, the same claim is true for a few additional rings of quadratic integers, such as the Gaussian integers and Eisenstein integers, but fails in general; for instance, in the ring {{\bf Z}[\sqrt{-5}]}, we have the famous counterexample

\displaystyle  6 = 2 \times 3 = (1+\sqrt{-5}) (1-\sqrt{-5})

that decomposes {6} non-uniquely into the product of irreducibles in {{\bf Z}[\sqrt{-5}]}. Nevertheless, it is an important fact that the fundamental theorem of arithmetic can be salvaged if one uses an “idealised” notion of a number in a ring of integers {{\mathcal O}}, now known in modern language as an ideal of that ring. For instance, in {{\bf Z}[\sqrt{-5}]}, the principal ideal {(6)} turns out to uniquely factor into the product of (non-principal) ideals {(2) + (1+\sqrt{-5}), (2) + (1-\sqrt{-5}), (3) + (1+\sqrt{-5}), (3) + (1-\sqrt{-5})}; see Exercise 27. We will review the basic theory of ideals in number fields (focusing primarily on quadratic fields of negative discriminant) below the fold.
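As a sanity check on this counterexample, one can verify by machine that both products equal {6}, and that no element of {{\bf Z}[\sqrt{-5}]} has norm {2} or {3}. Since {2, 3, 1 \pm \sqrt{-5}} have norms {4, 9, 6, 6}, any non-trivial factor of one of them would need norm {2} or {3}, so all four are irreducible. The sketch below uses hypothetical helper names (`norm5`, `mul5`, `norm_is_attained`) and represents {x + y\sqrt{-5}} as the pair {(x,y)}.

```python
from math import isqrt


def norm5(x, y):
    """Norm of x + y*sqrt(-5)."""
    return x * x + 5 * y * y


def mul5(u, v):
    """Product in Z[sqrt(-5)] of two elements given as coordinate pairs."""
    (x1, y1), (x2, y2) = u, v
    return (x1 * x2 - 5 * y1 * y2, x1 * y2 + x2 * y1)


def norm_is_attained(n):
    """True if some x + y*sqrt(-5) has norm n (signs do not matter, so take x, y >= 0)."""
    return any(norm5(x, y) == n
               for x in range(isqrt(n) + 1)
               for y in range(isqrt(n // 5) + 1))


print(mul5((2, 0), (3, 0)), mul5((1, 1), (1, -1)))   # both products represent 6
print(norm_is_attained(2), norm_is_attained(3))      # no elements of norm 2 or 3
```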

The norm forms (1), (2) can be viewed as examples of positive definite quadratic forms {Q: {\bf Z}^2 \rightarrow {\bf Z}} over the integers, by which we mean a polynomial of the form

\displaystyle  Q(x,y) = ax^2 + bxy + cy^2

for some integer coefficients {a,b,c}. One can declare two quadratic forms {Q, Q': {\bf Z}^2 \rightarrow {\bf Z}} to be equivalent if one can transform one to the other by an invertible linear transformation {T: {\bf Z}^2 \rightarrow {\bf Z}^2}, so that {Q' = Q \circ T}. For example, the quadratic forms {(x,y) \mapsto x^2 + y^2} and {(x',y') \mapsto 2 (x')^2 + 2 x' y' + (y')^2} are equivalent, as can be seen by using the invertible linear transformation {(x,y) = (x',x'+y')}. Such equivalences correspond to the different choices of basis available when expressing a ring such as {{\mathcal O}} (or an ideal thereof) additively as a copy of {{\bf Z}^2}.

There is an important and classical invariant of a quadratic form {(x,y) \mapsto ax^2 + bxy + c y^2}, namely the discriminant {\Delta := b^2 - 4ac}, which will of course be familiar to most readers via the quadratic formula. Among other things, the quadratic formula tells us that a quadratic form with positive leading coefficient {a} will be positive definite precisely when its discriminant is negative. It is not difficult (particularly if one exploits the multiplicativity of the determinant of {2 \times 2} matrices) to show that two equivalent quadratic forms have the same discriminant. Thus for instance any quadratic form equivalent to (1) has discriminant {-4d}, while any quadratic form equivalent to (2) has discriminant {-d}. Thus we see that each ring {{\mathcal O}_{{\bf Q}(\sqrt{-d})}} of quadratic integers is associated with a certain negative discriminant {D}, defined to equal {-4d} when {d=1,2\ (4)} and {-d} when {d=3\ (4)}.
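The invariance of the discriminant under equivalence can be checked on examples by expanding {Q \circ T} explicitly. In the sketch below a form {ax^2+bxy+cy^2} is stored as the triple `(a, b, c)`, and a linear map is stored as `((p, q), (r, s))`, meaning {(x,y) = (px'+qy', rx'+sy')}; these conventions and the names `disc`, `transform` are assumptions made for illustration only.

```python
def disc(form):
    """Discriminant b^2 - 4ac of the form a x^2 + b x y + c y^2."""
    a, b, c = form
    return b * b - 4 * a * c


def transform(form, T):
    """Coefficients of Q∘T, where T sends (x', y') to (p x' + q y', r x' + s y')."""
    a, b, c = form
    (p, q), (r, s) = T
    # expand a(px'+qy')^2 + b(px'+qy')(rx'+sy') + c(rx'+sy')^2 and collect terms
    a2 = a * p * p + b * p * r + c * r * r
    b2 = 2 * a * p * q + b * (p * s + q * r) + 2 * c * r * s
    c2 = a * q * q + b * q * s + c * s * s
    return (a2, b2, c2)


# the example from the text: (x, y) = (x', x' + y') applied to x^2 + y^2
print(transform((1, 0, 1), ((1, 0), (1, 1))))
# a determinant-one change of variables applied to x^2 + 5y^2
Q2 = transform((1, 0, 5), ((2, 1), (1, 1)))
print(Q2, disc(Q2))
```

More generally, this computation gives {\Delta(Q \circ T) = \det(T)^2 \Delta(Q)}, and an invertible map {T: {\bf Z}^2 \rightarrow {\bf Z}^2} has determinant {\pm 1}.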

Exercise 6 (Geometric interpretation of discriminant) Let {Q: {\bf Z}^2 \rightarrow {\bf Z}} be a quadratic form of negative discriminant {D}, and extend it to a real form {Q: {\bf R}^2 \rightarrow {\bf R}} in the obvious fashion. Show that for any {X>0}, the set {\{ (x,y) \in {\bf R}^2: Q(x,y) \leq X \}} is an ellipse of area {2\pi X / \sqrt{|D|}}.

It is natural to ask the converse question: if two quadratic forms have the same discriminant, are they necessarily equivalent? For certain choices of discriminant, this is the case:

Exercise 7 Show that any quadratic form {ax^2+bxy+cy^2} of discriminant {-4} is equivalent to the form {x^2+y^2}, and any quadratic form of discriminant {-3} is equivalent to {x^2+xy+y^2}. (Hint: use elementary transformations to try to make {|b|} as small as possible, to the point where one only has to check a finite number of cases; this argument is due to Legendre.) More generally, show that for any negative discriminant {D}, there are only finitely many quadratic forms of that discriminant up to equivalence (a result first established by Gauss).

Unfortunately, for most choices of discriminant, the converse question fails; for instance, the quadratic forms {x^2+5y^2} and {2x^2+2xy+3y^2} both have discriminant {-20}, but are not equivalent (Exercise 38). This particular failure of equivalence turns out to be intimately related to the failure of unique factorisation in the ring {{\bf Z}[\sqrt{-5}]}.
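One way to explore these questions numerically is to enumerate reduced forms. The sketch below relies on a standard fact that is assumed here rather than proved in these notes: every positive definite form is equivalent (in the sense above, allowing determinant {\pm 1}) to exactly one form {ax^2+bxy+cy^2} with {0 \leq b \leq a \leq c}. For such a form one has {|D| = 4ac - b^2 \geq 3a^2}, which makes the search finite. The names `reduced_forms` and `represents` are hypothetical.

```python
def reduced_forms(D):
    """All (a, b, c) with b*b - 4*a*c == D and 0 <= b <= a <= c, for negative D."""
    assert D < 0 and D % 4 in (0, 1)
    forms = []
    a = 1
    while 3 * a * a <= -D:
        for b in range(0, a + 1):
            four_ac = b * b - D          # this must equal 4*a*c
            if four_ac % (4 * a) == 0:
                c = four_ac // (4 * a)
                if c >= a:
                    forms.append((a, b, c))
        a += 1
    return forms


def represents(form, n, box=5):
    """Search a small box for (x, y) with Q(x, y) == n; adequate only for small n."""
    a, b, c = form
    rng = range(-box, box + 1)
    return any(a * x * x + b * x * y + c * y * y == n for x in rng for y in rng)


for D in (-3, -4, -20):
    print(D, reduced_forms(D))
print(represents((1, 0, 5), 1), represents((2, 2, 3), 1))
```

The last line gives a direct reason for the inequivalence of the two forms of discriminant {-20}: equivalent forms take the same set of values, and {x^2+5y^2} takes the value {1} while {2x^2+2xy+3y^2} does not.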

It turns out that there is a fundamental connection between quadratic fields, equivalence classes of quadratic forms of a given discriminant, and real Dirichlet characters, thus connecting the material discussed above with the last section of the previous set of notes. Here is a typical instance of this connection:

Proposition 8 Let {\chi_4: {\bf N} \rightarrow {\bf R}} be the real non-principal Dirichlet character of modulus {4}, or more explicitly {\chi_4(n)} is equal to {+1} when {n = 1\ (4)}, {-1} when {n = 3\ (4)}, and {0} when {n = 0,2\ (4)}.

  • (i) For any natural number {n}, the number of Gaussian integers {m \in {\bf Z}[\sqrt{-1}]} with norm {N(m)=n} is equal to {4(1 * \chi_4)(n)}. Equivalently, the number of solutions to the equation {n = x^2+y^2} with {x,y \in{\bf Z}} is {4(1*\chi_4)(n)}. (Here, as in the previous post, the symbol {*} denotes Dirichlet convolution.)
  • (ii) For any natural number {n}, the number of Gaussian integers {m \in {\bf Z}[\sqrt{-1}]} that divide {n} (thus {n = dm} for some {d \in {\bf Z}[\sqrt{-1}]}) is {4(1*1*1*\mu\chi_4)(n)}.

We will prove this proposition later in these notes. We observe that as a special case of part (i) of this proposition, we recover the Fermat two-square theorem: an odd prime {p} is expressible as the sum of two squares if and only if {p = 1\ (4)}. This proposition should also be compared with the fact, used crucially in the previous post to prove Dirichlet’s theorem, that {1*\chi(n)} is non-negative for any {n}, and at least one when {n} is a square, for any quadratic character {\chi}.
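Part (i) of Proposition 8 is easy to test by brute force for small {n}. The following sketch compares a direct count of the solutions to {n = x^2+y^2} with {4(1*\chi_4)(n)}; the function names are hypothetical, and the check is of course no substitute for the proof.

```python
from math import isqrt


def chi4(n):
    """The non-principal Dirichlet character of modulus 4."""
    return (0, 1, 0, -1)[n % 4]


def one_conv_chi4(n):
    """The Dirichlet convolution (1 * chi4)(n), summing chi4 over the divisors of n."""
    return sum(chi4(d) for d in range(1, n + 1) if n % d == 0)


def r2(n):
    """Number of (x, y) in Z^2 with x^2 + y^2 == n."""
    count = 0
    for x in range(-isqrt(n), isqrt(n) + 1):
        rest = n - x * x
        y = isqrt(rest)
        if y * y == rest:
            count += 1 if y == 0 else 2   # y and -y are distinct unless y == 0
    return count


for n in (1, 2, 3, 5, 25, 65):
    print(n, r2(n), 4 * one_conv_chi4(n))
```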

As an illustration of the relevance of such connections to analytic number theory, let us now explicitly compute {L(1,\chi_4)}.

Corollary 9 {L(1,\chi_4) = \frac{\pi}{4}}.

This particular identity is also known as the Leibniz formula.

Proof: For a large number {x}, consider the quantity

\displaystyle  \sum_{n \in {\bf Z}[\sqrt{-1}]: N(n) \leq x} 1

which counts the Gaussian integers of norm at most {x}. On the one hand, this is the same as the number of lattice points of {{\bf Z}^2} in the disk {\{ (a,b) \in {\bf R}^2: a^2+b^2 \leq x \}} of radius {\sqrt{x}}. Placing a unit square centred at each such lattice point, we obtain a region which differs from the disk by a region contained in an annulus of area {O(\sqrt{x})}. As the area of the disk is {\pi x}, we conclude the Gauss bound

\displaystyle  \sum_{n \in {\bf Z}[\sqrt{-1}]: N(n) \leq x} 1 = \pi x + O(\sqrt{x}).

On the other hand, by Proposition 8(i) (and removing the {n=0} contribution), we see that

\displaystyle  \sum_{n \in {\bf Z}[\sqrt{-1}]: N(n) \leq x} 1 = 1 + 4 \sum_{n \leq x} 1 * \chi_4(n).

Now we use the Dirichlet hyperbola method to expand the right-hand side sum, first expressing

\displaystyle  \sum_{n \leq x} 1 * \chi_4(n) = \sum_{d \leq \sqrt{x}} \chi_4(d) \sum_{m \leq x/d} 1 + \sum_{m \leq \sqrt{x}} \sum_{d \leq x/m} \chi_4(d)

\displaystyle  - (\sum_{d \leq \sqrt{x}} \chi_4(d)) (\sum_{m \leq \sqrt{x}} 1)

and then using the bounds {\sum_{d \leq y} \chi_4(d) = O(1)}, {\sum_{m \leq y} 1 = y + O(1)}, {\sum_{d \leq \sqrt{x}} \frac{\chi_4(d)}{d} = L(1,\chi_4) + O(\frac{1}{\sqrt{x}})} from the previous set of notes to conclude that

\displaystyle  \sum_{n \leq x} 1 * \chi_4(n) = x L(1,\chi_4) + O(\sqrt{x}).

Comparing the two formulae for {\sum_{n \in {\bf Z}[\sqrt{-1}]: N(n) \leq x} 1} and sending {x \rightarrow \infty}, we obtain the claim. \Box
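The two sides of this argument can also be compared numerically. The sketch below (with hypothetical function names) counts lattice points in the disk, checks the exact identity coming from Proposition 8(i) in the form {\sum_{n \leq x} 1*\chi_4(n) = \sum_{d \leq x} \chi_4(d) \lfloor x/d \rfloor}, and evaluates partial sums of {L(1,\chi_4)} in floating point.

```python
from math import isqrt, pi


def chi4(n):
    """The non-principal Dirichlet character of modulus 4."""
    return (0, 1, 0, -1)[n % 4]


def gauss_count(x):
    """Number of (a, b) in Z^2 with a^2 + b^2 <= x, for an integer x >= 0."""
    r = isqrt(x)
    # for each a there are 2*floor(sqrt(x - a^2)) + 1 admissible values of b
    return sum(2 * isqrt(x - a * a) + 1 for a in range(-r, r + 1))


def summed_convolution(x):
    """Sum of (1 * chi4)(n) over 1 <= n <= x, computed as sum_d chi4(d) * floor(x/d)."""
    return sum(chi4(d) * (x // d) for d in range(1, x + 1))


def L1_partial(N):
    """Partial sum of L(1, chi4) = 1 - 1/3 + 1/5 - ... over n <= N."""
    return sum(chi4(n) / n for n in range(1, N + 1))


x = 10 ** 6
print(gauss_count(x) / x, pi)                        # ratio approaches pi
print(gauss_count(1000), 1 + 4 * summed_convolution(1000))
print(L1_partial(10 ** 4), pi / 4)
```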

Exercise 10 Give an alternate proof of Corollary 9 that relies on obtaining asymptotics for the Dirichlet series {\sum_{n \in {\bf N}} \frac{1 * \chi_4(n)}{n^s}} as {s \rightarrow 1^+}, rather than using the Dirichlet hyperbola method.

Exercise 11 Give a direct proof of Corollary 9 that does not use Proposition 8, instead using Taylor expansion of the complex logarithm {\log(1+z)}. (One can also use Taylor expansions of some other functions related to the complex logarithm here, such as the arctangent function.)

More generally, one can relate {L(1,\chi)} for a real Dirichlet character {\chi} with the number of inequivalent quadratic forms of a certain discriminant, via the famous class number formula; we will give a special case of this formula below the fold.

The material here is only a very rudimentary introduction to algebraic number theory, and is not essential to the rest of the course. A slightly expanded version of the material here, from the perspective of analytic number theory, may be found in Sections 5 and 6 of Davenport’s book. A more in-depth treatment of algebraic number theory may be found in a number of texts, e.g. Fröhlich and Taylor.

— 1. Ideals —

We begin by reviewing the notion of an ideal in an arbitrary commutative ring.

Definition 12 (Ideals) Let {R} be a commutative ring (in this set of notes, rings are understood to contain a multiplicative unit {1}). An ideal of {R} is an additive subgroup {I} of {R} with the property that {nm \in I} whenever {n \in I} and {m \in R}. Note that if {I} is an ideal, then the quotient {R/I} is well defined as a commutative ring. We write {n \mapsto n \hbox{ mod } I} for the quotient map from {R} to {R/I}, and write {n = m \hbox{ mod } I} if {n-m \in I}, or equivalently if {n \hbox{ mod } I} is equal to {m \hbox{ mod } I}.

An ideal is proper if it is not all of {R}. An ideal is principal if it is of the form {(n) := \{ nm: m \in R \}} for some {n \in R}, and non-zero if it is not the zero ideal {(0) = \{0\}}.

If {I, J} are ideals, then the intersection {I \cap J} is an ideal, as is the sum {I + J := \{ n+m: n \in I, m \in J \}}. The product set {\{ nm: n\in I, m \in J\}} need not be an ideal in general (it is not always closed under addition); however, we can define the product ideal {I \cdot J} to be the ideal generated by this product set (that is, the intersection of all the ideals containing this product set). One can then define powers {I^n} for any {n \geq 0} in the obvious fashion, with the convention that {I^0 = R}. We say that {I} divides {J} if {I \supset J}, thus for instance {I} divides {I \cdot J} for any ideals {I,J}. If {I \supsetneq J}, we say that {I} strictly divides {J}.

An ideal {I} is prime if it is proper, and it has the property that for any {n,m \in R} with {nm \in I}, one has at least one of {n \in I} or {m \in I} true. Equivalently, an ideal {I} is prime if the quotient ring {R/I} is an integral domain.

One can easily check in the rational integers {{\bf Z}} that product, divisibility, and primality correspond to their counterpart notions in the natural numbers. More precisely, if {n,m} are natural numbers, then {(n) \cdot (m) = (nm)}; {(n)} divides {(m)} if and only if {n} divides {m}; and {(n)} is prime if and only if {n} is prime. (But note that the zero ideal {(0)} is also prime, and can be viewed as a sort of “prime at infinity” from the perspective of scheme theory.) Also, if {n} is a natural number and {a,b} are integers, then {a=b \hbox{ mod } (n)} holds if and only if {a=b \hbox{ mod } n}. Thus we see that the above operations on ideals are quite compatible with their classical counterparts in {{\bf N}} or {{\bf Z}}. Also, the integers form a principal ideal domain, in that every ideal {I} is principal; indeed, if {I} is non-zero, it is generated by any of its non-zero elements of minimal absolute value. In particular, from the classical fundamental theorem of arithmetic we see that every non-zero ideal in {{\bf Z}} is uniquely factorisable (up to rearrangement) as the product of prime ideals.

Now we specialise to rings {{\mathcal O} = {\mathcal O}_{{\bf Q}(\sqrt{-d})}} of quadratic integers, where {d} is a squarefree natural number. These more general rings {{\mathcal O}} need no longer be principal ideal domains. For instance, {{\bf Z}[\sqrt{-5}]} contains the non-principal ideal {(2) + (1 + \sqrt{-5})}. Closely related to this is the breakdown of the fundamental theorem of arithmetic for quadratic integers (i.e. {{\mathcal O}} need not be a unique factorisation domain); for instance, {6 \in {\bf Z}[\sqrt{-5}]} factors non-uniquely as {6 = 2 \times 3} and {6 = (1+\sqrt{-5}) \times (1-\sqrt{-5})}. Despite this, one still has unique factorisation at the level of ideals; for instance, in {{\bf Z}[\sqrt{-5}]} it turns out that {(6)} factors uniquely as the product of {(2) + (1+\sqrt{-5})}, {(2) + (1-\sqrt{-5})}, {(3) + (1+\sqrt{-5})}, and {(3) + (1-\sqrt{-5})}. As we shall see, the precise failure of unique factorisation at the level of quadratic numbers can be quantified by an important number {h(D)}, known as the class number of the ring of integers, where {D} is the discriminant mentioned in the introduction (equal to {-4d} when {d=1,2\ (4)}, and {-d} when {d=3\ (4)}).

— 2. Unique factorisation of ideals —

Henceforth {d} is a fixed squarefree natural number, and {{\mathcal O}} is the ring of integers in {{\bf Q}(\sqrt{-d})}. We set the discriminant {D} equal to {-4d} when {d=1,2\ (4)} and equal to {-d} when {d=3\ (4)}.

Exercise 13 (Algebraic interpretation of discriminant) Let {a, b} be an additive basis for {{\mathcal O}} (thus {{\mathcal O}} is generated by {a,b} as an additive group). Show that

\displaystyle  D = \hbox{det} \begin{pmatrix} a & b \\ \overline{a} & \overline{b} \end{pmatrix}^2.

We remark that the discriminant of a more general number field is defined similarly.
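The determinant formula of Exercise 13 can be tested in floating point for the standard bases. The sketch below uses Python's built-in complex numbers; `integral_basis` and `disc_from_basis` are hypothetical names, and the basis chosen for each {d} is the one from Exercise 3. The second call illustrates that a different additive basis, such as {a+b, b}, gives the same value.

```python
from math import sqrt


def integral_basis(d):
    """The basis 1, sqrt(-d) (d = 1, 2 mod 4) or 1, (1 + sqrt(-d))/2 (d = 3 mod 4)."""
    root = complex(0, sqrt(d))              # sqrt(-d) identified with sqrt(d)*i
    return (1, (1 + root) / 2) if d % 4 == 3 else (1, root)


def disc_from_basis(a, b):
    """Square of the determinant of the matrix with rows (a, b) and (conj a, conj b)."""
    a, b = complex(a), complex(b)
    det = a * b.conjugate() - b * a.conjugate()
    return det * det


for d in (1, 2, 3, 5, 7):
    a, b = integral_basis(d)
    print(d, disc_from_basis(a, b), disc_from_basis(a + b, b))
```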

As mentioned in the introduction, one can view {{\mathcal O}} additively, as a rank two lattice in the complex numbers {{\bf C}}. Any non-zero ideal {I} of {{\mathcal O}} can then be seen to be a rank two sublattice of {{\mathcal O}}, and in particular must have finite index. We refer to this index as the norm {N(I)} of the ideal, thus {N(I)} is the natural number defined by the formula

\displaystyle  N(I) := | {\mathcal O} / I |.

For the ring {{\mathcal O}} of quadratic integers we are considering here, one can interpret {N(I)} geometrically as the area of the torus {{\bf C}/I}, divided by the area of the torus {{\bf C}/{\mathcal O}}, which can be easily computed to be {\sqrt{|D|}/2}.
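For concrete ideals the index {|{\mathcal O}/I|} can be computed by linear algebra. The following sketch works in {{\bf Z}[\sqrt{-5}]}, stores {x + y\sqrt{-5}} as the pair {(x,y)}, and describes an ideal by a finite list of generators. It uses the standard fact (assumed here) that the index in {{\bf Z}^2} of a rank two lattice spanned by finitely many integer vectors is the greatest common divisor of the {2 \times 2} minors of those vectors. All function names are hypothetical.

```python
from math import gcd
from itertools import combinations


def mul5(u, v):
    """Product in Z[sqrt(-5)] of two elements given as coordinate pairs."""
    (x1, y1), (x2, y2) = u, v
    return (x1 * x2 - 5 * y1 * y2, x1 * y2 + x2 * y1)


def ideal_norm(gens):
    """Index in Z[sqrt(-5)] of the ideal generated by gens."""
    # as an additive group the ideal is spanned by g and g*sqrt(-5) for each generator g
    vecs = []
    for g in gens:
        vecs += [g, mul5(g, (0, 1))]
    index = 0
    for (a, b), (c, d) in combinations(vecs, 2):
        index = gcd(index, abs(a * d - b * c))
    return index


def ideal_product(I, J):
    """Generators of the product ideal: all products of a generator of I with one of J."""
    return [mul5(g, h) for g in I for h in J]


P2, P2c = [(2, 0), (1, 1)], [(2, 0), (1, -1)]   # (2) + (1 + sqrt(-5)), (2) + (1 - sqrt(-5))
P3, P3c = [(3, 0), (1, 1)], [(3, 0), (1, -1)]   # (3) + (1 + sqrt(-5)), (3) + (1 - sqrt(-5))
print([ideal_norm(I) for I in (P2, P2c, P3, P3c)])
print(ideal_norm(ideal_product(ideal_product(P2, P2c), ideal_product(P3, P3c))))
print(ideal_norm([(6, 0)]))                     # the principal ideal (6)
```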

It is clear that if {I} divides {J}, then {N(I)} divides {N(J)}, since {{\mathcal O}/I} is a quotient of {{\mathcal O}/J}. Similarly, if {I} divides {J} and {N(I)=N(J)}, then one must have {I=J}. This implies the important Noetherian property: {{\mathcal O}} does not contain any infinite strictly increasing sequence of ideals

\displaystyle  I_1 \subsetneq I_2 \subsetneq I_3 \subsetneq \dots

since their norms must be strictly decreasing, creating an infinite descent which is absurd. This notion of norm is compatible with the notion of the norm of a quadratic integer:

Exercise 14 If {n} is a quadratic integer, show that {N((n)) = N(n)}, where {N(n)} was defined in the introduction.

We remark that for quadratic integers of positive discriminant the situation is slightly more complicated, because the norm {N(x + y \sqrt{d}) = x^2 - dy^2} of an individual element can now be negative, whereas the norm of an ideal is always positive. We will not dwell on this complication further here.

Now we develop a unique factorisation theory for ideals. We first establish that prime ideals are prime within the multiplicative structure of ideals (rather than of quadratic integers):

Lemma 15 Let {{\mathfrak p}} be a prime ideal that divides the product {I \cdot J} of two ideals {I,J}. Then {{\mathfrak p}} must divide at least one of {I,J}.

Proof: If {{\mathfrak p}} does not divide either of {I, J}, then we can find {n \in I, m \in J} that lie outside of {{\mathfrak p}}. As {{\mathfrak p}} is prime, we conclude that {nm \in I \cdot J} also lies outside of {{\mathfrak p}}, and so {{\mathfrak p}} does not divide {I \cdot J}, a contradiction. \Box

Also, prime ideals are maximal:

Exercise 16 Show that the only ideals that divide a prime ideal {{\mathfrak p}} are {{\mathfrak p}} itself, and the full ring {(1)}.

If a non-zero ideal {I} is not prime, then by definition there exist two quadratic integers {n,m} outside of {I} such that {nm \in I}. If we set {I_1 := I + (n)} and {I_2 := I + (m)}, we then see that {I_1, I_2} strictly divide {I}, and that {I} divides {I_1 \cdot I_2}. Thus any non-zero ideal {I} is either prime, or divides the product of two non-zero ideals that strictly divide it (and thus have smaller norm). Iterating this (and using the Noetherian property), we conclude

Proposition 17 Every non-zero ideal divides the product of a finite number of prime ideals.

A similar argument gives

Exercise 18 Show that every non-zero ideal is divisible by at least one prime ideal.

We now need a technical lemma that allows one to “invert” a prime ideal {{\mathfrak p}}.

Lemma 19 Let {{\mathfrak p}} be a prime ideal. Then there exists a quadratic field element {x \in {\bf Q}(\sqrt{-d})} that is not a quadratic integer (thus {x \not \in {\mathcal O}}), but is such that {x \cdot {\mathfrak p} \subset {\mathcal O}}.

Proof: Let {n} be a non-zero element of {{\mathfrak p}}. By Proposition 17, {(n)} must divide some product {{\mathfrak p}_1 \dots {\mathfrak p}_m} of prime ideals. In particular, {{\mathfrak p}} also divides {{\mathfrak p}_1 \dots {\mathfrak p}_m}, which by Lemma 15 and Exercise 16 implies that one of the {{\mathfrak p}_1,\dots,{\mathfrak p}_m}, say {{\mathfrak p}_m}, is equal to {{\mathfrak p}}. By taking {m} to be minimal, we may assume that {(n)} does not divide {{\mathfrak p}_1 \dots {\mathfrak p}_{m-1}}. Thus, we may find an element {a} of {{\mathfrak p}_1 \dots {\mathfrak p}_{m-1}} that does not lie in {(n)}, but such that {a \cdot {\mathfrak p}} is contained in {(n) = n \cdot {\mathcal O}}. Setting {x := \frac{a}{n}}, we obtain the claim. \Box

Remark 20 We can formalise the notion of inverting an ideal by introducing the concept of a fractional ideal, which are to ideals as rational numbers are to integers, but we will not do so in this set of notes.

Now we can give the most difficult step of unique factorisation:

Proposition 21 Suppose {I} is a non-zero ideal that is divisible by a prime ideal {{\mathfrak p}}. Then one has {I = {\mathfrak p} \cdot J} for some non-zero ideal {J} which is a strict divisor of {I}.

Proof: By the previous lemma, we can find {x \in {\bf Q}(\sqrt{-d})} that is not a quadratic integer, such that {x \cdot {\mathfrak p} \subset {\mathcal O}}. Note that {x \cdot {\mathfrak p}} is an ideal dividing {{\mathfrak p}}, so by Exercise 16 is either equal to {{\mathfrak p}} or {{\mathcal O}}.

Suppose first that {x \cdot {\mathfrak p} = {\mathfrak p}}. The ideal {{\mathfrak p}} is a rank two lattice, and thus isomorphic as an abelian group to {{\bf Z}^2}. The action of multiplication by {x} on {{\mathfrak p}} is then conjugate to the action of a {2 \times 2} matrix with integer coefficients. By the Cayley-Hamilton theorem, this implies that there is a monic quadratic polynomial of {x} that annihilates {{\mathfrak p}}, and is thus zero (since {{\mathcal O}} is an integral domain). In other words, {x} is an algebraic integer, and hence {x \in {\mathcal O}}, a contradiction. (Note here that we crucially used the fact that {{\mathcal O}} contains all the algebraic integers of {{\bf Q}(\sqrt{-d})}; cf. Remark 4.)

Thus we must have {x \cdot {\mathfrak p} = {\mathcal O}}. If we then set {J := x \cdot I}, then we have {I = {\mathfrak p} \cdot J}, and {J} is an ideal dividing {I}. We are thus done unless {I=J}, that is to say {x \cdot I = I}. But one can then repeat the previous argument to conclude that {x} is an algebraic integer and thus in {{\mathcal O}}, again reaching a contradiction. \Box

We now have enough tools to mimic the usual proof of unique factorisation for natural numbers, to obtain the analogous result for ideals in a ring of quadratic integers:

Exercise 22 (Unique factorisation) Show that any non-zero ideal can be uniquely expressed (up to rearrangement) as a product {{\mathfrak p}_1^{a_1} \dots {\mathfrak p}_k^{a_k}} of prime ideals, with {a_1,\dots,a_k > 0}. Show that one non-zero ideal {I} divides another {J} if and only if the number of times any given prime ideal {{\mathfrak p}} appears in the unique factorisation of {I} is less than or equal to the number of times it appears in {J}.

A basic application of unique factorisation is

Proposition 23 (Chinese Remainder Theorem) Let {I, J} be non-zero ideals that are coprime (they have no prime ideal divisors in common). Then the obvious ring homomorphism {\phi} from {{\mathcal O}/(I \cdot J)} to {{\mathcal O}/I \times {\mathcal O}/J}, defined by setting {\phi( n \hbox{ mod } I \cdot J ) := (n \hbox{ mod } I, n \hbox{ mod } J)}, is an isomorphism.

Proof: Observe that the ideal {I+J} divides both {I} and {J} and must therefore be all of {{\mathcal O}}, by unique factorisation and coprimality. Similarly, the ideal {I \cap J} is divisible by both {I} and {J} while dividing {I \cdot J}, and must therefore be exactly {I \cdot J}, by unique factorisation and coprimality. Since the kernel of {\phi} is {(I \cap J)/(I \cdot J)}, we conclude that {\phi} is injective; it remains to show that {\phi} is surjective. Since {I + J = {\mathcal O}}, we can split {1 = n+m} for some {n \in I} and {m \in J}. But then {\phi( n \hbox{ mod } I \cdot J ) = (0 \hbox{ mod } I, 1 \hbox{ mod } J)} and {\phi( m \hbox{ mod } I \cdot J ) = (1 \hbox{ mod } I, 0 \hbox{ mod } J)}, and the surjectivity then follows since these two elements generate {{\mathcal O}/I \times {\mathcal O}/J} as an {{\mathcal O}}-module. \Box

In the non-coprime case, we have the following basic fact.

Proposition 24 Let {{\mathfrak p}} be a prime ideal. Then for any non-negative integer {j}, we have {{\mathfrak p}^{j}/{\mathfrak p}^{j+1}} isomorphic (as an {{\mathcal O}}-module) to {{\mathcal O} / {\mathfrak p}}.

Proof: By unique factorisation, {{\mathfrak p}^{j}} is a strict divisor of {{\mathfrak p}^{j+1}}, thus we can find {a \in {\mathfrak p}^{j}} that does not lie in {{\mathfrak p}^{j+1}}. This gives an {{\mathcal O}}-module homomorphism {\phi: {\mathcal O} \rightarrow {\mathfrak p}^{j}/{\mathfrak p}^{j+1}} defined by {\phi(n) := an \hbox{ mod } {\mathfrak p}^{j+1}}. The kernel of this map is an ideal dividing {{\mathfrak p}} (because {a \cdot {\mathfrak p} \subset {\mathfrak p}^{j+1}}) that does not contain {1}, and is thus {{\mathfrak p}} by Exercise 16. Thus we have an injection from {{\mathcal O}/{\mathfrak p}} to {{\mathfrak p}^{j}/{\mathfrak p}^{j+1}}.

It remains to show surjectivity. By several applications of Proposition 21, we may write {(a) = {\mathfrak p}^j I} for some non-zero ideal {I} not divisible by {{\mathfrak p}}. By the Chinese remainder theorem, we may then find, for any {n \in {\mathfrak p}^j}, a quadratic integer {b} such that {b = n \hbox{ mod } {\mathfrak p}^{j+1}} and {b = 0 \hbox{ mod } I}. Thus {b} lies in both {{\mathfrak p}^j} and {I}, and hence in {(a)} by coprimality; thus {b/a} is a quadratic integer. By construction, we have {\phi(b/a) = n \hbox{ mod } {\mathfrak p}^{j+1}}, giving the desired surjectivity. \Box

Corollary 25 (Multiplicativity of norm) For any non-zero ideals {I, J}, we have {N(I \cdot J) = N(I) N(J)}.

Proof: When {I,J} are coprime this follows directly from the Chinese remainder theorem. By unique factorisation, it thus suffices to show that {N({\mathfrak p}^j) = N({\mathfrak p})^j} for all natural numbers {j}. But this follows from Proposition 24 and induction on {j}. \Box

Exercise 26 Show that the Gaussian integers and the Eisenstein integers are principal ideal domains. (Hint: if {I} is a non-zero ideal in one of these rings, consider a non-zero element of {I} of minimal norm.) Conclude a unique factorisation theorem for elements of these rings.

Exercise 27 Verify that in {{\bf Z}[\sqrt{-5}]}, the principal ideal {(6)} factors into the four ideals mentioned in the introduction, and that these ideals are prime. What are the norms of all the ideals involved?

Remark 28 The unique factorisation theorem for ideals holds in the more general context of Dedekind domains, but we will not develop the abstract theory of Dedekind domains here.

— 3. Connection with the Kronecker symbol —

Let {{\mathfrak p}} be a prime ideal. Then {{\mathcal O}/{\mathfrak p}} is a finite integral domain, and is thus a finite field (multiplication by any non-zero element is injective, and hence a permutation of this finite set). On the other hand, since {{\mathcal O}} is a rank two abelian group, this finite field is generated by at most two elements as an additive group. We conclude that {{\mathcal O}/{\mathfrak p}} is isomorphic to a finite field of order either {p} or {p^2} for some rational prime {p}, which is the characteristic of the field. In particular, {N({\mathfrak p})} is either a prime {p} or a square {p^2} of that prime {p}. On the other hand, since {{\mathcal O}/{\mathfrak p}} has characteristic {p}, {{\mathfrak p}} must divide {(p)}, which has norm exactly {p^2} by Exercise 14. By unique factorisation, we conclude that for each rational prime {p}, the ideal {(p)} is either prime of norm {p^2}, or is the product {{\mathfrak p}_1 {\mathfrak p}_2} of two prime ideals that each have norm {p}, and furthermore that all prime ideals arise in this fashion.

We can determine precisely which of the two is the case:

Proposition 29 Let {p} be a rational prime.

  • If {D} is a quadratic residue modulo {p}, then {(p)} is the product of two prime ideals {{\mathfrak p}_1, {\mathfrak p}_2} of norm {p}.
  • If {D} is a quadratic non-residue modulo {p}, then {(p)} is a prime ideal of norm {p^2}.

Proof: We just handle the case {d=1,2\ (4)} and leave the {d=3\ (4)} case as an exercise. Suppose there is a prime ideal {{\mathfrak p}} of norm {p}, then {{\mathfrak p}/(p)} is isomorphic to the field of order {p}. In particular, if {a + b \sqrt{-d}} is an element of {{\mathfrak p}} not divisible by {p}, then {\sqrt{-d} ( a + b \sqrt{-d} )} must be a multiple of {a + b \sqrt{-d}} modulo {p}, thus one can find non-zero {k \in {\bf Z}/p{\bf Z}} such that {ka = -d b \hbox{ mod } p} and {kb = a \hbox{ mod } p}, which implies that {k^2 = -d \hbox{ mod } p}, since {a,b} are not both zero modulo {p}. Thus {-d} (and hence {D}) is a quadratic residue modulo {p}. Conversely, if {D} (and hence {-d}) is a quadratic residue, then we can find {a,b,k} with {ka = -d b \hbox{ mod } p} and {kb = a \hbox{ mod } p} with {a,b} not both zero modulo {p}, and then {(a+b\sqrt{-d}) + (p)} is an ideal dividing {(p)} of norm {p}, and thus prime. The claim follows. \Box
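As a sanity check on Proposition 29 in the case {d=1,2\ (4)}, where {{\mathcal O} = {\bf Z}[\sqrt{-d}]}, here is a minimal Python sketch that searches by brute force for the residue {k} with {k^2 = -d \hbox{ mod } p} used in the proof. The function names are hypothetical, and the labels "split", "ramified" and "inert" are the standard names (not used elsewhere in these notes) for the cases of two distinct prime factors, a repeated prime factor (cf. Exercise 31), and {(p)} itself being prime.

```python
def roots_of_minus_d(d, p):
    """Return all k in Z/pZ with k^2 = -d mod p (brute force search)."""
    return [k for k in range(p) if (k * k + d) % p == 0]


def factor_type(d, p):
    """Classify the rational prime p in Z[sqrt(-d)], assuming d = 1, 2 mod 4.

    No root   -> (p) is prime ("inert").
    One root  -> p divides D = -4d and the two prime factors coincide ("ramified").
    Two roots -> (p) is a product of two distinct prime ideals ("split").
    The argument p is assumed to be prime; this is not checked.
    """
    if d % 4 not in (1, 2):
        raise ValueError("this sketch only covers d = 1, 2 mod 4")
    roots = roots_of_minus_d(d, p)
    return {0: "inert", 1: "ramified", 2: "split"}[len(roots)]


if __name__ == "__main__":
    # Behaviour of small primes in Z[sqrt(-5)], where D = -20.
    for p in (2, 3, 5, 7, 11, 13):
        print(p, factor_type(5, p), roots_of_minus_d(5, p))
```

For a root {k}, the element {k + \sqrt{-d}} has norm {k^2+d} divisible by {p}, which is the element {a + b\sqrt{-d}} with {b=1} appearing in the proof.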

Exercise 30 Complete the proof of the proposition in the case {d=3\ (4)}.

Exercise 31 Show that when {D} is a quadratic residue modulo {p}, the two prime ideals {{\mathfrak p}_1, {\mathfrak p}_2} appearing in the above proposition are distinct unless {p} divides {D}, in which case the two ideals are equal.

The above proposition gives us a formula for the number of prime ideals of a given norm. For any natural number {n}, define the Kronecker symbol {\chi(n) := (\frac{D}{n})} to be the completely multiplicative function of {n} such that for each prime {p}, {(\frac{D}{p})} equals {0} if {p} divides {D}, equals {+1} if {D} is a non-zero quadratic residue mod {p}, and {-1} if {D} is a non-zero quadratic nonresidue mod {p}. From the law of quadratic reciprocity, one can verify that {\chi} is a Dirichlet character of conductor {|D|}. For instance, if {d=1}, then {\chi = \chi_4}.

Exercise 32 Show that for any natural number {n}, the number of ideals of norm {n} is equal to {1 * \chi(n)}.
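The identity in Exercise 32 can be tested numerically in the simplest case {d=1} of the Gaussian integers. The sketch below is only a check, not a proof: it assumes (from Exercise 26) that {{\bf Z}[\sqrt{-1}]} is a principal ideal domain with four units, so that each ideal of norm {n} corresponds to exactly four solutions of {a^2+b^2=n}. The function names are hypothetical.

```python
from math import isqrt


def chi4(n):
    """The non-principal Dirichlet character mod 4 (the case d = 1)."""
    return (0, 1, 0, -1)[n % 4]


def one_conv_chi(n):
    """Dirichlet convolution (1 * chi_4)(n): sum of chi_4(k) over divisors k of n."""
    return sum(chi4(k) for k in range(1, n + 1) if n % k == 0)


def ideals_of_norm(n):
    """Count ideals of norm n in Z[i] as (number of a + bi with a^2 + b^2 = n) / 4."""
    r = isqrt(n)
    solutions = sum(
        1
        for a in range(-r, r + 1)
        for b in range(-r, r + 1)
        if a * a + b * b == n
    )
    return solutions // 4


if __name__ == "__main__":
    for n in range(1, 26):
        print(n, ideals_of_norm(n), one_conv_chi(n))
```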

Exercise 33 Prove Proposition 8.

Another way to phrase the conclusion of Exercise 32 is as the factorisation

\displaystyle  \zeta_{\mathcal O}(s) = \zeta(s) L(s,\chi)

(for {\hbox{Re}(s)>1} at least), where {\zeta} is the Riemann zeta function, {\zeta_{\mathcal O}} is the Dedekind zeta function

\displaystyle  \zeta_{\mathcal O}(s) = \sum_I \frac{1}{N(I)^s}

where {I} ranges over non-zero ideals, and {L(s,\chi)} is the Dirichlet {L}-function

\displaystyle  L(s,\chi) = \sum_n \frac{\chi(n)}{n^s}.

For instance, in the case of the Gaussian integers {{\bf Z}[\sqrt{-1}]}, we have

\displaystyle  \zeta_{{\bf Z}[\sqrt{-1}]}(s) = \zeta(s) L(s,\chi_4).
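One can compare the two sides of this Gaussian factorisation numerically. The sketch below truncates all sums at a cutoff {N} (so the two sides agree only up to a small truncation error), and again assumes that the ideals of {{\bf Z}[\sqrt{-1}]} are in four-to-one correspondence with the non-zero Gaussian integers. The function names and the choice of {s=3} and {N=2000} in the usage line are illustrative assumptions.

```python
from math import isqrt


def chi4(n):
    """The non-principal Dirichlet character mod 4."""
    return (0, 1, 0, -1)[n % 4]


def dedekind_zeta_gaussian(s, N):
    """Sum of N(I)^(-s) over ideals I of Z[i] with N(I) <= N.

    Each ideal is counted through its four generators a + bi.
    """
    r = isqrt(N)
    total = 0.0
    for a in range(-r, r + 1):
        for b in range(-r, r + 1):
            norm = a * a + b * b
            if 0 < norm <= N:
                total += norm ** (-s)
    return total / 4


def zeta_times_L(s, N):
    """Product of the truncated series for zeta(s) and L(s, chi_4)."""
    zeta = sum(n ** (-s) for n in range(1, N + 1))
    L = sum(chi4(n) * n ** (-s) for n in range(1, N + 1))
    return zeta * L


if __name__ == "__main__":
    print(dedekind_zeta_gaussian(3, 2000), zeta_times_L(3, 2000))
```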

— 4. Connection with quadratic forms —

Let us say that two non-zero ideals {I, J} are equivalent if one has {I = z \cdot J} for some non-zero {z \in {\bf Q}(\sqrt{-d})}. This is clearly an equivalence relation; the equivalence class of {{\mathcal O}} is simply the class of principal ideals. Using unique factorisation (and the fact that every ideal divides a principal ideal), the space of such equivalence classes is a group, called the class group of the ring of integers {{\mathcal O}}.

One can analyse this class group by associating a positive definite quadratic form {Q_I: I \rightarrow {\bf Z}_{\geq 0}} to each ideal {I}, by the formula

\displaystyle  Q_I(n) := N(n) / N(I)

for all {n \in I}. Note that {Q_I(n) = | I / (n) |} for {n \neq 0}, and so {Q_I} takes values in the non-negative integers (and is strictly positive for non-zero {n}). Since {N(n)} is a quadratic form in {n}, we see that {Q_I} is a quadratic form on {I}.

We call two quadratic forms {Q_I: I \rightarrow {\bf Z}}, {Q_J: J \rightarrow {\bf Z}} equivalent if there is an additive group isomorphism {\phi: I \rightarrow J} such that {Q_I(n) = Q_J( \phi(n) )} for all {n \in I}. This relation captures the equivalence relation on ideals:

Exercise 34 Let {I, J} be ideals. Show that {Q_I} and {Q_J} are equivalent if and only if {I, J} are equivalent.

A quadratic form {Q: {\bf Z}^2 \rightarrow {\bf Z}} on the standard lattice {{\bf Z}^2} is of the form {Q(x,y) = ax^2 + bxy + cy^2} for some integers {a,b,c}. The discriminant of this quadratic form is defined to be {b^2-4ac}. This is an invariant with respect to invertible linear transformations of {{\bf Z}^2}. Thus, given any other quadratic form {Q_I: I \rightarrow {\bf Z}} on a rank two lattice {I}, one can define the discriminant of {Q_I} by identifying {I} with {{\bf Z}^2} via a linear isomorphism; it is clear that this definition does not depend on the choice of isomorphism. It turns out that the quadratic forms of all ideals have the same discriminant:

Exercise 35 Let {I} be an ideal. Show that {Q_I} has discriminant {D}. (Hint: if one identifies {{\mathcal O}} with {{\bf Z}^2}, show that {I} takes the form {T {\bf Z}^2} for some linear transformation {T: {\bf Z}^2 \rightarrow {\bf Z}^2} of determinant {N(I)}, and use this to show that {Q_I} has the same discriminant as {Q_{{\mathcal O}}}.)

From Exercise 7 we see that the class group of {{\mathcal O}} is finite. The order of this group is known as the class number and will be denoted {h=h(D)}.

In the converse direction, we have

Lemma 36 Let {Q: {\bf Z}^2 \rightarrow {\bf Z}} be a positive definite quadratic form of discriminant {D}. Then there exists an ideal {I} such that {Q_I} is equivalent to {Q}.

In particular, the class group is in one-to-one correspondence with the equivalence classes of positive definite quadratic forms {Q: {\bf Z}^2 \rightarrow {\bf Z}} of a given discriminant, a famous result of Gauss. In particular, the group law on the class group induces the Gauss composition law on equivalence classes of quadratic forms of a given discriminant.

Proof: We perform some ad hoc computations in the case {d=1,2\ (4)}. If {Q(x,y) = ax^2 + bxy + c y^2}, then {b^2-4ac = D} and {a,c > 0}, which makes {b} even. One may verify that the set

\displaystyle  I := \{ 2ax - y (-b-\sqrt{D}): x,y \in {\bf Z} \}

is an additive subgroup of {{\mathcal O}} that is also closed under multiplication by {\frac{-b+\sqrt{D}}{2}}, and is thus an ideal; one may also calculate its norm to be {4a}, with

\displaystyle  N( 2ax - y (-b-\sqrt{D}) ) = 4a Q(x,y)

for {x,y \in {\bf Z}}. This implies that {Q_I} is equivalent to {Q} as required. \Box

Exercise 37 Complete the proof of the lemma by treating the {d=3\ (4)} case.

Exercise 38 Show that {x^2+5y^2} and {2x^2+2xy+3y^2} are inequivalent quadratic forms of discriminant {-20}, and that all other quadratic forms of this discriminant are equivalent to one of these two forms.
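To experiment with Exercise 38, one can enumerate forms using the classical reduction conditions {|b| \leq a \leq c}, with {b \geq 0} whenever {|b|=a} or {a=c}. The sketch below is a sanity check rather than a solution: it assumes the standard fact that every positive definite form is equivalent to a reduced one, it assumes that {D} is a fundamental discriminant (so that all forms are primitive), and the reduction conditions classify forms up to orientation-preserving changes of variable, which is a slightly finer convention than the equivalence used in the text. The function name is hypothetical.

```python
def reduced_forms(D):
    """List the reduced forms (a, b, c) with b^2 - 4ac = D < 0.

    Reduced means |b| <= a <= c, with b >= 0 whenever |b| == a or a == c.
    """
    if D >= 0 or D % 4 not in (0, 1):
        raise ValueError("D must be negative and congruent to 0 or 1 mod 4")
    forms = []
    a = 1
    # A reduced form has 4a^2 <= 4ac = b^2 - D <= a^2 - D, so 3a^2 <= |D|.
    while 3 * a * a <= -D:
        for b in range(-a, a + 1):
            if (b * b - D) % (4 * a) != 0:
                continue
            c = (b * b - D) // (4 * a)
            if c < a:
                continue
            if b < 0 and (-b == a or a == c):
                continue
            forms.append((a, b, c))
        a += 1
    return forms


if __name__ == "__main__":
    for D in (-3, -4, -20, -23, -163):
        print(D, reduced_forms(D))
```

For {D=-20} this returns the coefficient triples {(1,0,5)} and {(2,2,3)}, matching the two forms in Exercise 38.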

Now we relate the distribution of norms of ideals to representation by quadratic forms.

Proposition 39 Let {n} be a natural number, and let {Q_1,\dots,Q_h} be representatives of the equivalence classes of positive definite quadratic forms of discriminant {D}. Then the number of ideals {I} with {N(I) = n} is equal to the number of representations of {n} of the form {Q_i(x,y)} for some {i=1,\dots,h} and {x,y \in {\bf Z}}, divided by the number {w} of units (i.e. {4} if {d=1}, {6} if {d=3}, or {2} otherwise, thanks to Exercise 5).

Proof: Suppose we can write {n = Q_i(x,y)} for some {i=1,\dots,h} and {x,y \in {\bf Z}}. By construction, {Q_i} is equivalent to {Q_J} for some ideal {J = J_i}, and so we can write {n = Q_J(m)} for some {m \in J}, thus {N(m) = n N(J)}. By unique factorisation, we may write {(m) = I \cdot J} for some ideal {I} of norm {n}. Note that if we replace {i} with a different index, then {J = J_i} is replaced with an inequivalent ideal, and then so is {I}. On the other hand, if we keep {i} fixed and replace {x,y} with some {x',y'}, thus also replacing {m} with some {m' \in J}, then we also change {I} to a different ideal unless {(m) = (m')}, or equivalently if {m'} differs from {m} by a unit, in which case {I} is unchanged. Thus we have a map from representations {n = Q_i(x,y)} with {i=1,\dots,h} and {x,y \in {\bf Z}} to ideals of norm {n} whose multiplicity is exactly equal to the number {w} of units.

To conclude the proposition, we need to show that every ideal {I} of norm {n} arises in this fashion. But by Lagrange’s theorem, {N(I)=n} lies in {I}, and then {Q_I(n) = n}, giving the claim. \Box

We can then give an elementary asymptotic for the norm of ideals:

Corollary 40 We have

\displaystyle  \sum_{I: N(I) \leq x} 1 = \frac{2\pi h(D)}{w \sqrt{|D|}} x + O_D( \sqrt{x} )

for {x>1}, where {w \in \{2,4,6\}} is the number of units.

Proof: By the preceding proposition, the left-hand side can be written as

\displaystyle  1 + \frac{1}{w} \sum_{i=1}^{h(D)} \sum_{a,b \in {\bf Z}: Q_i(a,b) \leq x} 1.

The inner sum is the number of lattice points in the ellipse {\{ (a,b) \in {\bf Z}^2: Q_i(a,b) \leq x \}}, which has area {\frac{2\pi x}{\sqrt{|D|}}} by Exercise 6, since {Q_i} has discriminant {D}. If one places a unit square centred at each such lattice point, one obtains a region that differs from the ellipse by a set of area {O_D(\sqrt{x})} (since the difference set is contained in a {O(1)}-neighbourhood of the boundary of the ellipse). The claim follows. \Box
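The lattice point count in this proof is easy to test numerically. The following sketch counts the non-zero lattice points in each ellipse by brute force and compares the resulting ideal count with the main term {\frac{2\pi h(D)}{w\sqrt{|D|}} x}. It takes the list of representative forms, the class number {h} and the number of units {w} as inputs rather than computing them; the function names are hypothetical, and {x} is assumed to be a positive integer.

```python
from math import isqrt, pi, sqrt


def lattice_count(form, x):
    """Number of (u, v) in Z^2, not both zero, with Q(u, v) <= x."""
    a, b, c = form
    D = b * b - 4 * a * c  # negative for a positive definite form
    # 4cQ = (bu + 2cv)^2 + |D| u^2 and 4aQ = (2au + bv)^2 + |D| v^2,
    # so |u| and |v| are bounded on the ellipse Q <= x.
    u_max = isqrt(4 * c * x // -D)
    v_max = isqrt(4 * a * x // -D)
    count = 0
    for u in range(-u_max, u_max + 1):
        for v in range(-v_max, v_max + 1):
            q = a * u * u + b * u * v + c * v * v
            if 0 < q <= x:
                count += 1
    return count


def ideal_count(forms, w, x):
    """Number of ideals of norm at most x, via Proposition 39."""
    return sum(lattice_count(f, x) for f in forms) // w


def main_term(D, h, w, x):
    """The main term 2 pi h x / (w sqrt|D|) of Corollary 40."""
    return 2 * pi * h * x / (w * sqrt(-D))


if __name__ == "__main__":
    x = 2000
    forms = [(1, 0, 5), (2, 2, 3)]  # the two forms of discriminant -20
    print(ideal_count(forms, 2, x), main_term(-20, 2, 2, x))
```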

This, combined with Exercise 32, gives a special case of the famous Dirichlet class number formula, generalising Exercise 9:

Exercise 41 (Dirichlet class number formula) Show that

\displaystyle  \zeta_{\mathcal O}(s) = \frac{2\pi h(D)}{w \sqrt{|D|}} \frac{1}{s-1} + O_D(1)

for {s>1} sufficiently close to {1}, and then conclude that

\displaystyle  h(D) = \frac{w \sqrt{|D|}}{2\pi} L(1,\chi).

In particular, since the class number {h(D)} is clearly at least one, we obtain a “trivial” lower bound

\displaystyle  L(1,\chi) \geq \frac{2\pi}{w \sqrt{|D|}} \gg \frac{1}{\sqrt{d}}.

This looks weaker than Siegel’s bound

\displaystyle  L(1,\chi) \gg_\varepsilon d^{-\varepsilon}

for any {\varepsilon>0} (which we will discuss in later notes), but a key difference is that the trivial bound has effective constants, whereas Siegel’s bound is ineffective. The best effective bound currently known on {L(1,\chi)} only improves on the trivial bound by a logarithmic factor, and involves quite deep and difficult mathematics relating to elliptic curves; see this survey of Goldfeld.
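The class number formula of Exercise 41 can also be tested numerically. The sketch below approximates {L(1,\chi)} by a partial sum over complete periods of {\chi} and then evaluates {\frac{w \sqrt{|D|}}{2\pi} L(1,\chi)}, which should be close to an integer. It assumes that {D} is a negative fundamental discriminant, and at the prime {2} it uses the standard convention for the Kronecker symbol ({+1} if {D = \pm 1\ (8)}, {-1} if {D = \pm 3\ (8)}, {0} if {D} is even). The function names and the number of periods are illustrative choices.

```python
from math import pi, sqrt


def kronecker(a, n):
    """Kronecker symbol (a/n) for an integer a and a positive integer n."""
    result = 1
    if n % 2 == 0:
        if a % 2 == 0:
            return 0
        while n % 2 == 0:  # strip factors of 2 from n, using (a/2)
            n //= 2
            if a % 8 in (3, 5):
                result = -result
    a %= n  # n is now odd: compute the Jacobi symbol
    while a != 0:
        while a % 2 == 0:
            a //= 2
            if n % 8 in (3, 5):
                result = -result
        a, n = n, a  # quadratic reciprocity
        if a % 4 == 3 and n % 4 == 3:
            result = -result
        a %= n
    return result if n == 1 else 0


def class_number_estimate(D, periods=2000):
    """Approximate w sqrt|D| L(1, chi) / (2 pi) for fundamental D < 0."""
    m = -D
    w = 4 if D == -4 else 6 if D == -3 else 2
    chi = [0] + [kronecker(D, r) for r in range(1, m)]  # chi has period |D|
    L1 = sum(chi[n % m] / n for n in range(1, periods * m + 1))
    return w * sqrt(m) / (2 * pi) * L1


if __name__ == "__main__":
    for D in (-3, -4, -20, -23, -163):
        print(D, class_number_estimate(D))
```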

Remark 42 Most of the above discussion also extends to the rings of integers in real quadratic fields {{\bf Q}(\sqrt{d})} of positive discriminant, with a few changes; for instance, there are now infinitely many units, quadratic integers may now have negative norm, and the field is now embedded (in two different ways) into {{\bf R}} rather than into {{\bf C}}. The ellipses of Exercise 6 become hyperbolae, which creates a logarithmic correction term (known as a regulator) in the class number formula. We leave the detailed modifications needed to the interested reader. It turns out that every real Dirichlet character will essentially arise from a quadratic field (of either positive or negative discriminant); see Chapters 5, 6 of Davenport for details. One can also consider higher degree number fields than quadratic fields; again, much of the above theory carries through in this case, but the characters {\chi} that then emerge are not necessarily Dirichlet characters, but lie instead in the more general class of Hecke characters. We will not discuss this more general theory here, but see for instance the book of Fröhlich and Taylor.

Remark 43 There is an important connection between class groups and abelian extensions of fields (an abelian version of the connection between Galois groups and arbitrary extensions of fields), known as class field theory, but we will not discuss this topic further in this course.

Exercise 44 Let {d} be a rational prime with {d = 3\ (4)}, thus generating a quadratic field {{\bf Q}(\sqrt{-d})} of discriminant {D := -d} and the associated ring of integers {{\mathcal O}}.

  • (i) Show that for any rational integer {x} with {-\frac{d-7}{4} \leq x \leq \frac{d-3}{4}}, the only solutions to the equation

    \displaystyle  N(a + b \frac{1+\sqrt{-d}}{2}) = x^2 - x + \frac{d+1}{4}

    are {(a,b) = (x,-1), (-x,1), (x-1,1), (1-x,-1)}.

  • (ii) Suppose further that the class number {h(D)} is {1}, that is to say that {{\mathcal O}} is a principal ideal domain. Conclude that {x^2 - x + \frac{d+1}{4}} is prime for all rational integers {-\frac{d-7}{4} \leq x \leq \frac{d-3}{4}}. (The converse statement is also true, a result of Rabinovitch.) For instance, the discriminant {D = -163} is known to have class number one, giving Euler’s famous prime-generating polynomial {x^2-x+41}, which gives primes for {x=1,\dots,40}, as well as Legendre’s variant {x^2+x+41}, which gives primes for {x=0,\dots,39}. (This also generates some of the lines in Ulam’s spiral.) Unfortunately, {163} is the largest value of {d} with this property, thanks to the Stark-Heegner theorem.
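The primality claim in this exercise is easy to test by machine. The following sketch evaluates {x^2 - x + \frac{d+1}{4}} (the polynomial from part (i)) over the range {-\frac{d-7}{4} \leq x \leq \frac{d-3}{4}} and tests each value for primality by trial division. The function names are hypothetical, and the list of {d} in the usage lines is the standard list of primes {d = 3\ (4)} with {d \geq 7} for which the class number is one, together with {d=23} (class number three) as a contrast.

```python
def is_prime(n):
    """Trial division primality test, adequate for the small values used here."""
    if n < 2:
        return False
    k = 2
    while k * k <= n:
        if n % k == 0:
            return False
        k += 1
    return True


def polynomial_values(d):
    """Values of x^2 - x + (d+1)/4 for -(d-7)/4 <= x <= (d-3)/4, with d = 3 mod 4."""
    if d % 4 != 3:
        raise ValueError("d must be congruent to 3 mod 4")
    c = (d + 1) // 4
    lo = -((d - 7) // 4)
    hi = (d - 3) // 4
    return [x * x - x + c for x in range(lo, hi + 1)]


def all_prime(d):
    """True if every value in the range above is prime."""
    return all(is_prime(v) for v in polynomial_values(d))


if __name__ == "__main__":
    for d in (7, 11, 19, 23, 43, 67, 163):
        print(d, all_prime(d))
```

Note that the range cannot be extended for {d=163}: at {x=41} the polynomial {x^2-x+41} takes the value {41^2}.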