In the last few weeks, the Great Internet Mersenne Prime Search (GIMPS) announced the discovery of two new Mersenne primes, both over ten million digits in length, including one discovered by the computing team right here at UCLA (see this page for more information). [I was not involved in this computing effort.] As for the question “Why do we want to find such big primes anyway?”, see this page, though this is not the focus of my post today.
The GIMPS approach to finding Mersenne primes relies of course on modern computing power, parallelisation, and efficient programming, but the number-theoretic heart of it – aside from some basic optimisation tricks such as fast multiplication and preliminary sieving to eliminate some obviously non-prime Mersenne number candidates – is the Lucas-Lehmer primality test for Mersenne numbers, which is much faster for this special type of number than any known general-purpose (deterministic) primality test (such as, say, the AKS test). This test is easy enough to describe, and I will do so later in this post, and also has some short elementary proofs of correctness; but the proofs are sometimes presented in a way that involves pulling a lot of rabbits out of hats, giving the argument a magical feel rather than a natural one. In this post, I will try to explain the basic ideas that make the primality test work, seeking a proof which is perhaps less elementary and a little longer than some of the proofs in the literature, but is perhaps a bit better motivated.
— Order —
We begin with a general discussion of how to tell when a given number n (which is not necessarily of the Mersenne form ) is prime or not. One should think of n as being moderately large, e.g. (which is broadly the size of the Mersenne primes discovered recently).
Our starting point will be Lagrange’s theorem, which asserts that
for prime and coprime to ; applying it instead to the multiplicative group of a cyclic group of order , we obtain Euler’s theorem
whenever is coprime to n, where is the Euler totient function of n.
Fermat’s little theorem (2) already gives a necessary condition for the primality of a candidate prime : take any coprime to (typically one picks a small number such as or ), and compute modulo . If it is not equal to 1, then cannot be prime. This is a (barely) feasible test to execute for as large as , because one can compute exponents such as relatively quickly, by the trick of repeatedly squaring a modulo n to obtain , and then decomposing into binary to compute . (If is a Mersenne number, then there are some pretty obvious shortcuts one can take for this last step, using division instead of multiplication.) Unfortunately, while this test is necessary for primality, it is not sufficient, due to the existence of pseudoprimes. So Fermat’s little theorem alone does not provide the answer, at least if one wants a deterministic certificate of primality rather than a probabilistic one.
Nevertheless, the above facts do provide some important information about the order of a modulo . If is prime, Fermat’s little theorem (2) tells us that is at most (in fact, it divides ). On the other hand, if is not prime, then Euler’s theorem (3) tells us that cannot be as large as , but is instead at most , which is now strictly less than . Thus: if we can find a number coprime to such that is exactly , we have certified that is prime.
Testing whether a given number is coprime to is very easy and fast (thanks to the Euclidean algorithm). Unfortunately, it is difficult in general to compute the order of a number if the base is large; a brute-force approach would require one to compute up to powers of , which is prohibitively expensive if has size comparable to . (More generally, the problem of computing order exactly in general is closely related to the discrete logarithm problem, which is notoriously difficult.) However, there are a few cases in which the order can be found very quickly. Suppose that we somehow find positive integers , such that
(which in particular implies that is coprime to ). Squaring this, we obtain
and so we see that divides but not , and thus must be exactly equal to . Conversely, if , then we must have (4). So in the special case when the order of is a power of 2, we can compute the order using only one exponentiation, which is computationally feasible for the orders of magnitude we are considering.
Unfortunately, this is not quite what we need for the Mersenne prime problem, because if is a Mersenne number, then it is plus 1 which is a power of 2, rather than minus 1. (The above method would be ideal for finding Fermat primes rather than Mersenne primes, but it is likely that in fact there are no more such primes to be found.)
So this is a frustrating near miss: if is a Mersenne number, we can easily check if a number has order modulo , but we needed a test for when a number has order instead. And indeed, even when is prime, Fermat’s little theorem (2) shows that it is impossible for a number to have order modulo , since the order needs to divide . So we seem to be a bit stuck.
But while clearly does not divide , it does divide . Looking at Lagrange’s theorem (1), we then see that it could be possible to find elements of order in a multiplicative group of order rather than . Recall that if was prime, then the multiplicative group of the finite field had order . But is also a finite field, and its multiplicative group has order . Aha!
So the plan (assuming for sake of argument that is prime) is to somehow work in the finite field instead of , in order to find elements of order . We can get our hands on this larger finite field more concretely by viewing it as a quadratic extension , where a is a quadratic non-residue of .
Let’s now take n to be a Mersenne prime. What numbers are quadratic non-residues? A quick appeal to quadratic reciprocity and some elementary number theory soon reveals that 2 is a quadratic residue of , but that 3 is not. Thus we can take . (One could work with other numbers than 3 here, but being the smallest quadratic non-residue available, it is the simplest one to use, and the one which is most likely to be able to take advantage of the strong law of small numbers.) Henceforth all calculations will be in this field , which of course contains as a subfield.
Now we need to look for a field element of order , which is a power of 2. Thus (by adapting (4) to ) we need to find a solution to the equation
in this field.
Let’s compute some expressions of the form . From Fermat’s little theorem (2) we have
because 3 is not a quadratic residue, we see (from taking discrete logarithms) that
Similarly we have
These are pretty close to (5), but not quite right. To go further, it is convenient to work with powers rather than powers – i.e. we work with the Frobenius automorphism . Indeed, since has characteristic n, we have the automorphism properties
; . (8)
From (6) we have , while from (2) we have for . From (8) we thus see that
for ; thus the Frobenius automorphism is nothing more than Galois conjugation. (Actually, this can be deduced quite readily from standard Galois theory.)
Now we go back from powers to powers. Multiplying both sides of the preceding equation by , we obtain
Squaring , we conclude
Now we in a good position to solve the equation (5). We cannot make equal to -1 – since -1 is not a quadratic residue modulo 3 – but we can make it equal to, say, -2, by setting a=1 and b=-1 (say):
Dividing this by (7) we obtain the desired solution
to (5), where . (Note that one could also use here; indeed, Galois theory tells us that and are interchangeable in these computations.)
To summarise, we have shown that
If is a Mersenne prime, then (9) holds.
Based on our previous discussion, we expect to be able to reverse this implication. Indeed, we have the following converse:
Lemma 1. Let be a Mersenne number. If (9) holds (in the ring ), then is prime.
Proof. Let be a prime divisor of . Then in the field (which we define as if 3 is a quadratic residue there), thus has order exactly n+1 (cf. (4)). By Lagrange’s theorem (1), this means that divides the multiplicative order of , which is (if 3 is a non-residue modulo ) or (if 3 is a residue modulo ). In particular, has to exceed . Thus the only prime divisors of exceed , and so by the sieve of Eratosthenes, is prime. (This proof is due to Bruce.)
We have thus shown
Corollary 1. (Lucas-Lehmer test, preliminary version) Let with m odd. Then n is prime if and only if (9) holds in .
This is already a reasonable criterion, but it is a little non-elementary (and also a little unpleasant numerically) due to the presence of the quadratic extension by . One can get rid of this extension by the Galois theory trick of taking traces. Indeed, observe that is the Galois conjugate of . Basic Galois theory tells us that lies in , and vanishes precisely when is equal to -1. So it suffices to show that
vanishes in .
The quantity could be computed by repeated squaring in . The quantity can be computed by a similar device in . Indeed, the sequence is easily seen to obey the recursion
and so we have
Theorem 1. (Lucas-Lehmer test, final version) Let with m odd. Then n is prime if and only if vanishes modulo n, where is given by the recursion (10).
To apply this test, one needs to perform about squaring operations modulo . Doing everything as efficiently as possible (in particular, using fast multiplication), the total cost of testing a single Mersenne number for primality is about (modulo some terms). This turns out to barely be within reach of modern computers for , especially since the algorithm is somewhat parallelisable. (In contrast, the best known general-purpose deterministic primality testing algorithm, the AKS algorithm, has a run time of about (with a sizable implicit constant), which is not feasible for .) There are general-purpose probabilistic tests (such as Miller-Rabin) which have run-time comparable to the Lucas-Lehmer test, but as mentioned at the beginning, we are only interested here in deterministic (and unconditional, e.g. not relying on GRH) certificates of primality.
[Updated, Nov 16: Lemma 1 corrected.]