The classification of finite simple groups (CFSG), first announced in 1983 but only fully completed in 2004, is one of the monumental achievements of twentieth-century mathematics. Spanning hundreds of papers and tens of thousands of pages, it has been called the “enormous theorem”. A “second generation” proof of the theorem is nearly complete and is somewhat shorter (estimated at about five thousand pages), but there is currently no reasonably sized proof of the classification.
An important precursor of the CFSG is the Feit-Thompson theorem from 1962-1963, which asserts that every finite group of odd order is solvable, or equivalently that every non-abelian finite simple group has even order. This is an immediate consequence of CFSG, and conversely the Feit-Thompson theorem is an essential starting point in the proof of the classification, since it allows one to reduce matters to groups of even order for which key additional tools (such as the Brauer-Fowler theorem) become available. The original proof of the Feit-Thompson theorem is 255 pages long, which is significantly shorter than the proof of the CFSG, but still far from short. While parts of the proof of the Feit-Thompson theorem have been simplified (and it has recently been converted, after six years of effort, into an argument that has been verified by the proof assistant Coq), the available proofs of this theorem are still extremely lengthy by any reasonable standard.
However, there is a significantly simpler special case of the Feit-Thompson theorem that was established previously by Suzuki in 1957, which was influential in the proof of the more general Feit-Thompson theorem (and thus indirectly in the proof of CFSG). Define a CA-group to be a group $G$ with the property that the centraliser $C_G(x) := \{ g \in G: gx = xg \}$ of any non-identity element $x \in G$ is abelian; equivalently, the commuting relation $x \sim y$ (defined as the relation that holds when $x$ commutes with $y$, thus $xy = yx$) is an equivalence relation on the non-identity elements $G \backslash \{1\}$ of $G$. Trivially, every abelian group is CA. A non-abelian example of a CA-group is the $ax+b$ group of invertible affine transformations $x \mapsto ax+b$ on a field $F$. A little less obviously, the special linear group $SL_2(F_q)$ over a finite field $F_q$ is a CA-group when $q$ is a power of two. The finite simple groups of Lie type are not, in general, CA-groups, but when the rank is bounded they tend to behave as if they were “almost CA”; the centraliser of a generic element in $SL_d(F_q)$, for instance (when $d$ is bounded and $q$ is large), is typically a maximal torus (because most elements in $SL_d(F_q)$ are regular semisimple), which is certainly abelian. In view of the CFSG, we thus see that CA or nearly CA groups form an important subclass of the simple groups, and it is thus of interest to study them separately. To this end, we have

Theorem 1 (Suzuki’s theorem on CA-groups) Every finite CA-group of odd order is solvable.
Of course, this theorem is superseded by the more general Feit-Thompson theorem, but Suzuki’s proof is substantially shorter (the original proof is nine pages) and will be given in this post. (See this survey of Solomon for some discussion of the link between Suzuki’s argument and the Feit-Thompson argument.) Suzuki’s analysis can be pushed further to give an essentially complete classification of all the finite CA-groups (of either odd or even order), but we will not pursue these matters here.
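The CA-group definition above is easy to test by brute force on small groups. The following is a minimal sketch, not taken from the original post: the helper names `is_ca_group`, `affine_group` and `symmetric_group` are illustrative, and the choice of the field $F_7$ is an arbitrary assumption. It checks that every non-identity element has an abelian centraliser.

```python
from itertools import permutations

def is_ca_group(elements, mul, identity):
    """Return True if every non-identity element has an abelian centraliser."""
    for x in elements:
        if x == identity:
            continue
        centraliser = [g for g in elements if mul(g, x) == mul(x, g)]
        if any(mul(g, h) != mul(h, g) for g in centraliser for h in centraliser):
            return False
    return True

def affine_group(p):
    """The ax+b group over F_p (p prime): pairs (a, b) with a != 0 acting by x -> a*x + b."""
    elements = [(a, b) for a in range(1, p) for b in range(p)]
    # (a, b) composed with (c, d): x -> a*(c*x + d) + b
    mul = lambda g, h: ((g[0] * h[0]) % p, (g[0] * h[1] + g[1]) % p)
    return elements, mul, (1, 0)

def symmetric_group(n):
    """The symmetric group S_n, with permutations stored as tuples of images."""
    elements = list(permutations(range(n)))
    mul = lambda g, h: tuple(g[h[i]] for i in range(n))
    return elements, mul, tuple(range(n))

print(is_ca_group(*affine_group(7)))      # ax+b group over F_7
print(is_ca_group(*symmetric_group(3)))   # S_3
print(is_ca_group(*symmetric_group(4)))   # S_4
```

In this sketch the affine group and $S_3$ pass the test, while $S_4$ fails it, because the centraliser of a double transposition in $S_4$ is a non-abelian group of order eight.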
Moving even further down the ladder of simple precursors of CFSG is the following theorem of Frobenius from 1901. Define a Frobenius group to be a finite group $G$ which has a subgroup $H$ (called the Frobenius complement) with the property that all the non-trivial conjugates $gHg^{-1}$ of $H$, for $g \in G \backslash H$, intersect $H$ only at the identity. For instance, the $ax+b$ group (over a finite field $F$) is also a Frobenius group (take $H$ to be the affine transformations that fix a specified point $x_0 \in F$, e.g. the origin). This example suggests that there is some overlap between the notions of a Frobenius group and a CA-group. Indeed, note that if $G$ is a CA-group and $H$ is a maximal abelian subgroup of $G$, then any conjugate $gHg^{-1}$ of $H$ that is not identical to $H$ will intersect $H$ only at the identity (because $H$ and each of its conjugates consist of equivalence classes under the commuting relation $\sim$, together with the identity). So if a maximal abelian subgroup $H$ of a CA-group is its own normaliser (thus $N_G(H) := \{ g \in G: gH = Hg \}$ is equal to $H$), then the group is a Frobenius group.
Frobenius’ theorem places an unexpectedly strong amount of structure on a Frobenius group:
Theorem 2 (Frobenius’ theorem) Let $G$ be a Frobenius group with Frobenius complement $H$. Then there exists a normal subgroup $K$ of $G$ (called the Frobenius kernel of $G$) such that $G$ is the semi-direct product $K \rtimes H$ of $K$ and $H$.
Roughly speaking, this theorem indicates that all Frobenius groups “behave” like the $ax+b$ example (which is a quintessential example of a semi-direct product).
Note that if every CA-group of odd order were either Frobenius or abelian, then Theorem 2 would imply Theorem 1 by an induction on the order of $G$, since any subgroup of a CA-group is clearly again a CA-group. Indeed, the proof of Suzuki’s theorem does basically proceed by this route (Suzuki’s arguments do indeed imply that CA-groups of odd order are Frobenius or abelian, although we will not quite establish that fact here).
Frobenius’ theorem can be reformulated in the following concrete combinatorial form:
Theorem 3 (Frobenius’ theorem, equivalent version) Let $G$ be a group of permutations acting transitively on a finite set $X$, with the property that any non-identity permutation in $G$ fixes at most one point in $X$. Then the set of permutations in $G$ that fix no points in $X$, together with the identity, is closed under composition.
Again, a good example to keep in mind for this theorem is when $G$ is the group of affine permutations on a finite field $F$ (i.e. the $ax+b$ group for that field), and $X$ is the set of points on that field. In that case, the set of permutations in $G$ that do not fix any points are the non-trivial translations.
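The affine example can be checked directly against the statement of Theorem 3. The sketch below is illustrative only: the prime $p = 7$ and all variable names are assumptions made for the example, not part of the original argument. It builds the affine permutations of $F_p$, verifies the fixed-point hypothesis, and confirms that the fixed-point-free permutations together with the identity are closed under composition.

```python
from itertools import product

p = 7  # illustrative prime; the field is F_p = {0, ..., p-1}
points = range(p)

# Each group element is the permutation x -> a*x + b, stored as a tuple of images.
group = {tuple((a * x + b) % p for x in points)
         for a in range(1, p) for b in range(p)}
identity = tuple(points)

def fixed_points(g):
    return [x for x in points if g[x] == x]

def compose(g, h):
    return tuple(g[h[x]] for x in points)

# Hypothesis of Theorem 3: non-identity elements fix at most one point.
hypothesis_holds = all(len(fixed_points(g)) <= 1 for g in group if g != identity)

# K: the fixed-point-free permutations together with the identity.
K = {g for g in group if g == identity or not fixed_points(g)}
closed = all(compose(g, h) in K for g, h in product(K, repeat=2))

translations = {tuple((x + b) % p for x in points) for b in range(p)}
stabiliser_size = sum(1 for g in group if g[0] == 0)   # stabiliser of the point 0

print(hypothesis_holds, closed, K == translations)
print(len(K) * stabiliser_size == len(group))
```

The last line checks the element count one expects from a semi-direct product of the translations with a point stabiliser.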
To deduce Theorem 3 from Theorem 2, one applies Theorem 2 to the stabiliser of a single point in $X$. Conversely, to deduce Theorem 2 from Theorem 3, set $X := G/H = \{ gH: g \in G \}$ to be the space of left-cosets of $H$, with the obvious left $G$-action; one easily verifies that this action is faithful, transitive, and each non-identity element $g$ of $G$ fixes at most one left-coset of $H$ (basically because it lies in at most one conjugate of $H$). If we let $K$ be the elements of $G$ that do not fix any point in $X$, plus the identity, then by Theorem 3 $K$ is closed under composition; it is also clearly closed under inverse and conjugation, and is hence a normal subgroup of $G$. From construction $K$ is the identity plus the complement of all the $|G|/|H|$ conjugates of $H$, which are all disjoint except at the identity, so by counting elements we see that

$$ |K| = |G| - \frac{|G|}{|H|} (|H| - 1) = \frac{|G|}{|H|}. $$

As $H$ normalises $K$ and intersects $K$ only at the identity, we thus see that $KH = K \rtimes H$ is all of $G$, giving Theorem 2.
Despite the appealingly concrete and elementary form of Theorem 3, the only known proofs of that theorem (or equivalently, Theorem 2) in its full generality proceed via the machinery of group characters (which one can think of as a version of Fourier analysis for nonabelian groups). On the other hand, once one establishes the basic theory of these characters (reviewed below the fold), the proof of Frobenius’ theorem is very short, which gives quite a striking example of the power of character theory. The proof of Suzuki’s theorem also proceeds via character theory, and is basically a more involved version of the Frobenius argument; again, no character-free proof of Suzuki’s theorem is currently known. (The proofs of Feit-Thompson and CFSG also involve characters, but those proofs also contain many other arguments of much greater complexity than the character-based portions of the proof.)
It seems to me that the above four theorems (Frobenius, Suzuki, Feit-Thompson, and CFSG) provide a ladder of sorts (with exponentially increasing complexity at each step) to the full classification, and that any new approach to the classification might first begin by revisiting the earlier theorems on this ladder and finding new proofs of these results first (in particular, if one had a “robust” proof of Suzuki’s theorem that also gave non-trivial control on “almost CA-groups” – whatever that means – then this might lead to a new route to classifying the finite simple groups of Lie type and bounded rank). But even for the simplest two results on this ladder – Frobenius and Suzuki – it seems remarkably difficult to find any proof that is not essentially the character-based proof. (Even trying to replace character theory by its close cousin, representation theory, doesn’t seem to work unless one gives in to the temptation to take traces everywhere and put the characters back in; it seems that rather than abandon characters altogether, one needs to find some sort of “robust” generalisation of existing character-based methods.) In any case, I am recording here the standard character-based proofs of the theorems of Frobenius and Suzuki below the fold. There is nothing particularly novel here, but I wanted to collect all the relevant material in one place, largely for my own benefit.
— 1. Basic character theory —
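Let $G$ be a finite group. Then we can form the finite-dimensional complex Hilbert space $L^2(G)$ of functions $f: G \to \mathbf{C}$ with the inner product

$$ \langle f_1, f_2 \rangle_{L^2(G)} := \mathbf{E}_{x \in G} f_1(x) \overline{f_2(x)} $$

and thus norm

$$ \| f \|_{L^2(G)} := ( \mathbf{E}_{x \in G} |f(x)|^2 )^{1/2} $$

where $\mathbf{E}_{x \in G} := \frac{1}{|G|} \sum_{x \in G}$ is the averaging operator. Inside this space, we have the subspace $L^2(G)^G$ of class functions: functions $f: G \to \mathbf{C}$ which are invariant under the conjugation action of $G$ on itself, thus

$$ f(gxg^{-1}) = f(x) $$

for all $x, g \in G$. Equivalently, $f$ is constant on each conjugacy class of $G$. In particular, we see that the dimension of $L^2(G)^G$ is equal to the class number of $G$ (the number of conjugacy classes of $G$).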
One way to generate class functions is from taking traces of finite-dimensional unitary representations $\rho: G \to U(V)$, i.e. a homomorphism from the group $G$ to unitary operators on a finite-dimensional complex Hilbert space $V$. We will abbreviate “finite-dimensional unitary representation” as “representation” henceforth. Given any such representation, one has an associated character $\chi_\rho: G \to \mathbf{C}$ defined by

$$ \chi_\rho(g) := \mathrm{tr}\, \rho(g). $$

One easily verifies that this is a class function. For instance, the regular representation $\tau: G \to U(L^2(G))$, in which $\tau(g) f(x) := f(g^{-1} x)$, has character

$$ \chi_\tau(g) = |G| 1_{g=1} $$

and every linear character $\chi: G \to S^1$ (i.e. a homomorphism to the complex unit circle $S^1$) is a character associated to the obvious one-dimensional representation corresponding to $\chi$. In particular, the constant function $1$ is a character, associated to the principal one-dimensional representation.
The characters interact well with various representation-theoretic operations. For instance, isomorphic representations clearly have the same character. If are representations, then the characters of the direct sum and tensor product are the sum and product of the individual characters:
Also, if is the Hilbert space with the conjugated inner product, then the conjugate representation (given by taking and conjugating the inner product structure) has a conjugated character:
Thus the space of characters forms a semi-ring in that is closed under complex conjugation. Also, since any element of the absolute Galois group of the rationals can be extended to the complex numbers , we have the stronger fact that the space of characters is also invariant with respect to the action of ; we will need this fact somewhat later in this post.
The space of characters is not a ring, because characters are certainly not preserved with respect to negation: the value of a representation at the identity is the dimension of the space that acts on, and so is non-negative; in particular, will not be a character as long as is positive dimensional. We can then define the space of generalised characters to be the ring generated by the characters, thus a generalised character is nothing more than a difference of two characters.
By repeatedly taking orthogonal complements, one can easily see that representations are completely reducible, thus if is a collection of one representative of each of the isomorphism classes of irreducible finite-dimensional representations of , then every character can be written as a linear combination (over the natural numbers) of the irreducible characters .
From the ergodic theorem (which is a triviality in the case of an action of a finite group ), the average value of a character of a representation is equal to the dimension of its invariant component :
As a consequence, the inner product of two characters is equal to the dimension of the -invariant component of . By Schur’s lemma, this implies in particular that the irreducible characters are an orthonormal system in :
In fact, they form an orthonormal basis for . (Proof: given any non-trivial , the convolution operator is non-trivial in the regular representation, and thus must also be non-trivial with respect to at least one irreducible representation , which implies that has non-zero inner product with . Thus there is no non-zero element of that is orthogonal to all the irreducible characters, giving the claim.) In particular, we see that is equal to the class number of .
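As a small sanity check of the orthonormality and basis claims, one can work with the well-known character table of the symmetric group $S_3$. The sketch below is only an illustration: the table is entered by hand as an assumption of the example, and the inner product is computed class by class with exact rational arithmetic.

```python
from fractions import Fraction

# Conjugacy classes of S_3: identity, transpositions, 3-cycles.
class_sizes = [1, 3, 2]
order = sum(class_sizes)

# Irreducible characters of S_3, one value per conjugacy class (all real here).
irreducible = {
    "trivial":  [1,  1,  1],
    "sign":     [1, -1,  1],
    "standard": [2,  0, -1],
}

def inner(f, g):
    """<f, g> = average over the group of f * conj(g); the values are real, so conj is omitted."""
    return Fraction(sum(n * a * b for n, a, b in zip(class_sizes, f, g)), order)

gram = {(s, t): inner(f, g)
        for s, f in irreducible.items() for t, g in irreducible.items()}
orthonormal = all(v == (1 if s == t else 0) for (s, t), v in gram.items())

print(orthonormal)
print(len(irreducible) == len(class_sizes))   # number of irreducibles = class number
```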
Using this basis, we now have a Fourier transform that is an isometry between the Hilbert space of class functions, and the Hilbert space , defined by taking inner products with characters
and with the usual Fourier inversion formula
and Plancherel and Parseval identities
(One can also relate convolution in with pointwise multiplication in , and pointwise multiplication in is related to a plethysm on involving tensor product multiplicities, but we will not need these operations here.)
A character of a (not necessarily irreducible) representation then has Fourier coefficients that count the multiplicity of in ; in particular, a class function is a character (resp. generalised character) iff its Fourier coefficients are all natural numbers (resp. integers), and any two representations are isomorphic iff they have the same character.
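Thus for instance the regular representation $\tau$ has Fourier coefficients $\dim(V_\rho)$, leading to the identity

$$ \chi_\tau = \sum_\rho \dim(V_\rho) \chi_\rho $$

(with $\rho$ ranging over the irreducible representations of $G$, one from each isomorphism class),
which gives the Peter-Weyl theorem that $L^2(G)$ is isomorphic to the direct sum of $\dim(V_\rho)$ copies of $V_\rho$ for each $\rho$. In particular we have

$$ \sum_\rho \dim(V_\rho)^2 = |G|. \qquad (1) $$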
From these Fourier identities we can now detect whether a representation is irreducible (or a combination of a small number of representations) through the structure of its character. Indeed, by taking Fourier transforms and working in we now have the following immediate corollaries, which will be very useful to us in the sequel:
Lemma 4 Let $\chi$ be a generalised character of $G$. Then:
- (i) $\|\chi\|_{L^2(G)}^2$ is a natural number.
- (ii) $\|\chi\|_{L^2(G)}^2$ equals $1$ iff $\chi = \pm \chi_\rho$ for some sign $\pm$ and irreducible character $\chi_\rho$. If we also know that $\chi(1) > 0$, this forces $\chi = \chi_\rho$ (since $\chi_\rho(1) = \dim(V_\rho)$ is positive).
- (iii) $\|\chi\|_{L^2(G)}^2$ equals $2$ iff $\chi = \pm \chi_\rho \pm \chi_{\rho'}$ for some signs $\pm$ and distinct irreducible characters $\chi_\rho, \chi_{\rho'}$. If we also know that $\chi(1) = 0$, then this forces $\chi = \chi_\rho - \chi_{\rho'}$ after relabelling (again because $\chi_\rho(1), \chi_{\rho'}(1)$ are positive).
One can of course also characterise when $\|\chi\|_{L^2(G)}^2$ is equal to $3$, $4$, etc. by this method, although the descriptions rapidly become more complicated and less useful. In practice, this lemma will allow us to construct interesting examples of irreducible representations by first exhibiting a generalised character of small norm (or equivalently, two characters that are close to each other in norm). It seems very difficult to mimic this type of construction by any other means, including non-character-based representation theoretic methods. (But perhaps one could categorify this lemma somehow using K-theory.)
Lemma 4 is a typical application of the integrality gap, which is the trivial but fundamental fact that integers are either zero or have magnitude at least one. It is the interplay between the integrality gap and the Fourier analysis of characters which drives the proof of both Frobenius’ theorem and Suzuki’s theorem, as we shall soon see.
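To see the small-norm criterion in action, here is a hedged illustration on $S_3$ (again using a hand-entered character table, which is an assumption of the example). The permutation character minus the trivial character has norm squared one and is positive at the identity, so it must be an irreducible character. The trivial character minus the sign character has norm squared two and vanishes at the identity, so it is a difference of two distinct irreducible characters.

```python
from fractions import Fraction

class_sizes = [1, 3, 2]          # S_3: identity, transpositions, 3-cycles
order = sum(class_sizes)
irreducible = {"trivial": [1, 1, 1], "sign": [1, -1, 1], "standard": [2, 0, -1]}

def inner(f, g):
    # Real-valued class functions, so no complex conjugation is needed here.
    return Fraction(sum(n * a * b for n, a, b in zip(class_sizes, f, g)), order)

def multiplicities(chi):
    """Fourier coefficients of a class function against the irreducible characters."""
    return {name: inner(chi, irr) for name, irr in irreducible.items()}

permutation = [3, 1, 0]                       # number of fixed points on a 3-element set
chi = [a - 1 for a in permutation]            # subtract the trivial character
psi = [a - b for a, b in zip(irreducible["trivial"], irreducible["sign"])]

print(inner(chi, chi), chi[0], multiplicities(chi))
print(inner(psi, psi), psi[0], multiplicities(psi))
```

In both cases the Fourier coefficients are integers, and the norm squared is the sum of their squares, which is where the integrality gap enters.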
We also observe that the kernel

$$ \{ g \in G: \chi(g) = \chi(1) \} \qquad (3) $$

of a character $\chi = \chi_\rho$ is automatically a normal subgroup of $G$. This is because a unitary operator on a finite-dimensional space $V$ has trace $\dim(V)$ if and only if it is equal to the identity operator, and so (3) is also the kernel of the associated representation $\rho$. This latter fact suggests a strategy to prove Frobenius’ theorem by exhibiting a character whose kernel is precisely the complement of the conjugates of $H \backslash \{1\}$, and this is in fact exactly what we will do.
Thus far we have focused on the representation theory of a single group $G$. However, the situation becomes significantly more interesting when one relates the representation theory of two groups, a finite group $G$ and a subgroup $H$ (not necessarily normal). We then have an obvious restriction map $\mathrm{Res}^G_H: L^2(G)^G \to L^2(H)^H$ that restricts any class function on $G$ to a class function on $H$ (since any two elements of $H$ that are conjugate in $H$ are clearly also conjugate in $G$). The adjoint map $\mathrm{Ind}^G_H: L^2(H)^H \to L^2(G)^G$ can be easily computed: given any class function $f \in L^2(H)^H$, the induced class function $\mathrm{Ind}^G_H f$ is given by the Frobenius formula

$$ \mathrm{Ind}^G_H f(g) := \frac{1}{|H|} \sum_{x \in G} f(x^{-1} g x) $$

with the convention that $f$ is extended by zero from $H$ to $G$. Equivalently, one has

$$ \mathrm{Ind}^G_H f(g) = \sum_{i=1}^{|G|/|H|} f(x_i^{-1} g x_i) $$

where $x_1 H, \dots, x_{|G|/|H|} H$ is an enumeration of the left-cosets of $H$.
In a similar fashion, given a representation of , we may restrict it to to obtain a representation of . In the adjoint direction, given a representation of , one can associate an induced representation of by the following construction. One takes to be the space of functions that for all and with inner product
and then lets act on by the formula
which one can check to indeed give a representation. Thus, for instance, the regular representation on induces (an isomorphic copy of) the regular representation on . It is not difficult to show that these representation-theoretic constructions are compatible with the operations on class functions mentioned earlier, thus
for every representation of , and dually
In particular, any character of restricts to a character of , and every character of induces a character on . Thus the adjoint relationship between and for class functions induces a corresponding adjoint relationship for representations, known as Frobenius reciprocity.
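The Frobenius formula and the adjoint relationship can be tested numerically on a small case. The sketch below assumes the normalisation $\mathrm{Ind}\, f(g) = \frac{1}{|H|} \sum_{x \in G} f(x^{-1} g x)$ with $f$ extended by zero outside $H$, and takes $G = S_3$ with $H$ the stabiliser of a point; these choices and all helper names are illustrative assumptions.

```python
from fractions import Fraction
from itertools import permutations

n = 3
G = list(permutations(range(n)))                 # S_3 as tuples of images
H = [g for g in G if g[2] == 2]                  # stabiliser of the point 2

def mul(g, h):
    return tuple(g[h[i]] for i in range(n))

def inv(g):
    out = [0] * n
    for i, j in enumerate(g):
        out[j] = i
    return tuple(out)

def induce(f):
    """Induced class function: average of f over conjugates of g that land in H."""
    def induced(g):
        conjugates = [mul(mul(inv(x), g), x) for x in G]
        return Fraction(sum(f(y) for y in conjugates if y in H), len(H))
    return induced

def inner(group, f1, f2):
    # Real-valued class functions only, so conjugation is omitted.
    return Fraction(sum(f1(g) * f2(g) for g in group), len(group))

def sign(g):
    inversions = sum(1 for i in range(n) for j in range(i + 1, n) if g[i] > g[j])
    return -1 if inversions % 2 else 1

trivial = lambda g: 1
fixed_point_count = lambda g: sum(1 for i in range(n) if g[i] == i)
ind_trivial = induce(trivial)
ind_sign = induce(sign)

# Inducing the trivial character of a point stabiliser gives the permutation character.
print(all(ind_trivial(g) == fixed_point_count(g) for g in G))
# Frobenius reciprocity: <Ind f, chi>_G = <f, Res chi>_H.
print(inner(G, ind_trivial, sign) == inner(H, trivial, sign))
print(inner(G, ind_sign, sign) == inner(H, sign, sign))
```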
In general, the restriction or induction of an irreducible representation will not be irreducible (and the operations of restriction and induction do not invert each other). However, the geometry of the characters can often be controlled quite precisely (especially for structured groups such as Frobenius groups or CA-groups), and this together with tools such as Lemma 4 can allow us to create interesting irreducible representations of in non-trivial fashion from irreducible representations of .
— 2. Frobenius’ theorem —
We are now ready to prove Frobenius’ theorem. Let $G$ be a Frobenius group with Frobenius complement $H$. Then there are $|G|/|H|$ distinct conjugates $gHg^{-1}$ of $H$, which are all disjoint except for the identity, thus one can partition

$$ G = K \cup \bigcup_{g \in G/H} g (H \backslash \{1\}) g^{-1} \qquad (5) $$

where (by abuse of notation) $g$ ranges over a set of representatives of the left cosets of $H$ and $K$ is the identity together with all the elements that do not lie in any conjugate of $H$. I like to think of this decomposition by picturing $G$ as being like a plane, $1$ being the origin in this plane, each conjugate $gHg^{-1}$ being a non-vertical line through the origin, and $K$ being the vertical line through the origin (note that in the case of the $ax+b$ group, this is more or less exactly what (5) actually looks like). Counting elements in (5), we thus have

$$ |K| = |G| - \frac{|G|}{|H|} (|H| - 1) = \frac{|G|}{|H|}. \qquad (6) $$
We now start inducing characters from $H$ to $G$ and determine their geometry. Let $\chi$ be an irreducible character of $H$ of some dimension $d$; then the character obeys the identities

$$ \chi(1) = d; \qquad \mathbf{E}_{h \in H} |\chi(h)|^2 = 1. $$
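Now we consider the induced character $\mathrm{Ind}^G_H \chi$.
Using the above identities and the partition (5), we see that $\mathrm{Ind}^G_H \chi$ equals $d |G|/|H|$ at $1$, vanishes at all other elements of $K$, and on each of the sets $g (H \backslash \{1\}) g^{-1}$ is given by a conjugate of $\chi$. In particular,

$$ \| \mathrm{Ind}^G_H \chi \|_{L^2(G)}^2 = \frac{d^2 |G|}{|H|^2} + 1 - \frac{d^2}{|H|}. $$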
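Similarly, if one induces the trivial character $1$ on $H$ to a character $\mathrm{Ind}^G_H 1$ on $G$, this character will equal $|G|/|H|$ at $1$, vanish at all other elements of $K$, and will equal $1$ on $G \backslash K$. If we then form the generalised character

$$ \chi' := \mathrm{Ind}^G_H \chi - d\, \mathrm{Ind}^G_H 1 + d $$

then $\chi'$ equals $d$ on $K$ and is equal to a conjugate of $\chi$ on each of the sets $g (H \backslash \{1\}) g^{-1}$ outside of $K$. In particular, since $\sum_{h \in H \backslash \{1\}} |\chi(h)|^2 = |H| - d^2$, we have

$$ \| \chi' \|_{L^2(G)}^2 = \frac{1}{|G|} \left( |K| d^2 + \frac{|G|}{|H|} (|H| - d^2) \right). $$

Using (6), we thus see that $\chi'$ has surprisingly small norm:

$$ \| \chi' \|_{L^2(G)}^2 = 1. $$

We can then apply Lemma 4 (noting that $\chi'(1) = d$ is positive) and conclude that $\chi' = \chi_{\rho'}$ for some irreducible representation $\rho'$ of $G$. Note that $\mathrm{Res}^G_H \chi' = \chi$. Thus we have shown that every irreducible representation of $H$ is the restriction of an irreducible representation of $G$.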
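The construction above can be carried out numerically on the smallest interesting affine example. In this sketch the group is the $ax+b$ group over $F_5$, the complement $H$ is the stabiliser of $0$, and $\chi$ is a faithful linear character of $H$ (so $d = 1$). All of these choices, and the formula $\chi' = \mathrm{Ind}\,\chi - d\,\mathrm{Ind}\,1 + d$ used in the code, are assumptions of the illustration consistent with the description in the text; floating-point arithmetic is used, so comparisons are made up to a small tolerance.

```python
import cmath

p = 5
G = [(a, b) for a in range(1, p) for b in range(p)]      # x -> a*x + b over F_5
H = [(a, 0) for a in range(1, p)]                        # stabiliser of 0 (Frobenius complement)
K = [(1, b) for b in range(p)]                           # translations

def mul(g, h):
    return ((g[0] * h[0]) % p, (g[0] * h[1] + g[1]) % p)

def inv(g):
    a_inv = pow(g[0], -1, p)                             # modular inverse (Python 3.8+)
    return (a_inv, (-a_inv * g[1]) % p)

log2 = {pow(2, k, p): k for k in range(p - 1)}           # 2 generates the units of F_5

def chi(h):
    """A faithful linear character of H (so d = 1): (2^k, 0) -> i^k."""
    return cmath.exp(2j * cmath.pi * log2[h[0]] / (p - 1))

def induce(f):
    def induced(g):
        conjugates = (mul(mul(inv(x), g), x) for x in G)
        # An element (a, b) lies in H exactly when b == 0.
        return sum(f(y) for y in conjugates if y[1] == 0) / len(H)
    return induced

d = 1
ind_chi, ind_one = induce(chi), induce(lambda h: 1)
chi_prime = {g: ind_chi(g) - d * ind_one(g) + d for g in G}

norm_sq = sum(abs(v) ** 2 for v in chi_prime.values()) / len(G)
print(round(norm_sq, 9))
print(all(abs(chi_prime[k] - d) < 1e-9 for k in K))          # equals d on K
print(all(abs(chi_prime[h] - chi(h)) < 1e-9 for h in H))     # restricts to chi on H
```

Here the generalised character has norm one, is constant on the translations, and restricts to the original character on the complement, as the argument predicts.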
Remark 1 The above analysis shows a little bit more, namely that arises in as the orthogonal complement of a copy of the mean zero component of quasiregular representation on (i.e. the induction of the trivial representation on ), although it is not obvious to me how one would demonstrate (other than via an inspection of characters) that the induced representation actually contains a copy of this component of the quasiregular representation.
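Now we consider the regular character of $H$:

$$ \chi_{\mathrm{reg}} := \sum_{\chi} \chi(1) \chi $$

(with $\chi$ ranging over the irreducible characters of $H$).
This character equals $|H|$ at the identity, and vanishes at the other elements of $H$. If we then form the associated character

$$ \chi'_{\mathrm{reg}} := \sum_{\chi} \chi(1) \chi' $$

of $G$, then $\chi'_{\mathrm{reg}}$ restricts to $\chi_{\mathrm{reg}}$ and so also equals $|H|$ at the identity and vanishes at the other elements of $H$. By (5), we conclude that $\chi'_{\mathrm{reg}}$ (which is a class function) is supported on $K$. Also, as each of the $\chi'$ are constant on $K$, $\chi'_{\mathrm{reg}}$ is also, and so equals $|H|$ on all of $K$. Thus $K$ is the kernel (3) of the character $\chi'_{\mathrm{reg}}$ and is thus normal. This gives Frobenius’ theorem as discussed in the introduction.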
— 3. More character theory —
Before we turn to Suzuki’s theorem, we will need some additional facts about characters which go beyond the Fourier-analytic considerations of Section 1 by also employing some tools from algebraic number theory.
Let be a finite group. Observe that if is a representation, then for any , the unitary operator can be diagonalised. As (and hence ) has finite order, the eigenvalues of are roots of unity, and so the trace is the sum of finitely many roots of unity. In particular, is always an algebraic integer. Unlike rational integers, algebraic integers do not directly enjoy an integrality gap; one can have algebraic integers of arbitrarily small nonzero magnitude (e.g. powers of ). However, we will rely in several places on the basic but fundamental fact that a number which is both an algebraic integer and a rational is necessarily a rational integer, which then is subject to the integrality gap.
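To illustrate the failure of the integrality gap for algebraic integers, one can take powers of $\sqrt{2} - 1$. This particular number is an illustrative choice made for this sketch and is not necessarily the example intended in the text. Every power has the form $a + b\sqrt{2}$ with integers $a, b$ (hence is an algebraic integer), its size tends to zero, and yet the rational integer $a^2 - 2b^2$ attached to it always has magnitude one.

```python
import math

def times_sqrt2_minus_1(x):
    """Multiply a + b*sqrt(2), stored as the integer pair (a, b), by sqrt(2) - 1."""
    a, b = x
    # (a + b*sqrt2) * (-1 + sqrt2) = (2b - a) + (a - b)*sqrt2
    return (2 * b - a, a - b)

x = (1, 0)
for n in range(1, 11):
    x = times_sqrt2_minus_1(x)
    a, b = x
    value = (math.sqrt(2) - 1) ** n          # floating-point size of the n-th power
    field_norm = a * a - 2 * b * b           # a rational integer, subject to the integrality gap
    print(n, x, f"{value:.6f}", field_norm)
```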
We have a variant of the above fact:
Lemma 5 Let $G$ be a finite group, let $\rho: G \to U(V)$ be a $d$-dimensional irreducible representation, and let $g \in G$. Then $\frac{|C(g)| \chi_\rho(g)}{d}$ is an algebraic integer, where $C(g)$ is the conjugacy class of $g$.
Proof: The endomorphism $T := \sum_{h \in C(g)} \rho(h)$ is $G$-equivariant and has trace $|C(g)| \chi_\rho(g)$; by Schur’s lemma, it is thus equal to $\frac{|C(g)| \chi_\rho(g)}{d}$ times the identity. It thus suffices to show that the diagonal entries of $T$ are algebraic integers; thus it will suffice to show that $P(T) = 0$ for some monic polynomial $P$ with integer coefficients.
for some , or equivalently that is an integer combination of . This implies that is an integer combination of , and the claim follows.
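This leads to an important corollary:

Corollary 6 Let $\rho: G \to U(V)$ be an irreducible representation of a finite group $G$. Then $\dim(V)$ divides $|G|$.

Proof: As the character $\chi_\rho$ has norm one, we have

$$ \sum_{g \in G} \chi_\rho(g) \overline{\chi_\rho(g)} = |G|. $$

Grouping the summation by conjugacy classes $C(g)$, we can express the left-hand side as the sum of terms of the form $|C(g)| \chi_\rho(g) \overline{\chi_\rho(g)}$, which by the preceding lemma and discussion is equal to $\dim(V)$ times an algebraic integer. We conclude that $|G|/\dim(V)$ is an algebraic integer also; but it is rational, and so must be a rational integer also.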
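In particular, if $G$ is of odd order, then any irreducible representation of $G$ has odd dimension. This has a further important consequence, due to Burnside:

Proposition 7 (Burnside) Let $G$ be a finite group of odd order. Then no non-principal irreducible character $\chi$ of $G$ is real-valued.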
Proof: Suppose for contradiction that is real-valued. By (2) this implies that for all .
for some subset of of order . On the other hand, as is non-principal, we have
By (7) one has
But is the dimension of , which is odd by Corollary 6, so is a half-integer. But it is also an algebraic integer, giving the desired contradiction.
— 4. Suzuki’s theorem —
We can now begin the proof of Suzuki’s theorem; we will basically use an arrangement of this theorem from the thesis of Wilcox. We begin with an easy reduction to the simple case:
Proposition 8 (Reduction to the simple case) Let $G$ be a finite CA-group of odd order which is not simple. Suppose that all CA-groups of smaller odd order than $G$ are solvable. Then $G$ is solvable also.
Proof: If $G$ is not simple, it has a proper non-trivial normal subgroup $N$. This group is also of odd order and inherits the CA property from $G$, so by hypothesis $N$ is solvable. If we let $N'$ be the last non-trivial group in the derived series of $N$, then $N'$ is a non-trivial abelian characteristic subgroup of $N$, and is thus also a normal subgroup of $G$. Let $A := C_G(N')$ be the centraliser of $N'$, then $A$ is also a normal subgroup of $G$, which is still non-trivial and abelian as $G$ is a CA-group. Furthermore, $A$ is maximal abelian (it is not contained in any larger abelian group).
To show that $G$ is solvable, it then suffices to show that the quotient $G/A$ is solvable. As this group has an odd order smaller than that of $G$, it suffices to show that $G/A$ is a CA-group. Thus, if $x, y, z$ are non-identity elements of $G/A$ with $x, z$ both commuting with $y$, we need to show that $x, z$ commute with each other. Equivalently, if $x, y, z \in G \backslash A$ are such that $x, z$ both commute with $y$ modulo $A$, then $x$ commutes with $z$ modulo $A$.
If we fix $y \in G \backslash A$, then $y$ acts on $A$ by conjugation. This action cannot fix any non-identity element $a$ of $A$, else the centraliser of $a$ would contain $y$ as well as $A$, contradicting the maximal abelian nature of $A$. Thus the map $a \mapsto [a,y] := a y a^{-1} y^{-1}$, which is a homomorphism on the normal abelian group $A$, has trivial kernel and is thus an isomorphism. From this we see that if $x$ commutes with $y$ modulo $A$, then one can multiply $x$ by an element of $A$ (on the left or right) in order to make it commute with $y$ exactly. Thus, without loss of generality, $x$ and $z$ both commute with $y$ exactly, and so commute with each other exactly as well, as $G$ is a CA-group, giving the claim.
In view of this proposition, we see that to prove Suzuki’s theorem it suffices to show that simple non-abelian CA-groups of odd order do not exist.
Observe that in a CA-group $G$, every non-identity element $x$ of $G$ is contained in a unique maximal abelian subgroup of $G$, namely the centraliser $C_G(x)$ of $x$. Thus the maximal abelian subgroups of $G$, once one removes the identity, form a partition of $G \backslash \{1\}$. It is instructive to keep some examples in mind:
- In the case of the $ax+b$ group on a field $F$, the maximal abelian subgroups are the translation group $\{ x \mapsto x+b: b \in F \}$ and the stabilisers $\{ x \mapsto a(x-x_0)+x_0: a \in F^\times \}$ of points $x_0 \in F$, where $F^\times$ is the multiplicative group $F \backslash \{0\}$.
- In the case of the special linear group with a power of two, the maximal abelian groups are conjugates of the split torus
the non-split torus
(where is any quantity not of the form for some ) or the unipotent group
As these examples show, while many of the maximal abelian subgroups may be conjugate to each other, there can certainly be several non-conjugate examples of maximal abelian subgroups. Let $A_1, \dots, A_k$ be a set consisting of one representative from each of these conjugacy classes, then we have the following analogue of the partition (5):

$$ G = \{1\} \cup \bigcup_{i=1}^k \bigcup_{g \in G/N(A_i)} g (A_i \backslash \{1\}) g^{-1} \qquad (8) $$

where $N(A_i)$ is the normaliser of $A_i$. This partition turns out to be somewhat less favourable than (5), but one can still run analogues of the Frobenius argument, particularly in the case when $|G|$ is odd (which forces many other related quantities, such as $|A_i|$, to be odd also). I like to think of this decomposition by viewing $G$ as a plane, $1$ as the origin, and $\bigcup_{g \in G} g A_i g^{-1}$ as sweeping out various sectors of this plane, with each conjugate of $A_i$ being one of the rays in the sector. (This picture is an oversimplification; for instance, it does not accurately reflect the closure of $A_i$ with respect to group inversion, but I still find it a useful picture to have in mind.)
where is the group (I like to think of this group as a sort of “Weyl group” associated to ). As a first approximation, the right hand side of (9) is close to . Thus, if one can somehow prevent from getting too small for too many values of , one can hope to upper bound the right-hand side of (9) by something less than , leading to the desired contradiction. (This strategy won’t quite work when is very small – to be precise – but this case can be worked out by hand.) Thus, we will be looking for such things as lower bounds on or upper bounds on . The fact that is odd will force the and to be odd as well, which will turn out to be useful in improving these bounds by doubling the power of the integrality gap. (We will also rely on the odd order of in a number of other places, in particular using Proposition 7.)
To get started on this strategy, suppose first that $W_i := N(A_i)/A_i$ was trivial for some $i$, thus $N(A_i) = A_i$, and then all the $|G|/|A_i|$ conjugates of $A_i$ are distinct. This makes $G$ a Frobenius group, and so by Theorem 2 there is a Frobenius kernel $K$, which is a normal subgroup of $G$. If $G$ is simple, then $K$ has to be trivial, which makes $G = A_i$ and so $G$ is abelian. This gives Suzuki’s theorem in this case. Thus we may assume that $|W_i| > 1$ for all $i$; as the $|W_i|$ are all odd, we may improve this to

$$ |W_i| \geq 3 $$
for all . From this and (9) (and bounding by one of the ) we thus see that is not too small:
Next, we use the Sylow theorems to make the $|A_i|$ pairwise coprime:

Lemma 9 The orders $|A_1|, \dots, |A_k|$ are pairwise coprime.

Proof: Suppose for contradiction that $|A_i|$ and $|A_j|$ are divisible by a common prime $p$ for some distinct $i, j$. Then $A_i$ and $A_j$ both contain groups of order $p$, and thus each non-trivially intersects a Sylow $p$-group. On the other hand, non-trivial Sylow $p$-groups have non-trivial centre (otherwise all conjugacy classes other than the identity would have order divisible by $p$, contradiction) and so must be abelian in a CA-group. By further application of the CA property we thus conclude that $A_i$ and $A_j$ both contain a Sylow $p$-group. But all Sylow $p$-groups are conjugate, and so $A_i$ non-trivially intersects a conjugate of $A_j$, contradicting (8).
(because the order of a Sylow -group is the largest power of dividing ), although we will not need to rely on this fact here. (Actually we will barely use Lemma 9 as it is, it being needed to dispose of one technical case in the final analysis.) It is interesting though to see that classical techniques such as Sylow theorems are capable of demonstrating a number of facts about the various quantities appearing in the class equation (9), although without the additional control arising from character theory these facts appear to be insufficient in and of themselves to actually contradict that equation. As an example of (12) one can take the special linear group with a power of two, in which there are three abelian groups (split torus, non-split torus, and unipotent group) of orders respectively, with the entire group being of order .
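The special linear example can be confirmed by brute force in the smallest non-trivial case $q = 4$. The sketch below is illustrative only: the encoding of $F_4$ as two-bit integers and all helper names are assumptions of the example. It enumerates $SL_2(F_4)$, checks the CA property, and tabulates the maximal abelian subgroups (the centralisers of non-identity elements) by order.

```python
import math
from itertools import product
from collections import Counter

def f4_mul(x, y):
    """Multiply in F_4 = F_2[t]/(t^2 + t + 1), elements encoded as 2-bit integers."""
    r = 0
    for i in range(2):
        if (y >> i) & 1:
            r ^= x << i
    if r & 0b100:            # reduce using t^2 = t + 1
        r ^= 0b111
    return r

def mat_mul(m, n):
    # 2x2 matrices stored as (a, b, c, d); addition in F_4 is XOR.
    a, b, c, d = m
    e, f, g, h = n
    return (f4_mul(a, e) ^ f4_mul(b, g), f4_mul(a, f) ^ f4_mul(b, h),
            f4_mul(c, e) ^ f4_mul(d, g), f4_mul(c, f) ^ f4_mul(d, h))

# Determinant ad - bc equals ad + bc in characteristic two.
G = [m for m in product(range(4), repeat=4)
     if f4_mul(m[0], m[3]) ^ f4_mul(m[1], m[2]) == 1]
identity = (1, 0, 0, 1)

def centraliser(x):
    return frozenset(g for g in G if mat_mul(g, x) == mat_mul(x, g))

def is_abelian(S):
    return all(mat_mul(g, h) == mat_mul(h, g) for g in S for h in S)

maximal_abelian = {centraliser(x) for x in G if x != identity}
is_ca = all(is_abelian(A) for A in maximal_abelian)
orders = Counter(len(A) for A in maximal_abelian)    # subgroup order -> number of such subgroups

print(len(G), is_ca, sorted(orders.items()))
print(math.prod(orders) == len(G))                   # product of the distinct orders q-1, q, q+1
print(sum((size - 1) * count for size, count in orders.items()) == len(G) - 1)
```

The final line is the element count coming from the partition of the non-identity elements into maximal abelian subgroups with the identity removed.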
We do not yet have a sufficiently strong upper bound on the right-hand side of (9), basically because we have no upper bound on the number of conjugacy classes (or sufficiently good lower bounds on the ). To get further bounds we have to return to character theory. The basic idea will be to construct generalised characters which have small norm but which take non-trivial rational integer values at many places, which when combined with the integrality gap will yield useful bounds on various quantities that appear in (9).
We turn to the details. Let be one of the maximal abelian groups. Being abelian, the character theory of is just Fourier analysis: is the group of linear characters on (i.e. the Pontryagin dual of ). (Indeed, from (1) and the observation that the class number of an abelian group is the same as its order, we see that all irreducible representations of an abelian group are one-dimensional.)
The normaliser $N(A_i)$ acts on $A_i$ by conjugation; as $A_i$ is abelian, the action of $A_i$ on itself is trivial, and so we obtain an action of $W_i$ on $A_i$ also. Taking adjoints, we obtain an action of $W_i$ on the Pontryagin dual $\hat{A_i}$ as well. Any non-trivial element of $W_i$ cannot fix a non-trivial element $a$ of $A_i$, as the centraliser of $a$ would then contain an element outside of $A_i$, contradicting the CA-group nature of $G$. Taking adjoints, we conclude that a non-trivial element $w$ of $W_i$ cannot fix a non-trivial element of $\hat{A_i}$ either (otherwise the action of $w$ minus the identity would be non-injective, hence non-surjective, on $\hat{A_i}$, so that the corresponding homomorphism on $A_i$ is non-injective). Thus we see that the action of $W_i$ foliates the non-identity elements of $\hat{A_i}$ into orbits of size $|W_i|$. Among other things, this implies that $|W_i|$ divides $|A_i| - 1$, and that the number of such orbits is $\frac{|A_i|-1}{|W_i|}$. As $|A_i|, |W_i|$ are both odd, $\frac{|A_i|-1}{|W_i|}$ is even, and in particular

$$ \frac{|A_i|-1}{|W_i|} \geq 2 \qquad (13) $$
for all .
Now let us make some generalised characters. Let be non-identity elements of which do not lie in the same -orbit. Then is a generalised character of that vanishes at the identity. Applying induction and (4), we see that
is a generalised character of that is supported on the set
and whose restriction to takes the form
From the Plancherel identity (and the assumption that have disjoint -orbits) we see that
since , we thus have
We can then apply Lemma 4 and conclude the important fact that is the difference of two distinct irreducible characters of .
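Let $\chi_{i,1}, \dots, \chi_{i,h_i}$ be a set of representatives of all the orbits of $W_i$ on the non-identity elements of $\hat{A}_i$, where $h_i = (|A_i|-1)/|W_i|$ is the number of such orbits. Then by the above discussion, we see that
$$\mathrm{Ind}^G_{A_i}(\chi_{i,j} - \chi_{i,j'})$$
is the difference of two distinct irreducible characters of $G$ whenever $1 \leq j, j' \leq h_i$ are distinct. From the linear independence of the irreducible characters and some easy combinatorics (using (13)), we then see that we can find distinct irreducible characters $\chi'_{i,1}, \dots, \chi'_{i,h_i}$ of $G$ and a sign $\epsilon_i \in \{-1,+1\}$ such that
$$\mathrm{Ind}^G_{A_i}(\chi_{i,j} - \chi_{i,j'}) = \epsilon_i (\chi'_{i,j} - \chi'_{i,j'}) \qquad (15)$$
for all $1 \leq j, j' \leq h_i$. For $h_i > 2$, the sign $\epsilon_i$ and the characters $\chi'_{i,j}$ are unique. When $h_i = 2$, there is a non-uniqueness: one then has the freedom to swap $\chi'_{i,1}$ and $\chi'_{i,2}$ while reversing the sign of $\epsilon_i$. But the set $\{\chi'_{i,1}, \dots, \chi'_{i,h_i}\}$ of characters remains unique. We will call the $\chi'_{i,j}$ the exceptional characters associated to $A_i$.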
Note that if $\sigma$ is in the absolute Galois group of the rationals, then $\sigma$ permutes the non-trivial linear characters of $A_i$. Applying $\sigma$ to (15) and using the uniqueness of the set of exceptional characters, we conclude that $\sigma$ also permutes the exceptional characters of $A_i$. On the other hand, from (15) we know that the exceptional characters of $A_i$ all agree outside of the conjugates of $A_i \setminus \{1\}$, and are thus fixed by the absolute Galois group in this region; in other words, they are rational outside of these conjugates. Moreover, as mentioned in the previous section, characters always take algebraic integer values. We conclude that the exceptional characters of $A_i$ take rational integer values outside of the conjugates of $A_i \setminus \{1\}$.
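As remarked earlier, from (15) we see that $\chi'_{i,j} - \chi'_{i,j'}$ is supported on the conjugates of $A_i \setminus \{1\}$. In particular, if $i$ and $i'$ are distinct, and $1 \leq j, j' \leq h_i$ and $1 \leq l, l' \leq h_{i'}$ (where $h_i = (|A_i|-1)/|W_i|$ denotes the number of exceptional characters of $A_i$), then $\chi'_{i,j} - \chi'_{i,j'}$ and $\chi'_{i',l} - \chi'_{i',l'}$ are orthogonal. From this and the orthonormality of irreducible characters, we conclude (again using (13)) that $\chi'_{i,j}$ and $\chi'_{i',l}$ are distinct. Thus we see that the total number of exceptional characters is $\sum_i h_i$; together with the trivial character, this gives $1 + \sum_i h_i$ distinct irreducible characters of $G$. On the other hand, observe that $A_i \setminus \{1\}$ meets exactly $h_i$ conjugacy classes of $G$, and hence the union of its conjugates consists of $h_i$ conjugacy classes, and so from (8) we see that the class number of $G$ is also $1 + \sum_i h_i$. As these numbers match, we see that we have located all of the irreducible characters of $G$; thus every non-principal irreducible character of $G$ is an exceptional character for some $A_i$.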
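Now that we have identified the irreducible characters of $G$, we can analyse other generalised characters in terms of them. We pick $1 \leq j \leq h_i$ and consider the generalised character
$$\mathrm{Ind}^G_{A_i}(1 - \chi_{i,j}).$$
As with (14), $\mathrm{Ind}^G_{A_i}(1 - \chi_{i,j})$ is supported on the conjugates of $A_i \setminus \{1\}$ and, when restricted to $A_i$, is equal to
$$|W_i| - \sum_{w \in W_i} w\chi_{i,j}. \qquad (16)$$
In particular, from Plancherel’s theorem we have
$$\| \mathrm{Ind}^G_{A_i}(1 - \chi_{i,j}) \|_{L^2(G)}^2 = |W_i| + 1.$$
This is a bit too large a norm to apply Lemma 4 again, but is still only of moderate size (recall that our enemy when trying to contradict (9) is that the $|W_i|$ are too small, too frequently), and we can nevertheless use the geometry of $L^2(G)$ and the other known characters, together with the integrality gap, to limit how $\mathrm{Ind}^G_{A_i}(1 - \chi_{i,j})$ breaks up into irreducible components. Firstly, since the $w\chi_{i,j}$ all have mean zero, we see that (16) sums to $|W_i| |A_i|$ on $A_i$, and thus
$$\langle \mathrm{Ind}^G_{A_i}(1 - \chi_{i,j}), 1 \rangle_{L^2(G)} = 1.$$
Thus the Fourier coefficient of $\mathrm{Ind}^G_{A_i}(1 - \chi_{i,j})$ at the trivial representation is $1$. Next, we see that (16) is orthogonal to $w\chi_{i,j'} - w\chi_{i,j''}$ for any $j', j''$ distinct from $j$ and any $w \in W_i$, which upon summing over the conjugates of $A_i$ and over $w$ gives (using (15)) that
$$\langle \mathrm{Ind}^G_{A_i}(1 - \chi_{i,j}), \chi'_{i,j'} - \chi'_{i,j''} \rangle_{L^2(G)} = 0.$$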
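Thus the Fourier coefficients of $\mathrm{Ind}^G_{A_i}(1 - \chi_{i,j})$ at the exceptional characters $\chi'_{i,j'}$ for $j' \neq j$ are all equal. Similarly, we have
$$\Big\langle |W_i| - \sum_{w' \in W_i} w'\chi_{i,j}, \ w\chi_{i,j} - w\chi_{i,j'} \Big\rangle_{L^2(A_i)} = -1$$
for any $j' \neq j$ and $w \in W_i$, so from (15) we have
$$\langle \mathrm{Ind}^G_{A_i}(1 - \chi_{i,j}), \chi'_{i,j} - \chi'_{i,j'} \rangle_{L^2(G)} = -\epsilon_i$$
and so the Fourier coefficient at $\chi'_{i,j}$ is $-\epsilon_i$ plus the common Fourier coefficient at all the other exceptional characters of $A_i$. Next, for $i'$ distinct from $i$ and $1 \leq l, l' \leq h_{i'}$, we see from (15) that the generalised character $\chi'_{i',l} - \chi'_{i',l'}$ is supported on the conjugates of $A_{i'} \setminus \{1\}$, which by (8) is disjoint from the support of $\mathrm{Ind}^G_{A_i}(1 - \chi_{i,j})$, thus
$$\mathrm{Ind}^G_{A_i}(1 - \chi_{i,j}) = 1 - \epsilon_i \chi'_{i,j} + \sum_{i'} c_{i'} \sum_{l=1}^{h_{i'}} \chi'_{i',l} \qquad (18)$$
for some integers $c_{i'}$ (which may depend on $i$ and $j$), where $h_{i'} = (|A_{i'}|-1)/|W_{i'}|$ is the number of exceptional characters of $A_{i'}$.
Taking norms using the orthonormality of the irreducible characters, we conclude that
$$|W_i| + 1 = 1 + (c_i - \epsilon_i)^2 + (h_i - 1) c_i^2 + \sum_{i' \neq i} h_{i'} c_{i'}^2.$$
Note that regardless of what $c_i$ is, the quantity $(c_i - \epsilon_i)^2 + (h_i - 1) c_i^2$ is always at least one, thanks to (13). We thus obtain an upper bound on the $c_{i'}$:
$$\sum_{i' \neq i} h_{i'} c_{i'}^2 \leq |W_i| - 1.$$
In particular, from (13) we see that there are not many $i'$ for which $c_{i'}$ is non-zero:
$$|\{ i' \neq i : c_{i'} \neq 0 \}| \leq \frac{|W_i| - 1}{2}.$$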
This is progress towards our goal of bounding (9) (because it helps control ), except that we also need to deal with those for which is zero. For this, the generalised character will no longer be useful, but another character of small norm – namely, – will be available as a substitute.
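The integrality-gap bookkeeping behind this bound can be checked by brute force. The sketch below is a reading of the argument under stated assumptions rather than a transcription of it: it supposes that the exceptional characters of the chosen subgroup contribute $(c - \epsilon)^2 + (h-1)c^2$ to the squared norm, where $c$ is an integer coefficient, $\epsilon = \pm 1$ and $h \geq 2$ is the (even) number of orbits, and that the remaining coefficients satisfy $\sum_{i'} h_{i'} c_{i'}^2 \leq |W| - 1$. The names `self_term` and `max_nonzero` are hypothetical.

```python
def self_term(c, eps, h):
    """Assumed contribution of the exceptional characters of the chosen
    subgroup to the squared norm: (c - eps)^2 + (h - 1) * c^2."""
    return (c - eps) ** 2 + (h - 1) * c ** 2

# Integrality gap: for integer c, eps = +-1 and even h >= 2 the term is >= 1.
min_term = min(self_term(c, eps, h)
               for c in range(-50, 51)
               for eps in (-1, 1)
               for h in range(2, 41, 2))

def max_nonzero(weyl_order, h_min=2):
    """Largest number of indices i' with c_{i'} != 0 that is compatible with
    sum_{i'} h_{i'} * c_{i'}^2 <= weyl_order - 1 when every h_{i'} >= h_min:
    each non-zero integer coefficient costs at least h_min."""
    return (weyl_order - 1) // h_min

print(min_term)                                # 1
print([max_nonzero(w) for w in (3, 5, 7, 9)])  # [1, 2, 3, 4]
```

The minimum value one is attained both at $c = 0$ and at $c = \epsilon$ with $h = 2$, so the lower bound on the number of orbits is exactly what keeps the term from vanishing.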
We turn to the details. Let be such that . Then we return to (18) and conclude that
on the set . On this set, we know from Lemma 10 that is an integer. But furthermore, from Proposition 7 we know that the exceptional characters come in conjugate pairs, so in fact Lemma 10 gives that is an even integer. We conclude that is an odd integer on , and in particular, has magnitude at least on this set. As all the agree outside of , we conclude that
on . On the other hand, from the orthonormality of the we know that has an norm of . Since each set has cardinality (as was shown in the derivation of the class equation (9)), we conclude that
We now have enough bounds on the various terms in (9) to obtain the necessary contradiction to finish Suzuki’s theorem from an elementary (though admittedly ad hoc) analysis. It is convenient to order the subgroups so that
We then write the right-hand side of (9) as
and thus (bounding by and by )
If , then
On the other hand
The two bounds are inconsistent for , so we have from (10) (and the odd nature of ), which then gives the upper bound from (23). Suzuki’s theorem can then be verified by classical computations for the odd non-abelian groups of order less than (of which there are actually not that many); alternatively, one can use (11), Lemma 9, and (12) to eliminate this case (as no odd number less than has more than three prime factors). So the only remaining case is when and . In this case we may interchange the indices and (which does not affect (21)) and, repeating the above arguments, may thus also assume that . Since , we conclude that . But this contradicts Lemma 9, and Suzuki’s theorem is proved.
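The arithmetic remark about prime factors can be checked directly. The sketch below assumes that "prime factors" means distinct prime factors (counted with multiplicity, $81 = 3^4$ already has four) and finds the smallest odd number with at least four of them; the function names are hypothetical.

```python
def distinct_prime_factors(n):
    """Number of distinct primes dividing n, found by trial division."""
    count, p = 0, 2
    while p * p <= n:
        if n % p == 0:
            count += 1
            while n % p == 0:
                n //= p
        p += 1
    # Whatever remains above 1 is a single prime factor.
    return count + (1 if n > 1 else 0)

def first_odd_with(m):
    """Smallest odd positive integer with at least m distinct prime factors."""
    n = 1
    while distinct_prime_factors(n) < m:
        n += 2
    return n

print(first_odd_with(4))  # 1155
```

Since $1155 = 3 \cdot 5 \cdot 7 \cdot 11$, every odd number below $1155$ has at most three distinct prime factors.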