The classification of finite simple groups (CFSG), first announced in 1983 but only fully completed in 2004, is one of the monumental achievements of twentieth-century mathematics. Spanning hundreds of papers and tens of thousands of pages, it has been called the “enormous theorem”. A “second generation” proof of the theorem is nearing completion and is somewhat shorter (estimated at about five thousand pages in length), but currently there is no reasonably sized proof of the classification.
An important precursor of the CFSG is the Feit-Thompson theorem from 1962-1963, which asserts that every finite group of odd order is solvable, or equivalently that every non-abelian finite simple group has even order. This is an immediate consequence of CFSG, and conversely the Feit-Thompson theorem is an essential starting point in the proof of the classification, since it allows one to reduce matters to groups of even order for which key additional tools (such as the Brauer-Fowler theorem) become available. The original proof of the Feit-Thompson theorem is 255 pages long, which is significantly shorter than the proof of the CFSG, but still far from short. While parts of the proof of the Feit-Thompson theorem have been simplified (and it has recently been converted, after six years of effort, into an argument that has been verified by the proof assistant Coq), the available proofs of this theorem are still extremely lengthy by any reasonable standard.
However, there is a significantly simpler special case of the Feit-Thompson theorem that was established previously by Suzuki in 1957, which was influential in the proof of the more general Feit-Thompson theorem (and thus indirectly in the proof of CFSG). Define a CA-group to be a group $G$ with the property that the centraliser $C_G(g) := \{ h \in G: gh = hg \}$ of any non-identity element $g \in G$ is abelian; equivalently, the commuting relation $g \sim h$ (defined as the relation that holds when $g$ commutes with $h$, thus $gh = hg$) is an equivalence relation on the non-identity elements $G \setminus \{1\}$ of $G$. Trivially, every abelian group is CA. A non-abelian example of a CA-group is the $ax+b$ group of invertible affine transformations $x \mapsto ax+b$ on a field $F$. A little less obviously, the special linear group $SL_2(F_q)$ over a finite field $F_q$ is a CA-group when $q$ is a power of two. The finite simple groups of Lie type are not, in general, CA-groups, but when the rank is bounded they tend to behave as if they were “almost CA”; the centraliser of a generic element in $SL_d(F_q)$, for instance (when $d$ is bounded and $q$ is large), is typically a maximal torus (because most elements in $SL_d(F_q)$ are regular semisimple) which is certainly abelian. In view of the CFSG, we thus see that CA or nearly CA groups form an important subclass of the simple groups, and it is thus of interest to study them separately. To this end, we have
Theorem 1 (Suzuki’s theorem on CA-groups) Every finite CA-group of odd order is solvable.
Of course, this theorem is superseded by the more general Feit-Thompson theorem, but Suzuki’s proof is substantially shorter (the original proof is nine pages) and will be given in this post. (See this survey of Solomon for some discussion of the link between Suzuki’s argument and the Feit-Thompson argument.) Suzuki’s analysis can be pushed further to give an essentially complete classification of all the finite CA-groups (of either odd or even order), but we will not pursue these matters here.
Moving even further down the ladder of simple precursors of CFSG is the following theorem of Frobenius from 1901. Define a Frobenius group to be a finite group $G$ which has a subgroup $H$ (called the Frobenius complement) with the property that all the non-trivial conjugates $gHg^{-1}$ of $H$, for $g \in G \setminus H$, intersect $H$ only at the origin. For instance the $ax+b$ group is also a Frobenius group (take $H$ to be the affine transformations that fix a specified point $x_0 \in F$, e.g. the origin). This example suggests that there is some overlap between the notions of a Frobenius group and a CA group. Indeed, note that if $G$ is a CA-group and $H$ is a maximal abelian subgroup of $G$, then any conjugate $gHg^{-1}$ of $H$ that is not identical to $H$ will intersect $H$ only at the origin (because $H$ and each of its conjugates consist of equivalence classes under the commuting relation $\sim$, together with the identity). So if a maximal abelian subgroup $H$ of a CA-group is its own normaliser (thus $N(H) := \{ g \in G: gH = Hg \}$ is equal to $H$), then the group is a Frobenius group.
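The overlap between the two notions can be checked by brute force on a small example. The following is a minimal Python sketch, assuming the field $\mathbf{Z}/5\mathbf{Z}$ and encoding the map $x \mapsto ax+b$ as the pair `(a, b)`; the choice of field and all helper names are illustrative assumptions rather than anything taken from the argument above.

```python
P = 5  # illustrative choice of prime; the field is Z/5Z

# An element (a, b) with a != 0 represents the affine map x -> a*x + b (mod P).
G = [(a, b) for a in range(1, P) for b in range(P)]
E = (1, 0)  # identity map


def mul(g, h):
    """Composition g∘h, i.e. x -> a1*(a2*x + b2) + b1."""
    (a1, b1), (a2, b2) = g, h
    return (a1 * a2 % P, (a1 * b2 + b1) % P)


def inv(g):
    """Inverse map x -> a^{-1}*(x - b); pow(a, -1, P) needs Python 3.8+."""
    a, b = g
    a_inv = pow(a, -1, P)
    return (a_inv, (-a_inv * b) % P)


def centraliser(g):
    return [h for h in G if mul(g, h) == mul(h, g)]


def is_abelian(subset):
    return all(mul(x, y) == mul(y, x) for x in subset for y in subset)


def is_ca_group():
    """CA property: every centraliser of a non-identity element is abelian."""
    return all(is_abelian(centraliser(g)) for g in G if g != E)


# Candidate Frobenius complement: the maps fixing the point 0.
H = [(a, 0) for a in range(1, P)]


def is_frobenius_complement(subgroup):
    """Check that g H g^{-1} meets H only in the identity for every g outside H."""
    members = set(subgroup)
    for g in G:
        if g in members:
            continue
        conjugate = {mul(mul(g, h), inv(g)) for h in subgroup}
        if conjugate & members != {E}:
            return False
    return True


print(is_ca_group(), is_frobenius_complement(H))
```

Changing `P` to another prime reruns the same check for a different field; the search is quadratic in the group order, so it is only meant for very small cases.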
Frobenius’ theorem places an unexpectedly strong amount of structure on a Frobenius group:
Theorem 2 (Frobenius’ theorem) Let $G$ be a Frobenius group with Frobenius complement $H$. Then there exists a normal subgroup $K$ of $G$ (called the Frobenius kernel of $G$) such that $G$ is the semi-direct product $H \ltimes K$ of $H$ and $K$.
Roughly speaking, this theorem indicates that all Frobenius groups “behave” like the $ax+b$ example (which is a quintessential example of a semi-direct product).
Note that if every CA-group of odd order was either Frobenius or abelian, then Theorem 2 would imply Theorem 1 by an induction on the order of $G$, since any subgroup of a CA-group is clearly again a CA-group. Indeed, the proof of Suzuki’s theorem does basically proceed by this route (Suzuki’s arguments do indeed imply that CA-groups of odd order are Frobenius or abelian, although we will not quite establish that fact here).
Frobenius’ theorem can be reformulated in the following concrete combinatorial form:
Theorem 3 (Frobenius’ theorem, equivalent version) Let $G$ be a group of permutations acting transitively on a finite set $X$, with the property that any non-identity permutation in $G$ fixes at most one point in $X$. Then the set of permutations in $G$ that fix no points in $X$, together with the identity, is closed under composition.
Again, a good example to keep in mind for this theorem is when $G$ is the group of affine permutations on a field $F$ (i.e. the $ax+b$ group for that field), and $X$ is the set of points on that field. In that case, the set of permutations in $G$ that do not fix any points are the non-trivial translations.
To deduce Theorem 3 from Theorem 2, one applies Theorem 2 with $H$ taken to be the stabiliser of a single point in $X$. Conversely, to deduce Theorem 2 from Theorem 3, set $X := G/H = \{ gH: g \in G \}$ to be the space of left-cosets of $H$, with the obvious left $G$-action; one easily verifies that this action is faithful, transitive, and each non-identity element $g$ of $G$ fixes at most one left-coset of $H$ (basically because it lies in at most one conjugate of $H$). If we let $K$ be the elements of $G$ that do not fix any point in $X$, plus the identity, then by Theorem 3 $K$ is closed under composition; it is also clearly closed under inverse and conjugation, and is hence a normal subgroup of $G$. From construction $K$ is the identity plus the complement of all the $|G|/|H|$ conjugates of $H$, which are all disjoint except at the identity, so by counting elements we see that
$$|K| = |G| - \frac{|G|}{|H|} (|H| - 1) = \frac{|G|}{|H|}.$$
As $H$ normalises $K$ and meets $K$ only at the identity, we thus see that $KH = H \ltimes K$ is all of $G$, giving Theorem 2.
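The combinatorial statement and the element count can be verified directly on the affine example. The sketch below is a brute-force check in Python, assuming the field $\mathbf{Z}/7\mathbf{Z}$ (an arbitrary illustrative choice) and hypothetical helper names; it is a sanity check on one instance, not a proof.

```python
P = 7  # illustrative prime; the set X is Z/7Z

# (a, b) with a != 0 encodes the permutation x -> a*x + b (mod P) of X.
G = [(a, b) for a in range(1, P) for b in range(P)]
E = (1, 0)


def act(g, x):
    a, b = g
    return (a * x + b) % P


def mul(g, h):
    """Composition g∘h."""
    (a1, b1), (a2, b2) = g, h
    return (a1 * a2 % P, (a1 * b2 + b1) % P)


def fixed_points(g):
    return [x for x in range(P) if act(g, x) == x]


# Hypothesis of Theorem 3: non-identity elements fix at most one point.
at_most_one_fixed = all(len(fixed_points(g)) <= 1 for g in G if g != E)

# K: the fixed-point-free permutations together with the identity.
K = {g for g in G if g == E or not fixed_points(g)}
K_closed = all(mul(k1, k2) in K for k1 in K for k2 in K)

# H: stabiliser of the point 0, playing the role of the Frobenius complement.
H = [g for g in G if act(g, 0) == 0]

print(at_most_one_fixed, K_closed, len(K), len(G) // len(H))
```

In this instance `K` comes out as the set of translations, in line with the example discussed above.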
Despite the appealingly concrete and elementary form of Theorem 3, the only known proofs of that theorem (or equivalently, Theorem 2) in its full generality proceed via the machinery of group characters (which one can think of as a version of Fourier analysis for nonabelian groups). On the other hand, once one establishes the basic theory of these characters (reviewed below the fold), the proof of Frobenius’ theorem is very short, which gives quite a striking example of the power of character theory. The proof of Suzuki’s theorem also proceeds via character theory, and is basically a more involved version of the Frobenius argument; again, no character-free proof of Suzuki’s theorem is currently known. (The proofs of Feit-Thompson and CFSG also involve characters, but those proofs also contain many other arguments of much greater complexity than the character-based portions of the proof.)
It seems to me that the above four theorems (Frobenius, Suzuki, Feit-Thompson, and CFSG) provide a ladder of sorts (with exponentially increasing complexity at each step) to the full classification, and that any new approach to the classification might first begin by revisiting the earlier theorems on this ladder and finding new proofs of these results first (in particular, if one had a “robust” proof of Suzuki’s theorem that also gave non-trivial control on “almost CA-groups” – whatever that means – then this might lead to a new route to classifying the finite simple groups of Lie type and bounded rank). But even for the simplest two results on this ladder – Frobenius and Suzuki – it seems remarkably difficult to find any proof that is not essentially the character-based proof. (Even trying to replace character theory by its close cousin, representation theory, doesn’t seem to work unless one gives in to the temptation to take traces everywhere and put the characters back in; it seems that rather than abandon characters altogether, one needs to find some sort of “robust” generalisation of existing character-based methods.) In any case, I am recording here the standard character-based proofs of the theorems of Frobenius and Suzuki below the fold. There is nothing particularly novel here, but I wanted to collect all the relevant material in one place, largely for my own benefit.
— 1. Basic character theory —
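Let $G$ be a finite group. Then we can form the finite-dimensional complex Hilbert space $L^2(G)$ of functions $f: G \to \mathbf{C}$ with the inner product
$$\langle f_1, f_2 \rangle_{L^2(G)} := \mathbf{E}_{g \in G} f_1(g) \overline{f_2(g)}$$
and thus norm
$$\|f\|_{L^2(G)}^2 = \mathbf{E}_{g \in G} |f(g)|^2$$
where $\mathbf{E}_{g \in G} := \frac{1}{|G|} \sum_{g \in G}$ is the averaging operator. Inside this space, we have the subspace $L^2(G)^G$ of class functions: functions $f: G \to \mathbf{C}$ which are invariant under the conjugation action of $G$ on itself, thus $f(gxg^{-1}) = f(x)$ for all $g, x \in G$. Equivalently, $f$ is constant on each conjugacy class of $G$. In particular, we see that the dimension of $L^2(G)^G$ is equal to the class number of $G$ – the number of conjugacy classes of $G$.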
One way to generate class functions is from taking traces of finite-dimensional unitary representations $\rho: G \to U(V)$, i.e. homomorphisms from the group $G$ to the unitary operators on a finite-dimensional complex Hilbert space $V$. We will abbreviate “finite-dimensional unitary representation” as “representation” henceforth. Given any such representation, one has an associated character $\chi = \chi_\rho: G \to \mathbf{C}$ defined by
$$\chi(g) := \mathrm{tr}\, \rho(g).$$
One easily verifies that this is a class function. For instance, the regular representation $\tau: G \to U(L^2(G))$, in which $\tau(g) f(x) := f(g^{-1} x)$, has character
$$\chi_\tau(g) = |G| 1_{g=1},$$
and every linear character $\chi: G \to S^1$ (i.e. a homomorphism to the complex unit circle) is the character associated to the obvious one-dimensional representation corresponding to $\chi$. In particular, the constant function $1$ is a character, associated to the principal one-dimensional representation.
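The characters interact well with various representation-theoretic operations. For instance, isomorphic representations clearly have the same character. If $\rho_1: G \to U(V_1)$, $\rho_2: G \to U(V_2)$ are representations, then the characters of the direct sum $\rho_1 \oplus \rho_2: G \to U(V_1 \oplus V_2)$ and tensor product $\rho_1 \otimes \rho_2: G \to U(V_1 \otimes V_2)$ are the sum and product of the individual characters:
$$\chi_{\rho_1 \oplus \rho_2} = \chi_{\rho_1} + \chi_{\rho_2}$$
and
$$\chi_{\rho_1 \otimes \rho_2} = \chi_{\rho_1} \chi_{\rho_2}.$$
Also, if $\overline{V}$ is the Hilbert space $V$ with the conjugated inner product, then the conjugate representation $\overline{\rho}: G \to U(\overline{V})$ (given by taking $\overline{\rho}(g) := \rho(g)$ and conjugating the inner product structure) has a conjugated character:
$$\chi_{\overline{\rho}} = \overline{\chi_\rho}.$$
Thus the space of characters forms a semi-ring in $L^2(G)^G$ that is closed under complex conjugation. Also, since any element of the absolute Galois group $\mathrm{Gal}(\overline{\mathbf{Q}}/\mathbf{Q})$ of the rationals can be extended to the complex numbers $\mathbf{C}$, we have the stronger fact that the space of characters is also invariant with respect to the action of $\mathrm{Gal}(\overline{\mathbf{Q}}/\mathbf{Q})$; we will need this fact somewhat later in this post.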
The space of characters is not a ring, because characters are certainly not preserved with respect to negation: the value of a character $\chi = \chi_\rho$ at the identity is the dimension $\dim(V)$ of the space that $\rho$ acts on, and so is non-negative; in particular, $-\chi$ will not be a character as long as $V$ is positive dimensional. We can then define the space of generalised characters to be the ring generated by the characters, thus a generalised character is nothing more than a difference of two characters.
By repeatedly taking orthogonal complements, one can easily see that representations are completely reducible, thus if $\hat G$ is a collection of one representative $\rho$ of each of the isomorphism classes of irreducible finite-dimensional representations of $G$, then every character can be written as a linear combination (over the natural numbers) of the irreducible characters $\chi_\rho$, $\rho \in \hat G$.
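From the ergodic theorem (which is a triviality in the case of an action of a finite group $G$), the average value $\mathbf{E}_{g \in G} \chi(g)$ of a character $\chi$ of a representation $\rho: G \to U(V)$ is equal to the dimension of its invariant component $V^G := \{ v \in V: \rho(g) v = v \hbox{ for all } g \in G \}$:
$$\mathbf{E}_{g \in G} \chi(g) = \dim(V^G).$$
As a consequence, the inner product $\langle \chi_{\rho_1}, \chi_{\rho_2} \rangle_{L^2(G)}$ of two characters is equal to the dimension of the $G$-invariant component of $\rho_1 \otimes \overline{\rho_2}$. By Schur’s lemma, this implies in particular that the irreducible characters $\chi_\rho$, $\rho \in \hat G$, are an orthonormal system in $L^2(G)^G$:
$$\langle \chi_\rho, \chi_{\rho'} \rangle_{L^2(G)} = 1_{\rho = \rho'}.$$
In fact, they form an orthonormal basis for $L^2(G)^G$. (Proof: given any non-trivial $f \in L^2(G)^G$, the convolution operator $F \mapsto F * f$ is non-trivial in the regular representation, and thus must also be non-trivial with respect to at least one irreducible representation $\rho$, which implies that $f$ has non-zero inner product with $\chi_\rho$. Thus there is no non-zero element of $L^2(G)^G$ that is orthogonal to all the irreducible characters, giving the claim.) In particular, we see that $|\hat G|$ is equal to the class number of $G$.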
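Using this basis, we now have a Fourier transform that is an isometry between the Hilbert space $L^2(G)^G$ of class functions, and the Hilbert space $\ell^2(\hat G)$, defined by taking inner products with characters
$$\hat f(\rho) := \langle f, \chi_\rho \rangle_{L^2(G)}$$
and with the usual Fourier inversion formula
$$f = \sum_{\rho \in \hat G} \hat f(\rho) \chi_\rho$$
and Plancherel and Parseval identities
$$\|f\|_{L^2(G)}^2 = \sum_{\rho \in \hat G} |\hat f(\rho)|^2$$
and
$$\langle f_1, f_2 \rangle_{L^2(G)} = \sum_{\rho \in \hat G} \hat f_1(\rho) \overline{\hat f_2(\rho)}.$$
(One can also relate convolution in $L^2(G)^G$ with pointwise multiplication in $\ell^2(\hat G)$, and pointwise multiplication in $L^2(G)^G$ is related to a plethysm on $\ell^2(\hat G)$ involving tensor product multiplicities, but we will not need these operations here.)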
A character $\chi = \chi_\rho$ of a (not necessarily irreducible) representation $\rho$ then has Fourier coefficients $\hat \chi(\rho')$ that count the multiplicity of $\rho'$ in $\rho$; in particular, a class function is a character (resp. generalised character) iff its Fourier coefficients are all natural numbers (resp. integers), and any two representations are isomorphic iff they have the same character.
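To make the Fourier-analytic picture concrete, here is a minimal Python sketch for the symmetric group $S_3$, using its standard character table (a well-known fact that is hard-coded here, not derived). The dictionary keys and function names are illustrative assumptions. The sketch checks orthonormality of the irreducible characters and then computes the Fourier coefficients of the permutation character on three points, which count multiplicities.

```python
from fractions import Fraction

# Conjugacy classes of S_3, in the order: identity, transpositions, 3-cycles.
CLASS_SIZES = [1, 3, 2]
ORDER = sum(CLASS_SIZES)  # |S_3| = 6

# Standard character table of S_3 (all values are real), listed per class.
IRREDUCIBLES = {
    "trivial": [1, 1, 1],
    "sign": [1, -1, 1],
    "standard": [2, 0, -1],
}


def inner(f1, f2):
    """Inner product of class functions; no conjugation needed for real values."""
    total = sum(size * x * y for size, x, y in zip(CLASS_SIZES, f1, f2))
    return Fraction(total, ORDER)


def fourier_coefficients(f):
    """Inner products of f against each irreducible character."""
    return {name: inner(f, chi) for name, chi in IRREDUCIBLES.items()}


# Character of the permutation representation on 3 points:
# its value at g is the number of points fixed by g.
permutation_character = [3, 1, 0]

print(fourier_coefficients(permutation_character))
```

The coefficients come out as non-negative integers, as they must for a genuine character: the permutation representation contains the trivial and standard representations once each and the sign representation not at all.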
Thus for instance the regular representation $\tau$ has Fourier coefficients $\hat \chi_\tau(\rho) = \chi_\rho(1)$, leading to the identity
$$|G| 1_{g=1} = \sum_{\rho \in \hat G} \chi_\rho(1) \chi_\rho(g)$$
which gives the Peter-Weyl theorem that $L^2(G)$ is isomorphic to the direct sum of $\chi_\rho(1)$ copies of $\rho$ for each $\rho \in \hat G$. In particular we have
$$|G| = \sum_{\rho \in \hat G} \chi_\rho(1)^2. \qquad (1)$$
From these Fourier identities we can now detect whether a representation is irreducible (or a combination of a small number of representations) through the structure of its character. Indeed, by taking Fourier transforms and working in $\ell^2(\hat G)$ we now have the following immediate corollaries, which will be very useful to us in the sequel:
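Lemma 4 (Small $L^2$ norm and irreducibility) Let $\chi$ be a generalised character.
- (i) $\|\chi\|_{L^2(G)}^2$ is a natural number.
- (ii) $\|\chi\|_{L^2(G)}^2$ equals $1$ iff $\chi = \pm \chi_\rho$ for some sign $\pm$ and irreducible character $\chi_\rho$. If we also know that $\chi(1) > 0$, this forces $\chi = \chi_\rho$ (since $\chi_\rho(1)$ is positive).
- (iii) $\|\chi\|_{L^2(G)}^2$ equals $2$ iff $\chi = \pm \chi_\rho \pm \chi_{\rho'}$ for some signs $\pm$ and distinct irreducible characters $\chi_\rho, \chi_{\rho'}$. If we also know that $\chi(1) = 0$, then this forces $\chi = \pm(\chi_\rho - \chi_{\rho'})$ (again because $\chi_\rho(1), \chi_{\rho'}(1)$ are positive).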
One can of course also characterise when $\|\chi\|_{L^2(G)}^2$ is equal to $3$, $4$, etc. by this method, although the descriptions rapidly become more complicated and less useful. In practice, this lemma will allow us to construct interesting examples of irreducible representations by first exhibiting a generalised character of small norm (or equivalently, two characters that are close to each other in $L^2$ norm). It seems very difficult to mimic this type of construction by any other means, including non-character-based representation theoretic methods. (But perhaps one could categorify this lemma somehow using K-theory.)
Lemma 4 is a typical application of the integrality gap, which is the trivial but fundamental fact that integers are either zero or have magnitude at least one. It is the interplay between the integrality gap and the Fourier analysis of characters which drives the proof of both Frobenius’ theorem and Suzuki’s theorem, as we shall soon see.
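As an illustration of how Lemma 4 is used in practice, the following Python sketch applies it to a few generalised characters of $S_3$, written as lists of values on the classes (identity, transpositions, 3-cycles). The inputs are built from the permutation, trivial and sign characters, which are standard; the function names and verdict strings are hypothetical. The helper assumes its input really is a real-valued generalised character, as in the hypothesis of the lemma.

```python
from fractions import Fraction

# S_3 conjugacy classes: identity, transpositions, 3-cycles.
CLASS_SIZES = [1, 3, 2]
ORDER = sum(CLASS_SIZES)


def norm_squared(chi):
    """Squared L^2 norm of a real-valued class function."""
    return Fraction(sum(size * x * x for size, x in zip(CLASS_SIZES, chi)), ORDER)


def lemma4_verdict(chi):
    """Read off the structure of a generalised character from its norm and chi(1)."""
    n = norm_squared(chi)
    value_at_identity = chi[0]
    if n == 1:
        return "irreducible" if value_at_identity > 0 else "minus an irreducible"
    if n == 2 and value_at_identity == 0:
        return "difference of two distinct irreducibles"
    if n == 2:
        return "signed sum of two distinct irreducibles"
    return f"norm squared {n}: not covered by (ii) or (iii)"


permutation = [3, 1, 0]   # number of fixed points
trivial = [1, 1, 1]
sign = [1, -1, 1]

perm_minus_trivial = [p - t for p, t in zip(permutation, trivial)]
sign_minus_trivial = [s - t for s, t in zip(sign, trivial)]

for name, chi in [("permutation", permutation),
                  ("permutation - trivial", perm_minus_trivial),
                  ("sign - trivial", sign_minus_trivial)]:
    print(name, "->", lemma4_verdict(chi))
```

The second case is the typical pattern: two characters that are close in norm differ by an irreducible character, which is detected without ever writing down a representation.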
We also record some additional easy properties of characters which we will need later. Firstly, we have the identity
$$\chi(g^{-1}) = \overline{\chi(g)} \qquad (2)$$
for any $g \in G$, since the inverse of a unitary operator is also its adjoint. Secondly, the kernel
$$\{ g \in G: \chi(g) = \chi(1) \} \qquad (3)$$
of a character $\chi$ is automatically a normal subgroup of $G$. This is because a unitary operator on a finite-dimensional space $V$ has trace $\dim(V)$ if and only if it is equal to the identity operator, and so (3) is also the kernel of the associated representation $\rho$. This latter fact suggests a strategy to prove Frobenius’ theorem by exhibiting a character whose kernel is precisely the complement of the conjugacy classes of $H$ excluding the identity, and this is in fact exactly what we will do.
Thus far we have focused on the representation theory of a single group $G$. However, the situation becomes significantly more interesting when one relates the representation theory of two groups, a finite group $G$ and a subgroup $H$ (not necessarily normal). We then have an obvious restriction map $\mathrm{Res}^G_H: L^2(G)^G \to L^2(H)^H$ that restricts any class function on $G$ to a class function on $H$ (since any two elements of $H$ that are conjugate in $H$ are clearly also conjugate in $G$). The adjoint map $\mathrm{Ind}^G_H: L^2(H)^H \to L^2(G)^G$ can be easily computed: given any class function $f \in L^2(H)^H$, the induced class function $\mathrm{Ind}^G_H f$ is given by the Frobenius formula
$$\mathrm{Ind}^G_H f(x) = \frac{1}{|H|} \sum_{g \in G} f(g^{-1} x g) \qquad (4)$$
with the convention that $f$ is extended by zero from $H$ to $G$. Equivalently, one has
$$\mathrm{Ind}^G_H f(x) = \sum_{i=1}^k f(g_i^{-1} x g_i)$$
where $g_1 H, \ldots, g_k H$ is an enumeration of the left-cosets of $H$.
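The Frobenius formula is easy to implement by brute force. The sketch below does so in Python for $G = S_3$ and $H$ the two-element subgroup generated by a transposition; the choice of groups, the tuple encoding of permutations, and the helper names are illustrative assumptions. It then checks the adjointness of induction and restriction on one pair of characters.

```python
from fractions import Fraction
from itertools import permutations

N = 3
G = list(permutations(range(N)))  # S_3; g[i] is the image of i under g
E = tuple(range(N))


def mul(g, h):
    """Composition g∘h."""
    return tuple(g[h[i]] for i in range(N))


def inv(g):
    out = [0] * N
    for i, gi in enumerate(g):
        out[gi] = i
    return tuple(out)


def sign(g):
    """Sign character, computed from the parity of the number of inversions."""
    inversions = sum(1 for i in range(N) for j in range(i + 1, N) if g[i] > g[j])
    return -1 if inversions % 2 else 1


H = [E, (1, 0, 2)]  # subgroup generated by the transposition swapping 0 and 1


def induce(f):
    """Frobenius formula: average f(g^{-1} x g) over G, with f extended by zero off H."""
    members = set(H)

    def induced(x):
        total = 0
        for g in G:
            y = mul(mul(inv(g), x), g)
            if y in members:
                total += f(y)
        return Fraction(total, len(H))

    return induced


def inner(f1, f2, group):
    """Inner product of real-valued functions on `group`."""
    return Fraction(sum(f1(x) * f2(x) for x in group), len(group))


ind_trivial = induce(lambda h: 1)
ind_sign = induce(sign)  # sign restricted to H, then induced back to G

print([ind_trivial(x) for x in G])
print(inner(ind_sign, sign, G), inner(sign, sign, H))
```

Since `H` is the stabiliser of the point 2, inducing the trivial character recovers the fixed-point-counting character of the action on three points.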
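In a similar fashion, given a representation $\rho: G \to U(V)$ of $G$, we may restrict it to $H$ to obtain a representation $\mathrm{Res}^G_H \rho: H \to U(V)$ of $H$. In the adjoint direction, given a representation $\rho: H \to U(V)$ of $H$, one can associate an induced representation $\mathrm{Ind}^G_H \rho: G \to U(\mathrm{Ind}^G_H V)$ of $G$ by the following construction. One takes $\mathrm{Ind}^G_H V$ to be the space of functions $f: G \to V$ that obey
$$f(gh) = \rho(h)^{-1} f(g)$$
for all $g \in G$ and $h \in H$, with inner product
$$\langle f_1, f_2 \rangle := \mathbf{E}_{g \in G} \langle f_1(g), f_2(g) \rangle_V$$
and then lets $G$ act on $\mathrm{Ind}^G_H V$ by the formula
$$\mathrm{Ind}^G_H \rho(g) f(x) := f(g^{-1} x)$$
which one can check to indeed give a representation. Thus, for instance, the regular representation on $H$ induces (an isomorphic copy of) the regular representation on $G$. It is not difficult to show that these representation-theoretic constructions are compatible with the operations on class functions mentioned earlier, thus
$$\chi_{\mathrm{Res}^G_H \rho} = \mathrm{Res}^G_H \chi_\rho$$
for every representation $\rho$ of $G$, and dually
$$\chi_{\mathrm{Ind}^G_H \rho} = \mathrm{Ind}^G_H \chi_\rho$$
for every representation $\rho$ of $H$. In particular, any character of $G$ restricts to a character of $H$, and every character of $H$ induces a character on $G$. Thus the adjoint relationship between $\mathrm{Res}^G_H$ and $\mathrm{Ind}^G_H$ for class functions induces a corresponding adjoint relationship for representations, known as Frobenius reciprocity.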
In general, the restriction or induction of an irreducible representation will not be irreducible (and the operations of restriction and induction do not invert each other). However, the geometry of the characters can often be controlled quite precisely (especially for structured groups such as Frobenius groups or CA-groups), and this together with tools such as Lemma 4 can allow us to create interesting irreducible representations of $G$ in non-trivial fashion from irreducible representations of $H$.
— 2. Frobenius’ theorem —
We are now ready to prove Frobenius’ theorem. Let $G$ be a Frobenius group with Frobenius complement $H$. Then there are $|G|/|H|$ distinct conjugates $gHg^{-1}$ of $H$, which are all disjoint except for the origin, thus one can partition
$$G = K \cup \bigcup_{gH \in G/H} g (H \setminus \{1\}) g^{-1} \qquad (5)$$
where (by abuse of notation) $g$ ranges over a set of representatives of the left cosets of $H$ and $K$ is the identity together with all the elements that do not lie in any conjugate of $H$. I like to think of this decomposition by picturing $G$ as being like a plane, $\{1\}$ being the origin in this plane, each conjugate $gHg^{-1}$ being a non-vertical line through the origin, and $K$ being the vertical line through the origin (note that in the case of the $ax+b$ group, this is more or less exactly what (5) actually looks like). Counting elements in (5), we thus have
$$|K| = \frac{|G|}{|H|}. \qquad (6)$$
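We now start inducing characters from $H$ to $G$ and determine their geometry. Let $\chi$ be an irreducible character of $H$ of some dimension $d$, then the character $\chi$ obeys the identities
$$\chi(1) = d$$
and
$$\sum_{h \in H} |\chi(h)|^2 = |H|.$$
In particular,
$$\sum_{h \in H \setminus \{1\}} |\chi(h)|^2 = |H| - d^2.$$
Now we consider the induced character $\mathrm{Ind}^G_H \chi$. Using the above identities and the partition (5), we see that $\mathrm{Ind}^G_H \chi$ equals $d|G|/|H|$ at $1$, vanishes at all other elements of $K$, and on each of the $|G|/|H|$ sets $g(H \setminus \{1\})g^{-1}$ is given by a conjugate of $\chi$. In particular,
$$\sum_{x \in G \setminus K} |\mathrm{Ind}^G_H \chi(x)|^2 = \frac{|G|}{|H|} (|H| - d^2).$$
Similarly, if one induces the trivial character $1$ on $H$ to a character $\mathrm{Ind}^G_H 1$ on $G$, this character will equal $|G|/|H|$ at $1$, vanish at all other elements of $K$, and will equal $1$ on $G \setminus K$. If we then form the generalised character
$$\chi' := \mathrm{Ind}^G_H \chi - d\, \mathrm{Ind}^G_H 1 + d$$
then $\chi'$ equals $d$ on $K$ and is equal to $\mathrm{Ind}^G_H \chi$ outside of $K$. In particular, we have
$$\|\chi'\|_{L^2(G)}^2 = \frac{1}{|G|} \left( d^2 |K| + \frac{|G|}{|H|} (|H| - d^2) \right).$$
Using (6), we thus see that $\chi'$ has surprisingly small norm:
$$\|\chi'\|_{L^2(G)}^2 = 1.$$
We can then apply Lemma 4 and conclude that $\chi' = \chi_{\rho'}$ for some irreducible representation $\rho'$ of $G$. Note that $\mathrm{Res}^G_H \chi' = \chi$. Thus we have shown that every irreducible representation $\rho$ of $H$ is the restriction of an irreducible representation $\rho'$ of $G$.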
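The construction above can be traced numerically on the $ax+b$ group over $\mathbf{Z}/5\mathbf{Z}$, where the complement is cyclic of order four, so every irreducible character of it is linear (dimension one). The following Python sketch is a brute-force check under those assumptions; the choice of field, the particular linear character, and all helper names are illustrative. It induces the character via the Frobenius formula, subtracts the induced trivial character, adds back the constant, and confirms that the result has norm one and restricts to the original character.

```python
P = 5
G = [(a, b) for a in range(1, P) for b in range(P)]  # x -> a*x + b over Z/5Z
E = (1, 0)
H = [(a, 0) for a in range(1, P)]  # Frobenius complement: maps fixing 0
K = [(1, b) for b in range(P)]     # translations (identity included)


def mul(g, h):
    (a1, b1), (a2, b2) = g, h
    return (a1 * a2 % P, (a1 * b2 + b1) % P)


def inv(g):
    a, b = g
    a_inv = pow(a, -1, P)  # modular inverse, Python 3.8+
    return (a_inv, (-a_inv * b) % P)


# A non-trivial linear character of H: 2 generates the multiplicative group
# of Z/5Z, and the element (2^k, 0) is sent to i^k.
LOG2 = {pow(2, k, P): k for k in range(P - 1)}
I_POWERS = [1, 1j, -1, -1j]


def chi(h):
    return I_POWERS[LOG2[h[0]]]


def induce(f):
    """Frobenius formula, with f extended by zero outside H."""
    members = set(H)

    def induced(x):
        total = sum(f(y) for g in G
                    for y in [mul(mul(inv(g), x), g)] if y in members)
        return total / len(H)

    return induced


d = 1  # dimension of chi
ind_chi = induce(chi)
ind_one = induce(lambda h: 1)


def chi_prime(x):
    """The generalised character Ind(chi) - d*Ind(1) + d."""
    return ind_chi(x) - d * ind_one(x) + d


norm_squared = sum(abs(chi_prime(x)) ** 2 for x in G) / len(G)
print(norm_squared)
```

Floating-point arithmetic is used for the complex values, so comparisons should be made up to a small tolerance.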
Remark 1 The above analysis shows a little bit more, namely that $\rho'$ arises in $\mathrm{Ind}^G_H \rho$ as the orthogonal complement of a copy (or more precisely, $\dim(\rho)$ copies) of the mean zero component of the quasiregular representation on $G/H$ (i.e. the induction of the trivial representation on $H$), although it is not obvious to me how one would demonstrate (other than via an inspection of characters) that the induced representation $\mathrm{Ind}^G_H \rho$ actually contains a copy of this component of the quasiregular representation.
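Now we consider the regular character of $H$:
$$|H| 1_{h=1} = \sum_{\rho \in \hat H} \chi_\rho(1) \chi_\rho(h).$$
This character equals $|H|$ at the identity, and vanishes at the other elements of $H$. If we then form the associated character
$$\chi'' := \sum_{\rho \in \hat H} \chi_\rho(1) \chi_{\rho'}$$
of $G$ (where $\rho'$ denotes the irreducible representation of $G$ restricting to $\rho$ that was constructed above), then $\chi''$ restricts to the regular character of $H$, and so also equals $|H|$ at the identity and vanishes at the other elements of $H$. By (5), we conclude that $\chi''$ (which is a class function) is supported on $K$. Also, as each of the $\chi_{\rho'}$ is constant on $K$, $\chi''$ is also, and so $\chi''$ equals $|H|$ on all of $K$. Thus $K$ is the kernel (3) of the character $\chi''$ and is thus normal. This gives Frobenius’ theorem as discussed in the introduction.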
— 3. More character theory —
Before we turn to Suzuki’s theorem, we will need some additional facts about characters which go beyond the Fourier-analytic considerations of Section 1 by also employing some tools from algebraic number theory.
Let $G$ be a finite group. Observe that if $\rho: G \to U(V)$ is a representation, then for any $g \in G$, the unitary operator $\rho(g)$ can be diagonalised. As $g$ (and hence $\rho(g)$) has finite order, the eigenvalues of $\rho(g)$ are roots of unity, and so the trace $\chi(g) = \mathrm{tr}\, \rho(g)$ is the sum of finitely many roots of unity. In particular, $\chi(g)$ is always an algebraic integer. Unlike rational integers, algebraic integers do not directly enjoy an integrality gap; one can have algebraic integers of arbitrarily small nonzero magnitude (e.g. powers of $\sqrt{2}-1$). However, we will rely in several places on the basic but fundamental fact that a number which is both an algebraic integer and a rational is necessarily a rational integer, which then is subject to the integrality gap.
We have a variant of the above fact:
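Lemma 5 Let $G$ be a finite group, let $\rho: G \to U(V)$ be a $d$-dimensional irreducible representation with character $\chi$, and let $g \in G$. Then $\frac{|\mathcal{C}(g)|}{d} \chi(g)$ is an algebraic integer, where $\mathcal{C}(g)$ is the conjugacy class of $g$.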
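Proof: The endomorphism $T := \sum_{h \in \mathcal{C}(g)} \rho(h)$ is $G$-equivariant and has trace $|\mathcal{C}(g)| \chi(g)$; by Schur’s lemma, it is thus equal to $\frac{|\mathcal{C}(g)|}{d} \chi(g)$ times the identity. It thus suffices to show that the diagonal entries of $T$ are algebraic integers; thus it will suffice to show that $P(T) = 0$ for some monic polynomial $P$ with integer coefficients.
Consider the associated element $a := \sum_{h \in \mathcal{C}(g)} h$ in the group ring $\mathbf{Z}[G]$ of $G$. Then the modules $\mathbf{Z}$, $\mathbf{Z} + \mathbf{Z} a$, $\mathbf{Z} + \mathbf{Z} a + \mathbf{Z} a^2$, etc. form an increasing sequence of submodules of $\mathbf{Z}[G]$. As $\mathbf{Z}[G]$ is Noetherian, we thus have
$$\mathbf{Z} + \mathbf{Z} a + \ldots + \mathbf{Z} a^{n} = \mathbf{Z} + \mathbf{Z} a + \ldots + \mathbf{Z} a^{n-1}$$
for some $n$, or equivalently that $a^n$ is an integer combination of $1, a, \ldots, a^{n-1}$. This implies that $T^n$ is an integer combination of $1, T, \ldots, T^{n-1}$, and the claim follows.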
This leads to an important corollary:
Corollary 6 (Dimension divides order) Let $G$ be a finite group, and let $\rho$ be an irreducible $d$-dimensional representation. Then $d$ divides $|G|$.
Proof: As the character $\chi = \chi_\rho$ has $L^2(G)$ norm one, we have
$$\sum_{g \in G} \chi(g) \overline{\chi(g)} = |G|.$$
Grouping the summation by conjugacy classes $\mathcal{C}(g)$, we can express the left-hand side as the sum of terms of the form $|\mathcal{C}(g)| \chi(g) \overline{\chi(g)}$, which by the preceding lemma and discussion is equal to $d$ times an algebraic integer. We conclude that $|G|/d$ is an algebraic integer also; but it is rational, and so must be a rational integer also.
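For a concrete instance of the last two results, one can tabulate the quantity (class size) × (character value) / (dimension) for $S_3$, whose character values are all rational, so that "algebraic integer" simply means "integer". This is a minimal Python sketch using the standard character table of $S_3$, hard-coded as an assumption; the names are illustrative.

```python
from fractions import Fraction

# S_3 conjugacy classes: identity, transpositions, 3-cycles.
CLASS_SIZES = [1, 3, 2]
ORDER = sum(CLASS_SIZES)

# Standard character table of S_3, listed per class.
IRREDUCIBLES = {
    "trivial": [1, 1, 1],
    "sign": [1, -1, 1],
    "standard": [2, 0, -1],
}


def scaled_class_values(chi):
    """The quantities |C(g)| * chi(g) / d, one per conjugacy class."""
    d = chi[0]  # the dimension is the value at the identity
    return [Fraction(size * value, d) for size, value in zip(CLASS_SIZES, chi)]


for name, chi in IRREDUCIBLES.items():
    values = scaled_class_values(chi)
    print(name, values, "dimension divides order:", ORDER % chi[0] == 0)
```

For groups with irrational character values the same quantities are algebraic integers in a cyclotomic field, which this exact-rational sketch does not attempt to handle.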
In particular, if $G$ is of odd order, then any irreducible representation of $G$ has odd dimension. This has a further important consequence, due to Burnside:
Proposition 7 (Odd groups have non-real characters) Let $G$ be a finite group of odd order, and let $\chi$ be a non-principal irreducible character of $G$. Then $\chi$ is not a real-valued character. In other words, $\chi \neq \overline{\chi}$.
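Proof: Suppose for contradiction that $\chi$ is real-valued. By (2) this implies that
$$\chi(g^{-1}) = \chi(g) \qquad (7)$$
for all $g \in G$.
As $|G|$ is odd, there are no elements in $G$ of order $2$. Thus one can partition $G = \{1\} \cup S \cup S^{-1}$ for some subset $S$ of $G$ of order $\frac{|G|-1}{2}$. On the other hand, as $\chi$ is non-principal, we have
$$\sum_{g \in G} \chi(g) = 0.$$
By (7) one has
$$\sum_{g \in S} \chi(g) = -\frac{\chi(1)}{2}.$$
But $\chi(1)$ is the dimension of the associated representation $\rho$, which is odd by Corollary 6, so $-\frac{\chi(1)}{2}$ is a half-integer. But it is also an algebraic integer, giving the desired contradiction.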
— 4. Suzuki’s theorem —
We can now begin the proof of Suzuki’s theorem; we will basically use an arrangement of this theorem from the thesis of Wilcox. We begin with an easy reduction to the simple case:
Proposition 8 (Reduction to the simple case) Let $G$ be a finite CA-group of odd order which is not simple. Suppose that all CA groups of smaller odd order than $G$ are solvable. Then $G$ is solvable also.
Proof: If $G$ is not simple, it has a proper non-trivial normal subgroup $N$. This group is also of odd order and inherits the CA property from $G$, so by hypothesis $N$ is solvable. If we let $N_0$ be the last non-trivial group in the derived series of $N$, then $N_0$ is a non-trivial abelian characteristic subgroup of $N$, and is thus also a normal subgroup of $G$. Let $A$ be the centraliser of $N_0$, then $A$ is also a normal subgroup of $G$, which is still non-trivial and abelian as $G$ is a CA-group. Furthermore, $A$ is maximal abelian (it is not contained in any larger abelian group).
To show that $G$ is solvable, it then suffices to show that the quotient $G/A$ is solvable. As this group has an odd order smaller than that of $G$, it suffices to show that $G/A$ is a CA-group. Thus, if $g, h, k$ are non-identity elements of $G/A$ with $g, h$ both commuting with $k$, we need to show that $g, h$ commute with each other. Equivalently, if $g, h, k \in G \setminus A$ are such that $g, h$ both commute with $k$ modulo $A$, then $g$ commutes with $h$ modulo $A$.
If we fix $k \in G \setminus A$, then $k$ acts on $A$ by conjugation. This action cannot fix any non-identity element $a$ of $A$, else the centraliser of $a$ would contain $k$ as well as $A$, contradicting the maximal abelian nature of $A$. Thus the map $a \mapsto k a k^{-1} a^{-1}$, which is a homomorphism on the normal abelian group $A$, has trivial kernel and is thus an isomorphism. From this we see that if $g$ commutes with $k$ modulo $A$, then one can multiply $g$ by an element of $A$ (on the left or right) in order to make it commute with $k$ exactly. Thus, without loss of generality, $g$ and $h$ both commute with $k$ exactly, and so $g, h$ commute exactly as well as $G$ is a CA-group, giving the claim.
In view of this proposition, we see that to prove Suzuki’s theorem it suffices to show that simple non-abelian CA-groups of odd order do not exist.
Observe that in a CA-group $G$, every non-identity element $g$ of $G$ is contained in a unique maximal abelian subgroup of $G$, namely the centraliser $C_G(g)$ of $g$. Thus the maximal abelian subgroups of $G$, once one removes the identity, form a partition of $G \setminus \{1\}$. It is instructive to keep some examples in mind:
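- In the case of the $ax+b$ group on a field $F$, the maximal abelian subgroups are the translation group $\{ x \mapsto x + b: b \in F \}$ and the stabilisers $\{ x \mapsto a(x - x_0) + x_0: a \in F^\times \}$ of points $x_0 \in F$, where $F^\times$ is the multiplicative group $F \setminus \{0\}$.
- In the case of the special linear group $SL_2(F_q)$ with $q$ a power of two, the maximal abelian groups are conjugates of the split torus
$$\left\{ \begin{pmatrix} t & 0 \\ 0 & t^{-1} \end{pmatrix}: t \in F_q^\times \right\},$$
the non-split torus
$$\left\{ \begin{pmatrix} a & \epsilon b \\ b & a+b \end{pmatrix}: a, b \in F_q; \ a^2 + ab + \epsilon b^2 = 1 \right\}$$
(where $\epsilon$ is any quantity not of the form $x^2 + x$ for some $x \in F_q$) or the unipotent group
$$\left\{ \begin{pmatrix} 1 & t \\ 0 & 1 \end{pmatrix}: t \in F_q \right\}.$$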
As these examples show, while many of the maximal abelian subgroups may be conjugate to each other, there can certainly be several non-conjugate examples of maximal abelian subgroups. Let $\mathcal{A}$ be a set consisting of one representative from each of these conjugacy classes, then we have the following analogue of the partition (5):
$$G = \{1\} \cup \bigcup_{A \in \mathcal{A}} \bigcup_{g N(A) \in G/N(A)} g (A \setminus \{1\}) g^{-1} \qquad (8)$$
where $N(A)$ is the normaliser of $A$. This partition turns out to be somewhat less favourable than (5), but one can still run analogues of the Frobenius argument, particularly in the case when $|G|$ is odd (which forces many other related quantities, such as $|A|$, to be odd also). I like to think of this decomposition by viewing $G$ as a plane, $\{1\}$ as the origin, and $\bigcup_{gN(A) \in G/N(A)} g(A \setminus \{1\})g^{-1}$ as sweeping out various sectors of this plane, with each conjugate $gAg^{-1}$ of $A$ being one of the rays in the sector. (This picture is an oversimplification, for instance it does not accurately reflect the closure of $A$ with respect to group inversion, but I still find it a useful picture to have in mind.)
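We first give the analogue of (6). Taking cardinalities in (8) we obtain the class equation
$$|G| = 1 + \sum_{A \in \mathcal{A}} \frac{|G|}{|N(A)|} (|A| - 1)$$
for a CA-group, which we can rearrange as
$$1 - \frac{1}{|G|} = \sum_{A \in \mathcal{A}} \frac{1}{|W(A)|} \left( 1 - \frac{1}{|A|} \right) \qquad (9)$$
where $W(A)$ is the group $N(A)/A$ (I like to think of this group as a sort of “Weyl group” associated to $A$). As a first approximation, the right hand side of (9) is close to $\sum_{A \in \mathcal{A}} \frac{1}{|W(A)|}$. Thus, if one can somehow prevent $|W(A)|$ from getting too small for too many values of $A$, one can hope to upper bound the right-hand side of (9) by something less than $1 - \frac{1}{|G|}$, leading to the desired contradiction. (This strategy won’t quite work when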
is very small –
to be precise – but this case can be worked out by hand.) Thus, we will be looking for such things as lower bounds on $|W(A)|$ or upper bounds on $|\mathcal{A}|$. The fact that $|G|$ is odd will force the $|A|$ and $|W(A)|$ to be odd as well, which will turn out to be useful in improving these bounds by doubling the power of the integrality gap. (We will also rely on the odd order of $G$ in a number of other places, in particular using Proposition 7.)
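The class equation holds for any finite CA-group with trivial centre, whether of odd or even order, so it can be tested on a familiar example. The Python sketch below uses the alternating group on five points (which is isomorphic to the special linear group of degree two over the field with four elements, a standard fact assumed here). It finds the maximal abelian subgroups as centralisers, picks one per conjugacy class, computes normaliser orders, and compares both sides of the rearranged class equation. The encoding and all names are illustrative assumptions.

```python
from fractions import Fraction
from itertools import permutations

N = 5


def parity(g):
    """Parity of the number of inversions of the permutation g."""
    return sum(1 for i in range(N) for j in range(i + 1, N) if g[i] > g[j]) % 2


G = [g for g in permutations(range(N)) if parity(g) == 0]  # alternating group
E = tuple(range(N))


def mul(g, h):
    """Composition g∘h."""
    return tuple(g[h[i]] for i in range(N))


def inv(g):
    out = [0] * N
    for i, gi in enumerate(g):
        out[gi] = i
    return tuple(out)


def centraliser(g):
    return frozenset(h for h in G if mul(g, h) == mul(h, g))


def conjugate(subset, g):
    g_inv = inv(g)
    return frozenset(mul(mul(g, s), g_inv) for s in subset)


# In a CA-group, the maximal abelian subgroups are exactly the centralisers
# of the non-identity elements.
maximal_abelian = {centraliser(g) for g in G if g != E}
is_ca = all(mul(x, y) == mul(y, x)
            for A in maximal_abelian for x in A for y in A)

# One representative per conjugacy class, recorded as (|A|, |N(A)|).
representatives = []
seen = set()
for A in sorted(maximal_abelian, key=sorted):
    if A in seen:
        continue
    seen.update(conjugate(A, g) for g in G)
    normaliser_order = sum(1 for g in G if conjugate(A, g) == A)
    representatives.append((len(A), normaliser_order))

# 1/|W(A)| equals |A|/|N(A)|.
rhs = sum(Fraction(a, n) * (1 - Fraction(1, a)) for a, n in representatives)
lhs = 1 - Fraction(1, len(G))
print(sorted(representatives), lhs, rhs)
```

In this example two of the three quotients $N(A)/A$ have order two, which is only possible because the group has even order.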
To get started on this strategy, suppose first that $W(A)$ was trivial for some $A \in \mathcal{A}$, thus $N(A) = A$, and then all the $|G|/|A|$ conjugates of $A$ are distinct. This makes $G$ a Frobenius group, and so by Theorem 2 there is a Frobenius kernel $K$, which is a normal subgroup of $G$. If $G$ is simple, then $K$ has to be trivial, which makes $G = A$ and so $G$ is abelian. This gives Suzuki’s theorem in this case. Thus we may assume that $|W(A)| > 1$ for all $A \in \mathcal{A}$; as the $|W(A)|$ are all odd, we may improve this to
$$|W(A)| \geq 3 \qquad (10)$$
for all $A \in \mathcal{A}$. From this and (9) (and bounding
by one of the
) we thus see that
is not too small:
Next, we use the Sylow theorems to make the $|A|$ pairwise coprime:
Lemma 9 The orders $|A|$, for $A \in \mathcal{A}$, are pairwise coprime.
Proof: Suppose for contradiction that $|A|$ and $|A'|$ are divisible by a common prime $p$ for some distinct $A, A' \in \mathcal{A}$. Then $A$ and $A'$ both contain groups of order $p$, and thus both non-trivially intersect a Sylow $p$-group. On the other hand, non-trivial Sylow $p$-groups have non-trivial centre (otherwise all conjugacy classes other than the identity would have order divisible by $p$, contradiction) and so must be abelian in a CA group. By further application of the CA property we thus conclude that $A$ and $A'$ both contain a Sylow $p$-group. But all Sylow $p$-groups are conjugate, and so $A$ non-trivially intersects a conjugate of $A'$, contradicting (8).
We remark that the above analysis also reveals that
$$|G| = \prod_{A \in \mathcal{A}} |A| \qquad (12)$$
(because the order of a Sylow $p$-group is the largest power of $p$ dividing $|G|$), although we will not need to rely on this fact here. (Actually we will barely use Lemma 9 as it is; it is needed only to dispose of one technical case in the final analysis.) It is interesting though to see that classical techniques such as Sylow theorems are capable of demonstrating a number of facts about the various quantities appearing in the class equation (9), although without the additional control arising from character theory these facts appear to be insufficient in and of themselves to actually contradict that equation. As an example of (12) one can take the special linear group $SL_2(F_q)$ with $q$ a power of two, in which there are three abelian groups $A$ (split torus, non-split torus, and unipotent group) of orders $q-1, q+1, q$ respectively, with the entire group $SL_2(F_q)$ being of order $(q-1)(q+1)q$.
We do not yet have a sufficiently strong upper bound on the right-hand side of (9), basically because we have no upper bound on the number of conjugacy classes (or sufficiently good lower bounds on the $|W(A)|$). To get further bounds we have to return to character theory. The basic idea will be to construct generalised characters which have small $L^2$ norm but which take non-trivial rational integer values at many places, which when combined with the integrality gap will yield useful bounds on various quantities that appear in (9).
We turn to the details. Let $A \in \mathcal{A}$ be one of the maximal abelian groups. Being abelian, the character theory of $A$ is just Fourier analysis: $\hat A$ is the group of linear characters on $A$ (i.e. the Pontryagin dual of $A$). (Indeed, from (1) and the observation that the class number of an abelian group is the same as its order, we see that all irreducible representations of an abelian group are one-dimensional.)
The normaliser acts on
by conjugation; as
is abelian, the action of
on itself is trivial, and so we obtain an action of
on
also. Taking adjoints, we obtain an action of
on
as well. Any non-trivial element of
cannot fix an non-trivial element
of
, as the centraliser of
would then contain an element outside of
, contradicting the CA-group nature of
. Taking adjoints, we conclude that a non-trivial element
of
cannot fix a non-trivial element
of
either (otherwise the action of
minus the identity would be non-injective, hence non-surjective, on
, so that the corresponding homomorphism on
is non-injective). Thus we see that the action of
foliates the non-identity elements
of
into orbits of size
. Among other things, this implies that
divides
, and that the number of such orbits
is
. As
are both odd,
is even, and in particular
for all .
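The orbit count can be seen in a toy case. The sketch below uses an illustrative example that is not taken from the post: the cyclic group Z/7 with the order-3 subgroup {1, 2, 4} of its unit group acting by multiplication (the dual of Z/7 is again Z/7 with the same kind of action, so the same count applies to characters). The names `orbits`, `W` and `orbs` are hypothetical.

```python
def orbits(n, units):
    """Orbits of the multiplicative action of `units` on the non-zero residues mod n."""
    seen, result = set(), []
    for x in range(1, n):
        if x not in seen:
            orbit = frozenset((u * x) % n for u in units)
            seen |= orbit
            result.append(orbit)
    return result

W = [1, 2, 4]          # subgroup of order 3 in (Z/7)^*, an assumed example
orbs = orbits(7, W)

# The action is free on non-zero elements, so every orbit has size |W| = 3,
# |W| divides 7 - 1, and the number of orbits is (7 - 1) / 3 = 2, which is even.
print([sorted(o) for o in orbs])
```

Here both the group order 7 and the acting group order 3 are odd, so the number of orbits is forced to be even, matching the parity argument above.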
Now let us make some generalised characters. Let be non-identity elements of
which do not lie in the same
-orbit. Then
is a generalised character of
that vanishes at the identity. Applying induction and (4), we see that
is a generalised character of that is supported on the set
and whose restriction to takes the form
From the Plancherel identity (and the assumption that have disjoint
-orbits) we see that
and so
since , we thus have
We can then apply Lemma 4 and conclude the important fact that is the difference of two distinct irreducible characters of
.
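The mechanism behind this step can be spelled out as follows. This is a sketch of the standard integrality argument, presumably the one packaged in Lemma 4; the symbols $\phi$, $\chi_j$, $n_j$, $\epsilon_a$, $\epsilon_b$ are notation introduced here for illustration, with $\phi$ standing for a generalised character of squared norm $2$ that vanishes at the identity.

```latex
\begin{align*}
\phi = \sum_j n_j \chi_j, \quad n_j \in \mathbb{Z}
  &\quad\Longrightarrow\quad \|\phi\|_{L^2(G)}^2 = \sum_j n_j^2 = 2 \\
  &\quad\Longrightarrow\quad \phi = \epsilon_a \chi_a + \epsilon_b \chi_b,
     \qquad a \neq b, \quad \epsilon_a, \epsilon_b \in \{-1,+1\}, \\
0 = \phi(1) = \epsilon_a \chi_a(1) + \epsilon_b \chi_b(1),
  \quad \chi_a(1), \chi_b(1) > 0
  &\quad\Longrightarrow\quad \epsilon_a = -\epsilon_b .
\end{align*}
```

The only way to write $2$ as a sum of squares of integers is $1 + 1$, which is why exactly two irreducible characters appear; the vanishing at the identity then forces opposite signs.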
Let be a set of representatives of all the
orbits of
. Then by the above discussion, we see that
is the difference of two distinct irreducible characters of whenever
are distinct. From the linear independence of the irreducible characters and some easy combinatorics (using (13)), we then see that we can find distinct irreducible characters
of
and a sign
such that
for all . For
, the sign
and the characters
are unique. When
, there is a non-uniqueness: one then has the freedom to swap
while reversing the sign of
. But the set of characters
remains unique. We will call the
the exceptional characters associated to
.
Note that if is in the absolute Galois group of the rationals, then
permutes the non-trivial linear characters of
. Applying
to (15) and using the uniqueness of the set of exceptional characters, we conclude that
also permutes the exceptional characters of
. On the other hand, from (15) we know that the exceptional characters of
all agree outside of
, and are thus fixed by the absolute Galois group in this region; in other words, they are rational outside of
. Moreover, as mentioned in the previous section, character values are always algebraic integers. We conclude that
Lemma 10 Any exceptional character for
takes rational integer values outside of
.
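The last step uses the standard fact that a rational number which is an algebraic integer is an ordinary integer. A brief sketch of why, with $r$, $p$, $q$, $c_j$ being notation introduced here only for illustration:

```latex
% Let r = p/q in lowest terms (q >= 1) be a root of a monic integer polynomial:
\[
  r^n + c_{n-1} r^{n-1} + \dots + c_1 r + c_0 = 0, \qquad c_j \in \mathbb{Z}.
\]
% Multiplying through by q^n and isolating p^n:
\[
  p^n = -q \left( c_{n-1} p^{n-1} + c_{n-2} p^{n-2} q + \dots + c_0 q^{n-1} \right).
\]
% Hence q divides p^n; since gcd(p, q) = 1 this forces q = 1, so r = p is an integer.
```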
As remarked earlier, from (15) we see that is supported on the set
. In particular, if
and
are distinct, and
and
, then
and
are orthogonal. From this and the orthonormality of irreducible characters, we conclude (again using (13)) that
and
are distinct. Thus we see that the total number of exceptional characters is
; together with the trivial character, this gives
distinct irreducible characters of
. On the other hand, observe that
, and hence
, consists of
conjugacy classes, and so from (8) we see that the class number of
is also
. As these numbers match, we see that we have located all of the irreducible characters of
; thus every non-principal irreducible character of
is an exceptional character for some
.
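The bookkeeping in this paragraph (one class for the identity, plus one class per orbit on the non-identity elements of each maximal abelian subgroup) can be tested on a small CA-group of odd order. The sketch below uses the Frobenius group of order 21, realised as affine maps x ↦ ax + b on Z/7 with a in {1, 2, 4}. This group is chosen here purely for illustration; it is not simple, so it is not a counterexample to Suzuki’s theorem, and all names in the code are hypothetical.

```python
P = 7
UNITS = (1, 2, 4)  # subgroup of order 3 in (Z/7)^*

def mul(g, h):
    """Compose affine maps: (a, b) * (c, d) is x -> a*(c*x + d) + b."""
    (a, b), (c, d) = g, h
    return ((a * c) % P, (a * d + b) % P)

def inv(g):
    """Inverse of x -> a*x + b, namely x -> a^{-1}*x - a^{-1}*b."""
    a, b = g
    a_inv = pow(a, -1, P)  # modular inverse (Python 3.8+)
    return (a_inv, (-a_inv * b) % P)

G = [(a, b) for a in UNITS for b in range(P)]
IDENTITY = (1, 0)

def centraliser(g):
    return [h for h in G if mul(g, h) == mul(h, g)]

def is_abelian(elems):
    return all(mul(x, y) == mul(y, x) for x in elems for y in elems)

def conjugacy_classes():
    seen, classes = set(), []
    for g in G:
        if g not in seen:
            cls = frozenset(mul(mul(h, g), inv(h)) for h in G)
            seen |= cls
            classes.append(cls)
    return classes

# CA property: every non-identity element has an abelian centraliser.
is_ca = all(is_abelian(centraliser(g)) for g in G if g != IDENTITY)
class_number = len(conjugacy_classes())
print(len(G), is_ca, class_number)
```

In this example the maximal abelian subgroups up to conjugacy are the translation subgroup of order 7 (acted on by a group of order 3, giving 2 orbits on its non-identity elements) and a subgroup of order 3 (acted on trivially, again giving 2 orbits), so the predicted class number is 1 + 2 + 2 = 5.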
Now that we have identified the irreducible characters of , we can analyse other generalised characters in terms of them. We pick
and consider the generalised character
As with (14), is supported on
and when restricted to
, is equal to
In particular, from Plancherel’s theorem we have
This norm is a bit too large to apply Lemma 4 again, but it is still only of moderate size (recall that our enemy when trying to contradict (9) is that the
are too small, too frequently), and we can nevertheless use the
geometry of
and the other known characters, together with the integrality gap, to limit how
breaks up into irreducible components. Firstly, since the
all have mean zero, we see that (16) sums to
on
, and thus
Thus the Fourier coefficient of at the trivial representation is
. Next, we see that (16) is orthogonal to
for any
and
, which upon summing on the conjugacy classes of
and on
gives that
Thus the Fourier coefficients of at the exceptional characters
for
are all equal. Similarly, we have
for any and
, so from (15) we have
and so the Fourier coefficient at is
plus the Fourier coefficients at all the other exceptional characters at
. Next, for
distinct from
and
, we see from (15) that the generalised character
is supported on
, which by (8) is disjoint from the support of
, thus
Thus all the Fourier coefficients of at exceptional characters of
are the same. We thus have obtained a decomposition of the form
for some natural numbers .
Taking norms using the orthonormality of the irreducible characters, we conclude that
Note that regardless of what is, the quantity
is always at least one, thanks to (13). We thus obtain an upper bound on the
:
In particular, from (13) we see that there are not many for which
is non-zero:
This is progress towards our goal of bounding (9) (because it helps control ), except that we also need to deal with those
for which
is zero. For this, the generalised character
will no longer be useful, but another character of small norm – namely,
– will be available as a substitute.
We turn to the details. Let be such that
. Then we return to (18) and conclude that
on the set . On this set, we know from Lemma 10 that
is an integer. But furthermore, from Proposition 7 we know that the exceptional characters
come in conjugate pairs, so in fact Lemma 10 gives that
is an even integer. We conclude that
is an odd integer on
, and in particular, has magnitude at least
on this set. As all the
agree outside of
, we conclude that
on . On the other hand, from the orthonormality of the
we know that
has an
norm of
. Since each set
has cardinality
(as was shown in the derivation of the class equation (9)), we conclude that
We now have enough bounds on the various terms in (9) to obtain the contradiction needed to finish the proof of Suzuki’s theorem by an elementary (though admittedly ad hoc) analysis. It is convenient to order the subgroups so that
We then write the right-hand side of (9) as
For the first summation, we crudely bound by
and use (19); for the second summation we use (20). We conclude from (9) that
and thus (bounding by
and
by
)
Applying (10), we conclude that , which forces
since
is even. Since
, we conclude that
. We use this to return to the bound (22) to obtain
and hence
If , then
and so
On the other hand
The two bounds are inconsistent for , so we have
from (10) (and the odd nature of
), which then gives the upper bound
from (23), and Suzuki’s theorem can then be verified by classical computations for the non-abelian groups of odd order less than
(of which there are actually not that many); alternatively one can use (11), Lemma 9, and (12) to eliminate this case (as no odd number less than
has more than three prime factors). So the only remaining case is when
and
. In this case we may interchange the indices
and
(which does not affect (21)); repeating the above arguments, we may thus also assume that
. Since
, we conclude that
. But this contradicts Lemma 9, and Suzuki’s theorem is proved.
13 April, 2013 at 2:36 am
Daniel Shved
There seems to be an excess
when counting elements in the Frobenius kernel:
.
[Oops, I see what you’re saying now; corrected, thanks. -T]
13 April, 2013 at 12:16 pm
J.Hahn
Hi.
Being a group theorist at heart and not currently having so much time for group theory anymore, I really enjoyed this blog post.
I noticed some minor errors:
instead of
.
and
are elements of
and not elements of
.
* In the proof of Prop. 7 it should be
* Right after (13) the
* Between formulas (19) and (20) and in (20) there are some curly braces missing I think.
* Just before formula (22) the link for formula (19) points to (17) instead of (19).
Thanks again for this blog post.
[Corrected, thanks – T.]
14 April, 2013 at 3:08 am
Daniel Shved
I’m working through the post with great interest. Even though I’m familiar with (a very modest) part of the material, it looks much clearer than it was in my head. I especially like your pace which is way faster than usually seen in textbooks. Thanks a lot!
Here are some more typos I’ve found:
– In the Parseval identity (or Plancherel, I’m not sure which is which):

appears without a hat in
in these two equalities and in the text right below them.
It seems like some hats are missing on the right hand side. Also,
– In the equality that states that induced characters are characters of induced representations:

is upside down (should be
).
– In the proof of Frobenius theorem, when the induced character
first appears, there is this phrase: “Using the above identities and the partition (5), we see that
equals
at
.” For some reason the closing | is not displayed in the denominator of |G|/|H|. It is there in the LaTeX code, but doesn’t display correctly. The same quirk happened in my first comment to this post. Maybe this is a wordpress bug? Maybe due to some unimaginable reasons | is ignored when it is the last symbol in a formula?
– The very next formula says:

should be squared.
Looks like
[Corrected, thanks. I was not able to reproduce the issue in your third bullet point, the | signs seem to be rendering fine to me. – T.]
26 April, 2013 at 12:14 pm
Curious
Dear prof Tao,
If you were forced to answer, what would you answer to whether the universe is mathematical or whether math is only an approximation? What do you think about string theory?
7 March, 2018 at 9:24 pm
Anonymous
A possible answer is in Tegmark’s book “Our mathematical universe”. See also the wikipedia article on the mathematical universe hypotheses.
https://en.wikipedia.org/wiki/Mathematical_universe_hypothesis
3 May, 2013 at 1:34 am
b98201031
I enjoyed that article.
3 June, 2013 at 12:22 pm
Allen Knutson
I quite like the idea that CA-groups are easier, and groups of Lie type are nearly CA. Too bad CA-groups are never nonabelian simple.
It makes me wonder whether there’s a measure of how non-CA a group is, such that Suzuki’s arguments might extend to groups that are nearly CA. (Naively, the fraction of elements whose C are not A, but quite likely some subtler version thereof.)
3 June, 2013 at 1:03 pm
Terence Tao
Yes, this is something I would like to understand better myself. One of the funny things coming out of Suzuki’s analysis is that to every (Weyl group conjugacy class of a) character
on a (conjugacy class of a) maximal abelian subgroup
of G there is associated an “exceptional character”
of G which is a component of the induced representation of G coming from the character of
(or sometimes, a bit weirdly, it is an “anti-component”, if the sign
is negative), and that in a CA group these exceptional characters actually end up occupying all the non-trivial characters of G. So in a Lie group of bounded rank, which is almost CA, one would expect the (Weyl group conjugacy classes of) characters of a maximal torus to be associated somehow with special characters on the Lie group, and that “most” characters on the Lie group should actually arise in this way. Presumably this is what Deligne-Lusztig theory is about, and learning more about that theory is on my “to do” list, but I don’t have much more to say about this at this point. One annoying thing is that the way that the exceptional character
is generated from
is “non-functorial”; it doesn’t come from the categorical operations on representations (induction, direct sum, orthocomplement, etc.) but instead weirdly emerges from the integrality gap (Lemma 4 in the post) which is a way to easily produce characters without saying much of anything about their associated representations. I don’t yet know if this disconnect is really there, or if there are some other ways to construct representations that are somehow behind exceptional character theory (but again Deligne-Lusztig theory is presumably a big clue in this regard).
27 June, 2013 at 11:38 am
DG
Is there a citable reference to the “theorem of Frobenius from 1901”, and to the original definitions of Frobenius group/kernel/complement? Thanks.
27 June, 2013 at 12:19 pm
Terence Tao
The original reference is
G. Frobenius, “Ueber auflösbare Gruppen IV” Sitzungsber. Preuss. Akad. Wissenschaft. (1901) pp. 1216–1230
but it may be difficult to locate (see http://math.stackexchange.com/questions/222167/where-can-i-find-the-original-papers-by-frobenius-concerning-solutions-to-xn for some related discussion). A somewhat more modern reference is
I.M. Isaacs, “Character theory of finite groups” , Acad. Press (1976)
27 June, 2013 at 1:15 pm
DG
Thank you very much!
7 March, 2018 at 11:33 am
Michael Geline
Hi. Another of the “museum piece” applications of characters to finite groups is Burnside’s theorem which says a transitive permutation group of prime degree
is either doubly transitive or has a normal Sylow
-subgroup. This implies right away that a non-abelian finite simple group cannot have a subgroup of prime index for any prime. A group of odd order cannot act doubly transitively! However, I’m pretty sure that arguments like this play no role in the Feit-Thompson argument.
20 May, 2018 at 2:13 am
Anonymous
Thanks for the post. I think I’m misunderstanding something basic or you have a small mistake; when you give the second definition of an induced function, it should be the sum over f(g_i ^-1 x g_i) and not f(g_i x g_i ^-1), or alternatively you can take a transversal over right cosets and not left cosets.