You are currently browsing the monthly archive for April 2013.

An abstract finite-dimensional complex Lie algebra, or Lie algebra for short, is a finite-dimensional complex vector space ${{\mathfrak g}}$ together with an anti-symmetric bilinear map ${[,] = [,]_{\mathfrak g}: {\mathfrak g} \times {\mathfrak g} \rightarrow {\mathfrak g}}$ that obeys the Jacobi identity

$\displaystyle [[x,y],z] + [[y,z],x] + [[z,x],y] = 0 \ \ \ \ \ (1)$

for all ${x,y,z \in {\mathfrak g}}$; by anti-symmetry one can also rewrite the Jacobi identity as

$\displaystyle [x,[y,z]] = [[x,y],z] + [y,[x,z]]. \ \ \ \ \ (2)$

We will usually omit the subscript from the Lie bracket ${[,]_{\mathfrak g}}$ when this will not cause ambiguity. A homomorphism ${\phi: {\mathfrak g} \rightarrow {\mathfrak h}}$ between two Lie algebras ${{\mathfrak g},{\mathfrak h}}$ is a linear map that respects the Lie bracket, thus ${\phi([x,y]_{\mathfrak g}) =[\phi(x),\phi(y)]_{\mathfrak h}}$ for all ${x,y \in {\mathfrak g}}$. As with many other classes of mathematical objects, the class of Lie algebras together with their homomorphisms then form a category. One can of course also consider Lie algebras in infinite dimension or over other fields, but we will restrict attention throughout these notes to the finite-dimensional complex case. The trivial, zero-dimensional Lie algebra is denoted ${0}$; Lie algebras of positive dimension will be called non-trivial.

Lie algebras come up in many contexts in mathematics, in particular arising as the tangent space of complex Lie groups. It is thus very profitable to think of Lie algebras as being the infinitesimal component of a Lie group, and in particular almost all of the notation and concepts that are applicable to Lie groups (e.g. nilpotence, solvability, extensions, etc.) have infinitesimal counterparts in the category of Lie algebras (often with exactly the same terminology). See this previous blog post for more discussion about the connection between Lie algebras and Lie groups (that post was focused over the reals instead of the complexes, but much of the discussion carries over to the complex case).

A particular example of a Lie algebra is the general linear Lie algebra ${{\mathfrak{gl}}(V)}$ of linear transformations ${x: V \rightarrow V}$ on a finite-dimensional complex vector space (or vector space for short) ${V}$, with the commutator Lie bracket ${[x,y] := xy-yx}$; one easily verifies that this is indeed an abstract Lie algebra. We will define a concrete Lie algebra to be a Lie algebra that is a subalgebra of ${{\mathfrak{gl}}(V)}$ for some vector space ${V}$, and similarly define a representation of a Lie algebra ${{\mathfrak g}}$ to be a homomorphism ${\rho: {\mathfrak g} \rightarrow {\mathfrak h}}$ into a concrete Lie algebra ${{\mathfrak h}}$. It is a deep theorem of Ado (discussed in this previous post) that every abstract Lie algebra is in fact isomorphic to a concrete one (or equivalently, that every abstract Lie algebra has a faithful representation), but we will not need or prove this fact here.
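The claim that the commutator bracket on ${{\mathfrak{gl}}(V)}$ obeys the Jacobi identity can be spot-checked numerically. The following is a minimal sketch, assuming ${V = {\bf C}^3}$ and using random integer matrices (so that the arithmetic is exact); the helper names `bracket` and `jacobi_defect` are illustrative choices, not notation from these notes. It tests both forms (1) and (2) of the identity.

```python
import random

def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(a, b):
    """Entrywise sum of two matrices."""
    return [[p + q for p, q in zip(r, s)] for r, s in zip(a, b)]

def bracket(x, y):
    """Commutator Lie bracket [x, y] = xy - yx."""
    xy, yx = matmul(x, y), matmul(y, x)
    return [[p - q for p, q in zip(r, s)] for r, s in zip(xy, yx)]

def jacobi_defect(x, y, z):
    """Left-hand side of (1): [[x,y],z] + [[y,z],x] + [[z,x],y]."""
    return add(add(bracket(bracket(x, y), z), bracket(bracket(y, z), x)),
               bracket(bracket(z, x), y))

def random_matrix(n, rng):
    """Random n x n matrix with small integer entries."""
    return [[rng.randint(-5, 5) for _ in range(n)] for _ in range(n)]

rng = random.Random(0)
n = 3
zero = [[0] * n for _ in range(n)]
x, y, z = (random_matrix(n, rng) for _ in range(3))

# Form (1) of the Jacobi identity: the cyclic sum vanishes.
assert jacobi_defect(x, y, z) == zero

# Form (2): ad x acts as a derivation of the bracket.
lhs = bracket(x, bracket(y, z))
rhs = add(bracket(bracket(x, y), z), bracket(y, bracket(x, z)))
assert lhs == rhs
```

A check on one random triple is of course not a proof; the identity follows by expanding all twelve terms of the cyclic sum and observing that they cancel in pairs.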

Even without Ado’s theorem, though, the structure of abstract Lie algebras is very well understood. As with objects in many other algebraic categories, a basic way to understand a Lie algebra ${{\mathfrak g}}$ is to factor it into two simpler algebras ${{\mathfrak h}, {\mathfrak k}}$ via a short exact sequence

$\displaystyle 0 \rightarrow {\mathfrak h} \rightarrow {\mathfrak g} \rightarrow {\mathfrak k} \rightarrow 0, \ \ \ \ \ (3)$

thus one has an injective homomorphism from ${{\mathfrak h}}$ to ${{\mathfrak g}}$ and a surjective homomorphism from ${{\mathfrak g}}$ to ${{\mathfrak k}}$ such that the image of the former homomorphism is the kernel of the latter. (To be pedantic, a short exact sequence in a general category requires these homomorphisms to be monomorphisms and epimorphisms respectively, but in the category of Lie algebras these turn out to reduce to the more familiar concepts of injectivity and surjectivity respectively.) Given such a sequence, one can (non-uniquely) identify ${{\mathfrak g}}$ with the vector space ${{\mathfrak h} \times {\mathfrak k}}$ equipped with a Lie bracket of the form

$\displaystyle [(t,x), (s,y)]_{\mathfrak g} = ([t,s]_{\mathfrak h} + A(t,y) - A(s,x) + B(x,y), [x,y]_{\mathfrak k}) \ \ \ \ \ (4)$

for some bilinear maps ${A: {\mathfrak h} \times {\mathfrak k} \rightarrow {\mathfrak h}}$ and ${B: {\mathfrak k} \times {\mathfrak k} \rightarrow {\mathfrak h}}$ that obey some Jacobi-type identities which we will not record here. Understanding exactly what maps ${A,B}$ are possible here (up to coordinate change) can be a difficult task (and is one of the key objectives of Lie algebra cohomology), but in principle at least, the problem of understanding ${{\mathfrak g}}$ can be reduced to that of understanding that of its factors ${{\mathfrak k}, {\mathfrak h}}$. To emphasise this, I will (perhaps idiosyncratically) express the existence of a short exact sequence (3) by the ATLAS-type notation

$\displaystyle {\mathfrak g} = {\mathfrak h} . {\mathfrak k} \ \ \ \ \ (5)$

although one should caution that for given ${{\mathfrak h}}$ and ${{\mathfrak k}}$, there can be multiple non-isomorphic ${{\mathfrak g}}$ that can form a short exact sequence with ${{\mathfrak h},{\mathfrak k}}$, so that ${{\mathfrak h} . {\mathfrak k}}$ is not a uniquely defined combination of ${{\mathfrak h}}$ and ${{\mathfrak k}}$; one could emphasise this by writing ${{\mathfrak h} ._{A,B} {\mathfrak k}}$ instead of ${{\mathfrak h} . {\mathfrak k}}$, though we will not do so here. We will refer to ${{\mathfrak g}}$ as an extension of ${{\mathfrak k}}$ by ${{\mathfrak h}}$, and read the notation (5) as “${{\mathfrak g}}$ is ${{\mathfrak h}}$-by-${{\mathfrak k}}$”; confusingly, these two notations reverse the subject and object of “by”, but unfortunately both notations are well entrenched in the literature. We caution that the operation ${.}$ is not commutative, and it is only partly associative: every Lie algebra of the form ${{\mathfrak k} . ({\mathfrak h} . {\mathfrak l})}$ is also of the form ${({\mathfrak k} . {\mathfrak h}) . {\mathfrak l}}$, but the converse is not true (see this previous blog post for some related discussion). As we are working in the infinitesimal world of Lie algebras (which have an additive group operation) rather than Lie groups (in which the group operation is usually written multiplicatively), it may help to think of ${{\mathfrak h} . {\mathfrak k}}$ as a (twisted) “sum” of ${{\mathfrak h}}$ and ${{\mathfrak k}}$ rather than a “product”; for instance, we have ${{\mathfrak g} = 0 . {\mathfrak g}}$ and ${{\mathfrak g} = {\mathfrak g} . 0}$, and also ${\dim {\mathfrak h} . {\mathfrak k} = \dim {\mathfrak h} + \dim {\mathfrak k}}$.

Special examples of extensions ${{\mathfrak h} .{\mathfrak k}}$ of ${{\mathfrak k}}$ by ${{\mathfrak h}}$ include the direct sum (or direct product) ${{\mathfrak h} \oplus {\mathfrak k}}$ (also denoted ${{\mathfrak h} \times {\mathfrak k}}$), which is given by the construction (4) with ${A}$ and ${B}$ both vanishing, and the split extension (or semidirect product) ${{\mathfrak h} : {\mathfrak k} = {\mathfrak h} :_\rho {\mathfrak k}}$ (also denoted ${{\mathfrak h} \ltimes {\mathfrak k} = {\mathfrak h} \ltimes_\rho {\mathfrak k}}$), which is given by the construction (4) with ${B}$ vanishing and the bilinear map ${A: {\mathfrak h} \times {\mathfrak k} \rightarrow {\mathfrak h}}$ taking the form

$\displaystyle A( t, x ) = \rho(x)(t)$

for some representation ${\rho: {\mathfrak k} \rightarrow \hbox{Der} {\mathfrak h}}$ of ${{\mathfrak k}}$ in the concrete Lie algebra of derivations ${\hbox{Der} {\mathfrak h} \subset {\mathfrak{gl}}({\mathfrak h})}$ of ${{\mathfrak h}}$, that is to say the algebra of linear maps ${D: {\mathfrak h} \rightarrow {\mathfrak h}}$ that obey the Leibniz rule

$\displaystyle D[s,t]_{\mathfrak h} = [Ds,t]_{\mathfrak h} + [s,Dt]_{\mathfrak h}$

for all ${s,t \in {\mathfrak h}}$. (The derivation algebra ${\hbox{Der} {\mathfrak g}}$ of a Lie algebra ${{\mathfrak g}}$ is analogous to the automorphism group ${\hbox{Aut}(G)}$ of a Lie group ${G}$, with the two concepts being intertwined by the tangent space functor ${G \mapsto {\mathfrak g}}$ from Lie groups to Lie algebras (i.e. the derivation algebra is the infinitesimal version of the automorphism group). Of course, this functor also intertwines the Lie algebra and Lie group versions of most of the other concepts discussed here, such as extensions, semidirect products, etc.)

There are two general ways to factor a Lie algebra ${{\mathfrak g}}$ as an extension ${{\mathfrak h} . {\mathfrak k}}$ of a smaller Lie algebra ${{\mathfrak k}}$ by another smaller Lie algebra ${{\mathfrak h}}$. One is to locate a Lie algebra ideal (or ideal for short) ${{\mathfrak h}}$ in ${{\mathfrak g}}$, thus ${[{\mathfrak h},{\mathfrak g}] \subset {\mathfrak h}}$, where ${[{\mathfrak h},{\mathfrak g}]}$ denotes the Lie algebra generated by ${\{ [x,y]: x \in {\mathfrak h}, y \in {\mathfrak g} \}}$, and then take ${{\mathfrak k}}$ to be the quotient space ${{\mathfrak g}/{\mathfrak h}}$ in the usual manner; one can check that ${{\mathfrak h}}$, ${{\mathfrak k}}$ are also Lie algebras and that we do indeed have a short exact sequence

$\displaystyle {\mathfrak g} = {\mathfrak h} . ({\mathfrak g}/{\mathfrak h}).$

Conversely, whenever one has a factorisation ${{\mathfrak g} = {\mathfrak h} . {\mathfrak k}}$, one can identify ${{\mathfrak h}}$ with an ideal in ${{\mathfrak g}}$, and ${{\mathfrak k}}$ with the quotient of ${{\mathfrak g}}$ by ${{\mathfrak h}}$.

The other general way to obtain such a factorisation is to start with a homomorphism ${\rho: {\mathfrak g} \rightarrow {\mathfrak m}}$ of ${{\mathfrak g}}$ into another Lie algebra ${{\mathfrak m}}$, take ${{\mathfrak k}}$ to be the image ${\rho({\mathfrak g})}$ of ${{\mathfrak g}}$, and ${{\mathfrak h}}$ to be the kernel ${\hbox{ker} \rho := \{ x \in {\mathfrak g}: \rho(x) = 0 \}}$. Again, it is easy to see that this does indeed create a short exact sequence:

$\displaystyle {\mathfrak g} = \hbox{ker} \rho . \rho({\mathfrak g}).$

Conversely, whenever one has a factorisation ${{\mathfrak g} = {\mathfrak h} . {\mathfrak k}}$, one can identify ${{\mathfrak k}}$ with the image of ${{\mathfrak g}}$ under some homomorphism, and ${{\mathfrak h}}$ with the kernel of that homomorphism. Note that if a representation ${\rho: {\mathfrak g} \rightarrow {\mathfrak m}}$ is faithful (i.e. injective), then the kernel is trivial and ${{\mathfrak g}}$ is isomorphic to ${\rho({\mathfrak g})}$.

Now we consider some examples of factoring some class of Lie algebras into simpler Lie algebras. The easiest examples of Lie algebras to understand are the abelian Lie algebras ${{\mathfrak g}}$, in which the Lie bracket identically vanishes. Every one-dimensional Lie algebra is automatically abelian, and thus isomorphic to the scalar algebra ${{\bf C}}$. Conversely, by using an arbitrary linear basis of ${{\mathfrak g}}$, we see that an abelian Lie algebra is isomorphic to the direct sum of one-dimensional algebras. Thus, a Lie algebra is abelian if and only if it is isomorphic to the direct sum of finitely many copies of ${{\bf C}}$.

Now consider a Lie algebra ${{\mathfrak g}}$ that is not necessarily abelian. We then form the derived algebra ${[{\mathfrak g},{\mathfrak g}]}$; this algebra is trivial if and only if ${{\mathfrak g}}$ is abelian. It is easy to see that ${[{\mathfrak h},{\mathfrak k}]}$ is an ideal whenever ${{\mathfrak h},{\mathfrak k}}$ are ideals, so in particular the derived algebra ${[{\mathfrak g},{\mathfrak g}]}$ is an ideal and we thus have the short exact sequence

$\displaystyle {\mathfrak g} = [{\mathfrak g},{\mathfrak g}] . ({\mathfrak g}/[{\mathfrak g},{\mathfrak g}]).$

The algebra ${{\mathfrak g}/[{\mathfrak g},{\mathfrak g}]}$ is the maximal abelian quotient of ${{\mathfrak g}}$, and is known as the abelianisation of ${{\mathfrak g}}$. If it is trivial, we call the Lie algebra perfect. If instead it is non-trivial, then the derived algebra has strictly smaller dimension than ${{\mathfrak g}}$. From this, it is natural to associate two series to any Lie algebra ${{\mathfrak g}}$, the lower central series

$\displaystyle {\mathfrak g}_1 = {\mathfrak g}; {\mathfrak g}_2 := [{\mathfrak g}, {\mathfrak g}_1]; {\mathfrak g}_3 := [{\mathfrak g}, {\mathfrak g}_2]; \ldots$

and the derived series

$\displaystyle {\mathfrak g}^{(1)} := {\mathfrak g}; {\mathfrak g}^{(2)} := [{\mathfrak g}^{(1)}, {\mathfrak g}^{(1)}]; {\mathfrak g}^{(3)} := [{\mathfrak g}^{(2)}, {\mathfrak g}^{(2)}]; \ldots.$

By induction we see that these are both decreasing series of ideals of ${{\mathfrak g}}$, with the derived series being slightly smaller (${{\mathfrak g}^{(k)} \subseteq {\mathfrak g}_k}$ for all ${k}$). We say that a Lie algebra is nilpotent if its lower central series is eventually trivial, and solvable if its derived series eventually becomes trivial. Thus, abelian Lie algebras are nilpotent, and nilpotent Lie algebras are solvable, but the converses are not necessarily true. For instance, in the general linear Lie algebra ${{\mathfrak{gl}}_n = {\mathfrak{gl}}({\bf C}^n)}$, which can be identified with the Lie algebra of ${n \times n}$ complex matrices, the subalgebra ${{\mathfrak n}}$ of strictly upper triangular matrices is nilpotent (but not abelian for ${n \geq 3}$), while the subalgebra ${{\mathfrak b}}$ of upper triangular matrices is solvable (but not nilpotent for ${n \geq 2}$). It is also clear that any subalgebra of a nilpotent algebra is nilpotent, and similarly for solvable or abelian algebras.
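The two series can be computed explicitly for the triangular examples above. The sketch below is one possible way to do this, under the following assumptions: matrices are stored as lists of rows, each term of a series is represented by a basis, and ${[{\mathfrak h},{\mathfrak k}]}$ is computed as the linear span of the brackets of basis elements (which suffices here, since all the terms involved are ideals). The names `series_dims`, `strictly_upper` and `upper` are hypothetical helpers introduced only for this illustration.

```python
from fractions import Fraction

def unit(n, i, j):
    """Elementary n x n matrix with a single 1 in position (i, j)."""
    m = [[0] * n for _ in range(n)]
    m[i][j] = 1
    return m

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def bracket(x, y):
    """Commutator [x, y] = xy - yx."""
    xy, yx = matmul(x, y), matmul(y, x)
    return [[p - q for p, q in zip(r, s)] for r, s in zip(xy, yx)]

def reduce_basis(vectors):
    """Extract a basis of the span of the given vectors by exact elimination."""
    basis = []  # pairs (pivot index, vector vanishing at all earlier pivots)
    for v in vectors:
        v = [Fraction(c) for c in v]
        for pivot, b in basis:
            if v[pivot] != 0:
                factor = v[pivot] / b[pivot]
                v = [vi - factor * bi for vi, bi in zip(v, b)]
        pivot = next((i for i, c in enumerate(v) if c != 0), None)
        if pivot is not None:
            basis.append((pivot, v))
    return [b for _, b in basis]

def bracket_span(left, right, n):
    """Basis (as matrices) of the span of [x, y], x in left, y in right."""
    flat = [[c for row in bracket(x, y) for c in row]
            for x in left for y in right]
    return [[v[i * n:(i + 1) * n] for i in range(n)] for v in reduce_basis(flat)]

def series_dims(basis, n, derived):
    """Dimensions of the derived series (derived=True) or lower central
    series (derived=False), stopping at 0 or once the series stabilises."""
    dims, current = [], basis
    while True:
        dims.append(len(current))
        if not current:
            return dims
        nxt = bracket_span(current if derived else basis, current, n)
        if len(nxt) == len(current):
            return dims  # stabilised at a non-zero ideal
        current = nxt

def strictly_upper(n):
    return [unit(n, i, j) for i in range(n) for j in range(i + 1, n)]

def upper(n):
    return [unit(n, i, j) for i in range(n) for j in range(i, n)]

print(series_dims(strictly_upper(3), 3, derived=False))  # [3, 1, 0]: nilpotent
print(series_dims(upper(2), 2, derived=False))           # [3, 1]: never reaches 0
print(series_dims(upper(2), 2, derived=True))            # [3, 1, 0]: solvable
```

A returned list ending in `0` indicates that the corresponding series terminates; a list ending in a positive number indicates that the series has stabilised at a non-trivial ideal, so the algebra is not nilpotent (or not solvable, respectively).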

From the above discussion we see that a Lie algebra is solvable if and only if it can be represented by a tower of abelian extensions, thus

$\displaystyle {\mathfrak g} = {\mathfrak a}_1 . ({\mathfrak a}_2 . \ldots ({\mathfrak a}_{k-1} . {\mathfrak a}_k) \ldots )$

for some abelian ${{\mathfrak a}_1,\ldots,{\mathfrak a}_k}$. Similarly, a Lie algebra ${{\mathfrak g}}$ is nilpotent if it is expressible as a tower of central extensions (so that in all the extensions ${{\mathfrak h} . {\mathfrak k}}$ in the above factorisation, ${{\mathfrak h}}$ is central in ${{\mathfrak h} . {\mathfrak k}}$, where we say that ${{\mathfrak h}}$ is central in ${{\mathfrak g}}$ if ${[{\mathfrak h},{\mathfrak g}]=0}$). We also see that an extension ${{\mathfrak h} . {\mathfrak k}}$ is solvable if and only if both factors ${{\mathfrak h}, {\mathfrak k}}$ are solvable. Splitting abelian algebras into cyclic (i.e. one-dimensional) ones, we thus see that a finite-dimensional Lie algebra is solvable if and only if it is polycyclic, i.e. it can be represented by a tower of cyclic extensions.

For our next fundamental example of using short exact sequences to split a general Lie algebra into simpler objects, we observe that every abstract Lie algebra ${{\mathfrak g}}$ has an adjoint representation ${\hbox{ad}: {\mathfrak g} \rightarrow \hbox{ad} {\mathfrak g} \subset {\mathfrak{gl}}({\mathfrak g})}$, where for each ${x \in {\mathfrak g}}$, ${\hbox{ad} x \in {\mathfrak{gl}}({\mathfrak g})}$ is the linear map ${(\hbox{ad} x)(y) := [x,y]}$; one easily verifies that this is indeed a representation (indeed, (2) is equivalent to the assertion that ${\hbox{ad} [x,y] = [\hbox{ad} x, \hbox{ad} y]}$ for all ${x,y \in {\mathfrak g}}$). The kernel of this representation is the center ${Z({\mathfrak g}) := \{ x \in {\mathfrak g}: [x,{\mathfrak g}] = 0\}}$, which is the maximal central subalgebra of ${{\mathfrak g}}$. We thus have the short exact sequence

$\displaystyle {\mathfrak g} = Z({\mathfrak g}) . \hbox{ad} {\mathfrak g} \ \ \ \ \ (6)$

which, among other things, shows that every abstract Lie algebra is a central extension of a concrete Lie algebra (which can serve as a cheap substitute for Ado’s theorem mentioned earlier).

For our next fundamental decomposition of Lie algebras, we need some more definitions. A Lie algebra ${{\mathfrak g}}$ is simple if it is non-abelian and has no ideals other than ${0}$ and ${{\mathfrak g}}$; thus simple Lie algebras cannot be factored ${{\mathfrak g} = {\mathfrak h} . {\mathfrak k}}$ into strictly smaller algebras ${{\mathfrak h},{\mathfrak k}}$. In particular, simple Lie algebras are automatically perfect and centerless. We have the following fundamental theorem:

Theorem 1 (Equivalent definitions of semisimplicity) Let ${{\mathfrak g}}$ be a Lie algebra. Then the following are equivalent:

• (i) ${{\mathfrak g}}$ does not contain any non-trivial solvable ideal.
• (ii) ${{\mathfrak g}}$ does not contain any non-trivial abelian ideal.
• (iii) The Killing form ${K: {\mathfrak g} \times {\mathfrak g} \rightarrow {\bf C}}$, defined as the bilinear form ${K(x,y) := \hbox{tr}_{\mathfrak g}( (\hbox{ad} x) (\hbox{ad} y) )}$, is non-degenerate on ${{\mathfrak g}}$.
• (iv) ${{\mathfrak g}}$ is isomorphic to the direct sum of finitely many non-abelian simple Lie algebras.

We review the proof of this theorem later in these notes. A Lie algebra obeying any (and hence all) of the properties (i)-(iv) is known as a semisimple Lie algebra. The statement (iv) is usually taken as the definition of semisimplicity; the equivalence of (iv) and (i) is a special case of Weyl’s complete reducibility theorem (see Theorem 32), and the equivalence of (iv) and (iii) is known as the Cartan semisimplicity criterion. (The equivalence of (i) and (ii) is easy.)
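To make criterion (iii) concrete, here is a minimal sketch that computes the Killing form of ${\mathfrak{sl}_2}$ directly from the definition ${K(x,y) = \hbox{tr}_{\mathfrak g}((\hbox{ad} x)(\hbox{ad} y))}$. It assumes the standard basis ${e, f, h}$ of traceless ${2 \times 2}$ matrices (a choice made for this illustration, since the notes have not yet fixed a basis), and the helper names `coords`, `ad`, `killing` and `det3` are hypothetical.

```python
def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def bracket(x, y):
    """Commutator [x, y] = xy - yx."""
    xy, yx = matmul(x, y), matmul(y, x)
    return [[p - q for p, q in zip(r, s)] for r, s in zip(xy, yx)]

# Assumed standard basis of sl_2: [h,e] = 2e, [h,f] = -2f, [e,f] = h.
E = [[0, 1], [0, 0]]
F = [[0, 0], [1, 0]]
H = [[1, 0], [0, -1]]
BASIS = [E, F, H]

def coords(m):
    """Coordinates of a traceless 2x2 matrix [[a, b], [c, -a]] in (e, f, h)."""
    return [m[0][1], m[1][0], m[0][0]]

def ad(x):
    """Matrix of ad x in the basis (e, f, h); column j holds coords of [x, b_j]."""
    cols = [coords(bracket(x, b)) for b in BASIS]
    return [[cols[j][i] for j in range(3)] for i in range(3)]

def killing(x, y):
    """K(x, y) = trace of (ad x)(ad y)."""
    product = matmul(ad(x), ad(y))
    return sum(product[i][i] for i in range(3))

def det3(m):
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

K = [[killing(x, y) for y in BASIS] for x in BASIS]
print(K)        # [[0, 4, 0], [4, 0, 0], [0, 0, 8]]
print(det3(K))  # -128, non-zero, so K is non-degenerate
```

The non-vanishing determinant of the Gram matrix is exactly the non-degeneracy asserted in (iii), consistent with ${\mathfrak{sl}_2 = A_1}$ being simple.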

If ${{\mathfrak h}}$ and ${{\mathfrak k}}$ are solvable ideals of a Lie algebra ${{\mathfrak g}}$, then it is not difficult to see that the vector sum ${{\mathfrak h}+{\mathfrak k}}$ is also a solvable ideal (because on quotienting by ${{\mathfrak h}}$ we see that the derived series of ${{\mathfrak h}+{\mathfrak k}}$ must eventually fall inside ${{\mathfrak h}}$, and thence must eventually become trivial by the solvability of ${{\mathfrak h}}$). As our Lie algebras are finite dimensional, we conclude that ${{\mathfrak g}}$ has a unique maximal solvable ideal, known as the radical ${\hbox{rad} {\mathfrak g}}$ of ${{\mathfrak g}}$. The quotient ${{\mathfrak g}/\hbox{rad} {\mathfrak g}}$ is then a Lie algebra with trivial radical, and is thus semisimple by the above theorem, giving the Levi decomposition

$\displaystyle {\mathfrak g} = \hbox{rad} {\mathfrak g} . ({\mathfrak g} / \hbox{rad} {\mathfrak g})$

expressing an arbitrary Lie algebra as an extension of a semisimple Lie algebra ${{\mathfrak g}/\hbox{rad}{\mathfrak g}}$ by a solvable algebra ${\hbox{rad} {\mathfrak g}}$ (and it is not hard to see that this is the only possible such extension up to isomorphism). Indeed, a deep theorem of Levi allows one to upgrade this decomposition to a split extension

$\displaystyle {\mathfrak g} = \hbox{rad} {\mathfrak g} : ({\mathfrak g} / \hbox{rad} {\mathfrak g})$

although we will not need or prove this result here.

In view of the above decompositions, we see that we can factor any Lie algebra (using a suitable combination of direct sums and extensions) into a finite number of simple Lie algebras and the scalar algebra ${{\bf C}}$. In principle, this means that one can understand an arbitrary Lie algebra once one understands all the simple Lie algebras (which, being defined over ${{\bf C}}$, are somewhat confusingly referred to as simple complex Lie algebras in the literature). Amazingly, this latter class of algebras is completely classified:

Theorem 2 (Classification of simple Lie algebras) Up to isomorphism, every simple Lie algebra is of one of the following forms:

• ${A_n = \mathfrak{sl}_{n+1}}$ for some ${n \geq 1}$.
• ${B_n = \mathfrak{so}_{2n+1}}$ for some ${n \geq 2}$.
• ${C_n = \mathfrak{sp}_{2n}}$ for some ${n \geq 3}$.
• ${D_n = \mathfrak{so}_{2n}}$ for some ${n \geq 4}$.
• ${E_6, E_7}$, or ${E_8}$.
• ${F_4}$.
• ${G_2}$.

(The precise definition of the classical Lie algebras ${A_n,B_n,C_n,D_n}$ and the exceptional Lie algebras ${E_6,E_7,E_8,F_4,G_2}$ will be recalled later.)

(One can extend the families ${A_n,B_n,C_n,D_n}$ of classical Lie algebras a little bit to smaller values of ${n}$, but the resulting algebras are either isomorphic to other algebras on this list, or cease to be simple; see this previous post for further discussion.)

This classification is a basic starting point for the classification of many other related objects, including Lie algebras and Lie groups over more general fields (e.g. the reals ${{\bf R}}$), as well as finite simple groups. Being so fundamental to the subject, this classification is covered in almost every basic textbook in Lie algebras, and I myself learned it many years ago in an honours undergraduate course back in Australia. The proof is rather lengthy, though, and I have always had difficulty keeping it straight in my head. So I have decided to write some notes on the classification in this blog post, aiming to be self-contained (though moving rapidly). There is no new material in this post, though; it is all drawn from standard reference texts (I relied particularly on Fulton and Harris’s text, which I highly recommend). In fact it seems remarkably hard to deviate from the standard routes given in the literature to the classification; I would be interested in knowing about other ways to reach the classification (or substeps in that classification) that are genuinely different from the orthodox route.

The classification of finite simple groups (CFSG), first announced in 1983 but only fully completed in 2004, is one of the monumental achievements of twentieth century mathematics. Spanning hundreds of papers and tens of thousands of pages, it has been called the “enormous theorem”. A “second generation” proof of the theorem is nearly completed which is a little shorter (estimated at about five thousand pages in length), but currently there is no reasonably sized proof of the classification.

An important precursor of the CFSG is the Feit-Thompson theorem from 1962-1963, which asserts that every finite group of odd order is solvable, or equivalently that every non-abelian finite simple group has even order. This is an immediate consequence of CFSG, and conversely the Feit-Thompson theorem is an essential starting point in the proof of the classification, since it allows one to reduce matters to groups of even order for which key additional tools (such as the Brauer-Fowler theorem) become available. The original proof of the Feit-Thompson theorem is 255 pages long, which is significantly shorter than the proof of the CFSG, but still far from short. While parts of the proof of the Feit-Thompson theorem have been simplified (and it has recently been converted, after six years of effort, into an argument that has been verified by the proof assistant Coq), the available proofs of this theorem are still extremely lengthy by any reasonable standard.

However, there is a significantly simpler special case of the Feit-Thompson theorem that was established previously by Suzuki in 1957, which was influential in the proof of the more general Feit-Thompson theorem (and thus indirectly to the proof of CFSG). Define a CA-group to be a group ${G}$ with the property that the centraliser ${C_G(x) := \{ g \in G: gx=xg \}}$ of any non-identity element ${x \in G}$ is abelian; equivalently, the commuting relation ${x \sim y}$ (defined as the relation that holds when ${x}$ commutes with ${y}$, thus ${xy=yx}$) is an equivalence relation on the non-identity elements ${G \backslash \{1\}}$ of ${G}$. Trivially, every abelian group is CA. A non-abelian example of a CA-group is the ${ax+b}$ group of invertible affine transformations ${x \mapsto ax+b}$ on a field ${F}$. A little less obviously, the special linear group ${SL_2(F_q)}$ over a finite field ${F_q}$ is a CA-group when ${q}$ is a power of two. The finite simple groups of Lie type are not, in general, CA-groups, but when the rank is bounded they tend to behave as if they were “almost CA”; the centraliser of a generic element in ${SL_d(F_q)}$, for instance (when ${d}$ is bounded and ${q}$ is large), is typically a maximal torus (because most elements in ${SL_d(F_q)}$ are regular semisimple), which is certainly abelian. In view of the CFSG, we thus see that CA or nearly CA groups form an important subclass of the simple groups, and it is thus of interest to study them separately. To this end, we have

Theorem 1 (Suzuki’s theorem on CA-groups) Every finite CA-group of odd order is solvable.

Of course, this theorem is superseded by the more general Feit-Thompson theorem, but Suzuki’s proof is substantially shorter (the original proof is nine pages) and will be given in this post. (See this survey of Solomon for some discussion of the link between Suzuki’s argument and the Feit-Thompson argument.) Suzuki’s analysis can be pushed further to give an essentially complete classification of all the finite CA-groups (of either odd or even order), but we will not pursue these matters here.
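The definition of a CA-group can be tested by brute force on small examples. The sketch below checks that the ${ax+b}$ group over ${F_5}$ is CA, and that the symmetric group ${S_4}$ is not (the centraliser of a double transposition in ${S_4}$ is a non-abelian group of order ${8}$). The encoding of affine maps as pairs `(a, b)` and of permutations as tuples, as well as the function names, are assumptions made for this illustration.

```python
from itertools import permutations

def affine_group(p):
    """All maps x -> a*x + b on F_p with a != 0, encoded as pairs (a, b)."""
    return [(a, b) for a in range(1, p) for b in range(p)]

def affine_mul(p):
    """Composition in the ax+b group: (g * h)(x) = g(h(x))."""
    def mul(g, h):
        (a1, b1), (a2, b2) = g, h
        return ((a1 * a2) % p, (a1 * b2 + b1) % p)
    return mul

def perm_mul(g, h):
    """Composition of permutations stored as tuples: (g * h)(i) = g[h[i]]."""
    return tuple(g[i] for i in h)

def is_ca_group(elements, mul, identity):
    """True if the centraliser of every non-identity element is abelian."""
    for x in elements:
        if x == identity:
            continue
        centraliser = [g for g in elements if mul(g, x) == mul(x, g)]
        if any(mul(g, h) != mul(h, g) for g in centraliser for h in centraliser):
            return False
    return True

S3 = list(permutations(range(3)))
S4 = list(permutations(range(4)))

print(is_ca_group(affine_group(5), affine_mul(5), (1, 0)))  # True
print(is_ca_group(S3, perm_mul, (0, 1, 2)))                 # True
print(is_ca_group(S4, perm_mul, (0, 1, 2, 3)))              # False
```

Note that ${S_3}$ is isomorphic to the ${ax+b}$ group over ${F_3}$, so its appearance as a CA-group is consistent with the affine example.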

Moving even further down the ladder of simple precursors of CFSG is the following theorem of Frobenius from 1901. Define a Frobenius group to be a finite group ${G}$ which has a subgroup ${H}$ (called the Frobenius complement) with the property that all the non-trivial conjugates ${gHg^{-1}}$ of ${H}$ for ${g \in G \backslash H}$, intersect ${H}$ only at the identity. For instance the ${ax+b}$ group is also a Frobenius group (take ${H}$ to be the affine transformations that fix a specified point ${x_0 \in F}$, e.g. the origin). This example suggests that there is some overlap between the notions of a Frobenius group and a CA group. Indeed, note that if ${G}$ is a CA-group and ${H}$ is a maximal abelian subgroup of ${G}$, then any conjugate ${gHg^{-1}}$ of ${H}$ that is not identical to ${H}$ will intersect ${H}$ only at the identity (because ${H}$ and each of its conjugates consist of equivalence classes under the commuting relation ${\sim}$, together with the identity). So if a maximal abelian subgroup ${H}$ of a CA-group is its own normaliser (thus ${N(H) := \{ g \in G: gH=Hg\}}$ is equal to ${H}$), then the group is a Frobenius group.

Frobenius’ theorem places an unexpectedly strong amount of structure on a Frobenius group:

Theorem 2 (Frobenius’ theorem) Let ${G}$ be a Frobenius group with Frobenius complement ${H}$. Then there exists a normal subgroup ${K}$ of ${G}$ (called the Frobenius kernel of ${G}$) such that ${G}$ is the semi-direct product ${H \ltimes K}$ of ${H}$ and ${K}$.

Roughly speaking, this theorem indicates that all Frobenius groups “behave” like the ${ax+b}$ example (which is a quintessential example of a semi-direct product).

Note that if every CA-group of odd order was either Frobenius or abelian, then Theorem 2 would imply Theorem 1 by an induction on the order of ${G}$, since any subgroup of a CA-group is clearly again a CA-group. Indeed, the proof of Suzuki’s theorem does basically proceed by this route (Suzuki’s arguments do indeed imply that CA-groups of odd order are Frobenius or abelian, although we will not quite establish that fact here).

Frobenius’ theorem can be reformulated in the following concrete combinatorial form:

Theorem 3 (Frobenius’ theorem, equivalent version) Let ${G}$ be a group of permutations acting transitively on a finite set ${X}$, with the property that any non-identity permutation in ${G}$ fixes at most one point in ${X}$. Then the set of permutations in ${G}$ that fix no points in ${X}$, together with the identity, is closed under composition.

Again, a good example to keep in mind for this theorem is when ${G}$ is the group of affine permutations on a field ${F}$ (i.e. the ${ax+b}$ group for that field), and ${X}$ is the set of points on that field. In that case, the set of permutations in ${G}$ that do not fix any points are the non-trivial translations.

To deduce Theorem 3 from Theorem 2, one applies Theorem 2 to the stabiliser of a single point in ${X}$. Conversely, to deduce Theorem 2 from Theorem 3, set ${X := G/H = \{ gH: g \in G \}}$ to be the space of left-cosets of ${H}$, with the obvious left ${G}$-action; one easily verifies that this action is faithful, transitive, and each non-identity element ${g}$ of ${G}$ fixes at most one left-coset of ${H}$ (basically because it lies in at most one conjugate of ${H}$). If we let ${K}$ be the elements of ${G}$ that do not fix any point in ${X}$, plus the identity, then by Theorem 3 ${K}$ is closed under composition; it is also clearly closed under inverse and conjugation, and is hence a normal subgroup of ${G}$. From construction ${K}$ is the identity plus the complement of all the ${|G|/|H|}$ conjugates of ${H}$, which are all disjoint except at the identity, so by counting elements we see that

$\displaystyle |K| = |G| - \frac{|G|}{|H|}(|H|-1) = |G|/|H|.$

As ${H}$ normalises ${K}$ and meets ${K}$ only at the identity, we thus see that ${KH = H \ltimes K}$ has ${|K| |H| = |G|}$ elements and is thus all of ${G}$, giving Theorem 2.
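The counting argument above can be checked by hand in the model case of the ${ax+b}$ group. The following is a minimal sketch, assuming the field ${F_7}$ and taking ${H}$ to be the stabiliser of the origin; the encoding of affine maps as pairs `(a, b)` and all function names are illustrative assumptions. It verifies the hypothesis of Theorem 3, the Frobenius complement condition, the closure of ${K}$ under composition, and the count ${|K| = |G|/|H|}$.

```python
def affine_group(p):
    """All maps x -> a*x + b on F_p with a != 0, encoded as pairs (a, b)."""
    return [(a, b) for a in range(1, p) for b in range(p)]

def compose(g, h, p):
    """(g * h)(x) = g(h(x)) for affine maps on F_p."""
    (a1, b1), (a2, b2) = g, h
    return ((a1 * a2) % p, (a1 * b2 + b1) % p)

def inverse(g, p):
    """Inverse affine map; a^(p-2) inverts a mod p by Fermat's little theorem."""
    a, b = g
    a_inv = pow(a, p - 2, p)
    return (a_inv, (-a_inv * b) % p)

def fixed_points(g, p):
    a, b = g
    return [x for x in range(p) if (a * x + b) % p == x]

p = 7
identity = (1, 0)
G = affine_group(p)
H = [g for g in G if 0 in fixed_points(g, p)]                  # stabiliser of 0
K = [g for g in G if g == identity or not fixed_points(g, p)]  # candidate kernel

# Hypothesis of Theorem 3: non-identity elements fix at most one point.
at_most_one = all(len(fixed_points(g, p)) <= 1 for g in G if g != identity)

# Frobenius complement condition: gHg^{-1} meets H only in the identity
# whenever g lies outside H.
def conjugate_of_H(g):
    return {compose(compose(g, h, p), inverse(g, p), p) for h in H}

complement_ok = all(conjugate_of_H(g) & set(H) == {identity}
                    for g in G if g not in H)

# Conclusion of Theorem 3: K is closed under composition.
closed = all(compose(g, h, p) in K for g in K for h in K)

# Counting identity |K| = |G| - (|G|/|H|)(|H| - 1) = |G|/|H|.
count_ok = (len(K) == len(G) - (len(G) // len(H)) * (len(H) - 1)
            == len(G) // len(H))

print(len(G), len(H), len(K))  # 42 6 7
print(at_most_one, complement_ok, closed, count_ok)
```

Here ${K}$ turns out to be the group of translations ${x \mapsto x + b}$, in agreement with the description of the fixed-point-free permutations given after Theorem 3.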

Despite the appealingly concrete and elementary form of Theorem 3, the only known proofs of that theorem (or equivalently, Theorem 2) in its full generality proceed via the machinery of group characters (which one can think of as a version of Fourier analysis for nonabelian groups). On the other hand, once one establishes the basic theory of these characters (reviewed below the fold), the proof of Frobenius’ theorem is very short, which gives quite a striking example of the power of character theory. The proof of Suzuki’s theorem also proceeds via character theory, and is basically a more involved version of the Frobenius argument; again, no character-free proof of Suzuki’s theorem is currently known. (The proofs of Feit-Thompson and CFSG also involve characters, but those proofs also contain many other arguments of much greater complexity than the character-based portions of the proof.)

It seems to me that the above four theorems (Frobenius, Suzuki, Feit-Thompson, and CFSG) provide a ladder of sorts (with exponentially increasing complexity at each step) to the full classification, and that any new approach to the classification might first begin by revisiting the earlier theorems on this ladder and finding new proofs of these results first (in particular, if one had a “robust” proof of Suzuki’s theorem that also gave non-trivial control on “almost CA-groups” – whatever that means – then this might lead to a new route to classifying the finite simple groups of Lie type and bounded rank). But even for the simplest two results on this ladder – Frobenius and Suzuki – it seems remarkably difficult to find any proof that is not essentially the character-based proof. (Even trying to replace character theory by its close cousin, representation theory, doesn’t seem to work unless one gives in to the temptation to take traces everywhere and put the characters back in; it seems that rather than abandon characters altogether, one needs to find some sort of “robust” generalisation of existing character-based methods.) In any case, I am recording here the standard character-based proofs of the theorems of Frobenius and Suzuki below the fold. There is nothing particularly novel here, but I wanted to collect all the relevant material in one place, largely for my own benefit.