An abstract finite-dimensional complex Lie algebra, or Lie algebra for short, is a finite-dimensional complex vector space ${{\mathfrak g}}$ together with an anti-symmetric bilinear form ${[,] = [,]_{\mathfrak g}: {\mathfrak g} \times {\mathfrak g} \rightarrow {\mathfrak g}}$ that obeys the Jacobi identity

$\displaystyle [[x,y],z] + [[y,z],x] + [[z,x],y] = 0 \ \ \ \ \ (1)$

for all ${x,y,z \in {\mathfrak g}}$; by anti-symmetry one can also rewrite the Jacobi identity as

$\displaystyle [x,[y,z]] = [[x,y],z] + [y,[x,z]]. \ \ \ \ \ (2)$

We will usually omit the subscript from the Lie bracket ${[,]_{\mathfrak g}}$ when this will not cause ambiguity. A homomorphism ${\phi: {\mathfrak g} \rightarrow {\mathfrak h}}$ between two Lie algebras ${{\mathfrak g},{\mathfrak h}}$ is a linear map that respects the Lie bracket, thus ${\phi([x,y]_{\mathfrak g}) =[\phi(x),\phi(y)]_{\mathfrak h}}$ for all ${x,y \in {\mathfrak g}}$. As with many other classes of mathematical objects, the class of Lie algebras together with their homomorphisms then form a category. One can of course also consider Lie algebras in infinite dimension or over other fields, but we will restrict attention throughout these notes to the finite-dimensional complex case. The trivial, zero-dimensional Lie algebra is denoted ${0}$; Lie algebras of positive dimension will be called non-trivial.

Lie algebras come up in many contexts in mathematics, in particular arising as the tangent space of complex Lie groups. It is thus very profitable to think of Lie algebras as being the infinitesimal component of a Lie group, and in particular almost all of the notation and concepts that are applicable to Lie groups (e.g. nilpotence, solvability, extensions, etc.) have infinitesimal counterparts in the category of Lie algebras (often with exactly the same terminology). See this previous blog post for more discussion about the connection between Lie algebras and Lie groups (that post was focused over the reals instead of the complexes, but much of the discussion carries over to the complex case).

A particular example of a Lie algebra is the general linear Lie algebra ${{\mathfrak{gl}}(V)}$ of linear transformations ${x: V \rightarrow V}$ on a finite-dimensional complex vector space (or vector space for short) ${V}$, with the commutator Lie bracket ${[x,y] := xy-yx}$; one easily verifies that this is indeed an abstract Lie algebra. We will define a concrete Lie algebra to be a Lie algebra that is a subalgebra of ${{\mathfrak{gl}}(V)}$ for some vector space ${V}$, and similarly define a representation of a Lie algebra ${{\mathfrak g}}$ to be a homomorphism ${\rho: {\mathfrak g} \rightarrow {\mathfrak h}}$ into a concrete Lie algebra ${{\mathfrak h}}$. It is a deep theorem of Ado (discussed in this previous post) that every abstract Lie algebra is in fact isomorphic to a concrete one (or equivalently, that every abstract Lie algebra has a faithful representation), but we will not need or prove this fact here.
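The claim that the commutator bracket on ${{\mathfrak{gl}}(V)}$ obeys the Lie algebra axioms can be spot-checked numerically. The following is only an illustrative sketch: it takes ${V = {\bf C}^3}$, uses random integer matrices (so the arithmetic is exact), and the helper names `matmul`, `add`, `sub`, `bracket` are ad hoc choices for this example rather than anything from the text.

```python
import random

def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(a, b):
    return [[p + q for p, q in zip(r, s)] for r, s in zip(a, b)]

def sub(a, b):
    return [[p - q for p, q in zip(r, s)] for r, s in zip(a, b)]

def bracket(x, y):
    """Commutator bracket [x, y] = xy - yx on gl(V)."""
    return sub(matmul(x, y), matmul(y, x))

def random_matrix(n, rng):
    return [[rng.randint(-5, 5) for _ in range(n)] for _ in range(n)]

rng = random.Random(0)  # fixed seed so the check is reproducible
n = 3
x, y, z = (random_matrix(n, rng) for _ in range(3))
zero = [[0] * n for _ in range(n)]

# Anti-symmetry: [x, y] + [y, x] = 0.
antisymmetric = add(bracket(x, y), bracket(y, x)) == zero

# Jacobi identity (1): [[x,y],z] + [[y,z],x] + [[z,x],y] = 0.
jacobi_sum = add(add(bracket(bracket(x, y), z), bracket(bracket(y, z), x)),
                 bracket(bracket(z, x), y))
jacobi_holds = jacobi_sum == zero

# Equivalent form (2): [x,[y,z]] = [[x,y],z] + [y,[x,z]].
derivation_form_holds = bracket(x, bracket(y, z)) == add(
    bracket(bracket(x, y), z), bracket(y, bracket(x, z)))
```

A random test of this kind is of course no substitute for the one-line algebraic verification, but it is a convenient sanity check when experimenting with other candidate brackets.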

Even without Ado’s theorem, though, the structure of abstract Lie algebras is very well understood. As with objects in many other algebraic categories, a basic way to understand a Lie algebra ${{\mathfrak g}}$ is to factor it into two simpler algebras ${{\mathfrak h}, {\mathfrak k}}$ via a short exact sequence

$\displaystyle 0 \rightarrow {\mathfrak h} \rightarrow {\mathfrak g} \rightarrow {\mathfrak k} \rightarrow 0, \ \ \ \ \ (3)$

thus one has an injective homomorphism from ${{\mathfrak h}}$ to ${{\mathfrak g}}$ and a surjective homomorphism from ${{\mathfrak g}}$ to ${{\mathfrak k}}$ such that the image of the former homomorphism is the kernel of the latter. (To be pedantic, a short exact sequence in a general category requires these homomorphisms to be monomorphisms and epimorphisms respectively, but in the category of Lie algebras these turn out to reduce to the more familiar concepts of injectivity and surjectivity respectively.) Given such a sequence, one can (non-uniquely) identify ${{\mathfrak g}}$ with the vector space ${{\mathfrak h} \times {\mathfrak k}}$ equipped with a Lie bracket of the form

$\displaystyle [(t,x), (s,y)]_{\mathfrak g} = ([t,s]_{\mathfrak h} + A(t,y) - A(s,x) + B(x,y), [x,y]_{\mathfrak k}) \ \ \ \ \ (4)$

for some bilinear maps ${A: {\mathfrak h} \times {\mathfrak k} \rightarrow {\mathfrak h}}$ and ${B: {\mathfrak k} \times {\mathfrak k} \rightarrow {\mathfrak h}}$ that obey some Jacobi-type identities which we will not record here. Understanding exactly what maps ${A,B}$ are possible here (up to coordinate change) can be a difficult task (and is one of the key objectives of Lie algebra cohomology), but in principle at least, the problem of understanding ${{\mathfrak g}}$ can be reduced to that of understanding its factors ${{\mathfrak k}, {\mathfrak h}}$. To emphasise this, I will (perhaps idiosyncratically) express the existence of a short exact sequence (3) by the ATLAS-type notation

$\displaystyle {\mathfrak g} = {\mathfrak h} . {\mathfrak k} \ \ \ \ \ (5)$

although one should caution that for given ${{\mathfrak h}}$ and ${{\mathfrak k}}$, there can be multiple non-isomorphic ${{\mathfrak g}}$ that can form a short exact sequence with ${{\mathfrak h},{\mathfrak k}}$, so that ${{\mathfrak h} . {\mathfrak k}}$ is not a uniquely defined combination of ${{\mathfrak h}}$ and ${{\mathfrak k}}$; one could emphasise this by writing ${{\mathfrak h} ._{A,B} {\mathfrak k}}$ instead of ${{\mathfrak h} . {\mathfrak k}}$, though we will not do so here. We will refer to ${{\mathfrak g}}$ as an extension of ${{\mathfrak k}}$ by ${{\mathfrak h}}$, and read the notation (5) as “${{\mathfrak g}}$ is ${{\mathfrak h}}$-by-${{\mathfrak k}}$”; confusingly, these two notations reverse the subject and object of “by”, but unfortunately both notations are well entrenched in the literature. We caution that the operation ${.}$ is not commutative, and it is only partly associative: every Lie algebra of the form ${{\mathfrak k} . ({\mathfrak h} . {\mathfrak l})}$ is also of the form ${({\mathfrak k} . {\mathfrak h}) . {\mathfrak l}}$, but the converse is not true (see this previous blog post for some related discussion). As we are working in the infinitesimal world of Lie algebras (which have an additive group operation) rather than Lie groups (in which the group operation is usually written multiplicatively), it may help to think of ${{\mathfrak h} . {\mathfrak k}}$ as a (twisted) “sum” of ${{\mathfrak h}}$ and ${{\mathfrak k}}$ rather than a “product”; for instance, we have ${{\mathfrak g} = 0 . {\mathfrak g}}$ and ${{\mathfrak g} = {\mathfrak g} . 0}$, and also ${\dim {\mathfrak h} . {\mathfrak k} = \dim {\mathfrak h} + \dim {\mathfrak k}}$.

Special examples of extensions ${{\mathfrak h} .{\mathfrak k}}$ of ${{\mathfrak k}}$ by ${{\mathfrak h}}$ include the direct sum (or direct product) ${{\mathfrak h} \oplus {\mathfrak k}}$ (also denoted ${{\mathfrak h} \times {\mathfrak k}}$), which is given by the construction (4) with ${A}$ and ${B}$ both vanishing, and the split extension (or semidirect product) ${{\mathfrak h} : {\mathfrak k} = {\mathfrak h} :_\rho {\mathfrak k}}$ (also denoted ${{\mathfrak h} \ltimes {\mathfrak k} = {\mathfrak h} \ltimes_\rho {\mathfrak k}}$), which is given by the construction (4) with ${B}$ vanishing and the bilinear map ${A: {\mathfrak h} \times {\mathfrak k} \rightarrow {\mathfrak h}}$ taking the form

$\displaystyle A( t, x ) = \rho(x)(t)$

for some representation ${\rho: {\mathfrak k} \rightarrow \hbox{Der} {\mathfrak h}}$ of ${{\mathfrak k}}$ in the concrete Lie algebra of derivations ${\hbox{Der} {\mathfrak h} \subset {\mathfrak{gl}}({\mathfrak h})}$ of ${{\mathfrak h}}$, that is to say the algebra of linear maps ${D: {\mathfrak h} \rightarrow {\mathfrak h}}$ that obey the Leibniz rule

$\displaystyle D[s,t]_{\mathfrak h} = [Ds,t]_{\mathfrak h} + [s,Dt]_{\mathfrak h}$

for all ${s,t \in {\mathfrak h}}$. (The derivation algebra ${\hbox{Der} {\mathfrak g}}$ of a Lie algebra ${{\mathfrak g}}$ is analogous to the automorphism group ${\hbox{Aut}(G)}$ of a Lie group ${G}$, with the two concepts being intertwined by the tangent space functor ${G \mapsto {\mathfrak g}}$ from Lie groups to Lie algebras (i.e. the derivation algebra is the infinitesimal version of the automorphism group). Of course, this functor also intertwines the Lie algebra and Lie group versions of most of the other concepts discussed here, such as extensions, semidirect products, etc.)
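To make the split extension construction concrete, here is a small sketch of the bracket (4) with ${B=0}$ in a deliberately simple hypothetical case: ${{\mathfrak h} = {\bf C}^2}$ and ${{\mathfrak k} = {\bf C}}$ are both abelian, and ${\rho(x) := xD}$ for a fixed ${2 \times 2}$ matrix ${D}$ (every linear map is a derivation of an abelian algebra, so any ${D}$ is admissible). The matrix `D`, the tuple encoding `(t, x)` of elements, and the function names are assumptions made purely for illustration.

```python
import random

# Hypothetical choice of derivation; h = C^2 is abelian, so any matrix works.
D = [[1, 1], [0, 2]]

def apply(m, v):
    """Apply the matrix m to the vector v."""
    return [sum(m[i][j] * v[j] for j in range(len(v))) for i in range(len(m))]

def A(t, x):
    """A(t, x) = rho(x)(t), where rho(x) = x * D."""
    return [x * c for c in apply(D, t)]

def bracket(a, b):
    """Bracket (4) with B = 0 and abelian h, k:
    [(t, x), (s, y)] = (A(t, y) - A(s, x), 0)."""
    (t, x), (s, y) = a, b
    return ([p - q for p, q in zip(A(t, y), A(s, x))], 0)

def add(a, b):
    return ([p + q for p, q in zip(a[0], b[0])], a[1] + b[1])

def random_element(rng):
    return ([rng.randint(-4, 4), rng.randint(-4, 4)], rng.randint(-4, 4))

rng = random.Random(1)  # fixed seed for reproducibility
a, b, c = (random_element(rng) for _ in range(3))
zero = ([0, 0], 0)

antisymmetric = add(bracket(a, b), bracket(b, a)) == zero
jacobi_sum = add(add(bracket(bracket(a, b), c), bracket(bracket(b, c), a)),
                 bracket(bracket(c, a), b))
jacobi_holds = jacobi_sum == zero

# Although both factors are abelian, the extension is not (as D is non-zero).
nonabelian = bracket(([1, 0], 0), ([0, 0], 1)) != zero
```

Even in this tiny case one sees the phenomenon noted above: different choices of ${D}$ can give non-isomorphic three-dimensional algebras with the same two abelian factors.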

There are two general ways to factor a Lie algebra ${{\mathfrak g}}$ as an extension ${{\mathfrak h} . {\mathfrak k}}$ of a smaller Lie algebra ${{\mathfrak k}}$ by another smaller Lie algebra ${{\mathfrak h}}$. One is to locate a Lie algebra ideal (or ideal for short) ${{\mathfrak h}}$ in ${{\mathfrak g}}$, thus ${[{\mathfrak h},{\mathfrak g}] \subset {\mathfrak h}}$, where ${[{\mathfrak h},{\mathfrak g}]}$ denotes the Lie algebra generated by ${\{ [x,y]: x \in {\mathfrak h}, y \in {\mathfrak g} \}}$, and then take ${{\mathfrak k}}$ to be the quotient space ${{\mathfrak g}/{\mathfrak h}}$ in the usual manner; one can check that ${{\mathfrak h}}$, ${{\mathfrak k}}$ are also Lie algebras and that we do indeed have a short exact sequence

$\displaystyle {\mathfrak g} = {\mathfrak h} . ({\mathfrak g}/{\mathfrak h}).$

Conversely, whenever one has a factorisation ${{\mathfrak g} = {\mathfrak h} . {\mathfrak k}}$, one can identify ${{\mathfrak h}}$ with an ideal in ${{\mathfrak g}}$, and ${{\mathfrak k}}$ with the quotient of ${{\mathfrak g}}$ by ${{\mathfrak h}}$.

The other general way to obtain such a factorisation is to start with a homomorphism ${\rho: {\mathfrak g} \rightarrow {\mathfrak m}}$ of ${{\mathfrak g}}$ into another Lie algebra ${{\mathfrak m}}$, take ${{\mathfrak k}}$ to be the image ${\rho({\mathfrak g})}$ of ${{\mathfrak g}}$, and ${{\mathfrak h}}$ to be the kernel ${\hbox{ker} \rho := \{ x \in {\mathfrak g}: \rho(x) = 0 \}}$. Again, it is easy to see that this does indeed create a short exact sequence:

$\displaystyle {\mathfrak g} = \hbox{ker} \rho . \rho({\mathfrak g}).$

Conversely, whenever one has a factorisation ${{\mathfrak g} = {\mathfrak h} . {\mathfrak k}}$, one can identify ${{\mathfrak k}}$ with the image of ${{\mathfrak g}}$ under some homomorphism, and ${{\mathfrak h}}$ with the kernel of that homomorphism. Note that if a representation ${\rho: {\mathfrak g} \rightarrow {\mathfrak m}}$ is faithful (i.e. injective), then the kernel is trivial and ${{\mathfrak g}}$ is isomorphic to ${\rho({\mathfrak g})}$.

Now we consider some examples of factoring some class of Lie algebras into simpler Lie algebras. The easiest examples of Lie algebras to understand are the abelian Lie algebras ${{\mathfrak g}}$, in which the Lie bracket identically vanishes. Every one-dimensional Lie algebra is automatically abelian, and thus isomorphic to the scalar algebra ${{\bf C}}$. Conversely, by using an arbitrary linear basis of ${{\mathfrak g}}$, we see that an abelian Lie algebra is isomorphic to the direct sum of one-dimensional algebras. Thus, a Lie algebra is abelian if and only if it is isomorphic to the direct sum of finitely many copies of ${{\bf C}}$.

Now consider a Lie algebra ${{\mathfrak g}}$ that is not necessarily abelian. We then form the derived algebra ${[{\mathfrak g},{\mathfrak g}]}$; this algebra is trivial if and only if ${{\mathfrak g}}$ is abelian. It is easy to see that ${[{\mathfrak h},{\mathfrak k}]}$ is an ideal whenever ${{\mathfrak h},{\mathfrak k}}$ are ideals, so in particular the derived algebra ${[{\mathfrak g},{\mathfrak g}]}$ is an ideal and we thus have the short exact sequence

$\displaystyle {\mathfrak g} = [{\mathfrak g},{\mathfrak g}] . ({\mathfrak g}/[{\mathfrak g},{\mathfrak g}]).$

The algebra ${{\mathfrak g}/[{\mathfrak g},{\mathfrak g}]}$ is the maximal abelian quotient of ${{\mathfrak g}}$, and is known as the abelianisation of ${{\mathfrak g}}$. If it is trivial, we call the Lie algebra perfect. If instead it is non-trivial, then the derived algebra has strictly smaller dimension than ${{\mathfrak g}}$. From this, it is natural to associate two series to any Lie algebra ${{\mathfrak g}}$, the lower central series

$\displaystyle {\mathfrak g}_1 = {\mathfrak g}; {\mathfrak g}_2 := [{\mathfrak g}, {\mathfrak g}_1]; {\mathfrak g}_3 := [{\mathfrak g}, {\mathfrak g}_2]; \ldots$

and the derived series

$\displaystyle {\mathfrak g}^{(1)} := {\mathfrak g}; {\mathfrak g}^{(2)} := [{\mathfrak g}^{(1)}, {\mathfrak g}^{(1)}]; {\mathfrak g}^{(3)} := [{\mathfrak g}^{(2)}, {\mathfrak g}^{(2)}]; \ldots.$

By induction we see that these are both decreasing series of ideals of ${{\mathfrak g}}$, with the derived series being slightly smaller (${{\mathfrak g}^{(k)} \subseteq {\mathfrak g}_k}$ for all ${k}$). We say that a Lie algebra is nilpotent if its lower central series is eventually trivial, and solvable if its derived series eventually becomes trivial. Thus, abelian Lie algebras are nilpotent, and nilpotent Lie algebras are solvable, but the converses are not necessarily true. For instance, in the general linear Lie algebra ${{\mathfrak{gl}}_n = {\mathfrak{gl}}({\bf C}^n)}$, which can be identified with the Lie algebra of ${n \times n}$ complex matrices, the subalgebra ${{\mathfrak n}}$ of strictly upper triangular matrices is nilpotent (but not abelian for ${n \geq 3}$), while the subalgebra ${{\mathfrak b}}$ of upper triangular matrices is solvable (but not nilpotent for ${n \geq 2}$). It is also clear that any subalgebra of a nilpotent algebra is nilpotent, and similarly for solvable or abelian algebras.
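The two triangular examples can be explored by computing the dimensions of the terms of both series directly. The sketch below does this for ${n=3}$ using exact rational arithmetic; the helper names (`independent_subset`, `bracket_span`, `series_dims`) and the cut-off of five terms are arbitrary choices for this illustration.

```python
from fractions import Fraction

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def bracket(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    return [[p - q for p, q in zip(r, s)] for r, s in zip(ab, ba)]

def independent_subset(mats):
    """Pick a linearly independent subset of mats with the same span,
    by Gaussian elimination on the flattened matrices."""
    reduced, basis = [], []  # reduced holds (pivot index, reduced vector)
    for m in mats:
        v = [Fraction(e) for row in m for e in row]
        for p, r in reduced:
            if v[p] != 0:
                factor = v[p] / r[p]
                v = [a - factor * b for a, b in zip(v, r)]
        pivot = next((i for i, e in enumerate(v) if e != 0), None)
        if pivot is not None:
            reduced.append((pivot, v))
            basis.append(m)
    return basis

def bracket_span(A, B):
    """Basis of the span of all brackets [a, b] with a in A, b in B."""
    return independent_subset([bracket(a, b) for a in A for b in B])

def unit(n, i, j):
    m = [[0] * n for _ in range(n)]
    m[i][j] = 1
    return m

def series_dims(g, lower_central, steps=5):
    """Dimensions of the first few terms of the lower central series
    (lower_central=True) or of the derived series (lower_central=False)."""
    dims, current = [], g
    for _ in range(steps):
        dims.append(len(current))
        current = bracket_span(g, current) if lower_central else bracket_span(current, current)
    return dims

n = 3
strictly_upper = [unit(n, i, j) for i in range(n) for j in range(n) if i < j]
upper = [unit(n, i, j) for i in range(n) for j in range(n) if i <= j]

print(series_dims(strictly_upper, lower_central=True))   # [3, 1, 0, 0, 0]
print(series_dims(upper, lower_central=True))            # [6, 3, 3, 3, 3]
print(series_dims(upper, lower_central=False))           # [6, 3, 1, 0, 0]
```

For the upper triangular matrices the lower central series stabilises at the strictly upper triangular matrices and never reaches zero, while the derived series does terminate, in agreement with the claim that this algebra is solvable but not nilpotent.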

From the above discussion we see that a Lie algebra is solvable if and only if it can be represented by a tower of abelian extensions, thus

$\displaystyle {\mathfrak g} = {\mathfrak a}_1 . ({\mathfrak a}_2 . \ldots ({\mathfrak a}_{k-1} . {\mathfrak a}_k) \ldots )$

for some abelian ${{\mathfrak a}_1,\ldots,{\mathfrak a}_k}$. Similarly, a Lie algebra ${{\mathfrak g}}$ is nilpotent if it is expressible as a tower of central extensions (so that in all the extensions ${{\mathfrak h} . {\mathfrak k}}$ in the above factorisation, ${{\mathfrak h}}$ is central in ${{\mathfrak h} . {\mathfrak k}}$, where we say that ${{\mathfrak h}}$ is central in ${{\mathfrak g}}$ if ${[{\mathfrak h},{\mathfrak g}]=0}$). We also see that an extension ${{\mathfrak h} . {\mathfrak k}}$ is solvable if and only if both factors ${{\mathfrak h}, {\mathfrak k}}$ are solvable. Splitting abelian algebras into cyclic (i.e. one-dimensional) ones, we thus see that a finite-dimensional Lie algebra is solvable if and only if it is polycyclic, i.e. it can be represented by a tower of cyclic extensions.

For our next fundamental example of using short exact sequences to split a general Lie algebra into simpler objects, we observe that every abstract Lie algebra ${{\mathfrak g}}$ has an adjoint representation ${\hbox{ad}: {\mathfrak g} \rightarrow \hbox{ad} {\mathfrak g} \subset {\mathfrak{gl}}({\mathfrak g})}$, where for each ${x \in {\mathfrak g}}$, ${\hbox{ad} x \in {\mathfrak{gl}}({\mathfrak g})}$ is the linear map ${(\hbox{ad} x)(y) := [x,y]}$; one easily verifies that this is indeed a representation (indeed, (2) is equivalent to the assertion that ${\hbox{ad} [x,y] = [\hbox{ad} x, \hbox{ad} y]}$ for all ${x,y \in {\mathfrak g}}$). The kernel of this representation is the center ${Z({\mathfrak g}) := \{ x \in {\mathfrak g}: [x,{\mathfrak g}] = 0\}}$, which is the maximal central subalgebra of ${{\mathfrak g}}$. We thus have the short exact sequence

$\displaystyle {\mathfrak g} = Z({\mathfrak g}) . \hbox{ad} {\mathfrak g} \ \ \ \ \ (6)$

which, among other things, shows that every abstract Lie algebra is a central extension of a concrete Lie algebra (which can serve as a cheap substitute for Ado’s theorem mentioned earlier).
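As an illustration of the adjoint representation, one can write down the matrices of ${\hbox{ad} x}$ explicitly for the algebra of traceless ${2 \times 2}$ matrices and confirm the identity ${\hbox{ad} [x,y] = [\hbox{ad} x, \hbox{ad} y]}$ on a basis. This is a sketch only; the basis ordering ${(e,f,h)}$ and the helper names `coords` and `ad` are conventions chosen for this example.

```python
def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def bracket(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    return [[p - q for p, q in zip(r, s)] for r, s in zip(ab, ba)]

# A basis of the traceless 2x2 matrices.
e = [[0, 1], [0, 0]]
f = [[0, 0], [1, 0]]
h = [[1, 0], [0, -1]]
basis = [e, f, h]

def coords(m):
    """Coordinates of a traceless matrix [[a, b], [c, -a]] in the basis (e, f, h)."""
    return [m[0][1], m[1][0], m[0][0]]

def ad(x):
    """Matrix of ad x in the basis (e, f, h): column j holds the
    coordinates of [x, basis[j]]."""
    cols = [coords(bracket(x, b)) for b in basis]
    return [[cols[j][i] for j in range(3)] for i in range(3)]

# ad is a homomorphism: ad [x, y] = [ad x, ad y] for all basis elements.
is_representation = all(ad(bracket(x, y)) == bracket(ad(x), ad(y))
                        for x in basis for y in basis)

print(ad(h))  # [[2, 0, 0], [0, -2, 0], [0, 0, 0]]
```

Here ${\hbox{ad} h}$ is diagonal with entries ${2,-2,0}$, reflecting the relations ${[h,e]=2e}$, ${[h,f]=-2f}$, ${[h,h]=0}$.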

For our next fundamental decomposition of Lie algebras, we need some more definitions. A Lie algebra ${{\mathfrak g}}$ is simple if it is non-abelian and has no ideals other than ${0}$ and ${{\mathfrak g}}$; thus simple Lie algebras cannot be factored ${{\mathfrak g} = {\mathfrak h} . {\mathfrak k}}$ into strictly smaller algebras ${{\mathfrak h},{\mathfrak k}}$. In particular, simple Lie algebras are automatically perfect and centerless. We have the following fundamental theorem:

Theorem 1 (Equivalent definitions of semisimplicity) Let ${{\mathfrak g}}$ be a Lie algebra. Then the following are equivalent:

• (i) ${{\mathfrak g}}$ does not contain any non-trivial solvable ideal.
• (ii) ${{\mathfrak g}}$ does not contain any non-trivial abelian ideal.
• (iii) The Killing form ${K: {\mathfrak g} \times {\mathfrak g} \rightarrow {\bf C}}$, defined as the bilinear form ${K(x,y) := \hbox{tr}_{\mathfrak g}( (\hbox{ad} x) (\hbox{ad} y) )}$, is non-degenerate on ${{\mathfrak g}}$.
• (iv) ${{\mathfrak g}}$ is isomorphic to the direct sum of finitely many non-abelian simple Lie algebras.

We review the proof of this theorem later in these notes. A Lie algebra obeying any (and hence all) of the properties (i)-(iv) is known as a semisimple Lie algebra. The statement (iv) is usually taken as the definition of semisimplicity; the equivalence of (iv) and (i) is a special case of Weyl’s complete reducibility theorem (see Theorem 32), and the equivalence of (iv) and (iii) is known as the Cartan semisimplicity criterion. (The equivalence of (i) and (ii) is easy.)
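Criterion (iii) is easy to test by hand or by machine in small cases. The following sketch computes the Killing form of the algebra of traceless ${2 \times 2}$ matrices in the basis ${(e,f,h)}$ and checks that its Gram matrix has non-zero determinant; the basis, the helper names, and the use of a hand-written ${3 \times 3}$ determinant are all choices made for this illustration.

```python
def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def bracket(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    return [[p - q for p, q in zip(r, s)] for r, s in zip(ab, ba)]

e = [[0, 1], [0, 0]]
f = [[0, 0], [1, 0]]
h = [[1, 0], [0, -1]]
basis = [e, f, h]

def coords(m):
    """Coordinates of a traceless matrix [[a, b], [c, -a]] in the basis (e, f, h)."""
    return [m[0][1], m[1][0], m[0][0]]

def ad(x):
    """Matrix of ad x in the basis (e, f, h)."""
    cols = [coords(bracket(x, b)) for b in basis]
    return [[cols[j][i] for j in range(3)] for i in range(3)]

def killing(x, y):
    """K(x, y) = tr((ad x)(ad y))."""
    prod = matmul(ad(x), ad(y))
    return sum(prod[i][i] for i in range(3))

def det3(m):
    (a, b, c), (d, e_, f_), (g, h_, i) = m
    return a * (e_ * i - f_ * h_) - b * (d * i - f_ * g) + c * (d * h_ - e_ * g)

gram = [[killing(x, y) for y in basis] for x in basis]
print(gram)        # [[0, 4, 0], [4, 0, 0], [0, 0, 8]]
print(det3(gram))  # -128
```

The non-vanishing determinant says that the Killing form is non-degenerate for this algebra, consistent with its being simple.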

If ${{\mathfrak h}}$ and ${{\mathfrak k}}$ are solvable ideals of a Lie algebra ${{\mathfrak g}}$, then it is not difficult to see that the vector sum ${{\mathfrak h}+{\mathfrak k}}$ is also a solvable ideal (because on quotienting by ${{\mathfrak h}}$ we see that the derived series of ${{\mathfrak h}+{\mathfrak k}}$ must eventually fall inside ${{\mathfrak h}}$, and thence must eventually become trivial by the solvability of ${{\mathfrak h}}$). As our Lie algebras are finite dimensional, we conclude that ${{\mathfrak g}}$ has a unique maximal solvable ideal, known as the radical ${\hbox{rad} {\mathfrak g}}$ of ${{\mathfrak g}}$. The quotient ${{\mathfrak g}/\hbox{rad} {\mathfrak g}}$ is then a Lie algebra with trivial radical, and is thus semisimple by the above theorem, giving the Levi decomposition

$\displaystyle {\mathfrak g} = \hbox{rad} {\mathfrak g} . ({\mathfrak g} / \hbox{rad} {\mathfrak g})$

expressing an arbitrary Lie algebra as an extension of a semisimple Lie algebra ${{\mathfrak g}/\hbox{rad}{\mathfrak g}}$ by a solvable algebra ${\hbox{rad} {\mathfrak g}}$ (and it is not hard to see that this is the only possible such extension up to isomorphism). Indeed, a deep theorem of Levi allows one to upgrade this decomposition to a split extension

$\displaystyle {\mathfrak g} = \hbox{rad} {\mathfrak g} : ({\mathfrak g} / \hbox{rad} {\mathfrak g})$

although we will not need or prove this result here.

In view of the above decompositions, we see that we can factor any Lie algebra (using a suitable combination of direct sums and extensions) into a finite number of simple Lie algebras and the scalar algebra ${{\bf C}}$. In principle, this means that one can understand an arbitrary Lie algebra once one understands all the simple Lie algebras (which, being defined over ${{\bf C}}$, are somewhat confusingly referred to as simple complex Lie algebras in the literature). Amazingly, this latter class of algebras is completely classified:

Theorem 2 (Classification of simple Lie algebras) Up to isomorphism, every simple Lie algebra is of one of the following forms:

• ${A_n = \mathfrak{sl}_{n+1}}$ for some ${n \geq 1}$.
• ${B_n = \mathfrak{so}_{2n+1}}$ for some ${n \geq 2}$.
• ${C_n = \mathfrak{sp}_{2n}}$ for some ${n \geq 3}$.
• ${D_n = \mathfrak{so}_{2n}}$ for some ${n \geq 4}$.
• ${E_6, E_7}$, or ${E_8}$.
• ${F_4}$.
• ${G_2}$.

(The precise definition of the classical Lie algebras ${A_n,B_n,C_n,D_n}$ and the exceptional Lie algebras ${E_6,E_7,E_8,F_4,G_2}$ will be recalled later.)

(One can extend the families ${A_n,B_n,C_n,D_n}$ of classical Lie algebras a little bit to smaller values of ${n}$, but the resulting algebras are either isomorphic to other algebras on this list, or cease to be simple; see this previous post for further discussion.)
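A quick dimension count gives a consistency check on the remark about small values of ${n}$. The sketch below uses the standard dimension formulas for the classical algebras and the standard dimensions of the exceptional ones; equal dimensions do not by themselves prove an isomorphism, so this should be read only as a sanity check. The function names are ad hoc.

```python
def dim_A(n):
    return n * (n + 2)       # sl_{n+1}: (n+1)^2 - 1

def dim_B(n):
    return n * (2 * n + 1)   # so_{2n+1}: (2n+1)(2n)/2

def dim_C(n):
    return n * (2 * n + 1)   # sp_{2n}

def dim_D(n):
    return n * (2 * n - 1)   # so_{2n}: (2n)(2n-1)/2

EXCEPTIONAL_DIMS = {"E6": 78, "E7": 133, "E8": 248, "F4": 52, "G2": 14}

# Dimension coincidences below the ranges listed in Theorem 2.
print(dim_A(1), dim_B(1), dim_C(1))  # 3 3 3
print(dim_B(2), dim_C(2))            # 10 10
print(dim_A(3), dim_D(3))            # 15 15
print(dim_D(2), 2 * dim_A(1))        # 6 6
print(dim_D(1))                      # 1
```

The one-dimensional case ${D_1}$ is necessarily abelian and hence not simple, while the other coincidences are at least compatible with the excluded algebras being isomorphic to algebras (or direct sums of algebras) already on the list.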

This classification is a basic starting point for the classification of many other related objects, including Lie algebras and Lie groups over more general fields (e.g. the reals ${{\bf R}}$), as well as finite simple groups. Being so fundamental to the subject, this classification is covered in almost every basic textbook in Lie algebras, and I myself learned it many years ago in an honours undergraduate course back in Australia. The proof is rather lengthy, though, and I have always had difficulty keeping it straight in my head. So I have decided to write some notes on the classification in this blog post, aiming to be self-contained (though moving rapidly). There is no new material in this post, though; it is all drawn from standard reference texts (I relied particularly on Fulton and Harris’s text, which I highly recommend). In fact it seems remarkably hard to deviate from the standard routes given in the literature to the classification; I would be interested in knowing about other ways to reach the classification (or substeps in that classification) that are genuinely different from the orthodox route.

— 1. Abelian representations —

One of the key strategies in the classification of a Lie algebra ${{\mathfrak g}}$ is to work with representations of ${{\mathfrak g}}$, particularly the adjoint representation ${\hbox{ad}: {\mathfrak g} \rightarrow \hbox{ad} {\mathfrak g}}$, and then restrict such representations to various simpler subalgebras ${{\mathfrak h}}$ of ${{\mathfrak g}}$, for which the representation theory is well understood. In particular, one aims to exploit the representation theory of abelian algebras (and to a lesser extent, nilpotent and solvable algebras), as well as the fundamental example of the special linear Lie algebra ${\mathfrak{sl}_2}$ of traceless ${2 \times 2}$ matrices, which is the smallest and easiest to understand of the simple Lie algebras, and plays an absolutely crucial role in exploring and then classifying all the other simple Lie algebras.

We begin this program by recording the representation theory of abelian Lie algebras. We begin with representations ${\rho: {\bf C} \rightarrow {\mathfrak{gl}}(V)}$ of the one-dimensional algebra ${{\bf C}}$. Setting ${x := \rho(1)}$, this is essentially the representation theory of a single linear transformation ${x: V \rightarrow V}$. Here, the theory is given by the Jordan decomposition. Firstly, for each complex number ${\lambda\in {\bf C}}$, we can define the generalised eigenspace

$\displaystyle V_\lambda^x := \{ v \in V: (x-\lambda)^n v = 0 \hbox{ for some } n \}.$

One easily verifies that the ${V_\lambda^x}$ are all linearly independent ${x}$-invariant subspaces of ${V}$, and in particular that there are only finitely many ${\lambda}$ (the spectrum ${\sigma(x)}$ of ${x}$) for which ${V_\lambda^x}$ is non-trivial. If one quotients out all the generalised eigenspaces, one can check that the quotiented transformation ${x}$ no longer has any spectrum, which contradicts the fundamental theorem of algebra applied to the characteristic polynomial of this quotiented transformation (or, if one is more analytically inclined, one could apply Liouville’s theorem to the resolvent operators to obtain the required contradiction). Thus the generalised eigenspaces span ${V}$:

$\displaystyle V = \bigoplus_{\lambda \in \sigma(x)} V_\lambda^x.$

On each space ${V_\lambda^x}$, the operator ${x-\lambda}$ only has spectrum at zero, and thus (again from the fundamental theorem of algebra) has non-trivial kernel; similarly for any ${x}$-invariant subspace of ${V_\lambda^x}$, such as the range ${(x-\lambda) V_\lambda^x}$ of ${x-\lambda}$. Iterating this observation we conclude that ${x-\lambda}$ is a nilpotent operator on ${V_\lambda^x}$, thus ${(x-\lambda)^n=0}$ for some ${n}$. If we then write ${x_{ss}}$ to be the direct sum of the scalar multiplication operators ${\lambda}$ on each generalised eigenspace ${V_\lambda^x}$, and ${x_n}$ to be the direct sum of the operators ${x-\lambda}$ on these spaces, we have obtained the Jordan decomposition (or Jordan-Chevalley decomposition)

$\displaystyle x = x_{ss} + x_n$

where the operator ${x_{ss}: V \rightarrow V}$ is semisimple in the sense that it is a diagonalisable linear transformation on ${V}$ (or equivalently, all generalised eigenspaces are actually eigenspaces), and ${x_n}$ is nilpotent. Furthermore, as we may use polynomial interpolation to find a polynomial ${P: {\bf C} \rightarrow {\bf C}}$ such that ${P(z)-\lambda}$ vanishes to arbitrarily high order at ${z=\lambda}$ for each ${\lambda \in \sigma(x)}$ (and also ${P(0)=0}$), we see that ${x_{ss}}$ (and hence ${x_n}$) can be expressed as polynomials in ${x}$ with zero constant coefficient; this fact will be important later. In particular, ${x_{ss}}$ and ${x_n}$ commute.

Conversely, given an arbitrary linear transformation ${x: V \rightarrow V}$, the Jordan-Chevalley decomposition is the unique decomposition into commuting semisimple and nilpotent elements. Indeed, if we have an alternate decomposition ${x = x'_{ss} + x'_n}$ into a semisimple element ${x'_{ss}}$ commuting with a nilpotent element ${x'_n}$, then the generalised eigenspaces of ${x}$ must be preserved by both ${x'_{ss}}$ and ${x'_n}$, and so without loss of generality we may assume that there is just a single generalised eigenspace ${V = V_\lambda^x}$; subtracting ${\lambda}$ we may then assume that ${\lambda=0}$, but then ${x}$ is nilpotent, and so ${x'_{ss} = x - x'_n}$ is also nilpotent; but the only transformation which is both semisimple and nilpotent is the zero transformation, and the claim follows.
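The polynomial description of ${x_{ss}}$ can be seen in a small worked example. The sketch below builds a hypothetical ${3 \times 3}$ matrix ${x}$ with eigenvalue ${2}$ (on a two-dimensional generalised eigenspace) and eigenvalue ${5}$, conjugated by a change of basis `S` so that the structure is not visible by inspection. It then evaluates the interpolating polynomial ${P(z) = 2 + (z-2)^2 (z-3)/6}$, which was chosen by hand so that ${P(z)-2}$ vanishes to second order at ${z=2}$, ${P(5)=5}$, and ${P(0)=0}$. All specific matrices and names here are assumptions made for illustration.

```python
from fractions import Fraction

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(a, b):
    return [[p + q for p, q in zip(r, s)] for r, s in zip(a, b)]

def sub(a, b):
    return [[p - q for p, q in zip(r, s)] for r, s in zip(a, b)]

def scale(a, c):
    return [[c * p for p in r] for r in a]

I = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
zero = [[0] * 3 for _ in range(3)]

def shift(m, c):
    """Return m - c * I."""
    return sub(m, scale(I, c))

# Hypothetical change of basis and its inverse.
S = [[1, 0, 0], [1, 1, 0], [0, 1, 1]]
S_inv = [[1, 0, 0], [-1, 1, 0], [1, -1, 1]]

# Jordan form with eigenvalues 2 (one 2x2 block) and 5.
J = [[2, 1, 0], [0, 2, 0], [0, 0, 5]]
x = matmul(matmul(S, J), S_inv)

# x_ss = P(x) with P(z) = 2 + (z - 2)^2 (z - 3) / 6.
x_minus_2 = shift(x, 2)
x_ss = add(scale(I, 2),
           scale(matmul(matmul(x_minus_2, x_minus_2), shift(x, 3)), Fraction(1, 6)))
x_n = sub(x, x_ss)

commute = matmul(x_ss, x_n) == matmul(x_n, x_ss)
nilpotent = matmul(x_n, x_n) == zero
# x_ss is killed by the square-free polynomial (z - 2)(z - 5), hence diagonalisable.
semisimple = matmul(shift(x_ss, 2), shift(x_ss, 5)) == zero
```

Since ${x_{ss}}$ is obtained as a polynomial in ${x}$, it automatically commutes with ${x}$ and hence with ${x_n = x - x_{ss}}$; the final check uses the standard fact that a matrix annihilated by a polynomial with distinct roots is diagonalisable.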

From the Jordan-Chevalley decomposition it is not difficult to then place ${x}$ in Jordan normal form by selecting a suitable basis for ${V}$; see e.g. this previous blog post. But in contrast to the Jordan-Chevalley decomposition, the basis is not unique in general, and we will not explicitly use the Jordan normal form in the rest of this post.

Given an abstract complex vector space ${V}$, there is in general no canonical notion of complex conjugation on ${V}$, or of linear transformations ${x: V \rightarrow V}$. However, we can define the conjugate ${\overline{x}}$ of any semisimple transformation ${x: V \rightarrow V}$, defined as the direct sum of ${\overline{\lambda}}$ on each eigenspace ${V_\lambda^x}$ of ${x}$. In particular, we can define the conjugate ${\overline{x_{ss}}: V \rightarrow V}$ of the semisimple component ${x_{ss}}$ of an arbitrary linear transformation ${x}$, which will be the direct sum of ${\overline{\lambda}}$ on each generalised eigenspace ${V_\lambda^x}$ of ${x}$. The significance of this transformation lies in the observation that the product ${\overline{x_{ss}} x}$ has trace ${|\lambda|^2 \hbox{dim} V_\lambda^x}$ on each generalised eigenspace (since nilpotent operators have zero trace), and in particular we see that

$\displaystyle \hbox{tr}(\overline{x_{ss}} x) = 0 \ \ \ \ \ (7)$

if and only if the spectrum consists only of zero, or equivalently that ${x}$ is nilpotent. Thus (7) provides a test for nilpotency, which will turn out to be quite useful later in this post. (Note that this trick relies very much on the special structure of ${{\bf C}}$, in particular the fact that it has characteristic zero.)
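The trace identity behind (7) can be illustrated on two small hypothetical matrices. To keep the sketch short we assume that we are already working in a basis adapted to the generalised eigenspaces, so that ${x}$ is upper triangular and ${x_{ss}}$ is simply the diagonal matrix of eigenvalues; the function name `trace_statistic` is an ad hoc choice.

```python
def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace(m):
    return sum(m[i][i] for i in range(len(m)))

def trace_statistic(x, eigenvalues):
    """tr(conj(x_ss) x), assuming x_ss = diag(eigenvalues) in the current basis."""
    n = len(x)
    conj_ss = [[complex(eigenvalues[i]).conjugate() if i == j else 0
                for j in range(n)] for i in range(n)]
    return trace(matmul(conj_ss, x))

# Eigenvalue i on a two-dimensional generalised eigenspace, eigenvalue 2 on a line.
x = [[1j, 1, 0], [0, 1j, 0], [0, 0, 2]]
stat_x = trace_statistic(x, [1j, 1j, 2])  # |i|^2 * 2 + |2|^2 * 1 = 6

# A nilpotent matrix: the spectrum is {0}, so x_ss = 0.
y = [[0, 1, 0], [0, 0, 1], [0, 0, 0]]
stat_y = trace_statistic(y, [0, 0, 0])    # 0
```

The first statistic is a sum of the non-negative quantities ${|\lambda|^2 \hbox{dim} V_\lambda^x}$, which is why it can only vanish when every eigenvalue is zero.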

In the above arguments we have used the basic fact that if two operators ${x: V \rightarrow V}$ and ${y: V \rightarrow V}$ commute, then the generalised eigenspaces of one operator are preserved by the other. Iterating this fact, we can now start understanding the representations ${\rho: {\mathfrak h} \rightarrow {\mathfrak{gl}}(V)}$ of an abelian Lie algebra. Namely, there is a finite set ${\sigma(\rho) \subset {\mathfrak h}^*}$ of linear functionals (or homomorphisms) ${\lambda: {\mathfrak h} \rightarrow {\bf C}}$ on ${{\mathfrak h}}$ (i.e. elements of the dual space ${{\mathfrak h}^*}$) for which the generalised eigenspaces

$\displaystyle V_\lambda^{\mathfrak h} := \{ v \in V: (\rho({\mathfrak h}) - \lambda)^n v = 0 \hbox{ for some } n \}$

are non-trivial and ${{\mathfrak h}}$-invariant, and we have the decomposition

$\displaystyle V = \bigoplus_{\lambda \in \sigma(\rho)} V_\lambda^{\mathfrak h}.$

Here we use ${(\rho({\mathfrak h})-\lambda)^n v=0}$ as short-hand for writing ${(\rho(x_1)-\lambda(x_1)) \ldots (\rho(x_n)-\lambda(x_n)) v = 0}$ for all ${x_1,\ldots,x_n \in {\mathfrak h}}$. An important special case arises when the action of ${{\mathfrak h}}$ is semisimple in the sense that ${\rho(x)}$ is semisimple for all ${x \in {\mathfrak h}}$. Then all the generalised eigenspaces are just eigenspaces (or weight spaces), thus

$\displaystyle \rho(x) v = \lambda(x) v$

for all ${v \in V_\lambda^{\mathfrak h}}$ and ${x \in{\mathfrak h}}$. When this occurs we call ${v}$ a weight vector with weight ${\lambda}$.
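As a concrete (and purely illustrative) instance of a weight space decomposition, take ${{\mathfrak h}}$ to be the abelian algebra of diagonal matrices in ${{\mathfrak{gl}}_3}$, acting on ${V = {\mathfrak{gl}}_3}$ by the commutator. The elementary matrices ${E_{ij}}$ are then weight vectors, with weight ${\lambda_{ij}(d) = d_{ii} - d_{jj}}$. The sketch below checks this for one sample diagonal matrix; the sample entries and helper names are arbitrary.

```python
def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def bracket(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    return [[p - q for p, q in zip(r, s)] for r, s in zip(ab, ba)]

def scale(a, c):
    return [[c * p for p in r] for r in a]

def unit(n, i, j):
    """Elementary matrix E_ij."""
    m = [[0] * n for _ in range(n)]
    m[i][j] = 1
    return m

def diag(entries):
    n = len(entries)
    return [[entries[i] if i == j else 0 for j in range(n)] for i in range(n)]

n = 3
d = diag([3, -1, 7])  # an arbitrary sample element of h

# [d, E_ij] = (d_ii - d_jj) E_ij for every i, j.
weights_ok = all(
    bracket(d, unit(n, i, j)) == scale(unit(n, i, j), d[i][i] - d[j][j])
    for i in range(n) for j in range(n)
)
```

In this example the weight zero occurs on the three-dimensional space of diagonal matrices, while each of the six non-zero weights ${\lambda_{ij}}$ with ${i \neq j}$ has a one-dimensional weight space.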

— 2. Engel’s theorem and Lie’s theorem —

In the introduction we gave the two basic examples of nilpotent and solvable Lie algebras, namely the strictly upper triangular and upper triangular matrices. The theorems of Engel and of Lie assert, roughly speaking, that these examples (and subalgebras thereof) are essentially the only type of solvable and nilpotent Lie algebras that can exist, at least in the concrete setting of subalgebras of ${{\mathfrak{gl}}(V)}$. Among other things, these theorems greatly clarify the representation theory of nilpotent and solvable Lie algebras.

We begin with Engel’s theorem.

Theorem 3 (Engel’s theorem) Let ${{\mathfrak g} \subset {\mathfrak{gl}}(V)}$ be a concrete Lie algebra such that every element ${x}$ of ${{\mathfrak g}}$ is nilpotent as a linear transformation on ${V}$.

• (i) If ${V}$ is non-trivial, then there is a non-zero element ${v}$ of ${V}$ which is annihilated by every element of ${{\mathfrak g}}$.
• (ii) There is a basis of ${V}$ for which all elements of ${{\mathfrak g}}$ are strictly upper triangular. In particular, ${{\mathfrak g}}$ is nilpotent.

Proof: We begin with (i). We induct on the dimension of ${{\mathfrak g}}$. The claim is trivial for dimensions ${0}$ and ${1}$, so suppose that ${{\mathfrak g}}$ has dimension greater than ${1}$, and that the claim is already proven for smaller dimensions.

Let ${{\mathfrak h}}$ be a maximal proper subalgebra of ${{\mathfrak g}}$, then ${{\mathfrak h}}$ has dimension strictly between zero and ${\hbox{dim} \mathfrak{g}}$ (since all one-dimensional subspaces are proper subalgebras). Observe that for every ${x \in {\mathfrak h}}$, ${\hbox{ad} x}$ acts on both the vector spaces ${{\mathfrak g}}$ and ${{\mathfrak h}}$ and thus also on the quotient space ${{\mathfrak g}/{\mathfrak h}}$. As ${x}$ is nilpotent, all of these actions are nilpotent also. In particular, by induction hypothesis, there is ${v \in {\mathfrak g}/{\mathfrak h}}$ which is annihilated by ${\hbox{ad} x}$ for all ${x \in {\mathfrak h}}$. Let ${w}$ be a representative of ${v}$ in ${{\mathfrak g}}$, then ${[w, {\mathfrak h}] \subset {\mathfrak h}}$, and so ${\hbox{span}(w, {\mathfrak h})}$ is a subalgebra and is thus all of ${{\mathfrak g}}$.

By induction hypothesis again, the space ${W}$ of vectors in ${V}$ annihilated by ${{\mathfrak h}}$ is non-trivial; as ${[w, {\mathfrak h}] \subset {\mathfrak h}}$, it is preserved by ${w}$. As ${w}$ is nilpotent, there is a non-trivial element of ${W}$ annihilated by ${w}$ and hence by ${{\mathfrak g}}$, as required.

Now we prove (ii). We induct on the dimension of ${V}$. The case of dimension zero is trivial, so suppose ${V}$ has dimension at least one, and the claim has already been proven for dimension ${\hbox{dim}(V)-1}$. By (i), we may find a non-trivial vector ${v}$ annihilated by ${{\mathfrak g}}$, and so we may project ${{\mathfrak g}}$ down to ${{\mathfrak{gl}}(V / \hbox{span}(v))}$. By the induction hypothesis, there is a basis for ${V/\hbox{span}(v)}$ on which the projection of any element of ${{\mathfrak g}}$ is strictly upper-triangular; pulling this basis back to ${V}$ and adjoining ${v}$, we obtain the claim. $\Box$
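To see Engel’s theorem in action, one can take a concrete algebra of nilpotent matrices that is not visibly triangular and exhibit the common annihilated vector from part (i). The sketch below conjugates the strictly upper triangular ${3 \times 3}$ matrices by a hypothetical change of basis `S`; the common null vector is then the first column of `S`. The matrices, coefficients, and names are all choices made for this illustration.

```python
def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def matvec(m, v):
    return [sum(m[i][j] * v[j] for j in range(len(v))) for i in range(len(m))]

def unit(n, i, j):
    m = [[0] * n for _ in range(n)]
    m[i][j] = 1
    return m

# Hypothetical change of basis and its inverse.
S = [[1, 0, 0], [1, 1, 0], [0, 1, 1]]
S_inv = [[1, 0, 0], [-1, 1, 0], [1, -1, 1]]

def conjugate(m):
    """Return S m S^{-1}."""
    return matmul(matmul(S, m), S_inv)

# Basis of the conjugated algebra of strictly upper triangular matrices.
g_basis = [conjugate(unit(3, 0, 1)), conjugate(unit(3, 0, 2)), conjugate(unit(3, 1, 2))]

# Candidate common null vector: S applied to the first basis vector.
v = [row[0] for row in S]
annihilated = all(matvec(x, v) == [0, 0, 0] for x in g_basis)

# A sample element 2 a - 3 b + 5 c of the algebra.
coeffs = [2, -3, 5]
x = [[sum(c * b[i][j] for c, b in zip(coeffs, g_basis)) for j in range(3)]
     for i in range(3)]
zero = [[0] * 3 for _ in range(3)]
cube_is_zero = matmul(matmul(x, x), x) == zero
not_upper_triangular = any(x[i][j] != 0 for i in range(3) for j in range(i))
```

Changing basis back by `S_inv` would of course make every element strictly upper triangular again, which is the content of part (ii) in this example.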

As a corollary of this theorem and the short exact sequence (6) we see that an abstract Lie algebra ${{\mathfrak g}}$ is nilpotent iff ${\hbox{ad} {\mathfrak g}}$ is nilpotent iff ${\hbox{ad} x}$ is nilpotent in ${{\mathfrak g}}$ for every ${x \in {\mathfrak g}}$ (i.e. every element of ${{\mathfrak g}}$ is ad-nilpotent).

Engel’s theorem is in fact valid over every field. The analogous theorem of Lie for solvable algebras, however, relies much more strongly on the specific properties of the complex field ${{\bf C}}$.

Theorem 4 (Lie’s theorem) Let ${{\mathfrak g} \subset {\mathfrak{gl}}(V)}$ be a solvable concrete Lie algebra.

• (i) If ${V}$ is non-trivial, there exists a non-zero element ${v}$ of ${V}$ which is an eigenvector for every element of ${{\mathfrak g}}$.
• (ii) There is a basis for ${V}$ such that every element of ${{\mathfrak g}}$ is upper triangular.

Note that if one specialises Lie’s theorem to abelian ${{\mathfrak g}}$ then one essentially recovers the abelian theory of the previous section.

Proof: We prove (i). As before we induct on the dimension of ${{\mathfrak g}}$. The dimension zero case is trivial, so suppose that ${{\mathfrak g}}$ has dimension at least one and that the claim has been proven for smaller dimensions.

Let ${{\mathfrak h}}$ be a codimension one subalgebra of ${{\mathfrak g}}$; such an algebra can be formed by taking a codimension one subspace of the abelianisation ${{\mathfrak g}/[{\mathfrak g}, {\mathfrak g}]}$ (which has dimension at least one, else ${{\mathfrak g}}$ will not be solvable) and then pulling back to ${{\mathfrak g}}$. Note that ${{\mathfrak h}}$ is automatically an ideal.

By induction, there is a non-zero element ${v}$ of ${V}$ such that every element of ${{\mathfrak h}}$ has ${v}$ as an eigenvector, thus we have

$\displaystyle xv = \lambda(x) v$

for all ${x \in {\mathfrak h}}$ and some linear functional ${\lambda: {\mathfrak h} \rightarrow {\bf C}}$. If we then set ${W = V_\lambda^{\mathfrak h}}$ to be the simultaneous eigenspace

$\displaystyle W := \{ w \in V: xw = \lambda(x) w \hbox{ for all } x \in {\mathfrak h} \}$

then ${W}$ is a non-trivial subspace of ${V}$.

Let ${y}$ be an element of ${{\mathfrak g}}$ that is not in ${{\mathfrak h}}$, and let ${w \in W}$. Consider the space spanned by the orbit ${w, yw, y^2 w, \ldots}$. By finite dimensionality, this space has a basis ${w, yw, y^2 w, \ldots, y^{n-1} w}$ for some ${n}$. By induction and definition of ${W}$, we see that every ${x \in {\mathfrak h}}$ acts on this space by an upper-triangular matrix with diagonal entries ${\lambda(x)}$ in this basis. Of course, ${y}$ acts on this space as well, and so ${[x,y]}$ has trace zero on this space, thus ${n \lambda([x,y]) = 0}$ and so ${\lambda([x,y])=0}$ (here we use the characteristic zero nature of ${{\bf C}}$). From this we see that ${y}$ preserves ${W}$: for ${w \in W}$ and ${x \in {\mathfrak h}}$ one has ${xyw = yxw + [x,y]w = \lambda(x) yw + \lambda([x,y]) w = \lambda(x) yw}$ (note that ${[x,y] \in {\mathfrak h}}$ as ${{\mathfrak h}}$ is an ideal). If we let ${v'}$ be an eigenvector of ${y}$ on ${W}$ (which exists from the Jordan decomposition of ${y}$), we conclude that ${v'}$ is a simultaneous eigenvector of ${{\mathfrak g}}$ as required.

The claim (ii) follows from (i) much as in Engel’s theorem. $\Box$
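To illustrate Lie’s theorem in the simplest non-abelian case (a sketch under the assumption that we work with the solvable algebra of upper-triangular ${2 \times 2}$ matrices), the following checks that ${e_1 = (1,0)}$ is a common eigenvector and that the eigenvalue functional, here called `weight` (a name introduced only for this example), vanishes on all brackets, mirroring the identity ${\lambda([x,y])=0}$ from the proof.

```python
def matmul(a, b):
    # Product of two square matrices stored as lists of rows.
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def bracket(a, b):
    # Commutator [a, b] = ab - ba.
    ab, ba = matmul(a, b), matmul(b, a)
    n = len(a)
    return [[ab[i][j] - ba[i][j] for j in range(n)] for i in range(n)]

# Basis of the upper-triangular 2x2 matrices, a solvable concrete algebra.
E11 = [[1, 0], [0, 0]]
E12 = [[0, 1], [0, 0]]
E22 = [[0, 0], [0, 1]]
basis = [E11, E12, E22]

def weight(x):
    # x sends e1 = (1, 0) to its first column (x[0][0], x[1][0]);
    # when x[1][0] == 0 this is the eigenvalue of x on e1.
    return x[0][0]

# e1 is an eigenvector of x exactly when the lower-left entry vanishes.
e1_is_common_eigenvector = all(x[1][0] == 0 for x in basis)

# The weight vanishes on every bracket of basis elements.
weight_kills_brackets = all(
    weight(bracket(a, b)) == 0 for a in basis for b in basis
)
```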

— 3. Characterising semisimplicity —

The objective of this section will be to prove Theorem 1.

Let ${{\mathfrak g} \subset {\mathfrak{gl}}(V)}$ be a concrete Lie algebra, and ${x}$ be an element of ${{\mathfrak g}}$. Then the components ${x_{ss}, x_n: V \rightarrow V}$ of ${x}$ need not lie in ${{\mathfrak g}}$. However they behave “as if” they lie in ${{\mathfrak g}}$ for the purposes of taking Lie brackets, in the following sense:

Lemma 5 Let ${{\mathfrak g} \subset {\mathfrak{gl}}(V)}$ and let ${x \in {\mathfrak g}}$ have Jordan decomposition ${x = x_{ss} + x_n}$. Then ${[x_{ss},{\mathfrak g}] \subset [{\mathfrak g},{\mathfrak g}]}$, ${[\overline{x_{ss}},{\mathfrak g}] \subset [{\mathfrak g},{\mathfrak g}]}$ and ${[x_n,{\mathfrak g}] \subset [{\mathfrak g},{\mathfrak g}]}$.

Proof: As ${x_{ss}}$ and ${x_n}$ are semisimple and nilpotent on ${V}$ and commute with each other, ${\hbox{ad} x_{ss}}$ and ${\hbox{ad} x_n}$ are semisimple and nilpotent on ${{\mathfrak{gl}}(V)}$ and also commute with each other (this can for instance be seen by using Lie’s theorem (or the Jordan normal form) to place ${x}$ in upper triangular form and computing everything explicitly). Thus ${\hbox{ad} x = \hbox{ad} x_{ss} + \hbox{ad} x_n}$ is the Jordan-Chevalley decomposition of ${\hbox{ad} x}$, and in particular ${\hbox{ad} x_{ss} = Q(\hbox{ad} x)}$ for some polynomial ${Q}$ with zero constant coefficient. Since ${\hbox{ad} x}$ maps ${{\mathfrak g}}$ to the subalgebra ${[{\mathfrak g},{\mathfrak g}]}$, we conclude that ${\hbox{ad} x_{ss} = Q(\hbox{ad} x)}$ does also, thus ${[x_{ss}, {\mathfrak g}] \subset [{\mathfrak g},{\mathfrak g}]}$ as required. Similarly for ${\overline{x_{ss}}}$ and ${x_n}$ (note that ${\hbox{ad} \overline{x_{ss}} = \overline{ \hbox{ad} x_{ss} }}$). $\Box$

We can now use this (together with Engel’s theorem and the test (7) for nilpotency) to obtain a part of Theorem 1:

Proposition 6 Let ${{\mathfrak g}}$ be a simple Lie algebra. Then the Killing form ${K}$ is non-degenerate.

Proof: As ${{\mathfrak g}}$ is simple, its center ${Z({\mathfrak g})}$ is trivial, so by (6) ${{\mathfrak g}}$ is isomorphic to ${\hbox{ad} {\mathfrak g}}$. In particular we may assume that ${{\mathfrak g}}$ is a concrete Lie algebra, thus ${{\mathfrak g} \subset {\mathfrak{gl}}(V)}$ for some vector space ${V}$.

Suppose for contradiction that ${K}$ is degenerate. Using the skew-adjointness identity

$\displaystyle K( [z,x], y ) = - K( x, [z,y] )$

for all ${x,y,z \in {\mathfrak g}}$ (which comes from the cyclic properties of trace), we see that the kernel ${\{ x \in {\mathfrak g}: K(x,y)=0 \hbox{ for all } y \in {\mathfrak g} \}}$ is a non-trivial ideal of ${{\mathfrak g}}$, and is thus all of ${{\mathfrak g}}$ as ${{\mathfrak g}}$ is simple. Thus ${K(x,y)=0}$ for all ${x, y \in {\mathfrak g}}$.

Now let ${x,y,z \in {\mathfrak g}}$. By Lemma 5, ${\overline{x_{ss}}}$ acts by Lie bracket on ${{\mathfrak g}}$ and so one can define ${\hbox{ad} \overline{x_{ss}} \in {\mathfrak{gl}}({\mathfrak g})}$. We now consider the quantity

$\displaystyle \hbox{tr}_{\mathfrak g} (\hbox{ad} \overline{x_{ss}}) (\hbox{ad} [y,z]).$

We can rearrange this (using the cyclic identity ${\hbox{tr}(A[B,C]) = \hbox{tr}([A,B]C)}$) as

$\displaystyle \hbox{tr}_{\mathfrak g} (\hbox{ad} [\overline{x_{ss}},y]) (\hbox{ad} z).$

By Lemma 5, ${[\overline{x_{ss}},y] \in {\mathfrak g}}$, so this is equal to

$\displaystyle K( [\overline{x_{ss}},y], z ) = 0,$

and so

$\displaystyle \hbox{tr}_{\mathfrak g} (\hbox{ad} \overline{x_{ss}}) (\hbox{ad} w) = 0$

for all ${w \in [{\mathfrak g},{\mathfrak g}]}$. On the other hand, ${[{\mathfrak g},{\mathfrak g}]}$ is an ideal of ${{\mathfrak g}}$; as ${{\mathfrak g}}$ is simple, we must thus have ${{\mathfrak g} = [{\mathfrak g},{\mathfrak g}]}$ (i.e. ${{\mathfrak g}}$ is perfect). As ${x \in {\mathfrak g}}$, we conclude that

$\displaystyle \hbox{tr}_{\mathfrak g} (\hbox{ad} \overline{x_{ss}}) (\hbox{ad} x) = 0.$

From (7) we conclude that ${\hbox{ad} x}$ is nilpotent for every ${x}$. By Engel’s theorem, this implies that ${\hbox{ad} {\mathfrak g}}$, and hence ${{\mathfrak g}}$, is nilpotent; but ${{\mathfrak g}}$ is simple, giving the desired contradiction. $\Box$

Corollary 7 Let ${{\mathfrak h}}$ be a simple ideal of a Lie algebra ${{\mathfrak g}}$. Then ${{\mathfrak h}}$ is complemented by another ideal ${{\mathfrak k}}$ of ${{\mathfrak g}}$ (thus ${{\mathfrak h} \cap {\mathfrak k} = \{0\}}$ and ${{\mathfrak h} + {\mathfrak k} = {\mathfrak g}}$), with ${{\mathfrak g}}$ isomorphic to the direct sum ${{\mathfrak h} \oplus {\mathfrak k}}$.

Proof: The adjoint action of ${{\mathfrak g}}$ restricts to the ideal ${{\mathfrak h}}$ and gives a restricted Killing form

$\displaystyle K_\mathfrak{h}(x,y) := \hbox{tr}_\mathfrak{h}( (\hbox{ad} x) (\hbox{ad} y) ).$

By Proposition 6, this bilinear form is non-degenerate on ${{\mathfrak h}}$, so the orthogonal complement

$\displaystyle {\mathfrak k} := {\mathfrak h}^\perp = \{ x \in {\mathfrak g}: K_\mathfrak{h}(x,y) = 0\hbox{ for all } y \in {\mathfrak h} \}$

is a complementary subspace to ${{\mathfrak h}}$. It can be verified to also be an ideal. Since ${[{\mathfrak h},{\mathfrak k}]}$ lies in both ${{\mathfrak h}}$ and ${{\mathfrak k}}$, we see that ${[{\mathfrak h},{\mathfrak k}]=0}$, and so ${{\mathfrak g}}$ is isomorphic to ${{\mathfrak h} \oplus {\mathfrak k}}$ as claimed. $\Box$

Now we can prove Theorem 1. We first observe that (i) trivially implies (ii); conversely, if ${{\mathfrak g}}$ has a non-trivial solvable ideal ${{\mathfrak h}}$, then every element of the derived series of ${{\mathfrak h}}$ is also an ideal of ${{\mathfrak g}}$, and in particular ${{\mathfrak g}}$ will have a non-trivial abelian ideal. Thus (i) and (ii) are equivalent.

Now we show that (i) implies (iv), which we do by induction on the dimension of ${{\mathfrak g}}$. Of course we may assume ${{\mathfrak g}}$ is non-trivial. Let ${{\mathfrak h}}$ be a non-trivial ideal of ${{\mathfrak g}}$ of minimal dimension. If ${{\mathfrak h} = {\mathfrak g}}$ then ${{\mathfrak g}}$ is simple (note that it cannot be abelian as ${{\mathfrak g}}$ is non-trivial and semisimple) and we are done. If ${{\mathfrak h}}$ is strictly smaller than ${{\mathfrak g}}$, then it also has no non-trivial solvable ideals (because the radical of ${{\mathfrak h}}$ is a characteristic subalgebra of ${{\mathfrak h}}$ and is thus an ideal in ${{\mathfrak g}}$) and so by induction is isomorphic to the direct sum of simple Lie algebras; as ${{\mathfrak h}}$ was minimal, we conclude that ${{\mathfrak h}}$ is itself simple. By Corollary 7, ${{\mathfrak g}}$ then splits as the direct sum of ${{\mathfrak h}}$ and a semisimple Lie algebra of strictly smaller dimension, and the claim follows from the induction hypothesis.

From Proposition 6 we see that (iv) implies (iii), so to finish the proof of Theorem 1 it suffices to show that (iii) implies (ii). Indeed, if ${{\mathfrak g}}$ has a non-trivial abelian ideal ${{\mathfrak h}}$, then for any ${x \in {\mathfrak g}}$ and ${y \in {\mathfrak h}}$, ${\hbox{ad} x \hbox{ad} y}$ annihilates ${{\mathfrak h}}$ and also has range in ${{\mathfrak h}}$, hence has trace zero, so ${{\mathfrak h}}$ is ${K}$-orthogonal to ${{\mathfrak g}}$, giving the degeneracy of the Killing form.
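The equivalence between semisimplicity and non-degeneracy of the Killing form can be seen in two small examples. The sketch below (with ad hoc helper names `ad_matrix` and `killing_gram`, and coordinate functions chosen for the two specific bases used) computes the Gram matrix of ${K}$ for ${\mathfrak{sl}_2}$ in the basis ${H, X, Y}$, and for the two-dimensional solvable algebra spanned by ${a = E_{11}}$, ${b = E_{12}}$ with ${[a,b]=b}$.

```python
def matmul(a, b):
    # Product of two square matrices stored as lists of rows.
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def bracket(a, b):
    # Commutator [a, b] = ab - ba.
    ab, ba = matmul(a, b), matmul(b, a)
    n = len(a)
    return [[ab[i][j] - ba[i][j] for j in range(n)] for i in range(n)]

def ad_matrix(x, basis, coords):
    # Matrix of ad x; column j holds the coordinates of [x, basis[j]].
    cols = [coords(bracket(x, b)) for b in basis]
    n = len(basis)
    return [[cols[j][i] for j in range(n)] for i in range(n)]

def killing_gram(basis, coords):
    # Gram matrix K(b_i, b_j) = tr(ad b_i ad b_j).
    ads = [ad_matrix(x, basis, coords) for x in basis]
    n = len(basis)
    return [[sum(matmul(ads[i], ads[j])[k][k] for k in range(n))
             for j in range(n)] for i in range(n)]

# sl_2 with basis H, X, Y; a traceless m equals m00*H + m01*X + m10*Y.
H = [[1, 0], [0, -1]]
X = [[0, 1], [0, 0]]
Y = [[0, 0], [1, 0]]
sl2_gram = killing_gram([H, X, Y], lambda m: [m[0][0], m[0][1], m[1][0]])

# Solvable algebra span(E11, E12); an element m equals m00*E11 + m01*E12.
E11 = [[1, 0], [0, 0]]
E12 = [[0, 1], [0, 0]]
solvable_gram = killing_gram([E11, E12], lambda m: [m[0][0], m[0][1]])
```

Here `sl2_gram` comes out as `[[8, 0, 0], [0, 0, 4], [0, 4, 0]]`, which has non-zero determinant, while `solvable_gram` is `[[1, 0], [0, 0]]`: the abelian ideal spanned by ${b}$ is ${K}$-orthogonal to the whole algebra, as in the argument that (iii) implies (ii).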

Remark 1 Similar methods also give the Cartan solvability criterion: a Lie algebra ${{\mathfrak g}}$ is solvable if and only if ${{\mathfrak g}}$ is orthogonal to ${[{\mathfrak g},{\mathfrak g}]}$ with respect to the Killing form. Indeed, the “only if” part follows easily from Lie’s theorem, while for the “if” part one can adapt the proof of Proposition 6 to show that if ${{\mathfrak g}}$ is orthogonal to ${[{\mathfrak g},{\mathfrak g}]}$, then every element of ${\hbox{ad} [{\mathfrak g},{\mathfrak g}]}$ is nilpotent, hence by Engel’s theorem ${\hbox{ad} [{\mathfrak g},{\mathfrak g}]}$ is nilpotent, and so from the short exact sequence (6) we see that ${[{\mathfrak g},{\mathfrak g}]}$ is nilpotent, and hence ${{\mathfrak g}}$ is solvable.

Remark 2 The decomposition of a semisimple Lie algebra as the direct sum of simple Lie algebras is unique up to isomorphism and permutation. Indeed, suppose that ${\bigoplus_{i=1}^n {\mathfrak g}_i}$ is isomorphic to ${\bigoplus_{j=1}^{n'} {\mathfrak g}'_j}$ for some simple ${{\mathfrak g}_i, {\mathfrak g}'_j}$. We project each ${{\mathfrak g}'_j}$ to ${{\mathfrak g}_i}$ and observe from simplicity that these projections must either be zero or isomorphisms (cf. Schur’s lemma). For fixed ${i}$, there must be at least one ${j}$ for which the projection is an isomorphism (otherwise ${\bigoplus_{j=1}^{n'} {\mathfrak g}'_j}$ could not generate all of ${\bigoplus_{i=1}^n {\mathfrak g}_i}$); on the other hand, as any two ${{\mathfrak g}'_j}$ commute with each other in the direct sum, and ${{\mathfrak g}_i}$ is nonabelian, there is at most one ${j}$ for which the projection is an isomorphism. This gives the required identification of the ${{\mathfrak g}_i}$ and ${{\mathfrak g}'_j}$ up to isomorphism and permutation.

Remark 3 One can also establish complete reducibility by using the Weyl unitary trick, in which one first creates a real compact Lie group whose Lie algebra is a real form of the complex Lie algebra being studied, and then uses the complete reducibility of actions of compact groups. This also gives an alternate way to establish Theorem 32 in the appendix.

Semisimple Lie algebras have a number of important non-degeneracy properties. For instance, they have no non-trivial outer automorphisms (at the infinitesimal level, at least):

Lemma 8 Let ${{\mathfrak g}}$ be a semisimple Lie algebra. Then every derivation ${D \in \hbox{Der} {\mathfrak g}}$ on ${{\mathfrak g}}$ is inner, thus ${D = \hbox{ad} x}$ for some ${x \in {\mathfrak g}}$.

Proof: From the identity ${[D, \hbox{ad} x] = \hbox{ad} Dx}$ we see that ${\hbox{ad} {\mathfrak g}}$ is an ideal in ${\hbox{Der} {\mathfrak g}}$. The trace form ${(D_1,D_2) \mapsto \hbox{tr}(D_1 D_2)}$ on ${\hbox{Der} {\mathfrak g}}$ restricts to the Killing form on ${\hbox{ad}({\mathfrak g})}$, which is non-degenerate.

Suppose for contradiction that ${\hbox{ad}({\mathfrak g})}$ is not all of ${\hbox{Der}({\mathfrak g})}$, then there is a non-trivial derivation ${D}$ which is trace-form orthogonal to ${\hbox{ad}({\mathfrak g})}$, thus ${D}$ is trace-orthogonal to ${[\hbox{ad} x, \hbox{ad} y]}$ for all ${x,y \in {\mathfrak g}}$, so that ${[D, \hbox{ad} x] = \hbox{ad} {Dx}}$ is trace-orthogonal to ${\hbox{ad} y}$ for all ${x,y \in {\mathfrak g}}$. As ${K}$ is non-degenerate, we conclude that ${Dx=0}$ for all ${x}$, and so ${D}$ is trivial, a contradiction. $\Box$

This fact, combined with the complete reducibility of ${{\mathfrak g}}$-modules (a fact which we will prove in an appendix) implies that the Jordan decomposition preserves concrete semisimple Lie algebras:

Corollary 9 Let ${{\mathfrak g} \subset {\mathfrak{gl}}(V)}$ be a concrete semisimple Lie algebra, and let ${x \in {\mathfrak g}}$. Then ${x_{ss}, x_n, \overline{x}_{ss}}$ also lie in ${{\mathfrak g}}$.

Proof: By Theorem 1, ${{\mathfrak g}}$ is the direct sum of commuting simple algebras. It is easy to see that if ${x,y}$ commute then the Jordan decomposition of ${x+y}$ arises from the sum of the Jordan decompositions of ${x}$ and ${y}$ separately, so we may assume without loss of generality that ${{\mathfrak g}}$ is simple.

Observe that if ${V}$ splits as the direct sum ${V = V_1 \oplus V_2}$ of two ${{\mathfrak g}}$-invariant subspaces (so that ${{\mathfrak g}}$ can be viewed as a subalgebra of ${{\mathfrak{gl}}(V_1) \oplus {\mathfrak{gl}}(V_2)}$, and the elements of ${{\mathfrak g}}$ can be viewed as being block-diagonal in a suitable basis of ${V_1,V_2}$), then the claim for ${V}$ follows from that of ${V_1}$ and ${V_2}$. So by an induction on dimension, it suffices to establish the claim under the hypothesis that ${V}$ is indecomposable, in that it cannot be expressed as the direct sum of two non-trivial invariant subspaces.

In the appendix we will show that every invariant subspace ${W}$ of ${V}$ is complemented in that one can write ${V = W \oplus W'}$ for some invariant subspace ${W'}$. Assuming this fact, it suffices to establish the claim in the case that ${V}$ is irreducible, in the sense that it contains no non-trivial proper invariant subspaces.

By Lemma 5, the operation ${y \mapsto [x_{ss}, y]}$ is a derivation on ${{\mathfrak g}}$, thus by Lemma 8 there exists ${a \in {\mathfrak g}}$ such that ${[x_{ss},y] = (\hbox{ad} a) y}$ for all ${y \in {\mathfrak g}}$, thus ${x_{ss}-a \in {\mathfrak{gl}}(V)}$ centralises ${{\mathfrak g}}$. By Schur’s lemma and the hypothesis of irreducibility, we conclude that ${x_{ss}-a}$ is a constant multiple ${\lambda}$ of the identity. On the other hand, every element of ${{\mathfrak g}}$ has trace zero since ${{\mathfrak g} = [{\mathfrak g},{\mathfrak g}]}$; in particular, ${a}$ and ${x}$ have trace zero, and so ${x_{ss}-a}$ has trace zero (as ${x_n}$ is nilpotent, ${x_{ss}}$ has the same trace as ${x}$). But this trace is just ${\lambda\hbox{dim} V}$, so we conclude that ${x_{ss}-a = \lambda = 0}$ and the claim follows. $\Box$

This allows us to make the Jordan decomposition universal for semisimple algebras:

Lemma 10 (Semisimple Jordan decomposition) Let ${{\mathfrak g}}$ be a semisimple Lie algebra, and let ${x \in {\mathfrak g}}$. Then we have a unique decomposition ${x = x_{ss} + x_n}$ in ${{\mathfrak g}}$ such that ${\rho(x_{ss}) = (\rho(x))_{ss}}$ and ${\rho(x_n) = (\rho(x))_n}$ for every representation ${\rho}$ of ${{\mathfrak g}}$.

Proof: As the adjoint representation is faithful we may assume without loss of generality that ${{\mathfrak g}}$ is a concrete algebra, thus ${{\mathfrak g} \subset {\mathfrak{gl}}(V)}$. The uniqueness is then clear by taking ${\rho}$ to be the identity. To obtain existence, we take ${x_{ss}, x_n}$ to be the concrete Jordan decomposition. We need to verify ${\rho(x_{ss}) = (\rho(x))_{ss}}$ and ${\rho(x_n) = (\rho(x))_n}$ for any representation ${\rho: {\mathfrak g} \rightarrow {\mathfrak m}}$. The adjoint actions of ${\rho(x_{ss})}$ and ${\rho(x_n)}$ on ${\rho({\mathfrak g})}$ commute and are semisimple and nilpotent respectively and so

$\displaystyle \hbox{ad} \rho(x_{ss}) = (\hbox{ad} \rho(x))_{ss}; \quad \hbox{ad} \rho(x_{n}) = (\hbox{ad} \rho(x))_{n}$

in ${\hbox{ad} \rho({\mathfrak g})}$ (cf. the proof of Lemma 5). A similar argument (applying Corollary 9 to ${\rho({\mathfrak g})}$, which is isomorphic to a quotient of ${{\mathfrak g}}$ and is thus semisimple, to keep ${\rho(x)_{ss}, \rho(x)_n}$ in ${\rho({\mathfrak g})}$) gives

$\displaystyle \hbox{ad} (\rho(x)_{ss}) = (\hbox{ad} \rho(x))_{ss}; \quad \hbox{ad}(\rho(x)_{n}) = (\hbox{ad} \rho(x))_{n}.$

Since the adjoint representation of the semisimple algebra ${\rho({\mathfrak g})}$ is faithful, the claim follows. $\Box$

One can also show that ${x_{ss}}$, ${x_n}$ commute with each other and with the centraliser ${C(x) := \{ y \in {\mathfrak g}: [x,y] = 0 \}}$ of ${x}$ by using the faithful nature of the adjoint representation for semisimple algebras, though we will not need these facts here. Using this lemma we have a well-defined notion of an element ${x}$ of a semisimple algebra ${{\mathfrak g}}$ being semisimple (resp. nilpotent), namely that ${x=x_{ss}}$ (resp. ${x=x_n}$). Lemma 10 then implies that any representation of a semisimple element of ${{\mathfrak g}}$ is again semisimple, and any representation of a nilpotent element of ${{\mathfrak g}}$ is again nilpotent. This apparently innocuous statement relies heavily on the semisimple nature of ${{\mathfrak g}}$; note for instance that the representation

$\displaystyle t \mapsto \begin{pmatrix} 0 & t \\ 0 & 0 \end{pmatrix}$

of the non-semisimple algebra ${{\bf C} \equiv {\mathfrak{gl}}_1}$ into ${{\mathfrak{gl}}_2}$ takes semisimple elements to nilpotent ones.
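This counterexample is easy to check by hand, but for concreteness here is a minimal sketch (the function name `rho` and the integer sample values are choices made only for this illustration). Since ${{\mathfrak{gl}}_1}$ is abelian, being a representation just means that the images commute; every image squares to zero, so it is nilpotent, even though every ${1 \times 1}$ matrix is semisimple.

```python
def matmul(a, b):
    # Product of two square matrices stored as lists of rows.
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def bracket(a, b):
    # Commutator [a, b] = ab - ba.
    ab, ba = matmul(a, b), matmul(b, a)
    n = len(a)
    return [[ab[i][j] - ba[i][j] for j in range(n)] for i in range(n)]

def rho(t):
    # The representation t -> [[0, t], [0, 0]] of gl_1 into gl_2.
    return [[0, t], [0, 0]]

zero = [[0, 0], [0, 0]]
samples = [1, -2, 3]

# gl_1 is abelian, so rho respects brackets iff the images commute.
respects_bracket = all(
    bracket(rho(s), rho(t)) == zero for s in samples for t in samples
)

# Every image squares to zero, hence is nilpotent.
images_square_to_zero = all(matmul(rho(t), rho(t)) == zero for t in samples)
```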

— 4. Cartan subalgebras —

While simple Lie algebras do not have any non-trivial ideals, they do have some very useful subalgebras known as Cartan subalgebras which will eventually turn out to be abelian and which can be used to dramatically clarify the structure of the rest of the algebra.

We need some definitions. An element ${x}$ of ${{\mathfrak g}}$ is said to be regular if its generalised null space

$\displaystyle {\mathfrak g}^x_0 := \{ y \in {\mathfrak g}: (\hbox{ad} x)^n y = 0 \hbox{ for some } n \}$

has minimal dimension. A Cartan subalgebra of ${{\mathfrak g}}$ is a nilpotent subalgebra ${{\mathfrak h}}$ of ${{\mathfrak g}}$ which is its own normaliser, thus ${N({\mathfrak h}) := \{ x \in {\mathfrak g}: [x,{\mathfrak h}] \subset {\mathfrak h} \}}$ is equal to ${{\mathfrak h}}$. From the polynomial nature of the Lie algebra operations (and the Noetherian nature of algebraic geometry) we see that the regular elements of ${{\mathfrak g}}$ are generic (i.e. they form a non-empty Zariski-open subset of ${{\mathfrak g}}$).

Example 1 In ${{\mathfrak{gl}}(V)}$, the regular elements consist of the semisimple elements with distinct eigenvalues. Fixing a basis for ${V}$, the space of elements of ${{\mathfrak{gl}}(V)}$ that are diagonalised by that basis form a Cartan subalgebra of ${{\mathfrak{gl}}(V)}$.
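Example 1 can be tested numerically in ${{\mathfrak{gl}}_2}$. The following sketch (helper names such as `gen_null_dim` are hypothetical, and exact rational arithmetic is used to avoid rounding issues) computes the dimension of the generalised null space of ${\hbox{ad} x}$ as ${4 - \hbox{rank}((\hbox{ad} x)^4)}$ for three choices of ${x}$.

```python
from fractions import Fraction

def matmul(a, b):
    # Product of two square matrices stored as lists of rows.
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def bracket(a, b):
    # Commutator [a, b] = ab - ba.
    ab, ba = matmul(a, b), matmul(b, a)
    n = len(a)
    return [[ab[i][j] - ba[i][j] for j in range(n)] for i in range(n)]

def unit(i, j):
    # Elementary 2x2 matrix with a single 1 in row i, column j.
    return [[1 if (r, c) == (i, j) else 0 for c in range(2)] for r in range(2)]

basis = [unit(0, 0), unit(0, 1), unit(1, 0), unit(1, 1)]

def ad_matrix(x):
    # 4x4 matrix of ad x on gl_2; column j holds coordinates of [x, basis[j]].
    cols = [[m[0][0], m[0][1], m[1][0], m[1][1]]
            for m in (bracket(x, b) for b in basis)]
    return [[cols[j][i] for j in range(4)] for i in range(4)]

def rank(m):
    # Rank via Gaussian elimination over the rationals.
    rows = [[Fraction(v) for v in row] for row in m]
    r = 0
    for c in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][c] / rows[r][c]
            rows[i] = [u - factor * v for u, v in zip(rows[i], rows[r])]
        r += 1
    return r

def gen_null_dim(x):
    # Dimension of the generalised null space of ad x: 4 - rank((ad x)^4).
    a = ad_matrix(x)
    p = a
    for _ in range(3):
        p = matmul(p, a)
    return 4 - rank(p)

dim_distinct = gen_null_dim([[1, 0], [0, 2]])   # distinct eigenvalues
dim_scalar = gen_null_dim([[1, 0], [0, 1]])     # repeated eigenvalue
dim_jordan = gen_null_dim([[1, 1], [0, 1]])     # not semisimple
```

The element with distinct eigenvalues attains the minimal dimension 2 (the dimension of the diagonal Cartan subalgebra), whereas the other two give the full dimension 4.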

Cartan algebras always exist, and can be constructed as generalised null spaces of regular elements:

Proposition 11 (Existence of Cartan subalgebras) Let ${{\mathfrak g}}$ be an abstract Lie algebra. If ${x \in {\mathfrak g}}$ is regular, then the generalised null space ${{\mathfrak h} := {\mathfrak g}^x_0}$ of ${x}$ is a Cartan subalgebra.

Proof: Suppose that ${{\mathfrak h}}$ is not nilpotent, then by Engel’s theorem the adjoint action of at least one element of ${{\mathfrak h}}$ on ${{\mathfrak h}}$ is not nilpotent. By the polynomial nature of the Lie algebra operations, we conclude that the adjoint action of a generic element of ${{\mathfrak h}}$ on ${{\mathfrak h}}$ is not nilpotent.

The action of ${x}$ on ${{\mathfrak g}/{\mathfrak h}}$ is non-singular, so the action of generic elements of ${{\mathfrak h}}$ on ${{\mathfrak g}/{\mathfrak h}}$ is also non-singular. Thus we can find ${y \in {\mathfrak h}}$ such that ${\hbox{ad} y}$ is not nilpotent on ${{\mathfrak h}}$ and not singular on ${{\mathfrak g}/{\mathfrak h}}$. From this we see that ${{\mathfrak g}^y_0}$ is a proper subspace of ${{\mathfrak g}^x_0}$, contradicting the regularity of ${x}$. Thus ${{\mathfrak h}}$ is nilpotent.

Finally, we show that ${{\mathfrak h}}$ is its own normaliser. Suppose that ${y \in {\mathfrak g}}$ normalises ${{\mathfrak h}}$, then ${(\hbox{ad} x) y \in {\mathfrak h}}$. But ${{\mathfrak h}}$ is the generalised null space of ${\hbox{ad} x}$, and so ${y \in {\mathfrak h}}$ as required. $\Box$

Furthermore, all Cartan algebras arise as generalised null spaces:

Proposition 12 (Cartans are null spaces) Let ${{\mathfrak g}}$ be an abstract Lie algebra, and let ${{\mathfrak h}}$ be a Cartan subalgebra. Let

$\displaystyle {\mathfrak g}_0^{\mathfrak h} = \{ x \in {\mathfrak g}: (\hbox{ad} {\mathfrak h})^n x = 0 \hbox{ for some } n \}$

be the generalised null space of ${{\mathfrak h}}$. Then ${{\mathfrak g}_0^{\mathfrak h} = {\mathfrak h}}$. Furthermore, for generic ${x \in {\mathfrak h}}$, one has

$\displaystyle {\mathfrak h} = {\mathfrak g}_0^x.$

Proof: As ${{\mathfrak h}}$ is nilpotent, we certainly have ${{\mathfrak h} \subset {\mathfrak g}_0^{\mathfrak h}}$. Now, for any ${x \in {\mathfrak h}}$, ${\hbox{ad} x}$ acts nilpotently on both ${{\mathfrak g}_0^{\mathfrak h}}$ and ${{\mathfrak h}}$ and hence on ${{\mathfrak g}_0^{\mathfrak h}/{\mathfrak h}}$. By Engel’s theorem, we can thus find ${y \in {\mathfrak g}_0^{\mathfrak h}/{\mathfrak h}}$ that is annihilated by the adjoint action of ${{\mathfrak h}}$; pulling back to ${{\mathfrak g}_0^{\mathfrak h}}$, we conclude that the normaliser of ${{\mathfrak h}}$ is strictly larger than ${{\mathfrak h}}$, contradicting the hypothesis that ${{\mathfrak h}}$ is a Cartan subalgebra. This shows that ${{\mathfrak g}_0^{\mathfrak h} = {\mathfrak h}}$.

Now let ${x \in {\mathfrak h}}$ be generic, then ${{\mathfrak g}_0^x}$ has minimal dimension amongst all elements of ${{\mathfrak h}}$. Let ${y \in {\mathfrak h}}$ be arbitrary. Then for any scalar ${t}$, ${\hbox{ad} (x+ty)}$ acts on ${{\mathfrak g}}$ and on ${{\mathfrak g}^x_0}$ (which is preserved by ${\hbox{ad} {\mathfrak h}}$, as ${{\mathfrak h}}$ is nilpotent) and hence on ${{\mathfrak g}/{\mathfrak g}^x_0}$. This action is invertible when ${t=0}$, and hence is also invertible for generic ${t}$; thus for generic ${t}$, ${{\mathfrak g}^{x+ty}_0 \subset {\mathfrak g}^x_0}$. By minimality we conclude that ${{\mathfrak g}^{x+ty}_0 = {\mathfrak g}^x_0}$, so ${\hbox{ad} (x+ty)}$ is nilpotent on ${{\mathfrak g}^x_0}$ for generic ${t}$, and thus for all ${t}$. In particular ${\hbox{ad} (x+y)}$ is nilpotent on ${{\mathfrak g}^x_0}$ for any ${y \in {\mathfrak h}}$, thus ${{\mathfrak g}^x_0 \subset {\mathfrak g}^{\mathfrak h}_0 = {\mathfrak h}}$. Since ${{\mathfrak h} \subset {\mathfrak g}^x_0}$, we obtain ${{\mathfrak h} = {\mathfrak g}_0^x}$ as required. $\Box$

Corollary 13 (Cartans are conjugate) Let ${{\mathfrak g}}$ be a Lie algebra, and let ${{\mathfrak h}}$ be a Cartan algebra. Then for generic ${x \in {\mathfrak g}}$, ${{\mathfrak h}}$ is conjugate to ${{\mathfrak g}_0^x}$ by an inner automorphism of ${{\mathfrak g}}$ (i.e. an element of the algebraic group generated by ${\exp(\hbox{ad} y)}$ for ${y \in {\mathfrak g}}$). In particular, any two Cartan subalgebras are conjugate to each other by an inner automorphism.

Proof: Let ${S}$ be the set of ${x' \in {\mathfrak h}}$ with ${{\mathfrak h} = {\mathfrak g}_0^{x'}}$, then ${S}$ is a Zariski open dense subset of ${{\mathfrak h}}$ by Proposition 12. Then let ${T}$ be the collection of ${x \in {\mathfrak g}}$ that are conjugate to an ${x' \in S}$, then ${T}$ is an algebraically constructible subset of ${{\mathfrak g}}$. For ${x' \in S}$, observe that ${(\hbox{ad} {x'})({\mathfrak g})}$ and ${{\mathfrak h}}$ span ${{\mathfrak g}}$, since ${{\mathfrak h} = {\mathfrak g}_0^{x'}}$, and so by the inverse function theorem, a (topological) neighbourhood of ${x'}$ is contained in ${T}$. This implies that ${T}$ is Zariski dense, and the claim follows. $\Box$

In the case of semisimple algebras, the Cartan structure is particularly clean:

Proposition 14 Let ${{\mathfrak g}}$ be a semisimple Lie algebra. Then every Cartan subalgebra ${{\mathfrak h}}$ is abelian, and ${K}$ is non-degenerate on ${{\mathfrak h}}$.

The dimension of the Cartan algebra of a semisimple Lie algebra is known as the rank of the algebra.

Proof: The nilpotent algebra ${{\mathfrak h}}$ acts via the adjoint action on ${{\mathfrak g}}$, and by Lie’s theorem this action can be made upper triangular. From this it is not difficult to obtain a decomposition

$\displaystyle {\mathfrak g} = \bigoplus_{\lambda \in \sigma({\mathfrak h})} {\mathfrak g}^{\mathfrak h}_\lambda$

for some finite set ${\sigma({\mathfrak h}) \subset {\mathfrak h}^*}$, where ${{\mathfrak g}^{\mathfrak h}_\lambda}$ are the generalised eigenspaces

$\displaystyle {\mathfrak g}^{\mathfrak h}_\lambda = \{ x \in {\mathfrak g}: (\hbox{ad} {\mathfrak h}-\lambda)^k x = 0 \hbox{ for some } k \}.$

From the Jacobi identity (2) we see that ${[{\mathfrak g}^{\mathfrak h}_{\lambda}, {\mathfrak g}^{\mathfrak h}_{\mu}] \subset {\mathfrak g}^{\mathfrak h}_{\lambda+\mu}}$. Among other things, this shows that ${{\mathfrak g}^{\mathfrak h}_{\lambda}}$ has ad-trace zero for any non-zero ${\lambda}$, and hence ${{\mathfrak g}^{\mathfrak h}_{\lambda}, {\mathfrak g}^{\mathfrak h}_{\mu}}$ are ${K}$-orthogonal if ${\lambda+\mu \neq 0}$. In particular, ${{\mathfrak g}^{\mathfrak h}_{0}}$ is ${K}$-orthogonal to ${\bigoplus_{\lambda \neq 0} {\mathfrak g}^{\mathfrak h}_\lambda}$. By Theorem 1, ${K}$ is non-degenerate on ${{\mathfrak g}}$, and thus also non-degenerate on ${{\mathfrak g}^{\mathfrak h}_0}$; by Proposition 12, ${K}$ is thus non-degenerate on ${{\mathfrak h}}$. But by Lie’s theorem, we can find a basis for which ${{\mathfrak h}}$ consists of upper-triangular matrices in the adjoint representation of ${{\mathfrak g}}$, so that ${[{\mathfrak h},{\mathfrak h}]}$ is strictly upper-triangular and thus ${K}$-orthogonal to ${{\mathfrak h}}$. As ${K}$ is non-degenerate on ${{\mathfrak h}}$, this forces ${[{\mathfrak h},{\mathfrak h}]}$ to be trivial, as required. $\Box$

We now use the semisimple Jordan decomposition (Lemma 10) to obtain a further non-degeneracy property of the Cartan subalgebras of semisimple algebras:

Proposition 15 Let ${{\mathfrak g}}$ be a semisimple Lie algebra. Then every Cartan subalgebra ${{\mathfrak h}}$ consists entirely of semisimple elements.

Proof: Let ${x \in {\mathfrak h}}$, then (by the abelian nature of ${{\mathfrak h}}$) ${\hbox{ad} x}$ annihilates ${{\mathfrak h}}$; as ${\hbox{ad} {x_{n}}}$ is a polynomial in ${\hbox{ad} x}$ with zero constant coefficient, ${\hbox{ad} {x_n}}$ annihilates ${{\mathfrak h}}$ as well; thus ${x_n}$ normalises ${{\mathfrak h}}$ and thus also lies in ${{\mathfrak h}}$ as ${{\mathfrak h}}$ is Cartan. If ${y \in {\mathfrak h}}$, then ${y}$ commutes with ${x_n}$ and so ${\hbox{ad} y}$ commutes with ${\hbox{ad} {x_n}}$. As the latter is nilpotent, we conclude that ${\hbox{ad} {x_n} \hbox{ad} y}$ is nilpotent and thus has trace zero. Thus ${x_n}$ is ${K}$-orthogonal to ${{\mathfrak h}}$ and thus vanishes since the Killing form is non-degenerate on ${{\mathfrak h}}$. Thus every element of ${{\mathfrak h}}$ is semisimple as required. $\Box$

— 5. ${{\mathfrak{sl}}_2}$ representations —

To proceed further, we now need to perform some computations on a very specific Lie algebra, the special linear algebra ${\mathfrak{sl}_2}$ of ${2 \times 2}$ complex matrices with zero trace. This is a three-dimensional concrete Lie algebra, spanned by the three generators

$\displaystyle H := \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}; X := \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}; Y := \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix}$

which obey the commutation relations

$\displaystyle [H,X] = 2X; [H,Y] = -2Y; [X,Y] = H. \ \ \ \ \ (8)$

Conversely, any abstract three-dimensional Lie algebra generated by ${H,X,Y}$ with relations (8) is clearly isomorphic to ${\mathfrak{sl}_2}$. One can check that this is a simple Lie algebra, with the one-dimensional space generated by ${H}$ being a Cartan subalgebra.
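The commutation relations (8) can be confirmed directly from the matrices ${H, X, Y}$; the following is a minimal check, with `matmul`, `bracket` and `scale` being ad hoc helper names.

```python
def matmul(a, b):
    # Product of two square matrices stored as lists of rows.
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def bracket(a, b):
    # Commutator [a, b] = ab - ba.
    ab, ba = matmul(a, b), matmul(b, a)
    n = len(a)
    return [[ab[i][j] - ba[i][j] for j in range(n)] for i in range(n)]

def scale(c, m):
    # Scalar multiple c * m.
    return [[c * v for v in row] for row in m]

H = [[1, 0], [0, -1]]
X = [[0, 1], [0, 0]]
Y = [[0, 0], [1, 0]]

# The three relations in (8).
hx_ok = bracket(H, X) == scale(2, X)
hy_ok = bracket(H, Y) == scale(-2, Y)
xy_ok = bracket(X, Y) == H
```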

Now we classify by hand the representations ${\rho: \mathfrak{sl}_2 \rightarrow {\mathfrak{gl}}(V)}$ of ${\mathfrak{sl}_2}$. Observe that ${\mathfrak{sl}_2}$ acts infinitesimally on ${{\bf R}^2}$ by the differential operators (or vector fields)

$\displaystyle H \rightarrow x \partial_x - y \partial_y; \quad X \rightarrow x \partial_y; \quad Y \rightarrow y \partial_x.$

In particular, we see that for each natural number ${n}$, the space ${P_n}$ of homogeneous polynomials in two variables ${x,y}$ of degree ${n}$ has a representation ${\sigma_n: \mathfrak{sl}_2 \rightarrow {\mathfrak{gl}}(P_n)}$; if we give this space the basis ${e_{2i-n} := x^i y^{n-i}}$ for ${i=0,\ldots,n}$, the action is then described by the formulae

$\displaystyle \sigma_n(H) e_j = j e_j; \quad \sigma_n(X) e_j = \frac{n-j}{2} e_{j+2}; \quad \sigma_n(Y) e_j = \frac{n+j}{2} e_{j-2} \ \ \ \ \ (9)$

for ${j = n, n-2, \ldots, -n+2, -n}$. From these formulae it is also easy to see that these representations are irreducible in the sense that the ${P_n}$ have no non-trivial ${\mathfrak{sl}_2}$-invariant subspaces.
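One can verify mechanically that the formulae (9) define a representation, i.e. that the matrices of ${\sigma_n(H), \sigma_n(X), \sigma_n(Y)}$ obey the relations (8). In the sketch below the basis vector ${e_{2i-n}}$ is stored at index ${i}$, and the function names `sigma` and `satisfies_relations` are introduced only for this illustration.

```python
def matmul(a, b):
    # Product of two square matrices stored as lists of rows.
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def bracket(a, b):
    # Commutator [a, b] = ab - ba.
    ab, ba = matmul(a, b), matmul(b, a)
    n = len(a)
    return [[ab[i][j] - ba[i][j] for j in range(n)] for i in range(n)]

def scale(c, m):
    # Scalar multiple c * m.
    return [[c * v for v in row] for row in m]

def sigma(n):
    # Matrices of sigma_n(H), sigma_n(X), sigma_n(Y) from (9),
    # in the basis e_{2i-n} = x^i y^(n-i), i = 0, ..., n.
    size = n + 1
    H = [[0] * size for _ in range(size)]
    X = [[0] * size for _ in range(size)]
    Y = [[0] * size for _ in range(size)]
    for i in range(size):
        j = 2 * i - n
        H[i][i] = j                       # H e_j = j e_j
        if i != n:
            X[i + 1][i] = (n - j) // 2    # X e_j = (n-j)/2 e_{j+2}
        if i != 0:
            Y[i - 1][i] = (n + j) // 2    # Y e_j = (n+j)/2 e_{j-2}
    return H, X, Y

def satisfies_relations(n):
    # Check the relations (8) for the matrices of sigma_n.
    H, X, Y = sigma(n)
    return (bracket(H, X) == scale(2, X)
            and bracket(H, Y) == scale(-2, Y)
            and bracket(X, Y) == H)

all_ok = all(satisfies_relations(n) for n in range(6))
```

The integer divisions are exact because ${n-j}$ and ${n+j}$ are always even for the weights ${j = 2i-n}$.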

Conversely, these representations (and their direct sums) describe (up to isomorphism) all of the representations of ${\mathfrak{sl}_2}$:

Theorem 16 (Representations of ${\mathfrak{sl}_2}$) Any representation ${\rho: \mathfrak{sl}_2 \rightarrow {\mathfrak{gl}}(V)}$ is isomorphic to the direct sum of finitely many of the representations ${\sigma_n: \mathfrak{sl}_2 \rightarrow {\mathfrak{gl}}(P_n)}$.

Here of course the direct sum ${\rho_1 \oplus \rho_2: {\mathfrak g} \rightarrow {\mathfrak{gl}}(V_1 \oplus V_2)}$ of two representations ${\rho_1: {\mathfrak g} \rightarrow {\mathfrak{gl}}(V_1)}$, ${\rho_2: {\mathfrak g} \rightarrow {\mathfrak{gl}}(V_2)}$ is defined as ${\rho_1 \oplus \rho_2(x) := (\rho_1(x), \rho_2(x))}$, and two representations ${\rho_1: {\mathfrak g} \rightarrow {\mathfrak{gl}}(V_1)}$, ${\rho_2: {\mathfrak g} \rightarrow {\mathfrak{gl}}(V_2)}$ are isomorphic if there is an invertible linear map ${\phi: V_1 \rightarrow V_2}$ such that ${\phi \circ \rho_1(x) = \rho_2(x) \circ \phi}$ for all ${x \in {\mathfrak g}}$.

Proof: By induction we may assume that ${V}$ is non-trivial, and that the claim has already been proven for all spaces of smaller dimension than ${V}$.

As ${H}$ is semisimple, ${\rho(H)}$ is semisimple by Lemma 10, and so we can split ${V}$ into the direct sum

$\displaystyle V = \oplus_{\lambda \in \sigma(H)} V^H_\lambda$

of eigenspaces of ${H}$ for some finite ${\sigma(H) \subset {\bf C}}$.

From (8) we have the raising law

$\displaystyle \rho(X) V^H_\lambda \subset V^H_{\lambda+2}$

and the lowering law

$\displaystyle \rho(Y) V^H_\lambda \subset V^H_{\lambda-2}$

As ${\sigma(H)}$ is finite, we may find a “highest weight” ${\lambda \in \sigma(H)}$ with the property that ${\lambda+2 \not \in \sigma(H)}$, thus ${\rho(X)}$ annihilates ${V^H_\lambda}$ by the raising law. We will use the basic strategy of starting from the highest weight space and applying lowering operators to discover one of the irreducible components of the representation.

From (8) one has

$\displaystyle \rho(X) \rho(Y) = \rho(Y) \rho(X) + \rho(H)$

and so from induction and the lowering law we see that

$\displaystyle \rho(X) \rho(Y)^{k+1} v = (\lambda + (\lambda-2) + \ldots + (\lambda-2k)) \rho(Y)^k v \ \ \ \ \ (10)$

for all natural numbers ${k}$ and all ${v \in V^H_\lambda}$. If ${\lambda + (\lambda-2) + \ldots + (\lambda-2k)}$ is never zero, this creates an infinite sequence ${V^H_\lambda, V^H_{\lambda-2}, V^H_{\lambda-4}, \ldots}$ of non-trivial eigenspaces, which is absurd, so we have ${\lambda + (\lambda-2) + \ldots + (\lambda-2n) = 0}$ for some natural number ${n}$, thus ${\lambda = n}$. If we then let

$\displaystyle W := \bigoplus_{k=0}^n \rho(Y)^k V^H_n$

then we see that ${W}$ is invariant under ${H}$, ${X}$, and ${Y}$, and thus ${\mathfrak{sl}_2}$-invariant; also if for each ${\lambda \in \sigma(H)}$ we let ${\tilde V^H_\lambda}$ be the set of all ${v \in V^H_\lambda}$ such that ${\rho(X)^k v}$ is never a non-zero element of ${V^H_n}$ then we see that

$\displaystyle \tilde W := \bigoplus_{\lambda \in \sigma(H)} \tilde V^H_\lambda$

is also ${\mathfrak{sl}_2}$-invariant, and furthermore that ${W}$ and ${\tilde W}$ are complementary subspaces in ${V}$. Applying the induction hypothesis, we are done unless ${W=V}$, but then by splitting ${V^H_n}$ into one-dimensional spaces and applying the lowering operators, we see that we reduce to the case that ${V^H_n}$ is one-dimensional. But if one then lets ${e_n}$ be a generator of ${V^H_n}$ and recursively defines ${e_{n-2}, e_{n-4},\ldots,e_{-n}}$ by

$\displaystyle \rho(Y) e_j = \frac{n+j}{2} e_{j-2}$

one then checks using (10) that ${\rho}$ is isomorphic to ${\sigma_n}$, and the claim follows. $\Box$

Remark 4 Theorem 16 shows that all representations of ${\mathfrak{sl}_2}$ are completely reducible in that they can be decomposed as the direct sum of irreducible representations. In fact, all representations of semisimple Lie algebras are completely reducible; this can be proven by a variant of the above arguments (in combination with the analysis of weights given below), and can also be proven by the unitary trick, or by analysing the action of Casimir elements of the universal enveloping algebra of ${{\mathfrak g}}$, as done in the Appendix.

— 6. Root spaces —

Now we use the ${\mathfrak{sl}_2}$ theory to analyse more general semisimple algebras.

Let ${{\mathfrak g}}$ be a semisimple Lie algebra, and let ${{\mathfrak h}}$ be a Cartan algebra. Then by Proposition 15 ${{\mathfrak h}}$ is abelian and acts in a semisimple fashion on ${{\mathfrak g}}$, and by Proposition 12 ${{\mathfrak h}}$ is its own null space ${{\mathfrak g}^{\mathfrak h}_0}$ in the weight decomposition of ${{\mathfrak g}}$; thus we have the Cartan decomposition

$\displaystyle {\mathfrak g} = {\mathfrak h} \oplus \bigoplus_{\alpha \in \Phi} {\mathfrak g}^{\mathfrak h}_\alpha$

as vector spaces (not as Lie algebras) where ${\Phi}$ is a finite subset of ${{\mathfrak h}^* \backslash \{0\}}$ (known as the set of roots) and ${{\mathfrak g}^{\mathfrak h}_\alpha}$ is the non-trivial eigenspace

$\displaystyle {\mathfrak g}^{\mathfrak h}_\alpha = \{ x \in {\mathfrak g}: [y,x] = \alpha(y) x \hbox{ for all } y \in {\mathfrak h} \}. \ \ \ \ \ (11)$

Example 2 A key example to keep in mind is when ${{\mathfrak g} = \mathfrak{sl}_n}$ is the Lie algebra of ${n \times n}$ matrices of trace zero. An explicit computation using the Killing form and Theorem 1 shows that this algebra is semisimple; in fact it is simple, but we will not show this yet. The space ${{\mathfrak h}}$ of diagonal matrices of trace zero can then be verified to be a Cartan algebra; it can be identified with the space ${{\bf C}^n_0}$ of complex ${n}$-tuples summing to zero, and using the usual Hermitian inner product on ${{\bf C}^n}$ we can also identify ${{\mathfrak h}^*}$ with ${{\bf C}^n_0}$. The roots are then of the form ${e_i - e_j}$ for distinct ${1 \leq i,j \leq n}$, where ${e_1,\ldots,e_n}$ is the standard basis for ${{\bf C}^n}$, with ${{\mathfrak g}^{\mathfrak h}_{e_i-e_j}}$ being the one-dimensional space of matrices that are vanishing except possibly at the ${(i,j)}$ coefficient.
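The root space claim in this example can be checked by direct matrix computation. The sketch below (with illustrative helper names, and a sample diagonal matrix chosen arbitrarily) verifies that the elementary matrix ${E_{ij}}$ satisfies ${[h, E_{ij}] = (h_i - h_j) E_{ij}}$ for a trace-zero diagonal ${h}$, which is (11) for the root ${e_i - e_j}$.

```python
def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def bracket(A, B):
    AB, BA = matmul(A, B), matmul(B, A)
    return [[p - q for p, q in zip(r1, r2)] for r1, r2 in zip(AB, BA)]

def diagonal(h):
    n = len(h)
    return [[h[r] if r == c else 0 for c in range(n)] for r in range(n)]

def elementary(n, i, j):
    """The n x n matrix E_ij with a single 1 in position (i, j), 0-based."""
    return [[1 if (r, c) == (i, j) else 0 for c in range(n)] for r in range(n)]

def is_root_vector(h, i, j):
    """Check [diag(h), E_ij] = (h_i - h_j) E_ij, i.e. E_ij has weight e_i - e_j."""
    E = elementary(len(h), i, j)
    expected = [[(h[i] - h[j]) * entry for entry in row] for row in E]
    return bracket(diagonal(h), E) == expected

h = [5, -2, -3]  # sample diagonal entries summing to zero (an arbitrary choice)
all_ok = all(is_root_vector(h, i, j) for i in range(3) for j in range(3) if i != j)
```

The same helpers also show that ${[E_{ij}, E_{ji}]}$ is the diagonal matrix ${E_{ii} - E_{jj}}$, which lies in ${{\mathfrak h}}$ as predicted by (12) with ${\beta = -\alpha}$.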

From the Jacobi identity (2) we see that the Lie bracket acts additively on the weights, thus

$\displaystyle [{\mathfrak g}^{\mathfrak h}_\alpha, {\mathfrak g}^{\mathfrak h}_\beta] \subset {\mathfrak g}^{\mathfrak h}_{\alpha+\beta} \ \ \ \ \ (12)$

for all ${\alpha,\beta \in {\mathfrak h}^*}$. Taking traces, we conclude that

$\displaystyle K( {\mathfrak g}^{\mathfrak h}_\alpha, {\mathfrak g}^{\mathfrak h}_\beta ) = 0$

whenever ${\alpha+\beta \neq 0}$. As ${K}$ is non-degenerate, we conclude that if ${{\mathfrak g}^{\mathfrak h}_\alpha}$ is non-trivial, then ${{\mathfrak g}^{\mathfrak h}_{-\alpha}}$ must also be non-trivial, thus ${\Phi}$ is symmetric around the origin.

We also claim that ${\Phi}$ spans ${{\mathfrak h}^*}$ as a vector space. For if this were not the case, then there would be a non-zero ${x \in {\mathfrak h}}$ that is annihilated by every root in ${\Phi}$, which by (11) implies that ${\hbox{ad} x}$ annihilates all of the ${{\mathfrak g}^{\mathfrak h}_\alpha}$; as ${{\mathfrak h}}$ is abelian, ${\hbox{ad} x}$ also annihilates ${{\mathfrak h}}$, so ${x}$ is central, contradicting the semisimplicity of ${{\mathfrak g}}$.

From Proposition 14, ${K}$ is non-degenerate on ${{\mathfrak h}}$. Thus, for each root ${\alpha \in \Phi}$, there is a corresponding non-zero element ${t_\alpha}$ of ${{\mathfrak h}}$ such that ${K( t_\alpha, x ) = \alpha(x)}$ for all ${x \in {\mathfrak h}}$. If we let ${x \in {\mathfrak g}^{\mathfrak h}_\alpha, y \in {\mathfrak g}^{\mathfrak h}_{-\alpha}}$, and ${z \in {\mathfrak g}^{\mathfrak h}_0= {\mathfrak h}}$, we have

$\displaystyle K( [x,y], z ) = K( y, [z,x] )$

$\displaystyle = K( y, \alpha(z) x )$

$\displaystyle = \alpha( K(x,y) z )$

$\displaystyle = K( K(x,y) t_\alpha, z )$

and thus by the non-degeneracy of ${K}$ on ${{\mathfrak h}}$ we obtain the useful formula

$\displaystyle [x,y] = K(x,y) t_\alpha \ \ \ \ \ (13)$

for ${x \in {\mathfrak g}^{\mathfrak h}_\alpha}$ and ${y \in {\mathfrak g}^{\mathfrak h}_{-\alpha}}$.

As ${K}$ is non-degenerate and ${{\mathfrak g}^{\mathfrak h}_\alpha}$ is orthogonal to ${{\mathfrak g}^{\mathfrak h}_\beta}$ whenever ${\alpha+\beta \neq 0}$, we can find ${X = X_\alpha \in {\mathfrak g}^{\mathfrak h}_\alpha}$ and ${Y = Y_\alpha \in {\mathfrak g}^{\mathfrak h}_{-\alpha}}$ with ${K(X,Y) \neq 0}$. We divide into two cases depending on whether ${\alpha(t_\alpha)}$ vanishes or not. If ${\alpha(t_\alpha)}$ vanishes, then by (13) and (11) ${[X,Y]}$ is non-trivial but commutes with ${X}$ and ${Y}$, and so ${\hbox{ad} X,\hbox{ad} Y}$ generate a solvable algebra. By Lie’s theorem, this algebra is upper-triangular in some basis, and so ${\hbox{ad} [X,Y]}$ is nilpotent, hence ${\hbox{ad} t_\alpha}$ is nilpotent; but by Proposition 15 ${\hbox{ad} t_\alpha}$ is also semisimple, contradicting the non-zero nature of ${t_\alpha}$ (and the semisimple nature of ${{\mathfrak g}}$). Thus ${\alpha(t_\alpha)}$ is non-vanishing. If we then scale ${X,Y}$ so that ${[X,Y] = H}$, where ${H = H_\alpha}$ is the co-root of ${\alpha}$, defined as the element of ${{\mathfrak h}}$ given by the formula

$\displaystyle H := \frac{2}{\alpha(t_\alpha)} t_\alpha$

so that

$\displaystyle \alpha(H) = 2, \ \ \ \ \ (14)$

then ${X,Y,H}$ obey the relations (8) and thus generate a copy of ${\mathfrak{sl}_2}$, rather than a solvable algebra. The representation theory of ${\mathfrak{sl}_2}$ can then be applied to the space

$\displaystyle \bigoplus_{n \in S_\alpha} {\mathfrak g}^{\mathfrak h}_{n\alpha/2}, \ \ \ \ \ (15)$

where ${S_\alpha := \{ n \in {\bf C}: n \alpha/2 \in \Phi \cup \{0\}\}}$. By (12), this space is invariant with respect to ${X}$ and ${Y}$ and hence to the copy of ${\mathfrak{sl}_2}$, and by (11), (14) each ${{\mathfrak g}^{\mathfrak h}_{n\alpha/2}}$ is the weight space of ${H}$ of weight ${n}$ for each ${n \in S_\alpha}$. By Theorem 16, we conclude that the set ${S_\alpha}$ consists of integers. On the other hand, from (13) we see that any copy of the representation ${\sigma_n}$ with ${n}$ a positive even integer must have its ${0}$ weight space contained in the span of ${t_\alpha}$, and so there is only one such representation in (15). As ${X, Y, H}$ already give a copy of ${\sigma_2}$ in (15), there are no other copies of ${\sigma_n}$ with ${n}$ positive even; thus ${{\mathfrak g}^{\mathfrak h}_\alpha}$ is one-dimensional and the only even multiples of ${\alpha/2}$ in ${\Phi}$ are ${\pm \alpha}$. In particular, ${2\alpha \not \in \Phi}$ whenever ${\alpha \in \Phi}$, which also implies that ${\alpha/2 \not \in \Phi}$ whenever ${\alpha \in \Phi}$. Returning to Theorem 16, we conclude that the set ${S_\alpha}$ contains no odd integers (as any copy of ${\sigma_n}$ with ${n}$ odd would contain a vector of weight ${1}$), and so ${\alpha}$ and ${-\alpha}$ are the only multiples of ${\alpha}$ in ${\Phi}$.

Next, let ${\beta}$ be any non-zero element of ${{\mathfrak h}^*}$ orthogonal to ${\alpha}$ with respect to the inner product ${\langle,\rangle}$ of ${{\mathfrak h}^*}$ that is dual to the restriction of the Killing form to ${{\mathfrak h}}$, and consider the space

$\displaystyle \bigoplus_{n \in S_{\alpha,\beta}} {\mathfrak g}^{\mathfrak h}_{\beta+n\alpha/2} \ \ \ \ \ (16)$

where

$\displaystyle S_{\alpha,\beta} := \{ n \in {\bf C}: \beta + n \alpha/2 \in \Phi \}.$

By (12), this is again an ${\mathfrak{sl}_2}$-invariant space, and by (11), (14) each ${{\mathfrak g}^{\mathfrak h}_{\beta+n\alpha/2}}$ is the weight space of ${H}$ of weight ${n}$. From Theorem 16 we see that ${S_{\alpha,\beta}}$ is an arithmetic progression ${\{-m,-m+2,\ldots,m-2,m\}}$ of spacing ${2}$; in particular, ${S_{\alpha,\beta}}$ is symmetric around the origin and consists only of integers. This implies that the set ${\Phi}$ is symmetric with respect to reflection across the hyperplane that is orthogonal to ${\alpha}$, and also implies that

$\displaystyle 2 \frac{\langle \alpha,\beta\rangle}{\langle \alpha,\alpha \rangle} \in {\bf Z}$

for all roots ${\alpha,\beta \in \Phi}$.

We summarise the various geometric properties of ${\Phi}$ as follows:

Proposition 17 (Root systems) Let ${{\mathfrak g}}$ be a semisimple Lie algebra, let ${{\mathfrak h}}$ be a Cartan subalgebra, and let ${\langle,\rangle}$ be the inner product on ${{\mathfrak h}^*}$ that is dual to the Killing form restricted to ${{\mathfrak h}}$. Let ${\Phi \subset {\mathfrak h}^*}$ be the set of roots. Then:

• (i) ${\Phi}$ does not contain zero.
• (ii) If ${\alpha}$ is a root, then ${\Phi}$ is symmetric with respect to the reflection operation ${s_\alpha: {\mathfrak h}^* \rightarrow {\mathfrak h}^*}$ across the hyperplane orthogonal to ${\alpha}$; in particular, ${-\alpha}$ is also a root.
• (iii) If ${\alpha}$ is a root, then no multiple of ${\alpha}$ other than ${\pm \alpha}$ is a root.
• (iv) If ${\alpha,\beta}$ are roots, then ${\frac{\langle \alpha,\beta \rangle}{\langle \alpha,\alpha \rangle}}$ is an integer or half-integer. Equivalently, ${s_\alpha(\beta) = \beta + m \alpha}$ for some integer ${m}$.
• (v) ${\Phi}$ spans ${{\mathfrak h}^*}$.

A set of vectors ${\Phi}$ obeying the above axioms (i)-(v) is known as a root system on ${{\mathfrak h}^*}$ (viewed as a finite dimensional complex Hilbert space with the inner product ${\langle,\rangle}$).
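To make the axioms concrete, here is a small Python sketch that tests axioms (i)-(iv) on an explicit finite set of vectors with rational coordinates, using the roots ${e_i - e_j}$ of ${\mathfrak{sl}_n}$ from Example 2 as input. It is only an illustration under simplifying assumptions: axiom (v) is not tested, axiom (iii) is only tested among the vectors supplied, and all names are hypothetical.

```python
from fractions import Fraction

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def reflect(alpha, beta):
    """s_alpha(beta) = beta - (2<alpha,beta>/<alpha,alpha>) alpha."""
    c = Fraction(2) * dot(alpha, beta) / dot(alpha, alpha)
    return tuple(b - c * a for a, b in zip(alpha, beta))

def proportional(u, v):
    n = len(u)
    return all(u[i] * v[j] == u[j] * v[i] for i in range(n) for j in range(n))

def check_axioms(roots):
    """Truth values of axioms (i)-(iv) for a finite set of rational vectors."""
    roots = set(roots)
    dim = len(next(iter(roots)))
    if tuple([0] * dim) in roots:                       # axiom (i) fails
        return (False, False, False, False)
    closed = all(reflect(a, b) in roots for a in roots for b in roots)      # (ii)
    reduced = all(b == a or b == tuple(-x for x in a)
                  for a in roots for b in roots if proportional(a, b))      # (iii)
    integral = all((Fraction(2) * dot(a, b) / dot(a, a)).denominator == 1
                   for a in roots for b in roots)                           # (iv)
    return (True, closed, reduced, integral)

def type_a_roots(n):
    """The roots e_i - e_j (i != j) of sl_n as integer vectors of length n."""
    roots = []
    for i in range(n):
        for j in range(n):
            if i != j:
                v = [0] * n
                v[i], v[j] = 1, -1
                roots.append(tuple(v))
    return roots
```

Calling `check_axioms(type_a_roots(3))` returns four `True` values, while adjoining the vectors ${\pm 2(e_1 - e_2)}$ makes the third entry `False`, reflecting the failure of axiom (iii).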

Remark 5 A short calculation reveals the remarkable fact that if ${\Phi}$ is a root system, then the associated system of co-roots ${\{H_\alpha: \alpha \in \Phi\}}$ is also a root system. This is one of the starting points for the deep phenomenon of Langlands duality, which we will not discuss here.

When ${{\mathfrak g}}$ is simple, one can impose a useful additional axiom on ${\Phi}$. Say that a root system ${\Phi}$ is irreducible if ${\Phi}$ cannot be covered by the union ${V \cup W}$ of two orthogonal proper subspaces of ${{\mathfrak h}^*}$.

Lemma 18 If ${{\mathfrak g}}$ is a simple Lie algebra, then the root system ${\Phi}$ of ${{\mathfrak g}}$ is irreducible.

Proof: If ${\Phi}$ can be covered by two orthogonal subspaces ${V \cup W}$, then if we consider the subspace of ${{\mathfrak g}}$

$\displaystyle V \oplus \bigoplus_{\alpha \in \Phi \cap V} {\mathfrak g}^{\mathfrak h}_\alpha$

where we use the inner product ${\langle,\rangle}$ to identify ${{\mathfrak h}^*}$ with ${{\mathfrak h}}$ and thus ${V}$ with a subspace of ${{\mathfrak h}}$ (thus for instance this identifies ${\alpha}$ with ${t_\alpha}$), then one can check using (12) and (13) that this is a proper ideal of ${{\mathfrak g}}$, contradicting simplicity. $\Box$

It is easy to see that every root system is expressible as the union of irreducible root systems (on orthogonal subspaces of ${{\mathfrak h}^*}$). As it turns out, the irreducible root systems are completely classified, with the complete list of irreducible root systems (up to isomorphism) being described in terms of the Dynkin diagrams ${A_n, B_n, C_n, D_n, E_6, E_7, E_8, F_4, G_2}$ briefly mentioned in Theorem 2. We will turn to this classification in the next section, and then use root systems to recover the Lie algebra.

— 7. Classification of root systems —

In this section we classify all the irreducible root systems ${\Phi}$ on a finite dimensional complex Hilbert space ${{\mathfrak h}^*}$, up to Hilbert space isometry. Of course, we may take ${{\mathfrak h}^*}$ to be a standard complex Hilbert space ${{\bf C}^n}$ without loss of generality. The arguments here are purely elementary, proceeding purely from the root system axioms rather than from any Lie algebra theory.

Actually, we can quickly pass from the complex setting to the real setting. By axiom (v), ${\Phi}$ contains a basis ${\alpha_1,\ldots,\alpha_n}$ of ${{\bf C}^n}$; by axiom (iv), the inner products between these basis vectors are real, as are the inner products between any other root and a basis root. From this we see that ${\Phi}$ lies in the real vector space spanned by the basis roots, so by a change of basis we may assume without loss of generality that ${\Phi \subset {\bf R}^n}$.

Henceforth ${\Phi}$ is assumed to lie in ${{\bf R}^n}$. From two applications of (iv) we see that for any two roots ${\alpha,\beta}$, the expression

$\displaystyle \frac{\langle \alpha,\beta \rangle}{\langle \alpha,\alpha \rangle} \frac{\langle \alpha,\beta \rangle}{\langle \beta,\beta \rangle}$

lies in ${\frac{1}{4} {\bf Z}}$; but it is also equal to ${\cos^2 \angle(\alpha,\beta)}$, and hence

$\displaystyle \cos^2 \angle(\alpha,\beta) \in \{ 0, \frac{1}{4}, \frac{1}{2}, \frac{3}{4}, 1 \}$

for all roots ${\alpha,\beta}$. Analysing these cases further using (iv) again, we conclude that there are only a restricted range of options for a pair of roots ${\alpha,\beta}$:

Lemma 19 Let ${\alpha,\beta}$ be roots. Then one of the following occurs:

• (0) ${\beta}$ and ${\alpha}$ are orthogonal.
• (1/4) ${\alpha,\beta}$ have the same length and subtend an angle of ${\pi/3}$ or ${2\pi/3}$.
• (1/2) ${\alpha}$ has ${\sqrt{2}}$ times the length of ${\beta}$ or vice versa, and ${\alpha,\beta}$ subtend an angle of ${\pi/4}$ or ${3\pi/4}$.
• (3/4) ${\alpha}$ has ${\sqrt{3}}$ times the length of ${\beta}$ or vice versa, and ${\alpha,\beta}$ subtend an angle of ${\pi/6}$ or ${5\pi/6}$.
• (1) ${\beta = \pm \alpha}$.
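The case analysis behind Lemma 19 can be organised as a finite enumeration. Writing ${p := 2\langle \alpha,\beta\rangle/\langle\alpha,\alpha\rangle}$ and ${q := 2\langle \alpha,\beta\rangle/\langle\beta,\beta\rangle}$, axiom (iv) makes ${p,q}$ integers of the same sign with ${pq = 4\cos^2 \angle(\alpha,\beta) \leq 4}$ and ${p/q = \|\beta\|^2/\|\alpha\|^2}$. The following sketch (the names ${p}$, ${q}$ and the function name are introduced only for this illustration) lists the non-orthogonal possibilities.

```python
import math
from fractions import Fraction

def pair_cases():
    """List (p, q, angle in degrees, |beta|^2/|alpha|^2) for integers p, q of equal sign with p*q <= 4."""
    cases = []
    for p in range(-4, 5):
        for q in range(-4, 5):
            if p * q <= 0 or p * q > 4:      # orthogonal pairs (p = q = 0) are skipped
                continue
            cosine = math.copysign(math.sqrt(p * q) / 2, p)
            angle = round(math.degrees(math.acos(cosine)))
            cases.append((p, q, angle, Fraction(p, q)))
    return cases

for p, q, angle, ratio in pair_cases():
    print(f"p={p:2d} q={q:2d} angle={angle:3d} degrees  |beta|^2/|alpha|^2={ratio}")
```

The pairs with ${pq = 4}$ and ${p \neq q}$ correspond to ${\beta = \pm 2\alpha}$ or ${\beta = \pm \alpha/2}$ and are excluded by axiom (iii), which leaves exactly the cases listed in the lemma.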

We next record a useful corollary of Lemma 19 (and axiom (ii)):

Corollary 20 Let ${\alpha,\beta}$ be roots with ${\beta \neq \pm \alpha}$. If ${\alpha,\beta}$ subtend an acute angle, then ${\alpha-\beta}$ and ${\beta-\alpha}$ are also roots. Equivalently, if ${\alpha,\beta}$ subtend an obtuse angle, then ${\alpha+\beta}$ is also a root.

This follows from a routine case analysis and is omitted.

We can leverage Corollary 20 as follows. Call an element ${h}$ of ${{\bf R}^n}$ regular if it is not orthogonal to any root, thus generic elements of ${{\bf R}^n}$ are regular. Given a regular element ${h}$, let ${\Phi_h^+ := \{ \alpha \in \Phi: \langle \alpha,h\rangle > 0 \}}$ denote the roots ${\alpha}$ which are ${h}$-positive in the sense that their inner product with ${h}$ is positive; thus ${\Phi}$ is partitioned into ${\Phi_h^+}$ and ${-\Phi_h^+}$. We will abbreviate ${h}$-positive as positive if ${h}$ is understood from context. Call a positive root ${\alpha \in \Phi_h^+}$ a ${h}$-simple root (or simple root for short) if it cannot be written as the sum of two positive roots. Clearly every positive root is then a linear combination of simple roots with natural number coefficients. By Corollary 20, two simple roots cannot subtend an acute angle, and so any two distinct simple roots subtend a right or obtuse angle.

Example 3 Using the root system ${\{ e_i -e_j: 1 \leq i,j \leq n; i \neq j \}}$ of ${\mathfrak{sl}_n}$ discussed previously, if one takes ${h}$ to be any vector in ${{\bf C}^n_0}$ with decreasing coefficients, then the positive roots are those roots ${e_i - e_j}$ with ${i<j}$, and the simple roots are the roots ${e_i - e_{i+1}}$ for ${1 \leq i < n}$.
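The definitions of positive and simple roots translate directly into a computation. The sketch below, for ${\mathfrak{sl}_4}$ with one particular decreasing vector ${h}$ (an arbitrary choice made for illustration), selects the roots with positive inner product against ${h}$ and then discards those that are sums of two positive roots.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def type_a_roots(n):
    """The roots e_i - e_j (i != j) of sl_n as integer vectors of length n."""
    roots = []
    for i in range(n):
        for j in range(n):
            if i != j:
                v = [0] * n
                v[i], v[j] = 1, -1
                roots.append(tuple(v))
    return roots

def positive_roots(roots, h):
    """Roots whose inner product with the regular element h is positive."""
    return [a for a in roots if dot(a, h) > 0]

def simple_roots(positive):
    """Positive roots that cannot be written as a sum of two positive roots."""
    sums = {tuple(x + y for x, y in zip(a, b)) for a in positive for b in positive}
    return [a for a in positive if a not in sums]

n = 4
h = (3, 1, -1, -3)          # decreasing coefficients summing to zero
pos = positive_roots(type_a_roots(n), h)
simple = simple_roots(pos)
```

Here `pos` consists of the six roots ${e_i - e_j}$ with the first index smaller than the second, and `simple` consists of the three consecutive differences ${e_i - e_{i+1}}$.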

Define an admissible configuration to be a collection of unit vectors in ${{\bf R}^n}$ in an open half-space ${\{ v: \langle v,h \rangle>0\}}$ with the property that any two distinct vectors in this collection form an angle of ${\pi/2}$, ${2\pi/3}$, ${3\pi/4}$, or ${5\pi/6}$, and call the configuration irreducible if it cannot be decomposed into two non-empty orthogonal subsets. From Lemma 19 and the above discussion we see that the unit vectors ${\alpha/\|\alpha\|}$ associated to the simple roots are an admissible configuration. They are also irreducible, for if the simple roots partition into two orthogonal sets then it is not hard to show (using Corollary 20) that all positive roots lie in the span of one of these two sets, contradicting irreducibility of the root system.

We can say quite a bit about admissible configurations; the fact that the vectors in the system always subtend right or obtuse angles, combined with the half-space restriction, is quite limiting (basically because this information can be in violation of inequalities such as the Bessel inequality, or the positive (semi-)definiteness ${\| \sum_i c_i v_i \|^2 \geq 0}$ of the Gram matrix). We begin with an assertion of linear independence:

Lemma 21 If ${v_1,\ldots,v_n}$ is an admissible configuration, then it is linearly independent.

Among other things, this shows that the number of simple roots of a semisimple Lie algebra is equal to the rank of that algebra.

Proof: Suppose this is not the case, then one has a non-trivial linear constraint

$\displaystyle \sum_{i \in A} c_i v_i = \sum_{j \in B} c_j v_j$

for some positive ${c_i,c_j}$ and disjoint ${A,B \subset \{1,\ldots,n\}}$. But as any two vectors in an admissible configuration subtend a right or obtuse angle, ${\langle \sum_{i \in A} c_i v_i, \sum_{j \in B} c_j v_j\rangle \leq 0}$, and thus ${\sum_{i \in A} c_i v_i = \sum_{j \in B} c_j v_j=0}$. But this is not possible as all the ${v_i}$ lie in an open half-space. $\Box$

Define the Coxeter diagram of an admissible configuration ${v_1,\ldots,v_n}$ to be the graph with vertices ${v_1,\ldots,v_n}$, and with any two vertices ${v_i,v_j}$ connected by an edge of multiplicity ${4 \cos^2 \angle(v_i,v_j)}$, thus two vertices are unconnected if they are orthogonal, connected with a single edge if they subtend an angle of ${2\pi/3}$, a double edge if they subtend an angle of ${3\pi/4}$, and a triple edge if they subtend an angle of ${5\pi/6}$. The irreducibility of a configuration is equivalent to the connectedness of its Coxeter diagram. Note that the Coxeter diagram describes all the inner products between the ${v_i}$ and thus describes the ${v_i}$ up to an orthogonal transformation (as can be seen for instance by applying the Gram-Schmidt process).

Lemma 22 The Coxeter diagram of an admissible configuration is acyclic (ignoring multiplicity of edges). In particular, the Coxeter diagram of an irreducible admissible configuration is a tree.

Proof: Suppose for contradiction that the Coxeter diagram contains a cycle ${v_1,\ldots,v_n}$. Then ${\langle v_i,v_{i+1}\rangle \leq - \frac{1}{2}}$ for ${i=1,\ldots,n}$ (with the convention ${v_{n+1}=v_1}$) and ${\langle v_i,v_j \rangle \leq 0}$ for all other pairs of distinct ${i,j}$. This implies that ${\|\sum_{i=1}^n v_i\|^2 \leq 0}$, so that ${\sum_{i=1}^n v_i = 0}$, which contradicts the linear independence of the ${v_i}$. $\Box$

Lemma 23 Any vertex in the Coxeter diagram has degree at most three (counting multiplicity).

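Proof: Let ${v_0}$ be a vertex which is adjacent to some other vertices ${v_1,\ldots,v_d}$. By Lemma 22, no two of the ${v_1,\ldots,v_d}$ are adjacent to each other (as this would create a cycle), so they form an orthonormal system. By Bessel’s inequality (and linear independence) one has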

$\displaystyle \sum_{i=1}^d \langle v_0,v_i\rangle^2 < 1.$

But from construction of the Coxeter diagram we have ${\langle v_0,v_i\rangle^2 = \frac{m_i}{4}}$ for each ${i}$, where ${m_i \in \{1,2,3\}}$ is the multiplicity of the edge connecting ${v_0}$ and ${v_i}$. The claim follows. $\Box$

We can also contract simple edges:

Lemma 24 If ${v_1,\ldots,v_n}$ is an admissible configuration with ${v_i,v_j}$ joined by a single edge, then the configuration formed from ${v_1,\ldots,v_n}$ by replacing ${v_i,v_j}$ with the single vertex ${v_i+v_j}$ is again an admissible configuration, with the resulting Coxeter diagram formed from the original Coxeter diagram by deleting the edge between ${v_i}$ and ${v_j}$ and then identifying ${v_i,v_j}$ together.

This follows easily from acyclicity and direct computation.

By Lemma 23 and Lemma 24, the Coxeter diagram can never form a vertex of degree four or more no matter how many simple edges are contracted. From this we can easily show that connected Coxeter diagrams must have one of the following shapes:

• ${A_n}$: ${n}$ vertices joined in a chain of simple edges;
• ${BCF_n}$: ${n}$ vertices joined in a chain of edges, one of which is a double edge and all others are simple edges;
• ${DE_n}$: three chains of simple edges emanating from a common vertex (forming a “Y” shape), connecting ${n}$ vertices in all;
• ${G_2}$: Two vertices joined by a triple edge.

We can cut down the ${BCF_n}$ and ${DE_n}$ cases further:

Lemma 25 The Coxeter diagram of an admissible configuration cannot contain as a subgraph

• (a) A chain of four edges, with one of the interior edges a double edge;
• (b) Three chains of two simple edges each, emanating from a common vertex;
• (c) Three chains of simple edges of length ${1, 2, 5}$ respectively, emanating from a common vertex.

Proof: To exclude (a), suppose for contradiction that we have two chains ${(u_1,u_2)}$ and ${(v_1,v_2,v_3)}$ of simple edges, with ${u_2,v_3}$ joined by a double edge. Writing ${U := \frac{1}{\sqrt{3}} (u_1+2u_2)}$ and ${V := \frac{1}{\sqrt{6}}(v_1+2v_2+3v_3)}$, one computes that ${U,V}$ are unit vectors with inner product ${\langle U,V \rangle = -1}$, implying that ${U,V}$ are parallel, contradicting linear independence.

To exclude (b), suppose that we have three chains ${(u_1,u_2,x)}$, ${(v_1,v_2,x)}$, ${(w_1,w_2,x)}$ of simple edges joined at ${x}$. Then the vectors ${U := \frac{1}{\sqrt{3}}(u_1+2u_2), V := \frac{1}{\sqrt{3}}(v_1+2v_2), W := \frac{1}{\sqrt{3}}(w_1+2w_2)}$ are an orthonormal system that each have an inner product of ${-1/\sqrt{3}}$ each with ${x}$. Comparing this with Bessel’s inequality we conclude that ${x}$ lies in the span of ${U,V,W}$, contradicting linear independence.

Finally, to exclude (c), suppose we have three chains ${(u_1,x)}$, ${(v_1,v_2,x)}$, ${(w_1,w_2,w_3,w_4,w_5,x)}$ of simple edges joined at ${x}$. Writing ${U := u_1}$, ${V := \frac{1}{\sqrt{3}}(v_1+2v_2)}$, ${W := \frac{1}{\sqrt{15}}(w_1+2w_2+3w_3+4w_4+5w_5)}$, we compute that ${U,V,W}$ are an orthonormal system that have inner products of ${-1/2, -1/\sqrt{3}, -\frac{5}{\sqrt{60}}}$ respectively with ${x}$. As ${\frac{1}{4}+\frac{1}{3}+\frac{25}{60} = 1}$, this forces ${x}$ to lie in the span of ${U,V,W}$, again contradicting linear independence. $\Box$

We remark that one could also obtain the required contradictions in the above proof by verifying in all three cases that the Gram matrix of the subconfiguration has determinant zero.
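The Gram determinant check mentioned here is easy to carry out numerically. In the sketch below the Gram matrix of a configuration of unit vectors is built from its Coxeter diagram (an edge of multiplicity ${m}$ contributing the inner product ${-\sqrt{m}/2}$), and the determinant is computed in floating point by Gaussian elimination. The vertex numbering, the helper names and the comparison with the ${E_8}$ diagram are choices made for this illustration.

```python
import math

def gram_matrix(num_vertices, edges):
    """Gram matrix of unit vectors; an edge (i, j, m) of multiplicity m gives <v_i, v_j> = -sqrt(m)/2."""
    G = [[1.0 if r == c else 0.0 for c in range(num_vertices)] for r in range(num_vertices)]
    for i, j, m in edges:
        G[i][j] = G[j][i] = -math.sqrt(m) / 2
    return G

def determinant(M):
    """Determinant by Gaussian elimination with partial pivoting (floating point)."""
    M = [row[:] for row in M]
    n, det = len(M), 1.0
    for c in range(n):
        pivot = max(range(c, n), key=lambda r: abs(M[r][c]))
        if abs(M[pivot][c]) < 1e-12:
            return 0.0
        if pivot != c:
            M[c], M[pivot] = M[pivot], M[c]
            det = -det
        det *= M[c][c]
        for r in range(c + 1, n):
            factor = M[r][c] / M[c][c]
            for k in range(c, n):
                M[r][k] -= factor * M[c][k]
    return det

def star(lengths):
    """Chains of simple edges of the given lengths emanating from vertex 0; returns (vertex count, edges)."""
    edges, next_vertex = [], 1
    for length in lengths:
        previous = 0
        for _ in range(length):
            edges.append((previous, next_vertex, 1))
            previous, next_vertex = next_vertex, next_vertex + 1
    return next_vertex, edges

config_a = (5, [(0, 1, 1), (1, 2, 2), (2, 3, 1), (3, 4, 1)])  # chain with an interior double edge
config_b = star([2, 2, 2])
config_c = star([1, 2, 5])
e8 = star([1, 2, 4])                                           # allowed diagram, for comparison
dets = [determinant(gram_matrix(*cfg)) for cfg in (config_a, config_b, config_c, e8)]
```

The first three entries of `dets` vanish up to rounding error, in agreement with the linear dependencies exhibited in the proof, while the ${E_8}$ entry is strictly positive.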

Corollary 26 The Coxeter diagram of an irreducible admissible configuration must take one of the following forms:

• ${A_n}$: ${n}$ vertices joined in a chain of simple edges for some ${n \geq 1}$;
• ${BC_n}$: ${n}$ vertices joined in a chain of edges for some ${n \geq 2}$, with one boundary edge being a double edge and all other edges simple;
• ${D_n}$: Three chains of simple edges of length ${1,1,n-3}$ respectively for some ${n \geq 4}$, emanating from a single vertex;
• ${E_n}$: Three chains of simple edges of length ${1,2,n-4}$ respectively for some ${n=6,7,8}$, emanating from a single vertex;
• ${F_4}$: Four vertices joined in a chain of edges, with the middle edge being a double edge and the other two edges simple;
• ${G_2}$: Two vertices joined by a triple edge.

Now we return to root systems. Fixing a regular ${h}$, we define the Dynkin diagram to be the Coxeter diagram associated to the (unit vectors of the) simple roots, except that we orient the double or triple edges to point from the longer root to the shorter root. (Note from Lemma 19 that we know exactly what the ratio between lengths is in these cases; in particular, the Dynkin diagram describes the root system up to a unitary transformation and dilation.) We conclude

Corollary 27 The Dynkin diagram of an irreducible root system must take one of the following forms:

• ${A_n}$: ${n}$ vertices joined in a chain of simple edges for some ${n \geq 1}$;
• ${B_n}$: ${n}$ vertices joined in a chain of edges for some ${n \geq 2}$, with one boundary edge being a double edge (pointing outward) and all other edges simple;
• ${C_n}$: ${n}$ vertices joined in a chain of edges for some ${n \geq 3}$, with one boundary edge being a double edge (pointing inward) and all other edges simple;
• ${D_n}$: Three chains of simple edges of length ${1,1,n-3}$ respectively for some ${n \geq 4}$, emanating from a single vertex;
• ${E_n}$: Three chains of simple edges of length ${1,2,n-4}$ respectively for some ${n=6,7,8}$, emanating from a single vertex;
• ${F_4}$: Four vertices joined in a chain of edges, with the middle edge being a double (oriented) edge and the other two edges simple;
• ${G_2}$: Two vertices joined by a triple (oriented) edge.

This describes (up to isomorphism and dilation) the simple roots:

• ${A_n}$: The simple roots take the form ${e_i - e_{i+1}}$ for ${1 \leq i \leq n}$ in the space ${{\bf C}^{n+1}_0}$ of vectors whose coefficients sum to zero;
• ${B_n}$: The simple roots take the form ${e_i-e_{i+1}}$ for ${1 \leq i \leq n-1}$ and also ${e_n}$ in ${{\bf C}^n}$.
• ${C_n}$: The simple roots take the form ${e_i-e_{i+1}}$ for ${1 \leq i \leq n-1}$ and also ${2e_n}$ in ${{\bf C}^n}$.
• ${D_n}$: The simple roots take the form ${e_i-e_{i+1}}$ for ${1 \leq i \leq n-1}$ and also ${e_{n-1}+e_n}$ in ${{\bf C}^n}$.
• ${E_8}$: The simple roots take the form ${e_i-e_{i+1}}$ for ${1 \leq i \leq 6}$ and also ${e_6+e_7}$ and ${-\frac{1}{2}\sum_{i=1}^8 e_i}$ in ${{\bf C}^8}$.
• ${E_6,E_7}$: This system is obtained from ${E_8}$ by deleting the first one or two simple roots (and cutting down ${{\bf C}^8}$ appropriately)
• ${F_4}$: The simple roots take the form ${e_i-e_{i+1}}$ for ${1 \leq i \leq 2}$ and also ${e_3}$ and ${-\frac{1}{2} \sum_{i=1}^4 e_i}$ in ${{\bf C}^4}$.
• ${G_2}$: The simple roots take the form ${e_1-e_2}$, ${2e_2-e_1-e_3}$ in ${{\bf C}^3_0}$.
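One can recover the Coxeter diagram from a list of simple roots by computing ${4\cos^2}$ of the angle between each pair. The sketch below does this with exact rational arithmetic for a few of the systems above in low rank. For ${G_2}$ the second root is taken to be ${2e_2-e_1-e_3}$, the sign for which the two simple roots subtend the obtuse angle ${5\pi/6}$; this normalisation, like the function names, is an assumption of the sketch.

```python
from fractions import Fraction

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def coxeter_edges(simple):
    """Map each non-orthogonal pair (i, j), i < j, to its edge multiplicity 4 cos^2 of the angle."""
    edges = {}
    for i in range(len(simple)):
        for j in range(i + 1, len(simple)):
            ip = dot(simple[i], simple[j])
            if ip > 0:
                raise ValueError("simple roots must subtend right or obtuse angles")
            if ip != 0:
                norms = dot(simple[i], simple[i]) * dot(simple[j], simple[j])
                edges[(i, j)] = Fraction(4) * ip * ip / norms
    return edges

half = Fraction(-1, 2)
b3 = [(1, -1, 0), (0, 1, -1), (0, 0, 1)]
c3 = [(1, -1, 0), (0, 1, -1), (0, 0, 2)]
d4 = [(1, -1, 0, 0), (0, 1, -1, 0), (0, 0, 1, -1), (0, 0, 1, 1)]
f4 = [(1, -1, 0, 0), (0, 1, -1, 0), (0, 0, 1, 0), (half, half, half, half)]
g2 = [(1, -1, 0), (-1, 2, -1)]

for name, simple in [("B3", b3), ("C3", c3), ("D4", d4), ("F4", f4), ("G2", g2)]:
    print(name, {pair: int(m) for pair, m in coxeter_edges(simple).items()})
```

Note that ${B_3}$ and ${C_3}$ produce the same Coxeter diagram; they are distinguished only by which endpoint of the double edge is the longer root, which is the extra information recorded by the Dynkin diagram.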

Remark 6 A slightly different way to reach the classification is to replace the Dynkin diagram by the extended Dynkin diagram in which one also adds the maximal negative root in addition to the simple roots; this breaks the linear independence, but one can then label each vertex by the coefficient in the linear combination needed to make the roots sum to zero, and one can then analyse these multiplicities to classify the possible diagrams and thence the root systems.

Now we show how the simple roots can be used to recover the entire root system. Define the Weyl group ${W}$ to be the group generated by all the reflections ${s_\alpha}$ coming from all the roots ${\alpha}$; as the roots span ${{\bf R}^n}$ and obey axiom (ii), the Weyl group acts faithfully on the finite set ${\Phi}$ and is thus itself finite.

Lemma 28 Let ${h}$ be regular, and let ${h'}$ be any element of ${{\bf R}^n}$. Then there exists ${w \in W}$ such that ${\langle w(h'), \alpha \rangle \geq 0}$ for all ${h}$-simple roots ${\alpha}$ (or equivalently, for all ${h}$-positive roots ${\alpha}$). In particular, if ${h'}$ is regular, then ${\Phi_{w(h')}^+ = \Phi_h^+}$, so that all ${h}$-simple roots are ${w(h')}$-simple and vice versa.

Furthermore, every root can be mapped by an element of ${W}$ to an ${h}$-simple root.

Finally, ${W}$ is generated by the reflections ${s_\alpha}$ coming from the ${h}$-simple roots ${\alpha}$.

Proof: Let ${\alpha}$ be a simple root. The action of the reflection ${s_\alpha}$ maps ${\alpha}$ to ${-\alpha}$, and maps all other simple roots ${\beta}$ to ${\beta+m\alpha}$ for some non-negative ${m}$ (since ${\alpha,\beta}$ subtend a right or obtuse angle). In particular, we see that ${s_\alpha}$ maps all positive roots other than ${\alpha}$ to positive roots, and hence (as ${s_\alpha}$ is an involution)

$\displaystyle s_\alpha(\Phi_h^+) = \Phi_h^+ \cup \{-\alpha\} \backslash \{\alpha\}.$

In particular, if we define ${\rho := \frac{1}{2} \sum_{\beta \in \Phi_h^+} \beta}$, then

$\displaystyle s_\alpha(\rho) = \rho-\alpha \ \ \ \ \ (17)$

for all simple roots ${\alpha}$.

Let ${W_h}$ be the subgroup of ${W}$ generated by the ${s_\alpha}$ for the simple roots ${\alpha}$, and choose ${w \in W_h}$ to maximise ${\langle w(h'), \rho \rangle}$ (which is possible as ${W_h}$ is finite). Then from (17) we have ${\langle w(h'),\alpha \rangle \geq 0}$ for every simple root ${\alpha}$, since otherwise replacing ${w}$ by ${s_\alpha w}$ would increase ${\langle w(h'), \rho \rangle}$ by ${-\langle w(h'),\alpha \rangle > 0}$; this gives the first claim. Since every root ${\alpha}$ is ${h'}$-simple for some regular ${h'}$ (by selecting ${h'}$ to very nearly be orthogonal to ${\alpha}$), we conclude that every root can be mapped by an element of ${W_h}$ to a ${h}$-simple root, giving the second claim. Thus for any root ${\beta}$, ${s_\beta}$ is conjugate in ${W_h}$ to a reflection ${s_\alpha}$ for a ${h}$-simple root ${\alpha}$, so ${s_\beta}$ lies in ${W_h}$ and so ${W=W_h}$, giving the final claim. $\Box$
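The identity (17) used in the proof above can be checked by hand in small cases. The following sketch does so for the root system of ${\mathfrak{sl}_4}$, taking the positive roots to be ${e_i - e_j}$ with the first index smaller than the second (as in Example 3); the helper names are illustrative.

```python
from fractions import Fraction

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def reflect(alpha, beta):
    """s_alpha(beta) = beta - (2<alpha,beta>/<alpha,alpha>) alpha."""
    c = Fraction(2) * dot(alpha, beta) / dot(alpha, alpha)
    return tuple(b - c * a for a, b in zip(alpha, beta))

def difference(n, i, j):
    """The vector e_i - e_j in dimension n (0-based indices)."""
    v = [0] * n
    v[i], v[j] = 1, -1
    return tuple(v)

def positive_roots_sl(n):
    return [difference(n, i, j) for i in range(n) for j in range(i + 1, n)]

def simple_roots_sl(n):
    return [difference(n, i, i + 1) for i in range(n - 1)]

def half_sum(positive):
    """rho = (1/2) * (sum of the positive roots)."""
    dim = len(positive[0])
    return tuple(Fraction(sum(a[k] for a in positive), 2) for k in range(dim))

rho = half_sum(positive_roots_sl(4))
identity_17_holds = all(
    reflect(alpha, rho) == tuple(r - a for r, a in zip(rho, alpha))
    for alpha in simple_roots_sl(4)
)
```

In this example ${\rho = (3/2, 1/2, -1/2, -3/2)}$ and ${\langle \rho, \alpha \rangle = 1}$ for each simple root ${\alpha}$. For the non-simple positive root ${e_1 - e_3}$ the reflection instead sends ${\rho}$ to ${\rho - 2(e_1-e_3)}$, so the restriction to simple roots in (17) is needed.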

Remark 7 The set of all ${h'}$ for which ${\Phi_{h'}^+ = \Phi_h^+}$ is known as the Weyl chamber associated to ${h}$; this is an open polyhedral cone in ${{\bf R}^n}$, and the above lemma shows that it is the interior of a fundamental domain of the action of the Weyl group. In the case of the special linear group, the standard Weyl chamber (in ${{\bf R}^n_0}$ now instead of ${{\bf R}^n}$) would be the set of vectors ${h' \in {\bf R}^n_0}$ with decreasing coefficients.

From the above lemma we can reconstruct the root system from the simple roots by using the reflections ${s_\alpha}$ associated to the simple roots to generate the Weyl group ${W}$, and then applying the Weyl group to the simple roots to recover all the roots. Note that the lemma also shows that the set of ${h}$-simple roots and ${h'}$-simple roots are isomorphic for any regular ${h,h'}$, so that the Dynkin diagram is indeed independent (up to isomorphism) of the choice of regular element ${h}$ as claimed earlier. We have thus in principle described the irreducible root systems (up to isomorphism) as coming from the Dynkin diagrams ${A_n,B_n,C_n,D_n,E_6,E_7,E_8,F_4,G_2}$; see for instance the Wikipedia page on root systems for explicit descriptions of all of these. With these explicit descriptions one can verify that all of these systems are indeed irreducible root systems.
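The reconstruction procedure just described can be sketched as a closure computation: starting from the simple roots, keep applying the simple reflections until no new vectors appear. By Lemma 28 the resulting set is the whole root system. The code below is a minimal illustration with hypothetical names, run on simple roots of types ${A_2}$, ${B_2}$ and ${G_2}$ (the ${G_2}$ pair being ${e_1-e_2}$ and ${2e_2-e_1-e_3}$, chosen to subtend an obtuse angle).

```python
from fractions import Fraction

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def reflect(alpha, beta):
    """s_alpha(beta) = beta - (2<alpha,beta>/<alpha,alpha>) alpha."""
    c = Fraction(2) * dot(alpha, beta) / dot(alpha, alpha)
    return tuple(b - c * a for a, b in zip(alpha, beta))

def generate_roots(simple):
    """Orbit of the simple roots under the group generated by the simple reflections."""
    roots = {tuple(a) for a in simple}
    frontier = list(roots)
    while frontier:
        beta = frontier.pop()
        for alpha in simple:
            image = reflect(alpha, beta)
            if image not in roots:
                roots.add(image)
                frontier.append(image)
    return roots

a2 = [(1, -1, 0), (0, 1, -1)]
b2 = [(1, -1), (0, 1)]
g2 = [(1, -1, 0), (-1, 2, -1)]
sizes = {name: len(generate_roots(s)) for name, s in [("A2", a2), ("B2", b2), ("G2", g2)]}
```

Here `sizes` records 6, 8 and 12 roots respectively. The loop terminates because the root system is finite, and the exact arithmetic of `Fraction` ensures that reflected vectors compare equal to the ones already found.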

— 8. Chevalley bases —

Now that we have described root systems, we use them to reconstruct Lie algebras. We first begin with an abstract uniqueness result that shows that a simple Lie algebra is determined up to isomorphism by its root system.

Theorem 29 (Root system uniquely determines a simple Lie algebra) Let ${{\mathfrak g}, \tilde {\mathfrak g}}$ be simple Lie algebras with Cartan subalgebras ${{\mathfrak h}}$, ${\tilde {\mathfrak h}}$ and root systems ${\Phi \subset {\mathfrak h}^*}$, ${\tilde \Phi \subset \tilde {\mathfrak h}^*}$. Suppose that one can identify ${{\mathfrak h}}$ with ${\tilde {\mathfrak h}}$ as vector spaces in such a way that the root systems agree: ${\Phi = \tilde \Phi}$. Then the identification between ${{\mathfrak h}}$ and ${\tilde {\mathfrak h}}$ can be extended to an identification of ${{\mathfrak g}}$ and ${\tilde {\mathfrak g}}$ as Lie algebras.

Proof: First we note from (11) and the identification ${\Phi=\tilde \Phi}$ that the Killing forms on ${{\mathfrak h}}$ and ${\tilde {\mathfrak h}}$ agree, so we will identify ${{\mathfrak h}, \tilde {\mathfrak h}}$ as Hilbert spaces, not just as vector spaces.

The strategy will be to exploit a Lie algebra version of the Goursat lemma (or the Schur lemma), finding a sufficiently “non-degenerate” subalgebra ${{\mathfrak k}}$ of ${{\mathfrak g} \oplus \tilde {\mathfrak g}}$ and using the simple nature of ${{\mathfrak g}}$ and ${\tilde {\mathfrak g}}$ to show that this subalgebra is the graph of an isomorphism from ${{\mathfrak g}}$ to ${\tilde {\mathfrak g}}$. The argument will follow the same general approach used in Theorem 16, namely to start with a “highest weight” space and apply lowering operators to discover the required graph.

We turn to the details. Pick a regular element ${h}$ of ${{\mathfrak h} =\tilde {\mathfrak h}}$, so that one has a notion of a positive root. For every simple root ${\alpha}$, we select non-zero elements ${X_\alpha, Y_\alpha}$ of ${{\mathfrak g}^{\mathfrak h}_\alpha, {\mathfrak g}^{\mathfrak h}_{-\alpha}}$ respectively such that

$\displaystyle [X_\alpha,Y_\alpha] = H_\alpha \ \ \ \ \ (18)$

where ${H_\alpha}$ is the co-root of ${\alpha}$; similarly select ${\tilde X_\alpha, \tilde Y_\alpha}$ in ${\tilde {\mathfrak g}^{\mathfrak h}_\alpha, \tilde {\mathfrak g}^{\mathfrak h}_{-\alpha}}$, and set ${X'_\alpha := X_\alpha \oplus \tilde X_\alpha}$ and ${Y'_\alpha := Y_\alpha \oplus \tilde Y_\alpha}$. Let ${{\mathfrak k}}$ be the subalgebra of ${{\mathfrak g} \oplus \tilde {\mathfrak g}}$ generated by the ${X'_\alpha}$ and ${Y'_\alpha}$. It is not hard to see that the ${X_\alpha, Y_\alpha}$ generate ${{\mathfrak g}}$ as a Lie algebra, so ${{\mathfrak k}}$ surjects onto ${{\mathfrak g}}$; similarly ${{\mathfrak k}}$ surjects onto ${\tilde {\mathfrak g}}$.

Let ${\beta}$ be a maximal root, that is to say a root such that ${\beta+\alpha}$ is not a root for any positive ${\alpha}$; such a root always exists. (It is in fact unique, though we will not need this fact here.) Then we have one-dimensional spaces ${{\mathfrak g}^{\mathfrak h}_\beta}$ and ${\tilde {\mathfrak g}^{\mathfrak h}_\beta}$, and thus a two-dimensional subspace ${{\mathfrak g}^{\mathfrak h}_\beta \oplus \tilde {\mathfrak g}^{\mathfrak h}_\beta}$ in ${{\mathfrak g} \oplus \tilde {\mathfrak g}}$. Inside this subspace, we select a one-dimensional subspace ${L}$ which is not equal to ${{\mathfrak g}^{\mathfrak h}_\beta \oplus 0}$ or ${0 \oplus \tilde {\mathfrak g}^{\mathfrak h}_\beta}$; in particular, ${L}$ is not contained in ${{\mathfrak g} \oplus 0}$ or ${0 \oplus \tilde {\mathfrak g}}$.

Let ${{\mathfrak l}}$ be the subspace of ${{\mathfrak g} \oplus \tilde {\mathfrak g}}$ generated by ${L}$ and the adjoint action of the lowering operators ${Y'_\alpha}$, thus it is spanned by elements of the form

$\displaystyle \hbox{ad} Y'_{\alpha_1} \ldots \hbox{ad} Y'_{\alpha_k} x \ \ \ \ \ (19)$

for simple roots ${\alpha_1,\ldots,\alpha_k}$ and ${x \in L}$. Then ${{\mathfrak l}}$ contains ${L}$ and is thus not contained in ${{\mathfrak g} \oplus 0}$ or ${0 \oplus \tilde {\mathfrak g}}$; because (19) only involves lowering operators, we also see that ${{\mathfrak l}}$ does not contain any element of ${{\mathfrak g}^{\mathfrak h}_\beta \oplus \tilde {\mathfrak g}^{\mathfrak h}_\beta}$ other than those in ${L}$. In particular, ${{\mathfrak l}}$ is not all of ${{\mathfrak g} \oplus \tilde {\mathfrak g}}$.

Clearly ${{\mathfrak l}}$ is closed under the adjoint action of the lowering operators ${Y'_\alpha}$. We claim that it is also closed under the adjoint action of the raising operators ${X'_\alpha}$. To see this, first observe that ${X'_\alpha, Y'_\gamma}$ commute when ${\alpha,\gamma}$ are distinct simple roots, because ${\alpha-\gamma}$ cannot be a root (since this would make one of ${\alpha,\gamma}$ non-simple). Next, from (18) we see that ${\hbox{ad} X'_\alpha \hbox{ad} Y'_\alpha}$ acts as a scalar on any element of the form (19), while from the maximality of ${\beta}$ we see that ${\hbox{ad} X'_\alpha}$ annihilates ${x}$. From this the claim easily follows.

As ${{\mathfrak l}}$ is closed under the adjoint action of both the ${X'_\alpha}$ and the ${Y'_\alpha}$, we have ${[{\mathfrak k},{\mathfrak l}] \subset {\mathfrak l}}$. Projecting onto ${{\mathfrak g}}$, we see that the projection of ${{\mathfrak l}}$ is an ideal of ${{\mathfrak g}}$, and is hence ${0}$ or ${{\mathfrak g}}$ as ${{\mathfrak g}}$ is simple. As ${{\mathfrak l}}$ is not contained in ${0 \oplus \tilde {\mathfrak g}}$, we see that ${{\mathfrak l}}$ surjects onto ${{\mathfrak g}}$; similarly it surjects onto ${\tilde {\mathfrak g}}$. An analogous argument shows that the intersection of ${{\mathfrak l}}$ with ${{\mathfrak g} \oplus 0}$ is either ${0}$ or ${{\mathfrak g} \oplus 0}$; the latter would force ${{\mathfrak l} = {\mathfrak g} \oplus \tilde {\mathfrak g}}$ by the surjective projection onto ${\tilde {\mathfrak g}}$, which was already ruled out. Thus ${{\mathfrak l}}$ has trivial intersection with ${{\mathfrak g} \oplus 0}$, and similarly with ${0 \oplus \tilde {\mathfrak g}}$, and is thus a graph. Such a graph cannot be an ideal of ${{\mathfrak g} \oplus \tilde {\mathfrak g}}$, so that ${{\mathfrak k} \neq {\mathfrak g} \oplus \tilde {\mathfrak g}}$. As ${{\mathfrak k}}$ was a subalgebra that surjected onto both ${{\mathfrak g}}$ and ${\tilde {\mathfrak g}}$, we conclude by arguing as before that ${{\mathfrak k}}$ is also a graph; as ${{\mathfrak k}}$ is a Lie algebra, the graph is that of a Lie algebra isomorphism by the Lie algebra closed graph theorem (see this previous blog post). Since ${[X'_\alpha,Y'_\alpha] = H_\alpha \oplus H_\alpha}$, we see that ${{\mathfrak k}}$ restricts to the graph of the identity on ${{\mathfrak h}}$, and the claim follows. $\Box$
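Several ingredients of the above proof can be checked by hand in a small example. The sketch below works inside a single copy of ${\mathfrak{sl}_3}$ (rather than in ${{\mathfrak g} \oplus \tilde {\mathfrak g}}$), assuming the standard realisation by traceless ${3 \times 3}$ matrices with diagonal Cartan subalgebra and simple roots ${\alpha = e_1 - e_2}$, ${\gamma = e_2 - e_3}$; the matrix units chosen for ${X_\alpha, Y_\alpha}$ and all helper names are our own illustrative choices.

```python
N = 3


def unit(i, j):
    """Matrix unit with a single 1 in row i, column j (0-indexed)."""
    return [[1 if (r, c) == (i, j) else 0 for c in range(N)] for r in range(N)]


def mul(A, B):
    return [[sum(A[r][k] * B[k][c] for k in range(N)) for c in range(N)]
            for r in range(N)]


def bracket(A, B):
    AB, BA = mul(A, B), mul(B, A)
    return [[AB[r][c] - BA[r][c] for c in range(N)] for r in range(N)]


ZERO = [[0] * N for _ in range(N)]

X = {"alpha": unit(0, 1), "gamma": unit(1, 2)}  # raising operators
Y = {"alpha": unit(1, 0), "gamma": unit(2, 1)}  # lowering operators
H = {
    "alpha": [[1, 0, 0], [0, -1, 0], [0, 0, 0]],
    "gamma": [[0, 0, 0], [0, 1, 0], [0, 0, -1]],
}
E_MAX = unit(0, 2)  # spans the root space of the maximal root alpha + gamma

checks = {
    "relation (18)": all(bracket(X[s], Y[s]) == H[s] for s in X),
    "X and Y commute for distinct simple roots":
        bracket(X["alpha"], Y["gamma"]) == ZERO
        and bracket(X["gamma"], Y["alpha"]) == ZERO,
    "raising operators annihilate the maximal root space":
        all(bracket(X[s], E_MAX) == ZERO for s in X),
    "lowering operators move it to other root spaces":
        bracket(Y["alpha"], E_MAX) == unit(1, 2)
        and bracket(Y["gamma"], E_MAX) == [[0, -1, 0], [0, 0, 0], [0, 0, 0]],
}

if __name__ == "__main__":
    for name, ok in checks.items():
        print(name, ok)
```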

Remark 8 The above arguments show that every root can be obtained from the maximal root by iteratively subtracting off simple roots (while staying in ${\Phi \cup \{0\}}$), which among other things implies that the maximal root is unique. These facts can also be established directly from the axioms of a root system (or from the classification of root systems), but we will not do so here. By using Theorem 29, one can convert graph automorphisms of the Dynkin diagram (e.g. the automorphism sending the ${A_n}$ Dynkin diagram to its inverse, or the triality automorphism that rotates the ${D_4}$ diagram) to automorphisms of the Lie algebra; these are important in the theory of twisted groups of Lie type, and more specifically the Steinberg groups and Suzuki-Ree groups, but will not be discussed further here.
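The first assertion of Remark 8 can be tested directly on a small root system. The following sketch does this for ${G_2}$, under the assumption that the simple roots are given the coordinates ${(1,-1,0)}$ and ${(-2,1,1)}$; the helper names are ours, and positive roots are built by adding simple roots to simple roots, which is a standard description that the text does not spell out.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def reflect(alpha, v):
    coeff, rem = divmod(2 * dot(v, alpha), dot(alpha, alpha))
    assert rem == 0
    return tuple(x - coeff * a for x, a in zip(v, alpha))


def closure(start, moves, allowed=None):
    """All points reachable from `start` by repeatedly applying `moves`,
    optionally staying inside the set `allowed`."""
    seen, frontier = set(start), list(start)
    while frontier:
        v = frontier.pop()
        for move in moves:
            w = move(v)
            if w not in seen and (allowed is None or w in allowed):
                seen.add(w)
                frontier.append(w)
    return seen


def shift_by(alpha, sign):
    return lambda v: tuple(x + sign * a for x, a in zip(v, alpha))


SIMPLE = [(1, -1, 0), (-2, 1, 1)]  # assumed coordinates for G2
ROOTS = closure(SIMPLE, [lambda v, a=a: reflect(a, v) for a in SIMPLE])
POSITIVE = closure(SIMPLE, [shift_by(a, +1) for a in SIMPLE], allowed=ROOTS)

# Maximal roots: adding any positive root leaves the root system.
MAXIMAL = [b for b in POSITIVE
           if all(shift_by(a, +1)(b) not in ROOTS for a in POSITIVE)]

# Subtract simple roots from the maximal root, staying in Phi union {0}.
ZERO = (0, 0, 0)
LOWERED = closure(MAXIMAL[:1], [shift_by(a, -1) for a in SIMPLE],
                  allowed=ROOTS | {ZERO})

if __name__ == "__main__":
    print("maximal roots:", MAXIMAL)
    print("all roots reached:", ROOTS <= LOWERED)
```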

Remark 9 In a converse direction, once one establishes that in an irreducible root system ${\Phi}$ that every root can be obtained from the maximal root by subtracting off simple roots (while staying in ${\Phi \cup \{0\}}$), this shows that any Lie algebra ${{\mathfrak g}}$ associated to this system is necessarily simple. Indeed, given any non-trivial ideal ${{\mathfrak h}}$ in ${{\mathfrak g}}$ and a non-trivial element ${x}$ of ${{\mathfrak h}}$, one locates a minimal element of ${\Phi \cup \{0\}}$ in which ${x}$ has a non-trivial component, then iteratively applies raising operators to then locate a non-trivial element of the root space of the maximal root in ${{\mathfrak h}}$; if one then applies lowering operators one recovers all the other root spaces, so that ${{\mathfrak h}={\mathfrak g}}$.

Theorem 29, when combined with the results from previous sections, already gives Theorem 2, but without a fully explicit way to determine the Lie algebras ${A_n,B_n,C_n,D_n,E_6,E_7,E_8,F_4,G_2}$ listed in that theorem (or even to establish whether these systems exist at all). In the case of the classical Lie algebras ${A_n,B_n,C_n,D_n}$, one can explicitly describe these algebras in terms of the special linear algebras ${\mathfrak{sl}_n}$, special orthogonal algebras ${\mathfrak{so}_n}$, and symplectic algebras ${\mathfrak{sp}_n}$, but this does not give too much guidance as to how to explicitly describe the exceptional Lie algebras ${E_6,E_7,E_8,F_4,G_2}$. We now turn to the question of how to explicitly describe all the simple Lie algebras in a unified fashion.

Let ${{\mathfrak g}}$ be a simple Lie algebra, with Cartan subalgebra ${{\mathfrak h}}$. We view ${{\mathfrak h}}$ as a Hilbert space with the Killing form, and then identify this space with its dual ${{\mathfrak h}^*}$. Thus for instance the coroot ${H_\alpha}$ of a root ${\alpha \in {\mathfrak h}^* \equiv {\mathfrak h}}$ is now given by the simpler formula

$\displaystyle H_\alpha = \frac{2}{\langle\alpha,\alpha\rangle} \alpha. \ \ \ \ \ (20)$

Let ${\Phi \subset {\mathfrak h}^* \equiv {\mathfrak h}}$ be the root system, which is irreducible. As described in Section 6, we have the vector space decomposition

$\displaystyle {\mathfrak g} \equiv {\mathfrak h} \oplus \bigoplus_{\alpha \in \Phi} {\mathfrak g}^{\mathfrak h}_\alpha$

where the spaces ${{\mathfrak g}^{\mathfrak h}_\alpha}$ are one-dimensional, thus we can choose a generator ${E_\alpha}$ for each ${{\mathfrak g}^{\mathfrak h}_\alpha}$, though we have the freedom to multiply each ${E_\alpha}$ by a non-zero complex constant, which we will take advantage of to perform various normalisations. A basis for ${{\mathfrak h}}$, together with the ${E_\alpha}$, then forms a basis for ${{\mathfrak g}}$, known as a Cartan-Weyl basis for this Lie algebra. From (11), (20) we have

$\displaystyle [H_\alpha, E_\beta] = A_{\alpha,\beta} E_\beta$

where ${A_{\alpha,\beta}}$ is the quantity

$\displaystyle A_{\alpha,\beta} := \frac{2\langle \alpha,\beta\rangle}{\langle \alpha,\alpha\rangle}$

which is always an integer because ${\Phi}$ is a root system (indeed ${A_{\alpha,\beta}}$ takes values in ${\{ 0, \pm 1, \pm 2, \pm 3 \}}$, and these values form an interesting matrix known as the Cartan matrix).
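As a small worked instance of the formula for ${A_{\alpha,\beta}}$, the sketch below tabulates these integers when ${\alpha,\beta}$ range over the simple roots only (the usual convention for the Cartan matrix, assumed here). The coordinates for the simple roots of ${A_2}$, ${B_2}$, ${G_2}$ and the function name `cartan_matrix` are our own choices.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def cartan_matrix(simple):
    """Entry [i][j] is 2<a_i, a_j> / <a_i, a_i>, the convention of the text."""
    matrix = []
    for a in simple:
        row = []
        for b in simple:
            entry, rem = divmod(2 * dot(a, b), dot(a, a))
            # Integrality is part of the definition of a root system.
            assert rem == 0
            row.append(entry)
        matrix.append(row)
    return matrix


# Assumed coordinates for the simple roots (one standard choice).
SIMPLE = {
    "A2": [(1, -1, 0), (0, 1, -1)],
    "B2": [(1, -1), (0, 1)],
    "G2": [(1, -1, 0), (-2, 1, 1)],
}

if __name__ == "__main__":
    for name, simple in SIMPLE.items():
        print(name, cartan_matrix(simple))
```

The asymmetry of the ${B_2}$ and ${G_2}$ matrices reflects the presence of two root lengths; the extreme value ${-3}$ only occurs for ${G_2}$.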

As discussed in Section 6, ${[E_\alpha, E_{-\alpha}]}$ is a multiple of the coroot ${H_\alpha}$; by adjusting ${E_\alpha,E_{-\alpha}}$ for each pair ${\{\alpha,-\alpha\}}$ we may normalise things so that

$\displaystyle [E_\alpha, E_{-\alpha}] = H_\alpha \ \ \ \ \ (21)$

for all ${\alpha}$ (here we use the fact that ${H_{-\alpha} = -H_\alpha}$ to avoid inconsistency). Next, we see from (19) that

$\displaystyle [E_\alpha, E_\beta] = 0$

if ${\alpha+\beta \not \in \Phi \cup \{0\}}$, and

$\displaystyle [E_\alpha, E_\beta] = N_{\alpha,\beta} E_{\alpha+\beta} \ \ \ \ \ (22)$

for some complex number ${N_{\alpha,\beta}}$ if ${\alpha+\beta \in \Phi}$. By considering the action of ${E_\alpha}$ on (16) using Theorem 16 one can verify that ${N_{\alpha,\beta}}$ is non-zero; however, its value is not yet fully determined because there is still residual freedom to normalise the ${E_\alpha}$. Indeed, one has the freedom to multiply ${E_\alpha}$ by any non-zero complex scalar ${c_\alpha}$ as long as ${c_{-\alpha} = c_\alpha^{-1}}$ (to preserve the normalisation (21)), in which case the structure constant ${N_{\alpha,\beta}}$ gets transformed according to the law

$\displaystyle N_{\alpha,\beta} \mapsto \frac{c_\alpha c_\beta}{c_{\alpha+\beta}} N_{\alpha,\beta}. \ \ \ \ \ (23)$

However, observe that the combined structure constant ${N_{\alpha,\beta} N_{-\alpha,-\beta}}$ is unchanged by this rescaling. And indeed there is an explicit formula for this quantity:

Lemma 30 For any roots ${\alpha,\beta}$ with ${\alpha+\beta \in \Phi}$, one has

$\displaystyle N_{\alpha,\beta} N_{-\alpha,-\beta} = (r+1)^2$

where ${\beta-r\alpha,\ldots,\beta,\ldots,\beta+q\alpha}$ are the string of roots of the form ${\beta + m\alpha}$ for integer ${m}$.

This formula can be confirmed by an explicit computation using Theorem 16 (using, say, the standard basis for ${P_n}$ to select ${E_{\beta+m\alpha}}$, which then fixes ${E_{-\beta-m\alpha}}$ by (21)); we omit the details.
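The integers ${r,q}$ appearing in Lemma 30 are easy to compute from the root system alone. The sketch below does so for ${G_2}$ (with assumed coordinates for the simple roots and helper names of our own), and also checks the standard relation ${r - q = A_{\alpha,\beta}}$ between the string and the integers ${A_{\alpha,\beta}}$ defined above; that relation is a known fact about root strings rather than something stated in the text.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def reflect(alpha, v):
    coeff, rem = divmod(2 * dot(v, alpha), dot(alpha, alpha))
    assert rem == 0
    return tuple(x - coeff * a for x, a in zip(v, alpha))


def roots_from_simple(simple):
    roots, frontier = set(simple), list(simple)
    while frontier:
        v = frontier.pop()
        for alpha in simple:
            w = reflect(alpha, v)
            if w not in roots:
                roots.add(w)
                frontier.append(w)
    return roots


def root_string(alpha, beta, roots):
    """Return (r, q) such that beta - r*alpha, ..., beta + q*alpha
    is the string of roots of the form beta + m*alpha."""
    def count(sign):
        steps = 0
        v = tuple(b + sign * a for a, b in zip(alpha, beta))
        while v in roots:
            steps += 1
            v = tuple(x + sign * a for a, x in zip(alpha, v))
        return steps
    return count(-1), count(+1)


ALPHA, BETA = (1, -1, 0), (-2, 1, 1)  # assumed simple roots of G2
ROOTS = roots_from_simple([ALPHA, BETA])

if __name__ == "__main__":
    for a in sorted(ROOTS):
        for b in sorted(ROOTS):
            if tuple(x + y for x, y in zip(a, b)) in ROOTS:
                r, q = root_string(a, b, ROOTS)
                print(a, b, "r =", r, "q =", q, "r + 1 =", r + 1)
```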

On the other hand, we have the following clever renormalisation trick of Chevalley, exploiting the abstract isomorphism from Theorem 29:

Lemma 31 (Chevalley normalisation) There exist choices of ${E_\alpha}$ such that

$\displaystyle N_{\alpha,\beta} = N_{-\alpha,-\beta}$

for all roots ${\alpha,\beta}$ with ${\alpha+\beta \in \Phi}$.

Proof: We first select ${E_\alpha}$ arbitrarily, then we will have

$\displaystyle N_{\alpha,\beta} = a_{\alpha,\beta} N_{-\alpha,-\beta}$

for some non-zero ${a_{\alpha,\beta}}$ for all roots ${\alpha,\beta}$. The plan is then to locate coefficients ${c_\alpha}$ so that the transformation (23) eliminates all of the ${a_{\alpha,\beta}}$ factors.

To do this, observe that we may identify ${{\mathfrak h}}$ with itself and ${\Phi}$ with itself via the negation map ${x \mapsto -x}$ for ${x \in {\mathfrak h}}$ and ${\alpha \mapsto -\alpha}$ for ${\alpha \in \Phi}$. From this and Theorem 29, we may find a Lie algebra isomorphism ${\phi: {\mathfrak g} \rightarrow {\mathfrak g}}$ that maps ${x}$ to ${-x}$ on ${{\mathfrak h}}$, and thus maps ${{\mathfrak g}^{\mathfrak h}_\alpha}$ to ${{\mathfrak g}^{\mathfrak h}_{-\alpha}}$ for any root ${\alpha}$. In particular, we have

$\displaystyle \phi( E_\alpha ) = b_\alpha E_{-\alpha}$

for some non-zero coefficients ${b_\alpha}$; from (21) we see in particular that

$\displaystyle b_\alpha b_{-\alpha} = 1. \ \ \ \ \ (24)$

If we then apply ${\phi}$ to (22), we conclude that

$\displaystyle b_\alpha b_\beta N_{-\alpha,-\beta} = b_{\alpha+\beta} N_{\alpha,\beta}$

when ${\alpha+\beta}$ is a root, so that ${a_{\alpha,\beta}}$ takes the special form

$\displaystyle a_{\alpha,\beta} = \frac{b_\alpha b_\beta}{b_{\alpha+\beta}}.$

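If we then select ${c_\alpha}$ with ${c_{-\alpha} = c_\alpha^{-1}}$ so that

$\displaystyle c_{-\alpha} = b_\alpha c_\alpha$

for all roots ${\alpha}$ (that is, ${c_\alpha}$ is a square root of ${b_\alpha^{-1}}$; one can choose such a square root for each positive ${\alpha}$ and set ${c_{-\alpha} := c_\alpha^{-1}}$, which is consistent thanks to (24)), then the transformation (23) multiplies ${a_{\alpha,\beta}}$ by ${(c_\alpha c_\beta / c_{\alpha+\beta})^2 = b_{\alpha+\beta}/(b_\alpha b_\beta)}$, and thus eliminates ${a_{\alpha,\beta}}$ as desired. $\Box$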

From the above two lemmas, we see that we can select a special Cartan-Weyl basis, known as a Chevalley basis, such that

$\displaystyle [E_\alpha, E_\beta] = \pm (r+1) E_{\alpha+\beta} \ \ \ \ \ (25)$

whenever ${\alpha+\beta}$ is a root; in particular, the structure constants ${N_{\alpha,\beta}}$ are all integers, which is a crucial fact when one wishes to construct Lie algebras and Chevalley groups over fields of arbitrary characteristic. This comes very close to fully describing the Lie algebra structure associated to a given Dynkin diagram, except that one still has to select the signs ${\pm}$ in (25) so that one actually gets a Lie algebra (i.e. that the Jacobi identity (1) is obeyed). This turns out to be non-trivial; see this paper of Tits for details. (There are other approaches to demonstrate existence of a Lie algebra associated to a given root system; one popular one proceeds using the Chevalley-Serre relations, see e.g. this text of Serre. There is still a certain amount of freedom to select the signs, but this ambiguity can be described precisely; see the book of Carter for details.) Among other things, this construction shows that every root system actually creates a Lie algebra (thus far we have only established uniqueness, not existence), though once one has the classification one could also build a Lie algebra explicitly for each Dynkin diagram by hand (in particular, one can build the simply laced classical Lie algebras ${A_n, D_n}$ and the maximal simply laced exceptional algebra ${E_8}$, and construct the remaining Lie algebras by taking fixed points of suitable involutions; see e.g. these notes of Borcherds et al. for this approach).

— 9. Appendix: Casimirs and complete reducibility —

In this appendix we supply a proof of the following fact, used in the proof of Corollary 9:

Theorem 32 (Weyl’s complete reducibility theorem) Let ${{\mathfrak g} \subset {\mathfrak{gl}}(V)}$ be a simple Lie algebra, and let ${W}$ be a ${{\mathfrak g}}$-invariant subspace of ${V}$. Then there exists a complementary ${{\mathfrak g}}$-invariant subspace ${W'}$ such that ${V = W \oplus W'}$.

Among other things, Weyl’s complete reducibility theorem shows that every finite-dimensional linear representation of ${{\mathfrak g}}$ splits into the direct sum of irreducible representations, which explains the terminology. The claim is also true for semisimple Lie algebras ${{\mathfrak g}}$, but we will only need the simple case here, which allows for some minor simplifications to the argument.

The proof of this theorem requires a variant ${B: {\mathfrak g} \times {\mathfrak g} \rightarrow {\bf C}}$ of the Killing form associated to ${V}$, defined by the formula

$\displaystyle B(x,y) := \hbox{tr}( xy ), \ \ \ \ \ (26)$

and a certain element of ${{\mathfrak{gl}}(V)}$ associated to this form known as the Casimir operator. We first need to establish a variant of Theorem 1:

Proposition 33 With the hypotheses of Theorem 32, ${B}$ is non-degenerate.

Proof: This is a routine modification of Proposition 6 (one simply omits the use of the adjoint representation). $\Box$

Once one establishes non-degeneracy, one can then define the Casimir operator ${C \in {\mathfrak{gl}}(V)}$ by setting

$\displaystyle C:= \sum_{i=1}^n e_i f_i$

whenever ${e_1,\ldots,e_n}$ is a basis of ${{\mathfrak g}}$ and ${f_1,\ldots,f_n}$ is its dual basis, thus ${B(e_i,f_j) = \delta_{ij}}$. It is easy to see that this definition does not depend on the choice of basis, which in turn (by infinitesimally conjugating both bases by an element ${x}$ of the algebra ${{\mathfrak g}}$) implies that ${C}$ commutes with every element ${x}$ of ${{\mathfrak g}}$.

On the other hand, ${C}$ does not vanish entirely. Indeed, taking traces and using (26) we see that

$\displaystyle \hbox{tr}(C) = \hbox{dim}({\mathfrak g}). \ \ \ \ \ (27)$

This already gives an important special case of Theorem 32:

Proposition 34 Theorem 32 is true when ${W}$ has codimension one and is irreducible.

Proof: The Lie algebra ${{\mathfrak g}}$ acts on the one-dimensional space ${V/W}$; since ${{\mathfrak g} = [{\mathfrak g},{\mathfrak g}]}$ (from the simplicity hypothesis), we conclude that this action is trivial. In other words, each element of ${{\mathfrak g}}$ maps ${V}$ to ${W}$, so the Casimir operator ${C}$ does as well. In particular, the trace of ${C}$ on ${V}$ is the same as the trace of ${C}$ on ${W}$. On the other hand, by Schur’s lemma, ${C}$ is a constant on ${W}$; applying (27), we conclude that this constant is non-zero. Thus ${C}$ is non-degenerate on ${W}$, but is not full rank on ${V}$ as it maps ${V}$ to ${W}$. Thus it must have a one-dimensional null-space ${W'}$ which is complementary to ${W}$. As ${C}$ commutes with ${{\mathfrak g}}$, ${W'}$ is ${{\mathfrak g}}$-invariant, and the claim follows. $\Box$
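The Casimir operator used above can be computed explicitly in the smallest example. The following sketch assumes ${{\mathfrak g} = \mathfrak{sl}_2}$ acting on ${V = {\bf C}^2}$ with the usual basis ${e, f, h}$; the dual basis is written down by hand and then verified against the form (26), and all function names are our own.

```python
from fractions import Fraction


def mul(A, B):
    n = len(A)
    return [[sum(A[r][k] * B[k][c] for k in range(n)) for c in range(n)]
            for r in range(n)]


def add(A, B):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]


def sub(A, B):
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]


def trace(A):
    return sum(A[i][i] for i in range(len(A)))


def trace_form(x, y):
    """The form (26): tr(xy) on the defining representation."""
    return trace(mul(x, y))


e = [[0, 1], [0, 0]]
f = [[0, 0], [1, 0]]
h = [[1, 0], [0, -1]]
half_h = [[Fraction(1, 2), 0], [0, Fraction(-1, 2)]]

BASIS = [e, f, h]
DUAL = [f, e, half_h]  # candidate dual basis, checked via GRAM below

GRAM = [[trace_form(x, y) for y in DUAL] for x in BASIS]

# Casimir operator: sum of e_i f_i over the basis and its dual basis.
CASIMIR = [[0, 0], [0, 0]]
for x, y in zip(BASIS, DUAL):
    CASIMIR = add(CASIMIR, mul(x, y))

if __name__ == "__main__":
    print("Gram matrix:", GRAM)
    print("Casimir:", CASIMIR)
    print("trace:", trace(CASIMIR))
```

In this example the Casimir operator is a scalar multiple of the identity, as Schur's lemma predicts for an irreducible representation, and its trace equals the dimension of ${\mathfrak{sl}_2}$ in agreement with (27).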

We can then remove the irreducibility hypothesis:

Proposition 35 Theorem 32 is true when ${W}$ has codimension one.

We remark that this statement is essentially a reformulation of Whitehead’s lemma.

Proof: We induct on the dimension of ${W}$ (or ${V}$). If ${W}$ is irreducible then we are already done, so suppose that ${W}$ has a non-trivial proper invariant subspace ${U}$. Then ${W/U}$ has codimension one in ${V/U}$, so by the induction hypothesis ${W/U}$ is complemented by a one-dimensional invariant subspace ${Y}$ of ${V/U}$, which lifts to an invariant subspace ${Z}$ of ${V}$ in which ${U}$ has codimension one. By the induction hypothesis again, ${U}$ is complemented by a one-dimensional invariant subspace ${W'}$ in ${Z}$, and it is then easy to see that ${W'}$ also complements ${W}$ in ${V}$, and the claim follows. $\Box$

Next, we remove the codimension one hypothesis instead:

Proposition 36 Theorem 32 is true when ${W}$ is irreducible.

Proof: Let ${A}$ be the space of linear maps ${T: V \rightarrow W}$ whose restriction to ${W}$ is a constant multiple of the identity, and let ${B}$ be the subspace of ${A}$ consisting of those maps whose restriction to ${W}$ vanishes. Then ${A, B}$ are ${{\mathfrak g}}$-invariant (using the Lie bracket action), and ${B}$ has codimension one in ${A}$. Applying Proposition 35 (pushing ${{\mathfrak g}}$ forward to ${\mathfrak{gl}(A)}$, and treating the degenerate case when the image of ${{\mathfrak g}}$ in ${\mathfrak{gl}(A)}$ vanishes separately) we see that ${B}$ is complemented by a one-dimensional invariant subspace ${B'}$ of ${A}$. Thus there exists ${T \in A}$ that does not lie in ${B}$, and which commutes with every element of ${{\mathfrak g}}$. The kernel ${W'}$ of ${T}$ is then an invariant complement of ${W}$ in ${V}$, and the claim follows. $\Box$

Applying the induction argument used to prove Proposition 35, we now obtain Theorem 32 in full generality.