In this set of notes, we describe the basic analytic structure theory of Lie groups, by relating them to the simpler concept of a Lie algebra. Roughly speaking, the Lie algebra encodes the “infinitesimal” structure of a Lie group, but is a simpler object, being a vector space rather than a nonlinear manifold. Nevertheless, thanks to the fundamental theorems of Lie, the Lie algebra can be used to reconstruct the Lie group (at a local level, at least), by means of the exponential map and the Baker-Campbell-Hausdorff formula. As such, the local theory of Lie groups is completely described (in principle, at least) by the theory of Lie algebras, which leads to a number of useful consequences, such as the following:

  • (Local Lie implies Lie) A topological group {G} is Lie (i.e. it is isomorphic to a Lie group) if and only if it is locally Lie (i.e. the group operations are smooth near the identity).
  • (Uniqueness of Lie structure) A topological group has at most one smooth structure on it that makes it Lie.
  • (Weak regularity implies strong regularity, I) Lie groups are automatically real analytic. (In fact one only needs a “local {C^{1,1}}” regularity on the group structure to obtain real analyticity.)
  • (Weak regularity implies strong regularity, II) A continuous homomorphism from one Lie group to another is automatically smooth (and real analytic).

The connection between Lie groups and Lie algebras also highlights the role of one-parameter subgroups of a topological group, which will play a central role in the solution of Hilbert’s fifth problem.

We note that there is also a very important algebraic structure theory of Lie groups and Lie algebras, in which the Lie algebra is split into solvable and semisimple components, with the latter being decomposed further into simple components, which can then be completely classified using Dynkin diagrams. This classification is of fundamental importance in many areas of mathematics (e.g. representation theory, arithmetic geometry, and group theory), and many of the deeper facts about Lie groups and Lie algebras are proven via this classification (although in such cases it can be of interest to also find alternate proofs that avoid the classification). However, it turns out that we will not need this theory in this course, and so we will not discuss it further here (though it can of course be found in any graduate text on Lie groups and Lie algebras).

— 1. Local groups —

The connection between Lie groups and Lie algebras will be local in nature – the only portion of the Lie group that will be of importance will be the portion that is close to the group identity {1}. To formalise this locality, it is convenient to introduce the notion of a local group and a local Lie group, which are local versions of the concept of a topological group and a Lie group respectively. We will only set up the barest bones of the theory of local groups here; a more detailed discussion may be found at this previous blog post.

Definition 1 (Local group) A local topological group {G = (G, \Omega, \Lambda, 1, \cdot, ()^{-1})}, or local group for short, is a topological space {G} equipped with an identity element {1 \in G}, a partially defined but continuous multiplication operation {\cdot: \Omega \rightarrow G} for some domain {\Omega \subset G \times G}, and a partially defined but continuous inversion operation {()^{-1}: \Lambda \rightarrow G}, where {\Lambda \subset G}, obeying the following axioms:

  • (Local closure) {\Omega} is an open neighbourhood of {G \times \{1\} \cup \{1\} \times G}, and {\Lambda} is an open neighbourhood of {1}.
  • (Local associativity) If {g, h, k \in G} are such that {(g \cdot h) \cdot k} and {g \cdot (h \cdot k)} are both well-defined in {G}, then they are equal. (Note however that it may be possible for one of these products to be defined but not the other.)
  • (Identity) For all {g \in G}, {g \cdot 1 = 1 \cdot g = g}.
  • (Local inverse) If {g \in G} and {g^{-1}} is well-defined in {G}, then {g \cdot g^{-1} = g^{-1} \cdot g = 1}. (In particular this, together with the other axioms, forces {1^{-1} = 1}.)

We will sometimes use additive notation for local groups if the groups are locally abelian (thus, if {g+h} is defined, then {h+g} is also defined and equal to {g+h}).

A local group is said to be symmetric if {\Lambda = G}, i.e. if every element {g} in {G} has an inverse {g^{-1}} that is also in {G}.

A local Lie group is a local group that is also a smooth manifold, in such a fashion that the partially defined group operations {\cdot, ()^{-1}} are smooth on their domain of definition.

Clearly, every topological group is a local group, and every Lie group is a local Lie group. We will sometimes refer to the former concepts as global topological groups and global Lie groups in order to distinguish them from their local counterparts. One could also consider local discrete groups, in which the topological structure is just the discrete topology, but we will not need to study such objects in this course.

A model class of examples of a local (Lie) group comes from restricting a global (Lie) group to an open neighbourhood of the identity. Let us formalise this concept:

Definition 2 (Restriction) If {G} is a local group, and {U} is an open neighbourhood of the identity in {G}, then we define the restriction {G\downharpoonright_U} of {G} to {U} to be the topological space {U} with domains {\Omega\downharpoonright_U := \{ (g,h) \in \Omega: g, h, g \cdot h \in U \}} and {\Lambda\downharpoonright_U := \{ g \in \Lambda: g, g^{-1} \in U \}}, and with the group operations {\cdot, ()^{-1}} being the restriction of the group operations of {G} to {\Omega\downharpoonright_U}, {\Lambda\downharpoonright_U} respectively. If {U} is symmetric (in the sense that {g^{-1}} is well-defined and lies in {U} for all {g \in U}), then this restriction {G\downharpoonright_U} will also be symmetric. If {G} is a global or local Lie group, then {G\downharpoonright_U} will also be a local Lie group. We will sometimes abuse notation and refer to the local group {G\downharpoonright_U} simply as {U}.

Thus, for instance, one can take the Euclidean space {{\bf R}^d}, and restrict it to a ball {B} centred at the origin, to obtain an additive local group {{\bf R}^d\downharpoonright_B}. In this group, two elements {x, y} in {B} have a well-defined sum {x+y} only when their sum in {{\bf R}^d} stays inside {B}. Intuitively, this local group behaves like the global group {{\bf R}^d} as long as one is close enough to the identity element {0}, but as one gets closer to the boundary of {B}, the group structure begins to break down.

It is natural to ask the question as to whether every local group arises as the restriction of a global group. The answer to this question is somewhat complicated, and can be summarised as “essentially yes in certain circumstances, but not in general”. See this previous blog post for more discussion.

A key example of a local Lie group for this blog post will come from pushing forward a Lie group via a coordinate chart near the identity:

Example 1 Let {G} be a global or local Lie group of some dimension {d}, and let {\phi: U \rightarrow V} be a smooth coordinate chart from a neighbourhood {U} of the identity {1} in {G} to a neighbourhood {V} of the origin {0} in {{\bf R}^d}, such that {\phi} maps {1} to {0}. Then we can define a local group {\phi_* G\downharpoonright_U} which is the set {V} (viewed as a smooth submanifold of {{\bf R}^d}) with the local group identity {0}, the local group multiplication law {\ast} defined by the formula

\displaystyle  x \ast y := \phi( \phi^{-1}(x) \cdot \phi^{-1}(y) )

defined whenever {\phi^{-1}(x), \phi^{-1}(y), \phi^{-1}(x) \cdot \phi^{-1}(y)} are well-defined and lie in {U}, and the local group inversion law {()^{\ast-1}} defined by the formula

\displaystyle  x^{\ast -1} := \phi( \phi^{-1}(x)^{-1} )

defined whenever {\phi^{-1}(x), \phi^{-1}(x)^{-1}} are well-defined and lie in {U}. One easily verifies that {\phi_* G \downharpoonright_U} is a local Lie group. We will sometimes denote this local Lie group as {(V,\ast)}, to distinguish it from the additive local Lie group {(V,+)} arising by restriction of {({\bf R}^d,+)} to {V}. The precise distinction between the two local Lie groups will in fact be a major focus of this post.

Example 2 Let {G} be the Lie group {GL_n({\bf R})}, and let {U} be the ball {U := \{ g \in GL_n({\bf R}): \|g-1\|_{op} < 1 \}}. If we then let {V \subset M_n({\bf R})} be the ball {V := \{ x \in M_n({\bf R}): \|x\|_{op} < 1 \}} and {\phi} be the map {\phi(g) := g-1}, then {\phi} is a smooth coordinate chart (after identifying {M_n({\bf R})} with {{\bf R}^{n \times n}}), and by the construction in the preceding example, {V = \phi_* G\downharpoonright_U} becomes a local Lie group with the operations

\displaystyle  x \ast y := x + y + xy

(defined whenever {x, y, x+y+xy} all lie in {V}) and

\displaystyle  x^{\ast -1} := (1+x)^{-1} - 1 = -x + x^2 - x^3 + \ldots

(defined whenever {x} and {(1+x)^{-1} - 1} both lie in {V}). Note that this Lie group structure is not equal to the additive structure {(V,+)} on {V}, nor is it equal to the multiplicative structure {(V,\cdot)} on {V} given by matrix multiplication, which is one of the reasons why we use the symbol {\ast} instead of {+} or {\cdot} for such structures.
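As a quick sanity check of Example 2, here is a minimal Python sketch for {n=2}, using exact rational arithmetic. The helper names (`phi`, `star`, `star_inv`, and the matrix routines) are ours and purely illustrative, and the sketch does not check the domain conditions (membership in {V}); it only tests the algebraic identities {\phi(gh) = \phi(g) \ast \phi(h)} and {x \ast x^{\ast -1} = x^{\ast -1} \ast x = 0} on one sample pair of matrices.

```python
from fractions import Fraction as F

# 2x2 matrices stored as tuples of rows; all names here are illustrative.
I2 = ((F(1), F(0)), (F(0), F(1)))
ZERO = ((F(0), F(0)), (F(0), F(0)))

def add(a, b):
    return tuple(tuple(a[i][j] + b[i][j] for j in range(2)) for i in range(2))

def sub(a, b):
    return tuple(tuple(a[i][j] - b[i][j] for j in range(2)) for i in range(2))

def mul(a, b):
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def inv(a):
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return ((a[1][1] / det, -a[0][1] / det), (-a[1][0] / det, a[0][0] / det))

def phi(g):
    """The chart g -> g - 1."""
    return sub(g, I2)

def star(x, y):
    """The transported law x * y = x + y + xy."""
    return add(add(x, y), mul(x, y))

def star_inv(x):
    """The transported inverse (1 + x)^{-1} - 1."""
    return sub(inv(add(I2, x)), I2)

g = ((F(5, 4), F(1, 8)), (F(0), F(3, 4)))
h = ((F(9, 8), F(0)), (F(-1, 4), F(1)))
x, y = phi(g), phi(h)

hom_ok = phi(mul(g, h)) == star(x, y)
inv_ok = star(x, star_inv(x)) == ZERO and star(star_inv(x), x) == ZERO
print(hom_ok, inv_ok)
```

Because the arithmetic is exact, both checks are genuine equalities rather than floating-point approximations; the first one is also an instance of the homomorphism property of {\phi} for this pair.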

Many (though not all) of the familiar constructions in group theory can be generalised to the local setting, though often with some slight additional subtleties. We will not systematically do so here, but we give a single such generalisation for now:

Definition 3 (Homomorphism) A continuous homomorphism {\phi: G \rightarrow H} between two local groups {G, H} is a continuous map from {G} to {H} with the following properties:

  • {\phi} maps the identity {1_G} of {G} to the identity {1_H} of {H}: {\phi(1_G) = 1_H}.
  • If {g \in G} is such that {g^{-1}} is well-defined in {G}, then {\phi(g)^{-1}} is well-defined in {H} and is equal to {\phi(g^{-1})}.
  • If {g, h \in G} are such that {g \cdot h} is well-defined in {G}, then {\phi(g) \cdot \phi(h)} is well-defined and equal to {\phi(g \cdot h)}.

A smooth homomorphism {\phi: G \rightarrow H} between two local Lie groups {G, H} is a continuous homomorphism that is also smooth.

It is easy to see that the composition of two continuous homomorphisms is again a continuous homomorphism; this gives the class of local groups the structure of a category. Similarly, the class of local Lie groups with their smooth homomorphisms is also a category.

Note that homomorphisms on a local group {G} are defined on the entirety of {G}; it is also natural to consider (continuous or smooth) local homomorphisms, which are only defined on an open neighbourhood of the identity in {G}, with two local homomorphisms considered equivalent if they agree on a (possibly smaller) open neighbourhood of the identity. We will not need to do so for now, however.

Example 3 With the notation of Example 1, {\phi: U \rightarrow V} is a smooth homomorphism from the local Lie group {G\downharpoonright_U} to the local Lie group {\phi_* G\downharpoonright_U}. In fact, it is a smooth isomorphism, since {\phi^{-1}: V \rightarrow U} provides the inverse homomorphism.

Let us say that a word {g_1 \ldots g_n} in a local group {G} is well-defined in {G} (or well-defined, for short) if every possible way of associating this word using parentheses is well-defined from applying the product operation. For instance, in order for {abcd} to be well-defined, {((ab)c)d}, {(a(bc))d}, {(ab)(cd)}, {a(b(cd))}, and {a((bc)d)} must all be well-defined. As a concrete illustration, in the additive local group {\{-9,\ldots,9\}} (with the group structure restricted from that of the integers {{\bf Z}}), {-2+6+5} is not well-defined because one of the ways of associating this sum, namely {-2+(6+5)}, is not well-defined (even though {(-2+6)+5} is well-defined).
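The notion of a well-defined word can be made concrete with a short Python sketch. It models the additive local group {\{-9,\ldots,9\}} by a partial addition that returns `None` when the sum leaves the set, and enumerates every way of parenthesising a word. The function names (`make_local_add`, `evaluations`, `is_well_defined`) are hypothetical and chosen only for this illustration.

```python
def make_local_add(bound):
    """Partial addition on {-bound, ..., bound}; None means 'undefined'."""
    def op(a, b):
        if a is None or b is None:
            return None          # an undefined sub-product poisons the whole product
        s = a + b
        return s if -bound <= s <= bound else None
    return op

def evaluations(word, op):
    """Value of every parenthesisation of `word` (None where undefined)."""
    if len(word) == 1:
        return [word[0]]
    results = []
    for split in range(1, len(word)):
        for left in evaluations(word[:split], op):
            for right in evaluations(word[split:], op):
                results.append(op(left, right))
    return results

def is_well_defined(word, op):
    """A word is well-defined when every parenthesisation is defined."""
    return all(r is not None for r in evaluations(word, op))

add9 = make_local_add(9)
print(evaluations((-2, 6, 5), add9))
print(is_well_defined((-2, 6, 5), add9))
print(is_well_defined((-2, 6, 3), add9))
```

The first call reports `[None, 9]`: the association {-2+(6+5)} is undefined while {(-2+6)+5} equals {9}, matching the discussion above. A four-letter word produces five entries, one for each of the five parenthesisations listed earlier.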

Exercise 1 (Iterating the associative law)

  • Show that if a word {g_1 \ldots g_n} in a local group {G} is well-defined, then all ways of associating this word give the same answer, and so we can uniquely evaluate {g_1 \ldots g_n} as an element in {G}.
  • Give an example of a word {g_1 \ldots g_n} in a local group {G} which has two ways of being associated that are both well-defined, but give different answers. (Hint: the local associativity axiom prevents this from happening for {n \leq 3}, so try {n=4}. A small discrete local group will already suffice to give a counterexample; verifying the local group axioms is easier if one makes the domain of definition of the group operations as small as one can get away with while still having the counterexample.)

— 2. Some differential geometry —

To define the Lie algebra of a Lie group, we must first quickly recall some basic notions from differential geometry associated to smooth manifolds (which are not necessarily embedded in some larger Euclidean space, but instead exist intrinsically as abstract geometric structures). This requires a certain amount of abstract formalism in order to define things rigorously, though for the purposes of visualisation, it is more intuitive to view these concepts from a more informal geometric perspective.

We begin with the concept of the tangent space and related structures.

Definition 4 (Tangent space) Let {M} be a smooth {d}-dimensional manifold. At every point {x} of this manifold, we can define the tangent space {T_x M} of {M} at {x}. Formally, this tangent space can be defined as the space of all continuously differentiable curves {\gamma: I \rightarrow M} defined on an open interval {I} containing {0} with {\gamma(0)=x}, modulo the relation that two curves {\gamma_1, \gamma_2} are considered equivalent if they have the same derivative at {0}, in the sense that

\displaystyle  \frac{d}{dt} \phi(\gamma_1(t))|_{t=0} = \frac{d}{dt} \phi(\gamma_2(t))|_{t=0}

where {\phi:U \rightarrow V} is a coordinate chart of {M} defined in a neighbourhood of {x}; it is easy to see from the chain rule that this equivalence is independent of the actual choice of {\phi}. Using such a coordinate chart, one can identify the tangent space {T_x M} with the Euclidean space {{\bf R}^d}, by identifying {\gamma} with {\frac{d}{dt} \phi(\gamma(t))|_{t=0}}. One easily verifies that this gives {T_x M} the structure of a {d}-dimensional vector space, in a manner which is independent of the choice of coordinate chart {\phi}. Elements of {T_x M} are called tangent vectors of {M} at {x}. If {\gamma: I \rightarrow M} is a continuously differentiable curve with {\gamma(0)=x}, the equivalence class of {\gamma} in {T_x M} will be denoted {\gamma'(0)}.

The space {TM := \bigcup_{x \in M} (\{x\} \times T_x M)} of pairs {(x,v)}, where {x} is a point in {M} and {v} is a tangent vector of {M} at {x}, is called the tangent bundle.

If {\Phi: M \rightarrow N} is a smooth map between two manifolds, we define the derivative map {D\Phi: TM \rightarrow TN} to be the map defined by setting

\displaystyle  D\Phi( (x, \gamma'(0)) ) := (\Phi(x), (\Phi \circ \gamma)'(0))

for all continuously differentiable curves {\gamma: I \rightarrow M} with {\gamma(0)=x} for some {x \in M}. We also write {(\Phi(x), D\Phi(x)(v))} for {D\Phi(x,v)}, so that for each {x \in M}, {D\Phi(x)} is a map from {T_x M} to {T_{\Phi(x)} N}. One can easily verify that this latter map is linear. We observe the chain rule

\displaystyle  D(\Psi \circ \Phi) = (D\Psi) \circ (D\Phi) \ \ \ \ \ (1)

for any smooth maps {\Phi: M \rightarrow N}, {\Psi: N \rightarrow O}. (Indeed, one can view the tangent operator {T} and the derivative operator {D} together as a single covariant functor from the category of smooth manifolds to itself, although we will not need to use this perspective here.)

Observe that if {V} is an open subset of {{\bf R}^d}, then {TV} may be identified with {V \times {\bf R}^d}. In particular, every coordinate chart {\phi: U \rightarrow V} of {M} gives rise to a coordinate chart {D\phi: TU \rightarrow V \times {\bf R}^d} of {TM}, which gives {TM} the structure of a smooth {2d}-dimensional manifold.

Remark 1 Informally, one can think of a tangent vector {(x,v)} as an infinitesimal vector from the point {x} of {M} to a nearby point {x + \epsilon v + O(\epsilon^2)} on {M}, where {\epsilon>0} is infinitesimally small; a smooth map {\phi} then sends {x + \epsilon v + O(\epsilon^2)} to {\phi(x) + \epsilon D\phi(x)(v) + O(\epsilon^2)}. One can make this informal perspective rigorous by means of nonstandard analysis, but we will not do so here.

Once one has the notion of a tangent bundle, one can define the notion of a smooth vector field:

Definition 5 (Vector fields) A smooth vector field on {M} is a smooth map {X: M \rightarrow TM} which is a right inverse for the projection map {\pi: TM \rightarrow M}, thus (by slight abuse of notation) {X} maps {x} to {(x,X(x))} for some {X(x) \in T_x M}. The space of all smooth vector fields is denoted {\Gamma(TM)}. It is clearly a real vector space. In fact, it is a {C^\infty(M)}-module: given a smooth vector field {X \in \Gamma(TM)} and a smooth function {f \in C^\infty(M)} (i.e. a smooth map {f: M \rightarrow {\bf R}}), one can define the product {fX} in the obvious manner: {fX(x) := f(x) X(x)}, and one easily verifies the module axioms.

Given a smooth function {f \in C^\infty(M)} and a smooth vector field {X \in \Gamma(TM)}, we define the directional derivative {\nabla_X f \in C^\infty(M)} of {f} along {X} by the formula

\displaystyle  \nabla_X f(x) := \frac{d}{dt} f(\gamma(t))|_{t=0}

whenever {\gamma: I \rightarrow M} is a continuously differentiable function with {\gamma(0)=x} and {\gamma'(0)=X(x)}; one easily verifies that {\nabla_X f} is well-defined and is an element of {C^\infty(M)}.

Remark 2 One can define {\nabla_X f} in a more “co-ordinate free” manner as

\displaystyle  \nabla_X f = \eta \circ Df \circ X,

where {\eta: T{\bf R} \rightarrow {\bf R}} is the projection map to the second coordinate of {T{\bf R} \equiv {\bf R} \times {\bf R}}; one can also view {\nabla_X f} as the Lie derivative of {f} along {X} (although, in most texts, the latter definition would be circular, because the Lie derivative is usually defined using the directional derivative).

Remark 3 If {V} is an open subset of {{\bf R}^d}, a smooth vector field on {V} can be identified with a smooth map {X: V \rightarrow {\bf R}^d} from {V} to {{\bf R}^d}. If {X: M \rightarrow TM} is a smooth vector field on {M} and {\phi: U \rightarrow V} is a coordinate chart of {M}, then the pushforward {\phi_* X := D\phi \circ X \circ \phi^{-1}: V \rightarrow TV} of {X} by {\phi} is a smooth vector field of {V}. Thus, in coordinates, one can view vector fields as maps from open subsets of {{\bf R}^d} to {{\bf R}^d}. This perspective is convenient for quick and dirty calculations; for instance, in coordinates, the directional derivative {\nabla_X f} is the same as the familiar directional derivative {X \cdot \nabla f} from several variable calculus. If however one wishes to perform several changes of variable, then the more intrinsically geometric (and “coordinate-free”) perspective outlined above can be more helpful.

There is a fundamental link between smooth vector fields and derivations of {C^\infty(M)}:

Exercise 2 (Correspondence between smooth vector fields and derivations) Let {M} be a smooth manifold.

  • If {X \in \Gamma(TM)} is a smooth vector field, show that {\nabla_X: C^\infty(M) \rightarrow C^\infty(M)} is a derivation on the (real) algebra {C^\infty(M)}, i.e. a (real) linear map that obeys the Leibniz rule

    \displaystyle  \nabla_X(fg) = f \nabla_X g + (\nabla_X f) g \ \ \ \ \ (2)

    for all {f, g \in C^\infty(M)}.

  • Conversely, if {d: C^\infty(M) \rightarrow C^\infty(M)} is a derivation on {C^\infty(M)}, show that there exists a unique smooth vector field {X} such that {d = \nabla_X}.

We see from the above exercise that smooth vector fields can be interpreted as a purely algebraic construction associated to the real algebra {C^\infty(M)}, namely as the space of derivations on that algebra. This can be useful for analysing the algebraic structure of such vector fields. Indeed, we have the following basic algebraic observation:

Exercise 3 (Commutator of derivations is a derivation) Let {d_1, d_2: A \rightarrow A} be two derivations on an algebra {A}. Show that the commutator {[d_1,d_2] := d_1 \circ d_2 - d_2 \circ d_1} is also a derivation on {A}.

From the preceding two exercises, we can define the Lie bracket {[X,Y]} of two vector fields {X, Y \in \Gamma(TM)} by the formula

\displaystyle  \nabla_{[X,Y]} := [\nabla_X, \nabla_Y].

This gives the space {\Gamma(TM)} of smooth vector fields the structure of an (infinite-dimensional) Lie algebra:

Definition 6 (Lie algebra) A (real) Lie algebra is a real vector space {V} (possibly infinite dimensional), together with a bilinear map {[,]: V \times V \rightarrow V} which is anti-symmetric (thus {[X,Y] = -[Y,X]} for all {X,Y \in V}, or equivalently {[X,X] = 0} for all {X \in V}) and obeys the Jacobi identity

\displaystyle  [[X,Y],Z] + [[Y,Z],X] + [[Z,X],Y] = 0 \ \ \ \ \ (3)

for all {X,Y,Z \in V}.

Exercise 4 If {M} is a smooth manifold, show that {\Gamma(TM)} (equipped with the Lie bracket) is a Lie algebra.
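To see the Lie bracket of vector fields in a concrete case, the following Python sketch works with polynomial vector fields {a(x) \frac{d}{dx}} on {M = {\bf R}}, acting on polynomial functions. It assumes the standard one-variable coordinate formula {[a \frac{d}{dx}, b \frac{d}{dx}] = (a b' - b a') \frac{d}{dx}}, which is not stated in these notes but follows from expanding {[\nabla_X, \nabla_Y] f}. Polynomials are stored as integer coefficient lists (lowest degree first), and all function names are illustrative. This is a check on sample data, not a proof of Exercise 4.

```python
def trim(p):
    """Drop trailing zeros so that equal polynomials compare equal."""
    p = list(p)
    while p and p[-1] == 0:
        p.pop()
    return p

def padd(p, q):
    n = max(len(p), len(q))
    return trim([(p[i] if i < len(p) else 0) + (q[i] if i < len(q) else 0)
                 for i in range(n)])

def pneg(p):
    return [-c for c in p]

def pmul(p, q):
    out = [0] * (len(p) + len(q) - 1) if p and q else []
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return trim(out)

def pdiff(p):
    return trim([i * c for i, c in enumerate(p)][1:])

def nabla(a, f):
    """Directional derivative of f along the vector field a(x) d/dx."""
    return pmul(a, pdiff(f))

def bracket(a, b):
    """Coefficient of [a d/dx, b d/dx], assumed to be a b' - b a'."""
    return padd(pmul(a, pdiff(b)), pneg(pmul(b, pdiff(a))))

X = [1, 0, 2]         # 1 + 2x^2
Y = [0, 3, 1]         # 3x + x^2
Z = [5, -1, 0, 1]     # 5 - x + x^3
f = [0, 1, 4, 0, 2]   # x + 4x^2 + 2x^4

# The bracket realises the commutator of the two derivations on f.
commutator = padd(nabla(X, nabla(Y, f)), pneg(nabla(Y, nabla(X, f))))
matches_commutator = commutator == nabla(bracket(X, Y), f)

antisymmetric = bracket(X, Y) == pneg(bracket(Y, X))
jacobi_sum = padd(padd(bracket(bracket(X, Y), Z), bracket(bracket(Y, Z), X)),
                  bracket(bracket(Z, X), Y))
print(bracket(X, Y), matches_commutator, antisymmetric, jacobi_sum == [])
```

With these inputs the bracket works out to {3 + 2x - 6x^2}, and the Jacobi sum reduces to the zero polynomial (the empty list), as identity (3) predicts.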

— 3. The Lie algebra of a Lie group —

Let {G} be a (global) Lie group. By definition, {G} is then a smooth manifold, so we can define the tangent bundle {TG} and smooth vector fields {X \in \Gamma(TG)} as in the preceding section. In particular, we can define the tangent space {T_1 G} of {G} at the identity element {1}.

If {g \in G}, then the left multiplication operation {\rho_g^{left}: x \mapsto gx} is, by definition of a Lie group, a smooth map from {G} to {G}. This creates a derivative map {D \rho_g^{left}: TG \rightarrow TG} from the tangent bundle {TG} to itself. We say that a vector field {X \in \Gamma(TG)} is left-invariant if one has {(\rho_g^{left})_* X = X} for all {g \in G}, or equivalently if {(D \rho_g^{left}) \circ X = X \circ \rho_g^{left}} for all {g \in G}.

Exercise 5 Let {G} be a (global) Lie group.

  • Show that for every element {x} of {T_1 G} there is a unique left-invariant vector field {X \in \Gamma(TG)} such that {X(1)=x}.
  • Show that the commutator {[X,Y]} of two left-invariant vector fields is again a left-invariant vector field.

From the above exercise, we can identify the tangent space {T_1 G} with the left-invariant vector fields on {G}, and the Lie bracket structure on the latter then induces a Lie bracket (which we also call {[,]}) on {T_1 G}. The vector space {T_1 G} together with this Lie bracket is then a (finite-dimensional) Lie algebra, which we call the Lie algebra of the Lie group {G}, and we write as {{\mathfrak g}}.

Remark 4 Informally, an element {x} of the Lie algebra {{\mathfrak g}} is associated with an infinitesimal perturbation {1 + \epsilon x + O(\epsilon^2)} of the identity in the Lie group {G}. This intuition can be formalised fairly easily in the case of matrix Lie groups such as {GL_n({\bf C})}; for more abstract Lie groups, one can still formalise things using nonstandard analysis, but we will not do so here.

Exercise 6

  • Show that the Lie algebra {{\mathfrak gl}_n({\bf C})} of the general linear group {GL_n({\bf C})} can be identified with the space {M_n({\bf C})} of {n \times n} complex matrices, with the Lie bracket {[A,B] := AB-BA}.
  • Describe the Lie algebra {{\mathfrak u}_n({\bf C})} of the unitary group {U_n({\bf C})}.
  • Describe the Lie algebra {\mathfrak{su}_n({\bf C})} of the special unitary group {SU_n({\bf C})}.
  • Describe the Lie algebra {{\mathfrak o}_n({\bf R})} of the orthogonal group {O_n({\bf R})}.
  • Describe the Lie algebra {\mathfrak{so}_n({\bf R})} of the special orthogonal group {SO_n({\bf R})}.
  • Describe the Lie algebra of the Heisenberg group {\begin{pmatrix} 1 & {\bf R} & {\bf R} \\ 0 & 1 & {\bf R} \\ 0 & 0 & 1 \end{pmatrix}}.
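For experimenting with the matrix bracket {[A,B] := AB-BA} from the first item, here is a small Python sketch with exact integer matrices. It checks the Jacobi identity (3) on sample {3 \times 3} matrices, and then computes brackets of the elementary matrices {E_{12}, E_{23}, E_{13}}. The assumption that these three matrices span the Lie algebra of the Heisenberg group is ours, and justifying it is part of what the exercise asks; the helper names are illustrative.

```python
def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def matadd(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def matsub(a, b):
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def comm(a, b):
    """Matrix commutator [a, b] = ab - ba."""
    return matsub(matmul(a, b), matmul(b, a))

def unit(i, j, n=3):
    """Elementary matrix with a single 1 in row i, column j (0-indexed)."""
    return [[1 if (r, c) == (i, j) else 0 for c in range(n)] for r in range(n)]

ZERO = [[0] * 3 for _ in range(3)]

# Jacobi identity on three arbitrary sample matrices.
A = [[1, 2, 0], [0, -1, 3], [4, 0, 1]]
B = [[0, 1, 1], [2, 0, -2], [1, 1, 0]]
C = [[3, 0, -1], [1, 1, 0], [0, 2, 5]]
jacobi = matadd(matadd(comm(comm(A, B), C), comm(comm(B, C), A)),
                comm(comm(C, A), B))

# Candidate basis for the Heisenberg Lie algebra (an assumption, see above).
X, Y, Z = unit(0, 1), unit(1, 2), unit(0, 2)
print(jacobi == ZERO, comm(X, Y) == Z, comm(X, Z) == ZERO, comm(Y, Z) == ZERO)
```

Under the stated assumption, the output says that {[X,Y] = Z} while {Z} commutes with both {X} and {Y}.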

Exercise 7 Let {\phi: G \rightarrow H} be a smooth homomorphism between (global) Lie groups. Show that the derivative map {D\phi(1_G)} at the identity element {1_G} is then a Lie algebra homomorphism from the Lie algebra {{\mathfrak g}} of {G} to the Lie algebra {{\mathfrak h}} of {H} (thus this map is linear and preserves the Lie bracket). (From this and the chain rule (1), we see that the map {\phi \mapsto D\phi(1_G)} creates a covariant functor from the category of Lie groups to the category of Lie algebras.)

We have seen that every global Lie group gives rise to a Lie algebra. One can also associate Lie algebras to local Lie groups as follows:

Exercise 8 Let {G} be a local Lie group. Let {U} be a symmetric neighbourhood of the identity in {G}. (It is not difficult to see that at least one such neighbourhood exists.) Call a vector field {X \in \Gamma(TU)} left-invariant if, for every {g \in U}, one has {(\rho_g^{left})_* X(g) = X(g)}, where {\rho_g^{left}} is the left-multiplication map {x \mapsto gx}, defined on the open set {\{ x \in U: gx \in U \}} (where we adopt the convention that {gx \in U} is shorthand for “{g \cdot x} is well-defined and lies in {U}”).

  • Establish the analogue of Exercise 5 in this setting. Conclude that one can give {T_1 G} the structure of a Lie algebra, which is independent of the choice of {U}.
  • Establish the analogue of Exercise 7 in this setting.

Remark 5 In the converse direction, it is also true that every finite-dimensional Lie algebra can be associated to either a local or a global Lie group; this is known as Lie’s third theorem. However, this theorem is somewhat tricky to prove (particularly if one wants to associate the Lie algebra with a global Lie group), requiring the non-trivial algebraic tool of Ado’s theorem (discussed in this previous blog post); see Exercise 21 below.

— 4. The exponential map —

The exponential map {x \mapsto \exp(x)} on the reals {{\bf R}} (or its extension to the complex numbers {{\bf C}}) is of course fundamental to modern analysis. It can be defined in a variety of ways, such as the following:

  • (i) {\exp: {\bf R} \rightarrow {\bf R}} is the differentiable map obeying the ODE {\frac{d}{dx} \exp(x) = \exp(x)} and the initial condition {\exp(0)=1}.
  • (ii) {\exp: {\bf R} \rightarrow {\bf R}} is the differentiable map obeying the homomorphism property {\exp(x+y) = \exp(x) \exp(y)} and the initial condition {\frac{d}{dx} \exp(x)|_{x=0}=1}.
  • (iii) {\exp: {\bf R} \rightarrow {\bf R}} is the limit of the functions {x \mapsto (1+\frac{x}{n})^n} as {n \rightarrow \infty}.
  • (iv) {\exp: {\bf R} \rightarrow {\bf R}} is the limit of the infinite series {x \mapsto \sum_{n=0}^\infty \frac{x^n}{n!}}.

We will need to generalise this map to arbitrary Lie algebras and Lie groups. In the case of matrix Lie groups (and matrix Lie algebras), one can use the matrix exponential, which can be defined efficiently by modifying definition (iv) above, and which was already discussed in the previous set of notes. It is however difficult to use this definition for abstract Lie algebras and Lie groups. The definition based on (ii) will ultimately be the best one to use for the purposes of this course, but for foundational purposes (i) or (iii) is initially easier to work with. In most of the foundational literature on Lie groups and Lie algebras, one uses (i), in which case the existence and basic properties of the exponential map can be provided by the Picard existence theorem from the theory of ordinary differential equations. However, we will use (iii), because it relies less heavily on the smooth structure of the Lie group, and will therefore be more aligned with the spirit of Hilbert’s fifth problem (which seeks to minimise the reliance on smoothness hypotheses whenever possible). Actually, for minor technical reasons it is slightly more convenient to work with the limit of {(1+\frac{x}{2^n})^{2^n}} rather than {(1+\frac{x}{n})^{n}}.
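For the classical scalar exponential, the dyadic variant of definition (iii) is easy to test numerically. The Python sketch below computes {(1+\frac{x}{2^n})^{2^n}} by squaring {n} times and compares it with the series from (iv) and with the library value `math.exp`. The function names and the sample point {x = 0.7} are arbitrary choices for illustration.

```python
import math

def dyadic_power_approx(x, n):
    """(1 + x / 2**n) ** (2**n), computed by squaring n times."""
    y = 1.0 + x / 2.0 ** n
    for _ in range(n):
        y *= y
    return y

def series_approx(x, terms):
    """Partial sum of sum_{k >= 0} x**k / k! with the given number of terms."""
    total, term = 0.0, 1.0
    for k in range(terms):
        total += term
        term *= x / (k + 1)
    return total

x = 0.7
for n in (2, 5, 10, 20):
    approx = dyadic_power_approx(x, n)
    print(n, approx, abs(approx - math.exp(x)))
print(series_approx(x, 20), math.exp(x))
```

The repeated squaring is the reason the dyadic form is convenient: passing from {n} to {n+1} only involves halving the base point and squaring once more, which is the structure exploited when estimating successive terms of the limit.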

We turn to the details. It will be convenient to work in local coordinates, and for applications to Hilbert’s fifth problem it will be useful to “forget” almost all of the smooth structure. We make the following definition:

Definition 7 ({C^{1,1}} local group) A {C^{1,1}} local group is a local group {V} that is an open neighbourhood of the origin {0} in a Euclidean space {{\bf R}^d}, with group identity {0}, and whose group operation {\ast} obeys the estimate

\displaystyle  x \ast y = x + y + O(|x| |y|) \ \ \ \ \ (4)

for all sufficiently small {x, y}, where the implied constant in the {O()} notation can depend on {V} but is uniform in {x, y}.

Example 4 Let {G} be a local Lie group of some dimension {d}, and let {\phi: U \rightarrow V} be a smooth coordinate chart that maps a neighbourhood {U} of the group identity {1} to a neighbourhood {V} of the origin {0} in {{\bf R}^d}, with {\phi(1)=0}. Then, as explained in Example 1, {V = (V,\ast) = \phi_* G\downharpoonright_U} is a local Lie group with identity {0}; in particular, one has

\displaystyle  0 \ast x = x \ast 0 = x.

From Taylor expansion (using the smoothness of {\ast}) we thus have (4) for sufficiently small {x,y}. Thus we see that every local Lie group generates a {C^{1,1}} local group when viewed in coordinates.

Remark 6 In real analysis, a (locally) {C^{1,1}} function is a function {f: U \rightarrow {\bf R}^m} on a domain {U \subset {\bf R}^n} which is continuously differentiable (i.e. in the regularity class {C^1}), and whose first derivatives {\nabla f} are (locally) Lipschitz (i.e. in the regularity class {C^{0,1}}). The {C^{1,1}} regularity class is slightly weaker (i.e. larger) than the class {C^2} of twice continuously differentiable functions, but much stronger than the class {C^1} of singly continuously differentiable functions. See this previous blog post for more on these sorts of regularity classes. The reason for the terminology {C^{1,1}} in the above definition is that {C^{1,1}} regularity is essentially the minimal regularity for which one has the Taylor expansion

\displaystyle  f(x) = f(x_0) + \nabla f(x_0) \cdot (x-x_0) + O(|x-x_0|^2)

for any {x_0} in the domain of {f}, and any {x} sufficiently close to {x_0}; note that the asymptotic (4) is of this form.

We now estimate various expressions in a {C^{1,1}} local group.

Exercise 9 Let {V} be a {C^{1,1}} local group. Throughout this exercise, the implied constants in the {O()} notation can depend on {V}, but not on parameters such as {x, y, \epsilon, k, n}.

  • (i) Show that there exists an {\epsilon>0} such that one has

    \displaystyle  x_1 \ast \ldots \ast x_k = x_1 + \ldots + x_k + O( \sum_{1 \leq i < j \leq k} |x_i| |x_j| ) \ \ \ \ \ (5)

    whenever {k \geq 1} and {x_1,\ldots,x_k \in V} are such that {\sum_{i=1}^k |x_i| \leq \epsilon}, and the implied constant is uniform in {k}. Here and in the sequel we adopt the convention that a statement such as (5) is automatically false unless all expressions in that statement are well-defined. (Hint: induct on {k} using (4). It is best to replace the asymptotic {O()} notation by explicit constants {C} in order to ensure that such constants remain uniform in {k}.) In particular, one has the crude estimate

    \displaystyle  x_1 \ast \ldots \ast x_k = O( \sum_{i=1}^k |x_i| )

    under the same hypotheses as above.

  • (ii) Show that one has

    \displaystyle  x^{\ast -1} = -x + O( |x|^2 )

    for {x} sufficiently close to the origin.

  • (iii) Show that

    \displaystyle  x \ast y \ast x^{\ast -1} \ast y^{\ast -1} = O( |x| |y| )

    for {x,y} sufficiently close to the origin. (Hint: first show that {x \ast y = y \ast x + O( |x| |y| )}, then express {x \ast y} as the product of {x \ast y \ast x^{\ast -1} \ast y^{\ast -1}} and {y \ast x}.)

  • (iv) Show that

    \displaystyle  x \ast y \ast x^{\ast -1} = y + O( |x| |y| )

    whenever {x, y} are sufficiently close to the origin.

  • (v) Show that

    \displaystyle  y \ast x^{\ast -1}, x^{\ast -1} \ast y = O( |x-y| )

    whenever {x, y} are sufficiently close to the origin.

  • (vi) Show that there exists an {\epsilon > 0} such that

    \displaystyle  x_1 \ast \ldots \ast x_k = y_1 \ast \ldots \ast y_k + O( \sum_{i=1}^k |x_i-y_i| )

    whenever {k \geq 1} and {x_1,\ldots,x_k, y_1,\ldots,y_k} are such that {\sum_{i=1}^k |x_i|, \sum_{i=1}^k |y_i| \leq \epsilon}.

  • (vii) Show that there exists an {\epsilon > 0} such that

    \displaystyle  \frac{1}{2} |n| |x-y| \leq |x^{\ast n} - y^{\ast n}| \leq 2 |n| |x-y|

    for all {n \in {\bf Z}} and {x,y \in {\bf R}^d} such that {|nx|, |ny| \leq \epsilon}, where {x^{\ast n} = x \ast \ldots \ast x} is the product of {n} copies of {x} (assuming of course that this product is well-defined) for {n \geq 0}, and {x^{\ast -n} := (x^{\ast n})^{\ast -1}}.

  • (viii) Show that there exists an {\epsilon > 0} such that

    \displaystyle  (x \ast y)^{\ast n} = x^{\ast n} \ast y^{\ast n} + O( |n|^2 |x| |y| )

    for all {n \in {\bf Z}} and {x,y \in {\bf R}^d} such that {|nx|, |ny| \leq \epsilon}. (Hint: do the case when {n} is positive first. In that case, express {x^{\ast -n} \ast (x \ast y)^{\ast n}} as the product of {n} conjugates of {y} by various powers of {x}.)

We can now define the exponential map {\exp: V' \rightarrow V} on this {C^{1,1}} local group by defining

\displaystyle  \exp(x) := \lim_{n \rightarrow \infty} (\frac{1}{2^n} x)^{\ast 2^n} \ \ \ \ \ (6)

for any {x} in a sufficiently small neighbourhood {V'} of the origin in {V}.
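The limit (6) can be explored numerically. The sketch below (in Python; the choice of local group and all function names are assumptions made for illustration) uses the chart on the affine group that sends {(p,q)} to the matrix with rows {(1+p, q)} and {(0, 1)}. In this chart the group law is {(p_1,q_1) \ast (p_2,q_2) = (p_1+p_2+p_1p_2, q_1+q_2+p_1q_2)}, which is of the form (4). The code computes {(\frac{1}{2^n} x)^{\ast 2^n}} by {n} successive squarings and compares it with the matrix exponential, which in this chart is {(e^p-1, q(e^p-1)/p)}.

```python
import math


def star(x, y):
    """Group law of the affine group in the chart (p, q) <-> [[1+p, q], [0, 1]]."""
    p1, q1 = x
    p2, q2 = y
    return (p1 + p2 + p1 * p2, q1 + q2 + p1 * q2)


def exp_by_squaring(x, n):
    """Approximate exp(x) by (x / 2^n)^{*2^n}, using n successive squarings."""
    scale = 2.0 ** n
    g = (x[0] / scale, x[1] / scale)
    for _ in range(n):
        g = star(g, g)
    return g


def exp_closed_form(x):
    """exp of the matrix [[p, q], [0, 0]], read off in the same chart."""
    p, q = x
    ratio = math.expm1(p) / p if p != 0.0 else 1.0
    return (math.expm1(p), q * ratio)


def distance(u, v):
    """Sup-norm distance between two points of the chart."""
    return max(abs(a - b) for a, b in zip(u, v))


X = (0.3, -0.2)
for n in (5, 10, 20):
    print(n, distance(exp_by_squaring(X, n), exp_closed_form(X)))
```

In this example the error should roughly halve each time {n} increases by one, until rounding error takes over.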

Exercise 10 Let {V} be a local {C^{1,1}} group.

  • (i) Show that if {V'} is a sufficiently small neighbourhood of the origin in {V}, then the limit in (6) exists for all {x \in V'}. (Hint: use the previous exercise to estimate the distance between {(\frac{1}{2^n} x)^{\ast 2^n}} and {(\frac{1}{2^{n+1}} x)^{\ast 2^{n+1}}}.) Establish the additional estimate

    \displaystyle  \exp(x) = x + O(|x|^2). \ \ \ \ \ (7)

  • (ii) Show that if {\gamma: I \rightarrow V} is a smooth curve with {\gamma(0)=0}, and {\gamma'(0)} is sufficiently small, then

    \displaystyle  \exp( \gamma'(0) ) = \lim_{n \rightarrow \infty} \gamma(1/2^n)^{\ast 2^n}.

  • (iii) Show that for all sufficiently small {x, y}, one has the bilipschitz property

    \displaystyle  |(\exp(x)-\exp(y)) - (x-y)| \leq \frac{1}{2} |x-y|.

    Conclude in particular that for {V'} sufficiently small, {\exp} is a homeomorphism between {V'} and an open neighbourhood {\exp(V')} of the origin. (Hint: To show that {\exp(V')} contains a neighbourhood of the origin, use (7) and the contraction mapping theorem.)

  • (iv) Show that

    \displaystyle  \exp(sx) \ast \exp(tx) = \exp((s+t)x) \ \ \ \ \ (8)

    for {s,t \in {\bf R}} and {x \in {\bf R}^d} with {sx, tx} sufficiently small. (Hint: first handle the case when {s,t \in {\bf Z}[\frac{1}{2}]} are dyadic numbers.)

  • (v) Show that for any sufficiently small {x,y \in {\bf R}^d}, one has

    \displaystyle  \exp(x+y) = \lim_{n \rightarrow \infty} (\exp(x/2^n) \ast \exp(y/2^n))^{\ast 2^n}. \ \ \ \ \ (9)

    Then conclude the stronger estimate

    \displaystyle  \exp(x+y) = \lim_{n \rightarrow \infty} (\exp(x/n) \ast \exp(y/n))^{\ast n}. \ \ \ \ \ (10)

  • (vi) Show that for any sufficiently small {x,y \in {\bf R}^d}, one has

    \displaystyle  \exp(x+y) = \exp(x) \ast \exp(y) + O( |x| |y| ).

    (Hint: use the previous part, as well as (viii) of Exercise 9.)

Let us say that a {C^{1,1}} local group is radially homogeneous if one has

\displaystyle  sx \ast tx = (s+t)x \ \ \ \ \ (11)

whenever {s,t \in {\bf R}} and {x \in {\bf R}^d} are such that {sx, tx} are sufficiently small. (In particular, this implies that {x^{\ast -1} = -x} for sufficiently small {x}.) From the above exercise, we see that any {C^{1,1}} local group {V} can be made into a radially homogeneous {C^{1,1}} local group {V'} by first restricting to an open neighbourhood {\exp(V')} of the identity, and then applying the logarithmic homeomorphism {\exp^{-1}}. Thus:

Corollary 8 Every {C^{1,1}} local group has a neighbourhood of the identity which is isomorphic (as a topological group) to a radially homogeneous {C^{1,1}} local group.
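As a sanity check on Corollary 8, here is a Python sketch under stated assumptions: the local group is the affine group in the chart {(p,q)} corresponding to the matrix with rows {(1+p,q)} and {(0,1)} (our choice of example), and the exponential map is taken from its closed form {(e^p-1, q(e^p-1)/p)} rather than from the limit (6). All function names are hypothetical. Pulling the group law back by {\exp} gives a new law for which (11) can be tested directly, while the original law fails (11).

```python
import math


def star(x, y):
    """Original group law in the chart (p, q) <-> [[1+p, q], [0, 1]]."""
    p1, q1 = x
    p2, q2 = y
    return (p1 + p2 + p1 * p2, q1 + q2 + p1 * q2)


def exp_chart(x):
    """Closed form of the exponential map in this chart."""
    p, q = x
    ratio = math.expm1(p) / p if p != 0.0 else 1.0
    return (math.expm1(p), q * ratio)


def log_chart(g):
    """Inverse of exp_chart near the origin."""
    big_p, big_q = g
    p = math.log1p(big_p)
    ratio = math.expm1(p) / p if p != 0.0 else 1.0
    return (p, big_q / ratio)


def star_homogeneous(x, y):
    """Group law transported to exponential coordinates."""
    return log_chart(star(exp_chart(x), exp_chart(y)))


def scaled(s, x):
    """Scalar multiple s * x of a point of the chart."""
    return (s * x[0], s * x[1])


X = (0.4, -0.7)
S, T = 0.3, 0.5
print("original law:   ", star(scaled(S, X), scaled(T, X)))
print("homogeneous law:", star_homogeneous(scaled(S, X), scaled(T, X)))
print("(s + t) x:      ", scaled(S + T, X))
```

Only the second printed pair should agree with {(s+t)x} up to rounding.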

Now we study the exponential map on global Lie groups. If {G} is a global Lie group, and {{\mathfrak g}} is its Lie algebra, we define the exponential map {\exp: {\mathfrak g} \rightarrow G} by setting

\displaystyle  \exp( \gamma'(0) ) := \lim_{n \rightarrow \infty} \gamma(1/2^n)^{2^n}

whenever {\gamma: I \rightarrow G} is a smooth curve with {\gamma(0)=1}.

Exercise 11 Let {G} be a global Lie group.

  • (i) Show that the exponential map is well-defined. (Hint: First handle the case when {\gamma'(0)} is small, using the previous exercise, then bootstrap to larger values of {\gamma'(0)}.)
  • (ii) Show that for all {x, y \in {\mathfrak g}} and {s,t \in {\bf R}}, one has

    \displaystyle  \exp(sx) \exp(tx) = \exp((s+t)x) \ \ \ \ \ (12)

    and

    \displaystyle  \exp(x+y) = \lim_{n \rightarrow \infty} (\exp(x/n) \exp(y/n))^{n}. \ \ \ \ \ (13)

    (Hint: again, begin with the case when {x, y} are small.)

  • (iii) Show that the exponential map is continuous.
  • (iv) Show that for each {x \in {\mathfrak g}}, the function {t \mapsto \exp(tx)} is the unique homomorphism from {{\bf R}} to {G} that is differentiable at {t=0} with derivative equal to {x}.
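The product formula (13) can be sanity-checked numerically in the matrix group {GL_2({\bf R})}, where {\exp} is the matrix exponential. The Python sketch below is only an illustration: the helper names, the truncated Taylor series used for the exponential, and the test matrices are all our own choices.

```python
import math


def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def scaled(c, a):
    return [[c * v for v in row] for row in a]


def added(a, b):
    return [[u + v for u, v in zip(ra, rb)] for ra, rb in zip(a, b)]


def expm(a, terms=40):
    """Matrix exponential by truncated Taylor series (adequate for small matrices)."""
    result = identity(len(a))
    term = identity(len(a))
    for k in range(1, terms):
        term = scaled(1.0 / k, matmul(term, a))
        result = added(result, term)
    return result


def matpow(a, m):
    """a^m for an integer m >= 0, by repeated squaring."""
    result = identity(len(a))
    base = a
    while m > 0:
        if m & 1:
            result = matmul(result, base)
        base = matmul(base, base)
        m >>= 1
    return result


def max_abs_diff(a, b):
    return max(abs(u - v) for ra, rb in zip(a, b) for u, v in zip(ra, rb))


def trotter_product(x, y, n):
    """(exp(x/n) exp(y/n))^n, the n-th approximant on the right-hand side of (13)."""
    step = matmul(expm(scaled(1.0 / n, x)), expm(scaled(1.0 / n, y)))
    return matpow(step, n)


X = [[0.0, 1.0], [0.0, 0.0]]
Y = [[0.0, 0.0], [1.0, 0.0]]
# exp(X + Y) is known in closed form here: [[cosh 1, sinh 1], [sinh 1, cosh 1]].
EXACT = [[math.cosh(1.0), math.sinh(1.0)], [math.sinh(1.0), math.cosh(1.0)]]

for n in (1, 10, 100, 1000):
    print(n, max_abs_diff(trotter_product(X, Y, n), EXACT))
```

The matrices {X} and {Y} do not commute, so {\exp(X)\exp(Y) \neq \exp(X+Y)}; the printed errors should nevertheless decay roughly like {1/n}.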

Proposition 9 (Lie’s first theorem) Let {G} be a Lie group. Then the exponential map is smooth. Furthermore, there is an open neighbourhood {U} of the origin in {{\mathfrak g}} and an open neighbourhood {V} of the identity in {G} such that the exponential map {\exp} is a diffeomorphism from {U} to {V}.

Proof: We begin with the smoothness. From the homomorphism property we see that

\displaystyle  \frac{d}{dt} \exp(tx) = (\rho_{\exp(tx)}^{left})_* x

for all {x \in {\mathfrak g}} and {t \in {\bf R}}. If {x} and {t} are sufficiently small, and one uses a coordinate chart {\phi} near the origin, the function {f(t,x) := \phi(\exp(tx))} then satisfies an ODE of the form

\displaystyle  \frac{d}{dt} f(t,x) = F( f(t,x), x )

for some smooth function {F}, with initial condition {f(0,x) = 0}; thus by the fundamental theorem of calculus we have

\displaystyle  f(t,x) = \int_0^t F(f(t',x), x)\ dt'. \ \ \ \ \ (14)

Now let {k \geq 0}. An application of the contraction mapping theorem (in the function space {L^\infty_t C^k_x} localised to a small region of spacetime) then shows that {f} lies in {L^\infty_t C^k_x} for small enough {t,x}, and by further iteration of the integral equation we then conclude that {f(t,x)} is {k} times continuously differentiable for small enough {t,x}. By (12) we then conclude that {\exp} is smooth everywhere.

Since

\displaystyle  \frac{d}{dt} \exp(tx)|_{t=0} = x

we see that the derivative of the exponential map at the origin is the identity map on {{\mathfrak g}}. The second claim of the proposition thus follows from the inverse function theorem. \Box

In view of this proposition, we see that given a vector space basis {X_1,\ldots,X_d} for the Lie algebra {{\mathfrak g}}, we may obtain a smooth coordinate chart {\phi: U \rightarrow V} for some neighbourhood {U} of the identity and neighbourhood {V} of the origin in {{\bf R}^d} by defining

\displaystyle  \phi( \exp( t_1 X_1 + \ldots + t_d X_d ) ) := (t_1,\ldots,t_d)

for sufficiently small {t_1,\ldots,t_d \in {\bf R}}. These are known as exponential coordinates of the first kind. Although we will not use them much here, we also note that there are exponential coordinates of the second kind, in which the expression {\exp(t_1 X_1 + \ldots + t_d X_d)} is replaced by the slight variant {\exp(t_1 X_1) \ldots \exp(t_d X_d)}.

Using exponential coordinates of the first kind, we see that we may identify a local piece {U} of the Lie group {G} with the radially homogeneous {C^{1,1}} local group {V}. In the next section, we will analyse such radially homogeneous {C^{1,1}} groups further. For now, let us record some easy consequences of the existence of exponential coordinates. Define a one-parameter subgroup of a topological group {G} to be a continuous homomorphism {\phi: {\bf R} \rightarrow G} from {{\bf R}} to {G}.

Exercise 12 (Classification of one-parameter subgroups) Let {G} be a Lie group. For any {X \in {\mathfrak g}}, show that the map {t \mapsto \exp(tX)} is a one-parameter subgroup. Conversely, if {\phi: {\bf R} \rightarrow G} is a one-parameter subgroup, there exists a unique {X \in {\mathfrak g}} such that {\phi(t) = \exp(tX)} for all {t \in {\bf R}}. (Hint: mimic the proof of Proposition 1 of Notes 0.)

Proposition 10 (Weak regularity implies strong regularity) Let {G, H} be global Lie groups, and let {\Phi: G \rightarrow H} be a continuous homomorphism. Then {\Phi} is smooth.

Proof: Since {\Phi} is a continuous homomorphism, it maps one-parameter subgroups of {G} to one-parameter subgroups of {H}. Thus, for every {X \in {\mathfrak g}}, there exists a unique element {L(X) \in {\mathfrak h}} such that

\displaystyle  \Phi(\exp(tX)) = \exp(tL(X))

for all {t \in {\bf R}}. In particular, we see that {L} is homogeneous: {L(sX) = sL(X)} for all {X \in {\mathfrak g}} and {s \in {\bf R}}. Next, we observe using (9) and the fact that {\Phi} is a continuous homomorphism that for any {X, Y \in {\mathfrak g}} and {t \in {\bf R}}, one has

\displaystyle  \Phi(\exp(t(X+Y))) = \Phi( \lim_{n\rightarrow \infty} (\exp(tX/2^n) \exp(tY/2^n))^{2^n} )

\displaystyle  = \lim_{n \rightarrow \infty} (\Phi(\exp(tX/2^n)) \Phi(\exp(tY/2^n)))^{2^n}

\displaystyle  = \lim_{n \rightarrow \infty} (\exp(tL(X)/2^n) \exp(tL(Y)/2^n))^{2^n}

\displaystyle  = \exp( t(L(X)+L(Y)) )

and thus {L} is additive:

\displaystyle  L(X+Y) = L(X)+L(Y).

We conclude that {L} is a linear transformation from the finite-dimensional vector space {{\mathfrak g}} to the finite-dimensional vector space {{\mathfrak h}}. In particular, {L} is smooth. On the other hand, we have

\displaystyle  \Phi(\exp(X)) = \exp(L(X)).

Since {\exp: {\mathfrak g} \rightarrow G} and {\exp: {\mathfrak h} \rightarrow H} are diffeomorphisms near the origin, we conclude that {\Phi} is smooth in a neighbourhood of the identity. Using the homomorphism property (and the fact that the group operations are smooth for both {G} and {H}) we conclude that {\Phi} is smooth everywhere, as required. \Box
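As a concrete instance of the proof above (the example is our choice), take {G = GL_2({\bf R})}, {H = {\bf R}^\times = GL_1({\bf R})} and {\Phi = \det}. Then the linear map {L} is the trace, by the classical identity {\det \exp(X) = \exp(\hbox{tr} X)}. A minimal Python sketch, with hypothetical helper names and a truncated Taylor series standing in for the matrix exponential, checks the relation {\Phi(\exp(tX)) = \exp(tL(X))} numerically.

```python
import math


def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def scaled(c, a):
    return [[c * v for v in row] for row in a]


def added(a, b):
    return [[u + v for u, v in zip(ra, rb)] for ra, rb in zip(a, b)]


def expm(a, terms=40):
    """Matrix exponential by truncated Taylor series (adequate for small matrices)."""
    result = identity(len(a))
    term = identity(len(a))
    for k in range(1, terms):
        term = scaled(1.0 / k, matmul(term, a))
        result = added(result, term)
    return result


def det2(a):
    """Determinant of a 2 x 2 matrix; plays the role of the homomorphism Phi."""
    return a[0][0] * a[1][1] - a[0][1] * a[1][0]


def trace(a):
    """Trace of a square matrix; plays the role of the linear map L."""
    return sum(a[i][i] for i in range(len(a)))


X = [[0.3, -1.2], [0.7, 0.1]]
for t in (0.5, 1.0, 2.0):
    print(t, det2(expm(scaled(t, X))), math.exp(t * trace(X)))
```

The two printed columns should agree up to rounding for every {t}.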

This fact has a pleasant corollary:

Corollary 11 (Uniqueness of Lie structure) Any (global) topological group can be made into a Lie group in at most one manner. More precisely, given a topological group {G}, there is at most one smooth structure one can place on {G} that makes the group operations smooth.

Proof: Suppose for sake of contradiction that one could find two different smooth structures on {G} that make the group operations smooth, leading to two different Lie groups {G', G''} based on {G}. The identity map from {G'} to {G''} is a continuous homomorphism, and hence smooth by the preceding proposition; similarly for the inverse map from {G''} to {G'}. This implies that the smooth structures coincide, and the claim follows. \Box

Note that a general high-dimensional topological manifold may have more than one smooth structure, which may even be non-diffeomorphic to each other (as the example of exotic spheres demonstrates), so this corollary is not entirely vacuous.

Exercise 13 Let {G} be a connected (global) Lie group, let {H} be another (global) Lie group, and let {\Phi: G \rightarrow H} be a continuous homomorphism (which is thus smooth by Proposition 10). Show that {\Phi} is uniquely determined by the derivative map {D\Phi(1): {\mathfrak g} \rightarrow {\mathfrak h}}. In other words, if {\Phi': G \rightarrow H} is another continuous homomorphism with {D\Phi(1)=D\Phi'(1)}, then {\Phi = \Phi'}. (Hint: first prove this in a small neighbourhood of the origin. What group does this neighbourhood generate?) What happens if {G} is not connected?

Exercise 14 (Weak regularity implies strong regularity, local version) Let {G, H} be local Lie groups, and let {\Phi: G \rightarrow H} be a continuous homomorphism. Show that {\Phi} is smooth in a neighbourhood of the identity in {G}.

Exercise 15 (Local Lie implies Lie) Let {G} be a global topological group. Suppose that there is an open neighbourhood {U} of the identity such that the local group {G\downharpoonright_U} can be given the structure of a local Lie group. Show that {G} can be given the structure of a global Lie group. (Hint: We already have at least one coordinate chart on {G}; translate it around to create an atlas of such charts. To show compatibility of the charts and global smoothness of the group, one needs to show that the conjugation maps {x \mapsto gxg^{-1}} are smooth near the origin for any {g \in G}. To prove this, use Exercise 14.)

— 5. The Baker-Campbell-Hausdorff formula —

We now study radially homogeneous {C^{1,1}} local groups in more detail. We will show

Theorem 12 (Baker-Campbell-Hausdorff formula, qualitative version) Let {V \subset {\bf R}^d} be a radially homogeneous {C^{1,1}} local group. Then the group operation {\ast} is real analytic near the origin. In particular, after restricting {V} to a sufficiently small neighbourhood of the origin, one obtains a local Lie group.

We will in fact give a more precise formula for {\ast}, known as the Baker-Campbell-Hausdorff-Dynkin formula, in the course of proving Theorem 12.

Remark 7 In the case where {V} comes from viewing a general linear group {GL_n({\bf C})} in local exponential coordinates, the group operation {\ast} is given by {x \ast y = \log(\exp(x) \exp(y))} for sufficiently small {x,y \in M_n({\bf C})}. Thus, a corollary of Theorem 12 is that this map is real analytic.

We begin the proof of Theorem 12. Throughout this section, {V \subset {\bf R}^d} is a fixed radially homogeneous {C^{1,1}} local group. We will need some variants of the basic bound (4).

Exercise 16 (Lipschitz bounds) If {x,y,z \in V} are sufficiently small, establish the bounds

\displaystyle  x \ast y = x + y + O( |x+y| |y| ) \ \ \ \ \ (15)

\displaystyle  x \ast y = x + y + O( |x+y| |x| ) \ \ \ \ \ (16)

\displaystyle  x \ast y = x \ast z + O( |y-z| ) \ \ \ \ \ (17)

and

\displaystyle  y \ast x = z \ast x + O( |y-z| ). \ \ \ \ \ (18)

(Hint: to prove (15), start with the identity {(x \ast y) \ast (-y) = x}.)

Now we exploit the radial homogeneity to describe the conjugation operation {y \mapsto x \ast y \ast (-x)} as a linear map:

Lemma 13 (Adjoint representation) For all {x} sufficiently close to the origin, there exists a linear transformation {\hbox{Ad}_x: {\bf R}^d \rightarrow {\bf R}^d} such that {x \ast y \ast (-x) = \hbox{Ad}_x(y)} for all {y} sufficiently close to the origin.

Remark 8 Using the matrix example from Remark 7, we are asserting here that

\displaystyle  \exp(x) \exp(y) \exp(-x) = \exp( \hbox{Ad}_x(y) )

for some linear transform {\hbox{Ad}_x(y)} of {y}, and all sufficiently small {x,y}. Indeed, using the basic matrix identity {\exp(AxA^{-1}) = A\exp(x)A^{-1}} for invertible {A} (coming from the fact that the conjugation map {x \mapsto A x A^{-1}} is a continuous ring homomorphism) we see that we may take {\hbox{Ad}_x(y) = \exp(x) y \exp(-x)} here.

Proof: Fix {x}. The map {y \mapsto x \ast y \ast (-x)} is continuous near the origin, so it will suffice to establish additivity, in the sense that

\displaystyle  x \ast (y+z) \ast (-x) = (x \ast y \ast (-x)) + (x \ast z \ast (-x))

for {y,z} sufficiently close to the origin.

Let {n} be a large natural number. Then from (11) we have

\displaystyle  (y+z) = (\frac{1}{n} y + \frac{1}{n} z)^{\ast n}.

Conjugating this by {x}, we see that

\displaystyle  x \ast (y+z) \ast (-x) = (x \ast (\frac{1}{n} y + \frac{1}{n} z) \ast (-x))^{\ast n}

\displaystyle  = n (x \ast (\frac{1}{n} y + \frac{1}{n} z) \ast (-x)).

But from (4) we have

\displaystyle  \frac{1}{n} y + \frac{1}{n} z = \frac{1}{n} y \ast \frac{1}{n} z + O( \frac{1}{n^2} )

and thus (by Exercise 16)

\displaystyle  x \ast (\frac{1}{n} y + \frac{1}{n} z) \ast (-x) = x \ast \frac{1}{n} y \ast \frac{1}{n} z \ast (-x) + O( \frac{1}{n^2} ).

But if we split {x \ast \frac{1}{n} y \ast \frac{1}{n} z \ast (-x)} as the product of {x \ast \frac{1}{n} y \ast (-x) } and {x \ast \frac{1}{n} z \ast (-x)} and use (4), we have

\displaystyle  x \ast \frac{1}{n} y \ast \frac{1}{n} z \ast (-x) = x \ast \frac{1}{n} y \ast (-x) + x \ast \frac{1}{n} z \ast (-x) + O(\frac{1}{n^2}).

Putting all this together we see that

\displaystyle  x \ast (y+z) \ast (-x) = n( x \ast \frac{1}{n} y \ast (-x) + x \ast \frac{1}{n} z \ast (-x) + O(\frac{1}{n^2}) )

\displaystyle  = x \ast y \ast (-x) + x \ast z \ast (-x) + O(\frac{1}{n});

sending {n \rightarrow \infty} we obtain the claim. \Box

From (4) we see that

\displaystyle  \| \hbox{Ad}_x - I \|_{op} = O(|x|)

for {x} sufficiently small. Also from the associativity property we see that

\displaystyle  \hbox{Ad}_{x\ast y} = \hbox{Ad}_x \hbox{Ad}_y \ \ \ \ \ (19)

for all {x,y} sufficiently small. Combining these two properties (and using (15)) we conclude in particular that

\displaystyle  \| \hbox{Ad}_x - \hbox{Ad}_y \|_{op} = O(|x-y|) \ \ \ \ \ (20)

for {x,y} sufficiently small. Thus we see that {\hbox{Ad}} is a (locally) continuous linear representation. In particular, {t \mapsto \hbox{Ad}_{tx}} is a (locally) continuous homomorphism into a linear group, and so (by Proposition 1 of Notes 0) we have the Hadamard lemma

\displaystyle  \hbox{Ad}_x = \exp( \hbox{ad}_x )

for all sufficiently small {x}, where {\hbox{ad}_x: {\bf R}^d \rightarrow {\bf R}^d} is the linear transformation

\displaystyle  \hbox{ad}_x = \frac{d}{dt} \hbox{Ad}_{tx} |_{t=0}.
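In the matrix example of Remarks 7 and 8, where {\hbox{Ad}_x(y) = \exp(x) y \exp(-x)}, differentiating {\hbox{Ad}_{tx}(y)} at {t=0} gives {\hbox{ad}_x y = xy - yx}, and the Hadamard lemma becomes an identity between two explicit matrices. The Python sketch below (hypothetical helper names, truncated Taylor series, and test matrices of our choosing) compares the two sides numerically.

```python
def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def scaled(c, a):
    return [[c * v for v in row] for row in a]


def added(a, b):
    return [[u + v for u, v in zip(ra, rb)] for ra, rb in zip(a, b)]


def expm(a, terms=40):
    """Matrix exponential by truncated Taylor series (adequate for small matrices)."""
    result = identity(len(a))
    term = identity(len(a))
    for k in range(1, terms):
        term = scaled(1.0 / k, matmul(term, a))
        result = added(result, term)
    return result


def max_abs_diff(a, b):
    return max(abs(u - v) for ra, rb in zip(a, b) for u, v in zip(ra, rb))


def bracket(x, y):
    """Matrix commutator xy - yx, which is ad_x y in the matrix case."""
    return added(matmul(x, y), scaled(-1.0, matmul(y, x)))


def conjugate_by_exp(x, y):
    """Ad_x(y) in the matrix case: exp(x) y exp(-x)."""
    return matmul(matmul(expm(x), y), expm(scaled(-1.0, x)))


def exp_of_ad(x, y, terms=40):
    """exp(ad_x) y, computed as the series sum_k ad_x^k y / k!."""
    result = [row[:] for row in y]
    term = [row[:] for row in y]
    for k in range(1, terms):
        term = scaled(1.0 / k, bracket(x, term))
        result = added(result, term)
    return result


X = [[0.1, 0.4], [-0.3, 0.2]]
Y = [[0.5, -0.1], [0.2, 0.3]]
print(max_abs_diff(conjugate_by_exp(X, Y), exp_of_ad(X, Y)))
```

The printed discrepancy should be at the level of rounding error.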

From (19), (20), (4) we see that

\displaystyle  \hbox{Ad}_{tx} \hbox{Ad}_{ty} = \hbox{Ad}_{t(x+y)} + O(|t|^2)

for {x,y,t} sufficiently small, and so by the product rule we have

\displaystyle  \hbox{ad}_{x+y} = \hbox{ad}_x + \hbox{ad}_y.

Also we clearly have {\hbox{ad}_{tx} = t \hbox{ad}_x} for {x, t} small. Thus we see that {\hbox{ad}_x} is linear in {x}, and so we have

\displaystyle  \hbox{ad}_x y = [x,y] \ \ \ \ \ (21)

for some bilinear form {[,]: {\bf R}^d \times {\bf R}^d \rightarrow {\bf R}^d}.

One can show that this bilinear form in fact defines a Lie bracket (i.e. it is anti-symmetric and obeys the Jacobi identity), but for now, all we need is that it is manifestly real analytic (since all bilinear forms are polynomial and thus analytic). In particular {\hbox{ad}_x} and {\hbox{Ad}_x} depend analytically on {x}.

We now give an important approximation to {x\ast y} in the case when {y} is small:

Lemma 14 For {x,y} sufficiently small, we have

\displaystyle  x \ast y = x + F(\hbox{Ad}_x) y + O(|y|^2)

where

\displaystyle  F(z) := \frac{z \log z}{z-1}.

Proof: If we write {z := x\ast y - x}, then {z = O(|y|)} (by (4)) and

\displaystyle  (-x) \ast (x+z) = y.

We will shortly establish the approximation

\displaystyle  (-x) \ast (x+z) = \frac{1-\exp(-\hbox{ad}_x)}{\hbox{ad}_x} z + O( |z|^2 ); \ \ \ \ \ (22)

inverting

\displaystyle \frac{1-\exp(-\hbox{ad}_x)}{\hbox{ad}_x} = \frac{\hbox{Ad}_x - 1}{\hbox{Ad}_x \log \hbox{Ad}_x}

we obtain the claim.

It remains to verify (22). Let {n} be a large natural number. We can expand the left-hand side of (22) as a telescoping series

\displaystyle  \sum_{j=0}^{n-1} (-\frac{j+1}{n} x) \ast (\frac{j+1}{n} x + \frac{j+1}{n} z) - (-\frac{j}{n} x) \ast (\frac{j}{n} x + \frac{j}{n} z). \ \ \ \ \ (23)

Using (11), the first summand can be expanded as

\displaystyle  (-\frac{j}{n} x) \ast (-\frac{x}{n})\ast (\frac{x}{n} + \frac{z}{n}) \ast (\frac{j}{n} x + \frac{j}{n} z).

From (15) one has {(-\frac{x}{n}) \ast (\frac{x}{n} + \frac{z}{n}) = \frac{z}{n} + O( \frac{|z|}{n^2} )}, so by (17), (18) we can write the preceding expression as

\displaystyle  (-\frac{j}{n} x) \ast \frac{z}{n} \ast (\frac{j}{n} x + \frac{j}{n} z) + O( \frac{|z|}{n^2} )

which by definition of {\hbox{Ad}} can be rewritten as

\displaystyle  (\hbox{Ad}_{-\frac{j}{n} x} \frac{z}{n}) \ast (-\frac{j}{n} x) \ast (\frac{j}{n} x + \frac{j}{n} z) + O( \frac{|z|}{n^2} ). \ \ \ \ \ (24)

From (15) one has

\displaystyle  (-\frac{j}{n} x) \ast (\frac{j}{n} x + \frac{j}{n} z) = O( |z| )

while from (20) one has {\hbox{Ad}_{-\frac{j}{n} x} \frac{z}{n} = O( |z|/n )}, hence from (4) we can rewrite (24) as

\displaystyle  \hbox{Ad}_{-\frac{j}{n} x} \frac{z}{n} + (-\frac{j}{n} x) \ast (\frac{j}{n} x + \frac{j}{n} z) + O(\frac{|z|^2}{n}) + O( \frac{|z|}{n^2} ).

Inserting this back into (23), we can thus write the left-hand side of (22) as

\displaystyle  (\sum_{j=0}^{n-1} \hbox{Ad}_{-\frac{j}{n} x} \frac{z}{n}) + O( |z|^2 ) + O( \frac{|z|}{n} ).

Writing {\hbox{Ad}_{-\frac{j}{n} x} = \exp( - \frac{j}{n} \hbox{ad}_x )}, and then letting {n \rightarrow \infty}, we conclude (from the convergence of the Riemann sum to the Riemann integral) that

\displaystyle  (-x) \ast (x+z) = \int_0^1 \exp(-t \hbox{ad}_x) z\ dt + O( |z|^2 )

and the claim follows. \Box

Remark 9 In the matrix case, the key computation is to show that

\displaystyle  \exp( -x ) \exp( x+z ) = 1 + \frac{1-\exp(-\hbox{ad}_x)}{\hbox{ad}_x} z + O( |z|^2 ).

To see this, we can use the fundamental theorem of calculus to write the left-hand side as

\displaystyle  1 + \int_0^1 \frac{d}{dt} (\exp(-tx) \exp(t(x+z)))\ dt.

Since {\frac{d}{dt} \exp(-tx) = \exp(-tx) (-x)} and {\frac{d}{dt} \exp(t(x+z)) = (x+z) \exp(t(x+z))}, we can rewrite this as

\displaystyle  1 + \int_0^1 \exp(-tx) z \exp(t(x+z))\ dt.

Since {\exp(t(x+z)) = \exp(tx) + O(|z|)}, this becomes

\displaystyle  1 + \int_0^1 \exp(-tx) z \exp(tx)\ dt + O(|z|^2);

since {\exp(-tx) z \exp(tx) = \exp(-t \hbox{ad}_x) z}, we obtain the desired claim.
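The computation in Remark 9 can be tested numerically. In the Python sketch below (helper names and test matrices are our own, and the exponential is a truncated Taylor series), the operator {\frac{1-\exp(-\hbox{ad}_x)}{\hbox{ad}_x}} is applied through its power series {\sum_{k \geq 0} \frac{(-1)^k}{(k+1)!} \hbox{ad}_x^k}, and the remainder is measured for {z} of decreasing size.

```python
def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def scaled(c, a):
    return [[c * v for v in row] for row in a]


def added(a, b):
    return [[u + v for u, v in zip(ra, rb)] for ra, rb in zip(a, b)]


def expm(a, terms=40):
    """Matrix exponential by truncated Taylor series (adequate for small matrices)."""
    result = identity(len(a))
    term = identity(len(a))
    for k in range(1, terms):
        term = scaled(1.0 / k, matmul(term, a))
        result = added(result, term)
    return result


def max_abs_diff(a, b):
    return max(abs(u - v) for ra, rb in zip(a, b) for u, v in zip(ra, rb))


def bracket(x, y):
    """Matrix commutator xy - yx, which is ad_x y in the matrix case."""
    return added(matmul(x, y), scaled(-1.0, matmul(y, x)))


def linear_term(x, z, terms=30):
    """((1 - exp(-ad_x)) / ad_x) z = sum_{k>=0} (-1)^k ad_x^k z / (k+1)!."""
    result = [row[:] for row in z]  # k = 0 term
    term = [row[:] for row in z]
    for k in range(1, terms):
        # The k-th term is -ad_x applied to the (k-1)-th term, divided by k+1.
        term = scaled(-1.0 / (k + 1), bracket(x, term))
        result = added(result, term)
    return result


def remainder(x, z):
    """Size of exp(-x) exp(x+z) - 1 - ((1 - exp(-ad_x)) / ad_x) z."""
    lhs = matmul(expm(scaled(-1.0, x)), expm(added(x, z)))
    rhs = added(identity(len(x)), linear_term(x, z))
    return max_abs_diff(lhs, rhs)


X = [[0.2, -0.5], [0.4, 0.1]]
Z0 = [[0.5, -1.0], [0.3, 0.2]]
for eps in (1e-1, 1e-2, 1e-3):
    print(eps, remainder(X, scaled(eps, Z0)))
```

Shrinking {z} by a factor of ten should shrink the remainder by roughly a factor of a hundred, consistent with the {O(|z|^2)} error term.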

We can integrate the above formula to obtain an exact formula for {\ast}:

Corollary 15 (Baker-Campbell-Hausdorff-Dynkin formula) For {x, y} sufficiently small, one has

\displaystyle  x \ast y = x + \int_0^1 F( \hbox{Ad}_x \hbox{Ad}_{ty} ) y\ dt.

The right-hand side is clearly real analytic in {x} and {y}, and Theorem 12 follows.

Proof: Let {n} be a large natural number. We can express {x\ast y} as the telescoping sum

\displaystyle  x + \sum_{j=0}^{n-1} x \ast (\frac{j+1}{n} y) - x \ast (\frac{j}{n} y).

From (11) followed by Lemma 14 and (19), one has

\displaystyle  x \ast (\frac{j+1}{n} y) = x \ast (\frac{j}{n} y) \ast \frac{y}{n}

\displaystyle = x \ast (\frac{j}{n} y) + F( \hbox{Ad}_x \hbox{Ad}_{\frac{j}{n} y} ) \frac{y}{n} + O( \frac{1}{n^2} ).

We conclude that

\displaystyle  x\ast y = x + \frac{1}{n} \sum_{j=0}^{n-1} F( \hbox{Ad}_x \hbox{Ad}_{\frac{j}{n} y} ) y + O( \frac{1}{n} ).

Sending {n \rightarrow \infty}, so that the Riemann sum converges to a Riemann integral, we obtain the claim. \Box
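In the matrix setting of Remark 7, where {x \ast y = \log(\exp(x)\exp(y))}, one can compare {x \ast y} against the first few terms of the Baker-Campbell-Hausdorff series. The Python sketch below uses the standard low-order terms {x + y + \frac{1}{2}[x,y] + \frac{1}{12}[x,[x,y]] - \frac{1}{12}[y,[x,y]]}, which are quoted here as a known fact rather than derived from Corollary 15. The helper names, the truncated series for {\exp} and {\log}, and the test matrices are all assumptions made for illustration.

```python
def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def scaled(c, a):
    return [[c * v for v in row] for row in a]


def added(a, b):
    return [[u + v for u, v in zip(ra, rb)] for ra, rb in zip(a, b)]


def expm(a, terms=40):
    """Matrix exponential by truncated Taylor series (adequate for small matrices)."""
    result = identity(len(a))
    term = identity(len(a))
    for k in range(1, terms):
        term = scaled(1.0 / k, matmul(term, a))
        result = added(result, term)
    return result


def logm_near_identity(g, terms=60):
    """log(g) via the series in (g - 1); only valid for g close to the identity."""
    n = len(g)
    m = added(g, scaled(-1.0, identity(n)))
    result = [[0.0] * n for _ in range(n)]
    power = identity(n)
    for k in range(1, terms):
        power = matmul(power, m)
        result = added(result, scaled((-1.0) ** (k + 1) / k, power))
    return result


def max_abs_diff(a, b):
    return max(abs(u - v) for ra, rb in zip(a, b) for u, v in zip(ra, rb))


def bracket(x, y):
    """Matrix commutator xy - yx."""
    return added(matmul(x, y), scaled(-1.0, matmul(y, x)))


def star(x, y):
    """x * y = log(exp(x) exp(y)) for small matrices x, y."""
    return logm_near_identity(matmul(expm(x), expm(y)))


def bch_third_order(x, y):
    """x + y + [x,y]/2 + [x,[x,y]]/12 - [y,[x,y]]/12."""
    xy = bracket(x, y)
    third = scaled(1.0 / 12.0, added(bracket(x, xy), scaled(-1.0, bracket(y, xy))))
    return added(added(x, y), added(scaled(0.5, xy), third))


X = [[0.0, 0.2], [0.0, 0.0]]
Y = [[0.0, 0.0], [0.2, 0.0]]
print("error of x + y:          ", max_abs_diff(star(X, Y), added(X, Y)))
print("error of third-order BCH:", max_abs_diff(star(X, Y), bch_third_order(X, Y)))
```

The second error is of fourth order in the size of {x, y}, and so should be markedly smaller than the first.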

Exercise 17 Use the Taylor-type expansion

\displaystyle  F(z) = 1 - \frac{1/z - 1}{2} + \frac{(1/z-1)^2}{3} - \frac{(1/z-1)^3}{4} + \ldots

to obtain the explicit expansion

\displaystyle  x \ast y = x + \sum_{n=0}^\infty \frac{(-1)^m}{n+1}

\displaystyle \sum_{\stackrel{r_i,s_i \geq 0}{(r_i,s_i) \neq (0,0)}} \frac{(\hbox{ad}_y)^{r_1} (\hbox{ad}_x)^{s_1} \ldots (\hbox{ad}_y)^{r_n} (\hbox{ad}_x)^{s_n} }{r_1! s_1! \ldots r_n! s_n! (r_1+\ldots+r_n+1)} y

where {m := n+r_1+\ldots+r_n+s_1+\ldots+s_n}, and show that the series is absolutely convergent for {x, y} small enough. Invert this to obtain the alternate expansion

\displaystyle  x \ast y = y + \sum_{n=0}^\infty \frac{(-1)^n}{n+1}

\displaystyle  \sum_{\stackrel{r_i,s_i \geq 0}{(r_i,s_i) \neq (0,0)}} \frac{(\hbox{ad}_x)^{r_1} (\hbox{ad}_y)^{s_1} \ldots (\hbox{ad}_x)^{r_n} (\hbox{ad}_y)^{s_n} }{r_1! s_1! \ldots r_n! s_n! (r_1+\ldots+r_n+1)} x.

Exercise 18 Let {V} be a radially homogeneous {C^{1,1}} local group. By Theorem 12, an open neighbourhood of the origin in {V} has the structure of a local Lie group, and thus by Exercise 8 is associated to a Lie algebra. Show that this Lie algebra is isomorphic to {{\bf R}^d} and the Lie bracket {[,]} is given by (21). Note that this establishes a posteriori the fact that the bracket {[,]} occurring in (21) is anti-symmetric and obeys the Jacobi identity.

We now record some consequences of the Baker-Campbell-Hausdorff formula.

Exercise 19 (Lie groups are analytic) Let {G} be a global Lie group. Show that {G} is a real analytic manifold (i.e. one can find an atlas of smooth coordinate charts whose transition maps are all real analytic), and that the group operations are also real analytic (i.e. they are real analytic when viewed in the above-mentioned coordinate charts). Furthermore, show that any continuous homomorphism between Lie groups is also real analytic.

Exercise 20 (Lie’s second theorem) Let {G, H} be global Lie groups, and let {\phi: {\mathfrak g} \rightarrow {\mathfrak h}} be a Lie algebra homomorphism. Show that there exists an open neighbourhood {U} of the identity in {G} and a homomorphism {\Phi: U \rightarrow H} from the local Lie group {G\downharpoonright_U} to {H} such that {D\Phi(1) = \phi}. If {G} is connected and simply connected, show that one can take {U} to be all of {G}.

Exercise 21 (Lie’s third theorem) Ado’s theorem asserts that every finite-dimensional Lie algebra is isomorphic to a subalgebra of {{\mathfrak gl}_n({\bf R})} for some {n}. This (somewhat difficult) theorem and its proof are discussed in this previous blog post. Assuming Ado’s theorem as a “black box”, conclude the following claims:

  • (i) (Lie’s third theorem, local version) Every finite-dimensional Lie algebra is isomorphic to the Lie algebra of some local Lie group.
  • (ii) Every local or global Lie group has a neighbourhood of the identity that is isomorphic to a local linear Lie group (i.e. a local Lie group contained in {GL_n({\bf R})} or {GL_n({\bf C})} for some {n}).
  • (iii) (Lie’s third theorem, global version) Every finite-dimensional Lie algebra {{\mathfrak g}} is isomorphic to the Lie algebra of some global Lie group. (Hint: from (i) and (ii), one may identify {{\mathfrak g}} with the Lie algebra of a local linear Lie group. Now consider the space of all smooth curves in the ambient linear group that are everywhere “tangent” to this local linear Lie group modulo “homotopy”, and use this to build the global Lie group.)
  • (iv) (Lie’s third theorem, simply connected version) Every finite-dimensional Lie algebra {{\mathfrak g}} is isomorphic to the Lie algebra of some global connected, simply connected Lie group. Furthermore, this Lie group is unique up to isomorphism.
  • (v) Show that every local Lie group {G} has a neighbourhood of the identity that is isomorphic to a neighbourhood of the identity of a global connected, simply connected Lie group. Furthermore, this Lie group is unique up to isomorphism.

Remark 10 One does not need the full strength of Ado’s theorem to establish conclusion (i) of the above exercise. Indeed, it suffices to show that the operation {\ast} defined in Exercise 17 is associative near the origin. To do this, it suffices to verify associativity in the sense of formal power series; and then by abstract nonsense one can lift up to the free Lie algebra on {d} generators, and then down to the free nilpotent Lie algebra on {d} generators and of some arbitrary finite step {s}, which one can verify to be a finite dimensional Lie algebra. Applying Ado’s theorem for the special case of nilpotent Lie algebras (which is easier to establish than the general case of Ado’s theorem, as discussed in this previous blog post), one can identify this nilpotent Lie algebra with a subalgebra of {{\mathfrak gl}_n({\bf R})} for some {n}, and then one can argue as in the above exercise to conclude. However, I do not know how to establish conclusions (ii), (iii) or (iv) without using Ado’s theorem in full generality (and (ii) is in fact equivalent to this theorem, at least in characteristic {0}).

Remark 11 Lie’s three theorems can be interpreted as establishing an equivalence between three different categories: the category of finite-dimensional Lie algebras; the category of local Lie groups (or more precisely, the category of local Lie group germs, formed by identifying local Lie groups that are identical near the origin); and the category of global connected, simply connected Lie groups. See this blog post for further discussion.

The fact that we were able to establish the Baker-Campbell-Hausdorff formula at the {C^{1,1}} regularity level will be useful for the purposes of proving results related to Hilbert’s fifth problem. In particular, we have the following criterion for a group to be Lie (very much in accordance with the “weak regularity implies strong regularity for group-like objects” principle):

Lemma 16 (Criterion for Lie structure) Let {G} be a topological group. Then {G} is Lie if and only if there is a neighbourhood of the identity in {G} which is isomorphic (as a topological group) to a {C^{1,1}} local group.

Proof: The “only if” direction is trivial. For the “if” direction, combine Corollary 8 with Theorem 12 and Exercise 15. \Box

Remark 12 Informally, Lemma 16 asserts that {C^{1,1}} regularity can automatically be upgraded to smooth ({C^\infty}) or even real analytic ({C^\omega}) regularity for topological groups. In contrast, note that a locally Euclidean group has neighbourhoods of the identity that are isomorphic to a “{C^0} local group” (which is the same concept as a {C^{1,1}} local group, but without the asymptotic (4)). Thus we have reduced Hilbert’s fifth problem to the task of boosting {C^0} regularity to {C^{1,1}} regularity, rather than that of boosting {C^0} regularity to {C^\infty} regularity.

Exercise 22 Let {G} be a Lie group with Lie algebra {{\mathfrak g}}. For any {X, Y \in {\mathfrak g}}, show that

\displaystyle  \exp([X,Y]) = \lim_{n \rightarrow \infty} (\exp(X/n) \exp(Y/n)

\displaystyle  \exp(-X/n) \exp(-Y/n))^{n^2}.
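The limit in Exercise 22 can be observed numerically in {GL_2({\bf R})}. The Python sketch below is only a sanity check and not a solution to the exercise; the helper names, the truncated Taylor series for the exponential, and the choice {X = e}, {Y = f} (the standard nilpotent generators of {{\mathfrak sl}_2}, for which {[X,Y]} is the diagonal matrix with entries {1, -1}) are our own.

```python
import math


def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def scaled(c, a):
    return [[c * v for v in row] for row in a]


def added(a, b):
    return [[u + v for u, v in zip(ra, rb)] for ra, rb in zip(a, b)]


def expm(a, terms=40):
    """Matrix exponential by truncated Taylor series (adequate for small matrices)."""
    result = identity(len(a))
    term = identity(len(a))
    for k in range(1, terms):
        term = scaled(1.0 / k, matmul(term, a))
        result = added(result, term)
    return result


def matpow(a, m):
    """a^m for an integer m >= 0, by repeated squaring."""
    result = identity(len(a))
    base = a
    while m > 0:
        if m & 1:
            result = matmul(result, base)
        base = matmul(base, base)
        m >>= 1
    return result


def max_abs_diff(a, b):
    return max(abs(u - v) for ra, rb in zip(a, b) for u, v in zip(ra, rb))


def commutator_power(x, y, n):
    """(exp(x/n) exp(y/n) exp(-x/n) exp(-y/n))^(n^2)."""
    a = expm(scaled(1.0 / n, x))
    b = expm(scaled(1.0 / n, y))
    a_inv = expm(scaled(-1.0 / n, x))
    b_inv = expm(scaled(-1.0 / n, y))
    return matpow(matmul(matmul(a, b), matmul(a_inv, b_inv)), n * n)


X = [[0.0, 1.0], [0.0, 0.0]]
Y = [[0.0, 0.0], [1.0, 0.0]]
# Here [X, Y] = diag(1, -1), so exp([X, Y]) = diag(e, 1/e).
EXACT = [[math.e, 0.0], [0.0, 1.0 / math.e]]

for n in (20, 100, 400):
    print(n, max_abs_diff(commutator_power(X, Y, n), EXACT))
```

The printed errors should decay roughly like {1/n}.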