Let ${{\mathfrak g}}$ be a finite-dimensional Lie algebra (over the reals). Given two sufficiently small elements ${x, y}$ of ${{\mathfrak g}}$, define the right Baker-Campbell-Hausdorff-Dynkin law

$\displaystyle R_y(x) := x + \int_0^1 F_R( \hbox{Ad}_x \hbox{Ad}_{ty} ) y \ dt \ \ \ \ \ (1)$

where ${\hbox{Ad}_x := \exp(\hbox{ad}_x)}$, ${\hbox{ad}_x: {\mathfrak g} \rightarrow {\mathfrak g}}$ is the adjoint map ${\hbox{ad}_x(y) := [x,y]}$, and ${F_R}$ is the function ${F_R(z) := \frac{z \log z}{z-1}}$, which is analytic for ${z}$ near ${1}$. Similarly, define the left Baker-Campbell-Hausdorff-Dynkin law

$\displaystyle L_x(y) := y + \int_0^1 F_L( \hbox{Ad}_{tx} \hbox{Ad}_y ) x\ dt \ \ \ \ \ (2)$

where ${F_L(z) := \frac{\log z}{z-1}}$. One easily verifies that these expressions are well-defined (and depend smoothly on ${x}$ and ${y}$) when ${x}$ and ${y}$ are sufficiently small.
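As a sanity check on (1) and (2), here is a minimal numerical sketch in pure Python. It is only an illustration under stated assumptions: the test algebra is ${{\bf R}^3}$ with the cross product as Lie bracket (a copy of ${\mathfrak{so}(3)}$), and all helper names (`bracket`, `Ad`, `F_L`, `F_R`, `simpson`, `R`, `L`, `bch3`) are made up for this example. ${F_L}$ is evaluated through its power series around ${z=1}$, the integrals through Simpson's rule, and the two laws are compared with each other and with the familiar expansion ${x + y + \frac{1}{2}[x,y] + \frac{1}{12}[x,[x,y]] - \frac{1}{12}[y,[x,y]]}$.

```python
# Sketch only: the test algebra (R^3 with the cross product, a copy of so(3))
# and all helper names below are illustrative assumptions.

def bracket(x, y):
    """Lie bracket [x, y] := x cross y on R^3."""
    return [x[1]*y[2] - x[2]*y[1],
            x[2]*y[0] - x[0]*y[2],
            x[0]*y[1] - x[1]*y[0]]

I3 = [[1.0 if i == j else 0.0 for j in range(3)] for i in range(3)]

def mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def add(A, B, c=1.0):
    """Return A + c*B."""
    return [[A[i][j] + c * B[i][j] for j in range(3)] for i in range(3)]

def apply(A, v):
    return [sum(A[i][k] * v[k] for k in range(3)) for i in range(3)]

def Ad(x, terms=25):
    """Ad_x = exp(ad_x), summed as a power series (fine for small x)."""
    cols = [bracket(x, e) for e in I3]          # column j of ad_x is [x, e_j]
    adx = [[cols[j][i] for j in range(3)] for i in range(3)]
    out, term = I3, I3
    for k in range(1, terms):
        term = [[t / k for t in row] for row in mul(term, adx)]
        out = add(out, term)
    return out

def F_L(M, terms=40):
    """F_L(z) = log(z)/(z-1) = sum_k (-1)^k (z-1)^k/(k+1), for M near I."""
    N = add(M, I3, -1.0)
    out, power = [[0.0] * 3 for _ in range(3)], I3
    for k in range(terms):
        out = add(out, power, (-1.0) ** k / (k + 1))
        power = mul(power, N)
    return out

def F_R(M):
    """F_R(z) = z * F_L(z)."""
    return mul(M, F_L(M))

def simpson(f, n=40):
    """Composite Simpson rule on [0, 1] for an R^3-valued integrand (n even)."""
    total = [0.0, 0.0, 0.0]
    for i in range(n + 1):
        w = 1 if i in (0, n) else (4 if i % 2 else 2)
        total = [s + w * v for s, v in zip(total, f(i / n))]
    return [s / (3 * n) for s in total]

def R(y, x):
    """Right law (1): R_y(x) = x + int_0^1 F_R(Ad_x Ad_{ty}) y dt."""
    Adx = Ad(x)
    integral = simpson(lambda t: apply(F_R(mul(Adx, Ad([t * c for c in y]))), y))
    return [a + b for a, b in zip(x, integral)]

def L(x, y):
    """Left law (2): L_x(y) = y + int_0^1 F_L(Ad_{tx} Ad_y) x dt."""
    Ady = Ad(y)
    integral = simpson(lambda t: apply(F_L(mul(Ad([t * c for c in x]), Ady)), x))
    return [a + b for a, b in zip(y, integral)]

def bch3(x, y):
    """x + y + [x,y]/2 + ([x,[x,y]] - [y,[x,y]])/12, the series to third order."""
    xy = bracket(x, y)
    xxy, yxy = bracket(x, xy), bracket(y, xy)
    return [x[i] + y[i] + xy[i] / 2 + (xxy[i] - yxy[i]) / 12 for i in range(3)]

x, y = [0.1, 0.2, -0.1], [0.05, -0.1, 0.2]
print("R_y(x) =", R(y, x))
print("L_x(y) =", L(x, y))
print("series =", bch3(x, y))
```

For small inputs such as these, the first two printed vectors should agree up to quadrature error, and both should differ from the truncated series only by terms of fourth order in ${x, y}$.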

We have the famous Baker-Campbell-Hausdorff-Dynkin formula:

Theorem 1 (BCH formula) Let ${G}$ be a finite-dimensional Lie group over the reals with Lie algebra ${{\mathfrak g}}$. Let ${\log}$ be a local inverse of the exponential map ${\exp: {\mathfrak g} \rightarrow G}$, defined in a neighbourhood of the identity. Then for sufficiently small ${x, y \in {\mathfrak g}}$, one has

$\displaystyle \log( \exp(x) \exp(y) ) = R_y(x) = L_x(y).$

See for instance these notes of mine for a proof of this formula (it is for ${R_y}$, but one easily obtains a similar proof for ${L_x}$).

In particular, one can give a neighbourhood of the identity in ${{\mathfrak g}}$ the structure of a local Lie group by defining the group operation ${\ast}$ as

$\displaystyle x \ast y := R_y(x) = L_x(y) \ \ \ \ \ (3)$

for sufficiently small ${x, y}$, and the inverse operation by ${x^{-1} := -x}$ (one easily verifies that ${R_x(-x) = L_x(-x) = 0}$ for all small ${x}$).

It is tempting to reverse the BCH formula and conclude (the local form of) Lie’s third theorem, that every finite-dimensional Lie algebra is isomorphic to the Lie algebra of some local Lie group, by using (3) to define a smooth local group structure on a neighbourhood of the identity. (See this previous post for a definition of a local Lie group.) The main difficulty in doing so is in verifying that the definition (3) is well-defined (i.e. that ${R_y(x)}$ is always equal to ${L_x(y)}$) and locally associative. The well-definedness issue can be trivially disposed of by using just one of the expressions ${R_y(x)}$ or ${L_x(y)}$ as the definition of ${\ast}$ (though, as we shall see, it will be very convenient to use both of them simultaneously). However, the associativity is not obvious at all.

With the assistance of Ado’s theorem, which places ${{\mathfrak g}}$ inside the general linear Lie algebra ${\mathfrak{gl}_n({\bf R})}$ for some ${n}$, one can deduce both the well-definedness and associativity of (3) from the Baker-Campbell-Hausdorff formula for ${\mathfrak{gl}_n({\bf R})}$. However, Ado’s theorem is rather difficult to prove (see for instance this previous blog post for a proof), and it is natural to ask whether there is a way to establish these facts without Ado’s theorem.

After playing around with this for some time, I managed to extract a direct proof of well-definedness and local associativity of (3), giving a proof of Lie’s third theorem independent of Ado’s theorem. This is not a new result by any means (indeed, the original proofs of Lie and Cartan of Lie’s third theorem did not use Ado’s theorem), but I found it an instructive exercise to work out the details, and so I am putting it up on this blog in case anyone else is interested (and also because I want to be able to find the argument again if I ever need it in the future).

The key is to observe that the right and left BCH laws commute with each other:

Proposition 2 (Commutativity) Let ${{\mathfrak g}}$ be a finite-dimensional Lie algebra. Then for sufficiently small ${x,y,z}$, one has

$\displaystyle L_y(R_z(x)) = R_z(L_y(x)). \ \ \ \ \ (4)$

Note that this commutativity has to hold if (3) is to be both well-defined and associative. Assuming Proposition 2, we can set ${x=0}$ in (4) and use the easily verified identities ${R_z(0)=z}$, ${L_y(0)=y}$ to conclude that ${L_y(z)=R_z(y)}$ for small ${y,z}$, ensuring that (3) is well-defined; and then inserting (3) into (4) we obtain the desired (local) associativity.
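To spell out the last step: once ${\ast}$ is well-defined, one has ${L_y(w) = y \ast w}$ and ${R_z(w) = w \ast z}$ for small ${w}$, so the two sides of (4) read

$\displaystyle L_y(R_z(x)) = y \ast (x \ast z), \qquad R_z(L_y(x)) = (y \ast x) \ast z,$

and (4) is precisely the associativity law for the triple ${(y, x, z)}$.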

It remains to prove Proposition 2. We first make a convenient observation. Thanks to the Jacobi identity, the adjoint representation ${\hbox{ad}: x \mapsto \hbox{ad}_x}$ is a Lie algebra homomorphism from ${{\mathfrak g}}$ to the Lie algebra ${\mathfrak{gl}({\mathfrak g})}$. As this latter Lie algebra is the Lie algebra of a Lie group, namely ${GL({\mathfrak g})}$, the Baker-Campbell-Hausdorff formula is valid for that Lie algebra. In particular, one has

$\displaystyle \log( \exp( \hbox{ad}_x ) \exp( \hbox{ad}_y ) ) = R_{\hbox{ad}_y}(\hbox{ad}_x) = L_{\hbox{ad}_x}(\hbox{ad}_y)$

for sufficiently small ${x,y}$. But as ${\hbox{ad}}$ is a Lie algebra homomorphism, one has

$\displaystyle R_{\hbox{ad}_y}(\hbox{ad}_x) = \hbox{ad}_{R_y(x)}$

and similarly

$\displaystyle L_{\hbox{ad}_x}(\hbox{ad}_y) = \hbox{ad}_{L_x(y)}.$

Exponentiating, we conclude that

$\displaystyle \hbox{Ad}_x \hbox{Ad}_y = \hbox{Ad}_{R_y(x)} = \hbox{Ad}_{L_x(y)}. \ \ \ \ \ (5)$

This would already give what we want if the adjoint representation were faithful. We unfortunately cannot assume this (and this is the main reason, by the way, why Ado’s theorem is so difficult), but we can at least use (5) to rewrite the formulae (1), (2) as

$\displaystyle R_y(x) = x + \int_0^1 F_R( \hbox{Ad}_{R_{ty}(x)} ) y \ dt$

and

$\displaystyle L_x(y) = y + \int_0^1 F_L( \hbox{Ad}_{L_{tx}(y)} ) x\ dt.$

From these we obtain the radial homogeneity identities

$\displaystyle R_{(s+t)y}(x) = R_{sy}( R_{ty}(x) )$

and

$\displaystyle L_{(s+t)x}(y) = L_{sx}( L_{tx}(y) )$

for all sufficiently small ${x,y \in {\mathfrak g}}$ and ${0 \leq s,t \leq 1}$, as can be seen by a short computation.
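One way to organise this short computation is as follows (a sketch; the curve ${\gamma}$ is notation introduced only for this remark). Replacing ${y}$ by ${sy}$ in (1), using (5) to write ${\hbox{Ad}_x \hbox{Ad}_{tsy} = \hbox{Ad}_{R_{tsy}(x)}}$, and substituting ${u = st}$, one gets

$\displaystyle \gamma(s) := R_{sy}(x) = x + \int_0^s F_R( \hbox{Ad}_{\gamma(u)} ) y\ du,$

so that ${\gamma}$ solves the autonomous ODE

$\displaystyle \gamma'(s) = F_R( \hbox{Ad}_{\gamma(s)} ) y, \qquad \gamma(0) = x.$

Both ${s \mapsto R_{(s+t)y}(x)}$ and ${s \mapsto R_{sy}(R_{ty}(x))}$ solve this ODE with the same initial value ${R_{ty}(x)}$ at ${s=0}$, and so they agree by the Picard uniqueness theorem. The identity for ${L}$ follows in the same way from (2).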

Because of these radial homogeneity identities (together with the smoothness of the right and left BCH laws), it will now suffice to prove the approximate commutativity law

$\displaystyle L_y(R_z(x)) = R_z(L_y(x)) + O(|y|^2 |z|) + O( |y| |z|^2 ) \ \ \ \ \ (6)$

for all small ${x,y,z}$. Indeed, this law implies that

$\displaystyle L_{y/n} \circ R_{z/n} = R_{z/n} \circ L_{y/n} + O(1/n^3) \ \ \ \ \ (7)$

for fixed small ${y,z}$, a large natural number ${n}$, and with the understanding that the operations are only applied to sufficiently small elements ${x}$. From radial homogeneity we have ${L_y = L_{y/n}^n}$ and ${R_z = R_{z/n}^n}$, and so a large number (${O(n^2)}$, to be more precise) of iterations of (7) (using uniform smoothness to control all errors) gives

$\displaystyle L_y \circ R_z = R_z \circ L_y + O(1/n),$

and the claim (4) then follows by sending ${n \rightarrow \infty}$.

It remains to prove (6). When ${y=0}$, then ${L_y}$ is the identity map and the claim is trivial; similarly if ${z=0}$. By Taylor expansion, it thus suffices to establish the infinitesimal commutativity law

$\displaystyle \frac{\partial}{\partial a} \frac{\partial}{\partial b} L_{ay}(R_{bz}(x))|_{a=b=0} = \frac{\partial}{\partial a} \frac{\partial}{\partial b} R_{bz}(L_{ay}(x))|_{a=b=0}.$

(One can interpret this infinitesimal commutativity as a commutativity of the vector fields corresponding to the infinitesimal generators of the left and right BCH laws, although we will not explicitly adopt that perspective here.) This is a simplification, because the infinitesimal versions of (1), (2) are simpler than the non-infinitesimal versions. Indeed, from the fundamental theorem of calculus one has

$\displaystyle \frac{\partial}{\partial a} L_{ay}(w)|_{a=0} = F_L(\hbox{Ad}_w) y$

for any fixed ${y, w}$, and similarly

$\displaystyle \frac{\partial}{\partial b} R_{bz}(v)|_{b=0} = F_R(\hbox{Ad}_v) z.$
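For instance, for the first of these identities, (2) gives

$\displaystyle L_{ay}(w) = w + a \int_0^1 F_L( \hbox{Ad}_{tay} \hbox{Ad}_w ) y\ dt,$

and since ${\hbox{Ad}_{tay}}$ is the identity at ${a=0}$, differentiating the product in ${a}$ at ${a=0}$ leaves just ${\int_0^1 F_L(\hbox{Ad}_w) y\ dt = F_L(\hbox{Ad}_w) y}$. The second identity follows from (1) in the same way.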

Thus it suffices (by Clairaut’s theorem) to show that

$\displaystyle \frac{\partial}{\partial b} F_L( \hbox{Ad}_{R_{bz}(x)} ) y|_{b=0} = \frac{\partial}{\partial a} F_R( \hbox{Ad}_{L_{ay}(x)} ) z|_{a=0}. \ \ \ \ \ (8)$

It will be more convenient to work with the reciprocals ${F_L^{-1}}$, ${F_R^{-1}}$ of the functions ${F_L, F_R}$. Recall the general matrix identity

$\displaystyle \frac{d}{dt} A^{-1}(t) = - A^{-1}(t) A'(t) A^{-1}(t)$

for any smoothly varying invertible matrix function ${A(t)}$ of a real parameter ${t}$. Using this identity, we can write the left-hand side of (8) as

$\displaystyle - F_L( \hbox{Ad}_x ) (\frac{\partial}{\partial b} F_L^{-1}(\hbox{Ad}_{R_{bz}(x)})|_{b=0}) F_L(\hbox{Ad}_x) y.$

If we write ${Y := F_L(\hbox{Ad}_x) y}$ and ${Z := F_R(\hbox{Ad}_x) z}$, then from Taylor expansion we have

$\displaystyle R_{bz}(x) = x + bZ + O(|b|^2)$

and so we can simplify the above expression as

$\displaystyle - F_L(\hbox{Ad}_x) (\frac{\partial}{\partial b} F_L^{-1}(\hbox{Ad}_{x+bZ})|_{b=0}) Y.$

Similarly, the right-hand side of (8) is

$\displaystyle - F_R(\hbox{Ad}_x) (\frac{\partial}{\partial a} F_R^{-1}(\hbox{Ad}_{x+aY})|_{a=0}) Z.$

Since ${F_R(\hbox{Ad}_x) = \hbox{Ad}_x F_L(\hbox{Ad}_x)}$, it thus suffices to show that

$\displaystyle (\frac{\partial}{\partial b} F_L^{-1}(\hbox{Ad}_{x+bZ})|_{b=0}) Y = \hbox{Ad}_x (\frac{\partial}{\partial a} F_R^{-1}(\hbox{Ad}_{x+aY})|_{a=0}) Z. \ \ \ \ \ (9)$

Now, we write

$\displaystyle F_L^{-1}(\hbox{Ad}_{x}) = \int_0^1 \hbox{Ad}_{tx}\ dt$

and

$\displaystyle F_R^{-1}(\hbox{Ad}_{x}) = \int_0^1 \hbox{Ad}_{-tx}\ dt$

and thus expand (9) as

$\displaystyle \int_0^1 (\frac{\partial}{\partial b} \hbox{Ad}_{tx+tbZ})|_{b=0} Y\ dt = \hbox{Ad}_x \int_0^1 (\frac{\partial}{\partial a} \hbox{Ad}_{-tx-taY})|_{a=0} Z\ dt. \ \ \ \ \ (10)$
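The two integral representations of ${F_L^{-1}}$ and ${F_R^{-1}}$ used here are the scalar identities (with ${z = e^u}$)

$\displaystyle \frac{1}{F_L(z)} = \frac{z-1}{\log z} = \frac{e^u - 1}{u} = \int_0^1 e^{tu}\ dt$

and

$\displaystyle \frac{1}{F_R(z)} = \frac{z-1}{z \log z} = \frac{1 - e^{-u}}{u} = \int_0^1 e^{-tu}\ dt,$

applied with ${u = \hbox{ad}_x}$, so that ${e^{tu} = \hbox{Ad}_{tx}}$ and ${e^{-tu} = \hbox{Ad}_{-tx}}$.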

We write ${\hbox{Ad}}$ as the exponential of ${\hbox{ad}}$. Using the Duhamel matrix identity

$\displaystyle \frac{d}{dt} \exp(A(t)) = \int_0^1 \exp(sA(t)) A'(t) \exp((1-s)A(t))\ ds$

for any smoothly varying matrix function ${A(t)}$ of a real variable ${t}$, together with the linearity of ${\hbox{ad}}$, we see that

$\displaystyle (\frac{\partial}{\partial b} \hbox{Ad}_{tx+tbZ})|_{b=0} = \int_0^1 \hbox{Ad}_{stx} t\hbox{ad}_Z \hbox{Ad}_{(1-s)tx}\ ds$

and similarly

$\displaystyle (\frac{\partial}{\partial a} \hbox{Ad}_{-tx-taY})|_{a=0} = -\int_0^1 \hbox{Ad}_{-stx} t\hbox{ad}_Y \hbox{Ad}_{-(1-s)tx}\ ds.$

Collecting terms, our task is now to show that

$\displaystyle \int_0^1 \int_0^1 \hbox{Ad}_{stx} \hbox{ad}_Z \hbox{Ad}_{(1-s)tx} Y\ t ds dt = - \int_0^1 \int_0^1 \hbox{Ad}_{(1-st)x} \hbox{ad}_Y \hbox{Ad}_{-(1-s)tx} Z\ t ds dt. \ \ \ \ \ (11)$

For any ${x \in {\mathfrak g}}$, the adjoint map ${\hbox{ad}_x: {\mathfrak g} \rightarrow {\mathfrak g}}$ is a derivation in the sense that

$\displaystyle \hbox{ad}_x [y,z] = [\hbox{ad}_x y, z ] + [y, \hbox{ad}_x z],$

thanks to the Jacobi identity. Exponentiating, we conclude that

$\displaystyle \hbox{Ad}_x [y,z] = [\hbox{Ad}_x y, \hbox{Ad}_x z]$

(thus each ${\hbox{Ad}_x}$ is a Lie algebra homomorphism) and thus

$\displaystyle \hbox{Ad}_x \hbox{ad}_y \hbox{Ad}_x^{-1} = \hbox{ad}_{\hbox{Ad}_x y}.$
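To see the last identity, apply both sides to an arbitrary ${z \in {\mathfrak g}}$ and use the homomorphism property with ${\hbox{Ad}_x^{-1} z}$ in place of ${z}$:

$\displaystyle \hbox{Ad}_x \hbox{ad}_y \hbox{Ad}_x^{-1} z = \hbox{Ad}_x [y, \hbox{Ad}_x^{-1} z] = [\hbox{Ad}_x y, z] = \hbox{ad}_{\hbox{Ad}_x y} z.$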

Using this, we can simplify (11) as

$\displaystyle \int_0^1 \int_0^1 \hbox{ad}_{\hbox{Ad}_{stx} Z} \hbox{Ad}_{tx} Y\ t ds dt = - \int_0^1 \int_0^1 \hbox{ad}_{\hbox{Ad}_{(1-st)x} Y} \hbox{Ad}_{(1-t)x} Z\ t ds dt$

which we can rewrite as

$\displaystyle \int_0^1 \int_0^1 [\hbox{Ad}_{stx} Z, \hbox{Ad}_{tx} Y]\ t ds dt = - \int_0^1 \int_0^1 [\hbox{Ad}_{(1-st)x} Y, \hbox{Ad}_{(1-t)x} Z]\ t ds dt.$

But by an appropriate change of variables (and the anti-symmetry of the Lie bracket), both sides of this equation can be written as

$\displaystyle \int_{0 \leq a\leq b \leq 1} [\hbox{Ad}_{ax} Z, \hbox{Ad}_{bx} Y]\ da db$

and the claim follows.
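In more detail, the substitutions can be taken as follows. On the left-hand side, set ${a = st}$ and ${b = t}$, so that ${da\, db = t\, ds\, dt}$ and the square ${0 \leq s,t \leq 1}$ is mapped onto the triangle ${0 \leq a \leq b \leq 1}$. On the right-hand side, first use anti-symmetry to write

$\displaystyle - [\hbox{Ad}_{(1-st)x} Y, \hbox{Ad}_{(1-t)x} Z] = [\hbox{Ad}_{(1-t)x} Z, \hbox{Ad}_{(1-st)x} Y],$

and then set ${a = 1-t}$ and ${b = 1-st}$; the Jacobian determinant has absolute value ${t}$, so again ${da\, db = t\, ds\, dt}$, and since ${b}$ ranges over ${[1-t,1] = [a,1]}$ as ${s}$ ranges over ${[0,1]}$, the image is the same triangle.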

Remark 1 The above argument shows that every finite-dimensional Lie algebra ${{\mathfrak g}}$ can be viewed as arising from a local Lie group ${G}$. It is natural to then ask if that local Lie group (or a sufficiently small piece thereof) can in turn be extended to a global Lie group ${\tilde G}$. The answer to this is affirmative, as was first shown by Cartan. I have been unable however to find a proof of this result that does not either use Ado’s theorem, the proof method of Ado’s theorem (in particular, the structural decomposition of Lie algebras into semisimple and solvable factors), or some facts about group cohomology (particularly with regards to central extensions of Lie groups) which are closely related to the structural decompositions just mentioned. (As noted by Serre, though, a certain amount of this sort of difficulty in the proof may in fact be necessary, given that the global form of Lie’s third theorem is known to fail in the infinite-dimensional case.)