This is another post in a series on various components to the solution of Hilbert’s fifth problem. One interpretation of this problem is to ask for a purely topological classification of the topological groups which are isomorphic to Lie groups. (Here we require Lie groups to be finite-dimensional, but allow them to be disconnected.)

There are some obvious necessary conditions on a topological group in order for it to be isomorphic to a Lie group; for instance, it must be Hausdorff and locally compact. These two conditions, by themselves, are not quite enough to force a Lie group structure; consider for instance a ${p}$-adic field ${{\mathbf Q}_p}$ for some prime ${p}$, which is a locally compact Hausdorff topological group which is not a Lie group (the topology is locally that of a Cantor set). Nevertheless, it turns out that by adding some key additional assumptions on the topological group, one can recover Lie structure. One such result, which is a key component of the full solution to Hilbert’s fifth problem, is the following result of von Neumann:

Theorem 1 Let ${G}$ be a locally compact Hausdorff topological group that has a faithful finite-dimensional linear representation, i.e. an injective continuous homomorphism ${\rho: G \rightarrow GL_d({\bf C})}$ into some linear group. Then ${G}$ can be given the structure of a Lie group. Furthermore, after giving ${G}$ this Lie structure, ${\rho}$ becomes smooth (and even analytic) and non-degenerate (the Jacobian always has full rank).

This result is closely related to a theorem of Cartan:

Theorem 2 (Cartan’s theorem) Any closed subgroup ${H}$ of a Lie group ${G}$ is again a Lie group (in particular, ${H}$ is an analytic submanifold of ${G}$, with the induced analytic structure).

Indeed, Theorem 1 immediately implies Theorem 2 in the important special case when the ambient Lie group is a linear group, and in any event it is not difficult to modify the proof of Theorem 1 to give a proof of Theorem 2. However, Theorem 1 is more general than Theorem 2 in some ways. For instance, let ${G}$ be the real line ${{\bf R}}$, which we faithfully represent in the ${2}$-torus ${({\bf R}/{\bf Z})^2}$ using an irrational embedding ${t \mapsto (t,\alpha t) \hbox{ mod } {\bf Z}^2}$ for some fixed irrational ${\alpha}$. The ${2}$-torus can in turn be embedded in a linear group (e.g. by identifying it with ${U(1) \times U(1)}$, or ${SO(2) \times SO(2)}$), thus giving a faithful linear representation ${\rho}$ of ${{\bf R}}$. However, the image is not closed (it is a dense subgroup of a ${2}$-torus), and so Cartan’s theorem does not directly apply (${\rho({\bf R})}$ fails to be a Lie group). Nevertheless, Theorem 1 still applies and guarantees that the original group ${{\bf R}}$ is a Lie group.
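To see the density of the irrational winding numerically, here is a minimal sketch in pure Python. The slope ${\alpha = \sqrt{2}}$, the target point ${(0.3, 0.8)}$, and the helper names `circle_dist` and `closest_visit` are choices made for this illustration only. The search is restricted to times ${t}$ whose first coordinate already matches the target mod ${1}$, and records how close the second coordinate ${\alpha t}$ mod ${1}$ comes to the target.

```python
import math

def circle_dist(x, y):
    """Distance between x and y on the circle R/Z."""
    d = abs(x - y) % 1.0
    return min(d, 1.0 - d)

def closest_visit(alpha, target, max_turns):
    """Among t = target[0] + m for m = 0, ..., max_turns - 1, find the time at
    which (t, alpha * t) mod Z^2 comes closest to `target` on the 2-torus."""
    best_t, best_gap = None, float("inf")
    for m in range(max_turns):
        t = target[0] + m                 # first coordinate matches the target mod 1
        gap = circle_dist(alpha * t, target[1])
        if gap < best_gap:
            best_t, best_gap = t, gap
    return best_t, best_gap

alpha = math.sqrt(2.0)    # irrational slope (up to floating-point rounding)
target = (0.3, 0.8)       # arbitrary point of the torus, chosen for illustration
for turns in (10, 100, 2000):
    t, gap = closest_visit(alpha, target, turns)
    print(turns, t, gap)
```

The reported gap shrinks as the search range grows, consistent with the image being dense, while a point off the line is never hit exactly, consistent with the image not being closed.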

(On the other hand, the image of any compact subset of ${G}$ under a faithful representation ${\rho}$ must be closed, and so Theorem 1 is very close to the version of Theorem 2 for local groups.)

The key to building the Lie group structure on a topological group is to first build the associated Lie algebra structure, by means of one-parameter subgroups.

Definition 3 A one-parameter subgroup of a topological group ${G}$ is a continuous homomorphism ${\phi: {\bf R} \rightarrow G}$ from the real line (with the additive group structure) to ${G}$.

Remark 1 Technically, ${\phi}$ is a parameterisation of a subgroup ${\phi({\bf R})}$, rather than a subgroup itself, but we will abuse notation and refer to ${\phi}$ as the subgroup.

In a Lie group ${G}$, the one-parameter subgroups are in one-to-one correspondence with the Lie algebra ${{\mathfrak g}}$, with each element ${X \in {\mathfrak g}}$ giving rise to a one-parameter subgroup ${\phi(t) := \exp(tX)}$, and conversely each one-parameter subgroup ${\phi}$ giving rise to an element ${\phi'(0)}$ of the Lie algebra; we will establish these basic facts in the special case of linear groups below the fold. On the other hand, the notion of a one-parameter subgroup can be defined in an arbitrary topological group. So this suggests the following strategy if one is to try to represent a topological group ${G}$ as a Lie group:

1. First, form the space ${L(G)}$ of one-parameter subgroups of ${G}$.
2. Show that ${L(G)}$ has the structure of a (finite-dimensional) Lie algebra.
3. Show that ${L(G)}$ “behaves like” the tangent space of ${G}$ at the identity (in particular, the one-parameter subgroups in ${L(G)}$ should cover a neighbourhood of the identity in ${G}$).
4. Conclude that ${G}$ has the structure of a Lie group.

It turns out that this strategy indeed works to give Theorem 1 (and variants of this strategy are ubiquitous in the rest of the theory surrounding Hilbert’s fifth problem).

Below the fold, I record the proof of Theorem 1 (based on the exposition of Montgomery and Zippin). I plan to organise these disparate posts surrounding Hilbert’s fifth problem (and its application to related topics, such as Gromov’s theorem or to the classification of approximate groups) at a later date.

— 1. One-parameter subgroups of linear groups —

Let us first understand the one-parameter subgroups of linear groups ${GL_d({\bf C})}$. Here, we can take advantage of the matrix exponential

$\displaystyle \exp(A) := 1 + A + \frac{A^2}{2!} + \frac{A^3}{3!} + \ldots,$

defined for any ${d \times d}$ complex matrix ${A \in \mathfrak{gl}_d({\bf C})}$, where ${\mathfrak{gl}_d({\bf C})}$ is the Lie algebra of ${GL_d({\bf C})}$, i.e. the space of ${d \times d}$ complex matrices with the usual Lie bracket ${[A,B] := AB-BA}$. One easily verifies that for any such matrix ${A \in \mathfrak{gl}_d({\bf C})}$, the map ${\phi: t \mapsto \exp(tA)}$ is a one-parameter subgroup of ${GL_d({\bf C})}$. Conversely, these are the only such groups:

Proposition 4 Let ${\phi: {\bf R} \rightarrow GL_d({\bf C})}$ be a one-parameter subgroup of ${GL_d({\bf C})}$. Then there exists a (unique) matrix ${A \in \mathfrak{gl}_d({\bf C})}$ such that ${\phi(t) = \exp(tA)}$ for ${t \in {\bf R}}$.

Proof: Uniqueness follows from the differential identity

$\displaystyle \frac{d}{dt} \exp(tA)|_{t=0} = A,$

so we turn to existence. The basic idea here is to take logarithms. By the inverse function theorem, ${\exp}$ is a homeomorphism between a neighbourhood of the origin in ${\mathfrak{gl}_d({\bf C})}$, and a neighbourhood of the identity in ${GL_d({\bf C})}$. Thus, for sufficiently small ${\epsilon > 0}$, we can write ${\phi(t) = \exp( B(t) )}$ for all ${t \in [-\epsilon,\epsilon]}$ and a continuous function ${B: [-\epsilon,\epsilon] \rightarrow \mathfrak{gl}_d({\bf C})}$. In particular we have

$\displaystyle \exp(2 B(t)) = \exp( B(t) )^2 = \exp( B( 2t ) )$

for ${|t| \leq \epsilon/2}$, and thus by the local homeomorphism properties of the exponential

$\displaystyle 2B(t) = B(2t).$

Iterating this we see that

$\displaystyle B( 2^{-n} \epsilon ) = 2^{-n} B(\epsilon)$

for all ${n = 0,1,2,\ldots}$; using the homomorphism nature of ${\phi}$ and the laws of exponentiation (and the local homeomorphism properties of the exponential) we have

$\displaystyle B( q \epsilon ) = q B(\epsilon)$

for all dyadic rationals ${-1 \leq q \leq 1}$. By continuity we thus have

$\displaystyle B(t) = t A$

for all ${t \in [-\epsilon,\epsilon]}$, where ${A := B(\epsilon)/\epsilon}$. This gives ${\phi(t) = \exp(tA)}$ for all ${|t| \leq \epsilon}$, and by applying the homomorphism property of ${\phi}$ we conclude that ${\phi(t)=\exp(tA)}$ for all ${t}$, and the claim follows. $\Box$
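The logarithm argument can be checked numerically. The following is a rough sketch in pure Python, not a robust implementation: it uses truncated power series for ${\exp}$ and for the local logarithm, treats ${\phi}$ as a black box, and recovers the generator as ${B(\epsilon)/\epsilon}$. The matrix `A_true`, the value ${\epsilon = 0.1}$, and all helper names are hypothetical choices made for the illustration.

```python
def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def mat_mul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def scale(c, X):
    return [[c * x for x in row] for row in X]

def add(X, Y):
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

def max_diff(X, Y):
    return max(abs(x - y) for rx, ry in zip(X, Y) for x, y in zip(rx, ry))

def mat_exp(A, terms=40):
    """Truncated series 1 + A + A^2/2! + ...; adequate for small matrices."""
    result = identity(len(A))
    term = identity(len(A))
    for k in range(1, terms):
        term = scale(1.0 / k, mat_mul(term, A))    # term is now A^k / k!
        result = add(result, term)
    return result

def mat_log(M, terms=80):
    """Truncated series log(1 + X) = X - X^2/2 + X^3/3 - ..., for M near 1."""
    X = add(M, scale(-1.0, identity(len(M))))
    result = scale(0.0, X)
    power = identity(len(M))
    for k in range(1, terms):
        power = mat_mul(power, X)                  # power is now X^k
        result = add(result, scale((-1.0) ** (k + 1) / k, power))
    return result

A_true = [[0.5, -1.0], [2.0, 0.25]]                # hypothetical generator

def phi(t):
    """The one-parameter subgroup t -> exp(t A_true), used as a black box."""
    return mat_exp(scale(t, A_true))

def B(t):
    """Local logarithm of phi(t), meaningful only for small |t|."""
    return mat_log(phi(t))

eps = 0.1
homomorphism_error = max_diff(phi(0.3 + 0.9), mat_mul(phi(0.3), phi(0.9)))
doubling_error = max_diff(B(eps), scale(2.0, B(eps / 2)))   # B(2t) = 2 B(t)
A_recovered = scale(1.0 / eps, B(eps))                      # A := B(eps) / eps
recovery_error = max_diff(A_recovered, A_true)
print(homomorphism_error, doubling_error, recovery_error)
```

All three errors should be at the level of floating-point rounding. The logarithm series is only used for ${\phi(t)}$ close to the identity, which is why the sketch works on ${[-\epsilon,\epsilon]}$ as in the proof.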

Now we see the extent to which we can transfer Proposition 4 to the groups ${G}$ appearing in Theorem 1, namely locally compact Hausdorff topological groups with a faithful linear representation ${\rho: G \rightarrow GL_d({\bf C})}$. Every one-parameter subgroup ${\phi: {\bf R} \rightarrow G}$ of course induces a one-parameter subgroup ${\rho \circ \phi: {\bf R} \rightarrow GL_d({\bf C})}$ that takes values in the image ${\rho(G)}$ of ${G}$; from faithfulness, we see that ${\phi}$ is uniquely determined by ${\rho \circ \phi}$. In the converse direction, one would like to say that every one-parameter subgroup ${\Phi: {\bf R} \rightarrow GL_d({\bf C})}$ taking values in ${\rho(G)}$ factors through ${\rho}$ in this manner. This is not quite true as stated; for instance, if ${G}$ is ${GL_d({\bf C})}$ with the discrete topology, and ${\rho}$ is the inclusion map, then only the trivial one-parameter subgroup ${\Phi: t \mapsto 1}$ will factor through ${\rho}$. However, we can fix this by strengthening the hypothesis “${\Phi}$ takes values in ${\rho(G)}$” slightly:

Lemma 5 Let ${G}$, ${\rho}$, ${d}$ be as in Theorem 1, let ${K}$ be a compact neighbourhood of the identity in ${G}$, and let ${\Phi: {\bf R} \rightarrow GL_d({\bf C})}$ be a one-parameter subgroup. Then ${\Phi}$ factors through ${\rho}$ (i.e. ${\Phi = \rho \circ \phi}$ for some one-parameter subgroup ${\phi: {\bf R} \rightarrow G}$) if and only if ${\Phi((-\epsilon,\epsilon)) \subset \rho(K)}$ for some ${\epsilon > 0}$.

Proof: The “only if” is immediate from continuity of ${\phi}$, so we turn to the “if” part. As ${\rho}$ is continuous, ${\rho(K)}$ is compact. The restriction ${\rho\downharpoonright_K: K \rightarrow \rho(K)}$ is then a continuous bijection from a compact space to a Hausdorff one, and is therefore a homeomorphism (since it maps closed (hence compact) subsets of ${K}$ to compact (hence closed) subsets of ${\rho(K)}$).

For future reference, we note that this already shows that ${G}$ is first-countable, and hence metrisable, thanks to the Birkhoff-Kakutani theorem as discussed earlier.

Since ${\Phi((-\epsilon,\epsilon)) \subset \rho(K)}$, we see from the homomorphism property that ${\Phi({\bf R}) \subset \rho(G)}$. As ${\rho}$ is faithful, we thus have ${\Phi = \rho \circ \phi}$ for some homomorphism ${\phi: {\bf R} \rightarrow G}$, with ${\phi((-\epsilon,\epsilon)) \subset K}$. As ${\rho}$ is a homeomorphism from ${K}$ to ${\rho(K)}$, we conclude that ${\phi}$ is continuous on ${(-\epsilon,\epsilon)}$, and hence (by the homomorphism property of ${\phi}$) is continuous on all of ${{\bf R}}$, and the claim follows. $\Box$

Exercise 1 If we make the additional assumption that ${G}$ is ${\sigma}$-compact, show that any one-parameter subgroup ${\Phi: {\bf R} \rightarrow GL_d({\bf C})}$ taking values in ${\rho(G)}$ factors through ${\rho}$. (Hint: use the Baire category theorem.)

Henceforth we take ${G, \rho, d, K}$ to be as in the above lemma; it is convenient to take ${K}$ to be symmetric, ${K = K^{-1}}$. By Lemma 5 and Proposition 4, we see that there is a one-to-one correspondence between the space ${L(G)}$ of one-parameter subgroups ${\phi}$ of ${G}$, and those matrices ${A \in \mathfrak{gl}_d({\bf C})}$ with the property that ${\exp(tA) \in \rho(K)}$ for all sufficiently small ${t}$, by identifying ${\phi}$ with the unique matrix ${A}$ for which ${\rho(\phi(t)) = \exp(tA)}$ for all ${t}$. Let ${{\mathfrak g}}$ denote the set of all such ${A}$ that arise in this manner.

Lemma 6 ${{\mathfrak g}}$ is a Lie subalgebra of ${\mathfrak{gl}_d({\bf C})}$.

Proof: It is clear that ${{\mathfrak g}}$ contains the origin, and by composing one-parameter subgroups with dilations ${t \mapsto \lambda t}$ of the real line, we see that it is also closed under scalar multiplication. Now we show that ${{\mathfrak g}}$ is closed under addition. Let ${A, B \in {\mathfrak g}}$; then we have ${\exp(tA), \exp(tB) \in \rho(K)}$ for all sufficiently small ${t}$. In particular, for sufficiently large natural numbers ${n}$, we have

$\displaystyle \exp(A/n), \exp(B/n) \in \rho(K). \ \ \ \ \ (1)$

We wish to show that ${\exp(t(A+B))}$ lies in ${\rho(K)}$ for all sufficiently small ${t}$. The idea is to use the formula

$\displaystyle \exp(t(A+B)) = \lim_{n \rightarrow \infty} x_n^{\lfloor tn \rfloor} \ \ \ \ \ (2)$

where ${x_n := \exp(A/n) \exp(B/n)}$. If we had that ${x_n^{\lfloor tn\rfloor} \in \rho(K)}$ for all sufficiently small ${t}$, uniformly for an infinite sequence of ${n}$, then the claim would follow from the compact (hence closed) nature of ${\rho(K)}$.

From (1) and Lemma 5 (more precisely, the fact established in its proof that ${\rho}$ restricts to a homeomorphism from ${K}$ to ${\rho(K)}$), we see that ${\rho^{-1}(\exp(A/n)), \rho^{-1}(\exp(B/n))}$ converge to the identity as ${n \rightarrow \infty}$, which implies that for any fixed ${k}$, we have ${x_n^k \in \rho(K)}$ for all sufficiently large ${n}$. If we had ${x_n^k \in \rho(K)}$ for all sufficiently large ${n}$ and all ${0 \leq k \leq n}$, we would be done (using symmetry of ${K}$ to then get the negative values of ${k}$); so suppose that this is not the case. Then we can find a sequence of ${n}$, and a sequence ${k_n}$ of natural numbers going to infinity with ${k_n \leq n}$, such that

$\displaystyle x_n^k \in \rho(K)$

for all ${0 \leq k < k_n}$, and

$\displaystyle x_n^{k_n} \not \in \rho(K).$

Remark 2 One can view the integers ${k_n}$ as the “escape times” associated to the ${x_n}$. The concept of escape time (and its reciprocal, which one can view as an “escape norm” that measures how deeply nested a given point is inside a fixed neighbourhood of the identity) turns out to be of major importance throughout the theory of Hilbert’s fifth problem; this should become clearer in subsequent posts on this problem.

By passing to a subsequence if necessary we may assume that ${k_n/n}$ converges to some limit ${\lambda \in [0,1]}$. If ${\lambda = 0}$, then (by a variant of (2)) we would have ${x_n^{k_n-1}}$ and ${x_n}$ both converging to the identity in ${\rho(K)}$, which contradicts ${x_n^{k_n}}$ staying out of ${\rho(K)}$ for sufficiently large ${n}$ (recall that ${\rho(K)}$ is homeomorphic to ${K}$). So we have ${0 < \lambda \leq 1}$. But then from (2) (and symmetry of ${K}$) we have ${\exp(t(A+B)) \in \rho(K)}$ for all ${|t| < \lambda}$, and the claim follows.

A similar argument using the formula

$\displaystyle \exp(t[A,B]) = \lim_{n \rightarrow \infty} y_n^{\lfloor tn^2 \rfloor},$

where ${y_n := \exp(A/n) \exp(B/n) \exp(-A/n) \exp(-B/n)}$, shows that ${{\mathfrak g}}$ is closed under Lie bracket; we leave the details as an exercise to the reader. Thus ${{\mathfrak g}}$ is a Lie subalgebra of ${\mathfrak{gl}_d({\bf C})}$ as claimed. $\Box$
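Both limit formulas used in the above proof can be tested numerically in a small case. The sketch below (pure Python) takes the hypothetical choice ${A = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}}$, ${B = \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix}}$, for which ${\exp(A/n) = 1 + A/n}$ and ${\exp(B/n) = 1 + B/n}$ exactly, ${\exp(t(A+B))}$ has entries ${\cosh t}$ and ${\sinh t}$, and ${\exp(t[A,B])}$ is the diagonal matrix with entries ${e^t, e^{-t}}$. The function names are illustrative.

```python
import math

def mat_mul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_pow(X, m):
    """Compute X^m for an integer m >= 0 by repeated squaring."""
    n = len(X)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    base = X
    while m > 0:
        if m & 1:
            result = mat_mul(result, base)
        base = mat_mul(base, base)
        m >>= 1
    return result

def max_diff(X, Y):
    return max(abs(x - y) for rx, ry in zip(X, Y) for x, y in zip(rx, ry))

def product_formula_error(t, n):
    """Compare x_n^floor(tn) with exp(t(A+B)), x_n = exp(A/n) exp(B/n)."""
    a = 1.0 / n
    exp_A = [[1.0, a], [0.0, 1.0]]       # exp(A/n) = 1 + A/n, since A^2 = 0
    exp_B = [[1.0, 0.0], [a, 1.0]]       # exp(B/n) = 1 + B/n, since B^2 = 0
    x_n = mat_mul(exp_A, exp_B)
    approx = mat_pow(x_n, math.floor(t * n))
    exact = [[math.cosh(t), math.sinh(t)], [math.sinh(t), math.cosh(t)]]
    return max_diff(approx, exact)

def commutator_formula_error(t, n):
    """Compare y_n^floor(t n^2) with exp(t[A,B]), y_n the group commutator."""
    a = 1.0 / n
    factors = ([[1.0, a], [0.0, 1.0]], [[1.0, 0.0], [a, 1.0]],
               [[1.0, -a], [0.0, 1.0]], [[1.0, 0.0], [-a, 1.0]])
    y_n = factors[0]
    for f in factors[1:]:
        y_n = mat_mul(y_n, f)
    approx = mat_pow(y_n, math.floor(t * n * n))
    exact = [[math.exp(t), 0.0], [0.0, math.exp(-t)]]
    return max_diff(approx, exact)

for n in (10, 100, 1000):
    print(n, product_formula_error(0.7, n), commutator_formula_error(0.7, n))
```

In this example both errors decay roughly like ${1/n}$, which is the rate one expects from the second-order and third-order terms in the Baker-Campbell-Hausdorff expansion respectively.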

We have now located a good candidate ${{\mathfrak g}}$ for the “Lie algebra” of ${G}$, which one can then try to “exponentiate” to create Lie group structure for ${G}$. Indeed it is clear from construction that ${\exp({\mathfrak g})}$ is contained in ${\rho(G)}$. However, we have not yet shown that ${{\mathfrak g}}$ is “big enough” to cover all of ${G}$. Indeed, at this point it is conceivable that this Lie algebra could well be the trivial Lie algebra, even if ${G}$ is highly non-trivial. To prevent this scenario from happening, we need a way to generate non-trivial one-parameter subgroups. Such subgroups can be extracted from sequences in ${G}$ converging to the identity by a compactness argument. Let us first illustrate this idea in a simple case:

Lemma 7 If ${G}$ is not discrete, then ${{\mathfrak g}}$ is non-trivial.

Proof: If ${G}$ is not discrete, then there exists a sequence ${g_n \in G}$ of group elements distinct from the group identity ${\hbox{id}}$, which nevertheless converge to the identity. (As remarked in the proof of Lemma 5, ${G}$ is necessarily metrisable.) Then ${\rho(g_n)}$ converges to the matrix identity, but will always be distinct from the identity. Now let ${K}$ be a symmetric compact neighbourhood of the identity in ${G}$ that is small enough that ${\rho(K)}$ contains no non-trivial subgroups. Then for each ${n}$, ${\rho(g_n)^k}$ must eventually escape ${\rho(K)}$ for some ${k}$. Let ${k_n}$ be the first natural number for which ${\rho(g_n)^{k_n}}$ escapes ${\rho(K)}$ (or equivalently, for which ${g_n^{k_n}}$ escapes ${K}$). Since ${\rho(g_n)}$ converges to the identity, we see that ${k_n}$ must go to infinity, and the distance between ${\rho(g_n)^{k_n}}$ and the boundary ${\partial \rho(K)}$ of ${\rho(K)}$ must go to zero. By compactness, we may then pass to a subsequence such that ${\rho(g_n)^{k_n}}$ converges to an element of ${\partial \rho(K)}$; if ${\rho(K)}$ is small enough, we can write this element as ${\exp(A)}$ for some small ${A \in \mathfrak{gl}_d({\bf C})}$; this must be non-zero, as the identity is not a boundary point of ${\rho(K)}$. Using logarithms, we then see (if ${K}$ is small enough) that for any ${|t| \leq 1}$, ${\rho(g_n)^{\lfloor t k_n\rfloor}}$ converges to ${\exp(tA)}$. As ${\rho(g_n)^{\lfloor t k_n\rfloor}}$ lies in the compact set ${\rho(K)}$, we thus conclude that ${\exp(tA) \in \rho(K)}$ for all ${|t| \leq 1}$, and thus ${A \in {\mathfrak g}}$, and the claim follows. $\Box$
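The escape-time construction in this proof can be illustrated in a toy case. In the sketch below (pure Python), the role of ${\rho(g_n)}$ is played by the rotation matrix of angle ${1/n}$, and the role of ${\rho(K)}$ is played by the set of matrices whose entries differ from those of the identity by at most ${0.5}$; both choices, and the helper names, are assumptions made for the illustration. For these choices, the powers escape when the rotation angle first exceeds ${\pi/6}$, so ${k_n/n}$ should approach ${\pi/6}$ and the limiting generator ${A}$ should be ${\pi/6}$ times the generator of rotations.

```python
import math

def mat_mul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def rotation(theta):
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]

def in_toy_neighbourhood(g, radius):
    """Toy stand-in for rho(K): every entry of g - 1 has size <= radius."""
    n = len(g)
    return all(abs(g[i][j] - (1.0 if i == j else 0.0)) <= radius
               for i in range(n) for j in range(n))

def escape_time(g, radius, max_steps=10**6):
    """First k >= 1 with g^k outside the toy neighbourhood (None if not found)."""
    power = g
    for k in range(1, max_steps + 1):
        if not in_toy_neighbourhood(power, radius):
            return k
        power = mat_mul(power, g)
    return None

def mat_power(g, m):
    """g^m for an integer m >= 0, by direct repeated multiplication."""
    result = [[1.0 if i == j else 0.0 for j in range(len(g))]
              for i in range(len(g))]
    for _ in range(m):
        result = mat_mul(result, g)
    return result

radius = 0.5
for n in (10, 100, 1000, 10000):
    g_n = rotation(1.0 / n)
    k_n = escape_time(g_n, radius)
    print(n, k_n, k_n / n)          # k_n / n should approach pi/6

# Check that g_n^{floor(t k_n)} is close to exp(tA) = rotation(t * pi / 6).
t = 0.4
g_n = rotation(1.0 / 10000)
k_n = escape_time(g_n, radius)
partial = mat_power(g_n, math.floor(t * k_n))
limit = rotation(t * math.pi / 6)
limit_error = max(abs(partial[i][j] - limit[i][j])
                  for i in range(2) for j in range(2))
print(limit_error)
```

Here the escape times ${k_n}$ grow linearly in ${n}$ while the rescaled powers stay inside the neighbourhood and converge, which is the mechanism that produces a non-trivial one-parameter subgroup in the lemma.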

A similar argument now gives

Proposition 8 There exists a compact neighbourhood ${K}$ of the identity in ${G}$, and a compact neighbourhood ${F}$ of zero in ${{\mathfrak g}}$, such that ${\rho(K) = \exp( F )}$.

Proof: The basic idea is to first “quotient out” ${{\mathfrak g}}$ from ${\mathfrak{gl}_d({\bf C})}$ and then apply the Lemma 7 argument to the quotient space.

We turn to the details. Let ${F}$ be a compact neighbourhood of zero in ${{\mathfrak g}}$. If the claim failed, then there exists a sequence ${g_n}$ of elements of ${G}$ converging to the identity such that ${\rho(g_n) \not \in \exp(F)}$ for any ${n}$. Since ${\rho(g_n)}$ is close to the identity for large ${n}$, we thus have ${\rho(g_n) = \exp(A_n)}$ for some ${A_n \not \in {\mathfrak g}}$ that goes to zero as ${n \rightarrow \infty}$.

Split ${\mathfrak{gl}_d({\bf C})}$ as a vector space direct sum ${{\mathfrak g} + W}$ for some complementary subspace ${W}$ (not necessarily a Lie algebra). From the inverse function theorem, we can then write ${\exp(A_n) = \exp(B_n) \exp(C_n)}$ where ${B_n \in {\mathfrak g}}$ and ${C_n \in W}$ both go to zero. Since ${A_n \not \in {\mathfrak g}}$, we also have ${C_n \neq 0}$ for ${n}$ large enough.

Let ${K}$ be a small compact neighbourhood of the identity in ${G}$. For each ${B \in {\mathfrak g}}$, we have ${\exp(tB) \in \rho(K)}$ for ${t}$ small enough, thanks to Lemma 5. In particular, ${\exp(tB)}$ avoids the compact set ${\rho(\partial K)}$ for ${t}$ small enough. If ${L}$ is a compact neighbourhood of the origin in ${{\mathfrak g}}$, we can conclude that there exists a ${t_0>0}$ such that ${\exp(tB)}$ avoids ${\rho(\partial K)}$ for all ${B \in L}$ and all ${|t| \leq t_0}$; by a continuity argument, we conclude that ${\exp(tB) \in \rho(K)}$ for all ${B \in L}$ and all ${|t| \leq t_0}$. In particular, we have ${\exp(B_n) \in \rho(K)}$ for sufficiently large ${n}$. Letting ${K}$ shrink to the identity, we conclude that for sufficiently large ${n}$, we have ${\exp(B_n) = \rho(h_n)}$ for some ${h_n}$ converging to the identity in ${G}$.

We now have

$\displaystyle \exp(C_n) = \exp(B_n)^{-1} \exp(A_n) =\rho(g'_n)$

where ${g'_n := h_n^{-1} g_n}$ converges to the identity. Since ${C_n}$ is non-zero for large ${n}$, ${g'_n}$ is also distinct from the identity for large ${n}$.

Now we can repeat the arguments used to prove Lemma 7. As in that lemma, we pick a symmetric neighbourhood ${K}$ of the identity in ${GL_d({\bf C})}$ small enough to contain no nontrivial subgroups, and let ${k_n}$ be the first integer for which ${\rho(g'_n)^{k_n}}$ escapes ${K}$. As before, by passing to a subsequence we may assume that there is a non-zero ${A \in \mathfrak{gl}_d({\bf C})}$ such that ${\rho(g'_n)^{\lfloor t k_n \rfloor} \rightarrow \exp(tA)}$ for all ${|t| \leq 1}$. Since ${\rho(g'_n)}$ is the exponential of a small element of ${W}$, the same is true for ${\rho(g'_n)^{\lfloor t k_n \rfloor}}$ (which, recall, has not yet escaped ${K}$) for ${t}$ small enough, and we conclude that ${tA \in W}$ for ${t}$ small enough. This implies that ${W}$ has non-trivial intersection with ${{\mathfrak g}}$, a contradiction. $\Box$

From the above proposition we see that ${G}$ is locally isomorphic to ${\exp({\mathfrak g})}$. But ${\exp({\mathfrak g})}$ is locally an analytic manifold, and from the Baker-Campbell-Hausdorff formula we see that multiplication and inversion are smooth (and even analytic) operations on ${\exp({\mathfrak g})}$ locally near the identity. This gives a left-invariant (say) smooth (and even analytic) structure on ${G}$ with the group operations smooth near the identity. A continuity argument then shows that the group operations remain smooth on the identity connected component ${G^\circ}$ of ${G}$. This already gives Theorem 1 in the connected case.

Now we turn to the disconnected case. From the local connectedness of ${\exp({\mathfrak g})}$ we see that ${G}$ is locally connected, so that ${G/G^\circ}$ is discrete. So it will suffice to show that the group operations are smooth on each connected component of ${G}$ (i.e. the cosets of the normal subgroup ${G^\circ}$) separately. But observe that any element ${h}$ of ${G}$ induces an automorphism ${g \mapsto hgh^{-1}}$ of ${G^\circ}$ (which need not be inner when ${h}$ lies outside ${G^\circ}$), the graph of which can be viewed as a closed connected subgroup of ${G^\circ \times G^\circ}$. By the connected case of Theorem 1, this subgroup must also be a Lie group, and so the automorphism is smooth. From this one easily verifies that the group operations are now smooth on all connected components of ${G}$, as required.

Remark 3 A similar argument shows that any continuous homomorphism between two Lie groups is automatically smooth (and analytic). Thus we see a rigidity phenomenon in Lie groups: the smooth structure is completely determined by the topological structure. This is ultimately due to the fact that the Lie algebra (which controls the smooth and analytic structure) can be constructed in a purely topological fashion, via one-parameter subgroups.

Remark 4 The above arguments were sufficiently “local” in nature that they can be extended without much difficulty to local groups, with the conclusion being that any local group that has a locally faithful continuous linear representation is a local Lie group.