
Because of Euler’s identity ${e^{\pi i} + 1 = 0}$, the complex exponential is not injective: ${e^{z + 2\pi i k} = e^z}$ for any complex ${z}$ and integer ${k}$. As such, the complex logarithm ${z \mapsto \log z}$ is not well-defined as a single-valued function from ${{\bf C} \backslash \{0\}}$ to ${{\bf C}}$. However, after making a branch cut, one can create a branch of the logarithm which is single-valued. For instance, after removing the negative real axis ${(-\infty,0]}$, one has the standard branch ${\hbox{Log}: {\bf C} \backslash (-\infty,0] \rightarrow \{ z \in {\bf C}: |\hbox{Im} z| < \pi \}}$ of the logarithm, with ${\hbox{Log}(z)}$ defined as the unique choice of the complex logarithm of ${z}$ whose imaginary part has magnitude strictly less than ${\pi}$. This particular branch has a number of useful additional properties:

• The standard branch ${\hbox{Log}}$ is holomorphic on its domain ${{\bf C} \backslash (-\infty,0]}$.
• One has ${\hbox{Log}( \overline{z} ) = \overline{ \hbox{Log}(z) }}$ for all ${z}$ in the domain ${{\bf C} \backslash (-\infty,0]}$. In particular, if ${z \in {\bf C} \backslash (-\infty,0]}$ is real, then ${\hbox{Log} z}$ is real.
• One has ${\hbox{Log}( z^{-1} ) = - \hbox{Log}(z)}$ for all ${z}$ in the domain ${{\bf C} \backslash (-\infty,0]}$.

One can then also use the standard branch of the logarithm to create standard branches of other multi-valued functions, for instance creating a standard branch ${z \mapsto \exp( \frac{1}{2} \hbox{Log} z )}$ of the square root function. We caution however that the identity ${\hbox{Log}(zw) = \hbox{Log}(z) + \hbox{Log}(w)}$ can fail for the standard branch (or indeed for any branch of the logarithm).
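As a quick numerical sanity check of these scalar facts, here is a minimal sketch using Python's `cmath.log`, which (away from the cut) computes the standard branch. The test point ${z = w = -1+i}$ and the helper name `sqrt_std` are illustrative choices, not taken from the text above.

```python
import cmath
import math

z = complex(-1, 1)   # argument 3*pi/4, safely away from the branch cut
w = complex(-1, 1)

# Log(zw) versus Log(z) + Log(w): here zw = -2i has argument -pi/2,
# while the arguments of z and w add up to 3*pi/2.
lhs = cmath.log(z * w)
rhs = cmath.log(z) + cmath.log(w)
defect = rhs - lhs   # numerically 2*pi*i

# The conjugation and inversion symmetries do hold for this branch.
conj_gap = abs(cmath.log(z.conjugate()) - cmath.log(z).conjugate())
inv_gap = abs(cmath.log(1 / z) + cmath.log(z))

def sqrt_std(u):
    """Standard branch of the square root, exp(Log(u)/2)."""
    return cmath.exp(0.5 * cmath.log(u))

print(defect, conj_gap, inv_gap, sqrt_std(z * w) ** 2)
```

The defect is an integer multiple of ${2\pi i}$, as it must be, since both sides are logarithms of the same number ${zw}$.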

One can extend this standard branch of the logarithm to ${n \times n}$ complex matrices, or (equivalently) to linear transformations ${T: V \rightarrow V}$ on an ${n}$-dimensional complex vector space ${V}$, provided that the spectrum of that matrix or transformation avoids the branch cut ${(-\infty,0]}$. Indeed, from the spectral theorem one can decompose any such ${T: V \rightarrow V}$ as the direct sum of operators ${T_\lambda: V_\lambda \rightarrow V_\lambda}$ on the non-trivial generalised eigenspaces ${V_\lambda}$ of ${T}$, where ${\lambda \in {\bf C} \backslash (-\infty,0]}$ ranges in the spectrum of ${T}$. For each component ${T_\lambda}$ of ${T}$, we define

$\displaystyle \hbox{Log}( T_\lambda ) = P_\lambda( T_\lambda )$

where ${P_\lambda}$ is the Taylor expansion of ${\hbox{Log}}$ at ${\lambda}$; as ${T_\lambda-\lambda}$ is nilpotent, only finitely many terms in this Taylor expansion are required. The logarithm ${\hbox{Log} T}$ is then defined as the direct sum of the ${\hbox{Log} T_\lambda}$.
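To make the definition concrete, consider a single ${2 \times 2}$ Jordan block ${T_\lambda = \lambda + N}$ with ${N^2=0}$: the Taylor expansion truncates after the linear term, giving ${\hbox{Log}(T_\lambda) = \hbox{Log}(\lambda) + N/\lambda}$. The sketch below checks numerically that exponentiating this recovers ${T_\lambda}$; the helper names (`mat_mul`, `mat_exp`, `log_jordan_block`) and the sample value ${\lambda = 1+2i}$ are assumptions made purely for illustration.

```python
import cmath

def mat_mul(A, B):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def mat_exp(A, terms=40):
    """Matrix exponential of a 2x2 matrix via its (truncated) power series."""
    result = [[1.0, 0.0], [0.0, 1.0]]
    term = [[1.0, 0.0], [0.0, 1.0]]
    for k in range(1, terms + 1):
        term = [[x / k for x in row] for row in mat_mul(term, A)]  # A^k / k!
        result = [[result[i][j] + term[i][j] for j in range(2)] for i in range(2)]
    return result

def log_jordan_block(lam, c=1.0):
    """Log of [[lam, c], [0, lam]]: Taylor polynomial Log(lam) + (T - lam)/lam."""
    L = cmath.log(lam)
    return [[L, c / lam], [0.0, L]]

lam = 1 + 2j
T = [[lam, 1.0], [0.0, lam]]
X = log_jordan_block(lam, 1.0)
E = mat_exp(X)   # should reproduce T up to rounding

# The inverse block is [[1/lam, -1/lam^2], [0, 1/lam]]; its log should be -X.
X_inv = log_jordan_block(1 / lam, -1.0 / lam ** 2)
```

For larger Jordan blocks one would keep correspondingly more terms of the Taylor expansion, up to the order of nilpotency of ${T_\lambda - \lambda}$.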

The matrix standard branch of the logarithm has many pleasant and easily verified properties (often inherited from their scalar counterparts), whenever ${T: V \rightarrow V}$ has no spectrum in ${(-\infty,0]}$:

• (i) We have ${\exp( \hbox{Log} T ) = T}$.
• (ii) If ${T_1: V_1 \rightarrow V_1}$ and ${T_2: V_2 \rightarrow V_2}$ have no spectrum in ${(-\infty,0]}$, then ${\hbox{Log}( T_1 \oplus T_2 ) = \hbox{Log}(T_1) \oplus \hbox{Log}(T_2)}$.
• (iii) If ${T}$ has spectrum in a closed disk ${B(z,r)}$ in ${{\bf C} \backslash (-\infty,0]}$, then ${\hbox{Log}(T) = P_z(T)}$, where ${P_z}$ is the Taylor series of ${\hbox{Log}}$ around ${z}$ (which is absolutely convergent in ${B(z,r)}$).
• (iv) ${\hbox{Log}(T)}$ depends holomorphically on ${T}$. (Easily established from (ii), (iii), after covering the spectrum of ${T}$ by disjoint disks; alternatively, one can use the Cauchy integral representation ${\hbox{Log}(T) = \frac{1}{2\pi i} \int_\gamma \hbox{Log}(z)(z-T)^{-1}\ dz}$ for a contour ${\gamma}$ in the domain enclosing the spectrum of ${T}$.) In particular, the standard branch of the matrix logarithm is smooth.
• (v) If ${S: V \rightarrow W}$ is any invertible linear or antilinear map, then ${\hbox{Log}( STS^{-1} ) = S \hbox{Log}(T) S^{-1}}$. In particular, the standard branch of the logarithm commutes with matrix conjugations; and if ${T}$ is real with respect to a complex conjugation operation on ${V}$ (that is to say, an antilinear involution), then ${\hbox{Log}(T)}$ is real also.
• (vi) If ${T^*: V^* \rightarrow V^*}$ denotes the transpose of ${T}$ (with ${V^*}$ the complex dual of ${V}$), then ${\hbox{Log}(T^*) = \hbox{Log}(T)^*}$. Similarly, if ${T^\dagger: V^\dagger \rightarrow V^\dagger}$ denotes the adjoint of ${T}$ (with ${V^\dagger}$ the complex conjugate of ${V^*}$, i.e. ${V^*}$ with the conjugated multiplication map ${(c,z) \mapsto \overline{c} z}$), then ${\hbox{Log}(T^\dagger) = \hbox{Log}(T)^\dagger}$.
• (vii) One has ${\hbox{Log}(T^{-1}) = - \hbox{Log}( T )}$.
• (viii) If ${\sigma(T)}$ denotes the spectrum of ${T}$, then ${\sigma(\hbox{Log} T) = \hbox{Log} \sigma(T)}$.

As a quick application of the standard branch of the matrix logarithm, we have

Proposition 1 Let ${G}$ be one of the following matrix groups: ${GL_n({\bf C})}$, ${GL_n({\bf R})}$, ${U_n({\bf C})}$, ${O(Q)}$, ${Sp_{2n}({\bf C})}$, or ${Sp_{2n}({\bf R})}$, where ${Q: {\bf R}^n \rightarrow {\bf R}}$ is a non-degenerate real quadratic form (so ${O(Q)}$ is isomorphic to a (possibly indefinite) orthogonal group ${O(k,n-k)}$ for some ${0 \leq k \leq n}$). Then any element ${T}$ of ${G}$ whose spectrum avoids ${(-\infty,0]}$ is exponential, that is to say ${T = \exp(X)}$ for some ${X}$ in the Lie algebra ${{\mathfrak g}}$ of ${G}$.

Proof: We just prove this for ${G=O(Q)}$, as the other cases are similar (or a bit simpler). If ${T \in O(Q)}$, then (viewing ${T}$ as a complex-linear map on ${{\bf C}^n}$, and using the complex bilinear form associated to ${Q}$ to identify ${{\bf C}^n}$ with its complex dual ${({\bf C}^n)^*}$) ${T}$ is real and ${T^{*-1} = T}$. By the properties (v), (vi), (vii) of the standard branch of the matrix logarithm, we conclude that ${\hbox{Log} T}$ is real and ${- \hbox{Log}(T)^* = \hbox{Log}(T)}$, and so ${\hbox{Log}(T)}$ lies in the Lie algebra ${{\mathfrak g} = {\mathfrak o}(Q)}$, and the claim now follows from (i). $\Box$
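The simplest instance of this proposition is a rotation in ${O(2)}$ (taking ${Q}$ to be the Euclidean form, an assumption made only for this illustration). A rotation ${T}$ by an angle ${\theta}$ with ${|\theta| < \pi}$ has spectrum ${e^{\pm i \theta}}$ off the branch cut, and the candidate logarithm ${X = \theta J}$, with ${J}$ the rotation by a right angle, is real and skew-symmetric, with spectrum ${\pm i\theta}$ in the strip ${|\hbox{Im} z| < \pi}$. A rough numerical check that ${\exp(X) = T}$, with hypothetical helper names:

```python
import math

def mat_mul(A, B):
    """Product of two 2x2 real matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def mat_exp(A, terms=40):
    """Matrix exponential via the truncated power series sum of A^k / k!."""
    result = [[1.0, 0.0], [0.0, 1.0]]
    term = [[1.0, 0.0], [0.0, 1.0]]
    for k in range(1, terms + 1):
        term = [[x / k for x in row] for row in mat_mul(term, A)]
        result = [[result[i][j] + term[i][j] for j in range(2)] for i in range(2)]
    return result

theta = 2.0                                   # any angle with |theta| < pi
T = [[math.cos(theta), -math.sin(theta)],
     [math.sin(theta),  math.cos(theta)]]     # element of O(2)
X = [[0.0, -theta],
     [theta, 0.0]]                            # real, skew-symmetric: lies in o(2)
E = mat_exp(X)                                # should agree with T up to rounding
```

At ${\theta = \pi}$ the spectrum of ${T}$ hits the branch cut at ${-1}$, and this particular recipe no longer singles out one logarithm (both ${\pi J}$ and ${-\pi J}$ exponentiate to ${T}$).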

Exercise 2 Show that ${\hbox{diag}(-\lambda, -1/\lambda)}$ is not exponential in ${GL_2({\bf R})}$ if ${-\lambda \in (-\infty,0) \backslash \{-1\}}$. Thus we see that the branch cut in the above proposition is largely necessary. See this paper of Djokovic for a more complete description of the image of the exponential map in classical groups, as well as this previous blog post for some more discussion of the surjectivity (or lack thereof) of the exponential map in Lie groups.

For a slightly less quick application of the standard branch, we have the following result (recently worked out in the answers to this MathOverflow question):

Proposition 3 Let ${T}$ be an element of the split orthogonal group ${O(n,n)}$ which lies in the connected component of the identity. Then ${\hbox{det}(1+T) \geq 0}$.

The requirement that ${T}$ lie in the identity component is necessary, as the counterexample ${T = \hbox{diag}(\lambda, 1/\lambda )}$ for ${\lambda \in (-\infty,-1) \cup (-1,0)}$ shows.
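In the smallest case ${n=1}$ everything can be computed by hand. Writing the split form as ${Q(x,y) = xy}$ (a choice of coordinates assumed here for illustration), the diagonal elements of ${O(1,1)}$ are ${\hbox{diag}(a, 1/a)}$ with ${a \neq 0}$, those with ${a > 0}$ lie in the identity component, and ${\hbox{det}(1+T) = (1+a)(1+1/a) = (1+a)^2/a}$. A small exact-arithmetic sketch:

```python
from fractions import Fraction

def det_one_plus_T(a):
    """det(1 + T) for T = diag(a, 1/a), which preserves Q(x, y) = x*y."""
    a = Fraction(a)
    return (1 + a) * (1 + 1 / a)

# a > 0: identity component, determinant is positive.
in_component = [det_one_plus_T(a) for a in (Fraction(1, 3), 1, 2, 10)]

# a < 0, a != -1: outside the identity component, determinant is negative.
off_component = [det_one_plus_T(a) for a in (-10, -2, Fraction(-1, 3))]

# a = -1: the determinant vanishes, so the bound det(1+T) >= 0 can be an equality.
boundary = det_one_plus_T(-1)

print(in_component, off_component, boundary)
```

The sign of ${(1+a)^2/a}$ is the sign of ${a}$, which is the one-dimensional shadow of the quantity ${\hbox{det}(T_3)}$ that controls the sign in the general argument.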

Proof: We think of ${T}$ as a (real) linear transformation on ${{\bf C}^{2n}}$, and write ${Q}$ for the quadratic form associated to ${O(n,n)}$, so that ${O(n,n) \equiv O(Q)}$. We can split ${{\bf C}^{2n} = V_1 \oplus V_2}$, where ${V_1}$ is the sum of all the generalised eigenspaces corresponding to eigenvalues in ${(-\infty,0]}$, and ${V_2}$ is the sum of all the remaining generalised eigenspaces. Since ${T}$ and ${(-\infty,0]}$ are real, ${V_1,V_2}$ are real (i.e. complex-conjugation invariant) also. For ${i=1,2}$, the restriction ${T_i: V_i \rightarrow V_i}$ of ${T}$ to ${V_i}$ then lies in ${O(Q_i)}$, where ${Q_i}$ is the restriction of ${Q}$ to ${V_i}$, and

$\displaystyle \hbox{det}(1+T) = \hbox{det}(1+T_1) \hbox{det}(1+T_2).$

The spectrum of ${T_2}$ consists of positive reals, as well as complex pairs ${\lambda, \overline{\lambda}}$ (with equal multiplicity), so ${\hbox{det}(1+T_2) > 0}$. From the preceding proposition we have ${T_2 = \exp( X_2 )}$ for some ${X_2 \in {\mathfrak o}(Q_2)}$; this will be important later.

It remains to show that ${\hbox{det}(1+T_1) \geq 0}$. If ${T_1}$ has spectrum at ${-1}$ then we are done, so we may assume that ${T_1}$ has spectrum only at ${(-\infty,-1) \cup (-1,0)}$ (being invertible, ${T}$ has no spectrum at ${0}$). We split ${V_1 = V_3 \oplus V_4}$, where ${V_3,V_4}$ correspond to the portions of the spectrum in ${(-\infty,-1)}$, ${(-1,0)}$; these are real, ${T}$-invariant spaces. We observe that if ${V_\lambda, V_\mu}$ are generalised eigenspaces of ${T}$ with ${\lambda \mu \neq 1}$, then ${V_\lambda, V_\mu}$ are orthogonal with respect to the (complex-bilinear) inner product ${\cdot}$ associated with ${Q}$; this is easiest to see first for the actual eigenspaces (since ${ \lambda \mu u \cdot v = Tu \cdot Tv = u \cdot v}$ for all ${u \in V_\lambda, v \in V_\mu}$), and the extension to generalised eigenvectors then follows from a routine induction. From this we see that ${V_1}$ is orthogonal to ${V_2}$, and ${V_3}$ and ${V_4}$ are null spaces, which by the non-degeneracy of ${Q}$ (and hence of the restriction ${Q_1}$ of ${Q}$ to ${V_1}$) forces ${V_3}$ to have the same dimension as ${V_4}$, indeed ${Q}$ now gives an identification of ${V_3^*}$ with ${V_4}$. If we let ${T_3, T_4}$ be the restrictions of ${T}$ to ${V_3,V_4}$, we thus identify ${T_4}$ with ${T_3^{*-1}}$, since ${T}$ lies in ${O(Q)}$; in particular ${T_3}$ is invertible. Thus

$\displaystyle \hbox{det}(1+T_1) = \hbox{det}(1 + T_3) \hbox{det}( 1 + T_3^{*-1} ) = \hbox{det}(T_3)^{-1} \hbox{det}(1+T_3)^2$

and so it suffices to show that ${\hbox{det}(T_3) > 0}$.

At this point we need to use the hypothesis that ${T}$ lies in the identity component of ${O(n,n)}$. This implies (by a continuity argument) that the restriction of ${T}$ to any maximal-dimensional positive subspace has positive determinant (since such a restriction cannot be singular, as this would mean that a positive norm vector would map to a non-positive norm vector). Now, as ${V_3,V_4}$ have equal dimension, ${Q_1}$ has a balanced signature, so ${Q_2}$ does also. Since ${T_2 = \exp(X_2)}$, ${T_2}$ already lies in the identity component of ${O(Q_2)}$, and so has positive determinant on any maximal-dimensional positive subspace of ${V_2}$. We conclude that ${T_1}$ has positive determinant on any maximal-dimensional positive subspace of ${V_1}$.

We choose a complex basis of ${V_3}$, to identify ${V_3}$ with ${V_3^*}$, which has already been identified with ${V_4}$. (In coordinates, ${V_3,V_4}$ are now both of the form ${{\bf C}^m}$, and ${Q( v \oplus w ) = v \cdot w}$ for ${v,w \in {\bf C}^m}$.) Then ${\{ v \oplus v: v \in V_3 \}}$ becomes a maximal positive subspace of ${V_1}$, and the restriction of ${T_1}$ to this subspace is conjugate to ${T_3 + T_3^{*-1}}$, so that

$\displaystyle \hbox{det}( T_3 + T_3^{*-1} ) > 0.$

But ${\hbox{det}( T_3 + T_3^{*-1} ) = \hbox{det}(T_3) \hbox{det}( 1 + T_3^{-1} T_3^{*-1} )}$ and ${ 1 + T_3^{-1} T_3^{*-1}}$ is positive definite, so ${\hbox{det}(T_3)>0}$ as required. $\Box$

In this previous post I recorded some (very standard) material on the structural theory of finite-dimensional complex Lie algebras (or Lie algebras for short), with a particular focus on those Lie algebras which were semisimple or simple. Among other things, these notes discussed the Weyl complete reducibility theorem (asserting that semisimple Lie algebras are the direct sum of simple Lie algebras) and the classification of simple Lie algebras (with all such Lie algebras being (up to isomorphism) of the form ${A_n}$, ${B_n}$, ${C_n}$, ${D_n}$, ${E_6}$, ${E_7}$, ${E_8}$, ${F_4}$, or ${G_2}$).

Among other things, the structural theory of Lie algebras can then be used to build analogous structures in nearby areas of mathematics, such as Lie groups and Lie algebras over more general fields than the complex field ${{\bf C}}$ (leading in particular to the notion of a Chevalley group), as well as finite simple groups of Lie type, which form the bulk of the classification of finite simple groups (with the exception of the alternating groups and a finite number of sporadic groups).

In the case of complex Lie groups, it turns out that every simple Lie algebra ${\mathfrak{g}}$ is associated with a finite number of connected complex Lie groups, ranging from a “minimal” Lie group ${G_{ad}}$ (the adjoint form of the Lie group) to a “maximal” Lie group ${\tilde G}$ (the simply connected form of the Lie group) that finitely covers ${G_{ad}}$, and occasionally also a number of intermediate forms which finitely cover ${G_{ad}}$, but are in turn finitely covered by ${\tilde G}$. For instance, ${\mathfrak{sl}_n({\bf C})}$ is associated with the projective special linear group ${\hbox{PSL}_n({\bf C}) = \hbox{PGL}_n({\bf C})}$ as its adjoint form and the special linear group ${\hbox{SL}_n({\bf C})}$ as its simply connected form, and intermediate groups can be created by quotienting out ${\hbox{SL}_n({\bf C})}$ by some subgroup of its centre (which is isomorphic to the ${n^{th}}$ roots of unity). The minimal form ${G_{ad}}$ is simple in the group-theoretic sense of having no non-trivial proper normal subgroups, but the other forms of the Lie group are merely quasisimple, although traditionally all of the forms of a Lie group associated to a simple Lie algebra are known as simple Lie groups.
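For ${\mathfrak{sl}_n({\bf C})}$ the bookkeeping is elementary: the centre of ${\hbox{SL}_n({\bf C})}$ is cyclic of order ${n}$, a cyclic group of order ${n}$ has exactly one subgroup of each order ${d}$ dividing ${n}$, and so the quotients ${\hbox{SL}_n({\bf C})/\mu_d}$ are indexed by the divisors of ${n}$. A short sketch enumerating them (the function name `forms_of_sl` is made up for this illustration):

```python
def forms_of_sl(n):
    """Orders d of the central subgroups mu_d of SL_n(C), one per divisor of n.

    d = 1 gives the simply connected form SL_n(C) itself,
    d = n gives the adjoint form PSL_n(C),
    and any other divisor gives an intermediate form SL_n(C)/mu_d.
    """
    return [d for d in range(1, n + 1) if n % d == 0]

for n in (2, 3, 4, 6):
    divisors = forms_of_sl(n)
    intermediate = [d for d in divisors if 1 < d < n]
    print(n, divisors, intermediate)
```

In particular there are no intermediate forms when ${n}$ is prime, which is one reason such forms only appear "occasionally".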

Thanks to the work of Chevalley, a very similar story holds for algebraic groups over arbitrary fields ${k}$; given any Dynkin diagram, one can define a simple Lie algebra with that diagram over that field, and also one can find a finite number of connected algebraic groups over ${k}$ (known as Chevalley groups) with that Lie algebra, ranging from an adjoint form ${G_{ad}}$ to a universal form ${G_u}$, with every form having an isogeny (the analogue of a finite cover for algebraic groups) to the adjoint form, and in turn receiving an isogeny from the universal form. Thus, for instance, one could construct the universal form ${E_7(q)_u}$ of the ${E_7}$ algebraic group over a finite field ${{\bf F}_q}$ of order ${q}$.

When one restricts the Chevalley group construction to adjoint forms over a finite field (e.g. ${\hbox{PSL}_n({\bf F}_q)}$), one usually obtains a finite simple group (with a finite number of exceptions when the rank and the field are very small, and in some cases one also has to pass to a bounded index subgroup, such as the derived group, first). One could also use other forms than the adjoint form, but one then recovers the same finite simple group as before if one quotients out by the centre. This construction was then extended by Steinberg, Suzuki, and Ree by taking a Chevalley group over a finite field and then restricting to the fixed points of a certain automorphism of that group; after some additional minor modifications such as passing to a bounded index subgroup or quotienting out a bounded centre, this gives some additional finite simple groups of Lie type, including classical examples such as the projective special unitary groups ${\hbox{PSU}_n({\bf F}_{q^2})}$, as well as some more exotic examples such as the Suzuki groups or the Ree groups.
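For the family ${\hbox{PSL}_n({\bf F}_q)}$ one can at least tabulate the orders of the resulting groups from the standard counting formulae: ${|GL_n({\bf F}_q)| = \prod_{i=0}^{n-1} (q^n - q^i)}$, the determinant map has kernel of index ${q-1}$, and the centre of ${\hbox{SL}_n({\bf F}_q)}$ has order ${\hbox{gcd}(n,q-1)}$. The sketch below (function names are illustrative, and ${q}$ is assumed to be a prime power) reproduces some familiar small cases:

```python
from math import gcd

def order_gl(n, q):
    """Order of GL_n(F_q): the number of ordered bases of F_q^n."""
    order = 1
    for i in range(n):
        order *= q ** n - q ** i
    return order

def order_sl(n, q):
    """Order of SL_n(F_q): kernel of the surjective determinant map to F_q^*."""
    return order_gl(n, q) // (q - 1)

def order_psl(n, q):
    """Order of PSL_n(F_q): quotient of SL_n(F_q) by its centre of order gcd(n, q-1)."""
    return order_sl(n, q) // gcd(n, q - 1)

for n, q in [(2, 2), (2, 3), (2, 4), (2, 5), (2, 7), (3, 2)]:
    print(n, q, order_psl(n, q))
```

The orders ${6}$ and ${12}$ for ${(n,q) = (2,2), (2,3)}$ correspond to the small-rank, small-field exceptions, which are solvable rather than simple; the orders ${60}$ and ${168}$ each occur twice, reflecting the exceptional isomorphisms between the corresponding groups.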

While I learned most of the classical structural theory of Lie algebras back when I was an undergraduate, and have interacted with Lie groups in many ways in the past (most recently in connection with Hilbert’s fifth problem, as discussed in this previous series of lectures), I have only recently had the need to understand more precisely the concepts of a Chevalley group and of a finite simple group of Lie type, as well as better understand the structural theory of simple complex Lie groups. As such, I am recording some notes here regarding these concepts, mainly for my own benefit, but perhaps they will also be of use to some other readers. The material here is standard, and was drawn from a number of sources, but primarily from Carter, Gorenstein-Lyons-Solomon, and Fulton-Harris, as well as the lecture notes on Chevalley groups by my colleague Robert Steinberg. The arrangement of material also reflects my own personal preferences; in particular, I tend to favour complex-variable or Riemannian geometry methods over algebraic ones, and this influenced a number of choices I had to make regarding how to prove certain key facts. The notes below are far from a comprehensive or fully detailed discussion of these topics, and I would refer interested readers to the references above for a properly thorough treatment.

A common theme in mathematical analysis (particularly in analysis of a “geometric” or “statistical” flavour) is the interplay between “macroscopic” and “microscopic” scales. These terms are somewhat vague and imprecise, and their interpretation depends on the context and also on one’s choice of normalisations, but if one uses a “macroscopic” normalisation, “macroscopic” scales correspond to scales that are comparable to unit size (i.e. bounded above and below by absolute constants), while “microscopic” scales are much smaller, being the minimal scale at which nontrivial behaviour occurs. (Other normalisations are possible, e.g. making the microscopic scale a unit scale, and letting the macroscopic scale go off to infinity; for instance, such a normalisation is often used, at least initially, in the study of groups of polynomial growth. However, for the theory of approximate groups, a macroscopic scale normalisation is more convenient.)

One can also consider “mesoscopic” scales which are intermediate between microscopic and macroscopic scales, or large-scale behaviour at scales that go off to infinity (and in particular are larger than the macroscopic range of scales), although the behaviour of these scales will not be the main focus of this post. Finally, one can divide the macroscopic scales into “local” macroscopic scales (less than ${\epsilon}$ for some small but fixed ${\epsilon>0}$) and “global” macroscopic scales (scales that are allowed to be larger than a given large absolute constant ${C}$). For instance, given a finite approximate group ${A}$:

• Sets such as ${A^m}$ for some fixed ${m}$ (e.g. ${A^{10}}$) can be considered to be sets at a global macroscopic scale. Sending ${m}$ to infinity, one enters the large-scale regime.
• Sets such as the sets ${S}$ that appear in the Sanders lemma from the previous set of notes (thus ${S^m \subset A^4}$ for some fixed ${m}$, e.g. ${m=100}$) can be considered to be sets at a local macroscopic scale. Sending ${m}$ to infinity, one enters the mesoscopic regime.
• The non-identity element ${u}$ of ${A}$ that is “closest” to the identity in some suitable metric (cf. the proof of Jordan’s theorem from Notes 0) would be an element associated to the microscopic scale. The orbit ${u, u^2, u^3, \ldots}$ starts out at microscopic scales, and (assuming some suitable “escape” axioms) will pass through mesoscopic scales and finally enter the macroscopic regime. (Beyond this point, the orbit may exhibit a variety of behaviours, such as periodically returning back to the smaller scales, diverging off to ever larger scales, or filling out a dense subset of some macroscopic set; the escape axioms we will use do not exclude any of these possibilities.)

For comparison, in the theory of locally compact groups, properties about small neighbourhoods of the identity (e.g. local compactness, or the NSS property) would be properties at the local macroscopic scale, whereas the space ${L(G)}$ of one-parameter subgroups can be interpreted as an object at the microscopic scale. The exponential map then provides a bridge connecting the microscopic and macroscopic scales.

We return now to approximate groups. The macroscopic structure of these objects is well described by the Hrushovski Lie model theorem from the previous set of notes, which informally asserts that the macroscopic structure of an (ultra) approximate group can be modeled by a Lie group. This is already an important piece of information about general approximate groups, but it does not directly reveal the full structure of such approximate groups, because these Lie models are unable to see the microscopic behaviour of these approximate groups.

To illustrate this, let us review one of the examples of a Lie model of an ultra approximate group, namely Exercise 28 from Notes 7. In this example one studied a “nilbox” from a Heisenberg group, which we rewrite here in slightly different notation. Specifically, let ${G}$ be the Heisenberg group

$\displaystyle G := \{ (a,b,c): a,b,c \in {\bf Z} \}$

with group law

$\displaystyle (a,b,c) \ast (a',b',c') := (a+a', b+b', c+c'+ab') \ \ \ \ \ (1)$

and let ${A = \prod_{n \rightarrow \alpha} A_n}$, where ${A_n \subset G}$ is the box

$\displaystyle A_n := \{ (a,b,c) \in G: |a|, |b| \leq n; |c| \leq n^{10} \};$

thus ${A}$ is the nonstandard box

$\displaystyle A := \{ (a,b,c) \in {}^* G: |a|, |b| \leq N; |c| \leq N^{10} \}$

where ${N := \lim_{n \rightarrow \alpha} n}$. As the above exercise establishes, ${A \cup A^{-1}}$ is an ultra approximate group with a Lie model ${\pi: \langle A \rangle \rightarrow {\bf R}^3}$ given by the formula

$\displaystyle \pi( a, b, c ) := ( \hbox{st} \frac{a}{N}, \hbox{st} \frac{b}{N}, \hbox{st} \frac{c}{N^{10}} )$

for ${a,b = O(N)}$ and ${c = O(N^{10})}$. Note how the nonabelian nature of ${G}$ (arising from the ${ab'}$ term in the group law (1)) has been lost in the model ${{\bf R}^3}$, because the effect of that nonabelian term on ${\frac{c}{N^{10}}}$ is only ${O(\frac{N^2}{N^{10}})}$ which is infinitesimal and thus does not contribute to the standard part. In particular, if we replace ${G}$ with the abelian group ${G' := \{(a,b,c): a,b,c \in {\bf Z} \}}$ with the additive group law

$\displaystyle (a,b,c) \ast' (a',b',c') := (a+a',b+b',c+c')$

and let ${A'}$ and ${\pi'}$ be defined exactly as with ${A}$ and ${\pi}$, but placed inside the group structure of ${G'}$ rather than ${G}$, then ${A \cup A^{-1}}$ and ${A' \cup (A')^{-1}}$ are essentially “indistinguishable” as far as their models by ${{\bf R}^3}$ are concerned, even though the latter approximate group is abelian and the former is not. The problem is that the nonabelian-ness in the former example is so microscopic that it falls entirely inside the kernel of ${\pi}$ and is thus not detected at all by the model.
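One can see this loss of information in a finite (standard) toy version of the example. The sketch below implements the group law (1), computes a commutator, and rescales it as the model does, with a finite ${N}$ standing in for the nonstandard one; the function names and the value ${N=10}$ are assumptions for illustration only.

```python
from fractions import Fraction

def heis_mul(g, h):
    """Group law (1): (a,b,c) * (a',b',c') = (a+a', b+b', c+c'+a*b')."""
    a, b, c = g
    a2, b2, c2 = h
    return (a + a2, b + b2, c + c2 + a * b2)

def heis_inv(g):
    """Inverse with respect to heis_mul."""
    a, b, c = g
    return (-a, -b, -c + a * b)

def commutator(g, h):
    """The commutator g h g^{-1} h^{-1}; it equals (0, 0, a*b' - a'*b)."""
    return heis_mul(heis_mul(g, h), heis_mul(heis_inv(g), heis_inv(h)))

def rescale(g, N):
    """Finite-N analogue of the model map, before taking standard parts."""
    a, b, c = g
    return (Fraction(a, N), Fraction(b, N), Fraction(c, N ** 10))

N = 10
g, h = (N, 0, 0), (0, N, 0)          # two elements at the edge of the box
k = commutator(g, h)                 # (0, 0, N^2): genuinely non-trivial
print(k, rescale(k, N))              # third coordinate is N^2 / N^10 = 1 / N^8
```

The commutator is as large as it can be for elements of the box, yet its image under the rescaling is of size ${N^{-8}}$, which becomes infinitesimal in the ultralimit.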

The problem of not being able to “see” the microscopic structure of a group (or approximate group) also was a key difficulty in the theory surrounding Hilbert’s fifth problem that was discussed in previous notes. A key tool in being able to resolve such structure was to build left-invariant metrics ${d}$ (or equivalently, norms ${\| \|}$) on one’s group, which obeyed useful “Gleason axioms” such as the commutator axiom

$\displaystyle \| [g,h] \| \ll \|g\| \|h\| \ \ \ \ \ (2)$

for sufficiently small ${g,h}$, or the escape axiom

$\displaystyle \| g^n \| \gg |n| \|g\| \ \ \ \ \ (3)$

when ${|n| \|g\|}$ was sufficiently small. Such axioms have important and non-trivial content even in the microscopic regime where ${g}$ or ${h}$ are extremely close to the identity. For instance, in the proof of Jordan’s theorem from Notes 0, which showed that any finite unitary group ${G}$ was boundedly virtually abelian, a key step was to apply the commutator axiom (2) (for the distance to the identity in operator norm) to the most “microscopic” element of ${G}$, or more precisely a non-identity element of ${G}$ of minimal norm. The key point was that this microscopic element was virtually central in ${G}$, and as such it restricted much of ${G}$ to a lower-dimensional subgroup of the unitary group, at which point one could argue using an induction-on-dimension argument. As we shall see, a similar argument can be used to place “virtually nilpotent” structure on finite approximate groups. For instance, in the Heisenberg-type approximate groups ${A \cup A^{-1}}$ and ${A' \cup (A')^{-1}}$ discussed earlier, the element ${(0,0,1)}$ will be “closest to the origin” in a suitable sense to be defined later, and is centralised by both approximate groups; quotienting out (the orbit of) that central element and iterating the process two more times, we shall see that one can express both ${A \cup A^{-1}}$ and ${A'\cup (A')^{-1}}$ as a tower of central cyclic extensions, which in particular establishes the nilpotency of both groups.

The escape axiom (3) is a particularly important axiom in connecting the microscopic structure of a group ${G}$ to its macroscopic structure; for instance, as shown in Notes 2, this axiom (in conjunction with the closely related commutator axiom) tends to imply dilation estimates such as ${d( g^n, h^n ) \sim n d(g,h)}$ that allow one to understand the microscopic geometry of points ${g,h}$ close to the identity in terms of the (local) macroscopic geometry of points ${g^n, h^n}$ that are significantly further away from the identity.

It is thus of interest to build some notion of a norm (or left-invariant metrics) on an approximate group ${A}$ that obeys the escape and commutator axioms (while being non-degenerate enough to adequately capture the geometry of ${A}$ in some sense), in a fashion analogous to the Gleason metrics that played such a key role in the theory of Hilbert’s fifth problem. It is tempting to use the Lie model theorem to do this, since Lie groups certainly come with Gleason metrics. However, if one does this, one ends up with a norm on ${A}$ that only obeys the escape and commutator estimates macroscopically; roughly speaking, this means that one has a macroscopic commutator inequality

$\displaystyle \| [g,h] \| \ll \|g\| \|h\| + o(1)$

and a macroscopic escape property

$\displaystyle \| g^n \| \gg |n| \|g\| - o(|n|)$

but such axioms are too weak for analysis at the microscopic scale, and in particular in establishing centrality of the element closest to the identity.

Another way to proceed is to build a norm that is specifically designed to obey the crucial escape property. Given an approximate group ${A}$ in a group ${G}$, and an element ${g}$ of ${G}$, we can define the escape norm ${\|g\|_{e,A}}$ of ${g}$ by the formula

$\displaystyle \| g \|_{e,A} := \inf \{ \frac{1}{n+1}: n \in {\bf N}: g, g^2, \ldots, g^n \in A \}.$

Thus, ${\|g\|_{e,A}}$ equals ${1}$ if ${g}$ lies outside of ${A}$, equals ${1/2}$ if ${g}$ lies in ${A}$ but ${g^2}$ lies outside of ${A}$, and so forth. Such norms had already appeared in Notes 4, in the context of analysing NSS groups.
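The definition is easy to experiment with. The following sketch computes the escape norm for a set ${A}$ given by a membership test and a group law; the cap `max_n` is an artefact of the sketch (a value of ${0}$ is returned when the cap is reached, which is only a stand-in for "all powers stay in ${A}$"), and the finite Heisenberg box with ${n=2}$ is a toy version of the nilbox considered earlier.

```python
from fractions import Fraction

def escape_norm(g, mul, contains, max_n=10 ** 5):
    """1/(n+1), where n is the number of consecutive powers g, g^2, ... lying in A.

    Returns 0 if the cap max_n is reached (assumption: cap chosen large enough).
    """
    n = 0
    power = g
    while n < max_n and contains(power):
        n += 1
        power = mul(power, g)
    return Fraction(0) if n == max_n else Fraction(1, n + 1)

# Example 1: the interval A = [-10, 10] in the integers (written additively).
add = lambda x, y: x + y
in_interval = lambda x: abs(x) <= 10
interval_norms = {g: escape_norm(g, add, in_interval) for g in (3, 6, 11)}

# Example 2: a finite Heisenberg box |a|, |b| <= 2, |c| <= 2^10.
def heis_mul(g, h):
    a, b, c = g
    a2, b2, c2 = h
    return (a + a2, b + b2, c + c2 + a * b2)

in_box = lambda g: abs(g[0]) <= 2 and abs(g[1]) <= 2 and abs(g[2]) <= 2 ** 10
box_norms = {g: escape_norm(g, heis_mul, in_box)
             for g in [(1, 0, 0), (0, 1, 0), (1, 1, 0), (0, 0, 1)]}
print(interval_norms, box_norms)
```

In the box, the central element ${(0,0,1)}$ has by far the smallest escape norm among these elements, since it takes ${2^{10}}$ steps to leave.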

As it turns out, this expression will obey an escape axiom, as long as we place some additional hypotheses on ${A}$ which we will present shortly. However, it need not actually be a norm; in particular, the triangle inequality

$\displaystyle \|gh\|_{e,A} \leq \|g\|_{e,A} + \|h\|_{e,A}$

is not necessarily true. Fortunately, it turns out that by a (slightly more complicated) version of the Gleason machinery from Notes 4 we can establish a usable substitute for this inequality, namely the quasi-triangle inequality

$\displaystyle \|g_1 \ldots g_k \|_{e,A} \leq C (\|g_1\|_{e,A} + \ldots + \|g_k\|_{e,A}),$

where ${C}$ is a constant independent of ${k}$. As we shall see, these estimates can then be used to obtain a commutator estimate (2).
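The failure of the genuine triangle inequality can already be seen for an interval ${A = [-N,N]}$ in the integers, where the definition gives the closed form ${\|g\|_{e,A} = 1/(\lfloor N/|g| \rfloor + 1)}$ for ${g \neq 0}$ (since ${g, 2g, \ldots, ng \in A}$ exactly when ${n|g| \leq N}$). The sketch below, with ${N=10}$ chosen arbitrarily, exhibits a violation and then searches a range of pairs for the worst ratio between the two sides:

```python
from fractions import Fraction

def escape_norm_interval(g, N):
    """Escape norm of the integer g relative to A = [-N, N] (closed form)."""
    if g == 0:
        return Fraction(0)
    return Fraction(1, N // abs(g) + 1)

N = 10
g, h = 4, 7
lhs = escape_norm_interval(g + h, N)                             # 11 lies outside A
rhs = escape_norm_interval(g, N) + escape_norm_interval(h, N)    # 1/3 + 1/2

# Brute-force search for the worst ratio ||g+h|| / (||g|| + ||h||).
worst = Fraction(0)
for x in range(-3 * N, 3 * N + 1):
    for y in range(-3 * N, 3 * N + 1):
        denom = escape_norm_interval(x, N) + escape_norm_interval(y, N)
        if denom > 0:
            worst = max(worst, escape_norm_interval(x + y, N) / denom)
print(lhs, rhs, worst)
```

In this abelian toy case the ratio exceeds ${1}$ but remains bounded, consistent with a quasi-triangle inequality with a modest constant; of course the real content of the inequality lies in the nonabelian setting.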

However, to do all this, it is not enough for ${A}$ to be an approximate group; it must obey two additional “trapping” axioms that improve the properties of the escape norm. We formalise these axioms (somewhat arbitrarily) as follows:

Definition 1 (Strong approximate group) Let ${K \geq 1}$. A strong ${K}$-approximate group is a finite ${K}$-approximate group ${A}$ in a group ${G}$ with a symmetric subset ${S}$ obeying the following axioms:

An ultra strong ${K}$-approximate group is an ultraproduct ${A = \prod_{n \rightarrow \alpha} A_n}$ of strong ${K}$-approximate groups.

The first trapping condition can be rewritten as

$\displaystyle \|g\|_{e,A} \leq 1000 \|g\|_{e,A^{100}}$

and the second trapping condition can similarly be rewritten as

$\displaystyle \|g\|_{e,S} \leq 10^6 K^3 \|g\|_{e,A}.$

This makes the escape norms of ${A, A^{100}}$, and ${S}$ comparable to each other, which will be needed for a number of reasons (and in particular to close a certain bootstrap argument properly). Compare this with equation (12) from Notes 4, which used the NSS hypothesis to obtain similar conclusions. Thus, one can view the strong approximate group axioms as being a sort of proxy for the NSS property.

Example 1 Let ${N}$ be a large natural number. Then the interval ${A = [-N,N]}$ in the integers is a ${2}$-approximate group, which is also a strong ${2}$-approximate group (setting ${S = [-10^{-6} N, 10^{-6} N]}$, for instance). On the other hand, if one places ${A}$ in ${{\bf Z}/5N{\bf Z}}$ rather than in the integers, then the first trapping condition is lost and one is no longer a strong ${2}$-approximate group. Also, if one remains in the integers, but deletes a few elements from ${A}$ (e.g. deleting ${\pm \lfloor 10^{-10} N\rfloor}$ from ${A}$), then one is still a ${O(1)}$-approximate group, but is no longer a strong ${O(1)}$-approximate group, again because the first trapping condition is lost.
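The first two claims of this example can be tested numerically via the reformulation ${\|g\|_{e,A} \leq 1000 \|g\|_{e,A^{100}}}$ of the first trapping condition (here ${A^{100}}$ is the ${100}$-fold sumset, as the group is written additively). The following is only a sketch with a small value ${N=20}$; the cap `max_n` is an assumption of the code, chosen large enough that reaching it means all multiples stay in the set.

```python
from fractions import Fraction

def escape_norm(g, contains, max_n):
    """1/(n+1) with n the number of consecutive multiples g, 2g, ... in the set;
    0 if the cap max_n is reached (all multiples stay in the set)."""
    n = 0
    while n < max_n and contains((n + 1) * g):
        n += 1
    return Fraction(0) if n == max_n else Fraction(1, n + 1)

N = 20

# In the integers: A = [-N, N] and A^100 = [-100N, 100N].
in_A = lambda x: abs(x) <= N
in_A100 = lambda x: abs(x) <= 100 * N
cap = 100 * N + 1            # a nonzero g leaves A^100 within 100N + 1 steps
holds_in_Z = all(
    escape_norm(g, in_A, cap) <= 1000 * escape_norm(g, in_A100, cap)
    for g in range(-150 * N, 150 * N + 1)
)

# In Z/5NZ: A is the image of [-N, N], and A^100 is the whole group.
m = 5 * N
in_A_mod = lambda x: (x % m) <= N or (x % m) >= m - N
in_A100_mod = lambda x: True
g = 2 * N                    # lies outside A modulo 5N
norm_A = escape_norm(g, in_A_mod, m)         # multiples repeat with period dividing m
norm_A100 = escape_norm(g, in_A100_mod, m)
fails_mod = not (norm_A <= 1000 * norm_A100)
print(holds_in_Z, norm_A, norm_A100, fails_mod)
```

In the cyclic group every element has zero escape norm with respect to ${A^{100}}$, which is the whole group, while elements outside ${A}$ have escape norm ${1}$ with respect to ${A}$; this is exactly how the first trapping condition is lost.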

A key consequence of the Hrushovski Lie model theorem is that it allows one to replace approximate groups by strong approximate groups:

Exercise 1 (Finding strong approximate groups)

• (i) Let ${A}$ be an ultra approximate group with a good Lie model ${\pi: \langle A \rangle \rightarrow L}$, and let ${B}$ be a symmetric convex body (i.e. a convex open bounded subset) in the Lie algebra ${{\mathfrak l}}$. Show that if ${r>0}$ is a sufficiently small standard number, then there exists a strong ultra approximate group ${A'}$ with

$\displaystyle \pi^{-1}(\exp(rB)) \subset A' \subset \pi^{-1}(\exp(1.1 rB)) \subset A,$

and such that ${A}$ can be covered by finitely many left translates of ${A'}$. Furthermore, ${\pi}$ is also a good model for ${A'}$.

• (ii) If ${A}$ is a finite ${K}$-approximate group, show that there is a strong ${O_K(1)}$-approximate group ${A'}$ inside ${A^4}$ with the property that ${A}$ can be covered by ${O_K(1)}$ left translates of ${A'}$. (Hint: use (i), Hrushovski’s Lie model theorem, and a compactness and contradiction argument.)

The ability to compare the strong approximate group to an exponentiated small ball ${\exp(rB)}$ will be convenient later, as it allows one to easily use the geometry of ${L}$ to track various aspects of the strong approximate group.

As mentioned previously, strong approximate groups exhibit some of the features of NSS locally compact groups. In Notes 4, we saw that the escape norm for NSS locally compact groups was comparable to a Gleason metric. The following theorem is an analogue of that result:

Theorem 2 (Gleason lemma) Let ${A}$ be a strong ${K}$-approximate group in a group ${G}$.

• (Symmetry) For any ${g \in G}$, one has ${\|g^{-1}\|_{e,A} = \|g\|_{e,A}}$.
• (Conjugacy bound) For any ${g, h \in A^{10}}$, one has ${\|g^h\|_{e,A} \ll \|g\|_{e,A}}$.
• (Triangle inequality) For any ${g_1,\ldots,g_k \in G}$, one has ${\|g_1 \ldots g_k \|_{e,A} \ll_K (\|g_1\|_{e,A} + \ldots + \|g_k\|_{e,A})}$.
• (Escape property) One has ${\|g^n\|_{e,A} \gg |n| \|g\|_{e,A}}$ whenever ${|n| \|g\|_{e,A} < 1}$.
• (Commutator inequality) For any ${g,h \in A^{10}}$, one has ${\| [g,h] \|_{e,A} \ll_K \|g\|_{e,A} \|h\|_{e,A}}$.

The proof of this theorem will occupy a large part of the current set of notes. We then aim to use this theorem to classify strong approximate groups. The basic strategy (temporarily ignoring a key technical issue), which follows the Bieberbach-Frobenius proof of Jordan’s theorem as given in Notes 0, is as follows.

1. Start with an (ultra) strong approximate group ${A}$.
2. From the Gleason lemma, the elements with zero escape norm form a normal subgroup of ${A}$. Quotient these elements out. Show that all non-identity elements will have positive escape norm.
3. Find the non-identity element ${g_1}$ in (the quotient of) ${A}$ of minimal escape norm. Use the commutator estimate (assuming it is inherited by the quotient) to show that ${g_1}$ will centralise (most of) this quotient. In particular, the orbit ${\langle g_1 \rangle}$ is (essentially) a central subgroup of ${\langle A \rangle}$.
4. Quotient this orbit out; then find the non-identity element ${g_2}$ of minimal escape norm in this new quotient of ${A}$. Again, show that ${\langle g_2 \rangle}$ is essentially a central subgroup of this quotient.
5. Repeat this process until ${A}$ becomes entirely trivial. Undoing all the quotients, this should demonstrate that ${\langle A \rangle}$ is virtually nilpotent, and that ${A}$ is essentially a coset nilprogression.

There are two main technical issues to resolve to make this strategy work. The first is to show that the iterative step in the argument terminates in finite time. This we do by returning to the Lie model theorem. It turns out that each time one quotients out by an orbit of an element that escapes, the dimension of the Lie model drops by at least one. This will ensure termination of the argument in finite time.

The other technical issue is that while quotienting out all the elements of zero escape norm eliminates all “torsion” from ${A}$ (in the sense that the quotient of ${A}$ has no non-trivial elements of zero escape norm), further quotienting operations can inadvertently re-introduce such torsion. This torsion can be re-eradicated by further quotienting, but the price one pays for this is that the final structural description of ${\langle A \rangle}$ is no longer as strong as “virtually nilpotent”, but is instead a more complicated tower alternating between (ultra) finite extensions and central extensions.

Example 2 Consider the strong ${O(1)}$-approximate group

$\displaystyle A := \{ a N^{10} + 5 b: |a| \leq N; |b| \leq N^2 \}$

in the integers, where ${N}$ is a large natural number not divisible by ${5}$. As ${{\bf Z}}$ is torsion-free, all non-zero elements of ${A}$ have positive escape norm, and the nonzero element of minimal escape norm here is ${g=5}$ (or ${g=-5}$). But if one quotients by ${\langle g \rangle}$, ${A}$ projects down to ${{\bf Z}/5{\bf Z}}$, which now has torsion (and all elements in this quotient have zero escape norm). Thus torsion has been re-introduced by the quotienting operation. (A related observation is that the intersection of ${A}$ with ${\langle g \rangle = 5{\bf Z}}$ is not a simple progression, but is a more complicated object, namely a generalised arithmetic progression of rank two.)
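The phenomenon in Example 2 can be checked by brute force for a small value of ${N}$. The following is only an illustrative sketch: it takes ${N = 6}$ (an arbitrary choice not divisible by ${5}$), and it assumes the escape norm of ${g}$ is the infimum of ${1/(n+1)}$ over all ${n \geq 0}$ for which ${g, 2g, \ldots, ng}$ all lie in ${A}$ (written additively). The helper names `build_A` and `escape_norm` are hypothetical.

```python
from fractions import Fraction

def build_A(N):
    """Return {a*N**10 + 5*b : |a| <= N, |b| <= N**2} as a set of integers."""
    return {a * N**10 + 5 * b
            for a in range(-N, N + 1)
            for b in range(-N**2, N**2 + 1)}

def escape_norm(g, A):
    """Escape norm of g relative to the finite set A, in additive notation.

    Assumed definition: inf of 1/(n+1) over n >= 0 with g, 2g, ..., ng all in A.
    """
    if g == 0:
        return Fraction(0)  # every multiple of 0 stays in A
    n = 0
    # A is finite and g is non-zero, so the multiples of g eventually leave A.
    while (n + 1) * g in A:
        n += 1
    return Fraction(1, n + 1)

N = 6  # illustrative choice; any N not divisible by 5 could be used
A = build_A(N)

# Escape norms of all non-zero elements, and the elements attaining the minimum.
norms = {g: escape_norm(g, A) for g in A if g != 0}
min_norm = min(norms.values())
minimisers = sorted(g for g, v in norms.items() if v == min_norm)

# Image of A in Z/5Z, and the part of A lying in the subgroup 5Z.
residues = sorted({g % 5 for g in A})
in_5Z = [g for g in A if g % 5 == 0]

print("minimal escape norm:", min_norm, "attained at", minimisers)
print("residues of A mod 5:", residues)
print("|A| =", len(A), " |A ∩ 5Z| =", len(in_5Z))
```

For this choice of ${N}$ the intersection of ${A}$ with ${5{\bf Z}}$ picks up the elements with ${a \in \{-5,0,5\}}$ as well as those with ${a=0}$, which is the rank two structure mentioned at the end of the example.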

To deal with this issue, we will not quotient out by the entire cyclic group ${\langle g \rangle = \{g^n: n \in {\bf Z} \}}$ generated by the element ${g}$ of minimal escape norm, but rather by an arithmetic progression ${P = \{g^n: |n| \leq N\}}$, where ${N}$ is a natural number comparable to the reciprocal ${1/\|g\|_{e,A}}$ of the escape norm, as this will be enough to cut the dimension of the Lie model down by one without introducing any further torsion. Of course, this cannot be done in the category of global groups, since the arithmetic progression ${P}$ will not, in general, be a group. However, it is still a local group, and it turns out that there is an analogue of the quotient space construction in local groups. This fixes the problem, but at a cost: in order to make the inductive portion of the argument work smoothly, it is now more natural to place the entire argument inside the category of local groups rather than global groups, even though the primary interest in approximate groups ${A}$ is in the global case when ${A}$ lies inside a global group. This necessitates some technical modification to some of the preceding discussion (for instance, the Gleason-Yamabe theorem must be replaced by the local version of this theorem, due to Goldbring); details can be found in this recent paper of Emmanuel Breuillard, Ben Green, and myself, but will only be sketched here.

Let ${{\mathfrak g}}$ be a finite-dimensional Lie algebra (over the reals). Given two sufficiently small elements ${x, y}$ of ${{\mathfrak g}}$, define the right Baker-Campbell-Hausdorff-Dynkin law

$\displaystyle R_y(x) := x + \int_0^1 F_R( \hbox{Ad}_x \hbox{Ad}_{ty} ) y \ dt \ \ \ \ \ (1)$

where ${\hbox{Ad}_x := \exp(\hbox{ad}_x)}$, ${\hbox{ad}_x: {\mathfrak g} \rightarrow {\mathfrak g}}$ is the adjoint map ${\hbox{ad}_x(y) := [x,y]}$, and ${F_R}$ is the function ${F_R(z) := \frac{z \log z}{z-1}}$, which is analytic for ${z}$ near ${1}$. Similarly, define the left Baker-Campbell-Hausdorff-Dynkin law

$\displaystyle L_x(y) := y + \int_0^1 F_L( \hbox{Ad}_{tx} \hbox{Ad}_y ) x\ dt \ \ \ \ \ (2)$

where ${F_L(z) := \frac{\log z}{z-1}}$. One easily verifies that these expressions are well-defined (and depend smoothly on ${x}$ and ${y}$) when ${x}$ and ${y}$ are sufficiently small.
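As a quick sanity check on these definitions (a sketch of the leading-order behaviour only, not part of the argument that follows), one can Taylor expand ${F_R}$ and ${F_L}$ around ${z=1}$:

$\displaystyle F_R(z) = 1 + \frac{z-1}{2} + O(|z-1|^2), \qquad F_L(z) = 1 - \frac{z-1}{2} + O(|z-1|^2).$

Since ${\hbox{Ad}_x \hbox{Ad}_{ty} - 1 = \hbox{ad}_x + t\, \hbox{ad}_y + \ldots}$ and ${\hbox{Ad}_{tx} \hbox{Ad}_y - 1 = t\, \hbox{ad}_x + \hbox{ad}_y + \ldots}$, where ${\ldots}$ denotes terms of second or higher order in ${x, y}$, and since ${[x,x] = [y,y] = 0}$, the definitions (1) and (2) give

$\displaystyle R_y(x) = x + y + \frac{1}{2}[x,y] + \ldots, \qquad L_x(y) = x + y + \frac{1}{2}[x,y] + \ldots$

with errors of third order in ${x, y}$, so the two laws agree at least up to second order.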

We have the famous Baker-Campbell-Hausdorff-Dynkin formula:

Theorem 1 (BCH formula) Let ${G}$ be a finite-dimensional Lie group over the reals with Lie algebra ${{\mathfrak g}}$. Let ${\log}$ be a local inverse of the exponential map ${\exp: {\mathfrak g} \rightarrow G}$, defined in a neighbourhood of the identity. Then for sufficiently small ${x, y \in {\mathfrak g}}$, one has

$\displaystyle \log( \exp(x) \exp(y) ) = R_y(x) = L_x(y).$

See for instance these notes of mine for a proof of this formula (it is for ${R_y}$, but one easily obtains a similar proof for ${L_x}$).

In particular, one can give a neighbourhood of the identity in ${{\mathfrak g}}$ the structure of a local Lie group by defining the group operation ${\ast}$ as

$\displaystyle x \ast y := R_y(x) = L_x(y) \ \ \ \ \ (3)$

for sufficiently small ${x, y}$, and the inverse operation by ${x^{-1} := -x}$ (one easily verifies that ${R_x(-x) = L_x(-x) = 0}$ for all small ${x}$).
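To see the law (3) in action in the simplest non-abelian case, one can take ${{\mathfrak g}}$ to be the Heisenberg Lie algebra of strictly upper triangular ${3 \times 3}$ matrices. There all double brackets ${[x,[y,z]]}$ vanish, so (assuming the standard truncation of the BCH series in this two-step nilpotent setting) both ${R_y(x)}$ and ${L_x(y)}$ reduce to ${x + y + \frac{1}{2}[x,y]}$, and the matrix exponential is the polynomial ${1 + X + X^2/2}$. The sketch below checks the BCH identity, the inverse law and associativity in exact rational arithmetic; the function names are hypothetical and the test vectors are arbitrary.

```python
from fractions import Fraction

IDENTITY = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
ZERO = [[0, 0, 0], [0, 0, 0], [0, 0, 0]]

def mat_mul(X, Y):
    """Product of two 3x3 matrices given as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def mat_add(X, Y):
    return [[X[i][j] + Y[i][j] for j in range(3)] for i in range(3)]

def scale(c, X):
    return [[c * X[i][j] for j in range(3)] for i in range(3)]

def heis(a, b, c):
    """The Heisenberg Lie algebra element a*E12 + b*E23 + c*E13."""
    a, b, c = Fraction(a), Fraction(b), Fraction(c)
    return [[0, a, c], [0, 0, b], [0, 0, 0]]

def bracket(X, Y):
    """Lie bracket [X, Y] = XY - YX."""
    return mat_add(mat_mul(X, Y), scale(-1, mat_mul(Y, X)))

def exp_nilpotent(X):
    """exp(X) = I + X + X^2/2, exact since X^3 = 0 for these matrices."""
    return mat_add(mat_add(IDENTITY, X), scale(Fraction(1, 2), mat_mul(X, X)))

def star(X, Y):
    """x * y = x + y + [x, y]/2, the form (3) takes when double brackets vanish."""
    return mat_add(mat_add(X, Y), scale(Fraction(1, 2), bracket(X, Y)))

x = heis(1, 2, 3)
y = heis(Fraction(1, 2), -1, 4)
z = heis(-2, 5, Fraction(1, 3))

# BCH identity: exp(x) exp(y) = exp(x * y).
bch_holds = mat_mul(exp_nilpotent(x), exp_nilpotent(y)) == exp_nilpotent(star(x, y))
# Inverse law: x * (-x) = 0.
inverse_holds = star(x, scale(-1, x)) == ZERO
# Associativity: (x * y) * z = x * (y * z).
assoc_holds = star(star(x, y), z) == star(x, star(y, z))

print(bch_holds, inverse_holds, assoc_holds)
```

In this nilpotent example no smallness condition on ${x, y}$ is needed, since all the series involved terminate; for a general Lie algebra the same checks only make sense for sufficiently small ${x, y}$.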

It is tempting to reverse the BCH formula and conclude (the local form of) Lie’s third theorem, that every finite-dimensional Lie algebra is isomorphic to the Lie algebra of some local Lie group, by using (3) to define a smooth local group structure on a neighbourhood of the identity. (See this previous post for a definition of a local Lie group.) The main difficulty in doing so is in verifying that the definition (3) is well-defined (i.e. that ${R_y(x)}$ is always equal to ${L_x(y)}$) and locally associative. The well-definedness issue can be trivially disposed of by using just one of the expressions ${R_y(x)}$ or ${L_x(y)}$ as the definition of ${\ast}$ (though, as we shall see, it will be very convenient to use both of them simultaneously). However, the associativity is not obvious at all.

With the assistance of Ado’s theorem, which places ${{\mathfrak g}}$ inside the general linear Lie algebra ${\mathfrak{gl}_n({\bf R})}$ for some ${n}$, one can deduce both the well-definedness and associativity of (3) from the Baker-Campbell-Hausdorff formula for ${\mathfrak{gl}_n({\bf R})}$. However, Ado’s theorem is rather difficult to prove (see for instance this previous blog post for a proof), and it is natural to ask whether there is a way to establish these facts without Ado’s theorem.

After playing around with this for some time, I managed to extract a direct proof of well-definedness and local associativity of (3), giving a proof of Lie’s third theorem independent of Ado’s theorem. This is not a new result by any means (indeed, the original proofs of Lie and Cartan of Lie’s third theorem did not use Ado’s theorem), but I found it an instructive exercise to work out the details, and so I am putting it up on this blog in case anyone else is interested (and also because I want to be able to find the argument again if I ever need it in the future).

In the previous set of notes, we introduced the notion of an ultra approximate group – an ultraproduct ${A = \prod_{n \rightarrow\alpha} A_n}$ of finite ${K}$-approximate groups ${A_n}$ for some ${K}$ independent of ${n}$, where each ${K}$-approximate group ${A_n}$ may lie in a distinct ambient group ${G_n}$. Although these objects arise initially from the “finitary” objects ${A_n}$, it turns out that ultra approximate groups ${A}$ can be profitably analysed by means of infinitary groups ${L}$ (and in particular, locally compact groups or Lie groups ${L}$), by means of certain models ${\rho: \langle A \rangle \rightarrow L}$ of ${A}$ (or of the group ${\langle A \rangle}$ generated by ${A}$). We will define precisely what we mean by a model later, but as a first approximation one can view a model as a representation of the ultra approximate group ${A}$ (or of ${\langle A \rangle}$) that is “macroscopically faithful” in that it accurately describes the “large scale” behaviour of ${A}$ (or equivalently, that the kernel of the representation is “microscopic” in some sense). In the next section we will see how one can use “Gleason lemma” technology to convert this macroscopic control of an ultra approximate group into microscopic control, which will be the key to classifying approximate groups.

Models of ultra approximate groups can be viewed as the multiplicative combinatorics analogue of the more well known concept of an ultralimit of metric spaces, which we briefly review below the fold as motivation.

The crucial observation is that ultra approximate groups enjoy a local compactness property which allows them to be usefully modeled by locally compact groups (and hence, through the Gleason-Yamabe theorem from previous notes, by Lie groups also). As per the Heine-Borel theorem, the local compactness will come from a combination of a completeness property and a local total boundedness property. The completeness property turns out to be a direct consequence of the countable saturation property of ultraproducts, thus illustrating one of the key advantages of the ultraproduct setting. The local total boundedness property is more interesting. Roughly speaking, it asserts that “large bounded sets” (such as ${A}$ or ${A^{100}}$) can be covered by finitely many translates of “small bounded sets” ${S}$, where “small” is meant in a topological group sense, implying in particular that large powers ${S^m}$ of ${S}$ lie inside a set such as ${A}$ or ${A^4}$. The easiest way to obtain such a property comes from the following lemma of Sanders:

Lemma 1 (Sanders lemma) Let ${A}$ be a finite ${K}$-approximate group in a (global) group ${G}$, and let ${m \geq 1}$. Then there exists a symmetric subset ${S}$ of ${A^4}$ with ${|S| \gg_{K,m} |A|}$ containing the identity such that ${S^m \subset A^4}$.

This lemma has an elementary combinatorial proof, and is the key to endowing an ultra approximate group with locally compact structure. There is also a closely related lemma of Croot and Sisask which can achieve similar results, and which will also be discussed below. (The locally compact structure can also be established more abstractly using the much more general methods of definability theory, as was first done by Hrushovski, but we will not discuss this approach here.)

By combining the locally compact structure of ultra approximate groups ${A}$ with the Gleason-Yamabe theorem, one ends up being able to model a large “ultra approximate subgroup” ${A'}$ of ${A}$ by a Lie group ${L}$. Such Lie models serve a number of important purposes in the structure theory of approximate groups. Firstly, as all Lie groups have a dimension which is a natural number, they allow one to assign a natural number “dimension” to ultra approximate groups, which opens up the ability to perform “induction on dimension” arguments. Secondly, Lie groups have an escape property (which is in fact equivalent to the no small subgroups property): if a group element ${g}$ lies outside of a very small ball ${B_\epsilon}$, then some power ${g^n}$ of it will escape a somewhat larger ball ${B_1}$. Or equivalently: if a long orbit ${g, g^2, \ldots, g^n}$ lies inside the larger ball ${B_1}$, one can deduce that the original element ${g}$ lies inside the small ball ${B_\epsilon}$. Because all Lie groups have this property, we will be able to show that all ultra approximate groups ${A}$ “essentially” have a similar property, in that they are “controlled” by a nearby ultra approximate group which obeys a number of escape-type properties analogous to those enjoyed by small balls in a Lie group, and which we will call a strong ultra approximate group. This will be discussed in the next set of notes, where we will also see how these escape-type properties can be exploited to create a metric structure on strong approximate groups analogous to the Gleason metrics studied in previous notes, which can in turn be exploited (together with an induction on dimension argument) to fully classify such approximate groups (in the finite case, at least).

There are some cases where the analysis is particularly simple. For instance, in the bounded torsion case, one can show that the associated Lie model ${L}$ is necessarily zero-dimensional, which allows for an easy classification of approximate groups of bounded torsion.

Some of the material here is drawn from my recent paper with Ben Green and Emmanuel Breuillard, which is in turn inspired by a previous paper of Hrushovski.

Hilbert’s fifth problem concerns the minimal hypotheses one needs to place on a topological group ${G}$ to ensure that it is actually a Lie group. In the previous set of notes, we saw that one could reduce the regularity hypothesis imposed on ${G}$ to a “${C^{1,1}}$” condition, namely that there was an open neighbourhood of the identity in ${G}$ that was isomorphic (as a local group) to an open subset ${V}$ of a Euclidean space ${{\bf R}^d}$ with identity element ${0}$, and with group operation ${\ast}$ obeying the asymptotic

$\displaystyle x \ast y = x + y + O(|x| |y|)$

for sufficiently small ${x,y}$. We will call such local groups ${(V,\ast)}$ ${C^{1,1}}$ local groups.

We now reduce the regularity hypothesis further, to one in which there is no explicit Euclidean space that is initially attached to ${G}$. Of course, Lie groups are still locally Euclidean, so if the hypotheses on ${G}$ do not involve any explicit Euclidean spaces, then one must somehow build such spaces from other structures. One way to do so is to exploit an ambient space with Euclidean or Lie structure that ${G}$ is embedded or immersed in. A trivial example of this is provided by the following basic fact from linear algebra:

Lemma 1 If ${V}$ is a finite-dimensional vector space (i.e. it is isomorphic to ${{\bf R}^d}$ for some ${d}$), and ${W}$ is a linear subspace of ${V}$, then ${W}$ is also a finite-dimensional vector space.

We will establish a non-linear version of this statement, known as Cartan’s theorem. Recall that a subset ${S}$ of a ${d}$-dimensional smooth manifold ${M}$ is a ${d'}$-dimensional smooth (embedded) submanifold of ${M}$ for some ${0 \leq d' \leq d}$ if for every point ${x \in S}$ there is a smooth coordinate chart ${\phi: U \rightarrow V}$ of a neighbourhood ${U}$ of ${x}$ in ${M}$ that maps ${x}$ to ${0}$, such that ${\phi(U \cap S) = V \cap {\bf R}^{d'}}$, where we identify ${{\bf R}^{d'} \equiv {\bf R}^{d'} \times \{0\}^{d-d'}}$ with a subspace of ${{\bf R}^d}$. Informally, ${S}$ locally sits inside ${M}$ the same way that ${{\bf R}^{d'}}$ sits inside ${{\bf R}^d}$.

Theorem 2 (Cartan’s theorem) If ${H}$ is a (topologically) closed subgroup of a Lie group ${G}$, then ${H}$ is a smooth submanifold of ${G}$, and is thus also a Lie group.

Note that the hypothesis that ${H}$ is closed is essential; for instance, the rationals ${{\bf Q}}$ are a subgroup of the (additive) group of reals ${{\bf R}}$, but the former is not a Lie group even though the latter is.

Exercise 1 Let ${H}$ be a subgroup of a locally compact group ${G}$. Show that ${H}$ is closed in ${G}$ if and only if it is locally compact.

A variant of the above results is provided by using (faithful) representations instead of embeddings. Again, the linear version is trivial:

Lemma 3 If ${V}$ is a finite-dimensional vector space, and ${W}$ is another vector space with an injective linear transformation ${\rho: W \rightarrow V}$ from ${W}$ to ${V}$, then ${W}$ is also a finite-dimensional vector space.

Here is the non-linear version:

Theorem 4 (von Neumann’s theorem) If ${G}$ is a Lie group, and ${H}$ is a locally compact group with an injective continuous homomorphism ${\rho: H \rightarrow G}$, then ${H}$ also has the structure of a Lie group.

Actually, it will suffice for the homomorphism ${\rho}$ to be locally injective rather than injective; related to this, von Neumann’s theorem localises to the case when ${H}$ is a local group rather than a group. The requirement that ${H}$ be locally compact is necessary, for much the same reason that the requirement that ${H}$ be closed was necessary in Cartan’s theorem.

Example 1 Let ${G = ({\bf R}/{\bf Z})^2}$ be the two-dimensional torus, let ${H = {\bf R}}$, and let ${\rho: H \rightarrow G}$ be the map ${\rho(x) := (x,\alpha x)}$, where ${\alpha \in {\bf R}}$ is a fixed real number. Then ${\rho}$ is a continuous homomorphism which is locally injective, and is even globally injective if ${\alpha}$ is irrational, and so Theorem 4 is consistent with the fact that ${H}$ is a Lie group. On the other hand, note that when ${\alpha}$ is irrational, then ${\rho(H)}$ is not closed; and so Theorem 4 does not follow immediately from Theorem 2 in this case. (We will see, though, that Theorem 4 follows from a local version of Theorem 2.)

As a corollary of Theorem 4, we observe that any locally compact Hausdorff group ${H}$ with a faithful linear representation, i.e. a continuous injective homomorphism from ${H}$ into a linear group such as ${GL_n({\bf R})}$ or ${GL_n({\bf C})}$, is necessarily a Lie group. This suggests a representation-theoretic approach to Hilbert’s fifth problem. While this approach does not seem to readily solve the entire problem, it can be used to establish a number of important special cases with a well-understood representation theory, such as the compact case or the abelian case (for which the requisite representation theory is given by the Peter-Weyl theorem and Pontryagin duality respectively). We will discuss these cases further in later notes.

In all of these cases, one is not really building up Euclidean or Lie structure completely from scratch, because there is already a Euclidean or Lie structure present in another object in the hypotheses. Now we turn to results that can create such structure assuming only what is ostensibly a weaker amount of structure. In the linear case, one example of this is the following classical result in the theory of topological vector spaces.

Theorem 5 Let ${V}$ be a locally compact Hausdorff topological vector space. Then ${V}$ is isomorphic (as a topological vector space) to ${{\bf R}^d}$ for some finite ${d}$.

Remark 1 The Banach-Alaoglu theorem asserts that in a normed vector space ${V}$, the closed unit ball in the dual space ${V^*}$ is always compact in the weak-* topology. Of course, this dual space ${V^*}$ may be infinite-dimensional. This however does not contradict the above theorem, because the closed unit ball is not a neighbourhood of the origin in the weak-* topology (it is only a neighbourhood with respect to the strong topology).

The full non-linear analogue of this theorem would be the Gleason-Yamabe theorem, which we are not yet ready to prove in this set of notes. However, by using methods similar to those used to prove Cartan’s theorem and von Neumann’s theorem, one can obtain a partial non-linear analogue which requires an additional hypothesis of a special type of metric, which we will call a Gleason metric:

Definition 6 Let ${G}$ be a topological group. A Gleason metric on ${G}$ is a left-invariant metric ${d: G \times G \rightarrow {\bf R}^+}$ which generates the topology on ${G}$ and obeys the following properties for some constant ${C>0}$, writing ${\|g\|}$ for ${d(g,\hbox{id})}$:

• (Escape property) If ${g \in G}$ and ${n \geq 1}$ is such that ${n \|g\| \leq \frac{1}{C}}$, then ${\|g^n\| \geq \frac{1}{C} n \|g\|}$.
• (Commutator estimate) If ${g, h \in G}$ are such that ${\|g\|, \|h\| \leq \frac{1}{C}}$, then

$\displaystyle \|[g,h]\| \leq C \|g\| \|h\|, \ \ \ \ \ (1)$

where ${[g,h] := g^{-1}h^{-1}gh}$ is the commutator of ${g}$ and ${h}$.

Exercise 2 Let ${G}$ be a topological group that contains a neighbourhood of the identity isomorphic to a ${C^{1,1}}$ local group. Show that ${G}$ admits at least one Gleason metric.

Theorem 7 (Building Lie structure from Gleason metrics) Let ${G}$ be a locally compact group that has a Gleason metric. Then ${G}$ is isomorphic to a Lie group.

We will rely on Theorem 7 to solve Hilbert’s fifth problem; this theorem reduces the task of establishing Lie structure on a locally compact group to that of building a metric with suitable properties. Thus, much of the remainder of the solution of Hilbert’s fifth problem will now be focused on the problem of how to construct good metrics on a locally compact group.

In all of the above results, a key idea is to use one-parameter subgroups to convert from the nonlinear setting to the linear setting. Recall from the previous notes that in a Lie group ${G}$, the one-parameter subgroups are in one-to-one correspondence with the elements of the Lie algebra ${{\mathfrak g}}$, which is a vector space. In a general topological group ${G}$, the concept of a one-parameter subgroup (i.e. a continuous homomorphism from ${{\bf R}}$ to ${G}$) still makes sense; the main difficulties are then to show that the space of such subgroups continues to form a vector space, and that the associated exponential map ${\exp: \phi \mapsto \phi(1)}$ is still a local homeomorphism near the origin.

Exercise 3 The purpose of this exercise is to illustrate the perspective that a topological group can be viewed as a non-linear analogue of a vector space. Let ${G, H}$ be locally compact groups. For technical reasons we assume that ${G, H}$ are both ${\sigma}$-compact and metrisable.

• (i) (Open mapping theorem) Show that if ${\phi: G \rightarrow H}$ is a continuous homomorphism which is surjective, then it is open (i.e. the image of open sets is open). (Hint: mimic the proof of the open mapping theorem for Banach spaces, as discussed for instance in these notes. In particular, take advantage of the Baire category theorem.)
• (ii) (Closed graph theorem) Show that if a homomorphism ${\phi: G \rightarrow H}$ is closed (i.e. its graph ${\{ (g, \phi(g)): g \in G \}}$ is a closed subset of ${G \times H}$), then it is continuous. (Hint: mimic the derivation of the closed graph theorem from the open mapping theorem in the Banach space case, as again discussed in these notes.)
• (iii) Let ${\phi: G \rightarrow H}$ be a homomorphism, and let ${\rho: H \rightarrow K}$ be a continuous injective homomorphism into another Hausdorff topological group ${K}$. Show that ${\phi}$ is continuous if and only if ${\rho \circ \phi}$ is continuous.
• (iv) Relax the condition of metrisability to that of being Hausdorff. (Hint: Now one cannot use the Baire category theorem for metric spaces; but there is an analogue of this theorem for locally compact Hausdorff spaces.)

In this set of notes, we describe the basic analytic structure theory of Lie groups, by relating them to the simpler concept of a Lie algebra. Roughly speaking, the Lie algebra encodes the “infinitesimal” structure of a Lie group, but is a simpler object, being a vector space rather than a nonlinear manifold. Nevertheless, thanks to the fundamental theorems of Lie, the Lie algebra can be used to reconstruct the Lie group (at a local level, at least), by means of the exponential map and the Baker-Campbell-Hausdorff formula. As such, the local theory of Lie groups is completely described (in principle, at least) by the theory of Lie algebras, which leads to a number of useful consequences, such as the following:

• (Local Lie implies Lie) A topological group ${G}$ is Lie (i.e. it is isomorphic to a Lie group) if and only if it is locally Lie (i.e. the group operations are smooth near the origin).
• (Uniqueness of Lie structure) A topological group has at most one smooth structure on it that makes it Lie.
• (Weak regularity implies strong regularity, I) Lie groups are automatically real analytic. (In fact one only needs a “local ${C^{1,1}}$” regularity on the group structure to obtain real analyticity.)
• (Weak regularity implies strong regularity, II) A continuous homomorphism from one Lie group to another is automatically smooth (and real analytic).

The connection between Lie groups and Lie algebras also highlights the role of one-parameter subgroups of a topological group, which will play a central role in the solution of Hilbert’s fifth problem.

We note that there is also a very important algebraic structure theory of Lie groups and Lie algebras, in which the Lie algebra is split into solvable and semisimple components, with the latter being decomposed further into simple components, which can then be completely classified using Dynkin diagrams. This classification is of fundamental importance in many areas of mathematics (e.g. representation theory, arithmetic geometry, and group theory), and many of the deeper facts about Lie groups and Lie algebras are proven via this classification (although in such cases it can be of interest to also find alternate proofs that avoid the classification). However, it turns out that we will not need this theory in this course, and so we will not discuss it further here (though it can of course be found in any graduate text on Lie groups and Lie algebras).

One of the fundamental structures in modern mathematics is that of a group. Formally, a group is a set ${G = (G,1,\cdot,()^{-1})}$ equipped with an identity element ${1 = 1_G \in G}$, a multiplication operation ${\cdot: G \times G \rightarrow G}$, and an inversion operation ${()^{-1}: G \rightarrow G}$ obeying the following axioms:

• (Closure) If ${g, h \in G}$, then ${g \cdot h}$ and ${g^{-1}}$ are well-defined and lie in ${G}$. (This axiom is redundant from the above description, but we include it for emphasis.)
• (Associativity) If ${g, h, k \in G}$, then ${(g \cdot h) \cdot k = g \cdot (h \cdot k)}$.
• (Identity) If ${g \in G}$, then ${g \cdot 1 = 1 \cdot g = g}$.
• (Inverse) If ${g \in G}$, then ${g \cdot g^{-1} = g^{-1} \cdot g = 1}$.

One can also consider additive groups ${G = (G,0,+,-)}$ instead of multiplicative groups, with the obvious changes of notation. By convention, additive groups are always understood to be abelian, so it is convenient to use additive notation when one wishes to emphasise the abelian nature of the group structure. As usual, we often abbreviate ${g \cdot h}$ by ${gh}$ (and ${1_G}$ by ${1}$) when there is no chance of confusion.

If furthermore ${G}$ is equipped with a topology, and the group operations ${\cdot, ()^{-1}}$ are continuous in this topology, then ${G}$ is a topological group. Any group can be made into a topological group by imposing the discrete topology, but there are many more interesting examples of topological groups, such as Lie groups, in which ${G}$ is not just a topological space, but is in fact a smooth manifold (and the group operations are not merely continuous, but also smooth).

There are many naturally occurring group-like objects that obey some, but not all, of the axioms. For instance, monoids are required to obey the closure, associativity, and identity axioms, but not the inverse axiom. If we also drop the identity axiom, we end up with a semigroup. Groupoids do not necessarily obey the closure axiom, but obey (versions of) the associativity, identity, and inverse axioms. And so forth.

Another group-like concept is that of a local topological group (or local group, for short), which is essentially a topological group with the closure axiom omitted (but which does not obey the same set of axioms as a groupoid); local groups arise primarily in the study of local properties of (global) topological groups, and also in the study of approximate groups in additive combinatorics. Formally, a local group ${G = (G, \Omega, \Lambda, 1, \cdot, ()^{-1})}$ is a topological space ${G}$ equipped with an identity element ${1 \in G}$, a partially defined but continuous multiplication operation ${\cdot: \Omega \rightarrow G}$ for some domain ${\Omega \subset G \times G}$, and a partially defined but continuous inversion operation ${()^{-1}: \Lambda \rightarrow G}$, where ${\Lambda \subset G}$, obeying the following axioms:

• (Local closure) ${\Omega}$ is an open neighbourhood of ${G \times \{1\} \cup \{1\} \times G}$, and ${\Lambda}$ is an open neighbourhood of ${1}$.
• (Local associativity) If ${g, h, k \in G}$ are such that ${(g \cdot h) \cdot k}$ and ${g \cdot (h \cdot k)}$ are both well-defined, then they are equal. (Note however that it may be possible for one of these products to be defined but not the other, in contrast for instance with groupoids.)
• (Identity) For all ${g \in G}$, ${g \cdot 1 = 1 \cdot g = g}$.
• (Local inverse) If ${g \in G}$ and ${g^{-1}}$ is well-defined, then ${g \cdot g^{-1} = g^{-1} \cdot g = 1}$. (In particular this, together with the other axioms, forces ${1^{-1} = 1}$.)

We will often refer to ordinary groups as global groups (and topological groups as global topological groups) to distinguish them from local groups. Every global topological group is a local group, but not conversely.

One can consider discrete local groups, in which the topology is the discrete topology; in this case, the openness and continuity axioms in the definition are automatic and can be omitted. At the other extreme, one can consider local Lie groups, in which the local group ${G}$ has the structure of a smooth manifold, and the group operations are smooth. We can also consider symmetric local groups, in which ${\Lambda=G}$ (i.e. inverses are always defined). Symmetric local groups have the advantage of local homogeneity: given any ${g \in G}$, the operation of left-multiplication ${x \mapsto gx}$ is locally inverted by ${x \mapsto g^{-1} x}$ near the identity, thus giving a homeomorphism between a neighbourhood of ${g}$ and a neighbourhood of the identity; in particular, we see that given any two group elements ${g, h}$ in a symmetric local group ${G}$, there is a homeomorphism between a neighbourhood of ${g}$ and a neighbourhood of ${h}$. (If the symmetric local group is also Lie, then these homeomorphisms are in fact diffeomorphisms.) This local homogeneity already simplifies a lot of the possible topology of symmetric local groups, as it basically means that the local topological structure of such groups is determined by the local structure at the origin. (For instance, all connected components of a local Lie group necessarily have the same dimension.) It is easy to see that any local group has at least one symmetric open neighbourhood of the identity, so in many situations we can restrict to the symmetric case without much loss of generality.

A prime example of a local group can be formed by restricting any global topological group ${G}$ to an open neighbourhood ${U \subset G}$ of the identity, with the domains

$\displaystyle \Omega := \{ (g,h) \in U \times U: g \cdot h \in U \}$

and

$\displaystyle \Lambda := \{ g \in U: g^{-1} \in U \};$

one easily verifies that this gives ${U}$ the structure of a local group (which we will sometimes call ${G\downharpoonright_U}$ to emphasise the original group ${G}$). If ${U}$ is symmetric (i.e. ${U^{-1}=U}$), then we in fact have a symmetric local group. One can also restrict local groups ${G}$ to open neighbourhoods ${U}$ to obtain a smaller local group ${G\downharpoonright_U}$ by the same procedure (adopting the convention that statements such as ${g \cdot h \in U}$ or ${g^{-1} \in U}$ are considered false if the left-hand side is undefined). (Note though that if one restricts to non-open neighbourhoods of the identity, then one usually does not get a local group; for instance ${[-1,1]}$ is not a local group (why?).)

Finite subsets of (Hausdorff) groups containing the identity can be viewed as local groups. This point of view turns out to be particularly useful for studying approximate groups in additive combinatorics, a point which I hope to expound more on later. Thus, for instance, the discrete interval ${\{-9,\ldots,9\} \subset {\bf Z}}$ is an additive symmetric local group, which informally might model an adding machine that can only handle (signed) one-digit numbers. More generally, one can view a local group as an object that behaves like a group near the identity, but for which the group laws (and in particular, the closure axiom) can start breaking down once one moves far enough away from the identity.

One can formalise this intuition as follows. Let us say that a word ${g_1 \ldots g_n}$ in a local group ${G}$ is well-defined in ${G}$ (or well-defined, for short) if every possible way of associating this word using parentheses is well-defined from applying the product operation. For instance, in order for ${abcd}$ to be well-defined, ${((ab)c)d}$, ${(a(bc))d}$, ${(ab)(cd)}$, ${a(b(cd))}$, and ${a((bc)d)}$ must all be well-defined. In the preceding example ${\{-9,\ldots,9\}}$, ${-2+6+5}$ is not well-defined because one of the ways of associating this sum, namely ${-2+(6+5)}$, is not well-defined (even though ${(-2+6)+5}$ is well-defined).
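The notion of a well-defined word can be made concrete for the adding-machine example ${\{-9,\ldots,9\}}$. The following is a minimal sketch, assuming a partial addition that returns `None` whenever a sum leaves the interval; the helper names `add`, `evaluations` and `is_well_defined` are hypothetical and chosen only for illustration.

```python
LIMIT = 9  # the local group is {-9, ..., 9} under (partial) addition


def add(a, b):
    """Partial addition: None means the sum is undefined in {-9, ..., 9}."""
    if a is None or b is None:
        return None  # an undefined sub-expression makes the whole expression undefined
    s = a + b
    return s if -LIMIT <= s <= LIMIT else None


def evaluations(word):
    """Values of the word under every way of parenthesising it (None = undefined)."""
    if len(word) == 1:
        return [word[0]]
    results = []
    for k in range(1, len(word)):  # split the word as (prefix)(suffix)
        for left in evaluations(word[:k]):
            for right in evaluations(word[k:]):
                results.append(add(left, right))
    return results


def is_well_defined(word):
    """A word is well-defined if every parenthesisation can be evaluated."""
    return all(value is not None for value in evaluations(word))


print(evaluations((-2, 6, 5)))      # [None, 9]: -2+(6+5) fails, (-2+6)+5 = 9
print(is_well_defined((-2, 6, 5)))  # False
print(is_well_defined((1, 2, 3)))   # True
```

The word ${-2+6+5}$ fails precisely because of the association ${-2+(6+5)}$, matching the discussion above.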

Exercise 1 (Iterating the associative law)

• Show that if a word ${g_1 \ldots g_n}$ in a local group is well-defined, then all ways of associating this word give the same answer, and so we can uniquely evaluate ${g_1 \ldots g_n}$ as an element in ${G}$.
• Give an example of a word ${g_1 \ldots g_n}$ in a local group which has two ways of being associated that are both well-defined, but give different answers. (Hint: the local associativity axiom prevents this from happening for ${n \leq 3}$, so try ${n=4}$. A small discrete local group will already suffice to give a counterexample; verifying the local group axioms is easier if one makes the domain of definition of the group operations as small as one can get away with while still having the counterexample.)

Exercise 2 Show that the number of ways to associate a word ${g_1 \ldots g_n}$ is given by the Catalan number ${C_{n-1} := \frac{1}{n} \binom{2n-2}{n-1}}$.
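The Catalan formula in Exercise 2 can be sanity-checked numerically (this is of course not a proof). The sketch below counts associations by splitting a word of length ${n}$ into a prefix and a suffix, and compares the count with ${\frac{1}{n} \binom{2n-2}{n-1}}$; the function names are hypothetical.

```python
from functools import lru_cache
from math import comb


@lru_cache(maxsize=None)
def count_associations(n):
    """Number of ways to fully parenthesise a word of length n."""
    if n == 1:
        return 1
    # The outermost product splits the word into lengths k and n - k.
    return sum(count_associations(k) * count_associations(n - k) for k in range(1, n))


def catalan_formula(n):
    """C_{n-1} = (1/n) * binom(2n-2, n-1); the division is always exact."""
    return comb(2 * n - 2, n - 1) // n


for n in range(1, 11):
    assert count_associations(n) == catalan_formula(n)

print([count_associations(n) for n in range(1, 8)])  # [1, 1, 2, 5, 14, 42, 132]
```

The value ${5}$ for ${n=4}$ agrees with the five associations of ${abcd}$ listed earlier.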

Exercise 3 Let ${G}$ be a local group, and let ${m \geq 1}$ be an integer. Show that there exists a symmetric open neighbourhood ${U_m}$ of the identity such that every word of length ${m}$ in ${U_m}$ is well-defined in ${G}$ (or more succinctly, ${U_m^m}$ is well-defined). (Note though that these words will usually only take values in ${G}$, rather than in ${U_m}$, and also the sets ${U_m}$ tend to become smaller as ${m}$ increases.)

In many situations (such as when one is investigating the local structure of a global group) one is only interested in the local properties of a (local or global) group. We can formalise this by the following definition. Let us call two local groups ${G = (G, \Omega, \Lambda, 1_G, \cdot, ()^{-1})}$ and ${G' = (G', \Omega', \Lambda', 1_{G'}, \cdot, ()^{-1})}$ locally identical if they have a common restriction, thus there exists a set ${U \subset G \cap G'}$ such that ${G\downharpoonright_U = G'\downharpoonright_U}$ (thus, ${1_G = 1_{G'}}$, and the topology and group operations of ${G}$ and ${G'}$ agree on ${U}$). This is easily seen to be an equivalence relation. We call an equivalence class ${[G]}$ of local groups a group germ.

Let ${{\mathcal P}}$ be a property of a local group (e.g. abelianness, connectedness, compactness, etc.). We call a group germ locally ${{\mathcal P}}$ if every local group in that germ has a restriction that obeys ${{\mathcal P}}$; we call a local or global group ${G}$ locally ${{\mathcal P}}$ if its germ is locally ${{\mathcal P}}$ (or equivalently, every open neighbourhood of the identity in ${G}$ contains a further neighbourhood that obeys ${{\mathcal P}}$). Thus, the study of local properties of (local or global) groups is subsumed by the study of group germs.

Exercise 4

• Show that the above general definition is consistent with the usual definitions of the properties “connected” and “locally connected” from point-set topology.
• Strictly speaking, the above definition is not consistent with the usual definitions of the properties “compact” and “locally compact” from point-set topology because in the definition of local compactness, the compact neighbourhoods are certainly not required to be open. Show however that the point-set topology notion of “locally compact” is equivalent, using the above conventions, to the notion of “locally precompact inside of an ambient local group”. Of course, this is a much more clumsy terminology, and so we shall abuse notation slightly and continue to use the standard terminology “locally compact” even though it is, strictly speaking, not compatible with the above general convention.
• Show that a local group is discrete if and only if it is locally trivial.
• Show that a connected global group is abelian if and only if it is locally abelian. (Hint: in a connected global group, the only open subgroup is the whole group.)
• Show that a global topological group is first-countable if and only if it is locally first countable. (By the Birkhoff-Kakutani theorem, this implies that such groups are metrisable if and only if they are locally metrisable.)
• Let ${p}$ be a prime. Show that the solenoid group ${{\bf Z}_p \times {\bf R} / {\bf Z}^\Delta}$, where ${{\bf Z}_p}$ is the ${p}$-adic integers and ${{\bf Z}^\Delta := \{ (n,n): n \in {\bf Z}\}}$ is the diagonal embedding of ${{\bf Z}}$ inside ${{\bf Z}_p \times {\bf R}}$, is connected but not locally connected.

Remark 1 One can also study the local properties of groups using nonstandard analysis. Instead of group germs, one works (at least in the case when ${G}$ is first countable) with the monad ${o(G)}$ of the identity element ${1_G}$ of ${G}$, defined as the nonstandard group elements ${g = \lim_{n \rightarrow \alpha} g_n}$ in ${{}^* G}$ that are infinitesimally close to the origin in the sense that they lie in every standard neighbourhood of the identity. The monad ${o(G)}$ is closely related to the group germ ${[G]}$, but has the advantage of being a genuine (global) group, as opposed to an equivalence class of local groups. It is possible to recast most of the results here in this nonstandard formulation; see e.g. the classic text of Robinson. However, we will not adopt this perspective here.

A useful fact to know is that Lie structure is local. Call a (global or local) topological group Lie if it can be given the structure of a (global or local) Lie group.

Lemma 1 (Lie is a local property) A global topological group ${G}$ is Lie if and only if it is locally Lie. The same statement holds for local groups ${G}$ as long as they are symmetric.

We sketch a proof of this lemma below the fold. One direction is obvious, as the restriction of a global Lie group to an open neighbourhood of the origin is clearly a local Lie group; for instance, the continuous interval ${(-10,10) \subset {\bf R}}$ is a symmetric local Lie group. The converse direction is almost as easy, but (because we are not assuming ${G}$ to be connected) requires one non-trivial fact, namely that local homomorphisms between local Lie groups are automatically smooth; details are provided below the fold.

As with so many other basic classes of objects in mathematics, it is of fundamental importance to specify and study the morphisms between local groups (and group germs). Given two local groups ${G, G'}$, we can define the notion of a (continuous) homomorphism ${\phi: G \rightarrow G'}$ between them, defined as a continuous map with

$\displaystyle \phi(1_G) = 1_{G'}$

such that whenever ${g, h \in G}$ are such that ${gh}$ is well-defined, then ${\phi(g)\phi(h)}$ is well-defined and equal to ${\phi(gh)}$; similarly, whenever ${g \in G}$ is such that ${g^{-1}}$ is well-defined, then ${\phi(g)^{-1}}$ is well-defined and equal to ${\phi(g^{-1})}$. (In abstract algebra, the continuity requirement is omitted from the definition of a homomorphism; we will call such maps discrete homomorphisms to distinguish them from the continuous ones which will be the ones studied here.)

It is often more convenient to work locally: define a local (continuous) homomorphism ${\phi: U \rightarrow G'}$ from ${G}$ to ${G'}$ to be a homomorphism from an open neighbourhood ${U}$ of the identity to ${G'}$. Given two local homomorphisms ${\phi: U \rightarrow G'}$, ${\tilde \phi: \tilde U \rightarrow \tilde G'}$ from one pair of locally identical groups ${G, \tilde G}$ to another pair ${G', \tilde G'}$, we say that ${\phi, \tilde \phi}$ are locally identical if they agree on some open neighbourhood of the identity in ${U \cap \tilde U}$ (note that it does not matter here whether we require openness in ${G}$, in ${\tilde G}$, or both). An equivalence class ${[\phi]}$ of local homomorphisms will be called a germ homomorphism (or morphism for short) from the group germ ${[G]}$ to the group germ ${[G']}$.

Exercise 5 Show that the class of group germs, equipped with the germ homomorphisms, becomes a category. (Strictly speaking, because group germs are themselves classes rather than sets, the collection of all group germs is a second-order class rather than a class, but this set-theoretic technicality can be resolved in a number of ways (e.g. by restricting all global and local groups under consideration to some fixed “universe”) and should be ignored for this exercise.)

As is usual in category theory, once we have a notion of a morphism, we have a notion of an isomorphism: two group germs ${[G], [G']}$ are isomorphic if there are germ homomorphisms ${\phi: [G] \rightarrow [G']}$, ${\psi: [G'] \rightarrow [G]}$ that invert each other. Lifting back to local groups, the associated notion is that of local isomorphism: two local groups ${G, G'}$ are locally isomorphic if there exist local homomorphisms ${\phi: U \rightarrow G'}$ and ${\psi: U' \rightarrow G}$ from ${G}$ to ${G'}$ and from ${G'}$ to ${G}$ that locally invert each other, thus ${\psi(\phi(g))=g}$ for ${g \in G}$ sufficiently close to ${1_G}$, and ${\phi(\psi(g'))=g'}$ for ${g' \in G'}$ sufficiently close to ${1_{G'}}$. Note that all local properties of (global or local) groups that can be defined purely in terms of the group and topological structures will be preserved under local isomorphism. Thus, for instance, if ${G, G'}$ are locally isomorphic local groups, then ${G}$ is locally connected iff ${G'}$ is, ${G}$ is locally compact iff ${G'}$ is, and (by Lemma 1) ${G}$ is Lie iff ${G'}$ is.

Exercise 6

• Show that the additive global groups ${{\bf R}/{\bf Z}}$ and ${{\bf R}}$ are locally isomorphic.
• Show that every locally path-connected group ${G}$ is locally isomorphic to a path-connected, simply connected group.

— 1. Lie’s third theorem —

Lie’s fundamental theorems of Lie theory link the Lie group germs to Lie algebras. Observe that if ${[G]}$ is a locally Lie group germ, then the tangent space ${{\mathfrak g} := T_1 G}$ at the identity of this germ is well-defined, and is a finite-dimensional vector space. If we choose ${G}$ to be symmetric, then ${{\mathfrak g}}$ can also be identified with the left-invariant (say) vector fields on ${G}$, which are first-order differential operators on ${C^\infty(G)}$. The Lie bracket for vector fields then endows ${{\mathfrak g}}$ with the structure of a Lie algebra. It is easy to check that every morphism ${\phi: [G] \rightarrow [H]}$ of locally Lie germs gives rise (via the derivative map at the identity) to a morphism ${D\phi(1): {\mathfrak g} \rightarrow {\mathfrak h}}$ of the associated Lie algebras. From the Baker-Campbell-Hausdorff formula (which is valid for local Lie groups, as discussed in this previous post) we conversely see that ${D\phi(1)}$ uniquely determines the germ homomorphism ${\phi}$. Thus the derivative map provides a covariant functor from the category of locally Lie group germs to the category of (finite-dimensional) Lie algebras. In fact, this functor is an isomorphism, which is part of a fact known as Lie’s third theorem:

Theorem 2 (Lie’s third theorem) For this theorem, all Lie algebras are understood to be finite dimensional (and over the reals).

1. Every Lie algebra ${{\mathfrak g}}$ is the Lie algebra of a local Lie group germ ${[G]}$, which is unique up to germ isomorphism (fixing ${{\mathfrak g}}$).
2. Every Lie algebra ${{\mathfrak g}}$ is the Lie algebra of some global connected, simply connected Lie group ${G}$, which is unique up to Lie group isomorphism (fixing ${{\mathfrak g}}$).
3. Every homomorphism ${\Phi: {\mathfrak g} \rightarrow {\mathfrak h}}$ between Lie algebras is the derivative of a unique germ homomorphism ${\phi: [G] \rightarrow [H]}$ between the associated local Lie group germs.
4. Every homomorphism ${\Phi: {\mathfrak g} \rightarrow {\mathfrak h}}$ between Lie algebras is the derivative of a unique Lie group homomorphism ${\phi: G \rightarrow H}$ between the associated global connected, simply connected, Lie groups.
5. Every local Lie group germ is the germ of a global connected, simply connected Lie group ${G}$, which is unique up to Lie group isomorphism. In particular, every local Lie group is locally isomorphic to a global Lie group.

We record the (standard) proof of this theorem below the fold, which is ultimately based on Ado’s theorem and the Baker-Campbell-Hausdorff formula. Lie’s third theorem (which, actually, was proven in full generality by Cartan) demonstrates the equivalence of three categories: the category of finite-dimensional Lie algebras, the category of local Lie group germs, and the category of connected, simply connected Lie groups.

— 2. Globalising a local group —

Many properties of a local group improve after passing to a smaller neighbourhood of the identity. Here are some simple examples:

Exercise 7 Let ${G}$ be a local group.

Note that the counterexamples in the above exercise demonstrate that not every local group is the restriction of a global group, because global groups (and hence, their restrictions) always obey the cancellation law (1), the inversion law (2), and the involution law (3). Another way in which a local group can fail to come from a global group is if it contains relations which can interact in a “global” way to cause trouble, in a fashion which is invisible at the local level. For instance, consider the open unit cube ${(-1,1)^3}$, and consider four points ${a_1, a_2, a_3, a_4}$ in this cube that are close to the upper four corners ${(1,1,1), (1,1,-1), (1,-1,1), (1,-1,-1)}$ of this cube respectively. Define an equivalence relation ${\sim}$ on this cube by setting ${x \sim y}$ if ${x, y \in (-1,1)^3}$ and ${x-y}$ is equal to either ${0}$ or ${\pm 2a_i}$ for some ${i=1,\ldots,4}$. Note that this is indeed an equivalence relation if ${a_1,a_2,a_3,a_4}$ are close enough to the corners (as this forces all non-trivial combinations ${\pm 2a_i \pm 2a_j}$ to lie outside the doubled cube ${(-2,2)^3}$). The quotient space ${(-1,1)^3/\sim}$ (which is a cube with bits around opposite corners identified together) can then be seen to be a symmetric additive local Lie group, but will usually not come from a global group. Indeed, it is not hard to see that if ${(-1,1)^3/\sim}$ is the restriction of a global group ${G}$, then ${G}$ must be a Lie group with Lie algebra ${{\bf R}^3}$ (by Lemma 1), and so the connected component ${G^\circ}$ of ${G}$ containing the identity is isomorphic to ${{\bf R}^3/\Gamma}$ for some sublattice ${\Gamma}$ of ${{\bf R}^3}$ that contains ${a_1,a_2,a_3,a_4}$; but for generic ${a_1,a_2,a_3,a_4}$, there is no such lattice, as the ${a_i}$ will generate a dense subset of ${{\bf R}^3}$. (The situation here is somewhat analogous to a number of famous Escher prints, such as Ascending and Descending, in which the geometry is locally consistent but globally inconsistent.) We will give this sort of argument in more detail below the fold (see the proof of Proposition 7).
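The transitivity of ${\sim}$ rests on the claim that every non-trivial combination ${\pm 2a_i \pm 2a_j}$ lies outside ${(-2,2)^3}$. Here is a small numerical check of that claim, under the illustrative assumption that each ${a_i}$ is obtained by shrinking the corresponding corner by a factor ${1-\varepsilon}$ with ${\varepsilon = 0.01}$. These particular points are hypothetical and not generic, so the sketch only illustrates the transitivity condition, not the non-globalisability argument.

```python
from itertools import product

EPS = 0.01  # hypothetical closeness of the a_i to the corners
CORNERS = [(1, 1, 1), (1, 1, -1), (1, -1, 1), (1, -1, -1)]
# Hypothetical choice of a_1, ..., a_4: corners scaled slightly into the open cube.
A = [tuple((1 - EPS) * c for c in corner) for corner in CORNERS]


def sup_norm(v):
    """Largest absolute coordinate; v lies in (-2,2)^3 iff sup_norm(v) < 2."""
    return max(abs(x) for x in v)


def combination(i, sign_i, j, sign_j):
    """The vector sign_i * 2 a_i + sign_j * 2 a_j."""
    return tuple(sign_i * 2 * x + sign_j * 2 * y for x, y in zip(A[i], A[j]))


def nontrivial_combinations():
    for i, j in product(range(4), repeat=2):
        for sign_i, sign_j in product((1, -1), repeat=2):
            if i == j and sign_i == -sign_j:
                continue  # 2 a_i - 2 a_i = 0 is the trivial combination
            yield combination(i, sign_i, j, sign_j)


all_outside = all(sup_norm(v) >= 2 for v in nontrivial_combinations())
print(all_outside)  # True
```

Since ${x - z \in (-2,2)^3}$ for any ${x, z}$ in the cube, this rules out ${x \sim y \sim z}$ with ${x - z}$ a non-trivial combination, which is what transitivity requires.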

Nevertheless, the space ${(-1,1)^3/\sim}$ is still locally isomorphic to a global Lie group, namely ${{\bf R}^3}$; for instance, the open neighbourhood ${(-0.5,0.5)^3/\sim}$ is isomorphic to ${(-0.5,0.5)^3}$, which is an open neighbourhood of ${{\bf R}^3}$. More generally, Lie’s third theorem tells us that any local Lie group is locally isomorphic to a global Lie group.

Let us call a local group globalisable if it is locally isomorphic to a global group; thus Lie’s third theorem tells us that every local Lie group is globalisable. Thanks to Goldbring’s solution to the local version of Hilbert’s fifth problem, we also know that locally Euclidean local groups are globalisable. A modification of this argument by van den Dries and Goldbring shows in fact that every locally compact local group is globalisable.

In view of these results, it is tempting to conjecture that all local groups are globalisable; among other things, this would simplify the proof of Lie’s third theorem (and of the local version of Hilbert’s fifth problem). Unfortunately, this claim as stated is false:

Theorem 3 There exist local groups ${G}$ which are not globalisable.

The counterexamples used to establish Theorem 3 are remarkably delicate; the first example I know of is due to van Est and Korthagen. One reason for this, of course, is that the previous results prevent one from using any local Lie group, or even a locally compact group, as a counterexample. We will present a (somewhat complicated) example below, based on the unit ball in the infinite-dimensional Banach space ${\ell^\infty({\bf N}^2)}$.

However, there are certainly many situations in which we can globalise a local group. For instance, this is the case if one has a locally faithful representation of that local group inside a global group:

Lemma 4 (Faithful representation implies globalisability) Let ${G}$ be a local group, and suppose there exists an injective local homomorphism ${\phi: U \rightarrow H}$ from ${G}$ into a global topological group ${H}$ with ${U}$ symmetric. Then ${U}$ is isomorphic to the restriction of a global topological group to an open neighbourhood of the identity; in particular, ${G}$ is globalisable.

The material here is based in part on this paper of Olver and this paper of Goldbring.

Over the past few months or so, I have been brushing up on my Lie group theory, as part of my project to fully understand the theory surrounding Hilbert’s fifth problem. Every so often, I encounter a basic fact in Lie theory which requires a slightly non-trivial “trick” to prove; I am recording two of them here, so that I can find these tricks again when I need to.

The first fact concerns the exponential map ${\exp: {\mathfrak g} \rightarrow G}$ from a Lie algebra ${{\mathfrak g}}$ of a Lie group ${G}$ to that group. (For this discussion we will only consider finite-dimensional Lie groups and Lie algebras over the reals ${{\bf R}}$.) A basic fact in the subject is that the exponential map is locally a homeomorphism: there is a neighbourhood of the origin in ${{\mathfrak g}}$ that is mapped homeomorphically by the exponential map to a neighbourhood of the identity in ${G}$. This local homeomorphism property is the foundation of an important dictionary between Lie groups and Lie algebras.

It is natural to ask whether the exponential map is globally a homeomorphism, and not just locally: in particular, whether the exponential map remains both injective and surjective. For instance, this is the case for connected, simply connected, nilpotent Lie groups (as can be seen from the Baker-Campbell-Hausdorff formula).

The circle group ${S^1}$, which has ${{\bf R}}$ as its Lie algebra, already shows that global injectivity fails for any group that contains a circle subgroup, which is a huge class of examples (including, for instance, the positive dimensional compact Lie groups, or non-simply-connected Lie groups). Surjectivity also obviously fails for disconnected groups, since the Lie algebra is necessarily connected, and so the image under the exponential map must be connected also. However, even for connected Lie groups, surjectivity can fail. To see this, first observe that if the exponential map were surjective, then every group element ${g \in G}$ would have a square root (i.e. an element ${h \in G}$ with ${h^2 = g}$), since ${\exp(x)}$ has ${\exp(x/2)}$ as a square root for any ${x \in {\mathfrak g}}$. However, there exist elements in connected Lie groups without square roots. A simple example is provided by the matrix

$\displaystyle g = \begin{pmatrix} -4 & 0 \\ 0 & -1/4 \end{pmatrix}$

in the connected Lie group ${SL_2({\bf R})}$. This matrix has eigenvalues ${-4}$, ${-1/4}$. Thus, if ${h \in SL_2({\bf R})}$ is a square root of ${g}$, we see (from the Jordan normal form) that it must have at least one eigenvalue in ${\{-2i,+2i\}}$, and at least one eigenvalue in ${\{-i/2,i/2\}}$. On the other hand, as ${h}$ has real coefficients, the complex eigenvalues must come in conjugate pairs ${\{ a+bi, a-bi\}}$. Since ${h}$ can only have at most ${2}$ eigenvalues, we obtain a contradiction.
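The eigenvalue bookkeeping in this argument is small enough to enumerate directly. The sketch below lists the four possible spectra of a hypothetical square root ${h}$ and checks that none of them is closed under complex conjugation, as the spectrum of a real matrix would have to be; the helper name `closed_under_conjugation` is an illustrative choice.

```python
from itertools import product

# Complex square roots of the eigenvalues -4 and -1/4 of g.
ROOTS_OF_MINUS_4 = [2j, -2j]
ROOTS_OF_MINUS_QUARTER = [0.5j, -0.5j]


def sort_key(z):
    return (z.real, z.imag)


def closed_under_conjugation(eigenvalues):
    """True if the multiset of eigenvalues equals its own complex conjugate."""
    conjugates = [z.conjugate() for z in eigenvalues]
    return sorted(eigenvalues, key=sort_key) == sorted(conjugates, key=sort_key)


candidate_spectra = list(product(ROOTS_OF_MINUS_4, ROOTS_OF_MINUS_QUARTER))
admissible = [s for s in candidate_spectra if closed_under_conjugation(s)]

print(len(candidate_spectra))  # 4
print(admissible)              # []: no real 2x2 matrix can have such a spectrum
```

An empty list of admissible spectra is exactly the contradiction reached above.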

However, there is an important case where surjectivity is recovered:

Proposition 1 If ${G}$ is a compact connected Lie group, then the exponential map is surjective.

Proof: The idea here is to relate the exponential map in Lie theory to the exponential map in Riemannian geometry. We first observe that every compact Lie group ${G}$ can be given the structure of a Riemannian manifold with a bi-invariant metric. This can be seen in one of two ways. Firstly, one can put an arbitrary positive definite inner product on ${{\mathfrak g}}$ and average it against the adjoint action of ${G}$ using Haar probability measure (which is available since ${G}$ is compact); this gives an ad-invariant positive-definite inner product on ${{\mathfrak g}}$ that one can then translate by either left or right translation to give a bi-invariant Riemannian structure on ${G}$. Alternatively, one can use the Peter-Weyl theorem to embed ${G}$ in a unitary group ${U(n)}$, at which point one can induce a bi-invariant metric on ${G}$ from the one on the space ${M_n({\bf C}) \equiv {\bf C}^{n^2}}$ of ${n \times n}$ complex matrices.

As ${G}$ is connected and compact and thus complete, we can apply the Hopf-Rinow theorem and conclude that any two points are connected by at least one geodesic, so that the Riemannian exponential map from ${{\mathfrak g}}$ to ${G}$ formed by following geodesics from the origin is surjective. But one can check that the Lie exponential map and Riemannian exponential map agree; for instance, this can be seen by noting that the group structure naturally defines a connection on the tangent bundle which is both torsion-free and preserves the bi-invariant metric, and must therefore agree with the Levi-Civita connection. (Alternatively, one can embed into a unitary group ${U(n)}$ and observe that ${G}$ is totally geodesic inside ${U(n)}$, because the geodesics in ${U(n)}$ can be described explicitly in terms of one-parameter subgroups.) The claim follows. $\Box$

Remark 1 While it is quite nice to see Riemannian geometry come in to prove this proposition, I am curious to know if there is any other proof of surjectivity for compact connected Lie groups that does not require explicit introduction of Riemannian geometry concepts.

The other basic fact I learned recently concerns the algebraic nature of Lie groups and Lie algebras. An important family of examples of Lie groups are the algebraic groups – algebraic varieties with a group law given by algebraic maps. Given that one can always automatically upgrade the smooth structure on a Lie group to analytic structure (by using the Baker-Campbell-Hausdorff formula), it is natural to ask whether one can upgrade the structure further to an algebraic structure. Unfortunately, this is not always the case. A prototypical example of this is given by the one-parameter subgroup

$\displaystyle G := \{ \begin{pmatrix} t & 0 \\ 0 & t^\alpha \end{pmatrix}: t \in {\bf R}^+ \} \ \ \ \ \ (1)$

of ${GL_2({\bf R})}$. This is a Lie group for any exponent ${\alpha \in {\bf R}}$, but if ${\alpha}$ is irrational, then the curve that ${G}$ traces out is not an algebraic subset of ${GL_2({\bf R})}$ (as one can see by playing around with Puiseux series).

This is not a true counterexample to the claim that every Lie group can be given the structure of an algebraic group, because one can give ${G}$ a different algebraic structure than the one inherited from the ambient group ${GL_2({\bf R})}$. Indeed, ${G}$ is clearly isomorphic to the additive group ${{\bf R}}$, which is of course an algebraic group. However, a modification of the above construction works:

Proposition 2 There exists a Lie group ${G}$ that cannot be given the structure of an algebraic group.

Proof: We use an example from the text of Tauvel and Yu (that I found via this MathOverflow posting). We consider the subgroup

$\displaystyle G := \{ \begin{pmatrix} 1 & 0 & 0 \\ x & t & 0 \\ y & 0 & t^\alpha \end{pmatrix}: x, y \in {\bf R}; t \in {\bf R}^+ \}$

of ${GL_3({\bf R})}$, with ${\alpha}$ an irrational number. This is a three-dimensional (metabelian) Lie group, whose Lie algebra ${{\mathfrak g} \subset {\mathfrak gl}_3({\bf R})}$ is spanned by the elements

$\displaystyle X := \begin{pmatrix} 0 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & \alpha \end{pmatrix}$

$\displaystyle Y := \begin{pmatrix} 0 & 0 & 0 \\ -1 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}$

$\displaystyle Z := \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ -\alpha & 0 & 0 \end{pmatrix}$

with the Lie bracket given by

$\displaystyle [Y,X] = -Y; [Z,X] = -\alpha Z; [Y,Z] = 0.$

As such, we see that if we use the basis ${X, Y, Z}$ to identify ${{\mathfrak g}}$ with ${{\bf R}^3}$, then the adjoint representation of ${G}$ is the identity map.
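Both the bracket relations and the claim about the adjoint representation can be checked numerically. The sketch below uses plain nested lists for ${3 \times 3}$ matrices, takes ${\alpha = \sqrt{2}}$ as an arbitrary irrational exponent, and picks one sample group element; all helper names and the sample values of ${x, y, t}$ are illustrative assumptions. It verifies that the coordinates of ${g B g^{-1}}$ in the basis ${X, Y, Z}$ are the columns of ${g}$ itself.

```python
import math

ALPHA = math.sqrt(2)  # arbitrary irrational exponent, chosen for illustration
R3 = range(3)


def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in R3) for j in R3] for i in R3]


def bracket(A, B):
    AB, BA = matmul(A, B), matmul(B, A)
    return [[AB[i][j] - BA[i][j] for j in R3] for i in R3]


def scale(c, A):
    return [[c * A[i][j] for j in R3] for i in R3]


def close(A, B, tol=1e-9):
    return all(abs(A[i][j] - B[i][j]) <= tol for i in R3 for j in R3)


X = [[0, 0, 0], [0, 1, 0], [0, 0, ALPHA]]
Y = [[0, 0, 0], [-1, 0, 0], [0, 0, 0]]
Z = [[0, 0, 0], [0, 0, 0], [-ALPHA, 0, 0]]

# Bracket relations: [Y,X] = -Y, [Z,X] = -alpha Z, [Y,Z] = 0.
brackets_ok = (close(bracket(Y, X), scale(-1, Y))
               and close(bracket(Z, X), scale(-ALPHA, Z))
               and close(bracket(Y, Z), scale(0, X)))


def group_element(x, y, t):
    return [[1, 0, 0], [x, t, 0], [y, 0, t ** ALPHA]]


def group_inverse(x, y, t):
    ta = t ** ALPHA
    return [[1, 0, 0], [-x / t, 1 / t, 0], [-y / ta, 0, 1 / ta]]


def combination(cx, cy, cz):
    """The Lie algebra element cx*X + cy*Y + cz*Z."""
    return [[cx * X[i][j] + cy * Y[i][j] + cz * Z[i][j] for j in R3] for i in R3]


x, y, t = 0.7, -1.3, 2.5  # sample group element (illustrative values)
g, g_inv = group_element(x, y, t), group_inverse(x, y, t)

# Ad(g)B = g B g^{-1}; its coordinates in the basis (X, Y, Z) should be column k of g.
adjoint_is_identity = all(
    close(matmul(matmul(g, B), g_inv), combination(g[0][k], g[1][k], g[2][k]))
    for k, B in enumerate((X, Y, Z))
)

print(brackets_ok, adjoint_is_identity)  # True True
```

Concretely, the check confirms ${\hbox{Ad}(g) X = X + xY + yZ}$, ${\hbox{Ad}(g) Y = tY}$ and ${\hbox{Ad}(g) Z = t^\alpha Z}$ for the sample element.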

If ${G}$ is an algebraic group, it is easy to see that the adjoint representation ${\hbox{Ad}: G \rightarrow GL({\mathfrak g})}$ is also algebraic, and so ${\hbox{Ad}(G) = G}$ is algebraic in ${GL({\mathfrak g})}$. Specialising to our specific example, in which the adjoint representation is the identity, we conclude that if ${G}$ has any algebraic structure, then it must also be an algebraic subgroup of ${GL_3({\bf R})}$; but ${G}$ projects to the group (1) which is not algebraic, a contradiction. $\Box$

A slight modification of the same argument also shows that not every Lie algebra is algebraic, in the sense that it is isomorphic to a Lie algebra of an algebraic group. (However, there are important classes of Lie algebras that are automatically algebraic, such as nilpotent or semisimple Lie algebras.)

Hilbert’s fifth problem asks to clarify the extent that the assumption on a differentiable or smooth structure is actually needed in the theory of Lie groups and their actions. While this question is not precisely formulated and is thus open to some interpretation, the following result of Gleason and Montgomery-Zippin answers at least one aspect of this question:

Theorem 1 (Hilbert’s fifth problem) Let ${G}$ be a topological group which is locally Euclidean (i.e. it is a topological manifold). Then ${G}$ is isomorphic to a Lie group.

Theorem 1 can be viewed as an application of the more general structural theory of locally compact groups. In particular, Theorem 1 can be deduced from the following structural theorem of Gleason and Yamabe:

Theorem 2 (Gleason-Yamabe theorem) Let ${G}$ be a locally compact group, and let ${U}$ be an open neighbourhood of the identity in ${G}$. Then there exists an open subgroup ${G'}$ of ${G}$, and a compact subgroup ${N}$ of ${G'}$ contained in ${U}$, such that ${G'/N}$ is isomorphic to a Lie group.

The deduction of Theorem 1 from Theorem 2 proceeds using the Brouwer invariance of domain theorem and is discussed in this previous post. In this post, I would like to discuss the proof of Theorem 2. We can split this proof into three parts, by introducing two additional concepts. The first is the property of having no small subgroups:

Definition 3 (NSS) A topological group ${G}$ is said to have no small subgroups, or is NSS for short, if there is an open neighbourhood ${U}$ of the identity in ${G}$ that contains no subgroups of ${G}$ other than the trivial subgroup ${\{ \hbox{id}\}}$.

An equivalent definition of an NSS group is one which has an open neighbourhood ${U}$ of the identity that every non-identity element ${g \in G \backslash \{\hbox{id}\}}$ escapes in finite time, in the sense that ${g^n \not \in U}$ for some positive integer ${n}$. It is easy to see that all Lie groups are NSS; we shall shortly see that the converse statement (in the locally compact case) is also true, though significantly harder to prove.
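As a toy illustration of "escaping in finite time", consider the circle group ${U(1)}$ with the neighbourhood ${U = \{ z : |z - 1| < 1 \}}$ of the identity. The sketch below (with the hypothetical helper `escape_time`, and a search cutoff `max_n` that is an assumption of the sketch rather than part of the definition) finds the first power of ${g = e^{i\theta}}$ that leaves ${U}$.

```python
import cmath


def escape_time(g, radius=1.0, max_n=10**6):
    """Smallest n >= 1 with |g^n - 1| >= radius, or None if none is found up to max_n."""
    power = 1 + 0j
    for n in range(1, max_n + 1):
        power *= g
        if abs(power - 1) >= radius:
            return n
    return None


for theta in (1.0, 0.1, 0.01):
    g = cmath.exp(1j * theta)
    print(theta, escape_time(g))

# The identity itself never escapes, so the search comes back empty.
print(escape_time(1 + 0j, max_n=1000))  # None
```

Elements closer to the identity take longer to escape, roughly in inverse proportion to their distance from the identity.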

Another useful property is that of having what I will call a Gleason metric:

Definition 4 Let ${G}$ be a topological group. A Gleason metric on ${G}$ is a left-invariant metric ${d: G \times G \rightarrow {\bf R}^+}$ which generates the topology on ${G}$ and obeys the following properties for some constant ${C>0}$, writing ${\|g\|}$ for ${d(g,\hbox{id})}$:

• (Escape property) If ${g \in G}$ and ${n \geq 1}$ is such that ${n \|g\| \leq \frac{1}{C}}$, then ${\|g^n\| \geq \frac{1}{C} n \|g\|}$.
• (Commutator estimate) If ${g, h \in G}$ are such that ${\|g\|, \|h\| \leq \frac{1}{C}}$, then

$\displaystyle \|[g,h]\| \leq C \|g\| \|h\|, \ \ \ \ \ (1)$

where ${[g,h] := g^{-1}h^{-1}gh}$ is the commutator of ${g}$ and ${h}$.

For instance, the operator norm metric ${d(g,h) := \|g-h\|_{op}}$ on the unitary group ${U(n)}$ can easily be verified to be a Gleason metric, with the commutator estimate (1) coming from the inequality

$\displaystyle \| [g,h] - 1 \|_{op} = \| gh - hg \|_{op}$

$\displaystyle = \| (g-1) (h-1) - (h-1) (g-1) \|_{op}$

$\displaystyle \leq 2 \|g-1\|_{op} \|h-1\|_{op}.$

Similarly, any left-invariant Riemannian metric on a (connected) Lie group can be verified to be a Gleason metric. From the escape property one easily sees that all groups with Gleason metrics are NSS; again, we shall see that there is a partial converse.

Remark 1 The escape and commutator properties are meant to capture “Euclidean-like” structure of the group. Other metrics, such as Carnot-Carathéodory metrics on Carnot Lie groups such as the Heisenberg group, usually fail one or both of these properties.

The proof of Theorem 2 can then be split into three subtheorems:

Theorem 5 (Reduction to the NSS case) Let ${G}$ be a locally compact group, and let ${U}$ be an open neighbourhood of the identity in ${G}$. Then there exists an open subgroup ${G'}$ of ${G}$, and a compact subgroup ${N}$ of ${G'}$ contained in ${U}$, such that ${G'/N}$ is NSS, locally compact, and metrisable.

Theorem 6 (Gleason’s lemma) Let ${G}$ be a locally compact metrisable NSS group. Then ${G}$ has a Gleason metric.

Theorem 7 (Building a Lie structure) Let ${G}$ be a locally compact group with a Gleason metric. Then ${G}$ is isomorphic to a Lie group.

Clearly, by combining Theorem 5, Theorem 6, and Theorem 7 one obtains Theorem 2 (and hence Theorem 1).

Theorem 5 and Theorem 6 proceed by some elementary combinatorial analysis, together with the use of Haar measure (to build convolutions, and thence to build “smooth” bump functions with which to create a metric, in a variant of the analysis used to prove the Birkhoff-Kakutani theorem); Theorem 5 also requires the Peter-Weyl theorem (to dispose of certain compact subgroups that arise en route to the reduction to the NSS case), which was discussed previously on this blog.

In this post I would like to detail the final component to the proof of Theorem 2, namely Theorem 7. (I plan to discuss the other two steps, Theorem 5 and Theorem 6, in a separate post.) The strategy is similar to that used to prove von Neumann’s theorem, as discussed in this previous post (and von Neumann’s theorem is also used in the proof), but with the Gleason metric serving as a substitute for the faithful linear representation. Namely, one first gives the space ${L(G)}$ of one-parameter subgroups of ${G}$ enough of a structure that it can serve as a proxy for the “Lie algebra” of ${G}$; specifically, it needs to be a vector space, and the “exponential map” needs to cover an open neighbourhood of the identity. This is enough to set up an “adjoint” representation of ${G}$, whose image is a Lie group by von Neumann’s theorem; the kernel is essentially the centre of ${G}$, which is abelian and can also be shown to be a Lie group by a similar analysis. To finish the job one needs to use arguments of Kuranishi and of Gleason, as discussed in this previous post.

The arguments here can be phrased either in the standard analysis setting (using sequences, and passing to subsequences often) or in the nonstandard analysis setting (selecting an ultrafilter, and then working with infinitesimals). In my view, the two approaches have roughly the same level of complexity in this case, and I have elected for the standard analysis approach.

Remark 2 From Theorem 7 we see that a Gleason metric structure is a good enough substitute for smooth structure that it can actually be used to reconstruct the entire smooth structure; roughly speaking, the commutator estimate (1) allows for enough “Taylor expansion” of expressions such as ${g^n h^n}$ that one can simulate the fundamentals of Lie theory (in particular, construction of the Lie algebra and the exponential map, and its basic properties). The advantage of working with a Gleason metric rather than a smoother structure, though, is that it is relatively undemanding with regards to regularity; in particular, the commutator estimate (1) is roughly comparable to the imposition of a ${C^{1,1}}$ structure on the group ${G}$, as this is the minimal regularity to get the type of Taylor approximation (with quadratic errors) that would be needed to obtain a bound of the form (1). We will return to this point in a later post.