You are currently browsing the monthly archive for September 2011.

In the last few notes, we have been steadily reducing the amount of regularity needed on a topological group in order to be able to show that it is in fact a Lie group, in the spirit of Hilbert’s fifth problem. Now, we will work on Hilbert’s fifth problem from the other end, starting with the minimal assumption of local compactness on a topological group ${G}$, and seeing what kind of structures one can build using this assumption. (For simplicity we shall mostly confine our discussion to global groups rather than local groups for now.) In view of the preceding notes, we would like to see two types of structures emerge in particular:

• representations of ${G}$ into some more structured group, such as a matrix group ${GL_n({\bf C})}$; and
• metrics on ${G}$ that capture the escape and commutator structure of ${G}$ (i.e. Gleason metrics).

To build either of these structures, a fundamentally useful tool is that of (left-) Haar measure – a left-invariant Radon measure ${\mu}$ on ${G}$. (One can of course also consider right-Haar measures; in many cases (such as for compact or abelian groups), the two concepts are the same, but this is not always the case.) This concept generalises the concept of Lebesgue measure on Euclidean spaces ${{\bf R}^d}$, which is of course fundamental in analysis on those spaces.

Haar measures will help us build useful representations and useful metrics on locally compact groups ${G}$. For instance, a Haar measure ${\mu}$ gives rise to the regular representation ${\tau: G \rightarrow U(L^2(G,d\mu))}$ that maps each element ${g}$ of ${G}$ to the unitary translation operator ${\tau(g): L^2(G,d\mu) \rightarrow L^2(G,d\mu)}$ on the Hilbert space ${L^2(G,d\mu)}$ of square-integrable measurable functions on ${G}$ with respect to this Haar measure by the formula

$\displaystyle \tau(g) f(x) := f(g^{-1} x).$

(The presence of the inverse ${g^{-1}}$ is convenient in order to obtain the homomorphism property ${\tau(gh) = \tau(g)\tau(h)}$ without a reversal in the group multiplication.) In general, this is an infinite-dimensional representation; but in many cases (and in particular, in the case when ${G}$ is compact) we can decompose this representation into a useful collection of finite-dimensional representations, leading to the Peter-Weyl theorem, which is a fundamental tool for understanding the structure of compact groups. This theorem is particularly simple in the compact abelian case, where it turns out that the representations can be decomposed into one-dimensional representations ${\chi: G \rightarrow U({\bf C}) \equiv S^1}$, better known as characters, leading to the theory of Fourier analysis on general compact abelian groups. With this and some additional (largely combinatorial) arguments, we will also be able to obtain satisfactory structural control on locally compact abelian groups.

The link between Haar measure and useful metrics on ${G}$ is a little more complicated. Firstly, once one has the regular representation ${\tau: G\rightarrow U(L^2(G,d\mu))}$, and given a suitable “test” function ${\psi: G \rightarrow {\bf C}}$, one can then embed ${G}$ into ${L^2(G,d\mu)}$ (or into other function spaces on ${G}$, such as ${C_c(G)}$ or ${L^\infty(G)}$) by mapping a group element ${g \in G}$ to the translate ${\tau(g) \psi}$ of ${\psi}$ in that function space. (This map might not actually be an embedding if ${\psi}$ enjoys a non-trivial translation symmetry ${\tau(g)\psi=\psi}$, but let us ignore this possibility for now.) One can then pull the metric structure on the function space back to a metric on ${G}$, for instance defining an ${L^2(G,d\mu)}$-based metric

$\displaystyle d(g,h) := \| \tau(g) \psi - \tau(h) \psi \|_{L^2(G,d\mu)}$

if ${\psi}$ is square-integrable, or perhaps a ${C_c(G)}$-based metric

$\displaystyle d(g,h) := \| \tau(g) \psi - \tau(h) \psi \|_{C_c(G)} \ \ \ \ \ (1)$

if ${\psi}$ is continuous and compactly supported (with ${\|f \|_{C_c(G)} := \sup_{x \in G} |f(x)|}$ denoting the supremum norm). These metrics tend to have several nice properties (for instance, they are automatically left-invariant), particularly if the test function is chosen to be sufficiently “smooth”. For instance, if we introduce the differentiation (or more precisely, finite difference) operators

$\displaystyle \partial_g := 1-\tau(g)$

(so that ${\partial_g f(x) = f(x) - f(g^{-1} x)}$) and use the metric (1), then a short computation (relying on the translation-invariance of the ${C_c(G)}$ norm) shows that

$\displaystyle d([g,h], \hbox{id}) = \| \partial_g \partial_h \psi - \partial_h \partial_g \psi \|_{C_c(G)}$

for all ${g,h \in G}$. This suggests that commutator estimates, such as those appearing in the definition of a Gleason metric in Notes 2, might be available if one can control “second derivatives” of ${\psi}$; informally, we would like our test functions ${\psi}$ to have a “${C^{1,1}}$” type regularity.
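
For the reader's convenience, here is one way the short computation can be carried out. It is only a sketch, and it assumes the commutator is normalised as ${[g,h] := g^{-1} h^{-1} g h}$ (a convention not fixed in the text above). Expanding the finite difference operators gives ${\partial_g \partial_h - \partial_h \partial_g = (1-\tau(g))(1-\tau(h)) - (1-\tau(h))(1-\tau(g)) = \tau(gh) - \tau(hg)}$. Since the ${C_c(G)}$ norm is unchanged by applying the translation ${\tau((hg)^{-1})}$, one obtains

$\displaystyle \| \partial_g \partial_h \psi - \partial_h \partial_g \psi \|_{C_c(G)} = \| \tau(gh) \psi - \tau(hg) \psi \|_{C_c(G)} = \| \tau(g^{-1} h^{-1} g h) \psi - \psi \|_{C_c(G)} = d([g,h], \hbox{id}).$

With the other common convention ${[g,h] = g h g^{-1} h^{-1}}$, the same argument applies after replacing ${g, h}$ by ${g^{-1}, h^{-1}}$ on the right-hand side.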

If ${G}$ was already a Lie group (or something similar, such as a ${C^{1,1}}$ local group) then it would not be too difficult to concoct such a function ${\psi}$ by using local coordinates. But of course the whole point of Hilbert’s fifth problem is to do without such regularity hypotheses, and so we need to build ${C^{1,1}}$ test functions ${\psi}$ by other means. And here is where the Haar measure comes in: it provides the fundamental tool of convolution

$\displaystyle \phi * \psi(x) := \int_G \phi(y) \psi(y^{-1}x) d\mu(y)$

between two suitable functions ${\phi, \psi: G \rightarrow {\bf C}}$, which can be used to build smoother functions out of rougher ones. For instance:

Exercise 1 Let ${\phi, \psi: {\bf R}^d \rightarrow {\bf C}}$ be continuous, compactly supported functions which are Lipschitz continuous. Show that the convolution ${\phi * \psi}$ using Lebesgue measure on ${{\bf R}^d}$ obeys the ${C^{1,1}}$-type commutator estimate

$\displaystyle \| \partial_g \partial_h (\phi * \psi) \|_{C_c({\bf R}^d)} \leq C \|g\| \|h\|$

for all ${g,h \in {\bf R}^d}$ and some finite quantity ${C}$ depending only on ${\phi, \psi}$.

This exercise suggests a strategy to build Gleason metrics by convolving together some “Lipschitz” test functions and then using the resulting convolution as a test function to define a metric. This strategy may seem somewhat circular because one needs a notion of metric in order to define Lipschitz continuity in the first place, but it turns out that the properties required on that metric are weaker than those that the Gleason metric will satisfy, and so one will be able to break the circularity by using a “bootstrap” or “induction” argument.

We will discuss this strategy – which is due to Gleason, and is fundamental to all currently known solutions to Hilbert’s fifth problem – in later posts. In this post, we will construct Haar measure on general locally compact groups, and then establish the Peter-Weyl theorem, which in turn can be used to obtain a reasonably satisfactory structural classification of both compact groups and locally compact abelian groups.

One of the fundamental inequalities in convex geometry is the Brunn-Minkowski inequality, which asserts that if ${A, B}$ are two non-empty bounded open subsets of ${{\bf R}^d}$, then

$\displaystyle \mu(A+B)^{1/d} \geq \mu(A)^{1/d} + \mu(B)^{1/d}, \ \ \ \ \ (1)$

where

$\displaystyle A+B := \{a+b: a \in A, b \in B \}$

is the sumset of ${A}$ and ${B}$, and ${\mu}$ denotes Lebesgue measure. The estimate is sharp, as can be seen by considering the case when ${A, B}$ are convex bodies that are dilates of each other, thus ${A = \lambda B := \{ \lambda b: b \in B \}}$ for some ${\lambda>0}$, since in this case one has ${\mu(A) = \lambda^d \mu(B)}$, ${A+B = (\lambda+1)B}$, and ${\mu(A+B) = (\lambda+1)^d \mu(B)}$.
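
As a quick numerical sanity check of the inequality and of the sharpness claim, one can test it on axis-parallel boxes, for which the sumset is again a box whose side lengths add coordinate-wise. The following is a minimal sketch; the helper names `box_volume` and `brunn_minkowski_gap` and the particular side lengths are illustrative choices, not part of the original discussion.

```python
import math

def box_volume(sides):
    """Lebesgue measure of an axis-parallel box with the given side lengths."""
    return math.prod(sides)

def brunn_minkowski_gap(a_sides, b_sides):
    """Return mu(A+B)^(1/d) - (mu(A)^(1/d) + mu(B)^(1/d)) for boxes A, B.

    The sumset of two axis-parallel boxes is the box whose side lengths
    are the coordinate-wise sums of the side lengths of A and B.
    """
    d = len(a_sides)
    sum_sides = [a + b for a, b in zip(a_sides, b_sides)]
    lhs = box_volume(sum_sides) ** (1.0 / d)
    rhs = box_volume(a_sides) ** (1.0 / d) + box_volume(b_sides) ** (1.0 / d)
    return lhs - rhs

# A = 3B (dilates of each other): the gap should vanish up to rounding.
gap_dilate = brunn_minkowski_gap([3.0, 6.0, 9.0], [1.0, 2.0, 3.0])
# Boxes of genuinely different shapes: the gap should be strictly positive.
gap_generic = brunn_minkowski_gap([1.0, 4.0, 2.0], [5.0, 1.0, 1.0])
print(gap_dilate, gap_generic)
```

Boxes are of course a very special case; the check only illustrates the equality case for dilates and the strict inequality otherwise.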

The Brunn-Minkowski inequality has many applications in convex geometry. To give just one example, if we assume that ${A}$ has a smooth boundary ${\partial A}$, and set ${B}$ equal to a small ball ${B = B(0,\epsilon)}$, then ${\mu(B)^{1/d} = \epsilon \mu(B(0,1))^{1/d}}$, and in the limit ${\epsilon \rightarrow 0}$ one has

$\displaystyle \mu(A+B) = \mu(A) + \epsilon |\partial A| + o(\epsilon)$

where ${|\partial A|}$ is the surface measure of ${\partial A}$; applying the Brunn-Minkowski inequality and performing a Taylor expansion, one soon arrives at the isoperimetric inequality

$\displaystyle |\partial A| \geq d \mu(A)^{1-1/d} \mu(B(0,1))^{1/d}.$

Thus one can view the isoperimetric inequality as an infinitesimal limit of the Brunn-Minkowski inequality.
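
The Taylor expansion step can be spelled out as follows; this is only a sketch, writing ${\omega_d := \mu(B(0,1))}$ (a symbol introduced here for brevity). The Brunn-Minkowski inequality with ${B = B(0,\epsilon)}$ gives

$\displaystyle \mu(A+B) \geq (\mu(A)^{1/d} + \epsilon \omega_d^{1/d})^d = \mu(A) + d \epsilon \mu(A)^{1-1/d} \omega_d^{1/d} + O(\epsilon^2).$

Comparing this with the expansion ${\mu(A+B) = \mu(A) + \epsilon |\partial A| + o(\epsilon)}$, subtracting ${\mu(A)}$ from both sides, dividing by ${\epsilon}$, and sending ${\epsilon \rightarrow 0}$ yields ${|\partial A| \geq d \mu(A)^{1-1/d} \omega_d^{1/d}}$.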

There are many proofs known of the Brunn-Minkowski inequality. Firstly, the inequality is trivial in one dimension:

Lemma 1 (One-dimensional Brunn-Minkowski) If ${A, B,C \subset {\bf R}}$ are non-empty measurable sets with ${A+B \subset C \subset {\bf R}}$, then

$\displaystyle \mu(C) \geq \mu(A)+\mu(B).$

Proof: By inner regularity we may assume that ${A,B}$ are compact. The claim then follows since ${C}$ contains the sets ${\sup(A)+B}$ and ${A+\inf(B)}$, which meet only at a single point ${\sup(A)+\inf(B)}$. $\Box$

For the higher dimensional case, the inequality can be established from the Prékopa-Leindler inequality:

Theorem 2 (Prékopa-Leindler inequality in ${{\bf R}^d}$) Let ${0 < \theta < 1}$, and let ${f, g, h: {\bf R}^d \rightarrow {\bf R}}$ be non-negative measurable functions obeying the inequality

$\displaystyle h(x+y) \geq f(x)^{1-\theta} g(y)^\theta \ \ \ \ \ (2)$

for all ${x,y \in {\bf R}^d}$. Then we have

$\displaystyle \int_{{\bf R}^d} h \geq \frac{1}{(1-\theta)^{d(1-\theta)} \theta^{d\theta}} (\int_{{\bf R}^d} f)^{1-\theta} (\int_{{\bf R}^d} g)^\theta. \ \ \ \ \ (3)$

This inequality is usually stated using ${h((1-\theta)x + \theta y)}$ instead of ${h(x+y)}$ in order to eliminate the ungainly factor ${\frac{1}{(1-\theta)^{d(1-\theta)} \theta^{d\theta}}}$. However, we formulate the inequality in this fashion in order to avoid any reference to the dilation maps ${x \mapsto \lambda x}$; the reason for this will become clearer later.

The Prékopa-Leindler inequality quickly implies the Brunn-Minkowski inequality. Indeed, if we apply it to the indicator functions ${f := 1_A, g := 1_B, h := 1_{A+B}}$ (which certainly obey (2)), then (3) gives

$\displaystyle \mu(A+B)^{1/d} \geq \frac{1}{(1-\theta)^{1-\theta} \theta^{\theta}} \mu(A)^{\frac{1-\theta}{d}} \mu(B)^{\frac{\theta}{d}}$

for any ${0 < \theta < 1}$. We can now optimise in ${\theta}$; the optimal value turns out to be

$\displaystyle \theta := \frac{\mu(B)^{1/d}}{\mu(A)^{1/d}+\mu(B)^{1/d}}$

which yields (1).
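
To see why this choice of ${\theta}$ works, here is a short verification, writing ${a := \mu(A)^{1/d}}$ and ${b := \mu(B)^{1/d}}$ (abbreviations introduced here) and assuming ${a, b > 0}$, which holds for non-empty open sets. The right-hand side of the preceding bound is ${(\frac{a}{1-\theta})^{1-\theta} (\frac{b}{\theta})^\theta}$. With ${\theta = \frac{b}{a+b}}$ one has ${1-\theta = \frac{a}{a+b}}$, so that ${\frac{a}{1-\theta} = \frac{b}{\theta} = a+b}$, and hence

$\displaystyle \frac{1}{(1-\theta)^{1-\theta} \theta^{\theta}} a^{1-\theta} b^{\theta} = (a+b)^{1-\theta} (a+b)^{\theta} = a+b.$

No other choice of ${\theta}$ does better, since the weighted arithmetic mean-geometric mean inequality gives ${(\frac{a}{1-\theta})^{1-\theta} (\frac{b}{\theta})^\theta \leq (1-\theta) \frac{a}{1-\theta} + \theta \frac{b}{\theta} = a+b}$ for every ${0 < \theta < 1}$.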

To prove the Prékopa-Leindler inequality, we first observe that the inequality tensorises in the sense that if it is true in dimensions ${d_1}$ and ${d_2}$, then it is automatically true in dimension ${d_1+d_2}$. Indeed, if ${f, g, h: {\bf R}^{d_1} \times {\bf R}^{d_2} \rightarrow {\bf R}^+}$ are measurable functions obeying (2) in dimension ${d_1+d_2}$, then for any ${x_1, y_1 \in {\bf R}^{d_1}}$, the functions ${f(x_1,\cdot), g(y_1,\cdot), h(x_1+y_1,\cdot): {\bf R}^{d_2} \rightarrow {\bf R}^+}$ obey (2) in dimension ${d_2}$. Applying the Prékopa-Leindler inequality in dimension ${d_2}$, we conclude that

$\displaystyle H(x_1+y_1) \geq \frac{1}{(1-\theta)^{d_2(1-\theta)} \theta^{d_2\theta}} F(x_1)^{1-\theta} G(y_1)^\theta$

for all ${x_1,y_1 \in {\bf R}^{d_1}}$, where ${F(x_1) := \int_{{\bf R}^{d_2}} f(x_1,x_2)\ dx_2}$ and similarly for ${G, H}$. But then if we apply the Prékopa-Leindler inequality again, this time in dimension ${d_1}$ and to the functions ${F}$, ${G}$, and ${(1-\theta)^{d_2(1-\theta)} \theta^{d_2\theta} H}$, and then use the Fubini-Tonelli theorem, we obtain (3).

From tensorisation, we see that to prove the Prékopa-Leindler inequality it suffices to do so in the one-dimensional case. We can derive this from Lemma 1 by reversing the “Prékopa-Leindler implies Brunn-Minkowski” argument given earlier, as follows. We can normalise ${f,g}$ to have sup norm ${1}$. If (2) holds (in one dimension), then the super-level sets ${\{f>\lambda\}, \{g>\lambda\}, \{h>\lambda\}}$ are related by the set-theoretic inclusion

$\displaystyle \{ h > \lambda \} \supset \{ f > \lambda \} + \{ g > \lambda \}$

and thus by Lemma 1

$\displaystyle \mu(\{ h > \lambda \}) \geq \mu(\{ f > \lambda \}) + \mu(\{ g > \lambda \})$

whenever ${0 < \lambda < 1}$ (so that the super-level sets of ${f}$ and ${g}$ are non-empty). On the other hand, from the Fubini-Tonelli theorem one has the distributional identity

$\displaystyle \int_{\bf R} h = \int_0^\infty \mu(\{h > \lambda\})\ d\lambda$

(and similarly for ${f,g}$, but with ${\lambda}$ restricted to ${(0,1)}$), and thus

$\displaystyle \int_{\bf R} h \geq \int_{\bf R} f + \int_{\bf R} g.$

The claim then follows from the weighted arithmetic mean-geometric mean inequality ${(1-\theta) x + \theta y \geq x^{1-\theta} y^\theta}$.
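
In more detail, the final step may be sketched as follows, writing ${x := \frac{1}{1-\theta} \int_{\bf R} f}$ and ${y := \frac{1}{\theta} \int_{\bf R} g}$ (notation introduced here):

$\displaystyle \int_{\bf R} h \geq (1-\theta) x + \theta y \geq x^{1-\theta} y^\theta = \frac{1}{(1-\theta)^{1-\theta} \theta^{\theta}} (\int_{\bf R} f)^{1-\theta} (\int_{\bf R} g)^\theta,$

which is (3) with ${d=1}$. The normalisation of ${f, g}$ is harmless, at least for bounded ${f, g}$ that are not identically zero, because both the hypothesis (2) and the conclusion (3) are preserved when ${(f,g,h)}$ is replaced by ${(af, bg, a^{1-\theta} b^\theta h)}$ for any constants ${a, b > 0}$.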

In this post, I wanted to record the simple observation (which appears in this paper of Leonardi and Masnou in the case of the Heisenberg group, but may have also been stated elsewhere in the literature) that the above argument carries through without much difficulty to the nilpotent setting, to give a nilpotent Brunn-Minkowski inequality:

Theorem 3 (Nilpotent Brunn-Minkowski) Let ${G}$ be a connected, simply connected nilpotent Lie group of (topological) dimension ${d}$, and let ${A, B}$ be non-empty bounded open subsets of ${G}$. Let ${\mu}$ be a Haar measure on ${G}$ (note that nilpotent groups are unimodular, so there is no distinction between left and right Haar measure). Then

$\displaystyle \mu(A \cdot B)^{1/d} \geq \mu(A)^{1/d} + \mu(B)^{1/d}. \ \ \ \ \ (4)$

Here of course ${A \cdot B := \{ ab: a \in A, b \in B \}}$ is the product set of ${A}$ and ${B}$.

Indeed, by repeating the previous arguments, the nilpotent Brunn-Minkowski inequality will follow from

Theorem 4 (Nilpotent Prékopa-Leindler inequality) Let ${G}$ be a connected, simply connected nilpotent Lie group of topological dimension ${d}$ with a Haar measure ${\mu}$. Let ${0 < \theta < 1}$, and let ${f, g, h: G \rightarrow {\bf R}}$ be non-negative measurable functions obeying the inequality

$\displaystyle h(xy) \geq f(x)^{1-\theta} g(y)^\theta \ \ \ \ \ (5)$

for all ${x,y \in G}$. Then we have

$\displaystyle \int_G h\ d\mu \geq \frac{1}{(1-\theta)^{d(1-\theta)} \theta^{d\theta}} (\int_G f\ d\mu)^{1-\theta} (\int_G g\ d\mu)^\theta. \ \ \ \ \ (6)$

To prove the nilpotent Prékopa-Leindler inequality, the key observation is that this inequality not only tensorises; it splits with respect to short exact sequences. Indeed, suppose one has a short exact sequence

$\displaystyle 0 \rightarrow K \rightarrow G \rightarrow H \rightarrow 0$

of connected, simply connected nilpotent Lie groups. The adjoint action of the connected group ${G}$ on ${K}$ acts nilpotently on the Lie algebra of ${K}$ and is thus unimodular. Because of this, we can split a Haar measure ${\mu_G}$ on ${G}$ into Haar measures ${\mu_K, \mu_H}$ on ${K, H}$ respectively so that we have the Fubini-Tonelli formula

$\displaystyle \int_G f(g)\ d\mu_G(g) = \int_H F(h)\ d\mu_H(h)$

for any measurable ${f: G \rightarrow {\bf R}^+}$, where ${F(h)}$ is defined by the formula

$\displaystyle F(h) := \int_K f(kg) d\mu_K(k) = \int_K f(gk)\ d\mu_K(k)$

for any coset representative ${g \in G}$ of ${h}$ (the choice of ${g}$ is not important, thanks to unimodularity of the conjugation action). It is then not difficult to repeat the proof of tensorisation (relying heavily on the unimodularity of conjugation) to conclude that the nilpotent Prékopa-Leindler inequality for ${H}$ and ${K}$ implies the Prékopa-Leindler inequality for ${G}$; we leave this as an exercise to the interested reader.

Now if ${G}$ is a connected, simply connected nilpotent Lie group, then the abelianisation ${G/[G,G]}$ is connected and simply connected and thus isomorphic to a vector space. This implies that ${[G,G]}$ is a retract of ${G}$ and is thus also connected and simply connected. From this and an induction on the step of the nilpotent group, we see that the nilpotent Prékopa-Leindler inequality follows from the abelian case, which we have already established in Theorem 2.

Remark 1 Some connected, simply connected nilpotent groups ${G}$ (and specifically, the Carnot groups) can be equipped with a one-parameter family of dilations ${x \mapsto \lambda \cdot x}$, which are a family of automorphisms on ${G}$, which dilate the Haar measure by the formula

$\displaystyle \mu( \lambda \cdot A ) = \lambda^D \mu(A)$

for all measurable ${A \subset G}$ and an integer ${D}$, called the homogeneous dimension of ${G}$, which is typically larger than the topological dimension. For instance, in the case of the Heisenberg group

$\displaystyle G := \begin{pmatrix} 1 & {\bf R} & {\bf R} \\ 0 & 1 & {\bf R} \\ 0 & 0 & 1 \end{pmatrix},$

which has topological dimension ${d=3}$, the natural family of dilations is given by

$\displaystyle \lambda: \begin{pmatrix} 1 & x & z \\ 0 & 1 & y \\ 0 & 0 & 1 \end{pmatrix} \mapsto \begin{pmatrix} 1 & \lambda x & \lambda^2 z \\ 0 & 1 & \lambda y \\ 0 & 0 & 1 \end{pmatrix}$

with homogeneous dimension ${D=4}$. Because the two notions ${d, D}$ of dimension are usually distinct in the nilpotent case, it is no longer helpful to try to use these dilations to simplify the proof of the Brunn-Minkowski inequality, in contrast to the Euclidean case. This is why we avoided using dilations in the preceding discussion. It is natural to wonder whether one could replace ${d}$ by ${D}$ in (4), but it can be easily shown that the exponent ${d}$ is best possible (an observation that essentially appeared first in this paper of Monti). Indeed, working in the Heisenberg group for sake of concreteness, consider the set

$\displaystyle A := \{ \begin{pmatrix} 1 & x & z \\ 0 & 1 & y \\ 0 & 0 & 1 \end{pmatrix}: |x|, |y| \leq N, |z| \leq N^{10} \}$

for some large parameter ${N}$. This set has measure ${8N^{12}}$ using the Haar measure ${dx\, dy\, dz}$ on ${G}$. The product set ${A \cdot A}$ is contained in

$\displaystyle \{ \begin{pmatrix} 1 & x & z \\ 0 & 1 & y \\ 0 & 0 & 1 \end{pmatrix}: |x|, |y| \leq 2N, |z| \leq 2N^{10} + O(N^2) \}$

and thus has measure at most ${64N^{12} + O(N^4)}$, which is ${(2^3 + o(1)) \mu(A)}$. This already shows that the exponent in (4) cannot be improved beyond ${d=3}$; note that the homogeneous dimension ${D=4}$ is making its presence known in the ${O(N^4)}$ term in the measure of ${A \cdot A}$, but this is a lower order term only.
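
The containment above can be checked numerically. The sketch below encodes a Heisenberg matrix by its triple ${(x,y,z)}$, multiplies corners of the box ${A}$, and computes the ratio between the volume of the resulting bounding box and the volume of ${A}$; the function names are illustrative, and the volumes are taken with respect to ${dx\, dy\, dz}$, whose normalisation cancels in the ratio.

```python
from itertools import product

def heis_mul(p, q):
    """Multiply two upper unitriangular 3x3 matrices given as (x, y, z).

    (x, y, z) stands for the matrix [[1, x, z], [0, 1, y], [0, 0, 1]].
    """
    x1, y1, z1 = p
    x2, y2, z2 = q
    return (x1 + x2, y1 + y2, z1 + z2 + x1 * y2)

def box_corners(N):
    """Corners of the box |x|, |y| <= N, |z| <= N**10 (the set A)."""
    return list(product((-N, N), (-N, N), (-N**10, N**10)))

def product_set_bounds(N):
    """Largest |x|, |y|, |z| over products of two elements of A.

    Each coordinate of the product is affine in each entry of the two
    factors separately, so its extremes over A x A occur at corners.
    """
    corners = box_corners(N)
    prods = [heis_mul(p, q) for p in corners for q in corners]
    return tuple(max(abs(v[i]) for v in prods) for i in range(3))

def volume_ratio(N):
    """Volume of the bounding box of A.A divided by the volume of A."""
    bx, by, bz = product_set_bounds(N)
    return (bx * by * bz) / (N * N * N**10)

print(product_set_bounds(10), volume_ratio(10))
```

For large ${N}$ the ratio approaches ${8 = 2^3}$, matching the topological dimension ${d=3}$ rather than the homogeneous dimension ${D=4}$.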

It is somewhat unfortunate that the nilpotent Brunn-Minkowski inequality is adapted to the topological dimension rather than the homogeneous one, because it means that some of the applications of the inequality (such as the application to isoperimetric inequalities mentioned at the start of the post) break down. (Indeed, the topic of isoperimetric inequalities for the Heisenberg group is a subtle one, with many naive formulations of the inequality being false. See the paper of Monti for more discussion.)

Remark 2 The inequality can be extended to non-simply-connected connected nilpotent groups ${G}$, if ${d}$ is now set to the dimension of the largest simply connected quotient of ${G}$. It seems to me that this is the best one can do in general; for instance, if ${G}$ is a torus, then the inequality fails for any ${d>0}$, as can be seen by setting ${A=B=G}$.

Remark 3 Specialising the nilpotent Brunn-Minkowski inequality to the case ${A=B}$, we conclude that

$\displaystyle \mu(A \cdot A) \geq 2^d \mu(A).$

This inequality actually has a much simpler proof (attributed to Tsachik Gelander in this paper of Hrushovski, as pointed out to me by Emmanuel Breuillard): one can show that for a connected, simply connected nilpotent Lie group ${G}$, the exponential map ${\exp: {\mathfrak g} \rightarrow G}$ is a measure-preserving homeomorphism, for some choice of Haar measure ${\mu_{{\mathfrak g}}}$ on ${{\mathfrak g}}$, so it suffices to show that

$\displaystyle \mu_{{\mathfrak g}}(\log(A \cdot A)) \geq 2^d \mu_{{\mathfrak g}}(\log A).$

But ${A \cdot A}$ contains all the squares ${\{a^2: a \in A \}}$ of ${A}$, so ${\log(A \cdot A)}$ contains the isotropic dilation ${2 \cdot \log A}$, and the claim follows. Note that if we set ${A}$ to be a small ball around the origin, we can modify this argument to give another demonstration of why the topological dimension ${d}$ cannot be replaced with any larger exponent in (4).

One may tentatively conjecture that the inequality ${\mu(A \cdot A) \geq 2^d \mu(A)}$ in fact holds in all unimodular connected, simply connected Lie groups ${G}$, and all bounded open subsets ${A}$ of ${G}$; I do not know if this bound is always true, however.

My graduate text on measure theory (based on these lecture notes) is now published by the AMS as part of the Graduate Studies in Mathematics series.  (See also my own blog page for this book, which among other things contains a draft copy of the book in PDF format.)

The classical inverse function theorem reads as follows:

Theorem 1 (${C^1}$ inverse function theorem) Let ${\Omega \subset {\bf R}^n}$ be an open set, and let ${f: \Omega \rightarrow {\bf R}^n}$ be a continuously differentiable function, such that for every ${x_0 \in \Omega}$, the derivative map ${Df(x_0): {\bf R}^n \rightarrow {\bf R}^n}$ is invertible. Then ${f}$ is a local homeomorphism; thus, for every ${x_0 \in \Omega}$, there exists an open neighbourhood ${U}$ of ${x_0}$ and an open neighbourhood ${V}$ of ${f(x_0)}$ such that ${f}$ is a homeomorphism from ${U}$ to ${V}$.

It is also not difficult to show by inverting the Taylor expansion

$\displaystyle f(x) = f(x_0) + Df(x_0)(x-x_0) + o(\|x-x_0\|)$

that at each ${x_0}$, the local inverses ${f^{-1}: V \rightarrow U}$ are also differentiable at ${f(x_0)}$ with derivative

$\displaystyle Df^{-1}(f(x_0)) = Df(x_0)^{-1}. \ \ \ \ \ (1)$

The textbook proof of the inverse function theorem proceeds by an application of the contraction mapping theorem. Indeed, one may normalise ${x_0=f(x_0)=0}$ and ${Df(0)}$ to be the identity map; continuity of ${Df}$ then shows that ${Df(x)}$ is close to the identity for small ${x}$, which may be used (in conjunction with the fundamental theorem of calculus) to make ${x \mapsto x-f(x)+y}$ a contraction on a small ball around the origin for small ${y}$, at which point the contraction mapping theorem readily finishes off the problem.
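
The contraction mapping iteration described above can be illustrated numerically. The following is a minimal sketch for a hypothetical smooth map ${f: {\bf R}^2 \rightarrow {\bf R}^2}$ with ${f(0)=0}$ and ${Df(0)}$ equal to the identity; the specific map, the target ${y}$, and the number of steps are assumptions made purely for illustration.

```python
def f(x):
    """Hypothetical test map R^2 -> R^2 with f(0) = 0 and Df(0) = identity."""
    x1, x2 = x
    return (x1 + 0.25 * x2 ** 2, x2 + 0.25 * x1 ** 2)

def local_inverse(y, steps=60):
    """Solve f(x) = y for small y by iterating x -> x - f(x) + y from x = 0.

    Near the origin Df is close to the identity, so this map is a
    contraction and the iterates converge to the unique nearby solution.
    """
    x = (0.0, 0.0)
    for _ in range(steps):
        fx = f(x)
        x = (x[0] - fx[0] + y[0], x[1] - fx[1] + y[1])
    return x

y = (0.1, -0.05)
x = local_inverse(y)
residual = max(abs(a - b) for a, b in zip(f(x), y))
print(x, residual)
```

A fixed point of ${x \mapsto x - f(x) + y}$ is exactly a solution of ${f(x) = y}$, which is why the residual printed at the end should be at the level of rounding error.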

I recently learned (after I asked this question on Math Overflow) that the hypothesis of continuous differentiability may be relaxed to just everywhere differentiability:

Theorem 2 (Everywhere differentiable inverse function theorem) Let ${\Omega \subset {\bf R}^n}$ be an open set, and let ${f: \Omega \rightarrow {\bf R}^n}$ be an everywhere differentiable function, such that for every ${x_0 \in \Omega}$, the derivative map ${Df(x_0): {\bf R}^n \rightarrow {\bf R}^n}$ is invertible. Then ${f}$ is a local homeomorphism; thus, for every ${x_0 \in \Omega}$, there exists an open neighbourhood ${U}$ of ${x_0}$ and an open neighbourhood ${V}$ of ${f(x_0)}$ such that ${f}$ is a homeomorphism from ${U}$ to ${V}$.

As before, one can recover the differentiability of the local inverses, with the derivative of the inverse given by the usual formula (1).

This result implicitly follows from the more general results of Cernavskii about the structure of finite-to-one open and closed maps; however, the arguments there are somewhat complicated (and subsequent proofs of those results, such as the one by Vaisala, use some powerful tools from algebraic geometry, such as dimension theory). There is however a more elementary proof of Saint Raymond that was pointed out to me by Julien Melleray. It only uses basic point-set topology (for instance, the concept of a connected component) and the basic topological and geometric structure of Euclidean space (in particular relying primarily on local compactness, local connectedness, and local convexity). I decided to present (an arrangement of) Saint Raymond’s proof here.

To obtain a local homeomorphism near ${x_0}$, there are basically two things to show: local surjectivity near ${x_0}$ (thus, for ${y}$ near ${f(x_0)}$, one can solve ${f(x)=y}$ for some ${x}$ near ${x_0}$) and local injectivity near ${x_0}$ (thus, for distinct ${x_1, x_2}$ near ${x_0}$, ${f(x_1)}$ is not equal to ${f(x_2)}$). Local surjectivity is relatively easy; basically, the standard proof of the inverse function theorem works here, after replacing the contraction mapping theorem (which is no longer available due to the possibly discontinuous nature of ${Df}$) with the Brouwer fixed point theorem instead (or one could also use degree theory, which is more or less an equivalent approach). The difficulty is local injectivity – one needs to preclude the existence of nearby distinct points ${x_1, x_2}$ with ${f(x_1) = f(x_2) = y}$; note that in contrast to the contraction mapping theorem that provides both existence and uniqueness of fixed points, the Brouwer fixed point theorem only gives existence and not uniqueness.

In one dimension ${n=1}$ one can proceed by using Rolle’s theorem. Indeed, as one traverses the interval from ${x_1}$ to ${x_2}$, one must encounter some intermediate point ${x_*}$ which maximises the quantity ${|f(x_*)-y|}$, and which is thus instantaneously non-increasing both to the left and to the right of ${x_*}$. But, by hypothesis, ${f'(x_*)}$ is non-zero, and this easily leads to a contradiction.

Saint Raymond’s argument for the higher dimensional case proceeds in a broadly similar way. Starting with two nearby points ${x_1, x_2}$ with ${f(x_1)=f(x_2)=y}$, one finds a point ${x_*}$ which “locally extremises” ${\|f(x_*)-y\|}$ in the following sense: ${\|f(x_*)-y\|}$ is equal to some ${r_*>0}$, but ${x_*}$ is adherent to at least two distinct connected components ${U_1, U_2}$ of the set ${U = \{ x: \|f(x)-y\| < r_* \}}$. (This is an oversimplification, as one has to restrict the available points ${x}$ in ${U}$ to a suitably small compact set, but let us ignore this technicality for now.) Note from the non-degenerate nature of ${Df(x_*)}$ that ${x_*}$ was already adherent to ${U}$; the point is that ${x_*}$ “disconnects” ${U}$ in some sense. Very roughly speaking, the way such a critical point ${x_*}$ is found is to look at the sets ${\{ x: \|f(x)-y\| \leq r \}}$ as ${r}$ shrinks from a large initial value down to zero, and one finds the first value of ${r_*}$ below which this set disconnects ${x_1}$ from ${x_2}$. (Morally, one is performing some sort of Morse theory here on the function ${x \mapsto \|f(x)-y\|}$, though this function does not have anywhere near enough regularity for classical Morse theory to apply.)

The point ${x_*}$ is mapped to a point ${f(x_*)}$ on the boundary ${\partial B(y,r_*)}$ of the ball ${B(y,r_*)}$, while the components ${U_1, U_2}$ are mapped to the interior of this ball. By using a continuity argument, one can show (again very roughly speaking) that ${f(U_1)}$ must contain a “hemispherical” neighbourhood ${\{ z \in B(y,r_*): \|z-f(x_*)\| < \kappa \}}$ of ${f(x_*)}$ inside ${B(y,r_*)}$, and similarly for ${f(U_2)}$. But then from differentiability of ${f}$ at ${x_*}$, one can then show that ${U_1}$ and ${U_2}$ overlap near ${x_*}$, giving a contradiction.

The rigorous details of the proof are provided below the fold.

Hilbert’s fifth problem concerns the minimal hypotheses one needs to place on a topological group ${G}$ to ensure that it is actually a Lie group. In the previous set of notes, we saw that one could reduce the regularity hypothesis imposed on ${G}$ to a “${C^{1,1}}$” condition, namely that there was an open neighbourhood of the identity in ${G}$ that was isomorphic (as a local group) to an open subset ${V}$ of a Euclidean space ${{\bf R}^d}$ with identity element ${0}$, and with group operation ${\ast}$ obeying the asymptotic

$\displaystyle x \ast y = x + y + O(|x| |y|)$

for sufficiently small ${x,y}$. We will call such local groups ${(V,\ast)}$ ${C^{1,1}}$ local groups.

We now reduce the regularity hypothesis further, to one in which there is no explicit Euclidean space that is initially attached to ${G}$. Of course, Lie groups are still locally Euclidean, so if the hypotheses on ${G}$ do not involve any explicit Euclidean spaces, then one must somehow build such spaces from other structures. One way to do so is to exploit an ambient space with Euclidean or Lie structure that ${G}$ is embedded or immersed in. A trivial example of this is provided by the following basic fact from linear algebra:

Lemma 1 If ${V}$ is a finite-dimensional vector space (i.e. it is isomorphic to ${{\bf R}^d}$ for some ${d}$), and ${W}$ is a linear subspace of ${V}$, then ${W}$ is also a finite-dimensional vector space.

We will establish a non-linear version of this statement, known as Cartan’s theorem. Recall that a subset ${S}$ of a ${d}$-dimensional smooth manifold ${M}$ is a ${d'}$-dimensional smooth (embedded) submanifold of ${M}$ for some ${0 \leq d' \leq d}$ if for every point ${x \in S}$ there is a smooth coordinate chart ${\phi: U \rightarrow V}$ of a neighbourhood ${U}$ of ${x}$ in ${M}$ that maps ${x}$ to ${0}$, such that ${\phi(U \cap S) = V \cap {\bf R}^{d'}}$, where we identify ${{\bf R}^{d'} \equiv {\bf R}^{d'} \times \{0\}^{d-d'}}$ with a subspace of ${{\bf R}^d}$. Informally, ${S}$ locally sits inside ${M}$ the same way that ${{\bf R}^{d'}}$ sits inside ${{\bf R}^d}$.

Theorem 2 (Cartan’s theorem) If ${H}$ is a (topologically) closed subgroup of a Lie group ${G}$, then ${H}$ is a smooth submanifold of ${G}$, and is thus also a Lie group.

Note that the hypothesis that ${H}$ is closed is essential; for instance, the rationals ${{\bf Q}}$ are a subgroup of the (additive) group of reals ${{\bf R}}$, but the former is not a Lie group even though the latter is.

Exercise 1 Let ${H}$ be a subgroup of a locally compact group ${G}$. Show that ${H}$ is closed in ${G}$ if and only if it is locally compact.

A variant of the above results is provided by using (faithful) representations instead of embeddings. Again, the linear version is trivial:

Lemma 3 If ${V}$ is a finite-dimensional vector space, and ${W}$ is another vector space with an injective linear transformation ${\rho: W \rightarrow V}$ from ${W}$ to ${V}$, then ${W}$ is also a finite-dimensional vector space.

Here is the non-linear version:

Theorem 4 (von Neumann’s theorem) If ${G}$ is a Lie group, and ${H}$ is a locally compact group with an injective continuous homomorphism ${\rho: H \rightarrow G}$, then ${H}$ also has the structure of a Lie group.

Actually, it will suffice for the homomorphism ${\rho}$ to be locally injective rather than injective; related to this, von Neumann’s theorem localises to the case when ${H}$ is a local group rather than a group. The requirement that ${H}$ be locally compact is necessary, for much the same reason that the requirement that ${H}$ be closed was necessary in Cartan’s theorem.

Example 1 Let ${G = ({\bf R}/{\bf Z})^2}$ be the two-dimensional torus, let ${H = {\bf R}}$, and let ${\rho: H \rightarrow G}$ be the map ${\rho(x) := (x,\alpha x)}$, where ${\alpha \in {\bf R}}$ is a fixed real number. Then ${\rho}$ is a continuous homomorphism which is locally injective, and is even globally injective if ${\alpha}$ is irrational, and so Theorem 4 is consistent with the fact that ${H}$ is a Lie group. On the other hand, note that when ${\alpha}$ is irrational, then ${\rho(H)}$ is not closed; and so Theorem 4 does not follow immediately from Theorem 2 in this case. (We will see, though, that Theorem 4 follows from a local version of Theorem 2.)
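
The failure of closedness in this example can be seen numerically. Reading ${\rho(x)}$ as ${(x \hbox{ mod } 1, \alpha x \hbox{ mod } 1)}$, the integer points map to ${\rho(n) = (0, n\alpha \hbox{ mod } 1)}$. When ${\alpha}$ is irrational, the point ${(0, 1/2)}$ is not in ${\rho(H)}$ (this would force ${x}$ to be an integer ${n}$ with ${n\alpha - 1/2}$ an integer), yet the points ${\rho(n)}$ come arbitrarily close to it. The sketch below takes ${\alpha = \sqrt{2}}$ as an illustrative choice; the function names and the cutoff are likewise assumptions of the example.

```python
import math

ALPHA = math.sqrt(2.0)  # an irrational slope, chosen only for illustration

def circle_dist(s, t):
    """Distance between s and t viewed as points of the circle R/Z."""
    r = abs(s - t) % 1.0
    return min(r, 1.0 - r)

def closest_return(target, max_n):
    """Smallest circle distance from n*ALPHA mod 1 to target, 1 <= n <= max_n."""
    return min(circle_dist(n * ALPHA, target) for n in range(1, max_n + 1))

best = closest_return(0.5, 10_000)
print(best)
```

The printed distance is small but non-zero, consistent with ${(0,1/2)}$ being a limit point of ${\rho(H)}$ that does not lie in ${\rho(H)}$.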

As a corollary of Theorem 4, we observe that any locally compact Hausdorff group ${H}$ with a faithful linear representation, i.e. a continuous injective homomorphism from ${H}$ into a linear group such as ${GL_n({\bf R})}$ or ${GL_n({\bf C})}$, is necessarily a Lie group. This suggests a representation-theoretic approach to Hilbert’s fifth problem. While this approach does not seem to readily solve the entire problem, it can be used to establish a number of important special cases with a well-understood representation theory, such as the compact case or the abelian case (for which the requisite representation theory is given by the Peter-Weyl theorem and Pontryagin duality respectively). We will discuss these cases further in later notes.

In all of these cases, one is not really building up Euclidean or Lie structure completely from scratch, because there is already a Euclidean or Lie structure present in another object in the hypotheses. Now we turn to results that can create such structure assuming only what is ostensibly a weaker amount of structure. In the linear case, one example of this is the following classical result in the theory of topological vector spaces.

Theorem 5 Let ${V}$ be a locally compact Hausdorff topological vector space. Then ${V}$ is isomorphic (as a topological vector space) to ${{\bf R}^d}$ for some finite ${d}$.

Remark 1 The Banach-Alaoglu theorem asserts that in a normed vector space ${V}$, the closed unit ball in the dual space ${V^*}$ is always compact in the weak-* topology. Of course, this dual space ${V^*}$ may be infinite-dimensional. This however does not contradict the above theorem, because the closed unit ball is not a neighbourhood of the origin in the weak-* topology (it is only a neighbourhood with respect to the strong topology).

The full non-linear analogue of this theorem would be the Gleason-Yamabe theorem, which we are not yet ready to prove in this set of notes. However, by using methods similar to those used to prove Cartan’s theorem and von Neumann’s theorem, one can obtain a partial non-linear analogue which requires an additional hypothesis of a special type of metric, which we will call a Gleason metric:

Definition 6 Let ${G}$ be a topological group. A Gleason metric on ${G}$ is a left-invariant metric ${d: G \times G \rightarrow {\bf R}^+}$ which generates the topology on ${G}$ and obeys the following properties for some constant ${C>0}$, writing ${\|g\|}$ for ${d(g,\hbox{id})}$:

• (Escape property) If ${g \in G}$ and ${n \geq 1}$ is such that ${n \|g\| \leq \frac{1}{C}}$, then ${\|g^n\| \geq \frac{1}{C} n \|g\|}$.
• (Commutator estimate) If ${g, h \in G}$ are such that ${\|g\|, \|h\| \leq \frac{1}{C}}$, then

$\displaystyle \|[g,h]\| \leq C \|g\| \|h\|, \ \ \ \ \ (1)$

where ${[g,h] := g^{-1}h^{-1}gh}$ is the commutator of ${g}$ and ${h}$.
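To make the two inequalities in Definition 6 concrete, here is a minimal numerical sketch, assuming ${2 \times 2}$ real matrices near the identity and using the Frobenius norm of ${g - 1}$ as a stand-in for ${\|g\|}$. This stand-in is an assumption made purely for illustration: the sketch does not verify that it arises from a left-invariant metric, so it only tests the escape and commutator inequalities themselves. The constant ${C = 8}$ and all helper names are likewise choices made for this example.

```python
import math
import random

C = 8.0  # illustrative constant chosen for this sketch

def mat_mul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def mat_inv(a):
    """Inverse of an invertible 2x2 matrix."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [[a[1][1] / det, -a[0][1] / det],
            [-a[1][0] / det, a[0][0] / det]]

def norm(g):
    """Frobenius norm of g - I, used here as a stand-in for ||g||."""
    return math.sqrt(sum((g[i][j] - (1.0 if i == j else 0.0)) ** 2
                         for i in range(2) for j in range(2)))

def commutator(g, h):
    """[g, h] = g^{-1} h^{-1} g h, matching the convention in Definition 6."""
    return mat_mul(mat_mul(mat_inv(g), mat_inv(h)), mat_mul(g, h))

def power(g, n):
    """n-th power of g by repeated multiplication."""
    result = [[1.0, 0.0], [0.0, 1.0]]
    for _ in range(n):
        result = mat_mul(result, g)
    return result

def near_identity(rng, size):
    """Random matrix I + A, with A rescaled to have Frobenius norm `size`."""
    a = [[rng.uniform(-1.0, 1.0) for _ in range(2)] for _ in range(2)]
    scale = size / math.sqrt(sum(x * x for row in a for x in row))
    return [[(1.0 if i == j else 0.0) + scale * a[i][j] for j in range(2)]
            for i in range(2)]

def check_commutator(rng, trials=200):
    """Test ||[g,h]|| <= C ||g|| ||h|| on random g, h with norms at most 1/C."""
    for _ in range(trials):
        g = near_identity(rng, rng.uniform(1e-3, 1.0 / C))
        h = near_identity(rng, rng.uniform(1e-3, 1.0 / C))
        if norm(commutator(g, h)) > C * norm(g) * norm(h) + 1e-12:
            return False
    return True

def check_escape(rng, trials=200):
    """Test ||g^n|| >= n ||g|| / C on random g, n with n ||g|| at most about 1/C."""
    for _ in range(trials):
        n = rng.randint(1, 50)
        g = near_identity(rng, rng.uniform(1e-4, 1.0 / (C * n)))
        if norm(power(g, n)) < n * norm(g) / C - 1e-12:
            return False
    return True

if __name__ == "__main__":
    rng = random.Random(0)
    print(check_commutator(rng), check_escape(rng))
```

The commutator check reflects the identity ${[g,h] - 1 = g^{-1}h^{-1}((g-1)(h-1) - (h-1)(g-1))}$, which makes the left-hand side quadratically small, while the escape check reflects the expansion ${g^n - 1 = n(g-1) + \ldots}$ in the regime where ${n\|g\|}$ is small.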

Exercise 2 Let ${G}$ be a topological group that contains a neighbourhood of the identity isomorphic to a ${C^{1,1}}$ local group. Show that ${G}$ admits at least one Gleason metric.

Theorem 7 (Building Lie structure from Gleason metrics) Let ${G}$ be a locally compact group that has a Gleason metric. Then ${G}$ is isomorphic to a Lie group.

We will rely on Theorem 7 to solve Hilbert’s fifth problem; this theorem reduces the task of establishing Lie structure on a locally compact group to that of building a metric with suitable properties. Thus, much of the remainder of the solution of Hilbert’s fifth problem will now be focused on the problem of how to construct good metrics on a locally compact group.

In all of the above results, a key idea is to use one-parameter subgroups to convert from the nonlinear setting to the linear setting. Recall from the previous notes that in a Lie group ${G}$, the one-parameter subgroups are in one-to-one correspondence with the elements of the Lie algebra ${{\mathfrak g}}$, which is a vector space. In a general topological group ${G}$, the concept of a one-parameter subgroup (i.e. a continuous homomorphism from ${{\bf R}}$ to ${G}$) still makes sense; the main difficulties are then to show that the space of such subgroups continues to form a vector space, and that the associated exponential map ${\exp: \phi \mapsto \phi(1)}$ is still a local homeomorphism near the origin.
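As a small illustration of these notions, the following sketch takes the rotation group of the plane as a test case (an assumption made for this example, not something used in the notes) and checks numerically that ${t \mapsto \phi_\theta(t)}$, rotation by angle ${\theta t}$, is a one-parameter subgroup. The names `phi`, `exp_map` and `close` are hypothetical. In this abelian example the one-parameter subgroups are indexed by ${\theta \in {\bf R}}$, and multiplying two of them pointwise corresponds to adding the parameters.

```python
import math

def phi(t, theta=1.0):
    """One-parameter subgroup t -> rotation by angle theta * t."""
    c, s = math.cos(theta * t), math.sin(theta * t)
    return [[c, -s], [s, c]]

def mat_mul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def close(a, b, tol=1e-12):
    """Entrywise comparison of two 2x2 matrices up to a tolerance."""
    return all(abs(a[i][j] - b[i][j]) <= tol
               for i in range(2) for j in range(2))

def exp_map(theta):
    """The exponential map phi -> phi(1), with phi indexed by theta."""
    return phi(1.0, theta)

if __name__ == "__main__":
    # Homomorphism property: phi(s + t) = phi(s) phi(t).
    print(close(mat_mul(phi(0.3), phi(0.5)), phi(0.8)))
    # Vector space structure: exp(theta1) exp(theta2) = exp(theta1 + theta2).
    print(close(mat_mul(exp_map(0.2), exp_map(0.5)), exp_map(0.7)))
    # exp is only locally injective: theta = 2*pi maps to the identity.
    print(close(exp_map(2 * math.pi), phi(0.0)))
```

The last check shows why one can only hope for the exponential map to be a local homeomorphism near the origin, rather than a global one.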

Exercise 3 The purpose of this exercise is to illustrate the perspective that a topological group can be viewed as a non-linear analogue of a vector space. Let ${G, H}$ be locally compact groups. For technical reasons we assume that ${G, H}$ are both ${\sigma}$-compact and metrisable.

• (i) (Open mapping theorem) Show that if ${\phi: G \rightarrow H}$ is a continuous homomorphism which is surjective, then it is open (i.e. the image of open sets is open). (Hint: mimic the proof of the open mapping theorem for Banach spaces, as discussed for instance in these notes. In particular, take advantage of the Baire category theorem.)
• (ii) (Closed graph theorem) Show that if a homomorphism ${\phi: G \rightarrow H}$ is closed (i.e. its graph ${\{ (g, \phi(g)): g \in G \}}$ is a closed subset of ${G \times H}$), then it is continuous. (Hint: mimic the derivation of the closed graph theorem from the open mapping theorem in the Banach space case, as again discussed in these notes.)
• (iii) Let ${\phi: G \rightarrow H}$ be a homomorphism, and let ${\rho: H \rightarrow K}$ be a continuous injective homomorphism into another Hausdorff topological group ${K}$. Show that ${\phi}$ is continuous if and only if ${\rho \circ \phi}$ is continuous.
• (iv) Relax the condition of metrisability to that of being Hausdorff. (Hint: Now one cannot use the Baire category theorem for metric spaces; but there is an analogue of this theorem for locally compact Hausdorff spaces.)

In this set of notes, we describe the basic analytic structure theory of Lie groups, by relating them to the simpler concept of a Lie algebra. Roughly speaking, the Lie algebra encodes the “infinitesimal” structure of a Lie group, but is a simpler object, being a vector space rather than a nonlinear manifold. Nevertheless, thanks to the fundamental theorems of Lie, the Lie algebra can be used to reconstruct the Lie group (at a local level, at least), by means of the exponential map and the Baker-Campbell-Hausdorff formula. As such, the local theory of Lie groups is completely described (in principle, at least) by the theory of Lie algebras, which leads to a number of useful consequences, such as the following:

• (Local Lie implies Lie) A topological group ${G}$ is Lie (i.e. it is isomorphic to a Lie group) if and only if it is locally Lie (i.e. the group operations are smooth near the origin).
• (Uniqueness of Lie structure) A topological group has at most one smooth structure on it that makes it Lie.
• (Weak regularity implies strong regularity, I) Lie groups are automatically real analytic. (In fact one only needs a “local ${C^{1,1}}$” regularity on the group structure to obtain real analyticity.)
• (Weak regularity implies strong regularity, II) A continuous homomorphism from one Lie group to another is automatically smooth (and real analytic).
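The way the Lie algebra reconstructs the group law through the exponential map and the Baker-Campbell-Hausdorff formula can be checked exactly in a small case. The sketch below assumes strictly upper triangular ${3 \times 3}$ matrices (the Heisenberg Lie algebra), a case chosen for this illustration because all series terminate: ${\exp(X) = 1 + X + X^2/2}$, and the Baker-Campbell-Hausdorff formula reduces to ${X + Y + \frac{1}{2}[X,Y]}$ with ${[X,Y] = XY - YX}$. Exact rational arithmetic is used so that no tolerance is needed; all function names are hypothetical.

```python
from fractions import Fraction

def heis(a, b, c):
    """Strictly upper triangular 3x3 matrix with entries a, b, c."""
    z = Fraction(0)
    return [[z, Fraction(a), Fraction(c)],
            [z, z, Fraction(b)],
            [z, z, z]]

def add(x, y):
    return [[x[i][j] + y[i][j] for j in range(3)] for i in range(3)]

def scale(c, x):
    return [[Fraction(c) * x[i][j] for j in range(3)] for i in range(3)]

def mul(x, y):
    return [[sum(x[i][k] * y[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def bracket(x, y):
    """Lie bracket [X, Y] = XY - YX."""
    return add(mul(x, y), scale(-1, mul(y, x)))

def exp_nilpotent(x):
    """exp(X) = I + X + X^2/2, valid here because X^3 = 0."""
    identity = [[Fraction(int(i == j)) for j in range(3)] for i in range(3)]
    return add(add(identity, x), scale(Fraction(1, 2), mul(x, x)))

def bch(x, y):
    """Baker-Campbell-Hausdorff in this case: X + Y + [X, Y]/2."""
    return add(add(x, y), scale(Fraction(1, 2), bracket(x, y)))

if __name__ == "__main__":
    x, y = heis(1, 2, 3), heis(4, 5, 6)
    # The group product of exponentials equals the exponential of bch(x, y).
    print(mul(exp_nilpotent(x), exp_nilpotent(y)) == exp_nilpotent(bch(x, y)))
```

In this example the group multiplication on the left is recovered entirely from the vector space operations and the bracket on the right, which is the sense in which the Lie algebra determines the local group structure.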

The connection between Lie groups and Lie algebras also highlights the role of one-parameter subgroups of a topological group, which will play a central role in the solution of Hilbert’s fifth problem.

We note that there is also a very important algebraic structure theory of Lie groups and Lie algebras, in which the Lie algebra is split into solvable and semisimple components, with the latter being decomposed further into simple components, which can then be completely classified using Dynkin diagrams. This classification is of fundamental importance in many areas of mathematics (e.g. representation theory, arithmetic geometry, and group theory), and many of the deeper facts about Lie groups and Lie algebras are proven via this classification (although in such cases it can be of interest to also find alternate proofs that avoid the classification). However, it turns out that we will not need this theory in this course, and so we will not discuss it further here (though it can of course be found in any graduate text on Lie groups and Lie algebras).