This is another installment of my series of posts on Hilbert’s fifth problem. One formulation of this problem is answered by the following theorem of Gleason and Montgomery-Zippin:

Theorem 1 (Hilbert’s fifth problem) Let {G} be a topological group which is locally Euclidean. Then {G} is isomorphic to a Lie group.

Theorem 1 is a deep and difficult result, but the discussion in the previous posts has reduced the proof of this theorem to that of establishing two simpler results, involving the concepts of a no small subgroups (NSS) group, and that of a Gleason metric. We briefly recall the relevant definitions:

Definition 2 (NSS) A topological group {G} is said to have no small subgroups, or is NSS for short, if there is an open neighbourhood {U} of the identity in {G} that contains no subgroups of {G} other than the trivial subgroup {\{ \hbox{id}\}}.

Definition 3 (Gleason metric) Let {G} be a topological group. A Gleason metric on {G} is a left-invariant metric {d: G \times G \rightarrow {\bf R}^+} which generates the topology on {G} and obeys the following properties for some constant {C>0}, writing {\|g\|} for {d(g,\hbox{id})}:

  • (Escape property) If {g \in G} and {n \geq 1} is such that {n \|g\| \leq \frac{1}{C}}, then

    \displaystyle  \|g^n\| \geq \frac{1}{C} n \|g\|. \ \ \ \ \ (1)

  • (Commutator estimate) If {g, h \in G} are such that {\|g\|, \|h\| \leq \frac{1}{C}}, then

    \displaystyle  \|[g,h]\| \leq C \|g\| \|h\|, \ \ \ \ \ (2)

    where {[g,h] := g^{-1}h^{-1}gh} is the commutator of {g} and {h}.
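To fix ideas, here is a simple worked example (ours, for illustration only, and not part of the argument): take {G = {\bf R}/{\bf Z}}, written additively, with {\|g\|} the distance from any representative of {g} to the nearest integer. The commutator estimate (2) is trivial since {G} is abelian, and the escape property (1) holds with {C=2}. The example also suggests why the smallness hypothesis {n \|g\| \leq \frac{1}{C}} cannot simply be dropped.

```latex
% Illustrative example only: G = R/Z, \|g\| = \mathrm{dist}(x, \mathbf{Z}) for a representative x of g.
% Pick x with |x| = \|g\|. If n|x| \le 1/2 then nx is within 1/2 of 0, so dist(nx, Z) = n|x|:
\[
  n \|g\| \le \tfrac12
  \;\Longrightarrow\;
  \|ng\| = n \|g\| \ \ge\ \tfrac12\, n \|g\| .
\]
% Without the smallness hypothesis the lower bound can fail:
\[
  g = \tfrac12 + \mathbf{Z}, \quad n = 2: \qquad
  \|2g\| = 0 \;<\; \tfrac12 \cdot 2 \cdot \tfrac12 .
\]
% The commutator estimate (2) is trivial here, since [g,h] is the identity in any abelian group.
```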

The remaining steps in the resolution of Hilbert’s fifth problem are then as follows:

Theorem 4 (Reduction to the NSS case) Let {G} be a locally compact group, and let {U} be an open neighbourhood of the identity in {G}. Then there exists an open subgroup {G'} of {G}, and a compact subgroup {N} of {G'} contained in {U}, such that {G'/N} is NSS and locally compact.

Theorem 5 (Gleason’s lemma) Let {G} be a locally compact NSS group. Then {G} has a Gleason metric.

The purpose of this post is to establish these two results, using arguments that are originally due to Gleason. We will split this task into several subtasks, each of which improves the structure on the group {G} by some amount:

Proposition 6 (From locally compact to metrisable) Let {G} be a locally compact group, and let {U} be an open neighbourhood of the identity in {G}. Then there exists an open subgroup {G'} of {G}, and a compact subgroup {N} of {G'} contained in {U}, such that {G'/N} is locally compact and metrisable.

For any open neighbourhood {U} of the identity in {G}, let {Q(U)} be the union of all the subgroups of {G} that are contained in {U}. (Thus, for instance, {G} is NSS if and only if {Q(U)} is trivial for all sufficiently small {U}.)

Proposition 7 (From metrisable to subgroup trapping) Let {G} be a locally compact metrisable group. Then {G} has the subgroup trapping property: for every open neighbourhood {U} of the identity, there exists another open neighbourhood {V} of the identity such that {Q(V)} generates a subgroup {\langle Q(V) \rangle} contained in {U}.

Proposition 8 (From subgroup trapping to NSS) Let {G} be a locally compact group with the subgroup trapping property, and let {U} be an open neighbourhood of the identity in {G}. Then there exists an open subgroup {G'} of {G}, and a compact subgroup {N} of {G'} contained in {U}, such that {G'/N} is locally compact and NSS.

Proposition 9 (From NSS to the escape property) Let {G} be a locally compact NSS group. Then there exists a left-invariant metric {d} on {G} generating the topology on {G} which obeys the escape property (1) for some constant {C}.

Proposition 10 (From escape to the commutator estimate) Let {G} be a locally compact group with a left-invariant metric {d} that obeys the escape property (1). Then {d} also obeys the commutator property (2).

It is clear that Propositions 6, 7, and 8 combine to give Theorem 4, and Propositions 9, 10 combine to give Theorem 5.

Propositions 6-10 are all proven separately, but their proofs share some common strategies and ideas. The first main idea is to construct metrics on a locally compact group {G} by starting with a suitable “bump function” {\phi \in C_c(G)} (i.e. a continuous, compactly supported function from {G} to {{\bf R}}) and pulling back the metric structure on {C_c(G)} by using the translation action {\tau_g \phi(x) := \phi(g^{-1} x)}, thus creating a (semi-)metric

\displaystyle  d_\phi( g, h ) := \| \tau_g \phi - \tau_h \phi \|_{C_c(G)} := \sup_{x \in G} |\phi(g^{-1} x) - \phi(h^{-1} x)|. \ \ \ \ \ (3)

One easily verifies that this is indeed a (semi-)metric (in that it is non-negative, symmetric, and obeys the triangle inequality); it is also left-invariant, and so we have {d_\phi(g,h) = \|g^{-1} h \|_\phi = \| h^{-1} g \|_\phi}, where

\displaystyle  \| g \|_\phi = d_\phi(g,\hbox{id}) = \| \partial_g \phi \|_{C_c(G)}

where {\partial_g} is the difference operator {\partial_g = 1 - \tau_g},

\displaystyle  \partial_g \phi(x) = \phi(x) - \phi(g^{-1} x).
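For readers who want the “easily verified” claims spelled out, the left-invariance of {d_\phi} and its expression through {\| \cdot \|_\phi} can be checked by a change of variables in the supremum. The following is a routine verification using only the definitions (3) and {\partial_g = 1 - \tau_g}; the auxiliary letters {k} and {x'} are ours.

```latex
% Left-invariance: substitute x = k x' in the supremum.
\[
  d_\phi(kg, kh)
  = \sup_{x \in G} |\phi(g^{-1} k^{-1} x) - \phi(h^{-1} k^{-1} x)|
  = \sup_{x' \in G} |\phi(g^{-1} x') - \phi(h^{-1} x')|
  = d_\phi(g,h).
\]
% Taking k = g^{-1}, and using (g^{-1}h)^{-1} = h^{-1} g:
\[
  d_\phi(g,h) = d_\phi(\mathrm{id}, g^{-1}h)
  = \sup_{x \in G} |\phi(x) - \phi((g^{-1}h)^{-1} x)|
  = \| \partial_{g^{-1}h} \phi \|_{C_c(G)}
  = \| g^{-1} h \|_\phi .
\]
% Symmetry of d_\phi then gives \|g^{-1}h\|_\phi = \|h^{-1}g\|_\phi.
```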

This construction was already seen in the proof of the Birkhoff-Kakutani theorem, which is the main tool used to establish Proposition 6. For the other propositions, the idea is to choose a bump function {\phi} that is “smooth” enough that it creates a metric with good properties such as the commutator estimate (2). Roughly speaking, to get a bound of the form (2), one needs {\phi} to have “{C^{1,1}} regularity” with respect to the “right” smooth structure on {G}. By {C^{1,1}} regularity, we mean here something like a bound of the form

\displaystyle  \| \partial_g \partial_h \phi \|_{C_c(G)} \ll \|g\|_\phi \|h\|_\phi \ \ \ \ \ (4)

for all {g,h \in G}. Here we use the usual asymptotic notation, writing {X \ll Y} or {X=O(Y)} if {X \leq CY} for some constant {C} (which can vary from line to line).

The following lemma illustrates how {C^{1,1}} regularity can be used to build Gleason metrics.

Lemma 11 Suppose that {\phi \in C_c(G)} obeys (4). Then the (semi-)metric {d_\phi} (and associated (semi-)norm {\|\|_\phi}) obey the escape property (1) and the commutator property (2).

Proof: We begin with the commutator property (2). Observe the identity

\displaystyle  \tau_{[g,h]} = \tau_{hg}^{-1} \tau_{gh}

whence

\displaystyle  \partial_{[g,h]} = \tau_{hg}^{-1} ( \tau_{hg} - \tau_{gh} )

\displaystyle  = \tau_{hg}^{-1} ( \partial_h \partial_g - \partial_g \partial_h ).

From the triangle inequality (and translation-invariance of the {C_c(G)} norm) we thus see that (2) follows from (4). Similarly, to obtain the escape property (1), observe the telescoping identity

\displaystyle  \partial_{g^n} = n \partial_g - \sum_{i=0}^{n-1} \partial_g \partial_{g^i}

for any {g \in G} and natural number {n}, and thus by the triangle inequality

\displaystyle  \| g^n \|_\phi = n \| g \|_\phi + O( \sum_{i=0}^{n-1} \| \partial_g \partial_{g^i} \phi \|_{C_c(G)} ). \ \ \ \ \ (5)

But from (4) (and the triangle inequality) we have

\displaystyle  \| \partial_g \partial_{g^i} \phi \|_{C_c(G)} \ll \|g\|_\phi \|g^i \|_\phi \ll i \|g\|_\phi^2

and thus we have the “Taylor expansion”

\displaystyle  \|g^n\|_\phi = n \|g\|_\phi + O( n^2 \|g\|_\phi^2 )

which gives (1). \Box
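The operator identities used in this proof can be unwound as follows. This is a routine check using only {\partial_g = 1 - \tau_g} and {\tau_g \tau_h = \tau_{gh}}; the letter {A} for the implied constant in the Taylor expansion is our notation.

```latex
% (a) Commutator step: expand both second differences.
\[
  \partial_h \partial_g - \partial_g \partial_h
  = (1 - \tau_h - \tau_g + \tau_{hg}) - (1 - \tau_g - \tau_h + \tau_{gh})
  = \tau_{hg} - \tau_{gh},
\]
% so that, by translation invariance of the sup norm and (4),
\[
  \|[g,h]\|_\phi
  \le \|\partial_h \partial_g \phi\|_{C_c(G)} + \|\partial_g \partial_h \phi\|_{C_c(G)}
  \ll \|g\|_\phi \|h\|_\phi .
\]
% (b) Telescoping step: 1 - \tau_g^n = (1-\tau_g) \sum_{i<n} \tau_g^i, and \tau_{g^i} = 1 - \partial_{g^i}.
\[
  \partial_{g^n}
  = \partial_g \sum_{i=0}^{n-1} (1 - \partial_{g^i})
  = n \partial_g - \sum_{i=0}^{n-1} \partial_g \partial_{g^i} .
\]
% (c) From the Taylor expansion to (1), with A the implied constant:
\[
  \|g^n\|_\phi \ge n \|g\|_\phi - A n^2 \|g\|_\phi^2
  = n \|g\|_\phi \, (1 - A n \|g\|_\phi)
  \ge \tfrac12 n \|g\|_\phi
  \quad\hbox{whenever } n \|g\|_\phi \le \tfrac{1}{2A},
\]
% which is the escape property (1) with C = \max(2, 2A).
```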

It remains to obtain {\phi} that have the desired {C^{1,1}} regularity property. In order to get such regular bump functions, we will use the trick of convolving together two lower regularity bump functions (such as two functions with “{C^{0,1}} regularity” in some sense to be determined later). In order to perform this convolution, we will use the fundamental tool of (left-invariant) Haar measure {\mu} on the locally compact group {G}. Here we exploit the basic fact that the convolution

\displaystyle  f_1 * f_2(x) := \int_G f_1(y) f_2(y^{-1} x)\ d\mu(y) \ \ \ \ \ (6)

of two functions {f_1,f_2 \in C_c(G)} tends to be smoother than either of the two factors {f_1,f_2}. This is easiest to see in the abelian case, since in this case we can distribute derivatives according to the law

\displaystyle  \partial_g (f_1 * f_2) = (\partial_g f_1) * f_2 = f_1 * (\partial_g f_2),

which suggests that the order of “differentiability” of {f_1*f_2} should be the sum of the orders of {f_1} and {f_2} separately.
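To see where commutativity enters in this law, one can compute the two translates separately. The first identity below uses only the left-invariance of {\mu} and so holds in any locally compact group; the second is where the abelian hypothesis is used. This is a sketch of the standard computation, written out for convenience.

```latex
% Valid in any locally compact group: substitute y -> g y and use left-invariance of \mu.
\[
  ((\tau_g f_1) * f_2)(x)
  = \int_G f_1(g^{-1} y) f_2(y^{-1} x)\, d\mu(y)
  = \int_G f_1(y) f_2(y^{-1} g^{-1} x)\, d\mu(y)
  = \tau_g (f_1 * f_2)(x).
\]
% Abelian case only: g^{-1} y^{-1} = y^{-1} g^{-1}, so
\[
  (f_1 * (\tau_g f_2))(x)
  = \int_G f_1(y) f_2(g^{-1} y^{-1} x)\, d\mu(y)
  = \int_G f_1(y) f_2(y^{-1} g^{-1} x)\, d\mu(y)
  = \tau_g (f_1 * f_2)(x).
\]
% Subtracting each identity from f_1 * f_2 gives the two forms of the law for \partial_g = 1 - \tau_g.
```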

These ideas are already sufficient to establish Proposition 10 directly, and also Proposition 9 when combined with an additional bootstrap argument. The proofs of Proposition 7 and Proposition 8 use similar techniques, but are more difficult due to the potential presence of small subgroups, which require an application of the Peter-Weyl theorem to properly control. Both of these theorems will be proven below the fold, thus (when combined with the preceding posts) completing the proof of Theorem 1.

The presentation here is based on some unpublished notes of van den Dries and Goldbring on Hilbert’s fifth problem. I am indebted to Emmanuel Breuillard, Ben Green, and Tom Sanders for many discussions related to these arguments.

— 1. From escape to the commutator estimate —

The general strategy here is to keep applying Gleason’s idea: use the regularity one already has on the group {G} to build good bump functions {\phi}, which in turn create metrics that give even more regularity on {G}. As with many such “bootstrap” arguments, the deepest and most difficult steps are the earliest ones, in which one has very little regularity to begin with; conversely, the easiest and most straightforward steps tend to be the final ones, when one already has most of the regularity that one needs, thus having plenty of structure and tools available to climb the next rung of the regularity ladder. (For instance, to get from {C^{1,1}} regularity of a topological group to {C^\infty} or real analytic regularity is relatively routine, with two different such approaches indicated in the preceding blog posts.) In particular, the easiest task to accomplish will be that of Proposition 10, which establishes the commutator estimate (2) once the rest of the structural control on the group {G} is in place.

We now prove this proposition. As indicated in the introduction, the key idea here is to involve a bump function {\phi} formed by convolving together two Lipschitz functions. The escape property (1) will be crucial in obtaining quantitative control of the metric geometry at very small scales, as one can study the size of a group element {g} very close to the origin through its powers {g^n}, which are further away from the origin.

Specifically, let {\epsilon > 0} be a small quantity to be chosen later, and let {\psi \in C_c(G)} be a non-negative Lipschitz function supported on the ball {B(0,\epsilon)} which is not identically zero. For instance, one could use the explicit function

\displaystyle  \psi(x) := (1 - \frac{\|x\|}{\epsilon})_+

where {y_+ := \max(y,0)}. Being Lipschitz, we see that

\displaystyle  \| \partial_g \psi \|_{C_c(G)} \ll \|g\| \ \ \ \ \ (7)

for all {g \in G} (where we allow implied constants to depend on {G}, {\epsilon}, and {\psi}).

Let {\mu} be a non-trivial left-invariant Haar measure on {G} (see for instance this previous blog post for a construction of Haar measure on locally compact groups). We then form the convolution {\phi := \psi * \psi}, with convolution defined using (6); this is a continuous function supported in {B(0,2\epsilon)}, and gives a metric {d_\phi} and a norm {\| \|_\phi}.

We now prove a variant of (4), namely that

\displaystyle  \| \partial_g \partial_h \phi \|_{C_c(G)} \ll \|g\| \| h \| \ \ \ \ \ (8)

whenever {g, h \in B(0,\epsilon)}. We first use the left-invariance of Haar measure to write

\displaystyle  \partial_h \phi = (\partial_h \psi) * \psi, \ \ \ \ \ (9)

i.e.

\displaystyle  \partial_h \phi(x) = \int_G (\partial_h \psi)(y) \psi(y^{-1} x)\ d\mu(y).

We would like to similarly move the {\partial_g} operator over to the second factor, but we run into a difficulty due to the non-abelian nature of {G}. Nevertheless, we can still do this provided that we twist that operator by a conjugation. More precisely, we have

\displaystyle  \partial_g \partial_h \phi(x) = \int_G (\partial_h \psi)(y) (\partial_{g^y} \psi)(y^{-1} x)\ d\mu(y) \ \ \ \ \ (10)

where {g^y := y^{-1} g y} is {g} conjugated by {y}. If {h \in B(0,\epsilon)}, the integrand is only non-zero when {y \in B(0,2\epsilon)}. Applying (7), we obtain the bound

\displaystyle  \| \partial_g \partial_h \phi \|_{C_c(G)} \ll \|h\| \sup_{y \in B(0,2\epsilon)} \|g^y\|.
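The twisted identity (10) and the bound just obtained can be checked directly from (9). The following is a sketch of that computation; it uses only {g^y = y^{-1} g y}, the Lipschitz bound (7), and the finiteness of {\mu(B(0,2\epsilon))} (which we assume, as holds once {\epsilon} is small enough for the ball to be precompact).

```latex
% Apply \partial_g in the x variable to (9); the factor (\partial_h \psi)(y) does not depend on x.
\[
  \partial_g \partial_h \phi(x)
  = \int_G (\partial_h \psi)(y) \left[ \psi(y^{-1} x) - \psi(y^{-1} g^{-1} x) \right] d\mu(y).
\]
% Since (g^y)^{-1} = y^{-1} g^{-1} y, we have y^{-1} g^{-1} x = (g^y)^{-1} (y^{-1} x), so the bracket equals
\[
  \psi(y^{-1} x) - \psi((g^y)^{-1} y^{-1} x) = (\partial_{g^y} \psi)(y^{-1} x),
\]
% which is (10). Bounding the integrand pointwise on the region y \in B(0,2\epsilon) and applying (7) twice:
\[
  |\partial_g \partial_h \phi(x)|
  \le \mu(B(0,2\epsilon)) \, \|\partial_h \psi\|_{C_c(G)}
      \sup_{y \in B(0,2\epsilon)} \|\partial_{g^y} \psi\|_{C_c(G)}
  \ll \|h\| \sup_{y \in B(0,2\epsilon)} \|g^y\| .
\]
```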

To finish the proof of (8), it suffices to show that

\displaystyle  \|g^y\| \ll \|g\|

whenever {g \in B(0,\epsilon)} and {y \in B(0,2\epsilon)}.

We can achieve this by the escape property (1). Let {n} be a natural number such that {n \|g\| \leq \epsilon}, then {\|g^n\| \leq \epsilon} and so {g^n \in B(0,\epsilon)}. Conjugating by {y}, this implies that {(g^y)^n \in B(0,5\epsilon)}, and so by (1), we have {\|g^y\| \ll \frac{1}{n}} (if {\epsilon} is small enough), and the claim follows.
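The appeal to (1) in the last step is brief, so here is one way to fill it in. This is a sketch under the extra normalisations {C \geq 1} and {\epsilon < \frac{1}{10 C^2}}, which are our choices and not taken from the text. Write {x := g^y}, and note that {x^k \in B(0,5\epsilon)} for every {1 \leq k \leq n}, since {k \|g\| \leq \epsilon} for all such {k}.

```latex
% Suppose for contradiction that n \|x\| > 1/C. Since \|x\| < 5\epsilon < 1/(2C), there is a largest k
% with 1 \le k < n and k \|x\| \le 1/C; maximality gives (k+1) \|x\| > 1/C, hence
\[
  k \|x\| > \frac{1}{C} - \|x\| > \frac{1}{2C},
  \qquad\hbox{so by (1)}\qquad
  \|x^k\| \ge \frac{1}{C}\, k \|x\| > \frac{1}{2C^2} > 5\epsilon ,
\]
% contradicting x^k \in B(0,5\epsilon). Hence n \|x\| \le 1/C, and (1) applies to n itself:
\[
  n \|x\| \le C \|x^n\| < 5 C \epsilon,
  \qquad\hbox{i.e.}\qquad
  \|g^y\| \ll \frac{1}{n} .
\]
% For g \ne \mathrm{id}, choose n maximal with n \|g\| \le \epsilon (so n \ge 1, as \|g\| < \epsilon).
% Then (n+1) \|g\| > \epsilon and n + 1 \le 2n, so
\[
  \frac{1}{n} \le \frac{2}{n+1} < \frac{2 \|g\|}{\epsilon},
  \qquad\hbox{and therefore}\qquad
  \|g^y\| \ll \|g\| .
\]
```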

Next, we claim that the norm {\| \|_\phi} is locally comparable to the original norm {\| \|}. More precisely, we claim:

  1. If {g \in G} with {\| g \|_\phi} sufficiently small, then {\| g \| \ll \| g\|_\phi}.
  2. If {g \in G} with {\| g \|} sufficiently small, then {\|g\|_\phi \ll \|g\|}.

Claim 2 follows easily from (9) and (7), so we turn to Claim 1. Let {g \in G}, and let {n} be a natural number such that

\displaystyle  n \|g\|_\phi < \| \phi \|_{C_c(G)}.

Then by the triangle inequality

\displaystyle  \|g^n \|_\phi < \|\phi \|_{C_c(G)}.

This implies that {\phi} and {\tau_{g^n} \phi} have overlapping support, and hence {g^n} lies in {B(0,4\epsilon)}. By the escape property (1), this implies (if {\epsilon} is small enough) that {\|g\| \ll \frac{1}{n}}, and the claim follows.
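Two steps in the proofs of these claims are stated without computation, and can be spelled out as follows. This is a sketch: the point {x_0} and the use of {\int_G \psi(z^{-1})\ d\mu(z)} (rather than {\int_G \psi\ d\mu}, which would need unimodularity or symmetry of {\psi}) are our way of writing the argument.

```latex
% Claim 2: by (9) with h replaced by g, the substitution y = x z, and then (7),
\[
  |\partial_g \phi(x)|
  \le \|\partial_g \psi\|_{C_c(G)} \int_G \psi(y^{-1} x)\, d\mu(y)
  = \|\partial_g \psi\|_{C_c(G)} \int_G \psi(z^{-1})\, d\mu(z)
  \ll \|g\| .
\]
% Claim 1, overlapping supports: \phi \ge 0 attains its maximum \|\phi\|_{C_c(G)} at some point x_0.
% If \phi and \tau_{g^n}\phi were never simultaneously non-zero, then \tau_{g^n}\phi(x_0) = 0 and
\[
  \|g^n\|_\phi \ge |\phi(x_0) - \tau_{g^n}\phi(x_0)| = \|\phi\|_{C_c(G)},
\]
% contradicting \|g^n\|_\phi < \|\phi\|_{C_c(G)}. So there is x with x and (g^n)^{-1} x both in B(0,2\epsilon), and
\[
  \|g^n\| \le \|x\| + \|x^{-1} g^n\| = \|x\| + \|(g^n)^{-1} x\| < 4\epsilon .
\]
```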

Combining Claim 1 with (8) we see that

\displaystyle  \| \partial_g \partial_h \phi \|_{C_c(G)} \ll \|g\|_\phi \| h \|_\phi

whenever {\|g\|_\phi, \|h\|_\phi} are small enough; arguing as in the proof of Lemma 11 we conclude that

\displaystyle  \| [g,h] \|_\phi \ll \|g\|_\phi \|h\|_\phi

whenever {\|g\|_\phi, \|h\|_\phi} are small enough. Proposition 10 then follows from Claim 1 and Claim 2.

— 2. From NSS to the escape property —

Now we turn to establishing Proposition 9. An important concept will be that of an escape norm associated to an open neighbourhood {U} of a group {G}, defined by the formula

\displaystyle  \|g\|_{e,U} := \inf \{ \frac{1}{n+1}: g, g^2, \ldots, g^n \in U \} \ \ \ \ \ (11)

for any {g \in G}. Thus, the longer it takes for the orbit {g, g^2, \ldots} to escape {U}, the smaller the escape norm.
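As a concrete illustration (our own example, not used later), one can compute the escape norm explicitly for {G = ({\bf R},+)} and {U = (-1,1)}, reading the condition in (11) as vacuous when {n=0}. In this case the escape norm is comparable to the usual absolute value near the origin.

```latex
% Illustrative example only: G = (\mathbf{R},+), U = (-1,1), so that "g^n" means ng.
% For 0 < |g| < 1, the multiples g, 2g, \ldots, ng all lie in U exactly when n |g| < 1,
% i.e. when n \le \lceil 1/|g| \rceil - 1. Hence
\[
  \|g\|_{e,U} = \frac{1}{\lceil 1/|g| \rceil},
  \qquad
  \tfrac12 |g| < \frac{|g|}{1+|g|} < \|g\|_{e,U} \le |g| .
\]
% For |g| \ge 1 only n = 0 is admissible, so \|g\|_{e,U} = 1; and \|0\|_{e,U} = 0.
```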

Strictly speaking, the escape norm is not necessarily a norm, as it need not obey the symmetry, non-degeneracy, or triangle inequalities; however, we shall see that in many situations, the escape norm behaves similarly to a norm, even if it does not exactly obey the norm axioms. Also, as the name suggests, the escape norm will be well suited for establishing the escape property (1).

It is possible for the escape norm {\|g\|_{e,U}} of a non-identity element {g \in G} to be zero, if {U} contains the group {\langle g \rangle} generated by {g}. But if the group {G} has the NSS property, then we see that this cannot occur for all sufficiently small {U} (where “sufficiently small” means “contained in a suitably chosen open neighbourhood {U_0} of the identity”). In fact, more is true: if {U, U'} are two sufficiently small open neighbourhoods of the identity in a locally compact NSS group {G}, then the two escape norms are comparable, thus we have

\displaystyle  \|g \|_{e,U} \ll \|g\|_{e,U'} \ll \|g\|_{e,U} \ \ \ \ \ (12)

for all {g \in G} (where the implied constants can depend on {U, U'}).

By symmetry, it suffices to prove the second inequality in (12). By (11), it suffices to find an integer {m} such that whenever {g \in G} is such that {g, g^2, \ldots, g^m \in U}, then {g \in U'}. Equivalently: for every {g \not \in U'}, one has {g^i \not \in U} for some {1 \leq i \leq m}. If {U} is small enough, then by the NSS property, we know that for each {g \in \overline{U} \backslash U'}, we have {g^i \not \in U} for some {i \geq 1}. As {G} is locally compact, we can make {\overline{U}} and hence {\overline{U} \backslash U'} compact, and so we can make {i} uniformly bounded in {g} by a compactness argument, and the claim follows.
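The reduction “by (11)” to finding such an integer {m} can be made quantitative as follows. This is a sketch of the bookkeeping, with {N} and {n} as auxiliary integers of our choosing; it yields the second inequality in (12) with implied constant {m}.

```latex
% Suppose g, g^2, \ldots, g^N \in U, and put n := \lfloor N/m \rfloor. For each 1 \le i \le n the powers
% g^i, (g^i)^2, \ldots, (g^i)^m all lie in U (as im \le N), so g^i \in U' by the choice of m. Hence
\[
  \|g\|_{e,U'} \le \frac{1}{n+1} \le \frac{m}{N+1},
\]
% using \lfloor N/m \rfloor + 1 \ge (N+1)/m. Taking the infimum over all admissible N in (11) gives
\[
  \|g\|_{e,U'} \le m \, \|g\|_{e,U} .
\]
```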

Exercise 1 Let {G} be a locally compact group. Show that if {d} is a left-invariant metric on {G} obeying the escape property (1) that generates the topology, then {G} is NSS, and {\| g\|} is comparable to {\|g\|_{e,U}} for all sufficiently small {U}. (In particular, any two left-invariant metrics obeying the escape property and generating the topology are comparable to each other.)

Henceforth {G} is a locally compact NSS group.

Proposition 12 (Approximate triangle inequality) Let {U_0} be a sufficiently small open neighbourhood of the identity. Then for any {n} and any {g_1,\ldots,g_n \in G}, one has

\displaystyle  \| g_1 \ldots g_n \|_{e,U_0} \ll \sum_{i=1}^n \|g_i\|_{e,U_0}

(where the implied constant can depend on {U_0}).

Of course, in view of (12), the exact choice of {U_0} is irrelevant, so long as it is small. It is slightly convenient to take {U_0} to be symmetric (thus {U_0 = U_0^{-1}}), so that {\|g\|_{e,U_0} = \|g^{-1}\|_{e,U_0}} for all {g}.

Proof: We will use a bootstrap argument. Assume to start with that we somehow already have a weaker form of the conclusion, namely

\displaystyle  \| g_1 \ldots g_n \|_{e,U_0} \leq M \sum_{i=1}^n \|g_i\|_{e,U_0} \ \ \ \ \ (13)

for all {n,g_1,\ldots,g_n} and some huge constant {M}, and deduce the same estimate with a smaller value of {M}. Afterwards we will show how to remove the hypothesis (13).

Now suppose we have (13) for some {M}. Motivated by the argument in the previous section, we now try to convolve together two “Lipschitz” functions. For this, we will need some metric-like functions. Define the modified escape norm {\|g\|_{*,U_0}} by the formula

\displaystyle  \|g\|_{*,U_0} := \inf \{ \sum_{i=1}^n \|g_i\|_{e,U_0}: g = g_1 \ldots g_n \}

where the infimum is over all possible ways to split {g} as a finite product of group elements. From (13), we have

\displaystyle  \frac{1}{M}\|g\|_{e,U_0} \leq \|g\|_{*,U_0} \leq \|g\|_{e,U_0} \ \ \ \ \ (14)

and we have the triangle inequality

\displaystyle  \|gh\|_{*,U_0} \leq \|g\|_{*,U_0} + \|h\|_{*,U_0}

for any {g,h \in G}. We also have the symmetry property {\|g\|_{*,U_0} = \|g^{-1} \|_{*,U_0}}. Thus {\| \|_{*,U_0}} gives a left-invariant semi-metric on {G} by defining

\displaystyle  \hbox{dist}_{*,U_0}(g,h) := \|g^{-1} h \|_{*,U_0}.

We can now define a “Lipschitz” function {\psi: G \rightarrow {\bf R}} by setting

\displaystyle  \psi(x) := (1 - M \hbox{dist}_{*,U_0}(x, U_0))_+.

On the one hand, we see from (14) that this function takes values in {[0,1]} and obeys the Lipschitz bound

\displaystyle  |\partial_g \psi(x)| \leq M \|g\|_{e,U_0} \ \ \ \ \ (15)

for any {g, x \in G}. On the other hand, it is supported in the region where {\hbox{dist}_{*,U_0}(x,U_0) \leq 1/M}, which by (14) (and (11)) is contained in {U_0^2}.

We could convolve {\psi} with itself in analogy to the preceding section, but in doing so, we will eventually end up establishing a much worse estimate than (13) (in which the constant {M} is replaced with something like {M^2}). Instead, we will need to convolve {\psi} with another function {\eta}, that we define as follows. We will need a large natural number {L} (independent of {M}) to be chosen later, then a small open neighbourhood {U_1 \subset U_0} of the identity (depending on {L, U_0}) to be chosen later. We then let {\eta: G \rightarrow {\bf R}} be the function

\displaystyle  \eta(x) := \sup \{ 1 - \frac{j}{L}: x \in U_1^j U_0; j = 0,\ldots,L \} \cup \{0\}.

Similarly to {\psi}, we see that {\eta} takes values in {[0,1]} and obeys the Lipschitz-type bound

\displaystyle  |\partial_g \eta(x)| \leq \frac{1}{L} \ \ \ \ \ (16)

for all {g \in U_1} and {x \in G}. Also, {\eta} is supported in {U_1^L U_0}, and hence (if {U_1} is sufficiently small depending on {L,U_0}) is supported in {U_0^2}, just as {\psi} is.

The functions {\psi, \eta} need not be continuous, but they are compactly supported, bounded, and Borel measurable, and so one can still form their convolution {\phi := \psi * \eta}, which will then be continuous and compactly supported; indeed, {\phi} is supported in {U_0^4}.

We have a lower bound on how big {\phi} is, since

\displaystyle  \phi(0) \geq \mu(U_0) \gg 1

(where we allow implied constants to depend on {\mu, U_0}, but remain independent of {L}, {U_1}, or {M}). This gives us a way to compare {\| \|_{\phi}} with {\| \|_{e,U_0}}. Indeed, if {n \|g\|_{\phi} < \phi(0)}, then (as in the proof of Claim 1 in the previous section) we have {g^n \in U_0^8}; this implies that

\displaystyle  \| g \|_{e,U_0^8} \ll \| g \|_{\phi}

for all {g \in G}, and hence by (12) we have

\displaystyle  \| g \|_{e,U_0} \ll \| g \|_{\phi} \ \ \ \ \ (17)

also. In the converse direction, we have

\displaystyle  \|g\|_\phi = \| \partial_g (\psi * \eta) \|_{C_c(G)}

\displaystyle  = \| (\partial_g \psi) * \eta \|_{C_c(G)}

\displaystyle  \ll M \|g\|_{e,U_0} \ \ \ \ \ (18)

thanks to (15). But we can do better than this, as follows. For any {g, h \in G}, we have the analogue of (10), namely

\displaystyle  \partial_g \partial_h \phi(x) = \int_G (\partial_h \psi)(y) (\partial_{g^y} \eta)(y^{-1} x)\ d\mu(y)

If {h \in U_0}, then the integrand vanishes unless {y \in U_0^3}. By continuity, we can find a small open neighbourhood {U_2 \subset U_1} of the identity such that {g^y \in U_1} for all {g \in U_2} and {y \in U_0^3}; we conclude from (15), (16) that

\displaystyle  |\partial_g \partial_h \phi(x)| \ll \frac{M}{L} \|h\|_{e,U_0}.

whenever {h \in U_0} and {g \in U_2}. To use this, we apply (5) and conclude that

\displaystyle  \|g^n\|_\phi = n \|g\|_\phi + O( n \frac{M}{L} \|g\|_{e,U_0} )

whenever {n \geq 1} and {g,\ldots,g^n \in U_2}. Using the trivial bound {\|g^n\|_\phi = O(1)}, we then have

\displaystyle  \|g\|_\phi \ll \frac{1}{n} + \frac{M}{L} \|g\|_{e,U_0};

optimising in {n} we obtain

\displaystyle  \|g\|_\phi \ll \|g\|_{e,U_2} + \frac{M}{L} \|g\|_{e,U_0}

and hence by (12)

\displaystyle \|g\|_\phi \ll (\frac{M}{L} + O_{U_2}(1)) \|g\|_{e,U_0}

where the implied constant in {O_{U_2}(1)} can depend on {U_0,U_1,U_2, L}, but is crucially independent of {M}. Note the essential gain of {\frac{1}{L}} here compared with (18). We also have the norm inequality

\displaystyle  \|g_1 \ldots g_n \|_\phi \leq \sum_{i=1}^n \|g_i\|_\phi.

Combining these inequalities with (17) we see that

\displaystyle  \| g_1 \ldots g_n \|_{e,U_0} \ll (\frac{1}{L} M + O_{U_2}(1)) \sum_{i=1}^n \|g_i\|_{e,U_0}.

Thus we have improved the constant {M} in the hypothesis (13) to {O( \frac{1}{L} M ) + O_{U_2}(1)}. Choosing {L} large enough and iterating, we conclude that we can bootstrap any finite constant {M} in (13) to {O(1)}.
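The iteration can be made explicit as follows. This is a sketch in which {A} denotes the implied constant (independent of {L} and {M}) and {B} denotes the {O_{U_2}(1)} term (which may depend on {L} but not on {M}); both letters are our notation.

```latex
% One step of the bootstrap replaces M by M' \le A (M/L + B). Choosing L \ge 2A gives
\[
  M' \le \tfrac12 M + AB,
\]
% and after k iterations, starting from M_0 = M,
\[
  M_k \le 2^{-k} M + AB \sum_{j=0}^{k-1} 2^{-j} \le 2^{-k} M + 2AB .
\]
% Taking k \ge \log_2 M (for M \ge 1) shows that (13) holds with the constant 2AB + 1,
% which no longer depends on the initial value of M.
```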

Of course, there is no reason why there has to be a finite {M} for which (13) holds in the first place. However, one can rectify this by the usual trick of creating an epsilon of room. Namely, one replaces the escape norm {\| g \|_{e,U_0}} by, say, {\|g\|_{e,U_0}+\epsilon} for some small {\epsilon > 0} in the definition of {\| \|_{*,U_0}} and in the hypothesis (13). Then the bound (13) will be automatic with a finite {M} (of size about {O(1/\epsilon)}). One can then run the above argument with the requisite changes and conclude a bound of the form

\displaystyle  \| g_1 \ldots g_n \|_{e,U_0} \ll \sum_{i=1}^n (\|g_i\|_{e,U_0}+\epsilon)

uniformly in {\epsilon}; we omit the details. Sending {\epsilon \rightarrow 0}, we have thus shown Proposition 12. \Box

Now we can finish the proof of Proposition 9. Let {G} be a locally compact NSS group, and let {U_0} be a sufficiently small neighbourhood of the identity. From Proposition 12, we see that the escape norm {\| \|_{e,U_0}} and the modified escape norm {\| \|_{*,U_0}} are comparable. We have seen that the associated distance {d_{*,U_0} := \hbox{dist}_{*,U_0}} is a left-invariant semi-metric. As {G} is NSS and {U_0} is small, there are no non-identity elements with zero escape norm, and hence no non-identity elements with zero modified escape norm either; thus {d_{*,U_0}} is a genuine metric.

We now claim that {d_{*,U_0}} generates the topology of {G}. Given the left-invariance of {d_{*,U_0}}, it suffices to establish two things: firstly, that any open neighbourhood of the identity contains a ball around the identity in the {d_{*,U_0}} metric; and conversely, any such ball contains an open neighbourhood around the identity.

To prove the first claim, let {U} be an open neighbourhood around the identity, and let {U' \subset U} be a smaller neighbourhood of the identity. From (12) we see (if {U'} is small enough) that {\| \|_{*,U_0}} is comparable to {\| \|_{e,U'}}, and {U'} contains a small ball around the origin in the {d_{*,U_0}} metric, giving the claim. To prove the second claim, consider a ball {B(0,r)} in the {d_{*,U_0}} metric. For any positive integer {m}, we can find an open neighbourhood {U_m} of the identity such that {U_m^m \subset U_0}, and hence {\|g\|_{e,U_0} \leq \frac{1}{m}} for all {g \in U_m}. For {m} large enough, this implies that {U_m \subset B(0,r)}, and the claim follows.

To finish the proof of Proposition 9, we need to verify the escape property (1). Thus, we need to show that if {g \in G}, {n \geq 1} are such that {n \|g\|_{*,U_0}} is sufficiently small, then we have {\|g^n\|_{*,U_0} \gg n \|g\|_{*,U_0}}. We may of course assume that {g} is not the identity, as the claim is trivial otherwise. As {\|\|_{*,U_0}} is comparable to {\| \|_{e,U_0}}, we know that there exists a natural number {m \ll 1 / \| g \|_{*,U_0}} such that {g^m \not \in U_0}. Let {U_1} be a neighbourhood of the identity small enough that {U_1^2 \subset U_0}. We have {\|g^i\|_{*,U_0} \leq n \|g\|_{*,U_0}} for all {i=1,\ldots,n}, so {g^i \in U_1} and hence {m > n}. Let {m+i} be the first multiple of {n} larger than {m}, then {i \leq n} and so {g^i \in U_1}. Since {g^m \not \in U_0}, this implies {g^{m+i} \not \in U_1}. Since {m+i} is divisible by {n}, we conclude that {\| g^n \|_{e,U_1} \geq \frac{n}{m+i} \gg n \| g \|_{*,U_0}}, and the claim follows from (12).

— 3. From subgroup trapping to NSS —

We now turn to the task of proving Proposition 8. Intuitively, the idea is to use the subgroup trapping property to find a small compact normal subgroup {N} that contains {Q(V)} for some small {V}, and then quotient this group out to get an NSS group. Unfortunately, because {N} is not necessarily contained in {V}, this quotienting operation may create some additional small subgroups. To fix this, we need to pass from the compact subgroup {N} to a smaller one. In order to understand the subgroups of compact groups, the main tool will be the Peter-Weyl theorem. Actually, we will just need the following weak version of that theorem:

Theorem 13 (Weak Peter-Weyl theorem) Let {G} be a compact group, and let {U} be a neighbourhood of the identity in {G}. Then there exists a finite-dimensional real linear representation {\rho: G \rightarrow GL(V)} of {G} (i.e. a continuous homomorphism from {G} to the general linear group {GL(V)} of a finite-dimensional real vector space {V}) whose kernel {\hbox{ker}(\rho)} lies in {U}. Equivalently, there exists a compact normal subgroup {H} of {G} contained in {U} such that {G/H} is isomorphic to a compact subgroup of {GL(V)}.

Proof: As {G} is compact, it has a Haar probability measure {\mu}. Let {W} be a symmetric open neighbourhood of the identity such that {W^2 \subset U}. The convolution operator {T: L^2(G) \rightarrow L^2(G)} given by {Tf := f * 1_W} is a self-adjoint integral operator on a probability space with bounded measurable kernel and is thus compact (indeed, it is a Hilbert-Schmidt integral operator). By the spectral theorem, {L^2(G)} then decomposes as the orthogonal sum of the eigenspaces of {T}, with all the eigenspaces {V_\lambda} corresponding to non-zero eigenvalues {\lambda} being finite-dimensional.

Note that {T} commutes with the left translation operators {\tau_g} for every {g \in G}, so all of the eigenspaces {V_\lambda} are invariant with respect to this action, and so we have finite-dimensional linear representations {\rho_\lambda: G \rightarrow GL(V_\lambda)} for each non-zero eigenvalue {\lambda}.

Let {g \in G \backslash U}, then {\tau_g T 1_W \neq T 1_W} (the supports are disjoint). The function {T1_W} lies in the direct sum of the {V_\lambda} with {\lambda} non-zero, and so there must exist at least one {V_\lambda} such that the projections of {T1_W} and {\tau_g T 1_W} to {V_\lambda} are distinct. We conclude that {\rho_\lambda(g)} is non-trivial for this {\lambda} and {g}; by continuity, the same is true for all {g'} in an open neighbourhood of {g}. By compactness of {G \backslash U}, we may thus find a finite number {\lambda_1,\ldots,\lambda_k} of non-zero eigenvalues such that for each {g \in G \backslash U}, {\rho_{\lambda_i}(g)} is non-trivial for at least one {i=1,\ldots,k}. The representation {\rho := \rho_{\lambda_1} \oplus \ldots \oplus \rho_{\lambda_k}} can then be seen to have all the required properties. \Box

For us, the main reason why we need the Peter-Weyl theorem is that the linear groups {GL(V)} automatically have the NSS property, even though {G} need not. Thus, one can view Theorem 13 as giving the compact case of Theorem 4.

We now prove Proposition 8, using an argument of Yamabe. Let {G} be a locally compact group with the subgroup trapping property, and let {U} be an open neighbourhood of the identity. We may find a smaller neighbourhood {U_1} of the identity with {U_1^2 \subset U}, which in particular implies that {\overline{U_1} \subset U}; by shrinking {U_1} if necessary, we may assume that {\overline{U_1}} is compact. By the subgroup trapping property, one can find an open neighbourhood {U_2} of the identity such that {\langle Q(U_2) \rangle} is contained in {U_1}, and thus {H := \overline{\langle Q(U_2) \rangle}} is a compact subgroup of {G} contained in {U_1}. By shrinking {U_2} if necessary we may assume {U_2 \subset U_1}.

Ideally, if {H} were normal and contained in {U_2}, then the quotient group {G/H} would have the NSS property. Unfortunately {H} need not be normal, and need not be contained in {U_2}, but we can fix this as follows. Applying Theorem 13, we can find a compact normal subgroup {N} of {H} contained in {U_2 \cap H} such that {H/N} is isomorphic to a linear group, and in particular is NSS. In particular, we can find an open symmetric neighbourhood {U_3} of the identity in {G} such that {U_3 N U_3 \subset U_2} and that the quotient space {\pi(U_3 N U_3 \cap H)} has no non-trivial subgroups in {H/N}, where {\pi: H \rightarrow H/N} is the quotient map.

We now claim that {N} is normalised by {U_3}. Indeed, if {g \in U_3}, then the conjugate {N^g := g^{-1} N g} of {N} is contained in {U_3 N U_3} and hence in {U_2}. As {N^g} is a group, it must thus be contained in {Q(U_2)} and hence in {H}. But then {\pi(N^g)} is a subgroup of {H/N} that is contained in {\pi(U_3 N U_3 \cap H)}, and is hence trivial by construction. Thus {N^g \subset N}, and so {N} is normalised by {U_3}. If we then let {G'} be the subgroup of {G} generated by {N} and {U_3}, we see that {G'} is an open subgroup of {G}, with {N} a compact normal subgroup of {G'}.

To finish the job, we need to show that {G'/N} has the NSS property. It suffices to show that {U_3 N U_3 / N} has no nontrivial subgroups. But any subgroup in {U_3 N U_3 / N} pulls back to a subgroup in {U_3 N U_3}, hence in {U_2}, hence in {Q(U_2)}, hence in {H}; since {(U_3 N U_3 \cap H)/N} has no nontrivial subgroups, the claim follows.

— 4. From metrisable to subgroup trapping —

We now perform the most difficult step, which is to establish Proposition 7. This step will require both the weak Peter-Weyl theorem (Theorem 13) and the Gleason technology, as well as some of the basic theory of Hausdorff distance; as such, this is perhaps the most “infinitary” of all the steps in the argument.

The Gleason-type arguments can be encapsulated in the following proposition, which is a weak version of the subgroup trapping property:

Proposition 14 (Finite trapping) Let {G} be a locally compact group, let {U} be an open neighbourhood of the identity, and let {m \geq 1} be an integer. Then there exists an open neighbourhood {V} of the identity with the following property: if {Q \subset Q[V]} is a symmetric set containing the identity, and {n \geq 1} is such that {Q^n \subset U}, then {Q^{mn} \subset U^8}.

Informally, Proposition 14 asserts that subsets of {Q[V]} grow much more slowly than “large” sets such as {U}. We remark that if one could replace {U^8} in the conclusion here by {U}, then a simple induction on {n} (after first shrinking {V} to lie in {U}) would give Proposition 7. It is the loss of {8} in the exponent that necessitates some non-trivial additional arguments.

Proof: Let {V} be small enough to be chosen later, and let {Q, n} be as in the proposition. Once again we will convolve together two “Lipschitz” functions {\psi, \eta} to obtain a good bump function {\phi = \psi*\eta} which generates a useful metric for analysing the situation. The first bump function {\psi: G \rightarrow {\bf R}} will be defined by the formula

\displaystyle  \psi(x) := \sup \{ 1 - \frac{j}{n}: x \in Q^j U; j = 0,\ldots,n \} \cup \{0\}.

Then {\psi} takes values in {[0,1]}, equals {1} on {U}, is supported in {U^2}, and obeys the Lipschitz type property

\displaystyle  |\partial_q \psi(x)| \leq \frac{1}{n} \ \ \ \ \ (19)

for all {q \in Q}. The second bump function {\eta: G \rightarrow {\bf R}} is similarly defined by the formula

\displaystyle  \eta(x) := \sup \{ 1 - \frac{j}{M}: x \in (V^{U^4})^j U; j = 0,\ldots,M \} \cup \{0\},

where {V^{U^4} := \{ g^{-1} x g: x \in V, g \in U^4 \}} and {M} is a quantity depending on {m} and {U} to be chosen later. If {V} is small enough depending on {U} and {m}, then {(V^{U^4})^M \subset U}, and so {\eta} also takes values in {[0,1]}, equals {1} on {U}, is supported in {U^2}, and obeys the Lipschitz type property

\displaystyle  |\partial_g \eta(x)| \leq \frac{1}{M} \ \ \ \ \ (20)

for all {g \in V^{U^4}}.
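To get a feel for the Lipschitz property (19), here is a minimal numerical sketch in the simplest possible setting. It is purely illustrative and plays no role in the proof. We assume {G} is the additive group {{\bf R}}, with {U = (-1,1)} and {Q = [-\delta,\delta]}, so that {Q^j U} becomes the interval {(-1-j\delta, 1+j\delta)}. The names `psi` and `max_variation`, the grid spacing `step`, and the convention that {\partial_q \psi(x)} is (up to sign) {\psi(x) - \psi(x-q)} are all assumptions made for this sketch.

```python
from fractions import Fraction
import math

def psi(x, delta, n):
    """psi(x) = sup({1 - j/n : x in jQ + U, 0 <= j <= n} ∪ {0})
    in the toy model G = (R, +), U = (-1, 1), Q = [-delta, delta]."""
    a = abs(x)
    if a < 1:
        return Fraction(1)          # x already lies in U (the case j = 0)
    # Least j with a < 1 + j*delta, i.e. least integer j > (a - 1)/delta.
    j = math.floor((a - 1) / delta) + 1
    return 1 - Fraction(j, n) if j <= n else Fraction(0)

def max_variation(delta, n, step):
    """Largest |psi(x) - psi(x - q)| over grid points x in [-3, 3], q in Q."""
    xs = [k * step for k in range(-int(3 / step), int(3 / step) + 1)]
    qs = [k * step for k in range(-int(delta / step), int(delta / step) + 1)]
    return max(abs(psi(x, delta, n) - psi(x - q, delta, n))
               for x in xs for q in qs)

if __name__ == "__main__":
    delta, n = Fraction(1, 10), 10   # chosen so that nQ is contained in the closure of U
    print(max_variation(delta, n, Fraction(1, 100)))
```

Exact rational arithmetic is used so that the comparison with {1/n} is not disturbed by rounding at the boundaries of the intervals {(-1-j\delta, 1+j\delta)}.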

Now let {\phi := \psi * \eta}. Then {\phi} is supported on {U^4} and {\| \phi \|_{C_c(G)} \gg 1} (where implied constants can depend on {U}, {\mu}). As before, we conclude that {g \in U^8} whenever {\|g\|_\phi} is sufficiently small.

Now suppose that {q \in Q} (so that in particular {q \in Q[V]}); we will estimate {\|q\|_\phi}. From (5) one has

\displaystyle  \|q\|_\phi \ll \frac{1}{n} \| q^n \|_\phi + \sup_{0 \leq i \leq n} \| \partial_{q^i} \partial_{q} \phi \|_{C_c(G)}

(note that {\partial_{q^i}} and {\partial_q} commute). For the first term, we can compute

\displaystyle  \| q^n \|_\phi = \sup_x |\partial_{q^n} (\psi * \eta)(x)|

and

\displaystyle  \partial_{q^n} (\psi * \eta)(x) = \int_G \psi(y) \partial_{(q^n)^y} \eta(y^{-1} x) d\mu(y).

Since {q \in Q[V]}, {q^n \in V}, so by (20) we conclude that

\displaystyle  \| q^n \|_\phi \ll \frac{1}{M}.

For the second term, we similarly expand

\displaystyle  \partial_{q^i} \partial_{q} \phi(x) = \int_G (\partial_q \psi)(y) \partial_{(q^i)^y} \eta(y^{-1} x) d\mu(y).

Using (20), (19) we conclude that

\displaystyle  |\partial_{q^i} \partial_{q} \phi(x)| \ll \frac{1}{Mn}.

Putting this together we see that

\displaystyle  \|q\|_\phi \ll \frac{1}{Mn}

for all {q \in Q}, which by the triangle inequality implies that

\displaystyle  \| g \|_\phi \ll \frac{m}{M}

for all {g \in Q^{mn}}. For {M} sufficiently large, this gives {Q^{mn} \subset U^8} as required. \Box

We will also need the following compactness result in the Hausdorff distance

\displaystyle  d_H( E, F ) := \max( \sup_{x \in E} \hbox{dist}(x,F), \sup_{y \in F} \hbox{dist}(E, y) )

between two non-empty closed subsets {E, F} of a metric space {(X,d)}.

Example 1 In {{\bf R}} with the usual metric, the finite sets {\{ \frac{i}{n}: i=1,\ldots,n\}} converge in Hausdorff distance to the closed interval {[0,1]} as {n \rightarrow \infty}.
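Example 1 can be checked by hand, but here is a short illustrative sketch that computes the Hausdorff distance directly from the definition. The function names `hausdorff` and `example_sets` and the parameter `refine` are choices made for this sketch only. As a further assumption, the interval {[0,1]} is replaced by a fine grid containing the finite set; since the point of {[0,1]} furthest from {\{ \frac{i}{n}: i=1,\ldots,n\}} is {0}, which lies on the grid, the computed value agrees with the true distance {\frac{1}{n}}.

```python
from fractions import Fraction

def dist_to_set(x, S):
    """Distance from the point x to the finite non-empty set S."""
    return min(abs(x - s) for s in S)

def hausdorff(E, F):
    """Hausdorff distance between two finite non-empty subsets of R."""
    return max(max(dist_to_set(x, F) for x in E),
               max(dist_to_set(y, E) for y in F))

def example_sets(n, refine=20):
    """Return E = {i/n : 1 <= i <= n} and a grid F standing in for [0, 1].

    The grid has spacing 1/(n * refine), so it contains E and the point 0.
    """
    E = [Fraction(i, n) for i in range(1, n + 1)]
    N = n * refine
    F = [Fraction(k, N) for k in range(N + 1)]
    return E, F

if __name__ == "__main__":
    for n in (1, 2, 5, 10):
        E, F = example_sets(n)
        print(n, hausdorff(E, F))
```

The distance is attained at the left endpoint {0}, whose nearest neighbour in the finite set is {\frac{1}{n}}; every other point of {[0,1]} is within {\frac{1}{2n}} of the finite set.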

Lemma 15 The space {K(X)} of non-empty closed subsets of a compact metric space {X} is itself a compact metric space (with the Hausdorff distance as the metric).

Proof: It is easy to see that the Hausdorff distance is indeed a metric on {K(X)}, and that this metric is complete. The total boundedness of {X} easily implies the total boundedness of {K(X)} (indeed, once one can cover {X} by the {\epsilon}-neighbourhood of a finite set {F}, one can cover {K(X)} by the {2\epsilon}-neighbourhood of {K(F)}, by “rounding” off any closed subset of {X} to the nearest subset of {F}). The claim then follows from the Heine-Borel theorem. \Box

Now we can prove Proposition 7. Let {G} be a locally compact group endowed with some metric {d}, and let {U} be an open neighbourhood of the identity; by shrinking {U} we may assume that {U} is precompact. Let {V_i} be a sequence of balls around the identity with radius going to zero; then {Q[V_i]} is a symmetric set in {V_i} that contains the identity. If, for some {i}, {Q[V_i]^n \subset U} for every {n}, then {\langle Q[V_i] \rangle \subset U} and we are done. Thus, we may assume for sake of contradiction that for each {i} there exists {n_i} such that {Q[V_i]^{n_i} \subset U} and {Q[V_i]^{n_i + 1} \not \subset U}; since the radii of the {V_i} go to zero, we have {n_i \rightarrow \infty}. By Proposition 14, we can also find {m_i \rightarrow \infty} such that {Q[V_i]^{m_i n_i} \subset U^8}.

The sets {\overline{Q[V_i]}^{n_i}} are closed subsets of {\overline{U}}; by Lemma 15, we may pass to a subsequence and assume that they converge to some closed subset {E} of {\overline{U}}. Since the {Q[V_i]} are symmetric and contain the identity, {E} is also symmetric and contains the identity. For any fixed {m}, we have {Q[V_i]^{m n_i} \subset U^8} for all sufficiently large {i}, which on taking Hausdorff limits implies that {E^m \subset \overline{U^8}}. In particular, the group {H := \overline{\langle E \rangle}} is a compact subgroup of {G} contained in {\overline{U^8}}.

Let {U_1} be a small neighbourhood of the identity in {G} to be chosen later. By Theorem 13, we can find a normal subgroup {N} of {H} contained in {U_1 \cap H} such that {H/N} is NSS. Let {B} be a neighbourhood of the identity in {H/N} so small that {B^{10}} contains no non-trivial subgroups. A compactness argument then shows that there exists a natural number {k} such that for any {g \in H/N} that is not in {B}, at least one of {g, \ldots,g^k} must lie outside of {B^{10}}.

Now let {\epsilon > 0} be a small parameter. Since {Q[V_i]^{n_i+1} \not \subset U}, we see that {Q[V_i]^{n_i+1}} does not lie in the {\epsilon}-neighbourhood {\pi^{-1}(B)_\epsilon} of {\pi^{-1}(B)} if {\epsilon} is small enough, where {\pi: H \rightarrow H/N} is the projection map. Let {n'_i} be the first integer for which {Q[V_i]^{n'_i}} does not lie in {\pi^{-1}(B)_\epsilon}; then {n'_i \leq n_i+1} and {n'_i \rightarrow \infty} as {i \rightarrow \infty} (for fixed {\epsilon}). On the other hand, as {Q[V_i]^{n'_i-1} \subset \pi^{-1}(B)_\epsilon}, we see from another application of Proposition 14 that {Q[V_i]^{kn'_i} \subset (\pi^{-1}(B)_\epsilon)^8} if {i} is sufficiently large depending on {\epsilon}.

Furthermore, since {Q[V_i]^{n_i}} converges to a subset of {H} in the Hausdorff distance, we know that for {i} large enough, {Q[V_i]^{2n_i}} and hence {Q[V_i]^{n'_i}} is contained in the {\epsilon}-neighbourhood of {H}. Thus we can find an element {g_i} of {Q[V_i]^{n'_i}} that lies within {\epsilon} of a group element {h_i} of {H}, but does not lie in {\pi^{-1}(B)_\epsilon}; thus {h_i} lies inside {H \backslash \pi^{-1}(B)}. By construction of {B}, we can find {1 \leq j_i \leq k} such that {h^{j_i}_i} lies in {H \backslash \pi^{-1}(B^{10})}. But {h_i^{j_i}} also lies within {o(1)} of {g_i^{j_i}}, which lies in {Q[V_i]^{kn'_i}} and hence in {(\pi^{-1}(B)_\epsilon)^8}, where {o(1)} denotes a quantity depending on {\epsilon} that goes to zero as {\epsilon \rightarrow 0}. We conclude that {H \backslash \pi^{-1}(B^{10})} and {\pi^{-1}(B^8)} are separated by {o(1)}, which leads to a contradiction if {\epsilon} is sufficiently small (note that {\overline{\pi^{-1}(B^8)}} and {H \backslash \pi^{-1}(B^{10})} are compact and disjoint, and hence separated by a positive distance), and the claim follows.

— 5. From locally compact to metrisable —

We finally establish Proposition 6, which is actually one of the easier steps of the argument (because the conclusion is so weak). This argument is also due to Gleason. Let {G} be a locally compact group, and let {U} be an open neighbourhood of the identity. Let {U_0} be a symmetric precompact neighbourhood of the identity in {U}. We can then recursively construct a sequence

\displaystyle  U_0 \supset U_1 \supset U_2 \supset \ldots

of symmetric precompact neighbourhoods such that {(U_{n+1}^{U_0})^2 \subset U_n} for each {n \geq 0}. In particular

\displaystyle  U_{n+1} \subset \overline{U_{n+1}} \subset U_{n+1}^2 \subset U_n.

If we then form

\displaystyle  N := \bigcap_n U_n = \bigcap_n \overline{U_n}

then {N} is compact, symmetric, contains the identity, and {N^2=N}; thus {N} is a compact subgroup of {G}. Also, since {U_{n+1}^{U_0} \subset U_n}, we have {N^{U_0} \subset N}, thus {N} is normalised by {U_0}. Thus if {G'} is the group generated by {U_0}, then {G'} is an open subgroup of {G} and {N} is a compact normal subgroup of {G'}.

Let {\pi: G' \rightarrow G'/N} be the quotient map; then we see that the {\pi(U_n)} are nested open sets with {\overline{\pi(U_n)}} compact and whose intersection is the identity. From this one easily verifies that they form a neighbourhood base of the identity for {G'/N}. Thus {G'/N} is first countable and Hausdorff, and thus metrisable by the Birkhoff-Kakutani theorem. As {G} is locally compact, {G'} and {G'/N} are also locally compact, and the claim follows.