
In the previous set of notes, we introduced the notion of an ultra approximate group – an ultraproduct ${A = \prod_{n \rightarrow\alpha} A_n}$ of finite ${K}$-approximate groups ${A_n}$ for some ${K}$ independent of ${n}$, where each ${K}$-approximate group ${A_n}$ may lie in a distinct ambient group ${G_n}$. Although these objects arise initially from the “finitary” objects ${A_n}$, it turns out that ultra approximate groups ${A}$ can be profitably analysed by means of infinitary groups ${L}$ (and in particular, locally compact groups or Lie groups ${L}$), by means of certain models ${\rho: \langle A \rangle \rightarrow L}$ of ${A}$ (or of the group ${\langle A \rangle}$ generated by ${A}$). We will define precisely what we mean by a model later, but as a first approximation one can view a model as a representation of the ultra approximate group ${A}$ (or of ${\langle A \rangle}$) that is “macroscopically faithful” in that it accurately describes the “large scale” behaviour of ${A}$ (or equivalently, that the kernel of the representation is “microscopic” in some sense). In the next section we will see how one can use “Gleason lemma” technology to convert this macroscopic control of an ultra approximate group into microscopic control, which will be the key to classifying approximate groups.

Models of ultra approximate groups can be viewed as the multiplicative combinatorics analogue of the better known concept of an ultralimit of metric spaces, which we briefly review below the fold as motivation.

The crucial observation is that ultra approximate groups enjoy a local compactness property which allows them to be usefully modeled by locally compact groups (and hence, through the Gleason-Yamabe theorem from previous notes, by Lie groups also). As in the Heine-Borel theorem, the local compactness will come from a combination of a completeness property and a local total boundedness property. The completeness property turns out to be a direct consequence of the countable saturation property of ultraproducts, thus illustrating one of the key advantages of the ultraproduct setting. The local total boundedness property is more interesting. Roughly speaking, it asserts that “large bounded sets” (such as ${A}$ or ${A^{100}}$) can be covered by finitely many translates of “small bounded sets” ${S}$, where “small” is in a topological group sense, implying in particular that large powers ${S^m}$ of ${S}$ lie inside a set such as ${A}$ or ${A^4}$. The easiest way to obtain such a property comes from the following lemma of Sanders:

Lemma 1 (Sanders lemma) Let ${A}$ be a finite ${K}$-approximate group in a (global) group ${G}$, and let ${m \geq 1}$. Then there exists a symmetric subset ${S}$ of ${A^4}$ with ${|S| \gg_{K,m} |A|}$ containing the identity such that ${S^m \subset A^4}$.

This lemma has an elementary combinatorial proof, and is the key to endowing an ultra approximate group with locally compact structure. There is also a closely related lemma of Croot and Sisask which can achieve similar results, and which will also be discussed below. (The locally compact structure can also be established more abstractly using the much more general methods of definability theory, as was first done by Hrushovski, but we will not discuss this approach here.)
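To see the conclusion of the Sanders lemma concretely in the trivial abelian case, one can take ${A}$ to be a symmetric interval in ${{\bf Z}}$ (a ${2}$-approximate group) and check by brute force that a shorter interval ${S}$ of size comparable to ${|A|/m}$ does the job. The following Python sketch is purely a sanity check of the statement on this toy example (the interval ${S}$ is an ad hoc choice for this case, not the output of the lemma's combinatorial proof):

```python
# Toy illustration of the Sanders lemma for A = {-N,...,N} in Z (a
# 2-approximate group): the interval S = {-floor(4N/m),...,floor(4N/m)} is
# symmetric, contains 0, has size >> |A|/m, and satisfies S^m inside A^4.
# This checks the lemma's conclusion in the simplest case, not its proof.

def interval(r):
    return set(range(-r, r + 1))

def sumset(X, Y):
    return {x + y for x in X for y in Y}

def power(S, m):                       # S^m = S + ... + S (m times), additively
    P = {0}
    for _ in range(m):
        P = sumset(P, S)
    return P

N, m = 20, 5
A = interval(N)
A4 = power(A, 4)                       # A^4 = {-4N, ..., 4N}
S = interval(4 * N // m)               # candidate "Sanders set"

assert S <= A4 and power(S, m) <= A4   # S lies in A^4 and S^m stays in A^4
assert m * len(S) >= len(A)            # |S| is >> |A| (here with factor 1/m)
print(len(A), len(S))
```

Of course, the whole content of the lemma is that such an ${S}$ exists for an arbitrary approximate group, with no commutativity or interval structure assumed.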

By combining the locally compact structure of ultra approximate groups ${A}$ with the Gleason-Yamabe theorem, one ends up being able to model a large “ultra approximate subgroup” ${A'}$ of ${A}$ by a Lie group ${L}$. Such Lie models serve a number of important purposes in the structure theory of approximate groups. Firstly, as all Lie groups have a dimension which is a natural number, they allow one to assign a natural number “dimension” to ultra approximate groups, which opens up the ability to perform “induction on dimension” arguments. Secondly, Lie groups have an escape property (which is in fact equivalent to the no small subgroups property): if a group element ${g}$ lies outside of a very small ball ${B_\epsilon}$, then some power ${g^n}$ of it will escape a somewhat larger ball ${B_1}$. Or equivalently: if a long orbit ${g, g^2, \ldots, g^n}$ lies inside the larger ball ${B_1}$, one can deduce that the original element ${g}$ lies inside the small ball ${B_\epsilon}$. Because all Lie groups have this property, we will be able to show that all ultra approximate groups ${A}$ “essentially” have a similar property, in that they are “controlled” by a nearby ultra approximate group which obeys a number of escape-type properties analogous to those enjoyed by small balls in a Lie group, and which we will call a strong ultra approximate group. This will be discussed in the next set of notes, where we will also see how these escape-type properties can be exploited to create a metric structure on strong approximate groups analogous to the Gleason metrics studied in previous notes, which can in turn be exploited (together with an induction on dimension argument) to fully classify such approximate groups (in the finite case, at least).

There are some cases where the analysis is particularly simple. For instance, in the bounded torsion case, one can show that the associated Lie model ${L}$ is necessarily zero-dimensional, which allows for an easy classification of approximate groups of bounded torsion.

Some of the material here is drawn from my recent paper with Ben Green and Emmanuel Breuillard, which is in turn inspired by a previous paper of Hrushovski.

Emmanuel Breuillard, Ben Green, and I have just uploaded to the arXiv our paper “The structure of approximate groups“, submitted to Pub. IHES. We had announced the main results of this paper in various forums (including this blog) for a few months now, but it had taken some time to fully write up the paper and put in various refinements and applications.

As announced previously, the main result of this paper is a (virtually, qualitatively) complete description of finite approximate groups in an arbitrary (local or global) group ${G}$. For simplicity let us work in the much more familiar setting of global groups, although our results also apply (but are a bit more technical to state) in the local group setting.

Recall that in a global group ${G = (G,\cdot)}$, a ${K}$-approximate group is a symmetric subset ${A}$ of ${G}$ containing the identity, with the property that the product set ${A \cdot A}$ is covered by ${K}$ left-translates of ${A}$. Examples of ${O(1)}$-approximate groups include genuine groups, convex bodies in a bounded dimensional vector space, small balls in a bounded dimensional Lie group, large balls in a discrete nilpotent group of bounded rank or step, or generalised arithmetic progressions (or more generally, coset progressions) of bounded rank in an abelian group. Specialising now to finite approximate groups, a key example of such a group is what we call a coset nilprogression: a set of the form ${\pi^{-1}(P)}$, where ${\pi: G' \rightarrow N}$ is a homomorphism with finite kernel from a subgroup ${G'}$ of ${G}$ to a nilpotent group ${N}$ of bounded step, and ${P = P(u_1,\ldots,u_r;N_1,\ldots,N_r)}$ is a nilprogression with a bounded number of generators ${u_1,\ldots,u_r}$ in ${N}$ and some lengths ${N_1,\ldots,N_r \gg 1}$, where ${P(u_1,\ldots,u_r;N_1,\ldots,N_r)}$ consists of all the words involving at most ${N_1}$ copies of ${u_1^{\pm 1}}$, ${N_2}$ copies of ${u_2^{\pm 1}}$, and so forth up to ${N_r}$ copies of ${u_r^{\pm 1}}$. One can show (by some nilpotent algebra) that all such coset nilprogressions are ${O(1)}$-approximate groups so long as the step and the rank ${r}$ are bounded (and if ${N_1,\ldots,N_r}$ are sufficiently large).
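As a quick sanity check of the definition, one can verify the simplest abelian example on a computer: a proper rank-2 “box” progression in ${{\bf Z}^2}$ is a ${4}$-approximate group, with ${A+A}$ covered by the four translates of ${A}$ centred at the corners. (This toy example and the choice of translates are ours for illustration, not drawn from the paper.)

```python
# Sanity check: a rank-2 box progression in Z^2 is a 4-approximate group,
# since A + A is covered by 4 translates of A. (Illustrative toy example.)
from itertools import product

N1, N2 = 7, 5
A = {(a, b) for a in range(-N1, N1 + 1) for b in range(-N2, N2 + 1)}
AA = {(x1 + y1, x2 + y2) for (x1, x2) in A for (y1, y2) in A}

assert (0, 0) in A and A == {(-a, -b) for (a, b) in A}  # symmetric, contains identity

# the four translates of A centred at the corners (±N1, ±N2) cover A + A
cover = set()
for (c1, c2) in product((-N1, N1), (-N2, N2)):
    cover |= {(c1 + a, c2 + b) for (a, b) in A}
assert AA <= cover
print(len(A), len(AA))
```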

Our main theorem (which was essentially conjectured independently by Helfgott and by Lindenstrauss) asserts, roughly speaking, that coset nilprogressions are essentially the only examples of approximate groups.

Theorem 1 Let ${A}$ be a ${K}$-approximate group. Then ${A^4}$ contains a coset nilprogression ${P}$ of rank and step ${O_K(1)}$, such that ${A}$ can be covered by ${O_K(1)}$ left-translates of ${P}$.

In the torsion-free abelian case, this result is essentially Freiman’s theorem (with an alternate proof by Ruzsa); for the general abelian case, it is due to Green and Ruzsa. Various partial results in this direction for some other groups (e.g. free groups, nilpotent groups, solvable groups, or simple groups of Lie type) are also known; see these previous blog posts for a summary of several of these results.

This result has a number of applications to geometric group theory, and in particular to variants of Gromov’s theorem on groups of polynomial growth, which asserts that a finitely generated group is of polynomial growth if and only if it is virtually nilpotent. The connection lies in the fact that if the balls ${B_S(R)}$ associated to a finite set of generators ${S}$ have polynomial growth, then some simple volume-packing arguments combined with the pigeonhole principle will show that ${B_S(R)}$ will end up being an ${O(1)}$-approximate group for many radii ${R}$. In fact, since our theorem only needs a single approximate group to obtain virtually nilpotent structure, we are able to obtain some new strengthenings of Gromov’s theorem. For instance, if ${A}$ is any ${K}$-approximate group in a finitely generated group ${G}$ that contains ${B_S(R_0)}$ for some set of generators ${S}$ and some ${R_0}$ that is sufficiently large depending on ${K}$, our theorem implies that ${G}$ is virtually nilpotent, answering a question of Petrunin. Among other things, this gives an alternate proof of a recent result of Kapovitch and Wilking (see also this previous paper of Cheeger and Colding) that a compact manifold of bounded diameter and Ricci curvature at least ${-\epsilon}$ necessarily has a virtually nilpotent fundamental group if ${\epsilon}$ is sufficiently small (depending only on dimension). The main point here is that no lower bound on the injectivity radius is required. Another application is a “Margulis-type lemma”, which asserts that if a metric space ${X}$ has “bounded packing” (in the sense that any ball of radius (say) ${4}$ is covered by a bounded number of balls of radius ${1}$), and ${\Gamma}$ is a group of isometries on ${X}$ that acts discretely (i.e. every orbit has only finitely many elements (counting multiplicity) in each bounded set), then the near-stabiliser ${\{ \gamma \in \Gamma: d(\gamma x, x) \leq \epsilon \}}$ of a point ${x}$ is virtually nilpotent if ${\epsilon}$ is small enough depending on the packing constant.
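The bounded-doubling phenomenon behind the volume-packing heuristic can be seen numerically: in ${{\bf Z}^2}$ the ratio ${|B_S(2R)|/|B_S(R)|}$ stays bounded (tending to ${4}$), whereas in a free group it grows exponentially. The sketch below is a toy illustration only (the closed formula for ball sizes in the free group ${F_k}$ is the standard one):

```python
# Toy illustration of the volume-packing heuristic (not the actual argument):
# word balls in Z^2 have bounded doubling |B(2R)|/|B(R)|, while balls in the
# free group F_2 do not.

def ball_Z2(R):
    # l^1 ball of radius R: words of length <= R in the generators (±1,0), (0,±1)
    return {(x, y) for x in range(-R, R + 1)
                   for y in range(-R, R + 1) if abs(x) + abs(y) <= R}

def ball_free(R, k=2):
    # |B(R)| in the free group F_k: 1 + 2k((2k-1)^R - 1)/(2k - 2)
    return 1 + 2 * k * ((2 * k - 1) ** R - 1) // (2 * k - 2)

for R in (5, 10, 20):
    assert len(ball_Z2(2 * R)) / len(ball_Z2(R)) < 5  # doubling constant -> 4

assert ball_free(20) / ball_free(10) > 1000           # exponential growth: no bounded doubling
print(len(ball_Z2(10)), ball_free(10))
```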

There are also some variants and refinements to the main theorem proved in the paper, such as an extension to local groups, and also an improvement on the bound on the rank and step from ${O_K(1)}$ to ${O(\log K)}$ (but at the cost of replacing ${A^4}$ in the theorem with ${A^{O(1)}}$).

I’ll be discussing the proof of the main theorem in detail in the next few lecture notes of my current graduate course. The full proof is somewhat lengthy (occupying about 50 pages of the 90-page paper), but can be summarised in the following steps:

1. (Hrushovski) Take an arbitrary sequence ${A_n}$ of finite ${K}$-approximate groups, and show that an appropriate limit ${A}$ of such groups can be “modeled” in some sense by an open bounded subset of a locally compact group. (The precise definition of “model” is technical, but “macroscopically faithful representation” is a good first approximation.) As discussed in the previous lecture notes, we use an ultralimit for this purpose; the paper of Hrushovski where this strategy was first employed also considered more sophisticated model-theoretic limits. To build a locally compact topology, Hrushovski used some tools from definability theory; in our paper, we instead use a combinatorial lemma of Sanders (closely related to a similar result of Croot and Sisask).
2. (Gleason-Yamabe) The locally compact group can in turn be “modeled” by a Lie group (possibly after shrinking the group, and thus the ultralimit ${A}$, slightly). (This result arose from the solution to Hilbert’s fifth problem, as discussed here. For our extension to local groups, we use a recent local version of the Gleason-Yamabe theorem, due to Goldbring.)
3. (Gleason) Using the escape properties of the Lie model, construct a norm ${\| \|}$ (and thus a left-invariant metric ${d}$) on the ultralimit approximate group ${A}$ (and also on the finitary groups ${A_n}$) that obeys a number of good properties, such as a commutator estimate ${\| [g,h]\| \ll \|g\| \|h\|}$. (This is modeled on an analogous construction used in the theory of Hilbert’s fifth problem, as discussed in this previous set of lecture notes.) This norm is essentially an escape norm associated to (a slight modification of) ${A}$ or ${A_n}$.
4. (Jordan-Bieberbach-Frobenius) We now take advantage of the finite nature of the ${A_n}$ by locating the non-trivial element ${e}$ of ${A_n}$ with minimal escape norm (but one has to first quotient out the elements of zero escape norm). The commutator estimate mentioned previously ensures that this element is essentially “central” in ${A_n}$. One can then quotient out a progression ${P(e;N)}$ generated by this central element (reducing the dimension of the Lie model by one in the process) and iterate the process until the dimension of the model drops to zero. Reversing the process, this constructs a coset nilprogression inside ${A_n^4}$. This argument is based on the classic proof of Jordan’s theorem due to Bieberbach and Frobenius, as discussed in this blog post.

One quirk of the argument is that it requires one to work in the category of local groups rather than global groups. (This is somewhat analogous to how, in the standard proofs of Freiman’s theorem, one needs to work with the category of Freiman homomorphisms, rather than group homomorphisms.) The reason for this arises when performing the quotienting step in the Jordan-Bieberbach-Frobenius leg of the argument. The obvious way to perform this step (and the thing that we tried first) would be to quotient out by the entire cyclic group ${\langle e \rangle}$ generated by the element ${e}$ of minimal escape norm. However, it turns out that this doesn’t work too well, because the group quotiented out is so “large” that it can create a lot of torsion in the quotient. In particular, elements which used to have positive escape norm can now become trapped in the quotient of ${A_n}$, thus sending their escape norm to zero. This leads to an inferior conclusion (in which a coset nilprogression is replaced by a more complicated tower of alternating extensions between central progressions and finite groups, similar to the towers encountered in my previous paper on this topic). To prevent this unwanted creation of torsion, one has to truncate the cyclic group ${\langle e \rangle}$ before it escapes ${A_n}$, so that one quotients out by a geometric progression ${P(e;N)}$ rather than the cyclic group. But the operation of quotienting out by a ${P(e;N)}$, which is a local group rather than a global one, cannot be formalised in the category of global groups, but only in the category of local groups. Because of this, we were forced to carry out the entire argument using the language of local groups. As it turns out, the arguments are ultimately more natural in this setting, although there is an initial investment of notation required, given that global group theory is much more familiar and well-developed than local group theory.

One interesting feature of the argument is that it does not use much of the existing theory of Freiman-type theorems, instead building the coset nilprogression directly from the geometric properties of the approximate group. In particular, our argument gives a new proof of Freiman’s theorem in the abelian case, which largely avoids Fourier analysis (except through the use of the theory of Hilbert’s fifth problem, which uses the Peter-Weyl theorem (or, in the abelian case, Pontryagin duality), which is basically a version of Fourier analysis).

In this set of notes we will be able to finally prove the Gleason-Yamabe theorem from Notes 0, which we restate here:

Theorem 1 (Gleason-Yamabe theorem) Let ${G}$ be a locally compact group. Then, for any open neighbourhood ${U}$ of the identity, there exists an open subgroup ${G'}$ of ${G}$ and a compact normal subgroup ${K}$ of ${G'}$ contained in ${U}$ such that ${G'/K}$ is isomorphic to a Lie group.

In the next set of notes, we will combine the Gleason-Yamabe theorem with some topological analysis (and in particular, using the invariance of domain theorem) to establish some further control on locally compact groups, and in particular obtaining a solution to Hilbert’s fifth problem.

To prove the Gleason-Yamabe theorem, we will use three major tools developed in previous notes. The first (from Notes 2) is a criterion for Lie structure in terms of a special type of metric, which we will call a Gleason metric:

Definition 2 Let ${G}$ be a topological group. A Gleason metric on ${G}$ is a left-invariant metric ${d: G \times G \rightarrow {\bf R}^+}$ which generates the topology on ${G}$ and obeys the following properties for some constant ${C>0}$, writing ${\|g\|}$ for ${d(g,\hbox{id})}$:

• (Escape property) If ${g \in G}$ and ${n \geq 1}$ is such that ${n \|g\| \leq \frac{1}{C}}$, then ${\|g^n\| \geq \frac{1}{C} n \|g\|}$.
• (Commutator estimate) If ${g, h \in G}$ are such that ${\|g\|, \|h\| \leq \frac{1}{C}}$, then

$\displaystyle \|[g,h]\| \leq C \|g\| \|h\|, \ \ \ \ \ (1)$

where ${[g,h] := g^{-1}h^{-1}gh}$ is the commutator of ${g}$ and ${h}$.
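As a toy numerical illustration of the commutator estimate (1) (our own illustration, not part of the notes' argument), one can take ${G}$ to be the Heisenberg group of unipotent upper-triangular ${3 \times 3}$ matrices and ${\|g\|}$ to be the Frobenius norm of ${g - 1}$, and test the estimate on random elements near the identity:

```python
# Toy check (illustrative assumptions: Heisenberg group in coordinates,
# ||g|| := Frobenius norm of g - identity) that ||[g,h]|| <= C ||g|| ||h||
# holds near the identity; here C = 1 suffices by Cauchy-Schwarz.
import math, random

def mul(g, h):
    # product of [[1,x,z],[0,1,y],[0,0,1]] matrices, in coordinates (x, y, z)
    x1, y1, z1 = g
    x2, y2, z2 = h
    return (x1 + x2, y1 + y2, z1 + z2 + x1 * y2)

def inv(g):
    x, y, z = g
    return (-x, -y, x * y - z)

def norm(g):                      # Frobenius norm of g minus the identity
    x, y, z = g
    return math.sqrt(x * x + y * y + z * z)

random.seed(0)
worst = 0.0
for _ in range(1000):
    g = tuple(random.uniform(-0.1, 0.1) for _ in range(3))
    h = tuple(random.uniform(-0.1, 0.1) for _ in range(3))
    comm = mul(mul(inv(g), inv(h)), mul(g, h))   # [g,h] = g^{-1} h^{-1} g h
    worst = max(worst, norm(comm) / (norm(g) * norm(h)))

assert worst <= 1 + 1e-9          # the commutator estimate holds with C = 1 here
print(worst)
```

(In this group the commutator is ${(0,0,x_1 y_2 - y_1 x_2)}$, so the ratio is bounded by ${1}$; a general Lie group would give some constant ${C}$ depending on the ball.)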

Theorem 3 (Building Lie structure from Gleason metrics) Let ${G}$ be a locally compact group that has a Gleason metric. Then ${G}$ is isomorphic to a Lie group.

The second tool is the existence of a left-invariant Haar measure on any locally compact group; see Theorem 3 from Notes 3. Finally, we will also need the compact case of the Gleason-Yamabe theorem (Theorem 8 from Notes 3), which was proven via the Peter-Weyl theorem:

Theorem 4 (Gleason-Yamabe theorem for compact groups) Let ${G}$ be a compact Hausdorff group, and let ${U}$ be a neighbourhood of the identity. Then there exists a compact normal subgroup ${H}$ of ${G}$ contained in ${U}$ such that ${G/H}$ is isomorphic to a linear group (i.e. a closed subgroup of a general linear group ${GL_n({\bf C})}$).

To finish the proof of the Gleason-Yamabe theorem, we have to somehow use the available structures on locally compact groups (such as Haar measure) to build good metrics on those groups (or on suitable subgroups or quotient groups). The basic construction is as follows:

Definition 5 (Building metrics out of test functions) Let ${G}$ be a topological group, and let ${\psi: G \rightarrow {\bf R}^+}$ be a bounded non-negative function. Then we define the pseudometric ${d_\psi: G \times G \rightarrow {\bf R}^+}$ by the formula

$\displaystyle d_\psi(g,h) := \sup_{x \in G} |\tau(g) \psi(x) - \tau(h) \psi(x)|$

$\displaystyle = \sup_{x \in G} |\psi(g^{-1} x ) - \psi(h^{-1} x)|$

and the semi-norm ${\| \|_\psi: G \rightarrow {\bf R}^+}$ by the formula

$\displaystyle \|g\|_\psi := d_\psi(g, \hbox{id}).$

Note that one can also write

$\displaystyle \|g\|_\psi = \sup_{x \in G} |\partial_g \psi(x)|$

where ${\partial_g \psi(x) := \psi(x) - \psi(g^{-1} x)}$ is the “derivative” of ${\psi}$ in the direction ${g}$.

Exercise 1 Let the notation and assumptions be as in the above definition. For any ${g,h,k \in G}$, establish the metric-like properties

1. (Identity) ${d_\psi(g,h) \geq 0}$, with equality when ${g=h}$.
2. (Symmetry) ${d_\psi(g,h) = d_\psi(h,g)}$.
3. (Triangle inequality) ${d_\psi(g,k) \leq d_\psi(g,h) + d_\psi(h,k)}$.
4. (Continuity) If ${\psi \in C_c(G)}$, then the map ${d_\psi: G \times G \rightarrow {\bf R}^+}$ is continuous.
5. (Boundedness) One has ${d_\psi(g,h) \leq \sup_{x \in G} |\psi(x)|}$. If ${\psi \in C_c(G)}$ is supported in a set ${K}$, then equality occurs unless ${g^{-1} h \in K K^{-1}}$.
6. (Left-invariance) ${d_\psi(g,h) = d_\psi(kg,kh)}$. In particular, ${d_\psi(g,h) = \| h^{-1} g \|_\psi = \| g^{-1} h \|_\psi}$.

In particular, we have the norm-like properties

1. (Identity) ${\|g\|_\psi \geq 0}$, with equality when ${g=\hbox{id}}$.
2. (Symmetry) ${\|g\|_\psi = \|g^{-1}\|_\psi}$.
3. (Triangle inequality) ${\|gh\|_\psi \leq \|g\|_\psi + \|h\|_\psi}$.
4. (Continuity) If ${\psi \in C_c(G)}$, then the map ${\|\|_\psi: G \rightarrow {\bf R}^+}$ is continuous.
5. (Boundedness) One has ${\|g\|_\psi \leq \sup_{x \in G} |\psi(x)|}$. If ${\psi \in C_c(G)}$ is supported in a set ${K}$, then equality occurs unless ${g \in K K^{-1}}$.

We remark that the first three properties of ${d_\psi}$ in the above exercise ensure that ${d_\psi}$ is indeed a pseudometric.
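For readers who wish to experiment, here is a small finite instance of Definition 5 (with toy assumptions of our own: the cyclic group ${{\bf Z}/12{\bf Z}}$ written additively, and a “bump” test function ${\psi}$ peaked at the identity), verifying the metric-like properties of Exercise 1 by exhaustion:

```python
# Finite toy model of Definition 5: Z/12 with a bump test function psi,
# checking symmetry, triangle inequality, and left-invariance exhaustively.
n = 12
psi = [max(0.0, 3 - min(x, n - x)) for x in range(n)]    # bump of height 3 at 0

def d(g, h):
    # d_psi(g,h) = sup_x |psi(g^{-1} x) - psi(h^{-1} x)|, additively: x-g, x-h
    return max(abs(psi[(x - g) % n] - psi[(x - h) % n]) for x in range(n))

def norm(g):
    return d(g, 0)

for g in range(n):
    assert norm(g) == norm((-g) % n)                     # ||g|| = ||g^{-1}||
    for h in range(n):
        assert d(g, h) == d(h, g)                        # symmetry
        assert d(g, h) == norm((g - h) % n)              # left-invariance
        for k in range(n):
            assert d(g, k) <= d(g, h) + d(h, k) + 1e-12  # triangle inequality
print(norm(1), norm(6))
```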

To get good metrics (such as Gleason metrics) on groups ${G}$, it thus suffices to obtain test functions ${\psi}$ that obey suitably good “regularity” properties. We will achieve this primarily by means of two tricks. The first trick is to obtain high-regularity test functions by convolving together two low-regularity test functions, taking advantage of the existence of a left-invariant Haar measure ${\mu}$ on ${G}$. The second trick is to obtain low-regularity test functions by means of a metric-like object on ${G}$. This latter trick may seem circular, as our whole objective is to get a metric on ${G}$ in the first place, but the key point is that the metric one starts with does not need to have as many “good properties” as the metric one ends up with, thanks to the regularity-improving properties of convolution. As such, one can use a “bootstrap argument” (or induction argument) to create a good metric out of almost nothing. It is this bootstrap miracle which is at the heart of the proof of the Gleason-Yamabe theorem (and hence of the solution of Hilbert’s fifth problem).
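The regularity-improving effect of convolution in the first trick can already be seen on a cyclic group: the indicator of an interval jumps by ${1}$ at its endpoints, but after convolving the indicator with itself (and normalising by the measure of the interval) the largest jump drops to about ${1/|A|}$. A toy sketch, under the assumption that counting measure on ${{\bf Z}/60{\bf Z}}$ plays the role of Haar measure:

```python
# Toy demonstration that convolution improves regularity: on Z/60, the
# indicator of an interval A has jumps of size 1, but (1_A * 1_A)/|A|
# has jumps of size at most 1/|A|.
n = 60
A = list(range(-10, 11))              # interval of 21 elements in Z/60
Amod = {a % n for a in A}

def ind(x):                           # indicator function 1_A on Z/60
    return 1.0 if x % n in Amod else 0.0

def conv(x):                          # (1_A * 1_A)(x) / |A|, counting measure as Haar
    return sum(ind(y) * ind(x - y) for y in range(n)) / len(A)

jump_ind = max(abs(ind(x + 1) - ind(x)) for x in range(n))
jump_conv = max(abs(conv(x + 1) - conv(x)) for x in range(n))

assert jump_ind == 1.0                       # raw indicator: jump of size 1
assert jump_conv <= 1.0 / len(A) + 1e-12     # convolution: jump at most 1/|A|
print(jump_ind, jump_conv)
```

The point is that ${(1_A * 1_A)(x) = |A \cap (x - A)|}$ changes by at most one element when ${x}$ is shifted by a generator, so one factor of roughness has been traded for Haar-measure averaging.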

The arguments here are based on the nonstandard analysis arguments used to establish Hilbert’s fifth problem by Hirschfeld and by Goldbring (and also some unpublished lecture notes of Goldbring and van den Dries). However, we will not explicitly use any nonstandard analysis in this post.

One of the fundamental inequalities in convex geometry is the Brunn-Minkowski inequality, which asserts that if ${A, B}$ are two non-empty bounded open subsets of ${{\bf R}^d}$, then

$\displaystyle \mu(A+B)^{1/d} \geq \mu(A)^{1/d} + \mu(B)^{1/d}, \ \ \ \ \ (1)$

where

$\displaystyle A+B := \{a+b: a \in A, b \in B \}$

is the sumset of ${A}$ and ${B}$, and ${\mu}$ denotes Lebesgue measure. The estimate is sharp, as can be seen by considering the case when ${A, B}$ are convex bodies that are dilates of each other, thus ${A = \lambda B := \{ \lambda b: b \in B \}}$ for some ${\lambda>0}$, since in this case one has ${\mu(A) = \lambda^d \mu(B)}$, ${A+B = (\lambda+1)B}$, and ${\mu(A+B) = (\lambda+1)^d \mu(B)}$.
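For a quick numerical confirmation of (1), one can restrict attention to axis-parallel boxes in ${{\bf R}^2}$, where sumsets and volumes are exact (for boxes the inequality reduces to superadditivity of the geometric mean, via AM-GM); the random sampling below is our own toy check:

```python
# Numeric check of Brunn-Minkowski (1) for axis-parallel boxes in R^2:
# A + B is again a box whose side lengths are the sums of the sides.
import math, random

random.seed(1)
d = 2
for _ in range(100):
    a = [random.uniform(0.1, 5.0) for _ in range(d)]   # side lengths of box A
    b = [random.uniform(0.1, 5.0) for _ in range(d)]   # side lengths of box B
    volA, volB = math.prod(a), math.prod(b)
    volAB = math.prod(x + y for x, y in zip(a, b))     # vol(A+B), exactly
    assert volAB ** (1 / d) >= volA ** (1 / d) + volB ** (1 / d) - 1e-9
print("Brunn-Minkowski verified on random boxes")
```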

The Brunn-Minkowski inequality has many applications in convex geometry. To give just one example, if we assume that ${A}$ has a smooth boundary ${\partial A}$, and set ${B}$ equal to a small ball ${B = B(0,\epsilon)}$, then ${\mu(B)^{1/d} = \epsilon \mu(B(0,1))^{1/d}}$, and in the limit ${\epsilon \rightarrow 0}$ one has

$\displaystyle \mu(A+B) = \mu(A) + \epsilon |\partial A| + o(\epsilon)$

where ${|\partial A|}$ is the surface measure of ${A}$; applying the Brunn-Minkowski inequality and performing a Taylor expansion, one soon arrives at the isoperimetric inequality

$\displaystyle |\partial A| \geq d \mu(A)^{1-1/d} \mu(B(0,1))^{1/d}.$

Thus one can view the isoperimetric inequality as an infinitesimal limit of the Brunn-Minkowski inequality.

There are many proofs known of the Brunn-Minkowski inequality. Firstly, the inequality is trivial in one dimension:

Lemma 1 (One-dimensional Brunn-Minkowski) If ${A, B,C \subset {\bf R}}$ are non-empty measurable sets with ${A+B \subset C \subset {\bf R}}$, then

$\displaystyle \mu(C) \geq \mu(A)+\mu(B).$

Proof: By inner regularity we may assume that ${A,B}$ are compact. The claim then follows since ${C}$ contains the sets ${\sup(A)+B}$ and ${A+\inf(B)}$, which meet only at a single point ${\sup(A)+\inf(B)}$. $\Box$
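The same argument gives the discrete cousin ${|A+B| \geq |A| + |B| - 1}$ for finite non-empty ${A, B \subset {\bf Z}}$, which is easy to test by brute force, including the key point of the proof that ${\sup(A)+B}$ and ${A+\inf(B)}$ meet in exactly one point (a toy check of ours, not part of the notes):

```python
# Discrete analogue of Lemma 1: |A+B| >= |A| + |B| - 1 for finite nonempty
# A, B in Z, via the same "max(A)+B and A+min(B) meet in one point" argument.
import random

random.seed(2)
for _ in range(200):
    A = set(random.sample(range(-50, 50), random.randint(1, 20)))
    B = set(random.sample(range(-50, 50), random.randint(1, 20)))
    S = {a + b for a in A for b in B}
    assert len(S) >= len(A) + len(B) - 1
    # the two witnessing translates from the proof meet in exactly one point:
    assert {max(A) + b for b in B} & {a + min(B) for a in A} == {max(A) + min(B)}
print("discrete one-dimensional Brunn-Minkowski verified")
```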

For the higher dimensional case, the inequality can be established from the Prékopa-Leindler inequality:

Theorem 2 (Prékopa-Leindler inequality in ${{\bf R}^d}$) Let ${0 < \theta < 1}$, and let ${f, g, h: {\bf R}^d \rightarrow {\bf R}}$ be non-negative measurable functions obeying the inequality

$\displaystyle h(x+y) \geq f(x)^{1-\theta} g(y)^\theta \ \ \ \ \ (2)$

for all ${x,y \in {\bf R}^d}$. Then we have

$\displaystyle \int_{{\bf R}^d} h \geq \frac{1}{(1-\theta)^{d(1-\theta)} \theta^{d\theta}} (\int_{{\bf R}^d} f)^{1-\theta} (\int_{{\bf R}^d} g)^\theta. \ \ \ \ \ (3)$

This inequality is usually stated using ${h((1-\theta)x + \theta y)}$ instead of ${h(x+y)}$ in order to eliminate the ungainly factor ${\frac{1}{(1-\theta)^{d(1-\theta)} \theta^{d\theta}}}$. However, we formulate the inequality in this fashion in order to avoid any reference to the dilation maps ${x \mapsto \lambda x}$; the reason for this will become clearer later.

The Prékopa-Leindler inequality quickly implies the Brunn-Minkowski inequality. Indeed, if we apply it to the indicator functions ${f := 1_A, g := 1_B, h := 1_{A+B}}$ (which certainly obey (2)), then (3) gives

$\displaystyle \mu(A+B)^{1/d} \geq \frac{1}{(1-\theta)^{1-\theta} \theta^{\theta}} \mu(A)^{\frac{1-\theta}{d}} \mu(B)^{\frac{\theta}{d}}$

for any ${0 < \theta < 1}$. We can now optimise in ${\theta}$; the optimal value turns out to be

$\displaystyle \theta := \frac{\mu(B)^{1/d}}{\mu(A)^{1/d}+\mu(B)^{1/d}}$

which yields (1).
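One can confirm the optimisation step numerically: writing ${a := \mu(A)^{1/d}}$ and ${b := \mu(B)^{1/d}}$, the lower bound is maximised at ${\theta = b/(a+b)}$, where it equals ${a+b}$. A short sketch, with arbitrary sample values for ${\mu(A), \mu(B), d}$:

```python
# Numeric check of the optimisation in theta: the lower bound
#   F(t) = a^(1-t) b^t / ((1-t)^(1-t) t^t),  a = mu(A)^(1/d), b = mu(B)^(1/d),
# is maximised at t* = b/(a+b), where F(t*) = a + b.
muA, muB, d = 3.7, 1.2, 3          # arbitrary sample values
a, b = muA ** (1 / d), muB ** (1 / d)

def F(t):
    return a ** (1 - t) * b ** t / ((1 - t) ** (1 - t) * t ** t)

t_star = b / (a + b)
assert abs(F(t_star) - (a + b)) < 1e-9               # the bound equals a + b
grid = [k / 1000 for k in range(1, 1000)]
assert max(F(t) for t in grid) <= F(t_star) + 1e-9   # t* is the maximiser
print(F(t_star), a + b)
```

(Indeed ${F(t) = (a/(1-t))^{1-t} (b/t)^t}$, whose logarithmic derivative vanishes exactly when ${b(1-t) = at}$.)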

To prove the Prékopa-Leindler inequality, we first observe that the inequality tensorises in the sense that if it is true in dimensions ${d_1}$ and ${d_2}$, then it is automatically true in dimension ${d_1+d_2}$. Indeed, if ${f, g, h: {\bf R}^{d_1} \times {\bf R}^{d_2} \rightarrow {\bf R}^+}$ are measurable functions obeying (2) in dimension ${d_1+d_2}$, then for any ${x_1, y_1 \in {\bf R}^{d_1}}$, the functions ${f(x_1,\cdot), g(y_1,\cdot), h(x_1+y_1,\cdot): {\bf R}^{d_2} \rightarrow {\bf R}^+}$ obey (2) in dimension ${d_2}$. Applying the Prékopa-Leindler inequality in dimension ${d_2}$, we conclude that

$\displaystyle H(x_1+y_1) \geq \frac{1}{(1-\theta)^{d_2(1-\theta)} \theta^{d_2\theta}} F(x_1)^{1-\theta} G(y_1)^\theta$

for all ${x_1,y_1 \in {\bf R}^{d_1}}$, where ${F(x_1) := \int_{{\bf R}^{d_2}} f(x_1,x_2)\ dx_2}$ and similarly for ${G, H}$. But then if we apply the Prékopa-Leindler inequality again, this time in dimension ${d_1}$ and to the functions ${F}$, ${G}$, and ${(1-\theta)^{d_2(1-\theta)} \theta^{d_2\theta} H}$, and then use the Fubini-Tonelli theorem, we obtain (3).

From tensorisation, we see that to prove the Prékopa-Leindler inequality it suffices to do so in the one-dimensional case. We can derive this from Lemma 1 by reversing the “Prékopa-Leindler implies Brunn-Minkowski” argument given earlier, as follows. We can normalise ${f,g}$ to have sup norm ${1}$. If (2) holds (in one dimension), then the super-level sets ${\{f>\lambda\}, \{g>\lambda\}, \{h>\lambda\}}$ are related by the set-theoretic inclusion

$\displaystyle \{ h > \lambda \} \supset \{ f > \lambda \} + \{ g > \lambda \}$

and thus by Lemma 1

$\displaystyle \mu(\{ h > \lambda \}) \geq \mu(\{ f > \lambda \}) + \mu(\{ g > \lambda \})$

whenever ${0 < \lambda < 1}$. On the other hand, from the Fubini-Tonelli theorem one has the distributional identity

$\displaystyle \int_{\bf R} h = \int_0^\infty \mu(\{h > \lambda\})\ d\lambda$

(and similarly for ${f,g}$, but with ${\lambda}$ restricted to ${(0,1)}$), and thus

$\displaystyle \int_{\bf R} h \geq \int_{\bf R} f + \int_{\bf R} g.$

The claim then follows from the weighted arithmetic mean-geometric mean inequality ${(1-\theta) x + \theta y \geq x^{1-\theta} y^\theta}$.
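As a numerical sanity check of the one-dimensional inequality, one can take ${f = g}$ to be the Gaussian ${e^{-x^2}}$ and let ${h}$ be the smallest function obeying (2), namely ${h(z) = \sup_{x+y=z} f(x)^{1-\theta} g(y)^\theta = e^{-\theta(1-\theta) z^2}}$; all the integrals are then explicit (a toy verification of ours, not a proof):

```python
# Numeric check of the 1-dimensional Prekopa-Leindler inequality (3) with
# Gaussian data f = g = exp(-x^2) and the extremal h(z) = exp(-t(1-t) z^2).
import math

for theta in (0.2, 0.5, 0.8):
    int_f = int_g = math.sqrt(math.pi)                  # integral of exp(-x^2)
    int_h = math.sqrt(math.pi / (theta * (1 - theta)))  # integral of exp(-t(1-t)z^2)
    rhs = (int_f ** (1 - theta) * int_g ** theta
           / ((1 - theta) ** (1 - theta) * theta ** theta))
    assert int_h >= rhs - 1e-12                         # inequality (3) in d = 1
    if theta == 0.5:
        assert abs(int_h - rhs) < 1e-12                 # sharp at theta = 1/2
print("ok")
```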

In this post, I wanted to record the simple observation (which appears in this paper of Leonardi and Masnou in the case of the Heisenberg group, but may have also been stated elsewhere in the literature) that the above argument carries through without much difficulty to the nilpotent setting, to give a nilpotent Brunn-Minkowski inequality:

Theorem 3 (Nilpotent Brunn-Minkowski) Let ${G}$ be a connected, simply connected nilpotent Lie group of (topological) dimension ${d}$, and let ${A, B}$ be bounded open subsets of ${G}$. Let ${\mu}$ be a Haar measure on ${G}$ (note that nilpotent groups are unimodular, so there is no distinction between left and right Haar measure). Then

$\displaystyle \mu(A \cdot B)^{1/d} \geq \mu(A)^{1/d} + \mu(B)^{1/d}. \ \ \ \ \ (4)$

Here of course ${A \cdot B := \{ ab: a \in A, b \in B \}}$ is the product set of ${A}$ and ${B}$.

Indeed, by repeating the previous arguments, the nilpotent Brunn-Minkowski inequality will follow from

Theorem 4 (Nilpotent Prékopa-Leindler inequality) Let ${G}$ be a connected, simply connected nilpotent Lie group of topological dimension ${d}$ with a Haar measure ${\mu}$. Let ${0 < \theta < 1}$, and let ${f, g, h: G \rightarrow {\bf R}}$ be non-negative measurable functions obeying the inequality

$\displaystyle h(xy) \geq f(x)^{1-\theta} g(y)^\theta \ \ \ \ \ (5)$

for all ${x,y \in G}$. Then we have

$\displaystyle \int_G h\ d\mu \geq \frac{1}{(1-\theta)^{d(1-\theta)} \theta^{d\theta}} (\int_G f\ d\mu)^{1-\theta} (\int_G g\ d\mu)^\theta. \ \ \ \ \ (6)$

To prove the nilpotent Prékopa-Leindler inequality, the key observation is that this inequality not only tensorises; it splits with respect to short exact sequences. Indeed, suppose one has a short exact sequence

$\displaystyle 0 \rightarrow K \rightarrow G \rightarrow H \rightarrow 0$

of connected, simply connected nilpotent Lie groups. The adjoint action of the connected group ${G}$ on ${K}$ acts nilpotently on the Lie algebra of ${K}$ and is thus unimodular. Because of this, we can split a Haar measure ${\mu_G}$ on ${G}$ into Haar measures ${\mu_K, \mu_H}$ on ${K, H}$ respectively so that we have the Fubini-Tonelli formula

$\displaystyle \int_G f(g)\ d\mu_G(g) = \int_H F(h)\ d\mu_H(h)$

for any measurable ${f: G \rightarrow {\bf R}^+}$, where ${F(h)}$ is defined by the formula

$\displaystyle F(h) := \int_K f(kg) d\mu_K(k) = \int_K f(gk)\ d\mu_K(k)$

for any coset representative ${g \in G}$ of ${h}$ (the choice of ${g}$ is not important, thanks to unimodularity of the conjugation action). It is then not difficult to repeat the proof of tensorisation (relying heavily on the unimodularity of conjugation) to conclude that the nilpotent Prékopa-Leindler inequality for ${H}$ and ${K}$ implies the Prékopa-Leindler inequality for ${G}$; we leave this as an exercise to the interested reader.

Now if ${G}$ is a connected simply connected nilpotent Lie group, then the abelianisation ${G/[G,G]}$ is connected and simply connected and thus isomorphic to a vector space. This implies that ${[G,G]}$ is a retract of ${G}$ and is thus also connected and simply connected. From this and an induction on the step of the nilpotent group, we see that the nilpotent Prékopa-Leindler inequality follows from the abelian case, which we have already established in Theorem 2.

Remark 1 Some connected, simply connected nilpotent groups ${G}$ (and specifically, the Carnot groups) can be equipped with a one-parameter family of dilations ${x \mapsto \lambda \cdot x}$, a family of automorphisms of ${G}$ that dilate the Haar measure by the formula

$\displaystyle \mu( \lambda \cdot E ) = \lambda^D \mu(E)$

for all measurable sets ${E}$, where the integer ${D}$ is called the homogeneous dimension of ${G}$, and is typically larger than the topological dimension. For instance, in the case of the Heisenberg group

$\displaystyle G := \begin{pmatrix} 1 & {\bf R} & {\bf R} \\ 0 & 1 & {\bf R} \\ 0 & 0 & 1 \end{pmatrix},$

which has topological dimension ${d=3}$, the natural family of dilations is given by

$\displaystyle \lambda: \begin{pmatrix} 1 & x & z \\ 0 & 1 & y \\ 0 & 0 & 1 \end{pmatrix} \mapsto \begin{pmatrix} 1 & \lambda x & \lambda^2 z \\ 0 & 1 & \lambda y \\ 0 & 0 & 1 \end{pmatrix}$

with homogeneous dimension ${D=4}$. Because the two notions ${d, D}$ of dimension are usually distinct in the nilpotent case, it is no longer helpful to try to use these dilations to simplify the proof of the Brunn-Minkowski inequality, in contrast to the Euclidean case. This is why we avoided using dilations in the preceding discussion. It is natural to wonder whether one could replace ${d}$ by ${D}$ in (4), but it can be easily shown that the exponent ${d}$ is best possible (an observation that essentially appeared first in this paper of Monti). Indeed, working in the Heisenberg group for sake of concreteness, consider the set

$\displaystyle A := \{ \begin{pmatrix} 1 & x & z \\ 0 & 1 & y \\ 0 & 0 & 1 \end{pmatrix}: |x|, |y| \leq N, |z| \leq N^{10} \}$

for some large parameter ${N}$. This set has measure ${8N^{12}}$ using the standard Haar measure on ${G}$. The product set ${A \cdot A}$ is contained in the set

$\displaystyle \{ \begin{pmatrix} 1 & x & z \\ 0 & 1 & y \\ 0 & 0 & 1 \end{pmatrix}: |x|, |y| \leq 2N, |z| \leq 2N^{10} + O(N^2) \}$

and thus has measure at most ${64N^{12} + O(N^4)}$, i.e. at most ${2^3 \mu(A)}$ plus lower order terms. This already shows that the exponent in (4) cannot be improved beyond ${d=3}$; note that the homogeneous dimension ${D=4}$ is making its presence known in the ${O(N^4)}$ term in the measure of ${A \cdot A}$, but this is a lower order term only.
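The containment and the measure count can be checked numerically. The following sketch (an illustration, not part of the original argument) samples random pairs from ${A}$, verifies that their products land in the larger box, and confirms that the box has measure ${8\mu(A)}$ up to the lower order term:

```python
import random

random.seed(1)
N = 50.0

def mul(g, h):
    # Heisenberg group law in the coordinates (x, y, z) of the matrix entries
    x1, y1, z1 = g
    x2, y2, z2 = h
    return (x1 + x2, y1 + y2, z1 + z2 + x1 * y2)

def sample_A():
    # uniform sample from A = {|x|, |y| <= N, |z| <= N^10}
    return (random.uniform(-N, N), random.uniform(-N, N),
            random.uniform(-N ** 10, N ** 10))

for _ in range(10000):
    x, y, z = mul(sample_A(), sample_A())
    assert abs(x) <= 2 * N and abs(y) <= 2 * N
    assert abs(z) <= 2 * N ** 10 + N ** 2  # the O(N^2) term is exactly x1*y2

# measure of A versus measure of the containing box
vol_A = (2 * N) * (2 * N) * (2 * N ** 10)
vol_box = (4 * N) * (4 * N) * (2 * (2 * N ** 10 + N ** 2))
assert abs(vol_box / vol_A - 8) < 1e-3  # ratio tends to 2^d = 8 as N -> infinity
```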

It is somewhat unfortunate that the nilpotent Brunn-Minkowski inequality is adapted to the topological dimension rather than the homogeneous one, because it means that some of the applications of the inequality (such as the application to isoperimetric inequalities mentioned at the start of the post) break down. (Indeed, the topic of isoperimetric inequalities for the Heisenberg group is a subtle one, with many naive formulations of the inequality being false. See the paper of Monti for more discussion.)
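The dilation structure from Remark 1 is also easy to verify in coordinates: writing the Heisenberg element with entries ${x,y,z}$ as a triple, the group law becomes ${(x,y,z)(x',y',z') = (x+x', y+y', z+z'+xy')}$, the maps ${x \mapsto \lambda \cdot x}$ are automorphisms, and their Jacobian ${\lambda \cdot \lambda \cdot \lambda^2 = \lambda^4}$ exhibits the homogeneous dimension. A quick numerical sketch:

```python
import math, random

def mul(g, h):
    # Heisenberg group law for upper-triangular coordinates (x, y, z)
    x1, y1, z1 = g
    x2, y2, z2 = h
    return (x1 + x2, y1 + y2, z1 + z2 + x1 * y2)

def dil(lam, g):
    # the dilation (x, y, z) -> (lam*x, lam*y, lam^2 * z)
    x, y, z = g
    return (lam * x, lam * y, lam ** 2 * z)

random.seed(0)
lam = 1.7
for _ in range(100):
    g = tuple(random.uniform(-2, 2) for _ in range(3))
    h = tuple(random.uniform(-2, 2) for _ in range(3))
    lhs = dil(lam, mul(g, h))            # dilate the product
    rhs = mul(dil(lam, g), dil(lam, h))  # multiply the dilates
    assert all(math.isclose(a, b, abs_tol=1e-9) for a, b in zip(lhs, rhs))

# Jacobian of the dilation is diag(lam, lam, lam^2), so Haar (= Lebesgue)
# measure scales by lam^4, with D = 4 the homogeneous dimension
assert math.isclose(lam * lam * lam ** 2, lam ** 4)
```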

Remark 2 The inequality can be extended to non-simply-connected connected nilpotent groups ${G}$, if ${d}$ is now set to the dimension of the largest simply connected quotient of ${G}$. It seems to me that this is the best one can do in general; for instance, if ${G}$ is a torus, then the inequality fails for any ${d>0}$, as can be seen by setting ${A=B=G}$.

Remark 3 Specialising the nilpotent Brunn-Minkowski inequality to the case ${A=B}$, we conclude that

$\displaystyle \mu(A \cdot A) \geq 2^d \mu(A).$

This inequality actually has a much simpler proof (attributed to Tsachik Gelander in this paper of Hrushovski, as pointed out to me by Emmanuel Breuillard): one can show that for a connected, simply connected nilpotent Lie group ${G}$, the exponential map ${\exp: {\mathfrak g} \rightarrow G}$ is a measure-preserving homeomorphism, for some choice of Haar measure ${\mu_{{\mathfrak g}}}$ on ${{\mathfrak g}}$, so it suffices to show that

$\displaystyle \mu_{{\mathfrak g}}(\log(A \cdot A)) \geq 2^d \mu_{{\mathfrak g}}(\log A).$

But ${A \cdot A}$ contains all the squares ${\{a^2: a \in A \}}$ of ${A}$, so ${\log(A \cdot A)}$ contains the isotropic dilation ${2 \cdot \log A}$, and the claim follows. Note that if we set ${A}$ to be a small ball around the origin, we can modify this argument to give another demonstration of why the topological dimension ${d}$ cannot be replaced with any larger exponent in (4).
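The key identity ${\log(a^2) = 2 \log a}$ underlying this argument is easy to verify concretely in the Heisenberg group, where the matrix logarithm ${\log(1+M) = M - M^2/2}$ is an exact polynomial because ${M^3 = 0}$ for strictly upper triangular ${M}$. A quick sketch (illustration only):

```python
import numpy as np

def heis(x, y, z):
    # Heisenberg matrix with the given upper-triangular entries
    return np.array([[1.0, x, z], [0.0, 1.0, y], [0.0, 0.0, 1.0]])

def logm(g):
    # log(1 + M) = M - M^2/2 exactly, since M^3 = 0 here
    M = g - np.eye(3)
    return M - M @ M / 2

a = heis(0.3, -0.7, 1.2)
# the square a^2 lies in A*A, and log(a^2) is the isotropic dilate 2*log(a)
assert np.allclose(logm(a @ a), 2 * logm(a))
```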

One may tentatively conjecture that the inequality ${\mu(A \cdot A) \geq 2^d \mu(A)}$ in fact holds in all unimodular connected, simply connected Lie groups ${G}$, and all bounded open subsets ${A}$ of ${G}$; I do not know if this bound is always true, however.

Hilbert’s fifth problem concerns the minimal hypotheses one needs to place on a topological group ${G}$ to ensure that it is actually a Lie group. In the previous set of notes, we saw that one could reduce the regularity hypothesis imposed on ${G}$ to a “${C^{1,1}}$” condition, namely that there was an open neighbourhood of the identity in ${G}$ that was isomorphic (as a local group) to an open subset ${V}$ of a Euclidean space ${{\bf R}^d}$ with identity element ${0}$, and with group operation ${\ast}$ obeying the asymptotic

$\displaystyle x \ast y = x + y + O(|x| |y|)$

for sufficiently small ${x,y}$. We will call such local groups ${(V,\ast)}$ ${C^{1,1}}$ local groups.

We now reduce the regularity hypothesis further, to one in which there is no explicit Euclidean space that is initially attached to ${G}$. Of course, Lie groups are still locally Euclidean, so if the hypotheses on ${G}$ do not involve any explicit Euclidean spaces, then one must somehow build such spaces from other structures. One way to do so is to exploit an ambient space with Euclidean or Lie structure that ${G}$ is embedded or immersed in. A trivial example of this is provided by the following basic fact from linear algebra:

Lemma 1 If ${V}$ is a finite-dimensional vector space (i.e. it is isomorphic to ${{\bf R}^d}$ for some ${d}$), and ${W}$ is a linear subspace of ${V}$, then ${W}$ is also a finite-dimensional vector space.

We will establish a non-linear version of this statement, known as Cartan’s theorem. Recall that a subset ${S}$ of a ${d}$-dimensional smooth manifold ${M}$ is a ${d'}$-dimensional smooth (embedded) submanifold of ${M}$ for some ${0 \leq d' \leq d}$ if for every point ${x \in S}$ there is a smooth coordinate chart ${\phi: U \rightarrow V}$ of a neighbourhood ${U}$ of ${x}$ in ${M}$ that maps ${x}$ to ${0}$, such that ${\phi(U \cap S) = V \cap {\bf R}^{d'}}$, where we identify ${{\bf R}^{d'} \equiv {\bf R}^{d'} \times \{0\}^{d-d'}}$ with a subspace of ${{\bf R}^d}$. Informally, ${S}$ locally sits inside ${M}$ the same way that ${{\bf R}^{d'}}$ sits inside ${{\bf R}^d}$.

Theorem 2 (Cartan’s theorem) If ${H}$ is a (topologically) closed subgroup of a Lie group ${G}$, then ${H}$ is a smooth submanifold of ${G}$, and is thus also a Lie group.

Note that the hypothesis that ${H}$ is closed is essential; for instance, the rationals ${{\bf Q}}$ are a subgroup of the (additive) group of reals ${{\bf R}}$, but the former is not a Lie group even though the latter is.

Exercise 1 Let ${H}$ be a subgroup of a locally compact group ${G}$. Show that ${H}$ is closed in ${G}$ if and only if it is locally compact.

A variant of the above results is provided by using (faithful) representations instead of embeddings. Again, the linear version is trivial:

Lemma 3 If ${V}$ is a finite-dimensional vector space, and ${W}$ is another vector space with an injective linear transformation ${\rho: W \rightarrow V}$ from ${W}$ to ${V}$, then ${W}$ is also a finite-dimensional vector space.

Here is the non-linear version:

Theorem 4 (von Neumann’s theorem) If ${G}$ is a Lie group, and ${H}$ is a locally compact group with an injective continuous homomorphism ${\rho: H \rightarrow G}$, then ${H}$ also has the structure of a Lie group.

Actually, it will suffice for the homomorphism ${\rho}$ to be locally injective rather than injective; related to this, von Neumann’s theorem localises to the case when ${H}$ is a local group rather than a group. The requirement that ${H}$ be locally compact is necessary, for much the same reason that the requirement that ${H}$ be closed was necessary in Cartan’s theorem.

Example 1 Let ${G = ({\bf R}/{\bf Z})^2}$ be the two-dimensional torus, let ${H = {\bf R}}$, and let ${\rho: H \rightarrow G}$ be the map ${\rho(x) := (x,\alpha x)}$, where ${\alpha \in {\bf R}}$ is a fixed real number. Then ${\rho}$ is a continuous homomorphism which is locally injective, and is even globally injective if ${\alpha}$ is irrational, and so Theorem 4 is consistent with the fact that ${H}$ is a Lie group. On the other hand, note that when ${\alpha}$ is irrational, then ${\rho(H)}$ is not closed; and so Theorem 4 does not follow immediately from Theorem 2 in this case. (We will see, though, that Theorem 4 follows from a local version of Theorem 2.)

As a corollary of Theorem 4, we observe that any locally compact Hausdorff group ${H}$ with a faithful linear representation, i.e. a continuous injective homomorphism from ${H}$ into a linear group such as ${GL_n({\bf R})}$ or ${GL_n({\bf C})}$, is necessarily a Lie group. This suggests a representation-theoretic approach to Hilbert’s fifth problem. While this approach does not seem to readily solve the entire problem, it can be used to establish a number of important special cases with a well-understood representation theory, such as the compact case or the abelian case (for which the requisite representation theory is given by the Peter-Weyl theorem and Pontryagin duality respectively). We will discuss these cases further in later notes.

In all of these cases, one is not really building up Euclidean or Lie structure completely from scratch, because there is already a Euclidean or Lie structure present in another object in the hypotheses. Now we turn to results that can create such structure assuming only what is ostensibly a weaker amount of structure. In the linear case, one example of this is the following classical result in the theory of topological vector spaces.

Theorem 5 Let ${V}$ be a locally compact Hausdorff topological vector space. Then ${V}$ is isomorphic (as a topological vector space) to ${{\bf R}^d}$ for some finite ${d}$.

Remark 1 The Banach-Alaoglu theorem asserts that in a normed vector space ${V}$, the closed unit ball in the dual space ${V^*}$ is always compact in the weak-* topology. Of course, this dual space ${V^*}$ may be infinite-dimensional. This however does not contradict the above theorem, because the closed unit ball is not a neighbourhood of the origin in the weak-* topology (it is only a neighbourhood with respect to the strong topology).

The full non-linear analogue of this theorem would be the Gleason-Yamabe theorem, which we are not yet ready to prove in this set of notes. However, by using methods similar to those used to prove Cartan’s theorem and von Neumann’s theorem, one can obtain a partial non-linear analogue which requires an additional hypothesis of a special type of metric, which we will call a Gleason metric:

Definition 6 Let ${G}$ be a topological group. A Gleason metric on ${G}$ is a left-invariant metric ${d: G \times G \rightarrow {\bf R}^+}$ which generates the topology on ${G}$ and obeys the following properties for some constant ${C>0}$, writing ${\|g\|}$ for ${d(g,\hbox{id})}$:

• (Escape property) If ${g \in G}$ and ${n \geq 1}$ is such that ${n \|g\| \leq \frac{1}{C}}$, then ${\|g^n\| \geq \frac{1}{C} n \|g\|}$.
• (Commutator estimate) If ${g, h \in G}$ are such that ${\|g\|, \|h\| \leq \frac{1}{C}}$, then

$\displaystyle \|[g,h]\| \leq C \|g\| \|h\|, \ \ \ \ \ (1)$

where ${[g,h] := g^{-1}h^{-1}gh}$ is the commutator of ${g}$ and ${h}$.

Exercise 2 Let ${G}$ be a topological group that contains a neighbourhood of the identity isomorphic to a ${C^{1,1}}$ local group. Show that ${G}$ admits at least one Gleason metric.

Theorem 7 (Building Lie structure from Gleason metrics) Let ${G}$ be a locally compact group that has a Gleason metric. Then ${G}$ is isomorphic to a Lie group.

We will rely on Theorem 7 to solve Hilbert’s fifth problem; this theorem reduces the task of establishing Lie structure on a locally compact group to that of building a metric with suitable properties. Thus, much of the remainder of the solution of Hilbert’s fifth problem will now be focused on the problem of how to construct good metrics on a locally compact group.

In all of the above results, a key idea is to use one-parameter subgroups to convert from the nonlinear setting to the linear setting. Recall from the previous notes that in a Lie group ${G}$, the one-parameter subgroups are in one-to-one correspondence with the elements of the Lie algebra ${{\mathfrak g}}$, which is a vector space. In a general topological group ${G}$, the concept of a one-parameter subgroup (i.e. a continuous homomorphism from ${{\bf R}}$ to ${G}$) still makes sense; the main difficulties are then to show that the space of such subgroups continues to form a vector space, and that the associated exponential map ${\exp: \phi \mapsto \phi(1)}$ is still a local homeomorphism near the origin.

Exercise 3 The purpose of this exercise is to illustrate the perspective that a topological group can be viewed as a non-linear analogue of a vector space. Let ${G, H}$ be locally compact groups. For technical reasons we assume that ${G, H}$ are both ${\sigma}$-compact and metrisable.

• (i) (Open mapping theorem) Show that if ${\phi: G \rightarrow H}$ is a continuous homomorphism which is surjective, then it is open (i.e. the image of open sets is open). (Hint: mimic the proof of the open mapping theorem for Banach spaces, as discussed for instance in these notes. In particular, take advantage of the Baire category theorem.)
• (ii) (Closed graph theorem) Show that if a homomorphism ${\phi: G \rightarrow H}$ is closed (i.e. its graph ${\{ (g, \phi(g)): g \in G \}}$ is a closed subset of ${G \times H}$), then it is continuous. (Hint: mimic the derivation of the closed graph theorem from the open mapping theorem in the Banach space case, as again discussed in these notes.)
• (iii) Let ${\phi: G \rightarrow H}$ be a homomorphism, and let ${\rho: H \rightarrow K}$ be a continuous injective homomorphism into another Hausdorff topological group ${K}$. Show that ${\phi}$ is continuous if and only if ${\rho \circ \phi}$ is continuous.
• (iv) Relax the condition of metrisability to that of being Hausdorff. (Hint: Now one cannot use the Baire category theorem for metric spaces; but there is an analogue of this theorem for locally compact Hausdorff spaces.)

This fall (starting Monday, September 26), I will be teaching a graduate topics course which I have entitled “Hilbert’s fifth problem and related topics.” The course is going to focus on three related topics:

• Hilbert’s fifth problem on the topological description of Lie groups, as well as the closely related (local) classification of locally compact groups (the Gleason-Yamabe theorem).
• Approximate groups in nonabelian groups, and their classification via the Gleason-Yamabe theorem (this is very recent work of Emmanuel Breuillard, Ben Green, Tom Sanders, and myself, building upon earlier work of Hrushovski);
• Gromov’s theorem on groups of polynomial growth, as proven via the classification of approximate groups (as well as some consequences to fundamental groups of Riemannian manifolds).

I have already blogged about these topics repeatedly in the past (particularly with regard to Hilbert’s fifth problem), and I intend to recycle some of that material in the lecture notes for this course.

The above three families of results exemplify two broad principles (part of what I like to call “the dichotomy between structure and randomness“):

• (Rigidity) If a group-like object exhibits a weak amount of regularity, then it (or a large portion thereof) often automatically exhibits a strong amount of regularity as well;
• (Structure) This strong regularity manifests itself either as Lie type structure (in continuous settings) or nilpotent type structure (in discrete settings). (In some cases, “nilpotent” should be replaced by sister properties such as “abelian“, “solvable“, or “polycyclic“.)

Let me illustrate what I mean by these two principles with two simple examples, one in the continuous setting and one in the discrete setting. We begin with a continuous example. Given an ${n \times n}$ complex matrix ${A \in M_n({\bf C})}$, define the matrix exponential ${\exp(A)}$ of ${A}$ by the formula

$\displaystyle \exp(A) := \sum_{k=0}^\infty \frac{A^k}{k!} = 1 + A + \frac{1}{2!} A^2 + \frac{1}{3!} A^3 + \ldots$

which can easily be verified to be an absolutely convergent series.

Exercise 1 Show that the map ${A \mapsto \exp(A)}$ is a real analytic (and even complex analytic) map from ${M_n({\bf C})}$ to ${M_n({\bf C})}$, and obeys the restricted homomorphism property

$\displaystyle \exp(sA) \exp(tA) = \exp((s+t)A) \ \ \ \ \ (1)$

for all ${A \in M_n({\bf C})}$ and ${s,t \in {\bf C}}$.
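For readers who like to experiment, the restricted homomorphism property (1) is easy to test numerically with a truncated version of the series above (a sketch; truncating at 40 terms is ample for matrices of small norm):

```python
import numpy as np

def expm(A, terms=40):
    # matrix exponential via its (truncated) power series
    result = np.eye(A.shape[0], dtype=complex)
    term = np.eye(A.shape[0], dtype=complex)
    for k in range(1, terms):
        term = term @ A / k          # A^k / k!
        result = result + term
    return result

rng = np.random.default_rng(0)
A = (rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))) / 4
s, t = 0.7 + 0.2j, -1.3 - 0.5j
# exp(sA) exp(tA) = exp((s+t)A), since sA and tA commute
assert np.allclose(expm(s * A) @ expm(t * A), expm((s + t) * A))
```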

Proposition 1 (Rigidity and structure of matrix homomorphisms) Let ${n}$ be a natural number. Let ${GL_n({\bf C})}$ be the group of invertible ${n \times n}$ complex matrices. Let ${\Phi: {\bf R} \rightarrow GL_n({\bf C})}$ be a map obeying two properties:

• (Group-like object) ${\Phi}$ is a homomorphism, thus ${\Phi(s) \Phi(t) = \Phi(s+t)}$ for all ${s,t \in {\bf R}}$.
• (Weak regularity) The map ${t \mapsto \Phi(t)}$ is continuous.

Then:

• (Strong regularity) The map ${t \mapsto \Phi(t)}$ is smooth (i.e. infinitely differentiable). In fact it is even real analytic.
• (Lie-type structure) There exists a (unique) complex ${n \times n}$ matrix ${A}$ such that ${\Phi(t) = \exp(tA)}$ for all ${t \in {\bf R}}$.

Proof: Let ${\Phi}$ be as above. Let ${\epsilon > 0}$ be a small number (depending only on ${n}$). By the homomorphism property, ${\Phi(0) = 1}$ (where we use ${1}$ here to denote the identity element of ${GL_n({\bf C})}$), and so by continuity we may find a small ${t_0>0}$ such that ${\Phi(t) = 1 + O(\epsilon)}$ for all ${t \in [-t_0,t_0]}$ (we use some arbitrary norm here on the space of ${n \times n}$ matrices, and allow implied constants in the ${O()}$ notation to depend on ${n}$).

The map ${A \mapsto \exp(A)}$ is real analytic and, by the inverse function theorem, a diffeomorphism near ${0}$. Thus, if ${\epsilon}$ is small enough, we can find a matrix ${B}$ of norm ${O(\epsilon)}$ such that ${\Phi(t_0) = \exp(B)}$. By the homomorphism property and (1), we thus have

$\displaystyle \Phi(t_0/2)^2 = \Phi(t_0) = \exp(B) = \exp(B/2)^2.$

On the other hand, by another application of the inverse function theorem we see that the squaring map ${A \mapsto A^2}$ is a diffeomorphism near ${1}$ in ${GL_n({\bf C})}$, and thus (if ${\epsilon}$ is small enough)

$\displaystyle \Phi(t_0/2) = \exp(B/2).$

We may iterate this argument (for a fixed, but small, value of ${\epsilon}$) and conclude that

$\displaystyle \Phi(t_0/2^k) = \exp(B/2^k)$

for all ${k = 0,1,2,\ldots}$. By the homomorphism property and (1) we thus have

$\displaystyle \Phi(qt_0) = \exp(qB)$

whenever ${q}$ is a dyadic rational, i.e. a rational of the form ${a/2^k}$ for some integer ${a}$ and natural number ${k}$. By continuity we thus have

$\displaystyle \Phi(st_0) = \exp(sB)$

for all real ${s}$. Setting ${A := B/t_0}$ we conclude that

$\displaystyle \Phi(t) = \exp(tA)$

for all real ${t}$, which gives existence of the representation and also real analyticity and smoothness. Finally, uniqueness of the representation ${\Phi(t) = \exp(tA)}$ follows from the identity

$\displaystyle A = \frac{d}{dt} \exp(tA)|_{t=0}.$

$\Box$
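The uniqueness identity can be illustrated numerically: taking ${A}$ diagonal (so that ${\exp(tA)}$ is just the entrywise exponential), a finite difference at ${t=0}$ recovers ${A}$. A sketch, with an illustrative choice of generator:

```python
import numpy as np

A = np.diag([0.3, -0.2])                          # a sample generator
Phi = lambda t: np.diag(np.exp(t * np.diag(A)))   # Phi(t) = exp(tA) for diagonal A

h = 1e-7
# A = d/dt exp(tA) at t = 0, approximated by a forward difference
A_recovered = (Phi(h) - np.eye(2)) / h
assert np.allclose(A_recovered, A, atol=1e-6)
```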

Exercise 2 Generalise Proposition 1 by replacing the hypothesis that ${\Phi}$ is continuous with the hypothesis that ${\Phi}$ is Lebesgue measurable. (Hint: use the Steinhaus theorem.) Show that the proposition fails (assuming the axiom of choice) if this hypothesis is omitted entirely.

Note how one needs both the group-like structure and the weak regularity in combination in order to ensure the strong regularity; neither is sufficient on its own. We will see variants of the above basic argument throughout the course. Here, the task of obtaining smooth (or real analytic) structure was relatively easy, because we could borrow the smooth (or real analytic) structure of the domain ${{\bf R}}$ and range ${M_n({\bf C})}$; but, somewhat remarkably, we shall see that one can still build such smooth or analytic structures even when none of the original objects have any such structure to begin with.

Now we turn to a second illustration of the above principles, namely Jordan’s theorem, which uses a discreteness hypothesis to upgrade Lie type structure to nilpotent (and in this case, abelian) structure. We shall formulate Jordan’s theorem in a slightly stilted fashion in order to emphasise the adherence to the above-mentioned principles.

Theorem 2 (Jordan’s theorem) Let ${G}$ be an object with the following properties:

• (Group-like object) ${G}$ is a group.
• (Discreteness) ${G}$ is finite.
• (Lie-type structure) ${G}$ is contained in ${U_n({\bf C})}$ (the group of unitary ${n \times n}$ matrices) for some ${n}$.

Then there is a subgroup ${G'}$ of ${G}$ such that

• (${G'}$ is close to ${G}$) The index ${|G/G'|}$ of ${G'}$ in ${G}$ is ${O_n(1)}$ (i.e. bounded by ${C_n}$ for some quantity ${C_n}$ depending only on ${n}$).
• (Nilpotent-type structure) ${G'}$ is abelian.

A key observation in the proof of Jordan’s theorem is that if two unitary elements ${g, h \in U_n({\bf C})}$ are close to the identity, then their commutator ${[g,h] = g^{-1}h^{-1}gh}$ is even closer to the identity (in, say, the operator norm ${\| \|_{op}}$). Indeed, since multiplication on the left or right by unitary elements does not affect the operator norm, we have

$\displaystyle \| [g,h] - 1 \|_{op} = \| gh - hg \|_{op}$

$\displaystyle = \| (g-1)(h-1) - (h-1)(g-1) \|_{op}$

and so by the triangle inequality

$\displaystyle \| [g,h] - 1 \|_{op} \leq 2 \|g-1\|_{op} \|h-1\|_{op}. \ \ \ \ \ (2)$
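One can test (2) numerically on pairs of non-commuting rotations (which are in particular unitary), with the operator norm computed as the largest singular value. A quick sketch:

```python
import numpy as np

def Rz(a):
    # rotation about the z-axis (orthogonal, hence unitary)
    return np.array([[np.cos(a), -np.sin(a), 0.0],
                     [np.sin(a),  np.cos(a), 0.0],
                     [0.0,        0.0,       1.0]])

def Rx(b):
    # rotation about the x-axis
    return np.array([[1.0, 0.0,        0.0],
                     [0.0, np.cos(b), -np.sin(b)],
                     [0.0, np.sin(b),  np.cos(b)]])

op = lambda M: np.linalg.norm(M, 2)   # operator norm = top singular value
I = np.eye(3)

for a, b in [(0.1, 0.07), (0.3, -0.2), (0.01, 0.02)]:
    g, h = Rz(a), Rx(b)
    comm = g.T @ h.T @ g @ h   # [g,h] = g^{-1} h^{-1} g h; inverse = transpose
    assert op(comm - I) <= 2 * op(g - I) * op(h - I) + 1e-12
```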

Now we can prove Jordan’s theorem.

Proof: We induct on ${n}$, the case ${n=1}$ being trivial. Suppose first that ${G}$ contains a central element ${g}$ which is not a multiple of the identity. Then, by definition, ${G}$ is contained in the centraliser ${Z(g)}$ of ${g}$, which by the spectral theorem is isomorphic to a product ${U_{n_1}({\bf C}) \times \ldots \times U_{n_k}({\bf C})}$ of smaller unitary groups. Projecting ${G}$ to each of these factor groups and applying the induction hypothesis, we obtain the claim.

Thus we may assume that ${G}$ contains no central elements other than multiples of the identity. Now pick a small ${\epsilon > 0}$ (one could take ${\epsilon=\frac{1}{10n}}$ in fact) and consider the subgroup ${G'}$ of ${G}$ generated by those elements of ${G}$ that are within ${\epsilon}$ of the identity (in the operator norm). By considering a maximal ${\epsilon}$-net of ${G}$ we see that ${G'}$ has index at most ${O_{n,\epsilon}(1)}$ in ${G}$. By arguing as before, we may assume that ${G'}$ has no central elements other than multiples of the identity.

If ${G'}$ consists only of multiples of the identity, then we are done. If not, take an element ${g}$ of ${G'}$ that is not a multiple of the identity, and which is as close as possible to the identity (here is where we crucially use that ${G}$ is finite). By (2), we see that if ${\epsilon}$ is sufficiently small depending on ${n}$, and if ${h}$ is one of the generators of ${G'}$, then ${[g,h]}$ lies in ${G'}$ and is closer to the identity than ${g}$, and is thus a multiple of the identity. On the other hand, ${[g,h]}$ has determinant ${1}$. Given that it is so close to the identity, it must therefore be the identity (if ${\epsilon}$ is small enough). In other words, ${g}$ is central in ${G'}$, and is thus a multiple of the identity. But this contradicts the hypothesis that there are no central elements other than multiples of the identity, and we are done. $\Box$

Commutator estimates such as (2) will play a fundamental role in many of the arguments we will see in this course; as we saw above, such estimates combine very well with a discreteness hypothesis, but will also be very useful in the continuous setting.

Exercise 3 Generalise Jordan’s theorem to the case when ${G}$ is a finite subgroup of ${GL_n({\bf C})}$ rather than of ${U_n({\bf C})}$. (Hint: The elements of ${G}$ are not necessarily unitary, and thus do not necessarily preserve the standard Hilbert inner product of ${{\bf C}^n}$. However, if one averages that inner product by the finite group ${G}$, one obtains a new inner product on ${{\bf C}^n}$ that is preserved by ${G}$, which allows one to conjugate ${G}$ to a subgroup of ${U_n({\bf C})}$. This averaging trick is (a small) part of Weyl’s unitary trick in representation theory.)

Exercise 4 (Inability to discretise nonabelian Lie groups) Show that if ${n \geq 3}$, then the orthogonal group ${O_n({\bf R})}$ cannot contain arbitrarily dense finite subgroups, in the sense that there exists an ${\epsilon = \epsilon_n > 0}$ depending only on ${n}$ such that for every finite subgroup ${G}$ of ${O_n({\bf R})}$, there exists a ball of radius ${\epsilon}$ in ${O_n({\bf R})}$ (with, say, the operator norm metric) that is disjoint from ${G}$. What happens in the ${n=2}$ case?

Remark 1 More precise classifications of the finite subgroups of ${U_n({\bf C})}$ are known, particularly in low dimensions. For instance, one can show that the only finite subgroups of ${SO_3({\bf R})}$ (which ${SU_2({\bf C})}$ is a double cover of) are isomorphic to either a cyclic group, a dihedral group, or the symmetry group of one of the Platonic solids.

One of the most well known problems from ancient Greek mathematics was that of trisecting an angle by straightedge and compass, which was eventually proven impossible in 1837 by Pierre Wantzel, using methods from Galois theory.

Formally, one can set up the problem as follows. Define a configuration to be a finite collection ${{\mathcal C}}$ of points, lines, and circles in the Euclidean plane. Define a construction step to be one of the following operations to enlarge the collection ${{\mathcal C}}$:

• (Straightedge) Given two distinct points ${A, B}$ in ${{\mathcal C}}$, form the line ${\overline{AB}}$ that connects ${A}$ and ${B}$, and add it to ${{\mathcal C}}$.
• (Compass) Given two distinct points ${A, B}$ in ${{\mathcal C}}$, and given a third point ${O}$ in ${{\mathcal C}}$ (which may or may not equal ${A}$ or ${B}$), form the circle with centre ${O}$ and radius equal to the length ${|AB|}$ of the line segment joining ${A}$ and ${B}$, and add it to ${{\mathcal C}}$.
• (Intersection) Given two distinct curves ${\gamma, \gamma'}$ in ${{\mathcal C}}$ (thus ${\gamma}$ is either a line or a circle in ${{\mathcal C}}$, and similarly for ${\gamma'}$), select a point ${P}$ that is common to both ${\gamma}$ and ${\gamma'}$ (there are at most two such points), and add it to ${{\mathcal C}}$.

We say that a point, line, or circle is constructible by straightedge and compass from a configuration ${{\mathcal C}}$ if it can be obtained from ${{\mathcal C}}$ after applying a finite number of construction steps.

Problem 1 (Angle trisection) Let ${A, B, C}$ be distinct points in the plane. Is it always possible to construct by straightedge and compass from ${A,B,C}$ a line ${\ell}$ through ${A}$ that trisects the angle ${\angle BAC}$, in the sense that the angle between ${\ell}$ and ${BA}$ is one third of the angle of ${\angle BAC}$?

Thanks to Wantzel’s result, the answer to this problem is known to be “no” in general; a generic angle ${\angle BAC}$ cannot be trisected by straightedge and compass. (On the other hand, some special angles can certainly be trisected by straightedge and compass, such as a right angle. Also, one can certainly trisect generic angles using other methods than straightedge and compass; see the Wikipedia page on angle trisection for some examples of this.)

The impossibility of angle trisection stands in sharp contrast to the easy construction of angle bisection via straightedge and compass, which we briefly review as follows:

1. Start with three points ${A, B, C}$.
2. Form the circle ${c_0}$ with centre ${A}$ and radius ${AB}$, and intersect it with the line ${\overline{AC}}$. Let ${D}$ be the point in this intersection that lies on the same side of ${A}$ as ${C}$. (${D}$ may well be equal to ${C}$).
3. Form the circle ${c_1}$ with centre ${B}$ and radius ${AB}$, and the circle ${c_2}$ with centre ${D}$ and radius ${AB}$. Let ${E}$ be the point of intersection of ${c_1}$ and ${c_2}$ that is not ${A}$.
4. The line ${\ell := \overline{AE}}$ will then bisect the angle ${\angle BAC}$.
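The construction above can be carried out in coordinates. Placing ${A}$ at the origin of the complex plane and ${B}$ at ${1}$, the point ${D}$ is the unit vector toward ${C}$, and by symmetry the second intersection ${E}$ of the two unit circles centred at ${B}$ and ${D}$ is ${B+D}$ (the first being ${A}$ itself). The following sketch confirms that ${\overline{AE}}$ bisects the angle (for angles between ${0}$ and ${\pi}$):

```python
import cmath

def bisecting_direction(theta):
    # run the straightedge-and-compass steps with A = 0, B = 1, C = 2 e^{i theta}
    A, B = 0.0, 1.0
    C = 2 * cmath.exp(1j * theta)
    D = C / abs(C)    # circle about A of radius |AB| = 1 meets the ray AC at D
    E = B + D - A     # second intersection of the unit circles about B and D
    return cmath.phase(E - A)

for theta in [0.3, 1.1, 2.0, 2.9]:
    # the line through A and E makes angle theta/2 with AB
    assert abs(bisecting_direction(theta) - theta / 2) < 1e-12
```

Indeed ${E = 1 + e^{i\theta} = 2\cos(\theta/2)\, e^{i\theta/2}}$, which makes the bisection visible algebraically.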

The key difference between angle trisection and angle bisection ultimately boils down to the following trivial number-theoretic fact:

Lemma 2 There is no power of ${2}$ that is evenly divisible by ${3}$.

Proof: Obvious by modular arithmetic, by induction, or by the fundamental theorem of arithmetic. $\Box$
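To spell out the modular arithmetic: the powers of ${2}$ alternate between ${1}$ and ${2}$ modulo ${3}$, and in particular never hit ${0}$, which is the whole content of the lemma:

```python
# powers of 2 modulo 3 cycle through 1, 2, 1, 2, ... and never hit 0
residues = [pow(2, k, 3) for k in range(10)]
assert residues == [1, 2, 1, 2, 1, 2, 1, 2, 1, 2]
assert all(pow(2, k, 3) != 0 for k in range(1000))
```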

In contrast, there are of course plenty of powers of ${2}$ that are evenly divisible by ${2}$, and this is ultimately why angle bisection is easy while angle trisection is hard.

The standard way in which Lemma 2 is used to demonstrate the impossibility of angle trisection is via Galois theory. The implication is quite short if one knows this theory, but quite opaque otherwise. We briefly sketch the proof of this implication here, though we will not need it in the rest of the discussion. Firstly, Lemma 2 implies the following fact about field extensions.

Corollary 3 Let ${F}$ be a field, and let ${E}$ be an extension of ${F}$ that can be constructed out of ${F}$ by a finite sequence of quadratic extensions. Then ${E}$ does not contain any cubic extensions ${K}$ of ${F}$.

Proof: If ${E}$ contained a cubic extension ${K}$ of ${F}$, then the dimension of ${E}$ over ${F}$ would be a multiple of three. On the other hand, if ${E}$ is obtained from ${F}$ by a tower of quadratic extensions, then the dimension of ${E}$ over ${F}$ is a power of two. The claim then follows from Lemma 2. $\Box$

To conclude the proof, one then notes that any point, line, or circle that can be constructed from a configuration ${{\mathcal C}}$ is definable in a field obtained from the coefficients of all the objects in ${{\mathcal C}}$ after taking a finite number of quadratic extensions, whereas a trisection of an angle ${\angle BAC}$ will generically only be definable in a cubic extension of the field generated by the coordinates of ${A,B,C}$.

The Galois theory method also allows one to obtain many other impossibility results of this type, most famously the Abel-Ruffini theorem on the insolvability of the quintic equation by radicals. For this reason (and also because of the many applications of Galois theory to number theory and other branches of mathematics), the Galois theory argument is the “right” way to prove the impossibility of angle trisection within the broader framework of modern mathematics. However, this argument has the drawback that it requires one to first understand Galois theory (or at least field theory), which is usually not presented until an advanced undergraduate algebra or number theory course, whilst the angle trisection problem requires only high-school level mathematics to formulate. Even if one is allowed to “cheat” and sweep several technicalities under the rug, one still needs to possess a fair amount of solid intuition about advanced algebra in order to appreciate the proof. (This was undoubtedly one reason why, even after Wantzel’s impossibility result was published, a large amount of effort was still expended by amateur mathematicians to try to trisect a general angle.)

In this post I would therefore like to present a different proof (or perhaps more accurately, a disguised version of the standard proof) of the impossibility of angle trisection by straightedge and compass, one that avoids explicit mention of Galois theory (though it is never far beneath the surface). With “cheats”, the proof is actually quite simple and geometric (except for Lemma 2, which is still used at a crucial juncture), based on the basic geometric concept of monodromy; unfortunately, some technical work is needed to remove these cheats.

To describe the intuitive idea of the proof, let us return to the angle bisection construction, that takes a triple ${A, B, C}$ of points as input and returns a bisecting line ${\ell}$ as output. We iterate the construction to create a quadrisecting line ${m}$, via the following sequence of steps that extend the original bisection construction:

1. Start with three points ${A, B, C}$.
2. Form the circle ${c_0}$ with centre ${A}$ and radius ${AB}$, and intersect it with the line ${\overline{AC}}$. Let ${D}$ be the point in this intersection that lies on the same side of ${A}$ as ${C}$. (${D}$ may well be equal to ${C}$).
3. Form the circle ${c_1}$ with centre ${B}$ and radius ${AB}$, and the circle ${c_2}$ with centre ${D}$ and radius ${AB}$. Let ${E}$ be the point of intersection of ${c_1}$ and ${c_2}$ that is not ${A}$.
4. Let ${F}$ be the point on the line ${\ell := \overline{AE}}$ which lies on ${c_0}$, and is on the same side of ${A}$ as ${E}$.
5. Form the circle ${c_3}$ with centre ${F}$ and radius ${AB}$. Let ${G}$ be the point of intersection of ${c_1}$ and ${c_3}$ that is not ${A}$.
6. The line ${m := \overline{AG}}$ will then quadrisect the angle ${\angle BAC}$.
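The six steps above can be carried out numerically; here is a minimal sketch of my own (not part of the original text), placing ${A}$ at the origin and ${B}$ at ${1}$ in the complex plane. The only geometric primitive needed is the observation that in steps 3 and 5 the two circles both pass through ${A}$ and have equal radii, so the second intersection point is simply the reflection of ${A}$ across the line joining the two centres.

```python
import cmath

def reflect(z, p, q):
    """Reflect the point z across the line through p and q (complex plane)."""
    d = (q - p) / abs(q - p)              # unit direction of the line
    return p + d * d * (z - p).conjugate()

def quadrisect(A, B, C):
    """Steps 1-6: return G, so that the line AG quadrisects the angle BAC."""
    r = abs(B - A)
    D = A + r * (C - A) / abs(C - A)      # step 2: circle c0 meets ray AC
    E = reflect(A, B, D)                  # step 3: c1 and c2 meet at A and E
    F = A + r * (E - A) / abs(E - A)      # step 4: c0 meets ray AE
    return reflect(A, B, F)               # step 5: c1 and c3 meet at A and G

A, B = 0j, 1 + 0j
theta = 1.2                               # the angle BAC, in radians
G = quadrisect(A, B, cmath.exp(1j * theta))
print(abs(cmath.phase(G - A) - theta / 4) < 1e-9)   # True
```

Since ${|AB| = |AD|}$, the reflection ${E}$ of ${A}$ across the line ${BD}$ lies on the bisector of ${\angle BAD}$, and the same argument applied to ${B}$ and ${F}$ halves the angle once more.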

Let us fix the points ${A}$ and ${B}$, but not ${C}$, and view ${m}$ (as well as intermediate objects such as ${D}$, ${c_2}$, ${E}$, ${\ell}$, ${F}$, ${c_3}$, ${G}$) as a function of ${C}$.

Let us now do the following: we begin rotating ${C}$ counterclockwise around ${A}$, which drags around the other objects ${D}$, ${c_2}$, ${E}$, ${\ell}$, ${F}$, ${c_3}$, ${G}$ that were constructed from ${C}$ accordingly. For instance, here is an early stage of this rotation process, when the angle ${\angle BAC}$ has become obtuse:

Now for the slightly tricky bit. We are going to keep rotating ${C}$ beyond a half-rotation of ${180^\circ}$, so that ${\angle BAC}$ now becomes a reflex angle. At this point, a singularity occurs; the point ${E}$ collides into ${A}$, and so there is an instant in which the line ${\ell = \overline{AE}}$ is not well-defined. However, this turns out to be a removable singularity (and the easiest way to demonstrate this will be to tap the power of complex analysis, as complex numbers can easily route around such a singularity), and we can blast through it to the other side, giving a picture like this:

Note that we have now deviated from the original construction in that ${F}$ and ${E}$ are no longer on the same side of ${A}$; we are thus now working in a continuation of that construction rather than with the construction itself. Nevertheless, we can still work with this continuation (much as, say, one works with analytic continuations of infinite series such as ${\sum_{n=1}^\infty \frac{1}{n^s}}$ beyond their original domain of definition).

We now keep rotating ${C}$ around ${A}$. Here, ${\angle BAC}$ is approaching a full rotation of ${360^\circ}$:

When ${\angle BAC}$ reaches a full rotation, a different singularity occurs: ${c_1}$ and ${c_2}$ coincide. Nevertheless, this is also a removable singularity, and we blast through to beyond a full rotation:

And now ${C}$ is back where it started, as are ${D}$, ${c_2}$, ${E}$, and ${\ell}$… but the point ${F}$ has moved, from one intersection point of ${\ell \cap c_3}$ to the other. As a consequence, ${c_3}$, ${G}$, and ${m}$ have also changed, with ${m}$ being at right angles to where it was before. (In the jargon of modern mathematics, the quadrisection construction has a non-trivial monodromy.)

But nothing stops us from rotating ${C}$ some more. If we continue this procedure, we see that after two full rotations of ${C}$ around ${A}$, all points, lines, and circles constructed from ${A, B, C}$ have returned to their original positions. Because of this, we shall say that the quadrisection construction described above is periodic with period ${2}$.
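The monodromy can also be watched numerically. In the following sketch of my own (with ${A = 0}$, ${B = 1}$, ${C = e^{i\theta}}$, and a tiny offset in ${\theta}$ to step around the exact singularities), the points ${E}$ and ${G}$ are single-valued reflections of ${A}$, and only ${F}$ has to be continued through the singularities, by always selecting the intersection point closest to its previous position:

```python
import cmath, math

# A = 0, B = 1, C = exp(i*theta).  E and G are reflections (single-valued);
# only F, one of the two intersections of line AE with circle c0, must be
# tracked continuously through the singularity at theta = pi.
def reflect(z, p, q):
    d = (q - p) / abs(q - p)
    return p + d * d * (z - p).conjugate()

A, B = 0j, 1 + 0j
steps = 20000                                 # theta runs from ~0 to 4*pi
F = None
for k in range(steps + 1):
    theta = 4 * math.pi * k / steps + 1e-6    # offset avoids exact singularities
    D = cmath.exp(1j * theta)                 # ray AC meets c0 (radius AB = 1)
    E = reflect(A, B, D)
    u = (E - A) / abs(E - A)                  # the two candidates for F are +/- u
    F = u if F is None else min((u, -u), key=lambda w: abs(w - F))
    G = reflect(A, B, F)
    if k == steps // 2:                       # one full rotation of C
        half = math.degrees(cmath.phase(G - A))
full = math.degrees(cmath.phase(G - A))       # two full rotations of C
print(round(half), round(full))               # -90 0
```

After one full rotation the line ${m}$ has turned by a right angle, and after two full rotations it returns to its starting position, consistent with the period-${2}$ behaviour described above.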

Similarly, if one performs an octisection of the angle ${\angle BAC}$ by bisecting the quadrisection, one can verify that this octisection is periodic with period ${4}$; it takes four full rotations of ${C}$ around ${A}$ before the configuration returns to where it started. More generally, one can show

Proposition 4 Any construction of straightedge and compass from the points ${A,B,C}$ is periodic with period equal to a power of ${2}$.

The reason for this, ultimately, is that any two circles or lines intersect each other in at most two points, and so at each step of a straightedge-and-compass construction there is an ambiguity of at most ${2}$. Each rotation of ${C}$ around ${A}$ can potentially flip one of these points to the other, but a second rotation returns the point to its original position; one then analyses each subsequent point in the construction in the same fashion until one obtains the proposition.

But now consider a putative trisection operation, that starts with an arbitrary angle ${\angle BAC}$ and somehow uses some sequence of straightedge and compass constructions to end up with a trisecting line ${\ell}$:

What is the period of this construction? If we continuously rotate ${C}$ around ${A}$, we observe that a full rotation of ${C}$ only causes the trisecting line ${\ell}$ to rotate by a third of a full rotation (i.e. by ${120^\circ}$):

Because of this, we see that the period of any construction that contains ${\ell}$ must be a multiple of ${3}$. But this contradicts Proposition 4 and Lemma 2.
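In coordinates: if ${C = e^{i\theta}}$, the quadrisecting line points along ${e^{i\theta/4}}$ and a putative trisecting line along ${e^{i\theta/3}}$, and a line returns to itself precisely when its direction is multiplied by ${\pm 1}$. A short computation (my own illustration) then recovers the periods ${2}$, ${4}$, and ${3}$ discussed above:

```python
import cmath

def period(denom):
    """Smallest k with exp(2*pi*i*k/denom) = +/-1, i.e. the number of full
    rotations of C before the n-secting line returns to itself."""
    k = 1
    while abs(cmath.exp(2j * cmath.pi * k / denom).imag) > 1e-9:
        k += 1
    return k

print(period(4), period(8), period(3))   # 2 4 3
```

The period ${3}$ of the trisection can never equal a power of ${2}$, which is the contradiction being exploited.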

Below the fold, I will make the above proof rigorous. Unfortunately, in doing so, I had to again leave the world of high-school mathematics, as one needs a little bit of algebraic geometry and complex analysis to resolve the issues with singularities that we saw in the above sketch. Still, I feel that at an intuitive level at least, this argument is more geometric and accessible than the Galois-theoretic argument (though anyone familiar with Galois theory will note that there is really not that much difference between the proofs, ultimately, as one has simply replaced the Galois group with a closely related monodromy group instead).

This is another installment of my series of posts on Hilbert’s fifth problem. One formulation of this problem is answered by the following theorem of Gleason and Montgomery-Zippin:

Theorem 1 (Hilbert’s fifth problem) Let ${G}$ be a topological group which is locally Euclidean. Then ${G}$ is isomorphic to a Lie group.

Theorem 1 is a deep and difficult result, but the discussion in the previous posts has reduced its proof to that of establishing two simpler results, involving the concept of a group with no small subgroups (NSS), and that of a Gleason metric. We briefly recall the relevant definitions:

Definition 2 (NSS) A topological group ${G}$ is said to have no small subgroups, or is NSS for short, if there is an open neighbourhood ${U}$ of the identity in ${G}$ that contains no subgroups of ${G}$ other than the trivial subgroup ${\{ \hbox{id}\}}$.

Definition 3 (Gleason metric) Let ${G}$ be a topological group. A Gleason metric on ${G}$ is a left-invariant metric ${d: G \times G \rightarrow {\bf R}^+}$ which generates the topology on ${G}$ and obeys the following properties for some constant ${C>0}$, writing ${\|g\|}$ for ${d(g,\hbox{id})}$:

• (Escape property) If ${g \in G}$ and ${n \geq 1}$ is such that ${n \|g\| \leq \frac{1}{C}}$, then ${\|g^n\| \geq \frac{1}{C} n \|g\|}$. ${(1)}$
• (Commutator estimate) If ${g, h \in G}$ are such that ${\|g\|, \|h\| \leq \frac{1}{C}}$, then ${\|[g,h]\| \leq C \|g\| \|h\|}$, where ${[g,h] := g^{-1}h^{-1}gh}$ is the commutator of ${g}$ and ${h}$. ${(2)}$

The remaining steps in the resolution of Hilbert’s fifth problem are then as follows:

Theorem 4 (Reduction to the NSS case) Let ${G}$ be a locally compact group, and let ${U}$ be an open neighbourhood of the identity in ${G}$. Then there exists an open subgroup ${G'}$ of ${G}$, and a compact subgroup ${N}$ of ${G'}$ contained in ${U}$, such that ${G'/N}$ is NSS and locally compact.

Theorem 5 (Gleason’s lemma) Let ${G}$ be a locally compact NSS group. Then ${G}$ has a Gleason metric.

The purpose of this post is to establish these two results, using arguments that are originally due to Gleason. We will split this task into several subtasks, each of which improves the structure on the group ${G}$ by some amount:

Proposition 6 (From locally compact to metrisable) Let ${G}$ be a locally compact group, and let ${U}$ be an open neighbourhood of the identity in ${G}$. Then there exists an open subgroup ${G'}$ of ${G}$, and a compact subgroup ${N}$ of ${G'}$ contained in ${U}$, such that ${G'/N}$ is locally compact and metrisable.

For any open neighbourhood ${U}$ of the identity in ${G}$, let ${Q(U)}$ be the union of all the subgroups of ${G}$ that are contained in ${U}$. (Thus, for instance, ${G}$ is NSS if and only if ${Q(U)}$ is trivial for all sufficiently small ${U}$.)

Proposition 7 (From metrisable to subgroup trapping) Let ${G}$ be a locally compact metrisable group. Then ${G}$ has the subgroup trapping property: for every open neighbourhood ${U}$ of the identity, there exists another open neighbourhood ${V}$ of the identity such that ${Q(V)}$ generates a subgroup ${\langle Q(V) \rangle}$ contained in ${U}$.

Proposition 8 (From subgroup trapping to NSS) Let ${G}$ be a locally compact group with the subgroup trapping property, and let ${U}$ be an open neighbourhood of the identity in ${G}$. Then there exists an open subgroup ${G'}$ of ${G}$, and a compact subgroup ${N}$ of ${G'}$ contained in ${U}$, such that ${G'/N}$ is locally compact and NSS.

Proposition 9 (From NSS to the escape property) Let ${G}$ be a locally compact NSS group. Then there exists a left-invariant metric ${d}$ on ${G}$ generating the topology on ${G}$ which obeys the escape property (1) for some constant ${C}$.

Proposition 10 (From escape to the commutator estimate) Let ${G}$ be a locally compact group with a left-invariant metric ${d}$ that obeys the escape property (1). Then ${d}$ also obeys the commutator property (2).

It is clear that Propositions 6, 7, and 8 combine to give Theorem 4, and Propositions 9, 10 combine to give Theorem 5.

Propositions 6–10 are all proven separately, but their proofs share some common strategies and ideas. The first main idea is to construct metrics on a locally compact group ${G}$ by starting with a suitable “bump function” ${\phi \in C_c(G)}$ (i.e. a continuous, compactly supported function from ${G}$ to ${{\bf R}}$) and pulling back the metric structure on ${C_c(G)}$ by using the translation action ${\tau_g \phi(x) := \phi(g^{-1} x)}$, thus creating a (semi-)metric

$\displaystyle d_\phi( g, h ) := \| \tau_g \phi - \tau_h \phi \|_{C_c(G)} := \sup_{x \in G} |\phi(g^{-1} x) - \phi(h^{-1} x)|. \ \ \ \ \ (3)$

One easily verifies that this is indeed a (semi-)metric (in that it is non-negative, symmetric, and obeys the triangle inequality); it is also left-invariant, and so we have ${d_\phi(g,h) = \|g^{-1} h \|_\phi = \| h^{-1} g \|_\phi}$, where

$\displaystyle \| g \|_\phi = d_\phi(g,\hbox{id}) = \| \partial_g \phi \|_{C_c(G)}$

where ${\partial_g}$ is the difference operator ${\partial_g = 1 - \tau_g}$,

$\displaystyle \partial_g \phi(x) = \phi(x) - \phi(g^{-1} x).$
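As a concrete instance of this construction (a sketch of my own), take ${G = ({\bf R},+)}$ with ${\tau_g \phi(x) = \phi(x-g)}$ and a “tent” function standing in for the bump function ${\phi}$; the supremum is approximated by a maximum over a grid chosen so that the translations involved map the grid to itself:

```python
import numpy as np

# tau_g phi(x) = phi(x - g), and d_phi(g,h) = sup_x |phi(x-g) - phi(x-h)|,
# approximated by a maximum over a grid of step 0.1 (g, h, k are chosen as
# multiples of the step, so translation maps the grid to itself).
x = np.linspace(-20, 20, 401)
phi = lambda t: np.maximum(0.0, 1.0 - np.abs(t))   # tent bump, support [-1, 1]

def d_phi(g, h):
    return np.max(np.abs(phi(x - g) - phi(x - h)))

g, h, k = 0.4, -0.7, 1.3
assert np.isclose(d_phi(k + g, k + h), d_phi(g, h))       # left-invariance
assert d_phi(g, h) <= d_phi(g, k) + d_phi(k, h) + 1e-12   # triangle inequality
```

The triangle inequality is inherited directly from the sup-norm on ${C_c(G)}$, and left-invariance reflects the fact that ${d_\phi(g,h)}$ depends only on ${g^{-1}h}$.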

This construction was already seen in the proof of the Birkhoff-Kakutani theorem, which is the main tool used to establish Proposition 6. For the other propositions, the idea is to choose a bump function ${\phi}$ that is “smooth” enough that it creates a metric with good properties such as the commutator estimate (2). Roughly speaking, to get a bound of the form (2), one needs ${\phi}$ to have “${C^{1,1}}$ regularity” with respect to the “right” smooth structure on ${G}$. By ${C^{1,1}}$ regularity, we mean here something like a bound of the form

$\displaystyle \| \partial_g \partial_h \phi \|_{C_c(G)} \ll \|g\|_\phi \|h\|_\phi \ \ \ \ \ (4)$

for all ${g,h \in G}$. Here we use the usual asymptotic notation, writing ${X \ll Y}$ or ${X=O(Y)}$ if ${X \leq CY}$ for some constant ${C}$ (which can vary from line to line).

The following lemma illustrates how ${C^{1,1}}$ regularity can be used to build Gleason metrics.

Lemma 11 Suppose that ${\phi \in C_c(G)}$ obeys (4). Then the (semi-)metric ${d_\phi}$ (and associated (semi-)norm ${\| \cdot \|_\phi}$) obey the escape property (1) and the commutator property (2).

Proof: We begin with the commutator property (2). Observe the identity

$\displaystyle \tau_{[g,h]} = \tau_{hg}^{-1} \tau_{gh}$

whence

$\displaystyle \partial_{[g,h]} = \tau_{hg}^{-1} ( \tau_{hg} - \tau_{gh} )$

$\displaystyle = \tau_{hg}^{-1} ( \partial_h \partial_g - \partial_g \partial_h ).$

From the triangle inequality (and translation-invariance of the ${C_c(G)}$ norm) we thus see that (2) follows from (4). Similarly, to obtain the escape property (1), observe the telescoping identity

$\displaystyle \partial_{g^n} = n \partial_g - \sum_{i=0}^{n-1} \partial_g \partial_{g^i}$

for any ${g \in G}$ and natural number ${n}$, and thus by the triangle inequality

$\displaystyle \| g^n \|_\phi = n \| g \|_\phi + O( \sum_{i=0}^{n-1} \| \partial_g \partial_{g^i} \phi \|_{C_c(G)} ). \ \ \ \ \ (5)$

But from (4) (and the triangle inequality) we have

$\displaystyle \| \partial_g \partial_{g^i} \phi \|_{C_c(G)} \ll \|g\|_\phi \|g^i \|_\phi \ll i \|g\|_\phi^2$

and thus we have the “Taylor expansion”

$\displaystyle \|g^n\|_\phi = n \|g\|_\phi + O( n^2 \|g\|_\phi^2 )$

which gives (1). $\Box$
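Since everything here is built from difference operators, the telescoping identity can be checked numerically; the following sketch of my own takes ${G = ({\bf R},+)}$ (so that ${g^n}$ corresponds to translation by ${ng}$) and a Gaussian standing in for the bump function ${\phi}$. Note the sign: expanding ${\partial_g \partial_{g^i} = 1 - \tau_{g^i} - \tau_g + \tau_{g^{i+1}}}$ and summing in ${i}$ telescopes to ${1 - \tau_{g^n}}$.

```python
import numpy as np

# For G = (R, +): tau_g phi(x) = phi(x - g), d_g phi = phi - tau_g phi, and
# g^n corresponds to translation by n*g.  We check the telescoping identity
#   d_{g^n} phi = n d_g phi - sum_{i=0}^{n-1} d_g d_{g^i} phi
# pointwise on a grid.
phi = lambda t: np.exp(-t ** 2)                     # a smooth "bump"
x = np.linspace(-10, 10, 2001)
g, n = 0.3, 7

def d(gval, f):
    """The difference operator: (d_g f)(x) = f(x) - f(x - gval)."""
    return lambda t: f(t) - f(t - gval)

lhs = d(n * g, phi)(x)
rhs = n * d(g, phi)(x) - sum(d(g, d(i * g, phi))(x) for i in range(n))
assert np.allclose(lhs, rhs)
```

The ${i=0}$ term vanishes (as ${\partial_{g^0} = 0}$), so the sum effectively starts at ${i=1}$.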

It remains to obtain ${\phi}$ that have the desired ${C^{1,1}}$ regularity property. In order to get such regular bump functions, we will use the trick of convolving together two lower regularity bump functions (such as two functions with “${C^{0,1}}$ regularity” in some sense to be determined later). In order to perform this convolution, we will use the fundamental tool of (left-invariant) Haar measure ${\mu}$ on the locally compact group ${G}$. Here we exploit the basic fact that the convolution

$\displaystyle f_1 * f_2(x) := \int_G f_1(y) f_2(y^{-1} x)\ d\mu(y) \ \ \ \ \ (6)$

of two functions ${f_1,f_2 \in C_c(G)}$ tends to be smoother than either of the two factors ${f_1,f_2}$. This is easiest to see in the abelian case, since in this case we can distribute derivatives according to the law

$\displaystyle \partial_g (f_1 * f_2) = (\partial_g f_1) * f_2 = f_1 * (\partial_g f_2),$

which suggests that the order of “differentiability” of ${f_1*f_2}$ should be the sum of the orders of ${f_1}$ and ${f_2}$ separately.
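For instance, one can verify this law exactly on the finite cyclic group ${{\bf Z}/N{\bf Z}}$ (a sketch of my own, with counting measure as the Haar measure and the circular convolution computed by the fast Fourier transform):

```python
import numpy as np

# Convolution on the cyclic group Z/NZ: f1*f2(x) = sum_y f1(y) f2(x - y),
# with tau_g f(x) = f(x - g) realized as np.roll(f, g), and d_g = 1 - tau_g.
N, g = 64, 5
rng = np.random.default_rng(0)
f1, f2 = rng.standard_normal(N), rng.standard_normal(N)

conv = lambda a, b: np.real(np.fft.ifft(np.fft.fft(a) * np.fft.fft(b)))
d = lambda a: a - np.roll(a, g)            # the difference operator d_g

# the "derivative" d_g can be placed on either factor
assert np.allclose(d(conv(f1, f2)), conv(d(f1), f2))
assert np.allclose(d(conv(f1, f2)), conv(f1, d(f2)))
```

In the non-abelian case one can only move ${\partial_g}$ onto one of the two factors (depending on whether one convolves on the left or on the right), but this is still enough for the arguments below.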

These ideas are already sufficient to establish Proposition 10 directly, and also Proposition 9 when combined with an additional bootstrap argument. The proofs of Proposition 7 and Proposition 8 use similar techniques, but are more difficult due to the potential presence of small subgroups, which require an application of the Peter-Weyl theorem to control properly. Both of these theorems will be proven below the fold, thus (when combined with the preceding posts) completing the proof of Theorem 1.

The presentation here is based on some unpublished notes of van den Dries and Goldbring on Hilbert’s fifth problem. I am indebted to Emmanuel Breuillard, Ben Green, and Tom Sanders for many discussions related to these arguments.

Hilbert’s fifth problem asks to clarify the extent to which the assumption of a differentiable or smooth structure is actually needed in the theory of Lie groups and their actions. While this question is not precisely formulated and is thus open to some interpretation, the following result of Gleason and Montgomery-Zippin answers at least one aspect of this question:

Theorem 1 (Hilbert’s fifth problem) Let ${G}$ be a topological group which is locally Euclidean (i.e. it is a topological manifold). Then ${G}$ is isomorphic to a Lie group.

Theorem 1 can be viewed as an application of the more general structural theory of locally compact groups. In particular, Theorem 1 can be deduced from the following structural theorem of Gleason and Yamabe:

Theorem 2 (Gleason-Yamabe theorem) Let ${G}$ be a locally compact group, and let ${U}$ be an open neighbourhood of the identity in ${G}$. Then there exists an open subgroup ${G'}$ of ${G}$, and a compact subgroup ${N}$ of ${G'}$ contained in ${U}$, such that ${G'/N}$ is isomorphic to a Lie group.

The deduction of Theorem 1 from Theorem 2 proceeds using the Brouwer invariance of domain theorem and is discussed in this previous post. In this post, I would like to discuss the proof of Theorem 2. We can split this proof into three parts, by introducing two additional concepts. The first is the property of having no small subgroups:

Definition 3 (NSS) A topological group ${G}$ is said to have no small subgroups, or is NSS for short, if there is an open neighbourhood ${U}$ of the identity in ${G}$ that contains no subgroups of ${G}$ other than the trivial subgroup ${\{ \hbox{id}\}}$.

An equivalent definition of an NSS group is one which has an open neighbourhood ${U}$ of the identity from which every non-identity element ${g \in G \backslash \{\hbox{id}\}}$ escapes in finite time, in the sense that ${g^n \not \in U}$ for some positive integer ${n}$. It is easy to see that all Lie groups are NSS; we shall shortly see that the converse statement (in the locally compact case) is also true, though significantly harder to prove.

Another useful property is that of having what I will call a Gleason metric:

Definition 4 Let ${G}$ be a topological group. A Gleason metric on ${G}$ is a left-invariant metric ${d: G \times G \rightarrow {\bf R}^+}$ which generates the topology on ${G}$ and obeys the following properties for some constant ${C>0}$, writing ${\|g\|}$ for ${d(g,\hbox{id})}$:

• (Escape property) If ${g \in G}$ and ${n \geq 1}$ is such that ${n \|g\| \leq \frac{1}{C}}$, then ${\|g^n\| \geq \frac{1}{C} n \|g\|}$.
• (Commutator estimate) If ${g, h \in G}$ are such that ${\|g\|, \|h\| \leq \frac{1}{C}}$, then

$\displaystyle \|[g,h]\| \leq C \|g\| \|h\|, \ \ \ \ \ (1)$

where ${[g,h] := g^{-1}h^{-1}gh}$ is the commutator of ${g}$ and ${h}$.

For instance, the operator norm metric ${d(g,h) := \|g-h\|_{op}}$ on the unitary group ${U(n)}$ is easily verified to be a Gleason metric, with the commutator estimate (1) coming from the inequality

$\displaystyle \| [g,h] - 1 \|_{op} = \| gh - hg \|_{op}$

$\displaystyle = \| (g-1) (h-1) - (h-1) (g-1) \|_{op}$

$\displaystyle \leq 2 \|g-1\|_{op} \|h-1\|_{op}.$
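This estimate is easy to test numerically (a sketch of my own; the choice to generate unitaries near the identity by exponentiating small skew-Hermitian matrices is an assumption of the test, not part of the text):

```python
import numpy as np

# Check ||[g,h] - 1||_op <= 2 ||g-1||_op ||h-1||_op for random g, h in U(n).
rng = np.random.default_rng(0)
n = 4
I = np.eye(n)
op = lambda m: np.linalg.norm(m, 2)        # operator (spectral) norm

def small_unitary():
    a = 0.05 * (rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n)))
    h = a + a.conj().T                     # small Hermitian matrix
    w, v = np.linalg.eigh(h)
    return v @ np.diag(np.exp(1j * w)) @ v.conj().T   # exp(i h) is unitary

for _ in range(100):
    g, h = small_unitary(), small_unitary()
    comm = g.conj().T @ h.conj().T @ g @ h             # [g,h] = g^{-1}h^{-1}gh
    assert op(comm - I) <= 2 * op(g - I) * op(h - I) + 1e-12
```

The unitary invariance of the operator norm is what lets one discard the factor ${\tau_{hg}^{-1}}$-analogue ${g^{-1}h^{-1}}$, reducing ${\|[g,h]-1\|_{op}}$ to ${\|gh - hg\|_{op}}$.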

Similarly, any left-invariant Riemannian metric on a (connected) Lie group can be verified to be a Gleason metric. From the escape property one easily sees that all groups with Gleason metrics are NSS; again, we shall see that there is a partial converse.

Remark 1 The escape and commutator properties are meant to capture “Euclidean-like” structure of the group. Other metrics, such as Carnot-Carathéodory metrics on Carnot Lie groups such as the Heisenberg group, usually fail one or both of these properties.

The proof of Theorem 2 can then be split into three subtheorems:

Theorem 5 (Reduction to the NSS case) Let ${G}$ be a locally compact group, and let ${U}$ be an open neighbourhood of the identity in ${G}$. Then there exists an open subgroup ${G'}$ of ${G}$, and a compact subgroup ${N}$ of ${G'}$ contained in ${U}$, such that ${G'/N}$ is NSS, locally compact, and metrisable.

Theorem 6 (Gleason’s lemma) Let ${G}$ be a locally compact metrisable NSS group. Then ${G}$ has a Gleason metric.

Theorem 7 (Building a Lie structure) Let ${G}$ be a locally compact group with a Gleason metric. Then ${G}$ is isomorphic to a Lie group.

Clearly, by combining Theorem 5, Theorem 6, and Theorem 7 one obtains Theorem 2 (and hence Theorem 1).

Theorem 5 and Theorem 6 proceed by some elementary combinatorial analysis, together with the use of Haar measure (to build convolutions, and thence to build “smooth” bump functions with which to create a metric, in a variant of the analysis used to prove the Birkhoff-Kakutani theorem); Theorem 5 also requires the Peter-Weyl theorem (to dispose of certain compact subgroups that arise en route to the reduction to the NSS case), which was discussed previously on this blog.

In this post I would like to detail the final component to the proof of Theorem 2, namely Theorem 7. (I plan to discuss the other two steps, Theorem 5 and Theorem 6, in a separate post.) The strategy is similar to that used to prove von Neumann’s theorem, as discussed in this previous post (and von Neumann’s theorem is also used in the proof), but with the Gleason metric serving as a substitute for the faithful linear representation. Namely, one first gives the space ${L(G)}$ of one-parameter subgroups of ${G}$ enough of a structure that it can serve as a proxy for the “Lie algebra” of ${G}$; specifically, it needs to be a vector space, and the “exponential map” needs to cover an open neighbourhood of the identity. This is enough to set up an “adjoint” representation of ${G}$, whose image is a Lie group by von Neumann’s theorem; the kernel is essentially the centre of ${G}$, which is abelian and can also be shown to be a Lie group by a similar analysis. To finish the job one needs to use arguments of Kuranishi and of Gleason, as discussed in this previous post.

The arguments here can be phrased either in the standard analysis setting (using sequences, and passing to subsequences often) or in the nonstandard analysis setting (selecting an ultrafilter, and then working with infinitesimals). In my view, the two approaches have roughly the same level of complexity in this case, and I have elected for the standard analysis approach.

Remark 2 From Theorem 7 we see that a Gleason metric structure is a good enough substitute for smooth structure that it can actually be used to reconstruct the entire smooth structure; roughly speaking, the commutator estimate (1) allows for enough “Taylor expansion” of expressions such as ${g^n h^n}$ that one can simulate the fundamentals of Lie theory (in particular, the construction of the Lie algebra and the exponential map, and their basic properties). The advantage of working with a Gleason metric rather than a smoother structure, though, is that it is relatively undemanding with regards to regularity; in particular, the commutator estimate (1) is roughly comparable to the imposition of a ${C^{1,1}}$ structure on the group ${G}$, as this is the minimal regularity needed to get the type of Taylor approximation (with quadratic errors) that would yield a bound of the form (1). We will return to this point in a later post.

In a previous blog post, I discussed the recent result of Guth and Katz obtaining a near-optimal bound on the Erdos distance problem. One of the tools used in the proof (building upon the earlier work of Elekes and Sharir) was the observation that the incidence geometry of the Euclidean group ${SE(2)}$ of rigid motions of the plane was almost identical to that of lines in the Euclidean space ${{\bf R}^3}$:

Proposition 1 One can identify a (Zariski-)dense portion of ${SE(2)}$ with ${{\bf R}^3}$, in such a way that for any two points ${A, B}$ in the plane ${{\bf R}^2}$, the set ${l_{AB} := \{ R \in SE(2): RA = B \}}$ of rigid motions mapping ${A}$ to ${B}$ forms a line in ${{\bf R}^3}$.

Proof: A rigid motion is either a translation or a rotation, with the latter forming a Zariski-dense subset of ${SE(2)}$. Identify a rotation ${R}$ in ${SE(2)}$ by an angle ${\theta}$ with ${|\theta| < \pi}$ around a point ${P}$ with the element ${(P, \cot \frac{\theta}{2})}$ in ${{\bf R}^3}$. (Note that such rotations also form a Zariski-dense subset of ${SE(2)}$.) Elementary trigonometry then reveals that if ${R}$ maps ${A}$ to ${B}$, then ${P}$ lies on the perpendicular bisector of ${AB}$, and depends in a linear fashion on ${\cot \frac{\theta}{2}}$ (for fixed ${A,B}$). The claim follows. $\Box$
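The linearity in this proof can be confirmed numerically; in the following sketch of my own, the plane is identified with ${{\bf C}}$, and the rotation centre ${P}$ is recovered from the equation ${B = P + e^{i\theta}(A-P)}$:

```python
import cmath, math
import numpy as np

# A rotation by theta about the centre P maps A to B iff
#   B = P + exp(i*theta) * (A - P),
# i.e.  P = (B - exp(i*theta)*A) / (1 - exp(i*theta)).
# Proposition 1 predicts that the points (Re P, Im P, cot(theta/2)) all lie
# on a single line in R^3, for fixed A and B.
A, B = 0.3 + 0.2j, 1.5 - 0.4j
pts = []
for theta in [0.3, 0.9, 1.7, 2.4, -1.1, -2.8]:
    w = cmath.exp(1j * theta)
    P = (B - w * A) / (1 - w)
    pts.append([P.real, P.imag, 1 / math.tan(theta / 2)])

pts = np.array(pts)
disp = pts[1:] - pts[0]                    # displacements from the first point
# collinear iff every cross product with the first displacement vanishes
assert all(np.allclose(np.cross(disp[0], v), 0) for v in disp[1:])
```

Indeed, unwinding the formula gives ${P = \frac{A+B}{2} + \frac{i(B-A)}{2} \cot\frac{\theta}{2}}$, which is visibly affine in ${\cot\frac{\theta}{2}}$ and traces out the perpendicular bisector of ${AB}$.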

As seen from the proof, this proposition is an easy (though ad hoc) application of elementary trigonometry, but it was still puzzling to me why such a simple parameterisation of the incidence structure of ${SE(2)}$ was possible. Certainly it was clear from general algebraic geometry considerations that some bounded-degree algebraic description was available, but why would the ${l_{AB}}$ be expressible as lines and not as, say, quadratic or cubic curves?

In this post I would like to record some observations arising from discussions with Jordan Ellenberg, Jozsef Solymosi, and Josh Zahl which give a more conceptual (but less elementary) derivation of the above proposition that avoids the use of ad hoc coordinate transformations such as ${R \mapsto (P, \cot\frac{\theta}{2})}$. The starting point is to view the Euclidean plane ${{\bf R}^2}$ as the scaling limit of the sphere ${S^2}$ (a fact which is familiar to all of us through the geometry of the Earth), which makes the Euclidean group ${SE(2)}$ a scaling limit of the rotation group ${SO(3)}$. The latter can then be lifted to a double cover, namely the spin group ${Spin(3)}$. This group has a natural interpretation as the unit quaternions, which can be identified with the unit sphere ${S^3}$. The analogue of the lines ${l_{AB}}$ in this setting become great circles on this sphere; applying a projective transformation, one can map ${S^3}$ to ${{\bf R}^3}$ (or more precisely to the projective space ${{\bf P}^3}$), at which point the great circles become lines. This gives a proof of Proposition 1.

Details of the correspondence are provided below the fold. One by-product of this analysis, incidentally, is the observation that the Guth-Katz bound ${g(N) \gg N / \log N}$ for the Erdos distance problem in the plane ${{\bf R}^2}$, immediately extends with almost no modification to the sphere ${S^2}$ as well (i.e. any ${N}$ points in ${S^2}$ determine ${\gg N/\log N}$ distances), as well as to the hyperbolic plane ${H^2}$.