You are currently browsing the monthly archive for October 2011.

Let ${{\mathfrak g}}$ be a finite-dimensional Lie algebra (over the reals). Given two sufficiently small elements ${x, y}$ of ${{\mathfrak g}}$, define the right Baker-Campbell-Hausdorff-Dynkin law

$\displaystyle R_y(x) := x + \int_0^1 F_R( \hbox{Ad}_x \hbox{Ad}_{ty} ) y \ dt \ \ \ \ \ (1)$

where ${\hbox{Ad}_x := \exp(\hbox{ad}_x)}$, ${\hbox{ad}_x: {\mathfrak g} \rightarrow {\mathfrak g}}$ is the adjoint map ${\hbox{ad}_x(y) := [x,y]}$, and ${F_R}$ is the function ${F_R(z) := \frac{z \log z}{z-1}}$, which is analytic for ${z}$ near ${1}$. Similarly, define the left Baker-Campbell-Hausdorff-Dynkin law

$\displaystyle L_x(y) := y + \int_0^1 F_L( \hbox{Ad}_{tx} \hbox{Ad}_y ) x\ dt \ \ \ \ \ (2)$

where ${F_L(z) := \frac{\log z}{z-1}}$. One easily verifies that these expressions are well-defined (and depend smoothly on ${x}$ and ${y}$) when ${x}$ and ${y}$ are sufficiently small.

We have the famous Baker-Campbell-Hausdorff-Dynkin formula:

Theorem 1 (BCH formula) Let ${G}$ be a finite-dimensional Lie group over the reals with Lie algebra ${{\mathfrak g}}$. Let ${\log}$ be a local inverse of the exponential map ${\exp: {\mathfrak g} \rightarrow G}$, defined in a neighbourhood of the identity. Then for sufficiently small ${x, y \in {\mathfrak g}}$, one has

$\displaystyle \log( \exp(x) \exp(y) ) = R_y(x) = L_x(y).$

See for instance these notes of mine for a proof of this formula (it is for ${R_y}$, but one easily obtains a similar proof for ${L_x}$).

In particular, one can give a neighbourhood of the identity in ${{\mathfrak g}}$ the structure of a local Lie group by defining the group operation ${\ast}$ as

$\displaystyle x \ast y := R_y(x) = L_x(y) \ \ \ \ \ (3)$

for sufficiently small ${x, y}$, and the inverse operation by ${x^{-1} := -x}$ (one easily verifies that ${R_x(-x) = L_x(-x) = 0}$ for all small ${x}$).
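
As a sanity check on (3), here is a minimal sketch in the simplest non-abelian case. It rests on an assumption not made in the text: we specialise to the three-dimensional Heisenberg Lie algebra, realised as strictly upper triangular ${3 \times 3}$ matrices, where all double commutators vanish and both (1) and (2) collapse to ${x + y + \frac{1}{2}[x,y]}$. The helper names (`heis`, `star`, and so on) are illustrative only.

```python
from fractions import Fraction

def heis(a, b, c):
    """a*X + b*Y + c*Z in the Heisenberg Lie algebra (strictly upper triangular 3x3)."""
    return [[Fraction(0), Fraction(a), Fraction(c)],
            [Fraction(0), Fraction(0), Fraction(b)],
            [Fraction(0), Fraction(0), Fraction(0)]]

IDENTITY = [[Fraction(int(i == j)) for j in range(3)] for i in range(3)]

def matmul(p, q):
    return [[sum(p[i][k] * q[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def combine(p, q, s=1):
    """Entrywise p + s*q."""
    return [[u + s * v for u, v in zip(rp, rq)] for rp, rq in zip(p, q)]

def bracket(x, y):
    """Lie bracket [x, y] = xy - yx."""
    return combine(matmul(x, y), matmul(y, x), -1)

def exp_nil(x):
    """exp(x) = 1 + x + x^2/2; the series stops because x^3 = 0 here."""
    return combine(combine(IDENTITY, x), matmul(x, x), Fraction(1, 2))

def log_unipotent(g):
    """log(1 + n) = n - n^2/2; again exact because n^3 = 0."""
    n = combine(g, IDENTITY, -1)
    return combine(n, matmul(n, n), Fraction(-1, 2))

def star(x, y):
    """The group law (3) in the 2-step nilpotent case: x + y + [x, y]/2."""
    return combine(combine(x, y), bracket(x, y), Fraction(1, 2))
```

With exact rational arithmetic one can then compare `log_unipotent(matmul(exp_nil(x), exp_nil(y)))` against `star(x, y)`, and check directly that `star` is associative with inverse ${-x}$. In this special case everything is polynomial, so none of the difficulty of the general associativity question is visible.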

It is tempting to reverse the BCH formula and conclude (the local form of) Lie’s third theorem, that every finite-dimensional Lie algebra is isomorphic to the Lie algebra of some local Lie group, by using (3) to define a smooth local group structure on a neighbourhood of the identity. (See this previous post for a definition of a local Lie group.) The main difficulty in doing so is in verifying that the definition (3) is well-defined (i.e. that ${R_y(x)}$ is always equal to ${L_x(y)}$) and locally associative. The well-definedness issue can be trivially disposed of by using just one of the expressions ${R_y(x)}$ or ${L_x(y)}$ as the definition of ${\ast}$ (though, as we shall see, it will be very convenient to use both of them simultaneously). However, the associativity is not obvious at all.

With the assistance of Ado’s theorem, which places ${{\mathfrak g}}$ inside the general linear Lie algebra ${\mathfrak{gl}_n({\bf R})}$ for some ${n}$, one can deduce both the well-definedness and associativity of (3) from the Baker-Campbell-Hausdorff formula for ${\mathfrak{gl}_n({\bf R})}$. However, Ado’s theorem is rather difficult to prove (see for instance this previous blog post for a proof), and it is natural to ask whether there is a way to establish these facts without Ado’s theorem.

After playing around with this for some time, I managed to extract a direct proof of well-definedness and local associativity of (3), giving a proof of Lie’s third theorem independent of Ado’s theorem. This is not a new result by any means (indeed, the original proofs of Lie and Cartan of Lie’s third theorem did not use Ado’s theorem), but I found it an instructive exercise to work out the details, and so I am putting it up on this blog in case anyone else is interested (and also because I want to be able to find the argument again if I ever need it in the future).

In the previous set of notes, we introduced the notion of an ultra approximate group – an ultraproduct ${A = \prod_{n \rightarrow\alpha} A_n}$ of finite ${K}$-approximate groups ${A_n}$ for some ${K}$ independent of ${n}$, where each ${K}$-approximate group ${A_n}$ may lie in a distinct ambient group ${G_n}$. Although these objects arise initially from the “finitary” objects ${A_n}$, it turns out that ultra approximate groups ${A}$ can be profitably analysed by means of infinitary groups ${L}$ (and in particular, locally compact groups or Lie groups ${L}$), by means of certain models ${\rho: \langle A \rangle \rightarrow L}$ of ${A}$ (or of the group ${\langle A \rangle}$ generated by ${A}$). We will define precisely what we mean by a model later, but as a first approximation one can view a model as a representation of the ultra approximate group ${A}$ (or of ${\langle A \rangle}$) that is “macroscopically faithful” in that it accurately describes the “large scale” behaviour of ${A}$ (or equivalently, that the kernel of the representation is “microscopic” in some sense). In the next section we will see how one can use “Gleason lemma” technology to convert this macroscopic control of an ultra approximate group into microscopic control, which will be the key to classifying approximate groups.

Models of ultra approximate groups can be viewed as the multiplicative combinatorics analogue of the more well known concept of an ultralimit of metric spaces, which we briefly review below the fold as motivation.

The crucial observation is that ultra approximate groups enjoy a local compactness property which allows them to be usefully modeled by locally compact groups (and hence, through the Gleason-Yamabe theorem from previous notes, by Lie groups also). As per the Heine-Borel theorem, the local compactness will come from a combination of a completeness property and a local total boundedness property. The completeness property turns out to be a direct consequence of the countable saturation property of ultraproducts, thus illustrating one of the key advantages of the ultraproduct setting. The local total boundedness property is more interesting. Roughly speaking, it asserts that “large bounded sets” (such as ${A}$ or ${A^{100}}$) can be covered by finitely many translates of “small bounded sets” ${S}$, where “small” is meant in a topological group sense, implying in particular that large powers ${S^m}$ of ${S}$ lie inside a set such as ${A}$ or ${A^4}$. The easiest way to obtain such a property comes from the following lemma of Sanders:

Lemma 1 (Sanders lemma) Let ${A}$ be a finite ${K}$-approximate group in a (global) group ${G}$, and let ${m \geq 1}$. Then there exists a symmetric subset ${S}$ of ${A^4}$ with ${|S| \gg_{K,m} |A|}$ containing the identity such that ${S^m \subset A^4}$.
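
To get a feel for the statement, here is a minimal sketch in the easiest possible setting, which is an assumption of the example and not the general case: ${A = \{-N,\ldots,N\}}$ in the additive group ${{\bf Z}}$, a ${2}$-approximate group. Writing the group law additively, ${A^4}$ is the interval ${\{-4N,\ldots,4N\}}$, and the shorter interval ${S = \{-r,\ldots,r\}}$ with ${r = \lfloor 4N/m \rfloor}$ satisfies the conclusion of the lemma. The function names are illustrative.

```python
def interval(n):
    """The symmetric interval {-n, ..., n} in Z."""
    return set(range(-n, n + 1))

def sumset(a, b):
    """Additive analogue of the product set A.B."""
    return {x + y for x in a for y in b}

def power(a, m):
    """m-fold sumset a + ... + a, the additive analogue of A^m."""
    result = {0}
    for _ in range(m):
        result = sumset(result, a)
    return result

def sanders_witness(n, m):
    """A candidate S for A = interval(n): the interval of radius floor(4n/m)."""
    return interval((4 * n) // m)
```

For instance, with ${N = 50}$ and ${m = 10}$ one can check that `power(sanders_witness(50, 10), 10)` is contained in `power(interval(50), 4)`, while ${S}$ still has size comparable to ${|A|/m}$. The content of the lemma is that something similar survives in arbitrary, highly non-abelian groups, where no such explicit construction is available.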

This lemma has an elementary combinatorial proof, and is the key to endowing an ultra approximate group with locally compact structure. There is also a closely related lemma of Croot and Sisask which can achieve similar results, and which will also be discussed below. (The locally compact structure can also be established more abstractly using the much more general methods of definability theory, as was first done by Hrushovski, but we will not discuss this approach here.)

By combining the locally compact structure of ultra approximate groups ${A}$ with the Gleason-Yamabe theorem, one ends up being able to model a large “ultra approximate subgroup” ${A'}$ of ${A}$ by a Lie group ${L}$. Such Lie models serve a number of important purposes in the structure theory of approximate groups. Firstly, as all Lie groups have a dimension which is a natural number, they allow one to assign a natural number “dimension” to ultra approximate groups, which opens up the ability to perform “induction on dimension” arguments. Secondly, Lie groups have an escape property (which is in fact equivalent to the no small subgroups property): if a group element ${g}$ lies outside of a very small ball ${B_\epsilon}$, then some power ${g^n}$ of it will escape a somewhat larger ball ${B_1}$. Or equivalently: if a long orbit ${g, g^2, \ldots, g^n}$ lies inside the larger ball ${B_1}$, one can deduce that the original element ${g}$ lies inside the small ball ${B_\epsilon}$. Because all Lie groups have this property, we will be able to show that all ultra approximate groups ${A}$ “essentially” have a similar property, in that they are “controlled” by a nearby ultra approximate group which obeys a number of escape-type properties analogous to those enjoyed by small balls in a Lie group, and which we will call a strong ultra approximate group. This will be discussed in the next set of notes, where we will also see how these escape-type properties can be exploited to create a metric structure on strong approximate groups analogous to the Gleason metrics studied in previous notes, which can in turn be exploited (together with an induction on dimension argument) to fully classify such approximate groups (in the finite case, at least).
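
The escape property described above can be seen numerically in the simplest compact Lie group. The following is only a sketch under assumptions of our own: the group is the unit circle in ${{\bf C}}$, the “ball” ${B_r}$ is ${\{ g: |g - 1| \leq r \}}$, and `escape_time` is a hypothetical helper name.

```python
import cmath

def dist_to_identity(g):
    """Distance from g to the identity 1 in the circle group."""
    return abs(g - 1)

def escape_time(g, radius=1.0, max_steps=10**6):
    """Smallest n >= 1 with |g^n - 1| > radius, or None if none is found."""
    power = 1 + 0j
    for n in range(1, max_steps + 1):
        power *= g
        if dist_to_identity(power) > radius:
            return n
    return None

# A small rotation by angle 0.01 needs roughly 1 / |g - 1| steps to leave B_1.
g = cmath.exp(0.01j)
n = escape_time(g)
```

Here ${n \cdot |g - 1|}$ stays close to ${1}$ as the angle shrinks, so the escape time is comparable to ${1/|g-1|}$: an element is small precisely when its orbit takes a long time to escape. The lower bound ${n |g-1| > 1}$ is just the triangle inequality ${|g^n - 1| \leq n |g - 1|}$.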

There are some cases where the analysis is particularly simple. For instance, in the bounded torsion case, one can show that the associated Lie model ${L}$ is necessarily zero-dimensional, which allows for an easy classification of approximate groups of bounded torsion.

Some of the material here is drawn from my recent paper with Ben Green and Emmanuel Breuillard, which is in turn inspired by a previous paper of Hrushovski.

Emmanuel Breuillard, Ben Green, and I have just uploaded to the arXiv our paper “The structure of approximate groups“, submitted to Pub. IHES. We had announced the main results of this paper in various forums (including this blog) for a few months now, but it had taken some time to fully write up the paper and put in various refinements and applications.

As announced previously, the main result of this paper is a (virtually, qualitatively) complete description of finite approximate groups in an arbitrary (local or global) group ${G}$. For simplicity let us work in the much more familiar setting of global groups, although our results also apply (but are a bit more technical to state) in the local group setting.

Recall that in a global group ${G = (G,\cdot)}$, a ${K}$-approximate group is a symmetric subset ${A}$ of ${G}$ containing the identity, with the property that the product set ${A \cdot A}$ is covered by ${K}$ left-translates of ${A}$. Examples of ${O(1)}$-approximate groups include genuine groups, convex bodies in a bounded dimensional vector space, small balls in a bounded dimensional Lie group, large balls in a discrete nilpotent group of bounded rank or step, or generalised arithmetic progressions (or more generally, coset progressions) of bounded rank in an abelian group. Specialising now to finite approximate groups, a key example of such a group is what we call a coset nilprogression: a set of the form ${\pi^{-1}(P)}$, where ${\pi: G' \rightarrow N}$ is a homomorphism with finite kernel from a subgroup ${G'}$ of ${G}$ to a nilpotent group ${N}$ of bounded step, and ${P = P(u_1,\ldots,u_r;N_1,\ldots,N_r)}$ is a nilprogression with a bounded number of generators ${u_1,\ldots,u_r}$ in ${N}$ and some lengths ${N_1,\ldots,N_r \gg 1}$, where ${P(u_1,\ldots,u_r;N_1,\ldots,N_r)}$ consists of all the words involving at most ${N_1}$ copies of ${u_1^{\pm 1}}$, ${N_2}$ copies of ${u_2^{\pm 1}}$, and so forth up to ${N_r}$ copies of ${u_r^{\pm 1}}$. One can show (by some nilpotent algebra) that all such coset nilprogressions are ${O(1)}$-approximate groups so long as the step and the rank ${r}$ are bounded (and if ${N_1,\ldots,N_r}$ are sufficiently large).
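
The definition is easy to test by brute force on small examples. Here is a minimal sketch for one of the abelian examples listed above, under assumptions of our own: the ambient group is ${{\bf Z}^r}$ written additively, ${A}$ is the generalised arithmetic progression ${\{ (n_1,\ldots,n_r): |n_i| \leq N_i \}}$, and the covering translates are the ${2^r}$ corner shifts ${(\pm N_1, \ldots, \pm N_r)}$. All function names are illustrative.

```python
from itertools import product

def box(lengths):
    """The progression {(n_1, ..., n_r) : |n_i| <= N_i} inside Z^r."""
    return set(product(*(range(-n, n + 1) for n in lengths)))

def add_vec(u, v):
    return tuple(a + b for a, b in zip(u, v))

def product_set(a, b):
    """The set A.B, written additively as {u + v : u in a, v in b}."""
    return {add_vec(u, v) for u in a for v in b}

def is_symmetric_with_identity(a):
    """Check that a contains 0 and is closed under negation."""
    zero = tuple(0 for _ in next(iter(a)))
    return zero in a and all(tuple(-c for c in u) in a for u in a)

def covered_by_translates(target, a, shifts):
    """Check that target lies in the union of the translates s + a, s in shifts."""
    return target <= product_set(shifts, a)

def corner_shifts(lengths):
    """The 2^r shifts (+-N_1, ..., +-N_r)."""
    return set(product(*((-n, n) for n in lengths)))
```

For `lengths = (3, 5)` this confirms that ${A + A}$ is covered by ${4}$ translates of ${A}$, so ${A}$ is a ${4}$-approximate group; in rank ${r}$ the same construction gives ${K = 2^r}$, which is why the rank must be bounded.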

Our main theorem (which was essentially conjectured independently by Helfgott and by Lindenstrauss) asserts, roughly speaking, that coset nilprogressions are essentially the only examples of approximate groups.

Theorem 1 Let ${A}$ be a ${K}$-approximate group. Then ${A^4}$ contains a coset nilprogression ${P}$ of rank and step ${O_K(1)}$, such that ${A}$ can be covered by ${O_K(1)}$ left-translates of ${P}$.

In the torsion-free abelian case, this result is essentially Freiman’s theorem (with an alternate proof by Ruzsa); for the general abelian case, it is due to Green and Ruzsa. Various partial results in this direction for some other groups (e.g. free groups, nilpotent groups, solvable groups, or simple groups of Lie type) are also known; see these previous blog posts for a summary of several of these results.

This result has a number of applications to geometric growth theory, and in particular to variants of Gromov’s theorem on groups of polynomial growth, which asserts that a finitely generated group is of polynomial growth if and only if it is virtually nilpotent. The connection lies in the fact that if the balls ${B_S(R)}$ associated to a finite set of generators ${S}$ have polynomial growth, then some simple volume-packing arguments combined with the pigeonhole principle will show that ${B_S(R)}$ will end up being an ${O(1)}$-approximate group for many radii ${R}$. In fact, since our theorem only needs a single approximate group to obtain virtually nilpotent structure, we are able to obtain some new strengthenings of Gromov’s theorem. For instance, if ${A}$ is any ${K}$-approximate group in a finitely generated group ${G}$ that contains ${B_S(R_0)}$ for some set of generators ${S}$ and some ${R_0}$ that is sufficiently large depending on ${K}$, our theorem implies that ${G}$ is virtually nilpotent, answering a question of Petrunin. Among other things, this gives an alternate proof of a recent result of Kapovitch and Wilking (see also this previous paper of Cheeger and Colding) that a compact manifold of bounded diameter and Ricci curvature at least ${-\epsilon}$ necessarily has a virtually nilpotent fundamental group if ${\epsilon}$ is sufficiently small (depending only on dimension). The main point here is that no lower bound on the injectivity radius is required. Another application is a “Margulis-type lemma”, which asserts that if a metric space ${X}$ has “bounded packing” (in the sense that any ball of radius (say) ${4}$ is covered by a bounded number of balls of radius ${1}$), and ${\Gamma}$ is a group of isometries on ${X}$ that acts discretely (i.e. every orbit has only finitely many elements (counting multiplicity) in each bounded set), then the near-stabiliser ${\{ \gamma \in \Gamma: d(\gamma x, x) \leq \epsilon \}}$ of a point ${x}$ is virtually nilpotent if ${\epsilon}$ is small enough depending on the packing constant.
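
To illustrate the polynomial growth hypothesis on the balls ${B_S(R)}$ discussed above, here is a minimal sketch in the simplest case, chosen by us and not taken from the paper: the group ${{\bf Z}^2}$ with the standard symmetric generating set. It computes ${|B_S(R)|}$ by breadth-first search and the doubling ratios ${|B_S(2R)|/|B_S(R)|}$; the function names are illustrative.

```python
def ball(generators, radius):
    """Points of Z^d expressible as words of length <= radius in the generators."""
    current = {tuple(0 for _ in generators[0])}
    frontier = set(current)
    for _ in range(radius):
        # Extend every word on the frontier by one generator, keeping new points only.
        frontier = {tuple(a + b for a, b in zip(p, g))
                    for p in frontier for g in generators} - current
        current |= frontier
    return current

STANDARD_GENERATORS = [(1, 0), (-1, 0), (0, 1), (0, -1)]

def doubling_ratio(generators, radius):
    """The ratio |B_S(2R)| / |B_S(R)|."""
    return len(ball(generators, 2 * radius)) / len(ball(generators, radius))
```

In this example ${|B_S(R)| = 2R^2 + 2R + 1}$, so the doubling ratio stays below ${4}$ for every ${R}$. Bounded doubling of this kind is the volume information that the packing and pigeonhole argument uses in a general group of polynomial growth, where it is only available for many (rather than all) radii.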

There are also some variants and refinements to the main theorem proved in the paper, such as an extension to local groups, and also an improvement on the bound on the rank and step from ${O_K(1)}$ to ${O(\log K)}$ (but at the cost of replacing ${A^4}$ in the theorem with ${A^{O(1)}}$).

I’ll be discussing the proof of the main theorem in detail in the next few lecture notes of my current graduate course. The full proof is somewhat lengthy (occupying about 50 pages of the 90-page paper), but can be summarised in the following steps:

1. (Hrushovski) Take an arbitrary sequence ${A_n}$ of finite ${K}$-approximate groups, and show that an appropriate limit ${A}$ of such groups can be “modeled” in some sense by an open bounded subset of a locally compact group. (The precise definition of “model” is technical, but “macroscopically faithful representation” is a good first approximation.) As discussed in the previous lecture notes, we use an ultralimit for this purpose; the paper of Hrushovski where this strategy was first employed also considered more sophisticated model-theoretic limits. To build a locally compact topology, Hrushovski used some tools from definability theory; in our paper, we instead use a combinatorial lemma of Sanders (closely related to a similar result of Croot and Sisask).
2. (Gleason-Yamabe) The locally compact group can in turn be “modeled” by a Lie group (possibly after shrinking the group, and thus the ultralimit ${A}$, slightly). (This result arose from the solution to Hilbert’s fifth problem, as discussed here. For our extension to local groups, we use a recent local version of the Gleason-Yamabe theorem, due to Goldbring.)
3. (Gleason) Using the escape properties of the Lie model, construct a norm ${\| \|}$ (and thus a left-invariant metric ${d}$) on the ultralimit approximate group ${A}$ (and also on the finitary groups ${A_n}$) that obeys a number of good properties, such as a commutator estimate ${\| [g,h]\| \ll \|g\| \|h\|}$. (This is modeled on an analogous construction used in the theory of Hilbert’s fifth problem, as discussed in this previous set of lecture notes.) This norm is essentially an escape norm associated to (a slight modification of) ${A}$ or ${A_n}$.
4. (Jordan-Bieberbach-Frobenius) We now take advantage of the finite nature of the ${A_n}$ by locating the non-trivial element ${e}$ of ${A_n}$ with minimal escape norm (but one has to quotient out the elements of zero escape norm first). The commutator estimate mentioned previously ensures that this element is essentially “central” in ${A_n}$. One can then quotient out a progression ${P(e;N)}$ generated by this central element (reducing the dimension of the Lie model by one in the process) and iterate the process until the dimension of the model drops to zero. Reversing the process, this constructs a coset nilprogression inside ${A_n^4}$. This argument is based on the classic proof of Jordan’s theorem due to Bieberbach and Frobenius, as discussed in this blog post.

One quirk of the argument is that it requires one to work in the category of local groups rather than global groups. (This is somewhat analogous to how, in the standard proofs of Freiman’s theorem, one needs to work with the category of Freiman homomorphisms, rather than group homomorphisms.) The reason for this arises when performing the quotienting step in the Jordan-Bieberbach-Frobenius leg of the argument. The obvious way to perform this step (and the thing that we tried first) would be to quotient out by the entire cyclic group ${\langle e \rangle}$ generated by the element ${e}$ of minimal escape norm. However, it turns out that this doesn’t work too well, because the group quotiented out is so “large” that it can create a lot of torsion in the quotient. In particular, elements which used to have positive escape norm can now become trapped in the quotient of ${A_n}$, thus sending their escape norm to zero. This leads to an inferior conclusion (in which a coset nilprogression is replaced by a more complicated tower of alternating extensions between central progressions and finite groups, similar to the towers encountered in my previous paper on this topic). To prevent this unwanted creation of torsion, one has to truncate the cyclic group ${\langle e \rangle}$ before it escapes ${A_n}$, so that one quotients out by a geometric progression ${P(e;N)}$ rather than the cyclic group. But the operation of quotienting out by ${P(e;N)}$, which is a local group rather than a global one, cannot be formalised in the category of global groups, but only in the category of local groups. Because of this, we were forced to carry out the entire argument using the language of local groups. As it turns out, the arguments are ultimately more natural in this setting, although there is an initial investment of notation required, given that global group theory is much more familiar and well-developed than local group theory.

One interesting feature of the argument is that it does not use much of the existing theory of Freiman-type theorems, instead building the coset nilprogression directly from the geometric properties of the approximate group. In particular, our argument gives a new proof of Freiman’s theorem in the abelian case, which largely avoids Fourier analysis (except through the use of the theory of Hilbert’s fifth problem, which uses the Peter-Weyl theorem (or, in the abelian case, Pontryagin duality), which is basically a version of Fourier analysis).

Roughly speaking, mathematical analysis can be divided into two major styles, namely hard analysis and soft analysis. The precise distinction between the two types of analysis is imprecise (and in some cases one may use a blend of the two styles), but some key differences can be listed as follows.

• Hard analysis tends to be concerned with quantitative or effective properties such as estimates, upper and lower bounds, convergence rates, and growth rates or decay rates. In contrast, soft analysis tends to be concerned with qualitative or ineffective properties such as existence and uniqueness, finiteness, measurability, continuity, differentiability, connectedness, or compactness.
• Hard analysis tends to be focused on finitary, finite-dimensional or discrete objects, such as finite sets, finitely generated groups, finite Boolean combination of boxes or balls, or “finite-complexity” functions, such as polynomials or functions on a finite set. In contrast, soft analysis tends to be focused on infinitary, infinite-dimensional, or continuous objects, such as arbitrary measurable sets or measurable functions, or abstract locally compact groups.
• Hard analysis tends to involve explicit use of many parameters such as ${\epsilon}$, ${\delta}$, ${N}$, etc. In contrast, soft analysis tends to rely instead on properties such as continuity, differentiability, compactness, etc., which implicitly are defined using a similar set of parameters, but whose parameters often do not make an explicit appearance in arguments.
• In hard analysis, it is often the case that a key lemma in the literature is not quite optimised for the application at hand, and one has to reprove a slight variant of that lemma (using a variant of the proof of the original lemma) in order for it to be suitable for applications. In contrast, in soft analysis, key results can often be used as “black boxes”, without need of further modification or inspection of the proof.
• The properties in soft analysis tend to enjoy precise closure properties; for instance, the composition or linear combination of continuous functions is again continuous, and similarly for measurability, differentiability, etc. In contrast, the closure properties in hard analysis tend to be fuzzier, in that the parameters in the conclusion are often different from the parameters in the hypotheses. For instance, the composition of two Lipschitz functions with Lipschitz constant ${K}$ is still Lipschitz, but now with Lipschitz constant ${K^2}$ instead of ${K}$. These changes in parameters mean that hard analysis arguments often require more “bookkeeping” than their soft analysis counterparts, and are less able to utilise algebraic constructions (e.g. quotient space constructions) that rely heavily on precise closure properties.
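
The Lipschitz example in the last bullet point can be checked on sample data. The following is a minimal sketch with choices of our own: ${K = 3}$, the functions ${f(x) = Kx}$ and ${g(x) = K|x|}$, and a grid of rational sample points so that the arithmetic is exact. The helper `lipschitz_estimate` only gives a lower bound for the true Lipschitz constant, since it looks at finitely many pairs.

```python
from fractions import Fraction

def lipschitz_estimate(func, points):
    """Largest difference quotient |func(x) - func(y)| / |x - y| over sample pairs."""
    return max(abs(func(x) - func(y)) / abs(x - y)
               for i, x in enumerate(points)
               for y in points[i + 1:])

K = 3
SAMPLE_POINTS = [Fraction(i, 10) for i in range(-20, 21)]

def f(x):
    return K * x

def g(x):
    return K * abs(x)

def f_after_g(x):
    """The composition f(g(x)) = K^2 |x|."""
    return f(g(x))
```

On these samples `f` and `g` both have estimate ${K = 3}$, while the composition has estimate ${K^2 = 9}$, so the loss from ${K}$ to ${K^2}$ is genuinely attained and cannot be removed by a cleverer argument. This is the kind of parameter drift that makes bookkeeping unavoidable in hard analysis.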

In the lectures so far, focusing on the theory surrounding Hilbert’s fifth problem, the results and techniques have fallen well inside the category of soft analysis. However, we will now turn to the theory of approximate groups, a topic which is traditionally studied using the methods of hard analysis. (Later we will also study groups of polynomial growth, which lie at an intermediate position in the spectrum between hard and soft analysis, and which can be profitably analysed using both styles of analysis.)

Despite the superficial differences between hard and soft analysis, though, there are a number of important correspondences between results in hard analysis and results in soft analysis. For instance, if one has some sort of uniform quantitative bound on some expression relating to finitary objects, one can often use limiting arguments to then conclude a qualitative bound on analogous expressions on infinitary objects, by viewing the latter objects as some sort of “limit” of the former objects. Conversely, if one has a qualitative bound on infinitary objects, one can often use compactness and contradiction arguments to recover uniform quantitative bounds on finitary objects as a corollary.

Remark 1 Another type of correspondence between hard analysis and soft analysis, which is “syntactical” rather than “semantical” in nature, arises by taking the proofs of a soft analysis result, and translating such a qualitative proof somehow (e.g. by carefully manipulating quantifiers) into a quantitative proof of an analogous hard analysis result. This type of technique is sometimes referred to as proof mining in the proof theory literature, and is discussed in this previous blog post (and its comments). We will however not employ systematic proof mining techniques here, although in later posts we will informally borrow arguments from infinitary settings (such as the methods used to construct Gleason metrics) and adapt them to finitary ones.

Let us illustrate the correspondence between hard and soft analysis results with a simple example.

Proposition 1 Let ${X}$ be a sequentially compact topological space, let ${S}$ be a dense subset of ${X}$, and let ${f: X \rightarrow [0,+\infty]}$ be a continuous function (giving the extended half-line ${[0,+\infty]}$ the usual order topology). Then the following statements are equivalent:

• (i) (Qualitative bound on infinitary objects) For all ${x \in X}$, one has ${f(x) < +\infty}$.
• (ii) (Quantitative bound on finitary objects) There exists ${M < +\infty}$ such that ${f(x) \leq M}$ for all ${x \in S}$.

In applications, ${S}$ is typically a (non-compact) set of “finitary” (or “finite complexity”) objects of a certain class, and ${X}$ is some sort of “completion” or “compactification” of ${S}$ which admits additional “infinitary” objects that may be viewed as limits of finitary objects.

Proof: To see that (ii) implies (i), observe from density that every point ${x}$ in ${X}$ is adherent to ${S}$, and so given any neighbourhood ${U}$ of ${x}$, there exists ${y \in S \cap U}$. Since ${f(y) \leq M}$, we conclude from the continuity of ${f}$ that ${f(x) \leq M}$ also, and the claim follows.

Conversely, to show that (i) implies (ii), we use the “compactness and contradiction” argument. Suppose for sake of contradiction that (ii) failed. Then for any natural number ${n}$, there exists ${x_n \in S}$ such that ${f(x_n) \geq n}$. (Here we have used the axiom of choice, which we will assume throughout this course.) Using sequential compactness, and passing to a subsequence if necessary, we may assume that the ${x_n}$ converge to a limit ${x \in X}$. By continuity of ${f}$, this implies that ${f(x) = +\infty}$, contradicting (i). $\Box$

Remark 2 Note that the above deduction of (ii) from (i) is ineffective in that it gives no explicit bound on the uniform bound ${M}$ in (ii). Without any further information on how the qualitative bound (i) is proven, this is the best one can do in general (and this is one of the most significant weaknesses of infinitary methods when used to solve finitary problems); but if one has access to the proof of (i), one can often finitise or proof mine that argument to extract an effective bound for ${M}$, although often the bound one obtains in the process is quite poor (particularly if the proof of (i) relied extensively on infinitary tools, such as limits). See this blog post for some related discussion.

The above simple example illustrates that in order to get from an “infinitary” statement such as (i) to a “finitary” statement such as (ii), a key step is to be able to take a sequence ${(x_n)_{n \in {\bf N}}}$ (or in some cases, a more general net ${(x_\alpha)_{\alpha \in A}}$) of finitary objects and extract a suitable infinitary limit object ${x}$. In the literature, there are three main ways in which one can extract such a limit:

• (Topological limit) If the ${x_n}$ are all elements of some topological space ${S}$ (e.g. an incomplete function space) which has a suitable “compactification” or “completion” ${X}$ (e.g. a Banach space), then (after passing to a subsequence if necessary) one can often ensure the ${x_n}$ converge in a topological sense (or in a metrical sense) to a limit ${x}$. The use of this type of limit to pass between quantitative/finitary and qualitative/infinitary results is particularly common in the more analytical areas of mathematics (such as ergodic theory, asymptotic combinatorics, or PDE), due to the abundance of useful compactness results in analysis such as the (sequential) Banach-Alaoglu theorem, Prokhorov’s theorem, the Helly selection theorem, the Arzelà-Ascoli theorem, or even the humble Bolzano-Weierstrass theorem. However, one often has to take care with the nature of convergence, as many compactness theorems only guarantee convergence in a weak sense rather than in a strong one.
• (Categorical limit) If the ${x_n}$ are all objects in some category (e.g. metric spaces, groups, fields, etc.) with a number of morphisms between the ${x_n}$ (e.g. morphisms from ${x_{n+1}}$ to ${x_n}$, or vice versa), then one can often form a direct limit ${\lim_{\rightarrow} x_n}$ or inverse limit ${\lim_{\leftarrow} x_n}$ of these objects to form a limiting object ${x}$. The use of these types of limits to connect quantitative and qualitative results is common in subjects such as algebraic geometry that are particularly amenable to categorical ways of thinking. (We have seen inverse limits appear in the discussion of Hilbert’s fifth problem, although in that context they were not really used to connect quantitative and qualitative results together.)
• (Logical limit) If the ${x_n}$ are all distinct spaces (or elements or subsets of distinct spaces), with few morphisms connecting them together, then topological and categorical limits are often unavailable or unhelpful. In such cases, however, one can still tie together such objects using an ultraproduct construction (or similar device) to create a limiting object ${\lim_{n \rightarrow \alpha} x_n}$ or limiting space ${\prod_{n \rightarrow \alpha} x_n}$ that is a logical limit of the ${x_n}$, in the sense that various properties of the ${x_n}$ (particularly those that can be phrased using the language of first-order logic) are preserved in the limit. As such, logical limits are often very well suited for the task of connecting finitary and infinitary mathematics together. Ultralimit type constructions are of course used extensively in logic (particularly in model theory), but are also popular in metric geometry. They can also be used in many of the previously mentioned areas of mathematics, such as algebraic geometry (as discussed in this previous post).

The three types of limits are analogous in many ways, with a number of connections between them. For instance, in the study of groups of polynomial growth, both topological limits (using the metric notion of Gromov-Hausdorff convergence) and logical limits (using the ultralimit construction) are commonly used, and to some extent the two constructions are at least partially interchangeable in this setting. (See also these previous posts for the use of ultralimits as a substitute for topological limits.) In the theory of approximate groups, though, it was observed by Hrushovski that logical limits (and in particular, ultraproducts) are the most useful type of limit to connect finitary approximate groups to their infinitary counterparts. One reason for this is that one is often interested in obtaining results on approximate groups ${A}$ that are uniform in the choice of ambient group ${G}$. As such, one often seeks to take a limit of approximate groups ${A_n}$ that lie in completely unrelated ambient groups ${G_n}$, with no obvious morphisms or metrics tying the ${G_n}$ to each other. As such, the topological and categorical limits are not easily usable, whereas the logical limits can still be employed without much difficulty.

Logical limits are closely tied with non-standard analysis. Indeed, by applying an ultraproduct construction to standard number systems such as the natural numbers ${{\bf N}}$ or the reals ${{\bf R}}$, one can obtain nonstandard number systems such as the nonstandard natural numbers ${{}^* {\bf N}}$ or the nonstandard real numbers (or hyperreals) ${{}^* {\bf R}}$. These nonstandard number systems behave very similarly to their standard counterparts, but also enjoy the advantage of containing the standard number systems as proper subsystems (e.g. ${{\bf R}}$ is a subring of ${{}^* {\bf R}}$), which allows for some convenient algebraic manipulations (such as the quotient space construction to create spaces such as ${{}^* {\bf R} / {\bf R}}$) which are not easily accessible in the purely standard universe. Nonstandard spaces also enjoy a useful completeness property, known as countable saturation, which is analogous to metric completeness (as discussed in this previous blog post) and which will be particularly useful for us in tying together the theory of approximate groups with the theory of Hilbert’s fifth problem. See this previous post for more discussion on ultrafilters and nonstandard analysis.

In these notes, we lay out the basic theory of ultraproducts and ultralimits (in particular, proving Łoś’s theorem, which roughly speaking asserts that ultralimits are limits in a logical sense, as well as the countable saturation property alluded to earlier). We also lay out some of the basic foundations of nonstandard analysis, although we will not rely too heavily on nonstandard tools in this course. Finally, we apply this general theory to approximate groups, to connect finite approximate groups to an infinitary type of approximate group which we will call an ultra approximate group. We will then study these ultra approximate groups (and models of such groups) in more detail in the next set of notes.

Remark 3 Throughout these notes (and in the rest of the course), we will assume the axiom of choice, in order to easily use ultrafilter-based tools. If one really wanted to expend the effort, though, one could eliminate the axiom of choice from the proofs of the final “finitary” results that one is ultimately interested in proving, at the cost of making the proofs significantly lengthier. Indeed, there is a general result of Gödel that any result which can be stated in the language of Peano arithmetic (which, roughly speaking, means that the result is “finitary” in nature), and can be proven in set theory using the axiom of choice (or more precisely, in the ZFC axiom system), can also be proven in set theory without the axiom of choice (i.e. in the ZF system). As this course is not focused on foundations, we shall simply assume the axiom of choice henceforth to avoid further distraction by such issues.

In the previous notes, we established the Gleason-Yamabe theorem:

Theorem 1 (Gleason-Yamabe theorem) Let ${G}$ be a locally compact group. Then, for any open neighbourhood ${U}$ of the identity, there exists an open subgroup ${G'}$ of ${G}$ and a compact normal subgroup ${K}$ of ${G'}$ in ${U}$ such that ${G'/K}$ is isomorphic to a Lie group.

Roughly speaking, this theorem asserts that the “mesoscopic” structure of a locally compact group (after restricting to an open subgroup ${G'}$ to remove the macroscopic structure, and quotienting out by ${K}$ to remove the microscopic structure) is always of Lie type.

In this post, we combine the Gleason-Yamabe theorem with some additional tools from point-set topology to improve the description of locally compact groups in various situations.

We first record some easy special cases of this. If the locally compact group ${G}$ has the no small subgroups property, then one can take ${K}$ to be trivial; thus ${G'}$ is Lie, which implies that ${G}$ is locally Lie and thus Lie as well. Thus the assertion that all locally compact NSS groups are Lie (Theorem 10 from Notes 4) is a special case of the Gleason-Yamabe theorem.

In a similar spirit, if the locally compact group ${G}$ is connected, then the only open subgroup ${G'}$ of ${G}$ is the full group ${G}$; in particular, by arguing as in the treatment of the compact case (Exercise 19 of Notes 3), we conclude that any connected locally compact Hausdorff group is the inverse limit of Lie groups.

Now we return to the general case, in which ${G}$ need not be connected or NSS. One slight defect of Theorem 1 is that the group ${G'}$ can depend on the open neighbourhood ${U}$. However, by using a basic result from the theory of totally disconnected groups known as van Dantzig’s theorem, one can make ${G'}$ independent of ${U}$:

Theorem 2 (Gleason-Yamabe theorem, stronger version) Let ${G}$ be a locally compact group. Then there exists an open subgroup ${G'}$ of ${G}$ such that, for any open neighbourhood ${U}$ of the identity in ${G'}$, there exists a compact normal subgroup ${K}$ of ${G'}$ in ${U}$ such that ${G'/K}$ is isomorphic to a Lie group.

We prove this theorem below the fold. As in previous notes, if ${G}$ is Hausdorff, the group ${G'}$ is thus an inverse limit of Lie groups (and if ${G}$ (and hence ${G'}$) is first countable, it is the inverse limit of a sequence of Lie groups).

It remains to analyse inverse limits of Lie groups. To do this, it helps to have some control on the dimensions of the Lie groups involved. A basic tool for this purpose is the invariance of domain theorem:

Theorem 3 (Brouwer invariance of domain theorem) Let ${U}$ be an open subset of ${{\bf R}^n}$, and let ${f: U \rightarrow {\bf R}^n}$ be a continuous injective map. Then ${f(U)}$ is also open.

We prove this theorem below the fold. It has an important corollary:

Corollary 4 (Topological invariance of dimension) If ${n > m}$, and ${U}$ is a non-empty open subset of ${{\bf R}^n}$, then there is no continuous injective mapping from ${U}$ to ${{\bf R}^m}$. In particular, ${{\bf R}^n}$ and ${{\bf R}^m}$ are not homeomorphic.
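
One way to see how Corollary 4 follows from Theorem 3 is the following short sketch (the embedding ${\iota}$ is notation introduced here purely for illustration, and the argument below the fold may be organised differently). Suppose for contradiction that ${n > m}$ and that ${f: U \rightarrow {\bf R}^m}$ is continuous and injective. Composing with the linear embedding

$\displaystyle \iota: {\bf R}^m \rightarrow {\bf R}^n, \quad \iota(y_1,\ldots,y_m) := (y_1,\ldots,y_m,0,\ldots,0)$

gives a continuous injective map ${\iota \circ f: U \rightarrow {\bf R}^n}$, so by Theorem 3 the set

$\displaystyle (\iota \circ f)(U) \subset {\bf R}^m \times \{0\}^{n-m}$

is open in ${{\bf R}^n}$. But this set is non-empty (as ${U}$ is non-empty) and is contained in a proper linear subspace of ${{\bf R}^n}$, which has empty interior; this is the desired contradiction.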

Exercise 1 (Uniqueness of dimension) Let ${X}$ be a non-empty topological space. If ${X}$ is a manifold of dimension ${d_1}$, and also a manifold of dimension ${d_2}$, show that ${d_1=d_2}$. Thus, we may define the dimension ${\hbox{dim}(X)}$ of a non-empty manifold in a well-defined manner.

If ${X, Y}$ are non-empty manifolds, and there is a continuous injection from ${X}$ to ${Y}$, show that ${\hbox{dim}(X) \leq \hbox{dim}(Y)}$.

Remark 1 Note that the analogue of the above exercise for surjections is false: the existence of a continuous surjection from one non-empty manifold ${X}$ to another ${Y}$ does not imply that ${\hbox{dim}(X) \geq \hbox{dim}(Y)}$, thanks to the existence of space-filling curves. Thus we see that invariance of domain, while intuitively plausible, is not an entirely trivial observation.

As we shall see, we can use Corollary 4 to bound the dimension of the Lie groups ${L_n}$ in an inverse limit ${G = \lim_{n \rightarrow \infty} L_n}$ by the “dimension” of the inverse limit ${G}$. Among other things, this can be used to obtain a positive resolution to Hilbert’s fifth problem:

Theorem 5 (Hilbert’s fifth problem) Every locally Euclidean group is isomorphic to a Lie group.

Again, this will be shown below the fold.

Another application of this machinery is the following variant of Hilbert’s fifth problem, which was used in Gromov’s original proof of Gromov’s theorem on groups of polynomial growth, although we will not actually need it in this course:

Proposition 6 Let ${G}$ be a locally compact ${\sigma}$-compact group that acts transitively, faithfully, and continuously on a connected manifold ${X}$. Then ${G}$ is isomorphic to a Lie group.

Recall that a continuous action of a topological group ${G}$ on a topological space ${X}$ is a continuous map ${\cdot: G \times X \rightarrow X}$ which obeys the associativity law ${(gh)x = g(hx)}$ for ${g,h \in G}$ and ${x \in X}$, and the identity law ${1x = x}$ for all ${x \in X}$. The action is transitive if, for every ${x,y \in X}$, there is a ${g \in G}$ with ${gx=y}$, and faithful if, whenever ${g, h \in G}$ are distinct, one has ${gx \neq hx}$ for at least one ${x}$.
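
To make the three definitions concrete, here is a minimal Python sketch that checks them by brute force for a finite toy example. The setting is an assumption made purely for illustration: the cyclic group ${{\bf Z}/6}$ acting on a six-point set, which is of course not a connected manifold, so Proposition 6 itself says nothing about it. The helper names `is_action`, `is_transitive`, `is_faithful`, `rotate` and `double` are hypothetical.

```python
from itertools import product

def is_action(G, X, mul, identity, act):
    """Check the associativity law (gh)x = g(hx) and the identity law 1x = x."""
    assoc = all(act(mul(g, h), x) == act(g, act(h, x))
                for g, h, x in product(G, G, X))
    ident = all(act(identity, x) == x for x in X)
    return assoc and ident

def is_transitive(G, X, act):
    """For every x, y there is some g with gx = y."""
    return all(any(act(g, x) == y for g in G) for x, y in product(X, X))

def is_faithful(G, X, act):
    """Distinct g, h act differently on at least one point x."""
    return all(any(act(g, x) != act(h, x) for x in X)
               for g, h in product(G, G) if g != h)

n = 6
G = list(range(n))                 # the cyclic group Z/6, written additively
X = list(range(n))                 # a six-point set (toy stand-in for a space)
mul = lambda g, h: (g + h) % n     # group law; the identity element is 0

rotate = lambda g, x: (g + x) % n      # g rotates by g steps
double = lambda g, x: (2 * g + x) % n  # g rotates by 2g steps

print(is_transitive(G, X, rotate), is_faithful(G, X, rotate))
print(is_transitive(G, X, double), is_faithful(G, X, double))
```

The rotation action is both transitive and faithful. The "doubled" action is still an action, but it has two orbits (so it is not transitive), and the elements ${0}$ and ${3}$ act identically (so it is not faithful).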

The ${\sigma}$-compact hypothesis is a technical one, and can likely be dropped, but we retain it for this discussion (as in most applications we can reduce to this case).

Exercise 2 Show that Proposition 6 implies Theorem 5.

Remark 2 It is conjectured that the transitivity hypothesis in Proposition 6 can be dropped; this is known as the Hilbert-Smith conjecture. It remains open; the key difficulty is to figure out a way to eliminate the possibility that ${G}$ is a ${p}$-adic group ${{\bf Z}_p}$. See this previous blog post for further discussion.

Jordan’s theorem is a basic theorem in the theory of finite linear groups, and can be formulated as follows:

Theorem 1 (Jordan’s theorem) Let ${G}$ be a finite subgroup of the general linear group ${GL_d({\bf C})}$. Then there is an abelian subgroup ${G'}$ of ${G}$ of index ${[G:G'] \leq C_d}$, where ${C_d}$ depends only on ${d}$.

Informally, Jordan’s theorem asserts that finite linear groups over the complex numbers are almost abelian. The theorem can be extended to other fields of characteristic zero, and also to fields of positive characteristic so long as the characteristic does not divide the order of ${G}$, but we will not consider these generalisations here. A proof of this theorem can be found for instance in these lecture notes of mine.

I recently learned (from this comment of Kevin Ventullo) that the finiteness hypothesis on the group ${G}$ in this theorem can be relaxed to the significantly weaker condition of periodicity. Recall that a group ${G}$ is periodic if all elements are of finite order. Jordan’s theorem with “finite” replaced by “periodic” is known as the Jordan-Schur theorem.

The Jordan-Schur theorem can be quickly deduced from Jordan’s theorem, and the following result of Schur:

Theorem 2 (Schur’s theorem) Every finitely generated periodic subgroup of a general linear group ${GL_d({\bf C})}$ is finite. (Equivalently, every periodic linear group is locally finite.)

Remark 1 The question of whether all finitely generated periodic groups (not necessarily linear in nature) were finite was known as the Burnside problem; the answer was shown to be negative by Golod and Shafarevich in 1964.

Let us see how Jordan’s theorem and Schur’s theorem combine via a compactness argument to form the Jordan-Schur theorem. Let ${G}$ be a periodic subgroup of ${GL_d({\bf C})}$. Then for every finite subset ${S}$ of ${G}$, the group ${G_S}$ generated by ${S}$ is finite by Theorem 2. Applying Jordan’s theorem, ${G_S}$ contains an abelian subgroup ${G'_S}$ of index at most ${C_d}$.

In particular, given any finite number ${S_1,\ldots,S_m}$ of finite subsets of ${G}$, we can find abelian subgroups ${G'_{S_1},\ldots,G'_{S_m}}$ of ${G_{S_1},\ldots,G_{S_m}}$ respectively such that each ${G'_{S_j}}$ has index at most ${C_d}$ in ${G_{S_j}}$. We claim that we may furthermore impose the compatibility condition ${G'_{S_i} = G'_{S_j} \cap G_{S_i}}$ whenever ${S_i \subset S_j}$. To see this, we set ${S := S_1 \cup \ldots \cup S_m}$, locate an abelian subgroup ${G'_S}$ of ${G_S}$ of index at most ${C_d}$, and then set ${G'_{S_i} := G'_S \cap G_{S_i}}$. As ${G_S}$ is covered by at most ${C_d}$ cosets of ${G'_S}$, we see that ${G_{S_i}}$ is covered by at most ${C_d}$ cosets of ${G'_{S_i}}$, and the claim follows.
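
The intersection step in this argument can be checked by hand in a small case. The following Python sketch is only an illustration under assumed data: the ambient group is the symmetric group ${S_4}$ (realised as permutation tuples rather than as a linear group), the abelian subgroup ${G'_S}$ is taken to be the Klein four-group of index ${6}$, and the subsets `S1`, `S2`, `S3` are arbitrary choices. It verifies that each ${G'_{S_i} := G'_S \cap G_{S_i}}$ is abelian with index in ${G_{S_i}}$ no larger than the index of ${G'_S}$ in ${G_S}$.

```python
def compose(p, q):
    """Composition of permutations stored as tuples: (p after q)(i) = p[q[i]]."""
    return tuple(p[q[i]] for i in range(len(q)))

def generate(gens, n=4):
    """Subgroup of S_n generated by gens (in a finite group, closing under
    multiplication by the generators already yields the whole subgroup)."""
    identity = tuple(range(n))
    elems, frontier = {identity}, [identity]
    while frontier:
        g = frontier.pop()
        for s in gens:
            h = compose(g, s)
            if h not in elems:
                elems.add(h)
                frontier.append(h)
    return elems

def is_abelian(H):
    return all(compose(a, b) == compose(b, a) for a in H for b in H)

# Assumed finite subsets S_1, S_2, S_3 of S_4 (note that S1 is contained in S2).
S1 = {(1, 0, 2, 3)}                  # a transposition
S2 = {(1, 0, 2, 3), (1, 2, 0, 3)}    # a transposition and a 3-cycle
S3 = {(1, 2, 3, 0), (2, 1, 0, 3)}    # a 4-cycle and a reflection

G_S = generate(S1 | S2 | S3)                   # group generated by the union
A = generate({(1, 0, 3, 2), (2, 3, 0, 1)})     # Klein four-group, playing G'_S
bound = len(G_S) // len(A)                     # index of G'_S in G_S

subgroups = [generate(S) for S in (S1, S2, S3)]   # the groups G_{S_i}
restricted = [A & H for H in subgroups]           # G'_{S_i} := G'_S ∩ G_{S_i}
indices = [len(H) // len(R) for H, R in zip(subgroups, restricted)]

print(len(G_S), bound, indices)
```

Since `S1` is contained in `S2`, one can also confirm the compatibility condition directly: `restricted[0] == restricted[1] & subgroups[0]`.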

Note that for each ${S}$, the set of possible ${G'_S}$ is finite, and so the product space of all configurations ${(G'_S)_{S \subset G}}$, as ${S}$ ranges over finite subsets of ${G}$, is compact by Tychonoff’s theorem. Using the finite intersection property, we may thus locate an abelian subgroup ${G'_S}$ of ${G_S}$ of index at most ${C_d}$ for all finite subsets ${S}$ of ${G}$, obeying the compatibility condition ${G'_T = G'_S \cap G_T}$ whenever ${T \subset S}$. If we then set ${G' := \bigcup_S G'_S}$, where ${S}$ ranges over all finite subsets of ${G}$, we then easily verify that ${G'}$ is abelian and has index at most ${C_d}$ in ${G}$, as required.

Below I record a proof of Schur’s theorem, which I extracted from this book of Wehrfritz. This was primarily an exercise for my own benefit, but perhaps it may be of interest to some other readers.

In this set of notes we will be able to finally prove the Gleason-Yamabe theorem from Notes 0, which we restate here:

Theorem 1 (Gleason-Yamabe theorem) Let ${G}$ be a locally compact group. Then, for any open neighbourhood ${U}$ of the identity, there exists an open subgroup ${G'}$ of ${G}$ and a compact normal subgroup ${K}$ of ${G'}$ in ${U}$ such that ${G'/K}$ is isomorphic to a Lie group.

In the next set of notes, we will combine the Gleason-Yamabe theorem with some topological analysis (and in particular, using the invariance of domain theorem) to establish some further control on locally compact groups, and in particular to obtain a solution to Hilbert’s fifth problem.

To prove the Gleason-Yamabe theorem, we will use three major tools developed in previous notes. The first (from Notes 2) is a criterion for Lie structure in terms of a special type of metric, which we will call a Gleason metric:

Definition 2 Let ${G}$ be a topological group. A Gleason metric on ${G}$ is a left-invariant metric ${d: G \times G \rightarrow {\bf R}^+}$ which generates the topology on ${G}$ and obeys the following properties for some constant ${C>0}$, writing ${\|g\|}$ for ${d(g,\hbox{id})}$:

• (Escape property) If ${g \in G}$ and ${n \geq 1}$ is such that ${n \|g\| \leq \frac{1}{C}}$, then ${\|g^n\| \geq \frac{1}{C} n \|g\|}$.
• (Commutator estimate) If ${g, h \in G}$ are such that ${\|g\|, \|h\| \leq \frac{1}{C}}$, then

$\displaystyle \|[g,h]\| \leq C \|g\| \|h\|, \ \ \ \ \ (1)$

where ${[g,h] := g^{-1}h^{-1}gh}$ is the commutator of ${g}$ and ${h}$.
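
As a sanity check on this definition, here is a worked example that is not taken from the notes themselves: the unitary group ${U_n({\bf C})}$ with the metric ${d(g,h) := \|g-h\|_{op}}$, so that ${\|g\| = \|g - 1\|_{op}}$, where ${1}$ denotes the identity matrix. This metric is left-invariant because the operator norm is unchanged by multiplication by a unitary matrix. For the commutator estimate, the same invariance gives

$\displaystyle \|[g,h]\| = \| g^{-1} h^{-1} g h - 1 \|_{op} = \| gh - hg \|_{op}$

$\displaystyle = \| (g-1)(h-1) - (h-1)(g-1) \|_{op} \leq 2 \|g\| \|h\|,$

so in this example (1) holds for all ${g,h}$ with constant ${2}$. The escape property can be checked in a similar spirit, by diagonalising ${g}$ and working one eigenvalue at a time, though we do not carry this out here.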

Theorem 3 (Building Lie structure from Gleason metrics) Let ${G}$ be a locally compact group that has a Gleason metric. Then ${G}$ is isomorphic to a Lie group.

The second tool is the existence of a left-invariant Haar measure on any locally compact group; see Theorem 3 from Notes 3. Finally, we will also need the compact case of the Gleason-Yamabe theorem (Theorem 8 from Notes 3), which was proven via the Peter-Weyl theorem:

Theorem 4 (Gleason-Yamabe theorem for compact groups) Let ${G}$ be a compact Hausdorff group, and let ${U}$ be a neighbourhood of the identity. Then there exists a compact normal subgroup ${H}$ of ${G}$ contained in ${U}$ such that ${G/H}$ is isomorphic to a linear group (i.e. a closed subgroup of a general linear group ${GL_n({\bf C})}$).

To finish the proof of the Gleason-Yamabe theorem, we have to somehow use the available structures on locally compact groups (such as Haar measure) to build good metrics on those groups (or on suitable subgroups or quotient groups). The basic construction is as follows:

Definition 5 (Building metrics out of test functions) Let ${G}$ be a topological group, and let ${\psi: G \rightarrow {\bf R}^+}$ be a bounded non-negative function. Then, writing ${\tau(g) \psi(x) := \psi(g^{-1} x)}$ for the left translate of ${\psi}$ by ${g \in G}$, we define the pseudometric ${d_\psi: G \times G \rightarrow {\bf R}^+}$ by the formula

$\displaystyle d_\psi(g,h) := \sup_{x \in G} |\tau(g) \psi(x) - \tau(h) \psi(x)|$

$\displaystyle = \sup_{x \in G} |\psi(g^{-1} x ) - \psi(h^{-1} x)|$

and the semi-norm ${\| \|_\psi: G \rightarrow {\bf R}^+}$ by the formula

$\displaystyle \|g\|_\psi := d_\psi(g, \hbox{id}).$

Note that one can also write

$\displaystyle \|g\|_\psi = \sup_{x \in G} |\partial_g \psi(x)|$

where ${\partial_g \psi(x) := \psi(x) - \psi(g^{-1} x)}$ is the “derivative” of ${\psi}$ in the direction ${g}$.

Exercise 6 Let the notation and assumptions be as in the above definition. For any ${g,h,k \in G}$, establish the metric-like properties

1. (Identity) ${d_\psi(g,h) \geq 0}$, with equality when ${g=h}$.
2. (Symmetry) ${d_\psi(g,h) = d_\psi(h,g)}$.
3. (Triangle inequality) ${d_\psi(g,k) \leq d_\psi(g,h) + d_\psi(h,k)}$.
4. (Continuity) If ${\psi \in C_c(G)}$, then the map ${d_\psi: G \times G \rightarrow {\bf R}^+}$ is continuous.
5. (Boundedness) One has ${d_\psi(g,h) \leq \sup_{x \in G} |\psi(x)|}$. If ${\psi \in C_c(G)}$ is supported in a set ${K}$, then equality occurs unless ${g^{-1} h \in K K^{-1}}$.
6. (Left-invariance) ${d_\psi(g,h) = d_\psi(kg,kh)}$. In particular, ${d_\psi(g,h) = \| h^{-1} g \|_\psi = \| g^{-1} h \|_\psi}$.

In particular, we have the norm-like properties

1. (Identity) ${\|g\|_\psi \geq 0}$, with equality when ${g=\hbox{id}}$.
2. (Symmetry) ${\|g\|_\psi = \|g^{-1}\|_\psi}$.
3. (Triangle inequality) ${\|gh\|_\psi \leq \|g\|_\psi + \|h\|_\psi}$.
4. (Continuity) If ${\psi \in C_c(G)}$, then the map ${\|\|_\psi: G \rightarrow {\bf R}^+}$ is continuous.
5. (Boundedness) One has ${\|g\|_\psi \leq \sup_{x \in G} |\psi(x)|}$. If ${\psi \in C_c(G)}$ is supported in a set ${K}$, then equality occurs unless ${g \in K K^{-1}}$.

We remark that the first three properties of ${d_\psi}$ in the above exercise ensure that ${d_\psi}$ is indeed a pseudometric.
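
For readers who like to experiment, the metric-like properties above can be verified by brute force on a finite group. The sketch below is an illustration under assumptions that are not part of the notes: the group is the symmetric group ${S_3}$ with the discrete topology (so every function is in ${C_c(G)}$ and continuity is automatic), and the two test functions `psi` and `bump` are arbitrary choices.

```python
from itertools import permutations, product

def compose(p, q):
    """Composition of permutations stored as tuples: (p after q)(i) = p[q[i]]."""
    return tuple(p[q[i]] for i in range(len(q)))

def inverse(p):
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

G = list(permutations(range(3)))   # the symmetric group S_3
IDENTITY = (0, 1, 2)

def d(psi, g, h):
    """d_psi(g,h) = sup_x |psi(g^{-1} x) - psi(h^{-1} x)|."""
    gi, hi = inverse(g), inverse(h)
    return max(abs(psi[compose(gi, x)] - psi[compose(hi, x)]) for x in G)

def norm(psi, g):
    """||g||_psi = d_psi(g, id)."""
    return d(psi, g, IDENTITY)

def metric_like(psi):
    """Check identity, symmetry, triangle inequality, boundedness and
    left-invariance over all triples (g, h, k)."""
    sup_psi = max(psi.values())    # psi is non-negative, so this is sup |psi|
    for g, h, k in product(G, repeat=3):
        ok = (d(psi, g, g) == 0
              and d(psi, g, h) == d(psi, h, g)
              and d(psi, g, k) <= d(psi, g, h) + d(psi, h, k)
              and d(psi, g, h) <= sup_psi
              and d(psi, g, h) == d(psi, compose(k, g), compose(k, h))
              and d(psi, g, h) == norm(psi, compose(inverse(h), g))
              and d(psi, g, h) == norm(psi, compose(inverse(g), h)))
        if not ok:
            return False
    return True

psi = {g: i for i, g in enumerate(G)}                  # arbitrary values 0..5
bump = {g: (3 if g == IDENTITY else 0) for g in G}     # supported in K = {id}

print(metric_like(psi), metric_like(bump))
```

For `bump`, the support is ${K = \{\hbox{id}\}}$, so ${K K^{-1} = \{\hbox{id}\}}$ and the boundedness property predicts ${d_\psi(g,h) = \sup_x |\psi(x)| = 3}$ whenever ${g \neq h}$, which the code confirms.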

To get good metrics (such as Gleason metrics) on groups ${G}$, it thus suffices to obtain test functions ${\psi}$ that obey suitably good “regularity” properties. We will achieve this primarily by means of two tricks. The first trick is to obtain high-regularity test functions by convolving together two low-regularity test functions, taking advantage of the existence of a left-invariant Haar measure ${\mu}$ on ${G}$. The second trick is to obtain low-regularity test functions by means of a metric-like object on ${G}$. This latter trick may seem circular, as our whole objective is to get a metric on ${G}$ in the first place, but the key point is that the metric one starts with does not need to have as many “good properties” as the metric one ends up with, thanks to the regularity-improving properties of convolution. As such, one can use a “bootstrap argument” (or induction argument) to create a good metric out of almost nothing. It is this bootstrap miracle which is at the heart of the proof of the Gleason-Yamabe theorem (and hence of the solution of Hilbert’s fifth problem).

The arguments here are based on the nonstandard analysis arguments used to resolve Hilbert’s fifth problem by Hirschfeld and by Goldbring (and also some unpublished lecture notes of Goldbring and van den Dries). However, we will not explicitly use any nonstandard analysis in this post.