In the previous set of notes, we introduced the notion of an ultra approximate group – an ultraproduct ${A = \prod_{n \rightarrow\alpha} A_n}$ of finite ${K}$-approximate groups ${A_n}$ for some ${K}$ independent of ${n}$, where each ${K}$-approximate group ${A_n}$ may lie in a distinct ambient group ${G_n}$. Although these objects arise initially from the “finitary” objects ${A_n}$, it turns out that ultra approximate groups ${A}$ can be profitably analysed by means of infinitary groups ${L}$ (and in particular, locally compact groups or Lie groups ${L}$), by means of certain models ${\rho: \langle A \rangle \rightarrow L}$ of ${A}$ (or of the group ${\langle A \rangle}$ generated by ${A}$). We will define precisely what we mean by a model later, but as a first approximation one can view a model as a representation of the ultra approximate group ${A}$ (or of ${\langle A \rangle}$) that is “macroscopically faithful” in that it accurately describes the “large scale” behaviour of ${A}$ (or equivalently, that the kernel of the representation is “microscopic” in some sense). In the next section we will see how one can use “Gleason lemma” technology to convert this macroscopic control of an ultra approximate group into microscopic control, which will be the key to classifying approximate groups.

Models of ultra approximate groups can be viewed as the multiplicative combinatorics analogue of the more well known concept of an ultralimit of metric spaces, which we briefly review below the fold as motivation.

The crucial observation is that ultra approximate groups enjoy a local compactness property which allows them to be usefully modeled by locally compact groups (and hence, through the Gleason-Yamabe theorem from previous notes, by Lie groups also). As per the Heine-Borel theorem, the local compactness will come from a combination of a completeness property and a local total boundedness property. The completeness property turns out to be a direct consequence of the countable saturation property of ultraproducts, thus illustrating one of the key advantages of the ultraproduct setting. The local total boundedness property is more interesting. Roughly speaking, it asserts that “large bounded sets” (such as ${A}$ or ${A^{100}}$) can be covered by finitely many translates of “small bounded sets” ${S}$, where “small” is a topological group sense, implying in particular that large powers ${S^m}$ of ${S}$ lie inside a set such as ${A}$ or ${A^4}$. The easiest way to obtain such a property comes from the following lemma of Sanders:

Lemma 1 (Sanders lemma) Let ${A}$ be a finite ${K}$-approximate group in a (global) group ${G}$, and let ${m \geq 1}$. Then there exists a symmetric subset ${S}$ of ${A^4}$ with ${|S| \gg_{K,m} |A|}$ containing the identity such that ${S^m \subset A^4}$.

This lemma has an elementary combinatorial proof, and is the key to endowing an ultra approximate group with locally compact structure. There is also a closely related lemma of Croot and Sisask which can achieve similar results, and which will also be discussed below. (The locally compact structure can also be established more abstractly using the much more general methods of definability theory, as was first done by Hrushovski, but we will not discuss this approach here.)

By combining the locally compact structure of ultra approximate groups ${A}$ with the Gleason-Yamabe theorem, one ends up being able to model a large “ultra approximate subgroup” ${A'}$ of ${A}$ by a Lie group ${L}$. Such Lie models serve a number of important purposes in the structure theory of approximate groups. Firstly, as all Lie groups have a dimension which is a natural number, they allow one to assign a natural number “dimension” to ultra approximate groups, which opens up the ability to perform “induction on dimension” arguments. Secondly, Lie groups have an escape property (which is in fact equivalent to no small subgroups property): if a group element ${g}$ lies outside of a very small ball ${B_\epsilon}$, then some power ${g^n}$ of it will escape a somewhat larger ball ${B_1}$. Or equivalently: if a long orbit ${g, g^2, \ldots, g^n}$ lies inside the larger ball ${B_1}$, one can deduce that the original element ${g}$ lies inside the small ball ${B_\epsilon}$. Because all Lie groups have this property, we will be able to show that all ultra approximate groups ${A}$ “essentially” have a similar property, in that they are “controlled” by a nearby ultra approximate group which obeys a number of escape-type properties analogous to those enjoyed by small balls in a Lie group, and which we will call a strong ultra approximate group. This will be discussed in the next set of notes, where we will also see how these escape-type properties can be exploited to create a metric structure on strong approximate groups analogous to the Gleason metrics studied in previous notes, which can in turn be exploited (together with an induction on dimension argument) to fully classify such approximate groups (in the finite case, at least).

There are some cases where the analysis is particularly simple. For instance, in the bounded torsion case, one can show that the associated Lie model ${L}$ is necessarily zero-dimensional, which allows for a easy classification of approximate groups of bounded torsion.

Some of the material here is drawn from my recent paper with Ben Green and Emmanuel Breuillard, which is in turn inspired by a previous paper of Hrushovski.

— 1. Ultralimits of metric spaces (Optional) —

Suppose one has a sequence ${(X_n,d_n)}$ of metric spaces. Intuitively, there should be a sense in which such a sequence can (in certain circumstances) “converge” to a limit ${(X,d)}$ that is another metric space. Some informal examples of this intuition:

• (i) The sets ${\{-n,\ldots,n\}}$ (with the usual metric) should “converge” as ${n \rightarrow \infty}$ to the integers ${{\bf Z}}$ (with the usual metric).
• (ii) The cyclic groups ${{\bf Z}/n{\bf Z}}$ (with the “discrete” metric ${d(i,j) := \hbox{dist}(i+nZ, j+nZ)}$) should also “converge” as ${n \rightarrow \infty}$ to the integers ${{\bf Z}}$ (with the usual metric).
• (iii) The sets ${\{ 0, \frac{1}{n}, \ldots, \frac{n-1}{n}\}}$ (with the usual metric) should “converge” as ${n \rightarrow \infty}$ to the interval ${[0,1]}$ (with the usual metric).
• (iv) The cyclic groups ${{\bf Z}/n{\bf Z}}$ (with the “bounded” metric ${d(i,j) := \frac{1}{n} \hbox{dist}(i+nZ, j+nZ)}$) should “converge” as ${n \rightarrow \infty}$ to the unit circle ${{\bf R}/{\bf Z}}$ (with the usual metric).
• (v) A Euclidean circle of radius ${n}$ (such as ${\{ (x,y) \in {\bf R}^2: x^2+(y-n)^2=n^2\}}$) should “converge” as ${n \rightarrow \infty}$ to a Euclidean line (such as ${\{ (x,0) \in {\bf R}^2: x \in {\bf R} \}}$).

Let us now try to formalise this intuition, proceeding in stages. The first attempt to formalise the above concepts is via the Hausdoff distance, which already made an appearance in Notes 4:

Definition 2 (Hausdorff distance) Let ${X = (X,d)}$ be a metric space. The Hausdorff distance ${d_H(E,F)}$ between two non-empty subsets ${E, F}$ of ${X}$ is defined by the formula

$\displaystyle d_H(E,F) := \max( \sup_{x \in E} d(x,F), \sup_{y \in F} d(E,y) ).$

Thus, if ${d_H(E,F) < r}$, then every point in ${E}$ lies within a distance less than ${r}$ of a point in ${F}$, and every point in ${F}$ lies within a distance less ${r}$ of a point in ${E}$ (thus ${E \subset N_r(F)}$ and ${F \subset N_r(E)}$, where ${N_r(F)}$ is the ${r}$-neighbourhood of ${E}$); and conversely, if the latter claim holds, then ${d_H(E,F) \leq r}$.

This distance is always symmetric, non-negative, and obeys the triangle inequality. If one restricts attention to non-empty compact sets ${E, F}$, then one easily verifies that ${d_H(E,F)=0}$ if and only if ${E=F}$, so that the Hausdorff distance becomes a metric. In particular, it becomes meaningful to discuss the concept of a sequence ${E_n}$ of non-empty compact subsets of ${X}$ converging in the Hausdorff distance to a (unique) limit non-empty compact set ${E}$. Note that this concept captures the intuitive example (iii) given above, but not any of the others. Nevertheless, we will discuss it first as it is slightly simpler than the more general notions we will be using.

Exercise 1 (Hausdorff convergence and connectedness) Let ${E_n}$ be a sequence of non-empty compact subsets of a metric space ${X}$ converging to another non-empty compact set ${E}$.

• (i) If the ${E_n}$ are all connected, show that ${E}$ is connected also.
• (ii) If the ${E_n}$ are all path-connected, does this imply that ${E}$ is path-connected also? Support your answer with a proof or counterexample.
• (iii) If the ${E_n}$ are all disconnected, does this imply that ${E}$ is disconnected also? Support your answer with a proof or counterexample.

One of the key properties of Hausdorff distance in a compact set is that it is itself compact:

Lemma 3 (Compactness of the Hausdorff metric) Let ${E_n}$ be a sequence of compact subsets of a compact metric space ${X}$. Then there is a subsequence of the ${E_n}$ that is convergent in the Hausdorff metric.

The proof of this lemma was set as Exercise 13 of Notes 4. It can be proven by “conventional” means (relying in particular on the Heine-Borel theorem), but we will now sketch how one can establish this result using ultralimits instead. As in the previous set of notes, we fix a standard universe ${{\mathfrak U}}$ and a non-principal ultrafilter ${\alpha}$ in order to define ultrapowers.

If ${X = (X,d)}$ is a (standard) metric space with metric ${d: X \times X \rightarrow {\bf R}^+}$, then the ultrapower ${{}^* X}$ comes with a “nonstandard metric” ${{}^* d: {}^* X \times {}^* X \rightarrow {}^* {\bf R}^+}$ that extends ${d}$. (In the previous notes, we referred to this metric as ${d}$ rather than ${{}^* d}$, but here it will be convenient to use a different symbol for this extension of ${d}$ to reduce confusion.) Let us say that two elements ${x, y}$ of ${{}^* X}$ are infinitesimally close if ${{}^* d(x,y) = o(1)}$. (The set of points infinitesimally close to a standard point ${x}$ is sometimes known as the monad of ${x}$, although we will not use this terminology.)

Exercise 2 Let ${(X,d)}$ be a (standard) compact metric space. Show that for any nonstandard point ${x \in {}^* X}$ there exists a unique standard point ${\hbox{st}(x) \in {}^* X}$ which is infinitesimally close to ${x}$, with

$\displaystyle d( \hbox{st}(x), \hbox{st}(y) ) = \hbox{st} ( {}^* d(x,y) ).$

Exercise 3 (Automatic completeness) Let ${(X,d)}$ be a (standard) compact metric space, and let ${E = \prod_{n \rightarrow \alpha} E_n}$ be a nonstandard subset of ${{}^* X}$ (i.e. an ultraproduct of standard subsets ${E_n}$ of ${X}$), and let ${\hbox{st}: {}^* X \rightarrow X}$ be the standard part function as defined in the previous exercise. Show that ${\hbox{st}(E)}$ is complete (and thus compact, by the Heine-Borel theorem). (Hint: take advantage of the countable saturation property from the previous notes.)

Exercise 4 Let ${(X,d)}$ be a compact metric space, and let ${E_n}$ for ${n \in {\bf N}}$ be a sequence of non-empty compact subsets of ${X}$. Write ${E := \prod_{n \rightarrow \alpha} E_n}$ for the ultraproduct of the ${E_n}$, and let ${\hbox{st}: {}^* X \rightarrow X}$ be the standard part function as defined in the previous exercises. Show that ${\hbox{st}(E)}$ is a non-empty compact subset of ${X}$, and that ${\lim_{n \rightarrow \alpha} d_H( E_n, \hbox{st}(E) ) = o(1)}$. Use this to give an alternate proof of Lemma 3.

One can extend some of the above theory from compact metric spaces to locally compact metric spaces.

Exercise 5 Let ${(X,d)}$ be a (standard) locally compact metric space. Let ${X \subset O(X) \subset {}^* X}$ denote the set of ultralimits ${\lim_{n \rightarrow \alpha} x_n}$ of precompact sequences ${x_n}$ in ${X}$.

• Show that ${O(X) = {}^* X}$ if and only if ${X}$ is compact.
• Show that there is a unique function ${\hbox{st}: O(X) \rightarrow X}$ such that ${x}$ is infinitesimally close to ${\hbox{st}(x)}$ for all ${x \in O(X)}$.
• If ${E}$ is a nonstandard subset of ${{}^* X}$, show that ${\hbox{st}( E \cap O(X) )}$ is complete.
• Show that if ${K}$ is a compact set and ${\epsilon > 0}$ is a (standard) real number, then ${K \cap E \subset N_\epsilon(E_n)}$ and ${K \cap E_n \subset N_\epsilon(E)}$ for all ${n}$ sufficiently close to ${\alpha}$.

Now we consider limits of metric spaces ${(X_n,d_n)}$ that are not necessarily all embedded in a single metric space. We begin by considering the case of bounded metric spaces.

Definition 4 (Gromov-Hausdorff distance) The Gromov-Hausdorff distance ${d_{GH}(X,Y)}$ between two bounded metric spaces ${(X,d_X)}$, ${(Y,d_Y)}$ is the infimum of the Hausdorff distance ${d_H( \iota_{X \rightarrow Z}(X), \iota_{Y \rightarrow Z}(Y) )}$, ranging over all isometric embeddings ${\iota_{X \rightarrow Z}: X \rightarrow Z}$, ${\iota_{Y \rightarrow Z}: Y \rightarrow Z}$ from ${X, Y}$ respectively to ${Z}$.

Exercise 6

• (i) Show that Gromov-Hausdorff distance is a pseudometric (i.e. symmetric, non-negative, and obeys the triangle inequality). (Hint: to show that ${d_{GH}(X,Z) \leq d_{GH}(X,Y)+d_{GH}(Y,Z)}$, build a metric on the disjoint union ${X \uplus Z}$ which intuitively captures the idea of the shortest path from ${X}$ to ${Z}$ via ${Y}$.)
• (ii) If ${E}$ is a dense subset of a bounded metric space ${X}$, show that ${d_{GH}(E,X)=0}$. In particular, any bounded metric space ${E}$ is at a zero Gromov-Hausdorff distance from its metric completion ${\overline{E}}$.
• (iii) Show that if ${X, Y}$ are compact metric spaces, then ${d_{GH}(X,Y) = 0}$ if and only if ${X}$ and ${Y}$ are isometric. (Hint: If ${d_{GH}(X,Y)=0}$, find a sequence of “approximate isometries” from ${X}$ to ${Y}$ and from ${Y}$ to ${X}$ which almost invert each other. Then adapt the Arzelá-Ascoli theorem to take a limit (or one can use ultralimits and standard parts).)

We say that a sequence ${X_n=(X_n,d_n)}$ of bounded metric spaces converges in the Gromov-Hausdorff sense to another bounded metric space ${X_\infty}$ if one has ${d_{GH}(X_n,X_\infty) \rightarrow 0}$ as ${n \rightarrow \infty}$. This is a more general notion of convergence than Hausdorff convergence, and encompasses examples (iii) and (iv) at the beginning of this section.

Exercise 7 Show that every compact metric space is the limit (in the Gromov-Hausdorff sense) of finite metric spaces.

Now we turn to an important compactness result about Gromov-Hausdorff convergence. We say that a sequence ${X_n = (X_n,d_n)}$ of metric spaces is uniformly totally bounded if the diameters of the ${X_n}$ are bounded, and, for every ${\epsilon > 0}$, there exists ${C_\epsilon}$ such that each ${X_n}$ can be covered by at most ${C_\epsilon}$ balls of radius ${\epsilon}$ in the ${d_n}$ metric. (Of course, this implies that each ${X_n}$ is individually totally bounded.)

Proposition 5 (Uniformly total bounded spaces are Gromov-Hausdorff precompact) Let ${X_n}$ be a sequence of uniformly totally bounded metric spaces. Then there exists a subsequence ${X_{n_j}}$ which converges in the Gromov-Hausdorff space to a compact limit ${X_\infty}$.

We will prove this proposition using ultrafilters. Let ${X_n = (X_n,d_n)}$ be a sequence of uniformly totally bounded metric spaces, which we will assume to be standard (by defining the standard universe in a suitable fashion). Then we can form the ultraproduct ${X := \prod_{n \rightarrow \alpha} X_n}$ and the ultralimit ${d := \lim_{n \rightarrow \alpha} d_n}$, thus ${d: X \times X \rightarrow {}^* {\bf R}^+}$ is a nonstandard metric (obeying the nonstandard symmetry, triangle inequality, and positivity properties). As the ${X_n}$ are uniformly bounded, ${d}$ has bounded range. If we then take the standard part ${\hbox{st}(d): X \times X \rightarrow {\bf R}^+}$ of ${d}$, then ${\hbox{st}(d)}$ is a pseudometric on ${X}$ (i.e. it obeys all the axioms of a metric except possibly for positivity.) We can then construct a quotient metric space ${X_\infty := X/\sim}$ in the usual manner, by declaring two points ${x,y}$ in ${X}$ to be equivalent, ${x \sim y}$, if ${\hbox{st}(d)(x,y)=0}$ (or equivalently if ${d(x,y)=o(1)}$). The pseudometric ${\hbox{st}(d)}$ then descends to a genuine metric ${d_\infty:X_\infty \times X_\infty \rightarrow {\bf R}^+}$ on ${X_\infty}$. The space ${(X_\infty,d_\infty)}$ is known as a metric ultralimit (or ultralimit for short) of the ${(X_n,d_n)}$ (note that this is a slightly different usage of the term “ultralimit” from what we have been using previously).

One can easily establish compactness:

Exercise 8 (Compactness)

• (i) Show that ${(X_\infty, d_\infty)}$ is totally bounded. (Hint: use the uniform total boundedness of the ${X_n}$ together with Los’s theorem.)
• (ii) Show that ${(X_\infty,d_\infty)}$ is complete. (Hint: use countable saturation).

In particular, by the Heine-Borel theorem, ${X_\infty}$ is compact.

Now we establish Gromov-Hausdorff convergence, in the sense that for every standard ${\epsilon > 0}$, one has ${d_{GH}(X_n,X_\infty) < \epsilon}$ for all ${n}$ sufficiently close to ${\alpha}$. From this it is easy to extract a subsequence ${X_{n_j}}$ that converges in the Gromov-Hausdorff sense to ${X_\infty}$ as required.

Fix a standard ${\epsilon>0}$. As ${X_\infty}$ is totally bounded, we can cover it by at most ${M}$ balls ${B_\infty(x_{1,\infty},\epsilon/10), \ldots, B_\infty(x_{M,\infty},\epsilon/10)}$ of radius ${\epsilon/10}$ (say) in the metric ${d_\infty}$ for some standard natural number ${M}$. Lifting back to the ultraproduct ${X}$, we conclude that we may cover ${X}$ by ${M}$ balls ${B(x_1,2\epsilon/10),\ldots,B(x_M,2\epsilon/10)}$ in the nonstandard metric ${d}$.

Note that ${d_\infty(x_{i,\infty},x_{j,\infty})}$ is the standard part of ${d(x_i,x_j)}$, and in particular

$\displaystyle d_\infty(x_{i,\infty},x_{j,\infty}) - \epsilon/10 < d(x_i,x_j) < d_\infty(x_{i,\infty},x_{j,\infty}) + \epsilon/10.$

Each ball centre ${x_i}$ is an ultralimit ${x_i = \lim_{n \rightarrow \alpha} x_{i,n}}$ of points ${x_{i,n}}$ in ${X_n}$. By Los’s theorem, we conclude that for ${n}$ sufficiently close to ${\alpha}$, ${X_n}$ is covered by the ${M}$ balls ${B_n(x_{1,n},2\epsilon/10), \ldots, B_n(x_{M,n},2\epsilon/10)}$ in the metric ${d_n}$, and

$\displaystyle d_\infty(x_{i,\infty},x_{j,\infty}) - \epsilon/10 < d_n(x_{i,n},x_{j,n}) < d_\infty(x_{i,\infty},x_{j,\infty}) + \epsilon/10$

for all ${1 \leq i,j \leq M}$.

Because of this, we can embed both ${X_n}$ and ${X_\infty}$ in the disjoint union ${X_n \uplus X_\infty}$, with a metric ${d^{(n)}}$ extending the metrics on ${X_n}$ and ${X_\infty}$, with the cross distances ${d^{(n)}( x, y )}$ between points ${x \in X_n}$ and ${y \in X_\infty}$ defined by the formula

$\displaystyle d^{(n)}(x,y) = d^{(n)}(y,x) := \inf \{ d_\infty(x_{i,\infty},y) + d_\infty(x_{i,n},x) + \epsilon/2: 1 \leq i \leq M \};$

informally, this connects each ball center ${x_{i,\infty}}$ to its counterpart ${x_{i,n}}$ by a path of length ${\epsilon/2}$. It is a routine matter to verify that ${d^{(n)}}$ is indeed a metric, and that ${X_n}$ and ${X_\infty}$ are separated by a Hausdorff distance less than ${\epsilon}$ in ${X_n \uplus X_\infty}$, and the claim follows.

Exercise 9 Prove Proposition 5 without using ultrafilters.

Exercise 10 (Completeness) Let ${X_n}$ be a sequence of bounded metric spaces which is Cauchy in the Gromov-Hausdorff sense (i.e. ${d_{GH}(X_n,X_m) \rightarrow 0}$ as ${n,m \rightarrow \infty}$). Show that ${X_n}$ converges in the Gromov-Hausdorff sense to some limit ${X}$.

Now we generalise Gromov-Hausdorff convergence to the setting of unbounded metric spaces. To motivate the definition, let us first give an equivalent form of Gromov-Hausdorff convergence:

Exercise 11 Let ${(X_n,d_n)}$ be a sequence of bounded metric spaces, and let ${(X_\infty,d_\infty)}$ be another bounded metric space. Show that the following are equivalent:

• (i) ${(X_n,d_n)}$ converges in the Gromov-Hausdorff sense to ${(X_\infty,d_\infty)}$.
• (ii) There exist maps ${\phi_n: X_\infty \rightarrow X_n}$ which are asymptotically isometric isomorphisms in the sense that ${\sup_{x,y \in X} |d_n(\phi_n(x),\phi_n(y)) - d(x,y)| \rightarrow 0}$ and ${\sup_{x_n \in X_n} \hbox{dist}_n( x_n, \phi_n(X_\infty) ) \rightarrow 0}$ as ${n \rightarrow \infty}$.

Define a pointed metric space to be a triplet ${(X,d,p)}$, where ${(X,d)}$ is a metric space and ${p}$ is a point in ${X}$.

Definition 6 (Pointed Gromov-Hausdorff convergence) A sequence of pointed metric spaces ${(X_n,d_n,p_n)}$ is said to converge in the pointed Gromov-Hausdorff sense to another pointed metric space ${(X_\infty,d_\infty,p_\infty)}$ if there exists a sequence of maps ${\phi_n: X_\infty \rightarrow X_n}$ such that

• ${d_n(\phi_n(p_\infty),p_n) \rightarrow 0}$ as ${n \rightarrow \infty}$;
• (Asymptotic isometry) For each ${R>0}$, one has ${\sup_{x,y \in B_\infty(p_\infty,R)} |d_n(\phi_n(x),\phi_n(y)) - d_\infty(x,y)| \rightarrow 0}$ as ${n \rightarrow \infty}$;
• (Asymptotic surjectivity) For each ${R'>R>0}$, one has ${\sup_{x \in B_n(p_n,R)} \hbox{dist}_n( x, \phi_n(B_\infty(p_\infty,R')) ) \rightarrow 0}$ as ${n \rightarrow \infty}$.

Exercise 12 Verify that all the examples (i)-(v) given at the start of the section are examples of pointed Gromov-Hausdorff convergence, once one selects a suitable point in each space.

The above definition is by no means the only definition of pointed Gromov-Hausdorff convergence. Here is another (which is basically the original definition of Gromov):

Exercise 13 Let ${(X_n,d_n,p_n)}$ be a sequence of pointed metric spaces, and let ${(X_\infty,d_\infty,p_\infty)}$ be another pointed metric space. Show that the following are equivalent:

• ${(X_n,d_n,p_n)}$ converges in the pointed Gromov-Hausdorff sense to ${(X_\infty,d_\infty,p_\infty)}$.
• There exist metrics ${\tilde d_n}$ on the disjoint union ${X_n \cup X_\infty}$ extending the metrics ${d_n, d_\infty}$ such that for some sequence ${\epsilon_n \rightarrow 0}$, one has ${\tilde d_n(p_n,p_\infty) \leq \epsilon_n}$, ${B_n(p_n,1/\epsilon_n) \subset N_{\epsilon_n}(X_\infty)}$, and ${B_\infty(p_\infty,1/\epsilon_n) \subset N_{\epsilon_n}(X_n)}$.

Using this equivalence, construct a pseudometric between pointed metric spaces that describes pointwise Gromov-Hausdorff convergence.

Exercise 14 Let ${(X_n,d_n,p_n)}$ be a sequence of pointed metric spaces which converge in the pointed Gromov-Hausdorff sense to a limit ${(X_\infty,d_\infty,p_\infty)}$, and also to another limit ${(X'_\infty,d'_\infty,p'_\infty)}$. Suppose that these two limit spaces are proper, which means that the closed balls ${\overline{B_\infty(p_\infty,R)}}$ and ${\overline{B'_\infty(p'_\infty,R)}}$ are compact. Show that the two limit spaces are pointedly isometric, thus there is an isometric isomorphism ${\phi: X_\infty \rightarrow X'_\infty}$ that maps ${p_\infty}$ to ${p'_\infty}$.

In the compact case, Gromov-Hausdorff convergence and pointwise Gromov-Hausdorff convergence are almost equivalent:

Exercise 15 Let ${(X_n,d_n,p_n)}$ be a sequence of bounded pointed metric spaces, and let ${(X_\infty,d_\infty,p_\infty)}$ be a compact pointed metric space.

• (i) Show that if ${(X_n,d_n,p_n)}$ converges in the pointed Gromov-Hausdorff sense to ${(X_\infty,d_\infty,p_\infty)}$, then ${(X_n,d_n)}$ converges in the Gromov-Hausdorff sense to ${(X_\infty,d_\infty)}$.
• (ii) Conversely, if ${(X_n,d_n)}$ converges in the Gromov-Hausdorff sense to ${(X_\infty,d_\infty)}$, show that some subsequence ${(X_{n_j},d_{n_j},p_{n_j})}$ converges in the pointed Gromov-Hausdorff sense to ${(X_\infty,d_\infty,q)}$ for some ${q \in X_\infty}$.

There is a compactness theorem for pointed Gromov-Hausdorff convergence analogous to that for ordinary Gromov-Hausdorff convergence (Proposition 5):

Proposition 7 (Locally uniformly total bounded spaces are pointed Gromov-Hausdorff precompact) Let ${X_n = (X_n,d_n,p_n)}$ be a sequence of pointed metric spaces such that the balls ${B_n(p_n,R)}$ are uniformly totally bounded in ${n}$ for each fixed ${R>0}$. Then there exists a subsequence ${X_{n_j}}$ which converges in the pointed Gromov-Hausdorff space to a limit ${X_\infty = (X_\infty,d_\infty,p_\infty)}$ which is proper (i.e. every closed ball is compact).

We sketch a proof of this proposition in the exercise below.

Exercise 16 Let ${X_n}$ be as in the above proposition; we assume the ${X_n}$ to be standard. Let ${X := \prod_{n \rightarrow \alpha} X_n}$ be the ultraproduct of the ${X_n}$, and define the ultralimits ${d := \prod_{n \rightarrow \alpha} d_n}$ and ${p := \prod_{n \rightarrow \alpha}}$. Let ${O(X) := \{x \in X: d(x,p) = O(1)\}}$ be the points in ${X}$ that are a bounded distance from ${p}$.

• (i) Show that ${\hbox{st}(d): O(X) \times O(X) \rightarrow {\bf R}^+}$ is a pseudometric on ${O(X)}$.
• (ii) Show that the associated quotient space ${X_\infty := O(X)/\sim}$ with the quotient metric ${d_\infty}$ is a proper metric space.
• (iii) Show that some subsequence of the ${(X_n,p_n)}$ converge in the pointed Gromov-Hausdorff sense to ${(X_\infty,p_\infty)}$, where ${p_\infty}$ is the image of ${p}$ under the quotient map, thus establishing Proposition 7.

Exercise 17 Establish Proposition 7 without using ultrafilters.

Exercise 18 Let ${G}$ be a locally compact group with a Gleason metric ${d}$. Show that the pointed metric spaces ${(G, n d, 1)}$ converge in the pointed Gromov-Hausdorff sense as ${n \rightarrow \infty}$ to the vector space ${L(G)}$ with the metric ${d_{L(G)}(x,y) := \|x-y\|}$ and distinguished point ${0}$, where the norm ${\| \|}$ on ${L(G)}$ was defined in Exercise 7 of Notes 2.

Remark 1 The above exercise suggests that one could attack Hilbert’s fifth problem by somehow “blowing up” the locally compact group ${G}$ around the origin and extracting a Gromov-Hausdorff limit using ultraproducts. This approach can be formalised using the language of nonstandard analysis, and in particular can be used to describe Hirschfeld’s nonstandard solution to Hilbert’s fifth problem, as well as the later work of Goldbring. However, we will not detail this approach extensively here (though, on some level, it contains much the same ingredients as the known “standard” solutions to that problem, such as the one given in previous notes).

We now prove Sanders’ lemma (Lemma 1), which roughly speaking will be needed to establish an important “total boundedness” property of ultra approximate groups, which in turn is necessary to ensure local compactness for models of such groups.

Let ${A}$ be a ${K}$-approximate group for some ${K>1}$, and let ${m}$ be a (large) natural number. Our task is to locate a large set ${S}$ such that the iterated power ${S^m}$ is contained inside ${A^4}$. Sanders’ strategy for doing this is to pick a set ${S}$ that nearly stabilises a set ${B}$ that is comparable in some sense to ${A}$. More precisely, suppose we have non-empty finite sets ${S}$ and ${B}$ with the property that

$\displaystyle |B \backslash sB| < \frac{1}{m} |B|$

for all ${s \in S}$. Then an easy induction shows that

$\displaystyle |B \backslash gB| < \frac{k}{m} |B|$

for all ${g \in S^k}$ and all ${k \geq 1}$. In particular, ${B}$ and ${gB}$ are not disjoint whenever ${g \in S^m}$, which means that ${S^m \subset B B^{-1}}$. Thus, to prove Sanders’ lemma, it suffices to find a non-empty set ${B \subset A^2}$ with the property that the set

$\displaystyle S := \{ s \in G: |B \backslash sB| < \frac{1}{m} |B| \} \ \ \ \ \ (1)$

has cardinality ${\gg_{K,m} |A|}$. (Note that this set ${S}$ will automatically be symmetric and contain the origin.)

Remark 2 The set (1) is known as a symmetry set of ${B}$ and is sometimes denoted ${Sym_{1/m}(B)}$. One can also interpret this set as a ball, using the mapping ${g \mapsto \frac{1}{|B|} 1_{gB}}$ of ${G}$ into ${\ell^1(G)}$ to pull back the ${\ell^1(G)}$ metric to ${G}$, in which case ${S}$ is the ball of radius ${\frac{1}{m}}$ centred at the origin.

It remains to find a set ${B}$ for which the set (1) is large. To motivate how we would do this, let us naively try setting ${B := A^2}$. If the set (1) associated to this set is large, we will be done, so let us informally consider the opposite case in which ${S}$ is extremely small; in particular, we have

$\displaystyle |A^2 \backslash gA^2| \geq \frac{1}{m} |A^2|$

for “most” choices of ${g \in G}$. In particular,

$\displaystyle |A^2 \cap gA^2| \leq (1-\frac{1}{m}) |A^2|.$

Now observe that ${(A \cap gA) A}$ is contained in ${A^2 \cap gA^2}$, and so

$\displaystyle |(A \cap gA) A| \leq (1-\frac{1}{m}) |A^2|.$

We have thus achieved a dichotomy: either the choice ${B := A^2}$ “works”, or else the set ${B' := A' A}$ is significantly smaller than ${B = A^2}$ for some ${A' = A \cap gA}$. We can then try to repeat this dichotomy, to show that the choice ${B' = A' A}$ either “works”, or else the set ${B'' := A'' A}$ is significantly smaller than ${B' = A' A}$ for some ${A'' = A' \cap g' A'}$. We can keep iterating this dichotomy, creating ever smaller sets of the form ${B_n = A_n A}$; but on the other hand, these sets should be at least as large as ${|A|}$ (provided that we choose ${g}$, ${g'}$, etc. to prevent the sets ${A', A''}$, etc. from becoming completely empty). So at some point this iteration has to terminate, at which point we should get a set that “works”.

The original paper of Sanders contains a formalisation of the above argument. We present a slightly different arrangement of the argument below, which is focused on making sure that sets such as ${A'}$ do not get too small.

For any ${0 < t \leq 1}$, define the quantity

$\displaystyle f(t) := \inf \{ \frac{|A' A|}{|A|}: A' \subset A; |A'| \geq t|A| \}.$

Then ${f}$ is a non-decreasing function that takes values between ${1}$ and ${K}$. By the pigeonhole principle, we can find ${t \gg_{K,m} 1}$ such that

$\displaystyle f( \frac{t^2}{2K} ) > (1 - \frac{1}{m}) f(t). \ \ \ \ \ (2)$

(The reasons for these particular choices of parameters will become clearer shortly.) Fix this ${t}$, and let ${A'}$ attain the infimum for ${f(t)}$, thus ${A' \subset A}$, ${|A'| \geq t|A|}$, and

$\displaystyle |A' A| = f(t) |A|.$

(Such an ${A'}$ exists since the infimum is only being taken over a finite set.) Set ${B := A' A}$, and let ${S}$ be the set in (1). Observe that if ${g \not \in S}$, then

$\displaystyle |A' A \backslash gA' A| \geq \frac{1}{m} |A' A|,$

and so by arguing as before we see that

$\displaystyle |(A' \cap gA') A| \leq |A' A| - \frac{1}{m} |A' A|.$

Since ${|A^2| \geq |A|}$ and ${|A' A| = f(t) |A|}$, we thus have

$\displaystyle |(A' \cap gA') A| \leq (1 - \frac{1}{m}) f(t) |A|.$

In particular, from (2) and the definition of ${f}$, we conclude that

$\displaystyle |A' \cap gA'| < \frac{t^2}{2K} |A|$

for all ${g \not \in S}$. In other words,

$\displaystyle S \supset \{ g \in G: |A' \cap gA'| \geq \frac{t^2}{2K} |A| \}. \ \ \ \ \ (3)$

So to finish the proof of Sanders’ lemma, it will suffice to obtain a lower bound on the right-hand side of (3). But this can be done by a standard Cauchy-Schwarz argument: starting with the identity

$\displaystyle \sum_{x \in A^2} \sum_{a \in A} 1_{aA'}(x) = |A| |A'| \geq t |A|^2$

and using the Cauchy-Schwarz inequality and the bound ${|A^2| \leq K|A|}$, we conclude that

$\displaystyle \sum_{x \in A^2} \sum_{a,b \in A} 1_{aA'}(x) 1_{bA'}(x) \geq \frac{t^2}{K} |A|^3$

and thus

$\displaystyle \sum_{a,b \in A} |aA' \cap bA'| \geq \frac{t^2}{K} |A|^3.$

By the pigeonhole principle, we may thus find ${a \in A}$ such that

$\displaystyle \sum_{b \in A} |aA' \cap bA'| \geq \frac{t^2}{K} |A|^2,$

and thus (setting ${g := a^{-1} b}$)

$\displaystyle \sum_{g \in a^{-1} A} |A' \cap gA'| \geq \frac{t^2}{K} |A|^2.$

Since ${a^{-1} A}$ has cardinality ${|A|}$, this forces ${|A' \cap gA'|}$ to exceed ${\frac{t^2}{2K} |A|}$ for at least ${\frac{t^2}{2K} |A|}$ values of ${g}$, and the claim follows.

Remark 3 The lower bound on ${S}$ obtained by this argument is of the shape ${|S| \geq \exp(-K^{O(m)}) |A|}$. This is not optimal; see Remark 5 below.

We now present an alternate approach to Sanders’ lemma, using the Croot-Sisask theory of almost periods. Whereas in the Sanders argument, one selected the set ${S}$ to be the set of group elements that almost stabilised a set ${B}$, we now select ${S}$ to be the set of group elements that almost stabilise (or are almost periods of) the convolution

$\displaystyle 1_A * 1_A(x) := \sum_{y \in G} 1_A(y) 1_A(y^{-1} x).$

It is convenient to work in the ${\ell^2(G)}$ metric. Since the function ${1_A * 1_A}$ has an ${\ell^1}$ norm of ${|A|^2}$ and is supported in ${A^2}$, which has cardinality at most ${K|A|}$, we see from Cauchy-Schwarz that

$\displaystyle \| 1_A * 1_A \|_{\ell^2(G)} \geq |A|^{3/2} / K^{1/2}.$

We now set

$\displaystyle S := \{ g \in G: \| 1_A * 1_A - \tau(g) 1_A * 1_A \|_{\ell^2(G)} < \frac{1}{m} \frac{|A|^{3/2}}{K^{1/2}} \}.$

Exercise 19 Show that ${S}$ is symmetric, contains the origin, and that ${S^m \subset A^4}$.

To finish the proof of Sanders’ lemma, it suffices to show that there are lots of almost periods of ${1_A * 1_A}$ in the sense that ${|S| \gg_{K,m} |A|}$.

The key observation here is that the translates ${\tau(g) 1_A * 1_A}$ of ${1_A * 1_A}$, as ${g}$ varies in ${A}$, range in a “totally bounded” set. This in turn comes from a “compactness” property of the convolution operator ${f \mapsto f * 1_A}$. when ${f}$ is supported on ${A^2}$. Croot and Sisask establish this by using a random sampling argument to approximate this convolution operator by a bounded rank operator. More precisely, let ${M \geq 1}$ be an integer parameter to be chosen later, and select ${M}$ sample points ${y_1,\ldots,y_M}$ from ${A^2}$ independently and uniformly at random (allowing repetitions). We will approximate the operator

$\displaystyle Tf := f * 1_A = \sum_{y \in A^2} f(y) 1_{y A}$

for ${f}$ supported on ${A^2}$ by the operator

$\displaystyle Sf := \frac{|A^2|}{M} \sum_{i=1}^M f(y_i) 1_{y_i A}.$

The point is that as ${M}$ gets larger, ${Sf}$ becomes an increasingly good approximation to ${Tf}$:

Exercise 20 Let ${f}$ be a function supported on ${A^2}$ that is bounded in magnitude by ${1}$. Show that

$\displaystyle \mathop{\bf E} Sf = Tf$

and

$\displaystyle \mathop{\bf E} \| Sf - Tf\|_{\ell^2(G)}^2 \ll_K |A|^3 / M.$

In particular, with probability at least ${1/2}$, one has

$\displaystyle \| Sf - Tf\|_{\ell^2(G)} \ll_K |A|^{3/2} M^{-1/2}.$

Thus, for any ${g \in A}$, we have

$\displaystyle \| S 1_{gA} - T 1_{gA} \|_{\ell^2(G)} \ll_K |A|^{3/2} M^{-1/2}$

with probability at least ${1/2}$. By the pigeonhole principle (or the first moment method), we thus conclude that there exists a choice of sample points ${y_1,\ldots,y_M}$ for which

$\displaystyle \| S 1_{gA} - T 1_{gA} \|_{\ell^2(G)} \ll_K |A|^{3/2} M^{-1/2} \ \ \ \ \ (4)$

for at least ${|A|/2}$ choices of ${g \in A}$.

Let ${A'}$ denote the set of all ${g \in A}$ indicated above. For each ${g \in A'}$, the function ${S 1_{gA}}$ is a linear combination of the ${M}$ functions ${\frac{|A^2|}{M} 1_{y_i A}}$ with coefficients between ${0}$ and ${1}$. Each of these functions ${\frac{|A^2|}{M} 1_{y_i A}}$ has an ${\ell^2(G)}$ norm of ${O_K( |A|^{3/2} )}$. Thus, by the pigeonhole principle, one can find a subset ${A''}$ of ${A'}$ of size ${|A''| \gg_{K,m,M} |A|}$ such that the functions ${S 1_{gA}}$ for ${g \in A''}$ all lie within ${\frac{1}{2m} \frac{|A|^{3/2}}{K^{1/2}}}$ in ${\ell^2(G)}$ norm of each other. If ${M}$ is large enough depending on ${K,m}$, we then conclude from (4) and the triangle inequality that the functions ${T 1_{gA}}$ for ${g \in A''}$ all lie within ${\frac{1}{m} \frac{|A|^{3/2}}{K^{1/2}}}$ in ${\ell^2(G)}$ norm of each other. Translating by some fixed element of ${A''}$, we obtain the claim.

Remark 4 For future reference, we observe that the above argument did not need the full strength of the hypothesis that ${A}$ was an ${K}$-approximate group; it would have sufficed for ${A}$ to be finite and non-empty with ${|A^2| \leq K|A|}$.

Remark 5 The above version of the Croot-Sisask argument gives a lower bound on ${S}$ of the shape ${|S| \geq \exp(-O( m^2 K \log K)) |A|}$. As was observed by Sanders, by optimising the argument (in particular, replacing ${\ell^2}$ with ${\ell^p}$ for a large value of ${p}$, and replacing ${1_A * 1_A}$ by ${1_A * 1_{A^2}}$), one can improve this to ${|S| \geq \exp( -O( m^2 \log^2 K ) ) |A|}$.

Remark 6 It is also possible to replace the random sampling argument above by a singular value decomposition of ${T}$ (restricted to something like ${A^2}$) to split it as the sum of a bounded rank component and a small operator norm component, after computing the Frobenius norm of ${T}$ in order to limit the number of large singular values. (In the abelian setting, this corresponds to a Fourier decomposition into large and small Fourier coefficients, which is a fundamental tool in additive combinatorics.) We will however not pursue this approach here.

As mentioned previously, the Sanders lemma will be useful in building topological group structure on ultra approximate groups ${A}$ (and the group ${\langle A \rangle}$ that they generate). The connection can be seen from the simple observation that if ${U}$ is a neighbourhood of the identity in a topological group and ${m \geq 1}$, then there exists another neighbourhood ${S}$ of the identity such that ${S^m \subset U}$. However, in addition to this multiplicative structure, we will also need to impose some conjugacy structure as well. (Strangely, even though the conjugation operation ${(g,h) \mapsto ghg^{-1}}$ can be defined in terms of the more “primitive” operations of multiplication ${(g,h) \mapsto gh}$ and inversion ${g \mapsto g^{-1}}$, it almost seems to be an independent group operation in some ways, at least for the purposes of studying approximate groups, and the most powerful results tend to come from combining multiplicative structure and conjugation structure together. I do not know a fundamental reason for this “product-conjugation phenomenon” but it does seem to come up a lot in this subject.) Given two non-empty subsets ${A, B}$ of a group ${G}$, define

$\displaystyle A^B := \{ a^b: a \in A, b \in B \}$

where ${a^b := b^{-1} a b}$ is the conjugate of ${a}$ by ${b}$. (Thus, for instance, ${H^B = H}$ whenever ${H}$ is a subgroup normalised by ${B}$.)

We can now establish a stronger version of the Sanders lemma:

Lemma 8 (Normal Sanders lemma) Let ${A}$ be a finite ${K}$-approximate group in a (global) group ${G}$, let ${m \geq 1}$, and let ${B}$ be a symmetric subset of ${A}$ with ${|B| \geq \delta |A|}$ for some ${\delta>0}$. Then there exists a symmetric subset ${N}$ of ${A^4}$ with ${|N| \gg_{K,m,\delta} |A|}$ containing the identity such that ${(N^{A^m})^m \subset B^4}$.

Roughly speaking, one can think of the set ${S}$ in Lemma 1 as a “finite index subgroup” of ${A^4}$, while the set in Lemma 8 is a “finite index normal subgroup” of ${A^4}$. To get from the former to the latter, we will mimic the proof of the following well-known fact:

Lemma 9 Suppose that ${H}$ is a finite index subgroup of a group ${G}$. Then there is a finite index normal subgroup ${N}$ of ${G}$ that is contained in ${H}$.

Proof: Consider all the conjugates ${H^g}$ of the finite index subgroup ${H}$ as ${g}$ ranges over ${G}$. As all the elements ${g}$ from the same right coset ${Hk}$ give the same conjugate ${H^g}$, and ${H}$ is finite index, we see that there are only finitely many such conjugates. Since the intersection of two finite index subgroups is again a finite index subgroup (why?), the intersection ${N := \bigcap_{g \in G} H^g}$ is thus a finite index subgroup of ${G}$. But we have ${N = N^g}$ for all ${g \in G}$, and so ${N}$ is normal as desired. $\Box$

Remark 7 An inspection of the argument reveals that if ${H}$ had index ${k}$ in ${G}$, then the normal subgroup ${N}$ has index at most ${k^k}$. One can do slightly better than this by looking at the left action of ${G}$ on the ${k}$-element quotient space ${G/H}$, which one can think of as a homomorphism from ${G}$ to the symmetric group ${S_k}$ of ${k}$ elements. The kernel ${N}$ of this homomorphism is then clearly a normal subgroup of ${G}$ of index at most ${|S_k|=k!}$ that is contained in ${H}$. However, this slightly more efficient argument seems to be more “fragile” than the more robust proof given above, in that it is not obvious (to me, at least) how to adapt it to the approximate group setting of the Sanders lemma. (Thus we see the advantages of knowing multiple proofs for various basic facts in mathematics.)

Inspired by the above argument, we now prove Lemma 8. The main difficulty is to find an approximate version of the claim that the intersection of two finite index subgroups is again a finite index subgroup. This is provided by the following lemma:

Lemma 10 Let ${A}$ be a ${K}$-approximate group, and let ${A_1, A_2 \subset A}$ be such that ${|A_1| \geq \delta_1 |A|}$ and ${|A_2| \geq \delta_2 |A|}$. Then there exists a subset ${B}$ of ${A}$ with ${B B^{-1} \subset A_1 A_1^{-1} \cap A_2 A_2^{-1}}$ and ${|B| \geq \delta_1 \delta_2 |A|/K}$.

Proof: Since ${A^{-1}_1 A_2 \subseteq A^2}$, we have ${|A^{-1}_1 A_2| \leq K|A|}$. It follows that there is some ${x}$ with at least ${\delta_1\delta_2|A|/K}$ representations as ${a^{-1}_1 a_2}$. Let ${B}$ be the set of all values of ${a_2}$ that appear. Obviously ${B B^{-1} \subseteq A_2 A^{-1}_2}$. Suppose that ${a_2, a'_2 \in B}$. Then there are ${a_1, a'_1}$ such that ${x = a_1^{-1} a_2 = (a'_1)^{-1} a'_2}$, and so ${a'_1 a^{-1}_1 = a'_2 a_2^{-1}}$. Thus ${B B^{-1}}$ lies in ${A_1A_1^{-1}}$ as well. $\Box$

Now we prove Lemma 8. By Lemma 1, we may find a symmetric set ${S}$ of ${B^4}$ containing the identity such that ${S^{8m}}$ (say) is contained in ${B^4}$ and ${|S| \gg_{K,m,\delta} |A|}$. We need the following simple covering lemma:

Exercise 21 (Ruzsa covering lemma) Let ${A, B}$ be finite non-empty subsets of a group ${G}$ such that ${|AB| \leq K |B|}$. Show that ${A}$ can be covered by at most ${K}$ left-translates ${aB B^{-1}}$ of ${B B^{-1}}$ for various ${a \in A}$. (Hint: find a maximal disjoint family of ${aB}$ with ${a \in A}$.)

From the above lemma we see that ${A}$ can be covered by ${O_{K,m}(1)}$ left translates of ${S^2}$. As ${A^m}$ can be covered by ${O_{K,m}(1)}$ left-translates of ${A}$, we conclude that ${A^m}$ can be covered by ${O_{K,m}(1)}$ left-translates of ${S^2}$. Since ${S^2 \subset A^8}$, we conclude that ${A^m \subset \bigcap_{j=1}^J a_j S^2}$ for some ${J=O_{K,m}(1)}$ and ${a_1,\ldots,a_J \in A^{m+8}}$.

The conjugates ${a_j^{-1} S a_j^{-1}}$ all lie in ${A^{2m+20}}$. By many applications of Lemma 10, we see that we may find a subset ${D}$ of ${A^{2m+4}}$ with ${|D| \gg_{K,m} |A|}$ such that

$\displaystyle D D^{-1} \subset a_j S^4 a_j^{-1}$

for all ${j=1,\ldots,J}$. In particular, if ${N := DD^{-1}}$, then ${N^{a_j} \subset S^4}$ for all ${j=1,\ldots,J}$, and thus ${N^{A^m} \subset S^8}$. Thus ${(N^{A^m})^m \subset B^4}$, and the claim follows.

Remark 8 The quantitative bounds on ${|N|}$ given by this argument are quite poor, being of triple exponential type (!) in ${K}$. It is likely that this can be improved by a more sophisticated argument.

— 3. Locally compact models of ultra approximate groups —

We now use the normal Sanders lemma to place a topology on ultra approximate groups. We first build a “neighbourhood base” associated to an ultra approximate group:

Exercise 22 Let ${A}$ be an ultra approximate group. Show that there exist a sequence of ultra approximate groups

$\displaystyle A^4 = A_0 \supset A_1 \supset A_2 \supset \ldots \supset \ldots$

such that

$\displaystyle (A_m^{A^m})^2 \subset A_{m-1}$

for all ${m \geq 1}$, and such that ${A^4}$ can be covered by finitely many left-translates of ${A_m}$ for each ${m}$. (Hint: you will need Lemma 8, Lemma 10, Exercise 21, and some sort of recursive construction.)

Remark 9 A simple example to keep in mind here is the nonstandard interval ${A = [-N,N]}$ for some unbounded nonstandard natural number ${N}$, with ${A_m := [-4N/2^m, 4N/2^m]}$.

This gives a good topology on the group ${\langle A \rangle = \bigcup_{m=1}^\infty A^m}$ generated by ${A}$:

Exercise 23 Let ${A}$ be an ultra approximate group, and let ${\langle A \rangle}$ be the group generated by ${A}$. Let ${A_0, A_1, \ldots}$ be as in the preceding exercise. Given a subset ${E}$ of ${\langle A \rangle}$, call a point ${g}$ in ${E}$ an interior point of ${E}$ if one has ${g A_m \subset E}$ for some (standard) ${m}$. Call ${E}$ open if every element of ${E}$ is an interior point.

• (i) Show that this defines a topology on ${E}$.
• (ii) Show that this topology makes ${\langle A \rangle}$ into a topological group (i.e. the group operations are continuous).
• (iii) Show that this topology is first countable (and thus pseudo-metrisable by the Birkhoff-Kakutani theorem (see Theorem 6 from Notes 4)).
• (iv) Show that one can build a left-invariant pseudometric ${d: \langle A \rangle \times \langle A \rangle \rightarrow {\bf R}^+}$ on ${\langle A \rangle}$ which generates the topology on ${\langle A \rangle}$, with the property that

$\displaystyle B(1, c 2^{-m}) \subset A_m \subset B(1, C 2^{-m} )$

for all (standard) ${m \geq 0}$ and some (standard) ${C > c > 0}$. (Hint: inspect the proof of the Birkhoff-Kakutani theorem.)

• (v) Show that ${\langle A \rangle}$ with this pseudometric is complete (i.e. every Cauchy sequence is convergent). (Hint: use the countable saturation property.)
• (vi) Show that ${A^m}$ is totally bounded for each standard ${m}$ (i.e. covered by a finite number of ${\epsilon}$-balls for each ${\epsilon>0}$).
• (vii) Show that ${\langle A \rangle}$ is locally compact.

With this topology, the group ${\langle A \rangle}$ becomes a locally compact topological group. However, in general this group will not be Hausdorff, because the identity ${1}$ need not be closed. Indeed, it is easy to see that the closure of ${1}$ is the set ${\bigcap_{m=1}^\infty A_m}$, which is not necessarily trivial. For instance, using the example from Remark 9, ${\langle A \rangle}$ is the group ${\{ n \in {}^* {\bf Z}: n = O(N) \}}$, and the closure of the identity is ${\{ n \in {}^* {\bf Z}: n=o(N)\}}$. However, we may quotient out by this closure of the identity to obtain a locally compact Hausdorff group ${L}$, which is now metrisable (with the metric induced from the pseudometric ${d}$ on ${\langle A \rangle}$). Let ${\pi: \langle A \rangle \rightarrow L}$ be the quotient homomorphism. This homomorphism obeys a number of good properties, which we formalise as a definition:

Definition 11 Let ${A}$ be an ultra approximate group. A (global) good model for ${A}$ is a homomorphism ${\pi: \langle A \rangle \rightarrow L}$ from ${\langle A \rangle}$ to a locally compact Hausdorff group ${L}$ that obeys the following axioms:

• (Thick image) There exists a neighbourhood ${U_0}$ of the identity in ${L}$ such that ${\pi^{-1}(U_0) \subset A}$ and ${U_0 \subset \pi(A)}$. (In particular, the kernel of ${\pi}$ lies in ${A}$.)
• (Compact image) ${\pi(A)}$ is precompact.
• (Approximation by nonstandard sets) Suppose that ${F \subset U \subset U_0}$, where ${F}$ is compact and ${U}$ is open. Then there exists a nonstandard finite set ${B}$ (i.e. an ultraproduct of finite sets) such that ${\pi^{-1}(F) \subset B \subset \pi^{-1}(U)}$.

We will often abuse notation and refer to ${L}$ as the good model, rather than ${\pi}$. (Actually, to be truly pedantic, it is the ordered pair ${(\pi,L)}$ which is the good model, but we will not use this notation often.)

Remark 10 In the next set of notes we will also need to consider local good models, which only model (say) ${A^8}$ rather than all of ${\langle A \rangle}$, and in which ${L}$ is a local group rather than a global one. However, for simplicity we will not discuss local good models for the moment.

Remark 11 The thick and compact image axioms imply that the geometry of ${L}$ in some sense corresponds to the geometry of ${A}$. In particular, if ${a}$ is an element of ${\langle A \rangle}$, then ${a}$ will be “small” (in the sense that it lies in ${A}$) if ${\pi(a)}$ is close to the identity, and “large” (in the sense that it lies outside ${A}$) if ${\pi(a)}$ is far away from the identity. Thus ${\pi}$ is “faithful” in some “coarse” or “macroscopic” sense. The inclusion ${U_0 \subset \pi(A)}$ is a sort of “local surjectivity” condition, and ensures that ${L}$ does not contain any “excess” or “redundant” components. The approximation by nonstandard set axiom is a technical “measurability” axiom, that ensures that the model of the ultra approximate group actually has something nontrivial to say about the finite approximate groups that were used to build that ultra approximate group (as opposed to being some artefact of the ultrafilter itself).

Example 1 We continue the example from Remark 9. A good model for ${A = [-N,N]}$ is provided by the homomorphism ${\pi: \langle A \rangle \rightarrow {\bf R}}$ given by the formula ${\pi(x) := \hbox{st}(x/N)}$. The thick image and compact image properties are clear. To illustrate the approximation by nonstandard set property, take ${F = [-r,r]}$ and ${U = (-s,s)}$ for some (standard) real numbers ${0 < r < s}$. The preimages ${\pi^{-1}(F) = \{ n \in {}^* {\bf Z}: |n| \leq rN + o(N) \}}$ and ${\pi^{-1}(U) = \{ n \in {}^* {\bf Z}: |n| < (s-\epsilon) N \hbox{ for some standard } \epsilon>0\}}$ are not nonstandard finite sets (why? use the least upper bound axiom), but one can find a nonstandard integer ${M}$ such that ${r < \hbox{st}(M/N) < s}$, and ${[-M,M]}$ will be a nonstandard finite set between ${F}$ and ${U}$.

The following fundamental observation is essentially due to Hrushovski:

Exercise 24 Let ${A}$ be an ultra approximate group. Show that ${A^4}$ has a good model by a locally compact Hausdorff metrisable group, given by the construction discussed previously.

Exercise 25 The purpose of this exercise is to show why it is necessary to model ${A^4}$, rather than ${A}$, in Exercise 24. Let ${F_2}$ be the field of two elements. For each positive integer ${n}$, let ${A_n}$ be a random subset of ${F_2^n}$ formed by selecting one element uniformly at random from the set ${x, x+e_n}$ for each ${x \in F_2^{n-1} \times \{0\}}$, and also selecting the identity ${0}$. Clearly, ${A_n}$ is a ${2}$-approximate group, since ${A_n}$ is contained in ${F_2^n = A_n + \{0,e_n\}}$.

• (i) Show that almost surely, for all but finitely many ${n}$, ${A_n}$ does not contain any set of the form ${B+B}$, with ${|B| \geq 2^{0.9 n}}$. (Hint: use the Borel-Cantelli lemma.)
• (ii) Show that almost surely, the ultraproduct ${\prod_{n \rightarrow \alpha} A_n}$ cannot be modeled by any locally compact group.

Let us now give some further examples of good models, beyond that given by Example 1.

Example 2 (Nonstandard finite groups) Suppose that ${A_n}$ is a sequence of (standard) finite groups; then the ultraproduct ${A := \prod_{n \rightarrow \alpha} A_n}$ is an ultra approximate group. In this case, ${A}$ is in fact a genuine group, thus ${A = \langle A \rangle}$. In this case, the trivial homomorphism ${\pi: \langle A \rangle \rightarrow L = \{1\}}$ to the trivial group ${\{1\}}$ is a good model of ${A}$. Conversely, it is easy to see that this is the only case in which ${\{1\}}$ is a good model for ${A}$.

Example 3 (Generalised arithmetic progression) We still work in the integers ${{\bf Z}}$, but now take ${A_n}$ to be the rank two generalised arithmetic progression

$\displaystyle A_n := P(1,n^{10}; n,n) := \{ a+bn^{10}: a,b \in \{-n,\ldots,n\}\}.$

Then the ultraproduct ${A := \prod_{n \rightarrow \alpha} A_n}$ is the subset of the nonstandard integers ${{}^* {\bf Z}}$ of the form

$\displaystyle A = P(1,N^{10};N,N) = \{ a+bN^{10}: a,b \in \{-N,\ldots,N\}\},$

where ${N}$ is the unbounded natural number ${N := \lim_{n \rightarrow \alpha} n}$. This is an ultra approximate group, with

$\displaystyle \langle A \rangle = \{ a + b N^{10}: a,b \in {}^* {\bf Z}; a,b = O(N) \}.$

Then ${\langle A \rangle}$ can be modeled by the Euclidean plane ${{\bf R}^2}$, using the model maps ${\pi: \langle A \rangle \rightarrow {\bf R}^2}$ defined for each standard ${m}$ by the formula

$\displaystyle \pi( a + bN^{10} ) := \left( \hbox{st} \frac{a}{N}, \hbox{st} \frac{b}{N} \right)$

whenever ${a,b = O(N)}$. The image ${\pi(A^m)}$ is then the square ${[-m,m]^2}$ for any standard ${m}$. Note here that while ${A}$ lives in a “one-dimensional” group ${{}^* {\bf Z}}$, the model ${{\bf R}^2}$ is “two-dimensional”. This is also reflected in the volume growth of the powers ${A_n^m}$ of ${A_n}$ for small ${m}$ and large ${n}$, which grow quadratically rather than linearly in ${m}$. Informally, ${A}$ is “modeled” by the unit square in ${{\bf R}^2}$.

Exercise 26 With the notation of Example 3, show that ${A}$ cannot be modeled by the one-dimensional Lie group ${{\bf R}}$. (Hint: If ${A}$ was modeled by ${{\bf R}}$, conclude that ${A^m}$ could be covered by ${O(m)}$ translates of ${A}$ for each standard ${m}$.)

Exercise 27 (Heisenberg box, I) We take each ${A_n}$ to be the “nilbox”

$\displaystyle A_n := \left\{ \left(\begin{smallmatrix} 1 & x_{n} & z_{n} \\ 0 & 1 & y_{n} \\ 0 & 0 & 1\end{smallmatrix}\right) \in \left(\begin{smallmatrix} 1 & {\bf Z} & {\bf Z} \\ 0 & 1 & {\bf Z} \\ 0 & 0 & 1\end{smallmatrix}\right) : |x_{n}|, |y_{n}| \leq {n}, |z_{n}| \leq n^{2}\right\}.$

Consider the ultraproduct ${A := \prod_{n \rightarrow \alpha} A_n}$; this is a subset of the nilpotent (nonstandard) group ${\left(\begin{smallmatrix} 1 & {}^* {\bf Z} & {}^* {\bf Z} \\ 0 & 1 & {}^* {\bf Z} \\ 0 & 0 & 1\end{smallmatrix}\right)}$, consisting of all elements ${\left(\begin{smallmatrix} 1 & x & z \\ 0 & 1 & y \\ 0 & 0 & 1\end{smallmatrix}\right)}$ with ${|x|, |y| \leq N}$ and ${|z| \leq N^2}$, where ${N := \lim_{n \rightarrow \alpha} n}$. Thus

$\displaystyle \langle A \rangle = \left(\begin{smallmatrix} 1 & O(N) & O(N^2) \\ 0 & 1 & O(N) \\ 0 & 0 & 1\end{smallmatrix}\right).$

Consider now the map

$\displaystyle \pi : \langle A\rangle \rightarrow \left(\begin{smallmatrix} 1 & {\bf R} & {\bf R} \\ 0 & 1 & {\bf R} \\ 0 & 0 & 1\end{smallmatrix}\right)$

defined by

$\displaystyle \pi\left( \left(\begin{smallmatrix} 1 & x & z \\ 0 & 1 & y \\ 0 & 0 & 1\end{smallmatrix}\right)\right) := \left(\begin{smallmatrix} 1 & \hbox{st} \frac{x}{N} & \hbox{st} \frac{z}{N^2} \\ 0 & 1 & \hbox{st} \frac{y}{N} \\ 0 & 0 & 1\end{smallmatrix}\right). \ \ \ \ \ (5)$

• (i) Show that ${A \cup A^{-1}}$ is an ultra approximate group.
• (ii) Show that ${\pi}$ is a good model of ${A \cup A^{-1}}$.

Exercise 28 (Heisenberg box, II) This is a variant of the preceding exercise, in which the ${A_n}$ is now defined as

$\displaystyle A_n := \left\{ \left(\begin{smallmatrix} 1 & x_{n} & z_{n} \\ 0 & 1 & y_{n} \\ 0 & 0 & 1\end{smallmatrix}\right) : |x_n|, |y_n| \leq n, |z_{n}| \leq n^{10}\right\} \ \ \ \ \ (6)$

so that the ultralimit ${A := \prod_{n \rightarrow \alpha} A_n}$ takes the form

$\displaystyle A := \left\{ \left(\begin{smallmatrix} 1 & x & z \\ 0 & 1 & y \\ 0 & 0 & 1\end{smallmatrix}\right) \in \left(\begin{smallmatrix} 1 & {}^* {\bf Z} & {}^* {\bf Z} \\ 0 & 1 & {}^* {\bf Z} \\ 0 & 0 & 1\end{smallmatrix}\right): |x|, |y| \leq N, |z| \leq N^{10}\right\}$

and

$\displaystyle \langle A\rangle := \left(\begin{smallmatrix} 1 & O(N) & O(N^{10}) \\ 0 & 1 & O(N) \\ 0 & 0 & 1\end{smallmatrix}\right),$

where ${N := \lim_{n \rightarrow \alpha} n}$. Now consider the map

$\displaystyle \pi : \langle A\rangle ^8 \rightarrow {\bf R}^3$

defined by

$\displaystyle \pi\left( \left(\begin{smallmatrix} 1 & x & z \\ 0 & 1 & y \\ 0 & 0 & 1\end{smallmatrix}\right) \right)= \left(\hbox{st} \frac{x}{N}, \hbox{st} \frac{y}{N}, \hbox{st} \frac{z}{N^{10}}\right).$

• (i) Show that ${A \cup A^{-1}}$ is an ultra approximate group.
• (ii) Show that ${\pi}$ is a good model of ${A \cup A^{-1}}$.
• (iii) Show that ${A \cup A^{-1}}$ cannot be modeled by ${{\bf R}^2}$, or by the Heisenberg group.

Remark 12 Note in the above exercise that the homomorphism ${\pi: A^8 \rightarrow {\bf R}^3}$ is not associated to any exact homomorphisms ${\pi_n}$ from ${A_n^8}$ to ${{\bf R}^3}$. Instead, it is only associated to approximate homomorphisms

$\displaystyle \pi_n\left( \left(\begin{smallmatrix} 1 & x_n & z_n \\ 0 & 1 & y_n \\ 0 & 0 & 1\end{smallmatrix}\right) \right) := \left(\frac{x_n}{n}, \frac{y_n}{n}, \frac{z_n}{n^{10}}\right)$

into ${{\bf R}^3}$. Such approximate homomorphisms are somewhat less pleasant to work with than genuine homomorphisms; one of the main reasons why we work in the ultraproduct setting is so that we can use genuine group homomorphisms.

We also note the sets ${A_n^m}$ for small ${m}$ and large ${n}$ grow cubically in ${m}$ in Exercise 27, and quartically in ${m}$ in Exercise 28. This reflects the corresponding growth rates in ${{\bf R}^3}$ and in the Heisenberg group respectively.

Finally, we observe that the nonabelian structure of the ultra approximate group ${A}$ is lost in the model group ${L}$, because the nonabelianness is “infinitesimal” at the scale of ${A}$. More generally, good models can capture the “macroscopic” structure of ${A}$, but do not directly see the “microscopic” structure.

The following exercise demonstrates that model groups need not be simply connected.

Exercise 29 (Models of Bohr sets) Let ${\alpha}$ be a standard irrational, let ${0 < \epsilon < 0.1}$ be a standard real number, and let ${A_n \subset {\bf Z}}$ be the sets

$\displaystyle A_n := \{ m \in [-n,n]: \| \alpha m \|_{{\bf R}/{\bf Z}} \leq \epsilon \}$

where ${\|x\|}$ denotes the distance to the nearest integer. Set ${A := \prod_{n \rightarrow \alpha} A_n}$.

• (i) Show that ${A}$ is an ultra approximate group.
• (ii) Show that ${{\bf R}/{\bf Z} \times {\bf R}}$ is a good model for ${A}$.
• (iii) Show that ${{\bf R}^2}$ is not a good model for ${A}$. (Hint: consider the growth of ${A^m}$, as measured by the number of translates of ${A}$ needed to cover this set.)
• (iv) Show that ${{\bf R}}$ is not a good model for ${A}$. (Hint: consider the decay of the sets ${\{ g: g,g^2,\ldots,g^m \in A \}}$, as measured by the number of translates of this set needed to cover ${A}$.)

Exercise 30 (Haar measure) Let ${\pi: \langle A \rangle \rightarrow L}$ be a good model for an ultra approximate group ${A = \prod_{n \rightarrow \alpha} A_n}$ by a locally compact group ${L}$. For any continuous compactly supported function ${f: L \rightarrow {\bf R}}$, we can define a functional ${I(f)}$ by the formula

$\displaystyle I(f) = \inf \hbox{st} \frac{\sum_{a \in A} F^+(a)}{|A|}$

where ${F^+ = \lim_{n \rightarrow \alpha} F^+_n}$ is the ultralimit of functions ${F^+_n: A_n \rightarrow {\bf R}}$, with the nonstandard real ${\sum_{a \in A} F^+}$ and nonstandard natural number ${|A|}$ defined in the usual fashion as

$\displaystyle \sum_{a \in A} F^+(a) := \lim_{n \rightarrow \alpha} \sum_{a_n \in A_n} F^+_n(a_n)$

and

$\displaystyle |A| := \lim_{n \rightarrow \alpha} |A_n|,$

and the infimum is over all ${F^+}$ for which ${F^+(a) \geq f(\pi(a))}$ for all ${a \in A}$.

• (i) Establish the equivalent formula

$\displaystyle I(f) = \sup \hbox{st} \frac{\sum_{a \in A} F^-(a)}{\sum_{a \in A} 1}$

where the supremum is over all ${F^-}$ for which ${F^-(a) \leq f(\pi(a))}$ for all ${a \in A}$.

• (ii) Show that there exists a bi-invariant Haar measure ${\mu}$ on ${G}$ such that ${I(f) = \int_G f\ d\mu}$ for all continuous compactly supported ${f: L \rightarrow {\bf R}}$. (In particular, this shows that ${L}$ is necessarily unimodular.)
• (iii) Show that

$\displaystyle \mu(F) |A| \leq |A'| \leq \mu(U) |A|$

whenever ${F \subseteq U \subseteq U_1}$, ${F}$ is compact, ${U}$ is open, and ${A'}$ is a nonstandard set with

$\displaystyle \pi^{-1}(F) \subset A' \subseteq \pi^{-1}(U).$

— 4. Lie models of ultra approximate groups —

In the examples of good models in the previous section, the model group ${L}$ was a Lie group. We give now give some examples to show that the model need not initially be of Lie type, but can then be replaced with a Lie model after some modification.

Example 4 (Nonstandard cyclic group, revisited) Consider the nonstandard cyclic group ${A := {\bf Z}/2^N{\bf Z} = \prod_{n \rightarrow \alpha} {\bf Z}/2^n {\bf Z}}$. This is a nonstandard finite group and can thus be modeled by the trivial group ${\{1\}}$ as discussed in Example 2. However, it can also be modeled by the compact abelian group ${{\bf Z}_2}$ of ${2}$-adic integers using the model ${\pi: A \rightarrow {\bf Z}_2}$ defined by the formula

$\displaystyle \pi(a) := \lim_{n \rightarrow \infty} a \hbox{ mod } {2^n}$

where for each standard natural number ${n}$, ${a \hbox{ mod } 2^n \in \{0,\ldots,2^{n-1}\}}$ is the remainder of ${a}$ modulo ${2^n}$ (this is well-defined in ${A}$) and the limit is in the ${2}$-adic metric. Note that the image ${\pi(A)}$ of ${A}$ is the entire group ${{\bf Z}_2}$, and conversely the preimage of ${{\bf Z}_2}$ in ${A^8 = A}$ is trivially all of ${{\bf Z}/2^N{\bf Z}}$; as such, one can quotient out ${{\bf Z}_2}$ in this model and recover the trivial model of ${A}$.

Example 5 (Nonstandard abelian ${2}$-torsion group) In a similar spirit to the preceding example, the nonstandard ${2}$-torsion group ${A := ({\bf Z}/2{\bf Z})^N = \prod_{n \rightarrow \alpha} ({\bf Z}/2{\bf Z})^n}$ can be modeled by the compact abelian group ${({\bf Z}/2{\bf Z})^{\bf N}}$ by the formula

$\displaystyle \pi(a) := \lim_{n \rightarrow \infty} \pi_n(a)$

where ${\pi_n: A \rightarrow ({\bf Z}/2{\bf Z})^n}$ is the obvious projection, and the limit is in the product topology of ${({\bf Z}/2{\bf Z})^{\bf N}}$. As before, we can quotient out ${({\bf Z}/2{\bf Z})^{\bf N}}$ and model ${A}$ instead by the trivial group.

Remark 13 The above two examples can be generalised to model any nonstandard finite group ${G = \prod_{n \rightarrow \alpha} G_n}$ equipped with surjective homomorphisms from ${G_{n+1}}$ to ${G_n}$ by the inverse limit of the ${G_n}$.

Exercise 31 (Lamplighter group) Let ${F_2}$ be the field of two elements. ${G}$ be the lamplighter group ${{\bf Z} \ltimes F_2^{\bf Z}}$, where ${{\bf Z}}$ acts on ${F_2^{\bf Z}}$ by the shift ${T: F_2^{\bf Z}}$ defined by ${T(a_k)_{k \in {\bf Z}} := (a_{k-1})_{k \in {\bf Z}}}$. Thus the group law in ${G}$ is given by

$\displaystyle (i,x) (j,y) := (i+j, x+T^i y).$

For each ${n}$, we then set ${A_n \subseteq G}$ to be the set

$\displaystyle A_n := \{ (i,x) \in G: i \in \{-1,0,+1\}; x \in F_2^n \},$

where we identify ${F_2^n}$ with the space of elements ${(a_k)_{k \in {\bf Z}}}$ of ${F_2^{\bf Z}}$ such that ${a_k \neq 0}$ only for ${k \in \{1,\ldots,n\}}$. Let ${F_2((t))}$ be the ring of formal Laurent series ${\sum_{k \in {\bf Z}} a_k t^k}$ in which all but finitely many of the ${a_k}$ for negative ${k}$ are zero. We let ${G_0}$ be the modified lamplighter group ${{\bf Z} \ltimes F_2((t))}$, where ${{\bf Z}}$ acts on ${F_2^{\bf Z}}$ by the shift ${T: f \mapsto tf}$. We give ${F_2((t))}$ a topology by assigning each non-zero Laurent series ${\sum_{k \in {\bf Z}} a_k t^k}$ a norm of ${2^{-k}}$, where ${k}$ is the least integer for which ${a_k \neq 0}$; this induces a topology on ${G_0}$ via the product topology construction.

We will model the ultraproduct ${A := \prod_{n \rightarrow \alpha} A_n \subset {\bf Z} \ltimes {}^* ({\bf Z}/2{\bf Z})^{\bf Z}}$ (or more precisely, the set ${A \cup A^{-1}}$, since ${A}$ is not quite symmetric) by the group

$\displaystyle G_0 \times_{\bf Z} G_0 := \{ ((i,x),(j,y)) \in G_0 \times G_0: i = j \}$

using the map

$\displaystyle \pi( (i, \lim_{n \rightarrow \alpha} (a^{(n)}_k)_{k \in {\bf Z}} ) )$

$\displaystyle := ( (i, \sum_{k \in {\bf Z}} \lim_{n \rightarrow \alpha} a^{(n)}_k t^k ), (i, \sum_{k \in {\bf Z}} \lim_{n \rightarrow \alpha} a^{(n)}_{n-k} t^k ) ).$

Roughly speaking, ${\pi(a)}$ captures the behaviour of ${a}$ at the two “ends” of ${F_2^N}$, where ${N := \lim_{n \rightarrow \alpha} n}$. We give ${G_0 \times_{\bf Z} G_0}$ the topology induced from the product topology on ${G_0 \times G_0}$.

• (i) Show that ${A \cup A^{-1}}$ is an ultra approximate group.
• (ii) Show that ${\pi}$ is a good model of ${A \cup A^{-1}}$.
• (iii) Show that ${\pi}$ is no longer a good model if one projects ${G_0 \times_{\bf Z} G_0}$ to the first or second copy of ${G_0}$, or to the base group ${{\bf Z}}$.
• (iv) Show that ${A \cup A^{-1}}$ does not have a good model by a Lie group ${L}$. (Hint: ${L}$ does not contain arbitrarily small elements of order two, other than the identity.)

Remark 14 In the above exercise, one needs a moderately complicated (though still locally compact) group ${G \times_{\bf Z} G}$ to properly model ${A}$ and its powers ${A^m}$. This can also be seen from volume growth considerations: ${A_n^m}$ grows like ${4^m}$ for fixed (large ${n}$), which is also the rate of volume growth of ${\pi(A)}$ in ${G \times_{\bf Z} G}$, whereas the volume growth in a single factor ${G}$ would only grow like ${2^m}$, and the volume growth in ${{\bf Z}}$ is only linear in ${m}$. However, if we pass to the large subset ${A'}$ of ${A}$ defined by ${A' := \prod_{n \rightarrow\alpha} A'_n}$, where

$\displaystyle A'_n := \{ (i,x) \in G: i = 0; x \in ({\bf Z}/2{\bf Z})^n \}$

then ${A'}$ is now a nonstandard finite group (isomorphic to the group ${({\bf Z}/2{\bf Z})^N}$ considered in Example 5) and can be modeled simply by the trivial group ${\{1\}}$. Thus we see that we can sometimes greatly simplify the modeling of an ultra approximate group by passing to a large ultra approximate subgroup.

By using the Gleason-Yamabe theorem, one can formalise these examples. Given two ultra approximate groups ${A'}$, we say that ${A'}$ is an large ultra approximate subgroup of ${A}$ if ${(A')^4 \subset A^4}$ and ${A}$ can be covered by finitely many left-translates of ${A'}$.

Theorem 12 (Hrushovski’s Lie model theorem) Let ${A}$ be an ultra approximate group. Then there exists a large ultra approximate subgroup ${A'}$ of ${A}$ that can be modeled by a connected Lie group ${L}$.

Proof: By Exercise 24, we have a good model ${\pi_0: \langle A \rangle \rightarrow L_0}$ of ${A^4}$ by some locally compact group ${L_0}$. In particular, there is an open neighbourhood ${U_0}$ of the identity in ${L_0}$ such that ${\pi_0^{-1}(U_0) \subset A^4}$ and ${U_0 \subset \pi_0(A^4)}$.

Let ${U_1}$ be a symmetric precompact neighbourhood of the identity such that ${U_1^{100} \subset U_0}$. By the Gleason-Yamabe theorem (Theorem 1 of Notes 5), there is an open subgroup ${L'_0}$ of ${L_0}$, and a compact normal subgroup ${N}$ of ${L'_0}$ contained in ${U_1}$, such that ${L'_0/N}$ is isomorphic to a connected Lie group ${L}$. Let ${\pi_1: L'_0 \rightarrow L}$ be the quotient map.

Write ${U_2 := U_1 \cap L'_0}$. As ${\pi_0}$ is a good model, we can find a nonstandard finite set ${A'}$ with

$\displaystyle \pi^{-1}(U_2) \subset A' \subset \pi^{-1}(U_2^2).$

By replacing ${A'}$ with ${A' \cap (A')^{-1}}$ if necessary, we may take ${A'}$ to be symmetric. As ${U_2^4}$ can be covered by finitely many left-translates of ${U_2}$, we see that ${A'}$ is an ultra approximate group. Since

$\displaystyle (A')^4 \subset \pi^{-1}(U_2^8) \subset \pi^{-1}(U_1^{100}) \subset \pi^{-1}(U_0) \subset A^4$

and ${\pi(A^4)}$ can be covered by finitely many left-translates of ${U_2}$, we see that ${A'}$ is a large ultra approximate subgroup of ${A}$. It is then a routine matter to verify that ${\pi_1 \circ \pi_0: \langle A' \rangle \rightarrow L}$ is a good model for ${A'}$. $\Box$

The Lie model ${L}$ need not be unique. For instance, the nonstandard cyclic group ${{\bf Z}/N{\bf Z}}$ can be modeled both by the trivial group and by the unit circle ${{\bf R}/{\bf Z}}$. However, as observed by Hrushovski, it can be shown that after quotienting out the (unique) maximal compact normal subgroup from the Lie model ${L}$, the resulting quotient group (which is also a Lie group, and in some sense describes the “large scale” structure of ${L}$) is unique up to isomorphism. The following exercise fleshes out the details of this observation.

Exercise 32 (Large-scale uniqueness of the Lie model) Let ${L, L'}$ be connected Lie groups, and let ${A}$ be an ultra approximate group with good models ${\pi: \langle A \rangle \rightarrow L}$ and ${\pi': \langle A' \rangle \rightarrow L'}$.

• (i) Show that the centre ${Z(L) := \{ g \in L: gh=hg \hbox{ for all } h \in L \}}$ is an abelian Lie group.
• (ii) Show that the connected component ${Z(L)^0}$ of the identity in ${Z(L)}$ is isomorphic to ${{\bf R}^d \times ({\bf R}/{\bf Z})^{d'}}$ for some ${d,d' \geq 0}$.
• (iii) Show that the quotient group ${Z(L)/Z(L)^0}$ is a finitely generated abelian group, and is isomorphic to ${{\bf Z}^{d''} \times H}$ for some finite group ${H}$.
• (iv) Show that the torsion points of ${Z(L)}$ are contained in a compact subgroup of ${Z(L)}$ isomorphic to ${({\bf R}/{\bf Z})^{d'} \times H}$.
• (v) Show that any finite normal subgroup of ${L}$ is central, and thus lies in the compact subgroup indicated above. (Hint: ${L}$ will act continuously by conjugation on this finite normal subgroup.)
• (vi) Show that given any increasing sequence ${N_1 \subset N_2 \subset \ldots}$ of compact normal subgroups of ${L}$, the upper bound ${\overline{\bigcup_n N_n}}$ is also a compact normal subgroup of ${L}$. (Hint: The dimensions of ${N_n}$ (which are well-defined by Cartan’s theorem) are monotone increasing but bounded by the dimension of ${L}$. In particular, the connected components ${N_n^0}$ must eventually stabilise. Quotient them out and then use (v).)
• (vii) Show that ${L}$ contains a unique maximal compact normal subgroup ${N}$. Similarly, ${L'}$ contains a unique maximal compact normal subgroup ${N'}$. Show that the quotient groups ${L/N, L'/N'}$ contain no non-trivial compact normal subgroups.
• (viii) Show that ${\pi^{-1}(N) = (\pi')^{-1}(N')}$. (Hint: if ${g \in L}$, then ${g \in N}$ iff the group generated by ${g}$ and its conjugates is bounded.)
• (ix) Show that ${L/N}$ is isomorphic to ${L'/N'}$.
• (x) Show that for sufficiently large standard ${m}$, ${A^m}$ can be modeled by a Lie group with no non-trivial compact normal subgroups, which is unique up to isomorphism.

To illustrate how this theorem is useful, let us apply it in the bounded torsion case.

Exercise 33 Let ${A}$ be an ultra approximate group in an ${r}$-torsion nonstandard group ${G}$ for some standard ${r \geq 1}$. Show that ${A^4}$ contains a nonstandard finite group ${H}$ such that ${A}$ can be covered by finitely many left translates of ${H}$. (Hint: if a Lie group has positive dimension, then it contains elements arbitrarily close to the identity of arbitrarily large order.)

Combining this exercise with Proposition 11 from Notes 6, we conclude a finitary consequence, first observed by Hrushovski:

Corollary 13 (Freiman theorem, bounded torsion case) Let ${r, K \geq 1}$, and let ${A}$ be a finite ${K}$-approximate subgroup of an ${r}$-torsion group ${G}$. Then ${A^4}$ contains a finite subgroup ${H}$ such that ${A}$ can be covered by ${O_{K,r}(1)}$ left translates of ${H}$.

Exercise 34 (Commutator self-containment)

• Show that if ${A}$ is an ultra approximate group, then there exists a large approximate subgroup ${B}$ of ${A}$ such that ${[B,B] \subset B}$, where we write ${[A,B] := \{ [a,b]: a \in A, b \in B \}}$, with ${[a,b] := a^{-1} b^{-1} ab}$. (Note that this is slightly different from the group-theoretic convention, when ${H}$ and ${K}$ are subgroups, to define ${[H,K]}$ to be the group generated by the commutators ${[h,k]}$ with ${h \in H}$ and ${k \in K}$.)
• Show that if ${A}$ is a finite ${K}$-approximate group, then there exists a symmetric set ${B}$ containing the origin with ${B^4 \subset A^4}$, ${|B| \gg_K |A|}$, and ${[B,B] \subset B}$.

Remark 15 I do not know of any proof of Exercise 34 that does not go through the Gleason-Yamabe theorem (or some other comparably deep fragment of the theory of Hilbert’s fifth problem).

The following result of Hrushovski is a significant strengthening of the preceding exercise:

Exercise 35 (Hrushovski’s structure theorem) Let ${A}$ be a finite ${K}$-approximate group, and let ${F: {\bf N} \times {\bf N} \rightarrow {\bf N}}$ be a function. Show that there exist natural numbers ${L, M, N}$ with ${N \geq F(L,M)}$ and ${L, M \ll_{K,F} 1}$, and nested sets

$\displaystyle \{1\} \subset A_N \subseteq \ldots \subseteq A_1 \subseteq A^4$

with the following properties:

• (i) For each ${1 \leq n \leq N}$, ${A_n}$ is symmetric;
• (ii) For each ${1 \leq n < N}$, ${A_{n+1}^2 \subseteq A_n}$;
• (iii) For each ${1 \leq n \leq N}$, ${A_n}$ is contained in ${M}$ left-translates of ${A_{n+1}}$;
• (iv) For ${1 \leq n,m,k \leq N}$ with ${k < n+m}$, one has ${[A_n,A_m] \subset A_k}$;
• (v) ${A}$ can be covered by ${L}$ left-translates of ${A_1}$.

(Hint: first find and prove an analogous statement for ultra approximate groups, in which the function ${F}$ is not present.)

There is a finitary formulation of Theorem 12, but it takes some effort to state. Let ${L}$ be a connected Lie group, with Lie algebra ${{\mathfrak l}}$ which we identify using some coordinate basis with ${{\bf R}^d}$, thus giving a Euclidean norm ${\| \|}$ on ${{\mathfrak l}}$. We say that ${L}$ with this basis has complexity at most ${M}$ for some ${M \geq 1}$ if

• The dimension ${d}$ of ${L}$ is at most ${M}$;
• The exponential map ${\exp: {\mathfrak l} \rightarrow L}$ is injective on the ball ${\{ x \in {\mathfrak l}: \|x\| \leq 1/M \}}$;
• One has ${\| [x,y] \| \leq M \|x\| \|y\|}$ for all ${x,y \in {\mathfrak l}}$.

We then define “balls” ${B_R}$ on ${L}$ by the formula

$\displaystyle B_R := \{ \exp(x): x \in {\mathfrak l}; \|x\| < R \}.$

Exercise 36 (Hrushovski’s Lie model theorem, finitary version) Let ${F: {\bf N} \rightarrow {\bf N}}$ be a function, and let ${A}$ be a finite ${K}$-approximate group. Show that there exists a natural number ${1 \leq M \ll_{K,F} 1}$, a connected Lie group ${L}$ of complexity at most ${M}$, a symmetric set ${A'}$ containing the identity with ${(A')^4 \subset A^4}$, and a map ${\pi: (A')^{F(M)} \rightarrow L}$ obeying the following properties:

• (i) (Large subgroup) ${A}$ can be covered by ${M}$ left-translates of ${A'}$.
• (ii) (Approximate homomorphism) One has ${\pi(1)=1}$, and for all ${a, b \in (A')^{F(M)}}$ with ${ab \in (A')^{F(M)}}$, one has

$\displaystyle \pi(ab) \pi(b)^{-1} \pi(a)^{-1} \in B_{1/F(M)}.$

• (iii) (Thick image) If ${a \in (A')^{F(M)}}$ and ${\pi(a) \in B_{1/M}}$, then ${a \in A'}$. Conversely, if ${g \in B_{1/M}}$, then ${\pi(A')}$ intersects ${g B_{1/F(M)}}$.
• (iv) (Compact image) One has ${\pi(A') \subset B_M^M}$.

(Hint: One needs to argue by compactness and contradiction, carefully negating all the quantifiers in the above claim, and then use Theorem 12.)

Exercise 37 In the converse direction, show that Theorem 12 can be deduced from Exercise 36. (Hint: to get started, one needs a statement to the effect that if an ultra approximate group ${A}$ is unable to be modeled by some Lie group of complexity at most ${M}$, then there is also some ${\epsilon>0}$ for which ${A}$ cannot be “approximately modeled up to error ${\epsilon}$” in some sense by such a Lie group. Once one has such a statement (provable via a compactness or ultralimit argument), one can use this ${\epsilon}$ to build a suitable function ${F()}$ which which to apply Exercise 36.