At UCLA we just concluded our third seminar in our reading of “Stable group theory and approximate subgroups” by Ehud Hrushovski. In this seminar, Isaac Goldbring made some more general remarks about universal saturated models (extending the discussion from the previous seminar), and then Henry Towsner gave some preliminaries on Kiesler measures, in preparation for connecting the main model-theoretic theorem (Theorem 3.4 of Hrushovski) to one of the combinatorial applications (Corollary 1.2 of Hrushovski).

As with the previous post, commentary on any topic related to Hrushovski’s paper is welcome, even if it is not directly related to what is under discussion by the UCLA group. Also, we have a number of questions below which perhaps some of the readers here may be able to help answer.

Note: the notes here are quite rough; corrections are very welcome. Henry’s notes on his part of the seminar can be found here.

(Thanks to Issac Goldbring for comments.)

— 1. More remarks on universal models —

Recall from the previous seminar that if we have any structure {M} of a language {L} (e.g. {M} could be a group, and {L} the language of groups), then we can (assuming some set-theoretic axioms, such as the existence of inaccessible cardinals) obtain an elementary extension {{\Bbb U}} of {M}, enjoying the following properties:

  • (i) (Saturation) If {A \subset {\Bbb U}} has strictly smaller cardinality than {{\Bbb U}}, then every partial type over {A} is realisable in {{\Bbb U}}; and
  • (ii) (Homogeneity) If {A} is as above and {\vec a, \vec b \in {\Bbb U}^n} are elementarily indistinguishable over {A}, then there is an automorphism of {{\Bbb U}} that maps {\vec a} to {\vec b}.

Another way of phrasing saturation: if one has a “small” family of formulae {\phi(x_1,\ldots,x_n)} (where “small” means “having cardinality less than that of {{\Bbb U}}“) which is finitely realisable in {{\Bbb U}^n} (i.e. for any finite collection of these formulae, one can find {(a_1,\ldots,a_n) \in {\Bbb U}^n} obeying these formulae), then it is realisable (one can find {(a_1,\ldots,a_n) \in {\Bbb U}^n} obeying all of the formulae). Equivaqlently the topology on {{\Bbb U}} generated by the definable sets (which form a clopen base) is compact.

Remark 1 {{\Bbb U}} has the following “universal” property: any small model {N} of the theory of {M} (i.e. any structure elementarily equivalent to {M}) can be embedded into {{\Bbb U}}. Indeed, if one throws in all the elements of {N} as constants, the model {N} can be described as a partial type over the small set {N}, and one simply applies saturation to obtain the embedding.

(A side remark: in Hrushovski’s paper, it is not completely clear whether one is working with the universal model (in which saturation holds over all small sets {A}) or merely a countably saturated one. If it is the latter, then the above remark does not quite hold, but it seems that one can get around this by replacing “small model” with “small elementary substructure of {{\Bbb U}}” throughout the paper. But this seems to not make a significant difference to the substance of the paper.)

Usually we fix a universal structure {{\Bbb U}}, and abbreviate {{\Bbb U} \models \phi(a)} as {\models \phi(a)}. (In the combinatorial applications, {{\Bbb U}} will be a group, and we will thus call it {G}.)

Remark 2 Universal models have the following useful cardinality gap property: if {X \subset {\Bbb U}^n} is a definable set, then either {X} is finite, or {|X| = |{\Bbb U}|} (i.e. {X} is large); there are no definable sets of intermediate cardinality. Proof: let {\phi(\vec x) = \phi(x_1,\ldots,x_n)} be a defining formula for {X}. If {X} is infinite, then the formula {\phi(\vec x)} together with the formulae {\vec x \neq \vec a} for every {\vec a \in X} are finitely realisable, hence realisable by saturation, but this is absurd. Note that this argument in fact shows that any {\bigwedge}-definable set (i.e. a partial type) is either finite or large. For {\bigvee}-definable sets over some infinite set of constants {A}, it was claimed that such sets either have cardinality less than or equal to {A}, or are large, but I didn’t check this.

(Question: if one only assumes countable saturation (which may be the case in Hrushovski’s paper), then one has a smaller cardinality gap: definable sets can be finite or uncountable. It wasn’t clear to us whether the rest of the paper would work if one just had this gap.)

Now we obtain an analogous cardinality gap for equivalence relations.

Lemma 1 Suppose the language {L} is at most countable. Let {E \subset {\Bbb U}^2} be a {\bigwedge}-definable equivalence relation on {{\Bbb U}} (one could also consider equivalence relations on {{\Bbb U}^n}). Then either the cardinality of the classes is at most {2^{\aleph_0}} (the cardinality of the continuum), or is {|{\Bbb U}|}.

Proof: Let {\phi(x,y)} be one of the defining relations of the equivalence relation {E}, which we can assume to be symmetric. By the greedy algorithm, we see that one of the following two statements must be true:

  • There exists finitely many {x_1, \ldots, x_n} such that for every {y}, {\phi(x_i,y)} is true for at least one {1 \leq i \leq n}.
  • There exists a countable sequence {x_1, x_2, \ldots} such that {\phi(x_i,x_j)} fails for all {i \neq j}.

Suppose the latter possibility holds. Then by transfinite induction (using the least ordinal with cardinality {|{\Bbb U}|}) and saturation, we can obtain a family {(x_\alpha)} of cardinality {|{\Bbb U}|} such that {\phi(x_\alpha,x_\beta)} fails for all {\alpha \neq \beta}. This implies that {E} has {|{\Bbb U}|} equivalence classes, and we are done. Thus the only case remaining is if every defining relation {\phi} enjoys the first property. But then we see that the equivalence class that an element {y} belongs to is determined by the precise set of {x_i} obeying {\phi(x_i,y)} for each {\phi}, and this implies that the number of classes is at most {2^{\aleph_0}}. \Box

We remark that {2^{\aleph_0}} is best possible. Indeed, consider the integers {{\Bbb Z}} in the language of groups, and take a universal extension {{\Bbb U}} of these integers. Consider the relation {E} defined by cosets of the group {\bigcap_{n=1}^\infty n {\Bbb U}} (i.e. {x \equiv y} if {x} and {y} agree modulo {n} for every (standard) natural number {n}). Then this is a {\bigwedge}-definable relation, but the set of equivalence classes is equivalent to the profinite integers {\hat {\Bbb Z}}, which have the cardinality of the continuum.

Equivalence relations whose set of classes have cardinality at most {2^{\aleph_0}} are known as bounded equivalence relations. (It may be that one does not actually need the above dichotomy to establish the results in Hrushovski’s paper, and just take the above statement as the definition of a bounded equivalence relation.)

Observe that the proof of the above lemma shows that if a bounded equivalence relation is definable rather than merely {\bigwedge}-definable, then the number of classes is finite. Thus bounded can be viewed as a generalisation of “finite”.

The boundedness of a certain equivalence relation (created by a certain stabiliser group) is essential in ensuring that we can define a certain locally compact “logic topology” on the quotient space, which is apparently what allows the Gleason-Yamabe theory of topological groups to kick in; presumably we will see more of this in later seminars. (This idea also appears in previous work, such as this paper by Hrushovski, Peterzil, and Pillay.)

— 2. Keisler measures —

To obtain combinatorial results, Hrushovski applies the contradiction and compactness method: assume that a combinatorial result fails, obtain a sequence of counterexamples to that result, and extract some sort of limiting object to which an infinitary theory (in this case, stability theory) can be applied. In this section we discuss what happens to the combinatorial concept of (normalised) cardinality along such a limit; what one gets at the end is an object known as a Kiesler measure.

Suppose we have a finite set {X} in a group {G}. Then given any other set {A} in {G}, we can define a normalised counting measure {\mu(A)} of {A} by the formula {\mu(A)=|A|/|X|} (this measure could be infinite if {A} is infinite). More generally, given {A \subset G^l}, one can define {\mu^l(A) := |A|/|X|^l}. We will often omit the {l} superscript and use {\mu} for all the measures on {G^l} simultaneously. Thus for instance one has {\mu(X^l)=1}, and {\mu(A \times B) = \mu(A) \mu(B)} if {A \subset G^l} and {B \subset G^m}.

Now suppose one has a sequence {G_n} of groups, each with a finite set {X_n \subset G_n}. Then we have a sequence of normalised counting measures {\mu_n} on {G_n} (and on powers {G_n^l}). We can then take an ultralimit of all of these objects, obtaining a group {G_\infty} (the ultrapower of the {G_n}), a subset {X_\infty} of {G_\infty} (the ultrapower of the {X_n}), and a “measure” {\mu_\infty}, which assigns to every subset {A} of {G_\infty} a non-standard real number (or {+\infty}) {\mu_\infty(A)}. Thus for instance {\mu_\infty(X_\infty)=1}. In practice we will be working inside {X_\infty} (or small powers of {X_\infty}, such as {X_\infty \cdot X_\infty}), which by hypothesis will have bounded measure, so the {+\infty} case will not be relevant.

In applications, the {X_n} will have bounded doubling, thus {\mu_n( X_n \cdot X_n )} is bounded in {n}; taking ultraproducts, we see that {\mu( X \cdot X )} is bounded also (i.e. less than a standard real number).

Now we want to extend this ultrapower {G_\infty} to a universal model {G}. To do this, we have to specify the language {L}. {L} will of course contain the language of groups, but if that is all that the language contains, then one cannot refer to {X} or {\mu} in this language. To add in {X} is easy: one simply adds a unary predicate for membership in {X}, so that “{x \in X}” is now a formula in {x}. To add {\mu} is a bit trickier: to keep the language {L} countable, one is not allowed to measure all sets {A} in the model, and also one does not want to work with non-standard real numbers. So instead, what one does is add predicates {\theta_{A,q}(x_1,\ldots,x_k)} that are interpreted as “{\mu(A(x_1,\ldots,x_k)) > q}” for any definable set {A(x_1,\ldots,x_k)} with parameters {x_1,\ldots,x_k \in G}, and any (standard) rational number {q}. (Note that by adding these predicates, one enlarges the space of possible definable sets; but one can iterate this countably many times to deal with this apparent circularity.) We thus obtain a countable set of predicates (even after iteration). With all these predicates, the language {L} is now capable of interpreting {\mu(A(x_1,\ldots,x_k))} as a standard (extended) real number (the supremum of all rational {q} for which {\mu(A(x_1,\ldots,x_k)) \geq q}); when the model is the ultrapower {G_\infty}, this standard real number is simply the standard part of the non-standard real number {\mu_\infty(A(x_1,\ldots,x_k))}.

[A slight subtlety: when one collapses {\mu} to the standard real numbers, the predicate {\theta_{A,q}(x_1,\ldots,x_k)} now deviates very slightly from its original interpretation "{\mu(A(x_1,\ldots,x_k)) > q}". What is now true is that this predicate is implied by {\mu(A(x_1,\ldots,x_k)) > q}, and in turn implies "{\mu(A(x_1,\ldots,x_k)) \geq q}", just as the assertion that a non-standard number {x} is larger than {q} is implied by the standard part {st(x)} being larger than {q}, and in turn implies {st(x) \geq q}.]

With this language {L}, we can now pass to a universal model {G}. This is a group with a set {X}, and a real number {\mu(A(x_1,\ldots,x_k)) \in [0,+\infty]} assigned to any set in {G} or {G^l} definable with parameters. It inherits all the elementary properties of {\mu_n}, for instance {\mu} is finitely additive and one has {\mu(X)=1} and {\mu(A \times B) = \mu(A) \mu(B)}. (One can use compactness and the Carathéodory extension theorem to extend {\mu} to a {\sigma}-additive measure on the “Borel {\sigma}-algebra” generated by the definable sets; this is basically the Loeb measure construction, but we will not need it here.)

By construction, {\mu} is a Keisler measure – a finitely additive measure on the definable sets. By construction, it is also {G_\infty}-invariant – i.e. it is invariant under all automorphisms of {G} that fix {G_\infty}. One consequence of this is that the quantity {\mu(A(\vec x))} depends only on the type of {\vec x} over {G_\infty}. Actually, there is a stronger statement – the quantity {\mu(A(\vec x))} depends continuously on {\vec x} using the topology generated by the {G_\infty}-definable sets (by using the predicates {\theta_{A,q}(x_1,\ldots,x_k)}). Because of this, we call {\mu} an {G_\infty}-definable Keisler measure.

Remark 3 One does not need to pass through the non-standard real numbers here; instead, one can interpret {\theta_{A,q}(x_1,\ldots,x_k)} directly as {\mu_n(A(x_1,\ldots,x_k)) > q} in the finite models {G_n}, take an ultralimit (or some other theory that contains all the statements that are true in all but finitely many of the {G_n}) and then take a universal extension, and then reconstruct {\mu(A(x_1,\ldots,x_k))} as the supremum of all {q} for which {\theta_{A,q}(x_1,\ldots,x_k)} holds. I find the temporary non-standard interpretation of {\mu} to be helpful though in understanding why {\theta_{A,q}(x_1,\ldots,x_k)} ends up deviating slightly from {\mu(A(x_1,\ldots,x_k)) > q}.

— 3. Wide types —

Recall that a partial type in {G^l} over some set of constants {A} is a set of {l}-ary formulae using {A}, which then defines a {\bigwedge}-definable (over {A}) subset of {G^l}, being the set of all {l}-tuples that realise those formulae. A complete type is a maximal such set of formulae (or a minimal {\bigwedge}-definable set, equivalently), or equivalently an orbit of the automorphism group of {G}.

Note that given a complete type and a definable set, the type either falls entirely inside the set or is disjoint from it. (Indeed, a complete type can be thought of as an ultrafilter on the algebra of definable sets.) In practice, all the types we will consider will lie inside a definable set such as {X \cdot X}, which will have finite measure in applications.

Now for a key concept. A (partial or complete) type is wide if the set it defines it cannot be contained in a definable set of zero measure. More generally, a type is wide over a set of constants {A} if it cannot be contained in an {A}-definable set of zero measure. Wide sets are thus somewhat “generic” or “Zariski-dense” in some sense.

Lemma 2 Every wide partial type {q} over {A} can be refined to a wide complete type {p} over {A}.

Proof: Let {q} be a wide partial type over {A}. By finite additivity of {\mu}, {q} cannot be covered by finitely many {A}-definable sets of measure zero, thus by saturation (compactness), {q} cannot be covered by all of such sets at once. If we set {q'} to be {q} with all the {A}-definable sets of measure zero, {q'} is then also a partial type which is wide (since it is disjoint from, and hence not contained in, any {A}-definable set of measure zero). If we then complete {q'} to any complete type {p} (e.g. by taking an element of {q'} and computing its type), this type is then also wide. \Box

A typical application of this lemma will be to start with a wide complete type over some set {A} of constants, and then enlarge the set of constants, thus turning the complete type back into a partial type; but if one can show that the partial type remains wide, then one can refine it back to a wide complete type.

— 4. Types and group operations —

Types are well behaved with respect to group operations. For instance, if {q} is a partial type over {A}, and {a \in G}, then the shift {q \cdot a} of {A} (defined by taking the set in {G} or {G^l} defined by {q} and shifting it by {a}) is a partial type over {A \cup \{a\}}; if {A} contained {a} and {q} was a complete type, then so is {q \cdot a}.

In a similar vein, if {q} is a partial or complete type over {A}, then {q^{-1}} is a partial or complete type over {A}.

Slightly less obvious is that if {p, q} are partial types over {A}, then the product set {p \cdot q} is also a partial type over {A}. Indeed, for any {A}-definable sets {B, C} containing {p, q}, we see that {p \cdot q} is contained in the set {B \cdot C}, which is {A}-definable (being cut out by the formula “{\exists y \in B, z \in C: x = yz}“). On the other hand, if {x} lies in all the {B \cdot C}, i.e. for each {B, C} one can find realisations {y,z} of the formula “{y \in B \wedge z \in C \wedge x=yz}“, then (as the intersection of definable sets is still definable) this whole family of formulae is finitely realisable, and thus (by saturation) realisable, which means that {x \in p \cdot q}. Thus we have exhibited {p \cdot q} as a finite type.

— 5. Indiscernibles —

We already have a notion of elementary indistinguishability: two {l}-tuples {\vec a, \vec b \in G^l} are elementarily indistinguishable (over some set of constants {A}) if every formula involving {A} obeyed by {\vec a} is also obeyed by {\vec b} and vice versa (i.e. {\vec a, \vec b} have the same type, or equivalently (by homogeneity) there is an automorphism that maps {\vec a} to {\vec b}).

We can build upon this notion: a sequence {b_1, b_2, \ldots \in G} is indiscernible (over {A}), if for every {l}, the {l}-tuples {(b_{i_1},\ldots,b_{i_l})} for {i_1 < \ldots < i_l} are elementarily indistinguishable; thus the {b_i} are indistinguishable, but also every ordered pair {(b_i,b_j)} for {i < j} are indistinguishable, and so forth. In practice, indiscernibles will be constructed by lots of Ramsey theory, and are sort of a way of pre-emptively doing all the Ramsey theory at once.

Lemma 3 Let {A \subset G^l \times G} be an {M}-definable set for some {M}, and for each {b \in G}, let {A_b} be the slice {A_b := \{ a \in G^l: (a,b) \in A \}} (note that this is a {M \cup \{b\}}-definable set). We assume that all the {A_b} are contained in an {M}-definable set {X'} of finite measure. Let {b \in G} be such that {\mu(A_b) > 0}, and let {b_1, b_2, \ldots} be an indiscernible sequence (over {M}) with the type of {b} (i.e. they are all elementarily indistinguishable from {b}. Then any finite number of the {A_{b_i}} have non-empty intersection (in fact their intersection has positive measure).

Proof: We establish this by induction. Suppose that we already know that

\displaystyle  \mu( A_{b_{i_1}} \cap \ldots \cap A_{b_{i_k}} ) > 0

for all {i_1 < \ldots < i_k} (note that this is already true for {k=1} by hypothesis and indiscernability). We claim that

\displaystyle  \mu( A_{b_{i_1}} \cap \ldots \cap A_{b_{i_{k+1}}} ) > 0

for {i_1 < \ldots < i_{k+1}} also. Note by indiscernability that these measures do not actually depend on the choices of {i_1,\ldots,i_{k+1}}. Suppose for contradiction that we had

\displaystyle  \mu( A_{b_{i_1}} \cap \ldots \cap A_{b_{i_{k+1}}} ) = 0

for one, and hence (by indiscernability) for all {i_1 < \ldots < i_{k+1}}. In particular, this shows that the sets

\displaystyle  A_{b_1} \cap \ldots \cap A_{b_{k-1}} \cup A_{b_n}

are disjoint modulo null sets for {n= k, k+1, \ldots}. On the other hand, these sets have uniformly positive measure by induction hypothesis and indiscernability. Thus their union has unbounded measure, which contradicts the hypothesis that they all lie in a set {X'} of finite measure. \Box

A remark: the above argument was phrased using the Kiesler measure {\mu}, but the same argument also works for a more general structure known as a {S1}-ideal. We’re not clear yet whether this additional generality will be necessary for the applications at hand.

Now we combine wide types and indiscernables, showing that wide types can be perturbed to “look the same” with respect to any finite number of indiscernibles:

Lemma 4 Let {b} be an element or tuple in {G}, let {a} be a wide type over {M \cup \{b\}} for some set of constants {M}, and let {b_1,b_2,\ldots} be a sequence of indiscernibles (over {M}) that has the same type as {b} (over {M}). Then for any finite number {b_1,\ldots,b_n} in this sequence, one can find a type {a'} such that {a'} has the same type over {b_i} as {a} does over {b}, for all {1 \leq i \leq n}.

Proof: To abbreviate notation we suppress “over {M}” in everything that follows. (We’ll probably use this convention again in later seminars.)

Let {\phi(a,b)} be a formula obeyed by {a} and {b}, then the definable set {A} cut out by {\phi} is such that the slice {A_b} has positive measure (since {a} is wide). By the previous lemma, {A_{b_1} \cap \ldots \cap A_{b_n}} is thus non-empty, so we can find an {a'} such that {\phi(a',b_i)} is true for all {1 \leq i \leq n}. At present {a'} depends on {\phi}; but by using saturation one can make {a'} independent of {\phi}, and the claim follows. \Box