One of my favorite open problems, which I have blogged about in the past, is that of establishing (or even correctly formulating) a non-commutative analogue of Freiman’s theorem. Roughly speaking, the question is this: given a finite set {X} in a non-commutative group {G} which is of small doubling in the sense that the product set {X \cdot X := \{ xy: x, y \in X \}} is not much larger than {X} (e.g. {|X \cdot X| \leq K|X|} for some {K = O(1)}), what does this say about the structure of {X}? (For various technical reasons one may wish to replace small doubling by, say, small tripling (i.e. {|X \cdot X \cdot X| = O( |X| )}), and one may also wish to assume that {X} contains the identity and is symmetric, {X^{-1} = X}, but these are relatively minor details.)
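To make the definitions concrete, here is a small computational sketch (a toy illustration of my own, not from the literature) that measures the doubling and tripling constants of a finite subset of the symmetric group {S_4}; the set chosen is a subgroup, so its doubling constant is exactly one:

```python
from itertools import permutations

# Toy sketch: doubling and tripling constants for a finite subset of S_4,
# with elements represented as permutation tuples.

def compose(p, q):
    """Product of two permutations: (p*q)(i) = p(q(i))."""
    return tuple(p[q[i]] for i in range(len(p)))

def product_set(X, Y):
    """X . Y = { xy : x in X, y in Y }."""
    return {compose(x, y) for x in X for y in Y}

# X = the cyclic subgroup generated by a 4-cycle; a subgroup should have
# doubling constant exactly 1.
e = (0, 1, 2, 3)
c = (1, 2, 3, 0)                      # the 4-cycle 0 -> 1 -> 2 -> 3 -> 0
X = {e, c, compose(c, c), compose(c, compose(c, c))}

XX = product_set(X, X)
XXX = product_set(XX, X)
print(len(X), len(XX), len(XXX))      # all equal for a subgroup
print("doubling constant:", len(XX) / len(X))
```

For a genuinely approximate group (say, a short geometric progression in a free group) the same computation would report a doubling constant slightly larger than one but still {O(1)}.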

Sets of small doubling (or tripling), etc. can be thought of as “approximate groups”, since groups themselves have a doubling constant {K := |X \cdot X|/|X|} equal to one. Another obvious example of an approximate group is that of an arithmetic progression in an additive group, and more generally of a ball (in the word metric) in a nilpotent group of bounded rank and step. It is tentatively conjectured that in fact all examples can somehow be “generated” out of these basic examples, although it is not fully clear at present what “generated” should mean.

A weaker conjecture along the same lines is that if {X} is a set of small doubling, then there should be some sort of “pseudo-metric” {\rho} on {G} which is left-invariant, and for which {X} is controlled (in some suitable sense) by the unit ball in this metric. (For instance, if {X} were a subgroup of {G}, one would take the metric which identified all the left cosets of {X} to a point, but was otherwise a discrete metric; if {X} were a ball in a nilpotent group, one would use some rescaled version of the word metric, and so forth.) Actually for technical reasons one would like to work with a slightly weaker notion than a pseudo-metric, namely a Bourgain system, but let us again ignore this technicality here.

Recently, using some powerful tools from model theory combined with the theory of topological groups, Ehud Hrushovski has apparently achieved some breakthroughs on this problem, obtaining new structural control on sets of small doubling in arbitrary groups that was not previously accessible to the known combinatorial methods. The precise results are technical to state, but here are informal versions of two typical theorems. The first applies to sets of small tripling in an arbitrary group:

Theorem 1 (Rough version of Hrushovski Theorem 1.1) Let {X} be a set of small tripling. Then one can find a long sequence of nested symmetric sets {X_1 \supset X_2 \supset X_3 \supset \ldots}, all of size comparable to {X} and contained in {(X^{-1} X)^2}, which are somewhat closed under multiplication in the sense that {X_i \cdot X_i \subset X_{i-1}} for all {i > 1}, and which are fairly well closed under commutation in the sense that {[X_i, X_j] \subset X_{i+j-1}}. (There are also some additional statements to the effect that the {X_n} efficiently cover each other, and also cover {X}, but I will omit those here.)
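In the abelian toy case one can already see the multiplicative nesting at work. The sketch below (my own illustration, not from Hrushovski's paper) takes the dyadic intervals {X_i = \{-2^{k-i},\ldots,2^{k-i}\}} in the integers and verifies {X_i + X_i \subset X_{i-1}}; commutators are of course trivial in this abelian setting, so the commutation condition is invisible here.

```python
# Toy abelian illustration of a nested sequence somewhat closed under the
# group operation: dyadic intervals X_i = {-2^(k-i), ..., 2^(k-i)} satisfy
# X_i + X_i ⊆ X_{i-1}.  (Commutators are trivial in an abelian group.)

k = 10
X = [None] + [set(range(-2**(k - i), 2**(k - i) + 1)) for i in range(1, 6)]

for i in range(2, 6):
    sumset = {x + y for x in X[i] for y in X[i]}
    assert sumset <= X[i - 1]          # X_i + X_i ⊆ X_{i-1}

print("nesting verified for i = 2..5")
print([len(X[i]) for i in range(1, 6)])   # sizes comparable up to constants
```

Each {X_i} here has about half the size of its predecessor, so for a bounded number of steps all the sets remain of size comparable to {X_1}.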

This nested sequence is somewhat analogous to a Bourgain system, though it is not quite the same notion.

If one assumes that {X} is “perfect” in a certain sense, which roughly means that there is no non-trivial abelian quotient, then one can do significantly better:

Theorem 2 (Rough version of Hrushovski Corollary 1.2) Let {X_0} be a set of small tripling, let {X := X_0^{-1} X_0}, and suppose that for almost all {l}-tuples {a_1, \ldots, a_l \in X} (where {l=O(1)}), the conjugacy classes {a_i^X := \{ x^{-1} ax: x \in X \}} generate most of {X} in the sense that {|a_1^X \cdot \ldots \cdot a_l^X| \gg |X|}. Then a large part of {X} is contained in a subgroup of size comparable to {X}.

Note that if one quotiented out by the commutator {[X,X]}, then all of the conjugacy classes {a_i^X} would collapse to points. So the hypothesis here is basically a strong quantitative assertion to the effect that the commutator {[X,X]} is extremely large, and rapidly fills out most of {X} itself.
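One can experiment with the hypothesis of Theorem 2 in a small perfect group. The following sketch (again my own toy computation, not from the paper) works in the alternating group {A_5}: the conjugacy class of a nontrivial element is already fairly large, and products of such classes spread rapidly through the group.

```python
from itertools import permutations

# Toy experiment in the perfect group A_5: conjugacy classes of nontrivial
# elements are large, and their products spread quickly through the group,
# in the spirit of the hypothesis of Theorem 2.

def compose(p, q):
    return tuple(p[q[i]] for i in range(5))

def inverse(p):
    q = [0] * 5
    for i, pi in enumerate(p):
        q[pi] = i
    return tuple(q)

def sign(p):
    """Parity via inversion count."""
    s = 1
    for i in range(5):
        for j in range(i + 1, 5):
            if p[i] > p[j]:
                s = -s
    return s

A5 = [p for p in permutations(range(5)) if sign(p) == 1]
a = (1, 2, 3, 4, 0)                                   # a 5-cycle
C = {compose(inverse(x), compose(a, x)) for x in A5}  # conjugacy class a^G
CC = {compose(c, d) for c in C for d in C}            # product of two classes
print(len(A5), len(C), len(CC))   # |A_5| = 60; the class of a 5-cycle has size 12
```

Note that {C \cdot C} contains the identity (since {a^{-1}} is conjugate to {a} in {A_5}) and many other elements besides, in contrast to the abelian situation where a conjugacy class is a single point.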

Here at UCLA, a group of logicians and I (consisting of Matthias Aschenbrenner, Isaac Goldbring, Greg Hjorth, Henry Towsner, Anush Tserunyan, and possibly others) have just started a weekly reading seminar to come to grips with the various combinatorial, logical, and group-theoretic notions in Hrushovski’s paper, of which we only have a partial understanding at present. The seminar is a physical one, rather than an online one, but I am going to try to put some notes on the seminar on this blog as it progresses, as I know that there are a couple of other mathematicians who are interested in these developments.

So far there have been two meetings of the seminar. In the first, I surveyed the state of knowledge of the noncommutative Freiman theorem, covering broadly the material in my previous blog post. In the second meeting, Isaac reviewed some key notions of model theory used in Hrushovski’s paper, in particular the notions of definability and type, which I will review below. It is not yet clear how these are going to be connected with the combinatorial side of things, but this is something which we will hopefully develop in future seminars. The near-term objective is to understand the statement of the main theorem on the model-theoretic side (Theorem 3.4 of Hrushovski), and then understand some of its easier combinatorial consequences, before going back and trying to understand the proof of that theorem.

[Update, Oct 19: Given the level of interest in this paper, readers are encouraged to discuss any aspect of that paper in the comments below, even if they are not currently being covered by the UCLA seminar.]

— 1. Definability —

Model theory studies models (or structures) of languages. For our purposes, a language {L} consists of the basic symbols in logic (the boolean connectives, parentheses, the quantifiers, variables, and the equals sign) plus some additional relation and function symbols. For instance, the language of groups could be denoted as {(e, \cdot, \cdot^{-1})}, since one needs the identity {e} (which can be thought of as a {0}-ary function, i.e. a constant symbol), the multiplication {\cdot} (a binary function), and the inverse function {\cdot^{-1}}. The language of rings could be denoted as {(0, 1, +, -, \times)}, the language of ordered rings could be denoted as {(0, 1, +, -, \times, <)}, and so forth. To avoid cluttering the discussion with irrelevant formalities, we will use all the usual conventions of mathematical notation (e.g. using {x \neq y} as shorthand for {\neg (x = y)}, using {x^3} as short-hand for {x \cdot x \cdot x}, etc.). In the applications to additive combinatorics, the language {L} will either be the language of groups, or an extension thereof.

A structure {M} for a language {L} consists of some domain (which, by abuse of notation, we will also call {M}), together with an interpretation of the various relations and functions of {L} as relations or functions on {M}. For instance, every group {G} is naturally a structure for the language of groups. (Indeed, one does not need {G} to obey the group axioms in order to be a structure for this language; as long as {G} contains a distinguished element {e}, some binary operation {\cdot: G \times G \rightarrow G}, and some unary operation {\cdot^{-1}: G \rightarrow G}, it is a structure.)

Given a sentence {\phi} (i.e. a formula with no free variables) in {L}, a structure {M} can interpret that sentence and give it a truth value in the usual manner; if it gives {\phi} a true value, we write {M \models \phi}. For instance, in the language of groups, if {\phi} is the associativity sentence {\forall x,y,z: (x \cdot y) \cdot z = x \cdot (y \cdot z)}, then {M \models \phi} will hold if {M} is a group, but need not hold for other structures in the language of groups.
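Over a finite structure, deciding {M \models \phi} is a purely mechanical matter, with quantifiers becoming loops over the domain. Here is a quick sketch of mine for the associativity sentence above, with a structure for the language of groups given by a multiplication table:

```python
from itertools import product

# Deciding M |= phi for the associativity sentence, in a finite structure
# for the language of groups presented as a multiplication table.
# Quantifiers over a finite domain become loops.

def models_associativity(dom, mul):
    """M |= forall x,y,z: (x.y).z = x.(y.z)"""
    return all(mul[mul[x][y]][z] == mul[x][mul[y][z]]
               for x, y, z in product(dom, repeat=3))

n = 5
dom = range(n)
addition_mod_5 = [[(x + y) % n for y in dom] for x in dom]       # a group
subtraction_mod_5 = [[(x - y) % n for y in dom] for x in dom]    # not associative

print(models_associativity(dom, addition_mod_5))      # True
print(models_associativity(dom, subtraction_mod_5))   # False
```

Both tables are structures for the language of groups, but only the first one satisfies the associativity sentence, illustrating the point that a structure need not obey the group axioms.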

Given a formula {\phi(x_1,\ldots,x_n)} with {n} free variables in {L}, {M} cannot interpret that formula directly, but given any values {a_1,\ldots,a_n \in M} for the free variables, {M} can assign a truth value to the substituted formula {\phi(a_1,\ldots,a_n)}, and we write {M \models \phi(a_1,\ldots,a_n)} if that sentence is true. The set

\displaystyle  \{ (a_1,\ldots,a_n) \in M^n: M \models \phi(a_1,\ldots,a_n) \}

is then referred to as a {\emptyset}-definable subset of {M^n}, defined by the formula {\phi(x_1,\ldots,x_n)}. For instance, in a group {G}, the set {\{ a \in G: a^2 = e; a \neq e \}} of elements of order two is a {\emptyset}-definable set, defined using the formula

\displaystyle  \phi(x) := `` x^2 = e \wedge x \neq e ''.
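In a finite group one can evaluate this defining formula pointwise. A one-line sketch (my own), using the additive group {{\mathbb Z}/12{\mathbb Z}} (so that “{x^2}” means {x+x} and {e = 0}):

```python
# Evaluating the defining formula phi(x) := "x^2 = e and x != e" pointwise
# over the additive group Z/12Z, where "x^2" means x + x and e = 0.

n = 12
e = 0
definable_set = {a for a in range(n) if (a + a) % n == e and a != e}
print(definable_set)   # {6}: the unique element of order two in Z/12Z
```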

Most subsets of {M^n} in a structure are not {\emptyset}-definable. However one can increase the ability to define sets by adding constants. Given a set {A \subset M}, we can extend the language {L} to a new language {L(A)} by adding a constant symbol {c_a} for each element {a \in A} (actually we shall usually abuse notation and write {a} for {c_a}). The structure {M} similarly extends to a structure {M(A)} for {L(A)} which has the same domain as {M}, and interprets all existing symbols in {L} the same way, but also interprets the constant symbol {c_a} as {a}. A subset of {M^n} is said to be {A}-definable if it is {\emptyset}-definable in {M(A)}, i.e. there is a formula {\phi(x_1,\ldots,x_n)} which can use constants from {A} to define the set.

For instance, in the language of rings, and in the structure {{\mathbb R}} given by the real line (which is, of course, a ring), the singleton set {\{ \sqrt{\pi} \}} is not {\emptyset}-definable, but it is {\{ \pi \}}-definable, since one can use the formula “{x^2 = \pi \wedge \exists y: x=y^2}” to define it.

Even when one uses the entire domain {M} as a set of constants, one cannot always define every single set. For instance, in the empty language (which has the basic notion of equality, but no other relations or symbols), and in an infinite model {M}, the only {M}-definable subsets of {M} turn out to be the finite and cofinite subsets of {M}; one cannot, for instance, define the set of even integers inside the integers by a single first-order formula in the empty language, even if one is allowed to use all the integers as constant symbols. (Of course, one can do so once one has the addition or multiplication symbol available.)

Another example: in the language of fields (or rings), if {M} is an algebraically closed field, then the {M}-definable sets are the boolean combinations of algebraic sets; this fact is closely related to Hilbert’s Nullstellensatz (and can be proven by quantifier elimination). Similarly, in the language of ordered rings, if {M} is the real line, then the {M}-definable sets are the semi-algebraic sets, which is essentially a famous result of Tarski and Seidenberg. In particular, the one-dimensional {M}-definable sets are finite boolean combinations of half-lines, a property known as o-minimality.

Sometimes it is necessary to leave the original structure {M} for a larger structure {N} that is an elementary extension of {M}. What this means is that

  • {N} is an extension of {M} (thus the domain of {N} includes the domain of {M}, and the interpretations of the relations and functions on {N} and on {M} agree on {M}); and
  • Every sentence (allowing constant symbols from {M}) that is true in {M}, is also true in {N}. (The converse implication is then also true by taking negations.)

For instance, in the language of rings, the non-standard reals are an elementary extension of the standard reals. On the other hand, the reals are not an elementary extension of the rationals, because there are formulae such as {\not \exists x: x^2=2} which make sense and are true in the rationals, but not the reals.

Now for an important definition. Let {N} be an elementary extension of {M}, let {A} be a subset of {M}, and let {\vec a = (a_1,\ldots,a_n) \in N^n} be a tuple in {N}. The type {tp(\vec a/A)} of {\vec a} over {A} is the set of all formulae {\phi(x_1,\ldots,x_n)} (using {A} as constant symbols) which are satisfied by {\vec a}:

\displaystyle  tp(\vec a/A) := \{ \phi(x_1,\ldots,x_n) \in L(A): N \models \phi(a_1,\ldots,a_n) \}.

Informally, the type captures everything one can say in first-order logic about {\vec a} if one is only allowed to use the constant symbols {A}.

A complete type over {A} is a collection of formulae in {L(A)} which is the type of some tuple {\vec a} in some elementary extension {N} of {M}. A partial type {p} over {A} is a subset of a complete type over {A}, i.e. a collection of formulae with constants in {A} which is satisfied by some {\vec a} in some elementary extension. In that case we say that {\vec a} realises the partial type {p}, and write {\vec a \models p}.

For instance, each real number can be identified with a complete type over the rationals (using all the rationals as constants) in the language of totally ordered sets, by the Dedekind cut construction. However, there are additional complete types as well, such as those arising from infinitesimals.

By the completeness theorem, a partial type is nothing more than a set of formulae of a given arity involving {A} which is logically consistent with all the sentences in {L(A)} that are true in {M}; and a complete type is a partial type which is maximal (given any formula {\phi}, either {\phi} or its negation lie in the complete type). One can view complete types as being ultrafilters on the language {L(A)} (modulo equivalence by sentences that are true in {M}, i.e. modulo the theory of {M}).

For instance, in the language of ordered sets, and with {M} being the natural numbers, the collection of formulae {\{ `` x > n'': n \in {\mathbb N} \}} in {L({\mathbb N})} cannot be satisfied in the natural numbers, but can be satisfied in elementary extensions of the natural numbers, e.g. the ordered set {{\mathbb N} \uplus {\mathbb Z}} or the non-standard natural numbers, and so is a partial type over {{\mathbb N}}.

Thus we see that in order to realise a partial type, one may have to move to a larger structure, which can be inconvenient. There is a related inconvenience involving tuples {\vec a, \vec b \in N^n} in an elementary extension {N}: say that {\vec a} and {\vec b} are elementarily indistinguishable over {A} if {tp(\vec a/A) = tp(\vec b/A)}, i.e. there is no formula involving {A} that can distinguish {\vec a} from {\vec b}. One obvious way in which indistinguishability can occur is if there is an automorphism {\phi: N \rightarrow N} of {N} that fixes {A} (where an automorphism is defined as a bijection which preserves all the structures in the language) and maps {\vec a} to {\vec b}, much as two elements of a field extension which are Galois conjugate to each other cannot be distinguished using polynomials over the base field. However, it is possible to have two elementarily indistinguishable tuples {\vec a, \vec b} in a structure {N} for which no automorphism of {N} exists that maps {\vec a} to {\vec b}.

Here is a concrete example: we use the language of order, let {M=N} be the rationals, and let {A} be the set {\{ -1/n: n \in {\mathbb N} \} \cup \{1 + 1/n: n \in {\mathbb N} \}}. Then {0} and {1} have the same type over {A} (i.e. are elementarily indistinguishable), but there is no {A}-fixing automorphism that maps {0} to {1}, because {0} is the supremum of a subset of {A}, and {1} is not. On the other hand, if one extends the rationals to contain non-standard rationals, then such an automorphism becomes possible.
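A finite sanity check of the indistinguishability claim (my own sketch): any single formula mentions only finitely many constants from {A}, and since the theory of dense linear orders without endpoints has quantifier elimination, the type of a point over {A} is determined by how it compares with each constant. The code below verifies that {0} and {1} compare identically with every listed element of {A}:

```python
from fractions import Fraction

# Any one formula mentions only finitely many constants from A, and by
# quantifier elimination for dense linear orders, the type of a point over A
# is determined by its comparisons with the constants.  Here we check that
# 0 and 1 compare identically with every element of (a finite chunk of) A.

A = ([Fraction(-1, n) for n in range(1, 50)]
     + [1 + Fraction(1, n) for n in range(1, 50)])

zero, one = Fraction(0), Fraction(1)
same_cut = all((zero < a) == (one < a) and (zero > a) == (one > a) for a in A)
print(same_cut)   # True: no single constant separates 0 from 1
```

The distinction between {0} and {1} (one being a supremum of a subset of {A}, the other not) is an infinitary fact, invisible to any single first-order formula.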

More generally, whenever {\vec a} and {\vec b} are elementarily indistinguishable, one can always find an elementary extension {N'} of {N} (and hence of {M}) and an automorphism of {N'} for which {\vec a} maps to {\vec b}. (This appears to be some sort of swindle type argument, but I do not know the details.)

It is convenient to pre-empt all these issues of passing to an elementary extension by working with a universal extension {{\Bbb U}} of {M} with the following two properties:

  • (i) (Saturation) If {A \subset {\Bbb U}} has strictly smaller cardinality than {{\Bbb U}}, then every partial type over {A} is realisable in {{\Bbb U}} (no need to pass to a further extension); and
  • (ii) (Homogeneity) If {A} is as above and {\vec a, \vec b \in {\Bbb U}^n} are elementarily indistinguishable over {A}, then there is an automorphism of {{\Bbb U}} that maps {\vec a} to {\vec b} (so again, no need to pass to a further extension).

I do not know fully how to obtain a saturated model (though it is standard in model theory textbooks); it does require some set-theoretic assumptions, such as the generalised continuum hypothesis, or the existence of a strongly inaccessible cardinal, though in practice one does not need such an absurdly huge model because one usually only needs saturation for sets {A} of much smaller cardinality (e.g. countable). For instance, given any uncountable ordinal {\kappa}, one can obtain saturation for all {A} of cardinality {\kappa} or less with a structure {{\Bbb U}} of cardinality {|M|^\kappa}. It seems that for the combinatorial applications we will in fact only need {A} and {M} to be countable, though we have not yet checked this fully.

It turns out that (i) formally implies (ii). Here is a sketch of a proof: to build an automorphism {f: {\Bbb U} \rightarrow {\Bbb U}} that fixes {A} and maps {\vec a} to {\vec b}, we will greedily build up partial automorphisms {f_C: C \rightarrow {\Bbb U}} that fix {A} and map {\vec a} to {\vec b}, where {C} is a set of strictly smaller cardinality than {{\Bbb U}}. By “partial automorphism”, we mean that if {c_1,\ldots,c_n \in C}, then {(f_C(c_1),\ldots,f_C(c_n))} has the same type over {A} (or over any other subset of {C}) as {(c_1,\ldots,c_n)}.

Clearly one can build a partial automorphism on the set {A \cup \{ \vec a, \vec b\}}. By transfinite induction (and by well-ordering {{\Bbb U}} using the least ordinal with the cardinality of {{\Bbb U}}), it then suffices to show that whenever one has a partial automorphism {f_C: C \rightarrow {\Bbb U}}, and an element {c} outside of {C}, one can extend it to a partial automorphism {f_{C \cup \{c\}}: C \cup \{c\} \rightarrow {\Bbb U}}. But if one looks at the set of constraints that {d := f_{C \cup \{c\}}(c)} needs to satisfy (roughly, that any formula that is true for {c} and {c_1,\ldots,c_n \in C}, must also be true for {d} and {f_C(c_1),\ldots,f_C(c_n)}), we see that any finite collection of these constraints is satisfiable (because it can be encoded as an existential sentence {\exists d: \ldots} involving finitely many {f_C(c_1),\ldots,f_C(c_n)}, which by the partial automorphism property of {f_C} is equivalent to the corresponding (true) existential sentence {\exists c: \ldots} involving {c_1,\ldots,c_n}), and so by (i) it is realisable in {{\Bbb U}}, and the claim follows.
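A finite-stage analogue of this greedy extension argument (a sketch of mine) is Cantor's back-and-forth construction for the rationals: a partial order-isomorphism is extended one point at a time, with density playing the role that saturation plays above, namely guaranteeing that the finite constraint set on the new image is realisable.

```python
from fractions import Fraction

# Back-and-forth, one step at a time: extend a finite partial
# order-isomorphism of Q to a new point c.  Density of Q guarantees that
# the constraints on the image (lying strictly between the images of the
# points below c and above c) are always realisable.

def extend(pairs, c):
    """Extend the partial order-isomorphism `pairs` to the new point c."""
    if any(a == c for a, b in pairs):
        return pairs
    below = [b for a, b in pairs if a < c]
    above = [b for a, b in pairs if a > c]
    lo = max(below) if below else min(above) - 1
    hi = min(above) if above else max(below) + 1
    d = (lo + hi) / 2                    # a fresh image strictly in between
    return pairs + [(c, d)]

pairs = [(Fraction(0), Fraction(0)), (Fraction(1), Fraction(10))]
for c in [Fraction(1, 2), Fraction(2), Fraction(-3), Fraction(1, 3)]:
    pairs = extend(pairs, c)

# the extended map is still order-preserving:
ok = all((a1 < a2) == (b1 < b2) for a1, b1 in pairs for a2, b2 in pairs)
print(ok)   # True
```

Iterating this extension step along an enumeration of the rationals (alternating domain and range, to ensure surjectivity) produces a full automorphism, just as the transfinite induction above does in the saturated model.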

Property (ii) has a nice consequence: a {{\Bbb U}}-definable subset of {{\Bbb U}^n} is {A}-definable if and only if it is invariant under all automorphisms of {{\Bbb U}} that fix {A}.

Let’s now work in this universal model, with some set {A} of constants. Every partial type {p} (of some arity {n}) over {A} defines a subset {\{ \vec a \in {\Bbb U}^n: \vec a \models p \}} of {{\Bbb U}^n}; this is an intersection of {A}-definable sets (one for each formula in {p}), and is thus called a {\bigwedge}-definable set over {A}. For instance, in the language of groups, if {A} is a (possibly infinite) subset of a group {G}, the centraliser {C(A) := \{ g \in G: ag=ga \hbox{ for all } a \in A \}} is a {\bigwedge}-definable set over {A}, but is not {A}-definable in general. One can view {A}-definable sets as analogous to a base of clopen sets in topology, whereas {\bigwedge}-definable sets are analogous to closed sets. Dually, we have the notion of a {\bigvee}-definable set over {A}, which is the complement of a {\bigwedge}-definable set over {A}, or equivalently the union of {A}-definable sets; this is the analogue of an open set. (For instance, {A} is automatically {\bigvee}-definable over {A}.)
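The centraliser example can be made concrete in a finite group. In the sketch below (my own illustration) the centraliser inside {S_3} is computed as the intersection of one definable set {\{ g: ag = ga \}} per constant {a \in A}, which is exactly the {\bigwedge}-definable structure described above:

```python
from itertools import permutations

# The centraliser C(A) as an intersection of definable sets: one set
# { g : ag = ga } per constant a in A, computed inside S_3.

def compose(p, q):
    return tuple(p[q[i]] for i in range(3))

G = list(permutations(range(3)))
A = [(1, 0, 2)]                           # a transposition
centraliser = set(G)
for a in A:
    definable_piece = {g for g in G if compose(a, g) == compose(g, a)}
    centraliser &= definable_piece        # one A-definable set per formula

print(sorted(centraliser))   # the identity and the transposition itself
```

For infinite {A} the intersection involves infinitely many formulae, which is why {C(A)} is in general only {\bigwedge}-definable rather than {A}-definable.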

Not every interesting object involving {A} is {\bigwedge}-definable or {\bigvee}-definable; for instance, in the language of groups, if {A} is a subgroup of {G}, the normaliser {N(A) := \{ g \in G: gAg^{-1} = A \}} is neither {\bigwedge}-definable nor {\bigvee}-definable in general (there are too many quantifiers hidden inside the statement {gAg^{-1} = A}!). Things might improve if one adds in a predicate for membership in {A}, though I was a little unclear on this point.

(Thanks to Matthias Aschenbrenner, Juan Diego Caycedo, and Isaac Goldbring for comments and suggestions.)