Classical probability theory (discussed for instance in this previous blog post) is founded on the notion of a probability space {(\Omega, {\cal E}, {\bf P})} – a space {\Omega} (the sample space) equipped with a {\sigma}-algebra {{\cal E}} (the event space), together with a countably additive probability measure {{\bf P}: {\cal E} \rightarrow [0,1]} that assigns a real number in the interval {[0,1]} to each event.

One can generalise the concept of a probability space to a finitely additive probability space, in which the event space {{\cal E}} is now only a Boolean algebra rather than a {\sigma}-algebra, and the measure {{\bf P}} is now only finitely additive instead of countably additive, thus {{\bf P}( E \vee F ) = {\bf P}(E) + {\bf P}(F)} when {E,F} are disjoint events. By giving up countable additivity, one loses a fair amount of measure and integration theory, and in particular the notion of the expectation of a random variable becomes problematic (unless the random variable takes only finitely many values). Nevertheless, one can still perform a fair amount of probability theory in this weaker setting.

In this post I would like to describe a further weakening of probability theory, which I will call qualitative probability theory, in which one does not assign a precise numerical probability value {{\bf P}(E)} to each event, but instead merely records whether this probability is zero, one, or something in between. Thus {{\bf P}} is now a function from {{\cal E}} to the set {\{0, I, 1\}}, where {I} is a new symbol that replaces all the elements of the open interval {(0,1)}. In this setting, one can no longer compute quantitative expressions, such as the mean or variance of a random variable; but one can still talk about whether an event holds almost surely, with positive probability, or with zero probability, and there are still usable notions of independence. (I will refer to classical probability theory as quantitative probability theory, to distinguish it from its qualitative counterpart.)

The main reason I want to introduce this weak notion of probability theory is that it is well suited for talking about random variables living inside algebraic varieties, even if these varieties are defined over fields other than {{\bf R}} or {{\bf C}}. In algebraic geometry one often talks about a “generic” element of a variety {V} defined over a field {k}, which does not lie in any specified variety of lower dimension defined over {k}. Once {V} has positive dimension, such generic elements do not exist as classical, deterministic {k}-points {x} in {V}, since of course any such point lies in the {0}-dimensional subvariety {\{x\}} of {V}. There are of course several established ways to deal with this problem. One way (which one might call the “Weil” approach to generic points) is to extend the field {k} to a sufficiently transcendental extension {\tilde k}, in order to locate a sufficient number of generic points in {V(\tilde k)}. Another approach (which one might dub the “Zariski” approach to generic points) is to work scheme-theoretically, and interpret a generic point in {V} as being associated to the zero ideal in the function ring of {V}. However, I want to discuss a third perspective, in which one interprets a generic point not as a deterministic object, but rather as a random variable {{\bf x}} taking values in {V}, but which lies in any given lower-dimensional subvariety of {V} with probability zero. This interpretation is intuitive, but difficult to implement in classical probability theory (except perhaps when considering varieties over {{\bf R}} or {{\bf C}}) due to the lack of a natural probability measure to place on algebraic varieties; however it works just fine in qualitative probability theory. In particular, the algebraic geometry notion of being “generically true” can now be interpreted probabilistically as an assertion that something is “almost surely true”.

It turns out that just as qualitative random variables may be used to interpret the concept of a generic point, they can also be used to interpret the concept of a type in model theory; the type of a random variable {{\bf x}} is the set of all predicates {\phi({\bf x})} that are almost surely obeyed by {{\bf x}}. In contrast, model theorists often adopt a Weil-type approach to types, in which one works with deterministic representatives of a type, which often do not occur in the original structure of interest, but only in a sufficiently saturated extension of that structure (this is the analogue of working in a sufficiently transcendental extension of the base field). However, it seems that (in some cases at least) one can equivalently view types in terms of (qualitative) random variables on the original structure, avoiding the need to extend that structure. (Instead, one reserves the right to extend the sample space of one’s probability theory whenever necessary, as part of the “probabilistic way of thinking” discussed in this previous blog post.) We illustrate this below the fold with two related theorems that I will interpret through the probabilistic lens: the “group chunk theorem” of Weil (later developed by Hrushovski), and the “group configuration theorem” of Zilber (again later developed by Hrushovski). For the sake of concreteness we will only consider these theorems in the theory of algebraically closed fields, although the results are quite general and can be applied to many other theories studied in model theory.

— 1. Qualitative probability theory – generalities —

We begin by setting up the foundations of qualitative probability theory, proceeding by close analogy with the more familiar quantitative probability theory (though of course we will have to jettison various quantitative concepts, such as mean and variance, from the theory).

As discussed in the introduction, we are replacing the unit interval {[0,1]} by the three-element set {\{0,I,1\}}; one could view this as the quotient space of {[0,1]} in which the interior {(0,1)} has been contracted to a single point {I}. This space is still totally ordered: {0 < I < 1}. The addition relation {x+y=z} on {[0,1]} contracts to an “addition” relation {x+y \sim z} on {\{0,I,1\}}, defined by the following rules:

\displaystyle  0 + 0 \sim 0; 0 + I \sim I; 0 + 1 \sim 1

\displaystyle  I+0 \sim I; 1 + 0 \sim 1

\displaystyle  I + I \sim I; I + I \sim 1

with no other relations of the form {x+y \sim z} in {\{0,I,1\}}. Strictly speaking, {+} is not a binary operation here, as {I+I} can evaluate to {I} or to {1}, but we keep the notation {+} in order to emphasise the analogy with quantitative probability theory.
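As a concrete check, here is a small Python sketch (my own encoding, not part of the original discussion) of this addition relation, represented as the set of admissible triples and tested against the quotient map from {[0,1]}:

```python
import itertools
from fractions import Fraction

I = "I"   # the symbol replacing every value in the open interval (0, 1)

# The relation x + y ~ z, encoded as the set of admissible triples.
ADD = {
    (0, 0, 0), (0, I, I), (0, 1, 1),
    (I, 0, I), (1, 0, 1),
    (I, I, I), (I, I, 1),   # I + I can evaluate to either I or 1
}

def quotient(t):
    """Collapse a number in [0, 1] to the three-element set {0, I, 1}."""
    return 0 if t == 0 else 1 if t == 1 else I

# Whenever x + y = z inside [0, 1], the collapsed triple lies in ADD.
samples = [Fraction(0), Fraction(1, 3), Fraction(1, 2), Fraction(2, 3), Fraction(1)]
for x, y in itertools.product(samples, repeat=2):
    if x + y <= 1:
        assert (quotient(x), quotient(y), quotient(x + y)) in ADD
```

Note that no triple of the form {(1, I, z)} or {(I, 1, z)} occurs, since a full set and a set of intermediate measure can never be disjoint.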

A qualitative probability space {X = (X, {\cal X}, \mu)} is then a space {X} equipped with a Boolean algebra {{\cal X}} (the measurable subsets of {X}) and a function {\mu: {\cal X} \rightarrow \{0,I,1\}} with {\mu(X)=1}, which is finitely additive in the sense that {\mu(E) + \mu(F) \sim \mu(E \vee F)} whenever {E, F \in {\cal X}} are disjoint. It is easy to see that these measures are monotone (thus {\mu(E) \leq \mu(F)} whenever {E \subset F}), and that {\mu(\emptyset)=0}. A measurable subset {E} of {X} is called a {\mu}-null set if {\mu(E)=0}, and a {\mu}-full set, a {\mu}-conull set, or a {\mu}-generic set if {\mu(E)=1}; note that {\mu}-full sets are the complements of {\mu}-null sets and vice versa. A property {P(x)} of points {x \in X} is said to hold {\mu}-almost everywhere or for {\mu}-generic {x} if it holds outside of a {\mu}-null set (or equivalently, if it holds on a {\mu}-generic set).

One can describe a qualitative probability measure {\mu} on a Boolean space {(X,{\cal X})} purely through its null ideal {\hbox{null}(\mu) := \{ E \in {\cal X}: \mu(E)=0 \}} of null sets, or equivalently through its full filter {\hbox{full}(\mu) := \{ E \in {\cal X}: \mu(E)=1\}} of full sets. Conversely, a subset {{\cal F}} of {{\cal X}} is the full filter of some qualitative probability measure on {(X,{\cal X})} if and only if it obeys the following filter axioms:

  • (Empty set) {\emptyset \not \in {\cal F}} and {X \in {\cal F}}.
  • (Monotonicity) If {E \subset F} are in {{\cal X}}, and {E \in {\cal F}}, then {F \in {\cal F}}.
  • (Intersection) If {E, F \in {\cal F}}, then {E \cap F\in {\cal F}}.

Furthermore {\mu} is completely determined by the filter {{\cal F}}. Similarly for the null ideal (with suitably inverted axioms, of course). Thus, if one wished, one could replace the concept of a qualitative probability measure with the concept of an ideal or filter, but we retain the use of {\mu} to emphasise the probabilistic interpretation of these objects.
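To illustrate the correspondence, here is a hypothetical Python sketch on the power set of a three-element set: we start from a full filter (here, the sets containing both {1} and {2}, which is not an ultrafilter, so the intermediate value {I} genuinely occurs), recover the measure, and check the filter axioms and finite additivity:

```python
from itertools import combinations

X = frozenset({1, 2, 3})
subsets = [frozenset(s) for r in range(4) for s in combinations(X, r)]

# Full filter: all sets containing both 1 and 2.
full = {E for E in subsets if frozenset({1, 2}) <= E}

def mu(E):
    if E in full:
        return 1
    if (X - E) in full:
        return 0
    return "I"

# Filter axioms from the post.
assert frozenset() not in full and X in full
assert all(F in full for E in full for F in subsets if E <= F)   # monotonicity
assert all((E & F) in full for E in full for F in full)          # intersection

# Finite additivity: mu(E) + mu(F) ~ mu(E | F) for disjoint E, F.
ADD = {(0, 0, 0), (0, "I", "I"), (0, 1, 1), ("I", 0, "I"), (1, 0, 1),
       ("I", "I", "I"), ("I", "I", 1)}
assert all((mu(E), mu(F), mu(E | F)) in ADD
           for E in subsets for F in subsets if not (E & F))
```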

One obvious way to create a qualitative probability measure is to start with a quantitative probability measure and “forget” the quantitative aspect of this measure by quotienting {[0,1]} down to {\{0,I,1\}}. Under some reasonable hypotheses, one can reverse this procedure and view many qualitative probability measures as quantitative probability measures to which this forgetful process has been applied. However, this reversal is usually not unique, and we will not try to use it here.

In quantitative probability theory, one can take two quantitative probability measures {\mu,\nu} on the same space {X} and form an average {(1-\theta) \mu + \theta \nu} for some {0 < \theta < 1}, which is another quantitative probability measure. For instance, if {\mu} is a probability measure and {E} is a set with measure between {0} and {1}, then {\mu} can be expressed as an average of the conditioned measures {(\mu|E) := \frac{1}{\mu(E)} \mu\downharpoonright_E} and {(\mu|X \backslash E) := \frac{1}{\mu(X \backslash E)} \mu\downharpoonright_{X \backslash E}}.

In analogy with this, we can take two qualitative measures {\mu,\nu} on the same space {X} and form the average {I \mu + I \nu}, defined by setting {(I\mu+I\nu)(E) = 1} if and only if {\mu(E)=\nu(E)=1} (or equivalently, {(I\mu+I\nu)(E) = 0} if and only if {\mu(E)=\nu(E)=0}). If {\mu} is a qualitative probability measure and {E} is a set with measure {I}, then we can form the conditioned measure {(\mu|E)}, defined by setting {(\mu|E)(F) = 1} if {\mu(E \backslash F) = 0} (or equivalently {(\mu|E)(F)=0} if {\mu(E \cap F)=0}), and then one can check that {\mu} is the average of the conditioned measures {(\mu|E)} and {(\mu|X \backslash E)}.
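A small Python sketch (with my own encoding of measures as functions on frozensets, and using the convention that {(\mu|E)(F)=1} iff {\mu(E \backslash F)=0} and {(\mu|E)(F)=0} iff {\mu(E \cap F)=0}) that verifies this averaging identity on a toy example:

```python
from itertools import combinations

X = frozenset({1, 2, 3})
subsets = [frozenset(s) for r in range(4) for s in combinations(X, r)]

def mu(F):
    """Full sets are those containing {1, 2}; e.g. mu({1}) = I."""
    if frozenset({1, 2}) <= F:
        return 1
    if not (F & frozenset({1, 2})):
        return 0
    return "I"

def condition(m, E):
    """(m|E): full iff m(E - F) = 0, null iff m(E & F) = 0."""
    def mE(F):
        if m(E - F) == 0:
            return 1
        if m(E & F) == 0:
            return 0
        return "I"
    return mE

def average(m1, m2):
    """I*m1 + I*m2: full iff both full, null iff both null."""
    def m(F):
        if m1(F) == m2(F) == 1:
            return 1
        if m1(F) == m2(F) == 0:
            return 0
        return "I"
    return m

E = frozenset({1})                       # a set of intermediate measure I
avg = average(condition(mu, E), condition(mu, X - E))
assert all(avg(F) == mu(F) for F in subsets)
```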

We call a qualitative probability measure irreducible if it does not assign any set the intermediate measure of {I} (or equivalently, the full filter is an ultrafilter); thus, irreducible qualitative probability measures are the same concept as finitely additive {\{0,1\}}-valued probability measures (which, as is well known, are essentially the same concept as ultrafilters). By the previous discussion, we see that a qualitative probability measure is irreducible if and only if it is not the average of two other measures.
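On a finite Boolean algebra this identification can be checked exhaustively; the following sketch (a hypothetical example, not from the post) enumerates all irreducible qualitative measures on the power set of a three-point set and confirms that they are exactly the point masses, i.e. the principal ultrafilters:

```python
from itertools import combinations, product

X = frozenset({1, 2, 3})
subsets = [frozenset(s) for r in range(4) for s in combinations(X, r)]

def is_measure(a):
    """Finitely additive {0,1}-valued assignment with a(X) = 1."""
    if a[X] != 1 or a[frozenset()] != 0:
        return False
    return all(a[E] + a[F] == a[E | F]
               for E in subsets for F in subsets if not (E & F))

irreducible = []
for values in product([0, 1], repeat=len(subsets)):
    a = dict(zip(subsets, values))
    if is_measure(a):
        irreducible.append(a)

# Exactly the three point masses, i.e. the principal ultrafilters on X.
assert len(irreducible) == 3
full_sets = {frozenset(E for E in subsets if a[E] == 1) for a in irreducible}
assert full_sets == {frozenset(E for E in subsets if x in E) for x in X}
```

(On an infinite Boolean algebra there are of course also non-principal ultrafilters, which is where the interesting examples live; the finite case is only a sanity check.)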

In this post we will primarily work with irreducible measures, but will occasionally have to deal with reducible measures, for instance when taking the conditional product (or pullback) of two irreducible measures over a third.

Given a measurable map {f: X \rightarrow Y} between two Boolean spaces {(X, {\cal X})}, {(Y, {\cal Y})} (thus the pre-image of any measurable set in {Y} by {f} is measurable in {X}), we can define the pushforward {f_* \mu} of any qualitative probability measure {\mu} on {X} to be the qualitative probability measure on {Y} defined by the usual formula {f_* \mu(E) := \mu(f^{-1}(E))}. In particular, if {X} embeds into {Y}, then any measure {\mu} on {X} can also be viewed as a measure on {Y}, which we call the extension of {\mu} to {Y}.
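A minimal sketch of the pushforward formula on a hypothetical finite example:

```python
# Pushforward f_* mu of a qualitative measure, via f_* mu(E) = mu(f^{-1}(E)).
X = frozenset({1, 2, 3, 4})

def mu(E):                       # irreducible measure: point mass at the point 1
    return 1 if 1 in E else 0

f = {1: "a", 2: "a", 3: "b", 4: "b"}   # a (here arbitrary) map from X to Y

def pushforward(f, m):
    return lambda E: m(frozenset(x for x in X if f[x] in E))

nu = pushforward(f, mu)
assert nu(frozenset({"a"})) == 1       # the preimage {1, 2} has full measure
assert nu(frozenset({"b"})) == 0
```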

We now discuss the issue of product measures in the qualitative setting. Here we will deviate a little from the usual probability formalism, in which one defines the product algebra {{\cal X} \times {\cal Y}} to be the minimal algebra containing all products {E \times F} with {E \in {\cal X}} and {F \in {\cal Y}}. Here, it turns out to be more useful to have a more flexible (but not unique) notion of a product, in which more measurable sets are permitted. Namely, given two Boolean spaces {(X, {\cal X})}, {(Y, {\cal Y})}, we say that a Boolean space {(X \times Y, {\cal X} \times {\cal Y})} is a product of the two spaces if {X \times Y} is the Cartesian product of {X} and {Y}, and the following axioms are obeyed:

  • (Products) If {E \in {\cal X}} and {F \in {\cal Y}}, then {E \times F \in {\cal X} \times {\cal Y}}.
  • (Slicing) If {E \in {\cal X} \times {\cal Y}}, then {E_x := \{ y \in Y: (x,y) \in E \} \in {\cal Y}} for all {x \in X}, and {E^y := \{ x \in X: (x,y) \in E \} \in {\cal X}} for all {y \in Y}.

These axioms do not uniquely specify {{\cal X} \times {\cal Y}}, but in practice each product space {X \times Y} will have a canonical choice for {{\cal X} \times {\cal Y}} attached to it.

Given two qualitative probability spaces {(X, {\cal X}, \mu)}, {(Y, {\cal Y}, \nu)}, a qualitative probability space {(X \times Y, {\cal X} \times {\cal Y}, \mu \times \nu)} is a product of the two spaces if {(X \times Y, {\cal X} \times {\cal Y})} is a product of {(X, {\cal X})} and {(Y, {\cal Y})}, and the following assertions are equivalent for any {E \in {\cal X} \times {\cal Y}}:

  • {\mu\times \nu(E) = 1}.
  • For {\mu}-almost every {x \in X}, {\nu(E_x) = 1}.
  • For {\nu}-almost every {y \in Y}, {\mu(E^y) = 1}.

Of course, one could replace {1} by {0} here in the above equivalence, which can be thought of as a qualitative Fubini-Tonelli type theorem. Once one selects the product Boolean algebra {{\cal X} \times {\cal Y}}, the product measure {\mu \times \nu} is uniquely specified, if it exists at all; but (as in the setting of classical measure theory if one is not working with {\sigma}-finite measures), existence is not always automatic. (But in our applications, it will be.)
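The following toy sketch (point masses on two-element sets; all names are mine) checks the three equivalent conditions of the qualitative Fubini-Tonelli statement for every subset of the product:

```python
from itertools import combinations

X, Y = frozenset({1, 2}), frozenset({"a", "b"})
mu = lambda E: 1 if 1 in E else 0     # point mass at the "generic" point 1 of X
nu = lambda E: 1 if "a" in E else 0   # point mass at "a" in Y

def powerset(S):
    S = list(S)
    return [frozenset(c) for r in range(len(S) + 1) for c in combinations(S, r)]

# Candidate product measure: point mass at (1, "a").
prod_measure = lambda E: 1 if (1, "a") in E else 0

for E in powerset({(x, y) for x in X for y in Y}):
    c1 = prod_measure(E) == 1
    # For mu-almost every x, the slice E_x is nu-full:
    c2 = mu(frozenset(x for x in X
                      if nu(frozenset(y for y in Y if (x, y) in E)) == 1)) == 1
    # For nu-almost every y, the slice E^y is mu-full:
    c3 = nu(frozenset(y for y in Y
                      if mu(frozenset(x for x in X if (x, y) in E)) == 1)) == 1
    assert c1 == c2 == c3
```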

One can define products of more than two (but still finitely many) qualitative probability spaces in a similar fashion; we leave the details to the reader.

Now we are ready to set up qualitative probability theory. We need a qualitative probability space {(\Omega, {\cal E}, {\bf P})} to serve as the sample space, event space, and probability measure. An event in {{\cal E}} is said to occur almost surely if it occurs on a {{\bf P}}-full set. Given a Boolean space {X = (X, {\cal X})}, a random variable is then a measurable map {{\bf x}: \Omega \rightarrow X}. We permit random variables that are only defined almost surely, thus {{\bf x}} is now a partial function defined on a {{\bf P}}-full event in {\Omega}; as in quantitative probability theory or measure theory, we view random variables that agree almost surely as being essentially equivalent to each other. The law or distribution of {{\bf x}} is the pushforward {{\bf x}_* {\bf P}} of the qualitative probability measure; this is then a qualitative measure on {X}. We say that two random variables {{\bf x}, {\bf x}': \Omega \rightarrow X} agree in distribution, and write {{\bf x} \equiv {\bf x}'}, if they have the same law.

Two random variables {{\bf x}: \Omega \rightarrow X}, {{\bf y}: \Omega \rightarrow Y} are said to be independent if the distribution of the joint random variable {({\bf x},{\bf y}): \Omega \rightarrow X \times Y} is the product of the distributions of {{\bf x}} and {{\bf y}} separately. Here we need to specify a product Boolean algebra {{\cal X} \times {\cal Y}} on the product space {X \times Y} to make this definition well-defined, but in the applications we will consider, we will always have a canonical product algebra to select. One can define independence of more than two random variables in a similar fashion.

At the opposite extreme to independence, we say that a random variable {{\bf y}: \Omega\rightarrow Y} is determined by another random variable {{\bf x}: \Omega \rightarrow X} if there is a measurable map {f: X \rightarrow Y} such that {f({\bf x})={\bf y}} almost surely. The contrast between independence and determination (as well as a weaker property than determination we will consider later, namely algebraicity) will be the focus of the group chunk and group configuration theorems discussed in later sections.

As discussed in this previous blog post, in quantitative probability theory we often reserve the right to extend the underlying probability space {(\Omega, {\cal E}, {\bf P})}, in order to introduce new sources of randomness, without destroying the probabilistic properties of existing random variables (such as their independence and determination properties). We say that an extension of a qualitative probability space {(\Omega, {\cal E}, {\bf P})} is another qualitative probability space {(\Omega', {\cal E}', {\bf P}')} together with a measurable map {\pi: \Omega' \rightarrow \Omega} such that {\pi_* {\bf P}' = {\bf P}}. One can then pull back any random variable {{\bf x}: \Omega \rightarrow X} to a random variable {{\bf x} \circ \pi: \Omega' \rightarrow X} on the new probability space; by abuse of notation, we continue to refer to {{\bf x} \circ \pi} as {{\bf x}}. Probabilistic notions such as independence, law, or determination remain unchanged under such an extension.

— 2. Qualitative probability theory on definable sets —

For the purposes of this post, the qualitative probability measures we will care about will live in the theory of algebraically closed fields. We will assume some basic familiarity with algebraic geometry concepts, such as the dimension of a variety. The exact choice of field {k} will not be important here, but one could work with the complex field {k={\bf C}} if desired (in which case one could (somewhat artificially) model the qualitative probability measures here by quantitative probability measures on complex varieties if one wished).

Henceforth {k} is fixed; in contrast to usual model theory practice, we will not need to introduce some sufficiently large extension of {k} to work in. The notion of measurability here will be given by the model-theoretic concept of definability. Namely, a definable set is a set {X \subset k^n} of the form

\displaystyle  X = \{ x \in k^n: \phi(x) \hbox{ true} \}

for some predicate {\phi(x)} that can be expressed in terms of the field operations {+,\cdot}, a finite number of variables and constants in {k}, the boolean symbols {\vee,\wedge, \neg}, the equality sign, the quantifiers {\forall, \exists} (with all variables being quantified over {k}), and punctuation symbols (parentheses and colons). A definable map {f: X \rightarrow Y} between two definable sets is a function whose graph {\{ (x, f(x)): x \in X \}} is a definable set.

As {k} is algebraically closed, the definable sets can be described quite simply. Define an irreducible quasiprojective variety, or variety for short, to be a Zariski-open dense subset of an irreducible affine variety over {k}. One can show (using elimination of quantifiers in algebraically closed fields, the existence of which follows from Hilbert’s nullstellensatz) that a set is definable if and only if it is the union of a finite number of disjoint varieties.

We equip each definable set {X} with the Boolean algebra {{\cal D}_X} of definable subsets of {X}; this will be the only algebra we shall ever place on a definable set. Note that if {X,Y} are definable, then {(X \times Y, {\cal D}_{X \times Y})} is a product space of {(X, {\cal D}_X)} and {(Y, {\cal D}_Y)} as per the previous definition.

If {\mu} is a qualitative probability measure on a definable set {X}, the support of {\mu} is defined to be the intersection of all the Zariski-closed sets of {\mu}-full measure. As the Zariski topology is Noetherian, the support is always a closed set of full measure.

Remark 1 One can describe a qualitative probability measure {\mu} through its type, defined as the set of all predicates {\phi(x)} which hold for {\mu}-almost all {x}. This concept is essentially the same as the concept of a type in model theory; the measure {\mu} is irreducible if and only if the type is complete. In this post, we have essentially replaced the notion of a type with that of a qualitative probability measure, and so types will not appear explicitly in the rest of the post.

We now give some basic examples of qualitative probability measures on definable sets.

Example 1 If {X} is a non-empty definable set of some dimension {d} (that is, {d} is the largest dimension of the components of {X}, or of its closure), then the qualitative uniform probability measure {\mu_X} on {X} (or uniform measure, for short) is defined by setting all subsets of dimension {d-1} or less to have measure zero, or equivalently all generic subsets of {X} (that is, {X} with finitely many sets of dimension {d-1} or less removed) to have full measure. (Sets which contain a generic subset of some, but not all, of the {d}-dimensional components of {X} are then assigned the intermediate measure {I}.) The support of this measure is then the closure of the union of the {d}-dimensional components. This measure is irreducible if and only if {X} is almost irreducible in the sense that it has only one {d}-dimensional component. Note that the algebraic geometry notion of genericity now coincides with the (qualitative) probabilistic notion of almost sureness: a (definable) property on {X} holds generically if and only if it holds almost surely with respect to the uniform measure on {X}.
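As a toy model of the uniform measure, consider an irreducible curve: its definable subsets are either finite or cofinite, and the uniform measure declares the former null and the latter full. A sketch with my own symbolic encoding of such subsets:

```python
def mu(E):
    """E is ("finite", S): the finite set S, or ("cofinite", S): complement of S."""
    kind, _ = E
    return 0 if kind == "finite" else 1   # null iff lower-dimensional (finite)

def union(E, F):
    """Union of two *disjoint* definable subsets of the curve."""
    (k1, s1), (k2, s2) = E, F
    if k1 == k2 == "finite":
        return ("finite", s1 | s2)
    if k1 == "finite":              # S1 union (complement of S2) = complement of (S2 - S1)
        return ("cofinite", s2 - s1)
    if k2 == "finite":
        return ("cofinite", s1 - s2)
    raise ValueError("two cofinite sets are never disjoint")

# The measure is irreducible (never takes the value I) and finitely additive:
E = ("finite", frozenset({1, 2}))
F = ("finite", frozenset({3}))
G = ("cofinite", frozenset({1, 2, 3, 4}))  # generic: the curve minus 4 points
assert mu(E) == 0 and mu(F) == 0 and mu(union(E, F)) == 0    # 0 + 0 ~ 0
assert mu(G) == 1 and mu(union(E, G)) == 1                   # 0 + 1 ~ 1
```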

If {X, Y} are definable sets, then the uniform measure on {X \times Y} is the product of the uniform measure on {X} and the uniform measure on {Y}; the proof of the Fubini-Tonelli type statement that justifies this may be found for instance in Lemma 13 of this paper of mine.

Example 2 If {X} is almost irreducible and {f: X \rightarrow Y} is a definable map, then the uniform measure {\mu_X} on {X} pushes forward to the uniform measure of some almost irreducible subset {Z} of {Y}; see e.g. Lemma A.8 of this previous paper of Breuillard, Green, and myself. Also, generic points in {Z} have fibres in {X} of dimension {\hbox{dim}(X)-\hbox{dim}(Z)}, just as one would expect from naive dimension counting. If {X} is not almost irreducible, the situation becomes a bit more complicated because the image of {f} on different components of {X} may have a different dimension, and so {f_* \mu_X} may become an average of uniform measures on sets of different dimension.

Exercise 1 Let {\mu} be a qualitative probability measure on a definable set {X}.

  • (i) Show that {\mu} is irreducible if and only if it is the uniform measure of some almost irreducible subset of {X}.
  • (ii) Show that {\mu} is the average of finitely many uniform measures if and only if there does not exist a countable family of disjoint subsets of {X} of positive measure. (Hint: greedily select disjoint varieties of positive {\mu}-measure, starting with zero-dimensional varieties (points) and then increasing the dimension.)

We now use the formalism of qualitative probability theory from the previous section, but always working within the definable category; thus we require the sample space {(\Omega, {\cal E}, {\bf P})} to also be a definable set, and all random variables to be definable maps (or at least generically definable maps), which is a stronger condition than measurability.

— 3. The group chunk theorem —

Random variables can interact very nicely with groups {G}, if they come equipped with an appropriate invariant measure. To illustrate this, let us first return to the classical setting of quantitative probability theory, working exclusively with finite groups to avoid all measurability issues.

Given a finite group {G}, let {{\bf g}_1,{\bf g}_2} be two elements of {G} chosen uniformly and independently at random, and then form their product {{\bf g}_3 := {\bf g}_1 \cdot {\bf g}_2}. This gives us a triple {({\bf g}_1,{\bf g}_2,{\bf g}_3)} of random variables taking values in {G}, which obey the following independence and determination properties:

  • (i) (Uniform distribution) For any {i \in \{1,2,3\}}, {{\bf g}_i} has the uniform distribution on {G}.
  • (ii) (Independence) For any distinct {i,j \in \{1,2,3\}}, {{\bf g}_i,{\bf g}_j} are independent.
  • (iii) (Determination) For any distinct {i,j,k \in \{1,2,3\}}, {{\bf g}_k} is determined by {{\bf g}_i,{\bf g}_j}.
  • (iv) (Associativity) After extending the sample space as necessary, one can locate additional random variables {{\bf g}_4,{\bf g}_5,{\bf g}_6} taking values in {G} such that {({\bf g}_3,{\bf g}_4,{\bf g}_6) \equiv ({\bf g}_2,{\bf g}_4,{\bf g}_5) \equiv ({\bf g}_1,{\bf g}_5,{\bf g}_6) \equiv ({\bf g}_1,{\bf g}_2,{\bf g}_3)} (see figure), and such that any other triple {({\bf g}_i,{\bf g}_j,{\bf g}_k)} for distinct {i,j,k \in \{1,\ldots,6\}} which is not a permutation of the four triples already mentioned is jointly independent.

Indeed, to see the associativity axiom, let {{\bf g}_4} be selected uniformly from {G} independently of {{\bf g}_1,{\bf g}_2}, and set {{\bf g}_5 := {\bf g}_2 {\bf g}_4} and {{\bf g}_6 := {\bf g}_1 {\bf g}_2 {\bf g}_4}. The associativity is depicted graphically in the figure below, in which three points connected by a line or curve indicate a dependence, while triples of points not joined by such a line or curve are jointly independent.

Figure 1
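For a finite group, axioms (i)-(iii) can be verified by exhaustively enumerating the joint law; here is a sketch for the (hypothetically chosen) cyclic group {{\bf Z}/5{\bf Z}}, where qualitative statements reduce to counting:

```python
from itertools import product
from collections import Counter

n = 5
G = range(n)
# Joint law of (g1, g2, g3 = g1 + g2) with g1, g2 uniform and independent:
# each of the 25 admissible triples is equally likely.
triples = [(g1, g2, (g1 + g2) % n) for g1, g2 in product(G, repeat=2)]

# (i) Uniform distribution: each coordinate takes every value equally often.
for i in range(3):
    assert Counter(t[i] for t in triples) == Counter({g: n for g in G})

# (ii) Independence: each pair of coordinates is uniform on G x G.
for i, j in [(0, 1), (0, 2), (1, 2)]:
    assert Counter((t[i], t[j]) for t in triples) == \
           Counter({p: 1 for p in product(G, repeat=2)})

# (iii) Determination: each coordinate is a function of the other two.
for i, j, k in [(1, 2, 0), (0, 2, 1), (0, 1, 2)]:
    for p in {(t[i], t[j]) for t in triples}:
        assert len({t[k] for t in triples if (t[i], t[j]) == p}) == 1
```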

In the converse direction, any triple {({\bf g}_1,{\bf g}_2,{\bf g}_3)} of random variables on a finite set {G} obeying the above axioms necessarily comes from an underlying group operation:

Proposition 1 (Probabilistic description of a finite group) Let {G} be a finite non-empty set, and let {{\bf g}_1,{\bf g}_2,{\bf g}_3} be random variables on {G} that obey axioms (i)-(iv). Then there exists a group structure {G = (G, 1, ()^{-1}, \cdot)} on {G} such that {{\bf g}_3 = {\bf g}_1 \cdot {\bf g}_2}.

Proof: By axiom (iii), we have {{\bf g}_3 = {\bf g}_1 \cdot {\bf g}_2} for some binary operation {\cdot: G \times G \rightarrow G}. From axiom (iv) we have

\displaystyle  ({\bf g}_1 \cdot {\bf g}_2) \cdot {\bf g}_4 = {\bf g}_3 \cdot {\bf g}_4 = {\bf g}_6 = {\bf g}_1 \cdot {\bf g}_5 = {\bf g}_1 \cdot ({\bf g}_2 \cdot {\bf g}_4)

almost surely (and hence surely, as {G} is finite); also from axioms (i), (iv) we see that {({\bf g}_1,{\bf g}_2,{\bf g}_4)} is uniformly distributed in {G^3}. We conclude that the binary operation {\cdot} is associative, thus {(x \cdot y) \cdot z = x \cdot (y \cdot z)} for all {x,y,z \in G}.

By axiom (iii), we see that for fixed {{\bf g}_1}, {{\bf g}_2} is determined by {{\bf g}_3} and vice versa; from axioms (i), (ii), this implies that for fixed {x \in G}, the map {\phi_x: y \mapsto x \cdot y} is a bijection from {G} to itself; similarly the map {\psi_x: y \mapsto y \cdot x} is a bijection from {G} to itself. Note also from associativity that {\phi_x} commutes with the right-action of {G} in the sense that {\phi_x( y \cdot z ) = \phi_x( y ) \cdot z} for all {x,y,z \in G}.

Conversely, given any bijection {\phi: G \rightarrow G} that commutes with the right-action in the sense that {\phi(y \cdot z) = \phi(y) \cdot z} for all {y,z \in G}, we claim that {\phi = \phi_x} for a unique {x \in G}. Indeed, from axioms (i)-(iii), we know that {{\bf g}_1} is determined by {{\bf g}_2,{\bf g}_3}, and so for any given {y \in G}, we may find {x \in G} such that {\phi(y) = x \cdot y}. If this holds for a single {y}, then it holds for all other {y} by associativity, since {\phi} commutes with the right action, and the map {z \mapsto y \cdot z} is a bijection. Thus we may identify {G} with the space of bijections {\phi: G \rightarrow G} that commute with the right-action. This is clearly a group, and the property {{\bf g}_3 = {\bf g}_1 \cdot {\bf g}_2} is then clear from construction. \Box
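The reconstruction step of the proof can be mimicked computationally: given only the set of admissible triples for {({\bf g}_1,{\bf g}_2,{\bf g}_3)}, one reads off the operation table from axiom (iii) and checks associativity and cancellation. A sketch (using triples that, unknown to the code, come from {{\bf Z}/5{\bf Z}}):

```python
from itertools import product

n = 5
G = range(n)
# The joint law, given only as a set of admissible triples.
triples = {(a, b, (a + b) % n) for a, b in product(G, repeat=2)}

# Axiom (iii): g3 is determined by (g1, g2), yielding an operation table.
op = {(a, b): c for (a, b, c) in triples}

# Associativity, as derived from axiom (iv) in the proof.
assert all(op[op[x, y], z] == op[x, op[y, z]]
           for x, y, z in product(G, repeat=3))

# Left and right multiplications are bijections (from axioms (i)-(iii)).
assert all(len({op[x, y] for y in G}) == n for x in G)
assert all(len({op[x, y] for x in G}) == n for y in G)
```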

We now see that a finite group, at least, may be described in terms of the probabilistic language of independence and determination. It is natural to ask whether similar results hold for infinite groups (with the slight modification that we now only expect {{\bf g}_3 = {\bf g}_1 \cdot {\bf g}_2} to hold almost surely rather than surely, as we will now have non-trivial events of probability zero). Here we run into the technical difficulty that many groups – such as non-compact Lie groups, or algebraic groups defined over fields other than {{\bf R}} or {{\bf C}} – are not naturally equipped with a probability measure with which to define the concept of uniform distribution. However, if one is working in the definable category, one can use the language of qualitative probability theory instead and obtain the same result.

A definable group {G} is a group {G = (G, 1, ()^{-1}, \cdot)} which is a definable set, such that the group operations {()^{-1}: G \rightarrow G} and {\cdot: G \times G \rightarrow G} are definable maps. This notion is very close to, but subtly different from, that of the more commonly used notion of an algebraic group; the latter is a bit stricter because the group has to now be an algebraic variety, and the group operations have to be regular maps and not just definable maps. However, the two notions are quite close to each other, particularly in characteristic zero when they become equivalent up to definable group isomorphism, although additional subtleties arise in positive characteristic {p} due to the existence of things like the inverse Frobenius automorphism {x \mapsto x^{1/p}}, which is definable but not regular, and which can be used to definably “twist” an algebraic group. We will not discuss these issues further here, but see this survey of Bouscaren and this article of van den Dries for further discussion.

Theorem 2 (Group chunk theorem) Let {G} be an almost irreducible definable set, and let {{\bf g}_1,{\bf g}_2,{\bf g}_3} be qualitative random variables taking values in {G}, obeying axioms (i)-(iv). Then, after passing from {G} to a generic subset, one may definably identify {G} with a generic subset of a definable group {\tilde G}, so that {{\bf g}_1 \cdot {\bf g}_2 = {\bf g}_3} almost surely. (To put it another way, there is a definable bijection {\iota} between a generic subset of {G} and a generic subset of {\tilde G}, such that {\iota({\bf g}_1) \cdot \iota({\bf g}_2) = \iota({\bf g}_3)} almost surely.)

This theorem is essentially due to Weil (although he worked instead in the category of algebraic varieties and algebraic groups, rather than definable sets and definable groups). It was extended to many other model-theoretic contexts (and in particular to stable theories) in the thesis of Hrushovski, although we will only focus on the classical algebraic geometry case here.

A typical example of a group chunk comes from taking a definable group and removing a lower dimensional subset of it, so that the group law is only generically defined (but this is still sufficient for defining {{\bf g}_1 \cdot {\bf g}_2} almost surely). For instance, one could let {G} be the {2 \times 2} non-singular matrices {\begin{pmatrix} a & b \\ 1 & d \end{pmatrix}}, with (generically defined) group law given by matrix multiplication followed by projective normalisation to force the lower left coordinate to be {1} (which is only defined if this coordinate does not vanish, of course). This can be viewed as an open dense subset of the projective special linear group {PSL_2(k)}, and the theorem is asserting that {G} can be “completed” to this group, which can be interpreted as a definable group. ({PSL_2(k)} can be viewed as an affine variety using, for instance, the adjoint representation. In any event, even projective varieties can be interpreted affinely in a definable fashion by the artificial device of breaking up the variety into finitely many pieces, which can all be fit into a sufficiently large affine space.)
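This example can be tested numerically; the sketch below (my parametrisation, storing only {(a,b,d)} for the matrix {\begin{pmatrix} a & b \\ 1 & d \end{pmatrix}}) implements the generically defined law "multiply, then rescale so that the lower-left entry is {1}" over the rationals and checks associativity at a few generic points:

```python
from fractions import Fraction as F

def mul(g, h):
    """Generic group law on triples (a, b, d) representing [[a, b], [1, d]]."""
    (a1, b1, d1), (a2, b2, d2) = g, h
    c = a2 + d1                  # lower-left entry of the matrix product
    if c == 0:
        raise ZeroDivisionError("law undefined at this non-generic pair")
    return ((a1 * a2 + b1) / c, (a1 * b2 + b1 * d2) / c, (b2 + d1 * d2) / c)

# Associativity holds at generic points (all intermediate lower-left entries
# nonzero); check it exactly over the rationals at a few sample points.
pts = [(F(2), F(3), F(5)), (F(1), F(4), F(2)), (F(3), F(1), F(7))]
for g in pts:
    for h in pts:
        for k in pts:
            assert mul(mul(g, h), k) == mul(g, mul(h, k))
```

The rescaling is harmless because the law really takes place in the projective group, where scalar multiples are identified; associativity can only fail at the non-generic pairs where the lower-left entry of the matrix product vanishes.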

We now prove Theorem 2. By axioms (i)-(iii), there is a generically defined, and definable, map {\cdot: G \times G \rightarrow G} such that {{\bf g}_1 \cdot {\bf g}_2 = {\bf g}_3} almost surely. From axiom (iii) we then conclude the following cancellation axioms:

  • (vii) (Left cancellation) For generic {(g,h) \in G \times G}, there is a unique {k} such that {k \cdot g = h}.
  • (viii) (Right cancellation) For generic {(g,h) \in G \times G}, there is a unique {k} such that {g \cdot k = h}.

The associativity axiom then also gives {(g \cdot h) \cdot k = g \cdot (h \cdot k)} for generic {(g,h,k) \in G^3}.

These axioms also imply the following variants:

  • (vii’) (Left cancellation) For generic {(g, k) \in G \times G}, {k} is the unique {k' \in G} such that {k \cdot g = k' \cdot g}.
  • (viii’) (Right cancellation) For generic {(g, k) \in G \times G}, {k} is the unique {k' \in G} such that {g \cdot k = g \cdot k'}.

Indeed, from axiom (vii), we have a generically defined map {(g,h) \mapsto (g,k)} on {G \times G}, where {k} is the unique element of {G} such that {k \cdot g = h}. This map is generically injective; as {G \times G} is essentially irreducible, this map has to be generically bijective also, which gives (vii’), and (viii’) is proven similarly. (We call a partially defined map {f: X \rightarrow Y} generically bijective if it can be made into a total bijection by refining {X, Y} to generic subsets.) We conclude that for generic {g \in G}, the maps {k \mapsto k \cdot g} and {k \mapsto g \cdot k} are generically bijective.

To obtain a genuine group from {G} rather than just a generic group, we perform a formal quotient construction, analogous to how the integers are formally constructed from the natural numbers (or the rationals from the integers). Define a formal pre-group element to be a definable subset {\Sigma} of {G \times G} with the following properties:

  • (ix) (Vertical line test) For generic {g \in G}, there is exactly one {h\in G} such that {(g,h) \in \Sigma}.
  • (ix’) (Vertical line test) For generic {(g,h) \in \Sigma}, {h} is the unique {h' \in G} such that {(g,h') \in \Sigma}.
  • (x) (Horizontal line test) For generic {h \in G}, there is exactly one {g\in G} such that {(g,h) \in \Sigma}.
  • (x’) (Horizontal line test) For generic {(g,h) \in \Sigma}, {g} is the unique {g' \in G} such that {(g',h) \in \Sigma}.
  • (xi) (Translation invariance) For generic {((g,h),k) \in \Sigma \times G}, {(g \cdot k, h \cdot k)} lies in {\Sigma}.

The axioms (ix)-(x’) are asserting that {\Sigma} is a generic bijection. (Here and in the sequel we adopt the convention that a mathematical statement is automatically considered false if one or more of its terms are undefined; for instance, {(g \cdot k, h \cdot k) \in \Sigma} is only true when {g \cdot k} and {h \cdot k} are well-defined. Similarly, when using set-builder notation, we only include those elements in the set which are well-defined; for instance, in the set {\{ (g \cdot k, h \cdot k): k \in G\}}, {k} is implicitly restricted to those values for which {g \cdot k} and {h \cdot k} are well-defined.)

Call two formal pre-group elements {\Sigma, \Sigma'} equivalent if they have a common generic subset, and then define a formal group element to be an equivalence class {[\Sigma]} of formal pre-group elements {\Sigma}, and let {\tilde G} be the space of formal group elements. At this point, we have to deal with the technical problem that {\tilde G} is not obviously a definable set. However, observe that if {\Sigma} is a formal pre-group element, then we have a definable generic bijection {g \mapsto (g,h)} from {G} to {\Sigma}, which makes {\Sigma} essentially irreducible. If we then define {\overline{\Sigma}} to be the Zariski closure of the top dimensional component of {\Sigma} (i.e. the support of the uniform measure on {\Sigma}), then {\overline{\Sigma}} is an irreducible closed variety, which depends only on the equivalence class {[\Sigma]} of {\Sigma}, with inequivalent pre-group elements giving distinct irreducible closed varieties. Finally, one can describe {\overline{\Sigma}} in terms of a generic element {(g,h)} of {\Sigma}, as being the closure of the top-dimensional component of {\{ (g \cdot k, h \cdot k): k\in G \}}, and generic elements {(g,h) \in G \times G} produce such an object. As such, if we let {\tilde G} be the set of all {\overline{\Sigma}}, then {\tilde G} is in one-to-one correspondence with the space of formal group elements, and may be parameterised as a definable set using standard algebraic geometry tools (e.g. Chow coordinates, the Hilbert scheme, or elimination of imaginaries), as being the image under a generically definable map {(g,h) \mapsto \overline{\Sigma}} of {G \times G} with fibres of dimension {\hbox{dim}(G)}. In particular, as {G \times G} is essentially irreducible with dimension {2\hbox{dim}(G)}, {\tilde G} is essentially irreducible with dimension {\hbox{dim}(G)}.
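The formal quotient strategy here mirrors the classical construction of the integers from the naturals mentioned above. As a minimal Python sketch of that classical analogy (illustrative only, and not the construction in the text): an integer is an equivalence class of pairs {(a,b)} of naturals, read as {a-b}, with {(a,b) \sim (c,d)} precisely when {a+d = b+c}.

```python
def normalize(p):
    """Canonical representative of the class of (a, b), read as a - b."""
    a, b = p
    m = min(a, b)
    return (a - m, b - m)

def equivalent(p, q):
    # (a, b) ~ (c, d)  iff  a + d == b + c
    return p[0] + q[1] == p[1] + q[0]

def add(p, q):
    # Addition descends to equivalence classes.
    return normalize((p[0] + q[0], p[1] + q[1]))

def neg(p):
    # Inversion is the "reflection" (a, b) -> (b, a), much as the
    # inverse of a formal pre-group element is its reflection.
    return (p[1], p[0])

assert equivalent((2, 5), (0, 3))          # both represent -3
assert add((2, 5), (7, 3)) == (1, 0)       # (-3) + 4 = 1
assert add((2, 5), neg((2, 5))) == (0, 0)  # x + (-x) = 0
```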

We define the identity element of {\tilde G} to be the equivalence class of the diagonal {\{ (g,g): g \in G \}}, and define the inverse of the equivalence class of a formal pre-group element {\Sigma} to be the equivalence class of the reflection {\{ (h,g): (g,h) \in\Sigma \}} of {\Sigma}. As for the group law, we use generic composition: given two formal pre-group elements {\Sigma, \Sigma'}, we define the composition {\Sigma \cdot\Sigma'} to be the set of all pairs {(g,k)} such that there exists {h \in G} for which {h} is the unique element with {(h,k)\in\Sigma'}, and {h} is also the unique element with {(g,h) \in \Sigma}. One can check (somewhat tediously) that this descends to a well-defined operation on {\tilde G} that gives it the structure of a definable group.
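A toy additive analogue of this composition may help fix ideas: take the chunk {\{0,\dots,N-1\}} inside {{\bf Z}} with its partially defined addition. (This is only a finite caricature added for illustration; it is not one of the definable chunks considered in the text.) The pre-group element attached to “translate by {t}” is the graph {\{(h+t,h)\}} cut down to the chunk, and composing graphs as above adds the translation amounts, producing elements, such as negative translations, that are invisible to the chunk's own partial law:

```python
N = 10
G = range(N)  # the "chunk": an interval of Z with partially defined addition

def sigma(t):
    """Pre-group element attached to the translation k -> k + t:
    the pairs (g, h) in G x G with g = h + t."""
    return {(h + t, h) for h in G if 0 <= h + t < N}

def compose(S, Sp):
    """Generic composition of two pre-group elements: pairs (g, k)
    linked by a middle element h with (g, h) in S and (h, k) in Sp."""
    return {(g, k) for (g, h) in S for (hp, k) in Sp if h == hp}

# "Translate by 3" composed with "translate by -5" is, generically,
# "translate by -2" -- a group element the chunk itself cannot express.
assert compose(sigma(3), sigma(-5)) <= sigma(-2)
assert compose(sigma(3), sigma(-5))  # non-empty, i.e. generically defined
```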

Next, for generic {g \in G}, the set {\{ (g \cdot k, k): k \in G \}} is a formal pre-group element, giving a generically defined map {\iota: G \rightarrow \tilde G}. One can verify that this map is generically definable, generically injective, and generically a homomorphism. The remaining task is to verify that {\tilde G} has the same dimension as {G}, as this together with essential irreducibility and generic injectivity gives generic bijectivity thanks to dimension counting. But for every formal pre-group element {\Sigma}, we see that for generic {g \in G}, there is a unique {h \in G} with {(g,h) \in \Sigma}, and furthermore that {(g \cdot k, h \cdot k) \in \Sigma} for generic {k \in G}; in particular, {\Sigma} is equivalent to {\{ (g \cdot k, h\cdot k):k\in G \}} and can thus be recovered from {(g,h)}; conversely, generic {(g,h) \in G \times G} gives rise to a formal group element by this construction. This sets up a generically bijective map {(\overline{\Sigma}, g) \mapsto (g,h)} from {\tilde G\times G} to {G \times G}, which shows that {\tilde G} has the same dimension as {G}, as required. This concludes the proof of Theorem 2.

— 4. The group configuration theorem —

We now discuss a variant of the group chunk theorem that characterises group actions, known as the group configuration theorem. Again, to motivate matters we start with the quantitative probabilistic setting in the finite case. If {G} is a finite group acting on a finite set {X}, and {{\bf g}_1,{\bf g}_2} are chosen independently and uniformly at random from {G}, and {{\bf x}_3} uniformly from {X} (independently of {{\bf g}_1,{\bf g}_2}), and then one defines {{\bf g}_3 := {\bf g}_1^{-1} {\bf g}_2^{-1}}, {{\bf x}_1:= {\bf g}_2 \cdot {\bf x}_3}, and {{\bf x}_2:= {\bf g}_1^{-1} \cdot {\bf x}_3} (or, more symmetrically, we have the constraints {{\bf g}_3 {\bf g}_2 {\bf g}_1 = 1}, {{\bf g}_1 \cdot {\bf x}_2 = {\bf x}_3}, {{\bf g}_2 \cdot {\bf x}_3 = {\bf x}_1}, and {{\bf g}_3 \cdot {\bf x}_1 = {\bf x}_2}), then we observe the following independence and determination axioms (setting {G_i := G} and {X_i := X} for {i=1,2,3}):

  • (i’) (Uniform distribution) For any {i \in \{1,2,3\}}, {{\bf g}_i} has the uniform distribution on {G_i}, and {{\bf x}_i} has the uniform distribution on {X_i}.
  • (ii’) (Independence) Any two of the random variables {{\bf g}_1,{\bf g}_2,{\bf g}_3,{\bf x}_1,{\bf x}_2,{\bf x}_3} are independent.
  • (iii’) (Determination) For any distinct {i,j,k \in \{1,2,3\}}, {{\bf g}_k} is determined by {{\bf g}_i,{\bf g}_j}, and {{\bf x}_k} is determined by {{\bf g}_i} and {{\bf x}_j}.
  • (v) (More independence) Any three of the random variables {{\bf g}_1,{\bf g}_2,{\bf g}_3,{\bf x}_1,{\bf x}_2,{\bf x}_3} that are not of the form {\{{\bf g}_i,{\bf g}_j,{\bf g}_k\}} or {\{{\bf g}_i,{\bf x}_j,{\bf x}_k\}} for distinct {i,j,k} are independent.

Axiom (ii’) is in fact a consequence of axiom (v), but we add it for emphasis. We refer to a sextet {({\bf g}_1,{\bf g}_2,{\bf g}_3,{\bf x}_1,{\bf x}_2,{\bf x}_3)} obeying the above axioms as a group configuration. It can be described graphically by the picture below, in which the collinearity of three random variables indicates a dependence between them, with non-collinear triples of variables being independent.

Figure 2
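For a small sanity check of these axioms, the following Python sketch (illustrative, not from the text) takes {G = X = {\bf Z}/5} acting on itself by translation, enumerates all configurations, and verifies some of the independence and dependence assertions by counting:

```python
from itertools import product
from collections import Counter

n = 5  # the cyclic group Z/5 acting on itself by translation
configs = []
for g1, g2, x3 in product(range(n), repeat=3):
    g3 = (-g1 - g2) % n          # g3 g2 g1 = 1, written additively
    x1 = (g2 + x3) % n           # g2 . x3 = x1
    x2 = (x3 - g1) % n           # g1 . x2 = x3
    configs.append((g1, g2, g3, x1, x2, x3))

def is_uniform(indices):
    """True iff the chosen coordinates are jointly uniform on their full
    range, i.e. the corresponding random variables are independent."""
    counts = Counter(tuple(c[i] for i in indices) for c in configs)
    return len(counts) == n ** len(indices) and len(set(counts.values())) == 1

# Coordinates: 0..2 are g1,g2,g3 and 3..5 are x1,x2,x3.
assert is_uniform([0, 1])         # g1, g2 independent
assert is_uniform([3, 4])         # x1, x2 independent
assert is_uniform([0, 3, 5])      # g1, x1, x3: an allowed (non-collinear) triple
assert not is_uniform([0, 4, 5])  # g1, x2, x3: a collinear triple is dependent
```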

The group configuration theorem concerns a generalisation of the above situation, in which the determination properties in axiom (iii’) are relaxed to the weaker properties of algebraicity. (In additive combinatorics, this would correspond to moving from a “99%” situation in which algebraic structure is present almost everywhere, to a “1%” setting in which it is only present a positive fraction of the time.) We say that one random variable {X} taking values in one definable set {R} is algebraic with respect to another random variable {Y} taking values in another definable set {R'} if there is a relation {\Sigma \subset R \times R'} such that {(X,Y) \in \Sigma} almost surely, and such that for each {y \in R'} there are only finitely many {x \in R} such that {(x,y) \in \Sigma}. More informally, {X} is algebraic over {Y} if {Y} determines {X} up to a finite ambiguity. For instance, if {X} is uniformly distributed in {k}, then {X} is algebraic with respect to any non-constant polynomial {P(X)} of {X}.

Two random variables {X,Y} are said to be interalgebraic if they are each algebraic over each other. For instance, if {X} is uniformly distributed in the affine line {k}, and {P, Q: k \rightarrow k} are two non-constant polynomials, then {P(X)} and {Q(X)} are interalgebraic. Note that interalgebraicity is an equivalence relation.
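These notions can be illustrated over a finite field, used here as a finite stand-in for the algebraically closed field {k} (the prime and the polynomials below are arbitrary choices made for this sketch): a value of a degree-{3} polynomial pins its argument down to at most {3} candidates, and hence pins down any other polynomial of the argument to at most {3} candidates as well.

```python
from collections import defaultdict

p = 101  # an arbitrary prime; F_p serves as a finite stand-in for k

def P(x):
    return (x**3 + 2*x + 1) % p

def Q(x):
    return (x**2 + 5) % p

# X is algebraic over P(X): each fibre of P has at most deg P = 3 points,
# since a nonzero cubic has at most 3 roots in a field.
fibres = defaultdict(list)
for x in range(p):
    fibres[P(x)].append(x)
assert max(len(f) for f in fibres.values()) <= 3

# Consequently a value of P(X) leaves at most 3 candidates for X, and
# hence at most 3 candidates for Q(X) -- the finite ambiguity in the
# interalgebraicity of P(X) and Q(X).
for xs in fibres.values():
    assert len({Q(x) for x in xs}) <= 3
```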

Now we can give the group configuration theorem:

Theorem 3 (Group configuration theorem) Let {G_1,G_2,G_3,X_1,X_2,X_3} be almost irreducible definable sets, and let {{\bf g}_1,{\bf g}_2,{\bf g}_3,{\bf x}_1,{\bf x}_2,{\bf x}_3} be random variables in {G_1,G_2,G_3,X_1,X_2,X_3} respectively on a qualitative probability space obeying the following axioms:

  • (i’) (Uniform distribution) For any {i \in \{1,2,3\}}, {{\bf g}_i} has the uniform distribution on {G_i}, and {{\bf x}_i} has the uniform distribution on {X_i}.
  • (ii’) (Independence) Any two of the random variables {{\bf g}_1,{\bf g}_2,{\bf g}_3,{\bf x}_1,{\bf x}_2,{\bf x}_3} are independent.
  • (iii”) (Interalgebraicity) For any distinct {i,j,k \in \{1,2,3\}}, {{\bf g}_k} is algebraic over {{\bf g}_i,{\bf g}_j}, and {{\bf x}_k} is algebraic over {{\bf g}_i} and {{\bf x}_j}.
  • (v) (More independence) Any three of the random variables {{\bf g}_1,{\bf g}_2,{\bf g}_3,{\bf x}_1,{\bf x}_2,{\bf x}_3} that are not of the form {\{{\bf g}_i,{\bf g}_j,{\bf g}_k\}} or {\{{\bf g}_i,{\bf x}_j,{\bf x}_k\}} for distinct {i,j,k} are independent.
  • (vi) (Irreducibility) The joint random variable {({\bf g}_1,{\bf g}_2,{\bf g}_3,{\bf x}_1,{\bf x}_2,{\bf x}_3)} has an irreducible distribution.

Then, after extending the (qualitative) probability space if necessary, there exists a definable group {G} that acts definably on a definable space {X}, as well as qualitative random variables {\tilde {\bf g}_i} and {\tilde {\bf x}_i} uniformly distributed in {G} and {X} respectively for {i=1,2,3}, with {\tilde {\bf g}_i} algebraic over {{\bf g}_i} and {\tilde {\bf x}_i} interalgebraic with {{\bf x}_i} for {i=1,2,3}, such that

\displaystyle  \tilde {\bf g}_3 \tilde {\bf g}_2 \tilde {\bf g}_1 = 1, \tilde {\bf g}_1 \cdot \tilde {\bf x}_2 = \tilde {\bf x}_3, \tilde {\bf g}_2 \cdot \tilde {\bf x}_3 = \tilde {\bf x}_1, \tilde {\bf g}_3 \cdot \tilde {\bf x}_1 = \tilde {\bf x}_2 \ \ \ \ \ (1)

almost surely, and with the {\tilde {\bf g}_i, \tilde {\bf x}_i} obeying the same independence and irreducibility hypotheses as the {{\bf g}_i, {\bf x}_i}.

This theorem was first established by Zilber, who worked in the more general setting of strongly minimal theories, and then strengthened significantly in the thesis of Hrushovski, who treated the case of stable theories. (However, in these more general situations, the space {X} is restricted to be “one-dimensional” for technical reasons.) We give a proof below the fold, following an exposition of Hrushovski’s method by Bouscaren. (See also a proof of a result very close to the above theorem, avoiding model-theoretic language, by Elekes and Szabo.) The irreducibility axiom (vi) can be relaxed, but then the conclusion becomes more complicated (there might not be a single group structure or group action involved, but rather an average of such actions).

The group configuration theorem has a number of applications to combinatorics; roughly speaking, this theorem is to “approximately associative” definable maps as Freiman’s theorem is to sets of small doubling. The aforementioned paper of Elekes and Szabo is one example of the configuration theorem in action; another is Theorem 41 of this paper of mine, which I proved by a different method (based on Riemann surface arguments), but for which a stronger statement has since been proven using the group configuration theorem by Hrushovski (private communication).

Now we prove Theorem 3. There are two main phases of the argument. The first phase involves upgrading several of the algebraicity hypotheses in axiom (iii”) to determination, by replacing several of the {{\bf g}_1,{\bf g}_2,{\bf g}_3,{\bf x}_1,{\bf x}_2,{\bf x}_3} using algebraic changes of variable. Once this is done, the second phase consists of applying a modification of the proof of the group chunk theorem to locate the definable group {G} (and also the definable space {X} that {G} acts on), and to connect this action to the group configuration.

We begin with a simple dimension counting observation. By the axioms, the random variable {({\bf g}_1,{\bf g}_2,{\bf g}_3)} has an irreducible distribution and is thus uniformly distributed on some almost irreducible definable set {V}, with {{\bf g}_3} algebraic over {({\bf g}_1,{\bf g}_2)}, which is uniformly distributed on {G_1 \times G_2}. Taking dimensions, we conclude that {\hbox{dim}(V) = \hbox{dim}(G_1)+\hbox{dim}(G_2)}. Similarly for permutations. This implies that

\displaystyle  \hbox{dim}(G_1)=\hbox{dim}(G_2)=\hbox{dim}(G_3) = d_G

for some natural number {d_G}; a similar argument using the {{\bf g}_i,{\bf x}_j,{\bf x}_k} triples shows that

\displaystyle  \hbox{dim}(X_1)=\hbox{dim}(X_2)=\hbox{dim}(X_3) = d_X

for some natural number {d_X}. Next, since {({\bf g}_1,{\bf g}_2,{\bf x}_3)} is uniformly distributed on {G_1 \times G_2 \times X_3}, and the other three variables {{\bf g}_3,{\bf x}_1,{\bf x}_2} are algebraic over these variables, we see that the tuple {({\bf g}_1,{\bf g}_2,{\bf g}_3,{\bf x}_1,{\bf x}_2,{\bf x}_3)} is uniformly distributed on some almost irreducible definable set of dimension {2d_G+d_X}.

We now begin the first phase. Currently, by axiom (iii”), {{\bf x}_2} is algebraic over {{\bf g}_1,{\bf x}_3}. We now use further dimension counting to upgrade this algebraicity relationship to determination, basically by removing some information from {{\bf x}_2}.

Proposition 4 Let the assumptions be as in Theorem 3. Then there exists a random variable {{\bf x}''_2} which is interalgebraic with {{\bf x}_2} and uniformly distributed in some almost irreducible set {X''_2}, such that {{\bf x}''_2} is determined by {{\bf g}_1,{\bf x}_3}.

Proof: Let {\Sigma} be the support of {({\bf g}_1,{\bf g}_2,{\bf g}_3,{\bf x}_1,{\bf x}_2,{\bf x}_3)}, and let {\Sigma_2} be the projection of {\Sigma} onto {X_1 \times X_3 \times G_1 \times G_2 \times G_3}; then {\Sigma_2} is a closed variety, and as {{\bf x}_2} is algebraic over {({\bf g}_1,{\bf g}_2,{\bf g}_3,{\bf x}_1,{\bf x}_3)}, {\Sigma} is generically finite over {\Sigma_2}. In particular, {\Sigma_2} also has dimension {2d_G + d_X}. We then form the pullback (or base change) {\Sigma_2^2 := \Sigma \times_{\Sigma_2} \Sigma} of two copies of {\Sigma} over (a generic subset of) {\Sigma_2}; we view {\Sigma_2^2} as a subset of {G_1 \times G_2 \times G_3 \times X_1 \times X_2 \times X_2 \times X_3}. This is a definable set, but is not necessarily almost irreducible.

Now consider the projection {S} of {\Sigma_2^2} to {X_2 \times X_2}. The set {S} contains the diagonal {\{(x_2,x_2): x_2 \in X_2 \}} and thus has dimension at least {d_X}. We claim that {S} in fact has dimension exactly {d_X}. Indeed, suppose this were not the case; then {S} would contain an irreducible variety {S'} of dimension {d_X+r} for some positive {r}. Now observe that as {{\bf x}_2} is algebraic over {{\bf g}_1,{\bf x}_3}, the projection of {\Sigma} to {G_1 \times X_3 \times X_2} is generically finite over {G_1 \times X_3}, which has dimension {d_G+d_X}; taking pullback with itself, we conclude that the projection of {\Sigma_2^2} to {G_1 \times X_3 \times X_2 \times X_2} also has dimension {d_G+d_X}. Thus, over a generic point in {S' \subset X_2 \times X_2}, the fibre of {\Sigma_2^2} projected to {G_1 \times X_3} has dimension at most {d_G-r}. Similarly, the projection of this fibre to {G_3 \times X_1} has dimension at most {d_G-r}. Since {{\bf g}_2} is algebraic over {{\bf g}_1,{\bf g}_3}, we conclude that the generic fibre of {\Sigma_2^2} over a point in {S'} has total dimension at most {d_G-r + d_G-r}, so that the preimage of {S'} in {\Sigma_2^2} has dimension at most

\displaystyle  (d_X+r) + (d_G-r) + (d_G-r) < d_X + 2d_G

and is thus a lower-dimensional component of {\Sigma_2^2} (or of {\Sigma} after projecting to either of the two copies of {\Sigma}). Thus, if we pass to a suitable generic subset of {\Sigma}, the projection {S} of {\Sigma_2^2} to {X_2 \times X_2} has dimension {d_X}. Passing to a further generic subset if necessary, we may assume that {S} is algebraic in the sense that any horizontal or vertical line in {X_2 \times X_2} meets {S} in at most finitely many points. From the Noetherian property, we see that there is in fact a uniform upper bound {M} on how many such points can lie on a line (this is basically the degree of {S}).

We now define the random variable {{\bf x}''_2} to be the set of all {y_2} with {({\bf x}_2,y_2) \in S}, such that {({\bf g}_1,y_2,{\bf x}_3)} lies in the projection of (the generic portion of {\Sigma} we are working with) to {G_1 \times X_2 \times X_3}. By the above discussion, this is a finite subset of {X_2}, and the set of all such possible {x''_2} can be parameterised in a definable way (indeed, it lies inside the {m}-fold powers of {S} over {X_2} for {m=1,\ldots,M}), and is interalgebraic with {{\bf x}_2}. By construction, {{\bf x}''_2} is also determined by {({\bf g}_1,{\bf x}_3)}; as the latter is uniformly distributed on some almost irreducible set, {{\bf x}''_2} is also, and the claim follows. \Box

A similar argument provides a random variable {{\bf x}''_1} interalgebraic with {{\bf x}_1} such that {{\bf x}''_1} is determined by {{\bf g}_2,{\bf x}_3}. By replacing {{\bf x}_1,{\bf x}_2,X_1,X_2} with {{\bf x}''_1, {\bf x}''_2,X''_1,X''_2} respectively, and checking that none of the axioms (i’), (ii’), (iii”), (v), (vi) are destroyed by this replacement, we see that we may reduce without loss of generality to the case in which we have the additional axiom

  • (iii”‘) {{\bf x}_1} is determined by {{\bf g}_2,{\bf x}_3}, and {{\bf x}_2} is determined by {{\bf g}_1,{\bf x}_3}.

Now we turn to the task of making {{\bf x}_3} determined both by {{\bf g}_1,{\bf x}_2} and by {{\bf g}_2,{\bf x}_1}. We are unable to effectively utilise (suitable permutations of) Proposition 4 here, because any replacement of {{\bf x}_3} by a random variable with less information content will likely destroy axiom (iii”‘). However, we can at least construct a random variable {{\bf x}'_3} interalgebraic with {{\bf x}_3} that is determined by the joint random variable {({\bf g}_1,{\bf x}_2,{\bf g}_2,{\bf x}_1)}. Indeed, the support {\Gamma} of {({\bf g}_1,{\bf x}_2,{\bf g}_2,{\bf x}_1,{\bf x}_3)} is generically finite over {({\bf g}_1,{\bf x}_2,{\bf g}_2,{\bf x}_1)}, and by repeating the dimension counting arguments from Proposition 4, we see that the projection {\tilde S} of the pullback {\Gamma \times_{G_1\times X_2 \times G_2 \times X_1} \Gamma} to {X_3 \times X_3} has dimension at most {d_X}, and so has finite fibres after passing to a generic subset. If we then set {{\bf x}'_3} to be the fibre of {\Gamma} over {({\bf g}_1,{\bf x}_2,{\bf g}_2,{\bf x}_1)}, we conclude as before that {{\bf x}'_3} is interalgebraic with {{\bf x}_3}, and is clearly determined by {({\bf g}_1,{\bf x}_2,{\bf g}_2,{\bf x}_1)}. Also, each element of {{\bf x}'_3}, together with {{\bf g}_1}, generically determines {{\bf x}_2} by axiom (iii”‘), and hence {{\bf x}_2} is determined by {{\bf g}_1, {\bf x}'_3}; similarly {{\bf x}_1} is determined by {{\bf g}_2, {\bf x}'_3}. Thus by replacing {{\bf x}_3} with {{\bf x}'_3}, we may impose the additional axiom

  • (vii) {{\bf x}_3} is determined by {{\bf g}_1, {\bf g}_2,{\bf x}_1, {\bf x}_2}

while retaining all previous axioms.

Now we perform the following “doubling” trick, creating some new random variables by extending the probability space. As before, let {\Sigma \subset G_1 \times G_2 \times G_3 \times X_1 \times X_2 \times X_3} be the support of {({\bf g}_1,{\bf g}_2,{\bf g}_3,{\bf x}_1,{\bf x}_2,{\bf x}_3)}. As {({\bf g}_2, {\bf g}_3, {\bf x}_1)} is uniformly distributed in {G_2 \times G_3 \times X_1}, we see that for generic {(g_1,g_2,g_3,x_1,x_2,x_3) \in \Sigma} and generic {g'_3 \in G_3}, there is a non-zero finite number of tuples {(g'_1, x'_2) \in G_1 \times X_2} such that {(g'_1,g_2,g'_3,x_1,x'_2,x_3) \in \Sigma}. Similarly, there is a non-zero finite number of tuples {(g'_2,x'_1)} such that {(g_1,g'_2,g'_3,x'_1, x_2, x_3) \in \Sigma}. Thus, for generic {g'_3}, if we let {\Lambda_{g'_3}} be the set of all tuples

\displaystyle  (g_1,g'_1,g_2,g'_2,g_3,x_1,x'_1,x_2,x'_2,x_3)

such that

\displaystyle  (g_1,g_2,g_3,x_1,x_2,x_3), (g'_1,g_2,g'_3,x_1,x'_2,x_3), (g_1,g'_2,g'_3,x'_1,x_2,x_3) \in \Sigma

(see figure), then {\Lambda_{g'_3}} is a generically finite cover of {\Sigma} (projecting onto the {(g_1,g_2,g_3,x_1,x_2,x_3)} coordinates). Thus, if we perform a base change of the probability space {\Omega} (which we view as lying over {\Sigma}) to {\Lambda_{g'_3}}, we may now create random variables

\displaystyle  ({\bf g}_1,{\bf g}'_1,{\bf g}_2,{\bf g}'_2,{\bf g}_3,{\bf x}_1,{\bf x}'_1,{\bf x}_2,{\bf x}'_2,{\bf x}_3) \in \Lambda_{g'_3}

with the {{\bf g}_i, {\bf x}_i} being an extension of the previous random variables of the same name.

Figure 3

Since {{\bf x}_2} is algebraic over {{\bf x}_1, {\bf g}_3}, we see (as {g'_3} is generic and deterministic) that {{\bf x}'_2} is algebraic over {{\bf x}_1}. Similarly {{\bf x}'_1} is algebraic over {{\bf x}_2}, and also {{\bf g}'_1} is algebraic over {{\bf g}_2} and {{\bf g}'_2} is algebraic over {{\bf g}_1}. From (iii”‘) we also see that {{\bf x}'_1} is determined by {{\bf g}'_2} and {{\bf x}_3}, and {{\bf x}'_2} is determined by {{\bf g}'_1} and {{\bf x}_3}. Finally from (vii) we see that {{\bf g}'_1, {\bf g}_2, {\bf x}_1, {\bf x}'_2} determine {{\bf x}_3}, and similarly {{\bf g}_1, {\bf g}'_2, {\bf x}'_1, {\bf x}_2} determine {{\bf x}_3}. Thus if we set

\displaystyle  \tilde {\bf g}_1 := ({\bf g}_1, {\bf g}'_2)

\displaystyle  \tilde {\bf g}_2 := ({\bf g}_2, {\bf g}'_1)

\displaystyle  \tilde {\bf g}_3 := {\bf g}_3

\displaystyle  \tilde {\bf x}_1 := ({\bf x}_1, {\bf x}'_2)

\displaystyle  \tilde {\bf x}_2 := ({\bf x}_2, {\bf x}'_1)

\displaystyle  \tilde {\bf x}_3 := {\bf x}_3

then we see that

  • {\tilde {\bf g}_i} is interalgebraic with {{\bf g}_i} for {i=1,2,3};
  • {\tilde {\bf x}_i} is interalgebraic with {{\bf x}_i} for {i=1,2,3};
  • {\tilde {\bf x}_3} is determined by {\tilde {\bf g}_1, \tilde {\bf x}_2};
  • {\tilde {\bf x}_3} is determined by {\tilde {\bf g}_2, \tilde {\bf x}_1};
  • {\tilde {\bf x}_2} is determined by {\tilde {\bf g}_1, \tilde {\bf x}_3};
  • {\tilde {\bf x}_1} is determined by {\tilde {\bf g}_2, \tilde {\bf x}_3}.

Thus, by replacing {{\bf g}_i} and {{\bf x}_i} with {\tilde {\bf g}_i} and {\tilde {\bf x}_i}, we may now obtain the additional axiom

  • (vii’) {{\bf x}_3} is determined by {{\bf g}_1, {\bf x}_2}, and also {{\bf x}_3} is determined by {{\bf g}_2, {\bf x}_1}.

We now have enough determination relations to begin the second phase of the argument, in which the arguments used to prove the group chunk theorem may be applied. Observe that {{\bf g}_1, {\bf x}_2} determine {{\bf x}_3}, and {{\bf g}_1,{\bf x}_3} determine {{\bf x}_2}. Thus, for generic {g_1 \in G_1}, we have a generically bijective definable map {\phi_{g_1}: X_2 \rightarrow X_3} such that

\displaystyle  \phi_{{\bf g}_1}( {\bf x}_2 ) = {\bf x}_3

almost surely, with {\phi_{g_1}} also depending definably on {g_1}. Similarly, for generic {g_2 \in G_2}, we have a generically bijective definable map {\psi_{g_2}: X_1 \rightarrow X_3} such that

\displaystyle  \psi_{{\bf g}_2}( {\bf x}_1 ) = {\bf x}_3

almost surely, with {\psi_{g_2}} also depending definably on {g_2}.

We now relate the {\phi_{g_1}} and {\psi_{g_2}} to each other:

Lemma 5 After extending the probability space if necessary, there exist random variables {{\bf g}'_1, {\bf g}'_2} uniformly distributed in {G_1,G_2} respectively, such that almost surely the identity

\displaystyle  \phi_{{\bf g}_1} \circ \phi_{{\bf g}'_1}^{-1} = \psi_{{\bf g}_2} \circ \psi_{{\bf g}'_2}^{-1} \ \ \ \ \ (2)

holds generically. Furthermore, any three of the {{\bf g}_1, {\bf g}'_1, {\bf g}_2, {\bf g}'_2} are independent, with the fourth being algebraic over these three.

Proof: Let {\Sigma'} be the generic subset of {\Sigma} consisting of those {(g_1,g_2,g_3,x_1,x_2,x_3)} such that

\displaystyle  \phi_{g_1}(x_2) = x_3; \quad \phi_{g_1}^{-1}(x_3) = x_2; \quad \psi_{g_2}(x_1) = x_3; \quad \psi_{g_2}^{-1}(x_3) = x_1 \ \ \ \ \ (3)

(recall from our conventions that these statements implicitly require that all expressions be well-defined, thus for instance {x_3} must lie in the domain of {\phi_{g_1}^{-1}} for the second statement to be true). The set {\Sigma'} surjects onto a subset of {G_3 \times X_1 \times X_2} of dimension {d_G + d_X} (because {{\bf x}_2} is algebraic over {({\bf g}_3, {\bf x}_1)}, which is uniformly distributed over {G_3 \times X_1}), so the generic fibres have dimension {(2d_G+d_X) - (d_G+d_X) = d_G}. If we let {\Xi := \Sigma' \times_{G_3\times X_1 \times X_2} \Sigma'}, then {\Xi} thus has dimension {(d_G+d_X) + d_G + d_G = 3d_G + d_X}. We view {\Xi} as a subset of {G_1^2 \times G_2^2 \times G_3 \times X_1 \times X_2 \times X_3^2} and parameterise it as

\displaystyle  (g_1,g'_1,g_2,g'_2,g_3,x_1,x_2,x_3,x_3').

By a base change, we may then find a set of random variables

\displaystyle  ({\bf g}_1,{\bf g}'_1,{\bf g}_2,{\bf g}'_2,{\bf g}_3,{\bf x}_1,{\bf x}_2,{\bf x}_3,{\bf x}_3')

uniformly distributed in {\Xi}, which restricts to the existing tuple {({\bf g}_1,{\bf g}_2,{\bf g}_3,{\bf x}_1,{\bf x}_2,{\bf x}_3)} of random variables.

Figure 4

By construction, one has

\displaystyle  \phi_{{\bf g}_1} \circ \phi_{{\bf g}'_1}^{-1}({\bf x}'_3) = {\bf x}_3 = \psi_{{\bf g}_2} \circ \psi_{{\bf g}'_2}^{-1}({\bf x}'_3)

almost surely. On the other hand, the support of {({\bf g}_1, {\bf g}_2, {\bf g}'_1, {\bf g}'_2, {\bf g}_3)} has dimension at most {3d_G} (because {{\bf g}_3, {\bf g}'_2} are algebraic over {{\bf g}_1, {\bf g}_2,{\bf g}'_1}) and so for generic choices of these random variables, the set of possible {({\bf x}_1, {\bf x}_2, {\bf x}_3, {\bf x}'_3)} has dimension at least {(3d_G+d_X)-3d_G=d_X}; since any one of these variables is algebraic over any other (once the {({\bf g}_1, {\bf g}_2, {\bf g}'_1, {\bf g}'_2, {\bf g}_3)} are fixed), we conclude that {{\bf x}'_3} cannot be restricted to any lower-dimensional set than {X_3}. We conclude that almost surely, (2) holds generically.

The above discussion shows that the support of {({\bf g}_1, {\bf g}_2, {\bf g}'_1, {\bf g}'_2, {\bf g}_3)} has dimension exactly {3d_G}; as any three of {{\bf g}_1, {\bf g}'_1, {\bf g}_2, {\bf g}'_2} are such that the remaining two random variables in {({\bf g}_1, {\bf g}_2, {\bf g}'_1, {\bf g}'_2, {\bf g}_3)} are algebraic over these three, the final claims of the lemma follow. \Box

We rewrite (2) as

\displaystyle  [\phi_{{\bf g}_1}] \circ [\phi_{{\bf g}'_1}]^{-1} = [\psi_{{\bf g}_2}] \circ [\psi_{{\bf g}'_2}]^{-1} \ \ \ \ \ (4)

almost surely, where {[f]} is the equivalence class of a generically bijective, and definable, partial function {f: X \rightarrow Y} up to generic equivalence, and inversion and composition on such equivalence classes is defined in the obvious manner. This equivalence class can be made definable by identifying {[f]} with the closure of the top-dimensional component of the graph {\{ (x_3,f(x_3)): x_3 \in X_3\}} of {f}, and then expressing this in Chow coordinates (or using the Hilbert scheme); this makes equivalence classes

\displaystyle  \overline{{\bf g}}_1 := [\phi_{{\bf g}_1}]; \overline{{\bf g}}'_1 := [\phi_{{\bf g}'_1}]; \overline{{\bf g}}_2 := [\psi_{{\bf g}_2}]; \overline{{\bf g}}'_2 := [\psi_{{\bf g}'_2}]

into random variables uniformly distributed in some definable sets {\overline{G}_1,\overline{G}_2}, which are definable images of {G_1,G_2} and thus almost irreducible (after passing to generic subsets of {G_1,G_2} if necessary). We thus see that any three of {\overline{{\bf g}}_1, \overline{{\bf g}}'_1, \overline{{\bf g}}_2, \overline{{\bf g}}'_2} are independent, and we have the relation

\displaystyle  \overline{{\bf g}}_1 \circ (\overline{{\bf g}}'_1)^{-1} = \overline{{\bf g}}_2 \circ (\overline{{\bf g}}'_2)^{-1} \ \ \ \ \ (5)

almost surely.

In particular, for fixed choices of {\overline{\bf g}'_1}, {\overline{\bf g}'_2}, {\overline{\bf g}_1} determines {\overline{\bf g}_2} and vice versa. Thus {\overline{G}_1} and {\overline{G}_2} have the same dimension, say {d_{\overline{G}}}. (This could be strictly less than {d_G}, basically because the original sets {G_1,G_2,G_3} may contain superfluous degrees of freedom which do not interact with the spaces {X_1,X_2,X_3}.)

We now let {G} be the set of all equivalence classes {[f]} of generically bijective and definable partial functions {f: X_3 \rightarrow X_3} with the following properties:

  • For generic {\overline{g}_1 \in \overline{G}_1}, there exists {\overline{g}'_1 \in \overline{G}_1} such that {[f] = \overline{g}_1 \circ (\overline{g}'_1)^{-1}}.
  • For generic {\overline{g}'_1 \in \overline{G}_1}, there exists {\overline{g}_1 \in \overline{G}_1} such that {[f] = \overline{g}_1 \circ (\overline{g}'_1)^{-1}}.
  • For generic {\overline{g}_2 \in \overline{G}_2}, there exists {\overline{g}'_2 \in \overline{G}_2} such that {[f] = \overline{g}_2 \circ (\overline{g}'_2)^{-1}}.
  • For generic {\overline{g}'_2 \in \overline{G}_2}, there exists {\overline{g}_2 \in \overline{G}_2} such that {[f] = \overline{g}_2 \circ (\overline{g}'_2)^{-1}}.

This is a definable set; {G} is contained in the image of {\overline{G}_1 \times \overline{G}_1} by the map {(g_1,g'_1) \mapsto g_1 \circ (g'_1)^{-1}} with fibres of dimension at least {d_{\overline{G}}}, and thus can have at most one component of dimension {d_{\overline{G}}} (and no larger dimensional components); on the other hand, from (5) {G} contains the image of a generic subset of {\overline{G}_1 \times \overline{G}_1}; thus {G} has dimension exactly {d_{\overline{G}}} and is essentially irreducible.

By construction, {G} contains the (equivalence class of) the identity map on {X_3} and is closed under inversion. We also claim that it is closed under composition, which would make {G} a definable group. Indeed, let {[f], [f'] \in G}. For generic {\overline{g}_1 \in \overline{G}_1}, there exists {\overline{g}'_1 \in \overline{G}_1} such that {[f] = \overline{g}_1 \circ (\overline{g}'_1)^{-1}}. The map {\overline{g}_1 \mapsto \overline{g}'_1} is a generic bijection and {\overline{G}_1} is almost irreducible, and so {\overline{g}'_1} is generic also. Thus, we may generically also find {\overline{g}''_1 \in \overline{G}_1} such that {[f'] = \overline{g}'_1 \circ (\overline{g}''_1)^{-1}}, and hence {[f] \circ [f'] = \overline{g}_1 \circ (\overline{g}''_1)^{-1}}. This gives the first property required for {[f] \circ [f']} to lie in {G}, and the other three are proven similarly.

Having located the definable group {G}, the next step is to locate a space {X} that {G} acts definably on. We first observe that {G} generically acts on {X_3}, by defining {[f] \cdot x_3 = x'_3} whenever {x_3 \in X_3}, {[f] \in G}, and {x'_3} is the unique element of {X_3} such that {[f]} (which, recall, is the closure of the top-dimensional component of the graph of {f}) contains {(x_3,x'_3)}. This is a definable, generically defined operation which obeys the action axiom {(g \cdot h) \cdot x_3 = g \cdot (h \cdot x_3)} for generic {(g,h,x_3) \in G \times G \times X_3}.

To create a genuine action and not just a generic action, we perform yet another “quotient space construction” to extract a suitable space {X} from {X_3} as follows. Define a formal pre-point to be a definable subset {\Sigma} of {G \times X_3} obeying the following axioms:

  • (xv) (Vertical line test) For generic {g \in G}, there is exactly one {x \in X_3} such that {(g,x) \in \Sigma}.
  • (xv’) (Vertical line test) For generic {(g,x) \in \Sigma}, {x} is the unique {x' \in X_3} such that {(g,x') \in \Sigma}.
  • (xvi) (Translation invariance) For generic {((g,x),h) \in \Sigma \times G}, we have {(h \cdot g, h \cdot x) \in \Sigma}.

Two formal pre-points are equivalent if they have a common definable subset, and a formal point is an equivalence class of formal pre-points. We then let {X} be the set of all formal points. If {\Sigma} is a formal pre-point and {(g,x)} is a generic element of {\Sigma}, then {\Sigma} is equivalent to {\{ (h \cdot g, h \cdot x): h \in G \}}, and so is almost irreducible. By similar arguments to before, we may now parameterise {X} as a definable set: it is the image of the generically defined map {(g,x) \mapsto \Sigma} described above, whose fibres have dimension at least {d_{\overline{G}}}, and so {X} has at most one component of dimension {d_X} and no higher-dimensional components. The group {G} acts on formal pre-points by the formula

\displaystyle  g \cdot \Sigma := \{ (h \cdot g^{-1}, x): (h,x) \in \Sigma \};

one easily verifies that this is an action on formal pre-points which descends to a definable action on the space {X} of formal points.
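Indeed, for {g_1, g_2 \in G} one has

\displaystyle  g_1 \cdot (g_2 \cdot \Sigma) = \{ (h \cdot g_2^{-1} \cdot g_1^{-1}, x): (h,x) \in \Sigma \} = \{ (h \cdot (g_1 \cdot g_2)^{-1}, x): (h,x) \in \Sigma \} = (g_1 \cdot g_2) \cdot \Sigma,

which gives the action axiom on formal pre-points.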

For generic {x \in X_3}, the set {\{ (g, g \cdot x): g \in G \}} can be verified to be a formal pre-point; this gives a definable, generically defined map {\iota: X_3 \rightarrow X}, which one can check to be generically injective; combined with the previous dimension control on {X}, we now see that {X} has dimension {d_X} and is almost irreducible, with {\iota} generically bijective. The map {\iota} can also be shown to generically preserve the action, in the sense that {\iota( g \cdot x ) = g \cdot \iota(x)} for generic {(g,x) \in G \times X_3}.
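For instance, writing {\Sigma_x} for the set {\{ (g, g \cdot x): g \in G \}} above, the translation invariance axiom (xvi) for {\Sigma_x} follows generically from the action axiom, since

\displaystyle  (h \cdot g, h \cdot (g \cdot x)) = (h \cdot g, (h \cdot g) \cdot x) \in \Sigma_x

for generic {(g,h,x)}; the vertical line tests (xv), (xv’) are immediate from the generic definition of {g \cdot x}.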

Finally, we can set up the random variables {\tilde {\bf g}_i} and {\tilde {\bf x}_i} required for Theorem 3. We choose generic elements {g_1^0 \in G_1, g_2^0 \in G_2}, and then set

\displaystyle  \tilde {\bf g}_1 := [\phi_{{\bf g}_1}] \circ [\phi_{g_1^0}]^{-1}

\displaystyle  \tilde {\bf g}_2 := [\psi_{g_2^0}] \circ [\psi_{{\bf g}_2}]^{-1}

\displaystyle  \tilde {\bf g}_3 := \tilde {\bf g}_1^{-1} \circ \tilde {\bf g}_2^{-1}

\displaystyle  \tilde {\bf x}_1 := \iota( \psi_{g_2^0}( {\bf x}_1 ) )

\displaystyle  \tilde {\bf x}_2 := \iota( \phi_{g_1^0}( {\bf x}_2 ) )

\displaystyle  \tilde {\bf x}_3 := \iota( {\bf x}_3 ).

It is clear that for generic {g_1^0,g_2^0}, the {\tilde {\bf x}_i} are interalgebraic with the {{\bf x}_i}, and that the relations (1) hold almost surely. Also, {\tilde {\bf g}_i} is clearly algebraic over {{\bf g}_i} for {i=1,2}, and the {\tilde {\bf x}_i} are uniformly distributed in {X}. For {i=3}, we use the relation {\tilde {\bf g}_3 \cdot \tilde {\bf x}_1 = \tilde {\bf x}_2}. As {{\bf x}_2} is algebraic over {{\bf g}_3} and {{\bf x}_1}, {\tilde {\bf x}_2} is algebraic over {{\bf g}_3} and {{\bf x}_1}. Thus for generic {{\bf g}_3}, there can only be finitely many possible values of {\tilde {\bf g}_3}, giving the desired algebraicity.

We see that {(\tilde {\bf g}_1, \tilde {\bf g}_2)} is uniformly distributed in {G \times G}; as {\tilde {\bf g}_3 = \tilde {\bf g}_1^{-1} \circ \tilde {\bf g}_2^{-1}} almost surely, we conclude that {\tilde {\bf g}_3} is uniformly distributed over {G}. The independence requirements on the {\tilde {\bf g}_i, \tilde {\bf x}_i} now follow from the corresponding independence hypotheses on the {{\bf g}_i, {\bf x}_i}.