The classical foundations of probability theory (discussed for instance in this previous blog post) are founded on the notion of a probability space $(\Omega, \mathcal{E}, \mathbf{P})$ – a space $\Omega$ (the sample space) equipped with a $\sigma$-algebra $\mathcal{E}$ (the event space), together with a countably additive probability measure $\mathbf{P}: \mathcal{E} \to [0,1]$ that assigns a real number in the interval $[0,1]$ to each event.
One can generalise the concept of a probability space to a finitely additive probability space, in which the event space $\mathcal{E}$ is now only a Boolean algebra rather than a $\sigma$-algebra, and the measure $\mathbf{P}$ is now only finitely additive instead of countably additive, thus $\mathbf{P}(E \vee F) = \mathbf{P}(E) + \mathbf{P}(F)$ when $E, F$ are disjoint events. By giving up countable additivity, one loses a fair amount of measure and integration theory, and in particular the notion of the expectation of a random variable becomes problematic (unless the random variable takes only finitely many values). Nevertheless, one can still perform a fair amount of probability theory in this weaker setting.
In this post I would like to describe a further weakening of probability theory, which I will call qualitative probability theory, in which one does not assign a precise numerical probability value $\mathbf{P}(E)$ to each event, but instead merely records whether this probability is zero, one, or something in between. Thus $\mathbf{P}$ is now a function from $\mathcal{E}$ to the set $\{0, I, 1\}$, where $I$ is a new symbol that replaces all the elements of the open interval $(0,1)$. In this setting, one can no longer compute quantitative expressions, such as the mean or variance of a random variable; but one can still talk about whether an event holds almost surely, with positive probability, or with zero probability, and there are still usable notions of independence. (I will refer to classical probability theory as quantitative probability theory, to distinguish it from its qualitative counterpart.)
The main reason I want to introduce this weak notion of probability theory is that it is well suited to talking about random variables living inside algebraic varieties, even if these varieties are defined over fields other than $\mathbf{R}$ or $\mathbf{C}$. In algebraic geometry one often talks about a “generic” element of a variety $V$ defined over a field $k$, which does not lie in any specified variety of lower dimension defined over $k$. Once $V$ has positive dimension, such generic elements do not exist as classical, deterministic $k$-points $x$ in $V$, since of course any such point lies in the $0$-dimensional subvariety $\{x\}$ of $V$. There are of course several established ways to deal with this problem. One way (which one might call the “Weil” approach to generic points) is to extend the field $k$ to a sufficiently transcendental extension $\tilde k$, in order to locate a sufficient number of generic points in $V(\tilde k)$. Another approach (which one might dub the “Zariski” approach to generic points) is to work scheme-theoretically, and interpret a generic point in $V$ as being associated to the zero ideal in the function ring of $V$. However I want to discuss a third perspective, in which one interprets a generic point not as a deterministic object, but rather as a random variable $x$ taking values in $V$, but which lies in any given lower-dimensional subvariety of $V$ with probability zero. This interpretation is intuitive, but difficult to implement in classical probability theory (except perhaps when considering varieties over $\mathbf{R}$ or $\mathbf{C}$) due to the lack of a natural probability measure to place on algebraic varieties; however it works just fine in qualitative probability theory. In particular, the algebraic geometry notion of being “generically true” can now be interpreted probabilistically as an assertion that something is “almost surely true”.
It turns out that just as qualitative random variables may be used to interpret the concept of a generic point, they can also be used to interpret the concept of a type in model theory; the type of a random variable $x$ is the set of all predicates $P(x)$ that are almost surely obeyed by $x$. In contrast, model theorists often adopt a Weil-type approach to types, in which one works with deterministic representatives of a type, which often do not occur in the original structure of interest, but only in a sufficiently saturated extension of that structure (this is the analogue of working in a sufficiently transcendental extension of the base field). However, it seems that (in some cases at least) one can equivalently view types in terms of (qualitative) random variables on the original structure, avoiding the need to extend that structure. (Instead, one reserves the right to extend the sample space of one’s probability theory whenever necessary, as part of the “probabilistic way of thinking” discussed in this previous blog post.) We illustrate this below the fold with two related theorems that I will interpret through the probabilistic lens: the “group chunk theorem” of Weil (and later developed by Hrushovski), and the “group configuration theorem” of Zilber (and again later developed by Hrushovski). For the sake of concreteness we will only consider these theorems in the theory of algebraically closed fields, although the results are quite general and can be applied to many other theories studied in model theory.
— 1. Qualitative probability theory – generalities —
We begin by setting up the foundations of qualitative probability theory, proceeding by close analogy with the more familiar quantitative probability theory (though of course we will have to jettison various quantitative concepts, such as mean and variance, from the theory).
As discussed in the introduction, we are replacing the unit interval $[0,1]$ by the three-element set $\{0, I, 1\}$; one could view this as the quotient space of $[0,1]$ in which the interior $(0,1)$ has been contracted to a single point $I$. This space is still totally ordered: $0 < I < 1$. The addition relation $a + b = c$ on $[0,1]$ contracts to an “addition” relation on $\{0, I, 1\}$, defined by the following rules:
$$0 + 0 = 0; \quad 0 + I = I + 0 = I; \quad 0 + 1 = 1 + 0 = 1; \quad I + I = I; \quad I + I = 1,$$
with no other relations of the form $a + b = c$ in $\{0, I, 1\}$. Strictly speaking, $+$ is not a binary operation here, as $I + I$ can evaluate to $I$ or to $1$, but we keep the notation in order to emphasise the analogy with quantitative probability theory.
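As a small sanity check, the contracted relation can be recovered mechanically by pushing true relations in the unit interval through the quotient map. The following is only an illustrative sketch: the function name `q`, the string labels `"0"`, `"I"`, `"1"`, and the choice of quarters as sample points are my own assumptions, not part of the theory.

```python
from fractions import Fraction

def q(t):
    """Quotient map from [0,1] to {0, I, 1}: collapse the open interval to 'I'."""
    if t == 0:
        return "0"
    if t == 1:
        return "1"
    return "I"

# Sample points of [0,1]; quarters are enough to hit every case.
grid = [Fraction(k, 4) for k in range(5)]

# Push every true relation a + b = c with a, b, c in [0,1] through q.
relations = {(q(a), q(b), q(a + b)) for a in grid for b in grid if a + b <= 1}

for a, b, c in sorted(relations):
    print(f"{a} + {b} = {c}")
```

Seven relations come out, and the pair `I + I` is the only one with two possible right-hand sides, which is exactly why the symbol for addition denotes a relation here rather than an operation.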
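A qualitative probability space is then a space $\Omega$ equipped with a Boolean algebra $\mathcal{E}$ (the measurable subsets of $\Omega$) and a function $\mu: \mathcal{E} \to \{0, I, 1\}$ with $\mu(\emptyset) = 0$ and $\mu(\Omega) = 1$, which is finitely additive in the sense that $\mu(E \cup F) = \mu(E) + \mu(F)$ whenever $E, F \in \mathcal{E}$ are disjoint. It is easy to see that these measures are monotone (thus $\mu(E) \leq \mu(F)$ whenever $E \subset F$), and that . A measurable subset $E$ of $\Omega$ is called a $\mu$-null set if $\mu(E) = 0$, and a $\mu$-full set, a $\mu$-conull set, or a $\mu$-generic set if $\mu(E) = 1$; note that $\mu$-full sets are the complements of $\mu$-null sets and vice versa. A property $P(x)$ of points $x \in \Omega$ is said to hold $\mu$-almost everywhere or for $\mu$-generic $x$ if it holds outside of a $\mu$-null set (or equivalently, if it holds on a $\mu$-generic set).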
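One can describe a qualitative probability measure $\mu$ on a Boolean space $(\Omega, \mathcal{E})$ purely through its null ideal $\mathcal{N} := \{ E \in \mathcal{E}: \mu(E) = 0 \}$ of null sets, or equivalently through its full filter $\mathcal{F} := \{ E \in \mathcal{E}: \mu(E) = 1 \}$ of full sets. Conversely, a subset $\mathcal{F}$ of $\mathcal{E}$ is the full filter of some qualitative probability measure $\mu$ on $(\Omega, \mathcal{E})$ if it obeys the following filter axioms: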
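- (Empty set) $\emptyset \not\in \mathcal{F}$ and $\Omega \in \mathcal{F}$.
- (Monotonicity) If $E \subset F$ are in $\mathcal{E}$, and $E \in \mathcal{F}$, then $F \in \mathcal{F}$.
- (Intersection) If $E, F \in \mathcal{F}$, then $E \cap F \in \mathcal{F}$.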
Furthermore $\mu$ is completely determined by the filter $\mathcal{F}$. Similarly for the null ideal (with suitably inverted axioms, of course). Thus, if one wished, one could replace the concept of a qualitative probability measure with the concept of an ideal or filter, but we retain the use of $\mu$ to emphasise the probabilistic interpretation of these objects.
One obvious way to create a qualitative probability measure is to start with a quantitative probability measure and “forget” the quantitative aspect of this measure by quotienting $[0,1]$ down to $\{0, I, 1\}$. Under some reasonable hypotheses, one can reverse this procedure and view many qualitative probability measures as quantitative probability measures to which this forgetful process has been applied. However, this reversal is usually not unique, and we will not try to use it here.
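To make the forgetful procedure concrete, here is a minimal sketch on a four-point sample space. The weights, the helper names `forget` and `mu`, and the string labels for the three values are all hypothetical choices made for illustration; the check simply confirms that the forgotten measure obeys finite additivity with respect to the contracted addition relation.

```python
from fractions import Fraction
from itertools import combinations

def forget(p):
    """Collapse a numerical probability to '0', 'I' or '1'."""
    return "0" if p == 0 else "1" if p == 1 else "I"

# Hypothetical quantitative measure; the point 'd' carries no mass.
weights = {"a": Fraction(1, 2), "b": Fraction(1, 4),
           "c": Fraction(1, 4), "d": Fraction(0)}
points = sorted(weights)
events = [frozenset(s) for r in range(len(points) + 1)
          for s in combinations(points, r)]

def mu(event):
    """Qualitative measure obtained by forgetting the numerical value."""
    return forget(sum(weights[x] for x in event))

# The contracted addition relation on {0, I, 1}.
ADD = {("0", "0", "0"), ("0", "I", "I"), ("I", "0", "I"), ("0", "1", "1"),
       ("1", "0", "1"), ("I", "I", "I"), ("I", "I", "1")}

# Finite additivity: (mu(E), mu(F), mu(E u F)) is an allowed relation
# whenever E and F are disjoint.
additive = all((mu(E), mu(F), mu(E | F)) in ADD
               for E in events for F in events if not (E & F))

null_sets = [set(E) for E in events if mu(E) == "0"]
full_sets = [set(E) for E in events if mu(E) == "1"]
print(additive, null_sets, full_sets)
```

Many different weightings give the same null and full sets here (any positive weights on `a`, `b`, `c` would do), which illustrates why the reversal is not unique.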
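In quantitative probability theory, one can take two quantitative probability measures $\mu_1, \mu_2$ on the same space $(\Omega, \mathcal{E})$ and form an average $(1-\theta) \mu_1 + \theta \mu_2$ for some $0 < \theta < 1$, which is another quantitative probability measure. For instance, if $\mu$ is a probability measure and $E$ is a set with measure strictly between $0$ and $1$, then $\mu$ can be expressed as an average of the conditioned measures $(\mu|E)$ and $(\mu|\Omega \backslash E)$.
In analogy with this, we can take two qualitative measures $\mu_1, \mu_2$ on the same space $(\Omega, \mathcal{E})$ and form their average $\mu$, defined by setting $\mu(E) = 0$ if and only if $\mu_1(E) = \mu_2(E) = 0$ (or equivalently, $\mu(E) = 1$ if and only if $\mu_1(E) = \mu_2(E) = 1$). If $\mu$ is a qualitative probability measure and $E$ is a set with measure $I$, then we can form the conditioned measure $(\mu|E)$, defined by setting $(\mu|E)(F) = 0$ if and only if $\mu(E \cap F) = 0$ (or equivalently $(\mu|E)(F) = 1$ if and only if $\mu(E \backslash F) = 0$), and then one can check that $\mu$ is the average of the conditioned measures $(\mu|E)$ and $(\mu|\Omega \backslash E)$.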
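The claim that a measure is the average of its two conditioned measures can be checked by brute force on a small finite example. The sketch below makes one simplifying assumption that is not in the text: on a finite set with the full power set algebra, a qualitative measure is encoded by the set of points that are not null (its `support`). All function names are hypothetical.

```python
from itertools import combinations

OMEGA = frozenset(range(4))
EVENTS = [frozenset(s) for r in range(5)
          for s in combinations(sorted(OMEGA), r)]

def measure(support):
    """Qualitative measure whose null sets are the events disjoint from `support`."""
    support = frozenset(support)
    def mu(event):
        if not (event & support):
            return "0"
        return "1" if support <= event else "I"
    return mu

def average(mu1, mu2):
    """Null iff null for both measures; full iff full for both."""
    def mu(event):
        a, b = mu1(event), mu2(event)
        if a == b == "0":
            return "0"
        if a == b == "1":
            return "1"
        return "I"
    return mu

def condition(mu, E):
    """(mu|E)(F) is 0 iff mu(E & F) = 0, and 1 iff mu(E - F) = 0."""
    def cond(F):
        if mu(E & F) == "0":
            return "0"
        if mu(E - F) == "0":
            return "1"
        return "I"
    return cond

def same(mu1, mu2):
    return all(mu1(F) == mu2(F) for F in EVENTS)

mu = measure({0, 1, 2})
E = frozenset({0, 3})                      # an event of intermediate measure
recombined = average(condition(mu, E), condition(mu, OMEGA - E))
print(mu(E), same(mu, recombined))
```

In this encoding, averaging corresponds to taking the union of supports and conditioning to intersecting the support with the event, so the recombination identity reduces to a statement about sets.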
We call a qualitative probability measure irreducible if it does not assign any set the intermediate measure of $I$ (or equivalently, the full filter is an ultrafilter); thus, irreducible qualitative probability measures are the same concept as finitely additive $\{0,1\}$-valued probability measures (which, as is well known, are essentially the same concept as ultrafilters). By the previous discussion, we see that a qualitative probability measure is irreducible if and only if it is not the average of two other measures.
In this post we will primarily work with irreducible measures, but will occasionally have to deal with reducible measures, for instance when taking the conditional product (or pullback) of two irreducible measures over a third.
Given a measurable map $f: X \to Y$ between two Boolean spaces $(X, \mathcal{X})$, $(Y, \mathcal{Y})$ (thus the pre-image of any measurable set in $Y$ by $f$ is measurable in $X$), we can define the pushforward $f_* \mu$ of any qualitative probability measure $\mu$ on $X$ to be the qualitative probability measure on $Y$ defined by the usual formula $f_* \mu(E) := \mu(f^{-1}(E))$. In particular, if $X$ embeds into $Y$, then any measure $\mu$ on $X$ can also be viewed as a measure on $Y$, which we call the extension of $\mu$ to $Y$.
We now discuss the issue of product measures in the qualitative setting. Here we will deviate a little from the usual probability formalism, in which one usually defines the product algebra $\mathcal{X} \times \mathcal{Y}$ to be the minimal algebra that contains the Cartesian products $E \times F$ of sets $E \in \mathcal{X}$ and $F \in \mathcal{Y}$. Here, it turns out to be more useful to have a more flexible (but not unique) notion of a product, in which more measurable sets are permitted. Namely, given two Boolean spaces $(X, \mathcal{X})$, $(Y, \mathcal{Y})$, we say that a Boolean space $(X \times Y, \mathcal{Z})$ is a product of the two spaces if $X \times Y$ is the Cartesian product of $X$ and $Y$, and the following axioms are obeyed:
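- (Products) If $E \in \mathcal{X}$ and $F \in \mathcal{Y}$, then $E \times F \in \mathcal{Z}$.
- (Slicing) If $E \in \mathcal{Z}$, then the slice $E_x := \{ y \in Y: (x,y) \in E \}$ lies in $\mathcal{Y}$ for all $x \in X$, and the slice $E^y := \{ x \in X: (x,y) \in E \}$ lies in $\mathcal{X}$ for all $y \in Y$.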
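These axioms do not uniquely specify $\mathcal{Z}$, but in practice each product space $X \times Y$ will have a canonical choice for $\mathcal{Z}$ attached to it.
Given two qualitative probability spaces $(X, \mathcal{X}, \mu)$, $(Y, \mathcal{Y}, \nu)$, a qualitative probability space $(X \times Y, \mathcal{Z}, \mu \times \nu)$ is a product of the two spaces if $(X \times Y, \mathcal{Z})$ is a product of $(X, \mathcal{X})$ and $(Y, \mathcal{Y})$, and the following assertions are equivalent for any $E \in \mathcal{Z}$: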
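- $(\mu \times \nu)(E) = 0$.
- For $\mu$-almost every $x \in X$, $\nu(E_x) = 0$, where $E_x := \{ y \in Y: (x,y) \in E \}$.
- For $\nu$-almost every $y \in Y$, $\mu(E^y) = 0$, where $E^y := \{ x \in X: (x,y) \in E \}$.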
Of course, one could replace $0$ by $1$ here in the above equivalence, which can be thought of as a qualitative Fubini-Tonelli type theorem. Once one selects the product Boolean algebra $\mathcal{Z}$, the product measure $\mu \times \nu$ is uniquely specified, if it exists at all; but (as in the setting of classical measure theory if one is not working with $\sigma$-finite measures), existence is not always automatic. (But in our applications, it will be.)
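The three equivalent conditions above can be tested exhaustively on a tiny example. In this sketch (my own construction, with hypothetical names) the spaces are finite, all subsets are measurable, and each qualitative measure is encoded by its set of non-null points; the product measure is then taken to be the one supported on the product of the supports.

```python
from itertools import combinations, product

def measure(support):
    """Qualitative measure whose null sets are the events disjoint from `support`."""
    support = frozenset(support)
    def mu(event):
        event = frozenset(event)
        if not (event & support):
            return "0"
        return "1" if support <= event else "I"
    return mu

X, Y = range(3), range(2)
mu = measure({0, 1})                       # the point 2 of X is null
nu = measure({0, 1})
prod = measure(product({0, 1}, {0, 1}))    # candidate product measure

pairs = list(product(X, Y))
subsets = [frozenset(s) for r in range(len(pairs) + 1)
           for s in combinations(pairs, r)]

def fubini_holds(E):
    """Check that the three conditions agree for the event E."""
    null = prod(E) == "0"
    # {x : nu(E_x) = 0} is mu-full
    by_x = mu({x for x in X
               if nu({y for y in Y if (x, y) in E}) == "0"}) == "1"
    # {y : mu(E^y) = 0} is nu-full
    by_y = nu({y for y in Y
               if mu({x for x in X if (x, y) in E}) == "0"}) == "1"
    return null == by_x == by_y

print(all(fubini_holds(E) for E in subsets))
```

The check runs over all 64 subsets of the product, so it only confirms the equivalence in this one toy case rather than proving anything in general.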
One can define products of more than two (but still finitely many) qualitative probability spaces in a similar fashion; we leave the details to the reader.
Now we are ready to set up qualitative probability theory. We need a qualitative probability space $(\Omega, \mathcal{E}, \mathbf{P})$ to serve as the sample space, event space, and probability measure. An event $E$ in $\mathcal{E}$ is said to occur almost surely if it occurs on a $\mathbf{P}$-full set. Given a Boolean space $(X, \mathcal{X})$, a random variable is then a measurable map $x: \Omega \to X$. We permit random variables that are only defined almost surely, thus $x$ is now a partial function defined on a $\mathbf{P}$-full event in $\Omega$; as in quantitative probability theory or measure theory, we view random variables that agree almost surely as being essentially equivalent to each other. The law or distribution of $x$ is the pushforward $x_* \mathbf{P}$ of the qualitative probability measure; this is then a qualitative measure on $X$. We say that two random variables $x, y$ agree in distribution, and write $x \equiv y$, if they have the same law.
Two random variables $x$, $y$ taking values in $X$, $Y$ respectively are said to be independent if the distribution of the joint random variable $(x,y)$ is the product of the distributions of $x$ and $y$ separately. Here we need to specify a product Boolean algebra on the product space $X \times Y$ to make this definition well-defined, but in the applications we will consider, we will always have a canonical product algebra to select. One can define independence of more than two random variables in a similar fashion.
At the opposite extreme to independence, we say that a random variable $y$ is determined by another random variable $x$ if there is a measurable map $f: X \to Y$ such that $y = f(x)$ almost surely. The contrast between independence and determination (as well as a weaker property than determination we will consider later, namely algebraicity) will be the focus of the group chunk and group configuration theorems discussed in later sections.
As discussed in this previous blog post, in quantitative probability theory we often reserve the right to extend the underlying probability space $(\Omega, \mathcal{E}, \mathbf{P})$, in order to introduce new sources of randomness, without destroying the probabilistic properties of existing random variables (such as their independence and determination properties). We say that an extension of a qualitative probability space $(\Omega, \mathcal{E}, \mathbf{P})$ is another qualitative probability space $(\Omega', \mathcal{E}', \mathbf{P}')$ together with a measurable map $\pi: \Omega' \to \Omega$ such that $\pi_* \mathbf{P}' = \mathbf{P}$. One can then pull back any random variable $x: \Omega \to X$ to a random variable $x \circ \pi: \Omega' \to X$ on the new probability space; by abuse of notation, we continue to refer to $x \circ \pi$ as $x$. Probabilistic notions such as independence, law, or determination remain unchanged under such an extension.
— 2. Qualitative probability theory on definable sets —
For the purposes of this post, the qualitative probability measures we will care about will live in the theory of algebraically closed fields. We will assume some basic familiarity with algebraic geometry concepts, such as the dimension of a variety. The exact choice of field $k$ will not be important here, but one could work with the complex field $k = \mathbf{C}$ if desired (in which case one could (somewhat artificially) model the qualitative probability measures here by quantitative probability measures on complex varieties if one wished).
Henceforth $k$ is fixed; in contrast to usual model theory practice, we will not need to introduce some sufficiently large extension $\tilde k$ of $k$ to work in. The notion of measurability here will be given by the model-theoretic concept of definability. Namely, a definable set is a set of the form
$$\{ (x_1,\ldots,x_n) \in k^n : P(x_1,\ldots,x_n) \hbox{ is true} \}$$
for some predicate $P$ that can be expressed in terms of the field operations $+, \times$, a finite number of variables and constants in $k$, the boolean symbols $\wedge, \vee, \neg$, the equality sign $=$, the quantifiers $\forall, \exists$ (with all variables being quantified over $k$), and punctuation symbols (parentheses and colons). A definable map $f: X \to Y$ between two definable sets is a function whose graph $\{ (x, f(x)): x \in X \}$ is a definable set.
As $k$ is algebraically closed, the definable sets can be described quite simply. Define an irreducible quasiprojective variety, or variety for short, to be a Zariski-open dense subset of an irreducible affine variety over $k$. One can show (using elimination of quantifiers in algebraically closed fields, the existence of which follows from Hilbert’s nullstellensatz) that a set is definable if and only if it is the union of a finite number of disjoint varieties.
We equip each definable set $X$ with the Boolean algebra $\mathcal{X}$ of definable subsets of $X$; this will be the only algebra we shall ever place on a definable set. Note that if $X, Y$ are definable, then $X \times Y$ is a product space of $X$ and $Y$ as per the previous definition.
If $\mu$ is a qualitative probability measure on a definable set $X$, the support of $\mu$ is defined to be the intersection of all the Zariski-closed sets of $\mu$-full measure. As the Zariski topology is Noetherian, the support is always a closed set of full measure.
Remark 1 One can describe a qualitative probability measure $\mu$ through its type, defined as the set of all predicates $P(x)$ which hold for $\mu$-almost all $x$. This concept is essentially the same as the concept of a type in model theory; the measure is irreducible if and only if the type is complete. In this post, we have essentially replaced the notion of a type with that of a qualitative probability measure, and so types will not appear explicitly in the rest of the post.
We now give some basic examples of qualitative probability measures on definable sets.
Example 1 If $X$ is a non-empty definable set of some dimension $d$ (that is, $d$ is the largest dimension of all components of $X$ (or of its closure)), then the qualitative uniform probability measure on $X$ (or uniform measure, for short) is defined by setting all subsets of dimension $d-1$ or less to have measure zero, or equivalently all generic subsets of $X$ (that is, $X$ with finitely many sets of dimension at most $d-1$ removed) to have full measure. (Sets which contain a generic subset of some, but not all, of the $d$-dimensional components of $X$ are then assigned the intermediate measure.) The support of this measure is then the closure of the union of the $d$-dimensional components. This measure is irreducible if and only if $X$ is almost irreducible in the sense that it only has one $d$-dimensional component. Note that the algebraic geometry notion of genericity now coincides with the (qualitative) probabilistic notion of almost sureness: a (definable) property on $X$ holds generically if and only if it holds almost surely with respect to the uniform measure on $X$.
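For a concrete toy model of Example 1, consider a disjoint union of copies of the affine line. A definable subset of the line in one variable is either finite or cofinite, so it can be stored as a finite set of exceptional points together with a flag. The class `LineSubset` and the function `uniform_measure` below are hypothetical names for this sketch, which does not attempt to handle higher-dimensional sets.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class LineSubset:
    """A definable subset of the affine line: a finite set, or its complement.

    cofinite=False: the subset is exactly `exceptions` (dimension 0 or empty).
    cofinite=True:  the subset is the line minus `exceptions` (dimension 1).
    """
    exceptions: frozenset = frozenset()
    cofinite: bool = False

def uniform_measure(pieces):
    """Uniform measure of a subset of a disjoint union of lines.

    `pieces` holds one LineSubset per line; the subset is generic in a
    given line exactly when its piece there is cofinite.
    """
    generic = [piece.cofinite for piece in pieces]
    if all(generic):
        return "1"
    if not any(generic):
        return "0"
    return "I"

whole_line = LineSubset(cofinite=True)
punctured = LineSubset(frozenset({0}), cofinite=True)
two_points = LineSubset(frozenset({0, 1}))

print(uniform_measure([punctured]))               # one line: irreducible case
print(uniform_measure([two_points]))
print(uniform_measure([whole_line, two_points]))  # two lines: reducible case
```

With a single line the value `I` never occurs, matching the irreducibility criterion in the example; with two lines the intermediate value appears as soon as a set is generic in one component but not the other.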
If $X, Y$ are definable sets, then the uniform measure on $X \times Y$ is the product of the uniform measure on $X$ and the uniform measure on $Y$; the proof of the Fubini-Tonelli type statement that justifies this may be found for instance in Lemma 13 of this paper of mine.
Example 2 If $X$ is almost irreducible and $f: X \to Y$ is a definable map, then the uniform measure on $X$ pushes forward to the uniform measure of some almost irreducible subset $Z$ of $Y$; see e.g. Lemma A.8 of this previous paper of Breuillard, Green, and myself. Also, generic points in $Z$ have fibres in $X$ of dimension $\dim(X) - \dim(Z)$, just as one would expect from naive dimension counting. If $X$ is not almost irreducible, the situation becomes a bit more complicated because the image of $f$ on different components of $X$ may have a different dimension, and so the pushforward may become an average of uniform measures on sets of different dimension.
Exercise 1 Let $\mu$ be a qualitative probability measure on a definable set $X$.
- (i) Show that $\mu$ is irreducible if and only if it is the uniform measure of some almost irreducible subset of $X$.
- (ii) Show that $\mu$ is the average of finitely many uniform measures if and only if there does not exist a countably infinite family of disjoint subsets of $X$ of positive measure. (Hint: greedily select disjoint varieties of $X$ of positive measure, starting with zero-dimensional varieties (points) and then increasing the dimension.)
We now use the formalism of qualitative probability theory from the previous section, but always working within the definable category; thus we require the sample space $\Omega$ to also be a definable set, and all random variables to be definable maps (or at least generically definable maps), which is a stronger condition than measurability.
— 3. The group chunk theorem —
Random variables can interact very nicely with groups $G$, if they come equipped with an appropriate invariant measure. To illustrate this, let us first return to the classical setting of quantitative probability theory, working exclusively with finite groups to avoid all measurability issues.
Given a finite group $G$, let $g_1, g_2$ be two elements of $G$ chosen uniformly and independently at random, and then form their product $g_3 := g_1 g_2$. This gives us a triple $(g_1, g_2, g_3)$ of random variables taking values in $G$, which obey the following independence and determination properties:
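- (i) (Uniform distribution) For any $i \in \{1,2,3\}$, $g_i$ has the uniform distribution on $G$.
- (ii) (Independence) For any distinct $i, j \in \{1,2,3\}$, $g_i, g_j$ are independent.
- (iii) (Determination) For any distinct $i, j, k \in \{1,2,3\}$, $g_k$ is determined by $g_i, g_j$.
- (iv) (Associativity) After extending the sample space as necessary, one can locate additional random variables $g_4, g_5, g_6$ taking values in $G$ such that $(g_2, g_4, g_5) \equiv (g_3, g_4, g_6) \equiv (g_1, g_5, g_6) \equiv (g_1, g_2, g_3)$ (see figure), and such that any other triple $(g_i, g_j, g_k)$ for distinct $i, j, k \in \{1,\ldots,6\}$ which is not a permutation of the four triples already mentioned is jointly independent.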
Indeed, to see the associativity axiom, let $g_4$ be selected uniformly from $G$ independently of $g_1, g_2$, and set $g_5 := g_2 g_4$ and $g_6 := g_1 g_2 g_4$. The associativity is depicted graphically in the figure below, in which three points connected by a line or curve indicate a dependence, while triples of points not joined by such a line or curve are independent.
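The dependence pattern in the associativity axiom can be confirmed by exhaustive enumeration in a small non-abelian group. The sketch below uses the symmetric group on three letters as a hypothetical test case, with three independent uniform elements `a`, `b`, `c` and the six variables `a, b, ab, c, bc, abc`; the labelling of the variables by positions 0 to 5 is my own.

```python
from collections import Counter
from itertools import combinations, permutations, product

G = list(permutations(range(3)))            # the symmetric group S_3

def mul(p, q):
    """Composition of permutations: (p*q)(i) = p(q(i))."""
    return tuple(p[q[i]] for i in range(3))

# Sample space G^3; each sample point (a, b, c) yields the six variables
# a, b, ab, c, bc, abc (positions 0..5).
samples = [(a, b, mul(a, b), c, mul(b, c), mul(mul(a, b), c))
           for a, b, c in product(G, repeat=3)]

def independent(indices):
    """True when the selected variables are jointly uniform on a power of G."""
    counts = Counter(tuple(s[i] for i in indices) for s in samples)
    return (len(counts) == len(G) ** len(indices)
            and len(set(counts.values())) == 1)

dependent = [t for t in combinations(range(6), 3) if not independent(t)]
pairs_ok = all(independent(p) for p in combinations(range(6), 2))
print(dependent, pairs_ok)
```

Exactly four of the twenty triples fail to be independent, namely `(a, b, ab)`, `(a, bc, abc)`, `(b, c, bc)` and `(ab, c, abc)`, and every pair of variables is independent.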
In the converse direction, any triple $(g_1, g_2, g_3)$ on a finite set $G$ obeying the above axioms necessarily comes from an underlying group operation:
Proposition 1 (Probabilistic description of a finite group) Let $G$ be a finite non-empty set, and let $g_1, g_2, g_3$ be random variables on $G$ that obey axioms (i)-(iv). Then there exists a group structure on $G$ such that $g_3 = g_1 g_2$.
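Proof: By axiom (iii), we have $g_3 = g_1 \cdot g_2$ for some binary operation $\cdot: G \times G \to G$. From axiom (iv) we have
$$(g_1 \cdot g_2) \cdot g_4 = g_1 \cdot (g_2 \cdot g_4)$$
almost surely (and hence surely, as $G$ is finite); also from axioms (i), (iv) we see that $(g_1, g_2, g_4)$ is uniformly distributed in $G^3$. We conclude that the binary operation is associative, thus $(a \cdot b) \cdot c = a \cdot (b \cdot c)$ for all $a, b, c \in G$.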
By axiom (iii), we see that for fixed $a$, $a \cdot b$ is determined by $b$ and vice versa; from axioms (i), (ii), this implies that for fixed $a \in G$, the map $b \mapsto a \cdot b$ is a bijection from $G$ to itself; similarly the map $b \mapsto b \cdot a$ is a bijection from $G$ to itself. Note also from associativity that $b \mapsto a \cdot b$ commutes with the right-action of $G$ in the sense that $a \cdot (b \cdot c) = (a \cdot b) \cdot c$ for all $b, c \in G$.
Conversely, given any bijection $f: G \to G$ that commutes with the right-action in the sense that $f(b \cdot c) = f(b) \cdot c$ for all $b, c \in G$, we claim that $f$ is the map $b \mapsto a \cdot b$ for a unique $a \in G$. Indeed, from axioms (i)-(iii), we know that $g_1$ is determined by $g_2, g_3$, and so for any given $b \in G$, we may find $a \in G$ such that $f(b) = a \cdot b$. If this holds for a single $b$, then it holds for all other $b$ by associativity, since $f$ commutes with the right action, and the map $c \mapsto b \cdot c$ is a bijection. Thus we may identify $G$ with the space of bijections $f: G \to G$ that commute with the right-action. This is clearly a group, and the property $g_3 = g_1 g_2$ is then clear from construction.
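The identification of the group with the bijections commuting with the right-action can be checked directly in a small case. This sketch (again using the symmetric group on three letters as a hypothetical example, with my own helper names) enumerates all 720 bijections of the six-element set and keeps those that commute with right multiplication.

```python
from itertools import permutations

G = list(permutations(range(3)))             # underlying six-element set

def op(p, q):
    """The binary operation: composition of permutations."""
    return tuple(p[q[i]] for i in range(3))

index = {g: n for n, g in enumerate(G)}

def commutes_with_right_action(f):
    """f[n] is the index of the image of G[n]; test f(x*y) = f(x)*y."""
    return all(f[index[op(x, y)]] == index[op(G[f[index[x]]], y)]
               for x in G for y in G)

commuting = [f for f in permutations(range(len(G)))
             if commutes_with_right_action(f)]

# Each left translation x -> g*x should appear, and nothing else.
left_translations = {tuple(index[op(g, x)] for x in G) for g in G}
print(len(commuting), set(commuting) == left_translations)
```

Only six bijections survive, one for each element of the set, and they are precisely the left translations, in line with the argument in the proof.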
We now see that a finite group, at least, may be described in terms of the probabilistic language of independence and determination. It is natural to ask whether similar results hold for infinite groups (with the slight modification that we now only expect $g_3 = g_1 g_2$ to hold almost surely rather than surely, as we now will have non-trivial events of probability zero). Here we run into the technical difficulty that many groups – such as non-compact Lie groups, or algebraic groups defined over fields other than $\mathbf{R}$ or $\mathbf{C}$ – are not naturally equipped with a probability measure with which to define the concept of uniform distribution. However, if one is working in the definable category, one can use the language of qualitative probability theory instead and obtain the same result.
A definable group is a group $G$ which is a definable set, such that the group operations $\cdot: G \times G \to G$ and $()^{-1}: G \to G$ are definable maps. This notion is very close to, but subtly different from, the more commonly used notion of an algebraic group; the latter is a bit stricter because the group has to now be an algebraic variety, and the group operations have to be regular maps and not just definable maps. However, the two notions are quite close to each other, particularly in characteristic zero when they become equivalent up to definable group isomorphism, although additional subtleties arise in positive characteristic $p$ due to the existence of things like the inverse Frobenius automorphism $x \mapsto x^{1/p}$, which is definable but not regular, and which can be used to definably “twist” an algebraic group. We will not discuss these issues further here, but see this survey of Bouscaren and this article of van den Dries for further discussion.
Theorem 2 (Group chunk theorem) Let $X$ be an almost irreducible definable set, and let $g_1, g_2, g_3$ be qualitative random variables taking values in $X$, obeying axioms (i)-(iv) (with $X$ in place of $G$). Then, after passing from $X$ to a generic subset, one may definably identify $X$ with a generic subset of a definable group $G$, so that $g_3 = g_1 g_2$ almost surely. (To put it another way, there is a definable bijection $\phi$ between a generic subset of $X$ and a generic subset of $G$, such that $\phi(g_3) = \phi(g_1) \phi(g_2)$ almost surely.)
This theorem is essentially due to Weil (although he worked instead in the category of algebraic varieties and algebraic groups, rather than definable sets and definable groups). It was extended to many other model-theoretic contexts (and in particular to stable theories) in the thesis of Hrushovski, although we will only focus on the classical algebraic geometry case here.
A typical example of a group chunk comes from taking a definable group and removing a lower dimensional subset of it, so that the group law is only generically defined (but this is still sufficient for defining $g_1 g_2$ almost surely). For instance, one could let $X$ be the non-singular matrices $\begin{pmatrix} a & b \\ 1 & d \end{pmatrix}$, with (generically defined) group law given by matrix multiplication followed by projective normalisation to force the lower left coordinate to be $1$ (which is only defined if this coordinate does not vanish, of course). This can be viewed as an open dense subset of the projective special linear group $PSL_2(k)$, and the theorem is asserting that $X$ can be “completed” to this group, which can be interpreted as a definable group. ($PSL_2(k)$ can be viewed as an affine variety using, for instance, the adjoint representation. In any event, even projective varieties can be interpreted affinely in a definable fashion by the artificial device of breaking up the variety into finitely many pieces, which can all be fit into a sufficiently large affine space.)
We now prove Theorem 2. By axioms (i)-(iii), there is a generically defined, and definable, map $\cdot: X \times X \to X$ such that $g_3 = g_1 \cdot g_2$ almost surely. From axiom (iii) we then conclude the following cancellation axioms:
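- (vii) (Left cancellation) For generic $a, c \in X$, there is a unique $b \in X$ such that $a \cdot b = c$.
- (viii) (Right cancellation) For generic $b, c \in X$, there is a unique $a \in X$ such that $a \cdot b = c$.
The associativity axiom then also gives $(a \cdot b) \cdot c = a \cdot (b \cdot c)$ for generic $a, b, c \in X$.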
These axioms also imply the following variants:
- (vii’) (Left cancellation) For generic $a, b \in X$, $b$ is the unique $b' \in X$ such that $a \cdot b' = a \cdot b$.
- (viii’) (Right cancellation) For generic $a, b \in X$, $a$ is the unique $a' \in X$ such that $a' \cdot b = a \cdot b$.
Indeed, from axiom (vii), we have a generically defined map that maps a generic pair $(a, c)$ to $(a, b)$, where $b$ is the unique element of $X$ such that $a \cdot b = c$. This map is generically injective; as $X \times X$ is essentially irreducible, this map has to be generically bijective also, which gives (vii’), and (viii’) is proven similarly. (We call a partially defined map generically bijective if it can be made into a total bijection by refining to generic subsets.) We conclude that for generic $a \in X$, the maps $b \mapsto a \cdot b$ and $b \mapsto b \cdot a$ are generically bijective.
To obtain a genuine group $G$ from $X$ rather than just a generic group, we perform a formal quotient construction, analogous to how the integers are formally constructed from the natural numbers (or the rationals from the integers). Define a formal pre-group element to be a definable subset $\Gamma$ of $X \times X$ with the following properties:
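- (ix) (Vertical line test) For generic $x \in X$, there is exactly one $y \in X$ such that $(x,y) \in \Gamma$.
- (ix’) (Vertical line test) For generic $(x,y) \in \Gamma$, $y$ is the unique $y' \in X$ such that $(x,y') \in \Gamma$.
- (x) (Horizontal line test) For generic $y \in X$, there is exactly one $x \in X$ such that $(x,y) \in \Gamma$.
- (x’) (Horizontal line test) For generic $(x,y) \in \Gamma$, $x$ is the unique $x' \in X$ such that $(x',y) \in \Gamma$.
- (xi) (Translation invariance) For generic $(x,y) \in \Gamma$ and $a \in X$, $(x \cdot a, y \cdot a)$ lies in $\Gamma$.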
The axioms (ix)-(x’) are asserting that $\Gamma$ is a generic bijection. (Here and in the sequel we adopt the convention that a mathematical statement is automatically considered false if one or more of its terms are undefined; for instance, $a \cdot b = c \cdot d$ is only true when $a \cdot b$ and $c \cdot d$ are well-defined. Similarly, when using set-builder notation, we only include those elements in the set which are well-defined; for instance, in the set $\{ (x \cdot a, y \cdot a): a \in X \}$, $a$ is implicitly restricted to those values for which $x \cdot a$ and $y \cdot a$ are well-defined.)
Call two formal pre-group elements $\Gamma, \Gamma'$ equivalent if they have a common generic subset, and then define a formal group element to be an equivalence class $[\Gamma]$ of formal pre-group elements $\Gamma$, and let $G$ be the space of formal group elements. At this point, we have to deal with the technical problem that $G$ is not obviously a definable set. However, observe that if $\Gamma$ is a formal pre-group element, then we have a definable generic bijection from $X$ to $\Gamma$, which makes $\Gamma$ essentially irreducible. If we then define $\overline{\Gamma}$ to be the Zariski closure of the top dimensional component of $\Gamma$ (i.e. the support of the uniform measure on $\Gamma$), then $\overline{\Gamma}$ is an irreducible closed variety, which depends only on the equivalence class $[\Gamma]$ of $\Gamma$, with inequivalent pre-group elements giving distinct irreducible closed varieties. Finally, one can describe $\overline{\Gamma}$ in terms of a generic element $(x,y)$ of $\Gamma$, as being the closure of the top-dimensional component of $\{ (x \cdot a, y \cdot a): a \in X \}$, and generic elements $(x,y)$ of $X \times X$ produce such an object. As such, if we let $\overline{G}$ be the set of all $\overline{\Gamma}$, then $\overline{G}$ is in one-to-one correspondence with the space $G$ of formal group elements, and may be parameterised as a definable set using standard algebraic geometry tools (e.g. Chow coordinates, the Hilbert scheme, or elimination of imaginaries), as being the image under a generically definable map of $X \times X$ with fibres of dimension $\dim(X)$. In particular, as $X \times X$ is essentially irreducible with dimension $2\dim(X)$, $G$ is essentially irreducible with dimension $\dim(X)$.
We define the identity element of $G$ to be the equivalence class of the diagonal $\{ (x,x): x \in X \}$, and define the inverse of the equivalence class $[\Gamma]$ of a formal pre-group element $\Gamma$ to be the equivalence class of the reflection $\{ (y,x): (x,y) \in \Gamma \}$ of $\Gamma$. As for the group law, we use generic composition: given two formal pre-group elements $\Gamma, \Gamma'$, we define the composition $\Gamma' \circ \Gamma$ to be the set of all pairs $(x,z)$ such that there exists $y \in X$ for which $y$ is the unique element with $(x,y) \in \Gamma$, and $z$ is also the unique element with $(y,z) \in \Gamma'$. One can check (somewhat tediously) that this descends to a well-defined operation on $G$ that gives it the structure of a definable group.
Next, for generic $a \in X$, the set $\Gamma_a := \{ (x, a \cdot x): x \in X \}$ is a formal pre-group element, giving a generically defined map $a \mapsto [\Gamma_a]$ from $X$ to $G$. One can verify that this map is generically definable, generically injective, and generically a homomorphism. The remaining task is to verify that $G$ has the same dimension as $X$, as this together with essential irreducibility and generic injectivity gives generic bijectivity thanks to dimension counting. But for every formal pre-group element $\Gamma$, we see that for generic $(x,y) \in \Gamma$, there is a unique $a \in X$ with $y = a \cdot x$, and furthermore that $y \cdot b = a \cdot (x \cdot b)$ for generic $b \in X$; in particular, $\Gamma$ is equivalent to $\Gamma_a$ and can thus be recovered from $a$; conversely, generic $a \in X$ gives rise to a formal group element by this construction. This sets up a generically bijective map from $X$ to $G$, which shows that $G$ has the same dimension as $X$, as required. This concludes the proof of Theorem 2.
— 4. The group configuration theorem —
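We now discuss a variant of the group chunk theorem that characterises group actions, known as the group configuration theorem. Again, to motivate matters we start with the quantitative probabilistic setting in the finite case. If $G$ is a finite group acting on a finite set $X$, and $g_{12}, g_{23}$ are chosen independently and uniformly at random from $G$, and $x_1$ uniformly from $X$ (independently of $g_{12}, g_{23}$), and then one defines $g_{13} := g_{23} g_{12}$, $x_2 := g_{12} x_1$, and $x_3 := g_{23} x_2$ (or, more symmetrically, we have the constraints $g_{13} = g_{23} g_{12}$, $x_2 = g_{12} x_1$, $x_3 = g_{23} x_2$, and $x_3 = g_{13} x_1$), then we observe the following independence and determination axioms (setting $g_{ji} := g_{ij}^{-1}$ for $i < j$, and assuming in the independence axioms that the action is transitive):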
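- (i’) (Uniform distribution) For any distinct $i, j$, $g_{ij}$ has the uniform distribution on $G$, and $x_i$ has the uniform distribution on $X$.
- (ii’) (Independence) Any two of the random variables $g_{12}, g_{23}, g_{13}, x_1, x_2, x_3$ are independent.
- (iii’) (Determination) For any distinct $i, j, k$, $g_{ik}$ is determined by $g_{ij}, g_{jk}$, and $x_j$ is determined by $g_{ij}$ and $x_i$.
- (v) (More independence) Any three of the random variables $g_{12}, g_{23}, g_{13}, x_1, x_2, x_3$ that are not of the form $g_{ij}, g_{jk}, g_{ik}$ or $g_{ij}, x_i, x_j$ for distinct $i, j, k$ are independent.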
Axiom (ii’) is in fact a consequence of axiom (v), but we add it for emphasis. We refer to a sextet obeying the above axioms as a group configuration. It can be described graphically by the picture below, in which the collinearity of three random variables indicates a dependence between them, with non-collinear triples of variables being independent.
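The independence pattern of a group configuration can be verified by enumeration in a small case. The sketch below takes the symmetric group on three letters acting on a three-point set as a hypothetical example; the variable names `g12, g23, g13, x1, x2, x3` and the convention that `g13` is the product of `g23` and `g12` are assumptions made for this illustration.

```python
from collections import Counter
from itertools import combinations, permutations, product

G = list(permutations(range(3)))        # S_3 acting on X = {0, 1, 2}
X = range(3)

def mul(p, q):
    """Composition of permutations: (p*q)(i) = p(q(i))."""
    return tuple(p[q[i]] for i in range(3))

names = ["g12", "g23", "g13", "x1", "x2", "x3"]
samples = []
for g12, g23, x1 in product(G, G, X):
    x2 = g12[x1]                         # x2 = g12 . x1
    x3 = g23[x2]                         # x3 = g23 . x2 = g13 . x1
    samples.append((g12, g23, mul(g23, g12), x1, x2, x3))

def independent(indices):
    """True when the joint law is uniform on the product of the value sets."""
    counts = Counter(tuple(s[i] for i in indices) for s in samples)
    size = 1
    for i in indices:
        size *= len({s[i] for s in samples})
    return len(counts) == size and len(set(counts.values())) == 1

dependent = [tuple(names[i] for i in t)
             for t in combinations(range(6), 3) if not independent(t)]
pairs_ok = all(independent(p) for p in combinations(range(6), 2))
print(dependent, pairs_ok)
```

The four dependent triples are the three "action" triples and the one "multiplication" triple, corresponding to the four lines in the picture; the pairwise independence relies on the action being transitive in this example.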
The group configuration theorem concerns a generalisation of the above situation, in which the determination properties in axiom (iii’) are relaxed to the weaker properties of algebraicity. (In additive combinatorics, this would correspond to moving from a “99%” situation in which algebraic structure is present almost everywhere, to a “1%” settng in which it is only present a positive fraction of the time.) We say that one random variable taking values in one definable set is algebraic with respect to another random variable taking values in another definable set if there is a relation such that almost surely, and such that for each there are only finitely many such that . More informally, is algebraic over if determines up to a finite ambiguity. For instance, if is uniformly distributed in , then is algebraic with respect to any non-constant polynomial of .
Two random variables are said to be interalgebraic if they are each algebraic over each other. For instance, if is uniformly distributed in the affine line , and are two non-constant polynomials, then and are interalgebraic. Note that interalgebraicity is an equivalence relation.
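The only non-trivial part of the equivalence relation claim is transitivity, which can be sketched as follows. The symbols a, b, c for the random variables and R, S, T for the relations are notation chosen here for illustration, and the sketch assumes that definable sets are closed under projection, so that the composed relation T is again definable.

```latex
% Notation (a, b, c, R, S, T) is chosen for this sketch only.
\begin{align*}
&\text{Suppose } (a,b) \in R \text{ a.s., with } R_{b'} := \{ a' : (a',b') \in R \} \text{ finite for every } b', \\
&\text{and } (b,c) \in S \text{ a.s., with } S_{c'} := \{ b' : (b',c') \in S \} \text{ finite for every } c'. \\
&\text{Set } T := \{ (a',c') : \exists\, b' \text{ with } (a',b') \in R \text{ and } (b',c') \in S \}. \\
&\text{Then } (a,c) \in T \text{ a.s., and } T_{c'} = \bigcup_{b' \in S_{c'}} R_{b'} \text{ is a finite union of finite sets.}
\end{align*}
```

Thus a is algebraic over c whenever a is algebraic over b and b is algebraic over c; applying this in both directions gives transitivity of interalgebraicity, while reflexivity and symmetry are immediate from the definition.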
Now we can give the group configuration theorem:
Theorem 3 (Group configuration theorem) Let be almost irreducible definable sets, and let be random variables in respectively on a qualitative probability space obeying the following axioms:
- (i’) (Uniform distribution) For any , has the uniform distribution on , and has the uniform distribution on .
- (ii’) (Independence) Any two of the random variables are independent.
- (iii”) (Interalgebraicity) For any distinct , is algebraic over , and is algebraic over and .
- (v) (More independence) Any three of the random variables that are not of the form or for distinct are independent.
- (vi) (Irreducibility) The joint random variable has an irreducible distribution.
Then, after extending the (qualitative) probability space if necessary, there exist a definable group that acts definably on a definable space , as well as qualitative random variables and uniformly distributed in and respectively for , with algebraic over and interalgebraic with with , such that
almost surely, and with the obeying the same independence and irreducibility hypotheses as the .
This theorem was first established by Zilber, who worked in the more general setting of strongly minimal theories, and then strengthened significantly in the thesis of Hrushovski, who treated the case of stable theories. (However, in these more general situations, the space is restricted to be “one-dimensional” for technical reasons.) We give a proof below the fold, following an exposition of Hrushovski’s method by Bouscaren. (See also a proof of a result very close to the above theorem, avoiding model-theoretic language, by Elekes and Szabo.) The irreducibility axiom (vi) can be relaxed, but then the conclusion becomes more complicated (there might not be a single group structure or group action involved, but rather an average of such actions).
The group configuration theorem has a number of applications to combinatorics; roughly speaking, this theorem is to “approximately associative” definable maps as Freiman’s theorem is to sets of small doubling. The aforementioned paper of Elekes and Szabo is one example of the configuration theorem in action; another is Theorem 41 of this paper of mine, which I proved by a different method (based on Riemann surface arguments), but for which a stronger statement has since been proven using the group configuration theorem by Hrushovski (private communication).
Now we prove Theorem 3. There are two main phases of the argument. The first phase involves upgrading several of the algebraicity hypotheses in axiom (iii”) to determination, by replacing several of the using algebraic changes of variable. Once this is done, the second phase consists of applying a modification of the proof of the group chunk theorem to locate the definable group (and also the definable space that acts on), and to connect this action to the group configuration.
We begin with a simple dimension counting observation. By the axioms, the random variable has an irreducible distribution and is thus uniformly distributed on some almost irreducible definable set , with algebraic over , which is uniformly distributed on . Taking dimensions, we conclude that . Similarly for permutations. This implies that
for some natural number ; a similar argument using the triples shows that
for some natural number . Next, since are uniformly distributed on , and the other three variables are algebraic over these variables, we see that the tuple is uniformly distributed on some almost irreducible definable set of dimension .
We now begin the first phase. Currently, by axiom (iii”), is algebraic over . We now use further dimension counting to upgrade this algebraicity relationship to determination, basically by removing some information from .
Proposition 4 Let the assumptions be as in Theorem 3. Then there exists a random variable which is interalgebraic with and uniformly distributed in some almost irreducible set , such that is determined by .
Proof: Let be the support of , and let be the projection of onto , then is a closed variety, and as is algebraic over , is generically finite over . In particular, also has dimension . We then form the pullback (or base change) of two copies of over (a generic subset of) ; we view as a subset of . This is a definable set, but is not necessarily almost irreducible.
Now consider the projection of to . The set contains the diagonal and thus has dimension at least . We claim that in fact has dimension exactly . Indeed, suppose this were not the case, then would contain an irreducible variety of dimension for some positive . Now observe that as is algebraic over , the projection of to is generically finite over , which has dimension ; taking pullback with itself, we conclude that the projection of to also has dimension . Thus, over a generic point in , the fibre of projected to has dimension at most . Similarly, the projection of this fibre to has dimension at most . Since is algebraic over , we conclude that the generic fibre of over a point in has total dimension at most , so that the preimage of in has dimension at most
and is thus a lower-dimensional component of (or of after projecting to either of the two copies of ). Thus, if we pass to a suitable generic subset of , the projection of to has dimension . Passing to a further generic subset if necessary, we may assume that is algebraic in the sense that any horizontal or vertical line in meets in at most finitely many points. From the Noetherian property, we see that there is in fact a uniform upper bound on how many such points can lie on a line (this is basically the degree of ).
We now define the random variable to be the set of all with , such that lies in the projection of (the generic portion of we are working with) to . By the above discussion, this is a finite subset of , and the set of all such possible can be parameterised in a definable way (indeed, it lies inside the -fold powers of over for ), and is interalgebraic with . By construction, is also determined by ; as the latter is uniformly distributed on some almost irreducible set, is also, and the claim follows.
A similar argument provides a random variable interalgebraic with such that is determined by . By replacing with respectively, and checking that none of the axioms (i’), (ii’), (iii”), (v), (vi) are destroyed by this replacement, we see that we may reduce without loss of generality to the case in which we have the additional axiom
- (iii”‘) is determined by , and is determined by .
Now we turn to the task of making determined both by and by . We are unable to effectively utilise (suitable permutations of) Proposition 4 here, because any replacement of by a random variable with less information content will likely destroy axiom (iii”‘). However, we can at least construct a random variable interalgebraic with that is determined by the joint random variable . Indeed, the support of is generically finite over , and by repeating the dimension counting arguments from Proposition 4, we see that the projection of the pullback to has dimension at most , and so has finite fibres after passing to a generic subset. If we then set to be the fibre of over , we conclude as before that is interalgebraic with , and is clearly determined by . Also, each element of , together with , generically determines by axiom (iii”‘), and hence is determined by ; similarly is determined by . Thus by replacing with , we may impose the additional axiom
- (vii) is determined by
while retaining all previous axioms.
Now we perform the following “doubling” trick, creating some new random variables by extending the probability space. As before, let be the support of . As is uniformly distributed in , we see that for generic and generic , there is a non-zero finite number of tuples such that . Similarly there are a non-zero finite number of tuples such that . Thus, for generic , if we let be the set of all tuples
such that
(see figure), then is a generically finite cover of (projecting onto the coordinates). Thus, if we perform a base change of the probability space (which we view as lying over ) to , we may now create random variables
with the being an extension of the previous random variables of the same name.
Since is algebraic over , we see (as is generic and deterministic) that is algebraic over . Similarly is algebraic over , and also is algebraic over and is algebraic over . From (iii) we also see that is determined by and , and is determined by and . Finally from (vii) we see that determine , and similarly determine . Thus if we set
then we see that
- is interalgebraic with for ;
- is interalgebraic with for ;
- is determined by ;
- is determined by ;
- is determined by ;
- is determined by .
Thus, by replacing and with and , we may now obtain the additional axiom
- (vii’) is determined by , and also is determined by .
We now have enough determination relations to begin the second phase of the argument, in which the arguments used to prove the group chunk theorem may be applied. Observe that determine , and determine . Thus, for generic , we have a generically bijective definable map such that
almost surely, with also depending definably on . Similarly, for generic , we have a generically bijective definable map such that
almost surely, with also depending definably on .
We now relate the and to each other:
Lemma 5 After extending the probability space if necessary, there exist random variables uniformly distributed in respectively, such that, almost surely, the identity
holds generically. Furthermore, any three of the are independent, with the fourth being algebraic over these three.
Proof: Let be the generic subset of consisting of those such that
(recall from our conventions that these statements implicitly require that all expressions be well-defined, thus for instance must lie in the domain of for the second statement to be true). The set surjects onto a subset of of dimension (because is algebraic over , which is uniformly distributed over ), so the generic fibres have dimension . If we let , then thus has dimension . We view as a subset of and parameterise it as
By a base change, we may then find a set of random variables
uniformly distributed in , which restricts to the existing tuple of random variables.
By construction, one has
almost surely. On the other hand, the support of has dimension at most (because are algebraic over ) and so for generic choices of these random variables, the set of possible has dimension at least ; since any one of these variables is algebraic over any other (once the are fixed), we conclude that cannot be restricted to any lower-dimensional set than . We conclude that almost surely, (2) holds generically.
The above discussion shows that the support of has dimension exactly ; as any three of are such that the remaining two random variables in are algebraic over these three, the final claims of the lemma follow.
We rewrite (2) as
almost surely, where is the equivalence class of a generically bijective and definable partial function up to generic equivalence, and inversion and composition on such equivalence classes are defined in the obvious manner. This equivalence class can be made definable by identifying with the closure of the top-dimensional component of the graph of , and then expressing this in Chow coordinates (or using the Hilbert scheme); this makes equivalence classes
into random variables uniformly distributed in some definable sets , which are definable images of and thus almost irreducible (after passing to generic subsets of if necessary). We thus see that any three of are independent, and we have the relation
almost surely.
In particular, for fixed choices of , , determines and vice versa. Thus and have the same dimension, say . (This could be strictly less than , basically because the original sets may contain superfluous degrees of freedom which do not interact with the spaces .)
We now let be the set of all equivalence classes of generically bijective and definable partial functions with the following properties:
- For generic , there exists such that .
- For generic , there exists such that .
- For generic , there exists such that .
- For generic , there exists such that .
This is a definable set; is contained in the image of by the map with fibres of dimension at most , and thus can have at most one component of dimension (and no larger dimensional components); on the other hand, from (5) contains the image of a generic subset of ; thus has dimension exactly and is essentially irreducible.
By construction, contains the (equivalence class of) the identity map on and is closed under inversion. We also claim that it is closed under composition, which would make a definable group. Indeed, let . For generic , there exists such that . The map is a generic bijection and is almost irreducible, and so is generic also. Thus, we may generically also find such that , and hence . This gives the first property required for to lie in , and the other three are proven similarly.
Having located the definable group , the next step is to locate a space that acts definably on. We first observe that generically acts on , by defining whenever , , and is the unique element of such that (which, recall, is the closure of the top-dimensional component of the graph of ) contains . This is a definable and generically defined operation which obeys the action axiom for generic .
To create a genuine action and not just a generic action, we perform yet another “quotient space construction” to extract a suitable space from as follows. Define a formal pre-point to be a definable subset of obeying the following axioms:
- (xv) (Vertical line test) For generic , there is exactly one such that .
- (xv’) (Vertical line test) For generic , is the unique such that .
- (xvi) (Translation invariance) For generic , we have .
Two formal pre-points are equivalent if they have a common definable subset, and a formal point is an equivalence class of formal pre-points. We then let be the set of all formal points. If is a formal pre-point and is a generic element of , then is equivalent to , and so is essentially irreducible; by similar arguments to before, we may now parameterise as a definable set, which is the image of the generically defined map described above with fibres of dimension at least and so has at most one component of dimension and no higher-dimensional components. The group acts on formal pre-points by the formula
one easily verifies that this is an action on formal pre-points which descends to a definable action on the space of formal points.
For generic , the set can be verified to be a formal pre-point; this gives a definable, generically defined map , which one can check to be generically injective; combined with the previous dimension control on , we now see that has dimension and is almost irreducible, with generically bijective. It can also be shown to generically preserve the action, in the sense that for generic .
Finally, we can set up the random variables and required for Theorem 3. We choose generic elements , and then set
It is clear that for generic , the are interalgebraic with , and that the relations (1) hold almost surely. Also, is clearly algebraic over for , and the are uniformly distributed in . For , we use the relation . As is algebraic over and , is algebraic over and . Thus for generic , there can only be finitely many possible values of , giving the desired algebraicity.
We see that is uniformly distributed in ; as almost surely, we conclude that is uniformly distributed over . The independence requirements on the now follow from the corresponding independence hypotheses on the .
6 comments
16 November, 2013 at 11:05 pm
omar aboura
In the definition of “qualitative probability space” (third paragraph of Section 1), I think that the three must be .
[Corrected, thanks – T.]
17 November, 2013 at 12:13 am
Allen Knutson
PSL_2(k) isn’t projective; it’s the image of the adjoint representation of SL_2(k). More specifically it’s affine, being SO(the Killing form).
[Corrected, thanks – T.]
17 November, 2013 at 9:58 am
Rex
I believe “projective” here is just referring to the name of PSL and not asserting any projectivity.
[Allen was referring to an earlier version of the text, now corrected, in which there was some confusion in this regard – T.]
3 January, 2014 at 8:46 am
Sean Eberhard
An oft-used countably additive qualitative probability measure, completely unrelated to any genuine measure, is Baire category in a complete metric space. Say that P(E) = 0 if E is meagre and P(E) = 1 if E is comeagre.