In the previous set of notes, we saw that one could derive expansion of Cayley graphs from three ingredients: non-concentration, product theorems, and quasirandomness. Quasirandomness was discussed in Notes 3. In the current set of notes, we discuss product theorems. Roughly speaking, these theorems assert that in certain circumstances, a finite subset of a group either exhibits expansion (in the sense that , say, is significantly larger than ), or is somehow “close to” or “trapped” by a genuine group.
- (Expansion) One has .
- (Close to ) One has .
- (Trapping) is contained in a proper subgroup of .
We will prove this theorem (which was proven first in the cases for fields of prime order by Helfgott, and then for and general by Dinai, and finally to general and independently by Pyber-Szabo and by Breuillard-Green-Tao) later in this notes. A more qualitative version of this proposition was also previously obtained by Hrushovski. There are also generalisations of the product theorem of importance to number theory, in which the field is replaced by a cyclic ring (with not necessarily prime); this was achieved first for and square-free by Bourgain, Gamburd, and Sarnak, by Varju for general and square-free, and finally by this paper of Bourgain and Varju for arbitrary and .
Exercise 1 (Girth bound) Assuming Theorem 1, show that whenever is a symmetric set of generators of for some finite field and some , then any element of can be expressed as the product of elements from . (Equivalently, if we add the identity element to , then for some .) This is a special case of a conjecture of Babai and Seress, who conjectured that the bound should hold uniformly for all finite simple groups (in particular, the implied constants here should not actually depend on . The methods used to handle the case can handle other finite groups of Lie type of bounded rank, but at present we do not have bounds that are independent of the rank. On the other hand, a recent paper of Helfgott and Seress has almost resolved the conjecture for the permutation groups .
A key tool to establish product theorems is an argument which is sometimes referred to as the pivot argument. To illustrate this argument, let us first discuss a much simpler (and older) theorem, essentially due to Freiman, which has a much weaker conclusion but is valid in any group :
- (Expansion) One has .
- (Close to a subgroup) is contained in a left-coset of a group with .
To prove this theorem, we suppose that the first conclusion does not hold, thus . Our task is then to place inside the left-coset of a fairly small group .
To do this, we take a group element , and consider the intersection . A priori, the size of this set could range from anywhere from to . However, we can use the hypothesis to obtain an important dichotomy, reminiscent of the classical fact that two cosets of a subgroup of are either identical or disjoint:
- (Non-involved case) is empty.
- (Involved case) .
Proof: Suppose we are not in the pivot case, so that is non-empty. Let be an element of , then and both lie in . The sets and then both lie in . As these sets have cardinality and lie in , which has cardinality less than , we conclude from the inclusion-exclusion formula that
But the left-hand side is equal to , and the claim follows.
The above proposition provides a clear separation between two types of elements : the “non-involved” elements, which have nothing to do with (in the sense that , and the “involved” elements, which have a lot to do with (in the sense that . The key point is that there is a significant “gap” between the non-involved and involved elements; there are no elements that are only “slightly involved”, in that and intersect a little but not a lot. It is this gap that will allow us to upgrade approximate structure to exact structure. Namely,
Proposition 4 The set of involved elements is a finite group, and is equal to .
Proof: It is clear that the identity element is involved, and that if is involved then so is (since . Now suppose that are both involved. Then and have cardinality greater than and are both subsets of , and so have non-empty intersection. In particular, is non-empty, and so is non-empty. By Proposition 3, this makes involved. It is then clear that is a group.
If , then is non-empty, and so from Proposition 3 is involved. Conversely, if is involved, then . Thus we have as claimed. In particular, is finite.
Now we can quickly wrap up the proof of Theorem 2. By construction, for all ,which by double counting shows that . As , we see that is contained in a right coset of ; setting , we conclude that is contained in a left coset of . is a conjugate of , and so . If , then and both lie in and have cardinality , so must overlap; and so . Thus , and so , and Theorem 2 follows.
Exercise 2 Show that the constant in Theorem 2 cannot be replaced by any larger constant.
Exercise 3 Let be a finite non-empty set such that . Show that . (Hint: If , show that for some .)
Exercise 4 Let be a finite non-empty set such that . Show that there is a finite group with and a group element such that and .
Below the fold, we give further examples of the pivot argument in other group-like situations, including Theorem 2 and also the “sum-product theorem” of Bourgain-Katz-Tao and Bourgain-Glibichuk-Konyagin.
— 1. The sum-product theorem —
Consider a finite non-empty subset of a field . Then we may form the sumset
and the product set
The minimal sizes of such sets are well understood:
- (i) Show that , with equality occuring if and only if is an additive coset of an finite additive subgroup of with some .
- (ii) Show that , with equality occuring if and only if is either equal to a multiplicative coset of a finite multiplicative subgroup of with some , or the set , or the set where is a multiplicative coset.
- (iii) Show that , with equality occuring if and only if is either equal to a multiplicative dilate of a finite subfield of with , a singleton set, or an additive subgroup of order .
The sum-product phenomenon is a robust version of the above observation, asserting that one of or must be significantly larger than if is not somehow “close” to a genuine subfield of . Here is one formulation of this phenomenon:
- (Expansion) .
- (Close to a subfield) There is a dilate of a subfield of with and which contains all but elements of .
- (Smallness) is an additive subgroup of order .
If has characteristic zero, then the second option here cannot occur, and we conclude that for some absolute constant as soon as contains at least two non-zero elements, a claim first established in by Erdos and Szemeredi. When is a finite field of prime order, the second option can only occur when , and we conclude that as soon as whenever is sufficiently small, is an absolute constant, and has at least two non-zero elements. A preliminary version of this result (which required more size assumptions on , in particular a bound of the shape ) was obtained by Bourgain, Katz, and Tao, with the version stated above first obtained by Bourgain, Glibichuk, and Konyagin. The proof given here is drawn from my book with Van, and was originally inspired by this paper of Bourgain and Konyagin.
Remark 1 There has been a substantial amount of literature on trying to optimise the exponent in the sum-product theorem. A relatively recent survey of this literature can be found in this paper of mine (and in the references to the other papers cited in this remark). In , the best result currently in this direction is by Solymosi, who established that one can take arbitrarily close to ; for , the best result currently is by Rudnev, who shows that one can take arbitrarily close to . For fields of prime order, one can take arbitrarily close to , a result of Rudnev; an extension to arbitrary finite fields was then obtained by Li and Roche-Newton.
We now start proving Theorem 5. As with Theorem 2, the engine of the proof is a dichotomy similar to that of Proposition 3. Whilst the former proposition was modeled on the basic group-theoretic assertion that cosets of a subgroup where either identical or disjoint, this proposition is modeled on the basic linear algebra fact that if is a subfield of and , then is either of size , or of size .
- (Non-involved case) .
- (Involved case) .
Proof: Suppose that we are not in the non-involved case, thus . Then the map from to is not injective, and so there exists with and
In particular, . We then have and so
Remark 2 One can view as measuring the extent to which the dilate of is “transverse” to . As the “slope” varies, “pivots” around the origin, encountering both the (relatively rare) involved slopes, and the (generic) non-involved slopes. It is this geometric picture which led to the term “pivot argument”, as used in particular by Helfgott (who labeled the non-involved slopes as “pivots”).
This dichotomy becomes useful if there is a significant gap between and . Let’s see how. To prove Theorem 5, we may assume that is larger than some large absolute constant , as the claim follows from Exercise 5 otherwise (making small enough depending on ). by deleting from , and tweaking , noting that we may then assume that does not contain . We suppose that
for some and some sufficiently small absolute constant . In particular we see that will exceed any quantity of the form if we make small enough and large enough.
We would like to boost this control of sums and products to more complex combinations of . We will need some basic tools from additive combinatorics.
Lemma 7 (Ruzsa triangle inequality) If are non-empty finite subsets of , then .
Proof: This is the additive version of Exercise 4 from Notes 4.
Lemma 8 (Ruzsa covering lemma) If are non-empty finite subsets of , then can be covered by at most translates of .
Proof: This is the additive version of Exercise 5 from Notes 4.
Exercise 6 (Sum set estimates) If are non-empty finite subsets of such that , show that and can both be covered by translates of the same -approximate group , with . Conclude that
for any natural numbers , where denotes the sum set of copies of . (Hint: use the additive form of Exercise 7 from Notes 4, and the preceding lemmas.)
These lemmas allow us to improve the sum-product properties of by passing to a large subset (cf. the Balog-Szemeredi-Gowers lemma from Notes 4):
Lemma 9 (Katz-Tao lemma) Let be as above. Then there is a subset of with such that .
Proof: The dilates of with all lie in a set of cardinality at most . Intuitively, this should force a lot of collision between the , which we will exploit using the sum set estimates. More precisely, observe that
and hence by Cauchy-Schwarz
The left-hand side can be written as
and so by the pigeonhole principle we can find such that
We apply a dilation to set (recall that does not contain ). If we set , we conclude that
which implies in particular that
If , then
since we also have
and thus by the Ruzsa triangle inequality
whenever . Informally, let us call a non-zero element of good if (but note that this notion of “good” is a bit fuzzy, as it depends on the choice of implied constants in the notation). Observe that if are good, then
and thus by the Ruzsa triangle inequality
thus the product of two good elements are good (with somewhat worse implied constants). Similarly, from the Ruzsa covering lemma we see that and are both covered by translates of , and from this and sum set estimates we see that
and so the sum of two good elements is again good. Similarly the difference of good elements is good. Applying all these facts, we conclude that all the elements of are good, thus for all . In particular, since exceeds , we see from the Cauchy-Schwarz inequality that for each , there are solutions to the equation with and . However, there are only possible choices for , and each such choice uniquely determines , so there are at most possible choices for , and the claim follows.
Note that one could replace in the above lemma by any other homogeneous polynomial combination of .
By applying a dilation, we may assume that contains . Applying Proposition 6 to this set (and using sum set estimates), we arrive at the following dichotomy: every field element is either “non-involved” in the sense that , or is “involved” in the sense that for some fixed absolute constant . By sum set estimates we have ; as we can assume that , and hence , is larger than any quantity of the form , this forces all elements of to be involved.
To exploit this, observe (by repeating the proof of Proposition 6) that if are involved, then the quantities are somewhat involved in the sense that
for those choices of (where the implied constants depend on ). But as we can assume that , and hence , is larger than any quantity of the form , we see from Proposition 6 that this forces to be involved as well (this is the crucial step at which approximate structure is improved to exact structure). We thus see that the set of all involved elements is closed under multiplication, addition, and subtraction; as it also contains , it is a subring of . Arguing as in the proof of Proposition 6, we have that is finite with ; in particular, must now be a finite subfield of .
Now we enter the “endgame”, in which we use this to control . By previous discussion, contains , and thus . By the Ruzsa triangle inequality applied to , this implies that , and so can be covered by translates of . A similar argument applied multiplicatively shows that can be covered by dilates of . Since a non-trivial translate of and a non-trivial dilate of intersect in at most one point, we conclude that has at most elements outside of , and the claim follows.
Remark 3 One can abstract this argument by replacing the multiplicative structure here by an abelian group action; see this paper of Helfgott for details. The argument can also extend to non-commutative settings, such as division algebras or more generally to arbitrary rings (though in the latter case, the presence of non-trivial zero-divisors becomes a very significant issue); see this paper for details.
— 2. Finite subgroups of —
We will shortly establish Theorem 1, which can be viewed as a way to describe approximate subgroups of . Before we do so, let us first warm up and digress slightly by by studying genuine finite subgroups of , in the model case , for which ad hoc explicit calculations are available. In order to make the algebraic geometry of the situation cleaner, it is convenient to embed the field in its algebraic closure , and similarly embed in . This is a group which is also an algebraic variety (identifying the space of matrices with coefficients in with ), whose group operations are algebraic (in fact, polynomial) maps; in other words, is an algebraic group. We now consider the question of what finite subgroups of can look like. This is a classical question, with a complete classification obtained by Dickson in 1901. The precise classification is somewhat complicated; to give just a taste of this complexity, we observe that the symmetry group of the isocahedron is a finite subgroup of , which can be lifted to the spin group (giving what is known as the binary isocahedral group, a group of order ), which is a subgroup of , which can be identified with . Because of this, it is possible for some choices of finite field to embed the binary isocahedral group into or . Similar considerations obtain for the symmetry group of other Platonic solids. However, if one is willing to settle for a “rough” classification, in which one ignores groups of bounded size (and more generally, is willing just to describe a bounded index subgroup of the group ), the situation becomes much simpler. In the characteristic zero case , for instance, we have Jordan’s theorem, which asserts that given a finite subgroup of for some , a bounded index subgroup of is abelian. (Jordan’s theorem was discussed further in last quarter’s course.) The finite characteristic case is inherently more complicated though (due in large part to the proliferation of finite subfields), with a satisfactory rough classification only becoming available for general with the work of Larsen and Pink (published in 2011, but which first appeared as a preprint in 1998). However, the case is significantly simpler and can be treated by somewhat ad hoc methods, as we shall now do. The discussion here is loosely based on this paper of Kowalski.
We pause to recall some basic structural facts about . Elements of this group are matrices with determinant one, and thus have two (algebraic, possibly repeated) eigenvalues for some (note here that we are using the algebraically closed nature of ). This allows us to classify elements of into three classes:
- The central elements ;
- The regular unipotent elements and their negations, which are non-central elements with a double eigenvalue at (or a double eigenvalue at ); and
- The regular semisimple elements, which have two distinct eigenvalues.
We collectively refer to regular unipotent elements and their negations as regular projectively unipotent elements.
Remark 4 The presence of the non-identity central element leads to some slight technical annoyances (for instance, it means that merely an almost simple algebraic group rather than a simple one, in the sense that the only normal algebraic subgroups are finite). One can eliminate this element by working instead with the projective special linear group , but we will not do so here. We remark that if one works in for then the classification of elements becomes significantly more complicated, for instance there exist elements which are semisimple (i.e. diagonalisable) but neither regular nor central, because some but not all of the eigenvalues may be repeated.
One can distinguish the unipotent elements from the semisimple ones using the trace: unipotent elements have trace , their negations have trace , and the semisimple elements have traces distinct from . The ability to classify elements purely from the trace is a very special fact concerning which breaks down completely for higher rank matrix groups, but we will not hesitate to take advantage of this fact here.
Associated to the above classification are some natural algebraic subgroups of , including the standard maximal torus
the one-dimensional standard unipotent group
and the two-dimensional standard Borel subgroup
More generally, we define a maximal torus of to be a conjugate (in ) of the standard maximal torus, a unipotent group to be a conjugate of the standard unipotent group, and a Borel subgroup to be a conjugate of the standard Borel subgroup. (This is not really the “right” way to define these groups, for the purpose of generalisation to other algebraic groups, but will suffice as long as we are only working with .) Note that one can also think of a Borel subgroup as the stabiliser of a one-dimensional subspace of (using the obvious action of on ). Using the Jordan normal form (again taking advantage of the algebraically closed nature of ), we can see how these groups interact with group elements:
- The central elements lie in every maximal torus and every Borel subgroup. The identity lies in every unipotent group, but lies in none of them.
- Every regular unipotent element lies in exactly one unipotent group, which in turn lies in exactly one Borel subgroup (the normaliser of the unipotent group). Conversely, a unipotent group consists entirely of regular unipotent elements and the identity .
- Every regular semisimple element lies in exactly one maximal torus, which in turn lies in exactly two Borel subgroups (the stabiliser of one of the eigenspaces of a regular semisimple element in the torus). Conversely, a maximal torus consists entirely of regular semisimple elements and the central elements .
Remark 5 If one was working in a non-algebraically closed field instead of in , one could subdivide the regular semisimple elements into two classes, the split case when the elements can be diagonalised inside , and the non-split case when they can only be diagonalised in a quadratic extension of . This similarly subdivides maximal tori into two families, the split tori and the non-split tori. In the case when one is working over the field , the unipotent, split semisimple, and non-split semisimple elements are referred to as parabolic, hyperbolic, and elliptic elements of respectively. Fortunately, in our applications we can work in algebraically closed fields and avoid these sorts of finer distinctions.
Ignoring the exceptional small examples of subgroups of , such as the binary isocahedral group mentioned earlier, there are two obvious ways to generate subgroups of . One is to pass from to a subfield , creating “arithmetic” subgroups of the form (or conjugates thereof). The other is to replace with an algebraic subgroup of the three-dimensional group , such as the maximal tori, unipotent groups, and Borel subgroups mentioned earlier. (Actually, these are the only (connected) proper algebraic subgroups of , as can be seen by consideration of the associated Lie algebras.)
Observe that if is an arithmetic subgroup, then its intersections , , capture a portion of proportionate to the dimensions involved, or more precisely that
Indeed, it is easy to see that , , and so forth. An important and general observation of Larsen and Pink is that this sort of behaviour is shared by all other finite subgroups of algebraic groups such as , as long as these groups are not (mostly) trapped in a proper algebraic subgroup. We first illustrate this phenomenon for the torus groups:
- (Non-concentration) For any maximal torus , one has .
- (Trapping) There is a Borel subgroup such that .
Proof: Suppose that the trapping hypothesis fails, thus for all Borel subgroups , where we interpret here to mean “less than for an arbitrarily small constant which we are at liberty to choose”. (If one is uncomfortable with this type of definition, one can instead consider a sequence of potential counterexamples to the above proposition in various groups , in which . Alternatively, one can also rephrase this argument if desired in the language of nonstandard analysis.) In particular, we see that any coset of occupies a fraction at most of . Thus, for instance, if we select an element from uniformly at random, then with probability , is non-zero, and similarly for . To put it more informally, the matrix entries of an element of are “generically” non-zero. Similarly if we first conjugate by a fixed group element.
We need to show that for any maximal torus . By conjugation we may take to be the standard maximal torus . Set , then is a subgroup of of the form
for some finite multiplicative subgroup of . We may assume that is larger than any given absolute constant, as the claim is trivial otherwise. Our task is to show that .
Let be a typical element of . By the preceding discussion, we may assume that are all non-zero. Since is a subgroup of , we have , thus
for all . We evaluate the inner matrix products to obtain that
Because are non-zero, we see that for all but values of , all four entries of the middle matrix here are non-zero. As a consequence, if one fixes and lets vary, all of the triple products given above are distinct. Note that if one takes the above triple product and multiplies the diagonal entries together, the terms cancel and one obtains . This rational map (as a function of ) is at most four-to-one; each value of this map is associated to at most four values of . Putting all this together, we conclude that there are different triple products one can form here as vary, and the claim follows.
Exercise 7 Establish a variant of Proposition 10 in which the maximal tori are replaced by unipotent groups.
Given a group element , let be the conjugacy class of . The behaviour of this class depends on the nature of :
- (i) If is central, show that .
- (ii) If is regular unipotent, show that is the space of all regular unipotent elements.
- (iii) If is negative of a regular unipotent element, show that is the space of all negatives of regular unipotent elements.
- (iv) If is regular semisimple, show that .
We can “dualise” the upper bound on maximal tori in Proposition 10 into a lower bound on conjugacy classes:
- (Large conjugacy classes) For any regular semisimple or regular projectively unipotent , one has .
- (Trapping) There is a Borel subgroup such that .
Proof: As before we may assume that for all Borel subgroups . Let be regular semisimple, and consider the map from to . For each , the preimage of by is contained in a coset of the centraliser of . As (and hence ) is regular semisimple or regular projectively unipotent, this centraliser is a maximal torus or (two copies of) a unipotent group (this can be seen by placing in Jordan normal form). By Proposition 10 or Exercise 7, we conclude that each preimage of has cardinality , which forces the range to have cardinality as claimed.
We remark that this gives a dichotomy analogous to Lemma 3 or Lemma 6 in the case . Namely, for any , either is empty, or . We will take advantage of a dichotomy similar to this (but for tori instead of conjugacy classes) in the next section.
We can match the lower bound in Proposition 11 with an upper bound:
- (Non-concentration) For any regular semisimple , one has .
- (Trapping) There is a Borel subgroup such that .
Proof: Again, we may assume that for all Borel subgroups . In particular, we may take larger than any given absolute constant. Let be regular semisimple, and let ; our task is to show that . Note from Exercise 8 that is conjugate to , and so is symmetric: . Also, for all .
Observe that whenever and , then the triple lies in ; conversely, every triple in arises in this manner. Thus we have the identity
which will give as required.
We now establish (1). We divide into several contributions. First suppose that . Then we bound the summand by ; there are summands here, leading to a total contribution of , which is acceptable. Similarly if , or , so we may restrict to the remaining cases when , , are distinct. In particular, are now either regular unipotent or regular semisimple.
We now consider the case in which are linearly dependent (in the space of matrices). For fixed , this constrains to either a maximal torus or a unipotent group (depending on whether is regular semisimple or regular projectively unipotent); this is easiest to see by placing in Jordan canonical form. By the preceding results, we see that there are choices of for each , leading to a contribution of in this case, which is acceptable. So we may now take to be linearly independent.
The set is the intersection of with the affine line
this is indeed a line when are linearly independent. In most cases, this line will intersect (which we can view as a quadric surface in ) in at most two points, leading to a contribution of for this case, which is acceptable. The only cases left to treat are when the line are incident to . This only occurs when the line takes the form for some and unipotent group ; this is easiest to see by multiplying on the left so that it contains the identity, and then placing another element of the line in Jordan normal form. In that case, we have
for all . This forces to all lie in the Borel subgroup associated to (this is easiest to see by first conjugating into the standard unipotent group ). In particular, both lie in . Furthermore, if we write , then the diagonal entries of are or , and so the diagonal entries of are either or or . In particular, is the stabiliser of one of the eigenvectors of – so for fixed , there are at most two choices for (recall that was regular). Furthermore, for fixed and , is constrained to lie in at most three cosets of . As such, there are only choices of here for each , giving another contribution of , and the claim follows.
We can use the upper bound on conjugacy classes to obtain a lower bound on tori:
- (Large torus) For any regular semisimple , one has , where is the unique maximal torus containing .
- (Trapping) There is a Borel subgroup such that .
Proof: We can of course assume that the trapping case does not occur. We consider the map from to . By Proposition 12, the range of has cardinality , so by the pigeonhole principle, there is a preimage of of cardinality . But all preimages are conjugate to each other, so the preimage of has cardinality . But this preimage is the intersection of with the centraliser of , which two cosets of , and so as required.
Exercise 10 Establish a variant of Proposition 13 in which is regular unipotent instead of regular semisimple, and is replaced with the unique unipotent group containing .
This gives a second (and particularly useful) dichotomy: assuming is not trapped by a Borel subgroup, for a maximal torus , is either at most two, or is comparable to .
To exploit this, we use the following counting argument of Larsen and Pink (which is also reminiscent of an old argument of Jordan, used to prove his theorem mentioned previously), followed by some ad hoc arguments specific to . We continue to assume that is not trapped by a Borel subgroup. Let denote the central elements of , thus is either or . Observe that every element in is either regular projectively unipotent or regular semisimple; in the latter case, the element lies in a unique maximal torus, which also contains . We conclude that
where ranges over all the maximal tori that intersect , and is the number of regular projective unipotents in .
If we conjugate a maximal torus by an element of , we get another maximal torus, or the same maximal torus if the element used to conjugate in was in the normaliser of . Thus, by the orbit-stabilizer theorem, there are exactly tori conjugate to in . We thus see that
where is a collection of representatives of conjugacy classes of maximal tori intersecting in a regular semisimple element. We rearrange this as
Note that if is a maximal torus, the normaliser of in has index . As such, has index at most two in , and so is either equal to or for each . From the preceding bounds on tori and unipotent elements, we also have and . As we are assuming to be large, the above equation is only consistent when has cardinality or , and is comparable to , or equivalently that is comparable to . Thus, has plenty of regular projective unipotents (matching the upper bound from Exercise 9); in particular, there is at least one regular unipotent.
Applying a conjugation, we may assume that contains , thus
The map maps to unipotent groups that intersect in regular unipotents. As there are regular unipotent elements in , we see that there are only such unipotent groups available. From the pigeonhole principle and conjugation, we conclude that the preimage of in this map has cardinality . But this preimage is simply . In particular, the quotient has cardinality . Observe that each element of this quotient acts on by conjugation, and the corresponding action on is multiplicative (by the square of a diagonal entry of an element in the quotient). As such, if we set
to be the “multiplicative symmetry set” of , then we have . As is a finite additive group, is a field of size at most , thus is a finite field of cardinality . Also, is a vector space over ; as contains and has cardinality , we see that . Thus we have
Also, as has to stabilise , we see that all elements of have diagonal elements whose square lies in . Combining this with (2), we see that takes the form
for some multiplicative group of elements whose square lies in (in particular, has index at most in ), and some function ; since has cardinality , must have cardinality . By taking the commutators of two matrices in (3), we see that
for all .
If we select such that is non-zero, then by conjugating by a suitable element of (which does not affect any of the previous control established on ) we may normalise to be zero. From (4) this makes for all . In particular, is almost completely contained in :
Now for any , the subgroups and of have index , so their intersection must have cardinality , thus
In particular, there must exist either a regular unipotent or a regular semisimple element of such that also lies in . If is regular semisimple in , it has an eigenbasis in , and so must map such an eigenbasis to another eigenbasis, and thus lies in . If instead is regular unipotent, it has the line as the unique (geometric) eigenspace; must preserve this eigenspace and thus lies in , and thus in and therefore in by (5). Combining the cases, we conclude that . We may therefore summarise our discussion as follows:
- (Arithmetic subgroup) There is a finite subfield of with such that is contained in a conjugate of (and is thus a subgroup of that conjugate of index ).
- (Trapping) There is a Borel subgroup such that .
In principle, the trapping case can be analysed further (using manipulations similar to those used to reach (5)) but we will not pursue this here. We remark that while these computations were somewhat lengthy (and less elementary and precise than the more classical results of Dickson), they can extend to more complicated algebraic groups, such as , or more generally to any algebraic group of bounded rank; see this paper of Larsen and Pink for details. In particular, Larsen and Pink were able to use these methods to establish an important subcase of the famous classification of finite simple groups, namely by verifying this classification for sufficiently large subgroups of a linear group of bounded rank over a field of arbitrary characteristic. It is conceivable that these methods may be extended in the future to give an alternate proof of the full classification (for sufficiently large groups, at least).
— 3. The product theorem in —
In this section we prove the case of Theorem 1. This result was first established (for fields of prime order) by Helfgott and then in the general case by Dinai; we will present a variant of Helfgott’s original argument which was developed by Breuillard, Green, and Tao and independently by Pyber and Szabo. It is convenient to rephrase Helfgott’s theorem as follows:
- (Close to trivial) One has .
- (Close to ) One has .
The problem now concerns the behaviour of finite approximate subgroups of . The first step will be to establish analogues of the Larsen-Pink non-concentration inequalities of the preceding section, but for approximate subgroups rather than genuine subgroups. (The observation that these inequalities could be usefully extended to the approximate group setting is due to Hrushovski, although for the specific case of , the most important of these inequalities for the purposes of proving Theorem 15 were first established by Helfgott.) We begin by eliminating concentration in linear subgroups.
Lemma 16 (Escape from subspaces) Let , be as in Theorem 15, and let . Then one of the following holds:
- (Close to trivial) One has .
- (Escape) For any and any -dimensional subspace of , such that is a subgroup of , one has .
In practice, we will only apply the escape conclusion for Borel subgroups of , which are intersections of with three-dimensional subspaces; however, we need to work with the more general escape construction in the proof of the lemma, for inductive purposes. The claim can in fact be established for any -dimensional subspace , or more generally for bounded complexity -dimensional algebraic varieties; this will be discussed in the next section.
Proof: We induct on . For , the claim is trivial, since in that case. Now suppose that , and the claim has already been proven for smaller values of .
Suppose that there exists an element of such that , so that has dimension strictly less than . From (6) we have
Since can be covered by right translates of , we can find such that
Let and . Then is contained in , and so is supported on a set of cardinality at most . Since
we thus see from the pigeonhole principle that
The left-hand side is , and thus
The set in the left-hand side is contained in both and in (here we use the group nature of ), and so
Applying the induction hypothesis, we conclude that , and the claim follows.
The only remaining case is when for all . As generates , this implies that is normalised by . But this is impossible if has dimension ; see Exercise 12 below.
Now we can obtain an approximate version of Proposition 10:
Proposition 17 (Larsen-Pink inequality, special case) Let , be as in Theorem 15. Then for any maximal torus , one has .
Proof: We may assume that for any given constant , as the claim is trivial otherwise. Similarly, by Lemma 16, we may assume that for all Borel subgroups .
We need to show that for any maximal torus . By conjugation we may take to be the standard maximal torus . (This may make generate a conjugate of , rather than itself, but this will not impact our argument). Set , then
for some finite subset of . We may assume that , as the claim is trivial otherwise. Our task is to show that .
As in the proof of Proposition 17, we may find an element of with all non-zero. Since , thus
for all . Arguing as in Proposition 17, we have
and the claim follows.
Exercise 13 Show that if the non-concentration conclusion in Proposition 17 holds, then for every maximal torus and every , one has .
We can now establish variants of the other Larsen-Pink inequalities from the preceding section:
Exercise 14 Establish a variant of Proposition 17 in which the maximal tori are replaced by unipotent groups.
Exercise 15 (Large conjugacy classes) Let , be as in Theorem 15. Show that for any regular semisimple or regular projectively unipotent , one has .
Exercise 16 (Larsen-Pink inequality, another special case) Let , be as in Theorem 15. Show that for any regular semisimple and any , one has .
Exercise 17 (Unipotent bound) Let , be as in Theorem 15. Show that of the elements of are unipotent.
Exercise 18 (Large tori) Let , be as in Theorem 15. Show that for any regular semisimple , one has , where is the unique maximal torus containing . In fact one has . (For the latter claim, cover by left translates of .)
We remark that versions of most the results in the above exercises were first obtained in the context of by Helfgott (and extended to in a subsequent paper of Helfgott). However, the lower bound in Exercise 18 was only obtained in those papers for some maximal tori meeting non-trivially, rather than for all such tori, leading to some additional technical complications in Helfgott’s proof of Theorem 15.
We now have a dichotomy: given a maximal torus , either has no regular semisimple elements (and thus contains only central elements), or else has cardinality . We exploit this dichotomy as follows. Call a maximal torus involved if contains a regular semisimple element.
Lemma 18 (Key lemma) Let , be as in Theorem 15. Then one of the following statements hold:
- (Invariance) If is an involved torus and , then is an involved torus.
- (Close to trivial) One has .
Proof: Let be an involved torus, then by the preceding exercise we have , and thus . Thus, one has for some , which implies that . In particular, if is not close to trivial, contains a regular semisimple element and so is involved, as desired.
We can now finish the proof of Theorem 15. Suppose is not close to trivial. As there are at most unipotent elements and central elements in , at least one regular semisimple element, and so there is at least one involved torus. By the above lemma, and the fact that generates , we see that the set of involved tori is invariant under conjugation by . As has cardinality , and its intersection with the stabiliser of a single torus has cardinality , we conclude that there are involved tori. By Exercise 18, each of these tori contains regular semisimple elements of . Since each regular semisimple element belongs to a unique maximal torus, we conclude that
as , we conclude that , as claimed.
- (Close to group) generates a finite subgroup of with .
- (Concentrated in Borel) There is a Borel subgroup of with .
(Hint: this does not follow directly from Theorem 15, but can be established by a modification of the proof of that theorem.)
Note that the above exercise can be combined with Theorem 14 to give a more detailed description of . The Borel group is solvable, and by using tools from additive combinatorics, such as Freiman’s theorem in solvable groups (or the Helfgott-Lindenstrauss conjecture, discussed in the previous quarter’s notes), one can give even more precise descriptions of (at the cost of losing polynomial dependence of the bounds on ), but we will not discuss these topics here.
Exercise 20 Use Exercise 19 to give an alternate proof of Theorem 5. (Hint: there are a number of ways to embed the sum-product problem in a field into a product problem in (or ). For instance, one consider the tripling properties of sets of the form in terms of sets such as or , and then project this set onto (or ), and combine this with the Katz-Tao lemma to obtain Theorem 5. More details of this connection can be found in Section 8 of this paper.) This is of course a much more complicated and inefficient way to establish the sum-product theorem, but it does illustrate the link between the two results (beyond the fact that both proofs exploit a dichotomy). Note alsot that the original proof of the product theorem in by Helfgott actually used the sum-product theorem in as a key tool.
— 4. The product theorem in —
We now discuss the extension of the product theory to the more general groups . Actually, the arguments here will be valid in any almost simple connected algebraic group of bounded rank, but for sake of concreteness we will work with . (This also has the (very) minor advantage that is an affine variety rather than a projective one, so we can work entirely in affine spaces such as ; related to this, the only regular maps we need to consider will be polynomial in nature.) There is also some recent work on product theorems in other algebraic groups than the almost simple ones; see for instance the papers of Pyber-Szabo, Gill-Helfgott, and Breuillard-Green-Tao for some examples of this. It is conceivable that a satisfactory understanding of approximate subgroups of arbitrary algebraic groups of bounded dimension will be available in the near future.
The treatment of the case relied on a number of ad hoc computations which were only valid in , and also on the pleasant fact that the only non-regular elements of were the central elements , which is certainly false for higher values of . In this paper, Helfgott was able to push his original arguments to the case, but again the arguments were somewhat ad hoc in nature and did not seem to extend to the general setting. However, the arguments based on the Larsen-Pink concentration estimates have proven to be quite general, and in particular can handle the situation of . The one catch is that instead of working with very concrete and explicit subsets of , such as Borel subgroups or other intersections with linear spaces , one has to work with more general algebraic subvarieties of . As such, a certain amount of basic algebraic geometry becomes necessary. Also, because we are seeking results with quantitative bounds, we will need to keep some track of the “complexity” of the varieties that one encounters in the course of the argument.
We now very quickly review some algebraic geometry notions, though for reasons of space we will not attempt to develop the full theory of algebraic geometry here, referring instead to standard texts such as Harris, Mumford, or Griffiths-Harris. As usual, algebraic geometry is cleanest when working over an algebraically closed field, so we will work primarily over .
Definition 19 (Variety) Let be integers, and let be an algebraically closed field. We write for the affine space .
- An (affine) variety of complexity at most is a set of the form
where and are polynomials of degree at most . (Thus the complexity parameter controls the dimension, degree, and number of polynomials needed to cut out the variety. Note that we do not assume our varieties to be irreducible, and as such what we call a variety corresponds to what is sometimes known as an algebraic set in the literature.) Note that the union or intersection of two varieties of complexity at most , is another variety of complexity at most .
- A variety is irreducible if it cannot be expressed as the union of two proper (i.e. strict) subvarieties.
Thus, for instance, is a variety of complexity in (after identifying this latter affine space with the space of matrices over ).
It is known that any variety can be expressed as the union of a finite number of irreducible components, and this decomposition is unique if we require that no component is contained in any other. Furthermore, to each irreducible variety one can assign a dimension , defined as the maximal integer for which there exists a chain
of irreducible varieties; this will be an integer between and . For instance, it can be shown that has dimension (as expected). We define the dimension of a non-irreducible variety to be the least integer such that can be covered by finitely many irreducible varieties of dimension . If a (non-empty) variety can be cut out from an irreducible variety by setting polynomials to zero, then one has . Since can be cut out from by a single polynomial, and is not equal to all of , we conclude in particular that has dimension .
One can show that the image of a -dimensional variety by a polynomial map is contained in a variety of dimension at most . One can thus produce upper bounds on the dimension of varieties, by covering them by polynomial images of varieties already known to be bounded by the same dimension.
An algebraic subgroup of is a subvariety of which is also a subgroup of . For instance, the standard maximal torus , consisting of all the diagonal elements of , is an algebraic subgroup; more generally, any maximal torus, by which we mean a conjugate of the standard maximal torus, is an algebraic subgroup.
Exercise 21 Show that every maximal torus has dimension and complexity .
Dual to the maximal tori are the conjugacy classes of regular semisimple elements. We call a element of regular semisimple if it has distinct eigenvalues, and is thus diagonalisable. Observe that each regular semisimple element lies in precisely one maximal torus.
Exercise 22 Show that every conjugacy class of a regular semisimple element has dimension and complexity .
If is a finite subfield of , then is a finite subgroup of , and is thus technically a -dimensional algebraic subgroup of . However, the complexity of this algebraic group is huge (comparable to the cardinality of ). It turns out that is “effectively Zariski-dense” in the sense that it cannot be captured in a low complexity algebraic variety:
Proof: is the hypersurface in cut out by the determinant polynomial. As is a proper subvariety, we can find a polynomial which is not a multiple of the determinant polynomial, but which vanishes on ; by the complexity hypothesis we may take to have degree . Our task is then to show that
But then one can express as a rational function of the other coordinates, and the left-hand side of (7) is contained in a set of the form for some polynomial of degree that is not identically zero. The claim then follows from the Schwartz-Zippel lemma, which we give as an exercise below.
Exercise 23 (Schwartz-Zippel lemma) Let be a finite field, and let be a polynomial of degree that is not identically zero. Show that
For an additional challenge, obtain the sharper bound
We contrast this with the size of itself:
The key non-concentration inequality we will need is the following.
This inequality subsumes results such as Proposition 17, Exercise 14, and Exercise 16. Note from Lemma 20 (and Exercise 24) that the trapping option of the above proposition cannot occur if generates and is sufficiently large depending on , while the non-concentration claim is trivial when ; thus in this case we have (9) unconditionally.
The proof of Proposition 21 is somewhat complicated and is deferred to the next section. We record some particular consequences of this inequality.
Exercise 25 (Consequences of the non-concentration inequality) Let be a -approximate subgroup of for some , which generates for some finite field .
- (i) If is a maximal torus (and thus of dimension ), show that .
- (ii) If denotes the elements of which are not regular semisimple, show that .
- (iii) If is regular semisimple, show that .
- (iv) Show that at most of the elements of are not regular semisimple.
- (v) For any regular semisimple , show that .
- (vi) For any regular semisimple , show that .
Exercise 26 By repeating the arguments of the preceding section, establish Theorem 1 for general .
Remark 6 There is an analogue of Exercise 19, in which the role of the Borel subgroups is replaced by proper algebraic subgroups of bounded complexity; see Theorem 5.5 of the paper of Breuillard, Green, and Tao for a more precise statement.
— 5. Proof of the Larsen-Pink inequality (optional) —
We now prove Proposition 21. In order to escape the burden of having to keep track of the complexity of everything, we will use the tool of ultraproducts (which we will phrase in the language of nonstandard analysis). See this previous blog post for a discussion of ultraproducts and how they can be used to turn quantitative (or “hard”) analysis tasks into qualitative (or “soft”) analysis tasks. One can also use the machinery of schemes and inverse limits as a substitute for the ultraproduct formalism; this is the approach taken in the paper of Larsen and Pink. The paper of Breuillard, Green, and Tao has a slightly reduced reliance on ultraproducts, at the cost of more complexity bookkeeping, while the Pyber-Szabo paper avoids ultraproducts altogether but has perhaps the most bookkeeping of all the papers mentioned here (but, by the same token, is the only argument currently known which gives effective bounds). We will thus presume some familiarity both with ultraproducts (and nonstandard analysis) and with algebraic geometry in this section.
As in the previously mentioned blog post, we select a non-principal ultrafilter , and use it to construct ultraproducts and nonstandard objects. (To ensure the existence of such an object, we shall assume the axiom of choice, as we have already been doing implicitly throughout this course.) We also use the usual nonstandard asymptotic notation, thus for instance denotes a nonstandard quantity bounded in magnitude by a standard number.
The quantitative Larsen-Pink inequality (Proposition 21) can then be deduced from the following nonstandard version, in which all references to complexity are now absent:
Proposition 22 (Larsen-Pink inequality) Let be standard. Let be a nonstandard algebraically complete field (i.e. an ultraproduct of standard algebraically complete fields). Let be a nonstandard natural number, and let be a nonstandard -approximate subgroup of (i.e. an ultraproduct of standard -approximate subgroups of ), and let be a subvariety of . Then one of the following is true:
Let us see why Proposition 22 implies Proposition 21. Suppose for contradiction that Proposition 21 failed. Carefully negating all the quantifiers (and using the axiom of choice), this means that there is a sequence of standard algebraically closed fields, a sequence of standard numbers, a sequence of -approximate subgroups of , and a standard , a sequence of subvarieties of of complexity at most , and a standard , such that for each , is not contained in a proper algebraicd subgroup of of complexity or less, and one has
Now one forms the ultralimit and the ultraproducts , , . Then is an algebraically closed field, is a nonstandard -approximate subgroup of , and is an algebraic subvariety of (here we use the uniform complexity bound). One can also show that ; see Lemma 3 of this blog post. As such, we have
so by Proposition 22, is contained in a proper algebraic subgroup of . By unpacking the coefficients of all the polynomials over used to cut out , we see that is itself an ultraproduct of proper algebraic subgroups of , of complexity bounded uniformly in . By Los’s theorem, one has for all sufficiently close to , which gives a contradiction for large enough.
It remains to establish Proposition 22. By Los’s theorem, the ultraproduct of algebraically closed fields is again algebraically closed, which allows us to use algebraic geometry in the nonstandard field without difficulty.
Let be the group generated by , and consider the Zariski closure of this group, that is to say the intersection of all the varieties containing . This is again an algebraic variety (here we use the Noetherian property of varieties, that there does not exist any infinite descending chain of varieties), and is also a group (exercise!), and is thus an algebraic subgroup of . If this subgroup is proper then we have the trapping propertly, so we may assume that the closure is all of . In other words, is Zariski dense in .
For any dimension between and inclusive, and any standard real , let us call -admissible if one has the bound
whenever is standard and is a -dimensional subvariety of . Our task is to show that is admissible for all . This claim is trivial at the two endpoints and ; the difficulty is to somehow “interpolate” between these two endpoints. We need the following combinatorial observation.
Exercise 27 (Extreme dimensions) Suppose for sake of contradiction that is inadmissible for some . Show that we can find dimensions
and a real number such that
- is not -admissible;
- is not -admissible;
- is -admissible whenever or ;
- is -admissible for any .
whenever has dimension strictly less than , or strictly greater than .
Proof: has dimension at most . By standard algebraic geometry, we see that for each , the set of for which the slice has dimension , has dimension at most . In particular, if , then by (12), (13) the contribution of such to is at most
while if , then the contribution is at most
(One may wonder about the question of uniformity in the notation, but in nonstandard analysis one can automatically gain such uniformity through countable saturation; see Exercise 20 of this blog post.) Summing over all we obtain the claim.
We will now use a counting argument (which is, unsurprisingly, related to the counting argument used to establish Proposition 17, or any of the other Larsen-Pink inequalities in preceding sections) to obtain a contradiction from these four estimates.
First, by decomposing into irreducible components (and using (12) to eliminate all lower-dimensional components) we may assume that are both irreducible.
The product is not necessarily a variety, but it is still a constructible set (i.e. a finite boolean combination of varieties), and can still be assigned a dimension (by equating the dimension of a constructible set with the dimension of its Zariski closure). As it contains a translate of , it has dimension at least . It would be convenient if had dimension strictly greater than . This is not necessarily the case, but it turns out that it becomes so after a generic conjugation, thanks to the almost simplicity of :
Exercise 28 (Almost simplicity) Show that the only proper normal subgroups of are those contained in the centre of , i.e. in the identity matrix multipled by the roots of unity. (Hint: Let be a normal subgroup of that contains an element which is not a multiple of the identity. Place that element in Jordan normal form and divide it by one of its conjugates to make it fix a subspace of ; iterate this procedure until one finds an element in that is the direct sum of the identity in and a non-central element of . Then use this to generate all of .)
Proposition 24 (Generic skewness) For generic (i.e. for all in outside of a lower-dimensional variety), the set has dimension strictly greater than .
Proof: Let , and assume that has dimension exactly . This set contains all the translates with , which are each -dimensional irreducible varieties. By splitting up into components, we conclude that there are only finitely many distinct translates . If we denote one of these translates as , the set is easily seen to be a variety (as it is the intersection of varieties for ); as a finite number of these sets cover , at least one of them has to be all of ; thus there is a such that for all . In particular, this implies that for all .
Let . Arguing as before, is a variety, and is also a group; it is thus an algebraic group, and by the preceding discussion we have .
The set is a variety. If it has dimension strictly less than that of , we are done, so we may assume this set is all of ; thus for all . By almost simplicity, the normal subgroup generated by is all of ; thus must be all of , thus for all . But this forces , a contradiction since is strictly less than .
Combining this proposition with the Zariski density of , we see that we can find for some standard such that has dimension strictly greater than .
Fix this . Let be the twisted product map . We have the double counting identity
Now, is a map from an irreducible -dimensional variety to a -dimensional variety with Zariski-dense image, and is thus a dominant map. Among other things, this implies that there is a subvariety of of dimension at most such that for all , the set has dimension . By (13), we then have
for all ; by another application of (13), we have
Combining these estimates we see that
The left-hand side simplifies to . But this then contradicts Lemma 23.