You are currently browsing the tag archive for the ‘definability’ tag.

I’ve just uploaded to the arXiv my paper “Expanding polynomials over finite fields of large characteristic, and a regularity lemma for definable sets“, submitted to Contrib. Disc. Math. The motivation of this paper is to understand a certain polynomial variant of the sum-product phenomenon in finite fields. This phenomenon asserts that if ${A}$ is a non-empty subset of a finite field ${F}$, then either the sumset ${A+A := \{a+b: a,b \in A\}}$ or product set ${A \cdot A := \{ab: a,b \in A \}}$ will be significantly larger than ${A}$, unless ${A}$ is close to a subfield of ${F}$ (or to ${\{1\}}$). In particular, in the regime when ${A}$ is large, say ${|F|^{1-c} < |A| \leq |F|}$, one expects an expansion bound of the form

$\displaystyle |A+A| + |A \cdot A| \gg (|F|/|A|)^{c'} |A| \ \ \ \ \ (1)$

for some absolute constants ${c, c' > 0}$. Results of this type are known; for instance, Hart, Iosevich, and Solymosi obtained precisely this bound for ${(c,c')=(3/10,1/3)}$ (in the case when ${|F|}$ is prime), which was then improved by Garaev to ${(c,c')=(1/3,1/2)}$.

We have focused here on the case when ${A}$ is a large subset of ${F}$, but sum-product estimates are also extremely interesting in the opposite regime in which ${A}$ is allowed to be small (see for instance the papers of KatzShen and Li and of Garaev for some recent work in this case, building on some older papers of Bourgain, Katz and myself and of Bourgain, Glibichuk, and Konyagin). However, the techniques used in these two regimes are rather different. For large subsets of ${F}$, it is often profitable to use techniques such as the Fourier transform or the Cauchy-Schwarz inequality to “complete” a sum over a large set (such as ${A}$) into a set over the entire field ${F}$, and then to use identities concerning complete sums (such as the Weil bound on complete exponential sums over a finite field). For small subsets of ${F}$, such techniques are usually quite inefficient, and one has to proceed by somewhat different combinatorial methods which do not try to exploit the ambient field ${F}$. But my paper focuses exclusively on the large ${A}$ regime, and unfortunately does not directly say much (except through reasoning by analogy) about the small ${A}$ case.

Note that it is necessary to have both ${A+A}$ and ${A \cdot A}$ appear on the left-hand side of (1). Indeed, if one just has the sumset ${A+A}$, then one can set ${A}$ to be a long arithmetic progression to give counterexamples to (1). Similarly, if one just has a product set ${A \cdot A}$, then one can set ${A}$ to be a long geometric progression. The sum-product phenomenon can then be viewed that it is not possible to simultaneously behave like a long arithmetic progression and a long geometric progression, unless one is already very close to behaving like a subfield.

Now we consider a polynomial variant of the sum-product phenomenon, where we consider a polynomial image

$\displaystyle P(A,A) := \{ P(a,b): a,b \in A \}$

of a set ${A \subset F}$ with respect to a polynomial ${P: F \times F \rightarrow F}$; we can also consider the asymmetric setting of the image

$\displaystyle P(A,B) := \{ P(a,b): a \in A,b \in B \}$

of two subsets ${A,B \subset F}$. The regime we will be interested is the one where the field ${F}$ is large, and the subsets ${A, B}$ of ${F}$ are also large, but the polynomial ${P}$ has bounded degree. Actually, for technical reasons it will not be enough for us to assume that ${F}$ has large cardinality; we will also need to assume that ${F}$ has large characteristic. (The two concepts are synonymous for fields of prime order, but not in general; for instance, the field with ${2^n}$ elements becomes large as ${n \rightarrow \infty}$ while the characteristic remains fixed at ${2}$, and is thus not going to be covered by the results in this paper.)

In this paper of Vu, it was shown that one could replace ${A \cdot A}$ with ${P(A,A)}$ in (1), thus obtaining a bound of the form

$\displaystyle |A+A| + |P(A,A)| \gg (|F|/|A|)^{c'} |A|$

whenever ${|A| \geq |F|^{1-c}}$ for some absolute constants ${c, c' > 0}$, unless the polynomial ${P}$ had the degenerate form ${P(x,y) = Q(L(x,y))}$ for some linear function ${L: F \times F \rightarrow F}$ and polynomial ${Q: F \rightarrow F}$, in which ${P(A,A)}$ behaves too much like ${A+A}$ to get reasonable expansion. In this paper, we focus instead on the question of bounding ${P(A,A)}$ alone. In particular, one can ask to classify the polynomials ${P}$ for which one has the weak expansion property

$\displaystyle |P(A,A)| \gg (|F|/|A|)^{c'} |A|$

whenever ${|A| \geq |F|^{1-c}}$ for some absolute constants ${c, c' > 0}$. One can also ask for stronger versions of this expander property, such as the moderate expansion property

$\displaystyle |P(A,A)| \gg |F|$

whenever ${|A| \geq |F|^{1-c}}$, or the almost strong expansion property

$\displaystyle |P(A,A)| \geq |F| - O( |F|^{1-c'})$

whenever ${|A| \geq |F|^{1-c}}$. (One can consider even stronger expansion properties, such as the strong expansion property ${|P(A,A)| \geq |F|-O(1)}$, but it was shown by Gyarmati and Sarkozy that this property cannot hold for polynomials of two variables of bounded degree when ${|F| \rightarrow \infty}$.) One can also consider asymmetric versions of these properties, in which one obtains lower bounds on ${|P(A,B)|}$ rather than ${|P(A,A)|}$.

The example of a long arithmetic or geometric progression shows that the polynomials ${P(x,y) = x+y}$ or ${P(x,y) = xy}$ cannot be expanders in any of the above senses, and a similar construction also shows that polynomials of the form ${P(x,y) = Q(f(x)+f(y))}$ or ${P(x,y) = Q(f(x) f(y))}$ for some polynomials ${Q, f: F \rightarrow F}$ cannot be expanders. On the other hand, there are a number of results in the literature establishing expansion for various polynomials in two or more variables that are not of this degenerate form (in part because such results are related to incidence geometry questions in finite fields, such as the finite field version of the Erdos distinct distances problem). For instance, Solymosi established weak expansion for polynomials of the form ${P(x,y) = f(x)+y}$ when ${f}$ is a nonlinear polynomial, with generalisations by Hart, Li, and Shen for various polynomials of the form ${P(x,y) = f(x) + g(y)}$ or ${P(x,y) = f(x) g(y)}$. Further examples of expanding polynomials appear in the work of Shkredov, Iosevich-Rudnev, and Bukh-Tsimerman, as well as the previously mentioned paper of Vu and of Hart-Li-Shen, and these papers in turn cite many further results which are in the spirit of the polynomial expansion bounds discussed here (for instance, dealing with the small ${A}$ regime, or working in other fields such as ${{\bf R}}$ instead of in finite fields ${F}$). We will not summarise all these results here; they are summarised briefly in my paper, and in more detail in the papers of Hart-Li-Shen and of Bukh-Tsimerman. But we will single out one of the results of Bukh-Tsimerman, which is one of most recent and general of these results, and closest to the results of my own paper. Roughly speaking, in this paper it is shown that a polynomial ${P(x,y)}$ of two variables and bounded degree will be a moderate expander if it is non-composite (in the sense that it does not take the form ${P(x,y) = Q(R(x,y))}$ for some non-linear polynomial ${Q}$ and some polynomial ${R}$, possibly having coefficients in the algebraic completion of ${F}$) and is monic on both ${x}$ and ${y}$, thus it takes the form ${P(x,y) = x^d + S(x,y)}$ for some ${d \geq 1}$ and some polynomial ${S}$ of degree at most ${d-1}$ in ${x}$, and similarly with the roles of ${x}$ and ${y}$ reversed, unless ${P}$ is of the form ${P(x,y) = f(x) + g(y)}$ or ${P(x,y) = f(x) g(y)}$ (in which case the expansion theory is covered to a large extent by the previous work of Hart, Li, and Shen).

Our first main result improves upon the Bukh-Tsimerman result by strengthening the notion of expansion and removing the non-composite and monic hypotheses, but imposes a condition of large characteristic. I’ll state the result here slightly informally as follows:

Theorem 1 (Criterion for moderate expansion) Let ${P: F \times F \rightarrow F}$ be a polynomial of bounded degree over a finite field ${F}$ of sufficiently large characteristic, and suppose that ${P}$ is not of the form ${P(x,y) = Q(f(x)+g(y))}$ or ${P(x,y) = Q(f(x)g(y))}$ for some polynomials ${Q,f,g: F \rightarrow F}$. Then one has the (asymmetric) moderate expansion property

$\displaystyle |P(A,B)| \gg |F|$

whenever ${|A| |B| \ggg |F|^{2-1/8}}$.

This is basically a sharp necessary and sufficient condition for asymmetric expansion moderate for polynomials of two variables. In the paper, analogous sufficient conditions for weak or almost strong expansion are also given, although these are not quite as satisfactory (particularly the conditions for almost strong expansion, which include a somewhat complicated algebraic condition which is not easy to check, and which I would like to simplify further, but was unable to).

The argument here resembles the Bukh-Tsimerman argument in many ways. One can view the result as an assertion about the expansion properties of the graph ${\{ (a,b,P(a,b)): a,b \in F \}}$, which can essentially be thought of as a somewhat sparse three-uniform hypergraph on ${F}$. Being sparse, it is difficult to directly apply techniques from dense graph or hypergraph theory for this situation; however, after a few applications of the Cauchy-Schwarz inequality, it turns out (as observed by Bukh and Tsimerman) that one can essentially convert the problem to one about the expansion properties of the set

$\displaystyle \{ (P(a,c), P(b,c), P(a,d), P(b,d)): a,b,c,d \in F \} \ \ \ \ \ (2)$

(actually, one should view this as a multiset, but let us ignore this technicality) which one expects to be a dense set in ${F^4}$, except in the case when the associated algebraic variety

$\displaystyle \{ (P(a,c), P(b,c), P(a,d), P(b,d)): a,b,c,d \in \overline{F} \}$

fails to be Zariski dense, but it turns out that in this case one can use some differential geometry and Riemann surface arguments (after first invoking the Lefschetz principle and the high characteristic hypothesis to work over the complex numbers instead over a finite field) to show that ${P}$ is of the form ${Q(f(x)+g(y))}$ or ${Q(f(x)g(y))}$. This reduction is related to the classical fact that the only one-dimensional algebraic groups over the complex numbers are the additive group ${({\bf C},+)}$, the multiplicative group ${({\bf C} \backslash \{0\},\times)}$, or the elliptic curves (but the latter have a group law given by rational functions rather than polynomials, and so ultimately end up being eliminated from consideration, though they would play an important role if one wanted to study the expansion properties of rational functions).

It remains to understand the structure of the set (2) is. To understand dense graphs or hypergraphs, one of the standard tools of choice is the Szemerédi regularity lemma, which carves up such graphs into a bounded number of cells, with the graph behaving pseudorandomly on most pairs of cells. However, the bounds in this lemma are notoriously poor (the regularity obtained is an inverse tower exponential function of the number of cells), and this makes this lemma unsuitable for the type of expansion properties we seek (in which we want to deal with sets ${A}$ which have a polynomial sparsity, e.g. ${|A| \sim |F|^{1-c}}$). Fortunately, in the case of sets such as (2) which are definable over the language of rings, it turns out that a much stronger regularity lemma is available, which I call the “algebraic regularity lemma”. I’ll state it (again, slightly informally) in the context of graphs as follows:

Lemma 2 (Algebraic regularity lemma) Let ${F}$ be a finite field of large characteristic, and let ${V, W}$ be definable sets over ${F}$ of bounded complexity (i.e. ${V, W}$ are subsets of ${F^n}$, ${F^m}$ for some bounded ${n,m}$ that can be described by some first-order predicate in the language of rings of bounded length and involving boundedly many constants). Let ${E}$ be a definable subset of ${V \times W}$, again of bounded complexity (one can view ${E}$ as a bipartite graph connecting ${V}$ and ${W}$). Then one can partition ${V, W}$ into a bounded number of cells ${V_1,\ldots,V_a}$, ${W_1,\ldots,W_b}$, still definable with bounded complexity, such that for all pairs ${i =1,\ldots a}$, ${j=1,\ldots,b}$, one has the regularity property

$\displaystyle |E \cap (A \times B)| = d_{ij} |A| |B| + O( |F|^{-1/4} |V| |W| )$

for all ${A \subset V_i, B \subset W_i}$, where ${d_{ij} := \frac{|E \cap (V_i \times W_j)|}{|V_i| |W_j|}}$ is the density of ${E}$ in ${V_i \times W_j}$.

This lemma resembles the Szemerédi regularity lemma, but regularises all pairs of cells (not just most pairs), and the regularity is of polynomial strength in ${|F|}$, rather than inverse tower exponential in the number of cells. Also, the cells are not arbitrary subsets of ${V,W}$, but are themselves definable with bounded complexity, which turns out to be crucial for applications. I am optimistic that this lemma will be useful not just for studying expanding polynomials, but for many other combinatorial questions involving dense subsets of definable sets over finite fields.

The above lemma is stated for graphs ${E \subset V \times W}$, but one can iterate it to obtain an analogous regularisation of hypergraphs ${E \subset V_1 \times \ldots \times V_k}$ for any bounded ${k}$ (for application to (2), we need ${k=4}$). This hypergraph regularity lemma, by the way, is not analogous to the strong hypergraph regularity lemmas of Rodl et al. and Gowers developed in the last six or so years, but closer in spirit to the older (but weaker) hypergraph regularity lemma of Chung which gives the same “order ${1}$” regularity that the graph regularity lemma gives, rather than higher order regularity.

One feature of the proof of Lemma 2 which I found striking was the need to use some fairly high powered technology from algebraic geometry, and in particular the Lang-Weil bound on counting points in varieties over a finite field (discussed in this previous blog post), and also the theory of the etale fundamental group. Let me try to briefly explain why this is the case. A model example of a definable set of bounded complexity ${E}$ is a set ${E \subset F^n \times F^m}$ of the form

$\displaystyle E = \{ (x,y) \in F^n \times F^m: \exists t \in F; P(x,y,t)=0 \}$

for some polynomial ${P: F^n \times F^m \times F \rightarrow F}$. (Actually, it turns out that one can essentially write all definable sets as an intersection of sets of this form; see this previous blog post for more discussion.) To regularise the set ${E}$, it is convenient to square the adjacency matrix, which soon leads to the study of counting functions such as

$\displaystyle \mu(x,x') := | \{ (y,t,t') \in F^m \times F \times F: P(x,y,t) = P(x',y,t') = 0 \}|.$

If one can show that this function ${\mu}$ is “approximately finite rank” in the sense that (modulo lower order errors, of size ${O(|F|^{-1/2})}$ smaller than the main term), this quantity depends only on a bounded number of bits of information about ${x}$ and a bounded number of bits of information about ${x'}$, then a little bit of linear algebra will then give the required regularity result.

One can recognise ${\mu(x,x')}$ as counting ${F}$-points of a certain algebraic variety

$\displaystyle V_{x,x'} := \{ (y,t,t') \in \overline{F}^m \times \overline{F} \times \overline{F}: P(x,y,t) = P(x',y,t') = 0 \}.$

The Lang-Weil bound (discussed in this previous post) provides a formula for this count, in terms of the number ${c(x,x')}$ of geometrically irreducible components of ${V_{x,x'}}$ that are defined over ${F}$ (or equivalently, are invariant with respect to the Frobenius endomorphism associated to ${F}$). So the problem boils down to ensuring that this quantity ${c(x,x')}$ is “generically bounded rank”, in the sense that for generic ${x,x'}$, its value depends only on a bounded number of bits of ${x}$ and a bounded number of bits of ${x'}$.

Here is where the étale fundamental group comes in. One can view ${V_{x,x'}}$ as a fibre product ${V_x \times_{\overline{F}^m} V_{x'}}$ of the varieties

$\displaystyle V_x := \{ (y,t) \in \overline{F}^m \times \overline{F}: P(x,y,t) = 0 \}$

and

$\displaystyle V_{x'} := \{ (y,t) \in \overline{F}^m \times \overline{F}: P(x',y,t) = 0 \}$

over ${\overline{F}^m}$. If one is in sufficiently high characteristic (or even better, in zero characteristic, which one can reduce to by an ultraproduct (or nonstandard analysis) construction, similar to that discussed in this previous post), the varieties ${V_x,V_{x'}}$ are generically finite étale covers of ${\overline{F}^m}$, and the fibre product ${V_x \times_{\overline{F}^m} V_{x'}}$ is then also generically a finite étale cover. One can count the components of a finite étale cover of a connected variety by counting the number of orbits of the étale fundamental group acting on a fibre of that variety (much as the number of components of a cover of a connected manifold is the number of orbits of the topological fundamental group acting on that fibre). So if one understands the étale fundamental group of a certain generic subset of ${\overline{F}^m}$ (formed by intersecting together an ${x}$-dependent generic subset of ${\overline{F}^m}$ with an ${x'}$-dependent generic subset), this in principle controls ${c(x,x')}$. It turns out that one can decouple the ${x}$ and ${x'}$ dependence of this fundamental group by using an étale version of the van Kampen theorem for the fundamental group, which I discussed in this previous blog post. With this fact (and another deep fact about the étale fundamental group in zero characteristic, namely that it is topologically finitely generated), one can obtain the desired generic bounded rank property of ${c(x,x')}$, which gives the regularity lemma.

In order to expedite the deployment of all this algebraic geometry (as well as some Riemann surface theory), it is convenient to use the formalism of nonstandard analysis (or the ultraproduct construction), which among other things can convert quantitative, finitary problems in large characteristic into equivalent qualitative, infinitary problems in zero characteristic (in the spirit of this blog post). This allows one to use several tools from those fields as “black boxes”; not just the theory of étale fundamental groups (which are considerably simpler and more favorable in characteristic zero than they are in positive characteristic), but also some results limiting the morphisms between compact Riemann surfaces of high genus (such as the de Franchis theorem, the Riemann-Hurwitz formula, or the fact that all morphisms between elliptic curves are essentially group homomorphisms), which would be quite unwieldy to utilise if one did not first pass to the zero characteristic case (and thence to the complex case) via the ultraproduct construction (followed by the Lefschetz principle).

I found this project to be particularly educational for me, as it forced me to wander outside of my usual range by quite a bit in order to pick up the tools from algebraic geometry and Riemann surfaces that I needed (in particular, I read through several chapters of EGA and SGA for the first time). This did however put me in the slightly unnerving position of having to use results (such as the Riemann existence theorem) whose proofs I have not fully verified for myself, but which are easy to find in the literature, and widely accepted in the field. I suppose this type of dependence on results in the literature is more common in the more structured fields of mathematics than it is in analysis, which by its nature has fewer reusable black boxes, and so key tools often need to be rederived and modified for each new application. (This distinction is discussed further in this article of Gowers.)

Much as group theory is the study of groups, or graph theory is the study of graphs, model theory is the study of models (also known as structures) of some language ${{\mathcal L}}$ (which, in this post, will always be a single-sorted, first-order language). A structure is a set ${X}$, equipped with one or more operations, constants, and relations. This is of course an extremely general type of mathematical object, but (quite remarkably) one can still say a substantial number of interesting things about very broad classes of structures.

We will observe the common abuse of notation of using the set ${X}$ as a metonym for the entire structure, much as we usually refer to a group ${(G,1,\cdot,()^{-1})}$ simply as ${G}$, a vector space ${(V, 0, +, \cdot)}$ simply as ${V}$, and so forth. Following another common bending of the rules, we also allow some operations on structures (such as the multiplicative inverse operation on a group or field) to only be partially defined, and we allow use of the usual simplifying conventions for mathematical formulas (e.g. writing ${a+b+c}$ instead of ${(a+b)+c}$ or ${a+(b+c)}$, in cases where associativity is known). We will also deviate slightly from the usual practice in logic by emphasising individual structures, rather than the theory of general classes of structures; for instance, we will talk about the theory of a single field such as ${{\bf R}}$ or ${{\bf C}}$, rather than the theory of all fields of a certain type (e.g. real closed fields or algebraically closed fields).

Once one has a structure ${X}$, one can introduce the notion of a definable subset of ${X}$, or more generally of a Cartesian power ${X^n}$ of ${X}$, defined as a set ${E \subset X^n}$ of the form

$\displaystyle E = \{ (x_1,\ldots,x_n): P(x_1,\ldots,x_n) \hbox{ true} \} \ \ \ \ \ (1)$

for some formula ${P}$ in the language ${{\mathcal L}}$ with ${n}$ free variables and any number of constants from ${X}$ (that is, ${P(x_1,\ldots,x_n)}$ is a well-formed formula built up from a finite number of constants ${c_1,\ldots,c_m}$ in ${X}$, the relations and operations on ${X}$, logical connectives such as ${\neg}$, ${\wedge}$, ${\implies}$, and the quantifiers ${\forall, \exists}$). Thus, for instance, in the theory of the arithmetic of the natural numbers ${{\bf N} = ({\bf N}, 0, 1, +, \times)}$, the set of primes ${{\mathcal P}}$ is a definable set, since we have

$\displaystyle {\mathcal P} = \{ x \in {\bf N}: (\exists y: x=y+2) \wedge \neg (\exists z,w: x = (z+2)(w+2)) \}.$

In the theory of the field of reals ${{\bf R} = ({\bf R}, 0, 1, +, -, \times, ()^{-1})}$, the unit circle ${S^1}$ is an example of a definable set,

$\displaystyle S^1 = \{ (x,y) \in {\bf R}^2: x^2+y^2 = 1 \},$

but so is the the complement of the circle,

$\displaystyle {\bf R}^2 \backslash S^1 = \{ (x,y) \in {\bf R}^2: \neg(x^2+y^2 = 1) \}$

and the interval ${[-1,1]}$:

$\displaystyle [-1,1] = \{ x \in {\bf R}: \exists y: x^2+y^2 = 1\}.$

Due to the unlimited use of constants, any finite subset of a power ${X^n}$ of any structure ${X}$ is, by our conventions, definable in that structure. (One can of course also consider definability without parameters (also known as ${0}$-definability), in which arbitrary constants are not permitted, but we will not do so here.)

We can isolate some special subclasses of definable sets:

• An atomic definable set is a set of the form (1) in which ${P()}$ is an atomic formula (i.e. it does not contain any logical connectives or quantifiers).
• A quantifier-free definable set is a set of the form (1) in which ${P()}$ is quantifier-free (i.e. it can contain logical connectives, but does not contain the quantifiers ${\forall, \exists}$).

Example 1 In the theory of a field such as ${{\bf R}}$, an atomic definable set is the same thing as an affine algebraic set (also known as an affine algebraic variety, with the understanding that varieties are not necessarily assumed to be irreducible), and a quantifier-free definable set is known as a constructible set; thus we see that algebraic geometry can be viewed in some sense as a special case of model theory. (Conversely, it can in fact be quite profitable to think of model theory as an abstraction of algebraic geometry; for instance, the concepts of Morley rank and Morley degree in model theory (discussed in this previous blog post) directly generalises the concepts of dimension and degree in algebraic geometry.) Over ${{\bf R}}$, the interval ${[-1,1]}$ is a definable set, but not a quantifier-free definable set (and certainly not an atomic definable set); and similarly for the primes over ${{\bf N}}$.

A quantifier-free definable set in ${X^n}$ is nothing more than a finite boolean combination of atomic definable sets; in other words, the class of quantifier-free definable sets over ${X}$ is the smallest class that contains the atomic definable sets and is closed under boolean operations such as complementation and union (which generate all the other boolean operations). Similarly, the class of definable sets over ${X}$ is the smallest class that contains the quantifier-free definable sets, and is also closed under the operation of projection ${\pi_n: E \mapsto \pi_n(E)}$ from ${X^{n+1}}$ to ${X^n}$ for every natural number ${n}$, where ${\pi_n: X^{n+1} \rightarrow X^n}$ is the map ${\pi_n(x_1,\ldots,x_n,x_{n+1}) := (x_1,\ldots,x_n)}$.

Some structures have the property of enjoying quantifier elimination, which means that every definable set is in fact a quantifier-free definable set, or equivalently that the projection of a quantifier-free definable set is again quantifier-free. For instance, an algebraically closed field ${k}$ (with the field operations) has quantifier elimination (i.e. the projection of a constructible set is again constructible); this fact can be proven by the classical tool of resultants, and among other things can be used to give a proof of Hilbert’s nullstellensatz. (Note though that projection does not necessary preserve the property of being atomic; for instance, the projection of the atomic set ${\{ (x,y) \in k^2: xy=1 \}}$ is the non-atomic, but still quantifier-free definable, set ${\{ x \in k: \neg (k=0) \}}$.) In the converse direction, it is not difficult to use the nullstellensatz to deduce quantifier elimination. For theory of the real field ${{\bf R}}$, which is not algebraically closed, one does not have quantifier elimination, as one can see from the example of the unit circle (which is a quantifier-free definable set) projecting down to the interval ${[-1,1]}$ (which is definable, but not quantifer-free definable). However, if one adds the additional operation of order ${<}$ to the reals, giving it the language of an ordered field rather than just a field, then quantifier elimination is recovered (the class of quantifier-free definable sets now enlarges to match the class of definable sets, which in this case is also the class of semi-algebraic sets); this is the famous Tarski-Seidenberg theorem.

On the other hand, many important structures do not have quantifier elimination; typically, the projection of a quantifier-free definable set is not, in general, quantifier-free definable. This failure of the projection property also shows up in many contexts outside of model theory; for instance, Lebesgue famously made the error of thinking that the projection of a Borel measurable set remained Borel measurable (it is merely an analytic set instead). Turing’s halting theorem can be viewed as an assertion that the projection of a decidable set (also known as a computable or recursive set) is not necessarily decidable (it is merely semi-decidable (or recursively enumerable) instead). The notorious P=NP problem can also be essentially viewed in this spirit; roughly speaking (and glossing over the placement of some quantifiers), it asks whether the projection of a polynomial-time decidable set is again polynomial-time decidable. And so forth. (See this blog post of Dick Lipton for further discussion of the subtleties of projections.)

Now we consider the status of quantifier elimination for the theory of a finite field ${F}$. If interpreted naively, quantifier elimination is trivial for a finite field ${F}$, since every subset of ${F^n}$ is finite and thus quantifier-free definable. However, we can recover an interesting question in one of two (essentially equivalent) ways. One is to work in the asymptotic regime in which the field ${F}$ is large, but the length of the formulae used to construct one’s definable sets stays bounded uniformly in the size of ${F}$ (where we view any constant in ${F}$ as contributing a unit amount to the length of a formula, no matter how large ${F}$ is). A simple counting argument then shows that only a small number of subsets of ${F^n}$ become definable in the asymptotic limit ${|F| \rightarrow \infty}$, since the number of definable sets clearly grows at most polynomially in ${|F|}$ for any fixed bound on the formula length, while the number of all subsets of ${|F|^n}$ grows exponentially in ${n}$.

Another way to proceed is to work not with a single finite field ${F}$, or even with a sequence ${F_m}$ of finite fields, but with the ultraproduct ${F = \prod_{m \rightarrow p} F_m}$ of a sequence of finite fields, and to study the properties of definable sets over this ultraproduct. (We will be using the notation of ultraproducts and nonstandard analysis from this previous blog post.) This approach is equivalent to the more finitary approach mentioned in the previous paragraph, at least if one does not care to track of the exact bounds on the length of the formulae involved. Indeed, thanks to Los’s theorem, a definable subset ${E}$ of ${F^n}$ is nothing more than the ultraproduct ${E = \prod_{m \rightarrow p} E_m}$ of definable subsets ${E_m}$ of ${F_m^n}$ for all ${m}$ sufficiently close to ${p}$, with the length of the formulae used to define ${E_m}$ uniformly bounded in ${m}$. In the language of nonstandard analysis, one can view ${F}$ as a nonstandard finite field.

The ultraproduct ${F}$ of finite fields is an important example of a pseudo-finite field – a field that obeys all the sentences in the languages of fields that finite fields do, but is not necessarily itself a finite field. The model theory of pseudo-finite fields was first studied systematically by Ax (in the same paper where the Ax-Grothendieck theorem, discussed previously on this blog, was established), with important further contributions by Kiefe, by Fried-Sacerdote, by two papers of Chatzidakis-van den Dries-Macintyre, and many other authors.

As mentioned before, quantifier elimination trivially holds for finite fields. But for infinite pseudo-finite fields, such as the ultraproduct ${F = \prod_{m \rightarrow p} F_m}$ of finite fields with ${|F_m|}$ going to infinity, quantifier elimination fails. For instance, in a finite field ${F_m}$, the set ${E_m := \{ x \in F_m: \exists y \in F_m: x=y^2 \}}$ of quadratic residues is a definable set, with a bounded formula length, and so in the ultraproduct ${F =\prod_{m \rightarrow p} F_m}$, the set ${E := \prod_{m\rightarrow p} E_m}$ of nonstandard quadratic residues is also a definable set. However, in one dimension, we see from the factor theorem that the only atomic definable sets are either finite or the whole field ${F}$, and so the only constructible sets (i.e. the only quantifier-free definable sets) are either finite or cofinite in ${F}$. Since the quadratic residues have asymptotic density ${1/2}$ in a large finite field, they cannot form a quantifier-free definable set, despite being definable.

Nevertheless, there is a very nice almost quantifier elimination result for these fields, in characteristic zero at least, which we phrase here as follows:

Theorem 1 (Almost quantifier elimination) Let ${F}$ be a nonstandard finite field of characteristic zero, and let ${E \subset F^n}$ be a definable set over ${F}$. Then ${E}$ is the union of finitely many sets of the form

$\displaystyle E = \{ x \in V(F): \exists t \in F: P(x,t) = 0 \} \ \ \ \ \ (2)$

where ${V(F)}$ is an atomic definable subset of ${F^n}$ (i.e. the ${F}$-points of an algebraic variety ${V}$ defined over ${F}$ in ${F^n}$) and ${P: F^{n+1} \rightarrow F}$ is a polynomial.

Results of this type were first obtained essentially due to Catarina Kiefe, although the formulation here is closer to that of Chatzidakis-van den Dries-Macintyre.

Informally, this theorem says that while we cannot quite eliminate all quantifiers from a definable set over a nonstandard finite field, we can eliminate all but one existential quantifier. Note that negation has also been eliminated in this theorem; for instance, the definable set ${F \backslash \{0\} = \{ x \in F: \neg(x=0) \}}$ uses a negation, but can also be described using a single existential quantifier as ${\{ x \in F: \exists t: xt = 1 \}}$.) I believe that there are more complicated analogues of this result in positive characteristic, but I have not studied this case in detail (Kiefe’s result does not assume characteristic zero, but her conclusion is slightly different from the one given here). In the one-dimensional case ${n=1}$, the only varieties ${V}$ are the affine line and finite sets, and we can simplify the above statement, namely that any definable subset of ${F}$ takes the form ${\{ x\in F: \exists t \in F: P(x,t) = 0 \}}$ for some polynomial ${P}$ (i.e. definable sets in ${F}$ are nothing more than the projections of the ${F}$-points of a plane curve).

There is an equivalent formulation of this theorem for standard finite fields, namely that if ${F}$ is a finite field and ${E \subset F^n}$ is definable using a formula of length at most ${M}$, then ${E}$ can be expressed in the form (2) with the degree of ${P}$ bounded by some quantity ${C_{M,n}}$ depending on ${M}$ and ${n}$, assuming that the characteristic of ${F}$ is sufficiently large depending on ${M, n}$.

The theorem gives quite a satisfactory description of definable sets in either standard or nonstandard finite fields (at least if one does not care about effective bounds in some of the constants, and if one is willing to exclude the small characteristic case); for instance, in conjunction with the Lang-Weil bound discussed in this recent blog post, it shows that any non-empty definable subset of a nonstandard finite field has a nonstandard cardinality of ${(\alpha + O(|F|^{-1/2})) |F|^d}$ for some positive standard rational ${\alpha}$ and integer ${d}$. Equivalently, any non-empty definable subset of ${F^n}$ for some standard finite field ${F}$ using a formula of length at most ${M}$ has a standard cardinality of ${(\alpha + O_{M,n}(|F|^{-1/2})) |F|^d}$ for some positive rational of height ${O_{M,n}(1)}$ and some natural number ${d}$ between ${0}$ and ${n}$. (For instance, in the example of the quadratic residues given above, ${d}$ is equal to ${1}$ and ${\alpha}$ equal to ${1/2}$.) There is a more precise statement to this effect, namely that the Poincaré series of a definable set is rational; see Kiefe’s paper for details.

Below the fold I give a proof of Theorem 1, which relies primarily on the Lang-Weil bound mentioned above.

This week, Henry Towsner concluded his portion of reading seminar of the Hrushovski paper, by discussing (a weaker, simplified version of) main model-theoretic theorem (Theorem 3.4 of Hrushovski), and described how this theorem implied the combinatorial application in Corollary 1.2 of Hrushovski. The presentation here differs slightly from that in Hrushovski’s paper, for instance by avoiding mention of the more general notions of S1 ideals and forking.

Here is a collection of resources so far on the Hrushovski paper:

One of my favorite open problems, which I have blogged about in the past, is that of establishing (or even correctly formulating) a non-commutative analogue of Freiman’s theorem. Roughly speaking, the question is this: given a finite set ${X}$ in a non-commutative group ${G}$ which is of small doubling in the sense that the product set ${X \cdot X := \{ xy: x, y \in X \}}$ is not much larger than ${X}$ (e.g. ${|X \cdot X| \leq K|X|}$ for some ${K = O(1)}$), what does this say about the structure of ${X}$? (For various technical reasons one may wish to replace small doubling by, say, small tripling (i.e. ${|X \cdot X \cdot X| = O( |X| )}$), and one may also wish to assume that ${X}$ contains the identity and is symmetric, ${X^{-1} = X}$, but these are relatively minor details.)

Sets of small doubling (or tripling), etc. can be thought of as “approximate groups”, since groups themselves have a doubling constant ${K := |X \cdot X|/|X|}$ equal to one. Another obvious example of an approximate group is that of an arithmetic progression in an additive group, and more generally of a ball (in the word metric) in a nilpotent group of bounded rank and step. It is tentatively conjectured that in fact all examples can somehow be “generated” out of these basic examples, although it is not fully clear at present what “generated” should mean.

A weaker conjecture along the same lines is that if ${X}$ is a set of small doubling, then there should be some sort of “pseudo-metric” ${\rho}$ on ${G}$ which is left-invariant, and for which ${X}$ is controlled (in some suitable sense) by the unit ball in this metric. (For instance, if ${X}$ was a subgroup of ${G}$, one would take the metric which identified all the left cosets of ${X}$ to a point, but was otherwise a discrete metric; if ${X}$ were a ball in a nilpotent group, one would use some rescaled version of the word metric, and so forth.) Actually for technical reasons one would like to work with a slightly weaker notion than a pseudo-metric, namely a Bourgain system, but let us again ignore this technicality here.

Recently, using some powerful tools from model theory combined with the theory of topological groups, Ehud Hrushovski has apparently achieved some breakthroughs on this problem, obtaining new structural control on sets of small doubling in arbitrary groups that was not previously accessible to the known combinatorial methods. The precise results are technical to state, but here are informal versions of two typical theorems. The first applies to sets of small tripling in an arbitrary group:

Theorem 1 (Rough version of Hrushovski Theorem 1.1) Let ${X}$ be a set of small tripling, then one can find a long sequence of nested symmetric sets ${X_1 \supset X_2 \supset X_3 \supset \ldots}$, all of size comparable to ${X}$ and contained in ${(X^{-1} X)^2}$, which are somewhat closed under multiplication in the sense that ${X_i \cdot X_i \subset X_{i-1}}$ for all ${i > 1}$, and which are fairly well closed under commutation in the sense that ${[X_i, X_j] \subset X_{i+j-1}}$. (There are also some additional statements to the effect that the ${X_n}$ efficiently cover each other, and also cover ${X}$, but I will omit those here.)

This nested sequence is somewhat analogous to a Bourgain system, though it is not quite the same notion.

If one assumes that ${X}$ is “perfect” in a certain sense, which roughly means that there is no non-trivial abelian quotient, then one can do significantly better:

Theorem 2 (Rough version of Hrushovski Corollary 1.2) Let ${X_0}$ be a set of small tripling, let ${X := X_0^{-1} X_0}$, and suppose that for almost all ${l}$-tuples ${a_1, \ldots, a_l \in X}$ (where ${l=O(1)}$), the conjugacy classes ${a_i^X := \{ x^{-1} ax: x \in X \}}$ generate most of ${X}$ in the sense that ${|a_1^X \cdot \ldots \cdot a_l^X| \gg |X|}$. Then a large part of ${X}$ is contained in a subgroup of size comparable to ${X}$.

Note that if one quotiented out by the commutator ${[X,X]}$, then all of the conjugacy classes ${a_i^X}$ would collapse to points. So the hypothesis here is basically a strong quantitative assertion to the effect that the commutator ${[X,X]}$ is extremely large, and rapidly fills out most of ${X}$ itself.

Here at UCLA, a group of logicians and I (consisting of Matthias Aschenbrenner, Isaac Goldbring, Greg Hjorth, Henry Towsner, Anush Tserunyan, and possibly others) have just started a weekly reading seminar to come to grips with the various combinatorial, logical, and group-theoretic notions in Hrushovski’s paper, of which we only have a partial understanding at present. The seminar is a physical one, rather than an online one, but I am going to try to put some notes on the seminar on this blog as it progresses, as I know that there are a couple of other mathematicians who are interested in these developments.

So far there have been two meetings of the seminar. In the first, I surveyed the state of knowledge of the noncommutative Freiman theorem, covering broadly the material in my previous blog post. In the second meeting, Isaac reviewed some key notions of model theory used in Hrushovski’s paper, in particular the notions of definability and type, which I will review below. It is not yet clear how these are going to be connected with the combinatorial side of things, but this is something which we will hopefully develop in future seminars. The near-term objective is to understand the statement of the main theorem on the model-theoretic side (Theorem 3.4 of Hrushovski), and then understand some of its easier combinatorial consequences, before going back and trying to understand the proof of that theorem.

[Update, Oct 19: Given the level of interest in this paper, readers are encouraged to discuss any aspect of that paper in the comments below, even if they are not currently being covered by the UCLA seminar.]