You are currently browsing the category archive for the ‘math.AG’ category.

Let {{\bf F}_q} be a finite field of order {q = p^n}, and let {C} be an absolutely irreducible smooth projective curve defined over {{\bf F}_q} (and hence over the algebraic closure {k := \overline{{\bf F}_q}} of that field). For instance, {C} could be the projective elliptic curve

\displaystyle C = \{ [x,y,z]: y^2 z = x^3 + ax z^2 + b z^3 \}

in the projective plane {{\bf P}^2 = \{ [x,y,z]: (x,y,z) \neq (0,0,0) \}}, where {a,b \in {\bf F}_q} are coefficients whose discriminant {-16(4a^3+27b^2)} is non-vanishing, which is the projective version of the affine elliptic curve

\displaystyle \{ (x,y): y^2 = x^3 + ax + b \}.

To each such curve {C} one can associate a genus {g}, which we will define later; for instance, elliptic curves have genus {1}. We can also count the cardinality {|C({\bf F}_q)|} of the set {C({\bf F}_q)} of {{\bf F}_q}-points of {C}. The Hasse-Weil bound relates the two:

Theorem 1 (Hasse-Weil bound) {||C({\bf F}_q)| - q - 1| \leq 2g\sqrt{q}}.

The usual proofs of this bound proceed by first establishing a trace formula of the form

\displaystyle |C({\bf F}_{p^n})| = p^n - \sum_{i=1}^{2g} \alpha_i^n + 1 \ \ \ \ \ (1)

 

for some complex numbers {\alpha_1,\dots,\alpha_{2g}} independent of {n}; this is in fact a special case of the Lefschetz-Grothendieck trace formula, and can be interpreted as an assertion that the zeta function associated to the curve {C} is rational. The task is then to establish a bound {|\alpha_i| \leq \sqrt{p}} for all {i=1,\dots,2g}; this (or more precisely, the slightly stronger assertion {|\alpha_i| = \sqrt{p}}) is the Riemann hypothesis for such curves. This can be done either by passing to the Jacobian variety of {C} and using a certain duality available on the cohomology of such varieties, known as Rosati involution; alternatively, one can pass to the product surface {C \times C} and apply the Riemann-Roch theorem for that surface.

In 1969, Stepanov introduced an elementary method (a version of what is now known as the polynomial method) to count (or at least to upper bound) the quantity {|C({\bf F}_q)|}. The method was initially restricted to hyperelliptic curves, but was soon extended to general curves. In particular, Bombieri used this method to give a short proof of the following weaker version of the Hasse-Weil bound:

Theorem 2 (Weak Hasse-Weil bound) If {q} is a perfect square, and {q \geq (g+1)^4}, then {|C({\bf F}_q)| \leq q + (2g+1) \sqrt{q} + 1}.

In fact, the bound on {|C({\bf F}_q)|} can be sharpened a little bit further, as we will soon see.

Theorem 2 is only an upper bound on {|C({\bf F}_q)|}, but there is a Galois-theoretic trick to convert (a slight generalisation of) this upper bound to a matching lower bound, and if one then uses the trace formula (1) (and the “tensor power trick” of sending {n} to infinity to control the weights {\alpha_i}) one can then recover the full Hasse-Weil bound. We discuss these steps below the fold.

I’ve discussed Bombieri’s proof of Theorem 2 in this previous post (in the special case of hyperelliptic curves), but now wish to present the full proof, with some minor simplifications from Bombieri’s original presentation; it is mostly elementary, with the deepest fact from algebraic geometry needed being Riemann’s inequality (a weak form of the Riemann-Roch theorem).

The first step is to reinterpret {|C({\bf F}_q)|} as the number of points of intersection between two curves {C_1,C_2} in the surface {C \times C}. Indeed, if we define the Frobenius endomorphism {\hbox{Frob}_q} on any projective space by

\displaystyle \hbox{Frob}_q( [x_0,\dots,x_n] ) := [x_0^q, \dots, x_n^q]

then this map preserves the curve {C}, and the fixed points of this map are precisely the {{\bf F}_q} points of {C}:

\displaystyle C({\bf F}_q) = \{ z \in C: \hbox{Frob}_q(z) = z \}.

Thus one can interpret {|C({\bf F}_q)|} as the number of points of intersection between the diagonal curve

\displaystyle \{ (z,z): z \in C \}

and the Frobenius graph

\displaystyle \{ (z, \hbox{Frob}_q(z)): z \in C \}

which are copies of {C} inside {C \times C}. But we can use the additional hypothesis that {q} is a perfect square to write this more symmetrically, by taking advantage of the fact that the Frobenius map has a square root

\displaystyle \hbox{Frob}_q = \hbox{Frob}_{\sqrt{q}}^2

with {\hbox{Frob}_{\sqrt{q}}} also preserving {C}. One can then also interpret {|C({\bf F}_q)|} as the number of points of intersection between the curve

\displaystyle C_1 := \{ (z, \hbox{Frob}_{\sqrt{q}}(z)): z \in C \} \ \ \ \ \ (2)

 

and its transpose

\displaystyle C_2 := \{ (\hbox{Frob}_{\sqrt{q}}(w), w): w \in C \}.

Let {k(C \times C)} be the field of rational functions on {C \times C} (with coefficients in {k}), and define {k(C_1)}, {k(C_2)}, and {k(C_1 \cap C_2)} analogously )(although {C_1 \cap C_2} is likely to be disconnected, so {k(C_1 \cap C_2)} will just be a ring rather than a field. We then (morally) have the commuting square

\displaystyle \begin{array}{ccccc} && k(C \times C) && \\ & \swarrow & & \searrow & \\ k(C_1) & & & & k(C_2) \\ & \searrow & & \swarrow & \\ && k(C_1 \cap C_2) && \end{array},

if we ignore the issue that a rational function on, say, {C \times C}, might blow up on all of {C_1} and thus not have a well-defined restriction to {C_1}. We use {\pi_1: k(C \times C) \rightarrow k(C_1)} and {\pi_2: k(C \times C) \rightarrow k(C_2)} to denote the restriction maps. Furthermore, we have obvious isomorphisms {\iota_1: k(C_1) \rightarrow k(C)}, {\iota_2: k(C_2) \rightarrow k(C)} coming from composing with the graphing maps {z \mapsto (z, \hbox{Frob}_{\sqrt{q}}(z))} and {w \mapsto (\hbox{Frob}_{\sqrt{q}}(w), w)}.

The idea now is to find a rational function {f \in k(C \times C)} on the surface {C \times C} of controlled degree which vanishes when restricted to {C_1}, but is non-vanishing (and not blowing up) when restricted to {C_2}. On {C_2}, we thus get a non-zero rational function {f \downharpoonright_{C_2}} of controlled degree which vanishes on {C_1 \cap C_2} – which then lets us bound the cardinality of {C_1 \cap C_2} in terms of the degree of {f \downharpoonright_{C_2}}. (In Bombieri’s original argument, one required vanishing to high order on the {C_1} side, but in our presentation, we have factored out a {\hbox{Frob}_{\sqrt{q}}} term which removes this high order vanishing condition.)

To find this {f}, we will use linear algebra. Namely, we will locate a finite-dimensional subspace {V} of {k(C \times C)} (consisting of certain “controlled degree” rational functions) which projects injectively to {k(C_2)}, but whose projection to {k(C_1)} has strictly smaller dimension than {V} itself. The rank-nullity theorem then forces the existence of a non-zero element {P} of {V} whose projection to {k(C_1)} vanishes, but whose projection to {k(C_2)} is non-zero.

Now we build {V}. Pick a {{\bf F}_q} point {P_\infty} of {C}, which we will think of as being a point at infinity. (For the purposes of proving Theorem 2, we may clearly assume that {C({\bf F}_q)} is non-empty.) Thus {P_\infty} is fixed by {\hbox{Frob}_q}. To simplify the exposition, we will also assume that {P_\infty} is fixed by the square root {\hbox{Frob}_{\sqrt{q}}} of {\hbox{Frob}_q}; in the opposite case when {\hbox{Frob}_{\sqrt{q}}} has order two when acting on {P_\infty}, the argument is essentially the same, but all references to {P_\infty} in the second factor of {C \times C} need to be replaced by {\hbox{Frob}_{\sqrt{q}} P_\infty} (we leave the details to the interested reader).

For any natural number {n}, define {R_n} to be the set of rational functions {f \in k(C)} which are allowed to have a pole of order up to {n} at {P_\infty}, but have no other poles on {C}; note that as we are assuming {C} to be smooth, it is unambiguous what a pole is (and what order it will have). (In the fancier language of divisors and Cech cohomology, we have {R_n = H^0( C, {\mathcal O}_C(-n P_\infty) )}.) The space {R_n} is clearly a vector space over {k}; one can view intuitively as the space of “polynomials” on {C} of “degree” at most {n}. When {n=0}, {R_0} consists just of the constant functions. Indeed, if {f \in R_0}, then the image {f(C)} of {f} avoids {\infty} and so lies in the affine line {k = {\mathbf P}^1 \backslash \{\infty\}}; but as {C} is projective, the image {f(C)} needs to be compact (hence closed) in {{\mathbf P}^1}, and must therefore be a point, giving the claim.

For higher {n \geq 1}, we have the easy relations

\displaystyle \hbox{dim}(R_{n-1}) \leq \hbox{dim}(R_n) \leq \hbox{dim}(R_{n-1})+1. \ \ \ \ \ (3)

 

The former inequality just comes from the trivial inclusion {R_{n-1} \subset R_n}. For the latter, observe that if two functions {f, g} lie in {R_n}, so that they each have a pole of order at most {n} at {P_\infty}, then some linear combination of these functions must have a pole of order at most {n-1} at {P_\infty}; thus {R_{n-1}} has codimension at most one in {R_n}, giving the claim.

From (3) and induction we see that each of the {R_n} are finite dimensional, with the trivial upper bound

\displaystyle \hbox{dim}(R_n) \leq n+1. \ \ \ \ \ (4)

 

Riemann’s inequality complements this with the lower bound

\displaystyle \hbox{dim}(R_n) \geq n+1-g, \ \ \ \ \ (5)

 

thus one has {\hbox{dim}(R_n) = \hbox{dim}(R_{n-1})+1} for all but at most {g} exceptions (in fact, exactly {g} exceptions as it turns out). This is a consequence of the Riemann-Roch theorem; it can be proven from abstract nonsense (the snake lemma) if one defines the genus {g} in a non-standard fashion (as the dimension of the first Cech cohomology {H^1(C)} of the structure sheaf {{\mathcal O}_C} of {C}), but to obtain this inequality with a standard definition of {g} (e.g. as the dimension of the zeroth Cech cohomolgy {H^0(C, \Omega_C^1)} of the line bundle of differentials) requires the more non-trivial tool of Serre duality.

At any rate, now that we have these vector spaces {R_n}, we will define {V \subset k(C \times C)} to be a tensor product space

\displaystyle V = R_\ell \otimes R_m

for some natural numbers {\ell, m \geq 0} which we will optimise in later. That is to say, {V} is spanned by functions of the form {(z,w) \mapsto f(z) g(w)} with {f \in R_\ell} and {g \in R_m}. This is clearly a linear subspace of {k(C \times C)} of dimension {\hbox{dim}(R_\ell) \hbox{dim}(R_m)}, and hence by Rieman’s inequality we have

\displaystyle \hbox{dim}(V) \geq (\ell+1-g) (m+1-g) \ \ \ \ \ (6)

 

if

\displaystyle \ell,m \geq g-1. \ \ \ \ \ (7)

 

Observe that {\iota_1 \circ \pi_1} maps a tensor product {(z,w) \mapsto f(z) g(w)} to a function {z \mapsto f(z) g(\hbox{Frob}_{\sqrt{q}} z)}. If {f \in R_\ell} and {g \in R_m}, then we see that the function {z \mapsto f(z) g(\hbox{Frob}_{\sqrt{q}} z)} has a pole of order at most {\ell+m\sqrt{q}} at {P_\infty}. We conclude that

\displaystyle \iota_1 \circ \pi_1( V ) \subset R_{\ell + m\sqrt{q}} \ \ \ \ \ (8)

 

and in particular by (4)

\displaystyle \hbox{dim}(\pi_1(V)) \leq \ell + m \sqrt{q} + 1 \ \ \ \ \ (9)

 

and similarly

\displaystyle \hbox{dim}(\pi_2(V)) \leq \ell \sqrt{q} + m + 1. \ \ \ \ \ (10)

 

We will choose {m} to be a bit bigger than {\ell}, to make the {\pi_2} image of {V} smaller than that of {\pi_1}. From (6), (10) we see that if we have the inequality

\displaystyle (\ell+1-g) (m+1-g) > \ell \sqrt{q}+m + 1 \ \ \ \ \ (11)

 

(together with (7)) then {\pi_2} cannot be injective.

On the other hand, we have the following basic fact:

Lemma 3 (Injectivity) If

\displaystyle \ell < \sqrt{q}, \ \ \ \ \ (12)

 

then {\pi_1: V \rightarrow \pi_1(V)} is injective.

Proof: From (3), we can find a linear basis {f_1,\dots,f_a} of {R_\ell} such that each of the {f_i} has a distinct order {d_i} of pole at {P_\infty} (somewhere between {0} and {\ell} inclusive). Similarly, we may find a linear basis {g_1,\dots,g_b} of {R_m} such that each of the {g_j} has a distinct order {e_j} of pole at {P_\infty} (somewhere between {0} and {m} inclusive). The functions {z \mapsto f_i(z) g_j(\hbox{Frob}_{\sqrt{q}} z)} then span {\iota_1(\pi_1(V))}, and the order of pole at {P_\infty} is {d_i + \sqrt{q} e_j}. But since {\ell < \sqrt{q}}, these orders are all distinct, and so these functions must be linearly independent. The claim follows. \Box

This gives us the following bound:

Proposition 4 Let {\ell,m} be natural numbers such that (7), (11), (12) hold. Then {|C({\bf F}_q)| \leq \ell + m \sqrt{q}}.

Proof: As {\pi_2} is not injective, we can find {f \in V} with {\pi_2(f)} vanishing. By the above lemma, the function {\iota_1(\pi_1(f))} is then non-zero, but it must also vanish on {\iota_1(C_1 \cap C_2)}, which has cardinality {|C({\bf F}_q)|}. On the other hand, by (8), {\iota_1(\pi_1(f))} has a pole of order at most {\ell+m\sqrt{q}} at {P_\infty} and no other poles. Since the number of poles and zeroes of a rational function on a projective curve must add up to zero, the claim follows. \Box

If {q \geq (g+1)^4}, we may make the explicit choice

\displaystyle m := \sqrt{q}+2g; \quad \ell := \lfloor \frac{g}{g+1} \sqrt{q} \rfloor + g + 1

and a brief calculation then gives Theorem 2. In some cases one can optimise things a bit further. For instance, in the genus zero case {g=0} (e.g. if {C} is just the projective line {{\mathbf P}^1}) one may take {\ell=1, m = \sqrt{q}} and conclude the absolutely sharp bound {|C({\bf F}_q)| \leq q+1} in this case; in the case of the projective line {{\mathbf P}^1}, the function {f} is in fact the very concrete function {f(z,w) := z - w^{\sqrt{q}}}.

Remark 1 When {q = p^{2n+1}} is not a perfect square, one can try to run the above argument using the factorisation {\hbox{Frob}_q = \hbox{Frob}_{p^n} \hbox{Frob}_{p^{n+1}}} instead of {\hbox{Frob}_q = \hbox{Frob}_{\sqrt{q}} \hbox{Frob}_{\sqrt{q}}}. This gives a weaker version of the above bound, of the shape {|C({\bf F}_q)| \leq q + O( \sqrt{p} \sqrt{q} )}. In the hyperelliptic case at least, one can erase this loss by working with a variant of the argument in which one requires {f} to vanish to high order at {C_1}, rather than just to first order; see this survey article of mine for details.

Read the rest of this entry »

Let {f: {\bf R}^3 \rightarrow {\bf R}} be an irreducible polynomial in three variables. As {{\bf R}} is not algebraically closed, the zero set {Z_{\bf R}(f) = \{ x \in{\bf R}^3: f(x)=0\}} can split into various components of dimension between {0} and {2}. For instance, if {f(x_1,x_2,x_3) = x_1^2+x_2^2}, the zero set {Z_{\bf R}(f)} is a line; more interestingly, if {f(x_1,x_2,x_3) = x_3^2 + x_2^2 - x_2^3}, then {Z_{\bf R}(f)} is the union of a line and a surface (or the product of an acnodal cubic curve with a line). We will assume that the {2}-dimensional component {Z_{{\bf R},2}(f)} is non-empty, thus defining a real surface in {{\bf R}^3}. In particular, this hypothesis implies that {f} is not just irreducible over {{\bf R}}, but is in fact absolutely irreducible (i.e. irreducible over {{\bf C}}), since otherwise one could use the complex factorisation of {f} to contain {Z_{\bf R}(f)} inside the intersection {{\bf Z}_{\bf C}(g) \cap {\bf Z}_{\bf C}(\bar{g})} of the complex zero locus of complex polynomial {g} and its complex conjugate, with {g,\bar{g}} having no common factor, forcing {Z_{\bf R}(f)} to be at most one-dimensional. (For instance, in the case {f(x_1,x_2,x_3)=x_1^2+x_2^2}, one can take {g(z_1,z_2,z_3) = z_1 + i z_2}.) Among other things, this makes {{\bf Z}_{{\bf R},2}(f)} a Zariski-dense subset of {{\bf Z}_{\bf C}(f)}, thus any polynomial identity which holds true at every point of {{\bf Z}_{{\bf R},2}(f)}, also holds true on all of {{\bf Z}_{\bf C}(f)}. This allows us to easily use tools from algebraic geometry in this real setting, even though the reals are not quite algebraically closed.

The surface {Z_{{\bf R},2}(f)} is said to be ruled if, for a Zariski open dense set of points {x \in Z_{{\bf R},2}(f)}, there exists a line {l_x = \{ x+tv_x: t \in {\bf R} \}} through {x} for some non-zero {v_x \in {\bf R}^3} which is completely contained in {Z_{{\bf R},2}(f)}, thus

\displaystyle f(x+tv_x)=0

for all {t \in {\bf R}}. Also, a point {x \in {\bf Z}_{{\bf R},2}(f)} is said to be a flecnode if there exists a line {l_x = \{ x+tv_x: t \in {\bf R}\}} through {x} for some non-zero {v_x \in {\bf R}^3} which is tangent to {Z_{{\bf R},2}(f)} to third order, in the sense that

\displaystyle f(x+tv_x)=O(t^4)

as {t \rightarrow 0}, or equivalently that

\displaystyle  \frac{d^j}{dt^j} f(x+tv_x)|_{t=0} = 0 \ \ \ \ \ (1)

for {j=0,1,2,3}. Clearly, if {Z_{{\bf R},2}(f)} is a ruled surface, then a Zariski open dense set of points on {Z_{{\bf R},2}} are a flecnode. We then have the remarkable theorem of Cayley and Salmon asserting the converse:

Theorem 1 (Cayley-Salmon theorem) Let {f: {\bf R}^3 \rightarrow {\bf R}} be an irreducible polynomial with {{\bf Z}_{{\bf R},2}} non-empty. Suppose that a Zariski dense set of points in {Z_{{\bf R},2}(f)} are flecnodes. Then {Z_{{\bf R},2}(f)} is a ruled surface.

Among other things, this theorem was used in the celebrated result of Guth and Katz that almost solved the Erdos distance problem in two dimensions, as discussed in this previous blog post. Vanishing to third order is necessary: observe that in a surface of negative curvature, such as the saddle {\{ (x_1,x_2,x_3): x_3 = x_1^2 - x_2^2 \}}, every point on the surface is tangent to second order to a line (the line in the direction for which the second fundamental form vanishes).

The original proof of the Cayley-Salmon theorem, dating back to at least 1915, is not easily accessible and not written in modern language. A modern proof of this theorem (together with substantial generalisations, for instance to higher dimensions) is given by Landsberg; the proof uses the machinery of modern algebraic geometry. The purpose of this post is to record an alternate proof of the Cayley-Salmon theorem based on classical differential geometry (in particular, the notion of torsion of a curve) and basic ODE methods (in particular, Gronwall’s inequality and the Picard existence theorem). The idea is to “integrate” the lines {l_x} indicated by the flecnode to produce smooth curves {\gamma} on the surface {{\bf Z}_{{\bf R},2}}; one then uses the vanishing (1) and some basic calculus to conclude that these curves have zero torsion and are thus planar curves. Some further manipulation using (1) (now just to second order instead of third) then shows that these curves are in fact straight lines, giving the ruling on the surface.

Update: Janos Kollar has informed me that the above theorem was essentially known to Monge in 1809; see his recent arXiv note for more details.

I thank Larry Guth and Micha Sharir for conversations leading to this post.

Read the rest of this entry »

The classical foundations of probability theory (discussed for instance in this previous blog post) is founded on the notion of a probability space {(\Omega, {\cal E}, {\bf P})} – a space {\Omega} (the sample space) equipped with a {\sigma}-algebra {{\cal E}} (the event space), together with a countably additive probability measure {{\bf P}: {\cal E} \rightarrow [0,1]} that assigns a real number in the interval {[0,1]} to each event.

One can generalise the concept of a probability space to a finitely additive probability space, in which the event space {{\cal E}} is now only a Boolean algebra rather than a {\sigma}-algebra, and the measure {\mu} is now only finitely additive instead of countably additive, thus {{\bf P}( E \vee F ) = {\bf P}(E) + {\bf P}(F)} when {E,F} are disjoint events. By giving up countable additivity, one loses a fair amount of measure and integration theory, and in particular the notion of the expectation of a random variable becomes problematic (unless the random variable takes only finitely many values). Nevertheless, one can still perform a fair amount of probability theory in this weaker setting.

In this post I would like to describe a further weakening of probability theory, which I will call qualitative probability theory, in which one does not assign a precise numerical probability value {{\bf P}(E)} to each event, but instead merely records whether this probability is zero, one, or something in between. Thus {{\bf P}} is now a function from {{\cal E}} to the set {\{0, I, 1\}}, where {I} is a new symbol that replaces all the elements of the open interval {(0,1)}. In this setting, one can no longer compute quantitative expressions, such as the mean or variance of a random variable; but one can still talk about whether an event holds almost surely, with positive probability, or with zero probability, and there are still usable notions of independence. (I will refer to classical probability theory as quantitative probability theory, to distinguish it from its qualitative counterpart.)

The main reason I want to introduce this weak notion of probability theory is that it becomes suited to talk about random variables living inside algebraic varieties, even if these varieties are defined over fields other than {{\bf R}} or {{\bf C}}. In algebraic geometry one often talks about a “generic” element of a variety {V} defined over a field {k}, which does not lie in any specified variety of lower dimension defined over {k}. Once {V} has positive dimension, such generic elements do not exist as classical, deterministic {k}-points {x} in {V}, since of course any such point lies in the {0}-dimensional subvariety {\{x\}} of {V}. There are of course several established ways to deal with this problem. One way (which one might call the “Weil” approach to generic points) is to extend the field {k} to a sufficiently transcendental extension {\tilde k}, in order to locate a sufficient number of generic points in {V(\tilde k)}. Another approach (which one might dub the “Zariski” approach to generic points) is to work scheme-theoretically, and interpret a generic point in {V} as being associated to the zero ideal in the function ring of {V}. However I want to discuss a third perspective, in which one interprets a generic point not as a deterministic object, but rather as a random variable {{\bf x}} taking values in {V}, but which lies in any given lower-dimensional subvariety of {V} with probability zero. This interpretation is intuitive, but difficult to implement in classical probability theory (except perhaps when considering varieties over {{\bf R}} or {{\bf C}}) due to the lack of a natural probability measure to place on algebraic varieties; however it works just fine in qualitative probability theory. In particular, the algebraic geometry notion of being “generically true” can now be interpreted probabilistically as an assertion that something is “almost surely true”.

It turns out that just as qualitative random variables may be used to interpret the concept of a generic point, they can also be used to interpret the concept of a type in model theory; the type of a random variable {x} is the set of all predicates {\phi(x)} that are almost surely obeyed by {x}. In contrast, model theorists often adopt a Weil-type approach to types, in which one works with deterministic representatives of a type, which often do not occur in the original structure of interest, but only in a sufficiently saturated extension of that structure (this is the analogue of working in a sufficiently transcendental extension of the base field). However, it seems that (in some cases at least) one can equivalently view types in terms of (qualitative) random variables on the original structure, avoiding the need to extend that structure. (Instead, one reserves the right to extend the sample space of one’s probability theory whenever necessary, as part of the “probabilistic way of thinking” discussed in this previous blog post.) We illustrate this below the fold with two related theorems that I will interpret through the probabilistic lens: the “group chunk theorem” of Weil (and later developed by Hrushovski), and the “group configuration theorem” of Zilber (and again later developed by Hrushovski). For sake of concreteness we will only consider these theorems in the theory of algebraically closed fields, although the results are quite general and can be applied to many other theories studied in model theory.

Read the rest of this entry »

Let {F} be a field. A definable set over {F} is a set of the form

\displaystyle  \{ x \in F^n | \phi(x) \hbox{ is true} \} \ \ \ \ \ (1)

where {n} is a natural number, and {\phi(x)} is a predicate involving the ring operations {+,\times} of {F}, the equality symbol {=}, an arbitrary number of constants and free variables in {F}, the quantifiers {\forall, \exists}, boolean operators such as {\vee,\wedge,\neg}, and parentheses and colons, where the quantifiers are always understood to be over the field {F}. Thus, for instance, the set of quadratic residues

\displaystyle  \{ x \in F | \exists y: x = y \times y \}

is definable over {F}, and any algebraic variety over {F} is also a definable set over {F}. Henceforth we will abbreviate “definable over {F}” simply as “definable”.

If {F} is a finite field, then every subset of {F^n} is definable, since finite sets are automatically definable. However, we can obtain a more interesting notion in this case by restricting the complexity of a definable set. We say that {E \subset F^n} is a definable set of complexity at most {M} if {n \leq M}, and {E} can be written in the form (1) for some predicate {\phi} of length at most {M} (where all operators, quantifiers, relations, variables, constants, and punctuation symbols are considered to have unit length). Thus, for instance, a hypersurface in {n} dimensions of degree {d} would be a definable set of complexity {O_{n,d}(1)}. We will then be interested in the regime where the complexity remains bounded, but the field size (or field characteristic) becomes large.

In a recent paper, I established (in the large characteristic case) the following regularity lemma for dense definable graphs, which significantly strengthens the Szemerédi regularity lemma in this context, by eliminating “bad” pairs, giving a polynomially strong regularity, and also giving definability of the cells:

Lemma 1 (Algebraic regularity lemma) Let {F} be a finite field, let {V,W} be definable non-empty sets of complexity at most {M}, and let {E \subset V \times W} also be definable with complexity at most {M}. Assume that the characteristic of {F} is sufficiently large depending on {M}. Then we may partition {V = V_1 \cup \ldots \cup V_m} and {W = W_1 \cup \ldots \cup W_n} with {m,n = O_M(1)}, with the following properties:

  • (Definability) Each of the {V_1,\ldots,V_m,W_1,\ldots,W_n} are definable of complexity {O_M(1)}.
  • (Size) We have {|V_i| \gg_M |V|} and {|W_j| \gg_M |W|} for all {i=1,\ldots,m} and {j=1,\ldots,n}.
  • (Regularity) We have

    \displaystyle  |E \cap (A \times B)| = d_{ij} |A| |B| + O_M( |F|^{-1/4} |V| |W| ) \ \ \ \ \ (2)

    for all {i=1,\ldots,m}, {j=1,\ldots,n}, {A \subset V_i}, and {B\subset W_j}, where {d_{ij}} is a rational number in {[0,1]} with numerator and denominator {O_M(1)}.

My original proof of this lemma was quite complicated, based on an explicit calculation of the “square”

\displaystyle  \mu(w,w') := \{ v \in V: (v,w), (v,w') \in E \}

of {E} using the Lang-Weil bound and some facts about the étale fundamental group. It was the reliance on the latter which was the main reason why the result was restricted to the large characteristic setting. (I then applied this lemma to classify expanding polynomials over finite fields of large characteristic, but I will not discuss these applications here; see this previous blog post for more discussion.)

Recently, Anand Pillay and Sergei Starchenko (and independently, Udi Hrushovski) have observed that the theory of the étale fundamental group is not necessary in the argument, and the lemma can in fact be deduced from quite general model theoretic techniques, in particular using (a local version of) the concept of stability. One of the consequences of this new proof of the lemma is that the hypothesis of large characteristic can be omitted; the lemma is now known to be valid for arbitrary finite fields {F} (although its content is trivial if the field is not sufficiently large depending on the complexity at most {M}).

Inspired by this, I decided to see if I could find yet another proof of the algebraic regularity lemma, again avoiding the theory of the étale fundamental group. It turns out that the spectral proof of the Szemerédi regularity lemma (discussed in this previous blog post) adapts very nicely to this setting. The key fact needed about definable sets over finite fields is that their cardinality takes on an essentially discrete set of values. More precisely, we have the following fundamental result of Chatzidakis, van den Dries, and Macintyre:

Proposition 2 Let {F} be a finite field, and let {M > 0}.

  • (Discretised cardinality) If {E} is a non-empty definable set of complexity at most {M}, then one has

    \displaystyle  |E| = c |F|^d + O_M( |F|^{d-1/2} ) \ \ \ \ \ (3)

    where {d = O_M(1)} is a natural number, and {c} is a positive rational number with numerator and denominator {O_M(1)}. In particular, we have {|F|^d \ll_M |E| \ll_M |F|^d}.

  • (Definable cardinality) Assume {|F|} is sufficiently large depending on {M}. If {V, W}, and {E \subset V \times W} are definable sets of complexity at most {M}, so that {E_w := \{ v \in V: (v,w) \in W \}} can be viewed as a definable subset of {V} that is definably parameterised by {w \in W}, then for each natural number {d = O_M(1)} and each positive rational {c} with numerator and denominator {O_M(1)}, the set

    \displaystyle  \{ w \in W: |E_w| = c |F|^d + O_M( |F|^{d-1/2} ) \} \ \ \ \ \ (4)

    is definable with complexity {O_M(1)}, where the implied constants in the asymptotic notation used to define (4) are the same as those that appearing in (3). (Informally: the “dimension” {d} and “measure” {c} of {E_w} depends definably on {w}.)

We will take this proposition as a black box; a proof can be obtained by combining the description of definable sets over pseudofinite fields (discussed in this previous post) with the Lang-Weil bound (discussed in this previous post). (The former fact is phrased using nonstandard analysis, but one can use standard compactness-and-contradiction arguments to convert such statements to statements in standard analysis, as discussed in this post.)

The above proposition places severe restrictions on the cardinality of definable sets; for instance, it shows that one cannot have a definable set of complexity at most {M} and cardinality {|F|^{1/2}}, if {|F|} is sufficiently large depending on {M}. If {E \subset V} are definable sets of complexity at most {M}, it shows that {|E| = (c+ O_M(|F|^{-1/2})) |V|} for some rational {0\leq c \leq 1} with numerator and denominator {O_M(1)}; furthermore, if {c=0}, we may improve this bound to {|E| = O_M( |F|^{-1} |V|)}. In particular, we obtain the following “self-improving” properties:

  • If {E \subset V} are definable of complexity at most {M} and {|E| \leq \epsilon |V|} for some {\epsilon>0}, then (if {\epsilon} is sufficiently small depending on {M} and {F} is sufficiently large depending on {M}) this forces {|E| = O_M( |F|^{-1} |V| )}.
  • If {E \subset V} are definable of complexity at most {M} and {||E| - c |V|| \leq \epsilon |V|} for some {\epsilon>0} and positive rational {c}, then (if {\epsilon} is sufficiently small depending on {M,c} and {F} is sufficiently large depending on {M,c}) this forces {|E| = c |V| + O_M( |F|^{-1/2} |V| )}.

It turns out that these self-improving properties can be applied to the coefficients of various matrices (basically powers of the adjacency matrix associated to {E}) that arise in the spectral proof of the regularity lemma to significantly improve the bounds in that lemma; we describe how this is done below the fold. We also make some connections to the stability-based proofs of Pillay-Starchenko and Hrushovski.

Read the rest of this entry »

I’ve just uploaded to the arXiv my article “Algebraic combinatorial geometry: the polynomial method in arithmetic combinatorics, incidence combinatorics, and number theory“, submitted to the new journal “EMS surveys in the mathematical sciences“.  This is the first draft of a survey article on the polynomial method – a technique in combinatorics and number theory for controlling a relevant set of points by comparing it with the zero set of a suitably chosen polynomial, and then using tools from algebraic geometry (e.g. Bezout’s theorem) on that zero set. As such, the method combines algebraic geometry with combinatorial geometry, and could be viewed as the philosophy of a combined field which I dub “algebraic combinatorial geometry”.   There is also an important extension of this method when one is working overthe reals, in which methods from algebraic topology (e.g. the ham sandwich theorem and its generalisation to polynomials), and not just algebraic geometry, come into play also.

The polynomial method has been used independently many times in mathematics; for instance, it plays a key role in the proof of Baker’s theorem in transcendence theory, or Stepanov’s method in giving an elementary proof of the Riemann hypothesis for finite fields over curves; in combinatorics, the nullstellenatz of Alon is also another relatively early use of the polynomial method.  More recently, it underlies Dvir’s proof of the Kakeya conjecture over finite fields and Guth and Katz’s near-complete solution to the Erdos distance problem in the plane, and can be used to give a short proof of the Szemeredi-Trotter theorem.  One of the aims of this survey is to try to present all of these disparate applications of the polynomial method in a somewhat unified context; my hope is that there will eventually be a systematic foundation for algebraic combinatorial geometry which naturally contains all of these different instances the polynomial method (and also suggests new instances to explore); but the field is unfortunately not at that stage of maturity yet.

This is something of a first draft, so comments and suggestions are even more welcome than usual.  (For instance, I have already had my attention drawn to some additional uses of the polynomial method in the literature that I was not previously aware of.)

[Note: the content of this post is standard number theoretic material that can be found in many textbooks (I am relying principally here on Iwaniec and Kowalski); I am not claiming any new progress on any version of the Riemann hypothesis here, but am simply arranging existing facts together.]

The Riemann hypothesis is arguably the most important and famous unsolved problem in number theory. It is usually phrased in terms of the Riemann zeta function {\zeta}, defined by

\displaystyle  \zeta(s) = \sum_{n=1}^\infty \frac{1}{n^s}

for {\hbox{Re}(s)>1} and extended meromorphically to other values of {s}, and asserts that the only zeroes of {\zeta} in the critical strip {\{ s: 0 \leq \hbox{Re}(s) \leq 1 \}} lie on the critical line {\{ s: \hbox{Re}(s)=\frac{1}{2} \}}.

One of the main reasons that the Riemann hypothesis is so important to number theory is that the zeroes of the zeta function in the critical strip control the distribution of the primes. To see the connection, let us perform the following formal manipulations (ignoring for now the important analytic issues of convergence of series, interchanging sums, branches of the logarithm, etc., in order to focus on the intuition). The starting point is the fundamental theorem of arithmetic, which asserts that every natural number {n} has a unique factorisation {n = p_1^{a_1} \ldots p_k^{a_k}} into primes. Taking logarithms, we obtain the identity

\displaystyle  \log n = \sum_{d|n} \Lambda(d) \ \ \ \ \ (1)

for any natural number {n}, where {\Lambda} is the von Mangoldt function, thus {\Lambda(n) = \log p} when {n} is a power of a prime {p} and zero otherwise. If we then perform a “Dirichlet-Fourier transform” by viewing both sides of (1) as coefficients of a Dirichlet series, we conclude that

\displaystyle  \sum_{n=1}^\infty \frac{\log n}{n^s} = \sum_{n=1}^\infty \sum_{d|n} \frac{\Lambda(d)}{n^s},

formally at least. Writing {n=dm}, the right-hand side factors as

\displaystyle (\sum_{d=1}^\infty \frac{\Lambda(d)}{d^s}) (\sum_{m=1}^\infty \frac{1}{m^s}) = \zeta(s) \sum_{n=1}^\infty \frac{\Lambda(n)}{n^s}

whereas the left-hand side is (formally, at least) equal to {-\zeta'(s)}. We conclude the identity

\displaystyle  \sum_{n=1}^\infty \frac{\Lambda(n)}{n^s} = -\frac{\zeta'(s)}{\zeta(s)},

(formally, at least). If we integrate this, we are formally led to the identity

\displaystyle  \sum_{n=1}^\infty \frac{\Lambda(n)}{\log n} n^{-s} = \log \zeta(s)

or equivalently to the exponential identity

\displaystyle  \zeta(s) = \exp( \sum_{n=1}^\infty \frac{\Lambda(n)}{\log n} n^{-s} ) \ \ \ \ \ (2)

which allows one to reconstruct the Riemann zeta function from the von Mangoldt function. (It is instructive exercise in enumerative combinatorics to try to prove this identity directly, at the level of formal Dirichlet series, using the fundamental theorem of arithmetic of course.) Now, as {\zeta} has a simple pole at {s=1} and zeroes at various places {s=\rho} on the critical strip, we expect a Weierstrass factorisation which formally (ignoring normalisation issues) takes the form

\displaystyle  \zeta(s) = \frac{1}{s-1} \times \prod_\rho (s-\rho) \times \ldots

(where we will be intentionally vague about what is hiding in the {\ldots} terms) and so we expect an expansion of the form

\displaystyle  \sum_{n=1}^\infty \frac{\Lambda(n)}{\log n} n^{-s} = - \log(s-1) + \sum_\rho \log(s-\rho) + \ldots. \ \ \ \ \ (3)

Note that

\displaystyle  \frac{1}{s-\rho} = \int_1^\infty t^{\rho-s} \frac{dt}{t}

and hence on integrating in {s} we formally have

\displaystyle  \log(s-\rho) = -\int_1^\infty t^{\rho-s-1} \frac{dt}{\log t}

and thus we have the heuristic approximation

\displaystyle  \log(s-\rho) \approx - \sum_{n=1}^\infty \frac{n^{\rho-s-1}}{\log n}.

Comparing this with (3), we are led to a heuristic form of the explicit formula

\displaystyle  \Lambda(n) \approx 1 - \sum_\rho n^{\rho-1}. \ \ \ \ \ (4)

When trying to make this heuristic rigorous, it turns out (due to the rough nature of both sides of (4)) that one has to interpret the explicit formula in some suitably weak sense, for instance by testing (4) against the indicator function {1_{[0,x]}(n)} to obtain the formula

\displaystyle  \sum_{n \leq x} \Lambda(n) \approx x - \sum_\rho \frac{x^{\rho}}{\rho} \ \ \ \ \ (5)

which can in fact be made into a rigorous statement after some truncation (the von Mangoldt explicit formula). From this formula we now see how helpful the Riemann hypothesis will be to control the distribution of the primes; indeed, if the Riemann hypothesis holds, so that {\hbox{Re}(\rho) = 1/2} for all zeroes {\rho}, it is not difficult to use (a suitably rigorous version of) the explicit formula to conclude that

\displaystyle  \sum_{n \leq x} \Lambda(n) = x + O( x^{1/2} \log^2 x ) \ \ \ \ \ (6)

as {x \rightarrow \infty}, giving a near-optimal “square root cancellation” for the sum {\sum_{n \leq x} \Lambda(n)-1}. Conversely, if one can somehow establish a bound of the form

\displaystyle  \sum_{n \leq x} \Lambda(n) = x + O( x^{1/2+\epsilon} )

for any fixed {\epsilon}, then the explicit formula can be used to then deduce that all zeroes {\rho} of {\zeta} have real part at most {1/2+\epsilon}, which leads to the following remarkable amplification phenomenon (analogous, as we will see later, to the tensor power trick): any bound of the form

\displaystyle  \sum_{n \leq x} \Lambda(n) = x + O( x^{1/2+o(1)} )

can be automatically amplified to the stronger bound

\displaystyle  \sum_{n \leq x} \Lambda(n) = x + O( x^{1/2} \log^2 x )

with both bounds being equivalent to the Riemann hypothesis. Of course, the Riemann hypothesis for the Riemann zeta function remains open; but partial progress on this hypothesis (in the form of zero-free regions for the zeta function) leads to partial versions of the asymptotic (6). For instance, it is known that there are no zeroes of the zeta function on the line {\hbox{Re}(s)=1}, and this can be shown by some analysis (either complex analysis or Fourier analysis) to be equivalent to the prime number theorem

\displaystyle  \sum_{n \leq x} \Lambda(n) =x + o(x);

see e.g. this previous blog post for more discussion.

The main engine powering the above observations was the fundamental theorem of arithmetic, and so one can expect to establish similar assertions in other contexts where some version of the fundamental theorem of arithmetic is available. One of the simplest such variants is to continue working on the natural numbers, but “twist” them by a Dirichlet character {\chi: {\bf Z} \rightarrow {\bf R}}. The analogue of the Riemann zeta function is then the (1), which encoded the fundamental theorem of arithmetic, can be twisted by {\chi} to obtain

\displaystyle  \chi(n) \log n = \sum_{d|n} \chi(d) \Lambda(d) \chi(\frac{n}{d}) \ \ \ \ \ (7)

and essentially the same manipulations as before eventually lead to the exponential identity

\displaystyle  L(s,\chi) = \exp( \sum_{n=1}^\infty \frac{\chi(n) \Lambda(n)}{\log n} n^{-s} ). \ \ \ \ \ (8)

which is a twisted version of (2), as well as twisted explicit formula, which heuristically takes the form

\displaystyle  \chi(n) \Lambda(n) \approx - \sum_\rho n^{\rho-1} \ \ \ \ \ (9)

for non-principal {\chi}, where {\rho} now ranges over the zeroes of {L(s,\chi)} in the critical strip, rather than the zeroes of {\zeta(s)}; a more accurate formulation, following (5), would be

\displaystyle  \sum_{n \leq x} \chi(n) \Lambda(n) \approx - \sum_\rho \frac{x^{\rho}}{\rho}. \ \ \ \ \ (10)

(See e.g. Davenport’s book for a more rigorous discussion which emphasises the analogy between the Riemann zeta function and the Dirichlet {L}-function.) If we assume the generalised Riemann hypothesis, which asserts that all zeroes of {L(s,\chi)} in the critical strip also lie on the critical line, then we obtain the bound

\displaystyle  \sum_{n \leq x} \chi(n) \Lambda(n) = O( x^{1/2} \log(x) \log(xq) )

for any non-principal Dirichlet character {\chi}, again demonstrating a near-optimal square root cancellation for this sum. Again, we have the amplification property that the above bound is implied by the apparently weaker bound

\displaystyle  \sum_{n \leq x} \chi(n) \Lambda(n) = O( x^{1/2+o(1)} )

(where {o(1)} denotes a quantity that goes to zero as {x \rightarrow \infty} for any fixed {q}). Next, one can consider other number systems than the natural numbers {{\bf N}} and integers {{\bf Z}}. For instance, one can replace the integers {{\bf Z}} with rings {{\mathcal O}_K} of integers in other number fields {K} (i.e. finite extensions of {{\bf Q}}), such as the quadratic extensions {K = {\bf Q}[\sqrt{D}]} of the rationals for various square-free integers {D}, in which case the ring of integers would be the ring of quadratic integers {{\mathcal O}_K = {\bf Z}[\omega]} for a suitable generator {\omega} (it turns out that one can take {\omega = \sqrt{D}} if {D=2,3\hbox{ mod } 4}, and {\omega = \frac{1+\sqrt{D}}{2}} if {D=1 \hbox{ mod } 4}). Here, it is not immediately obvious what the analogue of the natural numbers {{\bf N}} is in this setting, since rings such as {{\bf Z}[\omega]} do not come with a natural ordering. However, we can adopt an algebraic viewpoint to see the correct generalisation, observing that every natural number {n} generates a principal ideal {(n) = \{ an: a \in {\bf Z} \}} in the integers, and conversely every non-trivial ideal {{\mathfrak n}} in the integers is associated to precisely one natural number {n} in this fashion, namely the norm {N({\mathfrak n}) := |{\bf Z} / {\mathfrak n}|} of that ideal. So one can identify the natural numbers with the ideals of {{\bf Z}}. Furthermore, with this identification, the prime numbers correspond to the prime ideals, since if {p} is prime, and {a,b} are integers, then {ab \in (p)} if and only if one of {a \in (p)} or {b \in (p)} is true. Finally, even in number systems (such as {{\bf Z}[\sqrt{-5}]}) in which the classical version of the fundamental theorem of arithmetic fail (e.g. {6 = 2 \times 3 = (1-\sqrt{-5})(1+\sqrt{-5})}), we have the fundamental theorem of arithmetic for ideals: every ideal {\mathfrak{n}} in a Dedekind domain (which includes the ring {{\mathcal O}_K} of integers in a number field as a key example) is uniquely representable (up to permutation) as the product of a finite number of prime ideals {\mathfrak{p}} (although these ideals might not necessarily be principal). For instance, in {{\bf Z}[\sqrt{-5}]}, the principal ideal {(6)} factors as the product of four prime (but non-principal) ideals {(2, 1+\sqrt{-5})}, {(2, 1-\sqrt{-5})}, {(3, 1+\sqrt{-5})}, {(3, 1-\sqrt{-5})}. (Note that the first two ideals {(2,1+\sqrt{5}), (2,1-\sqrt{5})} are actually equal to each other.) Because we still have the fundamental theorem of arithmetic, we can develop analogues of the previous observations relating the Riemann hypothesis to the distribution of primes. The analogue of the Riemann hypothesis is now the Dedekind zeta function

\displaystyle  \zeta_K(s) := \sum_{{\mathfrak n}} \frac{1}{N({\mathfrak n})^s}

where the summation is over all non-trivial ideals in {{\mathcal O}_K}. One can also define a von Mangoldt function {\Lambda_K({\mathfrak n})}, defined as {\log N( {\mathfrak p})} when {{\mathfrak n}} is a power of a prime ideal {{\mathfrak p}}, and zero otherwise; then the fundamental theorem of arithmetic for ideals can be encoded in an analogue of (1) (or (7)),

\displaystyle  \log N({\mathfrak n}) = \sum_{{\mathfrak d}|{\mathfrak n}} \Lambda_K({\mathfrak d}) \ \ \ \ \ (11)

which leads as before to an exponential identity

\displaystyle  \zeta_K(s) = \exp( \sum_{{\mathfrak n}} \frac{\Lambda_K({\mathfrak n})}{\log N({\mathfrak n})} N({\mathfrak n})^{-s} ) \ \ \ \ \ (12)

and an explicit formula of the heuristic form

\displaystyle  \Lambda({\mathfrak n}) \approx 1 - \sum_\rho N({\mathfrak n})^{\rho-1}

or, a little more accurately,

\displaystyle  \sum_{N({\mathfrak n}) \leq x} \Lambda({\mathfrak n}) \approx x - \sum_\rho \frac{x^{\rho}}{\rho} \ \ \ \ \ (13)

in analogy with (5) or (10). Again, a suitable Riemann hypothesis for the Dedekind zeta function leads to good asymptotics for the distribution of prime ideals, giving a bound of the form

\displaystyle  \sum_{N({\mathfrak n}) \leq x} \Lambda({\mathfrak n}) = x + O( \sqrt{x} \log(x) (d+\log(Dx)) )

where {D} is the conductor of {K} (which, in the case of number fields, is the absolute value of the discriminant of {K}) and {d = \hbox{dim}_{\bf Q}(K)} is the degree of the extension of {K} over {{\bf Q}}. As before, we have the amplification phenomenon that the above near-optimal square root cancellation bound is implied by the weaker bound

\displaystyle  \sum_{N({\mathfrak n}) \leq x} \Lambda({\mathfrak n}) = x + O( x^{1/2+o(1)} )

where {o(1)} denotes a quantity that goes to zero as {x \rightarrow \infty} (holding {K} fixed). See e.g. Chapter 5 of Iwaniec-Kowalski for details.

As was the case with the Dirichlet {L}-functions, one can twist the Dedekind zeta function example by characters, in this case the Hecke characters; we will not do this here, but see e.g. Section 3 of Iwaniec-Kowalski for details.

Very analogous considerations hold if we move from number fields to function fields. The simplest case is the function field associated to the affine line {{\mathbb A}^1} and a finite field {{\mathbb F} = {\mathbb F}_q} of some order {q}. The polynomial functions on the affine line {{\mathbb A}^1/{\mathbb F}} are just the usual polynomial ring {{\mathbb F}[t]}, which then play the role of the integers {{\bf Z}} (or {{\mathcal O}_K}) in previous examples. This ring happens to be a unique factorisation domain, so the situation is closely analogous to the classical setting of the Riemann zeta function. The analogue of the natural numbers are the monic polynomials (since every non-trivial principal ideal is generated by precisely one monic polynomial), and the analogue of the prime numbers are the irreducible monic polynomials. The norm {N(f)} of a polynomial is the order of {{\mathbb F}[t] / (f)}, which can be computed explicitly as

\displaystyle  N(f) = q^{\hbox{deg}(f)}.

Because of this, we will normalise things slightly differently here and use {\hbox{deg}(f)} in place of {\log N(f)} in what follows. The (local) zeta function {\zeta_{{\mathbb A}^1/{\mathbb F}}(s)} is then defined as

\displaystyle  \zeta_{{\mathbb A}^1/{\mathbb F}}(s) = \sum_f \frac{1}{N(f)^s}

where {f} ranges over monic polynomials, and the von Mangoldt function {\Lambda_{{\mathbb A}^1/{\mathbb F}}(f)} is defined to equal {\hbox{deg}(g)} when {f} is a power of a monic irreducible polynomial {g}, and zero otherwise. Note that because {N(f)} is always a power of {q}, the zeta function here is in fact periodic with period {2\pi i / \log q}. Because of this, it is customary to make a change of variables {T := q^{-s}}, so that

\displaystyle  \zeta_{{\mathbb A}^1/{\mathbb F}}(s) = Z( {\mathbb A}^1/{\mathbb F}, T )

and {Z} is the renormalised zeta function

\displaystyle  Z( {\mathbb A}^1/{\mathbb F}, T ) = \sum_f T^{\hbox{deg}(f)}. \ \ \ \ \ (14)

We have the analogue of (1) (or (7) or (11)):

\displaystyle  \hbox{deg}(f) = \sum_{g|f} \Lambda_{{\mathbb A}^1/{\mathbb F}}(g), \ \ \ \ \ (15)

which leads as before to an exponential identity

\displaystyle  Z( {\mathbb A}^1/{\mathbb F}, T ) = \exp( \sum_f \frac{\Lambda_{{\mathbb A}^1/{\mathbb F}}(f)}{\hbox{deg}(f)} T^{\hbox{deg}(f)} ) \ \ \ \ \ (16)

analogous to (2), (8), or (12). It also leads to the explicit formula

\displaystyle  \Lambda_{{\mathbb A}^1/{\mathbb F}}(f) \approx 1 - \sum_\rho N(f)^{\rho-1}

where {\rho} are the zeroes of the original zeta function {\zeta_{{\mathbb A}^1/{\mathbb F}}(s)} (counting each residue class of the period {2\pi i/\log q} just once), or equivalently

\displaystyle  \Lambda_{{\mathbb A}^1/{\mathbb F}}(f) \approx 1 - q^{-\hbox{deg}(f)} \sum_\alpha \alpha^{\hbox{deg}(f)},

where {\alpha} are the reciprocals of the roots of the normalised zeta function {Z( {\mathbb A}^1/{\mathbb F}, T )} (or to put it another way, {1-\alpha T} are the factors of this zeta function). Again, to make proper sense of this heuristic we need to sum, obtaining

\displaystyle  \sum_{\hbox{deg}(f) = n} \Lambda_{{\mathbb A}^1/{\mathbb F}}(f) \approx q^n - \sum_\alpha \alpha^n.

As it turns out, in the function field setting, the zeta functions are always rational (this is part of the Weil conjectures), and the above heuristic formula is basically exact up to a constant factor, thus

\displaystyle  \sum_{\hbox{deg}(f) = n} \Lambda_{{\mathbb A}^1/{\mathbb F}}(f) = q^n - \sum_\alpha \alpha^n + c \ \ \ \ \ (17)

for an explicit integer {c} (independent of {n}) arising from any potential pole of {Z} at {T=1}. In the case of the affine line {{\mathbb A}^1}, the situation is particularly simple, because the zeta function {Z( {\mathbb A}^1/{\mathbb F}, T)} is easy to compute. Indeed, since there are exactly {q^n} monic polynomials of a given degree {n}, we see from (14) that

\displaystyle  Z( {\mathbb A}^1/{\mathbb F}, T ) = \sum_{n=0}^\infty q^n T^n = \frac{1}{1-qT}

so in fact there are no zeroes whatsoever, and no pole at {T=1} either, so we have an exact prime number theorem for this function field:

\displaystyle  \sum_{\hbox{deg}(f) = n} \Lambda_{{\mathbb A}^1/{\mathbb F}}(f) = q^n \ \ \ \ \ (18)

Among other things, this tells us that the number of irreducible monic polynomials of degree {n} is {q^n/n + O(q^{n/2})}.

We can transition from an algebraic perspective to a geometric one, by viewing a given monic polynomial {f \in {\mathbb F}[t]} through its roots, which are a finite set of points in the algebraic closure {\overline{{\mathbb F}}} of the finite field {{\mathbb F}} (or more suggestively, as points on the affine line {{\mathbb A}^1( \overline{{\mathbb F}} )}). The number of such points (counting multiplicity) is the degree of {f}, and from the factor theorem, the set of points determines the monic polynomial {f} (or, if one removes the monic hypothesis, it determines the polynomial {f} projectively). These points have an action of the Galois group {\hbox{Gal}( \overline{{\mathbb F}} / {\mathbb F} )}. It is a classical fact that this Galois group is in fact a cyclic group generated by a single element, the (geometric) Frobenius map {\hbox{Frob}: x \mapsto x^q}, which fixes the elements of the original finite field {{\mathbb F}} but permutes the other elements of {\overline{{\mathbb F}}}. Thus the roots of a given polynomial {f} split into orbits of the Frobenius map. One can check that the roots consist of a single such orbit (counting multiplicity) if and only if {f} is irreducible; thus the fundamental theorem of arithmetic can be viewed geometrically as as the orbit decomposition of any Frobenius-invariant finite set of points in the affine line.

Now consider the degree {n} finite field extension {{\mathbb F}_n} of {{\mathbb F}} (it is a classical fact that there is exactly one such extension up to isomorphism for each {n}); this is a subfield of {\overline{{\mathbb F}}} of order {q^n}. (Here we are performing a standard abuse of notation by overloading the subscripts in the {{\mathbb F}} notation; thus {{\mathbb F}_q} denotes the field of order {q}, while {{\mathbb F}_n} denotes the extension of {{\mathbb F} = {\mathbb F}_q} of order {n}, so that we in fact have {{\mathbb F}_n = {\mathbb F}_{q^n}} if we use one subscript convention on the left-hand side and the other subscript convention on the right-hand side. We hope this overloading will not cause confusion.) Each point {x} in this extension (or, more suggestively, the affine line {{\mathbb A}^1({\mathbb F}_n)} over this extension) has a minimal polynomial – an irreducible monic polynomial whose roots consist the Frobenius orbit of {x}. Since the Frobenius action is periodic of period {n} on {{\mathbb F}_n}, the degree of this minimal polynomial must divide {n}. Conversely, every monic irreducible polynomial of degree {d} dividing {n} produces {d} distinct zeroes that lie in {{\mathbb F}_d} (here we use the classical fact that finite fields are perfect) and hence in {{\mathbb F}_n}. We have thus partitioned {{\mathbb A}^1({\mathbb F}_n)} into Frobenius orbits (also known as closed points), with each monic irreducible polynomial {f} of degree {d} dividing {n} contributing an orbit of size {d = \hbox{deg}(f) = \Lambda(f^{n/d})}. From this we conclude a geometric interpretation of the left-hand side of (18):

\displaystyle  \sum_{\hbox{deg}(f) = n} \Lambda_{{\mathbb A}^1/{\mathbb F}}(f) = \sum_{x \in {\mathbb A}^1({\mathbb F}_n)} 1. \ \ \ \ \ (19)

The identity (18) thus is equivalent to the thoroughly boring fact that the number of {{\mathbb F}_n}-points on the affine line {{\mathbb A}^1} is equal to {q^n}. However, things become much more interesting if one then replaces the affine line {{\mathbb A}^1} by a more general (geometrically) irreducible curve {C} defined over {{\mathbb F}}; for instance one could take {C} to be an ellpitic curve

\displaystyle  E = \{ (x,y): y^2 = x^3 + ax + b \} \ \ \ \ \ (20)

for some suitable {a,b \in {\mathbb F}}, although the discussion here applies to more general curves as well (though to avoid some minor technicalities, we will assume that the curve is projective with a finite number of {{\mathbb F}}-rational points removed). The analogue of {{\mathbb F}[t]} is then the coordinate ring of {C} (for instance, in the case of the elliptic curve (20) it would be {{\mathbb F}[x,y] / (y^2-x^3-ax-b)}), with polynomials in this ring producing a set of roots in the curve {C( \overline{\mathbb F})} that is again invariant with respect to the Frobenius action (acting on the {x} and {y} coordinates separately). In general, we do not expect unique factorisation in this coordinate ring (this is basically because Bezout’s theorem suggests that the zero set of a polynomial on {C} will almost never consist of a single (closed) point). Of course, we can use the algebraic formalism of ideals to get around this, setting up a zeta function

\displaystyle  \zeta_{C/{\mathbb F}}(s) = \sum_{{\mathfrak f}} \frac{1}{N({\mathfrak f})^s}

and a von Mangoldt function {\Lambda_{C/{\mathbb F}}({\mathfrak f})} as before, where {{\mathfrak f}} would now run over the non-trivial ideals of the coordinate ring. However, it is more instructive to use the geometric viewpoint, using the ideal-variety dictionary from algebraic geometry to convert algebraic objects involving ideals into geometric objects involving varieties. In this dictionary, a non-trivial ideal would correspond to a proper subvariety (or more precisely, a subscheme, but let us ignore the distinction between varieties and schemes here) of the curve {C}; as the curve is irreducible and one-dimensional, this subvariety must be zero-dimensional and is thus a (multi-)set of points {\{x_1,\ldots,x_k\}} in {C}, or equivalently an effective divisor {[x_1] + \ldots + [x_k]} of {C}; this generalises the concept of the set of roots of a polynomial (which corresponds to the case of a principal ideal). Furthermore, this divisor has to be rational in the sense that it is Frobenius-invariant. The prime ideals correspond to those divisors (or sets of points) which are irreducible, that is to say the individual Frobenius orbits, also known as closed points of {C}. With this dictionary, the zeta function becomes

\displaystyle  \zeta_{C/{\mathbb F}}(s) = \sum_{D \geq 0} \frac{1}{q^{\hbox{deg}(D)}}

where the sum is over effective rational divisors {D} of {C} (with {k} being the degree of an effective divisor {[x_1] + \ldots + [x_k]}), or equivalently

\displaystyle  Z( C/{\mathbb F}, T ) = \sum_{D \geq 0} T^{\hbox{deg}(D)}.

The analogue of (19), which gives a geometric interpretation to sums of the von Mangoldt function, becomes

\displaystyle  \sum_{N({\mathfrak f}) = q^n} \Lambda_{C/{\mathbb F}}({\mathfrak f}) = \sum_{x \in C({\mathbb F}_n)} 1

\displaystyle  = |C({\mathbb F}_n)|,

thus this sum is simply counting the number of {{\mathbb F}_n}-points of {C}. The analogue of the exponential identity (16) (or (2), (8), or (12)) is then

\displaystyle  Z( C/{\mathbb F}, T ) = \exp( \sum_{n \geq 1} \frac{|C({\mathbb F}_n)|}{n} T^n ) \ \ \ \ \ (21)

and the analogue of the explicit formula (17) (or (5), (10) or (13)) is

\displaystyle  |C({\mathbb F}_n)| = q^n - \sum_\alpha \alpha^n + c \ \ \ \ \ (22)

where {\alpha} runs over the (reciprocal) zeroes of {Z( C/{\mathbb F}, T )} (counting multiplicity), and {c} is an integer independent of {n}. (As it turns out, {c} equals {1} when {C} is a projective curve, and more generally equals {1-k} when {C} is a projective curve with {k} rational points deleted.)

To evaluate {Z(C/{\mathbb F},T)}, one needs to count the number of effective divisors of a given degree on the curve {C}. Fortunately, there is a tool that is particularly well-designed for this task, namely the Riemann-Roch theorem. By using this theorem, one can show (when {C} is projective) that {Z(C/{\mathbb F},T)} is in fact a rational function, with a finite number of zeroes, and a simple pole at both {1} and {1/q}, with similar results when one deletes some rational points from {C}; see e.g. Chapter 11 of Iwaniec-Kowalski for details. Thus the sum in (22) is finite. For instance, for the affine elliptic curve (20) (which is a projective curve with one point removed), it turns out that we have

\displaystyle  |E({\mathbb F}_n)| = q^n - \alpha^n - \beta^n

for two complex numbers {\alpha,\beta} depending on {E} and {q}.

The Riemann hypothesis for (untwisted) curves – which is the deepest and most difficult aspect of the Weil conjectures for these curves – asserts that the zeroes of {\zeta_{C/{\mathbb F}}} lie on the critical line, or equivalently that all the roots {\alpha} in (22) have modulus {\sqrt{q}}, so that (22) then gives the asymptotic

\displaystyle  |C({\mathbb F}_n)| = q^n + O( q^{n/2} ) \ \ \ \ \ (23)

where the implied constant depends only on the genus of {C} (and on the number of points removed from {C}). For instance, for elliptic curves we have the Hasse bound

\displaystyle  |E({\mathbb F}_n) - q^n| \leq 2 \sqrt{q}.

As before, we have an important amplification phenomenon: if we can establish a weaker estimate, e.g.

\displaystyle  |C({\mathbb F}_n)| = q^n + O( q^{n/2 + O(1)} ), \ \ \ \ \ (24)

then we can automatically deduce the stronger bound (23). This amplification is not a mere curiosity; most of the proofs of the Riemann hypothesis for curves proceed via this fact. For instance, by using the elementary method of Stepanov to bound points in curves (discussed for instance in this previous post), one can establish the preliminary bound (24) for large {n}, which then amplifies to the optimal bound (23) for all {n} (and in particular for {n=1}). Again, see Chapter 11 of Iwaniec-Kowalski for details. The ability to convert a bound with {q}-dependent losses over the optimal bound (such as (24)) into an essentially optimal bound with no {q}-dependent losses (such as (23)) is important in analytic number theory, since in many applications (e.g. in those arising from sieve theory) one wishes to sum over large ranges of {q}.

Much as the Riemann zeta function can be twisted by a Dirichlet character to form a Dirichlet {L}-function, one can twist the zeta function on curves by various additive and multiplicative characters. For instance, suppose one has an affine plane curve {C \subset {\mathbb A}^2} and an additive character {\psi: {\mathbb F}^2 \rightarrow {\bf C}^\times}, thus {\psi(x+y) = \psi(x) \psi(y)} for all {x,y \in {\mathbb F}^2}. Given a rational effective divisor {D = [x_1] + \ldots + [x_k]}, the sum {x_1+\ldots+x_k} is Frobenius-invariant and thus lies in {{\mathbb F}^2}. By abuse of notation, we may thus define {\psi} on such divisors by

\displaystyle  \psi( [x_1] + \ldots + [x_k] ) := \psi( x_1 + \ldots + x_k )

and observe that {\psi} is multiplicative in the sense that {\psi(D_1 + D_2) = \psi(D_1) \psi(D_2)} for rational effective divisors {D_1,D_2}. One can then define {\psi({\mathfrak f})} for any non-trivial ideal {{\mathfrak f}} by replacing that ideal with the associated rational effective divisor; for instance, if {f} is a polynomial in the coefficient ring of {C}, with zeroes at {x_1,\ldots,x_k \in C}, then {\psi((f))} is {\psi( x_1+\ldots+x_k )}. Again, we have the multiplicativity property {\psi({\mathfrak f} {\mathfrak g}) = \psi({\mathfrak f}) \psi({\mathfrak g})}. If we then form the twisted normalised zeta function

\displaystyle  Z( C/{\mathbb F}, \psi, T ) = \sum_{D \geq 0} \psi(D) T^{\hbox{deg}(D)}

then by twisting the previous analysis, we eventually arrive at the exponential identity

\displaystyle  Z( C/{\mathbb F}, \psi, T ) = \exp( \sum_{n \geq 1} \frac{S_n(C/{\mathbb F}, \psi)}{n} T^n ) \ \ \ \ \ (25)

in analogy with (21) (or (2), (8), (12), or (16)), where the companion sums {S_n(C/{\mathbb F}, \psi)} are defined by

\displaystyle  S_n(C/{\mathbb F},\psi) = \sum_{x \in C({\mathbb F}^n)} \psi( \hbox{Tr}_{{\mathbb F}_n/{\mathbb F}}(x) )

where the trace {\hbox{Tr}_{{\mathbb F}_n/{\mathbb F}}(x)} of an element {x} in the plane {{\mathbb A}^2( {\mathbb F}_n )} is defined by the formula

\displaystyle  \hbox{Tr}_{{\mathbb F}_n/{\mathbb F}}(x) = x + \hbox{Frob}(x) + \ldots + \hbox{Frob}^{n-1}(x).

In particular, {S_1} is the exponential sum

\displaystyle  S_1(C/{\mathbb F},\psi) = \sum_{x \in C({\mathbb F})} \psi(x)

which is an important type of sum in analytic number theory, containing for instance the Kloosterman sum

\displaystyle  K(a,b;p) := \sum_{x \in {\mathbb F}_p^\times} e_p( ax + \frac{b}{x})

as a special case, where {a,b \in {\mathbb F}_p^\times}. (NOTE: the sign conventions for the companion sum {S_n} are not consistent across the literature, sometimes it is {-S_n} which is referred to as the companion sum.)

If {\psi} is non-principal (and {C} is non-linear), one can show (by a suitably twisted version of the Riemann-Roch theorem) that {Z} is a rational function of {T}, with no pole at {T=q^{-1}}, and one then gets an explicit formula of the form

\displaystyle  S_n(C/{\mathbb F},\psi) = -\sum_\alpha \alpha^n + c \ \ \ \ \ (26)

for the companion sums, where {\alpha} are the reciprocals of the zeroes of {S}, in analogy to (22) (or (5), (10), (13), or (17)). For instance, in the case of Kloosterman sums, there is an identity of the form

\displaystyle  \sum_{x \in {\mathbb F}_{p^n}^\times} e_p( a \hbox{Tr}(x) + \frac{b}{\hbox{Tr}(x)}) = -\alpha^n - \beta^n \ \ \ \ \ (27)

for all {n} and some complex numbers {\alpha,\beta} depending on {a,b,p}, where we have abbreviated {\hbox{Tr}_{{\mathbb F}_{p^n}/{\mathbb F}_p}} as {\hbox{Tr}}. As before, the Riemann hypothesis for {Z} then gives a square root cancellation bound of the form

\displaystyle  S_n(C/{\mathbb F},\psi) = O( q^{n/2} ) \ \ \ \ \ (28)

for the companion sums (and in particular gives the very explicit Weil bound {|K(a,b;p)| \leq 2\sqrt{p}} for the Kloosterman sum), but again there is the amplification phenomenon that this sort of bound can be deduced from the apparently weaker bound

\displaystyle  S_n(C/{\mathbb F},\psi) = O( q^{n/2 + O(1)} ).

As before, most of the known proofs of the Riemann hypothesis for these twisted zeta functions proceed by first establishing this weaker bound (e.g. one could again use Stepanov’s method here for this goal) and then amplifying to the full bound (28); see Chapter 11 of Iwaniec-Kowalski for further details.

One can also twist the zeta function on a curve by a multiplicative character {\chi: {\mathbb F}^\times \rightarrow {\bf C}^\times} by similar arguments, except that instead of forming the sum {x_1+\ldots+x_k} of all the components of an effective divisor {[x_1]+\ldots+[x_k]}, one takes the product {x_1 \ldots x_k} instead, and similarly one replaces the trace

\displaystyle  \hbox{Tr}_{{\mathbb F}_n/{\mathbb F}}(x) = x + \hbox{Frob}(x) + \ldots + \hbox{Frob}^{n-1}(x)

by the norm

\displaystyle  \hbox{Norm}_{{\mathbb F}_n/{\mathbb F}}(x) = x \cdot \hbox{Frob}(x) \cdot \ldots \cdot \hbox{Frob}^{n-1}(x).

Again, see Chapter 11 of Iwaniec-Kowalski for details.

Deligne famously extended the above theory to higher-dimensional varieties than curves, and also to the closely related context of {\ell}-adic sheaves on curves, giving rise to two separate proofs of the Weil conjectures in full generality. (Very roughly speaking, the former context can be obtained from the latter context by a sort of Fubini theorem type argument that expresses sums on higher-dimensional varieties as iterated sums on curves of various expressions related to {\ell}-adic sheaves.) In this higher-dimensional setting, the zeta function formalism is still present, but is much more difficult to use, in large part due to the much less tractable nature of divisors in higher dimensions (they are now combinations of codimension one subvarieties or subschemes, rather than combinations of points). To get around this difficulty, one has to change perspective yet again, from an algebraic or geometric perspective to an {\ell}-adic cohomological perspective. (I could imagine that once one is sufficiently expert in the subject, all these perspectives merge back together into a unified viewpoint, but I am certainly not yet at that stage of understanding.) In particular, the zeta function, while still present, plays a significantly less prominent role in the analysis (at least if one is willing to take Deligne’s theorems as a black box); the explicit formula is now obtained via a different route, namely the Grothendieck-Lefschetz fixed point formula. I have written some notes on this material below the fold (based in part on some lectures of Philippe Michel, as well as the text of Iwaniec-Kowalski and also this book of Katz), but I should caution that my understanding here is still rather sketchy and possibly inaccurate in places.

Read the rest of this entry »

The rectification principle in arithmetic combinatorics asserts, roughly speaking, that very small subsets (or, alternatively, small structured subsets) of an additive group or a field of large characteristic can be modeled (for the purposes of arithmetic combinatorics) by subsets of a group or field of zero characteristic, such as the integers {{\bf Z}} or the complex numbers {{\bf C}}. The additive form of this principle is known as the Freiman rectification principle; it has several formulations, going back of course to the original work of Freiman. Here is one formulation as given by Bilu, Lev, and Ruzsa:

Proposition 1 (Additive rectification) Let {A} be a subset of the additive group {{\bf Z}/p{\bf Z}} for some prime {p}, and let {s \geq 1} be an integer. Suppose that {|A| \leq \log_{2s} p}. Then there exists a map {\phi: A \rightarrow A'} into a subset {A'} of the integers which is a Freiman isomorphism of order {s} in the sense that for any {x_1,\ldots,x_s,y_1,\ldots,y_s \in A}, one has

\displaystyle  x_1+\ldots+x_s = y_1+\ldots+y_s

if and only if

\displaystyle  \phi(x_1)+\ldots+\phi(x_s) = \phi(y_1)+\ldots+\phi(y_s).

Furthermore {\phi} is a right-inverse of the obvious projection homomorphism from {{\bf Z}} to {{\bf Z}/p{\bf Z}}.

The original version of the rectification principle allowed the sets involved to be substantially larger in size (cardinality up to a small constant multiple of {p}), but with the additional hypothesis of bounded doubling involved; see the above-mentioned papers, as well as this later paper of Green and Ruzsa, for further discussion.

The proof of Proposition 1 is quite short (see Theorem 3.1 of Bilu-Lev-Ruzsa); the main idea is to use Minkowski’s theorem to find a non-trivial dilate {aA} of {A} that is contained in a small neighbourhood of the origin in {{\bf Z}/p{\bf Z}}, at which point the rectification map {\phi} can be constructed by hand.

Very recently, Codrut Grosu obtained an arithmetic analogue of the above theorem, in which the rectification map {\phi} preserves both additive and multiplicative structure:

Theorem 2 (Arithmetic rectification) Let {A} be a subset of the finite field {{\bf F}_p} for some prime {p \geq 3}, and let {s \geq 1} be an integer. Suppose that {|A| < \log_2 \log_{2s} \log_{2s^2} p - 1}. Then there exists a map {\phi: A \rightarrow A'} into a subset {A'} of the complex numbers which is a Freiman field isomorphism of order {s} in the sense that for any {x_1,\ldots,x_n \in A} and any polynomial {P(x_1,\ldots,x_n)} of degree at most {s} and integer coefficients of magnitude summing to at most {s}, one has

\displaystyle  P(x_1,\ldots,x_n)=0

if and only if

\displaystyle  P(\phi(x_1),\ldots,\phi(x_n))=0.

Note that it is necessary to use an algebraically closed field such as {{\bf C}} for this theorem, in contrast to the integers used in Proposition 1, as {{\bf F}_p} can contain objects such as square roots of {-1} which can only map to {\pm i} in the complex numbers (once {s} is at least {2}).

Using Theorem 2, one can transfer results in arithmetic combinatorics (e.g. sum-product or Szemerédi-Trotter type theorems) regarding finite subsets of {{\bf C}} to analogous results regarding sufficiently small subsets of {{\bf F}_p}; see the paper of Grosu for several examples of this. This should be compared with the paper of Vu, Wood, and Wood, which introduces a converse principle that embeds finite subsets of {{\bf C}} (or more generally, a characteristic zero integral domain) in a Freiman field-isomorphic fashion into finite subsets of {{\bf F}_p} for arbitrarily large primes {p}, allowing one to transfer arithmetic combinatorical facts from the latter setting to the former.

Grosu’s argument uses some quantitative elimination theory, and in particular a quantitative variant of a lemma of Chang that was discussed previously on this blog. In that previous blog post, it was observed that (an ineffective version of) Chang’s theorem could be obtained using only qualitative algebraic geometry (as opposed to quantitative algebraic geometry tools such as elimination theory results with explicit bounds) by means of nonstandard analysis (or, in what amounts to essentially the same thing in this context, the use of ultraproducts). One can then ask whether one can similarly establish an ineffective version of Grosu’s result by nonstandard means. The purpose of this post is to record that this can indeed be done without much difficulty, though the result obtained, being ineffective, is somewhat weaker than that in Theorem 2. More precisely, we obtain

Theorem 3 (Ineffective arithmetic rectification) Let {s, n \geq 1}. Then if {{\bf F}} is a field of characteristic at least {C_{s,n}} for some {C_{s,n}} depending on {s,n}, and {A} is a subset of {{\bf F}} of cardinality {n}, then there exists a map {\phi: A \rightarrow A'} into a subset {A'} of the complex numbers which is a Freiman field isomorphism of order {s}.

Our arguments will not provide any effective bound on the quantity {C_{s,n}} (though one could in principle eventually extract such a bound by deconstructing the proof of Proposition 4 below), making this result weaker than Theorem 2 (save for the minor generalisation that it can handle fields of prime power order as well as fields of prime order as long as the characteristic remains large).

Following the principle that ultraproducts can be used as a bridge to connect quantitative and qualitative results (as discussed in these previous blog posts), we will deduce Theorem 3 from the following (well-known) qualitative version:

Proposition 4 (Baby Lefschetz principle) Let {k} be a field of characteristic zero that is finitely generated over the rationals. Then there is an isomorphism {\phi: k \rightarrow \phi(k)} from {k} to a subfield {\phi(k)} of {{\bf C}}.

This principle (first laid out in an appendix of Lefschetz’s book), among other things, often allows one to use the methods of complex analysis (e.g. Riemann surface theory) to study many other fields of characteristic zero. There are many variants and extensions of this principle; see for instance this MathOverflow post for some discussion of these. I used this baby version of the Lefschetz principle recently in a paper on expanding polynomial maps.

Proof: We give two proofs of this fact, one using transcendence bases and the other using Hilbert’s nullstellensatz.

We begin with the former proof. As {k} is finitely generated over {{\bf Q}}, it has finite transcendence degree, thus one can find algebraically independent elements {x_1,\ldots,x_m} of {k} over {{\bf Q}} such that {k} is a finite extension of {{\bf Q}(x_1,\ldots,x_m)}, and in particular by the primitive element theorem {k} is generated by {{\bf Q}(x_1,\ldots,x_m)} and an element {\alpha} which is algebraic over {{\bf Q}(x_1,\ldots,x_m)}. (Here we use the fact that characteristic zero fields are separable.) If we then define {\phi} by first mapping {x_1,\ldots,x_m} to generic (and thus algebraically independent) complex numbers {z_1,\ldots,z_m}, and then setting {\phi(\alpha)} to be a complex root of of the minimal polynomial for {\alpha} over {{\bf Q}(x_1,\ldots,x_m)} after replacing each {x_i} with the complex number {z_i}, we obtain a field isomorphism {\phi: k \rightarrow \phi(k)} with the required properties.

Now we give the latter proof. Let {x_1,\ldots,x_m} be elements of {k} that generate that field over {{\bf Q}}, but which are not necessarily algebraically independent. Our task is then equivalent to that of finding complex numbers {z_1,\ldots,z_m} with the property that, for any polynomial {P(x_1,\ldots,x_m)} with rational coefficients, one has

\displaystyle  P(x_1,\ldots,x_m) = 0

if and only if

\displaystyle  P(z_1,\ldots,z_m) = 0.

Let {{\mathcal P}} be the collection of all polynomials {P} with rational coefficients with {P(x_1,\ldots,x_m)=0}, and {{\mathcal Q}} be the collection of all polynomials {P} with rational coefficients with {P(x_1,\ldots,x_m) \neq 0}. The set

\displaystyle  S := \{ (z_1,\ldots,z_m) \in {\bf C}^m: P(z_1,\ldots,z_m)=0 \hbox{ for all } P \in {\mathcal P} \}

is the intersection of countably many algebraic sets and is thus also an algebraic set (by the Hilbert basis theorem or the Noetherian property of algebraic sets). If the desired claim failed, then {S} could be covered by the algebraic sets {\{ (z_1,\ldots,z_m) \in {\bf C}^m: Q(z_1,\ldots,z_m) = 0 \}} for {Q \in {\mathcal Q}}. By decomposing into irreducible varieties and observing (e.g. from the Baire category theorem) that a variety of a given dimension over {{\bf C}} cannot be covered by countably many varieties of smaller dimension, we conclude that {S} must in fact be covered by a finite number of such sets, thus

\displaystyle  S \subset \bigcup_{i=1}^n \{ (z_1,\ldots,z_m) \in {\bf C}^m: Q_i(z_1,\ldots,z_m) = 0 \}

for some {Q_1,\ldots,Q_n \in {\bf C}^m}. By the nullstellensatz, we thus have an identity of the form

\displaystyle  (Q_1 \ldots Q_n)^l = P_1 R_1 + \ldots + P_r R_r

for some natural numbers {l,r \geq 1}, polynomials {P_1,\ldots,P_r \in {\mathcal P}}, and polynomials {R_1,\ldots,R_r} with coefficients in {\overline{{\bf Q}}}. In particular, this identity also holds in the algebraic closure {\overline{k}} of {k}. Evaluating this identity at {(x_1,\ldots,x_m)} we see that the right-hand side is zero but the left-hand side is non-zero, a contradiction, and the claim follows. \Box

From Proposition 4 one can now deduce Theorem 3 by a routine ultraproduct argument (the same one used in these previous blog posts). Suppose for contradiction that Theorem 3 fails. Then there exists natural numbers {s,n \geq 1}, a sequence of finite fields {{\bf F}_i} of characteristic at least {i}, and subsets {A_i=\{a_{i,1},\ldots,a_{i,n}\}} of {{\bf F}_i} of cardinality {n} such that for each {i}, there does not exist a Freiman field isomorphism of order {s} from {A_i} to the complex numbers. Now we select a non-principal ultrafilter {\alpha \in \beta {\bf N} \backslash {\bf N}}, and construct the ultraproduct {{\bf F} := \prod_{i \rightarrow \alpha} {\bf F}_i} of the finite fields {{\bf F}_i}. This is again a field (and is a basic example of what is known as a pseudo-finite field); because the characteristic of {{\bf F}_i} goes to infinity as {i \rightarrow \infty}, it is easy to see (using Los’s theorem) that {{\bf F}} has characteristic zero and can thus be viewed as an extension of the rationals {{\bf Q}}.

Now let {a_j := \lim_{i \rightarrow \alpha} a_{i,j}} be the ultralimit of the {a_{i,j}}, so that {A := \{a_1,\ldots,a_n\}} is the ultraproduct of the {A_i}, then {A} is a subset of {{\bf F}} of cardinality {n}. In particular, if {k} is the field generated by {{\bf Q}} and {A}, then {k} is a finitely generated extension of the rationals and thus, by Proposition 4 there is an isomorphism {\phi: k \rightarrow \phi(k)} from {k} to a subfield {\phi(k)} of the complex numbers. In particular, {\phi(a_1),\ldots,\phi(a_n)} are complex numbers, and for any polynomial {P(x_1,\ldots,x_n)} with integer coefficients, one has

\displaystyle  P(a_1,\ldots,a_n) = 0

if and only if

\displaystyle  P(\phi(a_1),\ldots,\phi(a_n)) = 0.

By Los’s theorem, we then conclude that for all {i} sufficiently close to {\alpha}, one has for all polynomials {P(x_1,\ldots,x_n)} of degree at most {s} and whose coefficients are integers whose magnitude sums up to {s}, one has

\displaystyle  P(a_{i,1},\ldots,a_{i,n}) = 0

if and only if

\displaystyle  P(\phi(a_1),\ldots,\phi(a_n)) = 0.

But this gives a Freiman field isomorphism of order {s} between {A_i} and {\phi(A)}, contradicting the construction of {A_i}, and Theorem 3 follows.

Given a function {f: X \rightarrow Y} between two sets {X, Y}, we can form the graph

\displaystyle  \Sigma := \{ (x,f(x)): x\in X \},

which is a subset of the Cartesian product {X \times Y}.

There are a number of “closed graph theorems” in mathematics which relate the regularity properties of the function {f} with the closure properties of the graph {\Sigma}, assuming some “completeness” properties of the domain {X} and range {Y}. The most famous of these is the closed graph theorem from functional analysis, which I phrase as follows:

Theorem 1 (Closed graph theorem (functional analysis)) Let {X, Y} be complete normed vector spaces over the reals (i.e. Banach spaces). Then a function {f: X \rightarrow Y} is a continuous linear transformation if and only if the graph {\Sigma := \{ (x,f(x)): x \in X \}} is both linearly closed (i.e. it is a linear subspace of {X \times Y}) and topologically closed (i.e. closed in the product topology of {X \times Y}).

I like to think of this theorem as linking together qualitative and quantitative notions of regularity preservation properties of an operator {f}; see this blog post for further discussion.

The theorem is equivalent to the assertion that any continuous linear bijection {f: X \rightarrow Y} from one Banach space to another is necessarily an isomorphism in the sense that the inverse map is also continuous and linear. Indeed, to see that this claim implies the closed graph theorem, one applies it to the projection from {\Sigma} to {X}, which is a continuous linear bijection; conversely, to deduce this claim from the closed graph theorem, observe that the graph of the inverse {f^{-1}} is the reflection of the graph of {f}. As such, the closed graph theorem is a corollary of the open mapping theorem, which asserts that any continuous linear surjection from one Banach space to another is open. (Conversely, one can deduce the open mapping theorem from the closed graph theorem by quotienting out the kernel of the continuous surjection to get a bijection.)

It turns out that there is a closed graph theorem (or equivalent reformulations of that theorem, such as an assertion that bijective morphisms between sufficiently “complete” objects are necessarily isomorphisms, or as an open mapping theorem) in many other categories in mathematics as well. Here are some easy ones:

Theorem 2 (Closed graph theorem (linear algebra)) Let {X, Y} be vector spaces over a field {k}. Then a function {f: X \rightarrow Y} is a linear transformation if and only if the graph {\Sigma := \{ (x,f(x)): x \in X \}} is linearly closed.

Theorem 3 (Closed graph theorem (group theory)) Let {X, Y} be groups. Then a function {f: X \rightarrow Y} is a group homomorphism if and only if the graph {\Sigma := \{ (x,f(x)): x \in X \}} is closed under the group operations (i.e. it is a subgroup of {X \times Y}).

Theorem 4 (Closed graph theorem (order theory)) Let {X, Y} be totally ordered sets. Then a function {f: X \rightarrow Y} is monotone increasing if and only if the graph {\Sigma := \{ (x,f(x)): x \in X \}} is totally ordered (using the product order on {X \times Y}).

Remark 1 Similar results to the above three theorems (with similarly easy proofs) hold for other algebraic structures, such as rings (using the usual product of rings), modules, algebras, or Lie algebras, groupoids, or even categories (a map between categories is a functor iff its graph is again a category). (ADDED IN VIEW OF COMMENTS: further examples include affine spaces and {G}-sets (sets with an action of a given group {G}).) There are also various approximate versions of this theorem that are useful in arithmetic combinatorics, that relate the property of a map {f} being an “approximate homomorphism” in some sense with its graph being an “approximate group” in some sense. This is particularly useful for this subfield of mathematics because there are currently more theorems about approximate groups than about approximate homomorphisms, so that one can profitably use closed graph theorems to transfer results about the former to results about the latter.

A slightly more sophisticated result in the same vein:

Theorem 5 (Closed graph theorem (point set topology)) Let {X, Y} be compact Hausdorff spaces. Then a function {f: X \rightarrow Y} is continuous if and only if the graph {\Sigma := \{ (x,f(x)): x \in X \}} is topologically closed.

Indeed, the “only if” direction is easy, while for the “if” direction, note that if {\Sigma} is a closed subset of {X \times Y}, then it is compact Hausdorff, and the projection map from {\Sigma} to {X} is then a bijective continuous map between compact Hausdorff spaces, which is then closed, thus open, and hence a homeomorphism, giving the claim.

Note that the compactness hypothesis is necessary: for instance, the function {f: {\bf R} \rightarrow {\bf R}} defined by {f(x) := 1/x} for {x \neq 0} and {f(0)=0} for {x=0} is a function which has a closed graph, but is discontinuous.

A similar result (but relying on a much deeper theorem) is available in algebraic geometry, as I learned after asking this MathOverflow question:

Theorem 6 (Closed graph theorem (algebraic geometry)) Let {X, Y} be normal projective varieties over an algebraically closed field {k} of characteristic zero. Then a function {f: X \rightarrow Y} is a regular map if and only if the graph {\Sigma := \{ (x,f(x)): x \in X \}} is Zariski-closed.

Proof: (Sketch) For the only if direction, note that the map {x \mapsto (x,f(x))} is a regular map from the projective variety {X} to the projective variety {X \times Y} and is thus a projective morphism, hence is proper. In particular, the image {\Sigma} of {X} under this map is Zariski-closed.

Conversely, if {\Sigma} is Zariski-closed, then it is also a projective variety, and the projection {(x,y) \mapsto x} is a projective morphism from {\Sigma} to {X}, which is clearly quasi-finite; by the characteristic zero hypothesis, it is also separated. Applying (Grothendieck’s form of) Zariski’s main theorem, this projection is the composition of an open immersion and a finite map. As projective varieties are complete, the open immersion is an isomorphism, and so the projection from {\Sigma} to {X} is finite. Being injective and separable, the degree of this finite map must be one, and hence {k(\Sigma)} and {k(X)} are isomorphic, hence (by normality of {X}) {k[\Sigma]} is contained in (the image of) {k[X]}, which makes the map from {X} to {\Sigma} regular, which makes {f} regular. \Box

The counterexample of the map {f: k \rightarrow k} given by {f(x) := 1/x} for {x \neq 0} and {f(0) := 0} demonstrates why the projective hypothesis is necessary. The necessity of the normality condition (or more precisely, a weak normality condition) is demonstrated by (the projective version of) the map {(t^2,t^3) \mapsto t} from the cusipdal curve {\{ (t^2,t^3): t \in k \}} to {k}. (If one restricts attention to smooth varieties, though, normality becomes automatic.) The necessity of characteristic zero is demonstrated by (the projective version of) the inverse of the Frobenius map {x \mapsto x^p} on a field {k} of characteristic {p}.

There are also a number of closed graph theorems for topological groups, of which the following is typical (see Exercise 3 of these previous blog notes):

Theorem 7 (Closed graph theorem (topological group theory)) Let {X, Y} be {\sigma}-compact, locally compact Hausdorff groups. Then a function {X \rightarrow Y} is a continuous homomorphism if and only if the graph {\Sigma := \{ (x,f(x)): x \in X \}} is both group-theoretically closed and topologically closed.

The hypotheses of being {\sigma}-compact, locally compact, and Hausdorff can be relaxed somewhat, but I doubt that they can be eliminated entirely (though I do not have a ready counterexample for this).

In several complex variables, it is a classical theorem (see e.g. Lemma 4 of this blog post) that a holomorphic function from a domain in {{\bf C}^n} to {{\bf C}^n} is locally injective if and only if it is a local diffeomorphism (i.e. its derivative is everywhere non-singular). This leads to a closed graph theorem for complex manifolds:

Theorem 8 (Closed graph theorem (complex manifolds)) Let {X, Y} be complex manifolds. Then a function {f: X \rightarrow Y} is holomorphic if and only if the graph {\Sigma := \{ (x,f(x)): x \in X \}} is a complex manifold (using the complex structure inherited from {X \times Y}) of the same dimension as {X}.

Indeed, one applies the previous observation to the projection from {\Sigma} to {X}. The dimension requirement is needed, as can be seen from the example of the map {f: {\bf C} \rightarrow {\bf C}} defined by {f(z) =1/z} for {z \neq 0} and {f(0)=0}.

(ADDED LATER:) There is a real analogue to the above theorem:

Theorem 9 (Closed graph theorem (real manifolds)) Let {X, Y} be real manifolds. Then a function {f: X \rightarrow Y} is continuous if and only if the graph {\Sigma := \{ (x,f(x)): x \in X \}} is a real manifold of the same dimension as {X}.

This theorem can be proven by applying invariance of domain (discussed in this previous post) to the projection of {\Sigma} to {X}, to show that it is open if {\Sigma} has the same dimension as {X}.

Note though that the analogous claim for smooth real manifolds fails: the function {f: {\bf R} \rightarrow {\bf R}} defined by {f(x) := x^{1/3}} has a smooth graph, but is not itself smooth.

(ADDED YET LATER:) Here is an easy closed graph theorem in the symplectic category:

Theorem 10 (Closed graph theorem (symplectic geometry)) Let {X = (X,\omega_X)} and {Y = (Y,\omega_Y)} be smooth symplectic manifolds of the same dimension. Then a smooth map {f: X \rightarrow Y} is a symplectic morphism (i.e. {f^* \omega_Y = \omega_X}) if and only if the graph {\Sigma := \{(x,f(x)): x \in X \}} is a Lagrangian submanifold of {X \times Y} with the symplectic form {\omega_X \oplus -\omega_Y}.

In view of the symplectic rigidity phenomenon, it is likely that the smoothness hypotheses on {f,X,Y} can be relaxed substantially, but I will not try to formulate such a result here.

There are presumably many further examples of closed graph theorems (or closely related theorems, such as criteria for inverting a morphism, or open mapping type theorems) throughout mathematics; I would be interested to know of further examples.

\Box

I’ve just uploaded to the arXiv my paper “Expanding polynomials over finite fields of large characteristic, and a regularity lemma for definable sets“, submitted to Contrib. Disc. Math. The motivation of this paper is to understand a certain polynomial variant of the sum-product phenomenon in finite fields. This phenomenon asserts that if {A} is a non-empty subset of a finite field {F}, then either the sumset {A+A := \{a+b: a,b \in A\}} or product set {A \cdot A := \{ab: a,b \in A \}} will be significantly larger than {A}, unless {A} is close to a subfield of {F} (or to {\{1\}}). In particular, in the regime when {A} is large, say {|F|^{1-c} < |A| \leq |F|}, one expects an expansion bound of the form

\displaystyle  |A+A| + |A \cdot A| \gg (|F|/|A|)^{c'} |A| \ \ \ \ \ (1)

for some absolute constants {c, c' > 0}. Results of this type are known; for instance, Hart, Iosevich, and Solymosi obtained precisely this bound for {(c,c')=(3/10,1/3)} (in the case when {|F|} is prime), which was then improved by Garaev to {(c,c')=(1/3,1/2)}.

We have focused here on the case when {A} is a large subset of {F}, but sum-product estimates are also extremely interesting in the opposite regime in which {A} is allowed to be small (see for instance the papers of Katz-Shen and Li and of Garaev for some recent work in this case, building on some older papers of Bourgain, Katz and myself and of Bourgain, Glibichuk, and Konyagin). However, the techniques used in these two regimes are rather different. For large subsets of {F}, it is often profitable to use techniques such as the Fourier transform or the Cauchy-Schwarz inequality to “complete” a sum over a large set (such as {A}) into a set over the entire field {F}, and then to use identities concerning complete sums (such as the Weil bound on complete exponential sums over a finite field). For small subsets of {F}, such techniques are usually quite inefficient, and one has to proceed by somewhat different combinatorial methods which do not try to exploit the ambient field {F}. But my paper focuses exclusively on the large {A} regime, and unfortunately does not directly say much (except through reasoning by analogy) about the small {A} case.

Note that it is necessary to have both {A+A} and {A \cdot A} appear on the left-hand side of (1). Indeed, if one just has the sumset {A+A}, then one can set {A} to be a long arithmetic progression to give counterexamples to (1). Similarly, if one just has a product set {A \cdot A}, then one can set {A} to be a long geometric progression. The sum-product phenomenon can then be viewed that it is not possible to simultaneously behave like a long arithmetic progression and a long geometric progression, unless one is already very close to behaving like a subfield.

Now we consider a polynomial variant of the sum-product phenomenon, where we consider a polynomial image

\displaystyle  P(A,A) := \{ P(a,b): a,b \in A \}

of a set {A \subset F} with respect to a polynomial {P: F \times F \rightarrow F}; we can also consider the asymmetric setting of the image

\displaystyle  P(A,B) := \{ P(a,b): a \in A,b \in B \}

of two subsets {A,B \subset F}. The regime we will be interested is the one where the field {F} is large, and the subsets {A, B} of {F} are also large, but the polynomial {P} has bounded degree. Actually, for technical reasons it will not be enough for us to assume that {F} has large cardinality; we will also need to assume that {F} has large characteristic. (The two concepts are synonymous for fields of prime order, but not in general; for instance, the field with {2^n} elements becomes large as {n \rightarrow \infty} while the characteristic remains fixed at {2}, and is thus not going to be covered by the results in this paper.)

In this paper of Vu, it was shown that one could replace {A \cdot A} with {P(A,A)} in (1), thus obtaining a bound of the form

\displaystyle  |A+A| + |P(A,A)| \gg (|F|/|A|)^{c'} |A|

whenever {|A| \geq |F|^{1-c}} for some absolute constants {c, c' > 0}, unless the polynomial {P} had the degenerate form {P(x,y) = Q(L(x,y))} for some linear function {L: F \times F \rightarrow F} and polynomial {Q: F \rightarrow F}, in which {P(A,A)} behaves too much like {A+A} to get reasonable expansion. In this paper, we focus instead on the question of bounding {P(A,A)} alone. In particular, one can ask to classify the polynomials {P} for which one has the weak expansion property

\displaystyle |P(A,A)| \gg (|F|/|A|)^{c'} |A|

whenever {|A| \geq |F|^{1-c}} for some absolute constants {c, c' > 0}. One can also ask for stronger versions of this expander property, such as the moderate expansion property

\displaystyle |P(A,A)| \gg |F|

whenever {|A| \geq |F|^{1-c}}, or the almost strong expansion property

\displaystyle |P(A,A)| \geq |F| - O( |F|^{1-c'})

whenever {|A| \geq |F|^{1-c}}. (One can consider even stronger expansion properties, such as the strong expansion property {|P(A,A)| \geq |F|-O(1)}, but it was shown by Gyarmati and Sarkozy that this property cannot hold for polynomials of two variables of bounded degree when {|F| \rightarrow \infty}.) One can also consider asymmetric versions of these properties, in which one obtains lower bounds on {|P(A,B)|} rather than {|P(A,A)|}.

The example of a long arithmetic or geometric progression shows that the polynomials {P(x,y) = x+y} or {P(x,y) = xy} cannot be expanders in any of the above senses, and a similar construction also shows that polynomials of the form {P(x,y) = Q(f(x)+f(y))} or {P(x,y) = Q(f(x) f(y))} for some polynomials {Q, f: F \rightarrow F} cannot be expanders. On the other hand, there are a number of results in the literature establishing expansion for various polynomials in two or more variables that are not of this degenerate form (in part because such results are related to incidence geometry questions in finite fields, such as the finite field version of the Erdos distinct distances problem). For instance, Solymosi established weak expansion for polynomials of the form {P(x,y) = f(x)+y} when {f} is a nonlinear polynomial, with generalisations by Hart, Li, and Shen for various polynomials of the form {P(x,y) = f(x) + g(y)} or {P(x,y) = f(x) g(y)}. Further examples of expanding polynomials appear in the work of Shkredov, Iosevich-Rudnev, and Bukh-Tsimerman, as well as the previously mentioned paper of Vu and of Hart-Li-Shen, and these papers in turn cite many further results which are in the spirit of the polynomial expansion bounds discussed here (for instance, dealing with the small {A} regime, or working in other fields such as {{\bf R}} instead of in finite fields {F}). We will not summarise all these results here; they are summarised briefly in my paper, and in more detail in the papers of Hart-Li-Shen and of Bukh-Tsimerman. But we will single out one of the results of Bukh-Tsimerman, which is one of most recent and general of these results, and closest to the results of my own paper. Roughly speaking, in this paper it is shown that a polynomial {P(x,y)} of two variables and bounded degree will be a moderate expander if it is non-composite (in the sense that it does not take the form {P(x,y) = Q(R(x,y))} for some non-linear polynomial {Q} and some polynomial {R}, possibly having coefficients in the algebraic completion of {F}) and is monic on both {x} and {y}, thus it takes the form {P(x,y) = x^d + S(x,y)} for some {d \geq 1} and some polynomial {S} of degree at most {d-1} in {x}, and similarly with the roles of {x} and {y} reversed, unless {P} is of the form {P(x,y) = f(x) + g(y)} or {P(x,y) = f(x) g(y)} (in which case the expansion theory is covered to a large extent by the previous work of Hart, Li, and Shen).

Our first main result improves upon the Bukh-Tsimerman result by strengthening the notion of expansion and removing the non-composite and monic hypotheses, but imposes a condition of large characteristic. I’ll state the result here slightly informally as follows:

Theorem 1 (Criterion for moderate expansion) Let {P: F \times F \rightarrow F} be a polynomial of bounded degree over a finite field {F} of sufficiently large characteristic, and suppose that {P} is not of the form {P(x,y) = Q(f(x)+g(y))} or {P(x,y) = Q(f(x)g(y))} for some polynomials {Q,f,g: F \rightarrow F}. Then one has the (asymmetric) moderate expansion property

\displaystyle  |P(A,B)| \gg |F|

whenever {|A| |B| \ggg |F|^{2-1/8}}.

This is basically a sharp necessary and sufficient condition for asymmetric expansion moderate for polynomials of two variables. In the paper, analogous sufficient conditions for weak or almost strong expansion are also given, although these are not quite as satisfactory (particularly the conditions for almost strong expansion, which include a somewhat complicated algebraic condition which is not easy to check, and which I would like to simplify further, but was unable to).

The argument here resembles the Bukh-Tsimerman argument in many ways. One can view the result as an assertion about the expansion properties of the graph {\{ (a,b,P(a,b)): a,b \in F \}}, which can essentially be thought of as a somewhat sparse three-uniform hypergraph on {F}. Being sparse, it is difficult to directly apply techniques from dense graph or hypergraph theory for this situation; however, after a few applications of the Cauchy-Schwarz inequality, it turns out (as observed by Bukh and Tsimerman) that one can essentially convert the problem to one about the expansion properties of the set

\displaystyle  \{ (P(a,c), P(b,c), P(a,d), P(b,d)): a,b,c,d \in F \} \ \ \ \ \ (2)

(actually, one should view this as a multiset, but let us ignore this technicality) which one expects to be a dense set in {F^4}, except in the case when the associated algebraic variety

\displaystyle  \{ (P(a,c), P(b,c), P(a,d), P(b,d)): a,b,c,d \in \overline{F} \}

fails to be Zariski dense, but it turns out that in this case one can use some differential geometry and Riemann surface arguments (after first invoking the Lefschetz principle and the high characteristic hypothesis to work over the complex numbers instead over a finite field) to show that {P} is of the form {Q(f(x)+g(y))} or {Q(f(x)g(y))}. This reduction is related to the classical fact that the only one-dimensional algebraic groups over the complex numbers are the additive group {({\bf C},+)}, the multiplicative group {({\bf C} \backslash \{0\},\times)}, or the elliptic curves (but the latter have a group law given by rational functions rather than polynomials, and so ultimately end up being eliminated from consideration, though they would play an important role if one wanted to study the expansion properties of rational functions).

It remains to understand the structure of the set (2) is. To understand dense graphs or hypergraphs, one of the standard tools of choice is the Szemerédi regularity lemma, which carves up such graphs into a bounded number of cells, with the graph behaving pseudorandomly on most pairs of cells. However, the bounds in this lemma are notoriously poor (the regularity obtained is an inverse tower exponential function of the number of cells), and this makes this lemma unsuitable for the type of expansion properties we seek (in which we want to deal with sets {A} which have a polynomial sparsity, e.g. {|A| \sim |F|^{1-c}}). Fortunately, in the case of sets such as (2) which are definable over the language of rings, it turns out that a much stronger regularity lemma is available, which I call the “algebraic regularity lemma”. I’ll state it (again, slightly informally) in the context of graphs as follows:

Lemma 2 (Algebraic regularity lemma) Let {F} be a finite field of large characteristic, and let {V, W} be definable sets over {F} of bounded complexity (i.e. {V, W} are subsets of {F^n}, {F^m} for some bounded {n,m} that can be described by some first-order predicate in the language of rings of bounded length and involving boundedly many constants). Let {E} be a definable subset of {V \times W}, again of bounded complexity (one can view {E} as a bipartite graph connecting {V} and {W}). Then one can partition {V, W} into a bounded number of cells {V_1,\ldots,V_a}, {W_1,\ldots,W_b}, still definable with bounded complexity, such that for all pairs {i =1,\ldots a}, {j=1,\ldots,b}, one has the regularity property

\displaystyle  |E \cap (A \times B)| = d_{ij} |A| |B| + O( |F|^{-1/4} |V| |W| )

for all {A \subset V_i, B \subset W_i}, where {d_{ij} := \frac{|E \cap (V_i \times W_j)|}{|V_i| |W_j|}} is the density of {E} in {V_i \times W_j}.

This lemma resembles the Szemerédi regularity lemma, but regularises all pairs of cells (not just most pairs), and the regularity is of polynomial strength in {|F|}, rather than inverse tower exponential in the number of cells. Also, the cells are not arbitrary subsets of {V,W}, but are themselves definable with bounded complexity, which turns out to be crucial for applications. I am optimistic that this lemma will be useful not just for studying expanding polynomials, but for many other combinatorial questions involving dense subsets of definable sets over finite fields.

The above lemma is stated for graphs {E \subset V \times W}, but one can iterate it to obtain an analogous regularisation of hypergraphs {E \subset V_1 \times \ldots \times V_k} for any bounded {k} (for application to (2), we need {k=4}). This hypergraph regularity lemma, by the way, is not analogous to the strong hypergraph regularity lemmas of Rodl et al. and Gowers developed in the last six or so years, but closer in spirit to the older (but weaker) hypergraph regularity lemma of Chung which gives the same “order {1}” regularity that the graph regularity lemma gives, rather than higher order regularity.

One feature of the proof of Lemma 2 which I found striking was the need to use some fairly high powered technology from algebraic geometry, and in particular the Lang-Weil bound on counting points in varieties over a finite field (discussed in this previous blog post), and also the theory of the etale fundamental group. Let me try to briefly explain why this is the case. A model example of a definable set of bounded complexity {E} is a set {E \subset F^n \times F^m} of the form

\displaystyle  E = \{ (x,y) \in F^n \times F^m: \exists t \in F; P(x,y,t)=0 \}

for some polynomial {P: F^n \times F^m \times F \rightarrow F}. (Actually, it turns out that one can essentially write all definable sets as an intersection of sets of this form; see this previous blog post for more discussion.) To regularise the set {E}, it is convenient to square the adjacency matrix, which soon leads to the study of counting functions such as

\displaystyle  \mu(x,x') := | \{ (y,t,t') \in F^m \times F \times F: P(x,y,t) = P(x',y,t') = 0 \}|.

If one can show that this function {\mu} is “approximately finite rank” in the sense that (modulo lower order errors, of size {O(|F|^{-1/2})} smaller than the main term), this quantity depends only on a bounded number of bits of information about {x} and a bounded number of bits of information about {x'}, then a little bit of linear algebra will then give the required regularity result.

One can recognise {\mu(x,x')} as counting {F}-points of a certain algebraic variety

\displaystyle  V_{x,x'} := \{ (y,t,t') \in \overline{F}^m \times \overline{F} \times \overline{F}: P(x,y,t) = P(x',y,t') = 0 \}.

The Lang-Weil bound (discussed in this previous post) provides a formula for this count, in terms of the number {c(x,x')} of geometrically irreducible components of {V_{x,x'}} that are defined over {F} (or equivalently, are invariant with respect to the Frobenius endomorphism associated to {F}). So the problem boils down to ensuring that this quantity {c(x,x')} is “generically bounded rank”, in the sense that for generic {x,x'}, its value depends only on a bounded number of bits of {x} and a bounded number of bits of {x'}.

Here is where the étale fundamental group comes in. One can view {V_{x,x'}} as a fibre product {V_x \times_{\overline{F}^m} V_{x'}} of the varieties

\displaystyle  V_x := \{ (y,t) \in \overline{F}^m \times \overline{F}: P(x,y,t) = 0 \}

and

\displaystyle  V_{x'} := \{ (y,t) \in \overline{F}^m \times \overline{F}: P(x',y,t) = 0 \}

over {\overline{F}^m}. If one is in sufficiently high characteristic (or even better, in zero characteristic, which one can reduce to by an ultraproduct (or nonstandard analysis) construction, similar to that discussed in this previous post), the varieties {V_x,V_{x'}} are generically finite étale covers of {\overline{F}^m}, and the fibre product {V_x \times_{\overline{F}^m} V_{x'}} is then also generically a finite étale cover. One can count the components of a finite étale cover of a connected variety by counting the number of orbits of the étale fundamental group acting on a fibre of that variety (much as the number of components of a cover of a connected manifold is the number of orbits of the topological fundamental group acting on that fibre). So if one understands the étale fundamental group of a certain generic subset of {\overline{F}^m} (formed by intersecting together an {x}-dependent generic subset of {\overline{F}^m} with an {x'}-dependent generic subset), this in principle controls {c(x,x')}. It turns out that one can decouple the {x} and {x'} dependence of this fundamental group by using an étale version of the van Kampen theorem for the fundamental group, which I discussed in this previous blog post. With this fact (and another deep fact about the étale fundamental group in zero characteristic, namely that it is topologically finitely generated), one can obtain the desired generic bounded rank property of {c(x,x')}, which gives the regularity lemma.

In order to expedite the deployment of all this algebraic geometry (as well as some Riemann surface theory), it is convenient to use the formalism of nonstandard analysis (or the ultraproduct construction), which among other things can convert quantitative, finitary problems in large characteristic into equivalent qualitative, infinitary problems in zero characteristic (in the spirit of this blog post). This allows one to use several tools from those fields as “black boxes”; not just the theory of étale fundamental groups (which are considerably simpler and more favorable in characteristic zero than they are in positive characteristic), but also some results limiting the morphisms between compact Riemann surfaces of high genus (such as the de Franchis theorem, the Riemann-Hurwitz formula, or the fact that all morphisms between elliptic curves are essentially group homomorphisms), which would be quite unwieldy to utilise if one did not first pass to the zero characteristic case (and thence to the complex case) via the ultraproduct construction (followed by the Lefschetz principle).

I found this project to be particularly educational for me, as it forced me to wander outside of my usual range by quite a bit in order to pick up the tools from algebraic geometry and Riemann surfaces that I needed (in particular, I read through several chapters of EGA and SGA for the first time). This did however put me in the slightly unnerving position of having to use results (such as the Riemann existence theorem) whose proofs I have not fully verified for myself, but which are easy to find in the literature, and widely accepted in the field. I suppose this type of dependence on results in the literature is more common in the more structured fields of mathematics than it is in analysis, which by its nature has fewer reusable black boxes, and so key tools often need to be rederived and modified for each new application. (This distinction is discussed further in this article of Gowers.)

One of the first non-trivial theorems one encounters in classical algebraic geometry is Bézout’s theorem, which we will phrase as follows:

Theorem 1 (Bézout’s theorem) Let {k} be a field, and let {P, Q \in k[x,y]} be non-zero polynomials in two variables {x,y} with no common factor. Then the two curves {\{ (x,y) \in k^2: P(x,y) = 0\}} and {\{ (x,y) \in k^2: Q(x,y) = 0\}} have no common components, and intersect in at most {\hbox{deg}(P) \hbox{deg}(Q)} points.

This theorem can be proven by a number of means, for instance by using the classical tool of resultants. It has many strengthenings, generalisations, and variants; see for instance this previous blog post on Bézout’s inequality. Bézout’s theorem asserts a fundamental algebraic dichotomy, of importance in combinatorial incidence geometry: any two algebraic curves either share a common component, or else have a bounded finite intersection; there is no intermediate case in which the intersection is unbounded in cardinality, but falls short of a common component. This dichotomy is closely related to the integrality gap in algebraic dimension: an algebraic set can have an integer dimension such as {0} or {1}, but cannot attain any intermediate dimensions such as {1/2}. This stands in marked contrast to sets of analytic, combinatorial, or probabilistic origin, whose “dimension” is typically not necessarily constrained to be an integer.

Bézout’s inequality tells us, roughly speaking, that the intersection of a curve of degree {D_1} and a curve of degree {D_2} forms a set of at most {D_1 D_2} points. One can consider the converse question: given a set {S} of {N} points in the plane {k^2}, can one find two curves of degrees {D_1,D_2} with {D_1 D_2 = O(N)} and no common components, whose intersection contains these points?

A model example that supports the possibility of such a converse is a grid {S = A \times B} that is a Cartesian product of two finite subsets {A, B} of {k} with {|A| |B| = N}. In this case, one can take one curve to be the union of {|A|} vertical lines, and the other curve to be the union of {|B|} horizontal lines, to obtain the required decomposition. Thus, if the proposed converse to Bézout’s inequality held, it would assert that any set of {N} points was essentially behaving like a “nonlinear grid” of size {N}.

Unfortunately, the naive converse to Bézout’s theorem is false. A counterexample can be given by considering a set {S = S_1 \cup S_2} of {2N} points for some large perfect square {N}, where {P_1} is a {\sqrt{N}} by {\sqrt{N}} grid of the form described above, and {S_2} consists of {N} points on an line {\ell} (e.g. a {1 \times N} or {N \times 1} grid). Each of the two component sets {S_1, S_2} can be written as the intersection between two curves whose degrees multiply up to {N}; in the case of {S_1}, we can take the two families of parallel lines (viewed as reducible curves of degree {\sqrt{N}}) as the curves, and in the case of {S_2}, one can take {\ell} as one curve, and the graph of a degree {N} polynomial on {\ell} vanishing on {S_2} for the second curve. But, if {N} is large enough, one cannot cover {S} by the intersection of a single pair {\gamma_1, \gamma_2} of curves with no common components whose degrees {D_1,D_2} multiply up to {D_1 D_2 = O(N)}. Indeed, if this were the case, then without loss of generality we may assume that {D_1 \leq D_2}, so that {D_1 = O(\sqrt{N})}. By Bézout’s theorem, {\gamma_1} either contains {\ell}, or intersects {\ell} in at most {O(D_1) = O(\sqrt{N})} points. Thus, in order for {\gamma_1} to capture all of {S}, it must contain {\ell}, which forces {\gamma_2} to not contain {\ell}. But {\gamma_2} has to intersect {\ell} in {N} points, so by Bézout’s theorem again we have {D_2 \geq N}, thus {D_1 = O(1)}. But then (by more applications of Bézout’s theorem) {\gamma_1} can only capture {O(\sqrt{N})} of the {N} points of {S_1}, a contradiction.

But the above counterexample suggests that even if an arbitrary set of {N} (or {2N}) points cannot be covered by the single intersection of a pair of curves with degree multiplying up to {O(N)}, one may be able to cover such a set by a small number of such intersections. The purpose of this post is to record the simple observation that this is, indeed, the case:

Theorem 2 (Partial converse to Bézout’s theorem) Let {k} be a field, and let {S} be a set of {N} points in {k} for some {N > 1}. Then one can find {m = O(\log N)} and pairs {P_i,Q_i \in k[x,y]} of coprime non-zero polynomials for {i=1,\ldots,m} such that

\displaystyle  S \subset \bigcup_{i=1}^m \{ (x,y) \in k^2: P_i(x,y) = Q_i(x,y) = 0 \} \ \ \ \ \ (1)

and

\displaystyle  \sum_{i=1}^m \hbox{deg}(P_i) \hbox{deg}(Q_i) = O( N ). \ \ \ \ \ (2)

Informally, every finite set in the plane is (a dense subset of) the union of logarithmically many nonlinear grids. The presence of the logarithm is necessary, as can be seen by modifying the {P_1 \cup P_2} example to be the union of logarithmically many Cartesian products of distinct dimensions, rather than just a pair of such products.

Unfortunately I do not know of any application of this converse, but I thought it was cute anyways. The proof is given below the fold.

Read the rest of this entry »

Archives

RSS Google+ feed

  • An error has occurred; the feed is probably down. Try again later.
Follow

Get every new post delivered to your Inbox.

Join 3,875 other followers