
Let {{\bf F}_q} be a finite field of order {q = p^n}, and let {C} be an absolutely irreducible smooth projective curve defined over {{\bf F}_q} (and hence over the algebraic closure {k := \overline{{\bf F}_q}} of that field). For instance, {C} could be the projective elliptic curve

\displaystyle C = \{ [x,y,z]: y^2 z = x^3 + ax z^2 + b z^3 \}

in the projective plane {{\bf P}^2 = \{ [x,y,z]: (x,y,z) \neq (0,0,0) \}}, where {a,b \in {\bf F}_q} are coefficients whose discriminant {-16(4a^3+27b^2)} is non-vanishing, which is the projective version of the affine elliptic curve

\displaystyle \{ (x,y): y^2 = x^3 + ax + b \}.

To each such curve {C} one can associate a genus {g}, which we will define later; for instance, elliptic curves have genus {1}. We can also count the cardinality {|C({\bf F}_q)|} of the set {C({\bf F}_q)} of {{\bf F}_q}-points of {C}. The Hasse-Weil bound relates the two:

Theorem 1 (Hasse-Weil bound) {||C({\bf F}_q)| - q - 1| \leq 2g\sqrt{q}}.
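As a quick sanity check of Theorem 1 in the genus one case, the following sketch counts the points of a projective elliptic curve over a small prime field by brute force and tests the bound. The helper names and the sample curve {y^2 = x^3 + x + 1} over {{\bf F}_5} are illustrative choices rather than anything from the argument itself, and the approach is only feasible for small {p}.

```python
def count_points(a, b, p):
    """Count F_p-points of the projective curve y^2 z = x^3 + a x z^2 + b z^3.

    Assumes p is an odd prime and that the discriminant -16(4a^3 + 27b^2)
    is non-zero mod p (illustrative helper, not from the text).
    """
    # square_roots[r] = number of y in F_p with y^2 = r.
    square_roots = [0] * p
    for y in range(p):
        square_roots[y * y % p] += 1
    # Affine points (z = 1), plus the single point at infinity [0, 1, 0].
    affine = sum(square_roots[(x**3 + a * x + b) % p] for x in range(p))
    return affine + 1


def satisfies_hasse_weil(n_points, q, g):
    """Test |n_points - q - 1| <= 2 g sqrt(q) using exact integer arithmetic."""
    return (n_points - q - 1) ** 2 <= 4 * g * g * q


n = count_points(1, 1, 5)
print(n, satisfies_hasse_weil(n, 5, 1))  # prints: 9 True
```

Squaring both sides of the inequality avoids floating-point square roots, so the test is exact.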

The usual proofs of this bound proceed by first establishing a trace formula of the form

\displaystyle |C({\bf F}_{p^n})| = p^n - \sum_{i=1}^{2g} \alpha_i^n + 1 \ \ \ \ \ (1)


for some complex numbers {\alpha_1,\dots,\alpha_{2g}} independent of {n}; this is in fact a special case of the Lefschetz-Grothendieck trace formula, and can be interpreted as an assertion that the zeta function associated to the curve {C} is rational. The task is then to establish a bound {|\alpha_i| \leq \sqrt{p}} for all {i=1,\dots,2g}; this (or more precisely, the slightly stronger assertion {|\alpha_i| = \sqrt{p}}) is the Riemann hypothesis for such curves. This can be done either by passing to the Jacobian variety of {C} and using a certain duality available on the cohomology of such varieties, known as the Rosati involution; alternatively, one can pass to the product surface {C \times C} and apply the Riemann-Roch theorem for that surface.
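To see the trace formula (1) in action, here is a small sketch for an elliptic curve ({g=1}) over {{\bf F}_5}. It relies on an additional standard fact that is not stated above and is treated here as an assumption: in genus one the two weights satisfy {\alpha_1 \alpha_2 = p}, so that {\alpha_1+\alpha_2 = p + 1 - |C({\bf F}_p)|} determines all the power sums {\alpha_1^n + \alpha_2^n} by a two-term recurrence. The prediction for {n=2} is then compared with a brute-force count over {{\bf F}_{25}}, modelled as {{\bf F}_5[t]/(t^2-2)}; all function names are hypothetical.

```python
from itertools import product


def power_sum(n, trace, p):
    """Return alpha_1^n + alpha_2^n for n >= 1, given alpha_1 + alpha_2 = trace
    and alpha_1 * alpha_2 = p (the genus-one assumption described above)."""
    s_prev, s = 2, trace  # power sums for exponents 0 and 1
    for _ in range(n - 1):
        s_prev, s = s, trace * s - p * s_prev
    return s


def predicted_count(n, p, n1):
    """Formula (1) with g = 1: |C(F_{p^n})| = p^n - (alpha_1^n + alpha_2^n) + 1."""
    trace = p + 1 - n1
    return p**n - power_sum(n, trace, p) + 1


def count_points_fp2(a, b, p, d):
    """Brute-force count of y^2 = x^3 + a x + b over F_{p^2} = F_p[t]/(t^2 - d).

    d must be a non-residue mod p; elements are pairs (u0, u1) = u0 + u1*t.
    """
    def mul(u, v):
        return ((u[0] * v[0] + d * u[1] * v[1]) % p,
                (u[0] * v[1] + u[1] * v[0]) % p)

    elems = list(product(range(p), repeat=2))
    roots = {}  # value of y^2 -> number of y with that square
    for y in elems:
        y2 = mul(y, y)
        roots[y2] = roots.get(y2, 0) + 1
    total = 1  # the point at infinity
    for x in elems:
        x3 = mul(mul(x, x), x)
        ax = mul((a % p, 0), x)
        rhs = ((x3[0] + ax[0] + b) % p, (x3[1] + ax[1]) % p)
        total += roots.get(rhs, 0)
    return total


# y^2 = x^3 + x + 1 has 9 points over F_5; 2 is a non-residue mod 5.
print(predicted_count(2, 5, 9), count_points_fp2(1, 1, 5, 2))
```

The two printed numbers should agree, which is the content of (1) for {n=2} in this example.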

In 1969, Stepanov introduced an elementary method (a version of what is now known as the polynomial method) to count (or at least to upper bound) the quantity {|C({\bf F}_q)|}. The method was initially restricted to hyperelliptic curves, but was soon extended to general curves. In particular, Bombieri used this method to give a short proof of the following weaker version of the Hasse-Weil bound:

Theorem 2 (Weak Hasse-Weil bound) If {q} is a perfect square, and {q \geq (g+1)^4}, then {|C({\bf F}_q)| \leq q + (2g+1) \sqrt{q} + 1}.

In fact, the bound on {|C({\bf F}_q)|} can be sharpened a little bit further, as we will soon see.

Theorem 2 is only an upper bound on {|C({\bf F}_q)|}, but there is a Galois-theoretic trick to convert (a slight generalisation of) this upper bound to a matching lower bound, and if one then uses the trace formula (1) (and the “tensor power trick” of sending {n} to infinity to control the weights {\alpha_i}) one can then recover the full Hasse-Weil bound. We discuss these steps below the fold.

I’ve discussed Bombieri’s proof of Theorem 2 in this previous post (in the special case of hyperelliptic curves), but now wish to present the full proof, with some minor simplifications from Bombieri’s original presentation; it is mostly elementary, with the deepest fact from algebraic geometry needed being Riemann’s inequality (a weak form of the Riemann-Roch theorem).

The first step is to reinterpret {|C({\bf F}_q)|} as the number of points of intersection between two curves {C_1,C_2} in the surface {C \times C}. Indeed, if we define the Frobenius endomorphism {\hbox{Frob}_q} on any projective space by

\displaystyle \hbox{Frob}_q( [x_0,\dots,x_n] ) := [x_0^q, \dots, x_n^q]

then this map preserves the curve {C}, and the fixed points of this map are precisely the {{\bf F}_q} points of {C}:

\displaystyle C({\bf F}_q) = \{ z \in C: \hbox{Frob}_q(z) = z \}.
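The fixed-point description can be checked by hand in a toy model. The sketch below uses {{\bf F}_{25}} (modelled as {{\bf F}_5[t]/(t^2-2)}) as a finite stand-in for the algebraic closure, takes the base field to be {{\bf F}_5}, and works only with the affine points of the sample curve {y^2 = x^3+x+1}; these choices, and all helper names, are assumptions made purely for illustration.

```python
from itertools import product

P, D = 5, 2  # F_25 modelled as F_5[t]/(t^2 - 2); 2 is a non-residue mod 5
A, B = 1, 1  # sample curve y^2 = x^3 + x + 1


def mul(u, v):
    """Multiply u0 + u1*t and v0 + v1*t, using t^2 = D."""
    return ((u[0] * v[0] + D * u[1] * v[1]) % P,
            (u[0] * v[1] + u[1] * v[0]) % P)


def power(u, n):
    """Compute u^n by repeated multiplication (n is small here)."""
    result = (1, 0)
    for _ in range(n):
        result = mul(result, u)
    return result


def frob(point):
    """Frob_5 on an affine point: raise each coordinate to the 5th power."""
    return tuple(power(c, P) for c in point)


def on_curve(point):
    """Test whether an affine point (x, y) over F_25 satisfies the curve equation."""
    x, y = point
    x3, ax = power(x, 3), mul((A, 0), x)
    rhs = ((x3[0] + ax[0] + B) % P, (x3[1] + ax[1]) % P)
    return mul(y, y) == rhs


elems = list(product(range(P), repeat=2))
curve = [pt for pt in product(elems, repeat=2) if on_curve(pt)]
fixed = [pt for pt in curve if frob(pt) == pt]

print(all(on_curve(frob(pt)) for pt in curve))  # Frob_5 preserves the curve
print(len(curve), len(fixed))  # affine points over F_25, and those fixed by Frob_5
```

In this model the fixed points are exactly the affine points with both coordinates in {{\bf F}_5}, and applying `frob` twice gives the identity on every point, since {\hbox{Frob}_5^2 = \hbox{Frob}_{25}} acts trivially on {{\bf F}_{25}}.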

Thus one can interpret {|C({\bf F}_q)|} as the number of points of intersection between the diagonal curve

\displaystyle \{ (z,z): z \in C \}

and the Frobenius graph

\displaystyle \{ (z, \hbox{Frob}_q(z)): z \in C \}

which are copies of {C} inside {C \times C}. But we can use the additional hypothesis that {q} is a perfect square to write this more symmetrically, by taking advantage of the fact that the Frobenius map has a square root

\displaystyle \hbox{Frob}_q = \hbox{Frob}_{\sqrt{q}}^2

with {\hbox{Frob}_{\sqrt{q}}} also preserving {C}. One can then also interpret {|C({\bf F}_q)|} as the number of points of intersection between the curve

\displaystyle C_1 := \{ (z, \hbox{Frob}_{\sqrt{q}}(z)): z \in C \} \ \ \ \ \ (2)


and its transpose

\displaystyle C_2 := \{ (\hbox{Frob}_{\sqrt{q}}(w), w): w \in C \}.

Let {k(C \times C)} be the field of rational functions on {C \times C} (with coefficients in {k}), and define {k(C_1)}, {k(C_2)}, and {k(C_1 \cap C_2)} analogously (although {C_1 \cap C_2} is likely to be disconnected, so {k(C_1 \cap C_2)} will just be a ring rather than a field). We then (morally) have the commuting square

\displaystyle \begin{array}{ccccc} && k(C \times C) && \\ & \swarrow & & \searrow & \\ k(C_1) & & & & k(C_2) \\ & \searrow & & \swarrow & \\ && k(C_1 \cap C_2) && \end{array},

if we ignore the issue that a rational function on, say, {C \times C}, might blow up on all of {C_1} and thus not have a well-defined restriction to {C_1}. We use {\pi_1: k(C \times C) \rightarrow k(C_1)} and {\pi_2: k(C \times C) \rightarrow k(C_2)} to denote the restriction maps. Furthermore, we have obvious isomorphisms {\iota_1: k(C_1) \rightarrow k(C)}, {\iota_2: k(C_2) \rightarrow k(C)} coming from composing with the graphing maps {z \mapsto (z, \hbox{Frob}_{\sqrt{q}}(z))} and {w \mapsto (\hbox{Frob}_{\sqrt{q}}(w), w)}.

The idea now is to find a rational function {f \in k(C \times C)} on the surface {C \times C} of controlled degree which vanishes when restricted to {C_2}, but is non-vanishing (and not blowing up) when restricted to {C_1}. On {C_1}, we thus get a non-zero rational function {f \downharpoonright_{C_1}} of controlled degree which vanishes on {C_1 \cap C_2} – which then lets us bound the cardinality of {C_1 \cap C_2} in terms of the degree of {f \downharpoonright_{C_1}}. (In Bombieri’s original argument, one required vanishing to high order on the {C_2} side, but in our presentation, we have factored out a {\hbox{Frob}_{\sqrt{q}}} term which removes this high order vanishing condition.)

To find this {f}, we will use linear algebra. Namely, we will locate a finite-dimensional subspace {V} of {k(C \times C)} (consisting of certain “controlled degree” rational functions) which projects injectively to {k(C_1)}, but whose projection to {k(C_2)} has strictly smaller dimension than {V} itself. The rank-nullity theorem then forces the existence of a non-zero element {f} of {V} whose projection to {k(C_2)} vanishes, but whose projection to {k(C_1)} is non-zero.

Now we build {V}. Pick an {{\bf F}_q} point {P_\infty} of {C}, which we will think of as being a point at infinity. (For the purposes of proving Theorem 2, we may clearly assume that {C({\bf F}_q)} is non-empty.) Thus {P_\infty} is fixed by {\hbox{Frob}_q}. To simplify the exposition, we will also assume that {P_\infty} is fixed by the square root {\hbox{Frob}_{\sqrt{q}}} of {\hbox{Frob}_q}; in the opposite case when {\hbox{Frob}_{\sqrt{q}}} has order two when acting on {P_\infty}, the argument is essentially the same, but all references to {P_\infty} in the second factor of {C \times C} need to be replaced by {\hbox{Frob}_{\sqrt{q}} P_\infty} (we leave the details to the interested reader).

For any natural number {n}, define {R_n} to be the set of rational functions {f \in k(C)} which are allowed to have a pole of order up to {n} at {P_\infty}, but have no other poles on {C}; note that as we are assuming {C} to be smooth, it is unambiguous what a pole is (and what order it will have). (In the fancier language of divisors and Cech cohomology, we have {R_n = H^0( C, {\mathcal O}_C(n P_\infty) )}.) The space {R_n} is clearly a vector space over {k}; one can view it intuitively as the space of “polynomials” on {C} of “degree” at most {n}. When {n=0}, {R_0} consists just of the constant functions. Indeed, if {f \in R_0}, then the image {f(C)} of {f} avoids {\infty} and so lies in the affine line {k = {\mathbf P}^1 \backslash \{\infty\}}; but as {C} is projective, the image {f(C)} needs to be compact (hence closed) in {{\mathbf P}^1}, and must therefore be a point, giving the claim.

For higher {n \geq 1}, we have the easy relations

\displaystyle \hbox{dim}(R_{n-1}) \leq \hbox{dim}(R_n) \leq \hbox{dim}(R_{n-1})+1. \ \ \ \ \ (3)


The former inequality just comes from the trivial inclusion {R_{n-1} \subset R_n}. For the latter, observe that if two functions {f, g} lie in {R_n}, so that they each have a pole of order at most {n} at {P_\infty}, then some non-trivial linear combination of these functions must have a pole of order at most {n-1} at {P_\infty}; thus {R_{n-1}} has codimension at most one in {R_n}, giving the claim.

From (3) and induction we see that each of the {R_n} is finite dimensional, with the trivial upper bound

\displaystyle \hbox{dim}(R_n) \leq n+1. \ \ \ \ \ (4)


Riemann’s inequality complements this with the lower bound

\displaystyle \hbox{dim}(R_n) \geq n+1-g, \ \ \ \ \ (5)


thus one has {\hbox{dim}(R_n) = \hbox{dim}(R_{n-1})+1} for all but at most {g} exceptions (in fact, exactly {g} exceptions as it turns out). This is a consequence of the Riemann-Roch theorem; it can be proven from abstract nonsense (the snake lemma) if one defines the genus {g} in a non-standard fashion (as the dimension of the first Cech cohomology {H^1(C)} of the structure sheaf {{\mathcal O}_C} of {C}), but to obtain this inequality with a standard definition of {g} (e.g. as the dimension of the zeroth Cech cohomology {H^0(C, \Omega_C^1)} of the line bundle of differentials) requires the more non-trivial tool of Serre duality.
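For a concrete instance of (4) and (5), consider the elliptic curve {y^2 = x^3+ax+b} with {P_\infty = [0,1,0]}. The sketch below takes as an assumption the standard facts (not derived here) that {x} and {y} have poles of order {2} and {3} at {P_\infty}, and that the monomials {x^i y^j} with {j \in \{0,1\}} and {2i+3j \leq n} form a basis of {R_n}. Under that assumption, {\hbox{dim}(R_n)} is just the number of attainable pole orders up to {n}; the function names are hypothetical.

```python
def pole_orders(n):
    """Pole orders 2i + 3j (i >= 0, j in {0, 1}) of monomials x^i y^j, up to n."""
    return sorted({2 * i + 3 * j
                   for i in range(n // 2 + 1)
                   for j in (0, 1)
                   if 2 * i + 3 * j <= n})


def dim_R(n):
    """dim(R_n) under the basis assumption stated in the lead-in."""
    return len(pole_orders(n))


g = 1  # genus of an elliptic curve
for n in range(6):
    d = dim_R(n)
    assert n + 1 - g <= d <= n + 1  # bounds (5) and (4)
    print(n, pole_orders(n), d)
```

In this model the only {n \geq 1} with {\hbox{dim}(R_n) = \hbox{dim}(R_{n-1})} is {n=1}, since no function in this family has a simple pole at {P_\infty}; this is the single exception allowed when {g=1}.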

At any rate, now that we have these vector spaces {R_n}, we will define {V \subset k(C \times C)} to be a tensor product space

\displaystyle V = R_\ell \otimes R_m

for some natural numbers {\ell, m \geq 0} which we will optimise later. That is to say, {V} is spanned by functions of the form {(z,w) \mapsto f(z) g(w)} with {f \in R_\ell} and {g \in R_m}. This is clearly a linear subspace of {k(C \times C)} of dimension {\hbox{dim}(R_\ell) \hbox{dim}(R_m)}, and hence by Riemann’s inequality we have

\displaystyle \hbox{dim}(V) \geq (\ell+1-g) (m+1-g) \ \ \ \ \ (6)


if

\displaystyle \ell,m \geq g-1. \ \ \ \ \ (7)


Observe that {\iota_1 \circ \pi_1} maps a tensor product {(z,w) \mapsto f(z) g(w)} to a function {z \mapsto f(z) g(\hbox{Frob}_{\sqrt{q}} z)}. If {f \in R_\ell} and {g \in R_m}, then we see that the function {z \mapsto f(z) g(\hbox{Frob}_{\sqrt{q}} z)} has a pole of order at most {\ell+m\sqrt{q}} at {P_\infty}. We conclude that

\displaystyle \iota_1 \circ \pi_1( V ) \subset R_{\ell + m\sqrt{q}} \ \ \ \ \ (8)


and in particular by (4)

\displaystyle \hbox{dim}(\pi_1(V)) \leq \ell + m \sqrt{q} + 1 \ \ \ \ \ (9)


and similarly

\displaystyle \hbox{dim}(\pi_2(V)) \leq \ell \sqrt{q} + m + 1. \ \ \ \ \ (10)


We will choose {m} to be a bit bigger than {\ell}, to make the {\pi_2} image of {V} smaller than that of {\pi_1}. From (6), (10) we see that if we have the inequality

\displaystyle (\ell+1-g) (m+1-g) > \ell \sqrt{q}+m + 1 \ \ \ \ \ (11)


(together with (7)) then {\pi_2} cannot be injective.

On the other hand, we have the following basic fact:

Lemma 3 (Injectivity) If

\displaystyle \ell < \sqrt{q}, \ \ \ \ \ (12)


then {\pi_1: V \rightarrow \pi_1(V)} is injective.

Proof: From (3), we can find a linear basis {f_1,\dots,f_a} of {R_\ell} such that each of the {f_i} has a distinct order {d_i} of pole at {P_\infty} (somewhere between {0} and {\ell} inclusive). Similarly, we may find a linear basis {g_1,\dots,g_b} of {R_m} such that each of the {g_j} has a distinct order {e_j} of pole at {P_\infty} (somewhere between {0} and {m} inclusive). The functions {z \mapsto f_i(z) g_j(\hbox{Frob}_{\sqrt{q}} z)} then span {\iota_1(\pi_1(V))}, and the order of pole at {P_\infty} is {d_i + \sqrt{q} e_j}. But since {\ell < \sqrt{q}}, these orders are all distinct, and so these functions must be linearly independent. The claim follows. \Box
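The combinatorial heart of this proof is that the map {(d,e) \mapsto d + \sqrt{q} e} is injective on {0 \leq d \leq \ell}, {0 \leq e \leq m} as soon as {\ell < \sqrt{q}} (it is a base-{\sqrt{q}} expansion with digit {d}). The following sketch checks this on the full grid of candidate orders, which is a worst case since the actual {d_i, e_j} form a subset of it; the function name and the parameter `s` (standing for {\sqrt{q}}) are illustrative.

```python
def orders_are_distinct(l, m, s):
    """Check whether (d, e) -> d + s*e is injective on 0 <= d <= l, 0 <= e <= m."""
    orders = [d + s * e for d in range(l + 1) for e in range(m + 1)]
    return len(orders) == len(set(orders))


print(orders_are_distinct(2, 5, 3))  # l < s: all orders distinct
print(orders_are_distinct(3, 5, 3))  # l = s: (d, e) = (3, 0) and (0, 1) collide
```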

This gives us the following bound:

Proposition 4 Let {\ell,m} be natural numbers such that (7), (11), (12) hold. Then {|C({\bf F}_q)| \leq \ell + m \sqrt{q}}.

Proof: As {\pi_2} is not injective, we can find a non-zero {f \in V} with {\pi_2(f)} vanishing. By the above lemma, the function {\iota_1(\pi_1(f))} is then non-zero, but it must also vanish on {\iota_1(C_1 \cap C_2)}, which has cardinality {|C({\bf F}_q)|}. On the other hand, by (8), {\iota_1(\pi_1(f))} has a pole of order at most {\ell+m\sqrt{q}} at {P_\infty} and no other poles. Since a non-zero rational function on a projective curve has as many zeroes as poles (counting multiplicity), the claim follows. \Box

If {q \geq (g+1)^4}, we may make the explicit choice

\displaystyle m := \sqrt{q}+2g; \quad \ell := \lfloor \frac{g}{g+1} \sqrt{q} \rfloor + g + 1

and a brief calculation then gives Theorem 2. In some cases one can optimise things a bit further. For instance, in the genus zero case {g=0} (e.g. if {C} is just the projective line {{\mathbf P}^1}) one may take {\ell=1, m = \sqrt{q}} and conclude the absolutely sharp bound {|C({\bf F}_q)| \leq q+1} in this case; in the case of the projective line {{\mathbf P}^1}, the function {f} is in fact the very concrete function {f(z,w) := z - w^{\sqrt{q}}}.
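Proposition 4 can also be explored numerically for small parameters. The sketch below searches by brute force over all pairs {(\ell,m)} satisfying (7), (11), (12) and returns the smallest resulting bound {\ell + m\sqrt{q}}; the function names and the parameter `s` (standing for {\sqrt{q}}) are hypothetical, and nothing here is claimed beyond what the search itself reports.

```python
def satisfies_conditions(l, m, g, s):
    """Conditions (7), (11), (12) for natural numbers l, m, with s = sqrt(q)."""
    return (l >= g - 1 and m >= g - 1                       # (7)
            and (l + 1 - g) * (m + 1 - g) > l * s + m + 1   # (11)
            and l < s)                                      # (12)


def best_bound(g, s):
    """Smallest l + m*s over admissible (l, m); returns (bound, l, m) or None."""
    best = None
    for l in range(max(g - 1, 0), s):
        # (11) can only hold for some m when the factor l + 1 - g is at least 2.
        if l + 1 - g < 2:
            continue
        m = max(g - 1, 0)
        while not satisfies_conditions(l, m, g, s):
            m += 1  # the left side of (11) grows faster in m than the right side
        candidate = (l + m * s, l, m)
        if best is None or candidate < best:
            best = candidate
    return best


print(best_bound(0, 3))  # genus zero, q = 9
```

In the genus zero case the search recovers the choice {\ell = 1}, {m = \sqrt{q}} and the bound {q+1} mentioned above.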

Remark 1 When {q = p^{2n+1}} is not a perfect square, one can try to run the above argument using the factorisation {\hbox{Frob}_q = \hbox{Frob}_{p^n} \hbox{Frob}_{p^{n+1}}} instead of {\hbox{Frob}_q = \hbox{Frob}_{\sqrt{q}} \hbox{Frob}_{\sqrt{q}}}. This gives a weaker version of the above bound, of the shape {|C({\bf F}_q)| \leq q + O( \sqrt{p} \sqrt{q} )}. In the hyperelliptic case at least, one can erase this loss by working with a variant of the argument in which one requires {f} to vanish to high order at {C_2}, rather than just to first order; see this survey article of mine for details.


Let {F} be a finite field, with algebraic closure {\overline{F}}, and let {V} be an (affine) algebraic variety defined over {\overline{F}}, by which I mean a set of the form

\displaystyle  V = \{ x \in \overline{F}^d: P_1(x) = \ldots = P_m(x) = 0 \}

for some ambient dimension {d \geq 0}, and some finite number of polynomials {P_1,\ldots,P_m: \overline{F}^d \rightarrow \overline{F}}. In order to reduce the number of subscripts later on, let us say that {V} has complexity at most {M} if {d}, {m}, and the degrees of the {P_1,\ldots,P_m} are all less than or equal to {M}. Note that we do not require at this stage that {V} be irreducible (i.e. not the union of two strictly smaller varieties), or defined over {F}, though we will often specialise to these cases later in this post. (Also, everything said here can also be applied with almost no changes to projective varieties, but we will stick with affine varieties for sake of concreteness.)

One can consider two crude measures of how “big” the variety {V} is. The first measure, which is algebraic geometric in nature, is the dimension {\hbox{dim}(V)} of the variety {V}, which is an integer between {0} and {d} (or, depending on convention, {-\infty}, {-1}, or undefined, if {V} is empty) that can be defined in a large number of ways (e.g. it is the largest {r} for which the generic linear projection from {V} to {\overline{F}^r} is dominant, or the smallest {r} for which the intersection with a generic codimension {r} subspace is non-empty). The second measure, which is number-theoretic in nature, is the number {|V(F)| = |V \cap F^d|} of {F}-points of {V}, i.e. points {x = (x_1,\ldots,x_d)} in {V} all of whose coefficients lie in the finite field, or equivalently the number of solutions to the system of equations {P_i(x_1,\ldots,x_d) = 0} for {i=1,\ldots,m} with variables {x_1,\ldots,x_d} in {F}.

These two measures are linked together in a number of ways. For instance, we have the basic Schwarz-Zippel type bound (which, in this qualitative form, goes back at least to Lemma 1 of the work of Lang and Weil in 1954).

Lemma 1 (Schwarz-Zippel type bound) Let {V} be a variety of complexity at most {M}. Then we have {|V(F)| \ll_M |F|^{\hbox{dim}(V)}}.

Proof: (Sketch) For the purposes of exposition, we will not carefully track the dependencies of implied constants on the complexity {M}, instead simply assuming that all of these quantities remain controlled throughout the argument. (If one wished, one could obtain ineffective bounds on these quantities by an ultralimit argument, as discussed in this previous post, or equivalently by moving everything over to a nonstandard analysis framework; one could also obtain such uniformity using the machinery of schemes.)

We argue by induction on the ambient dimension {d} of the variety {V}. The {d=0} case is trivial, so suppose {d \geq 1} and that the claim has already been proven for {d-1}. By breaking up {V} into irreducible components we may assume that {V} is irreducible (this requires some control on the number and complexity of these components, but this is available, as discussed in this previous post). For each {x_1,\ldots,x_{d-1} \in \overline{F}}, the fibre {\{ x_d \in \overline{F}: (x_1,\ldots,x_{d-1},x_d) \in V \}} is either one-dimensional (and thus all of {\overline{F}}) or zero-dimensional. In the latter case, one has {O_M(1)} points in the fibre from the fundamental theorem of algebra (indeed one has a bound of {M} in this case), and {(x_1,\ldots,x_{d-1})} lives in the projection of {V} to {\overline{F}^{d-1}}, which is a variety of dimension at most {\hbox{dim}(V)} and controlled complexity, so the contribution of this case is acceptable from the induction hypothesis. In the former case, the fibre contributes {|F|} {F}-points, but {(x_1,\ldots,x_{d-1})} lies in a variety in {\overline{F}^{d-1}} of dimension at most {\hbox{dim}(V)-1} (since otherwise {V} would contain a subvariety of dimension at least {\hbox{dim}(V)+1}, which is absurd) and controlled complexity, and so the contribution of this case is also acceptable from the induction hypothesis. \Box

One can improve the bound on the implied constant to be linear in the degree of {V} (see e.g. Claim 7.2 of this paper of Dvir, Kollar, and Lovett, or Lemma A.3 of this paper of Ellenberg, Oberlin, and myself), but we will not be concerned with these improvements here.

Without further hypotheses on {V}, the above upper bound is sharp (except for improvements in the implied constants). For instance, the variety

\displaystyle  V := \{ (x_1,\ldots,x_d) \in \overline{F}^d: \prod_{j=1}^D (x_d - a_j) = 0\},

where {a_1,\ldots,a_D \in F} are distinct, is the union of {D} distinct hyperplanes of dimension {d-1}, with {|V(F)| = D |F|^{d-1}} and complexity {\max(D,d)}; similar examples can easily be concocted for other choices of {\hbox{dim}(V)}. In the other direction, there is also no non-trivial lower bound for {|V(F)|} without further hypotheses on {V}. For a trivial example, if {a} is an element of {\overline{F}} that does not lie in {F}, then the hyperplane

\displaystyle  V := \{ (x_1,\ldots,x_d) \in \overline{F}^d: x_d - a = 0 \}

clearly has no {F}-points whatsoever, despite being a {d-1}-dimensional variety in {\overline{F}^d} of complexity {d}. For a slightly less trivial example, if {a} is an element of {F} that is not a quadratic residue, then the variety

\displaystyle  V := \{ (x_1,\ldots,x_d) \in \overline{F}^d: x_d^2 - a = 0 \},

which is the union of two hyperplanes, still has no {F}-points, even though this time the variety is defined over {F} instead of {\overline{F}} (by which we mean that the defining polynomial(s) have all of their coefficients in {F}). There is however the important Lang-Weil bound that allows for a much better estimate as long as {V} is both defined over {F} and irreducible:

Theorem 2 (Lang-Weil bound) Let {V} be a variety of complexity at most {M}. Assume that {V} is defined over {F}, and that {V} is irreducible as a variety over {\overline{F}} (i.e. {V} is geometrically irreducible or absolutely irreducible). Then

\displaystyle  |V(F)| = (1 + O_M(|F|^{-1/2})) |F|^{\hbox{dim}(V)}.

Again, more explicit bounds on the implied constant here are known, but will not be the focus of this post. As the previous examples show, the hypotheses of definability over {F} and geometric irreducibility are both necessary.
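The examples above are easy to reproduce by brute force over a small prime field. The following sketch counts {F}-points of a few plane varieties over {F = {\bf F}_5}; the helper `count_F_points` and the specific polynomials are illustrative choices, with the hyperbola {x_1 x_2 = 1} included as a sample geometrically irreducible curve defined over {F}.

```python
from itertools import product


def count_F_points(polys, p, d):
    """Count x in F_p^d with P(x) = 0 mod p for every polynomial P in polys."""
    return sum(all(P(x) % p == 0 for P in polys)
               for x in product(range(p), repeat=d))


p, d = 5, 2
# Union of D = 2 hyperplanes x_2 = 0 and x_2 = 1: expect D * p^(d-1) points.
two_planes = [lambda x: x[1] * (x[1] - 1)]
# x_2^2 - 2 = 0, where 2 is not a quadratic residue mod 5: no F-points.
conjugate_planes = [lambda x: x[1] ** 2 - 2]
# Hyperbola x_1 x_2 = 1: irreducible and defined over F, with p - 1 points.
hyperbola = [lambda x: x[0] * x[1] - 1]

print(count_F_points(two_planes, p, d),
      count_F_points(conjugate_planes, p, d),
      count_F_points(hyperbola, p, d))  # prints: 10 0 4
```

The first two counts show that neither the upper bound of Lemma 1 nor a lower bound can be improved without hypotheses, while the third is within {O(|F|^{1/2})} of {|F|}, as the Lang-Weil bound predicts.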

The Lang-Weil bound is already non-trivial in the model case {d=2, \hbox{dim}(V)=1} of plane curves:

Theorem 3 (Hasse-Weil bound) Let {P: \overline{F}^2 \rightarrow \overline{F}} be an irreducible polynomial of degree {D} with coefficients in {F}. Then

\displaystyle  |\{ (x,y) \in F^2: P(x,y) = 0 \}| = |F| + O_D( |F|^{1/2} ).

Thus, for instance, if {a,b \in F}, then the elliptic curve {\{ (x,y) \in F^2: y^2 = x^3 + ax + b \}} has {|F| + O(|F|^{1/2})} {F}-points, a result first established by Hasse. The Hasse-Weil bound is already quite non-trivial, being the analogue of the Riemann hypothesis for plane curves. For hyper-elliptic curves, an elementary proof (due to Stepanov) is discussed in this previous post. For general plane curves, the first proof was by Weil (leading to his famous Weil conjectures); there is also a nice version of Stepanov’s argument due to Bombieri covering this case which is a little less elementary (relying crucially on the Riemann-Roch theorem for the upper bound, and a lifting trick to then get the lower bound), which I briefly summarise later in this post. The full Lang-Weil bound is deduced from the Hasse-Weil bound by an induction argument using generic hyperplane slicing, as I will also summarise later in this post.

The hypotheses of definability over {F} and geometric irreducibility in the Lang-Weil bound can be removed after inserting a geometric factor:

Corollary 4 (Lang-Weil bound, alternate form) Let {V} be a variety of complexity at most {M}. Then one has

\displaystyle  |V(F)| = (c(V) + O_M(|F|^{-1/2})) |F|^{\hbox{dim}(V)}

where {c(V)} is the number of top-dimensional components of {V} (i.e. geometrically irreducible components of {V} of dimension {\hbox{dim}(V)}) that are definable over {F}, or equivalently are invariant with respect to the Frobenius endomorphism {x \mapsto x^{|F|}} that defines {F}.

Proof: By breaking up a general variety {V} into components (and using Lemma 1 to dispose of any lower-dimensional components), it suffices to establish this claim when {V} is itself geometrically irreducible. If {V} is definable over {F}, the claim follows from Theorem 2. If {V} is not definable over {F}, then it is not fixed by the Frobenius endomorphism {Frob} (since otherwise one could produce a set of defining polynomials that were fixed by Frobenius and thus defined over {F} by using some canonical basis (such as a reduced Grobner basis) for the associated ideal), and so {V \cap Frob(V)} has strictly smaller dimension than {V}. But {V \cap Frob(V)} captures all the {F}-points of {V}, so in this case the claim follows from Lemma 1. \Box

Note that if {V} is reducible but is itself defined over {F}, then the Frobenius endomorphism preserves {V} itself, but may permute the components of {V} around. In this case, {c(V)} is the number of fixed points of this permutation action of Frobenius on the components. In particular, {c(V)} is always a natural number between {0} and {O_M(1)}; thus we see that regardless of the geometry of {V}, the normalised count {|V(F)|/|F|^{\hbox{dim}(V)}} is asymptotically restricted to a bounded range of natural numbers (in the regime where the complexity stays bounded and {|F|} goes to infinity).

Example 1 Consider the variety

\displaystyle  V := \{ (x,y) \in \overline{F}^2: x^2 - ay^2 = 0 \}

for some non-zero parameter {a \in F}. Geometrically (by which we basically mean “when viewed over the algebraically closed field {\overline{F}}”), this is the union of two lines, with slopes corresponding to the two square roots of {a}. If {a} is a quadratic residue, then both of these lines are defined over {F}, and are fixed by Frobenius, and {c(V) = 2} in this case. If {a} is not a quadratic residue, then the lines are not defined over {F}, and the Frobenius automorphism permutes the two lines while preserving {V} as a whole, giving {c(V)=0} in this case.
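Example 1 can be confirmed numerically: the normalised count {|V(F)|/|F|} should be close to {c(V)}, i.e. close to {2} or to {0}. The sketch below does the count over {{\bf F}_7}, where {2} is a quadratic residue and {3} is not; the function name is hypothetical.

```python
def count_line_pair_points(a, p):
    """Count (x, y) in F_p^2 with x^2 - a*y^2 = 0."""
    return sum((x * x - a * y * y) % p == 0
               for x in range(p) for y in range(p))


p = 7
for a in (2, 3):  # 2 is a quadratic residue mod 7, 3 is not
    n = count_line_pair_points(a, p)
    print(a, n, n / p)
```

For {a=2} one gets {2p-1 = 13} points (two lines meeting at the origin), and for {a=3} only the origin survives.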

Corollary 4 effectively computes (at least to leading order) the number-theoretic size {|V(F)|} of a variety in terms of geometric information about {V}, namely its dimension {\hbox{dim}(V)} and the number {c(V)} of top-dimensional components fixed by Frobenius. It turns out that with a little bit more effort, one can extend this connection to cover not just a single variety {V}, but a family of varieties indexed by points in some base space {W}. More precisely, suppose we now have two affine varieties {V,W} of bounded complexity, together with a regular map {\phi: V \rightarrow W} of bounded complexity (the definition of complexity of a regular map is a bit technical, see e.g. this paper, but one can think for instance of a polynomial or rational map of bounded degree as a good example). It will be convenient to assume that the base space {W} is irreducible. If the map {\phi} is a dominant map (i.e. the image {\phi(V)} is Zariski dense in {W}), then standard algebraic geometry results tell us that the fibres {\phi^{-1}(\{w\})} are an unramified family of {\hbox{dim}(V)-\hbox{dim}(W)}-dimensional varieties outside of an exceptional subset {W'} of {W} of dimension strictly smaller than {\hbox{dim}(W)} (and with {\phi^{-1}(W')} having dimension strictly smaller than {\hbox{dim}(V)}); see e.g. Section I.6.3 of Shafarevich.

Now suppose that {V}, {W}, and {\phi} are defined over {F}. Then, by Lang-Weil, {W} has {(1 + O(|F|^{-1/2})) |F|^{\hbox{dim}(W)}} {F}-points, and by Schwarz-Zippel, for all but {O( |F|^{\hbox{dim}(W)-1})} of these {F}-points {w} (the ones that lie in the subvariety {W'}), the fibre {\phi^{-1}(\{w\})} is an algebraic variety defined over {F} of dimension {\hbox{dim}(V)-\hbox{dim}(W)}. By using ultraproduct arguments (see e.g. Lemma 3.7 of this paper of mine with Emmanuel Breuillard and Ben Green), this variety can be shown to have bounded complexity, and thus by Corollary 4, has {(c(\phi^{-1}(\{w\})) + O(|F|^{-1/2})) |F|^{\hbox{dim}(V)-\hbox{dim}(W)}} {F}-points. One can then ask how the quantity {c(\phi^{-1}(\{w\}))} is distributed. A simple but illustrative example occurs when {V=W=F} and {\phi: F \rightarrow F} is the polynomial {\phi(x) := x^2}. Then {c(\phi^{-1}(\{w\}))} equals {2} when {w} is a non-zero quadratic residue and {0} when {w} is a non-zero quadratic non-residue (and {1} when {w} is zero, but this is a negligible fraction of all {w}). In particular, in the asymptotic limit {|F| \rightarrow \infty}, {c(\phi^{-1}(\{w\}))} is equal to {2} half of the time and {0} half of the time.

Now we describe the asymptotic distribution of the {c(\phi^{-1}(\{w\}))}. We need some additional notation. Let {w_0} be an {F}-point in {W \backslash W'}, and let {\pi_0( \phi^{-1}(\{w_0\}) )} be the connected components of the fibre {\phi^{-1}(\{w_0\})}. As {\phi^{-1}(\{w_0\})} is defined over {F}, this set of components is permuted by the Frobenius endomorphism {Frob}. But there is also an action by monodromy of the fundamental group {\pi_1(W \backslash W')} (this requires a certain amount of étale machinery to properly set up, as we are working over a positive characteristic field rather than over the complex numbers, but I am going to ignore this rather important detail here, as I still don’t fully understand it). This fundamental group may be infinite, but (by the étale construction) is always profinite, and in particular has a Haar probability measure, in which every finite index subgroup (and their cosets) are measurable. Thus we may meaningfully talk about elements drawn uniformly at random from this group, so long as we work only with the profinite {\sigma}-algebra on {\pi_1(W \backslash W')} that is generated by the cosets of the finite index subgroups of this group (which will be the only relevant sets we need to measure when considering the action of this group on finite sets, such as the components of a generic fibre).

Theorem 5 (Lang-Weil with parameters) Let {V, W} be varieties of complexity at most {M} with {W} irreducible, and let {\phi: V \rightarrow W} be a dominant map of complexity at most {M}. Let {w_0} be an {F}-point of {W \backslash W'}. Then, for any natural number {a}, one has {c(\phi^{-1}(\{w\})) = a} for {(\mathop{\bf P}( X = a ) + O_M(|F|^{-1/2})) |F|^{\hbox{dim}(W)}} values of {w \in W(F)}, where {X} is the random variable that counts the number of components of a generic fibre {\phi^{-1}(w_0)} that are invariant under {g \circ Frob}, where {g} is an element chosen uniformly at random from the étale fundamental group {\pi_1(W \backslash W')}. In particular, in the asymptotic limit {|F| \rightarrow \infty}, and with {w} chosen uniformly at random from {W(F)}, {c(\phi^{-1}(\{w\}))} (or, equivalently, {|\phi^{-1}(\{w\})(F)| / |F|^{\hbox{dim}(V)-\hbox{dim}(W)}}) and {X} have the same asymptotic distribution.

This theorem generalises Corollary 4 (which is the case when {W} is just a point, so that {\phi^{-1}(\{w\})} is just {V} and {g} is trivial). Informally, the effect of a non-trivial parameter space {W} on the Lang-Weil bound is to push around the Frobenius map by monodromy for the purposes of counting invariant components, and a randomly chosen set of parameters corresponds to a randomly chosen loop on which to perform monodromy.

Example 2 Let {V=W=F} and {\phi(x) = x^m} for some fixed {m \geq 1}; to avoid some technical issues let us suppose that {m} is coprime to {|F|}. Then {W'} can be taken to be {\{0\}}, and for a base point {w_0 \in W \backslash W'} we can take {w_0=1}. The fibre {\phi^{-1}(\{1\})} – the {m^{th}} roots of unity – can be identified with the cyclic group {{\bf Z}/m{\bf Z}} by using a primitive root of unity. The étale fundamental group {\pi_1(W \backslash W') = \pi_1(\overline{F} \backslash \{0\})} is (I think) isomorphic to the profinite closure {\hat {\bf Z}} of the integers {{\bf Z}} (excluding the part of that closure coming from the characteristic of {F}). Not coincidentally, the integers {{\bf Z}} are the fundamental group of the complex analogue {{\bf C} \backslash \{0\}} of {W \backslash W'}. (Brian Conrad points out to me though that for more complicated varieties, such as covers of {\overline{F} \backslash \{0\}} by a power of the characteristic, the étale fundamental group is more complicated than just a profinite closure of the ordinary fundamental group, due to the presence of Artin-Schreier covers that are only ramified at infinity.) The action of this fundamental group on the fibres {{\bf Z}/m{\bf Z}} can be given by translation. Meanwhile, the Frobenius map {Frob} on {{\bf Z}/m{\bf Z}} is given by multiplication by {|F|}. A random element {g \circ Frob} then becomes a random affine map {x \mapsto |F|x+b} on {{\bf Z}/m{\bf Z}}, where {b} is chosen uniformly at random from {{\bf Z}/m{\bf Z}}. The number of fixed points of this map is equal to the greatest common divisor {(|F|-1,m)} of {|F|-1} and {m} when {b} is divisible by {(|F|-1,m)}, and equal to {0} otherwise. This matches up with the elementary number-theoretic fact that a randomly chosen non-zero element of {F} will be an {m^{th}} power with probability {1/(|F|-1,m)}, and when this occurs, the number of {m^{th}} roots in {F} will be {(|F|-1,m)}.
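Both sides of the comparison in Example 2 can be tabulated directly for a prime field. The sketch below (with hypothetical function names, and {F = {\bf F}_{13}}, {m=4} as sample parameters) tallies the number of {m^{th}} roots of each non-zero {w \in F}, and separately the number of fixed points of {x \mapsto |F|x + b} on {{\bf Z}/m{\bf Z}} for each {b}.

```python
from collections import Counter
from math import gcd


def fibre_sizes(m, p):
    """Tally |{x in F_p^*: x^m = w}| over all non-zero w in F_p."""
    roots = Counter(pow(x, m, p) for x in range(1, p))
    return Counter(roots.get(w, 0) for w in range(1, p))


def affine_fixed_points(q, b, m):
    """Number of x in Z/mZ fixed by the affine map x -> q*x + b."""
    return sum((q * x + b - x) % m == 0 for x in range(m))


p, m = 13, 4
print(gcd(p - 1, m))
print(fibre_sizes(m, p))
print(Counter(affine_fixed_points(p, b, m) for b in range(m)))
```

In both tallies the value {(|F|-1,m) = 4} occurs a quarter of the time and the value {0} otherwise, in line with the matching described in the example.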

Example 3 (Thanks to Jordan Ellenberg for this example.) Consider a random elliptic curve {E = \{ y^2 = x^3 + ax + b \}}, where {a,b} are chosen uniformly at random, and let {m \geq 1}. Let {E[m]} be the {m}-torsion points of {E} (i.e. those elements {g \in E} with {mg = 0} using the elliptic curve addition law); as a group, this is isomorphic to {{\bf Z}/m{\bf Z} \times {\bf Z}/m{\bf Z}} (assuming that {F} has sufficiently large characteristic, for simplicity), and consider the number of {F} points of {E[m]}, which is a random variable taking values in the natural numbers between {0} and {m^2}. In this case, the base variety {W} is the modular curve {X(1)}, and the covering variety {V} is the modular curve {X_1(m)}. The generic fibre here can be identified with {{\bf Z}/m{\bf Z} \times {\bf Z}/m{\bf Z}}, the monodromy action projects down to the action of {SL_2({\bf Z}/m{\bf Z})}, and the action of Frobenius on this fibre can be shown to be given by a {2 \times 2} matrix with determinant {|F|} (with the exact choice of matrix depending on the choice of fibre and of the identification), so the distribution of the number of {F}-points of {E[m]} is asymptotic to the distribution of the number of fixed points {X} of a random linear map of determinant {|F|} on {{\bf Z}/m{\bf Z} \times {\bf Z}/m{\bf Z}}.

Theorem 5 seems to be well known “folklore” among arithmetic geometers, though I do not know of an explicit reference for it. I enjoyed deriving it for myself (though my derivation is somewhat incomplete due to my lack of understanding of étale cohomology) from the ordinary Lang-Weil theorem and the moment method. I’m recording this derivation later in this post, mostly for my own benefit (as I am still in the process of learning this material), though perhaps some other readers may also be interested in it.

Caveat: not all details are fully fleshed out in this writeup, particularly those involving the finer points of algebraic geometry and étale cohomology, as my understanding of these topics is not as complete as I would like it to be.

Many thanks to Brian Conrad and Jordan Ellenberg for helpful discussions on these topics.


