
Louis Esser, Burt Totaro, Chengxi Wang, and I have just uploaded to the arXiv our preprint “Varieties of general type with many vanishing plurigenera, and optimal sine and sawtooth inequalities”. This is an interdisciplinary paper that arose because, in order to optimize a certain algebraic geometry construction, it became necessary to solve a purely analytic question which, while simple, did not seem to have been previously studied in the literature. We were able to solve the analytic question exactly and thus fully optimize the algebraic geometry construction, though the analytic question may have some independent interest.

Let us first discuss the algebraic geometry application. Given a smooth complex ${n}$-dimensional projective variety ${X}$ there is a standard line bundle ${K_X}$ attached to it, known as the canonical line bundle; ${n}$-forms on the variety become sections of this bundle. The bundle may not actually admit global sections; that is to say, the dimension ${h^0(X, K_X)}$ of global sections may vanish. But as one raises the canonical line bundle ${K_X}$ to higher and higher powers to form further line bundles ${mK_X}$, the number of global sections tends to increase; in particular, the dimension ${h^0(X, mK_X)}$ of global sections (known as the ${m^{th}}$ plurigenus) always obeys an asymptotic of the form

$\displaystyle h^0(X, mK_X) = \mathrm{vol}(X) \frac{m^n}{n!} + O( m^{n-1} )$

as ${m \rightarrow \infty}$ for some non-negative number ${\mathrm{vol}(X)}$, called the volume of the variety ${X}$; it is an invariant that reveals some information about the birational geometry of ${X}$. For instance, if the canonical line bundle is ample (or more generally, nef), this volume is equal to the intersection number ${K_X^n}$ (roughly speaking, the number of common zeroes of ${n}$ generic sections of the canonical line bundle); this is a special case of the asymptotic Riemann-Roch theorem. In particular, the volume ${\mathrm{vol}(X)}$ is a natural number in this case. However, it is possible for the volume to also be fractional in nature. One can then ask: how small can the volume ${\mathrm{vol}(X)}$ get without vanishing entirely? (By definition, varieties with non-vanishing volume are known as varieties of general type.)
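As a sanity check on this asymptotic (a standard textbook example, not taken from the paper), consider the case ${n=1}$ of a smooth projective curve ${X}$ of genus ${g \geq 2}$. The Riemann-Roch theorem then gives, for every ${m \geq 2}$,

$\displaystyle h^0(X, mK_X) = (2m-1)(g-1) = (2g-2) \frac{m^1}{1!} + O(1),$

so that ${\mathrm{vol}(X) = 2g-2 = \deg K_X}$, in agreement with the intersection number description above; in this one-dimensional setting the smallest possible non-zero volume is ${2}$, attained when ${g=2}$.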

It follows from a deep result obtained independently by Hacon–McKernan, Takayama and Tsuji that there is a uniform lower bound for the volume ${\mathrm{vol}(X)}$ of all ${n}$-dimensional projective varieties of general type. However, the precise lower bound is not known, and the current paper is a contribution towards probing this bound by constructing varieties of particularly small volume in the high-dimensional limit ${n \rightarrow \infty}$. Prior to this paper, the best such constructions of ${n}$-dimensional varieties basically had exponentially small volume, with a construction of volume at most ${e^{-(1+o(1))n \log n}}$ given by Ballico–Pignatelli–Tasin, and an improved construction with a volume bound of ${e^{-\frac{1}{3} n \log^2 n}}$ given by Totaro and Wang. In this paper, we obtain a variant construction with the somewhat smaller volume bound of ${e^{-(1-o(1)) n^{3/2} \log^{1/2} n}}$; the method also gives comparable bounds for some other related algebraic geometry statistics, such as the largest ${m}$ for which the pluricanonical map associated to the linear system ${|mK_X|}$ is not a birational embedding into projective space.

The space ${X}$ is constructed by taking a general hypersurface of a certain degree ${d}$ in a weighted projective space ${P(a_0,\dots,a_{n+1})}$ and resolving the singularities. These varieties are relatively tractable to work with, as one can use standard algebraic geometry tools (such as the Reid–Tai inequality) to provide sufficient conditions to guarantee that the hypersurface has only canonical singularities and that the canonical bundle is a reflexive sheaf, which allows one to calculate the volume exactly in terms of the degree ${d}$ and weights ${a_0,\dots,a_{n+1}}$. The problem then reduces to optimizing the resulting volume given the constraints needed for the above-mentioned sufficient conditions to hold. After working with a particular choice of weights (which consist of products of mostly consecutive primes, with each product occurring with suitable multiplicities ${c_0,\dots,c_{b-1}}$), the problem eventually boils down to trying to minimize the total multiplicity ${\sum_{j=0}^{b-1} c_j}$, subject to certain congruence conditions and other bounds on the ${c_j}$. Using crude bounds on the ${c_j}$ eventually leads to a construction with volume at most ${e^{-0.8 n^{3/2} \log^{1/2} n}}$, but by taking advantage of the ability to “dilate” the congruence conditions and optimizing over all dilations, we are able to improve the ${0.8}$ constant to ${1-o(1)}$.

Now it is time to turn to the analytic side of the paper by describing the optimization problem that we solve. We consider the sawtooth function ${g: {\bf R} \rightarrow (-1/2,1/2]}$, with ${g(x)}$ defined as the unique real number in ${(-1/2,1/2]}$ that is equal to ${x}$ mod ${1}$. We consider a (Borel) probability measure ${\mu}$ on the real line, and then compute the average value of this sawtooth function

$\displaystyle \mathop{\bf E}_\mu g(x) := \int_{\bf R} g(x)\ d\mu(x)$

as well as various dilates

$\displaystyle \mathop{\bf E}_\mu g(kx) := \int_{\bf R} g(kx)\ d\mu(x)$

of this expectation. Since ${g}$ is bounded above by ${1/2}$, we certainly have the trivial bound

$\displaystyle \min_{1 \leq k \leq m} \mathop{\bf E}_\mu g(kx) \leq \frac{1}{2}.$

However, this bound is not very sharp. For instance, the only way in which ${\mathop{\bf E}_\mu g(x)}$ could attain the value of ${1/2}$ is if the probability measure ${\mu}$ was supported on half-integers, but in that case ${\mathop{\bf E}_\mu g(2x)}$ would vanish. For the algebraic geometry application discussed above one is then led to the following question: for a given choice of ${m}$, what is the best upper bound ${c^{\mathrm{saw}}_m}$ on the quantity ${\min_{1 \leq k \leq m} \mathop{\bf E}_\mu g(kx)}$ that holds for all probability measures ${\mu}$?

If one considers the deterministic case in which ${\mu}$ is a Dirac mass supported at some real number ${x_0}$, then the Dirichlet approximation theorem tells us that there is ${1 \leq k \leq m}$ such that ${kx_0}$ is within ${\frac{1}{m+1}}$ of an integer, so we have

$\displaystyle \min_{1 \leq k \leq m} \mathop{\bf E}_\mu g(kx) \leq \frac{1}{m+1}$

in this case, and this bound is sharp for deterministic measures ${\mu}$. Thus we have

$\displaystyle \frac{1}{m+1} \leq c^{\mathrm{saw}}_m \leq \frac{1}{2}.$
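The Dirichlet-type upper bound for Dirac masses can be checked numerically. The following is a minimal sketch in exact rational arithmetic; the helper names `sawtooth` and `min_dilate` are hypothetical and only serve this illustration, and the check covers only the upper bound ${\frac{1}{m+1}}$ on a finite sample of points ${x_0}$.

```python
from fractions import Fraction
from math import floor


def sawtooth(x: Fraction) -> Fraction:
    """Return the unique representative of x mod 1 lying in (-1/2, 1/2]."""
    r = x - floor(x)                       # fractional part, in [0, 1)
    return r - 1 if r > Fraction(1, 2) else r


def min_dilate(x0: Fraction, m: int) -> Fraction:
    """min over 1 <= k <= m of g(k * x0): the Dirac mass case of the problem."""
    return min(sawtooth(k * x0) for k in range(1, m + 1))


if __name__ == "__main__":
    m = 5
    # Sample x0 on a grid and report the largest value of the minimum seen.
    worst = max(min_dilate(Fraction(i, 997), m) for i in range(997))
    print(worst, "<=", Fraction(1, m + 1))
```

Working with `Fraction` rather than floating point avoids rounding issues at the jump of the sawtooth at half-integers.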

However, both of these bounds turn out to be far from the truth, and the optimal value of ${c^{\mathrm{saw}}_m}$ is comparable to ${\frac{\log 2}{\log m}}$. In fact we were able to compute this quantity precisely:

Theorem 1 (Optimal bound for sawtooth inequality) Let ${m \geq 1}$.
• (i) If ${m = 2^r}$ for some natural number ${r}$, then ${c^{\mathrm{saw}}_m = \frac{1}{r+2}}$.
• (ii) If ${2^r < m \leq 2^{r+1}}$ for some natural number ${r}$, then ${c^{\mathrm{saw}}_m = \frac{2^r}{2^r(r+1) + m}}$.
In particular, we have ${c^{\mathrm{saw}}_m = \frac{\log 2 + o(1)}{\log m}}$ as ${m \rightarrow \infty}$.
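To get a feel for the formula in Theorem 1, here is a small sketch that evaluates it exactly; the function name `c_saw` is a hypothetical choice for this illustration. It uses the observation that case (ii) with ${m = 2^{r+1}}$ reproduces case (i), so only ${m=1}$ needs separate treatment.

```python
from fractions import Fraction
from math import log


def c_saw(m: int) -> Fraction:
    """Value of c^saw_m as stated in Theorem 1."""
    if m < 1:
        raise ValueError("m must be a positive integer")
    if m == 1:
        return Fraction(1, 2)              # case (i) with r = 0
    r = (m - 1).bit_length() - 1           # the r with 2^r < m <= 2^(r+1)
    # Case (ii); for m = 2^(r+1) this simplifies to 1/(r+3), matching case (i).
    return Fraction(2**r, 2**r * (r + 1) + m)


if __name__ == "__main__":
    for m in (2, 3, 4, 8, 100, 10**6):
        value = c_saw(m)
        # The last column should slowly approach 1, reflecting (log 2 + o(1)) / log m.
        print(m, value, float(value) * log(m) / log(2))
```

The convergence of the last column to ${1}$ is quite slow, as one would expect from an error term of relative size ${O(1/\log m)}$.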

We establish this bound through duality. Indeed, suppose we could find non-negative coefficients ${a_1,\dots,a_m}$ such that one had the pointwise bound

$\displaystyle \sum_{k=1}^m a_k g(kx) \leq 1 \ \ \ \ \ (1)$

for all real numbers ${x}$. Integrating this against an arbitrary probability measure ${\mu}$, we would conclude

$\displaystyle (\sum_{k=1}^m a_k) \min_{1 \leq k \leq m} \mathop{\bf E}_\mu g(kx) \leq \sum_{k=1}^m a_k \mathop{\bf E}_\mu g(kx) \leq 1$

and hence

$\displaystyle c^{\mathrm{saw}}_m \leq \frac{1}{\sum_{k=1}^m a_k}.$

Conversely, one can find lower bounds on ${c^{\mathrm{saw}}_m}$ by selecting suitable candidate measures ${\mu}$ and computing the means ${\mathop{\bf E}_\mu g(kx)}$. The theory of linear programming duality tells us that this method must give us the optimal bound, but one has to locate the optimal measure ${\mu}$ and optimal weights ${a_1,\dots,a_m}$. This we were able to do by first doing some extensive numerics to discover these weights and measures for small values of ${m}$, then doing some educated guesswork to extrapolate these examples to the general case, and finally verifying the required inequalities. In case (i) the situation is particularly simple, as one can take ${\mu}$ to be the discrete measure that assigns a probability ${\frac{1}{r+2}}$ to each of the numbers ${\frac{1}{2}, \frac{1}{4}, \dots, \frac{1}{2^r}}$ and the remaining probability of ${\frac{2}{r+2}}$ to ${\frac{1}{2^{r+1}}}$, while the optimal weighted inequality (1) turns out to be

$\displaystyle 2g(x) + \sum_{j=1}^r g(2^j x) \leq 1$

which is easily proven by telescoping series. However the general case turned out to be significantly trickier to work out, and the verification of the optimal inequality required a delicate case analysis (reflecting the fact that equality was attained in this inequality in a large number of places).
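Both halves of the duality argument in case (i) can be checked by machine for small ${r}$. The sketch below (with hypothetical helper names `extremal_measure`, `mean_dilate` and `dual_lhs`) builds the discrete measure described above for ${m = 2^r}$, computes the means ${\mathop{\bf E}_\mu g(kx)}$ exactly, and evaluates the left-hand side of the weighted inequality on a rational grid. The grid check is of course only evidence for the pointwise inequality, not a proof.

```python
from fractions import Fraction
from math import floor


def sawtooth(x: Fraction) -> Fraction:
    """Representative of x mod 1 in (-1/2, 1/2]."""
    r = x - floor(x)
    return r - 1 if r > Fraction(1, 2) else r


def extremal_measure(r: int) -> dict:
    """Atoms -> weights of the measure described in the text for m = 2^r."""
    mu = {Fraction(1, 2**j): Fraction(1, r + 2) for j in range(1, r + 1)}
    mu[Fraction(1, 2**(r + 1))] = Fraction(2, r + 2)
    return mu


def mean_dilate(mu: dict, k: int) -> Fraction:
    """E_mu g(kx) for a finitely supported measure mu."""
    return sum(w * sawtooth(k * x) for x, w in mu.items())


def dual_lhs(x: Fraction, r: int) -> Fraction:
    """Left-hand side 2 g(x) + sum_{j=1}^r g(2^j x) of the weighted inequality."""
    return 2 * sawtooth(x) + sum(sawtooth(2**j * x) for j in range(1, r + 1))


if __name__ == "__main__":
    r = 3
    mu = extremal_measure(r)
    print([mean_dilate(mu, k) for k in range(1, 2**r + 1)])   # lower-bound side
    grid = 3 * 2**(r + 3)
    print(max(dual_lhs(Fraction(i, grid), r) for i in range(grid)))  # upper-bound side
```

The sum of the weights in the inequality is ${r+2}$, so the two computations together are consistent with ${c^{\mathrm{saw}}_{2^r} = \frac{1}{r+2}}$.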

After solving the sawtooth problem, we became interested in the analogous question for the sine function, that is to say what is the best bound ${c^{\sin}_m}$ for the inequality

$\displaystyle \min_{1 \leq k \leq m} \mathop{\bf E}_\mu \sin(kx) \leq c^{\sin}_m.$

The left-hand side is the smallest imaginary part of the first ${m}$ Fourier coefficients of ${\mu}$. To our knowledge this quantity has not previously been studied in the Fourier analysis literature. By adopting a similar approach as for the sawtooth problem, we were able to compute this quantity exactly also:

Theorem 2 For any ${m \geq 1}$, one has

$\displaystyle c^{\sin}_m = \frac{m+1}{2 \sum_{1 \leq j \leq m: j \hbox{ odd}} \cot \frac{\pi j}{2m+2}}.$

In particular,

$\displaystyle c^{\sin}_m = \frac{\frac{\pi}{2} + o(1)}{\log m}.$

Interestingly, a closely related cotangent sum recently appeared in this MathOverflow post. Verifying the lower bound on ${c^{\sin}_m}$ boils down to choosing the right test measure ${\mu}$; it turns out that one should pick the probability measure supported on the points ${\frac{\pi j}{m+1}}$ with ${1 \leq j \leq m}$ odd, with probability proportional to ${\cot \frac{\pi j}{2m+2}}$, and the lower bound verification eventually follows from a classical identity

$\displaystyle \frac{m+1}{2} = \sum_{1 \leq j \leq m; j \hbox{ odd}} \cot \frac{\pi j}{2m+2} \sin \frac{\pi jk}{m+1}$

for ${1 \leq k \leq m}$, first posed by Eisenstein in 1844 and proved by Stern in 1861. The upper bound arises from establishing the trigonometric inequality

$\displaystyle \frac{2}{(m+1)^2} \sum_{1 \leq k \leq m; k \hbox{ odd}} \cot \frac{\pi k}{2m+2} ( (m+1-k) \sin kx + k \sin(m+1-k)x ) \leq 1$

for all real numbers ${x}$, which to our knowledge is new; the left-hand side has a Fourier-analytic interpretation as convolving the Fejér kernel with a certain discretized square wave function, and this interpretation is used heavily in our proof of the inequality.
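Both sides of Theorem 2 are easy to test in floating point. In the sketch below (all function names are hypothetical), the test measure has atoms at the points ${\frac{\pi j}{m+1}}$ at which the classical identity evaluates the sine, with weights proportional to ${\cot \frac{\pi j}{2m+2}}$; one then expects ${\mathop{\bf E}_\mu \sin(kx)}$ to equal ${c^{\sin}_m}$ for every ${1 \leq k \leq m}$. The trigonometric inequality is only sampled on a finite grid, so this is numerical evidence rather than a verification.

```python
from math import pi, sin, tan


def cot(t: float) -> float:
    return 1.0 / tan(t)


def c_sin(m: int) -> float:
    """The closed form for c^sin_m stated in Theorem 2."""
    total = sum(cot(pi * j / (2 * m + 2)) for j in range(1, m + 1, 2))
    return (m + 1) / (2 * total)


def candidate_measure(m: int) -> list:
    """(atom, probability) pairs: atoms pi*j/(m+1), weights prop. to cot(pi*j/(2m+2))."""
    js = range(1, m + 1, 2)
    weights = [cot(pi * j / (2 * m + 2)) for j in js]
    total = sum(weights)
    return [(pi * j / (m + 1), w / total) for j, w in zip(js, weights)]


def mean_sin(mu: list, k: int) -> float:
    """E_mu sin(kx) for a finitely supported measure."""
    return sum(w * sin(k * x) for x, w in mu)


def trig_lhs(x: float, m: int) -> float:
    """Left-hand side of the trigonometric inequality used for the upper bound."""
    s = sum(
        cot(pi * k / (2 * m + 2)) * ((m + 1 - k) * sin(k * x) + k * sin((m + 1 - k) * x))
        for k in range(1, m + 1, 2)
    )
    return 2.0 * s / (m + 1) ** 2


if __name__ == "__main__":
    m = 6
    mu = candidate_measure(m)
    print(c_sin(m), [round(mean_sin(mu, k), 12) for k in range(1, m + 1)])
    print(max(trig_lhs(2 * pi * i / 4000, m) for i in range(4000)))
```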

[UPDATE, Feb 1, 2021: the strategy sketched out below has been successfully implemented to rigorously obtain the desired implication in this recent preprint of Giulio Bresciani.]
I recently came across this question on MathOverflow asking if there are any polynomials ${P}$ of two variables with rational coefficients, such that the map ${P: {\bf Q} \times {\bf Q} \rightarrow {\bf Q}}$ is a bijection. The answer to this question is almost surely “no”, but it is remarkable how much this problem resists any attempt at rigorous proof. (MathOverflow users with enough privileges to see deleted answers will find that there are no fewer than seventeen deleted attempts at a proof in response to this question!)
On the other hand, the one surviving response to the question does point out this paper of Poonen which shows that assuming a powerful conjecture in Diophantine geometry known as the Bombieri-Lang conjecture (discussed in this previous post), it is at least possible to exhibit polynomials ${P: {\bf Q} \times {\bf Q} \rightarrow {\bf Q}}$ which are injective.
I believe that it should be possible to also rule out the existence of bijective polynomials ${P: {\bf Q} \times {\bf Q} \rightarrow {\bf Q}}$ if one assumes the Bombieri-Lang conjecture, and have sketched out a strategy to do so, but filling in the gaps requires a fair bit more algebraic geometry than I am capable of. So as a sort of experiment, I would like to see if a rigorous implication of this form (similar to the rigorous implication of the Erdős-Ulam conjecture from the Bombieri-Lang conjecture in my previous post) can be crowdsourced, in the spirit of the polymath projects (though I feel that this particular problem should be significantly quicker to resolve than a typical such project).
Here is how I imagine a Bombieri-Lang-powered resolution of this question should proceed (modulo a large number of unjustified and somewhat vague steps that I believe to be true but have not established rigorously). Suppose for contradiction that we have a bijective polynomial ${P: {\bf Q} \times {\bf Q} \rightarrow {\bf Q}}$. Then for any polynomial ${Q: {\bf Q} \rightarrow {\bf Q}}$ of one variable, the surface

$\displaystyle S_Q := \{ (x,y,z) \in \mathbb{A}^3: P(x,y) = Q(z) \}$

has infinitely many rational points; indeed, every rational ${z \in {\bf Q}}$ lifts to exactly one rational point in ${S_Q}$. I believe that for “typical” ${Q}$ this surface ${S_Q}$ should be irreducible. One can now split into two cases:

• (a) The rational points in ${S_Q}$ are Zariski dense in ${S_Q}$.
• (b) The rational points in ${S_Q}$ are not Zariski dense in ${S_Q}$.

Consider case (b) first. By definition, this case asserts that the rational points in ${S_Q}$ are contained in a finite number of algebraic curves. By Faltings’ theorem (a special case of the Bombieri-Lang conjecture), any curve of genus two or higher only contains a finite number of rational points. So all but finitely many of the rational points in ${S_Q}$ are contained in a finite union of genus zero and genus one curves. I think all genus zero curves are birational to a line, and all the genus one curves are birational to an elliptic curve (though I don’t have an immediate reference for this). These curves ${C}$ all can have an infinity of rational points, but very few of them should have “enough” rational points ${C \cap {\bf Q}^3}$ that their projection ${\pi(C \cap {\bf Q}^3) := \{ z \in {\bf Q} : (x,y,z) \in C \hbox{ for some } x,y \in {\bf Q} \}}$ to the third coordinate is “large”. In particular, I believe

• (i) If ${C \subset {\mathbb A}^3}$ is birational to an elliptic curve, then the number of elements of ${\pi(C \cap {\bf Q}^3)}$ of height at most ${H}$ should grow at most polylogarithmically in ${H}$ (i.e., be of order ${O( \log^{O(1)} H )}$).
• (ii) If ${C \subset {\mathbb A}^3}$ is birational to a line but not of the form ${\{ (f(z), g(z), z) \}}$ for some rational ${f,g}$, then the number of elements of ${\pi(C \cap {\bf Q}^3)}$ of height at most ${H}$ should grow slower than ${H^2}$ (in fact I think it can only grow like ${O(H)}$).
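To illustrate the scale of the counts in (ii), here is a small experiment under a deliberately simple assumption: the third coordinate is taken to be ${h(t) = t^2}$ (a hypothetical choice of nonlinear parameterisation, not one singled out by the argument). The sketch counts all rationals of height at most ${H}$, which grow like a constant times ${H^2}$, and compares this with the number of rational squares of height at most ${H}$, which grow only linearly in ${H}$. The function names are likewise hypothetical.

```python
from fractions import Fraction
from math import gcd, isqrt


def rationals_up_to_height(H: int) -> set:
    """All rationals p/q in lowest terms with q >= 1 and max(|p|, q) <= H."""
    return {
        Fraction(p, q)
        for q in range(1, H + 1)
        for p in range(-H, H + 1)
        if gcd(p, q) == 1
    }


def is_square(n: int) -> bool:
    return n >= 0 and isqrt(n) ** 2 == n


def squares_up_to_height(H: int) -> set:
    """Rationals z = t^2 (t rational) of height at most H.

    A rational in lowest terms is a square exactly when its numerator and
    denominator are both perfect squares.
    """
    return {
        z for z in rationals_up_to_height(H)
        if is_square(z.numerator) and is_square(z.denominator)
    }


if __name__ == "__main__":
    for H in (10, 40, 160):
        n_all = len(rationals_up_to_height(H))
        n_sq = len(squares_up_to_height(H))
        # n_all / H^2 and n_sq / H should both stay roughly constant.
        print(H, n_all, n_all / H**2, n_sq, n_sq / H)
```

This says nothing about assertion (i), where the relevant counts are governed by the arithmetic of elliptic curves rather than by an explicit parameterisation.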

I do not have proofs of these results (though I think something similar to (i) can be found in Knapp’s book, and (ii) should basically follow by using a rational parameterisation ${\{(f(t),g(t),h(t))\}}$ of ${C}$ with ${h}$ nonlinear). Assuming these assertions, this would mean that there is a curve of the form ${\{ (f(z),g(z),z)\}}$ that captures a “positive fraction” of the rational points of ${S_Q}$, as measured by restricting the height of the third coordinate ${z}$ to lie below a large threshold ${H}$, computing density, and sending ${H}$ to infinity (taking a limit superior). I believe this forces an identity of the form

$\displaystyle P(f(z), g(z)) = Q(z) \ \ \ \ \ (1)$

for all ${z}$. Such identities are certainly possible for some choices of ${Q}$ (e.g. ${Q(z) = P(F(z), G(z))}$ for arbitrary polynomials ${F,G}$ of one variable) but I believe that the only way that such identities hold for a “positive fraction” of ${Q}$ (as measured using height as before) is if there is in fact a rational identity of the form

$\displaystyle P( f_0(z), g_0(z) ) = z$

for some rational functions ${f_0,g_0}$ with rational coefficients (in which case we would have ${f = f_0 \circ Q}$ and ${g = g_0 \circ Q}$). But such an identity would contradict the hypothesis that ${P}$ is bijective, since one can take a rational point ${(x,y)}$ outside of the curve ${\{ (f_0(z), g_0(z)): z \in {\bf Q} \}}$, and set ${z := P(x,y)}$, in which case we have ${P(x,y) = P(f_0(z), g_0(z) )}$ violating the injective nature of ${P}$. Thus, modulo a lot of steps that have not been fully justified, we have ruled out the scenario in which case (b) holds for a “positive fraction” of ${Q}$.
This leaves the scenario in which case (a) holds for a “positive fraction” of ${Q}$. Assuming the Bombieri-Lang conjecture, this implies that for such ${Q}$, any resolution of singularities of ${S_Q}$ fails to be of general type. I would imagine that this places some very strong constraints on ${P,Q}$, since I would expect the equation ${P(x,y) = Q(z)}$ to describe a surface of general type for “generic” choices of ${P,Q}$ (after resolving singularities). However, I do not have a good set of techniques for detecting whether a given surface is of general type or not. Presumably one should proceed by viewing the surface ${\{ (x,y,z): P(x,y) = Q(z) \}}$ as a fibre product of the simpler surface ${\{ (x,y,w): P(x,y) = w \}}$ and the curve ${\{ (z,w): Q(z) = w \}}$ over the line ${\{w \}}$. In any event, I believe the way to handle (a) is to show that the failure of general type of ${S_Q}$ implies some strong algebraic constraint between ${P}$ and ${Q}$ (something in the spirit of (1), perhaps), and then use this constraint to rule out the bijectivity of ${P}$ by some further ad hoc method.

Last week, we had Peter Scholze give an interesting distinguished lecture series here at UCLA on “Prismatic Cohomology”, which is a new type of cohomology theory worked out by Scholze and Bhargav Bhatt. (Video of the talks will be available shortly; for now we have some notes taken by two notetakers in the audience on that web page.) My understanding of this (speaking as someone that is rather far removed from this area) is that it is progress towards the “motivic” dream of being able to define cohomology ${H^i(X/\overline{A}, A)}$ for varieties ${X}$ (or similar objects) defined over arbitrary commutative rings ${\overline{A}}$, and with coefficients in another arbitrary commutative ring ${A}$. Currently, we have various flavours of cohomology that only work for certain types of domain rings ${\overline{A}}$ and coefficient rings ${A}$:

• Singular cohomology, which roughly speaking works when the domain ring ${\overline{A}}$ is a characteristic zero field such as ${{\bf R}}$ or ${{\bf C}}$, but can allow for arbitrary coefficients ${A}$;
• de Rham cohomology, which roughly speaking works as long as the coefficient ring ${A}$ is the same as the domain ring ${\overline{A}}$ (or a homomorphic image thereof), as one can only talk about ${A}$-valued differential forms if the underlying space is also defined over ${A}$;
• ${\ell}$-adic cohomology, which is a remarkably powerful application of étale cohomology, but only works well when the coefficient ring ${A = {\bf Z}_\ell}$ is localised around a prime ${\ell}$ that is different from the characteristic ${p}$ of the domain ring ${\overline{A}}$; and
• Crystalline cohomology, in which the domain ring is a field ${k}$ of some finite characteristic ${p}$, but the coefficient ring ${A}$ can be a slight deformation of ${k}$, such as the ring of Witt vectors of ${k}$.

There are various relationships between the cohomology theories, for instance de Rham cohomology coincides with singular cohomology for smooth varieties in the limiting case ${A=\overline{A} = {\bf R}}$. The following picture Scholze drew in his first lecture captures these sorts of relationships nicely:

The new prismatic cohomology of Bhatt and Scholze unifies many of these cohomologies in the “neighbourhood” of the point ${(p,p)}$ in the above diagram, in which the domain ring ${\overline{A}}$ and the coefficient ring ${A}$ are both thought of as being “close to characteristic ${p}$” in some sense, so that the dilates ${pA, p\overline{A}}$ of these rings are either zero, or “small”. For instance, the ${p}$-adic ring ${{\bf Z}_p}$ is technically of characteristic ${0}$, but ${p {\bf Z}_p}$ is a “small” ideal of ${{\bf Z}_p}$ (it consists of those elements of ${{\bf Z}_p}$ of ${p}$-adic absolute value at most ${1/p}$), so one can think of ${{\bf Z}_p}$ as being “close to characteristic ${p}$” in some sense. Scholze drew a “zoomed in” version of the previous diagram to informally describe the types of rings ${A,\overline{A}}$ for which prismatic cohomology is effective:

To define prismatic cohomology rings ${H^i_\Delta(X/\overline{A}, A)}$ one needs a “prism”: a ring homomorphism from ${A}$ to ${\overline{A}}$ equipped with a “Frobenius-like” endomorphism ${\phi: A \to A}$ on ${A}$ obeying some axioms. By tuning these homomorphisms one can recover existing cohomology theories like crystalline or de Rham cohomology as special cases of prismatic cohomology. These specialisations are analogous to how a prism splits white light into various individual colours, giving rise to the terminology “prismatic”, and depicted by this further diagram of Scholze:

(And yes, Peter confirmed that he and Bhargav were inspired by the Dark Side of the Moon album cover in selecting the terminology.)

There was an abstract definition of prismatic cohomology (as being the essentially unique cohomology arising from prisms that obeyed certain natural axioms), but there was also a more concrete way to view them in terms of coordinates, as a “${q}$-deformation” of de Rham cohomology. Whereas in de Rham cohomology one worked with derivative operators ${d}$ that for instance applied to monomials ${t^n}$ by the usual formula

$\displaystyle d(t^n) = n t^{n-1} dt,$

prismatic cohomology in coordinates can be computed using a “${q}$-derivative” operator ${d_q}$ that for instance applies to monomials ${t^n}$ by the formula

$\displaystyle d_q (t^n) = [n]_q t^{n-1} d_q t$

where

$\displaystyle [n]_q = \frac{q^n-1}{q-1} = 1 + q + \dots + q^{n-1}$

is the “${q}$-analogue” of ${n}$ (a polynomial in ${q}$ that equals ${n}$ in the limit ${q=1}$). (The ${q}$-analogues become more complicated for more general forms than these.) In this more concrete setting, the fact that prismatic cohomology is independent of the choice of coordinates apparently becomes quite a non-trivial theorem.
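The ${q}$-analogue formulas above can be experimented with directly. The sketch below assumes that ${d_q}$ acts on functions of ${t}$ as the standard Jackson difference quotient ${f \mapsto \frac{f(qt)-f(t)}{qt-t}}$; this is the usual coordinate form of a ${q}$-derivative, but it is an assumption here, as the lectures may have used a different normalisation. The names `q_int` and `q_derivative` are hypothetical.

```python
from fractions import Fraction


def q_int(n: int, q):
    """The q-analogue [n]_q = 1 + q + ... + q^(n-1) of the natural number n."""
    return sum(q**i for i in range(n))


def q_derivative(f, q):
    """Jackson difference quotient of f, valid for q != 1 and t != 0."""
    def dq(t):
        return (f(q * t) - f(t)) / (q * t - t)
    return dq


if __name__ == "__main__":
    q, t, n = Fraction(3, 2), Fraction(2, 7), 5
    lhs = q_derivative(lambda s: s**n, q)(t)
    rhs = q_int(n, q) * t**(n - 1)
    print(lhs == rhs)            # d_q(t^n) = [n]_q t^(n-1)
    print(q_int(n, 1))           # [n]_q at q = 1 recovers n
```

Exact rational arithmetic is used so that the identity ${d_q(t^n) = [n]_q t^{n-1}}$ can be tested with equality rather than up to rounding error.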

Let ${k}$ be a field, and let ${E}$ be a finite extension of that field; in this post we will denote such a relationship by ${k \hookrightarrow E}$. We say that ${E}$ is a Galois extension of ${k}$ if the cardinality of the automorphism group ${\mathrm{Aut}(E/k)}$ of ${E}$ fixing ${k}$ is as large as it can be, namely the degree ${[E:k]}$ of the extension. In that case, we call ${\mathrm{Aut}(E/k)}$ the Galois group of ${E}$ over ${k}$ and denote it also by ${\mathrm{Gal}(E/k)}$. The fundamental theorem of Galois theory then gives a one-to-one correspondence (also known as the Galois correspondence) between the intermediate extensions between ${E}$ and ${k}$ and the subgroups of ${\mathrm{Gal}(E/k)}$:

Theorem 1 (Fundamental theorem of Galois theory) Let ${E}$ be a Galois extension of ${k}$.

• (i) If ${k \hookrightarrow F \hookrightarrow E}$ is an intermediate field between ${k}$ and ${E}$, then ${E}$ is a Galois extension of ${F}$, and ${\mathrm{Gal}(E/F)}$ is a subgroup of ${\mathrm{Gal}(E/k)}$.
• (ii) Conversely, if ${H}$ is a subgroup of ${\mathrm{Gal}(E/k)}$, then there is a unique intermediate field ${k \hookrightarrow F \hookrightarrow E}$ such that ${\mathrm{Gal}(E/F)=H}$; namely ${F}$ is the set of elements of ${E}$ that are fixed by ${H}$.
• (iii) If ${k \hookrightarrow F_1 \hookrightarrow E}$ and ${k \hookrightarrow F_2 \hookrightarrow E}$, then ${F_1 \hookrightarrow F_2}$ if and only if ${\mathrm{Gal}(E/F_2)}$ is a subgroup of ${\mathrm{Gal}(E/F_1)}$.
• (iv) If ${k \hookrightarrow F \hookrightarrow E}$ is an intermediate field between ${k}$ and ${E}$, then ${F}$ is a Galois extension of ${k}$ if and only if ${\mathrm{Gal}(E/F)}$ is a normal subgroup of ${\mathrm{Gal}(E/k)}$. In that case, ${\mathrm{Gal}(F/k)}$ is isomorphic to the quotient group ${\mathrm{Gal}(E/k) / \mathrm{Gal}(E/F)}$.

Example 2 Let ${k= {\bf Q}}$, and let ${E = {\bf Q}(e^{2\pi i/n})}$ be the degree ${\phi(n)}$ Galois extension formed by adjoining a primitive ${n^{th}}$ root of unity (that is to say, ${E}$ is the cyclotomic field of order ${n}$). Then ${\mathrm{Gal}(E/k)}$ is isomorphic to the multiplicative group ${({\bf Z}/n{\bf Z})^\times}$ (the invertible elements of the ring ${{\bf Z}/n{\bf Z}}$). Amongst the intermediate fields, one has the cyclotomic fields of the form ${F = {\bf Q}(e^{2\pi i/m})}$ where ${m}$ divides ${n}$; they are also Galois extensions, with ${\mathrm{Gal}(F/k)}$ isomorphic to ${({\bf Z}/m{\bf Z})^\times}$ and ${\mathrm{Gal}(E/F)}$ isomorphic to the elements ${a}$ of ${({\bf Z}/n{\bf Z})^\times}$ such that ${a(n/m) = (n/m)}$ modulo ${n}$. (There can also be other intermediate fields, corresponding to other subgroups of ${({\bf Z}/n{\bf Z})^\times}$.)
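The group-theoretic side of Example 2 is easy to enumerate. The following sketch (with hypothetical function names) lists ${({\bf Z}/n{\bf Z})^\times}$ and the subgroup of elements ${a}$ with ${a(n/m) = (n/m)}$ modulo ${n}$, and checks the index count ${|\mathrm{Gal}(E/k)| = |\mathrm{Gal}(E/F)| \cdot |\mathrm{Gal}(F/k)|}$ that the Galois correspondence predicts. It works purely with residues and does not construct any fields.

```python
from math import gcd


def units(n: int) -> list:
    """Representatives in {0, ..., n-1} of the invertible residues mod n."""
    return [a for a in range(n) if gcd(a, n) == 1]


def fixing_subgroup(n: int, m: int) -> list:
    """Units a mod n with a*(n/m) = n/m mod n, identified in Example 2 with Gal(E/F)."""
    if n % m != 0:
        raise ValueError("m must divide n")
    d = n // m
    return [a for a in units(n) if (a * d - d) % n == 0]


if __name__ == "__main__":
    n, m = 12, 4
    print(units(n))                 # the group (Z/12Z)^x
    print(fixing_subgroup(n, m))    # the units congruent to 1 mod 4
    print(len(units(n)) == len(fixing_subgroup(n, m)) * len(units(m)))
```

Equivalently, the subgroup consists of the units that are congruent to ${1}$ modulo ${m}$, which is the kernel of the reduction map from ${({\bf Z}/n{\bf Z})^\times}$ to ${({\bf Z}/m{\bf Z})^\times}$.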

Example 3 Let ${k = {\bf C}(z)}$ be the field of rational functions of one indeterminate ${z}$ with complex coefficients, and let ${E = {\bf C}(w)}$ be the field formed by adjoining an ${n^{th}}$ root ${w = z^{1/n}}$ to ${k}$, thus ${k = {\bf C}(w^n)}$. Then ${E}$ is a degree ${n}$ Galois extension of ${k}$ with Galois group isomorphic to ${{\bf Z}/n{\bf Z}}$ (with an element ${a \in {\bf Z}/n{\bf Z}}$ corresponding to the field automorphism of ${E}$ that sends ${w}$ to ${e^{2\pi i a/n} w}$). The intermediate fields are of the form ${F = {\bf C}(w^{n/m})}$ where ${m}$ divides ${n}$; they are also Galois extensions, with ${\mathrm{Gal}(F/k)}$ isomorphic to ${{\bf Z}/m{\bf Z}}$ and ${\mathrm{Gal}(E/F)}$ isomorphic to the multiples of ${m}$ in ${{\bf Z}/n{\bf Z}}$.

There is an analogous Galois correspondence in the covering theory of manifolds. For simplicity we restrict attention to finite covers. If ${L}$ is a connected manifold and ${\pi_{L \leftarrow M}: M \rightarrow L}$ is a finite covering map of ${L}$ by another connected manifold ${M}$, we denote this relationship by ${L \leftarrow M}$. (Later on we will change our function notations slightly and write ${\pi_{L \leftarrow M}: L \leftarrow M}$ in place of the more traditional ${\pi_{L \leftarrow M}: M \rightarrow L}$, and similarly for the deck transformations ${g: M \leftarrow M}$ below; more on this below the fold.) If ${L \leftarrow M}$, we can define ${\mathrm{Aut}(M/L)}$ to be the group of deck transformations: continuous maps ${g: M \rightarrow M}$ which preserve the fibres of ${\pi}$. We say that this covering map is a Galois cover if the cardinality of the group ${\mathrm{Aut}(M/L)}$ is as large as it can be. In that case we call ${\mathrm{Aut}(M/L)}$ the Galois group of ${M}$ over ${L}$ and denote it by ${\mathrm{Gal}(M/L)}$.

Suppose ${M}$ is a finite cover of ${L}$. An intermediate cover ${N}$ between ${M}$ and ${L}$ is a cover of ${L}$ by ${N}$, such that ${L \leftarrow N \leftarrow M}$, in such a way that the covering maps are compatible, in the sense that ${\pi_{L \leftarrow M}}$ is the composition of ${\pi_{L \leftarrow N}}$ and ${\pi_{N \leftarrow M}}$. This sort of compatibility condition will be implicitly assumed whenever we chain together multiple instances of the ${\leftarrow}$ notation. Two intermediate covers ${N,N'}$ are equivalent if they cover each other, in a fashion compatible with all the other covering maps, thus ${L \leftarrow N \leftarrow N' \leftarrow M}$ and ${L \leftarrow N' \leftarrow N \leftarrow M}$. We then have the analogous Galois correspondence:

Theorem 4 (Fundamental theorem of covering spaces) Let ${L \leftarrow M}$ be a Galois covering.

• (i) If ${L \leftarrow N \leftarrow M}$ is an intermediate cover between ${L}$ and ${M}$, then ${M}$ is a Galois cover of ${N}$, and ${\mathrm{Gal}(M/N)}$ is a subgroup of ${\mathrm{Gal}(M/L)}$.
• (ii) Conversely, if ${H}$ is a subgroup of ${\mathrm{Gal}(M/L)}$, then there is an intermediate cover ${L \leftarrow N \leftarrow M}$, unique up to equivalence, such that ${\mathrm{Gal}(M/N)=H}$.
• (iii) If ${L \leftarrow N_1 \leftarrow M}$ and ${L \leftarrow N_2 \leftarrow M}$, then ${L \leftarrow N_1 \leftarrow N_2 \leftarrow M}$ if and only if ${\mathrm{Gal}(M/N_2)}$ is a subgroup of ${\mathrm{Gal}(M/N_1)}$.
• (iv) If ${L \leftarrow N \leftarrow M}$, then ${N}$ is a Galois cover of ${L}$ if and only if ${\mathrm{Gal}(M/N)}$ is a normal subgroup of ${\mathrm{Gal}(M/L)}$. In that case, ${\mathrm{Gal}(N/L)}$ is isomorphic to the quotient group ${\mathrm{Gal}(M/L) / \mathrm{Gal}(M/N)}$.

Example 5 Let ${L= {\bf C}^\times := {\bf C} \backslash \{0\}}$, and let ${M = {\bf C}^\times}$ be the ${n}$-fold cover of ${L}$ with covering map ${\pi_{L \leftarrow M}(w) := w^n}$. Then ${M}$ is a Galois cover of ${L}$, and ${\mathrm{Gal}(M/L)}$ is isomorphic to the cyclic group ${{\bf Z}/n{\bf Z}}$. The intermediate covers are (up to equivalence) of the form ${N = {\bf C}^\times}$ with covering map ${\pi_{L \leftarrow N}(u) := u^m}$ where ${m}$ divides ${n}$; they are also Galois covers, with ${\mathrm{Gal}(N/L)}$ isomorphic to ${{\bf Z}/m{\bf Z}}$ and ${\mathrm{Gal}(M/N)}$ isomorphic to the multiples of ${m}$ in ${{\bf Z}/n{\bf Z}}$.
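Example 5 can be explored numerically. In the sketch below (hypothetical names throughout), the deck transformations of ${w \mapsto w^n}$ are the rotations ${w \mapsto e^{2\pi i a/n} w}$, and the intermediate covering map from ${M}$ to ${N}$ is taken to be ${w \mapsto w^{n/m}}$, which is what compatibility with ${\pi_{L \leftarrow N}(u) = u^m}$ forces up to equivalence. Since a rotation multiplies ${w^p}$ by a constant, testing fibre preservation at a single non-zero sample point suffices.

```python
import cmath


def deck(a: int, n: int):
    """The rotation w -> exp(2*pi*i*a/n) * w, a deck transformation of w -> w^n."""
    zeta = cmath.exp(2j * cmath.pi * a / n)
    return lambda w: zeta * w


def preserves_fibres(a: int, n: int, power: int,
                     w: complex = 0.7 + 0.4j, tol: float = 1e-9) -> bool:
    """Check (at one non-zero sample point w) that rotation a preserves fibres of w -> w^power."""
    g = deck(a, n)
    return abs(g(w) ** power - w ** power) < tol


if __name__ == "__main__":
    n, m = 12, 4
    # Every rotation preserves the fibres of the full covering map w -> w^n.
    print(all(preserves_fibres(a, n, n) for a in range(n)))
    # Those preserving w -> w^(n/m) form Gal(M/N): the multiples of m in Z/nZ.
    print([a for a in range(n) if preserves_fibres(a, n, n // m)])
```

The same computation, read algebraically with ${w}$ an indeterminate, describes which automorphisms ${w \mapsto e^{2\pi i a/n} w}$ fix the element ${w^{n/m}}$.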

Given the strong similarity between the two theorems, it is natural to ask if there is some more concrete connection between Galois theory and the theory of finite covers.

In one direction, if the manifolds ${L,M,N}$ have an algebraic structure (or a complex structure), then one can relate covering spaces to field extensions by considering the field of rational functions (or meromorphic functions) on the space. For instance, if ${L = {\bf C}^\times}$ and ${z}$ is the coordinate on ${L}$, one can consider the field ${{\bf C}(z)}$ of rational functions on ${L}$; the ${n}$-fold cover ${M = {\bf C}^\times}$ with coordinate ${w}$ from Example 5 similarly has a field ${{\bf C}(w)}$ of rational functions. The covering ${\pi_{L \leftarrow M}(w) = w^n}$ relates the two coordinates ${z,w}$ by the relation ${z = w^n}$, at which point one sees that the rational functions ${{\bf C}(w)}$ on ${M}$ are a degree ${n}$ extension of the rational functions ${{\bf C}(z)}$ on ${L}$ (formed by adjoining the ${n^{th}}$ root ${w}$ of ${z}$). In this way we see that Example 5 is in fact closely related to Example 3.

Exercise 6 What happens if one uses meromorphic functions in place of rational functions in the above example? (To answer this question, I found it convenient to use a discrete Fourier transform associated to the multiplicative action of the ${n^{th}}$ roots of unity on ${M}$ to decompose the meromorphic functions on ${M}$ as a linear combination of functions invariant under this action, times a power ${w^j}$ of the coordinate ${w}$ for ${j=0,\dots,n-1}$.)

I was curious however about the reverse direction. Starting with some field extensions ${k \hookrightarrow F \hookrightarrow E}$, is it possible to create manifold-like spaces ${M_k \leftarrow M_F \leftarrow M_E}$ associated to these fields in such a fashion that (say) ${M_E}$ behaves like a “covering space” to ${M_k}$ with a group ${\mathrm{Aut}(M_E/M_k)}$ of deck transformations isomorphic to ${\mathrm{Aut}(E/k)}$, so that the Galois correspondences agree? Also, given how the notion of a path (and associated concepts such as loops, monodromy and the fundamental group) plays a prominent role in the theory of covering spaces, can spaces such as ${M_k}$ or ${M_E}$ also come with a notion of a path that is somehow compatible with the Galois correspondence?

The standard answer from modern algebraic geometry (as articulated for instance in this nice MathOverflow answer by Minhyong Kim) is to set ${M_E}$ equal to the spectrum ${\mathrm{Spec}(E)}$ of the field ${E}$. As a set, the spectrum ${\mathrm{Spec}(R)}$ of a commutative ring ${R}$ is defined as the set of prime ideals of ${R}$. Generally speaking, the map ${R \mapsto \mathrm{Spec}(R)}$ that maps a commutative ring to its spectrum tends to act like an inverse of the operation that maps a space ${X}$ to a ring of functions on that space. For instance, if one considers the commutative ring ${{\bf C}[z, z^{-1}]}$ of regular functions on ${M = {\bf C}^\times}$, then each point ${z_0}$ in ${M}$ gives rise to the prime ideal ${\{ f \in {\bf C}[z, z^{-1}]: f(z_0)=0\}}$, and one can check that these are the only such prime ideals (other than the zero ideal ${(0)}$), giving an almost one-to-one correspondence between ${\mathrm{Spec}( {\bf C}[z,z^{-1}] )}$ and ${M}$. (The zero ideal corresponds instead to the generic point of ${M}$.)

Of course, the spectrum of a field such as ${E}$ is just a point, as the zero ideal ${(0)}$ is the only prime ideal. Naively, it would then seem that there is not enough space inside such a point to support a rich enough structure of paths to recover the Galois theory of this field. In modern algebraic geometry, one addresses this issue by considering not just the set-theoretic points of ${\mathrm{Spec}(E)}$, but more general “base points” ${p: \mathrm{Spec}(b) \rightarrow \mathrm{Spec}(E)}$ that map from some other (affine) scheme ${\mathrm{Spec}(b)}$ to ${\mathrm{Spec}(E)}$ (one could also consider non-affine base points of course). One has to rework many of the fundamentals of the subject to accommodate this “relative point of view”, for instance replacing the usual notion of topology with an étale topology, but once one does so one obtains a very satisfactory theory.

As an exercise, I set myself the task of trying to interpret Galois theory as an analogue of covering space theory in a more classical fashion, without explicit reference to more modern concepts such as schemes, spectra, or étale topology. After some experimentation, I found a reasonably satisfactory way to do so as follows. The space ${M_E}$ that one associates with ${E}$ in this classical perspective is not the single point ${\mathrm{Spec}(E)}$, but instead the much larger space consisting of ring homomorphisms ${p: E \rightarrow b}$ from ${E}$ to arbitrary integral domains ${b}$; informally, ${M_E}$ consists of all the “models” or “representations” of ${E}$ (in the spirit of this previous blog post). (There is a technical set-theoretic issue here because the class of integral domains ${b}$ is a proper class, so that ${M_E}$ will also be a proper class; I will completely ignore such technicalities in this post.) We view each such homomorphism ${p: E \rightarrow b}$ as a single point in ${M_E}$. The analogous notion of a path from one point ${p: E \rightarrow b}$ to another ${p': E \rightarrow b'}$ is then a homomorphism ${\gamma: b \rightarrow b'}$ of integral domains, such that ${p'}$ is the composition of ${p}$ with ${\gamma}$. Note that every prime ideal ${I}$ in the spectrum ${\mathrm{Spec}(R)}$ of a commutative ring ${R}$ gives rise to a point ${p_I}$ in the space ${M_R}$ defined here, namely the quotient map ${p_I: R \rightarrow R/I}$ to the ring ${R/I}$, which is an integral domain because ${I}$ is prime. So one can think of ${\mathrm{Spec}(R)}$ as being a distinguished subset of ${M_R}$; alternatively, one can think of ${M_R}$ as a sort of “penumbra” surrounding ${\mathrm{Spec}(R)}$. In particular, when ${R}$ is a field, ${\mathrm{Spec}(R) = \{(0)\}}$ defines a special point ${p_R}$ in ${M_R}$, namely the identity homomorphism ${p_R: R \rightarrow R}$.
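To make the notions of “point” and “path” concrete, here is a toy sketch for the field ${E = {\bf Q}(i)}$. This example is not taken from the discussion above, and all names in it are hypothetical. Two points of ${M_E}$ are the two embeddings of ${E}$ into the integral domain ${{\bf C}}$, and complex conjugation is a path from one to the other:

```python
from fractions import Fraction

# An element a + b*i of E = Q(i) is stored as a pair (a, b) of Fractions.
def add(u, v):
    return (u[0] + v[0], u[1] + v[1])

def mul(u, v):
    (a, b), (c, d) = u, v
    return (a * c - b * d, a * d + b * c)

# Two "points" of M_E: ring homomorphisms from E into the integral domain C.
def p_plus(u):
    return complex(float(u[0]), float(u[1]))    # sends i to +1j

def p_minus(u):
    return complex(float(u[0]), -float(u[1]))   # sends i to -1j

# A "path" from p_plus to p_minus: a homomorphism gamma: C -> C
# with p_minus = gamma o p_plus.  Complex conjugation does the job.
def gamma(c):
    return c.conjugate()

# Sample elements (chosen to be exactly representable as floats).
u = (Fraction(1, 2), Fraction(3))
v = (Fraction(-2), Fraction(1, 4))

# p_plus respects addition and multiplication on these samples.
hom_ok = (abs(p_plus(add(u, v)) - (p_plus(u) + p_plus(v))) < 1e-12
          and abs(p_plus(mul(u, v)) - p_plus(u) * p_plus(v)) < 1e-12)

# gamma really is a path from p_plus to p_minus on these samples.
path_ok = all(gamma(p_plus(x)) == p_minus(x) for x in (u, v, mul(u, v)))
print(hom_ok, path_ok)
```

This only spot-checks the homomorphism property on a few sample elements; it is meant as an illustration of the definitions, not a verification of them.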

Below the fold I would like to record this interpretation of Galois theory, by first revisiting the theory of covering spaces using paths as the basic building block, and then adapting that theory to the theory of field extensions using the spaces indicated above. This is not too far from the usual scheme-theoretic way of phrasing the connection between the two topics (basically I have replaced étale-type points ${p: \mathrm{Spec}(b) \rightarrow \mathrm{Spec}(E)}$ with more classical points ${p: E \rightarrow b}$), but I had not seen it explicitly articulated before, so I am recording it here for my own benefit and for any other readers who may be interested.

Previous set of notes: 246B Notes 4. Next set of notes: Notes 2.
The fundamental objects of study in real differential geometry are the real manifolds: Hausdorff topological spaces ${M = M^n}$ that locally look like open subsets of a Euclidean space ${{\bf R}^n}$, and which can be equipped with an atlas ${(\phi_\alpha: U_\alpha \rightarrow V_\alpha)_{\alpha \in A}}$ of coordinate charts ${\phi_\alpha: U_\alpha \rightarrow V_\alpha}$ from open subsets ${U_\alpha}$ covering ${M}$ to open subsets ${V_\alpha}$ in ${{\bf R}^n}$, which are homeomorphisms; in particular, the transition maps ${\tau_{\alpha,\beta}: \phi_\alpha( U_\alpha \cap U_\beta ) \rightarrow \phi_\beta( U_\alpha \cap U_\beta )}$ defined by ${\tau_{\alpha,\beta} := \phi_\beta \circ \phi_\alpha^{-1}}$ are all continuous. (It is also common to impose the requirement that the manifold ${M}$ be second countable, though this will not be important for the current discussion.) A smooth real manifold is a real manifold in which the transition maps are all smooth.
In a similar fashion, the fundamental objects of study in complex differential geometry are the complex manifolds, in which the model space is ${{\bf C}^n}$ rather than ${{\bf R}^n}$, and the transition maps ${\tau_{\alpha,\beta}}$ are required to be holomorphic (and not merely smooth or continuous). In the real case, the one-dimensional manifolds (curves) are quite simple to understand, particularly if one requires the manifold to be connected; for instance, all compact connected one-dimensional real manifolds are homeomorphic to the unit circle (why?). However, in the complex case, the connected one-dimensional manifolds – the ones that look locally like subsets of ${{\bf C}}$ – are much richer, and are known as Riemann surfaces. For sake of completeness we give the (somewhat lengthy) formal definition:

Definition 1 (Riemann surface) If ${M}$ is a Hausdorff connected topological space, a (one-dimensional complex) atlas is a collection ${(\phi_\alpha: U_\alpha \rightarrow V_\alpha)_{\alpha \in A}}$ of homeomorphisms from open subsets ${(U_\alpha)_{\alpha \in A}}$ of ${M}$ that cover ${M}$ to open subsets ${V_\alpha}$ of the complex numbers ${{\bf C}}$, such that the transition maps ${\tau_{\alpha,\beta}: \phi_\alpha( U_\alpha \cap U_\beta ) \rightarrow \phi_\beta( U_\alpha \cap U_\beta )}$ defined by ${\tau_{\alpha,\beta} := \phi_\beta \circ \phi_\alpha^{-1}}$ are all holomorphic. Here ${A}$ is an arbitrary index set. Two atlases ${(\phi_\alpha: U_\alpha \rightarrow V_\alpha)_{\alpha \in A}}$, ${(\phi'_\beta: U'_\beta \rightarrow V'_\beta)_{\beta \in B}}$ on ${M}$ are said to be equivalent if their union is also an atlas, thus the transition maps ${\phi'_\beta \circ \phi_\alpha^{-1}: \phi_\alpha(U_\alpha \cap U'_\beta) \rightarrow \phi'_\beta(U_\alpha \cap U'_\beta)}$ and their inverses are all holomorphic. A Riemann surface is a Hausdorff connected topological space ${M}$ equipped with an equivalence class of one-dimensional complex atlases.
A map ${f: M \rightarrow M'}$ from one Riemann surface ${M}$ to another ${M'}$ is holomorphic if the maps ${\phi'_\beta \circ f \circ \phi_\alpha^{-1}: \phi_\alpha(U_\alpha \cap f^{-1}(U'_\beta)) \rightarrow {\bf C}}$ are holomorphic for any charts ${\phi_\alpha: U_\alpha \rightarrow V_\alpha}$, ${\phi'_\beta: U'_\beta \rightarrow V'_\beta}$ of an atlas of ${M}$ and ${M'}$ respectively; it is not hard to see that this definition does not depend on the choice of atlas. It is also clear that the composition of two holomorphic maps is holomorphic (and in fact the class of Riemann surfaces with their holomorphic maps forms a category).

Here are some basic examples of Riemann surfaces.

Example 2 (Quotients of ${{\bf C}}$) The complex numbers ${{\bf C}}$ clearly form a Riemann surface (using the identity map ${\phi: {\bf C} \rightarrow {\bf C}}$ as the single chart for an atlas). Of course, maps ${f: {\bf C} \rightarrow {\bf C}}$ that are holomorphic in the usual sense will also be holomorphic in the sense of the above definition, and vice versa, so the notion of holomorphicity for Riemann surfaces is compatible with that of holomorphicity for complex maps. More generally, given any discrete additive subgroup ${\Lambda}$ of ${{\bf C}}$, the quotient ${{\bf C}/\Lambda}$ is a Riemann surface. There are an infinite number of possible atlases to use here; one such is to pick a sufficiently small neighbourhood ${U}$ of the origin in ${{\bf C}}$ and take the atlas ${(\phi_\alpha: U_\alpha \rightarrow U)_{\alpha \in {\bf C}/\Lambda}}$ where ${U_\alpha := \alpha+U}$ and ${\phi_\alpha(\alpha+z) := z}$ for all ${z \in U}$. In particular, given any non-real complex number ${\omega}$, the complex torus ${{\bf C} / \langle 1, \omega \rangle}$ formed by quotienting ${{\bf C}}$ by the lattice ${\langle 1, \omega \rangle := \{ n + m \omega: n,m \in {\bf Z}\}}$ is a Riemann surface.

Example 3 Any open connected subset ${U}$ of ${{\bf C}}$ is a Riemann surface. By the Riemann mapping theorem, all simply connected open ${U \subset {\bf C}}$, other than ${{\bf C}}$ itself, are isomorphic (as Riemann surfaces) to the unit disk (or, equivalently, to the upper half-plane).

Example 4 (Riemann sphere) The Riemann sphere ${{\bf C} \cup \{\infty\}}$, as a topological manifold, is the one-point compactification of ${{\bf C}}$. Topologically, this is a sphere and is in particular connected. One can cover the Riemann sphere by the two open sets ${U_1 := {\bf C}}$ and ${U_2 := {\bf C} \cup \{\infty\} \backslash \{0\}}$, and give these two open sets the charts ${\phi_1: U_1 \rightarrow {\bf C}}$ and ${\phi_2: U_2 \rightarrow {\bf C}}$ defined by ${\phi_1(z) := z}$ for ${z \in {\bf C}}$, ${\phi_2(z) := 1/z}$ for ${z \in {\bf C} \backslash \{0\}}$, and ${\phi_2(\infty) := 0}$. This is a complex atlas since the map ${z \mapsto 1/z}$ is holomorphic on ${{\bf C} \backslash \{0\}}$.
An alternate way of viewing the Riemann sphere is as the projective line ${\mathbf{CP}^1}$. Topologically, this is the punctured space ${{\bf C}^2 \backslash \{(0,0)\}}$ quotiented out by non-zero complex dilations, thus elements of this space are equivalence classes ${[z,w] := \{ (\lambda z, \lambda w): \lambda \in {\bf C} \backslash \{0\}\}}$ with the usual quotient topology. One can cover this space by two open sets ${U_1 := \{ [z,1]: z \in {\bf C} \}}$ and ${U_2 := \{ [1,w]: w \in {\bf C} \}}$ and give these two open sets the charts ${\phi_1: U_1 \rightarrow {\bf C}}$ and ${\phi_2: U_2 \rightarrow {\bf C}}$ defined by ${\phi_1([z,1]) := z}$ for ${z \in {\bf C}}$, ${\phi_2([1,w]) := w}$ for ${w \in {\bf C}}$. This is a complex atlas, basically because ${[z,1] = [1,1/z]}$ for ${z \in {\bf C} \backslash \{0\}}$ and ${1/z}$ is holomorphic on ${{\bf C} \backslash \{0\}}$.
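The following small sketch (with hypothetical helper names, and with the charts extended to arbitrary representatives ${(z,w)}$ of a class ${[z,w]}$, which is an assumption of the sketch) checks numerically that the transition map between the two charts of the projective line is ${z \mapsto 1/z}$:

```python
def phi1(p):
    """Chart on U_1: send the class [z, w] with w != 0 to z / w."""
    z, w = p
    return z / w

def phi2(p):
    """Chart on U_2: send the class [z, w] with z != 0 to w / z."""
    z, w = p
    return w / z

def phi1_inverse(z):
    """Inverse of phi1: the representative (z, 1) of the class [z, 1]."""
    return (z, 1.0 + 0j)

z = 2.0 - 0.5j                       # a sample point of C \ {0}
transition = phi2(phi1_inverse(z))   # phi2 o phi1^{-1}
transition_ok = abs(transition - 1 / z) < 1e-12

# The charts do not depend on the representative of the class [z, w].
lam = -3.0 + 4.0j
p = (1.5 + 2.0j, 0.25 - 1.0j)
scaled = (lam * p[0], lam * p[1])
well_defined = (abs(phi1(p) - phi1(scaled)) < 1e-12
                and abs(phi2(p) - phi2(scaled)) < 1e-12)
print(transition_ok, well_defined)
```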

Exercise 5 Verify that the Riemann sphere is isomorphic (as a Riemann surface) to the projective line.

Example 6 (Smooth algebraic plane curves) Let ${P(z_1,z_2,z_3)}$ be a complex polynomial in three variables which is homogeneous of some degree ${d \geq 1}$, thus

$\displaystyle P( \lambda z_1, \lambda z_2, \lambda z_3) = \lambda^d P( z_1, z_2, z_3). \ \ \ \ \ (1)$

Define the complex projective plane ${\mathbf{CP}^2}$ to be the punctured space ${{\bf C}^3 \backslash \{0\}}$ quotiented out by non-zero complex dilations, with the usual quotient topology. (There is another important topology to place here of fundamental importance in algebraic geometry, namely the Zariski topology, but we will ignore this topology here.) This is a compact space, whose elements are equivalence classes ${[z_1,z_2,z_3] := \{ (\lambda z_1, \lambda z_2, \lambda z_3)\}}$. Inside this plane we can define the (projective, degree ${d}$) algebraic curve

$\displaystyle Z(P) := \{ [z_1,z_2,z_3] \in \mathbf{CP}^2: P(z_1,z_2,z_3) = 0 \};$

this is well defined thanks to (1). It is easy to verify that ${Z(P)}$ is a closed subset of ${\mathbf{CP}^2}$ and hence compact; it is non-empty thanks to the fundamental theorem of algebra.
Suppose that ${P}$ is irreducible, which means that it is not the product of polynomials of smaller degree. As we shall show in the appendix, this makes the algebraic curve connected. (Actually, algebraic curves remain connected even in the reducible case, thanks to Bezout’s theorem, but we will not prove that theorem here.) We will in fact make the stronger nonsingularity hypothesis: there is no triple ${(z_1,z_2,z_3) \in {\bf C}^3 \backslash \{(0,0,0)\}}$ such that the four numbers ${P(z_1,z_2,z_3), \frac{\partial}{\partial z_j} P(z_1,z_2,z_3)}$ simultaneously vanish for ${j=1,2,3}$. (This looks like four constraints, but is in fact essentially just three, due to the Euler identity

$\displaystyle \sum_{j=1}^3 z_j \frac{\partial}{\partial z_j} P(z_1,z_2,z_3) = d P(z_1,z_2,z_3)$

that arises from differentiating (1) in ${\lambda}$. The fact that nonsingularity implies irreducibility is another consequence of Bezout’s theorem, which is not proven here.) For instance, the polynomial ${z_1^2 z_3 - z_2^3}$ is irreducible but singular (there is a “cusp” singularity at ${[0,0,1]}$). With this hypothesis, we call the curve ${Z(P)}$ smooth.
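As a sanity check on the cusp example, the sketch below hard-codes ${P = z_1^2 z_3 - z_2^3}$ together with its partial derivatives (computed by hand). It verifies the Euler identity at an arbitrary sample point and confirms that ${P}$ and all three partial derivatives vanish at ${(0,0,1)}$:

```python
def P(z1, z2, z3):
    """The homogeneous cubic P = z1^2 z3 - z2^3."""
    return z1 ** 2 * z3 - z2 ** 3

def grad_P(z1, z2, z3):
    """Partial derivatives of P with respect to z1, z2, z3 (computed by hand)."""
    return (2 * z1 * z3, -3 * z2 ** 2, z1 ** 2)

d = 3  # degree of P

# Euler identity: sum_j z_j dP/dz_j = d * P, tested at an arbitrary sample point.
point = (1 + 2j, -0.5 + 1j, 3 - 1j)
euler_lhs = sum(zj * gj for zj, gj in zip(point, grad_P(*point)))
euler_rhs = d * P(*point)
euler_ok = abs(euler_lhs - euler_rhs) < 1e-9

# The representative (0, 0, 1) of [0, 0, 1]: P and all partials vanish there.
cusp = (0, 0, 1)
cusp_is_singular = P(*cusp) == 0 and all(g == 0 for g in grad_P(*cusp))
print(euler_ok, cusp_is_singular)
```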
Now suppose ${[z_1,z_2,z_3]}$ is a point in ${Z(P)}$; without loss of generality we may take ${z_3}$ non-zero, and then we can normalise ${z_3=1}$. Now one can think of ${P(z_1,z_2,1)}$ as an inhomogeneous polynomial in just two variables ${z_1,z_2}$, and by the nonsingularity hypothesis (together with the Euler identity) we see that the gradient ${(\frac{\partial}{\partial z_1} P(z_1,z_2,1), \frac{\partial}{\partial z_2} P(z_1,z_2,1))}$ is non-zero whenever ${P(z_1,z_2,1)=0}$. By the (complexified) implicit function theorem, this ensures that the affine algebraic curve

$\displaystyle Z(P)_{aff} := \{ (z_1,z_2) \in {\bf C}^2: P(z_1,z_2,1) = 0 \}$

is a Riemann surface in a neighbourhood of ${(z_1,z_2)}$; we leave this as an exercise. This can be used to give a coordinate chart for ${Z(P)}$ in a neighbourhood of ${[z_1,z_2,z_3]}$ when ${z_3 \neq 0}$. One argues similarly when ${z_1}$ or ${z_2}$ is non-zero. This can be shown to give an atlas on ${Z(P)}$, which (assuming the connectedness claim that we will prove later) gives ${Z(P)}$ the structure of a Riemann surface.

Exercise 7 State and prove a complex version of the implicit function theorem that justifies the above claim that the charts in the above example form an atlas, and an algebraic curve associated to a non-singular polynomial is a Riemann surface.

Exercise 8

• (i) Show that all (irreducible plane projective) algebraic curves of degree ${1}$ are isomorphic to the Riemann sphere. (Hint: reduce to an explicit linear polynomial such as ${z_3}$.)
• (ii) Show that all (irreducible plane projective) algebraic curves of degree ${2}$ are isomorphic to the Riemann sphere. (Hint: to reduce computation, first use some linear algebra to reduce the homogeneous quadratic polynomial to a standard form, such as ${z_1^2+z_2^2+z_3^2}$ or ${z_2 z_3 - z_1^2}$.)

Exercise 9 If ${a,b}$ are complex numbers, show that the projective cubic curve

$\displaystyle \{ [z_1, z_2, z_3]: z_2^2 z_3 = z_1^3 + a z_1 z_3^2 + b z_3^3 \}$

is nonsingular if and only if the discriminant ${-16 (4a^3 + 27b^2)}$ is non-zero. (When this occurs, the curve is called an elliptic curve (in Weierstrass form), which is a fundamentally important example of a Riemann surface in many areas of mathematics, and number theory in particular. One can also define the discriminant for polynomials of higher degree, but we will not do so here.)
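To get a feel for the discriminant condition (without giving away the exercise), the sketch below evaluates ${-16(4a^3+27b^2)}$ for two sample pairs ${(a,b)}$, which are illustrative choices only. For ${(a,b) = (-3,2)}$ the discriminant vanishes, and one can exhibit an explicit singular point in the affine chart ${z_3 = 1}$:

```python
def discriminant(a, b):
    """The discriminant -16 (4 a^3 + 27 b^2) of the Weierstrass cubic."""
    return -16 * (4 * a ** 3 + 27 * b ** 2)

def F(a, b, x, y):
    """Affine equation in the chart z_3 = 1, writing x = z_1 and y = z_2."""
    return y ** 2 - x ** 3 - a * x - b

def grad_F(a, b, x, y):
    """Partial derivatives of F with respect to x and y (computed by hand)."""
    return (-3 * x ** 2 - a, 2 * y)

# Degenerate sample: x^3 - 3x + 2 = (x - 1)^2 (x + 2) has a double root at x = 1.
a, b = -3, 2
disc_degenerate = discriminant(a, b)
singular_point = (1, 0)
is_singular = (F(a, b, *singular_point) == 0
               and grad_F(a, b, *singular_point) == (0, 0))

# Non-degenerate sample: y^2 = x^3 - x.
disc_smooth = discriminant(-1, 0)
print(disc_degenerate, is_singular, disc_smooth)
```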

A recurring theme in mathematics is that an object ${X}$ is often best studied by understanding spaces of “good” functions on ${X}$. In complex analysis, there are two basic types of good functions:

Definition 10 Let ${X}$ be a Riemann surface. A holomorphic function on ${X}$ is a holomorphic map from ${X}$ to ${{\bf C}}$; the space of all such functions will be denoted ${{\mathcal O}(X)}$. A meromorphic function on ${X}$ is a holomorphic map from ${X}$ to the Riemann sphere ${{\bf C} \cup \{\infty\}}$, that is not identically equal to ${\infty}$; the space of all such functions will be denoted ${M(X)}$.

One can also define holomorphicity and meromorphicity in terms of charts: a function ${f: X \rightarrow {\bf C}}$ is holomorphic if and only if, for any chart ${\phi_\alpha: U_\alpha \rightarrow {\bf C}}$, the map ${f \circ \phi^{-1}_\alpha: \phi_\alpha(U_\alpha) \rightarrow {\bf C}}$ is holomorphic in the usual complex analysis sense; similarly, a function ${f: X \rightarrow {\bf C} \cup \{\infty\}}$ is meromorphic if and only if the preimage ${f^{-1}(\{\infty\})}$ is discrete (otherwise, by analytic continuation and the connectedness of ${X}$, ${f}$ will be identically equal to ${\infty}$) and for any chart ${\phi_\alpha: U_\alpha \rightarrow {\bf C}}$, the map ${f \circ \phi_\alpha^{-1}: \phi_\alpha(U_\alpha) \rightarrow {\bf C} \cup \{\infty\}}$ becomes a meromorphic function in the usual complex analysis sense, after removing the discrete set of complex numbers where this map is infinite. One consequence of this alternate definition is that the space ${{\mathcal O}(X)}$ of holomorphic functions is a commutative complex algebra (a complex vector space closed under pointwise multiplication), while the space ${M(X)}$ of meromorphic functions is a complex field (a commutative complex algebra where every non-zero element has an inverse). Another consequence is that one can define the notion of a zero of given order ${k}$, or a pole of order ${k}$, for a holomorphic or meromorphic function, by composing with a chart map and using the usual complex analysis notions there, noting (from the holomorphicity of transition maps and their inverses) that this does not depend on the choice of chart. (However, one cannot similarly define the residue of a meromorphic function on ${X}$ this way, as the residue turns out to be chart-dependent thanks to the chain rule. Residues should instead be applied to meromorphic ${1}$-forms, a concept we will introduce later.)
A third consequence is analytic continuation: if two holomorphic or meromorphic functions on ${X}$ agree on a non-empty open set, then they agree everywhere.
On the complex numbers ${{\bf C}}$, there are of course many holomorphic functions and meromorphic functions; for instance any power series with an infinite radius of convergence will give a holomorphic function, and the quotient of any two such functions (with non-zero denominator) will give a meromorphic function. Furthermore, we have extremely wide latitude in how to specify the zeroes of the holomorphic function, or the zeroes and poles of the meromorphic function, thanks to tools such as the Weierstrass factorisation theorem or the Mittag-Leffler theorem (covered in previous quarters).
It turns out, however, that the situation changes dramatically when the Riemann surface ${X}$ is compact, with the holomorphic and meromorphic functions becoming much more rigid. First of all, compactness eliminates all holomorphic functions except for the constants:

Lemma 11 Let ${f \in \mathcal{O}(X)}$ be a holomorphic function on a compact Riemann surface ${X}$. Then ${f}$ is constant.

This result should be seen as a close sibling of Liouville’s theorem that all bounded entire functions are constant. (Indeed, in the case of a complex torus, this lemma is a corollary of Liouville’s theorem.)
Proof: As ${f}$ is continuous and ${X}$ is compact, ${|f|}$ must attain a maximum at some point ${z_0 \in X}$. Working in a chart around ${z_0}$ and applying the maximum principle, we conclude that ${f}$ is constant in a neighbourhood of ${z_0}$, and hence is constant everywhere by analytic continuation. $\Box$
This dramatically cuts down the number of possible meromorphic functions – indeed, for an abstract Riemann surface, it is not immediately obvious that there are any non-constant meromorphic functions at all! As the poles are isolated and the surface is compact, a meromorphic function can only have finitely many poles, and if one prescribes the location of the poles and the maximum order at each pole, then we shall see that the space of meromorphic functions is now finite dimensional. The precise dimensions of these spaces are in fact rather interesting, and obey a basic duality law known as the Riemann-Roch theorem. We will give a mostly self-contained proof of the Riemann-Roch theorem in these notes, omitting only some facts about genus and Euler characteristic, as well as construction of certain meromorphic ${1}$-forms (also known as Abelian differentials).
A more detailed study of Riemann surfaces (and more generally, complex manifolds) can be found for instance in Griffiths and Harris’s “Principles of Algebraic Geometry”.

In 1946, Ulam, in response to a theorem of Anning and Erdös, posed the following problem:

Problem 1 (Erdös-Ulam problem) Let ${S \subset {\bf R}^2}$ be a set such that the distance between any two points in ${S}$ is rational. Is it true that ${S}$ cannot be (topologically) dense in ${{\bf R}^2}$?

The theorem of Anning and Erdös answered this question in the affirmative in the special case where all the distances between points of ${S}$ are integers rather than merely rational.

The Erdös-Ulam problem remains open; it was discussed recently over at Gödel’s lost letter. It is in fact likely (as we shall see below) that the set ${S}$ in the above problem is not only forbidden to be topologically dense, but also cannot be Zariski dense either. If so, then the structure of ${S}$ is quite restricted; it was shown by Solymosi and de Zeeuw that if ${S}$ fails to be Zariski dense, then all but finitely many of the points of ${S}$ must lie on a single line, or a single circle. (Conversely, it is easy to construct examples of dense subsets of a line or circle in which all distances are rational, though in the latter case the square of the radius of the circle must also be rational.)

The main tool of the Solymosi-de Zeeuw analysis was Faltings’ celebrated theorem that every algebraic curve of genus at least two contains only finitely many rational points. The purpose of this post is to observe that an affirmative answer to the full Erdös-Ulam problem similarly follows from the conjectured analogue of Faltings’ theorem for surfaces, namely the following conjecture of Bombieri and Lang:

Conjecture 2 (Bombieri-Lang conjecture) Let ${X}$ be a smooth projective irreducible algebraic surface defined over the rationals ${{\bf Q}}$ which is of general type. Then the set ${X({\bf Q})}$ of rational points of ${X}$ is not Zariski dense in ${X}$.

In fact, the Bombieri-Lang conjecture has been made for varieties of arbitrary dimension, and for more general number fields than the rationals, but the above special case of the conjecture is the only one needed for this application. We will review what “general type” means (for smooth projective complex varieties, at least) below the fold.

The Bombieri-Lang conjecture is considered to be extremely difficult, in particular being substantially harder than Faltings’ theorem, which is itself a highly non-trivial result. So this implication should not be viewed as a practical route to resolving the Erdös-Ulam problem unconditionally; rather, it is a demonstration of the power of the Bombieri-Lang conjecture. Still, it was an instructive algebraic geometry exercise for me to carry out the details of this implication, which quickly boils down to verifying that a certain quite explicit algebraic surface is of general type (Theorem 4 below). As I am not an expert in the subject, my computations here will be rather tedious and pedestrian; it is likely that they could be made much slicker by exploiting more of the machinery of modern algebraic geometry, and I would welcome any such streamlining by actual experts in this area. (For similar reasons, there may be more typos and errors than usual in this post; corrections are welcome as always.) My calculations here are based on a similar calculation of van Luijk, who used analogous arguments to show (assuming Bombieri-Lang) that the set of perfect cuboids is not Zariski-dense in its projective parameter space.

We also remark that in a recent paper of Makhul and Shaffaf, the Bombieri-Lang conjecture (or more precisely, a weaker consequence of that conjecture) was used to show that if ${S}$ is a subset of ${{\bf R}^2}$ with rational distances which intersects any line in only finitely many points, then there is a uniform bound on the cardinality of the intersection of ${S}$ with any line. I have also recently learned (private communication) that an unpublished work of Shaffaf has obtained a result similar to the one in this post, namely that the Erdös-Ulam conjecture follows from the Bombieri-Lang conjecture, plus an additional conjecture about the rational curves in a specific surface.

Let us now give the elementary reductions to the claim that a certain variety is of general type. For sake of contradiction, let ${S}$ be a dense set such that the distance between any two points is rational. Then ${S}$ certainly contains two points that are a rational distance apart. By applying a translation, rotation, and a (rational) dilation, we may assume that these two points are ${(0,0)}$ and ${(1,0)}$. As ${S}$ is dense, there is a third point of ${S}$ not on the ${x}$ axis, which after a reflection we can place in the upper half-plane; we will write it as ${(a,\sqrt{b})}$ with ${b>0}$.

Given any two points ${P, Q}$ in ${S}$, the quantities ${|P|^2, |Q|^2, |P-Q|^2}$ are rational, and so by the cosine rule the dot product ${P \cdot Q}$ is rational as well. Since ${(1,0) \in S}$, this implies that the ${x}$-component of every point ${P}$ in ${S}$ is rational; this in turn implies that the product of the ${y}$-coordinates of any two points ${P,Q}$ in ${S}$ is rational as well (since this differs from ${P \cdot Q}$ by a rational number). In particular, ${a}$ and ${b}$ are rational, and all of the points in ${S}$ now lie in the lattice ${\{ ( x, y\sqrt{b}): x, y \in {\bf Q} \}}$. (This fact appears to have first been observed in the 1988 Habilitationsschrift of Kemnitz.)
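The cosine rule step can be illustrated with exact rational arithmetic. In the sketch below (where the value of `b` and the two sample points are arbitrary assumptions), a point ${(x, y\sqrt{b})}$ is stored as the pair ${(x,y)}$, so that squared lengths and dot products are computed without ever forming ${\sqrt{b}}$:

```python
from fractions import Fraction

b = Fraction(3)  # sample positive rational (an assumption of this sketch)

# A point (x, y*sqrt(b)) of the lattice is stored as the rational pair (x, y).
def sq_norm(p):
    x, y = p
    return x * x + b * y * y

def dot(p, q):
    return p[0] * q[0] + b * p[1] * q[1]

def sub(p, q):
    return (p[0] - q[0], p[1] - q[1])

P = (Fraction(1, 2), Fraction(2, 3))
Q = (Fraction(-3, 4), Fraction(1, 5))

# Cosine rule (polarisation): P.Q = (|P|^2 + |Q|^2 - |P - Q|^2) / 2.
polarised = (sq_norm(P) + sq_norm(Q) - sq_norm(sub(P, Q))) / 2
cosine_rule_ok = polarised == dot(P, Q)

# Taking Q = (1, 0) recovers the x-coordinate of P as a dot product.
unit = (Fraction(1), Fraction(0))
x_recovered = dot(P, unit)
print(cosine_rule_ok, x_recovered)
```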

Now take four points ${(x_j,y_j \sqrt{b})}$, ${j=1,\dots,4}$ in ${S}$ in general position (so that the octuplet ${(x_1,y_1\sqrt{b},\dots,x_4,y_4\sqrt{b})}$ avoids any pre-specified hypersurface in ${{\bf C}^8}$); this can be done if ${S}$ is dense. (If one wished, one could re-use the three previous points ${(0,0), (1,0), (a,\sqrt{b})}$ to be three of these four points, although this ultimately makes little difference to the analysis.) If ${(x,y\sqrt{b})}$ is any point in ${S}$, then the distances ${r_j}$ from ${(x,y\sqrt{b})}$ to ${(x_j,y_j\sqrt{b})}$ are rationals that obey the equations

$\displaystyle (x - x_j)^2 + b (y-y_j)^2 = r_j^2$

for ${j=1,\dots,4}$, and thus determine a rational point in the affine complex variety ${V = V_{b,x_1,y_1,x_2,y_2,x_3,y_3,x_4,y_4} \subset {\bf C}^6}$ defined as

$\displaystyle V := \{ (x,y,r_1,r_2,r_3,r_4) \in {\bf C}^6:$

$\displaystyle (x - x_j)^2 + b (y-y_j)^2 = r_j^2 \hbox{ for } j=1,\dots,4 \}.$

By inspecting the projection ${(x,y,r_1,r_2,r_3,r_4) \rightarrow (x,y)}$ from ${V}$ to ${{\bf C}^2}$, we see that ${V}$ is a branched cover of ${{\bf C}^2}$, with the generic cover having ${2^4=16}$ points (coming from the different ways to form the square roots ${r_1,r_2,r_3,r_4}$); in particular, ${V}$ is a complex affine algebraic surface, defined over the rationals. By inspecting the monodromy around the four singular base points ${(x,y) = (x_i,y_i)}$ (which switch the sign of one of the roots ${r_i}$, while keeping the other three roots unchanged), we see that the variety ${V}$ is connected away from its singular set, and thus irreducible. As ${S}$ is topologically dense in ${{\bf R}^2}$, it is Zariski-dense in ${{\bf C}^2}$, and so ${S}$ generates a Zariski-dense set of rational points in ${V}$. To solve the Erdös-Ulam problem, it thus suffices to show that

Claim 3 For any non-zero rational ${b}$ and for rationals ${x_1,y_1,x_2,y_2,x_3,y_3,x_4,y_4}$ in general position, the set of rational points of the affine surface ${V = V_{b,x_1,y_1,x_2,y_2,x_3,y_3,x_4,y_4}}$ is not Zariski dense in ${V}$.

This is already very close to a claim that can be directly resolved by the Bombieri-Lang conjecture, but ${V}$ is affine rather than projective, and also contains some singularities. The first issue is easy to deal with, by working with the projectivisation

$\displaystyle \overline{V} := \{ [X,Y,Z,R_1,R_2,R_3,R_4] \in {\bf CP}^6: Q(X,Y,Z,R_1,R_2,R_3,R_4) = 0 \} \ \ \ \ \ (1)$

of ${V}$, where ${Q: {\bf C}^7 \rightarrow {\bf C}^4}$ is the homogeneous quadratic polynomial

$\displaystyle Q(X,Y,Z,R_1,R_2,R_3,R_4) := (Q_j(X,Y,Z,R_1,R_2,R_3,R_4) )_{j=1}^4$

with

$\displaystyle Q_j(X,Y,Z,R_1,R_2,R_3,R_4) := (X-x_j Z)^2 + b (Y-y_jZ)^2 - R_j^2$

and the projective complex space ${{\bf CP}^6}$ is the space of all equivalence classes ${[X,Y,Z,R_1,R_2,R_3,R_4]}$ of tuples ${(X,Y,Z,R_1,R_2,R_3,R_4) \in {\bf C}^7 \backslash \{0\}}$ up to projective equivalence ${(\lambda X, \lambda Y, \lambda Z, \lambda R_1, \lambda R_2, \lambda R_3, \lambda R_4) \sim (X,Y,Z,R_1,R_2,R_3,R_4)}$. By identifying the affine point ${(x,y,r_1,r_2,r_3,r_4)}$ with the projective point ${[x,y,1,r_1,r_2,r_3,r_4]}$, we see that ${\overline{V}}$ consists of the affine variety ${V}$ together with the set ${\{ [X,Y,0,R_1,R_2,R_3,R_4]: X^2+bY^2=R_1^2; R_j = \pm R_1 \hbox{ for } j=2,3,4\}}$, which is the union of eight curves, each of which lies in the closure of ${V}$. Thus ${\overline{V}}$ is the projective closure of ${V}$, and is thus a complex irreducible projective surface, defined over the rationals. As ${\overline{V}}$ is cut out by four quadric equations in ${{\bf CP}^6}$ and has degree sixteen (as can be seen for instance by inspecting the intersection of ${\overline{V}}$ with a generic perturbation of a fibre over the generically defined projection ${[X,Y,Z,R_1,R_2,R_3,R_4] \mapsto [X,Y,Z]}$), it is also a complete intersection. To show Claim 3, it then suffices to show that the rational points in ${\overline{V}}$ are not Zariski dense in ${\overline{V}}$.
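The projectivisation can be spot-checked with exact arithmetic. The sketch below makes several assumptions: the value of `b` and the four centres are arbitrary sample values (no claim is made that they are in general position), and the names are hypothetical. It verifies that each ${Q_j}$ is homogeneous of degree two, and that a sample point at infinity with ${Z=0}$, ${X^2+bY^2=R_1^2}$, ${R_j = \pm R_1}$ satisfies all four equations for each of the eight sign patterns:

```python
from fractions import Fraction
from itertools import product

b = Fraction(1)  # sample non-zero rational (an assumption of this sketch)
centres = [(Fraction(0), Fraction(0)), (Fraction(1), Fraction(0)),
           (Fraction(2), Fraction(5)), (Fraction(-3), Fraction(7))]

def Q(v):
    """The four quadrics Q_j = (X - x_j Z)^2 + b (Y - y_j Z)^2 - R_j^2."""
    X, Y, Z, *R = v
    return tuple((X - xj * Z) ** 2 + b * (Y - yj * Z) ** 2 - Rj ** 2
                 for (xj, yj), Rj in zip(centres, R))

# Homogeneity of degree two: Q(lambda v) = lambda^2 Q(v).
v = tuple(Fraction(t) for t in (2, -1, 3, 5, 4, -2, 7))
lam = Fraction(5, 3)
homogeneous = Q(tuple(lam * t for t in v)) == tuple(lam ** 2 * q for q in Q(v))

# Points at infinity: Z = 0 and X^2 + b Y^2 = R_1^2 (here 3^2 + 4^2 = 5^2),
# with one sample point for each of the 2^3 choices of sign R_j = +-R_1.
X, Y, R1 = Fraction(3), Fraction(4), Fraction(5)
at_infinity = [(X, Y, Fraction(0), R1, s2 * R1, s3 * R1, s4 * R1)
               for s2, s3, s4 in product((1, -1), repeat=3)]
all_on_closure = all(Q(p) == (0, 0, 0, 0) for p in at_infinity)
print(homogeneous, len(at_infinity), all_on_closure)
```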

Heuristically, the reason why we expect few rational points in ${\overline{V}}$ is as follows. First observe from the projective nature of (1) that every rational point is equivalent to an integer point. But for a septuple ${(X,Y,Z,R_1,R_2,R_3,R_4)}$ of integers of size ${O(N)}$, the quantity ${Q(X,Y,Z,R_1,R_2,R_3,R_4)}$ is an integer point of ${{\bf Z}^4}$ of size ${O(N^2)}$, and so should only vanish about ${O(N^{-8})}$ of the time. Hence the number of integer points ${(X,Y,Z,R_1,R_2,R_3,R_4) \in {\bf Z}^7}$ of height comparable to ${N}$ should be about

$\displaystyle O(N)^7 \times O(N^{-8}) = O(N^{-1});$

this is a convergent sum if ${N}$ ranges over (say) powers of two, and so from standard probabilistic heuristics (see this previous post) we in fact expect only finitely many solutions, in the absence of any special algebraic structure (e.g. the structure of an abelian variety, or a birational reduction to a simpler variety) that could produce an unusually large number of solutions.
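The convergence claim in this heuristic is elementary, but for concreteness here is a small sketch (with hypothetical names) that sums the heuristic count ${N^7 \times N^{-8} = N^{-1}}$ over the dyadic scales ${N = 2^k}$ and confirms that the partial sums stay below ${2}$:

```python
from fractions import Fraction

def expected_count(N):
    """Heuristic count at height ~N: about N^7 candidate tuples,
    each a solution with probability about N^-8."""
    return Fraction(N ** 7, N ** 8)

partial_sums = []
total = Fraction(0)
for k in range(20):
    total += expected_count(2 ** k)
    partial_sums.append(total)

# Geometric series: the sum over k = 0..K equals 2 - 2^-K.
bounded = all(s < 2 for s in partial_sums)
print(bounded, float(partial_sums[-1]))
```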

The Bombieri-Lang conjecture, Conjecture 2, can be viewed as a formalisation of the above heuristics (roughly speaking, it is one of the most optimistic natural conjectures one could make that is compatible with these heuristics while also being invariant under birational equivalence).

Unfortunately, ${\overline{V}}$ contains some singular points. As ${\overline{V}}$ is a complete intersection, these occur where the Jacobian matrix of the map ${Q: {\bf C}^7 \rightarrow {\bf C}^4}$ has less than full rank, or equivalently where the gradient vectors

$\displaystyle \nabla Q_j = (2(X-x_j Z), 2b(Y-y_j Z), -2x_j (X-x_j Z) - 2b y_j (Y-y_j Z), 0, \dots, 0, -2R_j, 0, \dots, 0) \ \ \ \ \ (2)$

for ${j=1,\dots,4}$ are linearly dependent, where the ${-2R_j}$ is in the coordinate position associated to ${R_j}$. One way in which this can occur is if one of the gradient vectors ${\nabla Q_j}$ vanishes identically. This occurs at precisely ${4 \times 2^3 = 32}$ points, when ${[X,Y,Z]}$ is equal to ${[x_j,y_j,1]}$ for some ${j=1,\dots,4}$, and one has ${R_k = \pm ( (x_j - x_k)^2 + b (y_j - y_k)^2 )^{1/2}}$ for all ${k=1,\dots,4}$ (so in particular ${R_j=0}$). Let us refer to these as the obvious singularities; they arise from the geometrically evident fact that the distance function ${(x,y\sqrt{b}) \mapsto \sqrt{(x-x_j)^2 + b(y-y_j)^2}}$ is singular at ${(x_j,y_j\sqrt{b})}$.
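The count of ${32}$ obvious singularities can be checked numerically. The sketch below assumes that the quadrics have the form ${Q_k = (X - x_k Z)^2 + b(Y - y_k Z)^2 - R_k^2}$ (the form suggested by the points at infinity and by the formula for ${R_k}$ above); the four centres ${(x_j,y_j)}$ and the value of ${b}$ are hypothetical choices made only for illustration, as are all function names.

```python
import cmath
import itertools

def Q(k, point, centres, b):
    """k-th quadric, in the assumed form (X - x_k Z)^2 + b (Y - y_k Z)^2 - R_k^2."""
    X, Y, Z, *R = point
    xk, yk = centres[k]
    return (X - xk * Z) ** 2 + b * (Y - yk * Z) ** 2 - R[k] ** 2

def grad_Q(k, point, centres, b):
    """Gradient of Q_k with respect to (X, Y, Z, R_1, ..., R_4)."""
    X, Y, Z, *R = point
    xk, yk = centres[k]
    u, v = X - xk * Z, Y - yk * Z
    grad = [2 * u, 2 * b * v, -2 * xk * u - 2 * b * yk * v, 0, 0, 0, 0]
    grad[3 + k] = -2 * R[k]
    return grad

def obvious_singularities(centres, b):
    """Pairs (j, point) with [X,Y,Z] = [x_j, y_j, 1] and every sign choice for R_k, k != j."""
    points = []
    for j, (xj, yj) in enumerate(centres):
        dist = [cmath.sqrt((xj - xk) ** 2 + b * (yj - yk) ** 2) for xk, yk in centres]
        others = [k for k in range(4) if k != j]
        for signs in itertools.product((1, -1), repeat=3):
            R = [0j] * 4
            for sign, k in zip(signs, others):
                R[k] = sign * dist[k]
            points.append((j, (xj, yj, 1, *R)))
    return points

centres = [(0, 0), (1, 0), (0, 1), (2, 3)]   # hypothetical points in general position
b = 2                                        # hypothetical value of b
sings = obvious_singularities(centres, b)
print(len(sings), "points; each lies on all four quadrics and kills one gradient")
```

At each listed point the ${j^{th}}$ gradient vanishes because ${X - x_j Z}$, ${Y - y_j Z}$ and ${R_j}$ are all zero there, regardless of the particular centres chosen.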

The other way in which this could occur is if a non-trivial linear combination of at least two of the gradient vectors vanishes. From (2), this can only occur if ${R_j=R_k=0}$ for some distinct ${j,k}$, which from (1) implies that

$\displaystyle (X - x_j Z) = \pm \sqrt{b} i (Y - y_j Z) \ \ \ \ \ (3)$

and

$\displaystyle (X - x_k Z) = \pm \sqrt{b} i (Y - y_k Z) \ \ \ \ \ (4)$

for two choices of sign ${\pm}$. If the signs are equal, then (as ${x_j, y_j, x_k, y_k}$ are in general position) this implies that ${Z=0}$, and then we have the singular point

$\displaystyle [X,Y,Z,R_1,R_2,R_3,R_4] = [\pm \sqrt{b} i, 1, 0, 0, 0, 0, 0]. \ \ \ \ \ (5)$

If the non-trivial linear combination involved three or more gradient vectors, then by the pigeonhole principle at least two of the signs involved must be equal, and so the only singular points are (5). So the only remaining possibility is when we have two gradient vectors ${\nabla Q_j, \nabla Q_k}$ that are parallel but non-zero, with the signs in (3), (4) opposing. But then (as ${x_j,y_j,x_k,y_k}$ are in general position) the vectors ${(X-x_j Z, Y-y_j Z), (X-x_k Z, Y-y_k Z)}$ are non-zero and non-parallel to each other, a contradiction. Thus, outside of the ${32}$ obvious singular points mentioned earlier, the only other singular points are the two points (5).

We will shortly show that the ${32}$ obvious singularities are ordinary double points; the surface ${\overline{V}}$ near any of these points is analytically equivalent to an ordinary cone ${\{ (x,y,z) \in {\bf C}^3: z^2 = x^2 + y^2 \}}$ near the origin, which is a cone over a smooth conic curve ${\{ (x,y) \in {\bf C}^2: x^2+y^2=1\}}$. The two non-obvious singularities (5) are slightly more complicated than ordinary double points: they are elliptic singularities, which approximately resemble a cone over an elliptic curve. (As far as I can tell, this resemblance is exact in the category of real smooth manifolds, but not in the category of algebraic varieties.) If one blows up each of the point singularities of ${\overline{V}}$ separately, no further singularities are created, and one obtains a smooth projective surface ${X}$ (using the Segre embedding as necessary to embed ${X}$ back into projective space, rather than in a product of projective spaces). Away from the singularities, the rational points of ${\overline{V}}$ lift up to rational points of ${X}$. Assuming the Bombieri-Lang conjecture, we thus are able to answer the Erdős-Ulam problem in the affirmative once we establish

Theorem 4 The blowup ${X}$ of ${\overline{V}}$ is of general type.

This will be done below the fold, by the pedestrian device of explicitly constructing global differential forms on ${X}$; I will also be working from a complex analysis viewpoint rather than an algebraic geometry viewpoint as I am more comfortable with the former approach. (As mentioned above, though, there may well be a quicker way to establish this result by using more sophisticated machinery.)

I thank Mark Green and David Gieseker for helpful conversations (and a crash course in varieties of general type!).

Remark 5 The above argument shows in fact (assuming Bombieri-Lang) that sets ${S \subset {\bf R}^2}$ with all distances rational cannot be Zariski-dense, and thus (by Solymosi-de Zeeuw) must lie on a single line or circle with only finitely many exceptions. Assuming a stronger version of Bombieri-Lang involving a general number field ${K}$, we obtain a similar conclusion with “rational” replaced by “lying in ${K}$” (one has to extend the Solymosi-de Zeeuw analysis to more general number fields, but this should be routine, using the analogue of Faltings’ theorem for such number fields).

Let ${{\bf F}_q}$ be a finite field of order ${q = p^n}$, and let ${C}$ be an absolutely irreducible smooth projective curve defined over ${{\bf F}_q}$ (and hence over the algebraic closure ${k := \overline{{\bf F}_q}}$ of that field). For instance, ${C}$ could be the projective elliptic curve

$\displaystyle C = \{ [x,y,z]: y^2 z = x^3 + ax z^2 + b z^3 \}$

in the projective plane ${{\bf P}^2 = \{ [x,y,z]: (x,y,z) \neq (0,0,0) \}}$, where ${a,b \in {\bf F}_q}$ are coefficients whose discriminant ${-16(4a^3+27b^2)}$ is non-vanishing, which is the projective version of the affine elliptic curve

$\displaystyle \{ (x,y): y^2 = x^3 + ax + b \}.$

To each such curve ${C}$ one can associate a genus ${g}$, which we will define later; for instance, elliptic curves have genus ${1}$. We can also count the cardinality ${|C({\bf F}_q)|}$ of the set ${C({\bf F}_q)}$ of ${{\bf F}_q}$-points of ${C}$. The Hasse-Weil bound relates the two:

Theorem 1 (Hasse-Weil bound) ${||C({\bf F}_q)| - q - 1| \leq 2g\sqrt{q}}$.
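For the elliptic curve example (genus ${1}$) over a prime field ${{\bf F}_p}$ with ${p \geq 5}$, the bound is easy to test by brute force. The following Python sketch counts points directly; the function names and the sample coefficients ${a=b=1}$ are illustrative choices of mine.

```python
def count_points(a, b, p):
    """Number of F_p-points on y^2 z = x^3 + a x z^2 + b z^3, for a prime p >= 5."""
    assert (-16 * (4 * a**3 + 27 * b**2)) % p != 0, "discriminant vanishes: curve is singular"
    # roots[r] = number of y in F_p with y^2 = r
    roots = [0] * p
    for y in range(p):
        roots[y * y % p] += 1
    affine = sum(roots[(x**3 + a * x + b) % p] for x in range(p))
    return affine + 1   # plus the single point at infinity [0, 1, 0]

def satisfies_hasse_weil(n_points, q, g):
    """Check |N - q - 1| <= 2 g sqrt(q) in exact integer arithmetic (by squaring)."""
    return (n_points - q - 1) ** 2 <= 4 * g * g * q

for p in (5, 7, 11, 13, 17, 19, 23):
    n = count_points(1, 1, p)
    print(p, n, satisfies_hasse_weil(n, p, g=1))
```

Squaring both sides avoids floating-point square roots, so the comparison is exact.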

The usual proofs of this bound proceed by first establishing a trace formula of the form

$\displaystyle |C({\bf F}_{p^n})| = p^n - \sum_{i=1}^{2g} \alpha_i^n + 1 \ \ \ \ \ (1)$

for some complex numbers ${\alpha_1,\dots,\alpha_{2g}}$ independent of ${n}$; this is in fact a special case of the Lefschetz-Grothendieck trace formula, and can be interpreted as an assertion that the zeta function associated to the curve ${C}$ is rational. The task is then to establish a bound ${|\alpha_i| \leq \sqrt{p}}$ for all ${i=1,\dots,2g}$; this (or more precisely, the slightly stronger assertion ${|\alpha_i| = \sqrt{p}}$) is the Riemann hypothesis for such curves. This can be done either by passing to the Jacobian variety of ${C}$ and using a certain duality available on the cohomology of such varieties, known as the Rosati involution; alternatively, one can pass to the product surface ${C \times C}$ and apply the Riemann-Roch theorem for that surface.
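In genus ${1}$ the trace formula (1) can be tested by hand: there are two numbers ${\alpha_1,\alpha_2}$, and if one additionally assumes the standard fact (not stated above) that ${\alpha_1 \alpha_2 = p}$, then ${\alpha_1+\alpha_2}$ is determined by the point count over ${{\bf F}_p}$ and the count over ${{\bf F}_{p^2}}$ is predicted. The sketch below compares this prediction with a brute-force count; the model ${{\bf F}_{p^2} = {\bf F}_p[i]/(i^2+1)}$ requires ${p = 3 \hbox{ mod } 4}$, and all names and the sample curve are my own choices.

```python
def count_over_Fp(a, b, p):
    """Points of y^2 = x^3 + a x + b over F_p, including the point at infinity."""
    roots = [0] * p
    for y in range(p):
        roots[y * y % p] += 1
    return 1 + sum(roots[(x**3 + a * x + b) % p] for x in range(p))

def count_over_Fp2(a, b, p):
    """Same count over F_{p^2}, with elements stored as pairs (s, t) meaning s + t*i."""
    assert p % 4 == 3, "i^2 = -1 only defines a field extension when p = 3 mod 4"
    def mul(u, v):
        return ((u[0] * v[0] - u[1] * v[1]) % p, (u[0] * v[1] + u[1] * v[0]) % p)
    field = [(s, t) for s in range(p) for t in range(p)]
    roots = {}
    for y in field:
        sq = mul(y, y)
        roots[sq] = roots.get(sq, 0) + 1
    total = 1   # point at infinity
    for x in field:
        x3 = mul(x, mul(x, x))
        rhs = ((x3[0] + a * x[0] + b) % p, (x3[1] + a * x[1]) % p)
        total += roots.get(rhs, 0)
    return total

def predicted_count(a, b, p, n):
    """p^n + 1 - (alpha_1^n + alpha_2^n), using alpha_1 alpha_2 = p (assumed)."""
    t = p + 1 - count_over_Fp(a, b, p)   # t = alpha_1 + alpha_2
    s_prev, s_cur = 2, t                 # power sums s_0 and s_1
    for _ in range(n - 1):
        s_prev, s_cur = s_cur, t * s_cur - p * s_prev   # Newton's recurrence
    return p**n + 1 - s_cur

print(predicted_count(1, 1, 7, 2), count_over_Fp2(1, 1, 7))
```

The recurrence computes the power sums without ever forming the (complex) numbers ${\alpha_i}$ themselves.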

In 1969, Stepanov introduced an elementary method (a version of what is now known as the polynomial method) to count (or at least to upper bound) the quantity ${|C({\bf F}_q)|}$. The method was initially restricted to hyperelliptic curves, but was soon extended to general curves. In particular, Bombieri used this method to give a short proof of the following weaker version of the Hasse-Weil bound:

Theorem 2 (Weak Hasse-Weil bound) If ${q}$ is a perfect square, and ${q \geq (g+1)^4}$, then ${|C({\bf F}_q)| \leq q + (2g+1) \sqrt{q} + 1}$.

In fact, the bound on ${|C({\bf F}_q)|}$ can be sharpened a little bit further, as we will soon see.

Theorem 2 is only an upper bound on ${|C({\bf F}_q)|}$, but there is a Galois-theoretic trick to convert (a slight generalisation of) this upper bound to a matching lower bound, and if one then uses the trace formula (1) (and the “tensor power trick” of sending ${n}$ to infinity to control the weights ${\alpha_i}$) one can then recover the full Hasse-Weil bound. We discuss these steps below the fold.

I’ve discussed Bombieri’s proof of Theorem 2 in this previous post (in the special case of hyperelliptic curves), but now wish to present the full proof, with some minor simplifications from Bombieri’s original presentation; it is mostly elementary, with the deepest fact from algebraic geometry needed being Riemann’s inequality (a weak form of the Riemann-Roch theorem).

The first step is to reinterpret ${|C({\bf F}_q)|}$ as the number of points of intersection between two curves ${C_1,C_2}$ in the surface ${C \times C}$. Indeed, if we define the Frobenius endomorphism ${\hbox{Frob}_q}$ on any projective space by

$\displaystyle \hbox{Frob}_q( [x_0,\dots,x_n] ) := [x_0^q, \dots, x_n^q]$

then this map preserves the curve ${C}$, and the fixed points of this map are precisely the ${{\bf F}_q}$ points of ${C}$:

$\displaystyle C({\bf F}_q) = \{ z \in C: \hbox{Frob}_q(z) = z \}.$

Thus one can interpret ${|C({\bf F}_q)|}$ as the number of points of intersection between the diagonal curve

$\displaystyle \{ (z,z): z \in C \}$

and the Frobenius graph

$\displaystyle \{ (z, \hbox{Frob}_q(z)): z \in C \}$

which are copies of ${C}$ inside ${C \times C}$. But we can use the additional hypothesis that ${q}$ is a perfect square to write this more symmetrically, by taking advantage of the fact that the Frobenius map has a square root

$\displaystyle \hbox{Frob}_q = \hbox{Frob}_{\sqrt{q}}^2$

with ${\hbox{Frob}_{\sqrt{q}}}$ also preserving ${C}$. One can then also interpret ${|C({\bf F}_q)|}$ as the number of points of intersection between the curve

$\displaystyle C_1 := \{ (z, \hbox{Frob}_{\sqrt{q}}(z)): z \in C \} \ \ \ \ \ (2)$

and its transpose

$\displaystyle C_2 := \{ (\hbox{Frob}_{\sqrt{q}}(w), w): w \in C \}.$
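As a sanity check on the square root of Frobenius, one can take the affine line over ${{\bf F}_9}$, so that ${\sqrt{q}=3}$. The sketch below models ${{\bf F}_9}$ as ${{\bf F}_3[i]/(i^2+1)}$ (an illustrative choice, as are the names) and confirms that ${\hbox{Frob}_3}$ fixes only the three elements of ${{\bf F}_3}$, while every point ${(z, \hbox{Frob}_3(z))}$ of the graph also lies on the transposed graph, since ${\hbox{Frob}_3^2 = \hbox{Frob}_9}$ is the identity on ${{\bf F}_9}$.

```python
p = 3   # sqrt(q) = 3, q = 9; elements of F_9 are pairs (s, t) meaning s + t*i with i^2 = -1

def mul(u, v):
    return ((u[0] * v[0] - u[1] * v[1]) % p, (u[0] * v[1] + u[1] * v[0]) % p)

def power(u, n):
    result = (1, 0)
    for _ in range(n):
        result = mul(result, u)
    return result

def frob_sqrt_q(u):
    """The map z -> z^3 on F_9."""
    return power(u, 3)

F9 = [(s, t) for s in range(p) for t in range(p)]

# Fixed points of Frob_3: exactly the prime field F_3.
fixed_by_sqrt = [z for z in F9 if frob_sqrt_q(z) == z]

# (z, Frob_3 z) lies on the transposed graph exactly when Frob_3(Frob_3 z) = z.
on_both_graphs = [z for z in F9 if frob_sqrt_q(frob_sqrt_q(z)) == z]

print(len(fixed_by_sqrt), len(on_both_graphs))
```

Together with the point at infinity, the nine affine intersection points account for the ${q+1 = 10}$ points of the projective line over ${{\bf F}_9}$.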

Let ${k(C \times C)}$ be the field of rational functions on ${C \times C}$ (with coefficients in ${k}$), and define ${k(C_1)}$, ${k(C_2)}$, and ${k(C_1 \cap C_2)}$ analogously (although ${C_1 \cap C_2}$ is likely to be disconnected, so ${k(C_1 \cap C_2)}$ will just be a ring rather than a field). We then (morally) have the commuting square

$\displaystyle \begin{array}{ccccc} && k(C \times C) && \\ & \swarrow & & \searrow & \\ k(C_1) & & & & k(C_2) \\ & \searrow & & \swarrow & \\ && k(C_1 \cap C_2) && \end{array},$

if we ignore the issue that a rational function on, say, ${C \times C}$, might blow up on all of ${C_1}$ and thus not have a well-defined restriction to ${C_1}$. We use ${\pi_1: k(C \times C) \rightarrow k(C_1)}$ and ${\pi_2: k(C \times C) \rightarrow k(C_2)}$ to denote the restriction maps. Furthermore, we have obvious isomorphisms ${\iota_1: k(C_1) \rightarrow k(C)}$, ${\iota_2: k(C_2) \rightarrow k(C)}$ coming from composing with the graphing maps ${z \mapsto (z, \hbox{Frob}_{\sqrt{q}}(z))}$ and ${w \mapsto (\hbox{Frob}_{\sqrt{q}}(w), w)}$.

The idea now is to find a rational function ${f \in k(C \times C)}$ on the surface ${C \times C}$ of controlled degree which vanishes when restricted to ${C_2}$, but is non-vanishing (and not blowing up) when restricted to ${C_1}$. On ${C_1}$, we thus get a non-zero rational function ${f \downharpoonright_{C_1}}$ of controlled degree which vanishes on ${C_1 \cap C_2}$, which then lets us bound the cardinality of ${C_1 \cap C_2}$ in terms of the degree of ${f \downharpoonright_{C_1}}$. (In Bombieri’s original argument, one required vanishing to high order on the ${C_2}$ side, but in our presentation, we have factored out a ${\hbox{Frob}_{\sqrt{q}}}$ term which removes this high order vanishing condition.)

To find this ${f}$, we will use linear algebra. Namely, we will locate a finite-dimensional subspace ${V}$ of ${k(C \times C)}$ (consisting of certain “controlled degree” rational functions) which projects injectively to ${k(C_1)}$, but whose projection to ${k(C_2)}$ has strictly smaller dimension than ${V}$ itself. The rank-nullity theorem then forces the existence of a non-zero element ${f}$ of ${V}$ whose projection to ${k(C_2)}$ vanishes, but whose projection to ${k(C_1)}$ is non-zero.

Now we build ${V}$. Pick an ${{\bf F}_q}$-point ${P_\infty}$ of ${C}$, which we will think of as being a point at infinity. (For the purposes of proving Theorem 2, we may clearly assume that ${C({\bf F}_q)}$ is non-empty.) Thus ${P_\infty}$ is fixed by ${\hbox{Frob}_q}$. To simplify the exposition, we will also assume that ${P_\infty}$ is fixed by the square root ${\hbox{Frob}_{\sqrt{q}}}$ of ${\hbox{Frob}_q}$; in the opposite case when ${\hbox{Frob}_{\sqrt{q}}}$ has order two when acting on ${P_\infty}$, the argument is essentially the same, but all references to ${P_\infty}$ in the second factor of ${C \times C}$ need to be replaced by ${\hbox{Frob}_{\sqrt{q}} P_\infty}$ (we leave the details to the interested reader).

For any natural number ${n}$, define ${R_n}$ to be the set of rational functions ${f \in k(C)}$ which are allowed to have a pole of order up to ${n}$ at ${P_\infty}$, but have no other poles on ${C}$; note that as we are assuming ${C}$ to be smooth, it is unambiguous what a pole is (and what order it will have). (In the fancier language of divisors and Čech cohomology, we have ${R_n = H^0( C, {\mathcal O}_C(n P_\infty) )}$.) The space ${R_n}$ is clearly a vector space over ${k}$; one can view it intuitively as the space of “polynomials” on ${C}$ of “degree” at most ${n}$. When ${n=0}$, ${R_0}$ consists just of the constant functions. Indeed, if ${f \in R_0}$, then the image ${f(C)}$ of ${f}$ avoids ${\infty}$ and so lies in the affine line ${k = {\mathbf P}^1 \backslash \{\infty\}}$; but as ${C}$ is projective, the image ${f(C)}$ needs to be compact (hence closed) in ${{\mathbf P}^1}$, and must therefore be a point, giving the claim.

For higher ${n \geq 1}$, we have the easy relations

$\displaystyle \hbox{dim}(R_{n-1}) \leq \hbox{dim}(R_n) \leq \hbox{dim}(R_{n-1})+1. \ \ \ \ \ (3)$

The former inequality just comes from the trivial inclusion ${R_{n-1} \subset R_n}$. For the latter, observe that if two functions ${f, g}$ lie in ${R_n}$, so that they each have a pole of order at most ${n}$ at ${P_\infty}$, then some non-trivial linear combination of these functions must have a pole of order at most ${n-1}$ at ${P_\infty}$; thus ${R_{n-1}}$ has codimension at most one in ${R_n}$, giving the claim.

From (3) and induction we see that each of the ${R_n}$ is finite dimensional, with the trivial upper bound

$\displaystyle \hbox{dim}(R_n) \leq n+1. \ \ \ \ \ (4)$

Riemann’s inequality complements this with the lower bound

$\displaystyle \hbox{dim}(R_n) \geq n+1-g, \ \ \ \ \ (5)$

thus one has ${\hbox{dim}(R_n) = \hbox{dim}(R_{n-1})+1}$ for all but at most ${g}$ exceptions (in fact, exactly ${g}$ exceptions as it turns out). This is a consequence of the Riemann-Roch theorem; it can be proven from abstract nonsense (the snake lemma) if one defines the genus ${g}$ in a non-standard fashion (as the dimension of the first Čech cohomology ${H^1(C, {\mathcal O}_C)}$ of the structure sheaf ${{\mathcal O}_C}$ of ${C}$), but to obtain this inequality with a standard definition of ${g}$ (e.g. as the dimension of the zeroth Čech cohomology ${H^0(C, \Omega_C^1)}$ of the line bundle of differentials) requires the more non-trivial tool of Serre duality.
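For the elliptic curve example, the bounds (3), (4), (5) can be made completely explicit. The sketch below relies on two standard facts that are not established in the text above, and should be read as assumptions: on a Weierstrass curve with ${P_\infty = [0,1,0]}$, the coordinate functions ${x}$ and ${y}$ have poles of order ${2}$ and ${3}$ at ${P_\infty}$, and ${\hbox{dim}(R_n)}$ equals the number of pole orders ${\leq n}$ attained by monomials in ${x,y}$. The function names are mine.

```python
def pole_orders(generators, n):
    """Pole orders <= n attained by monomials in functions with the given pole orders."""
    reachable = {0}                       # constants have pole order 0
    for order in range(1, n + 1):
        if any(order - gen in reachable for gen in generators):
            reachable.add(order)
    return reachable

def dim_R(n, generators=(2, 3)):
    """dim(R_n) for the elliptic curve, under the assumptions stated in the lead-in."""
    return len(pole_orders(generators, n))

g = 1
dims = [dim_R(n) for n in range(12)]
print(dims)

# The values of n >= 1 at which the dimension fails to increase.
exceptions = [n for n in range(1, 12) if dim_R(n) == dim_R(n - 1)]
print(exceptions)
```

In this example the only missing pole order is ${1}$, matching the claim that there are exactly ${g}$ exceptions.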

At any rate, now that we have these vector spaces ${R_n}$, we will define ${V \subset k(C \times C)}$ to be a tensor product space

$\displaystyle V = R_\ell \otimes R_m$

for some natural numbers ${\ell, m \geq 0}$ which we will optimise later. That is to say, ${V}$ is spanned by functions of the form ${(z,w) \mapsto f(z) g(w)}$ with ${f \in R_\ell}$ and ${g \in R_m}$. This is clearly a linear subspace of ${k(C \times C)}$ of dimension ${\hbox{dim}(R_\ell) \hbox{dim}(R_m)}$, and hence by Riemann’s inequality we have

$\displaystyle \hbox{dim}(V) \geq (\ell+1-g) (m+1-g) \ \ \ \ \ (6)$

if

$\displaystyle \ell,m \geq g-1. \ \ \ \ \ (7)$

Observe that ${\iota_1 \circ \pi_1}$ maps a tensor product ${(z,w) \mapsto f(z) g(w)}$ to a function ${z \mapsto f(z) g(\hbox{Frob}_{\sqrt{q}} z)}$. If ${f \in R_\ell}$ and ${g \in R_m}$, then we see that the function ${z \mapsto f(z) g(\hbox{Frob}_{\sqrt{q}} z)}$ has a pole of order at most ${\ell+m\sqrt{q}}$ at ${P_\infty}$. We conclude that

$\displaystyle \iota_1 \circ \pi_1( V ) \subset R_{\ell + m\sqrt{q}} \ \ \ \ \ (8)$

and in particular by (4)

$\displaystyle \hbox{dim}(\pi_1(V)) \leq \ell + m \sqrt{q} + 1 \ \ \ \ \ (9)$

and similarly

$\displaystyle \hbox{dim}(\pi_2(V)) \leq \ell \sqrt{q} + m + 1. \ \ \ \ \ (10)$

We will choose ${m}$ to be a bit bigger than ${\ell}$, to make the ${\pi_2}$ image of ${V}$ smaller than that of ${\pi_1}$. From (6), (10) we see that if we have the inequality

$\displaystyle (\ell+1-g) (m+1-g) > \ell \sqrt{q}+m + 1 \ \ \ \ \ (11)$

(together with (7)) then ${\pi_2}$ cannot be injective.

On the other hand, we have the following basic fact:

Lemma 3 (Injectivity) If

$\displaystyle \ell < \sqrt{q}, \ \ \ \ \ (12)$

then ${\pi_1: V \rightarrow \pi_1(V)}$ is injective.

Proof: From (3), we can find a linear basis ${f_1,\dots,f_a}$ of ${R_\ell}$ such that each of the ${f_i}$ has a distinct order ${d_i}$ of pole at ${P_\infty}$ (somewhere between ${0}$ and ${\ell}$ inclusive). Similarly, we may find a linear basis ${g_1,\dots,g_b}$ of ${R_m}$ such that each of the ${g_j}$ has a distinct order ${e_j}$ of pole at ${P_\infty}$ (somewhere between ${0}$ and ${m}$ inclusive). The functions ${z \mapsto f_i(z) g_j(\hbox{Frob}_{\sqrt{q}} z)}$ then span ${\iota_1(\pi_1(V))}$, and the order of pole at ${P_\infty}$ is ${d_i + \sqrt{q} e_j}$. But since ${\ell < \sqrt{q}}$, these orders are all distinct, and so these functions must be linearly independent. The claim follows. $\Box$
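The arithmetic at the heart of this proof is that ${d + \sqrt{q} e}$ determines the pair ${(d,e)}$ as long as ${0 \leq d < \sqrt{q}}$. A small Python check of this (over all conceivable orders ${0 \leq d \leq \ell}$, ${0 \leq e \leq m}$, which is a worst case since only some of these orders actually occur; the function name is mine):

```python
def pole_orders_distinct(l, m, sqrt_q):
    """True if d + sqrt_q * e takes distinct values over 0 <= d <= l, 0 <= e <= m."""
    orders = [d + sqrt_q * e for d in range(l + 1) for e in range(m + 1)]
    return len(orders) == len(set(orders))

print(pole_orders_distinct(4, 7, 5))   # l < sqrt(q)
print(pole_orders_distinct(5, 7, 5))   # l = sqrt(q): (d, e) = (5, 0) and (0, 1) collide
```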

This gives us the following bound:

Proposition 4 Let ${\ell,m}$ be natural numbers such that (7), (11), (12) hold. Then ${|C({\bf F}_q)| \leq \ell + m \sqrt{q}}$.

Proof: As ${\pi_2}$ is not injective, we can find a non-zero ${f \in V}$ with ${\pi_2(f)}$ vanishing. By the above lemma, the function ${\iota_1(\pi_1(f))}$ is then non-zero, but it must also vanish on ${\iota_1(C_1 \cap C_2)}$, which has cardinality ${|C({\bf F}_q)|}$. On the other hand, by (8), ${\iota_1(\pi_1(f))}$ has a pole of order at most ${\ell+m\sqrt{q}}$ at ${P_\infty}$ and no other poles. Since a non-zero rational function on a projective curve has as many zeroes as poles (counting multiplicity), the claim follows. $\Box$

If ${q \geq (g+1)^4}$, we may make the explicit choice

$\displaystyle m := \sqrt{q}+2g; \quad \ell := \lfloor \frac{g}{g+1} \sqrt{q} \rfloor + g + 1$

and a brief calculation then gives Theorem 2. In some cases one can optimise things a bit further. For instance, in the genus zero case ${g=0}$ (e.g. if ${C}$ is just the projective line ${{\mathbf P}^1}$) one may take ${\ell=1, m = \sqrt{q}}$ and conclude the absolutely sharp bound ${|C({\bf F}_q)| \leq q+1}$ in this case; in the case of the projective line ${{\mathbf P}^1}$, the function ${f}$ is in fact the very concrete function ${f(z,w) := z - w^{\sqrt{q}}}$.
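The hypotheses of Proposition 4 are easy to test mechanically for any proposed pair ${(\ell,m)}$. The sketch below (function names and the search cutoff `m_max` are mine) checks (7), (11), (12) exactly as stated above, and searches for the admissible pair minimising ${\ell + m\sqrt{q}}$; in the genus zero case it recovers the choice ${\ell=1}$, ${m=\sqrt{q}}$ and the bound ${q+1}$.

```python
def admissible(g, sqrt_q, l, m):
    """Check hypotheses (7), (11), (12) of Proposition 4 for the pair (l, m)."""
    cond_7 = l >= g - 1 and m >= g - 1
    cond_11 = (l + 1 - g) * (m + 1 - g) > l * sqrt_q + m + 1
    cond_12 = l < sqrt_q
    return cond_7 and cond_11 and cond_12

def best_bound(g, sqrt_q, m_max):
    """Smallest (l + m*sqrt_q, l, m) over admissible pairs with m <= m_max, or None."""
    candidates = [(l + m * sqrt_q, l, m)
                  for l in range(sqrt_q)
                  for m in range(m_max + 1)
                  if admissible(g, sqrt_q, l, m)]
    return min(candidates) if candidates else None

# Genus zero with sqrt(q) = 3, i.e. q = 9: the bound is q + 1 = 10, attained at (l, m) = (1, 3).
print(best_bound(0, 3, m_max=20))
```

The search is brute force and only meant for small parameters; for larger genus one would work with the inequalities directly.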

Remark 1 When ${q = p^{2n+1}}$ is not a perfect square, one can try to run the above argument using the factorisation ${\hbox{Frob}_q = \hbox{Frob}_{p^n} \hbox{Frob}_{p^{n+1}}}$ instead of ${\hbox{Frob}_q = \hbox{Frob}_{\sqrt{q}} \hbox{Frob}_{\sqrt{q}}}$. This gives a weaker version of the above bound, of the shape ${|C({\bf F}_q)| \leq q + O( \sqrt{p} \sqrt{q} )}$. In the hyperelliptic case at least, one can erase this loss by working with a variant of the argument in which one requires ${f}$ to vanish to high order at ${C_2}$, rather than just to first order; see this survey article of mine for details.

Let ${f: {\bf R}^3 \rightarrow {\bf R}}$ be an irreducible polynomial in three variables. As ${{\bf R}}$ is not algebraically closed, the zero set ${Z_{\bf R}(f) = \{ x \in{\bf R}^3: f(x)=0\}}$ can split into various components of dimension between ${0}$ and ${2}$. For instance, if ${f(x_1,x_2,x_3) = x_1^2+x_2^2}$, the zero set ${Z_{\bf R}(f)}$ is a line; more interestingly, if ${f(x_1,x_2,x_3) = x_3^2 + x_2^2 - x_2^3}$, then ${Z_{\bf R}(f)}$ is the union of a line and a surface (or the product of an acnodal cubic curve with a line). We will assume that the ${2}$-dimensional component ${Z_{{\bf R},2}(f)}$ is non-empty, thus defining a real surface in ${{\bf R}^3}$. In particular, this hypothesis implies that ${f}$ is not just irreducible over ${{\bf R}}$, but is in fact absolutely irreducible (i.e. irreducible over ${{\bf C}}$), since otherwise one could use the complex factorisation of ${f}$ to contain ${Z_{\bf R}(f)}$ inside the intersection ${Z_{\bf C}(g) \cap Z_{\bf C}(\bar{g})}$ of the complex zero locus of a complex polynomial ${g}$ and its complex conjugate, with ${g,\bar{g}}$ having no common factor, forcing ${Z_{\bf R}(f)}$ to be at most one-dimensional. (For instance, in the case ${f(x_1,x_2,x_3)=x_1^2+x_2^2}$, one can take ${g(z_1,z_2,z_3) = z_1 + i z_2}$.) Among other things, this makes ${Z_{{\bf R},2}(f)}$ a Zariski-dense subset of ${Z_{\bf C}(f)}$, thus any polynomial identity which holds true at every point of ${Z_{{\bf R},2}(f)}$ also holds true on all of ${Z_{\bf C}(f)}$. This allows us to easily use tools from algebraic geometry in this real setting, even though the reals are not quite algebraically closed.

The surface ${Z_{{\bf R},2}(f)}$ is said to be ruled if, for a Zariski open dense set of points ${x \in Z_{{\bf R},2}(f)}$, there exists a line ${l_x = \{ x+tv_x: t \in {\bf R} \}}$ through ${x}$ for some non-zero ${v_x \in {\bf R}^3}$ which is completely contained in ${Z_{{\bf R},2}(f)}$, thus

$\displaystyle f(x+tv_x)=0$

for all ${t \in {\bf R}}$. Also, a point ${x \in Z_{{\bf R},2}(f)}$ is said to be a flecnode if there exists a line ${l_x = \{ x+tv_x: t \in {\bf R}\}}$ through ${x}$ for some non-zero ${v_x \in {\bf R}^3}$ which is tangent to ${Z_{{\bf R},2}(f)}$ to third order, in the sense that

$\displaystyle f(x+tv_x)=O(t^4)$

as ${t \rightarrow 0}$, or equivalently that

$\displaystyle \frac{d^j}{dt^j} f(x+tv_x)|_{t=0} = 0 \ \ \ \ \ (1)$

for ${j=0,1,2,3}$. Clearly, if ${Z_{{\bf R},2}(f)}$ is a ruled surface, then a Zariski open dense set of points on ${Z_{{\bf R},2}(f)}$ are flecnodes. We then have the remarkable theorem (discovered first by Monge, and then later by Cayley and Salmon) asserting the converse:

Theorem 1 (Monge-Cayley-Salmon theorem) Let ${f: {\bf R}^3 \rightarrow {\bf R}}$ be an irreducible polynomial with ${Z_{{\bf R},2}(f)}$ non-empty. Suppose that a Zariski dense set of points in ${Z_{{\bf R},2}(f)}$ are flecnodes. Then ${Z_{{\bf R},2}(f)}$ is a ruled surface.

Among other things, this theorem was used in the celebrated result of Guth and Katz that almost solved the Erdős distance problem in two dimensions, as discussed in this previous blog post. Vanishing to third order is necessary: observe that in a surface of negative curvature, such as the saddle ${\{ (x_1,x_2,x_3): x_3 = x_1^2 - x_2^2 \}}$, every point on the surface is tangent to second order to a line (the line in the direction for which the second fundamental form vanishes). This surface happens to be ruled, but a generic perturbation of this surface (e.g. ${x_3 = x_1^2 - x_2^2 + x_2^4}$) will no longer be ruled, although it still has negative curvature near the origin.
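The difference between second and third order tangency can be seen by expanding ${f(x+tv)}$ in powers of ${t}$ with exact arithmetic. In the Python sketch below, the small `Poly` class, the sample points and the direction vectors are illustrative choices of mine (the directions were picked by hand so that the second order terms cancel): on the saddle the whole expansion vanishes, while on the perturbed surface the coefficients of ${t^0,t^1,t^2}$ vanish but that of ${t^3}$ does not.

```python
from fractions import Fraction

class Poly:
    """Polynomial in t with exact rational coefficients, lowest degree first."""
    def __init__(self, coeffs):
        self.c = [Fraction(x) for x in coeffs]
    def __add__(self, other):
        n = max(len(self.c), len(other.c))
        a = self.c + [0] * (n - len(self.c))
        b = other.c + [0] * (n - len(other.c))
        return Poly([x + y for x, y in zip(a, b)])
    def __neg__(self):
        return Poly([-x for x in self.c])
    def __sub__(self, other):
        return self + (-other)
    def __mul__(self, other):
        out = [Fraction(0)] * (len(self.c) + len(other.c) - 1)
        for i, x in enumerate(self.c):
            for j, y in enumerate(other.c):
                out[i + j] += x * y
        return Poly(out)
    def __pow__(self, n):
        result = Poly([1])
        for _ in range(n):
            result = result * self
        return result

def restrict_to_line(f, x, v, degree):
    """Coefficients of t^0, ..., t^degree in f(x + t v)."""
    line = [Poly([xi, vi]) for xi, vi in zip(x, v)]
    coeffs = f(*line).c
    return [coeffs[k] if k < len(coeffs) else Fraction(0) for k in range(degree + 1)]

saddle = lambda x1, x2, x3: x3 - x1**2 + x2**2
perturbed = lambda x1, x2, x3: x3 - x1**2 + x2**2 - x2**4

# Saddle: the line through (3, 1, 8) in direction (1, 1, 4) lies in the surface.
print(restrict_to_line(saddle, (3, 1, 8), (1, 1, 4), 4))

# Perturbed surface: second order tangency at this point, but not third order.
point = (1, Fraction(2, 5), Fraction(541, 625))
direction = (1, 5, Fraction(-18, 25))
print(restrict_to_line(perturbed, point, direction, 4))
```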

The original proof of the Monge-Cayley-Salmon theorem is not easily accessible and not written in modern language. A modern proof of this theorem (together with substantial generalisations, for instance to higher dimensions) is given by Landsberg; the proof uses the machinery of modern algebraic geometry. The purpose of this post is to record an alternate proof of the Monge-Cayley-Salmon theorem based on classical differential geometry (in particular, the notion of torsion of a curve) and basic ODE methods (in particular, Gronwall’s inequality and the Picard existence theorem). The idea is to “integrate” the lines ${l_x}$ indicated by the flecnode to produce smooth curves ${\gamma}$ on the surface ${Z_{{\bf R},2}(f)}$; one then uses the vanishing (1) and some basic calculus to conclude that these curves have zero torsion and are thus planar curves. Some further manipulation using (1) (now just to second order instead of third) then shows that these curves are in fact straight lines, giving the ruling on the surface.

Update: János Kollár has informed me that the above theorem was essentially known to Monge in 1809; see his recent arXiv note for more details.

I thank Larry Guth and Micha Sharir for conversations leading to this post.

The classical foundations of probability theory (discussed for instance in this previous blog post) are founded on the notion of a probability space ${(\Omega, {\cal E}, {\bf P})}$ – a space ${\Omega}$ (the sample space) equipped with a ${\sigma}$-algebra ${{\cal E}}$ (the event space), together with a countably additive probability measure ${{\bf P}: {\cal E} \rightarrow [0,1]}$ that assigns a real number in the interval ${[0,1]}$ to each event.

One can generalise the concept of a probability space to a finitely additive probability space, in which the event space ${{\cal E}}$ is now only a Boolean algebra rather than a ${\sigma}$-algebra, and the measure ${{\bf P}}$ is now only finitely additive instead of countably additive, thus ${{\bf P}( E \vee F ) = {\bf P}(E) + {\bf P}(F)}$ when ${E,F}$ are disjoint events. By giving up countable additivity, one loses a fair amount of measure and integration theory, and in particular the notion of the expectation of a random variable becomes problematic (unless the random variable takes only finitely many values). Nevertheless, one can still perform a fair amount of probability theory in this weaker setting.

In this post I would like to describe a further weakening of probability theory, which I will call qualitative probability theory, in which one does not assign a precise numerical probability value ${{\bf P}(E)}$ to each event, but instead merely records whether this probability is zero, one, or something in between. Thus ${{\bf P}}$ is now a function from ${{\cal E}}$ to the set ${\{0, I, 1\}}$, where ${I}$ is a new symbol that replaces all the elements of the open interval ${(0,1)}$. In this setting, one can no longer compute quantitative expressions, such as the mean or variance of a random variable; but one can still talk about whether an event holds almost surely, with positive probability, or with zero probability, and there are still usable notions of independence. (I will refer to classical probability theory as quantitative probability theory, to distinguish it from its qualitative counterpart.)

The main reason I want to introduce this weak notion of probability theory is that it is well suited to discussing random variables living inside algebraic varieties, even if these varieties are defined over fields other than ${{\bf R}}$ or ${{\bf C}}$. In algebraic geometry one often talks about a “generic” element of a variety ${V}$ defined over a field ${k}$, which does not lie in any specified variety of lower dimension defined over ${k}$. Once ${V}$ has positive dimension, such generic elements do not exist as classical, deterministic ${k}$-points ${x}$ in ${V}$, since of course any such point lies in the ${0}$-dimensional subvariety ${\{x\}}$ of ${V}$. There are of course several established ways to deal with this problem. One way (which one might call the “Weil” approach to generic points) is to extend the field ${k}$ to a sufficiently transcendental extension ${\tilde k}$, in order to locate a sufficient number of generic points in ${V(\tilde k)}$. Another approach (which one might dub the “Zariski” approach to generic points) is to work scheme-theoretically, and interpret a generic point in ${V}$ as being associated to the zero ideal in the function ring of ${V}$. However, I want to discuss a third perspective, in which one interprets a generic point not as a deterministic object, but rather as a random variable ${{\bf x}}$ taking values in ${V}$, but which lies in any given lower-dimensional subvariety of ${V}$ with probability zero. This interpretation is intuitive, but difficult to implement in classical probability theory (except perhaps when considering varieties over ${{\bf R}}$ or ${{\bf C}}$) due to the lack of a natural probability measure to place on algebraic varieties; however it works just fine in qualitative probability theory. In particular, the algebraic geometry notion of being “generically true” can now be interpreted probabilistically as an assertion that something is “almost surely true”.

It turns out that just as qualitative random variables may be used to interpret the concept of a generic point, they can also be used to interpret the concept of a type in model theory; the type of a random variable ${x}$ is the set of all predicates ${\phi(x)}$ that are almost surely obeyed by ${x}$. In contrast, model theorists often adopt a Weil-type approach to types, in which one works with deterministic representatives of a type, which often do not occur in the original structure of interest, but only in a sufficiently saturated extension of that structure (this is the analogue of working in a sufficiently transcendental extension of the base field). However, it seems that (in some cases at least) one can equivalently view types in terms of (qualitative) random variables on the original structure, avoiding the need to extend that structure. (Instead, one reserves the right to extend the sample space of one’s probability theory whenever necessary, as part of the “probabilistic way of thinking” discussed in this previous blog post.) We illustrate this below the fold with two related theorems that I will interpret through the probabilistic lens: the “group chunk theorem” of Weil (and later developed by Hrushovski), and the “group configuration theorem” of Zilber (and again later developed by Hrushovski). For the sake of concreteness we will only consider these theorems in the theory of algebraically closed fields, although the results are quite general and can be applied to many other theories studied in model theory.

Let ${F}$ be a field. A definable set over ${F}$ is a set of the form

$\displaystyle \{ x \in F^n | \phi(x) \hbox{ is true} \} \ \ \ \ \ (1)$

where ${n}$ is a natural number, and ${\phi(x)}$ is a predicate involving the ring operations ${+,\times}$ of ${F}$, the equality symbol ${=}$, an arbitrary number of constants and free variables in ${F}$, the quantifiers ${\forall, \exists}$, boolean operators such as ${\vee,\wedge,\neg}$, and parentheses and colons, where the quantifiers are always understood to be over the field ${F}$. Thus, for instance, the set of quadratic residues

$\displaystyle \{ x \in F | \exists y: x = y \times y \}$

is definable over ${F}$, and any algebraic variety over ${F}$ is also a definable set over ${F}$. Henceforth we will abbreviate “definable over ${F}$” simply as “definable”.
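Over a finite field, a predicate of this type can be evaluated by exhaustive search over the quantified variables. A minimal Python sketch for the quadratic residue example over a prime field ${{\bf F}_p}$ (the function name is mine):

```python
def quadratic_residues(p):
    """The definable set { x in F_p | exists y: x = y * y }, evaluated by brute force."""
    return {x for x in range(p) if any((y * y - x) % p == 0 for y in range(p))}

print(sorted(quadratic_residues(7)))
print(len(quadratic_residues(101)))
```

For an odd prime ${p}$ this set has ${(p+1)/2}$ elements (the non-zero squares together with zero), so it occupies roughly half of the field.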

If ${F}$ is a finite field, then every subset of ${F^n}$ is definable, since finite sets are automatically definable. However, we can obtain a more interesting notion in this case by restricting the complexity of a definable set. We say that ${E \subset F^n}$ is a definable set of complexity at most ${M}$ if ${n \leq M}$, and ${E}$ can be written in the form (1) for some predicate ${\phi}$ of length at most ${M}$ (where all operators, quantifiers, relations, variables, constants, and punctuation symbols are considered to have unit length). Thus, for instance, a hypersurface in ${n}$ dimensions of degree ${d}$ would be a definable set of complexity ${O_{n,d}(1)}$. We will then be interested in the regime where the complexity remains bounded, but the field size (or field characteristic) becomes large.

In a recent paper, I established (in the large characteristic case) the following regularity lemma for dense definable graphs, which significantly strengthens the Szemerédi regularity lemma in this context, by eliminating “bad” pairs, giving a polynomially strong regularity, and also giving definability of the cells:

Lemma 1 (Algebraic regularity lemma) Let ${F}$ be a finite field, let ${V,W}$ be definable non-empty sets of complexity at most ${M}$, and let ${E \subset V \times W}$ also be definable with complexity at most ${M}$. Assume that the characteristic of ${F}$ is sufficiently large depending on ${M}$. Then we may partition ${V = V_1 \cup \ldots \cup V_m}$ and ${W = W_1 \cup \ldots \cup W_n}$ with ${m,n = O_M(1)}$, with the following properties:

• (Definability) Each of the ${V_1,\ldots,V_m,W_1,\ldots,W_n}$ is definable of complexity ${O_M(1)}$.
• (Size) We have ${|V_i| \gg_M |V|}$ and ${|W_j| \gg_M |W|}$ for all ${i=1,\ldots,m}$ and ${j=1,\ldots,n}$.
• (Regularity) We have

$\displaystyle |E \cap (A \times B)| = d_{ij} |A| |B| + O_M( |F|^{-1/4} |V| |W| ) \ \ \ \ \ (2)$

for all ${i=1,\ldots,m}$, ${j=1,\ldots,n}$, ${A \subset V_i}$, and ${B\subset W_j}$, where ${d_{ij}}$ is a rational number in ${[0,1]}$ with numerator and denominator ${O_M(1)}$.
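To get a feel for the regularity estimate (2), here is a hedged numerical sketch on a standard example that is not taken from the discussion above: the Paley-type graph ${E = \{ (v,w): v-w \hbox{ is a non-zero square} \}}$ on ${V = W = {\bf F}_p}$, for which a single cell with density ${1/2}$ should suffice. The helper names `quadratic_character` and `edge_count` are made up for the example, and the bound tested is the classical character sum bound ${|\sum_{a \in A, b \in B} \chi(a-b)| \leq \sqrt{p |A| |B|}}$, which for this particular graph is stronger than what (2) asserts.

```python
import math


def quadratic_character(p):
    """Return a list chi with chi[x] = 1 if x is a non-zero square mod p,
    -1 if x is a non-square, and 0 if x = 0 (p is assumed an odd prime)."""
    chi = [-1] * p
    chi[0] = 0
    for y in range(1, p):
        chi[(y * y) % p] = 1
    return chi


def edge_count(p, A, B):
    """Count pairs (v, w) in A x B with v - w a non-zero square mod p."""
    chi = quadratic_character(p)
    return sum(1 for v in A for w in B if chi[(v - w) % p] == 1)


p = 101
A = range(0, 50)    # an arbitrary subset of F_p
B = range(30, 90)   # another arbitrary subset of F_p

count = edge_count(p, A, B)
expected = len(A) * len(B) / 2          # density 1/2 times |A||B|
overlap = len(set(A) & set(B))          # pairs with v = w are never edges
bound = (overlap + math.sqrt(p * len(A) * len(B))) / 2

print(count, expected, abs(count - expected) <= bound)
```

The sets ${A}$ and ${B}$ here are intervals purely for reproducibility; the bound being checked holds for arbitrary subsets.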

My original proof of this lemma was quite complicated, based on an explicit calculation of the “square”

$\displaystyle \mu(w,w') := \{ v \in V: (v,w), (v,w') \in E \}$

of ${E}$ using the Lang-Weil bound and some facts about the étale fundamental group. It was the reliance on the latter which was the main reason why the result was restricted to the large characteristic setting. (I then applied this lemma to classify expanding polynomials over finite fields of large characteristic, but I will not discuss these applications here; see this previous blog post for more discussion.)

Recently, Anand Pillay and Sergei Starchenko (and independently, Udi Hrushovski) have observed that the theory of the étale fundamental group is not necessary in the argument, and the lemma can in fact be deduced from quite general model theoretic techniques, in particular using (a local version of) the concept of stability. One of the consequences of this new proof of the lemma is that the hypothesis of large characteristic can be omitted; the lemma is now known to be valid for arbitrary finite fields ${F}$ (although its content is trivial if the field is not sufficiently large depending on the complexity bound ${M}$).

Inspired by this, I decided to see if I could find yet another proof of the algebraic regularity lemma, again avoiding the theory of the étale fundamental group. It turns out that the spectral proof of the Szemerédi regularity lemma (discussed in this previous blog post) adapts very nicely to this setting. The key fact needed about definable sets over finite fields is that their cardinality takes on an essentially discrete set of values. More precisely, we have the following fundamental result of Chatzidakis, van den Dries, and Macintyre:

Proposition 2 Let ${F}$ be a finite field, and let ${M > 0}$.

• (Discretised cardinality) If ${E}$ is a non-empty definable set of complexity at most ${M}$, then one has

$\displaystyle |E| = c |F|^d + O_M( |F|^{d-1/2} ) \ \ \ \ \ (3)$

where ${d = O_M(1)}$ is a natural number, and ${c}$ is a positive rational number with numerator and denominator ${O_M(1)}$. In particular, we have ${|F|^d \ll_M |E| \ll_M |F|^d}$.

• (Definable cardinality) Assume ${|F|}$ is sufficiently large depending on ${M}$. If ${V, W}$, and ${E \subset V \times W}$ are definable sets of complexity at most ${M}$, so that ${E_w := \{ v \in V: (v,w) \in E \}}$ can be viewed as a definable subset of ${V}$ that is definably parameterised by ${w \in W}$, then for each natural number ${d = O_M(1)}$ and each positive rational ${c}$ with numerator and denominator ${O_M(1)}$, the set

$\displaystyle \{ w \in W: |E_w| = c |F|^d + O_M( |F|^{d-1/2} ) \} \ \ \ \ \ (4)$

is definable with complexity ${O_M(1)}$, where the implied constants in the asymptotic notation used to define (4) are the same as those appearing in (3). (Informally: the “dimension” ${d}$ and “measure” ${c}$ of ${E_w}$ depend definably on ${w}$.)
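A toy instance of the definable cardinality property (a sketch under the assumption that ${F = {\bf F}_p}$ for an odd prime ${p}$, with the hypothetical helper name `fibre_sizes`): take ${V = W = F}$ and ${E = \{ (v,w): v \times v = w \}}$, so that the fibre ${E_w}$ is the set of square roots of ${w}$. The fibre size takes only the values ${0, 1, 2}$ (the case ${d=0}$), and each level set is visibly definable: for instance ${|E_w| = 2}$ exactly when ${w}$ is a non-zero square.

```python
from collections import defaultdict


def fibre_sizes(p):
    """Group each w in F_p by |E_w|, where E_w = {v in F_p : v*v = w}.

    Returns a dict mapping each fibre size to the set of w attaining it.
    """
    size = [0] * p
    for v in range(p):
        size[(v * v) % p] += 1      # v contributes to the fibre over v*v

    groups = defaultdict(set)
    for w in range(p):
        groups[size[w]].add(w)
    return dict(groups)


levels = fibre_sizes(7)
print(levels[2])  # w with two square roots: the non-zero squares {1, 2, 4}
print(levels[1])  # w with one square root: {0}
print(levels[0])  # w with no square root: the non-squares {3, 5, 6}
```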

We will take this proposition as a black box; a proof can be obtained by combining the description of definable sets over pseudofinite fields (discussed in this previous post) with the Lang-Weil bound (discussed in this previous post). (The former fact is phrased using nonstandard analysis, but one can use standard compactness-and-contradiction arguments to convert such statements to statements in standard analysis, as discussed in this post.)

The above proposition places severe restrictions on the cardinality of definable sets; for instance, it shows that one cannot have a definable set of complexity at most ${M}$ and cardinality ${|F|^{1/2}}$, if ${|F|}$ is sufficiently large depending on ${M}$. If ${E \subset V}$ are definable sets of complexity at most ${M}$, it shows that ${|E| = (c+ O_M(|F|^{-1/2})) |V|}$ for some rational ${0\leq c \leq 1}$ with numerator and denominator ${O_M(1)}$; furthermore, if ${c=0}$, we may improve this bound to ${|E| = O_M( |F|^{-1} |V|)}$. In particular, we obtain the following “self-improving” properties:

• If ${E \subset V}$ are definable of complexity at most ${M}$ and ${|E| \leq \epsilon |V|}$ for some ${\epsilon>0}$, then (if ${\epsilon}$ is sufficiently small depending on ${M}$ and ${|F|}$ is sufficiently large depending on ${M}$) this forces ${|E| = O_M( |F|^{-1} |V| )}$.
• If ${E \subset V}$ are definable of complexity at most ${M}$ and ${||E| - c |V|| \leq \epsilon |V|}$ for some ${\epsilon>0}$ and positive rational ${c}$, then (if ${\epsilon}$ is sufficiently small depending on ${M,c}$ and ${|F|}$ is sufficiently large depending on ${M,c}$) this forces ${|E| = c |V| + O_M( |F|^{-1/2} |V| )}$.
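To sketch why the first property holds (the symbols ${c_0}$, ${C_M}$ and ${D_M}$ are names introduced here for the implied quantities, not notation from the proposition): write ${|E| = (c_0 + O_M(|F|^{-1/2})) |V|}$ as above, where ${c_0}$ is a rational in ${[0,1]}$ with denominator at most ${D_M}$, and let ${C_M}$ be the implied constant in the error term. The hypothesis ${|E| \leq \epsilon |V|}$ then gives

$\displaystyle (c_0 - C_M |F|^{-1/2}) |V| \leq |E| \leq \epsilon |V|, \hbox{ and hence } c_0 \leq \epsilon + C_M |F|^{-1/2}.$

Since ${c_0}$ has denominator at most ${D_M}$, either ${c_0 = 0}$ or ${c_0 \geq 1/D_M}$. If ${\epsilon < \frac{1}{2D_M}}$ and ${|F|}$ is large enough that ${C_M |F|^{-1/2} < \frac{1}{2D_M}}$, we obtain

$\displaystyle c_0 < \frac{1}{2D_M} + \frac{1}{2D_M} = \frac{1}{D_M},$

which rules out the second alternative. Thus ${c_0 = 0}$, and the improved bound in the ${c_0=0}$ case gives ${|E| = O_M( |F|^{-1} |V| )}$. The second property follows from the same gap argument, applied to the difference ${c_0 - c}$ of two rationals of bounded height.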

It turns out that these self-improving properties can be applied to the coefficients of various matrices (basically powers of the adjacency matrix associated to ${E}$) that arise in the spectral proof of the regularity lemma to significantly improve the bounds in that lemma; we describe how this is done below the fold. We also make some connections to the stability-based proofs of Pillay-Starchenko and Hrushovski.