You are currently browsing the tag archive for the ‘polynomials’ tag.

I’ve just uploaded to the arXiv my paper “Sendov’s conjecture for sufficiently high degree polynomials“. This paper is a contribution to an old conjecture of Sendov on the zeroes of polynomials:

Conjecture 1 (Sendov’s conjecture) Let ${f: {\bf C} \rightarrow {\bf C}}$ be a polynomial of degree ${n \geq 2}$ that has all zeroes in the closed unit disk ${\{ z: |z| \leq 1 \}}$. If ${\lambda_0}$ is one of these zeroes, then ${f'}$ has at least one zero in ${\{z: |z-\lambda_0| \leq 1\}}$.

It is common in the literature on this problem to normalise ${f}$ to be monic, and to rotate the zero ${\lambda_0}$ to be an element ${a}$ of the unit interval ${[0,1]}$. As it turns out, the location of ${a}$ on this unit interval ${[0,1]}$ ends up playing an important role in the arguments.

Many cases of this conjecture are already known, for instance

• When ${n<9}$ (Brown-Xiang 1999);
• When ${a=0}$ (Gauss-Lucas theorem);
• When ${a \leq \frac{1}{n-1}}$ (Bojanov 2011);
• When ${c \leq a \leq 1-c}$ for a fixed ${c>0}$, and ${n}$ is sufficiently large depending on ${c}$ (Dégot 2014);
• When ${C n^{-1/7} \leq a \leq 1 - C n^{-1/4}}$ for a sufficiently large absolute constant ${C}$ (Chalebgwa 2020);
• When ${a=1}$ (Rubinstein 1968; Goodman-Rahman-Ratti 1969; Joyal 1969);
• When ${a \geq 1-\varepsilon_n}$, where ${\varepsilon_n>0}$ is sufficiently small depending on ${n}$ (Miller 1993; Vajaitu-Zaharescu 1993);
• When ${a \geq 1 - \frac{1}{2 n^9 4^n}}$ (Chijiwa 2011);
• When ${a \geq 1 - \frac{90}{n^{12} \log n}}$ (Kasmalkar 2014).

In particular, in high degrees the only cases left uncovered by prior results are when ${a}$ is close (but not too close) to ${0}$, or when ${a}$ is close (but not too close) to ${1}$; see Figure 1 of my paper.

Our main result covers the high degree case uniformly for all values of ${a \in [0,1]}$:

Theorem 2 There exists an absolute constant ${n_0}$ such that Sendov’s conjecture holds for all ${n \geq n_0}$.

In principle, this reduces the verification of Sendov’s conjecture to a finite time computation, although our arguments use compactness methods and thus do not easily provide an explicit value of ${n_0}$. I believe that the compactness arguments can be replaced with quantitative substitutes that provide an explicit ${n_0}$, but the value of ${n_0}$ produced is likely to be extremely large (certainly much larger than ${9}$).

Because of the previous results (particularly those of Chalebgwa and Chijiwa), we will only need to establish the following two subcases of the above theorem:

Theorem 3 (Sendov’s conjecture near the origin) Under the additional hypothesis ${a = o(1/\log n)}$, Sendov’s conjecture holds for sufficiently large ${n}$.

Theorem 4 (Sendov’s conjecture near the unit circle) Under the additional hypothesis ${1-o(1) \leq a \leq 1 - \varepsilon_0^n}$ for a fixed ${\varepsilon_0>0}$, Sendov’s conjecture holds for sufficiently large ${n}$.

We approach these theorems using the “compactness and contradiction” strategy, assuming that there is a sequence of counterexamples whose degrees ${n}$ going to infinity, using various compactness theorems to extract various asymptotic objects in the limit ${n \rightarrow \infty}$, and somehow using these objects to derive a contradiction. There are many ways to effect such a strategy; we will use a formalism that I call “cheap nonstandard analysis” and which is common in the PDE literature, in which one repeatedly passes to subsequences as necessary whenever one invokes a compactness theorem to create a limit object. However, the particular choice of asymptotic formalism one selects is not of essential importance for the arguments.

I also found it useful to use the language of probability theory. Given a putative counterexample ${f}$ to Sendov’s conjecture, let ${\lambda}$ be a zero of ${f}$ (chosen uniformly at random among the ${n}$ zeroes of ${f}$, counting multiplicity), and let ${\zeta}$ similarly be a uniformly random zero of ${f'}$. We introduce the logarithmic potentials

$\displaystyle U_\lambda(z) := {\bf E} \log \frac{1}{|z-\lambda|}; \quad U_\zeta(z) := {\bf E} \log \frac{1}{|z-\zeta|}$

and the Stieltjes transforms

$\displaystyle s_\lambda(z) := {\bf E} \frac{1}{z-\lambda}; \quad s_\zeta(z) := {\bf E} \log \frac{1}{z-\zeta}.$

Standard calculations using the fundamental theorem of algebra yield the basic identities

$\displaystyle U_\lambda(z) = \frac{1}{n} \log \frac{1}{|f(z)|}; \quad U_\zeta(z) = \frac{1}{n-1} \log \frac{n}{|f'(z)|}$

and

$\displaystyle s_\lambda(z) = \frac{1}{n} \frac{f'(z)}{f(z)}; \quad s_\zeta(z) = \frac{1}{n-1} \frac{f''(z)}{f'(z)} \ \ \ \ \ (1)$

and in particular the random variables ${\lambda, \zeta}$ are linked to each other by the identity

$\displaystyle U_\lambda(z) - \frac{n-1}{n} U_\zeta(z) = \frac{1}{n} \log |s_\lambda(z)|. \ \ \ \ \ (2)$

On the other hand, the hypotheses of Sendov’s conjecture (and the Gauss-Lucas theorem) place ${\lambda,\zeta}$ inside the unit disk ${\{ z:|z| \leq 1\}}$. Applying Prokhorov’s theorem, and passing to a subsequence, one can then assume that the random variables ${\lambda,\zeta}$ converge in distribution to some limiting random variables ${\lambda^{(\infty)}, \zeta^{(\infty)}}$ (possibly defined on a different probability space than the original variables ${\lambda,\zeta}$), also living almost surely inside the unit disk. Standard potential theory then gives the convergence

$\displaystyle U_\lambda(z) \rightarrow U_{\lambda^{(\infty)}}(z); \quad U_\zeta(z) \rightarrow U_{\zeta^{(\infty)}}(z) \ \ \ \ \ (3)$

and

$\displaystyle s_\lambda(z) \rightarrow s_{\lambda^{(\infty)}}(z); \quad s_\zeta(z) \rightarrow s_{\zeta^{(\infty)}}(z) \ \ \ \ \ (4)$

at least in the local ${L^1}$ sense. Among other things, we then conclude from the identity (2) and some elementary inequalities that

$\displaystyle U_{\lambda^{(\infty)}}(z) = U_{\zeta^{(\infty)}}(z)$

for all ${|z|>1}$. This turns out to have an appealing interpretation in terms of Brownian motion: if one takes two Brownian motions in the complex plane, one originating from ${\lambda^{(\infty)}}$ and one originating from ${\zeta^{(\infty)}}$, then the location where these Brownian motions first exit the unit disk ${\{ z: |z| \leq 1 \}}$ will have the same distribution. (In our paper we actually replace Brownian motion with the closely related formalism of balayage.) This turns out to connect the random variables ${\lambda^{(\infty)}}$, ${\zeta^{(\infty)}}$ quite closely to each other. In particular, with this observation and some additional arguments involving both the unique continuation property for harmonic functions and Grace’s theorem (discussed in this previous post), with the latter drawn from the prior work of Dégot, we can get very good control on these distributions:

Theorem 5
• (i) If ${a = o(1)}$, then ${\lambda^{(\infty)}, \zeta^{(\infty)}}$ almost surely lie in the semicircle ${\{ e^{i\theta}: \pi/2 \leq \theta \leq 3\pi/2\}}$ and have the same distribution.
• (ii) If ${a = 1-o(1)}$, then ${\lambda^{(\infty)}}$ is uniformly distributed on the circle ${\{ z: |z|=1\}}$, and ${\zeta^{(\infty)}}$ is almost surely zero.

In case (i) (and strengthening the hypothesis ${a=o(1)}$ to ${a=o(1/\log n)}$ to control some technical contributions of “outlier” zeroes of ${f}$), we can use this information about ${\lambda^{(\infty)}}$ and (4) to ensure that the normalised logarithmic derivative ${\frac{1}{n} \frac{f'}{f} = s_\lambda}$ has a non-negative winding number in a certain small (but not too small) circle around the origin, which by the argument principle is inconsistent with the hypothesis that ${f}$ has a zero at ${a = o(1)}$ and that ${f'}$ has no zeroes near ${a}$. This is how we establish Theorem 3.

Case (ii) turns out to be more delicate. This is because there are a number of “near-counterexamples” to Sendov’s conjecture that are compatible with the hypotheses and conclusion of case (ii). The simplest such example is ${f(z) = z^n - 1}$, where the zeroes ${\lambda}$ of ${f}$ are uniformly distributed amongst the ${n^{th}}$ roots of unity (including at ${a=1}$), and the zeroes of ${f'}$ are all located at the origin. In my paper I also discuss a variant of this construction, in which ${f'}$ has zeroes mostly near the origin, but also acquires a bounded number of zeroes at various locations ${\lambda_1+o(1),\dots,\lambda_m+o(1)}$ inside the unit disk. Specifically, we take

$\displaystyle f(z) := \left(z + \frac{c_2}{n}\right)^{n-m} P(z) - \left(a + \frac{c_2}{n}\right)^{n-m} P(a)$

where ${a = 1 - \frac{c_1}{n}}$ for some constants ${0 < c_1 < c_2}$ and

$\displaystyle P(z) := (z-\lambda_1) \dots (z-\lambda_m).$

By a perturbative analysis to locate the zeroes of ${f}$, one eventually would be able to arrive at a true counterexample to Sendov’s conjecture if these locations ${\lambda_1,\dots,\lambda_m}$ were in the open lune

$\displaystyle \{ \lambda: |\lambda| < 1 < |\lambda-1| \}$

and if one had the inequality

$\displaystyle c_2 - c_1 - c_2 \cos \theta + \sum_{j=1}^m \log \left|\frac{1 - \lambda_j}{e^{i\theta} - \lambda_j}\right| < 0 \ \ \ \ \ (5)$

for all ${0 \leq \theta \leq 2\pi}$. However, if one takes the mean of this inequality in ${\theta}$, one arrives at the inequality

$\displaystyle c_2 - c_1 + \sum_{j=1}^m \log |1 - \lambda_j| < 0$

which is incompatible with the hypotheses ${c_2 > c_1}$ and ${|\lambda_j-1| > 1}$. In order to extend this argument to more general polynomials ${f}$, we require a stability analysis of the endpoint equation

$\displaystyle c_2 - c_1 + c_2 \cos \theta + \sum_{j=1}^m \log \left|\frac{1 - \lambda_j}{e^{i\theta} - \lambda_j}\right| = 0 \ \ \ \ \ (6)$

where we now only assume the closed conditions ${c_2 \geq c_1}$ and ${|\lambda_j-1| \geq 1}$. The above discussion then places all the zeros ${\lambda_j}$ on the arc

$\displaystyle \{ \lambda: |\lambda| < 1 = |\lambda-1|\} \ \ \ \ \ (7)$

and if one also takes the second Fourier coefficient of (6) one also obtains the vanishing second moment

$\displaystyle \sum_{j=1}^m \lambda_j^2 = 0.$

These two conditions are incompatible with each other (except in the degenerate case when all the ${\lambda_j}$ vanish), because all the non-zero elements ${\lambda}$ of the arc (7) have argument in ${\pm [\pi/3,\pi/2]}$, so in particular their square ${\lambda^2}$ will have negative real part. It turns out that one can adapt this argument to the more general potential counterexamples to Sendov’s conjecture (in the form of Theorem 4). The starting point is to use (1), (4), and Theorem 5(ii) to obtain good control on ${f''/f'}$, which one then integrates and exponentiates to get good control on ${f'}$, and then on a second integration one gets enough information about ${f}$ to pin down the location of its zeroes to high accuracy. The constraint that these zeroes lie inside the unit disk then gives an inequality resembling (5), and an adaptation of the above stability analysis is then enough to conclude. The arguments here are inspired by the previous arguments of Miller, which treated the case when ${a}$ was extremely close to ${1}$ via a similar perturbative analysis; the main novelty is to control the error terms not in terms of the magnitude of the largest zero ${\zeta}$ of ${f'}$ (which is difficult to manage when ${n}$ gets large), but rather by the variance of those zeroes, which ends up being a more tractable expression to keep track of.

[UPDATE, Feb 1, 2021: the strategy sketched out below has been successfully implemented to rigorously obtain the desired implication in this recent preprint of Giulio Bresciani.]
I recently came across this question on MathOverflow asking if there are any polynomials ${P}$ of two variables with rational coefficients, such that the map ${P: {\bf Q} \times {\bf Q} \rightarrow {\bf Q}}$ is a bijection. The answer to this question is almost surely “no”, but it is remarkable how hard this problem resists any attempt at rigorous proof. (MathOverflow users with enough privileges to see deleted answers will find that there are no less than seventeen deleted attempts at a proof in response to this question!)
On the other hand, the one surviving response to the question does point out this paper of Poonen which shows that assuming a powerful conjecture in Diophantine geometry known as the Bombieri-Lang conjecture (discussed in this previous post), it is at least possible to exhibit polynomials ${P: {\bf Q} \times {\bf Q} \rightarrow {\bf Q}}$ which are injective.
I believe that it should be possible to also rule out the existence of bijective polynomials ${P: {\bf Q} \times {\bf Q} \rightarrow {\bf Q}}$ if one assumes the Bombieri-Lang conjecture, and have sketched out a strategy to do so, but filling in the gaps requires a fair bit more algebraic geometry than I am capable of. So as a sort of experiment, I would like to see if a rigorous implication of this form (similarly to the rigorous implication of the Erdos-Ulam conjecture from the Bombieri-Lang conjecture in my previous post) can be crowdsourced, in the spirit of the polymath projects (though I feel that this particular problem should be significantly quicker to resolve than a typical such project).
Here is how I imagine a Bombieri-Lang-powered resolution of this question should proceed (modulo a large number of unjustified and somewhat vague steps that I believe to be true but have not established rigorously). Suppose for contradiction that we have a bijective polynomial ${P: {\bf Q} \times {\bf Q} \rightarrow {\bf Q}}$. Then for any polynomial ${Q: {\bf Q} \rightarrow {\bf Q}}$ of one variable, the surface

$\displaystyle S_Q := \{ (x,y,z) \in \mathbb{A}^3: P(x,y) = Q(z) \}$

has infinitely many rational points; indeed, every rational ${z \in {\bf Q}}$ lifts to exactly one rational point in ${S_Q}$. I believe that for “typical” ${Q}$ this surface ${S_Q}$ should be irreducible. One can now split into two cases:

• (a) The rational points in ${S_Q}$ are Zariski dense in ${S_Q}$.
• (b) The rational points in ${S_Q}$ are not Zariski dense in ${S_Q}$.

Consider case (b) first. By definition, this case asserts that the rational points in ${S_Q}$ are contained in a finite number of algebraic curves. By Faltings’ theorem (a special case of the Bombieri-Lang conjecture), any curve of genus two or higher only contains a finite number of rational points. So all but finitely many of the rational points in ${S_Q}$ are contained in a finite union of genus zero and genus one curves. I think all genus zero curves are birational to a line, and all the genus one curves are birational to an elliptic curve (though I don’t have an immediate reference for this). These curves ${C}$ all can have an infinity of rational points, but very few of them should have “enough” rational points ${C \cap {\bf Q}^3}$ that their projection ${\pi(C \cap {\bf Q}^3) := \{ z \in {\bf Q} : (x,y,z) \in C \hbox{ for some } x,y \in {\bf Q} \}}$ to the third coordinate is “large”. In particular, I believe

• (i) If ${C \subset {\mathbb A}^3}$ is birational to an elliptic curve, then the number of elements of ${\pi(C \cap {\bf Q}^3)}$ of height at most ${H}$ should grow at most polylogarithmically in ${H}$ (i.e., be of order ${O( \log^{O(1)} H )}$.
• (ii) If ${C \subset {\mathbb A}^3}$ is birational to a line but not of the form ${\{ (f(z), g(z), z) \}}$ for some rational ${f,g}$, then then the number of elements of ${\pi(C \cap {\bf Q}^3)}$ of height at most ${H}$ should grow slower than ${H^2}$ (in fact I think it can only grow like ${O(H)}$).

I do not have proofs of these results (though I think something similar to (i) can be found in Knapp’s book, and (ii) should basically follow by using a rational parameterisation ${\{(f(t),g(t),h(t))\}}$ of ${C}$ with ${h}$ nonlinear). Assuming these assertions, this would mean that there is a curve of the form ${\{ (f(z),g(z),z)\}}$ that captures a “positive fraction” of the rational points of ${S_Q}$, as measured by restricting the height of the third coordinate ${z}$ to lie below a large threshold ${H}$, computing density, and sending ${H}$ to infinity (taking a limit superior). I believe this forces an identity of the form

$\displaystyle P(f(z), g(z)) = Q(z) \ \ \ \ \ (1)$

for all ${z}$. Such identities are certainly possible for some choices of ${Q}$ (e.g. ${Q(z) = P(F(z), G(z))}$ for arbitrary polynomials ${F,G}$ of one variable) but I believe that the only way that such identities hold for a “positive fraction” of ${Q}$ (as measured using height as before) is if there is in fact a rational identity of the form

$\displaystyle P( f_0(z), g_0(z) ) = z$

for some rational functions ${f_0,g_0}$ with rational coefficients (in which case we would have ${f = f_0 \circ Q}$ and ${g = g_0 \circ Q}$). But such an identity would contradict the hypothesis that ${P}$ is bijective, since one can take a rational point ${(x,y)}$ outside of the curve ${\{ (f_0(z), g_0(z)): z \in {\bf Q} \}}$, and set ${z := P(x,y)}$, in which case we have ${P(x,y) = P(f_0(z), g_0(z) )}$ violating the injective nature of ${P}$. Thus, modulo a lot of steps that have not been fully justified, we have ruled out the scenario in which case (b) holds for a “positive fraction” of ${Q}$.
This leaves the scenario in which case (a) holds for a “positive fraction” of ${Q}$. Assuming the Bombieri-Lang conjecture, this implies that for such ${Q}$, any resolution of singularities of ${S_Q}$ fails to be of general type. I would imagine that this places some very strong constraints on ${P,Q}$, since I would expect the equation ${P(x,y) = Q(z)}$ to describe a surface of general type for “generic” choices of ${P,Q}$ (after resolving singularities). However, I do not have a good set of techniques for detecting whether a given surface is of general type or not. Presumably one should proceed by viewing the surface ${\{ (x,y,z): P(x,y) = Q(z) \}}$ as a fibre product of the simpler surface ${\{ (x,y,w): P(x,y) = w \}}$ and the curve ${\{ (z,w): Q(z) = w \}}$ over the line ${\{w \}}$. In any event, I believe the way to handle (a) is to show that the failure of general type of ${S_Q}$ implies some strong algebraic constraint between ${P}$ and ${Q}$ (something in the spirit of (1), perhaps), and then use this constraint to rule out the bijectivity of ${P}$ by some further ad hoc method.

(This post is mostly intended for my own reference, as I found myself repeatedly looking up several conversions between polynomial bases on various occasions.)

Let ${\mathrm{Poly}_{\leq n}}$ denote the vector space of polynomials ${P:{\bf R} \rightarrow {\bf R}}$ of one variable ${x}$ with real coefficients of degree at most ${n}$. This is a vector space of dimension ${n+1}$, and the sequence of these spaces form a filtration:

$\displaystyle \mathrm{Poly}_{\leq 0} \subset \mathrm{Poly}_{\leq 1} \subset \mathrm{Poly}_{\leq 2} \subset \dots$

A standard basis for these vector spaces are given by the monomials ${x^0, x^1, x^2, \dots}$: every polynomial ${P(x)}$ in ${\mathrm{Poly}_{\leq n}}$ can be expressed uniquely as a linear combination of the first ${n+1}$ monomials ${x^0, x^1, \dots, x^n}$. More generally, if one has any sequence ${Q_0(x), Q_1(x), Q_2(x)}$ of polynomials, with each ${Q_n}$ of degree exactly ${n}$, then an easy induction shows that ${Q_0(x),\dots,Q_n(x)}$ forms a basis for ${\mathrm{Poly}_{\leq n}}$.

In particular, if we have two such sequences ${Q_0(x), Q_1(x), Q_2(x),\dots}$ and ${R_0(x), R_1(x), R_2(x), \dots}$ of polynomials, with each ${Q_n}$ of degree ${n}$ and each ${R_k}$ of degree ${k}$, then ${Q_n}$ must be expressible uniquely as a linear combination of the polynomials ${R_0,R_1,\dots,R_n}$, thus we have an identity of the form

$\displaystyle Q_n(x) = \sum_{k=0}^n c_{QR}(n,k) R_k(x)$

for some change of basis coefficients ${c_{QR}(n,k) \in {\bf R}}$. These coefficients describe how to convert a polynomial expressed in the ${Q_n}$ basis into a polynomial expressed in the ${R_k}$ basis.

Many standard combinatorial quantities ${c(n,k)}$ involving two natural numbers ${0 \leq k \leq n}$ can be interpreted as such change of basis coefficients. The most familiar example are the binomial coefficients ${\binom{n}{k}}$, which measures the conversion from the shifted monomial basis ${(x+1)^n}$ to the monomial basis ${x^k}$, thanks to (a special case of) the binomial formula:

$\displaystyle (x+1)^n = \sum_{k=0}^n \binom{n}{k} x^k,$

thus for instance

$\displaystyle (x+1)^3 = \binom{3}{0} x^0 + \binom{3}{1} x^1 + \binom{3}{2} x^2 + \binom{3}{3} x^3$

$\displaystyle = 1 + 3x + 3x^2 + x^3.$

More generally, for any shift ${h}$, the conversion from ${(x+h)^n}$ to ${x^k}$ is measured by the coefficients ${h^{n-k} \binom{n}{k}}$, thanks to the general case of the binomial formula.

But there are other bases of interest too. For instance if one uses the falling factorial basis

$\displaystyle (x)_n := x (x-1) \dots (x-n+1)$

then the conversion from falling factorials to monomials is given by the Stirling numbers of the first kind ${s(n,k)}$:

$\displaystyle (x)_n = \sum_{k=0}^n s(n,k) x^k,$

thus for instance

$\displaystyle (x)_3 = s(3,0) x^0 + s(3,1) x^1 + s(3,2) x^2 + s(3,3) x^3$

$\displaystyle = 0 + 2 x - 3x^2 + x^3$

and the conversion back is given by the Stirling numbers of the second kind ${S(n,k)}$:

$\displaystyle x^n = \sum_{k=0}^n S(n,k) (x)_k$

thus for instance

$\displaystyle x^3 = S(3,0) (x)_0 + S(3,1) (x)_1 + S(3,2) (x)_2 + S(3,3) (x)_3$

$\displaystyle = 0 + x + 3 x(x-1) + x(x-1)(x-2).$

If one uses the binomial functions ${\binom{x}{n} = \frac{1}{n!} (x)_n}$ as a basis instead of the falling factorials, one of course can rewrite these conversions as

$\displaystyle \binom{x}{n} = \sum_{k=0}^n \frac{1}{n!} s(n,k) x^k$

and

$\displaystyle x^n = \sum_{k=0}^n k! S(n,k) \binom{x}{k}$

thus for instance

$\displaystyle \binom{x}{3} = 0 + \frac{1}{3} x - \frac{1}{2} x^2 + \frac{1}{6} x^3$

and

$\displaystyle x^3 = 0 + \binom{x}{1} + 6 \binom{x}{2} + 6 \binom{x}{3}.$

As a slight variant, if one instead uses rising factorials

$\displaystyle (x)^n := x (x+1) \dots (x+n-1)$

then the conversion to monomials yields the unsigned Stirling numbers ${|s(n,k)|}$ of the first kind:

$\displaystyle (x)^n = \sum_{k=0}^n |s(n,k)| x^k$

thus for instance

$\displaystyle (x)^3 = 0 + 2x + 3x^2 + x^3.$

One final basis comes from the polylogarithm functions

$\displaystyle \mathrm{Li}_{-n}(x) := \sum_{j=1}^\infty j^n x^j.$

For instance one has

$\displaystyle \mathrm{Li}_1(x) = -\log(1-x)$

$\displaystyle \mathrm{Li}_0(x) = \frac{x}{1-x}$

$\displaystyle \mathrm{Li}_{-1}(x) = \frac{x}{(1-x)^2}$

$\displaystyle \mathrm{Li}_{-2}(x) = \frac{x}{(1-x)^3} (1+x)$

$\displaystyle \mathrm{Li}_{-3}(x) = \frac{x}{(1-x)^4} (1+4x+x^2)$

$\displaystyle \mathrm{Li}_{-4}(x) = \frac{x}{(1-x)^5} (1+11x+11x^2+x^3)$

and more generally one has

$\displaystyle \mathrm{Li}_{-n-1}(x) = \frac{x}{(1-x)^{n+2}} E_n(x)$

for all natural numbers ${n}$ and some polynomial ${E_n}$ of degree ${n}$ (the Eulerian polynomials), which when converted to the monomial basis yields the (shifted) Eulerian numbers

$\displaystyle E_n(x) = \sum_{k=0}^n A(n+1,k) x^k.$

For instance

$\displaystyle E_3(x) = A(4,0) x^0 + A(4,1) x^1 + A(4,2) x^2 + A(4,3) x^3$

$\displaystyle = 1 + 11x + 11x^2 + x^3.$

These particular coefficients also have useful combinatorial interpretations. For instance:

• The binomial coefficient ${\binom{n}{k}}$ is of course the number of ${k}$-element subsets of ${\{1,\dots,n\}}$.
• The unsigned Stirling numbers ${|s(n,k)|}$ of the first kind are the number of permutations of ${\{1,\dots,n\}}$ with exactly ${k}$ cycles. The signed Stirling numbers ${s(n,k)}$ are then given by the formula ${s(n,k) = (-1)^{n-k} |s(n,k)|}$.
• The Stirling numbers ${S(n,k)}$ of the second kind are the number of ways to partition ${\{1,\dots,n\}}$ into ${k}$ non-empty subsets.
• The Eulerian numbers ${A(n,k)}$ are the number of permutations of ${\{1,\dots,n\}}$ with exactly ${k}$ ascents.

These coefficients behave similarly to each other in several ways. For instance, the binomial coefficients ${\binom{n}{k}}$ obey the well known Pascal identity

$\displaystyle \binom{n+1}{k} = \binom{n}{k} + \binom{n}{k-1}$

(with the convention that ${\binom{n}{k}}$ vanishes outside of the range ${0 \leq k \leq n}$). In a similar spirit, the unsigned Stirling numbers ${|s(n,k)|}$ of the first kind obey the identity

$\displaystyle |s(n+1,k)| = n |s(n,k)| + |s(n,k-1)|$

and the signed counterparts ${s(n,k)}$ obey the identity

$\displaystyle s(n+1,k) = -n s(n,k) + s(n,k-1).$

The Stirling numbers of the second kind ${S(n,k)}$ obey the identity

$\displaystyle S(n+1,k) = k S(n,k) + S(n,k-1)$

and the Eulerian numbers ${A(n,k)}$ obey the identity

$\displaystyle A(n+1,k) = (k+1) A(n,k) + (n-k+1) A(n,k-1).$

This is a sequel to this previous blog post, in which we discussed the effect of the heat flow evolution

$\displaystyle \partial_t P(t,z) = \partial_{zz} P(t,z)$

on the zeroes of a time-dependent family of polynomials ${z \mapsto P(t,z)}$, with a particular focus on the case when the polynomials ${z \mapsto P(t,z)}$ had real zeroes. Here (inspired by some discussions I had during a recent conference on the Riemann hypothesis in Bristol) we record the analogous theory in which the polynomials instead have zeroes on a circle ${\{ z: |z| = \sqrt{q} \}}$, with the heat flow slightly adjusted to compensate for this. As we shall discuss shortly, a key example of this situation arises when ${P}$ is the numerator of the zeta function of a curve.

More precisely, let ${g}$ be a natural number. We will say that a polynomial

$\displaystyle P(z) = \sum_{j=0}^{2g} a_j z^j$

of degree ${2g}$ (so that ${a_{2g} \neq 0}$) obeys the functional equation if the ${a_j}$ are all real and

$\displaystyle a_j = q^{g-j} a_{2g-j}$

for all ${j=0,\dots,2g}$, thus

$\displaystyle P(\overline{z}) = \overline{P(z)}$

and

$\displaystyle P(q/z) = q^g z^{-2g} P(z)$

for all non-zero ${z}$. This means that the ${2g}$ zeroes ${\alpha_1,\dots,\alpha_{2g}}$ of ${P(z)}$ (counting multiplicity) lie in ${{\bf C} \backslash \{0\}}$ and are symmetric with respect to complex conjugation ${z \mapsto \overline{z}}$ and inversion ${z \mapsto q/z}$ across the circle ${\{ |z| = \sqrt{q}\}}$. We say that this polynomial obeys the Riemann hypothesis if all of its zeroes actually lie on the circle ${\{ z = \sqrt{q}\}}$. For instance, in the ${g=1}$ case, the polynomial ${z^2 - a_1 z + q}$ obeys the Riemann hypothesis if and only if ${|a_1| \leq 2\sqrt{q}}$.

Such polynomials arise in number theory as follows: if ${C}$ is a projective curve of genus ${g}$ over a finite field ${\mathbf{F}_q}$, then, as famously proven by Weil, the associated local zeta function ${\zeta_{C,q}(z)}$ (as defined for instance in this previous blog post) is known to take the form

$\displaystyle \zeta_{C,q}(z) = \frac{P(z)}{(1-z)(1-qz)}$

where ${P}$ is a degree ${2g}$ polynomial obeying both the functional equation and the Riemann hypothesis. In the case that ${C}$ is an elliptic curve, then ${g=1}$ and ${P}$ takes the form ${P(z) = z^2 - a_1 z + q}$, where ${a_1}$ is the number of ${{\bf F}_q}$-points of ${C}$ minus ${q+1}$. The Riemann hypothesis in this case is a famous result of Hasse.

Another key example of such polynomials arise from rescaled characteristic polynomials

$\displaystyle P(z) := \det( 1 - \sqrt{q} F ) \ \ \ \ \ (1)$

of ${2g \times 2g}$ matrices ${F}$ in the compact symplectic group ${Sp(g)}$. These polynomials obey both the functional equation and the Riemann hypothesis. The Sato-Tate conjecture (in higher genus) asserts, roughly speaking, that “typical” polyomials ${P}$ arising from the number theoretic situation above are distributed like the rescaled characteristic polynomials (1), where ${F}$ is drawn uniformly from ${Sp(g)}$ with Haar measure.

Given a polynomial ${z \mapsto P(0,z)}$ of degree ${2g}$ with coefficients

$\displaystyle P(0,z) = \sum_{j=0}^{2g} a_j(0) z^j,$

we can evolve it in time by the formula

$\displaystyle P(t,z) = \sum_{j=0}^{2g} \exp( t(j-g)^2 ) a_j(0) z^j,$

thus ${a_j(t) = \exp(t(j-g)) a_j(0)}$ for ${t \in {\bf R}}$. Informally, as one increases ${t}$, this evolution accentuates the effect of the extreme monomials, particularly, ${z^0}$ and ${z^{2g}}$ at the expense of the intermediate monomials such as ${z^g}$, and conversely as one decreases ${t}$. This family of polynomials obeys the heat-type equation

$\displaystyle \partial_t P(t,z) = (z \partial_z - g)^2 P(t,z). \ \ \ \ \ (2)$

In view of the results of Marcus, Spielman, and Srivastava, it is also very likely that one can interpret this flow in terms of expected characteristic polynomials involving conjugation over the compact symplectic group ${Sp(n)}$, and should also be tied to some sort of “${\beta=\infty}$” version of Brownian motion on this group, but we have not attempted to work this connection out in detail.

It is clear that if ${z \mapsto P(0,z)}$ obeys the functional equation, then so does ${z \mapsto P(t,z)}$ for any other time ${t}$. Now we investigate the evolution of the zeroes. Suppose at some time ${t_0}$ that the zeroes ${\alpha_1(t_0),\dots,\alpha_{2g}(t_0)}$ of ${z \mapsto P(t_0,z)}$ are distinct, then

$\displaystyle P(t_0,z) = a_{2g}(0) \exp( t_0g^2 ) \prod_{j=1}^{2g} (z - \alpha_j(t_0) ).$

From the inverse function theorem we see that for times ${t}$ sufficiently close to ${t_0}$, the zeroes ${\alpha_1(t),\dots,\alpha_{2g}(t)}$ of ${z \mapsto P(t,z)}$ continue to be distinct (and vary smoothly in ${t}$), with

$\displaystyle P(t,z) = a_{2g}(0) \exp( t g^2 ) \prod_{j=1}^{2g} (z - \alpha_j(t) ).$

Differentiating this at any ${z}$ not equal to any of the ${\alpha_j(t)}$, we obtain

$\displaystyle \partial_t P(t,z) = P(t,z) ( g^2 - \sum_{j=1}^{2g} \frac{\alpha'_j(t)}{z - \alpha_j(t)})$

and

$\displaystyle \partial_z P(t,z) = P(t,z) ( \sum_{j=1}^{2g} \frac{1}{z - \alpha_j(t)})$

and

$\displaystyle \partial_{zz} P(t,z) = P(t,z) ( \sum_{1 \leq j,k \leq 2g: j \neq k} \frac{1}{(z - \alpha_j(t))(z - \alpha_k(t))}).$

Inserting these formulae into (2) (expanding ${(z \partial_z - g)^2}$ as ${z^2 \partial_{zz} - (2g-1) z \partial_z + g^2}$) and canceling some terms, we conclude that

$\displaystyle - \sum_{j=1}^{2g} \frac{\alpha'_j(t)}{z - \alpha_j(t)} = z^2 \sum_{1 \leq j,k \leq 2g: j \neq k} \frac{1}{(z - \alpha_j(t))(z - \alpha_k(t))}$

$\displaystyle - (2g-1) z \sum_{j=1}^{2g} \frac{1}{z - \alpha_j(t)}$

for ${t}$ sufficiently close to ${t_0}$, and ${z}$ not equal to ${\alpha_1(t),\dots,\alpha_{2g}(t)}$. Extracting the residue at ${z = \alpha_j(t)}$, we conclude that

$\displaystyle - \alpha'_j(t) = 2 \alpha_j(t)^2 \sum_{1 \leq k \leq 2g: k \neq j} \frac{1}{\alpha_j(t) - \alpha_k(t)} - (2g-1) \alpha_j(t)$

which we can rearrange as

$\displaystyle \frac{\alpha'_j(t)}{\alpha_j(t)} = - \sum_{1 \leq k \leq 2g: k \neq j} \frac{\alpha_j(t)+\alpha_k(t)}{\alpha_j(t)-\alpha_k(t)}.$

If we make the change of variables ${\alpha_j(t) = \sqrt{q} e^{i\theta_j(t)}}$ (noting that one can make ${\theta_j}$ depend smoothly on ${t}$ for ${t}$ sufficiently close to ${t_0}$), this becomes

$\displaystyle \partial_t \theta_j(t) = \sum_{1 \leq k \leq 2g: k \neq j} \cot \frac{\theta_j(t) - \theta_k(t)}{2}. \ \ \ \ \ (3)$

Intuitively, this equation asserts that the phases ${\theta_j}$ repel each other if they are real (and attract each other if their difference is imaginary). If ${z \mapsto P(t_0,z)}$ obeys the Riemann hypothesis, then the ${\theta_j}$ are all real at time ${t_0}$, then the Picard uniqueness theorem (applied to ${\theta_j(t)}$ and its complex conjugate) then shows that the ${\theta_j}$ are also real for ${t}$ sufficiently close to ${t_0}$. If we then define the entropy functional

$\displaystyle H(\theta_1,\dots,\theta_{2g}) := \sum_{1 \leq j < k \leq 2g} \log \frac{1}{|\sin \frac{\theta_j-\theta_k}{2}| }$

then the above equation becomes a gradient flow

$\displaystyle \partial_t \theta_j(t) = - 2 \frac{\partial H}{\partial \theta_j}( \theta_1(t),\dots,\theta_{2g}(t) )$

which implies in particular that ${H(\theta_1(t),\dots,\theta_{2g}(t))}$ is non-increasing in time. This shows that as one evolves time forward from ${t_0}$, there is a uniform lower bound on the separation between the phases ${\theta_1(t),\dots,\theta_{2g}(t)}$, and hence the equation can be solved indefinitely; in particular, ${z \mapsto P(t,z)}$ obeys the Riemann hypothesis for all ${t > t_0}$ if it does so at time ${t_0}$. Our argument here assumed that the zeroes of ${z \mapsto P(t_0,z)}$ were simple, but this assumption can be removed by the usual limiting argument.

For any polynomial ${z \mapsto P(0,z)}$ obeying the functional equation, the rescaled polynomials ${z \mapsto e^{-g^2 t} P(t,z)}$ converge locally uniformly to ${a_{2g}(0) (z^{2g} + q^g)}$ as ${t \rightarrow +\infty}$. By Rouche’s theorem, we conclude that the zeroes of ${z \mapsto P(t,z)}$ converge to the equally spaced points ${\{ e^{2\pi i(j+1/2)/2g}: j=1,\dots,2g\}}$ on the circle ${\{ |z| = \sqrt{q}\}}$. Together with the symmetry properties of the zeroes, this implies in particular that ${z \mapsto P(t,z)}$ obeys the Riemann hypothesis for all sufficiently large positive ${t}$. In the opposite direction, when ${t \rightarrow -\infty}$, the polynomials ${z \mapsto P(t,z)}$ converge locally uniformly to ${a_g(0) z^g}$, so if ${a_g(0) \neq 0}$, ${g}$ of the zeroes converge to the origin and the other ${g}$ converge to infinity. In particular, ${z \mapsto P(t,z)}$ fails the Riemann hypothesis for sufficiently large negative ${t}$. Thus (if ${a_g(0) \neq 0}$), there must exist a real number ${\Lambda}$, which we call the de Bruijn-Newman constant of the original polynomial ${z \mapsto P(0,z)}$, such that ${z \mapsto P(t,z)}$ obeys the Riemann hypothesis for ${t \geq \Lambda}$ and fails the Riemann hypothesis for ${t < \Lambda}$. The situation is a bit more complicated if ${a_g(0)}$ vanishes; if ${k}$ is the first natural number such that ${a_{g+k}(0)}$ (or equivalently, ${a_{g-j}(0)}$) does not vanish, then by the above arguments one finds in the limit ${t \rightarrow -\infty}$ that ${g-k}$ of the zeroes go to the origin, ${g-k}$ go to infinity, and the remaining ${2k}$ zeroes converge to the equally spaced points ${\{ e^{2\pi i(j+1/2)/2k}: j=1,\dots,2k\}}$. In this case the de Bruijn-Newman constant remains finite except in the degenerate case ${k=g}$, in which case ${\Lambda = -\infty}$.

For instance, consider the case when ${g=1}$ and ${P(0,z) = z^2 - a_1 z + q}$ for some real ${a_1}$ with ${|a_1| \leq 2\sqrt{q}}$. Then the quadratic polynomial

$\displaystyle P(t,z) = e^t z^2 - a_1 z + e^t q$

has zeroes

$\displaystyle \frac{a_1 \pm \sqrt{a_1^2 - 4 e^{2t} q}}{2e^t}$

and one easily checks that these zeroes lie on the circle ${\{ |z|=\sqrt{q}\}}$ when ${t \geq \log \frac{|a_1|}{2\sqrt{q}}}$, and are on the real axis otherwise. Thus in this case we have ${\Lambda = \log \frac{|a_1|}{2\sqrt{q}}}$ (with ${\Lambda=-\infty}$ if ${a_1=0}$). Note how as ${t}$ increases to ${+\infty}$, the zeroes repel each other and eventually converge to ${\pm i \sqrt{q}}$, while as ${t}$ decreases to ${-\infty}$, the zeroes collide and then separate on the real axis, with one zero going to the origin and the other to infinity.

The arguments in my paper with Brad Rodgers (discussed in this previous post) indicate that for a “typical” polynomial ${P}$ of degree ${g}$ that obeys the Riemann hypothesis, the expected time to relaxation to equilibrium (in which the zeroes are equally spaced) should be comparable to ${1/g}$, basically because the average spacing is ${1/g}$ and hence by (3) the typical velocity of the zeroes should be comparable to ${g}$, and the diameter of the unit circle is comparable to ${1}$, thus requiring time comparable to ${1/g}$ to reach equilibrium. Taking contrapositives, this suggests that the de Bruijn-Newman constant ${\Lambda}$ should typically take on values comparable to ${-1/g}$ (since typically one would not expect the initial configuration of zeroes to be close to evenly spaced). I have not attempted to formalise or prove this claim, but presumably one could do some numerics (perhaps using some of the examples of ${P}$ given previously) to explore this further.

Let ${P(z) = z^n + a_{n-1} z^{n-1} + \dots + a_0}$ be a monic polynomial of degree ${n}$ with complex coefficients. Then by the fundamental theorem of algebra, we can factor ${P}$ as

$\displaystyle P(z) = (z-z_1) \dots (z-z_n) \ \ \ \ \ (1)$

for some complex zeroes ${z_1,\dots,z_n}$ (possibly with repetition).

Now suppose we evolve ${P}$ with respect to time by heat flow, creating a function ${P(t,z)}$ of two variables with given initial data ${P(0,z) = P(z)}$ for which

$\displaystyle \partial_t P(t,z) = \partial_{zz} P(t,z). \ \ \ \ \ (2)$

On the space of polynomials of degree at most ${n}$, the operator ${\partial_{zz}}$ is nilpotent, and one can solve this equation explicitly both forwards and backwards in time by the Taylor series

$\displaystyle P(t,z) = \sum_{j=0}^\infty \frac{t^j}{j!} \partial_z^{2j} P(0,z).$

For instance, if one starts with a quadratic ${P(0,z) = z^2 + bz + c}$, then the polynomial evolves by the formula

$\displaystyle P(t,z) = z^2 + bz + (c+2t).$

As the polynomial ${P(t)}$ evolves in time, the zeroes ${z_1(t),\dots,z_n(t)}$ evolve also. Assuming for sake of discussion that the zeroes are simple, the inverse function theorem tells us that the zeroes will (locally, at least) evolve smoothly in time. What are the dynamics of this evolution?

For instance, in the quadratic case, the quadratic formula tells us that the zeroes are

$\displaystyle z_1(t) = \frac{-b + \sqrt{b^2 - 4(c+2t)}}{2}$

and

$\displaystyle z_2(t) = \frac{-b - \sqrt{b^2 - 4(c+2t)}}{2}$

after arbitrarily choosing a branch of the square root. If ${b,c}$ are real and the discriminant ${b^2 - 4c}$ is initially positive, we see that we start with two real zeroes centred around ${-b/2}$, which then approach each other until time ${t = \frac{b^2-4c}{8}}$, at which point the roots collide and then move off from each other in an imaginary direction.

In the general case, we can obtain the equations of motion by implicitly differentiating the defining equation

$\displaystyle P( t, z_i(t) ) = 0$

in time using (2) to obtain

$\displaystyle \partial_{zz} P( t, z_i(t) ) + \partial_t z_i(t) \partial_z P(t,z_i(t)) = 0.$

To simplify notation we drop the explicit dependence on time, thus

$\displaystyle \partial_{zz} P(z_i) + (\partial_t z_i) \partial_z P(z_i)= 0.$

From (1) and the product rule, we see that

$\displaystyle \partial_z P( z_i ) = \prod_{j:j \neq i} (z_i - z_j)$

and

$\displaystyle \partial_{zz} P( z_i ) = 2 \sum_{k:k \neq i} \prod_{j:j \neq i,k} (z_i - z_j)$

(where all indices are understood to range over ${1,\dots,n}$) leading to the equations of motion

$\displaystyle \partial_t z_i = \sum_{k:k \neq i} \frac{2}{z_k - z_i}, \ \ \ \ \ (3)$

at least when one avoids those times in which there is a repeated zero. In the case when the zeroes ${z_i}$ are real, each term ${\frac{2}{z_k-z_i}}$ represents a (first-order) attraction in the dynamics between ${z_i}$ and ${z_k}$, but the dynamics are more complicated for complex zeroes (e.g. purely imaginary zeroes will experience repulsion rather than attraction, as one already sees in the quadratic example). Curiously, this system resembles that of Dyson brownian motion (except with the brownian motion part removed, and time reversed). I learned of the connection between the ODE (3) and the heat equation from this paper of Csordas, Smith, and Varga, but perhaps it has been mentioned in earlier literature as well.

One interesting consequence of these equations is that if the zeroes are real at some time, then they will stay real as long as the zeroes do not collide. Let us now restrict attention to the case of real simple zeroes, in which case we will rename the zeroes as ${x_i}$ instead of ${z_i}$, and order them as ${x_1 < \dots < x_n}$. The evolution

$\displaystyle \partial_t x_i = \sum_{k:k \neq i} \frac{2}{x_k - x_i}$

can now be thought of as reverse gradient flow for the “entropy”

$\displaystyle H := -\sum_{i,j: i \neq j} \log |x_i - x_j|,$

(which is also essentially the logarithm of the discriminant of the polynomial) since we have

$\displaystyle \partial_t x_i = \frac{\partial H}{\partial x_i}.$

In particular, we have the monotonicity formula

$\displaystyle \partial_t H = 4E$

where ${E}$ is the “energy”

$\displaystyle E := \frac{1}{4} \sum_i (\frac{\partial H}{\partial x_i})^2$

$\displaystyle = \sum_i (\sum_{k:k \neq i} \frac{1}{x_k-x_i})^2$

$\displaystyle = \sum_{i,k: i \neq k} \frac{1}{(x_k-x_i)^2} + 2 \sum_{i,j,k: i,j,k \hbox{ distinct}} \frac{1}{(x_k-x_i)(x_j-x_i)}$

$\displaystyle = \sum_{i,k: i \neq k} \frac{1}{(x_k-x_i)^2}$

where in the last line we use the antisymmetrisation identity

$\displaystyle \frac{1}{(x_k-x_i)(x_j-x_i)} + \frac{1}{(x_i-x_j)(x_k-x_j)} + \frac{1}{(x_j-x_k)(x_i-x_k)} = 0.$

Among other things, this shows that as one goes backwards in time, the entropy decreases, and so no collisions can occur to the past, only in the future, which is of course consistent with the attractive nature of the dynamics. As ${H}$ is a convex function of the positions ${x_1,\dots,x_n}$, one expects ${H}$ to also evolve in a convex manner in time, that is to say the energy ${E}$ should be increasing. This is indeed the case:

Exercise 1 Show that

$\displaystyle \partial_t E = 2 \sum_{i,j: i \neq j} (\frac{2}{(x_i-x_j)^2} - \sum_{k: i,j,k \hbox{ distinct}} \frac{1}{(x_k-x_i)(x_k-x_j)})^2.$

Symmetric polynomials of the zeroes are polynomial functions of the coefficients and should thus evolve in a polynomial fashion. One can compute this explicitly in simple cases. For instance, the center of mass is an invariant:

$\displaystyle \partial_t \frac{1}{n} \sum_i x_i = 0.$

The variance decreases linearly:

Exercise 2 Establish the virial identity

$\displaystyle \partial_t \sum_{i,j} (x_i-x_j)^2 = - 4n^2(n-1).$

As the variance (which is proportional to ${\sum_{i,j} (x_i-x_j)^2}$) cannot become negative, this identity shows that “finite time blowup” must occur – that the zeroes must collide at or before the time ${\frac{1}{4n^2(n-1)} \sum_{i,j} (x_i-x_j)^2}$.

Exercise 3 Show that the Stieltjes transform

$\displaystyle s(t,z) = \sum_i \frac{1}{x_i - z}$

solves the viscous Burgers equation

$\displaystyle \partial_t s = \partial_{zz} s - 2 s \partial_z s,$

either by using the original heat equation (2) and the identity ${s = - \partial_z P / P}$, or else by using the equations of motion (3). This relation between the Burgers equation and the heat equation is known as the Cole-Hopf transformation.

The paper of Csordas, Smith, and Varga mentioned previously gives some other bounds on the lifespan of the dynamics; roughly speaking, they show that if there is one pair of zeroes that are much closer to each other than to the other zeroes then they must collide in a short amount of time (unless there is a collision occuring even earlier at some other location). Their argument extends also to situations where there are an infinite number of zeroes, which they apply to get new results on Newman’s conjecture in analytic number theory. I would be curious to know of further places in the literature where this dynamics has been studied.

The equidistribution theorem asserts that if ${\alpha \in {\bf R}/{\bf Z}}$ is an irrational phase, then the sequence ${(n\alpha)_{n=1}^\infty}$ is equidistributed on the unit circle, or equivalently that

$\displaystyle \frac{1}{N} \sum_{n=1}^N F(n\alpha) \rightarrow \int_{{\bf R}/{\bf Z}} F(x)\ dx$

for any continuous (or equivalently, for any smooth) function ${F: {\bf R}/{\bf Z} \rightarrow {\bf C}}$. By approximating ${F}$ uniformly by a Fourier series, this claim is equivalent to that of showing that

$\displaystyle \frac{1}{N} \sum_{n=1}^N e(hn\alpha) \rightarrow 0$

for any non-zero integer ${h}$ (where ${e(x) := e^{2\pi i x}}$), which is easily verified from the irrationality of ${\alpha}$ and the geometric series formula. Conversely, if ${\alpha}$ is rational, then clearly ${\frac{1}{N} \sum_{n=1}^N e(hn\alpha)}$ fails to go to zero when ${h}$ is a multiple of the denominator of ${\alpha}$.

One can then ask for more quantitative information about the decay of exponential sums of ${\frac{1}{N} \sum_{n=1}^N e(n \alpha)}$, or more generally on exponential sums of the form ${\frac{1}{|Q|} \sum_{n \in Q} e(P(n))}$ for an arithmetic progression ${Q}$ (in this post all progressions are understood to be finite) and a polynomial ${P: Q \rightarrow \/{\bf Z}}$. It will be convenient to phrase such information in the form of an inverse theorem, describing those phases for which the exponential sum is large. Indeed, we have

Lemma 1 (Geometric series formula, inverse form) Let ${Q \subset {\bf Z}}$ be an arithmetic progression of length at most ${N}$ for some ${N \geq 1}$, and let ${P(n) = n \alpha + \beta}$ be a linear polynomial for some ${\alpha,\beta \in {\bf R}/{\bf Z}}$. If

$\displaystyle \frac{1}{N} |\sum_{n \in Q} e(P(n))| \geq \delta$

for some ${\delta > 0}$, then there exists a subprogression ${Q'}$ of ${Q}$ of size ${|Q'| \gg \delta^2 N}$ such that ${P(n)}$ varies by at most ${\delta}$ on ${Q'}$ (that is to say, ${P(n)}$ lies in a subinterval of ${{\bf R}/{\bf Z}}$ of length at most ${\delta}$).

Proof: By a linear change of variable we may assume that ${Q}$ is of the form ${\{0,\dots,N'-1\}}$ for some ${N' \geq 1}$. We may of course assume that ${\alpha}$ is non-zero in ${{\bf R}/{\bf Z}}$, so that ${\|\alpha\|_{{\bf R}/{\bf Z}} > 0}$ (${\|x\|_{{\bf R}/{\bf Z}}}$ denotes the distance from ${x}$ to the nearest integer). From the geometric series formula we see that

$\displaystyle |\sum_{n \in Q} e(P(n))| \leq \frac{2}{|e(\alpha) - 1|} \ll \frac{1}{\|\alpha\|_{{\bf R}/{\bf Z}}},$

and so ${\|\alpha\|_{{\bf R}/{\bf Z}} \ll \frac{1}{\delta N}}$. Setting ${Q' := \{ n \in Q: n \leq c \delta^2 N \}}$ for some sufficiently small absolute constant ${c}$, we obtain the claim. $\Box$

Thus, in order for a linear phase ${P(n)}$ to fail to be equidistributed on some long progression ${Q}$, ${P}$ must in fact be almost constant on large piece of ${Q}$.

As is well known, this phenomenon generalises to higher order polynomials. To achieve this, we need two elementary additional lemmas. The first relates the exponential sums of ${P}$ to the exponential sums of its “first derivatives” ${n \mapsto P(n+h)-P(n)}$.

Lemma 2 (Van der Corput lemma, inverse form) Let ${Q \subset {\bf Z}}$ be an arithmetic progression of length at most ${N}$, and let ${P: Q \rightarrow {\bf R}/{\bf Z}}$ be an arbitrary function such that

$\displaystyle \frac{1}{N} |\sum_{n \in Q} e(P(n))| \geq \delta \ \ \ \ \ (1)$

for some ${\delta > 0}$. Then, for ${\gg \delta^2 N}$ integers ${h \in Q-Q}$, there exists a subprogression ${Q_h}$ of ${Q}$, of the same spacing as ${Q}$, such that

$\displaystyle \frac{1}{N} |\sum_{n \in Q_h} e(P(n+h)-P(n))| \gg \delta^2. \ \ \ \ \ (2)$

Proof: Squaring (1), we see that

$\displaystyle \sum_{n,n' \in Q} e(P(n') - P(n)) \geq \delta^2 N^2.$

We write ${n' = n+h}$ and conclude that

$\displaystyle \sum_{h \in Q-Q} \sum_{n \in Q_h} e( P(n+h)-P(n) ) \geq \delta^2 N^2$

where ${Q_h := Q \cap (Q-h)}$ is a subprogression of ${Q}$ of the same spacing. Since ${\sum_{n \in Q_h} e( P(n+h)-P(n) ) = O(N)}$, we conclude that

$\displaystyle |\sum_{n \in Q_h} e( P(n+h)-P(n) )| \gg \delta^2 N$

for ${\gg \delta^2 N}$ values of ${h}$ (this can be seen, much like the pigeonhole principle, by arguing via contradiction for a suitable choice of implied constants). The claim follows. $\Box$

The second lemma (which we recycle from this previous blog post) is a variant of the equidistribution theorem.

Lemma 3 (Vinogradov lemma) Let ${I \subset [-N,N] \cap {\bf Z}}$ be an interval for some ${N \geq 1}$, and let ${\theta \in{\bf R}/{\bf Z}}$ be such that ${\|n\theta\|_{{\bf R}/{\bf Z}} \leq \varepsilon}$ for at least ${\delta N}$ values of ${n \in I}$, for some ${0 < \varepsilon, \delta < 1}$. Then either

$\displaystyle N < \frac{2}{\delta}$

or

$\displaystyle \varepsilon > 10^{-2} \delta$

or else there is a natural number ${q \leq 2/\delta}$ such that

$\displaystyle \| q \theta \|_{{\bf R}/{\bf Z}} \ll \frac{\varepsilon}{\delta N}.$

Proof: We may assume that ${N \geq \frac{2}{\delta}}$ and ${\varepsilon \leq 10^{-2} \delta}$, since we are done otherwise. Then there are at least two ${n \in I}$ with ${\|n \theta \|_{{\bf R}/{\bf Z}} \leq \varepsilon}$, and by the pigeonhole principle we can find ${n_1 < n_2}$ in ${Q}$ with ${\|n_1 \theta \|_{{\bf R}/{\bf Z}}, \|n_2 \theta \|_{{\bf R}/{\bf Z}} \leq \varepsilon}$ and ${n_2-n_1 \leq \frac{2}{\delta}}$. By the triangle inequality, we conclude that there exists at least one natural number ${q \leq \frac{2}{\delta}}$ for which

$\displaystyle \| q \theta \|_{{\bf R}/{\bf Z}} \leq 2\varepsilon.$

We take ${q}$ to be minimal amongst all such natural numbers, then we see that there exists ${a}$ coprime to ${q}$ and ${|\kappa| \leq 2\varepsilon}$ such that

$\displaystyle \theta = \frac{a}{q} + \frac{\kappa}{q}. \ \ \ \ \ (3)$

If ${\kappa=0}$ then we are done, so suppose that ${\kappa \neq 0}$. Suppose that ${n < m}$ are elements of ${I}$ such that ${\|n\theta \|_{{\bf R}/{\bf Z}}, \|m\theta \|_{{\bf R}/{\bf Z}} \leq \varepsilon}$ and ${m-n \leq \frac{1}{10 \kappa}}$. Writing ${m-n = qk + r}$ for some ${0 \leq r < q}$, we have

$\displaystyle \| (m-n) \theta \|_{{\bf R}/{\bf Z}} = \| \frac{ra}{q} + (m-n) \frac{\kappa}{q} \|_{{\bf R}/{\bf Z}} \leq 2\varepsilon.$

By hypothesis, ${(m-n) \frac{\kappa}{q} \leq \frac{1}{10 q}}$; note that as ${q \leq 2/\delta}$ and ${\varepsilon \leq 10^{-2} \delta}$ we also have ${\varepsilon \leq \frac{1}{10q}}$. This implies that ${\| \frac{ra}{q} \|_{{\bf R}/{\bf Z}} < \frac{1}{q}}$ and thus ${r=0}$. We then have

$\displaystyle |k \kappa| \leq 2 \varepsilon.$

We conclude that for fixed ${n \in I}$ with ${\|n\theta \|_{{\bf R}/{\bf Z}} \leq \varepsilon}$, there are at most ${\frac{2\varepsilon}{|\kappa|}}$ elements ${m}$ of ${[n, n + \frac{1}{10 |\kappa|}]}$ such that ${\|m\theta \|_{{\bf R}/{\bf Z}} \leq \varepsilon}$. Iterating this with a greedy algorithm, we see that the number of ${n \in I}$ with ${\|n\theta \|_{{\bf R}/{\bf Z}} \leq \varepsilon}$ is at most ${(\frac{N}{1/10|\kappa|} + 1) 2\varepsilon/|\kappa|}$; since ${\varepsilon < 10^{-2} \delta}$, this implies that

$\displaystyle \delta N \ll 2 \varepsilon / \kappa$

and the claim follows. $\Box$

Now we can quickly obtain a higher degree version of Lemma 1:

Proposition 4 (Weyl exponential sum estimate, inverse form) Let ${Q \subset {\bf Z}}$ be an arithmetic progression of length at most ${N}$ for some ${N \geq 1}$, and let ${P: {\bf Z} \rightarrow {\bf R}/{\bf Z}}$ be a polynomial of some degree at most ${d \geq 0}$. If

$\displaystyle \frac{1}{N} |\sum_{n \in Q} e(P(n))| \geq \delta$

for some ${\delta > 0}$, then there exists a subprogression ${Q'}$ of ${Q}$ with ${|Q'| \gg_d \delta^{O_d(1)} N}$ such that ${P}$ varies by at most ${\delta}$ on ${Q'}$.

Proof: We induct on ${d}$. The cases ${d=0,1}$ are immediate from Lemma 1. Now suppose that ${d \geq 2}$, and that the claim had already been proven for ${d-1}$. To simplify the notation we allow implied constants to depend on ${d}$. Let the hypotheses be as in the proposition. Clearly ${\delta}$ cannot exceed ${1}$. By shrinking ${\delta}$ as necessary we may assume that ${\delta \leq c}$ for some sufficiently small constant ${c}$ depending on ${d}$.

By rescaling we may assume ${Q \subset [0,N] \cap {\bf Z}}$. By Lemma 3, we see that for ${\gg \delta^2 N}$ choices of ${h \in [-N,N] \cap {\bf Z}}$ such that

$\displaystyle \frac{1}{N} |\sum_{n \in I_h} e(P(n+h) - P(n))| \gg \delta^2$

for some interval ${I_h \subset [0,N] \cap {\bf Z}}$. We write ${P(n) = \sum_{i \leq d} \alpha_i n^i}$, then ${P(n+h)-P(n)}$ is a polynomial of degree at most ${d-1}$ with leading coefficient ${h \alpha_d n^{d-1}}$. We conclude from induction hypothesis that for each such ${h}$, there exists a natural number ${q_h \ll \delta^{-O(1)}}$ such that ${\|q_h h \alpha_d \|_{{\bf R}/{\bf Z}} \ll \delta^{-O(1)} / N^{d-1}}$, by double-counting, this implies that there are ${\gg \delta^{O(1)} N}$ integers ${n}$ in the interval ${[-\delta^{-O(1)} N, \delta^{-O(1)} N] \cap {\bf Z}}$ such that ${\|n \alpha_d \|_{{\bf R}/{\bf Z}} \ll \delta^{-O(1)} / N^{d-1}}$. Applying Lemma 3, we conclude that either ${N \ll \delta^{-O(1)}}$, or that

$\displaystyle \| q \alpha_d \|_{{\bf R}/{\bf Z}} \ll \delta^{-O(1)} / N^d. \ \ \ \ \ (4)$

In the former case the claim is trivial (just take ${Q'}$ to be a point), so we may assume that we are in the latter case.

We partition ${Q}$ into arithmetic progressions ${Q'}$ of spacing ${q}$ and length comparable to ${\delta^{-C} N}$ for some large ${C}$ depending on ${d}$ to be chosen later. By hypothesis, we have

$\displaystyle \frac{1}{|Q|} |\sum_{n \in Q} e(P(n))| \geq \delta$

so by the pigeonhole principle, we have

$\displaystyle \frac{1}{|Q'|} |\sum_{n \in Q'} e(P(n))| \geq \delta$

for at least one such progression ${Q'}$. On this progression, we may use the binomial theorem and (4) to write ${\alpha_d n^d}$ as a polynomial in ${n}$ of degree at most ${d-1}$, plus an error of size ${O(\delta^{C - O(1)})}$. We thus can write ${P(n) = P'(n) + O(\delta^{C-O(1)})}$ for ${n \in Q'}$ for some polynomial ${P'}$ of degree at most ${d-1}$. By the triangle inequality, we thus have (for ${C}$ large enough) that

$\displaystyle \frac{1}{|Q'|} |\sum_{n \in Q'} e(P'(n))| \gg \delta$

and hence by induction hypothesis we may find a subprogression ${Q''}$ of ${Q'}$ of size ${|Q''| \gg \delta^{O(1)} N}$ such that ${P'}$ varies by most ${\delta/2}$ on ${Q''}$, and thus (for ${C}$ large enough again) that ${P}$ varies by at most ${\delta}$ on ${Q''}$, and the claim follows. $\Box$

This gives the following corollary (also given as Exercise 16 in this previous blog post):

Corollary 5 (Weyl exponential sum estimate, inverse form II) Let ${I \subset [-N,N] \cap {\bf Z}}$ be a discrete interval for some ${N \geq 1}$, and let ${P(n) = \sum_{i \leq d} \alpha_i n^i}$ polynomial of some degree at most ${d \geq 0}$ for some ${\alpha_0,\dots,\alpha_d \in {\bf R}/{\bf Z}}$. If

$\displaystyle \frac{1}{N} |\sum_{n \in I} e(P(n))| \geq \delta$

for some ${\delta > 0}$, then there is a natural number ${q \ll_d \delta^{-O_d(1)}}$ such that ${\| q\alpha_i \|_{{\bf R}/{\bf Z}} \ll_d \delta^{-O_d(1)} N^{-i}}$ for all ${i=0,\dots,d}$.

One can obtain much better exponents here using Vinogradov’s mean value theorem; see Theorem 1.6 this paper of Wooley. (Thanks to Mariusz Mirek for this reference.) However, this weaker result already suffices for many applications, and does not need any result as deep as the mean value theorem.

Proof: To simplify notation we allow implied constants to depend on ${d}$. As before, we may assume that ${\delta \leq c}$ for some small constant ${c>0}$ depending only on ${d}$. We may also assume that ${N \geq \delta^{-C}}$ for some large ${C}$, as the claim is trivial otherwise (set ${q=1}$).

Applying Proposition 4, we can find a natural number ${q \ll \delta^{-O(1)}}$ and an arithmetic subprogression ${Q}$ of ${I}$ such that ${|Q| \gg \delta^{O(1)}}$ and such that ${P}$ varies by at most ${\delta}$ on ${Q}$. Writing ${Q = \{ qn+r: n \in I'\}}$ for some interval ${I' \subset [0,N] \cap {\bf Z}}$ of length ${\gg \delta^{O(1)}}$ and some ${0 \leq r < q}$, we conclude that the polynomial ${n \mapsto P(qn+r)}$ varies by at most ${\delta}$ on ${I'}$. Taking ${d^{th}}$ order differences, we conclude that the ${d^{th}}$ coefficient of this polynomial is ${O(\delta^{-O(1)} / N^d)}$; by the binomial theorem, this implies that ${n \mapsto P(qn+r)}$ differs by at most ${O(\delta)}$ on ${I'}$ from a polynomial of degree at most ${d-1}$. Iterating this, we conclude that the ${i^{th}}$ coefficient of ${n \mapsto P(qn+r)}$ is ${O(\delta N^{-i})}$ for ${i=0,\dots,d}$, and the claim then follows by inverting the change of variables ${n \mapsto qn+r}$ (and replacing ${q}$ with a larger quantity such as ${q^d}$ as necessary). $\Box$

For future reference we also record a higher degree version of the Vinogradov lemma.

Lemma 6 (Polynomial Vinogradov lemma) Let ${I \subset [-N,N] \cap {\bf Z}}$ be a discrete interval for some ${N \geq 1}$, and let ${P: {\bf Z} \rightarrow {\bf R}/{\bf Z}}$ be a polynomial ${P(n) = \sum_{i \leq d} \alpha_i n^i}$ of degree at most ${d}$ for some ${d \geq 1}$ such that ${\|P(n)\|_{{\bf R}/{\bf Z}} \leq \varepsilon}$ for at least ${\delta N}$ values of ${n \in I}$, for some ${0 < \varepsilon, \delta < 1}$. Then either

$\displaystyle N \ll_d \delta^{-O_d(1)} \ \ \ \ \ (5)$

or

$\displaystyle \varepsilon \gg_d \delta^{O_d(1)} \ \ \ \ \ (6)$

or else there is a natural number ${q \ll_d \delta^{-O_d(1)}}$ such that

$\displaystyle \| q \alpha_i \|_{{\bf R}/{\bf Z}} \ll \frac{\delta^{-O(1)} \varepsilon}{N^i}$

for all ${i=0,\dots,d}$.

Proof: We induct on ${d}$. For ${d=1}$ this follows from Lemma 3 (noting that if ${\|P(n)\|_{{\bf R}/{\bf Z}}, \|P(n_0)\|_{{\bf R}/Z} \leq \varepsilon}$ then ${\|P(n)-P(n_0)\|_{{\bf R}/{\bf Z}} \leq 2\varepsilon}$), so suppose that ${d \geq 2}$ and that the claim is already proven for ${d-1}$. We now allow all implied constants to depend on ${d}$.

For each ${h \in [-2N,2N] \cap {\bf Z}}$, let ${N_h}$ denote the number of ${n \in [-N,N] \cap {\bf Z}}$ such that ${\| P(n+h)\|_{{\bf R}/{\bf Z}}, \|P(n)\|_{{\bf R}/{\bf Z}} \leq \varepsilon}$. By hypothesis, ${\sum_{h \in [-2N,2N] \cap {\bf Z}} N_h \gg \delta^2 N^2}$, and clearly ${N_h = O(N)}$, so we must have ${N_h \gg \delta^2 N}$ for ${\gg \delta^2 N}$ choices of ${h}$. For each such ${h}$, we then have ${\|P(n+h)-P(n)\|_{{\bf R}/{\bf Z}} \leq 2\varepsilon}$ for ${\gg \delta^2 N}$ choices of ${n \in [-N,N] \cap {\bf Z}}$, so by induction hypothesis, either (5) or (6) holds, or else for ${\gg \delta^{O(1)} N}$ choices of ${h \in [-2N,2N] \cap {\bf Z}}$, there is a natural number ${q_h \ll \delta^{-O(1)}}$ such that

$\displaystyle \| q_h \alpha_{i,h} \|_{{\bf R}/{\bf Z}} \ll \frac{\delta^{-O(1)} \varepsilon}{N^i}$

for ${i=1,\dots,d-1}$, where ${\alpha_{i,h}}$ are the coefficients of the degree ${d-1}$ polynomial ${n \mapsto P(n+h)-P(n)}$. We may of course assume it is the latter which holds. By the pigeonhole principle we may take ${q_h= q}$ to be independent of ${h}$.

Since ${\alpha_{d-1,h} = dh \alpha_d}$, we have

$\displaystyle \| qd h \alpha_d \|_{{\bf R}/{\bf Z}} \ll \frac{\delta^{-O(1)} \varepsilon}{N^{d-1}}$

for ${\gg \delta^{O(1)} N}$ choices of ${h}$, so by Lemma 3, either (5) or (6) holds, or else (after increasing ${q}$ as necessary) we have

$\displaystyle \| q \alpha_d \|_{{\bf R}/{\bf Z}} \ll \frac{\delta^{-O(1)} \varepsilon}{N^d}.$

We can again assume it is the latter that holds. This implies that ${q \alpha_{d-2,h} = (d-1) h \alpha_{d-1} + O( \delta^{-O(1)} \varepsilon / N^{d-2} )}$ modulo ${1}$, so that

$\displaystyle \| q(d-1) h \alpha_{d-1} \|_{{\bf R}/{\bf Z}} \ll \frac{\delta^{-O(1)} \varepsilon}{N^{d-2}}$

for ${\gg \delta^{O(1)} N}$ choices of ${h}$. Arguing as before and iterating, we obtain the claim. $\Box$

The above results also extend to higher dimensions. Here is the higher dimensional version of Proposition 4:

Proposition 7 (Multidimensional Weyl exponential sum estimate, inverse form) Let ${k \geq 1}$ and ${N_1,\dots,N_k \geq 1}$, and let ${Q_i \subset {\bf Z}}$ be arithmetic progressions of length at most ${N_i}$ for each ${i=1,\dots,k}$. Let ${P: {\bf Z}^k \rightarrow {\bf R}/{\bf Z}}$ be a polynomial of degrees at most ${d_1,\dots,d_k}$ in each of the ${k}$ variables ${n_1,\dots,n_k}$ separately. If

$\displaystyle \frac{1}{N_1 \dots N_k} |\sum_{n \in Q_1 \times \dots \times Q_k} e(P(n))| \geq \delta$

for some ${\delta > 0}$, then there exists a subprogression ${Q'_i}$ of ${Q_i}$ with ${|Q'_i| \gg_{k,d_1,\dots,d_k} \delta^{O_{k,d_1,\dots,d_k}(1)} N_i}$ for each ${i=1,\dots,k}$ such that ${P}$ varies by at most ${\delta}$ on ${Q'_1 \times \dots \times Q'_k}$.

A much more general statement, in which the polynomial phase ${n \mapsto e(P(n))}$ is replaced by a nilsequence, and in which one does not necessarily assume the exponential sum is small, is given in Theorem 8.6 of this paper of Ben Green and myself, but it involves far more notation to even state properly.

Proof: We induct on ${k}$. The case ${k=1}$ was established in Proposition 5, so we assume that ${k \geq 2}$ and that the claim has already been proven for ${k-1}$. To simplify notation we allow all implied constants to depend on ${k,d_1,\dots,d_k}$. We may assume that ${\delta \leq c}$ for some small ${c>0}$ depending only on ${k,d_1,\dots,d_k}$.

By a linear change of variables, we may assume that ${Q_i \subset [0,N_i] \cap {\bf Z}}$ for all ${i=1,\dots,k}$.

We write ${n' := (n_1,\dots,n_{k-1})}$. First suppose that ${N_k = O(\delta^{-O(1)})}$. Then by the pigeonhole principle we can find ${n_k \in I_k}$ such that

$\displaystyle \frac{1}{N_1 \dots N_{k-1}} |\sum_{n' \in Q_1 \times \dots \times Q_{k-1}} e(P(n',n_k))| \geq \delta$

and the claim then follows from the induction hypothesis. Thus we may assume that ${N_k \geq \delta^{-C}}$ for some large ${C}$ depending only on ${k,d_1,\dots,d_k}$. Similarly we may assume that ${N_i \geq \delta^{-C}}$ for all ${i=1,\dots,k}$.

By the triangle inequality, we have

$\displaystyle \frac{1}{N_1 \dots N_k} \sum_{n_k \in Q_k} |\sum_{n' \in Q_1 \times \dots \times Q_{k-1}} e(P(n',n_k))| \geq \delta.$

The inner sum is ${O(N_k)}$, and the outer sum has ${O(N_1 \dots N_{k-1})}$ terms. Thus, for ${\gg \delta N_1 \dots N_{k-1}}$ choices of ${n' \in Q_1 \times \dots \times Q_{k-1}}$, one has

$\displaystyle \frac{1}{N_k} |\sum_{n_k \in Q_k} e(P(n',n_k))| \gg \delta. \ \ \ \ \ (7)$

We write

$\displaystyle P(n',n_k) = \sum_{i_k \leq d_k} P_{i_k}(n') n_k^i$

for some polynomials ${P_{i_k}: {\bf Z}^{k-1} \rightarrow {\bf R}/{\bf Z}}$ of degrees at most ${d_1,\dots,d_{k-1}}$ in the variables ${n_1,\dots,n_{k-1}}$. For each ${n'}$ obeying (7), we apply Corollary 5 to conclude that there exists a natural number ${q_{n'} \ll \delta^{-O(1)}}$ such that

$\displaystyle \| q_{n'} P_{i_k}(n') \|_{{\bf R}/{\bf Z}} \ll \delta^{-O(1)} / N_k^{i_k}$

for ${i_k=1,\dots,d_k}$ (the claim also holds for ${i_k=0}$ but we discard it as being trivial). By the pigeonhole principle, there thus exists a natural number ${q \ll \delta^{-O(1)}}$ such that

$\displaystyle \| q P_{i_k}(n') \|_{{\bf R}/{\bf Z}} \ll \delta^{-O(1)} / N_k^{i_k}$

for all ${i_k=1,\dots,d_k}$ and for ${\gg \delta^{O(1)} N_1 \dots N_{k-1}}$ choices of ${n' \in Q_1 \times \dots \times Q_{k-1}}$. If we write

$\displaystyle P_{i_k}(n') = \sum_{i_{k-1} \leq d_{k-1}} P_{i_{k-1},i_k}(n_1,\dots,n_{k-2}) n_{k-1}^{i_{k-1}},$

where ${P_{i_{k-1},i_k}: {\bf Z}^{k-2} \rightarrow {\bf R}/{\bf Z}}$ is a polynomial of degrees at most ${d_1,\dots,d_{k-2}}$, then for ${\gg \delta^{O(1)} N_1 \dots N_{k-2}}$ choices of ${(n_1,\dots,n_{k-2}) \in Q_1 \times \dots \times Q_{k-2}}$ we then have

$\displaystyle \| \sum_{i_{k-1} \leq d_{k-1}} q P_{i_{k-1},i_k}(n_1,\dots,n_{k-2}) n_{k-1}^{i_{k-1}} \|_{{\bf R}/{\bf Z}} \ll \delta^{-O(1)} / N_k^{i_k}.$

Applying Lemma 6 in the ${n_{k-1}}$ and the largeness hypotheses on the ${N_i}$ (and also the assumption that ${i_k \geq 1}$) we conclude (after enlarging ${q}$ as necessary, and pigeonholing to keep ${q}$ independent of ${n_1,\dots,n_{k-2}}$) that

$\displaystyle \| q P_{i_{k-1},i_k}(n_1,\dots,n_{k-2}) \|_{{\bf R}/{\bf Z}} \ll \frac{\delta^{-O(1)}}{N_{k-1}^{i_{k-1}} N_k^{i_k}}$

for all ${i_{k-1}=0,\dots,d_{k-1}}$ (note that we now include that ${i_{k-1}=0}$ case, which is no longer trivial) and for ${\gg \delta^{O(1)} N_1 \dots N_{k-2}}$ choices of ${(n_1,\dots,n_{k-2}) \in Q_1 \times \dots \times Q_{k-2}}$. Iterating this, we eventually conclude (after enlarging ${q}$ as necessary) that

$\displaystyle \| q \alpha_{i_1,\dots,i_k} \|_{{\bf R}/{\bf Z}} \ll \frac{\delta^{-O(1)}}{N_1^{i_1} \dots N_k^{i_k}} \ \ \ \ \ (8)$

whenever ${i_j \in \{0,\dots,d_j\}}$ for ${j=1,\dots,k}$, with ${i_k}$ nonzero. Permuting the indices, and observing that the claim is trivial for ${(i_1,\dots,i_k) = (0,\dots,0)}$, we in fact obtain (8) for all ${(i_1,\dots,i_k) \in \{0,\dots,d_1\} \times \dots \times \{0,\dots,d_k\}}$, at which point the claim easily follows by taking ${Q'_j := \{ qn_j: n_j \leq \delta^C N_j\}}$ for each ${j=1,\dots,k}$. $\Box$

An inspection of the proof of the above result (or alternatively, by combining the above result again with many applications of Lemma 6) reveals the following general form of Proposition 4, which was posed as Exercise 17 in this previous blog post, but had a slight misprint in it (it did not properly treat the possibility that some of the ${N_j}$ could be small) and was a bit trickier to prove than anticipated (in fact, the reason for this post was that I was asked to supply a more detailed solution for this exercise):

Proposition 8 (Multidimensional Weyl exponential sum estimate, inverse form, II) Let ${k \geq 1}$ be an natural number, and for each ${j=1,\dots,k}$, let ${I_j \subset [0,N_j]_{\bf Z}}$ be a discrete interval for some ${N_j \geq 1}$. Let

$\displaystyle P(n_1,\dots,n_k) = \sum_{i_1 \leq d_1, \dots, i_k \leq d_k} \alpha_{i_1,\dots,i_k} n_1^{i_1} \dots n_k^{i_k}$

be a polynomial in ${k}$ variables of multidegrees ${d_1,\dots,d_k \geq 0}$ for some ${\alpha_{i_1,\dots,i_k} \in {\bf R}/{\bf Z}}$. If

$\displaystyle \frac{1}{N_1 \dots N_k} |\sum_{n \in I_1 \times \dots \times I_k} e(P(n))| \geq \delta \ \ \ \ \ (9)$

for some ${\delta > 0}$, then either

$\displaystyle N_j \ll_{k,d_1,\dots,d_k} \delta^{-O_{k,d_1,\dots,d_k}(1)} \ \ \ \ \ (10)$

for some ${1 \leq j \leq d_k}$, or else there is a natural number ${q \ll_{k,d_1,\dots,d_k} \delta^{-O_{k,d_1,\dots,d_k}(1)}}$ such that

$\displaystyle \| q\alpha_{i_1,\dots,i_k} \|_{{\bf R}/{\bf Z}} \ll_{k,d_1,\dots,d_k} \delta^{-O_d(1)} N_1^{-i_1} \dots N_k^{-i_k} \ \ \ \ \ (11)$

whenever ${i_j \leq d_j}$ for ${j=1,\dots,k}$.

Again, the factor of ${N_1^{-i_1} \dots N_k^{-i_k}}$ is natural in this bound. In the ${k=1}$ case, the option (10) may be deleted since (11) trivially holds in this case, but this simplification is no longer available for ${k>1}$ since one needs (10) to hold for all ${j}$ (not just one ${j}$) to make (11) completely trivial. Indeed, the above proposition fails for ${k \geq 2}$ if one removes (10) completely, as can be seen for instance by inspecting the exponential sum ${\sum_{n_1 \in \{0,1\}} \sum_{n_2 \in [1,N] \cap {\bf Z}} e( \alpha n_1 n_2)}$, which has size comparable to ${N}$ regardless of how irrational ${\alpha}$ is.

Vitaly Bergelson, Tamar Ziegler, and I have just uploaded to the arXiv our joint paper “Multiple recurrence and convergence results associated to ${{\bf F}_{p}^{\omega}}$-actions“. This paper is primarily concerned with limit formulae in the theory of multiple recurrence in ergodic theory. Perhaps the most basic formula of this type is the mean ergodic theorem, which (among other things) asserts that if ${(X,{\mathcal X}, \mu,T)}$ is a measure-preserving ${{\bf Z}}$-system (which, in this post, means that ${(X,{\mathcal X}, \mu)}$ is a probability space and ${T: X \mapsto X}$ is measure-preserving and invertible, thus giving an action ${(T^n)_{n \in {\bf Z}}}$ of the integers), and ${f,g \in L^2(X,{\mathcal X}, \mu)}$ are functions, and ${X}$ is ergodic (which means that ${L^2(X,{\mathcal X}, \mu)}$ contains no ${T}$-invariant functions other than the constants (up to almost everywhere equivalence, of course)), then the average

$\displaystyle \frac{1}{N} \sum_{n=1}^N \int_X f(x) g(T^n x)\ d\mu \ \ \ \ \ (1)$

converges as ${N \rightarrow \infty}$ to the expression

$\displaystyle (\int_X f(x)\ d\mu) (\int_X g(x)\ d\mu);$

see e.g. this previous blog post. Informally, one can interpret this limit formula as an equidistribution result: if ${x}$ is drawn at random from ${X}$ (using the probability measure ${\mu}$), and ${n}$ is drawn at random from ${\{1,\ldots,N\}}$ for some large ${N}$, then the pair ${(x, T^n x)}$ becomes uniformly distributed in the product space ${X \times X}$ (using product measure ${\mu \times \mu}$) in the limit as ${N \rightarrow \infty}$.

If we allow ${(X,\mu)}$ to be non-ergodic, then we still have a limit formula, but it is a bit more complicated. Let ${{\mathcal X}^T}$ be the ${T}$-invariant measurable sets in ${{\mathcal X}}$; the ${{\bf Z}}$-system ${(X, {\mathcal X}^T, \mu, T)}$ can then be viewed as a factor of the original system ${(X, {\mathcal X}, \mu, T)}$, which is equivalent (in the sense of measure-preserving systems) to a trivial system ${(Z_0, {\mathcal Z}_0, \mu_{Z_0}, 1)}$ (known as the invariant factor) in which the shift is trivial. There is then a projection map ${\pi_0: X \rightarrow Z_0}$ to the invariant factor which is a factor map, and the average (1) converges in the limit to the expression

$\displaystyle \int_{Z_0} (\pi_0)_* f(z) (\pi_0)_* g(z)\ d\mu_{Z_0}(x), \ \ \ \ \ (2)$

where ${(\pi_0)_*: L^2(X,{\mathcal X},\mu) \rightarrow L^2(Z_0,{\mathcal Z}_0,\mu_{Z_0})}$ is the pushforward map associated to the map ${\pi_0: X \rightarrow Z_0}$; see e.g. this previous blog post. We can interpret this as an equidistribution result. If ${(x,T^n x)}$ is a pair as before, then we no longer expect complete equidistribution in ${X \times X}$ in the non-ergodic, because there are now non-trivial constraints relating ${x}$ with ${T^n x}$; indeed, for any ${T}$-invariant function ${f: X \rightarrow {\bf C}}$, we have the constraint ${f(x) = f(T^n x)}$; putting all these constraints together we see that ${\pi_0(x) = \pi_0(T^n x)}$ (for almost every ${x}$, at least). The limit (2) can be viewed as an assertion that this constraint ${\pi_0(x) = \pi_0(T^n x)}$ are in some sense the “only” constraints between ${x}$ and ${T^n x}$, and that the pair ${(x,T^n x)}$ is uniformly distributed relative to these constraints.

Limit formulae are known for multiple ergodic averages as well, although the statement becomes more complicated. For instance, consider the expression

$\displaystyle \frac{1}{N} \sum_{n=1}^N \int_X f(x) g(T^n x) h(T^{2n} x)\ d\mu \ \ \ \ \ (3)$

for three functions ${f,g,h \in L^\infty(X, {\mathcal X}, \mu)}$; this is analogous to the combinatorial task of counting length three progressions in various sets. For simplicity we assume the system ${(X,{\mathcal X},\mu,T)}$ to be ergodic. Naively one might expect this limit to then converge to

$\displaystyle (\int_X f\ d\mu) (\int_X g\ d\mu) (\int_X h\ d\mu)$

which would roughly speaking correspond to an assertion that the triplet ${(x,T^n x, T^{2n} x)}$ is asymptotically equidistributed in ${X \times X \times X}$. However, even in the ergodic case there can be additional constraints on this triplet that cannot be seen at the level of the individual pairs ${(x,T^n x)}$, ${(x, T^{2n} x)}$. The key obstruction here is that of eigenfunctions of the shift ${T: X \rightarrow X}$, that is to say non-trivial functions ${f: X \rightarrow S^1}$ that obey the eigenfunction equation ${Tf = \lambda f}$ almost everywhere for some constant (or ${T}$-invariant) ${\lambda}$. Each such eigenfunction generates a constraint

$\displaystyle f(x) \overline{f(T^n x)}^2 f(T^{2n} x) = 1 \ \ \ \ \ (4)$

tying together ${x}$, ${T^n x}$, and ${T^{2n} x}$. However, it turns out that these are in some sense the only constraints on ${x,T^n x, T^{2n} x}$ that are relevant for the limit (3). More precisely, if one sets ${{\mathcal X}_1}$ to be the sub-algebra of ${{\mathcal X}}$ generated by the eigenfunctions of ${T}$, then it turns out that the factor ${(X, {\mathcal X}_1, \mu, T)}$ is isomorphic to a shift system ${(Z_1, {\mathcal Z}_1, \mu_{Z_1}, x \mapsto x+\alpha)}$ known as the Kronecker factor, for some compact abelian group ${Z_1 = (Z_1,+)}$ and some (irrational) shift ${\alpha \in Z_1}$; the factor map ${\pi_1: X \rightarrow Z_1}$ pushes eigenfunctions forward to (affine) characters on ${Z_1}$. It is then known that the limit of (3) is

$\displaystyle \int_\Sigma (\pi_1)_* f(x_0) (\pi_1)_* g(x_1) (\pi_1)_* h(x_2)\ d\mu_\Sigma$

where ${\Sigma \subset Z_1^3}$ is the closed subgroup

$\displaystyle \Sigma = \{ (x_1,x_2,x_3) \in Z_1^3: x_1-2x_2+x_3=0 \}$

and ${\mu_\Sigma}$ is the Haar probability measure on ${\Sigma}$; see this previous blog post. The equation ${x_1-2x_2+x_3=0}$ defining ${\Sigma}$ corresponds to the constraint (4) mentioned earlier. Among other things, this limit formula implies Roth’s theorem, which in the context of ergodic theory is the assertion that the limit (or at least the limit inferior) of (3) is positive when ${f=g=h}$ is non-negative and not identically vanishing.

If one considers a quadruple average

$\displaystyle \frac{1}{N} \sum_{n=1}^N \int_X f(x) g(T^n x) h(T^{2n} x) k(T^{3n} x)\ d\mu \ \ \ \ \ (5)$

(analogous to counting length four progressions) then the situation becomes more complicated still, even in the ergodic case. In addition to the (linear) eigenfunctions that already showed up in the computation of the triple average (3), a new type of constraint also arises from quadratic eigenfunctions ${f: X \rightarrow S^1}$, which obey an eigenfunction equation ${Tf = \lambda f}$ in which ${\lambda}$ is no longer constant, but is now a linear eigenfunction. For such functions, ${f(T^n x)}$ behaves quadratically in ${n}$, and one can compute the existence of a constraint

$\displaystyle f(x) \overline{f(T^n x)}^3 f(T^{2n} x)^3 \overline{f(T^{3n} x)} = 1 \ \ \ \ \ (6)$

between ${x}$, ${T^n x}$, ${T^{2n} x}$, and ${T^{3n} x}$ that is not detected at the triple average level. As it turns out, this is not the only type of constraint relevant for (5); there is a more general class of constraint involving two-step nilsystems which we will not detail here, but see e.g. this previous blog post for more discussion. Nevertheless there is still a similar limit formula to previous examples, involving a special factor ${(Z_2, {\mathcal Z}_2, \mu_{Z_2}, S)}$ which turns out to be an inverse limit of two-step nilsystems; this limit theorem can be extracted from the structural theory in this paper of Host and Kra combined with a limit formula for nilsystems obtained by Lesigne, but will not be reproduced here. The pattern continues to higher averages (and higher step nilsystems); this was first done explicitly by Ziegler, and can also in principle be extracted from the structural theory of Host-Kra combined with nilsystem equidistribution results of Leibman. These sorts of limit formulae can lead to various recurrence results refining Roth’s theorem in various ways; see this paper of Bergelson, Host, and Kra for some examples of this.

The above discussion was concerned with ${{\bf Z}}$-systems, but one can adapt much of the theory to measure-preserving ${G}$-systems for other discrete countable abelian groups ${G}$, in which one now has a family ${(T_g)_{g \in G}}$ of shifts indexed by ${G}$ rather than a single shift, obeying the compatibility relation ${T_{g+h}=T_g T_h}$. The role of the intervals ${\{1,\ldots,N\}}$ in this more general setting is replaced by that of Folner sequences. For arbitrary countable abelian ${G}$, the theory for double averages (1) and triple limits (3) is essentially identical to the ${{\bf Z}}$-system case. But when one turns to quadruple and higher limits, the situation becomes more complicated (and, for arbitrary ${G}$, still not fully understood). However one model case which is now well understood is the finite field case when ${G = {\bf F}_p^\omega = \bigcup_{n=1}^\infty {\bf F}_p^n}$ is an infinite-dimensional vector space over a finite field ${{\bf F}_p}$ (with the finite subspaces ${{\bf F}_p^n}$ then being a good choice for the Folner sequence). Here, the analogue of the structural theory of Host and Kra was worked out by Vitaly, Tamar, and myself in these previous papers (treating the high characteristic and low characteristic cases respectively). In the finite field setting, it turns out that nilsystems no longer appear, and one only needs to deal with linear, quadratic, and higher order eigenfunctions (known collectively as phase polynomials). It is then natural to look for a limit formula that asserts, roughly speaking, that if ${x}$ is drawn at random from a ${{\bf F}_p^\omega}$-system and ${n}$ drawn randomly from a large subspace of ${{\bf F}_p^\omega}$, then the only constraints between ${x, T^n x, \ldots, T^{(p-1)n} x}$ are those that arise from phase polynomials. The main theorem of this paper is to establish this limit formula (which, again, is a little complicated to state explicitly and will not be done here). In particular, we establish for the first time that the limit actually exists (a result which, for ${{\bf Z}}$-systems, was one of the main results of this paper of Host and Kra).

As a consequence, we can recover finite field analogues of most of the results of Bergelson-Host-Kra, though interestingly some of the counterexamples demonstrating sharpness of their results for ${{\bf Z}}$-systems (based on Behrend set constructions) do not seem to be present in the finite field setting (cf. this previous blog post on the cap set problem). In particular, we are able to largely settle the question of when one has a Khintchine-type theorem that asserts that for any measurable set ${A}$ in an ergodic ${{\bf F}_p^\omega}$-system and any ${\epsilon>0}$, one has

$\displaystyle \mu( T_{c_1 n} A \cap \ldots \cap T_{c_k n} A ) > \mu(A)^k - \epsilon$

for a syndetic set of ${n}$, where ${c_1,\ldots,c_k \in {\bf F}_p}$ are distinct residue classes. It turns out that Khintchine-type theorems always hold for ${k=1,2,3}$ (and for ${k=1,2}$ ergodicity is not required), and for ${k=4}$ it holds whenever ${c_1,c_2,c_3,c_4}$ form a parallelogram, but not otherwise (though the counterexample here was such a painful computation that we ended up removing it from the paper, and may end up putting it online somewhere instead), and for larger ${k}$ we could show that the Khintchine property failed for generic choices of ${c_1,\ldots,c_k}$, though the problem of determining exactly the tuples for which the Khintchine property failed looked to be rather messy and we did not completely settle it.

I’ve just uploaded to the arXiv my paper “Expanding polynomials over finite fields of large characteristic, and a regularity lemma for definable sets“, submitted to Contrib. Disc. Math. The motivation of this paper is to understand a certain polynomial variant of the sum-product phenomenon in finite fields. This phenomenon asserts that if ${A}$ is a non-empty subset of a finite field ${F}$, then either the sumset ${A+A := \{a+b: a,b \in A\}}$ or product set ${A \cdot A := \{ab: a,b \in A \}}$ will be significantly larger than ${A}$, unless ${A}$ is close to a subfield of ${F}$ (or to ${\{1\}}$). In particular, in the regime when ${A}$ is large, say ${|F|^{1-c} < |A| \leq |F|}$, one expects an expansion bound of the form

$\displaystyle |A+A| + |A \cdot A| \gg (|F|/|A|)^{c'} |A| \ \ \ \ \ (1)$

for some absolute constants ${c, c' > 0}$. Results of this type are known; for instance, Hart, Iosevich, and Solymosi obtained precisely this bound for ${(c,c')=(3/10,1/3)}$ (in the case when ${|F|}$ is prime), which was then improved by Garaev to ${(c,c')=(1/3,1/2)}$.

We have focused here on the case when ${A}$ is a large subset of ${F}$, but sum-product estimates are also extremely interesting in the opposite regime in which ${A}$ is allowed to be small (see for instance the papers of KatzShen and Li and of Garaev for some recent work in this case, building on some older papers of Bourgain, Katz and myself and of Bourgain, Glibichuk, and Konyagin). However, the techniques used in these two regimes are rather different. For large subsets of ${F}$, it is often profitable to use techniques such as the Fourier transform or the Cauchy-Schwarz inequality to “complete” a sum over a large set (such as ${A}$) into a set over the entire field ${F}$, and then to use identities concerning complete sums (such as the Weil bound on complete exponential sums over a finite field). For small subsets of ${F}$, such techniques are usually quite inefficient, and one has to proceed by somewhat different combinatorial methods which do not try to exploit the ambient field ${F}$. But my paper focuses exclusively on the large ${A}$ regime, and unfortunately does not directly say much (except through reasoning by analogy) about the small ${A}$ case.

Note that it is necessary to have both ${A+A}$ and ${A \cdot A}$ appear on the left-hand side of (1). Indeed, if one just has the sumset ${A+A}$, then one can set ${A}$ to be a long arithmetic progression to give counterexamples to (1). Similarly, if one just has a product set ${A \cdot A}$, then one can set ${A}$ to be a long geometric progression. The sum-product phenomenon can then be viewed that it is not possible to simultaneously behave like a long arithmetic progression and a long geometric progression, unless one is already very close to behaving like a subfield.

Now we consider a polynomial variant of the sum-product phenomenon, where we consider a polynomial image

$\displaystyle P(A,A) := \{ P(a,b): a,b \in A \}$

of a set ${A \subset F}$ with respect to a polynomial ${P: F \times F \rightarrow F}$; we can also consider the asymmetric setting of the image

$\displaystyle P(A,B) := \{ P(a,b): a \in A,b \in B \}$

of two subsets ${A,B \subset F}$. The regime we will be interested is the one where the field ${F}$ is large, and the subsets ${A, B}$ of ${F}$ are also large, but the polynomial ${P}$ has bounded degree. Actually, for technical reasons it will not be enough for us to assume that ${F}$ has large cardinality; we will also need to assume that ${F}$ has large characteristic. (The two concepts are synonymous for fields of prime order, but not in general; for instance, the field with ${2^n}$ elements becomes large as ${n \rightarrow \infty}$ while the characteristic remains fixed at ${2}$, and is thus not going to be covered by the results in this paper.)

In this paper of Vu, it was shown that one could replace ${A \cdot A}$ with ${P(A,A)}$ in (1), thus obtaining a bound of the form

$\displaystyle |A+A| + |P(A,A)| \gg (|F|/|A|)^{c'} |A|$

whenever ${|A| \geq |F|^{1-c}}$ for some absolute constants ${c, c' > 0}$, unless the polynomial ${P}$ had the degenerate form ${P(x,y) = Q(L(x,y))}$ for some linear function ${L: F \times F \rightarrow F}$ and polynomial ${Q: F \rightarrow F}$, in which ${P(A,A)}$ behaves too much like ${A+A}$ to get reasonable expansion. In this paper, we focus instead on the question of bounding ${P(A,A)}$ alone. In particular, one can ask to classify the polynomials ${P}$ for which one has the weak expansion property

$\displaystyle |P(A,A)| \gg (|F|/|A|)^{c'} |A|$

whenever ${|A| \geq |F|^{1-c}}$ for some absolute constants ${c, c' > 0}$. One can also ask for stronger versions of this expander property, such as the moderate expansion property

$\displaystyle |P(A,A)| \gg |F|$

whenever ${|A| \geq |F|^{1-c}}$, or the almost strong expansion property

$\displaystyle |P(A,A)| \geq |F| - O( |F|^{1-c'})$

whenever ${|A| \geq |F|^{1-c}}$. (One can consider even stronger expansion properties, such as the strong expansion property ${|P(A,A)| \geq |F|-O(1)}$, but it was shown by Gyarmati and Sarkozy that this property cannot hold for polynomials of two variables of bounded degree when ${|F| \rightarrow \infty}$.) One can also consider asymmetric versions of these properties, in which one obtains lower bounds on ${|P(A,B)|}$ rather than ${|P(A,A)|}$.

The example of a long arithmetic or geometric progression shows that the polynomials ${P(x,y) = x+y}$ or ${P(x,y) = xy}$ cannot be expanders in any of the above senses, and a similar construction also shows that polynomials of the form ${P(x,y) = Q(f(x)+f(y))}$ or ${P(x,y) = Q(f(x) f(y))}$ for some polynomials ${Q, f: F \rightarrow F}$ cannot be expanders. On the other hand, there are a number of results in the literature establishing expansion for various polynomials in two or more variables that are not of this degenerate form (in part because such results are related to incidence geometry questions in finite fields, such as the finite field version of the Erdos distinct distances problem). For instance, Solymosi established weak expansion for polynomials of the form ${P(x,y) = f(x)+y}$ when ${f}$ is a nonlinear polynomial, with generalisations by Hart, Li, and Shen for various polynomials of the form ${P(x,y) = f(x) + g(y)}$ or ${P(x,y) = f(x) g(y)}$. Further examples of expanding polynomials appear in the work of Shkredov, Iosevich-Rudnev, and Bukh-Tsimerman, as well as the previously mentioned paper of Vu and of Hart-Li-Shen, and these papers in turn cite many further results which are in the spirit of the polynomial expansion bounds discussed here (for instance, dealing with the small ${A}$ regime, or working in other fields such as ${{\bf R}}$ instead of in finite fields ${F}$). We will not summarise all these results here; they are summarised briefly in my paper, and in more detail in the papers of Hart-Li-Shen and of Bukh-Tsimerman. But we will single out one of the results of Bukh-Tsimerman, which is one of most recent and general of these results, and closest to the results of my own paper. Roughly speaking, in this paper it is shown that a polynomial ${P(x,y)}$ of two variables and bounded degree will be a moderate expander if it is non-composite (in the sense that it does not take the form ${P(x,y) = Q(R(x,y))}$ for some non-linear polynomial ${Q}$ and some polynomial ${R}$, possibly having coefficients in the algebraic completion of ${F}$) and is monic on both ${x}$ and ${y}$, thus it takes the form ${P(x,y) = x^d + S(x,y)}$ for some ${d \geq 1}$ and some polynomial ${S}$ of degree at most ${d-1}$ in ${x}$, and similarly with the roles of ${x}$ and ${y}$ reversed, unless ${P}$ is of the form ${P(x,y) = f(x) + g(y)}$ or ${P(x,y) = f(x) g(y)}$ (in which case the expansion theory is covered to a large extent by the previous work of Hart, Li, and Shen).

Our first main result improves upon the Bukh-Tsimerman result by strengthening the notion of expansion and removing the non-composite and monic hypotheses, but imposes a condition of large characteristic. I’ll state the result here slightly informally as follows:

Theorem 1 (Criterion for moderate expansion) Let ${P: F \times F \rightarrow F}$ be a polynomial of bounded degree over a finite field ${F}$ of sufficiently large characteristic, and suppose that ${P}$ is not of the form ${P(x,y) = Q(f(x)+g(y))}$ or ${P(x,y) = Q(f(x)g(y))}$ for some polynomials ${Q,f,g: F \rightarrow F}$. Then one has the (asymmetric) moderate expansion property

$\displaystyle |P(A,B)| \gg |F|$

whenever ${|A| |B| \ggg |F|^{2-1/8}}$.

This is basically a sharp necessary and sufficient condition for asymmetric expansion moderate for polynomials of two variables. In the paper, analogous sufficient conditions for weak or almost strong expansion are also given, although these are not quite as satisfactory (particularly the conditions for almost strong expansion, which include a somewhat complicated algebraic condition which is not easy to check, and which I would like to simplify further, but was unable to).

The argument here resembles the Bukh-Tsimerman argument in many ways. One can view the result as an assertion about the expansion properties of the graph ${\{ (a,b,P(a,b)): a,b \in F \}}$, which can essentially be thought of as a somewhat sparse three-uniform hypergraph on ${F}$. Being sparse, it is difficult to directly apply techniques from dense graph or hypergraph theory for this situation; however, after a few applications of the Cauchy-Schwarz inequality, it turns out (as observed by Bukh and Tsimerman) that one can essentially convert the problem to one about the expansion properties of the set

$\displaystyle \{ (P(a,c), P(b,c), P(a,d), P(b,d)): a,b,c,d \in F \} \ \ \ \ \ (2)$

(actually, one should view this as a multiset, but let us ignore this technicality) which one expects to be a dense set in ${F^4}$, except in the case when the associated algebraic variety

$\displaystyle \{ (P(a,c), P(b,c), P(a,d), P(b,d)): a,b,c,d \in \overline{F} \}$

fails to be Zariski dense, but it turns out that in this case one can use some differential geometry and Riemann surface arguments (after first invoking the Lefschetz principle and the high characteristic hypothesis to work over the complex numbers instead over a finite field) to show that ${P}$ is of the form ${Q(f(x)+g(y))}$ or ${Q(f(x)g(y))}$. This reduction is related to the classical fact that the only one-dimensional algebraic groups over the complex numbers are the additive group ${({\bf C},+)}$, the multiplicative group ${({\bf C} \backslash \{0\},\times)}$, or the elliptic curves (but the latter have a group law given by rational functions rather than polynomials, and so ultimately end up being eliminated from consideration, though they would play an important role if one wanted to study the expansion properties of rational functions).

It remains to understand the structure of the set (2) is. To understand dense graphs or hypergraphs, one of the standard tools of choice is the Szemerédi regularity lemma, which carves up such graphs into a bounded number of cells, with the graph behaving pseudorandomly on most pairs of cells. However, the bounds in this lemma are notoriously poor (the regularity obtained is an inverse tower exponential function of the number of cells), and this makes this lemma unsuitable for the type of expansion properties we seek (in which we want to deal with sets ${A}$ which have a polynomial sparsity, e.g. ${|A| \sim |F|^{1-c}}$). Fortunately, in the case of sets such as (2) which are definable over the language of rings, it turns out that a much stronger regularity lemma is available, which I call the “algebraic regularity lemma”. I’ll state it (again, slightly informally) in the context of graphs as follows:

Lemma 2 (Algebraic regularity lemma) Let ${F}$ be a finite field of large characteristic, and let ${V, W}$ be definable sets over ${F}$ of bounded complexity (i.e. ${V, W}$ are subsets of ${F^n}$, ${F^m}$ for some bounded ${n,m}$ that can be described by some first-order predicate in the language of rings of bounded length and involving boundedly many constants). Let ${E}$ be a definable subset of ${V \times W}$, again of bounded complexity (one can view ${E}$ as a bipartite graph connecting ${V}$ and ${W}$). Then one can partition ${V, W}$ into a bounded number of cells ${V_1,\ldots,V_a}$, ${W_1,\ldots,W_b}$, still definable with bounded complexity, such that for all pairs ${i =1,\ldots a}$, ${j=1,\ldots,b}$, one has the regularity property

$\displaystyle |E \cap (A \times B)| = d_{ij} |A| |B| + O( |F|^{-1/4} |V| |W| )$

for all ${A \subset V_i, B \subset W_i}$, where ${d_{ij} := \frac{|E \cap (V_i \times W_j)|}{|V_i| |W_j|}}$ is the density of ${E}$ in ${V_i \times W_j}$.

This lemma resembles the Szemerédi regularity lemma, but regularises all pairs of cells (not just most pairs), and the regularity is of polynomial strength in ${|F|}$, rather than inverse tower exponential in the number of cells. Also, the cells are not arbitrary subsets of ${V,W}$, but are themselves definable with bounded complexity, which turns out to be crucial for applications. I am optimistic that this lemma will be useful not just for studying expanding polynomials, but for many other combinatorial questions involving dense subsets of definable sets over finite fields.

The above lemma is stated for graphs ${E \subset V \times W}$, but one can iterate it to obtain an analogous regularisation of hypergraphs ${E \subset V_1 \times \ldots \times V_k}$ for any bounded ${k}$ (for application to (2), we need ${k=4}$). This hypergraph regularity lemma, by the way, is not analogous to the strong hypergraph regularity lemmas of Rodl et al. and Gowers developed in the last six or so years, but closer in spirit to the older (but weaker) hypergraph regularity lemma of Chung which gives the same “order ${1}$” regularity that the graph regularity lemma gives, rather than higher order regularity.

One feature of the proof of Lemma 2 which I found striking was the need to use some fairly high powered technology from algebraic geometry, and in particular the Lang-Weil bound on counting points in varieties over a finite field (discussed in this previous blog post), and also the theory of the etale fundamental group. Let me try to briefly explain why this is the case. A model example of a definable set of bounded complexity ${E}$ is a set ${E \subset F^n \times F^m}$ of the form

$\displaystyle E = \{ (x,y) \in F^n \times F^m: \exists t \in F; P(x,y,t)=0 \}$

for some polynomial ${P: F^n \times F^m \times F \rightarrow F}$. (Actually, it turns out that one can essentially write all definable sets as an intersection of sets of this form; see this previous blog post for more discussion.) To regularise the set ${E}$, it is convenient to square the adjacency matrix, which soon leads to the study of counting functions such as

$\displaystyle \mu(x,x') := | \{ (y,t,t') \in F^m \times F \times F: P(x,y,t) = P(x',y,t') = 0 \}|.$

If one can show that this function ${\mu}$ is “approximately finite rank” in the sense that (modulo lower order errors, of size ${O(|F|^{-1/2})}$ smaller than the main term), this quantity depends only on a bounded number of bits of information about ${x}$ and a bounded number of bits of information about ${x'}$, then a little bit of linear algebra will then give the required regularity result.

One can recognise ${\mu(x,x')}$ as counting ${F}$-points of a certain algebraic variety

$\displaystyle V_{x,x'} := \{ (y,t,t') \in \overline{F}^m \times \overline{F} \times \overline{F}: P(x,y,t) = P(x',y,t') = 0 \}.$

The Lang-Weil bound (discussed in this previous post) provides a formula for this count, in terms of the number ${c(x,x')}$ of geometrically irreducible components of ${V_{x,x'}}$ that are defined over ${F}$ (or equivalently, are invariant with respect to the Frobenius endomorphism associated to ${F}$). So the problem boils down to ensuring that this quantity ${c(x,x')}$ is “generically bounded rank”, in the sense that for generic ${x,x'}$, its value depends only on a bounded number of bits of ${x}$ and a bounded number of bits of ${x'}$.

Here is where the étale fundamental group comes in. One can view ${V_{x,x'}}$ as a fibre product ${V_x \times_{\overline{F}^m} V_{x'}}$ of the varieties

$\displaystyle V_x := \{ (y,t) \in \overline{F}^m \times \overline{F}: P(x,y,t) = 0 \}$

and

$\displaystyle V_{x'} := \{ (y,t) \in \overline{F}^m \times \overline{F}: P(x',y,t) = 0 \}$

over ${\overline{F}^m}$. If one is in sufficiently high characteristic (or even better, in zero characteristic, which one can reduce to by an ultraproduct (or nonstandard analysis) construction, similar to that discussed in this previous post), the varieties ${V_x,V_{x'}}$ are generically finite étale covers of ${\overline{F}^m}$, and the fibre product ${V_x \times_{\overline{F}^m} V_{x'}}$ is then also generically a finite étale cover. One can count the components of a finite étale cover of a connected variety by counting the number of orbits of the étale fundamental group acting on a fibre of that variety (much as the number of components of a cover of a connected manifold is the number of orbits of the topological fundamental group acting on that fibre). So if one understands the étale fundamental group of a certain generic subset of ${\overline{F}^m}$ (formed by intersecting together an ${x}$-dependent generic subset of ${\overline{F}^m}$ with an ${x'}$-dependent generic subset), this in principle controls ${c(x,x')}$. It turns out that one can decouple the ${x}$ and ${x'}$ dependence of this fundamental group by using an étale version of the van Kampen theorem for the fundamental group, which I discussed in this previous blog post. With this fact (and another deep fact about the étale fundamental group in zero characteristic, namely that it is topologically finitely generated), one can obtain the desired generic bounded rank property of ${c(x,x')}$, which gives the regularity lemma.

In order to expedite the deployment of all this algebraic geometry (as well as some Riemann surface theory), it is convenient to use the formalism of nonstandard analysis (or the ultraproduct construction), which among other things can convert quantitative, finitary problems in large characteristic into equivalent qualitative, infinitary problems in zero characteristic (in the spirit of this blog post). This allows one to use several tools from those fields as “black boxes”; not just the theory of étale fundamental groups (which are considerably simpler and more favorable in characteristic zero than they are in positive characteristic), but also some results limiting the morphisms between compact Riemann surfaces of high genus (such as the de Franchis theorem, the Riemann-Hurwitz formula, or the fact that all morphisms between elliptic curves are essentially group homomorphisms), which would be quite unwieldy to utilise if one did not first pass to the zero characteristic case (and thence to the complex case) via the ultraproduct construction (followed by the Lefschetz principle).

I found this project to be particularly educational for me, as it forced me to wander outside of my usual range by quite a bit in order to pick up the tools from algebraic geometry and Riemann surfaces that I needed (in particular, I read through several chapters of EGA and SGA for the first time). This did however put me in the slightly unnerving position of having to use results (such as the Riemann existence theorem) whose proofs I have not fully verified for myself, but which are easy to find in the literature, and widely accepted in the field. I suppose this type of dependence on results in the literature is more common in the more structured fields of mathematics than it is in analysis, which by its nature has fewer reusable black boxes, and so key tools often need to be rederived and modified for each new application. (This distinction is discussed further in this article of Gowers.)

One of the first non-trivial theorems one encounters in classical algebraic geometry is Bézout’s theorem, which we will phrase as follows:

Theorem 1 (Bézout’s theorem) Let ${k}$ be a field, and let ${P, Q \in k[x,y]}$ be non-zero polynomials in two variables ${x,y}$ with no common factor. Then the two curves ${\{ (x,y) \in k^2: P(x,y) = 0\}}$ and ${\{ (x,y) \in k^2: Q(x,y) = 0\}}$ have no common components, and intersect in at most ${\hbox{deg}(P) \hbox{deg}(Q)}$ points.

This theorem can be proven by a number of means, for instance by using the classical tool of resultants. It has many strengthenings, generalisations, and variants; see for instance this previous blog post on Bézout’s inequality. Bézout’s theorem asserts a fundamental algebraic dichotomy, of importance in combinatorial incidence geometry: any two algebraic curves either share a common component, or else have a bounded finite intersection; there is no intermediate case in which the intersection is unbounded in cardinality, but falls short of a common component. This dichotomy is closely related to the integrality gap in algebraic dimension: an algebraic set can have an integer dimension such as ${0}$ or ${1}$, but cannot attain any intermediate dimensions such as ${1/2}$. This stands in marked contrast to sets of analytic, combinatorial, or probabilistic origin, whose “dimension” is typically not necessarily constrained to be an integer.

Bézout’s inequality tells us, roughly speaking, that the intersection of a curve of degree ${D_1}$ and a curve of degree ${D_2}$ forms a set of at most ${D_1 D_2}$ points. One can consider the converse question: given a set ${S}$ of ${N}$ points in the plane ${k^2}$, can one find two curves of degrees ${D_1,D_2}$ with ${D_1 D_2 = O(N)}$ and no common components, whose intersection contains these points?

A model example that supports the possibility of such a converse is a grid ${S = A \times B}$ that is a Cartesian product of two finite subsets ${A, B}$ of ${k}$ with ${|A| |B| = N}$. In this case, one can take one curve to be the union of ${|A|}$ vertical lines, and the other curve to be the union of ${|B|}$ horizontal lines, to obtain the required decomposition. Thus, if the proposed converse to Bézout’s inequality held, it would assert that any set of ${N}$ points was essentially behaving like a “nonlinear grid” of size ${N}$.

Unfortunately, the naive converse to Bézout’s theorem is false. A counterexample can be given by considering a set ${S = S_1 \cup S_2}$ of ${2N}$ points for some large perfect square ${N}$, where ${P_1}$ is a ${\sqrt{N}}$ by ${\sqrt{N}}$ grid of the form described above, and ${S_2}$ consists of ${N}$ points on an line ${\ell}$ (e.g. a ${1 \times N}$ or ${N \times 1}$ grid). Each of the two component sets ${S_1, S_2}$ can be written as the intersection between two curves whose degrees multiply up to ${N}$; in the case of ${S_1}$, we can take the two families of parallel lines (viewed as reducible curves of degree ${\sqrt{N}}$) as the curves, and in the case of ${S_2}$, one can take ${\ell}$ as one curve, and the graph of a degree ${N}$ polynomial on ${\ell}$ vanishing on ${S_2}$ for the second curve. But, if ${N}$ is large enough, one cannot cover ${S}$ by the intersection of a single pair ${\gamma_1, \gamma_2}$ of curves with no common components whose degrees ${D_1,D_2}$ multiply up to ${D_1 D_2 = O(N)}$. Indeed, if this were the case, then without loss of generality we may assume that ${D_1 \leq D_2}$, so that ${D_1 = O(\sqrt{N})}$. By Bézout’s theorem, ${\gamma_1}$ either contains ${\ell}$, or intersects ${\ell}$ in at most ${O(D_1) = O(\sqrt{N})}$ points. Thus, in order for ${\gamma_1}$ to capture all of ${S}$, it must contain ${\ell}$, which forces ${\gamma_2}$ to not contain ${\ell}$. But ${\gamma_2}$ has to intersect ${\ell}$ in ${N}$ points, so by Bézout’s theorem again we have ${D_2 \geq N}$, thus ${D_1 = O(1)}$. But then (by more applications of Bézout’s theorem) ${\gamma_1}$ can only capture ${O(\sqrt{N})}$ of the ${N}$ points of ${S_1}$, a contradiction.

But the above counterexample suggests that even if an arbitrary set of ${N}$ (or ${2N}$) points cannot be covered by the single intersection of a pair of curves with degree multiplying up to ${O(N)}$, one may be able to cover such a set by a small number of such intersections. The purpose of this post is to record the simple observation that this is, indeed, the case:

Theorem 2 (Partial converse to Bézout’s theorem) Let ${k}$ be a field, and let ${S}$ be a set of ${N}$ points in ${k}$ for some ${N > 1}$. Then one can find ${m = O(\log N)}$ and pairs ${P_i,Q_i \in k[x,y]}$ of coprime non-zero polynomials for ${i=1,\ldots,m}$ such that

$\displaystyle S \subset \bigcup_{i=1}^m \{ (x,y) \in k^2: P_i(x,y) = Q_i(x,y) = 0 \} \ \ \ \ \ (1)$

and

$\displaystyle \sum_{i=1}^m \hbox{deg}(P_i) \hbox{deg}(Q_i) = O( N ). \ \ \ \ \ (2)$

Informally, every finite set in the plane is (a dense subset of) the union of logarithmically many nonlinear grids. The presence of the logarithm is necessary, as can be seen by modifying the ${P_1 \cup P_2}$ example to be the union of logarithmically many Cartesian products of distinct dimensions, rather than just a pair of such products.

Unfortunately I do not know of any application of this converse, but I thought it was cute anyways. The proof is given below the fold.

One of the basic problems in analytic number theory is to obtain bounds and asymptotics for sums of the form

$\displaystyle \sum_{n \leq x} f(n)$

in the limit ${x \rightarrow \infty}$, where ${n}$ ranges over natural numbers less than ${x}$, and ${f: {\bf N} \rightarrow {\bf C}}$ is some arithmetic function of number-theoretic interest. (It is also often convenient to replace this sharply truncated sum with a smoother sum such as ${\sum_n f(n) \psi(n/x)}$, but we will not discuss this technicality here.) For instance, the prime number theorem is equivalent to the assertion

$\displaystyle \sum_{n \leq x} \Lambda(n) = x + o(x)$

where ${\Lambda}$ is the von Mangoldt function, while the Riemann hypothesis is equivalent to the stronger assertion

$\displaystyle \sum_{n \leq x} \Lambda(n) = x + O(x^{1/2+o(1)}).$

It is thus of interest to develop techniques to estimate such sums ${\sum_{n \leq x} f(n)}$. Of course, the difficulty of this task depends on how “nice” the function ${f}$ is. The functions ${f}$ that come up in number theory lie on a broad spectrum of “niceness”, with some particularly nice functions being quite easy to sum, and some being insanely difficult.

At the easiest end of the spectrum are those functions ${f}$ that exhibit some sort of regularity or “smoothness”. Examples of smoothness include “Archimedean” smoothness, in which ${f(n)}$ is the restriction of some smooth function ${f: {\bf R} \rightarrow {\bf C}}$ from the reals to the natural numbers, and the derivatives of ${f}$ are well controlled. A typical example is

$\displaystyle \sum_{n \leq x} \log n.$

One can already get quite good bounds on this quantity by comparison with the integral ${\int_1^x \log t\ dt}$, namely

$\displaystyle \sum_{n \leq x} \log n = x \log x - x + O(\log x),$

with sharper bounds available by using tools such as the Euler-Maclaurin formula (see this blog post). Exponentiating such asymptotics, incidentally, leads to one of the standard proofs of Stirling’s formula (as discussed in this blog post).

One can also consider “non-Archimedean” notions of smoothness, such as periodicity relative to a small period ${q}$. Indeed, if ${f}$ is periodic with period ${q}$ (and is thus essentially a function on the cyclic group ${{\bf Z}/q{\bf Z}}$), then one has the easy bound

$\displaystyle \sum_{n \leq x} f(n) = \frac{x}{q} \sum_{n \in {\bf Z}/q{\bf Z}} f(n) + O( \sum_{n \in {\bf Z}/q{\bf Z}} |f(n)| ).$

In particular, we have the fundamental estimate

$\displaystyle \sum_{n \leq x: q|n} 1 = \frac{x}{q} + O(1). \ \ \ \ \ (1)$

This is a good estimate when ${q}$ is much smaller than ${x}$, but as ${q}$ approaches ${x}$ in magnitude, the error term ${O(1)}$ begins to overwhelm the main term ${\frac{x}{q}}$, and one needs much more delicate information on the fractional part of ${\frac{x}{q}}$ in order to obtain good estimates at this point.

One can also consider functions ${f}$ which combine “Archimedean” and “non-Archimedean” smoothness into an “adelic” smoothness. We will not define this term precisely here (though the concept of a Schwartz-Bruhat function is one way to capture this sort of concept), but a typical example might be

$\displaystyle \sum_{n \leq x} \chi(n) \log n$

where ${\chi}$ is periodic with some small period ${q}$. By using techniques such as summation by parts, one can estimate such sums using the techniques used to estimate sums of periodic functions or functions with (Archimedean) smoothness.

Another class of functions that is reasonably well controlled are the multiplicative functions, in which ${f(nm) = f(n) f(m)}$ whenever ${n,m}$ are coprime. Here, one can use the powerful techniques of multiplicative number theory, for instance by working with the Dirichlet series

$\displaystyle \sum_{n=1}^\infty \frac{f(n)}{n^s}$

which are clearly related to the partial sums ${\sum_{n \leq x} f(n)}$ (essentially via the Mellin transform, a cousin of the Fourier and Laplace transforms); for this post we ignore the (important) issue of how to make sense of this series when it is not absolutely convergent (but see this previous blog post for more discussion). A primary reason that this technique is effective is that the Dirichlet series of a multiplicative function factorises as an Euler product

$\displaystyle \sum_{n=1}^\infty \frac{f(n)}{n^s} = \prod_p (\sum_{j=0}^\infty \frac{f(p^j)}{p^{js}}).$

One also obtains similar types of representations for functions that are not quite multiplicative, but are closely related to multiplicative functions, such as the von Mangoldt function ${\Lambda}$ (whose Dirichlet series ${\sum_{n=1}^\infty \frac{\Lambda(n)}{n^s} = -\frac{\zeta'(s)}{\zeta(s)}}$ is not given by an Euler product, but instead by the logarithmic derivative of an Euler product).

Moving another notch along the spectrum between well-controlled and ill-controlled functions, one can consider functions ${f}$ that are divisor sums such as

$\displaystyle f(n) = \sum_{d \leq R; d|n} g(d) = \sum_{d \leq R} 1_{d|n} g(d)$

for some other arithmetic function ${g}$, and some level ${R}$. This is a linear combination of periodic functions ${1_{d|n} g(d)}$ and is thus technically periodic in ${n}$ (with period equal to the least common multiple of all the numbers from ${1}$ to ${R}$), but in practice this periodic is far too large to be useful (except for extremely small levels ${R}$, e.g. ${R = O(\log x)}$). Nevertheless, we can still control the sum ${\sum_{n \leq x} f(n)}$ simply by rearranging the summation:

$\displaystyle \sum_{n \leq x} f(n) = \sum_{d \leq R} g(d) \sum_{n \leq x: d|n} 1$

and thus by (1) one can bound this by the sum of a main term ${x \sum_{d \leq R} \frac{g(d)}{d}}$ and an error term ${O( \sum_{d \leq R} |g(d)| )}$. As long as the level ${R}$ is significantly less than ${x}$, one may expect the main term to dominate, and one can often estimate this term by a variety of techniques (for instance, if ${g}$ is multiplicative, then multiplicative number theory techniques are quite effective, as mentioned previously). Similarly for other slight variants of divisor sums, such as expressions of the form

$\displaystyle \sum_{d \leq R; d | n} g(d) \log \frac{n}{d}$

or expressions of the form

$\displaystyle \sum_{d \leq R} F_d(n)$

where each ${F_d}$ is periodic with period ${d}$.

One of the simplest examples of this comes when estimating the divisor function

$\displaystyle \tau(n) := \sum_{d|n} 1,$

which counts the number of divisors up to ${n}$. This is a multiplicative function, and is therefore most efficiently estimated using the techniques of multiplicative number theory; but for reasons that will become clearer later, let us “forget” the multiplicative structure and estimate the above sum by more elementary methods. By applying the preceding method, we see that

$\displaystyle \sum_{n \leq x} \tau(n) = \sum_{d \leq x} \sum_{n \leq x:d|n} 1$

$\displaystyle = \sum_{d \leq x} (\frac{x}{d} + O(1))$

$\displaystyle = x \log x + O(x). \ \ \ \ \ (2)$

Here, we are (barely) able to keep the error term smaller than the main term; this is right at the edge of the divisor sum method, because the level ${R}$ in this case is equal to ${x}$. Unfortunately, at this high choice of level, it is not always possible to always keep the error term under control like this. For instance, if one wishes to use the standard divisor sum representation

$\displaystyle \Lambda(n) = \sum_{d|n} \mu(d) \log \frac{n}{d},$

where ${\mu}$ is the Möbius function, to compute ${\sum_{n \leq x} \Lambda(n)}$, then one ends up looking at

$\displaystyle \sum_{n \leq x} \Lambda(n) = \sum_{d \leq x} \mu(d) \sum_{n \leq x:d|n} \log \frac{n}{d}$

$\displaystyle = \sum_{d \leq x} \mu(d) ( \frac{n}{d} \log \frac{n}{d} - \frac{n}{d} + O(\log \frac{n}{d}) )$

From Dirichlet series methods, it is not difficult to establish the identities

$\displaystyle \lim_{s\rightarrow 1^+} \sum_{n=1}^\infty \frac{\mu(n)}{n^s} = 0$

and

$\displaystyle \lim_{s \rightarrow 1^+} \sum_{n=1}^\infty \frac{\mu(n) \log n}{n^s} = -1.$

This suggests (but does not quite prove) that one has

$\displaystyle \sum_{n=1}^\infty \frac{\mu(n)}{n} = 0 \ \ \ \ \ (3)$

and

$\displaystyle \sum_{n=1}^\infty \frac{\mu(n)\log n}{n} = -1 \ \ \ \ \ (4)$

in the sense of conditionally convergent series. Assuming one can justify this (which, ultimately, requires one to exclude zeroes of the Riemann zeta function on the line ${\hbox{Re}(s)=1}$, as discussed in this previous post), one is eventually left with the estimate ${x + O(x)}$, which is useless as a lower bound (and recovers only the classical Chebyshev estimate ${\sum_{n \leq x} \Lambda(n) \ll x}$ as the upper bound). The inefficiency here when compared to the situation with the divisor function ${\tau}$ can be attributed to the signed nature of the Möbius function ${\mu(n)}$, which causes some cancellation in the divisor sum expansion that needs to be compensated for with improved estimates.

However, there are a number of tricks available to reduce the level of divisor sums. The simplest comes from exploiting the change of variables ${d \mapsto \frac{n}{d}}$, which can in principle reduce the level by a square root. For instance, when computing the divisor function ${\tau(n) = \sum_{d|n} 1}$, one can observe using this change of variables that every divisor of ${n}$ above ${\sqrt{n}}$ is paired with one below ${\sqrt{n}}$, and so we have

$\displaystyle \tau(n) = 2 \sum_{d \leq \sqrt{n}: d|n} 1 \ \ \ \ \ (5)$

except when ${n}$ is a perfect square, in which case one must subtract one from the right-hand side. Using this reduced-level divisor sum representation, one can obtain an improvement to (2), namely

$\displaystyle \sum_{n \leq x} \tau(n) = x \log x + (2\gamma-1) x + O(\sqrt{x}).$

This type of argument is also known as the Dirichlet hyperbola method. A variant of this argument can also deduce the prime number theorem from (3), (4) (and with some additional effort, one can even drop the use of (4)); this is discussed at this previous blog post.

Using this square root trick, one can now also control divisor sums such as

$\displaystyle \sum_{n \leq x} \tau(n^2+1).$

(Note that ${\tau(n^2+1)}$ has no multiplicativity properties in ${n}$, and so multiplicative number theory techniques cannot be directly applied here.) The level of the divisor sum here is initially of order ${x^2}$, which is too large to be useful; but using the square root trick, we can expand this expression as

$\displaystyle 2 \sum_{n \leq x} \sum_{d \leq n: d | n^2+1} 1$

which one can rewrite as

$\displaystyle 2 \sum_{d \leq x} \sum_{d \leq n \leq x: n^2+1 = 0 \hbox{ mod } d} 1.$

The constraint ${n^2+1=0 \hbox{ mod } d}$ is periodic in ${n}$ with period ${d}$, so we can write this as

$\displaystyle 2 \sum_{d \leq x} ( \frac{x}{d} \rho(d) + O(\rho(d)) )$

where ${\rho(d)}$ is the number of solutions in ${{\bf Z}/d{\bf Z}}$ to the equation ${n^2+1 = 0 \hbox{ mod } d}$, and so

$\displaystyle \sum_{n \leq x} \tau(n^2+1) = 2x \sum_{d \leq x} \frac{\rho(d)}{d} + O(\sum_{d \leq x} \rho(d)).$

The function ${\rho}$ is multiplicative, and can be easily computed at primes ${p}$ and prime powers ${p^j}$ using tools such as quadratic reciprocity and Hensel’s lemma. For instance, by Fermat’s two-square theorem, ${\rho(p)}$ is equal to ${2}$ for ${p=1 \hbox{ mod } 4}$ and ${0}$ for ${p=3 \hbox{ mod } 4}$. From this and standard multiplicative number theory methods (e.g. by obtaining asymptotics on the Dirichlet series ${\sum_d \frac{\rho(d)}{d^s}}$), one eventually obtains the asymptotic

$\displaystyle \sum_{d \leq x} \frac{\rho(d)}{d} = \frac{3}{2\pi} \log x + O(1)$

and also

$\displaystyle \sum_{d \leq x} \rho(d) = O(x)$

and thus

$\displaystyle \sum_{n \leq x} \tau(n^2+1) = \frac{3}{\pi} x \log x + O(x).$

Similar arguments give asymptotics for ${\tau}$ on other quadratic polynomials; see for instance this paper of Hooley and these papers by McKee. Note that the irreducibility of the polynomial will be important. If one considers instead a sum involving a reducible polynomial, such as ${\sum_{n \leq x} \tau(n^2-1)}$, then the analogous quantity ${\rho(n)}$ becomes significantly larger, leading to a larger growth rate (of order ${x \log^2 x}$ rather than ${x\log x}$) for the sum.

However, the square root trick is insufficient by itself to deal with higher order sums involving the divisor function, such as

$\displaystyle \sum_{n \leq x} \tau(n^3+1);$

the level here is initially of order ${x^3}$, and the square root trick only lowers this to about ${x^{3/2}}$, creating an error term that overwhelms the main term. And indeed, the asymptotic for such this sum has not yet been rigorously established (although if one heuristically drops error terms, one can arrive at a reasonable conjecture for this asymptotic), although some results are known if one averages over additional parameters (see e.g. this paper of Greaves, or this paper of Matthiesen).

Nevertheless, there is an ingenious argument of Erdös that allows one to obtain good upper and lower bounds for these sorts of sums, in particular establishing the asymptotic

$\displaystyle x \log x \ll \sum_{n \leq x} \tau(P(n)) \ll x \log x \ \ \ \ \ (6)$

for any fixed irreducible non-constant polynomial ${P}$ that maps ${{\bf N}}$ to ${{\bf N}}$ (with the implied constants depending of course on the choice of ${P}$). There is also the related moment bound

$\displaystyle \sum_{n \leq x} \tau^m(P(n)) \ll x \log^{O(1)} x \ \ \ \ \ (7)$

for any fixed ${P}$ (not necessarily irreducible) and any fixed ${m \geq 1}$, due to van der Corput; this bound is in fact used to dispose of some error terms in the proof of (6). These should be compared with what one can obtain from the divisor bound ${\tau(n) \ll n^{O(1/\log \log n)}}$ and the trivial bound ${\tau(n) \geq 1}$, giving the bounds

$\displaystyle x \ll \sum_{n \leq x} \tau^m(P(n)) \ll x^{1 + O(\frac{1}{\log \log x})}$

for any fixed ${m \geq 1}$.

The lower bound in (6) is easy, since one can simply lower the level in (5) to obtain the lower bound

$\displaystyle \tau(n) \geq \sum_{d \leq n^\theta: d|n} 1$

for any ${\theta>0}$, and the preceding methods then easily allow one to obtain the lower bound by taking ${\theta}$ small enough (more precisely, if ${P}$ has degree ${d}$, one should take ${\theta}$ equal to ${1/d}$ or less). The upper bounds in (6) and (7) are more difficult. Ideally, if we could obtain upper bounds of the form

$\displaystyle \tau(n) \ll \sum_{d \leq n^\theta: d|n} 1 \ \ \ \ \ (8)$

for any fixed ${\theta > 0}$, then the preceding methods would easily establish both results. Unfortunately, this bound can fail, as illustrated by the following example. Suppose that ${n}$ is the product of ${k}$ distinct primes ${p_1 \ldots p_k}$, each of which is close to ${n^{1/k}}$. Then ${n}$ has ${2^k}$ divisors, with ${\binom{n}{j}}$ of them close to ${n^{j/k}}$ for each ${0 \ldots j \leq k}$. One can think of (the logarithms of) these divisors as being distributed according to what is essentially a Bernoulli distribution, thus a randomly selected divisor of ${n}$ has magnitude about ${n^{j/k}}$, where ${j}$ is a random variable which has the same distribution as the number of heads in ${k}$ independently tossed fair coins. By the law of large numbers, ${j}$ should concentrate near ${k/2}$ when ${k}$ is large, which implies that the majority of the divisors of ${n}$ will be close to ${n^{1/2}}$. Sending ${k \rightarrow \infty}$, one can show that the bound (8) fails whenever ${\theta < 1/2}$.

This however can be fixed in a number of ways. First of all, even when ${\theta<1/2}$, one can show weaker substitutes for (8). For instance, for any fixed ${\theta > 0}$ and ${m \geq 1}$ one can show a bound of the form

$\displaystyle \tau(n)^m \ll \sum_{d \leq n^\theta: d|n} \tau(d)^C \ \ \ \ \ (9)$

for some ${C}$ depending only on ${m,\theta}$. This nice elementary inequality (first observed by Landreau) already gives a quite short proof of van der Corput’s bound (7).

For Erdös’s upper bound (6), though, one cannot afford to lose these additional factors of ${\tau(d)}$, and one must argue more carefully. Here, the key observation is that the counterexample discussed earlier – when the natural number ${n}$ is the product of a large number of fairly small primes – is quite atypical; most numbers have at least one large prime factor. For instance, the number of natural numbers less than ${x}$ that contain a prime factor between ${x^{1/2}}$ and ${x}$ is equal to

$\displaystyle \sum_{x^{1/2} \leq p \leq x} (\frac{x}{p} + O(1)),$

which, thanks to Mertens’ theorem

$\displaystyle \sum_{p \leq x} \frac{1}{p} = \log\log x + M+o(1)$

for some absolute constant ${M}$, is comparable to ${x}$. In a similar spirit, one can show by similarly elementary means that the number of natural numbers ${m}$ less than ${x}$ that are ${x^{1/m}}$-smooth, in the sense that all prime factors are at most ${x^{1/m}}$, is only about ${m^{-cm} x}$ or so. Because of this, one can hope that the bound (8), while not true in full generality, will still be true for most natural numbers ${n}$, with some slightly weaker substitute available (such as (7)) for the exceptional numbers ${n}$. This turns out to be the case by an elementary but careful argument.

The Erdös argument is quite robust; for instance, the more general inequality

$\displaystyle x \log^{2^m-1} x \ll \sum_{n \leq x} \tau(P(n))^m \ll x \log^{2^m-1} x$

for fixed irreducible ${P}$ and ${m \geq 1}$, which improves van der Corput’s inequality (8) was shown by Delmer using the same methods. (A slight error in the original paper of Erdös was also corrected in this latter paper.) In a forthcoming revision to my paper on the Erdös-Straus conjecture, Christian Elsholtz and I have also applied this method to obtain bounds such as

$\displaystyle \sum_{a \leq A} \sum_{b \leq B} \tau(a^2 b + 1) \ll AB \log(A+B),$

which turn out to be enough to obtain the right asymptotics for the number of solutions to the equation ${\frac{4}{p}= \frac{1}{x}+\frac{1}{y}+\frac{1}{z}}$.

Below the fold I will provide some more details of the arguments of Landreau and of Erdös.