
I’ve just uploaded to the arXiv my paper “Sendov’s conjecture for sufficiently high degree polynomials”. This paper is a contribution to an old conjecture of Sendov on the zeroes of polynomials:

Conjecture 1 (Sendov’s conjecture) Let ${f: {\bf C} \rightarrow {\bf C}}$ be a polynomial of degree ${n \geq 2}$ that has all zeroes in the closed unit disk ${\{ z: |z| \leq 1 \}}$. If ${\lambda_0}$ is one of these zeroes, then ${f'}$ has at least one zero in ${\{z: |z-\lambda_0| \leq 1\}}$.
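To make the statement concrete, here is a minimal numerical sketch for the cubic case only, where the zeroes of ${f'}$ come from the quadratic formula. The function name `sendov_gap` and the sample zeroes are illustrative assumptions, not anything from the paper. The quantity computed is the largest distance from a zero of ${f}$ to its nearest zero of ${f'}$, which the conjecture asserts is at most ${1}$.

```python
import cmath

def sendov_gap(zeros):
    """Largest distance from a zero of a monic cubic to its nearest critical point."""
    l1, l2, l3 = zeros
    # f(z) = z^3 + a2 z^2 + a1 z + a0, so f'(z) = 3 z^2 + 2 a2 z + a1.
    a2 = -(l1 + l2 + l3)
    a1 = l1 * l2 + l1 * l3 + l2 * l3
    root = cmath.sqrt(4 * a2 * a2 - 12 * a1)
    critical = [(-2 * a2 + root) / 6, (-2 * a2 - root) / 6]
    return max(min(abs(l - c) for c in critical) for l in zeros)

# Hypothetical test data: three zeroes in the closed unit disk.
gap = sendov_gap([1, -0.5 + 0.5j, -0.5 - 0.5j])

# Extremal example f(z) = z^3 - 1: f' vanishes only at the origin.
cube_roots = [cmath.exp(2j * cmath.pi * k / 3) for k in range(3)]
extremal_gap = sendov_gap(cube_roots)
```

The second example is the ${n=3}$ instance of ${f(z) = z^n - 1}$, for which every zero of ${f}$ is at distance exactly ${1}$ from the zeroes of ${f'}$, so the bound in the conjecture cannot be improved.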

It is common in the literature on this problem to normalise ${f}$ to be monic, and to rotate the zero ${\lambda_0}$ to be an element ${a}$ of the unit interval ${[0,1]}$. As it turns out, the location of ${a}$ on this unit interval ${[0,1]}$ ends up playing an important role in the arguments.

Many cases of this conjecture are already known, for instance

• When ${n<9}$ (Brown-Xiang 1999);
• When ${a=0}$ (Gauss-Lucas theorem);
• When ${a \leq \frac{1}{n-1}}$ (Bojanov 2011);
• When ${c \leq a \leq 1-c}$ for a fixed ${c>0}$, and ${n}$ is sufficiently large depending on ${c}$ (Dégot 2014);
• When ${C n^{-1/7} \leq a \leq 1 - C n^{-1/4}}$ for a sufficiently large absolute constant ${C}$ (Chalebgwa 2020);
• When ${a=1}$ (Rubinstein 1968; Goodman-Rahman-Ratti 1969; Joyal 1969);
• When ${a \geq 1-\varepsilon_n}$, where ${\varepsilon_n>0}$ is sufficiently small depending on ${n}$ (Miller 1993; Vajaitu-Zaharescu 1993);
• When ${a \geq 1 - \frac{1}{2 n^9 4^n}}$ (Chijiwa 2011);
• When ${a \geq 1 - \frac{90}{n^{12} \log n}}$ (Kasmalkar 2014).

In particular, in high degrees the only cases left uncovered by prior results are when ${a}$ is close (but not too close) to ${0}$, or when ${a}$ is close (but not too close) to ${1}$; see Figure 1 of my paper.

Our main result covers the high degree case uniformly for all values of ${a \in [0,1]}$:

Theorem 2 There exists an absolute constant ${n_0}$ such that Sendov’s conjecture holds for all ${n \geq n_0}$.

In principle, this reduces the verification of Sendov’s conjecture to a finite time computation, although our arguments use compactness methods and thus do not easily provide an explicit value of ${n_0}$. I believe that the compactness arguments can be replaced with quantitative substitutes that provide an explicit ${n_0}$, but the value of ${n_0}$ produced is likely to be extremely large (certainly much larger than ${9}$).

Because of the previous results (particularly those of Chalebgwa and Chijiwa), we will only need to establish the following two subcases of the above theorem:

Theorem 3 (Sendov’s conjecture near the origin) Under the additional hypothesis ${a = o(1/\log n)}$, Sendov’s conjecture holds for sufficiently large ${n}$.

Theorem 4 (Sendov’s conjecture near the unit circle) Under the additional hypothesis ${1-o(1) \leq a \leq 1 - \varepsilon_0^n}$ for a fixed ${\varepsilon_0>0}$, Sendov’s conjecture holds for sufficiently large ${n}$.

We approach these theorems using the “compactness and contradiction” strategy, assuming that there is a sequence of counterexamples whose degrees ${n}$ go to infinity, using various compactness theorems to extract various asymptotic objects in the limit ${n \rightarrow \infty}$, and somehow using these objects to derive a contradiction. There are many ways to effect such a strategy; we will use a formalism that I call “cheap nonstandard analysis” and which is common in the PDE literature, in which one repeatedly passes to subsequences as necessary whenever one invokes a compactness theorem to create a limit object. However, the particular choice of asymptotic formalism one selects is not of essential importance for the arguments.

I also found it useful to use the language of probability theory. Given a putative counterexample ${f}$ to Sendov’s conjecture, let ${\lambda}$ be a zero of ${f}$ (chosen uniformly at random among the ${n}$ zeroes of ${f}$, counting multiplicity), and let ${\zeta}$ similarly be a uniformly random zero of ${f'}$. We introduce the logarithmic potentials

$\displaystyle U_\lambda(z) := {\bf E} \log \frac{1}{|z-\lambda|}; \quad U_\zeta(z) := {\bf E} \log \frac{1}{|z-\zeta|}$

and the Stieltjes transforms

$\displaystyle s_\lambda(z) := {\bf E} \frac{1}{z-\lambda}; \quad s_\zeta(z) := {\bf E} \frac{1}{z-\zeta}.$

Standard calculations using the fundamental theorem of algebra yield the basic identities

$\displaystyle U_\lambda(z) = \frac{1}{n} \log \frac{1}{|f(z)|}; \quad U_\zeta(z) = \frac{1}{n-1} \log \frac{n}{|f'(z)|}$

and

$\displaystyle s_\lambda(z) = \frac{1}{n} \frac{f'(z)}{f(z)}; \quad s_\zeta(z) = \frac{1}{n-1} \frac{f''(z)}{f'(z)} \ \ \ \ \ (1)$

and in particular the random variables ${\lambda, \zeta}$ are linked to each other by the identity

$\displaystyle U_\lambda(z) - \frac{n-1}{n} U_\zeta(z) = \frac{1}{n} \log |s_\lambda(z)|. \ \ \ \ \ (2)$
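The identity (2) is easy to test numerically. The following is a small sketch restricted to monic cubics, so that the zeroes of ${f'}$ are available in closed form; the helper name `potential_identity`, the sample zeroes and the evaluation point are all arbitrary choices made for illustration.

```python
import cmath
import math

def potential_identity(zeros, z):
    """Return both sides of identity (2) for the monic cubic with the given zeroes."""
    n = 3
    l1, l2, l3 = zeros
    a2 = -(l1 + l2 + l3)
    a1 = l1 * l2 + l1 * l3 + l2 * l3
    root = cmath.sqrt(4 * a2 * a2 - 12 * a1)
    crit = [(-2 * a2 + root) / 6, (-2 * a2 - root) / 6]  # zeroes of f'
    # Expectations over a uniformly random zero are plain averages.
    U_lam = sum(math.log(1 / abs(z - l)) for l in zeros) / n
    U_zeta = sum(math.log(1 / abs(z - c)) for c in crit) / (n - 1)
    s_lam = sum(1 / (z - l) for l in zeros) / n
    return U_lam - (n - 1) / n * U_zeta, math.log(abs(s_lam)) / n

lhs, rhs = potential_identity([0.9, -0.3 + 0.4j, 0.1 - 0.7j], 1.5 + 0.5j)
```

The two returned values should agree up to rounding error, reflecting the fact that ${s_\lambda = f'/(nf)}$ for monic ${f}$.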

On the other hand, the hypotheses of Sendov’s conjecture (and the Gauss-Lucas theorem) place ${\lambda,\zeta}$ inside the unit disk ${\{ z:|z| \leq 1\}}$. Applying Prokhorov’s theorem, and passing to a subsequence, one can then assume that the random variables ${\lambda,\zeta}$ converge in distribution to some limiting random variables ${\lambda^{(\infty)}, \zeta^{(\infty)}}$ (possibly defined on a different probability space than the original variables ${\lambda,\zeta}$), also living almost surely inside the unit disk. Standard potential theory then gives the convergence

$\displaystyle U_\lambda(z) \rightarrow U_{\lambda^{(\infty)}}(z); \quad U_\zeta(z) \rightarrow U_{\zeta^{(\infty)}}(z) \ \ \ \ \ (3)$

and

$\displaystyle s_\lambda(z) \rightarrow s_{\lambda^{(\infty)}}(z); \quad s_\zeta(z) \rightarrow s_{\zeta^{(\infty)}}(z) \ \ \ \ \ (4)$

at least in the local ${L^1}$ sense. Among other things, we then conclude from the identity (2) and some elementary inequalities that

$\displaystyle U_{\lambda^{(\infty)}}(z) = U_{\zeta^{(\infty)}}(z)$

for all ${|z|>1}$. This turns out to have an appealing interpretation in terms of Brownian motion: if one takes two Brownian motions in the complex plane, one originating from ${\lambda^{(\infty)}}$ and one originating from ${\zeta^{(\infty)}}$, then the location where these Brownian motions first exit the unit disk ${\{ z: |z| \leq 1 \}}$ will have the same distribution. (In our paper we actually replace Brownian motion with the closely related formalism of balayage.) This turns out to connect the random variables ${\lambda^{(\infty)}}$, ${\zeta^{(\infty)}}$ quite closely to each other. In particular, with this observation and some additional arguments involving both the unique continuation property for harmonic functions and Grace’s theorem (discussed in this previous post), with the latter drawn from the prior work of Dégot, we can get very good control on these distributions:

Theorem 5
• (i) If ${a = o(1)}$, then ${\lambda^{(\infty)}, \zeta^{(\infty)}}$ almost surely lie in the semicircle ${\{ e^{i\theta}: \pi/2 \leq \theta \leq 3\pi/2\}}$ and have the same distribution.
• (ii) If ${a = 1-o(1)}$, then ${\lambda^{(\infty)}}$ is uniformly distributed on the circle ${\{ z: |z|=1\}}$, and ${\zeta^{(\infty)}}$ is almost surely zero.

In case (i) (and strengthening the hypothesis ${a=o(1)}$ to ${a=o(1/\log n)}$ to control some technical contributions of “outlier” zeroes of ${f}$), we can use this information about ${\lambda^{(\infty)}}$ and (4) to ensure that the normalised logarithmic derivative ${\frac{1}{n} \frac{f'}{f} = s_\lambda}$ has a non-negative winding number in a certain small (but not too small) circle around the origin, which by the argument principle is inconsistent with the hypothesis that ${f}$ has a zero at ${a = o(1)}$ and that ${f'}$ has no zeroes near ${a}$. This is how we establish Theorem 3.

Case (ii) turns out to be more delicate. This is because there are a number of “near-counterexamples” to Sendov’s conjecture that are compatible with the hypotheses and conclusion of case (ii). The simplest such example is ${f(z) = z^n - 1}$, where the zeroes ${\lambda}$ of ${f}$ are uniformly distributed amongst the ${n^{th}}$ roots of unity (including at ${a=1}$), and the zeroes of ${f'}$ are all located at the origin. In my paper I also discuss a variant of this construction, in which ${f'}$ has zeroes mostly near the origin, but also acquires a bounded number of zeroes at various locations ${\lambda_1+o(1),\dots,\lambda_m+o(1)}$ inside the unit disk. Specifically, we take

$\displaystyle f(z) := \left(z + \frac{c_2}{n}\right)^{n-m} P(z) - \left(a + \frac{c_2}{n}\right)^{n-m} P(a)$

where ${a = 1 - \frac{c_1}{n}}$ for some constants ${0 < c_1 < c_2}$ and

$\displaystyle P(z) := (z-\lambda_1) \dots (z-\lambda_m).$

By a perturbative analysis to locate the zeroes of ${f}$, one eventually would be able to arrive at a true counterexample to Sendov’s conjecture if these locations ${\lambda_1,\dots,\lambda_m}$ were in the open lune

$\displaystyle \{ \lambda: |\lambda| < 1 < |\lambda-1| \}$

and if one had the inequality

$\displaystyle c_2 - c_1 - c_2 \cos \theta + \sum_{j=1}^m \log \left|\frac{1 - \lambda_j}{e^{i\theta} - \lambda_j}\right| < 0 \ \ \ \ \ (5)$

for all ${0 \leq \theta \leq 2\pi}$. However, if one takes the mean of this inequality in ${\theta}$, one arrives at the inequality

$\displaystyle c_2 - c_1 + \sum_{j=1}^m \log |1 - \lambda_j| < 0$

which is incompatible with the hypotheses ${c_2 > c_1}$ and ${|\lambda_j-1| > 1}$. In order to extend this argument to more general polynomials ${f}$, we require a stability analysis of the endpoint equation

$\displaystyle c_2 - c_1 - c_2 \cos \theta + \sum_{j=1}^m \log \left|\frac{1 - \lambda_j}{e^{i\theta} - \lambda_j}\right| = 0 \ \ \ \ \ (6)$

where we now only assume the closed conditions ${c_2 \geq c_1}$ and ${|\lambda_j-1| \geq 1}$. The above discussion then places all the zeroes ${\lambda_j}$ on the arc

$\displaystyle \{ \lambda: |\lambda| < 1 = |\lambda-1|\} \ \ \ \ \ (7)$

and if one also takes the second Fourier coefficient of (6) one also obtains the vanishing second moment

$\displaystyle \sum_{j=1}^m \lambda_j^2 = 0.$

These two conditions are incompatible with each other (except in the degenerate case when all the ${\lambda_j}$ vanish), because all the non-zero elements ${\lambda}$ of the arc (7) have argument in ${\pm [\pi/3,\pi/2]}$, so in particular their square ${\lambda^2}$ will have negative real part. It turns out that one can adapt this argument to the more general potential counterexamples to Sendov’s conjecture (in the form of Theorem 4). The starting point is to use (1), (4), and Theorem 5(ii) to obtain good control on ${f''/f'}$, which one then integrates and exponentiates to get good control on ${f'}$, and then on a second integration one gets enough information about ${f}$ to pin down the location of its zeroes to high accuracy. The constraint that these zeroes lie inside the unit disk then gives an inequality resembling (5), and an adaptation of the above stability analysis is then enough to conclude. The arguments here are inspired by the previous arguments of Miller, which treated the case when ${a}$ was extremely close to ${1}$ via a similar perturbative analysis; the main novelty is to control the error terms not in terms of the magnitude of the largest zero ${\zeta}$ of ${f'}$ (which is difficult to manage when ${n}$ gets large), but rather by the variance of those zeroes, which ends up being a more tractable expression to keep track of.
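The averaging step applied to (5) above can be checked numerically. The sketch below assumes some hypothetical parameter values (${c_1, c_2}$ and two points ${\lambda_j}$ in the open lune, chosen only for illustration) and compares the mean of the left-hand side of (5) over a uniform grid in ${\theta}$ with the predicted value ${c_2 - c_1 + \sum_j \log |1-\lambda_j|}$.

```python
import cmath
import math

def lhs_of_5(theta, c1, c2, lams):
    """Left-hand side of (5) at angle theta."""
    w = cmath.exp(1j * theta)
    total = c2 - c1 - c2 * math.cos(theta)
    for lam in lams:
        total += math.log(abs(1 - lam) / abs(w - lam))
    return total

def mean_over_circle(c1, c2, lams, samples=4096):
    """Average of the left-hand side of (5) over a uniform grid in theta."""
    return sum(lhs_of_5(2 * math.pi * k / samples, c1, c2, lams)
               for k in range(samples)) / samples

# Hypothetical parameters: 0 < c1 < c2 and two points with |lam| < 1 < |lam - 1|.
c1, c2 = 1.0, 2.0
lams = [-0.3 + 0.2j, -0.1 - 0.5j]
mean = mean_over_circle(c1, c2, lams)
predicted = c2 - c1 + sum(math.log(abs(1 - lam)) for lam in lams)
```

Here the cosine term averages to zero, and so does each ${\log |e^{i\theta} - \lambda_j|}$ since ${|\lambda_j| < 1}$; the predicted mean is then positive, which is why (5) cannot hold for every ${\theta}$.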

A useful rule of thumb in complex analysis is that holomorphic functions ${f(z)}$ behave like large degree polynomials ${P(z)}$. This can be evidenced for instance at a “local” level by the Taylor series expansion for a complex analytic function in the disk, or at a “global” level by factorisation theorems such as the Weierstrass factorisation theorem (or the closely related Hadamard factorisation theorem). One can truncate these theorems in a variety of ways (e.g., Taylor’s theorem with remainder) to be able to approximate a holomorphic function by a polynomial on various domains.

In some cases it can be convenient instead to work with polynomials ${P(Z)}$ of another variable ${Z}$ such as ${Z = e^{2\pi i z}}$ (or more generally ${Z=e^{2\pi i z/N}}$ for a scaling parameter ${N}$). In the case of the Riemann zeta function, defined by meromorphic continuation of the formula

$\displaystyle \zeta(s) = \sum_{n=1}^\infty \frac{1}{n^s} \ \ \ \ \ (1)$

one ends up having the following heuristic approximation in the neighbourhood of a point ${\frac{1}{2}+it}$ on the critical line:

Heuristic 1 (Polynomial approximation) Let ${T \ggg 1}$ be a height, let ${t}$ be a “typical” element of ${[T,2T]}$, and let ${1 \lll N \ll \log T}$ be an integer. Let ${\phi_t = \phi_{t,T}: {\bf C} \rightarrow {\bf C}}$ be the linear change of variables

$\displaystyle \phi_t(z) := \frac{1}{2} + it - \frac{2\pi i z}{\log T}.$

Then one has an approximation

$\displaystyle \zeta( \phi_t(z) ) \approx P_t( e^{2\pi i z/N} ) \ \ \ \ \ (2)$

for ${z = o(N)}$ and some polynomial ${P_t = P_{t,T}}$ of degree ${N}$.

The requirement ${z=o(N)}$ is necessary since the right-hand side is periodic with period ${N}$ in the ${z}$ variable (or period ${\frac{2\pi i N}{\log T}}$ in the ${s = \phi_t(z)}$ variable), whereas the zeta function is not expected to have any such periodicity, even approximately.

Let us give two non-rigorous justifications of this heuristic. Firstly, it is standard that inside the critical strip (with ${\mathrm{Im}(s) = O(T)}$) we have an approximate form

$\displaystyle \zeta(s) \approx \sum_{n \leq T} \frac{1}{n^s}$

of (1). If we group the integers ${n}$ from ${1}$ to ${T}$ into ${N}$ bins depending on what powers of ${T^{1/N}}$ they lie between, we thus have

$\displaystyle \zeta(s) \approx \sum_{j=0}^N \sum_{T^{j/N} \leq n < T^{(j+1)/N}} \frac{1}{n^s}$

For ${s = \phi_t(z)}$ with ${z = o(N)}$ and ${T^{j/N} \leq n < T^{(j+1)/N}}$ we heuristically have

$\displaystyle \frac{1}{n^s} \approx \frac{1}{n^{\frac{1}{2}+it}} e^{2\pi i j z / N}$

and so

$\displaystyle \zeta(s) \approx \sum_{j=0}^N a_j(t) (e^{2\pi i z/N})^j$

where ${a_j(t)}$ are the partial Dirichlet series

$\displaystyle a_j(t) \approx \sum_{T^{j/N} \leq n < T^{(j+1)/N}} \frac{1}{n^{\frac{1}{2}+it}}. \ \ \ \ \ (3)$

This gives the desired polynomial approximation.
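The binning construction can be written out as a short sketch. The values of ${T}$, ${N}$ and ${t}$ below are small illustrative choices, far from the asymptotic regime of the heuristic, and the function names are hypothetical. The code compares the truncated Dirichlet series at ${s = \phi_t(z)}$ with the polynomial ${\sum_j a_j(t) Z^j}$ built from the partial sums (3).

```python
import cmath
import math

def direct_sum(t, z, T):
    """Truncated Dirichlet series sum_{n <= T} n^{-s} at s = phi_t(z)."""
    s = 0.5 + 1j * t - 2j * math.pi * z / math.log(T)
    return sum(cmath.exp(-s * math.log(n)) for n in range(1, T + 1))

def binned_polynomial(t, z, T, N):
    """Evaluate sum_j a_j(t) Z^j with Z = exp(2 pi i z / N) and a_j(t) as in (3)."""
    coeffs = [0j] * (N + 1)
    for n in range(1, T + 1):
        # Bin index j with T^{j/N} <= n < T^{(j+1)/N}; n = T goes into bin N.
        j = min(N, int(N * math.log(n) / math.log(T)))
        coeffs[j] += cmath.exp(-(0.5 + 1j * t) * math.log(n))
    Z = cmath.exp(2j * math.pi * z / N)
    return sum(a * Z ** j for j, a in enumerate(coeffs))

T, N, t = 1000, 10, 1500.0
gap_at_zero = abs(direct_sum(t, 0.0, T) - binned_polynomial(t, 0.0, T, N))
gap_nearby = abs(direct_sum(t, 0.2, T) - binned_polynomial(t, 0.2, T, N))
```

At ${z=0}$ the two expressions coincide exactly. For real ${z \neq 0}$, replacing ${\log n/\log T}$ by ${j/N}$ changes each phase by at most ${2\pi |z|/N}$, so the discrepancy is at most ${\frac{2\pi |z|}{N} \sum_{n \leq T} n^{-1/2}}$; this crude bound is all the sketch verifies, and says nothing about how good the approximation is for “typical” ${t}$.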

A second non-rigorous justification is as follows. From factorisation theorems such as the Hadamard factorisation theorem we expect to have

$\displaystyle \zeta(s) \propto \prod_\rho (s-\rho) \times \dots$

where ${\rho}$ runs over the non-trivial zeroes of ${\zeta}$, and there are some additional factors arising from the trivial zeroes and poles of ${\zeta}$ which we will ignore here; we will also completely ignore the issue of how to renormalise the product to make it converge properly. In the region ${s = \frac{1}{2} + it + o( N / \log T) = \phi_t( \{ z: z = o(N) \})}$, the dominant contribution to this product (besides multiplicative constants) should arise from zeroes ${\rho}$ that are also in this region. The Riemann-von Mangoldt formula suggests that for “typical” ${t}$ one should have about ${N}$ such zeroes. If one lets ${\rho_1,\dots,\rho_N}$ be any enumeration of ${N}$ zeroes closest to ${\frac{1}{2}+it}$, and then repeats this set of zeroes periodically by period ${\frac{2\pi i N}{\log T}}$, one then expects to have an approximation of the form

$\displaystyle \zeta(s) \propto \prod_{j=1}^N \prod_{k \in {\bf Z}} (s-(\rho_j+\frac{2\pi i kN}{\log T}) )$

again ignoring all issues of convergence. If one writes ${s = \phi_t(z)}$ and ${\rho_j = \phi_t(\lambda_j)}$, then Euler’s famous product formula for sine basically gives

$\displaystyle \prod_{k \in {\bf Z}} (s-(\rho_j+\frac{2\pi i kN}{\log T}) ) \propto \prod_{k \in {\bf Z}} (z - (\lambda_j+k N) )$

$\displaystyle \propto (e^{2\pi i z/N} - e^{2\pi i \lambda_j/N})$

(here we are glossing over some technical issues regarding renormalisation of the infinite products, which can be dealt with by studying the asymptotics as ${\mathrm{Im}(z) \rightarrow \infty}$) and hence we expect

$\displaystyle \zeta(s) \propto \prod_{j=1}^N (e^{2\pi i z/N} - e^{2\pi i \lambda_j/N}).$

This again gives the desired polynomial approximation.

Below the fold we give a rigorous version of the second argument suitable for “microscale” analysis. More precisely, we will show

Theorem 2 Let ${N = N(T)}$ be an integer going sufficiently slowly to infinity. Let ${W_0 \ll N}$ go to infinity sufficiently slowly depending on ${N}$. Let ${t}$ be drawn uniformly at random from ${[T,2T]}$. Then with probability ${1-o(1)}$ (in the limit ${T \rightarrow \infty}$), and possibly after adjusting ${N}$ by ${1}$, there exists a polynomial ${P_t(Z)}$ of degree ${N}$ and obeying the functional equation (9) below, such that

$\displaystyle \zeta( \phi_t(z) ) = (1+o(1)) P_t( e^{2\pi i z/N} ) \ \ \ \ \ (4)$

whenever ${|z| \leq W_0}$.

It should be possible to refine the arguments to extend this theorem to the mesoscale setting by letting ${N}$ be anything growing like ${o(\log T)}$, and ${W_0}$ anything growing like ${o(N)}$; also we should be able to delete the need to adjust ${N}$ by ${1}$. We have not attempted these optimisations here.

Many conjectures and arguments involving the Riemann zeta function can be heuristically translated into arguments involving the polynomials ${P_t(Z)}$, which one can view as random degree ${N}$ polynomials if ${t}$ is interpreted as a random variable drawn uniformly at random from ${[T,2T]}$. These can be viewed as providing a “toy model” for the theory of the Riemann zeta function, in which the complex analysis is simplified to the study of the zeroes and coefficients of this random polynomial (for instance, the role of the gamma function is now played by a monomial in ${Z}$). This model also makes the zeta function theory more closely resemble the function field analogues of this theory (in which the analogue of the zeta function is also a polynomial (or a rational function) in some variable ${Z}$, as per the Weil conjectures). The parameter ${N}$ is at our disposal to choose, and reflects the scale ${\approx N/\log T}$ at which one wishes to study the zeta function. For “macroscopic” questions, at which one wishes to understand the zeta function at unit scales, it is natural to take ${N \approx \log T}$ (or very slightly larger), while for “microscopic” questions one would take ${N}$ close to ${1}$ and only growing very slowly with ${T}$. For the intermediate “mesoscopic” scales one would take ${N}$ somewhere between ${1}$ and ${\log T}$. Unfortunately, the statistical properties of ${P_t}$ are only understood well at a conjectural level at present; even if one assumes the Riemann hypothesis, our understanding of ${P_t}$ is largely restricted to the computation of low moments (e.g., the second or fourth moments) of various linear statistics of ${P_t}$ and related functions (e.g., ${1/P_t}$, ${P'_t/P_t}$, or ${\log P_t}$).

Let’s now heuristically explore the polynomial analogues of this theory in a bit more detail. The Riemann hypothesis basically corresponds to the assertion that all the ${N}$ zeroes of the polynomial ${P_t(Z)}$ lie on the unit circle ${|Z|=1}$ (which, after the change of variables ${Z = e^{2\pi i z/N}}$, corresponds to ${z}$ being real); in a similar vein, the GUE hypothesis corresponds to ${P_t(Z)}$ having the asymptotic law of a random scalar ${a_N(t)}$ times the characteristic polynomial of a random unitary ${N \times N}$ matrix. Next, we consider what happens to the functional equation

$\displaystyle \zeta(s) = \chi(s) \zeta(1-s) \ \ \ \ \ (5)$

where

$\displaystyle \chi(s) := 2^s \pi^{s-1} \sin(\frac{\pi s}{2}) \Gamma(1-s).$

A routine calculation involving Stirling’s formula reveals that

$\displaystyle \chi(\frac{1}{2}+it) = (1+o(1)) e^{-2\pi i L(t)} \ \ \ \ \ (6)$

with ${L(t) := \frac{t}{2\pi} \log \frac{t}{2\pi} - \frac{t}{2\pi} + \frac{7}{8}}$; one also has the closely related approximation

$\displaystyle \frac{\chi'}{\chi}(s) = -\log T + O(1) \ \ \ \ \ (7)$

and hence

$\displaystyle \chi(\phi_t(z)) = (1+o(1)) e^{-2\pi i L(t)} e^{2\pi i z} \ \ \ \ \ (8)$

when ${z = o(\log T)}$. Since ${\zeta(1-s) = \overline{\zeta(\overline{1-s})}}$, applying (5) with ${s = \phi_t(z)}$ and using the approximation (2) suggests a functional equation for ${P_t}$:

$\displaystyle P_t(e^{2\pi i z/N}) = e^{-2\pi i L(t)} e^{2\pi i z} \overline{P_t(e^{2\pi i \overline{z}/N})}$

or in terms of ${Z := e^{2\pi i z/N}}$,

$\displaystyle P_t(Z) = e^{-2\pi i L(t)} Z^N \overline{P_t}(1/Z) \ \ \ \ \ (9)$

where ${\overline{P_t}(Z) := \overline{P_t(\overline{Z})}}$ is the polynomial ${P_t}$ with all the coefficients replaced by their complex conjugate. Thus if we write

$\displaystyle P_t(Z) = \sum_{j=0}^N a_j(t) Z^j$

then the functional equation can be written as

$\displaystyle a_j(t) = e^{-2\pi i L(t)} \overline{a_{N-j}(t)}.$

We remark that if we use the heuristic (3) (interpreting the cutoffs in the ${n}$ summation in a suitably vague fashion) then this equation can be viewed as an instance of the Poisson summation formula.

Another consequence of the functional equation is that the zeroes of ${P_t}$ are symmetric with respect to inversion ${Z \mapsto 1/\overline{Z}}$ across the unit circle. This is of course consistent with the Riemann hypothesis, but does not obviously imply it. The phase ${L(t)}$ is of little consequence in this functional equation; one could easily conceal it by working with the phase rotation ${e^{\pi i L(t)} P_t}$ of ${P_t}$ instead.

A further consequence of the functional equation is that ${e^{\pi i L(t)} e^{-i N \theta/2} P_t(e^{i\theta})}$ is real for any ${\theta \in {\bf R}}$; the same is then true for the derivative ${e^{\pi i L(t)} e^{-i N \theta/2} (i e^{i\theta} P'_t(e^{i\theta}) - i \frac{N}{2} P_t(e^{i\theta}))}$. Among other things, this implies that ${P'_t(e^{i\theta})}$ cannot vanish unless ${P_t(e^{i\theta})}$ does also; thus the zeroes of ${P'_t}$ will not lie on the unit circle except where ${P_t}$ has repeated zeroes. The analogous statement is true for ${\zeta}$; the zeroes of ${\zeta'}$ will not lie on the critical line except where ${\zeta}$ has repeated zeroes.
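The reality property is simple to test on a toy polynomial. The sketch below builds coefficients obeying ${a_j = e^{-2\pi i L} \overline{a_{N-j}}}$ by symmetrising an arbitrary coefficient vector (the vector `b` and the phase `L` are made-up test data, and the function names are hypothetical), and then evaluates ${e^{\pi i L} e^{-iN\theta/2} P(e^{i\theta})}$ at a few angles.

```python
import cmath
import math

def self_reciprocal(b, L):
    """Symmetrise b_0..b_N so that a_j = e^{-2 pi i L} conj(a_{N-j})."""
    N = len(b) - 1
    omega = cmath.exp(-2j * math.pi * L)
    return [b[j] + omega * b[N - j].conjugate() for j in range(N + 1)]

def rotated_value(a, L, theta):
    """e^{pi i L} e^{-i N theta / 2} P(e^{i theta}) for P(Z) = sum_j a_j Z^j."""
    N = len(a) - 1
    Z = cmath.exp(1j * theta)
    P = sum(c * Z ** j for j, c in enumerate(a))
    return cmath.exp(1j * math.pi * L) * cmath.exp(-0.5j * N * theta) * P

L = 0.37
b = [1 + 2j, -0.5j, 0.3 + 0j, 2 - 1j, 0.7 + 0.1j]   # arbitrary test coefficients
a = self_reciprocal(b, L)
values = [rotated_value(a, L, theta) for theta in (0.1, 1.0, 2.5)]
```

Each entry of `values` should have vanishing imaginary part up to rounding: the ${j}$ and ${N-j}$ terms of the rotated sum are complex conjugates of each other.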

Relating to this fact, it is a classical result of Speiser that the Riemann hypothesis is true if and only if all the zeroes of the derivative ${\zeta'}$ of the zeta function in the critical strip lie on or to the right of the critical line. The analogous result for polynomials is

Proposition 3 We have

$\displaystyle \# \{ |Z| = 1: P_t(Z) = 0 \} = N - 2 \# \{ |Z| > 1: P'_t(Z) = 0 \}$

(where all zeroes are counted with multiplicity.) In particular, the zeroes of ${P_t(Z)}$ all lie on the unit circle if and only if the zeroes of ${P'_t(Z)}$ lie in the closed unit disk.

Proof: From the functional equation we have

$\displaystyle \# \{ |Z| = 1: P_t(Z) = 0 \} = N - 2 \# \{ |Z| > 1: P_t(Z) = 0 \}.$

Thus it will suffice to show that ${P_t}$ and ${P'_t}$ have the same number of zeroes outside the closed unit disk.

Set ${f(z) := z \frac{P'_t(z)}{P_t(z)}}$; then ${f}$ is a rational function that does not have a zero or pole at infinity. For ${e^{i\theta}}$ not a zero of ${P_t}$, we have already seen that ${e^{\pi i L(t)} e^{-i N \theta/2} P_t(e^{i\theta})}$ and ${e^{\pi i L(t)} e^{-i N \theta/2} (i e^{i\theta} P'_t(e^{i\theta}) - i \frac{N}{2} P_t(e^{i\theta}))}$ are real, so on dividing we see that ${i f(e^{i\theta}) - \frac{iN}{2}}$ is always real, that is to say

$\displaystyle \mathrm{Re} f(e^{i\theta}) = \frac{N}{2}.$

(This can also be seen by writing ${f(e^{i\theta}) = \sum_\lambda \frac{1}{1-e^{-i\theta} \lambda}}$, where ${\lambda}$ runs over the zeroes of ${P_t}$, and using the fact that these zeroes are symmetric with respect to reflection across the unit circle.) When ${e^{i\theta}}$ is a zero of ${P_t}$, ${f(z)}$ has a simple pole at ${e^{i\theta}}$ with residue a positive multiple of ${e^{i\theta}}$, and so ${f(z)}$ stays on the right half-plane if one traverses a semicircular arc around ${e^{i\theta}}$ outside the unit disk. From this and continuity we see that ${f}$ stays on the right-half plane in a circle slightly larger than the unit circle, and hence by the argument principle it has the same number of zeroes and poles outside of this circle, giving the claim. $\Box$
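The identity ${\mathrm{Re} f(e^{i\theta}) = \frac{N}{2}}$ used in the proof can be checked directly from the partial fraction form ${f(e^{i\theta}) = \sum_\lambda \frac{1}{1-e^{-i\theta}\lambda}}$. The sketch below assumes a hypothetical zero set, chosen only to be symmetric under ${\lambda \mapsto 1/\overline{\lambda}}$; it is not derived from the zeta function.

```python
import cmath

def log_derivative_on_circle(zeros, theta):
    """f(e^{i theta}) = sum over zeroes of 1 / (1 - e^{-i theta} lambda)."""
    w = cmath.exp(-1j * theta)
    return sum(1 / (1 - w * lam) for lam in zeros)

# Hypothetical zero set, symmetric under lambda -> 1 / conj(lambda):
inside = [0.5 + 0.2j, -0.3j]
on_circle = [cmath.exp(0.4j), cmath.exp(2.0j)]
zeros = inside + [1 / lam.conjugate() for lam in inside] + on_circle
N = len(zeros)

# Evaluate away from the zeroes on the circle (angles 0.4 and 2.0).
values = [log_derivative_on_circle(zeros, theta) for theta in (0.1, 1.0, 3.0)]
```

Each zero on the unit circle contributes real part ${\frac{1}{2}}$, and each reflected pair ${\lambda, 1/\overline{\lambda}}$ contributes real part ${1}$, so every entry of `values` should have real part ${N/2 = 3}$.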

From the functional equation and the chain rule, ${Z}$ is a zero of ${P'_t}$ if and only if ${1/\overline{Z}}$ is a zero of ${N P_t - P'_t}$. We can thus write the above proposition in the equivalent form

$\displaystyle \# \{ |Z| = 1: P_t(Z) = 0 \} = N - 2 \# \{ |Z| < 1: NP_t(Z) - P'_t(Z) = 0 \}.$

One can use this identity to get a lower bound on the number of zeroes of ${P_t}$ by the method of mollifiers. Namely, for any other polynomial ${M_t}$, we clearly have

$\displaystyle \# \{ |Z| = 1: P_t(Z) = 0 \}$

$\displaystyle \geq N - 2 \# \{ |Z| < 1: M_t(Z)(NP_t(Z) - P'_t(Z)) = 0 \}.$

By Jensen’s formula, we have for any ${r>1}$ that

$\displaystyle \log |M_t(0)| |NP_t(0)-P'_t(0)|$

$\displaystyle \leq -(\log r) \# \{ |Z| < 1: M_t(Z)(NP_t(Z) - P'_t(Z)) = 0 \}$

$\displaystyle + \frac{1}{2\pi} \int_0^{2\pi} \log |M_t(re^{i\theta})(NP_t(re^{i\theta}) - P'_t(re^{i\theta}))|\ d\theta.$

We therefore have

$\displaystyle \# \{ |Z| = 1: P_t(Z) = 0 \} \geq N + \frac{2}{\log r} \log |M_t(0)| |NP_t(0)-P'_t(0)|$

$\displaystyle - \frac{1}{\log r} \frac{1}{2\pi} \int_0^{2\pi} \log |M_t(re^{i\theta})(NP_t(re^{i\theta}) - P'_t(re^{i\theta}))|^2\ d\theta.$

As the logarithm function is concave, we can apply Jensen’s inequality to conclude

$\displaystyle {\bf E} \# \{ |Z| = 1: P_t(Z) = 0 \} \geq N$

$\displaystyle + {\bf E} \frac{2}{\log r} \log |M_t(0)| |NP_t(0)-P'_t(0)|$

$\displaystyle - \frac{1}{\log r} \log \left( \frac{1}{2\pi} \int_0^{2\pi} {\bf E} |M_t(re^{i\theta})(NP_t(re^{i\theta}) - P'_t(re^{i\theta}))|^2\ d\theta\right).$

where the expectation is over the ${t}$ parameter. It turns out that by choosing the mollifier ${M_t}$ carefully in order to make ${M_t P_t}$ behave like the function ${1}$ (while keeping the degree of ${M_t}$ small enough that one can compute the second moment here), and then optimising in ${r}$, one can use this inequality to get a positive fraction of zeroes of ${P_t}$ on the unit circle on average. This is the polynomial analogue of a classical argument of Levinson, who used this to show that at least one third of the zeroes of the Riemann zeta function are on the critical line; all later improvements on this fraction have been based on some version of Levinson’s method, mainly focusing on more advanced choices for the mollifier ${M_t}$ and of the differential operator ${N - \partial_z}$ that implicitly appears in the above approach. (The most recent lower bound I know of is ${0.4191637}$, due to Pratt and Robles. In principle (as observed by Farmer) this bound can get arbitrarily close to ${1}$ if one is allowed to use arbitrarily long mollifiers, but establishing this seems of comparable difficulty to unsolved problems such as the pair correlation conjecture; see this paper of Radziwill for more discussion.) A variant of these techniques can also establish “zero density estimates” of the following form: for any ${W \geq 1}$, the number of zeroes of ${P_t}$ that lie further than ${\frac{W}{N}}$ from the unit circle is of order ${O( e^{-cW} N )}$ on average for some absolute constant ${c>0}$. Thus, roughly speaking, most zeroes of ${P_t}$ lie within ${O(1/N)}$ of the unit circle. (Analogues of these results for the Riemann zeta function were worked out by Selberg, by Jutila, and by Conrey, with increasingly strong values of ${c}$.)

The zeroes of ${P'_t}$ tend to live somewhat closer to the origin than the zeroes of ${P_t}$. Suppose for instance that we write

$\displaystyle P_t(Z) = \sum_{j=0}^N a_j(t) Z^j = a_N(t) \prod_{j=1}^N (Z - \lambda_j)$

where ${\lambda_1,\dots,\lambda_N}$ are the zeroes of ${P_t(Z)}$, then by evaluating at zero we see that

$\displaystyle \lambda_1 \dots \lambda_N = (-1)^N a_0(t) / a_N(t)$

and the right-hand side is of unit magnitude by the functional equation. However, if we differentiate

$\displaystyle P'_t(Z) = \sum_{j=1}^N a_j(t) j Z^{j-1} = N a_N(t) \prod_{j=1}^{N-1} (Z - \lambda'_j)$

where ${\lambda'_1,\dots,\lambda'_{N-1}}$ are the zeroes of ${P'_t}$, then by evaluating at zero we now see that

$\displaystyle \lambda'_1 \dots \lambda'_{N-1} = (-1)^{N-1} a_1(t) / N a_N(t).$

The right-hand side would now be typically expected to be of size ${O(1/N) \approx \exp(- \log N)}$, and so on average we expect the ${\lambda'_j}$ to have magnitude like ${\exp( - \frac{\log N}{N} )}$, that is to say pushed inwards from the unit circle by a distance roughly ${\frac{\log N}{N}}$. The analogous result for the Riemann zeta function is that the zeroes of ${\zeta'(s)}$ at height ${\sim T}$ lie at a distance roughly ${\frac{\log\log T}{\log T}}$ to the right of the critical line on the average; see this paper of Levinson and Montgomery for a precise statement.

Let ${P(z) = z^n + a_{n-1} z^{n-1} + \dots + a_0}$ be a monic polynomial of degree ${n}$ with complex coefficients. Then by the fundamental theorem of algebra, we can factor ${P}$ as

$\displaystyle P(z) = (z-z_1) \dots (z-z_n) \ \ \ \ \ (1)$

for some complex zeroes ${z_1,\dots,z_n}$ (possibly with repetition).

Now suppose we evolve ${P}$ with respect to time by heat flow, creating a function ${P(t,z)}$ of two variables with given initial data ${P(0,z) = P(z)}$ for which

$\displaystyle \partial_t P(t,z) = \partial_{zz} P(t,z). \ \ \ \ \ (2)$

On the space of polynomials of degree at most ${n}$, the operator ${\partial_{zz}}$ is nilpotent, and one can solve this equation explicitly both forwards and backwards in time by the Taylor series

$\displaystyle P(t,z) = \sum_{j=0}^\infty \frac{t^j}{j!} \partial_z^{2j} P(0,z).$

For instance, if one starts with a quadratic ${P(0,z) = z^2 + bz + c}$, then the polynomial evolves by the formula

$\displaystyle P(t,z) = z^2 + bz + (c+2t).$
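Since the Taylor series terminates, the heat flow on polynomials amounts to a few lines of coefficient arithmetic. Here is a minimal sketch, with coefficient lists ordered from the constant term upwards; the function names are my own choices for illustration.

```python
from math import factorial

def derivative(coeffs):
    """Coefficients (lowest degree first) of the derivative of a polynomial."""
    return [k * c for k, c in enumerate(coeffs)][1:]

def heat_flow(coeffs, t):
    """Coefficients of P(t, z) = sum_j t^j / j! * d^{2j}/dz^{2j} P(0, z)."""
    result = [0] * len(coeffs)
    term, j = list(coeffs), 0
    while term:  # the list becomes empty once the even derivative vanishes
        for k, c in enumerate(term):
            result[k] += t ** j / factorial(j) * c
        term = derivative(derivative(term))
        j += 1
    return result

# Quadratic z^2 + b z + c with b = -1, c = 3, evolved to time t = 0.5.
quadratic = heat_flow([3.0, -1.0, 1.0], 0.5)
# Cubic z^3 evolved to time t = 1, which gives z^3 + 6 z.
cubic = heat_flow([0, 0, 0, 1], 1)
```

The quadratic case reproduces the formula ${z^2 + bz + (c+2t)}$, and negative values of `t` run the flow backwards in time.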

As the polynomial ${P(t)}$ evolves in time, the zeroes ${z_1(t),\dots,z_n(t)}$ evolve also. Assuming for sake of discussion that the zeroes are simple, the inverse function theorem tells us that the zeroes will (locally, at least) evolve smoothly in time. What are the dynamics of this evolution?

For instance, in the quadratic case, the quadratic formula tells us that the zeroes are

$\displaystyle z_1(t) = \frac{-b + \sqrt{b^2 - 4(c+2t)}}{2}$

and

$\displaystyle z_2(t) = \frac{-b - \sqrt{b^2 - 4(c+2t)}}{2}$

after arbitrarily choosing a branch of the square root. If ${b,c}$ are real and the discriminant ${b^2 - 4c}$ is initially positive, we see that we start with two real zeroes centred around ${-b/2}$, which then approach each other until time ${t = \frac{b^2-4c}{8}}$, at which point the roots collide and then move off from each other in an imaginary direction.

In the general case, we can obtain the equations of motion by implicitly differentiating the defining equation

$\displaystyle P( t, z_i(t) ) = 0$

in time using (2) to obtain

$\displaystyle \partial_{zz} P( t, z_i(t) ) + \partial_t z_i(t) \partial_z P(t,z_i(t)) = 0.$

To simplify notation we drop the explicit dependence on time, thus

$\displaystyle \partial_{zz} P(z_i) + (\partial_t z_i) \partial_z P(z_i)= 0.$

From (1) and the product rule, we see that

$\displaystyle \partial_z P( z_i ) = \prod_{j:j \neq i} (z_i - z_j)$

and

$\displaystyle \partial_{zz} P( z_i ) = 2 \sum_{k:k \neq i} \prod_{j:j \neq i,k} (z_i - z_j)$

(where all indices are understood to range over ${1,\dots,n}$) leading to the equations of motion

$\displaystyle \partial_t z_i = \sum_{k:k \neq i} \frac{2}{z_k - z_i}, \ \ \ \ \ (3)$

at least when one avoids those times in which there is a repeated zero. In the case when the zeroes ${z_i}$ are real, each term ${\frac{2}{z_k-z_i}}$ represents a (first-order) attraction in the dynamics between ${z_i}$ and ${z_k}$, but the dynamics are more complicated for complex zeroes (e.g. purely imaginary zeroes will experience repulsion rather than attraction, as one already sees in the quadratic example). Curiously, this system resembles that of Dyson Brownian motion (except with the Brownian motion part removed, and time reversed). I learned of the connection between the ODE (3) and the heat equation from this paper of Csordas, Smith, and Varga, but perhaps it has been mentioned in earlier literature as well.
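One can test the equations of motion (3) against the explicit quadratic example. The sketch below integrates (3) with a classical fourth-order Runge-Kutta scheme (my choice of integrator, with an arbitrary step count) starting from the zeroes ${\pm 1}$ of ${z^2 - 1}$, and compares with the zeroes ${\pm \sqrt{1-2t}}$ of the heat-evolved polynomial ${z^2 - 1 + 2t}$.

```python
import math

def velocity(zs):
    """Right-hand side of (3): dz_i/dt = sum_{k != i} 2 / (z_k - z_i)."""
    return [sum(2 / (zk - zi) for k, zk in enumerate(zs) if k != i)
            for i, zi in enumerate(zs)]

def rk4(zs, t_end, steps=1000):
    """Integrate (3) from time 0 to t_end with the classical Runge-Kutta scheme."""
    h = t_end / steps
    zs = list(zs)
    for _ in range(steps):
        k1 = velocity(zs)
        k2 = velocity([z + h / 2 * k for z, k in zip(zs, k1)])
        k3 = velocity([z + h / 2 * k for z, k in zip(zs, k2)])
        k4 = velocity([z + h * k for z, k in zip(zs, k3)])
        zs = [z + h / 6 * (a + 2 * b + 2 * c + d)
              for z, a, b, c, d in zip(zs, k1, k2, k3, k4)]
    return zs

# Stop at t = 1/4, well before the collision time (b^2 - 4c)/8 = 1/2.
t_end = 0.25
numeric = rk4([-1.0, 1.0], t_end)
exact = [-math.sqrt(1 - 2 * t_end), math.sqrt(1 - 2 * t_end)]
```

The integration is stopped before the collision, since the right-hand side of (3) blows up as two zeroes approach each other.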

One interesting consequence of these equations is that if the zeroes are real at some time, then they will stay real as long as the zeroes do not collide. Let us now restrict attention to the case of real simple zeroes, in which case we will rename the zeroes as ${x_i}$ instead of ${z_i}$, and order them as ${x_1 < \dots < x_n}$. The evolution

$\displaystyle \partial_t x_i = \sum_{k:k \neq i} \frac{2}{x_k - x_i}$

can now be thought of as reverse gradient flow for the “entropy”

$\displaystyle H := -\sum_{i,j: i \neq j} \log |x_i - x_j|,$

(which is also the negative of the logarithm of the discriminant of the polynomial) since we have

$\displaystyle \partial_t x_i = \frac{\partial H}{\partial x_i}.$

In particular, we have the monotonicity formula

$\displaystyle \partial_t H = 4E$

where ${E}$ is the “energy”

$\displaystyle E := \frac{1}{4} \sum_i (\frac{\partial H}{\partial x_i})^2$

$\displaystyle = \sum_i (\sum_{k:k \neq i} \frac{1}{x_k-x_i})^2$

$\displaystyle = \sum_{i,k: i \neq k} \frac{1}{(x_k-x_i)^2} + \sum_{i,j,k: i,j,k \hbox{ distinct}} \frac{1}{(x_k-x_i)(x_j-x_i)}$

$\displaystyle = \sum_{i,k: i \neq k} \frac{1}{(x_k-x_i)^2}$

where in the last line we use the antisymmetrisation identity

$\displaystyle \frac{1}{(x_k-x_i)(x_j-x_i)} + \frac{1}{(x_i-x_j)(x_k-x_j)} + \frac{1}{(x_j-x_k)(x_i-x_k)} = 0.$
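The two facts used here, namely that the velocity field is the gradient of ${H}$ and that the cross terms in ${E}$ cancel, are easy to test numerically. The following sketch compares a central finite-difference gradient of ${H}$ with the right-hand side of the equations of motion, and compares the two expressions for ${E}$. The function names, the step size `h`, and the sample configuration are illustrative assumptions.

```python
import math

def velocities(x):
    """Right-hand side of the real flow: sum over k != i of 2 / (x_k - x_i)."""
    n = len(x)
    return [sum(2 / (x[k] - x[i]) for k in range(n) if k != i)
            for i in range(n)]

def entropy(x):
    """H = - sum over ordered pairs i != j of log |x_i - x_j|."""
    n = len(x)
    return -sum(math.log(abs(x[i] - x[j]))
                for i in range(n) for j in range(n) if i != j)

def entropy_gradient(x, h=1e-6):
    """Central finite-difference approximation to the gradient of H."""
    grad = []
    for i in range(len(x)):
        up, dn = list(x), list(x)
        up[i] += h
        dn[i] -= h
        grad.append((entropy(up) - entropy(dn)) / (2 * h))
    return grad

def energy_from_velocities(x):
    """E as one quarter of the squared length of the velocity vector."""
    return 0.25 * sum(v * v for v in velocities(x))

def energy_pairwise(x):
    """E as the sum over ordered pairs i != k of 1 / (x_k - x_i)^2."""
    n = len(x)
    return sum(1 / (x[k] - x[i]) ** 2
               for i in range(n) for k in range(n) if k != i)

x = [-2.0, -0.3, 1.1, 2.5]   # sample real simple zeroes
gap = max(abs(g - v) for g, v in zip(entropy_gradient(x), velocities(x)))
print(gap, energy_from_velocities(x), energy_pairwise(x))
```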

Among other things, the monotonicity formula shows that the entropy decreases as one goes backwards in time. Since ${H}$ would have to tend to ${+\infty}$ at a collision, no collisions can occur in the past, only in the future, which is of course consistent with the attractive nature of the dynamics. As ${H}$ is a convex function of the positions ${x_1,\dots,x_n}$, one expects ${H}$ to also evolve in a convex manner in time, that is to say the energy ${E}$ should be increasing. This is indeed the case:

Exercise 1 Show that

$\displaystyle \partial_t E = 2 \sum_{i,j: i \neq j} (\frac{2}{(x_i-x_j)^2} - \sum_{k: i,j,k \hbox{ distinct}} \frac{1}{(x_k-x_i)(x_k-x_j)})^2.$
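Without spoiling the exercise, one can at least check the claimed formula numerically. Since the flow is autonomous, ${\partial_t E}$ at a given configuration is the directional derivative of ${E}$ along the velocity field, which the sketch below approximates by a central difference and compares with the right-hand side above. The function names and the step size `h` are hypothetical choices.

```python
def velocities(x):
    """Right-hand side of the real flow: sum over k != i of 2 / (x_k - x_i)."""
    n = len(x)
    return [sum(2 / (x[k] - x[i]) for k in range(n) if k != i)
            for i in range(n)]

def energy(x):
    """E = sum over ordered pairs i != k of 1 / (x_k - x_i)^2."""
    n = len(x)
    return sum(1 / (x[k] - x[i]) ** 2
               for i in range(n) for k in range(n) if k != i)

def energy_rate_formula(x):
    """Right-hand side of the identity in Exercise 1."""
    n = len(x)
    total = 0.0
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            inner = 2 / (x[i] - x[j]) ** 2 - sum(
                1 / ((x[k] - x[i]) * (x[k] - x[j]))
                for k in range(n) if k != i and k != j)
            total += inner ** 2
    return 2 * total

def energy_rate_numeric(x, h=1e-5):
    """Central difference of E along the velocity field at x."""
    v = velocities(x)
    fwd = energy([xi + h * vi for xi, vi in zip(x, v)])
    bwd = energy([xi - h * vi for xi, vi in zip(x, v)])
    return (fwd - bwd) / (2 * h)

x = [-2.0, -0.3, 1.1, 2.5]   # sample real simple zeroes
print(energy_rate_formula(x), energy_rate_numeric(x))
```

For the symmetric configuration ${(-1,0,1)}$ both sides can also be computed by hand, and equal ${27}$.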

Symmetric polynomials of the zeroes are polynomial functions of the coefficients and should thus evolve in a polynomial fashion. One can compute this explicitly in simple cases. For instance, the center of mass is an invariant:

$\displaystyle \partial_t \frac{1}{n} \sum_i x_i = 0.$

The variance decreases linearly:

Exercise 2 Establish the virial identity

$\displaystyle \partial_t \sum_{i,j} (x_i-x_j)^2 = - 4n^2(n-1).$

As the variance (which is proportional to ${\sum_{i,j} (x_i-x_j)^2}$) cannot become negative, this identity shows that “finite time blowup” must occur – that the zeroes must collide at or before the time ${\frac{1}{4n^2(n-1)} \sum_{i,j} (x_i-x_j)^2}$.
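The virial identity and the resulting bound on the collision time can likewise be checked at any given configuration. In the sketch below, `spread_rate` differentiates ${\sum_{i,j} (x_i-x_j)^2}$ along the flow using the chain rule, and `collision_time_bound` evaluates the bound just stated. Both names, and the sample zeroes, are illustrative assumptions.

```python
def velocities(x):
    """Right-hand side of the real flow: sum over k != i of 2 / (x_k - x_i)."""
    n = len(x)
    return [sum(2 / (x[k] - x[i]) for k in range(n) if k != i)
            for i in range(n)]

def spread(x):
    """Sum over all ordered pairs (i, j) of (x_i - x_j)^2."""
    return sum((xi - xj) ** 2 for xi in x for xj in x)

def spread_rate(x):
    """Time derivative of spread(x) along the flow, by the chain rule."""
    v = velocities(x)
    n = len(x)
    return sum(2 * (x[i] - x[j]) * (v[i] - v[j])
               for i in range(n) for j in range(n))

def collision_time_bound(x):
    """Upper bound on the first collision time given by the virial identity."""
    n = len(x)
    return spread(x) / (4 * n * n * (n - 1))

x = [-2.0, -0.3, 1.1, 2.5]   # sample real simple zeroes
n = len(x)
print(spread_rate(x), -4 * n * n * (n - 1))
print(collision_time_bound(x))
```

For three equally spaced zeroes the bound is attained: by symmetry all three zeroes collide simultaneously, so the variance vanishes exactly at the collision time.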

Exercise 3 Show that the Stieltjes transform

$\displaystyle s(t,z) = \sum_i \frac{1}{x_i - z}$

solves the viscous Burgers equation

$\displaystyle \partial_t s = \partial_{zz} s - 2 s \partial_z s,$

either by using the original heat equation (2) and the identity ${s = - \partial_z P / P}$, or else by using the equations of motion (3). This relation between the Burgers equation and the heat equation is known as the Cole-Hopf transformation.
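Again without giving away the exercise, one can test the Burgers equation numerically on the quadratic example, whose zeroes at time ${t}$ are those of ${z^2 + bz + c + 2t}$. The sketch below forms ${s = -\partial_z P/P}$ for this polynomial and evaluates the residual ${\partial_t s - \partial_{zz} s + 2 s \partial_z s}$ by central finite differences at a point away from the zeroes. The function names, the step size `h`, and the default values of ${b,c}$ are hypothetical.

```python
def stieltjes(t, z, b=0.0, c=-1.0):
    """s = -P_z / P for P(t, z) = z^2 + b z + c + 2 t."""
    return -(2 * z + b) / (z * z + b * z + c + 2 * t)

def burgers_residual(t, z, h=1e-4):
    """Finite-difference value of s_t - s_zz + 2 s s_z at (t, z)."""
    s = stieltjes(t, z)
    s_t = (stieltjes(t + h, z) - stieltjes(t - h, z)) / (2 * h)
    s_z = (stieltjes(t, z + h) - stieltjes(t, z - h)) / (2 * h)
    s_zz = (stieltjes(t, z + h) - 2 * s + stieltjes(t, z - h)) / (h * h)
    return s_t - s_zz + 2 * s * s_z

# Sample evaluation point in the upper half-plane, away from the zeroes.
print(abs(burgers_residual(0.1, 0.3 + 0.7j)))
```

The residual should be small, limited only by the accuracy of the finite differences.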

The paper of Csordas, Smith, and Varga mentioned previously gives some other bounds on the lifespan of the dynamics; roughly speaking, they show that if there is one pair of zeroes that are much closer to each other than to the other zeroes then they must collide in a short amount of time (unless there is a collision occurring even earlier at some other location). Their argument extends also to situations where there are an infinite number of zeroes, which they apply to get new results on Newman’s conjecture in analytic number theory. I would be curious to know of further places in the literature where these dynamics have been studied.