You are currently browsing the monthly archive for October 2017.

Let {P(z) = z^n + a_{n-1} z^{n-1} + \dots + a_0} be a monic polynomial of degree {n} with complex coefficients. Then by the fundamental theorem of algebra, we can factor {P} as

\displaystyle  P(z) = (z-z_1) \dots (z-z_n) \ \ \ \ \ (1)

for some complex zeroes {z_1,\dots,z_n} (possibly with repetition).

Now suppose we evolve {P} with respect to time by heat flow, creating a function {P(t,z)} of two variables with given initial data {P(0,z) = P(z)} for which

\displaystyle  \partial_t P(t,z) = \partial_{zz} P(t,z). \ \ \ \ \ (2)

On the space of polynomials of degree at most {n}, the operator {\partial_{zz}} is nilpotent, and one can solve this equation explicitly both forwards and backwards in time by the Taylor series

\displaystyle  P(t,z) = \sum_{j=0}^\infty \frac{t^j}{j!} \partial_z^{2j} P(0,z).

For instance, if one starts with a quadratic {P(0,z) = z^2 + bz + c}, then the polynomial evolves by the formula

\displaystyle  P(t,z) = z^2 + bz + (c+2t).
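As a quick sanity check of the explicit solution, here is a minimal Python sketch that applies the (finite) Taylor series to a list of coefficients. The helper names `derivative` and `heat_flow`, the ascending-power coefficient convention, and the sample values of {b}, {c}, {t} are illustrative assumptions, not anything from the original text.

```python
from math import factorial

def derivative(coeffs):
    """Differentiate a polynomial stored as [a_0, a_1, ..., a_n] (ascending powers)."""
    return [k * coeffs[k] for k in range(1, len(coeffs))]

def heat_flow(coeffs, t):
    """Coefficients of P(t, z) = sum_j t^j / j! * (d/dz)^(2j) P(0, z).

    The sum terminates because two derivatives lower the degree by two,
    so this works for negative t as well as positive t.
    """
    result = [0] * len(coeffs)
    term = list(coeffs)            # holds the (2j)-th derivative of P(0, z)
    j = 0
    while term:
        weight = t ** j / factorial(j)
        for k, a in enumerate(term):
            result[k] += weight * a
        term = derivative(derivative(term))
        j += 1
    return result

# Quadratic z^2 + b z + c with illustrative values b = 3, c = 1, t = 0.25.
b, c, t = 3.0, 1.0, 0.25
quadratic_at_t = heat_flow([c, b, 1.0], t)
print(quadratic_at_t)              # [1.5, 3.0, 1.0], i.e. constant term c + 2t

# A cubic: z^3 evolves to z^3 + 6 t z.
cubic_at_t = heat_flow([0.0, 0.0, 0.0, 1.0], t)
print(cubic_at_t)
```

Running the flow forwards by {t} and then backwards by {-t} should return the original coefficients up to rounding, which is one way to see the reversibility claimed above.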

As the polynomial {P(t)} evolves in time, the zeroes {z_1(t),\dots,z_n(t)} also evolve. Assuming for the sake of discussion that the zeroes are simple, the inverse function theorem tells us that the zeroes will (locally, at least) evolve smoothly in time. What are the dynamics of this evolution?

For instance, in the quadratic case, the quadratic formula tells us that the zeroes are

\displaystyle  z_1(t) = \frac{-b + \sqrt{b^2 - 4(c+2t)}}{2}


\displaystyle  z_2(t) = \frac{-b - \sqrt{b^2 - 4(c+2t)}}{2}

after arbitrarily choosing a branch of the square root. If {b,c} are real and the discriminant {b^2 - 4c} is initially positive, we see that we start with two real zeroes centred around {-b/2}, which then approach each other until time {t = \frac{b^2-4c}{8}}, at which point the roots collide and then move off from each other in an imaginary direction.
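The collision in the quadratic case can be watched numerically. The following sketch simply evaluates the quadratic formula at a few times; the function name `quadratic_zeroes` and the values {b=3}, {c=1} are assumptions chosen for illustration, and `cmath.sqrt` picks the principal branch of the square root.

```python
import cmath

def quadratic_zeroes(b, c, t):
    """Zeroes of z^2 + b z + (c + 2t), using the principal square root."""
    root = cmath.sqrt(b * b - 4 * (c + 2 * t))
    return (-b + root) / 2, (-b - root) / 2

b, c = 3.0, 1.0                      # illustrative values with b^2 - 4c > 0
t_collide = (b * b - 4 * c) / 8      # predicted collision time

for t in (0.0, t_collide / 2, t_collide, 2 * t_collide):
    z1, z2 = quadratic_zeroes(b, c, t)
    print(f"t = {t:.4f}: z1 = {z1:.4f}, z2 = {z2:.4f}")
```

Before the collision time both zeroes are real and symmetric about {-b/2}; afterwards they share the real part {-b/2} and have opposite imaginary parts.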

In the general case, we can obtain the equations of motion by implicitly differentiating the defining equation

\displaystyle  P( t, z_i(t) ) = 0

in time using (2) to obtain

\displaystyle  \partial_{zz} P( t, z_i(t) ) + \partial_t z_i(t) \partial_z P(t,z_i(t)) = 0.

To simplify notation we drop the explicit dependence on time, thus

\displaystyle  \partial_{zz} P(z_i) + (\partial_t z_i) \partial_z P(z_i)= 0.

From (1) and the product rule, we see that

\displaystyle  \partial_z P( z_i ) = \prod_{j:j \neq i} (z_i - z_j)


\displaystyle  \partial_{zz} P( z_i ) = 2 \sum_{k:k \neq i} \prod_{j:j \neq i,k} (z_i - z_j)

(where all indices are understood to range over {1,\dots,n}) leading to the equations of motion

\displaystyle  \partial_t z_i = \sum_{k:k \neq i} \frac{2}{z_k - z_i}, \ \ \ \ \ (3)

at least when one avoids those times in which there is a repeated zero. In the case when the zeroes {z_i} are real, each term {\frac{2}{z_k-z_i}} represents a (first-order) attraction in the dynamics between {z_i} and {z_k}, but the dynamics are more complicated for complex zeroes (e.g. purely imaginary zeroes will experience repulsion rather than attraction, as one already sees in the quadratic example). Curiously, this system resembles that of Dyson Brownian motion (except with the Brownian motion part removed, and time reversed). I learned of the connection between the ODE (3) and the heat equation from this paper of Csordas, Smith, and Varga, but perhaps it has been mentioned in earlier literature as well.
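The equations of motion (3) can be tested against the heat flow directly. The sketch below builds a polynomial from some sample simple zeroes, evolves its coefficients by a small time {\pm dt} using the Taylor series solution, re-locates each zero by Newton's method, and compares the central difference quotient with the right-hand side of (3). All helper names, the sample zeroes, and the step size are illustrative assumptions.

```python
from math import factorial

def poly_from_zeroes(zeroes):
    """Ascending coefficients of the monic polynomial prod_i (z - z_i)."""
    coeffs = [1.0 + 0j]
    for r in zeroes:
        new = [0j] * (len(coeffs) + 1)
        for k, a in enumerate(coeffs):
            new[k + 1] += a          # contribution of a * z
            new[k] -= r * a          # contribution of -r * a
        coeffs = new
    return coeffs

def derivative(coeffs):
    return [k * coeffs[k] for k in range(1, len(coeffs))]

def evaluate(coeffs, z):
    value = 0j
    for a in reversed(coeffs):       # Horner's rule
        value = value * z + a
    return value

def heat_flow(coeffs, t):
    """Finite Taylor series sum_j t^j / j! * (d/dz)^(2j) P(0, z)."""
    result = [0j] * len(coeffs)
    term, j = list(coeffs), 0
    while term:
        weight = t ** j / factorial(j)
        for k, a in enumerate(term):
            result[k] += weight * a
        term = derivative(derivative(term))
        j += 1
    return result

def track_zero(coeffs, guess, steps=20):
    """Newton's method started from a nearby simple zero."""
    d = derivative(coeffs)
    z = guess
    for _ in range(steps):
        z -= evaluate(coeffs, z) / evaluate(d, z)
    return z

def velocity_from_ode(zeroes, i):
    """Right-hand side of (3) for the i-th zero."""
    return sum(2 / (zk - zeroes[i]) for k, zk in enumerate(zeroes) if k != i)

zeroes = [-2.0 + 0j, 0.5 + 1j, 1.0 - 0.5j, 3.0 + 0j]   # sample simple zeroes
p0 = poly_from_zeroes(zeroes)
dt = 1e-5
comparisons = []
for i, z in enumerate(zeroes):
    z_plus = track_zero(heat_flow(p0, dt), z)
    z_minus = track_zero(heat_flow(p0, -dt), z)
    numeric = (z_plus - z_minus) / (2 * dt)
    comparisons.append((numeric, velocity_from_ode(zeroes, i)))
    print(i, numeric, comparisons[-1][1])
```

The two columns should agree to several digits; the agreement degrades as two zeroes approach each other, since both Newton's method and the ODE become singular at a repeated zero.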

One interesting consequence of these equations is that if the zeroes are real at some time, then they will stay real as long as the zeroes do not collide. Let us now restrict attention to the case of real simple zeroes, in which case we will rename the zeroes as {x_i} instead of {z_i}, and order them as {x_1 < \dots < x_n}. The evolution

\displaystyle  \partial_t x_i = \sum_{k:k \neq i} \frac{2}{x_k - x_i}

can now be thought of as reverse gradient flow for the “entropy”

\displaystyle  H := -\sum_{i,j: i \neq j} \log |x_i - x_j|,

(which is also essentially the logarithm of the discriminant of the polynomial) since we have

\displaystyle  \partial_t x_i = \frac{\partial H}{\partial x_i}.

In particular, we have the monotonicity formula

\displaystyle  \partial_t H = 4E

where {E} is the “energy”

\displaystyle  E := \frac{1}{4} \sum_i (\frac{\partial H}{\partial x_i})^2

\displaystyle  = \sum_i (\sum_{k:k \neq i} \frac{1}{x_k-x_i})^2

\displaystyle  = \sum_{i,k: i \neq k} \frac{1}{(x_k-x_i)^2} + 2 \sum_{i,j,k: i,j,k \hbox{ distinct}} \frac{1}{(x_k-x_i)(x_j-x_i)}

\displaystyle  = \sum_{i,k: i \neq k} \frac{1}{(x_k-x_i)^2}

where in the last line we use the antisymmetrisation identity

\displaystyle  \frac{1}{(x_k-x_i)(x_j-x_i)} + \frac{1}{(x_i-x_j)(x_k-x_j)} + \frac{1}{(x_j-x_k)(x_i-x_k)} = 0.

Among other things, this shows that as one goes backwards in time, the entropy decreases, and so no collisions can occur to the past, only in the future, which is of course consistent with the attractive nature of the dynamics. As {H} is a convex function of the positions {x_1,\dots,x_n}, one expects {H} to also evolve in a convex manner in time, that is to say the energy {E} should be increasing. This is indeed the case:
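The gradient structure and the simplification of the energy are easy to test numerically. In the sketch below, the function names and the sample configuration are assumptions for illustration: it compares a finite-difference gradient of {H} with the right-hand side of the evolution equation, and compares the two expressions for {E} (before and after discarding the cross terms).

```python
from math import log

def entropy(x):
    """H = - sum over ordered pairs i != j of log |x_i - x_j|."""
    return -sum(log(abs(xi - xj))
                for i, xi in enumerate(x)
                for j, xj in enumerate(x) if i != j)

def velocity(x):
    """Right-hand side of the evolution: sum_{k != i} 2 / (x_k - x_i)."""
    return [sum(2 / (xk - xi) for k, xk in enumerate(x) if k != i)
            for i, xi in enumerate(x)]

def energy_from_gradient(x):
    """E = (1/4) sum_i (dH/dx_i)^2, using dH/dx_i = velocity."""
    return 0.25 * sum(v * v for v in velocity(x))

def energy_pairwise(x):
    """E = sum over ordered pairs i != k of 1 / (x_k - x_i)^2."""
    return sum(1 / (xk - xi) ** 2
               for i, xi in enumerate(x)
               for k, xk in enumerate(x) if i != k)

x = [-2.0, -0.5, 1.0, 3.0]           # sample real simple zeroes
h = 1e-6
numeric_gradient = []
for i in range(len(x)):
    up = x[:i] + [x[i] + h] + x[i + 1:]
    down = x[:i] + [x[i] - h] + x[i + 1:]
    numeric_gradient.append((entropy(up) - entropy(down)) / (2 * h))

print(numeric_gradient)
print(velocity(x))
print(energy_from_gradient(x), energy_pairwise(x))
```

Agreement of the last two numbers reflects the vanishing of the cross terms via the antisymmetrisation identity.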

Exercise 1 Show that

\displaystyle  \partial_t E = 2 \sum_{i,j: i \neq j} (\frac{2}{(x_i-x_j)^2} - \sum_{k: i,j,k \hbox{ distinct}} \frac{1}{(x_k-x_i)(x_k-x_j)})^2.

Symmetric polynomials of the zeroes are polynomial functions of the coefficients and should thus evolve in a polynomial fashion. One can compute this explicitly in simple cases. For instance, the center of mass is an invariant:

\displaystyle  \partial_t \frac{1}{n} \sum_i x_i = 0.

The variance decreases linearly:

Exercise 2 Establish the virial identity

\displaystyle  \partial_t \sum_{i,j} (x_i-x_j)^2 = - 4n^2(n-1).

As the variance (which is proportional to {\sum_{i,j} (x_i-x_j)^2}) cannot become negative, this identity shows that “finite time blowup” must occur – that the zeroes must collide at or before the time {\frac{1}{4n^2(n-1)} \sum_{i,j} (x_i-x_j)^2}.
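The linear decay of {\sum_{i,j} (x_i-x_j)^2} can be observed by integrating the equations of motion numerically. The sketch below uses a classical fourth-order Runge-Kutta step, which is a choice made here for convenience rather than anything from the original discussion; the function names, the sample configuration, and the step size are likewise assumptions.

```python
def velocity(x):
    """Right-hand side of the evolution: sum_{k != i} 2 / (x_k - x_i)."""
    return [sum(2.0 / (xk - xi) for k, xk in enumerate(x) if k != i)
            for i, xi in enumerate(x)]

def rk4_step(x, dt):
    """One classical Runge-Kutta step for the system of zeroes."""
    k1 = velocity(x)
    k2 = velocity([xi + 0.5 * dt * v for xi, v in zip(x, k1)])
    k3 = velocity([xi + 0.5 * dt * v for xi, v in zip(x, k2)])
    k4 = velocity([xi + dt * v for xi, v in zip(x, k3)])
    return [xi + dt * (a + 2 * b + 2 * c + d) / 6
            for xi, a, b, c, d in zip(x, k1, k2, k3, k4)]

def spread(x):
    """Sum over all ordered pairs (i, j) of (x_i - x_j)^2."""
    return sum((xi - xj) ** 2 for xi in x for xj in x)

x0 = [-2.0, -0.5, 1.0, 3.0]          # sample real simple zeroes
n = len(x0)
dt, steps = 1e-4, 500                # integrate up to t = 0.05, well before any collision
x = list(x0)
for _ in range(steps):
    x = rk4_step(x, dt)

t = dt * steps
predicted = spread(x0) - 4 * n * n * (n - 1) * t
collision_bound = spread(x0) / (4 * n * n * (n - 1))
print(spread(x), predicted)
print("zeroes must collide by time", collision_bound)
```

For this configuration the bound on the collision time is about {0.57}; the actual first collision happens earlier, since the bound only uses the global spread of the zeroes.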

Exercise 3 Show that the Stieltjes transform

\displaystyle  s(t,z) = \sum_i \frac{1}{x_i - z}

solves the viscous Burgers equation

\displaystyle  \partial_t s = \partial_{zz} s - 2 s \partial_z s,

either by using the original heat equation (2) and the identity {s = - \partial_z P / P}, or else by using the equations of motion (3). This relation between the Burgers equation and the heat equation is known as the Cole-Hopf transformation.
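Without giving away the exercise, one can at least check the claimed equation numerically at a single point. The sketch below takes the second route: it computes {\partial_t s} from the equations of motion (3) via the chain rule, computes {\partial_z s} and {\partial_{zz} s} by differentiating each term {\frac{1}{x_i - z}} by hand, and evaluates the residual of the Burgers equation. The function names, the sample zeroes, and the test point {z} are illustrative assumptions.

```python
def velocity(x):
    """Equations of motion (3): sum_{k != i} 2 / (x_k - x_i)."""
    return [sum(2 / (xk - xi) for k, xk in enumerate(x) if k != i)
            for i, xi in enumerate(x)]

def burgers_residual(x, z):
    """Return s_t - (s_zz - 2 s s_z) for s = sum_i 1 / (x_i - z)."""
    v = velocity(x)
    s = sum(1 / (xi - z) for xi in x)
    s_z = sum(1 / (xi - z) ** 2 for xi in x)             # d/dz of 1/(x_i - z)
    s_zz = sum(2 / (xi - z) ** 3 for xi in x)            # second z-derivative
    s_t = sum(-vi / (xi - z) ** 2 for xi, vi in zip(x, v))  # chain rule in time
    return s_t - (s_zz - 2 * s * s_z)

x = [-2.0, -0.5, 1.0, 3.0]           # sample real simple zeroes
z = 0.3 + 0.7j                       # test point away from the zeroes
residual = burgers_residual(x, z)
print(abs(residual))                 # expected to be at rounding-error level
```

A residual at the level of floating-point rounding is of course not a proof, but it is a useful guard against sign errors when working through the exercise.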

The paper of Csordas, Smith, and Varga mentioned previously gives some other bounds on the lifespan of the dynamics; roughly speaking, they show that if there is one pair of zeroes that are much closer to each other than to the other zeroes then they must collide in a short amount of time (unless there is a collision occurring even earlier at some other location). Their argument extends also to situations where there are an infinite number of zeroes, which they apply to get new results on Newman’s conjecture in analytic number theory. I would be curious to know of further places in the literature where this dynamics has been studied.

Joni Teräväinen and I have just uploaded to the arXiv our paper “Odd order cases of the logarithmically averaged Chowla conjecture”, submitted to J. Numb. Thy. Bordeaux. This paper gives an alternate route to one of the main results of our previous paper, and more specifically reproves the asymptotic

\displaystyle \sum_{n \leq x} \frac{\lambda(n+h_1) \dots \lambda(n+h_k)}{n} = o(\log x) \ \ \ \ \ (1)


for all odd {k} and all integers {h_1,\dots,h_k} (that is to say, all the odd order cases of the logarithmically averaged Chowla conjecture). Our previous argument relies heavily on some deep ergodic theory results of Bergelson-Host-Kra, Leibman, and Le (and was applicable to more general multiplicative functions than the Liouville function {\lambda}); here we give a shorter proof that avoids ergodic theory (but instead requires the Gowers uniformity of the (W-tricked) von Mangoldt function, established in several papers of Ben Green, Tamar Ziegler, and myself). The proof follows the lines sketched in the previous blog post. In principle, due to the avoidance of ergodic theory, the arguments here have a greater chance to be made quantitative; however, at present the known bounds on the Gowers uniformity of the von Mangoldt function are qualitative, except at the {U^2} level, which is unfortunate since the first non-trivial odd case {k=3} requires quantitative control on the {U^3} level. (But it may be possible to make the Gowers uniformity bounds for {U^3} quantitative if one assumes GRH, although when one puts everything together, the actual decay rate obtained in (1) is likely to be poor.)