
Let ${P(z) = z^n + a_{n-1} z^{n-1} + \dots + a_0}$ be a monic polynomial of degree ${n}$ with complex coefficients. Then by the fundamental theorem of algebra, we can factor ${P}$ as

$\displaystyle P(z) = (z-z_1) \dots (z-z_n) \ \ \ \ \ (1)$

for some complex zeroes ${z_1,\dots,z_n}$ (possibly with repetition).

Now suppose we evolve ${P}$ with respect to time by heat flow, creating a function ${P(t,z)}$ of two variables with given initial data ${P(0,z) = P(z)}$ for which

$\displaystyle \partial_t P(t,z) = \partial_{zz} P(t,z). \ \ \ \ \ (2)$

On the space of polynomials of degree at most ${n}$, the operator ${\partial_{zz}}$ is nilpotent, and one can solve this equation explicitly both forwards and backwards in time by the Taylor series

$\displaystyle P(t,z) = \sum_{j=0}^\infty \frac{t^j}{j!} \partial_z^{2j} P(0,z).$

For instance, if one starts with a quadratic ${P(0,z) = z^2 + bz + c}$, then the polynomial evolves by the formula

$\displaystyle P(t,z) = z^2 + bz + (c+2t).$
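Because the series terminates after finitely many terms, the evolution can be computed directly on the coefficients. The following is a minimal Python sketch of this; the helper names `derivative` and `heat_flow`, and the convention of storing coefficients in ascending order of degree, are purely illustrative choices.

```python
from math import factorial

def derivative(coeffs):
    """Differentiate a polynomial stored as [a_0, a_1, ..., a_n] (ascending powers)."""
    return [k * c for k, c in enumerate(coeffs)][1:]

def heat_flow(coeffs, t):
    """Coefficients of P(t, z) = sum_j t^j / j! * (d/dz)^(2j) P(0, z).

    The loop stops once the repeated second derivative vanishes,
    which is the nilpotency of the second derivative on polynomials.
    """
    result = [0] * len(coeffs)
    term = list(coeffs)              # holds (d/dz)^(2j) P(0, z)
    j = 0
    while term:
        scale = t ** j / factorial(j)
        for k, c in enumerate(term):
            result[k] += scale * c
        term = derivative(derivative(term))
        j += 1
    return result

# Quadratic z^2 + 5z + 3 evolved to time t = 0.5: only the constant term moves.
print(heat_flow([3, 5, 1], 0.5))
```

For the quadratic with ${b = 5}$, ${c = 3}$ the only change at time ${t = 1/2}$ is that the constant term becomes ${c + 2t = 4}$. A negative value of `t` runs the flow backwards in time.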

As the polynomial ${P(t)}$ evolves in time, the zeroes ${z_1(t),\dots,z_n(t)}$ evolve also. Assuming for the sake of discussion that the zeroes are simple, the inverse function theorem tells us that the zeroes will (locally, at least) evolve smoothly in time. What are the dynamics of this evolution?

For instance, in the quadratic case, the quadratic formula tells us that the zeroes are

$\displaystyle z_1(t) = \frac{-b + \sqrt{b^2 - 4(c+2t)}}{2}$

and

$\displaystyle z_2(t) = \frac{-b - \sqrt{b^2 - 4(c+2t)}}{2}$

after arbitrarily choosing a branch of the square root. If ${b,c}$ are real and the discriminant ${b^2 - 4c}$ is initially positive, we see that we start with two real zeroes centred around ${-b/2}$, which then approach each other until time ${t = \frac{b^2-4c}{8}}$, at which point the roots collide and then move off from each other in an imaginary direction.
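To see the collision concretely, here is a small Python sketch that evaluates the two zeroes before, at, and after the collision time. The function names and the sample values ${b = 1}$, ${c = -2}$ are illustrative assumptions, and the branch of the square root is simply the principal one supplied by `cmath.sqrt`.

```python
import cmath

def quadratic_zeroes(b, c, t):
    """Zeroes of z^2 + b z + (c + 2t), using the principal square root."""
    root = cmath.sqrt(b * b - 4 * (c + 2 * t))
    return (-b + root) / 2, (-b - root) / 2

def collision_time(b, c):
    """Time at which the discriminant b^2 - 4(c + 2t) vanishes."""
    return (b * b - 4 * c) / 8

b, c = 1.0, -2.0                   # initial zeroes 1 and -2, centred at -b/2
t_star = collision_time(b, c)
for t in (0.0, 0.5 * t_star, t_star, 2.0 * t_star):
    z1, z2 = quadratic_zeroes(b, c, t)
    print(t, z1, z2)
```

With these sample values the zeroes stay real up to the collision time ${t = 9/8}$, meet at ${-b/2 = -1/2}$, and afterwards form a complex conjugate pair with real part ${-1/2}$.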

In the general case, we can obtain the equations of motion by implicitly differentiating the defining equation

$\displaystyle P( t, z_i(t) ) = 0$

in time using (2) to obtain

$\displaystyle \partial_{zz} P( t, z_i(t) ) + \partial_t z_i(t) \partial_z P(t,z_i(t)) = 0.$

To simplify notation we drop the explicit dependence on time, thus

$\displaystyle \partial_{zz} P(z_i) + (\partial_t z_i) \partial_z P(z_i)= 0.$

From (1) and the product rule, we see that

$\displaystyle \partial_z P( z_i ) = \prod_{j:j \neq i} (z_i - z_j)$

and

$\displaystyle \partial_{zz} P( z_i ) = 2 \sum_{k:k \neq i} \prod_{j:j \neq i,k} (z_i - z_j)$

(where all indices are understood to range over ${1,\dots,n}$) leading to the equations of motion

$\displaystyle \partial_t z_i = \sum_{k:k \neq i} \frac{2}{z_k - z_i}, \ \ \ \ \ (3)$

at least when one avoids those times in which there is a repeated zero. In the case when the zeroes ${z_i}$ are real, each term ${\frac{2}{z_k-z_i}}$ represents a (first-order) attraction in the dynamics between ${z_i}$ and ${z_k}$, but the dynamics are more complicated for complex zeroes (e.g. purely imaginary zeroes will experience repulsion rather than attraction, as one already sees in the quadratic example). Curiously, this system resembles that of Dyson Brownian motion (except with the Brownian motion part removed, and time reversed). I learned of the connection between the ODE (3) and the heat equation from this paper of Csordas, Smith, and Varga, but perhaps it has been mentioned in earlier literature as well.
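One can test the equations of motion (3) numerically against a case where the heat flow is explicit. For ${P(0,z) = z^3 - z}$ the Taylor series gives ${P(t,z) = z^3 + (6t-1) z}$, whose zeroes are ${0}$ and ${\pm \sqrt{1-6t}}$ for ${t < 1/6}$. The sketch below integrates (3) from the initial zeroes ${-1, 0, 1}$ and compares the result with this formula. The integrator is a standard fourth-order Runge-Kutta step, chosen only for convenience, and all function names are illustrative.

```python
from math import sqrt

def velocity(z):
    """Right-hand side of (3): dz_i/dt = sum over k != i of 2 / (z_k - z_i)."""
    return [sum(2 / (zk - zi) for k, zk in enumerate(z) if k != i)
            for i, zi in enumerate(z)]

def rk4_step(z, dt):
    """One classical Runge-Kutta step for the system dz/dt = velocity(z)."""
    k1 = velocity(z)
    k2 = velocity([a + 0.5 * dt * b for a, b in zip(z, k1)])
    k3 = velocity([a + 0.5 * dt * b for a, b in zip(z, k2)])
    k4 = velocity([a + dt * b for a, b in zip(z, k3)])
    return [a + dt / 6 * (p + 2 * q + 2 * r + s)
            for a, p, q, r, s in zip(z, k1, k2, k3, k4)]

def evolve(z, t, steps=1000):
    """Integrate (3) from time 0 to time t, assuming no collision occurs."""
    dt = t / steps
    for _ in range(steps):
        z = rk4_step(z, dt)
    return z

t = 0.1
numerical = evolve([-1.0, 0.0, 1.0], t)
exact = [-sqrt(1 - 6 * t), 0.0, sqrt(1 - 6 * t)]
print(numerical)
print(exact)
```

The same `velocity` function accepts complex inputs, so the sketch can also be used to experiment with the repulsion of purely imaginary zeroes.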

One interesting consequence of these equations is that if the zeroes are real at some time, then they will stay real as long as the zeroes do not collide. Let us now restrict attention to the case of real simple zeroes, in which case we will rename the zeroes as ${x_i}$ instead of ${z_i}$, and order them as ${x_1 < \dots < x_n}$. The evolution

$\displaystyle \partial_t x_i = \sum_{k:k \neq i} \frac{2}{x_k - x_i}$

can now be thought of as reverse gradient flow for the “entropy”

$\displaystyle H := -\sum_{i,j: i \neq j} \log |x_i - x_j|,$

(which is also essentially the logarithm of the discriminant of the polynomial) since we have

$\displaystyle \partial_t x_i = \frac{\partial H}{\partial x_i}.$
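As a sanity check on the signs, the following Python sketch compares a central finite-difference gradient of ${H}$ with the right-hand side of the evolution equation at a sample configuration. The function names, the sample points and the step size `h` are illustrative assumptions.

```python
from math import log

def entropy(x):
    """H = - sum over ordered pairs i != j of log |x_i - x_j|."""
    return -sum(log(abs(xi - xj))
                for i, xi in enumerate(x)
                for j, xj in enumerate(x) if i != j)

def velocity(x):
    """Right-hand side of the evolution: sum over k != i of 2 / (x_k - x_i)."""
    return [sum(2 / (xk - xi) for k, xk in enumerate(x) if k != i)
            for i, xi in enumerate(x)]

def numerical_gradient(f, x, h=1e-6):
    """Central finite-difference approximation to the gradient of f at x."""
    grad = []
    for i in range(len(x)):
        up, down = list(x), list(x)
        up[i] += h
        down[i] -= h
        grad.append((f(up) - f(down)) / (2 * h))
    return grad

x = [-1.0, 0.5, 2.0]
print(numerical_gradient(entropy, x))
print(velocity(x))
```

The two printed vectors should agree up to the finite-difference error, which reflects the fact that each unordered pair appears twice in the sum defining ${H}$.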

In particular, we have the monotonicity formula

$\displaystyle \partial_t H = 4E$

where ${E}$ is the “energy”

$\displaystyle E := \frac{1}{4} \sum_i (\frac{\partial H}{\partial x_i})^2$

$\displaystyle = \sum_i (\sum_{k:k \neq i} \frac{1}{x_k-x_i})^2$

$\displaystyle = \sum_{i,k: i \neq k} \frac{1}{(x_k-x_i)^2} + \sum_{i,j,k: i,j,k \hbox{ distinct}} \frac{1}{(x_k-x_i)(x_j-x_i)}$

$\displaystyle = \sum_{i,k: i \neq k} \frac{1}{(x_k-x_i)^2}$

where in the last line we use the antisymmetrisation identity

$\displaystyle \frac{1}{(x_k-x_i)(x_j-x_i)} + \frac{1}{(x_i-x_j)(x_k-x_j)} + \frac{1}{(x_j-x_k)(x_i-x_k)} = 0.$
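The cancellation of the cross terms is easy to confirm numerically. The sketch below evaluates the energy both as a sum of squared gradients and as a sum of inverse square gaps, and also sums the cross terms over all ordered triples of distinct indices. The function names and the sample configuration are illustrative.

```python
def energy_from_gradient(x):
    """E = sum_i ( sum over k != i of 1 / (x_k - x_i) )^2."""
    return sum(sum(1 / (xk - xi) for k, xk in enumerate(x) if k != i) ** 2
               for i, xi in enumerate(x))

def energy_pairwise(x):
    """E = sum over ordered pairs i != k of 1 / (x_k - x_i)^2."""
    return sum(1 / (xk - xi) ** 2
               for i, xi in enumerate(x)
               for k, xk in enumerate(x) if k != i)

def cross_terms(x):
    """Sum of 1 / ((x_k - x_i)(x_j - x_i)) over ordered distinct triples i, j, k."""
    n = len(x)
    return sum(1 / ((x[k] - x[i]) * (x[j] - x[i]))
               for i in range(n) for j in range(n) for k in range(n)
               if i != j and j != k and i != k)

x = [-1.0, 0.5, 2.0, 3.5]
print(energy_from_gradient(x), energy_pairwise(x), cross_terms(x))
```

Up to rounding error, the first two printed numbers coincide and the third is zero.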

Among other things, this shows that as one goes backwards in time, the entropy decreases, and so no collisions can occur to the past, only in the future, which is of course consistent with the attractive nature of the dynamics. As ${H}$ is a convex function of the positions ${x_1,\dots,x_n}$, one expects ${H}$ to also evolve in a convex manner in time, that is to say the energy ${E}$ should be increasing. This is indeed the case:

Exercise 1 Show that

$\displaystyle \partial_t E = 2 \sum_{i,j: i \neq j} (\frac{2}{(x_i-x_j)^2} - \sum_{k: i,j,k \hbox{ distinct}} \frac{1}{(x_k-x_i)(x_k-x_j)})^2.$

Symmetric polynomials of the zeroes are polynomial functions of the coefficients and should thus evolve in a polynomial fashion. One can compute this explicitly in simple cases. For instance, the center of mass is an invariant:

$\displaystyle \partial_t \frac{1}{n} \sum_i x_i = 0.$

The variance decreases linearly:

Exercise 2 Establish the virial identity

$\displaystyle \partial_t \sum_{i,j} (x_i-x_j)^2 = - 4n^2(n-1).$

As the variance (which is proportional to ${\sum_{i,j} (x_i-x_j)^2}$) cannot become negative, this identity shows that “finite time blowup” must occur – that the zeroes must collide at or before the time ${\frac{1}{4n^2(n-1)} \sum_{i,j} (x_i-x_j)^2}$.
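Since ${\sum_{i,j} (x_i-x_j)^2}$ is a symmetric polynomial of the zeroes, it can be read off from the coefficients: for monic ${P}$, Newton's identities give ${\sum_i x_i = -a_{n-1}}$ and ${\sum_i x_i^2 = a_{n-1}^2 - 2a_{n-2}}$. The Python sketch below uses this to check the linear decay on the example ${P(0,z) = z^3 - z}$, without computing any zeroes. The helper names are illustrative, and the code assumes a monic polynomial of degree at least two with coefficients stored in ascending order.

```python
from math import factorial

def derivative(coeffs):
    """Differentiate a polynomial stored as [a_0, a_1, ..., a_n]."""
    return [k * c for k, c in enumerate(coeffs)][1:]

def heat_flow(coeffs, t):
    """Coefficients of P(t, z) via the terminating Taylor series in t."""
    result = [0] * len(coeffs)
    term, j = list(coeffs), 0
    while term:
        scale = t ** j / factorial(j)
        for k, c in enumerate(term):
            result[k] += scale * c
        term = derivative(derivative(term))
        j += 1
    return result

def spread(coeffs):
    """sum_{i,j} (x_i - x_j)^2 for a monic polynomial, via Newton's identities."""
    n = len(coeffs) - 1
    p1 = -coeffs[n - 1]                           # sum of the zeroes
    p2 = coeffs[n - 1] ** 2 - 2 * coeffs[n - 2]   # sum of their squares
    return 2 * n * p2 - 2 * p1 ** 2

p0 = [0, -1, 0, 1]                  # z^3 - z, zeroes -1, 0, 1
n = 3
rate = -4 * n ** 2 * (n - 1)        # right-hand side of the virial identity
for t in (0.0, 0.05, 0.1):
    print(t, spread(heat_flow(p0, t)), spread(p0) + rate * t)

blowup_bound = spread(p0) / (4 * n ** 2 * (n - 1))
print(blowup_bound)
```

In this example the bound is ${12/72 = 1/6}$, which is exactly the time at which the three zeroes ${0, \pm\sqrt{1-6t}}$ of ${z^3 + (6t-1)z}$ collide, so the bound is attained.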

Exercise 3 Show that the Stieltjes transform

$\displaystyle s(t,z) = \sum_i \frac{1}{x_i - z}$

solves the viscous Burgers equation

$\displaystyle \partial_t s = \partial_{zz} s - 2 s \partial_z s,$

either by using the original heat equation (2) and the identity ${s = - \partial_z P / P}$, or else by using the equations of motion (3). This relation between the Burgers equation and the heat equation is known as the Cole-Hopf transformation.
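Without giving away the exercise, one can at least test the claim numerically. The sketch below uses the explicit example ${P(t,z) = z^3 + (6t-1)z}$ (the heat flow of ${z^3 - z}$, with zeroes ${0, \pm\sqrt{1-6t}}$ for ${t < 1/6}$) and evaluates the residual of the Burgers equation by central finite differences at a complex test point. The function names, the test point and the step sizes are illustrative assumptions.

```python
from math import sqrt

def zeroes(t):
    """Zeroes of z^3 + (6t - 1) z, valid for t < 1/6."""
    r = sqrt(1 - 6 * t)
    return [-r, 0.0, r]

def s(t, z):
    """Stieltjes transform: sum over the zeroes x_i of 1 / (x_i - z)."""
    return sum(1 / (x - z) for x in zeroes(t))

def burgers_residual(t, z, ht=1e-5, hz=1e-3):
    """Finite-difference estimate of s_t - (s_zz - 2 s s_z) at (t, z)."""
    s_t = (s(t + ht, z) - s(t - ht, z)) / (2 * ht)
    s_z = (s(t, z + hz) - s(t, z - hz)) / (2 * hz)
    s_zz = (s(t, z + hz) - 2 * s(t, z) + s(t, z - hz)) / hz ** 2
    return s_t - (s_zz - 2 * s(t, z) * s_z)

# The residual should be small, limited only by the finite-difference error.
print(abs(burgers_residual(0.05, 0.5 + 1.0j)))
```

The test point is taken off the real axis so that it stays away from the poles of ${s}$ at the zeroes.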

The paper of Csordas, Smith, and Varga mentioned previously gives some other bounds on the lifespan of the dynamics; roughly speaking, they show that if there is one pair of zeroes that are much closer to each other than to the other zeroes then they must collide in a short amount of time (unless there is a collision occurring even earlier at some other location). Their argument extends also to situations where there are an infinite number of zeroes, which they apply to get new results on Newman’s conjecture in analytic number theory. I would be curious to know of further places in the literature where this dynamics has been studied.

Van Vu and I have just uploaded to the arXiv our paper “Random matrices: The Universality phenomenon for Wigner ensembles”. This survey is a longer version (58 pages) of a previous short survey we wrote up a few months ago. The survey focuses on recent progress in understanding the universality phenomenon for Hermitian Wigner ensembles, of which the Gaussian Unitary Ensemble (GUE) is the most well known. The one-sentence summary of this progress is that many of the asymptotic spectral statistics (e.g. correlation functions, eigenvalue gaps, determinants, etc.) that were previously known for GUE matrices, are now known for very large classes of Wigner ensembles as well. There are however a wide variety of results of this type, due to the large number of interesting spectral statistics, the varying hypotheses placed on the ensemble, and the different modes of convergence studied, and it is difficult to isolate a single such result currently as the definitive universality result. (In particular, there is at present a tradeoff between generality of ensemble and strength of convergence; the universality results that are available for the most general classes of ensemble are only presently able to demonstrate a rather weak sense of convergence to the universal distribution (involving an additional averaging in the energy parameter), which limits the applicability of such results to a number of interesting questions in which energy averaging is not permissible, such as the study of the least singular value of a Wigner matrix, or of related quantities such as the condition number or determinant. But it is conceivable that this tradeoff is a temporary phenomenon and may be eliminated by future work in this area; in the case of Hermitian matrices whose entries have the same second moments as that of the GUE ensemble, for instance, the need for energy averaging has already been removed.)

Nevertheless, throughout the family of results that have been obtained recently, there are two main methods which have been fundamental to almost all of the recent progress in extending from special ensembles such as GUE to general ensembles. The first method, developed extensively by Erdős, Schlein, Yau, Yin, and others (and building on an initial breakthrough by Johansson), is the heat flow method, which exploits the rapid convergence to equilibrium of the spectral statistics of matrices undergoing Dyson-type flows towards GUE. (An important aspect to this method is the ability to accelerate the convergence to equilibrium by localising the Hamiltonian, in order to eliminate the slowest modes of the flow; this refinement of the method is known as the “local relaxation flow” method. Unfortunately, the translation mode is not accelerated by this process, which is the principal reason why results obtained by pure heat flow methods still require an energy averaging in the final conclusion; it would be of interest to find a way around this difficulty.) The other method, which goes all the way back to Lindeberg in his classical proof of the central limit theorem, and which was introduced to random matrix theory by Chatterjee and then developed for the universality problem by Van Vu and myself, is the swapping method, which is based on the observation that spectral statistics of Wigner matrices tend to be stable if one replaces just one or two entries of the matrix with another distribution, with the stability of the swapping process becoming stronger if one assumes that the old and new entries have many matching moments. The main formalisations of this observation are known as four moment theorems, because they require four matching moments between the entries, although there are some variant three moment theorems and two moment theorems in the literature as well. Our initial four moment theorems were focused on individual eigenvalues (and later extended to eigenvectors), but it was later observed by Erdős, Yau, and Yin that simpler four moment theorems could also be established for aggregate spectral statistics, such as the coefficients of the Green’s function, and Knowles and Yin also subsequently observed that these latter theorems could be used to recover a four moment theorem for eigenvalues and eigenvectors, giving an alternate approach to proving such theorems.

Interestingly, it seems that the heat flow and swapping methods are complementary to each other; the heat flow methods are good at removing moment hypotheses on the coefficients, while the swapping methods are good at removing regularity hypotheses. To handle general ensembles with minimal moment or regularity hypotheses, it is thus necessary to combine the two methods (though perhaps in the future a third method, or a unification of the two existing methods, might emerge).

Besides the heat flow and swapping methods, there are a number of other basic tools that are needed in these results, such as local semicircle laws and eigenvalue rigidity, which are also discussed in the survey. We also survey how universality has been established for a wide variety of spectral statistics; the ${k}$-point correlation functions are the most well known of these statistics, but they do not tell the whole story (particularly if one can only control these functions after an averaging in the energy), and there are a number of other statistics, such as eigenvalue counting functions, determinants, or spectral gaps, for which the above methods can be applied.

In order to prevent the survey from becoming too enormous, we decided to restrict attention to Hermitian matrix ensembles, whose entries off the diagonal are identically distributed, as this is the case in which the strongest results are available. There are several results that are applicable to more general ensembles than these which are briefly mentioned in the survey, but they are not covered in detail.

We plan to submit this survey eventually to the proceedings of a workshop on random matrix theory, and will continue to update the references on the arXiv version until the time comes to actually submit the paper.

Finally, in the survey we issue some errata for previous papers of Van and myself in this area, mostly centering around the three moment theorem (a variant of the more widely used four moment theorem), for which the original proof of Van and myself was incomplete. (Fortunately, as the three moment theorem had many fewer applications than the four moment theorem, and most of the applications that it did have ended up being superseded by subsequent papers, the actual impact of this issue was limited, but still an erratum is in order.)

Below the fold is a version of my talk “Recent progress on the Kakeya conjecture” that I gave at the Fefferman conference.