It is well known that the heat equation

$\dot f = \Delta f$ (1)

on a compact Riemannian manifold (M,g) (with metric g static, i.e. independent of time), where $f: [0,T] \times M \to {\Bbb R}$ is a scalar field, can be interpreted as the gradient flow for the Dirichlet energy functional

$\displaystyle E(f) := \frac{1}{2} \int_M |\nabla f|_g^2\ d\mu$ (2)

using the inner product $\langle f_1, f_2 \rangle_\mu := \int_M f_1 f_2\ d\mu$ associated to the volume measure $d\mu$. Indeed, if we evolve f in time at some arbitrary rate $\dot f$, a simple application of integration by parts (equation (29) from Lecture 1) gives

$\displaystyle \frac{d}{dt} E(f) = - \int_M (\Delta f) \dot f\ d\mu = \langle -\Delta f, \dot f \rangle_\mu$ (3)

from which we see that (1) is indeed the gradient flow for the Dirichlet energy (2) with respect to the inner product $\langle \cdot, \cdot \rangle_\mu$. In particular, if f solves the heat equation (1), we see that the Dirichlet energy is non-increasing in time:

$\displaystyle \frac{d}{dt} E(f) = - \int_M |\Delta f|^2\ d\mu$. (4)

Thus we see that by representing the PDE (1) as a gradient flow, we automatically gain a controlled quantity of the evolution, namely the energy functional that is generating the gradient flow. This representation also strongly suggests (though does not quite prove) that solutions of (1) should eventually converge to stationary points of the Dirichlet energy (2), which by (3) are just the harmonic functions (i.e. the functions f with $\Delta f = 0$).
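As a numerical illustration of (3) and (4), here is a minimal sketch under simplifying assumptions: M is taken to be the circle, discretised by n equally spaced points, with a finite-difference Laplacian and explicit Euler time stepping. The function names (`laplacian`, `dirichlet_energy`, `heat_step`, `simulate`) and the initial datum are illustrative choices, not part of the lecture.

```python
import math

def laplacian(f, h):
    """Periodic second-difference approximation of the Laplacian on the circle."""
    n = len(f)
    return [(f[(i + 1) % n] - 2 * f[i] + f[i - 1]) / h**2 for i in range(n)]

def dirichlet_energy(f, h):
    """Discrete analogue of (2): half the integral of |grad f|^2."""
    n = len(f)
    return 0.5 * h * sum(((f[(i + 1) % n] - f[i]) / h) ** 2 for i in range(n))

def heat_step(f, h, dt):
    """One explicit Euler step of the heat equation (1)."""
    return [fi + dt * li for fi, li in zip(f, laplacian(f, h))]

def simulate(n=64, steps=200):
    h = 2 * math.pi / n
    dt = 0.25 * h * h  # within the stability range dt <= h^2 / 2
    f = [math.sin(i * h) + 0.5 * math.cos(3 * i * h) for i in range(n)]
    energies = [dirichlet_energy(f, h)]
    # Discrete right-hand side of (4) at time zero: minus the integral of |Laplacian f|^2.
    dissipation = -h * sum(v * v for v in laplacian(f, h))
    for _ in range(steps):
        f = heat_step(f, h, dt)
        energies.append(dirichlet_energy(f, h))
    return dt, energies, dissipation

dt, energies, dissipation = simulate()
print("initial energy:", energies[0], "final energy:", energies[-1])
print("observed dE/dt:", (energies[1] - energies[0]) / dt, "predicted by (4):", dissipation)
```

The energy sequence should decrease at every step, and the initial difference quotient should agree with the right-hand side of (4) up to the O(dt) error of the time discretisation.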

As one very quick application of the gradient flow interpretation, we can assert that the only periodic (or “breather”) solutions to the heat equation (1) are the harmonic functions (which, in fact, must be constant if M is compact, thanks to the maximum principle). Indeed, if a solution f was periodic, then the monotone functional E must be constant, which by (4) implies that f is harmonic as claimed.

It would therefore be desirable to represent Ricci flow as a gradient flow also, in order to gain a new controlled quantity, and also to gain some hints as to what the asymptotic behaviour of Ricci flows should be. It turns out that one cannot quite do this directly (there is an obstruction caused by gradient steady solitons, of which we shall say more later); but Perelman nevertheless observed that one can interpret Ricci flow as gradient flow if one first quotients out the diffeomorphism invariance of the flow. In fact, there are infinitely many such gradient flow interpretations available. This fact already allows one to rule out “breather” solutions to Ricci flow, and also reveals some information about how Poincaré’s inequality deforms under this flow.

The energy functionals associated to the above interpretations are subcritical (in fact, they are much like $R_{\min}$) but they are not coercive; Poincaré’s inequality holds both in collapsed and non-collapsed geometries, and so these functionals are not excluding the former. However, Perelman discovered a perturbation of these functionals associated to a deeper inequality, the log-Sobolev inequality (first introduced by Gross in Euclidean space). This inequality is sensitive to volume collapsing at a given scale. Furthermore, by optimising over the scale parameter, the controlled quantity (now known as the Perelman entropy) becomes scale-invariant and prevents collapsing at any scale – precisely what is needed to carry out the first phase of the strategy outlined in the previous lecture to establish global existence of Ricci flow with surgery.

The material here is loosely based on Perelman’s paper, Kleiner-Lott’s notes, and Müller’s book.

– Ricci flow as gradient flow –

We would like to represent Ricci flow

$\dot g = - 2 \hbox{Ric}$ (5)

as a gradient flow of some functional (with respect to some inner product, or at least with respect to some Riemannian metric on the space of all metrics g). We will assume that all quantities are smooth and that the manifold is either compact or that all expressions being integrated are rapidly decreasing at infinity (so no boundary terms etc. arise from integration by parts).

To do this, our starting point will be the first variation formula for the scalar curvature R (equation (15) from Lecture 1) for an arbitrary instantaneous deformation $\dot g$ of the metric g:

$\dot R = - \hbox{Ric}^{\alpha \beta} \dot g_{\alpha \beta} - \Delta \hbox{tr}(\dot g) +\nabla^\alpha \nabla^\beta \dot g_{\alpha \beta}$. (6)

We can integrate in M to eliminate the latter two terms on the right-hand side (by Stokes’ theorem, see equation (28) from Lecture 1) to get

$\displaystyle \int_M \dot R\ d\mu = - \int_M \hbox{Ric}^{\alpha \beta} \dot g_{\alpha \beta} \ d\mu$. (7)

This looks rather promising; it suggests that if we introduce the Einstein-Hilbert functional

$\displaystyle H(M,g) := \int_M R\ d\mu$ (8)

then the Ricci flow (5) might be interpretable as a gradient flow for -2H.

Unfortunately, there is a problem because R is not the only time-dependent quantity in the right-hand side of (8); the volume measure $d\mu$ also evolves in time by the formula

$\displaystyle \frac{d}{dt} d\mu = \frac{1}{2} \hbox{tr}(\dot g)\ d\mu$ (9)

(see equation (19) from Lecture 1). Thus, from the product rule, the true variation of the Einstein-Hilbert functional is given by the formula

$\displaystyle \frac{d}{dt} H(M,g) = \int_M (- \hbox{Ric}^{\alpha \beta} + \frac{1}{2} R g^{\alpha \beta}) \dot g_{\alpha \beta}\ d\mu.$ (10)

So the gradient flow of -2H (using the inner product associated to $d\mu$) is not Ricci flow, but is instead a rather strange flow

$\dot g = -2 \hbox{Ric} + R g = -2G$ (11)

where $G:= \hbox{Ric} - \frac{1}{2} R g$ is the Einstein tensor. This flow does not have any particularly nice properties in general (it is not parabolic in three and higher dimensions, even after applying the de Turck trick from Lecture 1). On the other hand, in two dimensions the right-hand side of (10) vanishes and H(M,g) becomes invariant under deformations (we have already exploited this fact to prove the Gauss-Bonnet formula, see Proposition 1 from Lecture 4). More generally, we see from (10) the fact (well known in general relativity) that the (formal) stationary points of the Einstein-Hilbert functional are precisely the solutions of the vacuum Einstein equations $G=0$ (or equivalently, $\hbox{Ric}=0$ in any dimension other than 2).
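For reference, the product-rule computation behind (10) can be written out in one line each; this is just (7) and (9) combined, using $\hbox{tr}(\dot g) = g^{\alpha \beta} \dot g_{\alpha \beta}$.

```latex
\begin{aligned}
\frac{d}{dt} \int_M R\ d\mu
 &= \int_M \dot R\ d\mu + \int_M R\ \frac{d}{dt} d\mu \\
 &= - \int_M \hbox{Ric}^{\alpha \beta} \dot g_{\alpha \beta}\ d\mu
    + \frac{1}{2} \int_M R\, g^{\alpha \beta} \dot g_{\alpha \beta}\ d\mu \\
 &= \int_M \Big( - \hbox{Ric}^{\alpha \beta} + \frac{1}{2} R g^{\alpha \beta} \Big) \dot g_{\alpha \beta}\ d\mu .
\end{aligned}
```

In two dimensions one has $\hbox{Ric} = \frac{1}{2} R g$, so the integrand in the last line vanishes identically, which is the invariance noted above.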

We see that the variation of the measure $d\mu$ in time is causing us some difficulty. To fix this problem, let us take the (rather non-geometric looking) step of replacing this evolving measure $d\mu$ by some static measure $dm$ which we select in advance, and consider instead the variation of the functional $\int_M R\ dm$ with respect to some arbitrary perturbation $\dot g$. Now that m is static, we can apply (6) to get

$\displaystyle \frac{d}{dt} \int_M R\ dm = \int_M (- \hbox{Ric}^{\alpha \beta} \dot g_{\alpha \beta} - \Delta \hbox{tr}(\dot g) +\nabla^\alpha \nabla^\beta \dot g_{\alpha \beta})\ dm$. (12)

Previously, we used Stokes’ theorem to eliminate the latter two terms on the right-hand side to leave us with the one term $\int_M \hbox{Ric}^{\alpha \beta} \dot g_{\alpha \beta}\ dm$ that we do want. Unfortunately, Stokes’ theorem only applies for the volume measure $d\mu$, not for our static measure $dm$! In order to apply Stokes’ theorem, we must therefore convert the static measure back to volume measure. The Radon-Nikodym derivative $\frac{d\mu}{dm}$ of the two measures should be some positive function, which we shall denote by $e^f$ for some scalar (and time-varying) function $f: M \to {\Bbb R}$, thus

$dm = e^{-f} d\mu$. (13)

Inserting (13) into (12), integrating by parts using the volume measure $d\mu$, and then using (13) again to convert back to the static measure $dm$, we see after a little calculation that

$\displaystyle \int_M \Delta \hbox{tr}(\dot g)\ dm = \int_M ( |\nabla f|_g^2 - \Delta f) \hbox{tr}(\dot g)\ dm$ (14)

and similarly

$\displaystyle \int_M \nabla^\alpha \nabla^\beta \dot g_{\alpha \beta}\ dm = \int_M ( (\nabla^\alpha f) (\nabla^\beta f) - \nabla^{\alpha} \nabla^{\beta} f) \dot g_{\alpha \beta} \ dm$ (15)

and so we can express the right-hand side of (12) as

$\langle -\hbox{Ric}^{\alpha \beta} - (|\nabla f|_g^2 - \Delta f) g^{\alpha \beta} + (\nabla^\alpha f) (\nabla^\beta f) - \nabla^{\alpha} \nabla^{\beta} f, \dot g_{\alpha \beta} \rangle_m$. (16)
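The "little calculation" behind (14) and (15) can be sketched as follows. It only uses (13), integration by parts against $d\mu$ (no boundary terms, by our standing assumptions), and the chain rule for $e^{-f}$.

```latex
\begin{aligned}
\int_M \Delta \hbox{tr}(\dot g)\ dm
 &= \int_M \Delta \hbox{tr}(\dot g)\ e^{-f}\ d\mu
  = \int_M \hbox{tr}(\dot g)\ \Delta( e^{-f} )\ d\mu, \\
\Delta( e^{-f} )
 &= \nabla^\alpha ( - e^{-f} \nabla_\alpha f )
  = ( |\nabla f|_g^2 - \Delta f )\ e^{-f}, \\
\int_M \nabla^\alpha \nabla^\beta \dot g_{\alpha \beta}\ dm
 &= \int_M \dot g_{\alpha \beta}\ \nabla^\beta \nabla^\alpha ( e^{-f} )\ d\mu, \\
\nabla^\alpha \nabla^\beta ( e^{-f} )
 &= ( (\nabla^\alpha f)(\nabla^\beta f) - \nabla^\alpha \nabla^\beta f )\ e^{-f} .
\end{aligned}
```

Replacing $e^{-f}\ d\mu$ by $dm$ in the first and third lines gives (14) and (15) respectively.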

This looks rather unpleasant; we managed to eradicate the scalar curvature term $\frac{1}{2} R$ that was present in the variation in (10), but at the cost of introducing four new terms involving f. But to deal with this, first observe from differentiating (13) and using (9) and the static nature of dm that we know the first variation of f:

$\dot f = \frac{1}{2} \hbox{tr}(\dot g)$. (17)

So the term $\langle \Delta f g^{\alpha \beta}, \dot g_{\alpha \beta} \rangle_m$ that appears in (16) can be rewritten as $2 \int_M (\Delta f) \dot f\ dm$. Now this term looks familiar… in fact, it is essentially the variation (3) of the Dirichlet energy functional for the measure dm! This suggests that we may be able to simplify (16) if we modify our functional $\int_M R\ dm$ by adding some multiple of the Dirichlet functional $E := \frac{1}{2} \int_M |\nabla f|_g^2\ dm$.

One cannot apply (3) directly, though, because (a) g is evolving in time, rather than static, and also (b) dm is not the volume measure for g. But we have all the equations to deal with this, and one can compute the first variation of E:

Exercise 1. Show that

$\displaystyle \frac{d}{dt} E = - \frac{1}{2} \langle \Delta f g^{\alpha \beta} - |\nabla f|_g^2 g^{\alpha \beta} + (\nabla f)^\alpha (\nabla f)^{\beta}, \dot g_{\alpha \beta} \rangle_m$. (18)

(Hint: expand out $|\nabla f|_g^2 = g^{\alpha \beta} (\nabla_\alpha f) (\nabla_\beta f)$ and use (3) from Lecture 1.) $\diamond$

If we thus define the functional

$\displaystyle {\mathcal F}_m( M, g ) := \int_M (R + |\nabla f|^2)\ dm$ (19)

we see from (16), (18) that we get a lot of cancellation, ending up with

$\displaystyle \frac{d}{dt} {\mathcal F}_m( M, g ) = - \langle \hbox{Ric}^{\alpha \beta} + \nabla^{\alpha} \nabla^{\beta} f, \dot g_{\alpha \beta} \rangle_m$. (20)
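To make the cancellation explicit: since ${\mathcal F}_m = \int_M R\ dm + 2E$, the tensor paired with $\dot g_{\alpha \beta}$ in the variation of ${\mathcal F}_m$ is the tensor in (16) plus twice the tensor in (18). Written term by term (a bookkeeping sketch only):

```latex
\begin{aligned}
&\Big[ -\hbox{Ric}^{\alpha \beta} - |\nabla f|_g^2\, g^{\alpha \beta} + \Delta f\, g^{\alpha \beta}
      + (\nabla^\alpha f)(\nabla^\beta f) - \nabla^\alpha \nabla^\beta f \Big] \\
&\quad + \Big[ - \Delta f\, g^{\alpha \beta} + |\nabla f|_g^2\, g^{\alpha \beta}
      - (\nabla^\alpha f)(\nabla^\beta f) \Big]
 = -\hbox{Ric}^{\alpha \beta} - \nabla^\alpha \nabla^\beta f .
\end{aligned}
```

The first bracket is (16) and the second is $2 \times$ (18); the three pairs of terms involving $\Delta f$, $|\nabla f|_g^2$ and $(\nabla^\alpha f)(\nabla^\beta f)$ cancel.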

Thus the gradient flow of $-2{\mathcal F}_m(M,g)$ with respect to the inner product $\langle h, k \rangle_m := \int_M h^{\alpha \beta} k_{\alpha \beta}\ dm$ on symmetric two-forms (or more precisely, on the tangent space of such forms at g) is given by

$\dot g_{\alpha \beta} = - 2 \hbox{Ric}_{\alpha \beta} - 2 \nabla_{\alpha} \nabla_{\beta} f$. (21)

From (17) we see that f now evolves by a backward heat equation

$\dot f = - \Delta f - R$. (22)

With this flow, we see that ${\mathcal F}_m$ is monotone increasing, with

$\displaystyle \frac{d}{dt} {\mathcal F}_m = 2 \int_M |\hbox{Ric} + \hbox{Hess}(f)|^2\ dm$. (23)

The equation (21) is almost Ricci flow (5), but with one additional term associated with f. But we can observe (using equation (25) from Lecture 1) that $2\nabla_{\alpha} \nabla_{\beta} f = {\mathcal L}_{\nabla f} g_{\alpha \beta}$ is just the Lie derivative of g in the direction of the gradient vector field $\nabla^\gamma f$. Thus we see that (21) is a modified Ricci flow (see equation (36) from Lecture 1), which is conjugate to genuine Ricci flow by a diffeomorphism as discussed in that lecture. Thus while we have not established Ricci flow as a gradient flow directly, we have managed to find a whole family of gradient flows (parameterised by a choice of static measure dm, or equivalently by a choice of potential function f evolving by (17)) which are equivalent to Ricci flow modulo diffeomorphism. (Indeed, by placing an appropriate Riemannian structure on the moduli space of metrics modulo diffeomorphism, one can express Ricci flow modulo diffeomorphism as a true (formal) gradient flow; see Section 9 of the Kleiner-Lott notes.) As remarked in Perelman’s paper, one can view f as a kind of gauge function for the Ricci flow.

Example 1. If (M,g) is a Euclidean space $M = {\Bbb R}^d$ with the contracted Euclidean metric $g = \frac{\tau}{t_0} \eta$ for times $0 \leq t < t_0$, where $\tau := t_0 - t$ and $\eta$ is the standard metric, with $dm$ equal to the Gaussian measure $\frac{1}{(4\pi t_0)^{d/2}} e^{-|x|^2/4t_0}\ dx$ (thus $f(t,x) = \frac{|x|^2}{4t_0} + \frac{d}{2} \log(4\pi \tau)$), then g, f solve (21), (22). (One has to be a bit careful here because M is non-compact, of course.) $\diamond$

We can of course conjugate away the infinitesimal diffeomorphism given by the vector field $\nabla f$, which converts the system (21), (22) to the system

$\dot g = -2\hbox{Ric}; \quad \dot f = -\Delta f + |\nabla f|_g^2 - R$ (24)

(here we use the fact that ${\mathcal L}_{\nabla f} f = |\nabla f|_g^2$), which is Ricci flow coupled with a nonlinear backwards heat equation for the potential f. (Note that the equation for f is not always solvable forwards in time for any non-zero amount of time, but we can always solve it instantaneously for a fixed time, which is good enough for first variation analysis.) The nonlinear backwards heat equation for f can be linearised by setting $u := e^{-f}$, in which case it becomes the adjoint heat equation

$\dot u = - \Delta u + R u$. (24′)
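The linearisation can be checked directly from the chain rule. The following sketch uses only $u = e^{-f}$ and the second equation in (24):

```latex
\begin{aligned}
\dot u &= - \dot f\, u = ( \Delta f - |\nabla f|_g^2 + R )\, u, \\
\Delta u &= \nabla^\alpha ( - u \nabla_\alpha f ) = ( |\nabla f|_g^2 - \Delta f )\, u, \\
\dot u + \Delta u &= R u .
\end{aligned}
```

The nonlinear term $|\nabla f|_g^2$ is thus exactly what is absorbed by passing from f to u.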

Exercise 2. Writing $dm := u d\mu$, show that (24′) is equivalent to the equation

$\frac{d}{dt} dm = - \Delta dm$ (24”)

where $dm$ is viewed as a d-form for the purposes of applying the Laplacian. Thus the adjoint heat equation can be viewed as the backwards heat equation for d-forms. $\diamond$

Example 2. If (M,g) is a static Euclidean space $M = {\Bbb R}^d$ and $f(t,x) = \frac{|x|^2}{4\tau} + \frac{d}{2} \log(4\pi \tau)$ with $\tau = t_0 - t$ and the time variable t is restricted to be less than $t_0$, then g, f solve (24), and $dm = e^{-f} d\mu$ is the Gaussian measure $\frac{1}{(4\pi \tau)^{d/2}} e^{-|x|^2/4\tau}\ dx$, which solves the backwards heat equation. Note that this is the conjugated version of Example 1. Again, one needs to take care because M is non-compact. $\diamond$
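Example 2 can be sanity-checked numerically. The sketch below assumes d = 1 and an arbitrary final time $t_0 = 1$ (the constant `T0`), and uses central finite differences; the helper names are illustrative. Since R = 0 on Euclidean space, the equation for f in (24) reads $\dot f + \Delta f - |\nabla f|^2 = 0$, and $u = e^{-f}$ should satisfy $\dot u + \Delta u = 0$.

```python
import math

T0 = 1.0  # hypothetical choice of the final time t_0

def f(t, x, d=1):
    """Potential of Example 2 in dimension d = 1."""
    tau = T0 - t
    return x * x / (4 * tau) + 0.5 * d * math.log(4 * math.pi * tau)

def u(t, x):
    """Density of dm with respect to dx, i.e. the backwards heat kernel."""
    return math.exp(-f(t, x))

def d_dt(F, t, x, h=1e-4):
    return (F(t + h, x) - F(t - h, x)) / (2 * h)

def d_dx(F, t, x, h=1e-4):
    return (F(t, x + h) - F(t, x - h)) / (2 * h)

def d2_dx2(F, t, x, h=1e-4):
    return (F(t, x + h) - 2 * F(t, x) + F(t, x - h)) / h**2

def residual_f(t, x):
    """Residual of the f-equation in (24) with R = 0."""
    return d_dt(f, t, x) + d2_dx2(f, t, x) - d_dx(f, t, x) ** 2

def residual_u(t, x):
    """Residual of the backwards heat equation for u."""
    return d_dt(u, t, x) + d2_dx2(u, t, x)

for t in (0.0, 0.3, 0.5):
    for x in (-2.0, 0.5, 1.5):
        print(t, x, residual_f(t, x), residual_u(t, x))
```

Both residuals should be zero up to finite-difference and rounding error.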

By performing this conjugation, the measure m is no longer static, and we reflect this by changing the notation a little to

$\displaystyle {\mathcal F}(M, g, f) := {\mathcal F}_{e^{-f} \mu}(M,g) = \int_M (|\nabla f|^2 + R) e^{-f}\ d\mu$. (25)

The relationship between ${\mathcal F}$ and the flow (24) is analogous to that between ${\mathcal F}_m$ and (21), (22). For instance, we have the following analogue of (23):

Exercise 3. If g, f solve (24), show that

$\displaystyle \frac{d}{dt} {\mathcal F}(M, g, f) = 2 \int_M |\hbox{Ric} + \hbox{Hess}(f)|^2 e^{-f}\ d\mu$. $\diamond$ (26)

Thus ${\mathcal F}(M,g,f)$ is monotone non-decreasing in time. We would like to use this to develop a controlled quantity for Ricci flow, but we need to eliminate f. This can be accomplished by taking an infimum, defining

$\displaystyle \lambda(M,g) := \inf_{f: \int_M e^{-f}\ d\mu = 1} {\mathcal F}(M,g,f)$. (27)

The normalisation $\int_M e^{-f}\ d\mu = 1$ (which makes dm a probability measure) is needed to ensure a meaningful infimum; note that this normalisation is preserved by the flow (24) since dm is only moved around by diffeomorphisms. This quantity has an interpretation as the best constant in a Poincaré inequality:

Exercise 4. Show that $\lambda(M,g)$ is the least number for which one has the inequality

$\displaystyle \int_M 4 |\nabla u|_g^2 + R|u|^2\ d\mu \geq \lambda(M,g) \int_M |u|^2\ d\mu$ (28)

for all $u$ in the Sobolev space $H^1(M)$. (Hint: reduce to the case when u is positive and smooth and then make the substitution $u = e^{-f/2}$.) Conclude in particular that $\lambda(M,g)$ is finite, that it is the least eigenvalue of the self-adjoint modified Laplacian $-4\Delta + R$, and lies between $R_{\min}$ and the average scalar curvature $\overline{R} := \int_M R\ d\mu / \int_M\ d\mu$. $\diamond$

A variational argument (using the standard fact that $H^1(M)$ embeds compactly into $L^2(M)$) shows that equality in (28) is attained by some strictly positive $u = e^{-f/2}$ with norm $\int_M |u|^2\ d\mu = 1$, and so the infimum in (27) is also attained for some f. Applying the flow (24) instantaneously at a given time, we conclude (formally, at least) that we have the monotonicity formula

$\displaystyle \frac{d}{dt} \lambda(M,g) = 2 \int_M |\hbox{Ric} + \hbox{Hess}(f)|^2 e^{-f}\ d\mu$ (29)

for any solution to Ricci flow (5), where f is the extremiser for (27) (note that this extremiser f need not evolve via (24)). (One can in fact make this formula rigorous whenever the Ricci flow is smooth and M is compact, but we will not detail this here.)

This monotonicity is similar to the monotonicity of $R_{\min}$. For instance, the functional $\lambda(M,g)$ has a dimension of -2 in the sense of the previous lecture, same as $R_{\min}$. As further evidence of similarity, we have:

Exercise 5. Show that $\frac{d}{dt} \lambda(M,g) \geq \frac{2}{d} \lambda(M,g)^2$, and use this to conclude an analogue of Proposition 1 from Lecture 3 for $\lambda(M,g)$. In particular conclude that Ricci flow must develop a finite time singularity if $\lambda(M,g)$ is positive. $\diamond$

Exercise 6. If $(M,g)$ is a Ricci flow which is a steady breather in the sense that it is periodic modulo isometries (thus $(M,g(t))$ is isometric to $(M,g(0))$ for some $t > 0$), show that at time zero we have

$\hbox{Ric} = - \hbox{Hess}(f) = - \frac{1}{2} \mathcal{L}_{\nabla f} g$ (30)

for some $f: M \to {\Bbb R}$. Conclude that $g(t) = \exp( t \nabla f )^* g(0)$, thus $(M,g(t))$ simply evolves by diffeomorphism along the gradient vector field $\nabla f$. (For this you may need to use the uniqueness of the initial value problem for Ricci flow.) In other words, all steady breathers are gradient steady solitons. $\diamond$

Remark 1. One can apply a similar argument to deal with compact expanding breathers (in which $(M,g(t))$ is isometric to a larger dilate of $(M,g(0))$ for some $t>0$) by normalising $\lambda(M,g)$ by a power of the volume as in Exercise 1 of Lecture 7, concluding that such breathers are necessarily gradient expanding solitons with

$\hbox{Ric} = - \hbox{Hess}(f) - \frac{g}{2\sigma}$ (31)

at time zero for some potential f and some $\sigma > 0$; see Perelman’s paper (or Section 7 of Kleiner-Lott) for details. With a little more work (using the maximum principle) one can in fact show that f is constant, and so the only compact expanding breathers are Einstein manifolds. (Actually, this result can also be established using Exercise 1 from Lecture 7 directly, as follows from the work of Hamilton.) This normalisation of $\lambda(M,g)$ is also closely related to the Yamabe invariant of M; see this paper of Kotschick for further discussion. $\diamond$

Example 3. Any Ricci-flat manifold (i.e. $\hbox{Ric}=0$) is of course a gradient steady soliton with $f = 0$. A more non-trivial example is given by Hamilton’s cigar soliton (also known as Witten’s black hole), which is the two-dimensional manifold $M = {\Bbb R}^2$ with the conformal metric $dg^2 = \frac{dx^2+dy^2}{1+x^2+y^2}$ and potential function $f := -\log(1+x^2+y^2)$; we leave the verification of the gradient steady soliton equation (30) as an exercise. $\diamond$
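The soliton equation (30) for the cigar can be checked numerically. The sketch below is a finite-difference check, not a proof, and it assumes the potential $f = -\log(1+x^2+y^2)$ (this sign and normalisation are what the code uses). It writes the metric as $g = e^{2\phi}(dx^2+dy^2)$ with $\phi = -\frac{1}{2}\log(1+x^2+y^2)$, and uses two standard facts for conformally flat surfaces: $\hbox{Ric}_{ij} = -(\Delta_0 \phi)\delta_{ij}$, where $\Delta_0$ is the flat Laplacian, and $\Gamma^k_{ij} = \delta^k_i \partial_j \phi + \delta^k_j \partial_i \phi - \delta_{ij} \partial_k \phi$. All function names are illustrative.

```python
import math

def phi(x, y):
    # Conformal factor: g = exp(2*phi) * (dx^2 + dy^2) is the cigar metric.
    return -0.5 * math.log(1.0 + x * x + y * y)

def f(x, y):
    # Potential assumed for this check.
    return -math.log(1.0 + x * x + y * y)

def gradient(F, x, y, h=1e-3):
    return ((F(x + h, y) - F(x - h, y)) / (2 * h),
            (F(x, y + h) - F(x, y - h)) / (2 * h))

def flat_hessian(F, x, y, h=1e-3):
    fxx = (F(x + h, y) - 2 * F(x, y) + F(x - h, y)) / h**2
    fyy = (F(x, y + h) - 2 * F(x, y) + F(x, y - h)) / h**2
    fxy = (F(x + h, y + h) - F(x + h, y - h)
           - F(x - h, y + h) + F(x - h, y - h)) / (4 * h**2)
    return [[fxx, fxy], [fxy, fyy]]

def ricci(x, y):
    # In two dimensions Ric = K g with K = -exp(-2*phi) * (flat Laplacian of phi),
    # so in these coordinates Ric_ij = -(flat Laplacian of phi) * delta_ij.
    H = flat_hessian(phi, x, y)
    k = -(H[0][0] + H[1][1])
    return [[k, 0.0], [0.0, k]]

def covariant_hessian(F, x, y):
    # Hess(F)_ij = d_i d_j F - Gamma^k_ij d_k F for the conformally flat metric.
    H = flat_hessian(F, x, y)
    dF = gradient(F, x, y)
    dphi = gradient(phi, x, y)
    inner = dphi[0] * dF[0] + dphi[1] * dF[1]
    out = [[0.0, 0.0], [0.0, 0.0]]
    for i in range(2):
        for j in range(2):
            christoffel_term = dphi[j] * dF[i] + dphi[i] * dF[j] - (inner if i == j else 0.0)
            out[i][j] = H[i][j] - christoffel_term
    return out

def soliton_defect(x, y):
    """Largest entry of Ric + Hess(f); zero exactly when (30) holds at (x, y)."""
    ric = ricci(x, y)
    hess = covariant_hessian(f, x, y)
    return max(abs(ric[i][j] + hess[i][j]) for i in range(2) for j in range(2))

for point in [(0.0, 0.0), (1.0, 0.5), (-2.0, 1.5)]:
    print(point, soliton_defect(*point))
```

The printed defects should be zero up to discretisation error, consistent with $\hbox{Ric} = -\hbox{Hess}(f)$.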

Remark 2. If Ricci flow was a gradient flow for a functional which was geometric (or more precisely, invariant under diffeomorphism), then this flow could not deform a metric by any non-trivial diffeomorphism (since this is a stationary direction for this functional, rather than a steepest descent). Thus the existence of non-trivial gradient steady solitons, such as the cigar soliton, explains why Ricci flow cannot be directly expressed as a gradient flow without introducing a non-geometric object such as the reference measure dm or the potential function f. (See also Proposition 1.7 of Müller’s book for a different way of seeing that Ricci flow is not a pure gradient flow.) $\diamond$

Exercise 7. If (M,g) is a gradient steady soliton with potential f, show that $R + \Delta f = 0$, $|\nabla f|^2 + R = \hbox{const}$, and $\dot f = |\nabla f|^2$. (Hint: to prove the second identity, differentiate (30) and use the second Bianchi identity (Exercise 7 from Lecture 0).) Use the maximum principle to then conclude that the only compact gradient steady solitons are the Ricci-flat manifolds. $\diamond$

– Nash entropy –

Let us return to our analysis of the functional ${\mathcal F}_m(M,g)$, in which $dm=e^{-f}\ d\mu$ was fixed and g evolved by the modified Ricci flow (21) (which forced f to evolve by the backwards heat equation (22)). We then obtained the monotonicity formula (23). We shall normalise dm to be a probability measure.

We can squeeze a little bit more out of this formula – in particular, making it scale invariant – by introducing the Nash entropy

$N_m(M,g) := \int \log \frac{dm}{d\mu}\ dm = - \int f\ dm$ (31′)

which is the relative entropy of $d\mu$ with respect to the background measure dm. (Some further relations and analogies between the functionals described here and notions of entropy from statistical mechanics are discussed in Perelman’s paper.) From (22) and one integration by parts (using (13), of course) we know how this entropy changes with time:

$\displaystyle \frac{d}{dt} N_m(M,g) = \int (|\nabla f|^2 + R)\ dm = {\mathcal F}_m(M,g)$. (32)
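The computation behind (32) is short enough to record; it uses that dm is static, the evolution (22) of f, and one integration by parts against $d\mu$ via (13).

```latex
\begin{aligned}
\frac{d}{dt} N_m(M,g)
 &= - \int_M \dot f\ dm
  = \int_M ( \Delta f + R )\ dm, \\
\int_M \Delta f\ dm
 &= \int_M (\Delta f)\, e^{-f}\ d\mu
  = - \int_M \nabla^\alpha f\ \nabla_\alpha ( e^{-f} )\ d\mu
  = \int_M |\nabla f|_g^2\ dm .
\end{aligned}
```

Combining the two lines gives $\frac{d}{dt} N_m = \int_M (|\nabla f|^2 + R)\ dm$, which is ${\mathcal F}_m(M,g)$ by (19).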

To exploit this identity, let us first consider the case of gradient shrinking solitons:

Exercise 8. Suppose that a Riemannian manifold (M,g)=(M,g(0)) verifies an equation of the form

$\hbox{Ric} = - \hbox{Hess}(f) + \frac{1}{2\tau} g$ (33)

for some function f and some $\tau > 0$. Show that this equation is preserved for times $0 \leq t < \tau(0)$ if g evolves by Ricci flow, if $\tau$ evolves by $\dot \tau = -1$ (i.e. $\tau(t) = \tau(0)-t$), and $\partial_t f = |\nabla f|_g^2$, and that $g(t) = \frac{\tau(t)}{\tau(0)} \exp( t \nabla f )^* g(0)$ for all $0 \leq t < \tau(0)$. Such solutions are known as gradient shrinking solitons; they combine Ricci flow with the diffeomorphism and scaling flows from Lecture 1. Note that any positively curved Einstein manifold, such as the sphere, will be a gradient shrinking soliton (with f=0). Example 1 also shows that Euclidean space can be viewed as a gradient shrinking soliton. $\diamond$

If we are to find a scale-invariant (and diffeomorphism-invariant) monotone quantity for Ricci flow, it had better be constant on the gradient shrinking solitons. In analogy with (23), we would therefore like the variation of this monotone quantity with respect to Ricci flow to look something like

$\displaystyle 2\int_M |\hbox{Ric} + \hbox{Hess}(f) - \frac{1}{2\tau} g|_g^2\ dm$ (34)

where $\tau$ is some quantity decreasing at the constant rate

$\dot \tau = -1$. (35)

But the scaling is wrong; time has dimension 2 with respect to the Ricci flow scaling, and so the dimension of a variation of a scale-invariant quantity should be -2, while the expression (34) has dimension -4. (Note that f should be dimensionless (up to logarithms), $\tau$ has the same dimension as time, i.e. 2, and $\int_M dm=1$ is of course dimensionless.) So actually we should be looking at

$\displaystyle 2\tau\int_M |\hbox{Ric} + \hbox{Hess}(f) - \frac{1}{2\tau} g|_g^2\ dm$. (36)

To find a functional whose derivative is (36), we expand the integrand as

$|\hbox{Ric} + \hbox{Hess}(f) - \frac{1}{2\tau} g|_g^2 = |\hbox{Ric} + \hbox{Hess}(f)|_g^2 - \frac{1}{\tau} (R + \Delta f) + \frac{d}{4\tau^2}$. (37)

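Using (23), the identity $\int_M (R + \Delta f)\ dm = {\mathcal F}_m(M,g)$ (which follows from the same integration by parts as in (32)) and the normalisation $\int_M\ dm = 1$, we can thus express (36) as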

$\tau \frac{d}{dt} {\mathcal F}_m(M,g) - 2 {\mathcal F}_m(M,g) + \frac{d}{2\tau}$. (38)
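In more detail, inserting the expansion (37) into (36) and integrating term by term gives the following (a sketch, using (23) for the first term, the integration by parts $\int_M \Delta f\ dm = \int_M |\nabla f|_g^2\ dm$ for the second, and $\int_M dm = 1$ for the third):

```latex
\begin{aligned}
2\tau \int_M \Big| \hbox{Ric} + \hbox{Hess}(f) - \frac{1}{2\tau} g \Big|_g^2\ dm
 &= 2\tau \int_M | \hbox{Ric} + \hbox{Hess}(f) |_g^2\ dm
    - 2 \int_M ( R + \Delta f )\ dm
    + \frac{d}{2\tau} \int_M dm \\
 &= \tau \frac{d}{dt} {\mathcal F}_m(M,g)
    - 2 \int_M ( R + |\nabla f|_g^2 )\ dm
    + \frac{d}{2\tau} \\
 &= \tau \frac{d}{dt} {\mathcal F}_m(M,g) - 2 {\mathcal F}_m(M,g) + \frac{d}{2\tau} .
\end{aligned}
```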

Using (32) and (35), we can express this as a total derivative:

$\displaystyle \frac{d}{dt} (\tau {\mathcal F}_m(M,g) - N_m(M,g) - \frac{d}{2} \log \tau)$. (39)

Thus the quantity in parentheses is monotone non-decreasing in time under the modified Ricci flow (21) (with f, $\tau$ evolving by (22), (35)).

If, in analogy with Example 1, we rewrite the potential function f as

$f = \tilde f + \frac{d}{2} \log(4\pi \tau)$ (40)

then $\tilde f$ obeys the slight variant of (22)

$\displaystyle \frac{d}{dt} \tilde f = -\Delta f - R + \frac{d}{2\tau}$ (41)

and is related to the fixed measure m by the formula

$dm = (4\pi \tau)^{-d/2} e^{-\tilde f}\ d\mu$ (42)

and the equality between (36) and (39) becomes

$\displaystyle \frac{d}{dt} {\mathcal W}_m(M,g,\tau) = 2\tau\int_M |\hbox{Ric} + \hbox{Hess}(\tilde f) - \frac{1}{2\tau} g|_g^2\ dm$ (43)

where

$\displaystyle {\mathcal W}_m(M,g,\tau) := \int_M [\tau(R + |\nabla \tilde f|^2) + \tilde f - d]\ dm$. (44)

The -d term here is harmless (since m is fixed), and is in place to normalise this expression to vanish in the Euclidean case (Example 1, where now $\tilde f(t,x) = |x|^2/4t_0$).
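The vanishing in the Euclidean case can be verified by hand. In Example 1 we have $R = 0$, $g = \frac{\tau}{t_0} \eta$ and $\tilde f = |x|^2/4t_0$, and dm is a Gaussian measure whose coordinates each have variance $2t_0$, so that $\int |x|^2\ dm = 2 d t_0$. A sketch of the computation:

```latex
\begin{aligned}
|\nabla \tilde f|_g^2
 &= \frac{t_0}{\tau} \Big| \frac{x}{2 t_0} \Big|^2
  = \frac{|x|^2}{4 t_0 \tau}, \\
{\mathcal W}_m(M,g,\tau)
 &= \int_{{\Bbb R}^d} \Big[ \tau \cdot \frac{|x|^2}{4 t_0 \tau} + \frac{|x|^2}{4 t_0} - d \Big]\ dm
  = \frac{1}{2 t_0} \int_{{\Bbb R}^d} |x|^2\ dm - d
  = \frac{2 d t_0}{2 t_0} - d = 0 .
\end{aligned}
```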

As before, it is convenient to conjugate away the diffeomorphism by $\nabla f$ to recover a pure Ricci flow. Define the Perelman entropy ${\mathcal W}(M,g,f,\tau)$ of a manifold $(M,g)$, a scalar function $f: M \to {\Bbb R}$, and a positive real $\tau > 0$, by

$\displaystyle {\mathcal W}(M,g,f,\tau) = \int_M [\tau(R + |\nabla f|^2) + f - d] (4\pi \tau)^{-d/2} e^{-f}\ d\mu$. (45)

Note that this quantity has dimension 0 (if f is viewed as dimensionless, and $\tau$ given the dimension 2).

Exercise 9. Suppose that g evolves by Ricci flow (5), f evolves by the nonlinear backward heat equation

$\dot f = -\Delta f + |\nabla f|^2 - R + \frac{d}{2\tau}$, (46)

and $\tau$ evolves by (35). Show that

$\displaystyle \frac{d}{dt} {\mathcal W}(M,g,f,\tau) = 2\tau \int_M |\hbox{Ric} + \hbox{Hess}(f) - \frac{1}{2\tau} g|_g^2 (4\pi \tau)^{-d/2} e^{-f}\ d\mu$. (47)

Also, if we write $u := (4\pi \tau)^{-d/2} e^{-f}$, show that (46) is equivalent to the adjoint heat equation

$\dot u = - \Delta u + R u$. $\diamond$ (48)

We have thus obtained a scale-invariant monotonicity formula, albeit one which depends on two additional time-varying parameters, f and $\tau$. To eliminate them, the obvious thing to do is to just take the infimum over all f and $\tau$; but we need to be sure that the infimum exists at all. This will be studied next.

– Connection to the log-Sobolev inequality –

We have just established the monotonicity formula (47) whenever g evolves by Ricci flow (5) and f, $\tau$ evolve by (46), (35). Let us now temporarily specialise to the case when $(M,g)$ is a static Euclidean space ${\Bbb R}^d$ (which of course obeys Ricci flow), and $\tau = -t$ (which of course obeys (35)), and now restrict to negative times $t < 0$. Now all curvatures $R, \hbox{Ric}$ vanish, thus for instance by (48) we see that $u=(4\pi \tau)^{-d/2} e^{-f}$ obeys the free backwards heat equation $\dot u = - \Delta u$. We will normalise $dm = u\ d\mu$ to be a probability measure, thus $\int_{{\Bbb R}^d} u\ dx = 1$.

Example 4. The key example to keep in mind here is $f(t,x) = |x|^2/4\tau$, in which case u becomes the backwards heat kernel $u(t,x) = (4\pi \tau)^{-d/2} e^{-|x|^2/4\tau}$. $\diamond$

We can now re-express the functional (45) in terms of u as

$\displaystyle {\mathcal W}(M,g,f,\tau) = \int_{{\Bbb R}^d} (\tau \frac{|\nabla u|^2}{u} - u \log u)\ dx - \frac{d}{2} \log(4\pi \tau) - d$. (49)

One easily verifies by direct calculation that this expression vanishes in the model case of Example 4. For more general u, we know that this quantity is monotone increasing in time, and so

${\mathcal W}(M,g,f,\tau)(t) \geq \lim_{t \to -\infty} {\mathcal W}(M,g,f,\tau)(t)$. (50)

Now suppose u is some non-negative test function $u_0(x)$ at time zero with total mass 1, then from the fundamental solution for the backwards heat equation we have

$\displaystyle u(t,x) = \frac{1}{(4\pi \tau)^{d/2}} \int_{{\Bbb R}^d} e^{-|x-y|^2/4\tau} u_0(y)\ dy = \frac{1}{(4\pi \tau)^{d/2}} \tilde u( t, x/\sqrt{\tau} )$ (51)

where $\tilde u$ is the renormalised solution

$\displaystyle \tilde u(t, x) := \int_{{\Bbb R}^d} e^{-|x - (y/\sqrt{\tau})|^2 / 4} u_0(y)\ dy$. (52)

Observe that $\tilde u(t,x)$ converges pointwise to $e^{-|x|^2/4}$ as $t \to -\infty$ for fixed x. Thus in some renormalised sense this general solution is converging to the model solution in Example 4 in the limit $t \to -\infty$.

We can rewrite the functional (49) after some calculation as

$\displaystyle {\mathcal W}(M,g,f,\tau) = \int_{{\Bbb R}^d} [\frac{|\nabla \tilde u|^2}{\tilde u^2} - \log \tilde u] (4\pi)^{-d/2} \tilde u\ dx - d$. (53)

One can check that $\nabla \tilde u$ is converging pointwise to $\nabla e^{-|x|^2/4}$. A careful application of dominated convergence then shows that in the limit $t \to -\infty$, (53) converges to the value attained in Example 4, i.e. zero. By the monotonicity formula, we have thus demonstrated that

${\mathcal W}(M,g,f,\tau) \geq 0$ (54)

for all times $-\infty < t < 0$. Writing $u = \phi^2$ and rearranging (49), we conclude the log-Sobolev inequality

$\displaystyle 2\int_{{\Bbb R}^d} \phi^2 \log \phi\ dx \leq 4\tau \int_{{\Bbb R}^d} |\nabla \phi|^2\ dx - \frac{d}{2} \log(4 \pi \tau) - d$ (55)

valid whenever $\tau > 0$ and $\int_{{\Bbb R}^d} \phi^2\ dx = 1$.
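The inequality (54), and hence (55), can be probed numerically in one dimension. The sketch below evaluates the right-hand side of (49) with d = 1 by a midpoint rule; the helper names (`perelman_W`, `gaussian`, `mixture`), the integration window and the test densities are all illustrative assumptions. For a centred Gaussian of variance $\sigma^2$, a direct computation gives the closed form $\frac{1}{2}(s - 1 - \log s)$ with $s = 2\tau/\sigma^2$, which is non-negative and vanishes exactly at the heat kernel $\sigma^2 = 2\tau$.

```python
import math

def perelman_W(u, du, tau, lo=-20.0, hi=20.0, n=20000):
    """Midpoint-rule value of (49) for d = 1, given a density u and its derivative du."""
    step = (hi - lo) / n
    total = 0.0
    for i in range(n):
        x = lo + (i + 0.5) * step
        ux = u(x)
        if ux > 0.0:  # skip points where the density underflows to zero
            total += (tau * du(x) ** 2 / ux - ux * math.log(ux)) * step
    return total - 0.5 * math.log(4 * math.pi * tau) - 1.0

def gaussian(mean, var):
    def u(x):
        return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)
    def du(x):
        return -(x - mean) / var * u(x)
    return u, du

def mixture(parts):
    """parts is a list of (weight, density, derivative) triples."""
    def u(x):
        return sum(w * p(x) for w, p, _ in parts)
    def du(x):
        return sum(w * dp(x) for w, _, dp in parts)
    return u, du

tau = 0.5
kernel = gaussian(0.0, 2 * tau)   # heat kernel of Example 4: variance 2*tau
wide = gaussian(0.0, 3.0)         # a Gaussian with the "wrong" variance
bimodal = mixture([(0.5, *gaussian(-1.0, 1.0)), (0.5, *gaussian(1.0, 1.0))])

W_kernel = perelman_W(*kernel, tau)
W_wide = perelman_W(*wide, tau)
W_bimodal = perelman_W(*bimodal, tau)
print(W_kernel, W_wide, W_bimodal)
```

The first value should vanish up to quadrature error, while the other two should be strictly positive, in line with (54).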

Exercise 10. By letting $dm := (4\pi \tau)^{-d/2} e^{-|x|^2/4\tau} dx$ be standard Gaussian measure and writing $u dx = F^2 dm$, deduce the original log-Sobolev inequality

$\displaystyle \int_{{\Bbb R}^d} F^2 \log F^2\ dm \leq 4\tau \int_{{\Bbb R}^d} |\nabla F|^2 dm$ (56)

of Gross, valid whenever $\tau > 0$ and $\int_{{\Bbb R}^d} F^2\ dm = 1$. [One key feature of this inequality, as compared to more traditional Sobolev inequalities, is that it is almost completely independent of the dimension d.] $\diamond$

Remark 3. We have seen how knowledge of the heat kernel can lead to log-Sobolev inequalities, by evolving by the (backwards) heat flow (this is an example of the semigroup method for proving inequalities). This connection can in fact be reversed, using log-Sobolev inequalities to deduce information about heat kernels. Heat kernels can in turn be used to deduce ordinary Sobolev estimates, which then imply log-Sobolev estimates by convexity inequalities such as Hölder’s inequality, thus showing that all these phenomena are morally equivalent. There is a vast literature on these subjects (and other related topics, such as hypercontractivity); so much so that there are not only multiple surveys on the subject, but even a survey of all the surveys. $\diamond$

We now return to the case of general Ricci flows (not just the Euclidean one).

Exercise 11. Let (M,g) be a compact Riemannian manifold, and let $\tau > 0$. Using the Euclidean log-Sobolev inequality (55), show that we have a lower bound of the form ${\mathcal W}(M,g,f,\tau) \geq - C(M,g,\tau)$ for all functions f with $\int (4\pi \tau)^{-d/2} e^{-f}\ d\mu = 1$. Show in fact that $C(M,g,\tau)$ can be chosen to depend only on $\tau$, the dimension, an upper bound for the magnitude of the Riemann curvature, and a lower bound for the injectivity radius. Using a rescaling and compactness argument, show also that we can take $C(M,g,\tau) \to 0$ as $\tau \to 0$. (Details can be found in Section 3.1 of Perelman’s paper.) $\diamond$

We can now define the quantity $\mu(M,g,\tau)$ to be the infimum of ${\mathcal W}(M,g,f,\tau)$ for all functions f with $\int (4\pi \tau)^{-d/2} e^{-f}\ d\mu = 1$; by the monotonicity formula (47), $\mu(M,g,\tau)$ is then non-decreasing if we evolve g by Ricci flow (5) and $\tau$ by (35). Thus we have obtained a one-parameter family of dimensionless monotone quantities (recalling that $\tau$ has dimension 2 with respect to scaling).

Remark 4. One can interpret $\mu(M,g,\tau)$ as a nonlinear analogue of the eigenvalue $\lambda(M,g)$. Indeed, just as $\lambda(M,g)$ is the least number $\lambda$ for which one can solve the linear eigenfunction equation

$(-4\Delta + R) \Phi = \lambda \Phi$ (57)

subject to the constraint $\int_M \Phi^2 = 1$, $\mu(M,g,\tau)$ is the least number $\mu$ for which one can solve the nonlinear eigenfunction equation

$\tau (-4\Delta+R) \Phi = 2 \Phi \log \Phi + (\mu+d) \Phi$ (58)

subject to the constraints $\Phi > 0$ and $\int_M (4\pi \tau)^{-d/2} \Phi^2\ d\mu = 1$. In particular we expect $\mu(M,g,\tau)$ to behave roughly like $\tau \lambda(M,g)$ in the limit $\tau \to \infty$. $\diamond$

Exercise 12. Show that the only shrinking breathers (those in which $(M,g(t))$ is isometric to a contraction of $(M,g(0))$ for some $t>0$) are the gradient shrinking solitons. $\diamond$

– Non-collapsing –

We now relate log-Sobolev inequalities (i.e. lower bounds on $\mu(M,g,\tau)$) to non-collapsing. We first note, by substituting $(4\pi \tau)^{-d/2} e^{-f} = \phi^2$ into (45) as in the Euclidean case, that we have the log-Sobolev inequality

$\displaystyle \int_{M} \phi^2 \log \phi^2\ d\mu \leq 4\tau \int_{M} |\nabla \phi|_g^2\ d\mu$

$\displaystyle + \tau \int_M R |\phi|^2\ d\mu - \frac{d}{2} \log(4 \pi \tau) - d - \mu(M,g,\tau)$ (59)

whenever $\phi$ is non-negative with $\int_M \phi^2\ d\mu = 1$.
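
For convenience, here is a sketch of the substitution behind (59). It assumes that (45) is the usual expression ${\mathcal W}(M,g,f,\tau) = \int_M (\tau(|\nabla f|_g^2 + R) + f - d) (4\pi\tau)^{-d/2} e^{-f}\ d\mu$ for the entropy; if the normalisation in (45) differs, the constants below would need to be adjusted accordingly. The computation is carried out for strictly positive $\phi$, with the non-negative case following by approximation.

```latex
% Assumption: (45) reads
%   W(M,g,f,\tau) = \int_M ( \tau(|\nabla f|^2 + R) + f - d ) (4\pi\tau)^{-d/2} e^{-f} d\mu.
% Substitute \phi^2 = (4\pi\tau)^{-d/2} e^{-f}, so that \int_M \phi^2 d\mu = 1.
\begin{align*}
f &= -\log \phi^2 - \tfrac{d}{2} \log(4\pi\tau), \qquad
\nabla f = -\frac{2 \nabla \phi}{\phi}, \qquad
|\nabla f|_g^2\, \phi^2 = 4 |\nabla \phi|_g^2, \\
{\mathcal W}(M,g,f,\tau)
&= 4\tau \int_M |\nabla \phi|_g^2\ d\mu + \tau \int_M R \phi^2\ d\mu
 - \int_M \phi^2 \log \phi^2\ d\mu - \tfrac{d}{2} \log(4\pi\tau) - d .
\end{align*}
% Since W(M,g,f,\tau) \geq \mu(M,g,\tau) by definition of the infimum,
% moving the entropy term to the left-hand side gives (59).
```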

To use this, suppose we have a ball $B = B(p,\sqrt{\tau})$ which has bounded normalised curvature, so in particular $R = O( \tau^{-1} )$ on this ball. On the other hand, if $\phi$ is supported on B with $L^2$ mass 1, then from Jensen’s inequality we have

$\displaystyle \int_{M} \phi^2 \log \phi^2\ d\mu \geq \log \frac{1}{\hbox{Vol}(B)}$ (54)

and we thus conclude from (59) that

$\displaystyle \log \frac{\tau^{d/2}}{\hbox{Vol}(B)} \leq 4 \tau \int_M |\nabla \phi|_g^2\ d\mu + O(1) - \mu(M,g,\tau)$. (60)
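
The Jensen step (54) used to reach (60) can be sanity-checked numerically. The following is a minimal sketch and not part of the original argument: it replaces B by a toy discretisation into equal-volume cells (the cell volume, the number of cells and all helper names are illustrative assumptions) and compares the Riemann sum of $\phi^2 \log \phi^2$ with $\log \frac{1}{\hbox{Vol}(B)}$ for random densities $\phi^2$ of total mass one.

```python
import math
import random


def entropy_integral(density, cell_volume):
    """Riemann sum for the integral of rho * log(rho), where rho = phi^2."""
    return sum(r * math.log(r) * cell_volume for r in density if r > 0.0)


def random_density(num_cells, cell_volume, rng):
    """Random positive density rho on equal-volume cells with total mass 1."""
    weights = [rng.random() + 1e-9 for _ in range(num_cells)]
    total_mass = sum(w * cell_volume for w in weights)
    return [w / total_mass for w in weights]


def jensen_gap(density, cell_volume):
    """Left side of (54) minus its right side; Jensen says this is >= 0."""
    volume = len(density) * cell_volume
    return entropy_integral(density, cell_volume) - math.log(1.0 / volume)


# Toy discretisation (hypothetical numbers): 250 cells of volume 0.01, so Vol(B) = 2.5.
rng = random.Random(0)
cell_volume = 0.01
num_cells = 250

# Gap for 100 random densities of mass one.
gaps = [
    jensen_gap(random_density(num_cells, cell_volume, rng), cell_volume)
    for _ in range(100)
]

# Gap for the uniform density phi^2 = 1/Vol(B), the equality case of (54).
uniform = [1.0 / (num_cells * cell_volume)] * num_cells
uniform_gap = jensen_gap(uniform, cell_volume)
```

Every entry of `gaps` should be non-negative, while `uniform_gap` should vanish up to rounding error, reflecting that equality in (54) is attained when $\phi^2$ is constant on B.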

If we let $\phi(x) := c \psi( d(x,p) / \sqrt{\tau} )$, where $\psi$ is a bump function that equals 1 on [-1/2,1/2] and is supported on [-1,1] (thus $\phi=c$ on the ball $B_{1/2} := B( p, \sqrt{\tau}/2 )$), and $c \leq 1/\hbox{Vol}(B_{1/2})^{1/2}$ is the normalisation constant needed to ensure that $\phi$ has $L^2$ mass one, then $\nabla \phi = O( c / \sqrt{\tau} )$ on $B$, and so we conclude

$\log \frac{\tau^{d/2}}{\hbox{Vol}(B)} \leq O( \hbox{Vol}( B ) / \hbox{Vol}( B_{1/2} ) ) - \mu(M,g,\tau)$. (61)
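
The passage from (60) to (61) only uses the size of c and of $\nabla \phi$. As a sketch (the implied constants depend on the choice of $\psi$, which is not specified beyond the properties listed above):

```latex
% \phi = c on B_{1/2} and \int_M \phi^2 d\mu = 1 force c^2 Vol(B_{1/2}) \leq 1.
% \nabla \phi = O(c/\sqrt{\tau}) and is supported in B, since x \mapsto d(x,p) is 1-Lipschitz.
\begin{align*}
4\tau \int_M |\nabla \phi|_g^2\ d\mu
\leq 4\tau \cdot O\Big( \frac{c^2}{\tau} \Big)\, \hbox{Vol}(B)
= O\big( c^2\, \hbox{Vol}(B) \big)
= O\Big( \frac{\hbox{Vol}(B)}{\hbox{Vol}(B_{1/2})} \Big).
\end{align*}
% Inserting this into (60), and absorbing the O(1) term there
% (the volume ratio is at least 1), gives (61).
```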

At this point we need to invoke the relative Bishop-Gromov inequality from comparison geometry, which among other things ensures that $\hbox{Vol}(B) = O( \hbox{Vol}(B_{1/2}) )$ under the assumption of bounded normalised curvature. Indeed, from equations (15) and (17) from the previous lecture we see that ${\mathcal L}_{\partial_r}\ d\mu = O( 1/r )\ d\mu$ inside the ball of radius $\sqrt{\tau}$, from which the claim easily follows within the radius of injectivity. (To generalise the inequality beyond this region, one simply works on the region inside the cut locus, which is star-shaped around the origin in $T_p M$.)

Using this inequality, we thus conclude that

$\hbox{Vol}(B) \gg \tau^{d/2} \exp( \mu( M,g,\tau) )$. (62)

Thus a lower bound on $\mu(M,g,\tau)$ enforces non-collapsing of volume at scale $\sqrt{\tau}$.
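
To make the last step explicit, here is a short sketch of how (62) follows from (61) once the Bishop-Gromov bound $\hbox{Vol}(B) = O( \hbox{Vol}(B_{1/2}) )$ is available. The constant C below is a name introduced only for this illustration; it stands for the implied constant, which depends on the dimension and on the bounded normalised curvature hypothesis.

```latex
% With Vol(B) = O(Vol(B_{1/2})), the right-hand side of (61) is at most C - \mu(M,g,\tau).
\begin{align*}
\log \frac{\tau^{d/2}}{\hbox{Vol}(B)} \leq C - \mu(M,g,\tau)
\quad \Longrightarrow \quad
\hbox{Vol}(B) \geq e^{-C}\, \tau^{d/2} \exp( \mu(M,g,\tau) ).
\end{align*}
% Exponentiating and rearranging is all that is used; e^{-C} is the implied constant in (62).
```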

Exercise 13. Use (62), Exercise 11 and the monotonicity properties of $\mu(M,g,\tau)$ to establish $\kappa$-noncollapsing of Ricci flows (Theorem 2 from the previous lecture). $\diamond$

Remark 5. This argument in fact establishes a stronger form of non-collapsing, in which in order to get non-collapsing at time $t_0$ and scale $r_0$, one only needs bounded normalised curvature at time $t_0$ (instead of on the time interval ${}[t_0-r_0^2,t_0]$). It also works in arbitrary dimension. The second proof of non-collapsing that we will give, based on the Perelman reduced volume instead of Perelman entropy, needs the spacetime bounded normalised curvature assumption but also works in arbitrary dimension. $\diamond$

Remark 6. The parameter $\kappa$ in the above result, which measures the quality of the non-collapsing, will deteriorate with time T. This is because the decay of $\tau$ from (35) entails that in order to get non-collapsing of the manifold at time $t_0$ and scale $r_0$, one needs some non-collapsing at time zero and scale $\sqrt{r_0^2 + t_0}$. Of course, since the manifold is initially compact, one always has some non-collapsing at each scale, but the quantitative constants associated to this non-collapsing will deteriorate as the scale increases, which will happen when T increases. Fortunately (and especially in view of our finite time extinction results) we only need to analyse Ricci flow on compact (though potentially rather large) time intervals ${}[0,T]$. $\diamond$

Remark 7. It was recently shown by Zhang that the monotonicity properties of the quantities $\mu(M,g,\tau)$ also hold for Ricci flows with surgery. This should enable one to completely replace all applications of Perelman reduced volume in the existing proof of the Poincaré conjecture in the literature by Perelman (as well as in the expositions of Kleiner-Lott, Cao-Zhu, and Morgan-Tian) by Perelman entropy, which may lead to a shorter proof overall (although one still needs the Perelman reduced length for another purpose, namely to control the geometry of ancient non-collapsed Ricci flows). However, we shall mostly follow the original arguments of Perelman in this course. $\diamond$

Remark 8. The above entropy functionals are also useful for studying the forward or backward heat equation on a static Riemannian manifold $(M,g)$ (basically, one keeps the heat-type equations for u or f but now replaces Ricci flow by the trivial flow $\dot g = 0$). However, some sign assumptions on curvature are now needed to recover the same type of monotonicity results. See this paper of Ni for details. $\diamond$

[Update, April 25: some corrections.]