In this lecture we discuss Perelman’s original approach to finite time extinction of the third homotopy group (Theorem 1 from the previous lecture), which, as previously discussed, can be combined with the finite time extinction of the second homotopy group to imply finite time extinction of the entire Ricci flow with surgery for any compact simply connected Riemannian 3-manifold, i.e. Theorem 4 from Lecture 2.

– Minimal disks –

In Lecture 4, we studied minimal immersed spheres f: S^2 \to M into a three-manifold, and how their area varied with respect to Ricci flow. This area variation formula was used to establish \pi_2 extinction, and was also used in the Colding-Minicozzi approach to \pi_3 extinction (see Lecture 5). The Perelman approach is similar, but is based upon minimal disks rather than minimal 2-spheres, which we will define as Lipschitz immersed maps f: D^2 \to M from the unit disk D^2 to M which are smooth on the interior of the disk, and with mean curvature zero on the interior of the disk.

For simplicity let us restrict attention to 3-manifolds (M,g) which are simply connected (this case is, of course, our main concern in this course). Then every loop \gamma: S^1 \to M spans at least one disk. Let A(\gamma,g) denote the minimal area of all such spanning disks. From the work of Morrey and Hildebrandt on Plateau’s problem in Riemannian manifolds, it is known that this area is in fact attained by a minimal disk whose boundary traces out \gamma. (The fact that this disk is immersed was established by Gulliver-Lesley and by Hardt-Simon.) One can think of A(\gamma,g) as the two-dimensional generalisation of the distance function d(x,y) between two points x,y (which one can think of as a map from S^0 to M). For instance, we have the following first variation formula for A(\gamma,g) analogous to that for the distance function.

Lemma 1. (First variation formula) Let \gamma: S^1 \to M be a loop in a 3-manifold (M,g), and let f: D^2 \to M be a minimal-area disk spanning \gamma, thus \int_{f(D^2)}\ d\mu = A(\gamma,g). Let t \mapsto \gamma_t be a smooth deformation of \gamma with \gamma_0 = \gamma. Then we have

\frac{d}{dt} A(\gamma_t,g)|_{t=0} \leq \int_\gamma g( N, \frac{d}{dt} \gamma_t|_{t=0} )\ ds (1)

where ds is the length element and N is the outward normal vector to f(D^2) on the boundary \gamma.

Proof. First suppose that \frac{d}{dt} \gamma_t|_{t=0} is orthogonal to the disk f(D^2). Then one can deform the disk f(D^2) to span \gamma_t for infinitesimally non-zero times t by flowing the disk along a vector field normal to that disk. Since f(D^2) is minimal, it has mean curvature zero, and so the first variation of the area in this case is zero by the calculation used to prove Proposition 2 of Lecture 4. Since the area of this deformed disk is an upper bound for A(\gamma_t,g), this proves (1) in this case.

In the case when \frac{d}{dt} \gamma_t|_{t=0} is tangential to f(D^2), the claim is clear simply by modifying the disk f(D^2) at the boundary to accommodate the change in \gamma_t with respect to the time parameter t. The general case then follows by combining the above two arguments. \Box

Now we let the manifold evolve by Ricci flow, and obtain a similar variation formula:

Corollary 1. (First variation formula with Ricci flow) Let t \mapsto (M,g(t)) be a Ricci flow, and for each time t let \gamma_t: S^1 \to M be a loop in a 3-manifold (M,g) smoothly varying in t, and let f_t: D^2 \to M be a minimal-area disk spanning \gamma_t. Then we have

\frac{d}{dt} A(\gamma_t,g(t)) \leq -\int_{f_t(D^2)} K_{f_t(D^2)}\ d\mu - \frac{1}{2} R_{\min} A(\gamma_t,g(t))

+ \int_{\gamma_t} g( N_t, \frac{d}{dt} \gamma_t)\ ds (2)

where K_{f_t(D^2)} is the Gauss curvature of f_t(D^2).

Proof. This follows from the chain rule and the computations used to derive equation (13) from Lecture 4. \Box

To deal with the Gauss curvature term, we need an analogue of the Gauss-Bonnet theorem for disks. Fortunately, we have such a result:

Proposition 1. (Gauss-Bonnet for disks) Let f: D^2 \to M be an immersed disk with boundary \gamma. Then we have

\int_{f(D^2)} K_{f(D^2)}\ d\mu + \int_\gamma k_{\gamma,f(D^2)}\ ds = 2\pi (3)

where k_{\gamma,f(D^2)} = -g( \nabla_T T, N ) is the signed curvature of the curve \gamma relative to the disk f(D^2) (here T is the unit tangent vector to \gamma, oriented in either direction; we are also abusing notation slightly by pulling back the Levi-Civita connection on TM to the pullback bundle over S^1 in order to define \nabla_T T properly).

Proof. We use another flow argument. All quantities here are intrinsic and so we may pull back to the unit disk D^2. Our task is now to show that

\int_{D^2} K\ d\mu - \int_{S^1} g(\nabla_T T, N)\ ds = 2\pi (4)

for all metrics (D^2,g) on the unit disk. (Note that the vectors T, N will depend on g.) By the argument used to prove Proposition 1 in Lecture 4, the left-hand side is invariant under any compactly supported perturbation of the metric g, so we may assume that the metric is Euclidean on some neighbourhood of the origin.

We express D^2 in polar coordinates (r,\theta). It will then suffice to establish the R=1 case of the identity

\int_0^R \int_0^{2\pi} K(r,\theta) (\partial_r \wedge \partial_\theta)\ d\theta dr - \int_0^{2\pi} (\nabla_{\partial_\theta} T \wedge T)(R,\theta)\ d\theta = 2\pi (5)

where we use the metric g to identify the 2-form \partial_r \wedge \partial_\theta with a scalar, and T and N on (R,\theta) are the tangent and outward normal vectors to the circle \{r=R\} (the orientation of T is not relevant, but let us fix it as anticlockwise for sake of discussion, with the orientation chosen so that N \wedge T is positive). Note here that we are heavily relying on the two-dimensionality of the situation!

Because the metric is Euclidean near the origin, (5) is true for R close to zero. Thus by the fundamental theorem of calculus, it suffices to verify the identity

\int_0^{2\pi} K(r,\theta) (\partial_r \wedge \partial_\theta)\ d\theta - \int_0^{2\pi} \partial_r (\nabla_{\partial_\theta} T \wedge T)(r,\theta)\ d\theta = 0 (6)

for all 0 < r < 1. But as the Levi-Civita connection respects the metric (and all constructions arising from that metric, such as the identification of 2-forms with scalars) we have

\partial_r (\nabla_{\partial_\theta} T \wedge T) = (\nabla_{\partial_r} \nabla_{\partial_\theta} T \wedge T) + (\nabla_{\partial_\theta} T \wedge \nabla_{\partial_r} T). (7)

Since T is always a unit vector, \nabla_{\partial_\theta} T and \nabla_{\partial_r} T must be orthogonal to T and thus (in this two-dimensional setting) must be parallel, so the last term in (7) vanishes. An integration by parts then shows that (\nabla_{\partial_\theta} \nabla_{\partial_r} T \wedge T) has vanishing integral. Finally, from the Bianchi identities that allow one to express Riemann curvature of 2-manifolds in terms of Gauss curvature, we have

(\nabla_{\partial_r} \nabla_{\partial_\theta} - \nabla_{\partial_\theta} \nabla_{\partial_r})T \wedge T = K (\partial_r \wedge \partial_\theta) (8)

and the claim follows. \Box
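As a quick numerical sanity check of (3), one can take a model case that is not discussed in the text: the geodesic cap of polar angle \alpha on the unit round sphere, where K = 1, the boundary circle has length 2\pi \sin \alpha, and its signed curvature relative to the cap is \cot \alpha. The sketch below (the function name and the midpoint quadrature are illustrative choices only) integrates the curvature numerically and adds the boundary term.

```python
import math

def gauss_bonnet_cap(alpha, n=20000):
    """Left-hand side of (3) for the cap {polar angle <= alpha} on the unit sphere."""
    # Interior term: K = 1 and the area element is sin(phi) dphi dtheta.
    # Midpoint rule in phi; the theta integral contributes a factor 2*pi.
    h = alpha / n
    curvature_integral = 2.0 * math.pi * h * sum(
        math.sin((i + 0.5) * h) for i in range(n)
    )
    # Boundary term: circle of length 2*pi*sin(alpha) with signed curvature
    # k = -g(nabla_T T, N) = cot(alpha), N being the outward normal of the cap.
    boundary_integral = 2.0 * math.pi * math.sin(alpha) * (
        math.cos(alpha) / math.sin(alpha)
    )
    return curvature_integral + boundary_integral

for alpha in (0.3, 1.0, math.pi / 2, 2.5):
    # The ratio to 2*pi should be 1 up to quadrature error.
    print(alpha, gauss_bonnet_cap(alpha) / (2.0 * math.pi))
```

Note that for \alpha > \pi/2 the boundary curvature is negative (the boundary circle bends away from the cap), and the larger interior term compensates for this.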

Exercise 1. Use Proposition 1 to reprove Proposition 1 from Lecture 4. \diamond

We can now combine Corollary 1 and Proposition 1 as follows. We say that a family of curves \gamma_t: S^1 \to M is undergoing curve-shortening flow if we have

\frac{\partial}{\partial t} \gamma_t = \nabla_T T (8')

where T is the tangent vector to \gamma_t.

Exercise 2. If \gamma_t undergoes curve-shortening flow, show that

\frac{d}{dt} \int_{\gamma_t}\ ds = - \int_{\gamma_t} |\nabla_T T|_g^2\ ds (9)

which may help explain the terminology “curve-shortening flow”. \diamond

Corollary 2. (First variation formula with Ricci flow and curve shortening flow) Let t \mapsto (M,g(t)) be a Ricci flow, and for each time t let \gamma_t: S^1 \to M be a loop in a 3-manifold (M,g) undergoing curve shortening flow, and let f_t: D^2 \to M be a minimal-area disk spanning \gamma_t. Then we have

\frac{d}{dt} A(\gamma_t,g(t)) \leq -2\pi - \frac{1}{2} R_{\min} A(\gamma_t,g(t)). (10)
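To see that the constant -2\pi in (10) is sharp, it may help to look at a model case which is an assumption of this sketch rather than part of the text: flat Euclidean space, viewed as a static Ricci flow with R_{\min} = 0, and a round circle of radius r. Curve-shortening flow then reduces to dr/dt = -1/r, the flat disk of area \pi r^2 is the minimal spanning disk, and (10) should hold with equality. The same computation also checks (9), since here \int |\nabla_T T|^2 ds = 2\pi/r. The helper names below are illustrative only.

```python
import math

def radius(t, r0):
    # Exact solution of dr/dt = -1/r with r(0) = r0, valid for t < r0**2 / 2.
    return math.sqrt(r0 * r0 - 2.0 * t)

def spanning_area(t, r0):
    # Area of the flat disk spanning the circle at time t.
    return math.pi * radius(t, r0) ** 2

def length(t, r0):
    return 2.0 * math.pi * radius(t, r0)

def rate(func, t, r0, h=1e-6):
    # Central difference approximation to d/dt func(t, r0).
    return (func(t + h, r0) - func(t - h, r0)) / (2.0 * h)

r0 = 1.5
for t in (0.0, 0.5, 1.0):
    r = radius(t, r0)
    # Columns: time, dA/dt, -2*pi, dL/dt, -(integral of k^2 ds) = -2*pi/r.
    print(t, rate(spanning_area, t, r0), -2.0 * math.pi,
          rate(length, t, r0), -2.0 * math.pi / r)
```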

– Perelman’s width functional –

We now begin a non-rigorous discussion of Perelman’s width functional, and how it is used to derive finite time \pi_3 extinction. There is a significant analytical difficulty regarding singularities in curve shortening flow, but we will address this issue later.

To simplify the exposition slightly, we will restrict attention to compact 3-manifolds whose components are all simply connected, and take advantage of Remark 2 in Lecture 5, although one can avoid use of this remark (and extend the analysis here to slightly more general manifolds, namely those with fundamental group a free product of cyclic groups and finite groups, and which contain no embedded \Bbb{RP}^2 with trivial normal bundle) by using the \pi_2 extinction theory from Lecture 4.

By this remark, all connected components of such manifolds are homotopy spheres, and in particular have trivial \pi_2 and \pi_3 isomorphic to the integers; thus every map f: S^3 \to M has a degree \hbox{deg}(f) \in {\Bbb Z}. This degree is only fixed up to sign, so we shall work primarily with the magnitude |\hbox{deg}(f)| of this degree.

Let M be one of these connected components, and fix a base point x_0 \in M.

We can identify S^3 with the space S^2 \times S^1 / (\{ p_2 \} \times S^1) \cup (S^2 \times \{p_1\}), where p_1, p_2 are base points of S^1, S^2 respectively, thus we contract \{ p_2 \} \times S^1 and S^2 \times \{p_1\} to a single point p_3. (To see why these spaces are homeomorphic, use the standard identification of the n-sphere S^n with an n-cube {}[0,1]^n with the entire boundary identified with a point p_n.) Thus, any map f: S^3 \to M with f(p_3) = x_0 can be viewed as a family \gamma_\omega: S^1 \to M of loops with fixed base point \gamma_\omega(p_1) = x_0 for \omega \in S^2, such that \gamma_\omega varies continuously in \omega and is identically equal to x_0 when \omega = p_2.

A little more generally, define a loop family \gamma = (\gamma_\omega)_{\omega \in S^2} to be a family \gamma_\omega: S^1 \to M of loops parameterised continuously by \omega \in S^2, such that \gamma_{p_2} \equiv x_0. (To put it another way, a loop family is a continuous map from S^2 to the loop space \Lambda M which has the constant loop x_0 as base point; equivalently, a loop family is a continuous map from S^2 \times S^1 / \{p_2\} \times S^1 to M which maps \{p_2\} \times S^1 to x_0.) Thus we see that every map f: S^3 \to M with f(p_3)=x_0 generates a loop family. The converse is not quite true, because we are not requiring the loops \gamma_\omega in a loop family to have fixed base point (i.e. we do not require \gamma_\omega(p_1)=x_0 for all \omega, only for \omega=p_2). However, as \pi_2(M) is trivial, the 2-sphere \omega \mapsto \gamma_\omega(p_1) is contractible, and so every loop family is homotopic to a loop family associated to a map f: S^3 \to M, and so in particular can be assigned a degree magnitude |\hbox{deg}(\gamma)|. This degree magnitude is well-defined and stable under deformations:

Exercise 3. Show that each loop family \gamma is associated to a unique degree magnitude, no matter how one chooses to contract the 2-sphere \omega \mapsto \gamma_\omega(p_1). Also, show that if a loop family \gamma can be continuously deformed to another loop family \tilde \gamma while staying within the class of loop families, then both loop families have the same degree magnitude. Conclude that the space of homotopy classes \pi_2(\Lambda M, x_0) of loop families can be canonically identified with \pi_3(M) \equiv {\Bbb Z}. \diamond

Exercise 4. Show that for any d \geq 1, the quotient S^d \times S^1 / \hbox{pt} \times S^1 is homotopy equivalent to the wedge sum S^d \vee S^{d+1}, and then use this to give another proof of Exercise 3. (Hint: first show that both spaces are homotopy equivalent to a sphere S^{d+1} with a disk D^d glued to it (identifying the boundary \partial D^d of the disk with some copy of S^{d-1} in S^{d+1}). The case d=1 might be easiest to visualise.) (Thanks to Kenny Maples, Peter Petersen, and Paul Smith for this hint.) \diamond

[Aside: the identification of \pi_3(M) with \pi_2(\Lambda M, x_0) presumably must also be immediate from the homotopy long exact sequence for fibrations, though I wasn't able to see this clearly.]

Given a loop family \gamma = (\gamma_\omega)_{\omega \in S^2}, define the width \tilde W_3(\gamma) of this family to be the quantity

\tilde W_3(\gamma) := \sup_{\omega \in S^2} A( \gamma_\omega ) (11)

and then for every non-negative integer \xi \in {\Bbb Z}^+, define the width \tilde W_3(\xi) to be the quantity

\tilde W_3(\xi) := \inf_{\gamma: |\hbox{deg}(\gamma)|= \xi} \tilde W_3(\gamma) (12)

(this is an inf of a sup of an inf!). We can define this concept for non-empty disconnected manifolds M also, by taking the infimum across all components and all choices of base point.

I do not know if \tilde W_3(\xi) is always positive when M is non-empty and \xi is positive (or equivalently, that if one has a loop family in which each loop is spanned by a disk of small area, that the entire loop family is contractible to a point). However, one can at least say that if \gamma is a loop family associated to a non-trivial degree \xi, then the length \int_{\gamma_\omega} ds of at least one of the loops \gamma_\omega is bounded away from zero by some constant depending only on M = (M,g), because if instead all loops had small length, then they could be contracted to a point, thus degenerating the loop family to an image of S^2, which is contractible since we are assuming \pi_2(M) to be trivial. This lower bound on length is important for technical reasons (which we are mostly suppressing here).

Let us temporarily pretend, though, that at some point in time during a Ricci flow t \mapsto (M,g(t)), that \tilde W_3(\xi) = \tilde W_3(\xi,t) is positive for some positive \xi, and that the infimum in (12) is attained by a smooth loop family \gamma, thus A(\gamma_\omega,g(t)) attains a maximum value of \tilde W_3(\xi,t) for some \omega \in S^2.

We now run the Ricci flow, while simultaneously deforming each loop \gamma_\omega in the loop family by curve-shortening flow (local existence for the latter flow is a result of Gage and Hamilton). Applying (10), we conclude that

\frac{d}{dt} \tilde W_3(\gamma) \leq -2\pi - \frac{1}{2} R_{\min} \tilde W_3(\gamma) (13)

(in the sense of forward difference quotients), and thus by assumption on \gamma that

\frac{d}{dt} \tilde W_3(\xi) \leq -2\pi - \frac{1}{2} R_{\min} \tilde W_3(\xi). (14)

Now we investigate what happens when a surgery occurs. It turns out that whenever a component of a pre-surgery manifold is disconnected into components of a post-surgery manifold, that there exist degree 1 (or -1) maps from the pre-surgery components to each of the post-surgery components (recall that all components are homotopy spheres, and in particular the 2-spheres that one performs surgery on are automatically contractible). Furthermore, these maps can be chosen to have Lipschitz constant less than 1 + \eta for any fixed \eta > 0, thus they are almost contractions. (We will discuss this fact later in this course, when we define surgery properly.) Because of this, we can convert any loop family on the pre-surgery component to a loop family on the post-surgery component which has the same degree magnitude and which has only slightly larger width at worst. Because of this, we can conclude that \tilde W_3(\xi) does not increase during surgery.

By arguing as in Lecture 4 we now conclude (using (14) and lower bounds on R_{\min}) that either the manifold becomes totally extinct or that \tilde W_3(\xi) becomes negative. The latter is absurd, and so we obtain the required finite time extinction (indeed, we have shown extinction not just of \pi_3 here, but of the entire manifold).
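To make the last step concrete, here is a small numerical sketch of how (14) forces the width to reach zero in finite time. It assumes a lower bound of the form R_{\min}(t) \geq -\frac{3}{2(t+C)}, which is the type of bound available in three dimensions; the constant C, the initial width W_0, and the function names are all hypothetical inputs, and surgeries are ignored. Under this assumption the worst case of (14) is the ODE dW/dt = -2\pi + \frac{3}{4(t+C)} W, and the integrating factor (t+C)^{-3/4} gives (t+C)^{-3/4} W(t) = C^{-3/4} W_0 - 8\pi ((t+C)^{1/4} - C^{1/4}), which vanishes at an explicit finite time.

```python
import math

def extinction_time_closed_form(w0, c):
    # Time at which C^(-3/4) W0 - 8*pi*((t+C)^(1/4) - C^(1/4)) reaches zero.
    return (c ** 0.25 + w0 * c ** -0.75 / (8.0 * math.pi)) ** 4 - c

def extinction_time_numeric(w0, c, dt=1e-4):
    # RK4 integration of the comparison ODE dW/dt = -2*pi + 3*W / (4*(t + c)),
    # stopped at the first step on which W becomes non-positive.
    def rhs(t, w):
        return -2.0 * math.pi + 0.75 * w / (t + c)

    t, w = 0.0, w0
    while True:
        k1 = rhs(t, w)
        k2 = rhs(t + dt / 2, w + dt * k1 / 2)
        k3 = rhs(t + dt / 2, w + dt * k2 / 2)
        k4 = rhs(t + dt, w + dt * k3)
        w_next = w + dt * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        if w_next <= 0.0:
            # Linear interpolation for the zero crossing inside the last step.
            return t + dt * w / (w - w_next)
        t, w = t + dt, w_next

w0, c = 10.0, 1.0  # hypothetical initial width and constant in the R_min bound
print(extinction_time_numeric(w0, c), extinction_time_closed_form(w0, c))
```

The width may grow at first (when \frac{3}{4(t+C)} W exceeds 2\pi), but the constant -2\pi term eventually dominates because (t+C)^{-3/4} is not integrable at infinity.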

– Ramps –

The above argument had one significant gap in it; it assumed that the infimum in (12) was always attained. In practice, this is not necessarily the case, and so the best one can do is find loop families \gamma for each time t with homotopy class \xi whose width is within \varepsilon of the minimal width \tilde W_3(\xi, t), for any small \varepsilon > 0. One can try to run the above arguments with this near-minimiser \gamma in place of an exact minimiser, but in order to do so, it is necessary to ensure that the curve-shortening flow, when applied to \gamma, exists for a period of time that is bounded from below uniformly in \varepsilon.

Unfortunately, the local existence theory of Gage and Hamilton (see also Altschuler and Grayson) only guarantees such a uniform lower bound on time of existence when the curvature magnitude \kappa := |\nabla_T T|_g of these curves is uniformly bounded from above. (Indeed, by considering what curve-shortening flow does to small circles in Euclidean space, it is clear that one cannot hope to obtain uniform lifespan bounds without such a curvature bound.) And, in general, such curvature bounds are not available. (For instance, as one approaches the minimal value of \tilde W_3(\xi,t), the curves may begin to develop cusps or folds (i.e. they cease to be immersed).)

To resolve this moderately serious technical obstacle, Perelman employed the use of ramps, following the work of Altschuler and Grayson (see also a related argument of Ecker and Huisken). The basic idea is to give all the loops an upward “slope” that is bounded from below, which (in conjunction with the maximum principle) will prevent singularities from forming. In order to create this upward slope, it is necessary to increase the dimension of the ambient manifold M by one, working with M \times S^1_\lambda instead of M, where S^1_\lambda = {\Bbb R}/\lambda {\Bbb Z} is the circle of length \lambda for some small \lambda > 0. (Amusingly, this idea of attaching some tightly rolled up dimensions to space also appears in string theory, though I doubt that there is any connection here.)

We now turn to the details. We first develop some general variation formulae and estimates for a curve-shortening flow t \mapsto \gamma_t in a time-varying Riemannian manifold (M,g(t)) of arbitrary dimension. As before, we let T denote the unit tangent vector along \gamma_t. We write H := \nabla_T T = \frac{\partial}{\partial t} \gamma for the curvature vector, which is of course also the rate of change of the curve under curve shortening flow, and write k := |H|_g for the curvature.

Write x for the variable parameterising the loop \gamma_t: S^1 \to M, and write X := \frac{\partial}{\partial x} \gamma_t for the spatial velocity vector for this loop, thus X is a scalar multiple of the tangent vector T. Here and in the sequel we abuse notation by identifying connections on the tangent bundle TM with connections on pullback bundles.

Exercise 5. (Commutativity of X and H) Show that \nabla_X H = \nabla_H X.  (Hint: first show that \nabla_H \nabla_X F = \hbox{Hess}(F)( H, X ) + dF( \nabla_H X ) for any scalar function F: M \to {\Bbb R}, and similarly with the roles of H and X reversed.  Now use the torsion-free nature of the Levi-Civita connection and duality.) \diamond

We now record a variation formula for the squared speed g(X,X).

Lemma 2. For fixed x, we have

\frac{\partial}{\partial t} g( X, X ) = - 2 \hbox{Ric}(X,X) - 2k^2 g(X,X). (15)

Proof. From the chain rule we have

\frac{\partial}{\partial t} g( X, X ) = ( \frac{\partial}{\partial t} g )(X,X) + 2 g( \nabla_H X, X ). (16)

The first term on the right-hand side of (16) is - 2 \hbox{Ric}(X,X) by the Ricci flow equation. On the other hand, H is orthogonal to T (as T is a unit vector), and so g(X,H)=0. From this and Exercise 5 we have

g( \nabla_H X, X ) = g( \nabla_X H, X ) = - g( H, \nabla_X X ). (17)

Writing X = g(X,X)^{1/2} T, and again using that H is orthogonal to T, we have

g( H, \nabla_X X ) = g(X,X) g( H, \nabla_T T ) = g(X,X) g(H,H). (18)

Since g(H,H) = k^2, the claim follows. \Box

Corollary 2. We have {}[H,T] = (k^2 + \hbox{Ric}(T,T)) T.

Proof. We already know that [H,X] vanishes. Expressing X = g(X,X)^{1/2} T and using the previous lemma (writing \frac{\partial}{\partial t} g(X,X) as \nabla_H g(X,X)), the corollary follows after a brief computation. \Box

We can now derive a heat equation for the curvature (vaguely reminiscent of a Bochner-type identity):

Lemma 3. (First variation of squared curvature) We have

\frac{\partial}{\partial t} k^2 = \nabla_T \nabla_T (k^2) - 2 g( \pi(\nabla_T H), \pi(\nabla_T H) ) + 2k^4 + O( k^2 ) (19)

where \pi is the projection to the orthogonal complement of X, and the implied constants in the O() terms depend only on the Riemannian manifold (M,g(t)) (and in particular on bounds on the Riemann curvature tensor).

Proof. We write k^2 = g(H,H). By the chain rule and the Ricci flow equation we have

\frac{\partial}{\partial t} g(H,H) = - 2 \hbox{Ric}(H,H) + 2 g(\nabla_H H,H). (20)

The first term on the right-hand side is O(k^2) which is acceptable. As for the second term, we expand

\nabla_H H = \nabla_H \nabla_T T = \nabla_T \nabla_H T + \nabla_{[H,T]} T + O(k). (21)

The O(k) term gives a contribution of O(k^2) to (20) which is acceptable. By Corollary 2, we have

\nabla_{[H,T]} T = (k^2 + O(1)) \nabla_T T = k^2 H + O(k) (22)

which gives a contribution of 2k^4 + O(k^2) to (20). Finally, we deal with the top-order term \nabla_T \nabla_H T. We express \nabla_H T = \nabla_T H + [H,T]. Applying Corollary 2 (and the orthogonality of T and H), we have

g( \nabla_T [H,T], H ) = (k^2+O(1)) g( \nabla_T T, H ) = k^4 + O(k^2) (23)

whereas from the Leibniz rule we have

g(\nabla_T \nabla_T H, H) = \frac{1}{2} \nabla_T \nabla_T g(H,H) - g( \nabla_T H, \nabla_T H ). (24)

Since H is orthogonal to T, we have

g(\nabla_T H, T) = - g(H, \nabla_T T) = - g(H,H) = -k^2 (25)

and so by Pythagoras

g(\nabla_T H,\nabla_T H) = g(\pi(\nabla_T H),\pi(\nabla_T H)) + k^4. (26)

Substituting (26) into (24), and combining this with (23) to calculate the net contribution of \nabla_T \nabla_H T, we obtain (19) as desired. \Box

Corollary 3. (First variation of curvature) We have

\frac{\partial}{\partial t} k \leq \nabla_T \nabla_T k + k^3 + O( k ) (27)

Proof. Expanding out (19) using the product rule and comparing with (27), we see that it suffices to show that

(\nabla_T k)^2 \leq g( \pi(\nabla_T H), \pi(\nabla_T H) ). (28)

But if we differentiate the identity k^2 = g(H,H) along T, we obtain

k \nabla_T k = g( \nabla_T H, H ) = g( \pi(\nabla_T H), H ) (29)

(since H is orthogonal to T) and the claim now follows from Cauchy-Schwarz. \Box

Note that by combining this corollary with the maximum principle (Corollary 1 of Lecture 3) we can get upper bounds on k for short times based on upper bounds for k at time zero. (By using energy estimates, one can also control higher derivatives of k, obtaining the usual parabolic type estimates as a consequence; such estimates are important for the analysis here but we will omit them.) Unfortunately, the non-linear term k^3 on the right-hand side has an unfavourable sign and can generate finite time blowup.
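The blowup mechanism can be seen in the simplest comparison model, obtained by discarding the diffusion term and the O(k) term in (27) and taking equality: dk/dt = k^3. (This is an illustrative simplification; it is exact for a round circle of radius 1/k in the Euclidean plane.) The solution is k(t) = k_0 / \sqrt{1 - 2 k_0^2 t}, which blows up at time 1/(2k_0^2), so the lifespan shrinks to zero as the initial curvature grows. The sketch below compares an RK4 integration against this formula; the function names are hypothetical.

```python
import math

def blowup_time(k0):
    # Blowup time of dk/dt = k^3 with k(0) = k0.
    return 1.0 / (2.0 * k0 * k0)

def k_exact(t, k0):
    # Closed-form solution, valid for t < blowup_time(k0).
    return k0 / math.sqrt(1.0 - 2.0 * k0 * k0 * t)

def k_rk4(t_end, k0, steps=20000):
    # Classical RK4 integration of dk/dt = k^3 on [0, t_end].
    def rhs(k):
        return k ** 3

    dt = t_end / steps
    k = k0
    for _ in range(steps):
        a = rhs(k)
        b = rhs(k + dt * a / 2)
        c = rhs(k + dt * b / 2)
        d = rhs(k + dt * c)
        k += dt * (a + 2 * b + 2 * c + d) / 6
    return k

for k0 in (1.0, 10.0, 100.0):
    t_star = blowup_time(k0)
    # Compare numerical and exact curvature at 90% of the blowup time.
    print(k0, t_star, k_rk4(0.9 * t_star, k0), k_exact(0.9 * t_star, k0))
```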

The situation is much improved, however, for a special class of loops known as ramps. These curves take values not in an arbitrary manifold M, but in a product manifold M \times S^1_\lambda (with the product Riemannian metric). The point here is that we have a vertical unit vector field U on this manifold (corresponding to infinitesimal rotation of the S^1_\lambda factor) which is parallel with respect to the Levi-Civita connection: \nabla_\alpha U = 0. Define a ramp to be a curve \gamma: S^1 \to M \times S^1_\lambda whose unit tangent vector T is always upward sloping in the sense that g(T,U) > 0 on all of \gamma (thus the ramp must “wrap around” the vertical fibre S^1_\lambda at least once, in order to return to its starting point). In particular, since \gamma is compact, we have a uniform lower bound g(T,U) \geq c for some c > 0.

Write u := g(T,U) for the evolution of such a ramp under curve shortening flow, thus one can view u as a function of t and x. On the one hand, we have the trivial pointwise bound

|u| \leq 1 (30)

from Cauchy-Schwarz. On the other hand, we have an evolution equation for u:

Proposition 2. (First variation of u) We have

\frac{\partial}{\partial t} u = \nabla_T \nabla_T u + (k^2 + O(1) )u (31)

Proof. Differentiating u = g(T,U) by the chain rule as before (using the fact that U is parallel to the connection, as well as the Ricci flow equation) we have

\frac{\partial}{\partial t} u = - 2\hbox{Ric}(T,U) + g( \nabla_H T, U ). (32)

Since U is parallel to the connection, it is annihilated by any commutator \hbox{Riem}(X,Y) = [\nabla_X, \nabla_Y] and thus \hbox{Ric}(T,U) = 0. Writing \nabla_H T = [H,T] + \nabla_T H = [H,T] + \nabla_T \nabla_T T and using Corollary 2, we conclude

\frac{\partial}{\partial t} u = (k^2 + O(1)) u + g( \nabla_T \nabla_T T, U). (33)

Since U is parallel to the connection, we can write g( \nabla_T \nabla_T T, U) = \nabla_T \nabla_T g(T,U), and the claim follows from the definition of u. \Box

As a particular corollary of (31), we have the inequality

\frac{\partial}{\partial t} u \geq \nabla_T \nabla_T u - O(|u|). (34)

Using the maximum principle (Corollary 1 from Lecture 3), and the assumption that u is initially bounded away from zero, we conclude that u continues to be bounded away from zero for all time t for which the curve-shortening flow exists (though this bound can deteriorate exponentially fast in t). In particular, u is positive and the curve continues to be a ramp. Furthermore, by applying the quotient rule to (27) and (31), one obtains after some calculation the differential inequality

\displaystyle \frac{\partial}{\partial t} f \leq \nabla_T \nabla_T f + \frac{2 \nabla_T u}{u} \nabla_T f + O( f ) (35)

for the quantity f := k/u. Applying the maximum principle again, and noting that f is initially bounded at time zero, we conclude that f is bounded for all time for which the solution exists (though again, the bound can deteriorate exponentially in t). Combining this with the trivial bound (30), we conclude that the curvature k is bounded for any period of time on which the solution exists, with the bound deteriorating exponentially in t. Combining this with the local existence theory (see Altschuler and Grayson), which asserts that the curve shortening flow can be continued whenever the curvature remains bounded, we conclude that curve shortening flows for ramps persist globally in time.
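A simple explicit example may make this concrete. Take the flat model {\Bbb R}^2 \times S^1_\lambda (an assumption of this sketch; it is not one of the manifolds considered in the text) and the helix x \mapsto (r \cos x, r \sin x, cx) with c := \lambda/2\pi, which is a ramp wrapping once around the fibre. A direct computation gives u = c/\sqrt{r^2+c^2} and k = r/(r^2+c^2), with \nabla_T T pointing radially inward, so curve shortening flow keeps the curve a helix of the same pitch with dr/dt = -r/(r^2+c^2). In particular r^2/2 + c^2 \log r + t is conserved, so the radius never reaches zero in finite time, in contrast to the circle (c = 0), which dies at time r_0^2/2. The code below integrates this ODE; the names and parameter values are illustrative only.

```python
import math

def evolve_helix(r0, c, t_end, steps=20000):
    """RK4 for dr/dt = -r / (r^2 + c^2), the radius of a helical ramp."""
    def rhs(r):
        return -r / (r * r + c * c)

    dt = t_end / steps
    r = r0
    for _ in range(steps):
        k1 = rhs(r)
        k2 = rhs(r + dt * k1 / 2)
        k3 = rhs(r + dt * k2 / 2)
        k4 = rhs(r + dt * k3)
        r += dt * (k1 + 2 * k2 + 2 * k3 + k4) / 6
    return r

def slope(r, c):
    # u = g(T, U) for the helix of radius r.
    return c / math.sqrt(r * r + c * c)

def curvature(r, c):
    # k = |nabla_T T| for the helix of radius r.
    return r / (r * r + c * c)

r0, c = 1.0, 0.1  # a circle of radius 1 would vanish at t = 0.5
for t_end in (0.25, 0.5, 1.0, 2.0):
    r = evolve_helix(r0, c, t_end)
    # Columns: time, radius, slope u, ratio f = k/u.
    print(t_end, r, slope(r, c), curvature(r, c) / slope(r, c))
```

In this example u increases towards 1 and f = k/u = r/(c\sqrt{r^2+c^2}) decreases, staying below 1/c, consistent with the bounds obtained from (34) and (35).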

Of course, in our applications to Ricci flow, the curves \gamma_\omega that we are applying curve shortening flow to are not ramps; they live in M rather than M \times S^1_\lambda. To address this, one has to embed M in M \times S^1_\lambda for some small \lambda and approximate each \gamma_\omega by a ramp that wraps around M \times S^1_\lambda exactly once. One then flows the ramps by curve shortening flow, and works with the minimal spanning areas A(\gamma) of these evolved ramps (rather than working with the curve shortening flow applied directly to the original curves).

There are of course many technical obstacles to this strategy. One of them is that one needs to show that small changes in the ramp \gamma do not significantly affect the area A(\gamma) of the minimal spanning disk. To achieve this, one needs to show that if two ramps \gamma_1, \gamma_2 are initially close in the sense that there is an annulus connecting them of small area, then they stay close (in the same sense) for any bounded period of time under curve shortening flow. This can be accomplished by using a first variation formula for area of minimal annuli which is similar to the first variation formula with Ricci flow and curve shortening flow (the first Corollary 2 above, leading to (10)). There are several other technical difficulties of an analytical nature to resolve; see Chapter 19 of Morgan-Tian’s book for full details.

[Update, Apr 21: New exercise added; various corrections.]