In this lecture we discuss Perelman’s original approach to finite time extinction of the third homotopy group (Theorem 1 from the previous lecture), which, as previously discussed, can be combined with the finite time extinction of the second homotopy group to imply finite time extinction of the entire Ricci flow with surgery for any compact simply connected Riemannian 3-manifold, i.e. Theorem 4 from Lecture 2.

– Minimal disks –

In Lecture 4, we studied minimal immersed spheres in a three-manifold, and how their area varied with respect to Ricci flow. This area variation formula was used to establish extinction, and was also used in the Colding-Minicozzi approach to extinction (see Lecture 5). The Perelman approach is similar, but is based upon minimal disks rather than minimal 2-spheres. We define a minimal disk to be a Lipschitz immersed map from the unit disk D to M which is smooth on the interior of the disk and has mean curvature zero there.

For simplicity let us restrict attention to 3-manifolds (M,g) which are simply connected (this case is, of course, our main concern in this course). Then every loop $\gamma: S^1 \to M$ spans at least one disk. Let $A(\gamma) = A(\gamma,g)$ denote the minimal area of all such spanning disks. From the work of Morrey and Hildebrandt on Plateau’s problem in Riemannian manifolds, it is known that this area is in fact attained by a minimal disk whose boundary traces out $\gamma$. (The fact that this disk is immersed was established by Gulliver-Lesley and by Hardt-Simon.) One can think of $A(\gamma)$ as the two-dimensional generalisation of the distance function $d(x,y)$ between two points x, y (which one can think of as a map from $S^0$ to M). For instance, we have the following first variation formula for $A(\gamma)$, analogous to that for the distance function.

Lemma 1. (First variation formula) Let $\gamma: S^1 \to M$ be a loop in a 3-manifold (M,g), and let $f: D \to M$ be a minimal-area disk spanning $\gamma$, thus $f(D)$ has area $A(\gamma)$. Let $\gamma_t: S^1 \to M$ be a smooth deformation of $\gamma$ with $\gamma_0 = \gamma$. Then we have

$$\frac{d}{dt} A(\gamma_t)\Big|_{t=0} \leq \int_\gamma g\big(\partial_t \gamma_t|_{t=0}, n\big)\, ds \qquad (1)$$

where ds is the length element and n is the outward normal vector to $f(D)$ on the boundary $\gamma$.

**Proof.** First suppose that $\partial_t \gamma_t$ is orthogonal to the disk $f(D)$. Then one can deform the disk to span $\gamma_t$ for infinitesimally non-zero times t by flowing the disk along a vector field normal to that disk. Since $f$ is minimal, it has mean curvature zero, and so the first variation of the area in this case is zero by the calculation used to prove Proposition 2 of Lecture 4. Since the area of this deformed disk is an upper bound for $A(\gamma_t)$, this proves (1) in this case.

In the case when $\partial_t \gamma_t$ is tangential to $f(D)$, the claim is clear simply by modifying the disk at the boundary to accommodate the change in $\gamma_t$ with respect to the time parameter t. The general case then follows by combining the above two arguments.

Now we let the manifold evolve by Ricci flow, and obtain a similar variation formula:

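Corollary 1. (First variation formula with Ricci flow) Let $t \mapsto (M,g(t))$ be a Ricci flow, and for each time t let $\gamma_t: S^1 \to M$ be a loop in M smoothly varying in t, and let $f_t: D \to M$ be a minimal-area disk (with respect to g(t)) spanning $\gamma_t$. Then we have

$$\frac{d}{dt} A(\gamma_t, g(t)) \leq \int_{\gamma_t} g(\partial_t \gamma_t, n)\, ds - \int_{f_t(D)} K\, d\mu - \frac{1}{2} \int_{f_t(D)} R\, d\mu \qquad (2)$$

where K is the Gauss curvature of $f_t(D)$, R is the scalar curvature of (M,g(t)), and $d\mu$ is the area element on $f_t(D)$.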

**Proof.** This follows from the chain rule and the computations used to derive equation (13) from Lecture 4.

To deal with the Gauss curvature term, we need an analogue of the Gauss-Bonnet theorem for disks. Fortunately, we have such a result:

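Proposition 1. (Gauss-Bonnet for disks) Let $f: D \to M$ be an immersed disk with boundary $\gamma$. Then we have

$$\int_{f(D)} K\, d\mu = 2\pi + \int_\gamma \kappa\, ds \qquad (3)$$

where $\kappa := g(\nabla_T T, n)$ is the signed curvature of the curve $\gamma$ relative to the disk $f(D)$, with n the outward normal to the disk along $\gamma$ (here T is the unit tangent vector to $\gamma$, oriented in either direction; we are also abusing notation slightly by pulling back the Levi-Civita connection on TM to the pullback bundle $\gamma^* TM$ in order to define $\nabla_T T$ properly).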

**Proof.** We use another flow argument. All quantities here are intrinsic and so we may pull back to the unit disk $D$. Our task is now to show that

$$\int_D K\, d\mu = 2\pi + \int_{\partial D} g(\nabla_T T, N)\, ds \qquad (4)$$

for all metrics g on the unit disk, where N is the outward unit normal along $\partial D$. (Note that the vectors T, N will depend on g.) By the argument used to prove Proposition 1 in Lecture 4, the left-hand side is invariant under any *compactly supported* perturbation of the metric g, so we may assume that the metric is Euclidean on some neighbourhood of the origin.

We express in polar coordinates . It will then suffice to establish the R=1 case of the identity

(5)

where we use the metric g to identify the 2-form with a scalar, and T and N on are the tangent and outward normal vectors to the circle (the orientation of T is not relevant, but let us fix it as anticlockwise for sake of discussion, with the orientation chosen so that is positive). Note here that we are heavily relying on the two-dimensionality of the situation!

Because the metric is Euclidean near the origin, (5) is true for R close to zero. Thus by the fundamental theorem of calculus, it suffices to verify the identity

. (6)

for all . But as the Levi-Civita connection respects the metric (and all constructions arising from that metric, such as the identification of 2-forms with scalars) we have

. (7)

Since T is always a unit vector, and must be orthogonal to T and thus (in this two-dimensional setting) must be parallel, so the last term in (7) vanishes. An integration by parts then shows that has vanishing integral. Finally, from the Bianchi identities that allow one to express Riemann curvature of 2-manifolds in terms of Gauss curvature, we have

(8)

and the claim follows.
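As a sanity check on the sign conventions, here is a small numerical sketch of the disk Gauss-Bonnet identity on a spherical cap. It assumes the convention that the boundary term is $\int g(\nabla_T T, n)\, ds$ with n the outward normal, so that the identity reads $\int K\, d\mu = 2\pi + \int \kappa\, ds$; the function name `cap_gauss_bonnet` and the choice of test surface are purely illustrative and not part of the lecture.

```python
import math

def cap_gauss_bonnet(theta0, n=20000):
    """Return (lhs, rhs) of the disk Gauss-Bonnet identity for the spherical cap
    {polar angle <= theta0} on the unit sphere, with 0 < theta0 < pi."""
    # Left side: K = 1 integrated against the area element sin(theta) dtheta dphi,
    # using the midpoint rule in theta.
    h = theta0 / n
    lhs = 2.0 * math.pi * sum(math.sin((i + 0.5) * h) for i in range(n)) * h

    # Right side: the boundary is the latitude circle at polar angle theta0, of
    # length 2*pi*sin(theta0). With n the outward normal (assumed convention),
    # g(nabla_T T, n) = -cot(theta0) along this circle.
    kappa = -math.cos(theta0) / math.sin(theta0)
    boundary_length = 2.0 * math.pi * math.sin(theta0)
    rhs = 2.0 * math.pi + kappa * boundary_length
    return lhs, rhs

for theta0 in (0.3, 1.0, 2.5):
    print(theta0, cap_gauss_bonnet(theta0))
```

Both sides equal $2\pi(1 - \cos\theta_0)$, the area of the cap; for small $\theta_0$ the cap is nearly a flat disk and the boundary term is close to $-2\pi$.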

**Exercise 1.** Use Proposition 1 to reprove Proposition 1 from Lecture 4.

We can now combine Corollary 1 and Proposition 1 as follows. We say that a family $\gamma_t: S^1 \to M$ of curves is undergoing *curve-shortening flow* if we have

$$\frac{\partial}{\partial t} \gamma_t = \nabla_T T \qquad (8)$$

where T is the unit tangent vector to $\gamma_t$.

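**Exercise 2.** If $\gamma_t$ undergoes curve-shortening flow (with respect to a fixed metric g), show that

$$\frac{d}{dt} \int_{\gamma_t} ds = - \int_{\gamma_t} |\nabla_T T|^2\, ds \qquad (9)$$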

which may help explain the terminology “curve-shortening flow”.
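The simplest explicit example is a round circle in the Euclidean plane, which shrinks homothetically under curve-shortening flow. The sketch below assumes this model case (radius obeying $dr/dt = -1/r$, so $r^2 = r_0^2 - 2t$) and compares a finite-difference derivative of the length with minus the integral of the squared curvature; all function names are hypothetical.

```python
import math

def circle_radius(r0, t):
    """Radius at time t of a circle of initial radius r0 in the Euclidean plane
    under curve-shortening flow: dr/dt = -1/r, hence r^2 = r0^2 - 2t."""
    return math.sqrt(r0 * r0 - 2.0 * t)

def length(r0, t):
    """Length of the evolved circle."""
    return 2.0 * math.pi * circle_radius(r0, t)

def curvature_energy(r0, t):
    """Integral of |nabla_T T|^2 ds = (1/r^2) * (2*pi*r) = 2*pi/r."""
    return 2.0 * math.pi / circle_radius(r0, t)

def length_derivative(r0, t, h=1e-6):
    """Centred finite-difference approximation of dL/dt."""
    return (length(r0, t + h) - length(r0, t - h)) / (2.0 * h)

# The two printed numbers should agree up to discretisation error.
print(length_derivative(2.0, 0.5), -curvature_energy(2.0, 0.5))
```

In this example the length decreases at rate $2\pi/r$, which blows up as the circle collapses at time $r_0^2/2$.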

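Corollary 2. (First variation formula with Ricci flow and curve shortening flow) Let $t \mapsto (M,g(t))$ be a Ricci flow, and for each time t let $\gamma_t: S^1 \to M$ be a loop in M undergoing curve shortening flow, and let $f_t: D \to M$ be a minimal-area disk spanning $\gamma_t$. Then we have

$$\frac{d}{dt} A(\gamma_t, g(t)) \leq -2\pi - \frac{1}{2} R_{\min}(t)\, A(\gamma_t, g(t)) \qquad (10)$$

where $R_{\min}(t)$ is the minimum of the scalar curvature of (M,g(t)).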

– Perelman’s width functional –

We now begin a non-rigorous discussion of Perelman’s width functional, and how it is used to derive finite time extinction. There is a significant analytical difficulty regarding singularities in curve shortening flow, but we will address this issue later.

To simplify the exposition slightly, we will restrict attention to compact 3-manifolds whose components are all simply connected, and take advantage of Remark 2 in Lecture 5, although one can avoid use of this remark (and extend the analysis here to slightly more general manifolds, namely those with fundamental group a direct sum of cyclic groups and finite groups, and which contain no embedded $\mathbb{RP}^2$ with trivial normal bundle) by using the extinction theory from Lecture 4.

By this remark, all connected components of such manifolds are homotopy spheres, and in particular have trivial $\pi_2(M)$ and $\pi_3(M)$ isomorphic to the integers; thus every map $f: S^3 \to M$ has a degree $\deg(f) \in \mathbb{Z}$. This degree is only fixed up to sign, so we shall work primarily with the magnitude $|\deg(f)|$ of this degree.

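Let M be one of these connected components, and fix a base point $x_0 \in M$.

We can identify $S^3$ with the space $S^2 \times S^1 / ((\{p_2\} \times S^1) \cup (S^2 \times \{p_1\}))$, where $p_1, p_2$ are base points of $S^1, S^2$ respectively, thus we contract $\{p_2\} \times S^1$ and $S^2 \times \{p_1\}$ to a single point. (To see why these spaces are homeomorphic, use the standard identification of the n-sphere with an n-cube with the entire boundary identified with a point.) Thus, any map $f: S^3 \to M$ which sends this point to $x_0$ can be viewed as a family $\gamma_\omega: S^1 \to M$ of loops with fixed base point $\gamma_\omega(p_1) = x_0$ for $\omega \in S^2$, such that $\gamma_\omega$ varies continuously in $\omega$ and is identically equal to $x_0$ when $\omega = p_2$.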

A little more generally, define a *loop family* $\gamma = (\gamma_\omega)_{\omega \in S^2}$ to be a family of loops $\gamma_\omega: S^1 \to M$ parameterised continuously by $\omega \in S^2$, such that $\gamma_{p_2} \equiv x_0$. (To put it another way, a loop family is a continuous map from $S^2$ to the loop space of M which has the constant loop $x_0$ as base point; equivalently, a loop family is a continuous map from $S^2 \times S^1$ to M which maps $\{p_2\} \times S^1$ to $x_0$.) Thus we see that every map $f: S^3 \to M$ sending the base point to $x_0$ generates a loop family. The converse is not quite true, because we are not requiring the loops in a loop family to have fixed base point (i.e. we do not require $\gamma_\omega(p_1) = x_0$ for all $\omega$, only for $\omega = p_2$). However, as $\pi_2(M)$ is trivial, the 2-sphere $\{ \gamma_\omega(p_1): \omega \in S^2 \}$ is contractible, and so every loop family is homotopic to a loop family associated to a map $f: S^3 \to M$, and so in particular can be assigned a degree magnitude $|\deg(\gamma)|$. This degree magnitude is well-defined and stable under deformations:

**Exercise 3.** Show that each loop family is associated to a unique degree magnitude, no matter how one chooses to contract the 2-sphere . Also, show that if a loop family can be continuously deformed to another loop family while staying within the class of loop families, then both loop families have the same degree . Conclude that the space of homotopy classes of loop families can be canonically identified with .

**Exercise 4.** Show that for any , the quotient is homotopy equivalent to the wedge sum , and then use this to give another proof of Exercise 3. (Hint: first show that both spaces are homotopy equivalent to a sphere with a disk glued to it (identifying the boundary of the disk with some copy of in ). The case might be easiest to visualise.) (Thanks to Kenny Maples, Peter Petersen, and Paul Smith for this hint.)

[*Aside*: the identification of with presumably must also be immediate from the homotopy long exact sequence for fibrations, though I wasn't able to see this clearly.]

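Given a loop family $\gamma = (\gamma_\omega)_{\omega \in S^2}$, define the *width* $W(\gamma)$ of this family to be the quantity

$$W(\gamma) := \sup_{\omega \in S^2} A(\gamma_\omega) \qquad (11)$$

and then for every non-negative integer $d$, define the width $W(d) = W(d,M,g)$ to be the quantity

$$W(d) := \inf_{\gamma:\ |\deg(\gamma)| = d} W(\gamma) \qquad (12)$$

where the infimum is over all loop families of degree magnitude d (this is an inf of a sup of an inf!). We can define this concept for non-empty disconnected manifolds M also, by taking the infimum across all components and all choices of base point.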

I do not know if $W(d)$ is always positive when M is non-empty and $d$ is positive (or equivalently, whether a loop family in which each loop is spanned by a disk of small area must be contractible to a point). However, one can at least say that if $\gamma$ is a loop family associated to a non-trivial degree magnitude $d$, then the length of at least one of the loops $\gamma_\omega$ is bounded away from zero by some constant depending only on (M,g), because if instead all loops had small length, then they could be contracted to a point, thus degenerating the loop family to an image of $S^2$, which is contractible since we are assuming $\pi_2(M)$ to be trivial. This lower bound on length is important for technical reasons (which we are mostly suppressing here).

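Let us temporarily pretend, though, that at some point in time during a Ricci flow $t \mapsto (M,g(t))$, the width $W(d)$ from (12) is positive for some positive $d$, and that the infimum in (12) is attained by a smooth loop family $\gamma$, thus $A(\gamma_\omega)$ attains a maximum value of $W(d)$ for some $\omega \in S^2$.

We now run the Ricci flow, while simultaneously deforming each loop in the loop family by curve-shortening flow (local existence for the latter flow is a result of Gage and Hamilton). Applying (10), we conclude that

$$\frac{d}{dt} \sup_{\omega \in S^2} A(\gamma_\omega) \leq -2\pi - \frac{1}{2} R_{\min} \sup_{\omega \in S^2} A(\gamma_\omega) \qquad (13)$$

(in the sense of forward difference quotients), and thus by assumption on $\gamma$ that

$$\frac{d}{dt} W(d) \leq -2\pi - \frac{1}{2} R_{\min} W(d). \qquad (14)$$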

Now we investigate what happens when a surgery occurs. It turns out that whenever a component of a pre-surgery manifold is disconnected into components of a post-surgery manifold, there exist degree 1 (or -1) maps from the pre-surgery components to each of the post-surgery components (recall that all components are homotopy spheres, and in particular the 2-spheres that one performs surgery on are automatically contractible). Furthermore, these maps can be chosen to have Lipschitz constant less than $1+\varepsilon$ for any fixed $\varepsilon > 0$, thus they are almost contractions. (We will discuss this fact later in this course, when we define surgery properly.) Because of this, we can convert any loop family on the pre-surgery component to a loop family on the post-surgery component which has the same degree magnitude and which has only slightly larger width at worst. We conclude that the width $W(d)$ from (12) does not increase during surgery.

By arguing as in Lecture 4 we now conclude (using (14) and lower bounds on $R_{\min}$) that either the manifold becomes totally extinct or that the width $W(d)$ becomes negative. The latter is absurd, and so we obtain the required finite time extinction (indeed, we have shown extinction not just of $\pi_3$ here, but of the entire manifold).
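To see concretely why a differential inequality of this type forces extinction, here is a minimal sketch of the worst-case ODE. It assumes that (14) has the form $\frac{d}{dt} W \leq -2\pi - \frac{1}{2} R_{\min} W$ and that one has the standard three-dimensional lower bound $R_{\min}(t) \geq -\frac{3}{2(t+c)}$; the constants `w0` and `c` and both function names are hypothetical, chosen only for illustration.

```python
import math

def extinction_time_closed_form(w0, c):
    """Time at which the solution of W' = -2*pi + 3*W / (4*(t + c)), W(0) = w0,
    reaches zero. With the integrating factor (t + c)^(-3/4) one finds
        W(t) (t+c)^(-3/4) = w0 c^(-3/4) - 8*pi*((t+c)^(1/4) - c^(1/4))."""
    quarter_root = c ** 0.25 + w0 * c ** (-0.75) / (8.0 * math.pi)
    return quarter_root ** 4 - c

def extinction_time_numeric(w0, c, dt=1e-4):
    """Integrate the same ODE with the classical RK4 scheme until W <= 0."""
    def rhs(t, w):
        return -2.0 * math.pi + 3.0 * w / (4.0 * (t + c))

    t, w = 0.0, w0
    while w > 0.0:
        k1 = rhs(t, w)
        k2 = rhs(t + dt / 2.0, w + dt * k1 / 2.0)
        k3 = rhs(t + dt / 2.0, w + dt * k2 / 2.0)
        k4 = rhs(t + dt, w + dt * k3)
        w += dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        t += dt
    return t

print(extinction_time_closed_form(10.0, 1.0), extinction_time_numeric(10.0, 1.0))
```

The point is that $(t+c)^{-3/4}$ is not integrable at infinity, so the constant $-2\pi$ eventually wins no matter how large the initial width is; the width may even grow for a while before it is driven to zero.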

– Ramps –

The above argument had one significant gap in it; it assumed that the infimum in (12) was always attained. In practice, this is not necessarily the case, and so the best one can do is find loop families $\gamma$ for each time t with degree magnitude $d$ whose width $W(\gamma)$ is within $\varepsilon$ of the minimal width $W(d)$, for any small $\varepsilon > 0$. One can try to run the above arguments with this near-minimiser in place of an exact minimiser, but in order to do so, it is necessary to ensure that the curve-shortening flow, when applied to $\gamma$, exists for a period of time that is bounded from below uniformly in $\varepsilon$.

Unfortunately, the local existence theory of Gage and Hamilton (see also Altschuler and Grayson) only guarantees such a uniform lower bound on time of existence when the curvature magnitude of these curves is uniformly bounded from above. (Indeed, by considering what curve-shortening flow does to small circles in Euclidean space, it is clear that one cannot hope to obtain uniform lifespan bounds without such a curvature bound.) And, in general, such curvature bounds are not available. (For instance, as one approaches the minimal value of , the curves may begin to develop cusps or folds (i.e. they cease to be immersed).)
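The small-circle example can be made quantitative. A round circle of radius $r_0$ in the Euclidean plane has curvature $1/r_0$ and collapses under curve-shortening flow at time $r_0^2/2$; the sketch below (with hypothetical function names) just tabulates this, showing the lifespan going to zero as the initial curvature grows.

```python
def circle_lifespan(r0):
    """Extinction time of a circle of radius r0 in the Euclidean plane under
    curve-shortening flow, from r(t)^2 = r0^2 - 2t."""
    return 0.5 * r0 * r0

def lifespan_from_curvature(k0):
    """The same lifespan expressed through the initial curvature k0 = 1/r0."""
    return 0.5 / (k0 * k0)

for r0 in (1.0, 0.1, 0.01):
    print("radius", r0, "curvature", 1.0 / r0, "lifespan", circle_lifespan(r0))
```

So in this model case the time of existence is exactly $1/(2k_0^2)$, and no lower bound on the lifespan is possible without an upper bound on the curvature.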

To resolve this moderately serious technical obstacle, Perelman employed *ramps*, following the work of Altschuler and Grayson (see also a related argument of Ecker and Huisken). The basic idea is to give all the loops an upward “slope” that is bounded from below, which (in conjunction with the maximum principle) will prevent singularities from forming. In order to create this upward slope, it is necessary to increase the dimension of the ambient manifold M by one, working with $M \times S^1_\lambda$ instead of M, where $S^1_\lambda$ is the circle of length $\lambda$ for some small $\lambda > 0$. (Amusingly, this idea of attaching some tightly rolled up dimensions to space also appears in string theory, though I doubt that there is any connection here.)

We now turn to the details. We first develop some general variation formulae and estimates for a curve-shortening flow $t \mapsto \gamma_t$ in a time-varying Riemannian manifold (M,g(t)) of arbitrary dimension. As before, we let T denote the unit tangent vector along $\gamma_t$. We write $H := \nabla_T T$ for the curvature vector, which is of course also the rate of change of the curve under curve shortening flow, and write $k := g(H,H)^{1/2}$ for the curvature.

Write x for the variable parameterising the loop $\gamma_t$, and write $X := \partial_x \gamma_t$ for the spatial velocity vector for this loop, thus X is a scalar multiple of the tangent vector T. Here and in the sequel we abuse notation by identifying connections on the tangent bundle TM with connections on pullback bundles.

**Exercise 5.** (Commutativity of X and H) Show that . (Hint: first show that for any scalar function , and similarly with the roles of H and X reversed. Now use the torsion-free nature of the Levi-Civita connection and duality.)

We now record a variation formula for the squared speed .

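Lemma 2. For fixed x, we have

$$\partial_t\, g(X,X) = -2\, \mathrm{Ric}(X,X) - 2 k^2 g(X,X). \qquad (15)$$

**Proof.** From the chain rule we have

$$\partial_t\, g(X,X) = \dot g(X,X) + 2 g(\nabla_H X, X). \qquad (16)$$

The first term on the right-hand side of (16) is $-2\, \mathrm{Ric}(X,X)$ by the Ricci flow equation. On the other hand, H is orthogonal to T (as T is a unit vector), and so $g(H,X) = 0$. From this and Exercise 5 we have

$$g(\nabla_H X, X) = g(\nabla_X H, X) = - g(H, \nabla_X X). \qquad (17)$$

Writing X = g(X,X)^{1/2} T, and again using that H is orthogonal to T, we have

$$g(H, \nabla_X X) = g(X,X)\, g(H, \nabla_T T). \qquad (18)$$

Since $\nabla_T T = H$ and $g(H,H) = k^2$, the claim follows.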

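Corollary 2. We have $[H,T] = (k^2 + \mathrm{Ric}(T,T))\, T$.

**Proof.** We already know that [H,X] vanishes. Expressing $T = g(X,X)^{-1/2} X$ and using the previous lemma (writing $\mathrm{Ric}(X,X)$ as $g(X,X)\, \mathrm{Ric}(T,T)$), the corollary follows after a brief computation.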

We can now derive a heat equation for the curvature (vaguely reminiscent of a Bochner-type identity):

Lemma 3. (First variation of squared curvature) We have

$$\partial_t k^2 = \nabla_T \nabla_T (k^2) - 2 |\pi(\nabla_T H)|^2 + 2 k^4 + O(k^2) \qquad (19)$$

where $\pi$ is the projection to the orthogonal complement of X, and the implied constants in the O() terms depend only on the Riemannian manifold (M,g(t)) (and in particular on bounds on the Riemann curvature tensor).

**Proof. **We write . By the chain rule and the Ricci flow equation we have

. (20)

The first term on the right-hand side is which is acceptable. As for the second term, we expand

(21)

The O(k) term gives a contribution of to (20) which is acceptable. By Corollary 2, we have

(22)

which gives a contribution of to (20). Finally, we deal with the top-order term . We express . Applying Corollary 2 (and the orthogonality of T and H), we have

(23)

whereas from the Leibniz rule we have

. (24)

Since H is orthogonal to T, we have

(25)

and so by Pythagoras

. (26)

Substituting (26) into (24), and combining this with (23) to calculate the net contribution of , we obtain (19) as desired.

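Corollary 3. (First variation of curvature) We have

$$\partial_t k \leq \nabla_T \nabla_T k + k^3 + O(k). \qquad (27)$$

**Proof.** Expanding out (19) using the product rule and comparing with (27), we see that it suffices to show that

$$|\nabla_T k| \leq |\pi(\nabla_T H)|. \qquad (28)$$

But if we differentiate the identity $k^2 = g(H,H)$ along T, we obtain

$$k \nabla_T k = g(H, \nabla_T H) = g(H, \pi(\nabla_T H)) \qquad (29)$$

(since H is orthogonal to T) and the claim now follows from Cauchy-Schwarz.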

Note that by combining this corollary with the maximum principle (Corollary 1 of Lecture 3) we can get upper bounds on k for short times based on upper bounds for k at time zero. (By using energy estimates, one can also control higher derivatives of k, obtaining the usual parabolic type estimates as a consequence; such estimates are important for the analysis here but we will omit them.) Unfortunately, the non-linear term $k^3$ on the right-hand side has an unfavourable sign and can generate finite time blowup.

The situation is much improved, however, for a special class of loops known as *ramps*. These curves take values not in an arbitrary manifold M, but in a product manifold $M \times S^1_\lambda$ (with the product Riemannian metric). The point here is that we have a vertical unit vector field U on this manifold (corresponding to infinitesimal rotation of the $S^1_\lambda$ factor) which is completely parallel with respect to the Levi-Civita connection: $\nabla U = 0$. Define a ramp to be a curve $\gamma: S^1 \to M \times S^1_\lambda$ whose unit tangent vector T is always upward sloping in the sense that $g(T,U) > 0$ on all of $S^1$ (thus the ramp must “wrap around” the vertical fibre $S^1_\lambda$ at least once, in order to return to its starting point). In particular, since $S^1$ is compact, we have a uniform lower bound $g(T,U) \geq c$ for some $c > 0$.

Let such a ramp evolve under curve shortening flow, and write $u := g(T,U)$, thus one can view u as a function of t and x. On the one hand, we have the trivial pointwise bound

$$|u| \leq 1 \qquad (30)$$

from Cauchy-Schwarz. On the other hand, we have an evolution equation for u:

Proposition 2. (First variation of u) We have

$$\partial_t u = \nabla_T \nabla_T u + (k^2 + \mathrm{Ric}(T,T))\, u. \qquad (31)$$

**Proof. **Differentiating u = g(T,U) by the chain rule as before (using the fact that U is parallel to the connection, as well as the Ricci flow equation) we have

. (32)

Since U is parallel to the connection, it is annihilated by any commutator and thus . Writing and using Corollary 2, we conclude

. (33)

Since U is parallel to the connection, we can write , and the claim follows from the definition of u.

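As a particular corollary of (31), we have the inequality

$$\partial_t u \geq \nabla_T \nabla_T u - C u \qquad (34)$$

wherever u is non-negative, where C is an upper bound for the magnitude of the Ricci curvature.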

Using the maximum principle (Corollary 1 from Lecture 3), and the assumption that u is initially bounded away from zero, we conclude that u continues to be bounded away from zero for all time t for which the curve-shortening flow exists (though this bound can deteriorate exponentially fast in t). In particular, u is positive and the curve continues to be a ramp. Furthermore, by applying the quotient rule to (27) and (31), one obtains after some calculation the differential inequality

$$\partial_t f \leq \nabla_T \nabla_T f + \frac{2 \nabla_T u}{u} \nabla_T f + O(f) \qquad (35)$$

for the quantity $f := k/u$. Applying the maximum principle again, and noting that f is initially bounded at time zero, we conclude that f is bounded for all time for which the solution exists (though again, the bound can deteriorate exponentially in t). Combining this with the trivial bound (30), we conclude that the curvature k is bounded for any period of time on which the solution exists, with the bound deteriorating exponentially in t. Combining this with the local existence theory (see Altschuler and Grayson), which asserts that the curve shortening flow can be continued whenever the curvature remains bounded, we conclude that curve shortening flows for ramps persist globally in time.
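The way the maximum principle yields an exponentially deteriorating lower bound can be seen in a toy discretisation. The sketch below is only a model: it replaces the evolving curve by a fixed flat circle, assumes the inequality for u has the form $\partial_t u \geq \nabla_T \nabla_T u - C u$, and treats the borderline equation $u_t = u_{xx} - C u$ with an explicit finite-difference scheme. The grid, the constant `c` and the initial data are hypothetical.

```python
import math

def evolve_min(u0, c, dx, dt, steps):
    """Explicit finite differences for u_t = u_xx - c*u on a periodic grid.
    Returns the list of min(u) after 0, 1, ..., steps time steps."""
    lam = dt / (dx * dx)
    # Under this condition each new value is a non-negative combination of old
    # values, which is the discrete form of the maximum principle.
    assert 1.0 - 2.0 * lam - c * dt >= 0.0, "time step too large"
    u = list(u0)
    n = len(u)
    mins = [min(u)]
    for _ in range(steps):
        # u[i - 1] wraps around to u[-1] when i == 0, giving periodicity.
        u = [u[i] + lam * (u[(i + 1) % n] - 2.0 * u[i] + u[i - 1]) - c * dt * u[i]
             for i in range(n)]
        mins.append(min(u))
    return mins

n = 64
dx = 2.0 * math.pi / n
u0 = [0.5 + 0.3 * math.cos(i * dx) for i in range(n)]  # positive initial slope
mins = evolve_min(u0, c=1.0, dx=dx, dt=0.004, steps=500)
print(mins[0], mins[-1])
```

At each step the minimum can drop by at most the factor $1 - C\,dt$, which is the discrete counterpart of the bound $u(t) \geq e^{-Ct} \min u(0)$: the slope stays positive for all time, but the guaranteed lower bound decays exponentially.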

Of course, in our applications to Ricci flow, the curves $\gamma_\omega$ that we are applying curve shortening flow to are not ramps; they live in M rather than $M \times S^1_\lambda$. To address this, one has to embed M in $M \times S^1_\lambda$ for some small $\lambda > 0$ and approximate each $\gamma_\omega$ by a ramp that wraps around $S^1_\lambda$ exactly once. One then flows the ramps by curve shortening flow, and works with the minimal spanning areas of these evolved ramps (rather than working with the curve shortening flow applied directly to the original curves).
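One natural way to picture such an approximation (an assumption made here for illustration, not necessarily the construction used in the references) is to lift a loop $c(x)$, $x \in [0, 2\pi)$, to the curve $x \mapsto (c(x), \lambda x / 2\pi)$, which climbs the vertical circle of length $\lambda$ at constant speed and wraps around it exactly once. The sketch below computes the slope $u = g(T,U)$ of this lift in the product metric; the function name and the sample ellipse are hypothetical.

```python
import math

def ramp_slopes(loop_speed, lam, samples=360):
    """Values of u = g(T, U) along the lifted curve x -> (c(x), lam * x / (2*pi)).

    loop_speed(x) is |c'(x)|, the speed of the base loop at parameter x.
    lam is the length of the vertical circle; the lift wraps around it once."""
    vertical = lam / (2.0 * math.pi)  # constant vertical speed of the lift
    xs = [2.0 * math.pi * i / samples for i in range(samples)]
    # In the product metric the lifted speed is hypot(|c'|, vertical), and the
    # vertical component of the unit tangent is vertical / (lifted speed).
    return [vertical / math.hypot(loop_speed(x), vertical) for x in xs]

# Example: the ellipse c(x) = (2 cos x, sin x) in the plane, with lam = 0.1.
ellipse_speed = lambda x: math.hypot(2.0 * math.sin(x), math.cos(x))
us = ramp_slopes(ellipse_speed, 0.1)
print(min(us), max(us))
```

The slope is strictly positive everywhere, so the lift is a ramp, but its lower bound is of order $\lambda$ divided by the maximal speed of the loop. This is why the slope bound (and hence the curvature bound obtained from it) degrades as $\lambda \to 0$.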

There are of course many technical obstacles to this strategy. One of them is that one needs to show that small changes in the ramp do not significantly affect the area of the minimal spanning disk. To achieve this, one needs to show that if two ramps are initially close in the sense that there is an annulus connecting them of small area, then they stay close (in the same sense) for any bounded period of time under curve shortening flow. This can be accomplished by using a first variation formula for area of minimal annuli which is similar to Corollary 2. There are several other technical difficulties of an analytical nature to resolve; see Chapter 19 of Morgan-Tian’s book for full details.

[*Update*, Apr 21: New exercise added; various corrections.]

## 14 comments

7 June, 2008 at 1:44 pm

ky

I don’t see how

“we can identify S^3 with the space S^2 x S^1 / ({p_2} x S^1) U (S^2 x {p_1})} where p_1, p_2 are base points of S^1, S^2 respectively.”

Only way I know how to imagine S^3 is extension of one-point compactification.

Can somebody elaborate?

Thanks.

7 June, 2008 at 5:41 pm

Terence Tao

Dear Ky,

One can think of as the unit interval [0,1] with the boundary identified with a single point , and as the unit square with the boundary identified with a single point . Thus can be viewed as the unit cube with the sets identified with a single point for each , identified with a single point for each , and identified with . In particular, the set comprises the entire boundary of the cube. Identifying this to a single point thus gives .

(For visualisation purposes, it may help to first imagine the situation one dimension lower down, in which can be viewed as the torus (which one can think of as the unit square with opposing edges identified) with (i.e. the entire boundary of the square) identified to a point .)

13 February, 2014 at 9:42 pm

Anonymous

hi, professor Tao, (29) seems not very clear to me: we want to estimate , but (29) only gives us the estimation of

14 February, 2014 at 8:24 am

Terence Tao

Sorry, that was a typo: in (28) should be ; I’ve now corrected it in the post.

16 February, 2014 at 10:15 pm

handaoyuan

Hi, Professor Tao, on the deduction of (2), by using chain rule, do you mean that ?

[That's not quite the right formula for the total derivative - one needs some partial derivatives on the A - but basically yes. -T.]

18 February, 2014 at 6:10 pm

handaoyuan

Hi, Professor Tao, it seems the coefficient in (9) should be -1 rather than -1/2.

[Corrected, thanks - T.]

21 February, 2014 at 12:38 am

handaoyuan

Hi, professor Tao, since we’ve got almost the same formula (10) for minimal area as the following one for width functional, why do we need the width functional? Why don’t we just use (10) to deduce the finite time extinction of the whole manifold?

21 February, 2014 at 11:14 am

Terence Tao

As M is simply connected, it is quite possible for the area of a minimal disk to shrink to zero without the manifold itself developing any singularity or extinction, simply because the loop itself is shrinking to zero. One needs to work instead with families of loops, rather than just a single loop, to maintain topological non-triviality and avoid this scenario.

21 February, 2014 at 5:44 pm

handaoyuan

Thanks, but the contradiction point is that the solution of (14) and (10) (both could be viewed as the same differential equation) will go to negative (just as the last paragraph before section “Ramp” said) if we don’t have finite time extinction; since is area which can’t be negative, couldn’t that have the same contradiction effect as the case of ?

21 February, 2014 at 5:59 pm

Terence Tao

No, because once $\gamma_t$ shrinks to a point (and $A(\gamma_t)$ shrinks to zero), the curve shortening flow is no longer definable and one cannot continue (10) beyond that time to reach a contradiction. In contrast, when working with a topologically non-trivial loop family, the loop with the largest minimal spanning area must be non-trivial, and one can perform curve shortening flow on that loop. (This assumes that there is a loop which attains this maximum; in practice, one has to modify the strategy via the ramp device, as discussed later in the blog post.)

To put it another way: the problem with trying to use (10) is that extinction of the loop can occur before extinction of the manifold. In contrast, with topologically non-trivial loop families, the (extremising) loop cannot become extinct before the manifold does.

21 February, 2014 at 6:34 pm

handaoyuan

Thanks, now I understand it.